Learning Path: Azure Synapse Analytics – An Introduction
As technology professionals, we are working during some of the most challenging times. Not only is the pace of change increasing, but keeping up with the volume of new cloud-based products and services is becoming more difficult. To survive in this environment, we must employ various learning techniques, and this Learning Path is an example of Guided Technical Enablement. It provides a structure for quality curated content that a student can use to start mastering a topic.
This learning Path covers Azure Synapse Analytics which Microsoft announced on November 4th 2019. This service is not only the next evolution of Azure SQL Data Warehouse, but combines enterprise data warehousing and Big Data Analytics by providing serverless on-demand or provisioned resources at scale. Through the Azure Synapse Studio your team, made up of; Data Engineers, Data Scientists and IT Pros, can collaborate on projects to ingest, prepare, manage and serve data together in a single pane of glass experience.
Azure Synapse has four main components:
- SQL Analytics: Complete T-SQL based analytics – Generally Available
- SQL pool (pay per DWU provisioned)
- SQL on-demand (pay per TB processed) – (Preview)
- Spark: Deeply integrated Apache Spark (Preview)
- Data Integration: Hybrid data integration (Preview)
- Studio: Unified user experience. (Preview)
As this is a new offering, this learning path will be updated as more material is available; however, the content provided here will help to bring you up to speed in this product set.
Goals and Format of this Learning Path
In this Learning Path for the Azure Synapse Analytics, you will find various online resources in a structured learning format. Each level contains the objectives delivered through articles, labs and tutorials.
The paths split into different levels to provide a structure to the process.
|300||Optimization and Deep Technical|
|400||Best Practices and Use Cases|
The concepts and skills that this Learning Path covers would be beneficial for the following roles:
- Data Professionals looking for an introduction to Azure Synapse Analytics
- Solution Architects who design overall solutions that include Azure Synapse Analytics in end-to-end solutions.
- BI and Data Developers to understand how to design and implement Azure Synapse Analytics data solutions.
- Data Professionals and DevOps teams*, who implement and operate Azure Synapse Analytics.
Level 000 – The Prerequisites
With the introduction of Azure Synapse Analytics, most of the material is introductory. The architecture, however, is composed of various Azure Data Services, which would have their own set of documentation. Azure Synapse is the updated version Azure SQL Data Warehouse and as such some documentation has been updated with most of the other documentation remaining unchanged.
The main component of Azure Synapse Analytics is Azure SQL Data Warehouse. I produced a learning path, Learning Path: Azure Synapse (Formerly Azure SQL Data Warehouse) and am currently updating with new material, however this learning path contains updated documentation.
- Define Azure Synapse Analytics
- Review some of the other services that make up Azure Synapse Analytics
Azure Synapse Analytics – Limitless analytics service with unmatched time to insight – Overview page on the Azure Portal.
What is Azure Synapse Analytics (formerly SQL DW)? – High-level introduction to the services.
How Microsoft re-envisioned the data warehouse with Azure Synapse Analytics – News release and overview.
Level 100 – The Basics
The following section provides the Level 100 introduction articles.
- Review the Architecture of a typical installation
- Gain hands-on overview of the tools
The following videos are 10 to 15 minutes in length and provide an introduction to Azure Synapse Analytics.
Video: Azure Synapse Analytics – Next-gen Azure SQL Data Warehouse – Limitless analytics service with unmatched time to insight. This is an end-to-end experience of building and deploying rich analytics scenarios and how you can automatically generate predictive models.
Demo Video: Azure Synapse Analytics | Microsoft Ignite 2019 – Azure Synapse Analytics makes building and operating analytics solutions a simple, intuitive and no-code experience. See how it can empower users to analyze data quickly while bringing high performance and unmatched scale.
VideoAzure Synapse Analytics | SQL, data warehouse, ML, and on-demand compute (Microsoft Ignite) – Demo of predictive models.
Video : Azure SQL Data Warehouse – Videos – Watch the latest Azure SQL Data Warehouse videos to learn about new capabilities and performance improvements.
Video: Azure Arc for data services, including SQL and PostgreSQL (Microsoft Ignite) – A deep dive on Azure data services as part of Azure Arc. This provides a central way to provision, then manage your SQL or PostgreSQL databases, whether it sits in your data center, in Azure, or in other Clouds.
Video: Azure DevOps updated roadmap & new updates for Azure Stream Analytics & Synapse Analytics – Synapse Analytics Update
Video: Demo: Azure Arc | Microsoft Ignite 2019 – Azure Arc helps those working in IT seamlessly manage all their resources through the Azure portal. See how this experience unifies operations, enabling agility and consistency across data centers on-premises and across multiple clouds.
SQL Data Warehouse Content – Updated
Azure Synapse Analytics (formerly SQL DW) architecture – SQL Analytics leverages a scale-out architecture to distribute computational processing of data across multiple nodes. This article covers the architecture by covering; Azure Storage, Control Node, Compute Nodes, Data Movement Service, Distributions and other architectural elements.
The following PDF contains a lab that walks you through the creation of an environment in Azure with an Azure SQL Databases and a SQL Data Warehouse. SQL Data Warehouse is a foundation technology for Azure Synapse Analytics. Hands-On-Lab-SQL-DW-AzureSetUp
Tutorial – Tutorial: Load New York Taxicab data to Azure SQL Data Warehouse – This covers loading data into Azure Synapse Analytics.
Level 200 – Deeper Concepts
This section contains material that helps implement and use the services that make up Azure Synapse.
- Explain data loading practices
- Use Azure Synapse with Azure Machine Learning
- Review security in this environment
Quickstart: Pause and resume compute for an Azure SQL Data Warehouse in the Azure portal – Important while testing and when not using your instance you scale or stop your to manage costs.
Tutorial – Analyze data with Azure Machine Learning – This tutorial uses Azure Machine Learning to build a predictive machine learning model based on data stored in Azure SQL Data Warehouse. Specifically, this builds a targeted marketing campaign for Adventure Works, the bike shop, by predicting if a customer is likely to buy a bike or not.
Data loading strategies for Azure SQL Data Warehouse(New Copy Command) – Traditional SMP data warehouses use an Extract, Transform, and Load (ETL) process for loading data. Azure SQL Data Warehouse is a massively parallel processing (MPP) architecture that takes advantage of the scalability and flexibility of compute and storage resources. Utilizing an Extract, Load, and Transform (ELT) process can take advantage of MPP and eliminate resources needed to transform the data before loading. While SQL Data Warehouse supports many loading methods including popular SQL Server options such as BCP and the SQL BulkCopy API, the fastest and most scalable way to load data is through PolyBase external tables and the COPY statement (is currently in preview with Azure Analytics).
Tutorial Load data from Azure Data Lake Storage to SQL Data Warehouse – Use PolyBase external tables to load data from Azure Data Lake Storage into Azure SQL Data Warehouse. Although you can run ad-hoc queries on data stored in Data Lake Storage, we recommend importing the data into the SQL Data Warehouse for the best performance.
Tutorial: Tutorial: Load New York Taxicab data to Azure SQL Data Warehouse – This tutorial uses PolyBase to load New York Taxicab data from a global Azure blob to Azure SQL Data Warehouse.
Secure a database in SQL Data Warehouse – This article walks through the basics of securing your Azure SQL Data Warehouse database. In particular, this article gets you started with resources for limiting access, protecting data, and monitoring activities on a database.
Configure and manage Azure Active Directory authentication with SQL – This article shows you how to create and populate Azure AD, and then use Azure AD with Azure SQL Database, managed instance, and SQL Data Warehouse.
Level 300 – Optimization and Deep Technical
This section goes deeper into building solutions, scaling and performance topics.
- Review SQL Server Data Tools and their integration with Azure Synapse
- Review Performance and the various tools
New in Azure Synapse Analytics: CICD for SQL Analytics using SQL Server Data Tools – This release includes support for SQL Server Data Tools with Visual Studio 2019 along with native platform integration with Azure DevOps providing built-in continuous integration and deployment (CI/CD) capabilities for enterprise-level deployments. This announcement also comes with support for the Schema Compare extension in Azure Data Studio for SQL Analytics. You can now expect a frictionless development and deployment experience on any platform for your analytics solution.
Video Maximizing query performance | Azure SQL Data Warehouse – In this video, Charles Feddersen, Principal Program Manager at Microsoft, breaks down table distributions and other best practices to maximize your SQL Data Warehouse performance and tune your queries to meet your workload needs.
Data Warehouse Units (DWUs) – A SQL pool represents a collection of analytic resources that are being provisioned when using SQL Analytics. Analytic resources are defined as a combination of CPU, memory and IO. These three resources are bundled into units of compute scale called Data Warehouse Units (DWUs). Recommendations on choosing the ideal number of data warehouse units (DWUs) to optimize price and performance, and how to change the number of units.
Analyze your workload in Azure SQL Data Warehouse – Techniques for analyzing your workload in Azure SQL Data Warehouse.
Monitor your workload using DMVs – This article describes how to use Dynamic Management Views (DMVs) to monitor your workload. This includes investigating query execution in Azure SQL Data Warehouse.
Manage and monitor workload importance in Azure SQL Data Warehouse – Manage and monitor request level importance in Azure SQL Data Warehouse using DMVs and catalog views.
Use Azure Stream Analytics with SQL Data Warehouse – Azure Stream Analytics is a fully managed service providing low-latency, highly available, scalable complex event processing over streaming data in the cloud. You can learn the basics by reading Introduction to Azure Stream Analytics. You can then learn how to create an end-to-end solution with Stream Analytics by following the Get started using Azure Stream Analytics tutorial.
Visualize Azure SQL DW data with Power BI – This tutorial shows you how to use Power BI to connect to SQL Data Warehouse and create a few basic visualizations.
Level 400 – Best Practices / Lessons from the field / Use Cases
This section reviews some best practices and other material that help in solution development and operation.
- Review Enterprise BI in azure with Azure Synapse
- Review Azure Data Factory and process automation
Enterprise BI in Azure with Azure Synapse Analytics -This reference architecture implements an extract, load, and transform (ELT) pipeline that moves data from an on-premises SQL Server database into Azure Synapse and transforms the data for analysis. A reference implementation for this architecture is available on GitHub. This is a must-read article
Automated enterprise BI with Azure Synapse Analytics and Azure Data Factory – This reference architecture shows how to perform incremental loading in an extract, load, and transform (ELT) pipeline. It uses Azure Data Factory to automate the ELT pipeline. The pipeline incrementally moves the latest OLTP data from an on-premises SQL Server database into Azure Synapse. Transactional data is transformed into a tabular model for analysis. A reference implementation for this architecture is available on GitHub.
Azure Synapse Analytics (formerly SQL DW) Frequently asked questions – This document is updated over time with various FAQ topics.
Cheat sheet for Azure Synapse Analytics (formerly SQL DW) – This cheat sheet provides helpful tips and best practices for building Azure Synapse solutions. This article has just minor changes from the SQL DW version that update some of the concepts to Azure Synapse. A good overview of the capacity limits of a variety of features.
How can we improve Microsoft Azure SQL Data Warehouse ? – Do you have an idea or suggestion based on your experience with SQL Data Warehouse? We would love to hear it! Please take a few minutes to submit your idea or vote up an idea submitted by another SQL Data Warehouse customer. All of the feedback you share in these forums will be monitored and reviewed by the SQL Data Warehouse engineering team. By suggesting or voting for ideas here, you will also be one of the first to know when we begin work on your feature requests and when we release the feature.
Labs & Online Training
More information will be added when available..
<sup>1</sup> Learning styles – Wikipedia.org – https://en.wikipedia.org/wiki/Learning_styles Sept 28,2019