Suppose you work in the analytics department of a large health system. Your organization’s IT infrastructure is hybrid, with both on-premises and cloud-based components, and all data, including customer interactions and services information, resides in Azure SQL Data Warehouse. Your department analyzes customer service usage patterns and proposes process improvements based on the inefficiencies it finds. You can achieve the desired results by using the robust machine learning and deep learning functions of Azure Databricks in conjunction with Azure SQL Data Warehouse.
Azure Databricks is a fully managed, cloud-based big data and machine learning platform. It enables developers to accelerate AI implementation by simplifying the process of building enterprise-grade production data applications. Built in a joint effort by Microsoft and the team that started Apache Spark, Azure Databricks provides data science and engineering teams with a single platform for big data processing and machine learning.
By combining an end-to-end, managed Apache Spark platform optimized for the cloud with the enterprise scale and security of the Azure platform, Azure Databricks makes it easy to run large-scale Spark workloads.
You can access SQL Data Warehouse from Azure Databricks by using the SQL Data Warehouse connector. The connector is a data source implementation for Apache Spark that uses Azure Blob storage and PolyBase in SQL Data Warehouse to transfer large volumes of data efficiently between an Azure Databricks cluster and a SQL Data Warehouse instance.
Both the Azure Databricks cluster and the SQL Data Warehouse instance access a common Blob storage container to exchange data. In Azure Databricks, the SQL Data Warehouse connector triggers Spark jobs to read data from and write data to the Blob storage container. On the SQL Data Warehouse side, the connector uses JDBC to trigger the data loading and unloading operations that PolyBase performs.
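The connector flow described above can be sketched in PySpark. This is a minimal illustration, not a definitive implementation: it assumes the Databricks data source name `com.databricks.spark.sqldw` and its `url`, `tempDir`, `dbTable`, and `forwardSparkAzureStorageCredentials` options, and the helper names (`blob_temp_dir`, `sqldw_options`, `read_table`, `write_table`) as well as the container, account, and table names are hypothetical placeholders.

```python
from typing import Dict

# Data source name of the Databricks SQL Data Warehouse connector (assumed here).
SQLDW_FORMAT = "com.databricks.spark.sqldw"


def blob_temp_dir(container: str, account: str, folder: str = "tempDirs") -> str:
    """Build the wasbs:// URI of the Blob storage folder shared by both sides."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{folder}"


def sqldw_options(jdbc_url: str, temp_dir: str, table: str) -> Dict[str, str]:
    """Collect the connector options for one SQL Data Warehouse table."""
    return {
        "url": jdbc_url,          # JDBC connection used to trigger PolyBase
        "tempDir": temp_dir,      # common Blob storage staging location
        "forwardSparkAzureStorageCredentials": "true",
        "dbTable": table,
    }


def read_table(spark, options: Dict[str, str]):
    """Load a table into a Spark DataFrame through the connector."""
    return spark.read.format(SQLDW_FORMAT).options(**options).load()


def write_table(df, options: Dict[str, str], mode: str = "append") -> None:
    """Write a Spark DataFrame back to SQL Data Warehouse through the connector."""
    df.write.format(SQLDW_FORMAT).options(**options).mode(mode).save()
```

In a Databricks notebook, where a `spark` session already exists, you might call `read_table(spark, sqldw_options(jdbc_url, blob_temp_dir("staging", "myaccount"), "dbo.ServiceUsage"))`. The cluster also needs access to the storage account, for example through an account key set in the Spark configuration.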
PolyBase is a technology that lets you access data stored outside a database by using the T-SQL language. In Azure SQL Data Warehouse, PolyBase lets you import and export data to and from Azure Blob storage and Azure Data Lake Store.
Azure Data Factory is a cloud-based data integration service. It lets you create data-driven workflows in the cloud that orchestrate and automate data movement and data transformation. Data Factory supports various data stores. In this case, it uses Azure SQL Database as a data source.