Integrating Azure Data Lake Storage Gen1 with other Azure services

Azure Data Lake Storage Gen1 can be used in conjunction with other Azure services to enable a wider range of scenarios. This article lists the services that Data Lake Storage Gen1 can be integrated with.

Use Data Lake Storage Gen1 with Azure HDInsight

You can provision an Azure HDInsight cluster that uses Data Lake Storage Gen1 as its HDFS-compliant storage. For this release, Hadoop and Storm clusters on Windows and Linux can use Data Lake Storage Gen1 only as additional storage. Such clusters still use Azure Storage (WASB) as the default storage. HBase clusters on Windows and Linux, however, can use Data Lake Storage Gen1 as the default storage, as additional storage, or both.
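When Data Lake Storage Gen1 is attached as additional storage, jobs on the cluster typically address its files by a fully qualified `adl://` URI rather than a path relative to the default storage. The following is a minimal sketch of how such a URI is formed. The helper name `adl_uri`, the account name `mydatalake`, and the file path are hypothetical placeholders, not values from this article.

```python
def adl_uri(account: str, path: str = "") -> str:
    """Return a fully qualified adl:// URI for a path in a Gen1 account.

    `account` is the Data Lake Storage Gen1 account name (hypothetical here);
    `path` is the file or folder path inside that account.
    """
    # Strip a leading slash so the URI never contains a double slash.
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"


# Placeholder account and path, for illustration only.
print(adl_uri("mydatalake", "/clusters/data/sample.log"))
```

A cluster that uses Data Lake Storage Gen1 as its default storage can usually refer to the same files with a shorter relative path, so the full URI matters most in the additional-storage case.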

For instructions on how to provision an HDInsight cluster with Data Lake Storage Gen1, see:

Use Data Lake Storage Gen1 with Azure Data Lake Analytics

Azure Data Lake Analytics enables you to work with big data at cloud scale. It dynamically provisions resources and lets you run analytics on terabytes or even exabytes of data stored in a number of supported data sources, one of which is Data Lake Storage Gen1. Data Lake Analytics is specially optimized to work with Data Lake Storage Gen1, providing the highest level of performance, throughput, and parallelization for your big data workloads.

For instructions on how to use Data Lake Analytics with Data Lake Storage Gen1, see Get Started with Data Lake Analytics using Data Lake Storage Gen1.

Use Data Lake Storage Gen1 with Azure Data Factory

You can use Azure Data Factory to ingest data from Azure tables, Azure SQL Database, Azure SQL Data Warehouse, Azure Storage Blobs, and on-premises databases. As a first-class citizen in the Azure ecosystem, Azure Data Factory can orchestrate the ingestion of data from these sources into Data Lake Storage Gen1.

For instructions on how to use Azure Data Factory with Data Lake Storage Gen1, see Move data to and from Data Lake Storage Gen1 using Data Factory.

Copy data from Azure Storage Blobs into Data Lake Storage Gen1

Azure Data Lake Storage Gen1 provides a command-line tool, AdlCopy, that enables you to copy data from Azure Blob Storage into a Data Lake Storage Gen1 account. For more information, see Copy data from Azure Storage Blobs to Data Lake Storage Gen1.
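As a rough illustration of what an AdlCopy invocation looks like, the sketch below assembles the argument list for a blob-to-Gen1 copy using the tool's `/Source`, `/Dest`, and `/SourceKey` parameters. The helper name `build_adlcopy_args`, the storage and Data Lake account names, and the key are hypothetical placeholders. The `swebhdfs://` destination form is assumed from the tool's documented examples, so check the linked article for the exact syntax your version expects.

```python
from typing import List


def build_adlcopy_args(source_blob_url: str, dest_uri: str, source_key: str) -> List[str]:
    """Assemble the AdlCopy arguments for copying a blob into Gen1.

    source_blob_url: URL of the source blob or container.
    dest_uri:        destination folder in the Data Lake Storage Gen1 account.
    source_key:      access key of the source storage account.
    """
    return [
        "AdlCopy",
        "/Source", source_blob_url,
        "/Dest", dest_uri,
        "/SourceKey", source_key,
    ]


# Placeholder values, for illustration only.
args = build_adlcopy_args(
    "https://mystorage.blob.core.windows.net/mycontainer/data.txt",
    "swebhdfs://mydatalake.azuredatalakestore.net/mynewfolder/",
    "<storage-account-key>",
)
print(" ".join(args))
```

Keeping the arguments as a list rather than one string means they can be passed directly to `subprocess.run(args)` on a machine where AdlCopy is installed, without shell-quoting concerns.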

Copy data between Azure SQL Database and Data Lake Storage Gen1

You can use Apache Sqoop to import and export data between Azure SQL Database and Data Lake Storage Gen1. For more information, see Copy data between Data Lake Storage Gen1 and Azure SQL Database using Sqoop.
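To make the import direction concrete, the sketch below builds a `sqoop import` command that reads one table from Azure SQL Database and writes it to a folder in Data Lake Storage Gen1. It uses Sqoop's standard `--connect`, `--username`, `-P` (prompt for the password), `--table`, and `--target-dir` options. The helper name `build_sqoop_import_args` and every server, database, user, table, and account name are hypothetical, and the sketch assumes Sqoop runs on a cluster that already has access to the Gen1 account.

```python
from typing import List


def build_sqoop_import_args(
    server: str,
    database: str,
    username: str,
    table: str,
    adl_account: str,
    target_path: str,
) -> List[str]:
    """Assemble a `sqoop import` command from Azure SQL Database to Gen1."""
    # JDBC connection string for an Azure SQL Database logical server.
    jdbc_url = (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database}"
    )
    # Destination folder in the Data Lake Storage Gen1 account.
    target_dir = (
        f"adl://{adl_account}.azuredatalakestore.net/{target_path.lstrip('/')}"
    )
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",  # prompt for the password instead of embedding it
        "--table", table,
        "--target-dir", target_dir,
    ]


# Placeholder values, for illustration only.
cmd = build_sqoop_import_args(
    "myserver", "mydb", "sqluser", "Table1", "mydatalake", "/Sqoop/Table1"
)
print(" ".join(cmd))
```

An export in the other direction follows the same pattern with `sqoop export` and `--export-dir` in place of `--target-dir`.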

Use Data Lake Storage Gen1 with Stream Analytics

You can use Data Lake Storage Gen1 as one of the outputs to store data streamed using Azure Stream Analytics. For more information, see Stream data from Azure Storage Blob into Data Lake Storage Gen1 using Azure Stream Analytics.

Use Data Lake Storage Gen1 with Power BI

You can use Power BI to import data from a Data Lake Storage Gen1 account to analyze and visualize the data. For more information, see Analyze data in Data Lake Storage Gen1 using Power BI.

Use Data Lake Storage Gen1 with Data Catalog

You can register data from Data Lake Storage Gen1 in Azure Data Catalog to make the data discoverable throughout the organization. For more information, see Register data from Data Lake Storage Gen1 in Azure Data Catalog.

Use Data Lake Storage Gen1 with SQL Server Integration Services (SSIS)

You can use the Data Lake Storage Gen1 connection manager in SSIS to connect an SSIS package with Data Lake Storage Gen1. For more information, see Use Data Lake Storage Gen1 with SSIS.

Use Data Lake Storage Gen1 with Azure Event Hubs

You can use Azure Data Lake Storage Gen1 to archive and capture data received by Azure Event Hubs. For more information, see Use Data Lake Storage Gen1 with Azure Event Hubs.

See also