DataOps architecture design
DataOps is a lifecycle approach to data analytics. It uses agile practices to orchestrate tools, code, and infrastructure to quickly deliver high-quality data with improved security. When you implement and streamline DataOps processes, your business can more easily deliver cost-effective analytical insights. DataOps also helps you adopt advanced data techniques that can uncover insights and new opportunities.
Many tools and capabilities can help you implement DataOps processes, including:
- Apache NiFi. Apache NiFi provides a system for processing and distributing data.
- Azure Data Factory. Azure Data Factory is a cloud-based ETL and data integration service. It enables you to create data-driven workflows to orchestrate data movement and transform data at scale.
- Azure Databricks. Use Azure Databricks to unlock insights from all your data and build AI solutions. You can also quickly set up your Apache Spark environment, autoscale, and collaborate on shared projects.
- Azure Data Lake. Use a single data storage platform to optimize costs and protect your data with encryption at rest and advanced threat protection.
- Azure Synapse Analytics. Azure Synapse Analytics is an analytics service that brings together data integration, enterprise data warehousing, and big data analytics.
- Microsoft Purview. Microsoft Purview is a unified data governance solution that helps you manage and govern your on-premises, multicloud, and software-as-a-service (SaaS) data.
- Power BI. Unify data from many sources to create interactive, immersive dashboards and reports that provide actionable insights and drive business results.
Apache®, Apache Spark®, Apache NiFi®, and NiFi® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.
Introduction to DataOps on Azure
If you're new to DataOps, the best place to start is Microsoft Learn. This free online platform offers videos, tutorials, and hands-on learning for various products and services.
The following resources can help you learn about the core services for DataOps:
- Integrate data with Azure Data Factory or Azure Synapse Pipeline
- Data engineering with Azure Databricks
- Introduction to Azure Synapse Analytics
- Analyze and optimize data warehouse storage in Azure Synapse Analytics
- Read and write data in Azure Databricks
- Integrate Azure Databricks with Azure Synapse
- Turn insight into action by combining SAP and other data
- Examine data visualizations with Power BI
Path to production
To help you get started with DataOps production, consider these resources:
- Assess your DataOps process by using the DataOps checklist.
- Get help choosing the right data solution with Choose a data analytics and reporting technology in Azure.
- Start building your data storage system with Build a scalable system for massive data.
Depending on the DataOps technology you use, see the following best practices resources:
- NiFi System Administrator’s Guide
- Microsoft Purview accounts architectures and best practices
- Continuous integration and delivery in Azure Data Factory
- Repos for Git integration
- Deploying and Managing Power BI Premium Capacities
- Continuous integration and delivery for an Azure Synapse Analytics workspace
You can also learn about the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets you can use to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.
To learn about scenario-specific architectures, see the solutions in the following areas.
You can integrate Profisee data management with Microsoft Purview (formerly Azure Purview) to build a foundation for data governance and management.
Modern data warehouse
Apply DevOps principles to data pipelines built according to the modern data warehouse (MDW) architectural pattern with Microsoft Azure.
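One way to picture DevOps principles applied to a data pipeline is to treat each transformation as ordinary, testable code and to put a quality gate in front of publication, so that a continuous integration run can fail before bad data ships. The following is a minimal sketch under stated assumptions, not the method of the linked solution: the `Order` record, the `clean_orders` and `quality_gate` functions, the field names, and the `max_reject_ratio` threshold are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Order:
    # Hypothetical record shape, used only for this illustration.
    order_id: str
    amount: float
    currency: str


def clean_orders(rows: Iterable[dict]) -> list[Order]:
    """Turn raw rows into typed records, dropping rows that fail validation."""
    cleaned = []
    for row in rows:
        order_id = str(row.get("order_id") or "").strip()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # amount is missing or not numeric
        if not order_id or amount < 0:
            continue  # reject blank identifiers and negative amounts
        # Assumed default: rows without a currency are treated as USD.
        currency = str(row.get("currency") or "USD").upper()
        cleaned.append(Order(order_id, amount, currency))
    return cleaned


def quality_gate(raw_count: int, cleaned: list[Order],
                 max_reject_ratio: float = 0.5) -> bool:
    """Return True only if the cleaned batch is safe to publish."""
    if raw_count == 0:
        return False  # an empty batch is treated as a failed load
    ids = [order.order_id for order in cleaned]
    no_duplicates = len(ids) == len(set(ids))
    reject_ratio = (raw_count - len(cleaned)) / raw_count
    return no_duplicates and reject_ratio <= max_reject_ratio


if __name__ == "__main__":
    raw = [
        {"order_id": " A1 ", "amount": "10.5", "currency": "eur"},
        {"order_id": "", "amount": "3"},
        {"order_id": "A2", "amount": "oops"},
        {"order_id": "A3", "amount": 7},
    ]
    batch = clean_orders(raw)
    print(batch)
    print("publish" if quality_gate(len(raw), batch) else "block")
```

Because both functions are pure and take plain Python values, a build pipeline could run assertions against them on every commit, independent of whichever orchestration service eventually schedules the job.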
Modernize a mainframe
Modernize IBM mainframe and midrange data and use a data-first approach to migrate this data to Azure.
Change data directly from Power BI
Provide data write-back functionality for Power BI reports. You can update data in Power BI, and then push the changes back to your data source.
Stay current with DataOps
Refer to Azure updates to keep current with Azure technology related to DataOps.
DataOps uses many tools and techniques to deliver data. The following resources can provide you with help on your DataOps journey.
- Azure Data Explorer monitoring
- Data analysis workloads for regulated industries
- Data management across Azure Data Lake with Azure Purview
- Hybrid ETL with Azure Data Factory
- Ingestion, ETL, and stream processing pipelines with Azure Databricks
Amazon Web Services (AWS) or Google Cloud professionals
These articles provide service mapping and comparison between Azure and other cloud services. This reference can help you ramp up quickly on Azure.