Architectures overview

Before you start to build out the data architectures of your cloud-scale analytics framework, review the articles in the following table.

Section | Description
Build an Initial Strategy | How to build your data strategy and pivot to become a data-driven organization.
Define your plan | How to develop a plan for cloud-scale analytics.
Prepare analytics estate | Overview of data management and data landing zones, with key design area considerations like enterprise enrollment, networking, identity and access management, policies, and business continuity and disaster recovery.
Govern your analytics | Requirements to govern data, data catalog, lineage, master data management, data quality, data sharing agreements, and metadata.
Secure your analytics estate | How to secure your analytics estate with authentication and authorization, data privacy, and data access management.
Organize people and teams | How to organize effective operations, roles, teams, and team functions.
Manage your analytics estate | How to provision the platform and observability for a scenario.

Physical architecture

The physical implementation of cloud-scale analytics consists of two main architectures: the data management landing zone and the data landing zone.

Data applications

Data applications are a core concept for delivering a data product and can be aligned to both lakehouse and data mesh patterns.

Cloud-scale analytics

You can scale your cloud-scale analytics deployment by using multiple data landing zones.

Data mesh

Implement data mesh by using cloud-scale analytics. Although most cloud-scale analytics guidance applies, there are some differences to be aware of for data domains, self-serve data platforms, onboarding data products, governance, data marketplace, and data sharing.

Deployment templates for cloud-scale analytics

The following table lists reference templates that you can deploy.

Repository | Content | Required | Deployment model
Data management template | Central data management services and shared data services like data catalog and self-hosted integration runtime | Yes | One per cloud-scale analytics
Data landing zone template | Data landing zone shared services, including ingestion, management, and data storage services | Yes | One per data landing zone
Data integration template - batch processing | Additional services necessary for batch data processing | No | One or more per data landing zone
Data integration template - stream processing | Additional services necessary for data stream processing | No | One or more per data landing zone
Data product template - analytics and data science | Additional services necessary for data analytics and AI | No | One or more per data landing zone
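
The Required and Deployment model columns in the table above can be read as cardinality rules. The following Python sketch checks a deployment plan against those rules. It is an illustration only: the short template keys, the `RULES` structure, and the `validate_plan` function are hypothetical names chosen for this example and are not part of the reference templates.

```python
from collections import Counter

# Cardinality rules transcribed from the table above.
# The keys are hypothetical short names for the five template repositories.
# scope "platform": counted once per cloud-scale analytics deployment.
# scope "landing_zone": counted separately for each data landing zone.
# max None: "one or more" is allowed, so there is no upper bound.
RULES = {
    "data_management": {"scope": "platform", "required": True, "max": 1},
    "data_landing_zone": {"scope": "landing_zone", "required": True, "max": 1},
    "integration_batch": {"scope": "landing_zone", "required": False, "max": None},
    "integration_streaming": {"scope": "landing_zone", "required": False, "max": None},
    "product_analytics": {"scope": "landing_zone", "required": False, "max": None},
}


def validate_plan(plan):
    """Check a list of (template, landing_zone_or_None) pairs against RULES.

    Returns a list of human-readable error strings; an empty list means the
    plan satisfies every rule.
    """
    errors = []
    platform_counts = Counter()
    zone_counts = Counter()
    zones = set()

    for template, zone in plan:
        rule = RULES.get(template)
        if rule is None:
            errors.append(f"unknown template: {template}")
        elif rule["scope"] == "platform":
            platform_counts[template] += 1
        elif zone is None:
            errors.append(f"{template} must name a data landing zone")
        else:
            zone_counts[(template, zone)] += 1
            zones.add(zone)

    for template, rule in RULES.items():
        if rule["scope"] == "platform":
            scoped = [("", platform_counts[template])]
        else:
            scoped = [(f"{z}: ", zone_counts[(template, z)]) for z in sorted(zones)]
        for prefix, count in scoped:
            if rule["required"] and count == 0:
                errors.append(f"{prefix}missing required template: {template}")
            if rule["max"] is not None and count > rule["max"]:
                errors.append(
                    f"{prefix}{template} deployed {count} times, "
                    f"at most {rule['max']} allowed"
                )
    return errors


# Two landing zones sharing one data management deployment.
plan = [
    ("data_management", None),
    ("data_landing_zone", "sales"),
    ("integration_batch", "sales"),
    ("integration_batch", "sales"),
    ("data_landing_zone", "finance"),
]
print(validate_plan(plan))
```

A check like this could run in a CI/CD pipeline before deployment, so that a plan that omits a required template or duplicates a single-instance template is caught early.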

Each repository contains Azure Resource Manager templates, their parameter files, and CI/CD pipeline definitions for resource deployment.
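
As a rough illustration of what a parameter file looks like, the following Python sketch builds one in the standard Azure Resource Manager deployment parameters layout (`$schema`, `contentVersion`, and a `parameters` object whose entries each wrap a `value`). The parameter names `location`, `environment`, and `prefix` and the helper `build_parameter_file` are assumptions made for this example; the actual parameters that each reference template expects are defined in its repository.

```python
import json

# Schema identifier for Azure Resource Manager deployment parameter files.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameter_file(values):
    """Wrap plain name/value pairs in the ARM parameter file structure."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in values.items()},
    }


# Hypothetical parameter names and values for a development environment.
dev_parameters = build_parameter_file(
    {"location": "westeurope", "environment": "dev", "prefix": "dlz01"}
)
print(json.dumps(dev_parameters, indent=2))
```

Keeping one parameter file per environment lets the same template deploy to a development subscription and to production with different values.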

Templates can change over time due to new Azure services and requirements. Secure each repository's main branch so it remains error-free and ready for consumption and deployment. Use a development subscription to test template configuration changes before you merge feature enhancements back into your main branch.

Connect to environments privately

The reference architecture is secure by design. It uses a multilayered security approach to overcome common data exfiltration risks.

The simplest security solution is to host a jumpbox on the virtual network of the data management landing zone or data landing zone, and use it to connect to the data services through private endpoints.
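
When a service is reached through a private endpoint, its host name typically resolves to a private IP address inside the virtual network. The following Python sketch, which could be run from the jumpbox, checks whether a host name resolves only to private addresses. The function names are hypothetical, and the host name in the usage comment is a placeholder rather than a real resource.

```python
import ipaddress
import socket


def is_private_address(address):
    """Return True if the IP address literal is in a private range."""
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_private


def resolves_privately(hostname):
    """Return True if every address the host name resolves to is private.

    Raises socket.gaierror if the name cannot be resolved at all.
    """
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address string.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_private_address(a) for a in addresses)


# Example usage from the jumpbox (placeholder host name, not a real resource):
#   resolves_privately("examplestorage.dfs.core.windows.net")
print(is_private_address("10.0.0.4"))
print(is_private_address("8.8.8.8"))
```

If the check returns False from inside the virtual network, the name is resolving to a public address, which may indicate that private DNS resolution for the endpoint is not configured as intended.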

Frequently asked questions

For a list of questions and answers about cloud-scale analytics, see Frequently asked questions.

Next steps

Cloud-scale analytics data management landing zone overview