Automatic updates on source systems

Anshal 2,251 Reputation points
2024-05-06T16:17:15.8566667+00:00

Hi friends, I have a requirement where the business has 12 different upstream sources: SAP, CRM, Oracle HCM, SQL Server, PostgreSQL, and several other open-source databases and custom applications. Every system maintains its own master records, and all the systems operate as separate entities.

I need to integrate all of these systems into Azure and run the analytics there; the dashboard should retrieve records that span the multiple source systems and present complete, consistent information. How would you get the data into the cloud, and what kind of integration and transformation does it require? Lastly, all the source systems should push changed information automatically, without any manual process.

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2024-05-07T09:14:31.6866667+00:00

    @Anshal

    Thanks for using MS Q&A platform and posting your query.

    Integrating 12 diverse systems into Azure for analytics with a comprehensive dashboard is achievable. Here's a breakdown of the steps involved:

    1. Data Ingestion:

    • Connectors: Azure offers a rich set of connectors for various data sources like SAP, CRM, Oracle HCM, SQL Server, PostgreSQL, and even custom applications through APIs. These connectors can securely extract data from your on-premises systems and cloud sources.
    • Azure Data Factory (ADF): This serverless orchestration service acts as the central hub for data movement. You can create pipelines in ADF to configure these connectors, schedule data extractions at regular intervals, and transfer data to your chosen Azure storage solution (see the trigger-and-monitor sketch after this list).
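
    To make this concrete, below is a minimal sketch of kicking off and monitoring an ADF ingestion pipeline from Python with the azure-mgmt-datafactory SDK. The subscription, resource group, factory, pipeline name, and the loadDate parameter are all assumptions, and the pipeline (with its linked services and datasets) is assumed to already exist.

    ```python
    import time

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    subscription_id = "<subscription-id>"      # placeholder
    resource_group = "rg-data-platform"        # assumed resource group name
    factory_name = "adf-integration"           # assumed data factory name

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

    # Kick off the ingestion pipeline; the pipeline name and parameter are hypothetical.
    run = adf_client.pipelines.create_run(
        resource_group, factory_name, "pl_ingest_sources",
        parameters={"loadDate": "2024-05-06"},
    )

    # Poll until the run finishes so downstream steps only start on success.
    status = "InProgress"
    while status in ("InProgress", "Queued"):
        time.sleep(30)
        status = adf_client.pipeline_runs.get(resource_group, factory_name, run.run_id).status

    print(f"Pipeline run {run.run_id} finished with status {status}")
    ```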

    2. Data Storage:

    • Azure Data Lake Storage (ADLS): This scalable data lake can store all your ingested data in its raw format, regardless of structure or schema. It's a good choice for initial storage while you define your data transformations (see the upload sketch after this list).
    • Azure SQL Database (for structured data): Once you define the data structure and schema, consider migrating relevant data to a relational database like Azure SQL Database for efficient querying and analysis.
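
    As an illustration of the raw zone, here is a minimal sketch of landing a source extract in ADLS Gen2 with the azure-storage-file-datalake SDK; the storage account, container, folder layout, and file names are placeholders.

    ```python
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # Connect to the lake; the account name and "raw" container are assumptions.
    service = DataLakeServiceClient(
        account_url="https://stdatalake.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    raw_zone = service.get_file_system_client("raw")

    # Land the extract partitioned by source system and load date so later
    # transformations can pick up a single day's slice.
    file_client = raw_zone.get_file_client("crm/2024/05/06/accounts.csv")
    with open("accounts.csv", "rb") as data:   # local extract; placeholder path
        file_client.upload_data(data, overwrite=True)
    ```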

    3. Data Transformation:

    • Azure Data Factory (ADF): ADF pipelines can also perform data transformations. You can clean, filter, standardize formats, and enrich data as needed before loading it into the target storage.
    • Azure Databricks (for complex transformations): If transformations are complex or involve machine learning, consider using Azure Databricks, a managed Apache Spark environment that offers powerful data processing capabilities (see the PySpark sketch below).
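
    For this step, here is a minimal PySpark sketch of the kind of cleanup a Databricks notebook might perform on a raw CRM extract before it is loaded into the warehouse. The paths and column names are assumptions, and `spark` is the session Databricks provides.

    ```python
    from pyspark.sql import functions as F

    raw_path = "abfss://raw@stdatalake.dfs.core.windows.net/crm/2024/05/06/accounts.csv"
    curated_path = "abfss://curated@stdatalake.dfs.core.windows.net/crm/accounts"

    df = spark.read.option("header", "true").csv(raw_path)

    # Clean and standardize: trim names, normalize country codes, drop duplicates,
    # and stamp each record with its source system for lineage.
    cleaned = (df
               .withColumn("account_name", F.trim(F.col("account_name")))
               .withColumn("country_code", F.upper(F.col("country_code")))
               .dropDuplicates(["account_id"])
               .withColumn("source_system", F.lit("CRM")))

    # Write to the curated zone in Delta format so the load into Synapse stays simple.
    cleaned.write.format("delta").mode("overwrite").save(curated_path)
    ```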

    4. Building the Unified View:

    • Azure Synapse Analytics: This cloud data warehouse can house your transformed data. It allows you to create dimensional models that integrate data from various sources, enabling you to analyze relationships across systems (see the sketch below).
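
    As one way to build that unified view, here is a minimal sketch of a conformed customer dimension in a Synapse dedicated SQL pool, created over staging tables loaded from two of the sources. The server, database, table, and column names are all assumptions; the connection uses pyodbc with interactive Azure AD authentication.

    ```python
    import pyodbc

    # Connect to the dedicated SQL pool; workspace and database names are placeholders.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=synw-analytics.sql.azuresynapse.net;"
        "Database=sqlpool01;Authentication=ActiveDirectoryInteractive;"
    )
    cursor = conn.cursor()

    # A conformed dimension that resolves the same business entity across source systems.
    cursor.execute("""
    CREATE VIEW dbo.dim_customer AS
    SELECT  COALESCE(c.customer_key, s.customer_key)   AS customer_key,
            COALESCE(c.customer_name, s.customer_name) AS customer_name,
            c.crm_id,
            s.sap_id
    FROM dbo.stg_crm_customers AS c
    FULL OUTER JOIN dbo.stg_sap_customers AS s
           ON c.customer_key = s.customer_key;
    """)
    conn.commit()
    ```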

    5. Real-time Updates (Change Data Capture):

    • Change Data Capture (CDC): Many source systems offer CDC capabilities, allowing you to capture only the changes made to the data since the last extraction. Tools like Azure SQL Data Sync or LogMiner for Oracle can be used for CDC, minimizing data transfer and improving efficiency (see the SQL Server example below).
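
    For a SQL Server source, here is a minimal sketch of enabling built-in change data capture so that only changed rows need to be picked up on each run. The server, database, and table names are placeholders, and the statements are issued through pyodbc purely for illustration; they can just as well be run directly against the source database.

    ```python
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=onprem-sql01;Database=SalesDB;"    # assumed source server and database
        "Trusted_Connection=yes;TrustServerCertificate=yes;"
    )
    conn.autocommit = True
    cursor = conn.cursor()

    # Enable CDC at the database level, then on the table whose changes should flow to Azure.
    cursor.execute("EXEC sys.sp_cdc_enable_db;")
    cursor.execute("""
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Orders',
         @role_name     = NULL;
    """)

    # Changes are now exposed through the cdc.dbo_Orders_CT change table and the
    # cdc.fn_cdc_get_all_changes_* functions, which an incremental ADF load can query.
    ```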

    6. Building the Dashboard:

    • Power BI: This business intelligence tool allows you to visualize data from Azure Synapse Analytics. You can create interactive dashboards that display insights from all your integrated systems (see the refresh sketch below).
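
    To keep the dashboard current after each load, here is a minimal sketch of triggering a Power BI dataset refresh through the Power BI REST API; the workspace and dataset IDs are placeholders, and the identity used must have been granted access to the workspace.

    ```python
    import requests
    from azure.identity import DefaultAzureCredential

    # Acquire a token for the Power BI service.
    token = DefaultAzureCredential().get_token(
        "https://analysis.windows.net/powerbi/api/.default"
    ).token

    workspace_id = "<workspace-guid>"   # placeholder
    dataset_id = "<dataset-guid>"       # placeholder

    resp = requests.post(
        f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}/datasets/{dataset_id}/refreshes",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()   # 202 Accepted means the refresh has been queued
    ```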

    Additional Considerations:

    • Security: Implement robust security measures to control access to data at all stages of the integration process. Azure offers various security features like role-based access control (RBAC) to ensure data privacy.
    • Error Handling: Design pipelines to handle errors gracefully, log them for troubleshooting, and retry failed data transfers.
    • Monitoring: Continuously monitor your data pipelines and data quality to ensure data consistency and timely updates in your dashboard (see the sketch below).
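
    For the error handling and monitoring points, here is a minimal sketch of querying ADF for pipeline runs that failed in the last 24 hours so they can be logged, alerted on, or retried. It reuses the placeholder subscription, resource group, and factory names from the ingestion sketch above.

    ```python
    from datetime import datetime, timedelta, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        RunFilterParameters, RunQueryFilter, RunQueryFilterOperand, RunQueryFilterOperator,
    )

    adf_client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    now = datetime.now(timezone.utc)
    failed_filter = RunFilterParameters(
        last_updated_after=now - timedelta(days=1),
        last_updated_before=now,
        filters=[RunQueryFilter(
            operand=RunQueryFilterOperand.STATUS,
            operator=RunQueryFilterOperator.EQUALS,
            values=["Failed"],
        )],
    )

    # List the failed runs with their error messages; a real job would log these
    # and optionally call pipelines.create_run again to retry.
    runs = adf_client.pipeline_runs.query_by_factory(
        "rg-data-platform", "adf-integration", failed_filter)
    for run in runs.value:
        print(run.pipeline_name, run.run_id, run.message)
    ```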

    Remember, this is a high-level overview. The specific tools and configurations will depend on your specific data sources, schema complexities, and desired update frequency.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

    1 person found this answer helpful.
