Transparency builds on visibility patterns and enables contextual data processing. It helps establish the data hierarchy needed for digital twins and for a consistent business key performance indicator (KPI) calculation flow.
This article contains a common transparency pattern for industrial solutions.
Apache®, Apache Spark, PySpark, and the flame logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.
Download a PowerPoint file for the following patterns.
Business KPI calculation engine
Calculate business metrics by using IoT telemetry and other business systems data.
Download a PowerPoint file of this pattern.
Dataflow
1. The edgeHub module sends the machine availability data to Azure IoT Hub or Azure IoT Central by using Advanced Message Queuing Protocol (AMQP) or MQTT. IoT Hub or IoT Central sends module updates to the edge and provides an edge management control plane. (A device-side sketch follows this list.)
2. IoT Hub or Azure IoT Central sends the data to Azure Data Explorer by using a data connection (IoT Hub) or data export (IoT Central).
3. Azure Data Explorer dashboards use Kusto Query Language (KQL) to fetch the data from the cluster and build a near real-time dashboard around machine availability.
4. Message routing (IoT Hub) or data export (IoT Central) pushes the data to Azure Data Lake Storage, which provides long-term storage and processing.
5. An Azure Synapse Analytics pipeline fetches the production quality data from an on-premises system after every shift.
6. The Azure Synapse pipeline stores the data in the data lake for calculation.
7. The Azure Synapse pipeline runs PySpark code, which contains the overall equipment effectiveness (OEE) calculation business logic. (A calculation sketch follows this list.)
8. The PySpark code fetches the machine availability data from Azure Data Explorer, and then calculates the availability.
9. The PySpark code fetches the production quality data from Data Lake Storage, and then calculates quality, performance, and OEE per shift.
10. The PySpark code stores the OEE data in Azure SQL Database.
11. Power BI connects with SQL Database for reporting and visualization.
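For step 1, the following minimal sketch shows one way a custom IoT Edge module might publish machine availability telemetry upstream through edgeHub by using the Azure IoT Device SDK for Python. The output name, payload fields, and send cadence are illustrative assumptions, not values defined by this pattern.

```python
# Minimal sketch of an IoT Edge module that publishes machine availability
# telemetry upstream through edgeHub (step 1). Field names such as
# "machineId" and "available" are illustrative assumptions.
import json
import time

from azure.iot.device import IoTHubModuleClient, Message

def main():
    # The client reads its connection settings from the environment
    # variables that the IoT Edge runtime injects into the module container.
    client = IoTHubModuleClient.create_from_edge_environment()
    client.connect()
    try:
        while True:
            payload = {
                "machineId": "machine01",  # hypothetical machine identifier
                "available": True,         # current availability flag
                "timestamp": time.time(),
            }
            message = Message(json.dumps(payload))
            message.content_type = "application/json"
            message.content_encoding = "utf-8"
            # Send to a module output that an edgeHub route forwards to
            # IoT Hub or IoT Central over AMQP or MQTT.
            client.send_message_to_output(message, "availabilityOutput")
            time.sleep(60)  # one reading per minute (assumed cadence)
    finally:
        client.shutdown()

if __name__ == "__main__":
    main()
```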
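Steps 7 through 10 apply the standard OEE formula: OEE = Availability × Performance × Quality, where availability is run time divided by planned production time, performance is ideal cycle time multiplied by total count, divided by run time, and quality is good count divided by total count. The following minimal PySpark sketch shows the shape of that flow, assuming the Azure Data Explorer (Kusto) Spark connector is installed on the pool. The cluster, database, table, column names, credentials, and ideal cycle time are all placeholders and assumptions.

```python
# Minimal PySpark sketch of the OEE calculation (steps 7-10). All names in
# angle brackets, the table and column names, and IDEAL_CYCLE_TIME_SEC are
# illustrative assumptions.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("oee-calculation").getOrCreate()

IDEAL_CYCLE_TIME_SEC = 30.0  # assumed ideal seconds per part

# Step 8: read machine availability from Azure Data Explorer through the
# Kusto Spark connector; the kustoQuery option carries a KQL query.
availability = (
    spark.read.format("com.microsoft.kusto.spark.datasource")
    .option("kustoCluster", "https://<cluster>.<region>.kusto.windows.net")
    .option("kustoDatabase", "telemetry")
    .option("kustoQuery",
            "Availability | summarize runTimeSec = sum(runSeconds), "
            "plannedTimeSec = sum(plannedSeconds) by machineId, shiftId")
    .option("kustoAadAppId", "<app-id>")
    .option("kustoAadAppSecret", "<app-secret>")
    .option("kustoAadAuthorityID", "<tenant-id>")
    .load()
    .withColumn("availability", F.col("runTimeSec") / F.col("plannedTimeSec"))
)

# Step 9: read the per-shift production quality data from Data Lake Storage.
quality = spark.read.parquet(
    "abfss://quality@<storage-account>.dfs.core.windows.net/shift-quality/"
)

# Join the two inputs and apply OEE = availability x performance x quality.
oee = (
    availability.join(quality, ["machineId", "shiftId"])
    .withColumn("performance",
                (F.lit(IDEAL_CYCLE_TIME_SEC) * F.col("totalCount"))
                / F.col("runTimeSec"))
    .withColumn("quality", F.col("goodCount") / F.col("totalCount"))
    .withColumn("oee",
                F.col("availability") * F.col("performance") * F.col("quality"))
)

# Step 10: persist the per-shift OEE scores to Azure SQL Database via JDBC.
(
    oee.select("machineId", "shiftId", "availability",
               "performance", "quality", "oee")
    .write.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;"
                   "database=oee")
    .option("dbtable", "dbo.ShiftOee")
    .option("user", "<sql-user>")
    .option("password", "<sql-password>")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .mode("append")
    .save()
)
```

The generic JDBC writer is used here for portability; the dedicated Apache Spark connector for SQL Server and Azure SQL is another option.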
Potential use cases
Use this pattern when:

- You're using Azure Synapse for data fabric aspects like data lake management, data pipeline management, serverless data processing, data warehousing, and data governance.
- You need to combine both real-time and batch data pipelines and perform business KPI calculations.
- You need to standardize KPI calculations across factories and business units.
Considerations
- For more information on data flows, see Data flows in Azure Synapse.
- For more information on Spark connectors, see Azure Data Explorer (Kusto) Spark connector.
- For a less compute-intensive, serverless calculation engine, use Azure Functions instead of an Azure Synapse Spark pool. (A sketch follows this list.)
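As a rough illustration of the Azure Functions alternative, the following sketch uses a timer trigger in the Python v2 programming model. The schedule and the inline calculation are assumptions; a real function would fetch its inputs from Azure Data Explorer and Data Lake Storage and write the result to SQL Database.

```python
# Minimal sketch of a timer-triggered Azure Function (Python v2 programming
# model) standing in for the Spark pool. The schedule and the in-memory
# calculation are illustrative assumptions.
import logging

import azure.functions as func

app = func.FunctionApp()

@app.timer_trigger(schedule="0 0 */8 * * *",  # assumed: once per 8-hour shift
                   arg_name="timer",
                   run_on_startup=False)
def calculate_oee(timer: func.TimerRequest) -> None:
    # In a real function, fetch availability from Azure Data Explorer and
    # quality data from Data Lake Storage here (for example, with the
    # azure-kusto-data and azure-storage-file-datalake packages), then
    # write the result to Azure SQL Database.
    availability, performance, quality = 0.92, 0.88, 0.97  # placeholder inputs
    oee = availability * performance * quality
    logging.info("Shift OEE: %.3f", oee)
```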
Deploy this scenario
- Deployment sample:
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal author:
- Jomit Vaghela | Principal Program Manager
To see non-public LinkedIn profiles, sign in to LinkedIn.
Next steps
- Integrate SQL and Apache Spark pools in Azure Synapse Analytics
- Introduction to Azure IoT Hub
- Introduction to Azure Synapse Analytics