Data connectors overview
Data ingestion is the process used to load data from one or more sources into a Real-Time Intelligence KQL database in Microsoft Fabric. Once ingested, the data becomes available for query. Real-Time Intelligence provides several connectors for data ingestion.
The following table summarizes the available data connectors, tools, and integrations.
Name | Functionality | Supports streaming? | Type | Use cases |
---|---|---|---|---|
Apache Flink | Ingestion | ✔️ | Open source | Telemetry |
Apache Kafka | Ingestion | ✔️ | Open source | Logs, Telemetry, Time series |
Apache Log4J 2 | Ingestion | ✔️ | Open source | Logs |
Apache Spark | Export, Ingestion | | Open source | Telemetry |
Apache Spark for Azure Synapse Analytics | Export, Ingestion | | First party | Telemetry |
Azure Data Factory | Export, Ingestion | | First party | Data orchestration |
Azure Event Hubs | Ingestion | ✔️ | First party | Messaging |
Azure Functions | Export, Ingestion | | First party | Workflow integrations |
Azure Stream Analytics | Ingestion | ✔️ | First party | Event processing |
Fluent Bit | Ingestion | ✔️ | Open source | Logs, Metrics, Traces |
Logstash | Ingestion | | Open source | Logs |
NLog | Ingestion | ✔️ | Open source | Telemetry, Logs, Metrics |
Open Telemetry | Ingestion | ✔️ | Open source | Traces, Metrics, Logs |
Power Automate | Export, Ingestion | | First party | Data orchestration |
Serilog | Ingestion | ✔️ | Open source | Logs |
Splunk | Ingestion | | Open source | Logs |
Splunk Universal Forwarder | Ingestion | | Open source | Logs |
Telegraf | Ingestion | ✔️ | Open source | Metrics, Logs |
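As a rough illustration of how the summary table can be used to shortlist connectors, the following Python sketch encodes the table rows as plain data and filters them. The `Connector` type and the `find` helper are hypothetical names introduced here for illustration only; they aren't part of any Fabric or Azure Data Explorer SDK. The values are copied from the table, with a check mark read as `True`.

```python
from typing import NamedTuple, Optional


class Connector(NamedTuple):
    name: str
    functionality: tuple   # "Ingestion" and/or "Export"
    streaming: bool        # True where the table shows a check mark
    kind: str              # "Open source" or "First party"
    use_cases: tuple


ING = ("Ingestion",)
BOTH = ("Export", "Ingestion")
OSS, FIRST = "Open source", "First party"

# Rows transcribed from the summary table.
CONNECTORS = [
    Connector("Apache Flink", ING, True, OSS, ("Telemetry",)),
    Connector("Apache Kafka", ING, True, OSS, ("Logs", "Telemetry", "Time series")),
    Connector("Apache Log4J 2", ING, True, OSS, ("Logs",)),
    Connector("Apache Spark", BOTH, False, OSS, ("Telemetry",)),
    Connector("Apache Spark for Azure Synapse Analytics", BOTH, False, FIRST, ("Telemetry",)),
    Connector("Azure Data Factory", BOTH, False, FIRST, ("Data orchestration",)),
    Connector("Azure Event Hubs", ING, True, FIRST, ("Messaging",)),
    Connector("Azure Functions", BOTH, False, FIRST, ("Workflow integrations",)),
    Connector("Azure Stream Analytics", ING, True, FIRST, ("Event processing",)),
    Connector("Fluent Bit", ING, True, OSS, ("Logs", "Metrics", "Traces")),
    Connector("Logstash", ING, False, OSS, ("Logs",)),
    Connector("NLog", ING, True, OSS, ("Telemetry", "Logs", "Metrics")),
    Connector("Open Telemetry", ING, True, OSS, ("Traces", "Metrics", "Logs")),
    Connector("Power Automate", BOTH, False, FIRST, ("Data orchestration",)),
    Connector("Serilog", ING, True, OSS, ("Logs",)),
    Connector("Splunk", ING, False, OSS, ("Logs",)),
    Connector("Splunk Universal Forwarder", ING, False, OSS, ("Logs",)),
    Connector("Telegraf", ING, True, OSS, ("Metrics", "Logs")),
]


def find(use_case: Optional[str] = None,
         streaming: Optional[bool] = None,
         functionality: Optional[str] = None) -> list:
    """Return connectors matching every filter that is not None."""
    matches = []
    for c in CONNECTORS:
        if use_case is not None and use_case not in c.use_cases:
            continue
        if streaming is not None and c.streaming != streaming:
            continue
        if functionality is not None and functionality not in c.functionality:
            continue
        matches.append(c)
    return matches


if __name__ == "__main__":
    # Example: streaming-capable connectors suited to log data.
    for connector in find(use_case="Logs", streaming=True):
        print(connector.name)
```

The per-connector sections list the supported ingestion types in more detail, so treat a shortlist produced this way as a starting point rather than a final choice.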
The following sections describe each connector and its capabilities.
Apache Flink
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. The connector implements a data sink for moving data across Azure Data Explorer and Flink clusters. Using Azure Data Explorer and Apache Flink, you can build fast and scalable applications targeting data-driven scenarios, such as machine learning (ML), Extract-Transform-Load (ETL), and Log Analytics.
- Functionality: Ingestion
- Ingestion type supported: Streaming
- Use cases: Telemetry
- Underlying SDK: Java
- Repository: Microsoft Azure - https://github.com/Azure/flink-connector-kusto/
- Documentation: Get data from Apache Flink
Apache Kafka
Apache Kafka is a distributed streaming platform for building real-time streaming data pipelines that reliably move data between systems or applications. Kafka Connect is a tool for scalable and reliable streaming of data between Apache Kafka and other data systems. The Kafka Sink serves as the connector from Kafka and doesn't require using code. The connector is gold certified by Confluent, which means it has gone through comprehensive review and testing for quality, feature completeness, compliance with standards, and performance.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Logs, Telemetry, Time series
- Underlying SDK: Java
- Repository: Microsoft Azure - https://github.com/Azure/kafka-sink-azure-kusto/
- Documentation: Get data from Apache Kafka
- Community Blog: Kafka ingestion into Azure Data Explorer
Apache Log4J 2
Log4J is a popular logging framework for Java applications maintained by the Apache Software Foundation. Log4J allows developers to control which log statements are output with arbitrary granularity based on the logger's name, logger level, and message pattern. The Apache Log4J 2 sink allows you to stream your log data to your database, where you can analyze and visualize your logs in real time.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Logs
- Underlying SDK: Java
- Repository: Microsoft Azure - https://github.com/Azure/azure-kusto-log4j
- Documentation: Get data with the Apache Log4J 2 connector
- Community Blog: Getting started with Apache Log4J and Azure Data Explorer
Apache Spark
Apache Spark is a unified analytics engine for large-scale data processing. The Spark connector is an open-source project that can run on any Spark cluster. It implements a data source and a data sink for moving data to or from Spark clusters. Using the Apache Spark connector, you can build fast and scalable applications targeting data-driven scenarios, such as machine learning (ML), Extract-Transform-Load (ETL), and Log Analytics. With the connector, your database becomes a valid data store for standard Spark source and sink operations, such as read, write, and writeStream.
- Functionality: Ingestion, Export
- Ingestion type supported: Batching, Streaming
- Use cases: Telemetry
- Underlying SDK: Java
- Repository: Microsoft Azure - https://github.com/Azure/azure-kusto-spark/
- Documentation: Apache Spark connector
- Community Blog: Data preprocessing for Azure Data Explorer with Apache Spark
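To make the "standard Spark sink operations" point concrete, here is a minimal PySpark-style sketch of a batch write through the connector. It is an illustration under assumptions, not the connector's reference usage: the data source name and the option keys (`kustoCluster`, `kustoDatabase`, and so on) are taken from memory of the connector's published samples and should be checked against the repository for the version you install. The helper names `kusto_options` and `write_batch` are hypothetical.

```python
# Data source name registered by the Kusto Spark connector.
# Assumption: verify against the connector documentation for your version.
KUSTO_FORMAT = "com.microsoft.kusto.spark.datasource"


def kusto_options(cluster, database, table, app_id, app_secret, tenant_id):
    """Build the connector options.

    The key names are assumptions based on the connector's samples.
    """
    return {
        "kustoCluster": cluster,
        "kustoDatabase": database,
        "kustoTable": table,
        "kustoAadAppId": app_id,
        "kustoAadAppSecret": app_secret,
        "kustoAadAuthorityID": tenant_id,
    }


def write_batch(df, options):
    """Append a Spark DataFrame to the target table.

    Uses only the standard DataFrameWriter chain:
    format -> options -> mode -> save.
    """
    df.write.format(KUSTO_FORMAT).options(**options).mode("Append").save()
```

On a cluster with the connector library installed, you would call `write_batch(df, kusto_options(...))` with a real DataFrame and your own cluster URI, database, table, and Microsoft Entra application credentials. Keeping the options in one dictionary makes it easier to reuse the same settings for read operations.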
Apache Spark for Azure Synapse Analytics
Apache Spark is a parallel processing framework that supports in-memory processing to boost the performance of big data analytic applications. Apache Spark in Azure Synapse Analytics is one of Microsoft's implementations of Apache Spark in the cloud. You can access a database from Synapse Studio with Apache Spark for Azure Synapse Analytics.
- Functionality: Ingestion, Export
- Ingestion type supported: Batching
- Use cases: Telemetry
- Underlying SDK: Java
- Documentation: Connect to an Azure Synapse workspace
Azure Data Factory
Azure Data Factory (ADF) is a cloud-based data integration service that allows you to integrate different data stores and perform activities on the data.
- Functionality: Ingestion, Export
- Ingestion type supported: Batching
- Use cases: Data orchestration
- Documentation: Copy data to your database by using Azure Data Factory
Azure Event Hubs
Azure Event Hubs is a big data streaming platform and event ingestion service. You can configure continuous ingestion from customer-managed Event Hubs.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Documentation: Azure Event Hubs data connection
Azure Functions
Azure Functions allow you to run serverless code in the cloud on a schedule or in response to an event. With input and output bindings for Azure Functions, you can integrate your database into your workflows to ingest data and run queries against your database.
- Functionality: Ingestion, Export
- Ingestion type supported: Batching
- Use cases: Workflow integrations
- Documentation: Integrating Azure Functions using input and output bindings (preview)
- Community Blog: Azure Data Explorer (Kusto) Bindings for Azure Functions
Azure Stream Analytics
Azure Stream Analytics is a real-time analytics and complex event-processing engine that's designed to process high volumes of fast streaming data from multiple sources simultaneously.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Event processing
- Documentation: Get data from Azure Stream Analytics
Fluent Bit
Fluent Bit is an open-source agent that collects logs, metrics, and traces from various sources. It allows you to filter, modify, and aggregate event data before sending it to storage.
- Functionality: Ingestion
- Ingestion type supported: Batching
- Use cases: Logs, Metrics, Traces
- Repository: fluent-bit Kusto Output Plugin
- Documentation: Get data with Fluent Bit
Logstash
The Logstash plugin enables you to process events from Logstash into an Azure Data Explorer database for later analysis.
- Functionality: Ingestion
- Ingestion type supported: Batching
- Use cases: Logs
- Underlying SDK: Java
- Repository: Microsoft Azure - https://github.com/Azure/logstash-output-kusto/
- Documentation: Get data from Logstash
- Community Blog: How to migrate from Elasticsearch to Azure Data Explorer
NLog
NLog is a flexible and free logging platform for various .NET platforms, including .NET Standard. NLog allows you to write to several targets, such as a database, file, or console. With NLog, you can change the logging configuration on the fly. The NLog sink is a target for NLog that allows you to send your log messages to your database. The plugin provides an efficient way to sink your logs to your cluster.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Telemetry, Logs, Metrics
- Underlying SDK: .NET
- Repository: Microsoft Azure - https://github.com/Azure/azure-kusto-nlog-sink
- Documentation: Get data with the NLog sink
- Community Blog: Getting started with NLog sink and Azure Data Explorer
Open Telemetry
The OpenTelemetry connector supports ingestion of data from many receivers into your database. It works as a bridge to ingest data generated by OpenTelemetry into your database by customizing the format of the exported data according to your needs.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Traces, Metrics, Logs
- Underlying SDK: Go
- Repository: Open Telemetry - https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/azuredataexplorerexporter
- Documentation: Get data from OpenTelemetry
- Community Blog: Getting started with Open Telemetry and Azure Data Explorer
Power Automate
Power Automate is an orchestration service used to automate business processes. The Power Automate (previously Microsoft Flow) connector enables you to orchestrate and schedule flows and to send notifications and alerts as part of a scheduled or triggered task.
- Functionality: Ingestion, Export
- Ingestion type supported: Batching
- Use cases: Data orchestration
- Documentation: Microsoft Power Automate connector
Serilog
Serilog is a popular logging framework for .NET applications. Serilog allows developers to control which log statements are output with arbitrary granularity based on the logger's name, logger level, and message pattern. The Serilog sink, also known as an appender, streams your log data to your database, where you can analyze and visualize your logs in real time.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Logs
- Underlying SDK: .NET
- Repository: Microsoft Azure - https://github.com/Azure/serilog-sinks-azuredataexplorer
- Documentation: Get data with the Serilog sink
- Community Blog: Getting started with Serilog sink and Azure Data Explorer
Splunk
Splunk Enterprise is a software platform that allows you to ingest data from many sources simultaneously. The Azure Data Explorer add-on sends data from Splunk to a table in your cluster.
- Functionality: Ingestion
- Ingestion type supported: Batching
- Use cases: Logs
- Underlying SDK: Python
- Repository: Microsoft Azure - https://github.com/Azure/azure-kusto-splunk/tree/main/splunk-adx-alert-addon
- Documentation: Get data from Splunk
- Splunk Base: Microsoft Fabric Add-On for Splunk
- Community Blog: Getting started with Microsoft Azure Data Explorer Add-On for Splunk
Splunk Universal Forwarder
- Functionality: Ingestion
- Ingestion type supported: Batching
- Use cases: Logs
- Repository: Microsoft Azure - https://github.com/Azure/azure-kusto-splunk
- Documentation: Get data from Splunk Universal Forwarder to Azure Data Explorer
- Community Blog: Get data using Splunk Universal forwarder into Azure Data Explorer
Telegraf
Telegraf is an open-source, lightweight agent with a minimal memory footprint for collecting, processing, and writing telemetry data, including logs, metrics, and IoT data. Telegraf supports hundreds of input and output plugins. It's widely used and well supported by the open-source community. The output plugin serves as the connector from Telegraf and supports ingestion of data from many types of input plugins into your database.
- Functionality: Ingestion
- Ingestion type supported: Batching, Streaming
- Use cases: Telemetry, Logs, Metrics
- Underlying SDK: Go
- Repository: InfluxData - https://github.com/influxdata/telegraf/tree/master/plugins/outputs/azure_data_explorer
- Documentation: Get data from Telegraf
- Community Blog: New Azure Data Explorer output plugin for Telegraf enables SQL monitoring at huge scale