Data connectors overview

Data ingestion is the process used to load data from one or more sources into a Real-Time Intelligence KQL database in Microsoft Fabric. Once ingested, the data becomes available for query. Real-Time Intelligence provides several connectors for data ingestion.

The following table summarizes the available data connectors, tools, and integrations.

| Name | Functionality | Supports streaming? | Type | Use cases |
|---|---|---|---|---|
| Apache Flink | Ingestion | ✔️ | Open source | Telemetry |
| Apache Kafka | Ingestion | ✔️ | Open source | Logs, Telemetry, Time series |
| Apache Log4J 2 | Ingestion | ✔️ | Open source | Logs |
| Apache Spark | Export, Ingestion | | Open source | Telemetry |
| Apache Spark for Azure Synapse Analytics | Export, Ingestion | | First party | Telemetry |
| Azure Data Factory | Export, Ingestion | | First party | Data orchestration |
| Azure Event Hubs | Ingestion | ✔️ | First party | Messaging |
| Azure Functions | Export, Ingestion | | First party | Workflow integrations |
| Azure Stream Analytics | Ingestion | ✔️ | First party | Event processing |
| Fluent Bit | Ingestion | ✔️ | Open source | Logs, Metrics, Traces |
| Logstash | Ingestion | | Open source | Logs |
| NLog | Ingestion | ✔️ | Open source | Telemetry, Logs, Metrics |
| Open Telemetry | Ingestion | ✔️ | Open source | Traces, Metrics, Logs |
| Power Automate | Export, Ingestion | | First party | Data orchestration |
| Serilog | Ingestion | ✔️ | Open source | Logs |
| Splunk | Ingestion | | Open source | Logs |
| Splunk Universal Forwarder | Ingestion | | Open source | Logs |
| Telegraf | Ingestion | ✔️ | Open source | Metrics, Logs |

The following sections describe each connector in more detail.

Apache Flink

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. The connector implements a data sink for moving data between Azure Data Explorer and Flink clusters. Using Azure Data Explorer and Apache Flink, you can build fast and scalable applications targeting data-driven scenarios such as machine learning (ML), extract-transform-load (ETL), and log analytics.

Apache Kafka

Apache Kafka is a distributed streaming platform for building real-time streaming data pipelines that reliably move data between systems or applications. Kafka Connect is a tool for scalable and reliable streaming of data between Apache Kafka and other data systems. The Kafka Sink serves as the connector from Kafka and doesn't require writing code. The connector is gold certified by Confluent, meaning it has gone through comprehensive review and testing for quality, feature completeness, compliance with standards, and performance.
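As an illustration, a Kafka Connect sink is configured with a small set of properties rather than code. The sketch below follows the property names of the open-source Kusto sink connector; the connector class is that project's, while the URLs, credentials, topic, database, and table names are all placeholders:

```properties
name=kusto-sink
connector.class=com.microsoft.azure.kusto.kafka.connect.sink.KustoSinkConnector
topics=telemetry
# Ingestion and query endpoints of the target cluster (placeholders)
kusto.ingestion.url=https://ingest-<cluster>.kusto.windows.net
kusto.query.url=https://<cluster>.kusto.windows.net
# Service principal used to authenticate (placeholders)
aad.auth.appid=<application-id>
aad.auth.appkey=<application-key>
aad.auth.authority=<tenant-id>
# Map each topic to a target database, table, and payload format
kusto.tables.topics.mapping=[{"topic": "telemetry", "db": "MyDatabase", "table": "Telemetry", "format": "json"}]
```

The mapping property is what lets one connector instance fan several topics out to different tables.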

Apache Log4J 2

Log4J is a popular logging framework for Java applications maintained by the Apache Software Foundation. Log4J allows developers to control which log statements are output with arbitrary granularity based on the logger's name, logger level, and message pattern. The Apache Log4J 2 sink allows you to stream your log data to your database, where you can analyze and visualize your logs in real time.
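The per-logger granularity described above is driven by a standard `log4j2.xml` configuration. This sketch uses a console appender and a hypothetical logger name purely to show the shape; the sink-specific appender settings are documented with the connector itself:

```xml
<Configuration status="WARN">
  <Appenders>
    <!-- The ADX sink is declared as an appender in the same way -->
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- A specific logger can log at a finer level than the rest of the app -->
    <Logger name="com.example.ingestion" level="debug"/>
    <Root level="info">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>
```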

Apache Spark

Apache Spark is a unified analytics engine for large-scale data processing. The Spark connector is an open-source project that can run on any Spark cluster. It implements a data source and data sink for moving data to or from Spark clusters. Using the Apache Spark connector, you can build fast and scalable applications targeting data-driven scenarios such as machine learning (ML), extract-transform-load (ETL), and log analytics. With the connector, your database becomes a valid data store for standard Spark source and sink operations, such as read, write, and writeStream.
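A minimal PySpark write sketch, assuming a Spark cluster with the open-source Kusto Spark connector package attached and an existing DataFrame `df`. The option names follow that connector's documentation; the cluster, database, table, and credential values are placeholders, and the fragment isn't runnable on its own:

```python
# Sketch only: requires a Spark cluster with the Kusto Spark connector attached
# and a DataFrame `df` to write. All option values are placeholders.
df.write \
    .format("com.microsoft.kusto.spark.datasource") \
    .option("kustoCluster", "<cluster-name>") \
    .option("kustoDatabase", "MyDatabase") \
    .option("kustoTable", "Telemetry") \
    .option("kustoAadAppId", "<application-id>") \
    .option("kustoAadAppSecret", "<application-key>") \
    .option("kustoAadAuthorityID", "<tenant-id>") \
    .mode("Append") \
    .save()
```

Reading uses the same format string with `spark.read`, which is what makes the database a standard source and sink for Spark jobs.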

Apache Spark for Azure Synapse Analytics

Apache Spark is a parallel processing framework that supports in-memory processing to boost the performance of big data analytic applications. Apache Spark in Azure Synapse Analytics is one of Microsoft's implementations of Apache Spark in the cloud. You can access a database from Synapse Studio with Apache Spark for Azure Synapse Analytics.

Azure Data Factory

Azure Data Factory (ADF) is a cloud-based data integration service that allows you to integrate different data stores and perform activities on the data.
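For example, an ADF copy activity can land data in a Kusto table. The sketch below shows only the general shape of such an activity definition; the dataset names are hypothetical, and the source and sink type names follow ADF's copy activity documentation:

```json
{
  "name": "CopyToKustoTable",
  "type": "Copy",
  "inputs": [ { "referenceName": "SourceBlobDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "KustoTableDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": { "type": "DelimitedTextSource" },
    "sink": { "type": "AzureDataExplorerSink" }
  }
}
```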

Azure Event Hubs

Azure Event Hubs is a big data streaming platform and event ingestion service. You can configure continuous ingestion from customer-managed Event Hubs.

Azure Functions

Azure Functions allow you to run serverless code in the cloud on a schedule or in response to an event. With input and output bindings for Azure Functions, you can integrate your database into your workflows to ingest data and run queries against your database.

Azure Stream Analytics

Azure Stream Analytics is a real-time analytics and complex event-processing engine that's designed to process high volumes of fast streaming data from multiple sources simultaneously.

Fluent Bit

Fluent Bit is an open-source agent that collects logs, metrics, and traces from various sources. It allows you to filter, modify, and aggregate event data before sending it to storage.
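As a sketch, a Fluent Bit pipeline might tail application log files and forward them through the `azure_kusto` output plugin. The parameter names below follow that plugin's documentation; every value is a placeholder:

```
[INPUT]
    Name                tail
    Path                /var/log/app/*.log
    Tag                 app.logs

[OUTPUT]
    Name                azure_kusto
    Match               app.*
    tenant_id           <tenant-id>
    client_id           <application-id>
    client_secret       <application-key>
    ingestion_endpoint  https://ingest-<cluster>.kusto.windows.net
    database_name       MyDatabase
    table_name          AppLogs
```

Filter sections can be added between the input and output to modify or aggregate events before they're sent.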

Logstash

The Logstash plugin enables you to process events from Logstash into an Azure Data Explorer database for later analysis.
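A minimal pipeline sketch, assuming the open-source Logstash output plugin for Kusto is installed. The parameter names follow that plugin's documentation; the paths, credentials, and table names are placeholders:

```
input {
  file { path => "/var/log/app/*.log" }
}
output {
  kusto {
    # Local staging path for batched events before upload (placeholder)
    path => "/tmp/kusto/%{+YYYY-MM-dd-HH-mm}.txt"
    ingest_url => "https://ingest-<cluster>.kusto.windows.net"
    app_id => "<application-id>"
    app_key => "<application-key>"
    app_tenant => "<tenant-id>"
    database => "MyDatabase"
    table => "AppLogs"
    json_mapping => "logs_mapping"
  }
}
```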

NLog

NLog is a flexible and free logging platform for various .NET platforms, including .NET Standard. NLog allows you to write to several targets, such as a database, file, or console, and to change the logging configuration on the fly. The NLog sink is a target for NLog that sends your log messages to your database, providing an efficient way to sink your logs to your cluster.
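NLog's on-the-fly reconfiguration is driven by `NLog.config`, reloaded automatically when `autoReload` is set. The sketch below uses a plain file target and hypothetical logger names to show the shape; the ADX-specific target settings are documented with the sink:

```xml
<nlog autoReload="true">
  <targets>
    <!-- A plain file target; the ADX sink is declared as a target the same way -->
    <target name="logfile" type="File" fileName="app.log"
            layout="${longdate}|${level:uppercase=true}|${logger}|${message}" />
  </targets>
  <rules>
    <!-- Route by logger name and minimum level -->
    <logger name="MyApp.Ingestion.*" minlevel="Debug" writeTo="logfile" />
    <logger name="*" minlevel="Info" writeTo="logfile" />
  </rules>
</nlog>
```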

Open Telemetry

The OpenTelemetry connector supports ingestion of data from many receivers into your database. It works as a bridge to ingest data generated by OpenTelemetry into your database by customizing the format of the exported data according to your needs.
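In an OpenTelemetry Collector pipeline, the connector is configured as an exporter. This fragment follows the option names of the Azure Data Explorer exporter in the collector-contrib distribution, with placeholder values; a complete collector config would also declare receivers for the pipeline:

```yaml
exporters:
  azuredataexplorer:
    cluster_uri: "https://<cluster>.kusto.windows.net"
    application_id: "<application-id>"
    application_key: "<application-key>"
    tenant_id: "<tenant-id>"
    db_name: "MyDatabase"
    metrics_table_name: "OTelMetrics"
    logs_table_name: "OTelLogs"
    traces_table_name: "OTelTraces"
```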

Power Automate

Power Automate is an orchestration service used to automate business processes. The Power Automate (previously Microsoft Flow) connector enables you to orchestrate and schedule flows and send notifications and alerts as part of a scheduled or triggered task.

Serilog

Serilog is a popular logging framework for .NET applications. Serilog allows developers to control which log statements are output with arbitrary granularity based on the logger's name, logger level, and message pattern. The Serilog sink, also known as an appender, streams your log data to your database, where you can analyze and visualize your logs in real time.

Splunk

Splunk Enterprise is a software platform that allows you to ingest data from many sources simultaneously. The Azure Data Explorer add-on sends data from Splunk to a table in your cluster.

Splunk Universal Forwarder

Splunk Universal Forwarder is a lightweight agent that collects data and forwards it to Splunk or other destinations. You can configure it to send log data to a table in your database.

Telegraf

Telegraf is an open-source, lightweight agent with a minimal memory footprint for collecting, processing, and writing telemetry data, including logs, metrics, and IoT data. Telegraf supports hundreds of input and output plugins, and it's widely used and well supported by the open-source community. The output plugin serves as the connector from Telegraf and supports ingestion of data from many types of input plugins into your database.
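A sketch of a `telegraf.conf` pairing a standard input plugin with the Azure Data Explorer output plugin. The option names follow that plugin's documentation; the endpoint and database values are placeholders:

```toml
# Collect host CPU metrics with a built-in input plugin
[[inputs.cpu]]
  percpu = false
  totalcpu = true

# Write metrics to a Kusto database; authentication is resolved from the
# environment (for example, a managed identity or Azure CLI login)
[[outputs.azure_data_explorer]]
  endpoint_url = "https://ingest-<cluster>.kusto.windows.net"
  database = "MyDatabase"
  # One table per metric name, or "singletable" to land everything in one table
  metrics_grouping_type = "tablepermetric"
```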