Configure data flow endpoints for Microsoft Fabric OneLake
Important
This page includes instructions for managing Azure IoT Operations components using Kubernetes deployment manifests, which is in preview. This feature is provided with several limitations, and shouldn't be used for production workloads.
To send data to Microsoft Fabric OneLake in Azure IoT Operations, you can configure a data flow endpoint. This configuration allows you to specify the destination endpoint, authentication method, table, and other settings.
To configure a data flow endpoint for Microsoft Fabric OneLake, we recommend using either a user-assigned or system-assigned managed identity. This approach is secure and eliminates the need for managing credentials manually.
After the Microsoft Fabric OneLake is created, you need to assign a role to the Azure IoT Operations managed identity that grants permission to write to the Fabric lakehouse.
If you use a system-assigned managed identity, in the Azure portal, go to your Azure IoT Operations instance and select Overview. Copy the name of the extension listed after Azure IoT Operations Arc extension. For example, azure-iot-operations-xxxx7. The system-assigned managed identity has the same name as the Azure IoT Operations Arc extension.
Go to the Microsoft Fabric workspace you created, then select Manage access > + Add people or groups.
Select Contributor as the role, then select Add. This gives the managed identity the necessary permissions to write to the Fabric lakehouse. To learn more, see Roles in workspaces in Microsoft Fabric.
Create data flow endpoint for Microsoft Fabric OneLake
If you define the endpoint in a Bicep file, deploy it with the Azure CLI.
az deployment group create --resource-group <RESOURCE_GROUP> --template-file <FILE>.bicep
Create a Kubernetes manifest .yaml file with the following content.
YAML
apiVersion: connectivity.iotoperations.azure.com/v1
kind: DataflowEndpoint
metadata:
  name: <ENDPOINT_NAME>
  namespace: azure-iot-operations
spec:
  endpointType: FabricOneLake
  fabricOneLakeSettings:
    # The default Fabric OneLake host URL in most cases
    host: https://onelake.dfs.fabric.microsoft.com
    authentication:
      # See available authentication methods section for method types
      # method: <METHOD_TYPE>
    oneLakePathType: Tables
    names:
      workspaceName: <WORKSPACE_NAME>
      lakehouseName: <LAKEHOUSE_NAME>
Then apply the manifest file to the Kubernetes cluster.
Bash
kubectl apply -f <FILE>.yaml
OneLake path type
The oneLakePathType setting determines which type of path is used in OneLake. The default value is Tables, which is recommended for the most common use cases: the data is stored as a table in the OneLake lakehouse. Alternatively, set it to Files to store the data as files in the lakehouse. The Files path type is useful when you want to store the data in a file format that the Tables path type doesn't support.
The OneLake path type is set in the Basic tab for the data flow endpoint.
Bicep
fabricOneLakeSettings: {
  oneLakePathType: 'Tables' // Or 'Files'
}
YAML
fabricOneLakeSettings:
  oneLakePathType: Tables # Or Files
Available authentication methods
The following authentication methods are available for Microsoft Fabric OneLake data flow endpoints.
System-assigned managed identity
Before you configure the data flow endpoint, assign a role to the Azure IoT Operations managed identity that grants permission to write to the Fabric lakehouse:
In the Azure portal, go to your Azure IoT Operations instance and select Overview.
Copy the name of the extension listed after Azure IoT Operations Arc extension. For example, azure-iot-operations-xxxx7.
Go to your Microsoft Fabric workspace, then select Manage access > + Add people or groups.
Search for the name of your system-assigned managed identity. For example, azure-iot-operations-xxxx7.
Select an appropriate role, then select Add.
Then, configure the data flow endpoint with system-assigned managed identity settings.
In the operations experience data flow endpoint settings page, select the Basic tab, then choose Authentication method > System assigned managed identity.
In most cases, you don't need to specify a service audience. Not specifying an audience creates a managed identity with the default audience scoped to your storage account.
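For a Kubernetes manifest, the system-assigned configuration might look like the following fragment. This is a minimal sketch, not a verified schema: the method value SystemAssignedManagedIdentity, the systemAssignedManagedIdentitySettings key, and the audience field are assumptions modeled on the naming of the user-assigned method in this article, and <AUDIENCE> is a placeholder. Check them against the DataflowEndpoint API version you deploy.

```yaml
fabricOneLakeSettings:
  authentication:
    # Assumed method name for system-assigned managed identity
    method: SystemAssignedManagedIdentity
    # Assumed settings key; an empty object keeps the default audience
    systemAssignedManagedIdentitySettings: {}
      # To override the default audience, replace {} with (assumed field name):
      # audience: <AUDIENCE>
```

Leaving the settings object empty corresponds to the common case described above, where no service audience is specified.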
User-assigned managed identity
Before you configure the data flow endpoint, assign a role to the user-assigned managed identity that grants permission to write to the Fabric lakehouse:
Go to your Microsoft Fabric workspace, then select Manage access > + Add people or groups.
Search for the name of your user-assigned managed identity.
Select an appropriate role, then select Add.
Then, configure the data flow endpoint with user-assigned managed identity settings.
In the operations experience data flow endpoint settings page, select the Basic tab, then choose Authentication method > User assigned managed identity.
Enter the client ID and tenant ID of the user-assigned managed identity in the appropriate fields.
To use a user-assigned managed identity, specify the UserAssignedManagedIdentity authentication method and provide the clientId and tenantId of the managed identity.
Here, the scope is optional and defaults to https://storage.azure.com/.default. If you need to override the default scope, specify the scope setting using Bicep or Kubernetes.
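As a Kubernetes manifest fragment, the user-assigned configuration could be sketched as follows. The method name, clientId, tenantId, and scope come from the text above. The enclosing userAssignedManagedIdentitySettings key is an assumption about how these fields are nested, and the angle-bracket values are placeholders.

```yaml
fabricOneLakeSettings:
  authentication:
    method: UserAssignedManagedIdentity
    # Assumed name of the settings object that holds the identity fields
    userAssignedManagedIdentitySettings:
      clientId: <CLIENT_ID>
      tenantId: <TENANT_ID>
      # Optional; defaults to https://storage.azure.com/.default
      # scope: <SCOPE_URL>
```

Leave scope commented out unless you need to override the default.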
Advanced settings
You can configure advanced settings for the Fabric OneLake endpoint, such as the batching latency and message count. Set them on the Advanced tab of the data flow endpoint in the portal, or in the data flow endpoint custom resource.
Batching
Use the batching settings to configure the maximum number of messages and the maximum latency before the messages are sent to the destination. This setting is useful when you want to optimize for network bandwidth and reduce the number of requests to the destination.
| Field | Description | Required |
| --- | --- | --- |
| latencySeconds | The maximum number of seconds to wait before sending the messages to the destination. The default value is 60 seconds. | No |
| maxMessages | The maximum number of messages to send to the destination. The default value is 100000 messages. | No |
For example, to configure the maximum number of messages to 1000 and the maximum latency to 100 seconds, use the following settings:
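A minimal sketch of those settings as a Kubernetes manifest fragment is shown here. It assumes the two fields from the table nest under a batching key inside fabricOneLakeSettings; that key name is an assumption to verify against your API version.

```yaml
fabricOneLakeSettings:
  # Assumed name of the object that groups the batching fields
  batching:
    latencySeconds: 100   # Send a batch after at most 100 seconds
    maxMessages: 1000     # Send at most 1000 messages per batch
```

Both fields are optional, so you can set either one alone and let the other keep its default.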