It sounds like you're seeing unexpected ingestion into Azure Data Explorer (ADX). Here is what might be happening.
Root Cause Analysis:
Unexpected Data Ingestion:
- The ingestion is happening via a .ingest-from-storage command, which means the engine is pulling the data from Blob Storage rather than receiving it inline from your Python function.
- This usually points to a data connection (Event Grid, Event Hub, or IoT Hub) or to another client that queues data for ingestion. Queued ingestion typically stages data in blobs first, so it shows up as the same command.
Unrecognized AAD App IDs:
- The ingestion is logged with a service principal (AAD App ID) that you don’t recognize. This suggests that another application or process has permission to push data to your ADX table.
- It’s also possible that there is a data connection configured at the database or cluster level.
Why is ADX still ingesting data even after I stopped and shut down the VM running the Python service?
The continuous ingestion into your Azure Data Explorer (ADX) table is likely due to a configured automated ingestion method that is independent of your Python service, such as:
- Update Policies - An update policy runs a query whenever data lands in its source table and appends the result to the target table. It does not pull new data by itself, but it can explain rows appearing in a table that nothing ingests into directly.
- Data Connections (Event Hub, IoT Hub, Event Grid) - These connections keep ingesting for as long as data arrives at the source, regardless of whether your Python service is running.
- Event Grid Subscriptions - An Event Grid data connection relies on a subscription on the storage account and ingests new blobs as they are created in the watched container.
Recommendation -
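Run this command to show the update policy on your table (a table name that contains a hyphen must be bracket-quoted):
.show table ['Test-Table'] policy update
Data connections are not listed by a Kusto command. Review them in the Azure Portal (your cluster > Databases > your database > Data connections) or with the Azure CLI:
az kusto data-connection list --cluster-name <ClusterName> --database-name <DatabaseName> --resource-group <ResourceGroup>
Note - For an Event Grid data connection, also check the Event Grid subscriptions on the source storage account in the Azure Portal.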
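Because a table name such as Test-Table contains a hyphen, it has to be bracket-quoted in management commands. If you send these commands from your Python service, a small helper can take care of the quoting. This is only a minimal sketch: quote_entity and show_update_policy_command are hypothetical names, not part of any Kusto SDK, and the check for "plain" names is deliberately conservative.

```python
import re

# Names made only of letters, digits and underscores (not starting with a
# digit) are treated as safe. This is a conservative assumption for the
# sketch, not the full Kusto naming rule set; anything else gets quoted.
_PLAIN_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_entity(name: str) -> str:
    """Return a table name in a form that is safe to embed in a command."""
    if _PLAIN_NAME.fullmatch(name):
        return name
    if "'" in name or "\\" in name:
        # Escaping rules are out of scope for this sketch, so refuse instead.
        raise ValueError(f"unsupported character in entity name: {name!r}")
    return f"['{name}']"


def show_update_policy_command(table: str) -> str:
    """Build the command that shows a table's update policy."""
    return f".show table {quote_entity(table)} policy update"


print(show_update_policy_command("Test-Table"))
# prints: .show table ['Test-Table'] policy update
```

The resulting string can then be passed to whatever client you already use to run management commands against the database.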
What triggers this .ingest-from-storage ingestion when I didn’t configure it?
The .ingest-from-storage command indicates that the engine is pulling data directly from a storage account. This is typically triggered by:
- An Ingestion Data Connection: Your database might have a data connection (Event Grid for Blob Storage, Event Hub, or IoT Hub) that targets the table.
- An Update Policy: An update policy does not pull from storage itself, but it writes rows into the target table whenever its source table receives data, so it is worth checking as well.
- External Application or Service: Another application using your service principal credentials may be configured to trigger ingestion.
Recommendation:
Review the data connections on the database in the Azure Portal (Databases > your database > Data connections), paying particular attention to any that read from Blob Storage.
Check the details of your Update Policies:
.show table ['Test-Table'] policy update
Note - If you see any unfamiliar blob paths in the Text column of .show commands, check the associated storage account’s access policies in the Azure Portal.
Is there any ingestion queue or buffer in Azure Data Explorer (ADX)? If so, how can I clear or manage it?
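Yes. Queued (batched) ingestion and data connections buffer incoming data and group it into batches according to the table's ingestion batching policy (by default, for up to 5 minutes) before it is ingested, so data can keep arriving for a short while after the sender stops.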
Recommendation:
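List recent operations, including ingestion operations and their state:
.show operations
Check for ingestion attempts that failed:
.show ingestion failures
Note - The .cancel operation <OperationId> command is, to my knowledge, supported only for ingest-from-query operations, so it will not drain queued ingestion.
If your table has an update policy or a data connection, consider disabling the policy or removing the connection temporarily to stop further ingestion.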
How can I trace or stop any automatic ingestion tied to a service principal or storage I don’t recognise?
To trace and stop automatic ingestion from an unknown service principal or storage:
Step 1 - Identify Unknown Service Principals:
- Use this query to list all recent ingestion commands and identify the unknown service principal (AAD App ID):
.show commands
| where CommandType == "DataIngestPull" or CommandType == "DataIngestPush"
| order by StartedOn desc
- Look at the "User" and "Principal" columns.
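- Go to Microsoft Entra ID (formerly Azure Active Directory) > App registrations or Enterprise applications in the Azure Portal and search for the App ID.
- If it is unauthorized, remove its permissions on the database (list them with .show database <DatabaseName> principals), and then disable or delete the app registration if appropriate.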
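If you export the .show commands result (for example from your Python tooling), the check for unfamiliar principals in Step 1 can be automated. The following is a rough sketch under some assumptions: each row is a dict with CommandType, Principal and StartedOn keys, timestamps are ISO 8601 strings in one consistent format, and the function name and sample values are hypothetical.

```python
from collections import defaultdict

# Command types treated as ingestion, taken from the query above.
INGEST_TYPES = {"DataIngestPull", "DataIngestPush"}


def unknown_ingest_principals(rows, known_principals):
    """Summarize ingestion commands issued by principals not in the known set.

    Returns {principal: {"count": int, "last_seen": str}}.
    """
    summary = defaultdict(lambda: {"count": 0, "last_seen": ""})
    for row in rows:
        if row["CommandType"] not in INGEST_TYPES:
            continue  # not an ingestion command
        principal = row["Principal"]
        if principal in known_principals:
            continue  # expected caller, nothing to report
        entry = summary[principal]
        entry["count"] += 1
        # ISO 8601 timestamps in the same format compare correctly as strings.
        entry["last_seen"] = max(entry["last_seen"], row["StartedOn"])
    return dict(summary)


# Hypothetical sample rows; real values come from `.show commands`.
sample_rows = [
    {"CommandType": "DataIngestPull", "Principal": "aadapp=my-app", "StartedOn": "2024-01-01T10:00:00Z"},
    {"CommandType": "DataIngestPull", "Principal": "aadapp=other-app", "StartedOn": "2024-01-01T11:00:00Z"},
    {"CommandType": "DataIngestPull", "Principal": "aadapp=other-app", "StartedOn": "2024-01-01T12:00:00Z"},
    {"CommandType": "TableCreate", "Principal": "aadapp=other-app", "StartedOn": "2024-01-01T13:00:00Z"},
]

print(unknown_ingest_principals(sample_rows, known_principals={"aadapp=my-app"}))
# prints: {'aadapp=other-app': {'count': 2, 'last_seen': '2024-01-01T12:00:00Z'}}
```

Any principal that appears in the result is one to look up in the portal as described above.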
Step 2 - Review Active Data Connections:
- Data connections cannot be listed with a Kusto command. Open each database in the Azure Portal and review its Data connections page, or run az kusto data-connection list for the cluster and database.
- If you see any unknown connections, delete them.
Step 3 - Investigate Unknown Storage Accounts:
If ingestion is happening from a blob storage path you do not recognize:
- Go to Azure Portal > Storage Accounts.
- Review the Access Control (IAM) and Access Keys.
- Revoke any unauthorized access.
Step 4 - Monitor for Future Ingestion:
- Set up an alert in Azure Monitor to notify you of any unexpected ingestion activities.
- Regularly monitor your ADX ingestion logs using:
.show commands
| where CommandType == "DataIngestPull" or CommandType == "DataIngestPush"
| order by StartedOn desc
| limit 50
I hope this information helps. Please do let us know if you have any further queries.
Thank you.