Does the data skipping algorithm work without Databricks?

M, Mathan 1 Reputation point
2022-04-27T15:46:30.573+00:00

Delta log stats are not captured in a non-Databricks environment. Can the data skipping algorithm work without Databricks?

Azure Monitor

An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.

Azure Data Lake Storage

An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.


1 answer

  1. PRADEEPCHEEKATLA 91,866 Reputation points
    2022-04-28T11:52:05.09+00:00

    Hello @M, Mathan ,

    Thanks for the question and for using the MS Q&A platform.

    Z-Ordering is a technique to co-locate related information in the same set of files. This co-locality is automatically used by Delta Lake on Azure Databricks data-skipping algorithms to dramatically reduce the amount of data that needs to be read.
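
To make the idea concrete, here is a minimal sketch of how data skipping can use per-file min/max statistics, and why co-locating related values helps. It is plain Python with no Delta Lake or Databricks dependency; the `FileStats` class, the `files_to_read` helper, and the file names and value ranges are all hypothetical and only illustrate the principle, not the actual Delta Lake implementation.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file statistics for one column (illustration only)."""
    path: str
    min_value: int
    max_value: int


def files_to_read(stats, lo, hi):
    """Return paths of files whose [min, max] range overlaps the query range [lo, hi]."""
    return [s.path for s in stats if s.max_value >= lo and s.min_value <= hi]


# Assumed layout before clustering: every file spans almost the whole value range.
unclustered = [
    FileStats("part-0", 1, 98),
    FileStats("part-1", 3, 100),
    FileStats("part-2", 2, 99),
]

# Assumed layout after clustering: each file covers a narrow, disjoint range.
clustered = [
    FileStats("part-0", 1, 33),
    FileStats("part-1", 34, 66),
    FileStats("part-2", 67, 100),
]

# Query: WHERE value BETWEEN 40 AND 45
print(files_to_read(unclustered, 40, 45))  # no file can be skipped
print(files_to_read(clustered, 40, 45))    # only one file has to be read
```

With the unclustered layout every file overlaps the query range, so nothing can be skipped. With the clustered layout two of the three files are excluded from their statistics alone. Skipping of this kind depends on the statistics being present for each file.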

    For more details, refer to Z-Ordering (multi-dimensional clustering).

    Hope this helps. Please let us know if you have any further queries.

