Compare Auto Loader file detection modes
Auto Loader supports two modes for detecting new files: directory listing and file notification. You can switch file discovery modes across stream restarts and still obtain exactly-once data processing guarantees.
Directory listing mode
In directory listing mode, Auto Loader identifies new files by listing the input directory. Directory listing mode allows you to quickly start Auto Loader streams without any permission configurations other than access to your data on cloud storage.
In Databricks Runtime 9.1 and above, Auto Loader can automatically detect whether files arrive in your cloud storage in lexical order and, if so, significantly reduce the number of API calls needed to detect new files. See What is Auto Loader directory listing mode? for more details.
File notification mode
File notification mode leverages file notification and queue services in your cloud infrastructure account. Auto Loader can automatically set up a notification service and queue service that subscribe to file events from the input directory.
File notification mode is more performant and scalable for large input directories or a high volume of files but requires additional cloud permissions to set up. For more information, see What is Auto Loader file notification mode?.
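
As a minimal sketch of how the two modes are selected in practice, the helper below builds the option map for an Auto Loader stream. It assumes the `cloudFiles` source and its `cloudFiles.format`, `cloudFiles.schemaLocation`, and `cloudFiles.useNotifications` options; the function names `auto_loader_options` and `read_stream` and the paths in the usage note are hypothetical, introduced only for illustration.

```python
def auto_loader_options(file_format, schema_location, use_notifications=False):
    """Build an option map for an Auto Loader (cloudFiles) stream.

    Hypothetical helper: directory listing is used unless
    use_notifications is True.
    """
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
        # "false" selects directory listing mode,
        # "true" selects file notification mode.
        "cloudFiles.useNotifications": "true" if use_notifications else "false",
    }


def read_stream(spark, input_path, options):
    """Start an Auto Loader read.

    Assumes `spark` is a SparkSession on a Databricks cluster, where the
    cloudFiles source is available.
    """
    return spark.readStream.format("cloudFiles").options(**options).load(input_path)
```

On a Databricks cluster, a call such as `read_stream(spark, "/path/to/input", auto_loader_options("json", "/path/to/schema"))` would start in directory listing mode. Restarting the same stream with `use_notifications=True` would switch it to file notification mode, provided the additional cloud permissions are in place.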
Cloud storage supported by modes
If you migrate from an external location or a DBFS mount to a Unity Catalog volume, Auto Loader continues to provide exactly-once guarantees.
The following table lists which cloud storage systems each mode supports.
Cloud storage | Directory listing | File notification |
---|---|---|
AWS S3 | All versions | All versions |
ADLS Gen2 | All versions | All versions |
GCS | All versions | Databricks Runtime 9.1 and above |
Azure Blob Storage | All versions | All versions |
ADLS Gen1 | All versions | Unsupported |
DBFS | All versions | For mount points only |
Unity Catalog volume | Databricks Runtime 13.3 LTS and above | Unsupported |
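
The table can also be encoded as a small lookup so that a pipeline checks its chosen mode before starting a stream. This is only an illustrative sketch: the `SUPPORT` dictionary copies the table's values verbatim, and the names `SUPPORT` and `is_supported` are hypothetical, not part of any Databricks API.

```python
# Support matrix copied from the table above.
# Keys are storage systems; values map each detection mode to its availability.
SUPPORT = {
    "AWS S3": {"directory_listing": "All versions", "file_notification": "All versions"},
    "ADLS Gen2": {"directory_listing": "All versions", "file_notification": "All versions"},
    "GCS": {
        "directory_listing": "All versions",
        "file_notification": "Databricks Runtime 9.1 and above",
    },
    "Azure Blob Storage": {
        "directory_listing": "All versions",
        "file_notification": "All versions",
    },
    "ADLS Gen1": {"directory_listing": "All versions", "file_notification": "Unsupported"},
    "DBFS": {"directory_listing": "All versions", "file_notification": "For mount points only"},
    "Unity Catalog volume": {
        "directory_listing": "Databricks Runtime 13.3 LTS and above",
        "file_notification": "Unsupported",
    },
}


def is_supported(storage, mode):
    """Return True unless the table marks the combination as Unsupported.

    Conditional entries (a minimum runtime version, mount points only)
    count as supported; the caller still has to check the condition.
    """
    return SUPPORT[storage][mode] != "Unsupported"
```

For example, `is_supported("ADLS Gen1", "file_notification")` is false, so a stream reading from ADLS Gen1 would have to use directory listing mode.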