@Hiran Amarathunga - Thanks for the question and for using the MS Q&A platform.
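It's great to see that you are using Azure Databricks Auto Loader for your data ingestion needs. Regarding your question, cloudFiles.maxFileAge controls how long a file event is tracked for deduplication, and setting it to 7 years is not recommended: it makes Auto Loader keep far more file-tracking state than most workloads need, which can cause performance issues. The Databricks documentation advises leaving this option untuned unless you ingest on the order of millions of files per hour, and, if you do set it, using a conservative value such as 90 days, because a value that is too small can lead to duplicate ingestion or missed files.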
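As for cloudFiles.backfillInterval, it does not define a time window of data to load. It sets how often Auto Loader triggers an asynchronous backfill, for example 1 day to backfill once a day or 1 week to backfill once a week, so that files missed by file event notifications are eventually processed. Setting it to a very large value such as 100 years would therefore not load missing data for all time; it would mean a backfill is effectively never triggered. A shorter interval picks up missed files sooner, but each backfill adds listing work and can increase load time.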
It is recommended to test different values for these parameters and monitor the performance of your Autoloader job to determine the best values for your specific use case.
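As a rough sketch of how these two options might be passed to Auto Loader, the snippet below builds the option set in plain Python. The helper name autoloader_options, the JSON source format, and the values "90 days" and "1 week" are illustrative assumptions, not recommendations for your workload; the stream definition is left commented out because it needs a Databricks cluster and your own paths.

```python
def autoloader_options(source_format="json",
                       max_file_age="90 days",
                       backfill_interval="1 week"):
    """Hypothetical helper: collect Auto Loader options in one dict.

    The default values are placeholders chosen for illustration only.
    """
    return {
        "cloudFiles.format": source_format,
        # How long a file event is tracked for deduplication.
        "cloudFiles.maxFileAge": max_file_age,
        # How often an asynchronous backfill is triggered.
        "cloudFiles.backfillInterval": backfill_interval,
    }


options = autoloader_options()

# On a Databricks cluster, where `spark` is predefined, the options could be
# applied roughly like this (paths are placeholders you must supply):
#
# df = (spark.readStream
#       .format("cloudFiles")
#       .options(**options)
#       .option("cloudFiles.schemaLocation", "<schema-location>")
#       .load("<input-path>"))
```

Keeping the values in one place like this makes it easier to try a few settings and compare job performance between runs.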
For more details, refer to Azure Databricks - Auto Loader options
Hope this helps. Do let us know if you have any further queries.
If this answers your query, do click Accept Answer and Yes for "Was this answer helpful".