Hi Sourav,
Thank you for posting your query on the Microsoft Q&A platform.
Here are the steps to implement the solution approach you have described:
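Create an Azure Data Factory (ADF) pipeline that copies files from the on-premises file share to Azure Data Lake Storage (ADLS) and validates each file name against a SharePoint list of approved files. Use the Copy activity ("Copy data" in the pipeline designer) to copy the files; reaching an on-premises file share requires a self-hosted integration runtime. Use the Lookup activity with the SharePoint Online List connector to read the approved list, then compare the file name against the result, for example in an If Condition activity.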
Read the file and validate it against the supplied metadata file. You can do this in Azure Databricks with a Python or Scala notebook that reads both files and checks the data against whatever the metadata file defines (for example, column names and data types).
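The validation step could look roughly like the sketch below. It assumes the file is a CSV with a header row and that the metadata file is JSON listing each column's name, type, and nullability; that layout, the `validate_csv` function, and the `CASTERS` table are illustrative assumptions, not something defined in your question. It uses only the Python standard library, so it runs in a Databricks notebook as-is for small files; for large files you would express the same checks with Spark.

```python
import csv
import io
import json

# Assumed (hypothetical) metadata layout:
# {"columns": [{"name": "id", "type": "int", "nullable": false}, ...]}
CASTERS = {"int": int, "float": float, "string": str}


def validate_csv(csv_text, metadata_json):
    """Return a list of error messages; an empty list means the file passed."""
    spec = json.loads(metadata_json)["columns"]
    expected = [col["name"] for col in spec]

    reader = csv.DictReader(io.StringIO(csv_text))
    # The header must match the metadata exactly, including column order.
    if reader.fieldnames != expected:
        return [f"header mismatch: expected {expected}, got {reader.fieldnames}"]

    errors = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in spec:
            value = row[col["name"]]
            if value is None or value == "":
                # Empty values are allowed unless the column is marked non-nullable.
                if not col.get("nullable", True):
                    errors.append(f"line {line_no}: {col['name']} is required")
                continue
            try:
                CASTERS[col["type"]](value)  # type check by attempting the cast
            except ValueError:
                errors.append(
                    f"line {line_no}: {col['name']}={value!r} is not {col['type']}"
                )
    return errors
```

The function returns the full list of problems rather than stopping at the first one, so the same list can be attached to the alert sent for a rejected file.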
If the file passes validation, move it to a target folder in ADLS. If the file fails validation, move it to an error folder and send an alert. ADF has no dedicated move activity, so a move is done with a Copy activity followed by a Delete activity on the source file. To send the alert, you can use Azure Logic Apps, for example by calling a Logic App with an HTTP request trigger from an ADF Web activity.
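If you prefer to handle the routing in the same Databricks notebook that does the validation, the logic could be sketched as follows. This is only an illustration: the function names, the alert payload fields, and the idea of passing the move and alert operations in as callables are my assumptions, and the Logic App URL would be the one generated by your own HTTP request trigger.

```python
import json
import posixpath
import urllib.request


def destination_for(file_path, errors, target_dir, error_dir):
    """Pick the target folder when there are no errors, otherwise the error folder."""
    folder = error_dir if errors else target_dir
    return posixpath.join(folder, posixpath.basename(file_path))


def build_alert(file_path, errors):
    """JSON body for the alert; the field names are illustrative only."""
    payload = {"file": file_path, "errorCount": len(errors), "errors": errors[:20]}
    return json.dumps(payload).encode("utf-8")


def send_alert(logic_app_url, file_path, errors):
    """POST the alert to a Logic App that starts with an HTTP request trigger."""
    request = urllib.request.Request(
        logic_app_url,
        data=build_alert(file_path, errors),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def route_file(file_path, errors, target_dir, error_dir, move, alert):
    """Move the file to the right folder and alert only when validation failed."""
    destination = destination_for(file_path, errors, target_dir, error_dir)
    move(file_path, destination)
    if errors:
        alert(file_path, errors)
    return destination
```

In a Databricks notebook, `move` could be `dbutils.fs.mv` and `alert` a small lambda that wraps `send_alert` with your Logic App URL. Keeping them as parameters also makes the routing easy to test without touching storage.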
Read the validated file and update the Delta table. You can do this in Azure Databricks by loading the file into a DataFrame and applying it to the Delta Lake table, for example with a merge (upsert) on the table's key columns.
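A minimal sketch of that update, assuming the validated file is a CSV with a header row and that the table is updated with an upsert on one or more key columns (your table may need a different strategy, such as append or overwrite). The function names, paths, and key columns are placeholders. The `delta.tables` import is deferred into the function because it is only available where Delta Lake is installed, such as a Databricks cluster.

```python
def merge_condition(key_columns, target_alias="t", source_alias="s"):
    """Build the join condition used by the merge, e.g. 't.id = s.id'."""
    if not key_columns:
        raise ValueError("at least one key column is required")
    return " AND ".join(
        f"{target_alias}.{col} = {source_alias}.{col}" for col in key_columns
    )


def upsert_validated_file(spark, file_path, delta_path, key_columns):
    """Upsert the rows of a validated CSV file into an existing Delta table."""
    from delta.tables import DeltaTable  # provided by the Databricks runtime

    source = (
        spark.read.option("header", "true")
        .option("inferSchema", "true")  # assumption: inferred types match the table
        .csv(file_path)
    )
    target = DeltaTable.forPath(spark, delta_path)
    (
        target.alias("t")
        .merge(source.alias("s"), merge_condition(key_columns))
        .whenMatchedUpdateAll()      # update rows whose keys already exist
        .whenNotMatchedInsertAll()   # insert rows with new keys
        .execute()
    )
```

In a notebook this would be called as, for example, `upsert_validated_file(spark, "/mnt/target/sales.csv", "/mnt/delta/sales", ["id"])`, where both paths and the key column are hypothetical.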
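Create a SharePoint list of approved files and create a workflow to add, remove, and update entries. You can use SharePoint Designer to create the workflow on SharePoint Server; for SharePoint Online, Power Automate is the recommended tool for new workflows.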
Secure an area in the file share and control access to it with Active Directory (AD) groups, granting folder permissions to the groups rather than to individual users.
Upload the metadata files to the secured area in the file share and sync them to ADLS through an ADF pipeline. You can use the Copy activity in ADF to copy the metadata files from the file share to ADLS.
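Register the data sources and capture data lineage for the whole process in Azure Purview (now Microsoft Purview). Connecting the data factory to the Purview account lets Purview record lineage for the ADF copy activities; also check which lineage options are available in your environment for the Databricks job and the Delta Lake table.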
To summarize, the solution uses an ADF pipeline to copy files from the on-premises file share to ADLS, Databricks to validate the files against the metadata, Delta Lake to store the validated data, SharePoint to manage the list of approved files, AD groups to control access to the file share, and Azure Purview to register the assets and capture data lineage for the whole process.
Hope this helps. Please let me know if you have any further queries. Thank you.