I am breaking down your use case into the following steps:
1. Create a Linked Service
- First, create a linked service in ADF to connect to your ADLS Gen2 storage account.
- Go to the Manage tab in ADF.
- Select Linked services and click New.
- Choose Azure Data Lake Storage Gen2 and configure the linked service with your storage account credentials.
2. Create a Dataset
- Create a dataset that points to the folder structure in your ADLS Gen2.
- Go to the Author tab, select Datasets, and click New dataset.
- Choose Azure Data Lake Storage Gen2 as your data store and then select XML as the file format.
- In the Connection tab, choose the linked service you created earlier.
- Specify the Rootfolder (or a broader folder) in the file path. Do not include subfolders in this step.
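- When you later configure the Copy Data activity, set File path type to Wildcard file path in its Source settings; this option belongs to the activity source rather than to the dataset.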
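For reference, ADF stores a dataset like this as JSON. The sketch below builds a roughly equivalent definition as a Python dictionary. It assumes a dataset parameter named FolderPath (matching the dataset().FolderPath reference used in the next step); the dataset name, linked service name, and file system name are hypothetical placeholders, and the JSON that ADF generates for your factory may differ in detail.

```python
import json


def build_xml_dataset(name, linked_service, file_system):
    """Return a dict resembling an ADF XML dataset on ADLS Gen2.

    All three arguments are placeholder names chosen for illustration.
    """
    return {
        "name": name,
        "properties": {
            "type": "Xml",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # FolderPath is a dataset parameter supplied by the pipeline at run time.
            "parameters": {"FolderPath": {"type": "string"}},
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": file_system,
                    "folderPath": {
                        "value": "@dataset().FolderPath",
                        "type": "Expression",
                    },
                }
            },
        },
    }


dataset = build_xml_dataset("StockXmlDataset", "AdlsGen2LinkedService", "mycontainer")
print(json.dumps(dataset, indent=2))
```

Comparing this against the Code view of your own dataset in the Author tab is a quick way to confirm the parameter is wired up as expected.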
3. Configure the Dataset to Use Wildcards
- Use the wildcard characters to dynamically reference the subfolders:
- Set the File path in the dataset like this: `@{dataset().FolderPath}/IN/OUT/STOCK/*/*/*/*.xml`.
- This path assumes your XML files are named `*.xml` and are located at the end of a folder structure following the pattern `STOCK/YYYY/MM/DD`.
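To get a feel for which files a pattern of this shape would pick up, here is a minimal Python sketch that checks a few paths against it. It treats each `*` as matching within a single folder level, which is an assumption made for this illustration and not a statement of ADF's exact matching rules. The year, month, and file names are made up.

```python
from fnmatch import fnmatchcase


def matches_wildcard(path, pattern):
    """Match path against pattern one folder level at a time.

    Each '*' is confined to a single path segment here; this is an
    assumption of the sketch, not ADF's documented behaviour.
    """
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(p, q) for p, q in zip(path_parts, pattern_parts))


# FolderPath is assumed to resolve to "Rootfolder" for this example.
PATTERN = "Rootfolder/IN/OUT/STOCK/*/*/*/*.xml"

candidates = [
    "Rootfolder/IN/OUT/STOCK/2024/01/31/stock_a.xml",  # YYYY/MM/DD depth, .xml
    "Rootfolder/IN/OUT/STOCK/2024/01/stock_b.xml",     # one level too shallow
    "Rootfolder/IN/OUT/STOCK/2024/01/31/readme.txt",   # wrong extension
]
matched = [c for c in candidates if matches_wildcard(c, PATTERN)]
print(matched)
```

Only the first candidate sits at the STOCK/YYYY/MM/DD depth and ends in .xml, so it is the only one this sketch reports as a match.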
4. Create the Pipeline
- In the Author tab, create a new pipeline.
- Add a Get Metadata activity to retrieve the list of subfolders (dates) within the monthly folder.
- Point this activity to your dataset.
- In the Field list, select Child items to retrieve a list of folders (e.g., `31` in your example).
- Add a ForEach activity to iterate over each subfolder.
- Inside the ForEach activity, add a Copy Data activity to copy the XML files from the subfolder to your destination.
- In the Source settings of the Copy Data activity, set the File path dynamically using the folder structure retrieved from the Get Metadata activity.
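The Get Metadata, ForEach, and Copy Data flow above can be pictured with a small Python simulation. Child items come back as a list of entries with a name and a type; the sample list, the month folder path, and the helper function names below are hypothetical and only illustrate the iteration logic, not ADF itself.

```python
def folders_from_child_items(child_items):
    """Keep only folder names from a Get Metadata style childItems list."""
    return [item["name"] for item in child_items if item["type"] == "Folder"]


def plan_copies(month_folder, child_items):
    """Return one wildcard source path per day folder (the ForEach body)."""
    return [
        f"{month_folder}/{day}/*.xml"
        for day in folders_from_child_items(child_items)
    ]


# Hypothetical Get Metadata output for a monthly folder.
child_items = [
    {"name": "30", "type": "Folder"},
    {"name": "31", "type": "Folder"},
    {"name": "notes.txt", "type": "File"},
]

for source_path in plan_copies("Rootfolder/IN/OUT/STOCK/2024/01", child_items):
    print(source_path)
```

In the pipeline itself, the ForEach Items setting would typically reference the Get Metadata output's childItems, and a type check like the one above can be useful if a monthly folder may also contain loose files.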
5. Configure the Pipeline
- Make sure to parameterize your dataset to allow dynamic paths based on the subfolder structure.
- Use expressions to dynamically generate the file paths in the Copy Data activity.
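- For example, the file path could be: `@concat('Rootfolder/subfolder/IN/OUT/STOCK/', item().Year, '/', item().Month, '/', item().Day, '/*.xml')`. Note that each Child items entry exposes only name and type, so Year, Month, and Day here stand for values you supply yourself (for example, from pipeline parameters), while `item().name` gives the current subfolder name.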
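As a rough analogue of that concat expression, the following Python sketch assembles the same kind of path from a date. The function name is hypothetical, and the zero-padded month and day are an assumption about how your folders are named; adjust the format if your folders use, say, 1 instead of 01.

```python
from datetime import date


def build_source_path(root, day):
    """Build a wildcard path of the form <root>/IN/OUT/STOCK/YYYY/MM/DD/*.xml.

    Zero-padding of month and day is assumed, not taken from the source.
    """
    return f"{root}/IN/OUT/STOCK/{day:%Y/%m/%d}/*.xml"


path = build_source_path("Rootfolder/subfolder", date(2024, 1, 31))
print(path)
```

Working the path out this way first makes it easier to spot a missing slash or an unpadded month before translating it into the pipeline expression.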