How to use wildcard for XML Files in Copy Data

Jörg Lang 120 Reputation points
2024-06-19T09:19:37.5+00:00

I have an Azure Data Lake Storage Gen2 account with the following folder and file structure:

  • .\source
  • .\source\SystemA_20240618.xml
  • .\source\SystemB_20240618.xml
  • .\source\SystemA_20240619.xml
  • .\source\SystemB_20240619.xml

I need to process only the files matching .\source\SystemA_*.xml within a single data flow/pipeline.

If I name the files explicitly in the dataset configuration, I can process them, but I don't want to modify the dataset every day.

Please help.

Azure Data Factory

1 answer

Sort by: Most helpful
  1. phemanth 8,645 Reputation points Microsoft Vendor
    2024-06-19T09:55:42.7066667+00:00

    @Jörg Lang

    Thanks for the question and using MS Q&A platform.

To process only the files matching the pattern .\source\SystemA_*.xml within a single data flow or pipeline, without modifying the dataset configuration daily, you can use dynamic content in your data flow or pipeline configuration. Here's how you can achieve this:

    Dynamic Content in Data Flow:

    • In your data flow, create a parameter (let's call it SourceFilePath) that represents the folder path where your XML files are located (e.g., .\source).
    • Use this parameter in your source dataset configuration. For example, if you're using a File System source, set the folder path to @{dataset().SourceFilePath}.
    • In your data flow source settings, combine the SourceFilePath parameter with a wildcard (SystemA_*.xml) so that only files matching .\source\SystemA_*.xml are read.
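
    The parameterized dataset described above can be sketched as JSON. This is a minimal sketch, not a definitive definition: the dataset name, linked service name, and container name `mycontainer` are placeholders, and the `SystemA_*.xml` wildcard itself is set in the data flow source's wildcard path setting rather than in the dataset.

    ```json
    {
      "name": "XmlSourceDataset",
      "properties": {
        "type": "Xml",
        "linkedServiceName": {
          "referenceName": "AdlsGen2LinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "SourceFilePath": { "type": "string", "defaultValue": "source" }
        },
        "typeProperties": {
          "location": {
            "type": "AzureBlobFSLocation",
            "fileSystem": "mycontainer",
            "folderPath": { "value": "@dataset().SourceFilePath", "type": "Expression" }
          }
        }
      }
    }
    ```

    With this dataset in place, you only change the parameter value (or leave its default) instead of editing the dataset each day.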

    Dynamic Content in Pipeline:

    • Create a pipeline parameter (e.g., SourceFolderPath) representing the folder path where your XML files reside (e.g., .\source).
    • In your pipeline, use this parameter in a ForEach activity to iterate over the files matching the pattern .\source\SystemA_*.xml.
    • Inside the ForEach activity, configure your data flow activity to read the current file (using @item().name or @concat(pipeline().parameters.SourceFolderPath, '/', item().name)).
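
    The ForEach pattern above can be sketched as a Get Metadata activity that lists the folder, feeding a Filter activity that keeps only the SystemA files. This is a minimal sketch under assumptions: the activity names `GetFileList` and `FilterSystemAFiles` and the dataset name `SourceFolderDataset` are placeholders you would replace with your own.

    ```json
    [
      {
        "name": "GetFileList",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": { "referenceName": "SourceFolderDataset", "type": "DatasetReference" },
          "fieldList": [ "childItems" ]
        }
      },
      {
        "name": "FilterSystemAFiles",
        "type": "Filter",
        "dependsOn": [
          { "activity": "GetFileList", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "items": {
            "value": "@activity('GetFileList').output.childItems",
            "type": "Expression"
          },
          "condition": {
            "value": "@and(startswith(item().name, 'SystemA_'), endswith(item().name, '.xml'))",
            "type": "Expression"
          }
        }
      }
    ]
    ```

    The ForEach activity then iterates over @activity('FilterSystemAFiles').output.value and passes @item().name to the data flow for each matching file.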

    By using dynamic content, you avoid hardcoding the file names and adapt to new files in the folder without modifying the dataset configuration daily.

    Hope this helps. Do let us know if you have any further queries.