How to Use Wildcard in Exists Data Flow Activity

pm-syn-lover 36 Reputation points
2021-05-20T14:59:47.037+00:00

My objective is to use the 'Exists' data flow activity to check whether the data I'm processing already exists in a directory in Azure Data Lake Storage. The issue is that the data I need to check sits in subdirectories. In the past, I've used a double wildcard (**) to reach data in all subdirectories, but it doesn't seem to be working in this case.

Screenshots of my current data flow, source activity, Exists activity, and the error I receive are provided below.

My top directory is 2021. Subdirectories include months and days, where all data is stored. In my source activity, I placed the double wildcard, but when I previewed the data, I received the error you see below, and it appears as though it is not seeing any of the data in the subdirectories.

If anyone has any thoughts/ideas on this, it would be much appreciated. Thank you.

98300-pipeline.jpg

98309-exists.jpg

98285-source.jpg

98250-error.jpg

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. OmarSiado-MSFT 156 Reputation points Microsoft Employee
    2021-05-20T16:52:57.54+00:00

    Hi,

    I think the following link could give you some insight for your requirement:
    https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage#source-transformation

    Please take a look at the examples:
    Wildcard examples:

    * Represents any set of characters.

    ** Represents recursive directory nesting.

    ? Replaces one character.

    [] Matches one of the characters in the brackets.

    /data/sales/**/*.csv Gets all .csv files under /data/sales.

    /data/sales/20??/**/ Gets all files under folders whose names match 20??, such as the years 2000 through 2099.

    /data/sales/*/*/*.csv Gets .csv files two levels under /data/sales.

    /data/sales/2004/*/12/[XY]1?.csv Gets all .csv files in December 2004 starting with X or Y prefixed by a two-digit number.
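
To see how these wildcards might apply to a year/month/day layout like the one in the question, here is a minimal Python sketch that approximates the rules listed above. It is not the matcher that Data Factory or Synapse uses internally. The helper names (`wildcard_to_regex`, `match_paths`) and the sample file names are hypothetical, and the sketch assumes that `**/` may also span zero directory levels and that `*` and `?` do not cross a `/`.

```python
import re

def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a wildcard path into a regex (an approximation, see assumptions above)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            # '**/' : zero or more whole directory levels (assumed behaviour)
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            # '**' not followed by '/' : anything, including '/'
            parts.append(r".*")
            i += 2
        elif ch == "*":
            # '*' : any set of characters within one path segment
            parts.append(r"[^/]*")
            i += 1
        elif ch == "?":
            # '?' : exactly one character within a path segment
            parts.append(r"[^/]")
            i += 1
        elif ch == "[":
            end = pattern.find("]", i)
            if end == -1 or end == i + 1:
                # unmatched or empty brackets: treat '[' as a literal
                parts.append(re.escape(ch))
                i += 1
            else:
                # '[XY]' : one character out of the listed set (taken literally)
                parts.append("[" + re.escape(pattern[i + 1:end]) + "]")
                i = end + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))

def match_paths(pattern: str, paths: list) -> list:
    """Return the paths that the wildcard pattern matches in full."""
    regex = wildcard_to_regex(pattern)
    return [p for p in paths if regex.fullmatch(p)]

# Hypothetical files in a year/month/day layout
SAMPLE_PATHS = [
    "2021/05/20/part-0001.csv",
    "2021/05/summary.csv",
    "2021/readme.txt",
    "2020/12/31/part-0001.csv",
]

if __name__ == "__main__":
    for pattern in ("2021/*.csv", "2021/*/*/*.csv", "2021/**/*.csv", "20??/**/*.csv"):
        print(pattern, "->", match_paths(pattern, SAMPLE_PATHS))
```

Under these assumptions, `2021/*.csv` finds nothing because a single `*` does not descend into the month and day folders, while `2021/**/*.csv` reaches .csv files at any depth below 2021. Results in an actual source transformation should be confirmed with a data preview.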

    Please let us know if the suggestion works for you.

    Regards,

