Help with Metadata-Driven ADF Pipeline & Column Mapping Case Sensitivity

YERNAIDU SIRAPARAPU 45 Reputation points
2025-06-25T17:41:34.31+00:00

Hi everyone,

I'm building a metadata-driven Azure Data Factory (ADF) pipeline to dynamically ingest Excel sales files from multiple source systems (like SAP, Senior, etc.) into raw SQL tables.

Pipeline Overview

I have created a single pipeline pl_ingest_sales_data that:

Uses a SourceSystem parameter.

Pulls metadata from a SQL table raw.IngestionConfig, including:

FolderPath (ADLS path)

FilePattern (e.g., *_Sales.xlsx)

RawTable (SQL sink table)

ColumnMapping (manual mapping JSON)

The pipeline:

Uses Lookup1 to fetch metadata.

Loops through files with ForEach.

Loads data using the Copy Data activity.

Applies mappings dynamically using:

@json(activity('Lookup1').output.firstRow.ColumnMapping)

Scenario 1: Auto Mapping Works

SAP Excel File:

OrderID | Customer | Amount | OrderDate

SQL Table:

CREATE TABLE raw.Sap_Sales (
    OrderID INT,
    Customer NVARCHAR(100),
    Amount DECIMAL(18,2),
    OrderDate DATE
);

Auto Mapping works because column names match exactly — no manual mapping needed.

Scenario 2: Auto Mapping Fails (Manual Mapping Required)

Senior Excel File:

Orders | Client | TotalAmount | DateOfOrder

SQL Table:

CREATE TABLE raw.Senior_Sales (
    OrderID INT,
    CustomerName NVARCHAR(100),
    Amount DECIMAL(18,2),
    OrderDate DATE
);

Issue:

Auto Mapping fails due to column name differences:

  • Orders → OrderID
  • Client → CustomerName

❓ Questions for the Community

Is there any way to make ADF column mapping case-insensitive for Excel sources?

Can I dynamically read column headers from Excel in ADF to auto-generate mappings?

What's the best practice to handle Excel schema drift or naming inconsistencies?

I'd appreciate any advice, best practices, or samples that others have used in similar metadata-driven pipelines.

Thanks in advance for your help!


1 answer

  1. Chandra Boorla 14,510 Reputation points Microsoft External Staff Moderator
    2025-06-25T18:30:43.1266667+00:00

    @YERNAIDU SIRAPARAPU

    Thanks for sharing the detailed scenario; you're on the right track with your metadata-driven pipeline design in ADF. Let me address your questions one by one:

    Is there any way to make ADF column mapping case-insensitive for Excel sources?

    Azure Data Factory's Copy Activity is indeed case-sensitive when it comes to column names. This means that for auto-mapping to work, the source and sink column names must match exactly, including their case. If there are discrepancies, you will need to provide a manual mapping JSON as you're already doing.

    Can I dynamically read column headers from Excel in ADF to auto-generate mappings?

    To dynamically read column headers from Excel and auto-generate mappings, consider the following approaches:

    Get Metadata activity - Use it on the Excel dataset to retrieve the structure, which returns an array of column names and types. You can process that array to build your column mapping JSON dynamically (see the sketch below).
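
    As a quick illustration of that approach, here is a minimal sketch (in Python, e.g. in a notebook or function step rather than ADF expression language) of how a structure array like the one Get Metadata returns could be matched case-insensitively against the sink table's columns to produce the mapping JSON. The sink column list and the alias table are assumptions based on your Senior example:

    import json

    # Hypothetical "structure" output from a Get Metadata activity on the Senior Excel dataset.
    source_structure = [
        {"name": "Orders", "type": "String"},
        {"name": "Client", "type": "String"},
        {"name": "TotalAmount", "type": "String"},
        {"name": "DateOfOrder", "type": "String"},
    ]

    # Sink columns of raw.Senior_Sales, plus per-source aliases for names that differ entirely
    # (both would live in your own config; they are not read from ADF here).
    sink_columns = ["OrderID", "CustomerName", "Amount", "OrderDate"]
    aliases = {"orders": "OrderID", "client": "CustomerName",
               "totalamount": "Amount", "dateoforder": "OrderDate"}

    def build_mappings(structure, sink_columns, aliases):
        """Pair source headers with sink columns, ignoring case and using aliases when names differ."""
        by_lower = {c.lower(): c for c in sink_columns}
        mappings = []
        for col in structure:
            key = col["name"].lower()
            sink = by_lower.get(key) or aliases.get(key)
            if sink:
                mappings.append({"source": {"name": col["name"]}, "sink": {"name": sink}})
        return mappings

    print(json.dumps(build_mappings(source_structure, sink_columns, aliases), indent=2))

    The result has the same shape as the ColumnMapping JSON you already store in raw.IngestionConfig, so it can be passed to the Copy activity in the same way.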

    Mapping data flow with schema drift - Data Flows support schema drift. You can enable it and then use expressions to standardize column names (e.g., toLower()) or conditional rules to rename columns on the fly.

    External processing (Optional) - You could also use an Azure Function, Logic App, or Databricks notebook to read the Excel headers, generate the mapping JSON, and return it to the pipeline before the Copy step.
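
    If you take the external-processing route, a minimal sketch of such a helper is below. It assumes openpyxl is available in the function's environment and that the headers sit in the first row of the first worksheet; the function name and file path are illustrative, not an existing API:

    import json
    from openpyxl import load_workbook

    def read_excel_headers(xlsx_path, sheet_name=None):
        """Return the header row of an Excel worksheet as a list of strings."""
        wb = load_workbook(xlsx_path, read_only=True, data_only=True)
        ws = wb[sheet_name] if sheet_name else wb.worksheets[0]
        first_row = next(ws.iter_rows(min_row=1, max_row=1, values_only=True))
        return [str(v).strip() for v in first_row if v is not None]

    # Example: derive a mapping JSON from the headers of a (hypothetical) Senior sales file.
    headers = read_excel_headers("/tmp/Senior_Sales.xlsx")
    # In practice you would pair these headers with the sink columns (for instance with the
    # case-insensitive matching shown above); a 1:1 mapping is emitted here for brevity.
    mapping = [{"source": {"name": h}, "sink": {"name": h}} for h in headers]
    print(json.dumps(mapping, indent=2))

    The helper would return this JSON to the pipeline (for example as the response of an Azure Function activity or Web activity call), and the Copy activity would consume it exactly as it consumes the value from raw.IngestionConfig.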

    What's the best practice to handle Excel schema drift or naming inconsistencies?

    Central mapping table - Continue using your raw.IngestionConfig with a ColumnMapping JSON for each source system. This approach is scalable and maintainable.

    Standardize column names early - If possible, align naming conventions with source systems or perform a preprocessing step before ingestion.

    Utilize data flows - They are particularly useful for handling schema drift due to their dynamic nature.

    Fallback handling & alerts - Implement a Get Metadata activity before the Copy step to detect header mismatches and send alerts if necessary.
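
    For the fallback/alert point above, the check itself can be small. Here is a sketch in Python; in a real pipeline the expected source names would come from the ColumnMapping stored in raw.IngestionConfig, and the values below are only illustrative:

    import json

    def check_headers(actual_headers, column_mapping_json):
        """Compare the file's headers with the source names the stored mapping expects (case-insensitive)."""
        expected = {m["source"]["name"].lower() for m in json.loads(column_mapping_json)}
        actual = {h.lower() for h in actual_headers}
        return {"missing": sorted(expected - actual),
                "unexpected": sorted(actual - expected)}

    senior_mapping = [
        {"source": {"name": "Orders"}, "sink": {"name": "OrderID"}},
        {"source": {"name": "Client"}, "sink": {"name": "CustomerName"}},
        {"source": {"name": "TotalAmount"}, "sink": {"name": "Amount"}},
        {"source": {"name": "DateOfOrder"}, "sink": {"name": "OrderDate"}},
    ]
    report = check_headers(["Orders", "Client", "TotalAmount", "DateOfOrder"], json.dumps(senior_mapping))
    # Nothing fires here; if the source renamed or dropped a column, "missing" or "unexpected"
    # would be non-empty and you could route the report to your alerting (Logic App, email, etc.).
    if report["missing"] or report["unexpected"]:
        print("Header mismatch:", report)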

    Tip for Dynamic Mapping Expression

    If you're loading the column mapping from your config table, this expression works well in the Copy activity’s Mapping section:

    @json(activity('Lookup1').output.firstRow.ColumnMapping)
    

    Just make sure your ColumnMapping field in SQL is a valid JSON like this:

    [
      { "source": { "name": "Orders" }, "sink": { "name": "OrderID" } },
      { "source": { "name": "Client" }, "sink": { "name": "CustomerName" } },
      { "source": { "name": "TotalAmount" }, "sink": { "name": "Amount" } },
      { "source": { "name": "DateOfOrder" }, "sink": { "name": "OrderDate" } }
    ]
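
    One caveat worth verifying in your setup: when the mapping is supplied as dynamic content, the Copy activity generally expects the full translator object rather than the bare array. If the bare array is rejected, storing (or wrapping the looked-up value into) something like the following usually resolves it:

    {
      "type": "TabularTranslator",
      "mappings": [
        { "source": { "name": "Orders" }, "sink": { "name": "OrderID" } },
        { "source": { "name": "Client" }, "sink": { "name": "CustomerName" } },
        { "source": { "name": "TotalAmount" }, "sink": { "name": "Amount" } },
        { "source": { "name": "DateOfOrder" }, "sink": { "name": "OrderDate" } }
      ]
    }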
    

    I hope this information helps. Please do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further queries, do let us know; your feedback is valuable and can assist others in the community facing similar issues.

    Thank you.

