Share via

Get Metadata activity not infering data type correctly for Excel

Dhruv Singla 210 Reputation points
2026-02-23T18:39:21.2366667+00:00

Following is the screenshot of my excel

User's image

I use GetMetadata activity in ADF to infer the datatypes of the excel sheet

User's image

It infers the data type of the column as string instead of integer.

User's image

I know that 'Get Metadata' doesn't infer data type properly for csv files. Is it the same for excels?

Azure Data Factory
Azure Data Factory

An Azure service for ingesting, preparing, and transforming data at scale.

0 comments No comments
{count} votes

Answer accepted by question author
  1. Jerald Felix 10,975 Reputation points
    2026-02-24T11:18:11.18+00:00

    Hello Dhruv Singla,

    Thanks for raising this question in Q&A forum.

    The Get Metadata activity in ADF returning string instead of integer for Excel columns is a known limitation, not a bug in your configuration.

    Why This Happens

    The Get Metadata activity uses the first few rows of the file to sample and infer data types. For Excel files, the connector reads cell values as generic strings by default because Excel internally stores cell content with formatting metadata that ADF doesn't fully evaluate during schema inference. This is particularly common when:

    The column header row is text-based

    Cells have mixed formatting (e.g., text-formatted number cells)

    • The Excel connector defaults to treating values as strings to avoid data loss

    Workarounds

    1. Manually Define Schema in Dataset Instead of relying on inferred types, explicitly define the schema in your Excel dataset:

    Open the dataset → Go to the Schema tab

    Click Import schema or manually add columns with correct types

    Set the column type to Int32 or Int64 for integer columns

    Once schema is set in the dataset, Get Metadata will respect it.stackoverflow+1

    2. Use a Lookup Activity Instead For more reliable type inference, replace Get Metadata with a Lookup activity using a query to fetch the data and then reference @activity('Lookup1').output.firstRow:

    json
    @activity('Lookup1').output.firstRow.YourColumnName
    

    3. Use Data Flow with Schema Inference For more accurate schema detection, use Mapping Data Flows with the source configured to infer schema from the Excel file:

    • Data Flows use Apache Spark for inference, which is significantly more accurate for data types compared to Get Metadata

    4. Type Cast After Get Metadata If you must use Get Metadata, cast the string value to integer downstream using an expression:

    json
    @int(activity('GetMetadata1').output.structure[0].type)
    

    Or handle the type conversion in a subsequent Copy or Data Flow activity.

    Best Practice

    For Excel specifically, the recommended approach is to define the schema explicitly in the dataset rather than relying on inference, as Excel is not a self-describing format like Parquet or Avro. This gives you full control over data types and avoids surprises in downstream pipeline activities.

    If this helps accept the answer.

    Best Regards,

    Jerald Felix


0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.