How to transform the data if my excel file doesn't have a column in ADF?

Dhilip Subramanian 56 Reputation points
2022-02-26T08:03:18.383+00:00

Hi

I am getting one excel file every day in the data lake. My requirement is to export the excel file into an SQL database using ADF. I have five sheets in the excel file. Among the five, I have one excel sheet that doesn't have one column header. I successfully exported all four sheets into the SQL database.

I stuck with the above one sheet which doesn't have a column header. Below is the data for this excel sheet

178073-image.png

See, the first column doesn't have a header. When I tried to use copy activity, the schema is neglecting the first column as it doesn't have a header. When I untick the first row as a header, then it's recognised. However, I don't want to use this because the column header for other columns considers a row value which is difficult as everyday new excel file is getting added.

Any advice on how to address this issue? Tried a lot, with no success.

Azure SQL Database
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,793 questions
{count} votes

1 answer

Sort by: Most helpful
  1. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
    2022-03-01T06:18:04.503+00:00

    Hello @Dhilip Subramanian ,
    Thanks for the question and using MS Q&A platform.

    As we understand the ask here is to copy the columns without the header in SQL . Let me know if thats not accurate .

    You can do this by

    1. A copy activity with the "First row as header" set to No .
    2. A store proc actvity

    We will introduce a staging table ( tbl_stg_excelwithHeader) and copy all the data ( including the header ) as rows , and then we will add stored proc activity whcih will copy all the data from the staging table except the row where prop_0 is null .

    Lets assume that the actual table is having the below schema

    CREATE Table tbl_excelwithHeader  
    (   
    Pid varchar(100)  
    ,No int   
    ,Total int   
    ,SiNo int   
    )   
    

    The copy activity looks like this
    178719-image.png

    Pre copy activity script looks like

    IF OBJECT_ID(N'tbl_stg_excelwithHeader') IS NOT NULL  
        BEGIN  
        DROP TABLE tbl_stg_excelwithHeader  
        END  
    

    and the

    and the sink dataset looks like

    178720-image.png

    Add a stored proc activity and point it to the stored proc named "PROC_INS_TBL_EXCELWITHHEADER"

    CREATE PROC PROC_INS_TBL_EXCELWITHHEADER  
    AS   
    INSERT INTO TBL_EXCELWITHHEADER  
    SELECT PROP_0 ,CONVERT(INT,PROP_1) ,CONVERT(INT,PROP_2),CONVERT(INT,PROP_3)   
    FROM TBL_STG_EXCELWITHHEADER WHERE PROP_0 IS NOT NULL   
    

    When the pipeline runs without error , you should have the data in the actual table

    178687-image.png

    178761-image.png

    Please do let me if you have any queries.

    Thanks
    Himanshu


    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
      • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    0 comments No comments