Azure Data Factory -- Skipping 2nd & 3rd rows from the source file

Mahesh Kumar 106 Reputation points
2021-04-21T02:45:23.287+00:00

Hello All,

I have a source .csv file that starts with a header row, followed by a couple of rows containing metadata that I don't need to copy to my Azure SQL table. I want to read the header, skip the two metadata rows that follow it (the 2nd and 3rd rows of the file), and then continue with the 4th row, 5th row, and so on.
I tried using the Skip Line Count property, but it throws errors, probably because of the following characteristics.

89762-image.png

Is there a way to skip the 2nd and 3rd rows while reading the file, or is there a post-copy script option?

Thank you.

Regards,
Mahesh

Azure Data Factory

2 answers

  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2021-04-21T07:32:42.907+00:00

    Hi @Mahesh Kumar ,

    Thanks for reaching out.

    Since row 1 is your header and you want to skip the 2nd and 3rd rows, you can use the skipLineCount property, available in the Copy activity source settings.

    1. First, select the firstRowAsHeader property in the dataset connection settings, as shown below: 89806-image.png
    2. Second, in the Copy activity source settings, set skipLineCount = 2. This skips the first 2 rows; in your case, that is the 2nd and 3rd rows, excluding the header row (the 1st row), because you selected firstRowAsHeader in the dataset connection settings. 89824-image.png
    3. In the mapping section, import the schemas.
      89816-image.png
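The two settings from the steps above can also be sketched outside the UI. The following is a minimal illustration, written as Python dictionaries that mirror the pipeline JSON; it assumes a DelimitedText dataset and shows only the relevant fragments (linked service, file location, sink, and mapping are omitted, and the variable names are made up for the example). It is worth confirming on a small sample file how the two properties interact in your own pipeline.

```python
import json

# Dataset fragment (hypothetical variable name): typeProperties of a
# DelimitedText dataset, with the header option from step 1 enabled.
dataset_type_properties = {
    "columnDelimiter": ",",
    "firstRowAsHeader": True,
}

# Copy activity source fragment (hypothetical variable name): skipLineCount
# from step 2 sits under formatSettings of the delimited text source.
copy_source = {
    "type": "DelimitedTextSource",
    "formatSettings": {
        "type": "DelimitedTextReadSettings",
        "skipLineCount": 2,
    },
}

# Print the fragments as JSON so they can be compared with the pipeline's code view.
print(json.dumps({"dataset": dataset_type_properties, "source": copy_source}, indent=2))
```

The printed JSON can be compared with the dataset and activity definitions shown in the code view of the authoring UI.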

    I have tested this scenario and it works as expected.

    Here is the source blob used:

    89696-image.png

    Here is the sink table after loading data:

    89797-image.png

    Ref doc: Copy activity properties

    Hope this info helps. Do let us know how it goes.


    Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you; this can be beneficial to other community members.


  2. Jonny M 21 Reputation points
    2022-05-19T10:50:03.383+00:00

    What if "Skip Line Count" needs to be dynamic? For instance, in one CSV the header is on line 3, but in the next CSV it's on line 5, and the only consistent value is the header itself. Instead of telling ADF to skip to line 3 or line 5, can you tell it to skip to the line that starts with "ID"?

    Thanks
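One possible way to approach the dynamic case is to work out the skip count before the copy runs. The sketch below is only an illustration of that idea in plain Python, not an ADF feature: the function name `count_lines_before_header` and the sample content are hypothetical, and it assumes the file can be read line by line in some pre-processing step (for example, a function or custom activity of your choice).

```python
import io


def count_lines_before_header(lines, header_prefix="ID"):
    """Return how many lines come before the first line starting with header_prefix."""
    for index, line in enumerate(lines):
        if line.startswith(header_prefix):
            return index
    raise ValueError(f"no line starts with {header_prefix!r}")


# Hypothetical sample: two metadata lines, then the header, then data rows.
sample = io.StringIO(
    "report generated 2022-05-19\n"
    "source: demo\n"
    "ID,Name\n"
    "1,Alice\n"
    "2,Bob\n"
)

skip = count_lines_before_header(sample)
print(skip)
```

The resulting number could then be supplied to the pipeline as a parameter and used for Skip Line Count, assuming that setting accepts dynamic content in your setup. If blank lines can appear before the header, check how they are counted before relying on the value.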
