How do I remove the row in Excel file? - Azure Synapse Analytics

Question

Hi, I wanna remove the row in Excel file in Azure Synapse Analytics.
As shown in the attached image, the source Excel file has an unnecessary second row.
I tried to remove it in ADF, but I don't know what condition to set in the component "Alter Row".
Do you know any solution?
Any help would be appreciated.

Accepted Answer

Hi @Kakehi Shunya (筧隼弥) ,
Thankyou for sharing more details on your requirement. Additionally , you can try following approach to achive the requirement:

1. Create the dataset pointing to your excel, make sure to import the schema and check first row as header.
2. In the source transformation of the dataflow, select the above dataset and preview the data.

3. Use surrogate key transformation to generate column say RowNum which would add an incrementing key value to each row of data.

4. Use aggregate transformation to find out the maximum RowNum by using the expression : max(RowNum) in aggregate tab. As group by tab is optional, you can skip that.

5. Use Sink transformation and select cache as the sink type to write the data into spark cache that can be referenced in other transformations.

6. Add another source transformation and refer to the same source dataset.

7. Use Surrogate key again to generate the RowNum again as done in step 3

8. Use Filter transformation and use this expression : toInteger(RowNum) < toInteger(sink1#outputs()[1].MaxRow) - 1

9. Use sink transformation to load this data to the target database.

You can use the similar approach to filter out the first row as well by getting the min(RowNum) and using expression in the filter transformation : toInteger(RowNum) > toInteger(sink1#outputs()[1].MinRow)

Hope this will help. Please let us know if any further queries.

------------------------------

Please don't forget to click on or upvote button whenever the information provided helps you.
Original posters help the community find answers faster by identifying the correct answer. Here is how
Want a reminder to come back and check responses? Here is how to subscribe to a notification
If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

Answer

Hi @Kakehi Shunya (筧隼弥) ,

If you want to filter out such rows which don't bring any useful data, you can use filter transformation in data flow.

You can mention expression like below,

length (column1)>1

This will filter out those rows which bring one character in the column1.

Try this out and let us know

Thanks

Answer

Hi @Kakehi Shunya (筧隼弥) ,
Thankyou for using Microsoft Q&A platform and thanks for posting your question here.
As I understand your ask , you want to skip first row of the excel while processing it via Azure data factory. Please let me know if my understanding is incorrect .

While creating the dataset, you can use the range in the excel dataset which will allow you to read the excel starting from a particular cell . You can provide the range value as A3 in the dataset.

For more details, kindly check this article: Dataset properties in excel

Hope this will help. Please let us know if any further queries.

------------------------------

Please don't forget to click on or upvote button whenever the information provided helps you.
Original posters help the community find answers faster by identifying the correct answer. Here is how
Want a reminder to come back and check responses? Here is how to subscribe to a notification
If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

How do I remove the row in Excel file? - Azure Synapse Analytics

2 additional answers