store into .parquet

arkiboys 9,621 Reputation points
2022-08-30T07:35:40.873+00:00

Hello,
I am using a copy activity to write pipeline activity details to a Parquet file in the sink.
Every time the copy activity runs, I see only one line in the Parquet file.
Each run overwrites the previous line, so there is always just one line in the .parquet file to read.
Question:
How can I store the details of each run as a separate line in the sink Parquet file? I even tried the merge copy behavior, but it made no difference.


Thanks

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-09-01T10:17:03.983+00:00

    Hi @arkiboys ,

    Thank you for posting your query on the Microsoft Q&A platform.

    When the copy activity writes to the same file in the same location, the file is overwritten. There is no way to append data to an existing file using the copy activity.

    You may want to consider using a SQL table in this case. Alternatively, you can write the data to a different file on every run and then, when you need to read it back, read the data from all files at once as a single dataset. This video shows how to read the data from all files as a single dataset.

    Hope this helps. Please let me know if you have any further queries.

    -----------

    Please consider clicking the Accept Answer button. Accepted answers help the community as well.
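
The second suggestion in the answer above (one file per run, then read all files back as a single dataset) can be illustrated outside Data Factory with a minimal Python sketch. It assumes the `pyarrow` package is installed. The function names, the `run_<id>.parquet` naming scheme, and the example columns are hypothetical and are not taken from the thread. In Data Factory itself the same idea would presumably be a dynamic file name on the sink dataset, for example one built from the pipeline run ID.

```python
from pathlib import Path


def run_file_name(run_id: str) -> str:
    """Build a distinct file name per run so no run overwrites another.

    The "run_<id>.parquet" pattern is an assumption made for illustration.
    """
    return f"run_{run_id}.parquet"


def write_run(folder: Path, run_id: str, row: dict) -> Path:
    """Write one run's details as a single-row Parquet file of its own."""
    # pyarrow is imported here so the naming helper above has no
    # third-party dependency.
    import pyarrow as pa
    import pyarrow.parquet as pq

    folder.mkdir(parents=True, exist_ok=True)
    target = folder / run_file_name(run_id)
    # A one-element list of dicts becomes a one-row table.
    pq.write_table(pa.Table.from_pylist([row]), str(target))
    return target


def read_all_runs(folder: Path):
    """Read every Parquet file in the folder as one table, one row per run."""
    import pyarrow.dataset as ds

    return ds.dataset(str(folder), format="parquet").to_table()
```

Calling `write_run(Path("runs"), "001", {"pipeline_name": "demo", "status": "Succeeded"})` once per run (with a different run ID each time, and hypothetical column names) and then `read_all_runs(Path("runs"))` should give a table with one row per run, which is the layout the question asks for. The files written across runs need a consistent schema for the combined read to work cleanly.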


1 additional answer

  1. arkiboys 9,621 Reputation points
    2022-08-30T09:31:20.77+00:00

    Hello,
    There is no .json.
    I am simply trying to write the activities of a pipeline to a sink Parquet file using a copy activity.
    The source is a dummy .csv.
    The copy source has additional columns defined in its settings.
    The sink does write to Parquet, but each run of the copy activity overwrites the previous line in the Parquet file, whereas I expect one line per run.