Copy data from an Oracle database to Azure Data Lake Storage Gen2 as Parquet files

Anonymous
2023-06-28T12:05:07.26+00:00

Hello,

I am trying to copy data from an Oracle database into Parquet files in Azure Data Lake Storage Gen2. Because a random name is assigned to the Parquet files on each copy, I have not been able to replicate an overwrite behavior.

To overwrite the existing files, I am currently using a Delete activity to remove the existing Parquet files before running the copy (sketched below). Is there a way to do this in one step, without having to delete the existing files myself?
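
For reference, this is roughly what the current two-step approach looks like as pipeline activity JSON; the activity and dataset names (DeleteExistingParquet, CopyOracleToParquet, OracleSourceDataset, ParquetSinkDataset) are hypothetical placeholders:

```json
[
    {
        "name": "DeleteExistingParquet",
        "type": "Delete",
        "typeProperties": {
            "dataset": {
                "referenceName": "ParquetSinkDataset",
                "type": "DatasetReference"
            },
            "enableLogging": false,
            "storeSettings": {
                "type": "AzureBlobFSReadSettings",
                "recursive": true,
                "wildcardFileName": "*.parquet"
            }
        }
    },
    {
        "name": "CopyOracleToParquet",
        "type": "Copy",
        "dependsOn": [
            { "activity": "DeleteExistingParquet", "dependencyConditions": [ "Succeeded" ] }
        ],
        "inputs": [ { "referenceName": "OracleSourceDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "ParquetSinkDataset", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "OracleSource" },
            "sink": {
                "type": "ParquetSink",
                "storeSettings": { "type": "AzureBlobFSWriteSettings" },
                "formatSettings": { "type": "ParquetWriteSettings" }
            }
        }
    }
]
```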

Thanks in advance for your help & support.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. AnnuKumari-MSFT 34,566 Reputation points Microsoft Employee Moderator
    2023-06-29T08:18:51.71+00:00

    Hi Moein Torabi,

    Thank you for using the Microsoft Q&A platform and for posting your question here.

    As I understand your query, you are trying to copy data from an Oracle database to Parquet files in ADLS Gen2, and you want to overwrite the file on each run. Please let me know if that is not the ask.

    Could you please confirm whether you are partitioning the data into multiple Parquet files or writing to a single target file?

    If you are storing the Oracle table data in a single target file, explicitly provide an output file name, say outputfile.parquet, in the sink dataset. Otherwise, a random file name will be auto-generated. Once you assign a file name, the file will be overwritten every time the pipeline runs; see the sketch below.
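
    For illustration, here is a minimal sketch of what such a Parquet sink dataset could look like with an explicit file name; the linked service, file system, and folder path used here (AdlsGen2LinkedService, datalake, oracle/output) are hypothetical placeholders:

    ```json
    {
        "name": "ParquetSinkDataset",
        "properties": {
            "type": "Parquet",
            "linkedServiceName": {
                "referenceName": "AdlsGen2LinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "datalake",
                    "folderPath": "oracle/output",
                    "fileName": "outputfile.parquet"
                },
                "compressionCodec": "snappy"
            },
            "schema": []
        }
    }
    ```

    With the file name fixed in the dataset, every run writes to the same path and the Copy activity replaces the previous file, so a separate Delete activity is no longer needed for the single-file case.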


    Hope it helps. Kindly accept the answer if it's helpful. Thank you.

