Azure data factory mapping data flow does not compress in sink

Question

Yoon, Sojin 70

Hello,

I am trying to read data from a database and write it into ADLS gen2 using the Mapping data flow in Data Factory.

It is a fairly simple flow consisting of 2 steps. In the sink, 'JSON' was selected as the inline dataset type, and compression was configured in the 'Settings' tab.

The compression type is set to gzip, and the filename pattern also ends with '.gz'.

concat($param_df_adls_dest,'', toString(toTimestamp($param_df_p_timestamp, 'yyyy-MM-dd\'T\'HH:mm:ss'), 'yyyyMMddHHmmss'),'.000000[n]','.json.gz')

Even with this configuration, the file written to blob storage is a plain JSON file (named xxxx.json.gz) rather than a compressed JSON file.

I tried different ways, but still get the same result.

Does anyone know what I might be doing wrong? Thanks.

Answer accepted by question author

Answer 1

QuantumCache 20,676 Moderator

Hello @Yoon, Sojin

What does the source data format look like? Could you share some sample data?
To resolve this issue, you can try the following:

Check compression settings: Double-check the compression settings in the sink of your Mapping data flow. Make sure that the compression type is set to gzip and that the filename pattern includes the '.gz' extension.

Check data format: Make sure that the data being written to ADLS Gen2 is in a format that can be compressed, such as JSON or CSV. If it is not, you may need to transform it before writing it to ADLS Gen2.

Check pipeline configuration: Make sure that the pipeline itself is configured correctly. Check the input and output datasets and confirm that the compression settings are actually being applied to the sink.
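
One way to confirm whether a sink file is really compressed, regardless of its extension, is to look at its first two bytes: every gzip stream starts with 0x1f 0x8b. The following is a minimal Python sketch, assuming the file has already been downloaded from ADLS Gen2 to a local path. The function name is_gzip_file and the path in the usage comment are hypothetical and are not part of Data Factory.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip_file(path: str) -> bool:
    """Return True if the file at `path` starts with the gzip magic number.

    Only the header is inspected; the file is not decompressed, so a
    truncated or corrupt gzip file would still be reported as gzip.
    """
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


# Hypothetical usage on a file downloaded from the sink folder:
# print(is_gzip_file("downloaded/xxxx.json.gz"))
```

If this returns False for a file named xxxx.json.gz, the sink wrote plain JSON and only the name carries the '.gz' extension, which matches the behaviour described in the question.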

Yoon, Sojin 70 Reputation points

2023-07-06T13:25:51.1166667+00:00

Hi Satish,

Reading your comment, I think it might be due to the source data, since everything else (the compression settings, the extension, etc.) is configured correctly, as you mentioned. Thank you for your comment!

The strange thing is that when I used the Copy activity with this data source (SAP CDC - SLT), it compressed the data into gzip without any issue. (Mapping Data Flow is in use mainly because we were notified that the Copy activity will not be supported for this particular data source.)

I will try to see if I can somehow transform it and if not, will just work with the JSON instead of json.gz.
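
If the sink keeps producing plain JSON, one possible workaround is to compress the files in a separate step after the data flow has run. The following is a rough Python sketch of that idea, assuming the JSON file is available on a local path (downloading from and uploading to ADLS Gen2 is left out). The function name gzip_file and the paths in the usage comment are hypothetical.

```python
import gzip
import shutil


def gzip_file(src_path: str, dst_path: str) -> None:
    """Write a gzip-compressed copy of `src_path` to `dst_path`.

    The data is streamed in chunks, so large files are not
    loaded into memory all at once.
    """
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Hypothetical usage:
# gzip_file("downloaded/xxxx.json", "compressed/xxxx.json.gz")
```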
QuantumCache 20,676 Reputation points Moderator

2023-07-06T16:38:32.8666667+00:00

Hello @Yoon, Sojin

Thanks for the confirmation. Please click "Accept Answer" so that we can close this thread; it will also be helpful to others.
QuantumCache 20,676 Reputation points Moderator

2023-07-12T22:22:29.21+00:00

Hello @Yoon, Sojin, just checking whether you are still following this discussion. Please let us know if you would like to add more information so that we can better assist you.

If the response was helpful, please click "Accept Answer" and then 'Yes' so that we can close this thread.
QuantumCache 20,676 Reputation points Moderator

2023-07-31T21:36:36.8633333+00:00

Hello @Yoon, Sojin
Just checking in.
Please click "Accept Answer" so that we can close this thread.
