Hello @rajendar erabathini,
Welcome to the MS Q&A platform.
There could be many reasons for the increased data written size in the Synapse SQL pool. Here are some that come to mind:
- The data in the PARQUET file is typically compressed (often with Snappy), which reduces its size on disk. When the data is loaded into Synapse Analytics, it is decompressed, which can increase its size. You can measure this difference directly; see the pyarrow sketch after this list.
- The data in the PARQUET file is stored in a compact binary, columnar encoding (for example, dictionary and run-length encoding). When the data is loaded into Synapse Analytics, the values are materialized in the pool's own storage format, which loses those encodings and can take more space.
- The data in the PARQUET file may have a different data type than the target table in Synapse Analytics. For example, a column stored as a 32-bit integer in the file but loaded into a DECIMAL(38, 0) column occupies 17 bytes per value instead of 4, which increases the data size. A rough row-width estimate for the target table is sketched after this list.
- If the PARQUET file contains null values, the data written size may be larger than the data read size: nulls cost almost nothing in Parquet, but fixed-width columns in the SQL pool reserve their full width even for NULL values. Similarly, if the file contains complex data types such as arrays or maps, the written size can grow because those values take additional storage once they are expanded into relational columns or serialized as text.
- The default block size for a SQL pool table in Synapse Analytics is 1 MB. If a load writes much less data than a block holds, the fixed per-block metadata becomes a significant share of the total, so the data written size can exceed the data read size.
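
To see how much of the gap comes from compression, encoding, and nulls, you can inspect the PARQUET file before loading it. Below is a minimal sketch using pyarrow (assuming a flat, non-nested schema); the file name `sample.parquet` is a hypothetical placeholder for your own file.

```python
import pyarrow.parquet as pq

# Hypothetical local copy of the file being loaded into the SQL pool.
PATH = "sample.parquet"

pf = pq.ParquetFile(PATH)
meta = pf.metadata
print(f"rows={meta.num_rows}, row_groups={meta.num_row_groups}")

# Per-column compressed vs. uncompressed size and null count, summed
# across all row groups (column order assumes a flat, non-nested schema).
for i, field in enumerate(pf.schema_arrow):
    compressed = uncompressed = nulls = 0
    for rg in range(meta.num_row_groups):
        col = meta.row_group(rg).column(i)
        compressed += col.total_compressed_size
        uncompressed += col.total_uncompressed_size
        if col.statistics is not None:
            nulls += col.statistics.null_count or 0
    print(f"{field.name} ({field.type}): {compressed:,} B compressed, "
          f"{uncompressed:,} B uncompressed, {nulls:,} nulls")

# Size of the fully decoded data in memory, a rough proxy for how much
# the data grows once it leaves the Parquet encoding.
table = pq.read_table(PATH)
print(f"decoded in-memory size: {table.nbytes:,} B")
```

If the uncompressed and decoded numbers are several times the file size, most of the difference between data read and data written is already explained before any Synapse-specific storage behavior comes into play.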
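
For the type-conversion and NULL points, a back-of-the-envelope estimate of the target table's uncompressed row width can show how much space the loaded rows need. This is only a sketch: the byte widths are standard SQL Server storage sizes, and the column list and row count are hypothetical placeholders for your own table.

```python
# Standard SQL Server storage sizes (bytes) for a few fixed-width types.
SQL_TYPE_BYTES = {
    "int": 4,
    "bigint": 8,
    "float": 8,
    "date": 3,
    "datetime2": 8,
    "decimal(38,0)": 17,
}

target_columns = ["bigint", "decimal(38,0)", "datetime2"]  # hypothetical table
num_rows = 10_000_000                                      # hypothetical count

# Fixed-width columns occupy their full width even when the value is NULL,
# which is one reason the written size can exceed the compressed read size.
row_bytes = sum(SQL_TYPE_BYTES[t] for t in target_columns)
total_mib = row_bytes * num_rows / 1024**2
print(f"~{row_bytes} B/row -> ~{total_mib:.0f} MiB before columnstore compression")
```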
I hope this helps. Please let me know if you have any further questions.