ADF copy data from database to a CSV file, numeric column format wrong -- too many digits after decimal point

Dataholic 80 Reputation points
2023-08-18T15:57:19.3+00:00

I have a copy activity in a Synapse pipeline.

Source: a query that selects data from a SQL database.

Target: a pipe-delimited CSV file in Azure Data Lake.

One of the columns is a numeric field.

The issue is that some rows have 13 digits after the decimal point, while the majority of rows are fine.

Examples:

It should be 167.3 or 167.30, but the result is 167.29999999999998.

It should be 996, but the result is 996.00000000000011.

The numeric column is a summarized field: the query SUMs it and UNIONs the results of multiple SELECTs.

I specified the decimal precision in the query, but it doesn't help:

SELECT CAST(SUM(CAST([numericColumn] AS FLOAT)) AS DECIMAL(18,2)) AS numericColumn FROM [Table-A]
UNION
SELECT CAST(SUM(CAST([numericColumn] AS FLOAT)) AS DECIMAL(18,2)) AS numericColumn FROM [Table-B]

But if I run the query directly in SSMS, the result is correct.

What could I be missing?

Screenshot references:

1. Mapping screenshot (AzureCopyMapping): there is no schema mapping input.
2. CSV sink (target) screenshot (azureCopyCSVTarget1).

Azure Synapse Analytics

Accepted answer

Amira Bedhiafi 33,071 Reputation points Volunteer Moderator
2023-08-20T12:50:32.6+00:00

The issue you are having with floating-point numbers is a common one, and it traces back to how binary floating-point arithmetic works: values like 167.3 have no exact binary representation. I see you already cast the sum to a specific decimal format in your SQL query, so the issue may not be in the query itself but in how ADF reads and writes the data.
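
You can see the artifact directly in T-SQL. A minimal sketch (standalone literals, nothing from your schema) that exposes the approximation FLOAT stores for 167.3:

    -- FLOAT is binary floating point and cannot represent 167.3 exactly;
    -- widening the decimal cast exposes the stored approximation
    SELECT CAST(CAST(167.3 AS FLOAT) AS DECIMAL(20,14));  -- 167.30000000000001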

Inspect the data types in your source SQL database to make sure the data is being handled correctly. If you're dealing with monetary values, use the DECIMAL or MONEY data type to preserve precision.
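
Applied to your query, one way to do this is to keep the aggregation in DECIMAL the whole way instead of casting to FLOAT first. A sketch, assuming [numericColumn] fits in DECIMAL(18,2) without overflow:

    -- Avoid the intermediate FLOAT cast so no binary rounding error is introduced
    SELECT CAST(SUM(CAST([numericColumn] AS DECIMAL(18,2))) AS DECIMAL(18,2)) AS numericColumn FROM [Table-A]
    UNION
    SELECT CAST(SUM(CAST([numericColumn] AS DECIMAL(18,2))) AS DECIMAL(18,2)) AS numericColumn FROM [Table-B];

Note that this rounds each row to two decimals before summing; if row-level values carry more precision, cast to a wider scale such as DECIMAL(18,6) inside the SUM instead.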

In the copy activity, confirm that the source and sink are mapped correctly and that the data types in the mapping align with your desired format. Since your mapping screenshot shows no schema mapping input, try importing the schemas and setting the column types explicitly rather than relying on default type conversion.

You can also use a Data Flow with a Derived Column transformation to explicitly set the desired format of the column. Use the following expression if you want to limit the number of decimal places:

       round(numericColumn, 2)
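
If the sink still writes the value as a double, a stricter variant is to pin the type as well as the scale with toDecimal from the mapping data flow expression language (precision and scale 18,2 here mirror your query):

    toDecimal(round(numericColumn, 2), 18, 2)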
    

As a last resort, you can create a post-copy script that runs after the copy activity: it reads the CSV file, corrects the formatting of the problematic column, and writes it back to Azure Data Lake. This can be a temporary workaround while you diagnose the underlying issue.
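
Before scripting a fix, it can help to check what was actually written. A sketch using OPENROWSET from a Synapse serverless SQL pool; the storage URL and file path are placeholders, and HEADER_ROW assumes your CSV was written with a header:

    -- Read the file back to see the exact characters the copy activity wrote
    SELECT TOP 10 *
    FROM OPENROWSET(
        BULK 'https://<storageaccount>.dfs.core.windows.net/<container>/output.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        FIELDTERMINATOR = '|',
        HEADER_ROW = TRUE
    ) AS rows;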

