Issue sinking from Azure Data Factory v2 to Azure Synapse Warehouse with Greek letters

Cullen, Cory 21 Reputation points
2022-07-26T23:01:00.773+00:00

We are trying to import data that contains Greek letters into Azure Synapse Analytics (Data Warehouse) via Azure Data Factory v2. We are getting replacement characters for the Greek characters.
225025-image.png

Collation for Data Warehouse: SQL_Latin1_General_CP1_CI_AS

I created Temp Table with two columns for testing. One Column has the collation needed.
CREATE TABLE [GRANTSDM].DimTermList_CC;

I limited my data flow to focus on this particular transform. (Source data coming from JSON file UTF-8 encoding)
At the Data Preview for the Sink, the data looks correct for the 'collTerms' column. (Just focusing on the first row)
225026-image.png

When I run in the pipeline for the data flow to sink to the table, the characters gets replaced on the column with the collation (collTerms)
224996-image.png

The sink needs to be similar to this SQL insert statement
INSERT INTO [GRANTSDM].[DimTermList_CC] (terms,collTerms)
VALUES ('γδ T cells', N'γδ T cells');

I tested this out in another temp table via SSMS
225044-image.png

Is there some thing I can do about the Sink to help it behave like the SQL Insert statement (prefixing the string with the letter 'N')? Or does the column need to be derived differently to insert this correctly?

Thank you for any input.

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,395 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,599 questions
0 comments No comments
{count} votes

Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-07-28T09:35:47.227+00:00

    Hi @Cullen, Cory ,

    Thank you for posting query in Microsoft Q&A Platform.

    When we work with non-english data we should consider using datatype as NVARCHAR(). N stands for unicode. This allows us to store non english characters.

    I tried your scenario and able to get data correctly. Please check below screenshots.
    225716-image.png

    Script used to create table

    CREATE TABLE [dbo].[DimTermList_CC3]  
    (  
        terms NVARCHAR(64),   
        collTerms NVARCHAR(64) COLLATE SQL_Latin1_General_CP1253_CI_AI  
    );  
    

    Hope this helps.

    ------------------------

    Please consider hitting Accept Answer button. Accepted answers help community as well.


0 additional answers

Sort by: Most helpful