Thank you for reaching out to the community forum with your query.
Based on the error message you received, it looks like there is a collation/encoding mismatch between the dedicated SQL pool and the dataframe in the Spark pool. The error indicates that the column at ordinal 27 is expected to be of type VARCHAR(7) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, but the data in that column does not conform to this expected type.
To resolve this, you can change the collation of the column in the SQL view so that it matches what the Spark pool expects, for example by adding a COLLATE clause to that column in the view definition. Alternatively, you can cast the column to the correct data type in your Spark code.
Here's an example of how you can cast the column to the correct data type in the Spark code:
import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants
from pyspark.sql.functions import col

# Read from the existing SQL view
dfToReadFromTable2 = (spark.read
    # If `Constants.SERVER` is not provided, the `<database_name>` from the three-part table name argument
    # to the `synapsesql` method is used to infer the Synapse Dedicated SQL End Point.
    .option(Constants.SERVER, "xxx.sql.azuresynapse.net")
    # Defaults to the storage path defined in the runtime configurations
    .option(Constants.TEMP_FOLDER, "abfss://******@xxx.dfs.core.windows.net/xxx")
    # Three-part name of the view from which data will be read.
    .synapsesql("database.schema.view")
    # Cast the column at ordinal 27 to the expected data type
    .withColumn("column_name", col("column_name").cast("string"))
    # Fetch a sample of 10 records
    .limit(10))

# Show contents of the dataframe
dfToReadFromTable2.show()
If the issue persists, also check the data in the column itself and make sure it is compatible with the target type: values longer than 7 characters, NULLs in a NOT NULL column, or characters outside the Latin-1 code page used by SQL_Latin1_General_CP1_CI_AS can all cause this kind of mismatch.
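Here is a minimal diagnostic sketch for surfacing such rows, assuming the dfToReadFromTable2 dataframe from the example above and the placeholder column name "column_name" (drop the .limit(10) from the read if you want to scan more than the 10-record sample):

from pyspark.sql.functions import col, length

# Rows that would violate VARCHAR(7) NOT NULL: NULL values or values longer than 7 characters
suspectRows = dfToReadFromTable2.filter(
    col("column_name").isNull() | (length(col("column_name")) > 7))
suspectRows.show()

# Rows containing characters outside the Latin-1 range, which may not map cleanly
# to the SQL_Latin1_General_CP1_CI_AS collation
nonLatin1Rows = dfToReadFromTable2.filter(col("column_name").rlike(r"[^\x00-\xFF]"))
nonLatin1Rows.show()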
I hope this helps! Let me know if you have any further questions.