Hi @HimanshuSinha-msft , Thanks for your comment, but I find it hard to agree with. On the link you provided, it also does not list datetimeoffset(7) as a supported data type, but still when using synapsesql() it loads the datetimeoffset(7) column as a Spark string type, AND it displays the dataframe without errors. Also, uniqueidentifier is mapped to string, but then it gives the error as in my original post. Please see images added to this post. ![62388-image-7.png][1] ![62464-image-9.png][2] It seems to me that correctly fetching uniqueidentifier by synapsesql() is just not yet implemented, or it is implemented erroneously. If the first is true (not yet implemented), I would strongly recommend to implement in the near future. Otherwise, what is the benefit of synapsesql() if you cannot use it for some data types? Also, uniqueidentifier is supported by JDBC and other competitive platforms (eg Databricks). It is strange that synapsesql() doesn't do it, because based on the image here: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/synapse-spark-sql-pool-import-export I understand that synapsesql() is also using jdbc. Unfortunately, Synapse Studio is not useable for our use case then at the moment. Let me know if I misunderstood something. [1]: /api/attachments/62388-image-7.png?platform=QnA [2]: /api/attachments/62464-image-9.png?platform=QnA
How to read uniqueidentifier into Spark notebook in Synapse Analytics Studio?
We are looking into a possibility to move some notebooks from Databricks to Synapse Studio. However, it seems that it is impossible in Synapse Spark notebooks to query tables with uniqueidentifiers?
Example table:
CREATE TABLE dbo.test
(id uniqueidentifier null
, ts datetimeoffset not null
,data float null
)
Create a spark dataframe in notebook:
val df = spark.read.synapsesql("testdb.dbo.test")
display(df)
The display() command returns error:
Error : com.microsoft.spark.sqlanalytics.exception.SQLAnalyticsConnectorException: com.microsoft.sqlserver.jdbc.SQLServerException: Columns with UniqueIdentifier types are not supported in external tables.
I thought about creating a view where the uniqueidentifier would be a varchar, but views are not supported by synapsesql().
Do I have any other way to get this table into Spark notebook for some processing, or should we just forget about Synapse studio for now?
4 answers
Sort by: Most helpful
-
-
Andrew Fogarty 6 Reputation points Microsoft Employee
2021-06-09T15:45:57.94+00:00 This is still an issue and a recommended work around would be great.
-
Dilkush Patel 22 Reputation points
2022-10-08T21:47:19.227+00:00 So I'm trying to reach Synapse DW table with uniqueidentifier table in pyspark dataframe and then cover to pandas in databricks but it fails with
com.microsoft.sqlserver.jdbc.SQLServerException: Columns with UniqueIdentifier types are not supported in external tables. [ErrorCode = 102050] [SQLState = S0001]
Is this same issue?
If so any workaround available/ This is kind of very surprise limitation as I'm sure uniqueidentifier would be very widely used datatype.
-
Thanapon Chaijunla 0 Reputation points
2023-02-02T09:33:19.8133333+00:00 I got the same issue, do you have any update on this?