Asked by SyedMohammedYusuf-8730

Getting Schema Mismatch Issue

I am trying to overwrite an existing table in a Synapse dedicated SQL pool with a DataFrame, but I get the error below. Both schemas are the same.

com.microsoft.spark.sqlanalytics.SQLAnalyticsConnectorException: Data source schema is different from that on target table.
at com.microsoft.spark.sqlanalytics.SqlAnalyticsConnectorClass$SQLAnalyticsFormatWriter.sqlanalytics(SqlAnalyticsConnectorClass.scala:324)

Schema of the table in the dedicated SQL pool:

CREATE TABLE [dbo].[Dim_Date]
(
[Date] [date] NOT NULL,
[Year] [int] NOT NULL,
[Quarter] [int] NOT NULL,
[Month] [int] NOT NULL,
[Month_Name] [nvarchar](20) NOT NULL,
[Day_of_Week] [int] NOT NULL,
[Day] [nvarchar](20) NOT NULL
)
WITH
(
DISTRIBUTION = ROUND_ROBIN,
CLUSTERED COLUMNSTORE INDEX
)
GO

DataFrame Schema:

|-- Date: date (nullable = false)
|-- Year: integer (nullable = false)
|-- Quarter: integer (nullable = false)
|-- Month: integer (nullable = false)
|-- Month_Name: string (nullable = false)
|-- Day_of_Week: integer (nullable = false)
|-- Day: string (nullable = false)

If I remove the Date field, I am able to insert the data into the Synapse table.
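
One way to double-check that the two schemas really match is to compare them column by column. The following is a minimal sketch in plain Python, assuming the columns are copied by hand from the two listings above as (name, type, nullable) tuples. The `SPARK_TO_SQL` mapping and the `diff_schemas` helper are hypothetical names introduced for illustration only; the mapping is not the connector's actual type mapping, and the check ignores nvarchar lengths.

```python
# Hypothetical Spark-to-SQL type-name mapping, assumed for illustration only.
# It is NOT the mapping the Synapse connector applies internally.
SPARK_TO_SQL = {"date": "date", "integer": "int", "string": "nvarchar"}

# (name, SQL type, nullable) copied from the CREATE TABLE statement above.
table_columns = [
    ("Date", "date", False),
    ("Year", "int", False),
    ("Quarter", "int", False),
    ("Month", "int", False),
    ("Month_Name", "nvarchar", False),
    ("Day_of_Week", "int", False),
    ("Day", "nvarchar", False),
]

# (name, Spark type, nullable) copied from the printed DataFrame schema above.
dataframe_columns = [
    ("Date", "date", False),
    ("Year", "integer", False),
    ("Quarter", "integer", False),
    ("Month", "integer", False),
    ("Month_Name", "string", False),
    ("Day_of_Week", "integer", False),
    ("Day", "string", False),
]


def diff_schemas(df_cols, tbl_cols):
    """Return a list of human-readable differences (empty if none found)."""
    problems = []
    if len(df_cols) != len(tbl_cols):
        problems.append(
            f"column count differs: {len(df_cols)} vs {len(tbl_cols)}"
        )
    # Compare position by position, since column order can matter on write.
    for pos, (df_col, tbl_col) in enumerate(zip(df_cols, tbl_cols)):
        df_name, df_type, df_nullable = df_col
        tbl_name, tbl_type, tbl_nullable = tbl_col
        if df_name != tbl_name:
            problems.append(f"position {pos}: name {df_name!r} vs {tbl_name!r}")
        if SPARK_TO_SQL.get(df_type) != tbl_type:
            problems.append(f"{df_name}: type {df_type!r} vs {tbl_type!r}")
        if df_nullable != tbl_nullable:
            problems.append(
                f"{df_name}: nullable {df_nullable} vs {tbl_nullable}"
            )
    return problems


mismatches = diff_schemas(dataframe_columns, table_columns)
print(mismatches or "no differences found")
```

In a PySpark cell, the DataFrame side could probably be built from the live schema instead of typed by hand, for example from `df.schema.fields` using each field's `name`, `dataType.typeName()` and `nullable`. If this check reports no differences, as it does for the schemas as posted, the mismatch is presumably in how the connector itself interprets one of the types rather than in the declared columns.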

azure-synapse-analytics

Hi @SyedMohammedYusuf-8730 ,

Thank you for posting your query on the Microsoft Q&A platform. Could you please clarify the points below so that we can reproduce the scenario and help you better?

  • Did you remove the Date field from the DataFrame or from the Synapse table?

  • If you removed the Date column from the DataFrame, the write should fail, because the Date column in the table is NOT NULL. If it still succeeds, could you please check whether there is a default constraint on that column?

  • If possible, could you please share the Spark code you are using to overwrite the table?


Thanks, Shaik, for your response.

I removed the Date field from both the DataFrame and the Azure Synapse table.

dfdate.createOrReplaceTempView('tbldate')

val scala_df_date = spark.sqlContext.sql("select * from tbldate")

scala_df_date.write.mode("overwrite").synapsesql("syndp_test.dbo.Dim_Date", Constants.INTERNAL)


0 Answers