Problem on saving Spark timestamp into Azure Synapse

Luis Cespedes 41 Reputation points

I have a database in Azure synapse with only one column with datatype datetime2(7). In Azure Databricks I have a table with the following schema.


The table is empty and allows null values

When I try to write this dataframe
| from |
|2019-07-24 20:50:08.667|
|2022-01-24 15:33:22.193|

I get this error message:

Py4JJavaError: An error occurred while calling org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 15.0 failed 4 times, most recent failure: Lost task 3.3 in stage 15.0 (TID 46) ( executor 0): 110802;An internal DMS error occurred that caused this operation to fail SqlNativeBufferBufferBulkCopy.WriteTdsDataToServer, error in OdbcDone: SqlState: 42000, NativeError: 4816, 'Error calling: bcp_done(this->GetHdbc()) | SQL Error Info: SrvrMsgState: 1, SrvrSeverity: 16,  Error <1>: ErrorMsg: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid column type from bcp client for colid 1. | Error calling: pConn->Done() | state: FFFF, number: 75205, active connections: 35', Connection String: Driver={pdwodbc17e};app=TypeD00-DmsNativeWriter:DB2\mpdwsvc (56768)-ODBC;autotranslate=no;trusted_connection=yes;server=\\.\pipe\DB.2-e2f5d1c1f0ba-0\sql\query;database=Distribution_24

Runtime version 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)

This only happens with these data types (timestamp and date), with other types like int, string this problem does not happen

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,683 questions
Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,073 questions
0 comments No comments
{count} votes

Accepted answer
  1. HimanshuSinha-msft 19,386 Reputation points Microsoft Employee

    Hello @Luis Cespedes ,
    Thanks for the ask and using Microsoft Q&A platform .
    As we understand the ask here is to insert timestamp data in Azure synapse analytics . Please do let me know if that not accurate.
    I did tried out the below code on databricks and I was able to insert the records in SQLDW .

    from pyspark.sql import functions as F
    dict = [{'name': 'Alice', 'age': 1},{'name': 'Again', 'age': 2}]
    df3 = spark.createDataFrame(dict)
    df3=df3.withColumn('Age', F.current_timestamp())
    tableName = 'SomeTableName'
    df3.write.mode("overwrite") \
    .format("com.databricks.spark.sqldw") \
    .option("url", sqlDwUrl) \
    .option("tempDir", tempDir) \
    .option("forwardSparkAzureStorageCredentials", "true") \
    .option("dbTable", tableName) \

    Out[73]: StructType(List(StructField(Age,TimestampType,false),StructField(name,StringType,true)))

    On the SQLDW side this is what I see



    I am unable to repro this . Let me know if you see something which in diffferent in my case , I will try my best to rero this .

    One an other note can you please try to cast the timestamp in this format "yyyy-MM-dd HH:mm:ss" and try to insert the data .

    Please do let me if you have any queries .


    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

0 additional answers

Sort by: Most helpful