Error using Azure Data Factory to Copy data (Using Upsert) from Azure Blob to Azure SQL database

Rome Lin 10 Reputation points
2024-05-23T23:34:36.5766667+00:00

Hi all,

I keep getting this error when I perform an upsert (under copy activity) with this error code:
Failure happened on 'Sink' side. 'Type=System.NullReferenceException,Message=Object reference not set to an instance of an object.,Source=Microsoft.DataTransfer.Connectors.MSSQL,'

When I choose insert, then it works. But whenever I choose upsert, it gave me the above error message. In my source database, there is no duplicate or null values. In sink database, I created the table with primary key (used for upsert) and other columns.

Could it be data type mismatch between blob and sql ? -- since blob reads my id column as string (I can't find a way change the data type in blob), and in sql, I hardcode the id column as integer

Please kindly assist me on this. Thanks!

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,801 questions
{count} vote

2 answers

Sort by: Most helpful
  1. Michael Petrolini 55 Reputation points
    2024-05-24T09:30:48.1+00:00

    I'm having the same problem copying data with upsert from Postgres to Azure SQL DB.

    I've been able to resolve this issue by setting the linked service of the sink dataset to Legacy.

    Hope this helps!

    User's image

    11 people found this answer helpful.

  2. phemanth 11,125 Reputation points Microsoft Vendor
    2024-05-24T05:59:21.4766667+00:00

    @Rome Lin Thanks for using MS Q&A platform and posting your query.

    The error message "System.NullReferenceException" indicates a null value being accessed during the upsert operation. Here's how to troubleshoot the issue with your copy activity in Azure Data Factory:

    1. Data Type Mismatch:

    You're right, a mismatch between the ID column data type in your Blob storage (string) and the SQL table (integer) could be causing the issue. Here's what you can do:

    • Convert data type in source (if possible): If your Blob storage allows data type conversion, try converting the ID column to an integer before copying the data. This ensures it matches the SQL table's data type.
    • Data Conversion in ADF: Alternatively, you can use a data flow activity within ADF to transform the data. In the data flow, use a "Derive Column" transformation to convert the string value in the ID column to an integer before writing to the sink.

    2. Missing Primary Key Value:

    The upsert operation relies on the primary key to identify existing records for update. Ensure your source data contains a value in the primary key column that matches the existing data in the SQL table.

    • Check for nulls in the source ID column: Even though you mentioned no null values, double-check the source data for any potential nulls in the ID column. Null values in the primary key can cause this error.

    3. Interim Staging Table:

    During upsert, ADF creates an interim staging table. The error might be related to this temporary table. Here are some things to consider:

    • Review ADF logs: Check the detailed error message in the ADF monitoring section. It might provide more context about the issue within the interim table.
    • Disable interim copy (if possible): Some data sources might allow disabling the interim copy functionality. However, this option might not be available for all sources. Refer to ADF documentation for your specific source connector.

    Hope this helps. Do let us know if you any further queries.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.