Why does Data Factory Salesforce Connector not pull correct column formats?

Blasko, Dan W 20 Reputation points
2024-06-18T12:37:25.0066667+00:00

We had a Data Factory pipeline connecting to Salesforce using integration runtime 4.9, and everything worked perfectly. We upgraded our integration runtime to 5.1 because of a SQL Server connector issue, and now our Salesforce connector does not work properly. For example, when I pull the Calendar table from Salesforce, the ID comes in as nvarchar(max) instead of nvarchar(18). All nvarchar columns are now created as nvarchar(max) instead of the correct lengths.

I use a Lookup to pull 400+ table names from a control table, then loop over those names, pull the data from Salesforce, and load it into SQL Server. No mappings are required. The pipeline creates the 400 tables each night, and after creation the last step runs a stored procedure to recreate the indexes.

I was told to create all 400+ tables manually and then do a truncate and load instead. However, any time a column is added, changed, or deleted, that process would fail.

I just upgraded to 5.2 yesterday and the problem still exists. It works perfectly in 4.8 and 4.9.

Does anyone have a workaround for this? I hope I'm not the only person pulling Salesforce data into SQL Server.

I do not want 400 pipelines to replace something I did with one pipeline and 3 objects.
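
For reference, a quick way to confirm the effect on the SQL Server side (this is just a sketch, assuming you can query the sink database) is to list which auto-created nvarchar columns came through as max; CHARACTER_MAXIMUM_LENGTH reports -1 for nvarchar(max):

    -- Lists every nvarchar(max) column the auto-create step produced.
    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE DATA_TYPE = N'nvarchar'
      AND CHARACTER_MAXIMUM_LENGTH = -1
    ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;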

Azure Data Factory

5 answers

  1. Bhargava-MSFT 31,116 Reputation points Microsoft Employee
    2024-06-18T16:30:37.5733333+00:00

    Hello Blasko, Dan W,

    Welcome to the Microsoft Q&A forum.

    Have you upgraded to the new Salesforce connector?

    I don't believe the issue you're facing is due to the integration runtime upgrade. After an internal discussion, it has been confirmed that this behavior is by design.

    When auto-creating tables, the SQL sink uses nvarchar(max) as the default value for non-SQL family source string columns. This is because, when dealing with external data, it is difficult to determine the optimal size for these columns, and using max ensures that all possible data sizes are accommodated.

    Workaround:

    Use an existing table, or implement a pre-copy script that creates the tables with reasonable column lengths before the data is loaded.
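
    For example, a pre-copy script along these lines (the table and column names here are only illustrative, not the actual Salesforce Calendar schema) recreates the target with explicit lengths before each load:

        -- Drop and recreate the sink table with explicit column lengths
        -- so the copy activity does not auto-create nvarchar(max) columns.
        IF OBJECT_ID(N'dbo.Calendar', N'U') IS NOT NULL
            DROP TABLE dbo.Calendar;
        CREATE TABLE dbo.Calendar
        (
            Id          NVARCHAR(18) NOT NULL,  -- Salesforce record IDs are 18 characters
            Name        NVARCHAR(80) NULL,
            CreatedDate DATETIME2    NULL
        );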


  2. Adam Vegh 0 Reputation points
    2024-06-19T14:48:27.39+00:00

    I've got the same issue. It took a day or two to build a workaround and pull the schema from SF via the REST API. Then I create the tables in Synapse with that schema and pull the data from SF. But even though I'm explicitly defining the schema of the table and writing the query with the same columns (by the way, removing SELECT * also seems like a huge step back), the copy activity still fails for some tables and succeeds for others. I just don't understand.


  3. Pinaki Ghatak 4,535 Reputation points Microsoft Employee
    2024-07-03T09:49:06.2233333+00:00

    Hello @Blasko, Dan W

    The issue you are facing is related to the upgrade of the integration runtime from version 4.9 to version 5.1. It is possible that the upgrade caused changes in the behavior of the Salesforce connector, resulting in the issues you are experiencing.

    One possible workaround for this issue is to use a mapping data flow in your pipeline to transform the data from Salesforce before loading it into SQL Server. This would allow you to specify the correct data types for each column and ensure that the data is loaded correctly.

    You can also use the mapping data flow to perform any necessary data transformations or aggregations. Another option is to use a stored procedure to create the tables dynamically based on the metadata retrieved from Salesforce. This would allow you to ensure that the tables are created with the correct data types and lengths; you can then use a bulk insert or another method to load the data into them (a sketch follows below).

    It is also possible that there is a bug in the Salesforce connector in versions 5.1 and 5.2 of the integration runtime.
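
    As a rough sketch of the stored-procedure option (dbo.SalesforceFieldMetadata is a hypothetical metadata table you would populate from the Salesforce schema; it is not something the connector provides), the DDL can be built dynamically per table:

        CREATE OR ALTER PROCEDURE dbo.CreateTableFromMetadata
            @TableName SYSNAME
        AS
        BEGIN
            DECLARE @sql NVARCHAR(MAX);

            -- Hypothetical metadata table holding (TableName, ColumnName, SqlType) rows,
            -- e.g. ('Calendar', 'Id', 'NVARCHAR(18) NOT NULL').
            SELECT @sql = N'CREATE TABLE dbo.' + QUOTENAME(@TableName) + N' ('
                + STRING_AGG(QUOTENAME(ColumnName) + N' ' + SqlType, N', ')
                + N');'
            FROM dbo.SalesforceFieldMetadata
            WHERE TableName = @TableName;

            EXEC sys.sp_executesql @sql;
        END;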

    I hope this helps.


  4. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.



  5. Adam Vegh 0 Reputation points
    2024-07-03T12:06:55.04+00:00

    Hi Everyone,

    I created another workaround with Web, Script and Copy activities:

    https://adamsperspectiveondata.com/2024/07/01/upgrade-pipeline-to-new-salesforce-connector/

