Hello @Chandran, Vinitha,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
Problem
Based on your questions, I understand that you want to extract data from multiple Salesforce objects and move it to another database with Azure Data Factory. The obstacle is that an ADF dataset lets you select only a single Salesforce object, and you are looking for advice on how to work around this efficiently so you can pull data from many different Salesforce objects in one pipeline.
Scenarios
You have been tasked with consolidating data from different Salesforce objects into an Azure SQL Database using Azure Data Factory (ADF). But you ran into a limitation: ADF only lets you pick one Salesforce object at a time when setting up the dataset.
Solution
Before jumping to the solution, please confirm that the problem and scenario notes above match your situation; if they do, continue, otherwise ask for clarification. Thank you.
NOTE: This solution utilizes Lookup activities to retrieve data from multiple Salesforce objects, addressing the limitation of only being able to select a single object per dataset in Azure Data Factory (ADF). This approach allows for flexibility in querying data from different Salesforce objects and subsequently combining it for loading into the destination database.
There are many ways to solve this challenge, but here I will provide two options and some final advice.
OPTION 1:
To solve the problem of extracting data from multiple Salesforce objects and transferring it to another database using Azure Data Factory (ADF), while addressing the limitations mentioned, you can follow these steps in your ADF GUI:
- From the screenshot you provided, I can see that you have already created Linked Services for Salesforce and the destination database. Ensure the authentication credentials and connection details are correct.
- Although ADF allows selecting only a single Salesforce object per dataset, you can create multiple datasets, each representing a different Salesforce object.
- I have also seen your Lookup stage. Add one Lookup activity to your ADF pipeline per dataset, so that each Lookup activity corresponds to one Salesforce object.
- Configure each Lookup activity to query data from its respective Salesforce object using SOQL queries.
- Combine the data from the Lookup activities; if necessary, use additional transformations (e.g., Join, Union in a mapping data flow) to merge the data retrieved from the multiple Lookups.
- Map the output columns from the Lookup activities to the corresponding columns in your destination database dataset.
- If the order of extraction matters or if there are dependencies between Salesforce objects, ensure that Lookup activities are executed sequentially within the pipeline.
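To illustrate the combine step above, here is a minimal Python sketch of joining the two Lookup result sets on the account Id. The field names mirror the Account/Contact examples used later in this answer; note that the AccountId foreign key on Contact is an assumption you would need to add to your Contact query to relate the two objects:

```python
# Sketch only: in a real pipeline these rows come from the Lookup
# activities' outputs (e.g., @activity('LookupAccountData').output.value);
# here they are inlined so the join logic is easy to see.

def combine_accounts_and_contacts(accounts, contacts):
    """Join contact rows to their parent account by Salesforce Id."""
    accounts_by_id = {a["Id"]: a for a in accounts}
    combined = []
    for c in contacts:
        # AccountId is a hypothetical extra field on the Contact query.
        account = accounts_by_id.get(c.get("AccountId"), {})
        combined.append({
            "AccountId": c.get("AccountId"),
            "AccountName": account.get("Name"),
            "ContactId": c["Id"],
            "Email": c.get("Email"),
        })
    return combined

accounts = [{"Id": "001A", "Name": "Acme", "Industry": "Manufacturing"}]
contacts = [{"Id": "003B", "AccountId": "001A", "FirstName": "Ada",
             "LastName": "Lovelace", "Email": "ada@example.com"}]
rows = combine_accounts_and_contacts(accounts, contacts)
```

The same join could equally be pushed down to the destination database or done in a mapping data flow; the sketch just shows what "combining" the Lookup outputs means.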
OPTION 2:
This option helps you structure your ADF pipeline directly in JSON instead of through manual GUI configuration, depending on your expertise level. An example pipeline JSON for this scenario, with Lookup activities that retrieve data from the Salesforce objects and a Copy activity that loads data into the destination database, is shown below:
{
    "name": "SalesforceToDatabasePipeline",
    "properties": {
        "activities": [
            {
                "name": "LookupAccountData",
                "type": "Lookup",
                "typeProperties": {
                    "source": {
                        "type": "SalesforceSource",
                        "query": "SELECT Id, Name, Industry FROM Account WHERE CreatedDate > @{pipeline().parameters.windowStart}"
                    },
                    "dataset": {
                        "referenceName": "SalesforceAccountDataset",
                        "type": "DatasetReference"
                    },
                    "firstRowOnly": false
                }
            },
            {
                "name": "LookupContactData",
                "type": "Lookup",
                "typeProperties": {
                    "source": {
                        "type": "SalesforceSource",
                        "query": "SELECT Id, FirstName, LastName, Email FROM Contact WHERE CreatedDate > @{pipeline().parameters.windowStart}"
                    },
                    "dataset": {
                        "referenceName": "SalesforceContactDataset",
                        "type": "DatasetReference"
                    },
                    "firstRowOnly": false
                }
            },
            {
                "name": "CombineAndLoadData",
                "type": "Copy",
                "dependsOn": [
                    {
                        "activity": "LookupAccountData",
                        "dependencyConditions": [ "Succeeded" ]
                    },
                    {
                        "activity": "LookupContactData",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ],
                "inputs": [
                    {
                        "referenceName": "SalesforceAccountDataset",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "DestinationDatabaseDataset",
                        "type": "DatasetReference"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "SalesforceSource"
                    },
                    "sink": {
                        "type": "SqlSink"
                    },
                    "enableStaging": false
                }
            }
        ],
        "parameters": {
            "windowStart": {
                "type": "String"
            }
        }
    }
}
The example JSON defines an ADF pipeline named "SalesforceToDatabasePipeline". It includes two Lookup activities ("LookupAccountData" and "LookupContactData") that retrieve data from the Salesforce Account and Contact objects, respectively; note that a Lookup references its dataset inside typeProperties, and firstRowOnly must be set to false to return the full result set.
The Copy activity ("CombineAndLoadData") depends on the successful completion of both Lookups and loads data into the destination database. Each Lookup's result set is also available to downstream activities through expressions such as @activity('LookupAccountData').output.value.
The dependsOn conditions ensure that the Copy activity executes only after both Lookup activities have successfully retrieved the required data. Adjust the dataset references, SOQL queries, and other properties to match your specific Salesforce objects and destination database configuration.
You will need to tailor the above to the number of Lookup activities you need and your other parameters.
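Because the SOQL queries above filter on a windowStart pipeline parameter, each run must supply a value for it. A minimal example of the parameters payload for a pipeline run (the parameter name mirrors the queries above; the date value is purely illustrative):

```json
{
    "windowStart": "2024-01-01T00:00:00Z"
}
```

You would pass this when triggering the pipeline, for example from a scheduled trigger or the Create Run call, so each execution only extracts records created after the watermark.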
Finally
There are alternative approaches to consider for extracting data from multiple Salesforce objects and transferring it to another database. They include, but are not limited to, the following:
- Use Custom Activities, such as running your own scripts from the pipeline.
- Develop Azure Functions that leverage Salesforce REST APIs to extract data from multiple objects.
- Explore third-party connectors or extensions available in Azure Data Factory Marketplace or other integration platforms.
- Consider using dedicated data integration platforms such as Informatica, Talend, or MuleSoft, which provide robust capabilities for Salesforce integration.
- Implement Change Data Capture mechanisms in Salesforce to track changes in data across multiple objects.
- Employ data replication tools or services that support Salesforce, such as Heroku Connect or DBSync.
- Establish a direct connection between the destination database and Salesforce using technologies like Salesforce Connect or OData.
- Combine multiple approaches into a hybrid solution, such as using ADF Lookup activities for the initial data extraction and custom code for complex data transformations or incremental updates.
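As a sketch of the Azure Functions alternative above: the function would issue one SOQL query per object against the Salesforce REST query endpoint. Here fetch_records is a hypothetical stand-in for an authenticated call to /services/data/vXX.X/query?q=&lt;SOQL&gt;; in a real function you would swap in an HTTP client plus an OAuth token:

```python
# Sketch only: the fetch function is stubbed so the example runs
# without Salesforce credentials or network access.

def build_soql(obj, fields, watermark):
    """Compose an incremental SOQL query for one object."""
    return (f"SELECT {', '.join(fields)} FROM {obj} "
            f"WHERE CreatedDate > {watermark}")

def extract_objects(object_fields, watermark, fetch_records):
    """Run one query per Salesforce object and collect the rows."""
    results = {}
    for obj, fields in object_fields.items():
        soql = build_soql(obj, fields, watermark)
        results[obj] = fetch_records(soql)
    return results

# Hypothetical stub standing in for the authenticated REST call.
def fake_fetch(soql):
    return [{"query": soql}]

data = extract_objects(
    {"Account": ["Id", "Name"], "Contact": ["Id", "Email"]},
    "2024-01-01T00:00:00Z",
    fake_fetch,
)
```

The per-object loop is the key idea: one function invocation can cover as many objects as you list, sidestepping the one-object-per-dataset limitation entirely.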
References
Kindly read more from the additional resources provided on the right side of this page for more information.
Accept Answer
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
** Please don't forget to close the thread by upvoting and accepting this as an answer if it was helpful, ** so that others in the community facing similar issues can easily find the solution.
Best Regards,
Sina Salam