How can use Lookup to get data from multiple salesforce Object in ADF

Chandran, Vinitha 40 Reputation points
2024-04-23T08:15:29.11+00:00

I need to select Some data from Multiple salesforce object to another database. Currently I am trying pipeline for this because Dataflow is not supporting Salesforce.

But the problem is While creating source/dataset it is showing only object and don't have any soql query option to make my dataset as Mutiple source. Advice how can handle this issue

Screen 1- Salesforce Dataset creation- Only option to select single Object

User's image

Sscreen 2- In look up having option to select dataset which created by step 1 and followed by SOQL option from only that source with filter condition.

User's image

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,755 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Sina Salam 4,221 Reputation points
    2024-04-23T14:42:25.9033333+00:00

    Hello @Chandran, Vinitha,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Problem

    Based on your questions, I understand that you want to get information from lots of different places in Salesforce and move it to another database with Azure Data Factory. But there's a problem – you can only pick one thing at a time from Salesforce when setting up the dataset in ADF. You're looking for advice on how to get around this and do it efficiently, pulling data from loads of different parts of Salesforce all at once.

    Scenarios

    You have been given the job of putting together data from different parts of Salesforce and putting it into their Azure SQL Database using Azure Data Factory (ADF). But you ran into a problem: ADF only lets you pick one thing from Salesforce at a time when she's setting up the dataset.

    Solution

    Do not jump to the solution, If the problem, scenario and solution note meet up with what you need you can continue or else, ask for clarification. Thank you.

    NOTE: This solution utilizes Lookup activities to retrieve data from multiple Salesforce objects, addressing the limitation of only being able to select a single object per dataset in Azure Data Factory (ADF). This approach allows for flexibility in querying data from different Salesforce objects and subsequently combining it for loading into the destination database.

    There are many ways to solve your challenges but here I will provide two answers and final advice.

    OPTION1:

    To solve the problem of extracting data from multiple Salesforce objects and transferring it to another database using Azure Data Factory (ADF), while addressing the limitations mentioned, you can follow these steps in your ADF GUI:

    1. I can see from the screenshot you provided, that you have created Linked Services for Salesforce and Destination Database. Ensure authentication credentials and connection details provided are correct.
    2. Although ADF allows selecting only a single Salesforce object per dataset, you can create multiple datasets, each representing a different Salesforce object.
    3. I have seen your lookup stage, when adding Lookup activities to your ADF pipeline. Each Lookup activity should correspond to one of the datasets representing a Salesforce object.
    4. Configure each Lookup activity to query data from its respective Salesforce object using SOQL queries.
    5. Combine Data from Lookup Activities, if necessary, use additional activities (e.g., Join, Union) to combine data retrieved from multiple Lookup activities.
    6. Map the output columns from the Lookup activities to the corresponding columns in your destination database dataset.
    7. If the order of extraction matters or if there are dependencies between Salesforce objects, ensure that Lookup activities are executed sequentially within the pipeline.

    OPTION2:

    This option will help you to structure your ADF pipeline in JSON without manual configuration. Depend on your expertise level. The structured ADF pipeline JSON for the scenario, including Lookup activities to retrieve data from Salesforce objects and a Copy activity to load the combined data into the destination database, example is here below:

    {
        "name": "SalesforceToDatabasePipeline",
        "properties": {
            "activities": [
                {
                    "name": "LookupAccountData",
                    "type": "Lookup",
                    "inputs": [
                        {
                            "referenceName": "SalesforceAccountDataset",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "AccountData",
                            "type": "DatasetReference"
                        }
                    ],
                    "typeProperties": {
                        "source": {
                            "query": "SELECT Id, Name, Industry FROM Account WHERE CreatedDate > @pipeline().parameters.windowStart"
                        }
                    }
                },
                {
                    "name": "LookupContactData",
                    "type": "Lookup",
                    "inputs": [
                        {
                            "referenceName": "SalesforceContactDataset",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "ContactData",
                            "type": "DatasetReference"
                        }
                    ],
                    "typeProperties": {
                        "source": {
                            "query": "SELECT Id, FirstName, LastName, Email FROM Contact WHERE CreatedDate > @pipeline().parameters.windowStart"
                        }
                    }
                },
                {
                    "name": "CombineAndLoadData",
                    "type": "Copy",
                    "dependsOn": [
                        {
                            "activity": "LookupAccountData",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        },
                        {
                            "activity": "LookupContactData",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "inputs": [
                        {
                            "referenceName": "AccountData",
                            "type": "DatasetReference"
                        },
                        {
                            "referenceName": "ContactData",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "DestinationDatabaseDataset",
                            "type": "DatasetReference"
                        }
                    ],
                    "typeProperties": {
                        "source": {
                            "type": "BlobSource"
                        },
                        "sink": {
                            "type": "SqlSink"
                        },
                        "enableStaging": false
                    }
                }
            ]
        }
    }
    

    The example structured JSON defines an ADF pipeline named "SalesforceToDatabasePipeline". It includes two Lookup activities ("LookupAccountData" and "LookupContactData") to retrieve data from Salesforce Account and Contact objects, respectively.

    The Copy activity ("CombineAndLoadData") depends on the successful completion of both Lookup activities and loads the combined data into the destination database.

    The dependencies ensure that the Copy activity executes only after the Lookup activities have successfully retrieved the required data. Adjust the dataset references, queries, and other properties as per your specific Salesforce objects and destination database configuration.

    You will need to tailor the above to number of your Lookup activities, and other parameters.

    Finally

    There are some alternative approaches to consider for extracting data from multiple Salesforce objects and transferring it to another database.

    They are not limited to the followings:

    • Custom Activities like using scripts as second option.
    • Develop Azure Functions that leverage Salesforce REST APIs to extract data from multiple objects.
    • Explore third-party connectors or extensions available in Azure Data Factory Marketplace or other integration platforms.
    • onsider using dedicated data integration platforms such as Informatica, Talend, or MuleSoft, which provide robust capabilities for Salesforce integration.
    • Implement Change Data Capture mechanisms in Salesforce to track changes in data across multiple objects.
    • Employ data replication tools or services that support Salesforce, such as Heroku Connect or DBSync.
    • Establish a direct connection between the destination database and Salesforce using technologies like Salesforce Connect or OData.
    • Combine multiple approaches for hybrid solution, such as using a combination of ADF Lookup activities for initial data extraction and custom code for complex data transformations or incremental updates.

    References

    Kindly read more from additional resources provided by the right side of this page for more information.

    Accept Answer

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    ** Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful ** so that others in the community facing similar issues can easily find the solution.

    Best Regards,

    Sina Salam

    1 person found this answer helpful.
    0 comments No comments

  2. AnnuKumari-MSFT 31,726 Reputation points Microsoft Employee
    2024-04-24T08:59:02.3333333+00:00

    Hi Chandran, Vinitha ,

    Thankyou for using Microsoft Q&A platform and thanks for posting your query here.

    As per my understanding you are finding it difficult to connect to multiple salesforce objects using ADF pipeline. Please let me know if that is not the correct understanding.

    In dataset, you can do something called 'parameterization' which lets you to point to objects dynamically by doing right configurations. You do not need to select a single object/table in the dataset.

    First of all, create linked service using salesforce connector, then create dataset without selecting the object. Once the dataset is created, open the dataset and go to parameters tab and create a new parameter called 'object'. Then go to connection tab and in object API name, click on 'add dynamic content' and select the created parameter. It will automatically generate this expression: @{dataset().object}

    Use Foreach activity in the ADF pipeline and in Items, provide the name of all the objects in the form of an array using this expression:

    @createArray('object1','object2','object3')
    
    
    

    Inside foreach, use lookup activity and select the created dataset, here you need to pass the parameter value for the dataset using this expression: @item()

    Instead of hardcoding the objectnames in Foreach item, you can pass it using another lookup activity before foreach where the lookup could point to a csv file or a table containing the names of all the objects to iterate through.

    User's image

    User's image

    User's image

    User's image

    Hope it helps. Kindly accept the answer by clicking on Accept answer button. Thankyou

    1 person found this answer helpful.
    0 comments No comments