HTTP Response Header details not available

Muralikrishna Pirla 216 Reputation points
2022-01-03T07:47:57.28+00:00

Hello there. I have a requirement to fetch billions of records from Salesforce objects on a daily basis. I am using Bulk API 2.0 for this task because the direct Salesforce connector takes more than 15 hours to extract all the records from a single object. I first obtained the bearer token and then created a query job using a Web activity, and the job processes all the records as expected. The Salesforce Bulk API currently supports only the CSV format, so I created an HTTP linked service and I am copying the data using the Copy activity. Since there are millions of records, I am not able to fetch them all at once, so we need to implement pagination here.
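
For readers following along, the job-creation step described above might look roughly like the sketch below. It only builds the request and does not send it. The instance URL, API version, token placeholder, SOQL text and the helper name `build_query_job_request` are all assumptions for illustration, not values from my pipeline.

```python
import json
import urllib.request


def build_query_job_request(instance_url, api_version, access_token, soql):
    """Build (but do not send) the POST request that creates a Bulk API 2.0 query job."""
    url = f"{instance_url}/services/data/v{api_version}/jobs/query"
    body = json.dumps({"operation": "query", "query": soql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
request = build_query_job_request(
    "https://example.my.salesforce.com",
    "53.0",
    "<access token>",
    "SELECT Id, Name FROM Account",
)
```

In Data Factory the same URL, headers and JSON body would go into the Web activity settings. The job id returned in the response is then used to build the results URL.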

I have gone through the Salesforce Bulk API 2.0 documentation and found that if not all the records are returned in a single request, the HTTP response includes a locator value (the Sforce-Locator header), which we can pass as an argument to get the next set of results.

Here is the sample response from the Salesforce documentation:

HTTP/1.1 200 OK
...
Sforce-Locator: MTAwMDA
Sforce-NumberOfRecords: 50000
...

"Id","Name"
"005R0000000UyrWIAS","Jane Dunn"
"005R0000000GiwjIAC","George Wright"
"005R0000000GiwoIAC","Pat Wilson"
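
The pagination I am trying to reproduce can be sketched as a simple loop, shown here outside Data Factory. This is a minimal sketch under some assumptions: `fetch_page` is a hypothetical callable that performs the GET and returns the response headers and CSV body, the `locator` and `maxRecords` query parameters are named as I understand them from the Salesforce documentation, and the last page is assumed to be signalled by a missing locator or the literal string "null".

```python
from urllib.parse import urlencode


def iter_result_pages(fetch_page, results_url, max_records=50000):
    """Yield CSV pages until the Sforce-Locator header is absent or "null".

    fetch_page is a hypothetical callable: fetch_page(url) -> (headers, body).
    """
    locator = None
    while True:
        params = {"maxRecords": max_records}
        if locator is not None:
            params["locator"] = locator
        headers, body = fetch_page(f"{results_url}?{urlencode(params)}")
        yield body
        # HTTP header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        locator = lowered.get("sforce-locator")
        if locator is None or locator == "null":
            break


# Stand-in for the real HTTP call: canned headers and bodies, no network access.
requested = []


def fake_fetch(url):
    requested.append(url)
    if "locator=" not in url:
        return {"Sforce-Locator": "MTAwMDA"}, '"Id","Name"\n"005R0000000UyrWIAS","Jane Dunn"\n'
    return {"Sforce-Locator": "null"}, '"Id","Name"\n"005R0000000GiwjIAC","George Wright"\n'


pages = list(iter_result_pages(fake_fetch, "https://example.invalid/results"))
```

The loop depends entirely on reading the Sforce-Locator response header after every request, which is the piece I need from the pipeline.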

I have tried this with the Web activity, and I can see the locator value and the response header details, as shown below:

"ADFWebActivityResponseHeaders": {
"Date": "Mon, 03 Jan 2022 07:33:06 GMT",
"Set-Cookie": "CookieConsentPolicy=0:1; path=/; expires=Tue, 03-Jan-2023 07:33:06 GMT; Max-Age=31536000;LSKey-c$CookieConsentPolicy=0:1; domain=; path=/; expires=Tue, 03-Jan-2023 07:33:06 GMT; Max-Age=31536000;BrowserId=ZDuTmmxnEeyppMMTrylLrg; domain=.salesforce.com; path=/; expires=Tue, 03-Jan-2023 07:33:06 GMT; Max-Age=31536000",
"Strict-Transport-Security": "max-age=31536000; includeSubDomains",
"X-Content-Type-Options": "nosniff",
"X-XSS-Protection": "1; mode=block",
"X-Robots-Tag": "none",
"Cache-Control": "no-store, must-revalidate, no-cache, max-age=0, private",
"Sforce-Limit-Info": "api-usage=42264/5000000",
"Sforce-Locator": "NTAw",

However, I can't see the same header details in the HTTP response from the Copy activity, and I need them to implement pagination. Can anyone help me here?

Please note: we can't use the REST API connector, as the source supports only the CSV format.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

3 answers

  1. Muralikrishna Pirla 216 Reputation points
    2022-01-04T05:12:14.243+00:00

    @ShaikMaheer-MSFT @KranthiPakala-MSFT @MartinJaffer-MSFT

    Could any of you please check this and help me here?


  2. Muralikrishna Pirla 216 Reputation points
    2022-01-04T17:27:56.82+00:00

    Hi @MartinJaffer-MSFT, thanks for the response.

    I can't really use the Web activity, as it can't return more than 4 MB of data and I have to fetch around 10 million records daily. If I put it in a loop, it will take 10 to 15 hours.

    My destination is ADLS Gen1.

    Just curious: can we use Logic Apps here?


  3. Muralikrishna Pirla 216 Reputation points
    2022-01-13T10:29:49.82+00:00

    Hi @MartinJaffer-MSFT, we are able to read the HTTP header values in Azure Logic Apps. However, Logic Apps does not handle large datasets well here: we can't extract more than 5k records with it.
