Hi @Anonymous,
Could you please refer to the MS doc below and see whether it helps you achieve your requirement? If you face any issues, or if you have any feedback or suggestions regarding this implementation, please share them with us so that I can take them forward to the appropriate team.
Here is the MS doc: Copy file from SharePoint Online using Azure Data Factory
You can copy a file from SharePoint Online by using a Web activity to authenticate and obtain an access token from SPO, then passing that token to a subsequent Copy activity that copies the data with the HTTP connector as its source.
- Follow the Prerequisites section to create an AAD application and grant it permission to SharePoint Online.
- Create a Web Activity to get the access token from SharePoint Online:
- URL: https://accounts.accesscontrol.windows.net/[Tenant-ID]/tokens/OAuth/2. Replace the tenant ID.
- Method: POST
- Headers:
- Content-Type: application/x-www-form-urlencoded
- Body: grant_type=client_credentials&client_id=[Client-ID]@[Tenant-ID]&client_secret=[Client-Secret]&resource=00000003-0000-0ff1-ce00-000000000000/[Tenant-Name].sharepoint.com@[Tenant-ID]. Replace the client ID, client secret, tenant ID and tenant name.
Note: Set the Secure Output option to true in Web activity to prevent the token value from being logged in plain text. Any further activities that consume this value should have their Secure Input option set to true.
- Chain a Copy activity that uses the HTTP connector as its source to copy the SharePoint Online file content:
- HTTP linked service:
i) Base URL: https://[site-url]/_api/web/GetFileByServerRelativeUrl('[relative-path-to-file]')/$value. Replace the site URL and the relative path to the file. A sample relative path to a file is /sites/site2/Shared Documents/TestBook.xlsx.
ii) Authentication type: Anonymous (to use the Bearer token configured in copy activity source later)
- Dataset: choose the format you want. To copy the file as-is, select the "Binary" type.
- Copy activity source:
i) Request method: GET
ii) Additional header: use the following expression: @{concat('Authorization: Bearer ', activity('<Web-activity-name>').output.access_token)}, which uses the Bearer token generated by the upstream Web activity as the authorization header. Replace <Web-activity-name> with the name of your Web activity.
- Configure the copy activity sink as usual.
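If it helps to test the app registration and permissions before building the pipeline, the two HTTP requests described in the steps above can be reproduced outside Data Factory. Below is a minimal Python sketch using only the standard library. The function names are hypothetical, the token response is assumed to be JSON with an access_token field (as the pipeline expression above implies), and [site-url] is assumed to be passed without the https:// prefix. It does not handle file paths that contain apostrophes.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Resource principal ID for SharePoint Online, taken from the request body above.
SPO_RESOURCE_ID = "00000003-0000-0ff1-ce00-000000000000"


def build_token_request(tenant_id, tenant_name, client_id, client_secret):
    """Build the POST request that the Web activity sends (hypothetical helper)."""
    url = f"https://accounts.accesscontrol.windows.net/{tenant_id}/tokens/OAuth/2"
    # urlencode percent-encodes '@' and '/', which is valid for a form-encoded body.
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": f"{client_id}@{tenant_id}",
        "client_secret": client_secret,
        "resource": f"{SPO_RESOURCE_ID}/{tenant_name}.sharepoint.com@{tenant_id}",
    }).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, body, headers


def build_file_request(site_url, relative_path, access_token):
    """Build the GET request that the Copy activity source sends (hypothetical helper)."""
    # quote() keeps '/' and encodes spaces, e.g. 'Shared Documents' -> 'Shared%20Documents'.
    encoded_path = quote(relative_path)
    url = (
        f"https://{site_url}/_api/web/"
        f"GetFileByServerRelativeUrl('{encoded_path}')/$value"
    )
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers


def download_file(tenant_id, tenant_name, client_id, client_secret,
                  site_url, relative_path):
    """Get a token, then return the file content as bytes (makes network calls)."""
    url, body, headers = build_token_request(
        tenant_id, tenant_name, client_id, client_secret
    )
    with urlopen(Request(url, data=body, headers=headers, method="POST")) as resp:
        # Assumption: the response is JSON containing an 'access_token' field.
        token = json.load(resp)["access_token"]

    file_url, file_headers = build_file_request(site_url, relative_path, token)
    with urlopen(Request(file_url, headers=file_headers, method="GET")) as resp:
        return resp.read()
```

Calling download_file with your own tenant ID, tenant name, client ID, client secret, site URL and the sample relative path /sites/site2/Shared Documents/TestBook.xlsx should return the raw file bytes if the permissions are set up correctly. An HTTP error from either request usually points to the step to recheck.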
Hope this helps. Please let us know how it goes.
Thank you.