Using Microsoft Graph API to convert the format of your documents

This post is a contribution from Jing Wang, an engineer with the SharePoint Developer Support team.

Many SharePoint Online (SPO) customers want to convert Word documents, or other documents stored in SPO, to PDF files programmatically.
In the SharePoint Online user interface you can convert documents manually, one at a time. Users often want to convert multiple documents automatically instead, without having to open each document library, locate the documents, and click through the options for every file.

With the new Graph API endpoint listed below, this requirement can be automated with custom code, for example in C# or JavaScript.

GET /drive/items/{item-id}/content?format={format}
GET /drive/root:/{path and filename}:/content?format={format}

Format options
The following values are valid for the format parameter:

Format value | Description | Supported source extensions
pdf | Converts the item into PDF format. | csv, doc, docx, odp, ods, odt, pot, potm, potx, pps, ppsx, ppsxm, ppt, pptm, pptx, rtf, xls, xlsx

See details of the endpoint here:
https://developer.microsoft.com/en-us/graph/docs/api-reference/v1.0/api/driveitem_get_content_format
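
As a minimal sketch of how the endpoint and the format table above fit together, the helper below checks whether a file name has one of the listed source extensions and builds the path-based conversion URL. The class name ConversionUrl and its methods are hypothetical helpers written for this post, not part of any Graph SDK, and the sketch assumes the drive root URL (for example https://graph.microsoft.com/v1.0/me/drive) is passed in by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only; not part of any Microsoft SDK.
public static class ConversionUrl
{
    // Source extensions that the format table lists as convertible to PDF.
    private static readonly HashSet<string> PdfSources = new HashSet<string>(
        new[] { "csv", "doc", "docx", "odp", "ods", "odt", "pot", "potm", "potx", "pps",
                "ppsx", "ppsxm", "ppt", "pptm", "pptx", "rtf", "xls", "xlsx" },
        StringComparer.OrdinalIgnoreCase);

    // Returns true when the file extension appears in the supported list.
    public static bool CanConvertToPdf(string fileName)
    {
        string extension = Path.GetExtension(fileName).TrimStart('.');
        return PdfSources.Contains(extension);
    }

    // Builds {driveRoot}/root:/{path}:/content?format=pdf.
    public static string ForPath(string driveRoot, string pathInDrive)
    {
        // Escape each path segment on its own so the '/' separators are preserved.
        string escapedPath = string.Join("/",
            pathInDrive.Trim('/').Split('/').Select(segment => Uri.EscapeDataString(segment)));
        return driveRoot.TrimEnd('/') + "/root:/" + escapedPath + ":/content?format=pdf";
    }
}
```

A caller could filter a list of file names with ConversionUrl.CanConvertToPdf and then request ConversionUrl.ForPath("https://graph.microsoft.com/v1.0/me/drive", name) for each remaining file.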

Sample - Complete solution in C#:

Step I. Create a native app in the Azure portal and grant it permissions for the Graph API.

 

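Step II. Create a Console Application and add references to the following DLLs. All of them can be added as NuGet packages:
Microsoft.IdentityModel.Clients.ActiveDirectory.dll (NuGet package Microsoft.IdentityModel.Clients.ActiveDirectory)
Microsoft.SharePoint.Client.dll and Microsoft.SharePoint.Client.Runtime.dll (NuGet package Microsoft.SharePointOnline.CSOM, used for the upload step)
Newtonsoft.Json.dll (NuGet package Newtonsoft.Json)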

Add the following using statements:

using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Newtonsoft.Json;

Step III. Implement the code that uses the Graph API to convert the document, download the result locally, and upload it back to an SPO site.
Note: The ADAL library is used for authentication.

Source code:
-------------------------------

using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using System.Threading.Tasks;
using System.Net;
using System.Security.Claims;
using System.IO;
using Microsoft.SharePoint.Client;
using System.Security;


namespace ConsoleApp1
{
    public static class StreamExtensions
    {
        public static byte[] ReadAllBytes(this Stream instream)
        {
            if (instream is MemoryStream)
                return ((MemoryStream)instream).ToArray();

            using (var memoryStream = new MemoryStream())
            {
                instream.CopyTo(memoryStream);
                return memoryStream.ToArray();
            }
        }
    }
    class Program
    {
        private static string TENANT_NAME = "mycompany.onmicrosoft.com";
        private static string resource = "https://graph.microsoft.com";
        private static string loginname = "user@mycompany.onmicrosoft.com";
        private static string loginpassword = "*********";
        private static string AzureTenantID = "********-f247-4d48-a45d-************";
        private static string spositeUrl = "https://mycompany.sharepoint.com/*********";
        private static string destinationDocumentLibrary = "dl1"; 

        static void Main(string[] args)
        {
            // Acquire a user token with ADAL, using the login name and password.
            UserPasswordCredential userPasswordCredential = new UserPasswordCredential(loginname, loginpassword);
            var graphauthority = "https://login.microsoftonline.com/" + AzureTenantID;
            AuthenticationContext authContext = new AuthenticationContext(graphauthority);
            // The second argument is the client (application) ID of the registered Azure app.
            var token = authContext.AcquireTokenAsync(resource, "94b1544c-35e8-4d45-a941-c3dbaab283dc", userPasswordCredential).Result.AccessToken;

            // Request the conversion. Graph answers with 302 Found and returns the
            // download URL in the Location header, so automatic redirects are
            // disabled in order to read that header.
            HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://graph.microsoft.com/v1.0/me/drive/root:/orange.docx:/content?format=pdf");
            // To convert a file in a specific SPO drive instead, address the drive by its ID:
            //HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://graph.microsoft.com/v1.0/drives/b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6j8GNUD/root:/Repro.docx:/content?format=pdf");
            myHttpWebRequest.AllowAutoRedirect = false;
            myHttpWebRequest.Headers.Set("Authorization", ("Bearer " + token));
            HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
            string downloadPath = myHttpWebResponse.GetResponseHeader("Location");
            myHttpWebResponse.Close();
            Console.WriteLine("Download PDF file from here:\n " + downloadPath);

            //Get the file Stream with Location Url
            HttpWebRequest HttpWebRequest_download = (HttpWebRequest)WebRequest.Create(downloadPath);
            HttpWebRequest_download.Accept = "blob";

            var response = (HttpWebResponse)HttpWebRequest_download.GetResponse();
            Stream myStream = response.GetResponseStream();
            FileStream targetFile = new FileStream("C:\\temp\\orange_converted_localcopy.pdf", FileMode.Create);
            myStream.CopyTo(targetFile);
            myStream.Close();                   
            response.Close();

            // You could keep using the Graph API to upload the document back to OneDrive or another SPO site.
            // Because the login name and password are already available, this quick demo uploads the file
            // to another SPO site with CSOM instead.
            using (var clientContext = new ClientContext(spositeUrl))
            {
                SecureString passWord = new SecureString();
                foreach (char c in loginpassword.ToCharArray()) passWord.AppendChar(c);

                clientContext.Credentials = new SharePointOnlineCredentials(loginname, passWord);
                var web = clientContext.Web;
                clientContext.Load(web);
                clientContext.ExecuteQuery();

                List dl = web.Lists.GetByTitle(destinationDocumentLibrary);
                clientContext.Load(dl);
                clientContext.ExecuteQuery();

                // Upload the converted file to the SPO site.
                // Rewind the local file so the upload reads it from the beginning.
                targetFile.Position = 0;
                var fci = new FileCreationInformation
                {
                    Url = "orange_converted_spocopy.pdf",
                    ContentStream = targetFile,
                    Overwrite = true
                };
                Folder folder = dl.RootFolder;
                FileCollection files = folder.Files;
                Microsoft.SharePoint.Client.File file = files.Add(fci);
                clientContext.Load(files);
                clientContext.Load(file);
                clientContext.ExecuteQuery();

                targetFile.Close();

                Console.WriteLine("Converted file is uploaded to SPO site - orange_converted_spocopy.pdf");
                Console.ReadKey();
            }
        }
       
    }
}
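
The redirect step in the code above is the part that is easiest to get wrong, so here is a small sketch of the same idea with HttpClient instead of HttpWebRequest. It is an alternative illustration, not the code used in this post: the class ConversionRedirect and both of its methods are hypothetical names, and the sketch assumes the conversion request answers with 302 Found and a Location header, as the sample above relies on.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper for illustration only.
public static class ConversionRedirect
{
    // HttpClient follows redirects by default; turning that off keeps the
    // 302 response visible so its Location header can be read.
    public static HttpClient CreateClient()
    {
        return new HttpClient(new HttpClientHandler { AllowAutoRedirect = false });
    }

    // Returns the download URL from a conversion response, or throws when the
    // response is not the expected 302 Found with a Location header.
    public static Uri GetDownloadUrl(HttpResponseMessage response)
    {
        if (response.StatusCode != HttpStatusCode.Found || response.Headers.Location == null)
        {
            throw new InvalidOperationException(
                "Expected 302 Found with a Location header, got status " + (int)response.StatusCode);
        }
        return response.Headers.Location;
    }
}
```

With a client from CreateClient, the caller would send the GET request with the Bearer token, pass the response to GetDownloadUrl, and then download the PDF from the returned URL.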

Here is the converted file downloaded locally.

The converted PDF is also uploaded to this SPO site:

While building the URL for the Graph API endpoint, I found it tricky to identify the drive ID for a specific SPO site, so here is the approach I used to get it.

First, use the following URL format in Graph Explorer to retrieve the drive ID:
https://graph.microsoft.com/v1.0/sites/[spositehostname]:/[sites/pub]:/drive

For example, if the SPO site URL is:
https://mycompany.sharepoint.com/sites/testsite

The URL to enter in Graph Explorer is:
https://graph.microsoft.com/v1.0/sites/mycompany.sharepoint.com:/sites/testsite:/drive

The output contains the drive ID in the "id" property:
--

{
    "@odata.context": "https://graph.microsoft.com/beta/$metadata#drives/$entity",
    "createdDateTime": "2017-12-04T19:48:25Z",
    "description": "This system library was created by the Publishing feature to store documents that are used on pages in this site.",
    "id": "b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6j8GNUD",
    "lastModifiedDateTime": "2018-10-11T04:08:37Z",
    "name": "Documents",
    "webUrl": "https://mycompany.sharepoint.com/sites/****/Documents",
    "driveType": "documentLibrary",
    "createdBy": {
        "user": {
            "displayName": "System Account"
        }
    }
}

I have a file named “Repro.docx” in the root of the above drive:

So the file’s conversion endpoint is:
https://graph.microsoft.com/v1.0/drives/b!zMNDej1sNEG0SanRDltXfAVTYcdt1pdIggMBPYZYp9Wgdi3ir9sFQJXof6j8GNUD/root:/Repro.docx:/content?format=pdf
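
The two URL patterns above can also be produced in code. The sketch below derives the drive lookup URL from an SPO site URL and then builds the conversion endpoint from a drive ID and a file name. GraphSiteUrls and its methods are hypothetical helpers named for this example, and the sketch assumes the site lives under a path such as /sites/testsite (it does not handle the tenant root site) and that the file sits in the root of the drive.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class GraphSiteUrls
{
    private const string GraphRoot = "https://graph.microsoft.com/v1.0";

    // https://mycompany.sharepoint.com/sites/testsite
    //   -> https://graph.microsoft.com/v1.0/sites/mycompany.sharepoint.com:/sites/testsite:/drive
    public static string DriveLookupUrl(string siteUrl)
    {
        var uri = new Uri(siteUrl);
        string sitePath = uri.AbsolutePath.Trim('/');
        if (sitePath.Length == 0)
        {
            // Assumption of this sketch: only sites under a path are supported.
            throw new ArgumentException("Expected a site URL with a path, such as /sites/testsite.", "siteUrl");
        }
        return GraphRoot + "/sites/" + uri.Host + ":/" + sitePath + ":/drive";
    }

    // Builds the PDF conversion endpoint for a file in the root of the given drive.
    public static string ConversionUrl(string driveId, string fileName)
    {
        return GraphRoot + "/drives/" + driveId + "/root:/" + Uri.EscapeDataString(fileName) + ":/content?format=pdf";
    }
}
```

The drive ID passed to ConversionUrl is the "id" value returned by a GET request to the URL from DriveLookupUrl.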