How to extract PDF attachments

KaziNad 51 Reputation points
2020-08-07T16:19:34.337+00:00

I receive PDF files (electronic invoices) that contain embedded XML files, which I would like to extract and process in a Logic App. I could not find a way to extract files embedded in PDFs. Any ideas? Thanks.

Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.

Accepted answer
  1. ChaitanyaNaykodi-MSFT 23,031 Reputation points Microsoft Employee
    2020-08-12T19:19:00.787+00:00

    Hello @KaziNad, sorry for the delay in my response. Currently, none of the PDF connectors for Logic Apps can extract attached files from a PDF document. An alternative way to extract the attached ‘xml’ file is to integrate a Function app into your Logic App. More information is available on how to call a function app from your logic app. We also found a thread that may help with implementing the C# code required to extract the ‘xml’ attachment from the PDF.
    Please let me know if you need any additional information; I will be glad to continue the discussion.


Additional answers

  1. Vishant Pandey 6 Reputation points
    2021-07-20T04:55:24.947+00:00

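    1. Install the IronPdf NuGet package into your project.
    2. Follow the link: reading-pdf-text

    using IronPdf;

    // D:\demoSp.pdf is the full path of your input PDF file
    PdfDocument pdf = PdfDocument.FromFile(@"D:\demoSp.pdf");
    // ExtractAllText reads the text of the PDF pages; it does not extract embedded file attachments
    FileContent.Text = pdf.ExtractAllText();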


  2. Ezreal95 1 Reputation point
    2021-01-12T09:48:26.33+00:00

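    You could try the Spire.PDF library to extract attachments from a PDF using C#.

    using System.IO;
    using Spire.Pdf;
    using Spire.Pdf.Attachments;

    // Load the PDF
    PdfDocument pdf = new PdfDocument("Attachment1.pdf");
    // Get the first attachment (index 0); this throws if the PDF has no attachments
    PdfAttachment attachment = pdf.Attachments[0];
    // Write the attachment bytes to a file named after the attachment
    File.WriteAllBytes(attachment.FileName, attachment.Data);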
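
    Since an invoice PDF may carry more than one embedded file, the snippet above could be extended to loop over every attachment and keep only the XML ones. The following is a minimal sketch, not a tested implementation: it reuses only the Spire.PDF members shown above (`PdfDocument`, `Attachments`, `FileName`, `Data`) and assumes the attachment collection exposes a `Count` property. The class name, the default input path, and the `extracted` output folder are hypothetical.

    ```csharp
    using System;
    using System.IO;
    using Spire.Pdf;
    using Spire.Pdf.Attachments;

    class ExtractXmlAttachments
    {
        static void Main(string[] args)
        {
            // Hypothetical defaults; pass real paths as command-line arguments.
            string inputPath = args.Length > 0 ? args[0] : "Attachment1.pdf";
            string outputDir = args.Length > 1 ? args[1] : "extracted";
            Directory.CreateDirectory(outputDir);

            // Load the PDF, as in the snippet above.
            PdfDocument pdf = new PdfDocument(inputPath);

            // Assumption: the attachment collection exposes Count.
            for (int i = 0; i < pdf.Attachments.Count; i++)
            {
                PdfAttachment attachment = pdf.Attachments[i];

                // Skip anything that is not an XML file.
                if (!attachment.FileName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                // Path.GetFileName drops any directory part stored in the attachment name.
                string target = Path.Combine(outputDir, Path.GetFileName(attachment.FileName));
                File.WriteAllBytes(target, attachment.Data);
                Console.WriteLine($"Saved {target}");
            }
        }
    }
    ```

    The same loop could sit inside a Function app called from the Logic App, as the accepted answer suggests, returning the XML content instead of writing it to disk.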

