Hello @KaziNad, sorry for the delay in my response. Currently, none of the PDF connectors for Logic Apps can extract files attached to a PDF document. An alternative is to call a Function App from your Logic App and extract the attached XML file there. You can find more information here about how to call a function app from your logic app. We also found this thread, which we think might help with implementing the C# code needed to extract the XML attachment from the PDF.
Please let me know if you need any additional information; I will be glad to continue the discussion.
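As a rough sketch of the Function App approach, the HTTP-triggered function below accepts the PDF bytes in the request body and returns any embedded XML files as base64 strings that the Logic App can then process. It is not a definitive implementation and rests on several assumptions: it uses the Spire.PDF library suggested in another answer in this thread, it assumes the in-process Azure Functions C# model, and it assumes that `LoadFromStream` and `Close` are available on Spire's `PdfDocument`. The function name `ExtractPdfAttachments` and the response fields `fileName` and `contentBase64` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using Spire.Pdf;
using Spire.Pdf.Attachments;

public static class ExtractPdfAttachments // hypothetical function name
{
    [FunctionName("ExtractPdfAttachments")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)] HttpRequest req,
        ILogger log)
    {
        // Buffer the request body; the Logic App is assumed to send the raw PDF bytes.
        using (var buffer = new MemoryStream())
        {
            await req.Body.CopyToAsync(buffer);
            if (buffer.Length == 0)
            {
                return new BadRequestObjectResult("Request body must contain a PDF file.");
            }
            buffer.Position = 0;

            // Assumption: Spire.PDF can load a document from a stream.
            var pdf = new PdfDocument();
            pdf.LoadFromStream(buffer);

            // Collect only the embedded files whose name ends in .xml.
            var xmlFiles = new List<object>();
            foreach (PdfAttachment attachment in pdf.Attachments)
            {
                if (!attachment.FileName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
                {
                    continue;
                }

                xmlFiles.Add(new
                {
                    // Strip any directory part from the embedded file name.
                    fileName = Path.GetFileName(attachment.FileName),
                    contentBase64 = Convert.ToBase64String(attachment.Data)
                });
            }

            pdf.Close();
            log.LogInformation("Found {Count} XML attachment(s).", xmlFiles.Count);

            // Serialized as a JSON array for the calling Logic App.
            return new OkObjectResult(xmlFiles);
        }
    }
}
```

In the Logic App, the function's JSON response could then be parsed and each `contentBase64` value decoded to get the XML content. Returning base64 rather than raw XML keeps the response safe for any file encoding.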
How to extract PDF attachments
I receive PDF files (electronic invoices) with XML files embedded in them, which I would like to extract and process in a Logic App. I could not find a way to extract the files embedded in the PDFs. Any ideas? Thanks.
-
ChaitanyaNaykodi-MSFT 23,031 Reputation points Microsoft Employee
2020-08-12T19:19:00.787+00:00
4 additional answers
-
Vishant Pandey 6 Reputation points
2021-07-20T04:55:24.947+00:00
1. Install the IronPdf NuGet package into your project.
2. Follow the link: reading-pdf-text
PdfDocument PDF = PdfDocument.FromFile(@"D:\demoSp.pdf"); // D:\demoSp.pdf is the full path of your input PDF file
FileContent.Text = PDF.ExtractAllText();
-
Ezreal95 1 Reputation point
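2021-01-12T09:48:26.33+00:00
You could try the Spire.PDF library to extract attachments from a PDF using C#.
using System.IO;
using Spire.Pdf;
using Spire.Pdf.Attachments;

// Load the PDF
PdfDocument pdf = new PdfDocument("Attachment1.pdf");
// Get the first attachment (assumes the PDF has at least one)
PdfAttachment attachment = pdf.Attachments[0];
// Write the attachment's bytes to a file named after the attachment
File.WriteAllBytes(attachment.FileName, attachment.Data);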