Hello, I have a searchable PDF and want to read specific fields from it, e.g. Name: Verona, City: Amsterdam.
How can I do this in C# without having to use an expensive library?
Thanks for the help!
If you want to extract text from a PDF, you can use iText 7 (the itext7 NuGet package).
A basic sample:
using System;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

// Open the PDF; disposing the document closes it along with its reader.
using (PdfDocument pdfDoc = new PdfDocument(new PdfReader("e:\\test.pdf")))
{
    // iText page numbers start at 1.
    for (int nPage = 1; nPage <= pdfDoc.GetNumberOfPages(); nPage++)
    {
        // Create a fresh strategy per page so text from earlier pages does not accumulate.
        ITextExtractionStrategy extractionStrategy = new SimpleTextExtractionStrategy();
        string sPageText = PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(nPage), extractionStrategy);
        Console.WriteLine("Page : {0}", nPage);
        Console.WriteLine(sPageText);
    }
}
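Once you have the page text, you still need to pick out the individual values such as Name and City. One possible approach is a small regular-expression helper applied to each sPageText. This is only a sketch: the class LabeledFieldReader and the method ReadField are hypothetical names, not part of iText, and it assumes each field appears in the extracted text as "Label: value" with one field per line. The actual layout depends on how your PDF was produced, so check the extracted text first and adjust the pattern.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of iText): pulls "Label: value" pairs out of extracted text.
public static class LabeledFieldReader
{
    // Returns the text that follows "label:" on the same line, or null if the label is absent.
    // Assumption: one field per line; a line holding several fields would be returned whole.
    public static string ReadField(string pageText, string label)
    {
        // (?<!\w) keeps "Name" from matching inside a longer word such as "Surname".
        // Regex.Escape protects against labels that contain regex metacharacters.
        string pattern = @"(?<!\w)" + Regex.Escape(label) + @"\s*:\s*(?<value>[^\r\n]*)";
        Match match = Regex.Match(pageText, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups["value"].Value.Trim() : null;
    }
}
```

Inside the page loop you would then call, for example, LabeledFieldReader.ReadField(sPageText, "Name") and LabeledFieldReader.ReadField(sPageText, "City"), and skip pages where the result is null.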