In Microsoft 365, there are two main ways to get text out of a PDF so it’s easier to navigate and work with.
- Convert the PDF to an editable Word document
This is usually the most straightforward way to get structured text (headings, paragraphs) from a PDF:
- Open Word (desktop or Microsoft365.com).
- Go to File > Open.
- Browse to and select the PDF file.
- When prompted that Word will make a copy and convert the PDF to an editable Word document, select OK.
- After conversion, the content will be in Word format. You can:
  - Use Navigation Pane (View > Navigation Pane) to move through headings.
  - Edit, search, and reorganize text as needed.
- If needed, save as:
  - Word document: File > Save As, choose Word Document (.docx).
  - Plain text: File > Save As, choose Plain Text (.txt).
Notes:
- This works best for PDFs that are mostly text. Complex layouts, scanned pages, and documents that look like photocopied manuscripts may not convert cleanly, and page and line breaks may differ from the original.
- Extract text from PDF using Power Automate for desktop
If you use Power Automate for desktop and want to extract text programmatically (for example, from a large PDF or many PDFs):
- In Power Automate for desktop, create or edit a desktop flow.
- Add the Extract text from PDF action.
- Configure:
  - PDF file: Path to the PDF.
  - Page(s) to extract: All, Single, or Range.
  - Password: Set this only if the PDF is password-protected.
  - Optimize for structured data: Optionally enable this to better preserve the formatting of structured content such as tables.
- Run the flow. The action outputs a variable, typically ExtractedPDFText, containing the extracted text.
- Use that text variable to write to a file, process further, or load into another system.
This approach is useful when automating extraction from large or multiple PDFs.
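If the extracted text needs tidying before you load it elsewhere, a small script can do the post-processing. The sketch below is only an illustration in Python, not part of Power Automate: it assumes the flow has already saved ExtractedPDFText to a UTF-8 text file (for example, with a Write text to file action), and the function names and file names are hypothetical.

```python
import re
from pathlib import Path


def clean_extracted_text(raw: str) -> str:
    """Tidy text extracted from a PDF: rejoin split words and unwrap lines."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split across a line break, e.g. "extrac-\ntion" -> "extraction".
    # Limitation: a real hyphenated compound broken at a line end loses its hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside each paragraph, collapse line breaks and repeated spaces to one space.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    # Drop empty paragraphs and separate the rest with a single blank line.
    return "\n\n".join(p for p in cleaned if p)


def clean_file(src: Path, dst: Path) -> None:
    """Read the text file written by the flow, clean it, and save the result."""
    raw = src.read_text(encoding="utf-8")  # assumption: the flow wrote UTF-8
    dst.write_text(clean_extracted_text(raw), encoding="utf-8")
```

For example, `clean_file(Path("extracted.txt"), Path("cleaned.txt"))` would read a hypothetical extracted.txt and write the unwrapped paragraphs to cleaned.txt. Whether unwrapping lines is desirable depends on the PDF: for tables or other structured content, you may want to keep the original line breaks.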