Converting PDF documents to Access database tables and files

Anonymous
2018-01-02T23:41:23+00:00

Hello,

I am wondering if I can get guidance on what software packages or steps are needed to convert documents into a format that can be imported into Access. The documents are currently PDF files containing text, pictures, tables, etc. (the source files were originally built in Microsoft Word), and the goal is essentially to make a database of these files so that they can be more easily maintained.

A long time ago I manually converted Word documents that had all been laid out in a hidden table directly into Access easily enough. Unfortunately, these documents were not made in a table format, and many of the original Word documents are gone; all I have to work with are the PDF files.

I am wondering if Microsoft has a solution or, if they do not, whether some third party can help. I tried reaching out to Adobe, but I could only find answers from them for PDF forms, not for mixed documents, which is what these are. I appreciate your help.

 LF


Locked question. This question was migrated from the Microsoft Support Community.


Answer accepted by question author

  1. Anonymous
    2018-01-03T02:07:51+00:00

    Hi Scott OCSD,

    We don’t have any software that can import PDF files with mixed content into an Access database.

    You can refer to "Introduction to importing, linking, and exporting data in Access" to check which file types can be imported into Access.

    Any other community members who have related experience are welcome to share their insights here.

    Regards,

    Virgil


2 additional answers

  1. Anonymous
    2018-01-03T08:31:11+00:00

    You could try opening some of the PDFs in Word and saving them as .docx files.

    Then open these and see if the content is saved in a reasonably structured way (tables, paragraphs) that you could easily retrieve from Access via automation, or otherwise clean up using VBA in Word.
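
To get a quick feel for whether Word's conversion kept any usable structure, one option is to inspect the saved .docx directly. The following is only a rough sketch, written in Python rather than VBA, and it assumes the converted file contains real Word tables (which a PDF conversion may or may not produce). A .docx file is a ZIP archive whose main content lives in `word/document.xml`; the function names `docx_tables` and `rows_to_csv` are made up for this illustration.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def cell_text(cell):
    """Return the text of one table cell, joining its paragraphs with spaces."""
    paragraphs = []
    for p in cell.iter(W + "p"):
        # A paragraph's text may be split across several runs (w:t elements).
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return " ".join(s for s in paragraphs if s)


def docx_tables(path_or_file):
    """Read every table in a .docx file as a list of rows of cell strings.

    Simplification: nested tables are not treated specially.
    """
    with zipfile.ZipFile(path_or_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    tables = []
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            rows.append([cell_text(tc) for tc in tr.findall(W + "tc")])
        tables.append(rows)
    return tables


def rows_to_csv(rows):
    """Serialize one table's rows as CSV text."""
    out = io.StringIO(newline="")
    csv.writer(out).writerows(rows)
    return out.getvalue()
```

If the tables come out reasonably clean, the CSV text can be written to a file and brought into Access with its text file import. If `docx_tables` returns an empty list for most documents, the conversion did not preserve tables and this route is unlikely to help.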

  2. Tom van Stiphout 40,201 Reputation points MVP Volunteer Moderator
    2018-01-03T03:03:39+00:00

    That is a very difficult problem and no tool exists that can automate this for you.

    One complication is the document format: the documents are probably not all EXACTLY alike structurally, so you will have to recognize which layout you're dealing with and handle it accordingly.

    Reading the files themselves is not the end of the world: you can use the Acrobat library that comes with the full version of Acrobat, or iTextSharp, a free library I have only used from .NET - I'm not sure if you have those skills.

    In the end this will probably require expert programming skills and lots of debugging. You had better REALLY, REALLY need to do this.
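
The point above about recognizing which document layout you are dealing with can be sketched as a small dispatcher. This is only an illustration in Python working on text that has already been extracted from a PDF by some other means. The two layouts ("labelled" and "numbered"), their regular expressions, and every function name here are hypothetical; real documents would need their own signatures and parsers.

```python
import re


def parse_labelled(text):
    """Parse lines of the form 'Label: value' into a field dictionary."""
    pattern = re.compile(r"^([A-Za-z ]+):[ \t]*(.+)$", re.M)
    return {m.group(1).strip(): m.group(2).strip() for m in pattern.finditer(text)}


def parse_numbered(text):
    """Parse lines of the form '1. Heading' into {number: heading}."""
    pattern = re.compile(r"^(\d+)\.[ \t]+(.+)$", re.M)
    return {m.group(1): m.group(2).strip() for m in pattern.finditer(text)}


# Each entry: (layout name, signature that identifies it, parser to apply).
# The order matters: the first matching signature wins.
LAYOUTS = [
    ("labelled", re.compile(r"^[A-Za-z ]+:[ \t]*\S", re.M), parse_labelled),
    ("numbered", re.compile(r"^\d+\.[ \t]+\S", re.M), parse_numbered),
]


def classify_and_parse(text):
    """Pick the first layout whose signature matches and run its parser.

    Unrecognized documents are returned as raw text so they can be
    reviewed by hand instead of being silently dropped.
    """
    for name, signature, parser in LAYOUTS:
        if signature.search(text):
            return name, parser(text)
    return "unknown", {"Raw": text}
```

Keeping an explicit "unknown" bucket is a deliberate choice in this sketch: with documents that are not all alike, it is safer to collect the ones no rule recognizes than to force them through the wrong parser.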
