Is there a way to extract text together with its existing hyperlinks from PDF files using Form Recognizer?

2023-05-17T14:52:41.9833333+00:00

I have some PDF files in which some words or sentences contain hyperlinks. Is there a way to capture both the text and the hyperlinks?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 15,861 Reputation points
    2023-05-17T23:34:56.0766667+00:00

    Hi @ADM256974 (Luis Roberto Sant Anna Henriques), thanks for using the Microsoft Q&A platform.

    As far as I understand, Form Recognizer does not specifically extract hyperlinks. It focuses on extracting key-value pairs, tables, and text from documents using custom or prebuilt models.

    The prebuilt business card model does support extracting a websites field. Do you have a specific scenario similar to that? If possible, please share more details with us.

    You can also try a custom model in Form Recognizer: label the hyperlinks in your documents and train the model on them.

    I hope this helps.

    Regards,
    Vasavi
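
Since the answer notes that the service does extract text, one partial workaround is to scan that text for URLs that are printed visibly in the document. The sketch below is a minimal illustration, assuming you already have the extracted text as a plain string (for example, the full text content of an analysis result). The names `URL_PATTERN` and `find_visible_urls` are hypothetical, and the regular expression is deliberately simple rather than a full URL grammar. It will not recover a link target hidden behind anchor text such as "click here".

```python
import re
from typing import List

# Assumption: a simple pattern for http(s) URLs and bare "www." hosts.
# It stops at whitespace and a few closing delimiters; it is not a full URL grammar.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"')\]]+", re.IGNORECASE)


def find_visible_urls(extracted_text: str) -> List[str]:
    """Return URLs that appear literally in the text, in order, without duplicates."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(extracted_text):
        # Drop sentence punctuation that the pattern may have swallowed.
        url = match.group(0).rstrip(".,;:!?")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical sample standing in for text extracted from a PDF.
sample_text = "Visit https://learn.microsoft.com for docs, or www.example.com. Contact us here."
print(find_visible_urls(sample_text))
# prints ['https://learn.microsoft.com', 'www.example.com']
```

Note that "Contact us here" yields nothing: if the link exists only as a clickable annotation in the PDF, it never appears in the extracted text, which is the limitation described in the answer.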