How to send a local file to the REST API of AZURE DOCUMENT INTELLIGENCE

Question

I need some help. I went to the documentation for the REST API and used Python to get a JSON response, and I got the data back.

The issue is that I would like to use PDF files from my own local storage with Document Intelligence.

Basically, I want to pass it a PDF from my C drive and have the service analyze that data. Otherwise I would have to use another API, such as Google Drive or some other cloud service, to pull in the files and then pass their URLs as parameters.

Let's say I have 3 PDF files in my file explorer and those are the only three I would like to pass. It would not let me, because urlSource wants a URL, of course.

curl -i -X POST "%FR_ENDPOINT%formrecognizer/documentModels/:analyze?api-version=2023-07-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: %FR_KEY%" --data-ascii "{'urlSource': ''}"

Above is the curl example. If you notice, at the end it wants a urlSource. Is there any way to give it a local PDF instead?


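import requests

# Same headers as in the curl example above; the key value is a placeholder
headers = {
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': 'your key goes here',
}
data = "{'urlSource': 'url link to pdf file (github in this case)'}"
params = {
    'api-version': '2023-07-31',
}
response = requests.post(
    'url link to end point goes here',
    params=params,
    headers=headers,
    data=data,
)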

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/use-sdk-rest-api?view=doc-intel-4.0.0&tabs=windows&pivots=programming-language-rest-api

Accepted Answer

@KEVIN S Welcome to Microsoft Q&A Forum, Thank you for posting your query here!

The Document Models - Analyze Document REST API reference lists the fields allowed in the request body, which include urlSource and base64Source.

More Info here.

If you don't want to pass urlSource, you can use the base64Source attribute in the request body instead.

This can be used to pass the base64-encoded content of a PDF file on your C:\ drive. You can follow the approach below:

  
import base64
import requests
import json

# Read the PDF file in binary mode, encode it to base64, and decode to string.
# The raw string (r"...") keeps backslashes such as \t and \f in the Windows path literal.
with open(r"C:\path\to\your\file.pdf", "rb") as file:
    base64_encoded_pdf = base64.b64encode(file.read()).decode()

# Prepare the API request body
data = {
    "base64Source": base64_encoded_pdf
}

# Prepare the API request headers
headers = {
    "Content-Type": "application/json",
    "Ocp-Apim-Subscription-Key": ""
}

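# Send the API request. Replace the {endpoint} and {modelId} placeholders with your
# resource endpoint and model ID (for example, prebuilt-layout). The optional query
# parameters (pages, locale, stringIndexType, features) are left out here, because
# unfilled placeholders would be sent to the service literally.
response = requests.post(
    "{endpoint}/formrecognizer/documentModels/{modelId}:analyze?api-version=2023-07-31",
    headers=headers,
    data=json.dumps(data),
)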

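# The analyze call is asynchronous: an accepted request returns HTTP 202, and the
# Operation-Location response header holds the URL to poll for the result, so
# response.json() may fail here if the body is empty.
print(response.status_code)
print(response.headers.get("Operation-Location"))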

Please note that I haven't tested the above sample on my end. Please test it on your side and check whether it works.
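Because the analyze request is asynchronous, the extracted data comes from a follow-up GET on the URL returned in the Operation-Location header. Below is a minimal, untested sketch of that polling step. The helper name wait_for_result and its get, interval and max_attempts parameters are assumptions made for illustration. The status values and the analyzeResult field follow the 2023-07-31 REST reference as I understand it, so please verify them against the documentation.

```python
import time


def wait_for_result(operation_url, key, get=None, interval=2.0, max_attempts=30):
    """Poll the Operation-Location URL until the analysis finishes.

    `get` defaults to requests.get; it is a parameter only so that a
    stand-in can be supplied when trying the helper without a network.
    """
    if get is None:
        import requests  # imported lazily so a stand-in `get` needs no install
        get = requests.get

    headers = {"Ocp-Apim-Subscription-Key": key}
    for _ in range(max_attempts):
        body = get(operation_url, headers=headers).json()
        status = body.get("status")
        if status == "succeeded":
            return body["analyzeResult"]
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {body.get('error')}")
        # Any other status (e.g. "notStarted" or "running"): wait and retry
        time.sleep(interval)
    raise TimeoutError("Analysis did not finish within the allowed attempts")
```

With the sample above, it could be called as wait_for_result(response.headers["Operation-Location"], "your key goes here").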

Please remember that base64 encoding makes the payload about one third larger than the original file, so the encoded string can be quite large for big PDF files, and there might be a limit on the size of the request body that the API can handle.
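To catch oversized payloads before sending, the body can be built and measured locally first. This is only a sketch: build_base64_body and max_body_bytes are hypothetical names, and the actual limit for your resource is something you would need to look up and pass in yourself.

```python
import base64
import json
from pathlib import Path


def build_base64_body(pdf_path, max_body_bytes):
    """Return the JSON request body for a local file, or raise if it is too large.

    max_body_bytes is supplied by the caller; this sketch does not assume
    any particular service limit.
    """
    encoded = base64.b64encode(Path(pdf_path).read_bytes()).decode("ascii")
    body = json.dumps({"base64Source": encoded})
    # json.dumps emits ASCII by default, so len(body) equals the size in bytes
    if len(body) > max_body_bytes:
        raise ValueError(
            f"{pdf_path}: request body is {len(body)} bytes, "
            f"over the {max_body_bytes}-byte limit"
        )
    return body
```

For the three local PDFs mentioned in the question, the helper could be called once per file in a loop, and each returned string passed as data= to requests.post.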

Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
