Hello, I'm trying to run the Form Recognizer 3.0 prebuilt model on an invoice PDF stored in Blob Storage. The file name in the container is shown after the error message below.
I need to pass the file's URL to the Form Recognizer API in the body of the REST API call, but the call fails with this error message:
{
"errorCode": "2108",
"message": "{\"error\":{\"code\":\"InvalidRequest\",\"message\":\"Invalid request.\",\"innererror\":{\"code\":\"InvalidContent\",\"message\":\"The file is corrupted or format is unsupported. Refer to documentation for the list of supported formats.\"}}}",
"failureType": "UserError",
"target": "Form Recognizer - POST",
"details": []
}
Azure Portal: E PF Pas (Cleaning Services) Pty Ltd 4844 tkt 1115212 JAM.pdf
Downloaded from blob store via Azure portal: E%20PF%20Pas%20(Cleaning%20Services)%20Pty%20Ltd%204844%20tkt%201115212%20JAM.pdf
I tried the file paths below, but all of them fail with the error message above. When I rename the file so that it contains no spaces or non-alphanumeric characters, it works fine.
Does anyone know how to correctly format the above file name so it can be passed to Form Recognizer? I'm calling Form Recognizer from Azure Data Factory.
{"urlSource": "https://\
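For reference, the two file names above differ only in percent-encoding. Here is a minimal Python sketch of that mapping using the standard library's `urllib.parse.quote`. The variable names are illustrative, and nothing in this thread establishes that either encoded form is accepted by Form Recognizer.

```python
from urllib.parse import quote, unquote

# File name as displayed in the Azure portal (copied from the question).
portal_name = "E PF Pas (Cleaning Services) Pty Ltd 4844 tkt 1115212 JAM.pdf"

# Keep parentheses literal, matching the downloaded file name; spaces become %20.
encoded_keep_parens = quote(portal_name, safe="()")

# With the default safe set, parentheses are also encoded (as %28 and %29).
encoded_strict = quote(portal_name)

print(encoded_keep_parens)
print(encoded_strict)

# Decoding either form gives back the portal name.
assert unquote(encoded_keep_parens) == portal_name
assert unquote(encoded_strict) == portal_name
```

The first form reproduces the downloaded name from the question. The second is a stricter variant that may be worth trying, though that is only a guess.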
I currently have this issue. Have you determined a solution @Sachin S ?
Has anyone else determined a solution?
Thank you
No. I had to remove spaces from the file name before invoking the form recognizer API.
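A minimal sketch of that renaming workaround in Python, assuming it is acceptable to drop every character other than ASCII letters, digits, dots, hyphens, and underscores. The helper name `sanitize_blob_name` is hypothetical, and the actual rename or copy of the blob is not shown.

```python
import re


def sanitize_blob_name(name: str) -> str:
    """Drop every character that is not an ASCII letter, digit, dot, hyphen, or underscore."""
    return re.sub(r"[^A-Za-z0-9._-]", "", name)


original = "E PF Pas (Cleaning Services) Pty Ltd 4844 tkt 1115212 JAM.pdf"
cleaned = sanitize_blob_name(original)
print(cleaned)
```

Two different original names can collapse to the same cleaned name, so a real pipeline would likely need a collision check before renaming.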
Hi Sachin,
I had a similar issue. According to the Azure documentation, this error can occur when the URL is invalid or inaccessible. In my case it was inaccessible, and I had to change the storage container's access level from "Private (no anonymous access)" to "anonymous read access". Refer to containeraccess.jpg
Thanks,
Anand.
I encountered a similar problem, but mine had an unusual twist. While some PDFs functioned correctly from a storage account container, others did not.
To resolve the problem, I found a workaround by placing the seemingly "corrupt" PDFs into the same storage account container connected to my custom model training. Everything appears to be functioning smoothly now.
I attempted renaming and adjusting public access settings, but neither of these solutions proved effective for me.
To clarify: put the broken PDFs in the same container as the ".pdf.labels.json" and ".pdf.ocr.json" files. These files are created when you train a custom model.