Not getting text from PPTX (with just images on slides)

Kirk Marple 25 Reputation points
2024-03-25T19:06:05.9433333+00:00

I've found a PPTX from which neither the Read nor the Layout model can extract any text.
(Using the latest v4 models with 2024-02-29-preview.) I can confirm this behavior even in the Studio app.

This should be pretty basic OCR, but for some reason it doesn't give any text back.

Source file:

[http://view.officeapps.live.com/op/view.aspx?src=https://c.s-microsoft.com/en-us/CMSFiles/SlidesFY24Q2.pptx?version=54533b17-bad3-2205-929f-0e19422e921f]

From this site:
[https://www.microsoft.com/en-us/investor/earnings/fy-2024-q2/press-release-webcast](https://www.microsoft.com/en-us/investor/earnings/fy-2024-q2/press-release-webcast)

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. VasaviLankipalle-MSFT 14,181 Reputation points
    2024-03-27T22:43:02.38+00:00

    Hello @Kirk Marple, thank you again for your time and patience throughout this issue.

    I tried to reproduce the issue with my own PowerPoint (.pptx) file, and it worked well on my end. The problem with your file is that Document Intelligence no longer supports embedded images in Office/HTML files, and the text on your slides is contained in images. If you convert the same document to PDF, you will get results, since that is supported.

    Versions 2024-02-29-preview, 2023-10-31-preview, and later support Microsoft Office (DOCX, XLSX, PPTX) and HTML files.


    Sorry for the inconvenience. The product team is already working on the documentation fix.

    I hope this helps.

    Regards,

    Vasavi

    - Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks.

    1 person found this answer helpful.
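
Following the accepted answer, it may help to check a PPTX locally before sending it to the service, so that image-only decks can be converted to PDF first. The sketch below is not part of the Document Intelligence SDK; it only uses the Python standard library to open the .pptx as a ZIP archive and count text runs (`a:t`) and pictures (`p:pic`) per slide. The function names `summarize_pptx` and `needs_pdf_conversion` are hypothetical, and the rule "a slide has pictures but no text runs" is an assumed heuristic, not a documented service behavior.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard Office Open XML namespaces used inside ppt/slides/slideN.xml.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def summarize_pptx(source):
    """Count non-empty text runs and pictures on each slide of a .pptx.

    `source` is a file path or a binary file-like object.
    Returns {slide_part_name: {"text_runs": int, "pictures": int}}.
    """
    summary = {}
    with zipfile.ZipFile(source) as archive:
        for name in sorted(archive.namelist()):
            # Slide parts live at ppt/slides/slideN.xml; skip rels and other parts.
            if not (name.startswith("ppt/slides/slide") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            text_runs = sum(
                1 for t in root.iter(f"{{{A_NS}}}t") if (t.text or "").strip()
            )
            pictures = sum(1 for _ in root.iter(f"{{{P_NS}}}pic"))
            summary[name] = {"text_runs": text_runs, "pictures": pictures}
    return summary


def needs_pdf_conversion(source):
    """Assumed heuristic: True if any slide has pictures but no text runs."""
    return any(
        s["pictures"] > 0 and s["text_runs"] == 0
        for s in summarize_pptx(source).values()
    )
```

For example, after downloading the deck from the question to a local file, `needs_pdf_conversion("SlidesFY24Q2.pptx")` would indicate whether converting to PDF before analysis is worth trying. The check cannot tell whether a picture actually contains text, so it is only a hint.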
