Hi there Choudhary, Mahika
Thanks for using the Q&A platform.
I don't think Search indexers natively support passing the entire content of a PPTX or PDF document through a single input field to a custom skillset. By default, the indexer extracts text and images separately, with text stored under /document/content and images under /document/normalized_images. Tables and graphs are not extracted as structured data, so a direct one-field input is not feasible.
You could try modifying the custom skill to accept multiple inputs, such as both the text content and the images, so that your Python code can merge them. Another option is to preprocess the documents before indexing, using Azure Functions to convert the entire file into a Base64 string, which can then be passed as a single field to the custom skillset.
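To make both options a bit more concrete, here is a rough Python sketch, not a tested Azure Functions implementation. It assumes the skill is wired up with two inputs that I have named `text` (mapped to /document/content) and `images` (mapped to /document/normalized_images/*), and that each image arrives as an object with its Base64 payload under a `data` key. The input names, the `merged` output name, and the function names are all made up for illustration, so adjust them to your skillset definition. The request/response envelope (`values`, `recordId`, `data`, `errors`, `warnings`) follows the custom Web API skill contract.

```python
import base64
import json


def merge_record(data):
    """Combine the text and image inputs of one document into a single payload."""
    text = data.get("text") or ""
    images = data.get("images") or []
    # Assumption: each image is an object carrying its Base64 payload under "data".
    payloads = [img.get("data", "") for img in images if isinstance(img, dict)]
    return {"text": text, "image_count": len(payloads), "images": payloads}


def process_skill_request(body):
    """Build a custom-skill response body from a custom-skill request body."""
    results = []
    for record in body.get("values", []):
        merged = merge_record(record.get("data", {}))
        results.append({
            "recordId": record["recordId"],
            # "merged" is a hypothetical output name; it holds one JSON string.
            "data": {"merged": json.dumps(merged)},
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


def file_to_base64(raw_bytes):
    """Second option: encode a whole file as one Base64 string before indexing."""
    return base64.b64encode(raw_bytes).decode("ascii")
```

In an Azure Function you would parse the HTTP request body as JSON, pass it to `process_skill_request`, and return the result as the JSON response. Keep in mind that a Base64 string of a whole PPTX or PDF can get large, so it is worth checking that it fits your field and payload limits before relying on the second option.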
If this helps, kindly accept the answer. Thanks much.