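@Deepak Dange To include the source page information within the index schema, you can use the metadata_storage_content_type field. When documents are ingested through a blob indexer, this field is automatically populated with the content type (MIME type, such as application/pdf) of the document being indexed. If you push documents to the index yourself, it is an ordinary string field, so you can also use it to store additional metadata about the document, such as the source page information.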
Here's an example of how you can add the metadata_storage_content_type field to your index schema:
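
    # Field helpers live in the .models submodule of azure-search-documents
    from azure.search.documents.indexes.models import (
        SearchableField,
        SearchFieldDataType,
    )

    fields = [
        # Document key: must be a string and unique per document
        SearchableField(name="id", type=SearchFieldDataType.String, key=True),
        # Full-text searchable body of the document (or page)
        SearchableField(name="content", type=SearchFieldDataType.String),
        # String field that will carry the source page information
        SearchableField(name="metadata_storage_content_type", type=SearchFieldDataType.String),
    ]
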
In this example, we've added the metadata_storage_content_type field to the index schema. You can then populate this field with the source page information when you upload your documents to the index.
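As a rough sketch of that upload step, the snippet below builds one document per page and stores a page reference in metadata_storage_content_type. The helper names (make_key, build_documents, upload), the "file#page=N" value format, and the one-document-per-page layout are all assumptions made for illustration, not something the service requires. Only SearchClient, AzureKeyCredential, and upload_documents come from the Azure SDK.

```python
import re


def make_key(source_name, page_number):
    """Build a document key (hypothetical naming scheme).

    Azure AI Search keys may only contain letters, digits, '_', '-' and '=',
    so every other character in the file name is replaced with '-'.
    """
    safe = re.sub(r"[^A-Za-z0-9_\-=]", "-", source_name)
    return f"{safe}-page-{page_number}"


def build_documents(source_name, pages):
    """Return one index document per page; `pages` is a list of page texts."""
    return [
        {
            "id": make_key(source_name, number),
            "content": text,
            # Assumed format for the source page information
            "metadata_storage_content_type": f"{source_name}#page={number}",
        }
        for number, text in enumerate(pages, start=1)
    ]


def upload(documents, endpoint, index_name, api_key):
    """Push the documents to an existing index (needs azure-search-documents)."""
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(
        endpoint=endpoint,
        index_name=index_name,
        credential=AzureKeyCredential(api_key),
    )
    return client.upload_documents(documents=documents)


docs = build_documents("report.pdf", ["Text of page one", "Text of page two"])
```

Calling upload(docs, endpoint, index_name, api_key) with your own service endpoint, index name, and admin key would then send these documents to the index, and each search hit would carry its page reference in that field.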
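Please note that the metadata_storage_content_type field is only populated automatically when the documents are ingested by an indexer from blob storage, for example PDFs, DOCX files, and PPTX files stored as blobs. If you're indexing content from other sources, or pushing documents to the index yourself, you need to set the value yourself or use a different approach to store the source page information.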