How to reduce the influence of image text on the semantic vectors when using the AI Vision Image Retrieval endpoint
Hiob Gebisso
Hello,
We are currently testing the Vision Image Retrieval API on book covers and noticed that the model is heavily influenced by text on the image (author, title, subtitle). Is there a more straightforward way to reduce the influence of text on the model output than preprocessing each image with OCR to remove the text before calling the retrieval API?
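To make the workaround concrete, here is a minimal sketch of the masking step we would like to avoid. It assumes the text bounding boxes have already been obtained from some OCR step (not shown), and it treats the cover as a plain grayscale pixel grid. The name `mask_text_regions` and the box format are hypothetical and are not part of the Vision API.

```python
from typing import List, Tuple

# Assumed box format: (left, top, right, bottom), with right/bottom exclusive.
Box = Tuple[int, int, int, int]
# Assumed image format: grayscale pixel values in row-major order.
Image = List[List[int]]


def mask_text_regions(image: Image, boxes: List[Box], fill: int = 128) -> Image:
    """Return a copy of `image` with every box overwritten by `fill`."""
    height = len(image)
    width = len(image[0]) if height else 0
    masked = [row[:] for row in image]  # copy rows so the input stays untouched
    for left, top, right, bottom in boxes:
        # Clamp each box to the image bounds before painting over it.
        x0, x1 = max(0, left), min(width, right)
        y0, y1 = max(0, top), min(height, bottom)
        for y in range(y0, y1):
            for x in range(x0, x1):
                masked[y][x] = fill
    return masked


# Hypothetical 4x3 cover with one detected text box covering the top middle.
cover = [[0] * 4 for _ in range(3)]
masked_cover = mask_text_regions(cover, [(1, 0, 3, 2)], fill=9)
```

The masked image would then be sent to the retrieval endpoint instead of the original. A flat fill is only the simplest choice; inpainting would probably leave fewer visible artifacts, but it adds yet another preprocessing step.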
Best,
Hiob