I need to store a bunch of documents (anywhere from a few hundred to around 10,000) in Azure. I then have a search indexer that cracks these documents and updates an index. By documents I just mean a collection of (complex) JSON objects.
Every hour a service will run and update any potentially changed objects, and the indexer will run again. Azure of course has a few storage options to support this, such as Blob Storage or Cosmos DB. After doing a lot of research, I haven't been able to work out which storage type is best for this scenario. My criteria are as follows:
- The storage type needs to support complex JSON (i.e. with nested properties).
- Each document has a (relatively) "large" description field, which is what my index search ultimately runs against; the rest of the JSON is just metadata.
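To make the shape concrete, here is roughly what one of these documents looks like (field names and values are illustrative, not taken from my real data):

```json
{
  "id": "doc-0042",
  "category": "report",
  "tags": ["finance", "q3"],
  "source": {
    "system": "upstream-crm",
    "lastModified": "2024-06-01T10:00:00Z"
  },
  "description": "A long free-text body, potentially several kilobytes, that the search index needs to full-text search over."
}
```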
From what I gather, the "easiest" solution would be to use something like Cosmos DB with the MongoDB API, as it natively supports complex JSON. I'm a bit worried, though, that I'll run into issues with my description field being too large. That leads me to Blob Storage, but I'm not sure how well it will work with this hourly syncing; I reckon Blob Storage is intended more for read-only uploads than for "dynamic" data like this.
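For context, either way the hourly service needs a cheap way to find the "potentially changed objects" before re-uploading them. One approach I've been considering (just a sketch, not tied to either storage option; the function names are mine) is to keep a hash per document from the previous run and only upsert documents whose hash changed:

```python
import hashlib
import json

def content_hash(doc: dict) -> str:
    """Stable hash of a JSON document, used to detect changes between runs."""
    # sort_keys makes the serialization canonical, so the same content
    # always produces the same hash regardless of key order.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def changed_ids(previous_hashes: dict, documents: list) -> list:
    """Return ids of documents that are new or changed since the last run."""
    return [
        d["id"] for d in documents
        if previous_hashes.get(d["id"]) != content_hash(d)
    ]

docs = [
    {"id": "a", "description": "unchanged text"},
    {"id": "b", "description": "edited text"},
]
# Pretend "a" was seen last run and "b" has gone stale.
prev = {"a": content_hash(docs[0]), "b": "stale-hash"}
print(changed_ids(prev, docs))  # → ['b']
```

The idea being that only the documents returned by `changed_ids` get written back to storage, so the indexer's change tracking only picks up real updates.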
Hopefully someone here has an idea of what I could do, or where I could read more about such a relatively specific scenario!
Finally, I should add that the reason I don't just push the data directly into the search index is that I need to do some enrichments in an indexer.