How to retrieve a set of documents from Cosmos DB given a set of IDs

Ankit Kumar 91 Reputation points
2020-10-16T13:49:22.157+00:00

I want to retrieve items from my Cosmos DB collection, and I have a list of Ids for the items I need. All of the code runs in an Azure Function; the code below builds that list of Ids in List<string> filteredResult. What is the best way to complete this code so that it retrieves the matching items from the collection, about 30-40 Ids at a time?

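public static void Run([ServiceBusTrigger("testSB", "SubscriberName", Connection = "AzureServiceBusString")] string mySbMsg,
    [CosmosDB(
        databaseName: "DBName",
        collectionName: "CollectionName",
        ConnectionStringSetting = "CosmosDBConnection")] DocumentClient client,
    ILogger log)
{
    try
    {
        log.LogInformation($"C# ServiceBus topic trigger function processed message: {mySbMsg}");

        // Ignore JSON properties that have no matching member on MyItem.
        var jsonSerializerSettings = new JsonSerializerSettings();
        jsonSerializerSettings.MissingMemberHandling = MissingMemberHandling.Ignore;
        List<MyItem> lists = JsonConvert.DeserializeObject<List<MyItem>>(mySbMsg, jsonSerializerSettings);

        // Collect the Ids of all items whose DocType is "TEST".
        List<string> filteredResult = (from s in lists
                                       where s.DocType == "TEST"
                                       select s.Id).ToList();

        // The code that retrieves the documents for filteredResult should go here.
    }
    catch (Exception ex)
    {
        // Added so the snippet compiles: log the failure and rethrow.
        log.LogError(ex, "Failed to process the Service Bus message.");
        throw;
    }
}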
Azure Cosmos DB

Accepted answer
  1. Anurag Sharma 17,571 Reputation points
    2020-10-20T08:54:05.673+00:00

    Hi @Ankit Kumar, welcome to the Microsoft Q&A forum. Apologies for the delayed response.

    If we understand correctly, you want to pass a list of strings (the Ids) and get the matching documents from Azure Cosmos DB. There are several ways to do this; one option is the code below. You can use it once 'filteredResult' has been created: just replace the 'input' list with 'filteredResult'.

     // Ids to look up; in the question's function, use 'filteredResult' instead.
     List<string> input = new List<string> { "1", "2", "3" };

     // Allow the query to fan out across partitions.
     var option = new FeedOptions { EnableCrossPartitionQuery = true };

     // Builds: SELECT * FROM books where books.id IN ('1','2','3')
     string sql = "SELECT * FROM books where books.id IN " + "('" + string.Join("','", input) + "')";

     IQueryable<Family> queryable = client.CreateDocumentQuery<Family>(
         UriFactory.CreateDocumentCollectionUri("families", "items"), sql, option);
     List<Family> posts = queryable.ToList();
     Console.WriteLine("Read count = {0}", posts.Count);
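
The query above concatenates the Ids directly into the SQL text, which breaks if an Id contains a single quote. As a hedged alternative, the sketch below builds a parameterized IN clause instead. IdQueryBuilder, Build, the @id0, @id1, ... parameter names, and the alias c are all hypothetical names chosen for illustration; the helper uses only the base class library so it can be tried on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdQueryBuilder
{
    // Hypothetical helper: returns the text
    // "SELECT * FROM c WHERE c.id IN (@id0, @id1, ...)"
    // together with one (parameter name, id value) pair per Id.
    public static (string QueryText, List<KeyValuePair<string, string>> Parameters) Build(
        IReadOnlyList<string> ids)
    {
        if (ids == null || ids.Count == 0)
            throw new ArgumentException("At least one id is required.", nameof(ids));

        // Pair every Id with a positional parameter name.
        var parameters = ids
            .Select((id, i) => new KeyValuePair<string, string>("@id" + i, id))
            .ToList();

        // Only the parameter names appear in the query text, never the values.
        string placeholders = string.Join(", ", parameters.Select(p => p.Key));
        return ("SELECT * FROM c WHERE c.id IN (" + placeholders + ")", parameters);
    }
}
```

Assuming the same Microsoft.Azure.Documents SDK as the answer, each pair could then be added to a SqlParameterCollection as a SqlParameter, and the query text and collection passed to CreateDocumentQuery wrapped in a SqlQuerySpec in place of the raw SQL string.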
    

    Note that I also created a model class for the document properties:

    public class Family
        {
            // Cosmos DB stores the document id as a string.
            public string id;
            public string city;
        }
    

    Please let us know if this helps; otherwise we can discuss it further.

    ----------

    If this answer helps, please select 'Accept Answer', as it could help other community members looking for similar issues.


1 additional answer

  1. Ritesh kumar 1 Reputation point
    2021-11-10T05:13:40.14+00:00

    I was wondering: if I have a list of, say, 500,000 Ids (fetched from a parallel indexed persistence layer such as SQL or a warehouse), which of the following would be faster and more optimised?

    1. Parallel single-document requests by Id, issued in batches of 5K (500,000 requests in total, RU optimised)
    2. Queries of the form "SELECT * FROM books where books.id IN " + "('" + string.Join( "','", input) + "')",option), each for a batch of 5K Ids (500 requests in total, each with a long query text)

    And then iterating over the batches to fetch the docs for all 500,000 Ids, i.e. 500 batches
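
The batching step that both options rely on can be sketched as follows. This is a minimal illustration, not a recommendation of either option: IdBatcher and Split are hypothetical names, and the batch size is left as a parameter. Each batch it yields could be used either to issue parallel single-document reads or to build one IN query.

```csharp
using System;
using System.Collections.Generic;

public static class IdBatcher
{
    // Hypothetical helper: lazily splits a list of Ids into consecutive
    // batches of at most batchSize items, preserving the original order.
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<List<string>> Split(IReadOnlyList<string> ids, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < ids.Count; start += batchSize)
        {
            // The last batch may be shorter than batchSize.
            int count = Math.Min(batchSize, ids.Count - start);
            var batch = new List<string>(count);
            for (int i = 0; i < count; i++)
                batch.Add(ids[start + i]);
            yield return batch;
        }
    }
}
```

Because batches are produced lazily, only one batch needs to be materialised at a time, which may matter for an Id list of this size.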
