CosmosSearch says "search must be first stage in pipeline" -- But it already is

William Gaty
2025-06-20T15:31:50.0033333+00:00

I'm building a Python API with FastAPI, with data stored in Azure Cosmos DB for MongoDB (vCore). One of the endpoints runs a vector search on the database, using an aggregation pipeline with the stages "$search" and "$project". The application also tracks user actions during a single "session": the session's entry in the "active sessions" collection is created the first time a user submits a query and updated with every subsequent query. Sessions that have been inactive for more than a set amount of time are removed from "active sessions" and added to an "archived sessions" collection. This archiving is handled in the FastAPI application's lifespan handler. I'm encountering two problems:

  1. If the global variables for the MongoDB client, database, and necessary collections are declared and initialized at the top level of the application (e.g., "COLLECTION = client['collection']"), then most sessions end up with duplicate records in the "archived sessions" collection. I assume this is related to the two worker processes running the application. In this setup, however, the search function works fine.
  2. If the global variables for the MongoDB client, database, and necessary collections are declared and initialized to None at the top level of the application and then set to the relevant connections in the lifespan handler (e.g., "COLLECTION = None" at the top level, and then "global COLLECTION" and "COLLECTION = client['collection']" inside the lifespan handler), then the search fails with an error saying "$search must be the first stage in the pipeline". As mentioned, the search function works in the other scenario, and no code has changed aside from the setup of the collection variables. As far as I've been able to determine through debugging, $search is still the first stage of the pipeline.

I'm pretty lost here. The solution to problem #1 is pretty simple, but it causes problem #2. I really have no idea why problem #2 is even happening, let alone how to fix it. The relevant collection variable is still pointed at the correct collection (and is not None), but something is still amiss.
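
The actual pipeline is not shown in this post. As an illustration only, a two-stage pipeline of the kind described, following the documented `cosmosSearch` shape for Azure Cosmos DB for MongoDB (vCore), might look like the sketch below. The field name "embedding", the value of `k`, and both helper functions are assumptions made for the example.

```python
def build_pipeline(query_vector, k=5):
    """Build a two-stage vector search pipeline: $search, then $project."""
    return [
        {
            "$search": {
                "cosmosSearch": {
                    "vector": query_vector,
                    "path": "embedding",  # assumed name of the vector field
                    "k": k,
                },
                "returnStoredSource": True,
            }
        },
        {
            "$project": {
                "similarityScore": {"$meta": "searchScore"},
                "document": "$$ROOT",
            }
        },
    ]


def first_stage(pipeline):
    """Return the operator name of the first pipeline stage."""
    return next(iter(pipeline[0]))


pipeline = build_pipeline([0.1, 0.2, 0.3])
print(first_stage(pipeline))  # prints "$search"
```

Logging `first_stage(pipeline)` and `repr(COLLECTION)` immediately before the `aggregate` call in both setups is one way to compare exactly what each configuration sends to the server.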

Thank you!

Azure Cosmos DB