With Cosmos db and Postgres would it be efficient to partition on column?

Klaus Nørregaard 6 Reputation points


This is concerning: Azure Cosmos DB for PostgreSQL

I have this table structure. It is a record over alle events (inserts, deletes and updates) on my server. The catalog column indicates from where the events originates from. There will be 100.000.000 rows of hourdata and 100.000 rows of pnt. Also some other tables will appers.

Please give guidance on partitioning. I would like to query based on the catalog value. I would like to list pnt rows etc.

I am looking for a solution where I only need one table to hold all the events on the database and still an efficient way to query the data.

User's image

Azure Cosmos DB
Azure Cosmos DB
An Azure NoSQL database service for app development.
1,469 questions
Azure Database for PostgreSQL
{count} votes

1 answer

Sort by: Most helpful
  1. GeethaThatipatri-MSFT 27,892 Reputation points Microsoft Employee

    @Klaus Nørregaard I am sharing a couple of resources that will help with data modeling such as workloads.

    1. https://learn.microsoft.com/en-us/events/azure-cosmos-db-liftoff/how-to-do-data-modeling-for-distributed-postgres-in-azure-cosmos-db
    2. https://learn.microsoft.com/en-us/azure/cosmos-db/postgresql/quickstart-build-scalable-apps-model-high-throughput
    3. https://docs.citusdata.com/en/v11.2/use_cases/timeseries.html

    At a high-level, the use case seems to be in the time series space and Azure Cosmos DB for PostgreSQL is a good fit for such use-cases. You typically distribute tables on a column that adds natural dimension to data (for ex: device_id in IoT workloads) and create time based partitions (ex: weekly, monthly etc). you can leverage JSONB column to store and query semi-structured and unstructured data and you seems to be already doing this. I hope this information helps, please post if you have any additional questions. Regards Geetha