Edit

Share via


Determine table and relation size in Azure Cosmos DB for PostgreSQL

Important

Azure Cosmos DB for PostgreSQL is no longer supported for new projects. Don't use this service for new projects. Instead, use one of these two services:

The usual way to find table sizes in PostgreSQL, pg_total_relation_size, drastically under-reports the size of distributed tables on Azure Cosmos DB for PostgreSQL. All this function does on a cluster is to reveal the size of tables on the coordinator node. In reality, the data in distributed tables lives on the worker nodes (in shards), not on the coordinator. A true measure of distributed table size is obtained as a sum of shard sizes. Azure Cosmos DB for PostgreSQL provides helper functions to query this information.

Function Returns
citus_relation_size(relation_name)
  • Size of actual data in table (the "main fork").
  • A relation can be the name of a table or an index.
citus_table_size(relation_name)
citus_total_relation_size(relation_name)
  • citus_table_size plus:

    • size of indices

These functions are analogous to three of the standard PostgreSQL object size functions, except if they can't connect to a node, they error out.

Example

Here's how to list the sizes of all distributed tables:

SELECT logicalrelid AS name,
       pg_size_pretty(citus_table_size(logicalrelid)) AS size
  FROM pg_dist_partition;

Output:

┌───────────────┬───────┐
│     name      │ size  │
├───────────────┼───────┤
│ github_users  │ 39 MB │
│ github_events │ 37 MB │
└───────────────┴───────┘

Next steps