Determine record uniqueness

This article provides information about the rules used to determine record uniqueness in Microsoft Sustainability Manager in Microsoft Cloud for Sustainability. Sustainability Manager provides two ways to determine record uniqueness:

  • Use the Origin Correlation ID (OCID)
  • Generate a primary key automatically from key attributes

The OCID is an optional identifier that correlates a record with its data origin. The data model includes it as an optional attribute on reference data, activity data, and emissions data records. You provide the OCID when you create an activity or emissions data record. If you provide an OCID, Sustainability Manager uses it to generate the primary key for that record.

The OCID must be unique for each record, so you can't assign the same OCID to more than one record in a single entity (table). If you don't provide an OCID, Sustainability Manager generates the primary key from a specific set of key attributes defined for each entity.
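
The two approaches can be illustrated with a minimal Python sketch. It is not Sustainability Manager's implementation: the function name `derive_record_key`, the SHA-256 hashing step, and the entity and attribute names are all assumptions made for illustration, because this article doesn't describe how the primary key is actually computed. The sketch only shows the selection logic: use the OCID when one is provided, otherwise fall back to the entity's key attributes.

```python
import hashlib
from typing import Mapping, Optional, Sequence


def derive_record_key(
    entity: str,
    record: Mapping[str, object],
    key_attributes: Sequence[str],
    ocid: Optional[str] = None,
) -> str:
    """Return a deterministic key for a record (hypothetical scheme).

    If an OCID is provided, the key depends only on the entity and the OCID.
    Otherwise, the key depends on the entity and the listed key attributes.
    """
    if ocid:
        # The OCID identifies the record within a single entity (table).
        basis = f"{entity}|ocid|{ocid}"
    else:
        # Fall back to the per-entity key attributes, in a fixed order.
        parts = [f"{name}={record.get(name)!r}" for name in key_attributes]
        basis = f"{entity}|attrs|" + "|".join(parts)
    # Hashing is an assumption; any stable encoding of `basis` would do.
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()
```

Under these assumptions, two records that share an OCID map to the same key even when their attribute values differ, which is why a later ingestion with the same OCID can be treated as an update. Without an OCID, changing any key attribute produces a different key, so the record is treated as a new one.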

Important

After you set the OCID value on activity records, you can't change it.

Sustainability Manager data falls into the following three categories:

  • Activity data: Scope 1-3 records capture emission-producing activities such as purchased electricity or mobile combustion. Ingested precalculated emissions are considered activity data and are handled similarly.

  • Reference data: Supporting records that are typically used during calculation and classification of activity data. Examples include emission factor libraries, transport modes, and business travel types.

  • System data: Common operational records that are typically part of broader standards. Examples include greenhouse gas (GHG) factors, default units, and country/region code mappings.

Each category uses different rules to determine record uniqueness. As a result, updates can behave differently across categories. Use the following rules, listed by record type, to decide how to handle inserts and updates.

Record type: Activity data

  • Primary key evaluation rule: If OriginCorrelationID is provided, it's used to generate the primary key for that record. If a record with the same OriginCorrelationID already exists, the record is updated. If a record with the same OriginCorrelationID for the entity type doesn't exist, the record is inserted.

    If OriginCorrelationID isn't provided, all user-facing fields of the record make up the unique record key, except Connection, ConnectionRefresh, Evidence, and Description.

  • Result: If any of the fields per evaluation rule are different, and you didn't specify an OriginCorrelationID, the record is considered different and is inserted.

  • Update method: Use OriginCorrelationID for updates.

Record type: Reference data

  • Primary key evaluation rule:

    Name: Must be unique. If the ingested record has an identical name, the record is considered a duplicate.

    OriginCorrelationID: If specified, must be unique. If a record with the same OriginCorrelationID already exists, the record is updated. If a record with the same OriginCorrelationID for the entity type doesn't exist, the record is inserted.

    Primary key: Both Name and OriginCorrelationID (if specified).

    If Name is different and OriginCorrelationID is matched, the record is considered an update, and the Name is overwritten with the incoming record data.

    Estimation and emission factor names are unique within their library.

    Library name plus Name is the key for factors.

  • Result: If a record's Name already exists, the record is considered a duplicate, unless you specify an OriginCorrelationID.

  • Update method: Use OriginCorrelationID for updates.

Record type: System data

  • Primary key evaluation rule:

    Name: Must be unique. If the ingested record has an identical name, the record is considered a duplicate.

    OriginCorrelationID: If specified, must be unique. Used for updates.

  • Result: If a record's Name already exists, the record is considered a duplicate, unless you specify an OriginCorrelationID.

  • Update method: If provided, use OriginCorrelationID for updates.

    Otherwise, if an update is required, you must follow the delete-insert method.

    Note: We don't recommend updating system data.
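
The insert-or-update behavior described above for activity data and reference data can be sketched in Python. This is a simplified in-memory model, not the product's ingestion code: the function names, the `store` structures, and the outcome strings are hypothetical, and each store is assumed to hold the records of a single entity type. Two cases that this article doesn't spell out are marked as assumptions in the comments.

```python
from typing import Dict, List, Tuple

# Fields excluded from the activity-data key when no OCID is provided.
EXCLUDED_FIELDS = {"Connection", "ConnectionRefresh", "Evidence", "Description"}


def activity_key(record: Dict[str, object]) -> Tuple:
    """Build the uniqueness key for an activity record."""
    ocid = record.get("OriginCorrelationID")
    if ocid:
        return ("ocid", ocid)
    # Without an OCID, every remaining field takes part in the key.
    fields = sorted(
        (name, repr(value))
        for name, value in record.items()
        if name not in EXCLUDED_FIELDS and name != "OriginCorrelationID"
    )
    return ("fields", tuple(fields))


def ingest_activity(store: Dict[Tuple, Dict[str, object]], record: Dict[str, object]) -> str:
    """Insert or update an activity record in one entity's store."""
    key = activity_key(record)
    if key not in store:
        store[key] = dict(record)
        return "inserted"
    if key[0] == "ocid":
        store[key] = dict(record)  # same OCID: treated as an update
        return "updated"
    # Assumption: an identical field-based key is the same record;
    # the article doesn't say what happens next, so nothing is changed.
    return "matched"


def ingest_reference(store: List[Dict[str, object]], record: Dict[str, object]) -> str:
    """Insert or update a reference record in one entity's store."""
    ocid = record.get("OriginCorrelationID")
    if ocid:
        for existing in store:
            if existing.get("OriginCorrelationID") == ocid:
                existing.update(record)  # Name is overwritten if it differs
                return "updated"
    # Assumption: a Name clash without a matching OCID counts as a duplicate.
    if any(existing["Name"] == record["Name"] for existing in store):
        return "duplicate"
    store.append(dict(record))
    return "inserted"
```

In this model, ingesting an activity record twice with the same OriginCorrelationID but a different quantity updates the stored record, while the same change without an OriginCorrelationID produces a second record. For reference data, an incoming record whose OriginCorrelationID matches an existing record renames that record, and an incoming record with only a clashing Name is reported as a duplicate.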

See also

Import data
Microsoft Cloud for Sustainability data model