Ingest data into Microsoft Planetary Computer Pro

Microsoft Planetary Computer Pro's ingestion capabilities let users bring their own data into a cloud platform built for indexing, storing, and querying geospatial assets at scale. Data ingested and stored in a Planetary Computer Pro GeoCatalog uses the SpatioTemporal Asset Catalog (STAC) open standard to index, query, and retrieve geospatial data. For more information on STAC, see STAC overview.

This diagram provides an overview of how the various elements in the ingestion service work together:

Diagram showing secure ingestion in Microsoft Planetary Computer Pro.

Prerequisites

Ingestion Sources

Ingestion sources are representations of the location and authentication mechanisms required to ingest data into a GeoCatalog resource. Users can list and configure ingestion sources in the Settings tab of the web interface or using the GeoCatalog API. Once the ingestion source is set, data stored in that location is available for secure ingestion into your GeoCatalog.

Screenshot of GeoCatalog Portal showing where the Settings button is located.

Supported Storage Types

Planetary Computer Pro supports ingestion of geospatial assets from the following storage sources:

  • Azure Blob Storage, authenticated with a managed identity or SAS tokens
  • Public URLs
  • S3 buckets with signed keys

Warning

All data ingested into Planetary Computer Pro requires STAC Items.

Tip

To accelerate the creation of STAC Items, see our detailed tutorial or use the open-source tool STAC Forge.
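
As a rough illustration of what a STAC Item contains, the following Python sketch builds a minimal item for a single raster asset using only the standard library. The field names follow the STAC Item specification; the item ID, storage URL, bounding box, and the `make_stac_item` and `missing_fields` helpers are made up for this example and aren't part of Planetary Computer Pro or STAC Forge.

```python
import json
from datetime import datetime, timezone

# Top-level fields the STAC Item specification expects on an item with a geometry.
REQUIRED_FIELDS = ("type", "stac_version", "id", "geometry", "bbox",
                   "properties", "links", "assets")


def make_stac_item(item_id, asset_href, bbox, acquired):
    """Build a minimal STAC Item dict for one raster asset.

    bbox is (west, south, east, north) in WGS84 degrees.
    acquired must be a timezone-aware datetime.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last positions are identical.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {
            "datetime": acquired.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        "links": [],
        "assets": {
            "data": {
                "href": asset_href,
                "type": "image/tiff; application=geotiff",
                "roles": ["data"],
            }
        },
    }


def missing_fields(item):
    """Return the required top-level fields that are absent from an item."""
    return [name for name in REQUIRED_FIELDS if name not in item]


# Hypothetical values, for illustration only.
item = make_stac_item(
    "example-scene-001",
    "https://example.blob.core.windows.net/imagery/scene-001.tif",
    (-122.5, 47.4, -122.2, 47.7),
    datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(item, indent=2))
```

A check like `missing_fields` only catches absent top-level fields; it doesn't replace full STAC validation.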

Ingestion Methods

When you provide an ingestion source, such as a Blob Storage container or a public URL, Planetary Computer Pro can access your data. You can then ingest the STAC collections, STAC items, and assets stored in that location. The GeoCatalog resource must have access to both the STAC collection JSON and the geospatial assets (images, data, etc.) that the collection's STAC items point to.
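
Because every asset has to be reachable through a configured ingestion source, it can help to check asset links before starting an ingestion. The sketch below is one possible local pre-check, not a description of how the service resolves access: `uncovered_assets` is a hypothetical helper, the URLs are invented, and it assumes asset `href` values are absolute URLs.

```python
def uncovered_assets(items, source_prefixes):
    """Return (item_id, asset_key, href) for assets outside every source prefix.

    items: iterable of STAC Item dicts.
    source_prefixes: container or folder URLs configured as ingestion sources.
    Assumes absolute hrefs; relative hrefs are reported as uncovered.
    """
    # Normalize to a trailing slash so ".../imagery" doesn't match ".../imagery-old".
    prefixes = tuple(p.rstrip("/") + "/" for p in source_prefixes)
    problems = []
    for item in items:
        for key, asset in item.get("assets", {}).items():
            href = asset.get("href", "")
            if not href.startswith(prefixes):
                problems.append((item.get("id"), key, href))
    return problems


# Hypothetical items and ingestion source, for illustration only.
items = [
    {"id": "a", "assets": {"data": {"href": "https://example.blob.core.windows.net/imagery/a.tif"}}},
    {"id": "b", "assets": {"data": {"href": "https://other.example.com/b.tif"}}},
]
sources = ["https://example.blob.core.windows.net/imagery"]

for item_id, key, href in uncovered_assets(items, sources):
    print(f"{item_id}: asset '{key}' is outside the configured sources: {href}")
```

A prefix match only shows that an asset sits under a configured location. It says nothing about whether the credentials attached to that source are valid.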

During ingestion, GeoTIFF, JPEG, JPEG 2000, PNG, and TIFF files are transformed into Cloud Optimized GeoTIFFs (COGs). Users can also select an option to copy the original files.

Note

Data already in COG format isn't transformed.
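
To anticipate which assets are likely to be converted, one option is to look at the media type declared on each STAC asset. The following sketch assumes assets carry a `type` field and that COGs use the `profile=cloud-optimized` media type recommended in the STAC best practices. It's a local heuristic only; how the service itself detects COGs isn't described here, and `expected_handling` is a hypothetical helper.

```python
# Base media types for the formats listed above as being transformed to COG.
CONVERTIBLE_TYPES = {"image/tiff", "image/jpeg", "image/jp2", "image/png"}


def expected_handling(asset):
    """Guess how an asset will be handled, based on its declared media type."""
    parts = [p.strip().lower() for p in asset.get("type", "").split(";")]
    if "profile=cloud-optimized" in parts:
        return "kept as-is"        # declared as already being a COG
    if parts[0] in CONVERTIBLE_TYPES:
        return "converted to COG"
    return "unknown"               # other formats or no declared media type


# Hypothetical assets, for illustration only.
assets = {
    "visual": {"type": "image/tiff; application=geotiff; profile=cloud-optimized"},
    "thumbnail": {"type": "image/png"},
    "cube": {"type": "application/x-netcdf"},
}
for key, asset in assets.items():
    print(key, "->", expected_handling(asset))
```

A declared media type can be wrong or missing, so treat the result as a hint rather than a guarantee.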

Ingestion also supports cloud optimization for various data cube formats; see Data cube Overview. Users can also choose to skip certain items in the catalog.

There are two available ingestion methods depending on use case: bulk ingestion and single item ingestion. Each can be done through the web interface or the API.

Bulk Ingestion

Bulk ingestion lets users automatically ingest an existing STAC Collection, including its collection JSON file, associated STAC Items, and the underlying STAC assets (images, data, etc.). To move these artifacts into a GeoCatalog, you specify the data source (for example, Blob Storage), the connection URL, and the item type; bulk ingestion then uses these inputs to run an ingestion workflow. For more information about bulk ingestion, see Ingest data into GeoCatalog with the Bulk Ingestion API.

Screenshot of bulk ingestion GUI.
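
Before pointing bulk ingestion at a collection, it can be useful to confirm which items the collection JSON actually references. The sketch below lists the item URLs in a static STAC collection by reading its `links` entries with `rel` set to `item`, as defined in the STAC specification. It doesn't call the Bulk Ingestion API, and it makes no claim about how the service discovers items; `item_urls` and the URLs are hypothetical.

```python
from urllib.parse import urljoin


def item_urls(collection, collection_url):
    """Resolve the item links of a STAC collection to absolute URLs.

    collection: parsed collection JSON (a dict).
    collection_url: URL the collection JSON was loaded from, used to
    resolve relative hrefs.
    """
    return [
        urljoin(collection_url, link["href"])
        for link in collection.get("links", [])
        if link.get("rel") == "item" and "href" in link
    ]


# Hypothetical collection, for illustration only.
collection_url = "https://example.blob.core.windows.net/stac/demo/collection.json"
collection = {
    "type": "Collection",
    "id": "demo",
    "links": [
        {"rel": "self", "href": collection_url},
        {"rel": "item", "href": "./items/a.json"},
        {"rel": "item", "href": "https://example.blob.core.windows.net/stac/demo/items/b.json"},
    ],
}
for url in item_urls(collection, collection_url):
    print(url)
```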

Single Item Ingestion

Given an existing STAC collection in a GeoCatalog, users can add new STAC items or update existing ones, along with their associated assets, through Planetary Computer Pro's web interface or the API. Unlike bulk ingestion, single item ingestion is intended for low-latency imports or updates rather than large data migrations. For more information about single item ingestion, see Add STAC Items to a Collection.

Screenshot of single-item ingestion GUI.

Troubleshooting Ingestion

If you encounter issues during data ingestion, such as authentication failures, STAC validation errors, or problems with asset transformation, refer to our dedicated troubleshooting documentation:

Next steps

Learn more about how to set up an Ingestion Source: