Ingest data into GeoCatalog with the Bulk Ingestion API

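This article shows how to ingest many geospatial data assets into a GeoCatalog at once by using the Bulk Ingestion API. First, you create and configure a GeoCatalog ingestion source. Creating an ingestion source establishes a secure connection between your GeoCatalog resource and the storage location of your existing geospatial data. Next, you create a STAC (SpatioTemporal Asset Catalog) collection within your GeoCatalog resource to hold the ingested data. Finally, you use the Bulk Ingestion API to start the ingestion workflow. After you complete these steps, your geospatial data is ingested and accessible from the GeoCatalog UI and APIs.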

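Prerequisites

In your Azure subscription:

A geospatial dataset in a blob container of a storage account, consisting of:

Geospatial data assets (for example, GeoTIFF files)
The STAC Items associated with these assets (see Create STAC Items)
A STAC Collection JSON that references all the STAC Items and geospatial data assets

In your local / development environment:

A Python environment running Python 3.8 or later
Azure CLI
An active Azure sign-in (for example, through az login)

Microsoft Planetary Computer Pro must have access to the Azure Blob Storage container. In this article, you create and use a temporary SAS token credential to grant this access. Alternatively, you can follow the corresponding guides to set up managed identity or hard-coded SAS tokens.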

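Create ingestion source

Creating an ingestion source tells GeoCatalog which source to ingest geospatial data from and which credential mechanism to use in the ingestion workflow.

Install the required Python modules using pip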

pip install pystac-client azure-identity requests azure-storage-blob pyyaml

Import the required Python modules

import os
import requests
from azure.identity import AzureCliCredential
from datetime import datetime, timedelta, timezone
import azure.storage.blob
from urllib.parse import urlparse
import yaml

Set the required constants according to your environment

MPCPRO_APP_ID = "https://geocatalog.spatio.azure.com"
CONTAINER_URI = "<container_uri>" # The URI for the blob storage container housing your geospatial data
GEOCATALOG_URI = "<geocatalog uri>" # The URI for your GeoCatalog can be found in the Azure portal resource overview 
API_VERSION = "2025-04-30-preview"

Create the SAS token

# Parse the container URL
parsed_url = urlparse(CONTAINER_URI)
account_url = f"{parsed_url.scheme}://{parsed_url.netloc}"
account_name = parsed_url.netloc.split(".")[0]
container_name = parsed_url.path.lstrip("/")

credential = AzureCliCredential()
blob_service_client = azure.storage.blob.BlobServiceClient(
    account_url=account_url,
    credential=credential,
)

now = datetime.now(timezone.utc).replace(microsecond=0)
key = blob_service_client.get_user_delegation_key(
    key_start_time=now + timedelta(hours=-1),
    key_expiry_time=now + timedelta(hours=1),
)

sas_token = azure.storage.blob.generate_container_sas(
    account_name=account_name,
    container_name=container_name,
    user_delegation_key=key,
    permission=azure.storage.blob.ContainerSasPermissions(
        read=True,
        list=True,
    ),
    start=now + timedelta(hours=-1),
    expiry=now + timedelta(hours=1),
)
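
The URL parsing at the start of this step can be checked on its own before any Azure call is made. The following is a minimal sketch, not part of the original sample: the helper name split_container_uri and the storage account and container names are hypothetical, and the validation rules (https scheme, a single path segment) are assumptions about what a container URI looks like.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_container_uri(container_uri: str) -> Tuple[str, str, str]:
    """Return (account_url, account_name, container_name) for a container URI."""
    parsed = urlparse(container_uri)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Expected an https container URI, got: {container_uri!r}")
    account_url = f"{parsed.scheme}://{parsed.netloc}"
    # The storage account name is the first label of the host name
    account_name = parsed.netloc.split(".")[0]
    # strip("/") also removes a trailing slash, which lstrip("/") would keep
    container_name = parsed.path.strip("/")
    if not container_name or "/" in container_name:
        raise ValueError(f"Expected exactly one path segment, got: {parsed.path!r}")
    return account_url, account_name, container_name


# Hypothetical storage account and container, for illustration only
print(split_container_uri("https://examplestorage.blob.core.windows.net/geodata/"))
# ('https://examplestorage.blob.core.windows.net', 'examplestorage', 'geodata')
```

Failing early on an unfilled placeholder such as <container_uri> tends to give a clearer error than the one raised later by the storage client.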

Obtain the GeoCatalog API access token

# Obtain an access token
credential = AzureCliCredential()
access_token = credential.get_token(f"{MPCPRO_APP_ID}/.default")

Create the POST payload for the ingestion source API

# Payload for the POST request
payload = {
    "kind": "SasToken",
    "connectionInfo": {
        "containerUrl": CONTAINER_URI,
        "sasToken": sas_token,
    },
}

Create the ingestion source by sending the POST payload to the ingestion source endpoint

# Ingestion source API endpoint
endpoint = f"{GEOCATALOG_URI}/inma/ingestion-sources"

# Make the POST request
response = requests.post(
    endpoint,
    json=payload,
    headers={"Authorization": f"Bearer {access_token.token}"},
    params={"api-version": API_VERSION},
)

Verify the response

# Print the response
if response.status_code == 201:
    print("Ingestion source created successfully")
    ingestion_source_id = response.json().get("id")  # Saved for later to enable resource cleanup
else:
    print(f"Failed to create ingestion source: {response.text}")

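Note

Running these steps more than once results in a 409 response:

Container url <container uri> already contains a SAS token ingestion source with id <sas token id>

The ingestion source API doesn't allow more than one ingestion source for the same container URL. To avoid conflicts, clean up any existing ingestion source before you create a new one. For details, see Clean up resources.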
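
If the original ingestion_source_id was lost (for example, after a notebook restart), the 409 message quoted above already names the conflicting source. The following is a rough sketch that pulls that ID out of the response text. It assumes the message wording stays as quoted, which the service does not guarantee, and the helper name existing_source_id and the sample ID are hypothetical.

```python
import re
from typing import Optional

# Mirrors the wording of the 409 message quoted above; treat it as an
# assumption, because the service may change the message text.
_CONFLICT_PATTERN = re.compile(r"ingestion source with id\s+([0-9A-Za-z-]+)")


def existing_source_id(status_code: int, body_text: str) -> Optional[str]:
    """Return the conflicting ingestion source ID from a 409 body, if present."""
    if status_code != 409:
        return None
    match = _CONFLICT_PATTERN.search(body_text)
    return match.group(1) if match else None


# Made-up message in the documented shape, for illustration only
sample = (
    "Container url https://examplestorage.blob.core.windows.net/geodata "
    "already contains a SAS token ingestion source with id "
    "11111111-2222-3333-4444-555555555555"
)
print(existing_source_id(409, sample))
# 11111111-2222-3333-4444-555555555555
```

With the real response, the call would be existing_source_id(response.status_code, response.text), and the result can be used as ingestion_source_id in the delete request shown under Clean up resources.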

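Create collection

A STAC collection is the high-level container for STAC Items and their associated geospatial assets. In this section, you create a STAC collection within your GeoCatalog to hold the geospatial data you ingest in the next section.

Import the required modules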

import os
import requests
import yaml
from pprint import pprint
from azure.identity import AzureCliCredential

Set the required constants according to your environment

MPCPRO_APP_ID = "https://geocatalog.spatio.azure.com"
GEOCATALOG_URI = "<geocatalog uri>" # The URI for your GeoCatalog can be found in the Azure portal resource overview 
API_VERSION = "2025-04-30-preview"

COLLECTION_ID = "example-collection"  # You can use your own collection ID
COLLECTION_TITLE = "Example Collection"  # You can use your own collection title

Obtain the GeoCatalog API access token

# Obtain an access token
credential = AzureCliCredential()
access_token = credential.get_token(f"{MPCPRO_APP_ID}/.default")

Create a basic STAC collection specification

collection = {
    "id": COLLECTION_ID,
    "type": "Collection",
    "title": COLLECTION_TITLE,
    "description": "An example collection",
    "license": "CC-BY-4.0",
    "extent": {
        "spatial": {"bbox": [[-180, -90, 180, 90]]},
        "temporal": {"interval": [["2018-01-01T00:00:00Z", "2018-12-31T23:59:59Z"]]},
    },
    "links": [],
    "stac_version": "1.0.0",
    "msft:short_description": "An example collection",
}

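Note

This sample collection specification is a basic example of a collection. For more information about STAC collections and the STAC open standard, see the STAC overview. For details on creating a complete STAC collection, see Create a STAC collection.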
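
Before posting a hand-written collection, it can help to confirm that it carries the fields the STAC 1.0.0 Collection specification marks as required. The following is a minimal local check, not a full schema validation and not something GeoCatalog itself provides. The helper name missing_collection_fields is hypothetical.

```python
from typing import List

# Fields the STAC 1.0.0 Collection specification lists as required
REQUIRED_COLLECTION_FIELDS = (
    "type", "stac_version", "id", "description", "license", "extent", "links",
)


def missing_collection_fields(collection: dict) -> List[str]:
    """Return the names of required STAC Collection fields that are absent."""
    missing = [name for name in REQUIRED_COLLECTION_FIELDS if name not in collection]
    extent = collection.get("extent")
    if isinstance(extent, dict):
        # An extent must describe both its spatial and temporal coverage
        missing += [f"extent.{part}" for part in ("spatial", "temporal") if part not in extent]
    return missing


example = {
    "id": "example-collection",
    "type": "Collection",
    "description": "An example collection",
    "extent": {"spatial": {"bbox": [[-180, -90, 180, 90]]}},
    "links": [],
    "stac_version": "1.0.0",
}
print(missing_collection_fields(example))
# ['license', 'extent.temporal']
```

An empty list means only that the required keys are present. It says nothing about whether the values are valid, so the service response remains the authoritative check.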

Create the new collection with the Collection API

response = requests.post(
    f"{GEOCATALOG_URI}/stac/collections",
    json=collection,
    headers={"Authorization": "Bearer " + access_token.token},
    params={"api-version": API_VERSION},
)

Verify the response result

if response.status_code == 201:
    print("Collection created successfully")
    pprint(response.json())
else:
    print(f"Failed to create collection: {response.text}")

Create connection and run workflow

In this final step, you use the ingestion API to start a bulk ingestion workflow.

Import the required modules

import os
import requests
import yaml
from azure.identity import AzureCliCredential

Set the required constants according to your environment

MPCPRO_APP_ID = "https://geocatalog.spatio.azure.com"
GEOCATALOG_URI = "<geocatalog uri>" # The URI for your GeoCatalog can be found in the Azure portal resource overview 
API_VERSION = "2025-04-30-preview"

COLLECTION_ID = "example-collection"  # You can use your own collection ID
catalog_href = "<catalog_href>"  # The blob storage location of the STAC Catalog JSON file

skip_existing_items = False
keep_original_assets = False
timeout_seconds = 300

Obtain the GeoCatalog API access token

# Obtain an access token
credential = AzureCliCredential()
access_token = credential.get_token(f"{MPCPRO_APP_ID}/.default")

Create the POST payload for the Bulk Ingestion API

url = f"{GEOCATALOG_URI}/inma/collections/{COLLECTION_ID}/ingestions"
body = {
    "importType": "StaticCatalog",
    "sourceCatalogUrl": catalog_href,
    "skipExistingItems": skip_existing_items,
    "keepOriginalAssets": keep_original_assets,
}
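
Because the ingestion source grants access to one specific container, a catalog URL that points somewhere else is likely to fail only after the request is sent. The sketch below builds the same body and checks the URL first. The helper name build_ingestion_body and the example URLs are hypothetical, and the rule that the catalog must live under the ingestion source's container URL is an assumption based on how the ingestion source is described in this article, not a documented API constraint.

```python
def build_ingestion_body(
    catalog_href: str,
    container_uri: str,
    skip_existing_items: bool = False,
    keep_original_assets: bool = False,
) -> dict:
    """Build the bulk ingestion body after a basic catalog URL sanity check."""
    prefix = container_uri.rstrip("/") + "/"
    # Assumption: the catalog must sit inside the container that the
    # ingestion source was created for, so the SAS credential applies to it.
    if not catalog_href.startswith(prefix):
        raise ValueError(f"{catalog_href!r} is not inside container {container_uri!r}")
    return {
        "importType": "StaticCatalog",
        "sourceCatalogUrl": catalog_href,
        "skipExistingItems": skip_existing_items,
        "keepOriginalAssets": keep_original_assets,
    }


# Hypothetical URLs, for illustration only
body = build_ingestion_body(
    "https://examplestorage.blob.core.windows.net/geodata/catalog.json",
    "https://examplestorage.blob.core.windows.net/geodata",
)
print(body["sourceCatalogUrl"])
```

The returned dictionary has the same keys as the body shown above, so it can be passed to requests.post through the json parameter unchanged.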

Send the payload to the Bulk Ingestion API.

ing_response = requests.post(
    url,
    json=body,
    timeout=timeout_seconds,
    headers={"Authorization": f"Bearer {access_token.token}"},
    params={"api-version": API_VERSION},
)

Verify the response.

if ing_response.status_code == 201:
    print("Ingestion created successfully")
    ingestion_id = ing_response.json()["ingestionId"]
    print(f"Created ingestion with ID: {ingestion_id}")
else:
    print(f"Failed to create ingestion: {ing_response.text}")

Start the ingestion workflow run and check the response.

runs_endpoint = (
    f"{GEOCATALOG_URI}/inma/collections/{COLLECTION_ID}/ingestions/{ingestion_id}/runs"
)

wf_response = requests.post(
    runs_endpoint,
    headers={"Authorization": f"Bearer {access_token.token}"},
    params={"api-version": API_VERSION},
)

if wf_response.status_code == 201:
    print("Workflow started successfully")
else:
    print(f"Failed to create ingestion run: {wf_response.text}")

After the workflow completes, you can query, retrieve, or visualize your geospatial data by using the GeoCatalog STAC or Data APIs, or the Data Explorer.
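
This article does not show how to tell when the workflow has completed. One common approach is to poll for a status until it reaches a terminal value. The following is a generic sketch of that pattern only: fetch_status is a hypothetical callable that you would implement yourself (for example, with a GET request against the run), and the status names used here are placeholders, not values confirmed by the GeoCatalog API.

```python
import time
from typing import Callable, Collection


def wait_for_status(
    fetch_status: Callable[[], str],
    terminal_states: Collection[str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Simulated status sequence with placeholder names; no network calls are made
fake_statuses = iter(["Pending", "Running", "Completed"])
final = wait_for_status(
    lambda: next(fake_statuses),
    terminal_states={"Completed", "Failed"},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final)
# Completed
```

Passing the sleep function as a parameter keeps the helper easy to exercise without waiting. In real use, the default time.sleep applies.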

Clean up resources

Delete the ingestion source

del_is_endpoint = f"{GEOCATALOG_URI}/inma/ingestion-sources/{ingestion_source_id}"
del_is_response = requests.delete(
    del_is_endpoint,
    headers={"Authorization": f"Bearer {access_token.token}"},
    params={"api-version": API_VERSION},
)

if del_is_response.status_code == 200:
    print("Ingestion source deleted successfully")
else:
    print(f"Failed to delete ingestion source: {del_is_response.text}")

Next steps

Now that you've added some items, configure the data for visualization.

Configure a collection in Microsoft Planetary Computer Pro

Last updated on 2025-05-20
