Use CMS claims data transformations in healthcare data solutions

With CMS claims data transformations, you can ingest, store, and analyze claims data in the Centers for Medicare & Medicaid Services (CMS) Claim and Claim Line Feed (CCLF) format. To learn more about the capability and understand how to deploy and configure it, see:

Understand the transformation mechanism

The CMS claims data transformation pipeline ingests claims files in either native or compressed format into the lakehouse. The end-to-end transformation follows these high-level consecutive steps:

  • Ingest the claims files into OneLake
  • Organize the claims files in OneLake
  • Extract claims data into the bronze lakehouse
  • Convert claims data to FHIR NDJSON files
  • Transform claims data into FHIR flattened tables in the bronze lakehouse
  • Transform claims data into FHIR relational tables in the silver lakehouse

Run the CMS claims data transformations pipeline

Ensure you complete the steps in Set up claims sample data before running the CMS claims data transformations pipeline.

  1. To transform the claims data from the bronze lakehouse to the silver lakehouse, open the healthcare#_msft_clinical_claims_cclf_data_transformation data pipeline and select Run.

    A screenshot displaying a sample data pipeline run.

  2. After the pipeline runs successfully, open the ExplanationOfBenefit table in the silver lakehouse to view the transformed data.

    A screenshot displaying transformed data in the ExplanationOfBenefit table.

Usage considerations

Review these key points before using the CMS claims data transformations capability.

Spark version

The notebooks are preconfigured to run with Spark runtime version 1.2 (Spark 3.4, Delta 2.4) by default. Ensure you maintain this setting at the environment level. To learn more, see Reset Spark runtime version in the Fabric workspace.
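As a quick sanity check before a run, a notebook cell can confirm that the session reports a Spark 3.4 version. The following is a minimal sketch, not part of the shipped notebooks; the helper name `is_spark_3_4` is hypothetical.

```python
def is_spark_3_4(version: str) -> bool:
    """Return True if a Spark version string belongs to the 3.4 line.

    Only the major and minor components are compared, so any
    patch or build suffix after "3.4" is accepted.
    """
    parts = version.split(".")
    return parts[:2] == ["3", "4"]
```

In a notebook, the version string would typically come from the active session, for example `is_spark_3_4(spark.version)`, assuming `spark` is the session object that the notebook provides.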

File extension

The uploaded CCLF files must follow the extension format: *.T1000001 to *.T1000009. Files with incorrect extensions move to the Failed folder in the bronze lakehouse.
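To catch naming problems before upload, the extension rule above can be checked locally. This is a minimal sketch that assumes the match is case-sensitive and applies only to the final part of the file name; the function name and the sample file name are hypothetical.

```python
import re
from pathlib import PurePath

# Final extension must be .T1000001 through .T1000009
# (case sensitivity is an assumption, not stated in the source).
_CCLF_EXTENSION = re.compile(r"\.T100000[1-9]\Z")


def has_valid_cclf_extension(filename: str) -> bool:
    """Return True if the file name ends in .T1000001 ... .T1000009."""
    name = PurePath(filename).name  # ignore any leading folder path
    return _CCLF_EXTENSION.search(name) is not None
```

For example, `has_valid_cclf_extension("sample_claims.T1000003")` passes the check, while a name ending in `.T1000010` or `.csv` doesn't.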

Record length

A record length mismatch in CCLF files occurs when one or more records deviate from the required fixed-length format. This mismatch can cause data misalignment, incomplete data capture, or processing errors. Files with records that don't meet the expected length move to the Failed folder in the bronze lakehouse.
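A simple pre-upload check can flag records whose length deviates from the fixed-length layout. The sketch below is illustrative only: the expected length depends on the CCLF file type and must come from the CMS layout specification, so `expected_length` is a caller-supplied assumption, and the function name is hypothetical.

```python
from typing import Iterable, List


def find_bad_record_lines(lines: Iterable[str], expected_length: int) -> List[int]:
    """Return 1-based line numbers of records whose length differs from expected_length.

    Trailing line terminators are stripped before measuring, because they
    aren't part of the fixed-length record itself.
    """
    bad_lines = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if len(record) != expected_length:
            bad_lines.append(number)
    return bad_lines
```

An empty result means every record has the expected length. Any reported line numbers point to the records most likely to send the file to the Failed folder.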