Exporting de-identified data

Note

Results when using the FHIR service's de-identified export will vary based on the nature of the data being exported and what de-id functions are in use. Microsoft is unable to evaluate de-identified export outputs or determine the acceptability for customers' use cases and compliance needs. The FHIR service's de-identified export is not guaranteed to meet any specific legal, regulatory, or compliance requirements.

The FHIR service can de-identify data on export when you run an $export operation. For de-identified export, the FHIR service uses the anonymization engine from the FHIR tools for anonymization (OSS) project on GitHub. The project includes a sample configuration file to help you get started with redacting or transforming FHIR data fields that contain personally identifying information.

Configuration file

The anonymization engine comes with a sample configuration file to help you get started with HIPAA Safe Harbor Method de-id requirements. The configuration file is a JSON file with four properties: fhirVersion, processingErrors, fhirPathRules, parameters.

  • fhirVersion specifies the FHIR version for the anonymization engine.
  • processingErrors specifies what action to take for any processing errors that may arise during the anonymization. You can raise or keep the exceptions based on your needs.
  • fhirPathRules specifies which anonymization method to use. The rules are executed in the order they appear in the configuration file.
  • parameters sets more controls for the anonymization behavior specified in fhirPathRules.

Here's a sample configuration file for FHIR R4:

{
  "fhirVersion": "R4",
  "processingError":"raise",
  "fhirPathRules": [
    {"path": "nodesByType('Extension')", "method": "redact"},
    {"path": "Organization.identifier", "method": "keep"},
    {"path": "nodesByType('Address').country", "method": "keep"},
    {"path": "Resource.id", "method": "cryptoHash"},
    {"path": "nodesByType('Reference').reference", "method": "cryptoHash"},
    {"path": "Group.name", "method": "redact"}
  ],
  "parameters": {
    "dateShiftKey": "",
    "cryptoHashKey": "",
    "encryptKey": "",
    "enablePartialAgesForRedact": true
  }
}

For detailed information on the settings within the configuration file, see the documentation for the FHIR tools for anonymization project on GitHub.
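Because the rules in fhirPathRules run in the order they appear, it can help to check a configuration file locally before placing it in storage or ACR. The following Python sketch is only a local sanity pass and is not the anonymization engine's own validation. The function names load_deid_config and describe_rules are hypothetical, and the requirement that each rule carries both "path" and "method" is inferred from the sample above.

```python
import json

# Keys that every rule in the sample configuration carries (an assumption
# drawn from the sample, not from the engine's schema).
REQUIRED_RULE_KEYS = {"path", "method"}

SAMPLE_CONFIG = """
{
  "fhirVersion": "R4",
  "fhirPathRules": [
    {"path": "Resource.id", "method": "cryptoHash"},
    {"path": "Group.name", "method": "redact"}
  ],
  "parameters": {"cryptoHashKey": ""}
}
"""


def load_deid_config(text: str) -> dict:
    """Parse a de-id configuration and run basic structural checks."""
    config = json.loads(text)
    rules = config.get("fhirPathRules")
    if not isinstance(rules, list):
        raise ValueError("fhirPathRules must be a list")
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            raise ValueError(f"rule {index} must be a JSON object")
        missing = REQUIRED_RULE_KEYS - rule.keys()
        if missing:
            raise ValueError(f"rule {index} is missing {sorted(missing)}")
    return config


def describe_rules(config: dict) -> list:
    """List the rules in file order, which is also their execution order."""
    return [
        f"{position}. {rule['method']} on {rule['path']}"
        for position, rule in enumerate(config["fhirPathRules"], start=1)
    ]


if __name__ == "__main__":
    for line in describe_rules(load_deid_config(SAMPLE_CONFIG)):
        print(line)
```

Listing the rules by position makes it easier to review the order in which they run before you export.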

Manage configuration file in storage account

You need to create a container for the de-identified export in your ADLS Gen2 account and specify the <<container_name>> in the API request as shown. Additionally, you need to place the JSON config file with the anonymization rules inside the container and specify the <<config file name>> in the API request.

Note

It is common practice to name the container anonymization. The JSON file within the container is often named anonymizationConfig.json.

Manage configuration file in ACR

We recommend hosting the export configuration files on Azure Container Registry (ACR). The steps are similar to those for hosting templates in ACR for $convert-data:

  1. Push the configuration files to your Azure Container Registry.
  2. Enable managed identity on your FHIR service instance.
  3. Grant the FHIR service's managed identity access to the ACR.
  4. Register the ACR servers in the FHIR service. In the portal, open "Artifacts" under the "Transform and transfer data" section to add the ACR server.
  5. Configure ACR firewall for secure access.

Using the $export endpoint for de-identifying data

https://<<FHIR service base URL>>/$export?_container=<<container_name>>&_anonymizationConfigCollectionReference=<<ACR image reference>>&_anonymizationConfig=<<config file name>>&_anonymizationConfigEtag=<<ETag on storage>>

Note

Currently, the FHIR service supports de-identified export only at the system level ($export).

Query parameter | Example | Optionality | Description
_container | exportContainer | Required | Name of the container within the configured storage account where the data is exported.
_anonymizationConfigCollectionReference | "myacr.azurecr.io/deidconfigs:default" | Optional | Reference to an OCI image on ACR containing the configuration files for de-identified export (such as stu3-config.json, r4-config.json). The ACR server of the image must be registered within the FHIR service. (Format: <RegistryServer>/<imageName>@<imageDigest>, <RegistryServer>/<imageName>:<imageTag>)
_anonymizationConfig | anonymizationConfig.json | Required | Name of the configuration file. See the Configuration file section for the file format. If _anonymizationConfigCollectionReference is provided, the service looks for this file in the specified image. Otherwise, the service looks for this file inside a container named anonymization within the configured ADLS Gen2 account.
_anonymizationConfigEtag | "0x8D8494A069489EC" | Optional | ETag of the configuration file, which you can get from the blob properties in Azure Storage Explorer. Specify this parameter only if the configuration file is stored in an Azure storage account. If you use ACR to host the configuration file, don't include this parameter.
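The parameter rules above can be sketched as a small Python helper that assembles the request URL. This is a minimal illustration, not an official client. The function names and the base URL are hypothetical placeholders. The Accept and Prefer headers come from the FHIR Bulk Data asynchronous request pattern, and the bearer token header is an assumption about how your service is secured. The sketch only builds the request and doesn't send it.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_deid_export_url(
    base_url: str,
    container: str,
    config_name: str,
    acr_reference: Optional[str] = None,
    etag: Optional[str] = None,
) -> str:
    """Assemble a system-level $export URL for de-identified export."""
    if acr_reference and etag:
        # Per the table: the ETag applies only to configs kept in Azure storage.
        raise ValueError(
            "_anonymizationConfigEtag must be omitted when the config is in ACR"
        )
    params = {"_container": container}
    if acr_reference:
        params["_anonymizationConfigCollectionReference"] = acr_reference
    params["_anonymizationConfig"] = config_name
    if etag:
        params["_anonymizationConfigEtag"] = etag
    # urlencode percent-encodes reserved characters such as "/" and ":".
    return f"{base_url.rstrip('/')}/$export?{urlencode(params)}"


def build_export_request(url: str, access_token: str) -> Request:
    """Wrap the URL in a GET request with async export headers (not sent)."""
    return Request(
        url,
        method="GET",
        headers={
            "Accept": "application/fhir+json",
            "Prefer": "respond-async",
            # Assumption: the service accepts an OAuth bearer token.
            "Authorization": f"Bearer {access_token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical base URL; replace it with your FHIR service base URL.
    url = build_deid_export_url(
        "https://myworkspace-myfhir.fhir.azurehealthcareapis.com",
        container="exportContainer",
        config_name="r4-config.json",
        acr_reference="myacr.azurecr.io/deidconfigs:default",
    )
    print(url)
```

Rejecting an ETag together with an ACR reference mirrors the table's guidance, so a mismatched combination fails before a request is made.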

Important

Both the raw export and de-identified export operations write to the same Azure storage account specified in the export configuration for the FHIR service. If you need multiple de-identification configurations, we recommend creating a different container for each configuration and managing user access at the container level.

Next steps

In this article, you've learned how to set up and use the de-identified export feature in the FHIR service. For more information about how to export FHIR data, see

FHIR® is a registered trademark of HL7 and is used with the permission of HL7.