Overview of Unstructured clinical notes enrichment in healthcare data solutions (preview)

[This article is prerelease documentation and is subject to change.]

Unstructured clinical notes enrichment is a capability that uses Azure AI Language's Text Analytics for health service to extract and structure data from unstructured clinical notes, enhancing the notes' analytical potential. The service can extract key Fast Healthcare Interoperability Resources (FHIR) entities from the notes and create structured data from them. You can then analyze the structured data to derive insights, predictions, and quality measures aimed at improving patient health outcomes.

Text Analytics for health enables information labeling through named entity recognition (NER) and entity linking. The service is used within the healthcare data solutions (preview) data pipelines as a modular service to create structured FHIR data from unstructured clinical notes. FHIR data can contain references to documents or parts of documents, known as DocumentReferences. These documents often contain a wealth of clinical information. When converted to structured health data that conforms to the FHIR standard, the data enriches a patient's clinical profile. Clinical notes are often a great source of information that can be mined to guide a patient's care pathway and deliver improved outcomes for them. This data also serves as a useful resource for analysts and data scientists wishing to conduct exploratory analysis on their clinical datasets.

Unstructured clinical notes enrichment is an optional capability under healthcare data solutions in Microsoft Fabric (preview). You have the flexibility to decide whether or not to use it, depending on your specific needs or scenarios.

To learn how to deploy, configure, and use this capability, see:


Using Azure AI Language's Text Analytics for health service is optional. However, any use of this service requires accepting the Responsible AI Terms and Conditions for deploying the service in your environment. For the installation steps and guidance, go to Set up Azure Language service.

To review the transparency notes, see:

Pricing model

The pricing model is based on the total number of text records processed by the Text Analytics for health API service. A text record is measured as 1,000 characters. For each piece of text submitted to the API for analysis, the character count is divided by 1,000 and rounded up to determine the number of text records used. For example, if you submit a text that is 3,200 characters long, it counts as four text records. The service uses this calculation for billing purposes.
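The text record calculation can be sketched in a few lines of Python. This is a minimal illustration, not part of the service: the function name `text_records` is hypothetical, and rounding up partial records is an assumption inferred from the 3,200-character example.

```python
import math

# From the text: one text record is measured as 1,000 characters.
CHARS_PER_TEXT_RECORD = 1_000


def text_records(text: str) -> int:
    """Return the number of text records counted for one submitted text.

    Assumption: a partial record rounds up to a whole record, which is
    consistent with the example of 3,200 characters counting as four records.
    """
    return math.ceil(len(text) / CHARS_PER_TEXT_RECORD)


# A 3,200-character note counts as four text records.
note = "x" * 3_200
print(text_records(note))  # 4
```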

Following is the cost breakdown for document processing:

  • For up to 5,000 text records, inferencing is included in the service.
  • For 5,000 to 500,000 text records, the cost is $25 USD per 1,000 text records processed.
  • For 500,000 to 2.5 million text records, the cost is $15 USD per 1,000 text records processed.
  • For more than 2.5 million text records, the cost is $10 USD per 1,000 text records processed.

The pricing model is designed to encourage users to process large volumes of text by offering a reduced cost per record for higher volumes. Only successful inferences are charged.
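To get a rough feel for how the tiers combine, the following sketch estimates a cost from a text record count. It assumes graduated billing, where each tier's rate applies only to the records that fall inside that tier, and it assumes fractional thousands are prorated. Neither detail is stated in this article, so treat the result as an estimate and confirm against the Azure AI Language pricing page. The names `TIERS` and `estimate_cost_usd` are hypothetical.

```python
# Each entry: (upper bound of the tier in text records, USD per 1,000 records).
# Rates are taken from the cost breakdown above.
TIERS = [
    (5_000, 0.0),           # up to 5,000 records: included
    (500_000, 25.0),        # 5,000 to 500,000 records
    (2_500_000, 15.0),      # 500,000 to 2.5 million records
    (float("inf"), 10.0),   # more than 2.5 million records
]


def estimate_cost_usd(records: int) -> float:
    """Estimate the cost for a number of text records.

    Assumption: tiers are graduated (marginal), so each rate applies only
    to the records that fall within its tier.
    """
    cost = 0.0
    lower = 0
    for upper, rate in TIERS:
        if records <= lower:
            break
        in_tier = min(records, upper) - lower
        cost += in_tier / 1_000 * rate
        lower = upper
    return cost


print(estimate_cost_usd(5_000))      # 0.0
print(estimate_cost_usd(6_000))      # 25.0
print(estimate_cost_usd(1_000_000))  # 19875.0
```

Under these assumptions, one million text records cost 495 × $25 for the second tier plus 500 × $15 for the third tier.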

To help you avoid unexpected processing costs, we limit the documentreferencecontent text (clinical notes) that the API processes by setting the nlp_document_limit parameter value to 10 in the healthcare#_msft_silver_ta4h notebook. You can review this configuration as explained in Configure the healthcare#_msft_silver_ta4h notebook. For more information about the pricing model, go to Azure AI Language pricing.
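The effect of a document limit on billed volume can be illustrated as follows. This sketch is not the notebook's code: how the notebook selects documents isn't described here, so taking the first notes in a list is a hypothetical stand-in, and `capped_text_records` is an illustrative name. Only the limit value of 10 and the 1,000-character text record come from this article.

```python
import math

NLP_DOCUMENT_LIMIT = 10        # value of nlp_document_limit named in the text
CHARS_PER_TEXT_RECORD = 1_000  # one text record is measured as 1,000 characters


def capped_text_records(notes: list[str], limit: int = NLP_DOCUMENT_LIMIT) -> int:
    """Count text records for at most `limit` notes.

    Assumption: the limit is applied by taking the first `limit` notes, and
    partial records round up. The real notebook may select notes differently.
    """
    selected = notes[:limit]
    return sum(math.ceil(len(note) / CHARS_PER_TEXT_RECORD) for note in selected)


# 25 notes of 3,200 characters each: only 10 are counted, at 4 records apiece.
notes = ["x" * 3_200] * 25
print(capped_text_records(notes))  # 40
```

Raising the limit increases the number of text records submitted, so it's worth estimating the volume before changing the value.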
