Use orchestrate multimodal AI insights (preview) in healthcare data solutions

[This article is prerelease documentation and is subject to change.]

Orchestrate multimodal AI insights (preview) enables on-demand generation and integration of AI enrichments from healthcare AI models and APIs. To learn more about the capability, including how to deploy and configure it, see the overview and deployment articles for orchestrate multimodal AI insights.

Understand the orchestration mechanism

The enrichment execution notebooks for Text Analytics for health, MedImageInsight, and MedImageParse simplify the use of the available healthcare models. Each model execution notebook:

  • Defines the enrichment view and enrichment definition based on the model configuration and input mappings. The input data for each model must be specified, which is why the healthcare data transformations are a prerequisite. You can use SQL views to specify which data, such as tables or images, to use as the data source (the sketch after this list illustrates one such view).

  • Specifies the model processor and transformer. The model processor calls the model API, and the transformer produces the standardized output and saves it to the Ingest folder in the bronze lakehouse. Both the processor and the transformer are specific to the model being used.
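
To make the role of the SQL view concrete, here's a minimal, hypothetical sketch of a view that scopes clinical note content as the data source for a text enrichment. It assumes the silver lakehouse is the notebook's default lakehouse; the view name, column names, and filter are illustrative assumptions, not the definitions the model execution notebooks actually create.

```python
# Hypothetical sketch only: scope input data for an enrichment with a SQL view.
# Assumes the silver lakehouse is the notebook's default lakehouse; the view name,
# columns, and filter are illustrative, not the notebooks' actual definitions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE VIEW clinical_notes_for_enrichment AS
    SELECT id, patient_id, content_text        -- assumed column names
    FROM DocumentReferenceContent              -- silver table from the clinical transformations
    WHERE content_text IS NOT NULL
""")

spark.sql("SELECT * FROM clinical_notes_for_enrichment LIMIT 10").show(truncate=False)
```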

After you run the enrichment execution, the integration pipeline ingests the model outputs into the lakehouse.

Run the model execution

Note

The notebooks are preconfigured to run with Spark runtime version 1.2 (Spark 3.4, Delta 2.4) by default. Ensure you maintain this setting at the environment level. For more information, see Reset Spark runtime version in the Fabric workspace.

Select the model you want to use (Text Analytics for health, MedImageInsight, or MedImageParse), and then deploy it from its reference link.

Make sure you have the proper data type for your chosen model and run the appropriate data transformations as specified. You can use the sample data deployed with healthcare data solutions with any of these models.

For more information about these models, see Azure AI Foundry. When you deploy the model, add the endpoint and key to your key vault.
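
At run time, the execution notebooks resolve the key from your key vault by its secret name. As a minimal sketch of how a Fabric notebook can read such a secret (the vault URL and secret name below are hypothetical placeholders):

```python
# Sketch: read a model API key from Azure Key Vault inside a Fabric notebook.
# The vault URL and secret name are hypothetical placeholders; replace them with
# the values you configured when deploying the model.
from notebookutils import mssparkutils  # preinstalled in Fabric notebooks

KEY_VAULT_URL = "https://contoso-kv.vault.azure.net/"  # assumed vault
API_KEY_SECRET_NAME = "ta4h-api-key"                   # assumed secret name

api_key = mssparkutils.credentials.getSecret(KEY_VAULT_URL, API_KEY_SECRET_NAME)
print(f"Retrieved a secret of length {len(api_key)}")  # don't print the key itself
```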

Text Analytics for health enrichment execution

  1. Install the healthcare data solutions sample data. Copy the sample data folder 51KSyntheticPatients from Files\SampleData\Clinical\FHIR-NDJSON\FHIR-HDS to Files\Ingest\Clinical\FHIR-NDJSON\FHIR-HDS.

  2. Run the healthcare#_msft_clinical_data_foundation_ingestion pipeline.

  3. After the pipeline executes successfully, validate that the DocumentReference and DocumentReferenceContent tables in the silver lakehouse are populated with data.

  4. FHIR resource enrichments (optional): Text Analytics for health enrichments are stored in the text and object enrichment tables. If you wish to generate FHIR resources, you must specify the FHIR resource configuration:

    1. In the healthcare#_msft_ai_enrichments_ta4h_execution notebook, edit the fifth cell to change the FHIR_RESOURCE_CONFIG parameter and set up the FHIR resources to extract during notebook execution. By default, the FHIR resources Observation and Condition are included. For the list of supported resources, review the FHIR Domain Resource column.

    2. Go to the bronze ingestion notebook healthcare#_msft_ai_enrichments_bronze_ingestion and update the inline parameter to enable FHIR resource extraction. Set the extract_fhir_resources_from_object_enrichments parameter value to true.

  5. Select the healthcare#_msft_ai_enrichments_ta4h_execution notebook.

  6. Go to cell 10 in the notebook, which defines the Text Analytics for health API settings, and unlock the cell for editing.

    A screenshot displaying the API settings.

  7. Update TA4H_API_KEY_SECRET_NAME.

  8. Update TA4H_API_ENDPOINT.

  9. Select Run all. All notebook cells must execute without errors.

  10. Verify that NDJSON files are generated in the following bronze lakehouse folders (the sketch after this list shows one way to spot-check them):

    • Files\Ingest\AIEnrichments\Text\Fabric.HDS
    • Files\Ingest\AIEnrichments\Object\Fabric.HDS
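
One way to spot-check the generated output is to read the NDJSON folders back with Spark. This sketch assumes the bronze lakehouse is the notebook's default lakehouse; the folder paths come from step 10.

```python
# Sketch: spot-check the Text Analytics for health output in the bronze lakehouse.
# Assumes the bronze lakehouse is the notebook's default lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

for folder in [
    "Files/Ingest/AIEnrichments/Text/Fabric.HDS",
    "Files/Ingest/AIEnrichments/Object/Fabric.HDS",
]:
    # spark.read.json handles NDJSON (one JSON object per line).
    df = spark.read.json(folder)
    print(f"{folder}: {df.count()} records")
```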

MedImageInsight enrichment execution

  1. Install the healthcare data solutions sample data.

  2. Deploy and configure DICOM data transformation capability.

  3. Move the 340ImagingStudies folder from Files\SampleData\Imaging\DICOM\DICOM-HDS to Files\Ingest\Imaging\DICOM\DICOM-HDS.

  4. Run the msft_imaging_with_clinical_foundation_ingestion pipeline.

  5. After the pipeline executes successfully, validate that the ImagingMetastore table is populated with data.

  6. Run the MedImageInsight notebook:

    1. Select the msft_ai_enrichments_medimage_insight_execution notebook.

    2. Go to the MedImageInsight API configuration cell in the notebook and make sure you're in Develop mode to unlock the cell.

      A screenshot of the API configuration cell.

    3. Update the MED_IMAGE_INSIGHTS_API_KEY_SECRET_NAME constant.

    4. Update the MED_IMAGE_INSIGHTS_API_ENDPOINT constant.

    5. Set the MED_IMAGE_INSIGHTS_MODEL_VERSION constant.

    6. Select Run all. All notebook cells must execute without errors.

    7. Verify that NDJSON files are generated in the bronze lakehouse folder Files\Ingest\AIEnrichments\Embedding\Fabric.HDS (the sketch after this list shows one way to check).
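
As with the text outputs, you can spot-check the embedding output by reading it back with Spark. This sketch assumes the bronze lakehouse is the notebook's default lakehouse; inspect the schema rather than assuming field names.

```python
# Sketch: spot-check the MedImageInsight embedding output in the bronze lakehouse.
# Assumes the bronze lakehouse is the notebook's default lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.json("Files/Ingest/AIEnrichments/Embedding/Fabric.HDS")
print(f"Embedding records: {df.count()}")
df.printSchema()  # inspect the actual field names instead of assuming them
```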

MedImageParse enrichment execution

  1. Install the healthcare data solutions sample data.

  2. Deploy and configure DICOM data transformation capability.

  3. Move the 340ImagingStudies folder from Files\SampleData\Imaging\DICOM\DICOM-HDS to Files\Ingest\Imaging\DICOM\DICOM-HDS.

  4. Run the msft_imaging_with_clinical_foundation_ingestion pipeline.

  5. After the pipeline executes successfully, validate that the ImagingMetastore table is populated with data.

  6. Run the MedImageParse notebook:

    1. Select the msft_ai_enrichments_medimage_parse_execution notebook.

    2. Go to cell 9 in the notebook for the API configuration. Make sure you're in Develop mode to unlock the cell.

    3. Update the MED_IMAGE_PARSE_API_KEY_SECRET_NAME constant.

    4. Update the MED_IMAGE_PARSE_API_ENDPOINT constant.

    5. Set the MED_IMAGE_PARSE_MODEL_VERSION constant.

    6. Set the MED_IMAGE_PARSE_SYSTEM_INSTRUCTIONS string. This string contains multiple sentences separated by the special character &.

      For example, "tumor core & enhancing tumor & nonenhancing tumor". In this case, there are three sentences. So, the output consists of three images with segmentation masks. To begin with, you can set the instructions as "neoplastic cells in breast pathology & inflammatory cells".

    7. Select Run all. All notebook cells must execute without errors.

    8. Verify that NDJSON files are generated in the bronze lakehouse folder Files\Ingest\AIEnrichments\Segmentation_2D\Fabric.HDS.
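
To illustrate how the & separator maps to the number of output masks, here's a small sketch using the example string from step 6. The splitting logic below is only an illustration of the convention, not the notebook's implementation.

```python
# Sketch: each '&'-separated sentence in the system instructions defines one
# segmentation target, so three sentences produce three output masks.
MED_IMAGE_PARSE_SYSTEM_INSTRUCTIONS = "tumor core & enhancing tumor & nonenhancing tumor"

targets = [s.strip() for s in MED_IMAGE_PARSE_SYSTEM_INSTRUCTIONS.split("&")]
print(targets)                          # ['tumor core', 'enhancing tumor', 'nonenhancing tumor']
print(f"{len(targets)} masks expected")
```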

Run the integration pipeline

  1. Make sure at least one model enrichment execution notebook ran successfully in the previous steps. The bronze lakehouse Ingest folder should contain data before you proceed.

  2. In your workspace, open the healthcare#_msft_ai_enrichments_ingestion pipeline, and select Run.

    A screenshot displaying the pipeline.

Validate data in the bronze lakehouse

  • Open the bronze lakehouse.

  • Load the AIEnrichments table:

    • If you don't see any data in the table after running the integration pipeline, it can take up to five minutes for the data to appear. Refresh the table after five minutes.

    • You should see a list of rows. The type column in the table depends on the use case you ran previously:

      Use case                      Types
      Text Analytics for health     Produces text and object enrichments
      MedImageInsight               Produces embedding enrichments
      MedImageParse                 Produces segmentation_2D enrichments
  • Depending on which model execution notebook you chose, confirm that each row has all the columns populated (the sketch after this list shows one way to run these checks):

    • The data column should contain the full data content of the streamed JSON file (or the full line in an NDJSON file).
    • The file_path column should contain a path to the file in the landing zone (the Process folder).
    • The type column is based on the model you ran.
    • The patient_id column should have data in all records.
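
A quick way to run these row-level checks is with Spark SQL against the bronze lakehouse. The sketch below assumes the bronze lakehouse is the notebook's default lakehouse and that the table is exposed as AIEnrichments; adjust the names to your environment.

```python
# Sketch: validate the bronze AIEnrichments table.
# Assumes the bronze lakehouse is the notebook's default lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Record counts per enrichment type.
spark.sql("""
    SELECT type, COUNT(*) AS record_count
    FROM AIEnrichments
    GROUP BY type
""").show(truncate=False)

# Every record should carry a patient_id.
spark.sql("""
    SELECT COUNT(*) AS records_missing_patient_id
    FROM AIEnrichments
    WHERE patient_id IS NULL
""").show()
```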

Validate data in the silver lakehouse

  1. Open the silver lakehouse and refresh the tables list.

  2. You should find these new tables:

    • TextEnrichments
    • ObjectEnrichments
    • EmbeddingEnrichments
    • Segmentation_2DEnrichments
    • Segmentation_3DEnrichments
  3. Depending on which model execution notebook you ran previously, the data goes into one or more of those five tables.

  4. Check each table schema:

    • All tables share a common set of columns in addition to their own columns.

      Common columns (should exist in every table):

      • unique_id
      • model_name
      • model_version
      • patient_id
      • enrichment_generation_id
      • enrichment_definition_id
      • metadata
      • enrichment_context_id
      • confidence_score
      • msftModifiedDateTime
      • msftCreatedDateTime

      TextEnrichments columns:

      • value
      • domain
      • reasoning
      • evidence
      • terminology

      ObjectEnrichments columns:

      • value
      • type
      • reasoning
      • evidence

      EmbeddingEnrichments columns:

      • vector

      Segmentation_2DEnrichments columns:

      • height
      • width
      • background
      • segmentation_metadata
      • value
      • file_ref_segmentation_mask
      • segmentation_type

      Segmentation_3DEnrichments columns:

      • height
      • width
      • depth
      • background
      • segmentation_metadata
      • value
      • file_ref_segmentation_mask
      • segmentation_type
  5. Check data in each table:

    1. Validate data integrity (the sketch after this list shows one way to automate these checks):

      1. In EmbeddingEnrichments, all rows should have the vector column populated. An empty array is a valid result.
      2. In TextEnrichments and ObjectEnrichments, the value column should have data; in TextEnrichments, the domain column should also be populated.
      3. In Segmentation_2DEnrichments, the mask value should exist as an array of numbers. The segmentation type should also be populated.
      4. All records in any table should have a patient_id.
    2. Data can be sparse. For example, not all models produce a confidence score or metadata.
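
The sketch below shows one way to automate these schema and integrity checks across the five tables. It assumes the silver lakehouse is the notebook's default lakehouse and that the table and column names match the lists above.

```python
# Sketch: check common columns and patient_id coverage in each silver enrichment table.
# Assumes the silver lakehouse is the notebook's default lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

COMMON_COLUMNS = {
    "unique_id", "model_name", "model_version", "patient_id",
    "enrichment_generation_id", "enrichment_definition_id", "metadata",
    "enrichment_context_id", "confidence_score",
    "msftModifiedDateTime", "msftCreatedDateTime",
}

TABLES = [
    "TextEnrichments", "ObjectEnrichments", "EmbeddingEnrichments",
    "Segmentation_2DEnrichments", "Segmentation_3DEnrichments",
]

for table in TABLES:
    df = spark.table(table)
    missing_columns = COMMON_COLUMNS - set(df.columns)
    rows_without_patient = df.filter(df["patient_id"].isNull()).count()
    print(f"{table}: {df.count()} rows, "
          f"missing common columns: {sorted(missing_columns) or 'none'}, "
          f"rows without patient_id: {rows_without_patient}")
```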

Validate the metadata store

  1. Open the msft_ai_enrichments_metadatastore lakehouse.

  2. Confirm that the following tables have records (the sketch after this list shows a quick record count):

    • EnrichmentView: Contains a view for each table that exists in the silver lakehouse and a view for each time you run the use case notebooks.
    • EnrichmentContext: Contains records only after a successful run of the integration pipeline.
    • Enrichment: Every time you run the use case notebooks, an enrichment definition is created in this table.
    • EnrichmentGenerationInfo
    • EnrichmentMaterializedView
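
A quick record count over these tables can confirm the metadata store is being updated. The sketch assumes the msft_ai_enrichments_metadatastore lakehouse is attached as the notebook's default lakehouse; adjust the table references if you attach it differently.

```python
# Sketch: confirm that the metadata store tables contain records.
# Assumes the msft_ai_enrichments_metadatastore lakehouse is the default lakehouse.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

for table in [
    "EnrichmentView",
    "EnrichmentContext",
    "Enrichment",
    "EnrichmentGenerationInfo",
    "EnrichmentMaterializedView",
]:
    print(f"{table}: {spark.table(table).count()} records")
```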