@Mike Schuckenbrock,
This is a complex issue involving Azure AI Document Intelligence’s neural custom extraction model for medical data. The core problem is not just accuracy but confidence scores, which remain too low even when the model extracts the correct data. Let’s go step by step and address this with a structured approach.
Root Cause Analysis of Low Confidence Scores:
Confidence scores are low beyond Chemo3 → The model is struggling with recognizing patterns beyond the most frequently seen positions.
Confidence scores remain low even with more training data → The model likely lacks a strong contextual understanding of positional variation.
The model does not generalize well to new placements of the same entity → If it sees "cisplatin" in Chemo1 but never in Chemo7 before, it assigns low confidence.
Data augmentation and increased diversity helped accuracy but not confidence → This indicates the model is still overly dependent on position-based learning.
Resolutions & Advanced Techniques:
1. Improve Generalization with Positional Independence
Custom extraction models can pick up position-based relationships from the training set. Your results suggest that the model is overfitting to field position instead of learning the entity context.
Solution: Relative Position Encoding & Context Expansion
Instead of labeling Chemo1, Chemo2, ..., Chemo25, consider labeling just "Chemotherapy Drug" across all instances and let the post-processing logic assign order.
Alternative: Use a bounding box-based approach where you extract the chemotherapy drug irrespective of its order in the document.
In the training phase, introduce documents where chemotherapy drugs appear in random orders (e.g., different tables, different list formats).
Implementation:
Flatten Labels: Instead of defining “Chemo1-Chemo25,” create a single “Chemotherapy Drug” label.
Augment Layouts: Vary document structures so that drugs appear in multiple positions.
Train with Bounding Boxes: Let the model learn to extract the drugs independent of order, then sort them later.
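As a rough illustration of the "sort them later" step, here is a minimal Python sketch that assigns Chemo1, Chemo2, ... after extraction, based on reading order. The `DrugMention` class, the `row_tolerance` value, and the flat `[x0, y0, x1, y1, ...]` polygon shape are assumptions made for this example, so adapt them to the result objects you actually receive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrugMention:
    """One extracted "Chemotherapy Drug" value (hypothetical container)."""
    text: str
    page: int
    polygon: List[float]  # assumed flat [x0, y0, x1, y1, ...] in page units
    confidence: float


def reading_order_key(mention, row_tolerance=0.1):
    """Sort key: page first, then approximate row, then left edge."""
    xs = mention.polygon[0::2]
    ys = mention.polygon[1::2]
    # Snap the top edge to a row bucket so items on one line sort left to right.
    return (mention.page, round(min(ys) / row_tolerance), min(xs))


def assign_order(mentions):
    """Map ordinal names (Chemo1, Chemo2, ...) to mentions in reading order."""
    ordered = sorted(mentions, key=reading_order_key)
    return {f"Chemo{i}": m for i, m in enumerate(ordered, start=1)}
```

The row bucketing is deliberately crude: two values that sit very close to a bucket boundary can land in different rows, so a production version would likely need a layout-aware grouping instead.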
2. Confidence Score Calibration
Even when Azure AI extracts correct values, confidence scores can be low due to underlying model uncertainty.
Solution: Confidence Score Normalization & Ensemble Methods
Instead of relying solely on the Azure AI model’s raw confidence score, apply post-processing techniques:
Histogram-Based Normalization: Adjust confidence scores based on prior distributions in training data.
Hybrid Ensemble with Regex Matching: If extracted drugs match a predefined medical dictionary, boost their confidence scores.
Metadata-Based Scoring: If a chemotherapy drug is listed within a known section of the document, raise its adjusted score, or equivalently lower the acceptance threshold for that section.
Implementation:
Extract all entities using Azure AI Document Intelligence.
Compare against a verified drug list (match known drugs → boost score).
Recalibrate scores based on prior distributions using a rule-based approach.
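One way to read the histogram-based recalibration is as histogram binning: on a labeled validation set, group raw confidence scores into bins and replace each score with the observed accuracy of its bin. The sketch below uses only the standard library. The function names, the bin count, and the midpoint fallback for empty bins are choices made for this example, not anything provided by the service.

```python
from bisect import bisect_right


def fit_histogram_calibrator(scores, is_correct, n_bins=5):
    """Learn per-bin empirical accuracy from labeled validation results."""
    edges = [i / n_bins for i in range(1, n_bins)]  # interior bin edges
    hits = [0] * n_bins
    totals = [0] * n_bins
    for score, ok in zip(scores, is_correct):
        b = bisect_right(edges, score)
        totals[b] += 1
        hits[b] += int(ok)
    # Empty bins fall back to the bin midpoint so the mapping stays defined.
    accuracy = [
        hits[b] / totals[b] if totals[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]
    return edges, accuracy


def calibrate(score, calibrator):
    """Replace a raw score with the observed accuracy of its bin."""
    edges, accuracy = calibrator
    return accuracy[bisect_right(edges, score)]
```

If most correct extractions score around 0.25 to 0.35, the calibrated value for that bin will reflect how often those extractions were actually right, which is usually a more useful number to threshold on than the raw score.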
3. Use Named Entity Recognition (NER) for Drug Identification
Azure AI Document Intelligence’s custom model struggles with long lists of drugs because it was not specifically designed for complex biomedical text extraction.
Solution: Integrate NLP-Based Clinical NER (e.g., BioBERT or ClinicalBERT in Azure ML)
Fine-tune a biomedical model such as BioBERT or ClinicalBERT in Azure Machine Learning so that it recognizes chemotherapy drug names with higher accuracy.
Run the NER model on the text extracted from the document, in parallel with the Azure AI Document Intelligence custom model, rather than relying on the custom model alone.
Combine outputs (if the NER model recognizes "cisplatin" in Chemo7 with high confidence, override the Azure AI score).
Implementation:
Use BioBERT/ClinicalBERT in Azure Machine Learning for better extraction.
Integrate BERT’s results with Azure AI Document Intelligence’s output for confidence adjustment.
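The "combine outputs" step could look roughly like the sketch below. It assumes you have already reduced both results to plain tuples: `(field_name, value, confidence)` from Document Intelligence and `(text, score)` from the NER model. The function name and the `ner_threshold` default are hypothetical; no real SDK call is shown.

```python
def merge_with_ner(fields, ner_entities, ner_threshold=0.9):
    """Override a field's confidence when the NER model agrees strongly.

    fields: list of (field_name, value, confidence) from the extraction model.
    ner_entities: list of (text, score) from the NER model.
    Returns (field_name, value, confidence, source) tuples.
    """
    # Keep the best NER score seen for each normalized drug name.
    ner_best = {}
    for text, score in ner_entities:
        key = text.strip().lower()
        ner_best[key] = max(score, ner_best.get(key, 0.0))

    merged = []
    for name, value, conf in fields:
        ner_score = ner_best.get(value.strip().lower(), 0.0)
        if ner_score >= ner_threshold and ner_score > conf:
            merged.append((name, value, ner_score, "ner"))
        else:
            merged.append((name, value, conf, "document_intelligence"))
    return merged
```

Keeping the `source` in the output makes it easy to audit later which scores were overridden and which came straight from the extraction model.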
4. Augment Model Training with Synonyms & Variations
Medical texts use multiple variations of the same term, affecting model confidence.
Solution: Train Using Synonyms, Abbreviations, and Variations
Expand the training dataset by automatically replacing drug names with synonyms and re-labeling them.
Use Azure AI Search (formerly Azure Cognitive Search) to enrich extracted entities with external medical knowledge.
Implementation:
Generate synthetic training samples with drug variations.
Use Azure AI Search (formerly Azure Cognitive Search) for synonym resolution.
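To make the synonym idea concrete, here is a small sketch that produces text variants for a drug name and maps a variant back to its canonical name. The `SYNONYMS` table is a tiny illustrative sample and the function names are made up for this example. Note that it only generates text and label values; rendering each variant into a document and labeling it for training is a separate step that is not shown.

```python
# Illustrative sample only; a real table would come from a curated drug list.
SYNONYMS = {
    "cisplatin": ["CDDP", "Platinol"],
    "doxorubicin": ["Adriamycin"],
}


def make_variants(template, drug, synonyms=SYNONYMS):
    """Return (text, label_value) pairs, one per known variant of `drug`."""
    variants = [drug] + synonyms.get(drug.lower(), [])
    return [(template.format(drug=v), v) for v in variants]


def canonical_name(value, synonyms=SYNONYMS):
    """Map an extracted variant back to its canonical drug name, if known."""
    needle = value.strip().lower()
    for canonical, variants in synonyms.items():
        if needle == canonical or needle in (v.lower() for v in variants):
            return canonical
    return None
```

The same table can serve both directions: generating varied training text up front and normalizing extracted values afterwards.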
5. Post-Processing with Azure Logic Apps for Human-in-the-Loop Review
Implement a confidence threshold re-adjustment workflow.
Rule-Based Processing: If a recognized chemotherapy drug’s confidence is > 0.30 and it matches the medical dictionary, override the model’s confidence.
Implementation:
Flag low-confidence entities for review but automatically approve high-probability matches.
Use business rules to override incorrect low scores.
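The routing rule itself is simple enough to express in a few lines, whether it ends up in a Logic Apps condition or in code. Below is a minimal Python sketch: the 0.30 dictionary floor comes from the rule above, while the `KNOWN_DRUGS` sample, the 0.80 auto-approve level, and the function name are assumptions for illustration.

```python
# Illustrative sample; in practice this would be your verified drug list.
KNOWN_DRUGS = {"cisplatin", "carboplatin", "paclitaxel"}


def route(value, confidence, known_drugs=KNOWN_DRUGS,
          auto_approve_at=0.80, dictionary_floor=0.30):
    """Return "approve" or "review" for one extracted drug value."""
    in_dictionary = value.strip().lower() in known_drugs
    # High raw confidence is accepted as-is.
    if confidence >= auto_approve_at:
        return "approve"
    # A dictionary match rescues scores above the floor.
    if in_dictionary and confidence > dictionary_floor:
        return "approve"
    # Everything else goes to a human reviewer.
    return "review"
```

You may want to tune both thresholds against a labeled sample before trusting the automatic approvals, since the right values depend on how costly a missed review is for your process.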
By implementing these, you should see:
Higher confidence scores on correctly extracted drugs.
Better generalization across document variations.
Fewer false negatives in post-processing due to threshold tuning.
Hope this helps.
Thank you!