Document Intelligence Studio - offline container

Robert Ra 20 Reputation points
2024-03-20T08:22:34.67+00:00

Dear Azure Community,

I am writing on behalf of our company, which is a security printing company specializing in high-security personal data products such as passports and ID cards. Our work primarily supports government projects that impose stringent data security requirements, specifically mandating on-premise data processing with no cloud connectivity.

We are interested in leveraging the capabilities of Azure's Document Intelligence Studio for our internal processes. However, due to the strict limitations of our projects, we must ensure that all data remains on-premises and does not interact with cloud services in any capacity.

Our typical deployment involves configuring HPE or Dell servers on-site and developing bespoke software solutions tailored to each customer's needs. Our software's core functionality revolves around collecting applicant data and facilitating secure document issuance.

The specific workflow requires applicants to submit identity proof, such as scanned documents or photographs, which are then uploaded to our server. We need to process these submissions and perform OCR to extract data for verification against an existing database.

Given these constraints and requirements, we are seeking guidance on the following:

  1. Can Document Intelligence Studio be used to train models, which can be used in a container?
  2. Is it possible to train the Azure Form Recognizer in an isolated environment, and if so, how can we set up a local container for processing the documents?
  3. Are there any best practices or considerations we should be aware of when deploying this technology in a highly secure, on-premises context?
  4. We are especially interested in "Containers in disconnected (offline) environments" and what the limitations are.
  5. Our needs are simple. There is a fixed set of identity proof documents that people upload to our server; we just need to sanitize them, extract the data, and match it against existing records for verification.

We appreciate any advice.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. santoshkc 4,185 Reputation points Microsoft Vendor
    2024-03-20T11:24:53.6233333+00:00

    Hi @Robert Ra,

    Thank you for reaching out to us with your query. We understand that your organization has strict data security requirements and needs to ensure that all data remains on-premises and does not interact with cloud services in any capacity.

    Here is our guidance on each of your questions:

    • Document Intelligence Studio can be used to train models that can be used in a container. You can use the studio to train custom models for OCR, entity recognition, and key-value pair extraction. Containers enable you to run the Document Intelligence service in your own environment.
    • To train the Azure Form Recognizer in an isolated environment, you can use the Form Recognizer container. The container can be deployed on-premises, and you can use it to train custom models for OCR and entity recognition. See: Install and run containers & Configure Document Intelligence containers.
    • To deploy Azure Form Recognizer in a highly secure, on-premises context, you should consider using Azure Private Link, Azure ExpressRoute, Azure Firewall, Azure Active Directory, Role-Based Access Control, encryption, and Azure Security Center. These measures can help you protect your data and ensure secure access to your Form Recognizer service. See Data, privacy, and security for Document Intelligence.
    • To run the containers in disconnected (offline) environments, you need to request access and meet certain requirements. Access is limited to organizations that have a strategic partnership with Microsoft and whose use case meets one of the following conditions: 1. An environment or device(s) with zero connectivity to the internet. 2. A remote location that only occasionally has internet access. 3. An organization under strict regulation that prohibits sending any kind of data back to the cloud. For more info see: Use Docker containers in disconnected environments & Containers in disconnected (offline) environments.
    • You can use the Azure Document Intelligence service to extract data from government-issued identification documents such as passports, driver's licenses, and social security cards. The service uses Optical Character Recognition (OCR) to extract key information from identity documents, such as first name, last name, date of birth, document number, and more. The prebuilt ID model is designed for this. Once the data is extracted, you can match it against your existing records for verification.
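
    As a rough illustration of the final matching step, here is a minimal Python sketch. It is not part of the service or its SDK. It assumes the extracted fields have already been flattened into a plain dictionary keyed by DocumentNumber, FirstName, LastName, and DateOfBirth (with the date as a YYYY-MM-DD string), and that your existing records are indexed by normalized document number. The Record class and the normalize and verify helpers are hypothetical names used only for this example.

```python
import unicodedata
from dataclasses import dataclass
from datetime import date
from typing import List, Mapping


@dataclass(frozen=True)
class Record:
    """Hypothetical shape of one row in the existing records database."""
    document_number: str
    first_name: str
    last_name: str
    date_of_birth: date


def normalize(text: str) -> str:
    """Strip accents, drop everything except letters and digits, uppercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def verify(extracted: Mapping[str, str], records: Mapping[str, Record]) -> List[str]:
    """Return the names of fields that do not match; an empty list means a match.

    `extracted` is assumed to be a flat dict of field name to string value.
    `records` is assumed to be keyed by normalized document number.
    """
    record = records.get(normalize(extracted.get("DocumentNumber", "")))
    if record is None:
        # No record with this document number, so nothing else can be compared.
        return ["DocumentNumber"]

    mismatches = []
    if normalize(extracted.get("FirstName", "")) != normalize(record.first_name):
        mismatches.append("FirstName")
    if normalize(extracted.get("LastName", "")) != normalize(record.last_name):
        mismatches.append("LastName")
    if extracted.get("DateOfBirth") != record.date_of_birth.isoformat():
        mismatches.append("DateOfBirth")
    return mismatches
```

    Normalizing both sides before comparing tolerates common OCR and data-entry differences such as casing, spacing, punctuation, and accents. For identity verification you would likely also want to check the confidence scores returned with each extracted field and send low-confidence or mismatched results to manual review.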

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".

