Is the pre-built Invoices model for Form Recognizer set up only for US-format invoices?

Emily Harper 76 Reputation points

Can you please confirm whether the pre-built invoices model for Form Recognizer (part of Azure Cognitive Services) is specific to US-format invoices?

I see it says it has a US locale, but we tried it on non-US invoices and saw some errors. A European invoice came in with the total text "557,38", which in some countries means the same as "557.38": they use a comma as the decimal separator instead of a full stop. However, the model treated the comma as punctuation and gave us a total of "57738". Does this mean we cannot feed in any invoices that do not use a full stop as the decimal separator?

Will dates have the same issue? Will it assume they are all in MM/DD/YYYY, when in the rest of the world it is far more common for invoices to use DD/MM/YYYY?

Thank you


Azure Cognitive Services
A group of Azure artificial intelligence services and cognitive APIs that help build intelligent apps.
Azure Form Recognizer
An Azure service that applies machine learning to extract text, key/value pairs, tables, and structures from documents.

Accepted answer
  1. YutongTie-MSFT 24,466 Reputation points Microsoft Employee

    Hello @Emily Harper

    You are correct. The root cause of this issue is that the invoice model currently supports only en-US invoices and typical US date formats. The provided invoice is from the UK, which causes normalization to fail. We are working on expanding invoice language/locale support, but there is no exact ETA for en-GB support.

    If many of your invoices share the same date format, you can apply custom normalization during post-processing as a workaround.

    How hard this is depends on how many different date/price formats appear in the customer's invoices. If there is a single format, it should be fairly easy to handle in any programming language. For example, a date such as "6th June 2021" can be handled by removing the ordinal suffix ("th") and then parsing with the regular DateTime.Parse; see the code below. If there are many different unsupported formats, it will be much more complex.

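        var s = "6th June 2021";
        // Strip the ordinal suffix only when it follows a digit, so month names like "August" stay intact.
        s = System.Text.RegularExpressions.Regex.Replace(s, @"(?<=\d)(st|nd|rd|th)\b", "");
        var date = DateTime.Parse(s, System.Globalization.CultureInfo.GetCultureInfo("en-GB"));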
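
    The same post-processing idea can be applied to amounts such as "557,38" and to day-first numeric dates. What follows is only a minimal sketch, assuming the raw (un-normalized) field text is available to you and that you already know the invoice's locale. The `InvoiceNormalizer` class, its method names, and the "de-DE" culture used in the usage note are illustrative assumptions, not part of Form Recognizer.

    ```csharp
    using System;
    using System.Globalization;

    // Hypothetical helper, not part of any Form Recognizer SDK: it re-parses raw
    // field text with an explicit culture instead of relying on en-US normalization.
    public static class InvoiceNormalizer
    {
        // Assumption: the invoice's culture is known up front, e.g. "de-DE".
        public static decimal ParseAmount(string rawText, string cultureName)
        {
            var culture = CultureInfo.GetCultureInfo(cultureName);
            // NumberStyles.Number allows group separators and a decimal separator,
            // both interpreted according to the supplied culture.
            return decimal.Parse(rawText.Trim(), NumberStyles.Number, culture);
        }

        // Assumption: numeric dates are day-first with a four-digit year.
        public static DateTime ParseDayFirstDate(string rawText)
        {
            // In a custom format, "d" and "M" accept one- or two-digit values.
            return DateTime.ParseExact(rawText.Trim(), "d/M/yyyy",
                CultureInfo.InvariantCulture);
        }
    }
    ```

    With these assumptions, `InvoiceNormalizer.ParseAmount("557,38", "de-DE")` reads the comma as the decimal separator, and `InvoiceNormalizer.ParseDayFirstDate("03/04/2021")` is read as 3 April rather than 4 March. Both methods throw a FormatException on text that does not match, which makes unsupported formats visible instead of silently producing a wrong value.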

    Hope this helps. Please let us know if you have any further queries.

