Processing Time - Custom Neural Model

Paul Pawletta

Hi, is there any way that the inference time for a custom neural model can be reduced?

I trained two custom neural models: one with 10 different entities and one with 3 different entities. For both models, the inference call for a single-page document takes ~15-16 seconds. For a 2-page document, the inference time goes up to ~22-27 seconds.
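For anyone who wants to reproduce numbers like these, here is a minimal sketch of how end-to-end call latency could be collected. It is not necessarily how the timings above were measured. `time_calls` and `fake_analyze` are hypothetical names used only for illustration; in practice the callable would wrap whatever client call submits the document and waits for the result.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(analyze: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Time `runs` sequential invocations of `analyze`; summary is in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    durations: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        # Hypothetical: submit the document and block until the result is ready.
        analyze()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def fake_analyze() -> None:
    # Stand-in for a real inference call; sleeps briefly to simulate latency.
    time.sleep(0.01)


if __name__ == "__main__":
    print(time_calls(fake_analyze, runs=3))
```

Reporting min, median, and max over several runs helps separate a consistently slow call from occasional outliers such as network hiccups.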

Are there any plans to make inference time more configurable, e.g. by offering GPU-based model inference?

Azure AI Document Intelligence
