Processing Time - Custom Neural Model

Paul Pawletta 21 Reputation points

Hi, is there any way that the inference time for a custom neural model can be reduced?

I trained a few custom neural models: one model with 10 different entities and one with 3 different entities. For both models, the inference call for a single-page document takes ~15-16 seconds; for a 2-page document it goes up to ~22-27 seconds.
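For context, here is a minimal sketch of how these timings can be measured. The timing helper is generic; the commented-out client calls assume the `azure-ai-formrecognizer` Python SDK and a hypothetical model ID and document, and are shown only to illustrate where the latency occurs:

```python
import time
from typing import Any, Callable, Tuple

def timed(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Hypothetical usage against a custom neural model (requires a real
# endpoint and key; "my-custom-model" and "invoice.pdf" are placeholders):
#
# from azure.ai.formrecognizer import DocumentAnalysisClient
# from azure.core.credentials import AzureKeyCredential
#
# client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
# with open("invoice.pdf", "rb") as f:
#     result, seconds = timed(
#         lambda: client.begin_analyze_document("my-custom-model", f).result()
#     )
# print(f"Inference took {seconds:.1f}s")
```

Note that `begin_analyze_document` is a long-running operation: the measured time includes both queueing and polling, not just model execution.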

Are there any plans to make the inference time more configurable, e.g. by offering GPU-backed model inference?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.