A couple of months ago I created a module that integrates with the PII detection feature of Azure TextAnalytics. It was working fine and returning PII results, but about a month ago I had to move on to other things. I recently came back to it and, with no code changes to the module, it now fails to detect PII and instead times out on the call to the Analytics API.
The error I receive (on the call to TextAnalyticsClient.RecognizePiiEntities()) is the following:
Retry failed after 4 tries. Retry settings can be adjusted in ClientOptions.Retry. (The operation was cancelled because it exceeded the configured timeout of 0:01:40. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.)
(The operation was canceled.)
(Unable to read data from the transport connection: An established connection was aborted by the software in your host machine..)
(An established connection was aborted by the software in your host machine.)
This is my code:
var options = new RecognizePiiEntitiesOptions();
options.CategoriesFilter.Add(PiiEntityCategory.Address);
... <more PII options>
var client = new TextAnalyticsClient(new Uri(config.uri), new AzureKeyCredential(config.key));
try {
    var incidents = new List<PiiIncident>();
    var segmenter = new StreamSegmenter(stream);
    foreach (var segment in segmenter.Segment(blockSize, overlapSize)) {
        PiiEntityCollection entities = client.RecognizePiiEntities(segment.content, null, options);
        foreach (PiiEntity entity in entities) {
            ... <process the PII>
        }
    }
    return incidents;
}
catch (Exception e) {
    logger.LogError(e, "Failed to scan for PII");
    return Errors.Pii.FailedToSearchPii;
}
Debugging gets to line 12 (the RecognizePiiEntities call), where it hangs for about 8 minutes and then throws the exception. The Language resource metrics in Azure show that the service is receiving the requests, but it reports all of them as client errors.
Searching online suggests that this could be caused by a firewall or antivirus. I have tried it on two different machines as well as in my tests, which run in a DevOps pipeline on a Microsoft-hosted agent. In all three cases it worked (with no code changes) a couple of months ago and now doesn't work on any of them. I've tried disabling the firewall and antivirus on my machine and it makes no difference.
I've tried creating a new instance of the Language resource in Azure, but that made no difference.
Has something changed in the registration or configuration of the Azure Language resource that is now required to prevent this issue? Or is there something else I'm missing that would cause this to start failing with no code changes, when it worked a couple of months ago?