Azure Health.Deidentification client library for .NET - version 1.0.0-beta.1

Azure.Health.Deidentification is the .NET client library for the Deidentification Service, a managed service that enables users to tag, redact, or surrogate health data.

[Source code][source_root] | [Package (NuGet)][package] | [API reference documentation][reference_docs] | [Product documentation][azconfig_docs] | [Samples][source_samples]

Getting started

Install the package

Install the client library for .NET with NuGet:

dotnet add package Azure.Health.Deidentification --prerelease

Prerequisites

You must have an Azure subscription and a Deidentification Service resource.

Authenticate the client

Copy the service URL from the Deidentification Service resource you created.

[Image: Service URL location]

The following snippet creates a DeidentificationClient and deidentifies a string:

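        using System;
        using Azure;
        using Azure.Core;
        using Azure.Identity;
        using Azure.Health.Deidentification;

        // Replace with the service URL of your own Deidentification Service.
        const string serviceEndpoint = "https://example.api.cac001.deid.azure.com";
        TokenCredential credential = new DefaultAzureCredential();

        DeidentificationClient client = new(
            new Uri(serviceEndpoint),
            credential,
            new DeidentificationClientOptions()
        );

        // Ask the service to replace PHI in the plain-text input with synthetic values.
        DeidentificationContent content = new("Hello, John!", OperationType.Surrogate, DocumentDataType.Plaintext);

        Response<DeidentificationResult> result = client.Deidentify(content);
        string outputString = result.Value.OutputText;
        Console.WriteLine(outputString); // e.g. Hello, Tom! (the surrogate name may differ)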

Key concepts

Operation Modes

  • Tag: Returns the offset, length, and PHI category of each tagged text span.
  • Redact: Returns the output text with PHI replaced by placeholder stubs, e.g. [name].
  • Surrogate: Returns the output text with PHI replaced by synthetic values.
    • Input: My name is John Smith
    • Example output: My name is Tom Jones
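To make the relationship between Tag and Redact concrete, here is a minimal sketch of how offset/length spans could be turned into placeholder text on the client side. `TaggedSpan` and `TagSketch` are hypothetical names defined only for this illustration; they are not types from Azure.Health.Deidentification. The sketch also assumes that offsets and lengths index into the .NET string directly, which may not match how the service counts characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one tagged span; NOT a type from the client library.
public record TaggedSpan(int Offset, int Length, string Category);

public static class TagSketch
{
    // Replaces each span with a "[category]" placeholder. Spans are processed from
    // the end of the text so that earlier offsets stay valid as the string changes length.
    public static string ApplyPlaceholders(string text, IEnumerable<TaggedSpan> spans)
    {
        string result = text;
        foreach (TaggedSpan span in spans.OrderByDescending(s => s.Offset))
        {
            result = result.Remove(span.Offset, span.Length)
                           .Insert(span.Offset, $"[{span.Category}]");
        }
        return result;
    }
}

public static class TagSketchDemo
{
    public static void Main()
    {
        // "John Smith" starts at index 11 and is 10 characters long.
        var spans = new[] { new TaggedSpan(11, 10, "name") };
        Console.WriteLine(TagSketch.ApplyPlaceholders("My name is John Smith", spans));
        // prints: My name is [name]
    }
}
```

In practice the Redact operation does this for you; the sketch only illustrates what the span data returned by Tag describes.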

Job Integration with Azure Storage

Instead of sending text, you can send an Azure Storage location to the service. The service asynchronously processes the list of files and writes the deidentified files to a location of your choice.

Limitations:

  • Maximum file count per job: 1000 documents
  • Maximum file size per file: 2 MB
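Because a job is processed asynchronously, it can help to check these limits before submitting one. The following is a minimal sketch of such a pre-flight check. `JobLimitSketch` and `FindViolations` are hypothetical names invented for this example and are not part of the client library, and treating 2 MB as 2 × 1024 × 1024 bytes is an assumption; the service may define the limit differently.

```csharp
using System;
using System.Collections.Generic;

public static class JobLimitSketch
{
    // Limits taken from the list above.
    public const int MaxFilesPerJob = 1000;
    // Assumption: "2 MB" is interpreted as 2 * 1024 * 1024 bytes.
    public const long MaxFileSizeBytes = 2L * 1024 * 1024;

    // Takes a map of file name to size in bytes and returns one message per violated limit.
    public static List<string> FindViolations(IReadOnlyDictionary<string, long> fileSizes)
    {
        var problems = new List<string>();

        if (fileSizes.Count > MaxFilesPerJob)
        {
            problems.Add($"{fileSizes.Count} files exceed the limit of {MaxFilesPerJob} documents per job.");
        }

        foreach (KeyValuePair<string, long> entry in fileSizes)
        {
            if (entry.Value > MaxFileSizeBytes)
            {
                problems.Add($"{entry.Key} is {entry.Value} bytes, above the {MaxFileSizeBytes}-byte limit.");
            }
        }

        return problems;
    }
}

public static class JobLimitDemo
{
    public static void Main()
    {
        var files = new Dictionary<string, long>
        {
            ["note1.txt"] = 1024,            // well under the size limit
            ["scan.txt"] = 3L * 1024 * 1024  // over the assumed 2 MB limit
        };

        foreach (string problem in JobLimitSketch.FindViolations(files))
        {
            Console.WriteLine(problem);
        }
    }
}
```

How you obtain the file sizes (for example, by listing blobs in the source container) is left out here, since it depends on your storage setup.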

Redaction Formatting

Redaction formatting guide

Thread safety

We guarantee that all client instance methods are thread-safe and independent of each other (guideline). This ensures that the recommendation of reusing client instances is always safe, even across threads.

Additional concepts

Client options | Accessing the response | Long-running operations | Handling failures | Diagnostics | Mocking | Client lifetime

Examples

You can familiarize yourself with different APIs using Samples.

Next steps

  • Found a bug or have feedback? Raise an issue with the "Health Deidentification" label.

Troubleshooting

  • Unable to access source or target storage
    • Ensure that you created your Deidentification Service with a system-assigned managed identity.
    • Ensure that your storage account has granted permissions to that managed identity.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
