Modify Exact Data Match schema to use configurable match



Applies to

  • Exact data match (EDM) sensitive information type (SIT) creation using PowerShell.

Exact Data Match (EDM) based classification enables you to create custom sensitive information types that refer to exact values in a database of sensitive information. When you need to allow for variants of an exact string, you can use configurable match to tell Microsoft Purview to ignore case and some delimiters.
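
As a rough illustration of what "ignore case and some delimiters" means, the sketch below reduces values to a comparable form before testing them for equality. It is a conceptual model only and not how Microsoft Purview implements matching; the `normalize` function, the delimiter set, and the sample value are all hypothetical.

```python
def normalize(value, ignored_delimiters="", case_insensitive=False):
    """Reduce a value to the form that is compared under the given settings."""
    # Drop every character that is listed as an ignored delimiter.
    stripped = "".join(ch for ch in value if ch not in ignored_delimiters)
    # Fold case only when case should not matter.
    return stripped.casefold() if case_insensitive else stripped


# Hypothetical delimiter set and stored value, chosen for illustration.
DELIMS = "-/*#^"
stored = "AB-1234/X"

for candidate in ["ab1234x", "AB/1234-X", "Ab#12^34*x"]:
    same = normalize(candidate, DELIMS, True) == normalize(stored, DELIMS, True)
    print(candidate, "matches" if same else "does not match")
```

With both settings enabled, all three candidates reduce to the same string as the stored value, which is why a single column can stand in for several case and delimiter variants.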


Use this procedure to modify an existing EDM schema and data file.

  1. Uninstall EdmUploadAgent.exe from the computer that you use to connect to Microsoft 365 to upload the EDM schema and data file.

  2. Download the appropriate EdmUploadAgent.exe file for your subscription:

    • Commercial + GCC: most commercial customers should use this version
    • GCC-High: specifically for high-security government cloud subscribers
    • DoD: specifically for United States Department of Defense cloud customers
  3. To authorize the EDM Upload Agent, open a Command Prompt window (as an administrator) and run the following command:

    EdmUploadAgent.exe /Authorize
  4. If you don't have a current copy of the existing schema, download one by running this command:

    EdmUploadAgent.exe /SaveSchema /DataStoreName <dataStoreName> [/OutputDir [Output dir location]]
  5. Customize the schema so that each column uses “caseInsensitive” and/or “ignoredDelimiters” as needed. The default value for “caseInsensitive” is “false”, and the default for “ignoredDelimiters” is an empty string.


    The underlying custom or built-in sensitive information type used to detect the general regex pattern must itself support detection of the input variations listed in ignoredDelimiters. For example, the built-in U.S. social security number (SSN) sensitive information type can detect variations in the data that include dashes, spaces, or no spaces between the grouped numbers that make up the SSN. As a result, the only delimiters that are relevant to include in EDM’s ignoredDelimiters for SSN data are the dash and the space.

    Here is a sample schema that simulates case-insensitive matching by creating the extra columns needed to recognize case variations in the sensitive data.

    <EdmSchema xmlns="">
      <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
        <Field name="PolicyNumber" searchable="true" />
        <Field name="PolicyNumberLowerCase" searchable="true" />
        <Field name="PolicyNumberUpperCase" searchable="true" />
        <Field name="PolicyNumberCapitalLetters" searchable="true" />
      </DataStore>
    </EdmSchema>

    In the above example, the variations of the original PolicyNumber column will no longer be needed if both caseInsensitive and ignoredDelimiters are added.

    To update this schema so that EDM uses configurable match, add the caseInsensitive and ignoredDelimiters flags to the field. Here's how that looks:

    <EdmSchema xmlns="">
      <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
        <Field name="PolicyNumber" searchable="true" caseInsensitive="true" ignoredDelimiters="-,/,*,#,^" />
      </DataStore>
    </EdmSchema>
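
    If you maintain several schemas, the edit shown above can be scripted. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name enable_configurable_match and the trimmed sample schema are hypothetical, and the sketch assumes that ignoredDelimiters is written as a comma-separated list, as in the example above.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of a saved schema (namespace omitted for brevity).
SAMPLE = """<EdmSchema>
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" />
  </DataStore>
</EdmSchema>"""


def local_name(tag):
    # ElementTree reports namespaced tags as "{namespace}Field".
    return tag.rsplit("}", 1)[-1]


def enable_configurable_match(schema_xml, field_name, delimiters, case_insensitive=True):
    """Return the schema text with both flags set on the named Field."""
    root = ET.fromstring(schema_xml)
    if root.tag.startswith("{"):
        # Keep the schema's default namespace unprefixed in the output.
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    matched = False
    for element in root.iter():
        if local_name(element.tag) == "Field" and element.get("name") == field_name:
            element.set("caseInsensitive", "true" if case_insensitive else "false")
            element.set("ignoredDelimiters", ",".join(delimiters))
            matched = True
    if not matched:
        raise ValueError(f"no Field named {field_name!r} in schema")
    return ET.tostring(root, encoding="unicode")


updated = enable_configurable_match(SAMPLE, "PolicyNumber", ["-", "/", "*", "#", "^"])
print(updated)
```

    The function works on the schema text rather than a file path, so you can review the result before saving it and uploading it in the later steps.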

    The ignoredDelimiters flag supports non-alphanumeric characters, with the exceptions noted in the second list below. Here are some examples of supported characters:

    • .
    • -
    • /
    • _
    • *
    • ^
    • #
    • !
    • ?
    • [
    • ]
    • {
    • }
    • \
    • ~
    • ;

    The ignoredDelimiters flag doesn't support:

    • characters 0-9
    • A-Z
    • a-z
    • "
    • ,
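
    The two lists above can be turned into a quick local check before you upload a schema. This is a sketch only: invalid_delimiters is a hypothetical helper, and it assumes that the attribute value is a comma-separated list of single characters, as in the sample schema. The service remains the authority on what it accepts.

```python
def invalid_delimiters(attribute_value):
    """Return the entries of an ignoredDelimiters value that the lists above rule out."""
    bad = []
    for entry in attribute_value.split(","):
        # Assumption: each entry is exactly one character. Letters, digits,
        # and the double quote are unsupported. A comma can never appear as
        # an entry because it is the separator, so it shows up as empty entries.
        if len(entry) != 1 or entry.isalnum() or entry == '"':
            bad.append(entry)
    return bad


print(invalid_delimiters("-,/,*,#,^"))   # the value from the sample schema
print(invalid_delimiters("-,a,7"))       # a letter and a digit are flagged
```

    An empty result means that no entry conflicts with the unsupported characters listed above.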
  6. Connect to Security & Compliance PowerShell.


    If your organization has set up Customer Key for Microsoft 365 at the tenant level (public preview), Exact Data Match automatically uses its encryption functionality. This capability is available only to E5-licensed tenants in the Commercial cloud.

  7. Update your schema by running the following command:

    Set-DlpEdmSchema -FileData ([System.IO.File]::ReadAllBytes('.\edm.xml')) -Confirm:$true
  8. If necessary, update the data file to match the new schema version.


    Optionally, you can validate your CSV file before uploading it by running:

    EdmUploadAgent.exe /ValidateData /DataFile [data file] /Schema [schema file]

    For example: EdmUploadAgent.exe /ValidateData /DataFile C:\data\testdelimiters.csv /Schema C:\EDM\patientrecords.xml
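
    If you removed the extra case-variation columns from the schema, it can help to see how the data file's header now differs from the schema before you run the validation. The sketch below is a hypothetical local pre-check, not a replacement for /ValidateData. It assumes that the first row of the CSV file holds column names that are meant to correspond to the schema's Field names, and it makes no claim about whether the tool rejects extra columns.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical schema and data, trimmed for illustration.
SCHEMA = """<EdmSchema>
  <DataStore name="PatientRecords" description="Schema for patient records policy" version="1">
    <Field name="PolicyNumber" searchable="true" caseInsensitive="true" ignoredDelimiters="-,/,*,#,^" />
  </DataStore>
</EdmSchema>"""
CSV_TEXT = "PolicyNumber,PolicyNumberLowerCase\nAB-1234,ab-1234\n"


def schema_field_names(schema_xml):
    """List the name attribute of every Field element, ignoring any namespace."""
    root = ET.fromstring(schema_xml)
    return [el.get("name") for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "Field"]


def compare_header(schema_xml, csv_text):
    """Return (missing, extra): schema fields absent from the CSV header,
    and header columns that the schema does not define."""
    fields = schema_field_names(schema_xml)
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = [name for name in fields if name not in header]
    extra = [name for name in header if name not in fields]
    return missing, extra


missing, extra = compare_header(SCHEMA, CSV_TEXT)
print("Missing from data file:", missing)
print("Not in schema:", extra)
```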

    For more information about all the parameters that EdmUploadAgent.exe supports, run:

    EdmUploadAgent.exe /?

  9. Open a Command Prompt window (as an administrator) and run the following command to hash and upload your sensitive data:

    EdmUploadAgent.exe /UploadData /DataStoreName [DS Name] /DataFile [data file] /HashLocation [hash file location] /Salt [custom salt] /Schema [Schema file]
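
    If you repeat the validate and upload steps regularly, you might assemble the two command lines in a script. The sketch below only builds the argument lists from the parameters shown in this procedure. The helper names and the hash folder are hypothetical, and treating /Salt as optional is an assumption. On the Windows computer where the agent is installed, each list could be passed to Python's subprocess.run.

```python
def validate_args(data_file, schema_file):
    """Arguments for the optional validation run in step 8."""
    return ["EdmUploadAgent.exe", "/ValidateData",
            "/DataFile", data_file,
            "/Schema", schema_file]


def upload_args(data_store, data_file, hash_location, schema_file, salt=None):
    """Arguments for the hash-and-upload run in step 9."""
    args = ["EdmUploadAgent.exe", "/UploadData",
            "/DataStoreName", data_store,
            "/DataFile", data_file,
            "/HashLocation", hash_location]
    if salt is not None:
        # Assumption: the custom salt is only passed when you supply one.
        args += ["/Salt", salt]
    args += ["/Schema", schema_file]
    return args


# File paths follow the validation example in step 8; the hash folder is hypothetical.
data_file = r"C:\data\testdelimiters.csv"
schema_file = r"C:\EDM\patientrecords.xml"
print(" ".join(validate_args(data_file, schema_file)))
print(" ".join(upload_args("PatientRecords", data_file, r"C:\EDM\hash", schema_file)))
```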