U.S. social security number (SSN)

Tip

If you're not an E5 customer, use the 90-day Microsoft Purview solutions trial to explore how additional Purview capabilities can help your organization manage data security and compliance needs. Start now at the Microsoft Purview compliance portal trials hub. Learn details about signing up and trial terms.

Format

nine digits, which may be in a formatted or unformatted pattern

Note

If issued before mid-2011, an SSN has strong formatting where certain parts of the number must fall within certain ranges to be valid (but there's no checksum).

Pattern

four functions look for SSNs in four different patterns:

  • Func_ssn finds SSNs with pre-2011 strong formatting that are formatted with dashes or spaces (ddd-dd-dddd OR ddd dd dddd)
  • Func_unformatted_ssn finds SSNs with pre-2011 strong formatting that are unformatted as nine consecutive digits (ddddddddd)
  • Func_randomized_formatted_ssn finds post-2011 SSNs that are formatted with dashes or spaces (ddd-dd-dddd OR ddd dd dddd)
  • Func_randomized_unformatted_ssn finds post-2011 SSNs that are unformatted as nine consecutive digits (ddddddddd)

Checksum

No

Keyword Highlighting

Supported

When keyword highlighting is supported in the contextual summary for a sensitive information type or a trainable classifier, in the Contextual Summary view of activity explorer, the keywords in a document that were matched to a policy are highlighted.

Definition

A DLP policy has high confidence that it's detected this type of sensitive information if, within a proximity of 300 characters:

  • The function Func_ssn finds content that matches the pattern.
  • A keyword from Keyword_ssn is found.

A DLP policy has medium confidence that it's detected this type of sensitive information if, within a proximity of 300 characters:

  • The function Func_unformatted_ssn finds content that matches the pattern.
  • A keyword from Keyword_ssn is found.

A DLP policy has low confidence that it's detected this type of sensitive information if, within a proximity of 300 characters:

  • The function Func_randomized_formatted_ssn or Func_randomized_unformatted_ssn finds content that matches the pattern.
  • A keyword from Keyword_ssn is found.
<!-- U.S. Social Security Number (SSN) -->
  <Entity id="a44669fe-0d48-453d-a9b1-2cc83f2cba77" patternsProximity="300" recommendedConfidence="75">
      <Pattern confidenceLevel="85">
        <IdMatch idRef="Func_ssn" />
        <Match idRef="Keyword_ssn" />
      </Pattern>
      <Pattern confidenceLevel="75">
        <IdMatch idRef="Func_unformatted_ssn" />
        <Match idRef="Keyword_ssn" />
      </Pattern>
      <Pattern confidenceLevel="65">
        <IdMatch idRef="Func_randomized_formatted_ssn" />
        <Match idRef="Keyword_ssn" />
      </Pattern>
      <Pattern confidenceLevel="55">
        <IdMatch idRef="Func_randomized_unformatted_ssn" />
        <Match idRef="Keyword_ssn" />
      </Pattern>
  </Entity>

Keywords

Keyword_ssn

  • SSA Number
  • social security number
  • social security #
  • social security#
  • social security no
  • Social Security#
  • Soc Sec
  • SSN
  • SSNS
  • SSN#
  • SS#
  • SSID