Azure AD B2C diagnostic logging and masking personal user information in logs

Bhushan Gawale 241 Reputation points

We are currently designing our logging strategy for Azure AD B2C. We are exploring the option of configuring the diagnostic settings of our B2C tenant to send user logs and events to a Log Analytics workspace, based on the guidance provided here:

However, during our Proof of Concept (PoC) testing, we discovered that personal user information, such as email addresses and phone numbers, is logged and stored in the Log Analytics workspace without any masking.

We are looking for recommendations on how to mask this information before storing it. Specifically, we are interested in Microsoft's general guidelines for processing and storing user Personally Identifiable Information (PII).

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
Azure Active Directory
An Azure enterprise identity service that provides single sign-on and multi-factor authentication.
Azure Active Directory External Identities

Accepted answer
  1. Alistair Ross 5,141 Reputation points Microsoft Employee


    So this is a conversation I have had many times with colleagues internally, and here are my recommendations:

    • When deciding on a workspace structure, keep the B2C data in a separate workspace from internal data. You are going to have different access and usage requirements for this data and will handle it differently from internal data. You can still run cross-workspace queries if the data is needed as part of a security investigation.
    • Don't just ingest the Azure AD B2C data; ingest all the data you collect that relates to the applications and services provided, using Application Insights and more (specifically, front-end logs that relate to user interactions).
    • When using a service that allows custom log collection, like Application Insights, normalise the data at source where possible.
    • Data collection is not protection. Some organisations like to collect all the data available to them. Determine the financial cost of collecting the data and compare that against the use cases where you would have to examine it, such as a breach scenario or service improvement. This can help determine the financial value of the data and help you decide whether it is worth collecting. Take a look at Microsoft's approach to privacy here
    • When normalising the data at source is not practical or possible, use data collection transformations to normalise it at the point of ingestion. With this method you can drop columns or rows, or rewrite values to obfuscate the originals. This can be a way of controlling not only the data but costs as well, by reducing the size of the ingested logs.
    • There will always be a need to correlate the user account with PII, so ensure that all logged data has an internal user ID, such as a GUID, to perform a lookup against. This will make it far easier when you have to delete user data in future.
    • Be aware of any admin / employee sign-ins that may use the application front ends with B2C. If an employee is signing in with their personal account, they should be treated as any other customer, but if there are admin interfaces, these should be monitored, especially where security operations are concerned.
    • Tightly control who you give access to the data. Use services such as Privileged Identity Management (PIM) to allow just-in-time access to the data and ensure that people are getting the correct approvals within your organisation. Internally we have a process we call "Customer Lockbox" for accessing the data. Consider doing something similar internally when accessing PII using PIM.
    • When controlling access to the data, don't just consider identity, but networking controls as well. Use Azure Monitor Private Link Scopes, deny queries from public networks and only allow traffic from trusted networks (such as internal networks / VPNs).
    • Building on the last point, I would then implement secured privileged access workstations, or Azure Virtual Machines, that are heavily monitored, prevent general data exports and are the only endpoints that can access this data.
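
As a rough illustration of the "normalise at source" and "rewrite the values" points above, here is a minimal Python sketch of scrubbing a log record before it is sent to any logging backend. It is not part of Application Insights or any Microsoft SDK: the function names (`mask_text`, `scrub_record`), the field names (`email`, `phone_number`, `user_id`) and the regular expressions are all assumptions made for the example, and the patterns are deliberately loose rather than a complete PII detector.

```python
import re

# Assumed field names for this sketch; adjust to your own log schema.
DROP_FIELDS = {"email", "phone_number"}   # dedicated PII fields: removed entirely
KEEP_FIELDS = {"user_id"}                 # internal GUID: kept untouched for correlation

# Loose illustrative patterns, not exhaustive PII detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional "+", then 8 to 20 characters of digits and common separators,
# starting and ending with a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,18}\d")


def mask_text(text: str) -> str:
    """Replace email addresses and phone-like numbers in free text with tokens."""
    # Emails are masked first so digits inside an address are not
    # partially rewritten by the phone pattern.
    text = EMAIL_RE.sub("[email]", text)
    text = PHONE_RE.sub("[phone]", text)
    return text


def scrub_record(record: dict) -> dict:
    """Return a copy of a log record that is safer to ship to a log store."""
    clean = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue                      # drop the column, as a transformation would
        if key in KEEP_FIELDS or not isinstance(value, str):
            clean[key] = value            # leave the GUID and non-string values alone
        else:
            clean[key] = mask_text(value)
    return clean
```

For example, calling `scrub_record` on a record with a `user_id`, an `email` field and a `message` of "Sign-in failed for jane@example.com, MFA sent to +44 7700 900123" would drop the `email` field, keep the `user_id`, and rewrite the message to "Sign-in failed for [email], MFA sent to [phone]". The internal user ID is deliberately exempt from masking so that records can still be correlated, and later deleted, per user.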

    Ultimately, only your organisation can decide how you manage logging and PII, based on whichever regulations you need to follow. My recommendations above cover not only the strategy for collecting the data, but also protecting it.

    Kind regards

    Alistair Ross
