So this is a conversation I have had many times with colleagues internally, and here are my recommendations:
- When deciding on a workspace structure, keep the B2C data in a separate workspace from internal data. You are going to have different access and usage requirements for this data and will be handling it differently from internal data. You can still perform cross-workspace queries if the data is needed as part of a security investigation.
- Don't just ingest the Azure AD B2C data; ingest all the data you collect that relates to the applications and services provided, using Application Insights and more (specifically, front-end logs that relate to user interactions).
- When using a service that allows custom log collection, like Application Insights, normalise the data at source where possible.
- Data collection is not protection. Some organisations like to collect all the data available to them. Instead, determine the financial cost of collecting the data and compare it against the use cases where you would need to examine that data, such as a breach scenario or improving services. This can help determine the financial value of the data and help you decide whether it is worth collecting. Take a look at Microsoft's approach to privacy here: https://learn.microsoft.com/en-us/compliance/assurance/assurance-privacy
- When normalising the data at source is not practical or possible, use data collection transformations to normalise at the point of ingestion: https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/data-collection-transformations With this method you can drop columns, drop rows, or rewrite values to obfuscate the originals. This is a way to control not only the data but costs as well, by reducing the size of the ingested logs.
- There will always be a need to correlate the user account with PII, so ensure that all logged data has an internal User ID, such as a GUID, to perform a lookup against. This will make it far easier to delete user data in future.
- Be aware of any admin / employee logins that may use the application front ends with B2C. If an employee is logging in with their personal account, they should be treated like any other customer, but if there are admin interfaces, these should be monitored, especially where security operations are concerned.
- Tightly control who you give access to the data. Use services such as Privileged Identity Management (PIM) to allow just-in-time access to the data, and ensure that people are getting the correct approvals within your organisation. Internally we have a process we call "Customer Lockbox" for accessing the data. Consider doing something similar internally, using PIM, when accessing PII.
- When controlling access to the data, don't just consider identity, but the networking control plane as well. Use Azure Monitor Private Link Scopes to deny queries from public networks and only allow traffic from trusted networks (such as internal networks / VPNs).
- Building on the last point, I would then implement secured privileged access workstations, or Azure Virtual Machines, that are heavily monitored, prevent general data exports, and are the only endpoints that can access this data. https://learn.microsoft.com/en-us/security/privileged-access-workstations/privileged-access-devices
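To make the points about normalising at source and keeping an internal User ID a bit more concrete, here is a minimal Python sketch of what that could look like before a custom event is sent anywhere. The field names (`user_id`, `email`, `display_name`, `ip_address`, `page`), the `normalise_event` function, and the choice of a keyed hash are all my own assumptions for illustration; this is not an Application Insights API, and it is not how data collection transformations work (those are defined in KQL at ingestion).

```python
import hashlib
import hmac

# Hypothetical field names: replace with whatever your front end actually logs.
PII_FIELDS = {"email", "display_name", "ip_address"}


def normalise_event(event: dict, secret: bytes) -> dict:
    """Return a copy of the event with PII removed, keyed by internal user ID."""
    if "user_id" not in event:
        # Every record needs the internal ID so it can be looked up or deleted later.
        raise ValueError("event has no internal user_id")

    # Drop the PII columns outright.
    cleaned = {key: value for key, value in event.items() if key not in PII_FIELDS}

    # Assumption: a keyed hash (HMAC-SHA256) of the email is kept so events can
    # still be grouped without storing the address itself.
    if "email" in event:
        digest = hmac.new(secret, event["email"].lower().encode("utf-8"), hashlib.sha256)
        cleaned["email_hash"] = digest.hexdigest()

    return cleaned


if __name__ == "__main__":
    raw = {
        "user_id": "00000000-0000-0000-0000-000000000001",
        "email": "Someone@example.com",
        "ip_address": "203.0.113.10",
        "page": "/checkout",
    }
    print(sorted(normalise_event(raw, secret=b"example-only-secret")))
    # → ['email_hash', 'page', 'user_id']
```

The secret would need to live somewhere controlled (a key vault, for example), because anyone holding it can test whether a given email address matches a hash.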
Ultimately, only your organisation can decide how to manage logging and PII, based on whichever regulations you need to follow. My recommendations above cover not only the strategy for collecting the data, but also protecting it.