PII detection and masking
APPLIES TO:
Azure Data Factory
Azure Synapse Analytics
This article describes a solution template that you can use to detect and mask PII data in your data flow with Azure Cognitive Services.
About this solution template
This template retrieves a dataset from an Azure Data Lake Storage Gen2 source. A derived column transformation then creates a request body, and an external call transformation sends it to Azure Cognitive Services to mask PII before the data is loaded into the destination sink.
The template contains one activity:
- Data flow to detect and mask PII data
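The derived column and external call steps of the data flow can be approximated outside Data Factory to show what is exchanged with the service. The following is a minimal sketch, not the template's actual implementation: the REST path and API version in `PII_PATH`, the `redactedText` response field, and the helper names are assumptions based on the Text Analytics PII operation, and this article does not state which version the template calls.

```python
import json
import urllib.request

# Assumed REST path for the PII entity recognition operation; the API version
# that the template itself uses is not stated in this article.
PII_PATH = "/text/analytics/v3.1/entities/recognition/pii"


def build_request_body(rows, language="en"):
    """Mirror the derived column step: wrap each row's text in a documents array."""
    return {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(rows, start=1)
        ]
    }


def extract_masked_text(response):
    """Map each document id to its masked text from a PII response payload."""
    return {doc["id"]: doc["redactedText"] for doc in response.get("documents", [])}


def call_pii_service(endpoint, key, body):
    """Mirror the external call step: POST the body and return the parsed JSON."""
    request = urllib.request.Request(
        endpoint.rstrip("/") + PII_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

With a real endpoint URL and key from the prerequisites, `extract_masked_text(call_pii_service(endpoint, key, build_request_body(rows)))` would return the masked text keyed by row id. Keeping the body construction and response parsing separate from the network call makes both easy to check without a live resource.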
This template defines three parameters:
- sourceFileSystem is the folder path where files are read from the source store. You need to replace the default value with your own folder path.
- sourceFilePath is the subfolder path where files are read from the source store. You need to replace the default value with your own subfolder path.
- sourceFileName is the name of the file that you would like to transform. You need to replace the default value with your own file name.
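As an illustration of how the three parameters relate, the sketch below joins them into a single relative path. How the template combines them internally is not described here, so the helper `build_source_path` and the sample values are hypothetical.

```python
def build_source_path(source_file_system, source_file_path, source_file_name):
    """Join the three parameter values into one slash-separated path."""
    # Strip stray slashes so the segments never produce "//" when joined.
    parts = [
        part.strip("/")
        for part in (source_file_system, source_file_path, source_file_name)
    ]
    # Skip empty segments, for example when the file sits at the top level.
    return "/".join(part for part in parts if part)


# Hypothetical values; replace them with your own folder, subfolder, and file.
print(build_source_path("raw", "/customers/2024/", "contacts.csv"))
```

Normalizing slashes up front avoids the common mistake of entering a subfolder with a leading or trailing slash and ending up with a path that does not resolve.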
Prerequisites
- An Azure Cognitive Services resource endpoint URL and key. Create a new resource if you don't already have one.
How to use this solution template
Go to the PII detection and masking template.
Create a New connection to your source store or choose an existing connection. The source store is where you want to read files from.
Create a New connection to your destination store or choose an existing connection.
Select Use this template.
You should see the following pipeline:
Select the data flow activity to see the following data flow:
Turn on Data flow debug.
Update Parameters in Debug Settings and Save.
Preview the results in Data Preview.
When the data preview results are as expected, update the Parameters.
Return to the pipeline and select Debug. Review the results and publish.