Thanks for reaching out to Microsoft Q&A.
It sounds like you’re encountering a common issue with schema drift in Azure Data Factory (ADF) when using the PII detection and masking feature.
- Check Schema Alignment
  - Ensure that the schema defined in ADF matches the actual structure of your text file. If there are discrepancies (like missing fields or different data types), it can lead to issues with PII detection.
  - You mentioned that your data is in a short text file. Make sure that the fields you expect to mask are correctly defined in the schema.
- Adjust Data Source Settings
  - If your text file has a different structure than expected, you may need to adjust the data source settings. This includes specifying the correct delimiters and ensuring that the headers are correctly interpreted.
- Review Request Body
  - If you’re using a request body to specify PII entities, double-check that the fields are correctly referenced. Any mismatch here can prevent the system from identifying the PII correctly.
- Debugging the Pipeline
  - Since you mentioned that the pipeline ran successfully, check the output logs for any warnings or messages that might indicate what went wrong during the PII detection phase.
  - You can also use the Data Preview feature in ADF to see how the data is being interpreted before and after masking.
- Viewing the Masked Document
  - After the pipeline runs, the masked output should be stored in the destination you specified in your pipeline configuration. Check the output dataset settings to find where the masked text document is saved.
  - If you haven’t specified an output location, you may need to set that up in your pipeline to ensure you can access the masked data.
- Testing with Sample Data
  - If possible, create a small sample text file with known PII values and test the pipeline with that. This can help you isolate whether the issue is with the data itself or the configuration.
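For the sample-data test, a small local script can make the check repeatable. The following is only a minimal sketch in Python, not part of ADF or any Azure SDK: the names `KNOWN_PII`, `write_sample_file`, and `find_unmasked` are hypothetical, and the PII values are made-up placeholders. It writes a short text file with known PII values and, once you have downloaded the masked output, reports which of those values still appear verbatim.

```python
from pathlib import Path

# Hypothetical sample values (assumption): replace with PII-like strings
# that match the entity types you expect the pipeline to mask.
KNOWN_PII = {
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "phone": "555-010-1234",
}


def write_sample_file(path):
    """Write a short text file that contains each known PII value once."""
    text = (
        f"Customer {KNOWN_PII['name']} can be reached at "
        f"{KNOWN_PII['email']} or {KNOWN_PII['phone']}.\n"
    )
    Path(path).write_text(text, encoding="utf-8")
    return text


def find_unmasked(masked_text):
    """Return the labels of known PII values that still appear verbatim."""
    return [label for label, value in KNOWN_PII.items() if value in masked_text]
```

Upload the file produced by `write_sample_file("sample.txt")` as the pipeline input, then pass the contents of the masked output to `find_unmasked`. An empty list suggests every known value was masked; any labels returned point to the entity types that were missed, which helps narrow the problem to either the data or the configuration.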
Hope this helps. Do let us know if you have any further queries.