Share via

How to anonymise a text?

Anonymous
2024-07-18T14:30:49+00:00

I have a huge amount of text where I need to get rid of any mention of names, universities, organisations and gender. is there any way of doing this quicker then going through every sentence manually because my brain is struggling to handle it.

Microsoft 365 and Office | Excel | For home | MacOS

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments

3 answers

Sort by: Most helpful
  1. Kevin Jones 7,265 Reputation points Volunteer Moderator
    2024-07-18T23:44:03+00:00

    Here is a function that obfuscates a range of cells. To use, set SourceArray (the first parameter in the first LET function) to the range of cells to obfuscate. The function returns an array of obfuscated values the same size as the source array. Alpha characters are converted to random alpha characters, numeric characters are converted to random numeric characters, and all punctuation characters are left untouched. Alpha character casing is retained. If used in a table, SourceArray must a single cell or a #SPILL! error will occur.

    =LET(

    SourceArray, B2:B4,

    UpperVowels, "AEIOU",

    LowerVowels, LOWER(UpperVowels),

    UpperConsonants, "BCDFGHJKLMNPQRSTVWXYZ",

    LowerConsonants, LOWER(UpperConsonants),

    Numbers, "0123456789",

    Result, MAKEARRAY(ROWS(SourceArray), 1, LAMBDA(Row,Column,

    LET( 
    
      SourceValue, INDEX(SourceArray, Row), 
    
      TEXTJOIN( 
    
        "", 
    
        FALSE, 
    
        MAKEARRAY(1, LEN(SourceValue), LAMBDA(Row,Column, 
    
          LET( 
    
            Character, MID(SourceValue, Column, 1), 
    
            SWITCH(TRUE, 
    
              NOT(ISERROR(FIND(Character, UpperVowels))), MID(UpperVowels, RANDBETWEEN(1, LEN(UpperVowels)), 1), 
    
              NOT(ISERROR(FIND(Character, LowerVowels))), MID(LowerVowels, RANDBETWEEN(1, LEN(LowerVowels)), 1), 
    
              NOT(ISERROR(FIND(Character, UpperConsonants))), MID(UpperConsonants, RANDBETWEEN(1, LEN(UpperConsonants)), 1), 
    
              NOT(ISERROR(FIND(Character, LowerConsonants))), MID(LowerConsonants, RANDBETWEEN(1, LEN(LowerConsonants)), 1), 
    
              NOT(ISERROR(FIND(Character, Numbers))), MID(Numbers, RANDBETWEEN(1, LEN(Numbers)), 1), 
    
              Character 
    
            ) 
    
          ) 
    
        )) 
    
      ) 
    
    ) 
    

    )),

    IFERROR(VALUE(Result), Result)

    )

    An example:

    Before After
    Now is the time. Hed ob mfu mico.
    San Francisco, CA  95000 Lef Byiddonfa, MI  61820
    $123,456.00 $193,035.00

    Kevin

    Was this answer helpful?

    1 person found this answer helpful.
    0 comments No comments
  2. Kevin Jones 7,265 Reputation points Volunteer Moderator
    2024-07-19T00:01:41+00:00

    Here is a version that handles dates:

    =LET(

    SourceArray, B2:B6,

    UpperVowels, "AEIOU",

    LowerVowels, LOWER(UpperVowels),

    UpperConsonants, "BCDFGHJKLMNPQRSTVWXYZ",

    LowerConsonants, LOWER(UpperConsonants),

    Numbers, "0123456789",

    Result, MAKEARRAY(ROWS(SourceArray), 1, LAMBDA(Row,Column,

    LET( 
    
      SourceValue, INDEX(SourceArray, Row), 
    
      IF( 
    
        SourceValue = "", 
    
        "", 
    
        IF( 
    
          CELL("format", SourceValue) = "D4", 
    
          DATE(YEAR(SourceValue) + RANDBETWEEN(-10, 10), MONTH(SourceValue) + RANDBETWEEN(-12, 12), DAY(SourceValue) + RANDBETWEEN(-31, 31)), 
    
          TEXTJOIN( 
    
            "", 
    
            FALSE, 
    
            MAKEARRAY(1, LEN(SourceValue), LAMBDA(Row,Column, 
    
              LET( 
    
                Character, MID(SourceValue, Column, 1), 
    
                SWITCH(TRUE, 
    
                  NOT(ISERROR(FIND(Character, UpperVowels))), MID(UpperVowels, RANDBETWEEN(1, LEN(UpperVowels)), 1), 
    
                  NOT(ISERROR(FIND(Character, LowerVowels))), MID(LowerVowels, RANDBETWEEN(1, LEN(LowerVowels)), 1), 
    
                  NOT(ISERROR(FIND(Character, UpperConsonants))), MID(UpperConsonants, RANDBETWEEN(1, LEN(UpperConsonants)), 1), 
    
                  NOT(ISERROR(FIND(Character, LowerConsonants))), MID(LowerConsonants, RANDBETWEEN(1, LEN(LowerConsonants)), 1), 
    
                  NOT(ISERROR(FIND(Character, Numbers))), MID(Numbers, RANDBETWEEN(1, LEN(Numbers)), 1), 
    
                  Character 
    
                ) 
    
              ) 
    
            )) 
    
          ) 
    
        ) 
    
      ) 
    
    ) 
    

    )),

    IFERROR(VALUE(Result), Result)

    )

    Kevin

    Was this answer helpful?

    0 comments No comments
  3. Anonymous
    2024-07-18T18:05:28+00:00

    Hello Morris Rock1, Thanks for using Microsoft products and posting in the community.

    About your need to anonymize private information in text;

    I see that you have placed your question in the Excel board, so if this information is a lot of repetitive information, you can just use the Filter, Replace function to accomplish this.

    • The filter and replace function is in the “Edit Area” at the beginning;
    • Or you can also use formulas to accomplish this operation: =IF(ISNUMBER(SEARCH("John", A1)), "Person A", A1)

    This will look for “John” in cell A1 and replace it with “Person A” if found. Please modify the keywords to suit your needs.

    However, if your need is to find a lot of different keyword information, I understand the amount of work involved, but there may not be an effective way to do this at the moment. Thank you for your understanding.

    Best Regards,Arthur - MSFT | Microsoft Community Support Specialist

    Was this answer helpful?

    0 comments No comments