How to remove specific characters from a variable in PowerShell

Stephen Peterson 36 Reputation points
2021-11-12T15:29:31.44+00:00

I have a CSV file and want to replace certain characters by using a hash table.

Here is the hash table
$hash = @{}
$hash."'" = ''
$hash."&" = ''
$hash."(" = ''
$hash.")" = ''
$hash." " = ''
$hash."/" = ''
$hash."\" = ''
$hash."Â" = ''
$hash."– = ''

And here is the loop that mostly works:
Foreach ($key in $hash.Keys) {
    $update = $update.Replace($key, $hash.$key)
}

The problem is with the last two entries in the hash table. Those symbols are not being found and are not being replaced. The other entries work. (The A has a caret, that is, a circumflex, over it.)

Any idea how I can detect and replace those elements and others like them?

I can't use the standard regex
$pattern = '[^a-zA-Z]'
$update = $update -replace $pattern, ''
because it takes out @ and ? and other characters I need.

Thanks for the help!
Steve

Windows for business Windows Server User experience PowerShell

Accepted answer
  1. Rich Matheisen 47,901 Reputation points
    2021-11-14T20:48:08.817+00:00

    If I've interpreted your problem correctly, this should result in (mostly) legitimate SMTP addresses:

    # Assuming no quoted strings are used in the addresses
    # and no comments are used in the addresses
    # and no "international" characters are used in the addresses
    # domain isn't expressed as an IP address "[xxx.xxx.xxx.xxx]"
    # THESE are what's valid in an SMTP address. Note that the range is INVERTED by the "^" at the beginning
    
    $InvertedValidCharactersRange = "[^A-Za-z0-9!#$%&'*+-/=?^_``{|}~.(),:;<>@[\]-]+"
    
    Import-Csv c:\Junk\BadIdea.csv -Encoding UTF8 |
        ForEach-Object{
            $_.Value -replace $InvertedValidCharactersRange, "" # remove everything that's NOT a valid character
        }
    

    NOTE: There is no attempt to actually validate the SMTP address. This is purely a character deletion bit of code!
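
As a quick check of what the inverted range keeps and drops, here is a small sketch. The sample address is made up, and the non-ASCII characters are built from code points so the result does not depend on how the script file itself is encoded.

```powershell
# Same pattern as above: matches runs of characters that are NOT in the allowed set.
$InvertedValidCharactersRange = "[^A-Za-z0-9!#$%&'*+-/=?^_``{|}~.(),:;<>@[\]-]+"

# Hypothetical sample: an en dash (U+2013), a no-break space (U+00A0) and a plain space.
$sample = 'john' + [char]0x2013 + 'doe' + [char]0x00A0 + ' @example.com'

# Delete every run of characters outside the allowed set.
$clean = $sample -replace $InvertedValidCharactersRange, ''
$clean   # johndoe@example.com
```

Note that this pattern keeps ', &, (, ) and /, which the hash table in the question removes, because those characters are in the allowed set.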


5 additional answers

  1. MotoX80 36,291 Reputation points
    2021-11-12T17:02:48.773+00:00

    I have a CSV file

    Probably an encoding problem. Try this.

    $update = Get-Content -Encoding UTF8 -Path "c:\temp\test1.csv" -raw    
    $update  
      
    $hash = @{}  
    $hash."'" = ''  
    $hash."&" = ''  
    $hash."(" = ''  
    $hash.")" = ''  
    $hash." " = ''  
    $hash."/" = ''  
    $hash."\" = ''  
    $hash."Â" = ''  
    $hash."– = ''  
      
    Foreach ($key in $hash.Keys) {  
        $update = $update.Replace($key, $hash.$key)  
    }  
    ""  
    $update  
      
    $update | Out-File "c:\temp\test2.csv"     
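
To see why encoding matters here, the sketch below decodes the UTF-8 bytes of a no-break space with the wrong encoding. It assumes, without confirmation from the original file, that the stray "Â" comes from a no-break space (U+00A0); the variable names are only illustrative.

```powershell
# UTF-8 stores U+00A0 (no-break space) as the two bytes C2 A0.
$nbsp  = [string][char]0x00A0
$bytes = [System.Text.Encoding]::UTF8.GetBytes($nbsp)
$hex   = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex   # C2 A0

# Decoding those bytes as ISO-8859-1 instead of UTF-8 yields two characters:
# U+00C2 (A with circumflex) followed by U+00A0.
$latin1  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$garbled = $latin1.GetString($bytes)
$garbled.Length        # 2
[int]($garbled[0])     # 194 (0xC2)

# Decoding with the right encoding gives back the single original character.
$roundTrip = [System.Text.Encoding]::UTF8.GetString($bytes)
$roundTrip.Length      # 1
```

If the file is read with the correct encoding, the "Â" never appears, so there is nothing left to replace except the no-break space itself.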
    
      
    

  2. Stephen Peterson 36 Reputation points
    2021-11-12T18:15:15.42+00:00

    After I run the file through the script and type the output file at the command prompt, some of these characters show up as question marks.
    I may be able to run it through the script again to see if I can eliminate the "?".

    I'll let you know how that goes.

    Steve


  3. Rich Matheisen 47,901 Reputation points
    2021-11-12T20:45:15.767+00:00

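    Your problem seems to be with the interpretation of the characters. In addition, if you switch to the -replace operator, the characters "(", ")", and "\" all need to be escaped because they're regular expression metacharacters (the String.Replace() method used in your loop treats them literally).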

    $hash = [ordered]@{
        "'"  = ''
        "&"  = ''
        "\(" = ''
        "\)" = ''
        " "  = ''
        "/"  = ''
        "\\" = ''
        "Â"  = ''   # ALT + 0194
        "Æ"  = ''   # ALT + 0198
    }
    
    "1'3","1&3","1(3","1)3","1 3","1/3","1\3","1Â3","1Æ3" |
        ForEach-Object{
            $update = $_
            Foreach ($key in $hash.Keys) {
                $update = $update -Replace $key, "$($hash.$key)"
            }
            "`$update = $update"
        }
    

    That bit of code produced this output:

    $update = 13
    $update = 13
    $update = 13
    $update = 13
    $update = 13
    $update = 13
    $update = 13
    $update = 13
    $update = 13
    

    The character "Æ" (which I'm assuming is the last key in your hash) appears in your example as $hash."–. On my machine the value 0xC6 is showing as "Æ". If that's incorrect, can you tell me the hex value (or the decimal value, or the Unicode code point) for the character?
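
One way to answer that question is to list the code point of every non-ASCII character in a value. This is a minimal sketch; `$value` is a hypothetical sample standing in for a field from the CSV.

```powershell
# Hypothetical sample containing an en dash (U+2013) and a no-break space (U+00A0).
$value = 'a' + [char]0x2013 + 'b' + [char]0x00A0 + 'c'

# Report each character above the ASCII range in U+XXXX form.
$codePoints = foreach ($ch in $value.ToCharArray()) {
    if ([int]$ch -gt 127) {
        'U+{0:X4}' -f [int]$ch
    }
}
$codePoints   # U+2013 and U+00A0, one per line
```

Once the code point is known, a key can be written as `[string][char]0x2013` rather than pasting the character into the script, which avoids depending on the encoding of the script file.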


  4. Stephen Peterson 36 Reputation points
    2021-11-12T21:12:39.83+00:00

    Here is the code I'm using:

    $idfixfile = Import-Csv -encoding UTF8 $home\Documents\orignal.csv
    $outfile = "$home\documents\fixed_file.csv"
    $output = @()
    $lastoutput = @()
    
    $hash = @{}
    $hash."'" = ''
    $hash."&" = ''
    $hash."(" = ''
    $hash.")" = ''
    $hash." " = ''
    $hash."/" = ''
    $hash."\" = ''
    $hash."Â" = ''
    $hash."– = ''
    $hash." " = ''
    $hash."ΓÇô" = ''
    $hash."-" = ''
    
    Foreach ($entry in $idfixfile){
        $distinguishedname=$entry.distinguishedname
        $objectclass = $entry.objectclass
        $attribute = $entry.attribute
        $newerror = $entry.error
        [string]$oldvalue = $entry.value
        $update = $oldvalue
    
        Foreach ($key in $hash.Keys) {
            $update = $update.Replace($key, $hash.$key)
        }
    
        $obj = new-object psobject
        $obj | Add-Member NoteProperty 'distinguishedname' $distinguishedname
        $obj | Add-Member NoteProperty 'objectclass' $objectclass
        $obj | Add-Member NoteProperty 'attribute' $attribute
        $obj | Add-Member NoteProperty 'error' $newerror
        $obj | Add-Member NoteProperty 'value' $oldvalue
        $obj | Add-Member NoteProperty 'update' $update
        $output += $obj
    }
    
    $output | export-csv -Encoding ASCII $outfile -nti
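
A condensed sketch of the same loop follows. It builds the removal list from code points rather than pasted characters and applies it as a single escaped regular expression. The code points chosen for the no-break space and the en dash are assumptions about what the file contains, and the sample rows stand in for the imported CSV. Exporting with `-Encoding ASCII` writes any remaining non-ASCII character as "?", which may be where the question marks come from, since the `value` column in the script above still holds the unmodified text.

```powershell
# Characters to strip. The last three are given as code points (assumed, not confirmed):
# U+00C2 (A with circumflex), U+00A0 (no-break space), U+2013 (en dash).
$remove  = @("'", '&', '(', ')', ' ', '/', '\', '-')
$remove += 0x00C2, 0x00A0, 0x2013 | ForEach-Object { [string][char]$_ }

# Escape each entry so regex metacharacters are treated literally, then join as alternatives.
$pattern = ($remove | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Hypothetical rows standing in for the Import-Csv output.
$rows = @(
    [pscustomobject]@{ value = "O'Brien (Sales)" + [char]0x00A0 + '@contoso.com' }
    [pscustomobject]@{ value = 'a' + [char]0x2013 + 'b/c\d?@contoso.com' }
)

# -creplace is case-sensitive, like the String.Replace() call in the original loop.
$output = foreach ($row in $rows) {
    [pscustomobject]@{
        value  = $row.value
        update = $row.value -creplace $pattern, ''
    }
}
$output.update   # OBrienSales@contoso.com and abcd?@contoso.com

# To keep non-ASCII text in the 'value' column intact on export:
# $output | Export-Csv -Encoding UTF8 -NoTypeInformation -Path $outfile
```

The "@" and "?" characters survive because they are not in the removal list, which was the concern with the `[^a-zA-Z]` pattern in the question.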
    
