Is it possible to delete ANSI characters such as Ââ in multiple UTF-8 files with Powershell?

Nicu F 61 Reputation points
2022-09-23T20:08:30.92+00:00

Hello, I have a lot of ANSI characters such as Â|â in multiple UTF-8 files. How can I delete them?

With Notepad++, I tried a regex find and replace in Find in Files, but I did not succeed, because the files are in UTF-8.

In UTF-8, Â and â look like this, and I cannot copy these symbols to make the replacement:

244377-image.png

Windows for business | Windows Server | User experience | PowerShell

Accepted answer
  1. Rich Matheisen 47,901 Reputation points
    2022-09-24T21:42:20.933+00:00

    It's important to understand that no matter the encoding of the file, the characters are going to be Unicode characters in PowerShell.

    I didn't find any "Â" (Latin capital letter A with circumflex) or "â" (Latin small letter a with circumflex) in the file (text.txt) you attached to one of your earlier answers. What I did find were the characters "’" (Right single quotation mark, Unicode 8217 decimal) and " " (Non-breaking space, Unicode 160 decimal).
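
A minimal sketch of why those two characters can show up as "â€™" and "Â" when the file is viewed as ANSI, assuming "ANSI" means Windows-1252 (which maps byte 0x80 to "€" and 0x99 to "™"). It only inspects the UTF-8 bytes of the two code points; the variable names are illustrative and not taken from the thread:

```powershell
$utf8 = [System.Text.Encoding]::UTF8

# UTF-8 bytes of the two characters found in the file.
$quoteBytes = $utf8.GetBytes([string][char]0x2019)   # right single quotation mark
$nbspBytes  = $utf8.GetBytes([string][char]0x00A0)   # non-breaking space

# Show the bytes in hexadecimal.
$quoteHex = ($quoteBytes | ForEach-Object { '{0:X2}' -f $_ }) -join ' '
$nbspHex  = ($nbspBytes  | ForEach-Object { '{0:X2}' -f $_ }) -join ' '
"U+2019 -> $quoteHex"   # E2 80 99
"U+00A0 -> $nbspHex"    # C2 A0

# An ANSI (Windows-1252) viewer shows one character per byte. For bytes in
# the range 0xA0-0xFF that character has the same code point as the byte value.
$leadOfQuote = [char]$quoteBytes[0]   # byte 0xE2 = decimal 226
$leadOfNbsp  = [char]$nbspBytes[0]    # byte 0xC2 = decimal 194
"{0} {1}" -f [int]$leadOfQuote, [int]$leadOfNbsp   # 226 194
```

Code points 226 and 194 are "â" and "Â", so those letters are most likely just the first byte of a multi-byte UTF-8 sequence displayed on its own, not characters stored in the file.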

    Here's some code that makes replacing the extended-ASCII codes a little easier, without stringing together long sequences of (.NET) Replace or PowerShell -creplace calls (it's important to do the comparison in a case-sensitive manner). Just add the decimal value (cast as a 'char') to the $ExtendedAsciiReplacements hash as a key, and provide the character you want to use as a replacement as that key's value.

    # decimal code points of Unicode characters  
    $ExtendedAsciiReplacements = @{  
        ([char]160)     = " "    # Non-breaking space  
        ([char]194)     = "A"    # Â = Latin capital letter A with circumflex  
        ([char]226)     = "a"    # â = Latin small letter a with circumflex      
        ([char]8216)    = "'"    # ‘ = Left single quotation mark      
        ([char]8217)    = "'"    # ’ = Right single quotation mark      
        ([char]8220)    = '"'    # “ = Left double quotation mark      
        ([char]8221)    = '"'    # ” = Right double quotation mark      
    }  
      
    $x = Get-Content c:\junk\text.txt -Raw  
    # Pre-size the list to the number of characters in the text  
    $Replacement = [System.Collections.ArrayList]::new($x.Length)  
    for ($i = 0; $i -le ($x.Length - 1); $i++){  
        # To list every character whose value lies above decimal 127 (i.e., in the  
        # extended ASCII range) together with its position, uncomment the 3 lines below.  
        # To replace such a character, add it to the $ExtendedAsciiReplacements hash.  
        # Look up the characters in the Unicode code point charts found on the web.  
    #    if ([int][char]$x[$i] -gt 127){  
    #        Write-Host "Found $($x[$i]) ($([int][char]$x[$i])) at position $i"  
    #    }  
        # stop uncommenting lines  
        if ( $ExtendedAsciiReplacements.ContainsKey($x[$i]) ){  
            $Replacement.Add($ExtendedAsciiReplacements[$x[$i]]) | Out-Null  
        }  
        else {  
            $Replacement.Add($x[$i]) | Out-Null  
        }  
    }  

    # Join the collected characters back into a single string  
    $Result = $Replacement -join ''  
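
The same replacement idea could also be packaged as a reusable function. The sketch below is only an illustration: the function name `Convert-ExtendedCharacters` is hypothetical, the map is a subset of the hash above, and a `StringBuilder` is swapped in for the `ArrayList` so the result comes back as one string.

```powershell
# Subset of the hash above: keys are [char], values are replacement strings.
$ExtendedAsciiReplacements = @{
    ([char]160)  = ' '   # non-breaking space
    ([char]8216) = "'"   # left single quotation mark
    ([char]8217) = "'"   # right single quotation mark
    ([char]8220) = '"'   # left double quotation mark
    ([char]8221) = '"'   # right double quotation mark
}

# Hypothetical helper (the name is an assumption, not part of the answer).
function Convert-ExtendedCharacters {
    param(
        [string]$Text,
        [hashtable]$Map
    )
    $builder = [System.Text.StringBuilder]::new($Text.Length)
    foreach ($ch in $Text.ToCharArray()) {
        if ($Map.ContainsKey($ch)) {
            [void]$builder.Append($Map[$ch])   # mapped: append the replacement
        }
        else {
            [void]$builder.Append($ch)         # unmapped: keep the character
        }
    }
    $builder.ToString()
}

# Build a sample with [char] casts so the script itself stays pure ASCII.
$sample = 'you' + [char]8217 + 're' + [char]160 + 'home'
$fixed  = Convert-ExtendedCharacters -Text $sample -Map $ExtendedAsciiReplacements
$fixed   # you're home
```

Because the hash keys are `[char]` values rather than strings, the lookup is exact, so "Â" (194) and "â" (226) would remain distinct keys if they were added.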
    

10 additional answers

  1. Nicu F 61 Reputation points
    2022-09-24T09:44:50.773+00:00

    Sure, @Andreas Baumgarten @Logan Owen

    I've attached the file with the text. You can open it with Notepad++. Remember that it is UTF-8. You will see the characters only if you choose to view it in ANSI mode (Menu - Encoding - ANSI).

    244475-text.txt


  2. Nicu F 61 Reputation points
    2022-09-24T12:32:03.303+00:00

    @Andreas Baumgarten

    I don't know for sure; I cannot see it well in PowerShell. I copied the text from the PowerShell output into a text file, and nothing had changed.

    Can you extend the code so that it saves all the text files in a folder with the replacements applied?

    I need to test more .txt files to be sure your solution works.


  3. Nicu F 61 Reputation points
    2022-09-24T12:36:55.097+00:00

    @Andreas Baumgarten @Logan Owen

    I believe the solution needs a different approach.

    The PowerShell code must do 3 things:

    1. First, convert every text file to ANSI.
    2. Make the replacements.
    3. Save it as UTF-8 again.
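
One possible reading of these three steps, sketched under the assumption that the files are valid UTF-8, in which case decoding them as UTF-8 takes the place of an ANSI round-trip. The function name `Repair-TextFolder` and the choice of replacements (typographic quotes and non-breaking spaces) are illustrative assumptions, not something confirmed in the thread:

```powershell
# Hypothetical helper: rewrites every .txt file directly inside a folder.
function Repair-TextFolder {
    param([string]$Path)

    Get-ChildItem -LiteralPath $Path -Filter '*.txt' -File | ForEach-Object {
        # Step 1: decode the file as UTF-8 into a .NET (Unicode) string.
        $text = [System.IO.File]::ReadAllText($_.FullName, [System.Text.Encoding]::UTF8)

        # Step 2: replace typographic quotes and non-breaking spaces by code point.
        $clean = $text  -replace '[\u2018\u2019]', "'"
        $clean = $clean -replace '[\u201C\u201D]', '"'
        $clean = $clean -replace '\u00A0', ' '

        # Step 3: save as UTF-8 without a byte-order mark, only if something changed.
        if ($clean -cne $text) {
            $noBom = [System.Text.UTF8Encoding]::new($false)
            [System.IO.File]::WriteAllText($_.FullName, $clean, $noBom)
        }
    }
}
```

A call such as `Repair-TextFolder -Path C:\junk` overwrites the files in place, so it is worth trying it on a copy of the folder first.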


  4. Andreas Baumgarten 123.7K Reputation points MVP Volunteer Moderator
    2022-09-24T12:10:36.323+00:00

    Hi @Nicu F ,

    I used your text file and this script:

    $file = "Junk\text_NicuF.txt"  
    $a = Get-Content -Path $file -Raw -Encoding utf8  
    $a.Replace('â€',"'").Replace("Â","")  
    

    Result looks like this:

    Whether you’re looking to sell your home in 2020 or you’re already in your forever home, there is always something that needs to be done around the house. Home improvement projects don’t need to be expensive or complicated, as there are a variety of value-added projects for all skill levels and price ranges.<br /><br />So before you dip into your savings or take out a loan, explore tasks that can add value to your home with just a few bucks, a do-it-yourself attitude and a bit of sweat equity. Consider some of these lower-cost upgrades to improve your home’s value:<br /><br /><ul><li><strong>Add a fresh coat of paint</strong><strong>. </strong>Repainting your home’s interior is one of the most cost-effective home improvements. A freshly painted room adds value to your home by giving it a clean, updated look. New paint also allows you to fix any imperfections, including dings on the wall and nail holes.  If you are selling your home, choose neutral colors (i.e. gray or beige walls with white trim) as they appeal to the largest number of people. </li><li><strong>Enhance curb appeal</strong><strong>.</strong> Any landscaping or improvements that produce a positive first impression will pay off in the long term. Choose low-maintenance landscaping such as drought-tolerant plants and beds of mulch instead of grass. Painting your door can also make a huge impact. Go neutral with charcoal or smoky-black, or add a pop of color with a red (provided it complements the rest of your home’s exterior). </li><li><strong>Light it up.</strong> Whether you’re selling or staying, old lighting fixtures don’t do much for the value or aesthetics of your home. Replace your kitchen lights with new pendant lights, add an eye-catching chandelier in the dining area, and upgrade to a modern ceiling fan in the living area and bedrooms. These changes are not only inexpensive, but they can help a room feel larger. 
Another simple and cost-effective way to brighten your home to open and clean your windows and adding modern window treatments that allow more natural light in. </li><li><strong>Fix the flooring.</strong> If you haven’t already done so, think about replacing your carpeting with wood flooring. Most homeowners and potential buyers prefer the look of hardwood, which will benefit you if and when you decide to sell. Simply sprucing up your flooring can also add instant value. If your hardwood floors have visible scratches, stains, or other imperfections, buff out or refinish them. For carpeting, schedule a deep-cleaning. </li><li><strong>Boost the bathroom.</strong> Giving your bathroom a facelift is a DIY project you can do over a weekend. Upgrade the vanity lights and other fixtures (i.e. knobs, faucet) in favor of a more modern look, fix any leaks or drainage issues, and regrout your shower. These minor fixes will not only enhance your bathroom’s style, but it will also add value to your home. </li></ul><br /><br />For your interior painting headquarters, look no further than your friends and experts at Stanford Painting. We provide color consulting to help you pick a palette that would look spectacular in your home. Visit us today at 2330 Old Middlefield Way # 8 in the heart of Mountain View, call us at (650) 321-9302 or visit our website to request a quote for your interior paint project.<strong>  
    

    ----------

    (If the reply was helpful please don't forget to upvote and/or accept as answer, thank you)

    Regards
    Andreas Baumgarten

