How can I maintain unicode characters reading from file then writing to a variable for email $body...

Matt W 21 Reputation points
2022-08-29T19:35:22.35+00:00

I'm trying to read from a .txt file containing unicode characters and use that info in a $body variable to be used as part of an email.

The source file has text that looks similar to this...

TrÐn ÐÐc

but it is read then written to the variable as Tr?n ??c

How can I get those unicode characters to display properly when saved in the variable?

The file is read with $content = Get-Content -path "$source\MyCo-web-application\MyCo-web-application-commits.log"

At this point, the $body variable will contain some data so it is appended with/by...

$body = CopyToBody -content $content -body $body

The CopyToBody function looks like this...

function CopyToBody
{
    param ([string[]] $content, [string] $body)

    foreach ($line in $content)
    {
        $body = $body + "`r`n" + $line
    }

    return $body
}

Is there any way to preserve the unicode characters so the email body looks like the original log file?

Windows Server PowerShell

Accepted answer
  1. Andreas Baumgarten 95,021 Reputation points MVP
    2022-08-29T20:06:24.483+00:00

    Hi @Matt W ,

    I created a text file with this content:

    Just a test  
    TrÐn ÐÐc  
    3rd line  
    

    And I get the correct content using Get-Content this way:

    $body = ""  
    $content = Get-Content -Path .\sample01.txt -Encoding utf8  
    foreach ($line in $content) {  
        $body = $body + "`r`n" + $line          
    }  
    return $body  
    

    Output looks like this:

    [Screenshot: 235911-image.png]
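
The read step above can be tried end to end without an existing log file. This is a minimal sketch, not the asker's script: the temp file name, the sample text, and the `Existing text` starting body are all placeholders. The non-ASCII characters are built from code points so the script itself stays pure ASCII.

```powershell
# Hypothetical sample file in the temp folder; path and text are placeholders.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-commits.log'

# Build a line with non-ASCII characters from code points (U+1EA7, U+0110, U+1EE9).
$name  = 'Tr' + [char]0x1EA7 + 'n ' + [char]0x0110 + [char]0x1EE9 + 'c'
$lines = @('Just a test', $name, '3rd line')

# Write the file as UTF-8 without a byte order mark.
[System.IO.File]::WriteAllLines($path, [string[]]$lines, [System.Text.UTF8Encoding]::new($false))

# Windows PowerShell 5.1 reads BOM-less files with the ANSI code page unless
# told otherwise, so the encoding is stated explicitly here.
$content = Get-Content -Path $path -Encoding UTF8

# Append all lines to an existing body in one step instead of looping.
$body = 'Existing text'
$body = $body + "`r`n" + ($content -join "`r`n")
```

Joining with `-join` gives the same result as the loop for a non-empty file and avoids rebuilding the string on every line.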

    Regards
    Andreas Baumgarten


1 additional answer

  1. Rich Matheisen 44,541 Reputation points
    2022-08-30T15:24:13.15+00:00

    If your message isn't sent as a MIME-encoded body part with a declared character set, the content will be treated as 7-bit ASCII. The Unicode characters that are causing you problems are code points that need more than one byte when encoded as UTF-8. When the message isn't constructed correctly, the recipient's mail client treats each of those bytes as if it were a separate single-byte character.
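
One way to make the character set explicit is to build the message with .NET's `System.Net.Mail.MailMessage` and set its encoding properties. This is only a sketch under assumptions: the thread does not show how the email is actually sent, and the addresses, subject, body text, and SMTP host below are placeholders.

```powershell
# Placeholder body text containing non-ASCII characters (built from code points).
$body = 'Commit by Tr' + [char]0x1EA7 + 'n ' + [char]0x0110 + [char]0x1EE9 + 'c'

# Placeholder sender, recipient, and subject.
$message = [System.Net.Mail.MailMessage]::new('sender@example.com', 'recipient@example.com', 'Commit log', $body)

# State the character set explicitly so the MIME part is labeled as UTF-8.
$message.BodyEncoding    = [System.Text.Encoding]::UTF8
$message.SubjectEncoding = [System.Text.Encoding]::UTF8

# Sending needs a reachable SMTP server; 'smtp.example.com' is hypothetical.
# $client = [System.Net.Mail.SmtpClient]::new('smtp.example.com')
# $client.Send($message)
# $message.Dispose()
```

If the script uses `Send-MailMessage` instead, that cmdlet has an `-Encoding` parameter that serves the same purpose, for example `-Encoding ([System.Text.Encoding]::UTF8)`.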
