It's a hack, but give this a try:
$x = get-content c:\junk\badfile.txt -raw
$y = $x -replace "`r`n",""
[array]$z = $y -split "<end>"
$z | Out-File C:\junk\ml.txt
$c = import-csv c:\junk\ml.txt -header "h1","h2","h3","h4","h5","h6","h7","h8","h9"
This browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
This site is so wanky I have no clue if is in the right place or even if I'll be able to find it again ... New and improved is not always better, and this new Q&A forum is a perfect example of that!
Hi all, hoping I can get some help.
I would like to make a malformed csv file into a proper csv so it can be imported into excel.
There is no header.
There are no quotes on columns with multi line data. (i.e. CR/LF in the data)
Multi line data is ended with a marker of: <end>
No quotes on hex data. (first and second column)
I do believe there are just three columns that need fixing ... Column1, Column2, and column9 (the last one and is marked with <end>).
There is also a blank line between records to be removed.
Windows 7, Powershell.
Two records of example data
12345,ABCD,0,2,2,"Part Description",Bla Bla,"-1,2,1,-40",
This is a test of the ability to cleanup the file
The files have thousands of lines
And multi lines un-quoted.
<end>
ABCDE,1234,0,2,2,"Description of the part",Simple text without CR/LF,"-20,-40,1,2",
There can be a variable number of lines
Just another line of data
There is a end marker on the un-quoted multi line column.
<end>
It's a hack, but give this a try:
$x = get-content c:\junk\badfile.txt -raw
$y = $x -replace "`r`n",""
[array]$z = $y -split "<end>"
$z | Out-File C:\junk\ml.txt
$c = import-csv c:\junk\ml.txt -header "h1","h2","h3","h4","h5","h6","h7","h8","h9"
Hello @Festus Hagen !
If you like to preserve the multiline column and assuming that there aren't any ,
character in that column, you could try this:
$parentPath = Split-Path -Parent $PSCommandPath
$srcPath = Join-Path -Path $parentPath -ChildPath "malformed.csv"
$tmpPath = Join-Path -Path $parentPath -ChildPath "corrected.csv"
$dstPath = Join-Path -Path $parentPath -ChildPath "final.csv"
Get-Content -Path $srcPath -Delimiter '<end>' |
ForEach-Object {
$_ -replace '(?<=,)[^,]*$', '"$0"'
} | Set-Content -Path $tmpPath
Import-Csv -Path $tmpPath -Header $(1..9) | Export-Csv -Path $dstPath -Delimiter ';' -NoTypeInformation
Does this work for you?
Best whishes
---
Please accept and upvote this answer, if it was helpful