PowerShell to clean up a file

RJ 106 Reputation points
2021-12-01T17:27:54.367+00:00

Hi there,

I'm requesting help with a PowerShell script to clean up an input file and write the result to an output file.

The contents of the input.txt file are:

abcd
cdef
["abcd"]
["ab
cd"]

["efgh"]

xyz

Expected OUTPUT.txt file

abcd
cdef
["abcd"]
["abcd"]

["efgh"]

xyz

The criterion for cleaning up the file is:
If a line contains an opening square bracket but does not end with a closing square bracket, concatenate the next line to the current line.

I'd appreciate your help with the script.

Thanks.

Windows Server PowerShell

Accepted answer
  1. Rich Matheisen 45,906 Reputation points
    2021-12-01T20:57:55.38+00:00

    Assuming you don't have a sequence like this that extends over more than two adjacent lines:

    [ab
    cd
    ef]

    Then this should work:

    $file = "c:\junk\test.txt"
    $BufferedLine = $null
    
    Get-Content $file |
        ForEach-Object {
            if ($_ -match "^[^[].*[^]]$") {
                # line doesn't begin with "[" or end with "]"
                $_                                  # -- line is okay, just return it
            }
            elseif ($_ -match "^\[.*\]$") {
                # line begins with "[" and ends with "]"
                $_                                  # -- line is okay, just return it
            }
            elseif ($_ -match "^\[.*[^]]$") {       # line begins with "[" but doesn't end with "]"
                $BufferedLine = $_                  # remember line as beginning
            }
            elseif ($_ -match "^[^[].*\]$") {
                # line doesn't begin with "[" but ends with "]"
                if ($BufferedLine) {                # AND there's a preceding line awaiting closure
                    $BufferedLine += $_             # concatenate with contents of previous line
                    $BufferedLine                   # return completed line
                    $BufferedLine = $null           # and forget the value
                }
            }
            elseif ($_.length -eq 0){
                $_
            }
        }
    if ($BufferedLine) {
        $BufferedLine                           # return the last line if necessary
    }
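
The script above assumes a bracketed value spans at most two lines and returns the lines to the pipeline rather than writing a file. As a minimal sketch of a variant that keeps buffering until a line ending in "]" arrives, something like the following might work. Join-BracketedLines is a hypothetical helper name introduced here for illustration, not part of the original answer, and it treats a line as "open" only when it starts with "[".

```powershell
# Hypothetical helper (not from the original answer): joins a line that starts
# with "[" but does not end with "]" to the lines that follow it, until a line
# ending in "]" is reached. All other lines pass through unchanged.
function Join-BracketedLines {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [AllowEmptyString()]
        [string]$Line
    )
    begin { $buffer = $null }
    process {
        if ($null -ne $buffer) {
            # An opening bracket is still awaiting closure: keep appending
            $buffer += $Line
            if ($Line.EndsWith(']')) {
                $buffer             # emit the completed line
                $buffer = $null     # and reset the buffer
            }
        }
        elseif ($Line.StartsWith('[') -and -not $Line.EndsWith(']')) {
            $buffer = $Line         # remember the start of a split line
        }
        else {
            $Line                   # line is fine as it is (including blank lines)
        }
    }
    end {
        # Emit an unterminated buffer rather than silently dropping it
        if ($null -ne $buffer) { $buffer }
    }
}
```

Assuming the input path from the answer and a placeholder output path, it could be used like this:

    Get-Content -Path "c:\junk\test.txt" | Join-BracketedLines | Set-Content -Path "c:\junk\output.txt"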
    

2 additional answers

  1. RJ 106 Reputation points
    2021-12-02T16:57:55.977+00:00

    The script below gives the expected result.

    $file = Get-Content -Path "c:\junk\test.txt" -Raw
    $file = $file -ireplace '(?<match1>\[[^\]]*)\r\n(?<match2>[^\]]*\])','${match1}${match2}'
    $file | Out-File -FilePath "C:\Junk\Results.txt"
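
The replacement above reads the whole file with -Raw and matches \r\n, so it depends on the file using CRLF line endings. As a hedged sketch, assuming the file might also use LF-only endings, the pattern below uses \r?\n and keeps each half of the match on a single line. The sample string is the input from the question; the variable names and the numbered groups are illustrative choices, not the original poster's.

```powershell
# Sample input from the question, built here with CRLF line endings
$text = 'abcd', 'cdef', '["abcd"]', '["ab', 'cd"]', '', '["efgh"]', '', 'xyz' -join "`r`n"

# Remove a line break that falls between "[" and the next "]".
# \r?\n accepts both CRLF and LF; [^\]\r\n]* stops each half at a line end.
$pattern = '(\[[^\]\r\n]*)\r?\n([^\]\r\n]*\])'
$fixed = $text -replace $pattern, '$1$2'

$fixed
```

Like the original one-liner, this joins only one line break per bracketed value, so a value split across three or more lines would need a different approach.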


  2. RJ 106 Reputation points
    2021-12-02T17:09:27.527+00:00

    To search for [" (an opening square bracket followed by a quote) anywhere in a line that does not contain "] (a quote followed by a closing square bracket):

    $file = $file -ireplace '(?<match1>\[\"[^\]]*)\r\n(?<match2>[^\]]*\"\])','${match1}${match2}'
