How to run PowerShell as a batch process on txt/html files with regex (search and replace)?

Suzana Eree 811 Reputation points
2021-03-10T16:24:19.33+00:00

Hello. I have several regexes to run on multiple HTML files in Folder1. I need to run several regex search-and-replace operations, for example:

SEARCH: (?-s)(".+?") REPLACE BY: $0
SEARCH: (^.*?)=(.*$) Replace by: \1\r\n\2
SEARCH: ^.(.*)$ REPLACE BY: \1

I made a PowerShell script and added those 3 regex search-and-replace formulas, but it is not working. Can anyone help me?

$sourceFiles = Get-ChildItem 'c:\Folder1'  
$destinationFolder = 'c:\Folder1'
foreach ($file in $sourceFiles) {
$sourceContent = Get-Content $file.FullName -Raw

$contentToInsert = [regex]::match($sourceContent,"(?-s)(".+?")").value
$destinationContent = Get-Content $destinationFolder\$($file.Name) -Raw
$destinationContent = $destinationContent -replace '$0',$contentToInsert

$contentToInsert = [regex]::match($sourceContent,"(^.*?)=(.*$)").value
$destinationContent = Get-Content $destinationFolder\$($file.Name) -Raw
$destinationContent = $destinationContent -replace '\1\r\n\2',$contentToInsert

$contentToInsert = [regex]::match($sourceContent,"^.(.*)$").value
$destinationContent = Get-Content $destinationFolder\$($file.Name) -Raw
$destinationContent = $destinationContent -replace '\1',$contentToInsert

Set-Content -Path $destinationFolder\$($file.Name) -Value $destinationContent -Encoding UTF8
} #end foreach file
Windows Server PowerShell

Accepted answer
  1. Suzana Eree 811 Reputation points
    2021-03-13T21:42:55.037+00:00

    Got it. This is the solution:

    $path = 'c:\Folder1\file1.html'
    $result = 'c:\Folder1\result.html'
    Get-Content -Path $path | ForEach-Object{ 
        $one = $_ -replace '(?<=<li>)\s+','CARPET' #replace First Regex with the word CARPET
        $two = $one -replace 'CARPET','DOOR' #replace the word CARPET with DOOR
        ($three = $two -replace 'DOOR','BEAUTIFUL') | Out-File -FilePath $result -Append #replace the word DOOR with BEAUTIFUL
        "Final = $three"
    }
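The accepted snippet handles a single file. For the original goal (several regexes, in order, over every HTML file in a folder), a minimal sketch could look like the following. The temp folder, the sample file, and the two rules are assumptions made up for illustration, not part of the thread; the sketch overwrites files in place, so try it on a copy first.

```powershell
# Hypothetical demo folder and file, created only so the sketch is self-contained.
$folder = Join-Path ([System.IO.Path]::GetTempPath()) 'Folder1-demo'
New-Item -ItemType Directory -Path $folder -Force | Out-Null
Set-Content -Path (Join-Path $folder 'file1.html') -Value '<li>   <a href="#">x</a></li>'

# Ordered rules (made up): each entry is a search pattern and its replacement.
$rules = @(
    @{ Search = '(?<=<li>)\s+'; Replace = '' },
    @{ Search = 'x';            Replace = 'Page 1' }
)

foreach ($file in Get-ChildItem -Path $folder -Filter '*.html') {
    # Read the whole file once as a single string.
    $content = Get-Content -Path $file.FullName -Raw
    # Apply every rule to the result of the previous one.
    foreach ($rule in $rules) {
        $content = $content -replace $rule.Search, $rule.Replace
    }
    # Write the accumulated result back once.
    Set-Content -Path $file.FullName -Value $content -Encoding UTF8 -NoNewline
}
```

Keeping the rules in a list means a new search-and-replace step is one extra line, and the order of the list is the order of execution.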
    

10 additional answers

  1. Rich Matheisen 46,896 Reputation points
    2021-03-10T20:19:34.837+00:00

    Where is $destinationcontent? I don't see it defined anywhere -- or assigned any value!

    Your regex on line #6 doesn't work. It won't even run! You have a double-quoted string that contains double-quotes. Change the opening and closing double-quotes to apostrophes (single-quotes).

    The regex on line #9 looks suspect. It will match "shortest possible nothing" at the beginning of the line, followed by an equal sign, followed (possibly) by nothing. IOW it will happily match "=abc" or even "=". If you want "something" to be there, move the "^" outside the 1st group and change the pattern in the 1st group to ".+?". If you expect to always find "something" after the "=", change the 2nd pattern to ".+?$".

    The regex on line #12 looks like you intend to remember the 2nd thru last characters on the line. But the pattern in the group will also match nothing. Is that what you intended?
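The quoting fix and the "matches nothing" concern above can be checked quickly in a console. This is only a sketch; the sample strings are made up for illustration.

```powershell
# Single-quoted pattern, so the embedded double quotes need no escaping.
$quoted = '(?-s)(".+?")'
$sample = 'title="Page 1" href="x"'      # hypothetical sample text
$first  = [regex]::Match($sample, $quoted).Value

# The pattern from the question accepts empty text on both sides of "=".
$loose  = '(^.*?)=(.*$)'
# The stricter variant suggested above: at least one character on each side.
$strict = '^(.+?)=(.+?$)'

$looseOnBareEquals  = '='   -match $loose    # True: both groups may be empty
$strictOnBareEquals = '='   -match $strict   # False: nothing before or after "="
$strictOnPair       = 'a=b' -match $strict   # True
```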


  2. Rich Matheisen 46,896 Reputation points
    2021-03-10T22:50:16.847+00:00

    The only regex that won't actually work is the one on line #6. It's an easy fix to make it work, though. The other regexes may produce what you want, but they also have the possibility of producing results you don't want. The choice of fixing them is, however, up to you.

    Why are you constantly reloading the $destinationcontent variable? You're never saving the result of modifying it, except for the very last modification. Shouldn't you get the contents of each file (which you do on line #4), run each regex, and then save the accumulated modifications???
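A minimal sketch of the "run each regex, then save the accumulated result" idea, using one variable that each step overwrites. The input string and the second rule are assumptions for illustration. Note that .NET replacement strings refer to groups as $1 and $2 rather than \1 and \2, and that a CR/LF has to be written as `r`n inside a double-quoted string.

```powershell
$text = 'name=value'     # stand-in for one file's content

# Split at the first "=": the groups go on separate lines.
# The replacement is double-quoted so `r`n expands, and the dollar signs
# are escaped with backticks so PowerShell passes $1 and $2 to the regex engine.
$text = $text -replace '^(.*?)=(.*)$', "`$1`r`n`$2"

# The next rule works on the result of the previous one:
# drop the first character of the string.
$text = $text -replace '^.', ''
```

Saving `$text` once after the last step, for example with Set-Content, keeps all modifications instead of only the final one.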

    I see that you're loading the contents of $sourceContent using Get-Content with the "-Raw" switch. In your 1st regex you explicitly turn off "single line mode". Is it your intent to match only the data on the 1st line? Turning off single-line mode means that the "." matches any character except "\n".

    You also use "[regex]::match" in all your code. Are you only interested in finding the 1st match? What if there are more?
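The difference between the first match and all matches can be sketched as below; the sample string is made up. As a side note, single-line mode is off by default in .NET regexes, so a pattern behaves the same with or without a leading (?-s).

```powershell
$sample  = 'href="a.html" title="A"'     # hypothetical sample text
$pattern = '".+?"'

# [regex]::Match returns only the first match.
$firstOnly = [regex]::Match($sample, $pattern).Value

# [regex]::Matches returns every match.
$all = [regex]::Matches($sample, $pattern) | ForEach-Object { $_.Value }

# The -replace operator also acts on every match, not just the first.
$replaced = $sample -replace $pattern, '"X"'
```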


  3. Suzana Eree 811 Reputation points
    2021-03-11T09:58:11.497+00:00

    @Rich Matheisen
    I don't know PowerShell very well; I am a beginner. So I adapted some old code to what I needed.

    Basically, I have several regex search-and-replace operations to run on several html files in the same folder. I want to run them all at once, in order. For example (but it can be any regex):

    1. Search: (?-s)(".+?") Replace by: Anything
    2. Search: (^.*?)=(.*$) Replace by: \1
    3. Search: ^.(.*)$ Replace by: \1

    So this is what I was trying to do with PowerShell. I want to combine several regex search-and-replace steps in order to modify several files.

    Maybe my PowerShell code is not very good, but at least I tried one variant.

    Can you, or anybody else, write a better PowerShell script, or update mine?


  4. Suzana Eree 811 Reputation points
    2021-03-11T20:38:45.623+00:00

    The text from file.html

    <ul id="sidebarNavigation">
    <li><a href="https://mywebsite.com/page-1.html" title="Page 1">Page 1 (34)</a></li>
    <li> <a href="https://mywebsite.com/page-2.html" title="Page 2">Page 2 (29)</a></li>
    <li><a href="https://mywebsite.com/page-3.html" title="Page-3">Page 3 (11)</a></li>
    </ul>

    Next, I have to run 2 regexes on this example (but there can be many more). They must be run in this order:

    First regex: (this will delete all the space after <li> )

    SEARCH: (?<=<li>)\K\h*

    Replace: ( leave empty)

    Second regex: (this will add an empty space at the beginning of every line and one space at the end of each line)

    SEARCH: (^\h*$)|(^)|((?<!")$)

    Replace: \x20
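The patterns above use \K and \h, which are PCRE-style constructs (as found in editors such as Notepad++) and are not supported by the .NET regex engine that PowerShell uses. A rough PowerShell sketch of the same two steps is shown below. It is one possible interpretation, not a verified equivalent: [ \t] stands in for \h, the lookbehind alone replaces \K, the second rule is rewritten as plain per-line logic, and the sample lines are made up.

```powershell
# Hypothetical sample lines: one list item and one empty line.
$lines = @(
    '<li> <a href="p2.html" title="Page 2">Page 2</a></li>',
    ''
)

$result = foreach ($line in $lines) {
    # Rule 1: drop spaces or tabs right after <li>.
    $line = $line -replace '(?<=<li>)[ \t]+', ''

    # Rule 2, following the three alternatives of the original pattern:
    if ($line -match '^[ \t]*$') {
        ' '                      # blank line becomes a single space
    } elseif ($line.EndsWith('"')) {
        " $line"                 # line ends with a quote: leading space only
    } else {
        " $line "                # otherwise: space at both ends
    }
}
```

Working line by line avoids having to reason about how ^ and $ interact with CR/LF line endings in a multi-line string.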

