Use this website. Enter your values. It gives a good explanation of your matches
Unexpected regular expression groups capture

Hello,
I'm trying to get sub-captures with the following string and regular expression, and I'm surprised by the result I get. Can someone explain the result to me? Is the NFA engine doing a bad job?
I'm working in PowerShell, so I assume the .NET Framework regex engine is being used.
My string:
"p1 p2 p3,' '' ', p4"
I put it between double quotes because it contains several pairs of apostrophes: one with 3 spaces inside, immediately followed by a second one with 2 spaces inside.
This string wholly matches the following regular expression:
^(\w*)\s+(\w+)(([^' ]*)|('[^']*')|\s?)*$
Indeed:
"p1 p2 p3,' '' ', p4" -match "^(\w*)\s+(\w+)(([^' ]*)|('[^']*')|\s?)*$"
returns:
True
But:
$matches
gives:
Name Value
---- -----
5 ' '
4
3
2 p2
1 p1
0 p1 p2 p3,' '' ', p4
I would have expected many more captures. That is (not listing the 1st-level group that wraps the alternative 2nd-level groups):
p1
p2
p3,
' '
' '
,
p4
Why are "p3,", "' '", "' '" and "," not captured?
"p1 p2 p3,' '' ', p4" -match "^(\w*)\s+(\w+)(?:([^' ]*)|('[^']*')|\s?)*$"
that is, with a non-capturing 1st-level group, does not give a better result.
It simply gives one fewer (empty) submatch, which, as expected, is the 1st-level group, I guess.
Name Value
---- -----
4 ' '
3
2 p2
1 p1
0 p1 p2 p3,' '' ', p4
Am I missing the equivalent of the "global" flag of ECMAScript RegExp objects?
However, I get the same result using JavaScript.
So what's wrong?
Thanks for the help.
2 answers
Bernard BOROWSKI 1 Reputation point
2022-05-07T17:38:03.593+00:00
Thanks for pointing out the website. It was not as useful, though, as the information I was missing, which is the "Grouping Constructs in Regular Expressions" explanation on the .NET reference site.
It gives the rule that applies when a quantifier is applied to a capturing group: the corresponding match item holds only the last capture. It also shows the way to get the "previous" captures for that group.
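For anyone landing here later, here is a minimal sketch of what that rule looks like in PowerShell. It reuses the string and the second pattern from the question as they are displayed above. The variable names ($m, $unquoted, $quoted, $tokens) are just illustrative, and the tokenizer pattern at the end is my own assumption, not something taken from the .NET article. The idea, as I understand it, is that each Group object has a Captures collection keeping every capture made while the quantified group repeated, whereas $matches (and Group.Value) only shows the last one.

```powershell
# String and pattern as displayed in the question.
$text    = "p1 p2 p3,' '' ', p4"
$pattern = "^(\w*)\s+(\w+)(?:([^' ]*)|('[^']*')|\s?)*$"

# Use the .NET API directly so that the full Match object is available.
$m = [regex]::Match($text, $pattern)

# Groups[n].Value is only the LAST capture of a quantified group.
# Groups[n].Captures keeps every capture made across the repetitions.
$unquoted = @($m.Groups[3].Captures | ForEach-Object { $_.Value })
$quoted   = @($m.Groups[4].Captures | ForEach-Object { $_.Value })

# Brackets make empty captures visible in the output.
"Unquoted pieces: " + (($unquoted | ForEach-Object { "[$_]" }) -join ' ')
"Quoted pieces:   " + (($quoted   | ForEach-Object { "[$_]" }) -join ' ')

# Rough equivalent of a "global" search: [regex]::Matches returns every
# successive match of a pattern. The pattern below is a hypothetical
# tokenizer (a quoted run, or a run of characters that are neither
# apostrophes nor spaces), written only for this illustration.
$tokens = @([regex]::Matches($text, "'[^']*'|[^' ]+") | ForEach-Object { $_.Value })
$tokens
```

With the string as displayed above, the last part lists the seven tokens p1, p2, p3, the two quoted pieces, the comma and p4. Note that the Captures collections may also contain empty entries, because `[^' ]*` is allowed to match nothing.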