Regex.Matches return no duplicates?

StewartBW 1,805 Reputation points
2024-07-22T18:50:26.8233333+00:00

Hello

Dim blah As MatchCollection = Regex.Matches(text, "([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,9})", RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase Or RegexOptions.Multiline)

How may I force Regex.Matches not to return duplicate items, so MatchCollection will not contain duplicates (case insensitive)?

If that is not possible, how can I remove duplicates from the MatchCollection?

Efficiency and speed are crucial :(

Thanks :)

Developer technologies | VB
Developer technologies | C#

Accepted answer
  1. Viorel 122.6K Reputation points
    2024-07-22T19:33:04.0666667+00:00

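    If you are interested in a regular expression that excludes duplicates, try this one: (?<m>([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,9}))(?!.*\k<m>)

    The negative lookahead with the \k<m> backreference rejects a match when the same text appears again later in the input, so the last occurrence of each address is the one that survives.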

    It can be compared with DistinctBy (available in .NET 6 and later):

    Dim matches = Regex.Matches(Text, "your original expression...", RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase Or RegexOptions.Multiline).DistinctBy(Function(m) m.Value, StringComparer.CurrentCultureIgnoreCase)
    

    Experiments with typical data will show which method is faster.
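
    One way to run such an experiment is with Stopwatch. This is only a rough sketch, not a rigorous benchmark: the module name, the sample string, and the repetition count are made up for illustration, so substitute text that resembles your real input. Two things to keep in mind about the lookahead pattern: "." does not match a newline unless RegexOptions.Singleline is set, so without it a duplicate on a later line goes unnoticed, and RegexOptions.Multiline only changes ^ and $, which this pattern does not use.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Diagnostics
    Imports System.Linq
    Imports System.Text.RegularExpressions

    Module DedupTiming
        ' Hypothetical sample text; replace with data typical of your workload.
        Private ReadOnly sample As String =
            String.Concat(Enumerable.Repeat("a@x.com B@y.org A@X.COM c@z.net ", 200))

        Private Const Plain As String =
            "([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,9})"

        ' Same pattern in a named group, followed by a lookahead that rejects
        ' a match if the same text occurs again later in the input.
        Private Const NoDup As String = "(?<m>" & Plain & ")(?!.*\k<m>)"

        Private Const BaseOptions As RegexOptions =
            RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase

        Sub Main()
            ' Approach 1: let the regex engine skip duplicates.
            Dim sw = Stopwatch.StartNew()
            Dim viaRegex = Regex.Matches(sample, NoDup,
                                         BaseOptions Or RegexOptions.Singleline).Count
            sw.Stop()
            Console.WriteLine($"Lookahead: {viaRegex} unique, {sw.ElapsedMilliseconds} ms")

            ' Approach 2: match everything, then deduplicate in a hash set.
            sw.Restart()
            Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
            For Each m As Match In Regex.Matches(sample, Plain, BaseOptions)
                seen.Add(m.Value)
            Next
            sw.Stop()
            Console.WriteLine($"HashSet:   {seen.Count} unique, {sw.ElapsedMilliseconds} ms")
        End Sub
    End Module
    ```

    Run it a few times and ignore the first pass, since the first call also pays for parsing the pattern and JIT compilation. The lookahead has to rescan the rest of the input for every candidate, so the gap between the two approaches is likely to grow with the length of the text.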


1 additional answer

  1. Marcin Policht 49,715 Reputation points MVP Volunteer Moderator
    2024-07-22T19:29:08.5166667+00:00

    Try the following:

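    ' The hyphen and dot are escaped inside the character classes; an unescaped "_-." is read
    ' as a reversed range and makes Regex.Matches throw an ArgumentException.
    Dim matches As MatchCollection = Regex.Matches(text, "([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,9})", RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase Or RegexOptions.Multiline)
    ' Case-insensitive set: Add ignores a value that is already present
    Dim uniqueMatches As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
    ' Regex.Matches only returns successful matches, so no Success check is needed
    For Each match As Match In matches
        uniqueMatches.Add(match.Value)
    Next
    ' Now uniqueMatches contains only unique email addresses, case-insensitively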
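
    If you need the Match objects themselves (for Index or Groups) rather than just the strings, the same HashSet can act as a filter. This is a sketch under assumptions: the module name and the input string are invented for illustration. It relies on HashSet.Add returning False for a value that is already in the set, so the first occurrence of each address is kept in its original order. Cast(Of Match) is included so that it also compiles on .NET Framework, where MatchCollection is not generic.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Linq
    Imports System.Text.RegularExpressions

    Module FirstOccurrences
        Sub Main()
            ' Hypothetical input for illustration only.
            Dim input As String = "Bob@Example.com, alice@example.org; bob@example.com"
            Dim pattern As String = "([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,9})"

            Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)

            ' Add returns True only the first time a value is seen, so Where
            ' keeps the first Match for each distinct address.
            Dim firstMatches As List(Of Match) =
                Regex.Matches(input, pattern,
                              RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase).
                    Cast(Of Match)().
                    Where(Function(m) seen.Add(m.Value)).
                    ToList()

            For Each m As Match In firstMatches
                Console.WriteLine($"{m.Index}: {m.Value}")
            Next
        End Sub
    End Module
    ```

    Because the predicate has a side effect, call ToList once and reuse the result; enumerating the query a second time with the same set would yield nothing.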
    
    

    If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.

    hth

    Marcin
