Regex for finding tags and text around/in between them?

user42 121 Reputation points
2021-09-01T01:03:49.71+00:00

I have text that is a mix of "tags", defined as (<[^<>]+>), and "non-tags", and I'd like to separate them into an array. This is for colorizing text in a small code editor.

Here are some examples of input and what I'd like to have as the output:

"text1<tag1>text2<tag2>text3" -> ["text1", "<tag1>", "text2", "<tag2>", "text3"]
"<tag1>text text<tag2><tag3>" -> ["<tag1>", "text text", "<tag2>", "<tag3>"]
"<tag1><tag2>" -> ["<tag1>", "<tag2>"]
"text text" -> ["text text"]
"<<<tag1>text>><<<tag2>" -> ["<<", "<tag1>", "text>><<", "<tag2>"]

I assume that is something Regex can do?

Thank you!

Developer technologies | C++
Developer technologies | C#

Accepted answer
  1. Viorel 122.6K Reputation points
    2021-09-01T05:42:38.293+00:00

    In C# you can do something like this:

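    using System.Text.RegularExpressions;

    string text = "text1<tag1>text2<tag2>text3";

    // Split at the zero-width positions just before or just after a tag, but not at the very start or end.
    // [^<>]+ follows the tag definition in the question; \w+ would miss tags containing spaces or punctuation.
    string[] results = Regex.Split( text, @"(?!^)(?=<[^<>]+>)|(?<=<[^<>]+>)(?!$)" );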
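
Since the question is also tagged C++, here is a minimal sketch of the same split for that side. It is not a port of the C# snippet: std::regex in its default ECMAScript mode has no lookbehind, so this version finds the tags with std::sregex_iterator and collects the text between the matches. The function name split_tags is made up for illustration, and the tag pattern is the one given in the question.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into tags (<[^<>]+>, as defined in the
// question) and the non-tag text before, between, and after them.
std::vector<std::string> split_tags(const std::string& text) {
    static const std::regex tag_re("<[^<>]+>");
    std::vector<std::string> parts;
    std::size_t last = 0;  // index just past the previous tag

    for (auto it = std::sregex_iterator(text.begin(), text.end(), tag_re);
         it != std::sregex_iterator(); ++it) {
        const std::size_t pos = static_cast<std::size_t>(it->position());
        // Keep any non-empty text that sits between the previous tag and this one.
        if (pos > last) {
            parts.push_back(text.substr(last, pos - last));
        }
        parts.push_back(it->str());
        last = pos + static_cast<std::size_t>(it->length());
    }
    // Keep any non-empty text after the last tag.
    if (last < text.size()) {
        parts.push_back(text.substr(last));
    }
    return parts;
}
```

Called on the last example from the question, split_tags("<<<tag1>text>><<<tag2>") should give "<<", "<tag1>", "text>><<", "<tag2>". Empty pieces are never emitted, so adjacent tags and tags at the start or end of the string need no special handling.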
    
