How to get string.Split() to recognize quoted string as a single entity?

Nicholas Piazza 541 Reputation points
2021-10-14T21:56:16.183+00:00

I was trying to write a console method that simulates command-line argument input, but I can't get string.Split() to recognize "fourth arg" as a single entity. For example, using the space character as the separator, if I enter a string like:
firstarg second-arg third:arg "fourth arg"
and split it into args[] with string.Split(), args[] gets 5 strings instead of 4. It parses "fourth arg" into two strings,
one containing "fourth and the other containing arg", with the quote characters left in. A true command line would treat "fourth arg" as the single argument
fourth arg. See the code example below.

using System;
using static System.Console;

namespace Temp
{
    class Program
    {
        private static void Main ()
        {
            SimulateCommandLineArgs ();
            _ = ReadKey ();
        }

        private static void SimulateCommandLineArgs ()
        {
            string input;
            string [] args;

            Write ("Enter list of input items separated by spaces: ");

            input = ReadLine ();
            args = input.Split (' ', 20, StringSplitOptions.TrimEntries);

            WriteLine ($"There are {args.Length} arguments.");

        }

    }
}
Developer technologies C#

2 answers

  1. Sander van de Velde | MVP 36,761 Reputation points MVP Volunteer Moderator
    2021-10-14T22:24:57.173+00:00

    Hello @Nicholas Piazza ,

    I understand what you want to achieve.

    Using string.Split() is perhaps not the best solution...

    Have you already looked at regular expressions (Regex)? With those you can get a collection of matches, each with its own capture groups...

    I found this possible solution:

    [^\s"']+|"([^"]*)"|'([^']*)'  
    

    It turns your example into four matches:

    [Image: 140686-image.png]

    Yes, there is a learning curve with RegEx but it's worth it!

    The tester I used even generated this code:

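    using System;
    using System.Text.RegularExpressions;

    public class Example
    {
        public static void Main()
        {
            // Same pattern as above, written as a verbatim string ("" is an escaped quote).
            // The * quantifiers are required so that a quoted run of any length matches.
            string pattern = @"[^\s""']+|""([^""]*)""|'([^']*)'";
            string input = @"firstarg second-arg third:arg ""fourth arg""";
            RegexOptions options = RegexOptions.Multiline;

            // Each match is one argument: a bare token or a quoted string.
            foreach (Match m in Regex.Matches(input, pattern, options))
            {
                Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
            }
        }
    }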
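
    With the pattern above, m.Value for a quoted match still includes the quote characters; the text between the quotes is in capture group 1 (double quotes) or group 2 (single quotes). Below is a minimal sketch of how the matches could be turned into an args-style array with the quotes stripped. The names ArgSplitter and SplitArgs are made up for this illustration and are not part of .NET or of the original post.

    ```csharp
    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    public static class ArgSplitter
    {
        // Hypothetical helper: splits a line on whitespace, but keeps
        // "double quoted" and 'single quoted' text together as one argument.
        public static string[] SplitArgs(string input)
        {
            const string pattern = @"[^\s""']+|""([^""]*)""|'([^']*)'";

            return Regex.Matches(input, pattern)
                .Cast<Match>()
                .Select(m =>
                    m.Groups[1].Success ? m.Groups[1].Value :  // text inside "..."
                    m.Groups[2].Success ? m.Groups[2].Value :  // text inside '...'
                    m.Value)                                   // bare token
                .ToArray();
        }

        public static void Main()
        {
            string[] args = SplitArgs(@"firstarg second-arg third:arg ""fourth arg""");

            Console.WriteLine($"There are {args.Length} arguments.");
            foreach (string arg in args)
            {
                Console.WriteLine(arg);
            }
        }
    }
    ```

    This is only an approximation of real command-line parsing: it does not handle escaped quotes, and a token such as abc"def ghi" would come out as two arguments rather than one.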

    Regards,

    Sander


  2. Sam of Simple Samples 5,546 Reputation points
    2021-10-15T03:52:48.337+00:00
