Performance Tips: String.Split
The String.Split methods are provided in .NET as a convenient way of dividing a string into parts. Here are two basic forms of String.Split:
public string[] Split(char[] separator, int count, StringSplitOptions options);
public string[] Split(string[] separator, int count, StringSplitOptions options);
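As a quick illustration of the two overloads, here is a minimal sketch. The class name SplitBasics, the method names, and the chosen separators are hypothetical and only serve the example.

```csharp
using System;

public static class SplitBasics
{
    // Hypothetical separators, chosen only for illustration.
    private static readonly char[] CharSeparators = { ',', ';' };
    private static readonly string[] StringSeparators = { "::" };

    public static string[] ByChars(string text)
    {
        // char[] overload: split on ',' or ';' into at most 3 segments.
        // The last segment holds the unsplit remainder of the string.
        return text.Split(CharSeparators, 3, StringSplitOptions.None);
    }

    public static string[] ByStrings(string text)
    {
        // string[] overload: split on the multi-character separator "::"
        // into at most 2 segments, dropping empty entries.
        return text.Split(StringSeparators, 2, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For example, ByChars("a,b;c,d") stops after three segments and leaves "c,d" unsplit in the last one.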
Although the String.Split methods are quite easy to use, there are hidden performance issues you should be aware of if you use them a lot:
- The first parameter is an array of characters or an array of strings. If the same separators are used over and over again, do not recreate the separator array in each call; create it once and reuse it.
- The second parameter is the maximum number of segments to return. Where your scenario allows it, passing in the right value limits the amount of work Split has to do.
- If you do not want empty items, use StringSplitOptions.RemoveEmptyEntries so they do not need to be stored.
- String.Split allocates internal workspace according to the length of the string. Beware that this can create lots of garbage for long strings; avoid calling String.Split on really long strings.
- If there is a high chance that the separators are not present in the string at all, check for this condition and handle it separately before calling String.Split.
- As String.Split returns all the strings through the string array, they will be held in memory until there is no reference to the array. Try to limit this duration, otherwise more strings will be promoted to Generation 1 or Generation 2. It is a good idea to null out the array's references to strings that have already been processed.
- If you're using String.Split to process a large file or stream, there will be lots of duplicated strings. Try not to store those strings directly in your data structure. Instead, convert them to lower-cost representations, e.g. scalar values, or merge duplicates into unique strings.
- If your data is supposed to be case-insensitive, do not call String.ToLower/ToUpper on the strings returned from String.Split; use case-insensitive containers or case-insensitive comparisons instead.
- If you suspect String.Split is causing a performance issue in your scenario, measure performance using tools like PerfView.
- If performance is still not optimal after these optimizations, avoid using String.Split or implement your own version.
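Several of the tips above can be combined in one small helper. The following is a minimal sketch, not a definitive implementation: the class name SplitTips, the methods SplitFields and CountFields, and the comma separator are all hypothetical, and maxFields is assumed to be at least 1. It reuses a single separator array, passes a segment limit, removes empty entries, skips String.Split when no separator is present, and counts values in a case-insensitive dictionary instead of calling ToLower/ToUpper.

```csharp
using System;
using System.Collections.Generic;

public static class SplitTips
{
    // Created once and reused, instead of allocating a new array per call.
    // The comma is an assumed separator for this example.
    private static readonly char[] Separators = { ',' };

    // Returns at most maxFields non-empty fields (maxFields assumed >= 1).
    public static string[] SplitFields(string line, int maxFields)
    {
        // Fast path: no separator present, so String.Split is not called.
        if (line.IndexOfAny(Separators) < 0)
        {
            return line.Length == 0 ? new string[0] : new[] { line };
        }

        return line.Split(Separators, maxFields, StringSplitOptions.RemoveEmptyEntries);
    }

    // Counts fields case-insensitively without creating lower-cased copies.
    public static Dictionary<string, int> CountFields(IEnumerable<string> lines, int maxFields)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            foreach (string field in SplitFields(line, maxFields))
            {
                int n;
                counts.TryGetValue(field, out n); // n stays 0 if the key is new
                counts[field] = n + 1;
            }
        }
        return counts;
    }
}
```

Because the dictionary is built with StringComparer.OrdinalIgnoreCase, "Red" and "red" land on the same key, and only the first spelling seen is kept as the key. Whether the fast path pays off depends on how often lines contain no separator, so it is worth measuring before adopting it.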