OrdinalIgnoreCase is not a panacea!
There can be too much of a good thing, you know! Last night, as I was scanning through some code, I noticed that all the String APIs were called through the culture-safe overloads so thoughtfully provided by .NET. As I silently thanked FxCop for codifying this, I also concluded that I could safely bid goodbye to all intl bugs in that code.
After a quick run, though, I noticed that some of the sort orders in the UI were wrong! Ouch! OrdinalIgnoreCase was the harmless-looking deceiver. In a bid to stick to the .NET guidance rules, the code had been relentlessly peppered with OrdinalIgnoreCase. Hence, no matter what the current culture was, the sorts and compares all gave the same result each time.
Ok - another explanation. There are 2 kinds of data - input and output. Say your input data is taken in and used for processing. When you want to compare that input against fixed values defined in your code, you will want to use our favourite OrdinalIgnoreCase in the comparison. This is a purely non-linguistic comparison. In cases where case sensitivity must be maintained, use Ordinal instead. Consider a case where you want to extract the contents of a project file. Project file names are unique and case insensitive. Hence, when comparing your user input with the project file name, you can use OrdinalIgnoreCase and expect it to yield accurate results.
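A minimal sketch of the project file case, assuming the name arrives as a plain string. ProjectNameMatcher and its two methods are hypothetical names made up for illustration; only string.Equals and the StringComparison values come from .NET.

```csharp
using System;

// Hypothetical helper, not part of any real project system.
public static class ProjectNameMatcher
{
    // Non-linguistic, case-insensitive match: the result does not depend
    // on the culture of the machine or the user.
    public static bool IsSameProject(string userInput, string projectFileName)
    {
        return string.Equals(userInput, projectFileName, StringComparison.OrdinalIgnoreCase);
    }

    // Non-linguistic, case-sensitive match, for when case must be preserved.
    public static bool IsSameProjectExact(string userInput, string projectFileName)
    {
        return string.Equals(userInput, projectFileName, StringComparison.Ordinal);
    }
}
```

With this sketch, IsSameProject("MYAPP.CSPROJ", "MyApp.csproj") is true while IsSameProjectExact on the same pair is false, whatever the current culture happens to be.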
In cases where you would like to do a linguistic comparison, where the current culture does matter to you, use the CurrentCulture value of the StringComparison enumeration. For instance, if you would like to create a sorted list of arbitrary user input values for display, you must use the CurrentCulture value. In Turkish, for example, the dotless ı (whose uppercase is I) and the dotted i (whose uppercase is İ) are separate letters, and ı sorts before i, whereas English has only the one letter. If I sort using the Ordinal value, the values will always appear in the same order, based purely on their character codes, no matter what the culture of the client is. But that is not what we want! Hence, your comparisons need to take the current culture into account. Usually the output data needs to be presented in the culture that is being used by the user while input data when being processed must be culture insensitive.
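A small sketch of the difference when sorting for display. DisplaySorting and its method names are hypothetical; StringComparer.Create, StringComparer.Ordinal and CultureInfo are the real .NET types. The culture is passed in explicitly here so the behaviour is easy to try with different cultures.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class DisplaySorting
{
    // Linguistic sort for text shown to the user; the order follows
    // the rules of the culture that is passed in.
    public static List<string> SortForDisplay(IEnumerable<string> items, CultureInfo culture)
    {
        var sorted = new List<string>(items);
        sorted.Sort(StringComparer.Create(culture, false)); // false = case-sensitive
        return sorted;
    }

    // Ordinal sort: compares raw character codes, so the result is
    // identical on every machine regardless of culture.
    public static List<string> SortOrdinal(IEnumerable<string> items)
    {
        var sorted = new List<string>(items);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }
}
```

For the input "banana", "apple", "Cherry", SortOrdinal returns Cherry, apple, banana, because uppercase letters have lower character codes than lowercase ones. Calling SortForDisplay with CultureInfo.CurrentCulture should instead give the order the user expects for their language, though the exact result depends on the globalization data available on the machine.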
That brings us back to the original topic - using OrdinalIgnoreCase in ALL cases will not rid your code of intl bugs!
Comments
- Anonymous
December 14, 2005
"using OrdinalIgnoreCase in ALL cases will not rid your code of intl bugs!"
This is true, but when unsure, always use Ordinal or one of its variants such as OrdinalIgnoreCase. The reason is that culture-sensitive comparisons and processing are required only in very limited scenarios, such as sorting tables in a piece of user interface. So bugs that arise from over-use of Ordinal comparison are cosmetic and can be lived with, whereas over-use of culture-specific comparisons can lead to serious issues, including security risks. See http://msdn.microsoft.com/netframework/default.aspx?pull=/library/en-us/dndotnet/html/StringsinNET20.asp for an example (search for "file:")
- Anonymous
December 14, 2005
Agreed - this post has been written keeping in mind 2 things:
1. All your devs have read the paper I have in my blog and this is old hat.
2. FxCop rules codify the guidelines and hence it is unlikely the default overloads are used.
So, culture-specific comparisons are in essence an exception to the rule, while using OrdinalIgnoreCase is more of a general rule. What I want to draw attention to is the exception, not the rule. :)
- Anonymous
March 02, 2006
> Usually the output data needs to be presented in the culture that is being used by the user while input data when being processed must be culture insensitive.
Not necessarily, Anu :-)
Like you, I always end up with the recommendation that you need two Comparers - one for the display and one to handle the data as most appropriate based on its meaning/usage.
You don't want to use the same comparer (maybe like CurrentCulture) to sort a list of files and to check if they can be saved in the same folder ... ;-)
But in some cases those comparisons might actually be the same. It all depends on what kind of data you are handling.
And the rule "always use Ordinal because it's safer" is (a) not always true and (b) not very well accepted worldwide.
Aldo
p.s. - and if this were not complex enough, you could always think of adding a database server and its collation problems ;-)