Is it possible to get java's '==' semantics in C#?
Prelude: Here's what my current array based implementation of ICollection looks like. It builds upon a lot of suggestions as to the tradeoffs of perf vs. memory and uses delegates to place control in the person who creates the collection for how certain operations behave:
namespace Testing.Collections
{
using System;
using Testing.Functional;
public class ArrayCollection<A> : ICollection<A>
{
private readonly Authenticator<A> authenticate;
private A[] array;
private uint count;
private static readonly Authenticator<A> DefaultAuthenticate = delegate(A a1, A a2)
{
return object.Equals(a1, a2);
};
public ArrayCollection() : this(DefaultAuthenticate) {
}
public ArrayCollection(Authenticator<A> authenticate)
{
this.authenticate = authenticate;
if (null == authenticate)
{
throw new ArgumentNullException("authenticate");
}
}
#region ICollection<A> Members
public bool Empty {
get {
return count == 0;
}
}
public bool Add(A element) {
//From CLR 369 (Dynamic Tables)
EnsureSpace();
array[count] = element;
count++;
//Adding an element always modifies an ArrayCollection
return true;
}
private void EnsureSpace()
{
if (array == null)
{
array = new A[2];
}
if (count == array.Length)
{
A[] doubledArray = new A[count * 2];
Array.Copy(array, doubledArray, count);
array = doubledArray;
}
}
public void Clear()
{
array = null;
count = 0;
}
public bool Contains(A element)
{
for (uint i = 0; i < count; i++)
{
if (this.authenticate(array[i], element))
{
return true;
}
}
return false;
}
public bool Remove(A element)
{
for (uint i = 0; i < count; i++)
{
if (this.authenticate(array[i], element))
{
RemoveAt(i);
Contract();
return true;
}
}
return false;
}
private void Contract()
{
//Shrink the array if the number of elements has gone below
//1/4th the capacity. See CLR 370 (Dynamic Tables: Table-Delete)
if (count < (array.Length / 4))
{
A[] halvedArray = new A[array.Length / 2];
Array.Copy(array, halvedArray, count);
}
}
private void RemoveAt(uint i)
{
//Move the last element into the array into this position
uint lastItem = count - 1;
array[i] = array[lastItem];
array[lastItem] = default(A);
count--;
}
#endregion
}
}
There are small changes to the first version of the code:
- I use null to represent an empty array
- When getting space initially for the array we default to a size of 2. This was chosen because it's the smallest size that the array can reach when you call remove
- Equality is based on a delegate passed in the constructor. If no delegate is passed then we use object.Equals(o1, o2) to test equality.
I'm writing this as a library that I intend people to use and augment. So I don't feel a test is complete unless I try subclassing the class to provide a specialized implementation. In this case I wanted to create an “IdentityCollection”.
My definition of an IdentityCollection was: an ICollection<A> where equality is defined as:
- Reference equality if A is a reference type
- Structural equality if A is a value type
So if you had checked if an object was in the collection it would only report true if that was in teh same location in memory. I could do this with the following code:
ICollection<string> collection1 = new ArrayCollection<string>(delegate(string s1, string s2) {
return object.ReferenceEquals(s1, s2);
});
However, this wouldn't work for value types. For example, I would expect to be able to type:
collection.Add(4) and then have collection.Contains(4) return true. However, using ReferenceEquals wouldn't work because it would automatically box those value type into objects in different locations in memory.
My initial stab at it looked like this:
class IdentityCollection<A> : ArrayCollection<A>
{
public IdentityCollection() : base(delegate (A a1, A a2) {
return a1 == a2;
}) {}
}
I figured that == would have reference semantics on objects and structural semantics on values. Pretty intuitive considering that == has structural semantics on all the predefined value types in the system (bool, int, decimal, etc.). However, when i tried this I got:
Error 1 Operator '==' cannot be applied to operands of type 'A' and 'A'
Sigh... I'm reading through the language spec right now to understand this. However, can anyone else think of way to get this done?
- Anonymous
May 18, 2004
Cyrus, you need to use constraints and CompareTo. You can specify a constraint that any Type Parameter implement IComparable or IComparable<T>
class IdentityCollection<A> where A:IComparable<A> : ArrayCollection<A>
{
public IdentityCollection() : base(delegate (A a1, A a2) {
return (a1.CompareTo(a2) == 0);
}) {}
} - Anonymous
May 18, 2004
Why is your top level namespace named "Testing"? I would choose "MS.CyrusN". - Anonymous
May 18, 2004
Justin: That's a constraint that I don't want to have on my collection. That would exclude many structs from being allowed in.
I woudl like my collections to operate on any type. - Anonymous
May 18, 2004
Jay: No reason. At this point they don't really need to be in a namespace at all. But that's what the default project gave me so I stuck with it. - Anonymous
May 18, 2004
One thing you could do is to take a look at how the STL C++ library is designed. An idea that comes to mind would be to take the operations that require equality of elements away from the container class into separate classes.
E.g., with STL, the basic list has no 'contains' method. Instead, there is a separate 'find' algorithm that you can apply to the list. The 'find' algorithm requires that the list's elements are comparable with the '==' operator - if not, an attempt to apply 'find' to a list fails at compile time. Maybe you could do something similar here? - Anonymous
May 18, 2004
Here's how you can do it:
class IdentityCollection<A> : ArrayCollection<A>
{
public IdentityCollection() : base(delegate (A a1, A a2) {
if((object)A.default != null) {
return object.Equals(o1, o2);
}
return object.ReferenceEquals(o1, o2);
}) {}
}
Unfortunately, this requires boxing for value types.
I'm not sure if your definition of structural equality includes using a user defined override for Object.Equals, if it doesn't you can use:
return System.Runtime.CompilerServices.RuntimeHelpers.Equals(o1, o2);
instead of object.Equals(). That will do a memcmp style comparison of the two values (but still requires boxing).
The CLR really should have an IL instruction to do a bitwise comparison of two values. - Anonymous
May 18, 2004
BTW, in Contract(), do you want to point array at halvedArray? - Anonymous
May 19, 2004
- EnsureSpace
a) EnsureSpace must not rely on count value but take expected size instead.
I.e. in case if I wish to subclass your ArrayCollection to add for example method AddAll(params A[] items) then I will have to reimplement EnsureSpace (I prefer you mark it protected or even public) with increase in sizes different that +1
b) In case if you will need a huge size of variables very close to int.MaxValue (or to be clear - at least 1073741824) then your array.length*2 code does not work and will cause wrong (even negative) length array created to store values ;o))
This is not relevant for IA32 architecture as you will more likely hit 3Gb memory limit - but will definitely cause troubles for 64-bits.
c) You can merge Contract with EnsureSpace and this will be one-place stop for array reallocations.
2. Your RemoveAt and EnsureSpace are private. It will be nice to mark them protected or even public if you wish to allow developers to extend your class. If do not wish to allow them to do so - use sealed class - this will result in some optimization.
If you will make RemoveAt protected or public - apply some basic param validation.
If using RemoveAt with position bigger that count but less that array.length this result last element simply lost. All others results in IndexOutOfRange exception
Param validation will not hurt as it can be optimized/removed by compiler.
3. You have no some required methods for ICollection<A> from Whidbey CTP build.
Like a CopyTo(), Count, IsReadOnly and GetEnumerator(). As well Add return bool instead of int.
- Anonymous
May 19, 2004
David: correct. Any suggestions on how I test that? From an external persepctive the colleciton behaved just fine. - Anonymous
May 19, 2004
The comment has been removed - Anonymous
May 19, 2004
AT:
b) Correct. I was looking at Array.LongLength and thinking I should move to that. I should also put that code in a checked block.
c) What is the benefit of merging them into one location?
2. I see extensibility coming from overriding the things that are in the public signature in the class (i.e. the interface). The private methods are an implementation detail that might change at any time. For example, merging Contract/EnsureSpace as you mentioned.
RemoveAt is private. I could add validation, but the guidelines say that is unnecessary as I control all flow into that method.
3. I'm not attempting to clone System.Collection.Generic.ICollection. I'm trying to write my own. Prefereably without the issues I've found int the BCL libraries. As I mentioned in a previous posts, I'm not sure if Count is appropriate on a collection, and a later post talks about GetEnumerator. - Anonymous
May 19, 2004
The comment has been removed - Anonymous
May 19, 2004
AT: Thanks for your feedback. It's excellent. As an experiment could you try to subclass ArrayCollection and then show the issues you have doing it. Then show how making the modifcations you've talked about will alleviate them.
I believe that there is a different way to accomplish what you want while keeping the patterms I have introduced, but I would love to see yours so that I can compare/contrast them. Thanks much! - Anonymous
May 19, 2004
The comment has been removed - Anonymous
May 19, 2004
The comment has been removed