Mutating Readonly Structs

Consider this program which attempts to mutate a readonly mutable struct. What happens?

struct Mutable {
    private int x;
    public int Mutate() {
        this.x = this.x + 1;
        return this.x;
    }
}

class Test {
    public readonly Mutable m = new Mutable();
    static void Main(string[] args) {
        Test t = new Test();
        System.Console.WriteLine(t.m.Mutate());
        System.Console.WriteLine(t.m.Mutate());
        System.Console.WriteLine(t.m.Mutate());
    }
}

There are a number of things this program could do. Does it:

1) Print 1, 2, 3 -- because m is readonly, but the "readonly" only applies to m, not to its contents.
2) Print 0, 0, 0 -- because m is readonly, x cannot be changed. It always has its default value of zero.
3) Throw an exception at runtime, when the attempt is made to mutate the contents of a readonly field.
4) Do something else

?

People are frequently surprised to learn that the answer is (4). In fact, this prints 1, 1, 1. 

Why?

Because, remember, accessing a value type gives you a COPY of the value. When you say t.m, you get a copy of whatever is presently stored in m. m is immutable, but the copy is not. The copy is then mutated, and the value of x in the copy is returned. But m remains untouched.
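
The copy can be written out by hand. The following is a minimal sketch of roughly what happens, not the compiler's actual output; the class CopyDemo, the method MutateCopyOf, the local temp, and the Value property are all names made up for this illustration.

```csharp
using System;

struct Mutable
{
    private int x;

    public int Mutate()
    {
        this.x = this.x + 1;
        return this.x;
    }

    // Helper added for this illustration only: reads x without changing it.
    public int Value { get { return this.x; } }
}

class CopyDemo
{
    public readonly Mutable m = new Mutable();

    // Spells out by hand roughly what t.m.Mutate() does outside the constructor.
    public static int MutateCopyOf(CopyDemo t)
    {
        Mutable temp = t.m;   // copy the value stored in the readonly field
        return temp.Mutate(); // mutate the copy; the field itself is untouched
    }

    static void Main()
    {
        CopyDemo t = new CopyDemo();
        Console.WriteLine(MutateCopyOf(t)); // prints 1
        Console.WriteLine(MutateCopyOf(t)); // prints 1
        Console.WriteLine(t.m.Value);       // prints 0
    }
}
```

Every call starts from a fresh copy of the untouched field, which is why the result never gets past 1.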

The relevant section of the specification is 7.5.4, which states that when resolving "E.I" where E is an object and I is a field...

...if the field is readonly and the reference occurs outside an instance constructor of the class in which the field is declared, then the result is a value, namely the value of the field I in the object referenced by E.

The important point here is that the result is the value of the field, not the variable associated with the field. Outside of the constructor, readonly fields are not variables. (The initializer here is considered to be inside the constructor; see my earlier post on that subject.)
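
To see the constructor exception in action, here is a small sketch. The class name CtorDemo is hypothetical, and Mutable is repeated so the example stands alone. Inside the instance constructor the readonly field is a variable, so Mutate changes the field itself; outside, each call works on a copy.

```csharp
using System;

struct Mutable
{
    private int x;

    public int Mutate()
    {
        this.x = this.x + 1;
        return this.x;
    }
}

class CtorDemo
{
    public readonly Mutable m = new Mutable();

    public CtorDemo()
    {
        // Inside the instance constructor, m is classified as a variable,
        // so these two calls mutate the field in place: x becomes 2.
        this.m.Mutate();
        this.m.Mutate();
    }

    static void Main()
    {
        CtorDemo t = new CtorDemo();
        // Outside the constructor, each call mutates a copy of the field.
        Console.WriteLine(t.m.Mutate()); // prints 3
        Console.WriteLine(t.m.Mutate()); // prints 3
    }
}
```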

Great. What about that second dot, as in ".Mutate()"?  We look at section 7.4.4 to find out how to invoke E.M():

If E is not classified as a variable, then a temporary local variable of E's type is created and the value of E is assigned to that variable. E is then reclassified as a reference to that temporary local variable. The temporary variable is accessible as this within M, but not in any other way. Thus, only when E is a true variable is it possible for the caller to observe the changes that M makes to this.
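
The difference between a "true variable" and a value can be seen side by side. This is a sketch under the assumption that the only change is dropping readonly from one field; VariableDemo, variableField and readonlyField are names invented for the example.

```csharp
using System;

struct Mutable
{
    private int x;

    public int Mutate()
    {
        this.x = this.x + 1;
        return this.x;
    }
}

class VariableDemo
{
    public Mutable variableField = new Mutable();          // a variable everywhere
    public readonly Mutable readonlyField = new Mutable(); // a value outside the constructor

    static void Main()
    {
        VariableDemo t = new VariableDemo();

        // E is a variable: Mutate sees the field itself as "this".
        Console.WriteLine(t.variableField.Mutate()); // prints 1
        Console.WriteLine(t.variableField.Mutate()); // prints 2

        // E is a value: Mutate sees a temporary copy as "this".
        Console.WriteLine(t.readonlyField.Mutate()); // prints 1
        Console.WriteLine(t.readonlyField.Mutate()); // prints 1
    }
}
```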

And there you go. Value semantics are tricky!

This is yet another reason why mutable value types are evil. Try to always make value types immutable.
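
One possible immutable counterpart to the struct above might look like the following. It is only a sketch; the names Immutable, Increment, X and ImmutableDemo are hypothetical. Instead of changing this, the method returns a new value, so there is nothing surprising about calling it on a readonly field.

```csharp
using System;

// Hypothetical immutable counterpart to Mutable.
struct Immutable
{
    private readonly int x;

    public Immutable(int x)
    {
        this.x = x;
    }

    public int X { get { return this.x; } }

    // Returns a new value rather than modifying this one.
    public Immutable Increment()
    {
        return new Immutable(this.x + 1);
    }
}

class ImmutableDemo
{
    public readonly Immutable m = new Immutable(0);

    static void Main()
    {
        ImmutableDemo t = new ImmutableDemo();
        Immutable next = t.m.Increment();
        Console.WriteLine(t.m.X);  // prints 0
        Console.WriteLine(next.X); // prints 1
    }
}
```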

Comments

  • Anonymous
    May 14, 2008
    The C# designers once again ignored the principle of least surprise :( I'd expect a compiler error - something like "Mutable structs can't be readonly." or "Methods mutating the value can't be called on readonly values." (though I also strongly encourage everyone to make all value types immutable). What does the rest of the community expect?

  • Anonymous
    May 14, 2008
    There is a spectrum, from "clearly desirable behaviour", to "possibly dodgy behaviour that still makes some sense", to "clearly undesirable behaviour". We try to make the latter into warnings or, better, errors. But stuff that is in the middle category you don't want to restrict unless there is a clear way to work around it.

    For example, something like: mydictionary[123].blah = 456; is a compiler error if the indexer returns a mutable struct. The result of an indexer is not a variable, and therefore, the write to blah will be on the copy returned by the indexer, not the contents of the dictionary. That's clearly not at all what the author intended. They meant temp = mydictionary[123]; temp.blah = 456; mydictionary[123] = temp;. Since the code is clearly bogus and there is a simple workaround, we give an error.

    In your example, how would the user work around the problem? They might not own the struct definition, so they cannot make it immutable. If you tell them to make their field mutable instead of readonly, then you're exacerbating the problem -- the mutability is now spreading from type to type!
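
The dictionary workaround described in this comment can be sketched as follows. The struct Item and the helper UpdateBlah are hypothetical names; only mydictionary, blah and temp come from the comment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable struct stored in the dictionary.
struct Item
{
    public int blah;
}

class IndexerDemo
{
    public static int UpdateBlah(Dictionary<int, Item> mydictionary, int key, int value)
    {
        // mydictionary[key].blah = value;
        // The line above does not compile (error CS1612): the indexer
        // result is a value, not a variable.

        Item temp = mydictionary[key]; // copy the struct out
        temp.blah = value;             // modify the local variable
        mydictionary[key] = temp;      // store the modified copy back
        return mydictionary[key].blah;
    }

    static void Main()
    {
        var mydictionary = new Dictionary<int, Item>();
        mydictionary[123] = new Item();
        Console.WriteLine(UpdateBlah(mydictionary, 123, 456)); // prints 456
    }
}
```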

  • Anonymous
    May 14, 2008
    I would rather say, in this case, that structs are the evil.

  • Anonymous
    May 14, 2008
    Thanks for the article! Oren

  • Anonymous
    May 14, 2008
    Another interesting article. Could FxCop be made to check for mutable value types? I've not really explored the customization side of that tool yet.

  • Anonymous
    May 14, 2008
    @Thomas D: How is the compiler meant to know that the struct is mutable? Or that the method mutates things? That ends up prying into the guts of an object more than I'd be happy with. (It would be nice for a type to be able to say that it's immutable, but that's certainly not available at the moment.) Any feature where the compiler has to care about the implementation (as opposed to the interface) of something it's calling into sounds suspect to me. @Thomas E: I don't think structs are evil per se. I think there's a lot of misunderstanding around them, but there are times when they're handy. @Eric: Wow, that's an evil one. I can absolutely see why the language is specified that way, but I get worried when I can't predict the answer to a language question. (I didn't even try to guess, because they all sounded wrong.)

  • Anonymous
    May 14, 2008
    > The C# compiler is well aware about a method mutating its object. He even has to be aware of it to be able to compile the code correctly.

    What if the compiler is not compiling the method? The method might be external.

  • Anonymous
    May 14, 2008
    > But what do the CLI specs say about such a scenario?

    You know, you could look it up yourself rather than asking me. But because I am in a charitable mood, and have the spec open already, I'll just tell you that Partition II section 16.1.2 states "These fields shall only be mutated inside a constructor. If the field is a static field, then it shall be mutated only inside the type initializer of the type in which it was declared. If it is an instance field, then it shall be mutated only in one of the instance constructors of the type in which it was defined. It shall not be mutated in any other method or in any other constructor, including constructors of derived classes."

    So, in short, if we generated the code the way you suggest, that would be illegal code. Specifically, it would be unverifiable; unverifiable code will not load at all in any environment with restricted security. In a fully trusted environment, I do not know whether that code would crash the process, throw a managed exception, silently fail, or silently succeed -- or, for that matter, erase your hard disk. Unverifiable code is not guaranteed to do anything in particular; hence the name. Obviously we do not wish to generate unverifiable code.

  • Anonymous
    May 14, 2008
    > I, for one, find it extremely irritating that there is a special treatment of that special case in the docs

    What's the special treatment? There are only two rules here:

    1. Readonly fields are not variables.
    2. Accessing anything on a value type which is not a variable acts on a copy of the value type.

    That the interaction of these two simple rules has complex consequences is interesting. And that some of those consequences are sufficiently interesting that the spec explicitly calls your attention to them is us being nice to you and calling your attention to something you might have otherwise missed.

    Now, perhaps you don't like one of these rules. I'm not sure which one of them you don't like, or how you'd propose replacing it with something you do like that doesn't make the whole situation much worse. (Like, causing the compiler to generate unverifiable code.) But frankly, they seem like sensible rules to me.

  • Anonymous
    May 14, 2008
    "Could FxCop be made to check for mutable value types?" Try this: http://blogs.msdn.com/kevinpilchbisson/archive/2007/11/20/enforcing-immutability-in-code.aspx

  • Anonymous
    May 14, 2008
    "This is yet another reason why mutable value types are evil." How about the evil immutable reference types which pretend to be mutable? System.String for starters. Many have fallen into the trap of doing s[0] = 'a' only to find that it won't compile. Just yesterday, I found a gem in a function written by an interviewee which reversed a string in-place. [Evil grin]

  • Anonymous
    May 15, 2008
    Sorry if I sounded a bit offensive, I just didn't see the dilemma. It just feels strange that "Accessing anything on a value type which is not a variable acts on a copy of the value type." It's really sensible if you want to do something like integers[13].ToString(), or obj.value.any_pure_function, but it opens a can of worms if the called method is not a pure function but one that mutates the struct. As you pointed out in this blog post, statement 2 doesn't play well with mutable structs. That's quite a fundamental problem and not at all easy to solve.

    One solution might be to not allow accessing anything on a value type which is not a variable, but it's far too late to make such an enormous breaking change. Detecting mutating operations is also difficult (though you could just be pessimistic about external code - which is very rare anyway) and it introduces versioning issues too.

    Don't get me wrong, the C# team did a really great job. C# is my favorite language for most problems I have to solve and I really enjoy using it. But still, there's also a lot of room for improvement. Let's just leave it at this: mutable structs are bad.

  • Anonymous
    May 15, 2008
    A question about "...mutable value types are evil. Try to always make value types immutable." (and also about some comments suggesting that creating mutable structs is pure evil): I think there are really good reasons to make structs mutable, for example:

    struct Vector3D { public double X; public double Y; public double Z; }
    struct Triangle { public Vector3D A; public Vector3D B; public Vector3D C; }

    and

    var triangle = new Triangle() {
        A = new Vector3D() { X = 0.0, Y = 0.0, Z = 0.0 },
        B = new Vector3D() { X = 0.0, Y = 1.0, Z = 0.0 },
        C = new Vector3D() { X = 0.0, Y = 0.0, Z = 1.0 },
    };
    triangle.A.X = -1.0;

    If Triangle was immutable I would have to write something that was far more complicated... I mean, the triangle gets copied whenever it is used somewhere else; so the only way to modify the triangle is inside the actual function (if not initialized and accessed through an interface, of course).

    So my question is: making the triangle mutable just makes my code more succinct and I see no reason to make it immutable - do you think this is a special case where mutable structs are ok, or would you also make everything immutable?

  • Anonymous
    May 15, 2008
    I would never, ever, EVER write a mutable vector or triangle as a struct.  A vector is a VALUE.  Values do not change.  That is every bit as bizarre as writing a "mutable number".  Does it make any sense to say, well, I've got the number 12, but I'm going to change this version of 12 to 15?  No, of course not.  When you add 3 to a number, you get a NEW NUMBER, you don't modify the number 12 to be a different value. When you change the vertex of a triangle, you have a different triangle, so it should be a different value. Sure, your code is more succinct.  It is also more brittle and much harder to reason about correctly.  Mutability logically implies referential identity; values do not have referential identity, that's why they're called value types, not reference types.  If you need to mutate something, make it a reference type.
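
For what it's worth, an immutable version of the vector and triangle need not be much more verbose. The following is only a sketch of one possible design, not something taken from the comment above; the constructors and the WithX and WithA methods are hypothetical names. Changing a vertex produces a new triangle and leaves the original value alone.

```csharp
using System;

struct Vector3D
{
    public readonly double X, Y, Z;

    public Vector3D(double x, double y, double z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Returns a new vector that differs only in X.
    public Vector3D WithX(double x)
    {
        return new Vector3D(x, Y, Z);
    }
}

struct Triangle
{
    public readonly Vector3D A, B, C;

    public Triangle(Vector3D a, Vector3D b, Vector3D c)
    {
        A = a;
        B = b;
        C = c;
    }

    // Returns a new triangle that differs only in vertex A.
    public Triangle WithA(Vector3D a)
    {
        return new Triangle(a, B, C);
    }
}

class TriangleDemo
{
    static void Main()
    {
        Triangle triangle = new Triangle(
            new Vector3D(0.0, 0.0, 0.0),
            new Vector3D(0.0, 1.0, 0.0),
            new Vector3D(0.0, 0.0, 1.0));

        // "Changing" a vertex yields a different triangle value.
        Triangle moved = triangle.WithA(triangle.A.WithX(-1.0));

        Console.WriteLine(triangle.A.X); // X of the original A is still 0
        Console.WriteLine(moved.A.X);    // X of the new A is -1
    }
}
```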

  • Anonymous
    May 15, 2008
    First of all: I completely agree with just about everything you have written in the first two segments: yes, triangles and vectors are true value types and are therefore immutable. But by initializing some variable as a struct and later changing it, I don't really change the previously initialized "data"; I change the variable. I don't know how to put this better, so I'll try an example:

    var triangle = new Triangle(...);
    var triangleCopy = triangle;
    triangle.A.X = -1.0;

    Now what happened? I changed the variable - that means I gave the variable (which is just a name for something) another meaning - but I didn't change the previously declared Triangle at all (triangleCopy won't be modified!).
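
The copy behaviour described in this comment can be checked with a short program. This is a sketch that assumes the mutable Vector3D and Triangle definitions from the earlier comment; the class name CopySemanticsDemo is made up.

```csharp
using System;

// Mutable structs with public fields, as in the earlier comment.
struct Vector3D
{
    public double X;
    public double Y;
    public double Z;
}

struct Triangle
{
    public Vector3D A;
    public Vector3D B;
    public Vector3D C;
}

class CopySemanticsDemo
{
    static void Main()
    {
        var triangle = new Triangle
        {
            A = new Vector3D { X = 0.0, Y = 0.0, Z = 0.0 },
            B = new Vector3D { X = 0.0, Y = 1.0, Z = 0.0 },
            C = new Vector3D { X = 0.0, Y = 0.0, Z = 1.0 },
        };

        var triangleCopy = triangle; // assignment copies the whole struct
        triangle.A.X = -1.0;         // changes the variable "triangle" only

        Console.WriteLine(triangle.A.X);     // X is now -1
        Console.WriteLine(triangleCopy.A.X); // X is still 0
    }
}
```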

  • Anonymous
    May 15, 2008
    @Tanveer: In what way does System.String pretend to be mutable? Having a read-only indexer doesn't make it mutable in my view. @ThomasD: As Eric says, the C# compiler may not be compiling the struct, and it shouldn't have to look at the IL of the struct to work out whether or not it's a mutating operation. It also shouldn't give different results based on whether or not it is compiling that struct.

  • Anonymous
    May 16, 2008
    What are good reasons for using structs over classes? I would not in general select them for their semantics, because they are too awkward and quirky, such as always having a default constructor that initialises every field to default values, and the issues that you describe above. It also seems to be unclear when they are good for performance, since depending on how you use them you might end up copying them so much that you actually make things slower. And finally, pretty much anything you can do with an immutable struct you can do with an immutable class, no? I'm left to conclude that the only reason to use structs is when A. performance is very important, and B. you have profiled classes and structs and found structs to be faster. But if the decision should be based solely on performance, why should the programmer even be making it – shouldn't that be up to the compiler/runtime to decide? I think I'm missing something, but I'm not sure what.

  • Anonymous
    May 16, 2008
    There are times when you need to wring every last bit of performance out of a system. And in those scenarios, you sometimes have to make a tradeoff between code that is clean, pure, robust, understandable, predictable, modifiable and code that is none of the above but blazingly fast.

    Mutable value types are a bad programming practice because they behave in a manner that many people find deeply counterintuitive, and thereby make it easy to write buggy code (or correct code that is easily turned into buggy code by accident.) But yes, they are real fast.

    I would consider coding up two benchmark solutions -- one using mutable structs, one using immutable structs -- and run some realistic user-scenario-focused benchmarks. But here's the thing: do not pick the faster one. Instead, decide BEFORE you run the benchmark how slow is unacceptably slow. If both solutions are acceptable, choose the one that is clean, correct and fast enough. If only one is acceptable, choose it. If neither is acceptable, then I guess either your benchmark is unrealistic, your goals are unrealistic, or the implementations need improvement.
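
The decision rule in this comment can be written down as a few lines of code. This is only a sketch: the class BenchmarkDecision, the method Choose, and the timings in Main are hypothetical, and the measurements themselves would have to come from your own realistic benchmarks.

```csharp
using System;

static class BenchmarkDecision
{
    // Picks a solution given measured times and a budget fixed in advance.
    // Returns "immutable", "mutable", or "neither".
    public static string Choose(double immutableMs, double mutableMs, double budgetMs)
    {
        bool immutableOk = immutableMs <= budgetMs;
        bool mutableOk = mutableMs <= budgetMs;

        // If the clean solution is fast enough, take it,
        // even when the mutable one is faster.
        if (immutableOk) return "immutable";

        // Only the mutable solution meets the budget.
        if (mutableOk) return "mutable";

        // Neither is acceptable: revisit the benchmark, the goal,
        // or the implementations.
        return "neither";
    }

    static void Main()
    {
        // Made-up timings, for illustration only.
        Console.WriteLine(Choose(80.0, 50.0, 100.0));   // prints immutable
        Console.WriteLine(Choose(120.0, 50.0, 100.0));  // prints mutable
        Console.WriteLine(Choose(120.0, 110.0, 100.0)); // prints neither
    }
}
```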

  • Anonymous
    June 11, 2008
    Good article, Eric. But IMHO, this kind of warning should also be mentioned in the MSDN Library, in either the online or the offline version. As far as I know, Peter Goldy, a former Microsoft employee, also described this in a blog entry in the past. He also wrote that "Mutable structs are harmful", but unfortunately his blog entry is no longer available. Will there be any clarification or any future action on this issue, especially in C#? Eriawan

  • Anonymous
    November 27, 2008
    Hmm, just came across this. First, saying that a value type object is a priori the equivalent of a number seems like a big stretch to me. Let's take a look at the math side, since that's where the whole notion of a composite value came from. For a complex z=(1,-1), z.x=2 or z.re=2 or even is completely legit and much preferred to z=(2,im(z)). A variable is still a variable and one expects the most efficient overwriting semantics, not spurious temps floating around.

    Maybe another way to put it is that the only time when I would expect a silent copy/temp is when the original identifier is not in scope anymore - meaning only for the function call. Also meaning that "not a variable" is not a legit concept (in general, not-something definitions are dangerous/messy). The semantics of a readonly value-type variable are the semantics of a constant, just with a different initialization time. So, I would expect the readonly declaration on a value type object to apply to all elements of that object, and to cause both a compile-time warning (an error seems too drastic) and an exception if touched. This would of course require structs to be allowed to have normal ctors, which is also a perfectly legit thing to expect.

    Oh, and if someone really, really wants a clone generator (that's what the example at the beginning really is), there are plenty of ways to do that without a mess. Last but not least, copy-on-write semantics (aka the "evil" string, which is pretty efficient - if in doubt compare C++ or Java :-) would probably require at least one extra keyword to be doable in a clean and efficient way and would be a venerable cause (copy-on-write is good only when it's not wasteful, and the compiler needs to know). Just my $0.02

  • Anonymous
    November 12, 2011
    Eric, I'd love to get your input on this stackoverflow question: stackoverflow.com/.../why-is-it-okay-that-this-struct-is-mutable-when-are-mutable-structs-acceptable