Auto-default structs
Note
This article is a feature specification. The specification serves as the design document for the feature. It includes proposed specification changes, along with information needed during the design and development of the feature. These articles are published until the proposed spec changes are finalized and incorporated in the current ECMA specification.
There may be some discrepancies between the feature specification and the completed implementation. Those differences are captured in the pertinent language design meeting (LDM) notes.
You can learn more about the process for adopting feature speclets into the C# language standard in the article on the specifications.
https://github.com/dotnet/csharplang/issues/5737
Summary
This feature makes it so that in struct constructors, we identify fields which were not explicitly assigned by the user before returning or before use, and initialize them implicitly to default
instead of giving definite assignment errors.
Motivation
This proposal is raised as a possible mitigation for usability issues found in dotnet/csharplang#5552 and dotnet/csharplang#5635, as well as addressing #5563 (all fields must be definitely assigned, but field
is not accessible within the constructor).
Since C# 1.0, struct constructors have been required to definitely assign this
as if it were an out
parameter.
public struct S
{
public int x, y;
public S() // error: Fields 'S.x' and 'S.y' must be fully assigned before control is returned to the caller
{
}
}
This presents issues when setters are manually defined on semi-auto properties, since the compiler can't treat assignment of the property as equivalent to assignment of the backing field.
public struct S
{
public int X { get => field; set => field = value; }
public S() // error: struct fields aren't fully assigned. But caller can only assign 'this.field' by assigning 'this'.
{
}
}
We assume that introducing finer-grained restrictions for setters, such as a scheme where the setter doesn't take ref this
but rather takes out field
as a parameter, is going to be too niche and incomplete for some use cases.
One fundamental tension we are struggling with is that when struct properties have manually implemented setters, users often have to do some form of "repetition" of either repeatedly assigning or repeating their logic:
struct S
{
private int _x;
public int X
{
get => _x;
set => _x = value >= 0 ? value : throw new ArgumentOutOfRangeException();
}
// Solution 1: assign some value in the constructor before "really" assigning through the property setter.
public S(int x)
{
_x = default;
X = x;
}
// Solution 2: assign the field once in the constructor, repeating the implementation of the setter.
public S(int x)
{
_x = x >= 0 ? x : throw new ArgumentOutOfRangeException();
}
}
Previous discussion
A small group has looked at this issue and considered a few possible solutions:
- Require users to assign
this = default
when semi-auto properties have manually implemented setters. We agree this is the wrong solution since it blows away values set in field initializers. - Implicitly initialize all backing fields of auto/semi-auto properties.
- This solves the "semi-auto property setters" problem, and it squarely places explicitly declared fields under different rules: "don't implicitly initialize my fields, but do implicitly initialize my auto-properties."
- Provide a way to assign the backing field of a semi-auto property and require users to assign it.
- This could be cumbersome compared to (2). An auto property is supposed to be "automatic", and perhaps that includes "automatic" initialization of the field. It could introduce confusion as to when the underlying field is being assigned by an assignment to the property, and when the property setter is being called.
We've also received feedback from users who want to, for example, include a few field initializers in structs without having to explicitly assign everything. We can solve this issue as well as the "semi-auto property with manually implemented setter" issue at the same time.
struct MagnitudeVector3d
{
double X, Y, Z;
double Magnitude = 1;
public MagnitudeVector3d() // error: must assign 'X', 'Y', 'Z' before returning
{
}
}
Adjusting definite assignment
Instead of performing a definite assignment analysis to give errors for unassigned fields on this
, we do it to determine which fields need to be initialized implicitly. Such initialization is inserted at the beginning of the constructor.
struct S
{
int x, y;
// Example 1
public S()
{
// ok. Compiler inserts an assignment of `this = default`.
}
// Example 2
public S()
{
// ok. Compiler inserts an assignment of `y = default`.
x = 1;
}
// Example 3
public S()
{
// valid since C# 1.0. Compiler inserts no implicit assignments.
x = 1;
y = 2;
}
// Example 4
public S(bool b)
{
// ok. Compiler inserts assignment of `this = default`.
if (b)
x = 1;
else
y = 2;
}
// Example 5
void M() { }
public S(bool b)
{
// ok. Compiler inserts assignment of `y = default`.
x = 1;
if (b)
M();
y = 2;
}
}
In examples (4) and (5), the resulting codegen sometimes has "double assignments" of fields. This is generally fine, but for users who are concerned with such double assignments, we can emit what used to be definite assignment error diagnostics as disabled-by-default warning diagnostics.
struct S
{
int x;
public S() // warning: 'S.x' is implicitly initialized to 'default'.
{
}
}
Users who set the severity of this diagnostic to "error" will opt in to the pre-C# 11 behavior. Such users are essentially "shut out" of semi-auto properties with manually implemented setters.
struct S
{
public int X
{
get => field;
set => field = field < value ? value : field;
}
public S() // error: backing field of 'S.X' is implicitly initialized to 'default'.
{
X = 1;
}
}
At first glance, this feels like a "hole" in the feature, but it's actually the right thing to do. By enabling the diagnostic, the user is telling us that they don't want the compiler to implicitly initialize their fields in the constructor. There's no way to avoid the implicit initialization here, so the solution for them is to use a different way of initializing the field than a manually implemented setter, such as manually declaring the field and assigning it, or by including a field initializer.
Currently, the JIT does not eliminate dead stores through refs, which means that these implicit initializations do have a real cost. But that might be fixable. https://github.com/dotnet/runtime/issues/13727
It's worth noting that initializing individual fields instead of the entire instance is really just an optimization. The compiler should probably be free to implement whatever heuristic it wants, as long as it meets the invariant that fields which are not definitely assigned at all return points or before any non-field member access of this
are implicitly initialized.
For example, if a struct has 100 fields, and just one of them is explicitly initialized, it might make more sense to do an initobj
on the entire thing, than to implicitly emit initobj
for the 99 other fields. However, an implementation which implicitly emits initobj
for the 99 other fields would still be valid.
Changes to language specification
We adjust the following section of the standard:
https://github.com/dotnet/csharpstandard/blob/draft-v6/standard/expressions.md#11712-this-access
If the constructor declaration has no constructor initializer, the
this
variable behaves exactly the same as anout
parameter of the struct type. In particular, this means that the variable shall be definitely assigned in every execution path of the instance constructor.
We adjust this language to read:
If the constructor declaration has no constructor initializer, the this
variable behaves similarly to an out
parameter of the struct type, except that it is not an error when the definite assignment requirements (§9.4.1) are not met. Instead, we introduce the following behaviors:
- When the
this
variable itself does not meet the requirements, then all unassigned instance variables withinthis
at all points where requirements are violated are implicitly initialized to the default value (§9.3) in an initialization phase before any other code in the constructor runs. - When an instance variable v within
this
does not meet the requirements, or any instance variable at any level of nesting within v does not meet the requirements, then v is implicitly initialized to the default value in an initialization phase before any other code in the constructor runs.