Skip to content

Add option to enable csharp nullable reference types #13218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

ogxd
Copy link

@ogxd ogxd commented Jul 5, 2023

Related to:

Context

Nullable reference types introduce a nullability context. Currently, the C# code generated via protobuf does not allow null for reference types and will intentionally throw at runtime when a null is assigned to such field.
Because generated files are annotated with the // <auto-generated> attribute on top, nullable reference types won't work, even if NRT is explicitly enabled at the project level for instance.
The consequence of this is that there is no way to benefit from NRT static code analysis to know in advance where fields could be assigned with a null (and thus it would throw at runtime)

Solution

A first solution would be to allow removing the // <auto-generated> attribute, however, it has the unwanted side effects of enabling code analysis on the generated file itself, which we want to avoid.
Instead, it turns out that marking generated files with #nullable enable works.

The following PR is about this: Adding a CLI option to add #nullable enable on top of generated files.
The unit tests show that the code generated with that option now carries the nullability information so that we have a warning for any consuming code that may assign nulls (CS8601).

image

Compatibility

Files annotated with #nullable enable can still compile fine under versions of dotnet prior to 6.0, so it is a backward-compatible change.
Also, fields are still annotated as non-nullable, so I believe this feature does not break compatibility.

Todo

I haven't made any changes to the documentation yet as I would like feedback first.
CLA in progress.

@ogxd ogxd requested a review from a team as a code owner July 5, 2023 22:15
@ogxd ogxd requested review from jskeet and removed request for a team July 5, 2023 22:15
@ogxd ogxd force-pushed the csharp-nrt-option branch from 57115ec to 22d4da3 Compare July 5, 2023 22:20
@ogxd ogxd force-pushed the csharp-nrt-option branch from 10a904a to 92bad8c Compare July 5, 2023 22:39
@jskeet
Copy link
Contributor

jskeet commented Jul 6, 2023

Thank you for all your work on this, but we're still discussing whether or not we want to include nullable reference types in code generation at all. Please see grpc/grpc#33507 where there is considerable discussion.
If you're happy to keep working on this PR knowing that the end result may be that we don't want to do anything, that's fine - but I won't be reviewing it in any detail until we've decided what we actually want to do.

(In general we try to avoid adding language-specific command line options - when to use NRTs is one of the key design discussions here. It feels like a new option might be the only way to go as different consumers may want to generate code for the same proto differently, but it's not certain yet.)

@ogxd
Copy link
Author

ogxd commented Jul 6, 2023

No problem, I understand.
I just noticed that my PR is actually about what @JamesNK just proposed yesterday here and here. As an user of protobuf for C# I confirm that simply having warnings in our code fulfills our most important need which is to avoid runtime null exceptions.

By the way, I used #nullable enable here, but #nullable enable annotations might be enough since we might not need nullable warning context within the generated files.

(I don't think I need to push anything else to that PR anyway until a decision has been taken, the implementation and the tests already shows whats this solution implies and its benefits)

@fowles fowles added the c# label Jul 7, 2023
@tonydnewell
Copy link
Contributor

The code generation needs further changing to fully support handline NRT.
See grpc/grpc#33507 (comment)
e.g. allowing message fields to be set to null, Equals, MergeFrom, and removing redundant null checks elsewhere.

@ogxd
Copy link
Author

ogxd commented Jul 13, 2023

The code generation needs further changing to fully support handline NRT.
See grpc/grpc#33507 (comment)
e.g. allowing message fields to be set to null, Equals, MergeFrom, and removing redundant null checks elsewhere.

I don't think it does. As the MS documentations says:

Nullable reference types aren't new class types, but rather annotations on existing reference types. The compiler uses those annotations to help you find potential null reference errors in your code.

This means that, given that the reference types are currently considered non-nullable, enabling NRT is about adding annotations so that a user knows that he might assign a null and trigger NREs.

What you describe with "allowing message fields to be set to null, Equals, MergeFrom, and removing redundant null checks elsewhere" is not about NRT but rather about allowing fields to be set to null. NRT annotations can be added on top, but again, it is different.

This PR is about adding NRT annotations so that users can consume this API while avoiding NREs, without any change to the generated code, unlike allowing fields to be set to null. Personally, I am not against also allowing reference fields to be set to null, but it is another topic (and more debatable).

@tonydnewell
Copy link
Contributor

@ogxd Fields within a message that themselves are messages can be assigned null.
For example, proto file:

message Person {
  string name = 1;
  int32 id = 2;  // Unique ID number for this person.
  string email = 3;

  enum PhoneType {
    PHONE_TYPE_UNSPECIFIED = 0;
    PHONE_TYPE_MOBILE = 1;
    PHONE_TYPE_HOME = 2;
    PHONE_TYPE_WORK = 3;
  }

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  PhoneNumber phone = 4; // this is a message and can be assigned null
}

with code:

Person person = new Person()
{
     Id = 1234,
     Name = "Fred",
     Email = "fred@example.com",
     Phone = null
 };

compiles today without your changes. Setting Phone to null effectively removes it from the message.

But with your changes:

Warning	CS8625	Cannot convert null literal to non-nullable reference type.

There is more involved than just adding #nullable enable to the generated code. Other changes to the generated code are necessary too.

@ogxd
Copy link
Author

ogxd commented Jul 15, 2023

Yes, you're right, I thought they were treated the same as strings. But it means marking these fields/properties as nullable
with ? would be enough. Since this is still about adding annotations, the compiled code remains the same, so it is still a safe change, right?
image

@ogxd
Copy link
Author

ogxd commented Jul 15, 2023

I've just updated the PR to use #nullable enable annotations instead of #nullable enable and to take into account remark from @tonydnewell about message fields. Generated code now also has nullable annotations for message fields (case covered by unit test)

@Falco20019
Copy link
Contributor

Falco20019 commented Jul 27, 2023

One special case that I think might have been missed (only gave it a quick glance) is google.protobuf.StringValue which then needs to be mapped onto string? instead of string. Since all other wrappers were already mapped onto nullables, they should be fine.

bool serializable;
// Whether the generated files are annotated with "#nullable enable" to display warnings on potential null assignments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment still only talks about #nullable enable, which is not true anymore.

@tonydnewell
Copy link
Contributor

I think there is a lot more work to be done. Adding #nullable enable annotations rather than #nullable enable hides a lot of potential problems.

As an experiment I took the code generated from unittest_well_known_types.proto and added #nullable enable and it produced 80 warnings.

@@ -79,6 +79,8 @@ bool Generator::Generate(const FileDescriptor* file,
cli_options.internal_access = true;
} else if (options[i].first == "serializable") {
cli_options.serializable = true;
} else if (options[i].first == "enable_nullable_reference_types") {
Copy link
Contributor

@tonydnewell tonydnewell Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be for a shorter option name. Microsoft just uses "nullable" (<Nullable> and #nullable) so maybe we should? Or "nullable_reference_types"? Whatever is chosen it would be good if both the protocol buffers compiler and gRPC plugin use the same name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the requests that we've had for generating int?-backed properties for optional fields, I think just "nullable" could lead to confusion.

How about enable_nrt? I think NRT is a reasonably commonly-used abbrevation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed it is too long and can be confusing, enable_nrt seems more appropriate

@ogxd
Copy link
Author

ogxd commented Jul 28, 2023

I think there is a lot more work to be done. Adding #nullable enable annotations rather than #nullable enable hides a lot of potential problems.

As an experiment I took the code generated from unittest_well_known_types.proto and added #nullable enable and it produced 80 warnings.

As the docs says about NRT, its purpose is to "minimize the likelihood that your code causes the runtime to throw System.NullReferenceException". I think the point here is that the autogenerated code is not "your" code (you don't edit it, you don't maintain it, it is simply an external API generated by an external tool). This is why I think there shouldn't be warnings from NRT within the generated code itself, and one of the reasons why #nullable enable annotations exists.

image

@tonydnewell
Copy link
Contributor

I think there is a lot more work to be done. Adding #nullable enable annotations rather than #nullable enable hides a lot of potential problems.
As an experiment I took the code generated from unittest_well_known_types.proto and added #nullable enable and it produced 80 warnings.

As the docs says about NRT, its purpose is to "minimize the likelihood that your code causes the runtime to throw System.NullReferenceException". I think the point here is that the autogenerated code is not "your" code (you don't edit it, you don't maintain it, it is simply an external API generated by an external tool). This is why I think there shouldn't be warnings from NRT within the generated code itself, and one of the reasons why #nullable enable annotations exists.

Just my opinion (I'm not part of the protocol buffers team) but getting the generated code to be "clean" will probably help in not missing any corner cases, such as those arising with well-known types such as StringValue and BytesValue, and methods such as Equals.

@JamesNK
Copy link
Contributor

JamesNK commented Jul 31, 2023

What are the warnings? Nullable warnings shouldn't come from generated code.

I'm guessing the problem is Google.Protobuf itself isn't annotated. #nullable enable annotations could be a short term fix and a longer term fix is annotating Google.Protobuf (not a small task. I see someone attempted to do that here - #9801).

@tonydnewell
Copy link
Contributor

What are the warnings? Nullable warnings shouldn't come from generated code.

I'm guessing the problem is Google.Protobuf itself isn't annotated. #nullable enable annotations could be a short term fix and a longer term fix is annotating Google.Protobuf (not a small task. I see someone attempted to do that here - #9801).

A lot of warnings about setting null in reflection methods and classes such as GeneratedClrTypeInfo - these could wait for Google.Protobuf to be updated, or the generated code could add the null forgiving operator.

Also warnings in constructors when fields contain null values, e.g. CS8618 Non-nullable field '_unknownFields' must contain a non-null value when exiting constructor. Consider declaring the field as nullable.

Also warnings in Equals - e.g. CS8604 Possible null reference argument for parameter 'other' in 'bool OneofWellKnownTypes.Equals(OneofWellKnownTypes other)'.

I know that we could use #nullable enable annotations but by using #nullable enable instead it might help spot the areas of the code that really need fixing.

@JamesNK
Copy link
Contributor

JamesNK commented Jul 31, 2023

It sounds like these nullable warnings are related to the generated code. In that case, generated code should be updated to fix the warnings. For example, allow things to be null where necessary. Or add the bang operator to suppress errors when there is no other choice.

I suggest putting a project with a copy of the generated code (copied from the temporary obj directory) in a GH repo. Then other people can see the errors you're talking about. Right now, we have no idea.

That would allow us focus to be on what generated code with nullable annotations should look like. Then once that is decided, update protoc generation to output it.

Also warnings in constructors when fields contain null values, e.g. CS8618 Non-nullable field '_unknownFields' must contain a non-null value when exiting constructor. Consider declaring the field as nullable.

_unknownFields field should be nullable. It can have a null value:

if (_unknownFields != null) {
hash ^= _unknownFields.GetHashCode();
}

Also warnings in Equals - e.g. CS8604 Possible null reference argument for parameter 'other' in 'bool OneofWellKnownTypes.Equals(OneofWellKnownTypes other)'.

Messages implement IMessage<T>, which implements IEquatable<T>. Its Equals method accepts a nullable argument. The generated code needs to match:

public bool Equals(OneofWellKnownTypes? other)
{
    // ...
}

@jskeet
Copy link
Contributor

jskeet commented Aug 4, 2023

Just to chime in, I agree that the generated code should build warning-free. It sounds like some more work is needed - but thanks for what you've already done, and I look forward to looking more closely when it's fully "done".
I suspect that when Google.Protobuf eventually becomes NRT-aware, we may need a second pass anyway - but that's okay.

@ogxd
Copy link
Author

ogxd commented Sep 20, 2023

Thanks for the feedback! I just became dad on the 4th of August so my days were a little busy since but if the nullable topic hasn't addressed in parallel since I'll update the PR accordingly to your remarks when I have some time 😄

@stebet
Copy link

stebet commented Nov 10, 2023

Thanks for the feedback! I just became dad on the 4th of August so my days were a little busy since but if the nullable topic hasn't addressed in parallel since I'll update the PR accordingly to your remarks when I have some time 😄

I'm very excited about getting this into the tooling so if there is anything I can do to help or speed things along, please let me know :)

--csharp_out=csharp/src/Google.Protobuf.Test.TestProtos \
--csharp_opt=file_extension=.pb.cs \
--csharp_opt=enable_nullable_reference_types \
--descriptor_set_out=csharp/src/Google.Protobuf.Test/testprotos.pb \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overwriting of testprotos.pb here causes the tests in DescriptorDeclarationTest.cs to fail

TestNrtNestedMessage message_field = 1;
string string_field = 2;
int32 int32_field = 3;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code generated for unittest_nrt_proto3.proto causes RefStructCompatibilityTest.cs tests to fail

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version of the files causes tests in DescriptorDeclarationTest.cs to fail

@tonydnewell
Copy link
Contributor

I've started looking at this in more detail to see if I can complete the work.
Currently there are some issues with the existing tests failing after some of the changes. See comments above.

Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please add a comment.

This PR is labeled inactive because the last activity was over 90 days ago. This PR will be closed and archived after 14 additional days without activity.

@github-actions github-actions bot added the inactive Denotes the issue/PR has not seen activity in the last 90 days. label Feb 14, 2024
Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.

This PR was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c# inactive Denotes the issue/PR has not seen activity in the last 90 days.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants