-
Notifications
You must be signed in to change notification settings - Fork 164
First draft: caller unsafe #330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
5cfeb39
9e6e730
4522b7c
66d883c
dde56cc
322ea0e
dd98d2b
8e5657a
656ade9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
|
||
# Annotating members as `unsafe` | ||
|
||
## Background | ||
|
||
C# has had the `unsafe` feature since 1.0. There are two different syntaxes for the feature: a block syntax that can be used inside methods and a modifier that can appear on members and types. The original semantics only concern pointers. An error is produced if a variable of pointer type appears outside an unsafe context. For the block syntax, this is anywhere inside the block; for members this is inside the member; for types this is anywhere inside the type. Pointer operations are not fully validated by the type system, so this feature is useful at identifying areas of code needing more validation. Unsafe has subsequently been augmented to also turn off lifetime checking for ref-like variables, but the fundamental semantics are unchanged -- the `unsafe` context serves only to avoid an error that would otherwise occur. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It also included things adjacent, like sizeof(T). As part of this effort, I'd like us to revisit those choices. Based on the direction in this doc, seems sizeof(T) should not be unsafe. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It also included things adjacent, like sizeof(T). As part of this effort, I'd like us to revisit those choices. Based on the direction in this doc, seems sizeof(T) should not be unsafe. |
||
|
||
While existing `unsafe` is useful, it is limited by only applying to pointers and ref-like lifetimes. Many methods may be considered unsafe, but the unsafety may not be related to pointers or ref-like lifetimes. For example, almost all methods in the System.RuntimeServices.CompilerServices.Unsafe class have the same safety issues as pointers, but do not require an `unsafe` block. The same is true of the System.RuntimeServices.CompilerServices.Marshal class. | ||
|
||
## Goals | ||
agocke marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
The existing unsafe system does not clearly identify which methods need hand-verification to use safely, and it doesn't indicate which methods claim to provide that verification (and produce a safe API). | ||
|
||
We want to achieve all of the following goals: | ||
|
||
* Clearly annotate which methods require extra verification to use safely | ||
* Clearly identify in source code which methods are responsible for performing extra verification | ||
* Provide an easy way to identify in source code if a project doesn't use unsafe code | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about for a binary? Should we set There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good question. In this design it’s allowed to define a RequiresUnsafe method without an unsafe block if you never actually call the method. If you did, there would be an unsafe block and the assembly would have unverifiable code. I’m not sure if defining such a method, but never using it except through other callers unsafe methods, should be unverifiable. My inclination is yes: anything that contains unsafe code should be marked, even if it’s not called from safe code. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The sentence says project. That should be very simple to achieve: does the project have |
||
|
||
## Proposal | ||
|
||
We need to be able to annotate code as unsafe, even if it doesn't use pointers. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the critical part. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Assuming the rules are going to be the same for everyone (BCL / SDKs / everybody else), wouldn't we need the opposite as well?
Otherwise it seems to me that this will become viral (similar to the trimming attributes), where a lot of APIs are considered unsafe, because they eventually do something unsafe in some code path. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That’s what unsafe does today. It lets you call unsafe things, but has no requirements on the callers. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm a bit lost here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ah, I (think I) see, this new attribute would need to be added manually, it won't be added automatically whenever someone uses unsafe code. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Does this doc mention marking packages as unsafe? I don't see it. It's perfectly fine to have unsafe code, it just needs to be recognized as such at this point. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @olivier-spinelli -- Fair question. As @EgorBo says, that's not what is being communicated. In terms of helping people understand what we're thinking, this was probably a poor doc to go first. Expect some higher-level docs that better explain a plan. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Propagating I think the key point is to make the customers aware of unsafety at use site. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is also a widely-seen formalism to disallow |
||
|
||
Mechanically, this would be done with a modification to the C# language and a new property to the compilation. When the compilation property "EnableRequiresUnsafe" is set to true, the `unsafe` keyword on C# _members_ would require that their uses appear in an unsafe context. An `unsafe` block would be unchanged -- the statements in the block would be in an unsafe context, while the code outside would have no requirements. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would An approach like this where it becomes a new default would probably be problematic if we make breaking changes to what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This is a breaking change in the language, so that depends on the observed severity of the break. |
||
|
||
For example, the code below would produce an error: | ||
|
||
```C# | ||
void Caller() | ||
{ | ||
M(); // error, the call to M() is not in an unsafe context | ||
} | ||
|
||
unsafe void M() { } | ||
``` | ||
|
||
This can be addressed by callers in two ways: | ||
|
||
```C# | ||
unsafe void Caller1() | ||
{ | ||
M(); | ||
} | ||
void Caller2() | ||
{ | ||
unsafe | ||
{ | ||
M(); | ||
} | ||
} | ||
unsafe void M() { } | ||
``` | ||
|
||
In the case of `Caller1`, the call to `M()` doesn't produce an error because it is inside an unsafe context. However, calls to `Caller1` will now produce an error for the same reason as `M()`. | ||
|
||
`Caller2` will also not produce an error because `M()` is in an unsafe context. However, this code creates a responsibility for the programmer: by presenting a safe API around an unsafe call, they are asserting that all safety concerns of `M()` have been addressed. | ||
Comment on lines
+54
to
+56
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The rising verbosity of requiring There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This document doesn’t just present an annotation, it presents a formalism. By definition, safe code cannot cause memory safety issues because the runtime guarantees the safety. This means that applying unsafe is not a judgment call — if code can cause a memory safety issue, it needs to be unsafe. If not, it does not. If too much code is unsafe, we would have to narrow the property we’re attempting to prove. Otherwise the formalism is unsound. |
||
|
||
Notably, unsafe did not change the requirement that the code in the block must be correct. It merely offset the responsibility from the language and the runtime to the user in verification. | ||
|
||
For more precise details on the error semantics of unsafe blocks and unsafe members, the rules will mirror the rules defined for "Requires" attributes defined in [Feature attribute semantics](https://github.com/dotnet/runtime/blob/main/docs/design/tools/illink/feature-attribute-semantics.md#requiresfeatureattribute). The only addition is the presence of the `unsafe` block, which effectively provides a local `Requires` context. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Behavior of "Requires" attributes differs between classes and structs. Is it desirable to match that in this C# language feature? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we anticipate any future IL analysis here? That’s one of the things that makes structs tricky. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see this as a C# language feature. I do not think it makes sense to think about IL analysis in the context of C# language feature.
jjonescz marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we allow marking types as unsafe-callers-only, then it probably needs to be for all members imo. This probably also needs to include nested types, as they are textually within the Firstly, presumably if a method is marked as safe, then it cannot be overridden with something unsafe-callers-only, unless the implementing type is also unsafe-callers-only, since that would allow something like this: interface I1
{
void X();
}
// case 2: mark C1 as unsafe-callers-only
// idea 3: have a mechanism to mark C1's implementation of I1 as unsafe - this runs into the problem that you can't ever cast the type to a base type or pass to generic in a safe context, because you could then dynamically discover that it implements I1 in the safe context, allowing you to call I1.X() on it in a safe context also
class C1 : I1
{
public unsafe void X() => ... something unsafe; // potential error spot: can't implement a safe method with an unsafe method
}
C1 c = new(); //if case 2, we get the error here due to C1 being unsafe-callers-only & we are good
I1 i = c;
i.X(); //if there's no error by this point, then we just did something unsafe without being in an unsafe context But, if we require unsafe-callers-only members to only be able to override other unsafe-callers-only members, we run into a new problem: now we may need to mark an abstract method as unsafe-callers-only for 1 subtype's definition, which now causes all subtypes' implementations as unsafe-callers-only, unless we can opt-out by making it a safe context again somehow (although, this specific thing could presumably be achieved by just not adding the interface I1
{
void X();
}
class C1
{
public unsafe virtual void X() => ...; //something unsafe
}
unsafe class C2 : C1;
unsafe class C3 : C2, I1 //is it legal to implement I1, since I1.X expects a safe impl but we can't make a safe context within an unsafe-callers-only type?
{
public override void X() => ... something not unsafe, but we're in an unsafe context & it's unsafe-callers-only due to class being marked unsafe-callers-only
}
unsafe class C4 : C2, I1 //is it legal to implement I1, since I1.X expects a safe impl?
{
public override void X() => base.X(); //note: this is an unsafe operation
}
I1 i;
C1 c;
// is this valid?
// is it only valid if we know it's specifically C3 & we know that C3's impl for I1 is safe even though it's not guaranteed by any compatibility (source or binary)?
// should casting any instance of any unsafe type to a base class only be valid if we know that all current & future implementations of all non-unsafe-callers-only interface members implemented happen to be implemented with safe implementations, even though they have to be marked unsafe-callers-only due to being in an unsafe-callers-only type?
unsafe { c = GetC2(); }
i = (I1)c;
i.X(); I'm not sure what the correct answers to the above are, but I think it needs careful consideration to make sure we come up with the right solution, and I do think that the ability to introduce a safe context is likely a part of that solution. The other reason is that I think we should aim to keep unsafe contexts at a minimum, but if you need to mark a whole type as unsafe-callers-only for some reason (e.g., if the default value for a struct type represents a state that is unsafe), then you immediately lose the ability to have any non-unsafe contexts for anything textually inside that type, which I think is unideal. |
||
|
||
## Implementation | ||
|
||
In addition to compiler enforcement, the following attribute will be added for annotating unsafe members. It is an error to use this attribute directly in C#. Instead, the `unsafe` keyword should be used. | ||
|
||
```C# | ||
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Constructor)] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So I assume if you apply this to a type (class/struct), all members are considered "ReuiresUnsafe"? But note that today in C#, iterator methods can escape unsafe context of a class: unsafe class C
{
void M1()
{
int* p; // ok
}
IEnumerable<int> M() // iterator method
{
int* p; // error
yield return 0;
}
} How would RequiresUnsafe semantics apply to the iterator method? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tbh, I don't think it should be allowed on a type level - my understanding is that rust doesn't allow this for the following reason: a type itself cannot be unsafe, only operations can be, and I personally think that is pretty sensible. In saying that, we do have precedence for this in c# already, with pointer types, but I personally tend to think rust's solution of only allowing you to do unsafe things with them in an unsafe context is the better approach. The other reason that I don't think UnsafeCallersOnly should be allowed on a type level is that it feels suspicious to claim that the only possible ways to use a type would be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
FWIW, pointer types themselves are an example of "unsafe types". For example, it might be considered inconsistent if you couldn't create your custom Anyway, how would it work if RequiresUnsafe wouldn't work for types? Users can already declare types as unsafe in C# today and that propagates to its members (except iterators) automatically. If we excluded unsafe types from RequiresUnsafe semantics, would we require users to mark the members separately as unsafe in order to opt them in for RequiresUnsafe? Or just error on unsafe types when EnableRequiresUnsafe is enabled for project? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This was explaining how it works in rust, not how it works today in c#:
In rust, the following function is not unsafe-callers-only: I think this is the ideal setup, but I'm not certain it's possible to actually do in c#, since we have
It also doesn't propogate to Additionally, the current proposal doesn't even make marking a type with It also seems to differ here
Since when you mark a type unsafe-impl today these members do get an unsafe-impl context, but if we change the meaning to getting an unsafe-callers-only context, they shouldn't get that (which makes sense for base/interface members that aren't unsafe-callers-only, but is also another change with what putting Also:
These targets also seem wrong imo, I didn't notice it earlier as I must have skimmed over that line, but e.g., if I have a pointer field (or any other type we mark unsafe-callers-only), does that require me to mark the whole type unsafe-callers-only, even if it has safe operations on it & just happens to use pointers or other unsafe functionality? Do we need to strictly disallow unsafe-callers-only types as instance fields in classes, since we cannot mark them as unsafe-callers-only, and the current spec specifies that it doesn't get passed down to instance members in classes (other than constructors), thus making it impossible to have them as instance fields?
If we change the meaning of Lastly, #330 (comment)
I think it is inevitable that we will end up with some differences to rust (due to the reasons I'm about to explain), but we could make it the same if we really wanted to, but this would require a massive set of changes to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In Rust impls can be unsafe, which serves a similar purpose here. C# does not have a meaningful difference between type declaration and implementation. We will need some way to handle the following problem: class UnsafeEnumerable : IEnumerable<T>
{
unsafe IEnumerator<T> GetEnumerator() => ...; // use unsafe code
} We have to guard virtual invocations (interface methods), otherwise you can trivially bypass unsafe. But we can't mark There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Right, so the purpose of marking a type unsafe-callers-only is to indicate that 1 or more of the interfaces it implements is implemented unsafely? That is really not clear from anything imo (not clear syntactically, also not clear from this document), but I suppose it makes sense if the intent is that it's the only reason (or There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good feedback, I think this doc needs some more examples and explanation |
||
public sealed class RequiresUnsafeAttribute : System.Attribute | ||
{ | ||
} | ||
``` | ||
|
||
### Globally-valid properties | ||
|
||
The overall goal is to ensure .NET code is "valid," in the sense that certain properties are always true. Generating a complete list of such properties is out of scope of this document. However, at least the following properties are required: | ||
|
||
* Memory safety | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's crucial to differentiate the unsafety levels. Type safety and GC safety should get highest severity because of the ability to break runtime. There are continuous tickets for inappropriate unsafe API usage causing hard-to-diagnose bugs. For example, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Nearly all memory safety bugs can crash the runtime. For example, incorrect use of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The safety property I suggested, memory safety, is derived from an analysis of severe security breaches across the industry. Things in this category are known to be very dangerous and award attackers with significant power. I don’t think there’s further subdivision of this category that is interesting. Unsafe.As and Unsafe.AsPointer are both capable of producing memory safety problems and are therefore both unsafe. The only other property I’m suggesting is viewing uninitialized memory. This is less severe. We could drop it if necessary. At the moment I’m proposing no other properties that would be covered by the unsafe feature and therefore further categorization is irrelevant. |
||
* No access to uninitialized memory | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it actually possible to guarantee this ( Basically, I think we can guarantee memory safety if no |
||
|
||
In this document **memory safety** is strictly defined as: code never accesses memory that is not managed by the runtime. "Managed" here does not refer to solely to heap-allocated, garbage collected memory, but also includes stack-allocated variables that are considered allocated by the runtime, or memory that is acquired for legal use by the runtime through any other means. | ||
|
||
No access to uninitialized memory means that all memory is either never read before it has been initialized, or it has been initialized to a zero value. This term is defined relative to the application context, not the system context. Therefore, allocating memory using a mechanism where .NET guarantees that the memory has been zeroed would be considered initialized. Reading memory from a function which does not guarantee zero initialization would not. Reusing memory that has previously been initialized but now may hold an unknown value due to previous application execution may be incorrect logic from the application's perspective, but is not invalid code according to .NET. | ||
|
||
These properties are particularly important for security purposes. Any methods that can potentially violate these guarantees should be unsafe. Additionally, all such methods should have documentation that describes what conditions need to be satisfied to ensure that these guarentees are preserved. | ||
|
||
These properties are guaranteed by "safe" code through a combination of compiler and .NET runtime enforcement. For `unsafe` code, these properties must be guaranteed by the programmer. | ||
|
||
`unsafe` members are used to identify the places that cannot be automatically checked by the compiler and runtime for validity. Inside unsafe blocks, the programmer is responsible for ensuring that all requirements of the unsafe code are met, and that all code outside the block will have validity properly enforced by the system. | ||
|
||
|
||
### Examples and APIs | ||
|
||
**ArrayPool.Rent** | ||
|
||
This method is unsafe because it returns an array with unintialized memory. Code must not read the contents of the returned array without initialization. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How is uninitialized defined here? Is this because the initial array is allocated with GC.AllocateUninitializedArray, or is this because someone can rent an array, write data into it, return it, and then someone else renting gets that data? If it's the former, I'd rather just change the Rent implementation to use new[] instead of AllocateUninitializedArray. If it's the latter, that won't help. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The former -- uninitialized means uninitialized by the "application" vs. the OS. If the pool used regularly .NET array allocation it would be zero-init and considered initialized. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Uninitialized memory is not the only reason why |
||
|
||
**stackalloc** | ||
|
||
This language feature is unsafe if used in a `SkipLocalsInit` context because the stack allocated buffer is uninitialized. If an initializer is used and the converted type is `Span<T>`, this code is safe. | ||
|
||
agocke marked this conversation as resolved.
Show resolved
Hide resolved
|
||
**P/Invoke** | ||
|
||
All P/Invoke methods are unsafe because they may compromise memory safety if the callee function does not match the P/Invoke method specification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be good to clarify that if
unsafe
is on the member, it has an effect on the member's signature as well