Skip to content

Defects vs subsets #127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
249 changes: 249 additions & 0 deletions src/coding-guidelines/expressions.rst
Original file line number Diff line number Diff line change
Expand Up @@ -81,3 +81,252 @@ Expressions
}

fn with_base(_: &Base) { ... }


.. guideline:: The ``as`` operator should not be used with numeric operands
:id: gui_ADHABsmK9FXz
:category: advisory
:status: draft
:release: <TODO>
:fls: fls_otaxe9okhdr1
:decidability: decidable
:scope: module
:tags: subset, reduce-human-error

The binary operator ``as`` should not be used with:

* a numeric type, including all supported integer, floating, and machine-dependent arithmetic types; or
* ``bool``; or
* ``char``

as either the right operand or the type of the left operand.

**Exception:** ``as`` may be used with ``usize`` as the right operand and an expression of raw pointer
type as the left operand.

.. rationale::
:id: rat_v56bjjcveLxQ
:status: draft

Although the conversions performed by ``as`` between numeric types are all well-defined, ``as`` coerces
the value to fit in the destination type, which may result in unexpected data loss if the value needs to
be truncated, rounded, or produce a nearest possible non-equal value.

Although some conversions are lossless, others are not symmetrical. Instead of relying on either a defined
lossy behaviour or risking loss of precision, the code can communicate intent by using ``Into`` or ``From``
and ``TryInto`` or ``TryFrom`` to signal which conversions are intended to perfectly preserve the original
value, and which are intended to be fallible. The latter cannot be used from const functions, indicating
that these should avoid using fallible conversions.

A pointer-to-address cast does not lose value, but will be truncated unless the destination type is large
enough to hold the address value. The ``usize`` type is guaranteed to be wide enough for this purpose.

A pointer-to-address cast is not symmetrical because the resulting pointer may not point to a valid object,
may not point to an object of the right type, or may not be properly aligned.
If a conversion in this direction is needed, :std:``std::mem::transmute`` will communicate the intent to perform
an unsafe operation.

.. non_compliant_example::
:id: non_compl_ex_hzGUYoMnK59w
:status: draft

``as`` used here can change the value range or lose precision.
Even when it doesn't, nothing enforces the correct behaviour or communicates whether
we intend to allow lossy conversions, or only expect valid conversions.

.. code-block:: rust

fn f1(x: u16, y: i32, z: u64, w: u8) {
let a = w as char; // non-compliant
let b = y as u32; // non-compliant - changes value range, converting negative values
let c = x as i64; // non-compliant - could use .into()

let d = y as f32; // non-compliant - lossy
let e = d as f64; // non-compliant - could use .into()
let f = e as f32; // non-compliant - lossy

let g = e as i64; // non-compliant - lossy despite object size

let p1: * const u32 = &b;
let a1 = p1 as usize; // compliant by exception
let a2 = p1 as u16; // non-compliant - may lose address range
let a3 = p1 as u64; // non-compliant - use usize to indicate intent

let p2 = a1 as * const u32; // non-compliant - prefer transmute
let p3 = a2 as * const u32; // non-compliant (and most likely not in a valid address range)
}

.. compliant_example::
:id: compl_ex_uilHTIOgxD37
:status: draft

Valid conversions that are guaranteed to preserve exact values can be communicated
better with ``into()`` or ``from()``.
Valid conversions that risk losing value, where doing so would be an error, can
communicate this and include an error check, with ``try_into`` or ``try_from``.
Other forms of conversion may find ``transmute`` better communicates their intent.

.. code-block:: rust

fn f2(x: u16, y: i32, z: u64, w: u8) {
let a: char = w.into();
let b: Result <u32, _> = y.try_into(); // produce an error on range clip
let c: i64 = x.into();

let d = f32::from(x); // u16 is within range, u32 is not
let e = f64::from(d);
// let f = f32::from(e); // no From exists

// let g = ... // no From exists

let h: u32 = 0;
let p1: * const u32 = &h;
let a1 = p1 as usize; // (compliant)

unsafe {
let a2: usize = std::mem::transmute(p1); // OK
let a3: u64 = std::mem::transmute(p1); // OK, size is checked
// let a3: u16 = std::mem::transmute(p1); // invalid, different sizes

let p2: * const u32 = std::mem::transmute(a1); // OK
let p3: * const u32 = std::mem::transmute(a1); // OK
}

unsafe {
// does something entirely different,
// reinterpreting the bits of z as the IEEE bit pattern of a double
// precision object, rather than converting the integer value
let f1: f64 = std::mem::transmute(z);
}
}


.. guideline:: An integer shall not be converted to a pointer
:id: gui_PM8Vpf7lZ51U
:category: <TODO>
:status: draft
:release: <TODO>
:fls: fls_59mpteeczzo
:decidability: decidable
:scope: module
:tags: subset, undefined-behavior

The ``as`` operator shall not be used with an expression of numeric type as the left operand,
and any pointer type as the right operand.

:std:``std::mem::transmute`` shall not be used with any numeric type (including floating point types)
as the argument to the ``Src`` parameter, and any pointer type as the argument to the ``Dst`` parameter.

.. rationale::
:id: rat_YqhEiWTj9z6L
:status: draft

A pointer created from an arbitrary arithmetic expression may designate an invalid address,
including an address that does not point to a valid object, an address that points to an
object of the wrong type, or an address that is not properly aligned. Use of such a pointer
to access memory will result in undefined behavior.

The ``as`` operator also does not check that the size of the source operand is the same as
the size of a pointer, which may lead to unexpected results if the address computation was
originally performed in a differently-sized address space.

While ``as`` can notionally be used to create a null pointer, the functions
``core::ptr::null`` and ``core::ptr::null_mut`` are the more idiomatic way to do this.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading the documentation of the Sphinx coding guidelines extension (again, shamelessly taken from the FLS) and we should also be able to use it with items from core, e.g.

Suggested change
``core::ptr::null`` and ``core::ptr::null_mut`` are the more idiomatic way to do this.
:std:`core::ptr::null` and :std:`core::ptr::null_mut` are the more idiomatic way to do this.

(note the single ticks instead of double ticks)


.. non_compliant_example::
:id: non_compl_ex_0ydPk7VENSrA
:status: draft

Any use of ``as`` or ``transmute`` to create a pointer from an arithmetic address value
is non-compliant:

.. code-block:: rust

fn f1(x: u16, y: i32, z: u64, w: usize) {
let p1 = x as * const u32; // not compliant
let p2 = y as * const u32; // not compliant
let p3 = z as * const u32; // not compliant
let p4 = w as * const u32; // not compliant despite being the right size

let f: f64 = 10.0;
// let p5 = f as * const u32; // not valid

unsafe {
// let p5: * const u32 = std::mem::transmute(x); // not valid
// let p6: * const u32 = std::mem::transmute(y); // not valid

let p7: * const u32 = std::mem::transmute(z); // not compliant
let p8: * const u32 = std::mem::transmute(w); // not compliant

let p9: * const u32 = std::mem::transmute(f); // not compliant, and very strange
}
}

.. compliant_example::
:id: compl_ex_oneKuF52yzrx
:status: draft

There is no compliant example of this operation.


.. guideline:: An integer shall not be converted to an invalid pointer
:id: gui_iv9yCMHRgpE0
:category: <TODO>
:status: draft
:release: <TODO>
:fls: fls_9wgldua1u8yt
:decidability: undecidable
:scope: system
:tags: defect, undefined-behavior

An expression of numeric type shall not be converted to a pointer if the resulting pointer
is incorrectly aligned, does not point to an entity of the referenced type, or is an invalid representation.

.. rationale::
:id: rat_OhxKm751axKw
:status: draft

The mapping between pointers and integers must be consistent with the addressing structure of the
execution environment. Issues may arise, for example, on architectures that have a segmented memory model.

.. non_compliant_example::
:id: non_compl_ex_CkytKjRQezfQ
:status: draft

This example makes assumptions about the layout of the address space that do not hold on all platforms.
The manipulated address may have discarded part of the original address space, and the flag may
silently interfere with the address value. On platforms where pointers are 64-bits this may have
particularly unexpected results.

.. code-block:: rust

fn f1(flag: u32, ptr: * const u32) {
/* ... */
let mut rep = ptr as usize;
rep = (rep & 0x7fffff) | ((flag as usize) << 23);
let p2 = rep as * const u32;
}

.. compliant_example::
:id: compl_ex_oBoluiKSvREu
:status: draft

This compliant solution uses a struct to provide storage for both the pointer and the flag value.
This solution is portable to machines of different word sizes, both smaller and larger than 32 bits,
working even when pointers cannot be represented in any integer type.

.. code-block:: rust

struct PtrFlag {
pointer: * const u32,
flag: u32
}

fn f2(flag: u32, ptr: * const u32) {
let ptrflag = PtrFlag {
pointer: ptr,
flag: flag
};
/* ... */
}

4 changes: 4 additions & 0 deletions src/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,10 @@
dict(name="readability", description="Readability-related guideline"),
dict(name="reduce-human-error", description="Reducing human error guideline"),
dict(name="numerics", description="Numerics-related guideline"),
dict(name="undefined-behavior", description="Numerics-related guideline"),

dict(name="defect", description="Guideline associated with the language-subset profile"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd like we can have a new Sphinx Needs option for whether it's a defect or subset.
If we think most guidelines can or should be classified as one or the other it might make sense.
What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created an issue for this, perhaps we tackle this later:
#142

What do you think @AlexCeleste?

dict(name="subset", description="Guideline associated with the defect-prevention profile"),
]

needs_categories = [
Expand Down
54 changes: 42 additions & 12 deletions src/process/style-guideline.rst
Original file line number Diff line number Diff line change
Expand Up @@ -92,10 +92,6 @@ We will examine each part:
:scope: module
:tags: numerics

Code must not rely on Rust's implicit integer wrapping behavior that occurs in release builds.
Instead, explicitly handle potential overflows using the standard library's checked,
saturating, or wrapping operations.

``guideline`` Title
-------------------

Expand Down Expand Up @@ -308,15 +304,29 @@ what it covers.
Content **SHOULD** aim to be as short and self-contained as possible, while still explaining
the scope of the guideline.

Content **SHOULD NOT** cover the rationale for the guideline, which is done in the ``rationale`` section.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned about removing the IETF RFC 2119-styling to this chapter when it was intended for readers. Now that it's intended for contributors are you thinking to keep it in?

Or is this change unrelated to that to aid in understanding?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this was rewritten in order to read better? Curious to hear your thoughts.

Guideline content consists of an Amplification and any Exceptions, which are normative,
supported by a Rationale and examples, which are not normative.
The justification extended explanation for the guideline **SHOULD** appear in the non-normative
Rationale rather than in the normative content.

Amplification
^^^^^^^^^^^^^

Guideline Content **MAY** contain a section titled *Amplification* followed by text that provides a more
precise description of the guideline title. An amplification is normative; if it conflicts with the
``guideline`` Title, the amplification **MUST** take precedence. This mechanism is convenient as it allows
a complicated concept to be conveyed using a short Title.
The *Amplification* is the block of text that **MAY** appear immediately below the guideline
attribute block, before any other subheadings.
If it is provided, the Amplification is normative; if it conflicts with the ``guideline`` Title,
the Amplification **MUST** take precedence. This mechanism is convenient as it allows a complicated
concept to be conveyed using a short Title and refined by the text below.

Content in the Amplification **SHOULD NOT** cover the rationale for the guideline or any
non-normative explanations, which **SHOULD** be provided in the ``rationale`` and examples sections
where helpful.

::

Code must not rely on Rust's implicit integer wrapping behavior that occurs in release builds.
Instead, explicitly handle potential overflows using the standard library's checked,
saturating, or wrapping operations.

Exception
^^^^^^^^^
Expand All @@ -327,9 +337,17 @@ some guidelines to be simplified. It is important to note that an exception is a
a guideline does not apply. Code that complies with a guideline by virtue of an exception does not
require a deviation.

If it is provided, the Exception is normative; if it conflicts with the ``guideline`` Title or the
Amplification, the Exception takes precedence over both. Depending on the individual guideline, it
may be clearer to have an Amplification or Title with an explicit Exception overriding parts of
their description, or it may be clearer to express excepted cases as integrated sentences in the
Amplification. This decision is editorial.

``rationale``
=============

Each Guideline **MUST** provide a *Rationale* for its inclusion and enforcement.

::

.. rationale::
Expand Down Expand Up @@ -363,7 +381,13 @@ The ``status`` option of a ``rationale`` **MUST** match the ``status`` of its pa
Rationale Content
-----------------

TODO(pete.levasseur)
The content of the rationale **SHOULD** provide the relevant context for why this guideline is useful.
The Rationale **SHOULD** make reference to any undefined behaviors or known errors associated
with the subject of the guideline.
The Rationale **MAY** make reference to other guidelines or to external documents cited in the
References.

The Rationale **SHOULD** be supported by code examples wherever concise examples are possible.

``non_compliant_example``
=========================
Expand Down Expand Up @@ -408,10 +432,10 @@ The ``non_compliant_example`` is neither normative, nor exhaustive. ``guideline`
``non_compliant_example`` Code Explanation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Code Explanation of a `non_compliant_example` **MUST** explain in prose the reason the guideline
The Code Explanation of a ``non_compliant_example`` **MUST** explain in prose the reason the guideline
when not applied results in code which is undesirable.

The Code Explanation of a `non_compliant_example` **MAY** be a simple explanation no longer than
The Code Explanation of a ``non_compliant_example`` **MAY** be a simple explanation no longer than
a sentence.

The Code Explanation of a ``non_compliant_example`` **SHOULD** be no longer than necessary to explain
Expand All @@ -435,6 +459,12 @@ than the current guideline.
``compliant_example``
=====================

A compliant example **SHOULD** be omitted when the guideline forbids an action entirely, i.e. there
is no compliant way to achieve the goal of the non-compliant code, rather than giving an irrelevant
example (or encouraging strange workarounds).
When there is a clear and idiomatic compliant way to achieve the goal, a compliant example **SHOULD**
be provided after the corresponding non-compliant example.

::

.. compliant_example::
Expand Down