Rust for CXX programmers
Pointers in C++ are vulnerable to bugs involving dangling or null pointers. In Rust, pointers are non-nullable and the compiler enforces the invariant that pointers are never dangling.
C++ | Rust |
---|---|
& | &mut |
const& | & (points to immutable data - not just read-only through the reference) |
std::unique_ptr | owned pointer: ~ |
std::shared_ptr | managed pointer: @ |
Borrowed pointers (&) have the same representation as C pointers or C++ references at runtime, but to enforce safety at compile-time they are more restricted and have a concept of lifetimes. See the borrowed pointer tutorial for details. Note that unlike C++ references but like raw pointers, they can be treated as values and stored in containers.
For example, a container can return a reference to an object inside it, and the lifetime will ensure that the container outlives the reference and is not modified in a way that would make it invalid. This concept also extends to any iterator that wraps borrowed pointers, and makes them impossible to invalidate.
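A minimal sketch of the idea, written in current Rust syntax (which differs from the older syntax used elsewhere on this page). The function name largest is made up for illustration; the lifetime 'a ties the returned borrowed pointer to the slice it came from.

```rust
// Returns a borrowed pointer into `items`; the lifetime `'a` says the result
// cannot outlive the slice it points into. Panics if `items` is empty.
fn largest<'a>(items: &'a [i32]) -> &'a i32 {
    let mut best = &items[0];
    for item in items {
        if item > best {
            best = item;
        }
    }
    best
}

fn main() {
    let numbers = vec![3, 9, 4];
    let biggest = largest(&numbers);

    // Borrowed pointers are ordinary values, so they can be stored in a container.
    let refs: Vec<&i32> = numbers.iter().collect();

    // Mutating `numbers` at this point (for example with `push`) would be
    // rejected at compile time, because `biggest` and `refs` still borrow from it.
    println!("{} {}", biggest, refs.len());
}
```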
Owned pointers are almost identical to std::unique_ptr in C++, and point to memory on a shared heap so they can be moved between tasks.
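As a rough illustration in current Rust syntax, assuming the owned pointer ~T is spelled Box<T> and a task is an OS thread started with std::thread::spawn. The function name sum_on_other_thread is hypothetical.

```rust
use std::thread;

// Takes ownership of a heap allocation and hands it to another thread.
fn sum_on_other_thread(data: Box<[i32; 3]>) -> i32 {
    // `move` transfers ownership of the box into the closure run by the new thread.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    let data = Box::new([1, 2, 3]);
    let total = sum_on_other_thread(data);
    // `data` was moved above, so reading it here would be a compile error.
    println!("{}", total);
}
```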
Managed pointers are similar to std::shared_ptr, but have full garbage collection semantics and point to a task-local heap. By virtue of being on a local heap, they can use a per-task garbage collector.
If managed pointers are avoided, garbage collection is not used (the compiler can enforce this with the managed-heap-memory lint check). Garbage collection will be avoided in the standard library except for concepts like persistent (copy-on-write, with shared substructures) containers that require them.
C++ | Rust |
---|---|
"foo" (const char *) | "foo" (&'static str) |
std::string("foo") | ~"foo" (~str) |
std::make_shared<std::string>("foo") (std::shared_ptr<std::string>) | @"foo" (@str, no indirection) |
The &str type represents an immutable fixed-size slice of any kind of string.
C++ | Rust |
---|---|
std::array<int, 3> {{1, 2, 3}} | [1, 2, 3] ([int, ..3]) |
std::vector<int> {1, 2, 3} | let myvec: Vec<int> = vec![1, 2, 3]; |
std::shared_ptr<std::vector<int>> | Rc<Vec<int>> |
The &[] and &mut [] types represent a fixed-size slice (immutable and mutable respectively) of any kind of vector.
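A short sketch of slices in current Rust syntax, where the element type is written i32 rather than int. The helper names sum and double_all are invented for this example; the same functions accept a fixed-size array, a vector, or part of either.

```rust
// Reads through an immutable slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Modifies elements in place through a mutable slice.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

fn main() {
    let array = [1, 2, 3];
    let mut vector = vec![4, 5, 6];

    println!("{}", sum(&array)); // slice of a fixed-size array
    println!("{}", sum(&vector[1..])); // slice of part of a vector
    double_all(&mut vector);
    println!("{:?}", vector);
}
```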
Rust's enum is a sum type like boost::variant (tagged union). The match expression is used to pattern match out of it, replacing the usage of a switch with a C++ enum, or a visitor with boost::variant. A match expression is required to cover all possible cases, although a default fallback can be used.
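For instance, a sum type and two match expressions might look like the following sketch in current Rust syntax. The Shape type and the function names are hypothetical.

```rust
// A tagged union: each variant can carry different data.
enum Shape {
    Circle(f64),
    Rectangle(f64, f64),
    Point,
}

// Every variant is listed, so no fallback arm is needed.
fn area(shape: &Shape) -> f64 {
    match *shape {
        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
        Shape::Rectangle(width, height) => width * height,
        Shape::Point => 0.0,
    }
}

// `_` is the default fallback covering all remaining variants.
fn is_round(shape: &Shape) -> bool {
    match *shape {
        Shape::Circle(_) => true,
        _ => false,
    }
}

fn main() {
    let shapes = [Shape::Circle(1.0), Shape::Rectangle(2.0, 3.0), Shape::Point];
    for shape in shapes.iter() {
        println!("{} {}", area(shape), is_round(shape));
    }
}
```

Removing an arm from area without adding a fallback makes the compiler reject the match as non-exhaustive.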
C++ | Rust |
---|---|
enum | enum, but can only store the given variants |
boost::variant | enum, Either (an enum) for 2 types |
boost::optional | Option (an enum) |
Option and Either are very common patterns, so they're predefined along with utility methods.
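A brief sketch of Option and two of its utility methods, in current Rust syntax. Only Option is shown; find_even is a made-up helper.

```rust
// Returns the first even value, or None when there is no such value.
fn find_even(values: &[i32]) -> Option<i32> {
    for &v in values {
        if v % 2 == 0 {
            return Some(v);
        }
    }
    None
}

fn main() {
    // `map` transforms the contained value, `unwrap_or` supplies a default for None.
    let doubled = find_even(&[1, 4, 5]).map(|v| v * 2).unwrap_or(0);
    println!("{}", doubled);

    // Pattern matching works as with any other enum.
    match find_even(&[1, 3]) {
        Some(v) => println!("found {}", v),
        None => println!("nothing found"),
    }
}
```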
A Rust struct is similar to struct/class in C++, and uses a memory layout compatible with C structs. Members are public by default but can be marked with priv. There is no concept of a friend function or type, because priv applies only at a module level.
Rust has no special syntax or semantics for constructors, and just uses static methods that return the type. Struct initializer syntax SomeStruct{field:value..} avoids the boilerplate of manually creating simple constructors that just copy values into fields.
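A small sketch of both styles. The Point type and the method names new and origin are illustrative; new is only a naming convention, not special syntax.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // An ordinary static method playing the role of a constructor.
    fn new(x: f64, y: f64) -> Point {
        Point { x: x, y: y }
    }

    // Any number of named "constructors" can coexist.
    fn origin() -> Point {
        Point { x: 0.0, y: 0.0 }
    }
}

fn main() {
    let a = Point::new(1.0, 2.0);
    let b = Point::origin();
    // Struct initializer syntax needs no constructor at all.
    let c = Point { x: 3.0, y: 4.0 };
    println!("{} {} {}", a.x + a.y, b.x + b.y, c.x + c.y);
}
```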
Similarly, Rust has no conversion operators - it is necessary to call explicit methods such as .to_str() to perform conversions. The philosophy is to make the cost (no hidden allocations) and the code path clearer.
Custom destructors for non-memory resources can be provided by implementing the Drop trait.
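A minimal sketch of a destructor, with a counter standing in for a real non-memory resource such as a file handle or lock. Guard and use_guard are hypothetical names.

```rust
use std::cell::Cell;

// Holds a "resource"; here just a shared counter of how many guards were released.
struct Guard<'a> {
    released: &'a Cell<u32>,
}

impl<'a> Drop for Guard<'a> {
    // Runs automatically when a Guard goes out of scope.
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

fn use_guard(released: &Cell<u32>) {
    let _guard = Guard { released };
    // ... work with the resource here ...
} // `_guard` is dropped at this point, which calls `drop`

fn main() {
    let released = Cell::new(0);
    use_guard(&released);
    println!("{}", released.get());
}
```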
Tuples are built into the language syntax and along with destructuring assignment (available in arguments, match, let) they provide a convenient way of grouping multiple values. There are also 'tuple structs', where the type is named but the fields are not.
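The following sketch shows destructuring in let, in function arguments and in match, plus a tuple struct. All names (Meters, min_max, span, describe) are invented for illustration.

```rust
// A tuple struct: the type is named but its single field is not.
struct Meters(f64);

// Returns two values at once as a tuple. Panics if `values` is empty.
fn min_max(values: &[i32]) -> (i32, i32) {
    let mut lo = values[0];
    let mut hi = values[0];
    for &v in values {
        if v < lo {
            lo = v;
        }
        if v > hi {
            hi = v;
        }
    }
    (lo, hi)
}

// Destructuring directly in the argument list.
fn span((lo, hi): (i32, i32)) -> i32 {
    hi - lo
}

// Destructuring in a match expression.
fn describe(point: (i32, i32)) -> &'static str {
    match point {
        (0, 0) => "origin",
        (_, 0) => "on the x axis",
        (0, _) => "on the y axis",
        _ => "elsewhere",
    }
}

fn main() {
    let (lo, hi) = min_max(&[4, 1, 7]); // destructuring in let
    let Meters(length) = Meters(2.5);
    println!("{} {} {} {} {}", lo, hi, span((lo, hi)), describe((3, 0)), length);
}
```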
Rust has no concept of a copy constructor and only shallow types are implicitly copyable. Assignment or passing by-value will only do an implicit copy or a move. Other types can implement the Clone trait, which provides a clone method.
Rust will implicitly provide the ability to move any type, along with a swap implementation using moves. The compiler enforces the invariant that no value can be read after it was moved from.
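To make the distinction concrete, here is a sketch in current Rust syntax that contrasts an implicit copy, an explicit clone, a move, and a swap. Buffer is a hypothetical type; std::mem::swap is the standard library's move-based swap.

```rust
// Deriving Clone provides an explicit `clone` method; nothing is copied implicitly.
#[derive(Clone, Debug, PartialEq)]
struct Buffer {
    bytes: Vec<u8>,
}

fn main() {
    // Shallow types such as integers are copied implicitly.
    let a = 5;
    let b = a;
    println!("{} {}", a, b);

    let first = Buffer { bytes: vec![1, 2] };
    let copy = first.clone(); // explicit deep copy
    let mut moved = first; // move: reading `first` after this line is a compile error

    // Swapping exchanges the two values using moves only.
    let mut other = Buffer { bytes: vec![9] };
    std::mem::swap(&mut moved, &mut other);
    println!("{:?} {:?} {:?}", copy, moved, other);
}
```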
Rust does not allow sharing mutable data between threads, so it isn't vulnerable to races for in-memory data (data races). Instead of sharing data, message passing is used - either by copying data, or by moving owned pointers between tasks.
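A sketch of message passing with moved owned pointers, assuming current Rust where tasks are threads, owned pointers are Box<T>, and channels come from std::sync::mpsc. The function name squares_from_worker is made up.

```rust
use std::sync::mpsc;
use std::thread;

// Computes squares on a worker thread and sends each result back as a boxed value.
fn squares_from_worker(inputs: Vec<i32>) -> Vec<i32> {
    let (sender, receiver) = mpsc::channel();

    // `inputs` and `sender` are moved into the worker; nothing is shared.
    let worker = thread::spawn(move || {
        for n in inputs {
            // Sending moves the box to the receiving side.
            sender.send(Box::new(n * n)).unwrap();
        }
        // `sender` is dropped when the closure ends, which closes the channel.
    });

    // Iteration stops once the channel is closed and drained.
    let results: Vec<i32> = receiver.iter().map(|boxed| *boxed).collect();
    worker.join().unwrap();
    results
}

fn main() {
    println!("{:?}", squares_from_worker(vec![1, 2, 3]));
}
```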
Rust does not include C++-style exceptions that can be caught, only uncatchable unwinding (fail) that can be dealt with at task boundaries. The lack of catch means that exception safety is much less of an issue, since calling a method that fails will prevent that object from ever being used again.
Generally, errors are handled by using an enum of the result type and the error type, and the result module implements related functionality.
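As an illustration in current Rust syntax, the enum in question is Result<T, E>. The ParseError type and the parse_age function below are hypothetical.

```rust
// The error half of the result: a plain enum describing what went wrong.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    NotANumber,
}

// Returns either the parsed value or an error; nothing is thrown.
fn parse_age(text: &str) -> Result<u32, ParseError> {
    if text.is_empty() {
        return Err(ParseError::Empty);
    }
    // `map_err` converts the standard library's error into our own type.
    text.parse::<u32>().map_err(|_| ParseError::NotANumber)
}

fn main() {
    for input in ["42", "", "abc"].iter() {
        match parse_age(input) {
            Ok(age) => println!("age {}", age),
            Err(error) => println!("error {:?}", error),
        }
    }
}
```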
There is also a condition system, which allows a failure to be handled at the site of the failure, or otherwise just resorts to fail if it cannot be handled.
In addition to the prevention of null or dangling pointers and data races, the Rust compiler also enforces that data be initialized before being read. Variables can still be declared and then initialized later, but there can never be a code path where they could be read first. All possible code paths in a function or expression must also return a correctly typed value, or fail.
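A small sketch of a variable that is declared first and initialized later. The function name classify is invented; the compiler accepts it only because every path assigns label before it is read.

```rust
fn classify(n: i32) -> &'static str {
    let label; // declared, but not yet initialized
    if n < 0 {
        label = "negative";
    } else if n == 0 {
        label = "zero";
    } else {
        label = "positive";
    }
    // Removing any of the assignments above turns this read into a compile error.
    label
}

fn main() {
    println!("{} {} {}", classify(-3), classify(0), classify(8));
}
```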
This also means indexing a string/vector will do bounds checking like vector<T>::at() (though there are unsafe functions in vec::raw that allow bypassing the bounds checks).
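A sketch of checked and unchecked access in current Rust, where the unchecked operation is the slice method get_unchecked rather than a function in vec::raw. The helper names third and third_unchecked are hypothetical.

```rust
// Checked access that reports an out-of-range index as None instead of failing.
fn third(values: &[i32]) -> Option<i32> {
    values.get(2).copied()
}

// Unchecked access: the caller must guarantee the index is in range.
fn third_unchecked(values: &[i32]) -> i32 {
    assert!(values.len() > 2);
    // Sound only because of the length check above.
    unsafe { *values.get_unchecked(2) }
}

fn main() {
    let values = vec![10, 20, 30];
    println!("{}", values[2]); // plain indexing is bounds checked and fails if out of range
    println!("{:?}", third(&values[..1]));
    println!("{}", third_unchecked(&values));
}
```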
Thanks to memory/pointer safety and more rigorous immutability, Rust can guarantee that multiple writable pointers don't access the same object. This gives the compiler more opportunities for caching values in registers, reducing loads and stores, and low-level scheduling optimizations. It also avoids the need for 'restrict', which is used as an extension in some C/C++ compilers for performance-sensitive code.
C++ | Rust |
---|---|
size_t foo() const | fn foo(&self) -> uint |
size_t foo() | fn foo(&mut self) -> uint |
static size_t foo() | fn foo() -> uint |
n/a | fn foo(self), fn foo(~self), fn foo(@self) |
Methods in Rust are similar to those in C++, but Rust uses explicit self, so referring to a member/method from within a method is done with self.member.
It's also possible to take self by-value, which allows for moving out of self since the object will be consumed.
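The receiver forms can be sketched as follows in current Rust syntax, where uint is written usize. The Stack type and its method names are made up for this example.

```rust
struct Stack {
    items: Vec<i32>,
}

impl Stack {
    // No self: the counterpart of a C++ static member function.
    fn new() -> Stack {
        Stack { items: Vec::new() }
    }

    // &self: read-only access, like a const member function.
    fn len(&self) -> usize {
        self.items.len()
    }

    // &mut self: may modify the object, like a non-const member function.
    fn push(&mut self, value: i32) {
        self.items.push(value);
    }

    // self by value: consumes the stack, so its contents can be moved out.
    fn into_items(self) -> Vec<i32> {
        self.items
    }
}

fn main() {
    let mut stack = Stack::new();
    stack.push(1);
    stack.push(2);
    println!("{}", stack.len());
    let items = stack.into_items();
    // `stack` was consumed above and can no longer be used.
    println!("{:?}", items);
}
```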
For an example, the PriorityQueue struct in the standard library has a to_sorted_vec method that takes self by-value, making PriorityQueue::from_vec(xs).to_sorted_vec() an in-place heap sort.
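In current Rust the equivalent type is std::collections::BinaryHeap and the by-value method is into_sorted_vec, so the same idea might be sketched like this. The wrapper name heap_sort is hypothetical.

```rust
use std::collections::BinaryHeap;

// Consumes the vector, heapifies it, and consumes the heap to get a sorted vector.
fn heap_sort(values: Vec<i32>) -> Vec<i32> {
    BinaryHeap::from(values).into_sorted_vec()
}

fn main() {
    println!("{:?}", heap_sort(vec![3, 1, 2]));
}
```

Because both steps take their input by value, the caller's vector is consumed and no second handle to the unsorted data remains.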
The ~self and @self forms are available for objects allocated as ~ or @ respectively, but you usually want &self.
Rust generics are more like the proposed concepts extension for C++, not like the templates that exist today. This allows templates to be type-checked at the time you write them, not when you instantiate them.
In Rust, functions and structs can be given generic type annotations (like templated functions and structs), but to actually call a type's methods and use operators, the type annotations must be explicitly bounded by the traits required. Annotating the type bounds for functions is only necessary at the module-level, and not for nested closures where it is inferred.
Despite the different model in the type system, traits are compiled to code similar to what templates would produce.
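A sketch of bounded generics in current Rust syntax. The names largest and Pair are invented; PartialOrd and Copy are the standard traits that permit the comparison and the by-value copies used in the bodies.

```rust
// The bounds state up front which operations the body may use on T.
// Panics if `values` is empty.
fn largest<T: PartialOrd + Copy>(values: &[T]) -> T {
    let mut best = values[0];
    for &v in values {
        if v > best {
            best = v;
        }
    }
    best
}

// A generic struct; its methods are bounded in the same way.
struct Pair<T> {
    a: T,
    b: T,
}

impl<T: PartialOrd + Copy> Pair<T> {
    fn max(&self) -> T {
        if self.a > self.b {
            self.a
        } else {
            self.b
        }
    }
}

fn main() {
    println!("{}", largest(&[1, 5, 2]));
    println!("{}", largest(&[0.5, 0.25]));
    println!("{}", Pair { a: 'x', b: 'y' }.max());
}
```

Dropping the PartialOrd bound makes the definition itself fail to compile, independently of any call site.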
Rust can only select function overloads based on the receiver (and sometimes through return values due to HM (Hindley-Milner) type inference). To overload functions on multiple parameters, one must create an extra layer of traits for something that looks like double-dispatch, with each parameter being moved to the receiver by an intermediate function. Although this is extra boilerplate, the benefit is that it's also ready for runtime polymorphism through trait-objects if needed. Without overloading on multiple parameters, the information you pay in writing more specific function calls is sometimes recovered by HM type inference feeding back through its arguments.
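A simplified sketch of the two selection mechanisms: dispatch on the receiver type via a trait, and selection through the expected return type. It does not show the full double-dispatch layering; Describe and parse_as are hypothetical names.

```rust
use std::str::FromStr;

// One method name, with the implementation chosen by the receiver's type.
trait Describe {
    fn describe(&self) -> String;
}

impl Describe for i32 {
    fn describe(&self) -> String {
        format!("integer {}", self)
    }
}

impl Describe for f64 {
    fn describe(&self) -> String {
        format!("float {}", self)
    }
}

impl Describe for str {
    fn describe(&self) -> String {
        format!("text '{}'", self)
    }
}

// The parser that runs is selected by the return type the caller expects.
fn parse_as<T: FromStr>(text: &str) -> Option<T> {
    text.parse().ok()
}

fn main() {
    println!("{}", 7_i32.describe());
    println!("{}", 2.5_f64.describe());
    println!("{}", "hi".describe());

    let as_int: Option<i32> = parse_as("42");
    let as_float: Option<f64> = parse_as("42");
    println!("{:?} {:?}", as_int, as_float);
}
```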
You can also consider enums for conveniently passing varied options to a function - or use Rust's powerful macros for wrapping extra convenience around function calls (e.g. it is possible to annotate arguments with keywords and special syntax, handle variadics, etc.).
Instead of adding virtual functions to classes, Rust uses trait objects. A trait object is a fat pointer to data and an associated vtable, created by "[~|&]<expr> as [~|&]<traitname>". Any number of traits can be implemented for an existing type without needing to establish an inheritance hierarchy, and there is no 'diamond inheritance problem'. It is possible to attach a new vtable to an existing struct instance (with a borrowed pointer) - so one library can take data from another library, without the other having to have even known the interfaces required.
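A sketch of trait objects in current Rust syntax, where the object types are written &dyn Trait and Box<dyn Trait>. The trait Named and the type Dog are hypothetical; i32 stands in for a type defined by another library.

```rust
trait Named {
    fn name(&self) -> String;
}

struct Dog;

impl Named for Dog {
    fn name(&self) -> String {
        "dog".to_string()
    }
}

// A trait can be implemented for a type that was defined elsewhere.
impl Named for i32 {
    fn name(&self) -> String {
        format!("number {}", self)
    }
}

// Takes a borrowed trait object: a pointer to the data plus a vtable.
fn greet(item: &dyn Named) -> String {
    format!("hello, {}", item.name())
}

// Owned trait objects let one container hold values of different types.
fn names(items: &[Box<dyn Named>]) -> Vec<String> {
    items.iter().map(|item| item.name()).collect()
}

fn main() {
    // An explicit cast attaches the vtable to an existing value.
    let borrowed = &7_i32 as &dyn Named;
    println!("{}", greet(borrowed));

    let items: Vec<Box<dyn Named>> = vec![Box::new(Dog), Box::new(7_i32)];
    println!("{:?}", names(&items));
}
```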
Rust includes macros (not textual ones) and syntax extensions which can be used for many of the other use cases of templates and constexpr.
These macros handle repeated items in patterns, can match properly parsed expressions and blocks, and can separate arguments with custom syntax elements. As such they can achieve what the 'x-macros' pattern in C does in a single macro definition/invocation.
Syntax extensions can also be used in place of user-defined literals.
Being handled within the AST, they do not have many of the hazards of C macros and are considered a useful tool, rather than an obsolete anti-feature as in C++.
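A sketch of the x-macros idea with macro_rules!, in current Rust syntax: one list of variants generates an enum, a name lookup, and a list of all values. The macro named_enum and the enum Color are made up for illustration.

```rust
// One invocation expands the same variant list in three places.
macro_rules! named_enum {
    ($name:ident { $($variant:ident),+ }) => {
        #[derive(Debug, Clone, Copy, PartialEq)]
        enum $name {
            $($variant),+
        }

        impl $name {
            // Maps each variant to its name as a string.
            fn name(self) -> &'static str {
                match self {
                    $($name::$variant => stringify!($variant)),+
                }
            }

            // Lists every variant in declaration order.
            fn all() -> Vec<$name> {
                vec![$($name::$variant),+]
            }
        }
    };
}

named_enum!(Color { Red, Green, Blue });

fn main() {
    for color in Color::all() {
        println!("{:?} is named {}", color, color.name());
    }
}
```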
Rust's modules don't work like those of other languages. Modules are namespaces, not 'imported' into individual sources.
mod <module_name>; is used from the crate root file to reference other source files by mounting their contents in the module tree.
'use' directives are a little like a cross between #include and 'using namespace'. Within a crate you can always directly access other module contents with the full path (relative to the root, ::foo::bar), but use is common to create shortcuts and aliases. It can also bring individual symbols into scope.
'mod <name> { ..definitions..}' statements can create nested namespaces containing items, replacing the use of nested classes in C++.
'use' statements are relative to the crate root, not the current module.
'mod.rs' files are a special workaround to allow Rust's module tree to mimic a directory structure.
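A single-file sketch of nested modules and use, written in current Rust syntax, where paths from the crate root are spelled with a leading crate:: (an assumption about the reader's toolchain, not something stated on this page). The module and function names are hypothetical.

```rust
// Inline modules create nested namespaces; `pub` makes items visible outside them.
mod shapes {
    pub mod circle {
        pub fn area(radius: f64) -> f64 {
            std::f64::consts::PI * radius * radius
        }
    }

    pub fn unit_square_area() -> f64 {
        1.0
    }
}

// A shortcut for a whole module, and one for a single symbol.
use crate::shapes::circle;
use crate::shapes::unit_square_area;

fn main() {
    println!("{}", circle::area(1.0));
    // The full path from the crate root works without any `use`.
    println!("{}", crate::shapes::circle::area(2.0));
    println!("{}", unit_square_area());
}
```

In a multi-file crate, the body of shapes would live in its own source file and the root would contain only mod shapes; to mount it.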