Commit 99e5020

Merge pull request #155 from hexcatnl/safe-linking
[2025H1] Research: How to achieve safety when linking separately compiled code
2 parents eab6266 + 3d07ba6 commit 99e5020

File tree

1 file changed

+123
-0
lines changed

src/2025h1/safe-linking.md

+123
@@ -0,0 +1,123 @@
1+
# Research: How to achieve safety when linking separately compiled code
2+
3+
| Metadata | |
4+
| -------- | ------------------------------------------------------------ |
5+
| Owner(s) | [Mara Bos](https://github.com/m-ou-se) and [Jonathan Dönszelman](https://github.com/jdonszelmann) |
6+
| Teams | [lang], [compiler] |
7+
| Status | Proposed |
8+
9+
## Summary
10+
11+
Research what "safety" and "unsafety" mean when dealing with separately compiled code, such as when loading dynamically linked libraries.
12+
Specifically, figure out how it'd be possible to provide any kind of safety when linking external code (such as loading a dynamic library or linking a separately compiled static library).
13+
14+
## Motivation
15+
16+
Rust has a very clear definition of "safe" and "unsafe" and (usually) makes it easy to stay in the "safe" world.
17+
`unsafe` blocks usually only have to encapsulate very small pieces of code, whose soundness one can (and should) prove manually.
18+
19+
When using `#[no_mangle]` and/or `extern { … }` to connect separately compiled code, however, any concept of safety pretty much disappears.
20+
21+
While it might be reasonable to make some assumptions about (standardized) symbols like `strlen`,
22+
the _unsafe assumption_ that a symbol with the same name will refer to something of the expected signature
23+
is _not_ something that one can _prove at compile time_, but is rather a (hopeful, perhaps reasonable) expectation of
24+
the contents of dynamic libraries available at runtime.
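
As a minimal sketch of that unsafe assumption, the block below imports `strlen` by name and wraps it in a safe function. The wrapper names `c_string_len` and `demo_len` are hypothetical, and the example assumes a platform whose C library is linked by default. It uses the `unsafe extern` spelling accepted since Rust 1.82; older code writes a plain `extern "C" { … }` block.

```rust
use std::ffi::{c_char, CStr, CString};

// Import the C library's `strlen` by symbol name. The signature is written by
// hand: the compiler does not check it against whatever library ends up
// providing the symbol.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper (hypothetical name): `&CStr` guarantees a valid,
/// NUL-terminated string, which is what `strlen` expects.
pub fn c_string_len(s: &CStr) -> usize {
    // SAFETY: sound only if the `strlen` symbol found at link/load time really
    // has the signature declared above. Nothing in this file can verify that.
    unsafe { strlen(s.as_ptr()) }
}

/// Hypothetical usage helper.
pub fn demo_len() -> usize {
    let text = CString::new("hello").expect("no interior NUL bytes");
    c_string_len(&text)
}
```

The wrapper can rule out invalid arguments, but not a mismatched symbol: that part of the safety argument rests entirely on trust in the linked library.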
25+
26+
The end result is that for use cases like plugins, we have no option other than `unsafe`ly hoping for the best,
27+
accepting that we cannot perfectly guarantee that undefined behavior is impossible when linking/loading a library/a plugin/some external code.
28+
29+
### The status quo
30+
31+
Today, combining separately compiled code (from different Rust projects or different languages)
32+
is done through a combination of `extern "…" fn`, `#[repr(…)]`, `#[no_mangle]`, and `extern {…}`.
33+
34+
Specifically:
35+
36+
1. `extern "…" fn` (which has a bit of a confusing name) is used to specify the _calling convention_ or _ABI_ of a function.
37+
38+
The default one is the `"Rust"` ABI, which (purposely) has no stability guarantees.
39+
The `"C"` ABI is often used for its stability guarantees, but places restrictions on the possible signatures.
40+
41+
2. `#[repr(…)]` is used to control _memory layout_.
42+
43+
The default one is the `Rust` layout, which (purposely) has no stability guarantees.
44+
The `C` layout is often used for its stability guarantees, but places restrictions on the types.
45+
46+
3. `#[no_mangle]` and `extern {…}` are used to control the _symbols_ used for _linking_.
47+
48+
`#[no_mangle]` is used for _exporting_ an item under a known symbol,
49+
and `extern { … }` is used for _importing_ an item with a known symbol.
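
The three mechanisms above can be sketched together in one file. This is only an illustration: `Point`, `demo_point_sum`, and `call_through_symbol` are hypothetical names, and the export and import live in the same crate purely to keep the example self-contained (normally they would sit in separately compiled artifacts). The sketch uses the `#[unsafe(no_mangle)]` and `unsafe extern` spellings accepted since Rust 1.82; older code writes `#[no_mangle]` and `extern "C" { … }`.

```rust
// (2) Memory layout: `#[repr(C)]` fixes field order and padding to C's rules.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// (1) Calling convention: `extern "C" fn` selects the C ABI.
// (3) Export: the function is emitted under the unmangled symbol
//     `demo_point_sum`.
#[unsafe(no_mangle)]
pub extern "C" fn demo_point_sum(p: Point) -> i32 {
    p.x + p.y
}

mod import {
    use super::Point;

    // (3) Import: declare a symbol by name. The linker matches on the name
    // only; the signature written here is not checked against the definition.
    unsafe extern "C" {
        pub fn demo_point_sum(p: Point) -> i32;
    }
}

/// Calls the function through the imported symbol rather than directly.
pub fn call_through_symbol(p: Point) -> i32 {
    // SAFETY: the declaration in `import` matches the definition above. We
    // know that only because both happen to be visible in this file.
    unsafe { import::demo_point_sum(p) }
}
```

Calling `call_through_symbol(Point { x: 2, y: 3 })` goes through the symbol and yields the same result as a direct call, but only because the two signatures were kept in sync by hand.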
50+
51+
There have often been requests for a "stable Rust ABI", which usually refers to a _calling convention_ and _memory layout_ that is
52+
as unrestrictive as `extern "Rust" fn` and `#[repr(Rust)]`, but as stable as `extern "C" fn` and `#[repr(C)]`.
53+
54+
It seems unlikely that `extern "Rust" fn` and `#[repr(Rust)]` would ever come with stability guarantees, as allowing for changes when stability is not necessary has its benefits.
55+
It seems most likely that a "stable Rust ABI" will arrive in the form of a _new_ ABI,
56+
by adding some kind of `extern "Rust-stable-v1"` (and `repr`) or similar
57+
(such as `extern "crabi" fn` and `#[repr(crabi)]` [proposed here](https://github.com/rust-lang/rust/pull/105586)),
58+
or by slowly extending `extern "C" fn` and `#[repr(C)]` to support more types (like tuples and slices, etc.).
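
To illustrate the kind of restriction meant here: a slice such as `&[u8]` has no C equivalent, so today it is typically passed across an `extern "C" fn` boundary as a hand-written pointer/length pair. The following is a minimal sketch of that workaround; `ByteSlice`, `sum_bytes`, and `sum` are hypothetical names.

```rust
use std::slice;

/// C-compatible stand-in for `&[u8]`: a pointer plus a length, with the
/// lifetime and validity guarantees of a real slice stripped away.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ByteSlice {
    pub ptr: *const u8,
    pub len: usize,
}

/// Sums the bytes described by `bytes`.
///
/// # Safety
/// `bytes.ptr` must be non-null, aligned, and valid for reading `bytes.len`
/// bytes for the duration of the call.
pub unsafe extern "C" fn sum_bytes(bytes: ByteSlice) -> u64 {
    // Rebuild the slice from its raw parts on the callee side.
    let data = unsafe { slice::from_raw_parts(bytes.ptr, bytes.len) };
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Safe caller-side wrapper that takes the slice apart.
pub fn sum(data: &[u8]) -> u64 {
    let raw = ByteSlice { ptr: data.as_ptr(), len: data.len() };
    // SAFETY: `raw` was just derived from a live `&[u8]`.
    unsafe { sum_bytes(raw) }
}
```

A stable ABI that understood slices directly would remove the need for `ByteSlice`, but the caller would still have to trust that the symbol it links against really takes a slice.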
59+
60+
Such developments would lift restrictions on which types one can use in FFI, but just a stable calling convention and memory layout will do almost nothing for safety,
61+
as linking/loading a symbol (possibly at runtime) with a different signature (or ABI) than expected will still immediately lead to undefined behavior.
62+
63+
### Research question and scope
64+
65+
This research project focusses entirely on point 3 above: symbols and linking.
66+
67+
The main research question is:
68+
69+
_**What is necessary for an alternative for `#[no_mangle]` and `extern { … }` to be safe, with a reasonable and usable definition of "safe"?**_
70+
71+
We believe this question can be answered independently of the specifics of a stable calling convention (point 1) and memory layout (point 2).
72+
73+
[RFC3435 "#[export]" for dynamically linked crates](https://github.com/rust-lang/rfcs/pull/3435) proposes one possible way to provide safety in dynamic linking.
74+
The goal of the research is to explore the entire solution space and understand the requirements and limitations that would apply to any possible solution/alternative.
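
To make "a usable definition of safe" a little more concrete, here is a purely hypothetical sketch of one point in that solution space: every symbol carries a signature fingerprint that is compared before the address is turned into a typed function pointer. It is not the mechanism of RFC 3435 and not a proposed design. The `SymbolTable` type stands in for a loaded library, and the fingerprints are plain strings chosen for readability.

```rust
use std::collections::HashMap;
use std::mem;

type AddFn = extern "C" fn(i32, i32) -> i32;
type NegFn = extern "C" fn(i32) -> i32;

// Hypothetical textual fingerprints of the two signatures.
const ADD_SIG: &str = "extern \"C\" fn(i32, i32) -> i32";
const NEG_SIG: &str = "extern \"C\" fn(i32) -> i32";

extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

extern "C" fn negate(x: i32) -> i32 {
    -x
}

/// What a loader would know about one symbol: an untyped address plus the
/// fingerprint recorded by whoever exported it.
struct Entry {
    signature: &'static str,
    address: *const (),
}

/// Hypothetical stand-in for the symbol table of a loaded library.
pub struct SymbolTable {
    entries: HashMap<&'static str, Entry>,
}

impl SymbolTable {
    pub fn demo() -> Self {
        let mut entries = HashMap::new();
        entries.insert("add", Entry { signature: ADD_SIG, address: add as AddFn as *const () });
        entries.insert("negate", Entry { signature: NEG_SIG, address: negate as NegFn as *const () });
        SymbolTable { entries }
    }

    /// Returns the symbol as an `AddFn` only if its fingerprint matches.
    pub fn get_add(&self, name: &str) -> Option<AddFn> {
        let entry = self.entries.get(name)?;
        if entry.signature != ADD_SIG {
            return None; // same lookup, wrong signature: refuse instead of UB
        }
        // SAFETY: relies on the recorded fingerprint being truthful.
        Some(unsafe { mem::transmute::<*const (), AddFn>(entry.address) })
    }
}
```

With `SymbolTable::demo()`, `get_add("add")` returns a callable function, while `get_add("negate")` and `get_add("missing")` return `None`. The check is only as trustworthy as the fingerprint itself, which is exactly the sort of requirement the research question asks about.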
75+
76+
### The next 6 months
77+
78+
- Assemble a small research team (e.g. an MSc student, a professor, and a researcher/mentor).
79+
- Acquire funding.
80+
- Run this as an academic research project.
81+
- Publish intermediate results as a blog post.
82+
- (After ~9 months) Publish a thesis and possibly a paper that answers the research question.
83+
84+
### The "shiny future" we are working towards
85+
86+
The future we're working towards is one where (dynamically) linking separately compiled code (e.g. plugins, libraries, etc.)
87+
will feel like a first class Rust feature that is both safe and ergonomic.
88+
89+
Depending on the outcomes of the research, this can provide input and design requirements for future (stable) ABIs, and potentially pave the way for
90+
safe cross-language linking.
91+
92+
## Design axioms
93+
94+
- Any design is either fully safe, or makes it possible to encapsulate the unsafety in a way that allows one to prove soundness (to a reasonable extent).
95+
- Any design allows for combining code compiled with different versions of the Rust compiler.
96+
- Any design is usable for statically linking separately (pre) compiled static libraries, dynamically linking/loading libraries, and dynamically loading plugins.
97+
- Designs require as few assumptions as possible about the calling convention and memory layout.
98+
Ideally, the only requirement is that they are stable, which means that the design can be used with the existing `extern "C" fn` and `#[repr(C)]`.
99+
100+
## Ownership and team asks
101+
102+
**Owner:** [Mara Bos](https://github.com/m-ou-se) and/or [Jonathan Dönszelman](https://github.com/jdonszelmann).
103+
104+
| Subgoal | Owner(s) or team(s) | Notes |
105+
| ---------------------------------------------- | ----------------------- | ----- |
106+
| Coordination with university | Jonathan | Delft University of Technology |
107+
| Acquire funding | Hexcat (=Mara+Jonathan) | |
108+
| Research | Research team (MSc student, professor, etc.) | |
109+
| Mentoring and interfacing with Rust project | Mara, Jonathan | |
110+
| Input and discussion on concept of safety | ![Team][] [lang] | |
111+
| Blog post (author, review) | MSc student, Jonathan, Mara | |
112+
| Feedback on conclusions | ![Team][] [lang] | also the Rust community (users) |
113+
| Experimental implementation | MSc student | |
114+
| ↳ Mentoring | Jonathan, Mara | |
115+
| ↳ Review/accept lang experiment | ![Team][] [lang] | |
116+
| ↳ Reviews/feedback | ![Team][] [compiler] | |
117+
| Thesis / Paper | Research team (MSc student, professor, etc.) | |
118+
119+
## Frequently asked questions
120+
121+
### Is there a university and professor interested in this?
122+
123+
Yes! We've discussed this with a professor at the Delft University of Technology, who is excited and already looking for interested students.
