Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework the README.md for rustc and add other readmes #44505

Merged
merged 8 commits into from
Sep 20, 2017

Conversation

nikomatsakis
Copy link
Contributor

@nikomatsakis nikomatsakis commented Sep 11, 2017

OK, so, long ago I committed to the idea of trying to write some high-level documentation for rustc. This has proved to be much harder for me to get done than I thought it would! This PR is far from as complete as I had hoped, but I wanted to open it so that people can give me feedback on the conventions that it establishes. If this seems like a good way forward, we can land it and I will open an issue with a good check-list of things to write (and try to take down some of them myself).

Here are the conventions I established on which I would like feedback.

Use README.md files. First off, I'm aiming to keep most of the high-level docs in README.md files, rather than entries on forge. My thought is that such files are (a) more discoverable than forge and (b) closer to the code, and hence can be edited in a single PR. However, since they are not in the code, they will naturally get out of date, so the intention is to focus on the highest-level details, which are least likely to bitrot. I've included a few examples of common functions and so forth, but never tried to (e.g.) exhaustively list the names of functions and so forth.
- I would like to use the tidy scripts to try and check that these do not go out of date. Future work.

librustc/README.md as the main entrypoint. This seems like the most natural place people will look first. It lays out how the crates are structured and is intended to give pointers to the main data structures of the compiler (I didn't update that yet; the existing material is terribly dated).

A glossary listing abbreviations and things. It's much harder to read code if you don't know what some obscure set of letters like infcx stands for.

Major modules each have their own README.md that documents the high-level idea. For example, I wrote some stuff about hir and ty. Both of them have many missing topics, but I think that is roughly the level of depth that would be good. The idea is to give people a "feeling" for what the code does.

What is missing primarily here is lots of content. =) Here are some things I'd like to see:

  • A description of what a QUERY is and how to define one
    • Some comments for librustc/ty/maps.rs
  • An overview of how compilation proceeds now (i.e., the hybrid demand-driven and forward model) and how we would like to see it going in the future (all demand-driven)
  • Some coverage of how incremental will work under red-green
  • An updated list of the major IRs in use of the compiler (AST, HIR, TypeckTables, MIR) and major bits of interesting code (typeck, borrowck, etc)
  • More advice on how to use x.py, or at least pointers to that
  • Good choice for config.toml
  • How to use RUST_LOG and other debugging flags (e.g., -Zverbose, -Ztreat-err-as-bug)
  • Helpful conventions for debug! statement formatting

cc @rust-lang/compiler @mgattozzi

@rust-highfive
Copy link
Collaborator

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

Copy link
Member

@steveklabnik steveklabnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful! Always glad to see more docs, and these are great.

I have some small nits on formatting, take 'em or leave 'em.





Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may want to cut this down to one newline

///
/// Example:
///
/// ```rust
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conventionally the explicit rust here is only used in markdown files, not source files

/// (which is an expression), but also the argument patterns, since
/// those are something that the caller doesn't really care about.
///
/// Example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conventionally, this would be

/// # Examples


As you can see, the `TyCtxt` type takes three lifetime parameters.
These lifetimes are perhaps the most complex thing to understand about
the tcx. During rust compilation, we allocate most of our memory in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "Rust"?

allocated in the global arena). However, the lifetime `'tcx` is always
a safe approximation, so that is what you get back.

NB. Because types are interned, it is possible to compare them for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may or may not want to consider using a > for this note





Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra newlines here too

@eddyb
Copy link
Member

eddyb commented Sep 11, 2017

r? @steveklabnik

@rust-highfive rust-highfive assigned steveklabnik and unassigned eddyb Sep 11, 2017
to a `&hir::Item` (e.g. for the mod `foo`), you do not immediately
gain access to the contents of the function `bar()`. Instead, you only
gain access to the **id** for `bar()`, and you must some function to
lookup the contents of `bar()` given its id; this gives us a change to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change -> chance?

of a function/closure or the definition of a constant. Bodies are
associated with an **owner**, which is typically some kind of item
(e.g., a `fn()` or `const`), but could also be a closure expression
(e.g., `|x, y| x + y`). You can use the HIR map to find find the body
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: find find

@mgattozzi
Copy link
Contributor

@nikomatsakis I'm glad you finally got around to this! I'll take a look at this later when I get the chance!

the code from the other crates within rustc. This crate itself does
not contain any of the "main logic" of the compiler (though it does
have some code related to pretty printing or other minor compiler
options).
Copy link
Member

@qmx qmx Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition of driver as being the "main" function for the compiler is really nice. What about giving more emphasis to it on the top-level README file?

better than others).

The compiler process
====================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to comment below this part of the diff, but between the analysis oasses and translation to LLVM there's also HAIR and MIR lowering. Would be nice to mention those as well to see how they fit into the big picture. (I believe HAIR is where the typeck tables are terminated (as in flushed out), for example?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this part is basically all dated. I just pulled it unedited from before.

integration with incremental compilation. This way, if you gain access
to a `&hir::Item` (e.g. for the mod `foo`), you do not immediately
gain access to the contents of the function `bar()`. Instead, you only
gain access to the **id** for `bar()`, and you must some function to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and you must X some function

@@ -413,6 +413,9 @@ pub struct WhereEqPredicate {

pub type CrateConfig = HirVec<P<MetaItem>>;

/// The top-level data structure that stores the entire contents of
/// the crate currently being compiled.
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this newline serve a purpose?

Copy link
Contributor

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely helpful. Left a few comments but none of them major.

pub type Ty<'tcx> = &'tcx TyS<'tcx>;
```

The `TyS` struct defines the actual details of how a type is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've wondered what the S in TyS is for. And similarly the s in sty. Maybe this is one place to mention it.

/// generates so that so that it can be reused and doesn't have to be redone
/// later on.
/// The central data structure of the compiler. Keeps track of all the
/// information that typechecker generates so that so that it can be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "the typechecker" or "typechecking"?

/// later on.
/// The central data structure of the compiler. Keeps track of all the
/// information that typechecker generates so that so that it can be
/// reused and doesn't have to be redone later on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: nothing after "reused" seems to add anything.

@aidanhs aidanhs added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Sep 13, 2017
@qmx qmx mentioned this pull request Sep 14, 2017
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Sep 15, 2017
bring TyCtxt into scope

got comments both from @eddyb and @nikomatsakis (via rust-lang#44505) that we should always put `TyCtxt` in scope

should I just go and import it at other places in the codebase or we just keep doing small improvements?
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Sep 16, 2017
bring TyCtxt into scope

got comments both from @eddyb and @nikomatsakis (via rust-lang#44505) that we should always put `TyCtxt` in scope

should I just go and import it at other places in the codebase or we just keep doing small improvements?
frewsxcv added a commit to frewsxcv/rust that referenced this pull request Sep 16, 2017
bring TyCtxt into scope

got comments both from @eddyb and @nikomatsakis (via rust-lang#44505) that we should always put `TyCtxt` in scope

should I just go and import it at other places in the codebase or we just keep doing small improvements?
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Sep 17, 2017
bring TyCtxt into scope

got comments both from @eddyb and @nikomatsakis (via rust-lang#44505) that we should always put `TyCtxt` in scope

should I just go and import it at other places in the codebase or we just keep doing small improvements?
@nikomatsakis nikomatsakis changed the title [WIP] rework the README.md for rustc and add other readmes rework the README.md for rustc and add other readmes Sep 18, 2017
@nikomatsakis
Copy link
Contributor Author

@eddyb take a look at the last two commits -- I made an attempt to document the query system (I also broke it up into multiple files, to try and hide some of the "plumbing"). I did not rename anything, because I didn't want to run around and update paths for now (though I think it would maybe be best to move this code into queries instead of ty::maps::queries).

@alexcrichton
Copy link
Member

@bors: r+

@bors
Copy link
Contributor

bors commented Sep 18, 2017

📌 Commit a8a28ce has been approved by alexcrichton

@alexcrichton
Copy link
Member

@bors: p=2

@bors
Copy link
Contributor

bors commented Sep 18, 2017

🔒 Merge conflict

@bors
Copy link
Contributor

bors commented Sep 18, 2017

☔ The latest upstream changes (presumably #44678) made this pull request unmergeable. Please resolve the merge conflicts.

- LLVM then runs its various optimizations, which produces a number of `.o` files
(one for each "codegen unit").
6. **Linking**
- Finally, those `.o` files are linke together.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*linked

@@ -226,9 +180,13 @@ pointers for understanding them better.
found in `src/librustc_mir`.
- obligation -- something that must be proven by the trait system; see `librustc/traits`.
- local crate -- the crate currently being compiled.
- node-id or `NodeId` -- an index identifying a particular node in the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better: gradually being replaced with HirId.

- cx -- we tend to use "cx" as an abbrevation for context. See also tcx, infcx, etc.
- `DefId` -- an index identifying a **definition** (see `librustc/hir/def_id.rs`).
Copy link
Contributor

@arielb1 arielb1 Sep 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an index identifying a definition (see librustc/hir/def_id.rs). Uniquely identifies a DefPath.

- HIR -- the **High-level IR**, created by lowering and desugaring the AST. See `librustc/hir`.
- `HirId` -- identifies a particular node in the HIR by combining a
def-id with an "intra-definition offset".
- `'gcx` -- the lifetime of the global arena (see `librustc/ty`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICE - internal compiler error. When the compiler crashes.

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Sep 19, 2017

@bors r=steveklabnik p=22

(Besides the fact that it'd be really nice to have these docs be easier to find, this PR is actually kinda' conflict prone due to breaking up the maps module.)

@bors
Copy link
Contributor

bors commented Sep 19, 2017

📌 Commit 638958b has been approved by steveklabnik

@bors
Copy link
Contributor

bors commented Sep 19, 2017

💡 This pull request was already approved, no need to approve it again.

@bors
Copy link
Contributor

bors commented Sep 19, 2017

📌 Commit 638958b has been approved by steveklabnik

@mgattozzi
Copy link
Contributor

@nikomatsakis I think everyone already covered my thoughts. I think this is a good first stab at it and addresses things I've brought up to you during Meetups so I'm hoping we can continue to do things like this and make compiler hacking more accessible! :D

@arielb1 arielb1 added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 19, 2017
@bors
Copy link
Contributor

bors commented Sep 19, 2017

⌛ Testing commit 638958b with merge f60bc3a...

bors added a commit that referenced this pull request Sep 19, 2017
rework the README.md for rustc and add other readmes

OK, so, long ago I committed to the idea of trying to write some high-level documentation for rustc. This has proved to be much harder for me to get done than I thought it would! This PR is far from as complete as I had hoped, but I wanted to open it so that people can give me feedback on the conventions that it establishes. If this seems like a good way forward, we can land it and I will open an issue with a good check-list of things to write (and try to take down some of them myself).

Here are the conventions I established on which I would like feedback.

**Use README.md files**. First off, I'm aiming to keep most of the high-level docs in `README.md` files, rather than entries on forge. My thought is that such files are (a) more discoverable than forge and (b) closer to the code, and hence can be edited in a single PR. However, since they are not *in the code*, they will naturally get out of date, so the intention is to focus on the highest-level details, which are least likely to bitrot. I've included a few examples of common functions and so forth, but never tried to (e.g.) exhaustively list the names of functions and so forth.
    - I would like to use the tidy scripts to try and check that these do not go out of date. Future work.

**librustc/README.md as the main entrypoint.** This seems like the most natural place people will look first. It lays out how the crates are structured and **is intended** to give pointers to the main data structures of the compiler (I didn't update that yet; the existing material is terribly dated).

**A glossary listing abbreviations and things.** It's much harder to read code if you don't know what some obscure set of letters like `infcx` stands for.

**Major modules each have their own README.md that documents the high-level idea.** For example, I wrote some stuff about `hir` and `ty`. Both of them have many missing topics, but I think that is roughly the level of depth that would be good. The idea is to give people a "feeling" for what the code does.

What is missing primarily here is lots of content. =) Here are some things I'd like to see:

- A description of what a QUERY is and how to define one
    - Some comments for `librustc/ty/maps.rs`
- An overview of how compilation proceeds now (i.e., the hybrid demand-driven and forward model) and how we would like to see it going in the future (all demand-driven)
- Some coverage of how incremental will work under red-green
- An updated list of the major IRs in use of the compiler (AST, HIR, TypeckTables, MIR) and major bits of interesting code (typeck, borrowck, etc)
- More advice on how to use `x.py`, or at least pointers to that
- Good choice for `config.toml`
- How to use `RUST_LOG` and other debugging flags (e.g., `-Zverbose`, `-Ztreat-err-as-bug`)
- Helpful conventions for `debug!` statement formatting

cc @rust-lang/compiler @mgattozzi
@bors
Copy link
Contributor

bors commented Sep 20, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: steveklabnik
Pushing f60bc3a to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.