Guide: complex data types #15422

Merged
merged 1 commit into from
Jul 9, 2014
276 changes: 273 additions & 3 deletions src/doc/guide.md
@@ -948,11 +948,281 @@ comments

## Compound Data Types

Rust, like many programming languages, has a number of different data types
that are built-in. You've already done some simple work with integers and
strings, but next, let's talk about some more complicated ways of storing data.

### Tuples

The first compound data type we're going to talk about is called a **tuple**.
A tuple is an ordered list of a fixed size. Like this:

```rust
let x = (1i, "hello");
```

The parentheses and commas form this two-element tuple. Here's the same code, but
with the type annotated:

```rust
let x: (int, &str) = (1, "hello");
```

As you can see, the type of a tuple looks just like the tuple, but with each
position having a type name rather than the value. Careful readers will also
note that tuples are heterogeneous: we have an `int` and a `&str` in this tuple.
You haven't seen `&str` as a type before, and we'll discuss the details of
strings later. In systems programming languages, strings are a bit more complex
than in other languages. For now, just read `&str` as "a string slice," and
we'll learn more soon.

You can access the fields in a tuple through a **destructuring let**. Here's
an example:

```rust
let (x, y, z) = (1i, 2i, 3i);

println!("x is {}", x);
```

Review comment (Member): This could be one `println!`?

Review comment (Member Author): That's true. *shrug* Seems fine either way, I'll change it.

Remember before when I said the left hand side of a `let` statement was more
powerful than just assigning a binding? Here we are. We can put a pattern on
the left hand side of the `let`, and if it matches up to the right hand side,
we can assign multiple bindings at once. In this case, `let` 'destructures,'
or 'breaks up,' the tuple, and assigns the bits to three bindings.

This pattern is very powerful, and we'll see it repeated more later.
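
As a minimal sketch of the same idea in current Rust syntax (an assumption: `i32` replaces the `int` type and the `1i` literal suffix used in this guide), the hypothetical helper `sum_parts` below destructures a three-element tuple with a single `let`:

```rust
// Hypothetical helper, not part of the guide: it destructures a
// three-element tuple and adds the parts together.
fn sum_parts(triple: (i32, i32, i32)) -> i32 {
    // The pattern on the left has the same shape as the tuple on the right,
    // so one `let` creates three bindings at once.
    let (x, y, z) = triple;
    x + y + z
}

fn main() {
    let total = sum_parts((1, 2, 3));
    println!("total is {}", total);
}
```

The pattern has to have the same number of positions as the tuple, otherwise the code does not compile.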

The last thing to say about tuples is that they are only equivalent if
the arity, types, and values are all identical.

Review comment (Member): If the arity and types don't match, you can't use `==` at all.

Review comment (Member Author): That's true. What do you think the right way to explain this is?

Review comment (Contributor): I'd never heard of arity before looking at Rust. Maybe it's common but... At least the definition is pretty straightforward.

```rust
let x = (1i, 2i, 3i);
let y = (2i, 3i, 4i);

if x == y {
    println!("yes");
} else {
    println!("no");
}
```

This will print `no`, as the values aren't equal.
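
Here is a small sketch of tuple comparison in current Rust syntax (an assumption: `i32` instead of `int`). The function name `tuples_equal` is made up for illustration. Both arguments must have the same tuple type before `==` can be used at all, so only the values can differ:

```rust
// Hypothetical helper: both parameters have the same tuple type,
// so `==` compares them element by element.
fn tuples_equal(a: (i32, i32, i32), b: (i32, i32, i32)) -> bool {
    a == b
}

fn main() {
    println!("{}", tuples_equal((1, 2, 3), (2, 3, 4)));
    println!("{}", tuples_equal((1, 2, 3), (1, 2, 3)));

    // Tuples with a different number of elements are different types,
    // so the next line would be rejected by the compiler:
    // let bad = (1, 2) == (1, 2, 3);
}
```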

One other use of tuples is to return multiple values from a function:

```rust
fn next_two(x: int) -> (int, int) { (x + 1i, x + 2i) }

fn main() {
    let (x, y) = next_two(5i);
    println!("x, y = {}, {}", x, y);
}
```

Even though Rust functions can only return one value, a tuple _is_ one value
that happens to be made up of two. You can also see in this example how you
can destructure a tuple returned by a function.

Tuples are a very simple data structure, and so are not often what you want.
Let's move on to their bigger sibling, structs.

### Structs

A struct is another form of a 'record type,' just like a tuple. There's a
difference: structs give each element that they contain a name, called a
'field' or a 'member.' Check it out:

Review comment (Contributor): The comma shouldn't be inside the quotes.

Review comment (Member Author): Why not?

Review comment (Contributor): The comma is part of the overall sentence, not the part in quotes.

Review comment (Member Author): Yes, but comma inside the quotes is proper English.

Review comment (Contributor): Oh, you're right. I'm sorry. :)

Review comment (Contributor): Technically speaking, it’s a matter of style, and putting the comma outside the quotes is the prevalent style in British English.[1] However, Rust documentation seems to generally follow American English (where putting the comma inside the quotes is more common), so putting the comma inside the quotes would be more consistent. (Sometimes the British style is used in America, particularly in linguistic and philosophical works, because it is clearer and less ambiguous.[2])

```rust
struct Point {
    x: int,
    y: int,
}

fn main() {
    let origin = Point { x: 0i, y: 0i };

    println!("The origin is at ({}, {})", origin.x, origin.y);
}
```

There's a lot going on here, so let's break it down. We declare a struct with
the `struct` keyword, followed by a name. By convention, struct names begin with
a capital letter and are camel cased: `PointInSpace`, not `Point_In_Space`.

We can create an instance of our struct via `let`, as usual, but we use a `key:
value` style syntax to set each field. The order doesn't need to be the same as
in the original declaration.

Finally, because fields have names, we can access the field through dot
notation: `origin.x`.
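
To illustrate that field order does not matter when creating an instance, here is a minimal sketch in current Rust syntax (an assumption: `i32` instead of `int`). The `describe` function is a made-up helper that reads both fields with dot notation:

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: formats a point using dot notation on its fields.
fn describe(p: Point) -> String {
    format!("({}, {})", p.x, p.y)
}

fn main() {
    // `y` is written before `x` here; the fields are matched by name,
    // not by position.
    let p = Point { y: 2, x: 1 };
    println!("The point is at {}", describe(p));
}
```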

The values in structs are immutable by default, like other bindings in Rust.
However, you can use `mut` to make them mutable:

```rust
struct Point {
    x: int,
    y: int,
}

fn main() {
    let mut point = Point { x: 0i, y: 0i };

    point.x = 5;

    println!("The point is at ({}, {})", point.x, point.y);
}
```

This will print `The point is at (5, 0)`.
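
The sketch below, in current Rust syntax (an assumption: `i32` instead of `int`), shows that mutability belongs to the binding rather than to the struct definition. `shift_x` is a hypothetical helper, not something from the guide:

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: takes a point, rebinds it as mutable,
// changes one field, and returns the result.
fn shift_x(p: Point, dx: i32) -> Point {
    let mut moved = p; // the new binding is `mut`, so its fields can change
    moved.x += dx;
    moved
}

fn main() {
    let origin = Point { x: 0, y: 0 };

    // `origin` is not declared `mut`, so the compiler would reject this line:
    // origin.x = 5;

    let moved = shift_x(origin, 5);
    println!("The point is at ({}, {})", moved.x, moved.y);
}
```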

### Tuple Structs and Newtypes

Rust has another data type that's like a hybrid between a tuple and a struct,
called a **tuple struct**. Tuple structs do have a name, but their fields
don't:


```
struct Color(int, int, int);
struct Point(int, int, int);
```

These two will not be equal, even if they have the same values:

```{rust,ignore}
let black = Color(0, 0, 0);
let origin = Point(0, 0, 0);
```
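
A short sketch of why the two are not interchangeable, written in current Rust syntax (an assumption: `i32` instead of `int`). `Color` and `Point` have identical fields but are distinct types, so the hypothetical function `channel_sum` accepts only a `Color`:

```rust
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

// Hypothetical helper: only accepts a `Color`, even though `Point`
// holds exactly the same three integers.
fn channel_sum(c: Color) -> i32 {
    let Color(r, g, b) = c;
    r + g + b
}

fn main() {
    let black = Color(0, 0, 0);
    let origin = Point(0, 0, 0);

    println!("channel sum: {}", channel_sum(black));

    // The compiler would reject this call, because a `Point` is not a `Color`:
    // channel_sum(origin);

    let Point(x, y, z) = origin;
    println!("origin: ({}, {}, {})", x, y, z);
}
```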

It is almost always better to use a struct than a tuple struct. We would write
`Color` and `Point` like this instead:

Review comment (Member): Reason?

Review comment (Member Author): Point taken. The sample code isn't illustrative enough.

Review comment (Member): (To be clear, I understand the/a reason for it, it just might be good to explain them briefly.)

Review comment (Member Author): Absolutely. :)

Review comment (Member Author): This has been addressed.

```rust
struct Color {
    red: int,
    blue: int,
    green: int,
}

struct Point {
    x: int,
    y: int,
    z: int,
}
```

Now we have actual names rather than positions. Good names are important,
and a struct lets us use them.

There _is_ one case when a tuple struct is very useful, though, and that's a
tuple struct with only one element. We call this a 'newtype,' because it lets
you create a new type that wraps another one and is distinct from it:

```
struct Inches(int);
struct Centimeters(int);

let length = Inches(10);

let Inches(integer_length) = length;
println!("length is {} inches", integer_length);
```

Review comment (Member): BTW, was a consensus ever reached about using `int` like this? (rather than `i32` or `i64` or some such?)

Review comment (Member Author): I didn't know there was an issue.

Review comment (Member Author): ahahahhaa. "I wasn't aware this was an issue" links to a thread where I commented. Hmm.

Review comment (Member): It should be an easy change later, so this can put it on the backburner... as long as you remember. :P

As you can see here, you can extract the inner integer type through a
destructuring `let`.
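
As a rough sketch of what a newtype buys you, here is the same pair of types in current Rust syntax (an assumption: `i32` instead of `int`). The function `double_length` is hypothetical; it accepts only `Inches`, so a `Centimeters` value cannot be passed by mistake:

```rust
struct Inches(i32);
struct Centimeters(i32);

// Hypothetical helper: unwraps the inner integer with a destructuring
// `let`, doubles it, and wraps it back up.
fn double_length(len: Inches) -> Inches {
    let Inches(n) = len;
    Inches(n * 2)
}

fn main() {
    let length = Inches(10);
    let Inches(doubled) = double_length(length);
    println!("doubled length is {} inches", doubled);

    // `Centimeters` wraps the same integer type but is a different type,
    // so the compiler would reject this call:
    // double_length(Centimeters(10));
    let Centimeters(cm) = Centimeters(25);
    println!("other length is {} centimeters", cm);
}
```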

### Enums

Review comment (Member): No mention of enums with values, like `Result` or `Option`?

Review comment (Member): (Maybe you're leaving this to a "data types in more detail" section to avoid turning this part of the tutorial into a long list of syntax? If so, maybe tuple structs could be there too?)

Review comment (Member Author): I left them out because we haven't talked about generics at all, and these would require them. I plan on covering those in the generics section.

Review comment (Member): Well, it doesn't have to be generic, e.g. `enum OptionalInt { Value(int), Missing }`.

Review comment (Member Author): Ahhhh good call. Yeah, I will add that.

Review comment (Member Author): This has been added.

Finally, Rust has a "sum type", an **enum**. Enums are an incredibly useful
feature of Rust, and are used throughout the standard library. Enums look
like this:

```
enum Ordering {
    Less,
    Equal,
    Greater,
}
```

This is an enum that is provided by the Rust standard library. An `Ordering`
can only be _one_ of `Less`, `Equal`, or `Greater` at any given time. Here's
an example:

```rust
let x = 5i;
let y = 10i;

let ordering = x.cmp(&y);

if ordering == Less {
    println!("less");
} else if ordering == Greater {
    println!("greater");
} else if ordering == Equal {
    println!("equal");
}
```

Review comment (Member): `==` only works with a `PartialEq` impl, i.e. `#[deriving(PartialEq)]`.

Review comment (Member Author): This is true, but I haven't talked about traits OR annotations yet. What should I do instead?

Review comment (Member): Yeah... I've got no idea! You could move the match section before the enum one and just talk about that? Or just talk about enums in general terms here just as a direct lead in to match?

`cmp` is a function that compares two things, and returns an `Ordering`. The
call looks a little bit strange: rather than `cmp(x, y)`, we say `x.cmp(&y)`.
We haven't covered methods and references yet, so it should look a little bit
foreign. Right now, just pretend it says `cmp(x, y)`, and we'll get to those
details soon.

The `ordering` variable has the type `Ordering`, and so contains one of the
three values. We can then do a bunch of `if`/`else` comparisons to check
which one it is.
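
For readers on a current toolchain, the same chain of comparisons might look like the sketch below. It assumes current Rust, where the variants are written with their enum path (`Ordering::Less`) after importing `std::cmp::Ordering`, and `i32` replaces `int`. The function name `describe_order` is made up for illustration:

```rust
use std::cmp::Ordering;

// Hypothetical helper: compares two integers and names the result.
fn describe_order(x: i32, y: i32) -> &'static str {
    let ordering = x.cmp(&y);

    // `Ordering` implements `PartialEq` in the standard library,
    // which is what makes `==` usable here.
    if ordering == Ordering::Less {
        "less"
    } else if ordering == Ordering::Greater {
        "greater"
    } else {
        "equal"
    }
}

fn main() {
    println!("{}", describe_order(5, 10));
}
```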

However, repeated `if`/`else` comparisons get quite tedious. Rust has a feature
that not only makes them nicer to read, but also makes sure that you never
miss a case. Before we get to that, though, let's talk about another kind of
enum: one with values.

This enum has two variants, one of which has a value:

```
enum OptionalInt {
    Value(int),
    Missing
}

fn main() {
    let x = Value(5);
    let y = Missing;

    match x {
        Value(n) => println!("x is {:d}", n),
        Missing => println!("x is missing!"),
    }

    match y {
        Value(n) => println!("y is {:d}", n),
        Missing => println!("y is missing!"),
    }
}
```

This enum represents an `int` that we may or may not have. In the `Missing`
case, we have no value, but in the `Value` case, we do. This enum is specific
to `int`s, though. We can make it usable by any type, but we haven't quite
gotten there yet!
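
A sketch of the same enum in current Rust syntax follows. The assumptions are that variants are referred to through the enum name (`OptionalInt::Value`), that `i32` replaces `int`, and that plain `{}` replaces the `{:d}` format specifier. The `describe` function is a hypothetical helper:

```rust
enum OptionalInt {
    Value(i32),
    Missing,
}

// Hypothetical helper: turns either variant into a message.
fn describe(x: OptionalInt) -> String {
    match x {
        // `n` is bound to the integer stored inside the `Value` variant.
        OptionalInt::Value(n) => format!("x is {}", n),
        OptionalInt::Missing => String::from("x is missing!"),
    }
}

fn main() {
    println!("{}", describe(OptionalInt::Value(5)));
    println!("{}", describe(OptionalInt::Missing));
}
```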

You can have any number of values in an enum:

```
enum OptionalColor {
    Color(int, int, int),
    Missing
}
```
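
To show how a variant with several values is taken apart, here is a small sketch in current Rust syntax (assumptions: `i32` instead of `int`, and variants written with the enum path). The helper `red_or_default` is made up for illustration and returns the first stored value, or a fallback when the color is missing:

```rust
enum OptionalColor {
    Color(i32, i32, i32),
    Missing,
}

// Hypothetical helper: returns the first stored value, or `fallback`
// when there is no color at all.
fn red_or_default(c: OptionalColor, fallback: i32) -> i32 {
    match c {
        // All three values are available; `_` ignores the two we don't need.
        OptionalColor::Color(red, _, _) => red,
        OptionalColor::Missing => fallback,
    }
}

fn main() {
    println!("{}", red_or_default(OptionalColor::Color(255, 0, 0), 0));
    println!("{}", red_or_default(OptionalColor::Missing, 0));
}
```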

Enums with values are quite useful, but as I mentioned, they're even more
useful when they're generic across types. But before we get to generics, let's
talk about how to fix these big `if`/`else` statements we've been writing. We'll
do that with `match`.

## Match
