Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically making immutable const data reside in program memory #74

Open
gergoerdi opened this issue Aug 27, 2017 · 19 comments
Open
Labels
A-rust Affects overall Rust compiler architecture enhancement

Comments

@gergoerdi
Copy link
Collaborator

gergoerdi commented Aug 27, 2017

It would be useful to have a way of marking a string constant (a &'static [u8]?) explicitly to ensure it is put into PROGMEM. For reference, it can be done in C by using the PROGMEM macro on the definition of a const uint8_t[]:

const uint8_t font_rom[] PROGMEM = {
    0xF0, 0x90, 0x90, 0x90, 0xF0, 0x00, 0x00, 0x00
    };

and then you can use pgm_read_byte to read it one by one.

Out of scope: Automatically promoting static values to PROGMEM.

@gergoerdi gergoerdi added A-rust Affects overall Rust compiler architecture enhancement labels Aug 27, 2017
@dylanmckay
Copy link
Member

I personally am a fan of this, although here's AVR-GCC's reasoning for not doing this.

Many users bring up the idea of using C's keyword const as a means of declaring data to be in Program Space. Doing this would be an abuse of the intended meaning of the const keyword.

const is used to tell the compiler that the data is to be "read-only". It is used to help make it easier for the compiler to make certain transformations, or to help the compiler check for incorrect usage of those variables.

For example, the const keyword is commonly used in many functions as a modifier on the parameter type. This tells the compiler that the function will only use the parameter as read-only and will not modify the contents of the parameter variable.

const was intended for uses such as this, not as a means to identify where the data should be stored. If it were used as a means to define data storage, then it loses its correct meaning (changes its semantics) in other situations such as in the function parameter example.

@dylanmckay
Copy link
Member

Also, for anybody reading, in LLVM all pointers are annotated with address spaces, and so we don't need to use anything special like pgm_read_byte - any load/store through a pointer will automatically use the correct instructions to read from RAM or program memory.

@gergoerdi
Copy link
Collaborator Author

FWIW, currently a &'static [u8] is already compiled into a .text section:

Rust:

static THE_BYTES : &'static [u8] = include_bytes!("../font.img");

LLVM IR:

@byte_str.0 = internal unnamed_addr constant [64 x i8] c"\F0\90\90\90\F0\00\00\00 `  p\00\00\00\F0\10\F0\80\F0\00\00\00\F0\10\F0\10\F0\00\00\00\90\90\F0\10\10\00\00\00\F0\80\F0\10\F0\00\00\00\F0\80\F0\90\F0\00\00\00\F0\10 @@\00\00\00", align 1

AVR asm:

	.type	byte_str.0,@object      ; @byte_str.0
	.section	.rodata,"a",@progbits
byte_str.0:
	.asciz	"\360\220\220\220\360\000\000\000 `  p\000\000\000\360\020\360\200\360\000\000\000\360\020\360\020\360\000\000\000\220\220\360\020\020\000\000\000\360\200\360\020\360\000\000\000\360\200\360\220\360\000\000\000\360\020 @@\000\000"
	.size	byte_str.0, 64

(which ends up in .text).

However, accessing it doesn't go through lpm:

Rust:

    for &x in THE_BYTES {
        spi::sync(x);
    }

LLVM IR (of the relevant part):

  %1 = getelementptr inbounds i8, i8* %0, i16 1
  %2 = load i8, i8* %0, align 1

AVR asm:

        ldi     r26, lo8(byte_str.0)
        ldi     r27, hi8(byte_str.0)
        ldi     r16, lo8(byte_str.0+64)
        ldi     r17, hi8(byte_str.0+64)
LBB3_1:                                 ; %bb6
                                        ; =>This Inner Loop Header: Depth=1
        ld      r24, X+

@dylanmckay
Copy link
Member

I think we should be able to do something here, what rules should define a value going into program memory?

I'm thinking that all const values should go into program memory.

Regarding statics, I'm not sure if it'd be safe putting immutable statics into program memory. Statics are supposed to be usable from FFI, and it would be an easy mistake to declare an external static in C that is actually defined in Rust and read from it, expecting it to be in RAM, but it is actually in program memory.

@shepmaster
Copy link
Member

AFAIK, Rust's main concern with const vs static is that a static is guaranteed to be able to have its address taken. consts are closer to a C #define in that they may be inlined.

@shepmaster
Copy link
Member

what rules should define a value going into program memory?

If we can't come up with a rule, I do think some kind of attribute allowing both opt-in and opt-out would be a good start. If we notice a pattern, that could be encoded in the compiler.

@dylanmckay dylanmckay changed the title Putting string constants in PROGMEM Automatically making immutable static/const data reside in program memory Nov 12, 2017
@dylanmckay
Copy link
Member

Have made this issue specifically about the automatic promotion of static data to program memory by the Rust compiler itself.

If we want an explicit syntax for it (such as an attribute), that is now at #84

@dylanmckay
Copy link
Member

Oh, I also spent some time working on this a while back, it doesn't look like I wrote anything about it.

I started work on modifying the trans crate so that it would mark immutable statics as addrspace(1). I found it to be quite tricky, and I also got a bit confused because by the looks of it, there's two different translation implementations - one for the old compiler middle-end and one for the MIR backend. IIRC there was some shared code between both so it was tricky to find out what to change. The story on that might be better now that some time has passed though.

Doing this won't be too much work, but it's not as trivial as initially thought.

@dylanmckay
Copy link
Member

Thinking about this more, we can only do this on devices that support the lpm instruction.

@dylanmckay
Copy link
Member

I think it would be a good idea to do this by

  • Create a new AVR backend pass, PromoteConstsToProgramMemory. This pass can be explicitly initialized by the frontend, and is not a part of the default pipeline.
  • Run it with Rust
  • The pass itself should be a NOP if the target MCU does not support lpm
  • If the target does support LPM, move all constant globals to address space 1, update all uses to address space 1

This way, the frontend doesn't need to know AVR specific quirks.

@rubberduck203
Copy link

I don’t know that I would try to automatically determine where a value should be stored based on static or const. Putting something in PROGMEM seems like the job for an attribute.

@dylanmckay
Copy link
Member

dylanmckay commented Jun 6, 2019

The main thing that prompted this idea is the fact that even variables that reside in RAM must still reside in program memory (ignoring zero-initialized variables in .bss) - RAM is zeroed at chip boot and thus like AVR-GCC, we include routines for automatically initializing RAM from an image of it embedded in program memory.

The main difference with Rust and C++ - there is no such thing as const_cast<>() and we know that, barring pointer manipulation and dereferencing via unsafe, a const value is inherently unchangable. In fact, const variables in Rust don't always even get allocated space in the executable - you must use static variables if you want this, which is required for usage in FFI. You can have an integer const in Rust today and it may not even be allocated to a data section (flash OR RAM) - it can exist only in the immediate bits inside the instructions that use it. To add to this, I'm fairly sure it's invalid Rust to take the address of a const for this very reason, so, as far as I understand it, mutation of constants in Rust is provably impossible, so a write can never occur. In C/C++, because consts can actually be mutated, automatically promoting these variables to program memory could actually change program behaviour (also turning any memory write exploit like a out of bounds access into a RCE exploit allowing the program to arbitrarily rewrite itself).

So, my thoughts are, const values make a lot of sense to automatically live in program memory (on devices where lpm is supported)`. I haven't been able to think of any cases where this could actually change behaviour or introduce bugs for any Rust programs.

I think if even just one counterexample can be found, then I agree that we totally should not do this automatically. As it stands, there aren't any documented downsides, but there are known upsides such as lower memory use and faster startup time.

I don’t know that I would try to automatically determine where a value should be stored based on static or const

For what it's worth, my understanding of the proposal is that we leave static as-is and only promote constants to progmem. I guess the title of this ticket doesn't really reflect this. I believe that it might make sense to automatically promote immutable statics to program memory too, but that should be a different discussion (as it has downsides such as mucking with FFI).

@dylanmckay dylanmckay changed the title Automatically making immutable static/const data reside in program memory Automatically making immutable const data reside in program memory Jun 6, 2019
@shepmaster
Copy link
Member

I'm fairly sure it's invalid Rust to take the address of a const for this very reason

It's not. The const will be automatically promoted to a static (just like a literal):

const SIZE: usize = 42;

fn a() -> &'static usize {
    &SIZE
}

fn b() {
    let x = &mut SIZE;
}

fn c() -> &'static usize {
    &99
}

This can cause confusion sometimes

@shepmaster
Copy link
Member

immutable statics to program memory

This might not play well with interior mutability.

@dylanmckay
Copy link
Member

dylanmckay commented Jun 8, 2019

oof, that sounds like the counterexample I was worried about.

@shepmaster do you think it could work if we write the LLVM pass such that it only promotes certain constants to PROGMEM?

There may be a subset of global variables that can be safely promoted to PROGMEM.

Here's a few points from the LLVM LangRef that we could leverage.

A variable may be defined as a global constant, which indicates that the contents of the variable will never be modified (enabling better optimization, allowing the global data to be placed in the read-only section of an executable, etc). Note that variables that need runtime initialization cannot be marked constant as there is a store to the variable.

Global variables can be marked with unnamed_addr which indicates that the address is not significant, only the content. Constants marked like this can be merged with other constants if they have the same initializer. Note that a constant with significant address can be merged with a unnamed_addr constant, the result being a constant whose address is significant.

@shepmaster
Copy link
Member

LLVM pass such that it only promotes certain constants to PROGMEM?

There may be a subset of global variables that can be safely promoted to PROGMEM.

Yes, I think this makes sense, the problem is that I don't have enough vision to say what the accurate criteria are.

@luqasz
Copy link

luqasz commented Nov 25, 2019

One use case of not automatic placement of const variables in programspace would be:
Writing code for a interrupt vector where each cycle counts. E.g. I have to read couple of variables and placing them in PROGMEM would lead to extra 20 cycles.

@luqasz
Copy link

luqasz commented Nov 25, 2019

I've asked LLVM devs if it would be possible to implement automatic PROGMEM "distribution". What I mean is to let compiler decide which PROGMEM variables will be placed in which address space. AVR have quirks about reading across 64KB boundary, which requires separate instruction RAMP{X,Y,Z}

@dylanmckay
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust Affects overall Rust compiler architecture enhancement
Projects
None yet
Development

No branches or pull requests

5 participants