Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault when calling Builder::build_global_string #32

Open
Redrield opened this issue Feb 17, 2018 · 12 comments
Open

Segmentation fault when calling Builder::build_global_string #32

Redrield opened this issue Feb 17, 2018 · 12 comments

Comments

@Redrield
Copy link

Redrield commented Feb 17, 2018

When playing around with inkwell before starting my project in full, I decided to try to make a hello world program using it. This is the code that I came up with

extern crate inkwell;

use inkwell::context::Context;
use inkwell::targets::{InitializationConfig, Target};
use inkwell::AddressSpace;
use inkwell::module::Linkage;


fn main() {
    Target::initialize_native(&InitializationConfig::default()).unwrap();
    let ctx = Context::create();
    let module = ctx.create_module("program");
    let builder = ctx.create_builder();

    let str_type = ctx.i8_type().ptr_type(AddressSpace::Generic);
    let i32_type = ctx.i32_type();

    let global = builder.build_global_string("Hello, World!", "message");

    let printf_type = i32_type.fn_type(&[&str_type], true);
    let printf = module.add_function("printf", &printf_type, Some(&Linkage::ExternalLinkage));

    let main_fn_type = i32_type.fn_type(&[&i32_type, &str_type.ptr_type(AddressSpace::Generic)], false);

    let main_fn = module.add_function("main", &main_fn_type, None);
    let block = ctx.append_basic_block(&main_fn, "entry");
    builder.position_at_end(&block);
    builder.build_call(&printf, &[&global], "", false);
    builder.build_return(Some(&i32_type.const_int(0, false)));

    println!("{:?}", module.print_to_string());
}

It seems like it should work, however there's a segmentation fault when I call builder.build_global_string. The same happens with builder.build_global_string_ptr

Environment

Operating system: Arch Linux
Kernel version: 4.15.3-1-ARCH
LLVM version: 5.0.1-2
Inkwell version: 0.1.0#eafca272
rustc -vV output:

rustc 1.25.0-nightly (3ec5a99aa 2018-02-14)
binary: rustc
commit-hash: 3ec5a99aaa0084d97a9e845b34fdf03d1462c475
commit-date: 2018-02-14
host: x86_64-unknown-linux-gnu
release: 1.25.0-nightly
LLVM version: 6.0
@TheDan64 TheDan64 added the bug label Feb 17, 2018
@TheDan64
Copy link
Owner

Hi, thanks for reporting this issue!

I'm not quite sure why this is happening - it seems like rustc doesn't allow for a name field at all and just passes in a null ptr but trying that didn't help me either. I will have to dig deeper into it.

Also worth noting that Inkwell only supports LLVM 3.7 officially at the moment (I have a PR to get multiple version support going, but it's turning out to be a massive pain), though regardless I am seeing this issue in 3.7 as well.

Sorry that this is blocking you! In the mean time I bet you could create a global i8 pointer to achieve the same effect? A C/LLVM string is just that (and maybe null terminated)

@Redrield
Copy link
Author

That would work, though how would I fill the global with my intended value? Glancing at the API docs it seems that there's a barrier between rust types like str, and values created from them, and anything in inkwell that has an impl of BasicValue. So I'm not sure how I would translate between the two

@TheDan64
Copy link
Owner

TheDan64 commented Feb 17, 2018

I think something like this should work:

let context = Context::create();
let module = context.create_module("my_mod");
let my_str = "Hello, World";
let i8_type = context.i8_type();
let i8_array_type = i8_type.array_type(my_str.len() as u32);
let global_string = module.add_global(&i8_array_type, Some(AddressSpace::Generic), "message");

let mut chars = Vec::with_capacity(my_str.len()); // Edit: Not sure if len is unicode chars or bytes, but I'm assuming it's bytes here

for chr in my_str.bytes() {
    chars.push(i8_type.const_int(chr as u64, false));
}

let const_str_array = i8_array_type.const_array(chars.as_ref());

global_string.set_initializer(&const_str_array);

Alternatively, you could put it behind an i8_ptr_type but then you should be sure to to have a null byte at the end so you know when the pointer ends. (This is essentially a C String - the above example is closer to what rust does)

@Redrield
Copy link
Author

It seems like I can't use an array because the signature of printf as I have it at the moment expects variadic i8*, and that type doesn't match [i8 x 13]*. I know how I can get a null PointerValue, but how can I fill that with the actual string I'm trying to work with?

@TheDan64
Copy link
Owner

TheDan64 commented Feb 18, 2018

In my edit to my previous comment, I mentioned you could just use almost identical code but with an i8_pointer_type, and as long as you be sure to add a null byte('\0') it should work. Might need to do a store or something, though. I'll try and whip up an example tomorrow

@TheDan64 TheDan64 added this to the 0.1.0 milestone Feb 19, 2018
@TheDan64
Copy link
Owner

TheDan64 commented Feb 19, 2018

Okay, so, I don't have a full example but I updated the build_cast method after reading some LLVM docs and you should be able to cast an array pointer to an i8 pointer:
let i8_ptr_type = builder.build_cast(InstructionOpcode::BitCast, &const_str_array, &i8_ptr_type);

Though I think const_str_array is a [n x i8] and needs to be a [n x i8]* in which case you might need to go one pointer level higher when creating it

@TheDan64
Copy link
Owner

@Redrield This seems to be an LLVM specific issue. This happens because the global string needs to be created in a function (I don't really get why, since it's global, not local...) but see this SO post which seems to showcase an identical problem: https://stackoverflow.com/questions/8966890/createglobalstringptr-crashes-when-i-dont-create-method-basic-block-for-main

I'm going to spend some more time thinking about whether or not there's anything in inkwell we can do about it... but I might just have to end up closing this issue as it's LLVM specific (Maybe I'll end up collecting a list of LLVM gotchas...). To get the code in your initial post you'd have to move the global string creation to after the main function is generated like so:

fn main() {
    Target::initialize_native(&InitializationConfig::default()).unwrap();

    let ctx = Context::create();
    let module = ctx.create_module("program");
    let builder = ctx.create_builder();

    let str_type = ctx.i8_type().ptr_type(AddressSpace::Generic);
    let i32_type = ctx.i32_type();


    let printf_type = i32_type.fn_type(&[&str_type], true);
    let printf = module.add_function("printf", &printf_type, Some(&Linkage::ExternalLinkage));

    let main_fn_type = i32_type.fn_type(&[&i32_type, &str_type.ptr_type(AddressSpace::Generic)], false);

    let main_fn = module.add_function("main", &main_fn_type, None);
    let block = ctx.append_basic_block(&main_fn, "entry");
    builder.position_at_end(&block);

    let global = builder.build_global_string("Hello, World!", "message");

    builder.build_call(&printf, &[&global], "", false);
    builder.build_return(Some(&i32_type.const_int(0, false)));

    println!("{:?}", module.print_to_string());
}

@TheDan64
Copy link
Owner

Maybe Inkwell should require a function value to be passed into the create string global method even if it isn't actually used... Definitely would need to document that

@Redrield
Copy link
Author

I ended up just stealing the stuff from librustc_trans which works. Seems there they associate builders with a block directly. The way a builder is constructed makes it linked with the block it is building for, and they make new builders per function. Maybe an architectural change to that would be of use here?

@TheDan64
Copy link
Owner

TheDan64 commented Feb 25, 2018

Dunno, seems odd since you can have multiple blocks per function, but maybe. I'll have to take a look at librustc_trans.

@TheDan64
Copy link
Owner

TheDan64 commented Feb 25, 2018

It looks like librustc_trans also supports creating a builder without a function, hmm: https://github.com/rust-lang/rust/blob/master/src/librustc_codegen_llvm/builder.rs#L61-L84

@TheDan64 TheDan64 modified the milestones: 0.1.0, 0.2.0 Feb 25, 2018
TheDan64 added a commit that referenced this issue Jul 4, 2018
Finally marked build_global_string as unsafe as per #32
@nlewycky
Copy link
Contributor

nlewycky commented Jul 4, 2019

@TheDan64 "This seems to be an LLVM specific issue. This happens because the global string needs to be created in a function"
The global needs to be created in a Module, though not a Function. The way the IRBuilder finds the module is by looking at the IRBuilder's current position and walking parents from BasicBlock->Function->Module. When you haven't set the position yet, it will perform undefined behaviour in C++ land as it starts with no basic block.

IMO it's pretty uncommon to use an IRBuilder without setting its position first. If you want to support it, the builder will create detached instructions (see #93) and only some of its methods are safe for that. All the type and constant (non-global) creation are. Some, like, build_malloc and build_global_string won't be. Or you could just mandate that IRBuilders must have a position at all times instead of trying to split the functions by whether they support creating detached IR.

Unfortunately, either way, you can't use the IRBuilder's utility functions like creating a global string until after you've created a basic block somewhere in the module you want to create your string in. (Really, LLVM should split this up and expose the utility for creating a string in a module, and then have IRBuilder use that.)

@TheDan64 TheDan64 modified the milestones: 0.6.0, 0.7.0 Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants