Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump qemu to v6.0.1 to resolve compile errors on recent versions of Ubuntu and Arch Linux #1120

Closed
wants to merge 3 commits into from

Conversation

allisonrandal
Copy link

Related issue: #1049

Type of change: bug fix

Impact: software change

Release Notes

@abejgonzalez
Copy link
Contributor

@Mergifyio copy dev

@mergify
Copy link

mergify bot commented Feb 16, 2022

copy dev

✅ Pull request copies have been created

@abejgonzalez
Copy link
Contributor

To keep you in the loop, we are looking to do a major release of CY (hopefully) by EOD today. If this doesn't make it, we will definitely take a look at this by the next minor release (we are aiming for right before ASPLOS).

@NathanTP
Copy link
Contributor

I just tried this on my centos 7 machine (basically the same as Amazon Linux) and got a failure building toolchains:

=> Starting qemu build
==> Initializing qemu submodule
==> Updating autoconf files for qemu
autoreconf: 'configure.ac' or 'configure.in' is required

I did a fresh clone, ran "./init-submodules-no-riscv-tools.sh" and "MAKEFLAGS=-j24 ./build-toolchains.sh ec2fast". Has anyone tried this on ec2? Maybe an ec2fast issue?

also: It'd be best if we make sure marshal still works with the new qemu. It seems unlikely to break but you never know. Just run "chipyard/software/firemarshal/scripts/fullTest.py". I don't have an Ubuntu machine handy so maybe someone who has the environment all ready can try, I'll try centos7 if we fix the build issue.

@allisonrandal
Copy link
Author

The fullTest.py test script fails on Ubuntu 21.10 for reasons unrelated to qemu. The first failure is already captured in firesim/FireMarshal#30 but after I apply that workaround, it fails to build images with errors "You must install 'python' on your build machine" (Which means it expects Python 2.x, which was EOL in 2020.) When I work around that (by installing python2 and creating a symlink to it so 'python' exists), then the tests fail with C compile errors in libfakeroot.c. Ultimately, I'd have to test the new version of qemu on an older version of Ubuntu, to avoid incompatibilities between the FireMarshal code and the more recent versions of dependencies on newer versions of Ubuntu. (At some point, FireMarshal will need to be updated too.)

On the centos 7 build issue, ec2fast installs a pre-compiled RISC-V toolchain, but it doesn't change anything for the qemu build. Still, it would be useful information to see whether you get the same error if you choose riscv-tools as the install type instead of ec2fast.

@NathanTP
Copy link
Contributor

  1. marshal doesn't support python2 and I've tried to use python3 explicitly everywhere I can (though it's possible I missed one somewhere). On my environment, 'python' is python3. Maybe Linux or buildroot use "python" directly somewhere.
  2. I'm not sure what libfakeroot is. Where exactly in the build is that failing? There should be some logs from the build (marshal prints the path to the detailed log every time you run it). It might be a buildroot thing.
  3. I am having the configure.ac issue for "./build-toolchains.sh riscv-tools" as well. Looking a bit deeper, in chipyard/scripts there is a build-util.sh that has some bash functions in it for building stuff. module_build() assumes that whatever it's building uses autoconf. I assume the new QEMU either isn't using that or is using it differently. I'm not sure why this only shows up on centos though. You may need to skip the functions and run the correct commands directly in build-toolchains.sh.

@NathanTP
Copy link
Contributor

Looking again, it seems that the qemu submodule is not being initialized. This PR removes that code from build-toolchains.sh. I don't know what all the shenanigans in there were doing before, but at a minimum we still need to update the qemu submodule. Chipyard is pretty conservative about cloning submodules (hence the "no-riscv-tools" part of the init submodules script). build-toolchains.sh is responsible for setting that up.

@allisonrandal
Copy link
Author

allisonrandal commented Feb 18, 2022

Ubuntu dropped the binary named 'python' years ago, and only installs binaries named 'python3' (by default) and 'python2' (if you manually install the python2 packages). I got my info out of the log files, and happy to upload them, but since the marshal errors are unrelated to qemu, do you want me to open that as a separate issue on FireMarshal?

Update: The C compile error is in fakeroot, which is pulled in as a dependency by buildroot. The problem has already been fixed upstream, first by patching buildroot (buildroot/buildroot@f45925a) and later by updating the version of fakeroot that buildroot pulls in as a dependency (buildroot/buildroot@91ebf2b).

Update: The dependency on 'python' also came from buildroot, and has been fixed upstream. The buildroot changelog for 2022.02-rc1 notes: "Python 2.7 and python 2.x specific packages removed as python 2.x is EOL since April 2020." (The specific relevant change is buildroot/buildroot@9c0c784).

Looks like FireMarshal could use a bump in the version of buildroot (currently it has git submodule at version 2019.08 https://github.com/buildroot/buildroot/tree/1fcdfbfb8a28b81283575efc3faa92ce4a0d4e56).

I know it gets annoying quickly when you start updating versions of dependencies. TBH, the way I've been working is to use Chipyard locally with qemu/verilator (patched with the updated qemu), and only use FireMarshal in ec2 on the super old ec2 images that it targets. I assume everyone else only uses FireMarshal on ec2 with old images as well, otherwise they would have already reported these dependency version incompatibilities. So, it should be fine for now to only make sure that the updated qemu v6.0.1 doesn't break on those old ec2 images, and worry about building FireMarshal on newer versions of Ubuntu/Arch later.

@allisonrandal
Copy link
Author

If you look at the issue #1049, you'll see that those lines from .build-toolchains.sh were deleted because the script itself demanded it:

==>  PLEASE REMOVE qemu URL-REWRITING from scripts/build-toolchains.sh. It is no longer needed!

I expect that one or more of those deleted lines needs to be modified to update the submodule without doing URL rewriting.

@allisonrandal
Copy link
Author

@NathanTP I've added commit to the PR to init the qemu submodule without url rewriting (ba35f80)

@NathanTP
Copy link
Contributor

Cool, things seem to install properly now. I just ran the unit tests and it seems that the bbl tests aren't working. This updated QEMU doesn't seem to play nice with BBL:

qemu-system-riscv64` -nographic -bios none -smp 4 -machine virt -m 16384 -kernel /data/repos/cy_qemuBump/software/firemarshal/images/bbl-bin -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -device virtio-net-device,netdev=usernet -netdev user,id=usernet,hostfwd=tcp::35934-:22 -device virtio-blk-device,drive=hd0 -drive file=/data/repos/cy_qemuBump/software/firemarshal/images/bbl.img,format=raw,id=hd0
bbl loader
..../machine/mtrap.c./machine/mtrap.c:21: machine mode: unhandlable trap 7 @ 0x0000000080001f6e
.../machine/mtrap.c:21: machine mode: unhandlable trap 7 @ 0x0000000080001f6e
:21: machine mode: unhandlable trap 7 @ 0x0000000080001f6e
/machine/mtrap.c:21: machine mode: unhandlable trap 7 @ 0x0000000080001f6e

We may need to update or fix BBL. I'm not super familiar with these traps, maybe BBL is missing some standard handler that was introduced later?

@NathanTP
Copy link
Contributor

Bumping bbl seems to fix it. I'm tracking that here: firesim/FireMarshal#230

We'll need to wait for that to get fully tested and merged, but after that this QEMU bump should be OK by me.

As for the other build issues @allisonrandal, we can take that over to the mailing lists or marshal issues. I don't think they're related to this PR.

@allisonrandal
Copy link
Author

Thanks, I split out the unrelated FireMarshal build failures into firesim/FireMarshal#232

@sagark sagark added this to the 1.8.0 milestone Jun 15, 2022
@abejgonzalez
Copy link
Contributor

Now that we are using Conda, QEMU shouldn't need to be compiled from source (see #1163). We packaged up QEMU v5.0.0 but if v6 is needed we can support that through the Conda recipe https://github.com/ucb-bar/qemu-feedstock/blob/main/recipe/meta.yaml. Feel free to open a issue if that is needed and we can look into it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants