Skip to content

Conversation

@carlocaione
Copy link
Contributor

The AArch64 port in growing in complexity and before adding new more layers and code on top it's sometimes needed to revisit the basics.

This PR is doing the following:

  1. The cpu.h header file was being used as a catch-all header file, a no-man land without a proper naming convention for registers and bit fields. It is still a bit of a mess but at least with this PR I'm trying to name things following a reasonable scheme: _BIT for bits, _SHIFT for shifts and _MASK for masks.

  2. AArch64 is currently using a mix of C code and inline assembly pretty much everywhere. This is error prone, redundant and in general in 2021 we should write as little assembly code as possible IMO. So this PR is adding a lib_helper.h file factoring out all the inlined assembly to common operations. We are also adding a new MAKE_REG_HELPER macro to create helpers to read / write system registers. This is handy to have self-explanatory C code when accessing system registers.

  3. Leveraging the two previous points the reset routine has been rewritten, moving most of the code from assembly to C without any change to the actual functionality. This was inspired by some changes requested by @jharris-intel (but not yet implemented) that you can find in [misc] AArch64 improvements and fixes #32205. The C code also allows us to put in place more complex boot paths and in general to better handle all the possible permutations in which you can boot an AArch64 system.

@carlocaione
Copy link
Contributor Author

failure is unrelated to this PR

@jharris-intel
Copy link
Contributor

in general in 2021 we should write as little assembly code as possible IMO.

As an aside, this is one of the reasons why load-exclusive / store-exclusive gets nowhere near as much use as it could.

There are a lot of optimizations you can do with load-exclusive / store-exclusive that you can't do with the standard atomics (notably, you can do nearly-arbitrary calculations and conditions between the load and store, all atomically and without redundant loads. E.g. if (val.bitfieldA == 3 && val.bitfieldB == 4) {val.bitfieldC = 3;} else {val.bitfieldA = 3;}, all atomically.)

(Great for things like atomic recursive spinlocks. You don't need to do a separate fallback check "am I the owner", just have e.g. the owner core and current depth in the spinlock value and do the entire "set the lock to my core if it's unowned, else if I own the lock increment the depth, else clrex and continue" in one go.)

(Yes, you can't do too many calculations between the load and store for various reasons - but even restricting to say ten instructions very rapidly hits a combinatorial explosion of possible atomics APIs.)

Trying to do this in anything other than assembly is a recipe for disaster even if the intrinsics are exposed - because the compiler can and occasionally will insert memory operations.

The best you can do with the standard atomics is load / do calculation / CAS, which has much worse behavior especially under load (load-exclusive store-exclusive has forward-progress guarantees that load/calculate/CAS doesn't) (and load-exclusive store-exclusive doesn't suffer from the ABA problem which makes many things simpler than standard atomics)

...And unfortunately as a result even ARM is moving away from this model (e.g. ARMv8.1 introducing "standard" atomics). It's a shame, although I fully understand why this is the case. The power of a load-exclusive store-exclusive is being able to do (semi)-arbitrary calculations in the atomic section, but if no-one actually takes advantage of this...

I really wish e.g. GCC had a pragma or something to indicate "no implicit memory operations in this section". Most of the assembly I write is due to this for one reason or another (stack pointer invalid, etc, etc).

Copy link

@npitre npitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to end the inline asm content with "\n\t" when there is only one such line

@carlocaione carlocaione mentioned this pull request Feb 19, 2021
@carlocaione
Copy link
Contributor Author

The new revision is modifying a bit the boot path. We are now dropping from EL3 to EL2 if this this is allowed and EL2 actually exists otherwise we drop into EL1 as usual.

@carlescufi carlescufi requested a review from pabigot February 22, 2021 15:32
@carlocaione
Copy link
Contributor Author

@jharris-intel can you ACK / NACK this?

Copy link
Contributor

@jharris-intel jharris-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jharris-intel can you ACK / NACK this?

Sure, shall do.

(If you're wondering; I'm always a little hesitant about showing up and stepping on people's toes.)

@carlocaione
Copy link
Contributor Author

(If you're wondering; I'm always a little hesitant about showing up and stepping on people's toes.)

I really appreciate your reviews and your insights, so your showing up and stepping on people's toes is really welcome here and especially on AArch64 PRs ;)

The name for registers and bit-field in the cpu.h file is incoherent and
messy. Refactor the whole file using the proper suffixes for bits,
shifts and masks.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Introduce C helpers for common assembly operations and preprocessor
constructors to create helper functions to read / write system
registers.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
No need to rely on inline assembly when helpers are available.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
There is no strict reason to use assembly for the reset routine. Move as
much code as possible to C code using the proper helpers.

Signed-off-by: Carlo Caione <ccaione@baylibre.com>
@zephyrbot zephyrbot added the area: ARM64 ARM (64-bit) Architecture label Mar 3, 2021
@carlocaione carlocaione assigned carlocaione and unassigned ioannisg Mar 4, 2021
@nashif nashif merged commit 9d908c7 into zephyrproject-rtos:master Mar 4, 2021
@carlocaione carlocaione deleted the cpu_reset_cleanup branch March 4, 2021 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: API Changes to public APIs area: ARM ARM (32-bit) Architecture area: ARM64 ARM (64-bit) Architecture area: Interrupt Controller

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants