Skip to content

arch: aarch32: linker command file fixes: .bss/.noinit sections, z_mapped_start marker #60368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ibirnbaum
Copy link
Member

This is a follow up to #53262, which still lacked the adjustment of the .noinit section's position within the binary by the time the PR went stale.

Adjust the linker command file so that the .bss and .noinit sections are placed at the end of the resulting binary. Until now, those sections have been located somewhere in the middle of the binary, so that the inclusion of structures like statically defined heaps or large zero- initialized arrays reflected 1:1 in the resulting binary's size. Even for a stripped binary, such data was included in full as the linker couldn't omit it due to subsequent sections within the binary.

This fix has been tested with a 32 MB statically allocated heap and a 32 MB uint8 zero-initialized array. Both structures are clearly identifiable in the memory consumption statistics, however, the final binary's size is unaffected by their inclusion.

@ibirnbaum ibirnbaum requested a review from stephanosio as a code owner July 13, 2023 20:31
@zephyrbot zephyrbot added the area: ARM ARM (32-bit) Architecture label Jul 13, 2023
@ibirnbaum
Copy link
Member Author

cross-reference to #53262

microbuilder
microbuilder previously approved these changes Jul 13, 2023
@SgrrZhf SgrrZhf self-requested a review July 14, 2023 02:25
*/
. = ALIGN(4);
__bss_start = .;
__kernel_ram_start = .;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this way, should __kernel_ram_start be put above the .data section? Otherwise, __kernel_ram_start would not include the data section, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check where this marker is used, you likely have a point there. This was missed by the author of #53262 as well.

Also, I noticed last night that with regards to the linker command file, the issue described in #51024 also still persists. We should preferrably fix this as well before going for the A/R/M separation.

@ibirnbaum
Copy link
Member Author

I've moved the __kernel_ram_start marker out of the bss section and placed it within the first section that originally came right after the bss section, which is the data section. Therefore, data/bss/noinit are now all located between the __kernel_ram_start and __kernel_ram_end markers, but with the adjusted section order. Two test builds of the "philosophers" sample with both revisions of the linker command file produce identical values for the start/end/size markers, as expected, they just differ regarding the order of the sections.

One thing I noticed along the away: I haven't checked who introduced this change and when, but there's now an #include <zephyr/linker/ram-end.ld> statement towards the end of the linker command file which sets _image_ram_end/z_mapped_end. This happens after the OCM declaration. In probably 99% of all cases, the OCM is configured to be located at address 0x0, as this matches the default configuration of the exception vectors' location. However, if I configure the OCM to be located at the high address by changing the corresponding 'chosen' entry, the OCM is located at 0xFFFC0000, beyond all RAM, all AXI memory space, all integrated peripherals and the system control registers. Although the OCM is in a separate group, I'd better test if using the high OCM location somehow messes up the _image_ram_end/z_mapped_end markers.

SECTION_DATA_PROLOGUE(_DATA_SECTION_NAME,,)
{
__kernel_ram_start = .;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original __kernel_ram_start = .; under .bss which aligns to BSS_ALIGN. This is for the MPU or MMU region settings (MPU requires the address to be aligned to 32-byte or 64-byte, MMU requires 4k). I think you should also add a _ALIGN into SECTION_DATA_PROLOGUE(_DATA_SECTION_NAME,,), since data section is the first one. Maybe rename the BSS_ALIGN to DATA_ALIGN or a more generic name, KERNEL_RAM_ALIGN?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, __kernel_ram_start should align to _region_min_align

@ibirnbaum ibirnbaum force-pushed the aarch32_ld_fix_bss_rodata branch from 58f292e to a432eac Compare July 18, 2023 12:03
@ibirnbaum
Copy link
Member Author

I've replaced BSS_ALIGN with a more generic SECTION_ALIGN.

With this revision of the linker command file, I get the follwing memory map when building the philosophers sample for qemu_cortex_a9:

memmap

This looks plausible to me & matches alignment requirements, at least for a MMU-based system.
As mentioned before, z_mapped_start is still in the wrong place (should be 0010 0000), that's the subject of the afforementioned PR #51024. Should I add this fix as a 2nd commit, or should this be a separate PR?
Is it intentional that the markers __data_load_start and __data_region_load_start have no corresponding end markers?

@carlocaione
Copy link
Collaborator

Should I add this fix as a 2nd commit, or should this be a separate PR?

Can you add that fix here as well?

Is it intentional that the markers __data_load_start and __data_region_load_start have no corresponding end markers?

Yeah, AFAICT the only usage of __data_region_load_start is for z_data_copy() and in that case the size of the region (thus its ending) is calculated indirectly with __data_region_end - __data_region_start.

@ibirnbaum
Copy link
Member Author

ibirnbaum commented Jul 18, 2023

Can you add that fix here as well?

Within the same commit? Sure, but in that case, the title of the PR wouldn't be 100% accurate any more, as we're then mixing the issue of section placement in order to reduce binary size with a misplaced marker indicating to the memory subsystem which memory is available for mapping via the MMU.

If that's not an issue, I'll go ahead and include the fix, referencing the original PR.

@microbuilder
Copy link
Member

Can you add that fix here as well?

Within the same commit? Sure, but in that case, the title of the PR wouldn't be 100% accurate any more, as we're then mixing the issue of section placement in order to reduce binary size with a misplaced marker indicating to the memory subsystem which memory is available for mapping via the MMU.

If that's not an issue, I'll go ahead and include the fix, referencing the original PR.

Adding a second commit and making the PR title more generic seems fine to me, to make both changes here.

Copy link
Member

@microbuilder microbuilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a breaking change, but it feels like it should be in the release notes for Aarch32. Do you mind adding a commit with one or two lines in the 3.5 release notes, highlighting the change and the reason for it?

@ibirnbaum
Copy link
Member Author

This isn't a breaking change, but it feels like it should be in the release notes for Aarch32. Do you mind adding a commit with one or two lines in the 3.5 release notes, highlighting the change and the reason for it?

Is there a template for release notes, or is a specific title required for a release note commit?

@microbuilder
Copy link
Member

microbuilder commented Jul 18, 2023

This isn't a breaking change, but it feels like it should be in the release notes for Aarch32. Do you mind adding a commit with one or two lines in the 3.5 release notes, highlighting the change and the reason for it?

Is there a template for release notes, or is a specific title required for a release note commit?

I think just doc: release-notes: Add Aarch32 linker updates should be fine, and then something under: Architectures > ARM

Here's a recent addition I made if it helps:

API Changes
***********

Changes in this release
=======================

* Set :kconfig:option:`CONFIG_BOOTLOADER_SRAM_SIZE` default value to ``0`` (was
  ``16``). Bootloaders that use a part of the SRAM should set this value to an
  appropriate size. :github:`60371`

EDIT: Have a look at the 3.4 Arm release notes for a model to follow: https://github.com/zephyrproject-rtos/zephyr/blob/main/doc/releases/release-notes-3.4.rst#architectures

@ibirnbaum
Copy link
Member Author

I was about to prepare a commit for moving z_mapped_start, which I've done before in some long gone sandbox install when looking into 51024, however, I just noticed that there are at the time being two alternative locations, and they're dependent on XIP.

So I wanted to look further into how XIP affects the output - has anyone ever tried building a binary for Cortex-A/-R with XIP enabled (I guess all Zynq-based boards come with flash, so that sure is a realistic scenario)? If XIP is enabled, a section alignment parameter (which was already there at least for bss before I came along with the current modifications) such as:
SECTION_DATA_PROLOGUE(_BSS_SECTION_NAME,(NOLOAD), SECTION_ALIGN)
which expands to
SECTION_DATA_PROLOGUE(_BSS_SECTION_NAME,(NOLOAD), ALIGN(_region_min_align))
causes a syntax error in linker_zephyr_pre0.cmd where the statement ends up as:

    bss (NOLOAD) : ALIGN_WITH_INPUT ALIGN(_region_min_align)
    {
        . = ALIGN(4);
        __bss_start = .;
        *(.bss)
        *(".bss.*")
        *(COMMON)
        *(".kernel_bss.*")
        __bss_end = ALIGN(4);
    } > RAM AT > RAM

The syntax error comes from the ALIGN_WITH_INPUT ALIGN(_region_min_align) part, if the section alignment specification is removed from linker.ld, bss (NOLOAD) : ALIGN_WITH_INPUT {...} seems to by syntactically correct. However, with XIP disabled, the generated output bss (NOLOAD) : ALIGN(_region_min_align) is valid syntax as well. Could it be that ALIGN_WITH_INPUT and ALIGN(_region_min_align) are mutually exclusive and somehow XIP and non-XIP aren't properly separated here?

microbuilder
microbuilder previously approved these changes Aug 8, 2023
@ibirnbaum ibirnbaum changed the title arch: aarch32: place .bss, .noinit sections at the end of the binary arch: aarch32: linker command file fixes: .bss/.noinit sections, z_mapped_start marker Aug 8, 2023
@ibirnbaum ibirnbaum dismissed stale reviews from microbuilder and povergoing via 5739bee August 8, 2023 07:53
@ibirnbaum ibirnbaum requested a review from nashif as a code owner August 8, 2023 07:53
@zephyrbot zephyrbot added the Release Notes To be mentioned in the release notes label Aug 8, 2023
@ibirnbaum ibirnbaum force-pushed the aarch32_ld_fix_bss_rodata branch from 5739bee to 12f2835 Compare August 8, 2023 08:44
@ibirnbaum ibirnbaum force-pushed the aarch32_ld_fix_bss_rodata branch from 12f2835 to 28ba1ff Compare August 8, 2023 08:57
@ibirnbaum ibirnbaum force-pushed the aarch32_ld_fix_bss_rodata branch from 28ba1ff to 4e0f2ba Compare August 8, 2023 09:08
…hanges

Document the changes implemented in zephyrproject-rtos#60368:
* Placement of the .bss and .noinit sections at the end of the binary
  so that large zero-/uninitialized data structures such as heaps,
  arrays etc. don't have to be padded in the resulting binary.
* Location of the z_mapped_start marker: prevents the assignment
  of the single 4k-page wide .vectors section right at the RAM
  base address as dynamic memory by the MMU at run-time. Instead
  of pointing to the start of the subsequent .text section, the
  z_mapped_start marker now covers all the data contained within
  the binary that ends up in RAM.

Signed-off-by: Immo Birnbaum <mail@birnbaum.immo>
@ibirnbaum ibirnbaum force-pushed the aarch32_ld_fix_bss_rodata branch from 4e0f2ba to d8545f4 Compare August 8, 2023 22:24
@ibirnbaum
Copy link
Member Author

The CI put up a bit of a fight last night with misleading 'trailing whitespace' errors, but by now, everything should be good to go.

Copy link
Member

@microbuilder microbuilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to add release notes as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ARM ARM (32-bit) Architecture Release Notes To be mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants