Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve header structure and usage #41543

Open
gmarull opened this issue Jan 3, 2022 · 24 comments
Open

Improve header structure and usage #41543

gmarull opened this issue Jan 3, 2022 · 24 comments
Assignees
Labels
Enhancement Changes/Updates/Additions to existing features RFC Request For Comments: want input from the community treewide 🧹

Comments

@gmarull
Copy link
Member

gmarull commented Jan 3, 2022

Is your enhancement proposal related to a problem? Please describe.

As of today one can observe a few issues with respect to C headers in Zephyr:

  • Many C files rely on headers being included somewhere else (indirect dependency)
  • Inclusion list does not follow any particular rules (you can find together local, libc, Zephyr, 3rd party lib headers all mixed).
  • Some headers are not self-contained, meaning you need to include another one in the right order or your build will fail (e.g. sys/time_units.h.
  • Some headers work "à la <windows.h>" way. It happened with device.h and pm/device.h. Another common case is soc.h including HAL headers (see related issue: Improve STM32 LL HAL usage #28822).

All these issues may seem not relevant, but over time this translates to increased compile times, changes in a header that have a domino effect to many source files, etc.

I've opened this issue inspired by https://lwn.net/Articles/880175/.

Describe the solution you'd like

  • First establish certain rules. I think the rules used by Google C++ styleguide are a good starting point. Some of the rules:

    • Respect order of inclusion, for example:
      /* local headers */
      #include "mydir/foo.h"
      
      /* libc headers */
      #include <string.h>
      #include <stdlib.h>
      
      /* Zephyr headers */
      #include <kernel.h>
      #include <drivers/gpio.h>
      
      /* 3rd party libraries */
      #include <lvgl.h>
      Note that clang-format can assist on automatically sorting the inclusion list.
    • Headers should be self-contained. Those which are not should error when included standalone (e.g. using a required definition before inclusion).
    • Avoid forward declarations (unless e.g. to break circular dependencies).
    • Include only what is necessary
  • Gradually improve the current situation

  • Add some minimal CI checks to keep things in good shape

Another interesting change that, I think, would be useful is to place all Zephyr headers into a zephyr folder, so that all Zephyr headers would then be used as:

#include <zephyr/kernel.h>
#include <zephyr/drivers/gpio.h>

This would help a lot to identify Zephyr headers, to create a single clang-format sorting rule, etc.

Describe alternatives you've considered

Let the problem grow and open a PR in the future with 2,297 commits in to fix the issues :-)

Additional context

Include graph for kernel.h as generated by Doxygen (can likely be improved):

image

@gmarull gmarull added Enhancement Changes/Updates/Additions to existing features dev-review To be discussed in dev-review meeting labels Jan 3, 2022
@henrikbrixandersen
Copy link
Member

I generally like the ideas put forward in the description. However, in my experience we have always opted to put local includes first in the compiler include search path, but last in the list of #include directives. This allows for the local includes to have conditionals based on the scope they are included in. I would be interested to hear arguments for putting them first?

Apart from that, the proposed order is the same as what I would recommend.

@gmarull
Copy link
Member Author

gmarull commented Jan 4, 2022

I generally like the ideas put forward in the description. However, in my experience we have always opted to put local includes first in the compiler include search path, but last in the list of #include directives. This allows for the local includes to have conditionals based on the scope they are included in. I would be interested to hear arguments for putting them first?

Apart from that, the proposed order is the same as what I would recommend.

The reason is explained in the Google C++ styleguide:

With the preferred ordering, if the related header dir2/foo2.h omits any necessary includes, the build of dir/foo.cc or dir/foo_test.cc will break. Thus, this rule ensures that build breaks show up first for the people working on these files, not for innocent people in other packages.

dir/foo.cc and dir2/foo2.h are usually in the same directory (e.g., base/basictypes_test.cc and base/basictypes.h), but may sometimes be in different directories too.

Some C projects, e.g. https://github.com/libgit2/libgit2 seem to follow this convention too. Could you provide an example of your usecase?

@gmarull gmarull added the RFC Request For Comments: want input from the community label Jan 4, 2022
@marc-hb
Copy link
Collaborator

marc-hb commented Jan 4, 2022

I understand backward compatibility has always been king in C; explains why we're still stuck with the 50-years old #include. However and as highlighted by this issue, #include leaves a large spectrum of possibilities: from disastrous to somewhat maintainable.

From time to time I've (ab)used gcc -E -dI -dM -dZ ... but surely proper tools / compiler plugins exist to help manage this modularity mess? This problem is universal, affects every single C project above a relatively small size. Policies are great but without the automation to implement and maintain them they never go far.

EDIT: I mean tools to not just order includes but to actually analyse dependencies, think jdeps , https://www.jetbrains.com/help/idea/dependencies-analysis.html, https://www.jarchitect.com/Doc_VS_Arch etc.
Doxygen can produce a .dot graph of includes, maybe that's a good start?

@alwa-nordic
Copy link
Collaborator

There exists a tool for detecting indirect dependencies we could consider using: https://include-what-you-use.org/

Indirect dependencies / transitive includes are a reliance on a implementation detail and can break in the future. E.g:

//// feature_foo.h:
#include <stdbool.h>
// Declare Foo

//// user_of_feature_foo.c:
#include <feature_foo.h>
bool baz; // Accidentally use `bool` from <feature_foo.h>
// Use Foo

If feature_foo.h drops its dependency on stdbool, user_of_feature_foo.c will unexpectedly break.

@gmarull
Copy link
Member Author

gmarull commented Jan 6, 2022

There exists a tool for detecting indirect dependencies we could consider using: include-what-you-use.org

That's an interesting option to look at!

@yperess
Copy link
Collaborator

yperess commented Mar 29, 2022

Personal preferences:

  1. Stick to an existing style guide, it's much easier. Google style guide works, I wouldn't change it.
  2. Adding a zephyr/ prefix to the include path would be fantastic but hard to do (I'm sure we'll get a lot of complaints for breaking downstream projects). That said, I think it's one of those better sooner rather than later so I would vote to just rip the bandaid.

@gmarull
Copy link
Member Author

gmarull commented Mar 29, 2022

Personal preferences:

  1. Stick to an existing style guide, it's much easier. Google style guide works, I wouldn't change it.

Agreed, no reason to reinvent yet another style guide.

  1. Adding a zephyr/ prefix to the include path would be fantastic but hard to do (I'm sure we'll get a lot of complaints for breaking downstream projects). That said, I think it's one of those better sooner rather than later so I would vote to just rip the bandaid.

It should be easy to keep a compatibility option, i.e., include both include/ and include/zephyr.

@gregshue
Copy link

Hmm ... watch what happens when we look at the bigger picture.

First establish certain rules.

I completely agree - but I think your strategy is misdirected.

The Zephyr build system must be able to compile code from outside the Zephyr project that the user cannot modify. In fact, when a platinum user needs to reuse the audit artifacts for a safety or security submission, they must use a specific SHA of zephyr itself. So to fulfill the Zephyr Project Mission Statement, the zephyr ecosystem must explicitly carve out subsets of identifiers in each global namespace.

The header file pathnames are only one example of global namespaces that must be managed. Others include:
⦁ linker symbols,
⦁ module identifiers,
⦁ logging identifiers
⦁ shell commands
⦁ preprocessor symbols,
⦁ Kconfig symbols,
⦁ CMake variables,
⦁ doxygen custom tags (requirement IDs),
⦁ RST symbols,
⦁ testcase.yaml/sample.yaml test identifiers
⦁ etc.

Zephyr already has an established pattern for subsystem/driver/kernel header file pathnames that is dependent on subsystem/driver names and linker symbols. When we solve the linker symbols then we have a pattern that will (mostly) resolve the issues.

What's a suitable pattern? Prefix linker symbols and subsystem/driver names with a multi-character abbreviation of the module name.

I think this will address most of the problems across most (all?) of the global namespaces.

Thoughts?

@mbolivar-nordic
Copy link
Contributor

Zephyr Project Mission Statement

Source?

@mbolivar-nordic
Copy link
Contributor

mbolivar-nordic commented Mar 29, 2022

2. Adding a zephyr/ prefix to the include path would be fantastic but hard to do (I'm sure we'll get a lot of complaints for breaking downstream projects). That said, I think it's one of those better sooner rather than later so I would vote to just rip the bandaid.

This has been debated and rejected by the TSC in the past, just FYI. I don't remember the reason why, but you can start digging here:

Edit: 22 May 2019, found it:

[Anas] Does vote need to be a package or dealt with separately?
[Carles + Alberto] better vote 1st part
[Anas] Wants this to be voted, but would prefer to not raise motion as he wants to vote against.
[Alberto] Raises motion to vote on Stage 1 of the Proposal.
[Carles] Seconds motion to vote
From Alberto Escolar (Demant) to Everyone: (08:41 AM)

+1

From Carles to Everyone: (08:42 AM)

+1

From Michael Scott  to Everyone: (08:42 AM)

-1

From Johan Hedberg to Everyone: (08:42 AM)

-1

From Anas Nashif to Everyone: (08:42 AM)

-1

From Ruud to Everyone: (08:42 AM)

-1

From Asger Munk Nielsen to Everyone: (08:42 AM)

+1

From Nathaniel Graff to Everyone: (08:42 AM)

abstain

From Davide ciminaghi to Everyone: (08:42 AM)

-1

From Johann Fischer to Everyone: (08:42 AM)

-1

From David Leach to Everyone: (08:42 AM)

-1

From Ioannis Glaropoulos to Everyone: (08:42 AM)

abstain

From Maureen Helm to Everyone: (08:42 AM)

abstain

Result: Vote does not pass

https://docs.google.com/document/d/12p8Q4USdExsP3a8lIdhyqG-4vxb44kisccZ9TvlxWBA/edit#

This was a long time ago and the project has grown a lot since then, so it may be worth revisiting this, but it would have to go through the TSC

@gregshue
Copy link

@mbolivar-nordic
Copy link
Contributor

Zephyr Project Mission Statement

Source?

https://www.zephyrproject.org/wp-content/uploads/sites/38/2020/09/CLEAN-LF-Zephyr-Charter-20200624-effective-20200901.pdf

Thanks! This document says:

  1. Mission of the Zephyr Project (“Zephyr,” or, alternatively, the “Project”).
    The mission of the Project is to:
    a. deliver the best-in-class RTOS for connected resource-constrained devices, built
    to be secure and safe.
    b. maintain an auditable code base, while taking advantage of community
    participation; this auditable code base is open source;
    c. include participation of leading members of this ecosystem, including
    micro-controller manufacturers, hardware developers, software developers and
    other members of the ecosystem; and
    d. host the infrastructure for the open source Project and sub-projects, establishing a
    neutral home for community meetings, events and collaborative discussions and
    providing structure around the business and technical governance of the Project.

On the other hand, you wrote:

In fact, when a platinum user needs to reuse the audit artifacts for a safety or security submission, they must use a specific SHA of zephyr itself. So to fulfill the Zephyr Project Mission Statement, the zephyr ecosystem must explicitly carve out subsets of identifiers in each global namespace.

I think there are quite a few unconnected dots between the two.

@nashif
Copy link
Member

nashif commented Mar 30, 2022

This was a long time ago and the project has grown a lot since then, so it may be worth revisiting this, but it would have to go through the TSC

agree. It has been 3 years and we should be able to bring this topic back and revisit.

@gregshue
Copy link

I think there are quite a few unconnected dots between the two.

Yes. Here is my thinking (and I am not an expert in this area). AFAICT:

  • Everybody gets access to the entire codebase on both main and auditable branches, including requirements and tests, so anyone is able (and must be enabled) to generate their own artifacts.

  • The artifacts are only relevant to a particular SHA with a particular toolchain, so they are of very limited use unless they can be reused as part of a safety/security submission. (I believe I heard that mentioned in a long-ago discussion.)

  • The platinum members get access to the audit artifacts, but the silver members do not. The artifacts themselves must have some significant value to motivate the general community to be platinum members rather than silver members. The likely value is the platinum members saving time in generating artifacts for the toolchain(s) and configurations they use.

  • In order for the repository SHAs to be perpetually reusable in a product, the contents must be at least extensible, configurable, and individual component implementations replaceable. Among other things, this requires strategically avoiding conflicts in the global namespaces.

Could Zephyr achieve "best in class" without fulfilling all of this? Temporarily, but the Zephyr Project should concede that competition is likely. The plan for sustaining "best in class" must plan for perpetually addressing shortcomings. Whatever is a recognized shortcoming will eventually be addressed by the Zephyr Project, either because competition already addressed it or they want to avoid it being a competition point.

What am I missing?

@mbolivar-nordic
Copy link
Contributor

What am I missing?

IMO you are missing the other advantages to platinum membership documented here: https://www.zephyrproject.org/become-a-member/

Focusing on the audit artifacts as the only carrot for becoming a platinum member seems a bit odd. But that's just me.

cfriedt added a commit to cfriedt/zephyr that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in zephyrproject-rtos#41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
cfriedt added a commit to cfriedt/zephyr that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in zephyrproject-rtos#41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
cfriedt added a commit to cfriedt/zephyr that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in zephyrproject-rtos#41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
cfriedt added a commit to cfriedt/zephyr that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in zephyrproject-rtos#41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
cfriedt added a commit to cfriedt/zephyr that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in zephyrproject-rtos#41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
cfriedt added a commit that referenced this issue Jul 6, 2023
The POSIX testsuite should be run with standard POSIX
header locations.

Additionally, reorder headers to follow the recommendation
in #41543.

Signed-off-by: Christopher Friedt <cfriedt@meta.com>
@marc-hb
Copy link
Collaborator

marc-hb commented Sep 11, 2023

As of today one can observe a few issues with respect to C headers in Zephyr:

  • ...
  • ...

New item: two .h files directly including each other:

@cfriedt
Copy link
Member

cfriedt commented Oct 30, 2023

Is anyone not on-board with this? Can we incorporate this into coding guidelines prior to LTSv3? I guess there were some treewide changes - what is the progress? Is there anything left to do?

@gmarull - if there are problematic areas, feel free to delegate tasks for others to clean up headers.

@gmarull
Copy link
Member Author

gmarull commented Nov 2, 2023

Is anyone not on-board with this? Can we incorporate this into coding guidelines prior to LTSv3? I guess there were some treewide changes - what is the progress? Is there anything left to do?

@gmarull - if there are problematic areas, feel free to delegate tasks for others to clean up headers.

I think it's a generalized problem, I just try to improve the state of headers when I refactor or add new code. Kernel is specially messy, or the infamous soc.h. I do not have much energy to spend time on this right now.

@dkalowsk
Copy link
Contributor

dkalowsk commented Nov 3, 2023

IWYU is a great tool for automating this kind of work. It does take some leverage to get setup. Having used it a little now, I'd also recommend starting a bit "lighter" with something like https://clang.llvm.org/extra/clang-tidy/checks/misc/include-cleaner.html as I believe there is already some clang-tidy work going on.

josuah added a commit to josuah/zephyr that referenced this issue Jul 28, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax  defined in commit 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
josuah added a commit to josuah/zephyr that referenced this issue Jul 28, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax  defined in commit 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
josuah added a commit to josuah/zephyr that referenced this issue Jul 29, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax from 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
@marc-hb
Copy link
Collaborator

marc-hb commented Sep 4, 2024

josuah added a commit to josuah/zephyr that referenced this issue Sep 5, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax from 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
mmahadevan108 pushed a commit that referenced this issue Sep 5, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax from 5e34681

- Sort the #include list to something like #41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
nagelkl pushed a commit to nagelkl/zephyr that referenced this issue Sep 16, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax from 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
facchinm pushed a commit to arduino/zephyr that referenced this issue Oct 21, 2024
Zephyr drivers have typically one log level defined per class. The video
drivers were making exception. This adds the missing log level for video
drivers.

Since all headers had to be modified, this also:

- Update the log initialization to the new syntax from 5e34681

- Sort the #include list to something like zephyrproject-rtos#41543

Signed-off-by: Josuah Demangeon <me@josuah.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Changes/Updates/Additions to existing features RFC Request For Comments: want input from the community treewide 🧹
Projects
Status: No status
Development

No branches or pull requests