Skip to content

cmake: support generator expression in yaml module #80007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 30, 2025

Conversation

pillo79
Copy link
Collaborator

@pillo79 pillo79 commented Oct 17, 2024

This PR adds support for generator expressions (genexes) in values and lists to the yaml module, along with associated tests. As discussed in #78727, this is a prerequisite for cleaning up and expanding the LLEXT EDK information.

Since the yaml module will also be used from llext-edk.cmake in script mode, this PR expands the current YAML test suite so that it is executed both in CMake project and script modes.

This new functionality can be seen in operation in PR #83705, which rewrites the EDK to use the YAML import.

Implementation details

Generator expression semantics

Values that contain genexes must be stored in the YAML context passing the new GENEX flag to yaml_set() to enable genex expansion on the whole context.

Generator expressions can only be expanded by CMake after all configuration code has been executed and the final values of the project properties are defined. This is a challenge for current yaml_save(), which is expected to write out the YAML file immediately.
To achieve both goals, in a given context name that use generator expressions, the YAML file is written twice:

  • a first partial file is written at config time, from inside yaml_save(), excluding all values that contain a generator expression;
  • a fully-formed file is written at build time, as a build step of a name_yaml_saved target.

If the YAML file is required for further processing by the project, the above target must be listed as a dependency to ensure the full file contents are available.

Lists expanded from a single generator expression

By the time the generator expressions are expanded by CMake, the structure of the YAML file would be already defined, and all items inside a list would get expanded into a single element - definitely not the desired behavior.

To overcome this limitation, in YAML contexts that use generator expressions, lists are not directly expanded into proper JSON lists but stored in their CMake string format with a special marker.
When the yaml_save() function is called, the internal JSON representation is written using file(GENERATE ...) to a temporary file to obtain a seamlessly expanded JSON object. The temporary file is then processed in a build step by a CMake script that reads the intermediate JSON, replaces the marked lists-as-string with the correct list format, and writes the final YAML file.

Review notes

  • the first 2 commits are preparatory refactors that do not change the functionality of the existing code;
  • commits 3 and 4 expand the yaml module tests;
  • commits 5 and 6 introduce generator expression logic and tests, and may be best viewed ignoring whitespace due to indentation changes for large blocks of existing code.

Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing some prototyping of this, for sure this can be beneficial.

Haven't been into all details, but tried to read enough details of the code to understand the flow.

Some overall comments.
I think we should avoid both yaml_save() and yaml_generate().
Users should not care when writing the file whether it contains genex'es or not.

And it should be possible to do multiple writes, for example like this:

yaml_set(...)
yaml_set(...)
yaml_save(...)
yaml_set(... GENEX ...)
yaml_save(...)
yaml_set(...)
yaml_save(...)

I believe we can accomplish that by having each save writing the yaml file with only clean content, and then writing the json with the content and the genex'es.

If there are any genex'es, then the file(GENERATE ...) will be used to transform the internal genex json to the file with evaluated expressions.
The pre-build custom command is then used to convert the final json to final yaml file.
As I don't think file(GENERATE ...) allows to generate the same file multiple times, then we probably need to increment, like <name>_<n>.json.
But this is a detail in this context.

Also remember proper DEPENDS on the custom command.

But I like the idea and direction 👍

Comment on lines 420 to 422
message(FATAL_ERROR "YAML context '${ARG_YAML_NAME}' uses generator expressions."
"Use 'yaml_generate()' to write its contents to a file."
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really what we want ?

I could see cases where we would like to write parts of a yaml file (such as build_info) during configure time even though we don't have all knowledge yet.
A partial yaml file may be useful in cases where CMake generator fails.

Secondly, the caller of yaml_save() may not know if other parts of the build system has decided to use genex'es, for example a genex could suddenly be used by a hal, Zephyr module, etc and thus only be active when said feature is enabled.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to write intermediate, "broken" output files for the various reasons in the review comment. However, I think it would be fine to write to a different file (.partial_N.yaml?) when genexes are detected. This would still provide debugging info, but avoid most of the above breakages.

Copy link
Collaborator

@tejlmand tejlmand Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to write intermediate, "broken" output files for the various reasons in the review comment.

I said nothing about writing broken files. I said partial files.

For example like this:

build_info(zephyr version VALUE 1.2.3)
build_info(foo lib files VALUE $<TARGET_PROPERTY:foo,SOURCES>)
yaml_save(NAME build_info)    # Out file with known content is written immediately
                              # genex'es are evaluated (and written) at generator time.

... more code ...
build_info(foo bar VALUE baz)
yaml_save(NAME build_info)

So during CMake configure and generator time the following happens:

  • First yaml_save() (configure time)
    zephyr:
      version: 1.2.3
    
  • Second yaml_save() (configure time)
    zephyr:
      version: 1.2.3
    foo:
      bar: baz
    
  • Generator time,
    genex from the property evaluation
    zephyr:
      version: 1.2.3
    foo:
      bar: baz
      lib:
        - 'file1.c'
        - 'file2.c'
    

Comment on lines 483 to 484
add_custom_target(${ARG_YAML_NAME}_yaml_filter ALL DEPENDS ${yaml_file}.tmp ${yaml_target})
add_custom_command(TARGET ${yaml_target} PRE_BUILD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like something which can break in case two parts of the build system decides to call yaml_generate() on the same target independent of each other.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I only expected one "save" operation, as it is a target-generating step. What is a possible use case for multiple yaml_generate calls on the same scope?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that you could want to save known data as soon as you have them.
For example, sysbuild could want to save as much info as possible before CMake config step of individual images are executed (and potentially breaks the configure step).

This would provide knowledge to users / IDEs / other tools about configuration files, included images, etc. even if one of those fails.

Copy link
Collaborator Author

@pillo79 pillo79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the early review @tejlmand! 👍

I think we should avoid [having] both yaml_save() and yaml_generate().
Users should not care when writing the file whether it contains genex'es or not.

I tend to disagree, or more specifically, I think that users must know if they want to support genexes or not. This choice results in very different flows:

  • when no genexes are expected, the YAML may be written immediately and its available even during config time; there is no target associated to the file.
  • when genexes may be present, an intermediate non-YAML file must be written after config is done, and then post-processed during build. If the results are required for further evaluation, targets must depend on the final file.
Config time (yaml_save) Build time (yaml_generate) Target / Deps
No genex full contents (same full contents) meaningless
With genex partial, broken full contents required

Unless we drop the immediate file write in yaml_save, IMO merging them has confusing results:

  • Most importantly, the resulting "common save" would have unspecified semantics - the file would be available immediately or as a build step depending on other code. Having two separate functions makes it explicit what the expected flow is. Note that it is harmless to expect genexes but not actually have them.

  • Secondly, generation requires an associated TARGET and this must be provided to the "common save" [1] when genexes are expected. I could write it anyway and skip the generation code if no target was given, but should I warn the user (noisy in your example!) or not (a "silent" bug)?

  • Finally, the build info YAML is re-loaded from disk to support sysbuild; in this case, a partially written file may be unparsable or introduce unexpected dynamics at best. What if, for example, I read in a genex from another CMake instance?

Let me know your thoughts on these issues.


Note [1]: To avoid that I also tried adding a scoped property early via yaml_create(TARGET ...), but this would require doing the same to yaml_load(), and a yaml_load(TARGET ...) is confusing at best.

Comment on lines 420 to 422
message(FATAL_ERROR "YAML context '${ARG_YAML_NAME}' uses generator expressions."
"Use 'yaml_generate()' to write its contents to a file."
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to write intermediate, "broken" output files for the various reasons in the review comment. However, I think it would be fine to write to a different file (.partial_N.yaml?) when genexes are detected. This would still provide debugging info, but avoid most of the above breakages.

Comment on lines 483 to 484
add_custom_target(${ARG_YAML_NAME}_yaml_filter ALL DEPENDS ${yaml_file}.tmp ${yaml_target})
add_custom_command(TARGET ${yaml_target} PRE_BUILD
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I only expected one "save" operation, as it is a target-generating step. What is a possible use case for multiple yaml_generate calls on the same scope?

@tejlmand
Copy link
Collaborator

I think that users must know if they want to support genexes or not.

How can the Zephyr top-level CMakeLists.txt here:

yaml_save(NAME build_info)

know if some arbitrary HAL decides to call build_info() that includes a genex ?

And the same go for any other yaml save usage.
Code should be flexible enough that you can just decide to add a genex to your yaml call without having to change some other location where the yaml_save() is called.

@pillo79
Copy link
Collaborator Author

pillo79 commented Jan 9, 2025

@tejlmand Here's v2, review ready:

  • rebased to take advantage of YAML script mode support
  • dropped yaml_generate() and implemented the required functionality for yaml_save():
    • immediately writes a file with comments in place of genex entries
    • full file is generated as a pre-build step of the specified TARGET
  • added full tests for the new yaml_save() functionality

This code is used in #83705 to rewrite the EDK to read the same data from build_info.yml and zephyr/.config instead of the command line arguments.

@pillo79 pillo79 marked this pull request as ready for review January 9, 2025 08:56
@pillo79
Copy link
Collaborator Author

pillo79 commented Jan 13, 2025

@tejlmand friendly ping: comments addressed, ready for review, the next PR (EDK rework, #83705) is also up-to-date.

Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initial review of the updated PR.

Thanks for the effort, highly appreciated 👍
And I also appreciate the updated tests.

I haven't done in-depth testing, but done some basic testing in areas which caught my attention.

Incremental builds which changes contents to the yaml / genexes are not properly updated.
This can be seen with the following command execution:

$ cmake -GNinja -DBOARD=native_sim -Bbuild tests/cmake/yaml/
$ ninja -C build
$ cat build/test_setting_genexes.yaml
cmake:
  test:
    set:
      key-list-string-genex:
       - A
       - new
       - list
       - of
       - strings
      key-string: fixed string
      key-string-genex: expanded genex

# Update a CMakeLists.txt file, and re-build
$ sed -i 's/expanded genex/expanded genex update/g' tests/cmake/yaml/CMakeLists.txt

$ ninja -C build
ninja: Entering directory `build'
[0/1] Re-running CMake...
...
$ cat build/test_setting_genexes.yaml
cmake:
  test:
    set:
#     key-list-string-genex: @YAML-LIST@;A;new;$<TARGET_PROPERTY:app,expanding_list>;strings
      key-string: fixed string
#     key-string-genex: $<TARGET_PROPERTY:app,expanding_str>

@@ -265,6 +288,12 @@ function(yaml_set)

zephyr_get_scoped(json_content ${ARG_YAML_NAME} JSON)

if(DEFINED ARG_YAML_LIST
OR LIST IN_LIST ARG_YAML_KEYWORDS_MISSING_VALUES
OR ARG_YAML_APPEND)
Copy link
Collaborator

@tejlmand tejlmand Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this technically causes any use of the APPEND keyword to make the content to be treated as a list, even when list the keyword VALUE is used.

This observation also let me to test current and new implementation.
And it seems that a construct like yaml_set(NAME <name> KEY foo APPEND VALUE bar) ignores the APPEND keyword and sets foo: bar in the yaml in current zephyr/main. (should have raised an error of invalid keyword).

But with this change, then it silently changes foo from being a single value to now be treated as a list, which gives a very different / unexpected behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the current code did not properly cover the edge cases and did a bit of both. I though the conversion was desired; I'll implement the error indication instead.

Copy link
Collaborator

@tejlmand tejlmand Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the current code did not properly cover the edge cases

Yep, what I discovered as well. Proper error handling for APPEND VALUE was missed, so best to inform users this structure is invalid.

I'll implement the error indication instead.

Great 👍

Comment on lines 437 to 440
message(WARNING "YAML context '${ARG_YAML_NAME}' uses generator expressions "
"which require a TARGET parameter to 'yaml_save()'. GENEX "
"entries will be included as comments in the output."
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is problematic.

How can a common save call, like here

yaml_save(NAME build_info)
know if there are Zephyr modules adding vendor specific entries which uses generator expressions ?

In this particular CMake code, the hal adds a vendor specific entry:

build_info(vendor-specific nordic svdfile VALUE ${SOC_SVD_FILE})

Here the CMake code is in-tree, but for another Zephyr module then such code could be located anywhere and thus outside possible knowledge of the top-level CMake list code. And the code could be using genex'es.

For file(GENERATE ...) then the TARGET argument is only important if the genex'es contains expression like: $<COMPILE_FEATURES:features> or $<TARGET_PROPERTY:prop>, but for $<TARGET_PROPERTY:prop> then users should explicit specify the target, that is use $<TARGET_PROPERTY:tgt,prop> instead.

I think we should state it as a requirement that genex'es working on targets must use genex'es which allows to specify the target itself in the genex, that is:

  • Supported: $<TARGET_PROPERTY:tgt,prop>
  • Not supported: $<TARGET_PROPERTY:prop>

or did you require some special genex'es which you need when writing the implementation and thus cannot follow above requirement ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you require some special genex'es which you need when writing the implementation and thus cannot follow above requirement?

No, that is a great alternative AFAICS. I think I added the TARGET parameter initially as a scheduling trick and ended up "requiring" it for no strong reason.

@@ -419,10 +488,12 @@ function(to_yaml json level yaml)
string(JSON type TYPE "${json}" ${member})
string(JSON subjson GET "${json}" ${member})
if(type STREQUAL OBJECT)
#--------------------
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow reason for this and similar comments ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They highlight the main "switch-like" conditions in to_yaml and helped me navigate the many nested levels in the conversion code. Will remove them in the next push 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's what I though.

But also feel free to adjust them into proper code comments, as that can help others understand the flow or intention.

)

set_property(TARGET app PROPERTY expanding_str "expanded genex")
set(new_value "$<TARGET_PROPERTY:app,expanding_str>")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper way of using genex for target properties, which means target is not needed as argument to the file(GENERATE ...) call 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only that had not been written by me... 🤦‍♂️ 😁

TARGET ${ARG_YAML_TARGET}
)

add_custom_command(TARGET ${ARG_YAML_TARGET} PRE_BUILD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be mixing concepts.

It both has the output file as a dependency in custom target, which is then supposed to drive the custom command.
But this custom command has no OUTPUT, only BYPRODUCTS.
Instead the custom command is a PRE_BUILD step which is normally used for build targets, such as library or executable.

I believe this should be using Generating Files signature and not the build event signature.

Afaict this is also the reason why incremental builds including changes to CMake files doesn't update the yaml file correctly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an area where I struggled a lot with CMake. I will look for a better way, IIRC I had issues with the generation signature because it was being run too early or too late, but I will test again with your reproducible test case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an area where I struggled a lot with CMake.

A lot of people do, but so far you seem to generally get it right 🙂

my guess in this case is that you lacked the depends to the temp json file.

If you have a depends there, then changes to the tmp json file, which happens during configure time, should result in an updated yaml file during build, which I show in #80007 (comment).

Comment on lines 461 to 462
add_custom_command(TARGET ${ARG_YAML_TARGET} PRE_BUILD
BYPRODUCTS ${yaml_file}
Copy link
Collaborator

@tejlmand tejlmand Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_custom_command(TARGET ${ARG_YAML_TARGET} PRE_BUILD
BYPRODUCTS ${yaml_file}
add_custom_command(
OUTPUT ${yaml_file}
DEPENDS ${json_file}

this change should address the issue regarding incremental builds.


FILE(GENERATE OUTPUT ${json_file}
CONTENT "${json_content}"
TARGET ${ARG_YAML_TARGET}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this as you are creating a custom target below which seems to be sufficient to support the need for genex'es.

Unless that there are use-cases unknown to me, so feel free to reply 🙂 .

Copy link
Collaborator Author

@pillo79 pillo79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review and the reproducible test case @tejlmand. Will look into the issue and implement your suggestions.

@@ -265,6 +288,12 @@ function(yaml_set)

zephyr_get_scoped(json_content ${ARG_YAML_NAME} JSON)

if(DEFINED ARG_YAML_LIST
OR LIST IN_LIST ARG_YAML_KEYWORDS_MISSING_VALUES
OR ARG_YAML_APPEND)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the current code did not properly cover the edge cases and did a bit of both. I though the conversion was desired; I'll implement the error indication instead.

Comment on lines 437 to 440
message(WARNING "YAML context '${ARG_YAML_NAME}' uses generator expressions "
"which require a TARGET parameter to 'yaml_save()'. GENEX "
"entries will be included as comments in the output."
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you require some special genex'es which you need when writing the implementation and thus cannot follow above requirement?

No, that is a great alternative AFAICS. I think I added the TARGET parameter initially as a scheduling trick and ended up "requiring" it for no strong reason.

@@ -419,10 +488,12 @@ function(to_yaml json level yaml)
string(JSON type TYPE "${json}" ${member})
string(JSON subjson GET "${json}" ${member})
if(type STREQUAL OBJECT)
#--------------------
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They highlight the main "switch-like" conditions in to_yaml and helped me navigate the many nested levels in the conversion code. Will remove them in the next push 👍

)

set_property(TARGET app PROPERTY expanding_str "expanded genex")
set(new_value "$<TARGET_PROPERTY:app,expanding_str>")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only that had not been written by me... 🤦‍♂️ 😁

TARGET ${ARG_YAML_TARGET}
)

add_custom_command(TARGET ${ARG_YAML_TARGET} PRE_BUILD
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is an area where I struggled a lot with CMake. I will look for a better way, IIRC I had issues with the generation signature because it was being run too early or too late, but I will test again with your reproducible test case.

@pillo79 pillo79 force-pushed the pr-yaml-genex branch 3 times, most recently from 208ea08 to f0df4f3 Compare January 20, 2025 09:40
@pillo79
Copy link
Collaborator Author

pillo79 commented Jan 20, 2025

@tejlmand thanks for the help and hints! I was indeed missing the JSON dependency. That and the more robust add_custom_command variant fixed your sample case, regardless of JSON temp file removal.

Here's v3, review ready:

  • removed the TARGET argument from yaml_save()
  • switched to the more common add_custom_command(OUTPUT ... signature
  • implemented multiple generation steps with counters
  • <name>_yaml_saved target added at every yaml_save() call, so downstream code is unaware of genex use
  • tests improved to check for final values with multiple saves mixed with changes
  • filter script now removes all intermediate JSON files

Main push is the first one, the other 2 only updated comments, commit messages or whitespace issues.

EDIT: Downstream PR updated to use this, including the build_info_yaml_saved dependency.

@pillo79 pillo79 requested a review from tejlmand January 20, 2025 11:49
Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update.

Looks good, and will give a +1 when I have had the time to do some extra testing.

Only a minor nit noticed, but that is not blocking.

Thanks for the effort 👍

endif()

if(ARG_YAML_APPEND AND NOT key_is_list)
message(FATAL_ERROR "APPEND can only be used with LIST")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, adjusting the message slightly will align the error message to those constructed with zephyr_check_arguments_X() macros.

Suggested change
message(FATAL_ERROR "APPEND can only be used with LIST")
message(FATAL_ERROR "${CMAKE_CURRENT_FUNCTION}(APPEND ...) can only be used with argument: LIST"

@pillo79
Copy link
Collaborator Author

pillo79 commented Jan 20, 2025

Minor update, the EDK PR highlighted an issue when running cmake on the same directory multiple times.
Had to keep the current JSON file in the folder, not sure if deleting others is useful at this point...
Used the opportunity to fix the message as well. EDIT: ...and forgot to add the matching ) 🤦‍♂️ .

This introductory commit refactors the `yaml_set` function separating
the bodies of list operations into internal functions while preserving
their original behavior.

The conditions that cause the value to be interpreted as a list are also
verified only once, avoiding multiple checks.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
This introductory commit removes the ignored 'actual' variable from the
'yaml_set()' calls in the tests to make the code cleaner. Also, a badly
indented block is fixed in the 'message()' macro.

No functional change is introduced by this commit.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
This test verifies that the APPEND LIST operation in yaml_set() works as
expected.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
To verify proper functionality, the YAML module test suite is now run
twice: once in the regular CMake project mode and once in the CMake
script mode.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
This commit adds support for generator expressions in values and lists
to the yaml module.

Generator expressions can only be expanded by CMake after all
configuration code has been executed and the final values of the project
properties are defined. This means that contexts that contain generator
expressions are written twice:

 - immediately, during the 'yaml_save()' call, a comment with the raw
   unexpanded string is saved instead of the key that uses generator
   expressions in the YAML file;

 - after the configuration step, a custom command updates the YAML file
   contents with the fully expanded values.

This two-step process also allows to overcome the issue of lists that
are extracted from generator expressions, whose elements would be
expanded into a single string if written directly to the YAML file.
Instead, the lists are stored in their CMake string format with a
special marker, expanded by CMake into a temporary JSON file, and the
conversion to a proper list is performed during the build step.

If the saved YAML file for context <name> is needed by further build
steps in this project, the target '<name>_yaml_saved' must be added as a
dependency to ensure the final contents are ready.

Note that when generator expressions are used in the context, the GENEX
keyword must be provided to yaml_set(). This is necessary to avoid
storing the genexes as raw strings in the YAML.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
Add tests to the YAML suite for generator expressions. The test checks
that generator expressions are handled as expected at different stages:

 - immediately after yaml_save(), the file must contain only non-genex
   entries;
 - at a later time, after the target has been built, the file must
   contain all the expanded genex values.

Signed-off-by: Luca Burelli <l.burelli@arduino.cc>
@pillo79
Copy link
Collaborator Author

pillo79 commented Jan 29, 2025

Resynced with main, since other work was done while I was patiently waiting for a review.
@tejlmand can I get an update?

Copy link
Collaborator

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Did system testing and have found no issues.

A minor simplification possibility, but not blocking.

Thanks 👍

Comment on lines +479 to +481
get_property(temp_files TARGET ${save_target} PROPERTY temp_files)
list(APPEND temp_files ${json_file})
set_property(TARGET ${save_target} PROPERTY temp_files ${temp_files})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be simplified to:

Suggested change
get_property(temp_files TARGET ${save_target} PROPERTY temp_files)
list(APPEND temp_files ${json_file})
set_property(TARGET ${save_target} PROPERTY temp_files ${temp_files})
set_property(TARGET ${save_target} APPEND PROPERTY temp_files ${json_file})

@kartben kartben merged commit 3741a0b into zephyrproject-rtos:main Jan 30, 2025
24 checks passed
@pillo79 pillo79 deleted the pr-yaml-genex branch January 30, 2025 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants