Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add auto-generated Configuration schema from Pydantic model #16518

Draft
wants to merge 28 commits into
base: dev
Choose a base branch
from

Conversation

davelopez
Copy link
Contributor

@davelopez davelopez commented Aug 4, 2023

Follow up to #16514

I was manually creating the model when I realized we already have the config_schema.yml. It would be much easier to maintain if we could update the model whenever we introduce a new config parameter. As wisely suggested by Marius, I will approach this by first generating the initial basic Pydantic Model from the config schema, and then, the idea is to use the model as the single source of truth and generate the schema from it.

TODO

  • Generate initial Pydantic model from current config_schema.yml
  • Manually curate the model:
    • Include any missing fields that are serialized in the API but not in the YAML schema (computed fields), etc.
    • Create different views of the Configuration model, initially:
      • ComputedGalaxyConfig: contains common Galaxy configuration options computed at runtime (not defined in the YAML config schema).
      • UserExposableComputedGalaxyConfig / AdminExposableComputedGalaxyConfig: mainly used for discriminating admin/non-admin models, but can be used to extend the common ComputedGalaxyConfig with admin-only options.
      • AdminExposableGalaxyConfig/UserExposableGalaxyConfig: contains (admin-only or common) configuration options that are exposed through the API.
      • FullGalaxyConfig: contains all the options in the YAML configuration file. This model will be used to generate the YAML schema. These options are not exposed to the API.
    • Add stricter types using literal values instead of strings
    • Include Fix: serialize tool_shed_urls directly from the API #16561 in model
    • Mark deprecated options as deprecated.
    • Take by_host dynamic config options into account Allow various configuration parameters to be set per host. #12328
    • Annotate path_resolves_to options
    • Annotate reloadable options
    • Figure out why converter tests are failing
      • After some digging, the problem was caused by use_remote_user getting a string (email) instead of a boolean as a value assigned from the single_user option in some cases. I decided to fix it by casting to the bool value which is more explicit c9a6794
    • Fix API schema linting with custom annotations (per_host, path_resolves_to, etc...)
  • Generate config_schema.yml from the Full model

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    • TBD

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek
Copy link
Member

mvdbeek commented Aug 5, 2023

Opening PR to gather feedback on this approach.

This is amazing, I love it. How about we turn this around and make the pydantic model the source of truth from which we generate the other stuff ... it should be much more expressive than what we have currently, and the model_validators would let us for instance easily flag stuff that can't be combined.

And we can easily define what can and can't be exposed through the API inline.

@davelopez
Copy link
Contributor Author

davelopez commented Aug 6, 2023

Thanks so much @mvdbeek! That is exactly what I needed to hear 😄 👍

@davelopez davelopez force-pushed the add_configuration_api_schema_model branch from 609401b to 7efd287 Compare August 6, 2023 12:03
@davelopez davelopez changed the title Add auto-generated Configuration API schema model Add auto-generated Configuration schema from Pydantic model Aug 6, 2023
@davelopez davelopez force-pushed the add_configuration_api_schema_model branch 6 times, most recently from a18cb33 to 329989c Compare August 10, 2023 13:05
@davelopez davelopez force-pushed the add_configuration_api_schema_model branch from 91d7206 to 7a20cd0 Compare August 23, 2023 18:18
From the configuration schema.
And invoke `config-api-schema` when building the config options and updating the client API schema.
This model is not yet properly generated as it exposes everything.
- Fix some encoding issues
- Use Annotated for fields
See test/integration/test_data_manager.py::TestDataManagerIntegration::test_data_manager_installation_table_reload
The documentation says that the option is called `extended_celery` but
the tests were using `celery_extended`. The logic will accept both
options, but the tests should be consistent with the documentation.
The validation logic needs to live in the manager class because otherwise the configuration package will need to depend on the schema, pydantic, etc.
When validating the config file, is a list of objects but when serialized to the API is a dictionary.
The option `use_remote_user` was used an declared as bool in the codebase, but it was assigned the value of `single_user` in some cases and that option can be either a bool or an email address (str) causing the type validation to fail.
This is somewhat controversial, but otherwise the generated schema will have weird spacing and unwanted line breaks in all descriptions.
@davelopez davelopez force-pushed the add_configuration_api_schema_model branch from c9a6794 to b27ee9a Compare September 11, 2023 16:04
@mvdbeek
Copy link
Member

mvdbeek commented Sep 12, 2023

Discussed this in the backend channel, we're all on board with having the pydantic models as the source of truth, and that if necessary, we can replace the tooling built on pykwalify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

2 participants