Merged
23 commits
133 changes: 53 additions & 80 deletions doc/v2/_static/petab_schema_v2.yaml
@@ -22,96 +22,67 @@ properties:
File name (absolute or relative) or URL to PEtab parameter table
containing parameters of all models listed in `problems`. A single
table may be split into multiple files and described as an array here.
problems:
type: array
description: |
One or multiple PEtab problems (sets of model, condition, observable
and measurement files). If different model and data files are
independent, they can be specified as separate PEtab problems, which
may allow more efficient handling. Files in one problem cannot refer
to models entities or data specified inside another problem.
items:

type: object
description: |
A set of PEtab model, observable and measurement
files and optional condition, experiment, and visualization files.

properties:

model_files:
type: object
description: One or multiple models

# the model ID
patternProperties:
"^[a-zA-Z_]\\w*$":
type: object
properties:
location:
type: string
description: Model file name or URL
language:
type: string
description: |
Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb'
required:
- location
- language
additionalProperties: false

measurement_files:
type: array
description: List of PEtab measurement files.

items:
type: string
description: PEtab measurement file name or URL.

condition_files:
type: array
description: List of PEtab condition files.
model_files:
type: object
Contributor:

Why is this an object when the other file entries are arrays? Should this not be an array as well?

Member:

So far it's:

model_files:
  some_id:
    location: ...
    language: ...

This could be changed to

model_files:
  - id: some_id
    location: ...
    language: ...

I am fine with either, but it's out of scope of this PR.
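
To make the two layouts discussed above concrete, here is a minimal sketch of converting between them. The helper names and the file name `model.xml` are hypothetical and appear only for illustration; nothing here is part of the schema.

```python
from typing import Any


def mapping_to_list(model_files: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Convert the ID-keyed form into the list-of-objects form."""
    return [{"id": model_id, **spec} for model_id, spec in model_files.items()]


def list_to_mapping(models: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Convert the list form back into the ID-keyed form.

    In the list form, duplicate IDs are possible and must be rejected explicitly.
    """
    result: dict[str, dict[str, Any]] = {}
    for entry in models:
        spec = dict(entry)  # copy so the caller's data is not modified
        model_id = spec.pop("id")
        if model_id in result:
            raise ValueError(f"duplicate model ID: {model_id}")
        result[model_id] = spec
    return result


# Hypothetical example data; "model.xml" is a placeholder file name.
keyed = {"some_id": {"location": "model.xml", "language": "sbml"}}
as_list = mapping_to_list(keyed)
```

One practical difference the sketch shows: the keyed form carries unique model IDs by construction once parsed, whereas the list form needs an explicit uniqueness check.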

description: One or multiple models

items:
# the model ID
patternProperties:
"^[a-zA-Z_]\\w*$":
Contributor:

Is this regular expression correct? This should be ^[a-zA-Z_]\w*$, i.e. only a single \

Member:

It's correct. This is just a regular string. It is interpreted as a regex only elsewhere.
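
The escaping can be checked with a small sketch. It assumes that the YAML parser decodes `\\` inside a double-quoted scalar to a single backslash; Python's JSON decoder, which applies the same rule to double-quoted strings, stands in for a YAML parser here so that only the standard library is needed.

```python
import json
import re

# The scalar exactly as written in the schema file: two backslashes
# inside double quotes.
raw_scalar = r'"^[a-zA-Z_]\\w*$"'

# Assumption: JSON decoding mirrors how a YAML parser treats "\\"
# in a double-quoted scalar.
pattern = json.loads(raw_scalar)

# Only at this point is the decoded string interpreted as a regex.
model_id_re = re.compile(pattern)
```

After decoding, `pattern` holds a single backslash, so the regex engine sees `\w` as intended.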

type: object
properties:
location:
type: string
description: PEtab condition file name or URL.
description: Model file name or URL
language:
type: string
description: |
Model language, e.g., 'sbml', 'cellml', 'bngl', 'pysb'
Contributor:

PySB is a programming-language-specific format, right? This might be outside the scope of this PR, but it feels like we should stick to formats that are programming-language-independent and exchangeable.

Member:

I agree with both - strong preference for exchangeable, and out of scope here. Nonetheless, if somebody really wants to use PySB models directly, I'd still prefer they are able to use PEtab, so that at least the rest is more standardized and portable.

Collaborator Author (@FFroehlich, Jun 6, 2025):

Well, SBML is also formatted in XML.

Contributor:

> PySB models directly, I'd still prefer they are able to use PEtab, so that at least the rest is more standardized and portable.

It feels like this should then rather be up to implementations? Otherwise it would also be fair to mention frameworks such as Catalyst.jl and ModelingToolkit.jl here (which PEtab.jl supports). So how about we just remove pysb here?

Member:

> Feels like this should then rather be up to implementations?

Whether they want to support it: yes, definitely. I don't think there will ever be a tool that supports everything.

> So how about we just remove pysb here?

It is listed there so that, if several implementations support it, they all use the same language ID (to avoid tool A expecting 'pysb', tool B expecting 'PySB', ...). For that reason, I'd rather keep it here. We can explicitly discourage using non-standard formats here or somewhere else.

> Otherwise it would also be fair to mention frameworks such as Catalyst.jl and ModelingToolkit.jl here (which PEtab.jl supports).

I have no practical experience there, but if it is meaningful to add those, and a given Julia file yields a clearly identifiable model (i.e., it is clear which object is to be used), those could be included for the same reason. But again, I am strongly in favor of sticking to widely supported formats whenever practicable.
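
The point about a shared spelling of language IDs could look as follows on the tool side. This is only a sketch: the IDs are the ones named in the schema description, while the function `check_language_id` and its behavior are hypothetical and not prescribed by PEtab.

```python
# Language IDs taken from the schema description; the check itself is hypothetical.
KNOWN_LANGUAGE_IDS = frozenset({"sbml", "cellml", "bngl", "pysb"})


def check_language_id(language: str) -> str:
    """Return the ID unchanged, or raise if it differs from a known ID only by case."""
    if language in KNOWN_LANGUAGE_IDS:
        return language
    canonical = language.lower()
    if canonical in KNOWN_LANGUAGE_IDS:
        raise ValueError(f"language {language!r} should be spelled {canonical!r}")
    # Unknown languages pass through; whether to support them is up to the tool.
    return language
```

With such a check, a problem file using 'PySB' is flagged early instead of being silently accepted by one tool and rejected by another.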

Contributor:

Fair points! And given this I think adding at least Catalyst.jl here would be a valid option

Contributor:

I am against listing too much here; I would go with 'sbml', 'cellml', 'bngl'.

Member:

I am fine with just about anything here, as long as sbml stays in there. However, I am very much in favor of resolving that somewhere else. This has been in place since #538 and should not block this pull request.

required:
- location
- language
additionalProperties: false

measurement_files:
type: array
description: List of PEtab measurement files.

experiment_files:
type: array
description: List of PEtab experiment files
items:
type: string
description: PEtab measurement file name or URL.

items:
type: string
description: PEtab experiment file name or URL.
condition_files:
type: array
description: List of PEtab condition files.

observable_files:
type: array
description: List of PEtab observable files.
items:
type: string
description: PEtab condition file name or URL.

items:
type: string
description: PEtab observable file name or URL.
experiment_files:
type: array
description: List of PEtab experiment files

visualization_files:
type: array
description: List of PEtab visualization files.
items:
type: string
description: PEtab experiment file name or URL.

items:
type: string
description: PEtab visualization file name or URL.
observable_files:
type: array
description: List of PEtab observable files.

mapping_files:
type: array
description: List of PEtab mapping files.
items:
type: string
description: PEtab observable file name or URL.

items:
type: string
description: PEtab mapping file name or URL.
mapping_files:
type: array
description: List of PEtab mapping files.

required:
- model_files
- observable_files
- measurement_files
items:
type: string
description: PEtab mapping file name or URL.

extensions:
type: object
@@ -128,7 +99,7 @@ properties:
type: string
pattern: ^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$
required:
type: bool
type: boolean
description: |
Indicates whether the extension is required for the
mathematical interpretation of the problem.
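
As a quick sanity check of the version `pattern` shown in this hunk, the following sketch compiles it with Python's `re` module and tries a few strings. The function name `is_valid_extension_version` and the sample version strings are illustrative assumptions, not taken from the PR.

```python
import re

# The pattern copied verbatim from the schema, split over two lines for readability.
VERSION_PATTERN = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$"
)


def is_valid_extension_version(version: str) -> bool:
    """Return True if the whole string matches the schema's version pattern."""
    return VERSION_PATTERN.fullmatch(version) is not None
```

For example, `1.0.0` and `2!1.0rc1.post2.dev3` are accepted, while `01.0` (leading zero) and `v1.0` (prefix) are rejected.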
@@ -142,4 +113,6 @@ properties:
required:
- format_version
- parameter_file
- problems
- model_files
- observable_files
- measurement_files
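
The effect of flattening on the top-level `required` list can be sketched as a plain key check. This is not a replacement for validating against the schema itself; the function names are hypothetical, and the file names and version string in the example are placeholders.

```python
# Required top-level keys after this change, as listed in the hunk above.
REQUIRED_TOP_LEVEL_KEYS = (
    "format_version",
    "parameter_file",
    "model_files",
    "observable_files",
    "measurement_files",
)


def missing_required_keys(problem: dict) -> list[str]:
    """List required top-level keys that are absent from a parsed problem file."""
    return [key for key in REQUIRED_TOP_LEVEL_KEYS if key not in problem]


def uses_nested_problems_list(problem: dict) -> bool:
    """Detect the pre-flattening layout, which nested files under `problems`."""
    return "problems" in problem


# Hypothetical flattened problem description (placeholder values).
flat_problem = {
    "format_version": "2.0.0",
    "parameter_file": "parameters.tsv",
    "model_files": {"model1": {"location": "model1.xml", "language": "sbml"}},
    "observable_files": ["observables.tsv"],
    "measurement_files": ["measurements.tsv"],
}
```

A file that still uses the nested layout would report `model_files`, `observable_files`, and `measurement_files` as missing at the top level.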
99 changes: 83 additions & 16 deletions doc/v2/documentation_data_format.rst
@@ -53,7 +53,7 @@ text-based files in `Tab-Separated Values (TSV)
<https://www.iana.org/assignments/media-types/text/tab-separated-values>`_
format (Figure 2), including:

- A :ref:`model <v2_model>` file specifying the base model
- One or multiple (optional) :ref:`model <v2_model>` file(s) specifying the base model(s)
[SBML, CELLML, BNGL, PYSB, ...].

- A :ref:`measurement file <v2_measurements_table>` containing experimental
@@ -111,12 +111,20 @@ should not alter the definition of the estimation problem itself.
parameters, but excluding parameters listed in the
:ref:`v2_parameters_table` (independent of their ``estimate`` value).

.. _v2_changes:

Changes from PEtab 1.0.0
------------------------
PEtab 2.0.0 is a major update of the PEtab format. The main changes are:

* The :ref:`PEtab YAML problem description <v2_problem_yaml>` is now
mandatory and its structure changed. In particular, the `problems` list
has been flattened.
* Support for models in other formats than SBML (:ref:`v2_model`).
* The use of different models for different measurements is now
supported via the optional ``modelId`` column in the
:ref:`v2_measurements_table`, see also :ref:`v2_multiple_models`.
This was poorly defined in PEtab 1.0.0 and probably not used in practice.
* The (now optional) conditions table format changed from wide to long
(:ref:`v2_conditions_table`).
* ``simulationConditionId`` and ``preequilibrationConditionId`` in the
@@ -205,10 +213,10 @@ PEtab distinguishes between three types of entities:
Conditions table
----------------

The optional conditions table defines discrete changes to the model. These (sets of)
changes typically represent interventions, perturbations, or changes in the
environment of the system of interest. These modifications are referred to as
(experimental) *conditions*.
The optional conditions table defines discrete changes to the simulated model(s).
These (sets of) changes typically represent interventions, perturbations, or
changes in the environment of the system of interest. These modifications are
referred to as (experimental) *conditions*.

Conditions are applied at specific time points, which are defined in the
:ref:`v2_experiments_table`. This allows for the specification of time
@@ -533,6 +541,15 @@ Detailed field description
Numeric values or parameter names are allowed. Same rules apply as for
``observableParameters`` in the previous point.

- ``modelId`` [PETAB_ID, OPTIONAL, REFERENCES(yaml.models.model_id)]

Which model to simulate for each data point. Model IDs are defined by the
keys of the `models` object in the PEtab problem YAML file.
This column is required when multiple models are defined in the PEtab
problem.
For problems with a single model, this column is optional,
and its values default to the ID of the only model present.
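
The defaulting rule for ``modelId`` described above can be sketched as follows. Rows are represented as plain dictionaries and an empty cell as `None` or an empty string; both representations, and the function name, are assumptions made for this illustration.

```python
def resolve_model_ids(measurement_rows: list[dict], model_ids: list[str]) -> list[str]:
    """Return the model ID to simulate for each measurement row.

    A missing or empty modelId is only allowed when exactly one model is defined,
    in which case it defaults to that model's ID.
    """
    if not model_ids:
        raise ValueError("no models defined")
    resolved = []
    for index, row in enumerate(measurement_rows):
        model_id = row.get("modelId")
        if model_id is None or model_id == "":
            if len(model_ids) != 1:
                raise ValueError(
                    f"row {index}: modelId is required when multiple models are defined"
                )
            model_id = model_ids[0]
        elif model_id not in model_ids:
            raise ValueError(f"row {index}: unknown modelId {model_id!r}")
        resolved.append(model_id)
    return resolved
```

With a single model, rows without a ``modelId`` resolve to that model; with several models, the same rows are rejected.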

.. _v2_simulation_table:

Simulation table
@@ -608,8 +625,7 @@ Detailed field description
* ``observableFormula`` [STRING]

Observation function as plain text formula expression.

The expression may contain any symbol defined in the model,
The expression may contain any symbol defined in a model,
the mapping table or the parameter table.
Often, this is just the ID of a state variable.
Furthermore, any parameters introduced through the ``observablePlaceholders``
@@ -750,7 +766,7 @@ and *must not* include:
- "Parameters" that are not *constant* entities (e.g., in an SBML model,
the targets of *AssignmentRules* or *EventAssignments*)
- Any parameters that do not have valid PEtab IDs
(e.g., SBML *local* parameters)
(e.g., SBML *local* parameters that are not mapped in the mapping table)

it *may* include:

@@ -789,7 +805,7 @@ Detailed field description
- ``parameterId`` [PETAB_ID, REQUIRED]

The ``parameterId`` of the parameter described in this row. This has to match
the ID of a parameter specified in the SBML model, a parameter introduced
the ID of a parameter specified in at least one model, a parameter introduced
as override in the condition table, or a parameter occurring in the
``observableParameters`` or ``noiseParameters`` column of the measurement table
(see above).
@@ -976,8 +992,8 @@ Detailed field description

- ``modelEntityId`` [STRING or empty, REQUIRED]

A globally unique identifier defined in the model, or empty if the entity is
not present in the model. This does not have to be a valid PEtab identifier.
A globally unique identifier defined in any model, or empty if the entity is
not present in any model. This does not have to be a valid PEtab identifier.
Rows with empty ``modelEntityId`` serve as annotations only.

For example, in SBML, local parameters may be referenced as
@@ -1015,15 +1031,66 @@ easy validation:
.. literalinclude:: _static/petab_schema_v2.yaml
:language: yaml

.. _v2_multiple_models:

Parameter estimation problems combining multiple models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Parameter estimation problems can comprise multiple models. For now, PEtab
allows one to specify multiple models with corresponding condition and
measurement tables, and one joint parameter table. This means that the parameter
namespace is global. Therefore, parameters with the same ID in different models
will be considered identical.
Purpose
+++++++

PEtab supports defining multiple models within a single problem specification. This
feature is designed to enable users to define experiment-specific model variants or
submodels. Rather than implementing a single global, parameterized model, users can
define multiple smaller, self-contained models that differ structurally as needed.

This approach offers several benefits:

- Simplified model definition for users, as each variant can be independently
specified.
- Improved simulation performance for tool developers, as smaller models can be
simulated more efficiently.

Scope and Application
+++++++++++++++++++++

While multiple models are intended to be applied to different experiments, model
selection is specified at the level of individual data points in the
:ref:`v2_measurements_table`. This design enables:

- Reuse of experiments across models.
- Fine-grained model-to-data assignment.

With the exception of the :ref:`v2_measurements_table`, all other PEtab tables apply
to all models. Parameters listed in the parameter table are defined globally and
shared across all models. In contrast, entries in all other tables implicitly define
model-specific instances of observables, conditions, experiments, etc., with their
respective PEtab IDs existing in local, model-specific namespaces. Each PEtab
subproblem defined in this way must constitute a valid PEtab problem on its own.

This design has several implications:

- A single experiment may need to be simulated with different models for
different measurements. However, a single simulation of a given experiment
is always performed using one single model.
- Each model may be associated with a distinct subset of experiments.
- The number of conditions to be simulated for a model-specific instance
of an experiment may vary across models.
- Each parameter defined in the :ref:`v2_parameters_table` has a shared value
Contributor:

It may be worth mentioning that model-specific parameters can be achieved via the condition table: parameters k_m1 and k_m2 can be mapped to k in the condition table, so that k takes different values in different models.

Collaborator Author:

Isn't that already described elsewhere?

Contributor:

It is described in the context of the condition table, but unless you are really familiar with the PEtab standard, I am unsure users will make the connection to multiple models.

Collaborator Author:

What's noteworthy about the interaction between multi-model and condition assignment that it needs to be mentioned here? I imagine that users who want to assign things in specific conditions will check the condition table description and users who want multiple models will check the multi-model description.

Contributor (@sebapersson, Jul 9, 2025):

I am not entirely sure that users not very familiar with PEtab will realize that they can use the condition table to assign model-specific values to SBML parameters, such as setting different values for a parameter like k1 across different models. Based on the discussions at the meeting, I had the impression that users might be particularly interested in this kind of use case. This is just a suggestion, and if you feel it is unnecessary, feel free to disregard it.

Member:

I think it would be great to have some how-tos in the documentation later on that demonstrate how to address certain application problems in PEtab, but to keep the actual specs rather compact.

Contributor:

I think that is an excellent point that would help users create more tricky PEtab setups (we should probably open a separate issue for this)!

across all models. Parameters not listed in the parameter table do not share
values, which can result in model-specific instantiations of model observables
referencing these parameters.
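
The first implication in the list above (one experiment may be simulated with several models, but each simulation uses exactly one model) can be illustrated by grouping measurement rows. The column name ``experimentId`` and the function name are assumptions for this sketch; rows are plain dictionaries with ``modelId`` already resolved.

```python
from collections import defaultdict


def plan_simulations(measurement_rows: list[dict]) -> dict[tuple[str, str], list[str]]:
    """Group observables by (experiment, model) pair.

    Each key corresponds to one simulation: a single experiment run with a
    single model. The values list the observables evaluated on that simulation.
    """
    plan: dict[tuple[str, str], list[str]] = defaultdict(list)
    for row in measurement_rows:
        key = (row["experimentId"], row["modelId"])
        plan[key].append(row["observableId"])
    return dict(plan)
```

If the same experiment appears with two different model IDs, the plan contains two separate entries for it, one per model.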

Validation Rules
++++++++++++++++

For any given model, only those experiments and observables that appear in the
same rows of the :ref:`v2_measurements_table` need to be valid. This means that all
symbols used in the corresponding ``observableFormula`` and all symbols assigned
in the associated condition definitions must be defined in the model.

Conditions and observables that are not applied to a model do not need to be
valid for that model.
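
A minimal sketch of the observable part of this rule, under simplifying assumptions: the symbols of each ``observableFormula`` are assumed to be extracted already, and the set of symbols available to each model (from the model itself, the mapping table, and the parameter table) is assumed to be given. All names are hypothetical.

```python
def undefined_symbols_per_model(
    measurement_rows: list[dict],
    observable_symbols: dict[str, set[str]],
    model_symbols: dict[str, set[str]],
) -> dict[tuple[str, str], set[str]]:
    """Find symbols used by an observable but unavailable in the model it is applied to.

    Only (model, observable) pairs that occur together in a measurement row
    are checked; unused combinations are never inspected.
    """
    problems: dict[tuple[str, str], set[str]] = {}
    for row in measurement_rows:
        model_id = row["modelId"]
        observable_id = row["observableId"]
        missing = observable_symbols[observable_id] - model_symbols[model_id]
        if missing:
            problems.setdefault((model_id, observable_id), set()).update(missing)
    return problems
```

An observable that references symbols of one model only is therefore fine as long as no measurement row pairs it with a different model.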


.. _v2_objective_function: