Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulator: rename measurement column to simulation #199

Merged
merged 5 commits into from
May 7, 2023

Conversation

dilpath
Copy link
Member

@dilpath dilpath commented Apr 22, 2023

What's your opinion on whether to default to measurement or simulation? Either is fine for me. Currently I am in favor of keeping measurement, because:

  • I have mostly used this for synthetic data generation
  • It won't break user code

However, I see that simulation might be what users expect from a simulator.

see also: AMICI-dev/AMICI#2050 and AMICI-dev/AMICI#2051

@dilpath dilpath requested review from fbergmann and dweindl April 22, 2023 13:02
@codecov-commenter
Copy link

codecov-commenter commented Apr 22, 2023

Codecov Report

Merging #199 (3acdeef) into develop (1964ca1) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #199      +/-   ##
===========================================
+ Coverage    76.25%   76.28%   +0.03%     
===========================================
  Files           34       34              
  Lines         3155     3159       +4     
  Branches       764      765       +1     
===========================================
+ Hits          2406     2410       +4     
  Misses         552      552              
  Partials       197      197              
Impacted Files Coverage Δ
petab/simulate.py 82.45% <100.00%> (+1.32%) ⬆️

@dweindl
Copy link
Member

dweindl commented Apr 22, 2023

However, I see that simulation might be what users expect from a simulator.

That.

* I have mostly used this for synthetic data generation

Maybe we should split it into a Simulator and a SyntheticDataGenerator, each with their natural default?

* It won't break user code

Fair point, but I'd say it is not so widely used yet, that I'd rather avoid confusing future users.

I would change rename to as_measurement or as_simulation, depending on the final choice of the default.

@fbergmann
Copy link
Collaborator

Looks good to me, though i of course would prefer to have the default for rename set to True, since I'd mainly expect a simulation dataframe when calling simulate.

@@ -115,6 +115,7 @@ def simulate(
self,
noise: bool = False,
noise_scaling_factor: float = 1,
rename: bool = False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you consider changing this to True?

@dilpath
Copy link
Member Author

dilpath commented May 3, 2023

Thanks for the feedback! I opted to default to as_measurement=False, and otherwise keep it as-is.

The simulator now enforces the column name, such that the tool (e.g. AMICI/COPASI) naming choice of 'measurement' or 'simulation' doesn't affect the output here.

Copy link
Member

@dweindl dweindl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to specify/document what simulate_without_noise is supposed to return - measurement or simulation.
To me it would feel most natural to have simulation for no-noise, and measurement for with-noise. I don't know if this might be considered confusing by others.

@dilpath
Copy link
Member Author

dilpath commented May 5, 2023

Would be good to specify/document what simulate_without_noise is supposed to return - measurement or simulation.

Hm, at the moment the PEtab spec states

(optional) A simulation file, which has the same format as the measurement file, but contains model simulations [TSV]

But this looks like a typo, since the simulation column name is used in practice.

Currently, the docstring for simulate_without_noise specifies measurement (as in the spec):

Simulated data, as a PEtab measurements table, which should be equivalent to replacing all values in the :const:petab.C.MEASUREMENT column of the measurements table (of the PEtab problem supplied to the :meth:petab.simulate.Simulator.__init__ method), with simulated values.

But the simulate method now enforces a specific column name, so the user always sees consistent column names.

To me it would feel most natural to have simulation for no-noise, and measurement for with-noise. I don't know if this might be considered confusing by others.

Currently, as user can choose the column name, with as_measurement. Are you suggesting that I remove as_measurement and instead use Simulator.simulate(noise=True/False) determine the column name? The user is never meant to use simulate_without_noise, rather use Simulator.simulate(noise=True/False).

@dweindl
Copy link
Member

dweindl commented May 5, 2023

Would be good to specify/document what simulate_without_noise is supposed to return - measurement or simulation.

Hm, at the moment the PEtab spec states

(optional) A simulation file, which has the same format as the measurement file, but contains model simulations [TSV]

But this looks like a typo, since the simulation column name is used in practice.

Agreed, should be changed.

Currently, the docstring for simulate_without_noise specifies measurement (as in the spec):

Simulated data, as a PEtab measurements table, which should be equivalent to replacing all values in the :const:petab.C.MEASUREMENT column of the measurements table (of the PEtab problem supplied to the :meth:petab.simulate.Simulator.__init__ method), with simulated values.

Sorry, I missed that. I accidentally looked at TestSimulator, which didn't have an informative docstring. All good.

To me it would feel most natural to have simulation for no-noise, and measurement for with-noise. I don't know if this might be considered confusing by others.

Currently, as user can choose the column name, with as_measurement. Are you suggesting that I remove as_measurement and instead use Simulator.simulate(noise=True/False) determine the column name? The user is never meant to use simulate_without_noise, rather use Simulator.simulate(noise=True/False).

No, I think it's clearly specified then.

@dilpath dilpath merged commit bd0176f into develop May 7, 2023
@dilpath dilpath deleted the simulator_rename_measurement_column branch May 7, 2023 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants