Off the back of a couple of recent conversations on PRs and the open issue #129, here's a suggestion to get the ball rolling on a strategy for constants. This is straight-up biased, as I'm suggesting we adopt something very like what I have in pyrealm, but if there are strong arguments for a different approach I'm equally happy to update pyrealm!

Links to those conversations: CarbonPool model: Minimal carbon pool model #134 (comment)

Constants proposal
This is a sketch for how I think we could structure the "constants". These values are not all 'constant' constants, but they are things that would be constant for a simulation or multiple sets of simulations.
So my starting points are these statements:

- Constants should not be hard coded into functions and should be exposed in some way that allows them to be configured for a simulation. That seems mad for things like the gas constant, but for many constants - such as coefficients of empirical processes - we will want to be able to explore the effects of changing defaults.
- Each model will need its own constants. We previously discussed a single core.constants.py module. We might still want that for shared or global constants, but it won't fly for the modular models approach: a custom BaseModel needs to be able to provide its own constants without needing to hack the core.constants module. It should be plug and play.
- Functions or methods that use calculations should accept an argument that provides the constants used within the function, so that there is a clear way to pass a specific set of those constants into the calculations. Ultimately that means that higher level functions (such as model __init__ methods) should also include the various constant arguments so they can pass them on to called functions.
So, as an example of how I think we could do this, let's say we have a new Hunting model that has a core function to model hunted biomass of a location as a function of distance from habitation and topographic complexity. That might then have a hunting/constants.py file containing:
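(A minimal sketch of that dataclass; the field names match hunting_pressure below and the default values are purely illustrative.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HuntingConstants:
    """Constants for the hunting model."""

    intercept: float = 1.0
    dist_slope: float = -0.2
    topo_slope: float = -0.1
    interaction: float = 0.05
```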
And then a hunting/models.py containing (fragmentary code ahead...):
```python
from __future__ import annotations

from numpy.typing import NDArray

from hunting.constants import HuntingConstants

# BaseModel and Data are imported from the core package (imports elided here)


def hunting_pressure(
    topo: NDArray,
    dist: NDArray,
    hunting_constants: HuntingConstants = HuntingConstants(),
) -> NDArray:
    return (
        hunting_constants.intercept
        + hunting_constants.dist_slope * dist
        + hunting_constants.topo_slope * topo
        + hunting_constants.interaction * dist * topo
    )


class HuntingModel(BaseModel):
    model_name = 'hunting'

    def __init__(
        self,
        data: Data,
        ...,
        hunting_constants: HuntingConstants = HuntingConstants(),
    ) -> None:
        ...
        self.data = data
        # The set of constants are a key attribute of the model instance
        self.hunting_constants = hunting_constants
        ...

    def update(self, ...) -> None:
        ...
        # The constants get passed on to other functions used within the model
        pressure = hunting_pressure(
            self.data['topo'], self.data['dist'], self.hunting_constants
        )
        ...

    @classmethod
    def from_config(cls, data: Data, ..., config: dict) -> HuntingModel:
        ...
        # If the model config changes constant defaults, extract them and use
        # them in creating the returned model instance
        hunting_const: dict = {}
        if 'constants' in config and 'HuntingConstants' in config['constants']:
            hunting_const = config['constants']['HuntingConstants']
        ...
        return cls(data, ..., hunting_constants=HuntingConstants(**hunting_const))
```
Programmatically, you can then just use the defaults or adjust them:
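(A sketch reusing the names above; the adjusted value is arbitrary.)

```python
# Use the defaults throughout
model = HuntingModel(data, ...)

# Or override specific constants for a simulation
model = HuntingModel(
    data, ..., hunting_constants=HuntingConstants(dist_slope=-0.5)
)
```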
Then we have to allow users to configure this for simulations run from a command line configuration. We can add a constants section to module JSONSchema definitions so that users can alter constants in the model config:
```toml
[hunting.constants.HuntingConstants]
dist_slope = 7
```
Then, the from_config factory method can intercept that section of the configuration and use it to initialise the HuntingConstants instance for the model instance.
There are at least two areas that seem possibly iffy.
JSONSchema and dataclass synchronisation
There is overlap in the role of the dataclass, which is the programmatic API, and the JSONSchema for the module configuration, which sets up the programmatic API from a configuration. I don't have a good handle on how to avoid breaking DRY.
We could just have the JSON schema allow hunting.constants.HuntingConstants to be any arbitrary dictionary and then use a try block in from_config to handle badly configured constants. Then everything is defined by the dataclass, which is clean and DRY, but it does mean that the use of JSONSchema to clean the configuration is inconsistent.
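A sketch of how that try could look inside the from_config fragment above (the error handling here is just illustrative):

```python
try:
    hunting_constants = HuntingConstants(**hunting_const)
except TypeError as excep:
    # Unknown or misspelled constant names raise TypeError from the
    # dataclass __init__, so badly configured constants surface here
    raise ValueError(f'Bad constants in configuration: {excep}') from excep
```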
At the other end, we could duplicate the HuntingConstants dataclass definition in the JSONSchema, right down to the default values, types, array sizes etc. Then the configuration gets cleaned before it goes near the from_config factory, but it is highly repetitive of the data structure definition in the dataclass and not remotely DRY.
'Bundling' constants
With any remotely complex model, there are likely to be several sets of parameters for several different core functions. If all of these are defined as individual dataclasses then it seems like you'd end up with potentially very long __init__ methods with multiple different constant dataclasses as arguments.
Here I'd lean towards having bundled constant dataclasses. So the HuntingConstants dataclass might contain all of the different 'constants' used in HuntingModel and you only have one or two arguments for setting constants in __init__ (HuntingConstants and maybe CoreConstants). That does mean that functions get passed a dataclass containing a whole load of parameters they don't need, along with the ones they do, but it seems cleaner. If it turns out that they get huge and splitting them into a couple of logical groups makes life easier, then that's fine too.
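As a sketch of the bundled approach, extending the illustrative HuntingConstants above with a hypothetical second group of constants:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HuntingConstants:
    # Constants used by hunting_pressure
    intercept: float = 1.0
    dist_slope: float = -0.2
    topo_slope: float = -0.1
    interaction: float = 0.05

    # Constants used by a hypothetical second core function,
    # e.g. prey population recovery
    recovery_rate: float = 0.1
```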
But you could have lots of dataclasses in hunting.constants, which are all unique to a specific function. I don't see that being as clean an interface, but there may be an obvious solution.
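For comparison, the per-function alternative would look something like this (both class names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HuntingPressureConsts:
    """Constants used only by hunting_pressure."""

    intercept: float = 1.0
    dist_slope: float = -0.2
    topo_slope: float = -0.1
    interaction: float = 0.05


@dataclass(frozen=True)
class PreyRecoveryConsts:
    """Constants used only by a hypothetical prey recovery function."""

    recovery_rate: float = 0.1
```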