Enable variable to be categorical #409

@dlebauer

Note: this is open for discussion and is currently on hold pending feedback from the people who will collect categorical data.

Some traits are categorical - these are commonly used by breeders. The species table imported from USDA Plants has a number of categorical traits (e.g. 'PropagatedBySeed').

For the TERRAREF program, we need to allow categorical variables, and have decided to track these in the traits table in order to capture the who / where / when metadata (this is not captured in the species table). Categorical traits include some traits that could be quantified but are not quantified in practice, such as 'seed color' and 'maturity class'.

Proposed solution:

To support the collection of such data we will create two new fields in the variables table:

  1. 'categorical' of type boolean
  2. 'options' of type array
    • arrays will have fields value, name, definition
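As a rough illustration of the two proposed fields, here is a minimal Python sketch of how a variable record with 'categorical' and 'options' might look in memory. The class names `Option` and `Variable` and the validation rules are assumptions made for this sketch; they are not part of the existing schema or of any agreed design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Option:
    """One entry of the proposed 'options' array (hypothetical class)."""
    value: int
    name: str
    definition: str


@dataclass
class Variable:
    """Hypothetical in-memory view of a row in the variables table."""
    name: str
    description: str
    categorical: bool = False
    options: List[Option] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed rule: a categorical variable without options cannot be decoded.
        if self.categorical and not self.options:
            raise ValueError(f"{self.name}: categorical variables need options")
        # Assumed rule: each stored value maps to exactly one option.
        values = [opt.value for opt in self.options]
        if len(values) != len(set(values)):
            raise ValueError(f"{self.name}: duplicate option values")


# The maturity class example from this issue.
maturity_class = Variable(
    name="maturity_class",
    description="maturity class",
    categorical=True,
    options=[
        Option(0, "early", "senesces in < 100 days after planting"),
        Option(1, "late", "senesces in > 100 days after planting"),
    ],
)
```

In the database itself the options would presumably live in an array or similar column; the sketch only shows the shape of the data and the kind of checks that might be worth enforcing.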

When returning data, we can look up the option name from the stored value. As an example, the variable maturity class would be recorded as:

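| field | value |
|---|---|
| id | 999 |
| description | maturity class |
| units | NULL |
| notes | Maximum rate of RuBP regeneration. |
| name | maturity_class |
| min | 0 |
| max | 1 |
| type | trait |
| categorical | TRUE |
| options | see the table below |

options:

| value | option_name | definition |
|---|---|---|
| 0 | early | senesces in < 100 days after planting |
| 1 | late | senesces in > 100 days after planting |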
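The lookup step could be sketched as follows, assuming the options are available as a list of records with the keys shown in the example. The function name `decode` and the constant `MATURITY_OPTIONS` are hypothetical and used only for illustration.

```python
# Options for maturity_class, copied from the example above.
MATURITY_OPTIONS = [
    {"value": 0, "option_name": "early",
     "definition": "senesces in < 100 days after planting"},
    {"value": 1, "option_name": "late",
     "definition": "senesces in > 100 days after planting"},
]


def decode(stored_value, options):
    """Return the option name for a stored numeric value (hypothetical helper)."""
    by_value = {opt["value"]: opt["option_name"] for opt in options}
    try:
        return by_value[stored_value]
    except KeyError:
        # Reject values that are not listed in the options array.
        raise ValueError(f"{stored_value!r} is not a valid option") from None


print(decode(0, MATURITY_OPTIONS))  # early
print(decode(1, MATURITY_OPTIONS))  # late
```

Storing the numeric value and decoding on output keeps the trait record numeric, while the human-readable name and definition stay with the variable.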

Other options

  1. add new columns to cultivars for each characteristic (this is not normalized and requires migrations)
  2. add a cultivars_characteristics table and a characteristics table (to avoid a table that is > 90% sparse, like species)
  3. figure out how to convert all categorical traits to numeric (e.g. in the above example, record 'senescence' as an observation, with days after planting computed as observation date - planting date)
    • categories such as 'maturity group' can then be computed 'on the fly'
  4. (can be combined with above) use BMS to store categorical variables.
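Option 3 can be sketched in Python as follows: the senescence observation is stored as a date, days after planting is derived from it, and the category is computed on demand. The function names are hypothetical. The 100-day threshold comes from the maturity class example, and treating exactly 100 days as 'late' is an assumption, because the example leaves that case undefined.

```python
from datetime import date

# Threshold from the maturity class example.
# Assumption: exactly 100 days counts as "late" (the example does not say).
LATE_THRESHOLD_DAYS = 100


def days_after_planting(planting_date: date, observation_date: date) -> int:
    """Days after planting, computed as observation date - planting date."""
    delta = (observation_date - planting_date).days
    if delta < 0:
        raise ValueError("observation date is before planting date")
    return delta


def maturity_class(planting_date: date, senescence_date: date) -> str:
    """Derive the category on the fly instead of storing it."""
    dap = days_after_planting(planting_date, senescence_date)
    return "early" if dap < LATE_THRESHOLD_DAYS else "late"


planted = date(2016, 5, 1)
print(days_after_planting(planted, date(2016, 8, 1)))  # 92
print(maturity_class(planted, date(2016, 8, 1)))       # early
print(maturity_class(planted, date(2016, 9, 1)))       # late
```

With this approach only the numeric observation is stored, so the category boundaries can change later without rewriting any data.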

I would appreciate feedback from @nfahlgren and @terraref/standards-committee.
