Units of Measurement

HankHerr-NOAA edited this page Jul 25, 2024 · 14 revisions

The WRES uses a formal units of measurement library that supports the Unified Code for Units of Measure (UCUM), described at https://ucum.org/ucum.html. So long as the units of measurement are compatible with UCUM, the WRES can parse them, recognize their dimensions, and convert them where appropriate. For example, if two units share the same dimensions (e.g., [M]/[T], or mass/time) but differ in magnitude (e.g., kg/ms and [lb_av]/d), the WRES can convert between them, provided it recognizes both.

When a unit string in the evaluation data does not follow UCUM syntax and cannot be parsed, the declaration can "alias" it to a UCUM unit. For example, the declaration can tell the WRES that "the unit kilograms per millisecond in the evaluation data should be treated as kg/ms", where kg/ms is recognized because it follows UCUM. Finally, for some units common in hydrological vernacular, the WRES has an internal alias map that allows them to be understood without explicit declaration; these include CFS, CMS, FT, and KCFS, among others.

More details are provided below, along with a description of how to "alias" units in the evaluation data to units meeting UCUM standards so that the WRES recognizes the units and dimensions.

What do I do when the units are not recognized by the WRES?

When a unit is not recognized, an UnrecognizedUnitException will be thrown. The message should explain that a unit_aliases declaration is required for WRES to recognize the unit.

Suppose the observed and predicted datasets both use the unit name kilograms per millisecond and the declared desired unit for pairs is [lb_av]/d (avoirdupois pounds per day; https://ucum.org/ucum.html#para-39). The WRES can perform this conversion once it knows what kilograms per millisecond means in terms of UCUM. Here is the appropriate declaration:

unit_aliases:
  - alias: 'kilograms per millisecond'
    unit: 'kg/ms'

This declaration states that, "Whenever you find the unit of measurement string kilograms per millisecond, consider it to be an alias for the UCUM unit of measurement string kg/ms."
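As a rough check on the arithmetic behind this conversion, the sketch below works out the factor from kg/ms to [lb_av]/d by hand. It does not use the WRES or its units library; the function name `kg_per_ms_to_lb_per_day` is hypothetical, and the only constants assumed are the definitions of the avoirdupois pound (0.45359237 kg) and the day (86400 s).

```python
# Illustrative arithmetic only; the WRES performs this conversion internally.
KG_PER_LB_AV = 0.45359237      # kilograms in one avoirdupois pound, [lb_av]
MS_PER_DAY = 86_400 * 1_000    # milliseconds in one day, d


def kg_per_ms_to_lb_per_day(value: float) -> float:
    """Convert a mass flow from kg/ms to [lb_av]/d (hypothetical helper)."""
    kg_per_day = value * MS_PER_DAY    # change the time unit: ms -> d
    return kg_per_day / KG_PER_LB_AV   # change the mass unit: kg -> [lb_av]


if __name__ == "__main__":
    # 1 kg/ms is roughly 1.9e8 [lb_av]/d
    print(kg_per_ms_to_lb_per_day(1.0))
```

Both units lie along the same dimension, [M]/[T], so the conversion reduces to a single multiplicative factor.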

A demo for validation and conversion of UCUM units is available at https://ucum.nlm.nih.gov/ucum-lhc/demo.html (though it does not highlight deprecated UCUM units). This site makes it easier to find the unit name you are looking for from plain words.

What does the message INFO Units Treating measurement unit name... mean?

You may see messages like the following in logs:

`2021-10-20T13:51:26.958+0000 INFO Units Treating measurement unit name 'CMS' as UCUM unit 'm3/s' along dimension '[L]³/[T]'`  

or

`2021-10-20T14:30:28.829+0000 INFO Units Treating measurement unit name 'kg m-2' as UCUM unit 'kg/m2' along dimension '[M]/[L]²'`

This message clarifies which unit the WRES uses during an evaluation by reporting the underlying formal unit and its dimension. In the first example message, the WRES interprets the string CMS as the unit that UCUM names m3/s, whose dimension is volume (length cubed) over time. In the second example message, the WRES interprets the string kg m-2 as the unit that UCUM names kg/m2, whose dimension is mass over area (length squared). Sometimes the unit name and the UCUM unit will be identical in this message.

This information matters because of the default unit aliases inside WRES, the caller-supplied unit aliases, and the potential interactions between them. When setting unit aliases, the message helps confirm that the WRES is using the intended unit; it is also a useful check when no unit aliases are set. The UCUM unit printed in the message is informational, but it can also be used directly as the value of the unit key in a unit_aliases declaration. The dimension is printed to help the user judge whether this is the correct unit.

A WRES default map from common strings to formal UCUM units has educated guesses based on experience with datasets at the Office of Water Prediction (OWP). For many datasets at OWP, the map will make the correct guess but your dataset may be different. This message makes it possible to verify that WRES appears to be using the correct interpretation.
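To review these messages across a long log, a small script can pull out the unit name, UCUM unit, and dimension from each line. The following is a minimal sketch that assumes the message has exactly the shape of the two example lines above; the function name `parse_unit_message` is hypothetical and other log layouts are not handled.

```python
import re

# Pattern modelled on the two example log lines shown above (an assumption:
# the message layout may differ between WRES versions).
UNIT_MESSAGE = re.compile(
    r"Treating measurement unit name '(?P<name>[^']*)' "
    r"as UCUM unit '(?P<ucum>[^']*)' "
    r"along dimension '(?P<dimension>[^']*)'"
)


def parse_unit_message(line: str):
    """Return (name, ucum_unit, dimension), or None if the line does not match."""
    match = UNIT_MESSAGE.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("ucum"), match.group("dimension")


if __name__ == "__main__":
    example = (
        "2021-10-20T13:51:26.958+0000 INFO Units Treating measurement unit "
        "name 'CMS' as UCUM unit 'm3/s' along dimension '[L]³/[T]'"
    )
    print(parse_unit_message(example))
```

The extracted UCUM unit can then be compared against the unit you intended for that dataset.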

What is meant by recognize units versus convert units?

To recognize a unit means the WRES software can create a formal unit of measurement representation using a formal unit of measurement library from a given unit string. In other words, to recognize a unit means that WRES software knows the dimension of the unit (e.g., length, time, volume, etc.) and its relative magnitude compared to other units along that same dimension.

To convert a unit means the WRES software converts from one unit to another. This requires WRES to recognize both units and also that it can perform a sensible conversion between those two already-recognized units.
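The distinction can be sketched with a toy model in which recognizing a unit means mapping its string to a dimension and a scale, and converting means checking that two recognized units share a dimension before applying the ratio of their scales. This is not the library the WRES uses; the `Unit` class, the `REGISTRY` table, and the functions `recognize` and `convert` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recognized unit: exponents of (length, mass, time) and a scale to SI."""
    dimension: tuple  # e.g. (3, 0, -1) for [L]^3/[T]
    scale: float      # multiply a value by this to express it in m, kg, s


# A tiny hand-written table standing in for a real UCUM library.
REGISTRY = {
    "m": Unit((1, 0, 0), 1.0),
    "m3/s": Unit((3, 0, -1), 1.0),
    "[ft_i]3/s": Unit((3, 0, -1), 0.3048 ** 3),
}


def recognize(unit_string: str) -> Unit:
    """Recognize: obtain the dimension and relative magnitude of a unit string."""
    try:
        return REGISTRY[unit_string]
    except KeyError:
        raise ValueError(f"Unrecognized unit: {unit_string!r}") from None


def convert(value: float, source: str, target: str) -> float:
    """Convert: requires both units to be recognized and to share a dimension."""
    src, tgt = recognize(source), recognize(target)
    if src.dimension != tgt.dimension:
        raise ValueError(f"Cannot convert {source!r} to {target!r}")
    return value * src.scale / tgt.scale


if __name__ == "__main__":
    print(convert(1.0, "m3/s", "[ft_i]3/s"))  # about 35.31
```

In this model, a string such as CFS fails at the recognition step, which is the situation that unit aliases address.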

Most of this wiki has to do with getting WRES to recognize units. Unit conversions are the purpose of unit recognition, so in that sense unit conversions are related and mentioned here. But for the most part the trouble is in making sure that the unit name used in a dataset is understood by WRES accurately. Once the units are recognized, conversions become possible, and conversions along the same dimension are going to work easily at that point.

When are no unit conversions required by WRES?

When all datasets in a declaration (observed, predicted, and optional baseline and covariates) have a unit of measurement, e.g. "ASDF", and the declaration declares the same unit of measurement, e.g. "ASDF", then WRES software treats it as identity and no unit conversion will be attempted. Because no unit conversion is attempted, no attempt is made by WRES to understand or recognize the unit, e.g. "ASDF", because there is no need. In this special case, arbitrary unit names are OK, so long as they match exactly (case-sensitive) between all the datasets and the declared desired unit of measurement. In this special case, you will also see a WARN message such as, "Unknown unit '-' may cause unit conversion issues", but then no subsequent exceptions related to units or unit conversions.
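The identity case amounts to an exact, case-sensitive string comparison. A minimal sketch, using a hypothetical helper `conversion_required` that is not part of the WRES:

```python
def conversion_required(dataset_units, desired_unit: str) -> bool:
    """True if any dataset unit name differs from the desired unit name.

    The comparison is exact and case-sensitive, so 'ASDF' and 'asdf' differ.
    """
    return any(name != desired_unit for name in dataset_units)


if __name__ == "__main__":
    print(conversion_required(["ASDF", "ASDF"], "ASDF"))  # False
    print(conversion_required(["ASDF", "asdf"], "ASDF"))  # True
```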

Are there some examples of UCUM unit strings?

The WRES uses UCUM unit strings that are case-sensitive. This table of examples, which is not exhaustive, is intended as an introduction to UCUM for WRES users. In most cases the table builds from simpler to more complex units, and the footnotes after the table link to a UCUM-provided explanation of each newly introduced complication. For example, the first unit links to the definition of a unit, and the second unit builds on this by using a prefix for that unit and links to the UCUM section regarding prefixes.

| UCUM Unit | Description |
| --- | --- |
| m | Meters [1] |
| mm | Millimeters [2] |
| [ft_i] | Feet [3] (international, US customary, USGS current [4]) |
| [ft_us] | US Survey Feet [5] |
| [in_i] | Inches |
| s | Seconds [6] |
| min | Minutes [7] |
| h | Hours [8] |
| d | Days [9] |
| m/s | Meters per Second [10] |
| [ft_i]/s | Feet per Second |
| m3 | Cubic Meters [11] |
| [ft_i]3 | Cubic Feet (international, US customary, USGS current [12]) |
| [ft_us]3 | Cubic US Survey Feet |
| m3/s | Cubic Meters per Second |
| [ft_i]3/s | Cubic Feet per Second |
| 1000.[ft_i]3/s | Thousands of Cubic Feet per Second [13] |
| 1000.[ft_us]3/s | Thousands of Cubic US Survey Feet per Second |
| K | Kelvin |
| Cel | Degrees Celsius |
| [degF] | Degrees Fahrenheit |

1: https://ucum.org/ucum.html#para-28 (UCUM §28 base units)
2: https://ucum.org/ucum.html#para-27 (UCUM §27 prefixes)
3: https://ucum.org/ucum.html#para-34 (UCUM §34 international customary units)
4: https://pubs.usgs.gov/tm/tm3-a8/tm3a8.pdf#page=13 (USGS conversions on page 13 imply international foot)
5: https://ucum.org/ucum.html#para-35 (UCUM §35 U.S. survey lengths)
6: https://ucum.org/ucum.html#para-28 (UCUM §28 base units)
7: https://ucum.org/ucum.html#para-31 (UCUM §31 other units from ISO 1000, ISO 2955 and ANSI X3.50)
8: https://ucum.org/ucum.html#para-31 (UCUM §31 other units from ISO 1000, ISO 2955 and ANSI X3.50)
9: https://ucum.org/ucum.html#para-31 (UCUM §31 other units from ISO 1000, ISO 2955 and ANSI X3.50)
10: https://ucum.org/ucum.html#para-7 (UCUM §7 algebraic unit terms)
11: https://ucum.org/ucum.html#para-9 (UCUM §9 exponents)
12: https://pubs.usgs.gov/tm/tm3-a8/tm3a8.pdf#page=13 (USGS conversions on page 13 imply international foot)
13: https://ucum.org/ucum.html#para-7 (UCUM §7 algebraic unit terms)

Easily find more units at https://ucum.nlm.nih.gov/ucum-lhc/demo.html (but then double-check in the UCUM documentation that they are not deprecated)

Why Does C Still Work? How can I make C mean Coulomb instead?

WRES has a default map of unit aliases to corresponding UCUM strings. For example, C maps to Cel and F maps to [degF]. What if you want C to mean C (Coulomb) or F to mean F (Farad) instead of the default WRES map to degrees Celsius and degrees Fahrenheit? Any declared alias takes precedence over the default map. Therefore, if you want C to mean Coulomb instead of degrees Celsius, declare the following to force WRES to use the UCUM definition of unit C:

unit_aliases:
  - alias: 'C'  # Force the WRES to recognize 'C' as...
    unit: 'C'   # ...the UCUM unit 'C' (coulomb). The unit always indicates the UCUM unit.

Limitations or Discrepancies Between UCUM, WRES, and the NIH Demo

The library that WRES uses (as of 2021-10-26) does not appear to interpret UCUM powers of numeric literals. For example, 10^6 or 10*6 would typically mean one million or 1000000 in UCUM terms. The workaround is to use the full number such as 1000000 until the library (and WRES in turn) supports it. The metric prefix k would also work with the library and the NIH demo at https://ucum.nlm.nih.gov/ucum-lhc/demo.html, but is prohibited for non-metric units when strictly interpreting UCUM. For example, k[ft_i]3/s (like KCFS) might work, but strictly speaking should be 1000.[ft_i]3/s because [ft_i] is not metric.
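If unit strings arrive with powers of ten already written in UCUM form, the workaround can be applied mechanically before the strings are placed in a declaration. The sketch below is an assumption about how one might do that, not a WRES feature; the helper `expand_powers_of_ten` is hypothetical and only handles non-negative exponents.

```python
import re

# Matches 10*N or 10^N with a non-negative integer exponent, but not when the
# "10" is the tail of a longer number such as 110*6.
_POWER_OF_TEN = re.compile(r"(?<!\d)10[*^](\d+)")


def expand_powers_of_ten(unit_string: str) -> str:
    """Rewrite 10*N or 10^N as the full integer, e.g. 10*6 -> 1000000."""
    return _POWER_OF_TEN.sub(lambda m: str(10 ** int(m.group(1))), unit_string)


if __name__ == "__main__":
    print(expand_powers_of_ten("10^3.[ft_i]3/s"))  # 1000.[ft_i]3/s
    print(expand_powers_of_ten("10*6.m3"))         # 1000000.m3
```

Strings with negative exponents, such as 10*-3, are left unchanged by this sketch and would need separate handling.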

Can I Chain Aliases or Alias Aliases?

No. Aliases are flat across the whole evaluation. Conceptually, each evaluation has a single one-level map from arbitrary unit names to formal UCUM units. This map starts with a set of default entries (educated guesses), and a declared unit alias whose name matches a default entry overrides that entry. The same map applies in the same way to the units found in the left, right, and baseline datasets, as well as to the desired units and threshold units.
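The flat map and its precedence rule can be pictured as a dictionary merge followed by a single lookup. In this sketch, `DEFAULT_ALIASES` is a hypothetical three-entry subset of the internal default map (using only mappings mentioned on this page), and `build_alias_map` and `resolve` are illustrative names rather than WRES functions.

```python
# Hypothetical subset of the default alias map; the real map is internal to WRES.
DEFAULT_ALIASES = {"CMS": "m3/s", "C": "Cel", "F": "[degF]"}


def build_alias_map(declared: dict) -> dict:
    """Merge defaults with declared aliases; declared entries take precedence."""
    merged = dict(DEFAULT_ALIASES)
    merged.update(declared)
    return merged


def resolve(unit_name: str, alias_map: dict) -> str:
    """Look the name up once. The result is taken as a UCUM string, never re-aliased."""
    return alias_map.get(unit_name, unit_name)


if __name__ == "__main__":
    print(resolve("C", build_alias_map({})))          # Cel (default)
    print(resolve("C", build_alias_map({"C": "C"})))  # C (declared override)
    # No chaining: 'flow' resolves to the string 'CMS', not onward to 'm3/s'.
    print(resolve("flow", build_alias_map({"flow": "CMS"})))
```

Because the lookup happens only once, an alias whose target is itself an alias name does not resolve to a UCUM unit; the target should always be written as a UCUM string.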

Can I Declare Multiple Unit Aliases?

Yes. You can declare zero, one, or many unit aliases, but you may not declare the same alias name twice. You may declare multiple aliases to mean the same target UCUM unit, however.

Example valid declaration, supposing that some data in the evaluation uses the string "metres" to denote meters while other data in the evaluation uses the string "meters" to denote meters, and other data in the evaluation uses the string "feet" to denote feet:

# Example valid unit alias declaration set
unit_aliases:
  - alias: 'metres'
    unit: 'm'
  - alias: 'meters'
    unit: 'm'
  - alias: 'feet'
    unit: '[ft_i]'

Example invalid declaration supposing some data in the evaluation uses the string "cubic meters" to denote cubic meters:

# Example INVALID unit alias declaration set
unit_aliases:
  - alias: 'cubic meters'
    unit: 'm3'
  - alias: 'cubic meters'
    unit: '35315.ft3/1000'

The above is invalid because it defines the same alias twice. A unit alias applies across all the datasets and can have only one interpretation. The only sense in which it is acceptable to have two aliases with the same name is when a declared unit alias overrides a default convenience alias that is internal to the WRES. Even then, there is only one declared entry per alias name in unit_aliases.
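A declaration can be checked for repeated alias names before it is submitted. The sketch below assumes the unit_aliases list has already been loaded into Python as a list of dictionaries with `alias` and `unit` keys; the helper `find_duplicate_aliases` is hypothetical and is not the validation the WRES itself performs.

```python
from collections import Counter


def find_duplicate_aliases(unit_aliases: list) -> list:
    """Return the alias names that are declared more than once, sorted."""
    counts = Counter(entry["alias"] for entry in unit_aliases)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    invalid = [
        {"alias": "cubic meters", "unit": "m3"},
        {"alias": "cubic meters", "unit": "35315.ft3/1000"},
    ]
    valid = [
        {"alias": "metres", "unit": "m"},
        {"alias": "meters", "unit": "m"},
        {"alias": "feet", "unit": "[ft_i]"},
    ]
    print(find_duplicate_aliases(invalid))  # ['cubic meters']
    print(find_duplicate_aliases(valid))    # []
```

Note that the valid example maps two different alias names to the same UCUM unit, which is allowed; only a repeated alias name is reported.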
