MuON v1.1

Micro Object Notation
    by Douglas Lau

MuON is a text format for data serialization. It is suitable for configuration files and data interchange — as expressive as other formats, but much simpler.

# MuON example
movie: Alien
  director: Ridley Scott
  cast:=Sigourney Weaver
      :=Tom Skerritt
      :=John Hurt
  release: 1979-06-22
    region: USA
  release: 1979-09-06
    region: UK
  gross: 203_630_630
  emoji: 👽 👾

Specification

MuON is Unicode text encoded in UTF-8, with no byte-order mark. MuON is case sensitive and line based.

A line is a definition, comment or blank, and must end with a single line feed character (U+000A). Comments begin with a number sign #, which may be preceded by spaces. Blank lines contain no characters.

  # Example comment
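
The three line kinds above can be told apart with a few string checks. The following is a minimal Python sketch, not a reference implementation; the function name `classify_line` is hypothetical, and the line is assumed to have had its trailing line feed removed already.

```python
def classify_line(line: str) -> str:
    """Classify one MuON line (without its trailing line feed)."""
    if line == "":
        # Blank lines contain no characters at all.
        return "blank"
    if line.lstrip(" ").startswith("#"):
        # A comment begins with '#', optionally preceded by spaces.
        return "comment"
    # Anything else is treated as a definition (not validated here).
    return "definition"
```

A key that begins with a number sign has to be quoted, so it starts with a quote mark and is not mistaken for a comment by this check.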

A definition maps a key to a value, with a colon and space between.

key: value

If the value is empty, the space is not required.
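
As a rough illustration, a definition with an unquoted key could be split as follows. This is a sketch under stated assumptions: the name `split_definition` is hypothetical, indentation has already been removed, and quoted keys and the `:>` and `:=` separators are not handled.

```python
def split_definition(line: str) -> tuple[str, str]:
    """Split 'key: value' into (key, value) for an unquoted key."""
    # An unquoted key cannot contain a colon, so the first colon ends the key.
    key, colon, rest = line.partition(":")
    if not colon or not key:
        raise ValueError(f"not a definition: {line!r}")
    if rest == "":
        # Empty value: the space after the colon is not required.
        return key, ""
    if rest.startswith(" "):
        return key, rest[1:]
    raise ValueError(f"expected a space after the colon: {line!r}")
```

Only the first colon matters, so a value such as `http://example.com` survives intact.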

Some definitions create branches. Starting from a root record, all branches form a tree. With no indents, definitions are contained in the root. After a branch, subsequent definitions with one more indent are contained in it.

key_in_root: value in root
branch:
  key_in_branch: value in branch

Definition indents are exactly 2, 3 or 4 spaces (U+0020). Nested branches use multiple indents. The number of spaces must be the same for all indents in a file.

family: Ursidae
   # One indent; 3 spaces
   genus: Ailuropoda
      # Two indents; 6 spaces
      species: A. melanoleuca 🐼
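
One way a parser might work out the indent width and nesting depth is sketched below in Python. The helper names are hypothetical. The sketch assumes that comment lines and blank-key append lines (those starting with a colon after their leading spaces) are skipped when detecting the width, which the specification does not state explicitly.

```python
def detect_indent_unit(lines):
    """Return the indent width (2, 3 or 4) used by the first indented definition."""
    for line in lines:
        stripped = line.lstrip(" ")
        # Skip blanks, comments and blank-key append lines (assumption).
        if not stripped or stripped[0] in "#:":
            continue
        spaces = len(line) - len(stripped)
        if spaces:
            if spaces not in (2, 3, 4):
                raise ValueError(f"indent must be 2, 3 or 4 spaces, got {spaces}")
            return spaces
    return None  # no indented definitions found


def indent_depth(line, unit):
    """Return how many indents precede the definition on this line."""
    spaces = len(line) - len(line.lstrip(" "))
    depth, remainder = divmod(spaces, unit)
    if remainder:
        raise ValueError(f"{spaces} spaces is not a multiple of {unit}")
    return depth
```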

A key is a sequence of one or more characters. It must be "quoted" if it contains a colon or begins with a space, quote mark or number sign. In this case, all quote marks in the key must be doubled.

# "skeleton" key begins with a quote mark
"""skeleton"" key": value

Also, a key should be quoted if it begins with whitespace, or if it contains control characters or homoglyphs of the colon.
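
The mandatory quoting rule can be sketched in Python as below. The function names are hypothetical, and only the "must be quoted" cases are checked; the recommended cases (other whitespace, control characters, colon homoglyphs) are left out.

```python
def needs_quotes(key: str) -> bool:
    """True if the key contains a colon or begins with a space, quote mark or '#'."""
    return ":" in key or key[:1] in (" ", '"', "#")


def quote_key(key: str) -> str:
    """Return the key as it would be written in a MuON file."""
    if not key:
        raise ValueError("a key needs at least one character")
    if needs_quotes(key):
        # Every quote mark inside a quoted key is doubled.
        return '"' + key.replace('"', '""') + '"'
    return key


def unquote_key(text: str) -> str:
    """Reverse quote_key for a key read from a file."""
    if text.startswith('"'):
        if len(text) < 2 or not text.endswith('"'):
            raise ValueError(f"unterminated quoted key: {text!r}")
        return text[1:-1].replace('""', '"')
    return text
```

Applied to the example above, `quote_key('"skeleton" key')` produces `"""skeleton"" key"`.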

A value is a sequence of characters. With the exception of line feed, any Unicode character is allowed.


A schema is a template with types as values. It can be separate or prepended to a MuON file. In either case, it begins and ends with a line of three colons.

:::
# Example MuON schema
movie: list record
  title: text
  director: text Alan Smithee
  cast: list text
  release: list record
    release_date: date >=1878-01-01
    region: text
  gross: int
  emoji: optional text
:::
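
A prepended schema can be separated from the data by looking for the two `:::` lines. The Python sketch below is illustrative only; `split_schema` is a hypothetical name, and the input is assumed to be the whole file as a single string.

```python
def split_schema(text: str):
    """Return (schema, data); schema is None if the file has no prepended schema."""
    lines = text.split("\n")
    if lines[0] != ":::":
        return None, text
    try:
        # Find the closing ::: line, searching after the opening one.
        end = lines.index(":::", 1)
    except ValueError:
        raise ValueError("schema is missing its closing ::: line") from None
    schema = "".join(line + "\n" for line in lines[1:end])
    data = "\n".join(lines[end + 1:])
    return schema, data
```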

There are eleven available types: text, bool, int, number, datetime, date, time, record, choice, dictionary and any. They are used to parse objects from values.

An optional or list modifier may precede the type, followed by a space.

One or two constraints may follow the type, with a space between. Each constraint is one of four specifiers (>, >=, < or <=) followed by a value. For int, number, datetime, date and time types, it defines a subrange of valid values. For text, it restricts the count of characters.

A default value may be included after the type and any constraints, also separated with a space. It is a value for the type, used when a definition is not present. Allowed types are text, bool, int, number, datetime, date and time. Defaults are not allowed with optional or list modifiers.


Text is a sequence of characters.

:::
greeting: text Hello!
farewell: text Goodbye!
:::
# greeting is Hello!
farewell: Be seeing you.

Because values cannot contain line feeds, multi-line text can only be represented using multiple definitions. The text must be split into values at each line feed. For each definition after the first, use the text append separator :> instead of the usual colon and space before the value.

When appending, use a blank key — a sequence of spaces with the same number of characters as the key.

lyric: Out in the garden
     :>There's half of a heaven
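
Reassembling appended text might look like the Python sketch below. The name `join_text` is hypothetical, and the input is assumed to be the consecutive lines of one text definition, with an unquoted key and without trailing line feeds.

```python
def join_text(lines):
    """Join a text definition and its :> append lines into (key, text)."""
    key, separator, first = lines[0].partition(": ")
    if not separator:
        raise ValueError(f"not a text definition: {lines[0]!r}")
    # The blank key has as many spaces as the key (and its indent) has characters.
    prefix = " " * len(key) + ":>"
    parts = [first]
    for line in lines[1:]:
        if not line.startswith(prefix):
            raise ValueError(f"expected a blank key and ':>': {line!r}")
        parts.append(line[len(prefix):])
    # Each appended value starts a new line of the text.
    return key, "\n".join(parts)
```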

Bool is a boolean: either true or false.

earth_is_flat: false

Int is an integer (whole number) in one of three forms:

  • Decimal: sequence of digits 0-9. May have a + or - sign prefix
  • Binary: b followed by sequence of digits 0 or 1
  • Hexadecimal: x followed by sequence of digits 0-9, A-F or a-f

An underscore may be inserted between digits to improve readability.

locke: 4
reyes: b1000
ford: x0F
jarrah: +16
shephard: b01_0111
kwon: x2a
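
The three int forms can be recognised with regular expressions, as in this Python sketch. The name `parse_int` is hypothetical. The sketch assumes that a sign is only allowed on the decimal form and that an underscore must sit between two digits, which is how the description above reads.

```python
import re

_DECIMAL = re.compile(r"[+-]?[0-9]+(?:_[0-9]+)*")
_BINARY = re.compile(r"b[01]+(?:_[01]+)*")
_HEX = re.compile(r"x[0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*")


def parse_int(text: str) -> int:
    """Parse a MuON int value in decimal, binary or hexadecimal form."""
    if _DECIMAL.fullmatch(text):
        return int(text.replace("_", ""), 10)
    if _BINARY.fullmatch(text):
        # Drop the leading 'b' before converting.
        return int(text[1:].replace("_", ""), 2)
    if _HEX.fullmatch(text):
        # Drop the leading 'x' before converting.
        return int(text[1:].replace("_", ""), 16)
    raise ValueError(f"not an int: {text!r}")
```

Python integers are unbounded, so no extra work is needed for large values.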

If no constraints are given, an integer has no bounds (BigInt).

:::
uint8: int >=0 <=255
rank: int >0 <6
:::
uint8: 49
rank: 3
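
Checking a parsed value against its constraints is a matter of applying each comparison in turn. A minimal Python sketch, with the hypothetical name `satisfies` and constraints given as (specifier, bound) pairs whose bounds are already parsed:

```python
import operator

_COMPARE = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def satisfies(value, constraints) -> bool:
    """True if the value meets every (specifier, bound) constraint."""
    return all(_COMPARE[spec](value, bound) for spec, bound in constraints)
```

With the schema above, `satisfies(49, [(">=", 0), ("<=", 255)])` holds, while a rank of 6 fails `[(">", 0), ("<", 6)]`.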

A number is a 64-bit floating point number, made up of these parts:

  1. Whole number part (same as decimal int)
  2. Fractional part (decimal point followed by sequence of digits 0-9)
  3. Exponent part (e followed by decimal int)

One or both of the whole or fractional parts must be present, but the exponent part is not required. As with ints, underscores may be included.

The values inf and NaN stand for infinity and not a number, respectively. Either can be prefixed with a + or - sign.

prime: 37
log_e_2: .6931471805599453
mercury: -38.83440
planck: 6.626_070_15e-34
buzz: +inf
avogadro: 6.022_140_76e23
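
The number grammar can be approximated with a regular expression, as in the Python sketch below. The name `parse_number` is hypothetical. It assumes that a sign may precede a number with no whole part, that the exponent marker is a lowercase e, and that a decimal point must be followed by at least one digit; the specification does not spell these cases out.

```python
import re

_DIGITS = r"[0-9]+(?:_[0-9]+)*"
_NUMBER = re.compile(
    rf"[+-]?(?:{_DIGITS}(?:\.{_DIGITS})?|\.{_DIGITS})(?:e[+-]?{_DIGITS})?"
)
_SPECIAL = re.compile(r"[+-]?(?:inf|NaN)")


def parse_number(text: str) -> float:
    """Parse a MuON number value into a 64-bit float."""
    if _NUMBER.fullmatch(text) or _SPECIAL.fullmatch(text):
        # Python's float() accepts the remaining syntax once underscores are removed.
        return float(text.replace("_", ""))
    raise ValueError(f"not a number: {text!r}")
```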

Datetime is date, time and offset, as specified by date-time from RFC 3339. The date and time are separated by an uppercase T only. If the offset is represented by Z, it must also be uppercase.

moonwalk: 1969-07-21T02:56:00Z

Date is year, month and day, as specified by full-date from RFC 3339.

birthday: 2019-08-01

Time is hour, minute and second, as specified by partial-time from RFC 3339.

start: 08:00:00
end: 15:58:14.593849001
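
The shapes of these three types can be checked with patterns that follow the RFC 3339 grammar. The Python sketch below only validates the layout of the characters, not whether the month, day or hour is in range; the name `value_kind` is hypothetical.

```python
import re

_DATE = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
_TIME = r"[0-9]{2}:[0-9]{2}:[0-9]{2}(?:\.[0-9]+)?"
_OFFSET = r"(?:Z|[+-][0-9]{2}:[0-9]{2})"

# Only an uppercase T and an uppercase Z are accepted.
DATETIME = re.compile(_DATE + "T" + _TIME + _OFFSET)
DATE = re.compile(_DATE)
TIME = re.compile(_TIME)


def value_kind(text: str):
    """Return 'datetime', 'date' or 'time' if the text has that shape, else None."""
    for name, pattern in (("datetime", DATETIME), ("date", DATE), ("time", TIME)):
        if pattern.fullmatch(text):
            return name
    return None
```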

A record is a branch containing fields as subsequent definitions. A record represents all of its fields.

:::
book: record
  title: text
  author: text
  year: int
:::
book:
  title: If on a winter's night a traveler
  author: Italo Calvino
  year: 1979

Field keys are often used in programming languages as identifiers. For compatibility, they should contain only ASCII alphanumeric or underscore characters.

Since a record does not use its own value, that value can substitute for the first field, which must then be left out of the record's definitions. Like defaults, substitution is only allowed for text, bool, int, number, datetime, date and time types without optional or list modifiers.

book: The Left Hand of Darkness
  author: Ursula K. Le Guin
  year: 1969
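
In terms of the parsed result, substitution simply moves the record's own value into its first field. A small Python sketch, where `build_record` is a hypothetical helper, field values are kept as unparsed strings, and `children` holds the definitions found inside the branch:

```python
def build_record(field_names, value, children):
    """Combine a record's own value and its child definitions into one mapping."""
    record = dict(children)
    if value:
        first = field_names[0]
        if first in record:
            # The substituted field must be left out of the branch.
            raise ValueError(f"field {first!r} is defined twice")
        record = {first: value, **record}
    return record
```

Both book examples above then produce mappings with the same three fields: title, author and year.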

In the schema, a record id can follow record after a space. It is used if a record exists in more than one place. After the first definition, the fields do not need to be included.

:::
player: record Character
  name: text
  health: int
nemesis: record Character
:::
player: Arthur
  health: 50
nemesis: Mordred
  health: 60

A choice is a branch containing variants as subsequent definitions. A choice represents exactly one of the variants.

A variant containing no data is declared as a definition with no type. These variants can substitute for the choice value.

:::
pill: choice
  red
  blue
:::
pill: red

Variants can also contain arbitrary data. These must be declared with a subsequent definition.

:::
strategy: choice
  attack: int
  retreat
  surrender: text
:::
strategy:
  attack: 50

In the schema, a variant id can follow choice after a space. This works in the same way as a record id.

:::
face_a: choice direction
  North
  South
  East
  West
face_b: choice direction
:::
face_a: North
face_b: East

A dictionary is a branch for associative arrays — useful if keys are not known in advance. The schema must contain a single definition with types for both key and value. The key type is restricted to text, bool, int, number, datetime, date or time.

:::
num_word: dictionary
  text: int
:::
num_word:
  fifty: 50
  one: 1
  thirteen: 13

Any is a branch containing data of any type. It should be used for data which does not fit into a rigid schema.

:::
product: list record
  name: text
  price: number
  details: any
:::
product: duct tape
  price: 4.99
  details:
    color: silver
    width: 8 cm
product: machete
  price: 29.99
  details:
    length: 50 cm
    weight: 0.5 kg

Optional types are not required — the absence of a definition represents a None or null value.

:::
name: text
occupation: optional text
:::
name: Surfer Joe
# no occupation

A list is parsed as a sequence of objects, separated by spaces. If a list is empty, omit its definition.

:::
show_times: list time
healthy_snacks: list text
:::
show_times: 15:40:00 18:00:00 20:20:00
# no healthy_snacks

Like text, lists can be appended. All objects are added to the end.

fibonacci: 0 1 1 2 3
         : 5 8 13 21 34
# same as fibonacci: 0 1 1 2 3 5 8 13 21 34

When appending to list record or list dictionary, the key cannot be blank, since the definitions are not consecutive.

person: George Washington
  birthday: 1732-02-22
person: Abraham Lincoln
  birthday: 1809-02-12

For list text, objects are separated by spaces, just like other lists. If a text object contains spaces, use the text value separator := to treat an entire value as a single object. The text append separator :> will also append an entire value to the previous text object.

shopping: avocado banana
        :=cream cheese
        : cucumber
        :=ice cream
        : raw
        :>burger! (mmmm)
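
The way the three separators interact for list text can be sketched in Python as below. The name `build_text_list` is hypothetical, and the input is assumed to be (separator, value) pairs already split from consecutive definitions. The sketch also assumes, by analogy with text append, that `:>` joins the appended value to the previous object with a line feed.

```python
def build_text_list(definitions):
    """Build a list of text objects from (separator, value) pairs."""
    items = []
    for separator, value in definitions:
        if separator == ":":
            # Plain separator: the value holds space-separated objects.
            items.extend(part for part in value.split(" ") if part)
        elif separator == ":=":
            # Text value separator: the whole value is one object.
            items.append(value)
        elif separator == ":>":
            # Text append separator: extend the previous object (assumed line feed join).
            if items:
                items[-1] += "\n" + value
            else:
                items.append(value)
        else:
            raise ValueError(f"unknown separator: {separator!r}")
    return items
```

Under these assumptions the shopping example yields six objects, the last of which spans two lines.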

Contributing

Any feedback, bug reports, spelling fixes, or text clarity improvements are welcome! Please create an issue.