Metadata | Definition | Reference of definition [<url_to_definition>] | Expected value OR expected unit of measurement | Example filled field | Checklist (where this or a similar metadata field is mentioned) | |
---|---|---|---|---|---|---|
Site metadata | collection_date - Collection Date | The time of sampling, either as an instant (single point in time) or as an interval. If no exact time is available, the date/time can be right-truncated. Must be ISO 8601 compliant | [MIXS:0000011] | YYYY-MM-DD | e.g. 2013-03-25T12:42:31+01:00 | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
Collected by | Name of person or institute that collected the sample | Link to reference | free text string | e.g. UFZ - Centre for environmental research | “ENA Marine Microalgae Checklist; Checklist: ERC000043” | |
geo_loc_name - Geographic location (country and/or sea, region) | Geographic location (country and/or sea, region). The geographical origin of the sample as defined by the country or sea name followed by the specific region name. Country or sea names should be chosen from the INSDC country list or the GAZ ontology | [MIXS:0000010] | free text or ontology | e.g. USA: Maryland, Bethesda; Atlantic Ocean region OR [GAZ:00051071] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
lat - Latitude | The geographical origin of the sample as defined by latitude. The values should be reported in decimal degrees and in the WGS84 system. | [MIXS:0000009] | Expected_value: decimal degrees, limited to 8 decimal places | e.g. -41.373744 | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
lon - Longitude | The geographical origin of the sample as defined by longitude. The values should be reported in decimal degrees and in the WGS84 system. | [MIXS:0000009] | Expected_value: decimal degrees, limited to 8 decimal places | e.g. 146.266145 | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
elev - Elevation | Elevation is mainly used when referring to points on the Earth’s surface. Origin elevation in m | [MIXS:0000093] | Preferred_unit: meter | e.g. 100 m | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
alt - Altitude | Altitude is used for points above the surface, such as an aircraft in flight or a spacecraft in orbit | [MIXS:0000094] | Preferred_unit: meter | e.g. 100 m | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
depth - Depth | The vertical distance below the local surface. For sediment or soil samples, depth is measured from the sediment or soil surface, respectively. Depth can be reported as an interval for subsurface samples | [MIXS:0000018] | Preferred_unit: meter | e.g. 100 m | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS: Host-associatedMIMS”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
env_broad_scale - Broad-Scale Environmental Context | Report the major environmental system the sample or specimen came from. System identifier(s) should provide a coarse, general environmental context of where the sampling was done. Recommended use of EnvO’s biome class: [ENVO:00000428]. If more than one term applies to the field, \| should be used to separate them. | [MIXS:0000012] | Expected_value: Environmental entities having causal influences upon the entity at time of sampling in form of ontologies, separated by “\|” if two or more apply | e.g. aquatic biome [ENVO:00002030]\|terrestrial biome [ENVO:00000446] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
env_local_scale - Local-Scale Environmental Context | Report the entity or entities which are in the sample or specimen’s local vicinity and which you believe have significant causal influences on your sample or specimen. Entry should be of a smaller environmental context than env_broad_scale. Terms, such as anatomical sites, from other OBO Library ontologies which interoperate with EnvO (e.g. UBERON) are accepted in this field. If more than one term applies to the field, \| should be used to separate them. | [MIXS:0000013] | Expected_value: Environmental entities having causal influences upon the entity at time of sampling in form of ontologies, separated by “\|” if two or more apply | e.g. if terrestrial biome [ENVO:00000446] was used in env_broad_scale, env_local_scale should be one of the following subclasses: anthropogenic terrestrial [ENVO:01000219]\|mangrove biome [ENVO:01000181]\|shrubland biome [ENVO:01000176]\|terrestrial environmental zone [ENVO:01001199]\|tundra biome [ENVO:01000180]\|woodland biome [ENVO:01000175] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
env_medium - Environmental Medium | Report the environmental material(s) immediately surrounding the sample or specimen at the time of sampling. Recommended use of EnvO’s subclasses of environmental material [ENVO:00010483]. Terms from other OBO ontologies are permissible as long as they reference mass/volume nouns (e.g. air, water, blood) and not discrete, countable entities (e.g. a tree, a leaf, a table top). If more than one term applies to the field, \| should be used to separate them. | [MIXS:0000014] | ontologies, separated by “\|” if two or more apply | e.g. glacial ice [ENVO:03000004] OR arable soil [ENVO:00005742]\|bare soil [ENVO:01001616]\|bulk soil [ENVO:00005802] | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)”, “GSC MIxS Human Associated; ENA Checklist: ERC000014” | |
chem_administration - Chemical Administration | List of chemical compounds administered to the host or at the site where sampling occurred. Can include multiple compounds separated by \|. For compounds, consult the Chemical Entities of Biological Interest ontology (ChEBI) (v 163) | [MIXS:0000751] | Expected_value: CHEBI;timestamp; CHEBI ontologies, separated by “\|” if two or more apply | e.g. agar [CHEBI:2509];2018-05-11T20:00Z\|castor oil [CHEBI:140618];2023-12-07T17:00+02:00 | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS” | |
Site conditions | Environmental temp, salinity | temp [MIXS:0000113], salinity [MIXS:0000183] | Preferred_unit: degree Celsius; Preferred_unit: practical salinity unit, percentage | e.g. 25 degree Celsius, 25 practical salinity unit, pH 7.2 | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, MSI-ECWSG (Morrison et al. (2007)), “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
Sample metadata | samp_name - Sample Name | A local identifier or name for the material sample used for extracting nucleic acids and subsequent sequencing. It can refer either to the original material collected or to any derived sub-samples. It can have any format, but we suggest that you make it concise, unique and consistent within your lab, and as informative as possible. INSDC requires every sample name from a single Submitter to be unique. Use of a globally unique identifier for the field source_mat_id is recommended in addition to sample_name | [MIXS:0001107] | free text string | e.g. Host1Sample2Seq2 | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
source_mat_id - Source Material Identifier(s) | A unique identifier assigned to a material sample used for extracting nucleic acids and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. | [MIXS:0000026] | Expected_value: for cultures of microorganisms: identifiers for two culture collections; for other material a unique arbitrary identifier | e.g. MPI012345 | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
samp_size - Amount or size of sample collected | The total amount or size (volume (ml), mass (g) or area (m²)) of sample collected | [MIXS:0000001] | ml, g, m² | e.g. H x W x L, volume or mass: 2000 ml water, 1000 g soil | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
temp - Temperature | Temperature of the sample at the time of sampling | [MIXS:0000113] | Preferred_unit: degree Celsius | e.g. 25 degree Celsius | “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
salinity - Salinity | The total concentration of all dissolved salts in a liquid or solid sample. While salinity can be measured by a complete chemical analysis, this method is difficult and time consuming. More often, it is instead derived from the conductivity measurement. This is known as practical salinity. These derivations compare the specific conductance of the sample to a salinity standard such as seawater | [MIXS:0000183] | Preferred_unit: practical salinity unit, percentage | e.g. 25 practical salinity unit | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS”, “GSC MIxS Human Associated; ENA Checklist: ERC000014” | |
samp_taxon_id - Taxonomy Identifier of DNA Sample | NCBI taxon id of the sample. May be a single taxon or a mixed taxa sample. Use ‘synthetic metagenome’ for mock community/positive controls, or ‘blank sample’ for negative controls. Expected_value: [NCBI taxonomy ID] | [MIXS:0001320] | Expected_value: NCBI taxon identifier | e.g. Gut Metagenome [NCBITaxon:749906] | “GSC MIXS: MIMAG”, “GSC MIXS: MIGSBacteria”, “GSC MIMS: Metagenome or Environmental”, “Minimum Information about a Single Amplified Genome (MiSAG)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” | |
Microbial isolate | Microbial isolate cultured?: Y/N | |||||
microb_cult_med - Microbiological culture medium (applicable only if microorganism can be cultivated) | Composition of processed material providing the needed nourishment for microorganisms or cells to grow in vitro. This field accepts terms listed under culture medium [OBI:0000079]. If the proper descriptor is not listed, please use text to describe the culture medium | [MIXS:0001216] | ontologies, separated by “\|” if two or more apply, or free text | e.g. minimal defined medium [MCO:0000881] | “MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package”, MSI-ECWSG (Morrison et al. (2007)) | |
chem_administration - Chemical Administration | List of chemical compounds administered to the host or at the site where sampling occurred. Can include multiple compounds separated by \|. For compounds, consult the Chemical Entities of Biological Interest ontology (ChEBI) (v 163) | [MIXS:0000751] | Expected_value: CHEBI;timestamp; CHEBI ontologies, separated by “\|” if two or more apply | e.g. agar [CHEBI:2509];2018-05-11T20:00Z\|castor oil [CHEBI:140618];2023-12-07T17:00+02:00 | “GSC MIxS: Host-associatedMIMS”, “GSC MIxS: Human-associatedMIMS” | |
Host metadata | host_taxid - Taxonomy Identifier of Host | NCBI taxon id of the host [NCBI taxonomy ID] | [MIXS:0000250] | Expected_value: NCBI taxon identifier | e.g. Canis lupus familiaris [NCBI:txid9615], Homo sapiens [NCBI:txid9606] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, MSI-ECWSG (Morrison et al. (2007)) |
host_common_name - Common Name of Host | Common name of the host | [MIXS:0000248] | free text string | e.g. human, dog, cattle | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” | |
host_height - Height of Host | The height of subject/host | [MIXS:0000264] | Preferred_unit: centimeter, millimeter, meter | e.g. 177 cm | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_length - Length of Host | The length of subject/host | [MIXS:0000256] | Preferred_unit: centimeter, millimeter, meter | e.g. 100 cm | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” | |
host_tot_mass - Total Mass of the Host | Total mass of the host at collection, the unit depends on host | [MIXS:0000263] | Preferred_unit: kilogram, gram | e.g. 77 kg OR 3568 g | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_body_site - Sampled Body Site of Host | Name of the body site where the sample was obtained from, such as a specific organ or tissue (tongue, lung etc.). Recommended use of FMA or UBERON ontologies | [MIXS:0000867] | Expected_value: FMA or UBERON ontologies, separated by “\|” if two or more apply | e.g. gut [FMA:45615] OR gut wall [UBERON:0000328] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_body_product - Sampled Body Product of Host | Substance produced by the body, e.g. stool or mucus, where the sample was obtained from. Recommended use of FMA or UBERON ontologies | [MIXS:0000888] | Expected_value: FMA or UBERON ontologies, separated by “\|” if two or more apply | e.g. mucus [FMA:66938], arterial blood [UBERON:0013755]\|venous blood [UBERON:0013756]\|blood plasma [UBERON:0001969] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_age - Age of Host | Age of host at the time of sampling; the relevant scale depends on species and study, e.g. could be seconds for amoebae or centuries for trees | [MIXS:0000255] | Preferred_unit: year, day, hour | e.g. 28 y OR 30 d OR 30 h OR 12 s | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_sex - Sex of Host | Gender or physical sex of the host | [MIXS:0000811] | Expected_value: enumeration | e.g. male, female, unknown | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013” | |
host_diet - Diet of Host | Type of diet depending on the host: for animals omnivore, herbivore etc.; for humans high-fat, Mediterranean etc.; can include multiple diet types | [MIXS:0000869] | free text string | e.g. omnivore [ECOCORE:00000082], substrate, medium; note that multiple possible subclasses exist in organism [OBI:0100026], such as autotroph [ECOCORE:00000023]\|heterotroph [ECOCORE:00000010] | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS” | |
host_disease_stat - Disease Status of Host | Diagnosed diseases of host; can hold multiple values, separated by \|. Non-human host diseases should be deposited as free text | [MIXS:0000031] | Expected_value: free text disease name or Disease Ontology term | e.g. rabies, avian influenza, heartworm disease | “GSC MIxS: Host-associatedMIMS”, “ENA Host Associated Checklist; Checklist: ERC000013”, “GSC MIxS Human Associated; ENA Checklist: ERC000014”, “GSC MIxS: Human-associatedMIMS”, “GSC MIXS: MIGSBacteria”, “Minimum Information about Viral Genome Sequence (MigsVi)”, “Minimum Information about an Uncultivated Virus Genome (Miuvig)” |
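
The expected values for collection_date, lat and lon in the table lend themselves to a quick pre-submission check. The following is a minimal sketch in Python, not an official MIxS or ENA validator: the function names are hypothetical, the use of `/` as the interval separator follows ISO 8601 and is an assumption about how intervals would be filed, and the latitude/longitude ranges (±90, ±180) are the usual WGS84 bounds rather than something stated in the table.

```python
import re
from datetime import datetime

# Hypothetical helpers: they mirror the table rows for collection_date,
# lat and lon, and are not part of any official checklist tooling.

# Right-truncated ISO 8601 dates: YYYY, YYYY-MM or YYYY-MM-DD
_TRUNCATED_DATE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")
_DATE_FORMATS = {4: "%Y", 7: "%Y-%m", 10: "%Y-%m-%d"}

# Decimal degrees with at most 8 decimal places
_DECIMAL_DEGREES = re.compile(r"[+-]?\d+(\.\d{1,8})?")


def _is_valid_instant(value: str) -> bool:
    """Check a single point in time (full timestamp or truncated date)."""
    if _TRUNCATED_DATE.fullmatch(value):
        try:
            # strptime rejects impossible months/days such as 2013-13
            datetime.strptime(value, _DATE_FORMATS[len(value)])
            return True
        except ValueError:
            return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_valid_collection_date(value: str) -> bool:
    """Accept an instant, or an interval written as 'start/end' (assumed)."""
    parts = value.split("/")
    if len(parts) > 2:
        return False
    return all(_is_valid_instant(part) for part in parts)


def is_valid_coordinate(value: str, limit: float) -> bool:
    """Check decimal degrees; limit is 90 for lat and 180 for lon."""
    if not _DECIMAL_DEGREES.fullmatch(value):
        return False
    return abs(float(value)) <= limit


print(is_valid_collection_date("2013-03-25T12:42:31+01:00"))
print(is_valid_coordinate("-41.373744", 90))
print(is_valid_coordinate("146.266145", 180))
```

The regular expression for coordinates enforces the 8-decimal limit on the text itself, so a value is not silently rounded before it is checked.
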
Readers of this repository who are unsure how to use EnvO’s ontology terms should consult the EnvO usage documentation: https://github.com/EnvironmentOntology/envo/wiki/Using-ENVO-with-MIxS.
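
Several fields in the table (for example env_broad_scale, env_local_scale, env_medium and host_body_site) expect one or more ontology terms written as `label [PREFIX:ID]` and separated by a pipe character. As an illustration only, the sketch below splits such a field value into its terms; `OntologyTerm` and `parse_terms` are hypothetical names, and the pattern is inferred from the examples in the table rather than taken from a formal grammar. It does not handle the `term;timestamp` form used by chem_administration.

```python
import re
from typing import List, NamedTuple


class OntologyTerm(NamedTuple):
    """One 'label [PREFIX:ID]' entry (hypothetical container)."""
    label: str
    prefix: str
    local_id: str


# Pattern inferred from the table examples, e.g. 'aquatic biome [ENVO:00002030]'
_TERM = re.compile(
    r"(?P<label>[^\[\]|]+?)\s*\[(?P<prefix>[A-Za-z]+):(?P<id>[A-Za-z0-9]+)\]"
)


def parse_terms(field_value: str) -> List[OntologyTerm]:
    """Split a pipe-separated field value into ontology terms."""
    terms = []
    for chunk in field_value.split("|"):
        match = _TERM.fullmatch(chunk.strip())
        if match is None:
            raise ValueError(f"not a 'label [PREFIX:ID]' term: {chunk!r}")
        terms.append(
            OntologyTerm(match["label"].strip(), match["prefix"], match["id"])
        )
    return terms


for term in parse_terms(
    "aquatic biome [ENVO:00002030]|terrestrial biome [ENVO:00000446]"
):
    print(term.prefix, term.local_id, term.label)
```

Whitespace around the separator is tolerated, so values filed as `a [X:1] | b [X:2]` parse the same way as `a [X:1]|b [X:2]`.
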
“ENA Host Associated Checklist; Checklist: ERC000013.” https://www.ebi.ac.uk/ena/browser/view/ERC000013.
“ENA Marine Microalgae Checklist; Checklist: ERC000043.” https://www.ebi.ac.uk/ena/browser/view/ERC000043.
“GSC MIMS: Metagenome or Environmental.” https://genomicsstandardsconsortium.github.io/mixs/0010007/.
“GSC MIxS Human Associated; ENA Checklist: ERC000014.” https://www.ebi.ac.uk/ena/browser/view/ERC000014.
“GSC MIxS: Host-associatedMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016002/.
“GSC MIxS: Human-associatedMIMS.” https://genomicsstandardsconsortium.github.io/mixs/0016003/.
“GSC MIXS: MIGSBacteria.” https://genomicsstandardsconsortium.github.io/mixs/0010003/.
“GSC MIXS: MIMAG.” https://genomicsstandardsconsortium.github.io/mixs/0010011/.
“MIMS: Metagenome/Environmental, Human-Associated; Version 6.0 Package.” https://www.ncbi.nlm.nih.gov/biosample/docs/packages/MIMS.me.human-associated.5.0/.
“Minimum Information about a Single Amplified Genome (MiSAG).” https://genomicsstandardsconsortium.github.io/mixs/0010010/.
“Minimum Information about an Uncultivated Virus Genome (Miuvig).” https://genomicsstandardsconsortium.github.io/mixs/0010012/.
“Minimum Information about Viral Genome Sequence (MigsVi).” https://genomicsstandardsconsortium.github.io/mixs/0010005/.
Morrison, Norman, Daniel Bearden, Jacob G. Bundy, Timothy Collette, Fraser Currie, Matthew Davey, Migdalia Dominguez, et al. 2007. “Standard Reporting Requirements for Biological Samples in Metabolomics Experiments: Environmental Context.” Metabolomics 3 (2): 203–10. https://doi.org/10.1007/s11306-007-0067-1.