Commit f6af5a6

Updating the radiocarbon folder.
1 parent c7ccacb commit f6af5a6

File tree

3 files changed

+342
-0
lines changed

Proposals/ostracode_support/ostracode_DataBUS_additions.qmd

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -62,3 +62,12 @@ VALUES('AquaticSystem', 'Aquatic System');
6262

6363
## Depositional Environments
6464

65+
We need to check the following parameters against Neotoma:
66+
67+
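```{r depositionalenvts}
# Collect the distinct, non-missing habitat labels from `eanode` and `node`.
habitats <- c(eanode$habitat, node$`NATURAL HABITAT`, node$`ARTIFICIAL HABITAT`) |>
  unique() |>
  na.omit()

# Write the list out in a separate step: piping into write.csv() would leave
# `habitats` holding its NULL return value instead of the labels.
write.csv(habitats, 'habitat_equiv.csv')
```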
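
One way the check against Neotoma might look, as a minimal sketch. Both vectors below are made up for illustration: `neotoma_envts` is a hypothetical stand-in for the depositional environment names already in Neotoma, and `normalize()` is a helper introduced here, not part of any package.

```r
# Hypothetical habitat labels, standing in for the values written to habitat_equiv.csv.
habitats <- c("Lake", "spring", "Roadside Ditch", "Stream")

# Hypothetical Neotoma depositional environment names (an assumption, not pulled from the database).
neotoma_envts <- c("Lake", "Spring", "Stream", "Estuary")

# Normalize case and whitespace so trivial differences do not count as mismatches.
normalize <- function(x) tolower(trimws(x))

# Flag each habitat by whether it already has an equivalent in Neotoma.
habitat_check <- data.frame(
  habitat = habitats,
  in_neotoma = normalize(habitats) %in% normalize(neotoma_envts),
  stringsAsFactors = FALSE
)

# Habitats that still need a new entry or a manual equivalence.
unmatched <- habitat_check$habitat[!habitat_check$in_neotoma]
```

Exact matching after normalization is a deliberately strict choice; synonyms would still need a hand-built equivalence table.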
Lines changed: 271 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,271 @@
1+
---
2+
title: "Radiocarbon Data Formats"
3+
format: html
4+
---
5+
6+
## Reporting Standards
7+
8+
### Standards from Millard (2014)
9+
10+
1. The laboratory measurement should be reported as a conventional 14C age in 14C yr BP or a
11+
fractionation-corrected fraction modern (the F14C value of Reimer et al. 2004) according to
12+
the amended conventions of Stuiver and Polach (Stuiver and Polach 1977; Stuiver 1980, 1983;
13+
Reimer et al. 2004).
14+
2. The laboratory code for the determination should be included.
15+
3. The sample material dated, the pretreatment method applied, and quality control measurements
16+
should be reported.
17+
4. The calibration curve and any reservoir offset used.
18+
5. The software used for calibration, including version number, the options and/or models used,
19+
and wherever possible a citation of a published description of the software.
20+
6. The calibrated date given as a range (or ranges) with an associated probability on a clearly
21+
identifiable calendar timescale.
22+
23+
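
The checklist above could be turned into a simple completeness check, sketched here under assumptions. The field names in `required_fields` are hypothetical and simplify the six items (for example, quality control measurements and reservoir offsets are not broken out), and the example record is made up.

```r
# Hypothetical field names for the reporting items; rename to match the real table.
required_fields <- c("c14_age", "lab_code", "material", "pretreatment",
                     "calibration_curve", "calibration_software", "calibrated_range")

# Return the required fields that are absent, NA, or empty for one record (a named list).
missing_fields <- function(record, required = required_fields) {
  is_missing <- vapply(required, function(field) {
    value <- record[[field]]
    is.null(value) || length(value) == 0 || all(is.na(value)) ||
      all(!nzchar(as.character(value)))
  }, logical(1))
  required[is_missing]
}

# Illustrative record with made-up values; the lab code is a placeholder.
example_record <- list(
  c14_age = 4520, lab_code = "LAB-0000", material = "charcoal",
  pretreatment = NA, calibration_curve = "IntCal20"
)

missing_fields(example_record)
```

A record that returns an empty vector reports every field in this simplified list; it says nothing about whether the values themselves are valid.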
24+
## Data Sources
25+
26+
* [Cross-References p3k14c Data Table](https://opencontext.org/tables/e7fce3cb-eb78-4a36-8b70-da0323da23b3)
27+
* [Neotoma Radiocarbon Data](https://www.neotomadb.org)
28+
* [East African Radiocarbon Database](https://openquaternary.com/articles/10.5334/oq.22)
29+
* [IntCal aalen Record](https://intchron.org/schema.html)
30+
* [DarwinCore Chronometric Extension](https://www.tdwg.org/community/esp/chrono/)
31+
* [UC Irvine Records]()
32+
* [Nerd Database](https://openarchaeologydata.metajnl.com/articles/10.5334/joad.90#T1)
33+
34+
## Data Columns
35+
36+
### CARD
37+
38+
Lab Number
39+
Field Number
40+
Material Dated
41+
Taxa Dated
42+
Type of Date
43+
Locality
44+
Latitude
45+
Longitude
46+
Map Sheet
47+
Elevation (m ASL)
48+
Submitter
49+
Date Submitted
50+
Collector
51+
Date Collected
52+
Updater
53+
Date Updated
54+
Measured Age
55+
MA Sigma
56+
Normalized Age
57+
NA Sigma
58+
Delta 13C (per mil)
59+
Delta 13 C Source
60+
Significance
61+
Site Identifier
62+
Site Name
63+
Stratigraphic Component
64+
Context
65+
Associated Taxa
66+
Additional Information
67+
Comments
68+
References
69+
70+
### NERD
71+
72+
DateID
73+
LabID
74+
OthLabID
75+
Problems
76+
CRA
77+
Error
78+
DC13
79+
Material
80+
Species
81+
SiteID
82+
SiteName
83+
SiteContext
84+
SiteType
85+
Country
86+
Longitude
87+
Latitude
88+
LocQual
89+
Source
90+
Comments
91+
92+
### IntCal Standard Reporting
93+
94+
* Standards are reported on the [IntCal Instructions](https://intchron.org/tools/integrate/help_IntCal.html) page and in the [schema](https://intchron.org/schema.html)
95+
96+
Record Metadata
97+
* site: the site name
98+
* country (blank if marine)
99+
* longitude (optional but desirable)
100+
* latitude (optional but desirable)
101+
* elevation (optional but desirable)
102+
* site_type (optional but desirable)
103+
* record: the record name
104+
* changed (optional - if set implies file_data is different from the original file)
105+
* refs: DOIs
106+
107+
Key IntCal parameters
108+
* z (Depth/Ring): general depth or ring measurement (see below)
109+
* z_range (z range): range in z (see below)
110+
* t (Date): general date parameter - the timescale is defined at the record level; internally this is stored as fractional astronomical years but can be displayed in a number of different formats (calBP, fractional years, BC/AD etc)
111+
* t_sigma (±1σ): one sigma uncertainty in t
112+
* sample (Sample): the sample name (see below)
113+
* ring_segment (Seg.): the segment of a ring measured (EW, LW, EW/LW, EW1, EW2, LW1, LW2 etc.)
114+
* labcode (Lab code): the radiocarbon lab code in standard format
115+
* batch (Batch): the measurement batch (magazine/wheel) in which the measurements were made
116+
* r_date (R_Date): the uncalibrated radiocarbon date in 14C years BP
117+
* r_date_sigma (±1σ): the one sigma uncertainty in r_date
118+
119+
Special IntCal parameters
120+
* calage (CalAge): required for all data - this is the age of the sample in years calBP; in normal circumstances t will be 1950.5 - calage
121+
* calage_range (CalSpan): this is the range of the sample material (see below); the default value is 1
122+
* calage_sigmaI (±1σ): this is the one sigma independent uncertainty in the age (0 for dendrochronologically dated material)
123+
* calage_sigmaD (±1σ): this is the one sigma systematic uncertainty in the age of the entire dataset (typically for tree ring series)
124+
* calage_sigmaC (±1σ): correlated uncertainty (used for example in the Cariaco Basin varved sediments)
125+
* intcal_reservoir (Reservoir): the marine reservoir offset (relative to the atmosphere) of this sample set
126+
* intcal_reservoir_sigma (±1σ): the one sigma uncertainty in intcal_reservoir
127+
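
The relation between `calage` and `t` quoted above ("in normal circumstances t will be 1950.5 - calage") can be written out as a small sketch. The function names are introduced here for illustration and are not part of the IntCal schema.

```r
# Convert a calBP age to the astronomical-year timescale used for `t`,
# following the relation quoted above: t = 1950.5 - calage.
calage_to_t <- function(calage) 1950.5 - calage

# Inverse relation, recovering calage from t.
t_to_calage <- function(t) 1950.5 - t

# Vectorized over several ages at once.
calage_to_t(c(0, 1000, 12000))
```

This only covers the "normal circumstances" case; records where `t` is defined differently at the record level would need their own handling.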
128+
The following parameters are used internally in the IntCal curve process and are typically not set on submission of data
129+
* intcal_data_id (Id): the unique ID number for this data point in IntCal
130+
* intcal_seq_no (Seq): the sequence number (for series with multiple sequences)
131+
* r_date_sigma_extra (+ ±1σ): extra uncertainty in the radiocarbon value
132+
* r_date_sigma_mult (× ±1σ): an error multiplier (typically 1 as a default)
133+
134+
### Chronometric Reporting (DWC)
135+
136+
* chronometricAgeID: URL or unique id for the record.
137+
* verbatimChronometricAge: The verbatim age for a specimen, whether reported by a dating assay, associated references, or legacy information.
138+
* chronometricAgeProtocol: A description of or reference to the methods used to determine the chronometric age.
139+
* uncalibratedChronometricAge: The output of a dating assay before it is calibrated into an age using a specific conversion protocol.
140+
* chronometricAgeConversionProtocol: The method used for converting the uncalibratedChronometricAge into a chronometric age in years, as captured in the earliestChronometricAge, earliestChronometricAgeReferenceSystem, latestChronometricAge, and latestChronometricAgeReferenceSystem fields.
141+
* earliestChronometricAge: The maximum/earliest/oldest possible age of a specimen as determined by a dating method.
142+
* earliestChronometricAgeReferenceSystem: The reference system associated with the earliestChronometricAge.
143+
* latestChronometricAge: The minimum/latest/youngest possible age of a specimen as determined by a dating method.
144+
* latestChronometricAgeReferenceSystem: The reference system associated with the latestChronometricAge.
145+
* chronometricAgeUncertaintyInYears: The temporal uncertainty of the earliestChronometricAge and latestChronometricAge in years.
146+
* chronometricAgeUncertaintyMethod: The method used to generate the value of chronometricAgeUncertaintyInYears.
147+
* materialDated: A description of the material on which the chronometricAgeProtocol was actually performed, if known.
148+
* materialDatedID: An identifier for the MaterialSample on which the chronometricAgeProtocol was performed, if applicable.
149+
* materialDatedRelationship: The relationship of the materialDated to the subject of the ChronometricAge record, from which the ChronometricAge of the subject is inferred.
150+
* chronometricAgeDeterminedBy: A list (concatenated and separated) of names of people, groups, or organizations who determined the ChronometricAge.
151+
* chronometricAgeDeterminedDate: The date on which the ChronometricAge was determined.
152+
* chronometricAgeReferences: A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the ChronometricAge.
153+
* chronometricAgeRemarks: Notes or comments about the ChronometricAge.
154+
155+
### Neotoma Reporting
156+
157+
See JSON (we need to do this better though)
158+
159+
160+
### p3k14c [in OpenContext]
161+
162+
Item URI
163+
Item Label
164+
Persistent ID (ARK)
165+
Item Category
166+
Project Label
167+
Project URI
168+
Item Context URI
169+
Site Wikidata URI
170+
Site Pleiades URI
171+
Latitude (WGS-84)
172+
Longitude (WGS-84)
173+
Geospatial Note
174+
Geospatial Inference
175+
Earliest Year (-BCE/+CE)
176+
Latest Year (-BCE/+CE)
177+
Chronology Inference
178+
Context (1)
179+
Context (2)
180+
Context (3)
181+
Context (4)
182+
Context (5)
183+
Authors and Contributors
184+
Material Type [https://opencontext.org/predicates/b342d8f3-c47c-4ab0-a6dc-39413e373c7d]
185+
LabID [https://opencontext.org/predicates/ac06a96f-a41d-4fd8-be1f-86642336f375]
186+
Consists of [https://erlangen-crm.org/current/P45_consists_of]
187+
Age [https://opencontext.org/predicates/3101d7ea-b28c-4262-adf1-12c609ed5e7a]
188+
Has taxonomic identifier [https://purl.obolibrary.org/obo/FOODON_00001303]
189+
Error [https://opencontext.org/predicates/25858351-ff6b-4b59-877f-b2cfe020caa2]
190+
Longitude (WGS84, sample) [https://opencontext.org/predicates/6c6a6ae0-cb7d-4558-bb82-3a5e85152cf1]
191+
Longitude (WGS84, sample) [Note] [https://opencontext.org/predicates/6dbdd6c1-7dbf-4b66-bf50-485c9044e5b9]
192+
Latitude (WGS84, sample) [https://opencontext.org/predicates/2f24017e-1b5c-4554-8908-5d7107412099]
193+
Material (original) [https://opencontext.org/predicates/2d4fd385-12e6-429a-a965-9d401d15c80f]
194+
Taxa [https://opencontext.org/predicates/63b5403a-41fc-471c-97f7-c36789398217]
195+
d13C [Note] [https://opencontext.org/predicates/e3f16d13-91d3-499d-afcd-6c41d7ec7451]
196+
d13C [https://opencontext.org/predicates/84d688d1-1fd3-433d-9e69-b1b8547bf248]
197+
Method [https://opencontext.org/predicates/07717fe1-6b85-4bf5-a378-9027a2c578b3]
198+
Location Note [https://opencontext.org/predicates/894aa7a2-27cb-4c7e-a1a3-855e4a841a40]
199+
LocAccuracy [https://opencontext.org/predicates/9615a4a5-8075-4f5b-a056-1ecf3fc416a2]
200+
Period [https://opencontext.org/predicates/1d48f6f4-7006-4a20-a785-bc12fa671a52]
201+
SiteID [https://opencontext.org/predicates/4060cc11-9a31-4dfd-a85e-652b5d0ef186]
202+
SiteName [https://opencontext.org/predicates/1b857265-f7e7-4bc9-bf71-e01c9e50425b]
203+
Country [https://opencontext.org/predicates/6c4f05b3-3f67-4c72-bb09-f4487bfcf514]
204+
Province [https://opencontext.org/predicates/3a58765e-dc57-4b58-9cf7-8bebdf3c5546]
205+
Continent [https://opencontext.org/predicates/9a5ef89f-3ca8-4e42-be97-340eccbdf23b]
206+
Source [https://opencontext.org/predicates/d6bdccfe-0609-486d-918e-3242f8b1465c]
207+
Reference [https://opencontext.org/predicates/1d319fd8-0a5d-4318-81dd-6658a9098eea]
208+
209+
210+
## Aligned Records
211+
212+
| CARD Field | Neotoma Fields | DwC | IntCal |
|---|---|---|---|
| Lab Number | ndb.geochronology.labnumber | | |
| Field Number | | | |
| Material Dated | ndb.geochronology.materialdated | materialDated | |
| Taxa Dated | | | |
| Type of Date | ndb.geochronology.agetypeid | | |
| Locality | ndb.collectionunits. | | |
| Latitude | ndb.sites.geog | | |
| Longitude | ndb.sites.geog | | |
| Map Sheet | API? | | |
| Elevation (m ASL) | ndb.sites.altitude | | |
| Submitter | ndb.datasets.datasetsubmitter | | |
| Date Submitted | ndb.datasets.recdatecreated | | |
| Collector | ndb.collectionunits | | |
| Date Collected | ndb.collectionunits | | |
| Updater | ndb.??? | | |
| Date Updated | ndb.datasets.recdateupdated | | |
| Measured Age | ndb.geochronology.geochronage | uncalibratedChronometricAge | |
| MA Sigma | ndb.geochronology.errorolder | | |
| Normalized Age | | | |
| NA Sigma | | | |
| Delta 13C (per mil) | | | |
| Delta 13 C Source | | | |
| Significance | | | |
| Site Identifier | ndb.sites.siteid | | |
| Site Name | ndb.sites.sitename | | |
| Stratigraphic Component | | | |
| Context | ndb.sites.sitedescription | | |
| Associated Taxa | ndb.variables | | |
| Additional Information | | | |
| Comments | ndb.geochronology.notes | chronometricAgeRemarks | |
| References | ndb.datasetpublications | chronometricAgeReferences | |
| | geochrontype | | |
| | infinite | | |
| | percentc | | |
| | percentcollagen | | |
| | percentn | | |
| | radiocarbonmethod | chronometricAgeProtocol | |
| | reservoir | | |
| | geochronologyid | chronometricAgeID | |
| | | earliestChronometricAge | |
| | | latestChronometricAge | |
| | | chronometricAgeConversionProtocol | |
| | ndb.chroncontrols.recdatecreated | chronometricAgeDeterminedDate | |
| | ndb.chronologies.contactid | chronometricAgeDeterminedBy | |
| | ndb.analysisunits.analysisunitid | materialDatedID | |
259+
260+
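
The alignment table could be applied as a column rename, sketched below. The mapping uses only rows of the table where both the Neotoma and DwC cells are filled; `to_dwc()` and the example record are hypothetical and made up for illustration.

```r
# Neotoma column -> DwC ChronometricAge term, taken from rows of the alignment table.
neotoma_to_dwc <- c(
  materialdated     = "materialDated",
  geochronage       = "uncalibratedChronometricAge",
  notes             = "chronometricAgeRemarks",
  radiocarbonmethod = "chronometricAgeProtocol",
  geochronologyid   = "chronometricAgeID"
)

# Rename mapped columns; unmapped columns keep their Neotoma names.
to_dwc <- function(df, mapping = neotoma_to_dwc) {
  mapped <- names(df) %in% names(mapping)
  names(df)[mapped] <- unname(mapping[names(df)[mapped]])
  df
}

# Made-up single record for illustration only.
example <- data.frame(
  geochronologyid = 1L, geochronage = 4520, errorolder = 40,
  materialdated = "charcoal", stringsAsFactors = FALSE
)

names(to_dwc(example))
```

Leaving unmapped columns untouched makes the gaps in the alignment visible in the output rather than silently dropping them.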
261+
* verbatimChronometricAge: The verbatim age for a specimen, whether reported by a dating assay, associated references, or legacy information.
262+
* chronometricAgeConversionProtocol: The method used for converting the uncalibratedChronometricAge into a chronometric age in years, as captured in the earliestChronometricAge, earliestChronometricAgeReferenceSystem, latestChronometricAge, and latestChronometricAgeReferenceSystem fields.
263+
* earliestChronometricAge: The maximum/earliest/oldest possible age of a specimen as determined by a dating method.
264+
* earliestChronometricAgeReferenceSystem: The reference system associated with the earliestChronometricAge.
265+
* latestChronometricAge: The minimum/latest/youngest possible age of a specimen as determined by a dating method.
266+
* latestChronometricAgeReferenceSystem: The reference system associated with the latestChronometricAge.
267+
* chronometricAgeUncertaintyInYears: The temporal uncertainty of the earliestChronometricAge and latestChronometricAge in years.
268+
* chronometricAgeUncertaintyMethod: The method used to generate the value of chronometricAgeUncertaintyInYears.
269+
* materialDatedID: An identifier for the MaterialSample on which the chronometricAgeProtocol was performed, if applicable.
270+
* materialDatedRelationship: The relationship of the materialDated to the subject of the ChronometricAge record, from which the ChronometricAge of the subject is inferred.
271+
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
# Loading in the data
2+
3+
We want to pull in the full complement of columns, add new publications where necessary, reassign dates, and so on.
4+
5+
```{r loadtsv}
6+
raw_data <- readr::read_tsv('data/syverson_14c.tsv')
7+
```
8+
9+
# Fields & Considerations
10+
11+
Most columns map directly onto existing Neotoma columns, either overwriting existing data or adding information to the records.
12+
13+
Some considerations:
14+
* We don't have the dataset ID.
15+
16+
17+
We can build a SQL Query to access this data as a flat file in the following way:
18+
19+
```sql
-- Flat export of geochronological dates with their site, collection unit,
-- analysis unit, radiocarbon, specimen and publication details.
SELECT st.siteid,
       st.sitename,
       cu.collectionunitid,
       cu.handle,
       au.analysisunitid,
       au.analysisunitname,
       gc.geochronid,
       gc.materialdated,
       gc.labnumber,
       gc.age,
       gc.errorolder,
       gc.erroryounger,
       gc.infinite,
       gc.agetypeid,
       gc.geochrontypeid,
       gc.notes,
       rc.radiocarbonmethodid,
       rc.delta13c,
       rc.delta15n,
       rc.cnratio,
       tx.taxonname,
       spd.elementtypeid,
       pub.publicationid,
       pub.doi,
       pub.isbn,
       pub.citation,
       pub.pubtypeid
FROM ndb.sites AS st
  INNER JOIN ndb.collectionunits AS cu ON cu.siteid = st.siteid
  INNER JOIN ndb.analysisunits AS au ON au.collectionunitid = cu.collectionunitid
  INNER JOIN ndb.samples AS smp ON smp.analysisunitid = au.analysisunitid
  INNER JOIN ndb.geochronology AS gc ON gc.sampleid = smp.sampleid
  -- Radiocarbon, specimen and repository details are optional for a date.
  LEFT JOIN ndb.radiocarbon AS rc ON rc.geochronid = gc.geochronid
  LEFT JOIN ndb.specimendates AS spd ON spd.sampleid = smp.sampleid AND spd.geochronid = gc.geochronid
  LEFT JOIN ndb.taxa AS tx ON spd.taxonid = tx.taxonid
  LEFT JOIN ndb.specimens AS spc ON spc.specimenid = spd.specimenid
  LEFT JOIN ndb.repositoryinstitutions AS ri ON ri.repositoryid = spc.repositoryid
  -- Dataset and publication joins can repeat a date across several rows.
  INNER JOIN ndb.datasets AS ds ON ds.collectionunitid = cu.collectionunitid
  LEFT JOIN ndb.datasetpublications AS dspb ON dspb.datasetid = ds.datasetid
  LEFT JOIN ndb.publications AS pub ON pub.publicationid = dspb.publicationid;
```
61+
62+
This query returns duplicated rows for some dates, mainly where multiple datasets are associated with a single sampleid or multiple publications are associated with a single datasetid.
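
One possible way to handle the duplication once the result is in R is to collapse the repeating column into a single delimited value per date. This is a minimal base R sketch on made-up data; the `flat` data frame, its values, and the `|` separator are assumptions for illustration, not the project's chosen approach.

```r
# Made-up stand-in for the query result: one date repeated across two publications.
flat <- data.frame(
  geochronid    = c(10L, 10L, 11L),
  labnumber     = c("LAB-0001", "LAB-0001", "LAB-0002"),
  age           = c(4520, 4520, 9800),
  publicationid = c(501L, 502L, 501L),
  stringsAsFactors = FALSE
)

# Collapse the repeating column into one delimited value per date.
pubs <- aggregate(publicationid ~ geochronid, data = flat,
                  FUN = function(x) paste(sort(unique(x)), collapse = "|"))

# Keep one row per date from the non-repeating columns, then reattach the
# collapsed values; all.x = TRUE keeps dates that have no publication.
dates <- unique(flat[, c("geochronid", "labnumber", "age")])
one_row_per_date <- merge(dates, pubs, by = "geochronid", all.x = TRUE)
```

The same pattern would apply to dataset IDs; doing the aggregation in SQL instead is an equally reasonable choice.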
