-
RE @smroecker's bug report #158: you were right to be suspicious of these results, as they probably have not been the same historically. I am sorry this passed my initial sniff test... I should have looked further beyond After speaking with @jskovlin today, I identified an issue with the join logic for horizon data being used inside The details of the bug have to do with nuances of Since Summer 2020, the SoilProfileCollection A corresponding version bump of soilDB to 2.6.0 has been made. I certainly wish I had caught this before the 2.5.9 release, and Stephen provided me the opportunity, but alas I missed it. That said, this is very timely considering a) we didn't find it during the stats class and b) the plans to incorporate much more rigorous unit testing of NASIS functionality via the DBI interface + SQLite backend etc. in 2.6.x. Upgrading soilDB to take advantage of the new aqp integrity features should also reduce the amount of duplicated logic-checking and join code across the two packages.
-
After talking with @brownag, I'd support the following:
-
Added two more older issues to the above list: "normalize parent material, geomorp, ecosite, flattening strategies" (#84, opened on Oct 31, 2018 by @dylanbeaudette)
-
Added #192 to the list; thanks @hammerly for pointing out more cases where this discussion is relevant.
-
Some more thoughts from #192 on laboratory (field or KSSL) records that cause duplication in (naive) queries that assume a 1:1 relationship between phorizon and its child tables:

Ideally we would be able to distinguish repeated measures by depth from technical replicates of data within strata. If a re-run was due to "bad data" or some error in the process, then I would argue that the old data should probably be removed from the database to eliminate any uncertainty; or we add a flag that allows data to be filtered and marked obsolete while still retaining the "record" that it was measured and re-done.

You raise a very good point on the PSCS. That seems like an example where we wouldn't want to take a weighted average of the values within a specific horizon, but rather across horizons. So, say you have a profile that has a 25-100 cm PSCS, and morphologic horizon upper bounds at 18, 36, 75, 100. The first PSCS subsample in phlabsample might only be the 25-36 portion of the 18-36 cm horizon, and the purpose of collecting that sample was not to produce a weighted average for 18-36 but rather to serve as a component of the 25-100 interval (which spans 3 morphologic horizons). If a single horizon contained the phsample data for the whole 25-100 interval, that would be another case.

Currently there is very little validation done on the sample depths populated in there with respect to parent horizon depths, and that is probably what we would need more of to get more specific here.

Another related example is subsampling diffuse clay increases to interpolate where the clay increase/upper boundary of the argillic horizon is. I am not very familiar with this approach or sure of how prevalent it is, but I remember it from correlation training. Soil Taxonomy (p.33 1999 "The top of the argillic horizon") briefly discusses a method for interpolating the upper boundary of the argillic by "fitting a smooth curve." In the cases where this has been done, I am not sure of the conventions for how the field horizons versus the subsamples of layers are portrayed in NASIS. I imagine some sort of constant depth sampling within horizons that would then, in disaggregated form, be used to fit some sort of function.

Perhaps some queries or reports on the NASIS side to help identify [potential] data population issues in those tables would help highlight how and where these tables are being used.
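To make the PSCS example above concrete, here is a minimal sketch of a thickness-weighted average taken across horizons rather than within one. The horizon depths and the 25-100 cm PSCS come from the example; the clay values and the `overlap` helper are hypothetical and only there to make the arithmetic visible. It is plain Python, not soilDB or aqp code.

```python
def overlap(top, bottom, lo, hi):
    """Thickness (cm) of the part of [top, bottom] that falls inside [lo, hi]."""
    return max(0, min(bottom, hi) - max(top, lo))

# Morphologic horizons implied by upper bounds at 18, 36, 75, 100 cm.
# The horizon starting at 100 cm lies below the PSCS and is left out.
horizons = [(18, 36), (36, 75), (75, 100)]
pscs = (25, 100)

# Hypothetical clay percentages, one per horizon (illustration only).
clay = [18.0, 30.0, 26.0]

# Weight each horizon by the thickness that actually falls inside the PSCS.
thickness = [overlap(top, bottom, *pscs) for top, bottom in horizons]
pscs_clay = sum(c * t for c, t in zip(clay, thickness)) / sum(thickness)

print(thickness)            # only 25-36 cm of the 18-36 cm horizon is counted
print(round(pscs_clay, 2))
```

The first horizon contributes only its 25-36 cm portion, which is the point of the example: the subsample is a component of the 25-100 cm interval, not a stand-in for the whole 18-36 cm horizon.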
-
We do in defined cases.
The sequence and specific cutoffs for what is fixable need to be discussed, and we need to document those parameters once they have been.
This is me trying to get more discussion going outside the context of "issues", especially about fundamental things like the default behavior of important functions such as fetchNASIS.
The following issues are related.
error in fetchNASIS with diagHzBoolean NASIS-local
rmHzErrors
Dissaggregated glossic horizons unsupported? NASIS-local
get_hz_data_from_NASIS_db - issues with join to phsample NASIS-local
new functions for common QC of pedon / component data NASIS-local
normalize parent material, geomorp, ecosite, flattening strategies
should NA be interpreted as FALSE in .diagHzLongtoWide()?
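For anyone following the phsample join issue in the list above, here is a minimal sketch of the general problem: a join that assumes a 1:1 relationship between horizons and samples returns more rows than there are horizons as soon as one horizon has two samples. The tables and columns are simplified stand-ins, not the actual NASIS schema, and this is not the soilDB query itself.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Simplified, hypothetical stand-ins for the horizon and sample tables
CREATE TABLE phorizon (phiid INTEGER PRIMARY KEY, hzname TEXT,
                       hzdept INTEGER, hzdepb INTEGER);
CREATE TABLE phsample (sampleid INTEGER PRIMARY KEY, phiid INTEGER,
                       labsampnum TEXT);
INSERT INTO phorizon VALUES (1, 'A', 0, 18), (2, 'Bt1', 18, 36), (3, 'Bt2', 36, 75);
-- Horizon 2 has two samples, horizon 3 has none
INSERT INTO phsample VALUES (10, 1, 'S1'), (11, 2, 'S2'), (12, 2, 'S3');
""")

# Naive join: silently assumes at most one sample per horizon
naive_rows = con.execute("""
    SELECT h.phiid, h.hzname, s.labsampnum
    FROM phorizon h LEFT JOIN phsample s ON s.phiid = h.phiid
    ORDER BY h.phiid, s.sampleid
""").fetchall()

n_horizons = con.execute("SELECT COUNT(*) FROM phorizon").fetchone()[0]

# Check to run before joining: which horizons have more than one child record?
dupes = con.execute("""
    SELECT phiid, COUNT(*) FROM phsample
    GROUP BY phiid HAVING COUNT(*) > 1
""").fetchall()

print(len(naive_rows), "joined rows for", n_horizons, "horizons")
print("horizons with multiple samples:", dupes)
```

A check like the `GROUP BY ... HAVING` query could be run before the join, so that duplicated horizons are reported rather than passed along silently; what to do with them is the part that still needs discussion.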