Researcher-dependent VS researcher-independent information #40

Azamattf · 2022-09-08T12:13:35Z

Necessity

I think we should divide the information we receive from the community into two types:

User-dependent information: Information about how researchers use the data

Scientific field
Physical variables derived
Software used
Region/object of study
etc

User-independent information: All data related to tech specs of the data set/sensor.

Sensor name and type
Satellite active period
Temporal and regional coverage
Temporal and regional resolution
Data access platform
Data accessibility (open access/commercial)
etc.

Why is it important?

Once we collect and validate the tech specs of a sensor, they are not subject to change unless somebody spots a typo or a mistake. Scientific applications, on the other hand, are user- or researcher-specific, and one data set can have various applications.
Therefore, we can work with 2 types of files:

[data_set_name]_techspecs - for one sensor we would have only one file (like we have currently)
[data_set_name]_application_001, [data_set_name]_application_002 - for one sensor we would have multiple files for applications.

This separation would also help us:

aggregate the data on applications, for example CryoSat: used in Glaciology (24 researchers), used to derive Ice Velocity (20 users), used in the study of the Arctic (62 users) and Antarctic (50 users).
collect information from the community more efficiently using two templates, one about tech specs and one for scientific applications.
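
To illustrate the aggregation point, here is a minimal sketch of how per-application records could be counted. The record structure and the field names (`scientific_field`, `variable`, `region`) are assumptions for illustration only, not an agreed template, and the sample records are made up.

```python
from collections import Counter

# Hypothetical application records for one data set, one per contributor.
# Field names and values are placeholders, not real submissions.
applications = [
    {"scientific_field": "Glaciology", "variable": "Ice velocity", "region": "Arctic"},
    {"scientific_field": "Glaciology", "variable": "Ice velocity", "region": "Antarctic"},
    {"scientific_field": "Oceanography", "variable": "Sea ice thickness", "region": "Arctic"},
]


def aggregate(records, keys=("scientific_field", "variable", "region")):
    """Count, for each field, how many records report each value."""
    return {key: Counter(r[key] for r in records if key in r) for key in keys}


summary = aggregate(applications)
for field, counts in summary.items():
    print(field, dict(counts))
```

Because the tech specs live in a separate file, this kind of count only has to look at the application records.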

In the end, the code would compile a third type of file for each data set/sensor containing all of the information: one file per sensor, probably called [data_set_name]_index. All of these index files can then be sent to a Google Sheet.
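
A rough sketch of how the index file could be compiled is below. It assumes the files are stored as JSON with a `.json` extension and sit in one folder; the file format, the function name `compile_index`, and the layout of the index are all assumptions, and only the `_techspecs`, `_application_NNN` and `_index` naming comes from the proposal above.

```python
import json
from pathlib import Path


def compile_index(folder, data_set_name):
    """Merge one tech specs file and all application files into one index file.

    Assumes JSON files named <data_set_name>_techspecs.json and
    <data_set_name>_application_NNN.json (the format is hypothetical).
    """
    folder = Path(folder)
    techspecs = json.loads((folder / f"{data_set_name}_techspecs.json").read_text())

    # Sort so that application_001, application_002, ... keep their order.
    application_files = sorted(folder.glob(f"{data_set_name}_application_*.json"))
    applications = [json.loads(path.read_text()) for path in application_files]

    index = {"techspecs": techspecs, "applications": applications}
    (folder / f"{data_set_name}_index.json").write_text(json.dumps(index, indent=2))
    return index
```

Calling `compile_index("data", "cryosat")` would then write `data/cryosat_index.json`, given that the (hypothetical) `data` folder holds the matching tech specs and application files.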

What do you guys think?

AdrienWehrle · 2022-09-11T15:35:25Z

Hi @Azamattf! Thanks for your work on this!

I'm not sure I see the benefit of this compared to the complexity it adds. In your opinion, what would be the main gain of integrating such a structure? To me, all fields are similar in the end; some will change more than others, but I don't see an issue with that.

Azamattf · 2022-09-19T14:54:32Z

Hi @AdrienWehrle, thanks for the reply.
As we know, ours is not the only database in the world that aims to collect information about satellite data sets. But our project is likely to be unique because we emphasize the applications of the data sets. Also, the scientific applications will most likely be different for each user/researcher. On the other hand, the tech specs of sensors are the same for everyone (user-independent).

For ECRs: considering this, like I mentioned in my post, the proposed structure would address our objective of giving ECRs a tool to help them with both tech specs and aggregate information on scientific applications (see the original post).

For Contributors: I think it would be inefficient to gather tech specs each time a Contributor submits template-based information, because such info is user-independent. Collecting it every time could be a waste of time for both the Contributor and the project team (the team still has to verify the sensor tech specs, but only needs to do so once). A Contributor may want to add new information about tech specs (if not already in the database), about applications, or both.

Hope that helps :)
