
Researcher-dependent VS researcher-independent information #40

Open
Azamattf opened this issue Sep 8, 2022 · 2 comments

@Azamattf

Azamattf commented Sep 8, 2022

Necessity

I think we should divide the information we receive from the community into two categories:

  1. User-dependent information: how researchers use the data
     • Scientific field
     • Physical variables derived
     • Software used
     • Region/object of study
     • etc.
  2. User-independent information: everything related to the tech specs of the data set/sensor
     • Sensor name and type
     • Satellite active period
     • Temporal and regional coverage
     • Temporal and regional resolution
     • Data access platform
     • Data accessibility (open access/commercial)
     • etc.

Why is it important?

Once we have collected and validated the tech specs of a sensor, they are not subject to change unless somebody spots a typo or a mistake. Scientific applications, on the other hand, are user- or researcher-specific, and one data set can have many applications.
Therefore, we could work with two types of files:

  1. [data_set_name]_techspecs - for one sensor we would have only one file (like we have currently)
  2. [data_set_name]_application_001, [data_set_name]_application_002 - for one sensor we would have multiple files for applications.
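
A minimal sketch of this naming scheme, assuming the application files are numbered with three zero-padded digits as in the examples above. The helper names and the `cryosat` data set name are hypothetical and not part of the project's code:

```python
def techspecs_filename(data_set_name: str) -> str:
    """Name of the single tech specs file for a sensor (user-independent)."""
    return f"{data_set_name}_techspecs"


def application_filename(data_set_name: str, number: int) -> str:
    """Name of one application file for a sensor (user-dependent).

    The number is zero-padded to three digits to match the
    _001, _002 pattern; the padding width is an assumption.
    """
    return f"{data_set_name}_application_{number:03d}"


# One tech specs file per sensor, any number of application files.
specs_file = techspecs_filename("cryosat")
application_files = [application_filename("cryosat", n) for n in (1, 2)]
```

With this layout a new application submission only adds a file; the tech specs file is left untouched.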

This separation would also help us:

  • aggregate the data on applications, such as CryoSat: used in glaciology (24 researchers), used to derive ice velocity (20 users), used in studies of the Arctic (62 users) and the Antarctic (50 users).
  • collect information from the community more efficiently using two templates, one about tech specs and one for scientific applications.
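
The aggregation could look roughly like the sketch below. It assumes each application submission is a flat record whose keys mirror the user-dependent fields listed earlier; the key names and the three sample records are made up for illustration and are not real submissions:

```python
from collections import Counter

# Hypothetical application records for one sensor (illustrative values only).
applications = [
    {"scientific_field": "Glaciology", "derived_variable": "Ice velocity", "region": "Arctic"},
    {"scientific_field": "Glaciology", "derived_variable": "Ice velocity", "region": "Antarctic"},
    {"scientific_field": "Oceanography", "derived_variable": "Sea ice thickness", "region": "Arctic"},
]


def aggregate(records, field):
    """Count how many records report each value of the given field."""
    return Counter(r[field] for r in records if field in r)


field_counts = aggregate(applications, "scientific_field")
region_counts = aggregate(applications, "region")
```

Here `field_counts["Glaciology"]` is 2 and `region_counts["Arctic"]` is 2, which is the kind of per-sensor summary described above.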

In the end, the code would compile a third type of file for each dataset/sensor that combines all of this information: one file per sensor, probably called [data_set_name]_index. All these index files could then be exported to a Google Sheet.
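
One possible shape for that compilation step, as a sketch only. It assumes the tech specs and the application records are already loaded as dictionaries; `build_index`, the key names, and the `example_sensor` values are hypothetical, and the Google Sheet export is left out:

```python
import json
from collections import Counter


def build_index(data_set_name, techspecs, applications):
    """Merge one tech specs record with aggregated application counts."""
    usage = {}
    for record in applications:
        for field, value in record.items():
            # One counter per application field, e.g. "region" or "software".
            usage.setdefault(field, Counter())[value] += 1
    return {
        "name": f"{data_set_name}_index",
        "techspecs": techspecs,
        "applications": {field: dict(counts) for field, counts in usage.items()},
    }


index = build_index(
    "example_sensor",
    {"data_accessibility": "open access"},  # placeholder tech specs
    [{"region": "Arctic"}, {"region": "Arctic"}, {"region": "Antarctic"}],
)
index_json = json.dumps(index, indent=2)
```

The tech specs are copied over unchanged, while the application records are reduced to counts, so the index can be rebuilt whenever a new application file is added.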

What do you guys think?

@AdrienWehrle

Hi @Azamattf! Thanks for your work on this!

I'm not sure I see the benefit of this compared to the complexity it adds. In your opinion, what would be the main gain of integrating such a structure? To me, all fields are similar in the end; some will change more than others, but I don't see an issue with that.

@Azamattf

Hi @AdrienWehrle, thanks for the reply.
As we know, our database is not the only one in the world that aims to collect information about satellite datasets. But our project is likely to be unique because we emphasize the applications of datasets. The scientific applications will most likely differ for each user/researcher, whereas the tech specs of a sensor are the same for everyone (user-independent).

For ECRs: given this, and as I mentioned in my post, the proposed structure would serve our objective of giving ECRs a tool that provides both tech specs and aggregate information on scientific applications (see the original post).

For Contributors: I think it would be inefficient to gather tech specs every time a Contributor submits a template, because that information is user-independent. Collecting it repeatedly could waste the time of both the Contributor and the project team (the team still has to verify the sensor tech specs, but only needs to do so once). A Contributor may want to add new information about tech specs (if not already in the database), about applications, or both.

Hope that helps :)
