[Feature]: Reach out to CMCC to see if there is a way to access the ESGF publication/download stats database. #24

Open
tomvothecoder opened this issue Nov 7, 2023 · 0 comments

tomvothecoder commented Nov 7, 2023

Source: https://acme-climate.atlassian.net/wiki/spaces/IPD/pages/3974791234/2023.Q4%3A+Finish+phase+2+new+data+publication+features?focusedCommentId=3988095036

Yeah, this task is for download and publication stats from ESGF.

ESGF/CMCC does have a database that stores download and publication stats. However, the public ESGF dashboards displaying these stats are not granular enough for our needs. The ESGF API can be queried for publication stats, but not download stats.
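For reference, a publication-stats query against the ESGF search API could look roughly like the sketch below. The index node URL, the `e3sm` project value, and the `experiment_id` facet are assumptions for illustration, and the helper names are hypothetical; the response is assumed to use the usual Solr JSON layout (`response.numFound` plus flat facet lists). No request is sent here, the code only builds the URL and summarizes a payload that has already been fetched.

```python
from urllib.parse import urlencode

# Assumed index node; other ESGF index nodes should expose the same search path.
SEARCH_URL = "https://esgf-node.llnl.gov/esg-search/search"


def build_publication_query(project, facets=("experiment_id",)):
    """Build a search URL that returns counts only (limit=0) for one project."""
    params = {
        "project": project,
        "facets": ",".join(facets),
        "limit": 0,  # no dataset records, just totals and facet counts
        "format": "application/solr+json",
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def summarize_publications(payload):
    """Return (total datasets, {facet: {value: count}}) from a Solr-style payload."""
    total = payload["response"]["numFound"]
    counts = {}
    facet_fields = payload.get("facet_counts", {}).get("facet_fields", {})
    for facet, flat in facet_fields.items():
        # Solr returns facets as a flat list: [value, count, value, count, ...]
        counts[facet] = dict(zip(flat[0::2], flat[1::2]))
    return total, counts
```

This only covers publication counts; as noted above, the search API does not return download stats.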

The esgf_metrics package collects more granular download stats for E3SM data in native and CMIP6 formats, but it is limited to the LLNL node.

I’ll need to reach out to CMCC again to see if there is a way to access their database. This might open up the ability to collect more comprehensive stats across nodes. We can also simplify esgf_metrics to just query this database instead of collecting and parsing logs at the LLNL node.

Also, I’ll need to see if CMCC stores individual HTTP request information such as IP address.

With esgf_metrics, I store parsed logs in a PostgreSQL database that includes IP addresses.
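To illustrate the kind of per-request information in question, here is a minimal sketch of parsing one access log line into a row that includes the IP address. It assumes an Apache-style common/combined log format and is not the actual esgf_metrics parser or schema; the pattern and the `parse_log_line` helper are hypothetical.

```python
import re
from datetime import datetime

# Assumed Apache common/combined log layout; the real node logs may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of request fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["time"] = datetime.strptime(row["time"], "%d/%b/%Y:%H:%M:%S %z")
    row["status"] = int(row["status"])
    # A "-" size means no body was sent.
    row["size"] = 0 if row["size"] == "-" else int(row["size"])
    return row
```

Rows in this shape could then be inserted into a PostgreSQL table; whether CMCC keeps anything comparable is the open question.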
