
Handle large GitHub releases (use caching!?) #672

Open
mfonville opened this issue Mar 18, 2016 · 7 comments
Labels
bug (Bugs in badges and the frontend) · service-badge (New or updated service badge)

Comments

@mfonville

For OpenGApps I would like to include a badge showing the number of downloads of our releases.
But even a single release of one architecture often gives timeouts, like: Latest ARM

Though what I would really prefer is:
ARM Releases

And I would like to do that for our other architectures (arm64, x86, x86_64), preferably even with a grand total for all four repositories together.
We do daily releases, so we are the "weird" one rather than the typical use case, but I would really like to use the badges.

Some caching of non-real-time data would not hurt this use case, since we are well above 3 million downloads...

(See the currently regularly failing implementation at https://github.com/opengapps/opengapps/blob/master/README.md)

@espadrine
Member

I think the timeouts are actually caused by GitHub rate limiting, for which we have an issue here: #529. I suspect it has been happening a lot less often recently.

Still, we have very recently added functionality to enforce custom browser caching: #534. You can set it to, say, 86400 seconds (a day): https://img.shields.io/github/downloads/opengapps/arm/total.svg?maxAge=86400. I will add some documentation about it on the front page soon.
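In a README, that would look like this (a sketch; the maxAge parameter is the one introduced in #534, and the alt text is just an example):

```markdown
![Total downloads](https://img.shields.io/github/downloads/opengapps/arm/total.svg?maxAge=86400)
```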

Do you feel that this addresses your issues?

@mfonville
Author

@espadrine Indeed, rate limiting is very probably the cause of the problem. I will add the browser caching; I hope that helps a bit, but I do fear that with the number of unique visitors the rate limit will be hit nonetheless.
Thanks for the support!

@mfonville
Author

By the way, I have the feeling that the 'total' is not correct.
I fetched all the download counts via the Python API and I see many millions more downloads there. After discarding the .md5s and .txts and counting only the .zips, the number is around 14 million.

Does shields.io take into account that the GitHub API uses pagination for large result sets?

@espadrine
Member

Based on the raw request's Link header, having the complete information would require 10 requests: curl -I 'https://api.github.com/repos/opengapps/arm/releases' | grep ^Link. We indeed only perform one, and that request is already pretty expensive and slow for GitHub's servers.

If GitHub offered a fast way to access the complete download total, we would be able to implement it as a live badge. Offering a badge that takes ages to load is not very practical, and, as you noted, the current badge is already pretty time-consuming for GitHub to compute.
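For illustration, a minimal sketch of what computing the complete total would involve (assuming Node 18+ with a global fetch; the assets/download_count fields and the Link pagination header come from GitHub's releases API; auth and error handling omitted):

```js
// Sum download counts across every page of /releases.
// One fetch per page is exactly what makes this too slow for a live badge.
async function totalDownloads(owner, repo) {
  let url = `https://api.github.com/repos/${owner}/${repo}/releases`;
  let total = 0;
  while (url) {
    const res = await fetch(url, {
      headers: { Accept: 'application/vnd.github+json' },
    });
    const releases = await res.json();
    for (const release of releases) {
      for (const asset of release.assets) {
        total += asset.download_count;
      }
    }
    // Follow the rel="next" link from the pagination header, if present.
    const link = res.headers.get('link') || '';
    const next = link.match(/<([^>]+)>;\s*rel="next"/);
    url = next ? next[1] : null;
  }
  return total;
}
```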

Fortunately, there's an alternative. There is a badge suggestion system designed to produce static badges when the data would be too expensive for a live badge. Users simply go to the shields.io website, paste their GitHub project's URL, and click on the "Suggest badges" button.

Currently, it does not generate a badge for real total downloads. The badge suggestion code lives in /suggest.js. If you are interested in building on that, I would welcome a pull request.

Note: The rate-limiting system for GitHub was overhauled as described here. I believe it should no longer be an issue.

@mfonville
Author

@espadrine sorry for the late reply, but thanks for your response.
I understand the issues with computing the total. I personally think @github should (try to) provide a single API call that returns such grand totals, without clients having to perform many API calls to compute them.

For the current daily totals, an important note: are you passing the per_page=100 parameter to the API request? Otherwise you only get 30 results per request, while 'for the same price' you can get 100, which is a better approximation for projects with many assets (like opengapps).
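A sketch of that single-request approximation (assuming Node 18+ with a global fetch, run as an ES module; per_page is GitHub's standard pagination parameter, capped at 100):

```js
// One request, one rate-limit unit, but up to 100 releases instead of 30.
const res = await fetch(
  'https://api.github.com/repos/opengapps/arm/releases?per_page=100'
);
const releases = await res.json();
const total = releases
  .flatMap((release) => release.assets)
  .reduce((sum, asset) => sum + asset.download_count, 0);
console.log(`downloads across the first page: ${total}`);
```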

@espadrine
Member

Testing from many months ago revealed that per_page=100 wasn't "for the same price", as the time the request took was correlated with the number of elements in the page. Maybe it has changed. I'll have to check, unless you do it first.
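A rough way to check (a sketch, assuming Node 18+ with a global fetch, run as an ES module; unauthenticated requests, so timings will be noisy and should be repeated):

```js
// Compare response times for the default page size and per_page=100.
for (const perPage of [30, 100]) {
  const start = Date.now();
  const res = await fetch(
    `https://api.github.com/repos/opengapps/arm/releases?per_page=${perPage}`
  );
  await res.text(); // consume the body so transfer time is included
  console.log(`per_page=${perPage}: ${Date.now() - start} ms`);
}
```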

@mfonville
Author

@espadrine "for the same price" was not meant as performance related for the time to answer 1 query (though in my perception GitHub API is quite fast) and I don't have any timing-related stats.
I meant that with a same single request (so not needing multiple requests) and only counting for 1 towards the API rate-limit more (complete) information is returned.

@paulmelnikow added the enhancement and service-badge labels on Apr 18, 2017
@paulmelnikow added the frontend label on Oct 13, 2017
@paulmelnikow added the bug label and removed the frontend label on Oct 16, 2017