Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove all dependencies on sloth #649

Merged

Conversation

SurferJeffAtGoogle
Copy link
Contributor

Also, move code that was copied and pasted twice into a common py file.

fixes #645

Also, move code that was copied and pasted twice into a common py file.

fixes googleapis#645
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jun 25, 2020
@@ -0,0 +1,25 @@
- name: gaxios
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to look at other ways to approach this problem. I am very worried about having yet another place where we have to list out all applicable targets. Two different thoughts.

Using the GitHub Search API
To get around this problem in github-repo-automation, I opted to use the GitHub Search API to identify the relevant repositories:
https://developer.github.com/v3/search/#search-repositories

The query we used there looks like this:

org:googleapis language:typescript language:javascript is:public archived:false

Inversion of control
Maybe a better question is, why do we need to have a centralized place for these jobs to run. Should we consider pushing responsibility of registration down to the repository level? I could absolutely see a future where orgs outside of googleapis (ads, analytics) want to use this tool. Is there a reason we couldn't ask repository owners to have a kokoro job on a cron that calls autosynth on their own?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to look at other ways to approach this problem. I am very worried about having yet another place where we have to list out all applicable targets. Two different thoughts.

Using the GitHub Search API
To get around this problem in github-repo-automation, I opted to use the GitHub Search API to identify the relevant repositories:
https://developer.github.com/v3/search/#search-repositories

The query we used there looks like this:

org:googleapis language:typescript language:javascript is:public archived:false

That's actually how I wrote the code the first time. The problem is that all the queries came back marked incomplete, and indeed were missing a few results. So autosynth would not regenerate a few unlucky repos every day.

https://developer.github.com/v3/search/#timeouts-and-incomplete-results

Inversion of control
Maybe a better question is, why do we need to have a centralized place for these jobs to run. Should we consider pushing responsibility of registration down to the repository level?

That's kinda what this code does for all but about 10 repos. Register your new nodejs repo by adding synth.py to the root, and autosynth will start generating it.

If we pushed kokoro registration down to the repo levels, then we'd be maintaining 300+ kokoro job configs in piper, that have to stay in sync with 300+ build configs and synth.py files in the repos.

I could absolutely see a future where orgs outside of googleapis (ads, analytics) want to use this tool. Is there a reason we couldn't ask repository owners to have a kokoro job on a cron that calls autosynth on their own?

I see no reason. They can do that today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - on the search result API thing, bummer, didn't know that, thank you :) I'm still wigged out about the 10 weird repos in the yaml. Instead, could we use the repos list API, find all TypeScript/JavaScript APIs with a synth.py, and use those?
https://developer.github.com/v3/repos/#list-organization-repositories

I'm really worried we are going to create a new repo without the nodejs- prefix, try to start using synthtool, and forget to register it here. Thinking out loud - I wonder if a model like this wouldn't be better to drive the entire autosynth process. Grab every repo in the org, check if it has a synth.py, and run it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll write the code to scan all the repos that contain a majority of javascript code and a synth.py file.

One issue that prevents a more general solution is that different languages expect a different build environment, as is manifest by the .sh files in https://github.com/googleapis/synthtool/tree/master/.kokoro-autosynth.

If all 7 languages attempted to examine all googleapis' repos, then that would be 7 languages * hundreds of repos = 1000s of API calls within the same hour, and I worry we'd exceed our github API quota.

We could work around that with bots or maybe github actions (can the be installed for a whole organization?) that update a master list periodically, or maybe using an https cache, but that's a much larger solution.

Copy link
Contributor

@JustinBeckwith JustinBeckwith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's make sure @chingor13 and @busunkim96 take a look at the Python and Java specific bits too.

autosynth/github.py Show resolved Hide resolved
@JustinBeckwith
Copy link
Contributor

Haha, this will create a little trouble for Java:
https://github.com/googleapis?q=&type=&language=shell

There are a few repositories GitHub Linguist thinks are shell scripts due to kokoro being so chonky. This is pretty easily fixed with a .gitattributes file filtering out the .kokoro directory:
https://github.com/github/linguist#overrides

@SurferJeffAtGoogle
Copy link
Contributor Author

Haha, this will create a little trouble for Java:
https://github.com/googleapis?q=&type=&language=shell

I added code to only consider the 7 silver languages, and ignore languages like Shell.

Most of these repos will be correctly collected because they contain the language name in their names. Inspecting the languages in the repo is a fall-back.

There are a few repositories GitHub Linguist thinks are shell scripts due to kokoro being so chonky. This is pretty easily fixed with a .gitattributes file filtering out the .kokoro directory:
https://github.com/github/linguist#overrides

Copy link
Contributor

@busunkim96 busunkim96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python looks good to me.

@SurferJeffAtGoogle SurferJeffAtGoogle merged commit d79dc99 into googleapis:master Jun 26, 2020
@SurferJeffAtGoogle SurferJeffAtGoogle deleted the squash-sloth2 branch June 26, 2020 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove usage of repos.json
5 participants