Exploring a baseline Action build #48421


Closed
bhack wants to merge 5 commits from the docker_devel_action branch

Conversation

bhack (Contributor) commented on Apr 8, 2021

With this PR I want to explore a new testing baseline with GitHub Actions and our official CPU tensorflow/tensorflow:devel image.

The idea is to test in CI the journey of a (more or less) episodic contributor contributing code to TensorFlow, at least on CPU.

This is the proposed list of steps (a rough workflow sketch follows the list):

  • tensorflow/tensorflow:devel image rebuild (or Docker Hub pull?)
  • Code checkout
  • ci_sanity.sh selected steps (--pylint, -- see Supersed pylint_allowlist #48294)
  • TF bazel ./configure
  • bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
  • bazel test //tensorflow/
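
As a rough illustration only, a minimal GitHub Actions workflow covering these steps could look like the sketch below; the job and step names, the container choice (pull vs. rebuild), the ./configure defaults, and the bazel test pattern are placeholders rather than the actual workflow proposed in this PR.

```yaml
# Hypothetical sketch of the proposed baseline Action; names, paths and targets are placeholders.
name: cpu-contributor-baseline

on: [pull_request]

jobs:
  devel-cpu-build:
    runs-on: ubuntu-latest
    container: tensorflow/tensorflow:devel      # Docker Hub pull; a rebuild step could replace this
    steps:
      - uses: actions/checkout@v2               # code checkout
      - name: Selected sanity checks
        run: tensorflow/tools/ci_build/ci_sanity.sh --pylint
      - name: Configure
        run: yes "" | ./configure               # accept default answers for a CPU-only build
      - name: Build the pip package target
        run: bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
      - name: Test
        run: bazel test //tensorflow/...        # test pattern is a placeholder
```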

As the average user is already experiencing, this will probably require a Bazel cache (on GCS, like for TF/IO?) to achieve reasonable compilation times.

I think that the reproducibility and timing of these build steps will let us monitor the experience of an episodic TensorFlow contribution.

/cc @angerson @mihaimaruseac @theadactyl @joanafilipa

google-ml-butler bot added the size:M (CL Change Size: Medium) label on Apr 8, 2021
google-cla bot added the cla: yes label on Apr 8, 2021
gbaned self-assigned this on Apr 9, 2021
bhack (Contributor, Author) commented on Apr 9, 2021

As expected, we had a GitHub Actions timeout on the TensorFlow build step after 5h 56m 42s, with only 11,286 targets compiled out of an estimated 33,563 targets configured.

GitHub Actions currently run on a Standard_DS2_v2 machine.

As we already know, this is a real bottleneck for the average external TF contributor (episodic or not), as we ask them to reproduce these steps on their own local machine just to prepare an occasional code PR.

I think it is important to continuously monitor this Action over time, so that we can keep its execution within a time that seems reasonable to us for an episodic/average TF contributor.

Some proposed solutions to enable this action, in order of preference (a cache-consumption sketch follows the list):

  • Use a Google-hosted Action runner and produce a GCS cache that could be reused by the action itself and (read-only) by every contributor on their own local machine when working with the official tensorflow/tensorflow:devel image. A sort of improvement over Enable read-only bazel cache io#1294.
  • Still use a GitHub-hosted Action runner, consuming a usable GCS cache produced elsewhere (where?).
  • Use Python-only Bazel build and test commands in this Action, taking all the C/C++ components from a system pip install tf-nightly. This is risky because, as we build the nightly only once a day, we could have a misalignment against the current master C/C++ features.
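
For the first two options, the sketch below shows how a pre-populated, read-only GCS-backed Bazel cache could be consumed from the workflow; the bucket name is a placeholder and the flags are generic Bazel remote-cache options, not an agreed-upon setup with the infra team.

```yaml
# Hypothetical workflow step consuming a read-only GCS cache; the bucket name is a placeholder.
- name: Build with a read-only remote cache
  run: |
    bazel build --config=opt \
      --remote_cache=https://storage.googleapis.com/tf-devel-bazel-cache \
      --remote_upload_local_results=false \
      //tensorflow/tools/pip_package:build_pip_package
```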

bhack (Contributor, Author) commented on Apr 13, 2021

Just in case we want to explore the first option with a self-hosted GitHub Actions runner on GKE (a minimal runner sketch follows the links):
https://github.com/summerwind/actions-runner-controller
https://github.com/evryfs/github-actions-runner-operator/
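
With the first controller, a self-hosted runner pool is declared through a RunnerDeployment custom resource; a minimal sketch follows, with the replica count and runner label as placeholders (the exact fields should be checked against the controller's documentation).

```yaml
# Hypothetical RunnerDeployment for summerwind/actions-runner-controller; values are placeholders.
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: tensorflow-ci-runners
spec:
  replicas: 2
  template:
    spec:
      repository: tensorflow/tensorflow    # repository the runners register against
      labels:
        - gke-cpu-builder                  # custom label to target these runners from a workflow
```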

bhack (Contributor, Author) commented on Apr 13, 2021

There is also a Terraform Github Self Hosted Runners on GKE repo maintained by Google Cloud members (/cc @bharathkkb) at https://github.com/terraform-google-modules/terraform-google-github-actions-runners

bhack (Contributor, Author) commented on Apr 14, 2021

/cc @perfinion in case we can do some steps on this together.

bhack (Contributor, Author) commented on Apr 15, 2021

Update: We discussed a pilot plan with @perfinion yesterday on SIG-Build Gitter.

vnghia (Contributor) commented on Apr 15, 2021

I would add one more difficulty: even with a local cache, it seems to be invalidated each time I pull commits from upstream. (I think LLVM-related commits like 17e6dc2 are the culprits.)

bhack (Contributor, Author) commented on Apr 15, 2021

> I would add one more difficulty: even with a local cache, it seems to be invalidated each time I pull commits from upstream. (I think LLVM-related commits like 17e6dc2 are the culprits.)

What cache command are you using?

vnghia (Contributor) commented on Apr 15, 2021

I am using --disk_cache. I notice that I have a much longer build (around 8–10 hours) every time there is an LLVM-related commit (which is pretty much daily, but I don't pull upstream that often).
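
As a side note for the CI case, if the Bazel disk cache were persisted with GitHub's cache action, one hedged way to cope with the LLVM bumps would be to key the cache entry on the file that pins the LLVM commit, so a bump starts a fresh entry while unrelated pulls keep hitting the previous one; the paths and the pin file below are assumptions, not a tested setup.

```yaml
# Hypothetical caching step; the disk-cache path and the LLVM pin file are assumptions.
- name: Cache the bazel disk cache
  uses: actions/cache@v2
  with:
    path: ~/.cache/bazel-disk              # directory passed to --disk_cache
    key: bazel-${{ runner.os }}-${{ hashFiles('third_party/llvm/workspace.bzl') }}
    restore-keys: |
      bazel-${{ runner.os }}-
```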

vnghia (Contributor) commented on Apr 15, 2021

I found that in #40505 (comment) @mihaimaruseac said the same thing. Do you have any problem regarding this issue, @bhack?

bhack (Contributor, Author) commented on Apr 15, 2021

> I found that in #40505 (comment) @mihaimaruseac said the same thing. Do you have any problem regarding this issue, @bhack?

We are waiting for a bootstrapped GCS cache for this action, produced with a fresh master build in tensorflow/tensorflow:devel.

bhack (Contributor, Author) commented on Apr 15, 2021

If the LLVM sync totally invalidates the remote Bazel cache, we cannot use GitHub-hosted Actions runners but need to use self-hosted GitHub Actions runners, as suggested in #48421 (comment).

gbaned (Contributor) commented on Jun 25, 2021

@bhack This PR is in draft; any update on this, please? Thanks!

bhack (Contributor, Author) commented on Jun 25, 2021

@gbaned It is a draft because, as you can see, the introduced action times out on GitHub.
I am waiting on an agreement with the infra/build team on how I could use a read-only GCS cache. /cc @mihaimaruseac @angerson

bhack (Contributor, Author) commented on Oct 14, 2021

Just for reference, it is timing out on this kind of hardware resources:

https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

bhack force-pushed the docker_devel_action branch from 1b12917 to 3175de1 on October 25, 2021 at 23:26
gbaned requested a review from mihaimaruseac on December 28, 2021 at 16:42
google-ml-butler bot added the awaiting review (Pull request awaiting review) label on Dec 28, 2021
gbaned removed the awaiting review (Pull request awaiting review) label on Feb 9, 2022
bhack (Contributor, Author) commented on Sep 27, 2022

Closing this for #57630

bhack closed this on Sep 28, 2022