Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Playground] GetMetadata() endpoint #25610

Merged
merged 29 commits into from
Mar 24, 2023

Conversation

TSultanov
Copy link
Contributor

@TSultanov TSultanov commented Feb 23, 2023

Backend

Add new endpoint to the Playground runners which returns metadata. Returned metadata includes:

  • Type of Beam SDK (Java, Go, Python, Scio)
  • Version of Beam SDK
  • Commit from which the runner was built
  • Timestamp of the commit

Frontend

  • Display frontend version
  • Display backend version information

resolves #22271
resolves #22542
resolves #21925
resolves #24200


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

playground/api/v1/api.proto Outdated Show resolved Hide resolved
ext.getGitCommitHash = () -> {
def stdout = new ByteArrayOutputStream()
exec {
executable 'git'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use Kotlin for new endeavors. Or at least simplify possible migration in the future with

Suggested change
executable 'git'
executable("git")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found any good ways how to import functions defined in Kotlin DSL into Gradle Groovy. If you have suggestions, I'll be glad to hear them

playground/backend/containers/java/build.gradle Outdated Show resolved Hide resolved
@TSultanov TSultanov force-pushed the get-metadata-endpoint-playground branch from f4cd8ba to 5877451 Compare February 24, 2023 10:42
@github-actions github-actions bot added the build label Mar 1, 2023
# under the License.

# Run this from the playground_components project root
FILE='lib/src/build_metadata.g.dart'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexeyinkin Can you also add this file to .gitignore so if user would build the project it wouldn't pollute git status?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this file in the repository so that the project is immediately runnable after checkout. So adding it to .gitignore has no effect. Unfortunately, it will always pollute git status.

An alternative would be to write these data to a git-ignored .txt file and to download when the frontend starts. The cost is +1 request on each page load. I would rather not do it.

git rev-parse --sq HEAD >> $FILE
echo ';' >> $FILE

echo -n 'const buildCommitSecondsSinceEpoch = ' >> $FILE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use printf instead of echo -n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

# Run this from the playground_components project root
FILE='lib/src/build_metadata.g.dart'

echo '// GENERATED CODE - DO NOT MODIFY BY HAND' > $FILE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the following pattern to put large snippets of text into files:

cat > $FILE << EOF
....
EOF

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

RUN apt-get update && apt-get install -y wget xz-utils git

RUN cat /app/playground_components/lib/src/build_metadata.g.dart
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather keep it in build logs. If anything goes wrong with generation, it will be problematic to extract this file content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we already printing it out in the script generating this file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

// TODO(alexeyinkin): When it is merged, find a better place for the remaining
// constant and delete this file.

const String kAnalyticsUA = 'G-BXFP2FNCKC';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to config.dart or similar as it no longer is a generated file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to prevent merging conflicts because there are other PRs to change this file. When analytics lands, I will.

# under the License.

# Run this from the playground_components project root
FILE='lib/src/build_metadata.g.dart'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can we determine the path to the script location and use it to construct the full path to this generated file to drop the requirement of being in the playground_components project root?
Or, pass this value as an argument from a gradle task
Or, why not generate this file from gradle directly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ determining from the script location.
I would rather not use Gradle for this because Groovy files are currently messy and we cannot do this in Kotlin.

final _runnerVersionFutures = <Sdk, Future<GetMetadataResponse>>{};

/// Returns the router version and starts loading if it is not started yet.
ComponentVersion? get routerVersion {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

such way don't assure that it returns version and client won't know when it will be available. Suggest make it future and return cached data if the have been loaded

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The single place, when this getter is used is _ComponentVersionsWidget Just use FutureBuilder there

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same about getRunnerVersion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

/// - node.play.beam.apache.org
/// This means that if the stage runs its own container for the given
/// backend node, it is used. Otherwise the production backend is used.
List<Uri> _getBackendUrlOptions(String node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

place this method under _getBackendUrl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

return urls.first;
}

print('Probing multiple options for $node backend:');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all of these prints are required?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code behaves differently locally vs staging vs prod, so better logging is required.

await client.getMetadata();
print('Using $url');
return url;
} catch (ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid catches without on clauses

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

@@ -35,9 +35,11 @@ const kProcessingStartedText = 'The processing has started\n';

// TODO(alexeyinkin): Rename. This is not a repository but a higher level client.
class CodeRepository {
final CodeClient _client;
final CodeClient client;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client changed to public here just because of one method in build_metadata.dart. I suggest to keep this field incapsulated and turn back to private

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+

@MakarkinSAkvelon
Copy link
Contributor

LGTM

@TSultanov TSultanov marked this pull request as ready for review March 6, 2023 13:24
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2023

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@alexeyinkin
Copy link
Contributor

R: @damondouglas

The only failing check is due to snippet renaming, it is addressed in

Not merging it here to simplify separate PRs and reduce conflicts.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2023

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

learning/tour-of-beam/backend/README.md Outdated Show resolved Hide resolved
playground/backend/cmd/server/controller.go Outdated Show resolved Hide resolved
playground/api/v1/api.proto Outdated Show resolved Hide resolved
Comment on lines +19 to +37
ext.getGitCommitHash = () -> {
def stdout = new ByteArrayOutputStream()
exec {
executable('git')
args('rev-parse', 'HEAD')
standardOutput = stdout
}
return stdout.toString().trim()
} as Closure<String>

ext.getGitCommitTimestamp = () -> {
def stdout = new ByteArrayOutputStream()
exec {
executable('git')
args('show', '-s', '--format=%ct', 'HEAD')
standardOutput = stdout
}
return stdout.toString().trim()
} as Closure<String>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a gradle wrapper needed? Why not just run the command in the context where the value is needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this in 6 separate locations:

  • Building the frontend container.
  • Building the router container.
  • Building the 4 runners.

The 5 backends may be similar but frontend and backend are still two rather distinct users of this. This motivates to do this in a single location like this because duplicating git show -s --format=%ct HEAD and git rev-parse HEAD 2-6 times is more confusing.

Next, we cannot do this in Dockerfile because by the time it is built the directory is unaware of .git.

This leads to the requirement of having the values before the 6 containers' builds start. And their builds are triggered from Gradle. So we cannot shorten this to command $(git rev-parse HEAD), we would still have the whole exec { ... } thing but in 12 locations.

Comment on lines +87 to +102
name containerImageName(
name: project.docker_image_default_repo_prefix + "playground-backend-go",
root: project.rootProject.hasProperty(["docker-repository-root"]) ?
project.rootProject["docker-repository-root"] :
project.docker_image_default_repo_root)
files "./build/"
tags containerImageTags()
buildArgs(
['BASE_IMAGE' : project.rootProject.hasProperty(["base-image"]) ?
project.rootProject["base-image"] :
"golang:1.18-bullseye",
'SDK_TAG' : project.rootProject.hasProperty(["sdk-tag"]) ?
project.rootProject["sdk-tag"] : project.rootProject.sdk_version,
'SDK_TAG_LOCAL': project.rootProject.sdk_version,
'GIT_COMMIT' : getGitCommitHash(),
'GIT_TIMESTAMP': getGitCommitTimestamp()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk of wrapping more and more commands with gradle, the difficult it will be to troubleshoot in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is ok in order to get the reusability

@@ -55,6 +57,7 @@ FROM apache/beam_java8_sdk:$BEAM_VERSION
ARG BEAM_VERSION
ARG SPRING_VERSION=5.3.25
ARG KAFKA_CLIENTS_VERSION=2.3.1
ENV BEAM_VERSION=$BEAM_VERSION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specify environment variables outside the Docker image using ConfigMap

Copy link
Contributor Author

@TSultanov TSultanov Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a runtime configuration parameter. This variable holds the value of Beam SDK version with which the image was built, so it's a compile-time parameter. Storing this value outside of the container makes little sense to me.

…nt-playground

# Conflicts:
#	playground/frontend/playground_components/lib/playground_components.dart
#	playground/frontend/playground_components/pubspec.yaml
Copy link
Contributor

@damondouglas damondouglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran through the entire build/deploy solution documented at https://github.com/akvelon/beam/blob/get-metadata-endpoint-playground/playground/terraform/README.md and I'm seeing CORS errors from the domain I specified -Pdns-name= in the documentation. I was unable to test through the installed application whether this PR broke anything. The images built successfully and the k8s manifests installed correctly.

Copy link
Contributor

@damccorm damccorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, we can iterate further as needed as called out in a few places

@damccorm damccorm merged commit 1d26d39 into apache:master Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants