@johannaojeling johannaojeling commented Jan 6, 2023

The BeamRunGoPipelineOperator currently has a go_file parameter, which is the path to a Go source file containing the pipeline code. The operator starts the pipeline with go run, i.e. it compiles the code into a temporary binary and executes it.

This PR adds support for the operator to start the pipeline with an already compiled binary, as an alternative to the source file approach. It introduces two new parameters:

  1. launcher_binary: path to a binary compiled for the launching platform, i.e. the platform where Airflow is deployed
  2. worker_binary (optional): path to a binary compiled for the worker platform, if using a remote runner
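
To make the two launch modes concrete, here is a minimal sketch in plain Python of how the start command could be chosen. It is not the operator's actual code: `build_launch_command` is a hypothetical helper, the rule that exactly one of go_file and launcher_binary must be set is an assumption about how the alternatives are meant to be used, and forwarding worker_binary as a `--worker_binary` pipeline flag is likewise assumed for illustration.

```python
from __future__ import annotations


def build_launch_command(
    *,
    go_file: str | None = None,
    launcher_binary: str | None = None,
    worker_binary: str | None = None,
    pipeline_options: dict[str, str] | None = None,
) -> list[str]:
    """Return the argv used to start a Go pipeline (hypothetical helper)."""
    # Assumption: the two launch modes are mutually exclusive.
    if (go_file is None) == (launcher_binary is None):
        raise ValueError("Set exactly one of go_file and launcher_binary")

    options = dict(pipeline_options or {})

    if go_file is not None:
        # Source mode: `go run` compiles a temporary binary, then executes it.
        command = ["go", "run", go_file]
    else:
        # Binary mode: execute the precompiled launcher directly; no Go
        # toolchain is needed on the machine that starts the pipeline.
        command = [launcher_binary]
        if worker_binary is not None:
            # Assumption: the worker binary is passed on as a pipeline flag.
            options["worker_binary"] = worker_binary

    # Render the remaining options as --name=value flags.
    command += [f"--{name}={value}" for name, value in options.items()]
    return command
```

For example, `build_launch_command(launcher_binary="/opt/bin/launcher", worker_binary="/opt/bin/worker", pipeline_options={"runner": "dataflow"})` yields a command that starts with the launcher path and carries the worker path as a flag, while passing only `go_file` yields a `go run` command. The paths here are placeholders.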

Some motivations to introduce this feature:

  • It does not require a Go installation on the system where Airflow runs (which is closer to how the BeamRunJavaPipelineOperator works, running a jar)
  • It avoids the extra steps of initializing a Go module, installing dependencies and compiling the code on every task run, which is what currently happens when the Go source file is downloaded from GCS
  • In the current implementation, only a single Go source file can be downloaded from GCS. This can be limiting if the project comprises multiple files

@johannaojeling johannaojeling force-pushed the feature/beam-go-binary branch 2 times, most recently from 9c2fb76 to c93c4e2 Compare January 12, 2023 06:29
@johannaojeling

@uranusjr can this be merged?

@kaxil kaxil merged commit 8c4303e into apache:main Jan 18, 2023
@johannaojeling johannaojeling deleted the feature/beam-go-binary branch February 9, 2023 06:19