What's the difference between …
---
For me, the first solution is the best. It does not add a new field to the Job/BeamJob APIs, it is the cheapest to implement, and it is probably more flexible for future changes if we have to add some other special logic for local runs. My only caveat is about the name. I suggest …
---
IMO the last option (3rd) is a no-go, as it would require too much "hackery" and break internal abstraction layers (e.g. it would be hard to support). For option 1, I suppose we can just put such a script into the documentation (3 commands), or even document how to create an alias for it (instructions on how to add a one-liner to …). The second option is doable, less hacky than the 3rd, but IMO still a bit too complex for the problem we are trying to solve.
---
## Simplified dockerized runs
### The problem
Running dockerized Dataflow jobs locally, using an image built from the BigFlow project Dockerfile, requires building the project package and the image, and deploying the image to the Google Container Registry so that Dataflow can reach it. That means BigFlow users have to understand what is going on under the hood. Based on observations inside Allegro, the current workflow is just too complex and inconvenient.
### Proposed solutions
#### Adding a new command to the CLI
Adding another command to the BigFlow CLI could solve the problem. We can add a `run-dockerized` command (or a flag to the standard `run` command), which is basically an alias for:

```shell
bigflow build-package --skip-tests  # we have to implement the skip-tests flag btw.
bigflow build-image
bigflow deploy-image
bigflow run ...
```
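If we go this way, the command could be a thin wrapper that chains the existing CLI steps and stops at the first failure. A minimal sketch, assuming a hypothetical `run_dockerized` entry point and plain `subprocess` chaining (not the actual BigFlow CLI internals):

```python
import subprocess
import sys


def run_dockerized(run_args: list[str]) -> None:
    """Hypothetical wrapper chaining the existing CLI steps."""
    steps = [
        ["bigflow", "build-package", "--skip-tests"],  # flag still to be implemented
        ["bigflow", "build-image"],
        ["bigflow", "deploy-image"],
        ["bigflow", "run", *run_args],
    ]
    for step in steps:
        # Fail fast: if any step fails, running the next one makes no sense.
        result = subprocess.run(step)
        if result.returncode != 0:
            sys.exit(result.returncode)


if __name__ == "__main__":
    run_dockerized(sys.argv[1:])
```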
Pros:

Cons:
#### Extending the Job contract
We can extend the `bigflow.Job` contract with an additional field, `use_docker_image: Union[bool, DockerImageId]`. Then, the `run` command can automatically build and deploy the image if `use_docker_image == True and bigflow.is_running_locally()`.
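A minimal sketch of the extended contract, assuming a hypothetical `DockerImageId` alias and `is_running_locally()` helper (both names come from this proposal, not from the current BigFlow API):

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical alias for a reference to an already-deployed image;
# the concrete representation would be decided during implementation.
DockerImageId = str


@dataclass
class Job:
    """Sketch of the extended contract; field name taken from this proposal."""
    id: str
    # False (default) -> run in-process, as today
    # True            -> build & deploy the project image before a local run
    # DockerImageId   -> reuse a specific, already-deployed image
    use_docker_image: Union[bool, DockerImageId] = False


def is_running_locally() -> bool:
    # Hypothetical helper; the real detection logic is an open question.
    return True


def run(job: Job) -> None:
    if job.use_docker_image is True and is_running_locally():
        # The `run` command would build and deploy the image here,
        # e.g. by reusing the build-image / deploy-image steps.
        pass
```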
Pros:

Cons:

- it introduces the `is_running_locally` concept for the whole framework

#### Handling the problem inside the BeamJob
We can run build and deploy inside the `BeamJob.execute` method if `use_docker_image == True and bigflow.is_running_locally()`.
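A minimal sketch of that hook, assuming the same hypothetical helpers as in option 2 and a heavily simplified `BeamJob` (the real class has a much richer constructor and execution logic):

```python
class BeamJob:
    """Heavily simplified; only the parts relevant to this option."""

    def __init__(self, id: str, use_docker_image: bool = False):
        self.id = id
        self.use_docker_image = use_docker_image

    def execute(self, context=None) -> None:
        # Option 3: hide the build/deploy step inside execute() for local runs.
        if self.use_docker_image is True and _is_running_locally():
            _build_and_deploy_image()
        # ... start the Beam pipeline, as BeamJob.execute does today ...


def _is_running_locally() -> bool:
    # Hypothetical detection helper (same concept as in option 2).
    return True


def _build_and_deploy_image() -> None:
    # Hypothetical wrapper around the build-image / deploy-image steps.
    pass
```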
Pros:

Cons:

- `BeamJob.execute` runs build and deploy methods under the hood
- the `is_running_locally()` thing