
[Go SDK] Rewrite dot runner to generate DOT from portable pipeline proto#37673

Open
YousufFFFF wants to merge 3 commits into apache:master from YousufFFFF:go-dot-skip-composites

@YousufFFFF

YousufFFFF commented Feb 21, 2026

Fixes #27508

This change rewrites the Go SDK dot runner to generate the DOT graph from the portable pipeline proto representation instead of relying on Go SDK internal graph structures.

By basing DOT generation on the portable pipeline model:

• Cross-language pipelines can now be rendered correctly.
• The implementation aligns with the portable runner architecture.
• It enables future reuse in Prism Runner and other portable tooling.

The current implementation focuses on rendering leaf transforms (composites are skipped explicitly), keeping the traversal simple while leaving room for future refinement of composite expansion strategies.
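
A minimal sketch of the leaf-only selection described above. The `Transform` struct here is a hypothetical, simplified stand-in for a portable PTransform and is not the PR's code; the sketch assumes a composite can be recognised by a non-empty subtransform list.

```go
package main

import (
	"fmt"
	"sort"
)

// Transform is a hypothetical, simplified stand-in for a portable PTransform:
// a composite lists its children in Subtransforms, a leaf has none.
type Transform struct {
	UniqueName    string
	Subtransforms []string
}

// leafIDs returns the IDs of transforms without subtransforms, sorted so the
// result does not depend on Go's randomized map iteration order.
func leafIDs(transforms map[string]Transform) []string {
	var ids []string
	for id, t := range transforms {
		if len(t.Subtransforms) == 0 {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	transforms := map[string]Transform{
		"e1": {UniqueName: "Read", Subtransforms: []string{"e2", "e3"}},
		"e2": {UniqueName: "Read/Impulse"},
		"e3": {UniqueName: "Read/ParDo"},
		"e4": {UniqueName: "Count"},
	}
	fmt.Println(leafIDs(transforms)) // the composite "e1" is skipped
}
```

Only the leaves would then be drawn as nodes; how composites should eventually be expanded or clustered is left open, as the description says.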

@gemini-code-assist
Contributor

Summary of Changes

Hello @YousufFFFF, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request refactors the Go SDK's DOT graph generation mechanism. Instead of relying on internal Go SDK graph structures, the system now processes the portable pipeline protocol buffer representation to create DOT graphs. This fundamental shift enhances compatibility with cross-language pipelines and aligns the Go SDK with the broader portable runner architecture, paving the way for more integrated and reusable tooling.

Highlights

  • Portable Pipeline Proto Integration: The Go SDK dot runner has been rewritten to generate DOT graphs directly from the portable pipeline protocol buffer representation.
  • Cross-Language Compatibility: This change enables the correct rendering of cross-language pipelines in DOT format.
  • Architectural Alignment: The implementation now aligns with the portable runner architecture, improving consistency and maintainability.
  • Future Reusability: The new approach facilitates future reuse in tools like Prism Runner and other portable tooling.

Changelog
  • sdks/go/pkg/beam/runners/dot/dot.go
    • Replaced the dotlib import with graphx and added the fmt package.
    • Modified the Execute function to marshal pipeline edges into a portable protocol buffer representation using graphx.Marshal.
    • Implemented custom DOT graph generation logic by iterating over portable pipeline components and transforms.
    • Removed the previous reliance on dotlib.Render for DOT graph creation.
    • Added logic to build consumer relationships between PCollections and transforms, correctly handling and skipping composite transforms for accurate graph representation.
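
One way the consumer relationships in the last changelog item could be built is sketched below. This is an illustration under stated assumptions, not the code in dot.go: `Transform` and `renderDOT` are hypothetical names, and the struct only mimics the tag-to-PCollection maps of the portable model.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Transform is a hypothetical, simplified stand-in for a portable PTransform.
// Inputs and Outputs map a local tag to a PCollection ID.
type Transform struct {
	Inputs        map[string]string
	Outputs       map[string]string
	Subtransforms []string
}

// renderDOT draws one edge per (producer, consumer) pair that shares a
// PCollection, considering leaf transforms only.
func renderDOT(transforms map[string]Transform) string {
	producer := map[string]string{}    // PCollection ID -> producing leaf transform ID
	consumers := map[string][]string{} // PCollection ID -> consuming leaf transform IDs
	for id, t := range transforms {
		if len(t.Subtransforms) > 0 {
			continue // composite: its children carry the real edges
		}
		for _, pcol := range t.Outputs {
			producer[pcol] = id
		}
		for _, pcol := range t.Inputs {
			consumers[pcol] = append(consumers[pcol], id)
		}
	}
	var edges []string
	for pcol, from := range producer {
		for _, to := range consumers[pcol] {
			// %q double-quotes the ID, which is adequate for simple IDs.
			edges = append(edges, fmt.Sprintf("  %q -> %q;", from, to))
		}
	}
	sort.Strings(edges) // map iteration order is random; sort for stable output
	return "digraph G {\n" + strings.Join(edges, "\n") + "\n}\n"
}

func main() {
	transforms := map[string]Transform{
		"root":    {Subtransforms: []string{"impulse", "pardo"}, Outputs: map[string]string{"out": "pc2"}},
		"impulse": {Outputs: map[string]string{"out": "pc1"}},
		"pardo":   {Inputs: map[string]string{"in": "pc1"}, Outputs: map[string]string{"out": "pc2"}},
		"sink":    {Inputs: map[string]string{"in": "pc2"}},
	}
	fmt.Print(renderDOT(transforms))
}
```

In this example the composite "root" also lists pc2 as an output; skipping it keeps "pardo" as the single producer, so the edge into "sink" is drawn once.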

@YousufFFFF
Author

Hi @mohamedawnallah!
CI checks are green and the implementation is complete.
Would appreciate your review when you have time.
Thanks!

@github-actions
Contributor

Assigning reviewers:

R: @shunping for label go.

Note: If you would like to opt out of this review, comment "assign to next reviewer".

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@YousufFFFF
Author

Hi @lostluck!

I’ve pushed updates to align the implementation more closely with the portable pipeline proto.
The DOT generation now operates directly on pipeline.GetComponents() and skips composite transforms explicitly, rather than relying on the previous internal graph utilities.
This keeps the implementation simple for now and leaves room to evolve the composite expansion strategy later based on further discussion.
I’d really appreciate your feedback when you have time.

@YousufFFFF
Author

R: @lostluck

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".

Contributor

@mohamedawnallah mohamedawnallah left a comment

Thanks @YousufFFFF for the PR! Left some comments.

Contributor

@lostluck lostluck left a comment

I agree with all of Mohamed's comments so far.

In particular, the one about simply rewriting this to use bespoke code instead of the package.

As a first pass, add a few simple test cases that check the returned output. This is just good practice when refactoring, to make sure things work at least as well as they did before. That is not a high bar here, since there were no tests in the first place! Even just 3-4 very basic pipelines as a smoke test would go a long way.

This dot runner is very old, so it is unfortunate that it was authored without tests, since it was actively being used to look at pipeline shapes.
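
A smoke test along those lines could look roughly like the sketch below. Everything in it is an assumption for illustration: `render` is a hypothetical stand-in that takes a list of edges rather than a pipeline, and the table-driven check is written as a plain function so the block runs on its own; in the repository it would more naturally live in a `_test.go` file using the `testing` package.

```go
package main

import (
	"fmt"
	"strings"
)

// render is a hypothetical stand-in for the runner's DOT output; the real
// runner would take a pipeline rather than a list of edges.
func render(edges [][2]string) string {
	var b strings.Builder
	b.WriteString("digraph G {\n")
	for _, e := range edges {
		fmt.Fprintf(&b, "  %q -> %q;\n", e[0], e[1])
	}
	b.WriteString("}\n")
	return b.String()
}

// smokeCase pairs a small pipeline shape with fragments its DOT output must contain.
type smokeCase struct {
	name  string
	edges [][2]string
	want  []string
}

// runSmoke returns one message per missing fragment; an empty result means
// every case passed.
func runSmoke(cases []smokeCase) []string {
	var failures []string
	for _, c := range cases {
		got := render(c.edges)
		for _, w := range c.want {
			if !strings.Contains(got, w) {
				failures = append(failures, fmt.Sprintf("%s: output missing %q", c.name, w))
			}
		}
	}
	return failures
}

func main() {
	cases := []smokeCase{
		{"linear", [][2]string{{"Impulse", "ParDo"}}, []string{`"Impulse" -> "ParDo";`}},
		{"fan-out", [][2]string{{"Read", "A"}, {"Read", "B"}}, []string{`"Read" -> "A";`, `"Read" -> "B";`}},
		{"empty", nil, []string{"digraph G {", "}"}},
	}
	fmt.Println(len(runSmoke(cases)), "failures")
}
```

Checking for expected fragments rather than the full string keeps such a test from breaking on cosmetic changes such as indentation or attribute order.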


Also, do take a look at the Python "render" runner for inspiration, which also creates dot representations.

class PipelineRenderer:

(Admittedly, having done a different Python -> Go conversion for Prism's fusion handling, the Python handling is very set-theory based rather than graph based, so it can take a moment to grasp what it's doing.)

@YousufFFFF
Author

YousufFFFF commented Feb 25, 2026

Thank you @lostluck and @mohamedawnallah for the detailed review and guidance.

Based on the feedback, I’ve made the following updates:

• Removed the unused core/util/dot dependency.
• Verified there are no remaining usages in the Go SDK.
• Marked core/util/dot as Deprecated to discourage future use.
• Removed the redundant composite consumer check.
• Added deterministic topological sorting of transforms to ensure stable DOT emission.
• Added a basic deterministic output test for the DOT runner.

I also reviewed the Python render runner for reference and will keep alignment considerations in mind for future improvements while keeping this PR focused.

I appreciate the thoughtful suggestions; they helped improve clarity and maintainability.

@YousufFFFF
Author

Thanks @lostluck - I’ll take a look at the Python render runner for reference.

That’s a good suggestion, especially since it already handles DOT generation in a more structured way. I’ll review its approach and see if there are any patterns or simplifications we can adopt here while keeping the Go implementation idiomatic.

Appreciate the pointer 👍

Development

Successfully merging this pull request may close these issues.

[Feature Request][Go SDK]: Rewrite the dot runner in terms of a Portable pipeline.

3 participants