Skip to content

Feat: Add auto formatting bot #14927

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

georgereuben
Copy link

@georgereuben georgereuben commented Jul 9, 2025

Description

Added a workflow bot that automatically formats a PR's file changes and raises a separate PR against that PR

How it works :

When someone comments /format-fix apply on a PR, the bot will:

  • Check your code using ./gradlew check -x test to find any formatting/style issues
  • Fix what it can by running ./gradlew tidy (which includes Google Java Format, EditorConfig fixes, and other style tools)
  • Verify everything looks good by running the checks again
  • Create a separate PR with just the formatting fixes, so you can review and merge them without mixing formatting changes with your actual code changes

Supported files:

Java, Gradle, Groovy, Markdown, properties, XML, Python, shell scripts, and batch files.
Just comment "/format-fix apply" on any PR to trigger it

Closes: Issue #14835

Test Run : Test PR

Copy link

github-actions bot commented Jul 9, 2025

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@georgereuben
Copy link
Author

Hi @dweiss , added a workflow bot to apply formatting fixes and create a PR against a PR. Could you please review and suggest changes.

@dweiss
Copy link
Contributor

dweiss commented Jul 9, 2025

This looks very interesting... and way beyond my knowledge of gh actions. What are the security implications for having these permissions in the workflow? Can somebody craft a PR that would somehow alter Lucene's main repo?

@georgereuben
Copy link
Author

This looks very interesting... and way beyond my knowledge of gh actions. What are the security implications for having these permissions in the workflow? Can somebody craft a PR that would somehow alter Lucene's main repo?

There is a check before the workflow runs so that only users with admin or write repository permissions can trigger the bot. I have tested it in the test PR and verified that the workflow rejects the trigger command from an external actor

Screenshot 2025-07-09 at 5 23 25 PM


- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v46
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one needs to use a different action, one approved by apache. list:

https://github.com/apache/infrastructure-actions/blob/main/actions.yml

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried using ana06/get-changed-files@v2.3.0, but it doesn't support issue comment triggers. So I have replaced this action with a gh cli script that would perform the same action when triggered by a comment. Please let me know if this works.

@dweiss
Copy link
Contributor

dweiss commented Jul 9, 2025

Also - this should be primarily for PR authors who, after they submit the PR, find out something didn't pass the check. So it's a convenience to people who don't have permissions to the main Lucene repository. Is this what the permission check is actually verifying (sorry, it's a bit confusing with those permissions)?

@georgereuben
Copy link
Author

Also - this should be primarily for PR authors who, after they submit the PR, find out something didn't pass the check. So it's a convenience to people who don't have permissions to the main Lucene repository. Is this what the permission check is actually verifying (sorry, it's a bit confusing with those permissions)?

Understood, thanks for the insight. I see the issue. Should I go ahead and implement a check where it will allow the PR author along with folks with admin and write access to trigger the format bot?
I think this should be fine security wise since the automated PR raised would only suggest edits for the PR raised by the person triggering it and not to the main branch. What would you suggest?

Working on the other suggestions meanwhile, I will update here soon

Copy link

github-actions bot commented Jul 9, 2025

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@georgereuben georgereuben marked this pull request as draft July 9, 2025 14:17
@dweiss
Copy link
Contributor

dweiss commented Jul 9, 2025

Should I go ahead and implement a check where it will allow the PR author along with folks with admin and write access to trigger the format bot?

Yes, I think this makes sense. It's a PR against their own repository (against their PR), so I think it's fine (?).

@dweiss
Copy link
Contributor

dweiss commented Jul 9, 2025

Right - thank you for working on this. It seems way more complex than I thought. Also, it entails some worrying uncertainty with respect to what can go wrong. Here's an example - I received an email today with this:

image

but I can't locate this "comment" anywhere on this issue. Yet it did send out an e-mail, somehow. It's kind of scary to me.

@georgereuben
Copy link
Author

Right - thank you for working on this. It seems way more complex than I thought. Also, it entails some worrying uncertainty with respect to what can go wrong. Here's an example - I received an email today with this:
but I can't locate this "comment" anywhere on this issue. Yet it did send out an e-mail, somehow. It's kind of scary to me.

Yes sorry, I was testing the gh cli comment command locally and accidentally added this PR instead of my forked repo's PR as the target. So this test comment was posted from my id on that PR. I deleted it later, my apologies.

@georgereuben
Copy link
Author

Hi @dweiss, I now see that it might not be possible to raise a PR against a PR raised from a branch in a forked repo since the github actions workflow will not have access to that forked repo.
The other viable solution I can see is to add a commit to the author's PR itself as this will be possible if the PR author has allowed edits by maintainers. I have been working on this but I wanted to check with you if this would be viable

@dweiss
Copy link
Contributor

dweiss commented Jul 9, 2025

Yes, this could be an alternative. I initially wanted a PR because they you'd see what's changed but a commit is also fine if the change is triggered explicitly.

@georgereuben
Copy link
Author

Hi @dweiss @rmuir, I have updated the workflow. If the PR is raised in the same repo by a maintainer, it will raise a PR with formatting fixes, and if the PR is raised from a forked repo, it will directly commit the formatting fixes to the author's PR and the author can check the commit changes.

I have updated the workflow with the rest of your suggestions as well, please review and let me know if any other improvements can be made, thanks.

@georgereuben georgereuben marked this pull request as ready for review July 10, 2025 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A workflow "bot" that would apply formatting fixes and create a PR against a PR?
3 participants