[Pyrefly][Github actions] Add Two-pass LLM classification with PR diff attribution for primer classification #2539
Draft
…ifier Split LLM classification into two passes to fix verdict-reasoning contradictions (4/26 in PR #2493). Pass 1 produces reasoning and PR attribution without a verdict. Pass 2 reads the reasoning and assigns the verdict. This separates code analysis (hard) from labeling (easy), eliminating cases where the LLM commits to a verdict early and writes contradictory reasoning. Also adds --pyrefly-diff CLI flag to include the pyrefly PR code diff in each LLM call, enabling per-project attribution of which code change caused errors to appear or disappear.
Restructure format_markdown() to show an overview table with linked function names and file paths, collapsible detailed analysis, and a suggested fix section. Add helpers for function-name linkification and root cause extraction from PR attribution text.
Add --suggest CLI flag, Suggestion/SuggestionResult dataclasses, and generate_suggestions() LLM client that produces actionable source code fix suggestions from classification results and the PR diff.
Use a stricter regex (_INTERNAL_FUNCTION_PATTERN) that requires underscores to distinguish pyrefly internal function names like check_for_imported_final_reassignment() from common Python method names like get(), match(), set() that appear in error messages.
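A minimal sketch of the underscore heuristic described above. The pattern below is an illustrative assumption, not the actual value of `_INTERNAL_FUNCTION_PATTERN` in the PR, and the helper name `find_internal_functions` is hypothetical.

```python
import re

# Hypothetical stand-in for _INTERNAL_FUNCTION_PATTERN: a lowercase identifier
# that contains at least one underscore and is immediately followed by "()".
INTERNAL_FUNCTION_SKETCH = re.compile(r"\b([a-z][a-z0-9]*(?:_[a-z0-9]+)+)\(\)")


def find_internal_functions(text: str) -> list[str]:
    """Return snake_case function names mentioned as `name()` in text."""
    return INTERNAL_FUNCTION_SKETCH.findall(text)


message = "check_for_imported_final_reassignment() fired; dict.get() and match() did not"
names = find_internal_functions(message)
```

Requiring an underscore is only a heuristic: a library method such as `set_default()` would still match, so a real pattern may need further filtering.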
This is another iteration on our mypy primer classifier work. There are a few bugs to fix and improvements to make. Specifically:
Solution: Separate the concerns. One pass analyzes the diff and writes the message, and a second, lightweight pass reads that message and determines the verdict.
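The two-pass flow could look roughly like the sketch below. Everything here is an assumption for illustration: the `call_llm` callable, the prompt wording, the `Classification` dataclass, and the verdict labels are hypothetical and are not taken from the PR's code.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical verdict labels; the real classifier may use different ones.
VERDICTS = ("regression", "improvement", "neutral")


@dataclass
class Classification:
    reasoning: str
    verdict: str


def classify_two_pass(
    error_diff: str,
    pyrefly_diff: Optional[str],
    call_llm: Callable[[str], str],
) -> Classification:
    # Pass 1: analysis and PR attribution only, with no verdict requested.
    prompt_1 = (
        "Explain why these primer errors changed and which PR change caused it. "
        "Do not give a verdict.\n\n"
        f"Error diff:\n{error_diff}\n"
    )
    if pyrefly_diff is not None:
        prompt_1 += f"\nPR diff:\n{pyrefly_diff}\n"
    reasoning = call_llm(prompt_1).strip()

    # Pass 2: label the reasoning; the raw diffs are deliberately left out.
    prompt_2 = (
        f"Given this analysis, answer with one of {', '.join(VERDICTS)}.\n\n"
        f"Analysis:\n{reasoning}\n"
    )
    verdict = call_llm(prompt_2).strip().lower()
    if verdict not in VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    return Classification(reasoning=reasoning, verdict=verdict)
```

Because the second call sees only the reasoning, the verdict cannot drift away from what the analysis says, which is the contradiction this change targets.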