-
Notifications
You must be signed in to change notification settings - Fork 54
Escape characters inline #231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideAdds explicit escaping for a curated set of control and invisible Unicode characters in inline Python approval reports, documents the behavior, and hardens GitHub Actions workflows by explicitly running Sequence diagram for InlinePythonReporter create_received_file processing with control character escapingsequenceDiagram
actor Developer
participant InlinePythonReporter
participant Path
participant handle_preceeding_whitespace
participant escape_backslashes
participant escape_control_characters
participant StackFrameNamer
Developer->>InlinePythonReporter: create_received_file(received_path, test_source_file)
InlinePythonReporter->>Path: read_text(encoding_utf_8)
Path-->>InlinePythonReporter: original_received_text_raw
InlinePythonReporter->>InlinePythonReporter: original_received_text = original_received_text_raw without last char
InlinePythonReporter->>InlinePythonReporter: received_text = original_received_text + footer
InlinePythonReporter->>handle_preceeding_whitespace: handle_preceeding_whitespace(received_text)
handle_preceeding_whitespace-->>InlinePythonReporter: whitespace_normalized_text
InlinePythonReporter->>escape_backslashes: escape_backslashes(whitespace_normalized_text)
escape_backslashes-->>InlinePythonReporter: backslashes_escaped_text
InlinePythonReporter->>escape_control_characters: escape_control_characters(backslashes_escaped_text)
escape_control_characters-->>InlinePythonReporter: fully_escaped_text
InlinePythonReporter->>StackFrameNamer: get_test_frame()
StackFrameNamer-->>InlinePythonReporter: frame_with_function_name
InlinePythonReporter->>InlinePythonReporter: method_name = frame_with_function_name.function
InlinePythonReporter-->>Developer: path_to_generated_test_method_file
Class diagram for InlinePythonReporter and control character escaping helpersclassDiagram
class InlinePythonReporter {
- footer: str
+ __init__(self, footer: str)
+ create_received_file(self, received_path: str, test_source_file: str) str
}
class Reporter {
<<interface>>
+ create_received_file(received_path: str, test_source_file: str) str
}
class StackFrameNamer {
+ get_test_frame() frame
}
class inline_python_reporter_module_helpers {
+ handle_preceeding_whitespace(text: str) str
+ detect_trailing_whitespace(text: str) bool
+ escape_backslashes(text: str) str
+ escape_control_characters(text: str) str
}
InlinePythonReporter ..|> Reporter
InlinePythonReporter ..> StackFrameNamer : uses
InlinePythonReporter ..> inline_python_reporter_module_helpers : uses
note for inline_python_reporter_module_helpers "escape_control_characters maps selected control and invisible Unicode characters (such as null, backspace, zero-width spaces, directional marks, BOM) to explicit escape sequences before embedding them in docstrings"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider moving the
translation_tableout ofescape_control_characters()into a module-level constant so it isn’t rebuilt on every call and is easier to inspect/extend. - The list of control characters in
inline_approvals_whitespace.mdis a bit inconsistent (e.g.,\u200fappears twice and some items are formatted without backticks or with different descriptions); it would be clearer to normalize the formatting and ensure each character is listed once with a consistent label.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider moving the `translation_table` out of `escape_control_characters()` into a module-level constant so it isn’t rebuilt on every call and is easier to inspect/extend.
- The list of control characters in `inline_approvals_whitespace.md` is a bit inconsistent (e.g., `\u200f` appears twice and some items are formatted without backticks or with different descriptions); it would be clearer to normalize the formatting and ensure each character is listed once with a consistent label.
## Individual Comments
### Comment 1
<location> `approvaltests/namer/inline_python_reporter.py:98-99` </location>
<code_context>
- # Handle preceding whitespace consistently across all lines.
received_text = handle_preceeding_whitespace(received_text)
- # Escape backslashes to avoid accidental escape sequences in the docstring.
received_text = escape_backslashes(received_text)
+ received_text = escape_control_characters(received_text)
method_name = StackFrameNamer.get_test_frame().function
if detect_trailing_whitespace(original_received_text):
</code_context>
<issue_to_address>
**issue (bug_risk):** Consider the ordering of backslash vs control-character escaping to keep behavior consistent.
Currently, `escape_backslashes` doubles only the original backslashes; the backslashes introduced later by `escape_control_characters` (e.g. in `"\x00"`) remain single. This produces mixed single/double backslashes and may leave control-character sequences still interpretable as escapes in docstrings. To keep behavior consistent and avoid accidental escape sequences, consider running `escape_control_characters` before `escape_backslashes`, or doing a final `escape_backslashes` after both steps so all backslashes are handled uniformly.
</issue_to_address>
### Comment 2
<location> `.github/workflows/tidy_code_and_update_markdown_snippets.yml:11` </location>
<code_context>
- uses: actions/checkout@v6
- run: echo "${{ matrix.python-version }}" > .python-version
- uses: jdx/mise-action@v3
+ - run: mise install # retry for network issues
- run: ./build_and_test.sh
shell: bash
</code_context>
<issue_to_address>
**question:** The retry step always runs rather than conditionally on failure.
As written, this will always run `mise install` twice—once via `jdx/mise-action` and once in this step—rather than only retrying on failure. If you want an actual retry, consider gating this on a previous step’s failure with `if: ${{ failure() }}` or using a dedicated retry wrapper, so you avoid extra work when the initial install succeeds.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| received_text = escape_backslashes(received_text) | ||
| received_text = escape_control_characters(received_text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Consider the ordering of backslash vs control-character escaping to keep behavior consistent.
Currently, escape_backslashes doubles only the original backslashes; the backslashes introduced later by escape_control_characters (e.g. in "\x00") remain single. This produces mixed single/double backslashes and may leave control-character sequences still interpretable as escapes in docstrings. To keep behavior consistent and avoid accidental escape sequences, consider running escape_control_characters before escape_backslashes, or doing a final escape_backslashes after both steps so all backslashes are handled uniformly.
| steps: | ||
| - uses: actions/checkout@v6 | ||
| - uses: jdx/mise-action@v3 | ||
| - run: mise install # retry for network issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: The retry step always runs rather than conditionally on failure.
As written, this will always run mise install twice—once via jdx/mise-action and once in this step—rather than only retrying on failure. If you want an actual retry, consider gating this on a previous step’s failure with if: ${{ failure() }} or using a dedicated retry wrapper, so you avoid extra work when the initial install succeeds.
Summary by Sourcery
Escape problematic control characters in inline approval outputs and stabilize CI workflows for mise-based builds.
New Features:
Enhancements:
CI:
mise installstep to GitHub Actions workflows to improve reliability of test and tidy jobs.