Skip to content

Conversation

@joseph-isaacs
Copy link
Contributor

No description provided.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@joseph-isaacs joseph-isaacs added the changelog/fix A bug fix label Nov 26, 2025
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Copy link
Contributor

@robert3005 robert3005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird. We should be able to escape string but maybe debug is the simplest option?

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@codspeed-hq
Copy link

codspeed-hq bot commented Nov 27, 2025

CodSpeed Performance Report

Merging #5551 will improve performances by 17.18%

Comparing ji/escape-str (c5ac9a9) with develop (2f80934)

Summary

⚡ 5 improvements
✅ 1466 untouched
🆕 7 new
⏩ 249 skipped1

Benchmarks breakdown

Benchmark BASE HEAD Change
slice_arrow_buffer[1024] 397.8 ns 339.4 ns +17.18%
slice_arrow_buffer[128] 397.8 ns 339.4 ns +17.18%
slice_arrow_buffer[16384] 397.8 ns 339.4 ns +17.18%
slice_arrow_buffer[2048] 397.8 ns 339.4 ns +17.18%
slice_arrow_buffer[65536] 397.8 ns 339.4 ns +17.18%
🆕 decompress[("alp_for_bp_f64", 0x45b8d60)] N/A 29.8 ms N/A
🆕 decompress[("datetime_for_bp", 0x45bb830)] N/A 41.7 ms N/A
🆕 decompress[("dict_fsst_varbin_bp_string", 0x45bacb0)] N/A 76.3 ms N/A
🆕 decompress[("dict_fsst_varbin_string", 0x45ba930)] N/A 76.3 ms N/A
🆕 decompress[("dict_varbinview_string", 0x45b9780)] N/A 75.6 ms N/A
🆕 decompress[("for_bp_u64", 0x45b8610)] N/A 2.5 ms N/A
🆕 decompress[("runend_for_bp_u32", 0x45b9b10)] N/A 2.1 ms N/A

Footnotes

  1. 249 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov
Copy link

codecov bot commented Nov 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.47%. Comparing base (7e1d377) to head (c5ac9a9).
⚠️ Report is 8 commits behind head on develop.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@joseph-isaacs joseph-isaacs enabled auto-merge (squash) November 27, 2025 13:38
@0ax1
Copy link
Contributor

0ax1 commented Nov 27, 2025

@claude is this the most elegant way to do this? what are alternatives

@claude
Copy link
Contributor

claude bot commented Nov 27, 2025

Claude finished @0ax1's task —— View job


Analysis of String Escaping Implementation

I've reviewed the PR changes. Here's my assessment of the current approach and alternatives:


Current Implementation

The PR introduces a StringEscape wrapper type that uses Rust's built-in str::escape_debug() method. This approach:

Pros:

  • Uses stdlib functionality - relies on well-tested, maintained Rust standard library code
  • Lightweight - minimal code (only ~40 lines including tests)
  • Zero dependencies - no external crates needed
  • Correct behavior - properly escapes control characters while preserving printable Unicode (e.g., remains as-is)
  • Consistent - uses the same escaping logic as Rust's Debug trait

Cons:

  • ⚠️ Slightly indirect - wraps a Display impl around Debug functionality
  • ⚠️ Doc comment inaccuracy - says "removing all unicode escape sequences" but actually means "escaping special characters"

Alternative Approaches

1. Direct use of Debug trait (original approach)

write!(f, "{:?}", self.0)  // Old approach

Why this was problematic:

  • Triggers clippy::use_debug lint
  • Adds quotes around strings when using Debug on &str
  • Less explicit about intent

2. Use std::fmt::Display with manual escaping

for c in self.0.chars() {
    match c {
        '\n' => write!(f, "\\n")?,
        '\t' => write!(f, "\\t")?,
        // ... etc
    }
}

Pros: Full control, no wrapper type
Cons: Verbose, error-prone, duplicates stdlib logic, need to handle Unicode edge cases

3. External crate like escape8259

use escape8259::escape;
write!(f, "{}", escape(self.0))

Pros: More feature-rich escaping options
Cons: Adds dependency, overkill for this use case

4. Inline the wrapper (no separate type)

// In each Display impl:
write!(f, "{}", self.0.escape_debug())

Pros: No wrapper type needed
Cons: Duplicates the escaping call in multiple places, less DRY


My Recommendation

The current approach is excellent and appropriate. Here's why:

  1. StringEscape wrapper is a good pattern - It's a zero-cost abstraction that:

    • Makes intent explicit ("this string needs escaping for display")
    • Is reusable across the codebase
    • Keeps the Display impls clean and readable
    • Follows Rust's newtype pattern idioms
  2. str::escape_debug() is the right choice - It's:

    • Battle-tested in the Rust stdlib
    • Exactly designed for this use case (escaping strings for display/debugging)
    • Well-documented and maintained
    • Handles all edge cases (control chars, Unicode, etc.)
  3. Better than alternatives:

    • More elegant than manual escaping
    • Lighter than external crates
    • More principled than using Debug directly
    • More maintainable than inline duplication

Possible Minor Enhancement (Optional)

If you wanted to make the API slightly more ergonomic, you could add a helper method:

impl FieldName {
    pub fn escaped(&self) -> StringEscape<'_> {
        StringEscape(self.0.as_ref())
    }
}

// Usage:
write!(f, "{}", self.escaped())

But honestly, StringEscape(self.0.as_ref()) is already quite clean and explicit.

Doc comment fix: Consider updating the doc comment to be more accurate:

/// A wrapper around a string that implements `Display` by escaping special characters
/// (newlines, tabs, control characters, etc.) while preserving printable Unicode.

Conclusion

This implementation is elegant and appropriate. The StringEscape wrapper + escape_debug() combo is the sweet spot between:

  • Simplicity (no complex logic or dependencies)
  • Correctness (stdlib-backed behavior)
  • Maintainability (DRY principle, reusable abstraction)
  • Performance (zero-cost wrapper)

The review comment suggesting "debug is the simplest option" is technically correct in terms of minimal code, but the current approach is better because it:

  1. Avoids clippy warnings
  2. Makes intent explicit
  3. Provides a reusable abstraction
  4. Doesn't add quotes inappropriately

I'd approve this as-is (maybe just fix the doc comment). 👍


Copy link
Contributor

@0ax1 0ax1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\

@joseph-isaacs joseph-isaacs merged commit 265c3b6 into develop Nov 27, 2025
46 checks passed
@joseph-isaacs joseph-isaacs deleted the ji/escape-str branch November 27, 2025 13:46
connortsui20 added a commit that referenced this pull request Jan 15, 2026
Fixes: #5979

Note that the source is from
#5551, but we want to keep
that behavior.

There may be other places that this could be wrong but I can't figure
out how to get my LSP to search for them.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/fix A bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants