[Repo Assist] fix: encode non-ASCII path chars as UTF-8 percent sequences in FilePathToUri by github-actions[bot] · Pull Request #1455 · ionide/FsAutoComplete

github-actions · 2026-02-22T20:08:16Z

🤖 This PR was created by Repo Assist, an automated AI assistant.

Closes #840

Root Cause

FilePathToUri in Utils.fs was encoding non-ASCII characters in the U+0080–U+00FF range (accented letters, e.g. ó in código) as a single-byte %XX value derived from the Unicode code point. For ó (U+00F3) this produced %F3.

However, the LSP specification follows RFC 3986 for URIs, under which non-ASCII characters are percent-encoded as their UTF-8 bytes. In UTF-8, ó is two bytes (0xC3 0xB3), so the correct encoding is %C3%B3.

When a client (or .NET's own Uri.LocalPath) receives %F3, it interprets it as a standalone byte that starts an incomplete 4-byte UTF-8 sequence. The path is left as c%F3digo rather than código, breaking go-to-definition, diagnostics, and file-open for any project living in a directory whose name contains such characters.
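The difference between the two encodings can be reproduced in isolation. The snippet below is a minimal sketch, not code from Utils.fs; the binding names are made up for illustration. It formats ó once from its code point and once from its UTF-8 bytes, then decodes the UTF-8 form with Uri.UnescapeDataString.

```fsharp
open System
open System.Text

let c = '\u00F3' // ó

// Old behaviour: format the code point itself as a single hex pair.
let fromCodePoint = "%" + (int c).ToString("X2")

// Fixed behaviour: format each UTF-8 byte as its own hex pair.
let fromUtf8 =
    Encoding.UTF8.GetBytes(string c)
    |> Array.map (fun b -> "%" + b.ToString("X2"))
    |> String.concat ""

printfn "%s" fromCodePoint // %F3
printfn "%s" fromUtf8      // %C3%B3

// A UTF-8 aware decoder recovers the original name from the second form.
let decoded = Uri.UnescapeDataString("c" + fromUtf8 + "digo")
printfn "%b" (decoded = "c\u00F3digo") // true
```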

Fix

Changed the else branch in FilePathToUri to obtain the UTF-8 byte representation of each character via Text.Encoding.UTF8.GetBytes and emit one %XX segment per byte:

// before
uri.Append('%')
(int c).TryFormat(buffer.Span, &out, "X2") |> ignore   // ó -> %F3 ❌

// after
let bytes = Text.Encoding.UTF8.GetBytes(string c)
for b in bytes do
    uri.Append('%')
    (int b).TryFormat(buffer.Span, &out, "X2") |> ignore  // ó -> %C3%B3 ✓

ASCII characters (space → %20, # → %23, etc.) are unaffected because their single-byte UTF-8 encoding equals their code point. Characters above U+00FF (ideographs, etc.) already passed through unencoded and are unchanged.
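As a self-contained illustration of the per-byte approach, the sketch below applies it to a whole string. `percentEncodeHighLatin` is a hypothetical helper, not the actual FilePathToUri: it covers only the U+0080–U+00FF branch discussed here, uses a StringBuilder rather than the buffer-based TryFormat call, and leaves out the ASCII escaping (space, #, and so on) that the real function performs.

```fsharp
open System.Text

/// Hypothetical helper for illustration only; not the FsAutoComplete implementation.
/// Percent-encodes characters in U+0080..U+00FF as their UTF-8 bytes and
/// copies every other character through unchanged.
let percentEncodeHighLatin (path: string) : string =
    let sb = StringBuilder()
    for c in path do
        if c >= '\u0080' && c <= '\u00FF' then
            // One %XX segment per UTF-8 byte of the character.
            for b in Encoding.UTF8.GetBytes(string c) do
                sb.Append('%').Append(b.ToString("X2")) |> ignore
        else
            sb.Append(c) |> ignore
    sb.ToString()

printfn "%s" (percentEncodeHighLatin "/home/user/c\u00F3digo/Program.fs")
// /home/user/c%C3%B3digo/Program.fs
```

Characters above U+00FF fall into the pass-through branch of this sketch, which mirrors the behaviour described above.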

Trade-offs

Each encoded non-ASCII character costs one small allocation (GetBytes(string c)). Paths are encoded rarely (once per opened document), so the cost is negligible.
Characters above U+00FF still pass through as bare Unicode (IRI-style). This matches the previous behaviour and avoids breaking any existing round-trips.

Test Status

Build: ✅ dotnet build src/FsAutoComplete.Core -f net8.0 — 0 errors, 0 warnings
URI unit tests: ✅ All 42 URI tests pass (--filter "Uri tests") including the new test:
FilePathToUri encodes accented chars as UTF-8 percent sequences — verifies ó → %C3%B3

Generated by Repo Assist

To install this workflow, run gh aw add githubnext/agentics/workflows/repo-assist.md@ee50a3b7d1d3eb4a8c409ac9409fd61c9a66b0f5. View source at https://github.com/githubnext/agentics/tree/ee50a3b7d1d3eb4a8c409ac9409fd61c9a66b0f5/workflows/repo-assist.md.

Characters in the U+0080..U+00FF range (e.g. accented letters like ó) were encoded as a single-byte `%XX` using their Unicode code point, rather than their UTF-8 byte sequence. For example, ó (U+00F3) was emitted as `%F3` instead of the correct `%C3%B3`.

The LSP URI spec requires percent-encoded bytes to be UTF-8. When a client or .NET's own Uri.LocalPath property receives `%F3`, it tries to decode it as a standalone UTF-8 byte (invalid: 0xF3 starts a 4-byte sequence but nothing follows) and either leaves it encoded or substitutes a replacement character, breaking go-to-definition and diagnostics for projects in directories whose names contain accented/non-ASCII characters.

Fix: change the else branch in FilePathToUri to obtain the UTF-8 byte representation of the character (via Encoding.UTF8.GetBytes) and emit one `%XX` segment per byte. ASCII special chars (e.g. space => `%20`) are unaffected because their single-byte UTF-8 encoding is identical to their code point.

Adds a new unit test that asserts ó is encoded as `%C3%B3`, and adds a roundtrip sample for a unicode-containing path to the existing suite.

Closes #840

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-02-22T20:08:18Z

✅ Pull request created: #1455

Summarise changes merged since v0.83.0:
- Fix SourceLink go-to-def on .NET 10 Linux (#1441)
- Add backgroundServiceProgress config option (#1452)
- Fix { trigger char for interpolated strings (#1454)
- Fix non-ASCII URI encoding (#1455)
- Fix spurious get/set rename (#1453)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Include all PRs merged since v0.83.0:
- #1441: Fix SourceLink go-to-def failure on .NET 10 on Linux
- #1452: Add FSharp.notifications.backgroundServiceProgress config option
- #1449: Fix semantic token multiline range uint32 underflow
- #1453: Fix spurious get/set rename in TextDocumentRename
- #1454: Fix missing { interpolated string completion trigger
- #1455: Fix non-ASCII path encoding in file URIs
- #1456: Disable inline values by default to restore pipeline hints
- #1457: Fix missing parens in function-type segments in AddExplicitTypeAnnotation
- #1458: Fix signature help parameter types showing fully-qualified names
- #1463: Fix seealso href/langword XML doc rendering

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions bot added automation bug repo-assist labels Feb 22, 2026

github-actions bot added the repo-assist label Feb 22, 2026

github-actions bot mentioned this pull request Feb 22, 2026

[Repo Assist] Monthly Activity 2026-02 #1450

Open

17 tasks

Krzysztof-Cieslak closed this Feb 22, 2026

Krzysztof-Cieslak reopened this Feb 22, 2026

Krzysztof-Cieslak marked this pull request as ready for review February 22, 2026 22:27

Krzysztof-Cieslak merged commit de26242 into main Feb 23, 2026
19 checks passed

github-actions bot mentioned this pull request Feb 23, 2026

[Repo Assist] changelog: prepare v0.84.0 release #1460

Draft

github-actions bot mentioned this pull request Feb 24, 2026

[Repo Assist] changelog: update v0.84.0 release prep with all merged fixes #1465

Draft

Krzysztof-Cieslak merged 1 commit into main from repo-assist/fix-issue-840-unicode-path-uri-encoding-a4b464b87b167e29
