Add semantic token LSP support for Quarto files #868

juliasilge · 2025-11-21T04:25:02Z

Addresses #420

What is semantic highlighting, you ask? Why, it is special, extra highlighting that some LSPs provide (as opposed to grammar-provided syntax highlighting):
https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide

After working on this, I understand why we didn't add it a while ago; there is a LOT of bookkeeping here! 😅

The best way to see how this is behaving is to test this in VS Code (not Positron) using Pylance. We do intend to make changes in Positron so it works similarly, but we are a bit in flux right now with our Python LSP.

If you have some Python code that gets semantic tokens highlighted in a regular .py file:

We should mostly get the same semantic tokens highlighted in a .qmd file:

Only some themes support semantic token highlighting, so be sure to use one of those (like the main built-in themes).

Copilot

Pull request overview

This PR adds semantic token support from LSP servers to Quarto documents. Semantic tokens provide enhanced syntax highlighting based on language server analysis rather than grammar-based highlighting. The implementation includes middleware to intercept semantic token requests, convert them to virtual document coordinates, remap token indices between different legend formats, and adjust positions back to real document coordinates.

Key changes:

Added semantic token provider middleware that creates virtual documents and delegates to embedded language servers
Implemented token encoding/decoding and legend remapping utilities to handle differences between language server token legends
Added comprehensive test coverage for token manipulation functions
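
To illustrate the legend remapping mentioned above, here is a minimal sketch of how token indices might be translated between two legends. It assumes a legend is a pair of string arrays (as in the LSP spec), that a token type is an index into `tokenTypes`, and that modifiers are a bitmask over `tokenModifiers`. The names `Legend`, `remapTokenType`, and `remapModifiers` are hypothetical and are not taken from this PR; dropping unknown names is also an assumption about the desired behavior.

```typescript
// Hypothetical shape: mirrors the tokenTypes/tokenModifiers arrays of an LSP legend.
interface Legend {
  tokenTypes: string[];
  tokenModifiers: string[];
}

// Translate a token type index from one legend to another by name.
// Returns undefined when the target legend has no such type (assumed behavior).
function remapTokenType(typeIndex: number, from: Legend, to: Legend): number | undefined {
  const name = from.tokenTypes[typeIndex];
  if (name === undefined) return undefined;
  const targetIndex = to.tokenTypes.indexOf(name);
  return targetIndex === -1 ? undefined : targetIndex;
}

// Translate a modifier bitmask: bit i is set when from.tokenModifiers[i] applies.
// Modifiers missing from the target legend are silently dropped (assumed behavior).
function remapModifiers(bits: number, from: Legend, to: Legend): number {
  let remapped = 0;
  from.tokenModifiers.forEach((name, bit) => {
    if ((bits & (1 << bit)) === 0) return;
    const targetBit = to.tokenModifiers.indexOf(name);
    if (targetBit !== -1) remapped |= 1 << targetBit;
  });
  return remapped;
}
```

This is where the "bit twiddling" comes from: the same modifier can sit at a different bit position in each server's legend, so each set bit has to be looked up by name and set again at its new position.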

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

File	Description
apps/vscode/src/vdoc/vdoc.ts	Added semantic token coordinate adjustment function and registered "semanticTokens" as a virtual document action
apps/vscode/src/test/semanticTokens.test.ts	Added comprehensive test suite for semantic token encoding, decoding, and legend remapping
apps/vscode/src/providers/semantic-tokens.ts	Implemented semantic token provider with legend remapping and coordinate adjustment
apps/vscode/src/lsp/client.ts	Registered semantic token middleware provider in LSP client
apps/quarto-utils/src/semantic-tokens-legend.ts	Defined standard semantic token legend for Quarto documents
apps/quarto-utils/src/index.ts	Exported semantic token legend for use across packages
apps/lsp/src/middleware.ts	Added semantic token capability and handler to LSP middleware
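
As a rough illustration of the coordinate adjustment listed for vdoc.ts, the sketch below assumes the simplest possible case: the virtual document differs from the real document by a constant number of lines. That assumption, the `lineOffset` parameter, and the name `shiftTokenLines` are all hypothetical; the actual adjustment in this PR may work differently. Because the token data is delta-encoded, a uniform line shift only needs to touch the first token's deltaLine.

```typescript
// Shift delta-encoded semantic token data by a constant number of lines.
// Only the first deltaLine is absolute (relative to line 0), so it is the
// only value that changes; every later token is relative to its predecessor.
function shiftTokenLines(data: number[], lineOffset: number): number[] {
  const firstDeltaLine = data[0];
  if (firstDeltaLine === undefined) return []; // no tokens, nothing to shift
  const shiftedLine = firstDeltaLine + lineOffset;
  if (shiftedLine < 0) {
    throw new RangeError("lineOffset would move the first token before line 0");
  }
  return [shiftedLine, ...data.slice(1)];
}
```

For example, if the virtual document had two extra lines at the top, calling `shiftTokenLines(data, -2)` would move every token up by two lines in the real document.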

vezwork

Without reading too deeply into semantic highlighting, this makes sense and the code seems straightforward. Kind of wild that we've got to do some bit twiddling lol.

Is it possible to add a test for embeddedSemanticTokensProvider? I'm not sure if we have access to MarkdownEngine in tests though. Perhaps we could make mock versions of token: CancellationToken, next: DocumentSemanticsTokensSignature to pass in?

vezwork · 2025-11-25T15:33:14Z

apps/lsp/src/middleware.ts

+  connection.languages.semanticTokens.on(async () => {
+    return { data: [] };
+  });


Is this supposed to return with an empty array? If so, why?

I believe so, based on how semantic tokens work. Returning { data: [] } says, "I'm handling this request successfully, but have no tokens to provide", while null would mean "capability not available" or "error". It's different from the other handlers where null is the standard way to say "no result".

This is the first time I've worked with semantic tokens, but I did find the spec helpful: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_semanticTokens

vezwork · 2025-11-25T15:37:13Z

apps/vscode/src/providers/semantic-tokens.ts

+ * Decode semantic tokens from delta-encoded format to absolute positions
+ *
+ * Semantic tokens are encoded as [deltaLine, deltaStartChar, length, tokenType, tokenModifiers, ...]


Does "delta-encoded" mean that, if we extracted the line and delta-line data into their own arrays, deltaLine[i] === line[i] - line[i-1], with deltaLine[0] === line[0]?

Yep, from what I understand, "delta-encoded" means the token positions are stored as relative offsets from the previous token, rather than absolute positions, so deltaLine is the number of lines relative to the previous token's line.
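
To make the encoding concrete, here is a minimal sketch of decoding and re-encoding, following the format described in the LSP 3.17 spec. The names `AbsoluteToken`, `decodeTokens`, and `encodeTokens` are hypothetical and not taken from this PR. One detail worth noting: deltaStartChar is relative to the previous token only when both tokens are on the same line; otherwise it is the absolute start character.

```typescript
// Hypothetical shape for a token with absolute (not delta) positions.
interface AbsoluteToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: number;
  tokenModifiers: number;
}

// Decode [deltaLine, deltaStartChar, length, tokenType, tokenModifiers, ...]
// into absolute positions by keeping a running line and start character.
function decodeTokens(data: number[]): AbsoluteToken[] {
  const tokens: AbsoluteToken[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i + 4 < data.length; i += 5) {
    const [deltaLine = 0, deltaStartChar = 0, length = 0, tokenType = 0, tokenModifiers = 0] =
      data.slice(i, i + 5);
    line += deltaLine;
    // Same line: offset from the previous token. New line: absolute column.
    startChar = deltaLine === 0 ? startChar + deltaStartChar : deltaStartChar;
    tokens.push({ line, startChar, length, tokenType, tokenModifiers });
  }
  return tokens;
}

// Re-encode absolute tokens as deltas. Assumes tokens are sorted by position.
function encodeTokens(tokens: AbsoluteToken[]): number[] {
  const data: number[] = [];
  let prevLine = 0;
  let prevChar = 0;
  for (const t of tokens) {
    const deltaLine = t.line - prevLine;
    const deltaStartChar = deltaLine === 0 ? t.startChar - prevChar : t.startChar;
    data.push(deltaLine, deltaStartChar, t.length, t.tokenType, t.tokenModifiers);
    prevLine = t.line;
    prevChar = t.startChar;
  }
  return data;
}
```

With the example from the spec, `[2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]` decodes to tokens at line 2 character 5, line 2 character 10, and line 5 character 2. So the intuition in the question holds for lines, with the extra wrinkle that the character delta resets whenever the line changes.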

juliasilge · 2025-11-26T01:43:15Z

I did look into testing embeddedSemanticTokensProvider, but given that we don't install other extensions into the tests, we'd have to mock:

window.activeTextEditor
engine.parse()
commands.executeCommand() (twice)
The virtual doc system

I don't think we get a lot of value and would end up just testing our mocks really.

vezwork · 2025-11-26T16:21:05Z

> I did look into testing embeddedSemanticTokensProvider, but given that we don't install other extensions into the tests, we'd have to mock:
>
> window.activeTextEditor
> engine.parse()
> commands.executeCommand() (twice)
> The virtual doc system
>
> I don't think we get a lot of value and would end up just testing our mocks really.

Thanks, that's helpful to understand.

juliasilge added 4 commits November 20, 2025 21:22

First draft of semantic token support for Quarto files

d082ba8

Actually, let's put this in the quarto-utils package

347d8f9

Better to use virtualDocForLanguage() for this provider

3fc179b

Add some tests

739e525

juliasilge changed the title ~~WIP: Add semantic token LSP support for Quarto files~~ Add semantic token LSP support for Quarto files Nov 22, 2025

juliasilge marked this pull request as ready for review November 22, 2025 00:34

juliasilge requested a review from Copilot November 22, 2025 00:35

Copilot AI reviewed Nov 22, 2025

Update CHANGELOG

05d70a4

juliasilge requested a review from vezwork November 22, 2025 00:37

vezwork approved these changes Nov 25, 2025

juliasilge merged commit 591b352 into main Nov 26, 2025
2 checks passed

juliasilge deleted the add-semantic-token-middleware branch November 26, 2025 01:58

juliasilge mentioned this pull request Nov 26, 2025

Quarto extension in VSCode should better support Python syntax highlighting #420

Closed
