Move tokenization to a web worker #147066

Closed
4 tasks done
hediet opened this issue Apr 8, 2022 · 4 comments · Fixed by #174443
Labels: feature-request, insiders-released, on-testplan, tokenization

Comments

@hediet
Member

hediet commented Apr 8, 2022

Also see #77140

Plan of attack:

  • Adopt vscode-textmate refactorings in vscode #167288
  • Make sure rule ids are stable by initializing injection rules eagerly
  • Tokenize in the webworker
  • Send over states by diffing against the state of the previous line. Send the delta as pop(n), push(state) instructions.
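The last step of the plan can be sketched roughly as follows. This is a minimal illustration, not the code from the linked fix: `Frame`, `Instruction`, `diffStates` and `applyDiff` are hypothetical names, and a state is modelled simply as a stack of numeric rule ids. It also suggests why stable rule ids matter: the ids cross the worker boundary, so both sides have to agree on what each id means.

```typescript
// Hypothetical model of a tokenizer state: a stack frame with a rule id
// and a link to the enclosing frame (null means the empty stack).
interface Frame {
  readonly ruleId: number;
  readonly parent: Frame | null;
}

// The delta format named in the plan: pop(n) and push(state).
type Instruction =
  | { kind: "pop"; count: number }
  | { kind: "push"; ruleId: number };

// Flatten a stack into rule ids, outermost frame first.
function toArray(top: Frame | null): number[] {
  const ids: number[] = [];
  for (let f = top; f !== null; f = f.parent) {
    ids.push(f.ruleId);
  }
  return ids.reverse();
}

// Describe how to turn the previous line's end state into the next one:
// keep the shared outer frames, pop the rest, then push the new frames.
function diffStates(prev: Frame | null, next: Frame | null): Instruction[] {
  const a = toArray(prev);
  const b = toArray(next);
  let common = 0;
  while (common < a.length && common < b.length && a[common] === b[common]) {
    common++;
  }
  const out: Instruction[] = [];
  if (a.length > common) {
    out.push({ kind: "pop", count: a.length - common });
  }
  for (let i = common; i < b.length; i++) {
    out.push({ kind: "push", ruleId: b[i] });
  }
  return out;
}

// Receiving side: replay the instructions on top of the previous state.
function applyDiff(prev: Frame | null, diff: Instruction[]): Frame | null {
  let top = prev;
  for (const ins of diff) {
    if (ins.kind === "pop") {
      for (let i = 0; i < ins.count; i++) {
        if (top === null) {
          throw new Error("pop on empty stack");
        }
        top = top.parent;
      }
    } else {
      top = { ruleId: ins.ruleId, parent: top };
    }
  }
  return top;
}
```

When consecutive lines end in the same state, which is the common case inside a long comment or string, the delta is an empty list, so very little data has to be sent per line.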

Sync tokenization:

Code_-_OSS_RQiEvFdPla.mp4

Async tokenization:

Code_-_OSS_owfdHXRTpY.mp4
hediet added the feature-request and tokenization labels on Apr 8, 2022
hediet added this to the April 2022 milestone on Apr 8, 2022
hediet self-assigned this on Apr 8, 2022
@IllusionMH
Contributor

At first I was thinking it's about .ts extension that is treated as Video Transport Stream (TS) 😆

hediet changed the title from "Movie tokenization to a web worker" to "Move tokenization to a web worker" on Apr 12, 2022
hediet changed the milestone from April 2022 to May 2022 on Apr 29, 2022
hediet changed the milestone from May 2022 to June 2022 on Jun 2, 2022
hediet changed the milestone from June 2022 to On Deck on Jul 1, 2022
hediet changed the milestone from On Deck to December 2022 on Nov 28, 2022
hediet changed the milestone from January 2023 to February 2023 on Jan 26, 2023
VSCodeTriageBot added the unreleased label on Feb 20, 2023
VSCodeTriageBot added the insiders-released label and removed the unreleased label on Feb 21, 2023
@RedCMD
Contributor

RedCMD commented Mar 16, 2023

lines inside the viewport are retokenized every time the viewport changes (scrolling etc.)
this causes many lag spikes when scrolling over massively long lines that don't tokenize fully

this is different to when it was synchronous, as it would try to tokenize the line once and not try again until the user types

@hediet
Member Author

hediet commented Mar 16, 2023

> this is different to when it was synchronous, as it would try to tokenize the line once and not try again until the user types

I don't think this is true, it wouldn't tokenize lines only once when it used heuristically computed states to tokenize them.

@RedCMD
Contributor

RedCMD commented Mar 17, 2023

me just scrolling normally at the same speed in both videos
notice the lag spikes with async on and the flickering of colours at the bottom section of the document

test file is the built-in C extension's c.tmLanguage.json file
it is a single line, 72138 characters long
vscode manages to tokenize roughly 40,000 chars before giving up with async off
and roughly 45,500 chars when async is enabled

Async.Off-1.mp4
Async.On-1.mp4
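The thread does not say why tokenization gives up part way through the long line. One plausible reading, stated here purely as an assumption, is a per-line time budget, which would also fit the cut-off point differing between the two modes. The sketch below shows that idea with a toy tokenizer that splits a line into word and non-word runs; `tokenizeWithBudget`, `BudgetedResult` and the injectable `now` clock are hypothetical names and are not taken from VS Code.

```typescript
// Toy token: a half-open character range plus a coarse kind.
interface Token {
  start: number;
  end: number;
  kind: "word" | "other";
}

// Hypothetical result shape: the tokens produced so far, whether the
// budget ran out, and how many characters were consumed.
interface BudgetedResult {
  tokens: Token[];
  stoppedEarly: boolean;
  consumed: number;
}

// Tokenize a line, but stop once the time budget is exceeded.
// `now` is injectable so the behaviour can be tested deterministically.
function tokenizeWithBudget(
  line: string,
  budgetMs: number,
  now: () => number = Date.now
): BudgetedResult {
  const deadline = now() + budgetMs;
  const tokens: Token[] = [];
  const isWord = (ch: string): boolean => /\w/.test(ch);
  let pos = 0;
  while (pos < line.length) {
    // Check the clock before each token; give up if over budget.
    if (now() > deadline) {
      return { tokens, stoppedEarly: true, consumed: pos };
    }
    const word = isWord(line.charAt(pos));
    let end = pos + 1;
    while (end < line.length && isWord(line.charAt(end)) === word) {
      end++;
    }
    tokens.push({ start: pos, end, kind: word ? "word" : "other" });
    pos = end;
  }
  return { tokens, stoppedEarly: false, consumed: pos };
}
```

Under this assumption, a caller could remember that a line came back with `stoppedEarly` set and skip retrying it until the line is edited, instead of retokenizing it on every viewport change.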

> I don't think this is true, it wouldn't tokenize lines only once when it used heuristically computed states to tokenize them.

I don't understand
is that when multiple lines are present in the viewport?

@github-actions github-actions bot locked and limited conversation to collaborators Apr 6, 2023