
Completion should take <16ms most of the time #7542

Open
@matklad

Code completion performance is more important than it seems. Obviously, completion shouldn't take a second or two. But even for "almost instant" completion, there's a huge difference in experience when going 300ms -> 150ms -> 50ms -> 16ms. It's difficult to perceive such small differences by just eyeballing a single completion, but they affect the fluency of typing a lot.

Historically, we haven't paid much attention to completion, barring obvious regressions. It was, and it is, generally fast enough. But it seems like we are at the point where it makes sense to push performance here a bit.

16ms is the boundary beyond which further improvement doesn't make much sense (that's comparable to typing latency), and it should be achievable: the result set is small, and the work should be roughly linear in the size of the result.
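For context on where 16ms comes from (a quick back-of-the-envelope check, not from the issue itself): it is roughly one frame at a 60 Hz display refresh rate, so completion that fits in that budget can keep up with the screen.

```rust
fn main() {
    // At 60 frames per second, each frame lasts 1000/60 ≈ 16.7ms;
    // staying at or under ~16ms keeps completion within a single frame.
    let frame_budget_ms = 1000.0 / 60.0;
    assert!((frame_budget_ms - 16.67_f64).abs() < 0.01);
    println!("frame budget: {frame_budget_ms:.2} ms");
}
```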

The best way to start here is to set this config:

    "rust-analyzer.server.extraEnv": {
        "RA_PROFILE": "handle_completion>16",
    },

and check Code's output panel for profiling info, which looks like this:

   85ms - handle_completion
       68ms - import_on_the_fly @ 
           67ms - import_assets::search_for_relative_paths
                0ms - crate_def_map:wait (804 calls)
                0ms - find_path (16 calls)
                2ms - find_similar_imports (1 calls)
                0ms - generic_params_query (334 calls)
               59ms - trait_solve_query (186 calls)
            0ms - Semantics::analyze_impl (1 calls)
            1ms - render_resolution (8 calls)
        0ms - Semantics::analyze_impl (5 calls)

Keep in mind that rust-analyzer employs lazy evaluation and memoizes query results. That means that if both f and g call s, and s is slow, then s's time will be attributed to whichever of f or g happens to run first.
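This attribution effect can be sketched with a toy memoized query (hypothetical code; rust-analyzer's actual machinery is salsa-based): the expensive body runs only for the first caller, so a profiler charges all of its time to that caller.

```rust
use std::cell::{Cell, RefCell};

// Toy memoized query `s` (a sketch, not rust-analyzer's real queries).
struct Db {
    runs: Cell<u32>,             // how many times the slow body actually ran
    cache: RefCell<Option<u64>>, // memoized result of `s`
}

impl Db {
    fn s(&self) -> u64 {
        *self.cache.borrow_mut().get_or_insert_with(|| {
            self.runs.set(self.runs.get() + 1);
            (1..=100u64).sum() // stand-in for a slow computation
        })
    }
    fn f(&self) -> u64 { self.s() } // first caller: pays the full cost
    fn g(&self) -> u64 { self.s() } // later caller: cache hit, near-free
}

fn demo() -> (u64, u64, u32) {
    let db = Db { runs: Cell::new(0), cache: RefCell::new(None) };
    let a = db.f();
    let b = db.g();
    (a, b, db.runs.get())
}

fn main() {
    let (a, b, runs) = demo();
    // Both callers see the same value, but the slow body ran exactly once,
    // so a wall-clock profile would blame `f` alone for its cost.
    assert_eq!((a, b, runs), (5050, 5050, 1));
    println!("ok");
}
```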

Another place to look at is the end-to-end request flow in the main loop. Profiling captures only the computation, but it's important that completion is not blocked by other requests.

It would also be sweet to implement some maintainable benchmarks here. This would have high impact, but I don't know how to best approach this.
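As one possible shape for such a benchmark (a sketch under assumptions, not an existing rust-analyzer harness): time a fixed workload many times and track the median sample, which is less sensitive to noise than the mean.

```rust
use std::time::{Duration, Instant};

/// Minimal micro-benchmark helper: run `f` `iters` times and return
/// the median sample. (A sketch; a real setup would use checked-in
/// fixtures, deliberate cache warm-up, and dedicated hardware.)
fn bench<F: FnMut()>(iters: usize, mut f: F) -> Duration {
    let mut samples: Vec<Duration> = (0..iters)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    // Stand-in workload; a real benchmark would request completions on a
    // fixed fixture and track the median across commits.
    let median = bench(101, || {
        let _: u64 = (1..=10_000u64).map(|x| x * x % 7).sum();
    });
    println!("median: {:?}", median);
}
```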

Other than that, just look at the code with the profiler and try to make slow things faster!

Metadata

Assignees: no one assigned

Labels: A-completion (autocompletion), A-perf (performance issues), E-has-instructions (issue has some instructions and pointers to code to get started), S-actionable (someone could pick this issue up and work on it right now), fun (a technically challenging issue with high impact), good first issue