Description
Motivation
Currently, rust-analyzer’s "Find All References" can become slow on large projects due to scanning many files and symbols repeatedly. I propose a bucketed symbol-to-files index to dramatically speed up reference searches.
Algorithm Overview
Initialization (Run once on analyzer startup)
- Traverse all files (N files) and collect a unique set of symbols.
- Complexity: O(N) assuming bounded symbols per file.
- Traverse all files again to create a map from symbol to the files that contain it (M symbols).
- Data structure: HashMap<SymbolID, HashSet>
- Complexity: O(M)
Since these two steps run sequentially, total cost is O(max(N, M)) — linear time, which is efficient for large codebases.
Finding References (find_references_by_bucket
)
When a user requests all references for symbol S:
- Lookup the list of files from the bucket index in O(1) time.
- Search for references to S only in these files, rather than the whole project.
- Return the found references.
This search takes O(c) or at worst O(cN), where c is the cost to search a single file. This results in a very fast lookup for most cases, and remains performant even in worst-case scenarios.
Handling New or Modified Files
To maintain correctness incrementally:
- On new or edited files, track their paths in a Vec (or similar structure).
- When running
find_references_by_bucket
, include these new/modified files in the search along with the precomputed bucket files for the symbol.
This approach avoids the need for full reindexing on every file change and keeps the system responsive.
Benefits
- Significant speedup for "Find All References" due to drastically reduced search space (O(1) in most cases and O(N) in the worst case).
- Linear-time indexing on startup is practical for large projects.
- Incremental updates keep the index accurate without full reprocessing.
- Simple data structures and algorithms make it feasible to implement and maintain.
Next Steps
I would like feedback on this approach and am happy to work on a prototype implementation if the maintainers find this direction promising.