Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion lib/ruby_indexer/lib/ruby_indexer/index.rb
Original file line number Diff line number Diff line change
Expand Up @@ -196,11 +196,12 @@ def prefix_search(query, nesting = nil)

# Fuzzy searches index entries based on Jaro-Winkler similarity. If no query is provided, all entries are returned
#: (String? query) -> Array[Entry]
def fuzzy_search(query)
def fuzzy_search(query, &condition)
unless query
entries = @entries.filter_map do |_name, entries|
next if entries.first.is_a?(Entry::SingletonClass)

entries = entries.select(&condition) if condition
entries
end

Expand All @@ -212,6 +213,9 @@ def fuzzy_search(query)
results = @entries.filter_map do |name, entries|
next if entries.first.is_a?(Entry::SingletonClass)

entries = entries.select(&condition) if condition
next if entries.empty?

similarity = DidYouMean::JaroWinkler.distance(name.gsub("::", "").downcase, normalized_query)
[entries, -similarity] if similarity > ENTRY_SIMILARITY_THRESHOLD
end
Expand Down
3 changes: 3 additions & 0 deletions lib/ruby_indexer/lib/ruby_indexer/uri.rb
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ def from_path(path:, fragment: nil, scheme: "file", load_path_entry: nil)
end

uri = build(scheme: scheme, path: escaped_path, fragment: fragment)
uri.raw_path = path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we should do that, we should rather split whole work into platform specifci path unscaping/escaping and then use the right code paths on the right platform, also the result can be cached on the entry level

Copy link
Contributor Author

@janko janko Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unescaping was done regardless of the platform, and that was the main bottleneck in my profiling. The secondary bottleneck was the Regex#match? in the Windows code path, that could arguably be skipped non non-Windows platforms. But since I skipped the reconstruction of the file path entirely, there was no need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a difference between raw_path and simply path (already present in the URI object)?

Copy link
Contributor Author

@janko janko Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, path is the URI-escaped version of file path, while raw_path is the unchanged file path. For example, path will keep any spaces in the path percent-encoded:

uri = URI::Generic.from_path(path: "/Users/janko/Library/Application Support")
uri.path     # => "/Users/janko/Library/Application%20Support"
uri.raw_path # => "/Users/janko/Library/Application Support"

The problem is that with the URI-escaping we lose the information of the valid file path that was passed to URI::Generic.from_path, so URI::Generic#to_standardized_path needs to reconstruct it.

I proposed storing the original valid file path to avoid having to do that work.


if load_path_entry
uri.require_path = path.delete_prefix("#{load_path_entry}/").delete_suffix(".rb")
Expand All @@ -40,6 +41,8 @@ def from_path(path:, fragment: nil, scheme: "file", load_path_entry: nil)
end
end

#: String
attr_accessor :raw_path
#: String?
attr_accessor :require_path

Expand Down
31 changes: 19 additions & 12 deletions lib/ruby_lsp/requests/workspace_symbol.rb
Original file line number Diff line number Diff line change
Expand Up @@ -20,17 +20,7 @@ def initialize(global_state, query)
# @override
#: -> Array[Interface::WorkspaceSymbol]
def perform
@index.fuzzy_search(@query).filter_map do |entry|
uri = entry.uri
file_path = uri.full_path

# We only show symbols declared in the workspace
in_dependencies = file_path && !not_in_dependencies?(file_path)
next if in_dependencies

# We should never show private symbols when searching the entire workspace
next if entry.private?

fuzzy_search.filter_map do |entry|
kind = kind_for_entry(entry)
loc = entry.location

Expand All @@ -44,7 +34,7 @@ def perform
container_name: container.join("::"),
kind: kind,
location: Interface::Location.new(
uri: uri.to_s,
uri: entry.uri.to_s,
range: Interface::Range.new(
start: Interface::Position.new(line: loc.start_line - 1, character: loc.start_column),
end: Interface::Position.new(line: loc.end_line - 1, character: loc.end_column),
Expand All @@ -53,6 +43,23 @@ def perform
)
end
end

private

def fuzzy_search
@index.fuzzy_search(@query) do |entry|
file_path = entry.uri.raw_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is the right approach, we could rather cache this on entry level instead of skipping unscapping parser

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do the unescaping at all? We had the full path when building the URI, why reconstruct it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't work properly because the URI path and file_path aren't the same thing on Windows and our comparison for not_in_dependencies will fail.

uri = URI("file:///C:/ruby/something.rb")

# The URI's path is not a valid file path
uri.path # => "/C:/ruby/something.rb"

# It's the handling of `to_standardized_path` that turns it into the correct one
uri.to_standardized_path # => "C:/ruby/something.rb"

Copy link
Contributor Author

@janko janko Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but my understanding is that it's because of the conversion done in URI::Generic.from_path. The path that's passed to that method should be correct even Windows, shouldn't it? That's what's stored in raw_path.


# We only show symbols declared in the workspace
in_dependencies = file_path && !not_in_dependencies?(file_path)
next if in_dependencies

# We should never show private symbols when searching the entire workspace
next if entry.private?

true
end
end
end
end
end
Loading