Don't include extension when measuring cur_file_prefix_len
Stripping the extension allows e.g. "a/foo.c" and "a/foo.h" to rank
equally for a current file of "foo.c" (or, more importantly, "a/foo.h"
to rank above "a/b/foo.c"). Retaining the "." ensures that "foo.cc" will
still rank above e.g. "foox.h".
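
The ranking effect described above can be checked with a small standalone sketch. It assumes `common_prefix` returns the length of the longest shared prefix of two strings (the helper's definition is not shown in this diff), and it uses `std::string_view` in place of `boost::string_ref`. The names `common_prefix_len` and `strip_extension_keep_dot` are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for cpsm's common_prefix(): returns the length of the
// longest prefix shared by a and b. The real helper is not part of this diff.
constexpr std::size_t common_prefix_len(std::string_view a,
                                        std::string_view b) {
  std::size_t n = 0;
  while (n < a.size() && n < b.size() && a[n] == b[n]) {
    ++n;
  }
  return n;
}

// Mirrors the logic added to the Matcher constructor: drop everything after
// the last '.', but keep the '.' itself. Names without a '.' are unchanged.
constexpr std::string_view strip_extension_keep_dot(std::string_view name) {
  auto const pos = name.find_last_of('.');
  return pos == std::string_view::npos ? name : name.substr(0, pos + 1);
}
```

With a current file of "foo.c", the key becomes "foo.", so "foo.c" and "foo.h" both score a prefix length of 4 (previously 5 and 4), while "foox.h" scores 3 because the retained "." fails to match the "x".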
nixprime committed Jun 4, 2015
1 parent 4a031fc commit dc6eb65
Showing 3 changed files with 25 additions and 37 deletions.
48 changes: 14 additions & 34 deletions README.md
@@ -22,7 +22,7 @@ cpsm is to enable a particular one based on CtrlP:

4. Hit Enter to open the file you wanted in the current window.

To achieve this, cpsm needs to deliver
To achieve this, cpsm needs to deliver:

- high quality search results (at sufficiently high levels of quality, it's
possible to enter a short query, hit Enter without needing to look at and
@@ -33,7 +33,7 @@ To achieve this, cpsm needs to deliver
common switching between files is)

- with as little latency as possible (to support scaling to very large, and
especially very deeply nested, code bases with very long pathnames).
especially very deeply nested, code bases with very long pathnames)

See the "Performance" section below for both search quality and time
comparisons to other matchers.
@@ -168,40 +168,15 @@ Performance

- Query "", current file "kernel/signal.c":

- cpsm: "arch/alpha/kernel/signal.c"; 4.936ms (15.769ms with 1 thread)

- "signal" is significantly more common (e.g. just about every arch has its
own signal.c), so this result is sane, but not likely to be useful.
- cpsm: "include/asm-generic/signal.c"; 5.048ms (16.013ms with 1 thread)

- All others: same as above

- The next two cases simulate a user trying to get to files that are closely
related to the currently open file.

- Query ".h", current file "kernel/signal.c":

- cpsm: "include/asm-generic/signal.h"; 6.218ms (22.107ms with 1 thread)

- ctrlp-cmatcher: "fs/ext2/ext2.h"; 22.850ms

- ctrlp-py-matcher: "mm/slab.h"; 33.533ms

- ctrlp: "security/selinux/ss/sidtab.h"

- fzf: "Documentation/scsi/LICENSE.FlashPoint"

- Without using the current filename, there is nothing the other matchers
can do to disambiguate the query. (cpsm doesn't get what I would consider
the best match either - "include/linux/signal.h" - but it's impossible to
choose between these two results without knowledge of the Linux kernel's
source layout. cpsm does pick that file as the second-best match.) fzf's
top result seems to be particularly out of whack, since at least the
other matchers return a .h file, but this is perhaps understandable given
that fzf is the most generic matcher out of the group.

- cpsm is much faster than either of the other two benchmarkable matchers
with multithreading enabled, and competitive with ctrlp-cmatcher when
locked to a single thread.
- "signal" is a significantly more common prefix; cpsm doesn't get what I
would consider the best match ("include/linux/signal.h") but it's
impossible to choose between these two results without knowledge of the
Linux kernel's source layout. (cpsm does pick that file as the
second-best match.)

- Query "x86/", current file "kernel/signal.c":

@@ -215,7 +190,8 @@ Performance

- fzf: "Documentation/x86/early-microcode.txt"

- Similar story to the previous case.
- Without using the current filename, there is nothing the other matchers
can do to disambiguate the query.

- The next set of cases simulate a user typing progressively more letters in
a desired file's name ("include/linux/rcupdate.h"), when they happen to be
@@ -233,6 +209,10 @@ Performance

- fzf: "CREDITS"

- cpsm is much faster than either of the other two benchmarkable matchers
with multithreading enabled, and competitive with ctrlp-cmatcher when
locked to a single thread.

- Query "rc", current file "kernel/signal.c":

- cpsm: "kernel/rcu/rcu.h"; 7.617ms (26.343ms with 1 thread)
13 changes: 10 additions & 3 deletions src/matcher.cc
@@ -62,6 +62,14 @@ Matcher::Matcher(boost::string_ref const query, MatcherOpts opts,
[&](char32_t const c) { return strings_.is_uppercase(c); });

cur_file_parts_ = path_components_of(opts_.cur_file);
if (!cur_file_parts_.empty()) {
cur_file_key_ = cur_file_parts_.back();
// Strip the extension from cur_file_key_, if any (but not the trailing .)
auto const ext_sep_pos = cur_file_key_.find_last_of('.');
if (ext_sep_pos != boost::string_ref::npos) {
cur_file_key_ = cur_file_key_.substr(0, ext_sep_pos + 1);
}
}
}

bool Matcher::match_base(boost::string_ref const item, MatchBase& m,
@@ -157,9 +165,8 @@ void Matcher::match_path(std::vector<boost::string_ref> const& item_parts,
// We don't want to exclude cur_file as a match, but we also don't want it
// to be the top match, so force cur_file_prefix_len to 0 for cur_file (i.e.
// if path_distance is 0).
if (m.path_distance != 0 && !cur_file_parts_.empty() && !item_parts.empty()) {
m.cur_file_prefix_len =
common_prefix(cur_file_parts_.back(), item_parts.back());
if (m.path_distance != 0 && !item_parts.empty()) {
m.cur_file_prefix_len = common_prefix(cur_file_key_, item_parts.back());
}
}

1 change: 1 addition & 0 deletions src/matcher.h
@@ -85,6 +85,7 @@ class Matcher final {
bool is_case_sensitive_;
bool require_full_part_;
std::vector<boost::string_ref> cur_file_parts_;
boost::string_ref cur_file_key_;
};

} // namespace cpsm