-
Notifications
You must be signed in to change notification settings - Fork 1.8k
VFS watcher impl #556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VFS watcher impl #556
Conversation
crates/ra_vfs/src/lib.rs
Outdated
root, | ||
path: path.clone(), | ||
filter: Box::new(filter), | ||
}; | ||
res.worker.inp.send(task).unwrap(); | ||
res.watcher.watch(path).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add error handling and log a warning for errors.
I was missing the After fixing that I'm still seing high CPU usage and the ra_lsp_server sometimes hangs and doesn't close after closing the editor. I will investigate, but I think it's not ready to merge yet. |
Note that I've had to rewrite the history recently, so expect storms while rebasing: #566 (comment) Sorry about that: it's definitively my mistake 🙁 |
Didn't run into any problems when rebasing. I found out what the issue was. Using the default recursive watching ment also watching the I've changed the watching to non recursive and implemented recursive watching taking |
Huh, that is getting more complex that I have anticipated... I think “watch only specific stuff” is the right approach, but we must share that same filtering logic betweeen initial scan and subsequent watching. Specifically RootFilter is the thing we want to use here: Conceptually, SourceRoot is the set of files in the directory, and RootFilter captures this concept. It currently only filters by .rs extension, but it should filter out /target directory as well. I think it’s fine to filter out target by default. Using gitignore is also a good idea, but I think it should be read by the clients of vfs and supplied at a construction time as a RootFilter. Perhaps a RootGlob would be a better game even? |
Ok I think it's almost there. I have implemented recursive watching using the Right now in I think this should be usable now. Maybe the |
I'd just hard-code those for now as well :)
Makes sense to me! I'll also be fine with just hard-coding worst offenders and punting on this until notify 5.0
Sure, long-term we need faster data structures everywhere. Everything in rust-analyzer currently uses a dumb linear scan and recomputed every time. However, I think that before starting with "obvious" optimizations, we need to put some integration profiling infrastructure in place. That is, I'd like to see info like "we've spent 5 CPU seconds on traversing the roots in the last minute of rust-analyzer usage". |
Ok, I've hardcoded
Ok, that makes sense. But I think it's related to how ignoring files should be implemented. Using this project as an example there's a What I'm thinking is a tree with nodes like: struct Node {
ignores: Vec<Ignore>,
path: PathBuf,
root: Option<VfsRoot>,
children: Vec<Node>,
}
|
crates/ra_vfs/src/io.rs
Outdated
root: VfsRoot, | ||
path: PathBuf, | ||
root_filter: Arc<RootFilter>, | ||
nested_roots: Vec<PathBuf>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think nested_roots
should just be a part of RootFilter
, with a more generic name like excluded_dirs
.
crates/ra_vfs/src/io.rs
Outdated
} | ||
Task::HandleChange(change) => { | ||
// forward as is because Vfs has to decide if we should load it | ||
TaskResult::HandleChange(change) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this looks like one elaborate control flow! I wonder if we could simplify this... At least when I originally thought about it, I've thought of io
as a separate actor which completely encapsulates file system access and gives very "clean" TaskResult
s outside:
enum TaskResult {
BulkLoadRoot { root: VfsRoot, files: Vec<(Path, String)> },
AddSingleFile { root: VfsRoot, path: Path, text: String},
ChangeSingleFile { ... },
RemoveSingleFile { ... },
}
That is, io
handles all loading and watching of roots, and vfs itself then decides if it should ignore ChangeSingleFile
due to memap or not. Watcher becomes and implementation detail of IO.
I think that probably Worker
will need to know the set of all roots for this to work, so perhaps we should encapsulate roots: Arena<VfsRoot, Arc<RootFilter>>,
into Roots
struct and share it via an Arc
between vfs and Worker?
Obviously, I don't know the specifics of notify
API very well, so perhaps this wont' work for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, yes, that sounds better and I think it should work, I will try to implement it tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Roots
struct be in a RwLock
? At the moment it can be implemented without it but if the vfs will support adding roots after initialization it will be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Roots struct be in a RwLock? At the moment it can be implemented without it but if the vfs will support adding roots after initialization it will be needed.
I'd lean towards sharing an immutable snapshot for starters.
EDIT: that is, just Arc<Roots>
, without RwLock. I think adding dynamically reconfigurable roots will be a giant can of worms, which actually might completely change the design here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now implemented.
I had to move the watcher to inside the io workers thread:
https://github.com/rust-analyzer/rust-analyzer/blob/1692a11d09cc3d70dcde10cc896c6f7ebb9eb3c3/crates/ra_vfs/src/io.rs#L50-L51
to have access to the output sender and be able to send the TaskResult
s directly from the watcher.
I've added excluded_dirs
to RootFilter
and also
https://github.com/rust-analyzer/rust-analyzer/blob/1692a11d09cc3d70dcde10cc896c6f7ebb9eb3c3/crates/ra_vfs/src/lib.rs#L63
to make using it with walkdir
cleaner.
The rest is pretty much as you described it, which is a lot cleaner.
One thing I have found is that WalkDir
by default doesn't follow symlinks which means that entry.file_type().is_[dir|file]()
and entry.path().is_[dir|file]()
won't necessarily match. I'm thinking we should probably follow symlinks but i'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this now looks superb! I’ll do a manual smoke test tomorrow and merge!
I think it’s ok not to follow simlnks for now
use same channel for scanner and watcher some implementations pending
…d files contents in `io`
use `RootFilter` to filter recursive watches untested
Merged in #665! |
Hm, I still get occasional 100% CPU hangs. Need to investigate this |
In my case it happened when watching directories with a lot of notifications, and was accompanied by high memory usage. Maybe try adding logging for the all the |
I've found something that may be related notify-rs/notify#177 |
Implements #433
For now I ended up referencing my forked
notify
with the fix until it's merged and released.There are some things left to do, like what to do on
Rescan
orError
events, but the rest it's working.