Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] Make results of initial scan available #478

Closed
Shelim opened this issue Apr 23, 2023 · 20 comments · Fixed by #507
Closed

[Feature request] Make results of initial scan available #478

Shelim opened this issue Apr 23, 2023 · 20 comments · Fixed by #507
Labels
A-enhancement Z-needs implementation Needs an implementation, will accept PRs

Comments

@Shelim
Copy link

Shelim commented Apr 23, 2023

This is usecase for #339

Scenario:

  • Platform with slow file-system access and a lot of files
  • Need initial list of files as well as events on changes

Now I cannot obtain initial list of files from the API. I have to call separately walkdir which imply some drawbacks:

  1. I need to rescan the whole file-system again
  2. If one file is changed during rescanning, it changes can be lost (initial scan output will be provided later then changed event)
  3. It imply memory cost for rescanning large file-system again
  4. There may be some platform-specific inconsistencies in case symlink/hardlink. Initial set of files from notify may differ from walkdir output (unless notify is using walkdir internally, I haven't check up that)
@0xpr03
Copy link
Member

0xpr03 commented Apr 23, 2023

There is no initial scan on windows. So this would be a special api for inotify / pollwatcher backend.

@Shelim
Copy link
Author

Shelim commented Apr 29, 2023

Ah, I tricked myself the notify is doing initial scan as I only used pollwatcher. Okay, it seems legit now

@ShayBox
Copy link

ShayBox commented May 7, 2023

This would be useful, I already use walkdir prior to notify (pollwatcher), this would prevent double scanning, speeding up my program startup, and reduce dependencies in my project.

@0xpr03 0xpr03 added A-enhancement Z-needs implementation Needs an implementation, will accept PRs labels May 7, 2023
@0xpr03
Copy link
Member

0xpr03 commented May 9, 2023

How exactly would you like to access the initial scan ? Or do you want to access the live state too ?

One could want access to the full internal path cache, or alternatively provide a callback function that gets called for each path found during the initial scan.

@ShayBox
Copy link

ShayBox commented May 9, 2023

I suppose whichever is easier, pollwatcher uses walkdir internally so exposing that would probably be easiest, but I'm not sure

Maybe an option to emit the first scan results to the existing event callback? That might be more work

@Shelim
Copy link
Author

Shelim commented May 9, 2023

If we can change emitted event kinds, I believe the best way would be to add event callback kind for initial state. This way you can use exactly the same implementation mechanism you use for updated files

@0xpr03
Copy link
Member

0xpr03 commented Jul 14, 2023

Could you try out #507 and give some feedback ? There is an example inside too.

@ShayBox
Copy link

ShayBox commented Jul 15, 2023

Is there a way to optionally supply the same handler for both?

let mut watcher = PollWatcher::with_initial_scan(tx, notify_config, Some(tx))?;

EDIT: It seems the ScanEventHandler sends a ScanEvent which is a type alias for Result but the type alias for EventHandler excludes the Result, instead Result

struct Handler;

impl EventHandler for Handler {
    fn handle_event(&mut self, event: notify::Result<Event>) {
        todo!()
    }
}

impl ScanEventHandler for Handler {
    fn handle_event(&mut self, event: notify::Result<PathBuf>) {
        todo!()
    }
}

@0xpr03
Copy link
Member

0xpr03 commented Jul 15, 2023

Hm yeah, I opted for putting the Result inside the ScanEvent type - maybe that's more confusing than anything (changing this for notify would be a semver break).

@0xpr03
Copy link
Member

0xpr03 commented Jul 15, 2023

I left it like that and added some info to the ScanEventHandler docs.

@ShayBox
Copy link

ShayBox commented Jul 15, 2023

Returning a result is fine, I just meant that normal event is wrapped in a result, while a scan event includes the result not just the pathbuf

Whats the purpose of the option parameter since the with variant is optional to use?

@0xpr03
Copy link
Member

0xpr03 commented Jul 15, 2023

Whats the purpose of the option parameter since the with variant is optional to use?

Code unification

while a scan event includes the result not just the pathbuf

Yeah changing this to use the result in the normal event would be a major version change. But for the scan events it definitely makes sense to have this as one type.

@0xpr03
Copy link
Member

0xpr03 commented Jul 15, 2023

Hm I could internalize this so you don't have to add a Some, why not.

@Shelim
Copy link
Author

Shelim commented Jul 15, 2023

Yeah, since we already include with_initial_scan in the function name :-)

I looked at the example and it seems to be exactly what I needed. I will check the implementation in actual app in a few moments

@ShayBox
Copy link

ShayBox commented Jul 16, 2023

This is what I came up with but 'Sender<PathBuf>' cannot be shared between threads safely

use std::{
    path::{Path, PathBuf},
    time::Duration,
};

use notify::{Config, Event, PollWatcher, RecursiveMode, Watcher};

fn main() {
    let (tx, rx) = std::sync::mpsc::channel();
    let mut watcher = PollWatcher::with_initial_scan(
        |event_result: notify::Result<Event>| {
            let Ok(event) = event_result else {
                return;
            };

            for path in event.paths {
                tx.send(path).unwrap();
            }
        },
        Config::default().with_poll_interval(Duration::from_secs(1)),
        Some(|path_result: notify::Result<PathBuf>| {
            let Ok(path) = path_result else {
                return;
            };

            tx.send(path).unwrap();
        }),
    )
    .unwrap();

    watcher
        .watch(Path::new("."), RecursiveMode::Recursive)
        .unwrap();

    loop {
        let path = rx.recv().unwrap();
        println!("path: {:?}", path);
    }
}

@0xpr03
Copy link
Member

0xpr03 commented Jul 16, 2023

@ShayBox

fn main() {
    let (tx, rx) = std::sync::mpsc::channel();
    let tx_c = tx.clone();
    let mut watcher = PollWatcher::with_initial_scan(
        move |event_result: notify::Result<Event>| {
            let Ok(event) = event_result else {
                return;
            };

            for path in event.paths {
                tx_c.send(path).unwrap();
            }
        },
        Config::default().with_poll_interval(Duration::from_secs(1)),
        move |path_result: notify::Result<PathBuf>| {
            let Ok(path) = path_result else {
                return;
            };

            tx.send(path).unwrap();
        },
    )
    .unwrap();
   //[..]
}

See also the example. You need to move and clone() your tx once, as you need one per closure.

@ShayBox
Copy link

ShayBox commented Jul 16, 2023

👍 It works for me, I can remove walkdir, rayon, and a few functions from my project now

@Shelim
Copy link
Author

Shelim commented Jul 17, 2023

Looks good indeed, thank you!

@ShayBox
Copy link

ShayBox commented Aug 11, 2023

Is there a planned release soon or can a release be made with these changes? I want to publish to crates.io but I can't use git dependencies on crates.io

@0xpr03
Copy link
Member

0xpr03 commented Aug 11, 2023

I have a release planned, there are just some PRs I want to merge first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-enhancement Z-needs implementation Needs an implementation, will accept PRs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants