Hardlinks are considered duplicate. #352

Open
frenchiveruti opened this issue May 15, 2021 · 4 comments
Labels
enhancement New feature or request PR welcome The given topic has already been analyzed and you can safely create a PR implementing this functiona

Comments

@frenchiveruti

Hello, I have many files stored as hard links on my Windows PC, and Czkawka reports them as duplicates even though they are not really duplicates, since they are hard links.
I ran the analysis using the Hash+Blake3 method, and it returned a LOT of alleged duplicate files that were all hard links.
How can I make the software ignore hard links?

@qarmin
Owner

qarmin commented May 16, 2021

Currently, ignoring hard links is implemented only for Unix-based systems like Linux and macOS.

If anyone wants to implement this feature, this is the code responsible for it:

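    // Windows: no hard link detection yet, every entry is returned unchanged.
    #[cfg(target_family = "windows")]
    fn filter_hard_links(vec_file_entry: &[FileEntry]) -> Vec<FileEntry> {
        vec_file_entry.to_vec()
    }

    // Unix: keep only the first entry seen for each inode number.
    #[cfg(target_family = "unix")]
    fn filter_hard_links(vec_file_entry: &[FileEntry]) -> Vec<FileEntry> {
        let mut inodes: HashSet<u64> = HashSet::with_capacity(vec_file_entry.len());
        let mut identical: Vec<FileEntry> = Vec::with_capacity(vec_file_entry.len());
        for f in vec_file_entry {
            if let Ok(meta) = fs::metadata(&f.path) {
                // `insert` returns false when the inode was already seen,
                // i.e. this path is a hard link to an earlier entry.
                if !inodes.insert(meta.ino()) {
                    continue;
                }
            }
            // Entries whose metadata cannot be read are kept.
            identical.push(f.clone());
        }
        identical
    }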
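
To make the Unix logic above easier to try in isolation, here is a minimal self-contained sketch. It is a variation, not the project's code: `FileEntry` is a hypothetical stand-in holding only a path, and the set is keyed on the (device, inode) pair rather than the inode alone, on the assumption that inode numbers are only unique within one filesystem.

```rust
use std::collections::HashSet;
use std::fs;
use std::path::PathBuf;

/// Hypothetical stand-in for Czkawka's `FileEntry`; only the path matters here.
#[derive(Clone, Debug, PartialEq)]
struct FileEntry {
    path: PathBuf,
}

/// Keeps the first entry seen for each (device, inode) pair and drops later
/// entries that are hard links to a file that was already kept.
#[cfg(unix)]
fn filter_hard_links(entries: &[FileEntry]) -> Vec<FileEntry> {
    use std::os::unix::fs::MetadataExt;

    let mut seen: HashSet<(u64, u64)> = HashSet::with_capacity(entries.len());
    let mut kept = Vec::with_capacity(entries.len());
    for entry in entries {
        if let Ok(meta) = fs::metadata(&entry.path) {
            if !seen.insert((meta.dev(), meta.ino())) {
                continue; // same underlying file as an earlier entry
            }
        }
        // Entries whose metadata cannot be read are kept, as in the quoted code.
        kept.push(entry.clone());
    }
    kept
}
```

Given two paths created with `std::fs::hard_link` plus an independent copy with the same content, this keeps the first link and the copy, so only the real duplicate would go on to hashing.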

@qarmin qarmin added the PR welcome The given topic has already been analyzed and you can safely create a PR implementing this functiona label May 4, 2022
@slowthgt

slowthgt commented May 7, 2022

While I don't have a PR to submit (as I have 0 experience with Rust), I did reach somewhat of a solution. It relies on the Win32 API, which is usable from Rust but requires the Rust for Windows dependency. The method revolves around GetFileInformationByHandle, which fills in a structure named BY_HANDLE_FILE_INFORMATION. You then compare nFileIndexHigh, nFileIndexLow and dwVolumeSerialNumber of two files to see whether they are the same. Hope this helps somewhat.
Reference
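
The comparison described above could be sketched roughly as follows. This is only an illustration of how the three fields might be combined into a key. It does not call GetFileInformationByHandle itself, and `HandleInfo`, `FileId`, `file_id` and `first_of_each_file` are hypothetical names that mirror the Win32 fields so the sketch needs no Windows bindings.

```rust
use std::collections::HashSet;

/// The three fields of `BY_HANDLE_FILE_INFORMATION` named above, copied into a
/// plain struct (hypothetical) so no Windows bindings are needed here.
#[derive(Clone, Copy, Debug)]
struct HandleInfo {
    volume_serial_number: u32, // dwVolumeSerialNumber
    file_index_high: u32,      // nFileIndexHigh
    file_index_low: u32,       // nFileIndexLow
}

/// Identity of a file: volume serial number plus the 64-bit file index.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct FileId {
    volume: u32,
    index: u64,
}

/// Joins the high and low halves into one 64-bit index and pairs it with the volume.
fn file_id(info: HandleInfo) -> FileId {
    FileId {
        volume: info.volume_serial_number,
        index: (u64::from(info.file_index_high) << 32) | u64::from(info.file_index_low),
    }
}

/// Returns the positions of the entries to keep: the first one per `FileId`.
fn first_of_each_file(infos: &[HandleInfo]) -> Vec<usize> {
    let mut seen = HashSet::with_capacity(infos.len());
    infos
        .iter()
        .enumerate()
        .filter(|(_, info)| seen.insert(file_id(**info)))
        .map(|(i, _)| i)
        .collect()
}
```

Two paths that yield the same `FileId` would be treated as hard links to one file, so only the first would be kept. Including the volume serial number in the key matters because the file index is only meaningful within a single volume.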

@qarmin qarmin added the enhancement New feature or request label May 29, 2022
@ongchi
Contributor

ongchi commented Jun 8, 2022

The file index can be obtained with the nightly API os::windows::fs::MetadataExt::file_index, which returns the index as a u64 (wrapped in an Option). Note that this value is not guaranteed to be unique on some filesystems.

Please refer to rust-lang/rust#63010.


@Devocub

Devocub commented Jun 20, 2023

This is an important feature!
As an easy solution, you can use fsutil hardlink list "C:\123.txt", which prints every path that points to the same file:

\Program Files\456.txt
\789.txt
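
As a rough sketch of how that output could be consumed from Rust, the helper below parses one path per line, and a second function shells out to `fsutil`. Both function names are hypothetical. The sketch assumes the output is plain text with one volume-relative path per line, as in the sample above. Spawning a process per file would likely be slow on large scans, and non-ASCII paths may not survive the lossy UTF-8 conversion.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Parses the output of `fsutil hardlink list`: one volume-relative path per
/// non-empty line.
fn parse_hardlink_list(output: &str) -> Vec<String> {
    output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Runs `fsutil hardlink list <path>` (Windows only) and returns the listed paths.
fn list_hard_links(path: &Path) -> io::Result<Vec<String>> {
    let output = Command::new("fsutil")
        .arg("hardlink")
        .arg("list")
        .arg(path)
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            "fsutil hardlink list failed",
        ));
    }
    // Assumption: the console output can be decoded well enough as UTF-8.
    Ok(parse_hardlink_list(&String::from_utf8_lossy(&output.stdout)))
}
```

A file whose list contains more than one path has at least one other hard link, so a scanner could keep one of those paths and skip the rest.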

6 participants
@ongchi @frenchiveruti @Devocub @slowthgt @qarmin and others