-
Notifications
You must be signed in to change notification settings - Fork 0
Identify similar files
License
binbash23/dupfinder
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
NAME dupfinder DESCRIPTION Use dupfinder to find duplicate files in big archives of files. Dupfinder searches the given path for files, creates a hash database for all files and searches for files which have the same hash. So if you have a picture or video with 2 diffenrent names or in different folders, you can identify them and create an archive script and a delete script. Dupfinder can be used to search for duplicate pictures in huge picture collections for example. SYNOPSIS dupfinder OPTIONS OPTIONS -a Search in path for duplicates (requires -r to set the root path). This option scans the root path for files, creates hashes for them and searches for duplicate files. The filehash database will also be cleaned from files that do not exist any more in the filesystem. You can use dupfinder -g later to generate scripts for moving the duplicates to an archive folder. So if you have a picture or video with 2 different names or in different folders, you can identify them and create an archive script and a delete script. The delete script will only delete duplicates - one version of the duplicate files will be kept in the root path. -c Calculate hashes for files in filelist file and add them to filehash database -C Create duplicate files file from hash database -D Show contents of duplicates file -d Delete not existing files from filehash database -f search root path for files and create plain files file (requires -r) -g Generate scripts to copy duplicates and to delete duplicates -h Show usage information -H Show contents of hash database -L Show contents of filelist database -n Calculate new hash for all files found, even if they exist in hash db -N Skip calculating sum of filesizes (size calculation may take some time in big archives) -p Create filelist database for root path (requires -r) -r [PATH] set root path -R Delete filehash database files -s Show statistics of database -v Verbose mode -V Show program version EXAMPLES 1) To search duplicate files in path /tmp/picture use: > ./dupfinder -avr /tmp/Pictures 3 files will be created: > filelist.db - list of files in root path > filehash.db - list of all files with every hash value > duplicates.db - list of duplicate files in 2) To get help for all options use: > ./dupfinder -h 3) To generate 2 scripts which can be used to archive the duplicates to an archive folder and to delete the duplicates use: > ./dupfinder -g 2 scripts will be generated: > copy_dups_from_root_path_to_archive.sh This script which will copy all duplicates in root path to an archive folder in your current directory called "duplicates". > delete_dups_from_root_path.sh This script can be used to delete all duplicate files from the root path. If you want to exclude several folders, you can modify the script with grep for example. The delete script contents only the duplicates. If there are 3 similar files, the delete script will only have 2 of them deleted. Just have a look into the delete script: > tail delete_dups_from_root_path.sh rm -v '/tmp/2015/921.JPG' rm -v '/tmp/2015/.thumbs/923.JPG' If you don't want to delete file from the thumbnail folders, create a new script: > grep -v ".thumbs" > delete_dups_from_root_path_custom.sh 4) To find out about the duplicates statistics use: > ./dupfinder -s Dupfinder Statistics Files in filehash database : 4774 (537 Kb) modified: 2020-10-03 17:06:59.953627326 +0200 Files in plain file list : 4774 (378 Kb) modified: 2020-10-03 17:06:08.454297780 +0200 Files in duplicates file : 1529 (130 Kb) modified: 2020-10-03 17:07:04.149572704 +0200 Size of all duplicates : 1.489 GB REPORTING BUGS Report bugs to <binbash@gmx.net> AUTHOR dupfinder by Jens Heine <binbash@gmx.net> 2020 and Dennis Brossat <dennis.brossat@email.de> and Benjamin Heine <benjaminheine@gmx.net>
About
Identify similar files
Topics
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published