Siegfried is a signature-based file format identification tool.
Key features are:
- complete implementation of PRONOM (byte and container signatures)
- fast matching without limiting the number of bytes scanned
- detailed information about the basis for format matches
- simple command line interface with a choice of outputs
- a built-in server for integrating with workflows and language inter-op
- power options including debug mode, signature modification, and multiple identifiers
Version: 1.2.0
sf file.ext
sf DIR
sf -csv file.ext | DIR // Output CSV rather than YAML
sf -json file.ext | DIR // Output JSON rather than YAML
sf -droid file.ext | DIR // Output DROID CSV rather than YAML
sf - // Read list of files piped to stdin
sf -nr DIR // Don't scan subdirectories
sf -z file.zip | DIR // Decompress and scan zip, tar, gzip
sf -hash md5 file.ext | DIR // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext // Use a custom signature file
sf -home c:\junk -sig custom.sig file.ext // Use a custom home directory
sf -debug file.ext // Scan in debug mode
sf -version // Display version information
sf -serve hostname:port // Server mode
By default, siegfried uses the latest PRONOM and container signatures with no buffer limits. You can customise your signature file by using the roy tool.
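The stdin mode listed above (`sf -`) is handy for scanning a hand-picked set of files. A minimal sketch, assuming a POSIX shell with `find` and `sort` available; the helper names `list_files` and `identify_by_ext` and the `SF` variable are illustrative and not part of siegfried:

```shell
# list_files DIR EXT: print every regular file under DIR whose name ends in .EXT
list_files() {
    find "$1" -type f -name "*.$2" | sort
}

# identify_by_ext DIR EXT: pipe that list to `sf -`, which reads file names from stdin.
# SF is a hypothetical override for the binary name (useful for dry runs, e.g. SF=cat).
identify_by_ext() {
    list_files "$1" "$2" | "${SF:-sf}" -
}
```

For example, `identify_by_ext ./records pdf` would identify only the PDFs under `./records`. File names containing newlines are not handled by this sketch.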
With Go installed:
go get github.com/richardlehane/siegfried/cmd/sf
sf -update
For OS X:
brew install mistydemeo/digipres/siegfried
For Ubuntu/Debian (64 bit):
wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add -
echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried
For Windows:
Download a pre-built binary from the releases page. Unzip it to a location on your system path. Then run:
sf -update
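Whichever install route you take, `sf` has to be discoverable on your path before `sf -update` will run. A small check, sketched for a POSIX shell (the `check_sf` helper is hypothetical, not something siegfried ships):

```shell
# check_sf [NAME]: report whether NAME (default: sf) can be found on PATH.
# Returns 0 when found and 1 otherwise.
check_sf() {
    bin="${1:-sf}"
    if command -v "$bin" >/dev/null 2>&1; then
        echo "found: $(command -v "$bin")"
    else
        echo "not found: $bin is not on PATH" >&2
        return 1
    fi
}
```

One way to use it is `check_sf && sf -update`, so the update only runs once the binary is reachable.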
- text matcher (i.e. sf README will now report a 'Plain Text File' result)
- -notext flag to suppress text matcher (roy build -notext)
- all outputs now include file last modified time
- -hash flag with choice of md5, sha1, sha256, sha512, crc (e.g. sf -hash md5 FILE)
- -droid flag to mimic droid output (sf -droid FILE)
- bugfix: detect encoding of zip filenames; reported by Dragan Espenschied
- bugfix: mscfb; reported by Dragan Espenschied
- scan within archive formats (zip, tar, gzip) with -z flag
- format sets (e.g. roy build -exclude @pdfa)
- leaner, faster signature format
- support bitmask patterns
- mirror bof patterns as eof patterns where both roy -bof and -eof limits are set
- 'sf -' reads files piped to stdin
- bugfix: mscfb; reported by Pascal Aantz
- bugfix: race condition in scorer (affected tip golang)
- archivematica build: fpr server
- user documentation
- bugfixes (mscfb, match/wac and sf)
- QA using comparator
Copyright 2015 Richard Lehane
Licensed under the Apache License, Version 2.0
Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.
Thanks TNA for http://www.nationalarchives.gov.uk/pronom/ and http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm
Thanks Ross for https://github.com/exponential-decay/skeleton-test-suite-generator and http://exponentialdecay.co.uk/sd/index.htm, both are very handy!
Thanks Misty for the brew and Ubuntu packaging