hashdeep

Compute and compare file hashes

SYNOPSIS

hashdeep [options] [files|directories]

PARAMETERS

-a
    Audit mode. Compare each input file against the known hashes loaded with -k and report whether the audit passed or failed; add -v for more detail

-b
    Bare mode. Strip leading directory information from filenames in the output

-c <alg1>[,<alg2>...]
    Select the algorithms to compute, comma-separated (e.g., sha256,md5); the default is md5,sha256

-d
    Write output as DFXML (Digital Forensics XML) instead of the default CSV-style listing

-e
    Show a progress indicator with the estimated time remaining for each file

-h
    Show help message

-j <num>
    Number of hashing threads to use

-k <file>
    Load a file of known hashes; required by the audit and matching modes (-a, -m, -x, -M, -X) and may be given more than once

-l
    Print relative paths instead of converting filenames to absolute paths

-m
    Positive matching mode. Print the input files whose hashes match a known hash; requires -k

-p <size>
    Piecewise mode. Split each file into blocks of the given size and hash every block separately

-r
    Recurse into subdirectories

-s
    Silent mode. Suppress error messages

-w
    With positive matching (-m or -M), also show the name of the known file that matched each input file

-v
    Increase verbosity; repeat (-vv, -vvv) for more detailed audit output

-V
    Display the version number and exit

-x
    Negative matching mode. Print the input files whose hashes match none of the known hashes; requires -k

-M, -X
    Like -m and -x respectively, but also print the hashes of each file listed
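
To illustrate how a known-hash file loaded with -k combines with the matching modes, here is a minimal sketch. It assumes hashdeep is installed and on PATH; the directories and file names are throwaway placeholders created by the script, not names hashdeep requires.

```shell
#!/bin/sh
# Sketch: positive (-m) and negative (-x) matching against a known-hash file.
# All paths below are hypothetical temp files created only for this demo.

known_dir=$(mktemp -d)
scan_dir=$(mktemp -d)
known_file=$(mktemp)

printf 'trusted contents\n' > "$known_dir/tool.bin"
cp "$known_dir/tool.bin" "$scan_dir/copy-of-tool.bin"   # same bytes, new name
printf 'something else\n' > "$scan_dir/new-file.bin"    # not in the known set

matched=""
unmatched=""

if command -v hashdeep >/dev/null 2>&1; then
    # Build the known-hash file from the trusted directory.
    hashdeep -c sha256 -r "$known_dir" > "$known_file"

    # -m lists inputs whose hash is in the known set; -x lists the rest.
    # hashdeep may exit non-zero when something does not match, hence "|| true".
    matched=$(hashdeep -c sha256 -r -m -k "$known_file" "$scan_dir" 2>/dev/null || true)
    unmatched=$(hashdeep -c sha256 -r -x -k "$known_file" "$scan_dir" 2>/dev/null || true)
else
    echo "hashdeep is not installed; skipping the demo" >&2
fi

printf 'matched:\n%s\n' "$matched"
printf 'unmatched:\n%s\n' "$unmatched"

rm -rf "$known_dir" "$scan_dir" "$known_file"
```

Matching is by content rather than by name, so the renamed copy should appear under the -m run and the new file under the -x run.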

DESCRIPTION

hashdeep is a command-line tool for computing, matching, and auditing message digests (hashes) of files and directory trees. It supports MD5, SHA-1, SHA-256, Tiger, and Whirlpool, and computes MD5 and SHA-256 by default. Designed for digital forensics, incident response, and integrity checking, it is typically used to build a baseline hash set for known-good files and later audit against it to detect changed, added, moved, or missing files.

Key features include recursive directory traversal, multithreaded hashing, piecewise hashing (each file is hashed in fixed-size blocks, which helps localize where two large files differ), and output either as the default CSV-style listing or as DFXML. It can print bare or relative filenames, load one or more files of known hashes, and report which input files do or do not match them. Audit mode summarizes the result as passed or failed, and its verbose levels break the result down into matched, moved, new, and missing files.
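
Piecewise hashing is easier to picture with a concrete run. The following is a minimal sketch, assuming hashdeep is installed; the temp file, its 2.5 MiB size, and the 1 MiB piece size are arbitrary choices made for the demo.

```shell
#!/bin/sh
# Sketch: piecewise hashing with -p. The file name and sizes are arbitrary.

big_file=$(mktemp)
# 5 blocks of 512 KiB = 2.5 MiB of zero bytes.
dd if=/dev/zero of="$big_file" bs=524288 count=5 2>/dev/null

piece_lines=0

if command -v hashdeep >/dev/null 2>&1; then
    # Hash the file in 1 MiB (1048576-byte) pieces.
    pieces=$(hashdeep -c sha256 -p 1048576 "$big_file" 2>/dev/null || true)
    printf '%s\n' "$pieces"

    # Header lines start with "%" or "#"; every other non-empty line is a piece.
    piece_lines=$(printf '%s\n' "$pieces" | grep -v '^[%#]' | grep -c .)
else
    echo "hashdeep is not installed; skipping the demo" >&2
fi

printf 'pieces hashed: %s\n' "$piece_lines"

rm -f "$big_file"
```

Each piece gets its own output line, so two versions of a large file can be compared piece by piece to see roughly where they diverge.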

Common workflow: generate hashes with hashdeep -c sha256 -r /target > baseline.txt, then audit later with hashdeep -c sha256 -r -a -k baseline.txt /target. hashdeep does not compute fuzzy hashes; finding similar but non-identical files, such as malware variants, is the job of the separate ssdeep tool. hashdeep is widely used by law enforcement and cybersecurity professionals for file triage across large datasets.

CAVEATS

hashdeep has no fuzzy-hashing support; install and run ssdeep separately for that. Known-hash sets are held in memory, so matching or auditing against millions of files is memory-intensive. Large outputs need ample disk space. Not suitable for real-time monitoring.

SUPPORTED ALGORITHMS

md5, sha1, sha256, tiger, whirlpool
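
As a small illustration of selecting algorithms with -c, the sketch below hashes one temp file with two algorithms and pulls out the header line that names the output columns. It assumes hashdeep is installed; the sample file is a placeholder.

```shell
#!/bin/sh
# Sketch: choose algorithms with -c and inspect the column header.

sample=$(mktemp)
printf 'hello\n' > "$sample"

header=""

if command -v hashdeep >/dev/null 2>&1; then
    listing=$(hashdeep -c sha1,sha256 "$sample" 2>/dev/null || true)
    printf '%s\n' "$listing"

    # Lines starting with "%%%%" describe the file format and the columns.
    header=$(printf '%s\n' "$listing" | grep '^%%%% size')
else
    echo "hashdeep is not installed; skipping the demo" >&2
fi

printf 'column header: %s\n' "$header"

rm -f "$sample"
```

The header records which algorithms produced the listing, which matters later because a known-hash file is only useful for comparing hashes of the same type.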

TYPICAL WORKFLOW

hashdeep -c sha256 -r -j 4 /dir > baseline.txt
hashdeep -c sha256 -r -a -k baseline.txt /dir
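
The baseline-then-audit cycle can be tried end to end on disposable data. The following is a minimal sketch, assuming hashdeep is installed; the temp directory, file names, and contents are placeholders, and -c sha256 is repeated on the audit only to keep both runs explicit.

```shell
#!/bin/sh
# Sketch: record a baseline, audit it, tamper with a file, audit again.
# All paths are throwaway temp files created only for this demo.

demo_dir=$(mktemp -d)
baseline=$(mktemp)
printf 'alpha\n' > "$demo_dir/a.txt"
printf 'beta\n'  > "$demo_dir/b.txt"

audit_before="skipped"
audit_after="skipped"

if command -v hashdeep >/dev/null 2>&1; then
    # 1. Record SHA-256 hashes for every file under the directory.
    hashdeep -c sha256 -r "$demo_dir" > "$baseline"

    # 2. Audit the unchanged tree against the baseline (-a requires -k).
    audit_before=$(hashdeep -c sha256 -r -a -k "$baseline" "$demo_dir" 2>&1 || true)

    # 3. Modify one file and audit again; -vv asks for the discrepancies.
    printf 'tampered\n' >> "$demo_dir/a.txt"
    audit_after=$(hashdeep -c sha256 -r -a -vv -k "$baseline" "$demo_dir" 2>&1 || true)
else
    echo "hashdeep is not installed; skipping the demo" >&2
fi

printf 'before: %s\n' "$audit_before"
printf 'after:  %s\n' "$audit_after"

rm -rf "$demo_dir" "$baseline"
```

If hashdeep is available, the first audit should report a pass and the second a failure, with the verbose output pointing at the modified file.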

HISTORY

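Developed by Jesse Kornblum in 2006 for U.S. Air Force digital forensics. hashdeep is distributed as part of the md5deep/hashdeep suite (v4.4), which is open source and hosted on GitHub. Auditing is built in through the -a option rather than provided by a separate companion program.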

SEE ALSO

md5sum(1), sha1sum(1), sha256sum(1), md5deep(1), ssdeep(1)
