What is a File Hash?
A file hash is like a digital fingerprint. It is a fixed-length string of characters computed from the file's contents by a mathematical algorithm, and for practical purposes it is unique to those contents. If even a single bit of the file changes (e.g., due to a download error, virus infection, or malicious tampering), the hash value changes completely. This makes comparing hashes a standard way to verify file integrity.
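The "single bit" claim is easy to check yourself. The following is a minimal sketch in Python using the standard `hashlib` module; the sample input and the helper name `sha256_hex` are illustrative assumptions, not part of any particular tool.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"hello world"  # arbitrary sample input
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
modified = bytes([original[0] ^ 0x01]) + original[1:]

h1 = sha256_hex(original)
h2 = sha256_hex(modified)

# Count how many of the 256 output bits differ between the two digests.
diff_bits = bin(int(h1, 16) ^ int(h2, 16)).count("1")

print(h1)
print(h2)
print(f"{diff_bits} of 256 bits differ")
```

For a well-designed hash, roughly half of the output bits should differ even though the inputs differ by only one bit.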
Understanding Algorithms: MD5 vs SHA-1 vs SHA-256
MD5 (Message Digest Algorithm 5)
Produces a 128-bit digest, usually written as 32 hexadecimal characters. It is extremely fast but is no longer cryptographically secure: collisions (two different files with the same hash) can be generated deliberately. Use it only for detecting accidental corruption.
SHA-1 (Secure Hash Algorithm 1)
Produces a 160-bit digest, written as 40 hexadecimal characters. Long used in Git and SSL certificates, it is now considered legacy: practical collision attacks exist, so it is unsafe for digital signatures, although it remains widely used for general file identification.
SHA-256 (Secure Hash Algorithm 2, 256-bit)
Produces a 256-bit digest, written as 64 hexadecimal characters. It is the modern standard for security, used in Bitcoin, SSL/TLS certificates, and the verification of operating system images. No practical collision attacks against it are known.
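To see the three output lengths side by side, here is a short Python sketch using the standard `hashlib` module. The sample input is arbitrary, and `hex_lengths` is just an illustrative variable name. Note that some locked-down Python builds (for example, FIPS-mode systems) may refuse to provide MD5.

```python
import hashlib

data = b"The quick brown fox jumps over the lazy dog"  # arbitrary sample input

hex_lengths = {}
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, data)
    hex_value = h.hexdigest()
    hex_lengths[name] = len(hex_value)
    # digest_size is in bytes; each byte is written as two hex characters.
    print(f"{name:<7}{h.digest_size * 8:>4} bits {len(hex_value):>3} hex chars  {hex_value}")
```

Each hexadecimal character encodes 4 bits, which is why the 128-, 160-, and 256-bit digests appear as 32, 40, and 64 characters.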
Common Use Cases
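- Download Verification: Many software developers publish the SHA-256 hash of their installers. By calculating the hash of your downloaded file and comparing it with the published value, you can confirm the file was not corrupted or modified in transit. The check is only as trustworthy as the source of the published hash, so obtain it from the developer's official site.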
- Deduplication: If two files have the same SHA-256 hash, they are, for all practical purposes, identical, regardless of their filenames. This allows you to find and delete exact duplicate files to save space.
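Both use cases reduce to hashing a file and comparing the result. The sketch below shows one way this might look in Python with the standard library. The function names (`hash_file`, `verify_download`, `find_duplicates`) and the chunk size are assumptions chosen for illustration, not the implementation of any specific tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading in chunks so large files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, published_hex):
    """Return True if the file's SHA-256 matches the published hash (case-insensitive)."""
    return hash_file(path) == published_hex.strip().lower()

def find_duplicates(root):
    """Group files under root by SHA-256; return only groups with more than one file."""
    by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_hash[hash_file(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

As a usage example, `verify_download("installer.exe", "<hash copied from the developer's site>")` would return `True` only on an exact match, and `find_duplicates("Downloads")` would return lists of paths whose contents hash identically; both file names here are hypothetical. A cautious tool might also compare duplicate candidates byte by byte before deleting anything.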