Checksums & Hashes & Salts! Oh my!

You want to download a couple of Linux distributions to try out, so you head over to www.distrowatch.com and pick out the first one you want to try. It is Solus. When you go to the Solus website to download the ISO file, you see this beneath the torrent link:

[Screenshot: Solus download page]

What is SHA256SUMS? And why do I care about it?

Checksums ensure file integrity

SHA256SUMS is a link to a checksum. A checksum is a mathematically derived value based on the contents of a digital file. There are several kinds of checksums, but the ones discussed here all have these properties:

  • Two files that are exactly the same will have the same checksum
  • Just one tiny difference between files results in completely different checksums
  • You cannot derive the contents of the file from the checksum

What does this mean? It means you can download a file (your Solus ISO file, for example), compute the checksum of the downloaded file, and compare it to the one Solus publishes on their website. If the two match, you can be confident your copy is identical to the file Solus published, neither corrupted nor incomplete. In other words, you can ensure the file’s integrity, thereby keeping your computer safe from the annoyances of incomplete ISO downloads!
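The properties listed above can be seen in a few lines of Python. This is only a sketch using the standard hashlib module; the byte strings are made-up stand-ins for file contents, and sha256_hex is a helper name invented for this example.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 checksum of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Made-up stand-ins for the contents of three files.
original = b"Pretend this is the content of an ISO file."
exact_copy = bytes(original)
altered = original.replace(b"ISO", b"ISo")  # a single changed character

print(sha256_hex(original))
print(sha256_hex(exact_copy))  # identical to the line above
print(sha256_hex(altered))     # completely different from the other two
```

The checksum is always 64 hexadecimal characters no matter how long the input is, which is one reason the original contents cannot be read back out of it.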

Types of checksums

There is more than one way to create a checksum. One popular checksum is MD5. Solus uses SHA256.

MD5? SHA256? Where do these weird names come from?

These are actually the names of the cryptographic hash functions (hash functions for short) used to create the corresponding checksums. A hash function takes data of almost any length as input and applies an algorithm to produce a fixed-length string of bits. MD5 was created in 1991 by an MIT professor named Ronald Rivest. MD stands for message digest, and MD5 is the successor of MD4.

SHA stands for secure hash algorithm. It was first published in 1993 by the National Institute of Standards and Technology (NIST) in collaboration with the NSA. That first version is now called SHA-0. SHA-1 came out two years later, in 1995, to correct a design flaw, making SHA-0 obsolete. Now, we have SHA-2, which replaces SHA-1.

Each algorithm family may also offer several digest sizes. For example, in SHA-2, the SHA256 hash has a 256-bit message digest, and the SHA512 hash has a 512-bit message digest.
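To see the digest sizes for yourself, here is a small Python sketch using the standard hashlib module. The message is arbitrary, and the sizes dictionary is just a name chosen for this example.

```python
import hashlib

message = b"hello"  # any input works; the digest size depends only on the algorithm

sizes = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, message)
    sizes[name] = h.digest_size * 8  # digest_size is reported in bytes
    print(f"{name:>6}: {sizes[name]} bits, {len(h.hexdigest())} hex characters")
```

Each hexadecimal character encodes 4 bits, so a 256-bit SHA256 checksum is written as 64 characters and a 128-bit MD5 checksum as 32.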

What typically occurs with these hash functions is that expert cryptographers try to break them. Once one has been broken, a new algorithm is designed and released to replace it. As of the writing of this blog post, SHA-2 and SHA-3 are both current (no one has broken them). MD6, which came out in 2008, is not known to have been broken either, though it never saw wide use. However, MD5 remains popular even though it is no longer considered cryptographically secure.

Here’s what a SHA256 checksum looks like:

[Image: example SHA256 checksum]

Here’s what an MD5 sum looks like:

[Image: example MD5 checksum]

How to use checksums

The first thing you need is a file with a published checksum so you can compare your own computed checksum to it. Or you need access to the original file so you can compute the original checksum yourself. Note: This MUST be the genuine original file, not a copy that may itself have been corrupted or altered! You also need to know which hashing algorithm was used to create the published checksum. It does no good to create a checksum with a different algorithm than the one used for the original. They will definitely not match!

Next, if you don’t have it already, you need to get a checksum computing program appropriate to your operating system. For most Linux systems, which usually already have the commands available, you can simply type in a terminal, for example:

user@linux ~ $ sha512sum file.txt

In our Solus example, we would download the Solus ISO. Let’s call it solus.iso. Luckily, Solus provides a file with the checksum rather than just publishing the checksum on their site. This helps us avoid having to eyeball the two checksums to see if they match. We simply download the checksum file as well. Let’s call that solus.sha256sum. It’s just a text file with the checksum and file name on one line. Now, in the terminal cd into the directory where solus.iso and solus.sha256sum sit and type:

user@linux ~ $ sha256sum -c solus.sha256sum

If your download was successful, your results will be:

solus.iso: OK

Sometimes people put multiple files in these checksum files. If you only download one of the files, you can ignore results from the other lines. Or, if you’re a fancy Linux user, you can pipe the results through grep like so:

user@linux ~ $ sha256sum -c solus.sha256sum | grep OK

Then just make sure your file comes up in the results.
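If you are curious what a check like this does under the hood, here is a rough Python sketch. It is not how sha256sum itself is implemented; it assumes the checksum file has one "checksum, whitespace, file name" entry per line as described above, and the function names and the tiny stand-in file are invented for this example.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in 1 MiB chunks so a large ISO never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check(checksum_file, wanted_name):
    """Return True or False for wanted_name, or None if it is not listed."""
    checksum_file = Path(checksum_file)
    for line in checksum_file.read_text().splitlines():
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts
        if name.startswith("*"):
            name = name[1:]  # sha256sum marks binary mode with a leading '*'
        if name == wanted_name:
            actual = sha256_of_file(checksum_file.parent / name)
            return actual == expected.lower()
    return None

# Demo with a small stand-in file instead of a real ISO.
with tempfile.TemporaryDirectory() as tmp:
    iso = Path(tmp) / "solus.iso"
    iso.write_bytes(b"not really an ISO")
    sums = Path(tmp) / "solus.sha256sum"
    sums.write_text(f"{sha256_of_file(iso)}  solus.iso\n")
    print("solus.iso:", "OK" if check(sums, "solus.iso") else "FAILED")
```

Because check only looks at the line for the file you name, it sidesteps the noise from other entries in a multi-file checksum file.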

On Windows, you will need to download a checksum program. A popular one to use is called File Checksum Integrity Verifier, or FCIV, from Microsoft. It’s a command line tool, just like on Linux. However, it only uses MD5 or SHA-1 algorithms, so I don’t recommend it. There are other programs available that are easier to use and support more algorithms.

certUtil is a command line utility that should be pre-installed on Windows. To check a file called isofile.iso in C:\TEMP\ with SHA256, enter the following in a Command Prompt or PowerShell window:

PS C:\> certUtil -hashfile C:\TEMP\isofile.iso SHA256

certUtil supports MD2, MD4, MD5, SHA1, SHA256, SHA384, & SHA512.

There are programs that integrate into Windows. HashCheck, for instance, adds a tab to the file properties window with a few checksums calculated. It can also create a checksum file for you.

I don’t use Macs, so I had to do a little research. You can check a file’s checksum with the Terminal on a Mac. Enter the following to get an MD5 checksum:

user:~ Username$ md5 path/to/file.exe

A couple of things to note here.

  • A checksum program only looks at the contents of the file, not the file name. However, if you use a file of checksums to do your checking, the file you are checking needs to have the same name as the one listed in the file of checksums.
  • If your file name is long, tab is your friend. Start typing the file name, and once you believe the characters you’ve entered are unique to your file, hit the tab key to auto complete the name.

Other uses of hashes

Hashes are also used to protect passwords in databases. When you go to a website where you have registered an account, the website must store something that lets it check your password. But, it can’t store the password in plain text – not if it requires any type of security. Plain text storage is a no-no, as websites’ databases can get hacked at any time. Instead, the passwords are hashed. (Strictly speaking, this is hashing rather than encryption: encryption can be reversed with a key, while a hash is one-way.) Using a hash means that you cannot read the password back from the hash.

For additional security, it’s common practice to also salt the hash. Yummy!

What is a salt?

I lied to you above. My bad. Hackers actually can often determine a password if they obtain the hash of the password. How? A few ways, most of which involve huge precomputed look-up tables of hashes, the best known being rainbow tables.
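To make the look-up idea concrete, here is a toy Python sketch. It uses a plain dictionary rather than a real rainbow table (which uses a cleverer structure to save space), and the password list and function name are made up for illustration.

```python
import hashlib

def unsalted_hash(password: str) -> str:
    """Hash a password with plain SHA-256 and no salt."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Built once, ahead of time, from a list of commonly used passwords.
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {unsalted_hash(p): p for p in common_passwords}

# A hash as it might appear in a leaked database (hypothetical user, password "letmein").
stolen_hash = unsalted_hash("letmein")

# No hash is ever reversed; the password is simply found by matching.
print(lookup.get(stolen_hash))
```

The hash function is never broken here. The table works only because the same password always produces the same hash.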

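A salt is simply another string that is added to something being hashed. Because each password gets its own random salt, the same password produces a different hash for every user, so a precomputed look-up table no longer matches. Here’s how it works.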

  1. User creates password.
  2. After checking the strength of the password (if necessary), the site adds a randomly generated salt (string of random characters) onto the password.
  3. The salt + password combination is hashed.
  4. The hash and salt are stored in the database with the user name. The password is never stored.
  5. Upon the next login, the user enters their password.
  6. The stored salt is retrieved and added to the password, and the combination is hashed.
  7. This hash is compared to the stored hash. If they match, the password is correct.
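The steps above can be sketched in Python roughly like this. It follows the list literally, with a single SHA-256 over salt + password; that choice, the 16-byte salt, and the function names are assumptions made for this example. Real sites generally use a deliberately slow password-hashing function rather than one fast hash.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes) -> str:
    """Steps 3 and 6: hash the salt + password combination."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def register(password: str):
    """Steps 1 to 4: make a random salt, return what the database would store."""
    salt = secrets.token_bytes(16)  # 16 random bytes; the length is an assumption
    return salt, hash_password(password, salt)

def login(password: str, salt: bytes, stored_hash: str) -> bool:
    """Steps 5 to 7: re-hash with the stored salt and compare."""
    candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_hash)  # constant-time comparison

salt, stored = register("correct horse")
print(login("correct horse", salt, stored))  # True
print(login("wrong guess", salt, stored))    # False

# Two users with the same password end up with different stored hashes.
other_salt, other_stored = register("correct horse")
print(stored != other_stored)
```

Note that the salt is stored right next to the hash and is not secret. Its job is only to make every stored hash unique, so one precomputed table cannot be reused across users.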

This website is a great resource to learn more about hashing and salting passwords:

Salted Password Hashing
