Hashing is used for authentication and non-repudiation, the concept that the authenticity of information can’t be denied.
Hash functions are algorithms that produce a code that can’t be decrypted. A hash function converts information into a fixed-size value that is, for practical purposes, unique to that information and can then be used to verify its integrity.
Origin story
Hash functions were originally created as a way to quickly search for data. The purpose was to represent data of any size as small, fixed-size values, or “digests”. Using a hash table, which is a data structure that’s used to store and reference hash values, these small values became a more secure and efficient way for computers to reference data.
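As a rough illustration of the hash table idea, the sketch below maps keys of any length to one of a small, fixed number of buckets. The names `NUM_BUCKETS`, `bucket_index`, `put`, and `get` are hypothetical, and MD5 is used only because it is discussed in this article; real hash tables normally use faster, non-cryptographic hash functions.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical table size, chosen only for illustration

def bucket_index(key: str) -> int:
    """Map a key of any length to a bucket number in 0..NUM_BUCKETS-1."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def put(key: str, value) -> None:
    """Store value under key, replacing any existing entry for that key."""
    bucket = buckets[bucket_index(key)]
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def get(key: str):
    """Search only the one bucket the key hashes to."""
    for existing_key, value in buckets[bucket_index(key)]:
        if existing_key == key:
            return value
    return None

put("report.pdf", "stored on disk 2")
put("a much longer key " * 50, "stored on disk 7")
print(get("report.pdf"))
```

A lookup only has to search the single bucket that the key hashes to, which is what makes this kind of reference fast.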
One of the earliest widely used hash functions is Message Digest 5 (MD5). It was developed to verify that a file sent over a network matched its source file.
MD5 works by converting data into a 128-bit value. That value is usually displayed as a string of 32 hexadecimal characters, each of which represents 4 bits. Altering anything in the source file generates an entirely different hash value.
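The behavior described above can be checked with Python’s standard `hashlib` module. This is a minimal sketch; the helper name `md5_hex` and the sample inputs are made up for illustration.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

original = md5_hex(b"hello world")
altered = md5_hex(b"hello world!")  # one extra character in the input

print(original)
print(altered)
print(len(original))        # 32 hexadecimal characters = 128 bits
print(original == altered)  # False: a small change gives a different digest
```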
Generally, the longer the hash value, the more secure it is.
Hash collisions
One flaw with MD5 happens to be a characteristic of all hash functions. Hash algorithms map any input, regardless of length, into a fixed-size value of letters and numbers. The issue is that, although there is an infinite number of possible inputs, there’s only a finite set of available outputs.
MD5 values are limited to 128 bits (32 hexadecimal characters). Because of this limited output size, the algorithm is considered to be vulnerable to hash collisions. A hash collision is an instance in which different inputs produce the same hash value. And because hashes are used for authentication, attackers can carry out collision attacks to pass off fraudulent data as authentic.
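To see why a limited output size makes collisions unavoidable, the sketch below deliberately shrinks the output to 16 bits by keeping only the first two bytes of an MD5 digest. With only 65,536 possible outputs, two different inputs must share a value within at most 65,537 attempts. The `tiny_hash` function is a hypothetical toy, not a real algorithm, and this brute-force search is not how collision attacks on full MD5 work.

```python
import hashlib

def tiny_hash(data: bytes) -> bytes:
    """Toy 16-bit hash: the first two bytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:2]

def find_collision():
    """Hash "0", "1", "2", ... until two inputs share a tiny_hash value."""
    seen = {}  # maps each tiny_hash value to the first input that produced it
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message, digest
        seen[digest] = message
        counter += 1

first, second, shared = find_collision()
print(first, second, shared.hex())
```

The same reasoning applies to any hash function; a longer digest only makes the search impractically long rather than impossible.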
Next-generation hashing
To reduce the risk of collisions, functions needed to generate longer values. MD5 gave way to the Secure Hash Algorithms (SHA).
The number next to SHA indicates the size of the hash value in bits. The exception is SHA-1, where the 1 is a version number; it produces a 160-bit digest. Apart from SHA-1, for which practical collisions have been demonstrated, these algorithms are considered to be collision resistant. But this does not make them invulnerable to other exploits.
The SHA group consists of:
- SHA-1
- SHA-224
- SHA-256
- SHA-384
- SHA-512
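The digest sizes of the algorithms listed above can be read directly from Python’s standard `hashlib` module. This is a small sketch; the variable name `digest_bits` is just an illustrative choice.

```python
import hashlib

# hashlib reports digest_size in bytes; multiply by 8 to get bits.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("sha1", "sha224", "sha256", "sha384", "sha512")
}

for name, bits in digest_bits.items():
    print(f"{name}: {bits} bits")
```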
Secure password storage
Passwords are typically stored in a database where they are mapped to a username. The server receives an authentication request that contains the credentials supplied by the user. It then looks up the username in the database, compares the stored password with the one that was provided, and grants access only if they match.
If an attacker gains access to the user database and the passwords are stored in plaintext, then the information can be stolen. Hashing adds a layer of security: the database stores only the hash of each password, and the server hashes the submitted password and compares the two hashes. Because hash values can’t be reversed, an attacker who managed to gain access to the database would not be able to read users’ passwords directly from it.
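The login flow described above might look roughly like the following sketch. The in-memory `user_db` dictionary, the `hash_password` and `authenticate` helpers, and the sample credentials are all hypothetical. SHA-256 is used without a salt only to mirror the description at this point in the article; it is not a recommendation for production password storage.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the SHA-256 digest of the password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical user database: usernames mapped to password hashes,
# never to the plaintext passwords themselves.
user_db = {"alice": hash_password("correct horse battery staple")}

def authenticate(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = user_db.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(password))

print(authenticate("alice", "correct horse battery staple"))  # True
print(authenticate("alice", "wrong guess"))                   # False
```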
Rainbow tables
A rainbow table is a file of pre-generated hash values and their associated plaintext. They’re like dictionaries of weak passwords. Attackers who obtain an organization’s password database can look up each stolen hash in a rainbow table and recover the plaintext of any password that appears in it.
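The lookup idea can be sketched with a plain dictionary of precomputed hashes. This is a simplification: real rainbow tables use a more compact chained structure. The password list, usernames, and stolen hashes below are invented for the example.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Precomputed once by the attacker: hash -> plaintext for common passwords.
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {sha256_hex(p): p for p in common_passwords}

# Hypothetical stolen database of unsalted hashes.
stolen = {
    "alice": sha256_hex("password"),
    "bob": sha256_hex("x9!Lq2#vT7"),
}

# Any hash that appears in the table is recovered with a single lookup.
recovered = {user: lookup[h] for user, h in stolen.items() if h in lookup}
print(recovered)  # {'alice': 'password'}
```

Only the weak password is recovered; the uncommon one is not in the table.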
Adding some “salt”
Salting is an additional safeguard that’s used to strengthen hash functions. A salt is a random string of characters that’s added to data before it’s hashed. Because the salt changes the input, the same data hashed with different salts produces different hash values, which makes salted data resilient to rainbow table attacks.
For example, if two users both choose “password” and each user’s password is given its own salt, the two stored hashes are different. This prevents an attacker from using rainbow tables to find users with the same password (hash).
Similar to hash values, the longer and more complex a salt is, the harder the salted hashes are to attack with precomputed tables.
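A minimal sketch of salting follows, assuming a 16-byte random salt that is stored next to each hash. The helper names `hash_with_salt`, `new_record`, and `verify` are hypothetical, and plain SHA-256 is used only to keep the example close to the description above.

```python
import hashlib
import hmac
import secrets

def hash_with_salt(password: str, salt: bytes) -> str:
    """Prepend the salt to the password, then hash the result."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def new_record(password: str):
    """Create a (salt, hash) pair using a fresh random salt."""
    salt = secrets.token_bytes(16)
    return salt, hash_with_salt(password, salt)

def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Re-hash the attempt with the stored salt and compare."""
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))

# Two users choose the same weak password but get different salts.
alice_salt, alice_hash = new_record("password")
bob_salt, bob_hash = new_record("password")

print(alice_hash == bob_hash)                        # False
print(verify("password", alice_salt, alice_hash))    # True
print(verify("password", bob_salt, alice_hash))      # False: wrong salt
```

Because each hash depends on its own salt, a table precomputed for the unsalted password no longer matches either record.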