How would you store a password if you were asked to create an authentication system? The easiest solution would be to store the plain text entered by the users. For example, if a user enters their email as [email protected] and their password as weakpassword, then we can insert both into our users table, right?
When the user tries to log in, we can query the table with SELECT * FROM users WHERE email = {email} AND password = {password}. If the query returns a result, then we can authenticate the user. Task done? Well, not quite.
Even though the solution above will work, it is not secure and prone to many attacks. The most direct attack that might not be obvious is an internal attack. In this attack, the people or employees who have access to your database (including yourself) can easily see the user's password and get their credentials.
A data breach is another reason why plain text passwords are terrible. Someone you never intended to give access might acquire your database; it happens all the time, even to large companies. With plain text passwords, the attacker can get all of your users' credentials just by querying your users table.
The first step that you want to take is to hash your users' passwords with a hashing function before storing them in the database. Unlike encryption, a hashing function only goes one way, and the result of hashing a specific string will always be the same. This makes a hashing function a suitable tool for password storage.
One of the most popular and secure hashing functions is SHA256. If we hash weakpassword with SHA256, the result would be:
9b5705878182ccecf493b6c5ef3d2c723082141d0af33432c997b52dcc9f3e71
The hashing function only goes one way, so we can't convert the hash result back to weakpassword. Also, every time weakpassword is hashed using SHA256, the result will always be the same.
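To make the two properties concrete, here is a minimal Go sketch using the standard library's crypto/sha256. The helper name hashPassword is only illustrative and is not part of any API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashPassword (hypothetical helper) returns the hex-encoded SHA-256
// digest of password.
func hashPassword(password string) string {
	sum := sha256.Sum256([]byte(password))
	return hex.EncodeToString(sum[:])
}

func main() {
	first := hashPassword("weakpassword")
	second := hashPassword("weakpassword")
	fmt.Println(first)
	fmt.Println(first == second) // true: same input, same digest
}
```

There is no matching function to turn the digest back into the password; the only way to check a guess is to hash it and compare.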
Now, the attackers wouldn't be able to know the users' passwords even if they can access your database. But, it's not good enough.
A rainbow table attack uses a table of precomputed hashes of common passwords. With a rainbow table, the attacker can look up the credentials of the users with weak passwords in your database. To combat this, a salt is usually used when hashing a password. A salt is a randomly generated value that you combine with the password before hashing it.
For example, if we generate jvFJ4 as a salt, combine it with weakpassword, and hash the result (sha256("jvFJ4weakpassword")), it will produce:
b104c5bf49e2e4937ac2419e94864f7209014a96cae582302f6e5f891e426e22
This is a completely different result from hashing plain weakpassword.
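The effect of the salt on a precomputed table can be sketched as follows. This uses a plain hash-to-password lookup map, which is simpler than a real rainbow table but illustrates the same idea; the password list and the helper names sha256Hex and buildTable are made up for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the hex-encoded SHA-256 digest of s.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// buildTable precomputes a hash -> password map for a list of common
// passwords, standing in for the attacker's precomputed table.
func buildTable(common []string) map[string]string {
	table := make(map[string]string, len(common))
	for _, p := range common {
		table[sha256Hex(p)] = p
	}
	return table
}

func main() {
	table := buildTable([]string{"123456", "password", "weakpassword"})

	unsalted := sha256Hex("weakpassword")
	salted := sha256Hex("jvFJ4" + "weakpassword")

	_, foundUnsalted := table[unsalted]
	_, foundSalted := table[salted]
	fmt.Println("unsalted hash found in table:", foundUnsalted) // true
	fmt.Println("salted hash found in table:", foundSalted)     // false
}
```

Because the table was computed without the salt, the salted hash is simply not in it; the attacker would have to rebuild the table for every salt.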
Now, with salt, our table will become:
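What do we need to do when the user logs in?
Query the users table by email, e.g., SELECT * FROM users WHERE email='[email protected]', to get the hashed password and salt.
Combine the stored salt with the password the user entered and hash the combination.
Compare the result with the stored hashed password; if they match, authenticate the user.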
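A minimal Go sketch of this salted design, covering both registration and login, might look like the following. The userRow struct stands in for a database row, and the email address, the 16-byte salt length, and all function names are assumptions made for illustration. The comparison uses crypto/subtle so that it runs in constant time.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// userRow is a hypothetical stand-in for one row of the users table.
type userRow struct {
	Email          string
	Salt           string
	HashedPassword string
}

// newSalt returns n random bytes from the OS CSPRNG, hex-encoded.
func newSalt(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// hashWithSalt computes sha256(salt + password), hex-encoded.
func hashWithSalt(salt, password string) string {
	sum := sha256.Sum256([]byte(salt + password))
	return hex.EncodeToString(sum[:])
}

// register builds the row that would be inserted for a new user.
func register(email, password string) (userRow, error) {
	salt, err := newSalt(16)
	if err != nil {
		return userRow{}, err
	}
	return userRow{Email: email, Salt: salt, HashedPassword: hashWithSalt(salt, password)}, nil
}

// verify re-hashes the submitted password with the stored salt and
// compares the result with the stored hash in constant time.
func verify(row userRow, password string) bool {
	candidate := hashWithSalt(row.Salt, password)
	return subtle.ConstantTimeCompare([]byte(candidate), []byte(row.HashedPassword)) == 1
}

func main() {
	row, err := register("user@example.com", "weakpassword")
	if err != nil {
		panic(err)
	}
	fmt.Println(verify(row, "weakpassword"))  // true
	fmt.Println(verify(row, "wrongpassword")) // false
}
```

This only illustrates the intermediate design; the fast hash it relies on is exactly the weakness discussed next.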
We have mitigated many attacks with this design, but is it enough? Well...
The approaches we created previously can mitigate a lot of attacks. But not a dictionary attack. What if the attacker gets the salt, combines it with a common password, hashes the combination, and compares it with the password in the database? Only time will separate the attacker from getting your user's password. And actually, time is one variable that we can tune.
SHA256 is unsuitable for password hashing because it is designed to be computed quickly and to hash inputs that are complex enough on their own (which a password often is not). I tried hashing weakpassword 10 million times on my AMD Ryzen 5 3600 (6 cores, 12 threads @ 3.6GHz) CPU, and it finished very fast.
h := sha256.New()
start := time.Now()
for i := 0; i < 10000000; i++ {
	// Write appends the bytes to the running hash state.
	h.Write([]byte("weakpassword"))
}
elapsed := time.Since(start)
log.Printf("SHA256 took %s \n", elapsed)
The result is:
2022/07/28 16:57:00 SHA256 took 337.2232ms
With SHA256, the attacker can hash and compare many passwords very quickly. We need a slower hashing function.
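One caveat about the measurement: calling Write repeatedly on a single hash object streams all the input into one running hash instead of producing a separate digest per password. An attacker's workload is closer to one full digest per guess, which can be sketched with sha256.Sum256 as below. The helper name timeHashes is hypothetical, and the timing depends entirely on the machine, so no result is shown.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"time"
)

// timeHashes (hypothetical helper) computes n independent SHA-256
// digests of password and returns how long that took.
func timeHashes(password string, n int) time.Duration {
	input := []byte(password)
	var sum [32]byte
	start := time.Now()
	for i := 0; i < n; i++ {
		sum = sha256.Sum256(input)
	}
	elapsed := time.Since(start)
	_ = sum // keep the last digest referenced
	return elapsed
}

func main() {
	elapsed := timeHashes("weakpassword", 10000000)
	fmt.Printf("10000000 SHA256 digests took %s\n", elapsed)
}
```

Either way, the conclusion is the same: a general-purpose hash lets the attacker test millions of guesses per second on commodity hardware.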
This is where Bcrypt comes in. Bcrypt is a password hashing function based on the Blowfish cipher that lets you choose how expensive it is to run. This trait is perfect for password hashing because you can raise the cost as machines get faster, which future-proofs the hashing function.
Let's see Bcrypt in action:
for i := 10; i < 21; i++ {
	start := time.Now()
	bcrypt.GenerateFromPassword([]byte("weakpassword"), i)
	elapsed := time.Since(start)
	fmt.Printf("cost: %d Elapsed time: %s\n", i, elapsed)
}
The result is:
cost: 10 Elapsed time: 56.233ms
cost: 11 Elapsed time: 113.1521ms
cost: 12 Elapsed time: 213.0455ms
cost: 13 Elapsed time: 447.572ms
cost: 14 Elapsed time: 877.3284ms
cost: 15 Elapsed time: 1.8126554s
cost: 16 Elapsed time: 3.375513s
cost: 17 Elapsed time: 6.5935858s
cost: 18 Elapsed time: 13.3655301s
cost: 19 Elapsed time: 27.0033831s
cost: 20 Elapsed time: 53.8954938s
We can see that the time roughly doubles with every increment of the cost, so it grows exponentially. A higher cost means better security but a worse user experience. Just imagine: if you set the cost to 20, the user will need to wait about 53 seconds when logging in. But set it too low, and it will be easier for the attacker to steal your users' credentials.
Let's do some math. Suppose you have ten million users in your database, and the attacker has a dictionary of 1000 most common passwords. How long would it take for the attacker to calculate the password hash with SHA256 compared to Bcrypt with the cost of 12? First, we will need to calculate how many hash operations the attacker needs to do, which we can get by multiplying how many users we have by the number of common passwords the attacker uses.
So, 10.000.000 * 1000 = 10.000.000.000. Now, we calculate the time the attacker needs to compute the hash ten billion times. For SHA256, we did ten million calculations in 337.2232ms, so all of the hashes take 10.000.000.000 / 10.000.000 * 337.2232ms = 337223.2ms, which is under 6 minutes. Next, let's try Bcrypt with cost 12: 10.000.000.000 * 213.0455ms = 2.130455e+12ms, which equals roughly 67.6 years on the same machine.
As you can see, using Bcrypt for your password hashing function makes a lot of difference. If a data breach happens to your database, it will buy you a lot of time to notice it and ask your users to change their passwords.
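The estimate can be reproduced with a few lines of Go. It uses the benchmark figures from this article (10 million SHA256 hashes in 337.2232ms, one Bcrypt hash at cost 12 in 213.0455ms) and assumes a single machine of the same speed; totalMillis is a hypothetical helper.

```go
package main

import "fmt"

// totalMillis returns the time in milliseconds needed for ops hash
// operations when batch operations take batchMillis milliseconds.
func totalMillis(ops, batch, batchMillis float64) float64 {
	return ops / batch * batchMillis
}

func main() {
	const users, guesses = 10000000.0, 1000.0
	ops := users * guesses // ten billion hash operations

	shaMs := totalMillis(ops, 10000000, 337.2232) // 10 million hashes per 337.2232ms
	bcryptMs := totalMillis(ops, 1, 213.0455)     // one cost-12 hash per 213.0455ms

	fmt.Printf("SHA256: %.1f minutes\n", shaMs/1000/60)                     // 5.6 minutes
	fmt.Printf("Bcrypt (cost 12): %.1f years\n", bcryptMs/1000/3600/24/365) // 67.6 years
}
```

A real attacker would use many cores or GPUs, so both numbers shrink in practice, but the ratio between them stays roughly the same.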
Besides letting us determine the cost, Bcrypt also uses a salt by default, which means the attacker won't be able to use the rainbow table attack we discussed previously. Let's see what the result is if we hash weakpassword with Bcrypt:
hashedPassword, _ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)
fmt.Printf("hashed password: %s", hashedPassword)
hashed password: $2a$10$.krQtTcne8xlhG2rJONbKu9KZepUpwl8tyC/fFIB6lRmNufvPfge2
If we break the result down, we will get:
alg: The hash algorithm identifier; $2a means Bcrypt.
cost: The cost of the Bcrypt hash; remember, we set this to 10 in the code.
salt (22 characters): The random salt generated by the Bcrypt hashing function.
hashed password (31 characters): The hashing function result.
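The same breakdown can be done in code. The following is a minimal sketch that splits the string with the standard library only; bcryptParts and parseBcrypt are hypothetical names, not part of the bcrypt package, and the parser only checks the layout described above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bcryptParts holds the fields of a Bcrypt hash string.
type bcryptParts struct {
	Alg  string
	Cost int
	Salt string
	Hash string
}

// parseBcrypt splits "$alg$cost$<22-char salt><31-char hash>" into its fields.
func parseBcrypt(s string) (bcryptParts, error) {
	fields := strings.Split(s, "$") // the leading "$" yields an empty first field
	if len(fields) != 4 || fields[0] != "" || len(fields[3]) != 53 {
		return bcryptParts{}, fmt.Errorf("unexpected bcrypt format: %q", s)
	}
	cost, err := strconv.Atoi(fields[2])
	if err != nil {
		return bcryptParts{}, fmt.Errorf("invalid cost: %w", err)
	}
	return bcryptParts{
		Alg:  fields[1],
		Cost: cost,
		Salt: fields[3][:22],
		Hash: fields[3][22:],
	}, nil
}

func main() {
	parts, err := parseBcrypt("$2a$10$.krQtTcne8xlhG2rJONbKu9KZepUpwl8tyC/fFIB6lRmNufvPfge2")
	if err != nil {
		panic(err)
	}
	fmt.Println("alg: ", parts.Alg)  // 2a
	fmt.Println("cost:", parts.Cost) // 10
	fmt.Println("salt:", parts.Salt) // .krQtTcne8xlhG2rJONbKu
	fmt.Println("hash:", parts.Hash) // 9KZepUpwl8tyC/fFIB6lRmNufvPfge2
}
```

In application code you would not normally parse the string yourself; the library reads these fields when it compares a password with a stored hash.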
Lastly, let's see how to validate a password against a hash produced by Bcrypt:
hashedPassword, _ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)
err := bcrypt.CompareHashAndPassword(hashedPassword, []byte("weakpassword"))
if err != nil {
	fmt.Print(err)
} else {
	fmt.Printf("Password true")
}
As we can see, we don't need to pass the cost, alg, and salt when comparing the hash and password because every required input is already embedded in the Bcrypt hashing result itself.
Let's review. Two essential traits of Bcrypt make it suitable for password hashing:
Bcrypt lets us determine the cost of calculating the hash, which makes it future-proof against faster machines.
Bcrypt salts every hash by default, which makes a rainbow table attack impractical.
We've discussed how to store your users' passwords correctly so that attackers can't figure them out quickly. But storing passwords securely doesn't mean an attacker can't get them some other way. For example, the attacker can simply watch your user typing their password.
There is also a chance that the attacker can make a man-in-the-middle attack if your website doesn't use HTTPS.
If you want to understand more about how to secure the authentication process, I urge you to read about: