Da Gampa's Code

Personal weblog of Jakub Hampl who is an AI & Psychology student at Edinburgh University.


Popular passwords in 2010

Long story short: I got my hands on a data dump of 184 550 hacked usernames and passwords from Gawker. With a bit of data-analysis magick I did some research into which passwords people actually pick.

20 most popular passwords

Password    Count
password    1914
lifehack    648
qwerty      412
abc123      328
monkey      294
consumer    267
letmein     241
trustno1    240
dragon      229
baseball    211
superman    203
iloveyou    201
gizmodo     194
sunshine    192
princess    182
starwars    180
whatever    179
shadow      172
cheese      153
nintendo    148

Now the first entry is rather shocking: the most used password is also the most obvious password possible. A full 1% of the accounts in this dump (and 0.11% of the full DB; see below) used this password! What didn’t quite make the list: another 111 people used ‘Password’ and 129 people used the slightly better ‘passw0rd’.

“lifehack” and “gizmodo” both come from the names of sites these passwords were retrieved from (Lifehacker and Gizmodo). Not very secure. ‘qwerty’ and ‘abc123’ are also extremely obvious passwords. The remaining 15 passwords are single words or stock phrases that any password-cracking dictionary contains.
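For anyone who wants to repeat the count, here is a minimal sketch in Python. It is not the code used for this post; it assumes the dump has already been parsed into (username, password) pairs, and the function name `top_passwords` and the sample records are made up for illustration.

```python
from collections import Counter

def top_passwords(records, n=20):
    """Return the n most common passwords as (password, count, share) tuples.

    `records` is assumed to be an iterable of (username, password) pairs.
    Passwords are compared case-sensitively, so 'password' and 'Password'
    are counted separately, as in the table above.
    """
    counts = Counter(password for _user, password in records)
    total = sum(counts.values())
    return [(pw, c, c / total) for pw, c in counts.most_common(n)]

# Tiny made-up sample, NOT taken from the real dump.
sample = [
    ("alice", "password"),
    ("bob", "qwerty"),
    ("carol", "password"),
    ("dave", "Password"),
    ("erin", "monkey"),
]

for pw, count, share in top_passwords(sample, n=3):
    print(f"{pw:<10} {count:>4} {share:7.1%}")
```

Applied to the real numbers, the share of the top entry is 1914 / 184550, which is about 1.04%.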

Password Strength

Next I marked the passwords with these simple rules:

The following table shows the distribution of these strengths.

Score   Count   Percentage
0       8567    4.6%
1       96738   52.5%
2       65876   35.6%
3       12781   6.9%
4       586     0.3%

Now a score of 1 is very bad: it is the default score, and the only way to do worse (a score of 0) is to use your username as your password. Yet a staggering number of people, more than half, use these very unsafe passwords.
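To make the idea of a score concrete, here is a rough sketch of such a scorer in Python. Only two things are taken from this post: a password equal to the username scores 0, and 1 is the default. The rules for earning the extra points (a digit, mixed case, a symbol) are hypothetical stand-ins chosen so the scale runs from 0 to 4; they are not necessarily the rules used for the table above.

```python
from collections import Counter
import string

def score(username, password):
    """Score a password from 0 to 4 under ASSUMED rules (see lead-in)."""
    if password == username:
        return 0  # from the post: same as your name is the worst case
    points = 1  # from the post: 1 is the default score
    # The three bonus rules below are hypothetical, for illustration only.
    if any(ch.isdigit() for ch in password):
        points += 1
    if any(ch.islower() for ch in password) and any(ch.isupper() for ch in password):
        points += 1
    if any(ch in string.punctuation for ch in password):
        points += 1
    return points

def distribution(records):
    """Map each score 0..4 to a (count, share) pair for (username, password) records."""
    counts = Counter(score(user, pw) for user, pw in records)
    total = sum(counts.values())
    if total == 0:
        return {s: (0, 0.0) for s in range(5)}
    return {s: (counts[s], counts[s] / total) for s in range(5)}

# Made-up records, not from the real dump.
sample = [("bob", "bob"), ("amy", "password"), ("joe", "passw0rd"), ("zed", "Passw0rd!")]
for s, (count, share) in distribution(sample).items():
    print(f"{s}  {count:>3}  {share:6.1%}")
```

Under these assumed rules, every lowercase dictionary word from the top 20 lands on a score of 1, which matches the point that the default score is a bad one.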

A New Year’s resolution (a bit early, but oh well) should be a bit more online safety for all of us.

A few notes

  1. The dataset is available here and the details of how it was obtained are here.

  2. The full db contained ~1.5M records, out of which the attackers obtained ~1.2M (presumably at random). They proceeded to crack the weak encryption of ~200K accounts. I do not assume that this selection was completely random. Out of these I got a total of 184 550 records.

  3. The site apparently stored only the first 8 characters (sic!) of the password. This suggests caution with passwords of length 8, because these were possibly truncated. I set those shorter than 8 characters in an italicized font. However, apart from “lifehack”, which was probably “lifehacker”, all other passwords in the top 20 appear to be complete words or phrases. As an aside, a strong password is generally considered to be at least 16 characters long.

  4. I would like to do more research into password sharing with this dataset, but I don’t have the time for that right now. I’d be interested to hear from anyone who investigates.
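The truncation caveat from note 3 is easy to check mechanically. The sketch below splits the top 20 into passwords that are certainly complete (shorter than 8 characters) and those that may have been cut off (exactly 8). The constant `STORED_LENGTH` and the helper `possibly_truncated` are names made up for this example.

```python
STORED_LENGTH = 8  # per note 3: only the first 8 characters were stored

def possibly_truncated(password, stored_length=STORED_LENGTH):
    """True if the stored value is long enough to be a cut-off longer password."""
    return len(password) >= stored_length

# The 20 most popular passwords from the table in this post.
top20 = [
    "password", "lifehack", "qwerty", "abc123", "monkey",
    "consumer", "letmein", "trustno1", "dragon", "baseball",
    "superman", "iloveyou", "gizmodo", "sunshine", "princess",
    "starwars", "whatever", "shadow", "cheese", "nintendo",
]

maybe_cut = [pw for pw in top20 if possibly_truncated(pw)]
complete = [pw for pw in top20 if not possibly_truncated(pw)]

print("possibly truncated:", ", ".join(maybe_cut))
print("certainly complete:", ", ".join(complete))
```

A length check can only flag candidates. Whether an 8-character entry such as “lifehack” really stands for something longer still takes a human guess.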

#tech #password #security