Popular passwords in 2010
Long story short: I got my hands on a data dump of 184 550 hacked usernames and passwords from Gawker. With a bit of data-analysis magick I did some research on it.
20 most popular passwords
Now the first entry is rather shocking: the most used password is also the most obvious password possible. A full 1% of users (and 0.11% of the full DB; see below) used this password! What didn’t quite make the list is that another 111 people used ‘Password’ and 129 people used the slightly better ‘passw0rd’.
“lifehack” and “gizmodo” are both names of the sites these passwords were retrieved from. Not very secure. ‘qwerty’ and ‘abc123’ are also extremely obvious passwords. 15 of the remaining passwords are single words contained in any dictionary.
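A ranking like this can be produced with a simple frequency count. The following is a minimal sketch, not the script actually used for this post; the function name `most_popular` and the tiny sample list are made up for illustration.

```python
from collections import Counter

def most_popular(passwords, n=20):
    """Return (password, count, percent of sample) for the n most common passwords."""
    counts = Counter(passwords)
    total = len(passwords)
    return [(pw, c, 100.0 * c / total) for pw, c in counts.most_common(n)]

# Hypothetical sample, NOT the real data
sample = ["qwerty", "abc123", "qwerty", "hunter2", "qwerty", "abc123"]

for pw, count, pct in most_popular(sample, n=3):
    print(f"{pw}\t{count}\t{pct:.1f}%")
```

Note that the count is case-sensitive, which is why ‘Password’ and ‘password’ would show up as separate entries.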
Password Strength
Next I marked the passwords with these simple rules:
- +1 point for each of: uppercase characters, lowercase characters, numerals, punctuation
- +1 point for passwords longer than 7 characters
- 0 points in total if the password was part of the username
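The rules above can be sketched as a small function. This is one possible reading, not the original script: the name `password_score` is hypothetical, "punctuation" is assumed to mean the ASCII punctuation in `string.punctuation`, and "part of the username" is assumed to be a case-insensitive substring check.

```python
import string

def password_score(username: str, password: str) -> int:
    """Score a password by the rules listed above (one interpretation)."""
    # Password contained in the username: 0 points, whatever else it has.
    # Assumption: the comparison ignores case.
    if password and password.lower() in username.lower():
        return 0

    score = 0
    score += any(c.isupper() for c in password)             # uppercase characters
    score += any(c.islower() for c in password)             # lowercase characters
    score += any(c.isdigit() for c in password)             # numerals
    score += any(c in string.punctuation for c in password) # punctuation
    score += len(password) > 7                              # longer than 7 characters
    return score

print(password_score("alice", "qwerty"))    # lowercase only
print(password_score("alice", "passw0rd"))  # lowercase + numeral + length
print(password_score("alice", "alice"))     # same as the username
```

Read this way, the rules allow a maximum of 5 points, one for each character class plus one for length.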
The following table shows the distribution of these strengths.
| Score | Count | Percentage |
|---|---|---|
| 0 | 8567 | 4.6% |
| 1 | 96738 | 52.5% |
| 2 | 65876 | 35.6% |
| 3 | 12781 | 6.9% |
| 4 | 586 | 0.3% |
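The percentage column is simply each count divided by the total. A short sketch that recomputes it from the counts in the table (the function name `distribution` is made up here):

```python
# Counts copied from the table above: score -> number of passwords
counts = {0: 8567, 1: 96738, 2: 65876, 3: 12781, 4: 586}

def distribution(counts):
    """Return score -> percentage of all passwords, rounded to one decimal."""
    total = sum(counts.values())
    return {score: round(100.0 * n / total, 1) for score, n in sorted(counts.items())}

print("total:", sum(counts.values()))
for score, pct in distribution(counts).items():
    print(f"| {score} | {counts[score]} | {pct}% |")
```

The counts sum to 184 548, two short of the 184 550 records quoted, and the recomputed values can differ from the table by a tenth of a percent depending on how the originals were rounded.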
Now a score of 1 is very bad: it is the lowest score you can get unless your password is part of your username. Yet a staggering number of people use these very unsafe passwords.
A New Year’s resolution (a bit early, but still) should be a bit more online safety for all of us.
A few notes
The dataset is available here and the details of how it was obtained are here.
The full DB contained ~1.5M records, out of which the attackers obtained ~1.2M (presumably at random). They proceeded to crack the weak encryption on ~200K accounts. I do not assume this selection was completely random. Out of this I got a total of 184 550 records.
The site apparently stored only the first 8 characters (sic!) of the password. This suggests caution with passwords of length 8, because these were possibly truncated. I set those shorter than 8 characters in an italicized font. However, apart from “lifehack”, which was probably “lifehacker”, all other passwords in the top 20 appear to be complete words or phrases. As an aside, a strong password is generally considered to be at least 16 characters long.
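The italics rule is easy to express in code. A minimal sketch, assuming Markdown-style asterisks for italics; `STORED_LENGTH` and `format_password` are names invented for this example.

```python
STORED_LENGTH = 8  # the site apparently kept only the first 8 characters

def format_password(password: str) -> str:
    """Italicize passwords that are too short to have been truncated."""
    if len(password) < STORED_LENGTH:
        return f"*{password}*"  # shorter than 8 characters: stored in full
    return password             # exactly 8 characters: possibly truncated

for pw in ["qwerty", "gizmodo", "lifehack"]:
    print(format_password(pw))
```

Only “lifehack” is left unmarked here, since at 8 characters it may be the start of a longer password.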
I would like to do more research into password sharing with this dataset, but I don’t have the time for that right now. I’d be interested to hear from anyone who investigates.