Long story short: I got my hands on a data dump of 184 550 hacked usernames and passwords from Gawker. With a bit of data-analysis magic I did some research on it.
20 most popular passwords
| Password | Count |
|---|---|
| password | 1914 |
| lifehack | 648 |
| qwerty | 412 |
| abc123 | 328 |
| monkey | 294 |
| consumer | 267 |
| letmein | 241 |
| trustno1 | 240 |
| dragon | 229 |
| baseball | 211 |
| superman | 203 |
| iloveyou | 201 |
| gizmodo | 194 |
| sunshine | 192 |
| princess | 182 |
| starwars | 180 |
| whatever | 179 |
| shadow | 172 |
| cheese | 153 |
| nintendo | 148 |
Now the first entry is rather shocking: the most used password is also the most obvious password possible. A full 1% of the users in this dataset (and 0.11% of the full DB; see below) used this password! What didn’t quite make the list: another 111 people used ‘Password’ and 129 people used the slightly better ‘passw0rd’.
‘lifehack’ and ‘gizmodo’ are both names of the sites these passwords were retrieved from. Not very secure. ‘qwerty’ and ‘abc123’ are also extremely obvious passwords. 15 of the remaining passwords are single words contained in any dictionary.
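A tally like the one above takes only a few lines. The following is a minimal sketch, not the script actually used: it assumes the dump has already been parsed into (username, password) pairs, and the function name `most_common_passwords` and the toy data are made up for illustration.

```python
from collections import Counter


def most_common_passwords(records, n=20):
    """Return the n most frequent passwords as (password, count, share) tuples.

    Assumption: `records` is an iterable of (username, password) pairs;
    the real dump's layout may differ. Counting is case-sensitive, so
    'password' and 'Password' are tallied separately.
    """
    records = list(records)
    counts = Counter(password for _, password in records)
    total = len(records)
    return [(pw, c, c / total) for pw, c in counts.most_common(n)]


# Toy data, not taken from the dump.
sample = [
    ("alice", "password"),
    ("bob", "qwerty"),
    ("carol", "password"),
    ("dave", "Password"),
]

for pw, count, share in most_common_passwords(sample, n=2):
    print(f"{pw}\t{count}\t{share:.1%}")
```

Keeping the count case-sensitive is what lets ‘Password’ show up as its own entry rather than being merged into ‘password’.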
Password Strength
Next I scored the passwords with these simple rules:
- +1 point for each of: uppercase characters, lowercase characters, numerals, punctuation
- +1 point for passwords longer than 7 characters
- 0 points in total if the password was part of the username
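The rules above could be implemented roughly like this. It is a sketch under assumptions rather than the original scoring code: the name `password_score` is hypothetical, "part of the username" is read as a case-insensitive substring match, and "punctuation" is taken to mean Python's `string.punctuation`. Read literally, the rules allow a maximum of 5, so the original scoring may have differed in some detail.

```python
import string


def password_score(username: str, password: str) -> int:
    """Score a password using one possible reading of the rules above."""
    # Assumption: "part of the username" is a case-insensitive substring match.
    if password and password.lower() in username.lower():
        return 0

    score = 0
    # One point per character class that appears at least once.
    score += any(c.isupper() for c in password)
    score += any(c.islower() for c in password)
    score += any(c.isdigit() for c in password)
    score += any(c in string.punctuation for c in password)
    # One point for length: "longer than 7" means 8 characters or more.
    score += len(password) > 7
    return score


# Toy examples, not taken from the dump.
print(password_score("alice", "qwerty"))      # lowercase only
print(password_score("alice", "passw0rd"))    # lowercase, digit, 8 characters
print(password_score("monkeyfan", "monkey"))  # contained in the username
```

Feeding every record through such a function and counting the results with `collections.Counter` would give a score distribution.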
The following table shows the distribution of these strengths.
| Score | Count | Percentage |
|---|---|---|
| 0 | 8567 | 4.6% |
| 1 | 96738 | 52.5% |
| 2 | 65876 | 35.6% |
| 3 | 12781 | 6.9% |
| 4 | 586 | 0.3% |
Now a score of 1 is very bad: it is the minimum any password gets unless it is part of your username. Yet a staggering number of people, more than half, use these very unsafe passwords.
A New Year’s resolution (a bit early, I know) should be a bit more online safety for all of us.
A few notes
The dataset is available here and the details of how it was obtained are here.
The full DB contained ~1.5M records, out of which the attackers obtained ~1.2M (presumably at random). They proceeded to crack the weak encryption on ~200K of those accounts; I do not assume this selection was completely random. Out of these I got a total of 184 550 records.
The site apparently stored only the first 8 characters (sic!) of each password. This suggests caution with passwords of length 8, because they may have been truncated. I set those shorter than 8 characters in an italicized font. However, apart from ‘lifehack’, which is probably a truncated ‘lifehacker’, all other passwords in the top 20 appear to be complete words or phrases. As an aside, a strong password is generally considered to be at least 16 characters long.
I would like to do more research into password sharing with this dataset, but I don’t have the time for that right now. I’d be interested to hear from anyone who investigates it.
The internet was taken by storm yesterday when a ReadWriteWeb article about the Facebook login process attracted hundreds of completely confused users, who posted comments about how much they hate this “redesign” of the Facebook login page.
As Neven Mrgan correctly says, the amount of information people were required to ignore is staggering:
- People google for “facebook login” to log in to Facebook. I understand that they don’t use bookmarks and don’t type in facebook.com, but note that they don’t google “facebook”; they google “facebook login”. Clearly users don’t even see logging in as a function of the site itself; those are separate in the users’ mental maps. This is perhaps partly explained by the excess of websites which use Facebook as their authentication system, but it’s not the whole story.
- They then click the small google result which says “News results: ReadWriteWeb” expecting they’ll be taken to Facebook.
- They land on a page with an absolutely enormous heading saying ReadWriteWeb, below which is a headline, a byline, and endless paragraphs of what is even at the quickest glance obviously a news story.
- They scroll all the way to the bottom of this completely un-Facebook-like page, with not a single thing in the way that would indicate this is a Facebook redesign.
- They then go past the big heading saying Leave a comment and instead focus on the small link which says Optional: Sign in with Facebook. And don’t tell me these folks searched for “facebook” or “login” on the page itself.
The implications for the design and UX community are staggering. How are you supposed to design for this? It seems that the whole webapp metaphor is fundamentally flawed for a lot of users.
However, there is also a Flickr screenshot of the Facebook login page with similar comments. When I was quickly reading through these I recognized one of the commenters. I don’t know him personally, but he clearly is an advanced computer user (probably dependent on computers for his living). Yet he is confused about not being able to log in via the tiny screenshot. This made me think of two possible explanations:
- Statistics: Facebook has about 80 million users (I believe), so a few hundred comments on a few websites amount to about 0.0001% of users being confused. There are probably more who didn’t post anything, but it is still a very small number. These people could be high for all we know, or this is just a quantum-level transmission error :)
- Joke: somebody’s idea of a joke, delivered by malware. It wouldn’t surprise me if somebody created a virus that takes over people’s Facebook accounts and posts helpless comments on the top Google results for “facebook login”. The internet has seen weirder things happen.
Anyway, if it turns out to be a case of real, honest user confusion, UX may well have to change quite radically.