After making the aforementioned “fixes” to the database, there are 412,670 accounts, 157,794 (38.2%) of which had their password decrypted.
The following table displays the 10 most frequently occurring domain names used for e-mail addresses in the database, along with how many users of each domain had their password cracked.
In this post I will look at trends in which users’ passwords were cracked to gain insight into which users do and do not create strong passwords.
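A tally like the one described above, the most common e-mail domains together with how many of their users had a cracked password, can be sketched in a few lines of Python. This is only an illustration, not the code used for this post: the function name `top_domains`, the `(email, cracked)` input format, and the sample addresses are all made up for the example.

```python
from collections import Counter


def top_domains(accounts, n=10):
    """Return (domain, users, cracked, percent_cracked) for the n most common domains.

    `accounts` is an iterable of (email, was_cracked) pairs. This input
    format is an assumption made for the sketch, not the leaked file's layout.
    """
    totals = Counter()
    cracked = Counter()
    for email, was_cracked in accounts:
        # Everything after the last "@" is treated as the domain.
        domain = email.rsplit("@", 1)[-1].lower()
        totals[domain] += 1
        if was_cracked:
            cracked[domain] += 1
    return [
        (domain, count, cracked[domain], 100.0 * cracked[domain] / count)
        for domain, count in totals.most_common(n)
    ]


# Hypothetical sample data, not taken from the database.
sample = [
    ("alice@example.com", True),
    ("bob@example.com", False),
    ("carol@example.org", True),
]
rows = top_domains(sample)
for domain, users, n_cracked, pct in rows:
    print(f"{domain}: {n_cracked} of {users} cracked ({pct:.1f}%)")
```

The same percentage calculation applied to the overall totals gives 157,794 / 412,670 ≈ 38.2%.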
This past weekend, Gawker Media was hacked and its user account database was leaked online.
The database contained about 1.3 million rows of information containing usernames, e-mail addresses, and passwords (encrypted via DES).
Of course, the passwords that were cracked were relatively weak.
For example, all 2,641 accounts that used some trivial modification of “password” or “qwerty” as their password were decrypted.
The database had to be cleaned significantly before it could be of much statistical use, so some of the numbers here may differ slightly from the raw numbers reported by news outlets or from what you would find if you downloaded the raw database yourself.
The numbers here are the result of removing any incomplete rows from the database (i.e., rows missing a password, an e-mail address, or both) and removing any accounts that were clearly created by spambots (I’m only interested in the password strength of real users).
Also, I will only look at accounts that contain an e-mail address with a domain that was registered in the database at least 50 times.
This restriction is in place partly because it is extremely difficult to compute any sort of meaningful statistics on a sample that is much smaller than 50, and partly because Gawker doesn’t require verified e-mail addresses (so 46,993 of the 52,593 domain names listed in the database were used by exactly one person, and many of those are clearly fake and/or spam).
The following table shows the z-values associated with the statistical test that the two given domains have the same proportion of users with strong passwords.
This security breach is unfortunate for people whose information is contained within that database, but the silver lining is that it provides a rare opportunity for statistics nerds like me to analyze some otherwise completely unobtainable data.
Because the passwords were encrypted using such an out-of-date scheme (tsk, tsk, Gawker), about 200,000 of the passwords contained in the database have been decrypted.
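For readers who want to see where z-values like the ones mentioned above come from, here is a minimal sketch of a two-proportion z-test in Python. It assumes the common pooled-standard-error form of the test; the exact variant used for the table is not stated here, so treat this as one reasonable reading. The function name `two_proportion_z` and the counts in the example are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z-statistic for H0: both groups share the same underlying proportion.

    x1, x2 are the counts of users with the property of interest
    (e.g. a password that was NOT cracked) and n1, n2 are the group sizes.
    Uses the pooled estimate of the proportion (an assumption of this sketch).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Hypothetical counts for two domains, not taken from the database.
z = two_proportion_z(400, 1000, 350, 1000)
print(f"z = {z:.2f}")
```

A z-value far from zero (beyond roughly ±1.96 at the 5% level) suggests the two domains really do differ in their proportion of strong passwords. Swapping the two domains only flips the sign of z, and the normal approximation behind the test is one more reason to avoid domains with very few accounts.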