13

My wife and I are having a bit of a disagreement.

Concerning eight-character passwords like the kind most secure websites require, she believes you could generate more unique password combinations by drawing every character from just one of the following sets: lowercase letters, uppercase letters, symbols, or numerals. In other words, a password made only of lowercase letters (or only of one of the other sets) with nothing else mixed in, e.g.: jtpdlrkc or JTPDLRKC or 29456014 or #^@%#&(%.

I have asserted that, by using at least one each of the aforementioned sets, a person can generate FAR, FAR more unique password combinations, thereby making the password much more difficult for someone to hack. e.g. gy7HU*j9 or $F6ms38@.

She thinks I'm making this up. I need numbers to prove my point. Anyone interested in running with this one?

  • 0
    @J.Harris can you please clarify what you mean again? It seems like your proposed rule for a password is that the eight-character password **must** contain at least one uppercase, at least one lowercase, at least one digit, and at least one symbol. Is that right? 2017-01-29
  • 3
    There are lots of correct answers here with good mathematics. But I'd suggest another way to settle the argument that might be more convincing and less an appeal to mathematical authority. Work out all possible two character passwords using very small alphabets: ab, AB, 12, &*. 2017-01-30

6 Answers

21

There are $26^8=208,827,064,576$ lower-case passwords, the same number of upper-case passwords, $10^8=100,000,000$ digit-only passwords and (if we assume there are about ten symbols) about $10^8$ symbol-only passwords as well. That gives your wife about $418,000,000,000$ possible passwords.

We get a simple lower bound for the number of passwords having all kinds of characters: We can use any character for the first five places, that's $(26+26+10+10)^5=1,934,917,632$. If no upper-case was used so far, add an upper-case. If no lower-case was used so far, add one lower case. If no digit was used so far, add one digit. If no special was used so far, add one special. (At most three of these conditions can be true). As long as we still have less than eight characters add one of the 72 allowed characters. This method will not generate all valid passwords, but there will be at least $10\cdot 10\cdot 26$ possible choices in the last three places. Multiply with the above to arrive at $5,030,785,843,200$, more than ten times the other count.
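The arithmetic in this answer is easy to re-check. Here is a minimal sketch, assuming (as the answer does) roughly ten symbols; the variable names are mine and purely illustrative.

```python
# Class sizes; SYMBOLS = 10 is the answer's rough assumption, not a fixed fact.
LOWER, UPPER, DIGITS, SYMBOLS = 26, 26, 10, 10
LENGTH = 8

# Passwords drawn entirely from a single class (the wife's rule).
single_class = sum(size ** LENGTH for size in (LOWER, UPPER, DIGITS, SYMBOLS))

# Lower bound for passwords containing every class:
# five unrestricted characters, then at least 10 * 10 * 26 ways
# to fill the last three places.
free_prefix = (LOWER + UPPER + DIGITS + SYMBOLS) ** 5
last_three = DIGITS * SYMBOLS * LOWER
lower_bound = free_prefix * last_three

print(f"single class: {single_class:,}")
print(f"lower bound:  {lower_bound:,}")
print(f"ratio:        {lower_bound / single_class:.1f}")
```

The ratio comes out a little above 12, which agrees with the "more than ten times" claim.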

  • 0
    Nice answer, this addresses all the potential problems of the other answers. 2017-01-29
  • 1
    It should be possible to count the exact number of combinations in the case where there must be at least one symbol from each of the four groups. If you first figure out how many ways you can partition the eight into an ordered sum of four terms (lowercase+uppercase+digits+others), then you can iterate over each of these and find the grand total. With a computer. Anyone? 2017-01-29
7

Here's an attempt to compute not just a lower bound, but exactly how many passwords the OP's criteria allow.

We have $26$ upper-case letters, $26$ lower-case letters, $10$ digits, and an unknown number of "symbols". The number of "symbols" is unknown because many websites accept only a restricted subset of the non-alphabetic, non-numeric characters that are generally available. I'll assume all password characters must be among the $94$ non-whitespace printable ASCII characters (excluding "delete") and that the symbols must be some subset of the characters that remain after we remove the $62$ letters and digits. Hence there are $n$ symbols, where $n \leq 32$.

Let $U$ be the set of upper-case letters, $L$ the set of lower-case letters, $D$ the set of digits, and $S$ the set of symbols. The total number of eight-character passwords using all characters from $U\cup L\cup D\cup S$ without restriction is $(62+n)^8.$ Exclude those that use only three of the classes of characters:

  • $U\cup L\cup D$ only: $62^8$ passwords

  • $U\cup L\cup S$ only: $(52+n)^8$ passwords

  • $U\cup D\cup S$ only: $(36+n)^8$ passwords

  • $L\cup D\cup S$ only: $(36+n)^8$ passwords

By the Inclusion-Exclusion principle, we now add back the following numbers of passwords using only two classes of characters:

  • $U\cup L$ only: $52^8$ passwords

  • $U\cup D$ only: $36^8$ passwords

  • $L\cup D$ only: $36^8$ passwords

  • $U\cup S$ only: $(26+n)^8$ passwords

  • $L\cup S$ only: $(26+n)^8$ passwords

  • $D\cup S$ only: $(10+n)^8$ passwords

Finally, by the Inclusion-Exclusion principle, once again exclude the passwords generated by only one class of character:

  • $U$ only: $26^8$ passwords

  • $L$ only: $26^8$ passwords

  • $D$ only: $10^8$ passwords

  • $S$ only: $n^8$ passwords

So the grand total as a function of the number of symbols, $n,$ is \begin{align} P(n) &= (62+n)^8 - 62^8 - (52+n)^8 - 2(36+n)^8 \\ &\qquad + 52^8 + 2\left(36^8\right) + 2(26+n)^8 + (10+n)^8 - 2\left(26^8\right) - 10^8 - n^8 \\ &= 2271360 n (n^4 + 155 n^3 + 10820 n^2 + 410440 n + 8287152) \end{align} (expanded and simplified by Wolfram Alpha).
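For anyone who wants to check the bookkeeping, here is a small sketch that sums the four groups of terms from the bullet lists above (all classes, three classes, two classes, one class) and compares the result with the factored polynomial. The function names are mine, chosen only for illustration.

```python
def mixed_count(n, length=8):
    """Passwords with at least one upper, lower, digit and symbol (n symbols)."""
    U, L, D = 26, 26, 10
    all_classes = (U + L + D + n) ** length
    three = ((U + L + D) ** length + (U + L + n) ** length
             + (U + D + n) ** length + (L + D + n) ** length)
    two = ((U + L) ** length + (U + D) ** length + (L + D) ** length
           + (U + n) ** length + (L + n) ** length + (D + n) ** length)
    one = U ** length + L ** length + D ** length + n ** length
    # Inclusion-exclusion: alternate signs by number of classes left out.
    return all_classes - three + two - one


def mixed_count_poly(n):
    """The factored form quoted in the answer (valid for length 8 only)."""
    return 2271360 * n * (n**4 + 155 * n**3 + 10820 * n**2 + 410440 * n + 8287152)


for n in (1, 10, 25, 32):
    print(n, f"{mixed_count(n):,}", mixed_count(n) == mixed_count_poly(n))
```

Both forms agree for every $n$ I tried, and with zero symbols the count is zero, as it should be.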

Assuming the wife's password rules allow the choice of any one of the classes of characters, but then the password must be made only from characters in that set, these rules allow the user to make any of $Q(n) = 2\left(26^8\right) + 10^8 + n^8$ passwords.

The values of $P(n)$ and $Q(n)$ for some values of $n$ are:

\begin{array}{rrr} \hfill n\hfill & \hfill P(n)\hfill & \hfill Q(n)\hfill \\ \hline 1 & 19\,780\,293\,012\,480 \approx 1.98\times10^{13} & 417\,754\,129\,153 \approx 4.18\times10^{11} \\ 10 & 309\,780\,614\,707\,200 \approx 3.10\times10^{14} & 417\,854\,129\,152 \approx 4.18\times10^{11} \\ 25 & 1\,596\,945\,063\,168\,000 \approx 1.60\times10^{15} & 570\,342\,019\,777 \approx 5.70\times10^{11} \\ 32 & 2\,807\,657\,387\,458\,560 \approx 2.81\times10^{15} & 1\,517\,265\,756\,928 \approx 1.52\times10^{12}\\ \end{array}

For a reasonable number of symbols (about 25), we see that the OP's rules allow thousands of times as many passwords as the wife's. Even if there is just one symbol, $P(n)$ is more than $40$ times $Q(n).$ Of course, when $n=0$, $P(n)=0,$ since it is impossible to include a symbol in the password when no symbols are allowed.

Note, however, that the OP's relative advantage (the ratio $P(n)/Q(n)$) peaks at $n = 25$ symbols, when $P(n)/Q(n) \approx 2800.$ As $n$ increases above $25,$ the ratio rapidly falls off. If there were a much larger number of symbols available, specifically, if $n\geq 175$, the wife's rules would allow for more passwords. The extremely large number of passwords that could then be generated with symbols alone would then overwhelm the number of passwords that could be generated by replacing some of the symbols by other characters.
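The peak and the crossover can be located by a direct scan. This is only a sketch: the range of $n$ scanned (1 to 300) is my own arbitrary choice, and `Fraction` is used so the comparison is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction


def mixed_count(n):
    # P(n): at least one character from each class (factored form, length 8).
    return 2271360 * n * (n**4 + 155 * n**3 + 10820 * n**2 + 410440 * n + 8287152)


def single_class_count(n):
    # Q(n): all eight characters from one class only.
    return 2 * 26**8 + 10**8 + n**8


ratios = {n: Fraction(mixed_count(n), single_class_count(n)) for n in range(1, 301)}

peak_n = max(ratios, key=ratios.get)                       # n with the largest P/Q
crossover_n = min(n for n, r in ratios.items() if r < 1)   # first n where Q > P

print(peak_n, round(float(ratios[peak_n])))
print(crossover_n)
```

The scan puts the peak at 25 symbols with a ratio of about 2800, and the first $n$ at which the single-class rule wins at 175.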

  • 0
    [This site](https://msdn.microsoft.com/en-us/library/cc875839.aspx) lists 31 symbols and even some Unicode characters. 2017-01-29
  • 1
    @FabioSomenzi It's unclear why they left out the symbol `_`, which would make 32 symbols. As for Unicode, Microsoft considers it a separate class from "symbols", and apparently neither the OP nor OP's wife considered it at all. Note that rather than a Unicode character one could just require the insertion of another four digits somewhere in the password; the entropy would be greater and fewer keys would need to be pressed than with the Unicode procedure described by that Microsoft document. 2017-01-29
  • 0
    Agreed. Just providing some data showing that 32 is very close to what one encounters in practice. 2017-01-29
  • 0
    @FabioSomenzi Those 32 symbols [in order](https://en.wikipedia.org/wiki/Basic_Latin_(Unicode_block)) are: `!"#$%&'()*+,-./` and `:;<=>?@` and `[\]^_\`` and `{|}~`. And 32 is `15+7+6+4=96-1-10-26-26-1`. 2017-01-30
4

I'll leave the binomials and the factorials, and the jokes on who's right between husbands and wives, to the other responders. They are useful and fun, but let's take another viewpoint.

The rules that restrict the choice of passwords have two aims:

  • Make sure that there is a large enough pool of passwords to choose from.

  • Make sure passwords people choose are reasonably spread out in that pool.

To increase the number of possible passwords, you'd increase the maximum length of the password and you'd allow more symbols in each position.

But this is not what you see in the rules: your password must be at least eight characters long, and must contain at least one character from each of these groups.

The reason is that, without those constraints, too many would choose "fido" or "1984" as their passwords. This would make the task of those who want to break into accounts way too easy, because they could focus their efforts on the (few) commonly used passwords.

So, those "distribution requirements" actually decrease the number of available passwords, but greatly improve the way they are spread out.

Now, you don't see any site requesting that your password be all lowercase or all uppercase or all digits, or all symbols for two reasons: the important one is that almost all users would go for alphabetic passwords. The other is that in fact, there would be fewer passwords to choose from (for long enough passwords).

1

I realize that this is a Mathematics stack and that my answer is primarily non-mathematical. However, I feel that the question’s subtext concerning password strength deserves an answer from the flip-side.

If attackers used purely randomly generated passwords, they would have the same probability of guessing your password, your wife’s password, some person’s ‘12345678’ password, provided they are of the same length and use the same symbol set. Several other answers have already shown that the sample space of 8-character passwords is very large. Attackers would be better off making money or having fun buying Lotto tickets.

What it comes down to in the end is the predictability of a password. Alas, targets are far more predictable than they think. Attackers commonly use dictionary attacks, which essentially apply a set of known password distributions to better their chances. They may further apply tailored dictionary attacks, based on information acquired through social engineering, online presence, profession, hobbies, geographic location, etc. It all depends on the purpose of the attack.

Consider that your wife’s password is one of a million other passwords nabbed from some online retailer. Not only will attackers have a list of encrypted passwords, but also account information, which can be used to apply more specific cracking strategies. In such cases, it is no different than the old bear joke. Bob and Joe are startled by a bear while camping. Joe puts on his sneakers. Bob says, “What are you doing? You can’t outrun a bear!” Joe replies, “I don’t have to, I just have to outrun you!”

Attackers generally have a quota to fill. The brute strength of your wife’s password just needs to be stronger than others till the quota is reached. At that point, passwords are used to commit more serious cybercrimes. No time is spent cracking more passwords because attackers are working against the clock, since their nefarious activities will unveil the security breach eventually, hopefully sooner than later. Stopgap measures thereafter will invalidate any newly discovered passwords. Then again, if illicit activities are stealthier, they may have more time.

Consider that the attack is more focused. Your wife is the sole target or part of a smaller set of targets such as her fellow employees. The purpose is most likely not immediate financial gain, but information warfare. In such cases, attackers will spend more time and be far more persistent in their efforts to discover her password. Your wife has already given such would-be attackers an advantage by letting it be known that she uses a restricted character set. If attackers “acquired” both your encrypted passwords and knew that tidbit of information, they would have a higher probability of cracking your wife’s password than yours, even though both of you drew your passwords from the same sample space.

To avoid giving attackers the favor, your wife should come up with a different “formula” and tell no living soul about it, even you. Furthermore, while the brute strength of her password is still important, it is more important that it is also immune to social reverse engineering. In other words, composed of something that is as orthogonal as possible to the known set of information that attackers could discover about her. The easiest solution is to use a random password generator. A better solution is to use simpler, longer, multi-word passwords, i.e. passphrases, that can be remembered by mnemonic devices; obviously, this is system dependent.

It is difficult to measure the exact strength of a password. The most common measure used is the password's information entropy. Borrowed from Shannon’s work in information theory, the entropy measures the number of information bits provided by a password. A password with an entropy of 50 bits would require $2^{50}$ attempts to exhaust all possibilities in a brute force attack. For example, your 8-character password picked from the standard 94-character symbol set would have an entropy of 18 bits. If the password passes an extensive dictionary test (by the system), it gains an additional 6 bits, for a total of 24. The bonus of a dictionary test declines as the password length increases. A phrase such as spider came to see ms muffet on a tuffet has an entropy of 56 bits.
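The 18-bit and 56-bit figures match a per-character rule of thumb for user-chosen passwords that, as far as I recall, comes from Appendix A of NIST SP 800-63: 4 bits for the first character, 2 bits each for characters 2 to 8, 1.5 bits each for characters 9 to 20, and 1 bit for each character after that. The sketch below encodes that rule as I remember it (treat it as an assumption, not a quotation of the standard) and contrasts it with the entropy of a password picked uniformly at random. The function names are mine.

```python
import math


def user_chosen_bits(length):
    """Rule-of-thumb entropy estimate for a user-chosen password.

    Assumed per-character schedule (recalled from NIST SP 800-63, Appendix A):
    1st char 4 bits, chars 2-8 2 bits, chars 9-20 1.5 bits, later chars 1 bit.
    Bonuses for dictionary or composition checks are not included.
    """
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4
        elif position <= 8:
            bits += 2
        elif position <= 20:
            bits += 1.5
        else:
            bits += 1
    return bits


def uniform_random_bits(alphabet_size, length):
    """Entropy when every character is drawn uniformly and independently."""
    return length * math.log2(alphabet_size)


phrase = "spider came to see ms muffet on a tuffet"
print(user_chosen_bits(8))                 # 8-character user-chosen password
print(user_chosen_bits(len(phrase)))       # the 40-character phrase
print(round(uniform_random_bits(94, 8), 1))  # 8 random characters from 94
```

The gap between the two estimates for 8 characters (18 bits against roughly 52) is the whole point of this answer: people do not choose uniformly at random.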

If you are interested in finding out more about password entropy, the following National Institute of Standards and Technology (NIST) papers provide a wealth of detail:

1

There is a context in which the wife is right, which may explain her intuition. Suppose we split the permitted symbols into two sets, $S$ and $T$, of size $s=|S|$ and $t=|T|$. Suppose further that our password must be exactly two letters long.

The wife's strategy yields $s^2+t^2$ passwords, either two from $S$ or two from $T$. The OP's strategy yields $2st$ passwords, either one from $S$ then one from $T$, or the other way.

The wife's strategy will give at least as many passwords as the OP's, because $(s^2+t^2)-2st=(s-t)^2\ge 0$. Indeed, if $s\neq t$, then the wife's strategy gives strictly more passwords.
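A brute-force count over tiny alphabets makes the inequality concrete. This is just a sketch; the sets `"abc"` and `"12"` are arbitrary examples of mine and are assumed to be disjoint.

```python
from itertools import product


def count_two_letter(S, T):
    """Return (same_set, one_from_each) over all two-letter strings on S + T.

    S and T are strings of distinct characters with no character in common.
    """
    same_set = one_from_each = 0
    for a, b in product(S + T, repeat=2):
        if (a in S) == (b in S):   # both letters from S, or both from T
            same_set += 1
        else:                      # one letter from each set
            one_from_each += 1
    return same_set, one_from_each


same_set, one_from_each = count_two_letter("abc", "12")
print(same_set, one_from_each, same_set - one_from_each)
```

With $s=3$ and $t=2$ this gives $13$ against $12$, a difference of $(s-t)^2 = 1$; with equal-sized sets the two counts tie.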

0

Allowing either only lowercase, only uppercase, only numerals or only symbols, the number of possible passwords equals $$26^8+26^8+10^8+33^8=1\ 824\ 162\ 747\ 393,$$ assuming $33$ symbols. On the other hand, the number of passwords requiring at least one from each group can be counted as the total number of passwords allowing all characters, minus the number of passwords using characters from at most three out of four groups. The total number of passwords is $$(26+26+10+33)^8=6\ 634\ 204\ 312\ 890\ 625,$$ which is spectacularly much larger than the other number (about $3600$ times larger). But we still have to subtract the passwords that don't use all four groups. The numbers of passwords that avoid one of the four groups are \begin{align*} (26+26+10)^8&=\ \ \ 218\ 340\ 105\ 584\ 896&\text{ not using symbols}\\ (26+26+33)^8&=2\ 724\ 905\ 250\ 390\ 625&\text{ not using numbers}\\ (26+10+33)^8&=\ \ \ 513\ 798\ 374\ 428\ 641&\text{ not using uppercase} \end{align*} and the number of passwords not using lowercase is the same. These numbers are smaller than the total number we just computed; subtracting them still leaves $$2\ 663\ 362\ 208\ 057\ 822$$ passwords, well over a thousand times more than the first number. And in fact we've subtracted too much; for example, passwords using only lowercase characters have been subtracted three times (they don't use symbols, don't use numbers and don't use uppercase letters). So the actual number of passwords is even greater.


To compute the exact number you could use the inclusion-exclusion principle. For this we also need the number of passwords not using characters from two out of four groups, and not using characters from three out of four groups. The latter is simply the number of passwords using characters from only one group, which we already computed to be $$26^8+26^8+10^8+33^8=1\ 824\ 162\ 747\ 393.$$ The numbers of passwords not using characters from two out of four groups equal: \begin{align*} (26+26)^8&=\ \ 53\ 459\ 728\ 531\ 456&\text{not using numbers or symbols}\\ (26+10)^8&=\ \ \ \ 2\ 821\ 109\ 907\ 456&\text{not using lower/uppercase or symbols}\\ (26+33)^8&=146\ 830\ 437\ 604\ 321&\text{not using lower/uppercase or numbers}\\ (10+33)^8&=\ \ 11\ 688\ 200\ 277\ 601&\text{not using lowercase or uppercase} \end{align*} Counting the second and third rows twice each (once for lowercase, once for uppercase), these add up to $$364\ 451\ 023\ 832\ 611$$ passwords that we subtracted twice originally. Adding this back and then subtracting the single-group passwords once more, the inclusion-exclusion principle gives a total of $$2\ 663\ 362\ 208\ 057\ 822+364\ 451\ 023\ 832\ 611-1\ 824\ 162\ 747\ 393=3\ 025\ 989\ 069\ 143\ 040,$$ which is substantially more than the other number of passwords.
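Since the arithmetic here is long, a mechanical check is worthwhile. The sketch below runs inclusion-exclusion over every subset of excluded groups and, as a sanity check on the method itself, compares it with a brute-force enumeration over tiny two-character alphabets. The function names and the toy alphabets are my own choices.

```python
from itertools import combinations, product


def count_all_classes(sizes, length):
    """Strings of the given length using at least one character from every class."""
    total = 0
    for r in range(len(sizes) + 1):
        for excluded in combinations(range(len(sizes)), r):
            # Characters still allowed once the excluded classes are removed.
            allowed = sum(sizes) - sum(sizes[i] for i in excluded)
            total += (-1) ** r * allowed ** length
    return total


def brute_force(classes, length):
    """Same count by enumeration; only feasible for very small alphabets."""
    alphabet = "".join(classes)
    return sum(
        1
        for pw in product(alphabet, repeat=length)
        if all(any(ch in cls for ch in pw) for cls in classes)
    )


tiny = ("ab", "AB", "12", "&*")
print(brute_force(tiny, 4), count_all_classes((2, 2, 2, 2), 4))
print(f"{count_all_classes((26, 26, 10, 33), 8):,}")
```

For the tiny alphabets both methods give 384 passwords of length 4, and for the real class sizes (with 33 symbols) the function gives 3,025,989,069,143,040.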


DISCLAIMER: The numbers might very well be off a bit as I'm in a hurry, I'll fix it later today. But the general point holds.