Representing the crack resistance of a password.

Discussion:

Jeffrey Goldberg

2013-04-29 06:05:30 UTC

In a discussion on Twitter, Matt Weir has persuaded me that talking about the Shannon Entropy, H, of a password generation system doesn't get what we want. H doesn't capture appropriate facts about the distribution of passwords within the set of possible passwords.

The far more meaningful notion would be "what is the prob of an attacker cracking a pw under that policy after N guesses". This, as I now understand, is not in general computable from H. (It is computable from H when the distribution is uniform, but the whole point is that humans do not select passwords uniformly from some set.)

I asked how we should characterize, or even name, this notion. I tossed out

C(X, k) = 2 log_2 G(0.5, X, k)

where k is the key or password, X is this distribution, and G(p, X, k) is is the number of Guesses needed to hit k in X with probability p.

My idea is to have C(X,-) = H(X) when X is a uniform distribution.

Matt, however, thinks that that is not the direction to go in, and will probably be able to point out the error of my ways. But I'm presenting it anyway as starting point for discussion.

I've actually been drafting an article about this (well, about the non-randomness of humans and how this matters for considering password strength), and I would love to have some language for talking out this. Some way to talk about how crack resistant a password is.

Cheers,

-j

Solar Designer

2013-04-29 12:26:30 UTC

Permalink

Post by Jeffrey Goldberg
In a discussion on Twitter, Matt Weir has persuaded me that talking about the Shannon Entropy, H, of a password generation system doesn't get what we want. H doesn't capture appropriate facts about the distribution of passwords within the set of possible passwords.

That's correct.

Post by Jeffrey Goldberg
The far more meaningful notion would be "what is the prob of an attacker cracking a pw under that policy after N guesses". This, as I now understand, is not in general computable from H. (It is computable from H when the distribution is uniform, but the whole point is that humans do not select passwords uniformly from some set.)
I asked how we should characterize, or even name, this notion. I tossed out
C(X, k) = 2 log_2 G(0.5, X, k)
where k is the key or password, X is this distribution, and G(p, X, k) is is the number of Guesses needed to hit k in X with probability p.

Regarding your G() above, see:

http://www.lysator.liu.se/~jc/mthesis/4_Entropy.html#SECTION00430000000000000000

for a formal definition of "Guessing entropy" and some discussion.

Alexander

Jeffrey Goldberg

2013-04-29 14:38:14 UTC

Permalink

Post by Solar Designer

Post by Jeffrey Goldberg
I asked how we should characterize, or even name, this notion. I tossed out
C(X, k) = 2 log_2 G(0.5, X, k)

And now that I've had a few hours sleep, I see that even if that is the right concept, I got the math wrong.

Should be

C(X, k) = log_2 G(0.5, X, k) +1 (or log_2(2G(0.5, X, k))

Maybe I'll do even better if I get coffee.

Post by Solar Designer
http://www.lysator.liu.se/~jc/mthesis/4_Entropy.html#SECTION00430000000000000000
for a formal definition of "Guessing entropy" and some discussion.

Thank you! I would have gotten there eventually as I was starting to work through "Testing Metrics for Password Creation Policies
by Attacking Large Sets of Revealed Passwords" (Weir et al.)

http://goo.gl/YxRk

Using Guessing Entropy, then my C (crack resistance) would be C(X) = log_2 (2G(X)). So here if X is a uniform distribution, C(X) == H(X).

But I am looking for something more (and so maybe "entropy" isn't the right analogy). I want to be able to talk about the crack resistance of a particular password given a distribution.

Suppose that a password policy is "at least 8 characters, at least one uppercase letter, at least one digit" but X is the actual distribution of what humans do when told to create a password under that policy. I would want

C(X, 'Password1') < C(X, '2n8PGUoPeb')

So instead of having G be the average number of guesses needed for a random x in X; I'd like to be able to talk about the strength of a particular password (with respect to a particular distribution of passwords).

Cheers,

-j

Matt Weir

2013-04-29 15:40:06 UTC

Permalink

Jeffrey,
That'll teach me to write my e-mails faster, (or at least check my inbox
before I hit send). To address your question on how to figure out:

C(X, 'Password1') < C(X, '2n8PGUoPeb')

Ideally you'd want to develop an "ideal" cracking strategy, and then figure
out where each password falls into that strategy. The two problems of
course are A) What's an ideal cracking strategy, and B) How do
you efficiently decide how long a password takes to crack? Point B is
important since you don't want to wait 3 months while running someone's
password through a GPU cracker to say "Yup, it's strong enough, you can use
it!"

I'd recommend checking out Shiva Houshmand Yazdi's and Dr. Sudhir
Aggarwal's paper on the subject available at:

http://www.acsac.org/2012/openconf/modules/request.php?module=oc_program&action=view.php&a=&id=7&type=4&OPENCONF=75799905b2bb79c800ddf7ade3caad00

This is a slightly different question than the original one though, since
you're looking for a password strength metric, vs the strength of a generic
password creation policy. Admittedly if you enforce a threshold for your
password strength checke,r that can then become a password creation policy.
That's pretty much what Shiva, Sudhir and I are arguing in our papers.

Matt

Post by Jeffrey Goldberg

Post by Solar Designer

Post by Jeffrey Goldberg
I asked how we should characterize, or even name, this notion. I tossed

out

Post by Solar Designer

Post by Jeffrey Goldberg
C(X, k) = 2 log_2 G(0.5, X, k)

And now that I've had a few hours sleep, I see that even if that is the
right concept, I got the math wrong.
Should be
C(X, k) = log_2 G(0.5, X, k) +1 (or log_2(2G(0.5, X, k))
Maybe I'll do even better if I get coffee.
http://www.lysator.liu.se/~jc/mthesis/4_Entropy.html#SECTION00430000000000000000

Post by Solar Designer
for a formal definition of "Guessing entropy" and some discussion.

Thank you! I would have gotten there eventually as I was starting to work
through "Testing Metrics for Password Creation Policies
by Attacking Large Sets of Revealed Passwords" (Weir et al.)
http://goo.gl/YxRk
Using Guessing Entropy, then my C (crack resistance) would be C(X) = log_2
(2G(X)). So here if X is a uniform distribution, C(X) == H(X).
But I am looking for something more (and so maybe "entropy" isn't the
right analogy). I want to be able to talk about the crack resistance of a
particular password given a distribution.
Suppose that a password policy is "at least 8 characters, at least one
uppercase letter, at least one digit" but X is the actual distribution of
what humans do when told to create a password under that policy. I would
want
C(X, 'Password1') < C(X, '2n8PGUoPeb')
So instead of having G be the average number of guesses needed for a
random x in X; I'd like to be able to talk about the strength of a
particular password (with respect to a particular distribution of
passwords).
Cheers,
-j

Jeffrey Goldberg

2013-04-29 16:39:39 UTC

Permalink

Post by Jeffrey Goldberg
C(X, 'Password1') < C(X, '2n8PGUoPeb')
Ideally you'd want to develop an "ideal" cracking strategy, and then figure out where each password falls into that strategy. [...]

Let me make it clear that I am not expecting an operational definition. First because it is really really hard to get a grasp on X. The best we can do is look at leaks and a few experimental results. We just don't know the underlying function that humans use when asked to create a password.

But then, even if we do know (or have a good approximation) for X we have your two questions.

Post by Jeffrey Goldberg
The two problems of course are A) What's an ideal cracking strategy,

I don't know on whether finding an ideal cracking strategy for a given arbitrary X is tractable.

Post by Jeffrey Goldberg
and B) How do you efficiently decide how long a password takes to crack?
Point B is important since you don't want to wait 3 months while running someone's password through a GPU cracker to say "Yup, it's strong enough, you can use it!"

Right. I absolutely agree. And I don't hold out hope for good password strength meters (though I do think that we may be able to improve upon them). I've always wanted to write an article for the AgileBits blog titled "All Password Strength Meters Suck, Include Ours".

Post by Jeffrey Goldberg
I'd recommend checking out Shiva Houshmand Yazdi's and Dr. Sudhir Aggarwal's paper on the subject

Thanks.

Post by Jeffrey Goldberg
This is a slightly different question than the original one though, since you're looking for a password strength metric,

Again, let me clarify that I'm not asking for a metric that will give us an efficient and effective way to calculate it. I'm looking for a metric that just gives us a way to talk about these things. The concept that I'm after describes what actual strength meters are trying to approximate.

Post by Jeffrey Goldberg
vs the strength of a generic password creation policy.

As we discussed in the Twitter discussion, if a password creation policy expects (but doesn't enforce) a uniform (or other desirable) distribution of passwords, it doesn't give us much information about the strength of passwords that humans actually create when following the policy.

That is, if the policy merely defines a set of strings, but assumes that people will pick uniformly from that set, it will suck at telling us how strong any given password is. In this, Matt and I have been in perfect agreement. (I had just been confused about other stuff.)

Again, I would like my (abstract) strength concept to be commensurable to Shannon Entropy so that we can say that when X is a uniform distribution C(X, k) == H(X) for all k in X. We can then talk about how much the average C(X, k) differs from H(X). This really isn't necessary, as it should always be easy to calculate C(X, k) from when X is uniform, so I can always put them in the same terms when I want to.

But if I'm the only one who thinks that such a definition would be useful, that's fine too.

Cheers,

-j

Matt Weir

2013-04-29 15:08:29 UTC

Permalink

Jeffrey,
First of all, thanks for taking this conversation here since it'll be
good to get everyone else's input as well! My views on what make a useful
metric for the resistance of a password creation policy to attack have been
evolving over the years. I used to think that "Guessing Entropy" as brought
up by Alexander was the way to go, but I'm not so sure anymore. My main
issue with it, and with the metric you proposed, is I suspect the resulting
value won't provide actionable info to the defender, (Note: that's also my
objection to using Shannon Entropy).

Let me back up. From the way I understand your metric, it attempts to
measure the expected number of guesses it takes to crack 50% of user
passwords, (or to put it another way, to achieve a 50% success rate
cracking one password). The usual definition of "Guessing Entropy" is the
the average number of guesses required to crack a password, (which is a
slightly different metric if the distribution is not even). I'd argue that
either one of those metrics isn't what most defenders really want to know.

To put it another way, let's say your method said that for a given password
policy P, C(X,k) was equal to 32. Aka it takes around 2 billion guesses to
crack 50% of the passwords. One question many defenders want to know is how
does this password policy fare during an online attack when the attacker
can only make around 1000 guesses? While it does give an upper bound, (aka
an attacker will have less than a 50% success rate), that's not all that
useful since a 50% success rate is higher than the amount of risk most
defenders are willing to accept. Also the distribution is not even for
human generated passwords so the defender can't simply estimate the rate of
success as if the password policy created passwords equivalent to a random
keys 32 bits long.

I suspect a better metric is to model the expected success rate of an
attacker given K guesses. I apologize for mangling the equation since I'm
sending it as an e-mail, but you'd want something along the lines of:

X = (k=1 to k=Max Allowed Guesses) Sum( Pr(X=k) )

Where Pr(X=k) is the probability of an attacker guessing a password on
guess number k.

This would give you a metric that would allow you to say, "If an attacker
is allowed to make 1k guesses, their chances of success is X for password
creation policy P."

The advantage of this is that it also allows you to estimate risk for both
online and offline attacks where the number of allowed guesses can be
drastically different. You can then change variables until the resulting
value represents the amount of risk you are willing to accept. Aka let's
say we calculate X and the chance of an attacker succeeding is 10%, but
we're only willing to accept 1% risk. We then have a choice to either make
our creation policy stronger or limit the number of guesses an attacker can
make.

This is kind of what NIST tried to do in SP800-63, but they got into issues
trying to fit Shannon Entropy into that model.

The hard part of course is estimating Pr(X=k).... One way to go about doing
that is to model a password cracking session against real passwords. The
problem is, it's hard to estimate the effectiveness of password policies
that we haven't seen. For example, there hasn't been a major disclosure of
passwords where they had to be 16+ characters long.

At the end of the day though, this is always going to be a problem with any
metric we choose. We need to go into this willing to revise our findings as
new attack techniques are developed and we learn more about how users
create passwords.

Matt

Post by Jeffrey Goldberg

Post by Jeffrey Goldberg
In a discussion on Twitter, Matt Weir has persuaded me that talking

about the Shannon Entropy, H, of a password generation system doesn't get
what we want. H doesn't capture appropriate facts about the distribution of
passwords within the set of possible passwords.
That's correct.

Post by Jeffrey Goldberg
The far more meaningful notion would be "what is the prob of an attacker

cracking a pw under that policy after N guesses". This, as I now
understand, is not in general computable from H. (It is computable from H
when the distribution is uniform, but the whole point is that humans do not
select passwords uniformly from some set.)

Post by Jeffrey Goldberg
I asked how we should characterize, or even name, this notion. I tossed

out

Post by Jeffrey Goldberg
C(X, k) = 2 log_2 G(0.5, X, k)
where k is the key or password, X is this distribution, and G(p, X, k)

is is the number of Guesses needed to hit k in X with probability p.
http://www.lysator.liu.se/~jc/mthesis/4_Entropy.html#SECTION00430000000000000000
for a formal definition of "Guessing entropy" and some discussion.
Alexander