Discussion:
scrypt time-memory tradeoff
Solar Designer
2011-07-01 00:39:00 UTC
Permalink
Colin, all -

This is probably nothing new to you, but here's some analysis Anthony
Ferrara (ircmaxell) posted regarding an attacker making scrypt run in a
lot less memory, by trading CPU/GPU time for that:

http://drupal.org/node/1201444#comment-4675994

For some settings and some types of attacker's equipment (not ASICs, but
GPUs with their constraints) the trade-off might be worthwhile, although
I think that it is not for the settings given by ircmaxell and for
present GPUs, and it would not provide impressive performance anyway
(potentially just slightly better than the straightforward approach).

Alexander
Colin Percival
2011-07-01 00:48:01 UTC
Permalink
Post by Solar Designer
Colin, all -
This is probably nothing new to you, but here's some analysis Anthony
Ferrara (ircmaxell) posted regarding an attacker making scrypt run in a
Yes... I think I explicitly mentioned that in my paper, in fact. ;-)

The design of scrypt puts a lower bound on the area-time product -- you can
use less memory and more CPU time, but the ratios stay within a constant
factor of each other, so for the worst-case attacker (ASICs) the cost per
password attempted stays the same.

I'm actually planning on adding this support to my scrypt code, in order to
avoid the "you don't have enough RAM to compute this hash" error; but I
haven't had the time yet.
--
Colin Percival
Security Officer, FreeBSD | freebsd.org | The power to serve
Founder / author, Tarsnap | tarsnap.com | Online backups for the truly paranoid
Solar Designer
2012-11-17 01:20:54 UTC
Permalink
Post by Colin Percival
The design of scrypt puts a lower bound on the area-time product -- you can
use less memory and more CPU time, but the ratios stay within a constant
factor of each other, so for the worst-case attacker (ASICs) the cost per
password attempted stays the same.
This doesn't appear to be exactly the case.

SMix consists of two loops, which are roughly of equal cost. Suppose we
want to halve our memory needs (area). The first one of two loops will
simply store every other computed value. The second loop will have a
50% probability that a V_j is present in the half-size table. When V_j
is not present, the extra cost is recomputing V_j from j-1's.

With full V table, we invoke BlockMix N*2 times. With a half-size V
table, we invoke BlockMix on average 5 times per 4 loop iterations:
2 times per 2 iterations for the first loop, and 3 times per 2
iterations for the second loop. We invoke it a total of around N*2.5
times.

Thus, we have halved our memory needs (circuit area) and paid for this
by only a 25% increase in processing time.

Now, if we want to reduce our memory needs a lot more, the situation is
a lot better (for the defender). Yet for relatively low reduction in
memory needs - like 2x or 4x - the trade-off favors the attacker. So an
attacker with an ASIC will want to use this, and will reduce the cost.

Am I missing something? If not, does this possibly affect the dollar
costs presented in the scrypt paper? Should the dollar cost estimates
for scrypt be multiplied by 0.5*1.25 = 0.625 (that is, reduced to 62.5%
of what the paper says)?

Alexander
Solar Designer
2012-11-17 10:53:41 UTC
Permalink
On Sat, Nov 17, 2012 at 05:20:54AM +0400, Solar Designer wrote:
[...]
Post by Solar Designer
Thus, we have halved our memory needs (circuit area) and paid for this
by only a 25% increase in processing time.
Now, if we want to reduce our memory needs a lot more, the situation is
a lot better (for the defender).
Actually, unless I am mistaken, the area-time product is asymptotically
(by making extreme use of this trade-off) only 25% of what's expected in
the scrypt paper.

Without trade-off, it is 2*N time * N area = 2 * N^2.
With extreme trade-off, it is N+N*N/2 time * 1 area = ~ 0.5 * N^2

For e.g. a 100x reduction in memory, it is:

(100+100*101/2)/200/100 = 0.2575
Post by Solar Designer
Yet for relatively low reduction in
memory needs - like 2x or 4x - the trade-off favors the attacker. So an
attacker with an ASIC will want to use this, and will reduce the cost.
It appears that the trade-off always favors the ASIC-empowered attacker,
reducing the cost to 0.625 at 2x and further down to 0.25 at much higher
ratios.

Apparently, the only reason why GPU Litecoin miners are documented to be
optimal at 2x is that GPUs are not ASICs: they readily have some
resources available for use (including memory and bandwidth), which
would otherwise be partially idle.
Post by Solar Designer
Am I missing something? If not, does this possibly affect the dollar
costs presented in the scrypt paper? Should the dollar cost estimates
for scrypt be multiplied by 0.5*1.25 = 0.625 (that is, reduced to 62.5%
of what the paper says)?
Now I think the costs need to be divided by 4.

Alexander
Colin Percival
2012-11-18 11:23:31 UTC
Permalink
Post by Solar Designer
Post by Solar Designer
Thus, we have halved our memory needs (circuit area) and paid for this
by only a 25% increase in processing time.
Now, if we want to reduce our memory needs a lot more, the situation is
a lot better (for the defender).
Actually, unless I am mistaken, the area-time product is asymptotically
(by making extreme use of this trade-off) only 25% of what's expected in
the scrypt paper.
You get a 2x cost reduction by trading increased time for reduced area (as
in a previous email) and another 2x reduction by ignoring the initial setup
(practically speaking) but I never intended to include the setup in my
area-time bound.
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Solar Designer
2012-11-18 11:38:40 UTC
Permalink
Post by Colin Percival
You get a 2x cost reduction by trading increased time for reduced area (as
in a previous email) and another 2x reduction by ignoring the initial setup
(practically speaking)
Yes, that's what I was thinking. The initial setup time becomes
negligible compared to that of the second phase - or it can even be
fully removed, but that's not optimal in practice because the Salsa20
core has some area cost too.
Post by Colin Percival
but I never intended to include the setup in my
area-time bound.
Oh, that's nice.

In other words, you could have killed this trade-off (by slightly
different design) and then claim 4x or 2x higher costs (depending on
whether the trade-off was accounted for in the costs or not).

Alexander
Colin Percival
2012-11-18 11:48:08 UTC
Permalink
Post by Solar Designer
Post by Colin Percival
You get a 2x cost reduction by trading increased time for reduced area (as
in a previous email) and another 2x reduction by ignoring the initial setup
(practically speaking)
Yes, that's what I was thinking. The initial setup time becomes
negligible compared to that of the second phase - or it can even be
fully removed, but that's not optimal in practice because the Salsa20
core has some area cost too.
There's other ways the setup time can be effectively removed too -- use
one CPU and O(sqrt(N)) RAM to compute and store H^(sqrt(N) i)(B) for
i = 0 .. sqrt(N), then send those values over to a larger die which has
sqrt(N) CPUs and O(N) RAM which fills in the full table and then does
the phase-2 computation.
Post by Solar Designer
Post by Colin Percival
but I never intended to include the setup in my
area-time bound.
Oh, that's nice.
In other words, you could have killed this trade-off (by slightly
different design) and then claim 4x or 2x higher costs (depending on
whether the trade-off was accounted for in the costs or not).
Yes, I think so. I liked the ROMix construction because I was able to
write a formal proof about its properties; and when I turned it into the
scrypt KDF I wanted to minimize the divergence from the proven-secure
construction.
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Colin Percival
2012-11-18 11:07:54 UTC
Permalink
Post by Solar Designer
Post by Colin Percival
The design of scrypt puts a lower bound on the area-time product -- you can
use less memory and more CPU time, but the ratios stay within a constant
factor of each other, so for the worst-case attacker (ASICs) the cost per
password attempted stays the same.
This doesn't appear to be exactly the case.
Note the words "constant factor". ;-)
Post by Solar Designer
SMix consists of two loops, which are roughly of equal cost. Suppose we
want to halve our memory needs (area). The first one of two loops will
simply store every other computed value. The second loop will have a
50% probability that a V_j is present in the half-size table. When V_j
is not present, the extra cost is recomputing V_j from j-1's.
With full V table, we invoke BlockMix N*2 times. With a half-size V
2 times per 2 iterations for the first loop, and 3 times per 2
iterations for the second loop. We invoke it a total of around N*2.5
times.
Thus, we have halved our memory needs (circuit area) and paid for this
by only a 25% increase in processing time.
Now, if we want to reduce our memory needs a lot more, the situation is
a lot better (for the defender). Yet for relatively low reduction in
memory needs - like 2x or 4x - the trade-off favors the attacker. So an
attacker with an ASIC will want to use this, and will reduce the cost.
Am I missing something? If not, does this possibly affect the dollar
costs presented in the scrypt paper? Should the dollar cost estimates
for scrypt be multiplied by 0.5*1.25 = 0.625 (that is, reduced to 62.5%
of what the paper says)?
This is correct, and gives you asymptotically a 2x reduction in area-time
cost during the second phase. Which falls within the definition of "constant
factor", and was taken into account in the cost estimates in the paper.
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Solar Designer
2012-11-18 11:19:57 UTC
Permalink
Colin,

Thanks for replying! I understand that you're very busy these days.
Post by Colin Percival
Post by Solar Designer
Post by Colin Percival
The design of scrypt puts a lower bound on the area-time product -- you can
use less memory and more CPU time, but the ratios stay within a constant
factor of each other, so for the worst-case attacker (ASICs) the cost per
password attempted stays the same.
This doesn't appear to be exactly the case.
Note the words "constant factor". ;-)
Fair enough.
Post by Colin Percival
This is correct, and gives you asymptotically a 2x reduction in area-time
cost during the second phase.
Yes, but that's 4x for scrypt overall.
Post by Colin Percival
Which falls within the definition of "constant
factor", and was taken into account in the cost estimates in the paper.
Was it? That's good news. IIRC, when I tried repeating your cost
calculations ~2 years ago, I managed to arrive at numbers in your paper
without taking this trade-off into account. So it must be one of: I
made an error back then, I do not recall correctly, or you did not
actually take this into account. Should we verify those numbers now?

Alexander
Colin Percival
2012-11-18 11:25:10 UTC
Permalink
Post by Solar Designer
Post by Colin Percival
This is correct, and gives you asymptotically a 2x reduction in area-time
cost during the second phase.
Yes, but that's 4x for scrypt overall.
Right, by ignoring the setup phase.
Post by Solar Designer
Post by Colin Percival
Which falls within the definition of "constant
factor", and was taken into account in the cost estimates in the paper.
Was it? That's good news. IIRC, when I tried repeating your cost
calculations ~2 years ago, I managed to arrive at numbers in your paper
without taking this trade-off into account. So it must be one of: I
made an error back then, I do not recall correctly, or you did not
actually take this into account. Should we verify those numbers now?
I was certainly aware of these factors, and intended to take them into account
in my estimates. But I'm not so brave as to claim that I definitely didn't
lose a factor of 2 when I was doing my calculations. :-)
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Loading...