Page MenuHomeFreeBSD

lib/libc/gen: use Lemire's algorithm for arc4random_uniform().
Needs ReviewPublic

Authored by fuz on Mon, Nov 18, 12:53 PM.

Details

Reviewers
cem
delphij
Group Reviewers
security
Summary

Daniel Lemire has published a more efficient range reduction algorithm
for finding a random number in a given range without bias:
https://arxiv.org/pdf/1805.10941

This algorithm is already in use by the Go standard library:
https://cs.opensource.google/go/go/+/refs/tags/go1.23.3:src/math/rand/rand.go;l=161

Reimplement arc4random_uniform using this method.
A microbenchmark shows that performance is improved by around 22% on my Haswell box:

os: FreeBSD
arch: amd64
cpu: Intel(R) Core(TM) i7-4910MQ CPU @ 2.90GHz
                   │ benchmark.out │
                   │    sec/op     │
Arc4random_uniform     56.53n ± 0%
Fast_uniform           44.00n ± 0%
geomean                49.87n

A new unit test is added to validate that the range reduction works correctly.
We cannot currently validate that there is no bias however.

The paper is referenced in arc4random(3).

Test Plan

passes newly added unit test.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 60663
Build 57547: arc lint + arc unit

Event Timeline

fuz requested review of this revision.Mon, Nov 18, 12:53 PM

Some context on the Lemire idea if others haven't seen it:

https://lemire.me/blog/2019/06/06/nearly-divisionless-random-integer-generation-on-various-systems/

https://shufflesharding.com/posts/dissecting-lemire

https://sts10.github.io/2020/10/10/lemire-neaarly-divisionless-random.html

For bias validation -- you could do some kind of primitive statistical test (histogram). It can't be perfectly reliable but could give a little confidence it isn't totally broken.

lib/libc/gen/arc4random_uniform.c
12–13

I think the original alphabetical include order was correct.

22–23

I have to wonder how bad the 64-bit multiplication is on 32-bit architectures.

22–25

I think we could at least keep this comment describing the purpose of the function.

fuz planned changes to this revision.Mon, Nov 18, 3:00 PM

I'll look into adding a statistical test.

lib/libc/gen/arc4random_uniform.c
12–13

Note that this is a complete rewrite—I didn't look at the old code at all as to avoid accidentally copying from it. Neverthless, I agree and will correct the include order.

22–23

We currently support two 32 bit architectures. i386 has a 32×32→64 bit multiplication with the mull instruction, which performs well. Likewise, armv7 has the same with the umull instruction. Running the same benchmark, I get a 15% speedup on i386 and a 70% speedup on armv7 (division is really expensive there as we cannot guarantee the presence of the udiv instruction).

22–25

As I did not start with the original code, there was nothing to keep. I believe the purpose of the function is well described in the man page already.

  • lib/libc: arc4random_uniform include order fix

I was unfortunately not able to construct a statistical test
that is able to distinguish an obviously biased implementation like

uint32_t arc4random_uniform(uint32_t thresh) { return arc4random() % thresh; }

from a proper implementation. Maybe someone with more knowledge of
statistics could help here.