Users of nvme_completion_poll are all in the initialization path. Most
of the commands they queue and wait for finish quickly as they involve
no I/O to the drive's media. These commands finish much faster than a
single tick, on the order of a few microseconds. Adaptively polling for
the first tick allows us to return much earlier than we would
otherwise. The cumulative effect of not waiting until the next tick to
re-poll the condition is impressive (~80 of 100 ms saved).
Use the same technique when waiting for RDY state transitions. Those
also complete quickly, and we have to wait for a couple of them. This
saves the rest (~20 ms).
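For illustration only, the adaptive polling idea can be sketched in
userland C as below. This is not the driver code: adaptive_poll,
fake_dev, fake_ready and fake_sleep are hypothetical names, the sleep
is simulated, and doubling the delay up to a one-tick cap is an assumed
backoff policy chosen to keep the example simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks: check the condition, and sleep for ns nanoseconds. */
typedef bool (*cond_fn)(void *arg);
typedef void (*sleep_fn)(void *arg, uint64_t ns);

/*
 * Poll until done(arg) is true. Start with a very short delay and
 * double it after each miss, capped at max_ns (think: one tick).
 * Returns 0 on success, -1 once the requested delays reach timeout_ns.
 * *waited_ns (if non-NULL) receives the sum of the requested delays.
 */
static int
adaptive_poll(cond_fn done, sleep_fn do_sleep, void *arg,
    uint64_t first_ns, uint64_t max_ns, uint64_t timeout_ns,
    uint64_t *waited_ns)
{
	uint64_t delay, waited = 0;
	int rv = 0;

	assert(done != NULL && do_sleep != NULL);
	delay = first_ns == 0 ? 1 : first_ns;	/* a zero delay would never grow */
	if (delay > max_ns)
		delay = max_ns;

	while (!done(arg)) {
		if (waited >= timeout_ns) {
			rv = -1;
			break;
		}
		do_sleep(arg, delay);
		waited += delay;
		delay *= 2;			/* assumed backoff factor */
		if (delay > max_ns)
			delay = max_ns;
	}
	if (waited_ns != NULL)
		*waited_ns = waited;
	return (rv);
}

/* Simulated device: becomes ready once simulated time reaches ready_at_ns. */
struct fake_dev {
	uint64_t now_ns;
	uint64_t ready_at_ns;
	unsigned polls;
};

static bool
fake_ready(void *arg)
{
	struct fake_dev *d = arg;

	d->polls++;
	return (d->now_ns >= d->ready_at_ns);
}

/* Simulated sleep: advance the fake clock instead of blocking. */
static void
fake_sleep(void *arg, uint64_t ns)
{
	struct fake_dev *d = arg;

	d->now_ns += ns;
}
```

With a 1 us first delay and a simulated device that becomes ready after
5 us, this sketch polls four times and requests 7 us of delay in total,
where polling once per 1 ms tick would wait the whole tick.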
This eliminates almost 100ms of delay on boot in cperciva's EC2 test
harness and makes nvme disappear from the flame graph of boot times.
Tested by: cperciva
Sponsored by: Netflix