Most users won't notice it, but zeroing the bss takes a noticeable
amount of time when instruction tracing is enabled in the Arm FVP
models (simulators).
Reduce this time by using a store-pair instruction to double the
amount of memory we zero on each iteration of the loop.
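
As a rough illustration only (this is a C model, not the assembly in
this change), the sketch below compares a loop that stores one 64-bit
zero per iteration with one that stores two, which is what a
store-pair (stp) of two zero registers does. The function names are
hypothetical, and the paired version assumes the region length is a
multiple of 16 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the old loop: one 8-byte store per iteration. */
size_t
zero_single(uint64_t *start, uint64_t *end)
{
	size_t iters = 0;

	while (start < end) {
		*start++ = 0;
		iters++;
	}
	return (iters);
}

/*
 * Model of the new loop: two 8-byte stores per iteration, as a single
 * store-pair would do. Assumes (end - start) is an even number of
 * 64-bit words, i.e. the region is a multiple of 16 bytes.
 */
size_t
zero_pair(uint64_t *start, uint64_t *end)
{
	size_t iters = 0;

	while (start < end) {
		start[0] = 0;
		start[1] = 0;
		start += 2;
		iters++;
	}
	return (iters);
}
```

Both functions return the number of loop iterations, so for the same
region the paired version should report half as many as the single
version, which is where the saving under instruction tracing comes
from.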
Sponsored by: Arm Ltd