In the arm64 busdma we have an internal flag to signal when a tag is
for a cache-coherent device. In this case we don't need to adjust the
size and alignment of allocated buffers to be within a cache line.
The cache line adjustment was incorrectly using the coherent flag
passed in to bus_dma_tag_create and not the internal flag. Fix it to
use the latter to reduce the memory usage slightly.