busdma: Annotate bus_dmamap_sync() with fence
Add an explicit thread fence release before returning from
bus_dmamap_sync. This should be a no-op in practice, but makes
explicit that all ordinary stores will be completed before subsequent
reads/writes to ordinary device memory. On x86, normal memory ordering
is strong enough to generally guarantee this. The fence keeps the
optimizer (likely LTO) from reordering other calls around this.
The other architectures already have calls, as appropriate, that
are equivalent.
Note: On x86, there is one exception to this rule. If you've mapped
memory as write combining, then you will need to add a sfence or
similar. Normally, though, busdma doesn't operate on such memory, and
drivers that do already cope appropriately.
Reviewed by: kib@, gallatin@, chuck@, mav@
Differential Revision: https://reviews.freebsd.org/D27448