Page MenuHomeFreeBSD

arm64: busdma bounce, handle nsegment==1 case better for load_phys()
Needs ReviewPublic

Authored by bz on Sep 3 2021, 12:03 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 18, 9:31 AM
Unknown Object (File)
Thu, Jan 9, 7:00 PM
Unknown Object (File)
Oct 3 2024, 1:37 AM
Unknown Object (File)
Sep 5 2024, 3:14 PM
Unknown Object (File)
Sep 4 2024, 5:55 PM
Unknown Object (File)
Aug 25 2024, 11:33 PM
Unknown Object (File)
Aug 23 2024, 11:38 AM
Unknown Object (File)
Aug 19 2024, 9:15 AM
Subscribers

Details

Summary

While trying to do single segment operations (out of LinuxKPI) I found that
we keep failing as we are not returning contiguous bounce pages.
That seemed silly given we did have enough of them available matching the
criteria (size and 1 seg).

This is a quick hacked-up prototype doing two things:

  • it keeps the bounce_page_list sorted by busaddr upon pages being freed; this increases the likelyhood of (quickly) finding enough contiguous pages. (probably should ensure order with alloc as well?)
  • It adds a special add_bounce_pages() function which will go out of its way and try to find enough contiguous segments in the bounce_page_list (they have to be in order there) to fullfill the special requirement of pagesneeded. It will return the first and queue up the remaining pages in order in the map. The add_bounce_page() called for further iterations will then take the bpages from there rather than the bounce_page_list allowing us to fullfill the entire request in bounce_bus_dmamap_load_phys() with the single segment. We only call into this logic there if the segments are limited to 1 and we need more than 1 page.

As said, this is a prototype, probably also applying elsewhere, and is
hackish. The alogrithms could be improved, etc.
A quick looked showed that this might equally apply to other architectures
but we may not need it there/see the problem there due to less bouncing.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 41870
Build 38758: arc lint + arc unit

Event Timeline

bz requested review of this revision.Sep 3 2021, 12:03 PM

Split out the sorted inssert into a function and also call from alloc_bounce_pages().

I think adding more irregularity to an already ill-defined code behavior is not a good idea. Moreover, I think this is an obvious driver error - requesting a single segment for a buffer that can bounce is clear nonsense in which case the driver should allocate the buffer using bus_dmamem_alloc() or possibly pass the buffer aligned. Copying multiple pages back and forth doesn't sound like an optimal solution.

In D31823#727699, @mmel wrote:

I think adding more irregularity to an already ill-defined code behavior is not a good idea. Moreover, I think this is an obvious driver error - requesting a single segment for a buffer that can bounce is clear nonsense in which case the driver should allocate the buffer using bus_dmamem_alloc() or possibly pass the buffer aligned. Copying multiple pages back and forth doesn't sound like an optimal solution.

Agreed but this comes out of LinuxKPI where you do not have all the busdma "tools" at hand. However while debugging this I found that bounce_page_list was quite fragmented even early on after boot and that probably does not yield good results either as every extra segment you need is overhead and that's probably due to the fact that multiple consumers will grab and return pages interleaving.

In D31823#727715, @bz wrote:
In D31823#727699, @mmel wrote:

I think adding more irregularity to an already ill-defined code behavior is not a good idea. Moreover, I think this is an obvious driver error - requesting a single segment for a buffer that can bounce is clear nonsense in which case the driver should allocate the buffer using bus_dmamem_alloc() or possibly pass the buffer aligned. Copying multiple pages back and forth doesn't sound like an optimal solution.

Agreed but this comes out of LinuxKPI where you do not have all the busdma "tools" at hand. However while debugging this I found that bounce_page_list was quite fragmented even early on after boot and that probably does not yield good results either as every extra segment you need is overhead and that's probably due to the fact that multiple consumers will grab and return pages interleaving.

I do think this is rather hacky. I think the current code tries to return bounce pages in a LIFO order on the assumption that the most recently used bounce page might be in cache still. I've no idea if that assumption is really sensible.

Just to leave this: the general problem remains real: compat.linuxkpi.lkpi_pci_nseg1_fail: 10