Page MenuHomeFreeBSD

busdma: fix page miscount for small segment sizes
ClosedPublic

Authored by mhorne on Jan 31 2022, 9:23 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 10, 5:38 AM
Unknown Object (File)
Fri, Jan 10, 4:59 AM
Unknown Object (File)
Mon, Jan 6, 1:45 PM
Unknown Object (File)
Fri, Jan 3, 1:31 AM
Unknown Object (File)
Dec 21 2024, 3:29 PM
Unknown Object (File)
Nov 19 2024, 10:25 AM
Unknown Object (File)
Nov 9 2024, 3:40 PM
Unknown Object (File)
Nov 1 2024, 3:47 PM

Details

Summary

There is a mismatch between how required bounce pages are counted in
_bus_dmamap_count_pages() and bounce_bus_dmamap_load_buffer().

This problem has been observed on the RISC-V VisionFive v2 SoC which has
memory physically addressed above 4GB. This requires some bouncing for
the dwmmc driver. This driver has a maximum segment size of 2048 bytes.
When attempting to load a page-aligned 4-page buffer that requires
bouncing, we can end up counting 4 bounce pages for an 8-segment
transfer. These pages will be incorrectly configured to cover only the
first half of the transfer (4 x 2048 bytes). With this change, 8 bounce
pages are allocated and set up.

Note that _bus_dmamap_count_phys() does not appear to have this problem,
as it clamps the segment size to dmat->common.maxsegsz.

Transactions must meet the following conditions in order for the
miscalculation to manifest:

  1. Maximum segment size smaller than 1 page
  2. Transfer size exceeding 1 segment
  3. Buffer requires bouncing
  4. Driver uses _bus_dmamap_load_buffer(), not _bus_dmamap_load_phys() or other variations

It seems unusual but not inconceivable that this exact combination has
not been encountered or has gone unnoticed on other architectures, which
also lack this check for max segment size. For example, the rockpro64
uses the dwmmc driver, but fails to meet 3, as its memory is physically
addressed below 4GB. Some other mmc drivers appear to fail 1, etc.

Test Plan

Before this change, booting my beagle-v from the SD card would hang as soon as the first bounced request was issued. With this change no issues with the mmc driver are observed.

I was able to observed the same miscounting on a rockpro64 (also dwmmc) with an artificial bounce window.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 55989
Build 52878: arc lint + arc unit

Event Timeline

My main question here is: am I correct that we want to allocate 8 bounce pages in the scenario described, or is it expected that we try to fit the transfer into the bounce pages optimally (i.e. two segments per page). I did not see any code that would realize the latter.

If the change is correct, should I extend it to other archs? As far as I can tell they are also affected.

sys/riscv/riscv/busdma_bounce.c
707

Here is the source of the counting mismatch.

I would say that a max segment size < PAGE_SIZE is generally under-tested and probably broken in several ways. I would be tempted to still copy PAGE_SIZE chunks into bounce pages when needed and then carve the PAGE_SIZE segments up into multiple segments when writing out the segments if possible.

Rebase and extend the patch to other architectures.

mhorne retitled this revision from riscv: fix bounce page miscalculation to busdma: fix page miscount for small segment sizes.Feb 13 2024, 6:02 PM
mhorne edited the summary of this revision. (Show Details)
mhorne edited the test plan for this revision. (Show Details)
In D34118#771586, @jhb wrote:

I would say that a max segment size < PAGE_SIZE is generally under-tested and probably broken in several ways. I would be tempted to still copy PAGE_SIZE chunks into bounce pages when needed and then carve the PAGE_SIZE segments up into multiple segments when writing out the segments if possible.

At the time of this comment I did experiment with batching segments optimally among bounce pages, but I never got it working reliably.

For now there is still a need for this fix, so I intend to move ahead with the minimal changes required.

This is ok, but I do think it's not ideal, per se. I think to make bounce pages more packed you would not apply this diff but instead do something like:

diff --git a/sys/riscv/riscv/busdma_bounce.c b/sys/riscv/riscv/busdma_bounce.c
index c9fdb0e38e40..dfe90d819eb5 100644
--- a/sys/riscv/riscv/busdma_bounce.c
+++ b/sys/riscv/riscv/busdma_bounce.c
@@ -587,6 +587,28 @@ _bus_dmamap_addseg(bus_dma_tag_t dmat, bus_dmamap_t map, bus_addr_t curaddr,
 	return (sgsize);
 }
 
+/*
+ * Add one or more entries to a segment list describing a contiguous
+ * physical range.
+ */
+static bool
+_bus_dmamap_addsegs(bus_dma_tag_t dmat, bus_dmamap_t map, bus_addr_t curaddr,
+    bus_size_t sgsize, bus_dma_segment_t *segs, int *segp)
+{
+	bus_size_t done, todo;
+
+	while (sgsize > 0) {
+		todo = MIN(sgsize, dmat->common.maxsegsz);
+		done = _bus_dmamap_addseg(dmat, map, curaddr, todo, segs,
+		    segp);
+		if (done == 0)
+			return (false);
+		curaddr += done;
+		sgsize -= done;
+	}
+	return (true);
+}
+
 /*
  * Utility function to load a physical buffer.  segp contains
  * the starting segment on entrace, and the ending segment on exit.
@@ -618,7 +640,7 @@ bounce_bus_dmamap_load_phys(bus_dma_tag_t dmat, bus_dmamap_t map,
 
 	while (buflen > 0) {
 		curaddr = buf;
-		sgsize = MIN(buflen, dmat->common.maxsegsz);
+		sgsize = buflen;
 		if (((dmat->bounce_flags & BF_COULD_BOUNCE) != 0) &&
 		    map->pagesneeded != 0 &&
 		    addr_needs_bounce(dmat, curaddr)) {
@@ -643,9 +665,8 @@ bounce_bus_dmamap_load_phys(bus_dma_tag_t dmat, bus_dmamap_t map,
 			} else
 				sl->datacount += sgsize;
 		}
-		sgsize = _bus_dmamap_addseg(dmat, map, curaddr, sgsize, segs,
-		    segp);
-		if (sgsize == 0)
+		if (!_bus_dmamap_addsegs(dmat, map, curaddr, sgsize, segs,
+		    segp))
 			break;
 		buf += sgsize;
 		buflen -= sgsize;
@@ -667,7 +688,7 @@ bounce_bus_dmamap_load_buffer(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf,
     int *segp)
 {
 	struct sync_list *sl;
-	bus_size_t sgsize, max_sgsize;
+	bus_size_t sgsize;
 	bus_addr_t curaddr, sl_pend;
 	vm_offset_t kvaddr, vaddr, sl_vend;
 	int error;
@@ -704,17 +725,14 @@ bounce_bus_dmamap_load_buffer(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf,
 		/*
 		 * Compute the segment size, and adjust counts.
 		 */
-		max_sgsize = MIN(buflen, dmat->common.maxsegsz);
 		sgsize = PAGE_SIZE - (curaddr & PAGE_MASK);
 		if (((dmat->bounce_flags & BF_COULD_BOUNCE) != 0) &&
 		    map->pagesneeded != 0 &&
 		    addr_needs_bounce(dmat, curaddr)) {
 			sgsize = roundup2(sgsize, dmat->common.alignment);
-			sgsize = MIN(sgsize, max_sgsize);
 			curaddr = add_bounce_page(dmat, map, kvaddr, curaddr,
 			    sgsize);
 		} else if ((dmat->bounce_flags & BF_COHERENT) == 0) {
-			sgsize = MIN(sgsize, max_sgsize);
 			if (map->sync_count > 0) {
 				sl_pend = sl->paddr + sl->datacount;
 				sl_vend = sl->vaddr + sl->datacount;
@@ -740,12 +758,9 @@ bounce_bus_dmamap_load_buffer(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf,
 				sl->datacount = sgsize;
 			} else
 				sl->datacount += sgsize;
-		} else {
-			sgsize = MIN(sgsize, max_sgsize);
 		}
-		sgsize = _bus_dmamap_addseg(dmat, map, curaddr, sgsize, segs,
-		    segp);
-		if (sgsize == 0)
+		if (!_bus_dmamap_addsegs(dmat, map, curaddr, sgsize, segs,
+		    segp))
 			break;
 		vaddr += sgsize;
 		buflen -= sgsize;
In D34118#1000839, @jhb wrote:

This is ok, but I do think it's not ideal, per se. I think to make bounce pages more packed you would not apply this diff but instead do something like:

...

Respectfully, today I am not aiming for ideal, I am aiming for the smallest code change which will address this bug. It was an independent headache for someone else working on bring-up for this hardware (PR 273694).

What you suggest is elegant, and I will thankfully take it as a follow-up item, but I'm not set up to test it thoroughly this week. There is opportunity to do the same in the _bus_dmamap_load_phys() code path as well, and probably a chance for some de-duplication into subr_bus_dma.c. I do not currently have the time to do this all the "right" way. So, I intend to move forward with what I have here, which I can vouch for having tested and evaluated in the past and present.

I will thankfully take it as a follow-up item

Thanks @mhorne. Having this as a workaround in the interim makes sense to me. My only suggestion is that you add a note to that effect (that this is an interim fix for the issue) in the commit message.

Sorry, to be clear I intended "I think this is ok for now" to mean that the current patch is ok. My suggestion was more of an idea if it was easy to test quickly. If you are able to test my suggestion that's great, but I certainly appreciate the limited cycles we all have.

This revision is now accepted and ready to land.Feb 14 2024, 9:52 PM