
cxgbei: Support unmapped I/O requests.
Closed, Public

Authored by jhb on Feb 25 2022, 11:27 PM.

Details

Summary
  • Add icl_pdu_append_bio and icl_pdu_get_bio methods.
  • Add new page pod routines for allocating and writing page pods for unmapped bio requests. Use these new routines to set up DDP for iSCSI tasks whose SCSI I/O CCB uses CAM_DATA_BIO.
  • When ICL_NOCOPY is used to append data from an unmapped I/O request to a PDU, construct unmapped mbufs from the relevant pages backing the struct bio (see the sketch below). This also requires changes in the t4_push_pdus path to support unmapped mbufs.
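As a rough illustration of the last item only, and not the actual code in this change, a helper along the following lines could translate a byte range of an unmapped struct bio into the page array of an M_EXTPG (unmapped) mbuf. The helper name cxgbei_fill_extpg is hypothetical, and the caller is assumed to have already allocated an M_EXTPG mbuf (for example with mb_alloc_ext_pgs()); the real work happens in the icl_pdu_append_bio method and the t4_push_pdus changes.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bio.h>
    #include <sys/mbuf.h>
    #include <vm/vm.h>
    #include <vm/vm_page.h>

    /*
     * Hypothetical sketch: describe the physical pages backing the byte
     * range [bio_off, bio_off + len) of an unmapped bio in an M_EXTPG
     * mbuf previously allocated by the caller.  Returns the number of
     * bytes actually described.
     */
    static int
    cxgbei_fill_extpg(struct mbuf *m, struct bio *bp, size_t bio_off, size_t len)
    {
    	size_t off, page_off, todo, chunk;
    	int idx, npgs;

    	KASSERT((bp->bio_flags & BIO_UNMAPPED) != 0,
    	    ("%s: bio is not unmapped", __func__));

    	/* Absolute byte offset into the page array backing the bio. */
    	off = bp->bio_ma_offset + bio_off;
    	idx = off / PAGE_SIZE;
    	page_off = off % PAGE_SIZE;

    	m->m_epg_1st_off = page_off;
    	npgs = 0;
    	todo = len;
    	while (todo > 0 && npgs < MBUF_PEXT_MAX_PGS && idx < bp->bio_ma_n) {
    		m->m_epg_pa[npgs] = VM_PAGE_TO_PHYS(bp->bio_ma[idx]);
    		chunk = MIN(todo, PAGE_SIZE - page_off);
    		m->m_epg_last_len = chunk;	/* final iteration wins */
    		todo -= chunk;
    		page_off = 0;
    		idx++;
    		npgs++;
    	}
    	m->m_epg_npgs = npgs;
    	m->m_len = len - todo;
    	return (m->m_len);
    }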

Sponsored by: Chelsio Communications

Diff Detail

Repository: rG FreeBSD src repository
Lint: Not Applicable
Unit Tests: Not Applicable

Event Timeline

jhb requested review of this revision. Feb 25 2022, 11:27 PM
jhb retitled this revision from cxgbei: Support for unmapped I/O requests to cxgbei: Support unmapped I/O requests. Mar 1 2022, 6:16 PM
jhb edited the summary of this revision.
jhb added a subscriber: rscheff.

While I do see some throughput improvement in the cxgbei initiator with this series of changes, the software initiator seems little changed (which doesn't seem to gel with Alexander's experience of ICL_NOCOPY being a win on the software target). The only benchmarks I ran, however, were fio(1) runs against ramdisk targets. Is anyone on the review interested in comparing initiator performance? (I know Alexander is mostly concerned with the target.)

Oh, and for anyone wanting to test / benchmark this, the entire series is available as the cxgbei_unmap branch at https://github.com/bsdjhb/freebsd.git.

I'd expect the performance improvement from unmapped I/O support to grow with the number of cores in the system. I remember from my tests that reducing the core count in the BIOS from 20 to 4 dramatically improved IOPS, though the huge number of TLB shootdowns was impossible to ignore either way. You may want to try a bigger system, if you haven't yet.

As for ICL_NOCOPY, I can only guess that the effect may be less visible on the initiator, since there are only one or two memory copies per I/O, while on the target side I had three or four, which may create more memory/cache bottlenecks. But if the goal is to reach a full 100Gbps, there had better be no copies at all.

Since you mentioned "ramdisk", and have fixed issues in the "ramdisk" backend of CTL, I'd like to mention that the latter is built around PAGE_SIZE chunks and also does not use ICL_NOCOPY, which may affect its performance. A configuration with the block backend may behave differently. md(4) (GEOM), files, and zvols (volmode=dev) may also have different characteristics, though unlike the "ramdisk" CTL backend that difference lies outside the iSCSI layer.

sys/dev/cxgbe/tom/t4_ddp.c
1264

I know nothing about these pods, but is it expected for all of them to have the same length and offset? Is the offset field just ignored for the following ones?

1270

Just wondering why ->phys_addr vs. VM_PAGE_TO_PHYS() as in icl_soft?

sys/dev/cxgbe/tom/t4_ddp.c
1264

It is expected. Logically, a set of page pods describes a single logical buffer backed by one or more physical pages. The physical page size is variable and can be set to one of a few different sizes: the smallest is 4k, and the other sizes are multiples of that, to permit coalescing ranges when the buffer contains physically contiguous pages. Each page pod holds some fixed number of physical addresses (PPOD_PAGES), but you can chain multiple pods together to describe a larger buffer. The header bits (vld_tid_pgsz_tag_color and len_offset) are expected to be the same across all of the page pods that are chained together. The other oddity is that the first entry in a subsequent page pod in a chain has to match the last entry in the previous page pod. (This is the reason for the -1 below when calculating idx.)
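To make the overlap concrete, here is a tiny illustrative helper (not the code in t4_ddp.c) that maps a (pod, slot) pair back to an index in the buffer's page array when adjacent pods share one entry; this is where a "- 1" shows up in the index arithmetic.

    /*
     * Illustrative only: with addrs_per_pod physical addresses per page
     * pod, and the first entry of pod p + 1 required to repeat the last
     * entry of pod p, each pod after the first contributes
     * addrs_per_pod - 1 new pages.
     */
    static inline int
    ppod_page_index(int pod, int slot, int addrs_per_pod)
    {
    	return (pod * (addrs_per_pod - 1) + slot);
    }

    /*
     * For example, with 5 addresses per pod: pod 0 covers pages 0..4,
     * pod 1 covers pages 4..8, and ppod_page_index(1, 0, 5) ==
     * ppod_page_index(0, 4, 5) == 4, i.e. the shared entry.
     */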

1270

Mmm, I'll fix it. The latter spelling is the more canonical one in FreeBSD. I used the former here because I derived this from an existing similar routine in this file (t4_write_page_pods_for_ps).

This revision was not accepted when it landed; it landed in state Needs Review. Mar 10 2022, 11:51 PM
This revision was automatically updated to reflect the committed changes.