Page MenuHomeFreeBSD

gve: Add DQO RDA support
AcceptedPublic

Authored by shailend_google.com on Sep 17 2024, 6:54 PM.
Tags
None
Referenced Files
Restricted File
Tue, Nov 5, 9:18 PM
Restricted File
Tue, Nov 5, 8:55 PM
Unknown Object (File)
Sun, Nov 3, 8:18 PM
Unknown Object (File)
Mon, Oct 7, 1:29 AM
Unknown Object (File)
Oct 2 2024, 3:23 AM
Unknown Object (File)
Sep 27 2024, 5:13 AM
Unknown Object (File)
Sep 22 2024, 10:08 AM
Unknown Object (File)
Sep 21 2024, 2:04 PM
Subscribers

Details

Reviewers
markj
emaste
delphij
lwhsu
kibab
kib
Group Reviewers
network
Summary

DQO is the descriptor format for our next generation virtual NIC.
It is necessary to make full use of the hardware bandwidth on many
newer GCP VM shapes.

One major change with DQO from its predecessor GQI is that it uses
dual descriptor rings for both TX and RX queues.

The TX path uses a descriptor ring to send descriptors to HW, and
receives packet completion events on a TX completion ring.

The RX path posts buffers to HW using an RX descriptor ring and
receives incoming packets on an RX completion ring.

In GQI-QPL, the hardware could not access arbitrary regions of
guest memory, which is why there was a pre-negotitated bounce buffer
(QPL: Queue Page List). DQO-RDA has no such limitation.

"RDA" is in contrast to QPL and stands for "Raw DMA Addressing" which
just means that HW does not need a fixed bounce buffer and can DMA
arbitrary regions of guest memory.

A subsequent patch will introduce the DQO-QPL datapath that uses the
same descriptor format as in this patch, but will have a fixed
bounce buffer.

Signed-off-by: Shailend Chand <shailend@google.com>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 60382
Build 57266: arc lint + arc unit

Event Timeline

shailend_google.com retitled this revision from [RFC] gve: Add DQO RDA support to gve: Add DQO RDA support.Tue, Oct 15, 8:25 PM
shailend_google.com added a reviewer: kib.

Removed the RFC prefix in the commit msg
Added a couple of sysctls
Updated man page

share/man/man4/gve.4
209
212
233

It'd be nice to expand the GQI and DQO acronyms too, if that makes sense.

sys/dev/gve/gve.h
262

Why not give a name to the GQI struct? Is it just to avoid churn?

358

Why not give this struct field a name, like you did with dqo below?

389

It shouldn't be necessary to use volatile if all accesses are via atomic(9), which they seem to be.

sys/dev/gve/gve_main.c
178

This part of the change (adding err) seems superfluous. Is something missing?

556

but M_WAITOK allocations never fail, so this check should be removed.

sys/dev/gve/gve_rx.c
272

Why does the caller have to pass the callback as a parameter? This function operates on both types of rings, it seems natural for it to hardcode the two cleanup routines that are used, so that the caller doesn't always have to check gve_is_gqi(priv).

315

It seems a bit more consistent for gve_clear_rx_ring() to call gve_clear_rx_ring_dqo() itself.

sys/dev/gve/gve_rx_dqo.c
278

Can nsegs != 1 arise in practice? The DMA tag shouldn't permit it since maxseg == 1.

414

Could you please wrap these comments so that the lines aren't so long?

sys/dev/gve/gve_sysctl.c
38

sysctl descriptions shouldn't end with periods or newlines.

sys/dev/gve/gve_tx.c
186

Spurious whitespace change.

sys/dev/gve/gve_tx_dqo.c
170

M_WAITOK allocations won't fail.

318

I don't think this can happen, did you mean to check l4_off == l3_off?

352

You can write case __predict_true(0): break to avoid the extra indentation caused by the if-statement.

sys/modules/gve/Makefile
33–41

As a matter of style, once the list has more than a few entries it's nice to list files on their own lines:

SRCS=<tab>gve_main.c \
<tab><tab>gve_adminq.c \
...
shailend_google.com marked 18 inline comments as done.

Address Mark's comments

Thanks for reviewing!

sys/dev/gve/gve.h
262

Yup, didnt want to have a lot of diff in the existing datapath

358

I was trying to avoid introducing diff in the existing "GQI" datapath

sys/dev/gve/gve_main.c
178

Sorry, it is superfluous, removing it

sys/dev/gve/gve_rx_dqo.c
278

I was afraid of bus_dmamap_load_mbuf_sg returning err = 0 and nsegs = 0. Maybe that's not possible, but it felt good defense to check both return values, let me know if you think its too odd, I will remove the nsegs check.

414

I've removed them, the code is simple enough that they were unnecessary

sys/dev/gve/gve_tx_dqo.c
318

Sorry, just a useless check, removed it.

352

Neat!

Some parts of this are pretty opaque to me, but I don't see any problems.

This revision is now accepted and ready to land.Mon, Nov 4, 3:30 PM
sys/dev/gve/gve_rx_dqo.c
278

I have no objection, but I would add assertions with KASSERT() instead. (FreeBSD tends to use assertions pretty heavily, for what it's worth.)

shailend_google.com marked an inline comment as done.

Add a KASSERT instead of an explicit check

This revision now requires review to proceed.Mon, Nov 4, 5:14 PM
This revision is now accepted and ready to land.Tue, Nov 5, 3:34 PM

{F101963052}

I did some build testing and found that a no-INET6 kernel fails to build. The patch here fixes it for me - if it's acceptable to you I can squash that into the larger patch.

{F101963052}

I did some build testing and found that a no-INET6 kernel fails to build. The patch here fixes it for me - if it's acceptable to you I can squash that into the larger patch.

{F101964585}

Thinking about it a bit more, I think this version is better, but it needs some testing.

{F101963052}

I did some build testing and found that a no-INET6 kernel fails to build. The patch here fixes it for me - if it's acceptable to you I can squash that into the larger patch.

{F101964585}

Thinking about it a bit more, I think this version is better, but it needs some testing.

I'm unable to view the patch at https://reviews.freebsd.org/F101964585, maybe I need to look someplace else?