
netlink: retry mbuf allocation in different pool on failure
Abandoned · Public

Authored by melifaro on May 31 2023, 2:40 PM.

Details

Reviewers
gallatin
glebius
Group Reviewers
network

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Errors
Severity  Location                       Code  Message
Error     tests/atf_python/ktest.py:24   E501  PEP8 E501
Error     tests/atf_python/ktest.py:86   E501  PEP8 E501
Error     tests/atf_python/ktest.py:88   E501  PEP8 E501
Error     tests/atf_python/ktest.py:93   E501  PEP8 E501
Error     tests/atf_python/ktest.py:96   E501  PEP8 E501
Error     tests/atf_python/ktest.py:97   E501  PEP8 E501
Error     tests/atf_python/ktest.py:98   E501  PEP8 E501
Error     tests/atf_python/ktest.py:123  E501  PEP8 E501
Error     tests/atf_python/ktest.py:155  E501  PEP8 E501
Error     tests/atf_python/ktest.py:164  E501  PEP8 E501
Unit
No Test Coverage
Build Status
Buildable 51800
Build 48691: arc lint + arc unit

Event Timeline

This revision is now accepted and ready to land. May 31 2023, 3:06 PM

Why? In what scenario is it likely that we'll succeed in allocating a larger buffer when a smaller allocation failed?

> Why? In what scenario is it likely that we'll succeed in allocating a larger buffer when a smaller allocation failed?

Because mbuf zones are separate resources and can become exhausted at different times. For example, a server under a DoS attack may exhaust 2K clusters because that's what the NIC stocks its RX rings with, while still having plenty of page-size or 16K clusters available.
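
For illustration, a minimal sketch of such a fallback, assuming the stock m_getjcl(9) KPI; the helper name is hypothetical and this is not the actual diff:

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

/*
 * Hypothetical helper: if the 2K cluster zone is exhausted, retry
 * in the page-size jumbo zone, which is a separate UMA resource.
 */
static struct mbuf *
nl_alloc_cluster(int how)
{
	struct mbuf *m;

	m = m_getjcl(how, MT_DATA, M_PKTHDR, MCLBYTES);
	if (m == NULL)
		m = m_getjcl(how, MT_DATA, M_PKTHDR, MJUMPAGESIZE);
	return (m);
}
```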

I just experienced a similar scenario and complained to melifaro that netlink stopped working. The ideal situation would be for netlink to just malloc memory, or to use its own dedicated zone rather than mbufs.

I think having a separate dedicated allocator for netlink messages is the better call. Don't make your control plane reliant on your data plane's memory management.

> I just experienced a similar scenario and complained to melifaro that netlink stopped working. The ideal situation would be for netlink to just malloc memory, or to use its own dedicated zone rather than mbufs.

Netlink itself is opaque to the underlying memory type (it already uses plain malloc for Linux app buffers and for cases where the requested netlink message size is > 2k). I'd also prefer not to use mbufs at all, but they are the current way of interacting with socket buffers. Switching away from them would require implementing routines like soreceive_generic(), which is a gigantic 500-line function. I'm looking into alternative approaches, but for now let's try to get the maximum out of the current implementation.
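
For context, a rough illustration of that size-based split; the struct and helper names below are made up for this sketch, and the malloc type is defined locally rather than being netlink's real one:

```c
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/mbuf.h>
#include <sys/systm.h>

static MALLOC_DEFINE(M_NLSKETCH, "nlsketch", "hypothetical netlink buffers");

/* Hypothetical writer state, loosely modeled on the idea above. */
struct nl_writer_sketch {
	union {
		char		*data;	/* malloc-backed storage */
		struct mbuf	*m;	/* mbuf-backed storage */
	};
	bool	malloc_backed;
};

/*
 * Large requests (> 2K) bypass mbufs entirely; small ones go
 * through the mbuf/socket-buffer path.
 */
static bool
nl_buf_alloc(struct nl_writer_sketch *nw, size_t len, int how)
{
	if (len > MCLBYTES) {
		nw->data = malloc(len, M_NLSKETCH, how);
		nw->malloc_backed = true;
		return (nw->data != NULL);
	}
	nw->m = m_get2(len, how, MT_DATA, 0);
	nw->malloc_backed = false;
	return (nw->m != NULL);
}
```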

> I think having a separate dedicated allocator for netlink messages is the better call. Don't make your control plane reliant on your data plane's memory management.

I guess some intermediate solution may actually work: introducing another mbuf cluster zone, e.g. EXT_NETLINK, without UMA_ZONE_CONTIG, and using it solely for Netlink sockets.
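
A rough sketch of what creating such a zone could look like, assuming the uma_zcreate(9) KPI; the zone name is invented, and the real cluster zones involve more plumbing (ctor/dtor and EXT_* refcount accounting) than shown here:

```c
#include <sys/param.h>
#include <sys/mbuf.h>
#include <vm/uma.h>

static uma_zone_t zone_netlink;	/* hypothetical dedicated zone */

static void
nl_zone_init(void)
{
	/*
	 * Per the suggestion above, this zone is created without
	 * UMA_ZONE_CONTIG, so its allocations do not compete with
	 * the data-path cluster zones.
	 */
	zone_netlink = uma_zcreate("netlink_cluster", MCLBYTES,
	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
}
```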

>> I just experienced a similar scenario and complained to melifaro that netlink stopped working. [...]

> Netlink itself is opaque to the underlying memory type (it already uses plain malloc for Linux app buffers and for cases where the requested netlink message size is > 2k). [...]

It is possible to provide an alternate allocator for mbufs. m_free() will invoke a custom destructor for M_EXT mbufs, so you can bypass the normal UMA zones used for network packets.
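
A minimal sketch of that M_EXT route, assuming the m_extadd(9) KPI and the EXT_MOD_TYPE buffer type; the helper names and the malloc type are hypothetical:

```c
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/mbuf.h>
#include <sys/systm.h>

static MALLOC_DEFINE(M_NLBUF, "nlbuf", "hypothetical netlink buffers");

/*
 * Custom destructor: m_free() calls this for our M_EXT mbufs
 * instead of releasing the buffer to a packet UMA zone.
 */
static void
nl_ext_free(struct mbuf *m)
{
	free(m->m_ext.ext_arg1, M_NLBUF);
}

static struct mbuf *
nl_get_ext_mbuf(u_int size, int how)
{
	struct mbuf *m;
	char *buf;

	if ((m = m_get(how, MT_DATA)) == NULL)
		return (NULL);
	if ((buf = malloc(size, M_NLBUF, how)) == NULL) {
		m_free(m);
		return (NULL);
	}
	m_extadd(m, buf, size, nl_ext_free, buf, NULL, 0, EXT_MOD_TYPE);
	return (m);
}
```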

With my changes to the socket buffer layer made for unix/dgram, it is going to be easy to make Netlink use plain malloc for its socket buffers. I will help Alexander with that. I'm fine with this change as long as it fixes the problem that Drew observes.

>>> I just experienced a similar scenario and complained to melifaro that netlink stopped working. [...]

>> Netlink itself is opaque to the underlying memory type [...]

> It is possible to provide an alternate allocator for mbufs. m_free() will invoke a custom destructor for M_EXT mbufs, so you can bypass the normal UMA zones used for network packets.

Ack, thank you for the hint. I've created D40356, which implements this approach. It looks better than the current fix, so I'd prefer to land the new diff instead.