Page MenuHomeFreeBSD

mbuf: Add m_len assertion to mtod() and mtodo()
Needs ReviewPublic

Authored by igoro on Sep 16 2024, 6:38 PM.
Tags
None
Referenced Files
F102395413: D46684.diff
Mon, Nov 11, 4:40 PM
Unknown Object (File)
Thu, Nov 7, 1:38 PM
Unknown Object (File)
Thu, Nov 7, 1:37 PM
Unknown Object (File)
Thu, Nov 7, 1:24 PM
Unknown Object (File)
Thu, Nov 7, 11:25 AM
Unknown Object (File)
Mon, Nov 4, 7:43 PM
Unknown Object (File)
Sun, Nov 3, 5:48 AM
Unknown Object (File)
Sat, Oct 19, 4:42 AM
Subscribers

Details

Reviewers
kp
markj
Summary

This is expected to start a long journey of compile/runtime issues to cover before its landing. In addition, there is an open topic of a static inline function instead of a macro.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 59795
Build 56681: arc lint + arc unit

Event Timeline

igoro requested review of this revision.Sep 16 2024, 6:38 PM

RFC.

Adding assertions to the existing mtod() and mtodo() seems to break some of the existing things. For instance, one case was found in pf, where m_len is calculated after the mbuf data manipulations. It could be re-worked to form m_len first, but more cases like this can be found. Another hypothetical case could be some optimization-like when a code is sure that used amount of data is pulled, even if the type of a pointer means a bigger structure.

Probably, opt-in solution is better.

P.S. There are no strong opinions regarding the naming from my side.

RFC.

Adding assertions to the existing mtod() and mtodo() seems to break some of the existing things. For instance, one case was found in pf, where m_len is calculated after the mbuf data manipulations. It could be re-worked to form m_len first, but more cases like this can be found. Another hypothetical case could be some optimization-like when a code is sure that used amount of data is pulled, even if the type of a pointer means a bigger structure.

Did you try to fix pf, just to see what else breaks? I wonder how hard it is to get the whole test suite to run with such a change.

Probably, opt-in solution is better.

I would prefer to keep the interface simple if at all possible. That is, IMHO it is better to avoid having extra variants of the same macro. If we can sweep the tree and fix all consumers, then that would be better. Code which has some clever optimizations can be fixed (or we can add an opt-out variant if it's really necessary).

Aside from keeping the interface simple, I'm concerned that developers simply won't use a new variant of these commonly used macros.

I agree that better to fix all problems that assertions catch and just add those assertions into mtod() itself.

Maybe convert mtod() into static inline and that would allow to avoid addition of KASSERTE.

igoro retitled this revision from mbuf: Add mtod_() and mtodo_() counterparts with assertions to mbuf: Add m_len assertion to mtod() and mtodo().Oct 6 2024, 1:01 PM
igoro edited the summary of this revision. (Show Details)
igoro edited the test plan for this revision. (Show Details)

Re-work to add assertions to the existing macros

That seems like the right direction, but I recall from experimenting with this approach that there are a number of instances of mtod(m, void *), which doesn't build with this change:

/usr/src/sys/compat/freebsd32/freebsd32_misc.c:1625:7: error: invalid application of 'sizeof' to a void type [-Werror,-Wpointer-arith]
 1625 |         md = mtod(m, void *);
      |              ^~~~~~~~~~~~~~~
/usr/src/sys/sys/mbuf.h:109:31: note: expanded from macro 'mtod'
  109 |         KASSERTE((m)->m_len >= sizeof(*((t)((m)->m_data))),             \
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  110 |             ("%s: mtod_(): m_len=%d < %zu of expected data len @ %s:%d",\
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  111 |             __func__, (m)->m_len, sizeof(*((t)((m)->m_data))),          \
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  112 |             __FILE__, __LINE__))                                        \
      |             ~~~~~~~~~~~~~~~~~~~~

I believe there were three major categories of issue with it. The first being mtod(void*), another being an alignment check (mtod(m, intptr_t) & 3)) and the last one is a specialised version of mtod(void*), where it's used in bus_dmamap_load(). That last one probably needs to be converted to bus_dmamap_load_mbuf().

I've tried main kernel build for all make targets. I do not believe this is all the cases due to something can be hidden behind conditional preprocessing, but I guess it's the majority we could work with to reason our next steps (the full list can be found here):

All cases:        176
---------------------
Void cases:       116     // mtod(m, void *); bus_dmamap_load(..., mtod(m, void *), ...)
Pointer cases:     60
    vm_offset_t:   52     // mtod(m, vm_offset_t)
    uintptr_t:      4     // mtod(m, uintptr_t) + offset
    intptr_t:       2     // mtod(m, intptr_t) & 3
    bus_addr_t:     2

It feels legit to propose elimination of such mtod() use cases, provided the macro was not intended for that (Mbuf TO Data). If there is an agreement to do that, then I'm okay to help with smaller patches per file/driver/subsystem. We can start with the simplest macro expansion, and driver/subsystem authors may propose their way. Or we can think of alternatives to direct expansion from the very beginning, e.g. pointer or vm_offset_t only cases could migrate to something like mtop() or mtovmoff() macro (if it's worth it), the dma calls probably could switch to bus_dmamap_load_mbuf() as Kristof suggests, and so on.

P.S. This is only about compilation for now, I guess more fun is awaiting from runtime perspective.

Do I understand correctly that casts to void * are the problematic cases? If so, can we do something clever with _Generic to avoid having to restrict consumers?

#define mtod(m, t) (
    t __tmp;
    _Generic(__tmp,
        void *: 0,
        default: KASSERT(...));
    ((t)((m)->m_data))
)

(Untested, maybe this won't work for some reason.)

Do I understand correctly that casts to void * are the problematic cases? If so, can we do something clever with _Generic to avoid having to restrict consumers?

Thanks for reminding about the _Generic feature. It looks that our obstacle is [-Werror,-Wpointer-arith] usage. While it picks a correct type branch, it still parses the default one, and the latter contains the stumbling block -- sizeof(void). This is what I've tried:

#define mtod(m, t) _Generic(                                                    \
        t,                                                                      \
        void *: ((t)((m)->m_data)),                                             \
        default: (                                                              \
                KASSERTE((m)->m_len >= sizeof(*((t)((m)->m_data))),             \
                    ("%s: mtod_(): m_len=%d < %zu of expected data len @ %s:%d",\
                    __func__, (m)->m_len, sizeof(*((t)((m)->m_data))),          \
                    __FILE__, __LINE__))                                        \
                ,                                                               \
                ((t)((m)->m_data))                                              \
        )                                                                       \
)

Do I understand correctly that casts to void * are the problematic cases? If so, can we do something clever with _Generic to avoid having to restrict consumers?

Thanks for reminding about the _Generic feature. It looks that our obstacle is [-Werror,-Wpointer-arith] usage. While it picks a correct type branch, it still parses the default one, and the latter contains the stumbling block -- sizeof(void). This is what I've tried:

#define mtod(m, t) _Generic(                                                    \
        t,                                                                      \
        void *: ((t)((m)->m_data)),                                             \
        default: (                                                              \
                KASSERTE((m)->m_len >= sizeof(*((t)((m)->m_data))),             \
                    ("%s: mtod_(): m_len=%d < %zu of expected data len @ %s:%d",\
                    __func__, (m)->m_len, sizeof(*((t)((m)->m_data))),          \
                    __FILE__, __LINE__))                                        \
                ,                                                               \
                ((t)((m)->m_data))                                              \
        )                                                                       \
)

Hrmph. Maybe we can just try to eliminate all uses of mtod(m, void *) then. I see some that are simply unnecessary, e.g., the ones in debugnet.c. Others can be converted to mtod(m, char *) or similar. What do you think?

Hrmph. Maybe we can just try to eliminate all uses of mtod(m, void *) then. I see some that are simply unnecessary, e.g., the ones in debugnet.c. Others can be converted to mtod(m, char *) or similar. What do you think?

Yes, it would be great to keep void * consumers as is. If there are no other ideas then, yes, we can easily change existing void * consumers considering each case's specifics.

Hrmph. Maybe we can just try to eliminate all uses of mtod(m, void *) then. I see some that are simply unnecessary, e.g., the ones in debugnet.c. Others can be converted to mtod(m, char *) or similar. What do you think?

Yes, it would be great to keep void * consumers as is. If there are no other ideas then, yes, we can easily change existing void * consumers considering each case's specifics.

It's a fair amount of churn to fix them (I just pushed some small portion of this), but it's doable, and I don't have a better suggestion. Using mtod(void *) is a bit lazy and it would be good to avoid it when possible, and use a different macro when it's not possible.

Maybe this should be discussed on the freebsd-net mailing list, folks may have other opinions.