Page MenuHomeFreeBSD

carp: support VRRPv3
ClosedPublic

Authored by kp on Apr 13 2024, 8:51 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Nov 5, 3:09 PM
Unknown Object (File)
Fri, Oct 18, 9:20 AM
Unknown Object (File)
Fri, Oct 18, 9:19 AM
Unknown Object (File)
Oct 13 2024, 11:55 AM
Unknown Object (File)
Oct 2 2024, 9:45 PM
Unknown Object (File)
Oct 2 2024, 4:28 AM
Unknown Object (File)
Sep 30 2024, 7:11 PM
Unknown Object (File)
Sep 24 2024, 4:52 AM

Details

Summary

Allow carp(4) to use the VRRPv3 protocol (RFC 5798). We can distinguish carp and
VRRP based on the protocol version number (carp is 2, VRRPv3 is 3), and support
both from the carp(4) code.

Co-authored by: glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/netinet/ip_carp.c
561–578

IMHO, switch() would look nicer here.

595–596

These two reassignments belong to the block that did m_pullup(). We don't need them at common path.

597–625

IMHO, switch() looks nicer.

632

All my comments to IPv4 input also apply to IPv6 version.

782

Maybe make it const struct mbuf *m and do same to carp_source_is_self()? Ideally we would pass struct ifnet * and IP header here, but that would be too much of change.

967

Can we ifa_free(ifa) right here?

1358–1364

Up to your taste, but I would go with C99 initializer here. That allows reader of the code to be 100% sure that everything that is not set is set to 0 and no stack bytes leak anywhere.

1490

Please use char * in a new code.

P.S. I'm surprised compiler didn't say anything about using caddr_t instead of c_caddr_t.

2779

Just as with the ioctl structure, I'd suggest to keep this one untouched and created a new one for new version.

kp marked 11 inline comments as done.Apr 16 2024, 2:41 PM
kp added inline comments.
sys/netinet/ip_carp.c
110–122

We probably could, but given that they're different data types (int vs. uint8_t/uint16_t) and mean different things I don't think it'd improve anything.

It's not as if a handful of extra bytes per carp address is of any concern.

180

It's used only in the kernel. It's used to abstract the old ioctl configuration api and the new netlink one.
Both the carp_ioctl() and carp_nl_set() functions then call into carp_ioctl_set(), which takes a struct carpkreq. It needs to list the new fields to support the new functionality.

2779

That seems a bit silly. The nl_carp_parsed struct is only used during netlink attribute parsing, and the whole point of netlink is that we can extend the protocol without breaking user/kernel space interfaces.

sys/netinet/ip_carp.h
126–127

I followed the existing struct carp_header here.
It'd arguably be better to move both into ip_carp.c, but I'd prefer to do that outside of this commit.

sys/netinet/ip_carp.c
110–122

It is more about code readability rather than memory savings. A new reader comes and immediately sees that certain fields are shared between protocols and certain are mutually exclusive.

180

Got it. Thanks!

2779

Maybe I mistake, but won't the preceding nlmsghdr now have a bigger nlmsg_len? If so, won't the old ABI request fail?

sys/netinet/ip_carp.h
126–127

Agreed.

sys/netinet/ip_carp.c
110–122

So something like this?

diff --git a/sys/netinet/ip_carp.c b/sys/netinet/ip_carp.c
index 61b5a65d0bf3..714f2f5ed23f 100644
--- a/sys/netinet/ip_carp.c
+++ b/sys/netinet/ip_carp.c
@@ -108,13 +108,19 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
+       union {
+               struct carp_softc_carp {
+                       int                     sc_advskew;
+                       int                     sc_advbase;
+               } carp;
+               struct carp_softc_vrrp {
+                       uint8_t                 sc_vrrp_prio;
+                       uint16_t                sc_vrrp_adv_inter;
+                       uint16_t                sc_vrrp_master_inter;
+               } vrrp;
+       } sc_settings;
        struct in_addr          sc_carpaddr;
        struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;

        int                     sc_naddrs;
        int                     sc_naddrs6;
@@ -327,10 +333,10 @@ SYSCTL_VNET_PCPUSTAT(_net_inet_carp, OID_AUTO, stats, struct carpstats,
        TAILQ_FOREACH((sc), &(ifp)->if_carp->cif_vrs, sc_list)

 #define        DEMOTE_ADVSKEW(sc)                                      \
-    (((sc)->sc_advskew + V_carp_demotion > CARP_MAXSKEW) ?     \
+    (((sc)->sc_settings.carp.sc_advskew + V_carp_demotion > CARP_MAXSKEW) ?    \
     CARP_MAXSKEW :                                             \
-        (((sc)->sc_advskew + V_carp_demotion < 0) ?            \
-        0 : ((sc)->sc_advskew + V_carp_demotion)))
+        (((sc)->sc_settings.carp.sc_advskew + V_carp_demotion < 0) ?           \
+        0 : ((sc)->sc_settings.carp.sc_advskew + V_carp_demotion)))

 static void    carp_input_c(struct mbuf *, struct carp_header *, sa_family_t, int);
 static void    vrrp_input_c(struct mbuf *, int, sa_family_t, int, int, uint16_t);
@@ -885,7 +891,7 @@ carp_input_c(struct mbuf *m, struct carp_header *ch, sa_family_t af, int ttl)
        sc->sc_init_counter = 0;
        sc->sc_counter = tmp_counter;

-       sc_tv.tv_sec = sc->sc_advbase;
+       sc_tv.tv_sec = sc->sc_settings.carp.sc_advbase;
        sc_tv.tv_usec = DEMOTE_ADVSKEW(sc) * 1000000 / 256;
        ch_tv.tv_sec = ch->carp_advbase;
        ch_tv.tv_usec = ch->carp_advskew * 1000000 / 256;
...

I'm not sure that's better (and I seem to have broken things making a mechanical substitution too).

2779

The messages will indeed be slightly larger. Message size is already variable because things like strings only use strlen(string) + 1, so setting up carp on "if1" has a different message length from setting it up on "interface1" already.

That's a long winded way of saying that the message length is not part of the ABI.

sys/netinet/ip_carp.c
110–122

Nope, just this:

@@ -108,13 +108,19 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
+       union {
+               struct {  /* CARP specific context */
+                       int                     sc_advskew;
+                       int                     sc_advbase;
+               };
+               struct {  /* VRRPv4 specific context */
+                       uint8_t                 sc_vrrp_prio;
+                       uint16_t                sc_vrrp_adv_inter;
+                       uint16_t                sc_vrrp_master_inter;
+               };
+       };
        struct in_addr          sc_carpaddr;
        struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;

        int                     sc_naddrs;
        int                     sc_naddrs6;

And no changes to any code or macros!

2779

Got it, thanks!

sys/netinet/ip_carp.c
110–122

I've tried that, and while I can make it work it requires a few additional changes to cope with the conflicts this introduces.

That is, the valid values for sc_advbase and sc_vrrp_adv_inter are different, so changing version between carp and vrrp becomes more complex. This seems to introduce several possible ways for things to break, without adding much more than a hint that the sc_advskew value won't be used in VRRPv3 mode.

However, it occurs to me that we don't actually need a union to hint that these are protocol specific contexts, we could just do this too:

diff --git a/sys/netinet/ip_carp.c b/sys/netinet/ip_carp.c
index 61b5a65d0bf3..5e91c85ae885 100644
--- a/sys/netinet/ip_carp.c
+++ b/sys/netinet/ip_carp.c
@@ -108,13 +108,17 @@ struct carp_softc {
        struct mtx              sc_mtx;

        int                     sc_vhid;
-       int                     sc_advskew;
-       int                     sc_advbase;
-       struct in_addr          sc_carpaddr;
-       struct in6_addr         sc_carpaddr6;
-       uint8_t                 sc_vrrp_prio;
-       uint16_t                sc_vrrp_adv_inter;
-       uint16_t                sc_vrrp_master_inter;
+       struct { /* CARP specific context */
+               int             sc_advskew;
+               int             sc_advbase;
+               struct in_addr  sc_carpaddr;
+               struct in6_addr sc_carpaddr6;
+       };
+       struct { /* VRRPv3 specific context */
+               uint8_t         sc_vrrp_prio;
+               uint16_t        sc_vrrp_adv_inter;
+               uint16_t        sc_vrrp_master_inter;
+       };

        int                     sc_naddrs;
        int                     sc_naddrs6;

Put carp/vrrp3 specific variables in their own structs.

In D44774#1020944, @kp wrote:
In D44774#1020899, @bz wrote:

I will simply express that this will not only open a can of worms by mixing both but the original reasons not to include VRRPv2/3 and hence the "existence" of CARP is also ignored.

I'm assuming you're referring to the supposed patent issues?

There are multiple other open source VRRP implementations, e.g. https://github.com/FDio/vpp/tree/master/src/plugins/vrrp and https://www.keepalived.org
Also, the relevant patents have expired by now (by which I mean, in 2017 and 2012, so 7 and 12 years ago): https://en.wikipedia.org/wiki/Virtual_Router_Redundancy_Protocol#cite_note-6

Absent follow-up I'm going to consider this objection to be settled.

In D44774#1023611, @kp wrote:

Put carp/vrrp3 specific variables in their own structs.

This definitely is better, but I still can't figure out what prevents to unionize it? Do we have a code that switches certain address operation between CARP and VRRP?

In D44774#1023611, @kp wrote:

Put carp/vrrp3 specific variables in their own structs.

This definitely is better, but I still can't figure out what prevents to unionize it? Do we have a code that switches certain address operation between CARP and VRRP?

Putting those variables in a union means that setting sc_vrrp_prio will clobber sc_advskew (and vice versa). These have different constraints and different meanings. It just makes things harder than they need to be when switching between carp and vrrp.

In D44774#1024457, @kp wrote:

Putting those variables in a union means that setting sc_vrrp_prio will clobber sc_advskew (and vice versa). These have different constraints and different meanings. It just makes things harder than they need to be when switching between carp and vrrp.

The switching code should just use a local variable to convert between prio and adnskew. IMHO better make the rarely executed switching code a tiny bit more complex, but avoid carrying unrelated data in the softc. If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

The switching code should just use a local variable to convert between prio and adnskew.

We can't convert between them. They're not different representations of the same concept, they just represent different things.

If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

But that's what putting them in a union does. It, for lack of a better word, infects carp with vrrp settings and vice versa.

In D44774#1024640, @kp wrote:

The switching code should just use a local variable to convert between prio and adnskew.

We can't convert between them. They're not different representations of the same concept, they just represent different things.

If we dereference CARP specific stuff in VRRP code or vice versa - that's a bug that needs be fixed rather than masked by non-unionizing data.

But that's what putting them in a union does. It, for lack of a better word, infects carp with vrrp settings and vice versa.

If it is possible to set VRRP values while interface is in CARP mode and vice versa, then it is already a bug and de-unionizing just hides it.

Do you have the branch with this project shared anywhere? I'd like to give it a try.

If it is possible to set VRRP values while interface is in CARP mode and vice versa, then it is already a bug and de-unionizing just hides it.

It's possible to switch version without simultaneously changing the vrrp priority or advertisement interval, which means these values are whatever the carp skew/interval values were.
That is, ifconfig foo carpver 3 is insufficient, users would always have to pass the priority and advertisement interval as well. Or we'd have to have additional code to handle this case and force the default values to be set instead. We'd be adding code to handle issues that simply don't exist without the union.

Do you have the branch with this project shared anywhere? I'd like to give it a try.

https://github.com/kprovost/freebsd-src/tree/netgate/vrrp

I see. There is quite a lot of CARP specific code to be executed when we are VRRP mode and little bit code vice versa. I'm working pull requestes on github for you.

This revision now requires changes to proceed.Apr 29 2024, 10:26 PM
sys/netinet/ip_carp.c
623

Isn't this the most easy way to panic system remotely?

kp marked an inline comment as done.Apr 30 2024, 7:10 AM
kp added inline comments.
sys/netinet/ip_carp.c
623

No, because this is the second time we check the version. On line 567 we check for the first time, so asserting it here is appropriate.

sys/netinet/ip_carp.c
623

My bad, sorry.

kp marked an inline comment as done.

Add glebius' changes from https://github.com/freebsd/freebsd-src/compare/main...glebius:FreeBSD:carp-vrrp

Also change sc_addr to a uint8_t array (from sockaddr_dl).

sys/netinet/ip_carp.c
2109

This comment can now be deleted.

2110–2115

And this can go up into the sparse initializer above ^^

  • remove stale comment
  • init sc_addr in the initializer

Thanks! Please write a longer commit message, that would mention preparatory and cleanup stuff that we did and aren't part of VRRP addition:

  • Refactor packet tagging for ether_output(). Separate HMAC preparation (CARP specific) from tagging. In unicast mode (CARP specific) don't put tag at all. Don't put pointer to software context into the tag. Putting just vhid, an integer value, is a safer design.
  • Get rid of unnecessary sockaddr_dl and just use uint8_t array of ETHER_ADDR length.
  • Fix a tiny bug in ioctl handler, where we would accept new advbase, but return EINVAL due to invalid advskew.

Split out Gleb's commits again.

It makes the individual changes easier to understand.

In D44774#1026860, @kp wrote:

Split out Gleb's commits again.

It makes the individual changes easier to understand.

I believe I've done the required magic to make Phabricator understand the entire chain, but it also lives here if I didn't: https://github.com/kprovost/freebsd-src/tree/netgate/vrrp

bz requested changes to this revision.May 3 2024, 12:35 AM

I'll reply here rather than the email--one well two major concerns (and I didn't mean legal) with the "can of worms" are:
(1) You are adding VRRP version 3... someone will come next and ask "and what about VRRP version 2"?
(2) We are adding to but not fixing the problem of the conflicting version number; it's easy to say I can add v3 but ... see (1).

And given the conflict, there are parties out there which simply changed the number of what was "CARP_VERSION" (to something unassigned) to avoid the conflict and still be working with themselves. Hence my other comments of making this more switch statements, not hard coding "version 2", etc. as that will allow at least these people to not trap over other standards and live with this change in-kernel and should we ever get around to "fix" this proper, then we won't trip over our own feet either...

sbin/ifconfig/carp.c
90

I am surprised this code has no #ifdef INET/INET6 but didn't have either before....

253

You could use %d and not hardcode the numbers?

I think the switch statement building a whitelist and erroring with unsupported version number %d would be more flexible...

There's a few more places in this review which would equally benefit from being more easily extensible.

sys/netinet/ip_carp.c
541–542

I think I would add a comment as-to why this struct is used (it's the minimal size I assume); in theory you need an unit8_t to get the version. I hope VRRPv<n+1> will not have a smaller header...

590

Is there any chance we can make the %s:%d func, LINE as there are multiple of them and for debugging it would probably help to know which one?

1811

Can this be a switch as well?

sys/netinet/ip_carp.h
108

That is actually a struct vrrp_header_v3 as v2 has a 1 octet Auth Type and a 1 octet Adver Int.

183

This may be a silly request but can that 'v' please be uppercase as well?

This revision now requires changes to proceed.May 3 2024, 12:35 AM
sys/netinet/ip_carp.h
85

You actually want RFC 9568; 5798 is obsolete!

sys/netinet/ip_carp.c
1811

There is a revision on top of this one that converts almost everything to a switch statement.

sys/netinet/ip_carp.h
183

In my opinion CARP_VERSION_VRRPV3 is ugly and CARP_VERSION_VRRPv3 is neat.

kp marked 8 inline comments as done.May 3 2024, 11:38 AM
In D44774#1027778, @bz wrote:

I'll reply here rather than the email--one well two major concerns (and I didn't mean legal) with the "can of worms" are:
(1) You are adding VRRP version 3... someone will come next and ask "and what about VRRP version 2"?

Right now the answer is "We don't have v2, and we have no plans to support v2 either.".
If someone does ever want to add VRRPv2 support they get to deal with the fallout. I'm not inclined to copy/paste chunks of code on the off chance that maybe it'll be slightly helpful in the future.
We're not making choices that'll make that impossible to do in the future.

(2) We are adding to but not fixing the problem of the conflicting version number; it's easy to say I can add v3 but ... see (1).

As in carpv3? That seems entirely unlikely to ever happen, given that carp's entire reason for existing is now moot.

And given the conflict, there are parties out there which simply changed the number of what was "CARP_VERSION" (to something unassigned) to avoid the conflict and still be working with themselves. Hence my other comments of making this more switch statements, not hard coding "version 2", etc. as that will allow at least these people to not trap over other standards and live with this change in-kernel and should we ever get around to "fix" this proper, then we won't trip over our own feet either...

I'll see about at ensuring that we don't leave '2' or '3' in the code anywhere, but do use the CARP_VERSION_CARP/CARP_VERSION_VRRPv3 everywhere.

sys/netinet/ip_carp.c
541–542

You assume correctly.

If the VRRPv4 header is smaller then .. well, the code doesn't support v4 either way, so it doesn't actually matter if we discard the packet here or later. We can always change this to sizeof(uint8_t) if and when that happens.

sys/netinet/ip_carp.h
183

I'm with Gleb here. I prefer the lowercase 'v' as well.

kp marked 2 inline comments as done.

review remarks

In D44774#1027883, @kp wrote:
In D44774#1027778, @bz wrote:

I'll reply here rather than the email--one well two major concerns (and I didn't mean legal) with the "can of worms" are:
(1) You are adding VRRP version 3... someone will come next and ask "and what about VRRP version 2"?

Right now the answer is "We don't have v2, and we have no plans to support v2 either.".
If someone does ever want to add VRRPv2 support they get to deal with the fallout. I'm not inclined to copy/paste chunks of code on the off chance that maybe it'll be slightly helpful in the future.
We're not making choices that'll make that impossible to do in the future.

(2) We are adding to but not fixing the problem of the conflicting version number; it's easy to say I can add v3 but ... see (1).

As in carpv3? That seems entirely unlikely to ever happen, given that carp's entire reason for existing is now moot.

No, as in carp getting an official version number.

And given the conflict, there are parties out there which simply changed the number of what was "CARP_VERSION" (to something unassigned) to avoid the conflict and still be working with themselves. Hence my other comments of making this more switch statements, not hard coding "version 2", etc. as that will allow at least these people to not trap over other standards and live with this change in-kernel and should we ever get around to "fix" this proper, then we won't trip over our own feet either...

I'll see about at ensuring that we don't leave '2' or '3' in the code anywhere, but do use the CARP_VERSION_CARP/CARP_VERSION_VRRPv3 everywhere.

Great thank you! Go ahead then once everyone else is d'accord with it.

sys/netinet/ip_carp.h
183

Do we do CaMeLcAsE anywhere else in the network stack? I am not going to block it on this but it's just "special"

In D44774#1028213, @bz wrote:
In D44774#1027883, @kp wrote:
In D44774#1027778, @bz wrote:

(2) We are adding to but not fixing the problem of the conflicting version number; it's easy to say I can add v3 but ... see (1).

As in carpv3? That seems entirely unlikely to ever happen, given that carp's entire reason for existing is now moot.

No, as in carp getting an official version number.

I'm not sure I follow. Do you mean carp might be officially designated vrrp version 9 (let's say)? That seems both very unlikely, and something that wouldn't actually be a problem to deal with. We already distinguish vrrpv3 from carp by the version number, so looking for version 9 would, if anything, make things easier.

Last call for objections. I intend to commit tomorrow.

My personal suggestion: it's worth to mention it in carp(4), and perhaps even adding a link of vrrp(4)

My personal suggestion: it's worth to mention it in carp(4), and perhaps even adding a link of vrrp(4)

Something like D44776 maybe?

This revision was not accepted when it landed; it landed in state Needs Review.May 8 2024, 11:20 AM
Closed by commit rG37115154672f: carp: support VRRPv3 (authored by kp). · Explain Why
This revision was automatically updated to reflect the committed changes.