Page MenuHomeFreeBSD

nd6: Fix the routing table subscription
ClosedPublic

Authored by markj on Jul 18 2024, 3:22 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Sep 21, 10:37 AM
Unknown Object (File)
Sun, Sep 8, 6:03 PM
Unknown Object (File)
Sun, Sep 8, 4:23 AM
Unknown Object (File)
Sat, Sep 7, 8:50 PM
Unknown Object (File)
Sat, Sep 7, 4:35 PM
Unknown Object (File)
Wed, Sep 4, 12:57 PM
Unknown Object (File)
Sun, Sep 1, 8:53 PM
Unknown Object (File)
Sun, Sep 1, 4:28 PM

Details

Summary

The nd6 code listens for RTM_DELETE events so that it can mark the
corresponding default router as inactive in the case where the default
route is deleted. A subsequent RA from the router may then reinstall
the default route.

Commit fedeb08b6a58e broke this for non-multipath routes, as
rib_decompose_notification() only invokes the callback for multipath
routes. Restore the old behaviour. Also ensure that we update the
router only for RTM_DELETE notifications, lost in commit 2259a03020fe0.

Fixes: fedeb08b6a58 ("Introduce scalable route multipath.")
Fixes: 2259a03020fe ("Rework part of routing code to reduce difference to D26449.")
Sponsored by: Klara, Inc.
Sponsored by: Bell Tower Integration

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj requested review of this revision.Jul 18 2024, 3:22 PM

Ah, fast fix. I'm also looking at this issue :)

sys/netinet6/nd6.c
1636

This should work great for plain nexthop. Shall we check if the rc->rc_nh_old is nexthop group, aka via NH_IS_NHGRP() ?

Ty for working on that! IIRC the idea of rib_decompose_notification was to provide the same handling for both multipath and non-multipath routes for the callers that care only about specific paths. I'd rather add the relevant callback to the rib_decompose_notification() and not touch nd6_subscription_cb code.

Ty for working on that! IIRC the idea of rib_decompose_notification was to provide the same handling for both multipath and non-multipath routes for the callers that care only about specific paths. I'd rather add the relevant callback to the rib_decompose_notification() and not touch nd6_subscription_cb code.

To be clear, you are asking me to:

  • make rib_decompose_notification() work for non-multipath nexthops, i.e., just remove the NH_IS_NHGRP() checks
  • move rib_decompose_notification() out of the ROUTE_MPATH ifdef
  • fix up existing callers

?

If so, sure, I will work on that.

Avoid invoking the callback twice for multipath routes.

Ty for working on that! IIRC the idea of rib_decompose_notification was to provide the same handling for both multipath and non-multipath routes for the callers that care only about specific paths. I'd rather add the relevant callback to the rib_decompose_notification() and not touch nd6_subscription_cb code.

To be clear, you are asking me to:

  • make rib_decompose_notification() work for non-multipath nexthops, i.e., just remove the NH_IS_NHGRP() checks
  • move rib_decompose_notification() out of the ROUTE_MPATH ifdef
  • fix up existing callers

?

I took a stab at this and it's more complicated than I thought. I'm not familiar at all with the multipath routing code so would prefer to fix the bug minimally for now. This will also make backports simpler.

bz added inline comments.
sys/netinet6/nd6.c
1616

The call stack was too deep for me to conclude rc_nh_old would always be non-NULL here? If you keep removing the NULL check below, can we add a KASSERT to document the assumption (here and below)?

sys/netinet6/nd6.c
1616

rib_decompose_notification() also assumes that rc_nh_old is non-NULL (when the command is RTM_DELETE), and that apparently did not cause any problems in the past several years.

I am happy with it for now.
If someone wants to refine it (and write a regression test for it) then I think they can in main afterwards.

This revision is now accepted and ready to land.Jul 25 2024, 1:01 PM
In D46020#1051358, @bz wrote:

I am happy with it for now.
If someone wants to refine it (and write a regression test for it) then I think they can in main afterwards.

I will work on a regression test today. Something like:

  • create a vnet jail with a SLAAC-enabled interface,
  • use scapy to inject a RA
  • verify that a default route is installed
  • flush v6 routes, verify that the default route is gone
  • inject a second RA, then verify that the default route is reinstalled
This revision was automatically updated to reflect the committed changes.
In D46020#1051358, @bz wrote:

I am happy with it for now.
If someone wants to refine it (and write a regression test for it) then I think they can in main afterwards.

I will work on a regression test today. Something like:

  • create a vnet jail with a SLAAC-enabled interface,
  • use scapy to inject a RA
  • verify that a default route is installed
  • flush v6 routes, verify that the default route is gone
  • inject a second RA, then verify that the default route is reinstalled

https://reviews.freebsd.org/D46136