Allow ND entries creation for all routes without gateway.
Closed · Public

Authored by melifaro on Feb 15 2020, 2:38 PM.

Details

Reviewers

bz
kbowling

Group Reviewers

network

Commits

rG95b5ff22a93c: netinet6: allow ND entries creation for all directly-reachable
rGf998535a66b9: netinet6: allow ND entries creation for all directly-reachable

Summary

The current assumption is that kernel-handled rtadv prefixes, along with
the interface address prefixes, are the only prefixes considered in
the ND neighbor eligibility code.
Change this by allowing any prefix with a non-gatewayed route to be eligible.

This will allow DHCPv6-controlled routes to be correctly handled by
the ND code.

Refactor nd6_is_new_addr_neighbor() to enable more deterministic
performance in the "found" case and remove the unneeded
V_rt_add_addr_allfibs handling logic.

This is the alternative form of D15404 && D15405, not involving new route flags.
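
A rough model of the eligibility rule described in the summary may help readers who do not know the ND code. This is only a sketch in Python, not the kernel implementation: the `Route` class, `lookup()` and `is_neighbor_candidate()` are hypothetical names, and the real code works on kernel nexthop structures rather than a list of routes.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    """Hypothetical routing-table entry; gateway=None means directly reachable."""
    prefix: ipaddress.IPv6Network
    ifname: str
    gateway: Optional[ipaddress.IPv6Address] = None


def lookup(routes, addr):
    """Longest-prefix match: return the most specific route covering addr."""
    matches = [r for r in routes if addr in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen, default=None)


def is_neighbor_candidate(routes, addr, ifname):
    """Eligible if the best route for addr has no gateway and uses ifname."""
    best = lookup(routes, addr)
    return best is not None and best.gateway is None and best.ifname == ifname


routes = [
    # Default route via a router: gatewayed, so it never makes anything a neighbor.
    Route(ipaddress.ip_network("::/0"), "vtnet0", ipaddress.ip_address("fe80::1")),
    # Static interface route, as in the test plan (no address from this prefix).
    Route(ipaddress.ip_network("2a02:6b8:2::/64"), "vtnet0"),
]

print(is_neighbor_candidate(routes, ipaddress.ip_address("2a02:6b8:2::1"), "vtnet0"))  # True
print(is_neighbor_candidate(routes, ipaddress.ip_address("2001:db8::1"), "vtnet0"))    # False
```

In this model the static /64 route alone is enough for 2a02:6b8:2::1 to become an ND candidate, which corresponds to the "After" case in the test plan.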

Test Plan

Before

route -n get -6 2a02:6b8:2::
   route to: 2a02:6b8:2::
destination: 2a02:6b8:2::
       mask: ffff:ffff:ffff:ffff::
        fib: 0
  interface: vtnet0
      flags: <UP,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0


16:10 [0] m@devel2 ping6 2a02:6b8:2::1
PING6(56=40+8+8 bytes) 2a01:4f8:13a:70c:ffff::8 --> 2a02:6b8:2::1
ping6: sendmsg: No buffer space available
ping6: wrote 2a02:6b8:2::1 16 chars, ret=-1
^C
--- 2a02:6b8:2::1 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss


16:10 [0] m@devel2 ndp -an | grep 2a02:6b8:2::

After

route -n get -6 2a02:6b8:2::
   route to: 2a02:6b8:2::
destination: 2a02:6b8:2::
       mask: ffff:ffff:ffff:ffff::
        fib: 0
  interface: vtnet0
      flags: <UP,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0


16:10 [1] m@devel0 ping6 2a02:6b8:2::1
PING6(56=40+8+8 bytes) 2a01:4f8:13a:70c:ffff::6 --> 2a02:6b8:2::1
^C
--- 2a02:6b8:2::1 ping6 statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss


16:10 [1] m@devel0 ndp -an  | grep 2a02:6b8:2::
2a02:6b8:2::1                        (incomplete)      vtnet0 1s        I  2

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

melifaro created this revision. · Feb 15 2020, 2:38 PM

Herald added subscribers: ae, imp. · Feb 15 2020, 2:38 PM

Harbormaster completed remote builds in B29399: Diff 68353. · Feb 15 2020, 2:38 PM

Add test for checking valid ND operation.

Herald added a subscriber: asomers. · Feb 15 2020, 4:08 PM

Harbormaster completed remote builds in B29401: Diff 68356. · Feb 15 2020, 4:09 PM

melifaro edited the summary of this revision. · Feb 15 2020, 4:15 PM

melifaro edited the test plan for this revision.

hsw_bitmark.com added a subscriber: hsw_bitmark.com. · Feb 17 2020, 1:18 AM

hrs added a subscriber: hrs. · Feb 18 2020, 12:50 AM

I have no strong objection to allow a prefix route with no gateway, but I think the case pointed out in Bug 194485 can be solved by just adding an address with the delegated prefix on the interface (EUI-64 always works as the interface id). Is there any specific reason for DHCP-PD (or another use case) to have an interface route?

In D23695#521458, @hrs wrote:

I have no strong objection to allow a prefix route with no gateway, but I think the case pointed out in Bug 194485 can be solved by just adding an address with the delegated prefix on the interface (EUI-64 always works as the interface id). Is there any specific reason for DHCP-PD (or another use case) to have an interface route?

That's not always possible.

With IPv4 we already have to deal with overlay networks, where all IPs are taken by other nodes in the shared network. In those cases a static route to the network interfaces is set and the ND code does the direct ARP for us (using an IP from a different network).

With IPv6 the problem can become more serious, because every interface should have a link-local address on the shared link. So in principle, there is no need for addresses visible from a broader scope on the interface itself. Dynamic routing protocols for IPv6 exploit this property extensively, so it's quite common not to have an IPv6 address in the network we are routing to.

With Prefix-Delegation (DHCPv6) a node on the link obtains a bunch of addresses, which are solely used by the node itself. So there is not even a remote possibility to use an address from the IP space delegated. But the node can use the delegated IP space even on this link, so ND should be able to handle this case. Of course in typical setups the routing table of the delegating router is modified to route the delegated IP space to the link-local address of the receiving node, so there is no special case. But there is no need to do so.

I'd recommend adding ND entries only if we really have a route for the covering network to the link itself. Otherwise we risk cache pollution with unusable entries or (even worse) security issues caused by redirected traffic.

In D23695#521458, @hrs wrote:

I have no strong objection to allow a prefix route with no gateway, but I think the case pointed out in Bug 194485 can be solved by just adding an address with the delegated prefix on the interface (EUI-64 always works as the interface id). Is there any specific reason for DHCP-PD (or another use case) to have an interface route?

Thank you for looking into this!

Indeed, I should have provided a more accurate description. The problem is a bit bigger than just DHCPv6.

In general, RFC 5942 advocates for the explicit split between address management and prefix assignment:

RFC 5942 1. Introduction

The on-link determination is separate from the address assignment.  A host can have IPv6 addresses without any related on-link prefixes or can have on-link prefixes that are not related to any IPv6 addresses that are assigned to the host.

RFC 5942 4.1

The assignment of an IPv6 address -- whether through IPv6
       stateless address autoconfiguration [RFC4862], DHCPv6 [RFC3315],
       or manual configuration -- MUST NOT implicitly cause a prefix
       derived from that address to be treated as on-link and added to
       the Prefix List.  A host considers a prefix to be on-link only
       through explicit means, such as those specified in the on-link
       definition in the Terminology section of [RFC4861] (as modified
       by this document) or via manual configuration.

The rationale for this change is to support not only DHCPv6, but a generic case where IPv6 prefixes are added by other control software or manually.
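
The separation quoted from RFC 5942 can be illustrated with a small next-hop model. This is a sketch under simplifying assumptions: `next_hop()` is a hypothetical helper, a single default router stands in for the Default Router List, and all addresses are documentation examples.

```python
import ipaddress


def next_hop(dst, onlink_prefixes, default_router):
    """On-link destinations are resolved directly; everything else goes to the router.

    Only the explicit on-link prefix list is consulted. Addresses assigned to
    the host are deliberately not an input, mirroring RFC 5942 section 4.1.
    """
    if any(dst in prefix for prefix in onlink_prefixes):
        return dst
    return default_router


router = ipaddress.ip_address("fe80::1")

# Case 1: the host has 2001:db8:1::10 assigned, but no on-link prefix covers it.
# A peer in the same /64 is still reached through the default router.
assigned = ipaddress.ip_interface("2001:db8:1::10/64")  # not used by next_hop()
print(next_hop(ipaddress.ip_address("2001:db8:1::20"), [], router))  # fe80::1

# Case 2: an on-link prefix exists although the host has no address inside it.
onlink = [ipaddress.ip_network("2001:db8:2::/64")]
print(next_hop(ipaddress.ip_address("2001:db8:2::5"), onlink, router))  # 2001:db8:2::5
```

Case 2 is the situation this revision targets: the prefix is on-link through manual or software-driven configuration, with no matching address on the interface.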

In D23695#521499, @melifaro wrote:

In D23695#521458, @hrs wrote:

I have no strong objection to allow a prefix route with no gateway, but I think the case pointed out in Bug 194485 can be solved by just adding an address with the delegated prefix on the interface (EUI-64 always works as the interface id). Is there any specific reason for DHCP-PD (or another use case) to have an interface route?

Thank you for looking into this!

Indeed, I should have provided a more accurate description. The problem is a bit bigger than just DHCPv6.

In general, RFC 5942 advocates for the explicit split between address management and prefix assignment:

Yes, it is true that a configured address must not have any implication about prefix configuration.

Let me make myself clear. I am interested in how the application side tells on-link prefix information to the kernel. I think DHCPv6-PD is one of the typical cases, and another example is RA PIO. When the kernel receives an RA PIO, nd6_prelist_update() will install an interface route and prefix list entry. A userland program, on the other hand, can install an interface route but cannot install a new prefix list entry directly. In my understanding the problem of DHCPv6-PD here is the latter. An easy workaround is to install an address with the delegated prefix to the interface on the client side because it also installs a new on-link prefix list entry.

So my question is whether we need a separation between a prefix and an interface route. Patch in D23695 assumes an interface route to allow NS/NA communications to update the neighbor cache (correct?) If we can assume an interface route also implies an on-link prefix, just installing an on-link prefix list entry upon installing an interface route is more reasonable to me than looking up the routing table because the current code uses the prefix list to determine if an address is a neighbor or not.

Already we always install an interface route when installing an on-link prefix list entry, so this assumption should not be surprising. But this idea does not work if we need a clear separation between a prefix list entry and an interface route. Do we have any reason to keep a prefix associated with an interface route away from the prefix list? And do we need a way to configure a prefix only as either a route or a prefix list entry?
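
The alternative raised in this comment, installing a prefix list entry whenever an interface route is installed, could look roughly like the sketch below. The `Host` class and its methods are hypothetical and only model the idea; they do not correspond to existing kernel interfaces.

```python
import ipaddress


class Host:
    """Toy model: interface routes are mirrored into the on-link prefix list."""

    def __init__(self):
        self.routes = {}          # prefix -> gateway (None for an interface route)
        self.prefix_list = set()  # prefixes treated as on-link

    def add_route(self, prefix, gateway=None):
        net = ipaddress.ip_network(prefix)
        self.routes[net] = gateway
        if gateway is None:
            # Assumption under discussion: an interface route implies on-link.
            self.prefix_list.add(net)

    def is_neighbor(self, addr):
        # The neighbor check stays a prefix-list scan; no routing lookup is needed.
        a = ipaddress.ip_address(addr)
        return any(a in net for net in self.prefix_list)


host = Host()
host.add_route("2a02:6b8:2::/64")         # interface route, mirrored
host.add_route("2001:db8::/32", "fe80::1")  # gatewayed route, not mirrored
print(host.is_neighbor("2a02:6b8:2::1"))  # True
print(host.is_neighbor("2001:db8::1"))    # False
```

The model leaves out removal on purpose: deciding when a mirrored entry may be deleted is where the separation question asked above starts to matter.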

In D23695#521734, @hrs wrote:

If we can assume an interface route also implies an on-link prefix, just installing an on-link prefix list entry upon installing an interface route is more reasonable to me than looking up the routing table because the current code uses the prefix list to determine if an address is a neighbor or not.

Yes, that's reasonable. It's the same logic commercial routers (e.g. Cisco) use: route to interface -> ND for those addresses.

Do we have any reason to keep a prefix associated with an interface route away from the prefix list?

Router advertisements are allowed to offer networks which are not on-link (even for the router). There are several cases of (broken) networks where ND does not even work for on-link prefixes. In such cases an ND proxy (e.g. ports/parpd) is needed anyway.

val_packett.cool added a subscriber: val_packett.cool. · Jun 10 2020, 11:01 AM

guyyur_gmail.com mentioned this in D15406: Separate ioctl address prefix management from RA prefix management as we have no API for controlling the latter. · Oct 3 2020, 7:55 PM

guyyur_gmail.com added a child revision: D15406: Separate ioctl address prefix management from RA prefix management as we have no API for controlling the latter.

guyyur_gmail.com mentioned this in D15405: Match IPv6 neighbor routes when they are marked with RTF_CONNECTED. · Oct 9 2020, 10:37 AM

guyyur_gmail.com mentioned this in D15404: Add and use new RTF_CONNECTED flag to mark connected routes.

roy_marples.name mentioned this in D26652: Implement SO_RERROR. · Jul 29 2021, 9:15 AM

emaste added a subscriber: emaste. · Jul 29 2021, 3:07 PM

There's been some healthy debate in the PR and several reviews, and the consensus seems to be that accepting these routes is prudent given real-world networks. @donner I am not sure I understand the second part of your last comment, but as a real-world use, @roy_marples.name needs this for dhcpcd.

As a reviewer, I'm looking at the code with ND6 lock concerns in mind, since that is the only area of this code I have previous experience with, and it looks correct. The test looks like it covers much of the rest.

This revision is now accepted and ready to land. · Aug 1 2021, 4:49 AM

crest_freebsd_rlwinm.de added a subscriber: crest_freebsd_rlwinm.de. · Jul 5 2022, 4:08 PM

Herald added a subscriber: glebius. · Jul 5 2022, 4:08 PM

@melifaro do you plan to pick this up again?

In D23695#521734, @hrs wrote:

In D23695#521499, @melifaro wrote:

In D23695#521458, @hrs wrote:

I have no strong objection to allow a prefix route with no gateway, but I think the case pointed out in Bug 194485 can be solved by just adding an address with the delegated prefix on the interface (EUI-64 always works as the interface id). Is there any specific reason for DHCP-PD (or another use case) to have an interface route?

Thank you for looking into this!

Indeed, I should have provided a more accurate description. The problem is a bit bigger than just DHCPv6.

In general, RFC 5942 advocates for the explicit split between address management and prefix assignment:

Yes, it is true that a configured address must not have any implication about prefix configuration.

Let me make myself clear. I am interested in how the application side tells on-link prefix information to the kernel. I think DHCPv6-PD is one of the typical cases, and another example is RA PIO. When the kernel receives an RA PIO, nd6_prelist_update() will install an interface route and prefix list entry. A userland program, on the other hand, can install an interface route but cannot install a new prefix list entry directly. In my understanding the problem of DHCPv6-PD here is the latter. An easy workaround is to install an address with the delegated prefix to the interface on the client side because it also installs a new on-link prefix list entry.

So my question is whether we need a separation between a prefix and an interface route. Patch in D23695 assumes an interface route to allow NS/NA communications to update the neighbor cache (correct?) If we can assume an interface route also implies an on-link prefix, just installing an on-link prefix list entry upon installing an interface route is more reasonable to me than looking up the routing table because the current code uses the prefix list to determine if an address is a neighbor or not.

Well, I certainly agree that there should be a userland ability to make a "valid" prefix. Adding IPv6-specific datastructures upon generic route insertion would be a bit hackish and I'd prefer to have it the other way round (IPv6-specific syscall requesting route table addition).
However, there is another aspect to consider: performance. Technically speaking, we have to do longest-prefix-match

Already we always install an interface route when installing an on-link prefix list entry, so this assumption should not be surprising. But this idea does not work if we need a clear separation between a prefix list entry and an interface route. Do we have any reason to keep a prefix associated with an interface route away from the prefix list? And do we need a way to configure a prefix only as either a route or a prefix list entry?

In D23695#810834, @driesm wrote:

@melifaro do you plan to pick this up again?

Yep! It slipped through the cracks, will look into it in a couple of days.

Rewrite & sync with recent HEAD.

This revision now requires review to proceed. · Jul 10 2022, 1:55 PM

Harbormaster completed remote builds in B46351: Diff 107979. · Jul 10 2022, 1:55 PM

Let me make myself clear. I am interested in how the application side tells on-link prefix information to the kernel. I think DHCPv6-PD is one of the typical cases, and another example is RA PIO. When the kernel receives an RA PIO, nd6_prelist_update() will install an interface route and prefix list entry. A userland program, on the other hand, can install an interface route but cannot install a new prefix list entry directly. In my understanding the problem of DHCPv6-PD here is the latter. An easy workaround is to install an address with the delegated prefix to the interface on the client side because it also installs a new on-link prefix list entry.

So my question is whether we need a separation between a prefix and an interface route. Patch in D23695 assumes an interface route to allow NS/NA communications to update the neighbor cache (correct?) If we can assume an interface route also implies an on-link prefix, just installing an on-link prefix list entry upon installing an interface route is more reasonable to me than looking up the routing table because the current code uses the prefix list to determine if an address is a neighbor or not.

Already we always install an interface route when installing an on-link prefix list entry, so this assumption should not be surprising. But this idea does not work if we need a clear separation between a prefix list entry and an interface route. Do we have any reason to keep a prefix associated with an interface route away from the prefix list? And do we need a way to configure a prefix only as either a route or a prefix list entry?

We don't. RFC 5942 3.1 states that the prefix list can be populated via manual configuration. I'm totally fine with the idea of merging these, however it doesn't seem to be too straightforward. The current code doesn't have a good way of tracking the origin of such prefix routes (e.g. the interface route prefix is saved in ia and that's it), so handling the use case when a) a /64 route has been added to the interface, then b) an address within this route has been added and then c) the same prefix is received via RA is complicated. It looks like the current code needs some changes to address the combination of manual/automated configuration first.
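
To make the origin-tracking difficulty concrete, here is one possible shape for it, sketched in Python. It is purely illustrative and not a description of existing or planned code: the `PrefixList` class and the origin labels "route", "address" and "ra" are hypothetical.

```python
import ipaddress
from collections import defaultdict


class PrefixList:
    """Toy prefix list that remembers which sources announced each prefix."""

    def __init__(self):
        self._origins = defaultdict(set)  # prefix -> set of origin labels

    def add(self, prefix, origin):
        self._origins[ipaddress.ip_network(prefix)].add(origin)

    def remove(self, prefix, origin):
        net = ipaddress.ip_network(prefix)
        self._origins[net].discard(origin)
        if not self._origins[net]:
            # Drop the prefix only once no source refers to it any more.
            del self._origins[net]

    def is_onlink(self, addr):
        a = ipaddress.ip_address(addr)
        return any(a in net for net in self._origins)


plist = PrefixList()
plist.add("2001:db8:1::/64", "route")    # a) static /64 interface route
plist.add("2001:db8:1::/64", "address")  # b) address configured inside the prefix
plist.add("2001:db8:1::/64", "ra")       # c) same prefix learned from an RA

plist.remove("2001:db8:1::/64", "route")
print(plist.is_onlink("2001:db8:1::5"))  # True, two origins remain

plist.remove("2001:db8:1::/64", "address")
plist.remove("2001:db8:1::/64", "ra")
print(plist.is_onlink("2001:db8:1::5"))  # False
```

Without some record of origins, removing the static route in step a) would be indistinguishable from removing the prefix for every source, which is the kind of case the comment describes as complicated.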


Regardless of the merging state, I'm going to commit the current patch tomorrow, August 10. It sounds pretty sane to me to consider all addresses that are directly reachable as ND neighbour candidates. It also simplifies the current logic and allows removing the last use of the rib_lookup_info() KPI.

This revision was not accepted when it landed; it landed in state Needs Review. · Aug 10 2022, 2:20 PM

Closed by commit rGf998535a66b9: netinet6: allow ND entries creation for all directly-reachable (authored by melifaro).

This revision was automatically updated to reflect the committed changes.

melifaro added a commit: rGf998535a66b9: netinet6: allow ND entries creation for all directly-reachable.

melifaro added a commit: rG95b5ff22a93c: netinet6: allow ND entries creation for all directly-reachable. · Jan 13 2023, 9:27 PM