Page MenuHomeFreeBSD

IPv4: experimental changes to allow net 0/8, 240/4, part of 127/8
ClosedPublic

Authored by karels on Jul 6 2022, 10:04 PM.
Tags
None
Referenced Files
F102434420: D35741.diff
Tue, Nov 12, 5:37 AM
Unknown Object (File)
Sun, Nov 10, 6:25 PM
Unknown Object (File)
Sat, Nov 9, 12:54 PM
Unknown Object (File)
Sat, Nov 9, 1:39 AM
Unknown Object (File)
Tue, Nov 5, 9:14 PM
Unknown Object (File)
Tue, Nov 5, 9:14 PM
Unknown Object (File)
Tue, Nov 5, 9:14 PM
Unknown Object (File)
Tue, Nov 5, 9:14 PM

Details

Summary

Combined changes to allow experimentation with net 0/8 (network 0),
240/4 (Experimental/"Class E"), and part of the loopback net 127/8
(all but 127.0/16). All changes are disabled by default, and can be
enabled by the following sysctls:

net.inet.ip.allow_zeronet=1
net.inet.ip.allow_experimental=1
net.inet.ip.loopback_mask=0xffff0000

When enabled, the corresponding addresses can be used as normal
unicast IP addresses, both as endpoints and when forwarding.

The proposals motivating this experimentation can be found in

https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-0/01/
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-240/
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-127/

This review is not expected to be pushed as is; it is for comment
only at this time. The loopback_mask will probably be replaced
by a prefix length.

Test Plan

manual testing

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
This revision is now accepted and ready to land.Jul 7 2022, 10:01 AM
rgrimes added a subscriber: rgrimes.

Adding a reference to https://reviews.freebsd.org/D19316 for some historical discussion.

Adding network in case anyone missed this. I plan to update the review with a version with a sysctl to set the loopback prefix length rather than mask later today. Although I had planned to separate these changes into 2 or 3 commits, they won't really be independent because they overlap. Thus, the next version will be a candidate for submission.

change sysctl from loopback_mask to loopback_prefix

This revision now requires review to proceed.Jul 8 2022, 12:47 PM

This change lacks following parts:

  • documentation in share/man/man4/inet.4
  • improving examples in firewall configs libexec/rc/rc.firewall (see around BAD_ADDR_TBL)

Can we imagine a better name for net.inet.ip.allow_experimental? You can't guess what it means without going into documentation. Maybe experimental_class_e?

sys/net/vnet.h
68

Why is this change in here?

sys/netinet/in.c
280

I'd suggest first check the boolean, then check the mask.

sys/netinet/in.h
383

IMHO, this define should go away. Not used anywhere and with proposed changes is no longer correct.

This change lacks following parts:

  • documentation in share/man/man4/inet.4

Hmm, looks like only 7 of 38 current net.inet.ip sysctls are documented there. Not sure what to do about that... I'll look at this more.

  • improving examples in firewall configs libexec/rc/rc.firewall (see around BAD_ADDR_TBL)

Thanks, I never noticed /etc/rc.firewall. Looks like it needs a moderate amount of customization in most cases, although some can be done via rc.conf. I wonder how widely this is used? Some of the implied changes might be possible by checking the sysctls and skipping the relevant rule additions or modifying the loopback rules. Seems like this should be a separate project.

Can we imagine a better name for net.inet.ip.allow_experimental? You can't guess what it means without going into documentation. Maybe experimental_class_e?

Do you mean allow_experimental_class_e? Seems a bit long, but might be better. Other suggestions welcome.

sys/net/vnet.h
68

It was required to get in.c to compile with vnet.h. It could be in a separate commit, although it would not make vnet.h self-sufficient.

sys/netinet/in.c
280

Given the default setting of the boolean, that would require both conditions to be evaluated essentially always; seems like a pessimization.

sys/netinet/in.h
383

I'd be nervous that IN_BADCLASS was used somewhere, e.g. ports. I see that it is also in NetBSD's in.h at least. If it were dynamic, I'd make it "false" when allow_experimental was set, but that doesn't work for user level.

I have in the past hidden this kind of early-draft-status code behind #ifdef EXPERIMENTAL .
Having them marked with #ifdef usually makes it easier to leave them in or rip them out later if the drafts don't go forward as sometimes the code is still valuable for research and experiments.
I see you have sysctls which by default do not change the status quo; I am quite fine with that; put the comment somewhere which draft they are based on.

I would advise against changing any other infrastructure in the tree (such as firewall configurations) at this point. People who turn this on will know what to do. Or in classic Mark Crispin terms "I accept the risk!" ;-)

I think the only real issue I can quickly think is that in the past FreeBSD users have used addressed out of 127/8 for adding a loopback IP to classic IP-jails based on certain schemes.

sys/net/vnet.h
68

What was the error that cannot be fixed in the in.c code itself? It seems wrong here.

sys/netinet/in.c
104

I am almost tempted to add "according to draft-schoen-intarea-unicast-240"; if not there as a comment above it.

Similar for the other ones. We may have to update the names at some point if the drafts don't stay individual but that's okay.

sys/netinet/in.h
427

Maybe not "experimental"; may call them by their blocks? ip_allow_net0, ip_allow_net240 or ip_allow_class_e or something... And same for the sysctl names then.

sys/net/vnet.h
68

Correction: in.c compiles without this include, but other users of in.h fail if it includes vnet.h, such as nlm/sm_inter_xdr.c.

sys/net/vnet.h
68

Well, that file clearly violates FreeBSD expectations (being generated or not). The problem needs to be fixed there.

style.9:

Kernel include files (sys/*.h) come first.  If <sys/cdefs.h> is needed
for __FBSDID(), include it first.  If either <sys/types.h> or
<sys/param.h> is needed, include it before other include files.
(<sys/param.h> includes <sys/types.h>; do not include both.) Next,
include <sys/systm.h>, if needed.  The remaining kernel headers should be
sorted alphabetically.

and nlm/sm_inter_xdr.c.

#include <nlm/sm_inter.h>
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

and them all the way down to sys/rpc/rpc.h and from there to netinet/in.h L423. Sad this is all so IPv4-only stuff still .. but that's a different problem.

sys/netinet/in.h
424

HERE ... you need sys/param.h in all files includeing this or just right above this line and not in vnet.h. Though I am not sure what @imp 's "header files should be self-contained" thinks about all this...
In theory you make me now wonder why these don't go into in_var.h?

sys/netinet/in.h
424

I did make -i and got 16 errors; some might be duplicates due to modules. iwlwifi was among them, via an ipv6.h include in linuxkpi.

I was thinking of imp's recent statement when putting the fix into vnet.h. But I'm not sure what in_var.h has to do with this; fewer things need in_var.h than in.h.

427

I like the numeric versions; any objections from anyone else?

sys/netinet/in.h
424

I am just wondering if the VNET_DELCARE and #define actually belong into in_var.h as they have nothing to do with the protocol spec itself and in_var.h is where the other VNET specifics for "in" are...
I guess the problem is IN_LOOPBACK() now... half of the consumers get it one way or the other...

Updates per review comments

Change sysctls and variables to allow_net0 and allow_net240.

Add references to internet-drafts as comments.

Add IN_LOOPBACK_MASK_DFLT definition to in.h (kernel only), and use
in in.c.

Mention vnet.h change in commit message.

I like the numberical ip_allow_netX version as well. I also agree that changing rc.firewall at this time is premature.

This revision is now accepted and ready to land.Jul 10 2022, 4:35 PM

I have changes for inet.4. I put them into a separate review, https://reviews.freebsd.org/D35765, as I think I'd like to do some reorganization (use sysctl names uniformly, rather than macro names for the oldest ones), sort alphabetically, add other missing variables worth documenting, and then add these.

I have changes to auto-update /etc/rc.firewall based on sysctl values, but I think I agree with @bz: users who experiment with this should be able to fend for themselves.

Are there any other unresolved issues?

glebius requested changes to this revision.Jul 10 2022, 4:53 PM

Please add documentation into this review. I'm writing down a longer write up in D35765 right now...

This revision now requires changes to proceed.Jul 10 2022, 4:53 PM

Add changes to inet.4 to this review.

Same changes as D35765, plus one additional sentence for
loopback_prefix about values other than 8 being experimental.

This revision is now accepted and ready to land.Jul 11 2022, 2:49 PM
In D35741#811409, @bz wrote:

I think the only real issue I can quickly think is that in the past FreeBSD users have used addressed out of 127/8 for adding a loopback IP to classic IP-jails based on certain schemes.

I dimly remember reading, 25-30 years ago, that some low-stratum NTP servers used 127.0.1.0/24 for their atomic clocks, but that's admittedly a corner case even if they use(d) FreeBSD.

share/man/man4/inet.4
293

Maybe loopback_prefixlen to be more accurate? (Needs code changes too if done.)

In D35741#811409, @bz wrote:

I think the only real issue I can quickly think is that in the past FreeBSD users have used addressed out of 127/8 for adding a loopback IP to classic IP-jails based on certain schemes.

I dimly remember reading, 25-30 years ago, that some low-stratum NTP servers used 127.0.1.0/24 for their atomic clocks, but that's admittedly a corner case even if they use(d) FreeBSD.

Close. ntp never used 127.x addresses to talk to each other. For a while, though, stratum 1 clocks used that as the reference for different types of time source in a 4-byte field that often would also have the IP address of the upstream for stratum 2 and greater clocks. That was changed in the last 90s / early 2000s though I believe.

rgrimes added inline comments.
sys/netinet/in.c
111

Not sure why your referencing a specific (and now obsolete) version of this draft, the canonical reference should be:
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-0 (you can remove the trailing slash, I believe that to be an artifact of the ietf's web server rewrite rules, and you'll get the proper page without it).

Also above in draft-schoen-intarea-unicast-240 reference, and below in draft-schoen-intarea-unicast-127.

share/man/man4/inet.4
293

That sounds fine to me. Any other votes before I make the change?

sys/netinet/in.c
111

Those were the hits I got from the IETF search for "draft-schoen"; at least they all point to the version summary. I can change if I'm making updates though.

Q: that’s the current thoughts on making the behavior default one?

share/man/man4/inet.4
293

I’d second the ‘loopback_prefixlen’ thingy. Something mentioning prefixlen in the nsme is desired to make it self-explanatory and the proposed name seem to be a good variant.

Q: that’s the current thoughts on making the behavior default one?

I'm not sure what you mean; the defaults on allow_net*? I think I'd prefer to go with unchanged behavior by default for now; maybe change the defaults depending on testing, and the progress of the internet-drafts.

Q: that’s the current thoughts on making the behavior default one?

I would be "OK" with ip_allow_net0 and ip_allow_net240 being altered to default AFTER the draft gets closer to being a reality, but at this early stage I would leave them as they presently are. The loopback prefix length is a slippery slope.

share/man/man4/inet.4
293

I do like the loopback_prefixlen more as well.

Change loopback_prefix to loopback_prefixlen.

Change sysctl name and documentation.

This revision now requires review to proceed.Jul 11 2022, 4:55 PM

I forgot to mention it, but this revision updates the I-D URLs in the comments and in the commit message.

In D35741#811738, @imp wrote:

I dimly remember reading, 25-30 years ago, that some low-stratum NTP servers used 127.0.1.0/24 for their atomic clocks, but that's admittedly a corner case even if they use(d) FreeBSD.

Close. ntp never used 127.x addresses to talk to each other. For a while, though, stratum 1 clocks used that as the reference for different types of time source in a 4-byte field that often would also have the IP address of the upstream for stratum 2 and greater clocks. That was changed in the last 90s / early 2000s though I believe.

That matches the timeline and may well be what I (mis)remembered. Objection withdrawn.

This revision is now accepted and ready to land.Jul 11 2022, 6:29 PM
In D35741#811738, @imp wrote:

I dimly remember reading, 25-30 years ago, that some low-stratum NTP servers used 127.0.1.0/24 for their atomic clocks, but that's admittedly a corner case even if they use(d) FreeBSD.

Close. ntp never used 127.x addresses to talk to each other. For a while, though, stratum 1 clocks used that as the reference for different types of time source in a 4-byte field that often would also have the IP address of the upstream for stratum 2 and greater clocks. That was changed in the last 90s / early 2000s though I believe.

That matches the timeline and may well be what I (mis)remembered. Objection withdrawn.

I just looked more closely at the current ntpd sources. I may have been in error when it comes to it being removed. I still see a lot of it there, and in guides on the net for how to setup NTPd.

127.127.x.y was used as a pseudo address as well. 127.127.1.0 is used for normal hosts that aren't stratum 1 and X = refclock number, Y = 0 (or unit number) for stratum 1 clocks that get their time from a reliable source of truth.

So there's still some specialness in NTP, but they aren't for actual exchange of IP packets.

I don't think it affects anything, but it may be something to be mindful of should this progress past the 'experimental' phases. Though maybe not in scope for the FreeBSD project, as it would be an impedance mismatch between NTP and IP specs for everybody (though maybe just the reference implementation does the 127 hack, I've not looked closely at full NTP specifications to see if this is enshrined there or not).

In D35741#811851, @imp wrote:
In D35741#811738, @imp wrote:

I dimly remember reading, 25-30 years ago, that some low-stratum NTP servers used 127.0.1.0/24 for their atomic clocks, but that's admittedly a corner case even if they use(d) FreeBSD.

Close. ntp never used 127.x addresses to talk to each other. For a while, though, stratum 1 clocks used that as the reference for different types of time source in a 4-byte field that often would also have the IP address of the upstream for stratum 2 and greater clocks. That was changed in the last 90s / early 2000s though I believe.

That matches the timeline and may well be what I (mis)remembered. Objection withdrawn.

I just looked more closely at the current ntpd sources. I may have been in error when it comes to it being removed. I still see a lot of it there, and in guides on the net for how to setup NTPd.

127.127.x.y was used as a pseudo address as well. 127.127.1.0 is used for normal hosts that aren't stratum 1 and X = refclock number, Y = 0 (or unit number) for stratum 1 clocks that get their time from a reliable source of truth.

So there's still some specialness in NTP, but they aren't for actual exchange of IP packets.

I don't think it affects anything, but it may be something to be mindful of should this progress past the 'experimental' phases. Though maybe not in scope for the FreeBSD project, as it would be an impedance mismatch between NTP and IP specs for everybody (though maybe just the reference implementation does the 127 hack, I've not looked closely at full NTP specifications to see if this is enshrined there or not).

Yes, ntp reference clock drivers are supposed to use this address range (How to Write a Reference Clock Driver and Reference Clock Support) as a pseudo or fictitious server.

One can install these drivers by selecting a driver in the ntp port using make config, building and installing a custom ntp. (No, we cannot enable them all in base or in the port because the drivers do conflict with each other, thus it is left to the sysadmin to select the ONE needed driver.)

sys/netinet/in.c
105

Normally we put comments above not below. And as indicated I'd only use the draft name (you already removed versions); the URLs in the past at least weren't long-term stable for drafts.

sys/netinet/in.c
105

In this case, I thought it was clearer to put the comment next to the description. I'd put the comment to the right, but there isn't room.

I'm not sure what you are saying about the URL. Are you saying that only the draft-schoen-* name should appear? That would seem to be missing information; seems like we'd also need to mention the site. But these URLs point to all the recent versions.

Any other comments on the references in the code? Any approvals for this version from the kernel/networking folks?

bjk added inline comments.
sys/netinet/in.c
105

Bjoern is saying that only the draft-schoen-* name should appear. I would prefer the URL to the draft's page (as in the current revision) but agree with Rod that linking to a specific version is probably a bad idea at this point.
The historical issues with URLs being not-long-term-stable I assume relates to the old practice of actually removing drafts from the main repository when they expire (after 6 months), but that has not been practice for quite some years, and there is general understanding that the value of having the old drafts available and stable references outweighs the "forced expiration".