This is the root of a set of patches to clean up code that should be using the IN_foo macros from sys/netinet/in.h. This particular part of the patch set opens these ranges up for some experiments being conducted; it should not be committed as is.
Event Timeline
I don't understand the purpose of this review. As it says, these changes should not be committed. I'm not actually sure of the official status of Class E, although I suspect it is still experimental/not for production use.
I suspect one will find an answer here:
https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml
The changes will need expansion; the present form exists to make a patch available so that the other patches can be tested.
A walk through the RFCs says a few things. One is that it is not called Class E anymore. It is still reserved.
That source just cites RFC 1112, section 4: Class E IP addresses, i.e., those with "1111" as their high-order four bits, are reserved for future addressing modes.
Why it is called IN_EXPERIMENTAL is unclear at this point in time. There are other RFCs that talk about 240/4, including some drafts.
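For readers following along, the "1111 in the high-order four bits" rule can be sketched as a mask-and-compare test. This is only an illustration: the EX_ names are hypothetical (chosen so they do not clash with anything in sys/netinet/in.h), and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four octets. */
#define EX_ADDR(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* 240.0.0.0/4: base address and mask are both 0xf0000000. */
#define EX_240_NET   UINT32_C(0xf0000000)
#define EX_240_MASK  UINT32_C(0xf0000000)

/* True when the four high-order bits of a host-order address are 1111. */
#define EX_IN_240_4(i) ((((uint32_t)(i)) & EX_240_MASK) == EX_240_NET)

/* Small usage check; note that 255.255.255.255 also matches the /4. */
void
ex_240_selfcheck(void)
{
    assert(EX_IN_240_4(EX_ADDR(240, 0, 0, 1)));
    assert(EX_IN_240_4(EX_ADDR(255, 255, 255, 255)));
    assert(!EX_IN_240_4(EX_ADDR(239, 255, 255, 255)));
}
```

Because the limited broadcast address falls inside the same /4, any code that opens the range still has to treat 255.255.255.255 separately.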
It looks like I am the guilty party in calling Class E IN_EXPERIMENTAL, which went in when "Class D" became multicast. I suspect it was actually called that in some document, although it doesn't really matter. https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml doesn't call it Class E, but refers to RFC 1112, which does. In any case, assuming that https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml is still current, the treatment is the same.
From the history department...
There's an attempt under way to finally make the former IN_EXPERIMENTAL range usable, and to get that past the IETF, it helps to have two interoperating implementations. Linux has most of the stuff needed already.
PS @karels a pleasure to meet you finally. I have re-read "congestion avoidance and control" and zillions of cites as part of the "bufferbloat" effort many times in the past 8 years. I'd love to talk to you about that period of history at some point, but not on this bug report.
Thanks @rgrimes! Hi Mike @karels! John Gilmore here...
Unicast is the crown jewel of the Internet Protocol. All the other kinds of addresses are in the noise. With unicast IPv4 addresses going for $20 apiece, it's time that we reduce that price by adding some supply from the "reserved for future use" parts of the address space. There isn't going to be any bizarro future use of reserved IPv4 addresses, other than for ordinary unicast addresses. These changes would free up BILLIONS of dollars worth of IPv4 addresses for future use. There is no reason to wait; it will take years for the changes to trickle into enough parts of the Internet that ordinary users could use these addrs, so if you think having more unicast addrs is a positive change, we should start now. Indeed, we should have started 10 years ago.
An IETF effort 10 years ago by Eliot Lear and Paul Wilson caused Linux, IOS, and Android to all support 240/4 as unicast, even though IETF never approved the drafts. We recently tested all three; they work. The interesting part is that with full support in Linux and Apple OSes for over 10 years, NOBODY NOTICED! That is because you actually have to configure an interface to these addresses, or make routes for them, to notice, which nobody anywhere ever did except in testbeds. So enabling these ranges as unicast addrs is one of the more harmless changes (it turns a hardcoded error into a softcoded error due to no matching routing table entry).
Nobody even noticed for 30 years that we were reserving an entire /8 for a single reserved address (0.0.0.0) since RFC 1122 in 1989 deprecated the 0.x.y.z special addresses that didn't work in LANs but only in WANs like the Arpanet. Ditto for 127/8, though application usage of multiple loopback addrs in the meantime suggests that we should keep 65,000 of them as loopbacks and only reclaim the other 255/256ths of the 16 million addresses. The Internet standards banished the Class A/B/C distinctions in the 1990s in favor of CIDR, but there's still a mess of code in BSD that didn't get cleaned up to be classless; these patches are doing most of that cleanup.
I'm working with Dave Täht on this unicast expansion; he's coordinating the tech work of testing and patching in various platforms and routing implementations. Since IETF v4-vs-v6 politics derailed it last time, we're trying to have very capable running code before seeking rough consensus. We are sorting out implementation issues that arise (like whether all of 240/4 and 255/8 except 255.255.255.255 becomes routable, or whether we have to break up the /4 and the /8 and keep the last /24 to always remain not routable). We plan to run a Network Telescope listening in for any existing traffic, and testing global reachability, for years, long before trying to have "the authorities" like ICANN or IANA allocate these addresses for production use. But Step 1 is to have it work in multiple diverse implementations; Step 2 is to make sure routing is forward and backward compatible; Step 3 is to announce a route and listen to these ranges with a Telescope. Step 4 is to report to IETF on what we found out. Then the politics may become tractable to make the benefit of cheaper unicast addresses real for all Internet users.
Hi, John! Very, very long time no (anything). I am very happy to see tests in this area. I certainly expect this to work with minimal changes (i.e. this one). Re: Class A/B/C: I'd be quite happy to see those definitions go away. Re: "loopback net": I'm aware of a small number of things using that other than 127.0.0.1 (including one of my company's legacy products), but they are minor. I still remember driving a Sun workstation crazy by pinging 127.0.0.2 (there was a network route, but obviously only one address recognized).
In my first-pass audit of the search results for IN_CLASS[A-E], the common use case is the default calculation of netmasks. Two minutes ago I would have told you that probably cannot go away, but it just hit me: an IN_DEFAULTMASKFOR(i) could be written that would incorporate the ancient class rules in a central place and clean up some code. Thoughts?
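A rough sketch of what such a central helper might look like, written as a function rather than a macro for readability. The name ex_default_mask is hypothetical, addresses are assumed to be in host byte order, and returning 0 for addresses outside the old A/B/C space is an assumption made for this illustration, not existing FreeBSD behavior.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration of the idea behind IN_DEFAULTMASKFOR(i):
 * keep the historical class rules in exactly one place.
 * Returns the classful default netmask for a host-order address,
 * or 0 when the address has no classful default (old Class D/E).
 */
uint32_t
ex_default_mask(uint32_t addr)
{
    if ((addr & UINT32_C(0x80000000)) == 0)
        return (UINT32_C(0xff000000));   /* leading 0: old Class A, /8 */
    if ((addr & UINT32_C(0xc0000000)) == UINT32_C(0x80000000))
        return (UINT32_C(0xffff0000));   /* leading 10: old Class B, /16 */
    if ((addr & UINT32_C(0xe0000000)) == UINT32_C(0xc0000000))
        return (UINT32_C(0xffffff00));   /* leading 110: old Class C, /24 */
    return (0);                          /* no classful default */
}

/* Small usage check with well-known private addresses. */
void
ex_default_mask_selfcheck(void)
{
    assert(ex_default_mask(UINT32_C(0x0a000001)) == UINT32_C(0xff000000)); /* 10.0.0.1 */
    assert(ex_default_mask(UINT32_C(0xc0a80101)) == UINT32_C(0xffffff00)); /* 192.168.1.1 */
}
```

Callers that currently chain IN_CLASSA/IN_CLASSB/IN_CLASSC tests to guess a netmask could then call one helper, which would also make the remaining classful assumptions easy to find later.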
Re: "loopback net": I'm aware of a small number of things using that other than 127.0.0.1 (including one of my company's legacy products), but they are minor. I still remember driving a Sun workstation crazy by pinging 127.0.0.2 (there was a network route, but obviously only one address recognized.)
I am aware of several uses of some of these ranges; ntp, for one, uses them as addresses of local hardware clocks. 127.0.0.2 is usually the sink device on a Cisco, and there are varied uses within the eBGP community of low-numbered 127/8 addresses for DDoS mitigation, blackhole routes, bogon routes, route destinations for bogon traffic, etc. I am rather surprised that the FreeBSD jail system does not have a {127,${jailid}} setup instead of mapping the internal lo0 to something.
In my first-pass audit of the search results for IN_CLASS[A-E], the common use case is the default calculation of netmasks. Two minutes ago I would have told you that probably cannot go away, but it just hit me: an IN_DEFAULTMASKFOR(i) could be written that would incorporate the ancient class rules in a central place and clean up some code. Thoughts?
Rather than perpetuate the ancient class rules even in one macro, I would make macros that define both the full address and the netmask for each remaining reserved block (IN_LOOPBACK could remain, but IN_LOOPBACKNET would disappear since it requires knowledge of how to embed a Class A net number into an IP address). If anyplace needs those numbers, replace IN_LOOPBACKNET with IN_LOOPBACK_BASEADDR 0x7F000000 and IN_LOOPBACK_NETMASK 0xFFFF0000 (and use those in IN_LOOPBACK rather than hardcoding hex constants). Remove all the macros for CLASS[ABC]*. Fix up code to remove all CLASS uses, and only use the new names where appropriate. Then recompile the world including userspace, and fix whatever we missed.
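To make the suggestion concrete, here is a minimal sketch of a base-address-plus-netmask pair for the reduced loopback block. The two constants are the values given in the comment above; the EX_ prefix and the EX_ADDR helper are hypothetical and only keep this example from clashing with the real header. Addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four octets. */
#define EX_ADDR(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Full address and netmask for the block, with no class knowledge needed. */
#define EX_IN_LOOPBACK_BASEADDR  UINT32_C(0x7f000000)  /* 127.0.0.0 */
#define EX_IN_LOOPBACK_NETMASK   UINT32_C(0xffff0000)  /* /16 */

/* The membership test is built from the two constants, not hardcoded hex. */
#define EX_IN_LOOPBACK(i) \
    ((((uint32_t)(i)) & EX_IN_LOOPBACK_NETMASK) == EX_IN_LOOPBACK_BASEADDR)

/* Small usage check: 127.0/16 stays loopback, the rest of 127/8 does not. */
void
ex_loopback_selfcheck(void)
{
    assert(EX_IN_LOOPBACK(EX_ADDR(127, 0, 0, 1)));
    assert(!EX_IN_LOOPBACK(EX_ADDR(127, 1, 0, 1)));
}
```

With this shape, narrowing or widening the block later means changing one netmask constant rather than hunting for hand-rolled comparisons.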
I agree with John that we should do everything we can to eliminate class A/B/C rather than just shuffle it around. Removing those macros will cause some turmoil; e.g. libc's inet_netof and inet_lnaof assume classes, and don't know about local netmasks. They should probably go away. I don't know how painful that will be. But we are way overdue to get rid of this stuff, and it appears that Linux (Ubuntu) doesn't have them. But the old A/B/C are a somewhat different problem than making Class E usable.
But I will also note that ip_output() won't emit packets on the "loopback" net (/8) to a non-loopback interface at the moment, citing RFC1122. The RFC says that, although it isn't clear about the mask. But that RFC is also pre-CIDR and is fully invested in Class A/B/C. ip_output() should be changed to correspond with what you are doing here. in_canforward() too. Removing IN_LOOPBACKNET would help smoke out anything else like this.
20-something-odd years ago I would have agreed with you, as that would have been the time to do it... IPv4 is dying faster than FreeBSD, and while it would be nice to leave the history books with the correct terminology, I'm just not sure it's worth it. And everyone who's still trying to make the /4 usable is clearly living in the past, trying to push back the unavoidable. I would really prefer FreeBSD to be an OS of the present and future and not of the past.
Some of the patches I have already posted do in fact remove the direct use of IN_CLASSA by using the proper macro, IN_LOOPBACK. I shall keep an eye out as I look over the code for issues that affect my stated goal of cleaning things up such that IN_ZERONET, IN_LOOPBACK, and IN_EXPERIMENTAL cover all places that deal with these ranges, should they ever be redefined.
But the old A/B/C are a somewhat different problem than making Class E usable.
Agreed, but not totally, in that hand-rolled code such as IN_CLASSA && = 0x7f000001 is often applied instead of using the IN_LOCALNET macro; see D19317.
But I will also note that ip_output() won't emit packets on the "loopback" net (/8) to a non-loopback interface at the moment, citing RFC1122. The RFC says that, although it isn't clear about the mask. But that RFC is also pre-CIDR and is fully invested in Class A/B/C. ip_output() should be changed to correspond with what you are doing here. in_canforward() too. Removing IN_LOOPBACKNET would help smoke out anything else like this.
D19317 already cleans up ip_forward, canforward, and ip_output by using the proper macros from in.h.
Hi, I'm working on this topic and I wanted to give an update on a bunch of things, most of them from earlier today.
- I built a FreeBSD kernel with this patch under FreeBSD 13 and was able to get it to operate (ARP, ICMP, TCP) on a local network segment with a patched Linux machine (for each of the 240/4, 127/8, and 0/8 address ranges). Yay!
- I'm still going to do more tests where the patched hosts are separated by a router, in case there are any interactions with routing logic that need to be shaken out.
- Another kind of reserved address that we're working on unreserving is the lowest-address additional broadcast address (for example, .0 in a /24). This patch doesn't currently do that, but I made another patch against sys/netinet/in.c's definition of in_ifaddr_broadcast() which was then also able to interoperate with a patched Linux system. After I also do some tests with a router, I'll suggest that @rgrimes include that functionality with this patch. (FreeBSD currently designates and displays the highest address on a local network segment -- or another address explicitly specified with ifconfig -- as "the" broadcast address. However, sys/netinet/in.c has an additional check for an address equal to the subnet address, which is then automatically treated as broadcast as well, in compliance with the existing RFCs but to no particular benefit today.)
- This current version of the patch from @rgrimes also changes the behavior of ping 127.0.0.2, which results in an error on an unpatched system but a *different* error on a patched system. The stock system gives "No route to host", while the patched system gives "Can't assign requested address". Having only previously explored the behavior of addresses in 127/8 on Linux, I don't know what FreeBSD's preferred behavior for the other addresses is, but it seems like the change was still unintentional. Addresses outside 127.0/16 do, as expected, become routable and can be used to refer to a non-local host.
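The lowest-address check described in the list above can be sketched as follows. This is not the code of in_ifaddr_broadcast(); the function name, its parameters, and the broadcast_lowest toggle are all hypothetical, addresses are in host byte order, and skipping /31 and /32 (which have no broadcast address) is an assumption added so the example behaves sensibly on point-to-point masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four octets. */
#define EX_ADDR(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/*
 * Decide whether addr is a broadcast address on the subnet of ifaddr/mask.
 * The highest address is always broadcast; the lowest address is treated
 * as broadcast only when broadcast_lowest is true (the historical rule).
 */
bool
ex_is_subnet_broadcast(uint32_t addr, uint32_t ifaddr, uint32_t mask,
    bool broadcast_lowest)
{
    uint32_t net = ifaddr & mask;       /* lowest address in the subnet */
    uint32_t highest = net | ~mask;     /* highest address in the subnet */

    if (mask >= UINT32_C(0xfffffffe))   /* /31 or /32: no broadcast */
        return (false);
    if ((addr & mask) != net)           /* not on this subnet at all */
        return (false);
    if (addr == highest)
        return (true);
    return (broadcast_lowest && addr == net);
}

/* Small usage check on 192.0.2.10/24. */
void
ex_broadcast_selfcheck(void)
{
    uint32_t ifa = EX_ADDR(192, 0, 2, 10), m = UINT32_C(0xffffff00);

    assert(ex_is_subnet_broadcast(EX_ADDR(192, 0, 2, 255), ifa, m, false));
    assert(ex_is_subnet_broadcast(EX_ADDR(192, 0, 2, 0), ifa, m, true));
    assert(!ex_is_subnet_broadcast(EX_ADDR(192, 0, 2, 0), ifa, m, false));
}
```

Turning the toggle off is what frees the .0 address for ordinary unicast use while leaving the highest-address broadcast untouched.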
Hello again to @rgrimes and Mike @karels!
Seth and I at the IPv4 Unicast Extension Project are happy that the latest Linux 5.14 kernel now includes our tiny improvement repurposing the lowest address in each subnet as a unicast address (rather than as a second broadcast address). See https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/net/ipv4?id=94c821c74bf5fe0c25e09df5334a16f98608db90 . In addition, we have an Internet-Draft on that topic under consideration in the Intarea working group at IETF: https://datatracker.ietf.org/doc/draft-schoen-intarea-lowest-address/ . Since 4.2BSD's original broadcast behavior is what caused this historical bobble, it would be especially fitting if modern BSD maintainers wanted to participate in improving on it. This little innocuous change adds one extra usable IPv4 address in every globally routable subnet -- tens of millions of newly usable IPv4 addresses in total.
We also have another Internet-Draft in preparation about repurposing 240/4 as unicast addresses, a third about repurposing 0/8 as unicast, and a fourth about reducing 127/8 (which was invented by Bill Joy or Sam Leffler at CSRG in 1981) from 16 million loopback addresses to only 65,536.
I wonder if either or both of you would be willing to sign on as co-authors of any of these drafts, indicating your interest in having the Internet community adopt these minor code improvements. They would not only clean up historic detritus from the early days of BSD research, but would also have a positive impact on users facing the ongoing issue of rising IPv4 address prices.
Hi, John. I like the idea of allowing the "zero" address, as that requires only local changes to support. I'd be happy to collaborate on that, including the I-D and any FreeBSD changes. It's not obvious that any are required, but I'll test it. I've wondered if my DSL modem would support it, as I have a /29.
I'm also happy to kibitz on the other changes.
btw, I objected to the "nonstandard broadcast" terminology in RFC1122, as there was no standard before that. 4.2BSD used 0 because it started out on 3 Mb/s Ethernet, where address 0 was broadcast and link-layer addresses were just the low octet of the IP address.
Should we maybe move this discussion to a mailing list? If there is no other good alternative, I could host one.
Hi, that change looks a lot like one I did yesterday. I used the opposite default for the sysctl; I was surprised that we are still broadcasting packets to host 0, and I don't think that is useful for most people. btw, the sysctl probably should be per VNET. I'll probably put my change into a review soon for reference.
I created a review for my change for host 0 broadcast, https://reviews.freebsd.org/D31861. The code is very similar, although of course details came out different. It handles VNETs though, and the default is different. I haven't come up with a sysctl name that I really like.
When writing the I-D, we went through several name changes too. For a while we called it the "zero" or "zeroth" address in the subnet, but that either sounded pretentious or could easily be mistaken for (Class A) network number zero (0/8) which we also propose to allow for unicast use. (0/8 was also a historical method that got revised -- it was the first try at ARP, but couldn't handle Ethernet addresses because it only had 24 bits for the LAN address. It and two matching ICMP messages were deprecated in the '80s but it never got reallocated as an ordinary unicast address, so nobody has used it at all for 30+ years.) We settled on calling this second broadcast address the "lowest" address in the subnet, which is accurate, hard to confuse, and easy for everyone to understand. Perhaps broadcast_on_lowest as a sysctl name?