Page MenuHomeFreeBSD

netlink: make netlink the standard part of the kernel.
AbandonedPublic

Authored by melifaro on Mar 13 2023, 11:32 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 28, 3:18 PM
Unknown Object (File)
Dec 1 2024, 3:27 AM
Unknown Object (File)
Nov 6 2024, 3:29 PM
Unknown Object (File)
Oct 16 2024, 7:13 AM
Unknown Object (File)
Oct 16 2024, 7:12 AM
Unknown Object (File)
Oct 16 2024, 6:47 AM
Unknown Object (File)
Oct 4 2024, 5:36 AM
Unknown Object (File)
Oct 4 2024, 1:58 AM

Details

Reviewers
manu
emaste
cperciva
gbe
kp
Group Reviewers
network
Summary

At the initial introduction of Netlink, having it as a module had a lot of sense - as an experimental feature, it didn't have to be in the kernel.
It was also easier to iterate over changes by unloading/loading the module instead of rebooting the VM.

Currently, netlink is being extended to provide a wider range of APIs, including interface creation with custom properties. Such features require access to the private interface data (like vlan parent interface) that are not available externally. Addressing it requires netlink-specific code to be present in the interface modules themselves and inside the kernel cloner interface. Netlink can continue to be a custom kernel part or a module, but that would require exposing dozens of function pointers. I believe at the current stage it starts to add more complexity than benefits. Additionally, many cloner interfaces (vlan, lagg, bridge, tuntap, etc) will get netlink-related code and begin depending on netlink, further reducing the benefit of it being the standalone module.
Lastly, netlink by itself is not too big (~112k additional code).

With that in mind, I'd like to propose including Netlink to the main kernel as the standard part, with the goal of 14.0 being shipped with non-custom Netlink.
I'm not yet sure of what should be done to 13-S. I certainly prefer having fully-functional Netlink there and I want to retain the ability to merge new netlink-only functions there, but I'm less certain of making it a standard part. Probably, some sort of middle ground solution can be done, but it needs some thinking.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 50320
Build 47212: arc lint + arc unit

Event Timeline

Do whatever it needs with CURRENT but make sure that stable/13 can still be built without NETLINK code bloat.

This revision is now accepted and ready to land.Mar 13 2023, 12:45 PM

Should sys/modules/netlink be removed as well?

So, maybe there are other reasons to do this, but so far I'm not so convinced.

You make a decent argument that we shouldn't build netlink as a module. Looking at things quickly shows that indeed there are a lot of function pointers that are accessing private data. However, that suggests a need for more public interfaces to get this information that we're now making public with netgraph. So even if we don't build a module here, we'd need some way to get this that doesn't reach over as much into the private parts of things. The code that's there will have to get it somehow, even if the function pointers are eliminated.

Next, even things like INET are optional. We've coped well with that over the years, and so a good case hasn't been made that having this standard would help or having it optional would be an undue burden. Even in Linux it's an optional feature.

The size is on the cusp of being interesting. The smallest kernel we can create is on the order of 2-3MB (my numbers are a little old, but not that old), so saving 100k of code still represents a single digit % savings, which can be useful. So if we had to do it, it wouldn't be terrible, but I'm not sure a good case has been made that we have to do it.

This hasn't been more widely discussed, which also gives me pause to make a decision like this w/o more input from the wider networking community.

This patch doesn't remove the netlink module, which is the only thing that's well supported in the reasons put forth so far. One could manage the module with more module depends, which isn't terrible, but will get super tedious I fear as well.

We haven't even had a release yet with netlink, so that suggests this is premature. We get the bulk of new usage when a release happens, so moving to make it mandatory before we've been through that cycle to see if there's unknown issues when we deploy this at scale seems unwise. I'd also argue that it's a complete and total non-starter for stable/13: it goes against the stability guarantees, would break config files in a stable branch, etc.

So while this might be the right thing longer term, I remain extremely uneasy with doing it now, in just a code review, before we've acid-tested the code with a release cycle or two. Apart from eliminating the module, I'm uneasy about doing this before the 14 branch even, unless there can be a more compelling argument made.

So I have concerns...

sys/conf/files
4438

This is my heartache about the patch.

In D39047#888923, @imp wrote:

So, maybe there are other reasons to do this, but so far I'm not so convinced.

First of all, thank you for the feedback, it’s extremely valuable.
I think the most important thing is agreeing (or not agreeing) on the desired target state (Netlink being the default network control plane mechanism).
If there's an agreement on that, then working out the steps towards that is much easier.

You make a decent argument that we shouldn't build netlink as a module. Looking at things quickly shows that indeed there are a lot of function pointers that are accessing private data. However, that suggests a need for more public interfaces to get this information that we're now making public with netgraph. So even if we don't build a module here, we'd need some way to get this that doesn't reach over as much into the private parts of things. The code that's there will have to get it somehow, even if the function pointers are eliminated.

This is one approach, yes. I was considering going forward with it and did some work. I .. two things:

  1. there will be multiple kind of interfaces required. For example, adding vlan KPI is easy, but adding bridge, lag or carp is much harder. They are more complex, as they have more configuration knobs, dynamic peers, state, etc.. Basically, for each new netlink interface, one needs to implement two interfaces: netlink API and the virtual KPI.
  2. There is limited use of such KPI. For example, till recently, it was not possible to create a cloned interface with parameters inside the kernel.

With that in mind, I’m currently leaning to the approach of using netlink interfaces directly. It’ll also allow to automatically provide kernel KPIs for everything mentioned above - via netlink.

What do you think?

Next, even things like INET are optional. We've coped well with that over the years, and so a good case hasn't been made that having this standard would help or having it optional would be an undue burden. Even in Linux it's an optional feature.

Linux approach is slightly different. Netlink core(socket, family, genetlink framework) & rtnetlink is enabled if CONFIG_NET is enabled. Separate families such as firewall interfaces or some genetlink customers have their own config options.

The size is on the cusp of being interesting. The smallest kernel we can create is on the order of 2-3MB (my numbers are a little old, but not that old), so saving 100k of code still represents a single digit % savings, which can be useful.

Ack, that's a good datapoint.

So if we had to do it, it wouldn't be terrible, but I'm not sure a good case has been made that we have to do it.

I think it would help if someone could clearly define the goals of having define-based compilation for Netlink (or generally). Each define comes with an attached maintenance cost. For example, for the INET use case we have 1.3k of INET defines, 1.6k of INET6 defines and ~300 INET || INET6 defines.
It took quite a lot of my time (as a developer) over the years to maintain these in the code I’m adding or touching. Clearly understanding the achieved benefits here helps greatly, as it can be weighed agains the added complexity.
For example, 100k is too much, but is 50 okay or not? Should it be one compile option or multiple? Rtsock interface is compiled in unconditionally, should it be also converted or not? What are the considerations apart from the size concern?
I see multiple potential solutions that can address this, being on the same page w.r.t the desired goals helps to pick the best one (and reiterate if required):

Based on my current understanding, I'm thinking of the following:

  • Introduce "standard" netlink/netlink_glue.c, with shared function definitions, internally pointed to stubs.
  • Drivers (or other callers) have Netlink-specific code under the new "default-on" "NETLINK_SUPPORT" config opton.
  • Once Netlink module is loaded, the function pointers are switched from the stubs to the actualy implementation, but "netlink_glue.c" remains the primary kernel<>netlink interface
  • Userland uses Netlink by default but retains the rtsock/ioctl code, that can be enabled by the new "WITHOUT_NETLINK" src.conf knob.

This hasn't been more widely discussed, which also gives me pause to make a decision like this w/o more input from the wider networking community.

I’m happy to discuss it in any place with a reasonable signal-to-noise ratio.
(And this review is btw not the worst). If you suggest writing some to -arch and start the discussion there, I can do that as well.

This patch doesn't remove the netlink module, which is the only thing that's well supported in the reasons put forth so far. One could manage the module with more module depends, which isn't terrible, but will get super tedious I fear as well.

We haven't even had a release yet with netlink, so that suggests this is premature. We get the bulk of new usage when a release happens, so moving to make it mandatory before we've been through that cycle to see if there's unknown issues when we deploy this at scale seems unwise. I'd also argue that it's a complete and total non-starter for stable/13: it goes against the stability guarantees, would break config files in a stable branch, etc.

Yes, that is a good argument. I'll come up with a different approach once there's a some sort of consensus on the previous items in this discussion.

So while this might be the right thing longer term, I remain extremely uneasy with doing it now, in just a code review, before we've acid-tested the code with a release cycle or two. Apart from eliminating the module, I'm uneasy about doing this before the 14 branch even, unless there can be a more compelling argument made.

So I have concerns...

Let's discuss :-)

In D39047#888923, @imp wrote:

So, maybe there are other reasons to do this, but so far I'm not so convinced.

You make a decent argument that we shouldn't build netlink as a module. Looking at things quickly shows that indeed there are a lot of function pointers that are accessing private data. However, that suggests a need for more public interfaces to get this information that we're now making public with netgraph. So even if we don't build a module here, we'd need some way to get this that doesn't reach over as much into the private parts of things. The code that's there will have to get it somehow, even if the function pointers are eliminated.

Next, even things like INET are optional. We've coped well with that over the years, and so a good case hasn't been made that having this standard would help or having it optional would be an undue burden. Even in Linux it's an optional feature.

The size is on the cusp of being interesting. The smallest kernel we can create is on the order of 2-3MB (my numbers are a little old, but not that old), so saving 100k of code still represents a single digit % savings, which can be useful. So if we had to do it, it wouldn't be terrible, but I'm not sure a good case has been made that we have to do it.

This hasn't been more widely discussed, which also gives me pause to make a decision like this w/o more input from the wider networking community.

This patch doesn't remove the netlink module, which is the only thing that's well supported in the reasons put forth so far. One could manage the module with more module depends, which isn't terrible, but will get super tedious I fear as well.

We haven't even had a release yet with netlink, so that suggests this is premature. We get the bulk of new usage when a release happens, so moving to make it mandatory before we've been through that cycle to see if there's unknown issues when we deploy this at scale seems unwise. I'd also argue that it's a complete and total non-starter for stable/13: it goes against the stability guarantees, would break config files in a stable branch, etc.

So while this might be the right thing longer term, I remain extremely uneasy with doing it now, in just a code review, before we've acid-tested the code with a release cycle or two. Apart from eliminating the module, I'm uneasy about doing this before the 14 branch even, unless there can be a more compelling argument made.

So I have concerns...

Please take a look at the D39269 (add only a minimal part of Netlink KPI to the kernel) and D39148 (introduce WITH_NETLINK_SUPPORT option for userland). Do you think those two addresses your concerns?

First of, please accept apologies for my tardiness in replies... prepping for vacation and being on vacation has taken some time... Now things have settled down...

In D39047#888923, @imp wrote:

So, maybe there are other reasons to do this, but so far I'm not so convinced.

First of all, thank you for the feedback, it’s extremely valuable.
I think the most important thing is agreeing (or not agreeing) on the desired target state (Netlink being the default network control plane mechanism).
If there's an agreement on that, then working out the steps towards that is much easier.

Yes. And if this is to be the best / preferred way to get this data out of the kernel, then that also has a baring on the discussions.

You make a decent argument that we shouldn't build netlink as a module. Looking at things quickly shows that indeed there are a lot of function pointers that are accessing private data. However, that suggests a need for more public interfaces to get this information that we're now making public with netgraph. So even if we don't build a module here, we'd need some way to get this that doesn't reach over as much into the private parts of things. The code that's there will have to get it somehow, even if the function pointers are eliminated.

This is one approach, yes. I was considering going forward with it and did some work. I .. two things:

  1. there will be multiple kind of interfaces required. For example, adding vlan KPI is easy, but adding bridge, lag or carp is much harder. They are more complex, as they have more configuration knobs, dynamic peers, state, etc.. Basically, for each new netlink interface, one needs to implement two interfaces: netlink API and the virtual KPI.
  2. There is limited use of such KPI. For example, till recently, it was not possible to create a cloned interface with parameters inside the kernel.

With that in mind, I’m currently leaning to the approach of using netlink interfaces directly. It’ll also allow to automatically provide kernel KPIs for everything mentioned above - via netlink.

That makes sense. While it would be nice to have a generic API for all the things within the kernel, if netlink turns out to be that API and it is the the only thing that reaches into the private implementations is a good second best... Especially for those things that there's not another consumer of the information.

What do you think?

Next, even things like INET are optional. We've coped well with that over the years, and so a good case hasn't been made that having this standard would help or having it optional would be an undue burden. Even in Linux it's an optional feature.

Linux approach is slightly different. Netlink core(socket, family, genetlink framework) & rtnetlink is enabled if CONFIG_NET is enabled. Separate families such as firewall interfaces or some genetlink customers have their own config options.

The size is on the cusp of being interesting. The smallest kernel we can create is on the order of 2-3MB (my numbers are a little old, but not that old), so saving 100k of code still represents a single digit % savings, which can be useful.

Ack, that's a good datapoint.

So if we had to do it, it wouldn't be terrible, but I'm not sure a good case has been made that we have to do it.

I think it would help if someone could clearly define the goals of having define-based compilation for Netlink (or generally). Each define comes with an attached maintenance cost. For example, for the INET use case we have 1.3k of INET defines, 1.6k of INET6 defines and ~300 INET || INET6 defines.
It took quite a lot of my time (as a developer) over the years to maintain these in the code I’m adding or touching. Clearly understanding the achieved benefits here helps greatly, as it can be weighed agains the added complexity.
For example, 100k is too much, but is 50 okay or not? Should it be one compile option or multiple? Rtsock interface is compiled in unconditionally, should it be also converted or not? What are the considerations apart from the size concern?
I see multiple potential solutions that can address this, being on the same page w.r.t the desired goals helps to pick the best one (and reiterate if required):

Yea, as I said 100k is the approximate dividing line between 'interesting savings' and 'not worth the hassle'. What I didn't say, though, is that 100k and above it becomes a sliding scale. 100k, if it's trivial then yes, make it optional (don't to it for trivial convenience). But as the hassle increases, then the cost / benefit calculus shifts more towards inclusion always.

Based on my current understanding, I'm thinking of the following:

  • Introduce "standard" netlink/netlink_glue.c, with shared function definitions, internally pointed to stubs.
  • Drivers (or other callers) have Netlink-specific code under the new "default-on" "NETLINK_SUPPORT" config opton.
  • Once Netlink module is loaded, the function pointers are switched from the stubs to the actualy implementation, but "netlink_glue.c" remains the primary kernel<>netlink interface
  • Userland uses Netlink by default but retains the rtsock/ioctl code, that can be enabled by the new "WITHOUT_NETLINK" src.conf knob.

If this separation is easy to maintain, then having a 'null netlink' and a 'full up netlink' seems like it is a reasonable compromise. But only if that split isn't a huge burden on you. It sounds like not, but I thought I'd state the reasoning in case I misunderstand.

This hasn't been more widely discussed, which also gives me pause to make a decision like this w/o more input from the wider networking community.

I’m happy to discuss it in any place with a reasonable signal-to-noise ratio.
(And this review is btw not the worst). If you suggest writing some to -arch and start the discussion there, I can do that as well.

This patch doesn't remove the netlink module, which is the only thing that's well supported in the reasons put forth so far. One could manage the module with more module depends, which isn't terrible, but will get super tedious I fear as well.

We haven't even had a release yet with netlink, so that suggests this is premature. We get the bulk of new usage when a release happens, so moving to make it mandatory before we've been through that cycle to see if there's unknown issues when we deploy this at scale seems unwise. I'd also argue that it's a complete and total non-starter for stable/13: it goes against the stability guarantees, would break config files in a stable branch, etc.

Yes, that is a good argument. I'll come up with a different approach once there's a some sort of consensus on the previous items in this discussion.

So while this might be the right thing longer term, I remain extremely uneasy with doing it now, in just a code review, before we've acid-tested the code with a release cycle or two. Apart from eliminating the module, I'm uneasy about doing this before the 14 branch even, unless there can be a more compelling argument made.

So I have concerns...

Let's discuss :-)

Sure. I think I like the split, but if the split is a ton of work, I think this quick exchange has biased me towards accepting by default, or suggesting that a quick discussion on arch@ might be good. If for nothing else than to communicate the decision and give people a chance to ask questions and understand why/if it's the best one for the project.

So I like the split idea, but I'm offering you the 'out' of doing that if it is burdensome. Ideally, we'd not have a ton of ifdefs, and this sounds like a good way to avoid that... But I don't want you to have to do that if it only saves 10k of space.

Also, while it would be nice to have a global kill switch in case bad things pop up, there's other ways to mitigate that risk, including commitments of time / resources should issues come up as we move towards the stable/14 branch. I (and I think generally the project) prefers a modular approach when possible, but there's a practical side of that as well that understands that a long term hassle may not be worth the cost...

Different version of the proposed diff landed as 3091d980f581 .