Add modular routing lookup framework.
ClosedPublic

Authored by melifaro on Nov 28 2020, 10:42 AM.

Details

Summary
NOTE: in order to test the changes, you need to build kernel with options FIB_ALGO. Based against r368820.
NOTE: I'm going to commit this revision on December 23, unless I get any objections.

Problem statement

Currently FreeBSD uses radix (a compressed binary tree) to perform all unicast route manipulations, including lookups.
Radix requires storing the key length in each item, which allows it to work with sockaddrs and literally any address family.
This flexibility comes at a cost: radix is slow, cache-unfriendly and adds locking to the hot path.

There is an extremely high bar to switching radix to something else.
Some effort has been made to reduce the coupling, but radix is still closely tied to the rest of the system. Fixed locking semantics, the rtentry format and iteration assumptions restrict the list of possible solutions.

Algo overview

For small tables (VMs, potentially embedded systems), a class of read-only data structures can be used, as it is easy to rebuild the entire data structure from scratch on each change.

For large tables, there are far more effective algorithms tailored for IPv4 and IPv6 lookups. Algorithms like DIR24-8, Lulea or DXR use 3-5 bytes per prefix, compared to ~192 bytes in radix for large-table use cases.
They also limit lookup to 2-3 memory accesses (IPv4), while radix can be notably worse.
Some of the algorithms require complex update procedures, so using them assumes some sort of update batching to reduce the change overhead.
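
To illustrate why such schemes bound an IPv4 lookup to two memory accesses, here is a minimal user-space sketch of a DIR-24-8 style table. It is not the DPDK rte_lpm code: all names (dir24_8, d248_add, d248_lookup) and the limit on tbl8 groups are hypothetical, prefixes are assumed to be added shortest first, and deletion is not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define D248_EXT     0x8000u  /* tbl24 entry refers to a tbl8 group */
#define D248_NGROUPS 256u     /* hypothetical cap on tbl8 groups */

struct dir24_8 {
	uint16_t *tbl24;      /* 2^24 entries, indexed by the top 24 address bits */
	uint16_t *tbl8;       /* D248_NGROUPS groups of 256 entries each */
	uint32_t  ngroups;    /* tbl8 groups currently in use */
};

static struct dir24_8 *
d248_create(void)
{
	struct dir24_8 *t = calloc(1, sizeof(*t));

	if (t == NULL)
		return (NULL);
	t->tbl24 = calloc((size_t)1 << 24, sizeof(uint16_t));
	t->tbl8 = calloc((size_t)D248_NGROUPS * 256, sizeof(uint16_t));
	if (t->tbl24 == NULL || t->tbl8 == NULL) {
		free(t->tbl24);
		free(t->tbl8);
		free(t);
		return (NULL);
	}
	return (t);
}

/*
 * Map addr/plen to nexthop index nh (1..0x7fff; 0 means "no route").
 * Returns 0 on success, -1 when the tbl8 groups are exhausted, which is
 * the kind of failure where an algorithm would ask for a rebuild.
 */
static int
d248_add(struct dir24_8 *t, uint32_t addr, int plen, uint16_t nh)
{
	uint32_t first, count, i, slot, g;
	uint16_t e, *grp;

	assert(plen >= 0 && plen <= 32 && nh != 0 && nh < D248_EXT);
	if (plen <= 24) {
		/* Short prefix: expand into every tbl24 slot it covers. */
		count = 1u << (24 - plen);
		first = (addr >> 8) & ~(count - 1);
		for (i = first; i < first + count; i++)
			t->tbl24[i] = nh;
		return (0);
	}
	slot = addr >> 8;
	e = t->tbl24[slot];
	if ((e & D248_EXT) == 0) {
		/* First long prefix in this /24: allocate a tbl8 group. */
		if (t->ngroups == D248_NGROUPS)
			return (-1);
		g = t->ngroups++;
		for (i = 0; i < 256; i++)
			t->tbl8[g * 256 + i] = e;  /* inherit covering route */
		e = (uint16_t)(D248_EXT | g);
		t->tbl24[slot] = e;
	}
	grp = &t->tbl8[(uint32_t)(e & 0x7fffu) * 256];
	count = 1u << (32 - plen);
	first = (addr & 0xff) & ~(count - 1);
	for (i = first; i < first + count; i++)
		grp[i] = nh;
	return (0);
}

/* Longest-prefix match in at most two dependent memory reads. */
static uint16_t
d248_lookup(const struct dir24_8 *t, uint32_t addr)
{
	uint16_t e = t->tbl24[addr >> 8];                       /* read #1 */

	if (e & D248_EXT)
		e = t->tbl8[(uint32_t)(e & 0x7fffu) * 256 + (addr & 0xff)]; /* read #2 */
	return (e);
}
```

The price of the fixed lookup cost is the prefix expansion in d248_add(): a single /8 touches 65536 tbl24 slots, which is why update batching or full rebuilds are attractive for this class of algorithms.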

Goals

  • Lower the bar for introducing new route lookup algorithms
  • Make existing routing lookups for existing families fast and lockless

Proposed solution

Add a framework that allows lookup algorithms to be attached via kernel modules and used for dataplane LPM lookups.
As most of the effective algorithms perform one-way compilation of prefixes into custom data structures, their implementations rely on another "control-plane" copy of the prefixes to perform datastructure updates. This approach keeps radix as the “control plane” source of truth, simplifying the actual algorithm implementation. It also serves as the abstraction layer for the current routing code details such as requirements on lock ordering or control plane performance.

Algorithms

As a baseline, the following algorithms will be provided out of the box:
IPv4:

  • lockless radix (small number of routes, rebuilding on each change, in-kernel)
  • DPDK rte lpm (DIR24-8 variation, large-tables, kernel-module) (D27412)
  • "base" radix (to serve as a fallback, in-kernel)

IPv6:

  • lockless radix (small number of routes, rebuilding on each change, in-kernel)
  • DPDK rte lpm (DIR24-8 variation, large-tables, kernel-module) (D27412)
  • "base" radix (to serve as a fallback, in-kernel)

Implementation details

Framework takes care of handling initial synchronisation, route subscription, nhop/nhop groups reference and indexing, dataplane attachments and fib instance algorithm setup/teardown.

Retries

The framework is built to be resilient to failures. It explicitly allows an algorithm to request a "rebuild" if it is unable to perform an in-place modification. For example, a memory allocation may fail, or the algorithm or framework may run out of object indexes.
A rebuild simply builds a new algorithm instance, potentially fetching data from the old instance, and then switches the dataplane pointers.
This approach simplifies implementation of readonly datastructures and update batching.
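
The retry flow can be sketched in user space as follows. This is only an illustration under stated assumptions: the toy_* and fw_* names are hypothetical, the "algorithm" is a fixed-capacity array, and of the result codes only FLM_ERROR is mentioned in this review (FLM_SUCCESS and FLM_REBUILD are assumed here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Result codes: FLM_ERROR appears in this review, the others are assumed. */
enum flm_op_result { FLM_SUCCESS, FLM_REBUILD, FLM_ERROR };

/* Control-plane copy of the routes (the role radix plays in the framework). */
#define TOY_RIB_MAX 64
struct toy_rib {
	int    routes[TOY_RIB_MAX];
	size_t n;
};

/* Toy algorithm instance: a fixed-capacity array that cannot grow in place. */
struct toy_instance {
	int   *routes;
	size_t n, cap;
};

static struct toy_instance *
toy_build(const struct toy_rib *rib, size_t cap)
{
	struct toy_instance *ti;

	assert(cap > 0 && cap >= rib->n);
	ti = malloc(sizeof(*ti));
	if (ti == NULL)
		return (NULL);
	ti->routes = malloc(cap * sizeof(ti->routes[0]));
	if (ti->routes == NULL) {
		free(ti);
		return (NULL);
	}
	memcpy(ti->routes, rib->routes, rib->n * sizeof(ti->routes[0]));
	ti->n = rib->n;
	ti->cap = cap;
	return (ti);
}

static void
toy_destroy(struct toy_instance *ti)
{
	free(ti->routes);
	free(ti);
}

/* In-place update: the algorithm asks for a rebuild when it is full. */
static enum flm_op_result
toy_change(struct toy_instance *ti, int route)
{
	if (ti->n == ti->cap)
		return (FLM_REBUILD);
	ti->routes[ti->n++] = route;
	return (FLM_SUCCESS);
}

/*
 * Framework side: update the control plane, try the in-place change and
 * fall back to building a fresh instance from the control-plane copy.
 */
static enum flm_op_result
fw_add_route(struct toy_instance **active, struct toy_rib *rib, int route)
{
	struct toy_instance *fresh, *old;

	assert(rib->n < TOY_RIB_MAX);
	rib->routes[rib->n++] = route;
	if (toy_change(*active, route) == FLM_SUCCESS)
		return (FLM_SUCCESS);
	fresh = toy_build(rib, rib->n * 2);
	if (fresh == NULL)
		return (FLM_ERROR);
	old = *active;
	*active = fresh;       /* switch the "dataplane pointer" */
	toy_destroy(old);      /* the kernel would defer this via epoch(9) */
	return (FLM_SUCCESS);
}
```

Because the control-plane copy stays authoritative, the algorithm never has to recover from a half-applied update: it either succeeds in place or is rebuilt from scratch.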

Automatic algorithm selection

As different workloads may have different route scale, including the framework in GENERIC requires supporting all scales without human intervention. The framework implements automatic algorithm selection and switchover in the following way:

  • each algo has get_preference() callback, returning relative preference (0..255) for the provided route table scale
  • after routing table change, callback is scheduled to re-evaluate currently used algorithm vs others. Callback executes after N=30 sec or M=100 route changes, whichever happens first
  • New algorithm preference has to be X=5% better than the current one to enable switchover
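
The selection rules above can be sketched as follows. This is a hedged illustration, not the framework code: struct toy_algo, pick_algo() and the two toy preference functions are hypothetical, and "5% better" is interpreted here as 5% of the 0..255 preference scale (12 points), which is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered lookup module. */
struct toy_algo {
	const char *name;
	uint8_t   (*get_pref)(uint32_t num_prefixes);
};

#define REEVAL_SECS    30                /* N from the list above */
#define REEVAL_CHANGES 100               /* M from the list above */
#define SWITCH_MARGIN  (256 * 5 / 100)   /* X=5% of the scale; assumed meaning */

/* Re-evaluate once either limit is reached, whichever happens first. */
static int
should_reevaluate(unsigned int secs_pending, unsigned int changes_pending)
{
	return (secs_pending >= REEVAL_SECS ||
	    changes_pending >= REEVAL_CHANGES);
}

/* Return the algorithm to use; cur is kept unless a rival clearly wins. */
static const struct toy_algo *
pick_algo(const struct toy_algo *algos, size_t n,
    const struct toy_algo *cur, uint32_t num_prefixes)
{
	const struct toy_algo *best = cur;
	unsigned int best_pref = cur->get_pref(num_prefixes) + SWITCH_MARGIN;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		unsigned int p = algos[i].get_pref(num_prefixes);

		if (&algos[i] != cur && p > best_pref) {
			best = &algos[i];
			best_pref = p;
		}
	}
	return (best);
}

/* Toy preferences: one algorithm for small tables, one for large ones. */
static uint8_t
toy_small_pref(uint32_t n)
{
	return (n < 1000 ? 250 : 100);
}

static uint8_t
toy_large_pref(uint32_t n)
{
	return (n < 1000 ? 255 : 200);
}

static const struct toy_algo toy_algos[] = {
	{ "toy_small", toy_small_pref },
	{ "toy_large", toy_large_pref },
};
```

The margin acts as hysteresis: with 10 prefixes toy_large scores 255 against 250, which is not enough to displace toy_small, so a table hovering around a threshold does not flap between algorithms.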

Nexthop referencing and indexing

The framework provides wrappers to automatically reference nexthops, ensuring they can be safely returned and that their refcount is non-zero.
It also maintains an idx->nhop pointer array, transparently handling nhop/nhop group indexes, which allows algorithms to store 16- or 32-bit indexes instead of pointers.
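
A minimal sketch of such an index table is shown below. It assumes a tiny fixed-size table with a linear scan; struct toy_nhop and the nh_* helpers are hypothetical stand-ins and do not reflect how the framework actually allocates indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct nhop_object; only the refcount matters here. */
struct toy_nhop {
	int refcnt;
};

#define NH_TBL_SIZE 8   /* deliberately tiny; hypothetical */

struct nh_table {
	struct toy_nhop *nh[NH_TBL_SIZE];    /* idx -> nhop */
	uint32_t         refs[NH_TBL_SIZE];  /* algorithm references per idx */
};

/*
 * Return the index for nh, taking a reference on it.
 * Returns -1 when the table is full (the caller would request a rebuild).
 */
static int
nh_get_idx(struct nh_table *t, struct toy_nhop *nh)
{
	int i, free_idx = -1;

	for (i = 0; i < NH_TBL_SIZE; i++) {
		if (t->nh[i] == nh) {
			t->refs[i]++;
			return (i);
		}
		if (t->nh[i] == NULL && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return (-1);
	nh->refcnt++;            /* pin the nexthop while it is indexed */
	t->nh[free_idx] = nh;
	t->refs[free_idx] = 1;
	return (free_idx);
}

/* Drop one reference; the slot is released when the last one goes away. */
static void
nh_put_idx(struct nh_table *t, int idx)
{
	assert(idx >= 0 && idx < NH_TBL_SIZE && t->refs[idx] > 0);
	if (--t->refs[idx] == 0) {
		t->nh[idx]->refcnt--;
		t->nh[idx] = NULL;
	}
}

/* Dataplane side: translate a stored 16-bit index back to a pointer. */
static struct toy_nhop *
nh_from_idx(const struct nh_table *t, uint16_t idx)
{
	assert(idx < NH_TBL_SIZE);
	return (t->nh[idx]);
}
```

Storing a 16-bit index instead of an 8-byte pointer is what lets compact structures such as DIR-24-8 keep their per-prefix footprint small.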

Dataplane pointers

Instead of the two-dimensional rnh array operated by `rt_table_get_rnh()`, the framework uses a per-family linear array of the following structures:

struct fib_dp {
        flm_lookup_t    *f;
        void            *arg;
};

`f` is the algorithm lookup function and `arg` is the pointer to the algorithm-specific data structure.
Changing the function/data pointer is implemented by creating a copy of the array, switching to it and reclaiming the old copy via an epoch(9) callback.
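
The copy-and-switch scheme can be sketched in user space with C11 atomics. This is an approximation under stated assumptions: the toy_* names and NUM_FIBS are hypothetical, the lookup signature is simplified, and the old array is handed back to the caller instead of being reclaimed through an epoch(9) callback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for flm_lookup_t and struct fib_dp. */
typedef int toy_lookup_t(void *arg, uint32_t key);

struct toy_fib_dp {
	toy_lookup_t *f;
	void         *arg;
};

#define NUM_FIBS 4   /* hypothetical number of fibs */

/* Per-family array, indexed by fib number; readers take no lock. */
static _Atomic(struct toy_fib_dp *) dp_array;

/* Trivial lookup function: returns the int that arg points to. */
static int
toy_const_lookup(void *arg, uint32_t key)
{
	(void)key;
	return (*(int *)arg);
}

/* Install the initial array with the same function/data for every fib. */
static int
toy_dp_init(toy_lookup_t *f, void *arg)
{
	struct toy_fib_dp *dp = malloc(NUM_FIBS * sizeof(*dp));
	int i;

	if (dp == NULL)
		return (-1);
	for (i = 0; i < NUM_FIBS; i++) {
		dp[i].f = f;
		dp[i].arg = arg;
	}
	atomic_store_explicit(&dp_array, dp, memory_order_release);
	return (0);
}

/* Hot path: one pointer load, one indexed read, one indirect call. */
static int
toy_fib_lookup(uint32_t fibnum, uint32_t key)
{
	struct toy_fib_dp *dp;

	assert(fibnum < NUM_FIBS);
	dp = atomic_load_explicit(&dp_array, memory_order_acquire);
	return (dp[fibnum].f(dp[fibnum].arg, key));
}

/*
 * Copy the array, change one slot and publish the copy.  The old array is
 * returned via *oldp; the kernel would free it from an epoch(9) callback
 * once no reader can still hold a pointer to it.
 */
static int
toy_replace_dp(uint32_t fibnum, toy_lookup_t *f, void *arg,
    struct toy_fib_dp **oldp)
{
	struct toy_fib_dp *old, *copy;

	assert(fibnum < NUM_FIBS);
	old = atomic_load_explicit(&dp_array, memory_order_relaxed);
	copy = malloc(NUM_FIBS * sizeof(*copy));
	if (copy == NULL)
		return (-1);
	memcpy(copy, old, NUM_FIBS * sizeof(*copy));
	copy[fibnum].f = f;
	copy[fibnum].arg = arg;
	atomic_store_explicit(&dp_array, copy, memory_order_release);
	*oldp = old;
	return (0);
}
```

Publishing a whole new array means a reader always sees a matching function/data pair for a given fib, never a new function paired with old data.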

Callbacks

Effectively the lookup module needs to implement 6 callbacks, with nearly all table interaction handled by the framework:

/* Lookup: return nexthop pointer for address specified by key and scopeid */
typedef struct nhop_object *flm_lookup_t(void *algo_data,
    const struct flm_lookup_key key, uint32_t scopeid);

/* Create base datastructures for the instance tied to a specific RIB */
typedef enum flm_op_result flm_init_t (uint32_t fibnum, struct fib_data *fd,
    void *_old_data, void **new_data);

/* Free algorithm-specific datastructures */
typedef void flm_destroy_t(void *data);

/*
 * Callback for initial synchronisation, called for each route in the
 * routing table as a part of "rebuild".
 * Called under rib write lock.
 */
typedef enum flm_op_result flm_dump_t(struct rtentry *rt, void *data);

/*
 * Callback for providing the datapath func/pointer to be used in lookups.
 * Called under rib write lock.
 */
typedef enum flm_op_result flm_dump_end_t(void *data, struct fib_dp *dp);

/*
 * Callback notifying of a single route table change.
 * Called under rib write lock.
 */
typedef enum flm_op_result flm_change_t(struct rib_head *rnh,
    struct rib_cmd_info *rc, void *data);

/* Callback for determining relative algorithm preference based on the routing table data */
typedef uint8_t flm_get_pref_t(const struct rib_rtable_info *rinfo);
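
To show the order in which the framework would drive these callbacks during a rebuild (init, dump for each route, dump_end, then lookups through the returned function/data pair), here is a user-space sketch of a toy module. Every type and function in it is a simplified, hypothetical stand-in (struct toy_route for rtentry, struct toy_dp for fib_dp, toy_sync() for the framework), and the linear-scan lookup is chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum toy_result { TOY_SUCCESS, TOY_ERROR };

/* Stand-ins for struct rtentry and struct fib_dp. */
struct toy_route {
	uint32_t prefix;
	uint8_t  plen;
	int      nhop;
};
struct toy_dp {
	int  (*f)(void *arg, uint32_t key);
	void  *arg;
};

#define TOY_MAX_ROUTES 16
struct toy_data {
	struct toy_route rt[TOY_MAX_ROUTES];
	size_t           n;
};

static uint32_t
toy_mask(uint8_t plen)
{
	return (plen == 0 ? 0 : ~0u << (32 - plen));
}

/* Lookup callback: linear scan for the longest matching prefix. */
static int
toy_lookup(void *arg, uint32_t key)
{
	const struct toy_data *td = arg;
	int best_len = -1, nhop = -1;
	size_t i;

	for (i = 0; i < td->n; i++) {
		const struct toy_route *r = &td->rt[i];

		if ((key & toy_mask(r->plen)) == r->prefix && r->plen > best_len) {
			best_len = r->plen;
			nhop = r->nhop;
		}
	}
	return (nhop);   /* -1 means no route */
}

/* Init callback: allocate the per-instance data. */
static enum toy_result
toy_init(void **new_data)
{
	*new_data = calloc(1, sizeof(struct toy_data));
	return (*new_data != NULL ? TOY_SUCCESS : TOY_ERROR);
}

/* Destroy callback. */
static void
toy_destroy(void *data)
{
	free(data);
}

/* Dump callback: invoked once per route during initial synchronisation. */
static enum toy_result
toy_dump(const struct toy_route *rt, void *data)
{
	struct toy_data *td = data;

	if (td->n == TOY_MAX_ROUTES)
		return (TOY_ERROR);
	td->rt[td->n++] = *rt;
	return (TOY_SUCCESS);
}

/* Dump-end callback: hand the lookup function and its data to the caller. */
static enum toy_result
toy_dump_end(void *data, struct toy_dp *dp)
{
	dp->f = toy_lookup;
	dp->arg = data;
	return (TOY_SUCCESS);
}

/* What the framework would do on a rebuild, reduced to its skeleton. */
static enum toy_result
toy_sync(const struct toy_route *rib, size_t n, struct toy_dp *dp)
{
	void *data;
	size_t i;

	if (toy_init(&data) != TOY_SUCCESS)
		return (TOY_ERROR);
	for (i = 0; i < n; i++) {
		if (toy_dump(&rib[i], data) != TOY_SUCCESS) {
			toy_destroy(data);
			return (TOY_ERROR);
		}
	}
	return (toy_dump_end(data, dp));
}
```

The module itself never walks the routing table or takes its locks; it only reacts to the routes it is handed, which is the point of the callback split.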

Test Plan

Performance

Lookup performance is tested using D27604 kernel module. Basically, the module calls fib[46]_lookup() in a loop, measuring total lookup time.

  • CPU: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
  • 8 IP destinations (both IPv4/IPv6)
  • single thread
  • 10M lookups
  • 4 runs
  • dynamic algo switch
  • r368604 (GENERIC-NODEBUG + ROUTE_ALGO)

Results

Small-fib ("standard" configuration: interface & default route)

  • radix4: 279064482 nanoseconds, 35 830 428 pps
  • radix4_lockless: 208777967 nanoseconds, 47 892 984 pps
  • bsearch4: 68720124 nanoseconds, 145 503 229 pps
  • dpdk_lpm4: 60284954 nanoseconds, 165 862 281 pps
  • radix6: 346572490 nanoseconds, 28 853 992 pps
  • radix6_lockless: 292266765 nanoseconds, 34 215 316 pps

Large fib

IPv4: 710k routes

  • radix4_lockless: 1070335461 nanoseconds, 9 342 865 pps
  • bsearch4: N/A
  • dpdk_lpm4: 73376846 nanoseconds, 136 282 772 pps

IPv6: 100k routes

  • radix6_lockless: 1587777930 nanoseconds, 6 298 109 pps
  • dpdk_lpm6: 176917777 nanoseconds, 56 523 432 pps

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

melifaro edited the summary of this revision.

Small cleanup, rebase to account for already-committed parts.

  • Move dpdk code to contrib
  • Add IPv4 dpdk lpm code (compile tested only)
  • Move sysctl tree to net.route.algo
  • Fix panic on dpdk modules unload
  • Fix multipath refcounting

Update to reflect committed changes. Split DPDK part in a separate review.

Kernel build failed with this error (on r368264):

--- kernel.full ---
ld: error: undefined symbol: fib_destroy_rib
>>> referenced by route.c:158 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/net/route.c:158)
>>>               route.o:(rt_table_destroy)

ld: error: undefined symbol: fib_grow_rtables
>>> referenced by route_tables.c:191 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/net/route/route_tables.c:191)
>>>               route_tables.o:(grow_rtables)

ld: error: undefined symbol: fib_select_algo_initial
>>> referenced by route_tables.c:215 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/net/route/route_tables.c:215)
>>>               route_tables.o:(grow_rtables)

ld: error: undefined symbol: vnet_entry_inet6_dp
>>> referenced by in6_fib.c:271 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/netinet6/in6_fib.c:271)
>>>               in6_fib.o:(fib6_check_urpf)
*** [kernel.full] Error code 1
melifaro edited the summary of this revision.

Any chance you could try building with the ROUTE_ALGO config option enabled?

The error is different:

ld: error: undefined symbol: nhgrp_get_count
>>> referenced by route_algo.c:962 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/net/route/route_algo.c:962)
>>>               route_algo.o:(fib_get_rtable_info)
>>> referenced by route_algo.c:962 (/usr/local/BSDRP/TESTING/FreeBSD/src/sys/net/route/route_algo.c:962)
>>>               route_algo.o:(fib_check_best_algo)
*** [kernel.full] Error code 1

Fix build with !ROUTE_MPATH.

Indeed. Was broken for non-multipath case. Should work now.

Yes, it builds and boots, but it seems quite verbose: the dmesg is full of messages like these:

[rt_algo] inet6.0 (radix6_lockless) handle_rtable_change_cb: Scheduling rebuilt
[rt_algo] inet6.0 fib_check_best_algo: candidate_algos: 2, curr: radix6_lockless(255) result: NULL(255)
[rt_algo] inet6.0 (radix6_lockless) handle_rtable_change_cb: Scheduling rebuilt
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 1 0xfffff8000bb01e00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 3 0xfffff8000bb01c00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 2 0xfffff8000bb01d00
[rt_algo] inet6.0 (radix6_lockless) sync_algo: initial dump completed.
[rt_algo] inet6.0 (radix6_lockless) try_setup_instance: DUMP completed successfully.
[rt_algo] inet.0 (radix4_lockless) handle_rtable_change_cb: Scheduling rebuilt
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: [vnet 0xfffff80003088f00] replace with f:0xffffffff80de82b0 arg:0xfffff80003973900
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: OLD FFI: 0xfffff8000b3cde00 NEW FFI: 0xfffff8000b3cd380
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: update 0xfffff8000b3cde00 -> 0xfffff8000b3cd380
[rt_algo] inet6.0 setup_instance: try 0: fib algo result: 0
[rt_algo] inet6.0 (radix6_lockless) rebuild_callout: switched to new instance
[rt_algo] inet6.0 (radix6_lockless) schedule_destroy_instance: DETACH
[rt_algo] inet6.0 (radix6_lockless) schedule_destroy_instance: destroying old instance
[rt_algo] inet.0 fib_check_best_algo: candidate_algos: 2, curr: radix4_lockless(255) result: NULL(255)
[rt_algo] inet6.0 (radix6_lockless) destroy_instance: destroy fd 0xfffff80003985d00
[rt_algo] inet.0 (radix4_lockless) fib_get_nhop_idx:  REF nhop 1 0xfffff8000bb13e00
[rt_algo] inet.0 (radix4_lockless) sync_algo: initial dump completed.
[rt_algo] inet.0 (radix4_lockless) try_setup_instance: DUMP completed successfully.
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: [vnet 0xfffff80003088f00] replace with f:0xffffffff80d98f10 arg:0xfffff80003973c00
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: OLD FFI: 0xfffff8000b3cde40 NEW FFI: 0xfffff80003778bc0
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: update 0xfffff8000b3cde40 -> 0xfffff80003778bc0
[rt_algo] inet.0 setup_instance: try 0: fib algo result: 0
[rt_algo] inet.0 (radix4_lockless) rebuild_callout: switched to new instance
[rt_algo] inet.0 (radix4_lockless) schedule_destroy_instance: DETACH
[rt_algo] inet.0 (radix4_lockless) schedule_destroy_instance: destroying old instance
[rt_algo] inet.0 (radix4_lockless) destroy_instance: destroy fd 0xfffff80003985b00
[rt_algo] inet.0 (radix4_lockless) handle_rtable_change_cb: Scheduling rebuilt
[rt_algo] inet6.0 (radix6_lockless) handle_rtable_change_cb: Scheduling rebuilt
[rt_algo] inet.0 fib_check_best_algo: candidate_algos: 2, curr: radix4_lockless(255) result: NULL(255)
[rt_algo] inet.0 (radix4_lockless) fib_get_nhop_idx:  REF nhop 1 0xfffff8000bb13e00
[rt_algo] inet.0 (radix4_lockless) fib_get_nhop_idx:  REF nhop 2 0xfffff8000bc48e00
[rt_algo] inet.0 (radix4_lockless) fib_get_nhop_idx:  REF nhop 3 0xfffff8000bc48d00
[rt_algo] inet.0 (radix4_lockless) sync_algo: initial dump completed.
[rt_algo] inet.0 (radix4_lockless) try_setup_instance: DUMP completed successfully.
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: [vnet 0xfffff80003088f00] replace with f:0xffffffff80d98f10 arg:0xfffff8000bc7cd80
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: OLD FFI: 0xfffff80003778bc0 NEW FFI: 0xfffff8000b360a80
[rt_algo] inet.0 (radix4_lockless) replace_rtables_family: update 0xfffff80003778bc0 -> 0xfffff8000b360a80
[rt_algo] inet.0 setup_instance: try 0: fib algo result: 0
[rt_algo] inet.0 (radix4_lockless) rebuild_callout: switched to new instance
[rt_algo] inet.0 (radix4_lockless) schedule_destroy_instance: DETACH
[rt_algo] inet.0 (radix4_lockless) schedule_destroy_instance: destroying old instance
[rt_algo] inet6.0 fib_check_best_algo: candidate_algos: 2, curr: radix6_lockless(255) result: NULL(255)
[rt_algo] inet.0 (radix4_lockless) destroy_instance: destroy fd 0xfffff80003985d00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 1 0xfffff8000bb01e00
[rt_algo] inet.0 (radix4_lockless) destroy_instance:  FREE nhop 1 0xfffff8000bb13e00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 5 0xfffff8000bc48b00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 4 0xfffff8000bc48c00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 3 0xfffff8000bb01c00
[rt_algo] inet6.0 (radix6_lockless) fib_get_nhop_idx:  REF nhop 2 0xfffff8000bb01d00
[rt_algo] inet6.0 (radix6_lockless) sync_algo: initial dump completed.
[rt_algo] inet6.0 (radix6_lockless) try_setup_instance: DUMP completed successfully.
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: [vnet 0xfffff80003088f00] replace with f:0xffffffff80de82b0 arg:0xfffff80003973a80
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: OLD FFI: 0xfffff8000b3cd380 NEW FFI: 0xfffff8000b3609c0
[rt_algo] inet6.0 (radix6_lockless) replace_rtables_family: update 0xfffff8000b3cd380 -> 0xfffff8000b3609c0
[rt_algo] inet6.0 setup_instance: try 0: fib algo result: 0
[rt_algo] inet6.0 (radix6_lockless) rebuild_callout: switched to new instance
[rt_algo] inet6.0 (radix6_lockless) schedule_destroy_instance: DETACH
[rt_algo] inet6.0 (radix6_lockless) schedule_destroy_instance: destroying old instance

And inet forwarding performance is very very slow:

[root@apu2]~# netstat -ihw 1
            input        (Total)           output
   packets  errs idrops      bytes    packets  errs      bytes colls
      3.0M     0     0       171M        15k     0       874K     0
      3.0M     0     0       170M        15k     0       872K     0
      3.1M     0     0       176M        15k     0       903K     0

And inet6 forwarding does not seem to work at all; extract of netstat -ss:

ip6:
        98000627 total packets received
        98000627 packets not forwardable
        12 packets sent from this host
        94454498 output packets discarded due to no route

Fix IPv6 forwarding.
Fix tests.

Results on small and medium hardware (huge difference between the two).
PMC data is no longer available on the small AMD CPU (a regression, since it used to be).

Thank you for testing it & providing detailed measurements & PMC data!
So far the key takeaways for me are the following:

  • IPv6 performance loss for small systems may be a result of unaligned radix memory allocation
  • unlocked radix does not bring a lot of benefit - it may be worth considering something else for the small-fib use case - 6% in rn_match is way too costly.

I'll experiment with another algo approach and update this & DPDK review.

Update default small-fib lookup algo to bsearch.

Is there any chance you can retest IPv4 forwarding using the same approach (preferably with PMC data where possible)?

sys/netinet/in_fib_algo.c
165

Where is rt_get_raw_nhop() defined?

--- in_fib_algo.o ---
/usr/src/sys/netinet/in_fib_algo.c:164:7: error: implicit declaration of function 'rt_get_raw_nhop' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        nh = rt_get_raw_nhop(rt);
             ^
/usr/src/sys/netinet/in_fib_algo.c:164:5: error: incompatible integer to pointer conversion assigning to 'struct nhop_object *' from 'int' [-Werror,-Wint-conversion]
        nh = rt_get_raw_nhop(rt);
175

Same here:

/usr/src/sys/netinet/in_fib_algo.c:174:2: error: implicit declaration of function 'rt_get_inet_prefix_pmask' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        rt_get_inet_prefix_pmask(rt, &addr4, &mask4, &scopeid);
sys/netinet/in_fib_algo.c
165

Added in r368317

175

Sorry, should have stated that rebase to recent head is required :-(
Added in r368317

Just found an unexpected improvement (not related to this review) on latest -head: my 10Gb/s Chelsio server (8 cores) is now reaching the line rate of 14.8Mpps.
So I will start another bench on bigger hardware (40 and 100Gb/s).
Meanwhile results and flamegraphs here (but seems not useful now):
https://github.com/ocochard/netbenches/blob/master/Xeon_E5-2650_8Cores-Chelsio_T540-CR/forwarding-pf-ipfw/results/fbsd13-r368606.D27401v3/README.md

The previous unexpected improvement must have come from the Chelsio drivers, because on small hardware with an Intel NIC there is no such difference. Full report here.

x r368606: IPv4 packets-per-second forwarded
+ r368606 with D27401(Diff 80654): IPv4 packets-per-second forwarded
+--------------------------------------------------------------------------+
|x   x       x x  x                                    +   +       +  +   +|
|  |______A__M___|                                                         |
|                                                        |_______A_M_____| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        693659        700880        698804      697673.4     3029.3716
+   5        717041      725241.5        722101      721296.6     3389.6496
Difference at 95.0% confidence
	23623.2 +/- 4688.25
	3.386% +/- 0.682181%
	(Student's t, pooled s = 3214.56)
=> D27401(Diff 80654) brings an improvement of 3.39% to forward IPv4: the improvement was 7.2% with previous Diff 80345.
  • Add dynamic algo switch (default fib only)
  • Fixup logging
  • Fix algo instance removal
melifaro edited the test plan for this revision.

New benches on Diff 80677 on multiple hardware platforms (click their links for full data and flame graphs):

melifaro edited the test plan for this revision.
  • Update to recent HEAD
  • Export fib_radix_lookup_nh
  • small bugfixes

Yay! thank you for working on that!

I still don't understand what's happening here. The changes only touch fib6_lookup() internals and should be just "better" in both memory allocation and CPU cycles.
Is there any chance you can eventually test this with net.route.algo.inet6.algo=radix6 to check if the result is the same as "stock" version?

That's an interesting one.
fib4_lookup() utilisation goes 11% -> 4%, and fib6_lookup() goes 11.7% -> 10%. IPv6 part is understandable here - there is no bsearch4 analogue, so the only benefit is unlocked radix.
However, the lack of an IPv4 difference is a bit weird. I haven't understood where the 7% went.

Anyway, my impression is that so far it looks positive performance-wise. It certainly has the potential to deliver notably better results for large-fib boxes (D27412 should address it).

With that in mind, I plan to commit the change (after a bit more tweaks and more commented code) on Dec 19 unless I receive any objections.

I need to redo all my benches!!!

I just found a netmap pkt-gen bug that doesn't correctly use the range of IP sources & destinations:

pkt-gen -f tx -N -i igb2 -l 60 -4 -d 198.19.10.1:2000-198.19.10.100 -D 00:0d:b9:41:ca:3d -s 198.18.10.1:2000-198.18.10.20 -w 2 -R 1000

This pkt-gen command line should generate 2000 flows (20 source IPs * 100 destination IPs), but that is not the case: the last bit is never updated.
As an example, from one source IP in this range I should see 100 different destinations, but there are only 10:

# tcpdump -pni igb1 host 198.18.10.15
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:15:05.868196 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:05.884197 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18
13:15:05.889205 IP 198.18.10.15.2000 > 198.19.10.55.2000: UDP, length 18
13:15:05.905178 IP 198.18.10.15.2000 > 198.19.10.15.2000: UDP, length 18
13:15:05.907193 IP 198.18.10.15.2000 > 198.19.10.75.2000: UDP, length 18
13:15:05.926168 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:05.928219 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18
13:15:05.944168 IP 198.18.10.15.2000 > 198.19.10.55.2000: UDP, length 18
13:15:05.949198 IP 198.18.10.15.2000 > 198.19.10.15.2000: UDP, length 18
13:15:05.965167 IP 198.18.10.15.2000 > 198.19.10.75.2000: UDP, length 18
13:15:05.967190 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:05.986165 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18
13:15:05.988191 IP 198.18.10.15.2000 > 198.19.10.55.2000: UDP, length 18
13:15:06.004167 IP 198.18.10.15.2000 > 198.19.10.15.2000: UDP, length 18
13:15:06.009202 IP 198.18.10.15.2000 > 198.19.10.75.2000: UDP, length 18
13:15:06.025166 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:06.027191 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18
13:15:06.046165 IP 198.18.10.15.2000 > 198.19.10.55.2000: UDP, length 18
13:15:06.048192 IP 198.18.10.15.2000 > 198.19.10.15.2000: UDP, length 18
13:15:06.064180 IP 198.18.10.15.2000 > 198.19.10.75.2000: UDP, length 18
13:15:06.069208 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:06.085164 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18
13:15:06.087189 IP 198.18.10.15.2000 > 198.19.10.55.2000: UDP, length 18
13:15:06.106185 IP 198.18.10.15.2000 > 198.19.10.15.2000: UDP, length 18
13:15:06.108213 IP 198.18.10.15.2000 > 198.19.10.75.2000: UDP, length 18
13:15:06.124164 IP 198.18.10.15.2000 > 198.19.10.35.2000: UDP, length 18
13:15:06.129187 IP 198.18.10.15.2000 > 198.19.10.95.2000: UDP, length 18

So due to a bug in netmap pkt-gen, the previous benches were generating only 200 UDP flows (instead of 2000 for the smallest device and 5000 for the others).
New benches were done, and the new results with many more flows lead to the same conclusion as the previous ones.

Move all nexthop refcounting to the framework side.
Fix lookup module refcounting.
Implement FLM_ERROR handling.
Rebase to latest HEAD.

sys/conf/files
4181

Where is this file?

melifaro marked an inline comment as done.

Virtualise fib_error_list.
Rename MOD_LOCK lock to FIB_MOD_LOCK to avoid clash with
module subsystem.
Improve comments.

So, here are the bench results against diff 80909 (not the latest one):

Would it be possible for you to test the full view with this diff and D27412? The latter produces 2 kernel modules (dpdk_lpm4 and dpdk_lpm6), which should be loaded before the test.

  • Rename ROUTE_ALGO to FIB_ALGO
  • Rewrite / simplify bsearch4
  • Add more documentation
  • Add more detailed errors
  • Rebase to latest HEAD

Update to the latest -HEAD.
Fix panic on vnet teardown.

More comments & small refactoring.
Rebase to latest HEAD.

This revision was not accepted when it landed; it landed in state Needs Review. Dec 25 2020, 11:37 AM
This revision was automatically updated to reflect the committed changes.