Page MenuHomeFreeBSD

Speed up geom_stats_resync in the presence of many devices
ClosedPublic

Authored by asomers on Feb 27 2021, 4:05 PM.
Tags
None
Referenced Files
F103034881: D28968.diff
Wed, Nov 20, 2:24 AM
Unknown Object (File)
Oct 2 2024, 4:31 AM
Unknown Object (File)
Oct 1 2024, 10:05 PM
Unknown Object (File)
Oct 1 2024, 3:10 PM
Unknown Object (File)
Oct 1 2024, 3:10 PM
Unknown Object (File)
Oct 1 2024, 3:10 PM
Unknown Object (File)
Oct 1 2024, 3:08 PM
Unknown Object (File)
Oct 1 2024, 3:08 PM
Subscribers

Details

Summary

The old code had a O(n) loop, where n is the size of /dev/devstat.
Multiply that by another O(n) loop in devstat_mmap for a total of
O(n^2).

This change adds DIOCGMEDIASIZE support to /dev/devstat so userland can
quickly determine the right amount of memory to map, eliminating the
O(n) loop in userland.

This change decreases the time to run "gstat -bI0.001" with 16,384 md
devices from 29.7s to 4.2s.

MFC after: 2 weeks
Sponsored by: Axcient

[skip ci] fix a typo in a comment in mdconfig.c Sponsored by: Axcient

Test Plan

Tested with gstat -bI0.001 with up to 16,384 md devices

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 37437
Build 34326: arc lint + arc unit

Event Timeline

  • Speed up geom_stats_resync in the presence of many devices
  • [skip ci] fix a typo in a comment in mdconfig.c

Looks good to my eye

This revision is now accepted and ready to land.Feb 27 2021, 5:08 PM
lib/libgeom/geom_stats.c
74–80

I don't very like the use of DIOCGMEDIASIZE here, but I have no better ideas.

80

In the new code you are neither reading nor updating npages variable. You could read npages to skip mmap() call if nothing has changed. And you should update it for other library parts to work correctly.

sys/kern/subr_devstat.c
506

Double semicolon.

lib/libgeom/geom_stats.c
74–80

I had the same thought... but didn't want to suggest a new one

  • * Fix mav's npages observation
This revision now requires review to proceed.Feb 27 2021, 11:57 PM
lib/libgeom/geom_stats.c
136

I think previous order was intentional to make time more up to date of paging on access take some time.

lib/libgeom/geom_stats.c
136

Interesting idea. Do you have any good idea why there used to be two memset calls?

lib/libgeom/geom_stats.c
136

Not really. Guess some form of swap pager woodoo. Something like first pass triggers swap-in, while second brings up to date TLB caches if not yet, to make following copy as coherent as possible.

lib/libgeom/geom_stats.c
136

So you really think both memsets are required?

lib/libgeom/geom_stats.c
136

And you think it is a copy/paste bug?

lib/libgeom/geom_stats.c
136

Yes, I think it's a copy/paste bug.

mav added inline comments.
lib/libgeom/geom_stats.c
77

I haven't used mmap() that much to know. I am curios whether the lack of munmap() before used some trick not avoid extra actions.

136

I this case I'd say don't fix what's not broken. But whatever you decide. Responsibility is yours. ;)

This revision is now accepted and ready to land.Feb 28 2021, 2:48 AM
lib/libgeom/geom_stats.c
77

I don't know. But the resulted in rapid accumulation of MAP ENTRY memory with every call to geom_stats_resync

136

Even though that code is very old, the original author is still active. So I'll pull it out of this review and take it up separately.