In searching for where to insert a new node, RB_INSERT discards the address of the pointer that will have to be modified, so that it must find it again from the values of 'parent' and 'comp'. Stop discarding that address, and so avoid having to recompute it.
Details
Details
This change reduces the size of iommu_gas.o by 48 bytes.
A test program (
), compiled with -DINS_ONLY -O2, repeatedly inserts 65k nodes into an rb tree, empties the tree by nulling the root, and starts over inserting in a slightly different order. With 24 sample tests on 5 machines, this change shows performance improvements on 4 of them, and a degradation on the other:x ins.lip1.res + ins.lip1.new.res +-------------------------------------------------------------------------------------+ | + + x | | + + + + + x xx xx x | |+ + + ++ ++++ + +++ + + + + xx xx xx x xxxxxx x x x x| | |_______A________| |________A________| | +-------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 24 45.129591 47.046357 45.71217 45.728479 0.43179336 + 24 42.96992 44.521438 43.675265 43.704165 0.41821916 Difference at 95.0% confidence -2.02431 +/- 0.247004 -4.42681% +/- 0.540153% (Student's t, pooled s = 0.42506) x ins.lip2.res + ins.lip2.new.res +-------------------------------------------------------------------------------------+ | ++ x | | ++ xx | | ++ xx | | ++ xx | | ++ xx x | | ++ + xx x | | +++++ **** x x x x x +x + x| ||_______M_____A_M______A___|__________| | +-------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 24 44.975475 74.543689 45.405689 48.390873 6.5609909 + 24 41.503702 65.277594 41.758037 44.193325 5.8169746 Difference at 95.0% confidence -4.19755 +/- 3.60293 -8.67426% +/- 7.44547% (Student's t, pooled s = 6.20015) x ins.lip3.res + ins.lip3.new.res +-------------------------------------------------------------------------------------+ | ++ + x | | + + ++ + + x x x x| |++ ++++ + +++ +++ + x * xx xx x x xxx x x x xx x xx| | |_______M_A________| |______________A_M___________| | +-------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 24 29.111069 30.372008 29.931353 29.869748 0.38332178 + 24 28.184956 29.236128 28.606005 28.641103 0.25667181 Difference at 95.0% confidence -1.22864 +/- 0.189557 -4.11334% +/- 0.634613% (Student's t, pooled s = 0.326202) x ins.lip4.res + ins.lip4.new.res +-------------------------------------------------------------------------------------+ | + x | | + x | | + x | | ++ xx| | ++ xx| |++++ xx| |++++ xx| |++++ xx| |+++++ xx| | MA A|| +-------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 24 36.800296 36.840009 36.814799 36.817438 0.011267106 + 24 34.128489 34.264221 34.173894 34.177868 0.030824877 Difference at 95.0% confidence -2.63957 +/- 0.0134856 -7.16935% +/- 0.0366283% (Student's t, pooled s = 0.0232069) x ins.lip5.res + ins.lip5.new.res +-------------------------------------------------------------------------------------+ |x +| |x +| |x +| |x +| |x +| |x +| |x ++| |x ++| |x ++| |x ++| |x ++| |x ++| |x ++| |x ++| |xx ++| |A |A| +-------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 24 53.118808 53.178434 53.141788 53.141197 0.014284841 + 24 60.045924 60.103849 60.074503 60.071469 0.016125488 Difference at 95.0% confidence 6.93027 +/- 0.00885194 13.0412% +/- 0.0166574% (Student's t, pooled s = 0.015233)
Diff Detail
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
sys/sys/tree.h | ||
---|---|---|
736 | I would try two variables: the original tmp and a new tmpp. The point being not to rely on the compiler not reloading *tmp below. |
Comment Actions
Make explicit that the value extracted from the pointer doesn't need to be extracted twice. The performance effect is zero with the compiler and machines I'm testing with:
x ins.lip1.res + ins.lip1.new.res N Min Max Median Avg Stddev x 24 44.666931 47.246019 44.951619 45.048621 0.50288168 + 24 42.581848 43.332712 42.795631 42.845068 0.19898552 Difference at 95.0% confidence -2.20355 +/- 0.222224 -4.8915% +/- 0.493297% (Student's t, pooled s = 0.382417) x ins.lip2.res + ins.lip2.new.res N Min Max Median Avg Stddev x 24 44.966331 46.029942 45.085098 45.190326 0.29323327 + 24 41.457632 42.995131 41.613923 41.712543 0.35076252 Difference at 95.0% confidence -3.47778 +/- 0.187859 -7.69586% +/- 0.415706% (Student's t, pooled s = 0.32328) x ins.lip3.res + ins.lip3.new.res N Min Max Median Avg Stddev x 24 29.253933 30.172271 29.681797 29.664392 0.25813438 + 24 28.071634 29.046893 28.580763 28.60449 0.25383136 Difference at 95.0% confidence -1.0599 +/- 0.148758 -3.57298% +/- 0.501469% (Student's t, pooled s = 0.255992) x ins.lip4.res + ins.lip4.new.res N Min Max Median Avg Stddev x 24 36.787981 36.837032 36.815186 36.816023 0.015639191 + 24 34.135403 34.235068 34.175161 34.174839 0.026672323 Difference at 95.0% confidence -2.64118 +/- 0.0127048 -7.17401% +/- 0.0345088% (Student's t, pooled s = 0.0218632) x ins.lip5.res + ins.lip5.new.res N Min Max Median Avg Stddev x 24 53.176464 53.228984 53.198042 53.200987 0.014640566 + 24 60.096746 60.163316 60.134099 60.134424 0.018302593 Difference at 95.0% confidence 6.93344 +/- 0.00963064 13.0325% +/- 0.0181024% (Student's t, pooled s = 0.016573)
sys/sys/tree.h | ||
---|---|---|
733 | This variable is now unnecessary. |
sys/sys/tree.h | ||
---|---|---|
736 | I'm happy to do that, to work around some stupid compiler someday, though it does not affect the results produced by the compiler I'm using now. |