lib/libc/amd64/string: add strspn(3) scalar, x86-64-v2 implementation, unit tests
Closed, Public

Authored by fuz on Aug 23 2023, 8:20 PM.

Details

Summary

This is conceptually very similar to the strcspn(3) implementations
from D41557, but we can't do the fast paths the same way.
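
For orientation, here is a minimal portable sketch of what a scalar strspn(3) with fast paths might look like. It is not the amd64 assembly from this review. The name `sketch_strspn`, the 256-entry table, and the choice of fast paths (empty set, single-character set) are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical illustration, not the code from this review:
 * a portable strspn() built on a 256-entry membership table.
 */
size_t
sketch_strspn(const char *s, const char *set)
{
	bool member[256] = { false };
	const unsigned char *p;
	size_t i;

	/* empty set: no character can match, so the span is empty */
	if (set[0] == '\0')
		return (0);

	/* single-character set: count the leading run of that character */
	if (set[1] == '\0') {
		for (i = 0; s[i] == set[0]; i++)
			;
		return (i);
	}

	/* general case: mark every set character, then scan s */
	for (p = (const unsigned char *)set; *p != '\0'; p++)
		member[*p] = true;

	/* member[0] stays false, so the scan stops at the terminator */
	for (i = 0; member[(unsigned char)s[i]]; i++)
		;

	return (i);
}
```

In this sketch the short-set cases return a constant or count a run of one character, so they never build the table. This is one plausible reading of why the strcspn(3) fast paths do not carry over unchanged, since there the short-set cases amount to finding a string's length or locating a single character.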

As with the strcspn(3) implementation, new unit tests are
provided for strspn(3), superseding the rudimentary tests from
NetBSD.
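
As a rough idea of what such a test can cover, the sketch below checks strspn(3) over a range of buffer offsets, span lengths, and set lengths. It is not the test code from this review: `ref_strspn`, `check_strspn`, the alphabet, and the loop bounds are all hypothetical, and it uses plain assert(3) rather than any test framework.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference: the obvious quadratic definition of strspn(). */
static size_t
ref_strspn(const char *s, const char *set)
{
	size_t i;

	for (i = 0; s[i] != '\0'; i++)
		if (strchr(set, s[i]) == NULL)
			break;
	return (i);
}

/*
 * Hypothetical exhaustive check: for each buffer offset, span length and
 * set length, build a string whose expected span is known and compare
 * strspn() and the reference against it.  Returns the number of cases.
 */
int
check_strspn(void)
{
	static const char alphabet[] =
	    "abcdefghijklmnopqrstuvwxyz0123456789ABCD";	/* 40 characters */
	char buf[128], set[41];
	size_t off, span, setlen, i, expected;
	int checked = 0;

	for (off = 0; off < 16; off++) {
		for (span = 0; span <= 64; span++) {
			for (setlen = 0; setlen <= 40; setlen++) {
				/* the set is a prefix of the alphabet */
				memcpy(set, alphabet, setlen);
				set[setlen] = '\0';

				/* span matching bytes, then '!' (never in set) */
				memset(buf, 0, sizeof(buf));
				for (i = 0; i < span; i++)
					buf[off + i] = setlen > 0 ?
					    set[i % setlen] : '!';
				buf[off + span] = '!';

				/* with an empty set nothing can match */
				expected = setlen > 0 ? span : 0;
				assert(strspn(buf + off, set) == expected);
				assert(ref_strspn(buf + off, set) == expected);
				checked++;
			}
		}
	}
	return (checked);
}
```

Varying the starting offset is meant to exercise different string alignments, which is where vectorized implementations tend to have edge cases. The specific bounds here are arbitrary.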

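Performance is comparable with glibc: in the figures below, the
x86-64-v2 implementation reaches a geomean of 29.78µs/op (3.909 GiB/s)
against 42.85µs/op (2.717 GiB/s) for glibc.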

os: FreeBSD
arch: amd64
cpu: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
        │ strspn.x86-64-v2.out │            strspn.pre.out             │           strspn.scalar.out           │
        │        sec/op        │    sec/op     vs base                 │    sec/op     vs base                 │
Short1             66.77µ ± 0%   141.36µ ± 0%  +111.70% (p=0.000 n=20)    64.83µ ± 0%    -2.91% (p=0.000 n=20)
Mid1               15.59µ ± 0%    79.25µ ± 0%  +408.35% (p=0.000 n=20)    30.82µ ± 0%   +97.67% (p=0.000 n=20)
Long1              6.030µ ± 0%   48.440µ ± 0%  +703.35% (p=0.000 n=20)   21.386µ ± 0%  +254.67% (p=0.000 n=20)
Short5             67.11µ ± 0%   182.21µ ± 0%  +171.49% (p=0.000 n=20)   127.97µ ± 0%   +90.68% (p=0.000 n=20)
Mid5               15.59µ ± 0%    87.71µ ± 0%  +462.75% (p=0.000 n=20)    56.11µ ± 0%  +260.01% (p=0.000 n=20)
Long5              6.030µ ± 0%   49.626µ ± 1%  +723.05% (p=0.000 n=20)   35.214µ ± 0%  +484.03% (p=0.000 n=20)
Short20            77.76µ ± 0%   352.34µ ± 1%  +353.12% (p=0.000 n=20)   165.16µ ± 0%  +112.41% (p=0.000 n=20)
Mid20              26.61µ ± 0%   133.80µ ± 0%  +402.84% (p=0.000 n=20)    67.05µ ± 0%  +151.97% (p=0.000 n=20)
Long20             14.04µ ± 0%    62.11µ ± 1%  +342.24% (p=0.000 n=20)    35.22µ ± 0%  +150.81% (p=0.000 n=20)
Short40            171.6µ ± 1%    588.1µ ± 1%  +242.74% (p=0.000 n=20)    203.5µ ± 0%   +18.61% (p=0.000 n=20)
Mid40              70.10µ ± 1%   197.90µ ± 0%  +182.32% (p=0.000 n=20)    74.47µ ± 0%    +6.24% (p=0.000 n=20)
Long40             35.18µ ± 0%    51.69µ ± 2%   +46.92% (p=0.000 n=20)    35.45µ ± 0%    +0.74% (p=0.000 n=20)
geomean            29.78µ         118.4µ       +297.52%                   60.20µ       +102.13%

        │ strspn.x86-64-v2.out │            strspn.pre.out             │           strspn.scalar.out           │
        │         B/s          │      B/s       vs base                │      B/s       vs base                │
Short1           1785.3Mi ± 0%    843.3Mi ± 0%  -52.76% (p=0.000 n=20)   1838.7Mi ± 0%   +2.99% (p=0.000 n=20)
Mid1              7.467Gi ± 0%    1.469Gi ± 0%  -80.33% (p=0.000 n=20)    3.778Gi ± 0%  -49.41% (p=0.000 n=20)
Long1            19.307Gi ± 0%    2.403Gi ± 0%  -87.55% (p=0.000 n=20)    5.444Gi ± 0%  -71.81% (p=0.000 n=20)
Short5           1776.2Mi ± 0%    654.2Mi ± 0%  -63.17% (p=0.000 n=20)    931.5Mi ± 0%  -47.56% (p=0.000 n=20)
Mid5              7.469Gi ± 0%    1.327Gi ± 0%  -82.23% (p=0.000 n=20)    2.075Gi ± 0%  -72.22% (p=0.000 n=20)
Long5            19.307Gi ± 0%    2.346Gi ± 1%  -87.85% (p=0.000 n=20)    3.306Gi ± 0%  -82.88% (p=0.000 n=20)
Short20          1533.1Mi ± 0%    338.3Mi ± 1%  -77.93% (p=0.000 n=20)    721.8Mi ± 0%  -52.92% (p=0.000 n=20)
Mid20            4480.0Mi ± 0%    890.9Mi ± 0%  -80.11% (p=0.000 n=20)   1778.0Mi ± 0%  -60.31% (p=0.000 n=20)
Long20            8.290Gi ± 0%    1.874Gi ± 1%  -77.39% (p=0.000 n=20)    3.305Gi ± 0%  -60.13% (p=0.000 n=20)
Short40           694.7Mi ± 1%    202.7Mi ± 1%  -70.82% (p=0.000 n=20)    585.7Mi ± 0%  -15.69% (p=0.000 n=20)
Mid40            1700.6Mi ± 1%    602.4Mi ± 0%  -64.58% (p=0.000 n=20)   1600.7Mi ± 0%   -5.88% (p=0.000 n=20)
Long40            3.309Gi ± 0%    2.252Gi ± 2%  -31.93% (p=0.000 n=20)    3.284Gi ± 0%   -0.74% (p=0.000 n=20)
geomean           3.909Gi        1006.9Mi       -74.84%                   1.934Gi       -50.53%

os: Linux
arch: x86_64
cpu:
        │ strspn.glibc.out │
        │      sec/op      │
Short1         71.64µ ± 0%
Mid1           18.82µ ± 0%
Long1          6.029µ ± 0%
Short5         71.63µ ± 1%
Mid5           18.82µ ± 0%
Long5          6.029µ ± 0%
Short20        217.1µ ± 0%
Mid20          81.78µ ± 0%
Long20         35.77µ ± 0%
Short40        266.8µ ± 0%
Mid40          95.84µ ± 0%
Long40         35.77µ ± 0%
geomean        42.85µ

        │ strspn.glibc.out │
        │       B/s        │
Short1        1.625Gi ± 0%
Mid1          6.186Gi ± 0%
Long1         19.31Gi ± 0%
Short5        1.625Gi ± 1%
Mid5          6.187Gi ± 0%
Long5         19.31Gi ± 0%
Short20       549.2Mi ± 0%
Mid20         1.423Gi ± 0%
Long20        3.254Gi ± 0%
Short40       446.8Mi ± 0%
Mid40         1.215Gi ± 0%
Long40        3.254Gi ± 0%
geomean       2.717Gi

Sponsored by: The FreeBSD Foundation

Test Plan

Passes the new unit tests added for this purpose.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable