Page MenuHomeFreeBSD

lib/libc/tests/string: add extended unit tests for strcmp()
ClosedPublic

Authored by fuz on Sep 25 2023, 6:33 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, May 9, 10:21 PM
Unknown Object (File)
Sun, May 5, 10:19 AM
Unknown Object (File)
Apr 30 2024, 10:21 AM
Unknown Object (File)
Apr 12 2024, 7:23 AM
Unknown Object (File)
Apr 6 2024, 8:46 AM
Unknown Object (File)
Feb 2 2024, 2:28 AM
Unknown Object (File)
Dec 28 2023, 5:21 PM
Unknown Object (File)
Dec 11 2023, 12:20 AM
Subscribers

Details

Summary

This changeset add a new set of tests that comprehensively test strcmp() on
various alignments of the input. This made it easy to smoke out many
exciting new bugs in my SSE strcmp() implementation from D41971.

Sponsored by: The FreeBSD Foundation

Test Plan

n/a

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

fuz requested review of this revision.Sep 25 2023, 6:33 AM

Let me see if I understand this dlopen business correctly. Your plan is that if you create a separate .c file containing "test_strcmp" and link it to t_strcmp.o, then the test will operate on that function. Otherwise, it will operate on the standard strcmp?

Let me see if I understand this dlopen business correctly. Your plan is that if you create a separate .c file containing "test_strcmp" and link it to t_strcmp.o, then the test will operate on that function. Otherwise, it will operate on the standard strcmp?

What I usually do is prepare my own strcmp to be named test_strcmp. Then I assemble and link it into a shared object and preload the shared object with LD_PRELOAD. It doesn't suffice to just name it strcmp, as defects in the code can mean that the test crashes before it gets to actually run any unit tests. This design pattern is already used in some other libc string tests such as contrib/netbsd-tests/lib/libc/string/t_strchr.c.

ngie requested changes to this revision.Sep 27 2023, 1:20 AM

Please don't modify the NetBSD test: if the plan is to test out FreeBSD-specific behavior for strcmp, it should go in its own test.

This revision now requires changes to proceed.Sep 27 2023, 1:20 AM

Ok. Should I unhook the netbsd strcmp tests from the build and replace them with a FreeBSD-specific file then? What about the other open DRs (D41520, D41528)?

In D41970#957458, @fuz wrote:

Ok. Should I unhook the netbsd strcmp tests from the build and replace them with a FreeBSD-specific file then? What about the other open DRs (D41520, D41528)?

I'm a bit confused -- why would you want to unhook them from the build...? Why not just add the FreeBSD-specific tests to another test file?

So there will then be two test files for strcmp? If the build system supports this, sure, why not.

  • tests: split strcmp_alignments into new test

This implements your requested move of the new strcmp unit test into a
FreeBSD-specific file.

All of my questions relate to understanding the rationale behind some of the design choices presented in the change.

contrib/netbsd-tests/lib/libc/string/t_strcmp.c
12–17 ↗(On Diff #127845)

Please sort per style(9).

lib/libc/tests/string/strcmp_test.c
104
  • What do 64, 16, and 3 represent? Why are those numbers important? What assumptions are you making about the hardware architecture -- does this work with 32-bit, 64-bit, will this work with 128-bit architectures, etc? It would be nice if these numbers were better explained in the code with comments and/or descriptive constant names.
106–107

Why '-'? Is this an arbitrary character choice?

111–115

Why doesn't this use nitems or sizeof and instead hardcoded values are used instead?

fuz planned changes to this revision.Oct 4 2023, 5:17 PM
fuz marked 3 inline comments as done.

Will prepare a new revision pending response to my comments.

contrib/netbsd-tests/lib/libc/string/t_strcmp.c
12–17 ↗(On Diff #127845)

Will do in the upcoming revision.

lib/libc/tests/string/strcmp_test.c
104

The 64 + 16 + 3 characters are needed to support 64 string lengths plus 16 possibly alignment offsets plus up to 3 sentinel characters behind the string. These values directly result from the code. I can add a comment with my next revision if you like.

106–107

More or less arbitrary. The code uses A and B as sentinels after the end of the string and X and Y to induce a mismatch. The fill character merely needs to have a character code lower than these two. A typical test case will look like strcmp("----X\0A", "-----\0B"); which is easy to see in a memory dump or in the debugger.

111–115

Because these values fall out of the properties of my strcmp routine I want to test. It uses SSE registers, which are 16 bytes in size. So there are 16 possible misalignments to consider. The code starts by processing up to 32 bytes in special “head” handling code. The main loop is then unrolled twice and thus processes 32 bytes per iteration. A 64 byte maximum length ensures that we test at least until the main loop completes once, catching all possibilities.

The sizeof operator cannot be used as none of these loops iterate by the size of some array.

For future implementations using wider registers, the values can be increased. I have taken this approach as it was also taken by various other unit tests. Please let me know if you would like the code to be more verbose.

fuz marked 3 inline comments as done.Oct 23 2023, 3:58 PM

I'm waiting for a response to my questions so I can prepare an update to this revision.

Ok, that part wasn't apparent before.

If you're willing to go through upstreaming the change to NetBSD and they accept it, I say it's worth taking -- if however upstream isn't interested in taking changes to the test, I don't see the point in modifying their test: I'd take the test, fork it, apply the necessary changes, then maintain it internally.

The goal is to consume the NetBSD test suite without having to hack it too much/do the bare minimum necessary to make it work on FreeBSD, while upstreaming anything possible back to the NetBSD project. It's not where FreeBSD specific tests/testing should go.

Also, please add comments for non-obvious things: you have the context now, but I didn't understand the reasoning, and you might not have the context at a later date. The code needs to either be intuitive by design or intuitive via comments.

Add comments to clarify parameter choices.
Also drop changes to the NetBSD bits. They are not required for
test suite coverage and prevent this DR from landing.

fuz retitled this revision from contrib/netbsd-tests/lib/libc/string/t_strcmp.c: extend and add support for external strcmp function to lib/libc/tests/string: add extended unit tests for strcmp().Nov 4 2023, 3:44 AM
fuz edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Nov 14 2023, 6:34 AM