awk: revert to upstream behavior for ranges for gawk compatibility
ClosedPublic

Authored by imp on Jul 9 2021, 8:21 AM.
Tags
None

Details

Summary

In 2005, FreeBSD changed one-true-awk to honor the locale's collating
order. This was billed as a temporary patch. It was also compatible with
the then-current behavior of gawk. That temporary patch has lasted 16
years now.

However, IEEE Std 1003.1-2008 changed the behavior of ranges in regular
expressions outside of the "C" and "POSIX" locales to be undefined.

Starting in 2011, gawk 4.0 stopped using the locale for the range
regular expressions and used the traditional behavior only. The
maintainer had grown weary of answering why '[A-Z]' would sometimes
match lower-case letters. The details are explained here:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
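
To illustrate the difference between the two behaviors, here is a minimal
sketch in C. It is not the one-true-awk code: the function names
in_range_bytes and in_range_collate are hypothetical, and the sketch only
assumes that the locale-aware variant compares single characters with
strcoll(3), as the review discussion suggests the reverted patch did.

```c
#include <string.h>

/*
 * Traditional range test: compare raw byte values.
 * This is what '[A-Z]' means in the "C" and "POSIX" locales.
 * (Hypothetical helper, for illustration only.)
 */
int
in_range_bytes(unsigned char c, unsigned char lo, unsigned char hi)
{
    return lo <= c && c <= hi;
}

/*
 * Locale-aware range test: compare collation order with strcoll(3).
 * The result depends on the LC_COLLATE category of the current locale.
 * (Hypothetical helper, for illustration only.)
 */
int
in_range_collate(unsigned char c, unsigned char lo, unsigned char hi)
{
    char cs[2] = { (char)c, '\0' };
    char los[2] = { (char)lo, '\0' };
    char his[2] = { (char)hi, '\0' };

    return strcoll(los, cs) <= 0 && strcoll(cs, his) <= 0;
}
```

In the "C" locale both functions agree, since strcoll then orders strings
the same way strcmp does. In a locale whose collating order interleaves
upper and lower case, in_range_collate('q', 'A', 'Z') may return nonzero,
which is the surprising '[A-Z]' behavior described above.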

To restore compatibility with other implementations of awk, revert this
patch. FreeBSD is the odd system out. It also has the nice side effect
of eliminating the last of our differences with upstream one-true-awk.

MFC After: 2 weeks
Sponsored by: Netflix

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

imp requested review of this revision. Jul 9 2021, 8:21 AM
imp created this revision.
rgrimes added a subscriber: rgrimes.
rgrimes added inline comments.
contrib/one-true-awk/main.c, line 120

Are there any possible side effects from this no longer being cleared to NULL? I.e., what if LC_COLLATE is set? Prior to this change that wouldn't matter, but after this change could it cause things like ports to break? NVM, I think I answered that myself: after this change there are no more calls to strcoll, so this value would not matter anyway.

This revision is now accepted and ready to land. Jul 9 2021, 1:19 PM
cy added a subscriber: cy.

Reverting to standard is a good idea.

I plan on committing this w/o an exp run in the next few days.
https://marc.info/?l=freebsd-arch&m=162592710615072&w=2
has the context for why... The exp run is done in a context where
these changes wouldn't change anything, so it wouldn't be able to
detect any problems and would thus be a waste of time.