sort: test against all month formats in month-sort

Description

The CLDR specification [1] defines three possible month formats:

  • Abbreviation (e.g., Jan, Ιαν)
  • Full (e.g., January, Ιανουαρίου)
  • Standalone (e.g., January, Ιανουάριος)

Many languages use different case endings depending on whether the month
is referenced as a standalone word (nominative case) or in a date context
(genitive, partitive, etc.). sort(1)'s -M option currently sorts months
by testing input against only the abbreviation format, which is
essentially a substring of the full format. This works fine for
languages like English, which have no cases, but it is not sufficient
for languages where the case ending differs between the
abbreviation/full formats and the standalone format.

For example, in Greek, "May" can take the following forms:

Abbreviation: Μαΐ (genitive case)
Full: Μαΐου (genitive case)
Standalone: Μάιος (nominative case)

If the input uses the standalone format in Greek, sort(1) will not be
able to match "Μαΐ" to "Μάιος" and the sort will fail.

This change makes sort(1) test against all three formats. It also works
when the input contains mixed formats.
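
The idea can be sketched in C as follows. This is only an illustration
under stated assumptions, not the code from this change: the struct, the
table and the function names are hypothetical, the table holds just the
two Greek months quoted above, and matching is a plain byte-wise prefix
comparison on UTF-8 strings, whereas sort(1) takes its month names from
the current locale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one month in its three CLDR formats. */
struct month_forms {
	int number;		/* 1 = January ... 12 = December */
	const char *abbrev;
	const char *full;
	const char *standalone;
};

/* Illustrative subset: only the Greek forms quoted in this message. */
static const struct month_forms greek_months[] = {
	{ 1, "Ιαν", "Ιανουαρίου", "Ιανουάριος" },
	{ 5, "Μαΐ", "Μαΐου", "Μάιος" },
};

/* Return non-zero if input begins with name (byte-wise comparison). */
static int
begins_with(const char *input, const char *name)
{
	return (strncmp(input, name, strlen(name)) == 0);
}

/*
 * Return the month number if input begins with any of the three
 * formats of a known month, or -1 if nothing matches.
 */
static int
month_number(const char *input)
{
	size_t i;

	for (i = 0; i < sizeof(greek_months) / sizeof(greek_months[0]); i++) {
		const struct month_forms *m = &greek_months[i];

		if (begins_with(input, m->abbrev) ||
		    begins_with(input, m->full) ||
		    begins_with(input, m->standalone))
			return (m->number);
	}
	return (-1);
}
```

With only the abbreviation check, "Μάιος" would not match "Μαΐ" because
the accent sits on a different letter. With all three checks,
month_number() returns 5 for "Μαΐ", "Μαΐου" and "Μάιος" alike, so mixed
input ends up under the same month.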

[1] https://cldr.unicode.org/translation/date-time/date-time-patterns

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42847

Details

Provenance
christos authored on Dec 1 2023, 12:30 AM
Reviewer
markj
Differential Revision
D42847: sort: test against all month formats in month-sort
Parents
rGf42518ff1250: tcp: for LRD move sysctl from tcp.do_lrd tp tcp.sack.lrd, remove sockopt