Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate
    use Unicode::Collate::Locale;

    # construct
    $Collator = Unicode::Collate::Locale->new(locale => $locale_name, %tailoring);

    # sort
    @sorted = $Collator->sort(@not_sorted);

    # compare
    $result = $Collator->cmp($a, $b); # returns 1, 0, or -1.
Note: Strings in @not_sorted, $a, and $b are interpreted according to Perl's Unicode support. See perlunicode, perluniintro, perlunitut, perlunifaq, and utf8. Otherwise, either use the preprocess parameter (cf. Unicode::Collate) or decode the strings before passing them in.
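As an illustration of the note above, here is a minimal sketch of decoding byte strings before collation. The input list is hypothetical, and the use of Encode's decode() is an assumption about where the bytes come from (UTF-8 encoded data); it is not something this module requires.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Collate::Locale;

# Hypothetical input: UTF-8 encoded byte strings, e.g. as read from a file.
my @bytes = ("zebre", "etude", "\xC3\xA9t\xC3\xA9");

# Decode the bytes into Perl character strings before collating.
my @chars = map { decode('UTF-8', $_) } @bytes;

my $collator = Unicode::Collate::Locale->new(locale => 'fr');
my @sorted   = $collator->sort(@chars);
```

Without the decoding step, each byte of the encoded "é" would be treated as a separate character, and the result would not reflect the intended text.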
This module provides linguistic tailoring of DUCET, building on Unicode::Collate.
The new method returns a collator object.
The parameter list for the constructor is a hash, which can include the special key locale. Its value (case-insensitive) stands for a Unicode base language code (two or three letters). For example, Unicode::Collate::Locale->new(locale => 'FR') returns a collator tailored for French.
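To make the effect of the locale key concrete, the following sketch compares a plain Unicode::Collate collator with one tailored for Danish. The choice of 'da' and the sample strings are assumptions made for illustration; Danish is used because it orders "å" after "z".

```perl
use strict;
use warnings;
use Unicode::Collate;
use Unicode::Collate::Locale;

# Sample data (hypothetical): "\x{E5}" is LATIN SMALL LETTER A WITH RING ABOVE.
my @words = ("z", "\x{E5}", "a");

# Untailored DUCET ordering.
my @default_order = Unicode::Collate->new->sort(@words);

# Danish tailoring: the letter a-with-ring sorts after z.
my @danish_order = Unicode::Collate::Locale->new(locale => 'da')->sort(@words);
```

With the untailored collator, "å" is treated as a variant of "a" and sorts next to it; with the Danish collator it moves to the end of the alphabet.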
$locale_name may be suffixed with a Unicode script code (four letters), a Unicode region code, or a Unicode language variant code. These codes are case-insensitive and separated by '_' or '-'. E.g. en_US for English in the USA, az_Cyrl for Azerbaijani in the Cyrillic script, es_ES_traditional for Spanish in Spain (Traditional).
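The sketch below shows that differently cased and differently separated spellings of one locale name resolve to the same tailoring. It relies on the getlocale method, which is described in the module's full documentation rather than in this section; the exact string it returns depends on which locales the installed version ships.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

# The same locale written with different case and separators.
my @names = ('az_Cyrl', 'AZ-CYRL', 'az-cyrl');

# Record which tailoring each spelling actually resolves to.
my @resolved;
for my $name (@names) {
    my $collator = Unicode::Collate::Locale->new(locale => $name);
    push @resolved, $collator->getlocale;
    printf "%-8s => %s\n", $name, $resolved[-1];
}
```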
If $locale_name is not available, a fallback is selected in the following order:
1. language with a variant code
2. language with a script code
3. language with a region code
4. language
5. default
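The fallback order above can be observed with a short sketch. It assumes the getlocale method from the module's full documentation, which reports the tailoring actually selected; 'xx' is a made-up language code chosen because no tailoring exists for it.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

# 'da'    : language with a tailoring
# 'da_DK' : no region-specific tailoring, so the language is used
# 'xx'    : hypothetical code with no tailoring, so the default is used
my %resolved;
for my $name ('da', 'da_DK', 'xx') {
    $resolved{$name} = Unicode::Collate::Locale->new(locale => $name)->getlocale;
    print "$name => $resolved{$name}\n";
}
```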
Tailoring tags provided by Unicode::Collate are allowed as long as they are not used for locale support. In particular, the table tag can never be tailored, since it is reserved for DUCET.
However, entry is allowed even if it is used for locale support, so that mappings can be added or overridden.
E.g. a collator for French that ignores diacritics and case differences (i.e. level 1), with reversed case ordering and no normalization:

    Unicode::Collate::Locale->new(
        level => 1,
        locale => 'fr',
        upper_before_lower => 1,
        normalization => undef,
    )
Overriding a behavior already tailored by locale is disallowed if such a tailoring is passed to new().

    Unicode::Collate::Locale->new(
        locale => 'da',
        upper_before_lower => 0, # causes error as reserved by 'da'
    )
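A minimal sketch of handling this error, assuming the constructor reports it by dying (so that eval can trap it); the exact wording of the message is not relied upon here.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

# Attempt to override a tag reserved by the 'da' locale; trap the failure.
my $collator = eval {
    Unicode::Collate::Locale->new(
        locale             => 'da',
        upper_before_lower => 0,
    );
};
my $error = $@;

print defined $collator ? "constructed\n" : "rejected: $error";
```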
However, change(), inherited from Unicode::Collate, allows such a tailoring even when it is reserved by locale. Examples:

    new(locale => 'ca')->change(backwards => undef)
    new(locale => 'da')->change(upper_before_lower => 0)
    new(locale => 'ja')->change(overrideCJK => undef)
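As a sketch of the second example, the code below compares "A" and "a" before and after calling change() on a Danish collator. The comparison strings are an assumption chosen to expose the case ordering; note that change() modifies the collator in place rather than returning a copy.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

my $collator = Unicode::Collate::Locale->new(locale => 'da');

# The 'da' tailoring sorts uppercase before lowercase.
my $before = $collator->cmp('A', 'a');

# Switch to lowercase-first ordering after construction.
$collator->change(upper_before_lower => 0);
my $after = $collator->cmp('A', 'a');

print "before: $before, after: $after\n";
```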