Locale naming
Draft
For the naming of the locales at the user level, the following conventions are used.
Languages, and therefore the corresponding files, are named with the
English name, lowercased and without spaces: northernkurdish.
Diacritics and non-letters are just removed (lu for “Lü”, kinaraya for
“Kinaray-a”, bosniaherzegovina for “Bosnia & Herzegovina”). Other
fields, like script and region, are separated with hyphens:
serbian-latin, spanish-mexico. Some regions may have long names
(eg, bosniaherzegovina), so, for convenience, the corresponding code
is also allowed (ba, in this case).
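
The conventions above can be sketched in a few lines of Python. This is only an illustration, not part of babel: the function names normalize_field and locale_name are hypothetical, and it assumes that stripping diacritics via Unicode NFKD decomposition is an acceptable approximation of "just removed". Letters without a decomposition (such as “ø”) would be dropped entirely and would need manual handling.

```python
import unicodedata


def normalize_field(english_name: str) -> str:
    """Lowercase an English name and drop diacritics and non-letters.

    Hypothetical helper: babel itself does not ship this function.
    """
    # NFKD splits "ü" into "u" plus a combining diaeresis, which is
    # then discarded because it is not an ASCII letter.
    decomposed = unicodedata.normalize("NFKD", english_name)
    letters = (ch for ch in decomposed if ch.isascii() and ch.isalpha())
    return "".join(letters).lower()


def locale_name(language: str, *fields: str) -> str:
    """Join the language with further fields (script, region) by hyphens."""
    return "-".join(normalize_field(part) for part in (language, *fields))
```

For instance, locale_name("Spanish", "Mexico") gives spanish-mexico, and normalize_field("Bosnia & Herzegovina") gives bosniaherzegovina. Region codes such as ba are passed through unchanged, since they are already lowercase letters.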
The names are taken from the CLDR. Wherever the CLDR doesn’t provide a
name (eg, “Medieval Latin”), the pattern followed in practice for other
names is applied, namely, use the ‘natural’ form in English:
medievallatin. Such names should preferably be based on the description
field in the IANA registry (eg, polytonicgreek), although some
simplifications may be necessary, because some names are “too”
descriptive. See also the templates for about 500 locales already
available. Glottolog is used as a secondary source, too. (Wikipedia
articles can be taken as a complementary but unreliable source, and
their information must be verified; on the other hand, internal data,
like this one, is useful for both names and tags.)
A few locales with a region or a script have a more precise name in the
CLDR. For example, ro-MD is “Moldavian”. These names will be normalized
in babel in the next few releases.
When there are ‘short’ additional names (without hyphens), prefer
‘plain’ demonyms (even if vernacular) over composed names (eg, british
is better than UKenglish). This reflects the evolution of the english
style, because the names american and british predate USenglish and
UKenglish. Further, the main names in the CLDR for en-US and en-GB are
American English and British English. Note that the names ukenglish and
usenglish (all lowercase) are not supported by babel as ldf files, even
if they work in some operating systems.
The following names are deprecated (use the name after the arrow):
brazil → brazilian
bahasai → indonesian
bahasam, meyalu [sic] → malay
classiclatin → classicallatin
ecclesiasticlatin → ecclesiasticallatin
frenchb, francais, canadien → french
germanb → german, ngerman (see below)
lsorbian → lowersorbian
portuges [sic] → portuguese
samin → northernsami
ukraineb → ukrainian
usorbian → uppersorbian
vietnam → vietnamese
russianb → russian
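
The list of deprecated names can be read as a simple lookup table. The following sketch is an assumption about how a tool outside babel might use it, for example to flag old names in a document preamble. The names DEPRECATED and preferred_names are hypothetical, and the lookup is deliberately case-sensitive, since case matters in names such as UKenglish.

```python
# Deprecated name -> recommended replacement(s), copied from the list above.
# germanb has two candidates because the right one depends on the document.
DEPRECATED = {
    "brazil": ("brazilian",),
    "bahasai": ("indonesian",),
    "bahasam": ("malay",),
    "meyalu": ("malay",),
    "classiclatin": ("classicallatin",),
    "ecclesiasticlatin": ("ecclesiasticallatin",),
    "frenchb": ("french",),
    "francais": ("french",),
    "canadien": ("french",),
    "germanb": ("german", "ngerman"),
    "lsorbian": ("lowersorbian",),
    "portuges": ("portuguese",),
    "samin": ("northernsami",),
    "ukraineb": ("ukrainian",),
    "usorbian": ("uppersorbian",),
    "vietnam": ("vietnamese",),
    "russianb": ("russian",),
}


def preferred_names(name: str) -> tuple[str, ...]:
    """Return the replacement(s) for a deprecated name, or the name itself."""
    return DEPRECATED.get(name, (name,))
```

A name that is not deprecated is returned unchanged, so preferred_names("french") yields ("french",), while preferred_names("germanb") yields both candidates and leaves the choice to the caller.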
Because they are deprecated, these names are not included in
name.babel. Some anomalous names (related to german and serbian) should
be fixed in the future, but how to deal with them is under study.