{
    "content": [
        {
            "type": "text",
            "text": "# Unicode::Collate::Locale (perldoc)\n\n## NAME\n\nUnicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate\n\n## SYNOPSIS\n\nuse Unicode::Collate::Locale;\n#construct\n$Collator = Unicode::Collate::Locale->\nnew(locale => $localename, %tailoring);\n#sort\n@sorted = $Collator->sort(@notsorted);\n#compare\n$result = $Collator->cmp($a, $b); # returns 1, 0, or -1.\nNote: Strings in @notsorted, $a and $b are interpreted according to Perl's Unicode support. See\nperlunicode, perluniintro, perlunitut, perlunifaq, utf8. Otherwise you can use \"preprocess\" (cf.\n\"Unicode::Collate\") or should decode them before.\n\n## DESCRIPTION\n\nThis module provides linguistic tailoring for it taking advantage of \"Unicode::Collate\".\n\n## Sections\n\n- **NAME**\n- **SYNOPSIS**\n- **DESCRIPTION** (2 subsections)\n- **INSTALL**\n- **CAVEAT** (1 subsections)\n- **AUTHOR**\n- **SEE ALSO**\n\nUse structuredContent.sections for detailed options, examples, and full documentation.\n"
        }
    ],
    "structuredContent": {
        "command": "Unicode::Collate::Locale",
        "section": "",
        "mode": "perldoc",
        "summary": "Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate",
        "synopsis": "use Unicode::Collate::Locale;\n#construct\n$Collator = Unicode::Collate::Locale->\nnew(locale => $localename, %tailoring);\n#sort\n@sorted = $Collator->sort(@notsorted);\n#compare\n$result = $Collator->cmp($a, $b); # returns 1, 0, or -1.\nNote: Strings in @notsorted, $a and $b are interpreted according to Perl's Unicode support. See\nperlunicode, perluniintro, perlunitut, perlunifaq, utf8. Otherwise you can use \"preprocess\" (cf.\n\"Unicode::Collate\") or should decode them before.",
        "tldr_summary": null,
        "tldr_examples": [],
        "tldr_source": null,
        "flags": [],
        "examples": [],
        "see_also": [],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "SYNOPSIS",
                "lines": 16,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 2,
                "subsections": [
                    {
                        "name": "Constructor",
                        "lines": 51
                    },
                    {
                        "name": "Methods",
                        "lines": 149
                    }
                ]
            },
            {
                "name": "INSTALL",
                "lines": 5,
                "subsections": []
            },
            {
                "name": "CAVEAT",
                "lines": 9,
                "subsections": [
                    {
                        "name": "Reference",
                        "lines": 96
                    }
                ]
            },
            {
                "name": "AUTHOR",
                "lines": 7,
                "subsections": []
            },
            {
                "name": "SEE ALSO",
                "lines": 15,
                "subsections": []
            }
        ],
        "sections": {
            "NAME": {
                "content": "Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate\n",
                "subsections": []
            },
            "SYNOPSIS": {
                "content": "use Unicode::Collate::Locale;\n\n#construct\n$Collator = Unicode::Collate::Locale->\nnew(locale => $localename, %tailoring);\n\n#sort\n@sorted = $Collator->sort(@notsorted);\n\n#compare\n$result = $Collator->cmp($a, $b); # returns 1, 0, or -1.\n\nNote: Strings in @notsorted, $a and $b are interpreted according to Perl's Unicode support. See\nperlunicode, perluniintro, perlunitut, perlunifaq, utf8. Otherwise you can use \"preprocess\" (cf.\n\"Unicode::Collate\") or should decode them before.\n",
                "subsections": []
            },
            "DESCRIPTION": {
                "content": "This module provides linguistic tailoring for it taking advantage of \"Unicode::Collate\".\n",
                "subsections": [
                    {
                        "name": "Constructor",
                        "content": "The \"new\" method returns a collator object.\n\nA parameter list for the constructor is a hash, which can include a special key \"locale\" and its\nvalue (case-insensitive) standing for a Unicode base language code (two or three-letter). For\nexample, \"Unicode::Collate::Locale->new(locale => 'ES')\" returns a collator tailored for\nSpanish.\n\n$localename may be suffixed with a Unicode script code (four-letter), a Unicode region\n(territory) code, a Unicode language variant code. These codes are case-insensitive, and\nseparated with '' or '-'. E.g. \"enUS\" for English in USA, \"azCyrl\" for Azerbaijani in the\nCyrillic script, \"esEStraditional\" for Spanish in Spain (Traditional).\n\nIf $localename is not available, fallback is selected in the following order:\n\n1. language with a variant code\n2. language with a script code\n3. language with a region code\n4. language\n5. default\n\nTailoring tags provided by \"Unicode::Collate\" are allowed as long as they are not used for\n\"locale\" support. Esp. the \"table\" tag is always untailorable, since it is reserved for DUCET.\n\nHowever \"entry\" is allowed, even if it is used for \"locale\" support, to add or override\nmappings.\n\nE.g. a collator for Spanish, which ignores diacritics and case difference (i.e. level 1), with\nreversed case ordering and no normalization.\n\nUnicode::Collate::Locale->new(\nlevel => 1,\nlocale => 'es',\nupperbeforelower => 1,\nnormalization => undef\n)\n\nOverriding a behavior already tailored by \"locale\" is disallowed if such a tailoring is passed\nto \"new()\".\n\nUnicode::Collate::Locale->new(\nlocale => 'da',\nupperbeforelower => 0, # causes error as reserved by 'da'\n)\n\nHowever \"change()\" inherited from \"Unicode::Collate\" allows such a tailoring that is reserved by\n\"locale\". 
Examples:\n\nnew(locale => 'fr_ca')->change(backwards => undef)\nnew(locale => 'da')->change(upper_before_lower => 0)\nnew(locale => 'ja')->change(overrideCJK => undef)\n"
                    },
                    {
                        "name": "Methods",
                        "content": "\"Unicode::Collate::Locale\" is a subclass of \"Unicode::Collate\" and methods other than \"new\" are\ninherited from \"Unicode::Collate\".\n\nHere is a list of additional methods:\n\n\"$Collator->getlocale\"\nReturns a language code accepted and used actually on collation. If linguistic tailoring is\nnot provided for a language code you passed (intensionally for some languages, or due to the\nincomplete implementation), this method returns a string 'default' meaning no special\ntailoring.\n\n\"$Collator->localeversion\"\n(Since Unicode::Collate::Locale 0.87) Returns the version number (perhaps \"/\\d\\.\\d\\d/\") of\nthe locale, as that of Locale/*.pl.\n\nNote: Locale/*.pl that a collator uses should be identified by a combination of return\nvalues from \"getlocale\" and \"localeversion\".\n\nA list of tailorable locales\nlocale name       description\n--------------------------------------------------------------\naf                Afrikaans\nar                Arabic\nas                Assamese\naz                Azerbaijani (Azeri)\nbe                Belarusian\nbn                Bengali\nbs                Bosnian (tailored as Croatian)\nbsCyrl           Bosnian in Cyrillic (tailored as Serbian)\nca                Catalan\ncs                Czech\ncu                Church Slavic\ncy                Welsh\nda                Danish\ndephonebook     German (umlaut as 'ae', 'oe', 'ue')\ndeATphonebook   Austrian German (umlaut primary greater)\ndsb               Lower Sorbian\nee                Ewe\neo                Esperanto\nes                Spanish\nestraditional   Spanish ('ch' and 'll' as a grapheme)\net                Estonian\nfa                Persian\nfi                Finnish (v and w are primary equal)\nfiphonebook     Finnish (v and w as separate characters)\nfil               Filipino\nfo                Faroese\nfrCA             Canadian French\ngu                Gujarati\nha                Hausa\nhaw               
Hawaiian\nhe                Hebrew\nhi                Hindi\nhr                Croatian\nhu                Hungarian\nhy                Armenian\nig                Igbo\nis                Icelandic\nja                Japanese [1]\nkk                Kazakh\nkl                Kalaallisut\nkn                Kannada\nko                Korean [2]\nkok               Konkani\nlkt               Lakota\nln                Lingala\nlt                Lithuanian\nlv                Latvian\nmk                Macedonian\nml                Malayalam\nmr                Marathi\nmt                Maltese\nnb                Norwegian Bokmål\nnn                Norwegian Nynorsk\nnso               Northern Sotho\nom                Oromo\nor                Oriya\npa                Punjabi\npl                Polish\nro                Romanian\nsa                Sanskrit\nse                Northern Sami\nsi                Sinhala\nsi__dictionary    Sinhala (U+0DA5 = U+0DA2,0DCA,0DA4)\nsk                Slovak\nsl                Slovenian\nsq                Albanian\nsr                Serbian\nsr_Latn           Serbian in Latin (tailored as Croatian)\nsv                Swedish (v and w are primary equal)\nsv__reformed      Swedish (v and w as separate characters)\nta                Tamil\nte                Telugu\nth                Thai\ntn                Tswana\nto                Tonga\ntr                Turkish\nug_Cyrl           Uyghur in Cyrillic\nuk                Ukrainian\nur                Urdu\nvi                Vietnamese\nvo                Volapük\nwae               Walser\nwo                Wolof\nyo                Yoruba\nzh                Chinese\nzh__big5han       Chinese (ideographs: big5 order)\nzh__gb2312han     Chinese (ideographs: GB-2312 order)\nzh__pinyin        Chinese (ideographs: pinyin order) [3]\nzh__stroke        Chinese (ideographs: stroke order) [3]\nzh__zhuyin        Chinese (ideographs: zhuyin order) 
[3]\n--------------------------------------------------------------\n\nLocales according to the default UCA rules include am (Amharic) without \"[reorder Ethi]\", bg\n(Bulgarian) without \"[reorder Cyrl]\", chr (Cherokee) without \"[reorder Cher]\", de (German), en\n(English), fr (French), ga (Irish), id (Indonesian), it (Italian), ka (Georgian) without\n\"[reorder Geor]\", mn (Mongolian) without \"[reorder Cyrl Mong]\", ms (Malay), nl (Dutch), pt\n(Portuguese), ru (Russian) without \"[reorder Cyrl]\", sw (Swahili), zu (Zulu).\n\nNote\n\n[1] ja: Ideographs are sorted in JIS X 0208 order. Fullwidth and halfwidth forms are identical\nto their regular form. The difference between hiragana and katakana is at the 4th level, the\ncomparison also requires \"(variable => 'Non-ignorable')\", and then \"katakana_before_hiragana\"\nhas no effect.\n\n[2] ko: Plenty of ideographs are sorted by their reading. Such an ideograph is primary (level 1)\nequal to, and secondary (level 2) greater than, the corresponding hangul syllable.\n\n[3] zh__pinyin, zh__stroke and zh__zhuyin: implemented alt='short', where a smaller number of\nideographs are tailored.\n\nA list of variant codes and their aliases\nvariant code       alias\n------------------------------------------\ndictionary         dict\nphonebook          phone     phonebk\nreformed           reform\ntraditional        trad\n------------------------------------------\nbig5han            big5\ngb2312han          gb2312\npinyin\nstroke\nzhuyin\n------------------------------------------\n\nNote: 'pinyin' is Han in Latin, 'zhuyin' is Han in Bopomofo.\n"
                    }
                ]
            },
            "INSTALL": {
                "content": "Installation of \"Unicode::Collate::Locale\" requires Collate/Locale.pm, Collate/Locale/*.pm,\nCollate/CJK/*.pm and Collate/allkeys.txt. On building, \"Unicode::Collate::Locale\" doesn't\nrequire any of data/*.txt, gendata/*, and mklocale. Tests for \"Unicode::Collate::Locale\" are\nnamed t/loc*.t.\n",
                "subsections": []
            },
            "CAVEAT": {
                "content": "Tailoring is not maximum\nEven if a certain letter is tailored, its equivalent would not always tailored as well as\nit. For example, even though W is tailored, fullwidth W (\"U+FF37\"), W with acute (\"U+1E82\"),\netc. are not tailored. The result may depend on whether source strings are normalized or\nnot, and whether decomposed or composed. Thus \"(normalization => undef)\" is less preferred.\n\nCollation reordering is not supported\nThe order of any groups including scripts is not changed.\n",
                "subsections": [
                    {
                        "name": "Reference",
                        "content": "locale            based CLDR or other reference\n--------------------------------------------------------------------\naf                30 = 1.8.1\nar                30 = 28 (\"compat\" wo [reorder Arab]) = 1.9.0\nas                30 = 28 (without [reorder Beng..]) = 23\naz                30 = 24 (\"standard\" wo [reorder Latn Cyrl])\nbe                30 = 28 (without [reorder Cyrl])\nbn                30 = 28 (\"standard\" wo [reorder Beng..]) = 2.0.1\nbs                30 = 28 (type=\"standard\": [import hr])\nbsCyrl           30 = 28 (type=\"standard\": [import sr])\nca                30 = 23 (alt=\"proposed\" type=\"standard\")\ncs                30 = 1.8.1 (type=\"standard\")\ncu                34 = 30 (without [reorder Cyrl])\ncy                30 = 1.8.1\nda                22.1 = 1.8.1 (type=\"standard\")\ndephonebook     30 = 2.0 (type=\"phonebook\")\ndeATphonebook   30 = 27 (type=\"phonebook\")\ndsb               30 = 26\nee                30 = 21\neo                30 = 1.8.1\nes                30 = 1.9.0 (type=\"standard\")\nestraditional   30 = 1.8.1 (type=\"traditional\")\net                30 = 26\nfa                22.1 = 1.8.1\nfi                22.1 = 1.8.1 (type=\"standard\" alt=\"proposed\")\nfiphonebook     22.1 = 1.8.1 (type=\"phonebook\")\nfil               30 = 1.9.0 (type=\"standard\") = 1.8.1\nfo                22.1 = 1.8.1 (alt=\"proposed\" type=\"standard\")\nfrCA             30 = 1.9.0\ngu                30 = 28 (\"standard\" wo [reorder Gujr..]) = 1.9.0\nha                30 = 1.9.0\nhaw               30 = 24\nhe                30 = 28 (without [reorder Hebr]) = 23\nhi                30 = 28 (without [reorder Deva..]) = 1.9.0\nhr                30 = 28 (\"standard\" wo [reorder Latn Cyrl]) = 1.9.0\nhu                22.1 = 1.8.1 (alt=\"proposed\" type=\"standard\")\nhy                30 = 28 (without [reorder Armn]) = 1.8.1\nig                30 = 1.8.1\nis                22.1 = 1.8.1 
(type=\"standard\")\nja                22.1 = 1.8.1 (type=\"standard\")\nkk                30 = 28 (without [reorder Cyrl])\nkl                22.1 = 1.8.1 (type=\"standard\")\nkn                30 = 28 (\"standard\" wo [reorder Knda..]) = 1.9.0\nko                22.1 = 1.8.1 (type=\"standard\")\nkok               30 = 28 (without [reorder Deva..]) = 1.8.1\nlkt               30 = 25\nln                30 = 2.0 (type=\"standard\") = 1.8.1\nlt                22.1 = 1.9.0\nlv                22.1 = 1.9.0 (type=\"standard\") = 1.8.1\nmk                30 = 28 (without [reorder Cyrl])\nml                22.1 = 1.9.0\nmr                30 = 28 (without [reorder Deva..]) = 1.8.1\nmt                22.1 = 1.9.0\nnb                22.1 = 2.0   (type=\"standard\")\nnn                22.1 = 2.0   (type=\"standard\")\nnso           [*] 26 = 1.8.1\nom                22.1 = 1.8.1\nor                30 = 28 (without [reorder Orya..]) = 1.9.0\npa                22.1 = 1.8.1\npl                30 = 1.8.1\nro                30 = 1.9.0 (type=\"standard\")\nsa            [*] 1.9.1 = 1.8.1 (type=\"standard\" alt=\"proposed\")\nse                22.1 = 1.8.1 (type=\"standard\")\nsi                30 = 28 (\"standard\" wo [reorder Sinh..]) = 1.9.0\nsi__dictionary    30 = 28 (\"dictionary\" wo [reorder Sinh..]) = 1.9.0\nsk                22.1 = 1.9.0 (type=\"standard\")\nsl                22.1 = 1.8.1 (type=\"standard\" alt=\"proposed\")\nsq                22.1 = 1.8.1 (alt=\"proposed\" type=\"standard\")\nsr                30 = 28 (without [reorder Cyrl])\nsr_Latn           30 = 28 (type=\"standard\": [import hr])\nsv                22.1 = 1.9.0 (type=\"standard\")\nsv__reformed      22.1 = 1.8.1 (type=\"reformed\")\nta                22.1 = 1.9.0\nte                30 = 28 (without [reorder Telu..]) = 1.9.0\nth                22.1 = 22\ntn            [*] 26 = 1.8.1\nto                22.1 = 22\ntr                22.1 = 1.8.1 (type=\"standard\")\nuk                30 = 28 (without [reorder 
Cyrl])\nug_Cyrl           https://en.wikipedia.org/wiki/Uyghur_Cyrillic_alphabet\nur                22.1 = 1.9.0\nvi                22.1 = 1.8.1\nvo                30 = 25\nwae               30 = 2.0\nwo            [*] 1.9.1 = 1.8.1\nyo                30 = 1.8.1\nzh                22.1 = 1.8.1 (type=\"standard\")\nzh__big5han       22.1 = 1.8.1 (type=\"big5han\")\nzh__gb2312han     22.1 = 1.8.1 (type=\"gb2312han\")\nzh__pinyin        22.1 = 2.0   (type='pinyin' alt='short')\nzh__stroke        22.1 = 1.9.1 (type='stroke' alt='short')\nzh__zhuyin        22.1 = 22    (type='zhuyin' alt='short')\n--------------------------------------------------------------------\n\n[*] http://www.unicode.org/repos/cldr/tags/latest/seed/collation/\n"
                    }
                ]
            },
            "AUTHOR": {
                "content": "The Unicode::Collate::Locale module for perl was written by SADAHIRO Tomoyuki,\n<SADAHIRO@cpan.org>. This module is Copyright(C) 2004-2020, SADAHIRO Tomoyuki. Japan. All rights\nreserved.\n\nThis module is free software; you can redistribute it and/or modify it under the same terms as\nPerl itself.\n",
                "subsections": []
            },
            "SEE ALSO": {
                "content": "Unicode Collation Algorithm - UTS #10\n<http://www.unicode.org/reports/tr10/>\n\nThe Default Unicode Collation Element Table (DUCET)\n<http://www.unicode.org/Public/UCA/latest/allkeys.txt>\n\nUnicode Locale Data Markup Language (LDML) - UTS #35\n<http://www.unicode.org/reports/tr35/>\n\nCLDR - Unicode Common Locale Data Repository\n<http://cldr.unicode.org/>\n\nUnicode::Collate\nUnicode::Normalize\n",
                "subsections": []
            }
        }
    }
}