{
    "content": [
        {
            "type": "text",
            "text": "# PERLLOCALE(1) (man)\n\n**Summary:** perllocale - Perl locale handling (internationalization and localization)\n\n## Section Outline\n\n- **NAME** (2 lines)\n- **DESCRIPTION** (45 lines)\n- **WHAT IS A LOCALE** (35 lines)\n- **PREPARING TO USE LOCALES** (30 lines)\n- **USING LOCALES** (1 lines) — 13 subsections\n  - The \"use locale\" pragma (24 lines)\n  - Not within the scope of \"use locale\" (43 lines)\n  - Under \"\"use locale\";\" (85 lines)\n  - The setlocale function (86 lines)\n  - Multi-threaded operation (38 lines)\n  - Finding locales (67 lines)\n  - Testing for broken locales (10 lines)\n  - Temporarily fixing locale problems (37 lines)\n  - Permanently fixing locale problems (16 lines)\n  - Permanently fixing your system's locale configuration (13 lines)\n  - Fixing system locale configuration (6 lines)\n  - The localeconv function (64 lines)\n  - I18N::Langinfo (21 lines)\n- **LOCALE CATEGORIES** (207 lines) — 1 subsections\n  - Other categories (6 lines)\n- **SECURITY** (128 lines)\n- **ENVIRONMENT** (71 lines) — 1 subsections\n  - Examples (15 lines)\n- **NOTES** (29 lines) — 7 subsections\n  - Backward compatibility (13 lines)\n  - I18N:Collate obsolete (6 lines)\n  - Sort speed and memory use impacts (7 lines)\n  - Freely available locale definitions (15 lines)\n  - I18n and l10n (4 lines)\n  - An imperfect standard (5 lines)\n  - Unicode and UTF-8 (123 lines)\n- **BUGS** (1 lines) — 3 subsections\n  - Collation of strings containing embedded \"NUL\" characters (8 lines)\n  - Multi-threaded (27 lines)\n  - Broken systems (8 lines)\n- **SEE ALSO** (6 lines)\n- **HISTORY** (7 lines)\n\n## Full Content\n\n### NAME\n\nperllocale - Perl locale handling (internationalization and localization)\n\n### DESCRIPTION\n\nIn the beginning there was ASCII, the \"American Standard Code for Information Interchange\",\nwhich works quite well for Americans with their English alphabet and dollar-denominated\ncurrency.  
But it doesn't work so well even for other English speakers, who may use different\ncurrencies, such as the pound sterling (as the symbol for that currency is not in ASCII); and\nit's hopelessly inadequate for many of the thousands of the world's other languages.\n\nTo address these deficiencies, the concept of locales was invented (formally the ISO C, XPG4,\nPOSIX 1.c \"locale system\").  And applications were and are being written that use the locale\nmechanism.  The process of making such an application take account of its users' preferences\nin these kinds of matters is called internationalization (often abbreviated as i18n); telling\nsuch an application about a particular set of preferences is known as localization (l10n).\n\nPerl has been extended to support certain types of locales available in the locale system.\nThis is controlled per application by using one pragma, one function call, and several\nenvironment variables.\n\nPerl supports single-byte locales that are supersets of ASCII, such as the ISO 8859 ones, and\none multi-byte-type locale, UTF-8 ones, described in the next paragraph.  Perl doesn't\nsupport any other multi-byte locales, such as the ones for East Asian languages.\n\nUnfortunately, there are quite a few deficiencies with the design (and often, the\nimplementations) of locales.  Unicode was invented (see perlunitut for an introduction to\nthat) in part to address these design deficiencies, and nowadays, there is a series of \"UTF-8\nlocales\", based on Unicode.  These are locales whose character set is Unicode, encoded in\nUTF-8.  Starting in v5.20, Perl fully supports UTF-8 locales, except for sorting and string\ncomparisons like \"lt\" and \"ge\".  Starting in v5.26, Perl can handle these reasonably as well,\ndepending on the platform's implementation.  However, for earlier releases or for better\ncontrol, use Unicode::Collate.  
There are actually two slightly different types of UTF-8\nlocales: one for Turkic languages and one for everything else.\n\nStarting in Perl v5.30, Perl detects Turkic locales by their behaviour, and seamlessly\nhandles both types; previously only the non-Turkic one was supported.  The name of the locale\nis ignored; if your system has a \"tr_TR.UTF-8\" locale and it doesn't behave like a Turkic\nlocale, perl will treat it like a non-Turkic locale.\n\nPerl continues to support the old non UTF-8 locales as well.  There are currently no UTF-8\nlocales for EBCDIC platforms.\n\n(Unicode is also creating \"CLDR\", the \"Common Locale Data Repository\",\n<http://cldr.unicode.org/> which includes more types of information than are available in the\nPOSIX locale system.  At the time of this writing, there was no CPAN module that provides\naccess to this XML-encoded data.  However, it is possible to compute the POSIX locale data\nfrom them, and earlier CLDR versions had these already extracted for you as UTF-8 locales\n<http://unicode.org/Public/cldr/2.0.1/>.)\n\n### WHAT IS A LOCALE\n\nA locale is a set of data that describes various aspects of how various communities in the\nworld categorize their world.  These categories are broken down into the following types\n(some of which include a brief note here):\n\nCategory \"LC_NUMERIC\": Numeric formatting\nThis indicates how numbers should be formatted for human readability, for example the\ncharacter used as the decimal point.\n\nCategory \"LC_MONETARY\": Formatting of monetary amounts\n\n\nCategory \"LC_TIME\": Date/Time formatting\n\n\nCategory \"LC_MESSAGES\": Error and other messages\nThis is used by Perl itself only for accessing operating system error messages via $! and\n$^E.\n\nCategory \"LC_COLLATE\": Collation\nThis indicates the ordering of letters for comparison and sorting.  
In Latin alphabets,\nfor example, \"b\" generally follows \"a\".\n\nCategory \"LC_CTYPE\": Character Types\nThis indicates, for example, whether a character is an uppercase letter.\n\nOther categories\nSome platforms have other categories, dealing with such things as measurement units and\npaper sizes.  None of these are used directly by Perl, but outside operations that Perl\ninteracts with may use these.  See \"Not within the scope of \"use locale\"\" below.\n\nMore details on the categories used by Perl are given below in \"LOCALE CATEGORIES\".\n\nTogether, these categories go a long way towards being able to customize a single program to\nrun in many different locations.  But there are deficiencies, so keep reading.\n\n### PREPARING TO USE LOCALES\n\nPerl itself (outside the POSIX module) will not use locales unless specifically requested to\n(but again note that Perl may interact with code that does use them).  Even if there is such\na request, all of the following must be true for it to work properly:\n\n•   Your operating system must support the locale system.  If it does, you should find that\nthe \"setlocale()\" function is a documented part of its C library.\n\n•   Definitions for locales that you use must be installed.  You, or your system\nadministrator, must make sure that this is the case. The available locales, the location\nin which they are kept, and the manner in which they are installed all vary from system\nto system.  Some systems provide only a few, hard-wired locales and do not allow more to\nbe added.  Others allow you to add \"canned\" locales provided by the system supplier.\nStill others allow you or the system administrator to define and add arbitrary locales.\n(You may have to ask your supplier to provide canned locales that are not delivered with\nyour operating system.)  Read your system documentation for further illumination.\n\n•   Perl must believe that the locale system is supported.  
If it does, \"perl -V:d_setlocale\"\nwill say that the value for \"d_setlocale\" is \"define\".\n\nIf you want a Perl application to process and present your data according to a particular\nlocale, the application code should include the \"use locale\" pragma (see \"The \"use locale\"\npragma\") where appropriate, and at least one of the following must be true:\n\n1.  The locale-determining environment variables (see \"ENVIRONMENT\") must be correctly set up\nat the time the application is started, either by yourself or by whoever set up your\nsystem account; or\n\n2.  The application must set its own locale using the method described in \"The setlocale\nfunction\".\n\n### USING LOCALES\n\n#### The \"use locale\" pragma\n\nStarting in Perl 5.28, this pragma may be used in multi-threaded applications on systems that\nhave thread-safe locale ability.  Some caveats apply; see \"Multi-threaded\" below.  On systems\nwithout this capability, or in earlier Perls, do NOT use this pragma in scripts that have\nmultiple threads active.  The locale in these cases is not local to a single thread.  Another\nthread may change the locale at any time, which at a minimum could leave a given thread\noperating in a locale it isn't expecting to be in.  On some platforms, segfaults can also\noccur.  The locale change need not be explicit; some operations cause perl to change the\nlocale itself.  You are vulnerable simply by having done a \"use locale\".\n\nBy default, Perl itself (outside the POSIX module) ignores the current locale.  The\n\"use locale\" pragma tells Perl to use the current locale for some operations.  Starting in\nv5.16, there are optional parameters to this pragma, described below, which restrict which\noperations are affected by it.\n\nThe current locale is set at execution time by setlocale() described below.  
If that function\nhasn't yet been called in the course of the program's execution, the current locale is that\nwhich was determined by the \"ENVIRONMENT\" in effect at the start of the program.  If there is\nno valid environment, the current locale is whatever the system default has been set to.   On\nPOSIX systems, it is likely, but not necessarily, the \"C\" locale.  On Windows, the default is\nset via the computer's \"Control Panel->Regional and Language Options\" (or its current\nequivalent).\n\nThe operations that are affected by locale are:\n\n#### Not within the scope of \"use locale\"\n\nOnly certain operations (all originating outside Perl) should be affected, as follows:\n\n•   The current locale is used when going outside of Perl with operations like system()\nor qx//, if those operations are locale-sensitive.\n\n•   Also Perl gives access to various C library functions through the POSIX module.  Some\nof those functions are always affected by the current locale.  For example,\n\"POSIX::strftime()\" uses \"LC_TIME\"; \"POSIX::strtod()\" uses \"LC_NUMERIC\";\n\"POSIX::strcoll()\" and \"POSIX::strxfrm()\" use \"LC_COLLATE\".  All such functions will\nbehave according to the current underlying locale, even if that locale isn't exposed\nto Perl space.\n\nThis applies as well to I18N::Langinfo.\n\n•   XS modules for all categories but \"LC_NUMERIC\" get the underlying locale, and hence\nany C library functions they call will use that underlying locale.  For more\ndiscussion, see \"CAVEATS\" in perlxs.\n\nNote that all C programs (including the perl interpreter, which is written in C) always\nhave an underlying locale.  That locale is the \"C\" locale unless changed by a call to\nsetlocale().  When Perl starts up, it changes the underlying locale to the one which is\nindicated by the \"ENVIRONMENT\".  
When using the POSIX module or writing XS code, it is\nimportant to keep in mind that the underlying locale may be something other than \"C\",\neven if the program hasn't explicitly changed it.\n\n\n\nLingering effects of \"use locale\"\nCertain Perl operations that are set up within the scope of a \"use locale\" retain that\neffect even outside the scope.  These include:\n\n•   The output format of a write() is determined by an earlier format declaration\n(\"format\" in perlfunc), so whether or not the output is affected by locale is\ndetermined by whether the \"format()\" is within the scope of a \"use locale\", not whether\nthe \"write()\" is.\n\n•   Regular expression patterns can be compiled using qr// with actual matching deferred\nto later.  Again, it is whether or not the compilation was done within the scope of\n\"use locale\" that determines the match behavior, not whether the matches are done within\nsuch a scope.\n\n#### Under \"\"use locale\";\"\n\n•   All the above operations\n\n•   Format declarations (\"format\" in perlfunc) and hence any subsequent \"write()\"s use\n\"LC_NUMERIC\".\n\n•   Stringification and output use \"LC_NUMERIC\".  These include the results of \"print()\",\n\"printf()\", \"say()\", and \"sprintf()\".\n\n•   The comparison operators (\"lt\", \"le\", \"cmp\", \"ge\", and \"gt\") use \"LC_COLLATE\".\n\"sort()\" is also affected if used without an explicit comparison function, because it\nuses \"cmp\" by default.\n\nNote: \"eq\" and \"ne\" are unaffected by locale: they always perform a char-by-char\ncomparison of their scalar operands.  What's more, if \"cmp\" finds that its operands\nare equal according to the collation sequence specified by the current locale, it\ngoes on to perform a char-by-char comparison, and only returns 0 (equal) if the\noperands are char-for-char identical.  
If you really want to know whether two\nstrings--which \"eq\" and \"cmp\" may consider different--are equal as far as collation\nin the locale is concerned, see the discussion in \"Category \"LC_COLLATE\": Collation\".\n\n•   Regular expressions and case-modification functions (\"uc()\", \"lc()\", \"ucfirst()\", and\n\"lcfirst()\") use \"LC_CTYPE\"\n\n•   The variables $! (and its synonyms $ERRNO and $OS_ERROR) and $^E (and its synonym\n$EXTENDED_OS_ERROR) when used as strings use \"LC_MESSAGES\".\n\nThe default behavior is restored with the \"no locale\" pragma, or upon reaching the end of the\nblock enclosing \"use locale\".  Note that \"use locale\" calls may be nested, and that what is\nin effect within an inner scope will revert to the outer scope's rules at the end of the\ninner scope.\n\nThe string result of any operation that uses locale information is tainted, as it is possible\nfor a locale to be untrustworthy.  See \"SECURITY\".\n\nStarting in Perl v5.16 in a very limited way, and more generally in v5.22, you can restrict\nwhich category or categories are enabled by this particular instance of the pragma by adding\nparameters to it.  For example,\n\nuse locale qw(:ctype :numeric);\n\nenables locale awareness within its scope of only those operations (listed above) that are\naffected by \"LC_CTYPE\" and \"LC_NUMERIC\".\n\nThe possible categories are: \":collate\", \":ctype\", \":messages\", \":monetary\", \":numeric\",\n\":time\", and the pseudo category \":characters\" (described below).\n\nThus you can say\n\nuse locale ':messages';\n\nand only $! and $^E will be locale aware.  Everything else is unaffected.\n\nSince Perl doesn't currently do anything with the \"LC_MONETARY\" category, specifying\n\":monetary\" does effectively nothing.  
Some systems have other categories, such as\n\"LC_PAPER\", but Perl also doesn't do anything with them, and there is no way to specify them\nin this pragma's arguments.\n\nYou can also easily say to use all categories but one, by either, for example,\n\nuse locale ':!ctype';\nuse locale ':not_ctype';\n\nboth of which mean to enable locale awareness of all categories but \"LC_CTYPE\".  Only one\ncategory argument may be specified in a \"use locale\" if it is of the negated form.\n\nPrior to v5.22 only one form of the pragma with arguments is available:\n\nuse locale ':not_characters';\n\n(and you have to say \"not_\"; you can't use the bang \"!\" form).  This pseudo category is a\nshorthand for specifying both \":collate\" and \":ctype\".  Hence, in the negated form, it is\nnearly the same thing as saying\n\nuse locale qw(:messages :monetary :numeric :time);\n\nWe use the term \"nearly\", because \":not_characters\" also turns on\n\"use feature 'unicode_strings'\" within its scope.  This form is less useful in v5.20 and\nlater, and is described fully in \"Unicode and UTF-8\", but briefly, it tells Perl to not use\nthe character portions of the locale definition, that is the \"LC_CTYPE\" and \"LC_COLLATE\"\ncategories.  Instead it will use the native character set (extended by Unicode).  When using\nthis parameter, you are responsible for getting the external character set translated into\nthe native/Unicode one (which it already will be if it is one of the increasingly popular\nUTF-8 locales).  There are convenient ways of doing this, as described in \"Unicode and\nUTF-8\".\n\n#### The setlocale function\n\nWARNING!  Prior to Perl 5.28 or on a system that does not support thread-safe locale\noperations, do NOT use this function in a thread.  The locale will change in all other\nthreads at the same time, and should your thread get paused by the operating system, and\nanother started, that thread will not have the locale it is expecting.  
On some platforms,\nthere can be a race leading to segfaults if two threads call this function nearly\nsimultaneously.  This warning does not apply on unthreaded builds, or on perls where\n\"${^SAFE_LOCALES}\" exists and is non-zero; namely Perl 5.28 and later unthreaded or compiled\nto be locale-thread-safe.\n\nYou can switch locales as often as you wish at run time with the \"POSIX::setlocale()\"\nfunction:\n\n# Import locale-handling tool set from POSIX module.\n# This example uses: setlocale -- the function call\n#                    LC_CTYPE -- explained below\n# (Showing the testing for success/failure of operations is\n# omitted in these examples to avoid distracting from the main\n# point)\n\nuse POSIX qw(locale_h);\nuse locale;\nmy $old_locale;\n\n# query and save the old locale\n$old_locale = setlocale(LC_CTYPE);\n\nsetlocale(LC_CTYPE, \"fr_CA.ISO8859-1\");\n# LC_CTYPE now in locale \"French, Canada, codeset ISO 8859-1\"\n\nsetlocale(LC_CTYPE, \"\");\n# LC_CTYPE now reset to the default defined by the\n# LC_ALL/LC_CTYPE/LANG environment variables, or to the system\n# default.  See below for documentation.\n\n# restore the old locale\nsetlocale(LC_CTYPE, $old_locale);\n\nThe first argument of \"setlocale()\" gives the category, the second the locale.  The category\ntells in what aspect of data processing you want to apply locale-specific rules.  Category\nnames are discussed in \"LOCALE CATEGORIES\" and \"ENVIRONMENT\".  The locale is the name of a\ncollection of customization information corresponding to a particular combination of\nlanguage, country or territory, and codeset.  Read on for hints on the naming of locales: not\nall systems name locales as in the example.\n\nIf no second argument is provided and the category is something other than \"LC_ALL\", the\nfunction returns a string naming the current locale for the category.  
You can use this value\nas the second argument in a subsequent call to \"setlocale()\", but on some platforms the\nstring is opaque, not something that most people would be able to decipher as to what locale\nit means.\n\nIf no second argument is provided and the category is \"LC_ALL\", the result is implementation-\ndependent.  It may be a string of concatenated locale names (separator also implementation-\ndependent) or a single locale name.  Please consult your setlocale(3) man page for details.\n\nIf a second argument is given and it corresponds to a valid locale, the locale for the\ncategory is set to that value, and the function returns the now-current locale value.  You\ncan then use this in yet another call to \"setlocale()\".  (In some implementations, the return\nvalue may sometimes differ from the value you gave as the second argument--think of it as an\nalias for the value you gave.)\n\nAs the example shows, if the second argument is an empty string, the category's locale is\nreturned to the default specified by the corresponding environment variables.  
Generally,\nthis results in a return to the default that was in force when Perl started up: changes to\nthe environment made by the application after startup may or may not be noticed, depending on\nyour system's C library.\n\nNote that when a form of \"use locale\" that doesn't include all categories is specified, Perl\nignores the excluded categories.\n\nIf \"setlocale()\" fails for some reason (for example, an attempt to set to a locale unknown to\nthe system), the locale for the category is not changed, and the function returns \"undef\".\n\nStarting in Perl 5.28, on multi-threaded perls compiled on systems that implement POSIX 2008\nthread-safe locale operations, this function doesn't actually call the system \"setlocale\".\nInstead those thread-safe operations are used to emulate the \"setlocale\" function, but in a\nthread-safe manner.\n\nYou can force the thread-safe locale operations to always be used (if available) by\nrecompiling perl with\n\n-Accflags='-DUSE_THREAD_SAFE_LOCALE'\n\nadded to your call to Configure.\n\nFor further information about the categories, consult setlocale(3).\n\n#### Multi-threaded operation\n\nBeginning in Perl 5.28, multi-threaded locale operation is supported on systems that\nimplement either the POSIX 2008 or Windows-specific thread-safe locale operations.  Many\nmodern systems, such as various Unix variants and Darwin, do have this.\n\nYou can tell if using locales is safe on your system by looking at the read-only boolean\nvariable \"${^SAFE_LOCALES}\".  The value is 1 if the perl is not threaded, or if it is using\nthread-safe locale operations.\n\nThread-safe operations are supported in Windows starting in Visual Studio 2005, and in\nsystems compatible with POSIX 2008.  Some platforms claim to support POSIX 2008, but have\nbuggy implementations, so that the hints files for compiling to run on them turn off\nattempting to use thread-safety.  
\"${^SAFE_LOCALES}\" will be 0 on them.\n\nBe aware that a multi-threaded application will not be portable to a platform which\nlacks the native thread-safe locale support.  On systems that do have it, you automatically\nget this behavior for threaded perls, without having to do anything.  If, for some reason, you\ndon't want to use this capability (perhaps the POSIX 2008 support is buggy on your system),\nyou can manually compile Perl to use the old non-thread-safe implementation by passing the\nargument \"-Accflags='-DNO_THREAD_SAFE_LOCALE'\" to Configure.  Except on Windows, this will\ncontinue to use certain of the POSIX 2008 functions in some situations.  If these are buggy,\nyou can pass the following to Configure instead or additionally:\n\"-Accflags='-DNO_POSIX_2008_LOCALE'\".  This will also keep the code from using thread-safe\nlocales.  \"${^SAFE_LOCALES}\" will be 0 on systems that turn off the thread-safe operations.\n\nNormally on unthreaded builds, the traditional \"setlocale()\" is used and not the thread-safe\nlocale functions.  You can force the use of these on systems that have them by adding the\n\"-Accflags='-DUSE_THREAD_SAFE_LOCALE'\" to Configure.\n\nThe initial program is started up using the locale specified from the environment, as\ncurrently, described in \"ENVIRONMENT\".   All newly created threads start with \"LC_ALL\" set to\n\"C\".  Each thread may use \"POSIX::setlocale()\" to query or switch its locale at any time,\nwithout affecting any other thread.  All locale-dependent operations automatically use their\nthread's locale.\n\nThis should be completely transparent to any applications written entirely in Perl (minus a\nfew rarely encountered caveats given in the \"Multi-threaded\" section).  
Information for XS\nmodule writers is given in \"Locale-aware XS code\" in perlxs.\n\n#### Finding locales\n\nFor locales available in your system, consult also setlocale(3) to see whether it leads to\nthe list of available locales (search for the SEE ALSO section).  If that fails, try the\nfollowing command lines:\n\nlocale -a\n\nnlsinfo\n\nls /usr/lib/nls/loc\n\nls /usr/lib/locale\n\nls /usr/lib/nls\n\nls /usr/share/locale\n\nand see whether they list something resembling these\n\nen_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5\nen_US.iso88591      de_DE.iso88591      ru_RU.iso88595\nen_US               de_DE               ru_RU\nen                  de                  ru\nenglish             german              russian\nenglish.iso88591    german.iso88591     russian.iso88595\nenglish.roman8                          russian.koi8r\n\nSadly, even though the calling interface for \"setlocale()\" has been standardized, names of\nlocales and the directories where the configuration resides have not been.  The basic form of\nthe name is language_territory.codeset, but the latter parts after language are not always\npresent.  The language and country are usually from the standards ISO 639 and ISO 3166, the\ntwo-letter abbreviations for the languages and the countries of the world, respectively.  The\ncodeset part often mentions some ISO 8859 character set, the Latin codesets.  For example,\n\"ISO 8859-1\" is the so-called \"Western European codeset\" that can be used to encode most\nWestern European languages adequately.  Again, there are several ways to write even the name\nof that one standard.  Lamentably.\n\nTwo special locales are worth particular mention: \"C\" and \"POSIX\".  Currently these are\neffectively the same locale: the difference is mainly that the first one is defined by the C\nstandard, the second by the POSIX standard.  They define the default locale in which every\nprogram starts in the absence of locale information in its environment.  
(The default default\nlocale, if you will.)  Its language is (American) English and its character codeset ASCII or,\nrarely, a superset thereof (such as the \"DEC Multinational Character Set (DEC-MCS)\").\nWarning. The C locale delivered by some vendors may not actually exactly match what the C\nstandard calls for.  So beware.\n\nNOTE: Not all systems have the \"POSIX\" locale (not all systems are POSIX-conformant), so use\n\"C\" when you need explicitly to specify this default locale.\n\nLOCALE PROBLEMS\nYou may encounter the following warning message at Perl startup:\n\nperl: warning: Setting locale failed.\nperl: warning: Please check that your locale settings:\nLC_ALL = \"En_US\",\nLANG = (unset)\nare supported and installed on your system.\nperl: warning: Falling back to the standard locale (\"C\").\n\nThis means that your locale settings had \"LC_ALL\" set to \"En_US\" and LANG exists but has no\nvalue.  Perl tried to believe you but could not.  Instead, Perl gave up and fell back to the\n\"C\" locale, the default locale that is supposed to work no matter what.  (On Windows, it\nfirst tries falling back to the system default locale.)  This usually means your locale\nsettings were wrong, they mention locales your system has never heard of, or the locale\ninstallation in your system has problems (for example, some system files are broken or\nmissing).  There are quick and temporary fixes to these problems, as well as more thorough\nand lasting fixes.\n\n#### Testing for broken locales\n\nIf you are building Perl from source, the Perl test suite file lib/locale.t can be used to\ntest the locales on your system.  Setting the environment variable \"PERL_DEBUG_FULL_TEST\" to\n1 will cause it to output detailed results.  For example, on Linux, you could say\n\nPERL_DEBUG_FULL_TEST=1 ./perl -T -Ilib lib/locale.t > locale.log 2>&1\n\nBesides many other tests, it will test every locale it finds on your system to see if they\nconform to the POSIX standard.  
If any have errors, it will include a summary near the end of\nthe output of which locales passed all its tests, and which failed, and why.\n\n#### Temporarily fixing locale problems\n\nThe two quickest fixes are either to render Perl silent about any locale inconsistencies or\nto run Perl under the default locale \"C\".\n\nPerl's moaning about locale problems can be silenced by setting the environment variable\n\"PERL_BADLANG\" to \"0\" or \"\".  This method really just sweeps the problem under the carpet:\nyou tell Perl to shut up even when Perl sees that something is wrong.  Do not be surprised if\nlater something locale-dependent misbehaves.\n\nPerl can be run under the \"C\" locale by setting the environment variable \"LC_ALL\" to \"C\".\nThis method is perhaps a bit more civilized than the \"PERL_BADLANG\" approach, but setting\n\"LC_ALL\" (or other locale variables) may affect other programs as well, not just Perl.  In\nparticular, external programs run from within Perl will see these changes.  If you make the\nnew settings permanent (read on), all programs you run see the changes.  See \"ENVIRONMENT\"\nfor the full list of relevant environment variables and \"USING LOCALES\" for their effects in\nPerl.  Effects in other programs are easily deducible.  For example, the variable\n\"LC_COLLATE\" may well affect your sort program (or whatever the program that arranges\n\"records\" alphabetically in your system is called).\n\nYou can test out changing these variables temporarily, and if the new settings seem to help,\nput those settings into your shell startup files.  Consult your local documentation for the\nexact details.  For Bourne-like shells (sh, ksh, bash, zsh):\n\nLC_ALL=en_US.ISO8859-1\nexport LC_ALL\n\nThis assumes that we saw the locale \"en_US.ISO8859-1\" using the commands discussed above.  
We\ndecided to try that instead of the above faulty locale \"En_US\"--and in Csh-ish shells (csh,\ntcsh)\n\nsetenv LC_ALL en_US.ISO8859-1\n\nor if you have the \"env\" application you can do (in any shell)\n\nenv LC_ALL=en_US.ISO8859-1 perl ...\n\nIf you do not know what shell you have, consult your local helpdesk or the equivalent.\n\n#### Permanently fixing locale problems\n\nThe slower but superior fixes are when you may be able to yourself fix the misconfiguration\nof your own environment variables.  The mis(sing)configuration of the whole system's locales\nusually requires the help of your friendly system administrator.\n\nFirst, see earlier in this document about \"Finding locales\".  That tells how to find which\nlocales are really supported--and more importantly, installed--on your system.  In our\nexample error message, environment variables affecting the locale are listed in the order of\ndecreasing importance (and unset variables do not matter).  Therefore, having LC_ALL set to\n\"En_US\" must have been the bad choice, as shown by the error message.  First try fixing\nlocale settings listed first.\n\nSecond, if using the listed commands you see something exactly (prefix matches do not count\nand case usually counts) like \"En_US\" without the quotes, then you should be okay because you\nare using a locale name that should be installed and available in your system.  In this case,\nsee \"Permanently fixing your system's locale configuration\".\n\n#### Permanently fixing your system's locale configuration\n\nThis is when you see something like:\n\nperl: warning: Please check that your locale settings:\nLC_ALL = \"En_US\",\nLANG = (unset)\nare supported and installed on your system.\n\nbut then cannot see that \"En_US\" listed by the above-mentioned commands.  You may see things\nlike \"en_US.ISO8859-1\", but that isn't the same.  In this case, try running under a locale\nthat you can list and which somehow matches what you tried.  
The rules for matching locale\nnames are a bit vague because standardization is weak in this area.  See again the \"Finding\nlocales\" about general rules.\n\n#### Fixing system locale configuration\n\nContact a system administrator (preferably your own) and report the exact error message you\nget, and ask them to read this same documentation you are now reading.  They should be able\nto check whether there is something wrong with the locale configuration of the system.  The\n\"Finding locales\" section is unfortunately a bit vague about the exact commands and places\nbecause these things are not that standardized.\n\n#### The localeconv function\n\nThe \"POSIX::localeconv()\" function allows you to get particulars of the locale-dependent\nnumeric formatting information specified by the current underlying \"LC_NUMERIC\" and\n\"LC_MONETARY\" locales (regardless of whether called from within the scope of \"use locale\" or\nnot).  (If you just want the name of the current locale for a particular category, use\n\"POSIX::setlocale()\" with a single parameter--see \"The setlocale function\".)\n\nuse POSIX qw(locale_h);\n\n# Get a reference to a hash of locale-dependent info\n$locale_values = localeconv();\n\n# Output sorted list of the values\nfor (sort keys %$locale_values) {\nprintf \"%-20s = %s\\n\", $_, $locale_values->{$_}\n}\n\n\"localeconv()\" takes no arguments, and returns a reference to a hash.  The keys of this hash\nare variable names for formatting, such as \"decimal_point\" and \"thousands_sep\".  The values\nare the corresponding, er, values.  See \"localeconv\" in POSIX for a longer example listing\nthe categories an implementation might be expected to provide; some provide more and others\nfewer.  
You don't need an explicit \"use locale\", because \"localeconv()\" always observes the\ncurrent locale.\n\nHere's a simple-minded example program that rewrites its command-line parameters as integers\ncorrectly formatted in the current locale:\n\nuse POSIX qw(locale_h);\n\n# Get some of locale's numeric formatting parameters\nmy ($thousands_sep, $grouping) =\n    @{localeconv()}{'thousands_sep', 'grouping'};\n\n# Apply defaults if values are missing\n$thousands_sep = ',' unless $thousands_sep;\n\n# grouping and mon_grouping are packed lists\n# of small integers (characters) telling the\n# grouping (thousand_seps and mon_thousand_seps\n# being the group dividers) of numbers and\n# monetary quantities.  The integers' meanings:\n# 255 means no more grouping, 0 means repeat\n# the previous grouping, 1-254 means use that\n# as the current grouping.  Grouping goes from\n# right to left (low to high digits).  In the\n# below we cheat slightly by never using anything\n# else than the first grouping (whatever that is).\nif ($grouping) {\n    @grouping = unpack(\"C*\", $grouping);\n} else {\n    @grouping = (3);\n}\n\n# Format command line params for current locale\nfor (@ARGV) {\n    $_ = int;    # Chop non-integer part\n    1 while\n    s/(\\d)(\\d{$grouping[0]}($|$thousands_sep))/$1$thousands_sep$2/;\n    print \"$_\";\n}\nprint \"\\n\";\n\nNote that if the platform doesn't have \"LC_NUMERIC\" and/or \"LC_MONETARY\" available or\nenabled, the corresponding elements of the hash will be missing.\n\n#### I18N::Langinfo\n\nAnother interface for querying locale-dependent information is the\n\"I18N::Langinfo::langinfo()\" function.\n\nThe following example will import the \"langinfo()\" function itself and three constants to be\nused as arguments to \"langinfo()\": a constant for the abbreviated first day of the week (the\nnumbering starts from Sunday = 1) and two more constants for the affirmative and negative\nanswers for a yes/no question in the current locale.\n\nuse I18N::Langinfo qw(langinfo ABDAY_1 YESSTR 
NOSTR);\n\n# Pass the constants themselves (not their names as strings) to langinfo()\nmy ($abday_1, $yesstr, $nostr)\n    = map { langinfo($_) } (ABDAY_1, YESSTR, NOSTR);\n\nprint \"$abday_1? [$yesstr/$nostr] \";\n\nIn other words, in the \"C\" (or English) locale the above will probably print something like:\n\nSun? [yes/no]\n\nSee I18N::Langinfo for more information.\n\n### LOCALE CATEGORIES\n\nThe following subsections describe basic locale categories.  Beyond these, some combination\ncategories allow manipulation of more than one basic category at a time.  See \"ENVIRONMENT\"\nfor a discussion of these.\n\nCategory \"LC_COLLATE\": Collation: Text Comparisons and Sorting\nIn the scope of a \"use locale\" form that includes collation, Perl looks to the \"LC_COLLATE\"\nenvironment variable to determine the application's notions on collation (ordering) of\ncharacters.  For example, \"b\" follows \"a\" in Latin alphabets, but where do \"á\" and \"å\"\nbelong?  And while \"color\" follows \"chocolate\" in English, what about in traditional Spanish?\n\nThe following collations all make sense and you may meet any of them if you \"use locale\".\n\nA B C D E a b c d e\nA a B b C c D d E e\na A b B c C d D e E\na b c d e A B C D E\n\nHere is a code snippet to tell what \"word\" characters are in the current locale, in that\nlocale's order:\n\nuse locale;\nprint +(sort grep /\\w/, map { chr } 0..255), \"\\n\";\n\nCompare this with the characters that you see and their order if you state explicitly that\nthe locale should be ignored:\n\nno locale;\nprint +(sort grep /\\w/, map { chr } 0..255), \"\\n\";\n\nThis machine-native collation (which is what you get unless \"use locale\" has appeared earlier\nin the same block) must be used for sorting raw binary data, whereas the locale-dependent\ncollation of the first example is useful for natural text.\n\nAs noted in \"USING LOCALES\", \"cmp\" compares according to the current collation locale when\n\"use locale\" is in effect, but falls back to a char-by-char comparison for strings that the\nlocale says are equal. 
You can use \"POSIX::strcoll()\" if you don't want this fall-back:\n\nuse POSIX qw(strcoll);\n$equal_in_locale =\n    !strcoll(\"space and case ignored\", \"SpaceAndCaseIgnored\");\n\n$equal_in_locale will be true if the collation locale specifies a dictionary-like ordering\nthat ignores space characters completely and which folds case.\n\nPerl uses the platform's C library collation functions \"strcoll()\" and \"strxfrm()\".  That\nmeans you get whatever they give.  On some platforms, these functions work well on UTF-8\nlocales, giving a reasonable default collation for the code points that are important in that\nlocale.  (And if they aren't working well, the problem may only be that the locale definition\nis deficient, so can be fixed by using a better definition file.  Unicode's definitions (see\n\"Freely available locale definitions\") provide reasonable UTF-8 locale collation\ndefinitions.)  Starting in Perl v5.26, Perl's use of these functions has been made more\nseamless.  This may be sufficient for your needs.  For more control, and to make sure strings\ncontaining any code point (not just the ones important in the locale) collate properly, the\nUnicode::Collate module is suggested.\n\nIn non-UTF-8 locales (hence single byte), code points above 0xFF are technically invalid.\nBut if present, again starting in v5.26, they will collate to the same position as the\nhighest valid code point does.  This generally gives good results, but the collation order\nmay be skewed if the valid code point gets special treatment when it forms particular\nsequences with other characters as defined by the locale.  
When two strings collate\nidentically, the code point order is used as a tie breaker.\n\nIf Perl detects that there are problems with the locale collation order, it reverts to using\nnon-locale collation rules for that locale.\n\nIf you have a single string that you want to check for \"equality in locale\" against several\nothers, you might think you could gain a little efficiency by using \"POSIX::strxfrm()\" in\nconjunction with \"eq\":\n\nuse POSIX qw(strxfrm);\n$xfrm_string = strxfrm(\"Mixed-case string\");\nprint \"locale collation ignores spaces\\n\"\n    if $xfrm_string eq strxfrm(\"Mixed-casestring\");\nprint \"locale collation ignores hyphens\\n\"\n    if $xfrm_string eq strxfrm(\"Mixedcase string\");\nprint \"locale collation ignores case\\n\"\n    if $xfrm_string eq strxfrm(\"mixed-case string\");\n\n\"strxfrm()\" takes a string and maps it into a transformed string for use in char-by-char\ncomparisons against other transformed strings during collation.  \"Under the hood\", locale-\naffected Perl comparison operators call \"strxfrm()\" for both operands, then do a char-by-char\ncomparison of the transformed strings.  By calling \"strxfrm()\" explicitly and using a non\nlocale-affected comparison, the example attempts to save a couple of transformations.  But in\nfact, it doesn't save anything: Perl magic (see \"Magic Variables\" in perlguts) creates the\ntransformed version of a string the first time it's needed in a comparison, then keeps this\nversion around in case it's needed again.  An example rewritten the easy way with \"cmp\" runs\njust about as fast.  It also copes with null characters embedded in strings; if you call\n\"strxfrm()\" directly, it treats the first null it finds as a terminator.  Don't expect the\ntransformed strings it produces to be portable across systems--or even from one revision of\nyour operating system to the next.  
In short, don't call \"strxfrm()\" directly: let Perl do it\nfor you.\n\nNote: \"use locale\" isn't shown in some of these examples because it isn't needed: \"strcoll()\"\nand \"strxfrm()\" are POSIX functions which use the standard system-supplied \"libc\" functions\nthat always obey the current \"LC_COLLATE\" locale.\n\nCategory \"LC_CTYPE\": Character Types\nIn the scope of a \"use locale\" form that includes \"LC_CTYPE\", Perl obeys the \"LC_CTYPE\"\nlocale setting.  This controls the application's notion of which characters are alphabetic,\nnumeric, punctuation, etc.  This affects Perl's \"\\w\" regular expression metanotation, which\nstands for alphanumeric characters--that is, alphabetic, numeric, and the platform's native\nunderscore.  (Consult perlre for more information about regular expressions.)  Thanks to\n\"LC_CTYPE\", depending on your locale setting, characters like \"æ\", \"ð\", \"ß\", and \"ø\" may be\nunderstood as \"\\w\" characters.  It also affects things like \"\\s\", \"\\D\", and the POSIX\ncharacter classes, like \"[[:graph:]]\".  (See perlrecharclass for more information on all\nthese.)\n\nThe \"LC_CTYPE\" locale also provides the map used in transliterating characters between lower\nand uppercase.  This affects the case-mapping functions--\"fc()\", \"lc()\", \"lcfirst()\", \"uc()\",\nand \"ucfirst()\"; case-mapping interpolation with \"\\F\", \"\\l\", \"\\L\", \"\\u\", or \"\\U\" in double-\nquoted strings and \"s///\" substitutions; and case-insensitive regular expression pattern\nmatching using the \"i\" modifier.\n\nStarting in v5.20, Perl supports UTF-8 locales for \"LC_CTYPE\", but otherwise Perl only\nsupports single-byte locales, such as the ISO 8859 series.  This means that wide character\nlocales, for example for Asian languages, are not well-supported.  Use of these locales may\ncause core dumps.  
If the platform has the capability for Perl to detect such a locale,\nstarting in Perl v5.22, Perl will warn, default enabled, using the \"locale\" warning category,\nwhenever such a locale is switched into.  The UTF-8 locale support is actually a superset of\nPOSIX locales, because it is really full Unicode behavior as if no \"LC_CTYPE\" locale were in\neffect at all (except for tainting; see \"SECURITY\").  POSIX locales, even UTF-8 ones, are\nlacking certain concepts in Unicode, such as the idea that changing the case of a character\ncould expand to be more than one character.  Perl in a UTF-8 locale will give you that\nexpansion.  Prior to v5.20, Perl treated a UTF-8 locale on some platforms like an ISO 8859-1\none, with some restrictions, and on other platforms more like the \"C\" locale.  For releases\nv5.16 and v5.18, \"use locale ':not_characters'\" could be used as a workaround for this (see\n\"Unicode and UTF-8\").\n\nNote that there are quite a few things that are unaffected by the current locale.  Any\nliteral character is the native character for the given platform.  Hence 'A' means the\ncharacter at code point 65 on ASCII platforms, and 193 on EBCDIC.  That may or may not be an\n'A' in the current locale, if that locale even has an 'A'.  Similarly, all the escape\nsequences for particular characters, \"\\n\" for example, always mean the platform's native one.\nThis means, for example, that \"\\N\" in regular expressions (every character but new-line)\nworks on the platform character set.\n\nStarting in v5.22, Perl will by default warn when switching into a locale that redefines any\nASCII printable character (plus \"\\t\" and \"\\n\") into a different class than expected.  This is\nlikely to happen on modern locales only on EBCDIC platforms, where, for example, a CCSID 0037\nlocale on a CCSID 1047 machine moves \"[\", but it can happen on ASCII platforms with the ISO\n646 and other 7-bit locales that are essentially obsolete.  
Things may still work, depending\non what features of Perl are used by the program.  For example, in the example from above\nwhere \"|\" becomes a \"\\w\", and there are no regular expressions where this matters, the\nprogram may still work properly.  The warning lists all the characters that it can determine\ncould be adversely affected.\n\nNote: A broken or malicious \"LC_CTYPE\" locale definition may result in clearly ineligible\ncharacters being considered to be alphanumeric by your application.  For strict matching of\n(mundane) ASCII letters and digits--for example, in command strings--locale-aware\napplications should use \"\\w\" with the \"/a\" regular expression modifier.  See \"SECURITY\".\n\nCategory \"LC_NUMERIC\": Numeric Formatting\nAfter a proper \"POSIX::setlocale()\" call, and within the scope of a \"use locale\" form that\nincludes numerics, Perl obeys the \"LC_NUMERIC\" locale information, which controls an\napplication's idea of how numbers should be formatted for human readability.  In most\nimplementations the only effect is to change the character used for the decimal\npoint--perhaps from \".\" to \",\".  The functions aren't aware of such niceties as thousands\nseparation and so on. (See \"The localeconv function\" if you care about these things.)\n\nuse POSIX qw(strtod setlocale LC_NUMERIC);\nuse locale;\n\nsetlocale LC_NUMERIC, \"\";\n\n$n = 5/2;   # Assign numeric 2.5 to $n\n\n$a = \" $n\"; # Locale-dependent conversion to string\n\nprint \"half five is $n\\n\";       # Locale-dependent output\n\nprintf \"half five is %g\\n\", $n;  # Locale-dependent output\n\nprint \"DECIMAL POINT IS COMMA\\n\"\n    if $n == (strtod(\"2,5\"))[0]; # Locale-dependent conversion\n\nSee also I18N::Langinfo and \"RADIXCHAR\".\n\nCategory \"LC_MONETARY\": Formatting of monetary amounts\nThe C standard defines the \"LC_MONETARY\" category, but not a function that is affected by its\ncontents.  
(Those with experience of standards committees will recognize that the working\ngroup decided to punt on the issue.)  Consequently, Perl essentially takes no notice of it.\nIf you really want to use \"LC_MONETARY\", you can query its contents--see \"The localeconv\nfunction\"--and use the information that it returns in your application's own formatting of\ncurrency amounts.  However, you may well find that the information, voluminous and complex\nthough it may be, still does not quite meet your requirements: currency formatting is a hard\nnut to crack.\n\nSee also I18N::Langinfo and \"CRNCYSTR\".\n\nCategory \"LC_TIME\": Representation of time\nOutput produced by \"POSIX::strftime()\", which builds a formatted human-readable date/time\nstring, is affected by the current \"LC_TIME\" locale.  Thus, in a French locale, the output\nproduced by the %B format element (full month name) for the first month of the year would be\n\"janvier\".  Here's how to get a list of long month names in the current locale:\n\nuse POSIX qw(strftime);\nfor (0..11) {\n    $long_month_name[$_] =\n        strftime(\"%B\", 0, 0, 0, 1, $_, 96);\n}\n\nNote: \"use locale\" isn't needed in this example: \"strftime()\" is a POSIX function which uses\nthe standard system-supplied \"libc\" function that always obeys the current \"LC_TIME\" locale.\n\nSee also I18N::Langinfo and \"ABDAY_1\"..\"ABDAY_7\", \"DAY_1\"..\"DAY_7\", \"ABMON_1\"..\"ABMON_12\",\nand \"MON_1\"..\"MON_12\".\n\n#### Other categories\n\nThe remaining locale categories are not currently used by Perl itself.  But again note that\nthings Perl interacts with may use these, including extensions outside the standard Perl\ndistribution, and by the operating system and its utilities.  Note especially that the string\nvalue of $! and the error messages given by external utilities may be changed by\n\"LC_MESSAGES\".  If you want to have portable error codes, use \"%!\".  
See Errno.\n\n### SECURITY\n\nAlthough the main discussion of Perl security issues can be found in perlsec, a discussion of\nPerl's locale handling would be incomplete if it did not draw your attention to locale-\ndependent security issues.  Locales--particularly on systems that allow unprivileged users to\nbuild their own locales--are untrustworthy.  A malicious (or just plain broken) locale can\nmake a locale-aware application give unexpected results.  Here are a few possibilities:\n\n•   Regular expression checks for safe file names or mail addresses using \"\\w\" may be spoofed\nby an \"LC_CTYPE\" locale that claims that characters such as \">\" and \"|\" are alphanumeric.\n\n•   String interpolation with case-mapping, as in, say, \"$dest = \"C:\\U$name.$ext\"\", may\nproduce dangerous results if a bogus \"LC_CTYPE\" case-mapping table is in effect.\n\n•   A sneaky \"LC_COLLATE\" locale could result in the names of students with \"D\" grades\nappearing ahead of those with \"A\"s.\n\n•   An application that takes the trouble to use information in \"LC_MONETARY\" may format\ndebits as if they were credits and vice versa if that locale has been subverted.  Or it\nmight make payments in US dollars instead of Hong Kong dollars.\n\n•   The date and day names in dates formatted by \"strftime()\" could be manipulated to\nadvantage by a malicious user able to subvert the \"LC_TIME\" locale.  (\"Look--it says I\nwasn't in the building on Sunday.\")\n\nSuch dangers are not peculiar to the locale system: any aspect of an application's\nenvironment which may be modified maliciously presents similar challenges.  
Similarly, they\nare not specific to Perl: any programming language that allows you to write programs that\ntake account of their environment exposes you to these issues.\n\nPerl cannot protect you from all possibilities shown in the examples--there is no substitute\nfor your own vigilance--but, when \"use locale\" is in effect, Perl uses the tainting mechanism\n(see perlsec) to mark string results that become locale-dependent, and which may be\nuntrustworthy in consequence.  Here is a summary of the tainting behavior of operators and\nfunctions that may be affected by the locale:\n\n•   Comparison operators (\"lt\", \"le\", \"ge\", \"gt\" and \"cmp\"):\n\nScalar true/false (or less/equal/greater) result is never tainted.\n\n•   Case-mapping interpolation (with \"\\l\", \"\\L\", \"\\u\", \"\\U\", or \"\\F\")\n\nThe result string containing interpolated material is tainted if a \"use locale\" form that\nincludes \"LC_CTYPE\" is in effect.\n\n•   Matching operator (\"m//\"):\n\nScalar true/false result never tainted.\n\nAll subpatterns, either delivered as a list-context result or as $1 etc., are tainted if\na \"use locale\" form that includes \"LC_CTYPE\" is in effect, and the subpattern regular\nexpression contains a locale-dependent construct.  These constructs include \"\\w\" (to\nmatch an alphanumeric character), \"\\W\" (non-alphanumeric character), \"\\b\" and \"\\B\" (word-\nboundary and non-boundary, which depend on what \"\\w\" and \"\\W\" match), \"\\s\" (whitespace\ncharacter), \"\\S\" (non whitespace character), \"\\d\" and \"\\D\" (digits and non-digits), and\nthe POSIX character classes, such as \"[:alpha:]\" (see \"POSIX Character Classes\" in\nperlrecharclass).\n\nTainting is also likely if the pattern is to be matched case-insensitively (via \"/i\").\nThe exception is if all the code points to be matched this way are above 255 and do not\nhave folds under Unicode rules to below 256.  
Tainting is not done for these because Perl\nonly uses Unicode rules for such code points, and those rules are the same no matter what\nthe current locale.\n\nThe matched-pattern variables, $&, \"$`\" (pre-match), \"$'\" (post-match), and $+ (last\nmatch) also are tainted.\n\n•   Substitution operator (\"s///\"):\n\nHas the same behavior as the match operator.  Also, the left operand of \"=~\" becomes\ntainted when a \"use locale\" form that includes \"LC_CTYPE\" is in effect, if modified as a\nresult of a substitution based on a regular expression match involving any of the things\nmentioned in the previous item, or of case-mapping, such as \"\\l\", \"\\L\", \"\\u\", \"\\U\", or\n\"\\F\".\n\n•   Output formatting functions (\"printf()\" and \"write()\"):\n\nResults are never tainted because otherwise even output from print, for example\n\"print(1/7)\", should be tainted if \"use locale\" is in effect.\n\n•   Case-mapping functions (\"lc()\", \"lcfirst()\", \"uc()\", \"ucfirst()\"):\n\nResults are tainted if a \"use locale\" form that includes \"LC_CTYPE\" is in effect.\n\n•   POSIX locale-dependent functions (\"localeconv()\", \"strcoll()\", \"strftime()\",\n\"strxfrm()\"):\n\nResults are never tainted.\n\nThree examples illustrate locale-dependent tainting.  
The first program, which ignores its\nlocale, won't run: a value taken directly from the command line may not be used to name an\noutput file when taint checks are enabled.\n\n#!/usr/local/bin/perl -T\n# Run with taint checking\n\n# Command line sanity check omitted...\n$tainted_output_file = shift;\n\nopen(F, \">$tainted_output_file\")\n    or warn \"Open of $tainted_output_file failed: $!\\n\";\n\nThe program can be made to run by \"laundering\" the tainted value through a regular\nexpression: the second example--which still ignores locale information--runs, creating the\nfile named on its command line if it can.\n\n#!/usr/local/bin/perl -T\n\n$tainted_output_file = shift;\n$tainted_output_file =~ m%[\\w/]+%;\n$untainted_output_file = $&;\n\nopen(F, \">$untainted_output_file\")\n    or warn \"Open of $untainted_output_file failed: $!\\n\";\n\nCompare this with a similar but locale-aware program:\n\n#!/usr/local/bin/perl -T\n\n$tainted_output_file = shift;\nuse locale;\n$tainted_output_file =~ m%[\\w/]+%;\n$localized_output_file = $&;\n\nopen(F, \">$localized_output_file\")\n    or warn \"Open of $localized_output_file failed: $!\\n\";\n\nThis third program fails to run because $& is tainted: it is the result of a match involving\n\"\\w\" while \"use locale\" is in effect.\n\n### ENVIRONMENT\n\nPERL_SKIP_LOCALE_INIT\nThis environment variable, available starting in Perl v5.20, if set (to any\nvalue), tells Perl to not use the rest of the environment variables to initialize\nwith.  Instead, Perl uses whatever the current locale settings are.  This is\nparticularly useful in embedded environments, see \"Using embedded Perl with POSIX\nlocales\" in perlembed.\n\nPERL_BADLANG\nA string that can suppress Perl's warning about failed locale settings at\nstartup.  Failure can occur if the locale support in the operating system is\nlacking (broken) in some way--or if you mistyped the name of a locale when you\nset up your environment.  
If this environment variable is absent, or has a value\nother than \"0\" or \"\", Perl will complain about locale setting failures.\n\nNOTE: \"PERL_BADLANG\" only gives you a way to hide the warning message.  The\nmessage tells about some problem in your system's locale support, and you should\ninvestigate what the problem is.\n\nDPKG_RUNNING_VERSION\nOn Debian systems, if the DPKG_RUNNING_VERSION environment variable is set (to\nany value), the locale failure warnings will be suppressed just like with a zero\nPERL_BADLANG setting. This is done to avoid floods of spurious warnings during\nsystem upgrades.  See <http://bugs.debian.org/508764>.\n\nThe following environment variables are not specific to Perl: They are part of the\nstandardized (ISO C, XPG4, POSIX 1.c) \"setlocale()\" method for controlling an application's\nopinion on data.  Windows is non-POSIX, but Perl arranges for the following to work as\ndescribed anyway.  If the locale given by an environment variable is not valid, Perl tries\nthe next lower one in priority.  If none are valid, on Windows, the system default locale is\nthen tried.  If all else fails, the \"C\" locale is used.  If even that doesn't work, something\nis badly broken, but Perl tries to forge ahead with whatever the locale settings might be.\n\n\"LC_ALL\"    \"LC_ALL\" is the \"override-all\" locale environment variable. If set, it overrides\nall the rest of the locale environment variables.\n\n\"LANGUAGE\"  NOTE: \"LANGUAGE\" is a GNU extension, it affects you only if you are using the GNU\nlibc.  This is the case if you are using e.g. Linux.  If you are using\n\"commercial\" Unixes you are most probably not using GNU libc and you can ignore\n\"LANGUAGE\".\n\nHowever, in the case you are using \"LANGUAGE\": it affects the language of\ninformational, warning, and error messages output by commands (in other words,\nit's like \"LC_MESSAGES\") but it has higher priority than \"LC_ALL\".  
Moreover,\nit's not a single value but instead a \"path\" (\":\"-separated list) of languages\n(not locales).  See the GNU \"gettext\" library documentation for more information.\n\n\"LC_CTYPE\"  In the absence of \"LC_ALL\", \"LC_CTYPE\" chooses the character type locale.  In the\nabsence of both \"LC_ALL\" and \"LC_CTYPE\", \"LANG\" chooses the character type\nlocale.\n\n\"LC_COLLATE\"\nIn the absence of \"LC_ALL\", \"LC_COLLATE\" chooses the collation (sorting) locale.\nIn the absence of both \"LC_ALL\" and \"LC_COLLATE\", \"LANG\" chooses the collation\nlocale.\n\n\"LC_MONETARY\"\nIn the absence of \"LC_ALL\", \"LC_MONETARY\" chooses the monetary formatting locale.\nIn the absence of both \"LC_ALL\" and \"LC_MONETARY\", \"LANG\" chooses the monetary\nformatting locale.\n\n\"LC_NUMERIC\"\nIn the absence of \"LC_ALL\", \"LC_NUMERIC\" chooses the numeric format locale.  In\nthe absence of both \"LC_ALL\" and \"LC_NUMERIC\", \"LANG\" chooses the numeric format.\n\n\"LC_TIME\"   In the absence of \"LC_ALL\", \"LC_TIME\" chooses the date and time formatting\nlocale.  In the absence of both \"LC_ALL\" and \"LC_TIME\", \"LANG\" chooses the date\nand time formatting locale.\n\n\"LANG\"      \"LANG\" is the \"catch-all\" locale environment variable. If it is set, it is used\nas the last resort after the overall \"LC_ALL\" and the category-specific \"LC_foo\".\n\n#### Examples\n\nThe \"LC_NUMERIC\" controls the numeric output:\n\nuse locale;\nuse POSIX qw(locale_h); # Imports setlocale() and the LC_ constants.\nsetlocale(LC_NUMERIC, \"fr_FR\") or die \"Pardon\";\nprintf \"%g\\n\", 1.23; # If the \"fr_FR\" succeeded, probably shows 1,23.\n\nand also how strings are parsed by \"POSIX::strtod()\" as numbers:\n\nuse locale;\nuse POSIX qw(locale_h strtod);\nsetlocale(LC_NUMERIC, \"de_DE\") or die \"Entschuldigung\";\nmy $x = strtod(\"2,34\") + 5;\nprint $x, \"\\n\"; # Probably shows 7,34.\n\n### NOTES\n\nString \"eval\" and \"LC_NUMERIC\"\nA string eval parses its expression as standard Perl.  
It is therefore expecting the decimal\npoint to be a dot.  If \"LC_NUMERIC\" is set to have this be a comma instead, the parsing will\nbe confused, perhaps silently.\n\nuse locale;\nuse POSIX qw(locale_h);\nsetlocale(LC_NUMERIC, \"fr_FR\") or die \"Pardon\";\nmy $a = 1.2;\nprint eval \"$a + 1.5\";\nprint \"\\n\";\n\nprints \"13,5\".  This is because in that locale, the comma is the decimal point character.\nThe \"eval\" thus expands to:\n\neval \"1,2 + 1.5\"\n\nand the result is not what you likely expected.  No warnings are generated.  If you do string\n\"eval\"'s within the scope of \"use locale\", you should instead change the \"eval\" line to do\nsomething like:\n\nprint eval \"no locale; $a + 1.5\";\n\nThis prints 2.7.\n\nYou could also exclude \"LC_NUMERIC\", if you don't need it, by\n\nuse locale ':!numeric';\n\n#### Backward compatibility\n\nVersions of Perl prior to 5.004 mostly ignored locale information, generally behaving as if\nsomething similar to the \"C\" locale were always in force, even if the program environment\nsuggested otherwise (see \"The setlocale function\").  By default, Perl still behaves this way\nfor backward compatibility.  If you want a Perl application to pay attention to locale\ninformation, you must use the \"use locale\" pragma (see \"The \"use locale\" pragma\") or, in the\nunlikely event that you want to do so for just pattern matching, the \"/l\" regular expression\nmodifier (see \"Character set modifiers\" in perlre) to instruct it to do so.\n\nVersions of Perl from 5.002 to 5.003 did use the \"LC_CTYPE\" information if available; that\nis, \"\\w\" did understand what were the letters according to the locale environment variables.\nThe problem was that the user had no control over the feature: if the C library supported\nlocales, Perl used them.\n\n#### I18N::Collate obsolete\n\nIn versions of Perl prior to 5.004, per-locale collation was possible using the\n\"I18N::Collate\" library module.  
This module is now mildly obsolete and should be avoided in\nnew applications.  The \"LC_COLLATE\" functionality is now integrated into the Perl core\nlanguage: One can use locale-specific scalar data completely normally with \"use locale\", so\nthere is no longer any need to juggle with the scalar references of \"I18N::Collate\".\n\n#### Sort speed and memory use impacts\n\nComparing and sorting by locale is usually slower than the default sorting; slow-downs of two\nto four times have been observed.  It will also consume more memory: once a Perl scalar\nvariable has participated in any string comparison or sorting operation obeying the locale\ncollation rules, it will take 3-15 times more memory than before.  (The exact multiplier\ndepends on the string's contents, the operating system and the locale.) These downsides are\ndictated more by the operating system's implementation of the locale system than by Perl.\n\n#### Freely available locale definitions\n\nThe Unicode CLDR project extracts the POSIX portion of many of its locales, available at\n\nhttps://unicode.org/Public/cldr/2.0.1/\n\n(Newer versions of CLDR require you to compute the POSIX data yourself.  See\n<http://unicode.org/Public/cldr/latest/>.)\n\nThere is a large collection of locale definitions at:\n\nhttp://std.dkuug.dk/i18n/WG15-collection/locales/\n\nYou should be aware that it is unsupported, and is not claimed to be fit for any purpose.  If\nyour system allows installation of arbitrary locales, you may find the definitions useful as\nthey are, or as a basis for the development of your own locales.\n\n#### I18n and l10n\n\n\"Internationalization\" is often abbreviated as i18n because its first and last letters are\nseparated by eighteen others.  (You may guess why the internalin ... internaliti ... i18n\ntends to get abbreviated.)  
In the same way, \"localization\" is often abbreviated to l10n.\n\n#### An imperfect standard\n\nInternationalization, as defined in the C and POSIX standards, can be criticized as\nincomplete and ungainly.  They also have a tendency, like standards groups, to divide the\nworld into nations, when we all know that the world can equally well be divided into bankers,\nbikers, gamers, and so on.\n\n#### Unicode and UTF-8\n\nThe support of Unicode is new starting from Perl version v5.6, and more fully implemented in\nversions v5.8 and later.  See perluniintro.\n\nStarting in Perl v5.20, UTF-8 locales are supported in Perl, except \"LC_COLLATE\" is only\npartially supported; collation support is improved in Perl v5.26 to a level that may be\nsufficient for your needs (see \"Category \"LC_COLLATE\": Collation: Text Comparisons and\nSorting\").\n\nIf you have Perl v5.16 or v5.18 and can't upgrade, you can use\n\nuse locale ':not_characters';\n\nWhen this form of the pragma is used, only the non-character portions of locales are used by\nPerl, for example \"LC_NUMERIC\".  Perl assumes that you have translated all the characters it\nis to operate on into Unicode (actually the platform's native character set (ASCII or EBCDIC)\nplus Unicode).  For data in files, this can conveniently be done by also specifying\n\nuse open ':locale';\n\nThis pragma arranges for all inputs from files to be translated into Unicode from the current\nlocale as specified in the environment (see \"ENVIRONMENT\"), and all outputs to files to be\ntranslated back into the locale.  (See open).  On a per-filehandle basis, you can instead use\nthe PerlIO::locale module, or the Encode::Locale module, both available from CPAN.  The\nlatter module also has methods to ease the handling of \"ARGV\" and environment variables, and\ncan be used on individual strings.  
If you know that all your locales will be UTF-8, as many\nare these days, you can use the -C command line switch.\n\nThis form of the pragma allows essentially seamless handling of locales with Unicode.  The\ncollation order will be by Unicode code point order.  Unicode::Collate can be used to get\nUnicode rules collation.\n\nAll the modules and switches just described can be used in v5.20 with just plain \"use\nlocale\", and, should the input locales not be UTF-8, you'll get the less than ideal behavior,\ndescribed below, that you get with pre-v5.16 Perls, or when you use the locale pragma without\nthe \":not_characters\" parameter in v5.16 and v5.18.  If you are using exclusively UTF-8\nlocales in v5.20 and higher, the rest of this section does not apply to you.\n\nThere are two cases, multi-byte and single-byte locales.  First multi-byte:\n\nThe only multi-byte (or wide character) locale that Perl is ever likely to support is UTF-8.\nThis is due to the difficulty of implementation, the fact that high quality UTF-8 locales are\nnow published for every area of the world (<https://unicode.org/Public/cldr/2.0.1/> for ones\nthat are already set-up, but from an earlier version;\n<https://unicode.org/Public/cldr/latest/> for the most up-to-date, but you have to extract\nthe POSIX information yourself), and that failing all that you can use the Encode module to\ntranslate to/from your locale.  So, you'll have to do one of those things if you're using one\nof these locales, such as Big5 or Shift JIS.  For UTF-8 locales, in Perls (pre v5.20) that\ndon't have full UTF-8 locale support, they may work reasonably well (depending on your C\nlibrary implementation) simply because both they and Perl store characters that take up\nmultiple bytes the same way.  However, some, if not most, C library implementations may not\nprocess the characters in the upper half of the Latin-1 range (128 - 255) properly under\n\"LC_CTYPE\".  
To see if a character is a particular type under a locale, Perl uses the\nfunctions like \"isalnum()\".  Your C library may not work for UTF-8 locales with those\nfunctions, instead only working under the newer wide library functions like \"iswalnum()\",\nwhich Perl does not use.  These multi-byte locales are treated like single-byte locales, and\nwill have the restrictions described below.  Starting in Perl v5.22 a warning message is\nraised when Perl detects a multi-byte locale that it doesn't fully support.\n\nFor single-byte locales, Perl generally takes the tack to use locale rules on code points\nthat can fit in a single byte, and Unicode rules for those that can't (though this isn't\nuniformly applied, see the note at the end of this section).  This prevents many problems in\nlocales that aren't UTF-8.  Suppose the locale is ISO8859-7, Greek.  The character at 0xD7\nthere is a capital Chi. But in the ISO8859-1 locale, Latin1, it is a multiplication sign.\nThe POSIX regular expression character class \"[[:alpha:]]\" will magically match 0xD7 in the\nGreek locale but not in the Latin one.\n\nHowever, there are places where this breaks down.  Certain Perl constructs are for Unicode\nonly, such as \"\\p{Alpha}\".  They assume that 0xD7 always has its Unicode meaning (or the\nequivalent on EBCDIC platforms).  Since Latin1 is a subset of Unicode and 0xD7 is the\nmultiplication sign in both Latin1 and Unicode, \"\\p{Alpha}\" will never match it, regardless\nof locale.  A similar issue occurs with \"\\N{...}\".  Prior to v5.20, it is therefore a bad\nidea to use \"\\p{}\" or \"\\N{}\" under plain \"use locale\"--unless you can guarantee that the\nlocale will be ISO8859-1.  Use POSIX character classes instead.\n\nAnother problem with this approach is that operations that cross the single byte/multiple\nbyte boundary are not well-defined, and so are disallowed.  (This boundary is between the\ncodepoints at 255/256.)  
For example, lowercasing LATIN CAPITAL LETTER Y WITH DIAERESIS\n(U+0178) should return LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF).  But in the Greek\nlocale, for example, there is no character at 0xFF, and Perl has no way of knowing what the\ncharacter at 0xFF is really supposed to represent.  Thus it disallows the operation.  In this\nmode, the lowercase of U+0178 is itself.\n\nThe same problems ensue if you enable automatic UTF-8-ification of your standard file\nhandles, default \"open()\" layer, and @ARGV on non-ISO8859-1, non-UTF-8 locales (by using\neither the -C command line switch or the \"PERL_UNICODE\" environment variable; see perlrun).\nThings are read in as UTF-8, which would normally imply a Unicode interpretation, but the\npresence of a locale causes them to be interpreted in that locale instead.  For example, a\n0xD7 code point in the Unicode input, which should mean the multiplication sign, won't be\ninterpreted by Perl that way under the Greek locale.  This is not a problem provided you make\ncertain that all locales will always and only be either an ISO8859-1, or, if you don't have a\ndeficient C library, a UTF-8 locale.\n\nStill another problem is that this approach can lead to two code points meaning the same\ncharacter.  Thus in a Greek locale, both U+03A7 and U+00D7 are GREEK CAPITAL LETTER CHI.\n\nBecause of all these problems, starting in v5.22, Perl will raise a warning if a multi-byte\n(hence Unicode) code point is used when a single-byte locale is in effect.  (Although it\ndoesn't check for this if doing so would unreasonably slow execution down.)\n\nVendor locales are notoriously buggy, and it is difficult for Perl to test its\nlocale-handling code because this interacts with code that Perl has no control over; therefore the\nlocale-handling code in Perl may be buggy as well.  (However, the Unicode-supplied locales\nshould be better, and there is a feedback mechanism to correct any problems.  
See \"Freely\navailable locale definitions\".)\n\nIf you have Perl v5.16, the problems mentioned above go away if you use the \":not_characters\"\nparameter to the locale pragma (except for vendor bugs in the non-character portions).  If\nyou don't have v5.16, and you do have locales that work, using them may be worthwhile for\ncertain specific purposes, as long as you keep in mind the gotchas already mentioned.  For\nexample, if the collation for your locales works, it runs faster under locales than under\nUnicode::Collate; and you gain access to such things as the local currency symbol and the\nnames of the months and days of the week.  (But to hammer home the point, in v5.16, you get\nthis access without the downsides of locales by using the \":not_characters\" form of the\npragma.)\n\nNote: The policy of using locale rules for code points that can fit in a byte, and Unicode\nrules for those that can't is not uniformly applied.  Pre-v5.12, it was somewhat haphazard;\nin v5.12 it was applied fairly consistently to regular expression matching except for\nbracketed character classes; in v5.14 it was extended to all regex matches; and in v5.16 to\nthe casing operations such as \"\\L\" and \"uc()\".  For collation, in all releases so far, the\nsystem's \"strxfrm()\" function is called, and whatever it does is what you get.  Starting in\nv5.26, various bugs are fixed with the way perl uses this function.\n\n### BUGS\n\n#### Collation of strings containing embedded \"NUL\" characters\n\n\"NUL\" characters will sort the same as the lowest collating control character does, or to\n\"\\001\" in the unlikely event that there are no control characters at all in the locale.  In\ncases where the strings don't contain this non-\"NUL\" control, the results will be correct,\nand in many locales, this control, whatever it might be, will rarely be encountered.  But\nthere are cases where a \"NUL\" should sort before this control, but doesn't.  
If two strings\ndo collate identically, the one containing the \"NUL\" will sort earlier.  Prior to 5.26,\nthere were more bugs.\n\n#### Multi-threaded\n\nXS code or C-language libraries called from it that use the system setlocale(3) function\n(except on Windows) likely will not work from a multi-threaded application without changes.\nSee \"Locale-aware XS code\" in perlxs.\n\nAn XS module that is locale-dependent could have been written under the assumption that it\nwill never be called in a multi-threaded environment, and so uses other non-locale constructs\nthat aren't multi-thread-safe.  See \"Thread-aware system interfaces\" in perlxs.\n\nPOSIX does not define a way to get the name of the current per-thread locale.  Some systems,\nsuch as Darwin and NetBSD, do implement a function, querylocale(3), to do this.  On non-Windows\nsystems without it, such as Linux, there are some additional caveats:\n\n•   An embedded perl needs to be started up while the global locale is in effect.  See \"Using\nembedded Perl with POSIX locales\" in perlembed.\n\n•   It becomes more important for perl to know about all the possible locale categories on\nthe platform, even if they aren't apparently used in your program.  Perl knows all of the\nLinux ones.  If your platform has others, you can submit an issue at\n<https://github.com/Perl/perl5/issues> for inclusion of it in the next release.  In the\nmeantime, it is possible to edit the Perl source to teach it about the category, and then\nrecompile.  Search for instances of, say, \"LC_PAPER\" in the source, and use that as a\ntemplate to add the omitted one.\n\n•   It is possible, though hard to do, to call \"POSIX::setlocale\" with a locale that it\ndoesn't recognize as syntactically legal, but actually is legal on that system.  
This\nshould happen only with embedded perls, or if you hand-craft a locale name yourself.\n\n#### Broken systems\n\nIn certain systems, the operating system's locale support is broken and cannot be fixed or\nused by Perl.  Such deficiencies can and will result in mysterious hangs and/or Perl core\ndumps when \"use locale\" is in effect.  When confronted with such a system, please report in\nexcruciating detail to <https://github.com/Perl/perl5/issues>, and also contact your\nvendor: bug fixes may exist for these problems in your operating system.  Sometimes such bug\nfixes are called an operating system upgrade.  If you have the source for Perl, include in\nthe bug report the output of the test described above in \"Testing for broken locales\".\n\n### SEE ALSO\n\nI18N::Langinfo, perluniintro, perlunicode, open, \"localeconv\" in POSIX, \"setlocale\" in POSIX,\n\"strcoll\" in POSIX, \"strftime\" in POSIX, \"strtod\" in POSIX, \"strxfrm\" in POSIX.\n\nFor special considerations when Perl is embedded in a C program, see \"Using embedded Perl\nwith POSIX locales\" in perlembed.\n\n### HISTORY\n\nJarkko Hietaniemi's original perli18n.pod heavily hacked by Dominic Dunlop, assisted by the\nperl5-porters.  Prose worked over a bit by Tom Christiansen, and now maintained by Perl 5\nporters.\n\n\n\nperl v5.34.0                                 2025-07-25                                PERLLOCALE(1)\n\n"
        }
    ],
    "structuredContent": {
        "command": "PERLLOCALE",
        "section": "1",
        "mode": "man",
        "summary": "perllocale - Perl locale handling (internationalization and localization)",
        "synopsis": null,
        "tldr_summary": null,
        "tldr_examples": [],
        "tldr_source": null,
        "flags": [],
        "examples": [],
        "see_also": [],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 45,
                "subsections": []
            },
            {
                "name": "WHAT IS A LOCALE",
                "lines": 35,
                "subsections": []
            },
            {
                "name": "PREPARING TO USE LOCALES",
                "lines": 30,
                "subsections": []
            },
            {
                "name": "USING LOCALES",
                "lines": 1,
                "subsections": [
                    {
                        "name": "The \"use locale\" pragma",
                        "lines": 24
                    },
                    {
                        "name": "Not within the scope of \"use locale\"",
                        "lines": 43
                    },
                    {
                        "name": "Under \"\"use locale\";\"",
                        "lines": 85
                    },
                    {
                        "name": "The setlocale function",
                        "lines": 86
                    },
                    {
                        "name": "Multi-threaded operation",
                        "lines": 38
                    },
                    {
                        "name": "Finding locales",
                        "lines": 67
                    },
                    {
                        "name": "Testing for broken locales",
                        "lines": 10
                    },
                    {
                        "name": "Temporarily fixing locale problems",
                        "lines": 37
                    },
                    {
                        "name": "Permanently fixing locale problems",
                        "lines": 16
                    },
                    {
                        "name": "Permanently fixing your system's locale configuration",
                        "lines": 13
                    },
                    {
                        "name": "Fixing system locale configuration",
                        "lines": 6
                    },
                    {
                        "name": "The localeconv function",
                        "lines": 64
                    },
                    {
                        "name": "I18N::Langinfo",
                        "lines": 21
                    }
                ]
            },
            {
                "name": "LOCALE CATEGORIES",
                "lines": 207,
                "subsections": [
                    {
                        "name": "Other categories",
                        "lines": 6
                    }
                ]
            },
            {
                "name": "SECURITY",
                "lines": 128,
                "subsections": []
            },
            {
                "name": "ENVIRONMENT",
                "lines": 71,
                "subsections": [
                    {
                        "name": "Examples",
                        "lines": 15
                    }
                ]
            },
            {
                "name": "NOTES",
                "lines": 29,
                "subsections": [
                    {
                        "name": "Backward compatibility",
                        "lines": 13
                    },
                    {
                        "name": "I18N:Collate obsolete",
                        "lines": 6
                    },
                    {
                        "name": "Sort speed and memory use impacts",
                        "lines": 7
                    },
                    {
                        "name": "Freely available locale definitions",
                        "lines": 15
                    },
                    {
                        "name": "I18n and l10n",
                        "lines": 4
                    },
                    {
                        "name": "An imperfect standard",
                        "lines": 5
                    },
                    {
                        "name": "Unicode and UTF-8",
                        "lines": 123
                    }
                ]
            },
            {
                "name": "BUGS",
                "lines": 1,
                "subsections": [
                    {
                        "name": "Collation of strings containing embedded \"NUL\" characters",
                        "lines": 8
                    },
                    {
                        "name": "Multi-threaded",
                        "lines": 27
                    },
                    {
                        "name": "Broken systems",
                        "lines": 8
                    }
                ]
            },
            {
                "name": "SEE ALSO",
                "lines": 6,
                "subsections": []
            },
            {
                "name": "HISTORY",
                "lines": 7,
                "subsections": []
            }
        ],
        "sections": {
            "NAME": {
                "content": "perllocale - Perl locale handling (internationalization and localization)\n",
                "subsections": []
            },
            "DESCRIPTION": {
                "content": "In the beginning there was ASCII, the \"American Standard Code for Information Interchange\",\nwhich works quite well for Americans with their English alphabet and dollar-denominated\ncurrency.  But it doesn't work so well even for other English speakers, who may use different\ncurrencies, such as the pound sterling (as the symbol for that currency is not in ASCII); and\nit's hopelessly inadequate for many of the thousands of the world's other languages.\n\nTo address these deficiencies, the concept of locales was invented (formally the ISO C, XPG4,\nPOSIX 1.c \"locale system\").  And applications were and are being written that use the locale\nmechanism.  The process of making such an application take account of its users' preferences\nin these kinds of matters is called internationalization (often abbreviated as i18n); telling\nsuch an application about a particular set of preferences is known as localization (l10n).\n\nPerl has been extended to support certain types of locales available in the locale system.\nThis is controlled per application by using one pragma, one function call, and several\nenvironment variables.\n\nPerl supports single-byte locales that are supersets of ASCII, such as the ISO 8859 ones, and\none multi-byte-type locale, UTF-8 ones, described in the next paragraph.  Perl doesn't\nsupport any other multi-byte locales, such as the ones for East Asian languages.\n\nUnfortunately, there are quite a few deficiencies with the design (and often, the\nimplementations) of locales.  Unicode was invented (see perlunitut for an introduction to\nthat) in part to address these design deficiencies, and nowadays, there is a series of \"UTF-8\nlocales\", based on Unicode.  These are locales whose character set is Unicode, encoded in\nUTF-8.  Starting in v5.20, Perl fully supports UTF-8 locales, except for sorting and string\ncomparisons like \"lt\" and \"ge\".  
Starting in v5.26, Perl can handle these reasonably as well,\ndepending on the platform's implementation.  However, for earlier releases or for better\ncontrol, use Unicode::Collate.  There are actually two slightly different types of UTF-8\nlocales: one for Turkic languages and one for everything else.\n\nStarting in Perl v5.30, Perl detects Turkic locales by their behaviour, and seamlessly\nhandles both types; previously only the non-Turkic one was supported.  The name of the locale\nis ignored: if your system has a \"tr_TR.UTF-8\" locale and it doesn't behave like a Turkic\nlocale, perl will treat it like a non-Turkic locale.\n\nPerl continues to support the old non UTF-8 locales as well.  There are currently no UTF-8\nlocales for EBCDIC platforms.\n\n(Unicode is also creating \"CLDR\", the \"Common Locale Data Repository\",\n<http://cldr.unicode.org/> which includes more types of information than are available in the\nPOSIX locale system.  At the time of this writing, there was no CPAN module that provides\naccess to this XML-encoded data.  However, it is possible to compute the POSIX locale data\nfrom them, and earlier CLDR versions had these already extracted for you as UTF-8 locales\n<http://unicode.org/Public/cldr/2.0.1/>.)\n",
                "subsections": []
            },
            "WHAT IS A LOCALE": {
                "content": "A locale is a set of data that describes various aspects of how various communities in the\nworld categorize their world.  These categories are broken down into the following types\n(some of which include a brief note here):\n\nCategory \"LCNUMERIC\": Numeric formatting\nThis indicates how numbers should be formatted for human readability, for example the\ncharacter used as the decimal point.\n\nCategory \"LCMONETARY\": Formatting of monetary amounts\n\n\nCategory \"LCTIME\": Date/Time formatting\n\n\nCategory \"LCMESSAGES\": Error and other messages\nThis is used by Perl itself only for accessing operating system error messages via $! and\n$^E.\n\nCategory \"LCCOLLATE\": Collation\nThis indicates the ordering of letters for comparison and sorting.  In Latin alphabets,\nfor example, \"b\", generally follows \"a\".\n\nCategory \"LCCTYPE\": Character Types\nThis indicates, for example if a character is an uppercase letter.\n\nOther categories\nSome platforms have other categories, dealing with such things as measurement units and\npaper sizes.  None of these are used directly by Perl, but outside operations that Perl\ninteracts with may use these.  See \"Not within the scope of \"use locale\"\" below.\n\nMore details on the categories used by Perl are given below in \"LOCALE CATEGORIES\".\n\nTogether, these categories go a long way towards being able to customize a single program to\nrun in many different locations.  But there are deficiencies, so keep reading.\n",
                "subsections": []
            },
            "PREPARING TO USE LOCALES": {
                "content": "Perl itself (outside the POSIX module) will not use locales unless specifically requested to\n(but again note that Perl may interact with code that does use them).  Even if there is such\na request, all of the following must be true for it to work properly:\n\n•   Your operating system must support the locale system.  If it does, you should find that\nthe \"setlocale()\" function is a documented part of its C library.\n\n•   Definitions for locales that you use must be installed.  You, or your system\nadministrator, must make sure that this is the case. The available locales, the location\nin which they are kept, and the manner in which they are installed all vary from system\nto system.  Some systems provide only a few, hard-wired locales and do not allow more to\nbe added.  Others allow you to add \"canned\" locales provided by the system supplier.\nStill others allow you or the system administrator to define and add arbitrary locales.\n(You may have to ask your supplier to provide canned locales that are not delivered with\nyour operating system.)  Read your system documentation for further illumination.\n\n•   Perl must believe that the locale system is supported.  If it does, \"perl -V:dsetlocale\"\nwill say that the value for \"dsetlocale\" is \"define\".\n\nIf you want a Perl application to process and present your data according to a particular\nlocale, the application code should include the \"use locale\" pragma (see \"The \"use locale\"\npragma\") where appropriate, and at least one of the following must be true:\n\n1.  The locale-determining environment variables (see \"ENVIRONMENT\") must be correctly set up\nat the time the application is started, either by yourself or by whomever set up your\nsystem account; or\n\n2.  The application must set its own locale using the method described in \"The setlocale\nfunction\".\n",
                "subsections": []
            },
            "USING LOCALES": {
                "content": "",
                "subsections": [
                    {
                        "name": "The \"use locale\" pragma",
                        "content": "Starting in Perl 5.28, this pragma may be used in multi-threaded applications on systems that\nhave thread-safe locale ability.  Some caveats apply, see \"Multi-threaded\" below.  On systems\nwithout this capability, or in earlier Perls, do NOT use this pragma in scripts that have\nmultiple threads active.  The locale in these cases is not local to a single thread.  Another\nthread may change the locale at any time, which could cause at a minimum that a given thread\nis operating in a locale it isn't expecting to be in.  On some platforms, segfaults can also\noccur.  The locale change need not be explicit; some operations cause perl to change the\nlocale itself.  You are vulnerable simply by having done a \"use locale\".\n\nBy default, Perl itself (outside the POSIX module) ignores the current locale.  The\n\"use locale\" pragma tells Perl to use the current locale for some operations.  Starting in\nv5.16, there are optional parameters to this pragma, described below, which restrict which\noperations are affected by it.\n\nThe current locale is set at execution time by setlocale() described below.  If that function\nhasn't yet been called in the course of the program's execution, the current locale is that\nwhich was determined by the \"ENVIRONMENT\" in effect at the start of the program.  If there is\nno valid environment, the current locale is whatever the system default has been set to.   On\nPOSIX systems, it is likely, but not necessarily, the \"C\" locale.  On Windows, the default is\nset via the computer's \"Control Panel->Regional and Language Options\" (or its current\nequivalent).\n\nThe operations that are affected by locale are:\n"
                    },
                    {
                        "name": "Not within the scope of \"use locale\"",
                        "content": "Only certain operations (all originating outside Perl) should be affected, as follows:\n\n•   The current locale is used when going outside of Perl with operations like system()\nor qx//, if those operations are locale-sensitive.\n\n•   Also Perl gives access to various C library functions through the POSIX module.  Some\nof those functions are always affected by the current locale.  For example,\n\"POSIX::strftime()\" uses \"LCTIME\"; \"POSIX::strtod()\" uses \"LCNUMERIC\";\n\"POSIX::strcoll()\" and \"POSIX::strxfrm()\" use \"LCCOLLATE\".  All such functions will\nbehave according to the current underlying locale, even if that locale isn't exposed\nto Perl space.\n\nThis applies as well to I18N::Langinfo.\n\n•   XS modules for all categories but \"LCNUMERIC\" get the underlying locale, and hence\nany C library functions they call will use that underlying locale.  For more\ndiscussion, see \"CAVEATS\" in perlxs.\n\nNote that all C programs (including the perl interpreter, which is written in C) always\nhave an underlying locale.  That locale is the \"C\" locale unless changed by a call to\nsetlocale().  When Perl starts up, it changes the underlying locale to the one which is\nindicated by the \"ENVIRONMENT\".  When using the POSIX module or writing XS code, it is\nimportant to keep in mind that the underlying locale may be something other than \"C\",\neven if the program hasn't explicitly changed it.\n\n\n\nLingering effects of \"use  locale\"\nCertain Perl operations that are set-up within the scope of a \"use locale\" retain that\neffect even outside the scope.  
These include:\n\n•   The output format of a write() is determined by an earlier format declaration\n(\"format\" in perlfunc), so whether or not the output is affected by locale is\ndetermined by if the \"format()\" is within the scope of a \"use locale\", not whether\nthe \"write()\" is.\n\n•   Regular expression patterns can be compiled using qr// with actual matching deferred\nto later.  Again, it is whether or not the compilation was done within the scope of\n\"use locale\" that determines the match behavior, not if the matches are done within\nsuch a scope or not.\n\n\n"
                    },
                    {
                        "name": "Under \"\"use locale\";\"",
                        "content": "•   All the above operations\n\n•   Format declarations (\"format\" in perlfunc) and hence any subsequent \"write()\"s use\n\"LCNUMERIC\".\n\n•   stringification and output use \"LCNUMERIC\".  These include the results of \"print()\",\n\"printf()\", \"say()\", and \"sprintf()\".\n\n•   The comparison operators (\"lt\", \"le\", \"cmp\", \"ge\", and \"gt\") use \"LCCOLLATE\".\n\"sort()\" is also affected if used without an explicit comparison function, because it\nuses \"cmp\" by default.\n\nNote: \"eq\" and \"ne\" are unaffected by locale: they always perform a char-by-char\ncomparison of their scalar operands.  What's more, if \"cmp\" finds that its operands\nare equal according to the collation sequence specified by the current locale, it\ngoes on to perform a char-by-char comparison, and only returns 0 (equal) if the\noperands are char-for-char identical.  If you really want to know whether two\nstrings--which \"eq\" and \"cmp\" may consider different--are equal as far as collation\nin the locale is concerned, see the discussion in \"Category \"LCCOLLATE\": Collation\".\n\n•   Regular expressions and case-modification functions (\"uc()\", \"lc()\", \"ucfirst()\", and\n\"lcfirst()\") use \"LCCTYPE\"\n\n•   The variables $! (and its synonyms $ERRNO and $OSERROR) and $^E (and its synonym\n$EXTENDEDOSERROR) when used as strings use \"LCMESSAGES\".\n\nThe default behavior is restored with the \"no locale\" pragma, or upon reaching the end of the\nblock enclosing \"use locale\".  Note that \"use locale\" calls may be nested, and that what is\nin effect within an inner scope will revert to the outer scope's rules at the end of the\ninner scope.\n\nThe string result of any operation that uses locale information is tainted, as it is possible\nfor a locale to be untrustworthy.  
See \"SECURITY\".\n\nStarting in Perl v5.16 in a very limited way, and more generally in v5.22, you can restrict\nwhich category or categories are enabled by this particular instance of the pragma by adding\nparameters to it.  For example,\n\nuse locale qw(:ctype :numeric);\n\nenables locale awareness within its scope of only those operations (listed above) that are\naffected by \"LC_CTYPE\" and \"LC_NUMERIC\".\n\nThe possible categories are: \":collate\", \":ctype\", \":messages\", \":monetary\", \":numeric\",\n\":time\", and the pseudo category \":characters\" (described below).\n\nThus you can say\n\nuse locale ':messages';\n\nand only $! and $^E will be locale aware.  Everything else is unaffected.\n\nSince Perl doesn't currently do anything with the \"LC_MONETARY\" category, specifying\n\":monetary\" does effectively nothing.  Some systems have other categories, such as\n\"LC_PAPER\", but Perl also doesn't do anything with them, and there is no way to specify them\nin this pragma's arguments.\n\nYou can also easily say to use all categories but one, by either, for example,\n\nuse locale ':!ctype';\nuse locale ':not_ctype';\n\nboth of which mean to enable locale awareness of all categories but \"LC_CTYPE\".  Only one\ncategory argument may be specified in a \"use locale\" if it is of the negated form.\n\nPrior to v5.22 only one form of the pragma with arguments is available:\n\nuse locale ':not_characters';\n\n(and you have to say \"not\"; you can't use the bang \"!\" form).  This pseudo category is a\nshorthand for specifying both \":collate\" and \":ctype\".  Hence, in the negated form, it is\nnearly the same thing as saying\n\nuse locale qw(:messages :monetary :numeric :time);\n\nWe use the term \"nearly\", because \":not_characters\" also turns on\n\"use feature 'unicode_strings'\" within its scope.  
This form is less useful in v5.20 and\nlater, and is described fully in \"Unicode and UTF-8\", but briefly, it tells Perl to not use\nthe character portions of the locale definition, that is the \"LC_CTYPE\" and \"LC_COLLATE\"\ncategories.  Instead it will use the native character set (extended by Unicode).  When using\nthis parameter, you are responsible for getting the external character set translated into\nthe native/Unicode one (which it already will be if it is one of the increasingly popular\nUTF-8 locales).  There are convenient ways of doing this, as described in \"Unicode and\nUTF-8\".\n"
                    },
                    {
                        "name": "The setlocale function",
                        "content": "WARNING!  Prior to Perl 5.28 or on a system that does not support thread-safe locale\noperations, do NOT use this function in a thread.  The locale will change in all other\nthreads at the same time, and should your thread get paused by the operating system, and\nanother started, that thread will not have the locale it is expecting.  On some platforms,\nthere can be a race leading to segfaults if two threads call this function nearly\nsimultaneously.  This warning does not apply on unthreaded builds, or on perls where\n\"${^SAFELOCALES}\" exists and is non-zero; namely Perl 5.28 and later unthreaded or compiled\nto be locale-thread-safe.\n\nYou can switch locales as often as you wish at run time with the \"POSIX::setlocale()\"\nfunction:\n\n# Import locale-handling tool set from POSIX module.\n# This example uses: setlocale -- the function call\n#                    LCCTYPE -- explained below\n# (Showing the testing for success/failure of operations is\n# omitted in these examples to avoid distracting from the main\n# point)\n\nuse POSIX qw(localeh);\nuse locale;\nmy $oldlocale;\n\n# query and save the old locale\n$oldlocale = setlocale(LCCTYPE);\n\nsetlocale(LCCTYPE, \"frCA.ISO8859-1\");\n# LCCTYPE now in locale \"French, Canada, codeset ISO 8859-1\"\n\nsetlocale(LCCTYPE, \"\");\n# LCCTYPE now reset to the default defined by the\n# LCALL/LCCTYPE/LANG environment variables, or to the system\n# default.  See below for documentation.\n\n# restore the old locale\nsetlocale(LCCTYPE, $oldlocale);\n\nThe first argument of \"setlocale()\" gives the category, the second the locale.  The category\ntells in what aspect of data processing you want to apply locale-specific rules.  Category\nnames are discussed in \"LOCALE CATEGORIES\" and \"ENVIRONMENT\".  The locale is the name of a\ncollection of customization information corresponding to a particular combination of\nlanguage, country or territory, and codeset.  
Read on for hints on the naming of locales: not\nall systems name locales as in the example.\n\nIf no second argument is provided and the category is something other than \"LC_ALL\", the\nfunction returns a string naming the current locale for the category.  You can use this value\nas the second argument in a subsequent call to \"setlocale()\", but on some platforms the\nstring is opaque, not something that most people would be able to decipher as to what locale\nit means.\n\nIf no second argument is provided and the category is \"LC_ALL\", the result is\nimplementation-dependent.  It may be a string of concatenated locale names (separator also\nimplementation-dependent) or a single locale name.  Please consult your setlocale(3) man page\nfor details.\n\nIf a second argument is given and it corresponds to a valid locale, the locale for the\ncategory is set to that value, and the function returns the now-current locale value.  You\ncan then use this in yet another call to \"setlocale()\".  (In some implementations, the return\nvalue may sometimes differ from the value you gave as the second argument--think of it as an\nalias for the value you gave.)\n\nAs the example shows, if the second argument is an empty string, the category's locale is\nreturned to the default specified by the corresponding environment variables.  
Generally,\nthis results in a return to the default that was in force when Perl started up: changes to\nthe environment made by the application after startup may or may not be noticed, depending on\nyour system's C library.\n\nNote that when a form of \"use locale\" that doesn't include all categories is specified, Perl\nignores the excluded categories.\n\nIf \"setlocale()\" fails for some reason (for example, an attempt to set to a locale unknown to\nthe system), the locale for the category is not changed, and the function returns \"undef\".\n\nStarting in Perl 5.28, on multi-threaded perls compiled on systems that implement POSIX 2008\nthread-safe locale operations, this function doesn't actually call the system \"setlocale\".\nInstead those thread-safe operations are used to emulate the \"setlocale\" function, but in a\nthread-safe manner.\n\nYou can force the thread-safe locale operations to always be used (if available) by\nrecompiling perl with\n\n-Accflags='-DUSE_THREAD_SAFE_LOCALE'\n\nadded to your call to Configure.\n\nFor further information about the categories, consult setlocale(3).\n"
                    },
                    {
                        "name": "Multi-threaded operation",
                        "content": "Beginning in Perl 5.28, multi-threaded locale operation is supported on systems that\nimplement either the POSIX 2008 or Windows-specific thread-safe locale operations.  Many\nmodern systems, such as various Unix variants and Darwin do have this.\n\nYou can tell if using locales is safe on your system by looking at the read-only boolean\nvariable \"${^SAFELOCALES}\".  The value is 1 if the perl is not threaded, or if it is using\nthread-safe locale operations.\n\nThread-safe operations are supported in Windows starting in Visual Studio 2005, and in\nsystems compatible with POSIX 2008.  Some platforms claim to support POSIX 2008, but have\nbuggy implementations, so that the hints files for compiling to run on them turn off\nattempting to use thread-safety.  \"${^SAFELOCALES}\" will be 0 on them.\n\nBe aware that writing a multi-threaded application will not be portable to a platform which\nlacks the native thread-safe locale support.  On systems that do have it, you automatically\nget this behavior for threaded perls, without having to do anything.  If for some reason, you\ndon't want to use this capability (perhaps the POSIX 2008 support is buggy on your system),\nyou can manually compile Perl to use the old non-thread-safe implementation by passing the\nargument \"-Accflags='-DNOTHREADSAFELOCALE'\" to Configure.  Except on Windows, this will\ncontinue to use certain of the POSIX 2008 functions in some situations.  If these are buggy,\nyou can pass the following to Configure instead or additionally:\n\"-Accflags='-DNOPOSIX2008LOCALE'\".  This will also keep the code from using thread-safe\nlocales.  \"${^SAFELOCALES}\" will be 0 on systems that turn off the thread-safe operations.\n\nNormally on unthreaded builds, the traditional \"setlocale()\" is used and not the thread-safe\nlocale functions.  
You can force the use of these on systems that have them by adding\n\"-Accflags='-DUSE_THREAD_SAFE_LOCALE'\" to your call to Configure.\n\nThe initial program is started up using the locale specified by the environment, as\ndescribed in \"ENVIRONMENT\".  All newly created threads start with \"LC_ALL\" set to\n\"C\".  Each thread may use \"POSIX::setlocale()\" to query or switch its locale at any time,\nwithout affecting any other thread.  All locale-dependent operations automatically use their\nthread's locale.\n\nThis should be completely transparent to any applications written entirely in Perl (minus a\nfew rarely encountered caveats given in the \"Multi-threaded\" section).  Information for XS\nmodule writers is given in \"Locale-aware XS code\" in perlxs.\n"
                    },
                    {
                        "name": "Finding locales",
                        "content": "For locales available in your system, consult also setlocale(3) to see whether it leads to\nthe list of available locales (search for the SEE ALSO section).  If that fails, try the\nfollowing command lines:\n\nlocale -a\n\nnlsinfo\n\nls /usr/lib/nls/loc\n\nls /usr/lib/locale\n\nls /usr/lib/nls\n\nls /usr/share/locale\n\nand see whether they list something resembling these\n\nenUS.ISO8859-1     deDE.ISO8859-1     ruRU.ISO8859-5\nenUS.iso88591      deDE.iso88591      ruRU.iso88595\nenUS               deDE               ruRU\nen                  de                  ru\nenglish             german              russian\nenglish.iso88591    german.iso88591     russian.iso88595\nenglish.roman8                          russian.koi8r\n\nSadly, even though the calling interface for \"setlocale()\" has been standardized, names of\nlocales and the directories where the configuration resides have not been.  The basic form of\nthe name is languageterritory.codeset, but the latter parts after language are not always\npresent.  The language and country are usually from the standards ISO 3166 and ISO 639, the\ntwo-letter abbreviations for the countries and the languages of the world, respectively.  The\ncodeset part often mentions some ISO 8859 character set, the Latin codesets.  For example,\n\"ISO 8859-1\" is the so-called \"Western European codeset\" that can be used to encode most\nWestern European languages adequately.  Again, there are several ways to write even the name\nof that one standard.  Lamentably.\n\nTwo special locales are worth particular mention: \"C\" and \"POSIX\".  Currently these are\neffectively the same locale: the difference is mainly that the first one is defined by the C\nstandard, the second by the POSIX standard.  They define the default locale in which every\nprogram starts in the absence of locale information in its environment.  (The default default\nlocale, if you will.)  
Its language is (American) English and its character codeset ASCII or,\nrarely, a superset thereof (such as the \"DEC Multinational Character Set (DEC-MCS)\").\nWarning. The C locale delivered by some vendors may not actually exactly match what the C\nstandard calls for.  So beware.\n\nNOTE: Not all systems have the \"POSIX\" locale (not all systems are POSIX-conformant), so use\n\"C\" when you need explicitly to specify this default locale.\n\nLOCALE PROBLEMS\nYou may encounter the following warning message at Perl startup:\n\nperl: warning: Setting locale failed.\nperl: warning: Please check that your locale settings:\nLC_ALL = \"En_US\",\nLANG = (unset)\nare supported and installed on your system.\nperl: warning: Falling back to the standard locale (\"C\").\n\nThis means that your locale settings had \"LC_ALL\" set to \"En_US\" and LANG exists but has no\nvalue.  Perl tried to believe you but could not.  Instead, Perl gave up and fell back to the\n\"C\" locale, the default locale that is supposed to work no matter what.  (On Windows, it\nfirst tries falling back to the system default locale.)  This usually means your locale\nsettings were wrong, they mention locales your system has never heard of, or the locale\ninstallation in your system has problems (for example, some system files are broken or\nmissing).  There are quick and temporary fixes to these problems, as well as more thorough\nand lasting fixes.\n"
                    },
                    {
                        "name": "Testing for broken locales",
                        "content": "If you are building Perl from source, the Perl test suite file lib/locale.t can be used to\ntest the locales on your system.  Setting the environment variable \"PERLDEBUGFULLTEST\" to\n1 will cause it to output detailed results.  For example, on Linux, you could say\n\nPERLDEBUGFULLTEST=1 ./perl -T -Ilib lib/locale.t > locale.log 2>&1\n\nBesides many other tests, it will test every locale it finds on your system to see if they\nconform to the POSIX standard.  If any have errors, it will include a summary near the end of\nthe output of which locales passed all its tests, and which failed, and why.\n"
                    },
                    {
                        "name": "Temporarily fixing locale problems",
                        "content": "The two quickest fixes are either to render Perl silent about any locale inconsistencies or\nto run Perl under the default locale \"C\".\n\nPerl's moaning about locale problems can be silenced by setting the environment variable\n\"PERLBADLANG\" to \"0\" or \"\".  This method really just sweeps the problem under the carpet:\nyou tell Perl to shut up even when Perl sees that something is wrong.  Do not be surprised if\nlater something locale-dependent misbehaves.\n\nPerl can be run under the \"C\" locale by setting the environment variable \"LCALL\" to \"C\".\nThis method is perhaps a bit more civilized than the \"PERLBADLANG\" approach, but setting\n\"LCALL\" (or other locale variables) may affect other programs as well, not just Perl.  In\nparticular, external programs run from within Perl will see these changes.  If you make the\nnew settings permanent (read on), all programs you run see the changes.  See \"ENVIRONMENT\"\nfor the full list of relevant environment variables and \"USING LOCALES\" for their effects in\nPerl.  Effects in other programs are easily deducible.  For example, the variable\n\"LCCOLLATE\" may well affect your sort program (or whatever the program that arranges\n\"records\" alphabetically in your system is called).\n\nYou can test out changing these variables temporarily, and if the new settings seem to help,\nput those settings into your shell startup files.  Consult your local documentation for the\nexact details.  For Bourne-like shells (sh, ksh, bash, zsh):\n\nLCALL=enUS.ISO8859-1\nexport LCALL\n\nThis assumes that we saw the locale \"enUS.ISO8859-1\" using the commands discussed above.  
We\ndecided to try that instead of the above faulty locale \"En_US\".  In csh-like shells (csh,\ntcsh) the equivalent is\n\nsetenv LC_ALL en_US.ISO8859-1\n\nor if you have the \"env\" application you can do (in any shell)\n\nenv LC_ALL=en_US.ISO8859-1 perl ...\n\nIf you do not know what shell you have, consult your local helpdesk or the equivalent.\n"
                    },
                    {
                        "name": "Permanently fixing locale problems",
                        "content": "The slower but superior fixes are when you may be able to yourself fix the misconfiguration\nof your own environment variables.  The mis(sing)configuration of the whole system's locales\nusually requires the help of your friendly system administrator.\n\nFirst, see earlier in this document about \"Finding locales\".  That tells how to find which\nlocales are really supported--and more importantly, installed--on your system.  In our\nexample error message, environment variables affecting the locale are listed in the order of\ndecreasing importance (and unset variables do not matter).  Therefore, having LCALL set to\n\"EnUS\" must have been the bad choice, as shown by the error message.  First try fixing\nlocale settings listed first.\n\nSecond, if using the listed commands you see something exactly (prefix matches do not count\nand case usually counts) like \"EnUS\" without the quotes, then you should be okay because you\nare using a locale name that should be installed and available in your system.  In this case,\nsee \"Permanently fixing your system's locale configuration\".\n"
                    },
                    {
                        "name": "Permanently fixing your system's locale configuration",
                        "content": "This is when you see something like:\n\nperl: warning: Please check that your locale settings:\nLCALL = \"EnUS\",\nLANG = (unset)\nare supported and installed on your system.\n\nbut then cannot see that \"EnUS\" listed by the above-mentioned commands.  You may see things\nlike \"enUS.ISO8859-1\", but that isn't the same.  In this case, try running under a locale\nthat you can list and which somehow matches what you tried.  The rules for matching locale\nnames are a bit vague because standardization is weak in this area.  See again the \"Finding\nlocales\" about general rules.\n"
                    },
                    {
                        "name": "Fixing system locale configuration",
                        "content": "Contact a system administrator (preferably your own) and report the exact error message you\nget, and ask them to read this same documentation you are now reading.  They should be able\nto check whether there is something wrong with the locale configuration of the system.  The\n\"Finding locales\" section is unfortunately a bit vague about the exact commands and places\nbecause these things are not that standardized.\n"
                    },
                    {
                        "name": "The localeconv function",
                        "content": "The \"POSIX::localeconv()\" function allows you to get particulars of the locale-dependent\nnumeric formatting information specified by the current underlying \"LCNUMERIC\" and\n\"LCMONETARY\" locales (regardless of whether called from within the scope of \"use locale\" or\nnot).  (If you just want the name of the current locale for a particular category, use\n\"POSIX::setlocale()\" with a single parameter--see \"The setlocale function\".)\n\nuse POSIX qw(localeh);\n\n# Get a reference to a hash of locale-dependent info\n$localevalues = localeconv();\n\n# Output sorted list of the values\nfor (sort keys %$localevalues) {\nprintf \"%-20s = %s\\n\", $, $localevalues->{$}\n}\n\n\"localeconv()\" takes no arguments, and returns a reference to a hash.  The keys of this hash\nare variable names for formatting, such as \"decimalpoint\" and \"thousandssep\".  The values\nare the corresponding, er, values.  See \"localeconv\" in POSIX for a longer example listing\nthe categories an implementation might be expected to provide; some provide more and others\nfewer.  You don't need an explicit \"use locale\", because \"localeconv()\" always observes the\ncurrent locale.\n\nHere's a simple-minded example program that rewrites its command-line parameters as integers\ncorrectly formatted in the current locale:\n\nuse POSIX qw(localeh);\n\n# Get some of locale's numeric formatting parameters\nmy ($thousandssep, $grouping) =\n@{localeconv()}{'thousandssep', 'grouping'};\n\n# Apply defaults if values are missing\n$thousandssep = ',' unless $thousandssep;\n\n# grouping and mongrouping are packed lists\n# of small integers (characters) telling the\n# grouping (thousandseps and monthousandseps\n# being the group dividers) of numbers and\n# monetary quantities.  The integers' meanings:\n# 255 means no more grouping, 0 means repeat\n# the previous grouping, 1-254 means use that\n# as the current grouping.  
Grouping goes from\n# right to left (low to high digits).  In the\n# below we cheat slightly by never using anything\n# else than the first grouping (whatever that is).\nif ($grouping) {\n@grouping = unpack(\"C*\", $grouping);\n} else {\n@grouping = (3);\n}\n\n# Format command line params for current locale\nfor (@ARGV) {\n$ = int;    # Chop non-integer part\n1 while\ns/(\\d)(\\d{$grouping[0]}($|$thousandssep))/$1$thousandssep$2/;\nprint \"$\";\n}\nprint \"\\n\";\n\nNote that if the platform doesn't have \"LCNUMERIC\" and/or \"LCMONETARY\" available or\nenabled, the corresponding elements of the hash will be missing.\n"
                    },
                    {
                        "name": "I18N::Langinfo",
                        "content": "Another interface for querying locale-dependent information is the\n\"I18N::Langinfo::langinfo()\" function.\n\nThe following example will import the \"langinfo()\" function itself and three constants to be\nused as arguments to \"langinfo()\": a constant for the abbreviated first day of the week (the\nnumbering starts from Sunday = 1) and two more constants for the affirmative and negative\nanswers for a yes/no question in the current locale.\n\nuse I18N::Langinfo qw(langinfo ABDAY1 YESSTR NOSTR);\n\nmy ($abday1, $yesstr, $nostr)\n= map { langinfo } qw(ABDAY1 YESSTR NOSTR);\n\nprint \"$abday1? [$yesstr/$nostr] \";\n\nIn other words, in the \"C\" (or English) locale the above will probably print something like:\n\nSun? [yes/no]\n\nSee I18N::Langinfo for more information.\n"
                    }
                ]
            },
            "LOCALE CATEGORIES": {
                "content": "The following subsections describe basic locale categories.  Beyond these, some combination\ncategories allow manipulation of more than one basic category at a time.  See \"ENVIRONMENT\"\nfor a discussion of these.\n\nCategory \"LCCOLLATE\": Collation: Text Comparisons and Sorting\nIn the scope of a \"use locale\" form that includes collation, Perl looks to the \"LCCOLLATE\"\nenvironment variable to determine the application's notions on collation (ordering) of\ncharacters.  For example, \"b\" follows \"a\" in Latin alphabets, but where do \"á\" and \"å\"\nbelong?  And while \"color\" follows \"chocolate\" in English, what about in traditional Spanish?\n\nThe following collations all make sense and you may meet any of them if you \"use locale\".\n\nA B C D E a b c d e\nA a B b C c D d E e\na A b B c C d D e E\na b c d e A B C D E\n\nHere is a code snippet to tell what \"word\" characters are in the current locale, in that\nlocale's order:\n\nuse locale;\nprint +(sort grep /\\w/, map { chr } 0..255), \"\\n\";\n\nCompare this with the characters that you see and their order if you state explicitly that\nthe locale should be ignored:\n\nno locale;\nprint +(sort grep /\\w/, map { chr } 0..255), \"\\n\";\n\nThis machine-native collation (which is what you get unless \"use locale\" has appeared earlier\nin the same block) must be used for sorting raw binary data, whereas the locale-dependent\ncollation of the first example is useful for natural text.\n\nAs noted in \"USING LOCALES\", \"cmp\" compares according to the current collation locale when\n\"use locale\" is in effect, but falls back to a char-by-char comparison for strings that the\nlocale says are equal. 
You can use \"POSIX::strcoll()\" if you don't want this fall-back:\n\nuse POSIX qw(strcoll);\n$equalinlocale =\n!strcoll(\"space and case ignored\", \"SpaceAndCaseIgnored\");\n\n$equalinlocale will be true if the collation locale specifies a dictionary-like ordering\nthat ignores space characters completely and which folds case.\n\nPerl uses the platform's C library collation functions \"strcoll()\" and \"strxfrm()\".  That\nmeans you get whatever they give.  On some platforms, these functions work well on UTF-8\nlocales, giving a reasonable default collation for the code points that are important in that\nlocale.  (And if they aren't working well, the problem may only be that the locale definition\nis deficient, so can be fixed by using a better definition file.  Unicode's definitions (see\n\"Freely available locale definitions\") provide reasonable UTF-8 locale collation\ndefinitions.)  Starting in Perl v5.26, Perl's use of these functions has been made more\nseamless.  This may be sufficient for your needs.  For more control, and to make sure strings\ncontaining any code point (not just the ones important in the locale) collate properly, the\nUnicode::Collate module is suggested.\n\nIn non-UTF-8 locales (hence single byte), code points above 0xFF are technically invalid.\nBut if present, again starting in v5.26, they will collate to the same position as the\nhighest valid code point does.  This generally gives good results, but the collation order\nmay be skewed if the valid code point gets special treatment when it forms particular\nsequences with other characters as defined by the locale.  
When two strings collate\nidentically, the code point order is used as a tie breaker.\n\nIf Perl detects that there are problems with the locale collation order, it reverts to using\nnon-locale collation rules for that locale.\n\nIf you have a single string that you want to check for \"equality in locale\" against several\nothers, you might think you could gain a little efficiency by using \"POSIX::strxfrm()\" in\nconjunction with \"eq\":\n\nuse POSIX qw(strxfrm);\n$xfrm_string = strxfrm(\"Mixed-case string\");\nprint \"locale collation ignores spaces\\n\"\nif $xfrm_string eq strxfrm(\"Mixed-casestring\");\nprint \"locale collation ignores hyphens\\n\"\nif $xfrm_string eq strxfrm(\"Mixedcase string\");\nprint \"locale collation ignores case\\n\"\nif $xfrm_string eq strxfrm(\"mixed-case string\");\n\n\"strxfrm()\" takes a string and maps it into a transformed string for use in char-by-char\ncomparisons against other transformed strings during collation.  \"Under the hood\",\nlocale-affected Perl comparison operators call \"strxfrm()\" for both operands, then do a\nchar-by-char comparison of the transformed strings.  By calling \"strxfrm()\" explicitly and\nusing a non locale-affected comparison, the example attempts to save a couple of\ntransformations.  But in fact, it doesn't save anything: Perl magic (see \"Magic Variables\" in\nperlguts) creates the transformed version of a string the first time it's needed in a\ncomparison, then keeps this version around in case it's needed again.  An example rewritten\nthe easy way with \"cmp\" runs just about as fast.  It also copes with null characters embedded\nin strings; if you call \"strxfrm()\" directly, it treats the first null it finds as a\nterminator.  Don't expect the transformed strings it produces to be portable across\nsystems--or even from one revision of your operating system to the next.  
In short, don't call \"strxfrm()\" directly: let Perl do it\nfor you.\n\nNote: \"use locale\" isn't shown in some of these examples because it isn't needed: \"strcoll()\"\nand \"strxfrm()\" are POSIX functions which use the standard system-supplied \"libc\" functions\nthat always obey the current \"LC_COLLATE\" locale.\n\nCategory \"LC_CTYPE\": Character Types\nIn the scope of a \"use locale\" form that includes \"LC_CTYPE\", Perl obeys the \"LC_CTYPE\"\nlocale setting.  This controls the application's notion of which characters are alphabetic,\nnumeric, punctuation, etc.  This affects Perl's \"\\w\" regular expression metanotation, which\nstands for alphanumeric characters--that is, alphabetic, numeric, and the platform's native\nunderscore.  (Consult perlre for more information about regular expressions.)  Thanks to\n\"LC_CTYPE\", depending on your locale setting, characters like \"æ\", \"ð\", \"ß\", and \"ø\" may be\nunderstood as \"\\w\" characters.  It also affects things like \"\\s\", \"\\D\", and the POSIX\ncharacter classes, like \"[[:graph:]]\".  (See perlrecharclass for more information on all\nthese.)\n\nThe \"LC_CTYPE\" locale also provides the map used in transliterating characters between lower\nand uppercase.  This affects the case-mapping functions--\"fc()\", \"lc()\", \"lcfirst()\", \"uc()\",\nand \"ucfirst()\"; case-mapping interpolation with \"\\F\", \"\\l\", \"\\L\", \"\\u\", or \"\\U\" in\ndouble-quoted strings and \"s///\" substitutions; and case-insensitive regular expression\npattern matching using the \"i\" modifier.\n\nStarting in v5.20, Perl supports UTF-8 locales for \"LC_CTYPE\", but otherwise Perl only\nsupports single-byte locales, such as the ISO 8859 series.  This means that wide character\nlocales, for example for Asian languages, are not well-supported.  Use of these locales may\ncause core dumps.  
If the platform has the capability for Perl to detect such a locale,\nstarting in Perl v5.22, Perl will warn, default enabled, using the \"locale\" warning category,\nwhenever such a locale is switched into.  The UTF-8 locale support is actually a superset of\nPOSIX locales, because it is really full Unicode behavior as if no \"LC_CTYPE\" locale were in\neffect at all (except for tainting; see \"SECURITY\").  POSIX locales, even UTF-8 ones, are\nlacking certain concepts in Unicode, such as the idea that changing the case of a character\ncould expand to be more than one character.  Perl in a UTF-8 locale will give you that\nexpansion.  Prior to v5.20, Perl treated a UTF-8 locale on some platforms like an ISO 8859-1\none, with some restrictions, and on other platforms more like the \"C\" locale.  For releases\nv5.16 and v5.18, \"use locale ':not_characters'\" could be used as a workaround for this (see\n\"Unicode and UTF-8\").\n\nNote that there are quite a few things that are unaffected by the current locale.  Any\nliteral character is the native character for the given platform.  Hence 'A' means the\ncharacter at code point 65 on ASCII platforms, and 193 on EBCDIC.  That may or may not be an\n'A' in the current locale, if that locale even has an 'A'.  Similarly, all the escape\nsequences for particular characters, \"\\n\" for example, always mean the platform's native one.\nThis means, for example, that \"\\N\" in regular expressions (every character but new-line)\nworks on the platform character set.\n\nStarting in v5.22, Perl will by default warn when switching into a locale that redefines any\nASCII printable character (plus \"\\t\" and \"\\n\") into a different class than expected.  This is\nlikely to happen on modern locales only on EBCDIC platforms, where, for example, a CCSID 0037\nlocale on a CCSID 1047 machine moves \"[\", but it can happen on ASCII platforms with the ISO\n646 and other 7-bit locales that are essentially obsolete.  
Things may still work, depending\non what features of Perl are used by the program.  For example, in the example from above\nwhere \"|\" becomes a \"\\w\", and there are no regular expressions where this matters, the\nprogram may still work properly.  The warning lists all the characters that it can determine\ncould be adversely affected.\n\nNote: A broken or malicious \"LC_CTYPE\" locale definition may result in clearly ineligible\ncharacters being considered to be alphanumeric by your application.  For strict matching of\n(mundane) ASCII letters and digits--for example, in command strings--locale-aware\napplications should use \"\\w\" with the \"/a\" regular expression modifier.  See \"SECURITY\".\n\nCategory \"LC_NUMERIC\": Numeric Formatting\nAfter a proper \"POSIX::setlocale()\" call, and within the scope of a \"use locale\" form that\nincludes numerics, Perl obeys the \"LC_NUMERIC\" locale information, which controls an\napplication's idea of how numbers should be formatted for human readability.  In most\nimplementations the only effect is to change the character used for the decimal\npoint--perhaps from \".\"  to \",\".  The functions aren't aware of such niceties as thousands\nseparation and so on. (See \"The localeconv function\" if you care about these things.)\n\nuse POSIX qw(strtod setlocale LC_NUMERIC);\nuse locale;\n\nsetlocale LC_NUMERIC, \"\";\n\n$n = 5/2;   # Assign numeric 2.5 to $n\n\n$a = \" $n\"; # Locale-dependent conversion to string\n\nprint \"half five is $n\\n\";       # Locale-dependent output\n\nprintf \"half five is %g\\n\", $n;  # Locale-dependent output\n\nprint \"DECIMAL POINT IS COMMA\\n\"\nif $n == (strtod(\"2,5\"))[0]; # Locale-dependent conversion\n\nSee also I18N::Langinfo and \"RADIXCHAR\".\n\nCategory \"LC_MONETARY\": Formatting of monetary amounts\nThe C standard defines the \"LC_MONETARY\" category, but not a function that is affected by its\ncontents.  
(Those with experience of standards committees will recognize that the working\ngroup decided to punt on the issue.)  Consequently, Perl essentially takes no notice of it.\nIf you really want to use \"LC_MONETARY\", you can query its contents--see \"The localeconv\nfunction\"--and use the information that it returns in your application's own formatting of\ncurrency amounts.  However, you may well find that the information, voluminous and complex\nthough it may be, still does not quite meet your requirements: currency formatting is a hard\nnut to crack.\n\nSee also I18N::Langinfo and \"CRNCYSTR\".\n\nCategory \"LC_TIME\": Representation of time\nOutput produced by \"POSIX::strftime()\", which builds a formatted human-readable date/time\nstring, is affected by the current \"LC_TIME\" locale.  Thus, in a French locale, the output\nproduced by the %B format element (full month name) for the first month of the year would be\n\"janvier\".  Here's how to get a list of long month names in the current locale:\n\nuse POSIX qw(strftime);\nfor (0..11) {\n$long_month_name[$_] =\nstrftime(\"%B\", 0, 0, 0, 1, $_, 96);\n}\n\nNote: \"use locale\" isn't needed in this example: \"strftime()\" is a POSIX function which uses\nthe standard system-supplied \"libc\" function that always obeys the current \"LC_TIME\" locale.\n\nSee also I18N::Langinfo and \"ABDAY_1\"..\"ABDAY_7\", \"DAY_1\"..\"DAY_7\", \"ABMON_1\"..\"ABMON_12\",\nand \"MON_1\"..\"MON_12\".\n",
                "subsections": [
                    {
                        "name": "Other categories",
                        "content": "The remaining locale categories are not currently used by Perl itself.  But again note that\nthings Perl interacts with may use these, including extensions outside the standard Perl\ndistribution, and by the operating system and its utilities.  Note especially that the string\nvalue of $! and the error messages given by external utilities may be changed by\n\"LCMESSAGES\".  If you want to have portable error codes, use \"%!\".  See Errno.\n"
                    }
                ]
            },
            "SECURITY": {
                "content": "Although the main discussion of Perl security issues can be found in perlsec, a discussion of\nPerl's locale handling would be incomplete if it did not draw your attention to locale-\ndependent security issues.  Locales--particularly on systems that allow unprivileged users to\nbuild their own locales--are untrustworthy.  A malicious (or just plain broken) locale can\nmake a locale-aware application give unexpected results.  Here are a few possibilities:\n\n•   Regular expression checks for safe file names or mail addresses using \"\\w\" may be spoofed\nby an \"LCCTYPE\" locale that claims that characters such as \">\" and \"|\" are alphanumeric.\n\n•   String interpolation with case-mapping, as in, say, \"$dest = \"C:\\U$name.$ext\"\", may\nproduce dangerous results if a bogus \"LCCTYPE\" case-mapping table is in effect.\n\n•   A sneaky \"LCCOLLATE\" locale could result in the names of students with \"D\" grades\nappearing ahead of those with \"A\"s.\n\n•   An application that takes the trouble to use information in \"LCMONETARY\" may format\ndebits as if they were credits and vice versa if that locale has been subverted.  Or it\nmight make payments in US dollars instead of Hong Kong dollars.\n\n•   The date and day names in dates formatted by \"strftime()\" could be manipulated to\nadvantage by a malicious user able to subvert the \"LCDATE\" locale.  (\"Look--it says I\nwasn't in the building on Sunday.\")\n\nSuch dangers are not peculiar to the locale system: any aspect of an application's\nenvironment which may be modified maliciously presents similar challenges.  
Similarly, they\nare not specific to Perl: any programming language that allows you to write programs that\ntake account of their environment exposes you to these issues.\n\nPerl cannot protect you from all possibilities shown in the examples--there is no substitute\nfor your own vigilance--but, when \"use locale\" is in effect, Perl uses the tainting mechanism\n(see perlsec) to mark string results that become locale-dependent, and which may be\nuntrustworthy in consequence.  Here is a summary of the tainting behavior of operators and\nfunctions that may be affected by the locale:\n\n•   Comparison operators (\"lt\", \"le\", \"ge\", \"gt\" and \"cmp\"):\n\nScalar true/false (or less/equal/greater) result is never tainted.\n\n•   Case-mapping interpolation (with \"\\l\", \"\\L\", \"\\u\", \"\\U\", or \"\\F\")\n\nThe result string containing interpolated material is tainted if a \"use locale\" form that\nincludes \"LC_CTYPE\" is in effect.\n\n•   Matching operator (\"m//\"):\n\nScalar true/false result is never tainted.\n\nAll subpatterns, either delivered as a list-context result or as $1 etc., are tainted if\na \"use locale\" form that includes \"LC_CTYPE\" is in effect, and the subpattern regular\nexpression contains a locale-dependent construct.  These constructs include \"\\w\" (to\nmatch an alphanumeric character), \"\\W\" (non-alphanumeric character), \"\\b\" and \"\\B\" (word-\nboundary and non-boundary, which depend on what \"\\w\" and \"\\W\" match), \"\\s\" (whitespace\ncharacter), \"\\S\" (non-whitespace character), \"\\d\" and \"\\D\" (digits and non-digits), and\nthe POSIX character classes, such as \"[:alpha:]\" (see \"POSIX Character Classes\" in\nperlrecharclass).\n\nTainting is also likely if the pattern is to be matched case-insensitively (via \"/i\").\nThe exception is if all the code points to be matched this way are above 255 and do not\nhave folds under Unicode rules to below 256.  
Tainting is not done for these because Perl\nonly uses Unicode rules for such code points, and those rules are the same no matter what\nthe current locale.\n\nThe matched-pattern variables, $&, \"$`\" (pre-match), \"$'\" (post-match), and $+ (last\nmatch) also are tainted.\n\n•   Substitution operator (\"s///\"):\n\nHas the same behavior as the match operator.  Also, the left operand of \"=~\" becomes\ntainted when a \"use locale\" form that includes \"LC_CTYPE\" is in effect, if modified as a\nresult of a substitution based on a regular expression match involving any of the things\nmentioned in the previous item, or of case-mapping, such as \"\\l\", \"\\L\", \"\\u\", \"\\U\", or\n\"\\F\".\n\n•   Output formatting functions (\"printf()\" and \"write()\"):\n\nResults are never tainted because otherwise even output from print, for example\n\"print(1/7)\", should be tainted if \"use locale\" is in effect.\n\n•   Case-mapping functions (\"lc()\", \"lcfirst()\", \"uc()\", \"ucfirst()\"):\n\nResults are tainted if a \"use locale\" form that includes \"LC_CTYPE\" is in effect.\n\n•   POSIX locale-dependent functions (\"localeconv()\", \"strcoll()\", \"strftime()\",\n\"strxfrm()\"):\n\nResults are never tainted.\n\nThree examples illustrate locale-dependent tainting.  
The first program, which ignores its\nlocale, won't run: a value taken directly from the command line may not be used to name an\noutput file when taint checks are enabled.\n\n#!/usr/local/bin/perl -T\n# Run with taint checking\n\n# Command line sanity check omitted...\n$tainted_output_file = shift;\n\nopen(F, \">$tainted_output_file\")\n    or warn \"Open of $tainted_output_file failed: $!\\n\";\n\nThe program can be made to run by \"laundering\" the tainted value through a regular\nexpression: the second example--which still ignores locale information--runs, creating the\nfile named on its command line if it can.\n\n#!/usr/local/bin/perl -T\n\n$tainted_output_file = shift;\n$tainted_output_file =~ m%[\\w/]+%;\n$untainted_output_file = $&;\n\nopen(F, \">$untainted_output_file\")\n    or warn \"Open of $untainted_output_file failed: $!\\n\";\n\nCompare this with a similar but locale-aware program:\n\n#!/usr/local/bin/perl -T\n\n$tainted_output_file = shift;\nuse locale;\n$tainted_output_file =~ m%[\\w/]+%;\n$localized_output_file = $&;\n\nopen(F, \">$localized_output_file\")\n    or warn \"Open of $localized_output_file failed: $!\\n\";\n\nThis third program fails to run because $& is tainted: it is the result of a match involving\n\"\\w\" while \"use locale\" is in effect.\n",
                "subsections": []
            },
            "ENVIRONMENT": {
                "content": "PERLSKIPLOCALEINIT\nThis environment variable, available starting in Perl v5.20, if set (to any\nvalue), tells Perl to not use the rest of the environment variables to initialize\nwith.  Instead, Perl uses whatever the current locale settings are.  This is\nparticularly useful in embedded environments, see \"Using embedded Perl with POSIX\nlocales\" in perlembed.\n\nPERLBADLANG\nA string that can suppress Perl's warning about failed locale settings at\nstartup.  Failure can occur if the locale support in the operating system is\nlacking (broken) in some way--or if you mistyped the name of a locale when you\nset up your environment.  If this environment variable is absent, or has a value\nother than \"0\" or \"\", Perl will complain about locale setting failures.\n\nNOTE: \"PERLBADLANG\" only gives you a way to hide the warning message.  The\nmessage tells about some problem in your system's locale support, and you should\ninvestigate what the problem is.\n\nDPKGRUNNINGVERSION\nOn Debian systems, if the DPKGRUNNINGVERSION environment variable is set (to\nany value), the locale failure warnings will be suppressed just like with a zero\nPERLBADLANG setting. This is done to avoid floods of spurious warnings during\nsystem upgrades.  See <http://bugs.debian.org/508764>.\n\nThe following environment variables are not specific to Perl: They are part of the\nstandardized (ISO C, XPG4, POSIX 1.c) \"setlocale()\" method for controlling an application's\nopinion on data.  Windows is non-POSIX, but Perl arranges for the following to work as\ndescribed anyway.  If the locale given by an environment variable is not valid, Perl tries\nthe next lower one in priority.  If none are valid, on Windows, the system default locale is\nthen tried.  If all else fails, the \"C\" locale is used.  
If even that doesn't work, something\nis badly broken, but Perl tries to forge ahead with whatever the locale settings might be.\n\n\"LC_ALL\"    \"LC_ALL\" is the \"override-all\" locale environment variable. If set, it overrides\nall the rest of the locale environment variables.\n\n\"LANGUAGE\"  NOTE: \"LANGUAGE\" is a GNU extension, it affects you only if you are using the GNU\nlibc.  This is the case if you are using e.g. Linux.  If you are using\n\"commercial\" Unixes you are most probably not using GNU libc and you can ignore\n\"LANGUAGE\".\n\nHowever, in the case you are using \"LANGUAGE\": it affects the language of\ninformational, warning, and error messages output by commands (in other words,\nit's like \"LC_MESSAGES\") but it has higher priority than \"LC_ALL\".  Moreover,\nit's not a single value but instead a \"path\" (\":\"-separated list) of languages\n(not locales).  See the GNU \"gettext\" library documentation for more information.\n\n\"LC_CTYPE\"  In the absence of \"LC_ALL\", \"LC_CTYPE\" chooses the character type locale.  In the\nabsence of both \"LC_ALL\" and \"LC_CTYPE\", \"LANG\" chooses the character type\nlocale.\n\n\"LC_COLLATE\"\nIn the absence of \"LC_ALL\", \"LC_COLLATE\" chooses the collation (sorting) locale.\nIn the absence of both \"LC_ALL\" and \"LC_COLLATE\", \"LANG\" chooses the collation\nlocale.\n\n\"LC_MONETARY\"\nIn the absence of \"LC_ALL\", \"LC_MONETARY\" chooses the monetary formatting locale.\nIn the absence of both \"LC_ALL\" and \"LC_MONETARY\", \"LANG\" chooses the monetary\nformatting locale.\n\n\"LC_NUMERIC\"\nIn the absence of \"LC_ALL\", \"LC_NUMERIC\" chooses the numeric format locale.  In\nthe absence of both \"LC_ALL\" and \"LC_NUMERIC\", \"LANG\" chooses the numeric format.\n\n\"LC_TIME\"   In the absence of \"LC_ALL\", \"LC_TIME\" chooses the date and time formatting\nlocale.  
In the absence of both \"LC_ALL\" and \"LC_TIME\", \"LANG\" chooses the date\nand time formatting locale.\n\n\"LANG\"      \"LANG\" is the \"catch-all\" locale environment variable. If it is set, it is used\nas the last resort after the overall \"LC_ALL\" and the category-specific \"LC_foo\".\n",
                "subsections": [
                    {
                        "name": "Examples",
                        "content": "The \"LCNUMERIC\" controls the numeric output:\n\nuse locale;\nuse POSIX qw(localeh); # Imports setlocale() and the LC constants.\nsetlocale(LCNUMERIC, \"frFR\") or die \"Pardon\";\nprintf \"%g\\n\", 1.23; # If the \"frFR\" succeeded, probably shows 1,23.\n\nand also how strings are parsed by \"POSIX::strtod()\" as numbers:\n\nuse locale;\nuse POSIX qw(localeh strtod);\nsetlocale(LCNUMERIC, \"deDE\") or die \"Entschuldigung\";\nmy $x = strtod(\"2,34\") + 5;\nprint $x, \"\\n\"; # Probably shows 7,34.\n"
                    }
                ]
            },
            "NOTES": {
                "content": "String \"eval\" and \"LCNUMERIC\"\nA string eval parses its expression as standard Perl.  It is therefore expecting the decimal\npoint to be a dot.  If \"LCNUMERIC\" is set to have this be a comma instead, the parsing will\nbe confused, perhaps silently.\n\nuse locale;\nuse POSIX qw(localeh);\nsetlocale(LCNUMERIC, \"frFR\") or die \"Pardon\";\nmy $a = 1.2;\nprint eval \"$a + 1.5\";\nprint \"\\n\";\n\nprints \"13,5\".  This is because in that locale, the comma is the decimal point character.\nThe \"eval\" thus expands to:\n\neval \"1,2 + 1.5\"\n\nand the result is not what you likely expected.  No warnings are generated.  If you do string\n\"eval\"'s within the scope of \"use locale\", you should instead change the \"eval\" line to do\nsomething like:\n\nprint eval \"no locale; $a + 1.5\";\n\nThis prints 2.7.\n\nYou could also exclude \"LCNUMERIC\", if you don't need it, by\n\nuse locale ':!numeric';\n",
                "subsections": [
                    {
                        "name": "Backward compatibility",
                        "content": "Versions of Perl prior to 5.004 mostly ignored locale information, generally behaving as if\nsomething similar to the \"C\" locale were always in force, even if the program environment\nsuggested otherwise (see \"The setlocale function\").  By default, Perl still behaves this way\nfor backward compatibility.  If you want a Perl application to pay attention to locale\ninformation, you must use the \"use locale\" pragma (see \"The \"use locale\" pragma\") or, in the\nunlikely event that you want to do so for just pattern matching, the \"/l\" regular expression\nmodifier (see \"Character set modifiers\" in perlre) to instruct it to do so.\n\nVersions of Perl from 5.002 to 5.003 did use the \"LCCTYPE\" information if available; that\nis, \"\\w\" did understand what were the letters according to the locale environment variables.\nThe problem was that the user had no control over the feature: if the C library supported\nlocales, Perl used them.\n"
                    },
                    {
                        "name": "I18N:Collate obsolete",
                        "content": "In versions of Perl prior to 5.004, per-locale collation was possible using the\n\"I18N::Collate\" library module.  This module is now mildly obsolete and should be avoided in\nnew applications.  The \"LCCOLLATE\" functionality is now integrated into the Perl core\nlanguage: One can use locale-specific scalar data completely normally with \"use locale\", so\nthere is no longer any need to juggle with the scalar references of \"I18N::Collate\".\n"
                    },
                    {
                        "name": "Sort speed and memory use impacts",
                        "content": "Comparing and sorting by locale is usually slower than the default sorting; slow-downs of two\nto four times have been observed.  It will also consume more memory: once a Perl scalar\nvariable has participated in any string comparison or sorting operation obeying the locale\ncollation rules, it will take 3-15 times more memory than before.  (The exact multiplier\ndepends on the string's contents, the operating system and the locale.) These downsides are\ndictated more by the operating system's implementation of the locale system than by Perl.\n"
                    },
                    {
                        "name": "Freely available locale definitions",
                        "content": "The Unicode CLDR project extracts the POSIX portion of many of its locales, available at\n\nhttps://unicode.org/Public/cldr/2.0.1/\n\n(Newer versions of CLDR require you to compute the POSIX data yourself.  See\n<http://unicode.org/Public/cldr/latest/>.)\n\nThere is a large collection of locale definitions at:\n\nhttp://std.dkuug.dk/i18n/WG15-collection/locales/\n\nYou should be aware that it is unsupported, and is not claimed to be fit for any purpose.  If\nyour system allows installation of arbitrary locales, you may find the definitions useful as\nthey are, or as a basis for the development of your own locales.\n"
                    },
                    {
                        "name": "I18n and l10n",
                        "content": "\"Internationalization\" is often abbreviated as i18n because its first and last letters are\nseparated by eighteen others.  (You may guess why the internalin ... internaliti ... i18n\ntends to get abbreviated.)  In the same way, \"localization\" is often abbreviated to l10n.\n"
                    },
                    {
                        "name": "An imperfect standard",
                        "content": "Internationalization, as defined in the C and POSIX standards, can be criticized as\nincomplete and ungainly.  They also have a tendency, like standards groups, to divide the\nworld into nations, when we all know that the world can equally well be divided into bankers,\nbikers, gamers, and so on.\n"
                    },
                    {
                        "name": "Unicode and UTF-8",
                        "content": "The support of Unicode is new starting from Perl version v5.6, and more fully implemented in\nversions v5.8 and later.  See perluniintro.\n\nStarting in Perl v5.20, UTF-8 locales are supported in Perl, except \"LCCOLLATE\" is only\npartially supported; collation support is improved in Perl v5.26 to a level that may be\nsufficient for your needs (see \"Category \"LCCOLLATE\": Collation: Text Comparisons and\nSorting\").\n\nIf you have Perl v5.16 or v5.18 and can't upgrade, you can use\n\nuse locale ':notcharacters';\n\nWhen this form of the pragma is used, only the non-character portions of locales are used by\nPerl, for example \"LCNUMERIC\".  Perl assumes that you have translated all the characters it\nis to operate on into Unicode (actually the platform's native character set (ASCII or EBCDIC)\nplus Unicode).  For data in files, this can conveniently be done by also specifying\n\nuse open ':locale';\n\nThis pragma arranges for all inputs from files to be translated into Unicode from the current\nlocale as specified in the environment (see \"ENVIRONMENT\"), and all outputs to files to be\ntranslated back into the locale.  (See open).  On a per-filehandle basis, you can instead use\nthe PerlIO::locale module, or the Encode::Locale module, both available from CPAN.  The\nlatter module also has methods to ease the handling of \"ARGV\" and environment variables, and\ncan be used on individual strings.  If you know that all your locales will be UTF-8, as many\nare these days, you can use the -C command line switch.\n\nThis form of the pragma allows essentially seamless handling of locales with Unicode.  The\ncollation order will be by Unicode code point order.  
Unicode::Collate can be used to get\nUnicode rules collation.\n\nAll the modules and switches just described can be used in v5.20 with just plain \"use\nlocale\", and, should the input locales not be UTF-8, you'll get the less than ideal behavior,\ndescribed below, that you get with pre-v5.16 Perls, or when you use the locale pragma without\nthe \":not_characters\" parameter in v5.16 and v5.18.  If you are using exclusively UTF-8\nlocales in v5.20 and higher, the rest of this section does not apply to you.\n\nThere are two cases, multi-byte and single-byte locales.  First multi-byte:\n\nThe only multi-byte (or wide character) locale that Perl is ever likely to support is UTF-8.\nThis is due to the difficulty of implementation, the fact that high quality UTF-8 locales are\nnow published for every area of the world (<https://unicode.org/Public/cldr/2.0.1/> for ones\nthat are already set-up, but from an earlier version;\n<https://unicode.org/Public/cldr/latest/> for the most up-to-date, but you have to extract\nthe POSIX information yourself), and that failing all that you can use the Encode module to\ntranslate to/from your locale.  So, you'll have to do one of those things if you're using one\nof these locales, such as Big5 or Shift JIS.  For UTF-8 locales, in Perls (pre v5.20) that\ndon't have full UTF-8 locale support, they may work reasonably well (depending on your C\nlibrary implementation) simply because both they and Perl store characters that take up\nmultiple bytes the same way.  However, some, if not most, C library implementations may not\nprocess the characters in the upper half of the Latin-1 range (128 - 255) properly under\n\"LC_CTYPE\".  To see if a character is a particular type under a locale, Perl uses the\nfunctions like \"isalnum()\".  Your C library may not work for UTF-8 locales with those\nfunctions, instead only working under the newer wide library functions like \"iswalnum()\",\nwhich Perl does not use.  
These multi-byte locales are treated like single-byte locales, and\nwill have the restrictions described below.  Starting in Perl v5.22 a warning message is\nraised when Perl detects a multi-byte locale that it doesn't fully support.\n\nFor single-byte locales, Perl generally takes the tack to use locale rules on code points\nthat can fit in a single byte, and Unicode rules for those that can't (though this isn't\nuniformly applied, see the note at the end of this section).  This prevents many problems in\nlocales that aren't UTF-8.  Suppose the locale is ISO8859-7, Greek.  The character at 0xD7\nthere is a capital Chi. But in the ISO8859-1 locale, Latin1, it is a multiplication sign.\nThe POSIX regular expression character class \"[[:alpha:]]\" will magically match 0xD7 in the\nGreek locale but not in the Latin one.\n\nHowever, there are places where this breaks down.  Certain Perl constructs are for Unicode\nonly, such as \"\\p{Alpha}\".  They assume that 0xD7 always has its Unicode meaning (or the\nequivalent on EBCDIC platforms).  Since Latin1 is a subset of Unicode and 0xD7 is the\nmultiplication sign in both Latin1 and Unicode, \"\\p{Alpha}\" will never match it, regardless\nof locale.  A similar issue occurs with \"\\N{...}\".  Prior to v5.20, it is therefore a bad\nidea to use \"\\p{}\" or \"\\N{}\" under plain \"use locale\"--unless you can guarantee that the\nlocale will be ISO8859-1.  Use POSIX character classes instead.\n\nAnother problem with this approach is that operations that cross the single byte/multiple\nbyte boundary are not well-defined, and so are disallowed.  (This boundary is between the\ncodepoints at 255/256.)  For example, lower casing LATIN CAPITAL LETTER Y WITH DIAERESIS\n(U+0178) should return LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF).  But in the Greek\nlocale, for example, there is no character at 0xFF, and Perl has no way of knowing what the\ncharacter at 0xFF is really supposed to represent.  Thus it disallows the operation.  
In this\nmode, the lowercase of U+0178 is itself.\n\nThe same problems ensue if you enable automatic UTF-8-ification of your standard file\nhandles, default \"open()\" layer, and @ARGV on non-ISO8859-1, non-UTF-8 locales (by using\neither the -C command line switch or the \"PERL_UNICODE\" environment variable; see perlrun).\nThings are read in as UTF-8, which would normally imply a Unicode interpretation, but the\npresence of a locale causes them to be interpreted in that locale instead.  For example, a\n0xD7 code point in the Unicode input, which should mean the multiplication sign, won't be\ninterpreted by Perl that way under the Greek locale.  This is not a problem provided you make\ncertain that all locales will always and only be either an ISO8859-1, or, if you don't have a\ndeficient C library, a UTF-8 locale.\n\nStill another problem is that this approach can lead to two code points meaning the same\ncharacter.  Thus in a Greek locale, both U+03A7 and U+00D7 are GREEK CAPITAL LETTER CHI.\n\nBecause of all these problems, starting in v5.22, Perl will raise a warning if a multi-byte\n(hence Unicode) code point is used when a single-byte locale is in effect.  (Although it\ndoesn't check for this if doing so would unreasonably slow execution down.)\n\nVendor locales are notoriously buggy, and it is difficult for Perl to test its locale-\nhandling code because this interacts with code that Perl has no control over; therefore the\nlocale-handling code in Perl may be buggy as well.  (However, the Unicode-supplied locales\nshould be better, and there is a feedback mechanism to correct any problems.  See \"Freely\navailable locale definitions\".)\n\nIf you have Perl v5.16, the problems mentioned above go away if you use the \":not_characters\"\nparameter to the locale pragma (except for vendor bugs in the non-character portions).  
If\nyou don't have v5.16, and you do have locales that work, using them may be worthwhile for\ncertain specific purposes, as long as you keep in mind the gotchas already mentioned.  For\nexample, if the collation for your locales works, it runs faster under locales than under\nUnicode::Collate; and you gain access to such things as the local currency symbol and the\nnames of the months and days of the week.  (But to hammer home the point, in v5.16, you get\nthis access without the downsides of locales by using the \":not_characters\" form of the\npragma.)\n\nNote: The policy of using locale rules for code points that can fit in a byte, and Unicode\nrules for those that can't is not uniformly applied.  Pre-v5.12, it was somewhat haphazard;\nin v5.12 it was applied fairly consistently to regular expression matching except for\nbracketed character classes; in v5.14 it was extended to all regex matches; and in v5.16 to\nthe casing operations such as \"\\L\" and \"uc()\".  For collation, in all releases so far, the\nsystem's \"strxfrm()\" function is called, and whatever it does is what you get.  Starting in\nv5.26, various bugs are fixed with the way perl uses this function.\n"
                    }
                ]
            },
            "BUGS": {
                "content": "",
                "subsections": [
                    {
                        "name": "Collation of strings containing embedded \"NUL\" characters",
                        "content": "\"NUL\" characters will sort the same as the lowest collating control character does, or to\n\"\\001\" in the unlikely event that there are no control characters at all in the locale.  In\ncases where the strings don't contain this non-\"NUL\" control, the results will be correct,\nand in many locales, this control, whatever it might be, will rarely be encountered.  But\nthere are cases where a \"NUL\" should sort before this control, but doesn't.  If two strings\ndo collate identically, the one containing the \"NUL\" will sort to earlier.  Prior to 5.26,\nthere were more bugs.\n"
                    },
                    {
                        "name": "Multi-threaded",
                        "content": "XS code or C-language libraries called from it that use the system setlocale(3) function\n(except on Windows) likely will not work from a multi-threaded application without changes.\nSee \"Locale-aware XS code\" in perlxs.\n\nAn XS module that is locale-dependent could have been written under the assumption that it\nwill never be called in a multi-threaded environment, and so uses other non-locale constructs\nthat aren't multi-thread-safe.  See \"Thread-aware system interfaces\" in perlxs.\n\nPOSIX does not define a way to get the name of the current per-thread locale.  Some systems,\nsuch as Darwin and NetBSD do implement a function, querylocale(3) to do this.  On non-Windows\nsystems without it, such as Linux, there are some additional caveats:\n\n•   An embedded perl needs to be started up while the global locale is in effect.  See \"Using\nembedded Perl with POSIX locales\" in perlembed.\n\n•   It becomes more important for perl to know about all the possible locale categories on\nthe platform, even if they aren't apparently used in your program.  Perl knows all of the\nLinux ones.  If your platform has others, you can submit an issue at\n<https://github.com/Perl/perl5/issues> for inclusion of it in the next release.  In the\nmeantime, it is possible to edit the Perl source to teach it about the category, and then\nrecompile.  Search for instances of, say, \"LCPAPER\" in the source, and use that as a\ntemplate to add the omitted one.\n\n•   It is possible, though hard to do, to call \"POSIX::setlocale\" with a locale that it\ndoesn't recognize as syntactically legal, but actually is legal on that system.  This\nshould happen only with embedded perls, or if you hand-craft a locale name yourself.\n"
                    },
                    {
                        "name": "Broken systems",
                        "content": "In certain systems, the operating system's locale support is broken and cannot be fixed or\nused by Perl.  Such deficiencies can and will result in mysterious hangs and/or Perl core\ndumps when \"use locale\" is in effect.  When confronted with such a system, please report in\nexcruciating detail to <<https://github.com/Perl/perl5/issues>>, and also contact your\nvendor: bug fixes may exist for these problems in your operating system.  Sometimes such bug\nfixes are called an operating system upgrade.  If you have the source for Perl, include in\nthe bug report the output of the test described above in \"Testing for broken locales\".\n"
                    }
                ]
            },
            "SEE ALSO": {
                "content": "I18N::Langinfo, perluniintro, perlunicode, open, \"localeconv\" in POSIX, \"setlocale\" in POSIX,\n\"strcoll\" in POSIX, \"strftime\" in POSIX, \"strtod\" in POSIX, \"strxfrm\" in POSIX.\n\nFor special considerations when Perl is embedded in a C program, see \"Using embedded Perl\nwith POSIX locales\" in perlembed.\n",
                "subsections": []
            },
            "HISTORY": {
                "content": "Jarkko Hietaniemi's original perli18n.pod heavily hacked by Dominic Dunlop, assisted by the\nperl5-porters.  Prose worked over a bit by Tom Christiansen, and now maintained by Perl 5\nporters.\n\n\n\nperl v5.34.0                                 2025-07-25                                PERLLOCALE(1)",
                "subsections": []
            }
        }
    }
}