{
    "content": [
        {
            "type": "text",
            "text": "# Encode::Guess (perldoc)\n\n## NAME\n\nEncode::Guess -- Guesses encoding from data\n\n## SYNOPSIS\n\n# if you are sure $data won't contain anything bogus\nuse Encode;\nuse Encode::Guess qw/euc-jp shiftjis 7bit-jis/;\nmy $utf8 = decode(\"Guess\", $data);\nmy $data = encode(\"Guess\", $utf8);   # this doesn't work!\n# more elaborate way\nuse Encode::Guess;\nmy $enc = guess_encoding($data, qw/euc-jp shiftjis 7bit-jis/);\nref($enc) or die \"Can't guess: $enc\"; # trap error this way\n$utf8 = $enc->decode($data);\n# or\n$utf8 = decode($enc->name, $data)\n\n## DESCRIPTION\n\nBy default, it checks only ascii, utf8 and UTF-16/32 with BOM.\n\n## Sections\n\n- **NAME**\n- **SYNOPSIS**\n- **ABSTRACT**\n- **DESCRIPTION** (1 subsection)\n- **CAVEATS**\n- **TO DO**\n- **SEE ALSO**\n\nUse structuredContent.sections for detailed options, examples, and full documentation.\n"
        }
    ],
    "structuredContent": {
        "command": "Encode::Guess",
        "section": "",
        "mode": "perldoc",
        "summary": "Encode::Guess -- Guesses encoding from data",
        "synopsis": "# if you are sure $data won't contain anything bogus\nuse Encode;\nuse Encode::Guess qw/euc-jp shiftjis 7bit-jis/;\nmy $utf8 = decode(\"Guess\", $data);\nmy $data = encode(\"Guess\", $utf8);   # this doesn't work!\n# more elaborate way\nuse Encode::Guess;\nmy $enc = guess_encoding($data, qw/euc-jp shiftjis 7bit-jis/);\nref($enc) or die \"Can't guess: $enc\"; # trap error this way\n$utf8 = $enc->decode($data);\n# or\n$utf8 = decode($enc->name, $data)",
        "tldr_summary": null,
        "tldr_examples": [],
        "tldr_source": null,
        "flags": [],
        "examples": [],
        "see_also": [],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "SYNOPSIS",
                "lines": 15,
                "subsections": []
            },
            {
                "name": "ABSTRACT",
                "lines": 3,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 58,
                "subsections": [
                    {
                        "name": "guess_encoding",
                        "lines": 10
                    }
                ]
            },
            {
                "name": "CAVEATS",
                "lines": 42,
                "subsections": []
            },
            {
                "name": "TO DO",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "SEE ALSO",
                "lines": 2,
                "subsections": []
            }
        ],
        "sections": {
            "NAME": {
                "content": "Encode::Guess -- Guesses encoding from data\n",
                "subsections": []
            },
            "SYNOPSIS": {
                "content": "# if you are sure $data won't contain anything bogus\n\nuse Encode;\nuse Encode::Guess qw/euc-jp shiftjis 7bit-jis/;\nmy $utf8 = decode(\"Guess\", $data);\nmy $data = encode(\"Guess\", $utf8);   # this doesn't work!\n\n# more elaborate way\nuse Encode::Guess;\nmy $enc = guess_encoding($data, qw/euc-jp shiftjis 7bit-jis/);\nref($enc) or die \"Can't guess: $enc\"; # trap error this way\n$utf8 = $enc->decode($data);\n# or\n$utf8 = decode($enc->name, $data)\n",
                "subsections": []
            },
            "ABSTRACT": {
                "content": "Encode::Guess enables you to guess the encoding of a given piece of data, or at least tries\nto.\n",
                "subsections": []
            },
            "DESCRIPTION": {
                "content": "By default, it checks only ascii, utf8 and UTF-16/32 with BOM.\n\nuse Encode::Guess; # ascii/utf8/BOMed UTF\n\nTo use it more practically, you have to give the names of the encodings to check (called\n*suspects* below). Suspect names can be either canonical names or aliases.\n\nCAVEAT: Unlike UTF-(16|32), BOM in utf8 is NOT AUTOMATICALLY STRIPPED.\n\n# tries all major Japanese Encodings as well\nuse Encode::Guess qw/euc-jp shiftjis 7bit-jis/;\n\nIf the $Encode::Guess::NoUTFAutoGuess variable is set to a true value, no heuristics will be\napplied to UTF8/16/32, and the result will be limited to the suspects and \"ascii\".\n\nEncode::Guess->set_suspects\nYou can also change the internal suspects list via the \"set_suspects\" method.\n\nuse Encode::Guess;\nEncode::Guess->set_suspects(qw/euc-jp shiftjis 7bit-jis/);\n\nEncode::Guess->add_suspects\nOr you can use the \"add_suspects\" method. The difference is that \"set_suspects\" flushes the\ncurrent suspects list while \"add_suspects\" appends to it.\n\nuse Encode::Guess;\nEncode::Guess->add_suspects(qw/euc-jp shiftjis 7bit-jis/);\n# now the suspects are euc-jp,shiftjis,7bit-jis, AND\n# euc-kr,euc-cn, and big5-eten\nEncode::Guess->add_suspects(qw/euc-kr euc-cn big5-eten/);\n\nEncode::decode(\"Guess\" ...)\nWhen you are content with the suspects list, you can now do this:\n\nmy $utf8 = Encode::decode(\"Guess\", $data);\n\nEncode::Guess->guess($data)\nBut Encode::decode(\"Guess\", ...) will croak if:\n\n*   Two or more suspects remain\n\n*   No suspects are left\n\nSo you should instead try this:\n\nmy $decoder = Encode::Guess->guess($data);\n\nOn success, $decoder is an object that is documented in Encode::Encoding. So you can now do\nthis:\n\nmy $utf8 = $decoder->decode($data);\n\nOn failure, $decoder contains an error message instead, so the whole thing would be as follows:\n\nmy $decoder = Encode::Guess->guess($data);\ndie $decoder unless ref($decoder);\nmy $utf8 = $decoder->decode($data);\n",
                "subsections": [
                    {
                        "name": "guess_encoding",
                        "content": "You can also try the \"guess_encoding\" function, which is exported by default. It takes $data to\ncheck and, optionally, a list of suspects. The optional suspect list is *not\nreflected* in the internal suspects list.\n\nmy $decoder = guess_encoding($data, qw/euc-jp euc-kr euc-cn/);\ndie $decoder unless ref($decoder);\nmy $utf8 = $decoder->decode($data);\n# check only ascii, utf8 and UTF-(16|32) with BOM\nmy $decoder = guess_encoding($data);\n"
                    }
                ]
            },
            "CAVEATS": {
                "content": "*   Because of the algorithm used, the ISO-8859 series and other single-byte encodings do not work\nwell unless one of the ISO-8859 encodings is the only suspect (besides ascii and utf8).\n\nuse Encode::Guess;\n# perhaps ok\nmy $decoder = guess_encoding($data, 'latin1');\n# definitely NOT ok\nmy $decoder = guess_encoding($data, qw/latin1 greek/);\n\nThe reason is that Encode::Guess guesses the encoding by trial and error. It first splits $data\ninto lines and tries to decode each line with each suspect. It keeps going until all but\none encoding is eliminated from the suspects list. The ISO-8859 series is just too successful in\nmost cases (because it fills almost all code points in \\x00-\\xff).\n\n*   Do not mix national standard encodings and the corresponding vendor encodings.\n\n# a very bad idea\nmy $decoder\n= guess_encoding($data, qw/shiftjis MacJapanese cp932/);\n\nThe reason is that a vendor encoding is usually a superset of the national standard, so the guess becomes\ntoo ambiguous in most cases.\n\n*   On the other hand, mixing various national standard encodings automagically works unless\n$data is too short to allow for guessing.\n\n# This is ok if $data is long enough\nmy $decoder =\nguess_encoding($data, qw/euc-cn\neuc-jp shiftjis 7bit-jis\neuc-kr\nbig5-eten/);\n\n*   DO NOT PUT TOO MANY SUSPECTS! Don't you try something like this!\n\nmy $decoder = guess_encoding($data,\nEncode->encodings(\":all\"));\n\nIt is, after all, just a guess. You should always be explicit when it comes to encodings. But\nthere are some environments, especially Japanese ones, where guessing the encoding is a must. Use this module\nwith care.\n",
                "subsections": []
            },
            "TO DO": {
                "content": "Encode::Guess does not work on EBCDIC platforms.\n",
                "subsections": []
            },
            "SEE ALSO": {
                "content": "Encode, Encode::Encoding\n",
                "subsections": []
            }
        }
    }
}