phpman > man > NKF(3pm)

Che Dong
NAME
SYNOPSIS
DESCRIPTION
    Nkf is a yet another kanji code converter among networks, hosts and terminals. It converts input
    kanji code to designated kanji code such as ISO-2022-JP, Shift_JIS, EUC-JP, UTF-8, UTF-16 or
    UTF-32.

    One of the most unique faculty of nkf is the guess of the input kanji encodings. It currently
    recognizes ISO-2022-JP, Shift_JIS, EUC-JP, UTF-8, UTF-16 and UTF-32. So users needn't set the
    input kanji code explicitly.

    By default, X0201 kana is converted into X0208 kana. For X0201 kana, SO/SI, SSO and ESC-(-I
    methods are supported. For automatic code detection, nkf assumes no X0201 kana in Shift_JIS. To
    accept X0201 in Shift_JIS, use -X, -x or -S.

    multiple options are specifed as seprate strings, such as

      print nkf('--ic=UTF8-MAC', '-w', $string), "\n";

    except the last arguments.

OPTIONS
    -J -S -E -W -W16 -W32 -j -s -e -w -w16 -w32
        Specify input and output encodings. Upper case is input. cf. --ic and --oc.

        -J  ISO-2022-JP (JIS code).

        -S  Shift_JIS and JIS X 0201 kana. EUC-JP is recognized as X0201 kana. Without -x flag, JIS
            X 0201 Katakana (a.k.a.halfwidth kana) is converted into JIS X 0208. If you use Windows,
            see Windows-31J (CP932).

        -E  EUC-JP.

        -W  UTF-8N.

        -W16[BL][0]
            UTF-16. B or L gives whether Big Endian or Little Endian. 0 gives whther put BOM or not.

        -W32[BL][0]
            UTF-32. B or L gives whether Big Endian or Little Endian. 0 gives whther put BOM or not.

    -b -u
        Output is buffered (DEFAULT), Output is unbuffered.

    -t  No conversion.

    -i[@B]
        Specify the escape sequence for JIS X 0208.

        -i@ Use ESC ( @. (JIS X 0208-1978)

        -iB Use ESC ( B. (JIS X 0208-1983/1990 DEFAULT)

    -o[BJ]
        Specify the escape sequence for US-ASCII/JIS X 0201 Roman. (DEFAULT B)

    -r  {de/en}crypt ROT13/47

    -h[123] --hiragana --katakana --katakana-hiragana

        -h1 --hiragana
            Katakana to Hiragana conversion.

        -h2 --katakana
            Hiragana to Katakana conversion.

        -h3 --katakana-hiragana
            Katakana to Hiragana and Hiragana to Katakana conversion.

    -T  Text mode output (MS-DOS)

    -f[*m* [- *n*]]
        Folding on *m* length with *n* margin in a line. Without this option, fold length is 60 and
        fold margin is 10.

    -F  New line preserving line folding.

    -Z[0-3]
        Convert X0208 alphabet (Fullwidth Alphabets) to ASCII.

        -Z -Z0
            Convert X0208 alphabet to ASCII.

        -Z1 Convert X0208 kankaku to single ASCII space.

        -Z2 Convert X0208 kankaku to double ASCII spaces.

        -Z3 Replacing fullwidth >, <, ", & into '&gt;', '&lt;', '&quot;', '&amp;' as in HTML.

    -X -x
        With -X or without this option, X0201 is converted into X0208 Kana. With -x, try to preserve
        X0208 kana and do not convert X0201 kana to X0208. In JIS output, ESC-(-I is used. In EUC
        output, SS2 is used.

    -B[0-2]
        Assume broken JIS-Kanji input, which lost ESC. Useful when your site is using old B-News
        Nihongo patch.

        -B1 allows any chars after ESC-( or ESC-$.

        -B2 force ASCII after NL.

    -I  Replacing non iso-2022-jp char into a geta character (substitute character in Japanese).

    -m[BQN0]
        MIME ISO-2022-JP/ISO8859-1 decode. (DEFAULT) To see ISO8859-1 (Latin-1) -l is necessary.

        -mB Decode MIME base64 encoded stream. Remove header or other part before conversion.

        -mQ Decode MIME quoted stream. '_' in quoted stream is converted to space.

        -mN Non-strict decoding. It allows line break in the middle of the base64 encoding.

        -m0 No MIME decode.

    -M  MIME encode. Header style. All ASCII code and control characters are intact.

        -MB MIME encode Base64 stream. Kanji conversion is performed before encoding, so this cannot
            be used as a picture encoder.

        -MQ Perform quoted encoding.

    -l  Input and output code is ISO8859-1 (Latin-1) and ISO-2022-JP. -s, -e and -x are not
        compatible with this option.

    -L[uwm] -d -c
        Convert line breaks.

        -Lu -d
            unix (LF)

        -Lw -c
            windows (CRLF)

        -Lm mac (CR)

            Without this option, nkf doesn't convert line breaks.

    --fj --unix --mac --msdos --windows
        Convert for these systems.

    --jis --euc --sjis --mime --base64
        Convert to named code.

    --jis-input --euc-input --sjis-input --mime-input --base64-input
        Assume input system

    --ic=*input codeset* --oc=*output codeset*
        Set the input or output codeset. NKF supports following codesets and those codeset names are
        case insensitive.

        ISO-2022-JP
            a.k.a. RFC1468, 7bit JIS, JUNET

        EUC-JP (eucJP-nkf)
            a.k.a. AT&T JIS, Japanese EUC, UJIS

        eucJP-ascii
        eucJP-ms
        CP51932
            Microsoft Version of EUC-JP.

        Shift_JIS
            a.k.a. SJIS, MS_Kanji

        Windows-31J
            a.k.a. CP932

        UTF-8
            same as UTF-8N

        UTF-8N
            UTF-8 without BOM

        UTF-8-BOM
            UTF-8 with BOM

        UTF8-MAC (input only)
            decomposed UTF-8

        UTF-16
            same as UTF-16BE

        UTF-16BE
            UTF-16 Big Endian without BOM

        UTF-16BE-BOM
            UTF-16 Big Endian with BOM

        UTF-16LE
            UTF-16 Little Endian without BOM

        UTF-16LE-BOM
            UTF-16 Little Endian with BOM

        UTF-32
            same as UTF-32BE

        UTF-32BE
            UTF-32 Big Endian without BOM

        UTF-32BE-BOM
            UTF-32 Big Endian with BOM

        UTF-32LE
            UTF-32 Little Endian without BOM

        UTF-32LE-BOM
            UTF-32 Little Endian with BOM

    --fb-{skip, html, xml, perl, java, subchar}
        Specify the way that nkf handles unassigned characters. Without this option, --fb-skip is
        assumed.

    --prefix=*escape character**target character*..
        When nkf converts to Shift_JIS, nkf adds a specified escape character to specified 2nd byte
        of Shift_JIS characters. 1st byte of argument is the escape character and following bytes
        are target characters.

    --no-cp932ext
        Handle the characters extended in CP932 as unassigned characters.

    --no-best-fit-chars
        When Unicode to Encoded byte conversion, don't convert characters which is not round trip
        safe. When Unicode to Unicode conversion, with this and -x option, nkf can be used as UTF
        converter. (In other words, without this and -x option, nkf doesn't save some characters)

        When nkf converts strings that related to path, you should use this opion.

    --cap-input
        Decode hex encoded characters.

    --url-input
        Unescape percent escaped characters.

    --numchar-input
        Decode character reference, such as "&#....;".

    --  Ignore rest of -option.

AUTHOR
    Copyright (c) 1987, Fujitsu LTD. (Itaru ICHIKAWA).

    Copyright (c) 1996-2018, The nkf Project.
Generated by phpman v3.7.12 Author: Che Dong Under GNU General Public License
2026-06-13 21:12 @216.73.216.215
CrawledBy Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)