Jcode(3pm)                     User Contributed Perl Documentation                     Jcode(3pm)

NAME
       Jcode - Japanese Charset Handler

SYNOPSIS
        use Jcode;
        #
        # traditional
        Jcode::convert(\$str, $ocode, $icode, "z");
        # or OOP!
        print Jcode->new($str)->h2z->tr($from, $to)->utf8;

DESCRIPTION
       Japanese documentation is available as Jcode::Nihongo.

       Jcode.pm supports both an object-oriented and a traditional approach.  With the
       object-oriented approach, you can write:

         $iso_2022_jp = Jcode->new($str)->h2z->jis;

       Which is more elegant than:

         $iso_2022_jp = $str;
         &jcode::convert(\$iso_2022_jp, 'jis', &jcode::getcode(\$str), "z");

       For those unfamiliar with objects, Jcode.pm still supports "getcode()" and "convert()."

       If the perl version is 5.8.1 or later, Jcode acts as a wrapper to Encode, the standard
       charset handler module for Perl 5.8 or later.

Methods
       All methods described here return the Jcode object unless otherwise noted.

       Constructors

       $j = Jcode->new($str [, $icode])
         Creates the Jcode object $j from $str.  The input code is detected automatically unless
         you explicitly set $icode.  For the available charsets, see getcode below.

         For perl 5.8.1 or better, $icode can be any encoding name that Encode understands.

           $j = Jcode->new($european, 'iso-latin1');

         When the object is stringified, it returns the EUC-converted string, so you can write
         "print $j" instead of "print $j->euc".

         Passing Reference
           Instead of a scalar value, you can pass a reference:

           Jcode->new(\$str);

           This saves a little time, at the cost of the value of $str being converted.  (In a
           way, $str is now "tied" to the Jcode object.)
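         As a minimal sketch of the constructor and of stringification, the following builds an
         object from UTF-8 bytes with an explicit $icode and retrieves two other encodings.  The
         sample character and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Jcode;

# UTF-8 bytes for HIRAGANA LETTER A (U+3042). The input encoding is given
# explicitly, so no automatic detection is needed.
my $utf8_bytes = "\xE3\x81\x82";
my $j = Jcode->new($utf8_bytes, 'utf8');

my $euc  = $j->euc;     # EUC-JP bytes
my $sjis = $j->sjis;    # Shift_JIS bytes

# Stringification is documented to yield the EUC-JP form.
my $stringified = "$j";

printf "euc=%s sjis=%s\n", unpack('H*', $euc), unpack('H*', $sjis);
print "stringified form matches euc\n" if $stringified eq $euc;
```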

       $j->set($str [, $icode])
         Sets $j's internal string to $str.  Handy when you use a Jcode object repeatedly
         (saves the time and memory of creating a new object).

          # converts mailbox to SJIS format
          my $jconv = new Jcode;
          $/ = 00;
          while(<>){
              print $jconv->set(\$_)->mime_decode->sjis;
          }

       $j->append($str [, $icode]);
         Appends $str to $j's internal string.

       $j = jcode($str [, $icode]);
         Shortcut for Jcode->new(), so you can write, for example, jcode($str)->sjis.

       Encoded Strings

       In general, you can retrieve the encoded string as $j->encoded, where encoded is the
       name of the target encoding.

       $sjis = jcode($str)->sjis
       $euc = $j->euc
       $jis = $j->jis
       $sjis = $j->sjis
       $ucs2 = $j->ucs2
       $utf8 = $j->utf8
         What you code is what you get :)

       $iso_2022_jp = $j->iso_2022_jp
         Same as "$j->h2z->jis".  Hankaku Kanas are forcibly converted to Zenkaku.

         For perl 5.8.1 and better, you can also use any encoding names and aliases that Encode
         supports.  For example:

           $european = $j->iso_latin1; # replace '-' with '_' for names.

         FYI: Encode::Encoder uses a similar trick.

         $j->fallback($fallback)
           For perl 5.8.1 or better, Jcode stores the internal string in UTF-8.  Any character
           that does not map to ->encoding is replaced with a '?', which is the Encode standard.

             my $unistr = "\x{262f}"; # YIN YANG
             my $j = jcode($unistr);  # $j->euc is '?'

           You can change this behavior by specifying a fallback, as in Encode.  The values are
           the same as those of Encode.  "Jcode::FB_PERLQQ", "Jcode::FB_XMLCREF", and
           "Jcode::FB_HTMLCREF" are aliased to those of Encode for convenience.

             print $j->fallback(Jcode::FB_PERLQQ)->euc;   # '\x{262f}'
             print $j->fallback(Jcode::FB_XMLCREF)->euc;  # '&#x262f;'
             print $j->fallback(Jcode::FB_HTMLCREF)->euc; # '&#9775;'

           The global variable $Jcode::FALLBACK stores the default fallback so you can override
           that by assigning the value.

             $Jcode::FALLBACK = Jcode::FB_PERLQQ; # set default fallback scheme

       [@lines =] $jcode->jfold([$width, $newline_str, $kref])
         Folds lines in the Jcode string every $width columns (default: 72), joining them with
         the newline string specified by $newline_str (default: "\n").  $width is counted in
         "halfwidth" characters; fullwidth characters count as two.

         Rudimentary kinsoku support is available for Perl 5.8.1 and better.

       $length = $jcode->jlength();
         returns character length properly, rather than byte length.
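         A small sketch of jfold() and jlength(), assuming Perl 5.8.1 or better.  The sample
         strings and the width of 4 are illustrative; jfold() is called in list context to
         obtain the individual lines.

```perl
use strict;
use warnings;
use Jcode;

# Ten halfwidth (ASCII) characters, folded every four columns.
my $text  = "abcdefghij";
my @lines = jcode($text)->jfold(4);
print "$_\n" for @lines;

# Three fullwidth hiragana given as UTF-8 bytes: nine bytes, three characters.
my $hiragana = "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86";
my $chars    = jcode($hiragana, 'utf8')->jlength;
print length($hiragana), " bytes, $chars characters\n";
```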

       Methods that use MIME::Base64

       To use methods below, you need MIME::Base64.  To install, simply

          perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'

       If your perl is 5.6 or better, there is no need since MIME::Base64 is bundled.

       $mime_header = $j->mime_encode([$lf, $bpl])
         Converts $str to a MIME header as documented in RFC1522.  When $lf is specified, it is
         used to fold lines (default: \n).  When $bpl is specified, it is used as the number of
         bytes per line (default: 76; this number must not exceed 76).

         For Perl 5.8.1 or better, you can also encode MIME Header as:

           $mime_header = $j->MIME_Header;

         In which case the resulting $mime_header is MIME-B-encoded UTF-8, whereas
         "$j->mime_encode()" returns MIME-B-encoded ISO-2022-JP.  Most modern MUAs support both.

       $j->mime_decode;
         Decodes MIME-Header in Jcode object.  For perl 5.8.1 or better, you can also do the same
         as:

           Jcode->new($str, 'MIME-Header')
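         The following sketch encodes a Japanese subject line with mime_encode() and decodes it
         again with mime_decode().  The subject text and the variable names are illustrative,
         and the default $lf and $bpl are used.

```perl
use strict;
use warnings;
use Jcode;

# "Nihongo" written in kanji, given as UTF-8 bytes.
my $subject_utf8 = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# Encode for use in a mail header (MIME-B-encoded ISO-2022-JP, per the text above).
my $header = jcode($subject_utf8, 'utf8')->mime_encode;
print "Subject: $header\n";

# Decode the header again and retrieve the result as UTF-8 bytes.
my $decoded = jcode($header)->mime_decode->utf8;
print "round trip ok\n" if $decoded eq $subject_utf8;
```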

       Hankaku vs. Zenkaku

       $j->h2z([$keep_dakuten])
         Converts X201 kana (Hankaku) to X208 kana (Zenkaku).  When $keep_dakuten is set, dakuten
         are left as they are (that is, "ka + dakuten" is left as is instead of being converted
         to "ga").

         You can retrieve the number of matches via $j->nmatch;

       $j->z2h
         Converts X208 kana (Zenkaku) to X201 kana (Hankaku).

         You can retrieve the number of matches via $j->nmatch;
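         As an illustration of the two conversions, the sketch below turns a halfwidth "ka"
         followed by a halfwidth dakuten into a single fullwidth "ga" and back.  The byte strings
         are UTF-8 and are spelled out in escapes so the example does not depend on the encoding
         of the source file; the variable names are illustrative.

```perl
use strict;
use warnings;
use Jcode;

# Halfwidth KA (U+FF76) followed by the halfwidth voiced sound mark (U+FF9E),
# given as UTF-8 bytes.
my $hankaku = "\xEF\xBD\xB6\xEF\xBE\x9E";

# h2z modifies the object and returns it, so nmatch can be read afterwards.
my $j       = jcode($hankaku, 'utf8');
my $zenkaku = $j->h2z->utf8;
my $matches = $j->nmatch;
printf "h2z: %s (%d match(es))\n", unpack('H*', $zenkaku), $matches;

# Converting back with z2h.
my $back = jcode($zenkaku, 'utf8')->z2h->utf8;
print "z2h restored the halfwidth form\n" if $back eq $hankaku;
```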

       Regexp emulators

       To use "->m()" and "->s()", you need perl 5.8.1 or better.

       $j->tr($from, $to, $opt);
         Applies "tr/$from/$to/" on Jcode object where $from and $to are EUC-JP strings.  On perl
         5.8.1 or better, $from and $to can also be flagged UTF-8 strings.

         If $opt is set, "tr/$from/$to/$opt" is applied.  $opt must be 'c', 'd' or the combina-
         tion thereof.

         You can retrieve the number of matches via $j->nmatch;

         The following methods are available only for perl 5.8.1 or better.

       $j->s($pattern, $replace, $opt);
         Applies "s/$pattern/$replace/$opt".  $pattern and $replace must be in EUC-JP or flagged
         UTF-8.  $opt is the same as the regexp options; see perlre for details.

         Like "$j->tr()", "$j->s()" returns the object itself, so you can chain the operations
         as follows:

           $j->tr("a-z", "A-Z")->s("foo", "bar");

       [@match = ] $j->m($pattern, $opt);
         Applies "m/$pattern/$opt".  Note that this method DOES NOT RETURN AN OBJECT, so you
         cannot chain it the way you can "$j->s()".
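         A short sketch of the three regexp emulators on plain ASCII input, assuming Perl 5.8.1
         or better.  The sample strings are illustrative.  Since m() returns a list rather than
         an object, it comes last and is not chained.

```perl
use strict;
use warnings;
use Jcode;

# tr() and s() both return the object, so they can be chained.
my $j      = jcode("foo fighters");
my $result = $j->tr("a-z", "A-Z")->s("FOO", "BAR")->euc;
print "$result\n";

# m() returns the captures instead of the object.
my @parts = jcode("2005-02-19")->m("([0-9]+)-([0-9]+)-([0-9]+)");
print scalar(@parts), " capture(s)\n";
```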

       Instance Variables

       If you need to access the instance variables of a Jcode object, use the access methods
       below instead of accessing them directly (that's what OOP is all about).

       FYI, Jcode uses a reference to an array instead of a reference to a hash (the common way)
       to optimize speed.  You don't actually need to know this as long as you use the access
       methods; once again, that's OOP.

       $j->r_str
         Reference to the EUC-coded String.

       $j->icode
         Input character code of the most recent operation.

       $j->nmatch
         Number of matches (Used in $j->tr, etc.)

Subroutines
       ($code, [$nmatch]) = getcode($str)
         Returns the character code of $str.  The return codes are as follows:

          ascii   Ascii (Contains no Japanese Code)
          binary  Binary (Not Text File)
          euc     EUC-JP
          sjis    SHIFT_JIS
          jis     JIS (ISO-2022-JP)
          ucs2    UCS2 (Raw Unicode)
          utf8    UTF8

         When called in array context instead of scalar context, it also returns how many
         character codes were found.  As mentioned above, $str can be \$str instead.

         jcode.pl Users:  This function is 100% upward-compatible with jcode::getcode() -- well,
         almost:

          * When its return value is an array, the order is the opposite;
            jcode::getcode() returns $nmatch first.

          * jcode::getcode() returns 'undef' when the number of EUC characters
            is equal to that of SJIS.  Jcode::getcode() returns EUC; for
            Jcode.pm there are no in-betweens.
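         A brief sketch of getcode() in scalar and in list context.  The function is called by
         its fully qualified name here so the example does not rely on what Jcode exports; the
         sample strings are illustrative, and the second one is a short run of EUC-JP bytes whose
         detected code is simply printed.

```perl
use strict;
use warnings;
use Jcode;

# Scalar context: only the code name is returned.
my $plain = "plain text";
my $code  = Jcode::getcode($plain);
print "scalar context: $code\n";

# List context, passing a reference: the code name comes first, then the count.
my ($list_code, $nmatch) = Jcode::getcode(\$plain);
print "list context: $list_code\n";

# Three kanji in EUC-JP; automatic detection is a heuristic, so the
# result is printed rather than assumed.
my $euc_bytes = "\xC6\xFC\xCB\xDC\xB8\xEC";
my $guess     = Jcode::getcode($euc_bytes);
print "guess for the EUC-JP sample: ", (defined $guess ? $guess : 'undef'), "\n";
```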

       Jcode::convert($str, [$ocode, $icode, $opt])
         Converts $str to the character code specified by $ocode.  When $icode is also specified,
         it is assumed as the encoding of the input string instead of the one detected by
         getcode().  As mentioned above, $str can be \$str instead.

         jcode.pl Users:  This function is 100% upward-compatible with jcode::convert()!
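         A minimal sketch of the traditional interface: the string is passed by reference and
         converted in place from EUC-JP to UTF-8, with $icode given explicitly.  The sample
         character and the variable name are illustrative.

```perl
use strict;
use warnings;
use Jcode;

# HIRAGANA LETTER A (U+3042) as EUC-JP bytes.
my $str = "\xA4\xA2";

# Passing a reference converts $str itself; 'euc' is the input code and
# 'utf8' the output code, using the names from the getcode() table.
Jcode::convert(\$str, 'utf8', 'euc');

printf "converted bytes: %s\n", unpack('H*', $str);
```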

BUGS
       For perl 5.8.1 or later, Jcode acts as a wrapper to Encode, meaning Jcode is subject to
       bugs therein.

ACKNOWLEDGEMENTS
       This package owes a lot in motivation, design, and code, to the jcode.pl for Perl4 by
       Kazumasa Utashiro <utashiro AT iij.jp>.

       Hiroki Ohzaki <ohzaki AT iod.jp> has helped me polish regexp from the very first
       stage of development.

       JEncode by makamaka AT donzoko.net has inspired me to integrate Encode to Jcode.  He has also
       contributed Japanese POD.

       And folks at Jcode Mailing list <jcode5 AT ring.jp>.  Without them, I couldn't have coded
       this far.

SEE ALSO
       Encode

       Jcode::Nihongo

       <http://www.iana.org/assignments/character-sets>

COPYRIGHT
       Copyright 1999-2005 Dan Kogai <dankogai AT dan.jp>

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.

perl v5.8.8                                 2005-02-19                                 Jcode(3pm)
