pcrebuild

PCRE(3)                                                                PCRE(3)



NAME
       PCRE - Perl-compatible regular expressions

PCRE BUILD-TIME OPTIONS

       This document describes the optional features of PCRE that can be selected when the
       library is compiled. They are all selected, or deselected, by providing options  to
       the  configure  script  that  is  run before the make command. The complete list of
       options for configure (which includes the standard ones such as  the  selection  of
       the installation directory) can be obtained by running

         ./configure --help

       The following sections describe certain options whose names begin with --enable or
       --disable. These settings specify changes to the defaults for the configure
       command. Because of the way that configure works, --enable and --disable always
       come in pairs, so the complementary option always exists as well, but as it
       specifies the default, it is not described.
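
       As a rough sketch of how such options are passed, the fragment below collects two
       of them in a variable and hands them to configure. The variable name
       PCRE_CONFIGURE_OPTS and the installation directory /opt/pcre are assumptions made
       for illustration; --prefix is the standard configure option for selecting the
       installation directory.

```shell
# Illustrative selection only; choose the options your build actually needs.
PCRE_CONFIGURE_OPTS="--prefix=/opt/pcre --enable-utf8"

# Run from the top of the PCRE source tree, where the configure script lives.
if [ -x ./configure ]; then
    # Deliberately unquoted so that each option becomes a separate argument.
    ./configure $PCRE_CONFIGURE_OPTS && make
else
    echo "no ./configure here; change to the PCRE source directory first" >&2
fi
```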

UTF-8 SUPPORT

       To build PCRE with support for UTF-8 character strings, add

         --enable-utf8

       to the configure command. On its own, this does not make PCRE treat strings as
       UTF-8. As well as compiling PCRE with this option, you also have to set the
       PCRE_UTF8 option when you call the pcre_compile() function.

UNICODE CHARACTER PROPERTY SUPPORT

       UTF-8  support  allows  PCRE  to  process  character values greater than 255 in the
       strings that it handles. On its own, however, it does not  provide  any  facilities
       for  accessing the properties of such characters. If you want to be able to use the
       pattern escapes \P, \p, and \X, which refer to Unicode  character  properties,  you
       must add

         --enable-unicode-properties

       to the configure command. This implies UTF-8 support, even if you have not
       explicitly requested it.

       Including Unicode property support adds around 90K of tables to the  PCRE  library,
       approximately  doubling  its  size. Only the general category properties such as Lu
       and Nd are supported. Details are given in the pcrepattern documentation.
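
       One way to confirm which of these features ended up in a build is to ask the
       pcretest program, which is built alongside the library, to report them. This is a
       sketch that assumes pcretest is on your PATH and accepts the -C option described
       in the pcretest documentation; the function name show_pcre_features is made up
       for this example.

```shell
show_pcre_features() {
    if command -v pcretest >/dev/null 2>&1; then
        # -C prints the PCRE version and the optional features compiled in, then exits.
        pcretest -C
    else
        echo "pcretest not found on PATH" >&2
        return 1
    fi
}

# Report the features, or say that the query was not possible.
show_pcre_features || echo "could not query the PCRE build"
```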

CODE VALUE OF NEWLINE

       By default, PCRE treats character 10 (linefeed) as the newline character.  This  is
       the  normal  newline  character  on  Unix-like systems. You can compile PCRE to use
       character 13 (carriage return) instead by adding

         --enable-newline-is-cr

       to the configure command. For completeness there is also  a  --enable-newline-is-lf
       option, which explicitly specifies linefeed as the newline character.

BUILDING SHARED AND STATIC LIBRARIES

       The  PCRE  building  process  uses  libtool  to  build  both shared and static Unix
       libraries by default. You can suppress one of these by adding one of

         --disable-shared
         --disable-static

       to the configure command, as required.

POSIX MALLOC USAGE

       When PCRE is called through the POSIX interface (see the pcreposix documentation),
       additional working storage is required for holding the pointers to capturing
       substrings, because PCRE requires three integers per substring, whereas the POSIX
       interface provides only two. If the number of expected substrings is small, the
       wrapper function uses space on the stack, because this is faster than using
       malloc() for each call. The default threshold above which the stack is no longer
       used is 10; it can be changed by adding a setting such as

         --with-posix-malloc-threshold=20

       to the configure command.

LIMITING PCRE RESOURCE USAGE

       Internally, PCRE has a function called match(), which it calls repeatedly (possibly
       recursively)  when  matching  a pattern. By controlling the maximum number of times
       this function may be called during a single matching  operation,  a  limit  can  be
       placed  on  the  resources  used  by a single call to pcre_exec(). The limit can be
       changed at run time, as described in the pcreapi documentation. The default  is  10
       million, but this can be changed by adding a setting such as

         --with-match-limit=500000

       to the configure command.

HANDLING VERY LARGE PATTERNS

       Within a compiled pattern, offset values are used to point from one part to another
       (for example, from an opening parenthesis  to  an  alternation  metacharacter).  By
       default,  two-byte values are used for these offsets, leading to a maximum size for
       a compiled pattern of around 64K. This is sufficient to handle  all  but  the  most
       gigantic  patterns. Nevertheless, some people do want to process enormous patterns,
       so it is possible to compile PCRE to use three-byte or four-byte offsets by  adding
       a setting such as

         --with-link-size=3

       to the configure command. The value given must be 2, 3, or 4. Using longer offsets
       slows down the operation of PCRE because it has to load additional bytes when
       handling them.
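
       The relationship between link size and pattern size is simple arithmetic: an
       offset of n bytes can address 2^(8n) bytes. The sketch below prints that bound in
       KiB for each permitted value. It is an upper bound on what the offsets can
       address, not a statement of PCRE's exact limits, and the function name
       max_offset_kib is hypothetical.

```shell
# An n-byte offset addresses 2^(8n) bytes, which is 2^(8n - 10) KiB.
max_offset_kib() {
    echo $(( 1 << (8 * $1 - 10) ))
}

for size in 2 3 4; do
    echo "--with-link-size=$size: up to $(max_offset_kib "$size") KiB"
done
```

       For the default link size of 2 this reports 64 KiB, which matches the limit of
       around 64K quoted above.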

       If you build PCRE with an increased link size, test 2 (and test 5 if you are using
       UTF-8) will fail. Part of the output of these tests is a representation of the
       compiled pattern, and this changes with the link size.

AVOIDING EXCESSIVE STACK USAGE

       PCRE implements backtracking while matching by making recursive calls to an
       internal function called match(). In environments where the size of the stack is
       limited, this can severely limit PCRE’s operation. (The Unix environment does not
       usually suffer from this problem.) An alternative approach that uses memory from
       the heap to remember data, instead of using recursive function calls, has been
       implemented to work round this problem. If you want to build a version of PCRE
       that works this way, add

         --disable-stack-for-recursion

       to the configure command. With this configuration, PCRE will use the
       pcre_stack_malloc and pcre_stack_free variables to call memory management
       functions. Separate functions are provided because the usage is very predictable:
       the block sizes requested are always the same, and the blocks are always freed in
       reverse order. A calling program might be able to implement optimized functions
       that perform better than the standard malloc() and free() functions. PCRE runs
       noticeably more slowly when built in this way.
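
       To judge whether the stack is limited in a given environment, you can inspect the
       limit that applies to processes started from your shell. This sketch assumes a
       shell whose ulimit builtin accepts -s (bash, dash, ksh, and zsh do, although POSIX
       does not require it); the variable name stack_limit is just for illustration.

```shell
# Soft limit on stack size, reported in KiB, or the word "unlimited".
stack_limit=$(ulimit -s)

case "$stack_limit" in
    unlimited)
        echo "stack size is not limited" ;;
    *)
        echo "stack is limited to $stack_limit KiB per process" ;;
esac
```

       How small a limit is too small depends on the patterns and subject strings
       involved, so treat the number as a starting point for testing rather than a rule.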

USING EBCDIC CODE

       PCRE assumes by default that it will run in an environment where the character code
       is ASCII (or Unicode, which is a superset of ASCII). PCRE can, however, be compiled
       to run in an EBCDIC environment by adding

         --enable-ebcdic

       to the configure command.

Last updated: 09 September 2004
Copyright (c) 1997-2004 University of Cambridge.



                                                                       PCRE(3)
