{
    "mode": "man",
    "parameter": "perlreapi",
    "section": "1",
    "url": "https://www.chedong.com/phpMan.php/man/perlreapi/1/json",
    "generated": "2026-06-15T18:51:38Z",
    "sections": {
        "NAME": {
            "content": "perlreapi - Perl regular expression plugin interface\n",
            "subsections": []
        },
        "DESCRIPTION": {
            "content": "As of Perl 5.9.5 there is a new interface for plugging and using regular expression engines\nother than the default one.\n\nEach engine is supposed to provide access to a constant structure of the following format:\n\ntypedef struct regexpengine {\nREGEXP* (*comp) (pTHX\nconst SV * const pattern, const U32 flags);\nI32     (*exec) (pTHX\nREGEXP * const rx,\nchar* stringarg,\nchar* strend, char* strbeg,\nSSizet minend, SV* sv,\nvoid* data, U32 flags);\nchar*   (*intuit) (pTHX\nREGEXP * const rx, SV *sv,\nconst char * const strbeg,\nchar *strpos, char *strend, U32 flags,\nstruct rescreamposdatas *data);\nSV*     (*checkstr) (pTHX REGEXP * const rx);\nvoid    (*free) (pTHX REGEXP * const rx);\nvoid    (*numberedbuffFETCH) (pTHX\nREGEXP * const rx,\nconst I32 paren,\nSV * const sv);\nvoid    (*numberedbuffSTORE) (pTHX\nREGEXP * const rx,\nconst I32 paren,\nSV const * const value);\nI32     (*numberedbuffLENGTH) (pTHX\nREGEXP * const rx,\nconst SV * const sv,\nconst I32 paren);\nSV*     (*namedbuff) (pTHX\nREGEXP * const rx,\nSV * const key,\nSV * const value,\nU32 flags);\nSV*     (*namedbuffiter) (pTHX\nREGEXP * const rx,\nconst SV * const lastkey,\nconst U32 flags);\nSV*     (*qrpackage)(pTHX REGEXP * const rx);\n#ifdef USEITHREADS\nvoid*   (*dupe) (pTHX REGEXP * const rx, CLONEPARAMS *param);\n#endif\nREGEXP* (*opcomp) (...);\n\nWhen a regexp is compiled, its \"engine\" field is then set to point at the appropriate\nstructure, so that when it needs to be used Perl can find the right routines to do so.\n\nIn order to install a new regexp handler, $^H{regcomp} is set to an integer which (when\ncasted appropriately) resolves to one of these structures.  
When compiling, the \"comp\" method\nis executed, and the resulting \"regexp\" structure's engine field is expected to point back at\nthe same structure.\n\nThe pTHX_ symbol in the definition is a macro used by Perl under threading to provide an\nextra argument to the routine holding a pointer back to the interpreter that is executing the\nregexp. So under threading all routines get an extra argument.\n",
            "subsections": []
        },
        "Callbacks": {
            "content": "",
            "subsections": [
                {
                    "name": "comp",
                    "content": "REGEXP* comp(pTHX const SV * const pattern, const U32 flags);\n\nCompile the pattern stored in \"pattern\" using the given \"flags\" and return a pointer to a\nprepared \"REGEXP\" structure that can perform the match.  See \"The REGEXP structure\" below for\nan explanation of the individual fields in the REGEXP struct.\n\nThe \"pattern\" parameter is the scalar that was used as the pattern.  Previous versions of\nPerl would pass two \"char*\" indicating the start and end of the stringified pattern; the\nfollowing snippet can be used to get the old parameters:\n\nSTRLEN plen;\nchar*  exp = SvPV(pattern, plen);\nchar* xend = exp + plen;\n\nSince any scalar can be passed as a pattern, it's possible to implement an engine that does\nsomething with an array (\"\"ook\" =~ [ qw/ eek hlagh / ]\") or with the non-stringified form of\na compiled regular expression (\"\"ook\" =~ qr/eek/\").  Perl's own engine will always stringify\neverything using the snippet above, but that doesn't mean other engines have to.\n\nThe \"flags\" parameter is a bitfield which indicates which of the \"msixpn\" flags the regex was\ncompiled with.  It also contains additional info, such as if \"use locale\" is in effect.\n\nThe \"eogc\" flags are stripped out before being passed to the comp routine.  The regex engine\ndoes not need to know if any of these are set, as those flags should only affect what Perl\ndoes with the pattern and its match variables, not how it gets compiled and executed.\n\nBy the time the comp callback is called, some of these flags have already had effect (noted\nbelow where applicable).  However most of their effect occurs after the comp callback has\nrun, in routines that read the \"rx->extflags\" field which it populates.\n\nIn general the flags should be preserved in \"rx->extflags\" after compilation, although the\nregex engine might want to add or delete some of them to invoke or disable some special\nbehavior in Perl.  
The flags along with any special behavior they cause are documented below:\n\nThe pattern modifiers:\n\n\"/m\" - RXf_PMf_MULTILINE\nIf this is in \"rx->extflags\" it will be passed to \"Perl_fbm_instr\" by \"pp_split\" which\nwill treat the subject string as a multi-line string.\n\n\"/s\" - RXf_PMf_SINGLELINE\n\"/i\" - RXf_PMf_FOLD\n\"/x\" - RXf_PMf_EXTENDED\nIf present on a regex, \"#\" comments will be handled differently by the tokenizer in some\ncases.\n\nTODO: Document those cases.\n\n\"/p\" - RXf_PMf_KEEPCOPY\nTODO: Document this\n\nCharacter set\nThe character set rules are determined by an enum that is contained in this field.  This\nis still experimental and subject to change, but the current interface returns the rules\nby use of the in-line function \"get_regex_charset(const U32 flags)\".  The only currently\ndocumented value returned from it is REGEX_LOCALE_CHARSET, which is set if \"use locale\"\nis in effect. If present in \"rx->extflags\", \"split\" will use the locale dependent\ndefinition of whitespace when RXf_SKIPWHITE or RXf_WHITE is in effect.  ASCII whitespace\nis defined as per isSPACE, and by the internal macros \"is_utf8_space\" under UTF-8, and\n\"isSPACE_LC\" under \"use locale\".\n\nAdditional flags:\n\nRXf_SPLIT\nThis flag was removed in perl 5.18.0.  \"split ' '\" is now special-cased solely in the\nparser.  RXf_SPLIT is still #defined, so you can test for it.  This is how it used to\nwork:\n\nIf \"split\" is invoked as \"split ' '\" or with no arguments (which really means \"split(' ',\n$_)\", see split), Perl will set this flag.  The regex engine can then check for it and\nset the SKIPWHITE and WHITE extflags.  To do this, the Perl engine does:\n\nif (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ')\nr->extflags |= (RXf_SKIPWHITE|RXf_WHITE);\n\nThese flags can be set during compilation to enable optimizations in the \"split\" operator.\n\nRXf_SKIPWHITE\nThis flag was removed in perl 5.18.0.  
It is still #defined, so you can set it, but doing\nso will have no effect.  This is how it used to work:\n\nIf the flag is present in \"rx->extflags\" \"split\" will delete whitespace from the start of\nthe subject string before it's operated on.  What is considered whitespace depends on whether\nthe subject is a UTF-8 string and whether the \"RXf_PMf_LOCALE\" flag is set.\n\nIf RXf_WHITE is set in addition to this flag, \"split\" will behave like \"split \" \"\" under\nthe Perl engine.\n\nRXf_START_ONLY\nTells the split operator to split the target string on newlines (\"\\n\") without invoking\nthe regex engine.\n\nPerl's engine sets this if the pattern is \"/^/\" (\"plen == 1 && *exp == '^'\"), even under\n\"/^/s\"; see split.  Of course a different regex engine might want to use the same\noptimizations with a different syntax.\n\nRXf_WHITE\nTells the split operator to split the target string on whitespace without invoking the\nregex engine.  The definition of whitespace varies depending on whether the target string is a\nUTF-8 string and on whether RXf_PMf_LOCALE is set.\n\nPerl's engine sets this flag if the pattern is \"\\s+\".\n\nRXf_NULL\nTells the split operator to split the target string on characters.  The definition of\ncharacter varies depending on whether the target string is a UTF-8 string.\n\nPerl's engine sets this flag on empty patterns; this optimization makes \"split //\" much\nfaster than it would otherwise be.  It's even faster than \"unpack\".\n\nRXf_NO_INPLACE_SUBST\nAdded in perl 5.18.0, this flag indicates that a regular expression might perform an\noperation that would interfere with inplace substitution. For instance it might contain\nlookbehind, or assign to non-magical variables (such as $REGMARK and $REGERROR) during\nmatching.  \"s///\" will skip certain optimisations when this is set.\n"
                },
                {
                    "name": "exec",
                    "content": "I32 exec(pTHX REGEXP * const rx,\nchar *stringarg, char* strend, char* strbeg,\nSSizet minend, SV* sv,\nvoid* data, U32 flags);\n\nExecute a regexp. The arguments are\n\nrx  The regular expression to execute.\n\nsv  This is the SV to be matched against.  Note that the actual char array to be matched\nagainst is supplied by the arguments described below; the SV is just used to determine\nUTF8ness, \"pos()\" etc.\n\nstrbeg\nPointer to the physical start of the string.\n\nstrend\nPointer to the character following the physical end of the string (i.e.  the \"\\0\", if\nany).\n\nstringarg\nPointer to the position in the string where matching should start; it might not be equal\nto \"strbeg\" (for example in a later iteration of \"/.../g\").\n\nminend\nMinimum length of string (measured in bytes from \"stringarg\") that must match; if the\nengine reaches the end of the match but hasn't reached this position in the string, it\nshould fail.\n\ndata\nOptimisation data; subject to change.\n\nflags\nOptimisation flags; subject to change.\n"
                },
                {
                    "name": "intuit",
                    "content": "char* intuit(pTHX\nREGEXP * const rx,\nSV *sv,\nconst char * const strbeg,\nchar *strpos,\nchar *strend,\nconst U32 flags,\nstruct rescreamposdatas *data);\n\nFind the start position where a regex match should be attempted, or possibly if the regex\nengine should not be run because the pattern can't match.  This is called, as appropriate, by\nthe core, depending on the values of the \"extflags\" member of the \"regexp\" structure.\n\nArguments:\n\nrx:     the regex to match against\nsv:     the SV being matched: only used for utf8 flag; the string\nitself is accessed via the pointers below. Note that on\nsomething like an overloaded SV, SvPOK(sv) may be false\nand the string pointers may point to something unrelated to\nthe SV itself.\nstrbeg: real beginning of string\nstrpos: the point in the string at which to begin matching\nstrend: pointer to the byte following the last char of the string\nflags   currently unused; set to 0\ndata:   currently unused; set to NULL\n"
                },
                {
                    "name": "checkstr",
                    "content": "SV* checkstr(pTHX REGEXP * const rx);\n\nReturn a SV containing a string that must appear in the pattern. Used by \"split\" for\noptimising matches.\n"
                },
                {
                    "name": "free",
                    "content": "void free(pTHX REGEXP * const rx);\n\nCalled by Perl when it is freeing a regexp pattern so that the engine can release any\nresources pointed to by the \"pprivate\" member of the \"regexp\" structure.  This is only\nresponsible for freeing private data; Perl will handle releasing anything else contained in\nthe \"regexp\" structure.\n"
                },
                {
                    "name": "Numbered capture callbacks",
                    "content": "Called to get/set the value of \"$`\", \"$'\", $& and their named equivalents, ${^PREMATCH},\n${^POSTMATCH} and ${^MATCH}, as well as the numbered capture groups ($1, $2, ...).\n\nThe \"paren\" parameter will be 1 for $1, 2 for $2 and so forth, and have these symbolic values\nfor the special variables:\n\n${^PREMATCH}  RXBUFFIDXCARETPREMATCH\n${^POSTMATCH} RXBUFFIDXCARETPOSTMATCH\n${^MATCH}     RXBUFFIDXCARETFULLMATCH\n$`            RXBUFFIDXPREMATCH\n$'            RXBUFFIDXPOSTMATCH\n$&            RXBUFFIDXFULLMATCH\n\nNote that in Perl 5.17.3 and earlier, the last three constants were also used for the caret\nvariants of the variables.\n\nThe names have been chosen by analogy with Tie::Scalar methods names with an additional\nLENGTH callback for efficiency.  However named capture variables are currently not tied\ninternally but implemented via magic.\n\nnumberedbuffFETCH\n\nvoid numberedbuffFETCH(pTHX REGEXP * const rx, const I32 paren,\nSV * const sv);\n\nFetch a specified numbered capture.  \"sv\" should be set to the scalar to return, the scalar\nis passed as an argument rather than being returned from the function because when it's\ncalled Perl already has a scalar to store the value, creating another one would be redundant.\nThe scalar can be set with \"svsetsv\", \"svsetpvn\" and friends, see perlapi.\n\nThis callback is where Perl untaints its own capture variables under taint mode (see\nperlsec).  See the \"Perlregnumberedbufffetch\" function in regcomp.c for how to untaint\ncapture variables if that's something you'd like your engine to do as well.\n\nnumberedbuffSTORE\n\nvoid    (*numberedbuffSTORE) (pTHX\nREGEXP * const rx,\nconst I32 paren,\nSV const * const value);\n\nSet the value of a numbered capture variable.  \"value\" is the scalar that is to be used as\nthe new value.  
It's up to the engine to make sure this is used as the new value (or reject\nit).\n\nExample:\n\nif (\"ook\" =~ /(o*)/) {\n# 'paren' will be '1' and 'value' will be 'ee'\n$1 =~ tr/o/e/;\n}\n\nPerl's own engine will croak on any attempt to modify the capture variables; to do this in\nanother engine use the following callback (copied from \"Perl_reg_numbered_buff_store\"):\n\nvoid\nExample_reg_numbered_buff_store(pTHX_\nREGEXP * const rx,\nconst I32 paren,\nSV const * const value)\n{\nPERL_UNUSED_ARG(rx);\nPERL_UNUSED_ARG(paren);\nPERL_UNUSED_ARG(value);\n\nif (!PL_localizing)\nPerl_croak(aTHX_ PL_no_modify);\n}\n\nActually Perl will not always croak in a statement that looks like it would modify a numbered\ncapture variable.  This is because the STORE callback will not be called if Perl can\ndetermine that it doesn't have to modify the value.  This is exactly how tied variables\nbehave in the same situation:\n\npackage CaptureVar;\nuse parent 'Tie::Scalar';\n\nsub TIESCALAR { bless [] }\nsub FETCH { undef }\nsub STORE { die \"This doesn't get called\" }\n\npackage main;\n\ntie my $sv => \"CaptureVar\";\n$sv =~ y/a/b/;\n\nBecause $sv is \"undef\" when the \"y///\" operator is applied to it, the transliteration won't\nactually execute and the program won't \"die\".  This is different to how 5.8 and earlier\nversions behaved since the capture variables were READONLY variables then; now they'll just\ndie when assigned to in the default engine.\n\nnumbered_buff_LENGTH\n\nI32 numbered_buff_LENGTH (pTHX_\nREGEXP * const rx,\nconst SV * const sv,\nconst I32 paren);\n\nGet the \"length\" of a capture variable.  
There's a special callback for this so that Perl\ndoesn't have to do a FETCH and run \"length\" on the result. Since the length is (in Perl's\ncase) known from an offset stored in \"rx->offs\", this is much more efficient:\n\nI32 s1  = rx->offs[paren].start;\nI32 s2  = rx->offs[paren].end;\nI32 len = s2 - s1;\n\nThis is a little bit more complex in the case of UTF-8, see what\n\"Perl_reg_numbered_buff_length\" does with is_utf8_string_loclen.\n"
                },
                {
                    "name": "Named capture callbacks",
                    "content": "Called to get/set the value of \"%+\" and \"%-\", as well as by some utility functions in re.\n\nThere are two callbacks, \"namedbuff\" is called in all the cases the FETCH, STORE, DELETE,\nCLEAR, EXISTS and SCALAR Tie::Hash callbacks would be on changes to \"%+\" and \"%-\" and\n\"namedbuffiter\" in the same cases as FIRSTKEY and NEXTKEY.\n\nThe \"flags\" parameter can be used to determine which of these operations the callbacks should\nrespond to.  The following flags are currently defined:\n\nWhich Tie::Hash operation is being performed from the Perl level on \"%+\" or \"%+\", if any:\n\nRXapifFETCH\nRXapifSTORE\nRXapifDELETE\nRXapifCLEAR\nRXapifEXISTS\nRXapifSCALAR\nRXapifFIRSTKEY\nRXapifNEXTKEY\n\nIf \"%+\" or \"%-\" is being operated on, if any.\n\nRXapifONE /* %+ */\nRXapifALL /* %- */\n\nIf this is being called as \"re::regname\", \"re::regnames\" or \"re::regnamescount\", if any.\nThe first two will be combined with \"RXapifONE\" or \"RXapifALL\".\n\nRXapifREGNAME\nRXapifREGNAMES\nRXapifREGNAMESCOUNT\n\nInternally \"%+\" and \"%-\" are implemented with a real tied interface via\nTie::Hash::NamedCapture.  The methods in that package will call back into these functions.\nHowever the usage of Tie::Hash::NamedCapture for this purpose might change in future\nreleases.  For instance this might be implemented by magic instead (would need an extension\nto mgvtbl).\n\nnamedbuff\n\nSV*     (*namedbuff) (pTHX REGEXP * const rx, SV * const key,\nSV * const value, U32 flags);\n\nnamedbuffiter\n\nSV*     (*namedbuffiter) (pTHX\nREGEXP * const rx,\nconst SV * const lastkey,\nconst U32 flags);\n\nqrpackage\nSV* qrpackage(pTHX REGEXP * const rx);\n\nThe package the qr// magic object is blessed into (as seen by \"ref qr//\").  
It is recommended\nthat engines change this to their package name for identification regardless of whether they\nimplement methods on the object.\n\nThe package this method returns should also have the internal \"Regexp\" package in its @ISA.\n\"qr//->isa(\"Regexp\")\" should always be true regardless of what engine is being used.\n\nExample implementation might be:\n\nSV*\nExample_qr_package(pTHX_ REGEXP * const rx)\n{\nPERL_UNUSED_ARG(rx);\nreturn newSVpvs(\"re::engine::Example\");\n}\n\nAny method calls on an object created with \"qr//\" will be dispatched to the package as a\nnormal object.\n\nuse re::engine::Example;\nmy $re = qr//;\n$re->meth; # dispatched to re::engine::Example::meth()\n\nTo retrieve the \"REGEXP\" object from the scalar in an XS function use the \"SvRX\" macro, see\n\"REGEXP Functions\" in perlapi.\n\nvoid meth(SV * rv)\nPPCODE:\nREGEXP * re = SvRX(rv);\n"
                },
                {
                    "name": "dupe",
                    "content": "void* dupe(pTHX REGEXP * const rx, CLONEPARAMS *param);\n\nOn threaded builds a regexp may need to be duplicated so that the pattern can be used by\nmultiple threads.  This routine is expected to handle the duplication of any private data\npointed to by the \"pprivate\" member of the \"regexp\" structure.  It will be called with the\npreconstructed new \"regexp\" structure as an argument, the \"pprivate\" member will point at the\nold private structure, and it is this routine's responsibility to construct a copy and return\na pointer to it (which Perl will then use to overwrite the field as passed to this routine.)\n\nThis allows the engine to dupe its private data but also if necessary modify the final\nstructure if it really must.\n\nOn unthreaded builds this field doesn't exist.\n\nopcomp\nThis is private to the Perl core and subject to change. Should be left null.\n"
                },
                {
                    "name": "The REGEXP structure",
                    "content": "The REGEXP struct is defined in regexp.h.  All regex engines must be able to correctly build\nsuch a structure in their \"comp\" routine.\n\nThe REGEXP structure contains all the data that Perl needs to be aware of to properly work\nwith the regular expression.  It includes data about optimisations that Perl can use to\ndetermine if the regex engine should really be used, and various other control info that is\nneeded to properly execute patterns in various contexts, such as if the pattern anchored in\nsome way, or what flags were used during the compile, or if the program contains special\nconstructs that Perl needs to be aware of.\n\nIn addition it contains two fields that are intended for the private use of the regex engine\nthat compiled the pattern.  These are the \"intflags\" and \"pprivate\" members.  \"pprivate\" is a\nvoid pointer to an arbitrary structure, whose use and management is the responsibility of the\ncompiling engine.  Perl will never modify either of these values.\n\ntypedef struct regexp {\n/* what engine created this regexp? */\nconst struct regexpengine* engine;\n\n/* what re is this a lightweight copy of? */\nstruct regexp* motherre;\n\n/* Information about the match that the Perl core uses to manage\n* things */\nU32 extflags;   /* Flags used both externally and internally */\nI32 minlen;     /* mininum possible number of chars in */\nstring to match */\nI32 minlenret;  /* mininum possible number of chars in $& */\nU32 gofs;       /* chars left of pos that we search from */\n\n/* substring data about strings that must appear\nin the final match, used for optimisations */\nstruct regsubstrdata *substrs;\n\nU32 nparens;  /* number of capture groups */\n\n/* private engine specific data */\nU32 intflags;   /* Engine Specific Internal flags */\nvoid *pprivate; /* Data private to the regex engine which\ncreated this object. */\n\n/* Data about the last/current match. 
These are modified during\n* matching*/\nU32 lastparen;            /* highest close paren matched ($+) */\nU32 lastcloseparen;       /* last close paren matched ($^N) */\nregexp_paren_pair *offs;  /* Array of offsets for (@-) and\n(@+) */\n\nchar *subbeg;  /* saved or original string so \\digit works\nforever. */\nSV_SAVED_COPY  /* If non-NULL, SV which is COW from original */\nI32 sublen;    /* Length of string pointed by subbeg */\nI32 suboffset;  /* byte offset of subbeg from logical start of\nstr */\nI32 subcoffset; /* suboffset equiv, but in chars (for @-/@+) */\n\n/* Information about the match that isn't often used */\nI32 prelen;           /* length of precomp */\nconst char *precomp;  /* pre-compilation regular expression */\n\nchar *wrapped;  /* wrapped version of the pattern */\nI32 wraplen;    /* length of wrapped */\n\nI32 seen_evals;   /* number of eval groups in the pattern - for\nsecurity checks */\nHV *paren_names;  /* Optional hash of paren names */\n\n/* Refcount of this regexp */\nI32 refcnt;             /* Refcount of this regexp */\n} regexp;\n\nThe fields are discussed in more detail below:\n"
                },
                {
                    "name": "\"engine\"",
                    "content": "This field points at a \"regexpengine\" structure which contains pointers to the subroutines\nthat are to be used for performing a match.  It is the compiling routine's responsibility to\npopulate this field before returning the regexp object.\n\nInternally this is set to \"NULL\" unless a custom engine is specified in $^H{regcomp}, Perl's\nown set of callbacks can be accessed in the struct pointed to by \"REENGINEPTR\".\n\n\"motherre\"\nTODO, see commit 28d8d7f41a.\n"
                },
                {
                    "name": "\"extflags\"",
                    "content": "This will be used by Perl to see what flags the regexp was compiled with, this will normally\nbe set to the value of the flags parameter by the comp callback.  See the comp documentation\nfor valid flags.\n"
                },
                {
                    "name": "\"minlen\" \"minlenret\"",
                    "content": "The minimum string length (in characters) required for the pattern to match.  This is used to\nprune the search space by not bothering to match any closer to the end of a string than would\nallow a match.  For instance there is no point in even starting the regex engine if the\nminlen is 10 but the string is only 5 characters long.  There is no way that the pattern can\nmatch.\n\n\"minlenret\" is the minimum length (in characters) of the string that would be found in $&\nafter a match.\n\nThe difference between \"minlen\" and \"minlenret\" can be seen in the following pattern:\n\n/ns(?=\\d)/\n\nwhere the \"minlen\" would be 3 but \"minlenret\" would only be 2 as the \\d is required to match\nbut is not actually included in the matched content.  This distinction is particularly\nimportant as the substitution logic uses the \"minlenret\" to tell if it can do in-place\nsubstitutions (these can result in considerable speed-up).\n"
                },
                {
                    "name": "\"gofs\"",
                    "content": "Left offset from pos() to start match at.\n"
                },
                {
                    "name": "\"substrs\"",
                    "content": "Substring data about strings that must appear in the final match.  This is currently only\nused internally by Perl's engine, but might be used in the future for all engines for\noptimisations.\n"
                },
                {
                    "name": "\"nparens\", \"lastparen\", and \"lastcloseparen\"",
                    "content": "These fields are used to keep track of: how many paren capture groups there are in the\npattern; which was the highest paren to be closed (see \"$+\" in perlvar); and which was the\nmost recent paren to be closed (see \"$^N\" in perlvar).\n"
                },
                {
                    "name": "\"intflags\"",
                    "content": "The engine's private copy of the flags the pattern was compiled with. Usually this is the\nsame as \"extflags\" unless the engine chose to modify one of them.\n"
                },
                {
                    "name": "\"pprivate\"",
                    "content": "A void* pointing to an engine-defined data structure.  The Perl engine uses the\n\"regexpinternal\" structure (see \"Base Structures\" in perlreguts) but a custom engine should\nuse something else.\n"
                },
                {
                    "name": "\"offs\"",
                    "content": "A \"regexpparenpair\" structure which defines offsets into the string being matched which\ncorrespond to the $& and $1, $2 etc. captures, the \"regexpparenpair\" struct is defined as\nfollows:\n\ntypedef struct regexpparenpair {\nI32 start;\nI32 end;\n} regexpparenpair;\n\nIf \"->offs[num].start\" or \"->offs[num].end\" is \"-1\" then that capture group did not match.\n\"->offs[0].start/end\" represents $& (or \"${^MATCH}\" under \"/p\") and \"->offs[paren].end\"\nmatches $$paren where $paren = 1>.\n"
                },
                {
                    "name": "\"precomp\" \"prelen\"",
                    "content": "Used for optimisations.  \"precomp\" holds a copy of the pattern that was compiled and \"prelen\"\nits length.  When a new pattern is to be compiled (such as inside a loop) the internal\n\"regcomp\" operator checks if the last compiled \"REGEXP\"'s \"precomp\" and \"prelen\" are\nequivalent to the new one, and if so uses the old pattern instead of compiling a new one.\n\nThe relevant snippet from \"Perlppregcomp\":\n\nif (!re || !re->precomp || re->prelen != (I32)len ||\nmemNE(re->precomp, t, len))\n/* Compile a new pattern */\n\n\"parennames\"\nThis is a hash used internally to track named capture groups and their offsets.  The keys are\nthe names of the buffers the values are dualvars, with the IV slot holding the number of\nbuffers with the given name and the pv being an embedded array of I32.  The values may also\nbe contained independently in the data array in cases where named backreferences are used.\n"
                },
                {
                    "name": "\"substrs\"",
                    "content": "Holds information on the longest string that must occur at a fixed offset from the start of\nthe pattern, and the longest string that must occur at a floating offset from the start of\nthe pattern.  Used to do Fast-Boyer-Moore searches on the string to find out if its worth\nusing the regex engine at all, and if so where in the string to search.\n\n\"subbeg\" \"sublen\" \"savedcopy\" \"suboffset\" \"subcoffset\"\nUsed during the execution phase for managing search and replace patterns, and for providing\nthe text for $&, $1 etc. \"subbeg\" points to a buffer (either the original string, or a copy\nin the case of \"RXMATCHCOPIED(rx)\"), and \"sublen\" is the length of the buffer.  The\n\"RXOFFS\" start and end indices index into this buffer.\n\nIn the presence of the \"REXECCOPYSTR\" flag, but with the addition of the\n\"REXECCOPYSKIPPRE\" or \"REXECCOPYSKIPPOST\" flags, an engine can choose not to copy the\nfull buffer (although it must still do so in the presence of \"RXfPMfKEEPCOPY\" or the\nrelevant bits being set in \"PLsawampersand\").  In this case, it may set \"suboffset\" to\nindicate the number of bytes from the logical start of the buffer to the physical start (i.e.\n\"subbeg\").  It should also set \"subcoffset\", the number of characters in the offset. The\nlatter is needed to support \"@-\" and \"@+\" which work in characters, not bytes.\n"
                },
                {
                    "name": "\"wrapped\" \"wraplen\"",
                    "content": "Stores the string \"qr//\" stringifies to. The Perl engine for example stores \"(?^:eek)\" in the\ncase of \"qr/eek/\".\n\nWhen using a custom engine that doesn't support the \"(?:)\" construct for inline modifiers,\nit's probably best to have \"qr//\" stringify to the supplied pattern, note that this will\ncreate undesired patterns in cases such as:\n\nmy $x = qr/a|b/;  # \"a|b\"\nmy $y = qr/c/i;   # \"c\"\nmy $z = qr/$x$y/; # \"a|bc\"\n\nThere's no solution for this problem other than making the custom engine understand a\nconstruct like \"(?:)\".\n\n\"seenevals\"\nThis stores the number of eval groups in the pattern.  This is used for security purposes\nwhen embedding compiled regexes into larger patterns with \"qr//\".\n"
                },
                {
                    "name": "\"refcnt\"",
                    "content": "The number of times the structure is referenced.  When this falls to 0, the regexp is\nautomatically freed by a call to \"pregfree\".  This should be set to 1 in each engine's \"comp\"\nroutine.\n"
                }
            ]
        },
        "HISTORY": {
            "content": "Originally part of perlreguts.\n",
            "subsections": []
        },
        "AUTHORS": {
            "content": "Originally written by Yves Orton, expanded by Ævar Arnfjörð Bjarmason.\n",
            "subsections": []
        },
        "LICENSE": {
            "content": "Copyright 2006 Yves Orton and 2007 Ævar Arnfjörð Bjarmason.\n\nThis program is free software; you can redistribute it and/or modify it under the same terms\nas Perl itself.\n\n\n\nperl v5.34.0                                 2025-07-25                                 PERLREAPI(1)",
            "subsections": []
        }
    },
    "summary": "perlreapi - Perl regular expression plugin interface",
    "flags": [],
    "examples": [],
    "see_also": []
}