{
    "mode": "perldoc",
    "parameter": "Parse::RecDescent",
    "section": "",
    "url": "https://www.chedong.com/phpMan.php/perldoc/Parse%3A%3ARecDescent/json",
    "generated": "2026-06-11T23:12:32Z",
    "synopsis": "use Parse::RecDescent;\n# Generate a parser from the specification in $grammar:\n$parser = new Parse::RecDescent ($grammar);\n# Generate a parser from the specification in $othergrammar\n$anotherparser = new Parse::RecDescent ($othergrammar);\n# Parse $text using rule 'startrule' (which must be\n# defined in $grammar):\n$parser->startrule($text);\n# Parse $text using rule 'otherrule' (which must also\n# be defined in $grammar):\n$parser->otherrule($text);\n# Change the universal token prefix pattern\n# before building a grammar\n# (the default is: '\\s*'):\n$Parse::RecDescent::skip = '[ \\t]+';\n# Replace productions of existing rules (or create new ones)\n# with the productions defined in $newgrammar:\n$parser->Replace($newgrammar);\n# Extend existing rules (or create new ones)\n# by adding extra productions defined in $moregrammar:\n$parser->Extend($moregrammar);\n# Global flags (useful as command line arguments under -s):\n$::RDERRORS       # unless undefined, report fatal errors\n$::RDWARN         # unless undefined, also report non-fatal problems\n$::RDHINT         # if defined, also suggest remedies\n$::RDTRACE        # if defined, also trace parsers' behaviour\n$::RDAUTOSTUB     # if defined, generates \"stubs\" for undefined rules\n$::RDAUTOACTION   # if defined, appends specified action to productions",
    "sections": {
        "NAME": {
            "content": "Parse::RecDescent - Generate Recursive-Descent Parsers\n",
            "subsections": []
        },
        "VERSION": {
            "content": "This document describes version 1.967015 of Parse::RecDescent released April 4th, 2017.\n",
            "subsections": []
        },
        "SYNOPSIS": {
            "content": "use Parse::RecDescent;\n\n# Generate a parser from the specification in $grammar:\n\n$parser = new Parse::RecDescent ($grammar);\n\n# Generate a parser from the specification in $othergrammar\n\n$anotherparser = new Parse::RecDescent ($othergrammar);\n\n\n# Parse $text using rule 'startrule' (which must be\n# defined in $grammar):\n\n$parser->startrule($text);\n\n\n# Parse $text using rule 'otherrule' (which must also\n# be defined in $grammar):\n\n$parser->otherrule($text);\n\n\n# Change the universal token prefix pattern\n# before building a grammar\n# (the default is: '\\s*'):\n\n$Parse::RecDescent::skip = '[ \\t]+';\n\n\n# Replace productions of existing rules (or create new ones)\n# with the productions defined in $newgrammar:\n\n$parser->Replace($newgrammar);\n\n\n# Extend existing rules (or create new ones)\n# by adding extra productions defined in $moregrammar:\n\n$parser->Extend($moregrammar);\n\n\n# Global flags (useful as command line arguments under -s):\n\n$::RDERRORS       # unless undefined, report fatal errors\n$::RDWARN         # unless undefined, also report non-fatal problems\n$::RDHINT         # if defined, also suggest remedies\n$::RDTRACE        # if defined, also trace parsers' behaviour\n$::RDAUTOSTUB     # if defined, generates \"stubs\" for undefined rules\n$::RDAUTOACTION   # if defined, appends specified action to productions\n",
            "subsections": []
        },
        "DESCRIPTION": {
            "content": "",
            "subsections": [
                {
                    "name": "Overview",
                    "content": "Parse::RecDescent incrementally generates top-down recursive-descent text parsers from simple\n*yacc*-like grammar specifications. It provides:\n\n*   Regular expressions or literal strings as terminals (tokens),\n\n*   Multiple (non-contiguous) productions for any rule,\n\n*   Repeated and optional subrules within productions,\n\n*   Full access to Perl within actions specified as part of the grammar,\n\n*   Simple automated error reporting during parser generation and parsing,\n\n*   The ability to commit to, uncommit to, or reject particular productions during a parse,\n\n*   The ability to pass data up and down the parse tree (\"down\" via subrule argument lists, \"up\"\nvia subrule return values),\n\n*   Incremental extension of the parsing grammar (even during a parse),\n\n*   Precompilation of parser objects,\n\n*   User-definable reduce-reduce conflict resolution via \"scoring\" of matching productions.\n\nUsing \"Parse::RecDescent\"\nParser objects are created by calling \"Parse::RecDescent::new\", passing in a grammar\nspecification (see the following subsections). If the grammar is correct, \"new\" returns a\nblessed reference which can then be used to initiate parsing through any rule specified in the\noriginal grammar. A typical sequence looks like this:\n\n$grammar = q {\n# GRAMMAR SPECIFICATION HERE\n};\n\n$parser = new Parse::RecDescent ($grammar) or die \"Bad grammar!\\n\";\n\n# acquire $text\n\ndefined $parser->startrule($text) or print \"Bad text!\\n\";\n\nThe rule through which parsing is initiated must be explicitly defined in the grammar (i.e. for\nthe above example, the grammar must include a rule of the form: \"startrule: <subrules>\").\n\nIf the starting rule succeeds, its value (see below) is returned. Failure to generate the\noriginal parser or failure to match a text is indicated by returning \"undef\". Note that it's\neasy to set up grammars that can succeed, but which return a value of 0, \"0\", or \"\". 
So don't be\ntempted to write:\n\n$parser->startrule($text) or print \"Bad text!\\n\";\n\nNormally, the parser has no effect on the original text. So in the previous example the value of\n$text would be unchanged after having been parsed.\n\nIf, however, the text to be matched is passed by reference:\n\n$parser->startrule(\\$text)\n\nthen any text which was consumed during the match will be removed from the start of $text.\n"
                },
                {
                    "name": "Rules",
                    "content": "In the grammar from which the parser is built, rules are specified by giving an identifier\n(which must satisfy /[A-Za-z]\\w*/), followed by a colon *on the same line*, followed by one or\nmore productions, separated by single vertical bars. The layout of the productions is entirely\nfree-format:\n\nrule1:  production1\n|  production2 |\nproduction3 | production4\n\nAt any point in the grammar previously defined rules may be extended with additional\nproductions. This is achieved by redeclaring the rule with the new productions. Thus:\n\nrule1: a | b | c\nrule2: d | e | f\nrule1: g | h\n\nis exactly equivalent to:\n\nrule1: a | b | c | g | h\nrule2: d | e | f\n\nEach production in a rule consists of zero or more items, each of which may be either: the name\nof another rule to be matched (a \"subrule\"), a pattern or string literal to be matched directly\n(a \"token\"), a block of Perl code to be executed (an \"action\"), a special instruction to the\nparser (a \"directive\"), or a standard Perl comment (which is ignored).\n\nA rule matches a text if one of its productions matches. A production matches if each of its\nitems match consecutive substrings of the text. The productions of a rule being matched are\ntried in the same order that they appear in the original grammar, and the first matching\nproduction terminates the match attempt (successfully). If all productions are tried and none\nmatches, the match attempt fails.\n\nNote that this behaviour is quite different from the \"prefer the longer match\" behaviour of\n*yacc*. For example, if *yacc* were parsing the rule:\n\nseq : 'A' 'B'\n| 'A' 'B' 'C'\n\nupon matching \"AB\" it would look ahead to see if a 'C' is next and, if so, will match the second\nproduction in preference to the first. 
In other words, *yacc* effectively tries all the\nproductions of a rule breadth-first in parallel, and selects the \"best\" match, where \"best\"\nmeans longest (note that this is a gross simplification of the true behaviour of *yacc* but it\nwill do for our purposes).\n\nIn contrast, \"Parse::RecDescent\" tries each production depth-first in sequence, and selects the\n\"best\" match, where \"best\" means first. This is the fundamental difference between \"bottom-up\"\nand \"recursive descent\" parsing.\n\nEach successfully matched item in a production is assigned a value, which can be accessed in\nsubsequent actions within the same production (or, in some cases, as the return value of a\nsuccessful subrule call). Unsuccessful items don't have an associated value, since the failure\nof an item causes the entire surrounding production to immediately fail. The following sections\ndescribe the various types of items and their success values.\n"
                },
                {
                    "name": "Subrules",
                    "content": "A subrule which appears in a production is an instruction to the parser to attempt to match the\nnamed rule at that point in the text being parsed. If the named subrule is not defined when\nrequested the production containing it immediately fails (unless it was \"autostubbed\" - see\nAutostubbing).\n\nA rule may (recursively) call itself as a subrule, but *not* as the left-most item in any of its\nproductions (since such recursions are usually non-terminating).\n\nThe value associated with a subrule is the value associated with its $return variable (see\n\"Actions\" below), or with the last successfully matched item in the subrule match.\n\nSubrules may also be specified with a trailing repetition specifier, indicating that they are to\nbe (greedily) matched the specified number of times. The available specifiers are:\n\nsubrule(?)  # Match one-or-zero times\nsubrule(s)  # Match one-or-more times\nsubrule(s?) # Match zero-or-more times\nsubrule(N)  # Match exactly N times for integer N > 0\nsubrule(N..M)   # Match between N and M times\nsubrule(..M)    # Match between 1 and M times\nsubrule(N..)    # Match at least N times\n\nRepeated subrules keep matching until either the subrule fails to match, or it has matched the\nminimal number of times but fails to consume any of the parsed text (this second condition\nprevents the subrule matching forever in some cases).\n\nSince a repeated subrule may match many instances of the subrule itself, the value associated\nwith it is not a simple scalar, but rather a reference to a list of scalars, each of which is\nthe value associated with one of the individual subrule matches. 
In other words in the rule:\n\nprogram: statement(s)\n\nthe value associated with the repeated subrule \"statement(s)\" is a reference to an array\ncontaining the values matched by each call to the individual subrule \"statement\".\n\nRepetition modifiers may include a separator pattern:\n\nprogram: statement(s /;/)\n\nspecifying some sequence of characters to be skipped between each repetition. This is really\njust a shorthand for the <leftop:...> directive (see below).\n"
                },
                {
                    "name": "Tokens",
                    "content": "If a quote-delimited string or a Perl regex appears in a production, the parser attempts to\nmatch that string or pattern at that point in the text. For example:\n\ntypedef: \"typedef\" typename identifier ';'\n\nidentifier: /[A-Za-z][A-Za-z0-9]*/\n\nAs in regular Perl, a single-quoted string is uninterpolated, whilst a double-quoted string or a\npattern is interpolated (at the time of matching, *not* when the parser is constructed). Hence,\nit is possible to define rules in which tokens can be set at run-time:\n\ntypedef: \"$::typedefkeyword\" typename identifier ';'\n\nidentifier: /$::identpat/\n\nNote that, since each rule is implemented inside a special namespace belonging to its parser, it\nis necessary to explicitly qualify variables from the main package.\n\nRegex tokens can be specified using just slashes as delimiters or with the explicit\n\"m<delimiter>......<delimiter>\" syntax:\n\ntypedef: \"typedef\" typename identifier ';'\n\ntypename: /[A-Za-z][A-Za-z0-9]*/\n\nidentifier: m{[A-Za-z][A-Za-z0-9]*}\n\nA regex of either type can also have any valid trailing parameter(s) (that is, any of\n[cgimsox]):\n\ntypedef: \"typedef\" typename identifier ';'\n\nidentifier: / [a-z]        # LEADING ALPHA\n[a-z0-9]*    # THEN DIGITS ALSO ALLOWED\n/ix     # CASE/SPACE/COMMENT INSENSITIVE\n\nThe value associated with any successfully matched token is a string containing the actual text\nwhich was matched by the token.\n\nIt is important to remember that, since each grammar is specified in a Perl string, all\ninstances of the universal escape character '\\' within a grammar must be \"doubled\", so that they\ninterpolate to single '\\'s when the string is compiled. 
For example, to use the grammar:\n\nword:       /\\S+/ | backslash\nline:       prefix word(s) \"\\n\"\nbackslash:  '\\\\'\n\nthe following code is required:\n\n$parser = new Parse::RecDescent (q{\n\nword:   /\\\\S+/ | backslash\nline:   prefix word(s) \"\\\\n\"\nbackslash:  '\\\\\\\\'\n\n});\n"
                },
                {
                    "name": "Anonymous subrules",
                    "content": "Parentheses introduce a nested scope that is very much like a call to an anonymous subrule. Hence\nthey are useful for \"in-lining\" subrule calls, and other kinds of grouping behaviour. For\nexample, instead of:\n\nword:       /\\S+/ | backslash\nline:       prefix word(s) \"\\n\"\n\nyou could write:\n\nline:       prefix ( /\\S+/ | backslash )(s) \"\\n\"\n\nand get exactly the same effects.\n\nParentheses are also used for collecting unrepeated alternations within a single production:\n\nsecretidentity: \"Mr\" (\"Incredible\"|\"Fantastic\"|\"Sheen\") \", Esq.\"\n"
                },
                {
                    "name": "Terminal Separators",
                    "content": "For the purpose of matching, each terminal in a production is considered to be preceded by a\n\"prefix\" - a pattern which must be matched before a token match is attempted. By default, the\nprefix is optional whitespace (which always matches, at least trivially), but this default may\nbe reset in any production.\n\nThe variable $Parse::RecDescent::skip stores the universal prefix, which is the default for all\nterminal matches in all parsers built with \"Parse::RecDescent\".\n\nIf you want to change the universal prefix using $Parse::RecDescent::skip, be careful to set it\n*before* creating the grammar object, because it is applied statically (when a grammar is built)\nrather than dynamically (when the grammar is used). Alternatively you can provide a global\n\"<skip:...>\" directive in your grammar before any rules (described later).\n\nThe prefix for an individual production can be altered by using the \"<skip:...>\" directive\n(described later). Setting this directive in the top-level rule is an alternative approach to\nsetting $Parse::RecDescent::skip before creating the object, but in this case you don't get the\nintended skipping behaviour if you directly invoke methods different from the top-level rule.\n"
                },
                {
                    "name": "Actions",
                    "content": "An action is a block of Perl code which is to be executed (as the block of a \"do\" statement)\nwhen the parser reaches that point in a production. The action executes within a special\nnamespace belonging to the active parser, so care must be taken in correctly qualifying variable\nnames (see also \"Start-up Actions\" below).\n\nThe action is considered to succeed if the final value of the block is defined (that is, if the\nimplied \"do\" statement evaluates to a defined value - *even one which would be treated as\n\"false\"*). Note that the value associated with a successful action is also the final value in\nthe block.\n\nAn action will *fail* if its last evaluated value is \"undef\". This is surprisingly easy to\naccomplish by accident. For instance, here's an infuriating case of an action that makes its\nproduction fail, but only when debugging *isn't* activated:\n\ndescription: name rank serialnumber\n{ print \"Got $item[2] $item[1] ($item[3])\\n\"\nif $::debugging\n}\n\nIf $debugging is false, no statement in the block is executed, so the final value is \"undef\",\nand the entire production fails. The solution is:\n\ndescription: name rank serialnumber\n{ print \"Got $item[2] $item[1] ($item[3])\\n\"\nif $::debugging;\n1;\n}\n\nWithin an action, a number of useful parse-time variables are available in the special parser\nnamespace (there are other variables also accessible, but meddling with them will probably just\nbreak your parser. As a general rule, if you avoid referring to unqualified variables -\nespecially those starting with an underscore - inside an action, things should be okay):\n\n@item and %item\nThe array slice @item[1..$#item] stores the value associated with each item (that is, each\nsubrule, token, or action) in the current production. The analogy is to $1, $2, etc. in a\n*yacc* grammar. 
Note that, for obvious reasons, @item only contains the values of items\n*before* the current point in the production.\n\nThe first element ($item[0]) stores the name of the current rule being matched.\n\n@item is a standard Perl array, so it can also be indexed with negative numbers,\nrepresenting the number of items *back* from the current position in the parse:\n\nstuff: /various/ bits 'and' pieces \"then\" data 'end'\n{ print $item[-2] }  # PRINTS data\n# (EASIER THAN: $item[6])\n\nThe %item hash complements the @item array, providing named access to the same item\nvalues:\n\nstuff: /various/ bits 'and' pieces \"then\" data 'end'\n{ print $item{data} }  # PRINTS data\n# (EVEN EASIER THAN USING @item)\n\nThe results of named subrules are stored in the hash under each subrule's name (including\nthe repetition specifier, if any), whilst all other items are stored under a \"named\npositional\" key that indicates their ordinal position within their item type: STRING*n*,\nPATTERN*n*, DIRECTIVE*n*, ACTION*n*:\n\nstuff: /various/ bits 'and' pieces \"then\" data 'end' { save }\n{ print $item{PATTERN1}, # PRINTS 'various'\n$item{STRING2},  # PRINTS 'then'\n$item{ACTION1},  # PRINTS RETURN\n# VALUE OF save\n}\n\nIf you want proper *named* access to patterns or literals, you need to turn them into\nseparate rules:\n\nstuff: various bits 'and' pieces \"then\" data 'end'\n{ print $item{various}  # PRINTS various\n}\n\nvarious: /various/\n\nThe special entry $item{RULE} stores the name of the current rule (i.e. the same value\nas $item[0]).\n\nThe advantage of using %item instead of @item is that it removes the need to track item\npositions that may change as a grammar evolves. For example, adding an interim \"<skip>\"\ndirective or action can silently ruin a trailing action, by moving an @item element \"down\"\nthe array one place. 
In contrast, the named entry of %item is unaffected by such an\ninsertion.\n\nA limitation of the %item hash is that it only records the *last* value of a particular\nsubrule. For example:\n\nrange: '(' number '..' number ')'\n{ $return = $item{number} }\n\nwill return only the value corresponding to the *second* match of the \"number\" subrule. In\nother words, successive calls to a subrule overwrite the corresponding entry in %item. Once\nagain, the solution is to rename each subrule in its own rule:\n\nrange: '(' fromnum '..' tonum ')'\n{ $return = $item{fromnum} }\n\nfromnum: number\ntonum:   number\n\n@arg and %arg\nThe array @arg and the hash %arg store any arguments passed to the rule from some other rule\n(see \"Subrule argument lists\"). Changes to the elements of either variable do not propagate\nback to the calling rule (data can be passed back from a subrule via the $return variable -\nsee next item).\n\n$return\nIf a value is assigned to $return within an action, that value is returned if the production\ncontaining the action eventually matches successfully. Note that setting $return *doesn't*\ncause the current production to succeed. It merely tells it what to return if it *does*\nsucceed. Hence $return is analogous to $$ in a *yacc* grammar.\n\nIf $return is not assigned within a production, the value of the last component of the\nproduction (namely: $item[$#item]) is returned if the production succeeds.\n\n$commit\nThe current state of commitment to the current production (see \"Directives\" below).\n\n$skip\nThe current terminal prefix (see \"Directives\" below).\n\n$text\nThe remaining (unparsed) text. Changes to $text *do not propagate* out of unsuccessful\nproductions, but *do* survive successful productions. Hence it is possible to dynamically\nalter the text being parsed - for example, to provide a \"#include\"-like facility:\n\nhashinclude: '#include' filename\n{ $text = ::loadfile($item[2]) . 
$text }\n\nfilename: '<' /[a-z0-9.-]+/i '>'  { $return = $item[2] }\n| '\"' /[a-z0-9.-]+/i '\"'  { $return = $item[2] }\n\n$thisline and $prevline\n$thisline stores the current line number within the current parse (starting from 1).\n$prevline stores the line number for the last character which was already successfully\nparsed (this will be different from $thisline at the end of each line).\n\nFor efficiency, $thisline and $prevline are actually tied hashes, and only recompute the\nrequired line number when the variable's value is used.\n\nAssignment to $thisline adjusts the line number calculator, so that it believes that the\ncurrent line number is the value being assigned. Note that this adjustment will be reflected\nin all subsequent line numbers calculations.\n\nModifying the value of the variable $text (as in the previous \"hashinclude\" example, for\ninstance) will confuse the line counting mechanism. To prevent this, you should call\n\"Parse::RecDescent::LineCounter::resync($thisline)\" *immediately* after any assignment to\nthe variable $text (or, at least, before the next attempt to use $thisline).\n\nNote that if a production fails after assigning to or resync'ing $thisline, the parser's\nline counter mechanism will usually be corrupted.\n\nAlso see the entry for @itempos.\n\nThe line number can be set to values other than 1, by calling the start rule with a second\nargument. For example:\n\n$parser = new Parse::RecDescent ($grammar);\n\n$parser->input($text, 10);  # START LINE NUMBERS AT 10\n\n$thiscolumn and $prevcolumn\n$thiscolumn stores the current column number within the current line being parsed (starting\nfrom 1). $prevcolumn stores the column number of the last character which was actually\nsuccessfully parsed. 
Usually \"$prevcolumn == $thiscolumn-1\", but not at the end of lines.\n\nFor efficiency, $thiscolumn and $prevcolumn are actually tied hashes, and only recompute the\nrequired column number when the variable's value is used.\n\nAssignment to $thiscolumn or $prevcolumn is a fatal error.\n\nModifying the value of the variable $text (as in the previous \"hashinclude\" example, for\ninstance) may confuse the column counting mechanism.\n\nNote that $thiscolumn reports the column number *before* any whitespace that might be\nskipped before reading a token. Hence if you wish to know where a token started (and ended)\nuse something like this:\n\nrule: token1 token2 startcol token3 endcol token4\n{ print \"token3: columns $item[3] to $item[5]\"; }\n\nstartcol: '' { $thiscolumn }    # NEED THE '' TO STEP PAST TOKEN SEP\nendcol:  { $prevcolumn }\n\nAlso see the entry for @itempos.\n\n$thisoffset and $prevoffset\n$thisoffset stores the offset of the current parsing position within the complete text being\nparsed (starting from 0). $prevoffset stores the offset of the last character which was\nactually successfully parsed. In all cases \"$prevoffset == $thisoffset-1\".\n\nFor efficiency, $thisoffset and $prevoffset are actually tied hashes, and only recompute the\nrequired offset when the variable's value is used.\n\nAssignment to $thisoffset or $prevoffset is a fatal error.\n\nModifying the value of the variable $text will *not* affect the offset counting mechanism.\n\nAlso see the entry for @itempos.\n\n@itempos\nThe array @itempos stores a hash reference corresponding to each element of @item. 
The\nelements of the hash provide the following:\n\n$itempos[$n]{offset}{from}  # VALUE OF $thisoffset BEFORE $item[$n]\n$itempos[$n]{offset}{to}    # VALUE OF $prevoffset AFTER $item[$n]\n$itempos[$n]{line}{from}    # VALUE OF $thisline BEFORE $item[$n]\n$itempos[$n]{line}{to}  # VALUE OF $prevline AFTER $item[$n]\n$itempos[$n]{column}{from}  # VALUE OF $thiscolumn BEFORE $item[$n]\n$itempos[$n]{column}{to}    # VALUE OF $prevcolumn AFTER $item[$n]\n\nNote that the various \"$itempos[$n]...{from}\" values record the appropriate value *after*\nany token prefix has been skipped.\n\nHence, instead of the somewhat tedious and error-prone:\n\nrule: startcol token1 endcol\nstartcol token2 endcol\nstartcol token3 endcol\n{ print \"token1: columns $item[1]\nto $item[3]\ntoken2: columns $item[4]\nto $item[6]\ntoken3: columns $item[7]\nto $item[9]\" }\n\nstartcol: '' { $thiscolumn }    # NEED THE '' TO STEP PAST TOKEN SEP\nendcol:  { $prevcolumn }\n\nit is possible to write:\n\nrule: token1 token2 token3\n{ print \"token1: columns $itempos[1]{column}{from}\nto $itempos[1]{column}{to}\ntoken2: columns $itempos[2]{column}{from}\nto $itempos[2]{column}{to}\ntoken3: columns $itempos[3]{column}{from}\nto $itempos[3]{column}{to}\" }\n\nNote however that (in the current implementation) the use of @itempos anywhere in a grammar\nimplies that item positioning information is collected *everywhere* during the parse.\nDepending on the grammar and the size of the text to be parsed, this may be prohibitively\nexpensive and the explicit use of $thisline, $thiscolumn, etc. may be a better choice.\n\n$thisparser\nA reference to the \"Parse::RecDescent\" object through which parsing was initiated.\n\nThe value of $thisparser propagates down the subrules of a parse but not back up. 
Hence, you\ncan invoke subrules from another parser for the scope of the current rule as follows:\n\nrule: subrule1 subrule2\n| { $thisparser = $::otherparser } <reject>\n| subrule3 subrule4\n| subrule5\n\nThe result is that the first production calls \"subrule1\" and \"subrule2\" of the current parser, and\nthe remaining productions call the named subrules from $::otherparser. Note, however, that\n\"Bad Things\" will happen if $::otherparser isn't a blessed reference and/or doesn't have\nmethods with the same names as the required subrules!\n\n$thisrule\nA reference to the \"Parse::RecDescent::Rule\" object corresponding to the rule currently\nbeing matched.\n\n$thisprod\nA reference to the \"Parse::RecDescent::Production\" object corresponding to the production\ncurrently being matched.\n\n$score and $scorereturn\n$score stores the best production score to date, as specified by an earlier \"<score:...>\"\ndirective. $scorereturn stores the corresponding return value for the successful\nproduction.\n\nSee \"Scored productions\".\n\nWarning: the parser relies on the information in the various \"this...\" objects in some\nnon-obvious ways. Tinkering with the other members of these objects will probably cause Bad\nThings to happen, unless you *really* know what you're doing. The only exception to this advice\nis that the use of \"$this...->{local}\" is always safe.\n"
                },
                {
                    "name": "Start-up Actions",
                    "content": "Any actions which appear *before* the first rule definition in a grammar are treated as\n\"start-up\" actions. Each such action is stripped of its outermost brackets and then evaluated\n(in the parser's special namespace) just before the rules of the grammar are first compiled.\n\nThe main use of start-up actions is to declare local variables within the parser's special\nnamespace:\n\n{ my $lastitem = '???'; }\n\nlist: item(s)   { $return = $lastitem }\n\nitem: book  { $lastitem = 'book'; }\n| bell  { $lastitem = 'bell'; }\n| candle    { $lastitem = 'candle'; }\n\nbut start-up actions can be used to execute *any* valid Perl code within a parser's special\nnamespace.\n\nStart-up actions can appear within a grammar extension or replacement (that is, a partial\ngrammar installed via \"Parse::RecDescent::Extend()\" or \"Parse::RecDescent::Replace()\" - see\n\"Incremental Parsing\"), and will be executed before the new grammar is installed. Note, however,\nthat a particular start-up action is only ever executed once.\n"
                },
                {
                    "name": "Autoactions",
                    "content": "It is sometimes desirable to be able to specify a default action to be taken at the end of every\nproduction (for example, in order to easily build a parse tree). If the variable\n$::RDAUTOACTION is defined when \"Parse::RecDescent::new()\" is called, the contents of that\nvariable are treated as a specification of an action which is to be appended to each production in\nthe corresponding grammar.\n\nAlternatively, you can hard-code the autoaction within a grammar, using the \"<autoaction:...>\"\ndirective.\n\nSo, for example, to construct a simple parse tree you could write:\n\n$::RDAUTOACTION = q { [@item] };\n\n$parser = Parse::RecDescent->new(q{\nexpression: andexpr '||' expression | andexpr\nandexpr:   notexpr '&&' andexpr   | notexpr\nnotexpr:   '!' brackexpr       | brackexpr\nbrackexpr: '(' expression ')'       | identifier\nidentifier: /[a-z]+/i\n});\n\nor:\n\n$parser = Parse::RecDescent->new(q{\n<autoaction: { [@item] } >\n\nexpression: andexpr '||' expression | andexpr\nandexpr:   notexpr '&&' andexpr   | notexpr\nnotexpr:   '!' brackexpr       | brackexpr\nbrackexpr: '(' expression ')'       | identifier\nidentifier: /[a-z]+/i\n});\n\nEither of these is equivalent to:\n\n$parser = new Parse::RecDescent (q{\nexpression: andexpr '||' expression\n{ [@item] }\n| andexpr\n{ [@item] }\n\nandexpr:   notexpr '&&' andexpr\n{ [@item] }\n|   notexpr\n{ [@item] }\n\nnotexpr:   '!' brackexpr\n{ [@item] }\n|   brackexpr\n{ [@item] }\n\nbrackexpr: '(' expression ')'\n{ [@item] }\n| identifier\n{ [@item] }\n\nidentifier: /[a-z]+/i\n{ [@item] }\n});\n\nAlternatively, we could take an object-oriented approach, using different classes for each node\n(and also eliminating redundant intermediate nodes):\n\n$::RDAUTOACTION = q\n{ $#item==1 ? $item[1] : \"$item[0]node\"->new(@item[1..$#item]) };\n\n$parser = Parse::RecDescent->new(q{\nexpression: andexpr '||' expression | andexpr\nandexpr:   notexpr '&&' andexpr   | notexpr\nnotexpr:   '!' 
brackexpr           | brackexpr\nbrackexpr: '(' expression ')'       | identifier\nidentifier: /[a-z]+/i\n});\n\nor:\n\n$parser = Parse::RecDescent->new(q{\n<autoaction:\n$#item==1 ? $item[1] : \"$item[0]node\"->new(@item[1..$#item])\n>\n\nexpression: andexpr '||' expression | andexpr\nandexpr:   notexpr '&&' andexpr   | notexpr\nnotexpr:   '!' brackexpr           | brackexpr\nbrackexpr: '(' expression ')'       | identifier\nidentifier: /[a-z]+/i\n});\n\nwhich are equivalent to:\n\n$parser = Parse::RecDescent->new(q{\nexpression: andexpr '||' expression\n{ \"expressionnode\"->new(@item[1..3]) }\n| andexpr\n\nandexpr:   notexpr '&&' andexpr\n{ \"andexprnode\"->new(@item[1..3]) }\n|   notexpr\n\nnotexpr:   '!' brackexpr\n{ \"notexprnode\"->new(@item[1..2]) }\n|   brackexpr\n\nbrackexpr: '(' expression ')'\n{ \"brackexprnode\"->new(@item[1..3]) }\n| identifier\n\nidentifier: /[a-z]+/i\n{ \"identifiernode\"->new(@item[1]) }\n});\n\nNote that, if a production already ends in an action, no autoaction is appended to it. For\nexample, in this version:\n\n$::RDAUTOACTION = q\n{ $#item==1 ? $item[1] : \"$item[0]node\"->new(@item[1..$#item]) };\n\n$parser = Parse::RecDescent->new(q{\nexpression: andexpr '||' expression | andexpr\nandexpr:   notexpr '&&' andexpr   | notexpr\nnotexpr:   '!' brackexpr           | brackexpr\nbrackexpr: '(' expression ')'       | identifier\nidentifier: /[a-z]+/i\n{ 'terminalnode'->new($item[1]) }\n});\n\neach \"identifier\" match produces a \"terminalnode\" object, *not* an \"identifiernode\" object.\n\nA level 1 warning is issued each time an \"autoaction\" is added to some production.\n"
                },
                {
                    "name": "Autotrees",
                    "content": "A commonly needed autoaction is one that builds a parse-tree. It is moderately tricky to set up\nsuch an action (which must treat terminals differently from non-terminals), so Parse::RecDescent\nsimplifies the process by providing the \"<autotree>\" directive.\n\nIf this directive appears at the start of grammar, it causes Parse::RecDescent to insert\nautoactions at the end of any rule except those which already end in an action. The action\ninserted depends on whether the production is an intermediate rule (two or more items), or a\nterminal of the grammar (i.e. a single pattern or string item).\n\nSo, for example, the following grammar:\n\n<autotree>\n\nfile    : command(s)\ncommand : get | set | vet\nget : 'get' ident ';'\nset : 'set' ident 'to' value ';'\nvet : 'check' ident 'is' value ';'\nident   : /\\w+/\nvalue   : /\\d+/\n\nis equivalent to:\n\nfile    : command(s)        { bless \\%item, $item[0] }\ncommand : get       { bless \\%item, $item[0] }\n| set           { bless \\%item, $item[0] }\n| vet           { bless \\%item, $item[0] }\nget : 'get' ident ';'   { bless \\%item, $item[0] }\nset : 'set' ident 'to' value ';'    { bless \\%item, $item[0] }\nvet : 'check' ident 'is' value ';'  { bless \\%item, $item[0] }\n\nident   : /\\w+/  { bless {VALUE=>$item[1]}, $item[0] }\nvalue   : /\\d+/  { bless {VALUE=>$item[1]}, $item[0] }\n\nNote that each node in the tree is blessed into a class of the same name as the rule itself.\nThis makes it easy to build object-oriented processors for the parse-trees that the grammar\nproduces. Note too that the last two rules produce special objects with the single attribute\n'VALUE'. 
This is because they consist solely of a single terminal.\n\nThis autoaction-ed grammar would then produce a parse tree in a data structure like this:\n\n{\nfile => {\ncommand => {\n[ get => {\nidentifier => { VALUE => 'a' },\n},\nset => {\nidentifier => { VALUE => 'b' },\nvalue      => { VALUE => '7' },\n},\nvet => {\nidentifier => { VALUE => 'b' },\nvalue      => { VALUE => '7' },\n},\n],\n},\n}\n}\n\n(except, of course, that each nested hash would also be blessed into the appropriate class).\n\nYou can also specify a base class for the \"<autotree>\" directive. The supplied prefix will be\nprepended to the rule names when creating tree nodes. The following are equivalent:\n\n<autotree:MyBase::Class>\n<autotree:MyBase::Class::>\n\nAnd will produce a root node blessed into the \"MyBase::Class::file\" package in the example\nabove.\n"
                },
                {
                    "name": "Autostubbing",
                    "content": "Normally, if a subrule appears in some production, but no rule of that name is ever defined in\nthe grammar, the production which refers to the non-existent subrule fails immediately. This\ntypically occurs as a result of misspellings, and is a sufficiently common occurrence that a\nwarning is generated for such situations.\n\nHowever, when prototyping a grammar it is sometimes useful to be able to use subrules before a\nproper specification of them is really possible. For example, a grammar might include a section\nlike:\n\nfunctioncall: identifier '(' arg(s?) ')'\n\nidentifier: /[a-z]\\w*/i\n\nwhere the possible format of an argument is sufficiently complex that it is not worth specifying\nin full until the general function call syntax has been debugged. In this situation it is\nconvenient to leave the real rule \"arg\" undefined and just slip in a placeholder (or \"stub\"):\n\narg: 'arg'\n\nso that the function call syntax can be tested with dummy input such as:\n\nf0()\nf1(arg)\nf2(arg arg)\nf3(arg arg arg)\n\net cetera.\n\nEarly in prototyping, many such \"stubs\" may be required, so \"Parse::RecDescent\" provides a means\nof automating their definition. If the variable $::RDAUTOSTUB is defined when a parser is\nbuilt, a subrule reference to any non-existent rule (say, \"subrule\"), will cause a \"stub\" rule\nto be automatically defined in the generated parser. If \"$::RDAUTOSTUB eq '1'\" or is false, a\nstub rule of the form:\n\nsubrule: 'subrule'\n\nwill be generated. The special-case for a value of '1' is to allow the use of the perl -s with\n-RDAUTOSTUB without generating \"subrule: '1'\" per below. If $::RDAUTOSTUB is true, a stub rule\nof the form:\n\nsubrule: $::RDAUTOSTUB\n\nwill be generated. $::RDAUTOSTUB must contain a valid production item, no checking is\nperformed. 
No lazy evaluation of $::RDAUTOSTUB is performed; it is evaluated at the time the\nparser is generated.\n\nHence, with $::RDAUTOSTUB defined, it is possible to only partially specify a grammar, and then\n\"fake\" matches of the unspecified (sub)rules by just typing in their name, or a literal value\nthat was assigned to $::RDAUTOSTUB.\n"
                },
                {
                    "name": "Look-ahead",
                    "content": "If a subrule, token, or action is prefixed by \"...\", then it is treated as a \"look-ahead\"\nrequest. That means that the current production can (as usual) only succeed if the specified\nitem is matched, but that the matching *does not consume any of the text being parsed*. This is\nvery similar to the \"/(?=...)/\" look-ahead construct in Perl patterns. Thus, the rule:\n\ninnerword: word ...word\n\nwill match whatever the subrule \"word\" matches, provided that match is followed by some more\ntext which subrule \"word\" would also match (although this second substring is not actually\nconsumed by \"innerword\")\n\nLikewise, a \"...!\" prefix, causes the following item to succeed (without consuming any text) if\nand only if it would normally fail. Hence, a rule such as:\n\nidentifier: ...!keyword ...!'' /[A-Za-z]\\w*/\n\nmatches a string of characters which satisfies the pattern \"/[A-Za-z]\\w*/\", but only if the\nsame sequence of characters would not match either subrule \"keyword\" or the literal token ''.\n\nSequences of look-ahead prefixes accumulate, multiplying their positive and/or negative senses.\nHence:\n\ninnerword: word ...!......!word\n\nis exactly equivalent to the original example above (a warning is issued in cases like these,\nsince they often indicate something left out, or misunderstood).\n\nNote that actions can also be treated as look-aheads. In such cases, the state of the parser\ntext (in the local variable $text) *after* the look-ahead action is guaranteed to be identical\nto its state *before* the action, regardless of how it's changed *within* the action (unless you\nactually undefine $text, in which case you get the disaster you deserve :-).\n"
                },
                {
                    "name": "Directives",
                    "content": "Directives are special pre-defined actions which may be used to alter the behaviour of the\nparser. There are currently twenty-three directives: \"<commit>\", \"<uncommit>\", \"<reject>\",\n\"<score>\", \"<autoscore>\", \"<skip>\", \"<resync>\", \"<error>\", \"<warn>\", \"<hint>\", \"<tracebuild>\",\n\"<traceparse>\", \"<nocheck>\", \"<rulevar>\", \"<matchrule>\", \"<leftop>\", \"<rightop>\", \"<defer>\",\n\"<nocheck>\", \"<perlquotelike>\", \"<perlcodeblock>\", \"<perlvariable>\", and \"<token>\".\n\nCommitting and uncommitting\nThe \"<commit>\" and \"<uncommit>\" directives permit the recursive descent of the parse tree to\nbe pruned (or \"cut\") for efficiency. Within a rule, a \"<commit>\" directive instructs the\nrule to ignore subsequent productions if the current production fails. For example:\n\ncommand: 'find' <commit> filename\n| 'open' <commit> filename\n| 'move' filename filename\n\nClearly, if the leading token 'find' is matched in the first production but that production\nfails for some other reason, then the remaining productions cannot possibly match. The\npresence of the \"<commit>\" causes the \"command\" rule to fail immediately if an invalid\n\"find\" command is found, and likewise if an invalid \"open\" command is encountered.\n\nIt is also possible to revoke a previous commitment. For example:\n\nifstatement: 'if' <commit> condition\n'then' block <uncommit>\n'else' block\n| 'if' <commit> condition\n'then' block\n\nIn this case, a failure to find an \"else\" block in the first production shouldn't preclude\ntrying the second production, but a failure to find a \"condition\" certainly should.\n\nAs a special case, any production in which the *first* item is an \"<uncommit>\" immediately\nrevokes a preceding \"<commit>\" (even though the production would not otherwise have been\ntried). 
For example, in the rule:\n\nrequest: 'explain' expression\n| 'explain' <commit> keyword\n| 'save'\n| 'quit'\n| <uncommit> term '?'\n\nif the text being matched was \"explain?\", and the first two productions failed, then the\n\"<commit>\" in production two would cause productions three and four to be skipped, but the\nleading \"<uncommit>\" in production five would allow that production to attempt a match.\n\nNote that, in the preceding example, the \"<commit>\" was only placed in production two. If\nproduction one had been:\n\nrequest: 'explain' <commit> expression\n\nthen production two would be (inappropriately) skipped if a leading \"explain...\" was\nencountered.\n\nBoth \"<commit>\" and \"<uncommit>\" directives always succeed, and their value is always 1.\n\nRejecting a production\nThe \"<reject>\" directive immediately causes the current production to fail (it is exactly\nequivalent to, but more obvious than, the action \"{undef}\"). A \"<reject>\" is useful when it\nis desirable to get the side effects of the actions in one production, without prejudicing a\nmatch by some other production later in the rule. For example, to insert tracing code into\nthe parse:\n\ncomplexrule: { print \"In complex rule...\\n\"; } <reject>\n\ncomplexrule: simplerule '+' 'i' '*' simplerule\n| 'i' '*' simplerule\n| simplerule\n\nIt is also possible to specify a conditional rejection, using the form\n\"<reject:*condition*>\", which only rejects if the specified condition is true. This form of\nrejection is exactly equivalent to the action \"{(*condition*)?undef:1}\". For example:\n\ncommand: savecommand\n| restorecommand\n| <reject: defined $::tolerant> { exit }\n| <error: Unknown command. Ignored.>\n\nA \"<reject>\" directive never succeeds (and hence has no associated value). 
A conditional\nrejection may succeed (if its condition is not satisfied), in which case its value is 1.\n\nAs an extra optimization, \"Parse::RecDescent\" ignores any production which *begins* with an\nunconditional \"<reject>\" directive, since any such production can never successfully match\nor have any useful side-effects. A level 1 warning is issued in all such cases.\n\nNote that productions beginning with conditional \"<reject:...>\" directives are *never*\n\"optimized away\" in this manner, even if they are always guaranteed to fail (for example:\n\"<reject:1>\").\n\nDue to the way grammars are parsed, there is a minor restriction on the condition of a\nconditional \"<reject:...>\": it cannot contain any raw '<' or '>' characters. For example:\n\nline: cmd <reject: $thiscolumn > max> data\n\nresults in an error when a parser is built from this grammar (since the grammar parser has\nno way of knowing whether the first > is a \"greater than\" or the end of the \"<reject:...>\").\n\nTo overcome this problem, put the condition inside a do{} block:\n\nline: cmd <reject: do{$thiscolumn > max}> data\n\nNote that the same problem may occur in other directives that take arguments. The same\nsolution will work in all cases.\n\nSkipping between terminals\nThe \"<skip>\" directive enables the terminal prefix used in a production to be changed. For\nexample:\n\nOneLiner: Command <skip:'[ \\t]*'> Arg(s) /;/\n\ncauses only blanks and tabs to be skipped before terminals in the \"Arg\" subrule (and any of\n*its* subrules), and also before the final \"/;/\" terminal. Once the production is complete,\nthe previous terminal prefix is reinstated. 
Note that this implies that distinct productions\nof a rule must reset their terminal prefixes individually.\n\nThe \"<skip>\" directive evaluates to the *previous* terminal prefix, so it's easy to\nreinstate a prefix later in a production:\n\nCommand: <skip:\",\"> CSV(s) <skip:$item[1]> Modifier\n\nThe value specified after the colon is interpolated into a pattern, so all of the following\nare equivalent (though their efficiency increases down the list):\n\n<skip: \"$colon|$comma\">   # ASSUMING THE VARS HOLD THE OBVIOUS VALUES\n\n<skip: ':|,'>\n\n<skip: q{[:,]}>\n\n<skip: qr/[:,]/>\n\nThere is no way of directly setting the prefix for an entire rule, except as follows:\n\nRule: <skip: '[ \\t]*'> Prod1\n| <skip: '[ \\t]*'> Prod2a Prod2b\n| <skip: '[ \\t]*'> Prod3\n\nor, better:\n\nRule: <skip: '[ \\t]*'>\n(\nProd1\n| Prod2a Prod2b\n| Prod3\n)\n\nThe skip pattern is passed down to subrules, so setting the skip for the top-level rule as\ndescribed above actually sets the prefix for the entire grammar (provided that you only call\nthe method corresponding to the top-level rule itself). Alternatively, or if you have more\nthan one top-level rule in your grammar, you can provide a global \"<skip>\" directive prior\nto defining any rules in the grammar. These are the preferred alternatives to setting\n$Parse::RecDescent::skip.\n\nAdditionally, using \"<skip>\" actually allows you to have a completely dynamic skipping\nbehaviour. For example:\n\nRulewithdynamicskip: <skip: $::skippattern> Rule\n\nThen you can set $::skippattern before invoking \"Rulewithdynamicskip\" and have it skip\nwhatever you specified.\n\nNote: Up to release 1.51 of Parse::RecDescent, an entirely different mechanism was used for\nspecifying terminal prefixes. The current method is not backwards-compatible with that early\napproach. 
The current approach is stable and will not change again.\n\nNote: the global \"<skip>\" directive added in 1.967004 did not interpolate the pattern\nargument; instead, the pattern was placed inside of single quotes and then interpolated. This\nbehavior was changed in 1.967010 so that all \"<skip>\" directives behave similarly.\n\nResynchronization\nThe \"<resync>\" directive provides a visually distinctive means of consuming some of the text\nbeing parsed, usually to skip an erroneous input. In its simplest form \"<resync>\" simply\nconsumes text up to and including the next newline (\"\\n\") character, succeeding only if the\nnewline is found, in which case it causes its surrounding rule to return zero on success.\n\nIn other words, a \"<resync>\" is exactly equivalent to the token \"/[^\\n]*\\n/\" followed by the\naction \"{ $return = 0 }\" (except that productions beginning with a \"<resync>\" are ignored\nwhen generating error messages). A typical use might be:\n\nscript : command(s)\n\ncommand: savecommand\n| restorecommand\n| <resync> # TRY NEXT LINE, IF POSSIBLE\n\nIt is also possible to explicitly specify a resynchronization pattern, using the\n\"<resync:*pattern*>\" variant. This version succeeds only if the specified pattern matches\n(and consumes) the parsed text. In other words, \"<resync:*pattern*>\" is exactly equivalent\nto the token \"/*pattern*/\" (followed by a \"{ $return = 0 }\" action). For example, if\ncommands were terminated by newlines or semi-colons:\n\ncommand: savecommand\n| restorecommand\n| <resync:[^;\\n]*[;\\n]>\n\nThe value of a successfully matched \"<resync>\" directive (of either type) is the text that\nit consumed. Note, however, that since the directive also sets $return, a production\nconsisting of a lone \"<resync>\" succeeds but returns the value zero (which a calling rule\nmay find useful to distinguish between \"true\" matches and \"tolerant\" matches). 
Remember that\nreturning a zero value indicates that the rule *succeeded* (since only an \"undef\" denotes\nfailure within \"Parse::RecDescent\" parsers).\n\nError handling\nThe \"<error>\" directive provides automatic or user-defined generation of error messages\nduring a parse. In its simplest form \"<error>\" prepares an error message based on the\nmismatch between the last item expected and the text which caused it to fail. For example,\ngiven the rule:\n\nMcCoy: curse ',' name ', I'm a doctor, not a' a_profession '!'\n| pronoun 'dead,' name '!'\n| <error>\n\nthe following strings would produce the following messages:\n\n\"Amen, Jim!\"\nERROR (line 1): Invalid McCoy: Expected curse or pronoun\nnot found\n\n\"Dammit, Jim, I'm a doctor!\"\nERROR (line 1): Invalid McCoy: Expected \", I'm a doctor, not a\"\nbut found \", I'm a doctor!\" instead\n\n\"He's dead,\\n\"\nERROR (line 2): Invalid McCoy: Expected name not found\n\n\"He's alive!\"\nERROR (line 1): Invalid McCoy: Expected 'dead,' but found\n\"alive!\" instead\n\n\"Dammit, Jim, I'm a doctor, not a pointy-eared Vulcan!\"\nERROR (line 1): Invalid McCoy: Expected a profession but found\n\"pointy-eared Vulcan!\" instead\n\nNote that, when autogenerating error messages, all underscores in any rule name used in a\nmessage are replaced by single spaces (for example \"a_production\" becomes \"a production\").\nJudicious choice of rule names can therefore considerably improve the readability of\nautomatic error messages (as well as the maintainability of the original grammar).\n\nIf the automatically generated error is not sufficient, it is possible to provide an\nexplicit message as part of the error directive. 
For example:\n\nSpock: \"Fascinating\" ',' (name | 'Captain') '.'\n| \"Highly illogical, doctor.\"\n| <error: He never said that!>\n\nwhich would result in *all* failures to parse a \"Spock\" subrule printing the following\nmessage:\n\nERROR (line <N>): Invalid Spock:  He never said that!\n\nThe error message is treated as a \"qq{...}\" string and interpolated when the error is\ngenerated (*not* when the directive is specified!). Hence:\n\n<error: Mystical error near \"$text\">\n\nwould correctly insert the ambient text string which caused the error.\n\nThere are two other forms of error directive: \"<error?>\" and \"<error?: msg>\". These behave\njust like \"<error>\" and \"<error: msg>\" respectively, except that they are only triggered if\nthe rule is \"committed\" at the time they are encountered. For example:\n\nScotty: \"Ya kenna change the Laws of Phusics,\" <commit> name\n| name <commit> ',' 'she's goanta blaw!'\n| <error?>\n\nwill only generate an error for a string beginning with \"Ya kenna change the Laws of\nPhusics,\" or a valid name, but which still fails to match the corresponding production. That\nis, \"$parser->Scotty(\"Aye, Cap'ain\")\" will fail silently (since neither production will\n\"commit\" the rule on that input), whereas\n\"$parser->Scotty(\"Mr Spock, ah jest kenna do'ut!\")\" will fail with the error message:\n\nERROR (line 1): Invalid Scotty: expected 'she's goanta blaw!'\nbut found 'ah jest kenna do'ut!' 
instead.\n\nsince in that case the second production would commit after matching the leading name.\n\nNote that to allow this behaviour, all \"<error>\" directives which are the first item in a\nproduction automatically uncommit the rule just long enough to allow their production to be\nattempted (that is, when their production fails, the commitment is reinstated so that\nsubsequent productions are skipped).\n\nIn order to *permanently* uncommit the rule before an error message, it is necessary to put\nan explicit \"<uncommit>\" before the \"<error>\". For example:\n\nline: 'Kirk:'  <commit> Kirk\n| 'Spock:' <commit> Spock\n| 'McCoy:' <commit> McCoy\n| <uncommit> <error?> <reject>\n| <resync>\n\nError messages generated by the various \"<error...>\" directives are not displayed\nimmediately. Instead, they are \"queued\" in a buffer and are only displayed once parsing\nultimately fails. Moreover, \"<error...>\" directives that cause one production of a rule to\nfail are automatically removed from the message queue if another production subsequently\ncauses the entire rule to succeed. This means that you can put \"<error...>\" directives\nwherever useful diagnosis can be done, and only those associated with actual parser failure\nwill ever be displayed. Also see \"GOTCHAS\".\n\nAs a general rule, the most useful diagnostics are usually generated either at the very\nlowest level within the grammar, or at the very highest. A good rule of thumb is to identify\nthose subrules which consist mainly (or entirely) of terminals, and then put an \"<error...>\"\ndirective at the end of any other rule which calls one or more of those subrules.\n\nThere is one other situation in which the output of the various types of error directive is\nsuppressed; namely, when the rule containing them is being parsed as part of a \"look-ahead\"\n(see \"Look-ahead\"). 
In this case, the error directive will still cause the rule to fail, but\nwill do so silently.\n\nAn unconditional \"<error>\" directive always fails (and hence has no associated value). This\nmeans that encountering such a directive always causes the production containing it to fail.\nHence an \"<error>\" directive will inevitably be the last (useful) item of a rule (a level 3\nwarning is issued if a production contains items after an unconditional \"<error>\"\ndirective).\n\nAn \"<error?>\" directive will *succeed* (that is: fail to fail :-), if the current rule is\nuncommitted when the directive is encountered. In that case the directive's associated value\nis zero. Hence, this type of error directive *can* be used before the end of a production.\nFor example:\n\ncommand: 'do' <commit> something\n| 'report' <commit> something\n| <error?: Syntax error> <error: Unknown command>\n\nWarning: The \"<error?>\" directive does *not* mean \"always fail (but do so silently unless\ncommitted)\". It actually means \"only fail (and report) if committed, otherwise *succeed*\".\nTo achieve the \"fail silently if uncommitted\" semantics, it is necessary to use:\n\nrule: item <commit> item(s)\n| <error?> <reject>  # FAIL SILENTLY UNLESS COMMITTED\n\nHowever, because people seem to expect a lone \"<error?>\" directive to work like this:\n\nrule: item <commit> item(s)\n| <error?: Error message if committed>\n| <error:  Error message if uncommitted>\n\nParse::RecDescent automatically appends a \"<reject>\" directive if the \"<error?>\" directive\nis the only item in a production. A level 2 warning (see below) is issued when this happens.\n\nThe level of error reporting during both parser construction and parsing is controlled by\nthe presence or absence of four global variables: $::RDERRORS, $::RDWARN, $::RDHINT, and\n$::RDTRACE. 
If $::RDERRORS is defined (and, by default, it is) then fatal errors are\nreported.\n\nWhenever $::RDWARN is defined, certain non-fatal problems are also reported.\n\nWarnings have an associated \"level\": 1, 2, or 3. The higher the level, the more serious the\nwarning. The value of the corresponding global variable ($::RDWARN) determines the *lowest*\nlevel of warning to be displayed. Hence, to see *all* warnings, set $::RDWARN to 1. To see\nonly the most serious warnings set $::RDWARN to 3. By default $::RDWARN is initialized to\n3, ensuring that serious but non-fatal errors are automatically reported.\n\nThere is also a grammar directive to turn on warnings from within the grammar: \"<warn>\". It\ntakes an optional argument, which specifies the warning level: \"<warn: 2>\".\n\nSee \"DIAGNOSTICS\" for a list of the various error and warning messages that\nParse::RecDescent generates when these two variables are defined.\n\nDefining any of the remaining variables (which are not defined by default) further increases\nthe amount of information reported. Defining $::RDHINT causes the parser generator to offer\nmore detailed analyses and hints on both errors and warnings. Note that setting $::RDHINT\nat any point automagically sets $::RDWARN to 1. There is also a \"<hint>\" directive, which\ncan be hard-coded into a grammar.\n\nDefining $::RDTRACE causes the parser generator and the parser to report their progress to\nSTDERR in excruciating detail (although, without hints unless $::RDHINT is separately\ndefined). This detail can be moderated in only one respect: if $::RDTRACE has an integer\nvalue (*N*) greater than 1, only the *N* characters of the \"current parsing context\" (that\nis, where in the input string we are at any point in the parse) is reported at any time.\n\n$::RDTRACE is mainly useful for debugging a grammar that isn't behaving as you expected it\nto. 
To this end, if $::RDTRACE is defined when a parser is built, any actual parser code\nwhich is generated is also written to a file named \"RDTRACE\" in the local directory.\n\nThere are two directives associated with the $::RDTRACE variable. If a grammar contains a\n\"<tracebuild>\" directive anywhere in its specification, $::RDTRACE is turned on during the\nparser construction phase. If a grammar contains a \"<traceparse>\" directive anywhere in its\nspecification, $::RDTRACE is turned on during any parse the parser performs.\n\nNote that the four variables belong to the \"main\" package, which makes them easier to refer\nto in the code controlling the parser, and also makes it easy to turn them into command line\nflags (\"-RDERRORS\", \"-RDWARN\", \"-RDHINT\", \"-RDTRACE\") under perl -s.\n\nThe corresponding directives are useful to \"hardwire\" the various debugging features into a\nparticular grammar (rather than having to set and reset external variables).\n\nRedirecting diagnostics\nThe diagnostics provided by the tracing mechanism always go to STDERR. If you need them to\ngo elsewhere, localize and reopen STDERR prior to the parse.\n\nFor example:\n\n{\nlocal *STDERR = IO::File->new(\">$filename\") or die $!;\n\nmy $result = $parser->startrule($text);\n}\n\nConsistency checks\nWhenever a parser is built, Parse::RecDescent carries out a number of (potentially\nexpensive) consistency checks. These include: verifying that the grammar is not\nleft-recursive and that no rules have been left undefined.\n\nThese checks are important safeguards during development, but unnecessary overheads when the\ngrammar is stable and ready to be deployed. So Parse::RecDescent provides a directive to\ndisable them: \"<nocheck>\".\n\nIf a grammar contains a \"<nocheck>\" directive anywhere in its specification, the extra\ncompile-time checks are by-passed.\n\nSpecifying local variables\nIt is occasionally convenient to specify variables which are local to a single rule. 
This\nmay be achieved by including a \"<rulevar:...>\" directive anywhere in the rule. For example:\n\nmarkup: <rulevar: $tag>\n\nmarkup: tag {($tag=$item[1]) =~ s/^<|>$//g} body[$tag]\n\nThe example \"<rulevar: $tag>\" directive causes a \"my\" variable named $tag to be declared at\nthe start of the subroutine implementing the \"markup\" rule (that is, *before* the first\nproduction, regardless of where in the rule it is specified).\n\nSpecifically, any directive of the form: \"<rulevar:*text*>\" causes a line of the form \"my\n*text*;\" to be added at the beginning of the rule subroutine, immediately after the\ndefinitions of the following local variables:\n\n$thisparser $commit\n$thisrule   @item\n$thisline   @arg\n$text   %arg\n\nThis means that the following \"<rulevar>\" directives work as expected:\n\n<rulevar: $count = 0 >\n\n<rulevar: $firstarg = $arg[0] || '' >\n\n<rulevar: $myItems = \\@item >\n\n<rulevar: @context = ( $thisline, $text, @arg ) >\n\n<rulevar: ($name,$age) = @arg{\"name\",\"age\"} >\n\nIf a variable that is also visible to subrules is required, it needs to be \"local\"'d, not\n\"my\"'d. \"rulevar\" defaults to \"my\", but if \"local\" is explicitly specified:\n\n<rulevar: local $count = 0 >\n\nthen a \"local\"-ized variable is declared instead, and will be available within subrules.\n\nNote however that, because all such variables are \"my\" variables, their values *do not\npersist* between match attempts on a given rule. To preserve values between match attempts,\nvalues can be stored within the \"local\" member of the $thisrule object:\n\ncountedrule: { $thisrule->{\"local\"}{\"count\"}++ }\n<reject>\n| subrule1\n| subrule2\n| <reject: $thisrule->{\"local\"}{\"count\"} == 1>\nsubrule3\n\nWhen matching a rule, each \"<rulevar>\" directive is matched as if it were an unconditional\n\"<reject>\" directive (that is, it causes any production in which it appears to immediately\nfail to match). 
For this reason (and to improve readability) it is usual to specify any\n\"<rulevar>\" directive in a separate production at the start of the rule (this has the added\nadvantage that it enables \"Parse::RecDescent\" to optimize away such productions, just as it\ndoes for the \"<reject>\" directive).\n\nDynamically matched rules\nBecause regexes and double-quoted strings are interpolated, it is relatively easy to specify\nproductions with \"context sensitive\" tokens. For example:\n\ncommand:  keyword  body  \"end $item[1]\"\n\nwhich ensures that a command block is bounded by a \"*<keyword>*...end *<same keyword>*\"\npair.\n\nBuilding productions in which subrules are context sensitive is also possible, via the\n\"<matchrule:...>\" directive. This directive behaves identically to a subrule item, except\nthat the rule which is invoked to match it is determined by the string specified after the\ncolon. For example, we could rewrite the \"command\" rule like this:\n\ncommand:  keyword  <matchrule:body>  \"end $item[1]\"\n\nWhatever appears after the colon in the directive is treated as an interpolated string (that\nis, as if it appeared in a \"qq{...}\" operator) and the value of that interpolated string is\nthe name of the subrule to be matched.\n\nOf course, just putting a constant string like \"body\" in a \"<matchrule:...>\" directive is of\nlittle interest or benefit. The power of the directive is seen when we use a string that\ninterpolates to something interesting. For example:\n\ncommand:    keyword <matchrule:$item[1]body> \"end $item[1]\"\n\nkeyword:    'while' | 'if' | 'function'\n\nwhilebody: condition block\n\nifbody:    condition block ('else' block)(?)\n\nfunctionbody:  arglist block\n\nNow the \"command\" rule selects how to proceed on the basis of the keyword that is found. 
It\nis as if \"command\" were declared:\n\ncommand:    'while'    whilebody    \"end while\"\n|    'if'       ifbody   \"end if\"\n|    'function' functionbody \"end function\"\n\nWhen a \"<matchrule:...>\" directive is used as a repeated subrule, the rule name expression\nis \"late-bound\". That is, the name of the rule to be called is re-evaluated *each time* a\nmatch attempt is made. Hence, the following grammar:\n\n{ $::species = 'dogs' }\n\npair:   'two' <matchrule:$::species>(s)\n\ndogs:   /dogs/ { $::species = 'cats' }\n\ncats:   /cats/\n\nwill match the string \"two dogs cats cats\" completely, whereas it will only match the string\n\"two dogs dogs dogs\" up to the eighth letter. If the rule name were \"early bound\" (that is,\nevaluated only the first time the directive is encountered in a production), the reverse\nbehaviour would be expected.\n\nNote that the \"matchrule\" directive takes a string that is to be treated as a rule name,\n*not* as a rule invocation. That is, it's like a Perl symbolic reference, not an \"eval\".\nJust as you can say:\n\n$subname = 'foo';\n\n# and later...\n\n&{$subname}(@args);\n\nbut not:\n\n$subname = 'foo(@args)';\n\n# and later...\n\n&{$subname};\n\nlikewise you can say:\n\n$rulename = 'foo';\n\n# and in the grammar...\n\n<matchrule:$rulename>[@args]\n\nbut not:\n\n$rulename = 'foo[@args]';\n\n# and in the grammar...\n\n<matchrule:$rulename>\n\nDeferred actions\nThe \"<defer:...>\" directive is used to specify an action to be performed when (and only if!)\nthe current production ultimately succeeds.\n\nWhenever a \"<defer:...>\" directive appears, the code it specifies is converted to a closure\n(an anonymous subroutine reference) which is queued within the active parser object. Note\nthat, because the deferred code is converted to a closure, the values of any \"local\"\nvariable (such as $text, @item, etc.) 
are preserved until the deferred code is actually\nexecuted.\n\nIf the parse ultimately succeeds *and* the production in which the \"<defer:...>\" directive\nwas evaluated formed part of the successful parse, then the deferred code is executed\nimmediately before the parse returns. If however the production which queued a deferred\naction fails, or one of the higher-level rules which called that production fails, then the\ndeferred action is removed from the queue, and hence is never executed.\n\nFor example, given the grammar:\n\nsentence: noun trans noun\n| noun intrans\n\nnoun:     'the dog'\n{ print \"$item[1]\\t(noun)\\n\" }\n|     'the meat'\n{ print \"$item[1]\\t(noun)\\n\" }\n\ntrans:    'ate'\n{ print \"$item[1]\\t(transitive)\\n\" }\n\nintrans:  'ate'\n{ print \"$item[1]\\t(intransitive)\\n\" }\n|  'barked'\n{ print \"$item[1]\\t(intransitive)\\n\" }\n\nthen parsing the sentence \"the dog ate\" would produce the output:\n\nthe dog  (noun)\nate  (transitive)\nthe dog  (noun)\nate  (intransitive)\n\nThis is because, even though the first production of \"sentence\" ultimately fails, its\ninitial subrules \"noun\" and \"trans\" do match, and hence they execute their associated\nactions. 
Then the second production of \"sentence\" succeeds, causing the actions of the\nsubrules \"noun\" and \"intrans\" to be executed as well.\n\nOn the other hand, if the actions were replaced by \"<defer:...>\" directives:\n\nsentence: noun trans noun\n| noun intrans\n\nnoun:     'the dog'\n<defer: print \"$item[1]\\t(noun)\\n\" >\n|     'the meat'\n<defer: print \"$item[1]\\t(noun)\\n\" >\n\ntrans:    'ate'\n<defer: print \"$item[1]\\t(transitive)\\n\" >\n\nintrans:  'ate'\n<defer: print \"$item[1]\\t(intransitive)\\n\" >\n|  'barked'\n<defer: print \"$item[1]\\t(intransitive)\\n\" >\n\nthe output would be:\n\nthe dog  (noun)\nate  (intransitive)\n\nsince deferred actions are only executed if they were evaluated in a production which\nultimately contributes to the successful parse.\n\nIn this case, even though the first production of \"sentence\" caused the subrules \"noun\" and\n\"trans\" to match, that production ultimately failed and so the deferred actions queued by\nthose subrules were subsequently discarded. 
The second production then succeeded, causing\nthe entire parse to succeed, and so the deferred actions queued by the (second) match of the\n\"noun\" subrule and the subsequent match of \"intrans\" *are* preserved and eventually\nexecuted.\n\nDeferred actions provide a means of improving the performance of a parser, by only executing\nthose actions which are part of the final parse-tree for the input data.\n\nAlternatively, deferred actions can be viewed as a mechanism for building (and executing) a\ncustomized subroutine corresponding to the given input data, much in the same way that\nautoactions (see \"Autoactions\") can be used to build a customized data structure for\nspecific input.\n\nWhether or not the action it specifies is ever executed, a \"<defer:...>\" directive always\nsucceeds, returning the number of deferred actions currently queued at that point.\n\nParsing Perl\nParse::RecDescent provides limited support for parsing subsets of Perl, namely: quote-like\noperators, Perl variables, and complete code blocks.\n\nThe \"<perlquotelike>\" directive can be used to parse any Perl quote-like operator: 'a\nstring', \"m/a pattern/\", \"tr{ans}{lation}\", etc. 
It does this by calling\nText::Balanced::extract_quotelike().\n\nIf a quote-like operator is found, a reference to an array of eight elements is returned.\nThose elements are identical to the last eight elements returned by\nText::Balanced::extract_quotelike() in an array context, namely:\n\n[0] the name of the quotelike operator -- 'q', 'qq', 'm', 's', 'tr' -- if the operator was\nnamed; otherwise \"undef\",\n\n[1] the left delimiter of the first block of the operation,\n\n[2] the text of the first block of the operation (that is, the contents of a quote, the\nregex of a match or substitution, or the target list of a translation),\n\n[3] the right delimiter of the first block of the operation,\n\n[4] the left delimiter of the second block of the operation if there is one (that is, if it\nis a \"s\", \"tr\", or \"y\"); otherwise \"undef\",\n\n[5] the text of the second block of the operation if there is one (that is, the replacement\nof a substitution or the translation list of a translation); otherwise \"undef\",\n\n[6] the right delimiter of the second block of the operation (if any); otherwise \"undef\",\n\n[7] the trailing modifiers on the operation (if any); otherwise \"undef\".\n\nIf a quote-like expression is not found, the directive fails with the usual \"undef\" value.\n\nThe \"<perlvariable>\" directive can be used to parse any Perl variable: $scalar, @array,\n%hash, $ref->{field}[$index], etc. It does this by calling\nText::Balanced::extract_variable().\n\nIf the directive matches text representing a valid Perl variable specification, it returns\nthat text. Otherwise it fails with the usual \"undef\" value.\n\nThe \"<perlcodeblock>\" directive can be used to parse a curly-brace-delimited block of Perl\ncode, such as: { $a = 1; f() =~ m/pat/; }. 
It does this by calling\nText::Balanced::extract_codeblock().\n\nIf the directive matches text representing a valid Perl code block, it returns that text.\nOtherwise it fails with the usual \"undef\" value.\n\nYou can also tell it what kind of brackets to use as the outermost delimiters. For example:\n\narglist: <perlcodeblock ()>\n\ncauses an arglist to match a Perl code block whose outermost delimiters are \"(...)\" (rather\nthan the default \"{...}\").\n\nConstructing tokens\nEventually, Parse::RecDescent will be able to parse tokenized input, as well as ordinary\nstrings. In preparation for this joyous day, the \"<token:...>\" directive has been provided.\nThis directive creates a token which will be suitable for input to a Parse::RecDescent\nparser (when it eventually supports tokenized input).\n\nThe text of the token is the value of the immediately preceding item in the production. A\n\"<token:...>\" directive always succeeds with a return value which is the hash reference that\nis the new token. It also sets the return value for the production to that hash ref.\n\nThe \"<token:...>\" directive makes it easy to build a Parse::RecDescent-compatible lexer in\nParse::RecDescent:\n\nmy $lexer = new Parse::RecDescent q\n{\nlex:    token(s)\n\ntoken:  /a\\b/          <token:INDEF>\n|  /the\\b/        <token:DEF>\n|  /fly\\b/        <token:NOUN,VERB>\n|  /[a-z]+/i { lc $item[1] }  <token:ALPHA>\n|  <error: Unknown token>\n\n};\n\nwhich will eventually be able to be used with a regular Parse::RecDescent grammar:\n\nmy $parser = new Parse::RecDescent q\n{\nstartrule: subrule1 subrule2\n\n# ETC...\n};\n\neither with a pre-lexing phase:\n\n$parser->startrule( $lexer->lex($data) );\n\nor with a lex-on-demand approach:\n\n$parser->startrule( sub{$lexer->token(\\$data)} );\n\nBut at present, only the \"<token:...>\" directive is actually implemented. 
The rest is\nvapourware.\n\nSpecifying operations\nOne of the commonest requirements when building a parser is to specify binary operators.\nUnfortunately, in a normal grammar, the rules for such things are awkward:\n\ndisjunction:    conjunction ('or' conjunction)(s?)\n{ $return = [ $item[1], @{$item[2]} ] }\n\nconjunction:    atom ('and' atom)(s?)\n{ $return = [ $item[1], @{$item[2]} ] }\n\nor inefficient:\n\ndisjunction:    conjunction 'or' disjunction\n{ $return = [ $item[1], @{$item[2]} ] }\n|    conjunction\n{ $return = [ $item[1] ] }\n\nconjunction:    atom 'and' conjunction\n{ $return = [ $item[1], @{$item[2]} ] }\n|    atom\n{ $return = [ $item[1] ] }\n\nand either way is ugly and hard to get right.\n\nThe \"<leftop:...>\" and \"<rightop:...>\" directives provide an easier way of specifying such\noperations. Using \"<leftop:...>\" the above examples become:\n\ndisjunction:    <leftop: conjunction 'or' conjunction>\nconjunction:    <leftop: atom 'and' atom>\n\nThe \"<leftop:...>\" directive specifies a left-associative binary operator. It is specified\naround three other grammar elements (typically subrules or terminals), which match the left\noperand, the operator itself, and the right operand respectively.\n\nA \"<leftop:...>\" directive such as:\n\ndisjunction:    <leftop: conjunction 'or' conjunction>\n\nis converted to the following:\n\ndisjunction:    ( conjunction ('or' conjunction)(s?)\n{ $return = [ $item[1], @{$item[2]} ] } )\n\nIn other words, a \"<leftop:...>\" directive matches the left operand followed by zero or more\nrepetitions of both the operator and the right operand. 
It then flattens the matched items\ninto an anonymous array which becomes the (single) value of the entire \"<leftop:...>\"\ndirective.\n\nFor example, an \"<leftop:...>\" directive such as:\n\noutput:  <leftop: ident '<<' expr >\n\nwhen given a string such as:\n\ncout << var << \"str\" << 3\n\nwould match, and $item[1] would be set to:\n\n[ 'cout', 'var', '\"str\"', '3' ]\n\nIn other words:\n\noutput:  <leftop: ident '<<' expr >\n\nis equivalent to a left-associative operator:\n\noutput:  ident          { $return = [$item[1]]   }\n|  ident '<<' expr        { $return = [@item[1,3]]     }\n|  ident '<<' expr '<<' expr      { $return = [@item[1,3,5]]   }\n|  ident '<<' expr '<<' expr '<<' expr    { $return = [@item[1,3,5,7]] }\n#  ...etc...\n\nSimilarly, the \"<rightop:...>\" directive takes a left operand, an operator, and a right\noperand:\n\nassign:  <rightop: var '=' expr >\n\nand converts them to:\n\nassign:  ( (var '=' {$return=$item[1]})(s?) expr\n{ $return = [ @{$item[1]}, $item[2] ] } )\n\nwhich is equivalent to a right-associative operator:\n\nassign:  expr       { $return = [$item[1]]       }\n|  var '=' expr       { $return = [@item[1,3]]     }\n|  var '=' var '=' expr   { $return = [@item[1,3,5]]   }\n|  var '=' var '=' var '=' expr   { $return = [@item[1,3,5,7]] }\n#  ...etc...\n\nNote that for both the \"<leftop:...>\" and \"<rightop:...>\" directives, the directive does not\nnormally return the operator itself, just a list of the operands involved. This is\nparticularly handy for specifying lists:\n\nlist: '(' <leftop: listitem ',' listitem> ')'\n{ $return = $item[2] }\n\nThere is, however, a problem: sometimes the operator is itself significant. For example, in\na Perl list a comma and a \"=>\" are both valid separators, but the \"=>\" has additional\nstringification semantics. 
Hence it's important to know which was used in each case.\n\nTo solve this problem the \"<leftop:...>\" and \"<rightop:...>\" directives *do* return the\noperator(s) as well, under two circumstances. The first case is where the operator is\nspecified as a subrule. In that instance, whatever the operator matches is returned (on the\nassumption that if the operator is important enough to have its own subrule, then it's\nimportant enough to return).\n\nThe second case is where the operator is specified as a regular expression. In that case, if\nthe first bracketed subpattern of the regular expression matches, that matching value is\nreturned (this is analogous to the behaviour of the Perl \"split\" function, except that only\nthe first subpattern is returned).\n\nIn other words, given the input:\n\n( a=>1, b=>2 )\n\nthe specifications:\n\nlist:      '('  <leftop: listitem separator listitem>  ')'\n\nseparator: ',' | '=>'\n\nor:\n\nlist:      '('  <leftop: listitem /(,|=>)/ listitem>  ')'\n\ncause the list separators to be interleaved with the operands in the anonymous array in\n$item[2]:\n\n[ 'a', '=>', '1', ',', 'b', '=>', '2' ]\n\nBut the following version:\n\nlist:      '('  <leftop: listitem /,|=>/ listitem>  ')'\n\nreturns only the operands:\n\n[ 'a', '1', 'b', '2' ]\n\nOf course, none of the above specifications handles the case of an empty list, since the\n\"<leftop:...>\" and \"<rightop:...>\" directives require at least a single right or left\noperand to match. To specify that the operator can match \"trivially\", it's necessary to add\na \"(s?)\" qualifier to the directive:\n\nlist:      '('  <leftop: listitem /(,|=>)/ listitem>(s?)  ')'\n\nNote that in almost all the above examples, the first and third arguments of the\n\"<leftop:...>\" directive were the same subrule. That is because \"<leftop:...>\" directives are\nfrequently used to specify \"separated\" lists of the same type of item. 
To make such lists\neasier to specify, the following syntax:\n\nlist:   element(s /,/)\n\nis exactly equivalent to:\n\nlist:   <leftop: element /,/ element>\n\nNote that the separator must be specified as a raw pattern (i.e. not a string or subrule).\n\nScored productions\nBy default, Parse::RecDescent grammar rules always accept the first production that matches\nthe input. But if two or more productions may potentially match the same input, choosing the\nfirst that does so may not be optimal.\n\nFor example, if you were parsing the sentence \"time flies like an arrow\", you might use a\nrule like this:\n\nsentence: verb noun preposition article noun { [@item] }\n| adjective noun verb article noun   { [@item] }\n| noun verb preposition article noun { [@item] }\n\nEach of these productions matches the sentence, but the third one is the most likely\ninterpretation. However, if the sentence had been \"fruit flies like a banana\", then the\nsecond production would probably be the right match.\n\nTo cater for such situations, the \"<score:...>\" directive can be used. It is equivalent to\nan unconditional \"<reject>\", except that it allows you to specify a \"score\" for the current\nproduction. If that score is numerically greater than the best score of any preceding\nproduction, the current production is cached for later consideration. If no later production\nmatches, then the cached production is treated as having matched, and the value of the item\nimmediately before its \"<score:...>\" directive is returned as the result.\n\nIn other words, by putting a \"<score:...>\" directive at the end of each production, you can\nselect which production matches using criteria other than specification order. 
For example:\n\nsentence: verb noun preposition article noun { [@item] } <score: sensible(@item)>\n| adjective noun verb article noun   { [@item] } <score: sensible(@item)>\n| noun verb preposition article noun { [@item] } <score: sensible(@item)>\n\nNow, when each production reaches its respective \"<score:...>\" directive, the subroutine\n\"sensible\" will be called to evaluate the matched items (somehow). Once all productions have\nbeen tried, the one which \"sensible\" scored most highly will be the one that is accepted as\na match for the rule.\n\nThe variable $score always holds the current best score of any production, and the variable\n$score_return holds the corresponding return value.\n\nAs another example, the following grammar matches lines that may be separated by commas,\ncolons, or semi-colons. This can be tricky if a colon-separated line also contains commas,\nor vice versa. The grammar resolves the ambiguity by selecting the rule that results in the\nfewest fields:\n\nline: seplist[sep=>',']  <score: -@{$item[1]}>\n| seplist[sep=>':']  <score: -@{$item[1]}>\n| seplist[sep=>\" \"]  <score: -@{$item[1]}>\n\nseplist: <skip:\"\"> <leftop: /[^$arg{sep}]*/ \"$arg{sep}\" /[^$arg{sep}]*/>\n\nNote the use of negation within the \"<score:...>\" directive to ensure that the seplist with\nthe most items gets the lowest score.\n\nAs the above examples indicate, it is often the case that all productions in a rule use\nexactly the same \"<score:...>\" directive. It is tedious to have to repeat this identical\ndirective in every production, so Parse::RecDescent also provides the \"<autoscore:...>\"\ndirective.\n\nIf an \"<autoscore:...>\" directive appears in any production of a rule, the code it specifies\nis used as the scoring code for every production of that rule, except productions that\nalready end with an explicit \"<score:...>\" directive. 
Thus the rules above could be\nrewritten:\n\nline: <autoscore: -@{$item[1]}>\nline: seplist[sep=>',']\n| seplist[sep=>':']\n| seplist[sep=>\" \"]\n\n\nsentence: <autoscore: sensible(@item)>\n| verb noun preposition article noun { [@item] }\n| adjective noun verb article noun   { [@item] }\n| noun verb preposition article noun { [@item] }\n\nNote that the \"<autoscore:...>\" directive itself acts as an unconditional \"<reject>\", and\n(like the \"<rulevar:...>\" directive) is pruned at compile-time wherever possible.\n\nDispensing with grammar checks\nDuring the compilation phase of parser construction, Parse::RecDescent performs a small\nnumber of checks on the grammar it's given. Specifically it checks that the grammar is not\nleft-recursive, that there are no \"insatiable\" constructs of the form:\n\nrule: subrule(s) subrule\n\nand that there are no rules missing (i.e. referred to, but never defined).\n\nThese checks are important during development, but can slow down parser construction in\nstable code. So Parse::RecDescent provides the <nocheck> directive to turn them off. The\ndirective can only appear before the first rule definition, and switches off checking\nthroughout the rest of the current grammar.\n\nTypically, this directive would be added when a parser has been thoroughly tested and is\nready for release.\n"
                },
                {
                    "name": "Subrule argument lists",
                    "content": "It is occasionally useful to pass data to a subrule which is being invoked. For example,\nconsider the following grammar fragment:\n\nclassdecl: keyword decl\n\nkeyword:   'struct' | 'class';\n\ndecl:      # WHATEVER\n\nThe \"decl\" rule might wish to know which of the two keywords was used (since it may affect some\naspect of the way the subsequent declaration is interpreted). \"Parse::RecDescent\" allows the\ngrammar designer to pass data into a rule, by placing that data in an *argument list* (that is,\nin square brackets) immediately after any subrule item in a production. Hence, we could pass the\nkeyword to \"decl\" as follows:\n\nclassdecl: keyword decl[ $item[1] ]\n\nkeyword:   'struct' | 'class';\n\ndecl:      # WHATEVER\n\nThe argument list can consist of any number (including zero!) of comma-separated Perl\nexpressions. In other words, it looks exactly like a Perl anonymous array reference. For\nexample, we could pass the keyword, the name of the surrounding rule, and the literal 'keyword'\nto \"decl\" like so:\n\nclassdecl: keyword decl[$item[1],$item[0],'keyword']\n\nkeyword:   'struct' | 'class';\n\ndecl:      # WHATEVER\n\nWithin the rule to which the data is passed (\"decl\" in the above examples) that data is\navailable as the elements of a local variable @arg. Hence \"decl\" might report its intentions as\nfollows:\n\nclassdecl: keyword decl[$item[1],$item[0],'keyword']\n\nkeyword:   'struct' | 'class';\n\ndecl:      { print \"Declaring $arg[0] (a $arg[2])\\n\";\nprint \"(this rule called by $arg[1])\" }\n\nSubrule argument lists can also be interpreted as hashes, simply by using the local variable\n%arg instead of @arg. 
Hence we could rewrite the previous example:\n\nclassdecl: keyword decl[keyword => $item[1],\ncaller  => $item[0],\ntype    => 'keyword']\n\nkeyword:   'struct' | 'class';\n\ndecl:      { print \"Declaring $arg{keyword} (a $arg{type})\\n\";\nprint \"(this rule called by $arg{caller})\" }\n\nBoth @arg and %arg are always available, so the grammar designer may choose whichever convention\n(or combination of conventions) suits best.\n\nSubrule argument lists are also useful for creating \"rule templates\" (especially when used in\nconjunction with the \"<matchrule:...>\" directive). For example, the subrule:\n\nlist:     <matchrule:$arg{rule}> /$arg{sep}/ list[%arg]\n{ $return = [ $item[1], @{$item[3]} ] }\n|     <matchrule:$arg{rule}>\n{ $return = [ $item[1]] }\n\nis a handy template for the common problem of matching a separated list. For example:\n\nfunction: 'func' name '(' list[rule=>'param',sep=>';'] ')'\n\nparam:    list[rule=>'name',sep=>','] ':' typename\n\nname:     /\\w+/\n\ntypename: name\n\nWhen a subrule argument list is used with a repeated subrule, the argument list goes *before*\nthe repetition specifier:\n\nlist:   /some|many/ thing[ $item[1] ](s)\n\nThe argument list is \"late bound\". That is, it is re-evaluated for every repetition of the\nrepeated subrule. This means that each repeated attempt to match the subrule may be passed a\ncompletely different set of arguments if the value of the expression in the argument list\nchanges between attempts. So, for example, the grammar:\n\n{ $::species = 'dogs' }\n\npair:   'two' animal[$::species](s)\n\nanimal: /$arg[0]/ { $::species = 'cats' }\n\nwill match the string \"two dogs cats cats\" completely, whereas it will only match the string\n\"two dogs dogs dogs\" up to the eighth letter. 
If the value of the argument list were \"early\nbound\" (that is, evaluated only the first time a repeated subrule match is attempted), one would\nexpect the matching behaviours to be reversed.\n\nOf course, it is possible to effectively \"early bind\" such argument lists by passing them a\nvalue which does not change on each repetition. For example:\n\n{ $::species = 'dogs' }\n\npair:   'two' { $::species } animal[$item[2]](s)\n\nanimal: /$arg[0]/ { $::species = 'cats' }\n\nArguments can also be passed to the start rule, simply by appending them to the argument list\nwith which the start rule is called (*after* the \"line number\" parameter). For example, given:\n\n$parser = new Parse::RecDescent ( $grammar );\n\n$parser->data($text, 1, \"str\", 2, \\@arr);\n#             ^^^^^  ^  ^^^^^^^^^^^^^^^\n#               |    |         |\n# TEXT TO BE PARSED  |         |\n# STARTING LINE NUMBER         |\n# ELEMENTS OF @arg WHICH IS PASSED TO RULE data\n\nthen within the productions of the rule \"data\", the array @arg will contain \"(\"str\", 2, \\@arr)\".\n"
                },
                {
                    "name": "Alternations",
                    "content": "Alternations are implicit (unnamed) rules defined as part of a production. An alternation is\ndefined as a series of '|'-separated productions inside a pair of round brackets. For example:\n\ncharacter: 'the' ( good | bad | ugly ) /dude/\n\nEvery alternation implicitly defines a new subrule, whose automatically-generated name indicates\nits origin: \"_alternation_<I>_of_production_<P>_of_rule_<R>\" for the appropriate values of <I>,\n<P>, and <R>. A call to this implicit subrule is then inserted in place of the brackets. Hence\nthe above example is merely a convenient short-hand for:\n\ncharacter: 'the'\n_alternation_1_of_production_1_of_rule_character\n/dude/\n\n_alternation_1_of_production_1_of_rule_character:\ngood | bad | ugly\n\nSince alternations are parsed by recursively calling the parser generator, any type(s) of item\ncan appear in an alternation. For example:\n\ncharacter: 'the' ( 'high' \"plains\"  # Silent, with poncho\n| /no[- ]name/ # Silent, no poncho\n| vengeance_seeking    # Poncho-optional\n| <error>\n) drifter\n\nIn this case, if an error occurred, the automatically generated message would be:\n\nERROR (line <N>): Invalid implicit subrule: Expected\n'high' or /no[- ]name/ or generic,\nbut found \"pacifist\" instead\n\nSince every alternation actually has a name, it's even possible to extend or replace them:\n\n$parser->Replace(\n\"_alternation_1_of_production_1_of_rule_character:\n'generic Eastwood'\"\n);\n\nMore importantly, since alternations are a form of subrule, they can be given repetition\nspecifiers:\n\ncharacter: 'the' ( good | bad | ugly )(?) /dude/\n"
                },
                {
                    "name": "Incremental Parsing",
                    "content": "\"Parse::RecDescent\" provides two methods - \"Extend\" and \"Replace\" - which can be used to alter\nthe grammar matched by a parser. Both methods take the same argument as\n\"Parse::RecDescent::new\", namely a grammar specification string.\n\n\"Parse::RecDescent::Extend\" interprets the grammar specification and adds any productions it\nfinds to the end of the rules for which they are specified. For example:\n\n$add = \"name: 'Jimmy-Bob' | 'Bobby-Jim'\\ndesc: colour /necks?/\";\n$parser->Extend($add);\n\nadds two productions to the rule \"name\" (creating it if necessary) and one production to the\nrule \"desc\".\n\n\"Parse::RecDescent::Replace\" is identical, except that it first resets any rule specified in the\nadditional grammar, removing any existing productions. Hence after:\n\n$add = \"name: 'Jimmy-Bob' | 'Bobby-Jim'\\ndesc: colour /necks?/\";\n$parser->Replace($add);\n\nthere are *only* two valid \"name\"s and the one possible description.\n\nA more interesting use of the \"Extend\" and \"Replace\" methods is to call them inside the action\nof an executing parser. For example:\n\ntypedef: 'typedef' typename identifier ';'\n{ $thisparser->Extend(\"typename: '$item[3]'\") }\n| <error>\n\nidentifier: ...!typename /[A-Za-z_]\\w*/\n\nwhich automatically prevents type names from being typedef'd, or:\n\ncommand: 'map' keyname 'to' abortkey\n{ $thisparser->Replace(\"abortkey: '$item[2]'\") }\n| 'map' keyname 'to' keyname\n{ mapkey($item[2],$item[4]) }\n| abortkey\n{ exit if confirm(\"abort?\") }\n\nabortkey: 'q'\n\nkeyname: ...!abortkey /[A-Za-z]/\n\nwhich allows the user to change the abort key binding, but not to unbind it.\n\nThe careful use of such constructs makes it possible to reconfigure a running parser,\neliminating the need for semantic feedback by providing syntactic feedback instead. However, as\ncurrently implemented, \"Replace()\" and \"Extend()\" have to regenerate and re-\"eval\" the entire\nparser whenever they are called. This makes them quite slow for large grammars.\n\nIn such cases, the judicious use of an interpolated regex is likely to be far more efficient:\n\ntypedef: 'typedef' typename identifier ';'\n{ $thisparser->{local}{typename} .= \"|$item[3]\" }\n| <error>\n\nidentifier: ...!typename /[A-Za-z_]\\w*/\n\ntypename: /$thisparser->{local}{typename}/\n"
                },
                {
                    "name": "Precompiling parsers",
                    "content": "Normally Parse::RecDescent builds a parser from a grammar at run-time. That approach simplifies\nthe design and implementation of parsing code, but has the disadvantage that it slows the\nparsing process down - you have to wait for Parse::RecDescent to build the parser every time the\nprogram runs. Long or complex grammars can be particularly slow to build, leading to\nunacceptable delays at start-up.\n\nTo overcome this, the module provides a way of \"pre-building\" a parser object and saving it in a\nseparate module. That module can then be used to create clones of the original parser.\n\nA grammar may be precompiled using the \"Precompile\" class method. For example, to precompile a\ngrammar stored in the scalar $grammar, and produce a class named PreGrammar in a module file\nnamed PreGrammar.pm, you could use:\n\nuse Parse::RecDescent;\n\nParse::RecDescent->Precompile([$optionshashref], $grammar, \"PreGrammar\", [\"RuntimeClass\"]);\n\nThe first required argument is the grammar string, the second is the name of the class to be\nbuilt. The name of the module file is generated automatically by appending \".pm\" to the last\nelement of the class name. Thus\n\nParse::RecDescent->Precompile($grammar, \"My::New::Parser\");\n\nwould produce a module file named Parser.pm.\n\nAfter the class name, you may specify the name of the runtimeclass called by the Precompiled\nparser. See \"Precompiled runtimes\" for more details.\n\nAn optional hash reference may be supplied as the first argument to \"Precompile\". This argument\nis currently EXPERIMENTAL, and may change in a future release of Parse::RecDescent. 
The only\nsupported option is currently \"-standalone\", see \"Standalone precompiled parsers\".\n\nIt is somewhat tedious to have to write a small Perl program just to generate a precompiled\ngrammar class, so Parse::RecDescent has some special magic that allows you to do the job\ndirectly from the command-line.\n\nIf your grammar is specified in a file named grammar, you can generate a class named\nYet::Another::Grammar like so:\n\n> perl -MParse::RecDescent - grammar Yet::Another::Grammar [Runtime::Class]\n\nThis would produce a file named Grammar.pm containing the full definition of a class called\nYet::Another::Grammar. Of course, to use that class, you would need to put the Grammar.pm file\nin a directory named Yet/Another, somewhere in your Perl include path.\n\nHaving created the new class, it's very easy to use it to build a parser. You simply \"use\" the\nnew module, and then call its \"new\" method to create a parser object. For example:\n\nuse Yet::Another::Grammar;\nmy $parser = Yet::Another::Grammar->new();\n\nThe effect of these two lines is exactly the same as:\n\nuse Parse::RecDescent;\n\nopen GRAMMARFILE, \"grammar\" or die;\nlocal $/;\nmy $grammar = <GRAMMARFILE>;\n\nmy $parser = Parse::RecDescent->new($grammar);\n\nonly considerably faster.\n\nNote however that the parsers produced by either approach are exactly the same, so whilst\nprecompilation has an effect on *set-up* speed, it has no effect on *parsing* speed. RecDescent\n2.0 will address that problem.\n\nStandalone precompiled parsers\nUntil version 1.967003 of Parse::RecDescent, parser modules built with \"Precompile\" were\ndependent on Parse::RecDescent. 
Future Parse::RecDescent releases with different internal\nimplementations would break pre-existing precompiled parsers.\n\nVersion 1.967005 added the ability for Parse::RecDescent to include itself in the resulting .pm\nfile if you pass the boolean option \"-standalone\" to \"Precompile\":\n\nParse::RecDescent->Precompile({ -standalone => 1, },\n$grammar, \"My::New::Parser\");\n\nParse::RecDescent is included as $class::Runtime in order to avoid conflicts between an\ninstalled version of Parse::RecDescent and other precompiled, standalone parsers made with\nParse::RecDescent. The name of this class may be changed with the \"-runtimeclass\" option to\nPrecompile. This renaming is experimental, and is subject to change in future versions.\n\nPrecompiled parsers remain dependent on Parse::RecDescent by default, as this feature is still\nconsidered experimental. In the future, standalone parsers will become the default.\n\nPrecompiled runtimes\nStandalone precompiled parsers each include a copy of Parse::RecDescent. For users who have a\nfamily of related precompiled parsers, this is very inefficient. \"Precompile\" now supports an\nexperimental \"-runtimeclass\" option. To build a precompiled parser with a different runtime\nname, call:\n\nParse::RecDescent->Precompile({\n-standalone => 1,\n-runtimeclass => \"My::Runtime\",\n},\n$grammar, \"My::New::Parser\");\n\nThe resulting standalone parser will contain a copy of Parse::RecDescent, renamed to\n\"My::Runtime\".\n\nTo build a set of parsers that \"use\" a custom-named runtime, without including that runtime in\nthe output, simply build those parsers with \"-runtimeclass\" and without \"-standalone\":\n\nParse::RecDescent->Precompile({\n-runtimeclass => \"My::Runtime\",\n},\n$grammar, \"My::New::Parser\");\n\nThe runtime itself must be generated as well, so that it may be \"use\"d by My::New::Parser. To\ngenerate the runtime file, use one of the two following calls:\n\nParse::RecDescent->PrecompiledRuntime(\"My::Runtime\");\n\nParse::RecDescent->Precompile({\n-standalone => 1,\n-runtimeclass => \"My::Runtime\",\n},\n'', # empty grammar\n\"My::Runtime\");\n"
                }
            ]
        },
        "GOTCHAS": {
            "content": "This section describes common mistakes that grammar writers seem to make on a regular basis.\n\n1. Expecting an error to always invalidate a parse\nA common mistake when using error messages is to write the grammar like this:\n\nfile: line(s)\n\nline: linetype1\n| linetype2\n| linetype3\n| <error>\n\nThe expectation seems to be that any line that is not of type 1, 2 or 3 will invoke the\n\"<error>\" directive and thereby cause the parse to fail.\n\nUnfortunately, that only happens if the error occurs in the very first line. The first rule\nstates that a \"file\" is matched by one or more lines, so if even a single line succeeds, the\nfirst rule is completely satisfied and the parse as a whole succeeds. That means that any error\nmessages generated by subsequent failures in the \"line\" rule are quietly ignored.\n\nTypically what's really needed is this:\n\nfile: line(s) eofile    { $return = $item[1] }\n\nline: linetype1\n| linetype2\n| linetype3\n| <error>\n\neofile: /^\\Z/\n\nThe addition of the \"eofile\" subrule to the first production means that a file only matches a\nseries of successful \"line\" matches *that consume the complete input text*. If any input text\nremains after the lines are matched, there must have been an error in the last \"line\". In that\ncase the \"eofile\" rule will fail, causing the entire \"file\" rule to fail too.\n\nNote too that \"eofile\" must match \"/^\\Z/\" (end-of-text), *not* \"/^\\cZ/\" or \"/^\\cD/\"\n(end-of-file).\n\nAnd don't forget the action at the end of the production. If you just write:\n\nfile: line(s) eofile\n\nthen the value returned by the \"file\" rule will be the value of its last item: \"eofile\". Since\n\"eofile\" always returns an empty string on success, that will cause the \"file\" rule to return\nthat empty string. 
Apart from returning the wrong value, returning an empty string will trip up\ncode such as:\n\n$parser->file($filetext) || die;\n\n(since \"\" is false).\n\nRemember that Parse::RecDescent returns undef on failure, so the only safe test for failure is:\n\ndefined($parser->file($filetext)) || die;\n\n2. Using a \"return\" in an action\nAn action is like a \"do\" block inside the subroutine implementing the surrounding rule. So if\nyou put a \"return\" statement in an action:\n\nrange: '(' start '..' end ')'\n{ return $item{end} }\n/\\s+/\n\nthat subroutine will immediately return, without checking the rest of the items in the current\nproduction (e.g. the \"/\\s+/\") and without setting up the necessary data structures to tell the\nparser that the rule has succeeded.\n\nThe correct way to set a return value in an action is to set the $return variable:\n\nrange: '(' start '..' end ')'\n{ $return = $item{end} }\n/\\s+/\n\n3. Setting $Parse::RecDescent::skip at parse time\nIf you want to change the default skipping behaviour (see \"Terminal Separators\" and the\n\"<skip:...>\" directive) by setting $Parse::RecDescent::skip you have to remember to set this\nvariable *before* creating the grammar object.\n\nFor example, you might want to skip all Perl-like comments with this regular expression:\n\nmy $skipspacesandcomments = qr/\n(?mxs:\n\\s+         # either spaces\n| \\# .*?$   # or a hash and whatever up to the end of line\n)*             # repeated at will (in whatever order)\n/;\n\nAnd then:\n\nmy $parser1 = Parse::RecDescent->new($grammar);\n\n$Parse::RecDescent::skip = $skipspacesandcomments;\n\nmy $parser2 = Parse::RecDescent->new($grammar);\n\n$parser1->parse($text); # this does not cope with comments\n$parser2->parse($text); # this skips comments correctly\n\nThe two parsers behave differently, because any skipping behaviour specified via\n$Parse::RecDescent::skip is hard-coded when the grammar object is built, not at parse time.\n",
            "subsections": []
        },
        "DIAGNOSTICS": {
            "content": "Diagnostics are intended to be self-explanatory (particularly if you use -RDHINT (under perl\n-s) or define $::RDHINT inside the program).\n\n\"Parse::RecDescent\" currently diagnoses the following:\n\n*   Invalid regular expressions used as pattern terminals (fatal error).\n\n*   Invalid Perl code in code blocks (fatal error).\n\n*   Lookahead used in the wrong place or in a nonsensical way (fatal error).\n\n*   \"Obvious\" cases of left-recursion (fatal error).\n\n*   Missing or extra components in a \"<leftop>\" or \"<rightop>\" directive.\n\n*   Unrecognisable components in the grammar specification (fatal error).\n\n*   \"Orphaned\" rule components specified before the first rule (fatal error) or after an\n\"<error>\" directive (level 3 warning).\n\n*   Missing rule definitions (this only generates a level 3 warning, since you may be providing\nthem later via \"Parse::RecDescent::Extend()\").\n\n*   Instances where greedy repetition behaviour will almost certainly cause the failure of a\nproduction (a level 3 warning - see \"ON-GOING ISSUES AND FUTURE DIRECTIONS\" below).\n\n*   Attempts to define rules named 'Replace' or 'Extend', which cannot be called directly\nthrough the parser object because of the predefined meaning of \"Parse::RecDescent::Replace\"\nand \"Parse::RecDescent::Extend\". (Only a level 2 warning is generated, since such rules\n*can* still be used as subrules).\n\n*   Productions which consist of a single \"<error?>\" directive, and which therefore may succeed\nunexpectedly (a level 2 warning, since this might conceivably be the desired effect).\n\n*   Multiple consecutive lookahead specifiers (a level 1 warning only, since their effects\nsimply accumulate).\n\n*   Productions which start with a \"<reject>\" or \"<rulevar:...>\" directive. Such productions are\noptimized away (a level 1 warning).\n\n*   Rules which are autogenerated under $::RDAUTOSTUB (a level 1 warning).\n",
            "subsections": []
        },
        "AUTHOR": {
            "content": "Damian Conway (damian@conway.org)\nJeremy T. Braun (JTBRAUN@CPAN.org) [current maintainer]\n",
            "subsections": []
        },
        "BUGS AND IRRITATIONS": {
            "content": "There are undoubtedly serious bugs lurking somewhere in this much code :-) Bug reports, test\ncases and other feedback are most welcome.\n\nOngoing annoyances include:\n\n*   There's no support for parsing directly from an input stream. If and when the Perl Gods give\nus regular expressions on streams, this should be trivial (ahem!) to implement.\n\n*   The parser generator can get confused if actions aren't properly closed or if they contain\nparticularly nasty Perl syntax errors (especially unmatched curly brackets).\n\n*   The generator only detects the most obvious form of left recursion (potential recursion on\nthe first subrule in a rule). More subtle forms of left recursion (for example, through the\nsecond item in a rule after a \"zero\" match of a preceding \"zero-or-more\" repetition, or\nafter a match of a subrule with an empty production) are not found.\n\n*   Instead of complaining about left-recursion, the generator should silently transform the\ngrammar to remove it. Don't expect this feature any time soon as it would require a more\nsophisticated approach to parser generation than is currently used.\n\n*   The generated parsers don't always run as fast as might be wished.\n\n*   The meta-parser should be bootstrapped using \"Parse::RecDescent\" :-)\n",
            "subsections": []
        },
        "ON-GOING ISSUES AND FUTURE DIRECTIONS": {
            "content": "1.  Repetitions are \"incorrigibly greedy\" in that they will eat everything they can and won't\nbacktrack if that behaviour causes a production to fail needlessly. So, for example:\n\nrule: subrule(s) subrule\n\nwill *never* succeed, because the repetition will eat all the subrules it finds, leaving\nnone to match the second item. Such constructions are relatively rare (and\n\"Parse::RecDescent::new\" generates a warning whenever they occur) so this may not be a\nproblem, especially since the insatiable behaviour can be overcome \"manually\" by writing:\n\nrule: penultimatesubrule(s) subrule\n\npenultimatesubrule: subrule ...subrule\n\nThe issue is that this construction is exactly twice as expensive as the original, whereas\nbacktracking would add only 1/*N* to the cost (for matching *N* repetitions of \"subrule\"). I\nwould welcome feedback on the need for backtracking; particularly on cases where the lack of\nit makes parsing performance problematical.\n\n2.  Having opened that can of worms, it's also necessary to consider whether there is a need for\nnon-greedy repetition specifiers. Again, it's possible (at some cost) to manually provide\nthe required functionality:\n\nrule: nongreedysubrule(s) othersubrule\n\nnongreedysubrule: subrule ...!othersubrule\n\nOverall, the issue is whether the benefit of this extra functionality outweighs the\ndrawbacks of further complicating the (currently minimalist) grammar specification syntax,\nand (worse) introducing more overhead into the generated parsers.\n\n3.  An \"<autocommit>\" directive would be nice. 
That is, it would be useful to be able to say:\n\ncommand: <autocommit>\ncommand: 'find' name\n| 'find' address\n| 'do' command 'at' time 'if' condition\n| 'do' command 'at' time\n| 'do' command\n| unusualcommand\n\nand have the generator work out that this should be \"pruned\" thus:\n\ncommand: 'find' name\n| 'find' <commit> address\n| 'do' <commit> command <uncommit>\n'at' time\n'if' <commit> condition\n| 'do' <commit> command <uncommit>\n'at' <commit> time\n| 'do' <commit> command\n| unusualcommand\n\nThere are several issues here. Firstly, should the \"<autocommit>\" automatically install an\n\"<uncommit>\" at the start of the last production (on the grounds that the \"command\" rule\ndoesn't know whether an \"unusualcommand\" might start with \"find\" or \"do\") or should the\n\"unusualcommand\" subgraph be analysed (to see if it *might* be viable after a \"find\" or\n\"do\")?\n\nThe second issue is how regular expressions should be treated. The simplest approach would\nbe simply to uncommit before them (on the grounds that they *might* match). Better\nefficiency would be obtained by analyzing all preceding literal tokens to determine whether\nthe pattern would match them.\n\nOverall, the issues are: can such automated \"pruning\" approach a hand-tuned version\nsufficiently closely to warrant the extra set-up expense, and (more importantly) is the\nproblem important enough to even warrant the non-trivial effort of building an automated\nsolution?\n",
            "subsections": []
        },
        "SUPPORT": {
            "content": "",
            "subsections": [
                {
                    "name": "Source Code Repository",
                    "content": "<http://github.com/jtbraun/Parse-RecDescent>\n"
                },
                {
                    "name": "Mailing List",
                    "content": "Visit <http://www.perlfoundation.org/perl5/index.cgi?parserecdescent> to sign up for the\nmailing list.\n\n<http://www.PerlMonks.org> is also a good place to ask questions. Previous posts about\nParse::RecDescent can typically be found with this search:\n<http://perlmonks.org/index.pl?node=recdescent>.\n\nFAQ\nVisit Parse::RecDescent::FAQ for answers to frequently (and not so frequently) asked questions\nabout Parse::RecDescent.\n\nView/Report Bugs\nTo view the current bug list or report a new issue visit\n<https://rt.cpan.org/Public/Dist/Display.html?Name=Parse-RecDescent>.\n"
                }
            ]
        },
        "SEE ALSO": {
            "content": "Regexp::Grammars provides Parse::RecDescent style parsing using native Perl 5.10 regular\nexpressions.\n",
            "subsections": []
        },
        "LICENCE AND COPYRIGHT": {
            "content": "Copyright (c) 1997-2007, Damian Conway \"<DCONWAY@CPAN.org>\". All rights reserved.\n\nThis module is free software; you can redistribute it and/or modify it under the same terms as\nPerl itself. See perlartistic.\n",
            "subsections": []
        },
        "DISCLAIMER OF WARRANTY": {
            "content": "BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE\nEXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT\nHOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE \"AS IS\" WITHOUT WARRANTY OF ANY KIND, EITHER\nEXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY\nAND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE\nSOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY\nSERVICING, REPAIR, OR CORRECTION.\n\nIN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER,\nOR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE\nLICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR\nCONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT\nLIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\nPARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR\nOTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.\n",
            "subsections": []
        }
    },
    "summary": "Parse::RecDescent - Generate Recursive-Descent Parsers",
    "flags": [],
    "examples": [],
    "see_also": []
}