{
    "mode": "perldoc",
    "parameter": "XML::Parser",
    "section": "",
    "url": "https://www.chedong.com/phpMan.php/perldoc/XML%3A%3AParser/json",
    "generated": "2026-05-30T06:07:28Z",
    "synopsis": "use XML::Parser;\n$p1 = XML::Parser->new(Style => 'Debug');\n$p1->parsefile('REC-xml-19980210.xml');\n$p1->parse('<foo id=\"me\">Hello World</foo>');\n# Alternative\n$p2 = XML::Parser->new(Handlers => {Start => \\&handlestart,\nEnd   => \\&handleend,\nChar  => \\&handlechar});\n$p2->parse($socket);\n# Another alternative\n$p3 = XML::Parser->new(ErrorContext => 2);\n$p3->setHandlers(Char    => \\&text,\nDefault => \\&other);\nopen(my $fh, 'xmlgenerator |');\n$p3->parse($foo, ProtocolEncoding => 'ISO-8859-1');\nclose($foo);\n$p3->parsefile('junk.xml', ErrorContext => 3);",
    "sections": {
        "NAME": {
            "content": "XML::Parser - A perl module for parsing XML documents\n",
            "subsections": []
        },
        "SYNOPSIS": {
            "content": "use XML::Parser;\n\n$p1 = XML::Parser->new(Style => 'Debug');\n$p1->parsefile('REC-xml-19980210.xml');\n$p1->parse('<foo id=\"me\">Hello World</foo>');\n\n# Alternative\n$p2 = XML::Parser->new(Handlers => {Start => \\&handlestart,\nEnd   => \\&handleend,\nChar  => \\&handlechar});\n$p2->parse($socket);\n\n# Another alternative\n$p3 = XML::Parser->new(ErrorContext => 2);\n\n$p3->setHandlers(Char    => \\&text,\nDefault => \\&other);\n\nopen(my $fh, 'xmlgenerator |');\n$p3->parse($foo, ProtocolEncoding => 'ISO-8859-1');\nclose($foo);\n\n$p3->parsefile('junk.xml', ErrorContext => 3);\n",
            "subsections": []
        },
        "DESCRIPTION": {
            "content": "This module provides ways to parse XML documents. It is built on top of\nXML::Parser::Expat, which is a lower level interface to James Clark's\nexpat library. Each call to one of the parsing methods creates a new\ninstance of XML::Parser::Expat which is then used to parse the document.\nExpat options may be provided when the XML::Parser object is created.\nThese options are then passed on to the Expat object on each parse call.\nThey can also be given as extra arguments to the parse methods, in which\ncase they override options given at XML::Parser creation time.\n\nThe behavior of the parser is controlled either by \"STYLES\" and/or\n\"HANDLERS\" options, or by \"setHandlers\" method. These all provide\nmechanisms for XML::Parser to set the handlers needed by\nXML::Parser::Expat. If neither \"Style\" nor \"Handlers\" are specified,\nthen parsing just checks the document for being well-formed.\n\nWhen underlying handlers get called, they receive as their first\nparameter the *Expat* object, not the Parser object.\n",
            "subsections": []
        },
        "METHODS": {
            "content": "new This is a class method, the constructor for XML::Parser. Options are\npassed as keyword value pairs. Recognized options are:\n\n*   Style\n\nThis option provides an easy way to create a given style of\nparser. The built in styles are: \"Debug\", \"Subs\", \"Tree\",\n\"Objects\", and \"Stream\". These are all defined in separate\npackages under \"XML::Parser::Style::*\", and you can find further\ndocumentation for each style both below, and in those packages.\n\nCustom styles can be provided by giving a full package name\ncontaining at least one '::'. This package should then have subs\ndefined for each handler it wishes to have installed. See\n\"STYLES\" below for a discussion of each built in style.\n\n*   Handlers\n\nWhen provided, this option should be an anonymous hash\ncontaining as keys the type of handler and as values a sub\nreference to handle that type of event. All the handlers get\npassed as their 1st parameter the instance of expat that is\nparsing the document. Further details on handlers can be found\nin \"HANDLERS\". Any handler set here overrides the corresponding\nhandler set with the Style option.\n\n*   Pkg\n\nSome styles will refer to subs defined in this package. If not\nprovided, it defaults to the package which called the\nconstructor.\n\n*   ErrorContext\n\nThis is an Expat option. When this option is defined, errors are\nreported in context. The value should be the number of lines to\nshow on either side of the line in which the error occurred.\n\n*   ProtocolEncoding\n\nThis is an Expat option. This sets the protocol encoding name.\nIt defaults to none. The built-in encodings are: \"UTF-8\",\n\"ISO-8859-1\", \"UTF-16\", and \"US-ASCII\". Other encodings may be\nused if they have encoding maps in one of the directories in the\n@EncodingPath list. Check \"ENCODINGS\" for more information on\nencoding maps. 
Setting the protocol encoding overrides any\nencoding in the XML declaration.\n\n*   Namespaces\n\nThis is an Expat option. If this is set to a true value, then\nnamespace processing is done during the parse. See \"Namespaces\"\nin XML::Parser::Expat for further discussion of namespace\nprocessing.\n\n*   NoExpand\n\nThis is an Expat option. Normally, the parser will try to expand\nreferences to entities defined in the internal subset. If this\noption is set to a true value, and a default handler is also\nset, then the default handler will be called when an entity\nreference is seen in text. This has no effect if a default\nhandler has not been registered, and it has no effect on the\nexpansion of entity references inside attribute values.\n\n*   StreamDelimiter\n\nThis is an Expat option. It takes a string value. When this\nstring is found alone on a line while parsing from a stream,\nthen the parse is ended as if it saw an end of file. The\nintended use is with a stream of XML documents in a MIME\nmultipart format. The string should not contain a trailing\nnewline.\n\n*   ParseParamEnt\n\nThis is an Expat option. Unless standalone is set to \"yes\" in\nthe XML declaration, setting this to a true value allows the\nexternal DTD to be read, and parameter entities to be parsed and\nexpanded.\n\n*   NoLWP\n\nThis option has no effect if the ExternEnt or ExternEntFin\nhandlers are directly set. Otherwise, if true, it forces the use\nof a file based external entity handler.\n\n*   NonExpatOptions\n\nIf provided, this should be an anonymous hash whose keys are\noptions that shouldn't be passed to Expat. This should only be\nof concern to those subclassing XML::Parser.\n\nsetHandlers(TYPE, HANDLER [, TYPE, HANDLER [...]])\nThis method registers handlers for various parser events. It\noverrides any previous handlers registered through the Style or\nHandlers options or through earlier calls to setHandlers. 
By\nproviding a false or undefined value as the handler, the existing\nhandler can be unset.\n\nThis method returns a list of type, handler pairs corresponding to\nthe input. The handlers returned are the ones that were in effect\nprior to the call.\n\nSee a description of the handler types in \"HANDLERS\".\n\nparse(SOURCE [, OPT => OPTVALUE [...]])\nThe SOURCE parameter should either be a string containing the whole\nXML document, or it should be an open IO::Handle. Constructor\noptions to XML::Parser::Expat given as keyword-value pairs may\nfollow the SOURCE parameter. These override, for this call, any\noptions or attributes passed through from the XML::Parser instance.\n\nThe method dies if a parse error occurs. Otherwise it will\nreturn 1 or whatever is returned from the Final handler, if one is\ninstalled. In other words, what parse may return depends on the\nstyle.\n\nparsestring\nThis is just an alias for parse for backwards compatibility.\n\nparsefile(FILE [, OPT => OPTVALUE [...]])\nOpen FILE for reading, then call parse with the open handle. The\nfile is closed no matter how parse returns. Returns what parse\nreturns.\n\nparse_start([ OPT => OPTVALUE [...]])\nCreate and return a new instance of XML::Parser::ExpatNB.\nConstructor options may be provided. If an init handler has been\nprovided, it is called before returning the ExpatNB object.\nDocuments are parsed by making incremental calls to the parse_more\nmethod of this object, which takes a string. A single call to the\nparse_done method of this object, which takes no arguments,\nindicates that the document is finished.\n\nIf there is a final handler installed, it is executed by the\nparse_done method before returning and the parse_done method returns\nwhatever is returned by the final handler.\n",
            "subsections": []
        },
        "HANDLERS": {
            "content": "Expat is an event based parser. As the parser recognizes parts of the\ndocument (say the start or end tag for an XML element), then any\nhandlers registered for that type of an event are called with suitable\nparameters. All handlers receive an instance of XML::Parser::Expat as\ntheir first argument. See \"METHODS\" in XML::Parser::Expat for a\ndiscussion of the methods that can be called on this object.\n\nInit                (Expat)\nThis is called just before the parsing of the document starts.\n\nFinal                (Expat)\nThis is called just after parsing has finished, but only if no errors\noccurred during the parse. Parse returns what this returns.\n\nStart                (Expat, Element [, Attr, Val [,...]])\nThis event is generated when an XML start tag is recognized. Element is\nthe name of the XML element type that is opened with the start tag. The\nAttr & Val pairs are generated for each attribute in the start tag.\n\nEnd                (Expat, Element)\nThis event is generated when an XML end tag is recognized. Note that an\nXML empty tag (<foo/>) generates both a start and an end event.\n\nChar                (Expat, String)\nThis event is generated when non-markup is recognized. The non-markup\nsequence of characters is in String. A single non-markup sequence of\ncharacters may generate multiple calls to this handler. 
Whatever the\nencoding of the string in the original document, this is given to the\nhandler in UTF-8.\n\nProc                (Expat, Target, Data)\nThis event is generated when a processing instruction is recognized.\n\nComment                (Expat, Data)\nThis event is generated when a comment is recognized.\n\nCdataStart        (Expat)\nThis is called at the start of a CDATA section.\n\nCdataEnd                (Expat)\nThis is called at the end of a CDATA section.\n\nDefault                (Expat, String)\nThis is called for any characters that don't have a registered handler.\nThis includes both characters that are part of markup for which no\nevents are generated (markup declarations) and characters that could\ngenerate events, but for which no handler has been registered.\n\nWhatever the encoding in the original document, the string is returned\nto the handler in UTF-8.\n\nUnparsed                (Expat, Entity, Base, Sysid, Pubid, Notation)\nThis is called for a declaration of an unparsed entity. Entity is the\nname of the entity. Base is the base to be used for resolving a relative\nURI. Sysid is the system id. Pubid is the public id. Notation is the\nnotation name. Base and Pubid may be undefined.\n\nNotation                (Expat, Notation, Base, Sysid, Pubid)\nThis is called for a declaration of notation. Notation is the notation\nname. Base is the base to be used for resolving a relative URI. Sysid is\nthe system id. Pubid is the public id. Base, Sysid, and Pubid may all be\nundefined.\n\nExternEnt        (Expat, Base, Sysid, Pubid)\nThis is called when an external entity is referenced. Base is the base\nto be used for resolving a relative URI. Sysid is the system id. Pubid\nis the public id. 
Base and Pubid may be undefined.\n\nThis handler should either return a string, which represents the\ncontents of the external entity, or return an open filehandle that can\nbe read to obtain the contents of the external entity, or return undef,\nwhich indicates the external entity couldn't be found and will generate\na parse error.\n\nIf an open filehandle is returned, it must be returned as either a glob\n(*FOO) or as a reference to a glob (e.g. an instance of IO::Handle).\n\nA default handler is installed for this event. The default handler is\nXML::Parser::lwp_ext_ent_handler unless the NoLWP option was provided\nwith a true value, in which case XML::Parser::file_ext_ent_handler is\nthe default handler for external entities. Even without the NoLWP\noption, if the URI or LWP modules are missing, the file based handler\nends up being used after giving a warning on the first external entity\nreference.\n\nThe LWP external entity handler will use proxies defined in the\nenvironment (http_proxy, ftp_proxy, etc.).\n\nPlease note that the LWP external entity handler reads the entire entity\ninto a string and returns it, whereas the file handler opens a\nfilehandle.\n\nAlso note that the file external entity handler will likely choke on\nabsolute URIs or file names that don't fit the conventions of the local\noperating system.\n\nThe expat base method can be used to set a basename for relative\npathnames. If no basename is given, or if the basename is itself a\nrelative name, then it is relative to the current working directory.\n\nExternEntFin        (Expat)\nThis is called after parsing an external entity. It's not called unless\nan ExternEnt handler is also set. There is a default handler installed\nthat pairs with the default ExternEnt handler.\n\nIf you're going to install your own ExternEnt handler, then you should\nset (or unset) this handler too.\n\nEntity                (Expat, Name, Val, Sysid, Pubid, Ndata, IsParam)\nThis is called when an entity is declared. 
For internal entities, the\nVal parameter will contain the value and the remaining three parameters\nwill be undefined. For external entities, the Val parameter will be\nundefined, the Sysid parameter will have the system id, the Pubid\nparameter will have the public id if it was provided (it will be\nundefined otherwise), the Ndata parameter will contain the notation for\nunparsed entities. If this is a parameter entity declaration, then the\nIsParam parameter is true.\n\nNote that this handler and the Unparsed handler above overlap. If both\nare set, then this handler will not be called for unparsed entities.\n\nElement                (Expat, Name, Model)\nThe element handler is called when an element declaration is found. Name\nis the element name, and Model is the content model as an\nXML::Parser::ContentModel object. See \"XML::Parser::ContentModel Methods\" in\nXML::Parser::Expat for methods available for this class.\n\nAttlist                (Expat, Elname, Attname, Type, Default, Fixed)\nThis handler is called for each attribute in an ATTLIST declaration. So\nan ATTLIST declaration that has multiple attributes will generate\nmultiple calls to this handler. The Elname parameter is the name of the\nelement with which the attribute is being associated. The Attname\nparameter is the name of the attribute. Type is the attribute type,\ngiven as a string. Default is the default value, which will either be\n\"#REQUIRED\", \"#IMPLIED\" or a quoted string (i.e. the returned string\nwill begin and end with a quote character). If Fixed is true, then this\nis a fixed attribute.\n\nDoctype                (Expat, Name, Sysid, Pubid, Internal)\nThis handler is called for DOCTYPE declarations. Name is the document\ntype name. Sysid is the system id of the document type, if it was\nprovided, otherwise it's undefined. Pubid is the public id of the\ndocument type, which will be undefined if no public id was given.\nInternal is the internal subset, given as a string. 
If there was no\ninternal subset, it will be undefined. Internal will contain all\nwhitespace, comments, processing instructions, and declarations seen in\nthe internal subset. The declarations will be there whether or not they\nhave been processed by another handler (except for unparsed entities\nprocessed by the Unparsed handler). However, comments and processing\ninstructions will not appear if they've been processed by their\nrespective handlers.\n\nDoctypeFin                (Parser)\nThis handler is called after parsing of the DOCTYPE declaration has\nfinished, including any internal or external DTD declarations.\n\nXMLDecl                (Expat, Version, Encoding, Standalone)\nThis handler is called for xml declarations. Version is a string\ncontaining the version. Encoding is either undefined or contains an\nencoding string. Standalone will be either true, false, or undefined if\nthe standalone attribute is yes, no, or not made respectively.\n",
            "subsections": []
        },
        "STYLES": {
            "content": "",
            "subsections": [
                {
                    "name": "Debug",
                    "content": "This just prints out the document in outline form. Nothing special is\nreturned by parse.\n"
                },
                {
                    "name": "Subs",
                    "content": "Each time an element starts, a sub by that name in the package specified\nby the Pkg option is called with the same parameters that the Start\nhandler gets called with.\n\nEach time an element ends, a sub with that name appended with an\nunderscore (\"\"), is called with the same parameters that the End\nhandler gets called with.\n\nNothing special is returned by parse.\n"
                },
                {
                    "name": "Tree",
                    "content": "Parse will return a parse tree for the document. Each node in the tree\ntakes the form of a tag, content pair. Text nodes are represented with a\npseudo-tag of \"0\" and the string that is their content. For elements,\nthe content is an array reference. The first item in the array is a\n(possibly empty) hash reference containing attributes. The remainder of\nthe array is a sequence of tag-content pairs representing the content of\nthe element.\n\nSo for example the result of parsing:\n\n<foo><head id=\"a\">Hello <em>there</em></head><bar>Howdy<ref/></bar>do</foo>\n\nwould be:\n\nTag   Content\n==================================================================\n[foo, [{}, head, [{id => \"a\"}, 0, \"Hello \",  em, [{}, 0, \"there\"]],\nbar, [         {}, 0, \"Howdy\",  ref, [{}]],\n0, \"do\"\n]\n]\n\nThe root document \"foo\", has 3 children: a \"head\" element, a \"bar\"\nelement and the text \"do\". After the empty attribute hash, these are\nrepresented in it's contents by 3 tag-content pairs.\n"
                },
                {
                    "name": "Objects",
                    "content": "This is similar to the Tree style, except that a hash object is created\nfor each element. The corresponding object will be in the class whose\nname is created by appending \"::\" and the element name to the package\nset with the Pkg option. Non-markup text will be in the ::Characters\nclass. The contents of the corresponding object will be in an anonymous\narray that is the value of the Kids property for that object.\n"
                },
                {
                    "name": "Stream",
                    "content": "This style also uses the Pkg package. If none of the subs that this\nstyle looks for is there, then the effect of parsing with this style is\nto print a canonical copy of the document without comments or\ndeclarations. All the subs receive as their 1st parameter the Expat\ninstance for the document they're parsing.\n\nIt looks for the following routines:\n\n*   StartDocument\n\nCalled at the start of the parse .\n\n*   StartTag\n\nCalled for every start tag with a second parameter of the element\ntype. The $ variable will contain a copy of the tag and the %\nvariable will contain attribute values supplied for that element.\n\n*   EndTag\n\nCalled for every end tag with a second parameter of the element\ntype. The $ variable will contain a copy of the end tag.\n\n*   Text\n\nCalled just before start or end tags with accumulated non-markup\ntext in the $ variable.\n\n*   PI\n\nCalled for processing instructions. The $ variable will contain a\ncopy of the PI and the target and data are sent as 2nd and 3rd\nparameters respectively.\n\n*   EndDocument\n\nCalled at conclusion of the parse.\n"
                }
            ]
        },
        "ENCODINGS": {
            "content": "XML documents may be encoded in character sets other than Unicode as\nlong as they may be mapped into the Unicode character set. Expat has\nfurther restrictions on encodings. Read the xmlparse.h header file in\nthe expat distribution to see details on these restrictions.\n\nExpat has built-in encodings for: \"UTF-8\", \"ISO-8859-1\", \"UTF-16\", and\n\"US-ASCII\". Encodings are set either through the XML declaration\nencoding attribute or through the ProtocolEncoding option to XML::Parser\nor XML::Parser::Expat.\n\nFor encodings other than the built-ins, expat calls the function\nloadencoding in the Expat package with the encoding name. This function\nlooks for a file in the path list @XML::Parser::Expat::EncodingPath,\nthat matches the lower-cased name with a '.enc' extension. The first one\nit finds, it loads.\n\nIf you wish to build your own encoding maps, check out the XML::Encoding\nmodule from CPAN.\n",
            "subsections": []
        },
        "AUTHORS": {
            "content": "Larry Wall <larry@wall.org> wrote version 1.0.\n\nClark Cooper <coopercc@netheaven.com> picked up support, changed the API\nfor this version (2.x), provided documentation, and added some standard\npackage features.\n\nMatt Sergeant <matt@sergeant.org> is now maintaining XML::Parser\n",
            "subsections": []
        }
    },
    "summary": "XML::Parser - A perl module for parsing XML documents",
    "flags": [],
    "examples": [],
    "see_also": []
}