NAME
    HTML::TreeBuilder::XPath - add XPath support to HTML::TreeBuilder

SYNOPSIS
      use HTML::TreeBuilder::XPath;
      my $tree= HTML::TreeBuilder::XPath->new;
      $tree->parse_file( "mypage.html");
      my $nb=$tree->findvalue( '/html/body//p[@class="section_title"]/span[@class="nb"]');
      my $id=$tree->findvalue( '/html/body//p[@class="section_title"]/@id');

      my $p= $tree->findnodes( '//p[@id="toto"]')->[0];
      my $link_texts= $p->findvalue( './a'); # the texts of all a elements in $p, concatenated
      $tree->delete; # to avoid memory leaks, if you parse many HTML documents

DESCRIPTION
    This module adds typical XPath methods to HTML::TreeBuilder, to make it easy to query a
    document.

METHODS
    Extra methods added both to the tree object and to each element:

  findnodes ($path)
    Returns a list of the nodes found by $path. In scalar context, returns an
    "XML::XPathEngine::NodeSet" object.
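
    As a minimal sketch of the two calling contexts, the snippet below parses a small inline
    document (the HTML string and variable names are made up for illustration) and calls
    findnodes once in list context and once in scalar context. It assumes the node set object
    offers the size and get_nodelist methods of the underlying XPath engine.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative document, not taken from the module's documentation.
my $html = '<html><body><p class="a">one</p><p class="a">two</p><p>three</p></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# List context: a plain Perl list of element objects.
my @nodes = $tree->findnodes('//p[@class="a"]');
print scalar(@nodes), " nodes in list context\n";

# Scalar context: a single node set object.
my $nodeset = $tree->findnodes('//p[@class="a"]');
print $nodeset->size, " nodes in the node set\n";
print $_->as_text, "\n" for $nodeset->get_nodelist;

$tree->delete;
```

    The scalar form is convenient when only the count or the first node is needed; the list form
    reads more naturally in a loop.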

  findnodes_as_string ($path)
    Returns the text values of the nodes, as one string.

  findnodes_as_strings ($path)
    Returns a list of the values of the result nodes.

  findvalue ($path)
    Returns either an "XML::XPathEngine::Literal", an "XML::XPathEngine::Boolean" or an
    "XML::XPathEngine::Number" object. If the path returns a NodeSet, $nodeset->xpath_to_literal is
    called automatically for you (and thus an "XML::XPathEngine::Literal" is returned). Note that
    stringification is overloaded for each of these objects, so you can just print the value found,
    or manipulate it as you would a normal Perl value (e.g. using regular expressions).
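
    The following sketch shows findvalue used on an attribute, on a numeric expression and on a
    path that selects several elements. The document and variable names are hypothetical; the
    results are only ever used as ordinary Perl values, which is all the description above
    promises.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative document, not taken from the module's documentation.
my $html = '<html><body><p id="toto">Read <a href="/a">first</a>'
         . ' and <a href="/b">second</a></p></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# An attribute value.
my $id = $tree->findvalue('//p[@id="toto"]/@id');

# A numeric XPath expression.
my $count = $tree->findvalue('count(//a)');

# Several matching nodes: their text values come back as one string.
my $links = $tree->findvalue('//p[@id="toto"]/a');

print "id: $id\n";
print "links: $count\n";
print "link text: $links\n";

# The result can be used directly in a regular expression.
print "id starts with 'to'\n" if $id =~ /^to/;

$tree->delete;
```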

  findvalues ($path)
    Returns the values of the matching nodes as a list. This is mostly the same as
    findnodes_as_strings, except that the elements of the list are objects (with overloaded
    stringification) instead of plain strings.
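
    To compare the two list-returning methods, the sketch below runs findnodes_as_strings and
    findvalues on the same path. The list items and variable names are invented for the example,
    and the values are only ever stringified, so the code does not depend on whether an item is an
    object or a plain string.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative document, not taken from the module's documentation.
my $html = '<html><body><ul><li>alpha</li><li>beta</li></ul></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# One entry per matching node, in document order.
my @strings = $tree->findnodes_as_strings('//li');
my @values  = $tree->findvalues('//li');

print "strings: @strings\n";
print "values:  ", join(' ', map { "$_" } @values), "\n";

$tree->delete;
```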

  exists ($path)
    Returns true if the given path matches at least one node.

  matches ($path)
    Returns true if the element matches the path.
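
    A short sketch of both predicates follows. The document is made up, and the matches call
    assumes that an absolute path is evaluated against the whole tree and that the element is
    then looked up among the results.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative document, not taken from the module's documentation.
my $html = '<html><body><p class="a">one</p><p>two</p></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# exists: is there at least one node for this path?
my $has_a   = $tree->exists('//p[@class="a"]') ? 1 : 0;
my $has_div = $tree->exists('//div')           ? 1 : 0;

# matches: is this particular element among the nodes selected by the path?
my ($first_p) = $tree->findnodes('//p');
my $is_a = $first_p->matches('//p[@class="a"]') ? 1 : 0;

print "has p.a: $has_a, has div: $has_div, first p is p.a: $is_a\n";

$tree->delete;
```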

  find ($path)
    The find function takes an XPath expression (a string) and returns either an
    XML::XPathEngine::NodeSet object containing the nodes it found (empty if no nodes matched
    the path), or one of XML::XPathEngine::Literal (a string), XML::XPathEngine::Number, or
    XML::XPathEngine::Boolean. It should always return something, and you can use ->isa() to find
    out what it returned. If you need to know how many nodes it found, check $nodeset->size. See
    XML::XPathEngine::NodeSet.

  as_XML_compact
    HTML::TreeBuilder's "as_XML" output is not really nice to look at, so I added a new method that
    can be used as a simple replacement for it. It escapes only '<', '>' and '&' (plus '"' in
    attribute values), and wraps CDATA elements in CDATA sections.

    Note that the XML is not guaranteed to be valid at this point: nothing is done about the
    encoding of the string. Patches, or just ideas of how it could work, are welcome.

  as_XML_indented
    Same as as_XML, except that the output is indented.
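
    As a rough illustration of the two serialization methods, the sketch below dumps a tiny,
    invented document with each of them. No particular layout is assumed for the output beyond
    the escaping of '&' described above.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Illustrative document, not taken from the module's documentation.
my $html = '<html><body><p title="x">fish &amp; chips</p></body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# Compact serialization with minimal escaping.
my $compact = $tree->as_XML_compact;

# Indented serialization, easier to read for larger documents.
my $indented = $tree->as_XML_indented;

print "compact:\n$compact\n";
print "indented:\n$indented\n";

$tree->delete;
```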

SEE ALSO
    HTML::TreeBuilder

    XML::XPathEngine

REPOSITORY
    <https://github.com/mirod/HTML--TreeBuilder--XPath>

AUTHOR
    Michel Rodriguez, <mirod AT cpan.org>

COPYRIGHT AND LICENSE
    Copyright (C) 2006-2011 by Michel Rodriguez

    This library is free software; you can redistribute it and/or modify it under the same terms as
    Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may
    have available.
