Help on class Tag in bs4.element: bs4.element.Tag = class Tag(PageElement) | bs4.element.Tag(parser=None, builder=None, name=None, namespace=None, prefix=None, attrs=None, parent=None, previous=None, is_xml=None, sourceline=None, sourcepos=None, can_be_empty_element=None, cdata_list_attributes=None, preserve_whitespace_tags=None, interesting_string_types=None, namespaces=None) | | Represents an HTML or XML tag that is part of a parse tree, along | with its attributes and contents. | | When Beautiful Soup parses the markup <b>penguin</b>, it will | create a Tag object representing the <b> tag. | | Method resolution order: | Tag | PageElement | builtins.object | | Methods defined here: | | __bool__(self) | A tag is non-None even if it has no contents. | | __call__(self, *args, **kwargs) | Calling a Tag like a function is the same as calling its | find_all() method. Eg. tag('a') returns a list of all the A tags | found within this tag. | | __contains__(self, x) | | __copy__(self) | A copy of a Tag must always be a deep copy, because a Tag's | children can only have one parent at a time. | | __deepcopy__(self, memo, recursive=True) | A deepcopy of a Tag is a new Tag, unconnected to the parse tree. | Its contents are a copy of the old Tag's contents. | | __delitem__(self, key) | Deleting tag[key] deletes all 'key' attributes for the tag. | | __eq__(self, other) | Returns true iff this Tag has the same name, the same attributes, | and the same contents (recursively) as `other`. | | __getattr__(self, tag) | Calling tag.subtag is the same as calling tag.find(name="subtag") | | __getitem__(self, key) | tag[key] returns the value of the 'key' attribute for the Tag, | and throws an exception if it's not there. | | __hash__(self) | Return hash(self). 
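The shorthand forms described above (`tag.subtag`, `tag('a')`, `tag[key]`) can be sketched as follows. The markup is invented for illustration, and the built-in "html.parser" backend is an assumed choice; other parsers may build a slightly different tree.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, parsed with the standard-library backend.
soup = BeautifulSoup(
    '<p id="intro">See <a href="/a">A</a> and <a href="/b">B</a>.</p>',
    "html.parser",
)

p = soup.p                  # __getattr__: same as soup.find(name="p")
links = p("a")              # __call__: same as p.find_all("a")
hrefs = [a["href"] for a in links]   # __getitem__: attribute lookup
first_link = p.a            # first <a> inside <p>

# tag[key] raises KeyError when the attribute is missing.
try:
    p["title"]
    missing_raised = False
except KeyError:
    missing_raised = True
```

The explicit `find()` and `find_all()` calls do the same work and are usually easier to read in larger programs.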
| | __init__(self, parser=None, builder=None, name=None, namespace=None, prefix=None, attrs=None, parent=None, previous=None, is_xml=None, sourceline=None, sourcepos=None, can_be_empty_element=None, cdata_list_attributes=None, preserve_whitespace_tags=None, interesting_string_types=None, namespaces=None) | Basic constructor. | | :param parser: A BeautifulSoup object. | :param builder: A TreeBuilder. | :param name: The name of the tag. | :param namespace: The URI of this Tag's XML namespace, if any. | :param prefix: The prefix for this Tag's XML namespace, if any. | :param attrs: A dictionary of this Tag's attribute values. | :param parent: The PageElement to use as this Tag's parent. | :param previous: The PageElement that was parsed immediately before | this tag. | :param is_xml: If True, this is an XML tag. Otherwise, this is an | HTML tag. | :param sourceline: The line number where this tag was found in its | source document. | :param sourcepos: The character position within `sourceline` where this | tag was found. | :param can_be_empty_element: If True, this tag should be | represented as <tag/>. If False, this tag should be represented | as <tag></tag>. | :param cdata_list_attributes: A list of attributes whose values should | be treated as CDATA if they ever show up on this tag. | :param preserve_whitespace_tags: A list of tag names whose contents | should have their whitespace preserved. | :param interesting_string_types: This is a NavigableString | subclass or a tuple of them. When iterating over this | Tag's strings in methods like Tag.strings or Tag.get_text, | these are the types of strings that are interesting enough | to be considered. The default is to consider | NavigableString and CData the only interesting string | subtypes. | :param namespaces: A dictionary mapping currently active | namespace prefixes to URIs. This can be used later to | construct CSS selectors. | | __iter__(self) | Iterating over a Tag iterates over its contents. 
| | __len__(self) | The length of a Tag is the length of its list of contents. | | __ne__(self, other) | Returns true iff this Tag is not identical to `other`, | as defined in __eq__. | | __repr__ = __unicode__(self) | | __setitem__(self, key, value) | Setting tag[key] sets the value of the 'key' attribute for the | tag. | | __str__ = __unicode__(self) | | __unicode__(self) | Renders this PageElement as a Unicode string. | | childGenerator(self) | Deprecated generator. | | clear(self, decompose=False) | Wipe out all children of this PageElement by calling extract() | on them. | | :param decompose: If this is True, decompose() (a more | destructive method) will be called instead of extract(). | | decode(self, indent_level=None, eventual_encoding='utf-8', formatter='minimal', iterator=None) | | decode_contents(self, indent_level=None, eventual_encoding='utf-8', formatter='minimal') | Renders the contents of this tag as a Unicode string. | | :param indent_level: Each line of the rendering will be | indented this many levels. (The formatter decides what a | 'level' means in terms of spaces or other characters | output.) Used internally in recursive calls while | pretty-printing. | | :param eventual_encoding: The tag is destined to be | encoded into this encoding. decode_contents() is _not_ | responsible for performing that encoding. This information | is passed in so that it can be substituted in if the | document contains a <META> tag that mentions the document's | encoding. | | :param formatter: A Formatter object, or a string naming one of | the standard Formatters. | | decompose(self) | Recursively destroys this PageElement and its children. | | This element will be removed from the tree and wiped out; so | will everything beneath it. | | The behavior of a decomposed PageElement is undefined and you | should never use one for anything, but if you need to _check_ | whether an element has been decomposed, you can use the | `decomposed` property. 
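A minimal sketch of `__len__`, `__setitem__`, `decode_contents()`, `decompose()` and `clear()` on a small tree. The markup and the `data-state` attribute are made up for the example, and "html.parser" is an assumed parser choice.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<div><p>one <b>two</b></p><span>three</span></div>", "html.parser"
)
div = soup.div

child_count = len(div)                  # length of div.contents
inner_html = div.p.decode_contents()    # markup inside <p>, as a str

div["data-state"] = "seen"              # __setitem__ sets an attribute

span = div.span
span.decompose()                        # remove and destroy <span>
span_gone = span.decomposed             # only safe thing to ask of it

div.p.clear()                           # extract() every child of <p>
result = str(div)
```

After `decompose()` the variable `span` still exists, but the only supported use is checking `decomposed`.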
| | encode(self, encoding='utf-8', indent_level=None, formatter='minimal', errors='xmlcharrefreplace') | Render a bytestring representation of this PageElement and its | contents. | | :param encoding: The destination encoding. | :param indent_level: Each line of the rendering will be | indented this many levels. (The formatter decides what a | 'level' means in terms of spaces or other characters | output.) Used internally in recursive calls while | pretty-printing. | :param formatter: A Formatter object, or a string naming one of | the standard formatters. | :param errors: An error handling strategy such as | 'xmlcharrefreplace'. This value is passed along into | str.encode() and should be one of the error handler names | defined by Python. | :return: A bytestring. | | encode_contents(self, indent_level=None, encoding='utf-8', formatter='minimal') | Renders the contents of this PageElement as a bytestring. | | :param indent_level: Each line of the rendering will be | indented this many levels. (The formatter decides what a | 'level' means in terms of spaces or other characters | output.) Used internally in recursive calls while | pretty-printing. | | :param encoding: The bytestring will be in this encoding. | | :param formatter: A Formatter object, or a string naming one of | the standard Formatters. | | :return: A bytestring. | | find(self, name=None, attrs={}, recursive=True, string=None, **kwargs) | Look in the children of this PageElement and find the first | PageElement that matches the given criteria. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param recursive: If this is True, find() will perform a | recursive search of this PageElement's children. Otherwise, | only the direct children will be considered. | :param string: A filter for a NavigableString with specific text. 
| :kwargs: A dictionary of filters on attribute values. | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | findAll = find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs) | | findChild = find(self, name=None, attrs={}, recursive=True, string=None, **kwargs) | | findChildren = find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs) | | find_all(self, name=None, attrs={}, recursive=True, string=None, limit=None, **kwargs) | Look in the children of this PageElement and find all | PageElements that match the given criteria. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param recursive: If this is True, find_all() will perform a | recursive search of this PageElement's children. Otherwise, | only the direct children will be considered. | :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | :return: A ResultSet of PageElements. | :rtype: bs4.element.ResultSet | | get(self, key, default=None) | Returns the value of the 'key' attribute for the tag, or | the value given for 'default' if it doesn't have that | attribute. | | get_attribute_list(self, key, default=None) | The same as get(), but always returns a list. | | :param key: The attribute to look for. | :param default: Use this value if the attribute is not present | on this PageElement. | :return: A list of values, probably containing only a single | value. | | has_attr(self, key) | Does this PageElement have an attribute with the given name? | | has_key(self, key) | Deprecated method. This was kind of misleading because has_key() | (attributes) was different from __in__ (contents). | | has_key() is gone in Python 3, anyway. 
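The search and attribute helpers listed above might be used like this. The list markup is hypothetical and "html.parser" is an assumed parser choice; `class_` is the keyword Beautiful Soup accepts for filtering on the `class` attribute.

```python
from bs4 import BeautifulSoup

html = (
    '<ul><li class="item new"><a href="/x">X</a></li>'
    '<li class="item"><a>Y</a></li></ul>'
)
ul = BeautifulSoup(html, "html.parser").ul

first_new = ul.find("li", class_="new")           # first match or None
one_item = ul.find_all("li", limit=1)             # stop after one result
direct_links = ul.find_all("a", recursive=False)  # direct children only

# get() returns a default instead of raising when the attribute is absent.
hrefs = [a.get("href", "#") for a in ul.find_all("a")]

classes = ul.li.get_attribute_list("class")       # always a list
second_has_href = ul.find_all("a")[1].has_attr("href")
```

`recursive=False` finds nothing here because every `<a>` is a grandchild of `<ul>`, not a direct child.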
| | index(self, element) | Find the index of a child by identity, not value. | | Avoids issues with tag.contents.index(element) getting the | index of equal elements. | | :param element: Look for this PageElement in `self.contents`. | | prettify(self, encoding=None, formatter='minimal') | Pretty-print this PageElement as a string. | | :param encoding: The eventual encoding of the string. If this is None, | a Unicode string will be returned. | :param formatter: A Formatter object, or a string naming one of | the standard formatters. | :return: A Unicode string (if encoding==None) or a bytestring | (otherwise). | | recursiveChildGenerator(self) | Deprecated generator. | | renderContents(self, encoding='utf-8', prettyPrint=False, indentLevel=0) | Deprecated method for BS3 compatibility. | | select(self, selector, namespaces=None, limit=None, **kwargs) | Perform a CSS selection operation on the current element. | | This uses the SoupSieve library. | | :param selector: A string containing a CSS selector. | | :param namespaces: A dictionary mapping namespace prefixes | used in the CSS selector to namespace URIs. By default, | Beautiful Soup will use the prefixes it encountered while | parsing the document. | | :param limit: After finding this number of results, stop looking. | | :param kwargs: Keyword arguments to be passed into SoupSieve's | soupsieve.select() method. | | :return: A ResultSet of Tags. | :rtype: bs4.element.ResultSet | | select_one(self, selector, namespaces=None, **kwargs) | Perform a CSS selection operation on the current element. | | :param selector: A CSS selector. | | :param namespaces: A dictionary mapping namespace prefixes | used in the CSS selector to namespace URIs. By default, | Beautiful Soup will use the prefixes it encountered while | parsing the document. | | :param kwargs: Keyword arguments to be passed into Soup Sieve's | soupsieve.select() method. | | :return: A Tag. 
| :rtype: bs4.element.Tag | | smooth(self) | Smooth out this element's children by consolidating consecutive | strings. | | This makes pretty-printed output look more natural following a | lot of operations that modified the tree. | | ---------------------------------------------------------------------- | Readonly properties defined here: | | children | Iterate over all direct children of this PageElement. | | :yield: A sequence of PageElements. | | css | Return an interface to the CSS selector API. | | descendants | Iterate over all descendants of this PageElement in document | order (a depth-first traversal). | | :yield: A sequence of PageElements. | | isSelfClosing | Is this tag an empty-element tag? (aka a self-closing tag) | | A tag that has contents is never an empty-element tag. | | A tag that has no contents may or may not be an empty-element | tag. It depends on the builder used to create the tag. If the | builder has a designated list of empty-element tags, then only | a tag whose name shows up in that list is considered an | empty-element tag. | | If the builder has no designated list of empty-element tags, | then any tag with no contents is an empty-element tag. | | is_empty_element | Is this tag an empty-element tag? (aka a self-closing tag) | | A tag that has contents is never an empty-element tag. | | A tag that has no contents may or may not be an empty-element | tag. It depends on the builder used to create the tag. If the | builder has a designated list of empty-element tags, then only | a tag whose name shows up in that list is considered an | empty-element tag. | | If the builder has no designated list of empty-element tags, | then any tag with no contents is an empty-element tag. | | self_and_descendants | Iterate over this PageElement and then its descendants in | document order (a depth-first traversal). | | :yield: A sequence of PageElements. | | strings | Yield all strings of certain classes, possibly stripping them. 
| | :param strip: If True, all strings will be stripped before being | yielded. | | :param types: A tuple of NavigableString subclasses. Any strings of | a subclass not found in this list will be ignored. By | default, the subclasses considered are the ones found in | self.interesting_string_types. If that's not specified, | only NavigableString and CData objects will be | considered. That means no comments, processing | instructions, etc. | | :yield: A sequence of strings. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | parserClass | | string | Convenience property to get the single string within this | PageElement. | | TODO It might make sense to have NavigableString.string return | itself. | | :return: If this element has a single string child, return | value is that string. If this element has one child tag, | return value is the 'string' attribute of the child tag, | recursively. If this element is itself a string, has no | children, or has more than one child, return value is None. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | DEFAULT_INTERESTING_STRING_TYPES = (<class 'bs4.element.NavigableStrin... | | EMPTY_ELEMENT_EVENT = <object object> | | END_ELEMENT_EVENT = <object object> | | START_ELEMENT_EVENT = <object object> | | STRING_ELEMENT_EVENT = <object object> | | ---------------------------------------------------------------------- | Methods inherited from PageElement: | | append(self, tag) | Appends the given PageElement to the contents of this one. | | :param tag: A PageElement. | | extend(self, tags) | Appends the given PageElements to this one's contents. | | :param tags: A list of PageElements. If a single Tag is | provided instead, this PageElement's contents will be extended | with that Tag's contents. | | extract(self, _self_index=None) | Destructively rips this element out of the tree. 
| | :param _self_index: The location of this element in its parent's | .contents, if known. Passing this in allows for a performance | optimization. | | :return: `self`, no longer part of the tree. | | fetchNextSiblings = find_next_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | fetchParents = find_parents(self, name=None, attrs={}, limit=None, **kwargs) | | fetchPrevious = find_all_previous(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | fetchPreviousSiblings = find_previous_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | findAllNext = find_all_next(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | findAllPrevious = find_all_previous(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | findNext = find_next(self, name=None, attrs={}, string=None, **kwargs) | | findNextSibling = find_next_sibling(self, name=None, attrs={}, string=None, **kwargs) | | findNextSiblings = find_next_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | findParent = find_parent(self, name=None, attrs={}, **kwargs) | | findParents = find_parents(self, name=None, attrs={}, limit=None, **kwargs) | | findPrevious = find_previous(self, name=None, attrs={}, string=None, **kwargs) | | findPreviousSibling = find_previous_sibling(self, name=None, attrs={}, string=None, **kwargs) | | findPreviousSiblings = find_previous_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | | find_all_next(self, name=None, attrs={}, string=None, limit=None, **kwargs) | Find all PageElements that match the given criteria and appear | later in the document than this PageElement. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. 
| :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | :return: A ResultSet containing PageElements. | | find_all_previous(self, name=None, attrs={}, string=None, limit=None, **kwargs) | Look backwards in the document from this PageElement and find all | PageElements that match the given criteria. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | :return: A ResultSet of PageElements. | :rtype: bs4.element.ResultSet | | find_next(self, name=None, attrs={}, string=None, **kwargs) | Find the first PageElement that matches the given criteria and | appears later in the document than this PageElement. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :kwargs: A dictionary of filters on attribute values. | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | find_next_sibling(self, name=None, attrs={}, string=None, **kwargs) | Find the closest sibling to this PageElement that matches the | given criteria and appears later in the document. | | All find_* methods take a common set of arguments. See the | online documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :kwargs: A dictionary of filters on attribute values. | :return: A PageElement. 
| :rtype: bs4.element.Tag | bs4.element.NavigableString | | find_next_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | Find all siblings of this PageElement that match the given criteria | and appear later in the document. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | :return: A ResultSet of PageElements. | :rtype: bs4.element.ResultSet | | find_parent(self, name=None, attrs={}, **kwargs) | Find the closest parent of this PageElement that matches the given | criteria. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :kwargs: A dictionary of filters on attribute values. | | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | find_parents(self, name=None, attrs={}, limit=None, **kwargs) | Find all parents of this PageElement that match the given criteria. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | | :return: A ResultSet of PageElements. | :rtype: bs4.element.ResultSet | | find_previous(self, name=None, attrs={}, string=None, **kwargs) | Look backwards in the document from this PageElement and find the | first PageElement that matches the given criteria. | | All find_* methods take a common set of arguments. 
See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :kwargs: A dictionary of filters on attribute values. | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | find_previous_sibling(self, name=None, attrs={}, string=None, **kwargs) | Returns the closest sibling to this PageElement that matches the | given criteria and appears earlier in the document. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :kwargs: A dictionary of filters on attribute values. | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | find_previous_siblings(self, name=None, attrs={}, string=None, limit=None, **kwargs) | Returns all siblings to this PageElement that match the | given criteria and appear earlier in the document. | | All find_* methods take a common set of arguments. See the online | documentation for detailed explanations. | | :param name: A filter on tag name. | :param attrs: A dictionary of filters on attribute values. | :param string: A filter for a NavigableString with specific text. | :param limit: Stop looking after finding this many results. | :kwargs: A dictionary of filters on attribute values. | :return: A ResultSet of PageElements. | :rtype: bs4.element.ResultSet | | format_string(self, s, formatter) | Format the given string using the given formatter. | | :param s: A string. | :param formatter: A Formatter object, or a string naming one of the standard formatters. | | formatter_for_name(self, formatter) | Look up or create a Formatter for the given identifier, | if necessary. 
| | :param formatter: Can be a Formatter object (used as-is), a | function (used as the entity substitution hook for an | XMLFormatter or HTMLFormatter), or a string (used to look | up an XMLFormatter or HTMLFormatter in the appropriate | registry. | | getText = get_text(self, separator='', strip=False, types=<object object at 0x7f03ff188cb0>) | | get_text(self, separator='', strip=False, types=<object object at 0x7f03ff188cb0>) | Get all child strings of this PageElement, concatenated using the | given separator. | | :param separator: Strings will be concatenated using this separator. | | :param strip: If True, strings will be stripped before being | concatenated. | | :param types: A tuple of NavigableString subclasses. Any | strings of a subclass not found in this list will be | ignored. Although there are exceptions, the default | behavior in most cases is to consider only NavigableString | and CData objects. That means no comments, processing | instructions, etc. | | :return: A string. | | insert(self, position, new_child) | Insert a new PageElement in the list of this PageElement's children. | | This works the same way as `list.insert`. | | :param position: The numeric position that should be occupied | in `self.children` by the new PageElement. | :param new_child: A PageElement. | | insert_after(self, *args) | Makes the given element(s) the immediate successor of this one. | | The elements will have the same parent, and the given elements | will be immediately after this one. | | :param args: One or more PageElements. | | insert_before(self, *args) | Makes the given element(s) the immediate predecessor of this one. | | All the elements will have the same parent, and the given elements | will be immediately before this one. | | :param args: One or more PageElements. | | nextGenerator(self) | # Old non-property versions of the generators, for backwards | # compatibility with BS3. 
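A short sketch of `insert()`, `insert_before()`, `insert_after()` and `get_text()`. The markup is invented, "html.parser" is an assumed parser choice, and plain Python strings are passed in place of PageElements, which Beautiful Soup converts to NavigableString objects on insertion.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>world</b></p>", "html.parser")
b = soup.b

b.insert_before("Hello ")     # new previous sibling of <b>
b.insert_after("!")           # new next sibling of <b>
soup.p.insert(0, "Note: ")    # position 0 in p.contents, like list.insert

markup = str(soup.p)
plain = soup.p.get_text()                       # concatenate all strings
piped = soup.p.get_text("|", strip=True)        # strip, then join with "|"
```

With `strip=True`, each string is stripped before joining and strings that become empty are skipped.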
| | nextSiblingGenerator(self) | | parentGenerator(self) | | previousGenerator(self) | | previousSiblingGenerator(self) | | replaceWith = replace_with(self, *args) | | replaceWithChildren = unwrap(self) | | replace_with(self, *args) | Replace this PageElement with one or more PageElements, keeping the | rest of the tree the same. | | :param args: One or more PageElements. | :return: `self`, no longer part of the tree. | | replace_with_children = unwrap(self) | | setup(self, parent=None, previous_element=None, next_element=None, previous_sibling=None, next_sibling=None) | Sets up the initial relations between this element and | other elements. | | :param parent: The parent of this element. | | :param previous_element: The element parsed immediately before | this one. | | :param next_element: The element parsed immediately after | this one. | | :param previous_sibling: The most recently encountered element | on the same level of the parse tree as this one. | | :param next_sibling: The next element to be encountered | on the same level of the parse tree as this one. | | unwrap(self) | Replace this PageElement with its contents. | | :return: `self`, no longer part of the tree. | | wrap(self, wrap_inside) | Wrap this PageElement inside another one. | | :param wrap_inside: A PageElement. | :return: `wrap_inside`, occupying the position in the tree that used | to be occupied by `self`, and with `self` inside it. | | ---------------------------------------------------------------------- | Readonly properties inherited from PageElement: | | decomposed | Check whether a PageElement has been decomposed. | | :rtype: bool | | next | The PageElement, if any, that was parsed just after this one. | | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | next_elements | All PageElements that were parsed after this one. | | :yield: A sequence of PageElements. 
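The tree-editing methods `unwrap()`, `wrap()` and `replace_with()` can be sketched as below. The markup is hypothetical and "html.parser" is an assumed parser choice. `new_tag()` is a method of the BeautifulSoup object rather than of Tag, so it does not appear in this listing.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    '<p>Go to <a href="/x"><i>the page</i></a> now</p>', "html.parser"
)

soup.i.unwrap()                       # <i> is replaced by its contents
after_unwrap = str(soup.a)

wrapper = soup.new_tag("span")        # new_tag() lives on BeautifulSoup
returned = soup.a.wrap(wrapper)       # <a> now sits inside <span>
after_wrap = str(soup.p)

old_link = soup.a.replace_with("the page")   # returns the removed <a>
after_replace = str(soup.p)
```

Each of these returns an element: `unwrap()` and `replace_with()` return the element that left the tree, while `wrap()` returns the wrapper that took its place.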
| | next_siblings | All PageElements that are siblings of this one but were parsed | later. | | :yield: A sequence of PageElements. | | parents | All PageElements that are parents of this PageElement. | | :yield: A sequence of PageElements. | | previous | The PageElement, if any, that was parsed just before this one. | | :return: A PageElement. | :rtype: bs4.element.Tag | bs4.element.NavigableString | | previous_elements | All PageElements that were parsed before this one. | | :yield: A sequence of PageElements. | | previous_siblings | All PageElements that are siblings of this one but were parsed | earlier. | | :yield: A sequence of PageElements. | | stripped_strings | Yield all strings in this PageElement, stripping them first. | | :yield: A sequence of stripped strings. | | text | Get all child strings of this PageElement, concatenated using the | given separator. | | :param separator: Strings will be concatenated using this separator. | | :param strip: If True, strings will be stripped before being | concatenated. | | :param types: A tuple of NavigableString subclasses. Any | strings of a subclass not found in this list will be | ignored. Although there are exceptions, the default | behavior in most cases is to consider only NavigableString | and CData objects. That means no comments, processing | instructions, etc. | | :return: A string. | | ---------------------------------------------------------------------- | Data descriptors inherited from PageElement: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | nextSibling | | previousSibling | | ---------------------------------------------------------------------- | Data and other attributes inherited from PageElement: | | default = <object object> | | known_xml = None
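The navigation generators and text properties in this listing can be tried on a small tree. The markup is invented for illustration and "html.parser" is an assumed parser choice. The example records what `descendants` yields so the traversal order can be inspected directly.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    "<div><h1>Title</h1><p>Second <b>bold</b></p><span>end</span></div>",
    "html.parser",
)
div = soup.div

# Siblings parsed after <h1>, in document order.
later = [el.name for el in soup.h1.next_siblings]

# Ancestors of <b>, nearest first; the BeautifulSoup object comes last.
ancestors = [el.name for el in soup.b.parents]

# Tag names met while walking descendants (strings have name None).
walk = [el.name for el in div.descendants if el.name is not None]

words = list(div.stripped_strings)
joined = div.text
```

Here `walk` visits `<b>` before `<span>`, which shows nested elements being yielded before later siblings of their parent.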