{
    "mode": "perldoc",
    "parameter": "HTML::Clean",
    "section": "",
    "url": "https://www.chedong.com/phpMan.php/perldoc/HTML%3A%3AClean/json",
    "generated": "2026-06-02T23:30:42Z",
    "synopsis": "use HTML::Clean;\n$h = HTML::Clean->new($filename); # or..\n$h = HTML::Clean->new($htmlcode);\n$h->compat();\n$h->strip();\n$data = $h->data();\nprint $$data;",
    "sections": {
        "NAME": {
            "content": "HTML::Clean - Cleans up HTML code for web browsers, not humans\n",
            "subsections": []
        },
        "SYNOPSIS": {
            "content": "use HTML::Clean;\n$h = HTML::Clean->new($filename); # or..\n$h = HTML::Clean->new($htmlcode);\n\n$h->compat();\n$h->strip();\n$data = $h->data();\nprint $$data;\n",
            "subsections": []
        },
        "DESCRIPTION": {
            "content": "The HTML::Clean module encapsulates a number of common techniques for minimizing the size of\nHTML files. You can typically save between 10% and 50% of the size of an HTML file using these\nmethods. It provides the following features:\n\nRemove unneeded whitespace (beginning of line, etc.)\nRemove unneeded META elements.\nRemove HTML comments (except for styles, JavaScript and SSI)\nReplace tags with equivalent shorter tags (<strong> --> <b>)\netc.\n\nThe entire process is configurable, so you can pick and choose what you want to clean.\n\nTHE HTML::Clean CLASS\n$h = HTML::Clean->new($dataorfile, [$level]);\nThis creates a new HTML::Clean object. It is a prerequisite for all other functions in this module.\n\nThe $dataorfile parameter supplies the input HTML, either a filename or a reference to a scalar\nvalue holding the HTML, for example:\n\n$h = HTML::Clean->new(\"/htdocs/index.html\");\n$html = \"<strong>Hello!</strong>\";\n$h = HTML::Clean->new(\\$html);\n\nAn optional 'level' parameter controls the level of optimization performed. Levels range from 1\nto 9. Level 1 includes only simple fast optimizations. Level 9 includes all optimizations.\n\n$h->initialize($dataorfile)\nThis function allows you to reinitialize the HTML data used by the current object. This is\nuseful if you are processing many files.\n\n$dataorfile has the same usage as in the new method.\n\nReturns 0 on error, 1 on success.\n\n$h->level([$level])\nGet/set the optimization level. $level is a number from 1 to 9.\n\n$myref = $h->data()\nReturns the current HTML data as a scalar reference.\n\nstrip(\\%options);\nRemoves excess space from HTML.\n\nYou can control the optimizations used by specifying them in the %options hash reference.\n\nThe following options are recognized:\n\nboolean values (0 or 1 values)\nwhitespace    Remove excess whitespace\nshortertags   <strong> -> <b>, etc.\nblink         No blink tags.\ncontenttype   Remove default contenttype.\ncomments      Remove excess comments.\nentities      &quot; -> \", etc.\ndequote       remove quotes from tag parameters where possible.\ndefcolor      recode colors in shorter form. (#ffffff -> white, etc.)\njavascript    remove excess spaces and newlines in JavaScript code.\nhtmldefaults  remove default values for some HTML tags\nlowercasetags translate all HTML tags to lowercase\n\nparameterized values\nmeta        Takes a space-separated list of meta tags to remove,\ndefault \"GENERATOR FORMATTER\"\n\nemptytags   Takes a space-separated list of tags to remove when there is no\ncontent between the start and end tag, like this: <b></b>.\nThe default is 'b i font center'\n\nPlease note that if your HTML includes preformatted regions (that is, if it includes\n<pre>...</pre>), we do not suggest removing whitespace, as it will alter the rendered output.\n\nHTML::Clean will print out a warning if it finds a preformatted region and is requested to strip\nwhitespace. To prevent this, specify that you don't want to strip whitespace, i.e.\n\n$h->strip( {whitespace => 0} );\n\ncompat()\nThis function improves the cross-platform compatibility of your HTML. It currently checks for the\nfollowing problems:\n\nEnsuring all IMG tags have ALT attributes.\nUse of Arial, Futura, or Verdana as a font face.\nPositioning the <TITLE> tag immediately after the <head> tag.\n\ndefrontpage();\nThis function converts pages created with Microsoft FrontPage to something a Unix server will\nunderstand a bit better. This function currently does the following:\n\nConverts FrontPage 'hit counters' into a Unix-specific format.\nRemoves some FrontPage-specific HTML comments.\n",
            "subsections": []
        },
        "SEE ALSO": {
            "content": "",
            "subsections": [
                {
                    "name": "Modules",
                    "content": "FrontPage::Web, FrontPage::File\n"
                },
                {
                    "name": "Web Sites",
                    "content": "Distribution Site - http://people.itu.int/~lindner/\n\nAUTHORS and CO-AUTHORS\nPaul Lindner for the International Telecommunication Union (ITU)\n\nPavel Kuptsov <admin@modernperl.ru>\n"
                }
            ]
        },
        "COPYRIGHT": {
            "content": "The HTML::Strip module is Copyright (c) 1998,99 by the ITU, Geneva, Switzerland. All rights\nreserved.\n\nYou may distribute under the terms of either the GNU General Public License or the Artistic\nLicense, as specified in the Perl README file.\n",
            "subsections": []
        }
    },
    "summary": "HTML::Clean - Cleans up HTML code for web browsers, not humans",
    "flags": [],
    "examples": [],
    "see_also": []
}