{
    "mode": "perldoc",
    "parameter": "WWW::Mechanize::FAQ",
    "section": "",
    "url": "https://www.chedong.com/phpMan.php/perldoc/WWW%3A%3AMechanize%3A%3AFAQ/json",
    "generated": "2026-06-11T06:52:42Z",
    "sections": {
        "NAME": {
            "content": "WWW::Mechanize::FAQ - Frequently Asked Questions about WWW::Mechanize\n",
            "subsections": []
        },
        "VERSION": {
            "content": "version 2.06\n",
            "subsections": []
        },
        "How to get help with WWW::Mechanize": {
            "content": "If your question isn't answered here in the FAQ, please turn to the communities at:\n\n*   StackOverflow <https://stackoverflow.com/questions/tagged/www-mechanize>\n\n*   #lwp on irc.perl.org\n\n*   <http://perlmonks.org>\n\n*   The libwww-perl mailing list at <http://lists.perl.org>\n",
            "subsections": []
        },
        "JavaScript": {
            "content": "I have this web page that has JavaScript on it, and my Mech program doesn't work.\nThat's because WWW::Mechanize doesn't operate on the JavaScript. It only understands the HTML\nparts of the page.\n\nI thought Mech was supposed to work like a web browser.\nIt does pretty much, but it doesn't support JavaScript.\n\nI added some basic attempts at picking up URLs in \"window.open()\" calls and return them in\n\"$mech->links\". They work sometimes.\n\nSince Javascript is completely visible to the client, it cannot be used to prevent a scraper\nfrom following links. But it can make life difficult. If you want to scrape specific pages, then\na solution is always possible.\n\nOne typical use of Javascript is to perform argument checking before posting to the server. The\nURL you want is probably just buried in the Javascript function. Do a regular expression match\non \"$mech->content()\" to find the link that you want and \"$mech->get\" it directly (this assumes\nthat you know what you are looking for in advance).\n\nIn more difficult cases, the Javascript is used for URL mangling to satisfy the needs of some\nmiddleware. In this case you need to figure out what the Javascript is doing (why are these URLs\nalways really long?). There is probably some function with one or more arguments which\ncalculates the new URL. Step one: using your favorite browser, get the before and after URLs and\nsave them to files. Edit each file, converting the argument separators ('?', '&' or ';') into\nnewlines. Now it is easy to use diff or comm to find out what Javascript did to the URL. Step 2\n- find the function call which created the URL - you will need to parse and interpret its\nargument list. The Javascript Debugger in the Firebug extension for Firefox helps with the\nanalysis. 
At this point, it is fairly trivial to write your own function which emulates the\nJavascript for the pages you want to process.\n\nHere's another approach that answers the question, \"It works in Firefox, but why not Mech?\"\nEverything the web server knows about the client is present in the HTTP request. If two requests\nare identical, the results should be identical. So the real question is \"What is different\nbetween the mech request and the Firefox request?\"\n\nThe Firefox extension \"Tamper Data\" is an effective tool for examining the headers of the\nrequests to the server. Compare that with what LWP is sending. Once the two are identical, the\naction of the server should be the same as well.\n\nI say \"should\", because this is an oversimplification - some values are naturally unique, e.g. a\nSessionID, but if a SessionID is present, that is probably sufficient, even though the value\nwill be different between the LWP request and the Firefox request. The server could use the\nsession to store information which is troublesome, but that's not the first place to look (and\nhighly unlikely to be relevant when you are requesting the login page of your site).\n\nGenerally the problem is to be found in missing or incorrect POSTDATA arguments, Cookies,\nUser-Agents, Accepts, etc. If you are using mech, then redirects and cookies should not be a\nproblem, but are listed here for completeness. If you are missing headers, \"$mech->add_header\"\ncan be used to add the headers that you need.\n\nWhich modules work like Mechanize and have JavaScript support?\nIn no particular order: Gtk2::WebKit::Mechanize, Win32::IE::Mechanize, WWW::Mechanize::Firefox,\nWWW::Scripter, WWW::Selenium\n\nHow do I do X?\nCan I do [such-and-such] with WWW::Mechanize?\nIf it's possible with LWP::UserAgent, then yes. WWW::Mechanize is a subclass of LWP::UserAgent,\nso all the wondrous magic of that class is inherited.\n\nHow do I use WWW::Mechanize through a proxy server?\nSee the docs in LWP::UserAgent on how to use the proxy. Short version:\n\n$mech->proxy(['http', 'ftp'], 'http://proxy.example.com:8000/');\n\nor get the specs from the environment:\n\n$mech->env_proxy();\n\n# Environment set like so:\ngopher_proxy=http://proxy.my.place/\nwais_proxy=http://proxy.my.place/\nno_proxy=\"localhost,my.domain\"\nexport gopher_proxy wais_proxy no_proxy\n\nHow can I see what fields are on the forms?\nUse the mech-dump utility, optionally installed with Mechanize.\n\n$ mech-dump --forms http://search.cpan.org\nDumping forms\nGET http://search.cpan.org/search\nquery=\nmode=all                        (option)  [*all|module|dist|author]\n<NONAME>=CPAN Search            (submit)\n\nHow do I get Mech to handle authentication?\nuse MIME::Base64;\n\nmy $agent = WWW::Mechanize->new();\nmy @args = (\nAuthorization => \"Basic \" .\nMIME::Base64::encode( USER . ':' . PASS )\n);\n\n$agent->credentials( ADDRESS, REALM, USER, PASS );\n$agent->get( URL, @args );\n\nIf you want to use the credentials for all future requests, you can also use the LWP::UserAgent\n\"default_header()\" method instead of the extra arguments to \"get()\"\n\n$mech->default_header(\nAuthorization => 'Basic ' . encode_base64( USER . ':' . PASSWORD ) );\n\nHow can I get WWW::Mechanize to execute this JavaScript?\nYou can't. JavaScript is entirely client-based, and WWW::Mechanize is a client that doesn't\nunderstand JavaScript. See the top part of this FAQ.\n\nHow do I check a checkbox that doesn't have a value defined?\nSet it to the value of \"on\".\n\n$mech->field( mycheckbox => 'on' );\n\nHow do I handle frames?\nYou don't deal with them as frames, per se, but as links. Extract them with\n\nmy @framelinks = $mech->find_link( tag => \"frame\" );\n\nHow do I get a list of HTTP headers and their values?\nAll HTTP::Headers methods work on a HTTP::Response object which is returned by the *get()*,\n*reload()*, *response()/res()*, *click()*, *submit_form()*, and *request()* methods.\n\nmy $mech = WWW::Mechanize->new( autocheck => 1 );\n$mech->get( 'http://my.site.com' );\nmy $response = $mech->response();\nfor my $key ( $response->header_field_names() ) {\nprint $key, \" : \", $response->header( $key ), \"\\n\";\n}\n\nHow do I enable keep-alive?\nSince WWW::Mechanize is a subclass of LWP::UserAgent, you can use the same mechanism to enable\nkeep-alive:\n\nuse LWP::ConnCache;\n...\n$mech->conn_cache(LWP::ConnCache->new);\n\nHow can I change/specify the action parameter of an HTML form?\nYou can access the action of the form by utilizing the HTML::Form object returned from one of\nthe specifying form methods.\n\nUsing \"$mech->form_number($number)\":\n\nmy $mech = WWW::Mechanize->new;\n$mech->get('http://someurlhere.com');\n# Access the form by its one-based index in document order (the first form is 1)\n$mech->form_number(1)->action('http://newAction'); #ABS URL\n\nUsing \"$mech->form_name($name)\":\n\nmy $mech = WWW::Mechanize->new;\n$mech->get('http://someurlhere.com');\n# Access the form by the value of its name attribute\n$mech->form_name('trgForm')->action('http://newAction'); #ABS URL\n\nHow do I save an image?  How do I save a large tarball?\nAn image is just content. You get the image and save it.\n\n$mech->get( 'photo.jpg' );\n$mech->save_content( '/path/to/my/directory/photo.jpg' );\n\nYou can also save any content directly to disk using the \":content_file\" flag to \"get()\", which\nis part of LWP::UserAgent.\n\n$mech->get( 'http://www.cpan.org/src/stable.tar.gz',\n':content_file' => 'stable.tar.gz' );\n\nHow do I pick a specific value from a \"<select>\" list?\nFind the \"HTML::Form::ListInput\" in the page.\n\nmy ($listbox) = $mech->find_all_inputs( name => 'listbox' );\n\nThen create a hash for the lookup:\n\nmy %namelookup;\n@namelookup{ $listbox->value_names } = $listbox->possible_values;\nmy $value = $namelookup{ 'Name I want' };\n\nIf you have duplicate names, this method won't work, and you'll have to loop over\n\"$listbox->value_names\" and \"$listbox->possible_values\" in parallel until you find a matching\nname.\n\nHow do I get Mech to not follow redirects?\nYou use functionality in LWP::UserAgent, not Mech itself.\n\n$mech->requests_redirectable( [] );\n\nOr you can set \"max_redirect\":\n\n$mech->max_redirect( 0 );\n\nBoth these options can also be set in the constructor. Mech doesn't understand them, so will\npass them through to the LWP::UserAgent constructor.\n",
            "subsections": []
        },
        "Why doesn't this work: Debugging your Mechanize program": {
            "content": "",
            "subsections": [
                {
                    "name": "My Mech program doesn't work, but it works in the browser.",
                    "content": "Mechanize acts like a browser, but apparently something you're doing is not matching the\nbrowser's behavior. Maybe it's expecting a certain web client, or maybe you're not handling a\nfield properly. For some reason, your Mech program isn't doing exactly what the browser is\ndoing, and when you find that, you'll have the answer.\n"
                },
                {
                    "name": "My Mech program gets these 500 errors.",
                    "content": "A 500 error from the web server says that the program on the server side died. Probably the web\nserver program was expecting certain inputs that you didn't supply, and instead of handling it\nnicely, the program died.\n\nWhatever the cause of the 500 error, if it works in the browser, but not in your Mech program,\nyou're not acting like the browser. See the previous question.\n\nWhy doesn't my program handle this form correctly?\nRun mech-dump on your page and see what it says.\n\nmech-dump is a marvelous diagnostic tool for figuring out what forms and fields are on the page.\nSay you're scraping CNN.com, you'd get this:\n\n$ mech-dump http://www.cnn.com/\nGET http://search.cnn.com/cnn/search\nsource=cnn                     (hidden readonly)\ninvocationType=search/top      (hidden readonly)\nsites=web                      (radio)    [*web/The Web ??|cnn/CNN.com ??]\nquery=                         (text)\n<NONAME>=Search                (submit)\n\nPOST http://cgi.money.cnn.com/servlets/quoteredirect\nquery=                         (text)\n<NONAME>=GET                   (submit)\n\nPOST http://polls.cnn.com/poll\npollid=2112                   (hidden readonly)\nquestion1=<UNDEF>             (radio)    [1/Simplistic option|2/VIEW RESULTS]\n<NONAME>=VOTE                  (submit)\n\nGET http://search.cnn.com/cnn/search\nsource=cnn                     (hidden readonly)\ninvocationType=search/bottom   (hidden readonly)\nsites=web                      (radio)    [*web/??CNN.com|cnn/??]\nquery=                         (text)\n<NONAME>=Search                (submit)\n\nFour forms, including the first one duplicated at the end. 
All the fields, all their defaults,\nlovingly generated by HTML::Form's \"dump\" method.\n\nIf you want to run mech-dump on something that doesn't lend itself to a quick URL fetch, then\nuse the \"save_content()\" method to write the HTML to a file, and run mech-dump on the file.\n\nWhy don't https:// URLs work?\nYou need either IO::Socket::SSL or Crypt::SSLeay installed.\n\nWhy do I get \"Input 'fieldname' is readonly\"?\nYou're trying to change the value of a hidden field and you have warnings on.\n\nFirst, make sure that you actually mean to change the field that you're changing, and that you\ndon't have a typo. Usually, hidden variables are set by the site you're working on for a reason.\nIf you change the value, you might be breaking some functionality by faking it out.\n\nIf you really do want to change a hidden value, make the changes in a scope that has warnings\nturned off:\n\n{\nlocal $^W = 0;\n$agent->field( name => $value );\n}\n\nI tried to [such-and-such] and I got this weird error.\nAre you checking your errors?\n\nAre you sure?\n\nAre you checking that your action succeeded after every action?\n\nAre you sure?\n\nFor example, if you try this:\n\n$mech->get( \"http://my.site.com\" );\n$mech->follow_link( text => \"foo\" );\n\nand the \"get\" call fails for some reason, then the Mech internals will be unusable for the\n\"follow_link\" and you'll get a weird error. You must, after every action that GETs or POSTs a\npage, check that Mech succeeded, or all bets are off.\n\n$mech->get( \"http://my.site.com\" );\ndie \"Can't even get the home page: \", $mech->response->status_line\nunless $mech->success;\n\n$mech->follow_link( text => \"foo\" );\ndie \"Foo link failed: \", $mech->response->status_line\nunless $mech->success;\n\nHow do I figure out why \"$mech->get($url)\" doesn't work?\nThere are many reasons why a \"get()\" can fail. The server can take you to someplace you didn't\nexpect. It can generate redirects which are not properly handled. You can get time-outs. Servers\nare down more often than you think! etc, etc, etc. A couple of places to start:\n\n1 Check \"$mech->status()\" after each call\n2 Check the URL with \"$mech->uri()\" to see where you ended up\n3 Try debugging with \"LWP::ConsoleLogger\".\n\nIf things are really strange, turn on debugging with \"use LWP::ConsoleLogger::Everywhere;\" Just\nput this in the main program. This causes LWP to print out a trace of the HTTP traffic between\nclient and server and can be used to figure out what is happening at the protocol level.\n\nIt is also useful to set many traps to verify that processing is proceeding as expected. A Mech\nprogram should always have an \"I didn't expect to get here\" or \"I don't recognize the page that\nI am processing\" case and bail out.\n\nSince errors can be transient, by the time you notice that the error has occurred, it might not\nbe possible to reproduce it manually. So for automated processing it is useful to email yourself\nthe following information:\n\n*   where processing is taking place\n\n*   An Error Message\n\n*   $mech->uri\n\n*   $mech->content\n\nYou can also save the content of the page with \"$mech->save_content( 'filename.html' );\"\n\nI submitted a form, but the server ignored everything!  I got an empty form back!\nThe post is handled by application software. It is common for PHP programmers to use the same\nfile both to display a form and to process the arguments returned. So the first task of the\napplication programmer is to decide whether there are arguments to process. The program can\ncheck whether a particular parameter has been set, whether a hidden parameter has been set, or\nwhether the submit button has been clicked. (There are probably other ways that I haven't\nthought of).\n\nIn any case, if your form is not setting the parameter (e.g. the submit button) which the web\napplication is keying on (and as an outsider there is no way to know what it is keying on), it\nwill not notice that the form has been submitted. Try using \"$mech->click()\" instead of\n\"$mech->submit()\" or vice-versa.\n\nI've logged in to the server, but I get 500 errors when I try to get to protected content.\nSome web sites use distributed databases for their processing. It can take a few seconds for the\nlogin/session information to percolate through to all the servers. For human users with their\nslow reaction times, this is not a problem, but a Perl script can outrun the server. So try\nadding a sleep(5) between logging in and actually doing anything (the optimal delay must be\ndetermined experimentally).\n\nMech is a big memory pig!  I'm running out of RAM!\nMech keeps a history of every page, and the state it was in. It actually keeps a clone of the\nfull Mech object at every step along the way.\n\nYou can limit this stack size with the \"stack_depth\" param in the \"new()\" constructor. If you\nset stack_depth to 0, Mech will not keep any history.\n"
                }
            ]
        },
        "AUTHOR": {
            "content": "Andy Lester <andy at petdance.com>\n",
            "subsections": []
        },
        "COPYRIGHT AND LICENSE": {
            "content": "This software is copyright (c) 2004 by Andy Lester.\n\nThis is free software; you can redistribute it and/or modify it under the same terms as the Perl\n5 programming language system itself.\n",
            "subsections": []
        }
    },
    "summary": "WWW::Mechanize::FAQ - Frequently Asked Questions about WWW::Mechanize",
    "flags": [],
    "examples": [],
    "see_also": []
}