Found in /usr/share/perl/5.34/pod/perlfaq1.pod Who supports Perl? Who develops it? Why is it free? The original culture of the pre-populist Internet and the deeply-held beliefs of Perl's author, Larry Wall, gave rise to the free and open distribution policy of Perl. Perl is supported by its users. The core, the standard Perl library, the optional modules, and the documentation you're reading now were all written by volunteers. The core development team (known as the Perl Porters) are a group of highly altruistic individuals committed to producing better software for free than you could hope to purchase for money. You may snoop on pending developments via the archives <http://www.nntp.perl.org/group/perl.perl5.porters/> or you can subscribe to the mailing list by sending perl5-porters-subscribe AT perl.org a subscription request (an empty message with no subject is fine). While the GNU project includes Perl in its distributions, there's no such thing as "GNU Perl". Perl is not produced nor maintained by the Free Software Foundation. Perl's licensing terms are also more open than GNU software's tend to be. You can get commercial support of Perl if you wish, although for most users the informal support will more than suffice. See the answer to "Where can I buy a commercial version of Perl?" for more information. What are Perl 4, Perl 5, or Raku (Perl 6)? In short, Perl 4 is the parent to both Perl 5 and Raku (formerly known as Perl 6). Perl 5 is the older sibling, and though they are different languages, someone who knows one will spot many similarities in the other. The number after Perl (i.e. the 5 after Perl 5) is the major release of the perl interpreter as well as the version of the language. Each major version has significant differences that earlier versions cannot support. The current major release of Perl is Perl 5, first released in 1994. It can run scripts from the previous major release, Perl 4 (March 1991), but has significant differences. 
Raku is a reinvention of Perl, a language in the same lineage but not compatible. The two are complementary, not mutually exclusive. Raku is not meant to replace Perl, and vice versa. See "What is Raku (Perl 6)?" below to find out more.

See perlhist for a history of Perl revisions.

How often are new versions of Perl released?

Recently, the plan has been to release a new version of Perl roughly every April, but getting the release right is more important than sticking rigidly to a calendar date, so the release date is somewhat flexible. The historical release dates can be viewed at <http://www.cpan.org/src/README.html>.

Even numbered minor versions (5.14, 5.16, 5.18) are production versions, and odd numbered minor versions (5.15, 5.17, 5.19) are development versions. Unless you want to try out an experimental feature, you probably never want to install a development version of Perl.

The Perl development team are called Perl 5 Porters, and their organization is described at <http://perldoc.perl.org/perlpolicy.html>. The organizational rules really just boil down to one: Larry is always right, even when he was wrong.

How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?

Perl can be used for almost any coding problem, even ones which require integrating specialist C code for extra speed. As with any tool it can be used well or badly. Perl has many strengths and a few weaknesses; precisely which areas are good and bad is often a personal choice.

When choosing a language you should also be influenced by the resources <http://www.cpan.org/>, testing culture <http://www.cpantesters.org/> and community <http://www.perl.org/community.html> that surround it.

For comparisons to a specific language it is often best to create a small project in both languages and compare the results. Make sure to use all the resources <http://www.cpan.org/> of each language, as a language is far more than just its syntax.
What's the difference between "perl" and "Perl"?

"Perl" is the name of the language. Only the "P" is capitalized. The name of the interpreter (the program which runs the Perl script) is "perl" with a lowercase "p".

You may or may not choose to follow this usage. But never write "PERL", because perl is not an acronym.

Found in /usr/share/perl/5.34/pod/perlfaq2.pod

What machines support Perl? Where do I get it?

The standard release of Perl (the one maintained by the Perl development team) is distributed only in source code form. You can find the latest releases at <http://www.cpan.org/src/>.

Perl builds and runs on a bewildering number of platforms. Virtually all known and current Unix derivatives are supported (perl's native platform), as are other systems like VMS, DOS, OS/2, Windows, QNX, BeOS, OS X, MPE/iX and the Amiga.

Binary distributions for some proprietary platforms can be found in the <http://www.cpan.org/ports/> directory. Because these are not part of the standard distribution, they may and in fact do differ from the base perl port in a variety of ways. You'll have to check their respective release notes to see just what the differences are. These differences can be either positive (e.g. extensions for the features of the particular platform that are not supported in the source release of perl) or negative (e.g. they might be based upon a less current source release of perl).

I don't have a C compiler. How can I build my own Perl interpreter?

For Windows, use a binary version of Perl; Strawberry Perl <http://strawberryperl.com/> and ActivePerl <http://www.activestate.com/activeperl> come with a bundled C compiler.

Otherwise, if you really do want to build Perl, you need to get a binary version of "gcc" for your system first. Use a search engine to find out how to do this for your operating system.

What modules and extensions are available for Perl? What is CPAN?
CPAN stands for Comprehensive Perl Archive Network, a multi-gigabyte archive replicated on hundreds of machines all over the world. CPAN contains tens of thousands of modules and extensions, source code and documentation, designed for *everything* from commercial database interfaces to keyboard/screen control and running large web sites. You can search CPAN on <http://metacpan.org>. The master web site for CPAN is <http://www.cpan.org/>, <http://www.cpan.org/SITES.html> lists all mirrors. See the CPAN FAQ at <http://www.cpan.org/misc/cpan-faq.html> for answers to the most frequently asked questions about CPAN. The Task::Kensho module has a list of recommended modules which you should review as a good starting point. Where can I get information on Perl? * <http://www.perl.org/> * <http://perldoc.perl.org/> * <http://learn.perl.org/> The complete Perl documentation is available with the Perl distribution. If you have Perl installed locally, you probably have the documentation installed as well: type "perldoc perl" in a terminal or view online <http://perldoc.perl.org/perl.html>. (Some operating system distributions may ship the documentation in a different package; for instance, on Debian, you need to install the "perl-doc" package.) Many good books have been written about Perl--see the section later in perlfaq2 for more details. Where can I post questions? There are many Perl mailing lists for various topics, specifically the beginners list <http://lists.perl.org/list/beginners.html> may be of use. Other places to ask questions are on the PerlMonks site <http://www.perlmonks.org/> or stackoverflow <http://stackoverflow.com/questions/tagged/perl>. Which Perl blogs should I read? Perl News <http://perlnews.org/> covers some of the major events in the Perl world, Perl Weekly <http://perlweekly.com/> is a weekly e-mail (and RSS feed) of hand-picked Perl articles. 
<http://blogs.perl.org/> hosts many Perl blogs; there are also several blog aggregators: Perlsphere <http://perlsphere.net/> and IronMan <http://ironman.enlightenedperl.org/> are two of them.

What mailing lists are there for Perl?

A comprehensive list of Perl-related mailing lists can be found at <http://lists.perl.org/>.

Where can I buy a commercial version of Perl?

Perl already *is* commercial software: it has a license that you can grab and carefully read to your manager. It is distributed in releases and comes in well-defined packages. There is a very large and supportive user community and an extensive literature.

If you still need commercial support, ActiveState <http://www.activestate.com/activeperl> offers this.

Where do I send bug reports?

(contributed by brian d foy)

First, ensure that you've found an actual bug. Second, ensure you've found an actual bug.

If you've found a bug with the perl interpreter or one of the modules in the standard library (those that come with Perl), you can submit a bug report to the GitHub issue tracker at <https://github.com/Perl/perl5/issues>.

To determine if a module came with your version of Perl, you can install and use the Module::CoreList module. It has the information about the modules (with their versions) included with each release of Perl.

Every CPAN module has a bug tracker set up in RT, <http://rt.cpan.org>. You can submit bugs to RT either through its web interface or by email. To email a bug report, send it to bug-<distribution-name>@rt.cpan.org. For example, if you wanted to report a bug in Business::ISBN, you could send a message to bug-Business-ISBN@rt.cpan.org.

Some modules might have special reporting requirements, such as a GitHub or Google Code tracking system, so you should check the module documentation too.

Found in /usr/share/perl/5.34/pod/perlfaq3.pod

How do I find which modules are installed on my system?
From the command line, you can use the "cpan" command's "-l" switch:

    $ cpan -l

You can also use "cpan"'s "-a" switch to create an autobundle file that "CPAN.pm" understands and can use to re-install every module:

    $ cpan -a

Inside a Perl program, you can use the ExtUtils::Installed module to show all installed distributions, although it can take a while to do its magic. The standard library which comes with Perl just shows up as "Perl" (although you can get those with Module::CoreList).

    use ExtUtils::Installed;

    my $inst    = ExtUtils::Installed->new();
    my @modules = $inst->modules();

If you want a list of all of the Perl module filenames, you can use File::Find::Rule:

    use File::Find::Rule;

    my @files = File::Find::Rule->
        extras({follow => 1})->
        file()->
        name( '*.pm' )->
        in( @INC )
        ;

If you do not have that module, you can do the same thing with File::Find, which is part of the standard library:

    use File::Find;
    my @files;

    find(
        {
            wanted => sub {
                push @files, $File::Find::fullname
                    if -f $File::Find::fullname && /\.pm$/
            },
            follow      => 1,
            follow_skip => 2,
        },
        @INC
    );

    print join "\n", @files;

If you simply need to check quickly to see if a module is available, you can check for its documentation. If you can read the documentation, the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases):

    $ perldoc Module::Name

You can also try to include the module in a one-liner to see if perl finds it:

    $ perl -MModule::Name -e1

(If you don't receive a "Can't locate ... in @INC" error message, then Perl found the module name you asked for.)

How do I cross-reference my Perl programs?

The B::Xref module can be used to generate cross-reference reports for Perl programs.

    perl -MO=Xref[,OPTIONS] scriptname.plx

Is there a pretty-printer (formatter) for Perl?

Perl::Tidy comes with a perl script, perltidy, which indents and reformats Perl scripts to make them easier to read by trying to follow the rules of perlstyle.
If you write Perl, or spend much time reading Perl, you will probably find it useful.

Of course, if you simply follow the guidelines in perlstyle, you shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode or newer cperl-mode for emacs can provide remarkable amounts of help with most (but not all) code, and even less programmable editors can provide significant assistance. Tom Christiansen and many other VI users swear by the following settings in vi and its clones:

    set ai sw=4
    map! ^O {^M}^[O^T

Put that in your .exrc file (replacing the caret characters with control characters) and away you go. In insert mode, ^T is for indenting, ^D is for undenting, and ^O is for blockdenting--as it were. A more complete example, with comments, can be found at <http://www.cpan.org/authors/id/T/TO/TOMC/scripts/toms.exrc.gz>

Is there an IDE or Windows Perl Editor?

Perl programs are just plain text, so any editor will do.

If you're on Unix, you already have an IDE--Unix itself. The Unix philosophy is the philosophy of several small tools that each do one thing and do it well. It's like a carpenter's toolbox.

If you want an IDE, check the following (in alphabetical order, not order of preference):

Eclipse <http://e-p-i-c.sf.net/>
    The Eclipse Perl Integration Project integrates Perl editing/debugging with Eclipse.

Enginsite <http://www.enginsite.com/>
    Perl Editor by EngInSite is a complete integrated development environment (IDE) for creating, testing, and debugging Perl scripts; the tool runs on Windows 9x/NT/2000/XP or later.

IntelliJ IDEA <https://plugins.jetbrains.com/plugin/7796>
    Camelcade plugin provides Perl5 support in IntelliJ IDEA and other JetBrains IDEs.

Kephra <http://kephra.sf.net>
    GUI editor written in Perl using wxWidgets and Scintilla with lots of smaller features.
Aims for a UI based on Perl principles like TIMTOWTDI and "easy things should be easy, hard things should be possible". Komodo <http://www.ActiveState.com/Products/Komodo/> ActiveState's cross-platform (as of October 2004, that's Windows, Linux, and Solaris), multi-language IDE has Perl support, including a regular expression debugger and remote debugging. Notepad++ <http://notepad-plus.sourceforge.net/> Open Perl IDE <http://open-perl-ide.sourceforge.net/> Open Perl IDE is an integrated development environment for writing and debugging Perl scripts with ActiveState's ActivePerl distribution under Windows 95/98/NT/2000. OptiPerl <http://www.optiperl.com/> OptiPerl is a Windows IDE with simulated CGI environment, including debugger and syntax-highlighting editor. Padre <http://padre.perlide.org/> Padre is cross-platform IDE for Perl written in Perl using wxWidgets to provide a native look and feel. It's open source under the Artistic License. It is one of the newer Perl IDEs. PerlBuilder <http://www.solutionsoft.com/perl.htm> PerlBuilder is an integrated development environment for Windows that supports Perl development. visiPerl+ <http://helpconsulting.net/visiperl/index.html> From Help Consulting, for Windows. Visual Perl <http://www.activestate.com/Products/Visual_Perl/> Visual Perl is a Visual Studio.NET plug-in from ActiveState. Zeus <http://www.zeusedit.com/lookmain.html> Zeus for Windows is another Win32 multi-language editor/IDE that comes with support for Perl. For editors: if you're on Unix you probably have vi or a vi clone already, and possibly an emacs too, so you may not need to download anything. In any emacs the cperl-mode (M-x cperl-mode) gives you perhaps the best available Perl editing mode in any editor. If you are using Windows, you can use any editor that lets you work with plain text, such as NotePad or WordPad. 
Word processors, such as Microsoft Word or WordPerfect, typically do not work since they insert all sorts of behind-the-scenes information, although some allow you to save files as "Text Only". You can also download text editors designed specifically for programming, such as Textpad ( <http://www.textpad.com/> ) and UltraEdit ( <http://www.ultraedit.com/> ), among others. If you are using MacOS, the same concerns apply. MacPerl (for Classic environments) comes with a simple editor. Popular external editors are BBEdit ( <http://www.barebones.com/products/bbedit/> ) or Alpha ( <http://www.his.com/~jguyer/Alpha/Alpha8.html> ). MacOS X users can use Unix editors as well. GNU Emacs <http://www.gnu.org/software/emacs/windows/ntemacs.html> MicroEMACS <http://www.microemacs.de/> XEmacs <http://www.xemacs.org/Download/index.html> Jed <http://space.mit.edu/~davis/jed/> or a vi clone such as Vim <http://www.vim.org/> Vile <http://invisible-island.net/vile/vile.html> The following are Win32 multilanguage editor/IDEs that support Perl: MultiEdit <http://www.MultiEdit.com/> SlickEdit <http://www.slickedit.com/> ConTEXT <http://www.contexteditor.org/> There is also a toyedit Text widget based editor written in Perl that is distributed with the Tk module on CPAN. The ptkdb ( <http://ptkdb.sourceforge.net/> ) is a Perl/Tk-based debugger that acts as a development environment of sorts. Perl Composer ( <http://perlcomposer.sourceforge.net/> ) is an IDE for Perl/Tk GUI creation. In addition to an editor/IDE you might be interested in a more powerful shell environment for Win32. Your options include bash from the Cygwin package ( <http://cygwin.com/> ) zsh <http://www.zsh.org/> Cygwin is covered by the GNU General Public License (but that shouldn't matter for Perl use). Cygwin contains (in addition to the shell) a comprehensive set of standard Unix toolkit utilities. BBEdit and TextWrangler are text editors for OS X that have a Perl sensitivity mode ( <http://www.barebones.com/> ). 
Where can I get Perl macros for vi?

For a complete version of Tom Christiansen's vi configuration file, see <http://www.cpan.org/authors/id/T/TO/TOMC/scripts/toms.exrc.gz>, the standard benchmark file for vi emulators. The file runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter--see <http://www.cpan.org/src/misc/>.

Where can I get perl-mode or cperl-mode for emacs?

Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the Perl debugger built in. These should come with the standard Emacs 19 distribution.

Note that the perl-mode of emacs will have fits with "main'foo" (single quote), and mess up the indentation and highlighting. You are probably using "main::foo" in new Perl code anyway, so this shouldn't be an issue.

For CPerlMode, see <http://www.emacswiki.org/cgi-bin/wiki/CPerlMode>.

Is it safe to return a reference to local or lexical data?

Yes. Perl's garbage collection system takes care of this so everything works out right.

    sub makeone {
        my @a = ( 1 .. 10 );
        return \@a;
    }

    for ( 1 .. 10 ) {
        push @many, makeone();
    }

    print $many[4][5], "\n";

    print "@many\n";

How can I free an array or hash so my program shrinks?

(contributed by Michael Carman)

You usually can't. Memory allocated to lexicals (i.e. my() variables) cannot be reclaimed or reused even if they go out of scope. It is reserved in case the variables come back into scope. Memory allocated to global variables can be reused (within your program) by using undef() and/or delete().

On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs sometimes re-exec themselves. Some operating systems (notably, systems that use mmap(2) for allocating large chunks of memory) can reclaim memory that is no longer used, but on such systems, perl must be configured and compiled to use the OS's malloc, not perl's.
In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl. See also "How can I make my Perl program take less memory?" How can I make my CGI script more efficient? Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is. There are three popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from <http://www.apache.org/> ) with either of the mod_perl or mod_fastcgi plugin modules. With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see <http://perl.apache.org/> With the FCGI module (from CPAN) and the mod_fastcgi module (available from <http://www.fastcgi.com/> ) each of your Perl programs becomes a permanent CGI daemon process. Finally, Plack is a Perl module and toolkit that contains PSGI middleware, helpers and adapters to web servers, allowing you to easily deploy scripts which can continue running, and provides flexibility with regards to which web server you use. It can allow existing CGI scripts to enjoy this flexibility and performance with minimal changes, or can be used along with modern Perl web frameworks to make writing and deploying web services with Perl a breeze. 
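As a rough illustration of the PSGI interface that Plack implements, the sketch below defines a minimal application: a code reference that takes the request environment as a hash reference and returns a three-element array reference (status, headers, body). The greeting text and the hand-built environment are made up for this example; a real server, for instance one started with Plack's "plackup" command, supplies the environment itself.

```perl
use strict;
use warnings;

# A PSGI application is just a code reference. It receives the request
# environment as a hash reference and returns [ status, headers, body ].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};

# Hypothetical, hand-built environment so the sketch runs without a server.
my $res = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/demo' } );
print $res->[0], "\n";    # status code
print @{ $res->[2] };     # body chunks

# In a file named app.psgi, the last statement would simply be: $app;
```

Because the application stays loaded in a long-running process, the compile cost described above is paid once rather than on every request.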
These solutions can have far-reaching effects on your system and on the way you write your CGI programs, so investigate them with care.

See also <http://www.cpan.org/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/>.

Where can I learn about CGI or Web programming in Perl?

For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to web stuff in the question on books. For problems and questions related to the web, like "Why do I get 500 Errors" or "Why doesn't it run from the browser right when it runs fine on the command line", see the troubleshooting guides and references in perlfaq9 or in the CGI MetaFAQ:

    <http://www.perl.org/CGI_MetaFAQ.html>

Looking into <https://plackperl.org> and modern Perl web frameworks is highly recommended, though; web programming in Perl has evolved a long way from the old days of simple CGI scripts.

Where can I learn about object-oriented Perl programming?

A good place to start is perlootut, and you can use perlobj for reference.

A good book on OO in Perl is "Object-Oriented Perl" by Damian Conway from Manning Publications, or "Intermediate Perl" by Randal Schwartz, brian d foy, and Tom Phoenix from O'Reilly Media.

Where can I learn about linking C with Perl?

If you want to call C from Perl, start with perlxstut, moving on to perlxs, xsubpp, and perlguts. If you want to call Perl from C, then read perlembed, perlcall, and perlguts. Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems.

You might not need all the power of XS. The Inline::C module lets you put C code directly in your Perl source. It handles all the magic to make it work. You still have to learn at least some of the perl API, but you won't have to deal with the complexity of the XS support files.

I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong?
Download the ExtUtils::Embed kit from CPAN and run "make test". If the tests pass, read the pods again and again and again. If they fail, submit a bug report to <https://github.com/Perl/perl5/issues> with the output of "make test TEST_VERBOSE=1" along with "perl -V".

Found in /usr/share/perl/5.34/pod/perlfaq4.pod

Why isn't my octal data interpreted correctly?

(contributed by brian d foy)

You're probably trying to convert a string to a number, which Perl only converts as a decimal number. When Perl converts a string to a number, it ignores leading spaces and zeroes, then assumes the rest of the digits are in base 10:

    my $string = '0644';

    print $string + 0;  # prints 644

    print $string + 44; # prints 688, certainly not octal!

This problem usually involves one of the Perl built-ins that has the same name as a Unix command that uses octal numbers as arguments on the command line. In this example, "chmod" on the command line knows that its first argument is octal because that's what it does:

    %prompt> chmod 644 file

If you want to use the same literal digits (644) in Perl, you have to tell Perl to treat them as octal numbers, either by prefixing the digits with a 0 or by using "oct":

    chmod(     0644, $filename );  # right, has leading zero
    chmod( oct(644), $filename );  # also correct

The problem comes in when you take your numbers from something that Perl thinks is a string, such as a command line argument in @ARGV:

    chmod(     $ARGV[0],  $filename );  # wrong, even if "0644"

    chmod( oct($ARGV[0]), $filename );  # correct, treat string as octal

You can always check the value you're using by printing it in octal notation to ensure it matches what you think it should be. Print it in octal and decimal format:

    printf "0%o %d", $number, $number;

How do I convert between numeric representations/bases/radixes?

As always with Perl there is more than one way to do it. Below are a few examples of approaches to making common conversions between number representations.
This is intended to be representational rather than exhaustive.

Some of the examples later in perlfaq4 use the Bit::Vector module from CPAN. The reason you might choose Bit::Vector over the perl built-in functions is that it works with numbers of ANY size, that it is optimized for speed on some operations, and for at least some programmers the notation might be familiar.

How do I convert hexadecimal into decimal

Using perl's built in conversion of "0x" notation:

    my $dec = 0xDEADBEEF;

Using the "hex" function:

    my $dec = hex("DEADBEEF");

Using "pack":

    my $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));

Using the CPAN module "Bit::Vector":

    use Bit::Vector;
    my $vec = Bit::Vector->new_Hex(32, "DEADBEEF");
    my $dec = $vec->to_Dec();

How do I convert from decimal to hexadecimal

Using "sprintf":

    my $hex = sprintf("%X", 3735928559); # upper case A-F
    my $hex = sprintf("%x", 3735928559); # lower case a-f

Using "unpack":

    my $hex = unpack("H*", pack("N", 3735928559));

Using Bit::Vector:

    use Bit::Vector;
    my $vec = Bit::Vector->new_Dec(32, -559038737);
    my $hex = $vec->to_Hex();

And Bit::Vector supports odd bit counts:

    use Bit::Vector;
    my $vec = Bit::Vector->new_Dec(33, 3735928559);
    $vec->Resize(32); # suppress leading 0 if unwanted
    my $hex = $vec->to_Hex();

How do I convert from octal to decimal

Using Perl's built in conversion of numbers with leading zeros:

    my $dec = 033653337357; # note the leading 0!
Using the "oct" function:

    my $dec = oct("33653337357");

Using Bit::Vector:

    use Bit::Vector;
    my $vec = Bit::Vector->new(32);
    $vec->Chunk_List_Store(3, split(//, reverse "33653337357"));
    my $dec = $vec->to_Dec();

How do I convert from decimal to octal

Using "sprintf":

    my $oct = sprintf("%o", 3735928559);

Using Bit::Vector:

    use Bit::Vector;
    my $vec = Bit::Vector->new_Dec(32, -559038737);
    my $oct = reverse join('', $vec->Chunk_List_Read(3));

How do I convert from binary to decimal

Perl 5.6 and later let you write binary numbers directly with the "0b" notation:

    my $number = 0b10110110;

Using "oct":

    my $input = "10110110";
    my $decimal = oct( "0b$input" );

Using "pack" and "ord":

    my $decimal = ord(pack('B8', '10110110'));

Using "pack" and "unpack" for larger strings:

    my $int = unpack("N", pack("B32",
        substr("0" x 32 . "11110101011011011111011101111", -32)));
    my $dec = sprintf("%d", $int);

    # substr() is used to left-pad a 32-character string with zeros.

Using Bit::Vector:

    use Bit::Vector;
    my $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111");
    my $dec = $vec->to_Dec();

How do I convert from decimal to binary

Using "sprintf" (perl 5.6+):

    my $bin = sprintf("%b", 3735928559);

Using "unpack":

    my $bin = unpack("B*", pack("N", 3735928559));

Using Bit::Vector:

    use Bit::Vector;
    my $vec = Bit::Vector->new_Dec(32, -559038737);
    my $bin = $vec->to_Bin();

The remaining transformations (e.g. hex -> oct, bin -> hex, etc.) are left as an exercise to the inclined reader.

Why aren't my random numbers random?

If you're using a version of Perl before 5.004, you must call "srand" once at the start of your program to seed the random number generator.

    BEGIN { srand() if $] < 5.004 }

5.004 and later automatically call "srand" at the beginning. Don't call "srand" more than once--you make your numbers less random, rather than more.

Computers are good at being predictable and bad at being random (despite appearances caused by bugs in your programs :-).
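To make that predictability concrete, here is a small sketch (the helper name "sequence" and the seed value 42 are arbitrary choices for illustration): seeding the generator with the same number reproduces the same sequence, which is why reseeding does not add randomness.

```perl
use strict;
use warnings;

# Return $n pseudorandom integers in 0..99 after seeding with $seed.
sub sequence {
    my ($seed, $n) = @_;
    srand($seed);
    return map { int( rand(100) ) } 1 .. $n;
}

my @first  = sequence(42, 5);
my @second = sequence(42, 5);

# Same seed, same sequence: the generator is entirely deterministic.
print "first:  @first\n";
print "second: @second\n";
print "identical\n" if "@first" eq "@second";
```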
The random article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz>, courtesy of Tom Phoenix, talks more about this. John von Neumann said, "Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin."

Perl relies on the underlying system for the implementation of "rand" and "srand"; on some systems, the generated numbers are not random enough (especially on Windows: see <http://www.perlmonks.org/?node_id=803632>). Several CPAN modules in the "Math" namespace implement better pseudorandom generators; see for example Math::Random::MT ("Mersenne Twister", fast), or Math::TrulyRandom (uses the imperfections in the system's timer to generate random numbers, which is rather slow). More algorithms for random numbers are described in "Numerical Recipes in C" at <http://www.nr.com/>.

How do I find the current century or millennium?

Use the following simple functions:

    sub get_century {
        return int((((localtime(shift || time))[5] + 1999))/100);
    }

    sub get_millennium {
        return 1+int((((localtime(shift || time))[5] + 1899))/1000);
    }

On some systems, the POSIX module's "strftime()" function has been extended in a non-standard way to use a %C format, which they sometimes claim is the "century". It isn't, because on most such systems, this is only the first two digits of the four-digit year, and thus cannot be used to determine reliably the current century or millennium.

How can I compare two dates and find the difference?

(contributed by brian d foy)

You could just store all your dates as a number and then subtract. Life isn't always that simple though.

The Time::Piece module, which comes with Perl, replaces localtime with a version that returns an object.
It also overloads the comparison operators so you can compare them directly:

    use Time::Piece;
    my $date1 = localtime( $some_time );
    my $date2 = localtime( $some_other_time );

    if( $date1 < $date2 ) {
        print "The date was in the past\n";
    }

You can also get differences with a subtraction, which returns a Time::Seconds object:

    my $date_diff = $date1 - $date2;
    print "The difference is ", $date_diff->days, " days\n";

If you want to work with formatted dates, the Date::Manip, Date::Calc, or DateTime modules can help you.

How do I remove consecutive pairs of characters?

(contributed by brian d foy)

You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in "(.)". The memory parentheses store the matched character in the back-reference "\g1" and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1.

    s/(.)\g1/$1/g;

We can also use the transliteration operator, "tr///". In this example, the search list side of our "tr///" contains nothing, but the "c" option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, replace the character with itself). However, the "s" option squashes duplicated and consecutive characters in the string so a character does not show up next to itself:

    my $str = 'Haarlem';   # in the Netherlands
    $str =~ tr///cs;       # Now Harlem, like in New York

How do I reverse a string?

Use "reverse()" in scalar context, as documented in "reverse" in perlfunc.

    my $reversed = reverse $string;

How do I reformat a paragraph?

Use Text::Wrap (part of the standard Perl distribution):

    use Text::Wrap;
    print wrap("\t", ' ', @paragraphs);

The paragraphs you give to Text::Wrap should not contain embedded newlines. Text::Wrap doesn't justify the lines (flush-right).
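A minimal sketch of Text::Wrap with a narrow margin follows. The module's $Text::Wrap::columns variable sets the wrap width; the sample sentence, the width of 30, and the four-space first-line indent are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Wrap to a narrow width; Text::Wrap keeps each line shorter than this.
$Text::Wrap::columns = 30;

my $paragraph = "The quick brown fox jumps over the lazy dog and "
              . "then wanders off to look for something else to do.";

# First argument: indent for the first line; second: indent for the rest.
my $wrapped = wrap("    ", "", $paragraph);
print $wrapped, "\n";
```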
Or use the CPAN module Text::Autoformat. Formatting files can be easily done by making a shell alias, like so: alias fmt="perl -i -MText::Autoformat -n0777 \ -e 'print autoformat $_, {all=>1}' $*" See the documentation for Text::Autoformat to appreciate its many capabilities. How do I change the Nth occurrence of something? You have to keep track of N yourself. For example, let's say you want to change the fifth occurrence of "whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively. These all assume that $_ contains the string to be altered. $count = 0; s{((whom?)ever)}{ ++$count == 5 # is it the 5th? ? "${2}soever" # yes, swap : $1 # renege and leave it there }ige; In the more general case, you can use the "/g" modifier in a "while" loop, keeping count of matches. $WANT = 3; $count = 0; $_ = "One fish two fish red fish blue fish"; while (/(\w+)\s+fish\b/gi) { if (++$count == $WANT) { print "The third fish is a $1 one.\n"; } } That prints out: "The third fish is a red one." You can also use a repetition count and repeated pattern like this: /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i; How can I count the number of occurrences of a substring within a string? There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the "tr///" function like so: my $string = "ThisXlineXhasXsomeXx'sXinXit"; my $count = ($string =~ tr/X//); print "There are $count X characters in the string"; This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, "tr///" won't work. What you can do is wrap a while() loop around a global pattern match. 
For example, let's count negative integers: my $string = "-9 55 48 -2 23 -76 4 14 -44"; my $count = 0; while ($string =~ /-\d+/g) { $count++ } print "There are $count negative numbers in the string"; Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches. my $count = () = $string =~ /-\d+/g; Why don't my <<HERE documents work? Here documents are found in perlop. Check for these four things: There must be no space after the << part. There (probably) should be a semicolon at the end of the opening token. You can't (easily) have any space in front of the tag. There needs to be at least a line separator after the end token. If you want to indent the text in the here document, you can do this: # all in one (my $VAR = <<HERE_TARGET) =~ s/^\s+//gm; your text goes here HERE_TARGET But the HERE_TARGET must still be flush against the margin. If you want that indented also, you'll have to quote in the indentation. (my $quote = <<' FINIS') =~ s/^\s+//gm; ...we will have peace, when you and all your works have perished--and the works of your dark master to whom you would deliver us. You are a liar, Saruman, and a corrupter of men's hearts. --Theoden in /usr/src/perl/taint.c FINIS $quote =~ s/\s+--/\n--/; A nice general-purpose fixer-upper function for indented here documents follows. It expects to be called with a here document as its argument. It looks to see whether each line begins with a common substring, and if so, strips that substring off. Otherwise, it takes the amount of leading whitespace found on the first line and removes that much off each subsequent line.
sub fix { local $_ = shift; my ($white, $leader); # common whitespace and common leading string if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\g1\g2?.*\n)+$/) { ($white, $leader) = ($2, quotemeta($1)); } else { ($white, $leader) = (/^(\s+)/, ''); } s/^\s*?$leader(?:$white)?//gm; return $_; } This works with leading special strings, dynamically determined: my $remember_the_main = fix<<' MAIN_INTERPRETER_LOOP'; @@@ int @@@ runops() { @@@ SAVEI32(runlevel); @@@ runlevel++; @@@ while ( op = (*op->op_ppaddr)() ); @@@ TAINT_NOT; @@@ return 0; @@@ } MAIN_INTERPRETER_LOOP Or with a fixed amount of leading whitespace, with remaining indentation correctly preserved: my $poem = fix<<EVER_ON_AND_ON; Now far ahead the Road has gone, And I must follow, if I can, Pursuing it with eager feet, Until it joins some larger way Where many paths and errands meet. And whither then? I cannot say. --Bilbo in /usr/src/perl/pp_ctl.c EVER_ON_AND_ON Beginning with Perl version 5.26, a much simpler and cleaner way to write indented here documents has been added to the language: the tilde (~) modifier. See "Indented Here-docs" in perlop for details. What is the difference between a list and an array? (contributed by brian d foy) A list is a fixed collection of scalars. An array is a variable that holds a variable collection of scalars. An array can supply its collection for list operations, so list operations also work on arrays: # slices ( 'dog', 'cat', 'bird' )[2,3]; @animals[2,3]; # iteration foreach ( qw( dog cat bird ) ) { ... } foreach ( @animals ) { ... } my @three = grep { length == 3 } qw( dog cat bird ); my @three = grep { length == 3 } @animals; # supply an argument list wash_animals( qw( dog cat bird ) ); wash_animals( @animals ); Array operations, which change the scalars, rearrange them, or add or subtract some scalars, only work on arrays. These can't work on a list, which is fixed. Array operations include "shift", "unshift", "push", "pop", and "splice". 
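The array operations named above can be sketched as follows. The @animals contents and the replacement values are illustrative assumptions; each call changes the array in place, which is why none of them can be applied to a plain list.

```perl
use strict;
use warnings;

my @animals = qw( dog cat bird );

push    @animals, 'fish';       # add to the end:    dog cat bird fish
unshift @animals, 'ant';        # add to the front:  ant dog cat bird fish
my $last  = pop   @animals;     # remove from the end, returns 'fish'
my $first = shift @animals;     # remove from the front, returns 'ant'

# Replace the one element at index 1 with two new elements.
my @removed = splice @animals, 1, 1, 'lion', 'tiger';

print "@animals\n";             # dog lion tiger bird
print "@removed\n";             # cat
```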
An array can also change its length: $#animals = 1; # truncate to two elements $#animals = 10000; # pre-extend to 10,001 elements You can change an array element, but you can't change a list element: $animals[0] = 'Rottweiler'; qw( dog cat bird )[0] = 'Rottweiler'; # syntax error! foreach ( @animals ) { s/^d/fr/; # works fine } foreach ( qw( dog cat bird ) ) { s/^d/fr/; # Error! Modification of read only value! } However, if the list element is itself a variable, it appears that you can change a list element. However, the list element is the variable, not the data. You're not changing the list element, but something the list element refers to. The list element itself doesn't change: it's still the same variable. You also have to be careful about context. You can assign an array to a scalar to get the number of elements in the array. This only works for arrays, though: my $count = @animals; # only works with arrays If you try to do the same thing with what you think is a list, you get a quite different result. Although it looks like you have a list on the righthand side, Perl actually sees a bunch of scalars separated by a comma: my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird Since you're assigning to a scalar, the righthand side is in scalar context. The comma operator (yes, it's an operator!) in scalar context evaluates its lefthand side, throws away the result, and evaluates its righthand side and returns the result. In effect, that list-lookalike assigns to $scalar its rightmost value. Many people mess this up because they choose a list-lookalike whose last element is also the count they expect: my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally What is the difference between $array[1] and @array[1]? (contributed by brian d foy) The difference is the sigil, that special character in front of the array name. The "$" sigil means "exactly one item", while the "@" sigil means "zero or more items".
The "$" gets you a single scalar, while the "@" gets you a list. The confusion arises because people incorrectly assume that the sigil denotes the variable type. The $array[1] is a single-element access to the array. It's going to return the item in index 1 (or undef if there is no item there). If you intend to get exactly one element from the array, this is the form you should use. The @array[1] is an array slice, although it has only one index. You can pull out multiple elements simultaneously by specifying additional indices as a list, like @array[1,4,3,0]. Using a slice on the lefthand side of the assignment supplies list context to the righthand side. This can lead to unexpected results. For instance, if you want to read a single line from a filehandle, assigning to a scalar value is fine: $array[1] = <STDIN>; However, in list context, the line input operator returns all of the lines as a list. The first line goes into @array[1] and the rest of the lines mysteriously disappear: @array[1] = <STDIN>; # most likely not what you want Either the "use warnings" pragma or the -w flag will warn you when you use an array slice with a single index. How can I remove duplicate elements from a list or array? (contributed by brian d foy) Use a hash. When you think the words "unique" or "duplicated", think "hash keys". If you don't care about the order of the elements, you could just create the hash then extract the keys. It's not important how you create that hash: just that you use "keys" to get the unique elements. my %hash = map { $_, 1 } @array; # or a hash slice: @hash{ @array } = (); # or a foreach: $hash{$_} = 1 foreach ( @array ); my @unique = keys %hash; If you want to use a module, try the "uniq" function from List::MoreUtils. In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements. 
use List::MoreUtils qw(uniq); my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7 my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7 You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an element, that element has no key in %seen. The "next" statement creates the key and immediately uses its value, which is "undef", so the loop continues to the "push" and increments the value for that key. The next time the loop sees that same element, its key exists in the hash *and* the value for that key is true (since it's not 0 or "undef"), so the "next" skips that iteration and the loop goes to the next element. my @unique = (); my %seen = (); foreach my $elem ( @array ) { next if $seen{ $elem }++; push @unique, $elem; } You can write this more briefly using a grep, which does the same thing. my %seen = (); my @unique = grep { ! $seen{ $_ }++ } @array; How do I compute the difference of two arrays? How do I compute the intersection of two arrays? Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array: my (@union, @intersection, @difference); my %count = (); foreach my $element (@array1, @array2) { $count{$element}++ } foreach my $element (keys %count) { push @union, $element; push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; } Note that this is the *symmetric difference*, that is, all elements in either A or in B but not in both. Think of it as an xor operation. How do I test whether two arrays or hashes are equal? The following code works for single-level arrays. It uses a stringwise comparison, and does not distinguish defined versus undefined empty strings. Modify if you have other needs.
$are_equal = compare_arrays(\@frogs, \@toads); sub compare_arrays { my ($first, $second) = @_; no warnings; # silence spurious -w undef complaints return 0 unless @$first == @$second; for (my $i = 0; $i < @$first; $i++) { return 0 if $first->[$i] ne $second->[$i]; } return 1; } For multilevel structures, you may wish to use an approach more like this one. It uses the CPAN module FreezeThaw: use FreezeThaw qw(cmpStr); my @a = my @b = ( "this", "that", [ "more", "stuff" ] ); printf "a and b contain %s arrays\n", cmpStr(\@a, \@b) == 0 ? "the same" : "different"; This approach also works for comparing hashes. Here we'll demonstrate two different answers: use FreezeThaw qw(cmpStr cmpStrHard); my %a = my %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); $a{EXTRA} = \%b; $b{EXTRA} = \%a; printf "a and b contain %s hashes\n", cmpStr(\%a, \%b) == 0 ? "the same" : "different"; printf "a and b contain %s hashes\n", cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; The first reports that both of the hashes contain the same data, while the second reports that they do not. Which you prefer is left as an exercise to the reader. Why does defined() return true on empty arrays and hashes? The short story is that you should probably only use defined on scalars or functions, not on aggregates (arrays and hashes). See "defined" in perlfunc in the 5.004 release or later of Perl for more detail. How do I process an entire hash? (contributed by brian d foy) There are a couple of ways that you can process an entire hash. You can get a list of keys, then go through each key, or grab one key-value pair at a time. To go through all of the keys, use the "keys" function. This extracts all of the keys of the hash and gives them back to you as a list. You can then get the value through the particular key you're processing: foreach my $key ( keys %hash ) { my $value = $hash{$key}; ... } Once you have the list of keys, you can process that list before you process the hash elements.
For instance, you can sort the keys so you can process them in lexical order: foreach my $key ( sort keys %hash ) { my $value = $hash{$key}; ... } Or, you might want to only process some of the items. If you only want to deal with the keys that start with "text:", you can select just those using "grep": foreach my $key ( grep /^text:/, keys %hash ) { my $value = $hash{$key}; ... } If the hash is very large, you might not want to create a long list of keys. To save some memory, you can grab one key-value pair at a time using "each()", which returns a pair you haven't seen yet: while( my( $key, $value ) = each( %hash ) ) { ... } The "each" operator returns the pairs in apparently random order, so if ordering matters to you, you'll have to stick with the "keys" method. The "each()" operator can be a bit tricky though. You can't add or delete keys of the hash while you're using it without possibly skipping or re-processing some pairs after Perl internally rehashes all of the elements. Additionally, a hash has only one iterator, so if you mix "keys", "values", or "each" on the same hash, you risk resetting the iterator and messing up your processing. See the "each" entry in perlfunc for more details. What happens if I add or remove keys from a hash while iterating over it? (contributed by brian d foy) The easy answer is "Don't do that!" If you iterate through the hash with each(), you can delete the key most recently returned without worrying about it. If you delete or add other keys, the iterator may skip or double up on them since perl may rearrange the hash table. See the entry for "each()" in perlfunc. How can I know how many entries are in a hash? (contributed by brian d foy) This is very similar to "How do I process an entire hash?", also in perlfaq4, but a bit simpler in the common cases. You can use the "keys()" built-in function in scalar context to find out how many entries you have in a hash: my $key_count = keys %hash; # must be scalar context!
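To illustrate why the context matters, here is a small sketch with a hypothetical three-entry hash. The variable names are made up for the example; the point is that parentheses on the left side of the assignment switch "keys" to list context and hand you a key instead of a count.

```perl
use strict;
use warnings;

my %hash = ( apple => 1, banana => 2, cherry => 3 );

my $key_count  = keys %hash;          # scalar context: the number of entries
my ($some_key) = keys %hash;          # list context: one of the keys, not a count
my $same_count = scalar keys %hash;   # "scalar" forces the count explicitly

print "count: $key_count, one key: $some_key\n";
```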
If you want to find out how many entries have a defined value, that's a bit different. You have to check each value. A "grep" is handy: my $defined_value_count = grep { defined } values %hash; You can use that same structure to count the entries any way that you like. If you want the count of the keys with vowels in them, you just test for that instead: my $vowel_count = grep { /[aeiou]/ } keys %hash; The "grep" in scalar context returns the count. If you want the list of matching items, just use it in list context instead: my @defined_values = grep { defined } values %hash; The "keys()" function also resets the iterator, which means that you may see strange results if you use this between uses of other hash operators such as "each()". What's the difference between "delete" and "undef" with hashes? Hashes contain pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in %hash, "exists($hash{$key})" will return true. The value for a given key can be "undef", in which case $hash{$key} will be "undef" while "exists $hash{$key}" will return true. This corresponds to ($key, "undef") being in the hash. Pictures help... 
Here's the %hash table: keys values +------+------+ | a | 3 | | x | 7 | | d | 0 | | e | 2 | +------+------+ And these conditions hold $hash{'a'} is true $hash{'d'} is false defined $hash{'d'} is true defined $hash{'a'} is true exists $hash{'a'} is true (Perl 5 only) grep ($_ eq 'a', keys %hash) is true If you now say undef $hash{'a'} your table now reads: keys values +------+------+ | a | undef| | x | 7 | | d | 0 | | e | 2 | +------+------+ and these conditions now hold; changes in caps: $hash{'a'} is FALSE $hash{'d'} is false defined $hash{'d'} is true defined $hash{'a'} is FALSE exists $hash{'a'} is true (Perl 5 only) grep ($_ eq 'a', keys %hash) is true Notice the last two: you have an undef value, but a defined key! Now, consider this: delete $hash{'a'} your table now reads: keys values +------+------+ | x | 7 | | d | 0 | | e | 2 | +------+------+ and these conditions now hold; changes in caps: $hash{'a'} is false $hash{'d'} is false defined $hash{'d'} is true defined $hash{'a'} is false exists $hash{'a'} is FALSE (Perl 5 only) grep ($_ eq 'a', keys %hash) is FALSE See, the whole entry is gone! How do I reset an each() operation part-way through? (contributed by brian d foy) You can use the "keys" or "values" functions to reset "each". To simply reset the iterator used by "each" without doing anything else, use one of them in void context: keys %hash; # resets iterator, nothing else. values %hash; # resets iterator, nothing else. See the documentation for "each" in perlfunc. How can I store a multidimensional array in a DBM file? Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module from CPAN and layer it on top of either DB_File or GDBM_File. You might also try DBM::Deep, but it can be a bit slow. How can I make my hash remember the order I put elements into it? Use the Tie::IxHash from CPAN. 
use Tie::IxHash; tie my %myhash, 'Tie::IxHash'; for (my $i=0; $i<20; $i++) { $myhash{$i} = 2*$i; } my @keys = keys %myhash; # @keys = (0,1,2,3,...) Why does passing a subroutine an undefined element in a hash create it? (contributed by brian d foy) Are you using a really old version of Perl? Normally, accessing a hash key's value for a nonexistent key will *not* create the key. my %hash = (); my $value = $hash{ 'foo' }; print "This won't print\n" if exists $hash{ 'foo' }; Passing $hash{ 'foo' } to a subroutine used to be a special case, though. Since you could assign directly to $_[0], Perl had to be ready to make that assignment so it created the hash key ahead of time: my_sub( $hash{ 'foo' } ); print "This will print before 5.004\n" if exists $hash{ 'foo' }; sub my_sub { # $_[0] = 'bar'; # create hash key in case you do this 1; } Since Perl 5.004, however, this situation is a special case and Perl creates the hash key only when you make the assignment: my_sub( $hash{ 'foo' } ); print "This will print, even after 5.004\n" if exists $hash{ 'foo' }; sub my_sub { $_[0] = 'bar'; } However, if you want the old behavior (and think carefully about that because it's a weird side effect), you can pass a hash slice instead. Perl 5.004 didn't make this a special case: my_sub( @hash{ qw/foo/ } ); How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? Usually a hash ref, perhaps like this: $record = { NAME => "Jason", EMPNO => 132, TITLE => "deputy peon", AGE => 23, SALARY => 37_000, PALS => [ "Norbert", "Rhys", "Phineas"], }; References are documented in perlref and perlreftut. Examples of complex data structures are given in perldsc and perllol. Examples of structures and object-oriented classes are in perlootut. How can I use a reference as a hash key? (contributed by brian d foy and Ben Morrow) Hash keys are strings, so you can't really use a reference as the key. 
When you try to do that, perl turns the reference into its stringified form (for instance, "HASH(0xDEADBEEF)"). From there you can't get back the reference from the stringified form, at least without doing some extra work on your own. Remember that the entry in the hash will still be there even if the referenced variable goes out of scope, and that it is entirely possible for Perl to subsequently allocate a different variable at the same address. This will mean a new variable might accidentally be associated with the value for an old. If you have Perl 5.10 or later, and you just want to store a value against the reference for lookup later, you can use the core Hash::Util::Fieldhash module. This will also handle renaming the keys if you use multiple threads (which causes all variables to be reallocated at new addresses, changing their stringification), and garbage-collecting the entries when the referenced variable goes out of scope. If you actually need to be able to get a real reference back from each hash entry, you can use the Tie::RefHash module, which does the required work for you. How can I prevent addition of unwanted keys into a hash? Since version 5.8.0, hashes can be *restricted* to a fixed number of given keys. Methods for creating and dealing with restricted hashes are exported by the Hash::Util module. How do I handle binary data correctly? Perl is binary-clean, so it can handle binary data just fine. On Windows or DOS, however, you have to use "binmode" for binary files to avoid conversions for line endings. In general, you should use "binmode" any time you want to work with binary data. Also see "binmode" in perlfunc or perlopentut. If you're concerned about 8-bit textual data then see perllocale. If you want to deal with multibyte characters, however, there are some gotchas. See the section on Regular Expressions. How do I print out or copy a recursive data structure? 
The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great for printing out data structures. The Storable module on CPAN (or the 5.8 release of Perl), provides a function called "dclone" that recursively copies its argument. use Storable qw(dclone); $r2 = dclone($r1); Where $r1 can be a reference to any kind of data structure you'd like. It will be deeply copied. Because "dclone" takes and returns references, you'd have to add extra punctuation if you had a hash of arrays that you wanted to copy. %newhash = %{ dclone(\%oldhash) }; How do I verify a credit card checksum? Get the Business::CreditCard module from CPAN. Found in /usr/share/perl/5.34/pod/perlfaq5.pod How can I manipulate fixed-record-length files? The most efficient way is using pack() and unpack(). This is faster than using substr() when taking many, many strings. It is slower for just a few. Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, Berkeley-style ps: # sample input line: # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what my $PS_T = 'A6 A4 A7 A5 A*'; open my $ps, '-|', 'ps'; print scalar <$ps>; my @fields = qw( pid tt stat time command ); while (<$ps>) { my %process; @process{@fields} = unpack($PS_T, $_); for my $field ( @fields ) { print "$field: <$process{$field}>\n"; } print 'line=', pack($PS_T, @process{@fields} ), "\n"; } We've used a hash slice in order to easily handle the fields of each row. Storing the keys in an array makes it easy to operate on them as a group or loop over them with "for". It also avoids polluting the program with global variables and using symbolic references. How can I use a filehandle indirectly? An indirect filehandle is the use of something other than a symbol in a place that a filehandle is expected. 
Here are ways to get indirect filehandles: $fh = SOME_FH; # bareword is strict-subs hostile $fh = "SOME_FH"; # strict-refs hostile; same package only $fh = *SOME_FH; # typeglob $fh = \*SOME_FH; # ref to typeglob (bless-able) $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob Or, you can use the "new" method from one of the IO::* modules to create an anonymous filehandle and store that in a scalar variable. use IO::Handle; # 5.004 or higher my $fh = IO::Handle->new(); Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a filehandle. Functions like "print", "open", "seek", or the "<FH>" diamond operator will accept either a named filehandle or a scalar variable containing one: ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR); print $ofh "Type it: "; my $got = <$ifh>; print $efh "What was that: $got"; If you're passing a filehandle to a function, you can write the function in two ways: sub accept_fh { my $fh = shift; print $fh "Sending to indirect filehandle\n"; } Or it can localize a typeglob and use the filehandle directly: sub accept_fh { local *FH = shift; print FH "Sending to localized filehandle\n"; } Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.) accept_fh(*STDOUT); accept_fh($handle); In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not expressions or subscripts of hashes or arrays, can be used with built-ins like "print", "printf", or the diamond operator.
Using something other than a simple scalar variable as a filehandle is illegal and won't even compile: my @fd = (*STDIN, *STDOUT, *STDERR); print $fd[1] "Type it: "; # WRONG my $got = <$fd[0]>; # WRONG print $fd[2] "What was that: $got"; # WRONG With "print" and "printf", you get around this by using a block and an expression where you would place the filehandle: print { $fd[1] } "funny stuff\n"; printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559; # Pity the poor deadbeef. That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places: my $ok = -x "/bin/cat"; print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n"; print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n"; This approach of treating "print" and "printf" like object method calls doesn't work for the diamond operator. That's because it's a real operator, not just a function with a comma-less argument. Assuming you've been storing typeglobs in your structure as we did above, you can use the built-in function named "readline" to read a record just as "<>" does. Given the initialization shown above for @fd, this would work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet. $got = readline($fd[0]); Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help you at all here. How come when I open a file read-write it wipes it out? Because you're using something like this, which truncates the file *then* gives you read-write access: open my $fh, '+>', '/path/name'; # WRONG (almost always) Whoops. You should instead use this, which will fail if the file doesn't exist: open my $fh, '+<', '/path/name'; # open for update Using ">" always clobbers or creates. Using "<" never does either.
The "+" doesn't change this. Here are examples of many kinds of file opens. Those using "sysopen" all assume that you've pulled in the constants from Fcntl: use Fcntl; To open file for reading: open my $fh, '<', $path or die $!; sysopen my $fh, $path, O_RDONLY or die $!; To open file for writing, create new file if needed or else truncate old file: open my $fh, '>', $path or die $!; sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!; sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!; To open file for writing, create new file, file must not exist: sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!; sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!; To open file for appending, create if necessary: open my $fh, '>>', $path or die $!; sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!; sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!; To open file for appending, file must exist: sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!; To open file for update, file must exist: open my $fh, '+<', $path or die $!; sysopen my $fh, $path, O_RDWR or die $!; To open file for update, create file if necessary: sysopen my $fh, $path, O_RDWR|O_CREAT or die $!; sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!; To open file for update, file must not exist: sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!; sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!; To open a file without blocking, creating if necessary: sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT or die "can't open /foo/somefile: $!"; Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't as exclusive as you might wish. See also perlopentut. How can I reliably rename a file?
If your operating system supports a proper mv(1) utility or its functional equivalent, this works: rename($old, $new) or system("mv", $old, $new); It may be more portable to use the File::Copy module instead. You just copy the file to the new name (checking return values), then delete the old one. This isn't really the same semantically as a "rename()", which preserves meta-information like permissions, timestamps, inode info, etc. I still don't get locking. I just want to increment the number in the file. How can I do this? Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic. Anyway, this is what you can do if you can't help yourself. use Fcntl qw(:DEFAULT :flock); sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!"; flock $fh, LOCK_EX or die "can't flock numfile: $!"; my $num = <$fh> || 0; seek $fh, 0, 0 or die "can't rewind numfile: $!"; truncate $fh, 0 or die "can't truncate numfile: $!"; (print $fh $num+1, "\n") or die "can't write numfile: $!"; close $fh or die "can't close numfile: $!"; Here's a much better web-page hit counter: $hits = int( (time() - 850_000_000) / rand(1_000) ); If the count doesn't impress your friends, then the code might. :-) How do I print to more than one file at once? To connect one filehandle to several output filehandles, you can use the IO::Tee or Tie::FileHandle::Multiplex modules. If you only have to do this once, you can print individually to each filehandle. for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" } How can I read in an entire file all at once?
The customary Perl approach for processing all the lines in a file is to do so one line at a time: open my $input, '<', $file or die "can't open $file: $!"; while (<$input>) { chomp; # do something with $_ } close $input or die "can't close $file: $!"; This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often--if not almost always--the wrong approach. Whenever you see someone do this: my @lines = <INPUT>; You should think long and hard about why you need everything loaded at once. It's just not a scalable solution. If you "mmap" the file with the File::Map module from CPAN, you can virtually load the entire file into a string without actually storing it in memory: use File::Map qw(map_file); map_file my $string, $filename; Once mapped, you can treat $string as you would any other string. Since you don't necessarily have to load the data, mmap-ing can be very fast and may not increase your memory footprint. You might also find it more fun to use the standard Tie::File module, or the DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file. If you want to load the entire file, you can use the Path::Tiny module to do it in one simple and efficient step: use Path::Tiny; my $all_of_it = path($filename)->slurp; # entire file in scalar my @all_lines = path($filename)->lines; # one line per element Or you can read the entire file contents into a scalar like this: my $var; { local $/; open my $fh, '<', $file or die "can't open $file: $!"; $var = <$fh>; } That temporarily undefs your record separator, and will automatically close the file at block exit. 
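The block-scoped slurp described above can be tried end to end with a sketch like the following. The temporary file and its three lines are assumptions added so that the example is self-contained; File::Temp ships with Perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the sketch does not depend on existing data.
my ($out, $file) = tempfile( UNLINK => 1 );
print {$out} "line one\nline two\nline three\n";
close $out or die "can't close $file: $!";

my $var;
{
    local $/;    # undef the record separator inside this block only
    open my $fh, '<', $file or die "can't open $file: $!";
    $var = <$fh>;    # one read returns the whole file
}                    # $/ is restored and $fh is closed here

# Count newlines with a global match in list context.
my $line_count = () = $var =~ /\n/g;
print "read $line_count lines in a single scalar\n";
```

Keeping "local $/" inside a bare block limits the change to that one read, so any line-by-line reading elsewhere in the program still sees the usual separator.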
If the file is already open, just use this:

    my $var = do { local $/; <$fh> };

You can also use a localized @ARGV to eliminate the "open":

    my $var = do { local( @ARGV, $/ ) = $file; <> };

For ordinary files you can also use the "read" function.

    read( $fh, $var, -s $fh );

That third argument tests the byte size of the data on the $fh filehandle and reads that many bytes into the buffer $var.

How can I read in a file by paragraphs?

Use the $/ variable (see perlvar for details). You can either set it to "" to eliminate empty paragraphs ("abc\n\n\n\ndef", for instance, gets treated as two paragraphs and not three), or "\n\n" to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus "fred\n \nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.

How can I read a single character from a file? From the keyboard?

You can use the builtin "getc()" function for most filehandles, but it won't (easily) work on a terminal device. For STDIN, either use the Term::ReadKey module from CPAN or use the sample code in "getc" in perlfunc.

If your system supports the portable operating system programming interface (POSIX), you can use the following code, which you'll note turns off echo processing as well.

    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    for (1..4) {
        print "gimme: ";
        my $got = getone();
        print "--> $got\n";
    }
    exit;

    BEGIN {
        use POSIX qw(:termios_h);

        my ($term, $oterm, $echo, $noecho, $fd_stdin);

        $fd_stdin = fileno(STDIN);    # declared above; a second "my" would mask it

        $term     = POSIX::Termios->new();
        $term->getattr($fd_stdin);
        $oterm    = $term->getlflag();

        $echo     = ECHO | ECHOK | ICANON;
        $noecho   = $oterm & ~$echo;

        sub cbreak {
            $term->setlflag($noecho);
            $term->setcc(VTIME, 1);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub cooked {
            $term->setlflag($oterm);
            $term->setcc(VTIME, 0);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub getone {
            my $key = '';
            cbreak();
            sysread(STDIN, $key, 1);
            cooked();
            return $key;
        }
    }

    END { cooked() }

The Term::ReadKey module from CPAN may be easier to use.
Recent versions also include support for non-portable systems.

    use Term::ReadKey;
    open my $tty, '<', '/dev/tty' or die "can't open /dev/tty: $!";
    print "Gimme a char: ";
    ReadMode "raw";
    my $key = ReadKey 0, $tty;
    ReadMode "normal";
    printf "\nYou said %s, char number %03d\n", $key, ord $key;

How can I tell whether there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey extension from CPAN. As we mentioned earlier, it now even has limited support for non-portable (read: not open systems, closed, proprietary, not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. It's very system-dependent. Here's one solution that works on BSD systems:

    sub key_ready {
        my($rin, $nfd);
        vec($rin, fileno(STDIN), 1) = 1;
        return $nfd = select($rin,undef,undef,0);
    }

If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The *h2ph* tool that comes with Perl tries to convert C include files to Perl code, which can be "require"d. FIONREAD ends up defined as a function in the *sys/ioctl.ph* file:

    require './sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If *h2ph* wasn't installed or doesn't work for you, you can *grep* the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.
    $FIONREAD = 0x4004667f; # XXX: opsys dependent

    $size = pack("L", 0);
    ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning that sockets, pipes, and tty devices work, but *not* files.

Why does Perl let me delete read-only files? Why does "-i" clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the file-dir-perms article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> .

The executive summary: learn how your filesystem works. The permissions on a file say what can happen to the data in that file. The permissions on a directory say what can happen to the list of files in that directory. If you delete a file, you're removing its name from the directory (so the operation depends on the permissions of the directory, not of the file). If you try to write to the file, the permissions of the file govern whether you're allowed to.

How do I traverse a directory tree?

(contributed by brian d foy)

The File::Find module, which comes with Perl, does all of the hard work to traverse a directory structure. You simply call the "find" subroutine with a callback subroutine and the directories you want to traverse:

    use File::Find;

    find( \&wanted, @directories );

    sub wanted {
        # full path in $File::Find::name
        # just filename in $_
        ... do whatever you want to do ...
    }

The File::Find::Closures module, which you can download from CPAN, provides many ready-to-use subroutines that you can use with File::Find.
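To make the core "find" call concrete, here is a minimal sketch that collects every plain file whose name ends in a given suffix. The helper name find_by_suffix is an assumption for illustration; only File::Find itself, which ships with Perl, is used.

```perl
use strict;
use warnings;
use File::Find;

# find_by_suffix is a hypothetical helper, not part of File::Find.
sub find_by_suffix {
    my ($suffix, @dirs) = @_;
    my @found;

    find(
        sub {
            # By default find() has chdir'd into the current directory, so
            # $_ is the bare filename and $File::Find::name the full path.
            push @found, $File::Find::name
                if -f $_ && /\Q$suffix\E\z/;
        },
        @dirs,
    );

    my @sorted = sort @found;
    return @sorted;
}
```

A call such as "my @modules = find_by_suffix('.pm', @INC);" would list module files, assuming every directory in the list exists.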
The File::Finder, which you can download from CPAN, can help you create the callback subroutine using something closer to the syntax of the "find" command-line utility: use File::Find; use File::Finder; my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}'); find( $deep_dirs->as_options, @places ); The File::Find::Rule module, which you can download from CPAN, has a similar interface, but does the traversal for you too: use File::Find::Rule; my @files = File::Find::Rule->file() ->name( '*.pm' ) ->in( @INC ); How do I delete a directory tree? (contributed by brian d foy) If you have an empty directory, you can use Perl's built-in "rmdir". If the directory is not empty (so, with files or subdirectories), you either have to empty it yourself (a lot of work) or use a module to help you. The File::Path module, which comes with Perl, has a "remove_tree" which can take care of all of the hard work for you: use File::Path qw(remove_tree); remove_tree( @directories ); The File::Path module also has a legacy interface to the older "rmtree" subroutine. How do I copy an entire directory? (contributed by Shlomi Fish) To do the equivalent of "cp -R" (i.e. copy an entire directory tree recursively) in portable Perl, you'll either need to write something yourself or find a good CPAN module such as File::Copy::Recursive. Found in /usr/share/perl/5.34/pod/perlfaq6.pod How can I hope to use regular expressions without creating illegible and unmaintainable code? Three techniques can make regular expressions maintainable and understandable. Comments Outside the Regex Describe what you're doing and how you're doing it, using normal Perl comments. # turn the line into the first word, a colon, and the # number of characters on the rest of the line s/^(\w+)(.*)/ lc($1) . ":" . 
length($2) /meg; Comments Inside the Regex The "/x" modifier causes whitespace to be ignored in a regex pattern (except in a character class and a few other places), and also allows you to use normal comments there, too. As you can imagine, whitespace and comments help a lot. "/x" lets you turn this: s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs; into this: s{ < # opening angle bracket (?: # Non-backreffing grouping paren [^>'"] * # 0 or more things that are neither > nor ' nor " | # or else ".*?" # a section between double quotes (stingy match) | # or else '.*?' # a section between single quotes (stingy match) ) + # all occurring one or more times > # closing angle bracket }{}gsx; # replace with nothing, i.e. delete It's still not quite so clear as prose, but it is very useful for describing the meaning of each part of the pattern. Different Delimiters While we normally think of patterns as being delimited with "/" characters, they can be delimited by almost any character. perlre describes this. For example, the "s///" above uses braces as delimiters. Selecting another delimiter can avoid quoting the delimiter within the pattern: s/\/usr\/local/\/usr\/share/g; # bad delimiter choice s#/usr/local#/usr/share#g; # better Using logically paired delimiters can be even more readable: s{/usr/local/}{/usr/share}g; # better still I'm having trouble matching over more than one line. What's wrong? Either you don't have more than one line in the string you're looking at (probably), or else you aren't using the correct modifier(s) on your pattern (possibly). There are many ways to get multiline data into a string. If you want it to happen automatically while reading input, you'll want to set $/ (probably to '' for paragraphs or "undef" for the whole file) to allow you to read more than one line at a time. 
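As a small sketch of that setting, the subroutine below reads a filehandle in paragraph mode, so each returned element is a multiline string. The name read_paragraphs is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# read_paragraphs is a hypothetical helper name, used only for illustration.
sub read_paragraphs {
    my ($fh) = @_;

    local $/ = '';              # paragraph mode: blank lines separate records
    my @paragraphs = <$fh>;     # list context reads every record at once
    chomp @paragraphs;          # in paragraph mode this strips all trailing newlines

    return @paragraphs;
}
```

Each element may now contain embedded newlines, which is the precondition for "/s" or "/m" to make any difference.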
Read perlre to help you decide which of "/s" and "/m" (or both) you might want to use: "/s" allows dot to include newline, and "/m" allows caret and dollar to match next to a newline, not just at the end of the string. You do need to make sure that you've actually got a multiline string in there. For example, this program detects duplicate words, even when they span line breaks (but not paragraph ones). For this example, we don't need "/s" because we aren't using dot in a regular expression that we want to cross line boundaries. Neither do we need "/m" because we don't want caret or dollar to match at any point inside the record next to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in. $/ = ''; # read in whole paragraph, not just one line while ( <> ) { while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) { # word starts alpha print "Duplicate $1 at paragraph $.\n"; } } Here's some code that finds sentences that begin with "From " (which would be mangled by many mailers): $/ = ''; # read in whole paragraph, not just one line while ( <> ) { while ( /^From /gm ) { # /m makes ^ match next to \n print "leading From in paragraph $.\n"; } } Here's code that finds everything between START and END in a paragraph: undef $/; # read in whole file, not just one line or paragraph while ( <> ) { while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries print "$1\n"; } } How can I pull out lines between two patterns that are themselves on different lines? You can use Perl's somewhat exotic ".." operator (documented in perlop): perl -ne 'print if /START/ .. /END/' file1 file2 ... If you wanted text and not lines, you would use perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... But if you want nested occurrences of "START" through "END", you'll run up against the problem described in the question in this section on matching balanced text. 
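Inside a script rather than a one-liner, the same range operator might be used as in the sketch below. The helper name lines_between is an assumption for illustration, and the sketch works on a list of lines already in memory rather than on <>.

```perl
use strict;
use warnings;

# lines_between is a hypothetical helper name, used only for illustration.
sub lines_between {
    my ($start_re, $end_re, @lines) = @_;
    my @kept;

    for my $line (@lines) {
        # In scalar context ".." is a flip-flop: false until the left match
        # succeeds, then true up to and including the line where the right
        # match succeeds. Its state lives in this one operator, so a range
        # left open at the end of one call carries over into the next call.
        push @kept, $line if $line =~ $start_re .. $line =~ $end_re;
    }

    return @kept;
}
```

For example, "lines_between(qr/START/, qr/END/, @lines)" keeps the delimiter lines as well as what lies between them, just as the one-liner does.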
Here's another example of using "..": while (<>) { my $in_header = 1 .. /^$/; my $in_body = /^$/ .. eof; # now choose between them } continue { $. = 0 if eof; # fix $. } How do I match XML, HTML, or other nasty, ugly things with a regex? Do not use regexes. Use a module and forget about the regular expressions. The XML::LibXML, HTML::TokeParser and HTML::TreeBuilder modules are good starts, although each namespace has other parsing modules specialized for certain tasks and different ways of doing it. Start at CPAN Search ( <http://metacpan.org/> ) and wonder at all the work people have done for you already! :) I put a regular expression into $/ but it didn't work. What's wrong? $/ has to be a string. You can use these examples if you really need to do this. If you have File::Stream, this is easy. use File::Stream; my $stream = File::Stream->new( $filehandle, separator => qr/\s*,\s*/, ); print "$_\n" while <$stream>; If you don't have File::Stream, you have to do a little more work. You can use the four-argument form of sysread to continually add to a buffer. After you add to the buffer, you check if you have a complete line (using your regular expression). local $_ = ""; while( sysread FH, $_, 8192, length ) { while( s/^((?s).*?)your_pattern// ) { my $record = $1; # do stuff here. } } You can do the same thing with foreach and a match using the c flag and the \G anchor, if you do not mind your entire file being in memory at the end. local $_ = ""; while( sysread FH, $_, 8192, length ) { foreach my $record ( m/\G((?s).*?)your_pattern/gc ) { # do stuff here. } substr( $_, 0, pos ) = "" if pos; } How do I substitute case-insensitively on the LHS while preserving case on the RHS? Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings. $_= "this is a TEsT case"; $old = 'test'; $new = 'success'; s{(\Q$old\E)} { uc $new | (uc $1 ^ $1) . 
(uc(substr $1, -1) ^ substr $1, -1) x (length($new) - length $1) }egi; print; And here it is as a subroutine, modeled after the above: sub preserve_case { my ($old, $new) = @_; my $mask = uc $old ^ $old; uc $new | $mask . substr($mask, -1) x (length($new) - length($old)) } $string = "this is a TEsT case"; $string =~ s/(test)/preserve_case($1, "success")/egi; print "$string\n"; This prints: this is a SUcCESS case As an alternative, to keep the case of the replacement word if it is longer than the original, you can use this code, by Jeff Pinyan: sub preserve_case { my ($from, $to) = @_; my ($lf, $lt) = map length, @_; if ($lt < $lf) { $from = substr $from, 0, $lt } else { $from .= substr $to, $lf } return uc $to | ($from ^ uc $from); } This changes the sentence to "this is a SUcCess case." Just to show that C programmers can write C in any programming language, if you prefer a more C-like solution, the following script makes the substitution have the same case, letter by letter, as the original. (It also happens to run about 240% slower than the Perlish solution runs.) If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution. # Original by Nathan Torkington, massaged by Jeffrey Friedl # sub preserve_case { my ($old, $new) = @_; my $state = 0; # 0 = no change; 1 = lc; 2 = uc my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new)); my $len = $oldlen < $newlen ? 
$oldlen : $newlen; for ($i = 0; $i < $len; $i++) { if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) { $state = 0; } elsif (lc $c eq $c) { substr($new, $i, 1) = lc(substr($new, $i, 1)); $state = 1; } else { substr($new, $i, 1) = uc(substr($new, $i, 1)); $state = 2; } } # finish up with any remaining new (for when new is longer than old) if ($newlen > $oldlen) { if ($state == 1) { substr($new, $oldlen) = lc(substr($new, $oldlen)); } elsif ($state == 2) { substr($new, $oldlen) = uc(substr($new, $oldlen)); } } return $new; } How can I quote a variable to use in a regex? The Perl parser will expand $variable and @variable references in regular expressions unless the delimiter is a single quote. Remember, too, that the right-hand side of a "s///" substitution is considered a double-quoted string (see perlop for more details). Remember also that any regex special characters will be acted on unless you precede the substitution with \Q. Here's an example: $string = "Placido P. Octopus"; $regex = "P."; $string =~ s/$regex/Polyp/; # $string is now "Polypacido P. Octopus" Because "." is special in regular expressions, and can match any single character, the regex "P." here has matched the <Pl> in the original string. To escape the special meaning of ".", we use "\Q": $string = "Placido P. Octopus"; $regex = "P."; $string =~ s/\Q$regex/Polyp/; # $string is now "Placido Polyp Octopus" The use of "\Q" causes the "." in the regex to be treated as a regular character, so that "P." matches a "P" followed by a dot. What is "/o" really for? (contributed by brian d foy) The "/o" option for regular expressions (documented in perlop and perlreref) tells Perl to compile the regular expression only once. This is only useful when the pattern contains a variable. Perls 5.6 and later handle this automatically if the pattern does not change. 
Since the match operator "m//", the substitution operator "s///", and the regular expression quoting operator "qr//" are double-quotish constructs, you can interpolate variables into the pattern. See the answer to "How can I quote a variable to use in a regex?" for more details.

This example takes a regular expression from the argument list and prints the lines of input that match it:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/;
    }

Versions of Perl prior to 5.6 would recompile the regular expression for each iteration, even if $pattern had not changed. The "/o" would prevent this by telling Perl to compile the pattern the first time, then reuse that for subsequent iterations:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/o; # useful for Perl < 5.6
    }

In versions 5.6 and later, Perl won't recompile the regular expression if the variable hasn't changed, so you probably don't need the "/o" option. It doesn't hurt, but it doesn't help either. If you want any version of Perl to compile the regular expression only once even if the variable changes (thus, only using its initial value), you still need the "/o".

You can watch Perl's regular expression engine at work to verify for yourself if Perl is recompiling a regular expression. The "use re 'debug'" pragma (comes with Perl 5.005 and later) shows the details. With Perls before 5.6, you should see "re" reporting that it's compiling the regular expression on each iteration. With Perl 5.6 or later, you should only see "re" report that for the first iteration.

    use re 'debug';

    my $regex = 'Perl';
    foreach ( qw(Perl Java Ruby Python) ) {
        print STDERR "-" x 73, "\n";
        print STDERR "Trying $_...\n";
        print STDERR "\t$_ is good!\n" if m/$regex/;
    }

How do I use a regular expression to strip C-style comments from a file?

While this actually can be done, it's much harder than you'd think. For example, this one-liner

    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

will work in many but not all cases.
You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis.

    $/ = undef;
    $_ = <>;
    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
    print;

This could, of course, be more legibly written with the "/x" modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis.

    s{
       /\*         ##  Start of /* ... */ comment
       [^*]*\*+    ##  Non-* followed by 1-or-more *'s
       (
         [^/*][^*]*\*+
       )*          ##  0-or-more things which don't start with /
                   ##    but do end with '*'
       /           ##  End of /* ... */ comment

     |         ##     OR  various things which aren't comments:

       (
         "           ##  Start of " ... " string
         (
           \\.           ##  Escaped char
         |               ##    OR
           [^"\\]        ##  Non "\
         )*
         "           ##  End of " ... " string

       |         ##     OR

         '           ##  Start of ' ... ' string
         (
           \\.           ##  Escaped char
         |               ##    OR
           [^'\\]        ##  Non '\
         )*
         '           ##  End of ' ... ' string

       |         ##     OR

         .           ##  Any other char
         [^/"'\\]*   ##  Chars which don't start a comment, string or escape
       )
     }{defined $2 ? $2 : ""}gxse;

A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character:

    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;

Can I use Perl regular expressions to match balanced text?

(contributed by brian d foy)

Your first try should probably be the Text::Balanced module, which has been in the Perl standard library since Perl 5.8. It has a variety of functions to deal with tricky text. The Regexp::Common module can also help by providing canned patterns you can use.

As of Perl 5.10, you can match balanced text with regular expressions using recursive patterns. Before Perl 5.10, you had to resort to various tricks such as using Perl code in "(??{})" sequences.

Here's an example using a recursive regular expression.
The goal is to capture all of the text within angle brackets, including the text in nested angle brackets. This sample text has two "major" groups: a group with one level of nesting and a group with two levels of nesting. There are five total groups in angle brackets: I have some <brackets in <nested brackets> > and <another group <nested once <nested twice> > > and that's it. The regular expression to match the balanced text uses two new (to Perl 5.10) regular expression features. These are covered in perlre and this example is a modified version of one in that documentation. First, adding the new possessive "+" to any quantifier finds the longest match and does not backtrack. That's important since you want to handle any angle brackets through the recursion, not backtracking. The group "[^<>]++" finds one or more non-angle brackets without backtracking. Second, the new "(?PARNO)" refers to the sub-pattern in the particular capture group given by "PARNO". In the following regex, the first capture group finds (and remembers) the balanced text, and you need that same pattern within the first buffer to get past the nested text. That's the recursive part. The "(?1)" uses the pattern in the outer capture group as an independent part of the regex. Putting it all together, you have: #!/usr/local/bin/perl5.10.0 my $string =<<"HERE"; I have some <brackets in <nested brackets> > and <another group <nested once <nested twice> > > and that's it. 
HERE my @groups = $string =~ m/ ( # start of capture group 1 < # match an opening angle bracket (?: [^<>]++ # one or more non angle brackets, non backtracking | (?1) # found < or >, so recurse to capture group 1 )* > # match a closing angle bracket ) # end of capture group 1 /xg; $" = "\n\t"; print "Found:\n\t@groups\n"; The output shows that Perl found the two major groups: Found: <brackets in <nested brackets> > <another group <nested once <nested twice> > > With a little extra work, you can get all of the groups in angle brackets even if they are in other angle brackets too. Each time you get a balanced match, remove its outer delimiter (that's the one you just matched so don't match it again) and add it to a queue of strings to process. Keep doing that until you get no matches: #!/usr/local/bin/perl5.10.0 my @queue =<<"HERE"; I have some <brackets in <nested brackets> > and <another group <nested once <nested twice> > > and that's it. HERE my $regex = qr/ ( # start of bracket 1 < # match an opening angle bracket (?: [^<>]++ # one or more non angle brackets, non backtracking | (?1) # recurse to bracket 1 )* > # match a closing angle bracket ) # end of bracket 1 /x; $" = "\n\t"; while( @queue ) { my $string = shift @queue; my @groups = $string =~ m/$regex/g; print "Found:\n\t@groups\n\n" if @groups; unshift @queue, map { s/^<//; s/>$//; $_ } @groups; } The output shows all of the groups. The outermost matches show up first and the nested matches show up later: Found: <brackets in <nested brackets> > <another group <nested once <nested twice> > > Found: <nested brackets> Found: <nested once <nested twice> > Found: <nested twice> What does it mean that regexes are greedy? How can I get around it? Most people mean that greedy regexes match as much as they can. Technically speaking, it's actually the quantifiers ("?", "*", "+", "{}") that are greedy rather than the whole pattern; Perl prefers local greed and immediate gratification to overall greed. 
To get non-greedy versions of the same quantifiers, use ("??", "*?", "+?", "{}?"). An example: my $s1 = my $s2 = "I am very very cold"; $s1 =~ s/ve.*y //; # I am cold $s2 =~ s/ve.*?y //; # I am very cold Notice how the second substitution stopped matching as soon as it encountered "y ". The "*?" quantifier effectively tells the regular expression engine to find a match as quickly as possible and pass control on to whatever is next in line, as you would if you were playing hot potato. How can I print out a word-frequency or line-frequency summary? To do this, you have to parse out each word in the input stream. We'll pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question: my (%seen); while (<>) { while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" $seen{$1}++; } } while ( my ($word, $count) = each %seen ) { print "$count $word\n"; } If you wanted to do the same thing for lines, you wouldn't need a regular expression: my (%seen); while (<>) { $seen{$_}++; } while ( my ($line, $count) = each %seen ) { print "$count $line"; } If you want these output in a sorted order, see perlfaq4: "How do I sort a hash (optionally by value instead of key)?". How do I efficiently match many regular expressions at once? (contributed by brian d foy) You want to avoid compiling a regular expression every time you want to match it. In this example, perl must recompile the regular expression for every iteration of the "foreach" loop since $pattern can change: my @patterns = qw( fo+ ba[rz] ); LINE: while( my $line = <> ) { foreach my $pattern ( @patterns ) { if( $line =~ m/\b$pattern\b/i ) { print $line; next LINE; } } } The "qr//" operator compiles a regular expression, but doesn't apply it. When you use the pre-compiled version of the regex, perl does less work. In this example, I inserted a "map" to turn each pattern into its pre-compiled form. 
The rest of the script is the same, but faster: my @patterns = map { qr/\b$_\b/i } qw( fo+ ba[rz] ); LINE: while( my $line = <> ) { foreach my $pattern ( @patterns ) { if( $line =~ m/$pattern/ ) { print $line; next LINE; } } } In some cases, you may be able to make several patterns into a single regular expression. Beware of situations that require backtracking though. In this example, the regex is only compiled once because $regex doesn't change between iterations: my $regex = join '|', qw( fo+ ba[rz] ); while( my $line = <> ) { print if $line =~ m/\b(?:$regex)\b/i; } The function "list2re" in Data::Munge on CPAN can also be used to form a single regex that matches a list of literal strings (not regexes). For more details on regular expression efficiency, see *Mastering Regular Expressions* by Jeffrey Friedl. He explains how the regular expressions engine works and why some patterns are surprisingly inefficient. Once you understand how perl applies regular expressions, you can tune them for individual situations. What good is "\G" in a regular expression? You use the "\G" anchor to start the next match on the same string where the last match left off. The regular expression engine cannot skip over any characters to find the next match with this anchor, so "\G" is similar to the beginning of string anchor, "^". The "\G" anchor is typically used with the "g" modifier. It uses the value of "pos()" as the position to start the next match. As the match operator makes successive matches, it updates "pos()" with the position of the next character past the last match (or the first character of the next match, depending on how you like to look at it). Each string has its own "pos()" value. Suppose you want to match all of consecutive pairs of digits in a string like "1122a44" and stop matching when you encounter non-digits. You want to match 11 and 22 but the letter "a" shows up between 22 and 44 and you want to stop at "a". 
Simply matching pairs of digits skips over the "a" and still matches 44. $_ = "1122a44"; my @pairs = m/(\d\d)/g; # qw( 11 22 44 ) If you use the "\G" anchor, you force the match after 22 to start with the "a". The regular expression cannot match there since it does not find a digit, so the next match fails and the match operator returns the pairs it already found. $_ = "1122a44"; my @pairs = m/\G(\d\d)/g; # qw( 11 22 ) You can also use the "\G" anchor in scalar context. You still need the "g" modifier. $_ = "1122a44"; while( m/\G(\d\d)/g ) { print "Found $1\n"; } After the match fails at the letter "a", perl resets "pos()" and the next match on the same string starts at the beginning. $_ = "1122a44"; while( m/\G(\d\d)/g ) { print "Found $1\n"; } print "Found $1 after while" if m/(\d\d)/g; # finds "11" You can disable "pos()" resets on fail with the "c" modifier, documented in perlop and perlreref. Subsequent matches start where the last successful match ended (the value of "pos()") even if a match on the same string has failed in the meantime. In this case, the match after the "while()" loop starts at the "a" (where the last match stopped), and since it does not use any anchor it can skip over the "a" to find 44. $_ = "1122a44"; while( m/\G(\d\d)/gc ) { print "Found $1\n"; } print "Found $1 after while" if m/(\d\d)/g; # finds "44" Typically you use the "\G" anchor with the "c" modifier when you want to try a different match if one fails, such as in a tokenizer. Jeffrey Friedl offers this example which works in 5.004 or later. while (<>) { chomp; PARSER: { m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; }; m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; }; m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; }; m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; }; } } For each line, the "PARSER" loop first tries to match a series of digits followed by a word boundary. 
This match has to start at the place the last match left off (or the beginning of the string on the first match). Since "m/ \G( \d+\b )/gcx" uses the "c" modifier, if the string does not match that regular expression, perl does not reset pos() and the next match starts at the same position to try a different pattern. Are Perl regexes DFAs or NFAs? Are they POSIX compliant? While it's true that Perl's regular expressions resemble the DFAs (deterministic finite automata) of the egrep(1) program, they are in fact implemented as NFAs (non-deterministic finite automata) to allow backtracking and backreferencing. And they aren't POSIX-style either, because those guarantee worst-case behavior for all cases. (It seems that some people prefer guarantees of consistency, even when what's guaranteed is slowness.) See the book "Mastering Regular Expressions" (from O'Reilly) by Jeffrey Friedl for all the details you could ever hope to know on these matters (a full citation appears in perlfaq2). What's wrong with using grep in a void context? The problem is that grep builds a return list, regardless of the context. This means you're making Perl go to the trouble of building a list that you then just throw away. If the list is large, you waste both time and space. If your intent is to iterate over the list, then use a for loop for this purpose. In perls older than 5.8.1, map suffers from this problem as well. But since 5.8.1, this has been fixed, and map is context aware - in void context, no lists are constructed. How do I match a regular expression that's in a variable? (contributed by brian d foy) We don't have to hard-code patterns into the match operator (or anything else that works with regular expressions). We can put the pattern in a variable for later use. The match operator is a double quote context, so you can interpolate your variable just like a double quoted string. In this case, you read the regular expression as user input and store it in $regex. 
Once you have the pattern in $regex, you use that variable in the match operator.

    chomp( my $regex = <STDIN> );

    if( $string =~ m/$regex/ ) { ... }

Any regular expression special characters in $regex are still special, and the pattern still has to be valid or Perl will complain. For instance, in this pattern there is an unpaired parenthesis.

    my $regex = "Unmatched ( paren";

    "Two parens to bind them all" =~ m/$regex/;

When Perl compiles the regular expression, it treats the parenthesis as the start of a memory match. When it doesn't find the closing parenthesis, it complains:

    Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE paren/ at script line 3.

You can get around this in several ways depending on your situation. First, if you don't want any of the characters in the string to be special, you can escape them with "quotemeta" before you use the string.

    chomp( my $regex = <STDIN> );
    $regex = quotemeta( $regex );

    if( $string =~ m/$regex/ ) { ... }

You can also do this directly in the match operator using the "\Q" and "\E" sequences. The "\Q" tells Perl where to start escaping special characters, and the "\E" tells it where to stop (see perlop for more details).

    chomp( my $regex = <STDIN> );

    if( $string =~ m/\Q$regex\E/ ) { ... }

Alternately, you can use "qr//", the regular expression quote operator (see perlop for more details). It quotes and perhaps compiles the pattern, and you can apply regular expression flags to the pattern.

    chomp( my $input = <STDIN> );

    my $regex = qr/$input/is;

    $string =~ m/$regex/ # same as m/$input/is;

You might also want to trap any errors by wrapping an "eval" block around the whole thing.

    chomp( my $input = <STDIN> );

    eval {
        if( $string =~ m/\Q$input\E/ ) { ... }
    };
    warn $@ if $@;

Or...

    my $regex = eval { qr/$input/is };
    if( defined $regex ) {
        $string =~ m/$regex/;
    }
    else {
        warn $@;
    }

Found in /usr/share/perl/5.34/pod/perlfaq7.pod

Can I get a BNF/yacc/RE for the Perl language?
There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if you're particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into toke.c as well. In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors." What are all these $@%&* punctuation signs, and how do I know when to use them? They are type specifiers, as detailed in perldata: $ for scalar values (number, string or reference) @ for arrays % for hashes (associative arrays) & for subroutines (aka functions, procedures, methods) * for all types of that symbol name. In version 4 you used them like pointers, but in modern perls you can just use references. There are a couple of other symbols that you're likely to encounter that aren't really type specifiers: <> are used for inputting a record from a filehandle. \ takes a reference to something. Note that <FILE> is *neither* the type specifier for files nor the name of the handle. It is the "<>" operator applied to the handle FILE. It reads one line (well, record--see "$/" in perlvar) from the handle FILE in scalar context, or *all* lines in list context. When performing open, close, or any other operation besides "<>" on files, or even when talking about the handle, do *not* use the brackets. These are correct: "eof(FH)", "seek(FH, 0, 2)" and "copying from STDIN to FILE". How do I skip some return values? One way is to treat the return values as a list and index into it: $dir = (getpwnam($user))[7]; Another way is to use undef as an element on the left-hand-side: ($dev, $ino, undef, undef, $uid, $gid) = stat($file); You can also use a list slice to select only the elements that you need: ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5]; Why do Perl operators have different precedence than C operators? Actually, they don't. 
All C operators that Perl copies have the same precedence in Perl as they do in C. The problem is with operators that C doesn't have, especially functions that give a list context to everything on their right, eg. print, chmod, exec, and so on. Such functions are called "list operators" and appear as such in the precedence table in perlop. A common mistake is to write: unlink $file || die "snafu"; This gets interpreted as: unlink ($file || die "snafu"); To avoid this problem, either put in extra parentheses or use the super low precedence "or" operator: (unlink $file) || die "snafu"; unlink $file or die "snafu"; The "English" operators ("and", "or", "xor", and "not") deliberately have precedence lower than that of list operators for just such situations as the one above. Another operator with surprising precedence is exponentiation. It binds more tightly even than unary minus, making "-2**2" produce a negative four and not a positive one. It is also right-associating, meaning that "2**3**2" is two raised to the ninth power, not eight squared. Although it has the same precedence as in C, Perl's "?:" operator produces an lvalue. This assigns $x to either $if_true or $if_false, depending on the trueness of $maybe: ($maybe ? $if_true : $if_false) = $x; How do I declare/create a structure? In general, you don't "declare" a structure. Just use a (probably anonymous) hash reference. See perlref and perldsc for details. Here's an example: $person = {}; # new anonymous hash $person->{AGE} = 24; # set field AGE to 24 $person->{NAME} = "Nat"; # set field NAME to "Nat" If you're looking for something a bit more rigorous, try perlootut. How do I create a module? perlnewmod is a good place to start, ignore the bits about uploading to CPAN if you don't want to make your module publicly available. ExtUtils::ModuleMaker and Module::Starter are also good places to start. Many CPAN authors now use Dist::Zilla to automate as much as possible. 
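As a rough illustration of what such a starting point might contain, here is a minimal hand-written module. The package name My::Greeter and the greet() function are hypothetical, invented for this sketch; it is not the output of any of the tools named above.

```perl
# lib/My/Greeter.pm -- hypothetical module name, for illustration only
package My::Greeter;

use strict;
use warnings;
use parent 'Exporter';           # inherit Exporter's import() method

our $VERSION   = '0.01';
our @EXPORT_OK = qw(greet);      # exported only when the caller asks for it

# greet($name) returns a greeting string; "world" is the default name.
sub greet {
    my ($name) = @_;
    $name = 'world' unless defined $name;
    return "Hello, $name!";
}

1;    # the file returns a true value so that "use"/"require" succeeds
```

Assuming the file is saved as lib/My/Greeter.pm and the lib directory is in @INC, a script could then say "use My::Greeter qw(greet);" and call "greet('Nat')".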
Detailed documentation about modules can be found at: perlmod, perlmodlib, perlmodstyle. If you need to include C code or C library interfaces use h2xs. h2xs will create the module distribution structure and the initial interface files. perlxs and perlxstut explain the details. How do I adopt or take over a module already on CPAN? Ask the current maintainer to make you a co-maintainer or transfer the module to you. If you can not reach the author for some reason contact the PAUSE admins at modules AT perl.org who may be able to help, but each case is treated separately. * Get a login for the Perl Authors Upload Server (PAUSE) if you don't already have one: <http://pause.perl.org> * Write to modules AT perl.org explaining what you did to contact the current maintainer. The PAUSE admins will also try to reach the maintainer. * Post a public message in a heavily trafficked site announcing your intention to take over the module. * Wait a bit. The PAUSE admins don't want to act too quickly in case the current maintainer is on holiday. If there's no response to private communication or the public post, a PAUSE admin can transfer it to you. How do I create a class? (contributed by brian d foy) In Perl, a class is just a package, and methods are just subroutines. Perl doesn't get more formal than that and lets you set up the package just the way that you like it (that is, it doesn't set up anything for you). See also perlootut, a tutorial that covers class creation, and perlobj. What's a closure? Closures are documented in perlref. *Closure* is a computer science term with a precise but hard-to-explain meaning. Usually, closures are implemented in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These lexicals magically refer to the variables that were around when the subroutine was defined (deep binding). 
Closures are most often used in programming languages where you can have the return value of a function be itself a function, as you can in Perl. Note that some languages provide anonymous functions but are not capable of providing proper closures: the Python language, for example. For more information on closures, check out any textbook on functional programming. Scheme is a language that not only supports but encourages closures. Here's a classic non-closure function-generating function: sub add_function_generator { return sub { shift() + shift() }; } my $add_sub = add_function_generator(); my $sum = $add_sub->(4,5); # $sum is 9 now. The anonymous subroutine returned by add_function_generator() isn't technically a closure because it refers to no lexicals outside its own scope. Using a closure gives you a *function template* with some customization slots left out to be filled later. Contrast this with the following make_adder() function, in which the returned anonymous function contains a reference to a lexical variable outside the scope of that function itself. Such a reference requires that Perl return a proper closure, thus locking in for all time the value that the lexical had when the function was created. sub make_adder { my $addpiece = shift; return sub { shift() + $addpiece }; } my $f1 = make_adder(20); my $f2 = make_adder(555); Now "$f1->($n)" is always 20 plus whatever $n you pass in, whereas "$f2->($n)" is always 555 plus whatever $n you pass in. The $addpiece in the closure sticks around. Closures are often used for less esoteric purposes. For example, when you want to pass in a bit of code into a function: my $line; timeout( 30, sub { $line = <STDIN> } ); If the code to execute had been passed in as a string, '$line = <STDIN>', there would have been no way for the hypothetical timeout() function to access the lexical variable $line back in its caller's scope. Another use for a closure is to make a variable *private* to a named subroutine, e.g. 
a counter that gets initialized at creation time of the sub and can only be modified from within the sub. This is sometimes used with a BEGIN block in package files to make sure a variable doesn't get meddled with during the lifetime of the package:

    BEGIN {
        my $id = 0;
        sub next_id { ++$id }
    }

This is discussed in more detail in perlsub; see the entry on *Persistent Private Variables*.

What is variable suicide and how can I prevent it?

This problem was fixed in perl 5.004_05, so preventing it means upgrading your version of perl. ;)

Variable suicide is when you (temporarily or permanently) lose the value of a variable. It is caused by scoping through my() and local() interacting with either closures or aliased foreach() iterator variables and subroutine arguments. It used to be easy to inadvertently lose a variable's value this way, but now it's much harder. Take this code:

    my $f = 'foo';
    sub T {
        while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
    }

    T;
    print "Finally $f\n";

If you are experiencing variable suicide, that "my $f" in the subroutine doesn't pick up a fresh copy of the $f whose value is 'foo'. The output shows that inside the subroutine the value of $f leaks through when it shouldn't, as in this output:

    foobar
    foobarbar
    foobarbarbar
    Finally foo

The $f that has "bar" added to it three times should be a new $f; "my $f" should create a new lexical variable each time through the loop. The expected output is:

    foobar
    foobar
    foobar
    Finally foo

How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?

You need to pass references to these objects. See "Pass by Reference" in perlsub for this particular question, and perlref for information on references.

Passing Variables and Functions

Regular variables and functions are quite easy to pass: just pass in a reference to an existing or anonymous variable or function:

    func( \$some_scalar );
    func( \@some_array );
    func( [ 1 ..
    10 ] );
    func( \%some_hash );
    func( { this => 10, that => 20 } );
    func( \&some_func );
    func( sub { $_[0] ** $_[1] } );

Passing Filehandles

As of Perl 5.6, you can represent filehandles with scalar variables which you treat as any other scalar.

    open my $fh, $filename or die "Cannot open $filename! $!";
    func( $fh );

    sub func {
        my $passed_fh = shift;

        my $line = <$passed_fh>;
    }

Before Perl 5.6, you had to use the *FH or "\*FH" notations. These are "typeglobs"--see "Typeglobs and Filehandles" in perldata and especially "Pass by Reference" in perlsub for more information.

Passing Regexes

Here's an example of how to pass in a string and a regular expression for it to match against. You construct the pattern with the "qr//" operator:

    sub compare {
        my ($val1, $regex) = @_;
        my $retval = $val1 =~ /$regex/;
        return $retval;
    }
    $match = compare("old McDonald", qr/d.*D/i);

Passing Methods

To pass an object method into a subroutine, you can do this:

    call_a_lot(10, $some_obj, "methname");

    sub call_a_lot {
        my ($count, $widget, $trick) = @_;
        for (my $i = 0; $i < $count; $i++) {
            $widget->$trick();
        }
    }

Or, you can use a closure to bundle up the object, its method call, and arguments:

    my $whatnot = sub { $some_obj->obfuscate(@args) };
    func($whatnot);

    sub func {
        my $code = shift;
        &$code();
    }

You could also investigate the can() method in the UNIVERSAL class (part of the standard perl distribution).

How do I create a static variable?

(contributed by brian d foy)

In Perl 5.10 and later, declare the variable with "state" (the keyword has to be enabled first, for example with "use feature 'state'" or "use v5.10"). The "state" declaration creates the lexical variable that persists between calls to the subroutine:

    sub counter { state $count = 1; $count++ }

You can fake a static variable by using a lexical variable which goes out of scope. In this example, you define the subroutine "counter", and it uses the lexical variable $count. Since you wrap this in a BEGIN block, $count is defined at compile-time, but also goes out of scope at the end of the BEGIN block.
The BEGIN block also ensures that the subroutine and the value it uses are defined at compile-time so the subroutine is ready to use just like any other subroutine, and you can put this code in the same place as other subroutines in the program text (i.e. at the end of the code, typically). The subroutine "counter" still has a reference to the data, and is the only way you can access the value (and each time you do, you increment the value). The data in the chunk of memory defined by $count is private to "counter".

    BEGIN {
        my $count = 1;
        sub counter { $count++ }
    }

    my $start = counter();

    .... # code that calls counter();

    my $end = counter();

In the previous example, you created a function-private variable because only one function remembered its reference. You could define multiple functions while the variable is in scope, and each function can share the "private" variable. It's not really "static" because you can access it outside the function while the lexical variable is in scope, and even create references to it. In this example, "increment_count" and "return_count" share the variable. One function adds to the value and the other simply returns the value. They can both access $count, and since it has gone out of scope, there is no other way to access it.

    BEGIN {
        my $count = 1;
        sub increment_count { $count++ }
        sub return_count    { $count }
    }

To declare a file-private variable, you still use a lexical variable. A file is also a scope, so a lexical variable defined in the file cannot be seen from any other file.

See "Persistent Private Variables" in perlsub for more information. The discussion of closures in perlref may help you even though we did not use anonymous subroutines in this answer.

What's the difference between dynamic and lexical (static) scoping? Between local() and my()?
"local($x)" saves away the old value of the global variable $x and assigns a new value for the duration of the subroutine *which is visible in other functions called from that subroutine*. This is done at run-time, so is called dynamic scoping. local() always affects global variables, also called package variables or dynamic variables. "my($x)" creates a new variable that is only visible in the current subroutine. This is done at compile-time, so it is called lexical or static scoping. my() always affects private variables, also called lexical variables or (improperly) static(ly scoped) variables. For instance: sub visible { print "var has value $var\n"; } sub dynamic { local $var = 'local'; # new temporary value for the still-global visible(); # variable called $var } sub lexical { my $var = 'private'; # new private variable, $var visible(); # (invisible outside of sub scope) } $var = 'global'; visible(); # prints global dynamic(); # prints local lexical(); # prints global Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() function, and it is hidden from the called subroutine. In summary, local() doesn't make what you think of as private, local variables. It gives a global variable a temporary value. my() is what you're looking for if you want private variables. See "Private Variables via my()" in perlsub and "Temporary Values via local()" in perlsub for excruciating details. What's the difference between deep and shallow binding? In deep binding, lexical variables mentioned in anonymous subroutines are the same ones that were in scope when the subroutine was created. In shallow binding, they are whichever variables with the same names happen to be in scope when the subroutine is called. Perl always uses deep binding of lexical variables (i.e., those created with my()). However, dynamic variables (aka global, local, or package variables) are effectively shallowly bound. 
Consider this just one more reason not to use them. See the answer to "What's a closure?".

How do I redefine a builtin function, operator, or method?

Why do you want to do that? :-)

If you want to override a predefined function, such as open(), then you'll have to import the new definition from a different module. See "Overriding Built-in Functions" in perlsub.

If you want to overload a Perl operator, such as "+" or "**", then you'll want to use the "use overload" pragma, documented in overload.

If you're talking about obscuring method calls in parent classes, see "Overriding methods and method resolution" in perlootut.

What's the difference between calling a function as &foo and foo()?

(contributed by brian d foy)

Calling a subroutine as &foo with no trailing parentheses ignores the prototype of "foo" and passes it the current value of the argument list, @_. Here's an example; the "bar" subroutine calls &foo, which prints its arguments list:

    sub foo { print "Args in foo are: @_\n"; }

    sub bar { &foo; }

    bar( "a", "b", "c" );

When you call "bar" with arguments, you see that "foo" got the same @_:

    Args in foo are: a b c

Calling the subroutine with trailing parentheses, with or without arguments, does not use the current @_. Changing the example to put parentheses after the call to "foo" changes the program:

    sub foo { print "Args in foo are: @_\n"; }

    sub bar { &foo(); }

    bar( "a", "b", "c" );

Now the output shows that "foo" doesn't get the @_ from its caller.

    Args in foo are:

However, using "&" in the call still overrides the prototype of "foo" if present:

    sub foo ($$$) { print "Args in foo are: @_\n"; }

    sub bar_1 { &foo; }
    sub bar_2 { &foo(); }
    sub bar_3 { foo( $_[0], $_[1], $_[2] ); }
    # sub bar_4 { foo(); }
    # bar_4 doesn't compile: "Not enough arguments for main::foo at ..."
    bar_1( "a", "b", "c" );
    # Args in foo are: a b c

    bar_2( "a", "b", "c" );
    # Args in foo are:

    bar_3( "a", "b", "c" );
    # Args in foo are: a b c

The main use of the @_ pass-through feature is to write subroutines whose main job it is to call other subroutines for you. For further details, see perlsub.

How do I create a switch or case statement?

There is a given/when statement in Perl, but it is experimental and likely to change in future. See perlsyn for more details.

The general answer is to use a CPAN module such as Switch::Plain:

    use Switch::Plain;
    sswitch($variable_holding_a_string) {
        case 'first': { }
        case 'second': { }
        default: { }
    }

or for more complicated comparisons, "if-elsif-else":

    for ($variable_to_test) {
        if    (/pat1/)  { }     # do something
        elsif (/pat2/)  { }     # do something else
        elsif (/pat3/)  { }     # do something else
        else            { }     # default
    }

Here's a simple example of a switch based on pattern matching, lined up in a way to make it look more like a switch statement. We'll do a multiway conditional based on the type of reference stored in $whatchamacallit:

    SWITCH: for (ref $whatchamacallit) {

        /^$/        && die "not a reference";

        /SCALAR/    && do {
                        print_scalar($$whatchamacallit);
                        last SWITCH;
                       };

        /ARRAY/     && do {
                        print_array(@$whatchamacallit);
                        last SWITCH;
                       };

        /HASH/      && do {
                        print_hash(%$whatchamacallit);
                        last SWITCH;
                       };

        /CODE/      && do {
                        warn "can't print function ref";
                        last SWITCH;
                       };

        # DEFAULT

        warn "User defined type skipped";

    }

See perlsyn for other examples in this style.

Sometimes you should change the positions of the constant and the variable. For example, let's say you wanted to test which of many answers you were given, but in a case-insensitive way that also allows abbreviations.
You can use the following technique if the strings all start with different characters or if you want to arrange the matches so that one takes precedence over another, as "SEND" has precedence over "STOP" here:

    chomp($answer = <>);
    if    ("SEND"  =~ /^\Q$answer/i) { print "Action is send\n"  }
    elsif ("STOP"  =~ /^\Q$answer/i) { print "Action is stop\n"  }
    elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
    elsif ("LIST"  =~ /^\Q$answer/i) { print "Action is list\n"  }
    elsif ("EDIT"  =~ /^\Q$answer/i) { print "Action is edit\n"  }

A totally different approach is to create a hash of function references.

    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }

Starting from Perl 5.8, a source filter module, "Switch", can also be used to get switch and case. Its use is now discouraged, because it's not fully compatible with the native switch of Perl 5.10, and because, as it's implemented as a source filter, it doesn't always work as intended when complex syntax is involved.

How can I find out my current or calling package?

(contributed by brian d foy)

To find the package you are currently in, use the special literal "__PACKAGE__", as documented in perldata. You can only use the special literals as separate tokens, so you can't interpolate them into strings like you can with variables:

    my $current_package = __PACKAGE__;
    print "I am in package $current_package\n";

If you want to find the package calling your code, perhaps to give better diagnostics as Carp does, use the "caller" built-in:

    sub foo {
        my @args = ...;
        my( $package, $filename, $line ) = caller;

        print "I was called from package $package\n";
    }

By default, your program starts in package "main", so you will always be in some package.
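To see both mechanisms side by side, here is a small self-contained sketch. The package name Logger and the subroutine where() are hypothetical, made up for this illustration; only "__PACKAGE__" and "caller" come from Perl itself.

```perl
use strict;
use warnings;

package Logger;    # hypothetical package, for illustration only

# Returns a string naming this package and the package that called us.
sub where {
    my ($caller_package, $filename, $line) = caller;
    return __PACKAGE__ . " called from " . $caller_package;
}

package main;

print Logger::where(), "\n";    # prints "Logger called from main"
```

Note that "__PACKAGE__" is joined with the concatenation operator rather than placed inside the quotes, because it is a separate token and does not interpolate.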
This is different from finding out the package an object is blessed into, which might not be the current package. For that, use "blessed" from Scalar::Util, part of the Standard Library since Perl 5.8: use Scalar::Util qw(blessed); my $object_package = blessed( $object ); Most of the time, you shouldn't care what package an object is blessed into, however, as long as it claims to inherit from that class: my $is_right_class = eval { $object->isa( $package ) }; # true or false And, with Perl 5.10 and later, you don't have to check for an inheritance to see if the object can handle a role. For that, you can use "DOES", which comes from "UNIVERSAL": my $class_does_it = eval { $object->DOES( $role ) }; # true or false You can safely replace "isa" with "DOES" (although the converse is not true). What does "bad interpreter" mean? (contributed by brian d foy) The "bad interpreter" message comes from the shell, not perl. The actual message may vary depending on your platform, shell, and locale settings. If you see "bad interpreter - no such file or directory", the first line in your perl script (the "shebang" line) does not contain the right path to perl (or any other program capable of running scripts). Sometimes this happens when you move the script from one machine to another and each machine has a different path to perl--/usr/bin/perl versus /usr/local/bin/perl for instance. It may also indicate that the source machine has CRLF line terminators and the destination machine has LF only: the shell tries to find /usr/bin/perl<CR>, but can't. If you see "bad interpreter: Permission denied", you need to make your script executable. In either case, you should still be able to run the scripts with perl explicitly: % perl script.pl If you get a message like "perl: command not found", perl is not in your PATH, which might also mean that the location of perl is not where you expect it so you need to adjust your shebang line. 
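The two shebang failure modes described above (a wrong interpreter path and a stray carriage return) can be checked mechanically. The following is only a sketch: shebang_problems() is a hypothetical helper written for this answer, not part of perl or of any module, and it assumes a Unix-like system where the interpreter path can be tested with "-e".

```perl
use strict;
use warnings;

# Hypothetical helper: given the first line of a script, return a list of
# problems found with its shebang line (an empty list means none were found).
sub shebang_problems {
    my ($first_line) = @_;
    my @problems;

    if ($first_line !~ /^#!/) {
        push @problems, "no shebang line";
        return @problems;
    }

    # A trailing carriage return means the file has CRLF line terminators.
    push @problems, "CRLF line ending" if $first_line =~ /\r$/;

    # The interpreter path is everything after "#!" up to the first whitespace.
    my ($path) = $first_line =~ /^#!\s*(\S+)/;
    push @problems, "interpreter not found"
        if !defined $path || !-e $path;

    return @problems;
}

# Example use: check the script named on the command line, if any.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    binmode $fh;                       # keep any \r so it can be detected
    my $first = <$fh>;
    print "$_\n" for shebang_problems(defined $first ? $first : '');
}
```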
Do I need to recompile XS modules when there is a change in the C library?

(contributed by Alex Beamish)

If the new version of the C library is ABI-compatible (that's Application Binary Interface compatible) with the version you're upgrading from, and if the shared library version didn't change, no re-compilation should be necessary.

Found in /usr/share/perl/5.34/pod/perlfaq8.pod

How come exec() doesn't return?

(contributed by brian d foy)

The "exec" function's job is to turn your process into another command and never to return. If that's not what you want to do, don't use "exec". :)

If you want to run an external command and still keep your Perl process going, look at a piped "open", "fork", or "system".

How do I do fancy stuff with the keyboard/screen/mouse?

How you access/control keyboards, screens, and pointing devices ("mice") is system-dependent. Try the following modules:

Keyboard

    Term::Cap               Standard perl distribution
    Term::ReadKey           CPAN
    Term::ReadLine::Gnu     CPAN
    Term::ReadLine::Perl    CPAN
    Term::Screen            CPAN

Screen

    Term::Cap               Standard perl distribution
    Curses                  CPAN
    Term::ANSIColor         CPAN

Mouse

    Tk                      CPAN
    Wx                      CPAN
    Gtk2                    CPAN
    Qt4                     kdebindings4 package

Some of these specific cases are shown as examples in other answers in this section of the perlfaq.

How do I read just one key without waiting for a return key?

Controlling input buffering is a remarkably system-dependent matter. On many systems, you can just use the stty command as shown in "getc" in perlfunc, but as you see, that's already getting you into portability snags.

    open(TTY, "+</dev/tty") or die "no tty: $!";
    system "stty  cbreak </dev/tty >/dev/tty 2>&1";
    $key = getc(TTY);        # perhaps this works
    # OR ELSE
    sysread(TTY, $key, 1);    # probably this does
    system "stty -cbreak </dev/tty >/dev/tty 2>&1";

The Term::ReadKey module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to stty for each key. It even includes limited support for Windows.
use Term::ReadKey; ReadMode('cbreak'); $key = ReadKey(0); ReadMode('normal'); However, using the code requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution using the standard POSIX module, which is already on your system (assuming your system supports POSIX). use HotKey; $key = readkey(); And here's the "HotKey" module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures. # HotKey.pm package HotKey; use strict; use warnings; use parent 'Exporter'; our @EXPORT = qw(cbreak cooked readkey); use POSIX qw(:termios_h); my ($term, $oterm, $echo, $noecho, $fd_stdin); $fd_stdin = fileno(STDIN); $term = POSIX::Termios->new(); $term->getattr($fd_stdin); $oterm = $term->getlflag(); $echo = ECHO | ECHOK | ICANON; $noecho = $oterm & ~$echo; sub cbreak { $term->setlflag($noecho); # ok, so i don't want echo either $term->setcc(VTIME, 1); $term->setattr($fd_stdin, TCSANOW); } sub cooked { $term->setlflag($oterm); $term->setcc(VTIME, 0); $term->setattr($fd_stdin, TCSANOW); } sub readkey { my $key = ''; cbreak(); sysread(STDIN, $key, 1); cooked(); return $key; } END { cooked() } 1; How do I check whether input is ready on the keyboard? The easiest way to do this is to read a key in nonblocking mode with the Term::ReadKey module from CPAN, passing it an argument of -1 to indicate not to block: use Term::ReadKey; ReadMode('cbreak'); if (defined (my $char = ReadKey(-1)) ) { # input was waiting and it was $char } else { # no input was waiting } ReadMode('normal'); # restore normal tty settings How do I clear the screen? (contributed by brian d foy) To clear the screen, you just have to print the special sequence that tells the terminal to clear the screen. Once you have that sequence, output it when you want to clear the screen. You can use the Term::ANSIScreen module to get the special sequence. 
Import the "cls" function (or the ":screen" tag):

    use Term::ANSIScreen qw(cls);
    my $clear_screen = cls();

    print $clear_screen;

The Term::Cap module can also get the special sequence if you want to deal with the low-level details of terminal control. The "Tputs" method returns the string for the given capability:

    use Term::Cap;

    my $terminal = Term::Cap->Tgetent( { OSPEED => 9600 } );
    my $clear_screen = $terminal->Tputs('cl');

    print $clear_screen;

On Windows, you can use the Win32::Console module. After creating an object for the output filehandle you want to affect, call the "Cls" method:

    use Win32::Console;

    my $OUT = Win32::Console->new(STD_OUTPUT_HANDLE);
    my $clear_string = $OUT->Cls;
    print $clear_string;

If you have a command-line program that does the job, you can call it in backticks to capture whatever it outputs so you can use it later:

    my $clear_string = `clear`;

    print $clear_string;

How do I get the screen size?

If you have the Term::ReadKey module installed from CPAN, you can use it to fetch the width and height in characters and in pixels:

    use Term::ReadKey;
    my ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize();

This is more portable than the raw "ioctl", but not as illustrative:

    require './sys/ioctl.ph';
    die "no TIOCGWINSZ " unless defined &TIOCGWINSZ;
    open(my $tty_fh, "+</dev/tty") or die "No tty: $!";
    unless (ioctl($tty_fh, &TIOCGWINSZ, $winsize='')) {
        die sprintf "$0: ioctl TIOCGWINSZ (%08x: $!)\n", &TIOCGWINSZ;
    }
    my ($row, $col, $xpixel, $ypixel) = unpack('S4', $winsize);
    print "(row,col) = ($row,$col)";
    print "  (xpixel,ypixel) = ($xpixel,$ypixel)" if $xpixel || $ypixel;
    print "\n";

How do I read and write the serial port?

This depends on which operating system your program is running on. In the case of Unix, the serial ports will be accessible through files in "/dev"; on other systems, device names will doubtless differ.
Several problem areas common to all device interaction are the following: lockfiles Your system may use lockfiles to control multiple access. Make sure you follow the correct protocol. Unpredictable behavior can result from multiple processes reading from one device. open mode If you expect to use both read and write operations on the device, you'll have to open it for update (see "open" in perlfunc for details). You may wish to open it without running the risk of blocking by using "sysopen()" and "O_RDWR|O_NDELAY|O_NOCTTY" from the Fcntl module (part of the standard perl distribution). See "sysopen" in perlfunc for more on this approach. end of line Some devices will be expecting a "\r" at the end of each line rather than a "\n". In some ports of perl, "\r" and "\n" are different from their usual (Unix) ASCII values of "\015" and "\012". You may have to give the numeric values you want directly, using octal ("\015"), hex ("0x0D"), or as a control-character specification ("\cM"). print DEV "atv1\012"; # wrong, for some devices print DEV "atv1\015"; # right, for some devices Even though with normal text files a "\n" will do the trick, there is still no unified scheme for terminating a line that is portable between Unix, DOS/Win, and Macintosh, except to terminate *ALL* line ends with "\015\012", and strip what you don't need from the output. This applies especially to socket I/O and autoflushing, discussed next. flushing output If you expect characters to get to your device when you "print()" them, you'll want to autoflush that filehandle. You can use "select()" and the $| variable to control autoflushing (see "$|" in perlvar and "select" in perlfunc, or perlfaq5, "How do I flush/unbuffer an output filehandle? 
Why must I do this?"):

    my $old_handle = select($dev_fh);
    $| = 1;
    select($old_handle);

You'll also see code that does this without a temporary variable, as in

    select((select($dev_fh), $| = 1)[0]);

Or if you don't mind pulling in a few thousand lines of code just because you're afraid of a little $| variable:

    use IO::Handle;
    $dev_fh->autoflush(1);

As mentioned in the previous item, this still doesn't work when using socket I/O between Unix and Macintosh. You'll need to hard code your line terminators, in that case.

non-blocking input

If you are doing a blocking "read()" or "sysread()", you'll have to arrange for an alarm handler to provide a timeout (see "alarm" in perlfunc). If you have a non-blocking open, you'll likely have a non-blocking read, which means you may have to use a 4-arg "select()" to determine whether I/O is ready on that device (see "select" in perlfunc).

While trying to read from his caller-id box, the notorious Jamie Zawinski "<jwz AT netscape.com>", after much gnashing of teeth and fighting with "sysread", "sysopen", POSIX's "tcgetattr" business, and various other functions that go bump in the night, finally came up with this:

    sub open_modem {
        use IPC::Open2;
        my $stty = `/bin/stty -g`;
        open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1");
        # starting cu hoses /dev/tty's stty settings, even when it has
        # been opened on a pipe...
        system("/bin/stty $stty");
        $_ = <MODEM_IN>;
        chomp;
        if ( !m/^Connected/ ) {
            print STDERR "$0: cu printed `$_' instead of `Connected'\n";
        }
    }

How can I measure time under a second?

(contributed by brian d foy)

The Time::HiRes module (part of the standard distribution as of Perl 5.8) measures time with the "gettimeofday()" system call, which returns the time since the epoch as seconds and microseconds. If you can't install Time::HiRes for older Perls and you are on a Unixish system, you may be able to call gettimeofday(2) directly. See "syscall" in perlfunc.

Where do I get the include files to do ioctl() or syscall()?
Historically, these would be generated by the h2ph tool, part of the standard perl distribution. This program converts cpp(1) directives in C header files to files containing subroutine definitions, like "SYS_getitimer()", which you can use as arguments to your functions. It doesn't work perfectly, but it usually gets most of the job done. Simple files like errno.h, syscall.h, and socket.h were fine, but the hard ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files:

    1. Become the super-user
    2. cd /usr/include
    3. h2ph *.h */*.h

If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See perlxstut for how to get started with h2xs.

If your system doesn't support dynamic loading, you still probably ought to use h2xs. See perlxstut and ExtUtils::MakeMaker for more information (in brief, just use make perl instead of a plain make to rebuild perl with a new static extension).

How can I capture STDERR from an external command?
There are three basic ways of running external commands:

    system $cmd;                    # using system()
    my $output = `$cmd`;            # using backticks (``)
    open (my $pipe_fh, "$cmd |");   # using open()

With "system()", both STDOUT and STDERR will go to the same place as the script's STDOUT and STDERR, unless the "system()" command redirects them. Backticks and "open()" read only the STDOUT of your command.

You can also use the "open3()" function from IPC::Open3.
Benjamin Goldberg provides some sample code:

To capture a program's STDOUT, but discard its STDERR:

    use IPC::Open3;
    use File::Spec;
    my $in = '';
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3($in, \*PH, ">&NULL", "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, but discard its STDOUT:

    use IPC::Open3;
    use File::Spec;
    my $in = '';
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3($in, ">&NULL", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, and let its STDOUT go to our own STDERR:

    use IPC::Open3;
    my $in = '';
    my $pid = open3($in, ">&STDERR", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To read both a command's STDOUT and its STDERR separately, you can redirect them to temp files, let the command run, then read the temp files:

    use IPC::Open3;
    use IO::File;
    my $in = '';
    local *CATCHOUT = IO::File->new_tmpfile;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3($in, ">&CATCHOUT", ">&CATCHERR", "cmd");
    waitpid($pid, 0);
    seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR;
    while( <CATCHOUT> ) {}
    while( <CATCHERR> ) {}

But there's no real need for both to be tempfiles... the following should work just as well, without deadlocking:

    use IPC::Open3;
    my $in = '';
    use IO::File;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3($in, \*CATCHOUT, ">&CATCHERR", "cmd");
    while( <CATCHOUT> ) {}
    waitpid($pid, 0);
    seek CATCHERR, 0, 0;
    while( <CATCHERR> ) {}

And it'll be faster, too, since we can begin processing the program's stdout immediately, rather than waiting for the program to finish.
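If you would rather not use temp files at all, one further variation (a sketch only, not part of the sample code above) is to give "open3()" two separate pipes and multiplex them with the core IO::Select module, so that neither pipe can fill up and stall the child. It assumes a Unixish system where "select()" works on pipes; the "run_capture" helper and the child one-liner are made up for the illustration.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);
use IO::Select;
use Symbol qw(gensym);

# Hypothetical helper: run a command (list form, so no shell is involved)
# and return its STDOUT, its STDERR, and its exit status.
sub run_capture {
    my @cmd = @_;

    # open3() only creates a separate STDERR pipe if the third
    # argument is already a glob, hence the gensym.
    my ($in, $out);
    my $err = gensym;
    my $pid = open3($in, $out, $err, @cmd);
    close $in;    # this sketch sends nothing to the child's STDIN

    my %name = (fileno($out) => 'out', fileno($err) => 'err');
    my %buf  = (out => '', err => '');

    # Read whichever pipe has data until both report end-of-file.
    my $sel = IO::Select->new($out, $err);
    while ($sel->count) {
        for my $fh ($sel->can_read) {
            my $n = sysread($fh, my $chunk, 4096);
            if ($n) {
                $buf{ $name{ fileno($fh) } } .= $chunk;
            }
            else {    # EOF (or a read error): stop watching this pipe
                $sel->remove($fh);
                close $fh;
            }
        }
    }

    waitpid($pid, 0);
    return ($buf{out}, $buf{err}, $? >> 8);
}

# $^X is the running perl, used here as a stand-in for "cmd".
my ($stdout, $stderr, $exit) = run_capture(
    $^X, '-e', 'print "out\n"; print STDERR "err\n"; exit 3'
);
print "stdout: $stdout", "stderr: $stderr", "exit: $exit\n";
```

The trade-off is more code in exchange for seeing both streams as they arrive; if you only need the output after the command has finished, the tempfile versions are simpler.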
With any of these, you can change file descriptors before the call:

    open(STDOUT, ">logfile");
    system("ls");

or you can use Bourne shell file-descriptor redirection:

    $output = `$cmd 2>some_file`;
    open (PIPE, "cmd 2>some_file |");

You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT:

    $output = `$cmd 2>&1`;
    open (PIPE, "cmd 2>&1 |");

Note that you *cannot* simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:

    open(STDERR, ">&STDOUT");
    $alloutput = `cmd args`;  # stderr still escapes

This fails because the "open()" makes STDERR go to where STDOUT was going at the time of the "open()". The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).

Note that you *must* use Bourne shell (sh(1)) redirection syntax in backticks, not csh(1)! Details on why Perl's "system()" and backtick and pipe opens all use the Bourne shell are in the versus/csh.whynot article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz>.
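Reopening STDOUT before "system()", as described above, leaves your own script writing to the log file afterwards. If you want your original STDOUT back, you can dup it first and restore it when the child is done. The following is only a sketch of that idea: the temp file stands in for "logfile", and the child command is a made-up one-liner run with $^X (the current perl).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file standing in for "logfile" in the example above.
my ($log_fh, $logfile) = tempfile(UNLINK => 1);
close $log_fh;

# Keep a duplicate of the current STDOUT so it can be restored later.
open(my $saved_out, '>&', \*STDOUT) or die "Can't dup STDOUT: $!";

# Point STDOUT (file descriptor 1) at the log file; the child inherits it.
open(STDOUT, '>', $logfile) or die "Can't redirect STDOUT: $!";
my $status = system($^X, '-e', 'print "hello from child\n"');

# Put the original STDOUT back.
open(STDOUT, '>&', $saved_out) or die "Can't restore STDOUT: $!";
close $saved_out;

# Read back what the child wrote.
open(my $in, '<', $logfile) or die "Can't read $logfile: $!";
my $captured = do { local $/; <$in> };
close $in;

print "child exit status ", $status >> 8, ", output: $captured";
```

The same pattern works for STDERR; swap the handle names and keep the dup and restore steps in the same order.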
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;                       # either with backticks
    $pid = open(PH, "cmd 2>&1 |");              # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;                # either with backticks
    $pid = open(PH, "cmd 2>/dev/null |");       # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDERR but discard its STDOUT:

    $output = `cmd 2>&1 1>/dev/null`;           # either with backticks
    $pid = open(PH, "cmd 2>&1 1>/dev/null |");  # or with an open pipe
    while (<PH>) { }                            #    plus a read

To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;        # either with backticks
    $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe
    while (<PH>) { }                            #    plus a read

To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order.

    system("prog args 1>tmpfile 2>&1");
    system("prog args 2>&1 1>tmpfile");

The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output there, and the old standard error shows up on the old standard out.

Why doesn't open() return an error when a pipe open fails?
If the second argument to a piped "open()" contains shell metacharacters, perl "fork()"s, then "exec()"s a shell to decode the metacharacters and eventually run the desired program. If the program couldn't be run, it's the shell that gets the message, not Perl. All your Perl program can find out is whether the shell itself could be successfully started.
You can still capture the shell's STDERR and check it for error messages. See "How can I capture STDERR from an external command?" elsewhere in this document, or use the IPC::Open3 module.

If there are no shell metacharacters in the argument of "open()", Perl runs the command directly, without using the shell, and can correctly report whether the command started.

Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?
This happens only if your perl is compiled to use stdio instead of perlio (perlio is the default). Some (maybe all?) stdios set error and eof flags that you may need to clear. The POSIX module defines "clearerr()" that you can use. That is the technically correct way to do it. Here are some less reliable workarounds:

1   Try keeping around the seek pointer and going back there, like this:

        my $where = tell($log_fh);
        seek($log_fh, $where, 0);

2   If that doesn't work, try seeking to a different part of the file and then back.

3   If that doesn't work, try seeking to a different part of the file, reading something, and then seeking back.

4   If that doesn't work, give up on your stdio package and use sysread.

Is there a way to hide perl's command line from programs such as "ps"?
First of all note that if you're doing this for security reasons (to avoid people seeing passwords, for example) then you should rewrite your program so that critical information is never given as an argument. Hiding the arguments won't make your program completely secure.

To actually alter the visible command line, you can assign to the variable $0 as documented in perlvar. This won't work on all operating systems, though. Daemon programs like sendmail place their state there, as in:

    $0 = "orcus [accepting connections]";

I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible?
Unix
In the strictest sense, it can't be done--the script executes as a different process from the shell it was started from. Changes to a process are not reflected in its parent--only in any children created after the change. There is shell magic that may allow you to fake it by "eval()"ing the script's output in your shell; check out the comp.unix.questions FAQ for details.

How do I tell the difference between errors from the shell and perl?
(answer contributed by brian d foy)

When you run a Perl script, something else is running the script for you, and that something else may output error messages. The script might emit its own warnings and error messages. Most of the time you cannot tell who said what.

You probably cannot fix the thing that runs perl, but you can change how perl outputs its warnings by defining custom warning and die functions.

Consider this script, which has an error you may not notice immediately.

    #!/usr/locl/bin/perl

    print "Hello World\n";

I get an error when I run this from my shell (which happens to be bash). That may look like perl forgot it has a "print()" function, but my shebang line is not the path to perl, so the shell runs the script, and I get the error.

    $ ./test
    ./test: line 3: print: command not found

A quick and dirty fix involves a little bit of code, but this may be all you need to figure out the problem.

    #!/usr/bin/perl -w

    BEGIN {
        $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; };
        $SIG{__DIE__}  = sub{ print STDERR "Perl: ", @_; exit 1};
    }

    $a = 1 + undef;
    $x / 0;
    __END__

The perl message comes out with "Perl" in front. The "BEGIN" block works at compile time so all of the compilation errors and warnings get the "Perl:" prefix too.

    Perl: Useless use of division (/) in void context at ./test line 9.
    Perl: Name "main::a" used only once: possible typo at ./test line 8.
    Perl: Name "main::x" used only once: possible typo at ./test line 9.
    Perl: Use of uninitialized value in addition (+) at ./test line 8.
    Perl: Use of uninitialized value in division (/) at ./test line 9.
    Perl: Illegal division by zero at ./test line 9.
    Perl: Illegal division by zero at -e line 3.

If I don't see that "Perl:", it's not from perl.

You could also just know all the perl errors, and although there are some people who may know all of them, you probably don't. However, they all should be in the perldiag manpage. If you don't find the error in there, it probably isn't a perl error.

Looking up every message is not the easiest way, so let perl do it for you. Use the diagnostics pragma, which turns perl's normal messages into longer discussions on the topic.

    use diagnostics;

If you don't get a paragraph or two of expanded discussion, it might not be perl's message.

What's the difference between require and use?
(contributed by brian d foy)

Perl runs the "require" statement at run-time. Once Perl loads, compiles, and runs the file, it doesn't do anything else. The "use" statement is the same as a "require" run at compile-time, but Perl also calls the "import" method for the loaded package. These two are the same:

    use MODULE qw(import list);

    BEGIN {
        require MODULE;
        MODULE->import(import list);
    }

However, you can suppress the "import" by using an explicit, empty import list. Both of these still happen at compile-time:

    use MODULE ();

    BEGIN {
        require MODULE;
    }

Since "use" will also call the "import" method, the actual value for "MODULE" must be a bareword. That is, "use" cannot load files by name, although "require" can:

    require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching!

See the entry for "use" in perlfunc for more details.

How do I keep my own module/library directory?
When you build modules, tell Perl where to install the modules.

If you want to install modules for your own use, the easiest way might be local::lib, which you can download from CPAN. It sets various installation settings for you, and uses those same settings within your programs.
If you want more flexibility, you need to configure your CPAN client for your particular situation.

For "Makefile.PL"-based distributions, use the INSTALL_BASE option when generating Makefiles:

    perl Makefile.PL INSTALL_BASE=/mydir/perl

You can set this in your "CPAN.pm" configuration so modules automatically install in your private library directory when you use the CPAN.pm shell:

    % cpan
    cpan> o conf makepl_arg INSTALL_BASE=/mydir/perl
    cpan> o conf commit

For "Build.PL"-based distributions, use the --install_base option:

    perl Build.PL --install_base /mydir/perl

You can configure "CPAN.pm" to automatically use this option too:

    % cpan
    cpan> o conf mbuild_arg "--install_base /mydir/perl"
    cpan> o conf commit

INSTALL_BASE tells these tools to put your modules into /mydir/perl/lib/perl5. See "How do I add a directory to my include path (@INC) at runtime?" for details on how to run your newly installed modules.

There is one caveat with INSTALL_BASE, though, since it acts differently from the PREFIX and LIB settings that older versions of ExtUtils::MakeMaker advocated. INSTALL_BASE does not support installing modules for multiple versions of Perl or different architectures under the same directory. You should consider whether you really want that and, if you do, use the older PREFIX and LIB settings. See the ExtUtils::MakeMaker documentation for more details.

How do I add the directory my program lives in to the module/library search path?
(contributed by brian d foy)

If you know the directory already, you can add it to @INC as you would for any other directory. You might "use lib" if you know the directory at compile time:

    use lib $directory;

The trick in this task is to find the directory. Before your script does anything else (such as a "chdir"), you can get the current working directory with the "Cwd" module, which comes with Perl:

    BEGIN {
        use Cwd;
        our $directory = cwd;
    }

    use lib $directory;

You can do a similar thing with the value of $0, which holds the script name.
That might hold a relative path, but "rel2abs" can turn it into an absolute path. Once you have the absolute path, "dirname" extracts its directory:

    BEGIN {
        use File::Spec::Functions qw(rel2abs);
        use File::Basename qw(dirname);

        my $path = rel2abs( $0 );
        our $directory = dirname( $path );
    }

    use lib $directory;

The FindBin module, which comes with Perl, might work. It finds the directory of the currently running script and puts it in $Bin, which you can then use to construct the right library path:

    use FindBin qw($Bin);

You can also use local::lib to do much of the same thing. Install modules using local::lib's settings then use the module in your program:

    use local::lib; # sets up a local lib at ~/perl5

See the local::lib documentation for more details.

How do I add a directory to my include path (@INC) at runtime?
Here are the suggested ways of modifying your include path, including environment variables, run-time switches, and in-code statements:

the "PERLLIB" environment variable

    $ export PERLLIB=/path/to/my/dir
    $ perl program.pl

the "PERL5LIB" environment variable

    $ export PERL5LIB=/path/to/my/dir
    $ perl program.pl

the "perl -Idir" command line flag

    $ perl -I/path/to/my/dir program.pl

the "lib" pragma:

    use lib "$ENV{HOME}/myown_perllib";

the local::lib module:

    use local::lib;

    use local::lib "~/myown_perllib";

Where are modules installed?
Modules are installed on a case-by-case basis (as provided by the methods described in the previous section), and in the operating system. All of these paths are stored in @INC, which you can display with the one-liner

    perl -e 'print join("\n",@INC,"")'

The same information is displayed at the end of the output from the command

    perl -V

To find out where a module's source code is located, use

    perldoc -l Encode

to display the path to the module. In some cases (for example, the "AutoLoader" module), this command will show the path to a separate "pod" file; the module itself should be in the same directory, with a 'pm' file extension.

What is socket.ph and where do I get it?
It's a Perl 4 style file defining values for system networking constants. Sometimes it is built using h2ph when Perl is installed, but other times it is not. Modern programs should use "use Socket;" instead.

Found in /usr/share/perl/5.34/pod/perlfaq9.pod

How do I remove HTML from a string?
Use HTML::Strip, or HTML::FormatText which not only removes HTML but also attempts to do a little simple formatting of the resulting plain text.

How do I decode or create those %-encodings on the web?
Most of the time you should not need to do this, as your web framework or, if you are making a request, LWP or another module will handle it for you.

To encode a string yourself, use the URI::Escape module. The "uri_escape" function returns the escaped string:

    use URI::Escape;

    my $original = "Colon : Hash # Percent %";

    my $escaped = uri_escape( $original );

    print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25'

To decode the string, use the "uri_unescape" function:

    my $unescaped = uri_unescape( $escaped );

    print $unescaped; # back to original

Remember not to encode a full URI. Instead, escape each component separately and then join them together.

How do I redirect to another page?
Most Perl web frameworks have a mechanism for doing this. Using the Catalyst framework it would be:

    $c->res->redirect($url);
    $c->detach();

If you are using Plack (which most frameworks do), then Plack::Middleware::Rewrite is worth looking at if you are migrating from Apache or have URLs you want to always redirect.

How do I make sure users can't enter values into a form that causes my CGI script to do bad things?
(contributed by brian d foy)

You can't prevent people from sending your script bad data. Even if you add some client-side checks, people may disable them or bypass them completely. For instance, someone might use a module such as LWP to submit to your web site.
If you want to prevent data that try to use SQL injection or other sorts of attacks (and you should want to), you have to not trust any data that enter your program.

The perlsec documentation has general advice about data security. If you are using the DBI module, use placeholders to fill in data. If you are running external programs with "system" or "exec", use the list forms. There are many other precautions that you should take, too many to list here, and most of them fall under the category of not using any data that you don't intend to use. Trust no one.

How do I check a valid mail address?
(partly contributed by Aaron Sherman)

This isn't as simple a question as it sounds. There are two parts:

a) How do I verify that an email address is correctly formatted?

b) How do I verify that an email address targets a valid recipient?

Without sending mail to the address and seeing whether there's a human on the other end to answer you, you cannot fully answer part *b*, but the Email::Valid module will do both part *a* and part *b* as far as you can in real-time.

Our best advice for verifying a person's mail address is to have them enter their address twice, just as you normally do to change a password. This usually weeds out typos. If both versions match, send mail to that address with a personal message. If you get the message back and they've followed your directions, you can be reasonably assured that it's real.

A related strategy that's less open to forgery is to give them a PIN (personal ID number). Record the address and PIN (best that it be a random one) for later processing. In the mail you send, include a link to your site with the PIN included. If the mail bounces, you know it's not valid. If they don't click on the link, either they forged the address or (assuming they got the message) following through wasn't important so you don't need to worry about it.

How do I find the user's mail address?
Ask them for it.
There are so many email providers available that it's unlikely the local system has any idea how to determine a user's email address.

The exception is for organization-specific email (e.g. foo AT yourcompany.com) where policy can be codified in your program. In that case, you could look at $ENV{USER}, $ENV{LOGNAME}, and getpwuid($<) in scalar context, like so:

    my $user_name = getpwuid($<);

But you still cannot make assumptions about whether this is correct, unless your policy says it is. You really are best off asking the user.

How do I read email?
Use the Email::Folder module, like so:

    use Email::Folder;
    use Email::MIME;

    my $folder = Email::Folder->new('/path/to/email/folder');
    while(my $message = $folder->next_message) {
        # next_message returns Email::Simple objects, but we want
        # Email::MIME objects as they're more robust
        my $mime = Email::MIME->new($message->as_string);
    }

There are different classes in the Email::Folder namespace for supporting various mailbox types. Note that these modules are generally rather limited and only support reading rather than writing.

How do I find out my hostname, domainname, or IP address?
(contributed by brian d foy)

The Net::Domain module, which is part of the Standard Library starting in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host name, or the domain name.

    use Net::Domain qw(hostname hostfqdn hostdomain);

    my $host = hostfqdn();

The Sys::Hostname module, part of the Standard Library, can also get the hostname:

    use Sys::Hostname;

    $host = hostname();

The Sys::Hostname::Long module takes a different approach and tries harder to return the fully qualified hostname:

    use Sys::Hostname::Long 'hostname_long';

    my $hostname = hostname_long();

To get the IP address, you can use the "gethostbyname" built-in function to turn the name into a number. To turn that number into the dotted octet form (a.b.c.d) that most people expect, use the "inet_ntoa" function from the Socket module, which also comes with perl.
    use Socket;

    my $address = inet_ntoa(
        scalar gethostbyname( $host || 'localhost' )
    );
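One thing the short form above glosses over is that "gethostbyname" returns undef when the name does not resolve, and "inet_ntoa" dies if handed anything but a 4-byte packed address. A slightly more defensive sketch, using only the core Sys::Hostname and Socket modules, might look like this. The "ipv4_for" helper is a name made up for the example, and the "//" operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use Socket qw(inet_ntoa);
use Sys::Hostname qw(hostname);

# Hypothetical helper: return the dotted-quad IPv4 address for a host
# name, or undef if the name does not resolve.
sub ipv4_for {
    my ($name) = @_;

    # Assigning to a scalar gives scalar context: the result is the
    # packed 4-byte address, or undef on failure.
    my $packed = gethostbyname($name);
    return defined $packed ? inet_ntoa($packed) : undef;
}

my $host    = hostname();
my $address = ipv4_for($host) // ipv4_for('localhost');

print defined $address
    ? "$host resolves to $address\n"
    : "could not resolve $host\n";
```

Note that "gethostbyname" only deals in IPv4 addresses, so this says nothing about any IPv6 addresses the host may have.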