Found in /usr/share/perl/5.34/pod/perlfaq1.pod

Who supports Perl? Who develops it? Why is it free?

The original culture of the pre-populist Internet and the deeply-held beliefs of Perl's author, Larry Wall, gave rise to the free and open distribution policy of Perl. Perl is supported by its users. The core, the standard Perl library, the optional modules, and the documentation you're reading now were all written by volunteers.

The core development team (known as the Perl Porters) are a group of highly altruistic individuals committed to producing better software for free than you could hope to purchase for money. You may snoop on pending developments via the archives <http://www.nntp.perl.org/group/perl.perl5.porters/> or you can subscribe to the mailing list by sending perl5-porters-subscribe AT perl.org a subscription request (an empty message with no subject is fine).

While the GNU project includes Perl in its distributions, there's no such thing as "GNU Perl". Perl is neither produced nor maintained by the Free Software Foundation. Perl's licensing terms are also more open than GNU software's tend to be.

You can get commercial support of Perl if you wish, although for most users the informal support will more than suffice. See the answer to "Where can I buy a commercial version of Perl?" for more information.

Which version of Perl should I use?

(contributed by brian d foy with updates from others)

This is often a matter of opinion and taste, and there isn't any one answer that fits everyone. In general, you want to use either the current stable release, or the stable release immediately prior to that one. Beyond that, you have to consider several things and decide which is best for you.

* If things aren't broken, upgrading perl may break them (or at least issue new warnings).
* The latest versions of perl have more bug fixes.
* The latest versions of perl may contain performance improvements and features not present in older versions.
There have been many changes in perl since perl5 was first introduced.
* The Perl community is geared toward supporting the most recent releases, so you'll have an easier time finding help for those.
* Older versions of perl may have security vulnerabilities, some of which are serious (see perlsec and search CVEs <https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=Perl> for more information).
* The latest versions are probably the least deployed and least widely tested, so if you are risk averse you may want to wait a few months after their release and see what problems others have.
* In addition to the current stable release, the immediately previous stable release is maintained. See "MAINTENANCE AND SUPPORT" in perlpolicy for more information.
* There are really two tracks of perl development: a maintenance version and an experimental version. The maintenance versions are stable, and have an even number as the minor release (e.g. perl5.24.x, where 24 is the minor release). The experimental versions may include features that don't make it into the stable versions, and have an odd number as the minor release (e.g. perl5.25.x, where 25 is the minor release).
* You can consult releases <http://dev.perl.org/perl5> to determine the current stable release of Perl.

How often are new versions of Perl released?

Recently, the plan has been to release a new version of Perl roughly every April, but getting the release right is more important than sticking rigidly to a calendar date, so the release date is somewhat flexible. The historical release dates can be viewed at <http://www.cpan.org/src/README.html>.

Even numbered minor versions (5.14, 5.16, 5.18) are production versions, and odd numbered minor versions (5.15, 5.17, 5.19) are development versions. Unless you want to try out an experimental feature, you probably never want to install a development version of Perl.
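The even/odd rule above is easy to check mechanically. The following is only a minimal sketch: it assumes a dotted version string of the form 5.MINOR.PATCH, and "release_track" is a hypothetical helper name chosen for illustration, not something Perl provides.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a version string such as "5.24.1"
# by the parity of its minor number (even = maintenance, odd = development).
sub release_track {
    my ($version) = @_;

    # Accept an optional leading "v"; give up on anything we cannot parse.
    my ( $major, $minor ) = $version =~ /^v?(\d+)\.(\d+)/
        or return;

    return $minor % 2 == 0 ? 'maintenance' : 'development';
}

print "$_: ", release_track($_), "\n" for qw(5.24.1 5.25.3);
```

To classify the interpreter you are running, you could pass in sprintf("%vd", $^V), which renders the running version in the same dotted form.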
The Perl development team are called Perl 5 Porters, and their organization is described at <http://perldoc.perl.org/perlpolicy.html>. The organizational rules really just boil down to one: Larry is always right, even when he was wrong. Is Perl difficult to learn? No, Perl is easy to start learning <http://learn.perl.org/> --and easy to keep learning. It looks like most programming languages you're likely to have experience with, so if you've ever written a C program, an awk script, a shell script, or even a BASIC program, you're already partway there. Most tasks only require a small subset of the Perl language. One of the guiding mottos for Perl development is "there's more than one way to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl's learning curve is therefore shallow (easy to learn) and long (there's a whole lot you can do if you really want). Finally, because Perl is frequently (but not always, and certainly not by definition) an interpreted language, you can write your programs and test them without an intermediate compilation step, allowing you to experiment and test/debug quickly and easily. This ease of experimentation flattens the learning curve even more. Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an understanding of regular expressions, and the ability to understand other people's code. If there's something you need to do, then it's probably already been done, and a working example is usually available for free. Don't forget Perl modules, either. They're discussed in Part 3 of this FAQ, along with CPAN <http://www.cpan.org/>, which is discussed in Part 2. What's the difference between "perl" and "Perl"? "Perl" is the name of the language. Only the "P" is capitalized. The name of the interpreter (the program which runs the Perl script) is "perl" with a lowercase "p". You may or may not choose to follow this usage. But never write "PERL", because perl is not an acronym. 
Found in /usr/share/perl/5.34/pod/perlfaq2.pod How can I get a binary version of Perl? See CPAN Ports <http://www.cpan.org/ports/> I copied the Perl binary from one machine to another, but scripts don't work. That's probably because you forgot libraries, or library paths differ. You really should build the whole distribution on the machine it will eventually live on, and then type "make install". Most other approaches are doomed to failure. One simple way to check that things are in the right place is to print out the hard-coded @INC that perl looks through for libraries: % perl -le 'print for @INC' If this command lists any paths that don't exist on your system, then you may need to move the appropriate libraries to these locations, or create symbolic links, aliases, or shortcuts appropriately. @INC is also printed as part of the output of % perl -V You might also want to check out "How do I keep my own module/library directory?" in perlfaq8. I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? Read the INSTALL file, which is part of the source distribution. It describes in detail how to cope with most idiosyncrasies that the "Configure" script can't work around for any given system or architecture. What modules and extensions are available for Perl? What is CPAN? CPAN stands for Comprehensive Perl Archive Network, a multi-gigabyte archive replicated on hundreds of machines all over the world. CPAN contains tens of thousands of modules and extensions, source code and documentation, designed for *everything* from commercial database interfaces to keyboard/screen control and running large web sites. You can search CPAN on <http://metacpan.org>. The master web site for CPAN is <http://www.cpan.org/>, <http://www.cpan.org/SITES.html> lists all mirrors. See the CPAN FAQ at <http://www.cpan.org/misc/cpan-faq.html> for answers to the most frequently asked questions about CPAN. 
The Task::Kensho module has a list of recommended modules which you should review as a good starting point. Where can I get information on Perl? * <http://www.perl.org/> * <http://perldoc.perl.org/> * <http://learn.perl.org/> The complete Perl documentation is available with the Perl distribution. If you have Perl installed locally, you probably have the documentation installed as well: type "perldoc perl" in a terminal or view online <http://perldoc.perl.org/perl.html>. (Some operating system distributions may ship the documentation in a different package; for instance, on Debian, you need to install the "perl-doc" package.) Many good books have been written about Perl--see the section later in perlfaq2 for more details. What mailing lists are there for Perl? A comprehensive list of Perl-related mailing lists can be found at <http://lists.perl.org/> Where can I buy a commercial version of Perl? Perl already *is* commercial software: it has a license that you can grab and carefully read to your manager. It is distributed in releases and comes in well-defined packages. There is a very large and supportive user community and an extensive literature. If you still need commercial support ActiveState <http://www.activestate.com/activeperl> offers this. Found in /usr/share/perl/5.34/pod/perlfaq3.pod How do I find which modules are installed on my system? From the command line, you can use the "cpan" command's "-l" switch: $ cpan -l You can also use "cpan"'s "-a" switch to create an autobundle file that "CPAN.pm" understands and can use to re-install every module: $ cpan -a Inside a Perl program, you can use the ExtUtils::Installed module to show all installed distributions, although it can take awhile to do its magic. The standard library which comes with Perl just shows up as "Perl" (although you can get those with Module::CoreList). 
use ExtUtils::Installed; my $inst = ExtUtils::Installed->new(); my @modules = $inst->modules(); If you want a list of all of the Perl module filenames, you can use File::Find::Rule: use File::Find::Rule; my @files = File::Find::Rule-> extras({follow => 1})-> file()-> name( '*.pm' )-> in( @INC ) ; If you do not have that module, you can do the same thing with File::Find which is part of the standard library: use File::Find; my @files; find( { wanted => sub { push @files, $File::Find::fullname if -f $File::Find::fullname && /\.pm$/ }, follow => 1, follow_skip => 2, }, @INC ); print join "\n", @files; If you simply need to check quickly to see if a module is available, you can check for its documentation. If you can read the documentation the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases): $ perldoc Module::Name You can also try to include the module in a one-liner to see if perl finds it: $ perl -MModule::Name -e1 (If you don't receive a "Can't locate ... in @INC" error message, then Perl found the module name you asked for.) How do I profile my Perl programs? (contributed by brian d foy, updated Fri Jul 25 12:22:26 PDT 2008) The "Devel" namespace has several modules which you can use to profile your Perl programs. The Devel::NYTProf (New York Times Profiler) does both statement and subroutine profiling. It's available from CPAN and you also invoke it with the "-d" switch: perl -d:NYTProf some_perl.pl It creates a database of the profile information that you can turn into reports. The "nytprofhtml" command turns the data into an HTML report similar to the Devel::Cover report: nytprofhtml You might also be interested in using the Benchmark to measure and compare code snippets. You can read more about profiling in *Programming Perl*, chapter 20, or *Mastering Perl*, chapter 5. perldebguts documents creating a custom debugger if you need to create a special sort of profiler. 
brian d foy describes the process in *The Perl Journal*, "Creating a Perl Debugger", <http://www.ddj.com/184404522> , and "Profiling in Perl" <http://www.ddj.com/184404580> . Perl.com has two interesting articles on profiling: "Profiling Perl", by Simon Cozens, <https://www.perl.com/pub/2004/06/25/profiling.html/> and "Debugging and Profiling mod_perl Applications", by Frank Wiles, <http://www.perl.com/pub/a/2006/02/09/debug_mod_perl.html> . Randal L. Schwartz writes about profiling in "Speeding up Your Perl Programs" for *Unix Review*, <http://www.stonehenge.com/merlyn/UnixReview/col49.html> , and "Profiling in Template Toolkit via Overriding" for *Linux Magazine*, <http://www.stonehenge.com/merlyn/LinuxMag/col75.html> . How do I cross-reference my Perl programs? The B::Xref module can be used to generate cross-reference reports for Perl programs. perl -MO=Xref[,OPTIONS] scriptname.plx Is there a pretty-printer (formatter) for Perl? Perl::Tidy comes with a perl script perltidy which indents and reformats Perl scripts to make them easier to read by trying to follow the rules of the perlstyle. If you write Perl, or spend much time reading Perl, you will probably find it useful. Of course, if you simply follow the guidelines in perlstyle, you shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode or newer cperl-mode for emacs can provide remarkable amounts of help with most (but not all) code, and even less programmable editors can provide significant assistance. Tom Christiansen and many other VI users swear by the following settings in vi and its clones: set ai sw=4 map! ^O {^M}^[O^T Put that in your .exrc file (replacing the caret characters with control characters) and away you go. In insert mode, ^T is for indenting, ^D is for undenting, and ^O is for blockdenting--as it were. 
A more complete example, with comments, can be found at <http://www.cpan.org/authors/id/T/TO/TOMC/scripts/toms.exrc.gz> Where can I get Perl macros for vi? For a complete version of Tom Christiansen's vi configuration file, see <http://www.cpan.org/authors/id/T/TO/TOMC/scripts/toms.exrc.gz> , the standard benchmark file for vi emulators. The file runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter--see <http://www.cpan.org/src/misc/> . Where can I get perl-mode or cperl-mode for emacs? Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the Perl debugger built in. These should come with the standard Emacs 19 distribution. Note that the perl-mode of emacs will have fits with "main'foo" (single quote), and mess up the indentation and highlighting. You are probably using "main::foo" in new Perl code anyway, so this shouldn't be an issue. For CPerlMode, see <http://www.emacswiki.org/cgi-bin/wiki/CPerlMode> How can I make my Perl program run faster? The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Jon Bentley's book *Programming Pearls* (that's not a misspelling!) has some good tips on optimization, too. Advice on benchmarking boils down to: benchmark and profile to make sure you're optimizing the right part, look for better algorithms instead of microtuning your code, and when all else fails consider just buying faster hardware. You will probably want to read the answer to the earlier question "How do I profile my Perl programs?" if you haven't done so already. A different approach is to autoload seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. 
Similar to rewriting in C, modules that have critical sections can be written in C (for instance, the PDL module from CPAN).

If you're currently linking your perl executable to a shared *libc.so*, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information.

The undump program was an ancient attempt to speed up Perl programs by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway.

Is it safe to return a reference to local or lexical data?

Yes. Perl's garbage collection system takes care of this so everything works out right.

    sub makeone {
        my @a = ( 1 .. 10 );
        return \@a;
    }

    my @many;    # declared so the example also runs under "use strict"
    for ( 1 .. 10 ) {
        push @many, makeone();
    }

    print $many[4][5], "\n";

    print "@many\n";

How can I free an array or hash so my program shrinks?

(contributed by Michael Carman)

You usually can't. Memory allocated to lexicals (i.e. my() variables) cannot be reclaimed or reused even if they go out of scope. It is reserved in case the variables come back into scope. Memory allocated to global variables can be reused (within your program) by using undef() and/or delete().

On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs sometimes re-exec themselves. Some operating systems (notably, systems that use mmap(2) for allocating large chunks of memory) can reclaim memory that is no longer used, but on such systems, perl must be configured and compiled to use the OS's malloc, not perl's.

In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl.

See also "How can I make my Perl program take less memory?"

How can I make my CGI script more efficient?
Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is.

There are three popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from <http://www.apache.org/> ) with either the mod_perl or the mod_fastcgi plugin module.

With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see <http://perl.apache.org/>

With the FCGI module (from CPAN) and the mod_fastcgi module (available from <http://www.fastcgi.com/> ) each of your Perl programs becomes a permanent CGI daemon process.

Finally, Plack is a Perl module and toolkit that contains PSGI middleware, helpers and adapters to web servers, allowing you to easily deploy scripts which can continue running, and provides flexibility with regards to which web server you use. It can allow existing CGI scripts to enjoy this flexibility and performance with minimal changes, or can be used along with modern Perl web frameworks to make writing and deploying web services with Perl a breeze.

These solutions can have far-reaching effects on your system and on the way you write your CGI programs, so investigate them with care.

See also <http://www.cpan.org/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/> .

How can I hide the source for my Perl program?

Delete it.
:-) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of "security". First of all, however, you *can't* take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though--only by people with access to the filesystem.) So you have to leave the permissions at the socially friendly 0755 level. Some people regard this as a security problem. If your program does insecure things and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed. You can try using encryption via source filters (Starting from Perl 5.8 the Filter::Simple and Filter::Util::Call modules are included in the standard distribution), but any decent programmer will be able to decrypt it. You can try using the byte code compiler and interpreter described later in perlfaq3, but the curious might still be able to de-compile it. You can try using the native-code compiler described later, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (true of every language, not just Perl). It is very easy to recover the source of Perl programs. You simply feed the program to the perl interpreter and use the modules in the B:: hierarchy. The B::Deparse module should be able to defeat most attempts to hide source. Again, this is not unique to Perl. If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive license will give you legal security. 
License your software and pepper it with threatening statements like "This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah." We are not lawyers, of course, so you should see a lawyer if you want to be sure your license's wording will stand up in court. Can I write useful Perl programs on the command line? Yes. Read perlrun for more information. Some examples follow. (These assume standard Unix shell quoting rules.) # sum first and last fields perl -lane 'print $F[0] + $F[-1]' * # identify text files perl -le 'for(@ARGV) {print if -f && -T _}' * # remove (most) comments from C program perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c # make file a month younger than today, defeating reaper daemons perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' * # find first unused uid perl -le '$i++ while getpwuid($i); print $i' # display reasonable manpath echo $PATH | perl -nl -072 -e ' s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}' OK, the last one was actually an Obfuscated Perl Contest entry. :-) Found in /usr/share/perl/5.34/pod/perlfaq4.pod Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? For the long explanation, see David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (<http://web.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf>). Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl. perlnumber shows the gory details of number representations and conversions. To limit the number of decimal places in your numbers, you can use the "printf" or "sprintf" function. See "Floating-point Arithmetic" in perlop for more details. 
printf "%.2f", 10/3; my $number = sprintf "%.2f", 10/3; Does Perl have a round() function? What about ceil() and floor()? Trig functions? Remember that "int()" merely truncates toward 0. For rounding to a certain number of digits, "sprintf()" or "printf()" is usually the easiest route. printf("%.3f", 3.1415926535); # prints 3.142 The POSIX module (part of the standard Perl distribution) implements "ceil()", "floor()", and a number of other mathematical and trigonometric functions. use POSIX; my $ceil = ceil(3.5); # 4 my $floor = floor(3.5); # 3 In 5.000 to 5.003 perls, trigonometry was done in the Math::Complex module. With 5.004, the Math::Trig module (part of the standard Perl distribution) implements the trigonometric functions. Internally it uses the Math::Complex module and some functions can break out from the real axis into the complex plane, for example the inverse sine of 2. Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system of rounding is being used by Perl, but instead to implement the rounding function you need yourself. To see why, notice how you'll still have an issue on half-way-point alternation: for (my $i = -5; $i <= 5; $i += 0.5) { printf "%.0f ",$i } -5 -4 -4 -4 -3 -2 -2 -2 -1 -0 0 0 1 2 2 2 3 4 4 4 5 Don't blame Perl. It's the same as in C. IEEE says we have to do this. Perl numbers whose absolute values are integers under 2**31 (on 32-bit machines) will work pretty much like mathematical integers. Other numbers are not guaranteed. How do I perform an operation on a series of integers? 
To call a function on each element in an array, and collect the results, use: my @results = map { my_func($_) } @array; For example: my @triple = map { 3 * $_ } @single; To call a function on each element of an array, but ignore the results: foreach my $iterator (@array) { some_func($iterator); } To call a function on each integer in a (small) range, you can use: my @results = map { some_func($_) } (5 .. 25); but you should be aware that in this form, the ".." operator creates a list of all integers in the range, which can take a lot of memory for large ranges. However, the problem does not occur when using ".." within a "for" loop, because in that case the range operator is optimized to *iterate* over the range, without creating the entire list. So my @results = (); for my $i (5 .. 500_005) { push(@results, some_func($i)); } or even push(@results, some_func($_)) for 5 .. 500_005; will not create an intermediate list of 500,000 integers. How do I find the day or week of the year? The day of the year is in the list returned by the "localtime" function. Without an argument "localtime" uses the current time. my $day_of_year = (localtime)[7]; The POSIX module can also format a date as the day of the year or week of the year. use POSIX qw/strftime/; my $day_of_year = strftime "%j", localtime; my $week_of_year = strftime "%W", localtime; To get the day of year for any date, use POSIX's "mktime" to get a time in epoch seconds for the argument to "localtime". 
    use POSIX qw/mktime strftime/;
    # mktime( sec, min, hour, mday, mon (0-based), year - 1900 ): 18 Dec 1987
    my @date         = localtime( mktime( 0, 0, 0, 18, 11, 87 ) );
    my $day_of_year  = strftime "%j", @date;
    my $week_of_year = strftime "%W", @date;

You can also use Time::Piece, which comes with Perl and provides a "localtime" that returns an object:

    use Time::Piece;
    my $day_of_year  = localtime->yday;
    my $week_of_year = localtime->week;

The Date::Calc module provides two functions to calculate these, too. Date::Calc exports nothing by default, so the functions have to be imported by name:

    use Date::Calc qw( Day_of_Year Week_of_Year );
    my $day_of_year  = Day_of_Year( 1987, 12, 18 );
    my $week_of_year = Week_of_Year( 1987, 12, 18 );

How do I find the current century or millennium?

Use the following simple functions:

    sub get_century {
        return int((((localtime(shift || time))[5] + 1999))/100);
    }

    sub get_millennium {
        return 1+int((((localtime(shift || time))[5] + 1899))/1000);
    }

On some systems, the POSIX module's "strftime()" function has been extended in a non-standard way to use a %C format, which they sometimes claim is the "century". It isn't, because on most such systems, this is only the first two digits of the four-digit year, and thus cannot be used to determine reliably the current century or millennium.

How can I compare two dates and find the difference?

(contributed by brian d foy)

You could just store all your dates as a number and then subtract. Life isn't always that simple though.

The Time::Piece module, which comes with Perl, replaces localtime with a version that returns an object. It also overloads the comparison operators so you can compare them directly:

    use Time::Piece;
    my $date1 = localtime( $some_time );
    my $date2 = localtime( $some_other_time );

    if( $date1 < $date2 ) {
        print "The date was in the past\n";
    }

You can also get differences with a subtraction, which returns a Time::Seconds object:

    my $date_diff = $date1 - $date2;
    print "The difference is ", $date_diff->days, " days\n";

If you want to work with formatted dates, the Date::Manip, Date::Calc, or DateTime modules can help you.

How can I find the Julian Day?
(contributed by brian d foy and Dave Cross)

You can use the Time::Piece module, part of the Standard Library, which can convert a date/time to a Julian Day:

    $ perl -MTime::Piece -le 'print localtime->julian_day'
    2455607.7959375

Or the modified Julian Day:

    $ perl -MTime::Piece -le 'print localtime->mjd'
    55607.2961226851

Or even the day of the year (which is what some people think of as a Julian day):

    $ perl -MTime::Piece -le 'print localtime->yday'
    45

You can also do the same things with the DateTime module:

    $ perl -MDateTime -le'print DateTime->today->jd'
    2453401.5
    $ perl -MDateTime -le'print DateTime->today->mjd'
    53401
    $ perl -MDateTime -le'print DateTime->today->doy'
    31

You can also use the Time::JulianDay module available on CPAN. Ensure that you really want to find a Julian day, though, as many people have different ideas about Julian days (see <http://www.hermetic.ch/cal_stud/jdn.htm> for instance):

    $ perl -MTime::JulianDay -le 'print local_julian_day( time )'
    55608

How do I find yesterday's date?

(contributed by brian d foy)

To do it correctly, you can use one of the "Date" modules since they work with calendars instead of times. The DateTime module makes it simple, and gives you the same time of day, only the day before, despite daylight saving time changes:

    use DateTime;

    my $yesterday = DateTime->now->subtract( days => 1 );

    print "Yesterday was $yesterday\n";

You can also use the Date::Calc module using its "Today_and_Now" function.

    use Date::Calc qw( Today_and_Now Add_Delta_DHMS );

    my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 );

    print "@date_time\n";

Most people try to use the time rather than the calendar to figure out dates, but that assumes that days are twenty-four hours each. For most people, there are two days a year when they aren't: the switch to and from summer time throws this off.
For example, the rest of the suggestions will be wrong sometimes: Starting with Perl 5.10, Time::Piece and Time::Seconds are part of the standard distribution, so you might think that you could do something like this: use Time::Piece; use Time::Seconds; my $yesterday = localtime() - ONE_DAY; # WRONG print "Yesterday was $yesterday\n"; The Time::Piece module exports a new "localtime" that returns an object, and Time::Seconds exports the "ONE_DAY" constant that is a set number of seconds. This means that it always gives the time 24 hours ago, which is not always yesterday. This can cause problems around the end of daylight saving time when there's one day that is 25 hours long. You have the same problem with Time::Local, which will give the wrong answer for those same special cases: # contributed by Gunnar Hjalmarsson use Time::Local; my $today = timelocal 0, 0, 12, ( localtime )[3..5]; my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d; How do I remove consecutive pairs of characters? (contributed by brian d foy) You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in "(.)". The memory parentheses store the matched character in the back-reference "\g1" and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1. s/(.)\g1/$1/g; We can also use the transliteration operator, "tr///". In this example, the search list side of our "tr///" contains nothing, but the "c" option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, replace the character with itself). 
However, the "s" option squashes duplicated and consecutive characters in the string so a character does not show up next to itself: my $str = 'Haarlem'; # in the Netherlands $str =~ tr///cs; # Now Harlem, like in New York How do I expand function calls in a string? (contributed by brian d foy) This is documented in perlref, and although it's not the easiest thing to read, it does work. In each of these examples, we call the function inside the braces used to dereference a reference. If we have more than one return value, we can construct and dereference an anonymous array. In this case, we call the function in list context. print "The time values are @{ [localtime] }.\n"; If we want to call the function in scalar context, we have to do a bit more work. We can really have any code we like inside the braces, so we simply have to end with the scalar reference, although how you do that is up to you, and you can use code inside the braces. Note that the use of parens creates a list context, so we need "scalar" to force the scalar context on the function: print "The time is ${\(scalar localtime)}.\n"; print "The time is ${ my $x = localtime; \$x }.\n"; If your function already returns a reference, you don't need to create the reference yourself. sub timestamp { my $t = localtime; \$t } print "The time is ${ timestamp() }.\n"; The "Interpolation" module can also do a lot of magic for you. You can specify a variable name, in this case "E", to set up a tied hash that does the interpolation for you. It has several other methods to do this as well. use Interpolation E => 'eval'; print "The time values are $E{localtime()}.\n"; In most cases, it is probably easier to simply use string concatenation, which also forces scalar context. print "The time is " . localtime() . ".\n"; How do I find matching/nesting anything? To find something between two single characters, a pattern like "/x([^x]*)x/" will get the intervening bits in $1.
For multi-character delimiters, something more like "/alpha(.*?)omega/" would be needed. For nested patterns and/or balanced expressions, see the so-called (?PARNO) construct (available since perl 5.10). The CPAN module Regexp::Common can help to build such regular expressions (see in particular Regexp::Common::balanced and Regexp::Common::delimited). More complex cases will require writing a parser, probably using a parsing module from CPAN, like Regexp::Grammars, Parse::RecDescent, Parse::Yapp, Text::Balanced, or Marpa::R2. How do I reformat a paragraph? Use Text::Wrap (part of the standard Perl distribution): use Text::Wrap; print wrap("\t", ' ', @paragraphs); The paragraphs you give to Text::Wrap should not contain embedded newlines. Text::Wrap doesn't justify the lines (flush-right). Or use the CPAN module Text::Autoformat. Formatting files can be easily done by making a shell alias, like so: alias fmt="perl -i -MText::Autoformat -n0777 \ -e 'print autoformat $_, {all=>1}' $*" See the documentation for Text::Autoformat to appreciate its many capabilities. How can I access or change N characters of a string? You can access the first characters of a string with substr(). To get the first character, for example, start at position 0 and grab the string of length 1. my $string = "Just another Perl Hacker"; my $first_char = substr( $string, 0, 1 ); # 'J' To change part of a string, you can use the optional fourth argument which is the replacement string. substr( $string, 13, 4, "Perl 5.8.0" ); You can also use substr() as an lvalue. substr( $string, 13, 4 ) = "Perl 5.8.0"; How do I change the Nth occurrence of something? You have to keep track of N yourself. For example, let's say you want to change the fifth occurrence of "whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively. These all assume that $_ contains the string to be altered. $count = 0; s{((whom?)ever)}{ ++$count == 5 # is it the 5th? ?
"${2}soever" # yes, swap : $1 # renege and leave it there }ige; In the more general case, you can use the "/g" modifier in a "while" loop, keeping count of matches. $WANT = 3; $count = 0; $_ = "One fish two fish red fish blue fish"; while (/(\w+)\s+fish\b/gi) { if (++$count == $WANT) { print "The third fish is a $1 one.\n"; } } That prints out: "The third fish is a red one." You can also use a repetition count and repeated pattern like this: /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i; How can I count the number of occurrences of a substring within a string? There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the "tr///" function like so: my $string = "ThisXlineXhasXsomeXx'sXinXit"; my $count = ($string =~ tr/X//); print "There are $count X characters in the string"; This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, "tr///" won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers: my $string = "-9 55 48 -2 23 -76 4 14 -44"; my $count = 0; while ($string =~ /-\d+/g) { $count++ } print "There are $count negative numbers in the string"; Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches. my $count = () = $string =~ /-\d+/g; How do I strip blank space from the beginning/end of a string? (contributed by brian d foy) A substitution can do this for you. For a single line, you want to replace all the leading or trailing whitespace with nothing. You can do that with a pair of substitutions: s/^\s+//; s/\s+$//; You can also write that as a single substitution, although it turns out the combined statement is slower than the separate ones. 
That might not matter to you, though: s/^\s+|\s+$//g; In this regular expression, the alternation matches either at the beginning or the end of the string since the anchors bind more tightly than the alternation. With the "/g" flag, the substitution makes all possible matches, so it gets both. Remember, the trailing newline matches the "\s+", and the "$" anchor can match to the absolute end of the string, so the newline disappears too. Just add the newline to the output, which has the added benefit of preserving "blank" (consisting entirely of whitespace) lines which the "^\s+" would remove all by itself: while( <> ) { s/^\s+|\s+$//g; print "$_\n"; } For a multi-line string, you can apply the regular expression to each logical line in the string by adding the "/m" flag (for "multi-line"). With the "/m" flag, the "$" matches *before* an embedded newline, so it doesn't remove it. This pattern still removes the newline at the end of the string: $string =~ s/^\s+|\s+$//gm; Remember that lines consisting entirely of whitespace will disappear, since the first part of the alternation can match the entire string and replace it with nothing. If you need to keep embedded blank lines, you have to do a little more work. Instead of matching any whitespace (since that includes a newline), just match the other whitespace: $string =~ s/^[\t\f ]+|[\t\f ]+$//mg; How do I extract selected columns from a string? (contributed by brian d foy) If you know the columns that contain the data, you can use "substr" to extract a single column. my $column = substr( $line, $start_column, $length ); You can use "split" if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data.
my $line = ' fred barney betty '; my @columns = split /\s+/, $line; # ( '', 'fred', 'barney', 'betty' ); my $line = 'fred||barney||betty'; my @columns = split /\|/, $line; # ( 'fred', '', 'barney', '', 'betty' ); If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that handle that format, such as Text::CSV, Text::CSV_XS, or Text::CSV_PP. If you want to break apart an entire line of fixed columns, you can use "unpack" with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. See the "pack" and "unpack" entries in perlfunc for more details. my @fields = unpack( "A8 A8 A8 A16 A4", $line ); # template first, then the data Note that spaces in the format argument to "unpack" do not denote literal spaces. If you have space separated data, you may want "split" instead. How do I find the soundex value of a string? (contributed by brian d foy) You can use the "Text::Soundex" module. If you want to do fuzzy or close matching, you might also try the String::Approx, Text::Metaphone, and Text::DoubleMetaphone modules. Does Perl have anything like Ruby's #{} or Python's f string? Unlike the others, Perl allows you to embed a variable naked in a double quoted string, e.g. "variable $variable". When there isn't whitespace or other non-word characters following the variable name, you can add braces (e.g. "foo ${foo}bar") to ensure correct parsing. An array can also be embedded directly in a string, and will be expanded by default with spaces between the elements. The default LIST_SEPARATOR can be changed by assigning a different string to the special variable $", such as "local $" = ', ';". Perl also supports references within a string providing the equivalent of the features in the other two languages. "${\ ... }" embedded within a string will work for most simple statements such as an object->method call. More complex code can be wrapped in a do block "${\ do{...} }".
When you want a list to be expanded per $", use "@{[ ... ]}". use feature 'say'; use Time::Piece; use Time::Seconds; my $scalar = 'STRING'; my @array = ( 'zorro', 'a', 1, 'B', 3 ); # Print the current date and time and then tomorrow my $t = Time::Piece->new; say "Now is: ${\ $t->cdate() }"; say "Tomorrow: ${\ do{ my $T=Time::Piece->new + ONE_DAY ; $T->fullday }}"; # some variables in strings say "This is some scalar I have $scalar, this is an array @array."; say "You can also write it like this ${scalar} @{array}."; # Change the $LIST_SEPARATOR local $" = ':'; say "Set \$\" to delimit with ':' and sort the Array @{[ sort @array ]}"; You may also want to look at the module Quote::Code, and templating tools such as Template::Toolkit and Mojo::Template. See also: "How can I expand variables in text strings?" and "How do I expand function calls in a string?" in this FAQ. What is the difference between a list and an array? (contributed by brian d foy) A list is a fixed collection of scalars. An array is a variable that holds a variable collection of scalars. An array can supply its collection for list operations, so list operations also work on arrays: # slices ( 'dog', 'cat', 'bird' )[2,3]; @animals[2,3]; # iteration foreach ( qw( dog cat bird ) ) { ... } foreach ( @animals ) { ... } my @three = grep { length == 3 } qw( dog cat bird ); my @three = grep { length == 3 } @animals; # supply an argument list wash_animals( qw( dog cat bird ) ); wash_animals( @animals ); Array operations, which change the scalars, rearrange them, or add or subtract some scalars, only work on arrays. These can't work on a list, which is fixed. Array operations include "shift", "unshift", "push", "pop", and "splice". An array can also change its length: $#animals = 1; # truncate to two elements $#animals = 10000; # pre-extend to 10,001 elements You can change an array element, but you can't change a list element: $animals[0] = 'Rottweiler'; qw( dog cat bird )[0] = 'Rottweiler'; # syntax error!
foreach ( @animals ) { s/^d/fr/; # works fine } foreach ( qw( dog cat bird ) ) { s/^d/fr/; # Error! Modification of read only value! } However, if the list element is itself a variable, it appears that you can change a list element. In that case the list element is the variable, not the data. You're not changing the list element, but something the list element refers to. The list element itself doesn't change: it's still the same variable. You also have to be careful about context. You can assign an array to a scalar to get the number of elements in the array. This only works for arrays, though: my $count = @animals; # only works with arrays If you try to do the same thing with what you think is a list, you get a quite different result. Although it looks like you have a list on the righthand side, Perl actually sees a bunch of scalars separated by a comma: my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird Since you're assigning to a scalar, the righthand side is in scalar context. The comma operator (yes, it's an operator!) in scalar context evaluates its lefthand side, throws away the result, and evaluates its righthand side and returns the result. In effect, that list-lookalike assigns to $scalar its rightmost value. Many people mess this up because they choose a list-lookalike whose last element is also the count they expect: my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally What is the difference between $array[1] and @array[1]? (contributed by brian d foy) The difference is the sigil, that special character in front of the array name. The "$" sigil means "exactly one item", while the "@" sigil means "zero or more items". The "$" gets you a single scalar, while the "@" gets you a list. The confusion arises because people incorrectly assume that the sigil denotes the variable type. The $array[1] is a single-element access to the array. It's going to return the item in index 1 (or undef if there is no item there).
If you intend to get exactly one element from the array, this is the form you should use. The @array[1] is an array slice, although it has only one index. You can pull out multiple elements simultaneously by specifying additional indices as a list, like @array[1,4,3,0]. Using a slice on the lefthand side of the assignment supplies list context to the righthand side. This can lead to unexpected results. For instance, if you want to read a single line from a filehandle, assigning to a scalar value is fine: $array[1] = <STDIN>; However, in list context, the line input operator returns all of the lines as a list. The first line goes into @array[1] and the rest of the lines mysteriously disappear: @array[1] = <STDIN>; # most likely not what you want Either the "use warnings" pragma or the -w flag will warn you when you use an array slice with a single index. How can I remove duplicate elements from a list or array? (contributed by brian d foy) Use a hash. When you think the words "unique" or "duplicated", think "hash keys". If you don't care about the order of the elements, you could just create the hash then extract the keys. It's not important how you create that hash: just that you use "keys" to get the unique elements. my %hash = map { $_, 1 } @array; # or a hash slice: @hash{ @array } = (); # or a foreach: $hash{$_} = 1 foreach ( @array ); my @unique = keys %hash; If you want to use a module, try the "uniq" function from List::MoreUtils. In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements. use List::MoreUtils qw(uniq); my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7 my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7 You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an element, that element has no key in %seen.
The "next" statement creates the key and immediately uses its value, which is "undef", so the loop continues to the "push" and increments the value for that key. The next time the loop sees that same element, its key exists in the hash *and* the value for that key is true (since it's not 0 or "undef"), so the next skips that iteration and the loop goes to the next element. my @unique = (); my %seen = (); foreach my $elem ( @array ) { next if $seen{ $elem }++; push @unique, $elem; } You can write this more briefly using a grep, which does the same thing. my %seen = (); my @unique = grep { ! $seen{ $_ }++ } @array; How do I compute the difference of two arrays? How do I compute the intersection of two arrays? Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array: my (@union, @intersection, @difference); my %count = (); foreach my $element (@array1, @array2) { $count{$element}++ } foreach my $element (keys %count) { push @union, $element; push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; } Note that this is the *symmetric difference*, that is, all elements in either A or in B but not in both. Think of it as an xor operation. How do I find the first array element for which a condition is true? To find the first array element which satisfies a condition, you can use the "first()" function in the List::Util module, which comes with Perl 5.8. This example finds the first element that contains "Perl". use List::Util qw(first); my $element = first { /Perl/ } @array; If you cannot use List::Util, you can make your own loop to do the same thing. Once you find the element, you stop the loop with last. 
my $found; foreach ( @array ) { if( /Perl/ ) { $found = $_; last } } If you want the array index, use the "firstidx()" function from "List::MoreUtils": use List::MoreUtils qw(firstidx); my $index = firstidx { /Perl/ } @array; Or write it yourself, iterating through the indices and checking the array element at each index until you find one that satisfies the condition: my( $found, $index ) = ( undef, -1 ); for( my $i = 0; $i < @array; $i++ ) { if( $array[$i] =~ /Perl/ ) { $found = $array[$i]; $index = $i; last; } } How do I shuffle an array randomly? If you either have Perl 5.8.0 or later installed, or if you have Scalar-List-Utils 1.03 or later installed, you can say: use List::Util 'shuffle'; @shuffled = shuffle(@list); If not, you can use a Fisher-Yates shuffle. sub fisher_yates_shuffle { my $deck = shift; # $deck is a reference to an array return unless @$deck; # must not be empty! my $i = @$deck; while (--$i) { my $j = int rand ($i+1); @$deck[$i,$j] = @$deck[$j,$i]; } } # shuffle my mpeg collection # my @mpeg = <audio/*/*.mp3>; fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place print @mpeg; Note that the above implementation shuffles an array in place, unlike the "List::Util::shuffle()" which takes a list and returns a new shuffled list. You've probably seen shuffling algorithms that work using splice, randomly picking another element to swap the current element with: srand; @new = (); @old = 1 .. 10; # just a demo while (@old) { push(@new, splice(@old, rand @old, 1)); } This is bad because splice is already O(N), and since you do it N times, you just invented a quadratic algorithm; that is, O(N**2). This does not scale, although Perl is so efficient that you probably won't notice this until you have rather largish arrays. How do I process/modify each element of an array?
Use "for"/"foreach": for (@lines) { s/foo/bar/; # change that word tr/XZ/ZX/; # swap those letters } Here's another; let's compute spherical volumes: my @volumes = @radii; for (@volumes) { # @volumes has changed parts $_ **= 3; $_ *= (4/3) * 3.14159; # this will be constant folded } which can also be done with "map()" which is made to transform one list into another: my @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii; If you want to do the same thing to modify the values of the hash, you can use the "values" function. As of Perl 5.6 the values are not copied, so if you modify $orbit (in this case), you modify the value. for my $orbit ( values %orbits ) { ($orbit **= 3) *= (4/3) * 3.14159; } Prior to perl 5.6 "values" returned copies of the values, so older perl code often contains constructions such as @orbits{keys %orbits} instead of "values %orbits" where the hash is to be modified. How do I select a random element from an array? Use the "rand()" function (see "rand" in perlfunc): my $index = rand @array; my $element = $array[$index]; Or, simply: my $element = $array[ rand @array ]; How do I permute N elements of a list? Use the List::Permutor module on CPAN. If the list is actually an array, try the Algorithm::Permute module (also on CPAN). It's written in XS code and is very efficient: use Algorithm::Permute; my @array = 'a'..'d'; my $p_iterator = Algorithm::Permute->new ( \@array ); while (my @perm = $p_iterator->next) { print "next permutation: (@perm)\n"; } For even faster execution, you could do: use Algorithm::Permute; my @array = 'a'..'d'; Algorithm::Permute::permute { print "next permutation: (@array)\n"; } @array; Here's a little program that generates all permutations of all the words on each line of input. 
The algorithm embodied in the "permute()" function is discussed in Volume 4 (still unpublished) of Knuth's *The Art of Computer Programming* and will work on any list: #!/usr/bin/perl -n # Fischer-Krause ordered permutation generator sub permute (&@) { my $code = shift; my @idx = 0..$#_; while ( $code->(@_[@idx]) ) { my $p = $#idx; --$p while $idx[$p-1] > $idx[$p]; my $q = $p or return; push @idx, reverse splice @idx, $p; ++$q while $idx[$p-1] > $idx[$q]; @idx[$p-1,$q]=@idx[$q,$p-1]; } } permute { print "@_\n" } split; The Algorithm::Loops module also provides the "NextPermute" and "NextPermuteNum" functions which efficiently find all unique permutations of an array, even if it contains duplicate values, modifying it in-place: if its elements are in reverse-sorted order then the array is reversed, making it sorted, and it returns false; otherwise the next permutation is returned. "NextPermute" uses string order and "NextPermuteNum" numeric order, so you can enumerate all the permutations of 0..9 like this: use Algorithm::Loops qw(NextPermuteNum); my @list= 0..9; do { print "@list\n" } while NextPermuteNum @list; How do I manipulate arrays of bits? Use "pack()" and "unpack()", or else "vec()" and the bitwise operations. For example, you don't have to store individual bits in an array (which would mean that you're wasting a lot of space). To convert an array of bits to a string, use "vec()" to set the right bits. This sets $vec to have bit N set only if $ints[N] was set: my @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... ) my $vec = ''; foreach( 0 .. $#ints ) { vec($vec,$_,1) = 1 if $ints[$_]; } The string $vec only takes up as many bits as it needs. For instance, if you had 16 entries in @ints, $vec only needs two bytes to store them (not counting the scalar variable overhead). 
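As a rough illustration of the space claim above, here is a minimal sketch that packs sixteen flags into a string with "vec()" and reads them back. The sample data and the variable names (@bits, @round_trip) are made up for this example and are not part of the FAQ's own code:

```perl
use strict;
use warnings;

# Hypothetical sample data: one flag per position, sixteen in all.
my @bits = ( 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1 );

# Pack the flags into a string, one bit per flag.
my $vec = '';
vec( $vec, $_, 1 ) = 1 for grep { $bits[$_] } 0 .. $#bits;

# Sixteen one-bit entries fit in two bytes.
printf "%d flags stored in %d bytes\n", scalar @bits, length $vec;

# Read the flags back, one bit at a time.
my @round_trip = map { vec( $vec, $_, 1 ) } 0 .. $#bits;
print "round trip ok\n" if "@round_trip" eq "@bits";
```

The string only grows as far as the highest bit you set, so a trailing run of zero flags may leave $vec shorter than you expect; reading past the end with "vec()" simply returns 0.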
Here's how, given a vector in $vec, you can get those bits into your @ints array: sub bitvec_to_list { my $vec = shift; my @ints; # Find null-byte density then select best algorithm if ($vec =~ tr/\0// / length $vec > 0.95) { use integer; my $i; # This method is faster with mostly null-bytes while($vec =~ /[^\0]/g ) { $i = -9 + 8 * pos $vec; push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); push @ints, $i if vec($vec, ++$i, 1); } } else { # This method is a fast general algorithm use integer; my $bits = unpack "b*", $vec; push @ints, 0 if $bits =~ s/^(\d)// && $1; push @ints, pos $bits while($bits =~ /1/g); } return \@ints; } This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.) You can make the while loop a lot shorter with this suggestion from Benjamin Goldberg: while($vec =~ /[^\0]+/g ) { push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8; } Or use the CPAN module Bit::Vector: my $vector = Bit::Vector->new($num_of_bits); $vector->Index_List_Store(@ints); my @ints = $vector->Index_List_Read(); Bit::Vector provides efficient methods for bit vector, sets of small integers and "big int" math. Here's a more extensive illustration using vec(): # vec demo my $vector = "\xff\x0f\xef\xfe"; print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", unpack("N", $vector), "\n"; my $is_set = vec($vector, 23, 1); print "Its 23rd bit is ", $is_set ? 
"set" : "clear", ".\n"; pvec($vector); set_vec(1,1,1); set_vec(3,1,1); set_vec(23,1,1); set_vec(3,1,3); set_vec(3,2,3); set_vec(3,4,3); set_vec(3,4,7); set_vec(3,8,3); set_vec(3,8,7); set_vec(0,32,17); set_vec(1,32,17); sub set_vec { my ($offset, $width, $value) = @_; my $vector = ''; vec($vector, $offset, $width) = $value; print "offset=$offset width=$width value=$value\n"; pvec($vector); } sub pvec { my $vector = shift; my $bits = unpack("b*", $vector); my $i = 0; my $BASE = 8; print "vector length in bytes: ", length($vector), "\n"; @bytes = unpack("A8" x length($vector), $bits); print "bits are: @bytes\n\n"; } Why does defined() return true on empty arrays and hashes? The short story is that you should probably only use defined on scalars or functions, not on aggregates (arrays and hashes). See "defined" in perlfunc in the 5.004 release or later of Perl for more detail. What happens if I add or remove keys from a hash while iterating over it? (contributed by brian d foy) The easy answer is "Don't do that!" If you iterate through the hash with each(), you can delete the key most recently returned without worrying about it. If you delete or add other keys, the iterator may skip or double up on them since perl may rearrange the hash table. See the entry for "each()" in perlfunc. How do I sort a hash (optionally by value instead of key)? (contributed by brian d foy) To sort a hash, start with the keys. In this example, we give the list of keys to the sort function which then compares them ASCIIbetically (which might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order. my @keys = sort { $a cmp $b } keys %hash; foreach my $key ( @keys ) { printf "%-20s %6d\n", $key, $hash{$key}; } We could get more fancy in the "sort()" block though. 
Instead of comparing the keys, we can compute a value with them and use that value as the comparison. For instance, to make our report order case-insensitive, we use "lc" to lowercase the keys before comparing them: my @keys = sort { lc $a cmp lc $b } keys %hash; Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the computation results. If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value. my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash; From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key. my @keys = sort { $hash{$a} <=> $hash{$b} or "\L$a" cmp "\L$b" } keys %hash; What's the difference between "delete" and "undef" with hashes? Hashes contain pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in %hash, "exists($hash{$key})" will return true. The value for a given key can be "undef", in which case $hash{$key} will be "undef" while "exists $hash{$key}" will return true. This corresponds to ($key, "undef") being in the hash. Pictures help... 
Here's the %hash table:

      keys  values
    +------+------+
    |  a   |  3   |
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

And these conditions hold

    $hash{'a'}                       is true
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is true
    exists $hash{'a'}                is true (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is true

If you now say

    undef $hash{'a'}

your table now reads:

      keys  values
    +------+------+
    |  a   | undef|
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

and these conditions now hold; changes in caps:

    $hash{'a'}                       is FALSE
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is FALSE
    exists $hash{'a'}                is true (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is true

Notice the last two: you have an undef value, but a defined key!

Now, consider this:

    delete $hash{'a'}

your table now reads:

      keys  values
    +------+------+
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

and these conditions now hold; changes in caps:

    $hash{'a'}                       is false
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is false
    exists $hash{'a'}                is FALSE (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is FALSE

See, the whole entry is gone!

Why don't my tied hashes make the defined/exists distinction?

This depends on the tied hash's implementation of EXISTS(). For example, there isn't the concept of undef with hashes that are tied to DBM* files. It also means that exists() and defined() do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes.

How can I get the unique keys from two hashes?

First you extract the keys from the hashes into lists, then solve the "removing duplicates" problem described above.
For example: my %seen = (); for my $element (keys(%foo), keys(%bar)) { $seen{$element}++; } my @uniq = keys %seen; Or more succinctly: my @uniq = keys %{{%foo,%bar}}; Or if you really want to save space: my %seen = (); while (defined ($key = each %foo)) { $seen{$key}++; } while (defined ($key = each %bar)) { $seen{$key}++; } my @uniq = keys %seen; How can I store a multidimensional array in a DBM file? Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module from CPAN and layer it on top of either DB_File or GDBM_File. You might also try DBM::Deep, but it can be a bit slow. Why does passing a subroutine an undefined element in a hash create it? (contributed by brian d foy) Are you using a really old version of Perl? Normally, accessing a hash key's value for a nonexistent key will *not* create the key. my %hash = (); my $value = $hash{ 'foo' }; print "This won't print\n" if exists $hash{ 'foo' }; Passing $hash{ 'foo' } to a subroutine used to be a special case, though. Since you could assign directly to $_[0], Perl had to be ready to make that assignment so it created the hash key ahead of time: my_sub( $hash{ 'foo' } ); print "This will print before 5.004\n" if exists $hash{ 'foo' }; sub my_sub { # $_[0] = 'bar'; # create hash key in case you do this 1; } Since Perl 5.004, however, this situation is a special case and Perl creates the hash key only when you make the assignment: my_sub( $hash{ 'foo' } ); print "This will print, even after 5.004\n" if exists $hash{ 'foo' }; sub my_sub { $_[0] = 'bar'; } However, if you want the old behavior (and think carefully about that because it's a weird side effect), you can pass a hash slice instead. Perl 5.004 didn't make this a special case: my_sub( @hash{ qw/foo/ } ); How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? 
Usually a hash ref, perhaps like this: $record = { NAME => "Jason", EMPNO => 132, TITLE => "deputy peon", AGE => 23, SALARY => 37_000, PALS => [ "Norbert", "Rhys", "Phineas"], }; References are documented in perlref and perlreftut. Examples of complex data structures are given in perldsc and perllol. Examples of structures and object-oriented classes are in perlootut. How can I use a reference as a hash key? (contributed by brian d foy and Ben Morrow) Hash keys are strings, so you can't really use a reference as the key. When you try to do that, perl turns the reference into its stringified form (for instance, "HASH(0xDEADBEEF)"). From there you can't get back the reference from the stringified form, at least without doing some extra work on your own. Remember that the entry in the hash will still be there even if the referenced variable goes out of scope, and that it is entirely possible for Perl to subsequently allocate a different variable at the same address. This will mean a new variable might accidentally be associated with the value for an old. If you have Perl 5.10 or later, and you just want to store a value against the reference for lookup later, you can use the core Hash::Util::Fieldhash module. This will also handle renaming the keys if you use multiple threads (which causes all variables to be reallocated at new addresses, changing their stringification), and garbage-collecting the entries when the referenced variable goes out of scope. If you actually need to be able to get a real reference back from each hash entry, you can use the Tie::RefHash module, which does the required work for you. How can I check if a key exists in a multilevel hash? (contributed by brian d foy) The trick to this problem is avoiding accidental autovivification. 
If you want to check three keys deep, you might naïvely try this:

    my %hash;
    if( exists $hash{key1}{key2}{key3} ) {
        ...;
    }

Even though you started with a completely empty hash, after that call to "exists" you've created the structure you needed to check for "key3":

    %hash = (
        'key1' => {
            'key2' => {}
        }
    );

That's autovivification. You can get around this in a few ways. The easiest way is to just turn it off. The lexical "autovivification" pragma is available on CPAN. Now you don't add to the hash:

    {
        no autovivification;
        my %hash;
        if( exists $hash{key1}{key2}{key3} ) {
            ...;
        }
    }

The Data::Diver module on CPAN can do it for you too. Its "Dive" subroutine can tell you not only if the keys exist but also get the value:

    use Data::Diver qw(Dive);

    my @exists = Dive( \%hash, qw(key1 key2 key3) );
    if( ! @exists ) {
        ...; # keys do not exist
    }
    elsif( ! defined $exists[0] ) {
        ...; # keys exist but value is undef
    }

You can easily do this yourself too by checking each level of the hash before you move onto the next level. This is essentially what Data::Diver does for you:

    if( check_hash( \%hash, qw(key1 key2 key3) ) ) {
        ...;
    }

    sub check_hash {
        my( $hash, @keys ) = @_;

        return unless @keys;

        foreach my $key ( @keys ) {
            return unless eval { exists $hash->{$key} };
            $hash = $hash->{$key};
        }

        return 1;
    }

How can I prevent addition of unwanted keys into a hash?

Since version 5.8.0, hashes can be *restricted* to a fixed number of given keys. Methods for creating and dealing with restricted hashes are exported by the Hash::Util module.

How do I determine whether a scalar is a number/whole/integer/float?
Assuming that you don't care about IEEE notations like "NaN" or "Infinity", you probably just want to use a regular expression (see also perlretut and perlre): use 5.010; if ( /\D/ ) { say "\thas nondigits"; } if ( /^\d+\z/ ) { say "\tis a whole number"; } if ( /^-?\d+\z/ ) { say "\tis an integer"; } if ( /^[+-]?\d+\z/ ) { say "\tis a +/- integer"; } if ( /^-?(?:\d+\.?|\.\d)\d*\z/ ) { say "\tis a real number"; } if ( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i ) { say "\tis a C float" } There are also some commonly used modules for the task. Scalar::Util (distributed with 5.8) provides access to perl's internal function "looks_like_number" for determining whether a variable looks like a number. Data::Types exports functions that validate data types using both the above and other regular expressions. Thirdly, there is Regexp::Common which has regular expressions to match various types of numbers. Those three modules are available from the CPAN. If you're on a POSIX system, Perl supports the "POSIX::strtod" function for converting strings to doubles (and also "POSIX::strtol" for longs). Its semantics are somewhat cumbersome, so here's a "getnum" wrapper function for more convenient access. This function takes a string and returns the number it found, or "undef" for input that isn't a C float. The "is_numeric" function is a front end to "getnum" if you just want to say, "Is this a float?" sub getnum { use POSIX qw(strtod); my $str = shift; $str =~ s/^\s+//; $str =~ s/\s+$//; $! = 0; my($num, $unparsed) = strtod($str); if (($str eq '') || ($unparsed != 0) || $!) { return undef; } else { return $num; } } sub is_numeric { defined getnum($_[0]) } Or you could check out the String::Scanf module on the CPAN instead. How do I define methods for every class/object? (contributed by Ben Morrow) You can use the "UNIVERSAL" class (see UNIVERSAL). 
However, please be very careful to consider the consequences of doing this: adding methods to every object is very likely to have unintended consequences. If possible, it would be better to have all your objects inherit from some common base class, or to use an object system like Moose that supports roles.

How do I verify a credit card checksum?

Get the Business::CreditCard module from CPAN.

How do I pack arrays of doubles or floats for XS code?

The arrays.h/arrays.c code in the PGPLOT module on CPAN does just this. If you're doing a lot of float or double processing, consider using the PDL module from CPAN instead--it makes number-crunching easy.

See <https://metacpan.org/release/PGPLOT> for the code.

Found in /usr/share/perl/5.34/pod/perlfaq5.pod

How do I flush/unbuffer an output filehandle? Why must I do this?

(contributed by brian d foy)

You might like to read Mark Jason Dominus's "Suffering From Buffering" at <http://perl.plover.com/FAQs/Buffering.html>.

Perl normally buffers output so it doesn't make a system call for every bit of output. By saving up output, it makes fewer expensive system calls.

For instance, in this little bit of code, you want to print a dot to the screen for every line you process to watch the progress of your program. Instead of seeing a dot for every line, Perl buffers the output and you have a long wait before you see a row of 50 dots all at once:

    # long wait, then row of dots all at once
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

To get around this, you have to unbuffer the output filehandle, in this case, "STDOUT". You can set the special variable $| to a true value (mnemonic: making your filehandles "piping hot"):

    $|++;

    # dot shown immediately
    while( <> ) {
        print ".";
        print "\n" unless ++$count % 50;

        #... expensive line processing operations
    }

The $| is one of the per-filehandle special variables, so each filehandle has its own copy of its value.
If you want to merge standard output and standard error for instance, you have to unbuffer each (although STDERR might be unbuffered by default): { my $previous_default = select(STDOUT); # save previous default $|++; # autoflush STDOUT select(STDERR); $|++; # autoflush STDERR, to be sure select($previous_default); # restore previous default } # now should alternate . and + while( 1 ) { sleep 1; print STDOUT "."; print STDERR "+"; print STDOUT "\n" unless ++$count % 25; } Besides the $| special variable, you can use "binmode" to give your filehandle a ":unix" layer, which is unbuffered: binmode( STDOUT, ":unix" ); while( 1 ) { sleep 1; print "."; print "\n" unless ++$count % 50; } For more information on output layers, see the entries for "binmode" and open in perlfunc, and the PerlIO module documentation. If you are using IO::Handle or one of its subclasses, you can call the "autoflush" method to change the settings of the filehandle: use IO::Handle; open my( $io_fh ), ">", "output.txt"; $io_fh->autoflush(1); The IO::Handle objects also have a "flush" method. You can flush the buffer any time you want without auto-buffering $io_fh->flush; How do I change, delete, or insert a line in a file, or append to the beginning of a file? (contributed by brian d foy) The basic idea of inserting, changing, or deleting a line from a text file involves reading and printing the file to the point you want to make the change, making the change, then reading and printing the rest of the file. Perl doesn't provide random access to lines (especially since the record input separator, $/, is mutable), although modules such as Tie::File can fake it. 
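As a rough illustration of how Tie::File can fake random access to lines, here is a minimal sketch. It assumes a throwaway file created with File::Temp; the file contents and variable names are made up for the example, and on a large file each of these array operations may rewrite much of the file behind the scenes.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Build a small scratch file to edit (contents are invented for this sketch).
my ( $tmp_fh, $file ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "first\n", "second\n", "third\n";
close $tmp_fh or die "can't close $file: $!";

# Each array element is one line of the file, without its line ending.
tie my @lines, 'Tie::File', $file or die "can't tie $file: $!";

$lines[1] = 'SECOND';               # change the second line
splice @lines, 1, 0, 'inserted';    # insert a new line before it
shift @lines;                       # delete the first line
unshift @lines, '# top of file';    # prepend a line

untie @lines;                       # flush changes and release the file
```

The tied array hides the read-and-rewrite work that the rest of this answer does by hand, which is convenient for small files but not a substitute for understanding the basic approach.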
A Perl program to do these tasks takes the basic form of opening a file, printing its lines, then closing the file:

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    while( <$in> ) {
        print $out $_;
    }

    close $out;

Within that basic form, add the parts that you need to insert, change, or delete lines.

To prepend lines to the beginning, print those lines before you enter the loop that prints the existing lines.

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC

    while( <$in> ) {
        print $out $_;
    }

    close $out;

To change existing lines, insert the code to modify the lines inside the "while" loop. In this case, the code finds every lowercase "perl" and capitalizes it. This happens for every line, so be sure that you're supposed to do that on every line!

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    print $out "# Add this line to the top\n";

    while( <$in> ) {
        s/\b(perl)\b/Perl/g;
        print $out $_;
    }

    close $out;

To change only a particular line, the input line number, $., is useful. First read and print the lines up to the one you want to change. Next, read the single line you want to change, change it, and print it. After that, read the rest of the lines and print those:

    while( <$in> ) { # print the lines before the change
        print $out $_;
        last if $. == 4; # line number before change
    }

    my $line = <$in>;
    $line =~ s/\b(perl)\b/Perl/g;
    print $out $line;

    while( <$in> ) { # print the rest of the lines
        print $out $_;
    }

To skip lines, use the looping controls. The "next" in this example skips comment lines, and the "last" stops all processing once it encounters either "__END__" or "__DATA__".
    while( <$in> ) {
        next if /^\s*#/;             # skip comment lines
        last if /^__(END|DATA)__$/;  # stop at end of code marker
        print $out $_;
    }

Do the same sort of thing to delete a particular line by using "next" to skip the lines you don't want to show up in the output. This example skips every fifth line:

    while( <$in> ) {
        next unless $. % 5;
        print $out $_;
    }

If, for some odd reason, you really want to see the whole file at once rather than processing line-by-line, you can slurp it in (as long as you can fit the whole thing in memory!):

    open my $in,  '<', $file       or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";

    my $content = do { local $/; <$in> }; # slurp!

    # do your magic here

    print $out $content;

Modules such as Path::Tiny and Tie::File can help with that too. If you can, however, avoid reading the entire file at once. Perl won't give that memory back to the operating system until the process finishes.

You can also use Perl one-liners to modify a file in-place. The following changes the first 'Fred' on each line to 'Barney' in inFile.txt (add "/g" to change every occurrence), overwriting the file with the new contents. With the "-p" switch, Perl wraps a "while" loop around the code you specify with "-e", and "-i" turns on in-place editing. The current line is in $_. With "-p", Perl automatically prints the value of $_ at the end of the loop. See perlrun for more details.

    perl -pi -e 's/Fred/Barney/' inFile.txt

To make a backup of "inFile.txt", give "-i" a file extension to add:

    perl -pi.bak -e 's/Fred/Barney/' inFile.txt

To change only the fifth line, you can add a test checking $., the input line number, then only perform the operation when the test passes:

    perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt

To add lines before a certain line, you can add a line (or lines!) before Perl prints $_:

    perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt

You can even add a line to the beginning of a file, since the current line prints at the end of the loop:

    perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt

To insert a line after one already in the file, use the "-n" switch. It's just like "-p" except that it doesn't print $_ at the end of the loop, so you have to do that yourself. In this case, print $_ first, then print the line that you want to add.

    perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt

To delete lines, only print the ones that you want.

    perl -ni -e 'print if /d/' inFile.txt

How do I count the number of lines in a file?

(contributed by brian d foy)

Conceptually, the easiest way to count the lines in a file is to simply read them and count them:

    my $count = 0;
    while( <$fh> ) { $count++; }

You don't really have to count them yourself, though, since Perl already does that with the $. variable, which is the current line number from the last filehandle read:

    1 while( <$fh> );
    my $count = $.;

If you want to use $., you can reduce it to a simple one-liner, like one of these:

    % perl -lne '} print $.; {' file

    % perl -lne 'END { print $. }' file

Those can be rather inefficient though. If they aren't fast enough for you, you might just read chunks of data and count the number of newlines:

    my $lines = 0;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ tr/\n// );
    }
    close $fh;

However, that doesn't work if the line ending isn't a newline. You might change that "tr///" to a "s///" so you can count the number of times the input record separator, $/, shows up:

    my $lines = 0;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ s|$/||g );
    }
    close $fh;

If you don't mind shelling out, the "wc" command is usually the fastest, even with the extra interprocess overhead.
Ensure that you have an untainted filename though:

    #!perl -T

    $ENV{PATH} = undef;

    my $lines;
    if( $filename =~ /^([0-9a-z_.]+)\z/ ) {
        $lines = `/usr/bin/wc -l $1`;
        chomp $lines;
    }

How do I delete the last N lines from a file?

(contributed by brian d foy)

The easiest conceptual solution is to count the lines in the file then start at the beginning and print the number of lines (minus the last N) to a new file.

Most often, the real question is how you can delete the last N lines without making more than one pass over the file, or how to do it without a lot of copying. The easy concept is the hard reality when you might have millions of lines in your file.

One trick is to use File::ReadBackwards, which starts at the end of the file. That module provides an object that wraps the real filehandle to make it easy for you to move around the file. Once you get to the spot you need, you can get the actual filehandle and work with it as normal. In this case, you get the file position at the end of the last line you want to keep and truncate the file to that point:

    use File::ReadBackwards;

    my $filename = 'test.txt';
    my $Lines_to_truncate = 2;

    my $bw = File::ReadBackwards->new( $filename )
        or die "Could not read backwards in [$filename]: $!";

    my $lines_from_end = 0;
    until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
        print "Got: ", $bw->readline;
        $lines_from_end++;
    }

    truncate( $filename, $bw->tell );

The File::ReadBackwards module also has the advantage of setting the input record separator to a regular expression.

You can also use the Tie::File module which lets you access the lines through a tied array. You can use normal array operations to modify your file, including setting the last index and using "splice".

How can I use Perl's "-i" option from within a program?

"-i" sets the value of Perl's $^I variable, which in turn affects the behavior of "<>"; see perlrun for more details.
By modifying the appropriate variables directly, you can get the same behavior within a larger program. For example: # ... { local($^I, @ARGV) = ('.orig', glob("*.c")); while (<>) { if ($. == 1) { print "This line should appear at the top of each file\n"; } s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case print; close ARGV if eof; # Reset $. } } # $^I and @ARGV return to their old values here This block modifies all the ".c" files in the current directory, leaving a backup of the original data from each file in a new ".c.orig" file. How can I copy a file? (contributed by brian d foy) Use the File::Copy module. It comes with Perl and can do a true copy across file systems, and it does its magic in a portable fashion. use File::Copy; copy( $original, $new_copy ) or die "Copy failed: $!"; If you can't use File::Copy, you'll have to do the work yourself: open the original file, open the destination file, then print to the destination file as you read the original. You also have to remember to copy the permissions, owner, and group to the new file. How do I make a temporary file name? If you don't need to know the name of the file, you can use "open()" with "undef" in place of the file name. In Perl 5.8 or later, the "open()" function creates an anonymous temporary file: open my $tmp, '+>', undef or die $!; Otherwise, you can use the File::Temp module. use File::Temp qw/ tempfile tempdir /; my $dir = tempdir( CLEANUP => 1 ); ($fh, $filename) = tempfile( DIR => $dir ); # or if you don't need to know the filename my $fh = tempfile( DIR => $dir ); The File::Temp has been a standard module since Perl 5.6.1. If you don't have a modern enough Perl installed, use the "new_tmpfile" class method from the IO::File module to get a filehandle opened for reading and writing. 
Use it if you don't need to know the file's name: use IO::File; my $fh = IO::File->new_tmpfile() or die "Unable to make new temporary file: $!"; If you're committed to creating a temporary file by hand, use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter: BEGIN { use Fcntl; use File::Spec; my $temp_dir = File::Spec->tmpdir(); my $file_base = sprintf "%d-%d-0000", $$, time; my $base_name = File::Spec->catfile($temp_dir, $file_base); sub temp_file { my $fh; my $count = 0; until( defined(fileno($fh)) || $count++ > 100 ) { $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e; # O_EXCL is required for security reasons. sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT; } if( defined fileno($fh) ) { return ($fh, $base_name); } else { return (); } } } How can I manipulate fixed-record-length files? The most efficient way is using pack() and unpack(). This is faster than using substr() when taking many, many strings. It is slower for just a few. Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, Berkeley-style ps: # sample input line: # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what my $PS_T = 'A6 A4 A7 A5 A*'; open my $ps, '-|', 'ps'; print scalar <$ps>; my @fields = qw( pid tt stat time command ); while (<$ps>) { my %process; @process{@fields} = unpack($PS_T, $_); for my $field ( @fields ) { print "$field: <$process{$field}>\n"; } print 'line=', pack($PS_T, @process{@fields} ), "\n"; } We've used a hash slice in order to easily handle the fields of each row. Storing the keys in an array makes it easy to operate on them as a group or loop over them with "for". It also avoids polluting the program with global variables and using symbolic references. How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles? 
As of perl5.6, open() autovivifies file and directory handles as references if you pass it an uninitialized scalar variable. You can then pass these references just like any other scalar, and use them in the place of named handles. open my $fh, $file_name; open local $fh, $file_name; print $fh "Hello World!\n"; process_file( $fh ); If you like, you can store these filehandles in an array or a hash. If you access them directly, they aren't simple scalars and you need to give "print" a little help by placing the filehandle reference in braces. Perl can only figure it out on its own when the filehandle reference is a simple scalar. my @fhs = ( $fh1, $fh2, $fh3 ); for( $i = 0; $i <= $#fhs; $i++ ) { print {$fhs[$i]} "just another Perl answer, \n"; } Before perl5.6, you had to deal with various typeglob idioms which you may see in older code. open FILE, "> $filename"; process_typeglob( *FILE ); process_reference( \*FILE ); sub process_typeglob { local *FH = shift; print FH "Typeglob!" } sub process_reference { local $fh = shift; print $fh "Reference!" } If you want to create many anonymous handles, you should check out the Symbol or IO::Handle modules. How can I use a filehandle indirectly? An indirect filehandle is the use of something other than a symbol in a place that a filehandle is expected. Here are ways to get indirect filehandles: $fh = SOME_FH; # bareword is strict-subs hostile $fh = "SOME_FH"; # strict-refs hostile; same package only $fh = *SOME_FH; # typeglob $fh = \*SOME_FH; # ref to typeglob (bless-able) $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob Or, you can use the "new" method from one of the IO::* modules to create an anonymous filehandle and store that in a scalar variable. use IO::Handle; # 5.004 or higher my $fh = IO::Handle->new(); Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. 
An indirect filehandle is just a scalar variable that contains a filehandle. Functions like "print", "open", "seek", or the "<FH>" diamond operator will accept either a named filehandle or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    my $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write the function in two ways:

    sub accept_fh {
        my $fh = shift;
        print $fh "Sending to indirect filehandle\n";
    }

Or it can localize a typeglob and use the filehandle directly:

    sub accept_fh {
        local *FH = shift;
        print FH "Sending to localized filehandle\n";
    }

Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.)

    accept_fh(*STDOUT);
    accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not expressions or subscripts of hashes or arrays, can be used with built-ins like "print", "printf", or the diamond operator. Using something other than a simple scalar variable as a filehandle is illegal and won't even compile:

    my @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: ";               # WRONG
    my $got = <$fd[0]>;                     # WRONG
    print $fd[2] "What was that: $got";     # WRONG

With "print" and "printf", you get around this by using a block and an expression where you would place the filehandle:

    print  { $fd[1] } "funny stuff\n";
    printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
    # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places:

    my $ok = -x "/bin/cat";
    print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
    print { $fd[ 1+ ($ok || 0) ] }  "cat stat $ok\n";

This approach of treating "print" and "printf" like object method calls doesn't work for the diamond operator.
That's because it's a real operator, not just a function with a comma-less argument. Assuming you've been storing typeglobs in your structure as we did above, you can use the built-in function named "readline" to read a record just as "<>" does. Given the initialization shown above for @fd, this would work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet. $got = readline($fd[0]); Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help you at all here. How can I open a filehandle to a string? (contributed by Peter J. Holzer, hjp-usenet2 AT hjp.at) Since Perl 5.8.0 a file handle referring to a string can be created by calling open with a reference to that string instead of the filename. This file handle can then be used to read from or write to the string: open(my $fh, '>', \$string) or die "Could not open string for writing"; print $fh "foo\n"; print $fh "bar\n"; # $string now contains "foo\nbar\n" open(my $fh, '<', \$string) or die "Could not open string for reading"; my $x = <$fh>; # $x now contains "foo\n" With older versions of Perl, the IO::String module provides similar functionality. How can I set up a footer format to be used with write()? There's no builtin way to do this, but perlform has a couple of techniques to make it possible for the intrepid hacker. How can I translate tildes (~) in a filename? Use the <> ("glob()") operator, documented in perlfunc. Versions of Perl older than 5.6 require that you have a shell installed that groks tildes. Later versions of Perl have this feature built in. The File::KGlob module (available from CPAN) gives more portable glob functionality. 
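For the built-in route, a minimal sketch might look like the following. The "expand_tilde" helper is a hypothetical name, not something Perl provides, and it assumes a Perl of 5.6 or later so that "glob()" handles tildes without an external shell. Only the leading "~" or "~user" part is handed to "glob()", so wildcard characters later in the name are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a leading ~ or ~user with the built-in glob().
# Returns the name unchanged when it does not start with a tilde.
sub expand_tilde {
    my ($name) = @_;
    return $name unless $name =~ m{^(~[^/]*)(.*)\z}s;
    my ( $tilde, $rest ) = ( $1, $2 );
    my ($home) = glob($tilde);    # list context: take the first expansion
    return defined $home ? $home . $rest : $name;
}

print expand_tilde('~/notes.txt'), "\n";
```

Because "glob()" splits its argument on whitespace, this sketch is not suitable for user names containing spaces.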
Within Perl, you may use this directly:

    $filename =~ s{
      ^ ~             # find a leading tilde
      (               # save this in $1
          [^/]        # a non-slash character
                *     # repeated 0 or more times (0 means me)
      )
    }{
      $1
          ? (getpwnam($1))[7]
          : ( $ENV{HOME} || $ENV{LOGDIR} )
    }ex;

How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file *then* gives you read-write access:

    open my $fh, '+>', '/path/name'; # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file doesn't exist:

    open my $fh, '+<', '/path/name'; # open for update

Using ">" always clobbers or creates. Using "<" never does either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using "sysopen" all assume that you've pulled in the constants from Fcntl:

    use Fcntl;

To open file for reading:

    open my $fh, '<', $path or die $!;
    sysopen my $fh, $path, O_RDONLY or die $!;

To open file for writing, create new file if needed or else truncate old file:

    open my $fh, '>', $path or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!;
    sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!;

To open file for writing, create new file, file must not exist:

    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!;
    sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!;

To open file for appending, create if necessary:

    open my $fh, '>>', $path or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!;
    sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!;

To open file for appending, file must exist:

    sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!;

To open file for update, file must exist:

    open my $fh, '+<', $path or die $!;
    sysopen my $fh, $path, O_RDWR or die $!;

To open file for update, create file if necessary:

    sysopen my $fh, $path, O_RDWR|O_CREAT or die $!;
    sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!;

To open file for update, file must not exist:

    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!;
    sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!;

To open a file without blocking, creating if necessary:

    sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT
        or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't as exclusive as you might wish.

See also perlopentut.

How can I open a file named with a leading ">" or trailing blanks?

(contributed by Brian McCauley)

The special two-argument form of Perl's open() function ignores trailing blanks in filenames and infers the mode from certain leading characters (or a trailing "|"). In older versions of Perl this was the only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two-argument form you should use the three-argument form of open() which does not treat any characters in the filename as special.

    open my $fh, "<", " file ";  # filename is " file "
    open my $fh, ">", ">file";   # filename is ">file"

How can I reliably rename a file?

If your operating system supports a proper mv(1) utility or its functional equivalent, this works:

    rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the File::Copy module instead. You just copy the file to the new name (checking return values), then delete the old one. This isn't really the same semantically as a "rename()", which preserves meta-information like permissions, timestamps, inode info, etc.

How can I lock a file?

Perl's builtin flock() function (see perlfunc for details) will call flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and later), and lockf(3) if neither of the two previous system calls exists. On some systems, it may even use a different form of native locking.
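A call to flock() might look like the sketch below. It assumes a scratch file from File::Temp so that it runs anywhere (substitute your own path), and it uses LOCK_NB only to show the non-blocking variant; a plain LOCK_EX would wait instead.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Scratch file so the sketch is self-contained; not part of the technique.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh or die "can't close $path: $!";

open my $fh, '+<', $path or die "can't open $path: $!";

# LOCK_EX alone blocks until the exclusive lock is granted; adding LOCK_NB
# makes flock return false at once if another process holds the lock.
my $locked = flock $fh, LOCK_EX | LOCK_NB;
die "can't lock $path: $!" unless $locked;

print {$fh} "only the lock holder writes here\n";

# Closing the handle flushes the buffer and releases the lock.
close $fh or die "can't close $path: $!";
```

Readers that only need a consistent view can ask for LOCK_SH instead, which several processes may hold at the same time.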
Here are some gotchas with Perl's flock(): 1 Produces a fatal error if none of the three system calls (or their close equivalent) exists. 2 lockf(3) does not provide shared locking, and requires that the filehandle be open for writing (or appending, or read/writing). 3 Some versions of flock() can't lock files over a network (e.g. on NFS file systems), so you'd need to force the use of fcntl(2) when you build Perl. But even this is dubious at best. See the flock entry of perlfunc and the INSTALL file in the source distribution for information on building Perl to do this. Two potentially non-obvious but traditional flock semantics are that it waits indefinitely until the lock is granted, and that its locks are *merely advisory*. Such discretionary locks are more flexible, but offer fewer guarantees. This means that files locked with flock() may be modified by programs that do not also use flock(). Cars that stop for red lights get on well with each other, but not with cars that don't stop for red lights. See the perlport manpage, your port's specific documentation, or your system-specific local manpages for details. It's best to assume traditional behavior if you're writing portable programs. (If you're not, you should as always feel perfectly free to write for your own system's idiosyncrasies (sometimes called "features"). Slavish adherence to portability concerns shouldn't get in the way of your getting your job done.) For more information on file locking, see also "File Locking" in perlopentut if you have it (new for 5.6). Why can't I just open(FH, ">file.lock")? A common bit of code NOT TO USE is this: sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE open my $lock, '>', 'file.lock'; # THIS BROKEN CODE This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. 
In theory, this "ought" to work: sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT or die "can't open file.lock: $!"; except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. Various schemes involving link() have been suggested, but these tend to involve busy-wait, which is also less than desirable. I still don't get locking. I just want to increment the number in the file. How can I do this? Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic. Anyway, this is what you can do if you can't help yourself. use Fcntl qw(:DEFAULT :flock); sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!"; flock $fh, LOCK_EX or die "can't flock numfile: $!"; my $num = <$fh> || 0; seek $fh, 0, 0 or die "can't rewind numfile: $!"; truncate $fh, 0 or die "can't truncate numfile: $!"; (print $fh $num+1, "\n") or die "can't write numfile: $!"; close $fh or die "can't close numfile: $!"; Here's a much better web-page hit counter: $hits = int( (time() - 850_000_000) / rand(1_000) ); If the count doesn't impress your friends, then the code might. :-) All I want to do is append a small amount of text to the end of a file. Do I still have to use locking? If you are on a system that correctly implements "flock" and you use the example appending code from "perldoc -f flock" everything will be OK even if the OS you are on doesn't implement append mode correctly (if such a system exists). So if you are happy to restrict yourself to OSs that implement "flock" (and that's not really much of a restriction) then that is what you should do. If you know you are only going to use a system that does correctly implement appending (i.e. not Win32) then you can omit the "seek" from the code in the previous answer. 
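The locked-append approach can be sketched as follows. This is modelled on the appending example in "perldoc -f flock" rather than copied from it, and the "append_locked" name is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: append lines to a file while holding an exclusive lock.
sub append_locked {
    my ( $path, @lines ) = @_;

    open my $fh, '>>', $path or die "can't open $path for appending: $!";
    flock $fh, LOCK_EX or die "can't lock $path: $!";

    # Another process may have appended while we waited for the lock,
    # so move to the current end of the file before writing.
    seek $fh, 0, SEEK_END or die "can't seek in $path: $!";

    print {$fh} @lines or die "can't write to $path: $!";

    # close flushes the buffer and then releases the lock.
    close $fh or die "can't close $path: $!";
    return 1;
}
```

A call such as append_locked( 'app.log', "started\n" ) then behaves the same whether or not the platform implements append mode correctly, at the cost of one extra "seek".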
If you know you are only writing code to run on an OS and filesystem that does implement append mode correctly (a local filesystem on a modern Unix for example), and you keep the file in block-buffered mode and you write less than one buffer-full of output between each manual flushing of the buffer, then each bufferload is almost guaranteed to be written to the end of the file in one chunk without getting intermingled with anyone else's output. You can also use the "syswrite" function which is simply a wrapper around your system's write(2) system call.

There is still a small theoretical chance that a signal will interrupt the system-level "write()" operation before completion. There is also a possibility that some STDIO implementations may call multiple system level "write()"s even if the buffer was empty to start. There may be some systems where this probability is reduced to zero, and this is not a concern when using ":perlio" instead of your system's STDIO.

How do I randomly update a binary file?

If you're just trying to patch a binary, in many cases something as simple as this works:

    perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more like this:

    my $RECSIZE = 220; # size of record, in bytes
    my $recno   = 37;  # which record to update
    my $record;
    open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!";
    seek $fh, $recno * $RECSIZE, 0;
    # parenthesize read's arguments so that == tests its return value
    read($fh, $record, $RECSIZE) == $RECSIZE or die "can't read record $recno: $!";
    # munge the record
    seek $fh, -$RECSIZE, 1;
    print $fh $record;
    close $fh;

Locking and error checking are left as an exercise for the reader. Don't forget them or you'll be quite sorry.

How do I get a file's timestamp in perl?

If you want to retrieve the time at which the file was last read, written, or had its meta-data (owner, etc) changed, you use the -A, -M, or -C file test operations as documented in perlfunc.
These retrieve the age of the file (measured against the start-time of your program) in days as a floating point number. Some platforms may not have all of these times. See perlport for details. To retrieve the "raw" time in seconds since the epoch, you would call the stat function, then use "localtime()", "gmtime()", or "POSIX::strftime()" to convert this into human-readable form. Here's an example: my $write_secs = (stat($file))[9]; printf "file %s updated at %s\n", $file, scalar localtime($write_secs); If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later): # error checking left as an exercise for reader. use File::stat; use Time::localtime; my $date_string = ctime(stat($file)->mtime); print "file $file updated at $date_string\n"; The POSIX::strftime() approach has the benefit of being, in theory, independent of the current locale. See perllocale for details. How do I set a file's timestamp in perl? You use the utime() function documented in "utime" in perlfunc. By way of example, here's a little program that copies the read and write times from its first argument to all the rest of them. if (@ARGV < 2) { die "usage: cptimes timestamp_file other_files ...\n"; } my $timestamp = shift; my($atime, $mtime) = (stat($timestamp))[8,9]; utime $atime, $mtime, @ARGV; Error checking is, as usual, left as an exercise for the reader. The perldoc for utime also has an example that has the same effect as touch(1) on files that *already exist*. Certain file systems have a limited ability to store the times on a file at the expected level of precision. For example, the FAT and HPFS filesystem are unable to create dates on files with a finer granularity than two seconds. This is a limitation of the filesystems, not of utime(). How do I print to more than one file at once? To connect one filehandle to several output filehandles, you can use the IO::Tee or Tie::FileHandle::Multiplex modules. 
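To illustrate the idea behind such modules, here is a minimal sketch of a tied handle that forwards each print to several real filehandles. This is not how IO::Tee or Tie::FileHandle::Multiplex are actually implemented, and the package name My::Tee is made up for this example; in-memory filehandles stand in for real files.

```perl
use strict;
use warnings;

# My::Tee is a hypothetical class, for illustration only.
package My::Tee;

sub TIEHANDLE {
    my ($class, @handles) = @_;
    return bless [@handles], $class;
}

# Called for every print on the tied handle; forwards to each target.
sub PRINT {
    my ($self, @items) = @_;
    my $ok = 1;
    for my $fh (@$self) {
        print {$fh} @items or $ok = 0;
    }
    return $ok;
}

package main;

# In-memory filehandles stand in for real files here.
open my $fh1, '>', \my $buf1 or die "can't open in-memory file: $!";
open my $fh2, '>', \my $buf2 or die "can't open in-memory file: $!";

tie *TEE, 'My::Tee', $fh1, $fh2;
print TEE "whatever\n";    # lands in both $buf1 and $buf2
untie *TEE;

close $fh1 or die "can't close first handle: $!";
close $fh2 or die "can't close second handle: $!";
```

A real module would also handle printf, write, and close on the tied handle; this sketch covers print only.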
If you only have to do this once, you can print individually to each filehandle. for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" } How can I read in an entire file all at once? The customary Perl approach for processing all the lines in a file is to do so one line at a time: open my $input, '<', $file or die "can't open $file: $!"; while (<$input>) { chomp; # do something with $_ } close $input or die "can't close $file: $!"; This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often--if not almost always--the wrong approach. Whenever you see someone do this: my @lines = <INPUT>; You should think long and hard about why you need everything loaded at once. It's just not a scalable solution. If you "mmap" the file with the File::Map module from CPAN, you can virtually load the entire file into a string without actually storing it in memory: use File::Map qw(map_file); map_file my $string, $filename; Once mapped, you can treat $string as you would any other string. Since you don't necessarily have to load the data, mmap-ing can be very fast and may not increase your memory footprint. You might also find it more fun to use the standard Tie::File module, or the DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file. If you want to load the entire file, you can use the Path::Tiny module to do it in one simple and efficient step: use Path::Tiny; my $all_of_it = path($filename)->slurp; # entire file in scalar my @all_lines = path($filename)->lines; # one line per element Or you can read the entire file contents into a scalar like this: my $var; { local $/; open my $fh, '<', $file or die "can't open $file: $!"; $var = <$fh>; } That temporarily undefs your record separator, and will automatically close the file at block exit. 
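The same idiom can be wrapped in a small function. The sketch below assumes a made-up helper name, slurp_file; it is only an illustration of the "local $/" technique, not a replacement for Path::Tiny:

```perl
use strict;
use warnings;

# slurp_file() is a hypothetical helper, shown only to illustrate the idiom.
sub slurp_file {
    my ($file) = @_;
    local $/;    # undef record separator: one read returns the whole file
    open my $fh, '<', $file or die "can't open $file: $!";
    my $content = <$fh>;
    close $fh or die "can't close $file: $!";
    return $content;
}
```

Because "local" restores $/ when the function returns, line-by-line reads elsewhere in the program are unaffected.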
If the file is already open, just use this:

    my $var = do { local $/; <$fh> };

You can also use a localized @ARGV to eliminate the "open":

    my $var = do { local( @ARGV, $/ ) = $file; <> };

For ordinary files you can also use the "read" function.

    read( $fh, $var, -s $fh );

That third argument tests the byte size of the data on the $fh filehandle and reads that many bytes into the buffer $var.

How can I read in a file by paragraphs?

Use the $/ variable (see perlvar for details). You can either set it to "" to eliminate empty paragraphs ("abc\n\n\n\ndef", for instance, gets treated as two paragraphs and not three), or "\n\n" to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus "fred\n \nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.

How can I read a single character from a file? From the keyboard?

You can use the builtin "getc()" function for most filehandles, but it won't (easily) work on a terminal device. For STDIN, either use the Term::ReadKey module from CPAN or use the sample code in "getc" in perlfunc.

If your system supports the portable operating system programming interface (POSIX), you can use the following code, which you'll note turns off echo processing as well.

    #!/usr/bin/perl -w
    use strict;
    $| = 1;
    for (1..4) {
        print "gimme: ";
        my $got = getone();
        print "--> $got\n";
    }
    exit;

    BEGIN {
        use POSIX qw(:termios_h);

        my ($term, $oterm, $echo, $noecho, $fd_stdin);

        $fd_stdin = fileno(STDIN);

        $term = POSIX::Termios->new();
        $term->getattr($fd_stdin);
        $oterm = $term->getlflag();

        $echo   = ECHO | ECHOK | ICANON;
        $noecho = $oterm & ~$echo;

        sub cbreak {
            $term->setlflag($noecho);
            $term->setcc(VTIME, 1);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub cooked {
            $term->setlflag($oterm);
            $term->setcc(VTIME, 0);
            $term->setattr($fd_stdin, TCSANOW);
        }

        sub getone {
            my $key = '';
            cbreak();
            sysread(STDIN, $key, 1);
            cooked();
            return $key;
        }
    }

    END { cooked() }

The Term::ReadKey module from CPAN may be easier to use.
Recent versions also include support for non-portable systems.

    use Term::ReadKey;
    open my $tty, '<', '/dev/tty' or die "can't open /dev/tty: $!";
    print "Gimme a char: ";
    ReadMode "raw";
    my $key = ReadKey 0, $tty;
    ReadMode "normal";
    printf "\nYou said %s, char number %03d\n", $key, ord $key;

How can I tell whether there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey extension from CPAN. As we mentioned earlier, it now even has limited support for non-portable (read: not open systems, closed, proprietary, not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. It's very system-dependent. Here's one solution that works on BSD systems:

    sub key_ready {
        my($rin, $nfd);
        vec($rin, fileno(STDIN), 1) = 1;
        return $nfd = select($rin,undef,undef,0);
    }

If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The *h2ph* tool that comes with Perl tries to convert C include files to Perl code, which can be "require"d. FIONREAD ends up defined as a function in the *sys/ioctl.ph* file:

    require './sys/ioctl.ph';

    $size = pack("L", 0);
    ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
    $size = unpack("L", $size);

If *h2ph* wasn't installed or doesn't work for you, you can *grep* the include files by hand:

    % grep FIONREAD /usr/include/*/*
    /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

    % cat > fionread.c
    #include <stdio.h>
    #include <sys/ioctl.h>
    int main(void) {
        printf("%#08x\n", FIONREAD);
        return 0;
    }
    ^D
    % cc -o fionread fionread.c
    % ./fionread
    0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.
$FIONREAD = 0x4004667f; # XXX: opsys dependent $size = pack("L", 0); ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n"; $size = unpack("L", $size); FIONREAD requires a filehandle connected to a stream, meaning that sockets, pipes, and tty devices work, but *not* files. How do I do a "tail -f" in perl? First try seek($gw_fh, 0, 1); The statement "seek($gw_fh, 0, 1)" doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next "<$gw_fh>" makes Perl try again to read something. If that doesn't work (it relies on features of your stdio implementation), then you need something more like this: for (;;) { for ($curpos = tell($gw_fh); <$gw_fh>; $curpos =tell($gw_fh)) { # search for some stuff and put it into files } # sleep for a while seek($gw_fh, $curpos, 0); # seek to where we had been } If this still doesn't work, look into the "clearerr" method from IO::Handle, which resets the error and end-of-file states on the handle. There's also a File::Tail module from CPAN. How do I dup() a filehandle in Perl? If you check "open" in perlfunc, you'll see that several of the ways to call open() should do the trick. For example: open my $log, '>>', '/foo/logfile'; open STDERR, '>&', $log; Or even with a literal numeric descriptor: my $fd = $ENV{MHCONTEXTFD}; open $mhcontext, "<&=$fd"; # like fdopen(3S) Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That means if you close an aliased handle, all aliases become inaccessible. This is not true with a copied one. Error checking, as always, has been left as an exercise for the reader. How do I close a file descriptor by number? 
If, for some reason, you have a file descriptor instead of a filehandle (perhaps you used "POSIX::open"), you can use the "close()" function from the POSIX module:

    use POSIX ();

    POSIX::close( $fd );

This should rarely be necessary, as the Perl "close()" function is to be used for things that Perl opened itself, even if it was a dup of a numeric descriptor as with "MHCONTEXT" above. But if you really have to, you may be able to do this:

    require './sys/syscall.ph';
    my $rc = syscall(SYS_close(), $fd + 0); # must force numeric
    die "can't sysclose $fd: $!" if $rc == -1; # syscall returns -1 on failure

Or, just use the fdopen(3S) feature of "open()":

    {
        open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
        close $fh;
    }

Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

Whoops! You just put a tab and a formfeed into that filename! Remember that within double quoted strings ("like\this"), the backslash is an escape character. The full list of these is in "Quote and Quote-like Operators" in perlop. Unsurprisingly, you don't have a file called "c:(tab)emp(formfeed)oo" or "c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes. Since all DOS and Windows versions since something like MS-DOS 2.0 or so have treated "/" and "\" the same in a path, you might as well use the one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++, awk, Tcl, Java, or Python, just to mention a few. POSIX paths are more portable, too.

Why doesn't glob("*.*") get all the files?

Because even on non-Unix ports, Perl's glob function follows standard Unix globbing semantics. You'll need "glob("*")" to get all (non-hidden) files. This makes glob() portable even to legacy systems. Your port may include proprietary globbing functions as well. Check its documentation for details.

Why does Perl let me delete read-only files? Why does "-i" clobber protected files? Isn't this a bug in Perl?
This is elaborately and painstakingly described in the file-dir-perms article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> . The executive summary: learn how your filesystem works. The permissions on a file say what can happen to the data in that file. The permissions on a directory say what can happen to the list of files in that directory. If you delete a file, you're removing its name from the directory (so the operation depends on the permissions of the directory, not of the file). If you try to write to the file, the permissions of the file govern whether you're allowed to. How do I select a random line from a file? Short of loading the file into a database or pre-indexing the lines in the file, there are a couple of things that you can do. Here's a reservoir-sampling algorithm from the Camel Book: srand; rand($.) < 1 && ($line = $_) while <>; This has a significant advantage in space over reading the whole file in. You can find a proof of this method in *The Art of Computer Programming*, Volume 2, Section 3.4.2, by Donald E. Knuth. You can use the File::Random module which provides a function for that algorithm: use File::Random qw/random_line/; my $line = random_line($filename); Another way is to use the Tie::File module, which treats the entire file as an array. Simply access a random array element. Why do I get weird spaces when I print an array of lines? (contributed by brian d foy) If you are seeing spaces between the elements of your array when you print the array, you are probably interpolating the array in double quotes: my @animals = qw(camel llama alpaca vicuna); print "animals are: @animals\n"; It's the double quotes, not the "print", doing this. 
Whenever you interpolate an array in a double quote context, Perl joins the elements with spaces (or whatever is in $", which is a space by default): animals are: camel llama alpaca vicuna This is different than printing the array without the interpolation: my @animals = qw(camel llama alpaca vicuna); print "animals are: ", @animals, "\n"; Now the output doesn't have the spaces between the elements because the elements of @animals simply become part of the list to "print": animals are: camelllamaalpacavicuna You might notice this when each of the elements of @array end with a newline. You expect to print one element per line, but notice that every line after the first is indented: this is a line this is another line this is the third line That extra space comes from the interpolation of the array. If you don't want to put anything between your array elements, don't use the array in double quotes. You can send it to print without them: print @lines; Found in /usr/share/perl/5.34/pod/perlfaq6.pod How can I pull out lines between two patterns that are themselves on different lines? You can use Perl's somewhat exotic ".." operator (documented in perlop): perl -ne 'print if /START/ .. /END/' file1 file2 ... If you wanted text and not lines, you would use perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... But if you want nested occurrences of "START" through "END", you'll run up against the problem described in the question in this section on matching balanced text. Here's another example of using "..": while (<>) { my $in_header = 1 .. /^$/; my $in_body = /^$/ .. eof; # now choose between them } continue { $. = 0 if eof; # fix $. } How can I match a locale-smart version of "/[a-zA-Z]/"? You can use the POSIX character class syntax "/[[:alpha:]]/" documented in perlre. No matter which locale you are in, the alphabetic characters are the characters in \w without the digits and the underscore. As a regex, that looks like "/[^\W\d_]/". 
Its complement, the non-alphabetics, is then everything in \W along with the digits and the underscore, or "/[\W\d_]/".

What is "/o" really for?

(contributed by brian d foy)

The "/o" option for regular expressions (documented in perlop and perlreref) tells Perl to compile the regular expression only once. This is only useful when the pattern contains a variable. Perls 5.6 and later handle this automatically if the pattern does not change.

Since the match operator "m//", the substitution operator "s///", and the regular expression quoting operator "qr//" are double-quotish constructs, you can interpolate variables into the pattern. See the answer to "How can I quote a variable to use in a regex?" for more details.

This example takes a regular expression from the argument list and prints the lines of input that match it:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/;
    }

Versions of Perl prior to 5.6 would recompile the regular expression for each iteration, even if $pattern had not changed. The "/o" would prevent this by telling Perl to compile the pattern the first time, then reuse that for subsequent iterations:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/o; # useful for Perl < 5.6
    }

In versions 5.6 and later, Perl won't recompile the regular expression if the variable hasn't changed, so you probably don't need the "/o" option. It doesn't hurt, but it doesn't help either. If you want any version of Perl to compile the regular expression only once even if the variable changes (thus, only using its initial value), you still need the "/o".

You can watch Perl's regular expression engine at work to verify for yourself if Perl is recompiling a regular expression. The "use re 'debug'" pragma (comes with Perl 5.005 and later) shows the details. With Perls before 5.6, you should see "re" reporting that it's compiling the regular expression on each iteration. With Perl 5.6 or later, you should only see "re" report that for the first iteration.
use re 'debug'; my $regex = 'Perl'; foreach ( qw(Perl Java Ruby Python) ) { print STDERR "-" x 73, "\n"; print STDERR "Trying $_...\n"; print STDERR "\t$_ is good!\n" if m/$regex/; } How do I use a regular expression to strip C-style comments from a file? While this actually can be done, it's much harder than you'd think. For example, this one-liner perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis. $/ = undef; $_ = <>; s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse; print; This could, of course, be more legibly written with the "/x" modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis. s{ /\* ## Start of /* ... */ comment [^*]*\*+ ## Non-* followed by 1-or-more *'s ( [^/*][^*]*\*+ )* ## 0-or-more things which don't start with / ## but do end with '*' / ## End of /* ... */ comment | ## OR various things which aren't comments: ( " ## Start of " ... " string ( \\. ## Escaped char | ## OR [^"\\] ## Non "\ )* " ## End of " ... " string | ## OR ' ## Start of ' ... ' string ( \\. ## Escaped char | ## OR [^'\\] ## Non '\ )* ' ## End of ' ... ' string | ## OR . ## Anything other char [^/"'\\]* ## Chars which doesn't start a comment, string or escape ) }{defined $2 ? $2 : ""}gxse; A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character: s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse; How can I print out a word-frequency or line-frequency summary? To do this, you have to parse out each word in the input stream. 
We'll pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question: my (%seen); while (<>) { while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" $seen{$1}++; } } while ( my ($word, $count) = each %seen ) { print "$count $word\n"; } If you wanted to do the same thing for lines, you wouldn't need a regular expression: my (%seen); while (<>) { $seen{$_}++; } while ( my ($line, $count) = each %seen ) { print "$count $line"; } If you want these output in a sorted order, see perlfaq4: "How do I sort a hash (optionally by value instead of key)?". How do I efficiently match many regular expressions at once? (contributed by brian d foy) You want to avoid compiling a regular expression every time you want to match it. In this example, perl must recompile the regular expression for every iteration of the "foreach" loop since $pattern can change: my @patterns = qw( fo+ ba[rz] ); LINE: while( my $line = <> ) { foreach my $pattern ( @patterns ) { if( $line =~ m/\b$pattern\b/i ) { print $line; next LINE; } } } The "qr//" operator compiles a regular expression, but doesn't apply it. When you use the pre-compiled version of the regex, perl does less work. In this example, I inserted a "map" to turn each pattern into its pre-compiled form. The rest of the script is the same, but faster: my @patterns = map { qr/\b$_\b/i } qw( fo+ ba[rz] ); LINE: while( my $line = <> ) { foreach my $pattern ( @patterns ) { if( $line =~ m/$pattern/ ) { print $line; next LINE; } } } In some cases, you may be able to make several patterns into a single regular expression. Beware of situations that require backtracking though. 
In this example, the regex is only compiled once because $regex doesn't change between iterations: my $regex = join '|', qw( fo+ ba[rz] ); while( my $line = <> ) { print if $line =~ m/\b(?:$regex)\b/i; } The function "list2re" in Data::Munge on CPAN can also be used to form a single regex that matches a list of literal strings (not regexes). For more details on regular expression efficiency, see *Mastering Regular Expressions* by Jeffrey Friedl. He explains how the regular expressions engine works and why some patterns are surprisingly inefficient. Once you understand how perl applies regular expressions, you can tune them for individual situations. Why don't word-boundary searches with "\b" work for me? (contributed by brian d foy) Ensure that you know what \b really does: it's the boundary between a word character, \w, and something that isn't a word character. That thing that isn't a word character might be \W, but it can also be the start or end of the string. It's not (not!) the boundary between whitespace and non-whitespace, and it's not the stuff between words we use to create sentences. In regex speak, a word boundary (\b) is a "zero width assertion", meaning that it doesn't represent a character in the string, but a condition at a certain position. For the regular expression, /\bPerl\b/, there has to be a word boundary before the "P" and after the "l". As long as something other than a word character precedes the "P" and succeeds the "l", the pattern will match. These strings match /\bPerl\b/. "Perl" # no word char before "P" or after "l" "Perl " # same as previous (space is not a word char) "'Perl'" # the "'" char is not a word char "Perl's" # no word char before "P", non-word char after "l" These strings do not match /\bPerl\b/. "Perl_" # "_" is a word char! "Perler" # no word char before "P", but one after "l" You don't have to use \b to match words though. You can look for non-word characters surrounded by word characters. 
These strings match the pattern /\b'\b/.

    "don't"   # the "'" char is surrounded by "n" and "t"
    "qep'a'"  # the "'" char is surrounded by "p" and "a"

These strings do not match /\b'\b/.

    "foo'"    # there is no word char after non-word "'"

You can also use the complement of \b, \B, to specify that there should not be a word boundary.

In the pattern /\Bam\B/, there must be a word character before the "a" and after the "m". These strings match /\Bam\B/:

    "llama"   # "am" surrounded by word chars
    "Samuel"  # same

These strings do not match /\Bam\B/:

    "Sam"      # no word boundary before "a", but one after "m"
    "I am Sam" # "am" surrounded by non-word chars

Are Perl regexes DFAs or NFAs? Are they POSIX compliant?

While it's true that Perl's regular expressions resemble the DFAs (deterministic finite automata) of the egrep(1) program, they are in fact implemented as NFAs (non-deterministic finite automata) to allow backtracking and backreferencing. And they aren't POSIX-style either, because those guarantee worst-case behavior for all cases. (It seems that some people prefer guarantees of consistency, even when what's guaranteed is slowness.) See the book "Mastering Regular Expressions" (from O'Reilly) by Jeffrey Friedl for all the details you could ever hope to know on these matters (a full citation appears in perlfaq2).

Found in /usr/share/perl/5.34/pod/perlfaq7.pod

Can I get a BNF/yacc/RE for the Perl language?

There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if you're particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into toke.c as well.

In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors."

Why do Perl operators have different precedence than C operators?

Actually, they don't. All C operators that Perl copies have the same precedence in Perl as they do in C.
The problem is with operators that C doesn't have, especially functions that give a list context to everything on their right, e.g. print, chmod, exec, and so on. Such functions are called "list operators" and appear as such in the precedence table in perlop.

A common mistake is to write:

    unlink $file || die "snafu";

This gets interpreted as:

    unlink ($file || die "snafu");

To avoid this problem, either put in extra parentheses or use the super low precedence "or" operator:

    (unlink $file) || die "snafu";
    unlink $file or die "snafu";

The "English" operators ("and", "or", "xor", and "not") deliberately have precedence lower than that of list operators for just such situations as the one above.

Another operator with surprising precedence is exponentiation. It binds more tightly even than unary minus, making "-2**2" produce a negative four and not a positive one. It is also right-associating, meaning that "2**3**2" is two raised to the ninth power, not eight squared.

Although it has the same precedence as in C, Perl's "?:" operator produces an lvalue. This assigns $x to either $if_true or $if_false, depending on the trueness of $maybe:

    ($maybe ? $if_true : $if_false) = $x;

How can I tell if a variable is tainted?

You can use the tainted() function of the Scalar::Util module, available from CPAN (or included with Perl since release 5.8.0). See also "Laundering and Detecting Tainted Data" in perlsec.

How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?

You need to pass references to these objects. See "Pass by Reference" in perlsub for this particular question, and perlref for information on references.

Passing Variables and Functions

Regular variables and functions are quite easy to pass: just pass in a reference to an existing or anonymous variable or function:

    func( \$some_scalar );

    func( \@some_array );
    func( [ 1 .. 10 ] );

    func( \%some_hash );
    func( { this => 10, that => 20 } );

    func( \&some_func );
    func( sub { $_[0] ** $_[1] } );

Passing Filehandles

As of Perl 5.6, you can represent filehandles with scalar variables which you treat as any other scalar.

    open my $fh, '<', $filename or die "Cannot open $filename! $!";
    func( $fh );

    sub func {
        my $passed_fh = shift;

        my $line = <$passed_fh>;
    }

Before Perl 5.6, you had to use the *FH or "\*FH" notations. These are "typeglobs"--see "Typeglobs and Filehandles" in perldata and especially "Pass by Reference" in perlsub for more information.

Passing Regexes

Here's an example of how to pass in a string and a regular expression for it to match against. You construct the pattern with the "qr//" operator:

    sub compare {
        my ($val1, $regex) = @_;
        my $retval = $val1 =~ /$regex/;
        return $retval;
    }
    $match = compare("old McDonald", qr/d.*D/i);

Passing Methods

To pass an object method into a subroutine, you can do this:

    call_a_lot(10, $some_obj, "methname");
    sub call_a_lot {
        my ($count, $widget, $trick) = @_;
        for (my $i = 0; $i < $count; $i++) {
            $widget->$trick();
        }
    }

Or, you can use a closure to bundle up the object, its method call, and arguments:

    my $whatnot = sub { $some_obj->obfuscate(@args) };
    func($whatnot);
    sub func {
        my $code = shift;
        &$code();
    }

You could also investigate the can() method in the UNIVERSAL class (part of the standard perl distribution).

What's the difference between dynamic and lexical (static) scoping? Between local() and my()?

"local($x)" saves away the old value of the global variable $x and assigns a new value for the duration of the subroutine *which is visible in other functions called from that subroutine*. This is done at run-time, so is called dynamic scoping. local() always affects global variables, also called package variables or dynamic variables. "my($x)" creates a new variable that is only visible in the current subroutine. This is done at compile-time, so it is called lexical or static scoping.
my() always affects private variables, also called lexical variables or (improperly) static(ly scoped) variables. For instance: sub visible { print "var has value $var\n"; } sub dynamic { local $var = 'local'; # new temporary value for the still-global visible(); # variable called $var } sub lexical { my $var = 'private'; # new private variable, $var visible(); # (invisible outside of sub scope) } $var = 'global'; visible(); # prints global dynamic(); # prints local lexical(); # prints global Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() function, and it is hidden from the called subroutine. In summary, local() doesn't make what you think of as private, local variables. It gives a global variable a temporary value. my() is what you're looking for if you want private variables. See "Private Variables via my()" in perlsub and "Temporary Values via local()" in perlsub for excruciating details. What's the difference between deep and shallow binding? In deep binding, lexical variables mentioned in anonymous subroutines are the same ones that were in scope when the subroutine was created. In shallow binding, they are whichever variables with the same names happen to be in scope when the subroutine is called. Perl always uses deep binding of lexical variables (i.e., those created with my()). However, dynamic variables (aka global, local, or package variables) are effectively shallowly bound. Consider this just one more reason not to use them. See the answer to "What's a closure?". Why doesn't "my($foo) = <$fh>;" work right? "my()" and "local()" give list context to the right hand side of "=". The <$fh> read operation, like so many of Perl's functions and operators, can tell which context it was called in and behaves appropriately. In general, the scalar() function can help. 
This function does nothing to the data itself (contrary to popular myth) but rather tells its argument to behave in whatever its scalar fashion is. If that function doesn't have a defined scalar behavior, this of course doesn't help you (such as with sort()). To enforce scalar context in this particular case, however, you need merely omit the parentheses: local($foo) = <$fh>; # WRONG local($foo) = scalar(<$fh>); # ok local $foo = <$fh>; # right You should probably be using lexical variables anyway, although the issue is the same here: my($foo) = <$fh>; # WRONG my $foo = <$fh>; # right How do I redefine a builtin function, operator, or method? Why do you want to do that? :-) If you want to override a predefined function, such as open(), then you'll have to import the new definition from a different module. See "Overriding Built-in Functions" in perlsub. If you want to overload a Perl operator, such as "+" or "**", then you'll want to use the "use overload" pragma, documented in overload. If you're talking about obscuring method calls in parent classes, see "Overriding methods and method resolution" in perlootut. What's the difference between calling a function as &foo and foo()? (contributed by brian d foy) Calling a subroutine as &foo with no trailing parentheses ignores the prototype of "foo" and passes it the current value of the argument list, @_. Here's an example; the "bar" subroutine calls &foo, which prints its arguments list: sub foo { print "Args in foo are: @_\n"; } sub bar { &foo; } bar( "a", "b", "c" ); When you call "bar" with arguments, you see that "foo" got the same @_: Args in foo are: a b c Calling the subroutine with trailing parentheses, with or without arguments, does not use the current @_. Changing the example to put parentheses after the call to "foo" changes the program: sub foo { print "Args in foo are: @_\n"; } sub bar { &foo(); } bar( "a", "b", "c" ); Now the output shows that "foo" doesn't get the @_ from its caller. 
    Args in foo are:

However, using "&" in the call still overrides the prototype of "foo" if present:

    sub foo ($$$) { print "Args in foo are: @_\n"; }

    sub bar_1 { &foo; }
    sub bar_2 { &foo(); }
    sub bar_3 { foo( $_[0], $_[1], $_[2] ); }
    # sub bar_4 { foo(); }
    # bar_4 doesn't compile: "Not enough arguments for main::foo at ..."

    bar_1( "a", "b", "c" );
    # Args in foo are: a b c

    bar_2( "a", "b", "c" );
    # Args in foo are:

    bar_3( "a", "b", "c" );
    # Args in foo are: a b c

The main use of the @_ pass-through feature is to write subroutines whose main job it is to call other subroutines for you. For further details, see perlsub.

How can I catch accesses to undefined variables, functions, or methods?

The AUTOLOAD method, discussed in "Autoloading" in perlsub, lets you capture calls to undefined functions and methods.

When it comes to undefined variables that would trigger a warning under "use warnings", you can promote the warning to an error.

    use warnings FATAL => qw(uninitialized);

Why can't a method included in this same file be found?

Some possible reasons: your inheritance is getting confused, you've misspelled the method name, or the object is of the wrong type. Check out perlootut for details about any of the above cases. You may also use "print ref($object)" to find out the class $object was blessed into.

Another possible reason for problems is that you've used the indirect object syntax (e.g., "find Guru "Samy"") on a class name before Perl has seen that such a package exists. It's wisest to make sure your packages are all defined before you start using them, which will be taken care of if you use the "use" statement instead of "require". If not, make sure to use arrow notation (e.g., "Guru->find("Samy")") instead. Object notation is explained in perlobj.

Make sure to read about creating modules in perlmod and the perils of indirect objects in "Method Invocation" in perlobj.

How can I find out my current or calling package?
(contributed by brian d foy)

To find the package you are currently in, use the special literal "__PACKAGE__", as documented in perldata. You can only use the special literals as separate tokens, so you can't interpolate them into strings like you can with variables:

    my $current_package = __PACKAGE__;
    print "I am in package $current_package\n";

If you want to find the package calling your code, perhaps to give better diagnostics as Carp does, use the "caller" built-in:

    sub foo {
        my @args = ...;
        my( $package, $filename, $line ) = caller;

        print "I was called from package $package\n";
    }

By default, your program starts in package "main", so you will always be in some package.

This is different from finding out the package an object is blessed into, which might not be the current package. For that, use "blessed" from Scalar::Util, part of the Standard Library since Perl 5.8:

    use Scalar::Util qw(blessed);
    my $object_package = blessed( $object );

Most of the time, however, you shouldn't care what package an object is blessed into, as long as it claims to inherit from that class:

    my $is_right_class = eval { $object->isa( $package ) }; # true or false

And, with Perl 5.10 and later, you don't have to check for an inheritance to see if the object can handle a role. For that, you can use "DOES", which comes from "UNIVERSAL":

    my $class_does_it = eval { $object->DOES( $role ) }; # true or false

You can safely replace "isa" with "DOES" (although the converse is not true).

How can I comment out a large block of Perl code?

(contributed by brian d foy)

The quick-and-dirty way to comment out more than one line of Perl is to surround those lines with Pod directives. You have to put these directives at the beginning of the line and somewhere where Perl expects a new statement (so not in the middle of statements like the "#" comments).
You end the comment with "=cut", ending the Pod section:

    =pod

    my $object = NotGonnaHappen->new();

    ignored_sub();

    $wont_be_assigned = 37;

    =cut

The quick-and-dirty method only works well when you don't plan to leave the commented code in the source. If a Pod parser comes along, your multiline comment is going to show up in the Pod translation. A better way hides it from Pod parsers as well.

The "=begin" directive can mark a section for a particular purpose. If the Pod parser doesn't want to handle it, it just ignores it. Label the comments with "comment". End the comment using "=end" with the same label. You still need the "=cut" to go back to Perl code from the Pod comment:

    =begin comment

    my $object = NotGonnaHappen->new();

    ignored_sub();

    $wont_be_assigned = 37;

    =end comment

    =cut

For more information on Pod, check out perlpod and perlpodspec.

Found in /usr/share/perl/5.34/pod/perlfaq8.pod

How do I find out which operating system I'm running under?

The $^O variable ($OSNAME if you use "English") contains an indication of the name of the operating system (not its release number) that your perl binary was built for.

How do I do fancy stuff with the keyboard/screen/mouse?

How you access/control keyboards, screens, and pointing devices ("mice") is system-dependent. Try the following modules:

    Keyboard
        Term::Cap               Standard perl distribution
        Term::ReadKey           CPAN
        Term::ReadLine::Gnu     CPAN
        Term::ReadLine::Perl    CPAN
        Term::Screen            CPAN

    Screen
        Term::Cap               Standard perl distribution
        Curses                  CPAN
        Term::ANSIColor         CPAN

    Mouse
        Tk                      CPAN
        Wx                      CPAN
        Gtk2                    CPAN
        Qt4                     kdebindings4 package

Some of these specific cases are shown as examples in other answers in this section of the perlfaq.

How do I read just one key without waiting for a return key?

Controlling input buffering is a remarkably system-dependent matter. On many systems, you can just use the stty command as shown in "getc" in perlfunc, but as you see, that's already getting you into portability snags.
open(TTY, "+</dev/tty") or die "no tty: $!"; system "stty cbreak </dev/tty >/dev/tty 2>&1"; $key = getc(TTY); # perhaps this works # OR ELSE sysread(TTY, $key, 1); # probably this does system "stty -cbreak </dev/tty >/dev/tty 2>&1"; The Term::ReadKey module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to stty for each key. It even includes limited support for Windows. use Term::ReadKey; ReadMode('cbreak'); $key = ReadKey(0); ReadMode('normal'); However, using the code requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution using the standard POSIX module, which is already on your system (assuming your system supports POSIX). use HotKey; $key = readkey(); And here's the "HotKey" module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures. # HotKey.pm package HotKey; use strict; use warnings; use parent 'Exporter'; our @EXPORT = qw(cbreak cooked readkey); use POSIX qw(:termios_h); my ($term, $oterm, $echo, $noecho, $fd_stdin); $fd_stdin = fileno(STDIN); $term = POSIX::Termios->new(); $term->getattr($fd_stdin); $oterm = $term->getlflag(); $echo = ECHO | ECHOK | ICANON; $noecho = $oterm & ~$echo; sub cbreak { $term->setlflag($noecho); # ok, so i don't want echo either $term->setcc(VTIME, 1); $term->setattr($fd_stdin, TCSANOW); } sub cooked { $term->setlflag($oterm); $term->setcc(VTIME, 0); $term->setattr($fd_stdin, TCSANOW); } sub readkey { my $key = ''; cbreak(); sysread(STDIN, $key, 1); cooked(); return $key; } END { cooked() } 1; How do I ask the user for a password? (This question has nothing to do with the web. See a different FAQ for that.) There's an example of this in "crypt" in perlfunc. First, you put the terminal into "no echo" mode, then just read the password normally. 
You may do this with an old-style "ioctl()" function, POSIX terminal control (see POSIX or its documentation in the Camel Book), or a call to the stty program, with varying degrees of portability.

You can also do this for most systems using the Term::ReadKey module from CPAN, which is easier to use and in theory more portable.

    use Term::ReadKey;

    ReadMode('noecho');
    my $password = ReadLine(0);
    ReadMode('restore');    # put the terminal back the way it was

How do I decode encrypted password files?

You spend lots and lots of money on dedicated hardware, but this is bound to get you talked about.

Seriously, you can't if they are Unix password files--the Unix password system employs one-way encryption. It's more like hashing than encryption. The best you can do is check whether something else hashes to the same string. You can't turn a hash back into the original string. Programs like Crack can forcibly (and intelligently) try to guess passwords, but don't (can't) guarantee quick success.

If you're worried about users selecting bad passwords, you should proactively check when they try to change their password (by modifying passwd(1), for example).

How do I modify the shadow password file on a Unix system?

If perl was installed correctly and your shadow library was written properly, the "getpw*()" functions described in perlfunc should in theory provide (read-only) access to entries in the shadow password file. To change the file, make a new shadow password file (the format varies from system to system--see passwd(1) for specifics) and use pwd_mkdb(8) to install it (see pwd_mkdb(8) for more details).

How can I sleep() or alarm() for under a second?

If you want finer granularity than the 1 second that the "sleep()" function provides, the easiest way is to use the "select()" function as documented in "select" in perlfunc. Try the Time::HiRes and the BSD::Itimer modules (available from CPAN, and starting from Perl 5.8 Time::HiRes is part of the standard distribution).

How can I call my system's unique C functions from Perl?
In most cases, you write an external module to do it--see the answer to "Where can I learn about linking C with Perl? [h2xs, xsubpp]". However, if the function is a system call, and your system supports "syscall()", you can use the "syscall" function (documented in perlfunc). Remember to check the modules that came with your distribution, and CPAN as well--someone may already have written a module to do it. On Windows, try Win32::API. On Macs, try Mac::Carbon. If no module has an interface to the C function, you can inline a bit of C in your Perl source with Inline::C. Where do I get the include files to do ioctl() or syscall()? Historically, these would be generated by the h2ph tool, part of the standard perl distribution. This program converts cpp(1) directives in C header files to files containing subroutine definitions, like "SYS_getitimer()", which you can use as arguments to your functions. It doesn't work perfectly, but it usually gets most of the job done. Simple files like errno.h, syscall.h, and socket.h were fine, but the hard ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files: 1. Become the super-user 2. cd /usr/include 3. h2ph *.h */*.h If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See perlxstut for how to get started with h2xs. If your system doesn't support dynamic loading, you still probably ought to use h2xs. See perlxstut and ExtUtils::MakeMaker for more information (in brief, just use make perl instead of a plain make to rebuild perl with a new static extension). How can I open a pipe both to and from a command? The IPC::Open2 module (part of the standard perl distribution) is an easy-to-use approach that internally uses "pipe()", "fork()", and "exec()" to do the job. 
Make sure you read the deadlock warnings in its documentation, though (see IPC::Open2). See "Bidirectional Communication with Another Process" in perlipc and "Bidirectional Communication with Yourself" in perlipc You may also use the IPC::Open3 module (part of the standard perl distribution), but be warned that it has a different order of arguments from IPC::Open2 (see IPC::Open3). Why can't I get the output of a command with system()? You're confusing the purpose of "system()" and backticks (``). "system()" runs a command and returns exit status information (as a 16 bit value: the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT. my $exit_status = system("mail-users"); my $output_string = `ls`; How can I capture STDERR from an external command? There are three basic ways of running external commands: system $cmd; # using system() my $output = `$cmd`; # using backticks (``) open (my $pipe_fh, "$cmd |"); # using open() With "system()", both STDOUT and STDERR will go the same place as the script's STDOUT and STDERR, unless the "system()" command redirects them. Backticks and "open()" read only the STDOUT of your command. You can also use the "open3()" function from IPC::Open3. 
Benjamin Goldberg provides some sample code: To capture a program's STDOUT, but discard its STDERR: use IPC::Open3; use File::Spec; my $in = ''; open(NULL, ">", File::Spec->devnull); my $pid = open3($in, \*PH, ">&NULL", "cmd"); while( <PH> ) { } waitpid($pid, 0); To capture a program's STDERR, but discard its STDOUT: use IPC::Open3; use File::Spec; my $in = ''; open(NULL, ">", File::Spec->devnull); my $pid = open3($in, ">&NULL", \*PH, "cmd"); while( <PH> ) { } waitpid($pid, 0); To capture a program's STDERR, and let its STDOUT go to our own STDERR: use IPC::Open3; my $in = ''; my $pid = open3($in, ">&STDERR", \*PH, "cmd"); while( <PH> ) { } waitpid($pid, 0); To read both a command's STDOUT and its STDERR separately, you can redirect them to temp files, let the command run, then read the temp files: use IPC::Open3; use IO::File; my $in = ''; local *CATCHOUT = IO::File->new_tmpfile; local *CATCHERR = IO::File->new_tmpfile; my $pid = open3($in, ">&CATCHOUT", ">&CATCHERR", "cmd"); waitpid($pid, 0); seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR; while( <CATCHOUT> ) {} while( <CATCHERR> ) {} But there's no real need for both to be tempfiles... the following should work just as well, without deadlocking: use IPC::Open3; my $in = ''; use IO::File; local *CATCHERR = IO::File->new_tmpfile; my $pid = open3($in, \*CATCHOUT, ">&CATCHERR", "cmd"); while( <CATCHOUT> ) {} waitpid($pid, 0); seek CATCHERR, 0, 0; while( <CATCHERR> ) {} And it'll be faster, too, since we can begin processing the program's stdout immediately, rather than waiting for the program to finish. 
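The last variant above can be wrapped in a small helper. This is only a sketch under stated assumptions: the name run_capturing is hypothetical, $^X (the running perl) merely stands in for a real command, and the code relies on the same IPC::Open3 and IO::File calls as the samples above, so it shares their Unix-leaning portability limits.

```perl
use strict;
use warnings;
use IPC::Open3;
use IO::File;

# Hypothetical helper: run a command (list form, so no shell is involved)
# and return its STDOUT, STDERR, and exit value as three separate values.
sub run_capturing {
    my @cmd = @_;

    # The child's STDERR goes to an anonymous temp file, as in the sample
    # above, so only one pipe has to be drained and we cannot deadlock.
    my $tmp = IO::File->new_tmpfile or die "Cannot create temp file: $!";
    local *CATCHERR = $tmp;

    my $pid = open3(my $child_in, my $child_out, ">&CATCHERR", @cmd);
    close $child_in;                              # we send the child no input

    my $stdout = do { local $/; <$child_out> };   # read until the child closes it
    waitpid($pid, 0);
    my $exit = $? >> 8;                           # high 8 bits hold the exit value

    seek CATCHERR, 0, 0;                          # rewind before reading STDERR back
    my $stderr = do { local $/; <CATCHERR> };

    return ($stdout, $stderr, $exit);
}

my ($out, $err, $exit) = run_capturing(
    $^X, '-e', 'print "to stdout\n"; print STDERR "to stderr\n"; exit 3'
);
print "stdout: $out", "stderr: $err", "exit:   $exit\n";
```

Passing the command as a list keeps the shell out of the picture, which also sidesteps quoting problems in the arguments.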
With any of these, you can change file descriptors before the call: open(STDOUT, ">logfile"); system("ls"); or you can use Bourne shell file-descriptor redirection: $output = `$cmd 2>some_file`; open (PIPE, "cmd 2>some_file |"); You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT: $output = `$cmd 2>&1`; open (PIPE, "cmd 2>&1 |"); Note that you *cannot* simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work: open(STDERR, ">&STDOUT"); $alloutput = `cmd args`; # stderr still escapes This fails because the "open()" makes STDERR go to where STDOUT was going at the time of the "open()". The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT). Note that you *must* use Bourne shell (sh(1)) redirection syntax in backticks, not csh(1)! Details on why Perl's "system()" and backtick and pipe opens all use the Bourne shell are in the versus/csh.whynot article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> . 
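As a concrete illustration of the shell-level duplication described above, here is a minimal sketch that merges a command's STDERR into its STDOUT and then decodes the status left in $?. It assumes a Bourne-compatible "sh" on the PATH; the inner "sh -c" command is just a stand-in that writes one line to each stream.

```perl
use strict;
use warnings;

# Assumption: a Bourne-compatible "sh" is available (typical on Unix).
# The stand-in command writes to STDOUT, then to STDERR, then exits with 2.
my $output = `sh -c 'echo "to stdout"; echo "to stderr" >&2; exit 2' 2>&1`;

my $status     = $?;
my $exit_value = $status >> 8;     # high 8 bits: the command's exit value
my $signal     = $status & 127;    # low 7 bits: signal that killed it, if any

print "captured:\n$output";
print "exit value $exit_value, signal $signal\n";
```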
To capture a command's STDERR and STDOUT together: $output = `cmd 2>&1`; # either with backticks $pid = open(PH, "cmd 2>&1 |"); # or with an open pipe while (<PH>) { } # plus a read To capture a command's STDOUT but discard its STDERR: $output = `cmd 2>/dev/null`; # either with backticks $pid = open(PH, "cmd 2>/dev/null |"); # or with an open pipe while (<PH>) { } # plus a read To capture a command's STDERR but discard its STDOUT: $output = `cmd 2>&1 1>/dev/null`; # either with backticks $pid = open(PH, "cmd 2>&1 1>/dev/null |"); # or with an open pipe while (<PH>) { } # plus a read To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR: $output = `cmd 3>&1 1>&2 2>&3 3>&-`; # either with backticks $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe while (<PH>) { } # plus a read To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done: system("program args 1>program.stdout 2>program.stderr"); Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order. system("prog args 1>tmpfile 2>&1"); system("prog args 2>&1 1>tmpfile"); The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output there, and the old standard error shows up on the old standard out. Why doesn't open() return an error when a pipe open fails? If the second argument to a piped "open()" contains shell metacharacters, perl "fork()"s, then "exec()"s a shell to decode the metacharacters and eventually run the desired program. If the program couldn't be run, it's the shell that gets the message, not Perl. All your Perl program can find out is whether the shell itself could be successfully started. 
You can still capture the shell's STDERR and check it for error messages. See "How can I capture STDERR from an external command?" elsewhere in this document, or use the IPC::Open3 module. If there are no shell metacharacters in the argument of "open()", Perl runs the command directly, without using the shell, and can correctly report whether the command started. Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? This happens only if your perl is compiled to use stdio instead of perlio, which is the default. Some (maybe all?) stdios set error and eof flags that you may need to clear. The POSIX module defines "clearerr()" that you can use. That is the technically correct way to do it. Here are some less reliable workarounds: 1 Try keeping around the seekpointer and go there, like this: my $where = tell($log_fh); seek($log_fh, $where, 0); 2 If that doesn't work, try seeking to a different part of the file and then back. 3 If that doesn't work, try seeking to a different part of the file, reading something, and then seeking back. 4 If that doesn't work, give up on your stdio package and use sysread. Can I use perl to run a telnet or ftp session? Try the Net::FTP, TCP::Client, and Net::Telnet modules (available from CPAN). <http://www.cpan.org/scripts/netstuff/telnet.emul.shar> will also help for emulating the telnet protocol, but Net::Telnet is quite probably easier to use. 
If all you want to do is pretend to be telnet but don't need the initial telnet handshaking, then the standard dual-process approach will suffice: use IO::Socket; # new in 5.004 my $handle = IO::Socket::INET->new('www.perl.com:80') or die "can't connect to port 80 on www.perl.com $!"; $handle->autoflush(1); if (fork()) { # XXX: undef means failure select($handle); print while <STDIN>; # everything from stdin to socket } else { print while <$handle>; # everything from socket to stdout } close $handle; exit; Is there a way to hide perl's command line from programs such as "ps"? First of all note that if you're doing this for security reasons (to avoid people seeing passwords, for example) then you should rewrite your program so that critical information is never given as an argument. Hiding the arguments won't make your program completely secure. To actually alter the visible command line, you can assign to the variable $0 as documented in perlvar. This won't work on all operating systems, though. Daemon programs like sendmail place their state there, as in: $0 = "orcus [accepting connections]"; I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? Unix In the strictest sense, it can't be done--the script executes as a different process from the shell it was started from. Changes to a process are not reflected in its parent--only in any children created after the change. There is shell magic that may allow you to fake it by "eval()"ing the script's output in your shell; check out the comp.unix.questions FAQ for details. How do I close a process's filehandle without waiting for it to complete? Assuming your system supports such things, just send an appropriate signal to the process (see "kill" in perlfunc). It's common to first send a TERM signal, wait a little bit, and then send a KILL signal to finish it off. How do I fork a daemon process? 
If by daemon process you mean one that's detached (disassociated from its tty), then the following process is reported to work on most Unixish systems. Non-Unix users should check their Your_OS::Process module for other solutions.

* Open /dev/tty and use the TIOCNOTTY ioctl on it. See tty(1) for details. Or better yet, you can just use the "POSIX::setsid()" function, so you don't have to worry about process groups.

* Change directory to /

* Reopen STDIN, STDOUT, and STDERR so they're not connected to the old tty.

* Background yourself like this:

      fork && exit;

The Proc::Daemon module, available from CPAN, provides a function to perform these actions for you.

How do I find out if I'm running interactively or not?

(contributed by brian d foy)

This is a difficult question to answer, and the best answer is only a guess.

What do you really want to know? If you merely want to know if one of your filehandles is connected to a terminal, you can try the "-t" file test:

    if( -t STDOUT ) {
        print "I'm connected to a terminal!\n";
    }

However, you might be out of luck if you expect that means there is a real person on the other side. With the Expect module, another program can pretend to be a person. The program might even come close to passing the Turing test.

The IO::Interactive module does the best it can to give you an answer. Its "is_interactive" function returns true if the module thinks the session is interactive. Its companion "interactive" function returns an output filehandle; that filehandle points to standard output if the module thinks the session is interactive. Otherwise, the filehandle is a null handle that simply discards the output:

    use IO::Interactive qw(interactive);

    print { interactive } "I might go to standard output!\n";

This still doesn't guarantee that a real person is answering your prompts or reading your output.

If you want to know how to handle automated testing for your distribution, you can check the environment.
The CPAN Testers, for instance, set the value of "AUTOMATED_TESTING":

    unless( $ENV{AUTOMATED_TESTING} ) {
        print "Hello interactive tester!\n";
    }

How do I open a file without blocking?

If you're lucky enough to be using a system that supports non-blocking reads (most Unixish systems do), you need only to use the "O_NDELAY" or "O_NONBLOCK" flag from the "Fcntl" module in conjunction with "sysopen()":

    use Fcntl;
    sysopen(my $fh, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
        or die "can't open /foo/somefile: $!";

How do I tell the difference between errors from the shell and perl?

(answer contributed by brian d foy)

When you run a Perl script, something else is running the script for you, and that something else may output error messages. The script might emit its own warnings and error messages. Most of the time you cannot tell who said what.

You probably cannot fix the thing that runs perl, but you can change how perl outputs its warnings by defining custom warning and die functions.

Consider this script, which has an error you may not notice immediately.

    #!/usr/locl/bin/perl

    print "Hello World\n";

I get an error when I run this from my shell (which happens to be bash). That may look like perl forgot it has a "print()" function, but my shebang line is not the path to perl, so the shell runs the script, and I get the error.

    $ ./test
    ./test: line 3: print: command not found

A quick and dirty fix involves a little bit of code, but this may be all you need to figure out the problem.

    #!/usr/bin/perl -w

    BEGIN {
        $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; };
        $SIG{__DIE__}  = sub{ print STDERR "Perl: ", @_; exit 1};
    }

    $a = 1 + undef;
    $x / 0;
    __END__

The perl message comes out with "Perl" in front. The "BEGIN" block works at compile time so all of the compilation errors and warnings get the "Perl:" prefix too.

    Perl: Useless use of division (/) in void context at ./test line 9.
    Perl: Name "main::a" used only once: possible typo at ./test line 8.
    Perl: Name "main::x" used only once: possible typo at ./test line 9.
    Perl: Use of uninitialized value in addition (+) at ./test line 8.
    Perl: Use of uninitialized value in division (/) at ./test line 9.
    Perl: Illegal division by zero at ./test line 9.

If I don't see that "Perl:", it's not from perl.

You could also just know all the perl errors, and although there are some people who may know all of them, you probably don't. However, they all should be in the perldiag manpage. If you don't find the error in there, it probably isn't a perl error.

Looking up every message is not the easiest way, so let perl do it for you. Use the diagnostics pragma, which turns perl's normal messages into longer discussions on the topic.

    use diagnostics;

If you don't get a paragraph or two of expanded discussion, it might not be perl's message.

How do I install a module from CPAN?

(contributed by brian d foy)

The easiest way is to have a module also named CPAN do it for you by using the "cpan" command that comes with Perl. You can give it a list of modules to install:

    $ cpan IO::Interactive Getopt::Whatever

If you prefer "CPANPLUS", it's just as easy:

    $ cpanp i IO::Interactive Getopt::Whatever

If you want to install a distribution from the current directory, you can tell "CPAN.pm" to install "." (the full stop):

    $ cpan .

See the documentation for either of those commands to see what else you can do.

If you want to try to install a distribution by yourself, resolving all dependencies on your own, you follow one of two possible build paths.

For distributions that use *Makefile.PL*:

    $ perl Makefile.PL
    $ make test install

For distributions that use *Build.PL*:

    $ perl Build.PL
    $ ./Build test
    $ ./Build install

Some distributions may need to link to libraries or other third-party code and their build and installation sequences may be more complicated. Check any *README* or *INSTALL* files that you may find.
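After any of these routes, it can be handy to confirm that your perl actually sees the module. The following is a minimal sketch using only core Perl; module_version is a hypothetical helper name, and List::Util merely stands in for whatever you installed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or undef if this perl cannot load it.
sub module_version {
    my ($module) = @_;

    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return undef unless eval { require $file; 1 };

    return $module->VERSION // 'unknown';      # some modules set no $VERSION
}

for my $module (qw(List::Util No::Such::Module::Here)) {
    my $version = module_version($module);
    printf "%-24s %s\n", $module,
        defined $version ? "version $version" : "not installed";
}
```

The "require" happens at run-time inside "eval", so a missing module is reported instead of aborting the program.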
What's the difference between require and use?

(contributed by brian d foy)

Perl runs the "require" statement at run-time. Once Perl loads, compiles, and runs the file, it doesn't do anything else. The "use" statement is the same as a "require" run at compile-time, but Perl also calls the "import" method for the loaded package. These two are the same:

    use MODULE qw(import list);

    BEGIN {
        require MODULE;
        MODULE->import(qw(import list));
    }

However, you can suppress the "import" by using an explicit, empty import list. Both of these still happen at compile-time:

    use MODULE ();

    BEGIN {
        require MODULE;
    }

Since "use" will also call the "import" method, the actual value for "MODULE" must be a bareword. That is, "use" cannot load files by name, although "require" can:

    require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching!

See the entry for "use" in perlfunc for more details.

Found in /usr/share/perl/5.34/pod/perlfaq9.pod

Should I use a web framework?

Yes. If you are building a web site with any level of interactivity (forms / users / databases), you will want to use a framework to make handling requests and responses easier.

If there is no interactivity then you may still want to look at using something like Template Toolkit <https://metacpan.org/module/Template> or Plack::Middleware::TemplateToolkit so maintenance of your HTML files (and other assets) is easier.

Which web framework should I use?

There is no simple answer to this question. Perl frameworks can run everything from basic file servers and small scale intranets to massive multinational multilingual websites that are the core to international businesses.

Below is a list of a few frameworks with comments which might help you in making a decision, depending on your specific requirements. Start by reading the docs, then ask questions on the relevant mailing list or IRC channel.

Catalyst
    Strongly object-oriented and fully-featured with a long development history and a large community and addon ecosystem.
    It is excellent for large and complex applications, where you have full control over the server.

Dancer2
    Free of legacy weight, providing a lightweight and easy to learn API. Has a growing addon ecosystem. It is best used for smaller projects and very easy to learn for beginners.

Mojolicious
    Self-contained and powerful for both small and larger projects, with a focus on HTML5 and real-time web technologies such as WebSockets.

Web::Simple
    Strongly object-oriented and minimal, built for speed and intended as a toolkit for building micro web apps, custom frameworks or for tying together existing Plack-compatible web applications with one central dispatcher.

All of these interact with or use Plack, whose basics are worth understanding when building a website in Perl (there is a lot of useful Plack::Middleware <https://metacpan.org/search?q=plack%3A%3Amiddleware>).

How do I remove HTML from a string?

Use HTML::Strip, or HTML::FormatText which not only removes HTML but also attempts to do a little simple formatting of the resulting plain text.

How do I fetch an HTML file?

(contributed by brian d foy)

The core HTTP::Tiny module can fetch web resources and give their content back to you as a string:

    use HTTP::Tiny;

    my $ua = HTTP::Tiny->new;
    my $html = $ua->get( "http://www.example.com/index.html" )->{content};

It can also store the resource directly in a file:

    $ua->mirror( "http://www.example.com/index.html", "foo.html" );

If you need to do something more complicated, the HTTP::Tiny object can be customized by setting attributes, or you can use LWP::UserAgent from the libwww-perl distribution or Mojo::UserAgent from the Mojolicious distribution to make common tasks easier. If you want to simulate an interactive web browser, you can use the WWW::Mechanize module.

How do I automate an HTML form submission?

If you are doing something complex, such as moving through many pages and forms of a web site, you can use WWW::Mechanize. See its documentation for all the details.
    If you're submitting values using the GET method, create a URL and
    encode the form using the "www_form_urlencode" method from HTTP::Tiny:

        use HTTP::Tiny;

        my $ua = HTTP::Tiny->new;

        my $query = $ua->www_form_urlencode([ q => 'DB_File', lucky => 1 ]);
        my $url = "https://metacpan.org/search?$query";
        my $content = $ua->get($url)->{content};

    If you're using the POST method, the "post_form" method will encode the
    content appropriately.

        use HTTP::Tiny;

        my $ua = HTTP::Tiny->new;

        my $url = 'https://metacpan.org/search';
        my $form = [ q => 'DB_File', lucky => 1 ];
        my $content = $ua->post_form($url, $form)->{content};

  How do I make sure users can't enter values into a form that cause my CGI script to do bad things?
    (contributed by brian d foy)

    You can't prevent people from sending your script bad data. Even if you
    add some client-side checks, people may disable them or bypass them
    completely. For instance, someone might use a module such as LWP to
    submit to your web site. If you want to guard against SQL injection and
    other sorts of attacks (and you should want to), you must not trust any
    data that enter your program.

    The perlsec documentation has general advice about data security. If
    you are using the DBI module, use placeholders to fill in data. If you
    are running external programs with "system" or "exec", use the list
    forms. There are many other precautions that you should take, too many
    to list here, and most of them fall under the category of not using any
    data that you don't intend to use. Trust no one.

  How do I find the user's mail address?
    Ask them for it. There are so many email providers available that it's
    unlikely the local system has any idea how to determine a user's email
    address.

    The exception is for organization-specific email (e.g. foo AT
    yourcompany.com) where policy can be codified in your program.
    In that case, you could look at $ENV{USER}, $ENV{LOGNAME}, and
    getpwuid($<) in scalar context, like so:

        my $user_name = getpwuid($<);

    But you still cannot make assumptions about whether this is correct,
    unless your policy says it is. You really are best off asking the user.

  How do I find out my hostname, domainname, or IP address?
    (contributed by brian d foy)

    The Net::Domain module, which is part of the Standard Library starting
    in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the
    host name, or the domain name.

        use Net::Domain qw(hostname hostfqdn hostdomain);

        my $host = hostfqdn();

    The Sys::Hostname module, part of the Standard Library, can also get
    the hostname:

        use Sys::Hostname;

        my $host = hostname();

    The Sys::Hostname::Long module takes a different approach and tries
    harder to return the fully qualified hostname:

        use Sys::Hostname::Long 'hostname_long';

        my $hostname = hostname_long();

    To get the IP address, you can use the "gethostbyname" built-in
    function to turn the name into a number. To turn that number into the
    dotted octet form (a.b.c.d) that most people expect, use the
    "inet_ntoa" function from the Socket module, which also comes with
    perl.

        use Socket;

        # gethostbyname returns nothing if the name cannot be resolved,
        # and inet_ntoa dies on such input, so check the lookup first
        my $packed = gethostbyname( $host || 'localhost' )
            or die "Cannot resolve host name\n";
        my $address = inet_ntoa($packed);

  How do I fetch/put an (S)FTP file?
    Net::FTP and Net::SFTP allow you to interact with FTP and SFTP (Secure
    FTP) servers.
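
As a rough illustration of the Net::FTP half of that answer, here is a minimal sketch of downloading one file. The helper name fetch_via_ftp, the anonymous login, the 30-second timeout, and the host and paths in the usage comment are all assumptions made for the example, not anything this FAQ prescribes. Net::SFTP has its own interface and is not covered here.

```perl
use strict;
use warnings;
use Net::FTP;

# Hypothetical helper: download one file over FTP and return the
# local file name. Dies with the server's message on any failure.
sub fetch_via_ftp {
    my (%arg) = @_;

    # new() returns undef on failure and leaves the reason in $@.
    my $ftp = Net::FTP->new( $arg{host}, Timeout => 30, Passive => 1 )
        or die "Cannot connect to $arg{host}: $@\n";

    $ftp->login( $arg{user}, $arg{password} )
        or die "Login failed: ", $ftp->message;

    # Binary mode keeps line-ending translation from altering the file.
    $ftp->binary;

    # get() returns the local file name on success, undef on failure.
    my $local = $ftp->get( $arg{remote}, $arg{local} )
        or die "Download failed: ", $ftp->message;

    $ftp->quit;
    return $local;
}

# Only touch the network when given three arguments, for example
# (placeholder host and paths):
#   perl fetch.pl ftp.example.com /pub/README README.txt
if ( @ARGV == 3 ) {
    my ( $host, $remote, $local ) = @ARGV;
    my $saved = fetch_via_ftp(
        host     => $host,
        user     => 'anonymous',
        password => 'guest@example.com',
        remote   => $remote,
        local    => $local,
    );
    print "Saved $saved\n";
}
```

Uploading follows the same shape, with the "put" method in place of "get". Checking every return value matters here because Net::FTP methods report failure by returning false rather than by dying.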