Found in /usr/share/perl/5.34/pod/perlfaq1.pod
  How stable is Perl?
    Production releases, which incorporate bug fixes and new functionality, are widely tested before release. Since the 5.000 release, we have averaged about one production release per year.

    The Perl development team occasionally makes changes to the internal core of the language, but all possible efforts are made toward backward compatibility.

Found in /usr/share/perl/5.34/pod/perlfaq2.pod
  Where can I post questions?
    There are many Perl mailing lists for various topics; in particular, the beginners list <http://lists.perl.org/list/beginners.html> may be of use.

    Other places to ask questions are the PerlMonks site <http://www.perlmonks.org/> and Stack Overflow <http://stackoverflow.com/questions/tagged/perl>.

  What mailing lists are there for Perl?
    A comprehensive list of Perl-related mailing lists can be found at <http://lists.perl.org/>.

Found in /usr/share/perl/5.34/pod/perlfaq3.pod
  How do I find which modules are installed on my system?
    From the command line, you can use the "cpan" command's "-l" switch:

        $ cpan -l

    You can also use "cpan"'s "-a" switch to create an autobundle file that "CPAN.pm" understands and can use to re-install every module:

        $ cpan -a

    Inside a Perl program, you can use the ExtUtils::Installed module to show all installed distributions, although it can take a while to do its magic. The standard library which comes with Perl just shows up as "Perl" (although you can get those with Module::CoreList).

        use ExtUtils::Installed;

        my $inst    = ExtUtils::Installed->new();
        my @modules = $inst->modules();

    If you want a list of all of the Perl module filenames, you can use File::Find::Rule:

        use File::Find::Rule;

        my @files = File::Find::Rule->
            extras({follow => 1})->
            file()->
            name( '*.pm' )->
            in( @INC )
            ;

    If you do not have that module, you can do the same thing with File::Find which is part of the standard library:

        use File::Find;
        my @files;

        find(
            {
            wanted => sub {
                push @files, $File::Find::fullname
                if -f $File::Find::fullname && /\.pm$/
            },
            follow => 1,
            follow_skip => 2,
            },
            @INC
        );

        print join "\n", @files;

    If you simply need to check quickly to see if a module is available, you can check for its documentation. If you can read the documentation the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases):

        $ perldoc Module::Name

    You can also try to include the module in a one-liner to see if perl finds it:

        $ perl -MModule::Name -e1

    (If you don't receive a "Can't locate ... in @INC" error message, then Perl found the module name you asked for.)

  How can I make my Perl program run faster?
    The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Jon Bentley's book *Programming Pearls* (that's not a misspelling!) has some good tips on optimization, too. Advice on benchmarking boils down to: benchmark and profile to make sure you're optimizing the right part, look for better algorithms instead of microtuning your code, and when all else fails consider just buying faster hardware. You will probably want to read the answer to the earlier question "How do I profile my Perl programs?" if you haven't done so already.

    A different approach is to autoload seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that.
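    The advice above to benchmark before optimizing can be tried with the standard Benchmark module. The following is only a minimal sketch: the two subroutines, "with_concat" and "with_join", are hypothetical candidates invented for illustration, and the results will vary from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data; the size is an arbitrary assumption for this sketch
my @words = map { "word$_" } 1 .. 1000;

# Two hypothetical ways to build the same string
sub with_concat { my $s = ''; $s .= $_ for @words; return $s }
sub with_join   { return join '', @words }

# A negative count asks Benchmark to run each candidate for at least
# that many CPU seconds, then print a comparison table
cmpthese( -1, {
    concat => \&with_concat,
    join   => \&with_join,
} );
```

    Measuring both candidates on realistic data is usually more trustworthy than guessing which one is faster.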

    Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C, modules that have critical sections can be written in C (for instance, the PDL module from CPAN).

    If you're currently linking your perl executable to a shared *libc.so*, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information.

    The undump program was an ancient attempt to speed up Perl programs by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway.

  Why don't Perl one-liners work on my DOS/Mac/VMS system?
    The problem is usually that the command interpreters on those systems have rather different ideas about quoting than the Unix shells under which the one-liners were created. On some systems, you may have to change single-quotes to double ones, which you must *NOT* do on Unix or Plan9 systems. You might also have to change a single % to a %%.

    For example:

        # Unix (including Mac OS X)
        perl -e 'print "Hello world\n"'

        # DOS, etc.
        perl -e "print \"Hello world\n\""

        # Mac Classic
        print "Hello world\n"
         (then Run "Myscript" or Shift-Command-R)

        # MPW
        perl -e 'print "Hello world\n"'

        # VMS
        perl -e "print ""Hello world\n"""

    The problem is that none of these examples are reliable: they depend on the command interpreter. Under Unix, the first two often work. Under DOS, it's entirely possible that neither works. If 4DOS was the command shell, you'd probably have better luck like this:

        perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

    Under the Mac, it depends which environment you are using.
    The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters.

    Using qq(), q(), and qx(), instead of "double quotes", 'single quotes', and `backticks`, may make one-liners easier to write.

    There is no general solution to all of this. It is a mess.

    [Some of this answer was contributed by Kenneth Albanowski.]

Found in /usr/share/perl/5.34/pod/perlfaq4.pod
  Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
    For the long explanation, see David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (<http://web.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf>).

    Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.

    perlnumber shows the gory details of number representations and conversions.

    To limit the number of decimal places in your numbers, you can use the "printf" or "sprintf" function. See "Floating-point Arithmetic" in perlop for more details.

        printf "%.2f", 10/3;

        my $number = sprintf "%.2f", 10/3;

  How can I take a string and turn it into epoch seconds?
    If it's a regular enough string that it always has the same format, you can split it up and pass the parts to "timelocal" in the standard Time::Local module. Otherwise, you should look into the Date::Calc, Date::Parse, and Date::Manip modules from CPAN.

  How do I find yesterday's date?
    (contributed by brian d foy)

    To do it correctly, you can use one of the "Date" modules since they work with calendars instead of times.
    The DateTime module makes it simple, and gives you the same time of day, only the day before, despite daylight saving time changes:

        use DateTime;

        my $yesterday = DateTime->now->subtract( days => 1 );

        print "Yesterday was $yesterday\n";

    You can also use the Date::Calc module using its "Today_and_Now" function.

        use Date::Calc qw( Today_and_Now Add_Delta_DHMS );

        my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 );

        print "@date_time\n";

    Most people try to use the time rather than the calendar to figure out dates, but that assumes that days are twenty-four hours each. For most people, there are two days a year when they aren't: the switch to and from summer time throws this off. For example, the rest of the suggestions will be wrong sometimes:

    Starting with Perl 5.10, Time::Piece and Time::Seconds are part of the standard distribution, so you might think that you could do something like this:

        use Time::Piece;
        use Time::Seconds;

        my $yesterday = localtime() - ONE_DAY; # WRONG
        print "Yesterday was $yesterday\n";

    The Time::Piece module exports a new "localtime" that returns an object, and Time::Seconds exports the "ONE_DAY" constant that is a set number of seconds. This means that it always gives the time 24 hours ago, which is not always yesterday. This can cause problems around the end of daylight saving time when there's one day that is 25 hours long.

    You have the same problem with Time::Local, which will give the wrong answer for those same special cases:

        # contributed by Gunnar Hjalmarsson
        use Time::Local;
        my $today = timelocal 0, 0, 12, ( localtime )[3..5];
        my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG
        printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d;

  How do I unescape a string?
    It depends just what you mean by "escape". URL escapes are dealt with in perlfaq9. Shell escapes with the backslash ("\") character are removed with

        s/\\(.)/$1/g;

    This won't expand "\n" or "\t" or any other special escapes.
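
    If you do want the special escapes expanded, one possible approach is to look each escaped character up in a table. This is only a sketch under stated assumptions: the %escape table and the "expand_escapes" name are invented for illustration, and the set of escapes it supports is an arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical table of the escapes this sketch chooses to support
my %escape = ( n => "\n", t => "\t", r => "\r" );

sub expand_escapes {
    my ($text) = @_;
    # Known escapes are translated; any other backslashed character
    # just loses its backslash, as with s/\\(.)/$1/g
    $text =~ s/\\(.)/ exists $escape{$1} ? $escape{$1} : $1 /ges;
    return $text;
}

# Single quotes keep the backslashes literal in the input
print expand_escapes('name\tvalue\nnext\!'), "\n";
```

    The "/s" flag lets "." match a newline, so a backslash followed by a literal newline is handled like any other unknown escape.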

  How do I expand function calls in a string?
    (contributed by brian d foy)

    This is documented in perlref, and although it's not the easiest thing to read, it does work. In each of these examples, we call the function inside the braces used to dereference a reference. If we have more than one return value, we can construct and dereference an anonymous array. In this case, we call the function in list context.

        print "The time values are @{ [localtime] }.\n";

    If we want to call the function in scalar context, we have to do a bit more work. We can really have any code we like inside the braces, so we simply have to end with the scalar reference, although how you do that is up to you, and you can use code inside the braces. Note that the use of parens creates a list context, so we need "scalar" to force the scalar context on the function:

        print "The time is ${\(scalar localtime)}.\n";

        print "The time is ${ my $x = localtime; \$x }.\n";

    If your function already returns a reference, you don't need to create the reference yourself.

        sub timestamp { my $t = localtime; \$t }

        print "The time is ${ timestamp() }.\n";

    The "Interpolation" module can also do a lot of magic for you. You can specify a variable name, in this case "E", to set up a tied hash that does the interpolation for you. It has several other methods to do this as well.

        use Interpolation E => 'eval';
        print "The time values are $E{localtime()}.\n";

    In most cases, it is probably easier to simply use string concatenation, which also forces scalar context.

        print "The time is " . localtime() . ".\n";

  How do I find matching/nesting anything?
    To find something between two single characters, a pattern like "/x([^x]*)x/" will get the intervening bits in $1. For multiple ones, then something more like "/alpha(.*?)omega/" would be needed. For nested patterns and/or balanced expressions, see the so-called (?PARNO) construct (available since perl 5.10).
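    As a rough illustration of the (?PARNO) construct, the sketch below matches one balanced parenthesized group by recursing into capture group 1 with "(?1)". The sample text and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Group 1 is a balanced group; (?1) recurses into group 1
my $balanced = qr/
    (                  # group 1: one balanced group
        \(
        (?: [^()]++    # a run of characters that are not parens
          | (?1)       # or a nested balanced group
        )*
        \)
    )
/x;

my $text = 'call( a, (b + c) * (d - (e)) ) and more';
if ( $text =~ $balanced ) {
    print "Balanced group: $1\n";   # ( a, (b + c) * (d - (e)) )
}
```

    Note that "(?1)" refers to a group by number, so the pattern is used on its own here; embedding it in a larger pattern that adds earlier capture groups would change what "(?1)" points at.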
    The CPAN module Regexp::Common can help to build such regular expressions (see in particular Regexp::Common::balanced and Regexp::Common::delimited).

    More complex cases will require writing a parser, probably using a parsing module from CPAN, like Regexp::Grammars, Parse::RecDescent, Parse::Yapp, Text::Balanced, or Marpa::R2.

  How do I reverse a string?
    Use "reverse()" in scalar context, as documented in "reverse" in perlfunc.

        my $reversed = reverse $string;

  How do I expand tabs in a string?
    You can do it yourself:

        1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;

    Or you can just use the Text::Tabs module (part of the standard Perl distribution).

        use Text::Tabs;
        my @expanded_lines = expand(@lines_with_tabs);

  How can I access or change N characters of a string?
    You can access the first characters of a string with substr(). To get the first character, for example, start at position 0 and grab the string of length 1.

        my $string = "Just another Perl Hacker";
        my $first_char = substr( $string, 0, 1 );  #  'J'

    To change part of a string, you can use the optional fourth argument which is the replacement string.

        substr( $string, 13, 4, "Perl 5.8.0" );

    You can also use substr() as an lvalue.

        substr( $string, 13, 4 ) =  "Perl 5.8.0";

  How can I count the number of occurrences of a substring within a string?
    There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the "tr///" function like so:

        my $string = "ThisXlineXhasXsomeXx'sXinXit";
        my $count = ($string =~ tr/X//);
        print "There are $count X characters in the string";

    This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, "tr///" won't work. What you can do is wrap a while() loop around a global pattern match.
    For example, let's count negative integers:

        my $string = "-9 55 48 -2 23 -76 4 14 -44";
        my $count = 0;
        while ($string =~ /-\d+/g) { $count++ }
        print "There are $count negative numbers in the string";

    Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches.

        my $count = () = $string =~ /-\d+/g;

  How can I split a [character]-delimited string except when inside [character]?
    Several modules can handle this sort of parsing--Text::Balanced, Text::CSV, Text::CSV_XS, and Text::ParseWords, among others.

    Take the example case of trying to split a string that is comma-separated into its different fields. You can't use "split(/,/)" because you shouldn't split if the comma is inside quotes. For example, take a data line like this:

        SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

    Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of *Mastering Regular Expressions*, to handle these for us. He suggests (assuming your string is contained in $text):

        my @new = ();
        push(@new, $+) while $text =~ m{
            "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
           | ([^,]+),?
           | ,
        }gx;
        push(@new, undef) if substr($text,-1,1) eq ',';

    If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (eg, "like \"this\"").

    Alternatively, the Text::ParseWords module (part of the standard Perl distribution) lets you say:

        use Text::ParseWords;
        @new = quotewords(",", 0, $text);

    For parsing or generating CSV, though, using Text::CSV rather than implementing it yourself is highly recommended; you'll save yourself odd bugs popping up later by just using code which has already been tried and tested in production for years.

  How do I strip blank space from the beginning/end of a string?
    (contributed by brian d foy)

    A substitution can do this for you.
    For a single line, you want to replace all the leading or trailing whitespace with nothing. You can do that with a pair of substitutions:

        s/^\s+//;
        s/\s+$//;

    You can also write that as a single substitution, although it turns out the combined statement is slower than the separate ones. That might not matter to you, though:

        s/^\s+|\s+$//g;

    In this regular expression, the alternation matches either at the beginning or the end of the string, since the anchors bind more tightly than the alternation. With the "/g" flag, the substitution makes all possible matches, so it gets both. Remember, the trailing newline matches the "\s+", and the "$" anchor can match to the absolute end of the string, so the newline disappears too. Just add the newline to the output, which has the added benefit of preserving "blank" (consisting entirely of whitespace) lines which the "^\s+" would remove all by itself:

        while( <> ) {
            s/^\s+|\s+$//g;
            print "$_\n";
        }

    For a multi-line string, you can apply the regular expression to each logical line in the string by adding the "/m" flag (for "multi-line"). With the "/m" flag, the "$" matches *before* an embedded newline, so it doesn't remove it. This pattern still removes the newline at the end of the string:

        $string =~ s/^\s+|\s+$//gm;

    Remember that lines consisting entirely of whitespace will disappear, since the first part of the alternation can match the entire string and replace it with nothing. If you need to keep embedded blank lines, you have to do a little more work. Instead of matching any whitespace (since that includes a newline), just match the other whitespace:

        $string =~ s/^[\t\f ]+|[\t\f ]+$//mg;

  How do I pad a string with blanks or pad a number with zeroes?
    In the following examples, $pad_len is the length to which you wish to pad the string, $text or $num contains the string to be padded, and $pad_char contains the padding character.
    You can use a single character string constant instead of the $pad_char variable if you know what it is in advance. And in the same way you can use an integer in place of $pad_len if you know the pad length in advance.

    The simplest method uses the "sprintf" function. It can pad on the left or right with blanks and on the left with zeroes and it will not truncate the result. The "pack" function can only pad strings on the right with blanks and it will truncate the result to a maximum length of $pad_len.

        # Left padding a string with blanks (no truncation):
        my $padded = sprintf("%${pad_len}s", $text);
        my $padded = sprintf("%*s", $pad_len, $text);  # same thing

        # Right padding a string with blanks (no truncation):
        my $padded = sprintf("%-${pad_len}s", $text);
        my $padded = sprintf("%-*s", $pad_len, $text); # same thing

        # Left padding a number with 0 (no truncation):
        my $padded = sprintf("%0${pad_len}d", $num);
        my $padded = sprintf("%0*d", $pad_len, $num); # same thing

        # Right padding a string with blanks using pack (will truncate):
        my $padded = pack("A$pad_len",$text);

    If you need to pad with a character other than blank or zero you can use one of the following methods. They all generate a pad string with the "x" operator and combine that with $text. These methods do not truncate $text.

    Left and right padding with any character, creating a new string:

        my $padded = $pad_char x ( $pad_len - length( $text ) ) . $text;
        my $padded = $text . $pad_char x ( $pad_len - length( $text ) );

    Left and right padding with any character, modifying $text directly:

        substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) );
        $text .= $pad_char x ( $pad_len - length( $text ) );

  How do I extract selected columns from a string?
    (contributed by brian d foy)

    If you know the columns that contain the data, you can use "substr" to extract a single column.

        my $column = substr( $line, $start_column, $length );

    You can use "split" if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data.

        my $line    = ' fred barney   betty   ';
        my @columns = split /\s+/, $line;
            # ( '', 'fred', 'barney', 'betty' );

        my $line    = 'fred||barney||betty';
        my @columns = split /\|/, $line;
            # ( 'fred', '', 'barney', '', 'betty' );

    If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that handle that format, such as Text::CSV, Text::CSV_XS, or Text::CSV_PP.

    If you want to break apart an entire line of fixed columns, you can use "unpack" with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. Note that "unpack" takes the template first and the data second. See the "pack" and "unpack" entries in perlfunc for more details.

        my @fields = unpack( "A8 A8 A8 A16 A4", $line );

    Note that spaces in the format argument to "unpack" do not denote literal spaces. If you have space separated data, you may want "split" instead.

  How do I find the soundex value of a string?
    (contributed by brian d foy)

    You can use the "Text::Soundex" module. If you want to do fuzzy or close matching, you might also try the String::Approx, Text::Metaphone, and Text::DoubleMetaphone modules.

  How can I expand variables in text strings?
    (contributed by brian d foy)

    If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with "sprintf" or "printf":

        my $string = sprintf 'Say hello to %s and %s', $foo, $bar;

    However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it.
    In this example, I want to expand $foo and $bar to their values:

        my $foo = 'Fred';
        my $bar = 'Barney';
        my $string = 'Say hello to $foo and $bar';

    One way I can do this involves the substitution operator and a double "/e" flag. The first "/e" evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:

        $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'

    The "/e" will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the "/e" flag (twice even!), I have all of the same security problems I have with "eval" in its string form. If there's something odd in $foo, perhaps something like "@{[ system "rm -rf /" ]}", then I could get myself in trouble.

    To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single "/e", I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case "???" to signal that I missed something:

        my $string = 'This has $foo and $bar';

        my %Replacements = (
            foo  => 'Fred',
            );

        # $string =~ s/\$(\w+)/$Replacements{$1}/g;
        $string =~ s/\$(\w+)/
            exists $Replacements{$1} ? $Replacements{$1} : '???'
            /eg;

        print $string;

  Does Perl have anything like Ruby's #{} or Python's f string?
    Unlike the others, Perl allows you to embed a variable naked in a double quoted string, e.g. "variable $variable". When there isn't whitespace or other non-word characters following the variable name, you can add braces (e.g. "foo ${foo}bar") to ensure correct parsing.

    An array can also be embedded directly in a string, and will be expanded by default with spaces between the elements. The default LIST_SEPARATOR can be changed by assigning a different string to the special variable $", such as "local $" = ', ';".

    Perl also supports references within a string providing the equivalent of the features in the other two languages.

    "${\ ... }" embedded within a string will work for most simple statements such as an object->method call. More complex code can be wrapped in a do block "${\ do{...} }".

    When you want a list to be expanded per $", use "@{[ ... ]}".

        use feature 'say';   # "say" is not enabled by default
        use Time::Piece;
        use Time::Seconds;
        my $scalar = 'STRING';
        my @array = ( 'zorro', 'a', 1, 'B', 3 );

        # Print the current date and time and then Tomorrow
        my $t = Time::Piece->new;
        say "Now is: ${\ $t->cdate() }";
        say "Tomorrow: ${\ do{ my $T=Time::Piece->new + ONE_DAY ; $T->fullday }}";

        # some variables in strings
        say "This is some scalar I have $scalar, this is an array @array.";
        say "You can also write it like this ${scalar} @{array}.";

        # Change the $LIST_SEPARATOR
        local $" = ':';
        say "Set \$\" to delimit with ':' and sort the Array @{[ sort @array ]}";

    You may also want to look at the module Quote::Code, and templating tools such as Template::Toolkit and Mojo::Template.

    See also: "How can I expand variables in text strings?" and "How do I expand function calls in a string?" in this FAQ.

  What is the difference between a list and an array?
    (contributed by brian d foy)

    A list is a fixed collection of scalars. An array is a variable that holds a variable collection of scalars. An array can supply its collection for list operations, so list operations also work on arrays:

        # slices
        ( 'dog', 'cat', 'bird' )[2,3];
        @animals[2,3];

        # iteration
        foreach ( qw( dog cat bird ) ) { ... }
        foreach ( @animals ) { ... }

        my @three = grep { length == 3 } qw( dog cat bird );
        my @three = grep { length == 3 } @animals;

        # supply an argument list
        wash_animals( qw( dog cat bird ) );
        wash_animals( @animals );

    Array operations, which change the scalars, rearrange them, or add or subtract some scalars, only work on arrays. These can't work on a list, which is fixed. Array operations include "shift", "unshift", "push", "pop", and "splice".
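
    The array operations named above can be sketched on a small array. The contents of @animals here are made up for the example, and the comments show the state after each step:

```perl
use strict;
use warnings;

my @animals = qw( dog cat bird );

push    @animals, 'fish';              # dog cat bird fish
unshift @animals, 'ant';               # ant dog cat bird fish
my $last  = pop   @animals;            # removes 'fish' from the end
my $first = shift @animals;            # removes 'ant' from the front
splice  @animals, 1, 1, 'cow', 'emu';  # replaces 'cat' with two elements

print "@animals\n";                    # dog cow emu bird
```

    None of these would compile against a literal list such as qw( dog cat bird ), because there is no variable to modify.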

    An array can also change its length:

        $#animals = 1;  # truncate to two elements
        $#animals = 10000; # pre-extend to 10,001 elements

    You can change an array element, but you can't change a list element:

        $animals[0] = 'Rottweiler';
        qw( dog cat bird )[0] = 'Rottweiler'; # syntax error!

        foreach ( @animals ) {
            s/^d/fr/;  # works fine
        }

        foreach ( qw( dog cat bird ) ) {
            s/^d/fr/;  # Error! Modification of read only value!
        }

    If the list element is itself a variable, however, it appears that you can change a list element. In that case the list element is the variable, not the data: you're not changing the list element, but something the list element refers to. The list element itself doesn't change: it's still the same variable.

    You also have to be careful about context. You can assign an array to a scalar to get the number of elements in the array. This only works for arrays, though:

        my $count = @animals;  # only works with arrays

    If you try to do the same thing with what you think is a list, you get a quite different result. Although it looks like you have a list on the righthand side, Perl actually sees a bunch of scalars separated by a comma:

        my $scalar = ( 'dog', 'cat', 'bird' );  # $scalar gets bird

    Since you're assigning to a scalar, the righthand side is in scalar context. The comma operator (yes, it's an operator!) in scalar context evaluates its lefthand side, throws away the result, and evaluates its righthand side and returns the result. In effect, that list-lookalike assigns to $scalar its rightmost value. Many people mess this up because they choose a list-lookalike whose last element is also the count they expect:

        my $scalar = ( 1, 2, 3 );  # $scalar gets 3, accidentally

  How can I remove duplicate elements from a list or array?
    (contributed by brian d foy)

    Use a hash. When you think the words "unique" or "duplicated", think "hash keys".

    If you don't care about the order of the elements, you could just create the hash then extract the keys.
    It's not important how you create that hash: just that you use "keys" to get the unique elements.

        my %hash   = map { $_, 1 } @array;
        # or a hash slice: @hash{ @array } = ();
        # or a foreach: $hash{$_} = 1 foreach ( @array );

        my @unique = keys %hash;

    If you want to use a module, try the "uniq" function from List::MoreUtils. In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements.

        use List::MoreUtils qw(uniq);

        my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
        my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

    You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an element, that element has no key in %seen. The "next" statement creates the key and immediately uses its value, which is "undef", so the loop continues to the "push" and increments the value for that key. The next time the loop sees that same element, its key exists in the hash *and* the value for that key is true (since it's not 0 or "undef"), so the "next" skips that iteration and the loop goes to the next element.

        my @unique = ();
        my %seen   = ();

        foreach my $elem ( @array ) {
            next if $seen{ $elem }++;
            push @unique, $elem;
        }

    You can write this more briefly using a grep, which does the same thing.

        my %seen = ();
        my @unique = grep { ! $seen{ $_ }++ } @array;

  How can I tell whether a certain element is contained in a list or array?
    (portions of this answer contributed by Anno Siegel and brian d foy)

    Hearing the word "in" is an *in*dication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays aren't.

    That being said, there are several ways to approach this.
    If you are going to make this query many times over arbitrary string values, the fastest way is probably to invert the original array and maintain a hash whose keys are the first array's values:

        my @blues = qw/azure cerulean teal turquoise lapis-lazuli/;
        my %is_blue = ();
        for (@blues) { $is_blue{$_} = 1 }

    Now you can check whether $is_blue{$some_color}. It might have been a good idea to keep the blues all in a hash in the first place.

    If the values are all small integers, you could use a simple indexed array. This kind of an array will take up less space:

        my @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
        my @is_tiny_prime = ();
        for (@primes) { $is_tiny_prime[$_] = 1 }
        # or simply  @is_tiny_prime[@primes] = (1) x @primes;

    Now you check whether $is_tiny_prime[$some_number].

    If the values in question are integers instead of strings, you can save quite a lot of space by using bit strings instead:

        my @articles = ( 1..10, 150..2000, 2017 );
        my $read = '';
        for (@articles) { vec($read,$_,1) = 1 }

    Now check whether "vec($read,$n,1)" is true for some $n.

    These methods guarantee fast individual tests but require a re-organization of the original list or array. They only pay off if you have to test multiple values against the same array.

    If you are testing only once, the standard module List::Util exports the function "any" for this purpose. It works by stopping once it finds the element. It's written in C for speed, and its Perl equivalent looks like this subroutine:

        sub any (&@) {
            my $code = shift;
            foreach (@_) {
                return 1 if $code->();
            }
            return 0;
        }

    If speed is of little concern, the common idiom uses grep in scalar context (which returns the number of items that passed its condition) to traverse the entire list. This does have the benefit of telling you how many matches it found, though.

        my $is_there = grep $_ eq $whatever, @array;

    If you want to actually extract the matching elements, simply use grep in list context.

        my @matches = grep $_ eq $whatever, @array;

  How do I test whether two arrays or hashes are equal?
    The following code works for single-level arrays. It uses a stringwise comparison, and does not distinguish defined versus undefined empty strings. Modify if you have other needs.

        $are_equal = compare_arrays(\@frogs, \@toads);

        sub compare_arrays {
            my ($first, $second) = @_;
            no warnings;  # silence spurious -w undef complaints
            return 0 unless @$first == @$second;
            for (my $i = 0; $i < @$first; $i++) {
                return 0 if $first->[$i] ne $second->[$i];
            }
            return 1;
        }

    For multilevel structures, you may wish to use an approach more like this one. It uses the CPAN module FreezeThaw:

        use FreezeThaw qw(cmpStr);
        my @a = my @b = ( "this", "that", [ "more", "stuff" ] );

        printf "a and b contain %s arrays\n",
            cmpStr(\@a, \@b) == 0
            ? "the same"
            : "different";

    This approach also works for comparing hashes. Here we'll demonstrate two different answers:

        use FreezeThaw qw(cmpStr cmpStrHard);

        my %a = my %b = ( "this" => "that", "extra" => [ "more", "stuff" ] );
        $a{EXTRA} = \%b;
        $b{EXTRA} = \%a;

        printf "a and b contain %s hashes\n",
            cmpStr(\%a, \%b) == 0 ? "the same" : "different";

        printf "a and b contain %s hashes\n",
            cmpStrHard(\%a, \%b) == 0 ? "the same" : "different";

    The first reports that both hashes contain the same data, while the second reports that they do not. Which you prefer is left as an exercise to the reader.

  How do I find the first array element for which a condition is true?
    To find the first array element which satisfies a condition, you can use the "first()" function in the List::Util module, which comes with Perl 5.8. This example finds the first element that contains "Perl".

        use List::Util qw(first);

        my $element = first { /Perl/ } @array;

    If you cannot use List::Util, you can make your own loop to do the same thing. Once you find the element, you stop the loop with last.

        my $found;
        foreach ( @array ) {
            if( /Perl/ ) { $found = $_; last }
        }

    If you want the array index, use the "firstidx()" function from "List::MoreUtils":

        use List::MoreUtils qw(firstidx);
        my $index = firstidx { /Perl/ } @array;

    Or write it yourself, iterating through the indices and checking the array element at each index until you find one that satisfies the condition:

        my( $found, $index ) = ( undef, -1 );
        for( my $i = 0; $i < @array; $i++ ) {
            if( $array[$i] =~ /Perl/ ) {
                $found = $array[$i];
                $index = $i;
                last;
            }
        }

  How do I handle linked lists?
    (contributed by brian d foy)

    Perl's arrays do not have a fixed size, so you don't need linked lists if you just want to add or remove items. You can use array operations such as "push", "pop", "shift", "unshift", or "splice" to do that.

    Sometimes, however, linked lists can be useful in situations where you want to "shard" an array so you have many small arrays instead of a single big array. You can keep arrays longer than Perl's largest array index, lock smaller arrays separately in threaded programs, reallocate less memory, or quickly insert elements in the middle of the chain.

    Steve Lembark goes through the details in his YAPC::NA 2009 talk "Perly Linked Lists" ( <http://www.slideshare.net/lembark/perly-linked-lists> ), although you can just use his LinkedList::Single module.

  How do I handle circular lists?
    (contributed by brian d foy)

    If you want to cycle through an array endlessly, you can increment the index modulo the number of elements in the array:

        my @array = qw( a b c );
        my $i = 0;

        while( 1 ) {
            print $array[ $i++ % @array ], "\n";
            last if $i > 20;
        }

    You can also use Tie::Cycle to use a scalar that always has the next element of the circular array:

        use Tie::Cycle;

        tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ];

        print $cycle; # FFFFFF
        print $cycle; # 000000
        print $cycle; # FFFF00

    The Array::Iterator::Circular module creates an iterator object for circular arrays:

        use Array::Iterator::Circular;

        my $color_iterator = Array::Iterator::Circular->new(
            qw(red green blue orange)
        );

        foreach ( 1 .. 20 ) {
            print $color_iterator->next, "\n";
        }

  How do I permute N elements of a list?
    Use the List::Permutor module on CPAN. If the list is actually an array, try the Algorithm::Permute module (also on CPAN). It's written in XS code and is very efficient:

        use Algorithm::Permute;

        my @array = 'a'..'d';
        my $p_iterator = Algorithm::Permute->new ( \@array );

        while (my @perm = $p_iterator->next) {
            print "next permutation: (@perm)\n";
        }

    For even faster execution, you could do:

        use Algorithm::Permute;

        my @array = 'a'..'d';

        Algorithm::Permute::permute {
            print "next permutation: (@array)\n";
        } @array;

    Here's a little program that generates all permutations of all the words on each line of input.
    The algorithm embodied in the "permute()" function is discussed in Volume 4 (still unpublished) of Knuth's *The Art of Computer Programming* and will work on any list:

        #!/usr/bin/perl -n
        # Fischer-Krause ordered permutation generator

        sub permute (&@) {
            my $code = shift;
            my @idx = 0..$#_;
            while ( $code->(@_[@idx]) ) {
                my $p = $#idx;
                --$p while $idx[$p-1] > $idx[$p];
                my $q = $p or return;
                push @idx, reverse splice @idx, $p;
                ++$q while $idx[$p-1] > $idx[$q];
                @idx[$p-1,$q]=@idx[$q,$p-1];
            }
        }

        permute { print "@_\n" } split;

    The Algorithm::Loops module also provides the "NextPermute" and "NextPermuteNum" functions which efficiently find all unique permutations of an array, even if it contains duplicate values, modifying it in-place: if its elements are in reverse-sorted order then the array is reversed, making it sorted, and it returns false; otherwise the next permutation is returned.

    "NextPermute" uses string order and "NextPermuteNum" numeric order, so you can enumerate all the permutations of 0..9 like this:

        use Algorithm::Loops qw(NextPermuteNum);

        my @list= 0..9;
        do { print "@list\n" } while NextPermuteNum @list;

  How do I sort a hash (optionally by value instead of key)?
    (contributed by brian d foy)

    To sort a hash, start with the keys. In this example, we give the list of keys to the sort function which then compares them ASCIIbetically (which might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order.

        my @keys = sort { $a cmp $b } keys %hash;

        foreach my $key ( @keys ) {
            printf "%-20s %6d\n", $key, $hash{$key};
        }

    We could get more fancy in the "sort()" block though. Instead of comparing the keys, we can compute a value with them and use that value as the comparison.
    For instance, to make our report order case-insensitive, we use "lc" to lowercase the keys before comparing them:

        my @keys = sort { lc $a cmp lc $b } keys %hash;

    Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the computation results.

    If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value.

        my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;

    From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key.

        my @keys = sort {
            $hash{$a} <=> $hash{$b}
                or
            "\L$a" cmp "\L$b"
        } keys %hash;

  Why don't my tied hashes make the defined/exists distinction?
    This depends on the tied hash's implementation of EXISTS(). For example, there isn't the concept of undef with hashes that are tied to DBM* files. It also means that exists() and defined() do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes.

  How can I store a multidimensional array in a DBM file?
    Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module from CPAN and layer it on top of either DB_File or GDBM_File. You might also try DBM::Deep, but it can be a bit slow.

  How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays?
    Usually a hash ref, perhaps like this:

        $record = {
            NAME   => "Jason",
            EMPNO  => 132,
            TITLE  => "deputy peon",
            AGE    => 23,
            SALARY => 37_000,
            PALS   => [ "Norbert", "Rhys", "Phineas"],
        };

    References are documented in perlref and perlreftut. Examples of complex data structures are given in perldsc and perllol. Examples of structures and object-oriented classes are in perlootut.

  How can I check if a key exists in a multilevel hash?
    (contributed by brian d foy)

    The trick to this problem is avoiding accidental autovivification.
    If you want to check three keys deep, you might naïvely try this:

        my %hash;
        if( exists $hash{key1}{key2}{key3} ) {
            ...;
        }

    Even though you started with a completely empty hash, after that call to "exists" you've created the structure you needed to check for "key3":

        %hash = (
            'key1' => {
                'key2' => {}
            }
        );

    That's autovivification. You can get around this in a few ways. The easiest way is to just turn it off. The lexical "autovivification" pragma is available on CPAN. Now you don't add to the hash:

        {
            no autovivification;
            my %hash;
            if( exists $hash{key1}{key2}{key3} ) {
                ...;
            }
        }

    The Data::Diver module on CPAN can do it for you too. Its "Dive" subroutine can tell you not only if the keys exist but also get the value:

        use Data::Diver qw(Dive);

        my @exists = Dive( \%hash, qw(key1 key2 key3) );
        if( ! @exists ) {
            ...; # keys do not exist
        }
        elsif( ! defined $exists[0] ) {
            ...; # keys exist but value is undef
        }

    You can easily do this yourself too by checking each level of the hash before you move onto the next level. This is essentially what Data::Diver does for you:

        if( check_hash( \%hash, qw(key1 key2 key3) ) ) {
            ...;
        }

        sub check_hash {
            my( $hash, @keys ) = @_;

            return unless @keys;

            foreach my $key ( @keys ) {
                return unless eval { exists $hash->{$key} };
                $hash = $hash->{$key};
            }

            return 1;
        }

  How do I keep persistent data across program calls?
    For some specific applications, you can use one of the DBM modules. See AnyDBM_File. More generically, you should consult the FreezeThaw or Storable modules from CPAN. Starting from Perl 5.8, Storable is part of the standard distribution. Here's one example using Storable's "store" and "retrieve" functions:

        use Storable;
        store(\%hash, "filename");

        # later on...
        $href = retrieve("filename");        # by ref
        %hash = %{ retrieve("filename") };   # direct to hash

  How do I print out or copy a recursive data structure?
    The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great for printing out data structures.
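    As a minimal sketch of printing a recursive structure, the following dumps a hash that contains a reference to itself. The %node hash and its keys are made up for illustration; "Dumper" and the $Data::Dumper::Sortkeys and $Data::Dumper::Indent settings are part of Data::Dumper itself. The self-reference should show up as a reference back to $VAR1 rather than sending the dumper into an endless loop.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data: a hash that refers back to itself.
my %node = ( name => 'root', kids => [ 'a', 'b' ] );
$node{self} = \%node;

local $Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order
local $Data::Dumper::Indent   = 1;    # use the more compact indentation style

# The cycle is reported as a reference to $VAR1 instead of being followed.
print Dumper( \%node );
```

    Sorting the keys is optional; it only makes the output stable from run to run, which helps when you compare two dumps.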
    The Storable module on CPAN (or the 5.8 release of Perl) provides a function called "dclone" that recursively copies its argument.

        use Storable qw(dclone);
        $r2 = dclone($r1);

    Where $r1 can be a reference to any kind of data structure you'd like. It will be deeply copied. Because "dclone" takes and returns references, you'd have to add extra punctuation if you had a hash of arrays that you wanted to copy.

        %newhash = %{ dclone(\%oldhash) };

Found in /usr/share/perl/5.34/pod/perlfaq5.pod
  How do I flush/unbuffer an output filehandle? Why must I do this?
    (contributed by brian d foy)

    You might like to read Mark Jason Dominus's "Suffering From Buffering" at <http://perl.plover.com/FAQs/Buffering.html> .

    Perl normally buffers output so it doesn't make a system call for every bit of output. By saving up output, it makes fewer expensive system calls. For instance, in this little bit of code, you want to print a dot to the screen for every line you process to watch the progress of your program. Instead of seeing a dot for every line, Perl buffers the output and you have a long wait before you see a row of 50 dots all at once:

        # long wait, then row of dots all at once
        while( <> ) {
            print ".";
            print "\n" unless ++$count % 50;

            #... expensive line processing operations
        }

    To get around this, you have to unbuffer the output filehandle, in this case, "STDOUT". You can set the special variable $| to a true value (mnemonic: making your filehandles "piping hot"):

        $|++;

        # dot shown immediately
        while( <> ) {
            print ".";
            print "\n" unless ++$count % 50;

            #... expensive line processing operations
        }

    The $| is one of the per-filehandle special variables, so each filehandle has its own copy of its value.
    If you want to merge standard output and standard error for instance, you have to unbuffer each (although STDERR might be unbuffered by default):

        {
            my $previous_default = select(STDOUT);  # save previous default
            $|++;                                   # autoflush STDOUT
            select(STDERR);
            $|++;                                   # autoflush STDERR, to be sure
            select($previous_default);              # restore previous default
        }

        # now should alternate . and +
        while( 1 ) {
            sleep 1;
            print STDOUT ".";
            print STDERR "+";
            print STDOUT "\n" unless ++$count % 25;
        }

    Besides the $| special variable, you can use "binmode" to give your filehandle a ":unix" layer, which is unbuffered:

        binmode( STDOUT, ":unix" );

        while( 1 ) {
            sleep 1;
            print ".";
            print "\n" unless ++$count % 50;
        }

    For more information on output layers, see the entries for "binmode" and open in perlfunc, and the PerlIO module documentation.

    If you are using IO::Handle or one of its subclasses, you can call the "autoflush" method to change the settings of the filehandle:

        use IO::Handle;
        open my( $io_fh ), ">", "output.txt"
            or die "Could not open output.txt: $!";
        $io_fh->autoflush(1);

    The IO::Handle objects also have a "flush" method. You can flush the buffer any time you want without auto-buffering:

        $io_fh->flush;

  How do I delete the last N lines from a file?
    (contributed by brian d foy)

    The easiest conceptual solution is to count the lines in the file then start at the beginning and print the number of lines (minus the last N) to a new file.

    Most often, the real question is how you can delete the last N lines without making more than one pass over the file, or how to do it without a lot of copying. The easy concept is the hard reality when you might have millions of lines in your file.

    One trick is to use File::ReadBackwards, which starts at the end of the file. That module provides an object that wraps the real filehandle to make it easy for you to move around the file. Once you get to the spot you need, you can get the actual filehandle and work with it as normal.
    In this case, you get the file position at the end of the last line you want to keep and truncate the file to that point:

        use File::ReadBackwards;

        my $filename = 'test.txt';
        my $Lines_to_truncate = 2;

        my $bw = File::ReadBackwards->new( $filename )
            or die "Could not read backwards in [$filename]: $!";

        my $lines_from_end = 0;
        until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
            print "Got: ", $bw->readline;
            $lines_from_end++;
        }

        truncate( $filename, $bw->tell );

    The File::ReadBackwards module also has the advantage of setting the input record separator to a regular expression.

    You can also use the Tie::File module which lets you access the lines through a tied array. You can use normal array operations to modify your file, including setting the last index and using "splice".

  How can I open a filehandle to a string?
    (contributed by Peter J. Holzer, hjp-usenet2 AT hjp.at)

    Since Perl 5.8.0 a file handle referring to a string can be created by calling open with a reference to that string instead of the filename. This file handle can then be used to read from or write to the string:

        open(my $fh, '>', \$string) or die "Could not open string for writing";
        print $fh "foo\n";
        print $fh "bar\n";    # $string now contains "foo\nbar\n"

        open(my $fh, '<', \$string) or die "Could not open string for reading";
        my $x = <$fh>;    # $x now contains "foo\n"

    With older versions of Perl, the IO::String module provides similar functionality.

  How can I write() into a string?
    (contributed by brian d foy)

    If you want to "write" into a string, you just have to "open" a filehandle to a string, which Perl has been able to do since Perl 5.8:

        open FH, '>', \my $string;
        write( FH );

    Since you want to be a good programmer, you probably want to use a lexical filehandle, even though formats are designed to work with bareword filehandles since the default format names take the filehandle name.
    However, you can control this with some Perl special per-filehandle variables: $^, which names the top-of-page format, and $~, which names the line format. You have to change the default filehandle to set these variables:

        open my($fh), '>', \my $string;

        { # set per-filehandle variables
            my $old_fh = select( $fh );
            $~ = 'ANIMAL';
            $^ = 'ANIMAL_TOP';
            select( $old_fh );
        }

        format ANIMAL_TOP =
         ID  Type    Name
        .

        format ANIMAL =
        @##   @<<<    @<<<<<<<<<<<<<<
        $id,  $type,  $name
        .

    Although write can work with lexical or package variables, whatever variables you use have to be in scope in the format. That most likely means you'll want to localize some package variables:

        {
            local( $id, $type, $name ) = qw( 12 cat Buster );
            write( $fh );
        }

        print $string;

    There are also some tricks that you can play with "formline" and the accumulator variable $^A, but you lose a lot of the value of formats since "formline" won't handle paging and so on. You end up reimplementing formats when you use them.

  Why do I sometimes get an "Argument list too long" when I use <*>?
    The "<>" operator performs a globbing operation (see above). In Perl versions earlier than v5.6.0, the internal glob() operator forks csh(1) to do the actual glob expansion, but csh can't handle more than 127 items and so gives the error message "Argument list too long". People who installed tcsh as csh won't have this problem, but their users may be surprised by it.

    To get around this, either upgrade to Perl v5.6.0 or later, do the glob yourself with readdir() and patterns, or use a module like File::Glob, one that doesn't use the shell to do globbing.

  Why can't I just open(FH, ">file.lock")?
    A common bit of code NOT TO USE is this:

        sleep(3) while -e 'file.lock';    # PLEASE DO NOT USE
        open my $lock, '>', 'file.lock';  # THIS BROKEN CODE

    This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction.
    In theory, this "ought" to work:

        use Fcntl;    # for the O_* constants
        sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT
            or die "can't open file.lock: $!";

    except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. Various schemes involving link() have been suggested, but these tend to involve busy-wait, which is also less than desirable.

  I still don't get locking. I just want to increment the number in the file. How can I do this?
    Didn't anyone ever tell you web-page hit counters were useless? They don't count the number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic.

    Anyway, this is what you can do if you can't help yourself.

        use Fcntl qw(:DEFAULT :flock);
        sysopen my $fh, "numfile", O_RDWR|O_CREAT
            or die "can't open numfile: $!";
        flock $fh, LOCK_EX
            or die "can't flock numfile: $!";
        my $num = <$fh> || 0;
        seek $fh, 0, 0
            or die "can't rewind numfile: $!";
        truncate $fh, 0
            or die "can't truncate numfile: $!";
        (print $fh $num+1, "\n")
            or die "can't write numfile: $!";
        close $fh
            or die "can't close numfile: $!";

    Here's a much better web-page hit counter:

        $hits = int( (time() - 850_000_000) / rand(1_000) );

    If the count doesn't impress your friends, then the code might. :-)

  All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?
    If you are on a system that correctly implements "flock" and you use the example appending code from "perldoc -f flock" everything will be OK even if the OS you are on doesn't implement append mode correctly (if such a system exists). So if you are happy to restrict yourself to OSs that implement "flock" (and that's not really much of a restriction) then that is what you should do.

    If you know you are only going to use a system that does correctly implement appending (i.e. not Win32) then you can omit the "seek" from the code in the previous answer.
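
    For readers who don't have "perldoc -f flock" to hand, here is a rough sketch of the lock, seek, and append pattern it describes. The append_line() helper and the temporary file are hypothetical and exist only for this illustration; "flock", "seek", and the Fcntl constants are standard. Treat it as a starting point rather than the canonical code.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);
use File::Temp qw(tempfile);

# append_line() is a hypothetical helper, not part of any module.
sub append_line {
    my ( $path, $text ) = @_;

    open( my $fh, '>>', $path ) or die "can't open $path: $!";
    flock( $fh, LOCK_EX )       or die "can't lock $path: $!";

    # Another process may have appended while we waited for the lock,
    # so move to the current end of the file before writing.
    seek( $fh, 0, SEEK_END )    or die "can't seek $path: $!";

    print {$fh} $text, "\n"     or die "can't write $path: $!";

    # Closing the handle flushes the buffer and releases the lock.
    close($fh)                  or die "can't close $path: $!";
    return 1;
}

# Demonstration against a throwaway file that is removed at exit.
my ( undef, $log ) = tempfile( UNLINK => 1 );
append_line( $log, 'first entry' );
append_line( $log, 'second entry' );
```

    The explicit "seek" is the part the paragraph above says you can drop on systems whose append mode already behaves correctly; keeping it costs little.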

    If you know you are only writing code to run on an OS and filesystem that does implement append mode correctly (a local filesystem on a modern Unix, for example), and you keep the file in block-buffered mode, and you write less than one buffer-full of output between each manual flushing of the buffer, then each bufferload is almost guaranteed to be written to the end of the file in one chunk without getting intermingled with anyone else's output. You can also use the "syswrite" function, which is simply a wrapper around your system's write(2) system call.

    There is still a small theoretical chance that a signal will interrupt the system-level "write()" operation before completion. There is also a possibility that some STDIO implementations may call multiple system level "write()"s even if the buffer was empty to start. There may be some systems where this probability is reduced to zero, and this is not a concern when using ":perlio" instead of your system's STDIO.

  How do I get a file's timestamp in perl?
    If you want to retrieve the time at which the file was last read, written, or had its meta-data (owner, etc) changed, you use the -A, -M, or -C file test operations as documented in perlfunc. These retrieve the age of the file (measured against the start-time of your program) in days as a floating point number. Some platforms may not have all of these times. See perlport for details. To retrieve the "raw" time in seconds since the epoch, you would call the stat function, then use "localtime()", "gmtime()", or "POSIX::strftime()" to convert this into human-readable form.

    Here's an example:

        my $write_secs = (stat($file))[9];
        printf "file %s updated at %s\n", $file,
            scalar localtime($write_secs);

    If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later):

        # error checking left as an exercise for reader.
        use File::stat;
        use Time::localtime;
        my $date_string = ctime(stat($file)->mtime);
        print "file $file updated at $date_string\n";

    The POSIX::strftime() approach has the benefit of being, in theory, independent of the current locale. See perllocale for details.

  How do I set a file's timestamp in perl?
    You use the utime() function documented in "utime" in perlfunc. By way of example, here's a little program that copies the read and write times from its first argument to all the rest of them.

        if (@ARGV < 2) {
            die "usage: cptimes timestamp_file other_files ...\n";
        }
        my $timestamp = shift;
        my($atime, $mtime) = (stat($timestamp))[8,9];
        utime $atime, $mtime, @ARGV;

    Error checking is, as usual, left as an exercise for the reader.

    The perldoc for utime also has an example that has the same effect as touch(1) on files that *already exist*.

    Certain file systems have a limited ability to store the times on a file at the expected level of precision. For example, the FAT and HPFS filesystems are unable to create dates on files with a finer granularity than two seconds. This is a limitation of the filesystems, not of utime().

Found in /usr/share/perl/5.34/pod/perlfaq6.pod
  How do I match XML, HTML, or other nasty, ugly things with a regex?
    Do not use regexes. Use a module and forget about the regular expressions. The XML::LibXML, HTML::TokeParser and HTML::TreeBuilder modules are good starts, although each namespace has other parsing modules specialized for certain tasks and different ways of doing it. Start at CPAN Search ( <http://metacpan.org/> ) and wonder at all the work people have done for you already! :)

  How do I substitute case-insensitively on the LHS while preserving case on the RHS?
    Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings.

        $_= "this is a TEsT case";

        $old = 'test';
        $new = 'success';

        s{(\Q$old\E)}
        { uc $new | (uc $1 ^ $1) .
            (uc(substr $1, -1) ^ substr $1, -1) x
            (length($new) - length $1)
        }egi;

        print;

    And here it is as a subroutine, modeled after the above:

        sub preserve_case {
            my ($old, $new) = @_;
            my $mask = uc $old ^ $old;

            uc $new | $mask .
                substr($mask, -1) x (length($new) - length($old))
        }

        $string = "this is a TEsT case";
        $string =~ s/(test)/preserve_case($1, "success")/egi;
        print "$string\n";

    This prints:

        this is a SUcCESS case

    As an alternative, to keep the case of the replacement word if it is longer than the original, you can use this code, by Jeff Pinyan:

        sub preserve_case {
            my ($from, $to) = @_;
            my ($lf, $lt) = map length, @_;

            if ($lt < $lf) { $from = substr $from, 0, $lt }
            else { $from .= substr $to, $lf }

            return uc $to | ($from ^ uc $from);
        }

    This changes the sentence to "this is a SUcCess case."

    Just to show that C programmers can write C in any programming language, if you prefer a more C-like solution, the following script makes the substitution have the same case, letter by letter, as the original. (It also happens to run about 240% slower than the Perlish solution runs.) If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution.

        # Original by Nathan Torkington, massaged by Jeffrey Friedl
        #
        sub preserve_case
        {
            my ($old, $new) = @_;
            my $state = 0; # 0 = no change; 1 = lc; 2 = uc
            my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
            my $len = $oldlen < $newlen ?
                $oldlen : $newlen;

            for ($i = 0; $i < $len; $i++) {
                if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
                    $state = 0;
                }
                elsif (lc $c eq $c) {
                    substr($new, $i, 1) = lc(substr($new, $i, 1));
                    $state = 1;
                }
                else {
                    substr($new, $i, 1) = uc(substr($new, $i, 1));
                    $state = 2;
                }
            }
            # finish up with any remaining new (for when new is longer than old)
            if ($newlen > $oldlen) {
                if ($state == 1) {
                    substr($new, $oldlen) = lc(substr($new, $oldlen));
                }
                elsif ($state == 2) {
                    substr($new, $oldlen) = uc(substr($new, $oldlen));
                }
            }
            return $new;
        }

  How do I use a regular expression to strip C-style comments from a file?
    While this actually can be done, it's much harder than you'd think. For example, this one-liner

        perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

    will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis.

        $/ = undef;
        $_ = <>;
        s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
        print;

    This could, of course, be more legibly written with the "/x" modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis.

        s{
           /\*         ##  Start of /* ... */ comment
           [^*]*\*+    ##  Non-* followed by 1-or-more *'s
           (
             [^/*][^*]*\*+
           )*          ##  0-or-more things which don't start with /
                       ##    but do end with '*'
           /           ##  End of /* ... */ comment

         |         ##     OR  various things which aren't comments:

           (
             "           ##  Start of " ... " string
             (
               \\.           ##  Escaped char
             |               ##    OR
               [^"\\]        ##  Non "\
             )*
             "           ##  End of " ... " string

           |         ##     OR

             '           ##  Start of ' ... ' string
             (
               \\.           ##  Escaped char
             |               ##    OR
               [^'\\]        ##  Non '\
             )*
             '           ##  End of ' ... ' string

           |         ##     OR

             .           ##  Any other char
             [^/"'\\]*   ##  Chars which don't start a comment, string or escape
           )
         }{defined $2 ?
            $2 : ""}gxse;

    A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character:

        s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;

  How can I match strings with multibyte characters?
    Starting from Perl 5.6 Perl has had some level of multibyte character support. Perl 5.8 or later is recommended. Supported multibyte character repertoires include Unicode, and legacy encodings through the Encode module. See perluniintro, perlunicode, and Encode.

    If you are stuck with older Perls, you can do Unicode with the Unicode::String module, and character conversions using the Unicode::Map8 and Unicode::Map modules. If you are using Japanese encodings, you might try using the jperl 5.005_03.

    Finally, the following set of approaches was offered by Jeffrey Friedl, whose article in issue #5 of The Perl Journal talks about this very matter.

    Let's suppose you have some weird Martian encoding where pairs of ASCII uppercase letters encode single Martian letters (i.e. the two bytes "CV" make a single Martian letter, as do the two bytes "SG", "VS", "XX", etc.). Other bytes represent single characters, just like ASCII.

    So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.

    Now, say you want to search for the single character "/GX/". Perl doesn't know about Martian, so it'll find the two bytes "GX" in the "I am CVSGXX!" string, even though that character isn't there: it just looks like it is because "SG" is next to "XX", but there's no real "GX". This is a big problem.

    Here are a few ways, all painful, to deal with it:

        # Make sure adjacent "martian" bytes are no longer adjacent.
        $martian =~ s/([A-Z][A-Z])/ $1 /g;

        print "found GX!\n" if $martian =~ /GX/;

    Or like this:

        my @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
        # above is conceptually similar to:     my @chars = $text =~ m/(.)/g;
        #
        foreach my $char (@chars) {
            if ($char eq 'GX') {
                print "found GX!\n";
                last;
            }
        }

    Or like this:

        while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {  # \G probably unneeded
            if ($1 eq 'GX') {
                print "found GX!\n";
                last;
            }
        }

    Here's another, slightly less painful, way to do it from Benjamin Goldberg, who uses a zero-width negative look-behind assertion.

        print "found GX!\n" if $martian =~ m/
            (?<![A-Z])
            (?:[A-Z][A-Z])*?
            GX
            /x;

    This succeeds if the "martian" character GX is in the string, and fails otherwise. If you don't like using (?<!), a zero-width negative look-behind assertion, you can replace (?<![A-Z]) with (?:^|[^A-Z]).

    It does have the drawback of putting the wrong thing in $-[0] and $+[0], but this usually can be worked around.

Found in /usr/share/perl/5.34/pod/perlfaq7.pod
  Do I always/never have to quote my strings or use semicolons and commas?
    Normally, a bareword doesn't need to be quoted, but in most cases probably should be (and must be under "use strict"). But a hash key consisting of a simple word and the left-hand operand to the "=>" operator both count as though they were quoted:

        This                    is like this
        ------------            ---------------
        $foo{line}              $foo{'line'}
        bar => stuff            'bar' => stuff

    The final semicolon in a block is optional, as is the final comma in a list. Good style (see perlstyle) says to put them in except for one-liners:

        if ($whoops) { exit 1 }
        my @nums = (1, 2, 3);

        if ($whoops) {
            exit 1;
        }

        my @lines = (
            "There Beren came from mountains cold",
            "And lost he wandered under leaves",
        );

  How do I declare/create a structure?
    In general, you don't "declare" a structure. Just use a (probably anonymous) hash reference. See perlref and perldsc for details.
    Here's an example:

        $person = {};                   # new anonymous hash
        $person->{AGE}  = 24;           # set field AGE to 24
        $person->{NAME} = "Nat";        # set field NAME to "Nat"

    If you're looking for something a bit more rigorous, try perlootut.

  How do I create a static variable?
    (contributed by brian d foy)

    In Perl 5.10 and later, declare the variable with "state". The "state" declaration creates the lexical variable that persists between calls to the subroutine:

        use feature 'state';    # or: use v5.10;
        sub counter { state $count = 1; $count++ }

    You can fake a static variable by using a lexical variable which goes out of scope. In this example, you define the subroutine "counter", and it uses the lexical variable $count. Since you wrap this in a BEGIN block, $count is defined at compile-time, but also goes out of scope at the end of the BEGIN block. The BEGIN block also ensures that the subroutine and the value it uses are defined at compile-time so the subroutine is ready to use just like any other subroutine, and you can put this code in the same place as other subroutines in the program text (i.e. at the end of the code, typically). The subroutine "counter" still has a reference to the data, and is the only way you can access the value (and each time you do, you increment the value). The data in the chunk of memory defined by $count is private to "counter".

        BEGIN {
            my $count = 1;
            sub counter { $count++ }
        }

        my $start = counter();

        .... # code that calls counter();

        my $end = counter();

    In the previous example, you created a function-private variable because only one function remembered its reference. You could define multiple functions while the variable is in scope, and each function can share the "private" variable. It's not really "static" because you can access it outside the function while the lexical variable is in scope, and even create references to it. In this example, "increment_count" and "return_count" share the variable. One function adds to the value and the other simply returns the value.
    They can both access $count, and since it has gone out of scope, there is no other way to access it.

        BEGIN {
            my $count = 1;
            sub increment_count { $count++ }
            sub return_count    { $count }
        }

    To declare a file-private variable, you still use a lexical variable. A file is also a scope, so a lexical variable defined in the file cannot be seen from any other file.

    See "Persistent Private Variables" in perlsub for more information. The discussion of closures in perlref may help you even though we did not use anonymous subroutines in this answer.

  What's the difference between dynamic and lexical (static) scoping? Between local() and my()?
    "local($x)" saves away the old value of the global variable $x and assigns a new value for the duration of the subroutine *which is visible in other functions called from that subroutine*. This is done at run-time, so is called dynamic scoping. local() always affects global variables, also called package variables or dynamic variables.

    "my($x)" creates a new variable that is only visible in the current subroutine. This is done at compile-time, so it is called lexical or static scoping. my() always affects private variables, also called lexical variables or (improperly) static(ly scoped) variables.

    For instance:

        sub visible {
            print "var has value $var\n";
        }

        sub dynamic {
            local $var = 'local';   # new temporary value for the still-global
            visible();              #   variable called $var
        }

        sub lexical {
            my $var = 'private';    # new private variable, $var
            visible();              # (invisible outside of sub scope)
        }

        $var = 'global';

        visible();              # prints global
        dynamic();              # prints local
        lexical();              # prints global

    Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() function, and it is hidden from the called subroutine.

    In summary, local() doesn't make what you think of as private, local variables.
It gives a global variable a temporary value. my() is what you're looking for if you want private variables.

See "Private Variables via my()" in perlsub and "Temporary Values via local()" in perlsub for excruciating details.

How do I create a switch or case statement?

There is a given/when statement in Perl, but it is experimental and likely to change in the future. See perlsyn for more details.

The general answer is to use a CPAN module such as Switch::Plain:

    use Switch::Plain;
    sswitch($variable_holding_a_string) {
        case 'first': { }
        case 'second': { }
        default: { }
    }

or for more complicated comparisons, "if-elsif-else":

    for ($variable_to_test) {
        if    (/pat1/)  { }     # do something
        elsif (/pat2/)  { }     # do something else
        elsif (/pat3/)  { }     # do something else
        else            { }     # default
    }

Here's a simple example of a switch based on pattern matching, lined up in a way to make it look more like a switch statement. We'll do a multiway conditional based on the type of reference stored in $whatchamacallit:

    SWITCH: for (ref $whatchamacallit) {

        /^$/        && die "not a reference";

        /SCALAR/    && do {
                        print_scalar($$whatchamacallit);
                        last SWITCH;
                    };

        /ARRAY/     && do {
                        print_array(@$whatchamacallit);
                        last SWITCH;
                    };

        /HASH/      && do {
                        print_hash(%$whatchamacallit);
                        last SWITCH;
                    };

        /CODE/      && do {
                        warn "can't print function ref";
                        last SWITCH;
                    };

        # DEFAULT

        warn "User defined type skipped";

    }

See perlsyn for other examples in this style.

Sometimes you should change the positions of the constant and the variable. For example, let's say you wanted to test which of many answers you were given, but in a case-insensitive way that also allows abbreviations.
You can use the following technique if the strings all start with different characters or if you want to arrange the matches so that one takes precedence over another, as "SEND" has precedence over "STOP" here:

    chomp($answer = <>);
    if    ("SEND"  =~ /^\Q$answer/i) { print "Action is send\n"  }
    elsif ("STOP"  =~ /^\Q$answer/i) { print "Action is stop\n"  }
    elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
    elsif ("LIST"  =~ /^\Q$answer/i) { print "Action is list\n"  }
    elsif ("EDIT"  =~ /^\Q$answer/i) { print "Action is edit\n"  }

A totally different approach is to create a hash of function references.

    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }

Starting from Perl 5.8, a source filter module, "Switch", can also be used to get switch and case. Its use is now discouraged, because it's not fully compatible with the native switch of Perl 5.10, and because, as it's implemented as a source filter, it doesn't always work as intended when complex syntax is involved.

Found in /usr/share/perl/5.34/pod/perlfaq8.pod

How do I find out which operating system I'm running under?

The $^O variable ($OSNAME if you use "English") contains an indication of the name of the operating system (not its release number) that your perl binary was built for.

How do I do fancy stuff with the keyboard/screen/mouse?

How you access/control keyboards, screens, and pointing devices ("mice") is system-dependent.
Try the following modules:

Keyboard
    Term::Cap               Standard perl distribution
    Term::ReadKey           CPAN
    Term::ReadLine::Gnu     CPAN
    Term::ReadLine::Perl    CPAN
    Term::Screen            CPAN

Screen
    Term::Cap               Standard perl distribution
    Curses                  CPAN
    Term::ANSIColor         CPAN

Mouse
    Tk                      CPAN
    Wx                      CPAN
    Gtk2                    CPAN
    Qt4                     kdebindings4 package

Some of these specific cases are shown as examples in other answers in this section of the perlfaq.

How do I read just one key without waiting for a return key?

Controlling input buffering is a remarkably system-dependent matter. On many systems, you can just use the stty command as shown in "getc" in perlfunc, but as you see, that's already getting you into portability snags.

    open(TTY, "+</dev/tty") or die "no tty: $!";
    system "stty  cbreak </dev/tty >/dev/tty 2>&1";
    $key = getc(TTY);        # perhaps this works
    # OR ELSE
    sysread(TTY, $key, 1);    # probably this does
    system "stty -cbreak </dev/tty >/dev/tty 2>&1";

The Term::ReadKey module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to stty for each key. It even includes limited support for Windows.

    use Term::ReadKey;
    ReadMode('cbreak');
    $key = ReadKey(0);
    ReadMode('normal');

However, using the code requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution using the standard POSIX module, which is already on your system (assuming your system supports POSIX).

    use HotKey;
    $key = readkey();

And here's the "HotKey" module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures.
    # HotKey.pm
    package HotKey;

    use strict;
    use warnings;

    use parent 'Exporter';
    our @EXPORT = qw(cbreak cooked readkey);

    use POSIX qw(:termios_h);
    my ($term, $oterm, $echo, $noecho, $fd_stdin);

    $fd_stdin = fileno(STDIN);
    $term     = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm    = $term->getlflag();

    $echo     = ECHO | ECHOK | ICANON;
    $noecho   = $oterm & ~$echo;

    sub cbreak {
        $term->setlflag($noecho);  # ok, so i don't want echo either
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub readkey {
        my $key = '';
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }

    END { cooked() }

    1;

How do I start a process in the background?

(contributed by brian d foy)

There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are covered in perlipc.

Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.

If you are on a Unix-like system, you might be able to get away with a system call where you put an "&" on the end of the command:

    system("cmd &")

You can also try using "fork", as described in perlfunc (although this is the same thing that many of the modules will do for you).

STDIN, STDOUT, and STDERR are shared
    Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child.
    You can get around this with "open"ing a pipe (see "open" in perlfunc) but on some systems this means that the child process cannot outlive the parent.

Signals
    You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with "system("cmd&")".

Zombies
    You have to be prepared to "reap" the child process when it finishes. Use one of these two settings, not both:

        $SIG{CHLD} = sub { wait };      # reap the child in a handler

        $SIG{CHLD} = 'IGNORE';          # or ask the system to discard it

    You can also use a double fork. You immediately "wait()" for your first child, and the init daemon will "wait()" for your grandchild once it exits.

        unless ($pid = fork) {
            unless (fork) {
                exec "what you really wanna do";
                die "exec failed!";
            }
            exit 0;
        }
        waitpid($pid, 0);

    See "Signals" in perlipc for other examples of code to do this. Zombies are not an issue with "system("prog &")".

How do I modify the shadow password file on a Unix system?

If perl was installed correctly and your shadow library was written properly, the "getpw*()" functions described in perlfunc should in theory provide (read-only) access to entries in the shadow password file. To change the file, make a new shadow password file (the format varies from system to system--see passwd(1) for specifics) and use pwd_mkdb(8) to install it (see pwd_mkdb(8) for more details).

Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?

Some Sys-V based systems, notably Solaris 2.X, redefined some of the standard socket constants. Since these were constant across all architectures, they were often hardwired into perl code. The proper way to deal with this is to "use Socket" to get the correct values.

Note that even though SunOS and Solaris are binary compatible, these values are different. Go figure.

How can I call my system's unique C functions from Perl?
In most cases, you write an external module to do it--see the answer to "Where can I learn about linking C with Perl? [h2xs, xsubpp]". However, if the function is a system call, and your system supports "syscall()", you can use the "syscall" function (documented in perlfunc).

Remember to check the modules that came with your distribution, and CPAN as well--someone may already have written a module to do it. On Windows, try Win32::API. On Macs, try Mac::Carbon. If no module has an interface to the C function, you can inline a bit of C in your Perl source with Inline::C.

Why can't I get the output of a command with system()?

You're confusing the purpose of "system()" and backticks (``). "system()" runs a command and returns exit status information (as a 16 bit value: the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT.

    my $exit_status   = system("mail-users");
    my $output_string = `ls`;

How can I capture STDERR from an external command?

There are three basic ways of running external commands:

    system $cmd;                        # using system()
    my $output = `$cmd`;                # using backticks (``)
    open (my $pipe_fh, "$cmd |");       # using open()

With "system()", both STDOUT and STDERR will go to the same place as the script's STDOUT and STDERR, unless the "system()" command redirects them. Backticks and "open()" read only the STDOUT of your command.

You can also use the "open3()" function from IPC::Open3.
Benjamin Goldberg provides some sample code:

To capture a program's STDOUT, but discard its STDERR:

    use IPC::Open3;
    use File::Spec;
    my $in = '';
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3($in, \*PH, ">&NULL", "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, but discard its STDOUT:

    use IPC::Open3;
    use File::Spec;
    my $in = '';
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3($in, ">&NULL", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, and let its STDOUT go to our own STDERR:

    use IPC::Open3;
    my $in = '';
    my $pid = open3($in, ">&STDERR", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To read both a command's STDOUT and its STDERR separately, you can redirect them to temp files, let the command run, then read the temp files:

    use IPC::Open3;
    use IO::File;
    my $in = '';
    local *CATCHOUT = IO::File->new_tmpfile;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3($in, ">&CATCHOUT", ">&CATCHERR", "cmd");
    waitpid($pid, 0);
    seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR;
    while( <CATCHOUT> ) {}
    while( <CATCHERR> ) {}

But there's no real need for both to be tempfiles... the following should work just as well, without deadlocking:

    use IPC::Open3;
    my $in = '';
    use IO::File;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3($in, \*CATCHOUT, ">&CATCHERR", "cmd");
    while( <CATCHOUT> ) {}
    waitpid($pid, 0);
    seek CATCHERR, 0, 0;
    while( <CATCHERR> ) {}

And it'll be faster, too, since we can begin processing the program's stdout immediately, rather than waiting for the program to finish.
With any of these, you can change file descriptors before the call:

    open(STDOUT, ">logfile");
    system("ls");

or you can use Bourne shell file-descriptor redirection:

    $output = `$cmd 2>some_file`;
    open (PIPE, "cmd 2>some_file |");

You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT:

    $output = `$cmd 2>&1`;
    open (PIPE, "cmd 2>&1 |");

Note that you *cannot* simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:

    open(STDERR, ">&STDOUT");
    $alloutput = `cmd args`;  # stderr still escapes

This fails because the "open()" makes STDERR go to where STDOUT was going at the time of the "open()". The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).

Note that you *must* use Bourne shell (sh(1)) redirection syntax in backticks, not csh(1)! Details on why Perl's "system()" and backtick and pipe opens all use the Bourne shell are in the versus/csh.whynot article in the "Far More Than You Ever Wanted To Know" collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz>.
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;                       # either with backticks
    $pid = open(PH, "cmd 2>&1 |");              # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;                # either with backticks
    $pid = open(PH, "cmd 2>/dev/null |");       # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDERR but discard its STDOUT:

    $output = `cmd 2>&1 1>/dev/null`;           # either with backticks
    $pid = open(PH, "cmd 2>&1 1>/dev/null |");  # or with an open pipe
    while (<PH>) { }                            #    plus a read

To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;        # either with backticks
    $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe
    while (<PH>) { }                            #    plus a read

To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order.

    system("prog args 1>tmpfile 2>&1");
    system("prog args 2>&1 1>tmpfile");

The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output there, and the old standard error shows up on the old standard out.

Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?

This happens only if your perl is compiled to use stdio instead of perlio (perlio is the default). Some (maybe all?) stdios set error and eof flags that you may need to clear. The POSIX module defines "clearerr()" that you can use. That is the technically correct way to do it.
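As a rough illustration of clearing those flags, here is a minimal sketch that assumes the "clearerr" method from the standard IO::Handle module is an acceptable stand-in for the POSIX call named above. The helper name slurp_and_reset is hypothetical, and the demonstration reads from an in-memory handle only so that it runs anywhere.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the clearerr method on filehandles

# Hypothetical helper: read everything up to EOF, then reset the
# handle's EOF and error indicators so a later read can be attempted.
sub slurp_and_reset {
    my ($fh) = @_;
    my @lines = <$fh>;
    $fh->clearerr;
    return @lines;
}

# Demonstration on an in-memory handle (an assumption made for the
# sake of a self-contained example; the FAQ's case is STDIN).
my $data = "first\nsecond\n";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @lines = slurp_and_reset($fh);
print "read ", scalar(@lines), " lines\n";
```

For the situation described in the question, the same helper could be called twice on \*STDIN, once before and once after the user sends EOF.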
Here are some less reliable workarounds:

1   Try keeping around the seekpointer and go there, like this:

        my $where = tell($log_fh);
        seek($log_fh, $where, 0);

2   If that doesn't work, try seeking to a different part of the file and then back.

3   If that doesn't work, try seeking to a different part of the file, reading something, and then seeking back.

4   If that doesn't work, give up on your stdio package and use sysread.

How do I avoid zombies on a Unix system?

Use the reaper code from "Signals" in perlipc to call "wait()" when a SIGCHLD is received, or else use the double-fork technique described in "How do I start a process in the background?" in perlfaq8.

How do I make a system() exit on control-C?

You can't. You need to imitate the "system()" call (see perlipc for sample code) and then have a signal handler for the INT signal that passes the signal on to the subprocess. Or you can check for it:

    $rc = system($cmd);
    if ($rc & 127) { die "signal death" }

How do I install a module from CPAN?

(contributed by brian d foy)

The easiest way is to have a module also named CPAN do it for you by using the "cpan" command that comes with Perl. You can give it a list of modules to install:

    $ cpan IO::Interactive Getopt::Whatever

If you prefer "CPANPLUS", it's just as easy:

    $ cpanp i IO::Interactive Getopt::Whatever

If you want to install a distribution from the current directory, you can tell "CPAN.pm" to install "." (the full stop):

    $ cpan .

See the documentation for either of those commands to see what else you can do.

If you want to try to install a distribution by yourself, resolving all dependencies on your own, you follow one of two possible build paths.

For distributions that use *Makefile.PL*:

    $ perl Makefile.PL
    $ make test install

For distributions that use *Build.PL*:

    $ perl Build.PL
    $ ./Build test
    $ ./Build install

Some distributions may need to link to libraries or other third-party code and their build and installation sequences may be more complicated.
Check any *README* or *INSTALL* files that you may find.

Where are modules installed?

Where modules are installed varies case by case (depending on the methods described in the previous section) and by operating system. All of these paths are stored in @INC, which you can display with the one-liner

    perl -e 'print join("\n",@INC,"")'

The same information is displayed at the end of the output from the command

    perl -V

To find out where a module's source code is located, use

    perldoc -l Encode

to display the path to the module. In some cases (for example, the "AutoLoader" module), this command will show the path to a separate "pod" file; the module itself should be in the same directory, with a 'pm' file extension.

Found in /usr/share/perl/5.34/pod/perlfaq9.pod

How do I remove HTML from a string?

Use HTML::Strip, or HTML::FormatText, which not only removes HTML but also attempts to do a little simple formatting of the resulting plain text.

How do I decode a MIME/BASE64 string?

The MIME::Base64 package handles this as well as the MIME/QP encoding. Decoding base 64 becomes as simple as:

    use MIME::Base64;
    my $decoded = decode_base64($encoded);

The Email::MIME module can decode base 64-encoded email message parts transparently so the developer doesn't need to worry about it.

How do I find out my hostname, domainname, or IP address?

(contributed by brian d foy)

The Net::Domain module, which is part of the Standard Library starting in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host name, or the domain name.
    use Net::Domain qw(hostname hostfqdn hostdomain);

    my $host = hostfqdn();

The Sys::Hostname module, part of the Standard Library, can also get the hostname:

    use Sys::Hostname;

    $host = hostname();

The Sys::Hostname::Long module takes a different approach and tries harder to return the fully qualified hostname:

    use Sys::Hostname::Long 'hostname_long';

    my $hostname = hostname_long();

To get the IP address, you can use the "gethostbyname" built-in function to turn the name into a number. To turn that number into the dotted octet form (a.b.c.d) that most people expect, use the "inet_ntoa" function from the Socket module, which also comes with perl.

    use Socket;

    my $address = inet_ntoa(
        scalar gethostbyname( $host || 'localhost' )
    );
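
A name lookup can fail, and "inet_ntoa" expects a packed 4-byte address, so it may be worth checking the intermediate result before converting it. The following is a minimal sketch of that check, not part of the original answer: it swaps in Socket's "inet_aton" for the "gethostbyname" call, and the helper name ipv4_for is hypothetical.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: return the dotted-quad IPv4 address for a host
# name, or undef if the name cannot be resolved.
sub ipv4_for {
    my ($name) = @_;
    my $packed = inet_aton($name);    # undef when resolution fails
    return defined $packed ? inet_ntoa($packed) : undef;
}

my $address = ipv4_for('localhost');
print defined $address
    ? "localhost is $address\n"
    : "could not resolve localhost\n";
```

Returning undef leaves the caller to decide whether a failed lookup is fatal. Like the code above it, this handles IPv4 addresses only.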