{
    "mode": "perldoc",
    "parameter": "perlfilter",
    "section": "",
    "url": "https://www.chedong.com/phpMan.php/perldoc/perlfilter/json",
    "generated": "2026-06-13T21:22:54Z",
    "sections": {
        "NAME": {
            "content": "perlfilter - Source Filters\n",
            "subsections": []
        },
        "DESCRIPTION": {
            "content": "This article is about a little-known feature of Perl called *source filters*. Source filters\nalter the program text of a module before Perl sees it, much as a C preprocessor alters the\nsource text of a C program before the compiler sees it. This article tells you more about what\nsource filters are, how they work, and how to write your own.\n\nThe original purpose of source filters was to let you encrypt your program source to prevent\ncasual piracy. This isn't all they can do, as you'll soon learn. But first, the basics.\n",
            "subsections": []
        },
        "CONCEPTS": {
            "content": "Before the Perl interpreter can execute a Perl script, it must first read it from a file into\nmemory for parsing and compilation. If that script itself includes other scripts with a \"use\" or\n\"require\" statement, then each of those scripts will have to be read from their respective files\nas well.\n\nNow think of each logical connection between the Perl parser and an individual file as a *source\nstream*. A source stream is created when the Perl parser opens a file, it continues to exist as\nthe source code is read into memory, and it is destroyed when Perl is finished parsing the file.\nIf the parser encounters a \"require\" or \"use\" statement in a source stream, a new and distinct\nstream is created just for that file.\n\nThe diagram below represents a single source stream, with the flow of source from a Perl script\nfile on the left into the Perl parser on the right. This is how Perl normally operates.\n\nfile -------> parser\n\nThere are two important points to remember:\n\n1.   Although there can be any number of source streams in existence at any given time, only one\nwill be active.\n\n2.   Every source stream is associated with only one file.\n\nA source filter is a special kind of Perl module that intercepts and modifies a source stream\nbefore it reaches the parser. A source filter changes our diagram like this:\n\nfile ----> filter ----> parser\n\nIf that doesn't make much sense, consider the analogy of a command pipeline. Say you have a\nshell script stored in the compressed file *trial.gz*. 
The simple pipeline command below runs\nthe script without needing to create a temporary file to hold the uncompressed file.\n\ngunzip -c trial.gz | sh\n\nIn this case, the data flow from the pipeline can be represented as follows:\n\ntrial.gz ----> gunzip ----> sh\n\nWith source filters, you can store the text of your script compressed and use a source filter to\nuncompress it for Perl's parser:\n\ncompressed           gunzip\nPerl program ---> source filter ---> parser\n",
            "subsections": []
        },
        "USING FILTERS": {
            "content": "So how do you use a source filter in a Perl script? Above, I said that a source filter is just a\nspecial kind of module. Like all Perl modules, a source filter is invoked with a use statement.\n\nSay you want to pass your Perl source through the C preprocessor before execution. As it\nhappens, the source filters distribution comes with a C preprocessor filter module called\nFilter::cpp.\n\nBelow is an example program, \"cpptest\", which makes use of this filter. Line numbers have been\nadded to allow specific lines to be referenced easily.\n\n1: use Filter::cpp;\n2: #define TRUE 1\n3: $a = TRUE;\n4: print \"a = $a\\n\";\n\nWhen you execute this script, Perl creates a source stream for the file. Before the parser\nprocesses any of the lines from the file, the source stream looks like this:\n\ncpptest ---------> parser\n\nLine 1, \"use Filter::cpp\", includes and installs the \"cpp\" filter module. All source filters\nwork this way. The use statement is compiled and executed at compile time, before any more of\nthe file is read, and it attaches the cpp filter to the source stream behind the scenes. Now the\ndata flow looks like this:\n\ncpptest ----> cpp filter ----> parser\n\nAs the parser reads the second and subsequent lines from the source stream, it feeds those lines\nthrough the \"cpp\" source filter before processing them. The \"cpp\" filter simply passes each line\nthrough the real C preprocessor. 
The output from the C preprocessor is then inserted back into\nthe source stream by the filter.\n\n           .-> cpp --.\n           |         |\n           |         |\n           |       <-'\ncpptest ----> cpp filter ----> parser\n\nThe parser then sees the following code:\n\nuse Filter::cpp;\n$a = 1;\nprint \"a = $a\\n\";\n\nLet's consider what happens when the filtered code includes another module with use:\n\n1: use Filter::cpp;\n2: #define TRUE 1\n3: use Fred;\n4: $a = TRUE;\n5: print \"a = $a\\n\";\n\nThe \"cpp\" filter does not apply to the text of the Fred module, only to the text of the file\nthat used it (\"cpptest\"). Although the use statement on line 3 will pass through the cpp\nfilter, the module that gets included (\"Fred\") will not. The source streams look like this after\nline 3 has been parsed and before line 4 is parsed:\n\ncpptest ---> cpp filter ---> parser (INACTIVE)\n\nFred.pm ----> parser\n\nAs you can see, a new stream has been created for reading the source from \"Fred.pm\". This stream\nwill remain active until all of \"Fred.pm\" has been parsed. The source stream for \"cpptest\" will\nstill exist, but is inactive. Once the parser has finished reading Fred.pm, the source stream\nassociated with it will be destroyed. The source stream for \"cpptest\" then becomes active again\nand the parser reads line 4 and subsequent lines from \"cpptest\".\n\nYou can use more than one source filter on a single file. 
Similarly, you can reuse the same\nfilter in as many files as you like.\n\nFor example, if you have a uuencoded and compressed source file, it is possible to stack a\nuudecode filter and an uncompression filter like this:\n\nuse Filter::uudecode; use Filter::uncompress;\nM'XL(\".H<US4''V9I;F%L')Q;>7/;1I;>I3=&E=%:F*I\"T?22Q/\nM6]9*<IQCO*XFT\"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=[>\n...\n\nOnce the first line has been processed, the flow will look like this:\n\nfile ---> uudecode ---> uncompress ---> parser\n           filter         filter\n\nData flows through filters in the same order they appear in the source file. The uudecode filter\nappeared before the uncompress filter, so the source file will be uudecoded before it's\nuncompressed.\n",
            "subsections": []
        },
        "WRITING A SOURCE FILTER": {
            "content": "There are three ways to write your own source filter. You can write it in C, use an external\nprogram as a filter, or write the filter in Perl. I won't cover the first two in any great\ndetail, so I'll get them out of the way first. Writing the filter in Perl is most convenient, so\nI'll devote the most space to it.\n",
            "subsections": []
        },
        "WRITING A SOURCE FILTER IN C": {
            "content": "The first of the three available techniques is to write the filter completely in C. The external\nmodule you create interfaces directly with the source filter hooks provided by Perl.\n\nThe advantage of this technique is that you have complete control over the implementation of\nyour filter. The big disadvantage is the increased complexity required to write the filter - not\nonly do you need to understand the source filter hooks, but you also need a reasonable knowledge\nof Perl guts. One of the few times it is worth going to this trouble is when writing a source\nscrambler. The \"decrypt\" filter (which unscrambles the source before Perl parses it) included\nwith the source filter distribution is an example of a C source filter (see Decryption Filters,\nbelow).\n\nDecryption Filters\nAll decryption filters work on the principle of \"security through obscurity.\" Regardless of\nhow well you write a decryption filter and how strong your encryption algorithm is, anyone\ndetermined enough can retrieve the original source code. The reason is quite simple - once\nthe decryption filter has decrypted the source back to its original form, fragments of it\nwill be stored in the computer's memory as Perl parses it. The source might only be in\nmemory for a short period of time, but anyone possessing a debugger, skill, and lots of\npatience can eventually reconstruct your program.\n\nThat said, there are a number of steps that can be taken to make life difficult for the\npotential cracker. The most important: Write your decryption filter in C and statically\nlink the decryption module into the Perl binary. For further tips to make life difficult\nfor the potential cracker, see the file *decrypt.pm* in the source filters distribution.\n",
            "subsections": []
        },
        "CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE": {
            "content": "An alternative to writing the filter in C is to create a separate executable in the language of\nyour choice. The separate executable reads from standard input, does whatever processing is\nnecessary, and writes the filtered data to standard output. \"Filter::cpp\" is an example of a\nsource filter implemented as a separate executable - the executable is the C preprocessor\nbundled with your C compiler.\n\nThe source filter distribution includes two modules that simplify this task: \"Filter::exec\" and\n\"Filter::sh\". Both allow you to run any external executable. Both use a coprocess to control the\nflow of data into and out of the external executable. (For details on coprocesses, see Stephens,\nW.R., \"Advanced Programming in the UNIX Environment.\" Addison-Wesley, ISBN 0-210-56317-7, pages\n441-445.) The difference between them is that \"Filter::exec\" spawns the external command\ndirectly, while \"Filter::sh\" spawns a shell to execute the external command. (Unix uses the\nBourne shell; NT uses the cmd shell.) Spawning a shell allows you to make use of the shell\nmetacharacters and redirection facilities.\n\nHere is an example script that uses \"Filter::sh\":\n\nuse Filter::sh 'tr XYZ PQR';\n$a = 1;\nprint \"XYZ a = $a\\n\";\n\nThe output you'll get when the script is executed:\n\nPQR a = 1\n\nWriting a source filter as a separate executable works fine, but a small performance penalty is\nincurred. For example, if you execute the small example above, a separate subprocess will be\ncreated to run the Unix \"tr\" command. Each use of the filter requires its own subprocess. If\ncreating subprocesses is expensive on your system, you might want to consider one of the other\noptions for creating source filters.\n",
            "subsections": []
        },
        "WRITING A SOURCE FILTER IN PERL": {
            "content": "The easiest and most portable option available for creating your own source filter is to write\nit completely in Perl. To distinguish this from the previous two techniques, I'll call it a Perl\nsource filter.\n\nTo help understand how to write a Perl source filter we need an example to study. Here is a\ncomplete source filter that performs rot13 decoding. (Rot13 is a very simple encryption scheme\nused in Usenet postings to hide the contents of offensive posts. It moves every letter forward\nthirteen places, so that A becomes N, B becomes O, and Z becomes M.)\n\npackage Rot13;\n\nuse Filter::Util::Call;\n\nsub import {\nmy ($type) = @;\nmy ($ref) = [];\nfilteradd(bless $ref);\n}\n\nsub filter {\nmy ($self) = @;\nmy ($status);\n\ntr/n-za-mN-ZA-M/a-zA-Z/\nif ($status = filterread()) > 0;\n$status;\n}\n\n1;\n\nAll Perl source filters are implemented as Perl classes and have the same basic structure as the\nexample above.\n\nFirst, we include the \"Filter::Util::Call\" module, which exports a number of functions into your\nfilter's namespace. The filter shown above uses two of these functions, \"filteradd()\" and\n\"filterread()\".\n\nNext, we create the filter object and associate it with the source stream by defining the\n\"import\" function. If you know Perl well enough, you know that \"import\" is called automatically\nevery time a module is included with a use statement. This makes \"import\" the ideal place to\nboth create and install a filter object.\n\nIn the example filter, the object ($ref) is blessed just like any other Perl object. Our example\nuses an anonymous array, but this isn't a requirement. Because this example doesn't need to\nstore any context information, we could have used a scalar or hash reference just as well. The\nnext section demonstrates context data.\n\nThe association between the filter object and the source stream is made with the \"filteradd()\"\nfunction. 
This takes a filter object as a parameter ($ref in this case) and installs it in the\nsource stream.\n\nFinally, there is the code that actually does the filtering. For this type of Perl source\nfilter, all the filtering is done in a method called \"filter()\". (It is also possible to write a\nPerl source filter using a closure. See the \"Filter::Util::Call\" manual page for more details.)\nIt's called every time the Perl parser needs another line of source to process. The \"filter()\"\nmethod, in turn, reads lines from the source stream using the \"filter_read()\" function.\n\nIf a line was available from the source stream, \"filter_read()\" returns a status value greater\nthan zero and appends the line to $_. A status value of zero indicates end-of-file, less than\nzero means an error. The filter function itself is expected to return its status in the same\nway, and put the filtered line it wants written to the source stream in $_. The use of $_\naccounts for the brevity of most Perl source filters.\n\nIn order to make use of the rot13 filter we need some way of encoding the source file in rot13\nformat. The script below, \"mkrot13\", does just that.\n\ndie \"usage mkrot13 filename\\n\" unless @ARGV;\nmy $in = $ARGV[0];\nmy $out = \"$in.tmp\";\nopen(IN, \"<$in\") or die \"Cannot open file $in: $!\\n\";\nopen(OUT, \">$out\") or die \"Cannot open file $out: $!\\n\";\n\nprint OUT \"use Rot13;\\n\";\nwhile (<IN>) {\n    tr/a-zA-Z/n-za-mN-ZA-M/;\n    print OUT;\n}\n\nclose IN;\nclose OUT;\nunlink $in;\nrename $out, $in;\n\nIf we encrypt this with \"mkrot13\":\n\nprint \" hello fred \\n\";\n\nthe result will be this:\n\nuse Rot13;\ncevag \"uryyb serq\\a\";\n\nRunning it produces this output:\n\nhello fred\n\nUSING CONTEXT: THE DEBUG FILTER\nThe rot13 example was a trivial example. 
Here's another demonstration that shows off a few more\nfeatures.\n\nSay you wanted to include a lot of debugging code in your Perl script during development, but\nyou didn't want it available in the released product. Source filters offer a solution. In order\nto keep the example simple, let's say you wanted the debugging output to be controlled by an\nenvironment variable, \"DEBUG\". Debugging code is enabled if the variable exists, otherwise it is\ndisabled.\n\nTwo special marker lines will bracket debugging code, like this:\n\n## DEBUG_BEGIN\nif ($year > 1999) {\n    warn \"Debug: millennium bug in year $year\\n\";\n}\n## DEBUG_END\n\nThe filter ensures that Perl parses the code between the \"DEBUG_BEGIN\" and \"DEBUG_END\" markers\nonly when the \"DEBUG\" environment variable exists. That means that when \"DEBUG\" does exist, the\ncode above should be passed through the filter unchanged. The marker lines can also be passed\nthrough as-is, because the Perl parser will see them as comment lines. When \"DEBUG\" isn't set,\nwe need a way to disable the debug code. 
A simple way to achieve that is to convert the lines\nbetween the two markers into comments:\n\n## DEBUG_BEGIN\n#if ($year > 1999) {\n#     warn \"Debug: millennium bug in year $year\\n\";\n#}\n## DEBUG_END\n\nHere is the complete Debug filter:\n\npackage Debug;\n\nuse strict;\nuse warnings;\nuse Filter::Util::Call;\n\nuse constant TRUE => 1;\nuse constant FALSE => 0;\n\nsub import {\n    my ($type) = @_;\n    my (%context) = (\n        Enabled => defined $ENV{DEBUG},\n        InTraceBlock => FALSE,\n        Filename => (caller)[1],\n        LineNo => 0,\n        LastBegin => 0,\n    );\n    filter_add(bless \\%context);\n}\n\nsub Die {\n    my ($self) = shift;\n    my ($message) = shift;\n    my ($lineno) = shift || $self->{LastBegin};\n    die \"$message at $self->{Filename} line $lineno.\\n\"\n}\n\nsub filter {\n    my ($self) = @_;\n    my ($status);\n    $status = filter_read();\n    ++ $self->{LineNo};\n\n    # deal with EOF/error first\n    if ($status <= 0) {\n        $self->Die(\"DEBUG_BEGIN has no DEBUG_END\")\n            if $self->{InTraceBlock};\n        return $status;\n    }\n\n    if ($self->{InTraceBlock}) {\n        if (/^\\s*##\\s*DEBUG_BEGIN/ ) {\n            $self->Die(\"Nested DEBUG_BEGIN\", $self->{LineNo})\n        } elsif (/^\\s*##\\s*DEBUG_END/) {\n            $self->{InTraceBlock} = FALSE;\n        }\n\n        # comment out the debug lines when the filter is disabled\n        s/^/#/ if ! $self->{Enabled};\n    } elsif ( /^\\s*##\\s*DEBUG_BEGIN/ ) {\n        $self->{InTraceBlock} = TRUE;\n        $self->{LastBegin} = $self->{LineNo};\n    } elsif ( /^\\s*##\\s*DEBUG_END/ ) {\n        $self->Die(\"DEBUG_END has no DEBUG_BEGIN\", $self->{LineNo});\n    }\n    return $status;\n}\n\n1;\n\nThe big difference between this filter and the previous example is the use of context data in\nthe filter object. The filter object is based on a hash reference, and is used to keep various\npieces of context information between calls to the filter function. All but two of the hash\nfields are used for error reporting. The first of those two, Enabled, is used by the filter to\ndetermine whether the debugging code should be given to the Perl parser. 
The second,\nInTraceBlock, is true when the filter has encountered a \"DEBUG_BEGIN\" line, but has not yet\nencountered the following \"DEBUG_END\" line.\n\nIf you ignore all the error checking that most of the code does, the essence of the filter is as\nfollows:\n\nsub filter {\n    my ($self) = @_;\n    my ($status);\n    $status = filter_read();\n\n    # deal with EOF/error first\n    return $status if $status <= 0;\n    if ($self->{InTraceBlock}) {\n        if (/^\\s*##\\s*DEBUG_END/) {\n            $self->{InTraceBlock} = FALSE\n        }\n\n        # comment out debug lines when the filter is disabled\n        s/^/#/ if ! $self->{Enabled};\n    } elsif ( /^\\s*##\\s*DEBUG_BEGIN/ ) {\n        $self->{InTraceBlock} = TRUE;\n    }\n    return $status;\n}\n\nBe warned: just as the C-preprocessor doesn't know C, the Debug filter doesn't know Perl. It can\nbe fooled quite easily:\n\nprint <<EOM;\n##DEBUG_BEGIN\nEOM\n\nSuch things aside, you can see that a lot can be achieved with a modest amount of code.\n",
            "subsections": []
        },
        "CONCLUSION": {
            "content": "You now have better understanding of what a source filter is, and you might even have a possible\nuse for them. If you feel like playing with source filters but need a bit of inspiration, here\nare some extra features you could add to the Debug filter.\n\nFirst, an easy one. Rather than having debugging code that is all-or-nothing, it would be much\nmore useful to be able to control which specific blocks of debugging code get included. Try\nextending the syntax for debug blocks to allow each to be identified. The contents of the\n\"DEBUG\" environment variable can then be used to control which blocks get included.\n\nOnce you can identify individual blocks, try allowing them to be nested. That isn't difficult\neither.\n\nHere is an interesting idea that doesn't involve the Debug filter. Currently Perl subroutines\nhave fairly limited support for formal parameter lists. You can specify the number of parameters\nand their type, but you still have to manually take them out of the @ array yourself. Write a\nsource filter that allows you to have a named parameter list. Such a filter would turn this:\n\nsub MySub ($first, $second, @rest) { ... }\n\ninto this:\n\nsub MySub($$@) {\nmy ($first) = shift;\nmy ($second) = shift;\nmy (@rest) = @;\n...\n}\n\nFinally, if you feel like a real challenge, have a go at writing a full-blown Perl macro\npreprocessor as a source filter. Borrow the useful features from the C preprocessor and any\nother macro processors you know. The tricky bit will be choosing how much knowledge of Perl's\nsyntax you want your filter to have.\n",
            "subsections": []
        },
        "LIMITATIONS": {
            "content": "Source filters only work on the string level, thus are highly limited in its ability to change\nsource code on the fly. It cannot detect comments, quoted strings, heredocs, it is no\nreplacement for a real parser. The only stable usage for source filters are encryption,\ncompression, or the byteloader, to translate binary code back to source code.\n\nSee for example the limitations in Switch, which uses source filters, and thus is does not work\ninside a string eval, the presence of regexes with embedded newlines that are specified with raw\n\"/.../\" delimiters and don't have a modifier \"//x\" are indistinguishable from code chunks\nbeginning with the division operator \"/\". As a workaround you must use \"m/.../\" or \"m?...?\" for\nsuch patterns. Also, the presence of regexes specified with raw \"?...?\" delimiters may cause\nmysterious errors. The workaround is to use \"m?...?\" instead. See\n<https://metacpan.org/pod/Switch#LIMITATIONS>.\n\nCurrently the content of the \"DATA\" block is not filtered.\n\nCurrently internal buffer lengths are limited to 32-bit only.\n",
            "subsections": []
        },
        "THINGS TO LOOK OUT FOR": {
            "content": "Some Filters Clobber the \"DATA\" Handle\nSome source filters use the \"DATA\" handle to read the calling program. When using these\nsource filters you cannot rely on this handle, nor expect any particular kind of behavior\nwhen operating on it. Filters based on Filter::Util::Call (and therefore Filter::Simple) do\nnot alter the \"DATA\" filehandle, but on the other hand totally ignore the text after\n\"DATA\".\n",
            "subsections": []
        },
        "REQUIREMENTS": {
            "content": "The Source Filters distribution is available on CPAN, in\n\nCPAN/modules/by-module/Filter\n\nStarting from Perl 5.8 Filter::Util::Call (the core part of the Source Filters distribution) is\npart of the standard Perl distribution. Also included is a friendlier interface called\nFilter::Simple, by Damian Conway.\n",
            "subsections": []
        },
        "AUTHOR": {
            "content": "Paul Marquess <Paul.Marquess@btinternet.com>\n\nReini Urban <rurban@cpan.org>\n",
            "subsections": []
        },
        "Copyrights": {
            "content": "The first version of this article originally appeared in The Perl Journal #11, and is copyright\n1998 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal. This document may\nbe distributed under the same terms as Perl itself.\n",
            "subsections": []
        }
    },
    "summary": "perlfilter - Source Filters",
    "flags": [],
    "examples": [],
    "see_also": []
}