{
    "content": [
        {
            "type": "text",
            "text": "# PERLFILTER (man)\n\n## NAME\n\nperlfilter - Source Filters\n\n## DESCRIPTION\n\nThis article is about a little-known feature of Perl called source filters. Source filters\nalter the program text of a module before Perl sees it, much as a C preprocessor alters the\nsource text of a C program before the compiler sees it. This article tells you more about\nwhat source filters are, how they work, and how to write your own.\n\n## Sections\n\n- **NAME**\n- **DESCRIPTION**\n- **CONCEPTS**\n- **USING FILTERS**\n- **WRITING A SOURCE FILTER**\n- **WRITING A SOURCE FILTER IN C** (1 subsections)\n- **CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE**\n- **WRITING A SOURCE FILTER IN PERL** (1 subsections)\n- **CONCLUSION**\n- **LIMITATIONS**\n- **THINGS TO LOOK OUT FOR**\n- **REQUIREMENTS**\n- **AUTHOR**\n- **Copyrights**\n\nUse structuredContent.sections for detailed options, examples, and full documentation.\n"
        }
    ],
    "structuredContent": {
        "command": "PERLFILTER",
        "section": "",
        "mode": "man",
        "summary": "perlfilter - Source Filters",
        "synopsis": null,
        "tldr_summary": null,
        "tldr_examples": [],
        "tldr_source": null,
        "flags": [],
        "examples": [],
        "see_also": [],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 8,
                "subsections": []
            },
            {
                "name": "CONCEPTS",
                "lines": 45,
                "subsections": []
            },
            {
                "name": "USING FILTERS",
                "lines": 88,
                "subsections": []
            },
            {
                "name": "WRITING A SOURCE FILTER",
                "lines": 5,
                "subsections": []
            },
            {
                "name": "WRITING A SOURCE FILTER IN C",
                "lines": 11,
                "subsections": [
                    {
                        "name": "Decryption Filters",
                        "lines": 13
                    }
                ]
            },
            {
                "name": "CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE",
                "lines": 31,
                "subsections": []
            },
            {
                "name": "WRITING A SOURCE FILTER IN PERL",
                "lines": 97,
                "subsections": [
                    {
                        "name": "USING CONTEXT: THE DEBUG FILTER",
                        "lines": 133
                    }
                ]
            },
            {
                "name": "CONCLUSION",
                "lines": 34,
                "subsections": []
            },
            {
                "name": "LIMITATIONS",
                "lines": 17,
                "subsections": []
            },
            {
                "name": "THINGS TO LOOK OUT FOR",
                "lines": 7,
                "subsections": []
            },
            {
                "name": "REQUIREMENTS",
                "lines": 8,
                "subsections": []
            },
            {
                "name": "AUTHOR",
                "lines": 4,
                "subsections": []
            },
            {
                "name": "Copyrights",
                "lines": 7,
                "subsections": []
            }
        ],
        "sections": {
            "NAME": {
                "content": "perlfilter - Source Filters\n",
                "subsections": []
            },
            "DESCRIPTION": {
                "content": "This article is about a little-known feature of Perl called source filters. Source filters\nalter the program text of a module before Perl sees it, much as a C preprocessor alters the\nsource text of a C program before the compiler sees it. This article tells you more about\nwhat source filters are, how they work, and how to write your own.\n\nThe original purpose of source filters was to let you encrypt your program source to prevent\ncasual piracy. This isn't all they can do, as you'll soon learn. But first, the basics.\n",
                "subsections": []
            },
            "CONCEPTS": {
                "content": "Before the Perl interpreter can execute a Perl script, it must first read it from a file into\nmemory for parsing and compilation. If that script itself includes other scripts with a \"use\"\nor \"require\" statement, then each of those scripts will have to be read from their respective\nfiles as well.\n\nNow think of each logical connection between the Perl parser and an individual file as a\nsource stream. A source stream is created when the Perl parser opens a file, it continues to\nexist as the source code is read into memory, and it is destroyed when Perl is finished\nparsing the file. If the parser encounters a \"require\" or \"use\" statement in a source stream,\na new and distinct stream is created just for that file.\n\nThe diagram below represents a single source stream, with the flow of source from a Perl\nscript file on the left into the Perl parser on the right. This is how Perl normally\noperates.\n\nfile -------> parser\n\nThere are two important points to remember:\n\n1.   Although there can be any number of source streams in existence at any given time, only\none will be active.\n\n2.   Every source stream is associated with only one file.\n\nA source filter is a special kind of Perl module that intercepts and modifies a source stream\nbefore it reaches the parser. A source filter changes our diagram like this:\n\nfile ----> filter ----> parser\n\nIf that doesn't make much sense, consider the analogy of a command pipeline. Say you have a\nshell script stored in the compressed file trial.gz. The simple pipeline command below runs\nthe script without needing to create a temporary file to hold the uncompressed file.\n\ngunzip -c trial.gz | sh\n\nIn this case, the data flow from the pipeline can be represented as follows:\n\ntrial.gz ----> gunzip ----> sh\n\nWith source filters, you can store the text of your script compressed and use a source filter\nto uncompress it for Perl's parser:\n\ncompressed           gunzip\nPerl program ---> source filter ---> parser\n",
                "subsections": []
            },
            "USING FILTERS": {
                "content": "So how do you use a source filter in a Perl script? Above, I said that a source filter is\njust a special kind of module. Like all Perl modules, a source filter is invoked with a use\nstatement.\n\nSay you want to pass your Perl source through the C preprocessor before execution. As it\nhappens, the source filters distribution comes with a C preprocessor filter module called\nFilter::cpp.\n\nBelow is an example program, \"cpptest\", which makes use of this filter.  Line numbers have\nbeen added to allow specific lines to be referenced easily.\n\n1: use Filter::cpp;\n2: #define TRUE 1\n3: $a = TRUE;\n4: print \"a = $a\\n\";\n\nWhen you execute this script, Perl creates a source stream for the file. Before the parser\nprocesses any of the lines from the file, the source stream looks like this:\n\ncpptest ---------> parser\n\nLine 1, \"use Filter::cpp\", includes and installs the \"cpp\" filter module. All source filters\nwork this way. The use statement is compiled and executed at compile time, before any more of\nthe file is read, and it attaches the cpp filter to the source stream behind the scenes. Now\nthe data flow looks like this:\n\ncpptest ----> cpp filter ----> parser\n\nAs the parser reads the second and subsequent lines from the source stream, it feeds those\nlines through the \"cpp\" source filter before processing them. The \"cpp\" filter simply passes\neach line through the real C preprocessor. The output from the C preprocessor is then\ninserted back into the source stream by the filter.\n\n.-> cpp --.\n|         |\n|         |\n|       <-'\ncpptest ----> cpp filter ----> parser\n\nThe parser then sees the following code:\n\nuse Filter::cpp;\n$a = 1;\nprint \"a = $a\\n\";\n\nLet's consider what happens when the filtered code includes another module with use:\n\n1: use Filter::cpp;\n2: #define TRUE 1\n3: use Fred;\n4: $a = TRUE;\n5: print \"a = $a\\n\";\n\nThe \"cpp\" filter does not apply to the text of the Fred module, only to the text of the file\nthat used it (\"cpptest\"). Although the use statement on line 3 will pass through the cpp\nfilter, the module that gets included (\"Fred\") will not. The source streams look like this\nafter line 3 has been parsed and before line 4 is parsed:\n\ncpptest ---> cpp filter ---> parser (INACTIVE)\n\nFred.pm ----> parser\n\nAs you can see, a new stream has been created for reading the source from \"Fred.pm\". This\nstream will remain active until all of \"Fred.pm\" has been parsed. The source stream for\n\"cpptest\" will still exist, but is inactive. Once the parser has finished reading Fred.pm,\nthe source stream associated with it will be destroyed. The source stream for \"cpptest\" then\nbecomes active again and the parser reads line 4 and subsequent lines from \"cpptest\".\n\nYou can use more than one source filter on a single file. Similarly, you can reuse the same\nfilter in as many files as you like.\n\nFor example, if you have a uuencoded and compressed source file, it is possible to stack a\nuudecode filter and an uncompression filter like this:\n\nuse Filter::uudecode; use Filter::uncompress;\nM'XL(\".H<US4''V9I;F%L')Q;>7/;1I;>I3=&E=%:F*I\"T?22Q/\nM6]9*<IQCO*XFT\"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=[>\n...\n\nOnce the first line has been processed, the flow will look like this:\n\nfile ---> uudecode ---> uncompress ---> parser\nfilter         filter\n\nData flows through filters in the same order they appear in the source file. The uudecode\nfilter appeared before the uncompress filter, so the source file will be uudecoded before\nit's uncompressed.\n",
                "subsections": []
            },
            "WRITING A SOURCE FILTER": {
                "content": "There are three ways to write your own source filter. You can write it in C, use an external\nprogram as a filter, or write the filter in Perl.  I won't cover the first two in any great\ndetail, so I'll get them out of the way first. Writing the filter in Perl is most convenient,\nso I'll devote the most space to it.\n",
                "subsections": []
            },
            "WRITING A SOURCE FILTER IN C": {
                "content": "The first of the three available techniques is to write the filter completely in C. The\nexternal module you create interfaces directly with the source filter hooks provided by Perl.\n\nThe advantage of this technique is that you have complete control over the implementation of\nyour filter. The big disadvantage is the increased complexity required to write the filter -\nnot only do you need to understand the source filter hooks, but you also need a reasonable\nknowledge of Perl guts. One of the few times it is worth going to this trouble is when\nwriting a source scrambler. The \"decrypt\" filter (which unscrambles the source before Perl\nparses it) included with the source filter distribution is an example of a C source filter\n(see Decryption Filters, below).\n",
                "subsections": [
                    {
                        "name": "Decryption Filters",
                        "content": "All decryption filters work on the principle of \"security through obscurity.\" Regardless\nof how well you write a decryption filter and how strong your encryption algorithm is,\nanyone determined enough can retrieve the original source code. The reason is quite\nsimple - once the decryption filter has decrypted the source back to its original form,\nfragments of it will be stored in the computer's memory as Perl parses it. The source\nmight only be in memory for a short period of time, but anyone possessing a debugger,\nskill, and lots of patience can eventually reconstruct your program.\n\nThat said, there are a number of steps that can be taken to make life difficult for the\npotential cracker. The most important: Write your decryption filter in C and statically\nlink the decryption module into the Perl binary. For further tips to make life difficult\nfor the potential cracker, see the file decrypt.pm in the source filters distribution.\n"
                    }
                ]
            },
            "CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE": {
                "content": "An alternative to writing the filter in C is to create a separate executable in the language\nof your choice. The separate executable reads from standard input, does whatever processing\nis necessary, and writes the filtered data to standard output. \"Filter::cpp\" is an example of\na source filter implemented as a separate executable - the executable is the C preprocessor\nbundled with your C compiler.\n\nThe source filter distribution includes two modules that simplify this task: \"Filter::exec\"\nand \"Filter::sh\". Both allow you to run any external executable. Both use a coprocess to\ncontrol the flow of data into and out of the external executable. (For details on\ncoprocesses, see Stephens, W.R., \"Advanced Programming in the UNIX Environment.\"  Addison-\nWesley, ISBN 0-210-56317-7, pages 441-445.) The difference between them is that\n\"Filter::exec\" spawns the external command directly, while \"Filter::sh\" spawns a shell to\nexecute the external command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning a\nshell allows you to make use of the shell metacharacters and redirection facilities.\n\nHere is an example script that uses \"Filter::sh\":\n\nuse Filter::sh 'tr XYZ PQR';\n$a = 1;\nprint \"XYZ a = $a\\n\";\n\nThe output you'll get when the script is executed:\n\nPQR a = 1\n\nWriting a source filter as a separate executable works fine, but a small performance penalty\nis incurred. For example, if you execute the small example above, a separate subprocess will\nbe created to run the Unix \"tr\" command. Each use of the filter requires its own subprocess.\nIf creating subprocesses is expensive on your system, you might want to consider one of the\nother options for creating source filters.\n",
                "subsections": []
            },
            "WRITING A SOURCE FILTER IN PERL": {
                "content": "The easiest and most portable option available for creating your own source filter is to\nwrite it completely in Perl. To distinguish this from the previous two techniques, I'll call\nit a Perl source filter.\n\nTo help understand how to write a Perl source filter we need an example to study. Here is a\ncomplete source filter that performs rot13 decoding. (Rot13 is a very simple encryption\nscheme used in Usenet postings to hide the contents of offensive posts. It moves every letter\nforward thirteen places, so that A becomes N, B becomes O, and Z becomes M.)\n\npackage Rot13;\n\nuse Filter::Util::Call;\n\nsub import {\nmy ($type) = @;\nmy ($ref) = [];\nfilteradd(bless $ref);\n}\n\nsub filter {\nmy ($self) = @;\nmy ($status);\n\ntr/n-za-mN-ZA-M/a-zA-Z/\nif ($status = filterread()) > 0;\n$status;\n}\n\n1;\n\nAll Perl source filters are implemented as Perl classes and have the same basic structure as\nthe example above.\n\nFirst, we include the \"Filter::Util::Call\" module, which exports a number of functions into\nyour filter's namespace. The filter shown above uses two of these functions, \"filteradd()\"\nand \"filterread()\".\n\nNext, we create the filter object and associate it with the source stream by defining the\n\"import\" function. If you know Perl well enough, you know that \"import\" is called\nautomatically every time a module is included with a use statement. This makes \"import\" the\nideal place to both create and install a filter object.\n\nIn the example filter, the object ($ref) is blessed just like any other Perl object. Our\nexample uses an anonymous array, but this isn't a requirement. Because this example doesn't\nneed to store any context information, we could have used a scalar or hash reference just as\nwell. The next section demonstrates context data.\n\nThe association between the filter object and the source stream is made with the\n\"filteradd()\" function. This takes a filter object as a parameter ($ref in this case) and\ninstalls it in the source stream.\n\nFinally, there is the code that actually does the filtering. For this type of Perl source\nfilter, all the filtering is done in a method called \"filter()\". (It is also possible to\nwrite a Perl source filter using a closure. See the \"Filter::Util::Call\" manual page for more\ndetails.) It's called every time the Perl parser needs another line of source to process. The\n\"filter()\" method, in turn, reads lines from the source stream using the \"filterread()\"\nfunction.\n\nIf a line was available from the source stream, \"filterread()\" returns a status value\ngreater than zero and appends the line to $.  A status value of zero indicates end-of-file,\nless than zero means an error. The filter function itself is expected to return its status in\nthe same way, and put the filtered line it wants written to the source stream in $. The use\nof $ accounts for the brevity of most Perl source filters.\n\nIn order to make use of the rot13 filter we need some way of encoding the source file in\nrot13 format. The script below, \"mkrot13\", does just that.\n\ndie \"usage mkrot13 filename\\n\" unless @ARGV;\nmy $in = $ARGV[0];\nmy $out = \"$in.tmp\";\nopen(IN, \"<$in\") or die \"Cannot open file $in: $!\\n\";\nopen(OUT, \">$out\") or die \"Cannot open file $out: $!\\n\";\n\nprint OUT \"use Rot13;\\n\";\nwhile (<IN>) {\ntr/a-zA-Z/n-za-mN-ZA-M/;\nprint OUT;\n}\n\nclose IN;\nclose OUT;\nunlink $in;\nrename $out, $in;\n\nIf we encrypt this with \"mkrot13\":\n\nprint \" hello fred \\n\";\n\nthe result will be this:\n\nuse Rot13;\ncevag \"uryyb serq\\a\";\n\nRunning it produces this output:\n\nhello fred\n",
                "subsections": [
                    {
                        "name": "USING CONTEXT: THE DEBUG FILTER",
                        "content": "The rot13 example was a trivial example. Here's another demonstration that shows off a few\nmore features.\n\nSay you wanted to include a lot of debugging code in your Perl script during development, but\nyou didn't want it available in the released product. Source filters offer a solution. In\norder to keep the example simple, let's say you wanted the debugging output to be controlled\nby an environment variable, \"DEBUG\". Debugging code is enabled if the variable exists,\notherwise it is disabled.\n\nTwo special marker lines will bracket debugging code, like this:\n\n## DEBUGBEGIN\nif ($year > 1999) {\nwarn \"Debug: millennium bug in year $year\\n\";\n}\n## DEBUGEND\n\nThe filter ensures that Perl parses the code between the <DEBUGBEGIN> and \"DEBUGEND\"\nmarkers only when the \"DEBUG\" environment variable exists. That means that when \"DEBUG\" does\nexist, the code above should be passed through the filter unchanged. The marker lines can\nalso be passed through as-is, because the Perl parser will see them as comment lines. When\n\"DEBUG\" isn't set, we need a way to disable the debug code. A simple way to achieve that is\nto convert the lines between the two markers into comments:\n\n## DEBUGBEGIN\n#if ($year > 1999) {\n#     warn \"Debug: millennium bug in year $year\\n\";\n#}\n## DEBUGEND\n\nHere is the complete Debug filter:\n\npackage Debug;\n\nuse strict;\nuse warnings;\nuse Filter::Util::Call;\n\nuse constant TRUE => 1;\nuse constant FALSE => 0;\n\nsub import {\nmy ($type) = @;\nmy (%context) = (\nEnabled => defined $ENV{DEBUG},\nInTraceBlock => FALSE,\nFilename => (caller)[1],\nLineNo => 0,\nLastBegin => 0,\n);\nfilteradd(bless \\%context);\n}\n\nsub Die {\nmy ($self) = shift;\nmy ($message) = shift;\nmy ($lineno) = shift || $self->{LastBegin};\ndie \"$message at $self->{Filename} line $lineno.\\n\"\n}\n\nsub filter {\nmy ($self) = @;\nmy ($status);\n$status = filterread();\n++ $self->{LineNo};\n\n# deal with EOF/error first\nif ($status <= 0) {\n$self->Die(\"DEBUGBEGIN has no DEBUGEND\")\nif $self->{InTraceBlock};\nreturn $status;\n}\n\nif ($self->{InTraceBlock}) {\nif (/^\\s*##\\s*DEBUGBEGIN/ ) {\n$self->Die(\"Nested DEBUGBEGIN\", $self->{LineNo})\n} elsif (/^\\s*##\\s*DEBUGEND/) {\n$self->{InTraceBlock} = FALSE;\n}\n\n# comment out the debug lines when the filter is disabled\ns/^/#/ if ! $self->{Enabled};\n} elsif ( /^\\s*##\\s*DEBUGBEGIN/ ) {\n$self->{InTraceBlock} = TRUE;\n$self->{LastBegin} = $self->{LineNo};\n} elsif ( /^\\s*##\\s*DEBUGEND/ ) {\n$self->Die(\"DEBUGEND has no DEBUGBEGIN\", $self->{LineNo});\n}\nreturn $status;\n}\n\n1;\n\nThe big difference between this filter and the previous example is the use of context data in\nthe filter object. The filter object is based on a hash reference, and is used to keep\nvarious pieces of context information between calls to the filter function. All but two of\nthe hash fields are used for error reporting. The first of those two, Enabled, is used by the\nfilter to determine whether the debugging code should be given to the Perl parser. The\nsecond, InTraceBlock, is true when the filter has encountered a \"DEBUGBEGIN\" line, but has\nnot yet encountered the following \"DEBUGEND\" line.\n\nIf you ignore all the error checking that most of the code does, the essence of the filter is\nas follows:\n\nsub filter {\nmy ($self) = @;\nmy ($status);\n$status = filterread();\n\n# deal with EOF/error first\nreturn $status if $status <= 0;\nif ($self->{InTraceBlock}) {\nif (/^\\s*##\\s*DEBUGEND/) {\n$self->{InTraceBlock} = FALSE\n}\n\n# comment out debug lines when the filter is disabled\ns/^/#/ if ! $self->{Enabled};\n} elsif ( /^\\s*##\\s*DEBUGBEGIN/ ) {\n$self->{InTraceBlock} = TRUE;\n}\nreturn $status;\n}\n\nBe warned: just as the C-preprocessor doesn't know C, the Debug filter doesn't know Perl. It\ncan be fooled quite easily:\n\nprint <<EOM;\n##DEBUGBEGIN\nEOM\n\nSuch things aside, you can see that a lot can be achieved with a modest amount of code.\n"
                    }
                ]
            },
            "CONCLUSION": {
                "content": "You now have better understanding of what a source filter is, and you might even have a\npossible use for them. If you feel like playing with source filters but need a bit of\ninspiration, here are some extra features you could add to the Debug filter.\n\nFirst, an easy one. Rather than having debugging code that is all-or-nothing, it would be\nmuch more useful to be able to control which specific blocks of debugging code get included.\nTry extending the syntax for debug blocks to allow each to be identified. The contents of the\n\"DEBUG\" environment variable can then be used to control which blocks get included.\n\nOnce you can identify individual blocks, try allowing them to be nested. That isn't difficult\neither.\n\nHere is an interesting idea that doesn't involve the Debug filter.  Currently Perl\nsubroutines have fairly limited support for formal parameter lists. You can specify the\nnumber of parameters and their type, but you still have to manually take them out of the @\narray yourself. Write a source filter that allows you to have a named parameter list. Such a\nfilter would turn this:\n\nsub MySub ($first, $second, @rest) { ... }\n\ninto this:\n\nsub MySub($$@) {\nmy ($first) = shift;\nmy ($second) = shift;\nmy (@rest) = @;\n...\n}\n\nFinally, if you feel like a real challenge, have a go at writing a full-blown Perl macro\npreprocessor as a source filter. Borrow the useful features from the C preprocessor and any\nother macro processors you know. The tricky bit will be choosing how much knowledge of Perl's\nsyntax you want your filter to have.\n",
                "subsections": []
            },
            "LIMITATIONS": {
                "content": "Source filters only work on the string level, thus are highly limited in its ability to\nchange source code on the fly. It cannot detect comments, quoted strings, heredocs, it is no\nreplacement for a real parser.  The only stable usage for source filters are encryption,\ncompression, or the byteloader, to translate binary code back to source code.\n\nSee for example the limitations in Switch, which uses source filters, and thus is does not\nwork inside a string eval, the presence of regexes with embedded newlines that are specified\nwith raw \"/.../\" delimiters and don't have a modifier \"//x\" are indistinguishable from code\nchunks beginning with the division operator \"/\". As a workaround you must use \"m/.../\" or\n\"m?...?\" for such patterns. Also, the presence of regexes specified with raw \"?...?\"\ndelimiters may cause mysterious errors. The workaround is to use \"m?...?\" instead.  See\n<https://metacpan.org/pod/Switch#LIMITATIONS>.\n\nCurrently the content of the \"DATA\" block is not filtered.\n\nCurrently internal buffer lengths are limited to 32-bit only.\n",
                "subsections": []
            },
            "THINGS TO LOOK OUT FOR": {
                "content": "Some Filters Clobber the \"DATA\" Handle\nSome source filters use the \"DATA\" handle to read the calling program.  When using these\nsource filters you cannot rely on this handle, nor expect any particular kind of\nbehavior when operating on it.  Filters based on Filter::Util::Call (and therefore\nFilter::Simple) do not alter the \"DATA\" filehandle, but on the other hand totally ignore\nthe text after \"DATA\".\n",
                "subsections": []
            },
            "REQUIREMENTS": {
                "content": "The Source Filters distribution is available on CPAN, in\n\nCPAN/modules/by-module/Filter\n\nStarting from Perl 5.8 Filter::Util::Call (the core part of the Source Filters distribution)\nis part of the standard Perl distribution.  Also included is a friendlier interface called\nFilter::Simple, by Damian Conway.\n",
                "subsections": []
            },
            "AUTHOR": {
                "content": "Paul Marquess <Paul.Marquess@btinternet.com>\n\nReini Urban <rurban@cpan.org>\n",
                "subsections": []
            },
            "Copyrights": {
                "content": "The first version of this article originally appeared in The Perl Journal #11, and is\ncopyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal.\nThis document may be distributed under the same terms as Perl itself.\n\n\n\nperl v5.34.0                                 2025-07-25                                PERLFILTER(1)",
                "subsections": []
            }
        }
    }
}