{
    "content": [
        {
            "type": "text",
            "text": "# pdftotext(1) (man)\n\n## TLDR\n\n> Convert PDF files to plain text format.\n\n- Convert `filename.pdf` to plain text and print it to `stdout`:\n  `pdftotext {{filename.pdf}} -`\n- Convert `filename.pdf` to plain text and save it as `filename.txt`:\n  `pdftotext {{filename.pdf}}`\n- Convert `filename.pdf` to plain text and preserve the layout:\n  `pdftotext -layout {{filename.pdf}}`\n- Convert `input.pdf` to plain text and save it as `output.txt`:\n  `pdftotext {{input.pdf}} {{output.txt}}`\n- Convert pages 2, 3, and 4 of `input.pdf` to plain text and save them as `output.txt`:\n  `pdftotext -f {{2}} -l {{4}} {{input.pdf}} {{output.txt}}`\n\n*Source: tldr-pages*\n\n---\n\n**Summary:** pdftotext - Portable Document Format (PDF) to text converter (version 3.03)\n\n**Synopsis:** pdftotext [options] [PDF-file [text-file]]\n\n## Flags\n\n| Flag | Long | Arg | Description |\n|------|------|-----|-------------|\n| -f | — | — | Specifies the first page to convert. |\n| -l | — | — | Specifies the last page to convert. |\n| -r | — | — | Specifies the resolution, in DPI. The default is 72 DPI. |\n| -x | — | — | Specifies the x-coordinate of the crop area top left corner |\n| -y | — | — | Specifies the y-coordinate of the crop area top left corner |\n| -W | — | — | Specifies the width of crop area in pixels (default is 0) |\n| -H | — | — | Specifies the height of crop area in pixels (default is 0) |\n| — | — | — | Maintain (as best as possible) the original physical layout of the text. The default is to ´undo' physical layout (colum |\n| — | — | — | Assume fixed-pitch (or tabular) text, with the specified character width (in points). This forces physical layout mode. |\n| — | — | — | formatting, etc. Use of raw mode is no longer recommended. |\n| — | — | — | Discard diagonal text (i.e., text that is not close to one of the 0, 90, 180, or 270 degree axes). This is useful for sk |\n| — | — | — | Generate a simple HTML file, including the meta information. 
This simply wraps the text in <pre> and </pre> and prepends the meta headers. |\n| -bbox | — | — | Generate an XHTML file containing bounding box information for each word in the file. |\n| -bbox-layout | — | — | Generate an XHTML file containing bounding box information for each block, line, and word in the file. |\n| -cropbox | — | — | Use the crop box rather than the media box with -bbox and -bbox-layout. |\n| -colspacing | — | — | Specifies how much spacing we allow after a word before considering adjacent text to be a new column, measured as a fraction of the font size. Current default is 0.7, old releases had a 0.3 default. |\n| -enc | — | — | Sets the encoding to use for text output. This defaults to \"UTF-8\". |\n| -listenc | — | — | Lists the available encodings. |\n| -eol | — | — | Sets the end-of-line convention to use for text output. |\n| -nopgbrk | — | — | Don't insert page breaks (form feed characters) between pages. |\n| -opw | — | — | Specify the owner password for the PDF file. Providing this will bypass all security restrictions. |\n| -upw | — | — | Specify the user password for the PDF file. |\n| -q | — | — | Don't print any messages or errors. |\n| -v | — | — | Print copyright and version information. |\n| -h | --help | — | Print usage information. |\n\n## See Also\n\n- pdfdetach(1)\n- pdffonts(1)\n- pdfimages(1)\n- pdfinfo(1)\n- pdftocairo(1)\n- pdftohtml(1)\n- pdftoppm(1)\n- pdftops(1)\n- pdfseparate(1)\n- pdfsig(1)\n- pdfunite(1)\n\n## Section Outline\n\n- **NAME** (2 lines)\n- **SYNOPSIS** (2 lines)\n- **DESCRIPTION** (6 lines)\n- **OPTIONS** (1 lines) — 25 subsections\n  - -f (2 lines)\n  - -l (2 lines)\n  - -r (2 lines)\n  - -x (2 lines)\n  - -y (2 lines)\n  - -W (2 lines)\n  - -H (2 lines)\n  - -layout (4 lines)\n  - -fixed (3 lines)\n  - -raw (2 lines)\n  - -nodiag (3 lines)\n  - -htmlmeta (3 lines)\n  - -bbox (1 lines)\n  - -bbox-layout (3 lines)\n  - -cropbox (2 lines)\n  - -colspacing (4 lines)\n  - -enc (2 lines)\n  - -listenc (2 lines)\n  - -eol (2 lines)\n  - -nopgbrk (2 lines)\n  - -opw (3 lines)\n  - -upw (2 lines)\n  - -q (1 lines)\n  - -v (1 lines)\n  - -h -help --help (1 lines)\n- **BUGS** (3 lines)\n- **EXIT CODES** (12 lines)\n- **AUTHOR** (2 lines)\n- **SEE ALSO** (6 lines)\n\n## Full Content\n\n### NAME\n\npdftotext - Portable Document Format (PDF) to 
text converter (version 3.03)\n\n### SYNOPSIS\n\npdftotext [options] [PDF-file [text-file]]\n\n### DESCRIPTION\n\nPdftotext converts Portable Document Format (PDF) files to plain text.\n\nPdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is\nnot specified, pdftotext converts file.pdf to file.txt. If text-file is '-', the text is\nsent to stdout. If PDF-file is '-', it reads the PDF file from stdin.\n\n### OPTIONS\n\n#### -f\n\nSpecifies the first page to convert.\n\n#### -l\n\nSpecifies the last page to convert.\n\n#### -r\n\nSpecifies the resolution, in DPI. The default is 72 DPI.\n\n#### -x\n\nSpecifies the x-coordinate of the crop area top left corner.\n\n#### -y\n\nSpecifies the y-coordinate of the crop area top left corner.\n\n#### -W\n\nSpecifies the width of crop area in pixels (default is 0).\n\n#### -H\n\nSpecifies the height of crop area in pixels (default is 0).\n\n#### -layout\n\nMaintain (as best as possible) the original physical layout of the text. The default\nis to 'undo' physical layout (columns, hyphenation, etc.) and output the text in\nreading order.\n\n#### -fixed\n\nAssume fixed-pitch (or tabular) text, with the specified character width (in points).\nThis forces physical layout mode.\n\n#### -raw\n\nKeep the text in content stream order. This is a hack which often 'undoes' column\nformatting, etc. Use of raw mode is no longer recommended.\n\n#### -nodiag\n\nDiscard diagonal text (i.e., text that is not close to one of the 0, 90, 180, or 270\ndegree axes). This is useful for skipping watermarks drawn on body text.\n\n#### -htmlmeta\n\nGenerate a simple HTML file, including the meta information. 
This simply wraps the\ntext in <pre> and </pre> and prepends the meta headers.\n\n#### -bbox\n\nGenerate an XHTML file containing bounding box information for each word in the file.\n\n#### -bbox-layout\n\nGenerate an XHTML file containing bounding box information for each block, line, and\nword in the file.\n\n#### -cropbox\n\nUse the crop box rather than the media box with -bbox and -bbox-layout.\n\n#### -colspacing\n\nSpecifies how much spacing we allow after a word before considering adjacent text to\nbe a new column, measured as a fraction of the font size. Current default is 0.7, old\nreleases had a 0.3 default.\n\n#### -enc\n\nSets the encoding to use for text output. This defaults to \"UTF-8\".\n\n#### -listenc\n\nLists the available encodings.\n\n#### -eol\n\nSets the end-of-line convention to use for text output.\n\n#### -nopgbrk\n\nDon't insert page breaks (form feed characters) between pages.\n\n#### -opw\n\nSpecify the owner password for the PDF file. Providing this will bypass all security\nrestrictions.\n\n#### -upw\n\nSpecify the user password for the PDF file.\n\n#### -q\n\nDon't print any messages or errors.\n\n#### -v\n\nPrint copyright and version information.\n\n#### -h -help --help\n\nPrint usage information.\n\n### BUGS\n\nSome PDF files contain fonts whose encodings have been mangled beyond recognition. There is\nno way (short of OCR) to extract text from these files.\n\n### EXIT CODES\n\nThe Xpdf tools use the following exit codes:\n\n0      No error.\n\n1      Error opening a PDF file.\n\n2      Error opening an output file.\n\n3      Error related to PDF permissions.\n\n99     Other error.\n\n### AUTHOR\n\nThe pdftotext software and documentation are copyright 1996-2011 Glyph & Cog, LLC.\n\n### SEE ALSO\n\npdfdetach(1), pdffonts(1), pdfimages(1), pdfinfo(1), pdftocairo(1), pdftohtml(1),\npdftoppm(1), pdftops(1), pdfseparate(1), pdfsig(1), pdfunite(1)\n\n15 August 2011 pdftotext(1)\n\n"
        }
    ],
    "structuredContent": {
        "command": "pdftotext",
        "section": "1",
        "mode": "man",
        "summary": "pdftotext - Portable Document Format (PDF) to text converter (version 3.03)",
        "synopsis": "pdftotext [options] [PDF-file [text-file]]",
        "tldr_summary": "Convert PDF files to plain text format.",
        "tldr_examples": [
            {
                "description": "Convert `filename.pdf` to plain text and print it to `stdout`",
                "command": "pdftotext {{filename.pdf}} -"
            },
            {
                "description": "Convert `filename.pdf` to plain text and save it as `filename.txt`",
                "command": "pdftotext {{filename.pdf}}"
            },
            {
                "description": "Convert `filename.pdf` to plain text and preserve the layout",
                "command": "pdftotext -layout {{filename.pdf}}"
            },
            {
                "description": "Convert `input.pdf` to plain text and save it as `output.txt`",
                "command": "pdftotext {{input.pdf}} {{output.txt}}"
            },
            {
                "description": "Convert pages 2, 3, and 4 of `input.pdf` to plain text and save them as `output.txt`",
                "command": "pdftotext -f {{2}} -l {{4}} {{input.pdf}} {{output.txt}}"
            }
        ],
        "tldr_source": "official",
        "flags": [
            {
                "flag": "-f",
                "long": null,
                "arg": null,
                "description": "Specifies the first page to convert."
            },
            {
                "flag": "-l",
                "long": null,
                "arg": null,
                "description": "Specifies the last page to convert."
            },
            {
                "flag": "-r",
                "long": null,
                "arg": null,
                "description": "Specifies the resolution, in DPI. The default is 72 DPI."
            },
            {
                "flag": "-x",
                "long": null,
                "arg": null,
                "description": "Specifies the x-coordinate of the crop area top left corner"
            },
            {
                "flag": "-y",
                "long": null,
                "arg": null,
                "description": "Specifies the y-coordinate of the crop area top left corner"
            },
            {
                "flag": "-W",
                "long": null,
                "arg": null,
                "description": "Specifies the width of crop area in pixels (default is 0)"
            },
            {
                "flag": "-H",
                "long": null,
                "arg": null,
                "description": "Specifies the height of crop area in pixels (default is 0)"
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Maintain (as best as possible) the original physical layout of the text. The default is to ´undo' physical layout (columns, hyphenation, etc.) and output the text in read‐ ing order."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Assume fixed-pitch (or tabular) text, with the specified character width (in points). This forces physical layout mode."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "formatting, etc. Use of raw mode is no longer recommended."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Discard diagonal text (i.e., text that is not close to one of the 0, 90, 180, or 270 degree axes). This is useful for skipping watermarks drawn on body text."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Generate a simple HTML file, including the meta information. This simply wraps the text in <pre> and </pre> and prepends the meta headers."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": ""
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Generate an XHTML file containing bounding box information for each block, line, and word in the file."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Use the crop box rather than the media box with -bbox and -bbox-layout."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Specifies how much spacing we allow after a word before considering adjacent text to be a new column, measured as a fraction of the font size. Current default is 0.7, old releases had a 0.3 default."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Sets the encoding to use for text output. This defaults to \"UTF-8\"."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Lists the available encodings"
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Sets the end-of-line convention to use for text output."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Don't insert page breaks (form feed characters) between pages."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Specify the owner password for the PDF file. Providing this will bypass all security restrictions."
            },
            {
                "flag": "",
                "long": null,
                "arg": null,
                "description": "Specify the user password for the PDF file."
            },
            {
                "flag": "-q",
                "long": null,
                "arg": null,
                "description": ""
            },
            {
                "flag": "-v",
                "long": null,
                "arg": null,
                "description": ""
            },
            {
                "flag": "-h",
                "long": "--help",
                "arg": null,
                "description": ""
            }
        ],
        "examples": [],
        "see_also": [
            {
                "name": "pdfdetach",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfdetach/1/json"
            },
            {
                "name": "pdffonts",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdffonts/1/json"
            },
            {
                "name": "pdfimages",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfimages/1/json"
            },
            {
                "name": "pdfinfo",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfinfo/1/json"
            },
            {
                "name": "pdftocairo",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdftocairo/1/json"
            },
            {
                "name": "pdftohtml",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdftohtml/1/json"
            },
            {
                "name": "pdftoppm",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdftoppm/1/json"
            },
            {
                "name": "pdftops",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdftops/1/json"
            },
            {
                "name": "pdfseparate",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfseparate/1/json"
            },
            {
                "name": "pdfsig",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfsig/1/json"
            },
            {
                "name": "pdfunite",
                "section": "1",
                "url": "https://www.chedong.com/phpMan.php/man/pdfunite/1/json"
            }
        ],
        "section_outline": [
            {
                "name": "NAME",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "SYNOPSIS",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "DESCRIPTION",
                "lines": 6,
                "subsections": []
            },
            {
                "name": "OPTIONS",
                "lines": 1,
                "subsections": [
                    {
                        "name": "-f",
                        "lines": 2,
                        "flag": "-f"
                    },
                    {
                        "name": "-l",
                        "lines": 2,
                        "flag": "-l"
                    },
                    {
                        "name": "-r",
                        "lines": 2,
                        "flag": "-r"
                    },
                    {
                        "name": "-x",
                        "lines": 2,
                        "flag": "-x"
                    },
                    {
                        "name": "-y",
                        "lines": 2,
                        "flag": "-y"
                    },
                    {
                        "name": "-W",
                        "lines": 2,
                        "flag": "-W"
                    },
                    {
                        "name": "-H",
                        "lines": 2,
                        "flag": "-H"
                    },
                    {
                        "name": "-layout",
                        "lines": 4
                    },
                    {
                        "name": "-fixed",
                        "lines": 3
                    },
                    {
                        "name": "-raw",
                        "lines": 2
                    },
                    {
                        "name": "-nodiag",
                        "lines": 3
                    },
                    {
                        "name": "-htmlmeta",
                        "lines": 3
                    },
                    {
                        "name": "-bbox",
                        "lines": 1
                    },
                    {
                        "name": "-bbox-layout",
                        "lines": 3
                    },
                    {
                        "name": "-cropbox",
                        "lines": 2
                    },
                    {
                        "name": "-colspacing",
                        "lines": 4
                    },
                    {
                        "name": "-enc",
                        "lines": 2
                    },
                    {
                        "name": "-listenc",
                        "lines": 2
                    },
                    {
                        "name": "-eol",
                        "lines": 2
                    },
                    {
                        "name": "-nopgbrk",
                        "lines": 2
                    },
                    {
                        "name": "-opw",
                        "lines": 3
                    },
                    {
                        "name": "-upw",
                        "lines": 2
                    },
                    {
                        "name": "-q",
                        "lines": 1,
                        "flag": "-q"
                    },
                    {
                        "name": "-v",
                        "lines": 1,
                        "flag": "-v"
                    },
                    {
                        "name": "-h -help --help",
                        "lines": 1,
                        "flag": "-h",
                        "long": "--help"
                    }
                ]
            },
            {
                "name": "BUGS",
                "lines": 3,
                "subsections": []
            },
            {
                "name": "EXIT CODES",
                "lines": 12,
                "subsections": []
            },
            {
                "name": "AUTHOR",
                "lines": 2,
                "subsections": []
            },
            {
                "name": "SEE ALSO",
                "lines": 6,
                "subsections": []
            }
        ]
    }
}