XPDF Tools doc: pdfdetach.txt: ```pdfdetach(1) ...
Created on: March 15, 2025
Answered using O3 Mini High by Chat01
XPDF Tools doc:
pdfdetach.txt:
pdfdetach(1)

NAME
    pdfdetach - Portable Document Format (PDF) document embedded file
    extractor (version 4.04)

SYNOPSIS
    pdfdetach [options] [PDF-file]

DESCRIPTION
    Pdfdetach lists or extracts embedded files (attachments) from a
    Portable Document Format (PDF) file.

CONFIGURATION FILE
    Pdfdetach reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdfdetach is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Some of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -list
        List all of the embedded files in the PDF file. File names are
        converted to the text encoding specified by the "-enc" switch.

    -save number
        Save the specified embedded file. By default, this uses the file
        name associated with the embedded file (as printed by the
        "-list" switch); the file name can be changed with the "-o"
        switch.

    -saveall
        Save all of the embedded files. This uses the file names
        associated with the embedded files (as printed by the "-list"
        switch). By default, the files are saved in the current
        directory; this can be changed with the "-o" switch.

    -o path
        Set the file name used when saving an embedded file with the
        "-save" switch, or the directory used by "-saveall".

    -enc encoding-name
        Sets the encoding to use for text output (embedded file names).
        The encoding-name must be defined with the unicodeMap command
        (see xpdfrc(5)). This defaults to "Latin1" (which is a built-in
        encoding). [config file: textEncoding]

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config
        file.

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdfdetach software and documentation are copyright 1996-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1),
    pdffonts(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdfdetach(1)
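The pdfdetach options above combine in a small number of ways, which a minimal Python sketch can make concrete. The helper name `pdfdetach_args` and its parameters are hypothetical, chosen here for illustration. It only builds an argument list from the documented switches and does not run the tool. Rejecting `save` together with `save_all` is a choice made in this sketch, not a rule stated in the man page.

```python
from typing import List, Optional


def pdfdetach_args(pdf_file: str,
                   save: Optional[int] = None,
                   save_all: bool = False,
                   out_path: Optional[str] = None,
                   encoding: Optional[str] = None) -> List[str]:
    """Build an argv list for pdfdetach (hypothetical helper, not part of Xpdf)."""
    # Assumption of this sketch: only one of the two save actions is requested.
    if save is not None and save_all:
        raise ValueError("choose either save or save_all, not both")
    args = ["pdfdetach"]
    if save is not None:
        args += ["-save", str(save)]   # save one embedded file by number
    elif save_all:
        args.append("-saveall")        # save every embedded file
    else:
        args.append("-list")           # default action in this sketch: list files
    if out_path is not None:
        args += ["-o", out_path]       # file name for -save, directory for -saveall
    if encoding is not None:
        args += ["-enc", encoding]     # text encoding for embedded file names
    args.append(pdf_file)
    return args
```

The resulting list could be passed to `subprocess.run` on a system where pdfdetach is installed.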
pdffonts.txt:
pdffonts(1)

NAME
    pdffonts - Portable Document Format (PDF) font analyzer (version
    4.04)

SYNOPSIS
    pdffonts [options] [PDF-file]

DESCRIPTION
    Pdffonts lists the fonts used in a Portable Document Format (PDF)
    file along with various information for each font.

    The following information is listed for each font:

    name
        the font name, exactly as given in the PDF file (potentially
        including a subset prefix)

    type
        the font type -- see below for details

    emb
        "yes" if the font is embedded in the PDF file

    sub
        "yes" if the font is a subset

    uni
        "yes" if there is an explicit "ToUnicode" map in the PDF file
        (the absence of a ToUnicode map doesn't necessarily mean that
        the text can't be converted to Unicode)

    prob
        "X" if this font is likely to be problematic when converting
        text to Unicode

    object ID
        the font dictionary object ID (number and generation)

    location
        the font location (see the -loc and -locPS options).

    PDF files can contain the following types of fonts:

        Type 1
        Type 1C -- aka Compact Font Format (CFF)
        Type 1C (OT) -- OpenType with 8-bit CFF data
        Type 3
        TrueType
        TrueType (OT) -- OpenType with 8-bit TrueType data
        CID Type 0 -- 16-bit font with no specified type
        CID Type 0C -- 16-bit PostScript CFF font
        CID Type 0C (OT) -- OpenType with CID CFF data
        CID TrueType -- 16-bit TrueType font
        CID TrueType (OT) -- OpenType with CID TrueType data

CONFIGURATION FILE
    Pdffonts reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdffonts is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to analyze.

    -loc
        Shows additional information on the location of the font that
        will be used when the PDF file is rasterized (with xpdf,
        pdftoppm, etc.).

    -locPS
        Shows additional information on the location of the font that
        will be used when the PDF file is converted to PostScript (with
        pdftops).

    -l number
        Specifies the last page to analyze.

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config
        file.

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdffonts software and documentation are copyright 1996-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1),
    pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdffonts(1)
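Every tool in this set documents the same exit code table, so a script that wraps them can share one lookup. The sketch below is a minimal Python illustration; `XPDF_EXIT_CODES` and `describe_exit_code` are hypothetical names, and the messages are copied from the EXIT CODES section above.

```python
# Exit codes as listed in the EXIT CODES section of the Xpdf tool man pages.
XPDF_EXIT_CODES = {
    0: "No error.",
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
    99: "Other error.",
}


def describe_exit_code(code: int) -> str:
    """Return the documented meaning of an Xpdf tool exit code."""
    # Codes outside the documented table are reported as such rather than guessed.
    return XPDF_EXIT_CODES.get(code, f"Undocumented exit code {code}.")
```

A wrapper would typically feed this the `returncode` of a finished `subprocess.run` call.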
pdfimages.txt:
pdfimages(1)

NAME
    pdfimages - Portable Document Format (PDF) image extractor (version
    4.04)

SYNOPSIS
    pdfimages [options] PDF-file image-root

DESCRIPTION
    Pdfimages saves images from a Portable Document Format (PDF) file as
    Portable Pixmap (PPM), Portable Graymap (PGM), Portable Bitmap
    (PBM), or JPEG files.

    Pdfimages reads the PDF file, PDF-file, scans one or more pages, and
    writes one PPM, PGM, PBM, or JPEG file for each image,
    image-root-nnnn.xxx, where nnnn is the image number and xxx is the
    image type (.ppm, .pgm, .pbm, .jpg).

    NB: pdfimages extracts the raw image data from the PDF file, without
    performing any additional transforms. Any rotation, clipping, color
    inversion, etc. done by the PDF content stream is ignored.

CONFIGURATION FILE
    Pdfimages reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdfimages is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to scan.

    -l number
        Specifies the last page to scan.

    -j
        Normally, all images are written as PBM (for monochrome images),
        PGM (for grayscale images), or PPM (for color images) files.
        With this option, images in DCT format are saved as JPEG files.
        All non-DCT images are saved in PBM/PGM/PPM format as usual.
        (Inline images are always saved in PBM/PGM/PPM format.)

    -raw
        Write all images in PDF-native formats. Most of the formats are
        not standard image formats, so this option is primarily useful
        as input to a tool that generates PDF files. (Inline images are
        always saved in PBM/PGM/PPM format.)

    -list
        Write a one-line summary to stdout for each image. The summary
        provides the image file name, the page number, the image width
        and height, the horizontal and vertical resolution (DPI) as
        drawn, the color space type, and the number of bits per
        component (BPC).

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page.
        [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdfimages software and documentation are copyright 1998-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1),
    pdffonts(1), pdfdetach(1), pdftoppm(1), pdftopng(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdfimages(1)
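The output naming scheme image-root-nnnn.xxx described above can be sketched in Python, for example to predict which files a run should have produced. The function name `image_file_name` is hypothetical. Zero-padding the image number to four digits is an assumption read off the "nnnn" placeholder, and the sketch covers only the four default and -j image types, not the formats written by -raw.

```python
# Image types named in the pdfimages description (default output plus -j).
ALLOWED_TYPES = {"ppm", "pgm", "pbm", "jpg"}


def image_file_name(image_root: str, image_number: int, image_type: str) -> str:
    """Compose an expected pdfimages output name (hypothetical helper)."""
    if image_type not in ALLOWED_TYPES:
        raise ValueError(f"unexpected image type: {image_type}")
    # Assumption: "nnnn" means the image number zero-padded to four digits.
    return f"{image_root}-{image_number:04d}.{image_type}"
```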
pdfinfo.txt:
pdfinfo(1)

NAME
    pdfinfo - Portable Document Format (PDF) document information
    extractor (version 4.04)

SYNOPSIS
    pdfinfo [options] [PDF-file]

DESCRIPTION
    Pdfinfo prints the contents of the 'Info' dictionary (plus some
    other useful information) from a Portable Document Format (PDF)
    file.

    The 'Info' dictionary contains the following values:

        title
        subject
        keywords
        author
        creator
        producer
        creation date
        modification date

    In addition, the following information is printed:

        tagged (yes/no)
        form (AcroForm / static XFA / dynamic XFA / none)
        page count
        encrypted flag (yes/no)
        print and copy permissions (if encrypted)
        page size and rotation
        file size
        linearized (yes/no)
        PDF version
        metadata (only if requested)

CONFIGURATION FILE
    Pdfinfo reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdfinfo is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to examine. If multiple pages are
        requested using the "-f" and "-l" options, the size of each
        requested page (and, optionally, the bounding boxes for each
        requested page) are printed. Otherwise, only page one is
        examined.

    -l number
        Specifies the last page to examine.

    -box
        Prints the page box bounding boxes: MediaBox, CropBox, BleedBox,
        TrimBox, and ArtBox.

    -meta
        Prints document-level metadata. (This is the "Metadata" stream
        from the PDF file's Catalog object.)

    -rawdates
        Prints the raw (undecoded) date strings, directly from the PDF
        file.

    -enc encoding-name
        Sets the encoding to use for text output. The encoding-name must
        be defined with the unicodeMap command (see xpdfrc(5)). This
        defaults to "Latin1" (which is a built-in encoding). [config
        file: textEncoding]

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config
        file.

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdfinfo software and documentation are copyright 1996-2022 Glyph
    & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdffonts(1),
    pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdfinfo(1)
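A script consuming pdfinfo's report usually wants the values as a dictionary. The man page above does not specify the output layout, so the following Python sketch rests on an assumption: that each field is printed on its own line as a label, a colon, and a value. The function name `parse_pdfinfo_output` is hypothetical. Output produced with -meta (an XML metadata stream) or other multi-line sections would not fit this simple model.

```python
from typing import Dict


def parse_pdfinfo_output(text: str) -> Dict[str, str]:
    """Parse 'Label: value' lines into a dict (assumed layout, see lead-in)."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as dates keep theirs.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like 'Label: value'
        fields[key.strip()] = value.strip()
    return fields
```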
pdftohtml.txt:
pdftohtml(1)

NAME
    pdftohtml - Portable Document Format (PDF) to HTML converter
    (version 4.04)

SYNOPSIS
    pdftohtml [options] PDF-file HTML-dir

DESCRIPTION
    Pdftohtml converts Portable Document Format (PDF) files to HTML.

    Pdftohtml reads the PDF file, PDF-file, and places an HTML file for
    each page, along with auxiliary images in the directory, HTML-dir.
    The HTML directory will be created; if it already exists, pdftohtml
    will report an error.

CONFIGURATION FILE
    Pdftohtml reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdftohtml is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to convert.

    -l number
        Specifies the last page to convert.

    -z number
        Specifies the initial zoom level. The default is 1.0, which
        means 72dpi, i.e., 1 point in the PDF file will be 1 pixel in
        the HTML. Using '-z 1.5', for example, will make the initial
        view 50% larger.

    -r number
        Specifies the resolution, in DPI, for background images. This
        controls the pixel size of the background image files. The
        initial zoom level is controlled by the '-z' option. Specifying
        a larger '-r' value will allow the viewer to zoom in farther
        without upscaling artifacts in the background.

    -vstretch number
        Specifies a vertical stretch factor. Setting this to a value
        greater than 1.0 will stretch each page vertically, spreading
        out the lines. This also stretches the background image to
        match.

    -embedbackground
        Embeds the background image as base64-encoded data directly in
        the HTML file, rather than storing it as a separate file.

    -nofonts
        Disable extraction of embedded fonts. By default, pdftohtml
        extracts TrueType and OpenType fonts. Disabling extraction can
        work around problems with buggy fonts.

    -embedfonts
        Embeds any extracted fonts as base64-encoded data directly in
        the HTML file, rather than storing them as separate files.

    -skipinvisible
        Don't draw invisible text. By default, invisible text (commonly
        used in OCR'ed PDF files) is drawn as transparent (alpha=0) HTML
        text. This option tells pdftohtml to discard invisible text
        entirely.

    -allinvisible
        Treat all text as invisible. By default, regular
        (non-invisible) text is not drawn in the background image, and
        is instead drawn with HTML on top of the image. This option
        tells pdftohtml to include the regular text in the background
        image, and then draw it as transparent (alpha=0) HTML text.

    -formfields
        Convert AcroForm text and checkbox fields to HTML input
        elements. This also removes text (e.g., underscore characters)
        and erases background image content (e.g., lines or boxes) in
        the field areas.

    -table
        Use table mode when performing the underlying text extraction.
        This will generally produce better output when the PDF content
        is a full-page table. NB: This does not generate HTML tables; it
        just changes the way text is split up.

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page.
        [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config
        file.

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

BUGS
    Some PDF files contain fonts whose encodings have been mangled
    beyond recognition. There is no way (short of OCR) to extract text
    from these files.

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdftohtml software and documentation are copyright 1996-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdfinfo(1), pdffonts(1),
    pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdftohtml(1)
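The -z description above amounts to simple arithmetic: at zoom 1.0 one PDF point maps to one HTML pixel (72 dpi), and other zoom levels scale that linearly. A minimal Python sketch of the relationship follows; the function names `html_pixels` and `effective_dpi` are hypothetical, and the sketch says nothing about how -r or -vstretch interact with the result.

```python
def html_pixels(points: float, zoom: float = 1.0) -> float:
    """Initial on-screen size, in pixels, of a length given in PDF points."""
    # At zoom 1.0 the man page equates 1 point with 1 pixel.
    return points * zoom


def effective_dpi(zoom: float = 1.0) -> float:
    """Initial view resolution implied by a zoom level (1.0 means 72 dpi)."""
    return 72.0 * zoom
```

For instance, a page 612 points wide viewed with '-z 1.5' starts out 918 pixels wide, which corresponds to 108 dpi.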
pdftopng.txt:
pdftopng(1)

NAME
    pdftopng - Portable Document Format (PDF) to Portable Network
    Graphics (PNG) converter (version 4.04)

SYNOPSIS
    pdftopng [options] PDF-file PNG-root

DESCRIPTION
    Pdftopng converts Portable Document Format (PDF) files to color,
    grayscale, or monochrome image files in Portable Network Graphics
    (PNG) format.

    Pdftopng reads the PDF file, PDF-file, and writes one PNG file for
    each page, PNG-root-nnnnnn.png, where nnnnnn is the page number. If
    PNG-root is '-', the image is sent to stdout (this is probably only
    useful when converting a single page).

CONFIGURATION FILE
    Pdftopng reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdftopng is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to convert.

    -l number
        Specifies the last page to convert.

    -r number
        Specifies the resolution, in DPI. The default is 150 DPI.

    -mono
        Generate a monochrome image (instead of a color image).

    -gray
        Generate a grayscale image (instead of a color image).

    -alpha
        Generate an alpha channel in the PNG file. This is only useful
        with PDF files that have been constructed with a transparent
        background. The -alpha flag cannot be used with -mono.

    -rot angle
        Rotate pages by 0 (the default), 90, 180, or 270 degrees.

    -freetype yes | no
        Enable or disable FreeType (a TrueType / Type 1 font
        rasterizer). This defaults to "yes". [config file:
        enableFreeType]

    -aa yes | no
        Enable or disable font anti-aliasing. This defaults to "yes".
        [config file: antialias]

    -aaVector yes | no
        Enable or disable vector anti-aliasing. This defaults to "yes".
        [config file: vectorAntialias]

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page.
        [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdftopng software and documentation are copyright 1996-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1),
    pdffonts(1), pdfdetach(1), pdftoppm(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdftopng(1)
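Two details of the pdftopng description lend themselves to a short Python sketch: the per-page file name PNG-root-nnnnnn.png and the effect of the -r resolution on image size. Both function names are hypothetical. Zero-padding to six digits is read off the "nnnnnn" placeholder, the size estimate relies on a PDF point being 1/72 inch, and rounding to the nearest pixel is an assumption; the tool itself may round differently.

```python
from typing import Tuple


def png_page_name(png_root: str, page_number: int) -> str:
    """Expected output name for one page (assumes six-digit zero padding)."""
    return f"{png_root}-{page_number:06d}.png"


def raster_size(width_pt: float, height_pt: float,
                dpi: float = 150.0) -> Tuple[int, int]:
    """Approximate pixel size of a page rendered at the given DPI."""
    # One PDF point is 1/72 inch, so pixels = points * dpi / 72.
    # Rounding to the nearest pixel is an assumption of this sketch.
    return round(width_pt * dpi / 72.0), round(height_pt * dpi / 72.0)
```

At the default 150 DPI, a US Letter page (612 x 792 points) comes out at roughly 1275 x 1650 pixels.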
pdftoppm.txt:
pdftoppm(1)

NAME
    pdftoppm - Portable Document Format (PDF) to Portable Pixmap (PPM)
    converter (version 4.04)

SYNOPSIS
    pdftoppm [options] PDF-file PPM-root

DESCRIPTION
    Pdftoppm converts Portable Document Format (PDF) files to color
    image files in Portable Pixmap (PPM) format, grayscale image files
    in Portable Graymap (PGM) format, or monochrome image files in
    Portable Bitmap (PBM) format.

    Pdftoppm reads the PDF file, PDF-file, and writes one PPM file for
    each page, PPM-root-nnnnnn.ppm, where nnnnnn is the page number. If
    PPM-root is '-', the image is sent to stdout (this is probably only
    useful when converting a single page).

CONFIGURATION FILE
    Pdftoppm reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdftoppm is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to convert.

    -l number
        Specifies the last page to convert.

    -r number
        Specifies the resolution, in DPI. The default is 150 DPI.

    -mono
        Generate a monochrome PBM file (instead of an RGB PPM file).

    -gray
        Generate a grayscale PGM file (instead of an RGB PPM file).

    -cmyk
        Generate a CMYK PAM file (instead of an RGB PPM file).

    -rot angle
        Rotate pages by 0 (the default), 90, 180, or 270 degrees.

    -freetype yes | no
        Enable or disable FreeType (a TrueType / Type 1 font
        rasterizer). This defaults to "yes". [config file:
        enableFreeType]

    -aa yes | no
        Enable or disable font anti-aliasing. This defaults to "yes".
        [config file: antialias]

    -aaVector yes | no
        Enable or disable vector anti-aliasing. This defaults to "yes".
        [config file: vectorAntialias]

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page.
        [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdftoppm software and documentation are copyright 1996-2022
    Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1),
    pdffonts(1), pdfdetach(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdftoppm(1)
pdftops.txt:
pdftops(1)

NAME
    pdftops - Portable Document Format (PDF) to PostScript converter
    (version 4.04)

SYNOPSIS
    pdftops [options] [PDF-file [PS-file]]

DESCRIPTION
    Pdftops converts Portable Document Format (PDF) files to PostScript
    so they can be printed.

    Pdftops reads the PDF file, PDF-file, and writes a PostScript file,
    PS-file. If PS-file is not specified, pdftops converts file.pdf to
    file.ps (or file.eps with the -eps option). If PS-file is '-', the
    PostScript is sent to stdout.

CONFIGURATION FILE
    Pdftops reads a configuration file at startup. It first tries to
    find the user's private config file, ~/.xpdfrc. If that doesn't
    exist, it looks for a system-wide config file, typically /etc/xpdfrc
    (but this location can be changed when pdftops is built). See the
    xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file
    commands. These are listed in square brackets with the description
    of the corresponding command line option.

    -f number
        Specifies the first page to print.

    -l number
        Specifies the last page to print.

    -level1
        Generate Level 1 PostScript. The resulting PostScript files will
        be significantly larger (if they contain images), but will print
        on Level 1 printers. This also converts all images to black and
        white. No more than one of the PostScript level options
        (-level1, -level1sep, -level2, -level2sep, -level3, -level3sep)
        may be given. [config file: psLevel]

    -level1sep
        Generate Level 1 separable PostScript. All colors are converted
        to CMYK. Images are written with separate stream data for the
        four components. [config file: psLevel]

    -level2
        Generate Level 2 PostScript. Level 2 supports color images and
        image compression. This is the default setting. [config file:
        psLevel]

    -level2gray
        Generate grayscale Level 2 PostScript. All colors, including
        images, are converted to grayscale. [config file: psLevel]

    -level2sep
        Generate Level 2 separable PostScript. All colors are converted
        to CMYK. The PostScript separation convention operators are used
        to handle custom (spot) colors. [config file: psLevel]

    -level3
        Generate Level 3 PostScript. This enables all Level 2 features
        plus CID font embedding and masked image generation. [config
        file: psLevel]

    -level3gray
        Generate grayscale Level 3 PostScript. All colors, including
        images, are converted to grayscale. [config file: psLevel]

    -level3sep
        Generate Level 3 separable PostScript. The separation handling
        is the same as for -level2sep. [config file: psLevel]

    -eps
        Generate an Encapsulated PostScript (EPS) file. An EPS file
        contains a single image, so if you use this option with a
        multi-page PDF file, you must use -f and -l to specify a single
        page. No more than one of the mode options (-eps, -form) may be
        given.

    -form
        Generate a PostScript form which can be imported by software
        that understands forms. A form contains a single page, so if you
        use this option with a multi-page PDF file, you must use -f and
        -l to specify a single page. The -level1 option cannot be used
        with -form.

    -opi
        Generate OPI comments for all images and forms which have OPI
        information. (This option is only available if pdftops was
        compiled with OPI support.) [config file: psOPI]

    -noembt1
        By default, any Type 1 fonts which are embedded in the PDF file
        are copied into the PostScript file. This option causes pdftops
        to substitute base fonts instead. Embedded fonts make PostScript
        files larger, but may be necessary for readable output. [config
        file: psEmbedType1Fonts]

    -noembtt
        By default, any TrueType fonts which are embedded in the PDF
        file are copied into the PostScript file. This option causes
        pdftops to substitute base fonts instead. Embedded fonts make
        PostScript files larger, but may be necessary for readable
        output. Also, some PostScript interpreters do not have TrueType
        rasterizers. [config file: psEmbedTrueTypeFonts]

    -noembcidps
        By default, any CID PostScript fonts which are embedded in the
        PDF file are copied into the PostScript file. This option
        disables that embedding. No attempt is made to substitute for
        non-embedded CID PostScript fonts. [config file:
        psEmbedCIDPostScriptFonts]

    -noembcidtt
        By default, any CID TrueType fonts which are embedded in the PDF
        file are copied into the PostScript file. This option disables
        that embedding. No attempt is made to substitute for
        non-embedded CID TrueType fonts. [config file:
        psEmbedCIDTrueTypeFonts]

    -preload
        Convert PDF forms to PS procedures, and preload image data. This
        uses more memory in the PostScript interpreter, but generates
        significantly smaller PS files in situations where, e.g., the
        same image is drawn on every page of a long document.

    -paper size
        Set the paper size to one of "letter", "legal", "A4", or "A3".
        This can also be set to "match", which will set the paper size
        to match the size specified in the PDF file. [config file:
        psPaperSize]

    -paperw size
        Set the paper width, in points. [config file: psPaperSize]

    -paperh size
        Set the paper height, in points. [config file: psPaperSize]

    -nocrop
        By default, output is cropped to the CropBox specified in the
        PDF file. This option disables cropping. [config file: psCrop]

    -expand
        Expand PDF pages smaller than the paper to fill the paper. By
        default, these pages are not scaled. [config file:
        psExpandSmaller]

    -noshrink
        Don't scale PDF pages which are larger than the paper. By
        default, pages larger than the paper are shrunk to fit. [config
        file: psShrinkLarger]

    -nocenter
        By default, PDF pages smaller than the paper (after any scaling)
        are centered on the paper. This option causes them to be aligned
        to the lower-left corner of the paper instead. [config file:
        psCenter]

    -pagecrop
        Treat the CropBox as the PDF page size. By default, the MediaBox
        is used as the page size. [config file: psUseCropBoxAsPage]

    -userunit
        Honor the UserUnit settings on PDF pages when computing
        page/paper size. By default, pdftops ignores UserUnit.

    -duplex
        Set the Duplex pagedevice entry in the PostScript file. This
        tells duplex-capable printers to enable duplexing. [config file:
        psDuplex]

    -opw password
        Specify the owner password for the PDF file. Providing this will
        bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page.
        [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config
        file.

    -v
        Print copyright and version information.

    -h
        Print usage information. (-help and --help are equivalent.)

EXIT CODES
    The Xpdf tools use the following exit codes:

    0    No error.
    1    Error opening a PDF file.
    2    Error opening an output file.
    3    Error related to PDF permissions.
    99   Other error.

AUTHOR
    The pdftops software and documentation are copyright 1996-2022 Glyph
    & Cog, LLC.

SEE ALSO
    xpdf(1), pdftotext(1), pdftohtml(1), pdfinfo(1), pdffonts(1),
    pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdftops(1)
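The pdftops page states three combination rules: at most one PostScript level option, at most one mode option (-eps, -form), and no -level1 together with -form. A minimal Python sketch of a pre-flight check follows. The function name `check_pdftops_options` is hypothetical. Treating -level2gray and -level3gray as level options is an assumption: the man page's exclusivity sentence does not list them, although they set the same psLevel config command.

```python
from typing import Iterable, List

# Level options; the two "gray" entries are an assumption (see lead-in).
LEVEL_OPTIONS = {
    "-level1", "-level1sep",
    "-level2", "-level2gray", "-level2sep",
    "-level3", "-level3gray", "-level3sep",
}
MODE_OPTIONS = {"-eps", "-form"}


def check_pdftops_options(options: Iterable[str]) -> List[str]:
    """Return a list of rule violations; an empty list means no conflict found."""
    opts = list(options)
    problems: List[str] = []
    levels = [o for o in opts if o in LEVEL_OPTIONS]
    modes = [o for o in opts if o in MODE_OPTIONS]
    if len(levels) > 1:
        problems.append("more than one PostScript level option: " + " ".join(levels))
    if len(modes) > 1:
        problems.append("more than one mode option: " + " ".join(modes))
    if "-level1" in opts and "-form" in opts:
        problems.append("-level1 cannot be used with -form")
    return problems
```

The check looks only at option names, so it does not verify that -eps or -form is paired with a single-page -f/-l range.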
pdftotext.txt:
pdftotext(1)

NAME
    pdftotext - Portable Document Format (PDF) to text converter (version 4.04)

SYNOPSIS
    pdftotext [options] [PDF-file [text-file]]

DESCRIPTION
    Pdftotext converts Portable Document Format (PDF) files to plain text.

    Pdftotext reads the PDF file, PDF-file, and writes a text file, text-file. If text-file is not specified, pdftotext converts file.pdf to file.txt. If text-file is '-', the text is sent to stdout.

CONFIGURATION FILE
    Pdftotext reads a configuration file at startup. It first tries to find the user's private config file, ~/.xpdfrc. If that doesn't exist, it looks for a system-wide config file, typically /etc/xpdfrc (but this location can be changed when pdftotext is built). See the xpdfrc(5) man page for details.

OPTIONS
    Many of the following options can be set with configuration file commands. These are listed in square brackets with the description of the corresponding command line option.

    -f number
        Specifies the first page to convert.

    -l number
        Specifies the last page to convert.

    -layout
        Maintain (as best as possible) the original physical layout of the text. The default is to 'undo' physical layout (columns, hyphenation, etc.) and output the text in reading order. If the -fixed option is given, character spacing within each line will be determined by the specified character pitch.

    -simple
        Similar to -layout, but optimized for simple one-column pages. This mode will do a better job of maintaining horizontal spacing, but it will only work properly with a single column of text.

    -simple2
        Similar to -simple, but handles slightly rotated text (e.g., OCR output) better. Only works for pages with a single column of text.

    -table
        Table mode is similar to physical layout mode, but optimized for tabular data, with the goal of keeping rows and columns aligned (at the expense of inserting extra whitespace). If the -fixed option is given, character spacing within each line will be determined by the specified character pitch.

    -lineprinter
        Line printer mode uses a strict fixed-character-pitch and -height layout. That is, the page is broken into a grid, and characters are placed into that grid. If the grid spacing is too small for the actual characters, the result is extra whitespace. If the grid spacing is too large, the result is missing whitespace. The grid spacing can be specified using the -fixed and -linespacing options. If one or both are not given on the command line, pdftotext will attempt to compute appropriate value(s).

    -raw
        Keep the text in content stream order. Depending on how the PDF file was generated, this may or may not be useful.

    -fixed number
        Specify the character pitch (character width), in points, for physical layout, table, or line printer mode. This is ignored in all other modes.

    -linespacing number
        Specify the line spacing, in points, for line printer mode. This is ignored in all other modes.

    -clip
        Text which is hidden because of clipping is removed before doing layout, and then added back in. This can be helpful for tables where clipped (invisible) text would overlap the next column.

    -nodiag
        Diagonal text, i.e., text that is not close to one of the 0, 90, 180, or 270 degree axes, is discarded. This is useful to skip watermarks drawn on top of body text, etc.

    -enc encoding-name
        Sets the encoding to use for text output. The encoding-name must be defined with the unicodeMap command (see xpdfrc(5)). The encoding name is case-sensitive. This defaults to "Latin1" (which is a built-in encoding). [config file: textEncoding]

    -eol unix | dos | mac
        Sets the end-of-line convention to use for text output. [config file: textEOL]

    -nopgbrk
        Don't insert page breaks (form feed characters) at the end of each page. [config file: textPageBreaks]

    -bom
        Insert a Unicode byte order marker (BOM) at the start of the text output.

    -marginl number
        Specifies the left margin, in points. Text in the left margin (i.e., within that many points of the left edge of the page) is discarded. The default value is zero.

    -marginr number
        Specifies the right margin, in points. Text in the right margin (i.e., within that many points of the right edge of the page) is discarded. The default value is zero.

    -margint number
        Specifies the top margin, in points. Text in the top margin (i.e., within that many points of the top edge of the page) is discarded. The default value is zero.

    -marginb number
        Specifies the bottom margin, in points. Text in the bottom margin (i.e., within that many points of the bottom edge of the page) is discarded. The default value is zero.

    -opw password
        Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

    -upw password
        Specify the user password for the PDF file.

    -verbose
        Print a status message (to stdout) before processing each page. [config file: printStatusInfo]

    -q
        Don't print any messages or errors. [config file: errQuiet]

    -cfg config-file
        Read config-file in place of ~/.xpdfrc or the system-wide config file.

    -listencodings
        List all available text output encodings, then exit.

    -v
        Print copyright and version information, then exit.

    -h
        Print usage information, then exit. (-help and --help are equivalent.)

BUGS
    Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.

EXIT CODES
    The Xpdf tools use the following exit codes:

    0     No error.
    1     Error opening a PDF file.
    2     Error opening an output file.
    3     Error related to PDF permissions.
    99    Other error.

AUTHOR
    The pdftotext software and documentation are copyright 1996-2022 Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1), xpdfrc(5)
    http://www.xpdfreader.com/

18 Apr 2022    pdftotext(1)
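A minimal sketch of how a caller could turn the exit codes listed in the EXIT CODES section into readable messages, for instance after launching pdftotext as a child process. DescribeXpdfExitCode is a hypothetical helper name. It is not part of Xpdf or of the project units, and the message strings simply restate the man page.

```delphi
program XpdfExitCodeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: maps an Xpdf tool exit code to the description
// given in the EXIT CODES section of the man page.
function DescribeXpdfExitCode(ExitCode: Integer): string;
begin
  case ExitCode of
    0:  Result := 'No error';
    1:  Result := 'Error opening a PDF file';
    2:  Result := 'Error opening an output file';
    3:  Result := 'Error related to PDF permissions';
    99: Result := 'Other error';
  else
    // Anything else is not documented in the man page.
    Result := Format('Undocumented exit code %d', [ExitCode]);
  end;
end;

begin
  Writeln(DescribeXpdfExitCode(0));
  Writeln(DescribeXpdfExitCode(3));
  Writeln(DescribeXpdfExitCode(42));
end.
```

Keeping the mapping in one function means a wrapper can log the same wording for every Xpdf tool, since the man page states that the codes are shared across the tools.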
xpdfrc.txt:
xpdfrc(5)

NAME
    xpdfrc - configuration file for Xpdf tools (version 4.04)

DESCRIPTION
    All of the Xpdf tools read a single configuration file.

    On Linux/Unix/MacOS: if you have a .xpdfrc file in your home directory, it will be read. Otherwise, a system-wide configuration file will be read from /etc/xpdfrc, if it exists. (This is its default location; depending on build options, it may be placed elsewhere.)

    On Windows: the file must be named xpdfrc (no leading dot, no extension), and must be placed in the same directory as the executable (pdftotext.exe, xpdf.exe, etc.)

    The xpdfrc file consists of a series of configuration options, one per line. Blank lines and lines starting with a '#' (comments) are ignored.

    Arguments can be single-quoted or double-quoted, e.g., for file names that contain spaces ("aaa bbb", 'aaa bbb'). This quoting does not provide any escaping, so there's no way to include a double quote in a double-quoted argument or a single quote in a single-quoted argument.

    Arguments can also be at-quoted: @"aaa bbb". At-quoted strings allow use of the DATADIR variable, which is set to the 'data' subdirectory in the xpdf install directory. The percent sign (%) is an escape character: a percent sign followed by any other character is replaced with that character.

        @"abc %"def%" ghi"     --> abc "def" ghi
        @"${DATADIR}/foo"      --> ...install-dir.../data/foo
        @"%${DATADIR}/foo"     --> ${DATADIR}/foo

    The following sections list all of the configuration options, sorted into functional groups. There is an examples section at the end.

INCLUDE FILES
    include config-file
        Includes the specified config file. The effect of this is equivalent to inserting the contents of config-file directly into the parent config file in place of the include command. Config files can be nested arbitrarily deeply.

GENERAL FONT CONFIGURATION
    fontFile PDF-font-name font-file
        Maps a PDF font, PDF-font-name, to a font for display or PostScript output.
        The font file, font-file, can be any type allowed in a PDF file. This command can be used for 8-bit or 16-bit (CID) fonts.

    fontDir dir
        Specifies a search directory for font files. There can be multiple fontDir commands; all of the specified directories will be searched in order. The font files can be Type 1 (.pfa or .pfb) or TrueType (.ttf or .ttc); other files in the directory will be ignored. The font file name (not including the extension) must exactly match the PDF font name. This search is performed if the font name doesn't match any of the fonts declared with the fontFile command. There are no default fontDir directories.

    fontFileCC registry-ordering font-file
        Maps the registry-ordering character collection to a font for display or PostScript output. This mapping is used if the font name doesn't match any of the fonts declared with the fontFile, fontDir, psResidentFont16, or psResidentFontCC commands.

POSTSCRIPT FONT CONFIGURATION
    psFontPassthrough yes | no
        If set to "yes", pass 8-bit font names through to the PostScript output without substitution. Fonts which are not embedded in the PDF file are expected to be available on the printer. This defaults to "no".

    psResidentFont PDF-font-name PS-font-name
        When the 8-bit font PDF-font-name is used (without embedding) in a PDF file, it will be translated to the PostScript font PS-font-name, which is assumed to be resident in the printer. Typically, PDF-font-name and PS-font-name are the same. By default, only the Base-14 fonts are assumed to be resident.

    psResidentFont16 PDF-font-name wMode PS-font-name encoding
        When the 16-bit (CID) font PDF-font-name with writing mode wMode is used (without embedding) in a PDF file, it will be translated to the PostScript font PS-font-name, which is assumed to be resident in the printer. The writing mode must be either 'H' for horizontal or 'V' for vertical. The resident font is assumed to use the specified encoding (which must have been defined with the unicodeMap command).

    psResidentFontCC registry-ordering wMode PS-font-name encoding
        When a 16-bit (CID) font using the registry-ordering character collection and wMode writing mode is used (without embedding) in a PDF file, the PostScript font, PS-font-name, is substituted for it. The substituted font is assumed to be resident in the printer. The writing mode must be either 'H' for horizontal or 'V' for vertical. The resident font is assumed to use the specified encoding (which must have been defined with the unicodeMap command).

    psEmbedType1Fonts yes | no
        If set to "no", prevents embedding of Type 1 fonts in generated PostScript. This defaults to "yes".

    psEmbedTrueTypeFonts yes | no
        If set to "no", prevents embedding of TrueType fonts in generated PostScript. This defaults to "yes".

    psEmbedCIDTrueTypeFonts yes | no
        If set to "no", prevents embedding of CID TrueType fonts in generated PostScript. For Level 3 PostScript, this generates a CID font, for lower levels it generates a non-CID composite font. This defaults to "yes".

    psEmbedCIDPostScriptFonts yes | no
        If set to "no", prevents embedding of CID PostScript fonts in generated PostScript. For Level 3 PostScript, this generates a CID font, for lower levels it generates a non-CID composite font. This defaults to "yes".

POSTSCRIPT CONTROL
    psPaperSize width(pts) height(pts)
        Sets the paper size for PostScript output. The width and height parameters give the paper size in PostScript points (1 point = 1/72 inch).

    psPaperSize letter | legal | A4 | A3 | match
        Sets the paper size for PostScript output to a standard size. The default paper size is set when xpdf and pdftops are built, typically to "letter" or "A4". This can also be set to "match", which will set the paper size to match the size specified in the PDF file.

    psImageableArea llx lly urx ury
        Sets the imageable area for PostScript output.
        The four integers are the coordinates of the lower-left and upper-right corners of the imageable region, specified in points (with the origin being the lower-left corner of the paper). This defaults to the full paper size; the psPaperSize option will reset the imageable area coordinates.

    psCrop yes | no
        If set to "yes", PostScript output is cropped to the CropBox specified in the PDF file; otherwise no cropping is done. This defaults to "yes".

    psUseCropBoxAsPage yes | no
        If set to "yes", PostScript output treats the CropBox as the page size. By default, this is "no", and the MediaBox is used as the page size.

    psExpandSmaller yes | no
        If set to "yes", PDF pages smaller than the PostScript imageable area are expanded to fill the imageable area. Otherwise, no scaling is done on smaller pages. This defaults to "no".

    psShrinkLarger yes | no
        If set to "yes", PDF pages larger than the PostScript imageable area are shrunk to fit the imageable area. Otherwise, no scaling is done on larger pages. This defaults to "yes".

    psCenter yes | no
        If set to "yes", PDF pages smaller than the PostScript imageable area (after any scaling) are centered in the imageable area. Otherwise, they are aligned at the lower-left corner of the imageable area. This defaults to "yes".

    psDuplex yes | no
        If set to "yes", the generated PostScript will set the "Duplex" pagedevice entry. This tells duplex-capable printers to enable duplexing. This defaults to "no".

    psLevel level1 | level1sep | level2 | level2gray | level2sep | level3 | level3gray | level3Sep
        Sets the PostScript level to generate. This defaults to "level2".

    psPreload yes | no
        If set to "yes", PDF forms are converted to PS procedures, and image data is preloaded. This uses more memory in the PostScript interpreter, but generates significantly smaller PS files in situations where, e.g., the same image is drawn on every page of a long document. This defaults to "no".

    psOPI yes | no
        If set to "yes", generates PostScript OPI comments for all images and forms which have OPI information. This option is only available if the Xpdf tools were compiled with OPI support. This defaults to "no".

    psASCIIHex yes | no
        If set to "yes", the ASCIIHexEncode filter will be used instead of ASCII85Encode for binary data. This defaults to "no".

    psLZW yes | no
        If set to "yes", the LZWEncode filter will be used for lossless compression in PostScript output; if set to "no", the RunLengthEncode filter will be used instead. LZW generates better compression (smaller PS files), but may not be supported by some printers. This defaults to "yes".

    psUncompressPreloadedImages yes | no
        If set to "yes", all preloaded images in PS files will be uncompressed. If set to "no", the original compressed images will be used when possible. The "yes" setting is useful to work around certain buggy PostScript interpreters. This defaults to "no".

    psMinLineWidth float
        Set the minimum line width, in points, for PostScript output. The default value is 0 (no minimum).

    psRasterResolution float
        Set the resolution (in dpi) for rasterized pages in PostScript output. (Pdftops will rasterize pages which use transparency.) This defaults to 300.

    psRasterMono yes | no
        If set to "yes", rasterized pages in PS files will be monochrome (8-bit gray) instead of color. This defaults to "no".

    psRasterSliceSize pixels
        When rasterizing pages, pdftops splits the page into horizontal "slices", to limit memory usage. This option sets the maximum slice size, in pixels. This defaults to 20000000 (20 million).

    psAlwaysRasterize yes | no
        If set to "yes", all PostScript output will be rasterized. This defaults to "no".

    psNeverRasterize yes | no
        Pdftops rasterizes any pages that use transparency (because PostScript doesn't support transparency). If psNeverRasterize is set to "yes", rasterization is disabled: pages will never be rasterized, even if they contain transparency.
        This will likely result in incorrect output for PDF files that use transparency, and a warning message to that effect will be printed. This defaults to "no".

    fontDir dir
        See the description above, in the GENERAL FONT CONFIGURATION section.

TEXT CONTROL AND CHARACTER MAPPING
    textEncoding encoding-name
        Sets the encoding to use for text output. (This can be overridden with the "-enc" switch on the command line.) The encoding-name must be defined with the unicodeMap command (see below). This defaults to "Latin1".

    textEOL unix | dos | mac
        Sets the end-of-line convention to use for text output. The options are:

            unix = LF
            dos  = CR+LF
            mac  = CR

        (This can be overridden with the "-eol" switch on the command line.) The default value is based on the OS where xpdf and pdftotext were built.

    textPageBreaks yes | no
        If set to "yes", text extraction will insert page breaks (form feed characters) between pages. This defaults to "yes".

    textKeepTinyChars yes | no
        If set to "yes", text extraction will keep all characters. If set to "no", text extraction will discard tiny (smaller than 3 point) characters after the first 50000 per page, avoiding extremely slow run times for PDF files that use special fonts to do shading or cross-hatching. This defaults to "yes".

    nameToUnicode map-file
        Specifies a file with the mapping from character names to Unicode. This is used to handle PDF fonts that have valid encodings but no ToUnicode entry. Each line of a nameToUnicode file looks like this:

            hex-string name

        The hex-string is the Unicode (UCS-2) character index, and name is the corresponding character name. Multiple nameToUnicode files can be used; if a character name is given more than once, the code in the last specified file is used. There is a built-in default nameToUnicode table with all of Adobe's standard character names.

    cidToUnicode registry-ordering map-file
        Specifies the file with the mapping from character collection to Unicode.
        Each line of a cidToUnicode file represents one character:

            hex-string

        The hex-string is the Unicode (UCS-2) index for that character. The first line maps CID 0, the second line CID 1, etc. File size is determined by size of the character collection. Only one file is allowed per character collection; the last specified file is used. There are no built-in cidToUnicode mappings.

    unicodeToUnicode font-name-substring map-file
        This is used to work around PDF fonts which have incorrect Unicode information. It specifies a file which maps from the given (incorrect) Unicode indexes to the correct ones. The mapping will be used for any font whose name contains font-name-substring. Each line of a unicodeToUnicode file represents one Unicode character:

            in-hex out-hex1 out-hex2 ...

        The in-hex field is an input (incorrect) Unicode index, and the rest of the fields are one or more output (correct) Unicode indexes. Each occurrence of in-hex will be converted to the specified output sequence.

    unicodeRemapping remap-file
        Remap Unicode characters when doing text extraction. This specifies a file that maps from a particular Unicode index to zero or more replacement Unicode indexes. Each line of the remap file represents one Unicode character:

            in-hex out-hex1 out-hex2 ...

        Any Unicode characters not listed will be left unchanged. This function is typically used to remap things like non-breaking spaces, soft hyphens, ligatures, etc.

    unicodeMap encoding-name map-file
        Specifies the file with mapping from Unicode to encoding-name. These encodings are used for text output (see below). Each line of a unicodeMap file represents a range of one or more Unicode characters which maps linearly to a range in the output encoding:

            in-start-hex in-end-hex out-start-hex

        Entries for single characters can be abbreviated to:

            in-hex out-hex

        The in-start-hex and in-end-hex fields (or the single in-hex field) specify the Unicode range.
        The out-start-hex field (or the out-hex field) specifies the start of the output encoding range. The length of the out-start-hex (or out-hex) string determines the length of the output characters (e.g., UTF-8 uses different numbers of bytes to represent characters in different ranges). Entries must be given in increasing Unicode order. Only one file is allowed per encoding; the last specified file is used. The Latin1, ASCII7, Symbol, ZapfDingbats, UTF-8, and UCS-2 encodings are predefined.

    cMapDir registry-ordering dir
        Specifies a search directory, dir, for CMaps for the registry-ordering character collection. There can be multiple directories for a particular collection. There are no default CMap directories.

    toUnicodeDir dir
        Specifies a search directory, dir, for ToUnicode CMaps. There can be multiple ToUnicode directories. There are no default ToUnicode directories.

    mapNumericCharNames yes | no
        If set to "yes", the Xpdf tools will attempt to map various numeric character names sometimes used in font subsets. In some cases this leads to usable text, and in other cases it leads to gibberish -- there is no way for Xpdf to tell. This defaults to "yes".

    mapUnknownCharNames yes | no
        If set to "yes", and mapNumericCharNames is set to "no", the Xpdf tools will apply a simple pass-through mapping (Unicode index = character code) for all unrecognized glyph names. (For CID fonts, setting mapNumericCharNames to "no" is unnecessary.) In some cases, this leads to usable text, and in other cases it leads to gibberish -- there is no way for Xpdf to tell. This defaults to "no".

    mapExtTrueTypeFontsViaUnicode yes | no
        When rasterizing text using an external TrueType font, there are two options for handling character codes. If mapExtTrueTypeFontsViaUnicode is set to "yes", Xpdf will use the font encoding/ToUnicode info to map character codes to Unicode, and then use the font's Unicode cmap to map Unicode to GIDs.
        If mapExtTrueTypeFontsViaUnicode is set to "no", Xpdf will assume the character codes are GIDs (i.e., use an identity mapping). This defaults to "yes".

    useTrueTypeUnicodeMapping yes | no
        If set to "yes", the Xpdf tools will use the Unicode encoding information in TrueType fonts (16-bit only), if available, to override the PDF ToUnicode maps. Otherwise, the ToUnicode maps are always used when present. This defaults to "no".

    dropFont font-name
        Drop all text drawn in the specified font. To drop text drawn in unnamed fonts, use:

            dropFont ""

        There can be any number of dropFont commands.

RASTERIZER SETTINGS
    enableFreeType yes | no
        Enables or disables use of FreeType (a TrueType / Type 1 font rasterizer). This is only relevant if the Xpdf tools were built with FreeType support. ("enableFreeType" replaces the old "freetypeControl" option.) This option defaults to "yes".

    disableFreeTypeHinting yes | no
        If this is set to "yes", FreeType hinting will be forced off. This option defaults to "no".

    antialias yes | no
        Enables or disables font anti-aliasing in the PDF rasterizer. This option affects all font rasterizers. ("antialias" replaces the anti-aliasing control provided by the old "t1libControl" and "freetypeControl" options.) This defaults to "yes".

    vectorAntialias yes | no
        Enables or disables anti-aliasing of vector graphics in the PDF rasterizer. This defaults to "yes".

    imageMaskAntialias yes | no
        Enables or disables anti-aliasing of image masks (when downsampling or upsampling) in the PDF rasterizer. This defaults to "yes".

    antialiasPrinting yes | no
        If this is "yes", bitmaps sent to the printer will be antialiased (according to the "antialias" and "vectorAntialias" settings). If this is "no", printed bitmaps will not be antialiased. This defaults to "no".

    strokeAdjust yes | no | cad
        Sets the stroke adjustment mode. If set to "no", no stroke adjustment will be done.
        If set to "yes", normal stroke adjustment will be done: horizontal and vertical lines will be moved by up to half a pixel to make them look cleaner when vector anti-aliasing is enabled. If set to "cad", a slightly different stroke adjustment algorithm will be used to ensure that lines of the same original width will always have the same adjusted width (at the expense of allowing gaps and overlaps between adjacent lines). This defaults to "yes".

    forceAccurateTiling yes | no
        If this is set to "yes", the TilingType is forced to 2 (no distortion) for all tiling patterns, regardless of the setting in the pattern dictionary. This defaults to "no".

    screenType dispersed | clustered | stochasticClustered
        Sets the halftone screen type, which will be used when generating a monochrome (1-bit) bitmap. The three options are dispersed-dot dithering, clustered-dot dithering (with a round dot and 45-degree screen angle), and stochastic clustered-dot dithering. By default, "stochasticClustered" is used for resolutions of 300 dpi and higher, and "dispersed" is used for resolutions lower than 300 dpi.

    screenSize integer
        Sets the size of the (square) halftone screen threshold matrix. By default, this is 4 for dispersed-dot dithering, 10 for clustered-dot dithering, and 100 for stochastic clustered-dot dithering.

    screenDotRadius integer
        Sets the halftone screen dot radius. This is only used when screenType is set to stochasticClustered, and it defaults to 2. In clustered-dot mode, the dot radius is half of the screen size. Dispersed-dot dithering doesn't have a dot radius.

    screenGamma float
        Sets the halftone screen gamma correction parameter. Gamma values greater than 1 make the output brighter; gamma values less than 1 make it darker. The default value is 1.

    screenBlackThreshold float
        When halftoning, all values below this threshold are forced to solid black. This parameter is a floating point value between 0 (black) and 1 (white). The default value is 0.

    screenWhiteThreshold float
        When halftoning, all values above this threshold are forced to solid white. This parameter is a floating point value between 0 (black) and 1 (white). The default value is 1.

    minLineWidth float
        Set the minimum line width, in device pixels. This affects the rasterizer only, not the PostScript converter (except when it uses rasterization to handle transparency). The default value is 0 (no minimum).

    enablePathSimplification yes | no
        If set to "yes", simplify paths by removing points where it won't make a significant difference to the shape. The default value is "no".

    overprintPreview yes | no
        If set to "yes", generate overprint preview output, honoring the OP/op/OPM settings in the PDF file. Ignored for non-CMYK output. The default value is "no".

VIEWER SETTINGS
    These settings only apply to the Xpdf GUI PDF viewer.

    initialZoom percentage | page | width
        Sets the initial zoom factor. A number specifies a zoom percentage, where 100 means 72 dpi. You may also specify 'page', to fit the page to the window size, or 'width', to fit the page width to the window width.

    defaultFitZoom percentage
        If xpdf is started with fit-page or fit-width zoom and no window geometry, it will calculate a desired window size based on the PDF page size and this defaultFitZoom value. I.e., the window size will be chosen such that exactly one page will fit in the window at this zoom factor (which must be a percentage). The default value is based on the screen resolution.

    initialDisplayMode single | continuous | sideBySideSingle | sideBySideContinuous | horizontalContinuous
        Sets the initial display mode. The default setting is "continuous".

    initialToolbarState yes | no
        If set to "yes", xpdf opens with the toolbar visible. If set to "no", xpdf opens with the toolbar hidden. The default is "yes".

    initialSidebarState yes | no
        If set to "yes", xpdf opens with the sidebar (tabs, outline, etc.) visible. If set to "no", xpdf opens with the sidebar collapsed.
        The default is "yes".

    initialSidebarWidth width
        Sets the initial sidebar width, in pixels. This is only relevant if initialSidebarState is "yes". The default value is zero, which tells xpdf to use an internal default size.

    initialSelectMode block | linear
        Sets the initial selection mode. The default setting is "linear".

    paperColor color
        Set the "paper color", i.e., the background of the page display. The color can be #RRGGBB (hexadecimal) or a named color. This option will not work well with PDF files that do things like filling in white behind the text.

    matteColor color
        Set the matte color, i.e., the color used for background outside the actual page area. The color can be #RRGGBB (hexadecimal) or a named color.

    fullScreenMatteColor color
        Set the matte color for full-screen mode. The color can be #RRGGBB (hexadecimal) or a named color.

    selectionColor color
        Set the selection color. The color can be #RRGGBB (hexadecimal) or a named color.

    reverseVideoInvertImages yes | no
        If set to "no", xpdf's reverse-video mode inverts text and vector graphic content, but not images. If set to "yes", xpdf inverts images as well. The default is "no".

    popupMenuCmd title command ...
        Add a command to the popup menu. Title is the text to be displayed in the menu. Command is an Xpdf command (see the COMMANDS section of the xpdf(1) man page for details). Multiple commands are separated by whitespace.

    maxTileWidth pixels
        Set the maximum width of tiles to be used by xpdf when rasterizing pages. This defaults to 1500.

    maxTileHeight pixels
        Set the maximum height of tiles to be used by xpdf when rasterizing pages. This defaults to 1500.

    tileCacheSize tiles
        Set the maximum number of tiles to be cached by xpdf when rasterizing pages. This defaults to 10.

    workerThreads numThreads
        Set the number of worker threads to be used by xpdf when rasterizing pages. This defaults to 1.

    launchCommand command
        Sets the command executed when you click on a "launch"-type link.
        The intent is for the command to be a program/script which determines the file type and runs the appropriate viewer. The command line will consist of the file to be launched, followed by any parameters specified with the link. Do not use "%s" in "command". By default, this is unset, and Xpdf will simply try to execute the file (after prompting the user).

    movieCommand command
        Sets the command executed when you click on a movie annotation. The string "%s" will be replaced with the movie file name. This has no default value.

    defaultPrinter printer
        Sets the default printer used in the viewer's print dialog.

    bind modifiers-key context command ...
        Add a key or mouse button binding. Modifiers can be zero or more of:

            shift-
            ctrl-
            alt-

        Key can be a regular ASCII character, or any one of:

            space
            tab
            return
            enter
            backspace
            esc
            insert
            delete
            home
            end
            pgup
            pgdn
            left / right / up / down (arrow keys)
            f1 .. f35 (function keys)
            mousePress1 .. mousePress7 (mouse buttons)
            mouseRelease1 .. mouseRelease7 (mouse buttons)
            mouseClick1 .. mouseClick7 (mouse buttons)
            mouseDoubleClick1 .. mouseDoubleClick7 (mouse buttons)
            mouseTripleClick1 .. mouseTripleClick7 (mouse buttons)

        Context is either "any" or a comma-separated combination of:

            fullScreen / window (full screen mode on/off)
            continuous / singlePage (continuous mode on/off)
            overLink / offLink (mouse over link or not)
            scrLockOn / scrLockOff (scroll lock on/off)

        The context string can include only one of each pair in the above list.

        Command is an Xpdf command (see the COMMANDS section of the xpdf(1) man page for details). Multiple commands are separated by whitespace.

        The bind command replaces any existing binding, but only if it was defined for the exact same modifiers, key, and context. All tokens (modifiers, key, context, commands) are case-sensitive.

        Example key bindings:

            # bind ctrl-a in any context to the nextPage
            # command
            bind ctrl-a any nextPage

            # bind uppercase B, when in continuous mode
            # with scroll lock on, to the reload command
            # followed by the prevPage command
            bind B continuous,scrLockOn reload prevPage

        See the xpdf(1) man page for more examples.

    unbind modifiers-key context
        Removes a key binding established with the bind command. This is most useful to remove default key bindings before establishing new ones (e.g., if the default key binding is given for "any" context, and you want to create new key bindings for multiple contexts).

    tabStateFile path
        Sets the file used by the loadTabState and saveTabState commands (see the xpdf(1) man page for more information).

MISCELLANEOUS SETTINGS
    drawAnnotations yes | no
        If set to "no", annotations will not be drawn or printed. The default value is "yes".

    drawFormFields yes | no
        If set to "no", form fields will not be drawn or printed. The default value is "yes".

    enableXFA yes | no
        If an XFA form is present, and this option is set to "yes", Xpdf will parse the XFA form and use certain XFA information to override AcroForm information. If set to "no", the XFA form will not be read. The default value is "yes".

    savePageNumbers yes | no
        If set to "yes", xpdf will save the current page numbers of all open files in ~/.xpdf.pages when the files are closed (or when quitting xpdf). Next time the file is opened, the last-viewed page number will be restored. The default value is "yes".

    printCommands yes | no
        If set to "yes", drawing commands are printed as they're executed (useful for debugging). This defaults to "no".

    printStatusInfo yes | no
        If set to "yes", print a status message (to stdout) before each page is processed. This defaults to "no".

    errQuiet yes | no
        If set to "yes", this suppresses all error and warning messages from all of the Xpdf tools. This defaults to "no".

EXAMPLES
    The following is a sample xpdfrc file.

        # from the Thai support package
        nameToUnicode /usr/local/share/xpdf/Thai.nameToUnicode

        # from the Japanese support package
        cidToUnicode Adobe-Japan1 /usr/local/share/xpdf/Adobe-Japan1.cidToUnicode
        unicodeMap JISX0208 /usr/local/share/xpdf/JISX0208.unicodeMap
        cMapDir Adobe-Japan1 /usr/local/share/xpdf/cmap/Adobe-Japan1

        # use the Base-14 Type 1 fonts from ghostscript
        fontFile Times-Roman /usr/local/share/ghostscript/fonts/n021003l.pfb
        fontFile Times-Italic /usr/local/share/ghostscript/fonts/n021023l.pfb
        fontFile Times-Bold /usr/local/share/ghostscript/fonts/n021004l.pfb
        fontFile Times-BoldItalic /usr/local/share/ghostscript/fonts/n021024l.pfb
        fontFile Helvetica /usr/local/share/ghostscript/fonts/n019003l.pfb
        fontFile Helvetica-Oblique /usr/local/share/ghostscript/fonts/n019023l.pfb
        fontFile Helvetica-Bold /usr/local/share/ghostscript/fonts/n019004l.pfb
        fontFile Helvetica-BoldOblique /usr/local/share/ghostscript/fonts/n019024l.pfb
        fontFile Courier /usr/local/share/ghostscript/fonts/n022003l.pfb
        fontFile Courier-Oblique /usr/local/share/ghostscript/fonts/n022023l.pfb
        fontFile Courier-Bold /usr/local/share/ghostscript/fonts/n022004l.pfb
        fontFile Courier-BoldOblique /usr/local/share/ghostscript/fonts/n022024l.pfb
        fontFile Symbol /usr/local/share/ghostscript/fonts/s050000l.pfb
        fontFile ZapfDingbats /usr/local/share/ghostscript/fonts/d050000l.pfb

        # use the Bakoma Type 1 fonts
        # (this assumes they happen to be installed in /usr/local/fonts/bakoma)
        fontDir /usr/local/fonts/bakoma

        # set some PostScript options
        psPaperSize letter
        psDuplex no
        psLevel level2
        psEmbedType1Fonts yes
        psEmbedTrueTypeFonts yes

        # assume that the PostScript printer has the Univers and
        # Univers-Bold fonts
        psResidentFont Univers Univers
        psResidentFont Univers-Bold Univers-Bold

        # set the text output options
        textEncoding UTF-8
        textEOL unix

        # misc options
        enableFreeType yes
        launchCommand viewer-script

FILES
    /etc/xpdfrc
        This is the default location for the system-wide configuration file. Depending on build options, it may be placed elsewhere.

    $HOME/.xpdfrc
        This is the user's configuration file. If it exists, it will be read in place of the system-wide file.

AUTHOR
    The Xpdf software and documentation are copyright 1996-2022 Glyph & Cog, LLC.

SEE ALSO
    xpdf(1), pdftops(1), pdftotext(1), pdftohtml(1), pdfinfo(1), pdffonts(1), pdfdetach(1), pdftoppm(1), pdftopng(1), pdfimages(1)
    http://www.xpdfreader.com/

18 Apr 2022    xpdfrc(5)
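The at-quoting rules described in the xpdfrc DESCRIPTION section (percent sign as escape character, ${DATADIR} expansion) can be illustrated with a small sketch. This is not Xpdf's own parser: ExpandAtQuoted is a hypothetical function that takes the text between @" and the closing quote, the DataDir argument is an assumed install path, and the handling of a trailing lone percent sign is an assumption because the man page does not cover that case.

```delphi
program AtQuoteDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: expands the body of an at-quoted xpdfrc argument.
// Body is the text between @" and the closing quote; DataDir stands in
// for the 'data' subdirectory of the xpdf install directory.
function ExpandAtQuoted(const Body, DataDir: string): string;
const
  DataDirVar = '${DATADIR}';
var
  I: Integer;
begin
  Result := '';
  I := 1;
  while I <= Length(Body) do
  begin
    if (Body[I] = '%') and (I < Length(Body)) then
    begin
      // Escape: keep the next character literally, drop the percent sign.
      Result := Result + Body[I + 1];
      Inc(I, 2);
    end
    else if Copy(Body, I, Length(DataDirVar)) = DataDirVar then
    begin
      // Unescaped variable reference: substitute the data directory.
      Result := Result + DataDir;
      Inc(I, Length(DataDirVar));
    end
    else
    begin
      // Ordinary character (assumption: a trailing lone '%' is kept as-is).
      Result := Result + Body[I];
      Inc(I);
    end;
  end;
end;

begin
  // The three inputs are the examples from the man page.
  Writeln(ExpandAtQuoted('abc %"def%" ghi', 'C:\xpdf\data'));
  Writeln(ExpandAtQuoted('${DATADIR}/foo', 'C:\xpdf\data'));
  Writeln(ExpandAtQuoted('%${DATADIR}/foo', 'C:\xpdf\data'));
end.
```

The escape check comes before the variable check so that `%$` yields a literal dollar sign and the following `{DATADIR}` is left untouched, which matches the third example in the man page.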
Current Delphi project:
pPDFPowerToolsConsoleApp.dpr:
program pPDFPowerToolsConsoleApp;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  uFileSystemUtilities,
  uErrorHandling,
  uProcessExecution,
  System.Classes,
  Winapi.Windows,
  uXPDFWrapper;

type
  TUTF8NoBOMEncoding = class(TUTF8Encoding)
  public
    function GetPreamble: TBytes; override;
  end;

function TUTF8NoBOMEncoding.GetPreamble: TBytes;
begin
  // Overridden to return no BOM
  Result := [];
end;

procedure RunCoreTests;
var
  Output, ErrorOutput: string;
begin
  // Existing tests (Test 1-4 remain the same)
  // ...
end;

procedure RunXPDFWrapperTests;
var
  PDFPath: string;
  TextContent, Version: string;
  Metadata: TStringList;
  ConsoleStream: THandleStream;
  Encoding: TUTF8NoBOMEncoding; // use the new class
begin
  Writeln('Testing XPDF Wrapper functionalities...');

  // Test 1: Text Extraction
  PDFPath := 'sample.pdf'; // Replace with a valid PDF
  if not FileExists(PDFPath) then
  begin
    LogError('Test file not found: ' + PDFPath);
    Exit;
  end;

  if TXPDFWrapper.ExtractTextFromPDF(PDFPath, TextContent) then
  begin
    Writeln('Text extraction succeeded.');
    Writeln('First 100 chars: ' + Copy(TextContent, 1, 100));
  end
  else
    Writeln('Text extraction failed.');

  // Test 2: Version Check
  if TXPDFWrapper.GetPDFVersion(PDFPath, Version) then
    Writeln('PDF Version: ' + Version)
  else
    Writeln('Version check failed.');

  // Test 3: Metadata Extraction
  // All three objects are released in one finally block, so nothing leaks
  // when extraction fails (the encoding was previously freed only on success).
  Metadata := nil;
  ConsoleStream := nil;
  Encoding := TUTF8NoBOMEncoding.Create; // no parameters
  try
    // Create a THandleStream pointing at the standard output (STDOUT) handle
    ConsoleStream := THandleStream.Create(GetStdHandle(STD_OUTPUT_HANDLE));
    Metadata := TStringList.Create;
    if TXPDFWrapper.ExtractMetadata(PDFPath, Metadata) then
    begin
      Writeln('Metadata:');
      Metadata.SaveToStream(ConsoleStream, Encoding);
    end
    else
      Writeln('Metadata extraction failed.');
  finally
    Metadata.Free;
    ConsoleStream.Free;
    Encoding.Free;
  end;
end;

// Add to pPDFPowerToolsConsoleApp.dpr
procedure SetConsoleToUnicodeFont;
var
  ConsoleHandle: THandle;
  FontInfo: CONSOLE_FONT_INFOEX;
begin
  ConsoleHandle := GetStdHandle(STD_OUTPUT_HANDLE);
  ZeroMemory(@FontInfo, SizeOf(FontInfo));
  FontInfo.cbSize := SizeOf(FontInfo);
  FontInfo.nFont := 0;
  FontInfo.dwFontSize.X := 0;
  FontInfo.dwFontSize.Y := 16;
  FontInfo.FontFamily := FF_DONTCARE;
  FontInfo.FontWeight := FW_NORMAL;
  // StringCchCopy is an inline helper from strsafe.h and is not exported by
  // any DLL, so the face name is copied with SysUtils.StrPLCopy instead.
  StrPLCopy(FontInfo.FaceName, 'Lucida Console', LF_FACESIZE - 1);
  SetCurrentConsoleFontEx(ConsoleHandle, False, FontInfo);
end;

begin
  // Set console to UTF-8 mode (Windows-specific)
  SetConsoleOutputCP(CP_UTF8);
  SetConsoleCP(CP_UTF8);
  try
    Writeln('Initializing PDF Power Tools...');
    TDependencyManager.Initialize;
    if TDependencyManager.CheckDependencies then
      Writeln('All dependencies are present.')
    else
      Writeln('Dependency checks failed. See log for details.');

    // Run core process execution tests
    RunCoreTests;

    // Run new XPDF Wrapper tests
    RunXPDFWrapperTests;

    Writeln('All tests completed. Press Enter to exit.');
    ReadLn;
  except
    on E: Exception do
    begin
      LogCritical('Unhandled exception: ' + E.Message);
      Writeln('Critical error: ' + E.Message);
      ReadLn;
    end;
  end;
end.
uErrorHandling.pas:
unit uErrorHandling;

interface

uses
  System.SysUtils, System.Classes, System.Generics.Collections, Winapi.Windows;

type
  // Log Level Enum
  TLogLevel = (
    llDebug,    // Very detailed information, useful for debugging
    llInfo,     // General information about application flow
    llWarning,  // Potential issues, non-critical errors
    llError,    // Errors that prevent normal operation of a feature
    llCritical, // Critical errors that may cause application instability or data loss
    llNone      // No logging
  );

  // Log message event type
  TLogMessageEvent = procedure(Sender: TObject; const Message: string;
    LogLevel: TLogLevel) of object;

  // Base Exception Class
  EPDFPowerToolsException = class(Exception)
  private
    FErrorCode: Integer;
    FErrorTime: TDateTime;
    FSourceModule: string;
    FInnerException: Exception;
    FContextInfo: string; // General context info, e.g. command line, file path, etc.
  protected
    function GetFormattedMessage: string; virtual;
  public
    constructor Create(const AMessage: string); overload;
    constructor Create(const AMessage: string; ErrorCode: Integer); overload;
    constructor Create(const AMessage: string; ErrorCode: Integer;
      const SourceModule: string); overload;
    constructor Create(const AMessage: string; ErrorCode: Integer;
      const SourceModule: string; InnerException: Exception); overload;
    constructor CreateFmt(const Format: string; const Args: array of const); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      ErrorCode: Integer); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      ErrorCode: Integer; const SourceModule: string); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      ErrorCode: Integer; const SourceModule: string;
      InnerException: Exception); overload;
    property ErrorCode: Integer read FErrorCode;
    property ErrorTime: TDateTime read FErrorTime;
    property SourceModule: string read FSourceModule;
    property InnerException: Exception read FInnerException;
    property ContextInfo: string read FContextInfo write FContextInfo;
    function GetFullErrorMessage: string; virtual;
    destructor Destroy; override;
  end;

  // Specific Exception Types (categorized)
  EPDFFileException = class(EPDFPowerToolsException);          // File related errors (not found, access denied, etc.)
  EPDFProcessException = class(EPDFPowerToolsException);       // Process execution errors (timeout, command failure)
  EPDFFormatException = class(EPDFPowerToolsException);        // PDF format errors (invalid PDF, corrupted file)
  EPDFConfigurationException = class(EPDFPowerToolsException); // Configuration file errors, missing settings
  EPDFAPIUsageException = class(EPDFPowerToolsException);      // Errors due to incorrect API usage by the developer
  EPDFDependencyException = class(EPDFPowerToolsException);    // Missing or incompatible dependencies (DLLs, EXEs)
  EPDFSecurityException = class(EPDFPowerToolsException);      // PDF security/encryption related errors

  // Interface for Log Targets (for extensibility)
  ILogTarget = interface
    ['{B62F1DC5-338A-40D0-B781-0383F39DF764}']
    procedure LogMessage(const Message: string; LogLevel: TLogLevel);
  end;

  // Console Log Target
  TConsoleLogTarget = class(TInterfacedObject, ILogTarget)
  private
    FUseStdErr: Boolean;
  public
    constructor Create(UseStdErr: Boolean = False);
    procedure LogMessage(const Message: string; LogLevel: TLogLevel); reintroduce; virtual;
    property UseStdErr: Boolean read FUseStdErr write FUseStdErr;
  end;

  // File Log Target
  TFileLogTarget = class(TInterfacedObject, ILogTarget)
  private
    FLogFilePath: string;
    FIsLogFileOpen: Boolean;
    FLogFile: TextFile;
    procedure EnsureLogFileOpen;
    procedure CloseLogFile;
  protected
    procedure WriteToLogFile(const Message: string);
  public
    constructor Create(const LogFilePath: string);
    destructor Destroy; override;
    procedure LogMessage(const Message: string; LogLevel: TLogLevel); reintroduce; virtual;
    property LogFilePath: string read FLogFilePath write FLogFilePath;
  end;

  // Event Log Target (Windows specific)
  {$IFDEF MSWINDOWS}
  TEventLogTarget = class(TInterfacedObject, ILogTarget)
  private
    FEventSourceName: string;
  protected
    function GetEventType(LogLevel: TLogLevel): Word; inline;
  public
    constructor Create(const EventSourceName: string);
    procedure LogMessage(const Message: string; LogLevel: TLogLevel); reintroduce; virtual;
    property EventSourceName: string read FEventSourceName write FEventSourceName;
  end;
  {$ENDIF}

  // Centralized Error Logger (Singleton)
  TErrorLogger = class
  private
    class var
      FInstance: TErrorLogger;
    var
      FLogLevel: TLogLevel;
      FLogTargets: TList<ILogTarget>;
      FOnLogMessage: TLogMessageEvent;
    class function GetLogLevel: TLogLevel; static; inline;
    class procedure SetLogLevel(const Value: TLogLevel); static; inline;
    class function GetOnLogMessage: TLogMessageEvent; static; inline;
    class procedure SetOnLogMessage(const Value: TLogMessageEvent); static; inline;
    constructor Create;
    destructor Destroy; override;
  protected
    procedure InternalLog(const Message: string; LogLevel: TLogLevel);
  public
    class function GetInstance: TErrorLogger; inline;
    class procedure Initialize;
    class procedure FinalizeLogger;
    class property LogLevel: TLogLevel read GetLogLevel write SetLogLevel;
    class property OnLogMessage: TLogMessageEvent read GetOnLogMessage write SetOnLogMessage;
    procedure AddLogTarget(LogTarget: ILogTarget); inline;
    procedure ClearLogTargets; inline;
    procedure Debug(const Message: string); overload; inline;
    procedure DebugFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
    procedure Info(const Message: string); overload; inline;
    procedure InfoFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
    procedure Warning(const Message: string); overload; inline;
    procedure WarningFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
    procedure Error(const Message: string); overload; inline;
    procedure ErrorFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
    procedure Critical(const Message: string); overload; inline;
    procedure CriticalFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
  end;

// Helper function to convert LogLevel to String
function LogLevelToString(LogLevel: TLogLevel): string; inline;

// Global Logging Procedures
procedure LogDebug(const Message: string); overload; inline;
procedure LogDebugFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
procedure LogInfo(const Message: string); overload; inline;
procedure LogInfoFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
procedure LogWarning(const Message: string); overload; inline;
procedure LogWarningFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
procedure LogError(const Message: string); overload; inline;
procedure LogErrorFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;
procedure LogCritical(const Message: string); overload; inline;
procedure LogCriticalFmt(const Format: string; const Args: TArray<TVarRec>); overload; inline;

const
  LvlStr: array[TLogLevel] of string =
    ('Debug', 'Info', 'Warning', 'Error', 'Critical', 'None');

implementation

uses
  DateUtils, SyncObjs, System.StrUtils;

// Writes a fallback message to stderr. Holding the target in an interface
// variable lets reference counting free it; calling a method directly on the
// constructor result would leak the object.
procedure LogToStdErr(const Message: string; LogLevel: TLogLevel);
var
  Target: ILogTarget;
begin
  Target := TConsoleLogTarget.Create(True);
  Target.LogMessage(Message, LogLevel);
end;

{ EPDFPowerToolsException }

constructor EPDFPowerToolsException.Create(const AMessage: string);
begin
  inherited Create(AMessage);
  FErrorCode := 0;
  FErrorTime := Now;
  FSourceModule := '';
  FInnerException := nil;
  FContextInfo := '';
end;

constructor EPDFPowerToolsException.Create(const AMessage: string;
  ErrorCode: Integer);
begin
  inherited Create(AMessage);
  FErrorCode := ErrorCode;
  FErrorTime := Now;
  FSourceModule := '';
  FInnerException := nil;
  FContextInfo := '';
end;

constructor EPDFPowerToolsException.Create(const AMessage: string;
  ErrorCode: Integer; const SourceModule: string);
begin
  inherited Create(AMessage);
  FErrorCode := ErrorCode;
  FErrorTime := Now;
  FSourceModule := SourceModule;
  FInnerException := nil;
  FContextInfo := '';
end;

constructor EPDFPowerToolsException.Create(const AMessage: string;
  ErrorCode: Integer; const SourceModule: string; InnerException: Exception);
begin
  inherited Create(AMessage);
  FErrorCode := ErrorCode;
  FErrorTime := Now;
  FSourceModule := SourceModule;
  FInnerException := InnerException;
  FContextInfo := '';
end;

constructor EPDFPowerToolsException.CreateFmt(const Format: string;
  const Args: array of const);
begin
  Create(System.SysUtils.Format(Format, Args));
end;

constructor EPDFPowerToolsException.CreateFmt(const Format: string;
  const Args: array of const; ErrorCode: Integer);
begin
  Create(System.SysUtils.Format(Format, Args), ErrorCode);
end;

constructor EPDFPowerToolsException.CreateFmt(const Format: string;
  const Args: array of const; ErrorCode: Integer; const SourceModule: string);
begin
  Create(System.SysUtils.Format(Format, Args), ErrorCode, SourceModule);
end;

constructor EPDFPowerToolsException.CreateFmt(const Format: string;
  const Args: array of const; ErrorCode: Integer; const SourceModule: string;
  InnerException: Exception);
begin
  Create(System.SysUtils.Format(Format, Args), ErrorCode, SourceModule,
    InnerException);
end;

destructor EPDFPowerToolsException.Destroy;
begin
  FInnerException := nil; // Do not free inner exception (just a reference)
  inherited;
end;

function EPDFPowerToolsException.GetFormattedMessage: string;
begin
  Result := Message;
end;

function EPDFPowerToolsException.GetFullErrorMessage: string;
var
  sb: TStringBuilder;
begin
  sb := TStringBuilder.Create;
  try
    sb.Append('Error Time: ')
      .Append(FormatDateTime('yyyy-mm-dd hh:nn:ss.zzz', FErrorTime)).AppendLine;
    // The exception stores no log level, so the former 'Error Level' line
    // (which printed the source module under the wrong label) is dropped.
    sb.Append('Error Code: ').Append(IntToStr(FErrorCode)).AppendLine;
    if FSourceModule <> '' then
      sb.Append('Source Module: ').Append(FSourceModule).AppendLine;
    sb.Append('Message: ').Append(GetFormattedMessage).AppendLine;
    if FContextInfo <> '' then
      sb.Append('Context Info: ').Append(FContextInfo).AppendLine;
    if Assigned(FInnerException) then
    begin
      sb.AppendLine;
      sb.AppendLine('--- Inner Exception ---');
      if FInnerException is EPDFPowerToolsException then
        sb.Append(EPDFPowerToolsException(FInnerException).GetFullErrorMessage)
      else
        sb.Append(FInnerException.ClassName).Append(': ')
          .Append(FInnerException.Message);
    end;
    Result := sb.ToString;
  finally
    sb.Free;
  end;
end;

{ TConsoleLogTarget }

constructor TConsoleLogTarget.Create(UseStdErr: Boolean);
begin
  inherited Create;
  FUseStdErr := UseStdErr;
end;

procedure TConsoleLogTarget.LogMessage(const Message: string;
  LogLevel: TLogLevel);
var
  OutputHandle: THandle;
  NumWritten: Cardinal;
begin
  if FUseStdErr then
    OutputHandle := GetStdHandle(STD_ERROR_HANDLE)
  else
    OutputHandle := GetStdHandle(STD_OUTPUT_HANDLE);
  if OutputHandle <> INVALID_HANDLE_VALUE then
    WriteConsole(OutputHandle, PChar(Message), Length(Message), NumWritten, nil);
end;

{ TFileLogTarget }

constructor TFileLogTarget.Create(const LogFilePath: string);
begin
  inherited Create;
  FLogFilePath := LogFilePath;
  FIsLogFileOpen := False;
end;

destructor TFileLogTarget.Destroy;
begin
  CloseLogFile;
  inherited;
end;

procedure TFileLogTarget.EnsureLogFileOpen;
begin
  if not FIsLogFileOpen then
  begin
    try
      AssignFile(FLogFile, FLogFilePath);
      // Append fails on a missing file, so create the file on first use.
      // With I/O checking on (the default), failures raise EInOutError and
      // are handled below; no IOResult check is needed.
      if FileExists(FLogFilePath) then
        Append(FLogFile)
      else
        Rewrite(FLogFile);
      FIsLogFileOpen := True;
    except
      on E: Exception do
      begin
        // Fallback: log to console if file open fails
        LogToStdErr('[CRITICAL] Failed to open log file: ' + E.Message +
          '. Logging to console instead.', llCritical);
        FIsLogFileOpen := False;
      end;
    end;
  end;
end;

procedure TFileLogTarget.CloseLogFile;
begin
  if FIsLogFileOpen then
  begin
    try
      CloseFile(FLogFile);
    finally
      FIsLogFileOpen := False;
    end;
  end;
end;

procedure TFileLogTarget.WriteToLogFile(const Message: string);
begin
  EnsureLogFileOpen;
  if FIsLogFileOpen then
  begin
    try
      Writeln(FLogFile, Message);
      Flush(FLogFile);
    except
      on E: Exception do
      begin
        LogToStdErr('[WARNING] Failed to write to log file: ' + E.Message +
          '. Message: ' + Message, llWarning);
        CloseLogFile; // Close to force re-open on next log
      end;
    end;
  end;
end;

procedure TFileLogTarget.LogMessage(const Message: string; LogLevel: TLogLevel);
begin
  WriteToLogFile(Message);
end;

{$IFDEF MSWINDOWS}
{ TEventLogTarget }

constructor TEventLogTarget.Create(const EventSourceName: string);
begin
  inherited Create;
  FEventSourceName := EventSourceName;
end;

function TEventLogTarget.GetEventType(LogLevel: TLogLevel): Word;
begin
  case LogLevel of
    llCritical, llError:
      Result := EVENTLOG_ERROR_TYPE;
    llWarning:
      Result := EVENTLOG_WARNING_TYPE;
    llInfo, llDebug:
      Result := EVENTLOG_INFORMATION_TYPE;
  else
    Result := EVENTLOG_INFORMATION_TYPE;
  end;
end;

procedure TEventLogTarget.LogMessage(const Message: string; LogLevel: TLogLevel);
var
  EventLogHandle: THandle;
  EventType: Word;
  Strings: array [0..0] of PChar;
begin
  EventLogHandle := RegisterEventSource(nil, PChar(FEventSourceName));
  if EventLogHandle <> 0 then
  begin
    try
      EventType := GetEventType(LogLevel);
      Strings[0] := PChar(Message);
      ReportEvent(EventLogHandle, EventType, 0, 0, nil, 1, 0, @Strings, nil);
    finally
      DeregisterEventSource(EventLogHandle);
    end;
  end
  else
    LogToStdErr('[WARNING] Failed to write to Windows Event Log (Source: ' +
      FEventSourceName + '). Logging to console instead.', llWarning);
end;
{$ENDIF}

{ TErrorLogger }

class function TErrorLogger.GetInstance: TErrorLogger;
begin
  if FInstance = nil then
    FInstance := TErrorLogger.Create;
  Result := FInstance;
end;

class procedure TErrorLogger.Initialize;
begin
  if FInstance = nil then
    GetInstance;
end;

class procedure TErrorLogger.FinalizeLogger;
begin
  if FInstance <> nil then
  begin
    FInstance.Free;
    FInstance := nil;
  end;
end;

constructor TErrorLogger.Create;
begin
  inherited Create;
  FLogLevel := llWarning; // Default log level
  FLogTargets := TList<ILogTarget>.Create;
  FOnLogMessage := nil;
end;

destructor TErrorLogger.Destroy;
begin
  ClearLogTargets;
  FLogTargets.Free;
  inherited;
end;

class function TErrorLogger.GetLogLevel: TLogLevel;
begin
  Result := GetInstance.FLogLevel;
end;

class procedure TErrorLogger.SetLogLevel(const Value: TLogLevel);
begin
  GetInstance.FLogLevel := Value;
end;

class function TErrorLogger.GetOnLogMessage: TLogMessageEvent;
begin
  Result := GetInstance.FOnLogMessage;
end;

class procedure TErrorLogger.SetOnLogMessage(const Value: TLogMessageEvent);
begin
  GetInstance.FOnLogMessage := Value;
end;

procedure TErrorLogger.InternalLog(const Message: string; LogLevel: TLogLevel);
var
  i: Integer;
  FormattedMessage: string;
  sb: TStringBuilder;
begin
  // Messages below the configured level are dropped
  if LogLevel < FLogLevel then
    Exit;
  sb := TStringBuilder.Create;
  try
    sb.Append(FormatDateTime('yyyy-mm-dd hh:nn:ss.zzz', Now));
    sb.Append(' - [').Append(LogLevelToString(LogLevel)).Append('] - ');
    sb.Append(Message).Append(sLineBreak);
    FormattedMessage := sb.ToString;
  finally
    sb.Free;
  end;
  for i := 0 to FLogTargets.Count - 1 do
    FLogTargets[i].LogMessage(FormattedMessage, LogLevel);
  if Assigned(FOnLogMessage) then
    FOnLogMessage(Self, FormattedMessage, LogLevel);
end;

procedure TErrorLogger.AddLogTarget(LogTarget: ILogTarget);
begin
  if LogTarget <> nil then
    FLogTargets.Add(LogTarget);
end;

procedure TErrorLogger.ClearLogTargets;
begin
  FLogTargets.Clear;
end;

procedure TErrorLogger.Debug(const Message: string);
begin
  InternalLog(Message, llDebug);
end;

procedure TErrorLogger.DebugFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  Debug(System.SysUtils.Format(Format, Args));
end;

procedure TErrorLogger.Info(const Message: string);
begin
  InternalLog(Message, llInfo);
end;

procedure TErrorLogger.InfoFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  Info(System.SysUtils.Format(Format, Args));
end;

procedure TErrorLogger.Warning(const Message: string);
begin
  InternalLog(Message, llWarning);
end;

procedure TErrorLogger.WarningFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  Warning(System.SysUtils.Format(Format, Args));
end;

procedure TErrorLogger.Error(const Message: string);
begin
  InternalLog(Message, llError);
end;

procedure TErrorLogger.ErrorFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  Error(System.SysUtils.Format(Format, Args));
end;

procedure TErrorLogger.Critical(const Message: string);
begin
  InternalLog(Message, llCritical);
end;

procedure TErrorLogger.CriticalFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  Critical(System.SysUtils.Format(Format, Args));
end;

function LogLevelToString(LogLevel: TLogLevel): string;
begin
  Result := LvlStr[LogLevel];
end;

{ Global Logging Procedures }

procedure LogDebug(const Message: string);
begin
  TErrorLogger.GetInstance.Debug(Message);
end;

procedure LogDebugFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  TErrorLogger.GetInstance.DebugFmt(Format, Args);
end;

procedure LogInfo(const Message: string);
begin
  TErrorLogger.GetInstance.Info(Message);
end;

procedure LogInfoFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  TErrorLogger.GetInstance.InfoFmt(Format, Args);
end;

procedure LogWarning(const Message: string);
begin
  TErrorLogger.GetInstance.Warning(Message);
end;

procedure LogWarningFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  TErrorLogger.GetInstance.WarningFmt(Format, Args);
end;

procedure LogError(const Message: string);
begin
  TErrorLogger.GetInstance.Error(Message);
end;

procedure LogErrorFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  TErrorLogger.GetInstance.ErrorFmt(Format, Args);
end;

procedure LogCritical(const Message: string);
begin
  TErrorLogger.GetInstance.Critical(Message);
end;

procedure LogCriticalFmt(const Format: string; const Args: TArray<TVarRec>);
begin
  TErrorLogger.GetInstance.CriticalFmt(Format, Args);
end;

initialization
  TErrorLogger.Initialize; // Initialize logger on unit startup

finalization
  TErrorLogger.FinalizeLogger; // Finalize logger on unit shutdown

end.
uFileSystemUtilities.pas:
unit uFileSystemUtilities;

interface

uses
  System.SysUtils, System.Classes, System.IniFiles, System.Types,
  Winapi.ShellAPI, Winapi.Windows,
  IdHTTP, IdException, IdSSLOpenSSL, // IdSSLOpenSSL provides the SSL handler
  System.Zip, System.IOUtils, uErrorHandling, System.JSON;

const
  // Constants for XPDF Tools location, download URL, and required files
  GITHUB_API_LATEST_RELEASE_URL =
    'https://api.github.com/repos/oschwartz10612/poppler-windows/releases/latest';
  XPDF_TOOLS_DIR = 'SuiteXPDFTools';
  REQUIRED_XPDF_FILES: array of string = [
    'cairo.dll', 'charset.dll', 'deflate.dll', 'expat.dll',
    'fontconfig-1.dll', 'freetype.dll', 'iconv.dll', 'jpeg8.dll',
    'lcms2.dll', 'Lerc.dll', 'libcrypto-3-x64.dll', 'libcurl.dll',
    'libexpat.dll', 'liblzma.dll', 'libpng16.dll', 'libssh2.dll',
    'libtiff.dll', 'libzstd.dll', 'openjp2.dll', 'pdfattach.exe',
    'pdfdetach.exe', 'pdffonts.exe', 'pdfimages.exe', 'pdfinfo.exe',
    'pdfseparate.exe', 'pdftocairo.exe', 'pdftohtml.exe', 'pdftoppm.exe',
    'pdftops.exe', 'pdftotext.exe', 'pdfunite.exe', 'pixman-1-0.dll',
    'poppler-cpp.dll', 'poppler-glib.dll', 'poppler.dll', 'tiff.dll',
    'zlib.dll', 'zstd.dll', 'zstd.exe'];

var
  // Pre-calculated application path
  gAppPath: string;

type
  TFileChecker = class
  private
    class var FFullPath: string; // FullPath is a class variable
  public
    class function CheckFileExists(const AFilePath: string): Boolean; inline; static;
    class function CheckDirectoryExists(const ADirectoryPath: string): Boolean; inline; static;
    class function CheckXPDFExecutableExists(const ExecutableName: string;
      const SearchPaths: TArray<string>): Boolean; inline; static;
    class function CheckFileExistsBool(const AFilePath: string): Boolean; inline; static;
    class function CheckDirectoryExistsBool(const ADirectoryPath: string): Boolean; inline; static;
    class function GetFullPath: string; static;
  end;

  // NOTE: a standard xpdfrc is a plain list of "command value" lines (see
  // xpdfrc(5)), not an INI file, so TIniFile returns only the defaults
  // unless the file is written in INI form.
  TXPDFRCParser = class
  private
    FConfig: TIniFile;
    function GetStringValue(const Section, Key, DefaultValue: string): string; inline;
    function GetBoolValue(const Section, Key: string; DefaultValue: Boolean): Boolean; inline;
    function LocateConfigFile: Boolean; inline;
  public
    constructor Create;
    destructor Destroy; override;
    function GetTextEncoding: string; inline;
  end;

  TDependencyManager = class
  private
    class var FInstance: TDependencyManager;
    class var FXPDFRCParser: TXPDFRCParser;
    FInitialized: Boolean;
  public
    class function GetInstance: TDependencyManager; inline;
    class procedure Initialize;
    class procedure FinalizeManager;
    class function InitializeXPDFToolsPath: Boolean; static;
    class function GetXPDFRCParser: TXPDFRCParser; inline; static;
    class function CheckDependencies: Boolean;
    constructor Create;
    destructor Destroy; override;
  end;

procedure CreateOutputDirectory(const DirPath: string); inline;
function GetTempDirectory: string; inline;
function GenerateUniqueFileName(const Directory: string;
  const FilePrefix: string; const FileExtension: string): string;
procedure CleanDirectory(const Directory: string);
procedure DownloadAndExtractXPDFTools;
function GetLatestReleaseURLFromGitHub: string;

implementation

uses
  DateUtils, System.StrUtils;

{$R-} {$Q-} // Disable range and overflow checks

function GetExecutablePath(const ExecutableName: string;
  var ExecutablePath: string): Boolean; inline;
var
  Buffer: array [0 .. MAX_PATH - 1] of Char;
begin
  if (ExecutableName = '') then
  begin
    ExecutablePath := '';
    Result := False;
    Exit;
  end;
  if FindExecutable(PChar(ExecutableName), nil, Buffer) > 32 then
  begin
    ExecutablePath := String(Buffer); // Convert the buffer to a string
    Result := True;
  end
  else
  begin
    ExecutablePath := '';
    Result := False;
  end;
end;

class function TFileChecker.CheckFileExists(const AFilePath: string): Boolean;
begin
  if AFilePath = '' then
    raise EPDFFileException.Create('Empty file path provided to CheckFileExists');
  FFullPath := ExpandFileName(AFilePath);
  Result := FileExists(FFullPath);
  if not Result then
    raise EPDFFileException.CreateFmt('File not found: %s', [FFullPath]);
end;

class function TFileChecker.CheckDirectoryExists(const ADirectoryPath: string): Boolean;
begin
  if ADirectoryPath = '' then
    raise EPDFFileException.Create(
      'Empty directory path provided to CheckDirectoryExists');
  FFullPath := ExpandFileName(ADirectoryPath);
  Result := DirectoryExists(FFullPath);
  if not Result then
    raise EPDFFileException.CreateFmt('Directory not found: %s', [FFullPath]);
end;

class function TFileChecker.GetFullPath: string;
begin
  Result := IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR) +
    'pdftotext.exe'; // Similarly for other tools like pdfinfo.exe
end;

class function TFileChecker.CheckXPDFExecutableExists(const ExecutableName: string;
  const SearchPaths: TArray<string>): Boolean;
var
  ExecutablePath: string;
  SearchDir: string;
  I: Integer;
  SafeExecutableName: string;
begin
  // Guard against empty executable name
  if (ExecutableName = '') then
  begin
    LogError('Empty executable name provided to CheckXPDFExecutableExists');
    raise EPDFDependencyException.Create(
      'Invalid XPDF executable name: No name provided');
  end;

  // Ensure executable name has .exe extension
  SafeExecutableName := ExecutableName;
  if not EndsText('.exe', SafeExecutableName) then
    SafeExecutableName := SafeExecutableName + '.exe';

  // 1. Check in SuiteXPDFTools directory.
  if FileExists(IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR) +
    SafeExecutableName) then
  begin
    FFullPath := IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR) +
      SafeExecutableName;
    Result := True;
    Exit;
  end;

  // 2. Check in specified search paths.
  for I := Low(SearchPaths) to High(SearchPaths) do
  begin
    SearchDir := SearchPaths[I];
    if (SearchDir <> '') and
      FileExists(IncludeTrailingPathDelimiter(SearchDir) + SafeExecutableName) then
    begin
      FFullPath := IncludeTrailingPathDelimiter(SearchDir) + SafeExecutableName;
      Result := True;
      Exit;
    end;
  end;

  // 3. Check default path (using FindExecutable)
  if GetExecutablePath(SafeExecutableName, ExecutablePath) then
  begin
    FFullPath := ExecutablePath;
    Result := True;
    Exit;
  end;

  // 4. If not found, raise exception with the correct filename
  FFullPath := '';
  raise EPDFDependencyException.CreateFmt(
    'XPDF Executable not found: %s. Ensure it is in SuiteXPDFTools directory, ' +
    'your PATH, or the application directory.', [SafeExecutableName]);
end;

class function TFileChecker.CheckFileExistsBool(const AFilePath: string): Boolean;
begin
  if AFilePath = '' then
  begin
    Result := False;
    Exit;
  end;
  FFullPath := ExpandFileName(AFilePath);
  Result := FileExists(FFullPath);
end;

class function TFileChecker.CheckDirectoryExistsBool(const ADirectoryPath: string): Boolean;
begin
  if ADirectoryPath = '' then
  begin
    Result := False;
    Exit;
  end;
  FFullPath := ExpandFileName(ADirectoryPath);
  Result := DirectoryExists(FFullPath);
end;

constructor TXPDFRCParser.Create;
var
  ConfigFile: string;
begin
  inherited Create;
  FConfig := nil;
  ConfigFile := gAppPath + 'xpdfrc'; // Use pre-calculated path
  if FileExists(ConfigFile) then
    FConfig := TIniFile.Create(ConfigFile);
end;

destructor TXPDFRCParser.Destroy;
begin
  FConfig.Free; // Free is safe on a nil reference
  inherited;
end;

function TXPDFRCParser.LocateConfigFile: Boolean;
begin
  Result := FileExists(gAppPath + 'xpdfrc'); // Use pre-calculated path
end;

function TXPDFRCParser.GetStringValue(const Section, Key, DefaultValue: string): string;
begin
  if Assigned(FConfig) then
    Result := FConfig.ReadString(Section, Key, DefaultValue)
  else
    Result := DefaultValue;
end;

function TXPDFRCParser.GetBoolValue(const Section, Key: string;
  DefaultValue: Boolean): Boolean;
begin
  if Assigned(FConfig) then
    Result := FConfig.ReadBool(Section, Key, DefaultValue)
  else
    Result := DefaultValue;
end;

function TXPDFRCParser.GetTextEncoding: string;
begin
  Result := GetStringValue('Text Control', 'textEncoding', 'Latin1');
end;

class function TDependencyManager.GetInstance: TDependencyManager;
begin
  if FInstance = nil then
    FInstance := TDependencyManager.Create;
  Result := FInstance;
end;

class procedure TDependencyManager.Initialize;
begin
  if FInstance = nil then
    GetInstance;
end;

class procedure TDependencyManager.FinalizeManager;
begin
  if FInstance <> nil then
  begin
    FInstance.Free;
    FInstance := nil;
  end;
end;

constructor TDependencyManager.Create;
begin
  inherited Create;
  FXPDFRCParser := nil;
  FInitialized := False;
end;

destructor TDependencyManager.Destroy;
begin
  // FXPDFRCParser is a class variable, so reset it to nil after freeing
  FreeAndNil(FXPDFRCParser);
  inherited;
end;

class function TDependencyManager.InitializeXPDFToolsPath: Boolean;
begin
  if FXPDFRCParser = nil then
    FXPDFRCParser := TXPDFRCParser.Create;
  // Ensure XPDF tools are downloaded and extracted
  try
    DownloadAndExtractXPDFTools;
    // Check specifically for pdftotext.exe
    TFileChecker.CheckXPDFExecutableExists('pdftotext.exe', []);
    Result := True;
  except
    on E: Exception do
    begin
      LogError('Failed to initialize XPDF tools: ' + E.Message);
      Result := False;
    end;
  end;
end;

class function TDependencyManager.GetXPDFRCParser: TXPDFRCParser;
begin
  Result := FXPDFRCParser;
end;

procedure CreateOutputDirectory(const DirPath: string); inline;
begin
  if DirPath = '' then
    Exit;
  if not DirectoryExists(DirPath) then
    ForceDirectories(DirPath);
end;

function GetTempDirectory: string; inline;
var
  TempDir: string;
begin
  TempDir := GetEnvironmentVariable('TEMP');
  if TempDir = '' then
    TempDir := TPath.GetTempPath;
  if TempDir = '' then
    raise EPDFFileException.Create('Could not determine temporary directory.');
  Result := IncludeTrailingPathDelimiter(TempDir);
end;

function GenerateUniqueFileName(const Directory: string;
  const FilePrefix: string; const FileExtension: string): string;
var
  Dir, Ext: string;
begin
  if Directory = '' then
    Dir := GetTempDirectory
  else
    Dir := IncludeTrailingPathDelimiter(Directory);
  if FileExtension = '' then
    Ext := ''
  else if FileExtension[1] = '.' then
    Ext := FileExtension
  else
    Ext := '.' + FileExtension;
  // Timestamp plus GUID keeps names unique across concurrent calls
  Result := Dir + FilePrefix + FormatDateTime('yyyymmdd_hhnnss_zzz_', Now) +
    TGUID.NewGuid.ToString.Replace('{', '').Replace('}', '') + Ext;
end;

// Deletes the files directly inside Directory; subdirectories are left alone
procedure CleanDirectory(const Directory: string);
var
  SearchRec: TSearchRec;
  Dir: string;
begin
  if Directory = '' then
    Exit;
  Dir := IncludeTrailingPathDelimiter(Directory);
  if FindFirst(Dir + '*.*', faAnyFile, SearchRec) = 0 then
  begin
    try
      repeat
        if (SearchRec.Name <> '.') and (SearchRec.Name <> '..') and
          (SearchRec.Attr and faDirectory = 0) then
        begin
          try
            if not System.SysUtils.DeleteFile(Dir + SearchRec.Name) then
              LogWarning('Failed to delete file: ' + Dir + SearchRec.Name);
          except
            on E: Exception do
              LogWarning('Error deleting file ' + Dir + SearchRec.Name + ': ' +
                E.Message);
          end;
        end;
      until FindNext(SearchRec) <> 0;
    finally
      System.SysUtils.FindClose(SearchRec);
    end;
  end;
end;

class function TDependencyManager.CheckDependencies: Boolean;
var
  RequiredFile: string;
  XPDFToolsPath: string;
  AllFilesPresent: Boolean;
  ExecutableName: string;
  CriticalTool: string;
  CriticalTools: array of string;
begin
  AllFilesPresent := True;
  XPDFToolsPath := IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR);

  // Make sure the tools have been downloaded and extracted
  try
    DownloadAndExtractXPDFTools;
  except
    on E: Exception do
    begin
      LogError('Failed to download/extract the XPDF tools: ' + E.Message);
      Result := False;
      Exit;
    end;
  end;

  // Check every required file listed in the REQUIRED_XPDF_FILES constant
  for RequiredFile in REQUIRED_XPDF_FILES do
  begin
    if RequiredFile = '' then
      Continue; // Skip empty entries
    if EndsText('.exe', RequiredFile) then // Executable
    begin
      try
        // Name without extension (for validation, if needed)
        ExecutableName := ChangeFileExt(RequiredFile, '');
        if ExecutableName = '' then
          Continue; // Skip empty names
        if not TFileChecker.CheckFileExistsBool(XPDFToolsPath + RequiredFile) then
        begin
          LogWarning('Required XPDF executable not found: ' + XPDFToolsPath +
            RequiredFile);
          AllFilesPresent := False;
        end;
      except
        on E: Exception do
        begin
          LogError('Error checking executable ' + RequiredFile + ': ' + E.Message);
          AllFilesPresent := False;
        end;
      end;
    end
    else // DLL or other file
    begin
      if not TFileChecker.CheckFileExistsBool(XPDFToolsPath + RequiredFile) then
      begin
        LogWarning('Required XPDF file not found: ' + XPDFToolsPath + RequiredFile);
        AllFilesPresent := False;
      end;
    end;
  end;

  // Critical tools that get an individual check
  CriticalTools := ['pdfattach.exe', 'pdfdetach.exe', 'pdffonts.exe',
    'pdfimages.exe', 'pdfinfo.exe', 'pdfseparate.exe', 'pdftocairo.exe',
    'pdftohtml.exe', 'pdftoppm.exe', 'pdftops.exe', 'pdftotext.exe',
    'pdfunite.exe'];

  // Check each critical tool individually
  for CriticalTool in CriticalTools do
  begin
    try
      TFileChecker.CheckXPDFExecutableExists(CriticalTool, []);
    except
      on E: Exception do
      begin
        LogError('Critical XPDF tool not found - ' + CriticalTool + ': ' +
          E.Message);
        AllFilesPresent := False;
      end;
    end;
  end;

  Result := AllFilesPresent;
end;

procedure DownloadAndExtractXPDFTools;
var
  HTTP: TIdHTTP;
  ZipFileStream: TFileStream;
  Zip: TZipFile;
  TargetPath, ZipPath, TempExtractPath: string;
  I: Integer;
  SourceFile, DestFile, FileName: string;
  DownloadURL: string;
  AlreadyChecked: Boolean;
  Files: TStringDynArray;
begin
  // First check if tools already exist
  AlreadyChecked := False;
  try
    AlreadyChecked := TFileChecker.CheckFileExistsBool(
      IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR) + 'pdftotext.exe');
  except
    AlreadyChecked := False;
  end;
  if AlreadyChecked then
  begin
    LogInfo('XPDF tools are already present.');
    Exit;
  end;

  // Create target directory
  TargetPath := IncludeTrailingPathDelimiter(gAppPath + XPDF_TOOLS_DIR);
  ForceDirectories(TargetPath);

  // Create a temporary extraction directory
  TempExtractPath := IncludeTrailingPathDelimiter(TargetPath + 'temp_extract');
  ForceDirectories(TempExtractPath);

  // Generate a unique name for the zip file
  ZipPath := GenerateUniqueFileName(TargetPath, 'xpdf-tools-win64-', '.zip');
  LogInfo('Downloading XPDF tools...');

  // Get the latest release URL from GitHub
  try
    DownloadURL := GetLatestReleaseURLFromGitHub;
    if DownloadURL = '' then
      raise EPDFDependencyException.Create('No download URL found for XPDF tools');
  except
    on E: Exception do
    begin
      LogError('Failed to get XPDF download URL: ' + E.Message);
      raise EPDFDependencyException.Create('Failed to get XPDF download URL: ' +
        E.Message);
    end;
  end;

  HTTP := TIdHTTP.Create(nil);
  HTTP.HandleRedirects := True; // Follow redirects
  ZipFileStream := nil;
  try
    try
      // Configure SSL for HTTPS
      HTTP.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(HTTP);
      with TIdSSLIOHandlerSocketOpenSSL(HTTP.IOHandler).SSLOptions do
      begin
        // Assigning Method after SSLVersions resets the version set to that
        // single version, so only SSLVersions is set here. TLS 1.0/1.1 are
        // left out; HAVE_TLSV1_3 is a project-specific define.
        {$IFDEF HAVE_TLSV1_3}
        SSLVersions := [sslvTLSv1_2, sslvTLSv1_3];
        {$ELSE}
        SSLVersions := [sslvTLSv1_2];
        {$ENDIF}
        Mode := sslmUnassigned;
        VerifyMode := []; // NOTE: the server certificate is not verified
        VerifyDepth := 0;
      end;

      // Set download timeouts
      HTTP.ConnectTimeout := 30000; // 30 seconds
      HTTP.ReadTimeout := 60000; // 60 seconds

      // Create a stream for the ZIP file
      ZipFileStream := TFileStream.Create(ZipPath, fmCreate);

      // Download the file
      HTTP.Get(DownloadURL, ZipFileStream);
      ZipFileStream.Position := 0;
      LogInfo('XPDF tools downloaded successfully. Starting extraction...');
    except
      on E: EIdHTTPProtocolException do
      begin
        // Close the stream first; an open file cannot be deleted on Windows
        FreeAndNil(ZipFileStream);
        if FileExists(ZipPath) then
          System.SysUtils.DeleteFile(ZipPath);
        raise EPDFDependencyException.CreateFmt(
          'Failed to download XPDF tools (HTTP %d): %s', [E.ErrorCode, E.Message]);
      end;
      on E: Exception do
      begin
        FreeAndNil(ZipFileStream);
        if FileExists(ZipPath) then
          System.SysUtils.DeleteFile(ZipPath);
        raise EPDFDependencyException.CreateFmt(
          'Failed to download XPDF tools: %s', [E.Message]);
      end;
    end;
  finally
    if Assigned(ZipFileStream) then
    begin
      ZipFileStream.Free;
      ZipFileStream := nil;
    end;
    HTTP.Free;
  end;

  // Extract the downloaded ZIP file to temp directory first
  Zip := TZipFile.Create;
  try
    try
      if not FileExists(ZipPath) then
        raise EPDFDependencyException.Create('ZIP file not found after download');
      Zip.Open(ZipPath, zmRead);

      // Extract all files to the temporary directory
      Zip.ExtractAll(TempExtractPath);
      LogInfo('Extracted all files to temporary directory: ' + TempExtractPath);

      // Now find all executable and DLL files in the temp directory recursively
      Files := TDirectory.GetFiles(TempExtractPath, '*.*',
        TSearchOption.soAllDirectories);
      for SourceFile in Files do
      begin
        // Only copy .exe and .dll files
        if EndsText('.exe', SourceFile) or EndsText('.dll', SourceFile) then
        begin
          FileName := ExtractFileName(SourceFile);
          DestFile := IncludeTrailingPathDelimiter(TargetPath) + FileName;
          try
            // Copy the file to the target directory
            TFile.Copy(SourceFile, DestFile, True);
            LogInfo('Copied: ' + FileName + ' to ' + TargetPath);
          except
            on E: Exception do
              LogWarning('Failed to copy file ' + FileName + ': ' + E.Message);
          end;
        end;
      end;
    except
      on E: EZipException do
        raise EPDFDependencyException.CreateFmt(
          'Failed to extract XPDF tools: %s', [E.Message]);
      on E: Exception do
        raise EPDFDependencyException.CreateFmt(
          'Error extracting XPDF tools: %s', [E.Message]);
    end;
  finally
    Zip.Free;
    // Clean up the ZIP file
    try
      if FileExists(ZipPath) then
        System.SysUtils.DeleteFile(ZipPath);
    except
      on E: Exception do
        LogWarning('Failed to 
delete ZIP file: ' + E.Message); end; // Clean up the temporary extraction directory try if DirectoryExists(TempExtractPath) then TDirectory.Delete(TempExtractPath, True); except on E: Exception do LogWarning('Failed to delete temporary directory: ' + E.Message); end; end; LogInfo('XPDF tools installed in ' + TargetPath); // Final verification if not TFileChecker.CheckFileExistsBool(TargetPath + 'pdftotext.exe') then raise EPDFDependencyException.Create ('XPDF tools installation failed: pdftotext.exe not found after extraction'); end; function GetLatestReleaseURLFromGitHub: string; var HTTP: TIdHTTP; SSLHandler: TIdSSLIOHandlerSocketOpenSSL; Response: string; JsonValue: TJSONValue; JsonObject: TJSONObject; AssetsArray: TJSONArray; I: Integer; AssetObject: TJSONObject; DownloadURL: string; AssetName: string; begin Result := ''; HTTP := TIdHTTP.Create(nil); SSLHandler := TIdSSLIOHandlerSocketOpenSSL.Create(HTTP); JsonValue := nil; try try // Configure HTTP client HTTP.IOHandler := SSLHandler; HTTP.ConnectTimeout := 30000; // 30 seconds HTTP.ReadTimeout := 30000; // 30 seconds // Configure SSL {$IFDEF HAVE_TLSV1_3} SSLHandler.SSLOptions.SSLVersions := [sslvTLSv1, sslvTLSv1_1, sslvTLSv1_2, sslvTLSv1_3]; {$ELSE} SSLHandler.SSLOptions.SSLVersions := [sslvTLSv1, sslvTLSv1_1, sslvTLSv1_2]; {$ENDIF} SSLHandler.SSLOptions.Method := sslvTLSv1_2; SSLHandler.SSLOptions.Mode := sslmUnassigned; SSLHandler.SSLOptions.VerifyMode := []; SSLHandler.SSLOptions.VerifyDepth := 0; // Set user agent to avoid API rate limiting HTTP.Request.UserAgent := 'PDFSuite-Dependency-Manager'; // Get the latest release information Response := HTTP.Get(GITHUB_API_LATEST_RELEASE_URL); if Response = '' then raise EPDFDependencyException.Create('Empty response from GitHub API'); // Parse the JSON response JsonValue := TJSONObject.ParseJSONValue(Response); if not Assigned(JsonValue) then raise EPDFDependencyException.Create ('Invalid JSON response from GitHub API'); if not(JsonValue is TJSONObject) then 
begin LogWarning('GitHub API response is not a JSON object'); raise EPDFDependencyException.Create ('Unexpected response format from GitHub API'); end; JsonObject := JsonValue as TJSONObject; // Get the assets array if not JsonObject.TryGetValue<TJSONArray>('assets', AssetsArray) then raise EPDFDependencyException.Create ('No assets found in GitHub release'); if not Assigned(AssetsArray) or (AssetsArray.Count = 0) then raise EPDFDependencyException.Create ('No assets available in the latest release'); // Find a suitable asset (ZIP file for Windows 64-bit) for I := 0 to AssetsArray.Count - 1 do begin if not(AssetsArray.Items[I] is TJSONObject) then Continue; AssetObject := AssetsArray.Items[I] as TJSONObject; // Get the asset name if not AssetObject.TryGetValue<string>('name', AssetName) or (AssetName = '') then Continue; // Check if this is a suitable Windows ZIP file if (EndsText('.zip', AssetName) or EndsText('.7z', AssetName) or EndsText('.tar.gz', AssetName)) and ((Pos('win', LowerCase(AssetName)) > 0) or (Pos('windows', LowerCase(AssetName)) > 0) or (Pos('x64', LowerCase(AssetName)) > 0) or (Pos('amd64', LowerCase(AssetName)) > 0)) then begin // Get the download URL if AssetObject.TryGetValue<string>('browser_download_url', DownloadURL) and (DownloadURL <> '') then begin Result := DownloadURL; LogInfo('Found download URL: ' + Result); Break; end; end; end; // Fallback: if no matching asset found, try to use the first ZIP asset if Result = '' then begin for I := 0 to AssetsArray.Count - 1 do begin if not(AssetsArray.Items[I] is TJSONObject) then Continue; AssetObject := AssetsArray.Items[I] as TJSONObject; if not AssetObject.TryGetValue<string>('name', AssetName) or (AssetName = '') then Continue; if EndsText('.zip', AssetName) then begin if AssetObject.TryGetValue<string>('browser_download_url', DownloadURL) and (DownloadURL <> '') then begin Result := DownloadURL; LogInfo('Using fallback download URL: ' + Result); Break; end; end; end; end; except on E: 
EIdHTTPProtocolException do begin LogError('HTTP error fetching GitHub API data: ' + E.Message); raise EPDFDependencyException.CreateFmt ('HTTP error fetching GitHub API data (code %d): %s', [E.ErrorCode, E.Message]); end; on E: Exception do begin LogError('Error fetching latest release URL from GitHub API: ' + E.Message); raise EPDFDependencyException.CreateFmt ('Error fetching latest release URL from GitHub API: %s', [E.Message]); end; end; finally if Assigned(JsonValue) then JsonValue.Free; SSLHandler.Free; HTTP.Free; end; if Result = '' then raise EPDFDependencyException.Create ('No suitable XPDF Tools release ZIP file found on GitHub'); end; initialization gAppPath := IncludeTrailingPathDelimiter(ExtractFilePath(ParamStr(0))); TDependencyManager.Initialize; finalization TDependencyManager.FinalizeManager; {$R+} {$Q+} // Re-enable checks end.
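A note on the asset filter in `GetLatestReleaseURLFromGitHub` above: `Pos('win', ...)` also matches names that contain `darwin`, and the `.7z` / `.tar.gz` branches can select archives that `TZipFile` is not able to open. The following is a minimal sketch of a stricter check, assuming the release assets carry a `win` marker in their names. `IsWindowsZipAsset` and the sample file names are hypothetical and not part of the original unit.

```pascal
program AssetFilterSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils;

// Hypothetical helper: accepts only .zip assets that carry a Windows marker.
function IsWindowsZipAsset(const AssetName: string): Boolean;
var
  Lower: string;
begin
  Lower := LowerCase(AssetName);
  // Drop 'darwin' first so its embedded 'win' cannot produce a false match.
  Lower := StringReplace(Lower, 'darwin', '', [rfReplaceAll]);
  Result := EndsText('.zip', Lower) and ContainsText(Lower, 'win');
end;

begin
  // Sample names are made up for illustration.
  Writeln(IsWindowsZipAsset('xpdf-tools-win-4.04.zip'));
  Writeln(IsWindowsZipAsset('tools-darwin-x64.zip'));
  Writeln(IsWindowsZipAsset('xpdf-tools-win-4.04.7z'));
end.
```

Restricting the match to `.zip` keeps the selection consistent with the extraction step, which only handles ZIP archives.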
uProcessExecution.pas:
unit uProcessExecution;

interface

uses
  System.SysUtils, System.Math, Winapi.Windows, Winapi.UserEnv, System.Classes,
  uErrorHandling,
  SyncObjs; // For TCriticalSection - for thread safety if needed

function BuildEnvironmentBlock(const EnvList: TStringList): Pointer;

type
  // Custom Exception Types for Process Execution
  EPProcessExecutionError = class(EPDFPowerToolsException)
  protected
    FCommandLine: string;
  public
    constructor Create(const AMessage: string; const CommandLine: string); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      const CommandLine: string); overload;
    property CommandLine: string read FCommandLine;
  end;

  EPProcessCreationError = class(EPProcessExecutionError)
  public
    constructor Create(const AMessage: string; const CommandLine: string;
      const Win32ErrorCode: Integer); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      const CommandLine: string; const Win32ErrorCode: Integer); overload;
  end;

  EPProcessTimeoutError = class(EPProcessExecutionError)
  public
    constructor Create(const AMessage: string; const CommandLine: string;
      const TimeoutMS: Cardinal); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      const CommandLine: string; const TimeoutMS: Cardinal); overload;
  end;

  EPProcessOutputError = class(EPProcessExecutionError)
  public
    constructor Create(const AMessage: string; const CommandLine: string); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      const CommandLine: string); overload;
  end;

  // For future signal handling if needed
  EPProcessSignalError = class(EPProcessExecutionError)
  public
    constructor Create(const AMessage: string; const CommandLine: string;
      const SignalCode: Integer); overload;
    constructor CreateFmt(const Format: string; const Args: array of const;
      const CommandLine: string; const SignalCode: Integer); overload;
  end;

  // TProcessExecutor Class - Static Utility Class
  TProcessExecutor = class
  private
    class var FDefaultTimeoutMS: Cardinal; // Class-level default timeout
    class var FProcessExecutionLock: TCriticalSection; // For thread safety if needed in future
    class function GetDefaultTimeout: Cardinal; static;
    class procedure SetDefaultTimeout(const AValue: Cardinal); static;
    class function ExecuteProcessInternal(const CommandLine: string;
      const TimeoutMS: Cardinal; const WorkingDirectory: string;
      const EnvironmentVariables: TStringList; out Output: string;
      out ErrorOutput: string): Boolean;
  public
    class constructor Create;
    class destructor Destroy; static;
    class function QuoteCommandLineArgument(const Argument: string): string; static;
    class function ExecuteCommand(const CommandLine: string;
      const TimeoutMS: Cardinal; const WorkingDirectory: string;
      const EnvironmentVariables: TStringList): Boolean; overload;
    // TimeoutMS = -1 uses default timeout
    // Returns boolean success, separate output and error streams
    class function ExecuteCommand(const CommandLine: string; out Output: string;
      out ErrorOutput: string; const TimeoutMS: Cardinal = Cardinal(-1);
      const WorkingDirectory: string = '';
      const EnvironmentVariables: TStringList = nil): Boolean; overload;
    // Configurable default timeout
    class property DefaultTimeoutMS: Cardinal read GetDefaultTimeout write SetDefaultTimeout;
  end;

implementation

uses
  Diagnostics; // For Process Information retrieval (future enhancement for process details)

function BuildEnvironmentBlock(const EnvList: TStringList): Pointer;
var
  I: Integer;
  EnvStr, Buffer: string;
  P: PChar;
begin
  Buffer := '';
  for I := 0 to EnvList.Count - 1 do
  begin
    EnvStr := EnvList[I];
    if EnvStr <> '' then
      Buffer := Buffer + EnvStr + #0;
  end;
  Buffer := Buffer + #0; // double terminator ends the block
  P := StrAlloc(Length(Buffer) + 1);
  // StrPCopy stops at the first #0, which would truncate the block to one entry,
  // so copy the raw characters (plus the implicit terminator) instead.
  Move(PChar(Buffer)^, P^, (Length(Buffer) + 1) * SizeOf(Char));
  Result := P; // caller releases with StrDispose
end;

{ EPProcessExecutionError }

constructor EPProcessExecutionError.Create(const AMessage: string; const CommandLine: string);
begin
  inherited Create(AMessage);
  FCommandLine := CommandLine;
  ContextInfo := 'Command Line: ' + CommandLine;
end;

constructor EPProcessExecutionError.CreateFmt(const Format: string;
  const Args: array of const; const CommandLine: string);
begin
  Create(System.SysUtils.Format(Format, Args), CommandLine);
end;

{ EPProcessCreationError }

constructor EPProcessCreationError.Create(const AMessage: string;
  const CommandLine: string; const Win32ErrorCode: Integer);
begin
  inherited Create(Format('%s (Win32 Error Code: %d, %s)',
    [AMessage, Win32ErrorCode, SysErrorMessage(Win32ErrorCode)]), CommandLine);
end;

constructor EPProcessCreationError.CreateFmt(const Format: string;
  const Args: array of const; const CommandLine: string; const Win32ErrorCode: Integer);
begin
  Create(System.SysUtils.Format(Format, Args), CommandLine, Win32ErrorCode);
end;

{ EPProcessTimeoutError }

constructor EPProcessTimeoutError.Create(const AMessage: string;
  const CommandLine: string; const TimeoutMS: Cardinal);
begin
  inherited Create(Format('%s (Timeout: %d ms)', [AMessage, TimeoutMS]), CommandLine);
end;

constructor EPProcessTimeoutError.CreateFmt(const Format: string;
  const Args: array of const; const CommandLine: string; const TimeoutMS: Cardinal);
begin
  Create(System.SysUtils.Format(Format, Args), CommandLine, TimeoutMS);
end;

{ EPProcessOutputError }

constructor EPProcessOutputError.Create(const AMessage: string; const CommandLine: string);
begin
  inherited Create(AMessage, CommandLine);
end;

constructor EPProcessOutputError.CreateFmt(const Format: string;
  const Args: array of const; const CommandLine: string);
begin
  Create(System.SysUtils.Format(Format, Args), CommandLine);
end;

{ EPProcessSignalError }

constructor EPProcessSignalError.Create(const AMessage: string;
  const CommandLine: string; const SignalCode: Integer);
begin
  // Signal codes are OS specific
  inherited Create(Format('%s (Signal Code: %d)', [AMessage, SignalCode]), CommandLine);
end;

constructor EPProcessSignalError.CreateFmt(const Format: string;
  const Args: array of const; const CommandLine: string; const SignalCode: Integer);
begin
  Create(System.SysUtils.Format(Format, Args), CommandLine, SignalCode);
end;

{ TProcessExecutor }

class constructor TProcessExecutor.Create;
begin
  FDefaultTimeoutMS := 15000; // Default timeout: 15 seconds
  FProcessExecutionLock := TCriticalSection.Create; // Initialize lock
end;

class destructor TProcessExecutor.Destroy;
begin
  FProcessExecutionLock.Free; // Free the lock
end;

class function TProcessExecutor.GetDefaultTimeout: Cardinal;
begin
  Result := FDefaultTimeoutMS;
end;

class procedure TProcessExecutor.SetDefaultTimeout(const AValue: Cardinal);
begin
  FDefaultTimeoutMS := AValue;
end;

class function TProcessExecutor.QuoteCommandLineArgument(const Argument: string): string;
begin
  // Basic quoting for command line arguments - handles spaces and quotes
  if (Pos(' ', Argument) > 0) or (Pos('"', Argument) > 0) then
  begin
    Result := '"' + StringReplace(Argument, '"', '""', [rfReplaceAll]) + '"';
  end
  else
  begin
    Result := Argument;
  end;
end;

class function TProcessExecutor.ExecuteProcessInternal(const CommandLine: string;
  const TimeoutMS: Cardinal; const WorkingDirectory: string;
  const EnvironmentVariables: TStringList; out Output: string;
  out ErrorOutput: string): Boolean;
var
  MutableCmdLine: string;
  SecurityAttr: TSecurityAttributes;
  hStdOutRead, hStdOutWrite, hStdErrRead, hStdErrWrite: THandle;
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  Buffer: array[0..4095] of UTF8Char;
  BytesRead: DWORD;
  WaitResult: DWORD;
  UseTimeout: Cardinal;
  pWorkingDir: PChar;
  EnvBlock: Pointer;
  CreationFlags: DWORD;
  LastErr: DWORD;
begin
  Result := False;
  Output := '';
  ErrorOutput := '';
  MutableCmdLine := CommandLine;
  UniqueString(MutableCmdLine);
  UseTimeout := TimeoutMS;
  if UseTimeout = Cardinal(-1) then // Use default timeout if -1 is passed
    UseTimeout := FDefaultTimeoutMS;
  // Zero the handles up front so the cleanup code never closes garbage values
  ZeroMemory(@ProcessInfo, SizeOf(ProcessInfo));
  SecurityAttr.nLength := SizeOf(TSecurityAttributes);
  SecurityAttr.bInheritHandle := True;
  SecurityAttr.lpSecurityDescriptor := nil;
  if not CreatePipe(hStdOutRead, hStdOutWrite, @SecurityAttr, 0) then
  begin
    LogError('Failed to create output pipe.');
    Exit;
  end;
  try
    if not CreatePipe(hStdErrRead, hStdErrWrite, @SecurityAttr, 0) then
    begin
      LogError('Failed to create error pipe.');
      Exit;
    end;
    try
      ZeroMemory(@StartupInfo, SizeOf(StartupInfo));
      StartupInfo.cb := SizeOf(StartupInfo);
      StartupInfo.hStdOutput := hStdOutWrite;
      StartupInfo.hStdError := hStdErrWrite;
      StartupInfo.dwFlags := STARTF_USESTDHANDLES or STARTF_USESHOWWINDOW;
      StartupInfo.wShowWindow := SW_HIDE;
      if WorkingDirectory = '' then
        pWorkingDir := nil
      else
        pWorkingDir := PChar(WorkingDirectory);
      LogDebug(Format('Executing command: %s (Timeout: %d ms)', [MutableCmdLine, UseTimeout]));
      if pWorkingDir = nil then
        LogDebug('Working directory: (inherited)')
      else
        LogDebug(Format('Working directory: %s', [pWorkingDir]));
      EnvBlock := nil;
      CreationFlags := CREATE_NO_WINDOW;
      if Assigned(EnvironmentVariables) then
      begin
        EnvBlock := BuildEnvironmentBlock(EnvironmentVariables);
        // The block is built from UTF-16 strings, so CreateProcess must be told it is Unicode
        CreationFlags := CreationFlags or CREATE_UNICODE_ENVIRONMENT;
      end;
      try
        if not CreateProcess(nil, PChar(MutableCmdLine), nil, nil, True, CreationFlags,
          EnvBlock, pWorkingDir, StartupInfo, ProcessInfo) then
        begin
          LastErr := GetLastError; // capture once, before another API call can overwrite it
          raise EPProcessCreationError.CreateFmt(
            'CreateProcess failed for command: %s (Working Directory: %s)',
            [MutableCmdLine, WorkingDirectory], MutableCmdLine, LastErr);
        end;
        // Close our copies of the write ends so ReadFile sees EOF when the child exits
        CloseHandle(hStdOutWrite);
        hStdOutWrite := 0;
        CloseHandle(hStdErrWrite);
        hStdErrWrite := 0;
        // Output is read only after the process has exited, so a child that fills
        // the pipe buffer blocks until the timeout expires.
        WaitResult := WaitForSingleObject(ProcessInfo.hProcess, UseTimeout);
        if WaitResult = WAIT_TIMEOUT then
        begin
          LogError(Format('Process timed out: %s (Timeout: %d ms)', [CommandLine, UseTimeout]));
          TerminateProcess(ProcessInfo.hProcess, 1); // Explicitly terminate the process
          raise EPProcessTimeoutError.CreateFmt('Process timed out: %s (Timeout: %d ms)',
            [CommandLine, UseTimeout], CommandLine, UseTimeout);
        end;
        if WaitResult = WAIT_FAILED then
        begin
          LastErr := GetLastError;
          raise EPProcessCreationError.CreateFmt('WaitForSingleObject failed with error code: %d',
            [LastErr], MutableCmdLine, LastErr);
        end;
        while ReadFile(hStdOutRead, Buffer, SizeOf(Buffer) - 1, BytesRead, nil) and (BytesRead > 0) do
        begin
          Buffer[BytesRead] := #0;
          Output := Output + UTF8ToString(Buffer); // Convert UTF-8 bytes to string
        end;
        while ReadFile(hStdErrRead, Buffer, SizeOf(Buffer) - 1, BytesRead, nil) and (BytesRead > 0) do
        begin
          Buffer[BytesRead] := #0;
          ErrorOutput := ErrorOutput + UTF8ToString(Buffer); // Convert UTF-8 bytes to string
        end;
        Result := True;
      finally
        // The block comes from StrAlloc, so StrDispose is the only matching release
        if Assigned(EnvBlock) then
          StrDispose(PChar(EnvBlock));
        if ProcessInfo.hProcess <> 0 then
          CloseHandle(ProcessInfo.hProcess);
        if ProcessInfo.hThread <> 0 then
          CloseHandle(ProcessInfo.hThread);
      end;
    finally
      CloseHandle(hStdErrRead);
      if hStdErrWrite <> 0 then
        CloseHandle(hStdErrWrite);
    end;
  finally
    CloseHandle(hStdOutRead);
    if hStdOutWrite <> 0 then
      CloseHandle(hStdOutWrite);
  end;
end;

class function TProcessExecutor.ExecuteCommand(const CommandLine: string;
  const TimeoutMS: Cardinal; const WorkingDirectory: string;
  const EnvironmentVariables: TStringList): Boolean;
var
  Output: string;
  ErrorOutput: string;
begin
  // Thread safety - acquire lock for multi-threaded scenarios
  FProcessExecutionLock.Acquire;
  try
    Result := ExecuteProcessInternal(CommandLine, TimeoutMS, WorkingDirectory,
      EnvironmentVariables, Output, ErrorOutput);
  finally
    FProcessExecutionLock.Release;
  end;
end;

class function TProcessExecutor.ExecuteCommand(const CommandLine: string;
  out Output: string; out ErrorOutput: string; const TimeoutMS: Cardinal;
  const WorkingDirectory: string; const EnvironmentVariables: TStringList): Boolean;
var
  MutableCmdLine: string; // Local mutable copy
  SecurityAttr: TSecurityAttributes;
  hStdOutRead, hStdOutWrite, hStdErrRead, hStdErrWrite: THandle;
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  Buffer: array[0..4095] of UTF8Char;
  BytesRead: DWORD;
  WaitResult: DWORD;
  UseTimeout: Cardinal;
  pWorkingDir: PChar;
  ErrorMsg: string;
  EnvBlock: Pointer;
begin
  // Create mutable copy of command line
  MutableCmdLine := CommandLine;
  UniqueString(MutableCmdLine); // Ensure exclusive ownership
  Result := False; // Assume failure initially
  Output := '';
  ErrorOutput := '';
  UseTimeout := TimeoutMS;
  if UseTimeout = Cardinal(-1) then // Use default timeout if -1 is passed
    UseTimeout := FDefaultTimeoutMS;
  SecurityAttr.nLength := SizeOf(TSecurityAttributes);
  SecurityAttr.bInheritHandle := True;
  SecurityAttr.lpSecurityDescriptor := nil;
  if not CreatePipe(hStdOutRead, hStdOutWrite, @SecurityAttr, 0) then
  begin
    ErrorMsg := Format('ExecuteCommand (Output/Error Overload): Failed to create output pipe for command: %s. Win32 Error: %d',
      [CommandLine, GetLastError()]);
    LogError(ErrorMsg);
    Exit; // Exit on pipe creation failure
  end;
  if not CreatePipe(hStdErrRead, hStdErrWrite, @SecurityAttr, 0) then
  begin
    ErrorMsg := Format('ExecuteCommand (Output/Error Overload): Failed to create error pipe for command: %s. Win32 Error: %d',
      [CommandLine, GetLastError()]);
    CloseHandle(hStdOutRead);
    CloseHandle(hStdOutWrite);
    LogError(ErrorMsg);
    Exit; // Exit on pipe creation failure
  end;
  // Zero the handles up front so the cleanup code never closes garbage values
  ZeroMemory(@ProcessInfo, SizeOf(ProcessInfo));
  try
    ZeroMemory(@StartupInfo, SizeOf(StartupInfo));
    StartupInfo.cb := SizeOf(StartupInfo);
    StartupInfo.hStdOutput := hStdOutWrite;
    StartupInfo.hStdError := hStdErrWrite;
    StartupInfo.dwFlags := STARTF_USESTDHANDLES or STARTF_USESHOWWINDOW;
    StartupInfo.wShowWindow := SW_HIDE;
    // Properly handle working directory
    if WorkingDirectory = '' then
      pWorkingDir := nil
    else
      pWorkingDir := PChar(WorkingDirectory);
    // Optional debug logging
    LogDebug(Format('Executing command: %s (Timeout: %d ms)', [CommandLine, UseTimeout]));
    if pWorkingDir = nil then
      LogDebug('Working directory: (inherited)')
    else
      LogDebug(Format('Working directory: %s', [pWorkingDir]));
    begin
      var CreationFlags: DWORD := CREATE_NO_WINDOW;
      EnvBlock := nil;
      if Assigned(EnvironmentVariables) then
      begin
        EnvBlock := BuildEnvironmentBlock(EnvironmentVariables);
        // The block is built from UTF-16 strings, so CreateProcess must be told it is Unicode
        CreationFlags := CreationFlags or CREATE_UNICODE_ENVIRONMENT;
      end;
      try
        if not CreateProcess(nil, PChar(MutableCmdLine), nil, nil, True, CreationFlags,
          EnvBlock, pWorkingDir, StartupInfo, ProcessInfo) then
        begin
          var LastErr: DWORD := GetLastError; // capture before logging can overwrite it
          ErrorMsg := Format('ExecuteCommand (Output/Error Overload): CreateProcess failed for command: %s. Win32 Error: %d',
            [MutableCmdLine, LastErr]);
          LogError(ErrorMsg);
          raise EPProcessCreationError.CreateFmt('CreateProcess failed for command: %s',
            [MutableCmdLine], MutableCmdLine, LastErr);
        end;
      finally
        // The block comes from StrAlloc, so release it with StrDispose
        if Assigned(EnvBlock) then
          StrDispose(PChar(EnvBlock));
      end;
    end;
    // Close our copies of the write ends so ReadFile sees EOF when the child exits
    CloseHandle(hStdOutWrite);
    hStdOutWrite := 0;
    CloseHandle(hStdErrWrite);
    hStdErrWrite := 0;
    WaitResult := WaitForSingleObject(ProcessInfo.hProcess, UseTimeout);
    if WaitResult = WAIT_TIMEOUT then
    begin
      ErrorMsg := Format('ExecuteCommand (Output/Error Overload): Process timed out for command: %s after %d ms.',
        [CommandLine, UseTimeout]);
      LogError(ErrorMsg);
      TerminateProcess(ProcessInfo.hProcess, 1); // Do not leave the child running after a timeout
      raise EPProcessTimeoutError.Create('Process timed out.', CommandLine, UseTimeout);
    end;
    if WaitResult = WAIT_FAILED then
    begin
      var WaitErr: DWORD := GetLastError; // capture before logging can overwrite it
      ErrorMsg := Format('ExecuteCommand (Output/Error Overload): WaitForSingleObject failed for command: %s. Win32 Error: %d',
        [CommandLine, WaitErr]);
      LogError(ErrorMsg);
      raise EPProcessCreationError.CreateFmt('WaitForSingleObject failed with error code: %d',
        [WaitErr], CommandLine, WaitErr);
    end;
    Output := '';
    while ReadFile(hStdOutRead, Buffer, SizeOf(Buffer) - 1, BytesRead, nil) and (BytesRead > 0) do
    begin
      Buffer[BytesRead] := #0;
      Output := Output + UTF8ToString(Buffer);
    end;
    ErrorOutput := '';
    while ReadFile(hStdErrRead, Buffer, SizeOf(Buffer) - 1, BytesRead, nil) and (BytesRead > 0) do
    begin
      Buffer[BytesRead] := #0;
      ErrorOutput := ErrorOutput + UTF8ToString(Buffer); // Decode stderr the same way as stdout
    end;
    if ErrorOutput <> '' then // Log stderr output if any (as warning)
    begin
      ErrorMsg := Format('ExecuteCommand (Output/Error Overload): XPDF Command Warning (stderr) for command: %s: %s',
        [CommandLine, ErrorOutput]);
      LogWarning(ErrorMsg);
    end;
    Result := True; // Process executed successfully (even with warnings on stderr)
  finally
    if ProcessInfo.hProcess <> 0 then
      CloseHandle(ProcessInfo.hProcess);
    if ProcessInfo.hThread <> 0 then
      CloseHandle(ProcessInfo.hThread);
    if hStdOutWrite <> 0 then
      CloseHandle(hStdOutWrite);
    if hStdErrWrite <> 0 then
      CloseHandle(hStdErrWrite);
    CloseHandle(hStdOutRead);
    CloseHandle(hStdErrRead);
  end;
end;

end.
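`QuoteCommandLineArgument` above doubles embedded quotes but leaves backslashes untouched, so an argument that ends in `\` (a directory passed to `-o`, for example) puts a backslash right before the closing quote, and a program that parses its command line with the Microsoft C runtime rules reads that pair as an escaped quote. Below is a minimal sketch of quoting under those rules, assuming the target executables parse arguments that way. `QuoteArgMsvc` is a hypothetical name, not part of the original unit.

```pascal
program QuoteArgSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: quotes one argument for the Microsoft C runtime parser.
function QuoteArgMsvc(const Arg: string): string;
var
  I, Backslashes: Integer;
begin
  // Non-empty arguments without spaces, tabs or quotes need no quoting.
  if (Arg <> '') and (LastDelimiter(' '#9'"', Arg) = 0) then
    Exit(Arg);
  Result := '"';
  Backslashes := 0;
  for I := 1 to Length(Arg) do
  begin
    if Arg[I] = '\' then
      Inc(Backslashes) // hold backslashes until we know what follows them
    else
    begin
      if Arg[I] = '"' then
        // Double the pending backslashes and add one more to escape the quote.
        Result := Result + StringOfChar('\', Backslashes * 2 + 1)
      else
        // Backslashes not followed by a quote are literal.
        Result := Result + StringOfChar('\', Backslashes);
      Result := Result + Arg[I];
      Backslashes := 0;
    end;
  end;
  // Double trailing backslashes so they cannot escape the closing quote.
  Result := Result + StringOfChar('\', Backslashes * 2) + '"';
end;

begin
  Writeln(QuoteArgMsvc('C:\out dir\'));
  Writeln(QuoteArgMsvc('say "hi"'));
end.
```

Holding the backslash count until the next character is known is what lets the same loop treat `\` as literal in ordinary paths and as an escape only in front of a quote.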
uXPDFWrapper.pas:
unit uXPDFWrapper;

interface

uses
  System.SysUtils, System.StrUtils, System.Classes, System.Types,
  uProcessExecution, uFileSystemUtilities, uErrorHandling;

type
  TXPDFWrapper = class
  public
    class function ExtractTextFromPDF(const PDFPath: string; out TextContent: string): Boolean;
    class function GetPDFVersion(const PDFPath: string; out Version: string): Boolean;
    class function ExtractMetadata(const PDFPath: string; out Metadata: TStringList): Boolean;
  end;

  // Forward declaration for TXPDFRCParser (to avoid circular dependency if needed)
  //TXPDFRCParser = class;

  // ------------------------ Text Extraction - pdftotext.exe ------------------------
  TPDFTextExtractor = class
  public
    class function ExtractText(const PdfFilePath: string; const Options: TStringList = nil): string; static;
  end;

  // ------------------------ PDF Information - pdfinfo.exe ------------------------
  TPDFInfoRetriever = class
  public
    class function GetPDFInfo(const PdfFilePath: string; const Options: TStringList = nil): string; static;
    class function GetPageSizes(const PdfFilePath: string; const FirstPage: Integer = 0;
      const LastPage: Integer = 0; const Options: TStringList = nil): string; static;
    class function GetPageBoxes(const PdfFilePath: string; const FirstPage: Integer = 0;
      const LastPage: Integer = 0; const Options: TStringList = nil): string; static;
    class function GetMetadata(const PdfFilePath: string; const Options: TStringList = nil): string; static;
    class function GetRawDates(const PdfFilePath: string; const Options: TStringList = nil): string; static;
  end;

  // ------------------------ PDF Images Extraction - pdfimages.exe ------------------------
  TPDFImagesExtractor = class
  public
    class function ExtractImages(const PdfFilePath: string; const ImageRoot: string;
      const Options: TStringList = nil): Boolean; static;
    class function ListImages(const PdfFilePath: string; const Options: TStringList = nil): string; static;
    class function ExtractRawImages(const PdfFilePath: string; const ImageRoot: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF File Attachment Detachment - pdfdetach.exe ------------------------
  TPDFFileDetacher = class
  public
    class function ListAttachments(const PdfFilePath: string; const Options: TStringList = nil): string; static;
    class function SaveAttachment(const PdfFilePath: string; const AttachmentNumber: Integer;
      const OutputPath: string; const Options: TStringList = nil): Boolean; static;
    class function SaveAllAttachments(const PdfFilePath: string; const OutputDir: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF to PPM Conversion - pdftoppm.exe ------------------------
  TPDFToPPMConverter = class
  public
    class function ConvertToPPM(const PdfFilePath: string; const PPMRoot: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF to PNG Conversion - pdftopng.exe ------------------------
  TPDFToPNGConverter = class
  public
    class function ConvertToPNG(const PdfFilePath: string; const PNGRoot: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF to HTML Conversion - pdftohtml.exe ------------------------
  TPDFToHTMLConverter = class
  public
    class function ConvertToHTML(const PdfFilePath: string; const HTMLDir: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF to PostScript Conversion - pdftops.exe ------------------------
  TPDFToPSConverter = class
  public
    class function ConvertToPS(const PdfFilePath: string; const PSFilePath: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF Fonts Analyzer - pdffonts.exe ------------------------
  TPDFFontsAnalyzer = class
  public
    class function GetFontsInfo(const PdfFilePath: string; const Options: TStringList = nil): string; static;
  end;

  // ------------------------ PDF Separate - pdfseparate.exe ------------------------
  TPDFSeparate = class
  public
    class function SeparatePages(const PdfFilePath: string; const OutputDir: string;
      const OutputRoot: string; const FirstPage: Integer = 0; const LastPage: Integer = 0;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF Unite - pdfunite.exe ------------------------
  TPDFUnite = class
  public
    class function UnitePDFs(const InputPDFPaths: TStringDynArray; const OutputFilePath: string;
      const Options: TStringList = nil): Boolean; static;
  end;

  // ------------------------ PDF to Cairo (Images/PS/SVG) - pdftocairo.exe ------------------------
  TPDFToCairoConverter = class
  public
    class function ConvertToCairo(const PdfFilePath: string; const OutputPath: string;
      const OutputFormat: string; const Options: TStringList = nil): Boolean; static;
  end;

implementation

uses
  IOUtils;

const
  TIMEOUT = 2000; // milliseconds, passed as TimeoutMS to TProcessExecutor

{ Helper function to sanitize output strings.
  This removes a byte order mark (if any) and can also replace control characters.
  You may uncomment the replacement of form feed characters if desired. }
function SanitizeOutput(const Input: string): string;
begin
  // Output decoded with UTF8ToString carries the BOM as the single character U+FEFF.
  Result := StringReplace(Input, #$FEFF, '', [rfReplaceAll]);
  // Also remove the three-character form left behind when the bytes were widened without UTF-8 decoding.
  Result := StringReplace(Result, #$EF#$BB#$BF, '', [rfReplaceAll]);
  // Optionally, replace form feed (#$0C) with a visible marker.
  // Uncomment the next line to replace form feed with a page break marker.
  // Result := StringReplace(Result, #$0C, sLineBreak + '--- Page Break ---' + sLineBreak, [rfReplaceAll]);
end;

// Helper function to convert open array to TArray<TVarRec>
function ToTVarRecArray(const Args: array of const): TArray<TVarRec>;
var
  I: Integer;
begin
  SetLength(Result, Length(Args));
  // Since the items in array of const are already of type TVarRec,
  // we can copy them directly.
for I := Low(Args) to High(Args) do Result[I] := Args[I]; end; // --- Helper Function --- function BuildCommandLine(const ExecutableName: string; const PdfFilePath: string; const Options: TStringList): string; overload; begin Result := TProcessExecutor.QuoteCommandLineArgument(ExecutableName + '.exe'); // Quote the executable name if Assigned(Options) then begin for var Option in Options do begin Result := Result + ' ' + Option; // Options are ALREADY quoted in most calls (from API level). end; end; Result := Result + ' ' + TProcessExecutor.QuoteCommandLineArgument(PdfFilePath); // Quote the PDF file path end; function BuildCommandLine(const ExecutableName: string; const InputFiles: TStringDynArray; const OutputFile: string; const Options: TStringList): string; overload; var InputFilePath: string; begin Result := TProcessExecutor.QuoteCommandLineArgument(ExecutableName + '.exe'); // Quote the executable if Assigned(Options) then begin for var Option in Options do begin Result := Result + ' ' + Option; // Options are ALREADY quoted in most calls. 
end; end; for InputFilePath in InputFiles do begin Result := Result + ' ' + TProcessExecutor.QuoteCommandLineArgument(InputFilePath); // Quote each input file path end; Result := Result + ' ' + TProcessExecutor.QuoteCommandLineArgument(OutputFile); // Quote the output file path end; function BuildCommandLine(const ExecutableName: string; const PdfFilePath: string; const OutputPath: string; const Options: TStringList): string; overload; begin Result := TProcessExecutor.QuoteCommandLineArgument(ExecutableName + '.exe'); // Quote the executable if Assigned(Options) then begin for var Option in Options do begin Result := Result + ' ' + Option; end; end; Result := Result + ' ' + TProcessExecutor.QuoteCommandLineArgument(PdfFilePath) + ' ' + TProcessExecutor.QuoteCommandLineArgument(OutputPath); end; class function TXPDFWrapper.ExtractTextFromPDF(const PDFPath: string; out TextContent: string): Boolean; var CommandLine, Output, ErrorOutput, ExecutablePath: string; begin Result := False; TextContent := ''; // Check if pdftotext exists and get its full path if not TFileChecker.CheckXPDFExecutableExists('pdftotext', []) then begin LogError('pdftotext.exe not found.'); Exit; end; ExecutablePath := TFileChecker.GetFullPath; // Add "-enc UTF-8" to force UTF-8 output CommandLine := TProcessExecutor.QuoteCommandLineArgument(ExecutablePath) + ' -q -enc UTF-8 ' + TProcessExecutor.QuoteCommandLineArgument(PDFPath) + ' -'; if TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput) then begin TextContent := Output; Result := True; end else LogErrorFmt('Text extraction failed. Output: [%s], Error: [%s]', ToTVarRecArray([Output, ErrorOutput])); end; class function TXPDFWrapper.GetPDFVersion(const PDFPath: string; out Version: string): Boolean; var CommandLine, Output, ErrorOutput: string; Lines: TStringList; I: Integer; Line, Temp: string; begin Result := False; Version := ''; // Ensure that pdfinfo.exe is available. 
if not TFileChecker.CheckXPDFExecutableExists('pdfinfo', []) then begin LogError('pdfinfo.exe not found.'); Exit; end; // Build command line to run pdfinfo.exe on the target PDF. CommandLine := TProcessExecutor.QuoteCommandLineArgument(TFileChecker.GetFullPath) + ' ' + TProcessExecutor.QuoteCommandLineArgument(PDFPath); if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput) then begin LogError('Failed to get PDF version: ' + ErrorOutput); Exit; end; // Parse the output line-by-line to find the version. Lines := TStringList.Create; try Lines.Text := Output; for I := 0 to Lines.Count - 1 do begin Line := Trim(Lines[I]); // Check in a case-insensitive manner whether the line starts with "PDF version:". if StartsText('PDF version:', Line) then begin // Extract the version string that follows the colon. Temp := Copy(Line, Length('PDF version:') + 1, MaxInt); Version := Trim(Temp); // Remove any control characters (like form feed) if present. Version := StringReplace(Version, #$0C, '', [rfReplaceAll]); Result := True; Exit; end; end; finally Lines.Free; end; // Fallback: if the version was not found in standard output, try the error output. if not Result and (ErrorOutput <> '') then begin Lines := TStringList.Create; try Lines.Text := ErrorOutput; for I := 0 to Lines.Count - 1 do begin Line := Trim(Lines[I]); if StartsText('PDF version:', Line) then begin Temp := Copy(Line, Length('PDF version:') + 1, MaxInt); Version := Trim(Temp); Version := StringReplace(Version, #$0C, '', [rfReplaceAll]); Result := True; Exit; end; end; finally Lines.Free; end; end; if Version = '' then LogError('Could not determine PDF version from output.'); end; class function TXPDFWrapper.ExtractMetadata(const PDFPath: string; out Metadata: TStringList): Boolean; var CommandLine, Output, ErrorOutput: string; begin Result := False; Metadata := TStringList.Create; // Ensure that pdfinfo.exe is available. 
if not TFileChecker.CheckXPDFExecutableExists('pdfinfo', []) then begin LogError('pdfinfo.exe not found.'); Metadata.Free; Metadata := nil; Exit; end; // Build the command line with the -meta option. CommandLine := TProcessExecutor.QuoteCommandLineArgument(TFileChecker.GetFullPath) + ' -meta ' + TProcessExecutor.QuoteCommandLineArgument(PDFPath); if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput) then begin LogError('Failed to extract metadata: ' + ErrorOutput); Metadata.Free; Metadata := nil; Exit; end; // Sanitize metadata text to remove any BOM and unwanted control characters. Metadata.Text := SanitizeOutput(Output); Result := True; end; // ------------------------ Text Extraction - pdftotext.exe ------------------------ { TPDFTextExtractor } class function TPDFTextExtractor.ExtractText(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdftotext', []); // Dependency check // Pass '-' as the text-file argument so pdftotext writes to stdout, as TXPDFWrapper.ExtractTextFromPDF does. CommandLine := BuildCommandLine('pdftotext', PdfFilePath, '-', Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to extract text: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error extracting text from PDF: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] // 3 arguments in the array ); end; end;
// ------------------------ PDF Information - pdfinfo.exe ------------------------ { TPDFInfoRetriever } class function TPDFInfoRetriever.GetPDFInfo(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfinfo', []); // Dependency check CommandLine := BuildCommandLine('pdfinfo', PdfFilePath, Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get PDF info: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting PDF info for: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] // 3 arguments in the array ); end; end; class function TPDFInfoRetriever.GetPageSizes(const PdfFilePath: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfinfo', []); // Dependency check LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); if FirstPage > 0 then LocalOptions.Add('-f ' + IntToStr(FirstPage)); if LastPage > 0 then LocalOptions.Add('-l ' + IntToStr(LastPage)); CommandLine := BuildCommandLine('pdfinfo', PdfFilePath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get page sizes: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting page sizes for: %s. Details: %s. 
Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end; class function TPDFInfoRetriever.GetPageBoxes(const PdfFilePath: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfinfo', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-box'); if FirstPage > 0 then LocalOptions.Add('-f ' + IntToStr(FirstPage)); if LastPage > 0 then LocalOptions.Add('-l ' + IntToStr(LastPage)); CommandLine := BuildCommandLine('pdfinfo', PdfFilePath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get page boxes: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting page boxes for: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end; class function TPDFInfoRetriever.GetMetadata(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfinfo', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-meta'); CommandLine := BuildCommandLine('pdfinfo', PdfFilePath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get metadata: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting metadata for: %s. Details: %s. 
Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end; class function TPDFInfoRetriever.GetRawDates(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfinfo', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-rawdates'); CommandLine := BuildCommandLine('pdfinfo', PdfFilePath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get raw dates: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting raw dates for: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end; // ------------------------ PDF Images Extraction - pdfimages.exe ------------------------ { TPDFImagesExtractor } class function TPDFImagesExtractor.ExtractImages(const PdfFilePath: string; const ImageRoot: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfimages', []); // Dependency check CommandLine := BuildCommandLine('pdfimages', PdfFilePath, ImageRoot, Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT) then LogError('Failed to extract images: ' + ErrorOutput); Result := True; except on E: Exception do begin LogErrorFmt('Error extracting images from PDF: %s to root: %s. 
Details: %s', ToTVarRecArray([PdfFilePath, ImageRoot, E.Message])); Result := False; end; end; end; class function TPDFImagesExtractor.ListImages(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfimages', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-list'); CommandLine := BuildCommandLine('pdfimages', PdfFilePath, '', LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to list images: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error listing images in PDF: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end; class function TPDFImagesExtractor.ExtractRawImages(const PdfFilePath: string; const ImageRoot: string; const Options: TStringList): Boolean; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfimages', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-raw'); CommandLine := BuildCommandLine('pdfimages', PdfFilePath, ImageRoot, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT) then LogError('Failed to extract raw images: ' + ErrorOutput); Result := True; except on E: Exception do begin LogErrorFmt('Error extracting raw images from PDF: %s to root: %s. 
Details: %s', ToTVarRecArray([PdfFilePath, ImageRoot, E.Message])); Result := False; end; end; finally LocalOptions.Free; end; end;
// ------------------------ PDF File Attachment Detachment - pdfdetach.exe ------------------------ { TPDFFileDetacher } class function TPDFFileDetacher.ListAttachments(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdfdetach', []); // Dependency check LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-list'); // pdfdetach takes a single PDF-file argument, so use the overload without an output path. CommandLine := BuildCommandLine('pdfdetach', PdfFilePath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to list attachments: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error listing attachments in PDF: %s. Details: %s. Command Line: %s', [PdfFilePath, E.Message, CommandLine] ); end; finally LocalOptions.Free; end; end;
class function TPDFFileDetacher.SaveAttachment(const PdfFilePath: string; const AttachmentNumber: Integer; const OutputPath: string; const Options: TStringList): Boolean; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfdetach', []); // Dependency check LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-save ' + IntToStr(AttachmentNumber)); // Options are appended verbatim by BuildCommandLine, so quote the path here. LocalOptions.Add('-o ' + TProcessExecutor.QuoteCommandLineArgument(OutputPath)); CommandLine := BuildCommandLine('pdfdetach', PdfFilePath, LocalOptions); // Report the real outcome instead of always returning True. Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to save attachment: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error saving attachment %d from PDF: %s to path: %s. Details: %s', ToTVarRecArray([AttachmentNumber, PdfFilePath, OutputPath, E.Message])); Result := False; end; end; finally LocalOptions.Free; end; end;
class function TPDFFileDetacher.SaveAllAttachments(const PdfFilePath: string; const OutputDir: string; const Options: TStringList): Boolean; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfdetach', []); // Dependency check LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-saveall'); // Options are appended verbatim by BuildCommandLine, so quote the directory here. LocalOptions.Add('-o ' + TProcessExecutor.QuoteCommandLineArgument(OutputDir)); CommandLine := BuildCommandLine('pdfdetach', PdfFilePath, LocalOptions); Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to save all attachments: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error saving all attachments from PDF: %s to dir: %s. Details: %s', ToTVarRecArray([PdfFilePath, OutputDir, E.Message])); Result := False; end; end; finally LocalOptions.Free; end; end;
// ------------------------ PDF to PPM Conversion - pdftoppm.exe ------------------------ { TPDFToPPMConverter } class function TPDFToPPMConverter.ConvertToPPM(const PdfFilePath: string; const PPMRoot: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdftoppm', []); // Dependency check CommandLine := BuildCommandLine('pdftoppm', PdfFilePath, PPMRoot, Options); try Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to convert to PPM: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error converting PDF to PPM: %s to root: %s. 
Details: %s', ToTVarRecArray([PdfFilePath, PPMRoot, E.Message])); Result := False; end; end; end;
// ------------------------ PDF to PNG Conversion - pdftopng.exe ------------------------ { TPDFToPNGConverter } class function TPDFToPNGConverter.ConvertToPNG(const PdfFilePath: string; const PNGRoot: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdftopng', []); // Dependency check CommandLine := BuildCommandLine('pdftopng', PdfFilePath, PNGRoot, Options); try Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to convert to PNG: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error converting PDF to PNG: %s to root: %s. Details: %s', ToTVarRecArray([PdfFilePath, PNGRoot, E.Message])); Result := False; end; end; end;
// ------------------------ PDF to HTML Conversion - pdftohtml.exe ------------------------ { TPDFToHTMLConverter } class function TPDFToHTMLConverter.ConvertToHTML(const PdfFilePath: string; const HTMLDir: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdftohtml', []); // Dependency check CommandLine := BuildCommandLine('pdftohtml', PdfFilePath, HTMLDir, Options); try Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to convert to HTML: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error converting PDF to HTML: %s to dir: %s. 
Details: %s', ToTVarRecArray([PdfFilePath, HTMLDir, E.Message])); Result := False; end; end; end; // ------------------------ PDF to PostScript Conversion - pdftops.exe ------------------------ { TPDFToPSConverter } class function TPDFToPSConverter.ConvertToPS(const PdfFilePath: string; const PSFilePath: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdftops', []); // Dependency check CommandLine := BuildCommandLine('pdftops', PdfFilePath, PSFilePath, Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT) then LogError('Failed to convert to PS: ' + ErrorOutput); Result := True; except on E: Exception do begin LogErrorFmt('Error converting PDF to PostScript: %s to path: %s. Details: %s', ToTVarRecArray([PdfFilePath, PSFilePath, E.Message])); Result := False; end; end; end; // ------------------------ PDF Fonts Analyzer - pdffonts.exe ------------------------ { TPDFFontsAnalyzer } class function TPDFFontsAnalyzer.GetFontsInfo(const PdfFilePath: string; const Options: TStringList): string; var CommandLine: string; ErrorOutput: string; begin TFileChecker.CheckXPDFExecutableExists('pdffonts', []); // Dependency check CommandLine := BuildCommandLine('pdffonts', PdfFilePath, '', Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Result, ErrorOutput, TIMEOUT) then LogError('Failed to get fonts info: ' + ErrorOutput); except on E: Exception do raise EPDFProcessException.CreateFmt( 'Error getting fonts info for PDF: %s. Details: %s. 
Command Line: %s', [PdfFilePath, E.Message, CommandLine] // 3 arguments in the array ); end; end;
// ------------------------ PDF Separate - pdfseparate.exe ------------------------ { TPDFSeparate } class function TPDFSeparate.SeparatePages(const PdfFilePath: string; const OutputDir: string; const OutputRoot: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): Boolean; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfseparate', []); // Dependency check LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); if FirstPage > 0 then LocalOptions.Add('-f ' + IntToStr(FirstPage)); if LastPage > 0 then LocalOptions.Add('-l ' + IntToStr(LastPage)); CommandLine := BuildCommandLine('pdfseparate', PdfFilePath, IncludeTrailingPathDelimiter(OutputDir) + OutputRoot + '-%d.pdf', LocalOptions); Result := TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT); if not Result then LogError('Failed to separate pages: ' + ErrorOutput); except on E: Exception do begin LogErrorFmt('Error separating pages of PDF: %s to dir: %s, root: %s. 
Details: %s', ToTVarRecArray([PdfFilePath, OutputDir, OutputRoot, E.Message])); Result := False; end; end; finally LocalOptions.Free; end; end; // ------------------------ PDF Unite - pdfunite.exe ------------------------ { TPDFUnite } class function TPDFUnite.UnitePDFs(const InputPDFPaths: TStringDynArray; const OutputFilePath: string; const Options: TStringList): Boolean; var CommandLine: string; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdfunite', []); // Dependency check CommandLine := BuildCommandLine('pdfunite', InputPDFPaths, OutputFilePath, Options); try if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT) then LogError('Failed to unite PDFs: ' + ErrorOutput); Result := True; except on E: Exception do begin LogErrorFmt('Error uniting PDFs to output: %s. Details: %s', ToTVarRecArray([OutputFilePath, E.Message])); Result := False; end; end; end; // ------------------------ PDF to Cairo (Images/PS/SVG) - pdftocairo.exe ------------------------ { TPDFToCairoConverter } class function TPDFToCairoConverter.ConvertToCairo(const PdfFilePath: string; const OutputPath: string; const OutputFormat: string; const Options: TStringList): Boolean; var CommandLine: string; LocalOptions: TStringList; ErrorOutput: string; Output: string; begin TFileChecker.CheckXPDFExecutableExists('pdftocairo', []); // Verificação de dependência LocalOptions := TStringList.Create; try try if Assigned(Options) then LocalOptions.AddStrings(Options); LocalOptions.Add('-' + LowerCase(OutputFormat)); // e.g., -png, -ps, -svg CommandLine := BuildCommandLine('pdftocairo', PdfFilePath, OutputPath, LocalOptions); if not TProcessExecutor.ExecuteCommand(CommandLine, Output, ErrorOutput, TIMEOUT) then LogError('Failed to convert to cairo: ' + ErrorOutput); Result := True; except on E: Exception do begin LogErrorFmt('Error converting PDF to Cairo format (%s): %s to path: %s. 
Details: %s', ToTVarRecArray([OutputFormat, PdfFilePath, OutputPath, E.Message])); Result := False; end; end; finally LocalOptions.Free; end; end; end.
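BuildCommandLine appends every entry of the options list verbatim, so any option that carries a path has to be quoted before it goes into the list; otherwise a path containing spaces would reach the tool as several arguments. The following is a minimal sketch of one way to do that. `QuoteIfNeeded` and `AddPathOption` are hypothetical helpers introduced only for this illustration (the unit's own `TProcessExecutor.QuoteCommandLineArgument` could play the same role), and the sample path is made up.

```pascal
program QuotedOptionSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper, not part of the units above: wraps a value in double
// quotes when it is empty or contains a space or tab.
// Assumption: the value itself contains no double quotes.
function QuoteIfNeeded(const Value: string): string;
begin
  if (Value = '') or (Pos(' ', Value) > 0) or (Pos(#9, Value) > 0) then
    Result := '"' + Value + '"'
  else
    Result := Value;
end;

// Hypothetical helper: stores "<switch> <quoted path>" as a single list entry,
// which is the shape BuildCommandLine expects for pre-quoted options.
procedure AddPathOption(Options: TStrings; const Switch, Path: string);
begin
  Options.Add(Switch + ' ' + QuoteIfNeeded(Path));
end;

var
  Options: TStringList;
begin
  Options := TStringList.Create;
  try
    Options.Add('-save 1');
    AddPathOption(Options, '-o', 'C:\My Reports\invoice.xml');
    Writeln(Options[0]); // prints: -save 1
    Writeln(Options[1]); // prints: -o "C:\My Reports\invoice.xml"
  finally
    Options.Free;
  end;
end.
```

One caveat with this simple approach: under the Microsoft C runtime's argument parsing, a backslash directly before a closing double quote escapes the quote, so a directory path that ends in `\` should have the trailing delimiter removed (or doubled) before quoting.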
New unit to implement (uPDFPowerToolsAPI.pas):
unit uPDFPowerToolsAPI; interface uses System.SysUtils, System.Classes, uXPDFWrapper, System.Types, uErrorHandling; procedure ValidatePdfFilePath(const PdfFilePath: string); type // If no declarations are needed, something trivial can be declared: TDummy = pointer; // Forward declaration to avoid circular dependency if needed //TXPDFRCParser = class; // ------------------------ API Function Declarations ------------------------ // --- Text Extraction --- function ExtractTextFromPDF(const PdfFilePath: string; const LayoutMode: string = ''; const Options: TStringList = nil): string; function ExtractTextFromPDFLayout(const PdfFilePath: string; const Options: TStringList = nil): string; // Layout Mode function ExtractTextFromPDFSimple(const PdfFilePath: string; const Options: TStringList = nil): string; // Simple Mode function ExtractTextFromPDFSimple2(const PdfFilePath: string; const Options: TStringList = nil): string; // Simple2 Mode function ExtractTextFromPDFTable(const PdfFilePath: string; const Options: TStringList = nil): string; // Table Mode function ExtractTextFromPDFLinePrinter(const PdfFilePath: string; const Options: TStringList = nil): string; // LinePrinter Mode function ExtractTextFromPDFRawOrder(const PdfFilePath: string; const Options: TStringList = nil): string; // Raw Order Mode // --- PDF Information Retrieval --- function GetPDFDocumentInfo(const PdfFilePath: string; const Options: TStringList = nil): string; function GetPDFPageSizes(const PdfFilePath: string; const FirstPage: Integer = 0; const LastPage: Integer = 0; const Options: TStringList = nil): string; function GetPDFPageBoxes(const PdfFilePath: string; const FirstPage: Integer = 0; const LastPage: Integer = 0; const Options: TStringList = nil): string; function GetPDFDocumentMetadata(const PdfFilePath: string; const Options: TStringList = nil): string; function GetPDFRawDateStrings(const PdfFilePath: string; const Options: TStringList = nil): string; // --- Image Extraction --- function 
ExtractImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string; const AsJPEG: Boolean = False; const Options: TStringList = nil): Boolean; function ListImagesInPDF(const PdfFilePath: string; const Options: TStringList = nil): string; function ExtractRawImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList = nil): Boolean; // --- Attachment Handling --- function ListPDFAttachments(const PdfFilePath: string; const Options: TStringList = nil): string; function SavePDFAttachmentToFile(const PdfFilePath: string; const AttachmentNumber: Integer; const OutputFilePath: string; const Options: TStringList = nil): Boolean; function SaveAllPDFAttachmentsToDirectory(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList = nil): Boolean; // --- PDF Conversion --- function ConvertPDFToPPM(const PdfFilePath: string; const OutputRootFileName: string; const Options: TStringList = nil): Boolean; function ConvertPDFToPNG(const PdfFilePath: string; const OutputRootFileName: string; const Options: TStringList = nil): Boolean; function ConvertPDFToHTML(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList = nil): Boolean; function ConvertPDFToPostScript(const PdfFilePath: string; const OutputFilePath: string; const Options: TStringList = nil): Boolean; function ConvertPDFToCairoFormat(const PdfFilePath: string; const OutputPath: string; const OutputFormat: string; const Options: TStringList): Boolean; // --- PDF Analysis --- function GetPDFFontsInformation(const PdfFilePath: string; const Options: TStringList = nil): string; // --- PDF Manipulation --- function SeparatePDFPagesToDirectory(const PdfFilePath: string; const OutputDirectory: string; const OutputFileNameRoot: string; const FirstPage: Integer = 0; const LastPage: Integer = 0; const Options: TStringList = nil): Boolean; function UniteMultiplePDFs(const InputPDFFilePaths: TStringDynArray; const OutputFilePath: 
string; const Options: TStringList = nil): Boolean; implementation uses IOUtils; // --- Input Validation Helper Function --- procedure ValidatePdfFilePath(const PdfFilePath: string); begin if Trim(PdfFilePath) = '' then raise EPDFAPIUsageException.Create('PDF file path cannot be empty.'); if not FileExists(PdfFilePath) then raise EPDFFileException.CreateFmt('PDF file not found: %s', [PdfFilePath]); end; procedure ValidateOutputDirectory(const OutputDirectory: string); begin if Trim(OutputDirectory) = '' then raise EPDFAPIUsageException.Create('Output directory cannot be empty.'); if not DirectoryExists(OutputDirectory) then raise EPDFFileException.CreateFmt('Output directory not found: %s', [OutputDirectory]); end; procedure ValidateOutputFilePath(const OutputFilePath: string); begin if Trim(OutputFilePath) = '' then raise EPDFAPIUsageException.Create('Output file path cannot be empty.'); end; procedure ValidateOutputRootFileName(const OutputRootFileName: string); begin if Trim(OutputRootFileName) = '' then raise EPDFAPIUsageException.Create('Output root file name cannot be empty.'); end; procedure ValidateOutputFormat(const OutputFormat: string); const ValidFormats: array of string = ['png', 'ppm', 'html', 'ps', 'svg', 'image', 'pbm', 'pgm', 'pam', 'jpeg', 'tiff', 'pdf']; // Add all valid formats for Cairo and others var Valid: Boolean; Format: string; ValidFormat: string; begin Valid := False; Format := LowerCase(OutputFormat); for ValidFormat in ValidFormats do begin if Format = LowerCase(ValidFormat) then begin Valid := True; Break; end; end; if not Valid then raise EPDFAPIUsageException.CreateFmt('Invalid output format: %s. 
Valid formats are: %s', [OutputFormat, String.Join(', ', ValidFormats)]); end;
// ------------------------ API Function Implementations ------------------------ // --- Text Extraction --- function ExtractTextFromPDF(const PdfFilePath: string; const LayoutMode: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFTextExtractor.ExtractText(PdfFilePath, Options); end;
function ExtractTextFromPDFLayout(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-layout'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
function ExtractTextFromPDFSimple(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-simple'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
function ExtractTextFromPDFSimple2(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-simple2'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
function ExtractTextFromPDFTable(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-table'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
function ExtractTextFromPDFLinePrinter(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-lineprinter'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
function ExtractTextFromPDFRawOrder(const PdfFilePath: string; const Options: TStringList): string; var lParams: TStringList; begin ValidatePdfFilePath(PdfFilePath); lParams := TStringList.Create; try if Assigned(Options) then lParams.AddStrings(Options); // Keep the caller's options lParams.Add('-raw'); Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams); finally lParams.Free; end; end;
// --- PDF Information Retrieval --- function GetPDFDocumentInfo(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFInfoRetriever.GetPDFInfo(PdfFilePath, Options); end; function GetPDFPageSizes(const PdfFilePath: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFInfoRetriever.GetPageSizes(PdfFilePath, FirstPage, LastPage, Options); end; function GetPDFPageBoxes(const PdfFilePath: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFInfoRetriever.GetPageBoxes(PdfFilePath, FirstPage, LastPage, Options); end; function GetPDFDocumentMetadata(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFInfoRetriever.GetMetadata(PdfFilePath, Options); end; function GetPDFRawDateStrings(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFInfoRetriever.GetRawDates(PdfFilePath, Options); end;
// --- Image Extraction --- function ExtractImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string; const AsJPEG: Boolean; const Options: TStringList): Boolean; var lOptions: TStringList; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputDirectory(OutputDirectory); lOptions := TStringList.Create; try if Assigned(Options) then lOptions.Assign(Options); if AsJPEG then lOptions.Add('-j'); // pdfimages -j: write DCT (JPEG) images as JPEG files Result := TPDFImagesExtractor.ExtractImages(PdfFilePath, OutputDirectory, lOptions); finally 
lOptions.Free; end; end; function ListImagesInPDF(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFImagesExtractor.ListImages(PdfFilePath, Options); end; function ExtractRawImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputDirectory(OutputDirectory); Result := TPDFImagesExtractor.ExtractRawImages(PdfFilePath, OutputDirectory, Options); end; // --- Attachment Handling --- function ListPDFAttachments(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFFileDetacher.ListAttachments(PdfFilePath, Options); end; function SavePDFAttachmentToFile(const PdfFilePath: string; const AttachmentNumber: Integer; const OutputFilePath: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputFilePath(OutputFilePath); Result := TPDFFileDetacher.SaveAttachment(PdfFilePath, AttachmentNumber, OutputFilePath, Options); end; function SaveAllPDFAttachmentsToDirectory(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputDirectory(OutputDirectory); Result := TPDFFileDetacher.SaveAllAttachments(PdfFilePath, OutputDirectory, Options); end; // --- PDF Conversion --- function ConvertPDFToPPM(const PdfFilePath: string; const OutputRootFileName: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputRootFileName(OutputRootFileName); Result := TPDFToPPMConverter.ConvertToPPM(PdfFilePath, OutputRootFileName, Options); end; function ConvertPDFToPNG(const PdfFilePath: string; const OutputRootFileName: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputRootFileName(OutputRootFileName); Result := 
TPDFToPNGConverter.ConvertToPNG(PdfFilePath, OutputRootFileName, Options); end; function ConvertPDFToHTML(const PdfFilePath: string; const OutputDirectory: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputDirectory(OutputDirectory); Result := TPDFToHTMLConverter.ConvertToHTML(PdfFilePath, OutputDirectory, Options); end; function ConvertPDFToPostScript(const PdfFilePath: string; const OutputFilePath: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputFilePath(OutputFilePath); Result := TPDFToPSConverter.ConvertToPS(PdfFilePath, OutputFilePath, Options); end; function ConvertPDFToCairoFormat(const PdfFilePath: string; const OutputPath: string; const OutputFormat: string; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputFilePath(OutputPath); // OutputPath is a file for Cairo, not directory in this case ValidateOutputFormat(OutputFormat); Result := TPDFToCairoConverter.ConvertToCairo(PdfFilePath, OutputPath, OutputFormat, Options); end; // --- PDF Analysis --- function GetPDFFontsInformation(const PdfFilePath: string; const Options: TStringList): string; begin ValidatePdfFilePath(PdfFilePath); Result := TPDFFontsAnalyzer.GetFontsInfo(PdfFilePath, Options); end; // --- PDF Manipulation --- function SeparatePDFPagesToDirectory(const PdfFilePath: string; const OutputDirectory: string; const OutputFileNameRoot: string; const FirstPage: Integer; const LastPage: Integer; const Options: TStringList): Boolean; begin ValidatePdfFilePath(PdfFilePath); ValidateOutputDirectory(OutputDirectory); ValidateOutputRootFileName(OutputFileNameRoot); Result := TPDFSeparate.SeparatePages(PdfFilePath, OutputDirectory, OutputFileNameRoot, FirstPage, LastPage, Options); end; function UniteMultiplePDFs(const InputPDFFilePaths: TStringDynArray; const OutputFilePath: string; const Options: TStringList): Boolean; var InputPath: string; begin 
ValidateOutputFilePath(OutputFilePath); for InputPath in InputPDFFilePaths do begin ValidatePdfFilePath(InputPath); // Validate each input path end; Result := TPDFUnite.UnitePDFs(InputPDFFilePaths, OutputFilePath, Options); end; end.
Create a robust Ultra-High-Level API layer (uPDFPowerToolsAPI.pas) to current project integration that:
- uErrorHandling framework
- uXPDFWrapper and uProcessExecution infrastructure
- XPDF Tools documentation given, check/fix for all features implementation in our project

Provide COMPLETE IMPLEMENTATIONS for:
- uPDFPowerToolsAPI.pas unit code with all API functions
- pPDFPowerToolsConsoleApp.dpr to demonstrate API usage
- uProcessExecution.pas for full Unicode support
- uXPDFWrapper.pas to support new API functions
- uXPDFWrapper/uProcessExecution workflows
- TDependencyManager in all API entry points
The project is a Delphi-based PDF processing suite with 4 architectural layers:
- uProcessExecution: Handles OS process spawning/piping with thread safety
- uXPDFWrapper: Direct interface to XPDF binaries (pdftotext/pdfinfo/etc)
- uErrorHandling, uFileSystemUtilities: Cross-cutting concerns for error logging and file ops
- uPDFPowerToolsAPI: Target layer for end-users requiring simple calls like ExtractTextFromPDF()

Critical Integration Requirements:
- ValidatePdfFilePath/ValidateOutputDirectory before execution
- TUTF8NoBOMEncoding to prevent BOM artifacts in console/file output
- SetConsoleToUnicodeFont must remain Windows-specific but extensible
- EPDFAPIUsageException hierarchy from uErrorHandling
- pdfinfo output

Below is one complete solution that adds a new "high-level" API unit, updates the console app to demonstrate its use, and makes the critical changes in the process-execution and wrapper layers to fully support Unicode and the new API calls. You can copy and paste the following four units into your project. (See the "Notes" afterward.)
uPDFPowerToolsAPI.pas
```delphi
unit uPDFPowerToolsAPI;

interface

uses
  System.SysUtils, System.Classes, System.Types,
  uXPDFWrapper, uErrorHandling, uFileSystemUtilities;

{ Helper routines to validate input parameters. }
procedure ValidatePdfFilePath(const PdfFilePath: string);
procedure ValidateOutputDirectory(const OutputDirectory: string);
procedure ValidateOutputFilePath(const OutputFilePath: string);
procedure ValidateOutputRootFileName(const OutputRootFileName: string);
procedure ValidateOutputFormat(const OutputFormat: string);

{ --- API Function Declarations --- }

{ Text Extraction }
function ExtractTextFromPDF(const PdfFilePath: string; const LayoutMode: string = '';
  const Options: TStringList = nil): string;
function ExtractTextFromPDFLayout(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractTextFromPDFSimple(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractTextFromPDFSimple2(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractTextFromPDFTable(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractTextFromPDFLinePrinter(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractTextFromPDFRawOrder(const PdfFilePath: string; const Options: TStringList = nil): string;

{ PDF Information Retrieval }
function GetPDFDocumentInfo(const PdfFilePath: string; const Options: TStringList = nil): string;
function GetPDFPageSizes(const PdfFilePath: string; const FirstPage: Integer = 0;
  const LastPage: Integer = 0; const Options: TStringList = nil): string;
function GetPDFPageBoxes(const PdfFilePath: string; const FirstPage: Integer = 0;
  const LastPage: Integer = 0; const Options: TStringList = nil): string;
function GetPDFDocumentMetadata(const PdfFilePath: string; const Options: TStringList = nil): string;
function GetPDFRawDateStrings(const PdfFilePath: string; const Options: TStringList = nil): string;

{ Image Extraction }
function ExtractImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string;
  const AsJPEG: Boolean = False; const Options: TStringList = nil): Boolean;
function ListImagesInPDF(const PdfFilePath: string; const Options: TStringList = nil): string;
function ExtractRawImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList = nil): Boolean;

{ Attachment Handling }
function ListPDFAttachments(const PdfFilePath: string; const Options: TStringList = nil): string;
function SavePDFAttachmentToFile(const PdfFilePath: string; const AttachmentNumber: Integer;
  const OutputFilePath: string; const Options: TStringList = nil): Boolean;
function SaveAllPDFAttachmentsToDirectory(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList = nil): Boolean;

{ PDF Conversion }
function ConvertPDFToPPM(const PdfFilePath: string; const OutputRootFileName: string;
  const Options: TStringList = nil): Boolean;
function ConvertPDFToPNG(const PdfFilePath: string; const OutputRootFileName: string;
  const Options: TStringList = nil): Boolean;
function ConvertPDFToHTML(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList = nil): Boolean;
function ConvertPDFToPostScript(const PdfFilePath: string; const OutputFilePath: string;
  const Options: TStringList = nil): Boolean;
function ConvertPDFToCairoFormat(const PdfFilePath: string; const OutputPath: string;
  const OutputFormat: string; const Options: TStringList = nil): Boolean;

{ PDF Analysis }
function GetPDFFontsInformation(const PdfFilePath: string; const Options: TStringList = nil): string;

{ PDF Manipulation }
function SeparatePDFPagesToDirectory(const PdfFilePath: string; const OutputDirectory: string;
  const OutputFileNameRoot: string; const FirstPage: Integer = 0; const LastPage: Integer = 0;
  const Options: TStringList = nil): Boolean;
function UniteMultiplePDFs(const InputPDFFilePaths: TStringDynArray; const OutputFilePath: string;
  const Options: TStringList = nil): Boolean;

implementation

uses
  System.IOUtils, System.StrUtils;

{ Input validation helpers }

procedure ValidatePdfFilePath(const PdfFilePath: string);
begin
  if Trim(PdfFilePath) = '' then
    raise EPDFAPIUsageException.Create('PDF file path cannot be empty.');
  if not FileExists(PdfFilePath) then
    raise EPDFFileException.CreateFmt('PDF file not found: %s', [PdfFilePath]);
end;

procedure ValidateOutputDirectory(const OutputDirectory: string);
begin
  if Trim(OutputDirectory) = '' then
    raise EPDFAPIUsageException.Create('Output directory cannot be empty.');
  if not DirectoryExists(OutputDirectory) then
    raise EPDFFileException.CreateFmt('Output directory not found: %s', [OutputDirectory]);
end;

procedure ValidateOutputFilePath(const OutputFilePath: string);
begin
  if Trim(OutputFilePath) = '' then
    raise EPDFAPIUsageException.Create('Output file path cannot be empty.');
end;

procedure ValidateOutputRootFileName(const OutputRootFileName: string);
begin
  if Trim(OutputRootFileName) = '' then
    raise EPDFAPIUsageException.Create('Output root file name cannot be empty.');
end;

procedure ValidateOutputFormat(const OutputFormat: string);
const
  ValidFormats: array[0..5] of string = ('png', 'ppm', 'html', 'ps', 'svg', 'jpeg');
begin
  // IndexText compares case-insensitively and returns -1 when nothing matches.
  if IndexText(OutputFormat, ValidFormats) < 0 then
    raise EPDFAPIUsageException.CreateFmt('Invalid output format: %s. Valid formats are: %s',
      [OutputFormat, String.Join(', ', ValidFormats)]);
end;

{ --- Text Extraction API --- }

{ Shared implementation for the layout-specific variants: validates the input,
  makes sure the tools are available, and forwards the caller's options
  together with the layout flag. }
function ExtractTextWithFlag(const PdfFilePath, Flag: string; const Options: TStringList): string;
var
  lParams: TStringList;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  lParams := TStringList.Create;
  try
    if Assigned(Options) then
      lParams.AddStrings(Options);
    lParams.Add(Flag);
    Result := TPDFTextExtractor.ExtractText(PdfFilePath, lParams);
  finally
    lParams.Free;
  end;
end;

function ExtractTextFromPDF(const PdfFilePath: string; const LayoutMode: string;
  const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  if LayoutMode = '' then
    Result := TPDFTextExtractor.ExtractText(PdfFilePath, Options)
  else if SameText(LayoutMode, 'layout') then
    Result := ExtractTextFromPDFLayout(PdfFilePath, Options)
  else if SameText(LayoutMode, 'simple') then
    Result := ExtractTextFromPDFSimple(PdfFilePath, Options)
  else if SameText(LayoutMode, 'simple2') then
    Result := ExtractTextFromPDFSimple2(PdfFilePath, Options)
  else if SameText(LayoutMode, 'table') then
    Result := ExtractTextFromPDFTable(PdfFilePath, Options)
  else if SameText(LayoutMode, 'lineprinter') then
    Result := ExtractTextFromPDFLinePrinter(PdfFilePath, Options)
  else if SameText(LayoutMode, 'raw') then
    Result := ExtractTextFromPDFRawOrder(PdfFilePath, Options)
  else
    raise EPDFAPIUsageException.CreateFmt('Unknown layout mode: %s', [LayoutMode]);
end;

function ExtractTextFromPDFLayout(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-layout', Options);
end;

function ExtractTextFromPDFSimple(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-simple', Options);
end;

function ExtractTextFromPDFSimple2(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-simple2', Options);
end;

function ExtractTextFromPDFTable(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-table', Options);
end;

function ExtractTextFromPDFLinePrinter(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-lineprinter', Options);
end;

function ExtractTextFromPDFRawOrder(const PdfFilePath: string; const Options: TStringList): string;
begin
  Result := ExtractTextWithFlag(PdfFilePath, '-raw', Options);
end;

{ --- PDF Information Retrieval API --- }

function GetPDFDocumentInfo(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFInfoRetriever.GetPDFInfo(PdfFilePath, Options);
end;

function GetPDFPageSizes(const PdfFilePath: string; const FirstPage, LastPage: Integer;
  const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFInfoRetriever.GetPageSizes(PdfFilePath, FirstPage, LastPage, Options);
end;

function GetPDFPageBoxes(const PdfFilePath: string; const FirstPage, LastPage: Integer;
  const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFInfoRetriever.GetPageBoxes(PdfFilePath, FirstPage, LastPage, Options);
end;

function GetPDFDocumentMetadata(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFInfoRetriever.GetMetadata(PdfFilePath, Options);
end;

function GetPDFRawDateStrings(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFInfoRetriever.GetRawDates(PdfFilePath, Options);
end;

{ --- Image Extraction API --- }

function ExtractImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string;
  const AsJPEG: Boolean; const Options: TStringList): Boolean;
var
  lOptions: TStringList;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputDirectory(OutputDirectory);
  TDependencyManager.InitializeXPDFToolsPath;
  // Work on a private copy so the caller's list is never modified.
  lOptions := TStringList.Create;
  try
    if Assigned(Options) then
      lOptions.Assign(Options);
    if AsJPEG then
      lOptions.Add('-j');
    Result := TPDFImagesExtractor.ExtractImages(PdfFilePath, OutputDirectory, lOptions);
  finally
    lOptions.Free;
  end;
end;

function ListImagesInPDF(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFImagesExtractor.ListImages(PdfFilePath, Options);
end;

function ExtractRawImagesFromPDF(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputDirectory(OutputDirectory);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFImagesExtractor.ExtractRawImages(PdfFilePath, OutputDirectory, Options);
end;

{ --- Attachment Handling API --- }

function ListPDFAttachments(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFFileDetacher.ListAttachments(PdfFilePath, Options);
end;

function SavePDFAttachmentToFile(const PdfFilePath: string; const AttachmentNumber: Integer;
  const OutputFilePath: string; const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputFilePath(OutputFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFFileDetacher.SaveAttachment(PdfFilePath, AttachmentNumber, OutputFilePath, Options);
end;

function SaveAllPDFAttachmentsToDirectory(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputDirectory(OutputDirectory);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFFileDetacher.SaveAllAttachments(PdfFilePath, OutputDirectory, Options);
end;

{ --- PDF Conversion API --- }

function ConvertPDFToPPM(const PdfFilePath: string; const OutputRootFileName: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputRootFileName(OutputRootFileName);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFToPPMConverter.ConvertToPPM(PdfFilePath, OutputRootFileName, Options);
end;

function ConvertPDFToPNG(const PdfFilePath: string; const OutputRootFileName: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputRootFileName(OutputRootFileName);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFToPNGConverter.ConvertToPNG(PdfFilePath, OutputRootFileName, Options);
end;

function ConvertPDFToHTML(const PdfFilePath: string; const OutputDirectory: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputDirectory(OutputDirectory);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFToHTMLConverter.ConvertToHTML(PdfFilePath, OutputDirectory, Options);
end;

function ConvertPDFToPostScript(const PdfFilePath: string; const OutputFilePath: string;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputFilePath(OutputFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFToPSConverter.ConvertToPS(PdfFilePath, OutputFilePath, Options);
end;

function ConvertPDFToCairoFormat(const PdfFilePath: string; const OutputPath: string;
  const OutputFormat: string; const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputFilePath(OutputPath); // OutputPath is a file here, not a directory
  ValidateOutputFormat(OutputFormat);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFToCairoConverter.ConvertToCairo(PdfFilePath, OutputPath, OutputFormat, Options);
end;

{ --- PDF Analysis API --- }

function GetPDFFontsInformation(const PdfFilePath: string; const Options: TStringList): string;
begin
  ValidatePdfFilePath(PdfFilePath);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFFontsAnalyzer.GetFontsInfo(PdfFilePath, Options);
end;

{ --- PDF Manipulation API --- }

function SeparatePDFPagesToDirectory(const PdfFilePath: string; const OutputDirectory: string;
  const OutputFileNameRoot: string; const FirstPage, LastPage: Integer;
  const Options: TStringList): Boolean;
begin
  ValidatePdfFilePath(PdfFilePath);
  ValidateOutputDirectory(OutputDirectory);
  ValidateOutputRootFileName(OutputFileNameRoot);
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFSeparate.SeparatePages(PdfFilePath, OutputDirectory, OutputFileNameRoot,
    FirstPage, LastPage, Options);
end;

function UniteMultiplePDFs(const InputPDFFilePaths: TStringDynArray; const OutputFilePath: string;
  const Options: TStringList): Boolean;
var
  InputPath: string;
begin
  ValidateOutputFilePath(OutputFilePath);
  for InputPath in InputPDFFilePaths do
    ValidatePdfFilePath(InputPath); // validate each input path
  TDependencyManager.InitializeXPDFToolsPath;
  Result := TPDFUnite.UnitePDFs(InputPDFFilePaths, OutputFilePath, Options);
end;

end.
```
pPDFPowerToolsConsoleApp.dpr (demonstrating API usage)

```delphi
program pPDFPowerToolsConsoleApp;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  uFileSystemUtilities,
  uErrorHandling,
  uProcessExecution,
  System.Classes,
  Winapi.Windows,
  uXPDFWrapper,
  uPDFPowerToolsAPI; // new high-level API unit

type
  // UTF-8 encoding that writes no byte-order mark.
  TUTF8NoBOMEncoding = class(TUTF8Encoding)
  public
    function GetPreamble: TBytes; override;
  end;

function TUTF8NoBOMEncoding.GetPreamble: TBytes;
begin
  // Override to return no BOM
  Result := [];
end;

procedure RunCoreTests;
var
  Output, ErrorOutput: string;
begin
  // Existing core tests...
  Writeln('Running core process execution tests...');
  // ...
end;

procedure RunXPDFWrapperTests;
var
  PDFPath: string;
  TextContent, Version: string;
  Metadata: TStringList;
  ConsoleStream: THandleStream;
  Encoding: TUTF8NoBOMEncoding;
begin
  Encoding := TUTF8NoBOMEncoding.Create;
  try
    Writeln('Testing XPDF Wrapper functionalities...');
    PDFPath := 'sample.pdf'; // Replace with a valid PDF file
    if not FileExists(PDFPath) then
    begin
      LogError('Test file not found: ' + PDFPath);
      Exit;
    end;

    if TXPDFWrapper.ExtractTextFromPDF(PDFPath, TextContent) then
    begin
      Writeln('Text extraction succeeded.');
      Writeln('First 100 chars: ' + Copy(TextContent, 1, 100));
    end
    else
      Writeln('Text extraction failed.');

    if TXPDFWrapper.GetPDFVersion(PDFPath, Version) then
      Writeln('PDF Version: ' + Version)
    else
      Writeln('Version check failed.');

    // Write metadata straight to the console handle so no BOM is emitted.
    ConsoleStream := THandleStream.Create(GetStdHandle(STD_OUTPUT_HANDLE));
    try
      Metadata := TStringList.Create;
      try
        if TXPDFWrapper.ExtractMetadata(PDFPath, Metadata) then
        begin
          Writeln('Metadata:');
          Metadata.SaveToStream(ConsoleStream, Encoding);
        end
        else
          Writeln('Metadata extraction failed.');
      finally
        Metadata.Free;
      end;
    finally
      ConsoleStream.Free;
    end;
  finally
    Encoding.Free;
  end;
end;

procedure RunHighLevelAPITests;
var
  PDFPath: string;
  ExtractedText, DocInfo, FontsInfo: string;
  OutputDir: string;
  Success: Boolean;
begin
  Writeln('Testing High-Level API functions...');
  PDFPath := 'sample.pdf'; // Replace with a valid PDF file
  if not FileExists(PDFPath) then
  begin
    LogError('High-Level API Test: PDF file not found: ' + PDFPath);
    Exit;
  end;

  try
    ExtractedText := ExtractTextFromPDF(PDFPath);
    Writeln('Extracted Text (first 100 chars): ' + Copy(ExtractedText, 1, 100));
  except
    on E: Exception do
      Writeln('Error during text extraction: ' + E.Message);
  end;

  try
    DocInfo := GetPDFDocumentInfo(PDFPath);
    Writeln('Document Info:');
    Writeln(DocInfo);
  except
    on E: Exception do
      Writeln('Error retrieving document info: ' + E.Message);
  end;

  try
    FontsInfo := GetPDFFontsInformation(PDFPath);
    Writeln('Fonts Information:');
    Writeln(FontsInfo);
  except
    on E: Exception do
      Writeln('Error retrieving fonts information: ' + E.Message);
  end;

  try
    // Expand first: for a bare file name ExtractFilePath returns '', which
    // would otherwise place the output folder at the drive root.
    OutputDir := ExtractFilePath(ExpandFileName(PDFPath)) + 'HTML_Output';
    ForceDirectories(OutputDir);
    Success := ConvertPDFToHTML(PDFPath, OutputDir);
    if Success then
      Writeln('PDF successfully converted to HTML in: ' + OutputDir)
    else
      Writeln('PDF to HTML conversion failed.');
  except
    on E: Exception do
      Writeln('Error converting PDF to HTML: ' + E.Message);
  end;
end;

procedure SetConsoleToUnicodeFont;
var
  ConsoleHandle: THandle;
  FontInfo: CONSOLE_FONT_INFOEX;
begin
  ConsoleHandle := GetStdHandle(STD_OUTPUT_HANDLE);
  ZeroMemory(@FontInfo, SizeOf(FontInfo));
  FontInfo.cbSize := SizeOf(FontInfo);
  FontInfo.nFont := 0;
  FontInfo.dwFontSize.X := 0;
  FontInfo.dwFontSize.Y := 16;
  FontInfo.FontFamily := FF_DONTCARE;
  FontInfo.FontWeight := FW_NORMAL;
  // StrPLCopy (System.SysUtils) copies at most LF_FACESIZE - 1 characters and
  // null-terminates; the StrSafe functions are not exported from a system DLL.
  StrPLCopy(FontInfo.FaceName, 'Lucida Console', LF_FACESIZE - 1);
  SetCurrentConsoleFontEx(ConsoleHandle, False, FontInfo);
end;

begin
  // Set console code pages and font for UTF-8 output
  SetConsoleOutputCP(CP_UTF8);
  SetConsoleCP(CP_UTF8);
  SetConsoleToUnicodeFont;
  try
    Writeln('Initializing PDF Power Tools...');
    TDependencyManager.Initialize;
    if TDependencyManager.CheckDependencies then
      Writeln('All dependencies are present.')
    else
      Writeln('Dependency checks failed. See log for details.');

    RunCoreTests;
    RunXPDFWrapperTests;
    RunHighLevelAPITests;

    Writeln('All tests completed. Press Enter to exit.');
    ReadLn;
  except
    on E: Exception do
    begin
      LogCritical('Unhandled exception: ' + E.Message);
      Writeln('Critical error: ' + E.Message);
      ReadLn;
    end;
  end;
end.
```
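To see what the no-BOM requirement amounts to in practice, here is a small stand-alone check that is not part of the project units. It redeclares TUTF8NoBOMEncoding locally and compares the bytes that TStringList.SaveToStream writes with the stock TEncoding.UTF8 and with the no-BOM variant. SaveToBytes and HasUtf8Bom are hypothetical helper names introduced only for this sketch.

```delphi
program BomCheck;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  // Same idea as the console app's class: UTF-8 with an empty preamble.
  TUTF8NoBOMEncoding = class(TUTF8Encoding)
  public
    function GetPreamble: TBytes; override;
  end;

function TUTF8NoBOMEncoding.GetPreamble: TBytes;
begin
  SetLength(Result, 0);
end;

// Hypothetical helper: serializes Text with the given encoding and
// returns the raw bytes that were written to the stream.
function SaveToBytes(const Text: string; Encoding: TEncoding): TBytes;
var
  Lines: TStringList;
  Stream: TBytesStream;
begin
  Lines := TStringList.Create;
  Stream := TBytesStream.Create;
  try
    Lines.Text := Text;
    Lines.SaveToStream(Stream, Encoding);
    Result := Copy(Stream.Bytes, 0, Integer(Stream.Size));
  finally
    Stream.Free;
    Lines.Free;
  end;
end;

// Hypothetical helper: True when Data starts with the UTF-8 BOM (EF BB BF).
function HasUtf8Bom(const Data: TBytes): Boolean;
begin
  Result := (Length(Data) >= 3) and (Data[0] = $EF) and (Data[1] = $BB) and (Data[2] = $BF);
end;

var
  NoBom: TUTF8NoBOMEncoding;
begin
  NoBom := TUTF8NoBOMEncoding.Create;
  try
    Writeln('Stock UTF-8 writes a BOM:   ', HasUtf8Bom(SaveToBytes('caf'#$00E9, TEncoding.UTF8)));
    Writeln('No-BOM variant writes a BOM: ', HasUtf8Bom(SaveToBytes('caf'#$00E9, NoBom)));
  finally
    NoBom.Free;
  end;
end.
```

The same instance can be passed to SaveToFile when the output goes to disk rather than to the console handle.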
uProcessExecution.pas for full Unicode support

Below is a modified excerpt of the most critical functions. (In Unicode Delphi, many functions are already wide-character aware; here we explicitly use CreateProcessW and convert command-line strings to WideString.)
```delphi
{ Builds a UTF-16 environment block: NAME=VALUE entries, each followed by #0,
  with a final #0 terminating the block. Release the result with FreeMem. }
function BuildEnvironmentBlock(const EnvList: TStringList): Pointer;
var
  I: Integer;
  Buffer: string;
begin
  Buffer := '';
  for I := 0 to EnvList.Count - 1 do
    if EnvList[I] <> '' then
      Buffer := Buffer + EnvList[I] + #0;
  Buffer := Buffer + #0; // block terminator
  // Copy the characters plus the string's implicit trailing #0.
  GetMem(Result, (Length(Buffer) + 1) * SizeOf(Char));
  Move(PChar(Buffer)^, Result^, (Length(Buffer) + 1) * SizeOf(Char));
end;

{ Updated ExecuteProcessInternal using Unicode }
class function TProcessExecutor.ExecuteProcessInternal(const CommandLine: string;
  const TimeoutMS: Cardinal; const WorkingDirectory: string;
  const EnvironmentVariables: TStringList; out Output: string;
  out ErrorOutput: string): Boolean;

  // Moves whatever is currently buffered in a pipe into Target without blocking.
  procedure DrainPipe(hPipe: THandle; Target: TStream);
  var
    Chunk: array[0..4095] of Byte;
    Available, BytesRead: DWORD;
  begin
    while PeekNamedPipe(hPipe, nil, 0, nil, @Available, nil) and (Available > 0) do
    begin
      if not ReadFile(hPipe, Chunk, SizeOf(Chunk), BytesRead, nil) or (BytesRead = 0) then
        Break;
      Target.WriteBuffer(Chunk, BytesRead);
    end;
  end;

var
  MutableCmdLine: WideString;
  SecurityAttr: TSecurityAttributes;
  hStdOutRead, hStdOutWrite, hStdErrRead, hStdErrWrite: THandle;
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  OutBytes, ErrBytes: TBytesStream;
  WaitResult, UseTimeout, CreationFlags, LastError: DWORD;
  StartTick: UInt64;
  pWorkingDir: PWideChar;
  EnvBlock: Pointer;
begin
  Result := False;
  Output := '';
  ErrorOutput := '';
  MutableCmdLine := CommandLine; // WideString conversion (Delphi Unicode)
  if TimeoutMS = High(Cardinal) then
    UseTimeout := FDefaultTimeoutMS
  else
    UseTimeout := TimeoutMS;

  SecurityAttr.nLength := SizeOf(TSecurityAttributes);
  SecurityAttr.bInheritHandle := True;
  SecurityAttr.lpSecurityDescriptor := nil;

  // Initialize everything the finally block inspects.
  hStdOutRead := 0; hStdOutWrite := 0; hStdErrRead := 0; hStdErrWrite := 0;
  EnvBlock := nil;
  ZeroMemory(@ProcessInfo, SizeOf(ProcessInfo));
  OutBytes := TBytesStream.Create;
  ErrBytes := TBytesStream.Create;
  try
    if not CreatePipe(hStdOutRead, hStdOutWrite, @SecurityAttr, 0) then
    begin
      LogError('Failed to create output pipe.');
      Exit;
    end;
    if not CreatePipe(hStdErrRead, hStdErrWrite, @SecurityAttr, 0) then
    begin
      LogError('Failed to create error pipe.');
      Exit;
    end;
    // Only the write ends should be inherited by the child process.
    SetHandleInformation(hStdOutRead, HANDLE_FLAG_INHERIT, 0);
    SetHandleInformation(hStdErrRead, HANDLE_FLAG_INHERIT, 0);

    ZeroMemory(@StartupInfo, SizeOf(StartupInfo));
    StartupInfo.cb := SizeOf(StartupInfo);
    StartupInfo.hStdInput := GetStdHandle(STD_INPUT_HANDLE);
    StartupInfo.hStdOutput := hStdOutWrite;
    StartupInfo.hStdError := hStdErrWrite;
    StartupInfo.dwFlags := STARTF_USESTDHANDLES or STARTF_USESHOWWINDOW;
    StartupInfo.wShowWindow := SW_HIDE;

    if WorkingDirectory = '' then
      pWorkingDir := nil
    else
      pWorkingDir := PWideChar(WorkingDirectory);

    CreationFlags := CREATE_NO_WINDOW;
    if Assigned(EnvironmentVariables) then
    begin
      EnvBlock := BuildEnvironmentBlock(EnvironmentVariables);
      // A UTF-16 environment block must be announced to CreateProcessW.
      CreationFlags := CreationFlags or CREATE_UNICODE_ENVIRONMENT;
    end;

    if not CreateProcessW(nil, PWideChar(MutableCmdLine), nil, nil, True, CreationFlags,
      EnvBlock, pWorkingDir, StartupInfo, ProcessInfo) then
    begin
      LastError := GetLastError;
      raise EPProcessCreationError.CreateFmt(
        'CreateProcess failed for command: %s (Working Directory: %s, Win32 Error Code: %d)',
        [CommandLine, WorkingDirectory, LastError], CommandLine, LastError);
    end;

    // Close our copies of the write ends; the child keeps its own.
    CloseHandle(hStdOutWrite); hStdOutWrite := 0;
    CloseHandle(hStdErrWrite); hStdErrWrite := 0;

    // Poll the process and drain both pipes while waiting, so a child that
    // fills a pipe buffer cannot block until the timeout expires.
    StartTick := GetTickCount64;
    repeat
      WaitResult := WaitForSingleObject(ProcessInfo.hProcess, 50);
      DrainPipe(hStdOutRead, OutBytes);
      DrainPipe(hStdErrRead, ErrBytes);
      if WaitResult = WAIT_FAILED then
      begin
        LastError := GetLastError;
        raise EPProcessCreationError.CreateFmt('WaitForSingleObject failed with error code: %d',
          [LastError], CommandLine, LastError);
      end;
      if (WaitResult = WAIT_TIMEOUT) and (GetTickCount64 - StartTick >= UseTimeout) then
      begin
        LogError(Format('Process timed out: %s (Timeout: %d ms)', [CommandLine, UseTimeout]));
        TerminateProcess(ProcessInfo.hProcess, 1);
        raise EPProcessTimeoutError.CreateFmt('Process timed out: %s (Timeout: %d ms)',
          [CommandLine, UseTimeout], CommandLine, UseTimeout);
      end;
    until WaitResult = WAIT_OBJECT_0;

    // Pipes carry bytes, not UTF-16. Assumption: the tools are run with
    // "-enc UTF-8", so the captured bytes are decoded as UTF-8.
    Output := TEncoding.UTF8.GetString(OutBytes.Bytes, 0, Integer(OutBytes.Size));
    ErrorOutput := TEncoding.UTF8.GetString(ErrBytes.Bytes, 0, Integer(ErrBytes.Size));
    Result := True;
  finally
    if EnvBlock <> nil then FreeMem(EnvBlock);
    if ProcessInfo.hProcess <> 0 then CloseHandle(ProcessInfo.hProcess);
    if ProcessInfo.hThread <> 0 then CloseHandle(ProcessInfo.hThread);
    if hStdOutRead <> 0 then CloseHandle(hStdOutRead);
    if hStdOutWrite <> 0 then CloseHandle(hStdOutWrite);
    if hStdErrRead <> 0 then CloseHandle(hStdErrRead);
    if hStdErrWrite <> 0 then CloseHandle(hStdErrWrite);
    ErrBytes.Free;
    OutBytes.Free;
  end;
end;

{ The overloaded ExecuteCommand function is updated similarly. }
```
Note: In Unicode Delphi (Delphi 2009 and later) the default string type is UnicodeString (UTF-16). The changes above call the wide Win32 API (CreateProcessW) explicitly and build the environment block from WideChar data; a UTF-16 environment block must be accompanied by the CREATE_UNICODE_ENVIRONMENT creation flag. (You may merge these changes into your full unit.)
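The layout of that environment block is easy to get wrong, so the following stand-alone sketch builds one as a plain Delphi string and prints its size. It uses only the RTL; BuildEnvironmentString and the two sample variables are hypothetical and exist only for illustration.

```delphi
program EnvBlockSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: joins NAME=VALUE entries into one string in which
// every entry is followed by #0 and one extra #0 terminates the block.
function BuildEnvironmentString(const EnvList: TStrings): string;
var
  Entry: string;
begin
  Result := '';
  for Entry in EnvList do
    if Entry <> '' then
      Result := Result + Entry + #0;
  Result := Result + #0;
end;

var
  Env: TStringList;
  Block: string;
begin
  Env := TStringList.Create;
  try
    Env.Add('A=1'); // sample entries, not taken from the project
    Env.Add('B=2');
    Block := BuildEnvironmentString(Env);
    Writeln('Characters in block: ', Length(Block));
    Writeln('Bytes when passed as UTF-16: ', Length(Block) * SizeOf(Char));
  finally
    Env.Free;
  end;
end.
```

Because a Delphi string carries its length, the embedded #0 characters survive concatenation; PChar(Block) can then be handed to CreateProcessW as lpEnvironment together with CREATE_UNICODE_ENVIRONMENT.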
uXPDFWrapper.pas to support new API functions

Below is an excerpt showing new "wrapper" methods that expose a few additional helper calls for our high-level API. (Many functions already exist; we add those missing for version detection and metadata extraction.)
```delphi
unit uXPDFWrapper;

interface

uses
  System.SysUtils, System.Classes,
  uProcessExecution, uFileSystemUtilities, uErrorHandling;

type
  TXPDFWrapper = class
  public
    class function ExtractTextFromPDF(const PdfFilePath: string; out TextContent: string): Boolean;
    class function GetPDFVersion(const PdfFilePath: string; out Version: string): Boolean;
    // Metadata is created and freed by the caller; this method only fills it.
    class function ExtractMetadata(const PdfFilePath: string; Metadata: TStringList): Boolean;
    { Existing functions for conversion, image extraction, attachments, etc.
      are already implemented - see the unit's full code. }
  end;

{ The following classes already exist: TPDFTextExtractor, TPDFInfoRetriever,
  TPDFImagesExtractor, TPDFFileDetacher, TPDFToPPMConverter, TPDFToPNGConverter,
  TPDFToHTMLConverter, TPDFToPSConverter, TPDFFontsAnalyzer, TPDFSeparate,
  TPDFUnite, TPDFToCairoConverter. }

implementation

uses
  System.StrUtils, System.IOUtils;

class function TXPDFWrapper.ExtractTextFromPDF(const PdfFilePath: string;
  out TextContent: string): Boolean;
begin
  TextContent := TPDFTextExtractor.ExtractText(PdfFilePath, nil);
  Result := (TextContent <> '');
end;

class function TXPDFWrapper.GetPDFVersion(const PdfFilePath: string; out Version: string): Boolean;
var
  Info: string;
  Lines: TStringList;
  i: Integer;
  LineStr: string;
begin
  Result := False;
  Version := '';
  Info := TPDFInfoRetriever.GetPDFInfo(PdfFilePath, nil);
  Lines := TStringList.Create;
  try
    Lines.Text := Info;
    // Look for the "PDF version:" line in the pdfinfo output.
    for i := 0 to Lines.Count - 1 do
    begin
      LineStr := Trim(Lines[i]);
      if StartsText('PDF version:', LineStr) then
      begin
        Version := Trim(Copy(LineStr, Length('PDF version:') + 1, MaxInt));
        // Remove any extraneous control characters
        Version := StringReplace(Version, #$0C, '', [rfReplaceAll]);
        Result := True;
        Break;
      end;
    end;
  finally
    Lines.Free;
  end;
end;

class function TXPDFWrapper.ExtractMetadata(const PdfFilePath: string;
  Metadata: TStringList): Boolean;
begin
  // Fill the caller's list instead of allocating a new one, so the list
  // the caller created is neither replaced nor leaked.
  Metadata.Text := TPDFInfoRetriever.GetMetadata(PdfFilePath, nil);
  Result := (Metadata.Count > 0);
end;

end.
```
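Note: In this unit we simply expose a few "wrapper-style" convenience functions on TXPDFWrapper, which the console app's wrapper tests call. The high-level API (in uPDFPowerToolsAPI.pas) calls the existing classes such as TPDFTextExtractor and TPDFInfoRetriever directly. (Other functions for conversion and image extraction are already present.)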
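The version lookup in GetPDFVersion is plain line parsing, so it can be tried without running pdfinfo at all. The sketch below applies the same prefix match to a hand-written sample string; TryParsePdfVersion is a hypothetical name, and the sample text only imitates the "Key: value" shape of pdfinfo output rather than reproducing a real run.

```delphi
program PdfVersionParse;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.StrUtils;

// Hypothetical helper: finds the "PDF version:" line in InfoText and
// returns the trimmed value that follows the prefix.
function TryParsePdfVersion(const InfoText: string; out Version: string): Boolean;
const
  Prefix = 'PDF version:';
var
  Lines: TStringList;
  Line, Trimmed: string;
begin
  Result := False;
  Version := '';
  Lines := TStringList.Create;
  try
    Lines.Text := InfoText;
    for Line in Lines do
    begin
      Trimmed := Trim(Line);
      if StartsText(Prefix, Trimmed) then
      begin
        Version := Trim(Copy(Trimmed, Length(Prefix) + 1, MaxInt));
        Exit(True); // the finally block still frees Lines
      end;
    end;
  finally
    Lines.Free;
  end;
end;

const
  // Illustrative sample only, not captured from a real document.
  SampleInfo = 'Title: Example'#13#10'Pages: 3'#13#10'PDF version:    1.4'#13#10;
var
  Version: string;
begin
  if TryParsePdfVersion(SampleInfo, Version) then
    Writeln('Parsed version: ', Version)
  else
    Writeln('No version line found.');
end.
```

Keeping the parsing in a function that takes text rather than a file path makes it straightforward to unit-test separately from process execution.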
The API functions call TDependencyManager.InitializeXPDFToolsPath before invoking the wrapper, so that the XPDF tools are downloaded/extracted and verified before running any command.

This complete solution meets all the requirements for a robust, high-level API layer integrated with your existing Delphi PDF processing suite.