module HexaPDF
HexaPDF
API Documentation
Here are some pointers to more in-depth information:
- For information about the command line application, see the HexaPDF::CLI module.
- HexaPDF::Document provides information about how to work with a PDF file.
- HexaPDF::Content::Canvas provides the canvas API for drawing/writing on a page or form XObject.
Constants
- DefaultDocumentConfiguration
The default document specific configuration object.
Modify this object if you want to globally change document specific options or if you want to introduce new document specific options.
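The two ways of changing options can be sketched as follows. This is a minimal example that assumes the hexapdf gem has already been required; the option values are purely illustrative. Assigning to DefaultDocumentConfiguration changes the default for documents created afterwards, whereas the config argument of HexaPDF::Document.new overrides options for one document only.

```ruby
# Illustrative values; the keys are taken from the option list on this page.
global_overrides = {
  'page.default_media_box' => :Letter,
  'io.chunk_size' => 2**15,
}
per_document_overrides = {'page.default_media_orientation' => :landscape}

# Assumes `require 'hexapdf'` has already run; otherwise nothing is changed.
if defined?(HexaPDF)
  # Global change: applies to every document created from now on.
  global_overrides.each do |key, value|
    HexaPDF::DefaultDocumentConfiguration[key] = value
  end

  # Local change: only this document sees the override.
  doc = HexaPDF::Document.new(config: per_document_overrides)
  puts doc.config['page.default_media_orientation']
end
```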
The following options are provided:
- acro_form.appearance_generator
-
The class that should be used for generating appearances for AcroForm fields. If the value is a String, it should contain the name of a constant to such a class.
- acro_form.create_appearances
-
A boolean specifying whether an AcroForm field's appearances should automatically be generated if they are missing.
- acro_form.text_field.default_width
-
A number specifying the default width of AcroForm text fields which should be auto-sized.
- acro_form.default_font_size
-
A number specifying the default font size of AcroForm text fields which should be auto-sized.
- acro_form.fallback_font
-
The font that should be used when a variable text field references a font that cannot be used.
Can be one of the following:
-
The name of a font, like 'Helvetica'.
-
An array consisting of the font name and a hash of font options, like ['Helvetica', variant: :italic].
-
A callable object receiving the field and the font object (or nil if no valid font object was found) and which has to return either a font name or an array consisting of the font name and a hash of font options. This way the response can be different depending on the original font and it would also allow e.g. modifying the configured fonts to add custom ones.
If set to nil, the use of the fallback font is disabled.
Default is 'Helvetica'.
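As a sketch of the callable form, the lambda below follows the contract described above: it receives the field and the font object (or nil) and returns either a font name or an array of font name and options. The selection rule itself is hypothetical and only meant as an illustration; the registration line assumes the hexapdf gem is loaded.

```ruby
# Hypothetical rule: plain Helvetica when no font object could be resolved,
# the italic variant otherwise.
fallback_font = lambda do |_field, font|
  font.nil? ? 'Helvetica' : ['Helvetica', {variant: :italic}]
end

# Assumes `require 'hexapdf'` has already run; otherwise nothing is registered.
if defined?(HexaPDF)
  HexaPDF::DefaultDocumentConfiguration['acro_form.fallback_font'] = fallback_font
end
```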
- acro_form.on_invalid_value
-
Callback hook when an invalid value is set for certain types of AcroForm fields.
The value needs to be an object that responds to #call(field, value) where field is the AcroForm field on which the value is set and value is the invalid value. The returned value is used instead of the invalid value.
The default implementation raises an error.
- debug
-
If set to true, enables debug output.
- document.auto_decrypt
-
A boolean determining whether the document should be decrypted automatically when parsed.
If this is set to false and the PDF document should later be decrypted, the method Encryption::SecurityHandler.set_up_decryption(document, decryption_opts) has to be called to set and retrieve the needed security handler. Note, however, that already loaded indirect objects have to be decrypted manually!
In nearly all cases this option should not be changed from its default setting!
- document.on_invalid_string
-
A callable object that takes the invalid UTF-16BE encoded string and returns a valid UTF-8 encoded string.
The default is to remove all invalid characters.
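An alternative to dropping invalid characters is to substitute them. The following sketch uses only Ruby's String#encode and replaces every invalid sequence with the Unicode replacement character; it assumes that the hook receives the raw bytes of the string, and the registration line assumes the hexapdf gem is loaded.

```ruby
# Interprets the bytes as UTF-16BE and converts them to UTF-8. Invalid or
# undefined sequences are replaced (U+FFFD is Ruby's default replacement
# for Unicode target encodings) instead of being removed.
replace_invalid = lambda do |str|
  str.encode(Encoding::UTF_8, Encoding::UTF_16BE, invalid: :replace, undef: :replace)
end

# Assumes `require 'hexapdf'` has already run; otherwise nothing is registered.
if defined?(HexaPDF)
  HexaPDF::DefaultDocumentConfiguration['document.on_invalid_string'] = replace_invalid
end
```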
- encryption.aes
-
The class that should be used for AES encryption. If the value is a String, it should contain the name of a constant to such a class.
See HexaPDF::Encryption::AES for the general interface such a class must conform to and HexaPDF::Encryption::RubyAES as well as HexaPDF::Encryption::FastAES for implementations.
- encryption.arc4
-
The class that should be used for ARC4 encryption. If the value is a String, it should contain the name of a constant to such a class.
See HexaPDF::Encryption::ARC4 for the general interface such a class must conform to and HexaPDF::Encryption::RubyARC4 as well as HexaPDF::Encryption::FastARC4 for implementations.
- encryption.filter_map
-
A mapping from a PDF name (a Symbol) to a security handler class (see Encryption::SecurityHandler). If the value is a String, it should contain the name of a constant to such a class.
PDF defines a standard security handler that is implemented (HexaPDF::Encryption::StandardSecurityHandler) and assigned the :Standard name.
- encryption.sub_filter_map
-
A mapping from a PDF name (a Symbol) to a security handler class (see HexaPDF::Encryption::SecurityHandler). If the value is a String, it should contain the name of a constant to such a class.
The sub filter map is used when the security handler defined by the encryption dictionary is not available, but a compatible implementation is.
- filter.map
-
A mapping from a PDF name (a Symbol) to a filter object (see Filter). If the value is a String, it should contain the name of a constant that contains a filter object.
The most often used filters are implemented and readily available.
See PDF1.7 s7.4.1, ADB sH.3 3.3
- font.map
-
Defines a mapping from font names and variants to font files.
The value needs to be a hash of the form:
{"font_name" => {variant: file_name, variant2: file_name2, ...}, ...}
Once a font is registered in this way, the font name together with a variant name can be used with the HexaPDF::Document::Fonts#add method to load the font.
For best compatibility, the following variant names should be used:
- none
-
For the normal variant of the font
- bold
-
For the bold variant of the font
- italic
-
For the italic or oblique variant of the font
- bold_italic
-
For the bold and italic/oblique variant of the font
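A font map in the form described above might look like the following sketch. The font name 'MyFont' and all file paths are placeholders, not files that ship with HexaPDF; the document creation assumes the hexapdf gem is loaded.

```ruby
# Placeholder font name and paths; replace them with real font files.
font_map = {
  'MyFont' => {
    none: '/path/to/MyFont-Regular.ttf',
    bold: '/path/to/MyFont-Bold.ttf',
    italic: '/path/to/MyFont-Italic.ttf',
    bold_italic: '/path/to/MyFont-BoldItalic.ttf',
  },
}

# Assumes `require 'hexapdf'` has already run; otherwise nothing happens.
if defined?(HexaPDF)
  doc = HexaPDF::Document.new(config: {'font.map' => font_map})
  # With real files in place, the bold variant could then be loaded with:
  #   doc.fonts.add('MyFont', variant: :bold)
end
```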
- font.on_missing_glyph
-
Callback hook when a UTF-8 character cannot be mapped to a glyph of a font.
The value needs to be an object that responds to #call(character, font_wrapper) where character is the Unicode character for the missing glyph and returns a substitute glyph to be used instead.
The font_wrapper argument is the used font wrapper object, e.g. HexaPDF::Font::TrueTypeWrapper. To access the HexaPDF::Document instance from which this hook was called, you can use font_wrapper.pdf_object.document.
The default implementation returns an object of class HexaPDF::Font::InvalidGlyph which, when not removed before encoding, will raise an error.
- font.on_missing_unicode_mapping
-
Callback hook when a character code point cannot be converted to a Unicode character.
The value needs to be an object that responds to #call(code, font_dict) where code is the decoded code point and font_dict is the font dictionary which was used for the conversion. The returned value is used as the Unicode character and should be a string.
The default implementation raises an error.
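A more lenient hook could return a placeholder string instead of raising. The sketch below returns the Unicode replacement character for every unmapped code point; this choice is an assumption made for illustration, and the registration line assumes the hexapdf gem is loaded.

```ruby
# Returns U+FFFD for any code point that has no Unicode mapping.
unknown_char = lambda do |_code, _font_dict|
  "\u{FFFD}"
end

# Assumes `require 'hexapdf'` has already run; otherwise nothing is registered.
if defined?(HexaPDF)
  HexaPDF::DefaultDocumentConfiguration['font.on_missing_unicode_mapping'] = unknown_char
end
```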
- font_loader
-
An array with font loader implementations. When a font should be loaded, the array is iterated in sequence and the first valid font returned by a font loader is used.
If a value is a String, it should contain the name of a constant that is a font loader object.
See the HexaPDF::FontLoader module for information on how to implement a font loader object.
- graphic_object.map
-
A mapping from graphic object names to graphic object factories.
See HexaPDF::Content::GraphicObject for more information.
- graphic_object.arc.max_curves
-
The maximum number of curves used for approximating a complete ellipse using Bezier curves.
The default value is 6; higher values result in better approximations but also take longer to compute. It should not be set to values lower than 4, otherwise the approximation of a complete ellipse is visibly inaccurate.
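To give a feeling for how the number of curves affects accuracy, the following sketch measures the largest radial deviation from a unit circle when the circle is split into n equal arcs, each drawn as one cubic Bezier curve with the common control point distance 4/3 * tan(angle / 4). It is a generic illustration and not necessarily how HexaPDF computes its arcs; the method name max_radial_error is made up for this example.

```ruby
# Largest distance between the unit circle and its approximation by n cubic
# Bezier curves, found by sampling one of the n identical arcs.
def max_radial_error(n, samples: 1000)
  theta = 2 * Math::PI / n             # angle covered by one curve
  k = 4.0 / 3.0 * Math.tan(theta / 4)  # control point distance along the tangent

  p0 = [1.0, 0.0]
  p1 = [1.0, k]
  p3 = [Math.cos(theta), Math.sin(theta)]
  p2 = [p3[0] + k * Math.sin(theta), p3[1] - k * Math.cos(theta)]

  (0..samples).map do |i|
    t = i.fdiv(samples)
    u = 1 - t
    x, y = [0, 1].map do |c|
      u**3 * p0[c] + 3 * u**2 * t * p1[c] + 3 * u * t**2 * p2[c] + t**3 * p3[c]
    end
    (Math.hypot(x, y) - 1).abs
  end.max
end

[3, 4, 6, 8].each do |n|
  puts format('%d curves: max radial error %.2e', n, max_radial_error(n))
end
```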
- image_loader
-
An array with image loader implementations. When an image should be loaded, the array is iterated in sequence to find a suitable image loader.
If a value is a String, it should contain the name of a constant that is an image loader object.
See the HexaPDF::ImageLoader module for information on how to implement an image loader object.
- image_loader.pdf.use_stringio
-
A boolean determining whether images specified via file names should be read into memory all at once using a StringIO object.
Since loading a PDF as image entails having the IO object from the image PDF around until the PDF document where it is used is written, there is the choice whether memory should be used to load the image PDF all at once or whether a File object is used that needs to be manually closed.
To avoid leaking file descriptors, using the StringIO is the default setting. If you set this option to false, it is strongly advised to use ObjectSpace.each_object(File) (or IO instead of File) to traverse the list of open file descriptors and close the ones that have been used for PDF images.
- io.chunk_size
-
The size of the chunks that are used when reading IO data.
This can be used to limit the memory needed for reading or writing PDF files with huge stream objects.
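The effect of a chunk size can be illustrated with plain Ruby IO: only one chunk is held in memory at a time, regardless of the total size of the data. The helper each_chunk is a made-up name for this generic sketch and does not reflect HexaPDF's internal implementation.

```ruby
require 'stringio'

# Yields successive chunks of at most chunk_size bytes until the IO is exhausted.
def each_chunk(io, chunk_size)
  return enum_for(:each_chunk, io, chunk_size) unless block_given?

  while (chunk = io.read(chunk_size))
    yield chunk
  end
end

each_chunk(StringIO.new('abcdefg'), 3) { |chunk| puts chunk }
```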
- layout.boxes.map
-
A mapping from layout box names to box classes. If the value is a String, it should contain the name of a constant to such a class.
See HexaPDF::Layout::Box for more information.
- page.default_media_box
-
The media box that is used for new pages that don't define a media box. Default value is A4. See HexaPDF::Type::Page::PAPER_SIZE for a list of predefined paper sizes.
This configuration option (together with 'page.default_media_orientation') is also used when validating pages and a page without a media box is found.
The value can either be a rectangle defining the paper size or a Symbol referencing one of the predefined paper sizes.
- page.default_media_orientation
-
The page orientation that is used for new pages that don't define a media box. It is only used if 'page.default_media_box' references a predefined paper size. Default value is :portrait. The other possible value is :landscape.
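The two page options can be combined as in the following sketch, which assumes the hexapdf gem is loaded and uses :Letter as an example of a predefined paper size. A page added without an explicit media box should then pick up both settings.

```ruby
# Example values: a predefined paper size plus the non-default orientation.
page_defaults = {
  'page.default_media_box' => :Letter,
  'page.default_media_orientation' => :landscape,
}

# Assumes `require 'hexapdf'` has already run; otherwise nothing happens.
if defined?(HexaPDF)
  doc = HexaPDF::Document.new(config: page_defaults)
  page = doc.pages.add          # no media box given, so the defaults apply
  box = page.box(:media)
  puts "#{box.width} x #{box.height}"
end
```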
- parser.on_correctable_error
-
Callback hook when the parser encounters an error that can be corrected.
The value needs to be an object that responds to #call(document, message, position) and returns true if an error should be raised.
- parser.try_xref_reconstruction
-
A boolean specifying whether non-recoverable parsing errors should lead to reconstructing the main cross-reference table.
The reconstructed cross-reference table might make damaged files usable but there is no way to ensure that the reconstructed file is equal to the undamaged original file (though generally it works out).
There is also the possibility that reconstructing doesn't work because the algorithm has to assume that the PDF was written in a certain way (which is recommended by the PDF specification).
Defaults to true.
- sorted_tree.max_leaf_node_size
-
The maximum number of nodes that should be in a leaf node of a node tree.
- style.layers_map
-
A mapping from style layer names to layer objects.
See HexaPDF::Layout::Style::Layers for more information.
- signature.signing_handler
-
A mapping from a Symbol to a signing handler class (see HexaPDF::Document::Signatures::DefaultHandler). If the value is a String, it should contain the name of a constant to such a class.
- signature.sub_filter_map
-
A mapping from a PDF name (a Symbol) to a signature handler class (see HexaPDF::DigitalSignature::Handler). If the value is a String, it should contain the name of a constant to such a class.
The sub filter map is used for mapping specific signature algorithms to handler classes. The filter value of a signature dictionary is ignored since we only support the standard signature algorithms.
- task.map
-
A mapping from task names to callable task objects. See HexaPDF::Task for more information.
- GlobalConfiguration
The global configuration object, providing the following options:
- color_space.map
-
A mapping from a PDF name (a Symbol) to a color space class (see HexaPDF::Content::ColorSpace). If the value is a String, it should contain the name of a constant that contains a color space class.
Classes for the most often used color space families are implemented and readily available.
See PDF1.7 s8.6
- filter.flate.compression
-
Specifies the compression level that should be used with the FlateDecode filter. The level ranges from 0 (no compression) through 1 (best speed) to 9 (best compression, the default).
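What the levels mean in practice can be seen with Ruby's Zlib directly. The sketch below compresses the same sample data at levels 0, 1 and 9 and prints the resulting sizes; it illustrates the Zlib levels themselves rather than HexaPDF's filter code, and the final line assumes the hexapdf gem is loaded.

```ruby
require 'zlib'

data = 'HexaPDF ' * 1_000

stored = Zlib::Deflate.deflate(data, Zlib::NO_COMPRESSION)    # level 0
fast   = Zlib::Deflate.deflate(data, Zlib::BEST_SPEED)        # level 1
small  = Zlib::Deflate.deflate(data, Zlib::BEST_COMPRESSION)  # level 9

puts "input: #{data.bytesize} bytes"
puts "level 0: #{stored.bytesize}, level 1: #{fast.bytesize}, level 9: #{small.bytesize}"

# Assumes `require 'hexapdf'` has already run; otherwise nothing is changed.
if defined?(HexaPDF)
  HexaPDF::GlobalConfiguration['filter.flate.compression'] = Zlib::BEST_SPEED
end
```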
- filter.flate.on_error
-
Callback hook when a potentially recoverable Zlib error occurs in the FlateDecode filter.
The value needs to be an object that responds to #call(stream, error) where stream is the Zlib stream object and error is the thrown error. The method needs to return true if an error should be raised.
The default implementation prevents errors from being raised.
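A stricter hook than the default might tolerate only some error classes. The policy in the following sketch (ignore Zlib::BufError, raise for everything else) is hypothetical and chosen only to show the return-value contract; the registration line assumes the hexapdf gem is loaded.

```ruby
require 'zlib'

# Hypothetical policy: tolerate buffer errors (e.g. truncated streams),
# request an exception for every other Zlib error.
on_flate_error = lambda do |_stream, error|
  !error.is_a?(Zlib::BufError)
end

# Assumes `require 'hexapdf'` has already run; otherwise nothing is registered.
if defined?(HexaPDF)
  HexaPDF::GlobalConfiguration['filter.flate.on_error'] = on_flate_error
end
```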
- filter.flate.memory
-
Specifies the memory level that should be used with the FlateDecode filter. The level can range from 1 (minimum memory usage; slow, reduces compression) to 9 (maximum memory usage).
The HexaPDF default value of 6 has been found in tests to be nearly equivalent to the Zlib default of 8 in terms of speed and compression level but uses less memory.
- filter.predictor.strict
-
Specifies whether the predictor algorithm used by LZWDecode and FlateDecode should operate in strict mode, i.e. adhering to the PDF specification without correcting for common deficiencies of PDF writer libraries.
- object.type_map
-
A mapping from a PDF name (a Symbol) to PDF object classes which is based on the /Type field. If the value is a String, it should contain the name of a constant that contains a PDF object class.
This mapping is used to provide automatic wrapping of objects in the HexaPDF::Document#wrap method.
- object.subtype_map
-
A mapping from a PDF name (a Symbol) to PDF object classes which is based on the /Subtype field. If the value is a String, it should contain the name of a constant that contains a PDF object class.
This mapping is used to provide automatic wrapping of objects in the HexaPDF::Document#wrap method.
- VERSION
The version of HexaPDF.