class HexaPDF::Parser

Parent: Object

Parses an IO stream according to PDF2.0 to get at the contained objects.

This class also contains higher-level methods for getting indirect objects and revisions.

See: PDF2.0 s7

Attributes

io[R]

The IO stream which is parsed.

Public Class Methods

new(io, document)

Creates a new parser for the given IO object.

PDF references are resolved using the associated Document object.

Public Instance Methods

file_header_version()

Returns the PDF version number that is stored in the file header.

See: PDF2.0 s7.5.2
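
To illustrate what reading the header version involves, here is a minimal sketch that uses only the Ruby standard library. It is not HexaPDF's implementation: the helper name sketch_header_version and the 1024-byte search window are assumptions made for this example.

```ruby
require 'stringio'

# Hypothetical helper (not part of HexaPDF): returns the version string from
# a "%PDF-x.y" header, or nil if no header is found in the search window.
def sketch_header_version(io, window: 1024)
  io.seek(0)
  # Read as binary so that non-ASCII bytes after the header cannot break matching.
  head = (io.read(window) || '').b
  match = head.match(/%PDF-(\d\.\d)/)
  match && match[1]
end

io = StringIO.new("%PDF-1.7\n%\xE2\xE3\xCF\xD3\n1 0 obj\n".b)
puts sketch_header_version(io)
```

With a real Parser instance, file_header_version() is the method to call; the sketch only shows the kind of data the method looks at.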

linearized?()

Returns true if the PDF file is a linearized file.

Note: The method uses heuristics to determine whether a PDF file is linearized. For slightly invalid or damaged PDFs that HexaPDF can recover from, this method may return true even though the PDF isn’t actually linearized.

load_compressed_object(xref_entry)

Loads the compressed object identified by the cross-reference entry.

load_object(xref_entry)

Loads the indirect (potentially compressed) object specified by the given cross-reference entry.

For information about the xref_entry argument, have a look at HexaPDF::XRefSection and HexaPDF::XRefSection::Entry.

load_revision(pos)

Loads a single revision whose cross-reference section/stream is located at the given position.

Returns a HexaPDF::XRefSection object and the accompanying trailer dictionary.

parse_indirect_object(offset = nil)

Parses the indirect object at the specified offset.

This method is used by a PDF Document to load objects. It should not be used by any other object because invalid object positions lead to errors.

Returns an array containing [object, oid, gen, stream].

See: PDF2.0 s7.3.10, s7.3.8
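
The following standard-library sketch shows why an invalid offset leads to an error: an indirect object must start with "oid gen obj", and anything else at that position cannot be parsed. The helper name sketch_indirect_object_header is hypothetical, and the sketch only reads the object header rather than the full [object, oid, gen, stream] result that the real method returns.

```ruby
require 'stringio'

# Hypothetical helper (not part of HexaPDF): reads the "oid gen obj" header
# at the given offset and returns [oid, gen]. Raises if the offset does not
# point at the start of an indirect object.
def sketch_indirect_object_header(io, offset)
  io.seek(offset)
  head = (io.read(32) || '').b
  match = head.match(/\A\s*(\d+)\s+(\d+)\s+obj\b/)
  raise ArgumentError, "no indirect object at offset #{offset}" unless match
  [match[1].to_i, match[2].to_i]
end

pdf = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\n"
p sketch_indirect_object_header(StringIO.new(pdf), 9)
```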

parse_xref_section_and_trailer(offset)

Parses the cross-reference section at the given position and the following trailer and returns them as an array consisting of a HexaPDF::XRefSection instance and a hash.

This method can only parse cross-reference sections, not cross-reference streams!

See: PDF2.0 s7.5.4, s7.5.5; ADB1.7 sH.3-3.4.3
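
As an illustration of the cross-reference section layout described in PDF2.0 s7.5.4, the sketch below reads the subsections of a classic xref table using only the standard library. It is a simplification, not HexaPDF's parser: the helper name sketch_parse_xref_section and the hash-based entries are assumptions, lines are assumed to end in "\n", malformed input raises an error, and the trailer dictionary is not parsed.

```ruby
require 'stringio'

# Hypothetical helper (not part of HexaPDF): parses a classic cross-reference
# table starting at +offset+ and returns its entries as an array of hashes.
# Parsing stops at the "trailer" keyword.
def sketch_parse_xref_section(io, offset)
  io.seek(offset)
  raise ArgumentError, 'no xref keyword at offset' unless io.gets.to_s.strip == 'xref'

  entries = []
  while (line = io.gets)
    line = line.strip
    break if line == 'trailer'
    next if line.empty?

    # Subsection header: first object number and number of entries.
    start, count = line.split.map { |n| Integer(n, 10) }
    count.times do |i|
      # Entry: 10-digit byte offset, 5-digit generation, "n" (in use) or "f" (free).
      pos, gen, flag = io.gets.to_s.split
      entries << { oid: start + i, gen: Integer(gen, 10),
                   type: (flag == 'n' ? :in_use : :free), pos: Integer(pos, 10) }
    end
  end
  entries
end

pdf = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\nxref\n0 2\n" \
      "0000000000 65535 f \n0000000009 00000 n \n" \
      "trailer\n<</Size 2>>\nstartxref\n29\n%%EOF\n"
sketch_parse_xref_section(StringIO.new(pdf), 29).each { |entry| p entry }
```

The explicit base 10 in Integer(n, 10) matters here: without it, Ruby would treat the zero-padded offsets as octal numbers.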

reconstructed?()

Returns true if the PDF file was damaged and could be reconstructed.

reconstructed_revision()

Returns the reconstructed revision.

startxref_offset()

Returns the offset of the main cross-reference section/stream.

Implementation note: Normally, the %%EOF marker has to be on the last line. However, Adobe viewers relax this restriction, and so do we.

If strict parsing is disabled, the whole file is searched for the offset.

See: PDF2.0 s7.5.5, ADB1.7 sH.3-3.4.4
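
The lookup described above can be sketched with the standard library as follows. This is an illustration under stated assumptions, not HexaPDF's code: the helper name sketch_startxref_offset and the default 1024-byte window are invented for the example. The sketch searches the end of the file for the last "startxref", offset, "%%EOF" sequence and tolerates trailing data after the marker.

```ruby
require 'stringio'

# Hypothetical helper (not part of HexaPDF): returns the offset stored after
# the last "startxref" keyword found in the final +window+ bytes, or nil.
def sketch_startxref_offset(io, window: 1024)
  io.seek(0, IO::SEEK_END)
  size = io.pos
  length = [size, window].min
  io.seek(size - length)
  tail = (io.read(length) || '').b
  # %%EOF does not have to be on the very last line: data may follow it.
  matches = tail.scan(/startxref\s+(\d+)\s+%%EOF/)
  matches.empty? ? nil : matches.last.first.to_i
end

pdf = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\nxref\n0 1\n0000000000 65535 f \n" \
      "trailer\n<</Size 1>>\nstartxref\n29\n%%EOF\n\n"
p sketch_startxref_offset(StringIO.new(pdf))
```

Passing a window as large as the file makes the sketch search the whole input, which is roughly what the non-strict mode described above does.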

xref_section?(offset)

Looks at the given offset and returns true if there is a cross-reference section at that position.
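
A check like this matters because the position found via startxref may point either at a classic cross-reference section, which starts with the xref keyword, or at a cross-reference stream, which starts like any indirect object. The sketch below shows the idea with the standard library only. The helper name sketch_xref_section? is hypothetical and HexaPDF's own check may differ.

```ruby
require 'stringio'

# Hypothetical helper (not part of HexaPDF): true if the "xref" keyword,
# followed by whitespace, is found at the given offset.
def sketch_xref_section?(io, offset)
  io.seek(offset)
  head = (io.read(16) || '').b
  head.match?(/\A\s*xref\s/)
end

pdf = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\nxref\n0 1\n0000000000 65535 f \n"
p sketch_xref_section?(StringIO.new(pdf), 29)
p sketch_xref_section?(StringIO.new(pdf), 9)
```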
