class HexaPDF::Tokenizer
Parent: Object
Tokenizes the content of an IO object following the PDF rules.
See: PDF2.0 s7.2
Constants
Attributes
The IO object from which the tokens are read.
Public Class Methods
Creates a new tokenizer for the given IO stream.
If on_correctable_error is set to an object responding to +call(msg, pos)+, errors for correctable situations are only raised if the return value of calling the object is true.
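The callback contract described above can be illustrated with a small standalone sketch. It does not use HexaPDF itself: the error class, the helper name and the behaviour when no handler is given (the problem is tolerated) are assumptions made for this example only.

```ruby
# Hypothetical error class and helper, named for this sketch only.
class CorrectableParseError < StandardError; end

def report_correctable(handler, msg, pos)
  # Assumption: without a handler the problem is silently tolerated.
  return false unless handler
  # The handler sees every correctable problem; a true result escalates it.
  raise CorrectableParseError, "#{msg} (byte #{pos})" if handler.call(msg, pos)
  false
end

logged = []
lenient = ->(msg, pos) { logged << [msg, pos]; false } # record and continue
strict  = ->(_msg, _pos) { true }                      # always escalate

report_correctable(lenient, 'missing whitespace', 12)
begin
  report_correctable(strict, 'missing whitespace', 12)
rescue CorrectableParseError => e
  puts e.message
end
```

A handler that returns false is a convenient place to collect warnings, while one that returns true turns every correctable problem into a hard failure.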
Public Instance Methods
Reads the byte (an integer) at the current position and advances the scan pointer.
Returns a single integer or keyword token read from the current position and advances the scan pointer. If the current position doesn’t contain such a token, nil is returned without advancing the scan pointer. The value NO_MORE_TOKENS is returned if there are no more tokens available.
Initial runs of whitespace characters are ignored.
Note: This is a special method meant for use when reconstructing the cross-reference table!
Returns the PDF object at the current position. This is different from next_token because references, arrays and dictionaries consist of multiple tokens.
If the allow_end_array_token argument is true, the ‘]’ token is permitted to facilitate the use of this method during array parsing.
See: PDF2.0 s7.3
Returns a single token read from the current position and advances the scan pointer.
Comments and a run of whitespace characters are ignored. The value NO_MORE_TOKENS is returned if there are no more tokens available.
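As a rough illustration of the behaviour described here, the following sketch skips whitespace and comments and then returns the next raw token, using only Ruby's StringScanner. It is not HexaPDF's implementation: the helper name and the sentinel object are stand-ins, tokens are returned as unparsed strings, and literal or hexadecimal strings are not handled.

```ruby
require 'strscan'

# Stand-in sentinel for this sketch; compare it by identity.
NO_MORE_TOKENS = Object.new.freeze

# PDF whitespace is NUL, TAB, LF, FF, CR and SPACE; comments run from % to end of line.
SKIP_RE = /(?:[\x00\t\n\f\r ]+|%[^\r\n]*)+/
# Delimiters first, then names (/...), then runs of regular characters.
TOKEN_RE = /<<|>>|[\[\]{}()<>]|\/[^\x00\t\n\f\r ()<>\[\]{}\/%]*|[^\x00\t\n\f\r ()<>\[\]{}\/%]+/

# Hypothetical helper: returns the next raw token string or NO_MORE_TOKENS.
def next_raw_token(scanner)
  scanner.skip(SKIP_RE)                  # ignore whitespace runs and comments
  return NO_MORE_TOKENS if scanner.eos?
  scanner.scan(TOKEN_RE)                 # consumes the token, advancing the pointer
end

scanner = StringScanner.new("% header\n  /Type /Page % note\n 42 [")
tokens = []
until (token = next_raw_token(scanner)).equal?(NO_MORE_TOKENS)
  tokens << token
end
p tokens
```

Using a dedicated sentinel rather than nil keeps "end of input" distinguishable from a token whose parsed value could itself be nil, such as the PDF null object.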
Reads the cross-reference subsection entry at the current position and advances the scan pointer.
If a problem is detected, yields to the caller with the argument recoverable, which is truthy if the problem is recoverable.
See: PDF2.0 s7.5.4
Returns the next token but does not advance the scan pointer.
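Peeking can be sketched as "read, then restore the position". The example below uses StringScanner and a deliberately simplified, whitespace-delimited token reader; both helper names are hypothetical and say nothing about how HexaPDF implements the method internally.

```ruby
require 'strscan'

# Hypothetical helper: reads one whitespace-delimited token and advances.
def read_token(scanner)
  scanner.skip(/\s+/)
  scanner.scan(/\S+/)
end

# Hypothetical helper: returns the same token but leaves the scan pointer alone.
def peek_at_token(scanner)
  saved = scanner.pos
  token = read_token(scanner)
  scanner.pos = saved # undo the advance made by read_token
  token
end

scanner = StringScanner.new('1 0 obj')
first_peek = peek_at_token(scanner) # pointer stays at 0
first_read = read_token(scanner)    # now the pointer moves past "1"
second_peek = peek_at_token(scanner)
p [first_peek, first_read, second_peek]
```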
Returns the current position of the tokenizer inside the IO object.
Note that this position might be different from io.pos since the latter could have been changed somewhere else.
Sets the position at which the next token should be read.
Note that this does not set io.pos directly (at the moment of invocation)!
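One way such a decoupling between the reader's position and io.pos can come about is a reader that only records the requested position and seeks the IO when it actually reads. The class below is a hypothetical sketch under that assumption, built on StringIO; it is not taken from HexaPDF.

```ruby
require 'stringio'

# Hypothetical reader that tracks its own position independently of io.pos.
class LazyPositionReader
  attr_reader :io, :pos

  def initialize(io)
    @io = io
    @pos = 0
  end

  # Only records the position; the underlying IO is not touched here.
  def pos=(value)
    @pos = value
  end

  # Seeks the IO to the recorded position right before reading.
  def next_byte
    @io.pos = @pos
    byte = @io.getbyte
    @pos += 1 if byte
    byte
  end
end

io = StringIO.new('abc')
reader = LazyPositionReader.new(io)
reader.pos = 2
io_pos_after_set = io.pos # still 0, nothing has been read yet
byte = reader.next_byte   # seeks to 2 and reads the byte for "c"
io.pos = 0                # someone else moves the IO
p [io_pos_after_set, byte, reader.pos, io.pos]
```

Because the reader re-seeks before each read, changes made to io.pos elsewhere do not affect where it continues.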
Utility method for scanning until the given regular expression matches.
If the end of the file is reached in the process, nil is returned. Otherwise the matched string is returned.
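The "scan until a match or the end of the file" idea can be sketched over a chunked IO as follows. The helper name and the chunk_size parameter are assumptions for this example; the return value follows StringScanner#scan_until, which yields the text up to and including the match, and may differ in detail from what HexaPDF returns.

```ruby
require 'stringio'
require 'strscan'

# Hypothetical helper: reads the IO in chunks until the pattern matches.
# Returns the text up to and including the match, or nil at end of file.
def scan_io_until(io, pattern, chunk_size: 8)
  buffer = String.new # binary buffer, grows as chunks are read
  loop do
    found = StringScanner.new(buffer).scan_until(pattern)
    return found if found

    chunk = io.read(chunk_size)
    return nil if chunk.nil? # end of file reached without a match
    buffer << chunk
  end
end

io = StringIO.new("1 0 obj\n<< >>\nendobj\n")
matched = scan_io_until(io, /endobj/)
missing = scan_io_until(StringIO.new('abc'), /xyz/)
p matched, missing
```

Rescanning the whole buffer after each chunk keeps the sketch short and lets a match span chunk boundaries; a production reader would avoid the repeated work.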
Skips all whitespace at the current position.
See: PDF2.0 s7.2.2