class HexaPDF::Tokenizer
Parent: Object
Tokenizes the content of an IO object following the PDF rules.
See: PDF1.7 s7.2
Constants
Attributes
The IO object from which the tokens are read.
Public Class Methods
Creates a new tokenizer.
Public Instance Methods
Reads the byte (an integer) at the current position and advances the scan pointer.
Returns the PDF object at the current position. This is different from next_token because references, arrays and dictionaries consist of multiple tokens.
If the allow_end_array_token argument is true, the ']' token is permitted to facilitate the use of this method during array parsing.
See: PDF1.7 s7.3
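The difference between a token and an object can be sketched as follows. This is a simplified illustration, not HexaPDF's implementation: the method name `next_object_sketch`, the `END_ARRAY` marker and the pre-split token array are assumptions made for the example, and references and dictionaries are left out.

```ruby
# Hypothetical marker standing in for the ']' token.
END_ARRAY = :end_array

# Builds one object from a list of already-split tokens. An array spans
# several tokens, so the method calls itself until it sees ']'.
def next_object_sketch(tokens, allow_end_array_token: false)
  token = tokens.shift
  raise ArgumentError, 'unexpected end of input' if token.nil?

  case token
  when '['
    result = []
    loop do
      # Inside an array, ']' is a legal token that ends the array.
      object = next_object_sketch(tokens, allow_end_array_token: true)
      break if object == END_ARRAY
      result << object
    end
    result
  when ']'
    raise ArgumentError, "unexpected ']'" unless allow_end_array_token
    END_ARRAY
  when /\A[+-]?\d+\z/
    token.to_i
  else
    token
  end
end
```

At the top level a stray ']' is an error, while inside an array it is the signal to stop collecting elements, which is the role the text above gives to allow_end_array_token.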
Returns a single token read from the current position and advances the scan pointer.
Comments and a run of whitespace characters are ignored. The value NO_MORE_TOKENS is returned if there are no more tokens available.
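The behavior described above (skip whitespace and comments, return a sentinel at the end of input) can be sketched with Ruby's StringScanner. The class `TokenSketch`, its regular expressions and the sentinel value are assumptions for illustration only. Tokens are returned as raw strings, and string and hex-string literals are not modelled.

```ruby
require 'strscan'

# Hypothetical sketch class; not HexaPDF's implementation.
class TokenSketch
  # Sentinel returned at the end of input (value chosen for this sketch).
  NO_MORE_TOKENS = :no_more_tokens

  # Runs of PDF whitespace (NUL, HT, LF, FF, CR, SP) and '%' comments.
  SKIP = /(?:[\x00\t\n\f\r ]+|%[^\r\n]*)*/

  # '<<', '>>', a single bracket or brace, or a run of regular characters
  # with an optional leading '/' (a name).
  TOKEN = /<<|>>|[\[\]{}]|\/?[^\x00\t\n\f\r ()<>\[\]{}\/%]+/

  def initialize(string)
    @scanner = StringScanner.new(string)
  end

  # Returns the next token as a string and advances the scan pointer.
  def next_token
    @scanner.skip(SKIP)
    return NO_MORE_TOKENS if @scanner.eos?
    # Fall back to a single character for anything the sketch does not
    # model (for example string delimiters).
    @scanner.scan(TOKEN) || @scanner.getch
  end
end
```

Returning a dedicated sentinel instead of nil lets a caller tell "end of input" apart from a token whose value happens to be nil-like.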
Reads the cross-reference subsection entry at the current position and advances the scan pointer.
If a possible problem is detected, yields to the caller.
See: PDF1.7 s7.5.4
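A cross-reference entry is a fixed-width, 20-byte line: a 10-digit byte offset, a 5-digit generation number, the keyword n or f, and a two-character end-of-line sequence. The sketch below shows one way "yield on a possible problem" could look. The method name, both regular expressions and the return shape are assumptions for illustration, not HexaPDF's actual behavior.

```ruby
# Strict 20-byte form of an entry.
XREF_STRICT = /\A(\d{10}) (\d{5}) ([nf])(?: \r| \n|\r\n)\z/
# Lenient fallback accepting loosely formatted entries (an assumption of
# this sketch).
XREF_LENIENT = /\A\s*(\d+)\s+(\d+)\s+([nf])\s*\z/

# Returns [offset, generation, type]. Yields the raw entry if it is only
# parseable in the lenient form, so the caller can decide how to react.
def parse_xref_entry_sketch(entry)
  match = XREF_STRICT.match(entry)
  unless match
    match = XREF_LENIENT.match(entry)
    raise ArgumentError, 'not a cross-reference entry' unless match
    yield(entry) if block_given?
  end
  [match[1].to_i, match[2].to_i, match[3]]
end
```

Handing the decision to the caller through a block keeps the parser itself free of policy: one caller might log a warning, another might abort.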
Returns the next token but does not advance the scan pointer.
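Peeking is commonly built from a normal read followed by restoring the saved position. The following minimal sketch assumes whitespace-separated tokens. The class `PeekSketch` is hypothetical and does not reflect how HexaPDF implements the method.

```ruby
require 'strscan'

# Hypothetical sketch class for whitespace-separated tokens.
class PeekSketch
  def initialize(string)
    @scanner = StringScanner.new(string)
  end

  # Reads a token and advances the scan pointer (nil at end of input).
  def next_token
    @scanner.skip(/\s+/)
    @scanner.scan(/\S+/)
  end

  # Reads a token, then rewinds so the next read sees the same token.
  def peek_token
    saved = @scanner.pos
    token = next_token
    @scanner.pos = saved
    token
  end
end
```

Lookahead of this kind is useful for constructs such as indirect references, where a parser has to check the following tokens before deciding how to interpret a number.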
Returns the current position of the tokenizer inside the IO object.
Note that this position might be different from io.pos since the latter could have been changed somewhere else.
Sets the position at which the next token should be read.
Note that this does *not* set io.pos directly (at the moment of invocation)!
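One way to picture a tokenizer position that is decoupled from io.pos is a reader that only records the requested position and seeks lazily on the next read. The class `PositionSketch` is an assumption for illustration; HexaPDF's internal buffering may work differently.

```ruby
require 'stringio'

# Hypothetical sketch class; keeps its own position separate from io.pos.
class PositionSketch
  # The setter only records the position; the IO object is not touched.
  attr_accessor :pos

  def initialize(io)
    @io = io
    @pos = 0
  end

  # Seeks to the recorded position, reads one byte (an integer) and
  # advances the recorded position. Returns nil at end of input.
  def next_byte
    @io.seek(@pos)
    byte = @io.getbyte
    @pos += 1 if byte
    byte
  end
end
```

Because the seek happens at read time, code elsewhere may move io.pos freely without disturbing where the sketch reads next.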
Utility method for scanning until the given regular expression matches.
If the end of the file is reached in the process, nil is returned. Otherwise the matched string is returned.
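Scanning an IO object for a pattern usually means reading chunks until the pattern appears or the input ends. The sketch below illustrates that idea. The function name, the chunk size and the choice to return everything up to and including the match (as StringScanner#scan_until does) are assumptions, not a description of HexaPDF's method.

```ruby
require 'stringio'

# Reads chunks from io until regexp matches the accumulated buffer.
# Returns the text up to and including the match, or nil if the end of
# the input is reached first. The sketch does not reposition io after
# the match, so bytes beyond the match may already have been consumed.
def scan_until_sketch(io, regexp, chunk_size: 16)
  buffer = String.new(encoding: Encoding::BINARY)
  loop do
    match = regexp.match(buffer)
    return buffer[0, match.end(0)] if match

    chunk = io.read(chunk_size)
    return nil if chunk.nil? # end of file reached without a match
    buffer << chunk
  end
end
```

Re-matching the whole buffer after every chunk handles patterns that straddle a chunk boundary, at the cost of repeated scanning for long inputs.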
Skips all whitespace at the current position.
See: PDF1.7 s7.2.2