Tokenizes the content of an IO object following the PDF rules.
See: PDF1.7 s7.2
The IO object from which the tokens are read.
Public Class Methods
Creates a new tokenizer.
Public Instance Methods
Reads the byte (an integer) at the current position and advances the scan pointer.
Returns the PDF object at the current position. This is different from next_token because references, arrays and dictionaries consist of multiple tokens.
If the allow_end_array_token argument is true,
the ']' token is permitted, to facilitate the use of this method
during array parsing.
See: PDF1.7 s7.3
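The reason for permitting the ']' token can be seen in a minimal sketch of recursive array parsing. This is not the library's implementation: the method name sketch_next_object, the END_ARRAY marker and the pre-split token list are assumptions made for illustration only.

```ruby
# Hypothetical marker returned in place of the ']' token.
END_ARRAY = :end_array

# Builds one object from a list of already separated string tokens.
# Arrays are assembled by calling the method recursively; only those
# recursive calls allow ']' so that the loop knows where the array ends.
def sketch_next_object(tokens, allow_end_array_token: false)
  token = tokens.shift
  case token
  when '['
    array = []
    until (item = sketch_next_object(tokens, allow_end_array_token: true)).equal?(END_ARRAY)
      array << item
    end
    array
  when ']'
    raise ArgumentError, "unexpected ']'" unless allow_end_array_token
    END_ARRAY
  when nil
    raise ArgumentError, 'unexpected end of input'
  else
    token
  end
end

p sketch_next_object(%w([ 1 [ 2 3 ] ]))
```

A stray ']' outside of an array raises an error in this sketch because the top-level call does not allow it.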
Returns a single token read from the current position and advances the scan pointer.
Comments and runs of whitespace characters are ignored. The value
NO_MORE_TOKENS is returned if there are no more tokens.
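The behaviour described above can be sketched with Ruby's StringScanner. This is a simplified illustration, not the tokenizer's actual code: SKETCH_NO_MORE_TOKENS and sketch_next_token are hypothetical names, and strings, names and other multi-character delimiters are deliberately not handled.

```ruby
require 'strscan'

# Hypothetical sentinel standing in for the NO_MORE_TOKENS value.
SKETCH_NO_MORE_TOKENS = Object.new.freeze

# Runs of PDF whitespace (NUL, HT, LF, FF, CR, SP) and comments, which
# start with % and extend to the end of the line.
SKETCH_IGNORED = /(?:[\x00\t\n\f\r ]+|%[^\r\n]*)+/

# Returns either a run of regular characters or a single delimiter
# character, after skipping whitespace and comments in front of it.
def sketch_next_token(scanner)
  scanner.skip(SKETCH_IGNORED)
  return SKETCH_NO_MORE_TOKENS if scanner.eos?
  scanner.scan(/[^\x00\t\n\f\r ()<>\[\]{}\/%]+/) || scanner.getch
end

scanner = StringScanner.new("  % a comment\n 42 [true]")
tokens = []
until (token = sketch_next_token(scanner)).equal?(SKETCH_NO_MORE_TOKENS)
  tokens << token
end
p tokens  # prints ["42", "[", "true", "]"]
```

Using a dedicated sentinel object instead of nil keeps "no more tokens" distinguishable from any value a token could legitimately have.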
Reads the cross-reference subsection entry at the current position and advances the scan pointer.
If a possible problem is detected, yields to the caller.
See: PDF1.7 s7.5.4
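PDF1.7 s7.5.4 defines each cross-reference entry as exactly 20 bytes: a 10-digit byte offset, a space, a 5-digit generation number, a space, the keyword n or f, and a two-byte end-of-line marker. The sketch below parses that layout and yields when an entry is readable but does not follow it exactly. The method name, the return value and the message text are assumptions for illustration, not the library's API.

```ruby
require 'strscan'

# The exact 20-byte layout, and a more forgiving fallback.
STRICT_XREF_ENTRY = /(\d{10}) (\d{5}) ([nf])(?: \r| \n|\r\n)/
LENIENT_XREF_ENTRY = /(\d+)[ \t]+(\d+)[ \t]+([nf])[ \t]*\r?\n?/

# Returns [offset, generation, type] and advances the scanner, or returns
# nil if no entry can be read. Yields a message if only the lenient
# pattern matched.
def sketch_parse_xref_entry(scanner)
  unless scanner.scan(STRICT_XREF_ENTRY)
    return nil unless scanner.scan(LENIENT_XREF_ENTRY)
    yield 'cross-reference entry does not use the 20-byte layout' if block_given?
  end
  # String#to_i is used on purpose: it reads the zero-padded digits as decimal.
  [scanner[1].to_i, scanner[2].to_i, scanner[3]]
end

scanner = StringScanner.new("0000000017 00000 n \n0000000000 65535 f\n")
p sketch_parse_xref_entry(scanner) { |msg| warn msg }  # prints [17, 0, "n"]
p sketch_parse_xref_entry(scanner) { |msg| warn msg }  # prints [0, 65535, "f"] after a warning
```

Yielding instead of raising leaves the decision to the caller, who may choose to log the problem and continue or to abort parsing.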
Returns the next token but does not advance the scan pointer.
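One common way to look ahead without consuming input is to remember the scan position, read a token, and then restore the position. The following sketch shows that idea with StringScanner; sketch_peek_token is a hypothetical name, and splitting on whitespace is only a stand-in for real token reading.

```ruby
require 'strscan'

# Reads the next whitespace-separated word and then rewinds the scanner,
# so that the same word is returned again by the next read.
def sketch_peek_token(scanner)
  saved_pos = scanner.pos
  scanner.skip(/\s+/)
  token = scanner.scan(/\S+/)
  scanner.pos = saved_pos
  token
end

scanner = StringScanner.new('obj endobj')
p sketch_peek_token(scanner)  # prints "obj"
p scanner.pos                 # prints 0
```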
Returns the current position of the tokenizer inside the IO object.
Note that this position might be different from the position of the IO object itself because
the latter could have been changed somewhere else.
Sets the position at which the next token should be read.
Note that this does *not* set the position of the IO object directly
(at the moment of invocation)!
Utility method for scanning until the given regular expression matches.
If the end of the file is reached in the process, nil is
returned. Otherwise the matched string is returned.
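Scanning an IO object for a pattern usually means reading it in chunks, because the match may lie beyond the data read so far. The sketch below illustrates that under several assumptions: sketch_scan_until and chunk_size are hypothetical names, the returned string covers everything up to and including the match (as StringScanner#scan_until does), and nil signals that the end of the file was reached.

```ruby
require 'stringio'

# Reads chunks from io until pattern matches the accumulated data.
# On success, positions io directly after the match and returns the text
# up to and including it. Returns nil if the end of the file is reached
# first, in which case io is left at the end of the file.
def sketch_scan_until(io, pattern, chunk_size: 8)
  start = io.pos
  buffer = String.new
  loop do
    if (match = pattern.match(buffer))
      result = match.pre_match + match[0]
      io.pos = start + result.bytesize
      return result
    end
    chunk = io.read(chunk_size)
    return nil if chunk.nil?
    buffer << chunk
  end
end

io = StringIO.new('stream data endstream trailer')
p sketch_scan_until(io, /endstream/)  # prints "stream data endstream"
p sketch_scan_until(io, /xref/)       # prints nil
```

The sketch returns on the first match it finds, so a pattern that could match more text once further data is read may be cut short at a chunk boundary.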
Skips all whitespace at the current position.
See: PDF1.7 s7.2.2
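PDF1.7 s7.2.2 lists six whitespace characters: NUL, horizontal tab, line feed, form feed, carriage return and space. This set is not the same as Ruby's \s, which excludes NUL and includes the vertical tab, so an explicit character class is the safer choice. The following is a minimal sketch with hypothetical names, not the library's implementation.

```ruby
require 'strscan'

# The six whitespace characters of the PDF syntax.
SKETCH_PDF_WHITESPACE = /[\x00\t\n\f\r ]+/

# Advances the scanner past any PDF whitespace at the current position.
def sketch_skip_whitespace(scanner)
  scanner.skip(SKETCH_PDF_WHITESPACE)
  scanner
end

scanner = StringScanner.new("\x00\t\r\n \fobj")
sketch_skip_whitespace(scanner)
p scanner.rest  # prints "obj"
```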