module HexaPDF::Task::Optimize

Task for optimizing the PDF document.

For the list of optimization methods this task can perform, see the ::call method.

Public Class Methods

call(doc, compact: false, object_streams: :preserve, xref_streams: :preserve, compress_pages: false, prune_page_resources: false)

Optimizes the PDF document.

Field entries that are optional and set to their default value are always deleted. Additional optimization methods are performed depending on the values of the following arguments:

compact

If set to true, compacts the object space by merging the revisions and then deleting null and unused values.

object_streams

Specifies if and how object streams should be used: For :preserve, existing object streams are preserved; for :generate objects are packed into object streams as much as possible; and for :delete existing object streams are deleted.

xref_streams

Specifies if cross-reference streams should be used. Can be :preserve (no modifications), :generate (use cross-reference streams) or :delete (remove cross-reference streams).

If object_streams is set to :generate, this option is implicitly changed to :generate.

compress_pages

Compresses the content streams of all pages if set to true. Note that this can take a very long time because each content stream has to be unfiltered, parsed, serialized and then filtered again.

prune_page_resources

Removes all unused XObjects from the resources dictionaries of all pages. It is recommended to also set the compact argument because otherwise the unused XObjects won’t be deleted from the document.

This is sometimes necessary after importing pages from other PDF files that use a single resources dictionary for all pages.
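A minimal usage sketch follows, assuming the hexapdf gem is installed and that the task is run through HexaPDF::Document#task under the name :optimize. The OPTIMIZE_OPTIONS constant, the optimize_file helper and the file names are illustrative and not part of the library.

```ruby
# Keyword arguments understood by ::call, chosen here for a small output file.
# xref_streams is left out because object_streams: :generate implies it.
OPTIMIZE_OPTIONS = {
  compact: true,
  object_streams: :generate,
  compress_pages: false,       # can be slow, see the note on compress_pages
  prune_page_resources: true,  # combined with compact so unused XObjects are dropped
}.freeze

# Hypothetical helper: optimizes the PDF at +input+ and writes it to +output+.
def optimize_file(input, output, options = OPTIMIZE_OPTIONS)
  require 'hexapdf' # loaded lazily so this file can be loaded without the gem

  HexaPDF::Document.open(input) do |doc|
    doc.task(:optimize, **options)
    doc.write(output)
  end
  output
end
```

A call would then look like optimize_file("input.pdf", "output.pdf"), with both paths standing in for real files.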

compact(doc, object_streams, xref_streams)

Compacts the document by merging all revisions into one, deleting null and unused entries and renumbering the objects.

For the meaning of the other arguments see ::call.
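The four steps named above (merge, delete null entries, delete unused entries, renumber) can be illustrated on a toy model. This is only a sketch under stated assumptions, not HexaPDF's implementation: revisions are plain hashes mapping object numbers to values, Ref is a hypothetical stand-in for an indirect reference, and "unused" is taken to mean "not reachable from the root object".

```ruby
# Hypothetical stand-in for an indirect reference to object number +oid+.
Ref = Struct.new(:oid)

# Calls the block with every Ref found inside a (possibly nested) value.
def each_ref(value, &block)
  case value
  when Ref then block.call(value)
  when Hash then value.each_value { |v| each_ref(v, &block) }
  when Array then value.each { |v| each_ref(v, &block) }
  end
end

# Returns a copy of +value+ with every Ref renumbered through +mapping+.
# References to objects that no longer exist become nil.
def remap(value, mapping)
  case value
  when Ref then mapping.key?(value.oid) ? Ref.new(mapping[value.oid]) : nil
  when Hash then value.transform_values { |v| remap(v, mapping) }
  when Array then value.map { |v| remap(v, mapping) }
  else value
  end
end

def compact_revisions(revisions, root_oid)
  # 1. Merge: entries of later revisions override those of earlier ones.
  # 2. Delete null entries (Hash#compact drops nil values).
  merged = revisions.reduce({}) { |acc, rev| acc.merge(rev) }.compact

  # 3. Delete unused entries: keep only what is reachable from the root.
  used = {}
  queue = [root_oid]
  until queue.empty?
    oid = queue.shift
    next if used[oid] || !merged.key?(oid)
    used[oid] = true
    each_ref(merged[oid]) { |ref| queue << ref.oid }
  end

  # 4. Renumber the survivors consecutively, starting at 1.
  mapping = {}
  used.keys.sort.each_with_index { |oid, index| mapping[oid] = index + 1 }
  used.keys.sort.to_h { |oid| [mapping[oid], remap(merged[oid], mapping)] }
end
```

In this model, an object that a later revision sets to nil disappears, an object that nothing references disappears, and the remaining objects receive gap-free numbers.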

compress_pages(doc)

Compresses the contents of all pages by parsing and then serializing again. The HexaPDF serializer is already optimized for small output size so nothing else needs to be done.

Returns a hash of the form key=>true where the keys are the used XObjects (for use with prune_page_resources).

delete_fields_with_defaults(obj)

Deletes field entries (except for /Type) of the object that are optional and currently set to their default value.
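The rule can be sketched on a plain hash. The FieldSpec struct and the EXAMPLE_FIELDS table below are hypothetical stand-ins for whatever field metadata the real objects carry; only the rule itself (optional, equal to the default, and not /Type) is taken from the description above.

```ruby
# Hypothetical description of a single field: is it required, and what is its default?
FieldSpec = Struct.new(:required, :default, keyword_init: true)

# Illustrative field table for an annotation-like dictionary.
EXAMPLE_FIELDS = {
  Type:    FieldSpec.new(required: false, default: :Annot),
  Subtype: FieldSpec.new(required: true,  default: nil),
  F:       FieldSpec.new(required: false, default: 0),
}.freeze

# Removes, in place, every entry that is optional and equal to its default.
# The :Type entry is always kept; entries without a field spec are left alone.
def delete_default_entries(dict, fields = EXAMPLE_FIELDS)
  dict.delete_if do |name, value|
    spec = fields[name]
    name != :Type && spec && !spec.required && value == spec.default
  end
end
```

Here an entry such as F: 0 would be removed because it only restates the default, while Type: :Annot survives even though it is optional and equal to its default.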

process_object_streams(doc, method, xref_streams)

Processes the object streams in each revision according to method: for :preserve, nothing is done; for :delete, all object streams are deleted; and for :generate, objects are packed into object streams as much as possible.
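The xref_streams argument matters here because of the rule stated under ::call: setting object_streams to :generate implicitly changes xref_streams to :generate, since objects stored in object streams can only be located through a cross-reference stream. The sketch below models just that rule. The effective_stream_methods helper is hypothetical, and rejecting unknown values is a choice of this sketch rather than documented library behaviour.

```ruby
STREAM_METHODS = %i[preserve generate delete].freeze

# Hypothetical helper: returns the methods that would actually take effect.
def effective_stream_methods(object_streams: :preserve, xref_streams: :preserve)
  [object_streams, xref_streams].each do |method|
    # Assumption of this sketch: unknown values are rejected up front.
    unless STREAM_METHODS.include?(method)
      raise ArgumentError, "unknown method: #{method.inspect}"
    end
  end

  # Generating object streams forces cross-reference streams to be generated too.
  xref_streams = :generate if object_streams == :generate
  {object_streams: object_streams, xref_streams: xref_streams}
end
```

Under this model, asking for object_streams: :generate together with xref_streams: :delete still yields :generate for both.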

process_xref_streams(doc, method)

Processes the cross-reference streams in each revision according to method: for :preserve, nothing is done; for :delete, all cross-reference streams are deleted; and for :generate, cross-reference streams are added.

prune_page_resources(doc, used_refs)

Deletes from the resources dictionary of every page all XObject entries whose names do not match the keys in used_refs.
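The pruning step can be sketched on plain hashes. This is an illustrative model, not the library code: pages are hashes with a :Resources entry, XObject is a hypothetical stand-in struct, and used_refs is assumed to be keyed by the XObject itself, as the description of compress_pages suggests.

```ruby
# Hypothetical stand-in for an XObject (image or form) stored in the document.
XObject = Struct.new(:label)

# Removes, in place, every /XObject resource entry that is not marked as used.
def prune_xobjects(pages, used_refs)
  pages.each do |page|
    xobjects = page.dig(:Resources, :XObject)
    next unless xobjects # page without XObject resources

    xobjects.delete_if { |_name, xobject| !used_refs[xobject] }
  end
end
```

When several pages share one resources dictionary, as can happen after importing pages from another file, the first page visited prunes the shared dictionary and the later visits find nothing left to delete.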