class HexaPDF::Document::Metadata

This class provides methods for reading and writing the document-level metadata.

When an instance is created (usually through HexaPDF::Document#metadata), the metadata is read from the document’s information dictionary (see HexaPDF::Type::Info) and made available through the various methods.

By default, the metadata is written to the information dictionary as well as to the document’s metadata stream (see HexaPDF::Type::Metadata) once the document is written. This can be controlled via the write_info_dict and write_metadata_stream methods.

While HexaPDF is able to write an XMP packet (in a limited form) to the document’s metadata stream, it provides no way to read XMP metadata. If reading functionality or extended writing functionality is needed, disable metadata writing in this class and read or create the metadata stream yourself.

Caveats

  • Disabling writing to the information dictionary will only prevent parts from being written. The producer is always written to the information dictionary as per the AGPL license terms. The modification_date may be written depending on the arguments to HexaPDF::Document#write.

  • If writing the metadata stream is enabled, any existing metadata stream is completely overwritten. This means the metadata stream is not updated with the changed information.

Adding custom metadata properties

All the properties specified for the information dictionary are supported.

Furthermore, HexaPDF supports writing custom properties to the metadata stream. For this to work, the XMP namespaces in use need to be registered using register_namespace. Additionally, the types of all XMP properties in use need to be registered using register_property_type.

The following types for XMP properties are supported:

String

Maps to the XMP simple string value. Values need to be of type String.

Integer

Maps to the XMP integer core value type and gets formatted as string. Values need to be of type Integer.

Date

Maps to the XMP simple string value, correctly formatted. Values need to be of type Time, Date, or DateTime.

URI

Maps to the XMP simple value variant of URI. Values need to be of type String or URI.

Boolean

Maps to the XMP simple string value, correctly formatted. Values need to be either true or false.

OrderedArray

Maps to the XMP ordered array. Values need to be of type Array and items must be XMP simple values.

UnorderedArray

Maps to the XMP unordered array. Values need to be of type Array and items must be XMP simple values.

LanguageArray

Maps to the XMP language alternatives array. Values need to be of type Array and items must either be strings (they are associated with the set default language) or LocalizedString instances.

See: PDF2.0 s14.3, www.adobe.com/products/xmp.html

Constants

PREDEFINED_NAMESPACES

Contains a mapping of predefined prefixes for XMP namespaces for metadata.

PREDEFINED_PROPERTIES

Contains a mapping of predefined XMP properties to their types, i.e. from namespace to property and then type.

Public Class Methods

new(document)

Creates a new Metadata object for the given PDF document.

Public Instance Methods

author → author or nil
author(value) → value

Returns the name of the person who created the document (author) if no argument is given. Otherwise sets the author to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name dc:creator.

creation_date → creation_date or nil
creation_date(value) → value

Returns the date and time (a Time object) the document was created if no argument is given. Otherwise sets the creation date to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name xmp:CreateDate.

creator → creator or nil
creator(value) → value

Returns the name of the PDF processor that created the original document from which this PDF was converted if no argument is given. Otherwise sets the name of the creator tool to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name xmp:CreatorTool.

custom_metadata(data)

Adds the given data string as custom metadata to the XMP document.

The data string must contain a fully valid ‘rdf:Description’ element.

Using this method allows adding metadata like PDF/A schema definitions for which there is no direct support by HexaPDF.

default_language → language
default_language(value) → value

Returns the default language in RFC3066 format used for unlocalized strings if no argument is given. Otherwise sets the default language to the given language.

The initial default language is taken from the document catalog’s /Lang entry. If that is not set, the XMP default language (‘x-default’) is used.

delete
delete(ns_prefix)
delete(ns_prefix, name)

Deletes either all metadata properties, only the ones from a specific namespace, or a specific one.

keywords → keywords or nil
keywords(value) → value

Returns the keywords associated with the document if no argument is given. Otherwise sets keywords to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name pdf:Keywords.

modification_date → modification_date or nil
modification_date(value) → value

Returns the date and time (a Time object) the document was most recently modified if no argument is given. Otherwise sets the modification date to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name xmp:ModifyDate.

namespace(ns)

Returns the namespace URI associated with the given prefix.

producer → producer or nil
producer(value) → value

Returns the name of the PDF processor that converted the original document to PDF if no argument is given. Otherwise sets the name of the producer to the given value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name pdf:Producer.

property(ns_prefix, name) → property_value
property(ns_prefix, name, value) → value

Returns the value for the property specified via the namespace prefix ns_prefix and name if the value argument is not provided. Otherwise sets the property to value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

register_namespace(prefix, uri)

Registers the prefix for the given namespace uri.

register_property_type(prefix, property, type)

Registers the property for the namespace specified via prefix as the given type.

The argument type has to be one of the following: ‘String’, ‘Integer’, ‘Date’, ‘URI’, ‘Boolean’, ‘OrderedArray’, ‘UnorderedArray’, or ‘LanguageArray’.

subject → subject or nil
subject(value) → value

Returns the subject of the document if no argument is given. Otherwise sets the subject to the given value.

If the value is a LocalizedString, the language for the subject is taken from it. Otherwise the language specified via default_language is used.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name dc:description.

title → title or nil
title(value) → value

Returns the document’s title if no argument is given. Otherwise sets the document’s title to the given value.

If the value is a LocalizedString, the language for the title is taken from it. Otherwise the language specified via default_language is used.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name dc:title.

trapped → trapped or nil
trapped(value) → value

If no argument is given, returns true if the document has been modified to include trapping information. Otherwise sets the trapped status to the given boolean value.

The value nil is returned if the property is not set. Passing nil as the value deletes the property from the metadata.

This metadata property is represented by the XMP name pdf:Trapped.

write_info_dict(value)

Makes HexaPDF write the information dictionary if value is true.

See the class documentation for caveats.

write_info_dict?()

Returns true if the information dictionary should be written.

write_metadata_stream(value)

Makes HexaPDF write the metadata stream if value is true.

See the class documentation for caveats.

write_metadata_stream?()

Returns true if the metadata stream should be written.