PDF is mainly used as a format that provides consistent output regardless of the output device. However, it also provides various interactive features, one of which is support for forms.
AcroForm vs XFA Forms
The PDF specification provides two different ways of representing forms, AcroForm and XFA forms:
AcroForms are static forms where each form field is predefined with respect to its size, possible values and so on. These types of forms have been in the PDF specification since PDF 1.2 and have broad support among PDF reader applications. When speaking of an interactive form we always mean an AcroForm.
XFA forms (Adobe XML Forms Architecture) were introduced with PDF 1.5 and are much more advanced. They allow, for example, fields to depend on other fields and text fields to vary in size, possibly adding pages to the document. XFA forms were deprecated with PDF 2.0.
XFA forms need much more functionality in a PDF reader application than AcroForm forms. Because of this, support for XFA forms is only available in certain commercial software applications.
Since XFA forms are already deprecated, HexaPDF only has support for interactive forms.
Interactive Forms (AcroForm)
An interactive form consists of the main form dictionary, form fields and widget annotations. Together they define the structure and visible appearance of the form.
The main form dictionary references the root fields which in turn can reference child fields. This allows one to build a hierarchy of fields and to inherit attributes from parent fields. Fields without child fields are called terminal fields.
These terminal fields can have a visible appearance which is provided by a widget annotation. Each field can have zero, one or more associated widgets.
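The hierarchy and attribute inheritance described above can be sketched in plain Ruby. This is a simplified, hypothetical model and not HexaPDF's API: the `FieldNode` class and its methods are invented for illustration. The set of inheritable keys and the period-separated fully qualified name follow the PDF specification.

```ruby
# Hypothetical, simplified model of a form field node (not HexaPDF's API).
class FieldNode
  # Attributes that the PDF specification marks as inheritable.
  INHERITABLE = %i[FT Ff V DV].freeze

  attr_reader :parent, :kids

  def initialize(partial_name, parent: nil, **attrs)
    @partial_name = partial_name
    @parent = parent
    @attrs = attrs
    @kids = []
    parent.kids << self if parent
  end

  # Looks up an attribute, falling back to the ancestors for inheritable keys.
  def [](key)
    return @attrs[key] if @attrs.key?(key)
    parent[key] if parent && INHERITABLE.include?(key)
  end

  # A field without child fields is a terminal field.
  def terminal?
    kids.empty?
  end

  # The fully qualified name joins the partial names from the root downwards.
  def full_name
    parent ? "#{parent.full_name}.#{@partial_name}" : @partial_name
  end
end

person = FieldNode.new("person", FT: :Tx)          # root field, defines the type
first  = FieldNode.new("first", parent: person)    # inherits /FT from its parent
last   = FieldNode.new("last", parent: person, V: "Doe")

puts first.full_name    # person.first
puts first[:FT]         # Tx
puts person.terminal?   # false
```

Keeping the lookup recursive means a value set on a root field applies to all descendants unless a child overrides it.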
Main Interactive Form Dictionary
The main form dictionary provides only a few entries, the most important of which are:
/Fields contains the array of root fields. See the various methods on the form class on how to access and modify form fields.
/NeedAppearances defines whether appearances should be constructed by the PDF reader application. This is useful for libraries/applications which can’t do this due to the added complexity. They just set this key to true and the reader application constructs all appearances.
/DR and /DA: The former is a dictionary containing the default resources (like fonts, color spaces, …) that should be used when constructing appearances. The latter is a “default appearance string” that defines, at least, the font and font size to be used when creating text field appearances. The two keys together allow a PDF reader application to convert text input by a user into a proper PDF content stream.
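To make the default appearance string more concrete: it is a short piece of content stream syntax such as `/Helv 12 Tf 0 g`, where the `Tf` operator selects the font and font size. The following is a minimal sketch of extracting those two values. The helper name `parse_default_appearance` is hypothetical, and the parsing is deliberately simplified (whitespace-separated operands, no escaped characters in the font name).

```ruby
# Extracts the font name and font size from a default appearance string.
# Simplified sketch: only looks for "/Name size Tf" and ignores everything else.
def parse_default_appearance(da)
  match = da.match(%r{/(\S+)\s+(\d*\.?\d+)\s+Tf})
  raise ArgumentError, "no Tf operator found in #{da.inspect}" unless match

  { font: match[1].to_sym, size: match[2].to_f }
end

result = parse_default_appearance("/Helv 12 Tf 0 g")
puts result[:font]  # Helv
puts result[:size]  # 12.0
```

The font name found this way is a resource name that would then be looked up in the default resources dictionary. According to the PDF specification, a font size of zero means that the text should be auto-sized to fit the field.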
The form dictionary object is the main entry point for handling interactive forms with HexaPDF. It allows you to list, modify, create and delete the form fields. By relying on the provided convenience methods all the tedious but needed book-keeping is done behind the scenes.
A form field dictionary contains, among other things, the type of the field, its name and its value.
There are four main types of fields which are further sub-divided:
- Button fields
These fields represent interactive controls that a user can manipulate with a mouse.
A button field may be a push button (something to click which produces a result immediately), a check box (for toggling between two states) or a radio button (typically one button in a set can be turned on).
- Text fields
These fields allow the user to input text from the keyboard.
The text can be entered into a single-line or multi-line field and there is also the possibility for rich text strings which allow inline formatting of the text.
- Choice fields
These fields contain several text items of which the user can select one or more.
A choice field may be presented as a scrollable list box or a combo box. The latter also allows the user to input a value other than the predefined ones.
- Signature fields
These fields represent digital signatures and optional data for authenticating the signer name and the document’s contents.
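In the PDF specification, the four main types are stored in the field type entry (/FT, with the values /Btn, /Tx, /Ch and /Sig), and the sub-division is expressed through bits in the field flags entry (/Ff). The following sketch shows how the two could be combined to find the concrete kind of field. The method name `concrete_field_type` and the returned symbols are assumptions made for illustration and not HexaPDF's API. The bit positions are taken from the PDF specification.

```ruby
# Flag values from the PDF specification; bit position n has the value 1 << (n - 1).
FLAG_MULTILINE   = 1 << 12 # text fields, bit position 13
FLAG_RADIO       = 1 << 15 # button fields, bit position 16
FLAG_PUSH_BUTTON = 1 << 16 # button fields, bit position 17
FLAG_COMBO       = 1 << 17 # choice fields, bit position 18

# Maps the field type (/FT) and the field flags (/Ff) to a concrete kind of field.
def concrete_field_type(field_type, flags = 0)
  case field_type
  when :Btn
    if flags.anybits?(FLAG_PUSH_BUTTON)
      :push_button
    elsif flags.anybits?(FLAG_RADIO)
      :radio_button
    else
      :check_box
    end
  when :Tx
    flags.anybits?(FLAG_MULTILINE) ? :multiline_text_field : :single_line_text_field
  when :Ch
    flags.anybits?(FLAG_COMBO) ? :combo_box : :list_box
  when :Sig
    :signature_field
  else
    raise ArgumentError, "unknown field type #{field_type.inspect}"
  end
end

puts concrete_field_type(:Btn)               # check_box
puts concrete_field_type(:Btn, FLAG_RADIO)   # radio_button
puts concrete_field_type(:Ch, FLAG_COMBO)    # combo_box
```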
The visual appearance is defined by associated widget annotations. Each terminal field can have zero, one or more associated widgets. For example, each widget annotation of a radio button field describes one possible selection value. Another use for multiple widget annotations is on a multi-page form where a name entered by the user should appear in a header or footer on every page.
A widget annotation describes the visual appearance of a form field on a page. It is implemented by HexaPDF::Type::Annotations::Widget.
As with all other annotations, the widget's placement on the page is specified by the /Rect key and the visual appearance by the /AP key.
Additionally, each widget can specify a background color and border style and, depending on the type of the associated field, other properties.
When using HexaPDF you don’t have to worry about the visual appearance. HexaPDF creates the needed
appearance streams automatically using a default style similar to those found in popular PDF reader
applications (see HexaPDF::Type::AcroForm::AppearanceGenerator). This is done by setting the
needed widget annotation and field properties when the widget is created. Later these properties are
used during the creation of the appearance (like some PDF readers would do when the
/NeedAppearances key on the main form object is set).
You can naturally provide the appearance streams yourself if needed since those are just Form XObjects.
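As a rough illustration of what such an appearance stream contains, the following sketch builds the content for a simple single-line text field. It is not how HexaPDF generates appearances: the helper names, the fixed padding and the baseline heuristic are assumptions made for this example. The `/Tx BMC` … `EMC` marked content wrapper is the structure the PDF specification describes for text field appearances.

```ruby
# Escapes the characters that are special inside a PDF literal string.
def pdf_literal(text)
  "(" + text.gsub(/[\\()]/) { |c| "\\#{c}" } + ")"
end

# Builds the content of a simple single-line text field appearance.
# Simplified sketch: fixed 2pt left padding, rough vertical centering, no
# clipping or alignment, and the text is assumed to fit the font's encoding.
def text_field_appearance(text, font: :Helv, size: 12, height: 20)
  baseline = ((height - size) / 2.0 + size * 0.2).round(2)
  <<~CONTENT
    /Tx BMC
    q
    BT
    /#{font} #{size} Tf
    0 g
    2 #{baseline} Td
    #{pdf_literal(text)} Tj
    ET
    Q
    EMC
  CONTENT
end

puts text_field_appearance("Hello (world)")
```

The font and size used here correspond to the values that a default appearance string would supply, and the font name has to be available in the resources of the Form XObject.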