You know how it goes: you get a form to fill out, but you need to print it and fill it out by hand (or worse, try to fill it with your printer). Wouldn’t you rather be able to generate a PDF form out of it and then fill it out on your computer? Even better, have the form filled out automatically (as far as possible) with your usual data?
Well, I decided that’s something I wanted to do. In this particular case, I gave cash to two friends and wanted them to sign an IOU note. I found a PDF form online, but it required me to fill in their names and mine twice, and I balked at the duplication. (You can find the form by searching for “free iou form template”.)
The tools you will need to use (at this point) are commonly available for free. They are all open source tools, so there is a good chance they will be available years from now. In particular, we’ll need Scribus, a desktop publishing tool, pdftk, a PDF manipulation utility, and a few extra scripts and utilities.
[Note: this is a living entry, and things will get modified as they improve. Right now, it’s pretty rough.]
First, we need to get an image version of the form. If you are starting with a paper form, you can scan it (if you have a scanner) or take a picture with your (decent) camera. Either way will work. Then use an image manipulation tool (like the GIMP) to straighten the form, increase contrast, remove smudging, etc. When the image looks like you want it on paper, save the image.
If you are starting with a PDF file, you still need to transform that into an image. That’s mostly because the PDF import of Scribus is not so hot (although you could try that, if you like). A great utility for that purpose is ImageMagick’s convert. To convert a PDF file into JPEG, you simply run convert form.pdf form.jpg, adding size options as appropriate (for example, -density 150 before the input file name renders the page at 150 dpi).
Next, create a new document in Scribus, choosing a page size that matches (or exceeds) your original form. In this new document, create an image frame (Insert → Image Frame). Insert the form by double-clicking the frame and selecting the form image you generated (here you can try the PDF and see if it works).
If the image shows up correctly on the page, resize the frame to suit its content (i.e. make sure none of the form is cut off on the right or the bottom).
Now we get creative: using the PDF tools, add PDF text fields. The PDF tools are a toolbar that is typically on the top right and has inscrutably tiny icons. Hover over them to find the right one: it looks like a square with horizontal lines and a little “1” on the bottom right.
Once you have clicked the button, place form fields on your page. To do that, draw a rectangle on your page where data needs to be filled in. Make the rectangle about the size of the text you’d like to appear, and note down the height of the field. In my case, my rectangles ended up being about 17 points high. [Note: if your form has the same fields in the same places, you can copy and paste fields. If you have the same form multiple times on a page, it makes sense to select all the fields on one subform (use shift) and copy/paste the whole set.]
Now it’s time to change the default font (because that’s what the form fields are going to look like). Go to Edit→Styles and change the Default Character Style (click on it and hit Edit). You’ll get a big dialog, where you have to change the size to suit your boxes. If the height was 17, as in my case, a font size of 16 points would be appropriate. (You can also change the font family and style if you like, for instance to use your handwriting font or make it bold.)
Save the document as a Scribus file (*.sla). The next thing we’ll do requires mucking around in the internals (not because you have to, but because using the Scribus tools is tedious).
Edit the document with a text editor. Scribus stores its files as uncompressed XML files (unless you tell it otherwise), so it’s really easy to edit them.
The new text fields are stored as PAGEOBJECT elements. To find the ones we are interested in, look for the marker ANNOTATION="1". Each line that has that marker is one of our text fields.
Now, for cleanup. You can set all properties of your text field here, and we’ll add more as this project goes along. For now, you only need to change the field border (ANBCOL) from Black to None, and change the name of the field (ANNAME) from whatever the default was to something meaningful.
Save the file in the text editor and open it in Scribus again. [Note: you could have done the same as above by right-clicking each form field, selecting PDF Options→Field Properties, and modifying the border and name there. But it’s really slow.]
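If you have more than a handful of fields, the same cleanup can be scripted. What follows is only a rough sketch in Python, not something the steps above depend on. It assumes the .sla file parses as plain XML with PAGEOBJECT elements carrying the ANNOTATION, ANBCOL and ANNAME attributes described above; the sample document, the function name clean_form_fields and the field names Text1 and LAST are made up for illustration.

```python
import xml.etree.ElementTree as ET


def clean_form_fields(sla_xml, names):
    """Return .sla XML text with field borders removed and fields renamed.

    `names` maps an existing ANNAME to the new name; fields that are not
    listed keep their current name.
    """
    root = ET.fromstring(sla_xml)
    for obj in root.iter("PAGEOBJECT"):
        if obj.get("ANNOTATION") != "1":
            continue  # not one of our form fields
        obj.set("ANBCOL", "None")  # drop the black border
        old_name = obj.get("ANNAME", "")
        obj.set("ANNAME", names.get(old_name, old_name))
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily simplified stand-in for a real .sla file.
sample = (
    "<SCRIBUSUTF8NEW><DOCUMENT>"
    '<PAGEOBJECT ANNOTATION="1" ANNAME="Text1" ANBCOL="Black"/>'
    '<PAGEOBJECT ANNAME="Image1"/>'
    "</DOCUMENT></SCRIBUSUTF8NEW>"
)
cleaned = clean_form_fields(sample, {"Text1": "LAST"})
print(cleaned)
```

For a real file you would read the text from form.sla and write the result back out. Work on a copy, since Scribus writes many more attributes than this sample shows and I have not checked how it reacts to a file rewritten this way.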
Save the form as PDF using File→Export→Save as PDF. You can safely ignore the errors, choose a file name, and save it under that name.
Voila! You can now open the form in Adobe Acrobat (or even better, Okular) and fill it right there. You can also save it and print it, as you wished.
Now, if you want to fill the form programmatically, you need to know about FDF files. Adobe Acrobat uses those to define and store data for form fields.
This is where things get a little tricky, since there is no automated way to do that yet. Essentially, you need to get a list of all the form fields in your form and assign values to them using something that Acrobat understands.
You created the form, so you should know the form fields. If you don’t (or if your PDF file came with the fillable fields already in it), you can use pdftk to help you out.
To find out the form items in a file named form.pdf, you run pdftk like this:
pdftk form.pdf dump_data_fields
Each of your fields should show up as FieldType: Text along with whatever name you gave it. Save those names; we’ll need them. [Hint: if your PDF form came from somewhere else and the field names are something stupid, still write them down and first assign sequential numbers to the values. Once you generate a form, you will know how the names map to fields on the page.]
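If the form has many fields, you may prefer to pull the names out of the pdftk output with a few lines of code. The sketch below assumes the output consists of "Key: value" lines grouped into blocks separated by a line of three dashes, which is what I would expect from pdftk; the sample text and the function name parse_field_dump are made up for illustration, and the exact keys may differ between pdftk versions.

```python
def parse_field_dump(text):
    """Parse pdftk dump_data_fields output into a list of dicts."""
    fields = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "---":
            # A separator starts a new field block.
            if current:
                fields.append(current)
            current = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    if current:
        fields.append(current)
    return fields


# Hypothetical output for a form with two text fields.
sample_dump = """---
FieldType: Text
FieldName: LAST
FieldFlags: 0
---
FieldType: Text
FieldName: FIRST
FieldFlags: 0
"""
field_names = [f["FieldName"] for f in parse_field_dump(sample_dump)]
print(field_names)
```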
Now generate a values file. The values file is very simple: it contains one line per field, composed of the field type (string), followed by a tab character, the field name, a tab, and the field value. If the line contains no tab characters, the field name is taken to be the word following the field type. If your field is named “LAST”, then a line would look like:
string LAST Gazzetta
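To make the line format concrete, here is a small sketch of how such a line could be taken apart. It is my reading of the rules above, not the code of the utility: with tabs, the line splits into type, name and value at the first two tabs; without tabs, the first two words are type and name, and I assume the rest of the line is the value. The function name parse_values_line and the CITY example are hypothetical.

```python
def parse_values_line(line):
    """Split one values-file line into (field type, field name, value)."""
    line = line.rstrip("\n")
    if "\t" in line:
        parts = line.split("\t", 2)   # tab-separated form
    else:
        parts = line.split(None, 2)   # fallback: whitespace-separated
    if len(parts) < 2:
        raise ValueError("expected at least a field type and a field name")
    kind, name = parts[0], parts[1]
    value = parts[2] if len(parts) > 2 else ""
    return kind, name, value


print(parse_values_line("string\tLAST\tGazzetta"))
print(parse_values_line("string LAST Gazzetta"))
print(parse_values_line("string\tCITY\tNew York"))
```

The tab form matters once field names or values contain spaces, which is presumably why tabs are the primary separator.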
Next, you need to generate an FDF file… or not. I wrote a handy utility in Tcl (which means you have to install Tcl if you want to use it; grin) that makes the process simple. It wants input PDF and output PDF as parameters, as well as any values files you want to add. [Note: it reads multiple values files so you can use one with default values for name, etc. and a specific one for the particulars of this form.]
You can find the utility here. [Note: the utility is not working right now. If you are interested in a new version, please comment at the bottom of this article.] When you download it, run it as:
tclsh fillpdf.tcl input.pdf output.pdf default.values extra.values
where input.pdf is the form, output.pdf the name you want the output file to have, default.values and extra.values are just the names of the values files you wrote.
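If you would rather build the FDF file yourself, the format is small enough to write by hand. The following is a minimal sketch, not the utility mentioned above: it assumes plain ASCII field names and values (other characters need an encoding step that is left out here), and the function names fdf_escape and build_fdf are made up for illustration.

```python
def fdf_escape(text):
    """Escape backslashes and parentheses, which delimit FDF strings."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_fdf(values):
    """Return FDF text assigning each value to the field of that name."""
    fields = "\n".join(
        "<< /T (%s) /V (%s) >>" % (fdf_escape(name), fdf_escape(value))
        for name, value in values.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n" + fields + "\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


fdf_text = build_fdf({"LAST": "Gazzetta"})
print(fdf_text)
```

Save the result as, say, data.fdf, and pdftk can merge it into the form:
pdftk input.pdf fill_form data.fdf output output.pdf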
To make processing easier, there are a few extra instructions you can leave in the values files.
To specify that a series of fields are going to be filled with the same value, use the field type “alias.” If the fields NameA, NameB, and NameC are going to be filled with the value of the field Name, you specify:
alias Name NameA NameB NameC
(be careful not to create recursive aliases!)
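As a sketch of what an alias line amounts to (my interpretation, not the utility’s code): the first name after “alias” is the source field, and every following name receives a copy of its value. The function names below are hypothetical. This version reads only from the original values, so an alias of an alias is simply not followed rather than looping.

```python
def parse_alias_line(line):
    """Split 'alias Name NameA NameB ...' into (source, [destinations])."""
    words = line.split()
    if len(words) < 3 or words[0] != "alias":
        raise ValueError("expected: alias SOURCE DEST [DEST ...]")
    return words[1], words[2:]


def apply_alias(values, source, destinations):
    """Return a copy of `values` with the source value copied to each destination."""
    result = dict(values)
    if source in values:
        for dest in destinations:
            result[dest] = values[source]
    return result


source, destinations = parse_alias_line("alias Name NameA NameB NameC")
filled = apply_alias({"Name": "Gazzetta"}, source, destinations)
print(filled)
```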
If a single value field is broken up into multiple destination fields, then you can specify how. The field type “split” is followed by the name of the source field, the split character, and the names of the destination fields, e.g.
split SSN - First3 Second2 Last4
split Total . Dollar Cent
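Here is a rough sketch of how I read the two split lines above: the source value is cut at each occurrence of the split character, and the pieces are handed to the destination fields in order. The function name apply_split and the sample values are made up for illustration; what the utility does when there are more pieces than destinations is not something I can vouch for, so this version simply ignores the extras.

```python
def apply_split(values, source, separator, destinations):
    """Return a copy of `values` with the source value split across destinations."""
    result = dict(values)
    if source not in values:
        return result
    pieces = values[source].split(separator)
    # zip stops at the shorter list, so surplus pieces or fields are ignored.
    for dest, piece in zip(destinations, pieces):
        result[dest] = piece
    return result


data = {"SSN": "123-45-6789", "Total": "12.50"}
data = apply_split(data, "SSN", "-", ["First3", "Second2", "Last4"])
data = apply_split(data, "Total", ".", ["Dollar", "Cent"])
print(data)
```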