Modern HTML to PDF conversion

August 10, 2019

Many web apps require some sort of PDF functionality. As web developers, we already know one great way to lay out documents: HTML!

In this article I’m going to consider the pros and cons of:

Established tools like wkhtmltopdf that have been in use for years
Google Chrome in “headless” mode
Specialist software built specifically for converting HTML to PDF

I should state upfront that you’re reading this on the Paperplane blog. Paperplane is a cloud API for generating PDFs, but it’s just one of many options available. I’ve tried to present a fair and accurate comparison of the trade-offs involved in each of the options covered in this article.

The traditional approaches

Until 2017, there were two common ways to convert HTML to PDF. The first was to use wkhtmltopdf — an open source command line tool specifically designed for the task. A second alternative was PhantomJS, an open source “headless” web browser which can be controlled with JavaScript.

Although these tools have served many people extremely well, they do have some downsides. Their support for the latest HTML5 and JavaScript features lags a long way behind the modern browsers we’re used to, such as Chrome, Firefox or Safari.

Introducing Headless Chrome

Things changed in 2017 with the release of Google Chrome 59, which included a “headless” mode. In conjunction with Chrome’s “devtools” API, headless mode allows you to use Chrome in a server environment and script it to perform tasks, like creating PDFs!

By switching to Chrome for PDF generation, you are able to use all the latest CSS layout features like Flexbox and CSS Grid. You also get fully up-to-date JavaScript support.

Using Headless Chrome to generate PDFs

Integrating with a Cloud API

If you don’t mind paying a small amount to outsource your PDF infrastructure and focus on more important features, consider using a cloud API that takes care of hosting Chrome instances for you.

This is probably the simplest way to get started, as you’ll be interacting with a relatively simple REST API and you won’t need to handle installing Chrome on your own servers or connecting directly to the Chrome devtools API.

Paperplane is a great option for this, although we’re a little biased since this is the Paperplane blog 😁.

Paperplane provides features that go beyond what’s possible out of the box with Headless Chrome. It can upload PDFs to your own Amazon S3 bucket, generate multiple PDFs in parallel, and create PDFs asynchronously with webhook notifications.

You can set all the standard options like page size and include headers and footers just as you would if you were using Headless Chrome directly.

For more information, you can check out the full list of features or the documentation.

Using the Chrome devtools API

If you’d like to use Chrome directly yourself, you can get started by using the “--print-to-pdf” command line option, but for more control over the PDF you’ll need to communicate with Chrome’s devtools API.
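To make the command line route concrete, here’s a minimal sketch that runs Chrome from Node.js. The “--headless” and “--print-to-pdf” flags are Chrome’s own; the function names and the `chromeBinary` parameter are my own placeholders, since the executable is called something different on each platform (for example `google-chrome` or `chromium`).

```typescript
import { execFile } from "node:child_process";

// Build the argument list for Chrome's command line PDF export.
function printToPdfArgs(url: string, outputPath: string): string[] {
  return ["--headless", `--print-to-pdf=${outputPath}`, url];
}

// Run Chrome once and resolve when the process exits.
// `chromeBinary` is an assumption: pass whatever name or path
// your Chrome or Chromium install actually uses.
function printToPdf(
  chromeBinary: string,
  url: string,
  outputPath: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    execFile(chromeBinary, printToPdfArgs(url, outputPath), (error) => {
      if (error) {
        reject(error);
      } else {
        resolve();
      }
    });
  });
}
```

Starting a fresh Chrome process for every document is the simplest thing that could work, but you give up control over things like headers, footers and timing, which is where the devtools API comes in.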

You can use the devtools API from any programming language, but the most common approach is to use a Node.js library developed by Google called “Puppeteer”.

Controlling Chrome with Puppeteer

You could use Puppeteer to automate a Headless Chrome browser instance in almost any way. Here we’ll just be focusing on how it can be used to create PDFs.

Using Puppeteer with Chrome on your own server

When you install Puppeteer on your server or in your development environment, it will automatically download its own copy of Chrome for you. Surprisingly simple, right? Bear in mind, though, that a Chrome install can be 250MB or more, and running Chrome has the potential to use a significant amount of your server’s resources.
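Here’s a rough sketch of what the PDF step might look like. The calls it makes (`newPage`, `setContent`, `pdf` and `close`) are Puppeteer methods, but `htmlToPdf`, `PageLike` and `BrowserLike` are names I’ve made up for this example. The two interfaces describe only the small part of Puppeteer the function touches, and the browser launcher is passed in as a parameter so the function can be tried out without Chrome installed.

```typescript
// Minimal structural types covering only the Puppeteer methods used below.
// They are intended to match the shape of Puppeteer's Browser and Page.
interface PageLike {
  setContent(
    html: string,
    options?: { waitUntil?: "load" | "networkidle0" }
  ): Promise<unknown>;
  pdf(options?: {
    format?: string;
    printBackground?: boolean;
  }): Promise<Uint8Array>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<unknown>;
}

// Render an HTML string to PDF bytes using a freshly launched browser.
async function htmlToPdf(
  launch: () => Promise<BrowserLike>,
  html: string
): Promise<Uint8Array> {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle so images and fonts are loaded.
    await page.setContent(html, { waitUntil: "networkidle0" });
    return await page.pdf({ format: "A4", printBackground: true });
  } finally {
    // Always shut Chrome down, even if rendering failed.
    await browser.close();
  }
}
```

With Puppeteer installed, the call would look something like `htmlToPdf(() => puppeteer.launch(), html)`. Launching a browser per document keeps the sketch simple; given how heavy Chrome is, a real service would probably reuse one browser across requests.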

Using Puppeteer via Google Cloud Functions

One interesting new option is the ability to run headless Chrome on Google Cloud’s “serverless” platform — Cloud Functions. This feature was added to Cloud Functions in August 2018 and should provide a low-cost and highly scalable way of generating PDFs. Google’s announcement post has a good walk-through that explains how to set it all up.

Creating your own PDF microservice

If you want to generate PDFs on your own servers, but keep all PDF-related concerns out of your main application, you should check out pdf-bot. It’s a Node.js microservice that can receive URLs via its API, add them to a queue, and then notify you via webhooks when the URL has been converted to PDF. It also supports storing your PDF files on Amazon S3.

Advanced typesetting features

Chrome supports the page-break CSS properties, which give you basic control over how your content flows across pages. Headless Chrome also allows you to add basic header and footer content when printing to PDF, and to specify your page margins.
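As a rough illustration, here’s how those basics might look when driving Chrome through Puppeteer. The option names (`displayHeaderFooter`, `headerTemplate`, `footerTemplate`, `margin`) and the `pageNumber` and `totalPages` classes come from Puppeteer’s `page.pdf()` options; the function name, the sizes and the choice of which elements break where are just assumptions for the example.

```typescript
// Escape text before placing it inside an HTML template.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an options object in the shape accepted by Puppeteer's page.pdf().
function pdfOptionsWithHeaderFooter(title: string) {
  // Header and footer templates need an explicit font size to be readable.
  const style = "font-size:9px; width:100%; text-align:center;";
  return {
    format: "A4",
    displayHeaderFooter: true,
    headerTemplate: `<div style="${style}">${escapeHtml(title)}</div>`,
    // Chrome fills in elements with these class names at print time.
    footerTemplate:
      `<div style="${style}">Page <span class="pageNumber"></span>` +
      ` of <span class="totalPages"></span></div>`,
    // Margins leave room for the header and footer.
    margin: { top: "20mm", bottom: "20mm", left: "15mm", right: "15mm" },
  };
}

// Page-break rules to include in the document's own stylesheet.
const pageBreakCss = `
  h1 { page-break-before: always; }
  table, figure { page-break-inside: avoid; }
`;
```

Note that the same header and footer are repeated on every page, which is exactly the limitation the next section deals with.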

However, in some situations you might find that you need more fine-grained control over how your HTML document is laid out when printed.

CSS paged media

This is where the CSS paged media module comes in. Here are some examples of things you can accomplish by using it:

Customizing header and footer content for different pages or document sections.
Adding content to the page margins.
Using different page margins within the same document.
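To give a feel for the syntax, here’s a small sketch that builds a paged media stylesheet as a string you could drop into a document’s style tag. The `@page` rule, the `:first` selector, the `@top-center` and `@bottom-right` margin boxes and the `page` and `pages` counters all come from the CSS paged media specification; the function names and the measurements are my own, and the title is assumed to be a single line of text.

```typescript
// Quote a value for use as a double-quoted CSS string.
// Assumes the text contains no line breaks.
function cssString(text: string): string {
  return '"' + text.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Build a stylesheet that uses CSS paged media features.
function pagedMediaCss(documentTitle: string): string {
  return `
@page {
  size: A4;
  margin: 25mm 20mm;
  /* Content placed in the page margins */
  @top-center { content: ${cssString(documentTitle)}; }
  @bottom-right { content: "Page " counter(page) " of " counter(pages); }
}

/* The first page gets a larger top margin and no running header */
@page :first {
  margin-top: 60mm;
  @top-center { content: none; }
}
`;
}
```

This only has an effect in a user agent that implements the relevant parts of the paged media module.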

Unfortunately, support for the paged media module in Chrome is quite basic and won’t allow any of these more advanced use cases.

However, there are some other options we can look at which provide better paged media support.

The paged.js project

The paged.js library is a fairly new but really promising project.

It aims to polyfill web browsers to add the missing paged media functionality.

It can be used in conjunction with Headless Chrome to make much more complex print layouts possible.

Commercial paged media user agents

There are also some commercial software packages that effectively act as web browsers, but instead of rendering content to a screen, they’re dedicated to rendering content to PDFs.

PrinceXML is capable of creating extremely well-formatted output (check out the samples on their website) but at a steep price of $3800 for a 1-server licence. However, if you require some of the features that only it can offer, such as automatic hyphenation, footnotes or print crop marks, then the cost may be worthwhile.

DocRaptor is a cloud API backed by PrinceXML that lets you get started with PrinceXML at a lower price point.

PDFreactor is a competitor to PrinceXML with a similar price tag and a similar focus on producing print-quality PDF output.

The downside of these tools is that they’re often not quite as up to date as the major browsers with the latest CSS features. The level of JavaScript support may lag well behind mainstream browsers too.

Open source paged media user agents

There’s only one tool in this category that we’re aware of: WeasyPrint.

It has better support for paged media than Chrome but lacks a JavaScript engine.

Comparing the options

To summarise, I’ve attempted to grade the different PDF rendering engines according to three criteria — support for modern web standards, support for JavaScript, and support for paged media and other advanced typesetting features.

I hope this helps you choose the right option for your project!

If you’ve got comments or can suggest improvements to this post, let me know on Twitter 😀.