How to Generate Pixel-Perfect PDFs

Have you ever had to generate PDF files in a web application?
If you have, chances are you enjoyed it about as much as a 5-hour train journey sitting next to a crying baby after a sleepless night…
I was on a project where we had to generate an 8-page PDF with headers and footers that included data from our app.
We asked around to find out which tool could do the job.
Everyone told us that it’s:

  • A nightmare to generate a great pdf with html
  • Headers and footers are always a hard task
  • A nightmare
  • Very long
  • Did I already say a nightmare?

After all these conversations, the task looked daunting.
But we wanted to do some research to make sure there was no tool out there that could do the job easily.
And we found one.
It is a tool that:

  • Is easy to install
  • Is easy to use
  • Makes great PDFs in no time

This tool is PhantomJS, a headless browser usually used to test web pages.
Let’s dig into it and see how to master server-side PDF generation across all our apps!

Generate the first PDFs

To generate your first page, you need to download PhantomJS (you can install it with npm).
The next step is to write the script:

// mypage.js

var page = new WebPage();
var html = "<div>My first page!!</div>";

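// Register the callback before loading the content so the event is not missed
page.onLoadFinished = function (status) {
  page.render("mypdf.pdf"); // the .pdf extension selects the PDF output format
  phantom.exit();
};
page.setContent(html, null);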

You can now generate your pdf file using:

./path/to/phantomjs mypage.js

Pretty easy, right? But wait, the page is in landscape mode, and I want it in portrait!
No problem, PhantomJS can easily handle that.
The page object has a paperSize property you can edit to make the page look however you want.
For example:

// mypage.js

var page = new WebPage();
var html = "<div>My first page!!</div>";

page.paperSize = {
  format: "A4",
  orientation: "portrait",
  margin: {
    left: "1cm",
    right: "1cm",
    top: "1cm",
    bottom: "1cm"
  }
};

// Register the callback before loading the content so the event is not missed
page.onLoadFinished = function (status) {
  page.render("mypdf.pdf");
  phantom.exit();
};
page.setContent(html, null);

Generate the PDF again and you get a page in portrait mode, with margins around the body of your page.
But keeping your JS and your HTML in the same file is not great…
So let’s move the HTML to a file named myhtml.html, in the same folder.
You can load this file using page.open:

<!-- myhtml.html -->
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
  </head>
  <body>
    <div>My first page!!</div>
  </body>
</html>

// mypage.js

var fs = require("fs");
var page = new WebPage();
page.paperSize = {
  format: "A4",
  orientation: "portrait",
  margin: { 
    left:"1cm", 
    right:"1cm", 
    top:"1cm", 
    bottom:"1cm" 
  }
};   

page.open(
  "file://" + fs.absolute("./myhtml.html"), //myhtml.html is in the same folder
  function (status) {
    page.render("mypdf.pdf");
    phantom.exit();
  }
);
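
The callback above ignores its status argument, so a missing or unreadable file would still reach the render step. Here is a minimal sketch of a safer callback, assuming you want a non-zero exit code on failure. makeOpenHandler is a hypothetical helper name, not part of PhantomJS, and the page and phantom objects are passed in as parameters only so the logic can be exercised outside PhantomJS:

```javascript
// Hypothetical helper (not part of PhantomJS): builds the page.open callback.
// `page` is the WebPage instance, `phantomApi` is the global `phantom` object.
function makeOpenHandler(page, phantomApi, outputPath) {
  return function (status) {
    // PhantomJS passes "success" or "fail" to the page.open callback.
    if (status !== "success") {
      console.log("Unable to load the page, no PDF generated");
      phantomApi.exit(1); // non-zero exit code signals the failure to the caller
      return;
    }
    page.render(outputPath);
    phantomApi.exit(0);
  };
}

// Usage inside mypage.js:
// page.open(url, makeOpenHandler(page, phantom, "mypdf.pdf"));
```

The non-zero exit code matters once the script is launched from a server, because it is the simplest way for the parent process to notice that no PDF was produced.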

OK, that’s great!
But I need multiple pages with headers and footers!
To add pages to your PDF, use the CSS properties page-break-after and page-break-before.
For example, the following markup will render two pages:

<style>
#page {
  page-break-after: always;    
}
</style>
<div id="page">My first page</div> 
<div>My second page</div>
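
When the number of pages depends on your data, the same rule can be applied programmatically. The sketch below assumes each page's content is already a trusted HTML string (nothing is escaped). buildPages is a hypothetical helper name. It skips the break after the last section, since a trailing break could otherwise produce an extra blank page:

```javascript
// Hypothetical helper: wraps each section in a div and forces a page break
// after every section except the last one.
function buildPages(sections) {
  return sections
    .map(function (content, index) {
      var isLast = index === sections.length - 1;
      var style = isLast ? "" : ' style="page-break-after: always;"';
      return "<div" + style + ">" + content + "</div>";
    })
    .join("\n");
}

// Three sections, so two page breaks are emitted.
var html = buildPages(["My first page", "My second page", "My third page"]);
```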

Now that you can add multiple pages, you just need headers and footers and you are good to go!
Adding headers and footers is quite simple: edit the paperSize property again:

page.paperSize = {
  format: "A4",
  orientation: "portrait",
  margin: { 
    left:"1cm", 
    right:"1cm", 
    top:"1cm", 
    bottom:"1cm" 
  },
  header: {
    height: "3cm",
    contents: phantom.callback(
      function(pageNum, numPages) {
        return("Header: "+pageNum+"/"+numPages);
      }
    )
  }
};

Now you have a header with pagination, great! Footers work the same way.
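
As an illustration, a footer could be sketched as follows. The renderFooter and buildPaperSize names are hypothetical, and the 2cm height is an arbitrary choice. The wrap parameter stands in for phantom.callback, and that indirection is there only so the footer function can be tested outside PhantomJS:

```javascript
// Plain function, so it can be tested without PhantomJS.
function renderFooter(pageNum, numPages) {
  return "Footer: " + pageNum + "/" + numPages;
}

// Hypothetical helper: `wrap` should call phantom.callback inside PhantomJS.
function buildPaperSize(wrap) {
  return {
    format: "A4",
    orientation: "portrait",
    margin: { left: "1cm", right: "1cm", top: "1cm", bottom: "1cm" },
    footer: {
      height: "2cm", // assumed height, adjust to your layout
      contents: wrap(renderFooter)
    }
  };
}

// Usage inside mypage.js:
// page.paperSize = buildPaperSize(function (fn) { return phantom.callback(fn); });
```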

We can now generate great PDFs.
But PhantomJS has to run on the server, and since it is a binary, we need to launch the command path/to/phantomjs myscript.js from our server.
We also need to handle errors, be able to debug, and so on…
Handling all of that is a bit painful, and there are great tools out there that do it for us!
On my project, we used Node.js and found the node-html-pdf package, which was a great fit for our needs!
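
For reference, launching the binary by hand from Node.js might look like the sketch below. This is the kind of plumbing a wrapper package takes off your hands. runPhantom is a hypothetical helper name, the 30-second timeout is an arbitrary assumption, and the paths in the usage comment are placeholders:

```javascript
var execFile = require("child_process").execFile;

// Hypothetical helper: runs `<binaryPath> <scriptPath>` and reports failures.
function runPhantom(binaryPath, scriptPath, callback) {
  execFile(binaryPath, [scriptPath], { timeout: 30000 }, function (err, stdout, stderr) {
    if (err) {
      // err is set when the binary is missing, exits with a non-zero code,
      // or is killed after the timeout.
      err.message = "PDF generation failed: " + err.message + "\n" + stderr;
      return callback(err);
    }
    callback(null, stdout); // stdout carries anything the script logged
  });
}

// Usage:
// runPhantom("./path/to/phantomjs", "mypage.js", function (err, output) {
//   if (err) return console.error(err.message);
//   console.log("PDF generated", output);
// });
```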

Node-html-pdf

Node-html-pdf is a tool that uses PhantomJS to print PDFs.
It adds some cool features and provides an easy API you can use on your Node.js server.
After reading this part, the last thing left to do is to add a route on your server that triggers the PDF generation.
Let’s dig into this!

Node-html-pdf provides a set of methods to easily generate PDFs on your server.
For example, the myhtml.html example from the previous section can be written as:

// myserver.js

var pdf = require("html-pdf");
var fs = require("fs");
var html = fs.readFileSync("./myhtml.html", "utf8"); // read as a string, which is what html-pdf expects
var options = {
  format: "A4",
  orientation: "portrait",
  border: { // note that margin becomes border
    left:"1cm", 
    right:"1cm", 
    top:"1cm", 
    bottom:"1cm" 
  }
};

pdf
  .create(html, options)
  .toFile(
    __dirname + "/mypdf.pdf",
    function (err) {
      if (err) console.log(err);
    }
  );

And you generate your pdf with the command:

node myserver.js
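
To serve the PDF from a route instead of writing it to disk, one option is the toBuffer method of html-pdf. The sketch below is a minimal example under a few assumptions: createPdfHandler is a hypothetical helper name, the handler answers every URL with the same PDF, and the html-pdf module is passed in as a parameter so the handler can be tested with a stub:

```javascript
// Hypothetical helper: builds a request handler that renders `html` to a PDF.
// `pdf` is the module returned by require("html-pdf").
function createPdfHandler(pdf, html, options) {
  return function (req, res) {
    pdf.create(html, options).toBuffer(function (err, buffer) {
      if (err) {
        res.writeHead(500, { "Content-Type": "text/plain" });
        return res.end("PDF generation failed");
      }
      res.writeHead(200, {
        "Content-Type": "application/pdf",
        "Content-Length": buffer.length
      });
      res.end(buffer);
    });
  };
}

// Usage (the port and file name are assumptions):
// var http = require("http");
// var pdf = require("html-pdf");
// var html = require("fs").readFileSync("./myhtml.html", "utf8");
// http.createServer(createPdfHandler(pdf, html, { format: "A4" })).listen(3000);
```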

But that’s not all! Node-html-pdf adds some cool features, like setting the header and the footer in your HTML instead of your JS options.
You can do that with specific ids in your HTML:

<!-- Default header -->
<div id="pageHeader">Header: {{page}}/{{pages}}</div>

<!-- Default footer -->
<div id="pageFooter">Footer: {{page}}/{{pages}}</div>

<!-- nth header -->
<div id="pageHeader-n">My nth header</div>

The last example shows that you can even set a default header for your pages and use a different one for specific pages, where n is the page number!

There is one thing I have not mentioned yet: how to use pictures and scripts.
With node-html-pdf, it’s pretty easy: define the path of your asset folder in the options and reference the files you want in your HTML:

// myserver.js
var options = {
  …,
  base: "file://" + __dirname + "/asset/"
};

<!-- myhtml.html -->
<img src="myimage.png" /> 
<!-- myimage.png is in my asset folder, at the root of my directory -->

Conclusion

We now know how to generate great PDFs on the server.
Generating a PDF from HTML is no longer painful. And you can use your favorite templating engine (like mustache, handlebars, …) to insert your data and make the perfect PDF!
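
One thing to watch when inserting data: the {{page}} and {{pages}} placeholders used by node-html-pdf look like template variables, and a templating engine may blank them out if it does not know them. The sketch below is a deliberately tiny stand-in for a real engine, not mustache or handlebars themselves. fillTemplate is a hypothetical helper that replaces only the keys it is given and does not escape values:

```javascript
// Hypothetical helper: replaces {{key}} with data[key] and leaves unknown
// placeholders (such as {{page}} and {{pages}}) untouched.
function fillTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match;
  });
}

var template =
  '<div id="pageHeader">Invoice for {{customer}}, page {{page}}/{{pages}}</div>' +
  "<div>Total: {{total}}</div>";
var html = fillTemplate(template, { customer: "Ada", total: "42 EUR" });
// html can now be passed to pdf.create(html, options)
```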

Known issue:

There is an issue we faced using node-html-pdf:

  • Images in the header are not rendered.
    To work around that, you can convert your images to base64 and inject the result into your HTML string.
    If you’re not using a templating engine, you can do:
    var html = fs.readFileSync("path/to/html", "utf8"); // read as a string so .replace works
    var image = fs.readFileSync("path/to/dog.png"); // no encoding given, so this is a Buffer
    var encodedImage = image.toString("base64");
    html = html.replace("{{encodedImage}}", encodedImage);
    pdf.create(...).toFile(...);

And in your html file:

<img src="data:image/png;base64,{{encodedImage}}" .../>

  • Stanislas Bernard

    Great article, Richard!
    I have a follow-up question: what about loading multiple files? How can you handle an HTML file along with a JS file and a CSS file?

  • Stanislas Bernard

    Hmm, I just saw that you said you can do it with node-html-pdf’s base option :)