Rendering and storing web pages using wkhtmltopdf

May 17, 2011 in Blog

Wkhtmltopdf is a simple command-line utility that converts a web page (HTML) to an image (JPG, PNG, etc.) or a document (PDF). It renders pages with the WebKit layout engine, through the Qt port of that engine (QtWebKit).

WebKit is an open source, state-of-the-art layout engine that powers Google Chrome and Apple Safari, two major web browsers. It is a very active project and one of the most consistent engines for rendering web pages, closely following the latest standards.

Wkhtmltopdf is open source, written in C++, and distributed under the GNU Lesser General Public License. PHP and Python bindings are already available, making it easy to call the utility from those languages as well. On an up-to-date system, wkhtmltopdf is a reliable way to store snapshots of web pages on demand.
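To make the idea of storing snapshots on demand more concrete, here is a minimal Python sketch that calls the wkhtmltopdf binary directly through the standard library rather than through any particular binding. It assumes wkhtmltopdf is installed and on the PATH, and it relies only on the basic "wkhtmltopdf <url> <output file>" invocation. The function names, the "snapshots" directory and the timestamped file naming scheme are illustrative assumptions, not part of wkhtmltopdf.

```python
import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse


def snapshot_path(url, directory, now=None):
    """Build a timestamped PDF path (hypothetical naming scheme: <host>-<UTC stamp>.pdf)."""
    now = now or datetime.now(timezone.utc)
    # Use the host part of the URL in the file name; avoid ':' from host:port.
    host = (urlparse(url).netloc or "page").replace(":", "_")
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return Path(directory) / f"{host}-{stamp}.pdf"


def build_command(url, output):
    """Return the basic invocation: wkhtmltopdf <input URL> <output file>."""
    return ["wkhtmltopdf", url, str(output)]


def store_snapshot(url, directory="snapshots"):
    """Render `url` to a new PDF under `directory` and return the file path."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH")
    output = snapshot_path(url, directory)
    output.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if wkhtmltopdf exits with an error.
    subprocess.run(build_command(url, output), check=True)
    return output
```

Calling store_snapshot("http://example.com/") would then write one PDF per call, so repeated calls build up a dated history of the page. Passing the arguments as a list, rather than as a single shell string, keeps URLs containing characters such as "&" or "?" from being interpreted by the shell.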

Wkhtmltopdf’s output formats include image formats such as JPEG and PNG, as well as PDF. PDF is an open standard for document exchange that is independent of software, hardware, and operating system. Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it.