Convert HTML to PDF using Python

This article was first published on PyShark and kindly contributed to python-bloggers.

In this tutorial we will explore how to convert HTML files to PDF using Python.


Introduction

There are several online tools that allow you to convert HTML files and webpages to PDF, and most of them are free.

While the conversion itself is simple, automating it is useful for testing HTML code and for saving webpages as PDF files.

To follow this tutorial we will need two things: the wkhtmltopdf command line tool and the pdfkit Python library.

wkhtmltopdf is an open source command line tool to render HTML files into PDF using the Qt WebKit rendering engine.

To use it from Python, we will also need the pdfkit library, which is a wrapper for the wkhtmltopdf utility.

First, find the wkhtmltopdf installer for your operating system. For Windows, the latest version of the installer is available from the wkhtmltopdf website. Download the .exe file and install it on your computer.

Remember the path to the directory where it will be installed.
In my case it is: C:\Program Files\wkhtmltopdf

If you don’t have the pdfkit library installed, open “Command Prompt” (on Windows) and install it with the following command:

pip install pdfkit
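
To confirm that the installation worked before going further, a quick check from Python can help. The sketch below relies only on the standard library's importlib; the helper name is_installed is a hypothetical name chosen for illustration.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # 'pdfkit' is the library installed with pip above
    if is_installed("pdfkit"):
        print("pdfkit is available")
    else:
        print("pdfkit was not found; try running 'pip install pdfkit' again")
```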

Sample file

To continue, we need an HTML file to work with.

This tutorial uses a sample file named sample.html. Opening it in your browser shows the rendered page, and opening it in a code editor shows its HTML markup.
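
If you don't have the sample file at hand, any small HTML file will work. As a stand-in, the sketch below writes a minimal sample.html. The page contents and the helper name write_sample are assumptions made for illustration, not the file used by the author.

```python
from pathlib import Path

# Hypothetical page contents, used only so the later steps have a file to convert
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Sample page</title>
</head>
<body>
    <h1>Sample heading</h1>
    <p>This is a sample paragraph.</p>
</body>
</html>
"""


def write_sample(path="sample.html"):
    """Write the stand-in HTML to the given path and return that path."""
    target = Path(path)
    target.write_text(SAMPLE_HTML, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_sample()}")
```

Run it from the directory that will hold main.py so that the relative path 'sample.html' used later resolves correctly.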


Convert HTML file to PDF using Python

Let’s start by converting an HTML file to PDF using Python.

The sample.html file is located in the same directory as main.py, the file that holds our code.

First, we need to find the path to the wkhtmltopdf executable file, wkhtmltopdf.exe.

Recall that we installed it in C:\Program Files\wkhtmltopdf, so the .exe file is somewhere in that folder. Navigating to it, you should see that the path to the executable file is: C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe
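
A mistyped path is an easy mistake to make here, so it can be worth checking that the executable really exists before handing the path to pdfkit. The following is a minimal sketch using pathlib; the helper name executable_in is hypothetical, and the bin subfolder layout is taken from the path shown above.

```python
from pathlib import Path


def executable_in(install_dir, name="wkhtmltopdf.exe"):
    """Return the path to bin/<name> under install_dir, or None if it is missing."""
    candidate = Path(install_dir) / "bin" / name
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Installation directory from this tutorial; yours may differ
    found = executable_in(r"C:\Program Files\wkhtmltopdf")
    print(found if found else "wkhtmltopdf.exe not found; check the install directory")
```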

Now we have everything we need and can convert the HTML file to PDF using Python:

import pdfkit

#Define path to wkhtmltopdf.exe
path_to_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'

#Define path to HTML file
path_to_file = 'sample.html'

#Point pdfkit configuration to wkhtmltopdf.exe
config = pdfkit.configuration(wkhtmltopdf=path_to_wkhtmltopdf)

#Convert HTML file to PDF
pdfkit.from_file(path_to_file, output_path='sample.pdf', configuration=config)

You should now see sample.pdf in the same directory, containing the rendered contents of sample.html.
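
Since the conversion is a single function call, the automation mentioned in the introduction can be as small as a loop over a folder. The following is a minimal sketch rather than part of the original tutorial: the helper names plan_conversions and convert_folder are hypothetical, and it assumes the same pdfkit.configuration and pdfkit.from_file calls used above.

```python
from pathlib import Path


def plan_conversions(folder):
    """Pair every .html file in folder with a .pdf output path next to it."""
    return [(src, src.with_suffix(".pdf")) for src in sorted(Path(folder).glob("*.html"))]


def convert_folder(folder, wkhtmltopdf_path):
    """Convert every .html file in folder to PDF and return the output paths."""
    import pdfkit  # imported here so plan_conversions can be used without pdfkit

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    outputs = []
    for src, dst in plan_conversions(folder):
        pdfkit.from_file(str(src), output_path=str(dst), configuration=config)
        outputs.append(dst)
    return outputs


if __name__ == "__main__":
    # Path from this tutorial; adjust it to your own installation
    exe = r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe"
    for pdf in convert_folder(".", exe):
        print(f"Created {pdf}")
```

Keeping the file pairing separate from the conversion makes it possible to preview which files would be processed before running wkhtmltopdf.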


Convert Webpage to PDF using Python

Using the pdfkit library, you can also convert webpages into PDF using Python.

Let’s convert the wkhtmltopdf project page to PDF!

In this section we will reuse most of the code from the previous section, except that instead of an HTML file we will use the URL of a webpage and the from_url() function of the pdfkit module:

import pdfkit

#Define path to wkhtmltopdf.exe
path_to_wkhtmltopdf = r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'

#Define url
url = 'https://wkhtmltopdf.org/'

#Point pdfkit configuration to wkhtmltopdf.exe
config = pdfkit.configuration(wkhtmltopdf=path_to_wkhtmltopdf)

#Convert Webpage to PDF
pdfkit.from_url(url, output_path='webpage.pdf', configuration=config)

You should now see webpage.pdf in the same directory, containing a rendering of the wkhtmltopdf project page.
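
When saving more than one page, a fixed name such as webpage.pdf would be overwritten on each run. One possible approach, sketched below, derives the file name from the URL. The helpers pdf_name_for and save_pages are hypothetical names, the naming scheme is just one reasonable choice, and the pdfkit calls are the same ones used above.

```python
from urllib.parse import urlparse


def pdf_name_for(url):
    """Build a file-system-friendly PDF name from a URL's host and path."""
    parsed = urlparse(url)
    raw = (parsed.netloc + parsed.path).strip("/")
    # Keep letters, digits, dots, dashes and underscores; replace everything else
    safe = "".join(c if c.isalnum() or c in ".-_" else "_" for c in raw)
    return (safe or "webpage") + ".pdf"


def save_pages(urls, wkhtmltopdf_path):
    """Save each URL as a PDF in the current directory."""
    import pdfkit  # imported here so pdf_name_for can be used without pdfkit

    config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
    for url in urls:
        pdfkit.from_url(url, output_path=pdf_name_for(url), configuration=config)


if __name__ == "__main__":
    print(pdf_name_for("https://wkhtmltopdf.org/"))  # wkhtmltopdf.org.pdf
```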


Conclusion

In this article we explored how to convert HTML to PDF using Python and wkhtmltopdf.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Python Programming tutorials.
