A Guide to Generating PDFs in Python

Introduction

Python has become essential for everyday developer tasks. Even if it is not your main language, you will likely need to know how to code in Python at some point.

Python is used for automation, testing, web development, and data analysis. HTML, on the other hand, is the primary language of web pages and web-based applications.

One of Python's superpowers is that it can read data in almost any format and convert it to another. PDF is a portable format that displays data consistently across devices, platforms, and operating systems.

In this article, we will talk about how to generate PDFs using Python. We will introduce several libraries, namely FPDF, Reportlab, Pyppeteer, Python-Wkhtmltopdf, and Pdfkit, and look at the differences between them.

Libraries

There are many Python libraries for working with PDFs. We will introduce some of the popular ones that make it easy to generate PDFs, in most cases from HTML files.

1. FPDF

FPDF (Free-PDF) is a Python library, ported from PHP, for generating PDFs. It provides various ways to generate a PDF, such as generating one from a text file or writing your own data formats to a PDF.

FPDF's HTML support is limited: it understands only basic HTML tags and does not understand CSS. HTML rendering is provided by the HTMLMixin class, which adds the write_html method to FPDF.

You can install FPDF with pip using the following command:

pip install fpdf==1.7

FPDF supports:

  • Page formatting
  • Images, links, colours
  • Automatic line and page breaks

A code example:

from fpdf import FPDF, HTMLMixin

# create a class that inherits from both FPDF and HTMLMixin
class MyFPDF(FPDF, HTMLMixin):
    pass

# instantiate the class
pdf = MyFPDF()
# add a page
pdf.add_page()
# open the HTML file and read its contents as a string
with open("file.html", "r") as file:
    data = file.read()
# render the HTML with HTMLMixin's write_html method
pdf.write_html(data)
# save the result as a PDF file
pdf.output('Python_fpdf.pdf', 'F')

The previous example takes a file named file.html and converts it into a PDF file named Python_fpdf.pdf with the help of the HTMLMixin class.
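
Because FPDF only understands basic HTML, it helps to keep the markup you feed it simple. The following is a minimal sketch, using only the standard library, of building such markup from Python data. The function name build_basic_html and the sample data are assumptions made for illustration; they are not part of FPDF.

```python
from html import escape


def build_basic_html(title, rows):
    """Return a small HTML fragment that uses only basic tags and no CSS."""
    parts = [f"<h1>{escape(title)}</h1>", "<ul>"]
    for label, value in rows:
        # Escape the data so characters like < and & do not break the markup.
        parts.append(f"<li><b>{escape(label)}:</b> {escape(str(value))}</li>")
    parts.append("</ul>")
    return "\n".join(parts)


# Hypothetical sample data, used only for illustration.
html_text = build_basic_html("Invoice", [("Customer", "Smith & Sons"), ("Total", 42)])
print(html_text)
```

The resulting string can be passed to write_html in place of the contents of file.html.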

You can find more about FPDF here

2. Reportlab

Reportlab is a Python library that helps you create PDFs. It has an open-source version and a commercial version; the difference is that the commercial version supports Report Markup Language (RML). Both provide the following features:

  • Supports dynamic web PDF generation
  • Supports converting XML into PDF
  • Supports vector graphics and inclusion of other PDF files
  • Supports the creation of time charts and tables

You can install it using the following command:

pip install reportlab

Reportlab is a complex tool with extensive capabilities for defining your own PDF formats and styles. The simplest example looks like the following:

from reportlab.pdfgen import canvas

c = canvas.Canvas("reportlab_pdf.pdf")
c.drawString(100,100,"Hello World")
c.showPage()
c.save()
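
The coordinates passed to drawString are measured in points (1/72 of an inch) from the bottom-left corner of the page, so (100, 100) lands near the bottom of the page. The sketch below, which uses only the standard library, converts a position measured in millimetres from the top-left corner into that coordinate system. It assumes an A4 page, and the helper names are hypothetical rather than part of Reportlab.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4
A4_HEIGHT_MM = 297.0  # assumption: the page is A4


def mm_to_points(mm):
    """Convert millimetres to PDF points (1 point = 1/72 inch)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def from_top_left(x_mm, y_from_top_mm, page_height_mm=A4_HEIGHT_MM):
    """Convert a top-left based position in mm to bottom-left based points."""
    x = mm_to_points(x_mm)
    y = mm_to_points(page_height_mm - y_from_top_mm)
    return x, y


# One inch in from the left edge and one inch down from the top edge.
x, y = from_top_left(25.4, 25.4)
print(round(x, 2), round(y, 2))
```

The resulting pair can be used as the first two arguments of drawString to place text relative to the top of the page.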

You can find more info about reportlab here

3. Pyppeteer

We previously talked about Puppeteer, a tool for automating the browser, in the Generate a PDF with JavaScript article. Pyppeteer is an unofficial Python port of that Chrome browser automation library.

Main differences between Puppeteer and Pyppeteer

  • Pyppeteer accepts both dictionary input parameters and keyword arguments
  • Pyppeteer does not use $ in method names
  • Page.evaluate() and Page.querySelectorEval() may fail and require you to add the force_expr=True option to force input strings to be treated as expressions
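
To illustrate the first point, here is a minimal sketch of a function that accepts options either as a dictionary or as keyword arguments. The name merge_options is hypothetical and this is not Pyppeteer's own code; it only shows the calling convention. In this sketch, keyword arguments take precedence when both forms are given.

```python
def merge_options(options=None, **kwargs):
    """Combine a dictionary of options with keyword arguments."""
    merged = dict(options or {})
    # Keyword arguments override dictionary entries in this sketch.
    merged.update(kwargs)
    return merged


# Both calling styles produce the same set of options.
from_dict = merge_options({'path': 'pyppeteer_pdf.pdf'})
from_kwargs = merge_options(path='pyppeteer_pdf.pdf')
print(from_dict == from_kwargs)
```

This is why a call such as page.pdf({'path': 'pyppeteer_pdf.pdf'}) can also be written as page.pdf(path='pyppeteer_pdf.pdf').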

Install it using the following command:

pip install pyppeteer

A code example:

import asyncio
from pyppeteer import launch

# define an async function
async def main():
    # launch a browser session
    browser = await launch()
    # open a new page
    page = await browser.newPage()
    # go to a specific address or file (placeholder path; file URLs need an absolute path)
    await page.goto('file:///path_to_html_file.html')
    # take a screenshot of the page
    await page.screenshot({'path': 'sample.png'})
    # save the page as a PDF
    await page.pdf({'path': 'pyppeteer_pdf.pdf'})
    # close the browser
    await browser.close()

# run the async main function
asyncio.run(main())
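
The address passed to page.goto has to be a full URL, so a local HTML file needs a file:// URL built from its absolute path. A small sketch using the standard library's pathlib follows; the helper name to_file_url is an assumption made for illustration.

```python
from pathlib import Path


def to_file_url(path):
    """Return a file:// URL for a local path, resolving it to an absolute path first."""
    return Path(path).resolve().as_uri()


# The file does not have to exist for the URL to be built.
url = to_file_url("file.html")
print(url)
```

The returned string can be passed directly to page.goto.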

You can read more about Pyppeteer here

4. Python-Wkhtmltopdf

wkhtmltopdf is a widely used command-line tool for generating PDFs from HTML pages and URLs. Python-Wkhtmltopdf is a wrapper that lets you use this command-line tool from Python. You can install it using the following command:

pip install py3-wkhtmltopdf==0.4.1

The usage is simple: import the library, then call the wkhtmltopdf API with the URL and the path of the output file.

from wkhtmltopdf import wkhtmltopdf

wkhtmltopdf(url='apitemplate.io', output_file='wkhtmltopdf.pdf')
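
Since the wrapper ultimately runs the wkhtmltopdf command-line tool, the binary has to be installed on your machine. As a rough sketch of what happens underneath, the following calls the tool directly with the standard library. The function names are hypothetical, and the sketch assumes the executable is available on your PATH.

```python
import shutil
import subprocess


def build_command(source, output_file, executable="wkhtmltopdf"):
    """Build the argument list: wkhtmltopdf <input> <output>."""
    return [executable, source, output_file]


def html_to_pdf(source, output_file):
    """Run wkhtmltopdf, failing early if the binary is not installed."""
    executable = shutil.which("wkhtmltopdf")
    if executable is None:
        raise FileNotFoundError("wkhtmltopdf was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(source, output_file, executable), check=True)


print(build_command("https://apitemplate.io/", "wkhtmltopdf.pdf"))
```

Calling html_to_pdf("https://apitemplate.io/", "wkhtmltopdf.pdf") would then run the conversion, provided the tool is installed.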

You can find more information here

5. Pdfkit

Pdfkit is a wrapper for wkhtmltopdf that makes it very easy to generate PDFs from various sources such as files, strings, and URLs. You can install it using the following command:

pip install pdfkit

Pdfkit supports features like:

  • Vector graphics
  • Text features like wrapping, aligning and bullet lists
  • PNG and JPEG Image embedding
  • Annotation features like Highlights and underlines
  • PDF security like encryption

An example of the generation of a PDF from a file is:

#importing pdfkit
import pdfkit

# call the from_file method to convert the file to a PDF
pdfkit.from_file('file.html', 'file.pdf')

It also supports generating PDFs from links by calling the from_url method.

pdfkit.from_url('https://apitemplate.io/', 'python.pdf')

You can also specify page settings and other options like the following:

options = {
    'page-size': 'A4',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header': [
        ('Accept-Encoding', 'gzip')
    ],
    'cookie': [
        ('cookie-empty-value', '""'),
        ('cookie-name1', 'cookie-value1'),
        ('cookie-name2', 'cookie-value2'),
    ],
    'no-outline': None
}
pdfkit.from_file('file.html', 'file.pdf', options=options)
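
The keys in the options dictionary correspond to wkhtmltopdf command-line flags. The sketch below shows one way such a dictionary could be turned into command-line arguments. It is a simplified illustration, not Pdfkit's actual implementation, and the function name options_to_args is an assumption.

```python
def options_to_args(options):
    """Turn an options dictionary into a flat list of command-line arguments."""
    args = []
    for key, value in options.items():
        flag = f"--{key}"
        if value is None:
            # A None value means the flag takes no argument, e.g. --no-outline.
            args.append(flag)
        elif isinstance(value, (list, tuple)):
            # Repeatable options such as --cookie are emitted once per item.
            for item in value:
                args.append(flag)
                if isinstance(item, (list, tuple)):
                    args.extend(str(part) for part in item)
                else:
                    args.append(str(item))
        else:
            args.extend([flag, str(value)])
    return args


sample = {
    'page-size': 'A4',
    'cookie': [('cookie-name1', 'cookie-value1'), ('cookie-name2', 'cookie-value2')],
    'no-outline': None,
}
print(options_to_args(sample))
```

Each tuple in a list becomes its own repetition of the flag, which is why the cookie entries need to be separate, comma-delimited tuples.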

You can learn more about pdfkit from here

Comparison

So we have a lot of options to choose from, and the remaining question is which one is the most suitable. It depends on your application and what you actually need to do: for example, whether you want to build a PDF from scratch, simply convert HTML into a PDF, or fill in a particular template and convert it into a specific format.

If you want to convert HTML into PDF, I believe Pdfkit, FPDF, and Python-Wkhtmltopdf are the best options, with Pdfkit being the most popular of the three. On the other hand, if you want to render PDFs, your options are Pyppeteer and Reportlab.

Reportlab's advantage is that it supports a wide variety of graphs, such as line plots and bar charts, and can embed images. On the other hand, it doesn't provide a method for creating footers and footnotes, and it can embed only JPEG images out of the box, although with the right Python extension you can extend this to 30 more formats. Reportlab is more comprehensive, but it is also more difficult for beginners.

Pyppeteer, in contrast, provides better rendering and is easier to use if you are familiar with its JavaScript version. However, it only supports specific browsers, such as Chrome and Chromium, which must be available on your machine for the tool to work.

Conclusion

This article talked about five of the most popular Python libraries for generating PDFs.

We had a brief introduction to libraries such as FPDF, Python-Wkhtmltopdf, Pyppeteer, Reportlab, and Pdfkit. We also compared them on properties like complexity, size of generated files, resolution, and features.

Finally, if you want a tool with all the features of these libraries and more, I recommend that you check out APITemplate.io. It helps you generate PDFs quickly through a cloud-based PDF generation API and is fully compatible with CSS, JavaScript, and Python. It also comes with predefined templates that you can reuse and edit.
