A Guide to Generating PDFs in Python (Updated 2024)

1. Introduction

Python has become essential for everyday developer tasks. Whether or not you work with Python day to day, it is worth knowing how to code in it.

Python is used for automation, testing, web development, and data analysis. HTML, on the other hand, is the primary language of web development and web-based applications.

One of Python's superpowers is dealing with data in any format and converting it to any other format. PDF is a portable format that can be used to view data consistently across devices and platforms, independent of the device and operating system.

In this article, we will talk about how to generate PDFs using Python. We will introduce multiple libraries, including FPDF, ReportLab, Pyppeteer, python-wkhtmltopdf, and Pdfkit, and the differences between them.

2. Libraries for PDF Generation in Python

There are a lot of Python libraries for working with PDFs. We will introduce some popular ones that can easily be used to convert HTML files to PDF format.

i. FPDF

FPDF (Free-PDF) is a Python library, ported from PHP, for generating PDFs. It provides various functions for PDF generation, such as creating PDFs from text files and writing your own data formats to PDF.
While FPDF supports HTML, it only understands basic HTML and doesn't understand CSS. That's why you need HTMLMixin, which adds a write_html method so that FPDF can render HTML markup.
You can install FPDF with pip using the following command:

pip install fpdf==1.7.2

FPDF supports:

  • Page formatting
  • Images, links, colours
  • Automatic line and page breaks

A code example:

from fpdf import FPDF, HTMLMixin

# create a class that inherits from both FPDF and HTMLMixin
class MyFPDF(FPDF, HTMLMixin):
    pass

# instantiate the class
pdf = MyFPDF()
# add a page
pdf.add_page()
# open the HTML file and read its contents as a string
with open("file.html", "r") as file:
    data = file.read()
# HTMLMixin's write_html method renders the HTML onto the page
pdf.write_html(data)
# save the file as a PDF
pdf.output('Python_fpdf.pdf', 'F')

The previous example takes a file named file.html and converts it into a PDF file named Python_fpdf.pdf with the help of the HTMLMixin class.
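
Because FPDF only understands basic HTML and no CSS, it can help to generate the markup yourself from plain data. The following is a minimal sketch under that assumption: the helper names build_basic_html and html_to_pdf are hypothetical, and the PDF step assumes the fpdf 1.7.2 package installed above.

```python
from html import escape


def build_basic_html(title, paragraphs):
    """Return HTML that uses only basic tags (h1, p) and no CSS."""
    parts = [f"<h1>{escape(title)}</h1>"]
    parts += [f"<p>{escape(text)}</p>" for text in paragraphs]
    return "\n".join(parts)


def html_to_pdf(html, output_path):
    """Render an HTML string with FPDF's HTMLMixin (assumes fpdf 1.7.2)."""
    # Imported here so build_basic_html also works without fpdf installed.
    from fpdf import FPDF, HTMLMixin

    class HtmlPDF(FPDF, HTMLMixin):
        pass

    pdf = HtmlPDF()
    pdf.add_page()
    pdf.write_html(html)
    pdf.output(output_path, "F")


html = build_basic_html("Sales & Returns", ["Total: 42 items", "Status: <pending>"])
print(html)
# To write the PDF, call: html_to_pdf(html, "report.pdf")
```

Escaping the text before wrapping it in tags keeps characters such as & and < from being read as markup.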

You can find more about FPDF here

ii. Reportlab

ReportLab is a Python library that helps you create PDFs. It has an open-source version and a commercial version; the difference is that the commercial version supports Report Markup Language (RML). Both provide the following features:

  • Supports dynamic web PDF generation
  • Supports converting XML into PDF
  • Supports vector graphics and inclusion of other PDF files
  • Supports the creation of time charts and tables

You can install it using the following command:

pip install reportlab

ReportLab is a complex tool that gives you a lot of control over the format and style of your PDF. The simplest example looks like this:

from reportlab.pdfgen import canvas

c = canvas.Canvas("reportlab_pdf.pdf")
c.drawString(100,100,"Hello World")
c.showPage()
c.save()
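
The drawString(100, 100, ...) call above places text by coordinates. ReportLab's canvas measures in points (1/72 inch) from the bottom-left corner of the page, so a small conversion helper can be handy if you think in millimetres from the top-left. This is only a sketch: the helper names are hypothetical and an A4 page is assumed.

```python
MM = 72 / 25.4  # points per millimetre (1 inch = 72 points = 25.4 mm)
A4_WIDTH_MM = 210
A4_HEIGHT_MM = 297


def top_left_mm_to_points(x_mm, y_mm, page_height_mm=A4_HEIGHT_MM):
    """Convert (x, y) in mm from the top-left corner to points from the bottom-left."""
    return x_mm * MM, (page_height_mm - y_mm) * MM


def draw_label(output_path, text, x_mm, y_mm):
    """Draw text at a position given in mm from the top-left of an A4 page."""
    # Imported here so the conversion helper also works without reportlab installed.
    from reportlab.pdfgen import canvas

    c = canvas.Canvas(output_path, pagesize=(A4_WIDTH_MM * MM, A4_HEIGHT_MM * MM))
    x, y = top_left_mm_to_points(x_mm, y_mm)
    c.drawString(x, y, text)
    c.showPage()
    c.save()


x, y = top_left_mm_to_points(20, 30)
print(round(x, 2), round(y, 2))
# To write the PDF, call: draw_label("label.pdf", "Hello World", 20, 30)
```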

You can find more info about reportlab here

iii. Pyppeteer

We talked about Puppeteer in the Generate a PDF with JavaScript article and how it is a tool for automating the browser. Pyppeteer is an unofficial Python port of Puppeteer, the automation library for the Chrome browser.

Main differences between Puppeteer and Pyppeteer

  • Pyppeteer accepts both dictionary input parameters and keyword arguments
  • Pyppeteer does not use $ in method names, because $ is not valid in Python identifiers
  • Page.evaluate() and Page.querySelectorEval() may fail and require you to add the force_expr=True option to force input strings to be treated as expressions

Install it using the following command:

pip install pyppeteer

A code example:

import asyncio
from pyppeteer import launch

# define an async function
async def main():
    # launch a browser session
    browser = await launch()
    # open a new page
    page = await browser.newPage()
    # go to a specific address or file (placeholder: use the absolute path to your HTML file)
    await page.goto('file:///path_to_html_file.html')
    # take a screenshot of the page
    await page.screenshot({'path': 'sample.png'})
    # save the page as a PDF
    await page.pdf({'path': 'pyppeteer_pdf.pdf'})
    # close the browser
    await browser.close()

# run the async main function
asyncio.run(main())
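
Since page.goto() expects a URL, a local HTML file has to be passed as a file:// URL. The sketch below builds that URL with pathlib and passes a couple of page.pdf() options. The helper names are hypothetical, and the 'format' and 'printBackground' options are assumed to mirror Puppeteer's naming.

```python
import asyncio
from pathlib import Path


def to_file_url(html_path):
    """Turn a local path into an absolute file:// URL for page.goto()."""
    return Path(html_path).resolve().as_uri()


async def html_file_to_pdf(html_path, pdf_path):
    # Imported here so to_file_url also works without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.goto(to_file_url(html_path))
        # Render the page on A4 paper, including CSS backgrounds.
        await page.pdf({'path': pdf_path, 'format': 'A4', 'printBackground': True})
    finally:
        # Close the browser even if rendering fails.
        await browser.close()


print(to_file_url('file.html'))
# To render: asyncio.run(html_file_to_pdf('file.html', 'pyppeteer_pdf.pdf'))
```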

You can read more about Pyppeteer here

iv. Python-Wkhtmltopdf

wkhtmltopdf is a widely used command-line tool for generating PDFs from HTML URLs. Python-Wkhtmltopdf is a wrapper that makes this command-line tool usable from Python. You can install it using the following command:

pip install py3-wkhtmltopdf==0.4.1

The usage is simple: import the library and give the wkhtmltopdf API the URL and the path for the output file.

from wkhtmltopdf import wkhtmltopdf

wkhtmltopdf(url='apitemplate.io', output_file='wkhtmltopdf.pdf')
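
Because the wrapper relies on the wkhtmltopdf command-line tool, that binary has to be installed and on your PATH. As a rough sketch of what happens underneath, the following calls the tool directly with subprocess, using its basic form wkhtmltopdf <input> <output>. The function names are hypothetical and this is not the wrapper's actual implementation.

```python
import shutil
import subprocess


def build_command(source, output_file, binary="wkhtmltopdf"):
    """Build the argument list for the basic form: wkhtmltopdf <input> <output>."""
    return [binary, source, output_file]


def convert(source, output_file):
    """Run wkhtmltopdf if it is installed; raise a clear error otherwise."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH; install it first")
    subprocess.run(build_command(source, output_file), check=True)


print(build_command("https://apitemplate.io/", "wkhtmltopdf.pdf"))
# To convert: convert("https://apitemplate.io/", "wkhtmltopdf.pdf")
```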

You can find more information here

v. Pdfkit

Pdfkit is a wrapper for wkhtmltopdf that makes it very easy to generate PDFs from various sources like files, strings, and URLs. You can install it using the following command:

pip install pdfkit

Pdfkit supports features like:

  • Vector graphics
  • Text features like wrapping, aligning and bullet lists
  • PNG and JPEG Image embedding
  • Annotation features like Highlights and underlines
  • PDF security like encryption

An example of the generation of a PDF from a file is:

#importing pdfkit
import pdfkit

# calling the from file method to convert file to pdf
pdfkit.from_file('file.html', 'file.pdf')

It also supports generating PDFs from links by calling the from_url method.

pdfkit.from_url('https://apitemplate.io/', 'python.pdf')

You can also specify page settings such as size, margins, and encoding like the following:

options = {
    'page-size': 'A4',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header': [
        ('Accept-Encoding', 'gzip')
    ],
    'cookie': [
        ('cookie-empty-value', '""'),
        ('cookie-name1', 'cookie-value1'),
        ('cookie-name2', 'cookie-value2'),
    ],
    'no-outline': None
}
pdfkit.from_file('file.html', 'file.pdf', options=options)
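
Since Pdfkit wraps wkhtmltopdf, each key in the options dictionary corresponds to a command-line flag of that tool. The sketch below illustrates the idea for the value shapes used above: plain values, None for flags without a value, and lists of pairs for repeatable options. The function options_to_args is hypothetical and is not Pdfkit's actual code.

```python
def options_to_args(options):
    """Illustrative mapping from a pdfkit-style options dict to wkhtmltopdf flags."""
    args = []
    for name, value in options.items():
        flag = f"--{name}"
        if value is None:
            # Flag without a value, e.g. --no-outline
            args.append(flag)
        elif isinstance(value, (list, tuple)):
            # Repeatable option: emit the flag once per (name, value) pair
            for pair in value:
                args.extend([flag, *pair])
        else:
            args.extend([flag, str(value)])
    return args


options = {
    'page-size': 'A4',
    'cookie': [('cookie-name1', 'cookie-value1'), ('cookie-name2', 'cookie-value2')],
    'no-outline': None,
}
print(options_to_args(options))
```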

You can learn more about pdfkit from here

3. Comparison of PDF Generation Libraries in Python

So we have a lot of options to choose from. The only remaining question is which one is most suitable for you. I would say it depends on your application and what you actually need to do.

Here’s a detailed comparison table for these libraries:

| Feature | ReportLab | Pyppeteer | PDFKit | FPDF | python-wkhtmltopdf |
|---|---|---|---|---|---|
| Primary Use Case | PDF generation with strong support for complex layouts and graphics | Web scraping and browser automation, can generate screenshots and PDFs | PDF generation from HTML using wkhtmltopdf as a backend | PDF generation focusing on ease of use without external dependencies | PDF generation from HTML, leveraging wkhtmltopdf capabilities |
| Capabilities | High-quality PDFs, charts, graphics | Headless Chrome/Chromium browser automation | Converts HTML to PDF, simple API | Simple PDF generation, customizable | Converts HTML to PDF with precise rendering of web pages |
| Syntax Ease | Moderate to complex, flexible API | Complex, requires understanding of async programming | Simple, minimal coding required | Very simple, easy to learn | Simple, acts as a wrapper around wkhtmltopdf |
| Supported Platforms | Cross-platform (Windows, macOS, Linux) | Cross-platform (Windows, macOS, Linux) | Cross-platform (needs wkhtmltopdf installed) | Cross-platform (Windows, macOS, Linux) | Cross-platform (requires wkhtmltopdf) |
| Installation Complexity | Requires Python and library installation | Requires Python and a Chromium binary (Pyppeteer downloads one on first run) | Requires wkhtmltopdf to be installed separately | Only needs Python and FPDF | Requires both Python and wkhtmltopdf installations |
| Unique Features | Powerful layout engine, extensive documentation | Automates web interactions, generates PDFs/screenshots | Simple API, relies on robust wkhtmltopdf | Pure Python with no dependencies, supports plugins | Utilizes web rendering engines for accurate PDF creation |

For example, you might want to build a PDF from scratch, convert existing HTML into a PDF, or fill a particular template and convert it into a specific format.

So if you want to convert HTML into PDF, I believe PDFKit, FPDF, and wkhtmltopdf are the best options you have, with PDFKit being the most popular of them. On the other hand, if you want to render PDFs, your options are Pyppeteer and ReportLab.

We’ve put together an article that explains how to convert HTML to PDF using Python, which includes a section on using APITemplate.io’s REST API for the HTML to PDF conversion.

ReportLab's advantage is that it supports a wide variety of graphs, like line plots and bar charts, and can embed images. On the other hand, it doesn't provide a method for creating footers and footnotes, and it can embed only JPEG images, although with the right Python extension you can extend this to 30 more formats. ReportLab is also more comprehensive, which makes it more difficult for beginners.

Pyppeteer, on the other hand, provides better rendering and is easier to pick up if you are familiar with its JavaScript version, but it only supports specific browsers like Chrome and Chromium, which must be available on your machine for the tool to work.

Each library serves somewhat different needs, so the best choice depends on the specific requirements of your project.

4. Conclusion

This article talked about five of the most popular Python libraries for generating PDFs.

We had a brief introduction to FPDF, python-wkhtmltopdf, Pyppeteer, ReportLab, and PDFKit. We also compared them on properties like use case, capabilities, ease of use, platform support, and installation complexity.

Finally, if you want a tool with all the features of these libraries and more, APITemplate.io offers PDF creation, HTML to PDF conversion, and compatibility with no-code/low-code tools, making it easy for businesses to generate PDF documents.

Sign up for a free account today and start automating your PDF generation process.
