1. Introduction
Python has become essential for everyday developer tasks, and whatever your role, you will likely need to know how to code with it.
Python is used for automation, testing, web development, and data analysis. HTML, on the other hand, is the primary markup language of web development and web-based applications.
One of Python’s superpowers is working with data in almost any format and converting it to another. PDF is a portable format for viewing data consistently across devices, platforms, and operating systems.
In this article, we will talk about how to generate PDFs using Python. We will introduce several libraries, including FPDF, ReportLab, Pyppeteer, Python-Wkhtmltopdf, and PDFKit, and look at the differences between them.
Note: If you’re looking for a way to generate PDF documents from HTML, please visit our other blog post for a comprehensive guide: Convert HTML to PDF using Python with 5 Popular Libraries
2. Five Popular Libraries for PDF Generation in Python
There are many Python libraries for working with PDFs. We will introduce some popular ones that make it easy to generate PDFs, including from HTML files.
i. FPDF
FPDF (Free-PDF) is a Python library, ported from PHP, for generating PDFs. It provides various functionalities, such as generating PDFs from text files and writing your own data formats to PDF.
While FPDF supports HTML, it only understands basic HTML and doesn’t understand CSS. HTML support comes from HTMLMixin, a class you combine with FPDF to get the write_html method.
You can install FPDF with pip using the following command.
pip install fpdf==1.7.2
FPDF supports:
- Page formatting
- Images, links, colours
- Automatic line and page breaks
A code example:
from fpdf import FPDF, HTMLMixin

# creating a class inherited from both FPDF and HTMLMixin
class MyFPDF(FPDF, HTMLMixin):
    pass

# instantiating the class
pdf = MyFPDF()
# adding a page
pdf.add_page()
# opening the HTML file and reading its contents as a string
with open("file.html", "r") as file:
    data = file.read()
# HTMLMixin write_html method
pdf.write_html(data)
# saving the file as a PDF
pdf.output('Python_fpdf.pdf', 'F')
The previous example takes a file named file.html and converts it into a PDF file named Python_fpdf.pdf with the help of the HTMLMixin class.
You can find more about FPDF here
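FPDF can also build a PDF from scratch, without any HTML. The following is a minimal sketch, assuming the fpdf package installed above; the function name build_text_pdf, the sample lines, and the output file name are hypothetical and chosen only for illustration.

```python
def build_text_pdf(lines, output_path):
    """Write each string in `lines` on its own row of a one-page PDF."""
    # Imported inside the function so that loading this sketch
    # does not fail when fpdf is not installed.
    from fpdf import FPDF

    pdf = FPDF()                        # defaults: portrait, millimetres, A4
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)  # one of the built-in core fonts
    for line in lines:
        # Width 0 stretches the cell to the right margin;
        # ln=1 moves the cursor to the start of the next line.
        pdf.cell(0, 10, line, ln=1)
    pdf.output(output_path, "F")        # "F" saves to a local file
    return output_path
```

Calling build_text_pdf(["Hello", "World"], "fpdf_text.pdf") should produce a one-page PDF with two lines of text. Automatic page breaks, listed among the supported features above, take over once the lines no longer fit on one page.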
ii. ReportLab
ReportLab is a Python library that helps you create PDFs. It has an open-source version and a commercial version; the difference is that the commercial version supports Report Markup Language (RML). Both provide the following features:
- Supports dynamic web PDF generation
- Supports converting XML into PDF
- Supports vector graphics and inclusion of other PDF files
- Supports the creation of time charts and tables
You can install it using the following command:
pip install reportlab
ReportLab is a complex tool with many capabilities for defining your own PDF format and style. The simplest example looks like this:
from reportlab.pdfgen import canvas
c = canvas.Canvas("reportlab_pdf.pdf")
c.drawString(100,100,"Hello World")
c.showPage()
c.save()
You can find more info about ReportLab here
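The coordinates passed to drawString are measured in points from the bottom-left corner of the page, so laying out several lines means computing y values that decrease down the page. Here is a small sketch of that idea; the helper names line_positions and draw_lines, and the margin and leading values, are assumptions made for illustration rather than part of ReportLab.

```python
def line_positions(count, page_height, top_margin=72, leading=18):
    """Return the y coordinate (in points) for each of `count` text lines.

    The canvas origin is the bottom-left corner, so the first line sits
    `top_margin` points below the top edge and later lines move downwards.
    """
    first = page_height - top_margin
    return [first - i * leading for i in range(count)]


def draw_lines(lines, output_path):
    """Draw `lines` from top to bottom on a single A4 page."""
    # Imported inside the function so the pure helper above
    # can be used without ReportLab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    width, height = A4                   # page size in points (1/72 inch)
    c = canvas.Canvas(output_path, pagesize=A4)
    c.setFont("Helvetica", 12)
    for text, y in zip(lines, line_positions(len(lines), height)):
        c.drawString(72, y, text)        # 72 pt = 1 inch left margin
    c.showPage()
    c.save()
    return output_path
```

For example, draw_lines(["First line", "Second line"], "reportlab_lines.pdf") places the first line one inch below the top edge and the second line 18 points under it.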
iii. Pyppeteer
We previously covered Puppeteer, a tool for automating the browser, in our Generate a PDF with JavaScript article.
Pyppeteer is an unofficial Python port of Puppeteer, the browser automation library for Chrome and Chromium.
Main differences between Puppeteer and Pyppeteer:
- Pyppeteer accepts both dictionary input parameters and keyword arguments
- Python does not allow $ in method names, so Pyppeteer uses names without it (for example, Page.querySelector() instead of Page.$())
- Page.evaluate() and Page.querySelectorEval() may fail and require you to add the force_expr=True option to force input strings to be treated as expressions
Install it using the following command:
pip install pyppeteer
A code example:
import asyncio
from pyppeteer import launch

# defining an async function
async def main():
    # launching a browser session
    browser = await launch()
    # opening a new page
    page = await browser.newPage()
    # go to a specific address or file (replace with the path to your HTML file)
    await page.goto('file:///path_to_html_file.html')
    # create a screenshot of the page
    await page.screenshot({'path': 'sample.png'})
    # render the page as a PDF
    await page.pdf({'path': 'pyppeteer_pdf.pdf'})
    # close the browser
    await browser.close()

# invocation of the async main function
asyncio.get_event_loop().run_until_complete(main())
You can read more about Pyppeteer here
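Pyppeteer can also render an HTML string directly instead of navigating to a file or URL. The sketch below assumes Pyppeteer is installed and can launch Chromium; the function names pdf_options and html_to_pdf and the chosen option values are hypothetical, while the option keys follow Puppeteer's naming.

```python
import asyncio


def pdf_options(output_path):
    """Options handed to page.pdf(); the keys mirror Puppeteer's option names."""
    return {"path": output_path, "format": "A4", "printBackground": True}


async def html_to_pdf(html, output_path):
    """Render an HTML string in headless Chromium and save it as a PDF."""
    # Imported inside the function so the options helper above
    # can be used without Pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    try:
        page = await browser.newPage()
        await page.setContent(html)               # load the markup directly
        await page.pdf(pdf_options(output_path))  # dictionary-style options
    finally:
        await browser.close()                     # always release the browser
    return output_path


# Usage (not executed here, since it needs a Chromium binary):
# asyncio.run(html_to_pdf("<h1>Hello</h1>", "pyppeteer_inline.pdf"))
```

Because Pyppeteer accepts keyword arguments as well as dictionaries, the same call could be written as page.pdf(path=output_path, format="A4", printBackground=True).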
iv. Python-Wkhtmltopdf
wkhtmltopdf is a widely used command-line tool for generating PDFs from HTML pages and URLs. Python-Wkhtmltopdf is a wrapper that makes this tool usable from Python. You can install it using the following command:
pip install py3-wkhtmltopdf==0.4.1
The usage is simple: import the library and pass the wkhtmltopdf function the URL and the path for the output file.
from wkhtmltopdf import wkhtmltopdf
wkhtmltopdf(url='apitemplate.io', output_file='wkhtmltopdf.pdf')
You can find more information here
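Since the wrapper drives a command-line tool, it can help to see what an equivalent direct call looks like. This is a rough sketch using only the standard library, assuming the wkhtmltopdf binary is on your PATH; build_command and convert are hypothetical helper names, not part of any wrapper library.

```python
import shutil
import subprocess


def build_command(source, output_file, page_size="A4"):
    """Return the argument list for: wkhtmltopdf --page-size <size> <source> <output>."""
    return ["wkhtmltopdf", "--page-size", page_size, source, output_file]


def convert(source, output_file):
    """Run wkhtmltopdf; return False if the binary is not installed."""
    if shutil.which("wkhtmltopdf") is None:
        return False  # nothing to run: the binary is not on PATH
    # check=True raises CalledProcessError if the tool exits with an error
    subprocess.run(build_command(source, output_file), check=True)
    return True
```

For instance, convert("https://apitemplate.io/", "wkhtmltopdf_cli.pdf") runs the tool once and writes the PDF next to your script.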
v. PDFKit
PDFKit is a wrapper for wkhtmltopdf that makes it very easy to generate PDFs from various sources such as files, strings, and URLs. You can install it using the following command:
pip install pdfkit
Pdfkit supports features like:
- Vector graphics
- Text features like wrapping, aligning and bullet lists
- PNG and JPEG Image embedding
- Annotation features like Highlights and underlines
- PDF security like encryption
An example of the generation of a PDF from a file is:
#importing pdfkit
import pdfkit
# calling the from file method to convert file to pdf
pdfkit.from_file('file.html', 'file.pdf')
It also supports generating PDFs from links by calling the from_url method.
pdfkit.from_url('https://apitemplate.io/', 'python.pdf')
You can also specify page settings, encoding, headers, and cookies, as in the following:
options = {
    'page-size': 'A4',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header': [
        ('Accept-Encoding', 'gzip')
    ],
    'cookie': [
        ('cookie-empty-value', '""'),
        ('cookie-name1', 'cookie-value1'),
        ('cookie-name2', 'cookie-value2'),
    ],
    'no-outline': None
}
pdfkit.from_file('file.html', 'file.pdf', options=options)
You can learn more about pdfkit from here
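The keys in the options dictionary are wkhtmltopdf option names without the leading dashes. To make that mapping concrete, here is an approximate sketch of how such a dictionary could be turned into command-line arguments. The helper options_to_args is hypothetical and written for illustration; it is not pdfkit's own code and may differ from it in details.

```python
def options_to_args(options):
    """Approximate how an options dict maps to wkhtmltopdf arguments."""
    args = []
    for name, value in options.items():
        flag = name if name.startswith("--") else "--" + name
        if isinstance(value, list):
            # Repeatable options such as 'cookie' appear once per entry.
            for item in value:
                args.append(flag)
                args.extend(item if isinstance(item, tuple) else [str(item)])
        elif value is None:
            args.append(flag)                # option without a value
        else:
            args.extend([flag, str(value)])  # ordinary key/value option
    return args


print(options_to_args({
    "page-size": "A4",
    "no-outline": None,
    "cookie": [("cookie-name1", "cookie-value1")],
}))
# prints ['--page-size', 'A4', '--no-outline', '--cookie', 'cookie-name1', 'cookie-value1']
```

This is why 'no-outline' is set to None in the options above: it is a switch that takes no value.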
3. Comparison of the 5 Libraries for PDF Generation in Python
So we have a lot of options to choose from. The only remaining question is which one suits you best, and that depends on your application and what you actually need to do.
Here’s a detailed comparison table for these libraries:
Feature | ReportLab | Pyppeteer | PDFKit | FPDF | python-wkhtmltopdf |
---|---|---|---|---|---|
Primary Use Case | PDF generation with strong support for complex layouts and graphics | Web scraping and browser automation, can generate screenshots as PDF | PDF generation from HTML using wkhtmltopdf as a backend | PDF generation focusing on ease of use without external dependencies | PDF generation from HTML, leveraging wkhtmltopdf capabilities |
Capabilities | High-quality PDFs, charts, graphics | Headless Chrome/Chromium browser automation | Converts HTML to PDF, simple API | Simple PDF generation, customizable | Converts HTML to PDF with precise rendering of web pages |
Syntax Ease | Moderate to complex, flexible API | Complex, requires understanding of async programming | Simple, minimal coding required | Very simple, easy to learn | Simple, acts as a wrapper around wkhtmltopdf |
Supported Platforms | Cross-platform (Windows, macOS, Linux) | Cross-platform (Windows, macOS, Linux) | Cross-platform (needs wkhtmltopdf installed) | Cross-platform (Windows, macOS, Linux) | Cross-platform (requires wkhtmltopdf) |
Installation Complexity | Requires Python and library installation | Requires Python and library installation; downloads a Chromium binary on first run | Requires wkhtmltopdf to be installed separately | Only needs Python and FPDF | Requires both Python and wkhtmltopdf installations |
Unique Features | Powerful layout engine, extensive documentation | Automates web interactions, generates PDFs/screenshots | Simple API, relies on robust wkhtmltopdf | Pure Python with no dependencies, supports plugins | Uses web rendering engines for accurate PDF creation |
For example, you may want to build a PDF from scratch, simply convert HTML into a PDF, or fill a particular template and convert it into a specific format.
If you want to convert HTML into PDF, I believe PDFKit, FPDF, and Wkhtmltopdf are the best options you have, with PDFKit being the most popular of them. On the other hand, if you want to render a page in a real browser or draw a PDF yourself, your options are Pyppeteer and ReportLab.
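The template use case combines naturally with HTML to PDF conversion: fill an HTML template first, then hand the resulting string to a converter. The following is a minimal sketch using the standard library's string.Template and PDFKit's from_string method; the invoice template, field names, and function names are hypothetical sample data.

```python
import html
from string import Template

# Hypothetical HTML template with two placeholders.
INVOICE_TEMPLATE = Template(
    "<html><body><h1>Invoice for $customer</h1>"
    "<p>Amount due: $amount</p></body></html>"
)


def render_invoice(customer, amount):
    """Fill the template, escaping the name so it cannot break the markup."""
    return INVOICE_TEMPLATE.substitute(
        customer=html.escape(customer),
        amount=f"{amount:.2f}",
    )


def invoice_to_pdf(customer, amount, output_path):
    """Convert the filled template to a PDF (needs pdfkit and wkhtmltopdf)."""
    # Imported inside the function so render_invoice works without pdfkit.
    import pdfkit

    pdfkit.from_string(render_invoice(customer, amount), output_path)
    return output_path
```

Keeping the rendering step separate from the conversion step makes it easy to swap PDFKit for another converter later without touching the template code.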
We’ve put together an article that explains how to convert HTML to PDF using Python, which includes a section on using APITemplate.io’s REST API for the HTML to PDF conversion.
ReportLab’s advantage is that it supports a wide variety of graphs, such as line plots and bar charts, and can embed images. On the other hand, it doesn’t provide a method for creating footers and footnotes, and it can embed only JPEG images, although with the right Python extension you can extend this to 30 more formats. ReportLab is also more comprehensive and therefore harder for beginners.
Pyppeteer, in turn, provides better rendering and is easier to pick up if you are familiar with its JavaScript counterpart, but it only supports specific browsers, such as Chrome and Chromium, which must be available on your machine for the tool to work.
Each library serves somewhat different needs, so the best choice depends on the specific requirements of your project.
4. Conclusion
This article covered five of the most popular Python libraries for generating PDFs.
We briefly introduced FPDF, Python-Wkhtmltopdf, Pyppeteer, ReportLab, and PDFKit. We also compared them on properties such as primary use case, capabilities, ease of syntax, supported platforms, and installation complexity.
Finally, if you want a tool with the features of these libraries and more, APITemplate.io offers PDF creation, HTML to PDF conversion, and compatibility with no-code/low-code tools, making it easy for businesses to generate PDF documents.
Sign up for a free account today and start automating your PDF generation process.