Generating PDFs in Java with 3 Popular Libraries

By Ahmed Hashesh
April 17, 2022

1. Introduction

Creating PDF documents programmatically is a common requirement in software development, especially when dealing with report generation, invoicing, or any content that requires a printable or easily distributable format.

Java, being one of the most widely used programming languages, offers several libraries to simplify this task. This article explains how to generate PDFs from HTML files in Java using OpenHTMLtoPDF, iTextPDF, and Flying Saucer, and how the three libraries differ.

Each of these libraries brings unique features and capabilities to the table, from converting HTML content directly into PDF files to providing detailed control over the document’s appearance and layout.

Let’s get started now.

2. Generating PDFs in Java with 3 Libraries

i. OpenHTMLtoPDF

OpenHTMLtoPDF is an open-source Java library that converts well-formed XML/XHTML into PDFs or images.

It renders the XHTML and then uses Apache PDFBox to write the PDF. Apache PDFBox is an open-source Java library that supports creating and converting PDF documents.

In this tutorial, we will use the PdfRendererBuilder class from the library, which provides different methods to generate the PDF:

run(): Runs the XHTML/XML to PDF conversion.
toStream(): Sets the output stream that receives the resulting PDF.
withUri(): Sets the URI (Uniform Resource Identifier) of the document to convert to PDF.

You can find more about these methods in the documentation here.

Code Example

The following code example shows a simple use of OpenHTMLtoPDF: it passes the URI of the HTML file to the builder, sets an output stream for the result, and then runs the XML/XHTML to PDF conversion.

import java.io.FileOutputStream;
import java.io.OutputStream;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;

public class SimpleUsage
{
    public static void main(String[] args) throws Exception {
        try (OutputStream os = new FileOutputStream("out.pdf")) {
            PdfRendererBuilder builder = new PdfRendererBuilder();
            builder.useFastMode();
            // Set the URI of the HTML file to convert
            builder.withUri("file:in.htm");
            // Send the resulting PDF to the output stream
            builder.toStream(os);
            // Run the XHTML/XML to PDF conversion
            builder.run();
            // Reached only if run() completed without throwing an exception
            System.out.println("PDF created");
        }
    }
}
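
How a relative URI such as "file:in.htm" is resolved may depend on the directory the program is launched from. One way to reduce that ambiguity is to build an absolute file: URI from a path with the JDK before handing it to withUri(). The following is a minimal sketch using only the standard library; the class name HtmlUri and the helper toFileUri are hypothetical and are not part of OpenHTMLtoPDF.

```java
import java.nio.file.Paths;

// Hypothetical helper (not part of OpenHTMLtoPDF): turns a local path into an absolute file: URI.
final class HtmlUri {
    private HtmlUri() {}

    /** Converts a local path such as "in.htm" into an absolute, normalized file: URI string. */
    static String toFileUri(String path) {
        return Paths.get(path).toAbsolutePath().normalize().toUri().toString();
    }

    public static void main(String[] args) {
        // The printed value depends on the current working directory.
        System.out.println(toFileUri("in.htm"));
    }
}
```

The returned string could then be passed to the builder in place of the hard-coded value, for example builder.withUri(HtmlUri.toFileUri("in.htm")).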

Maven Dependency

Maven is a build tool that standardizes the build process and manages project dependencies.
Add the following dependencies to the pom.xml file to get the above code to compile and run. The ${openhtml.version} property must be defined in the properties section of the POM.

 <dependencies>
    <dependency>
        <!-- ALWAYS required, usually included transitively. -->
        <groupId>com.openhtmltopdf</groupId>
        <artifactId>openhtmltopdf-core</artifactId>
        <version>${openhtml.version}</version>
    </dependency>

    <dependency>
        <!-- Required for PDF output. -->
        <groupId>com.openhtmltopdf</groupId>
        <artifactId>openhtmltopdf-pdfbox</artifactId>
        <version>${openhtml.version}</version>
    </dependency>
 </dependencies>

You can find more information about OpenHTMLtoPDF here.

ii. iTextPDF

iTextPDF is a library that provides an API to create PDF, RTF, and HTML documents.

iTextPDF has a hierarchical structure: the smallest unit of text is a “Chunk”, and combining Chunks forms a “Phrase”. Phrase has subclasses such as “Paragraph”, which in turn has subclasses of its own. In this tutorial, we will use the following iTextPDF classes.

PdfWriter: A DocWriter class for PDF; with this class, every element added to a document is written to the output stream.

XMLWorkerHelper: A helper class for parsing XHTML/CSS or XML flow to PDF.

You can find more about these classes in the documentation: PdfWriter, XMLWorkerHelper.

Code Example

The following code example demonstrates the simplest way to generate a PDF from an HTML file: it obtains the singleton instance of the XMLWorkerHelper class, which parses the HTML file and adds the parsed content to the document through the PdfWriter instance to generate the PDF.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

public class Html2Pdf {
    private static final String HTML = "html.html";

    public static void main(String[] args) {
        Document document = new Document();
        // try-with-resources closes the HTML input stream even if parsing fails
        try (FileInputStream html = new FileInputStream(HTML)) {
            // Get a PdfWriter instance that writes the document to html.pdf
            PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("html.pdf"));
            document.open();
            // Get the singleton XMLWorkerHelper and parse the XHTML from the input stream into the document
            XMLWorkerHelper.getInstance().parseXHtml(writer, document, html);
            document.close();
        } catch (IOException | DocumentException e) {
            e.printStackTrace();
        }
    }
}

Maven Dependency

Add the following dependencies to the pom.xml file to get the above code to compile and run.

<dependency>
   <groupId>com.itextpdf</groupId>
   <artifactId>itextpdf</artifactId>
   <version>${itextpdf.version}</version>
</dependency>
<dependency>
   <groupId>com.itextpdf.tool</groupId>
   <artifactId>xmlworker</artifactId>
   <version>${xmlworker.version}</version>
</dependency>

You can find more information about iTextPDF here.

iii. Flying Saucer

Flying Saucer is a Java library for converting XML/XHTML into PDF or images. It was built on top of iText for its PDF output.

Code Example

The following code demonstrates how to use Flying Saucer (the xhtmlrenderer package) together with the Jsoup library.

Jsoup is an open-source Java library to parse, extract, and manipulate data from HTML files. Because Flying Saucer needs well-formed XHTML, we first open the HTML file as a File object, let Jsoup parse it, and switch Jsoup's output syntax to XML; you can find more about this here.

After parsing the HTML file, we pass the resulting XHTML string to Flying Saucer to convert it into a PDF.

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import org.xhtmlrenderer.pdf.ITextRenderer;

public class Main {

   public static void main(String[] args) throws Exception {

       try (OutputStream os = new FileOutputStream("out.pdf")) {
           // Open the HTML file from its path
           File in = new File("html.html");
           // Parse the file; a null charset lets Jsoup detect the encoding (falling back to UTF-8)
           Document document = Jsoup.parse(in, null);

           // Serialize with XML syntax so the output is well-formed XHTML
           document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);

           ITextRenderer iTextRenderer = new ITextRenderer();
           iTextRenderer.setDocumentFromString(document.html());
           iTextRenderer.layout();
           iTextRenderer.createPDF(os);
           System.out.println("PDF created");
       }
   }
}
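
The Jsoup step matters because ordinary HTML is often not well-formed XML; an unclosed <br> tag is a typical example. As a rough illustration of what "well-formed" means here, the sketch below checks a markup string with the XML parser that ships with the JDK. The class XhtmlCheck and the method isWellFormed are hypothetical names used only for this example; they are not part of Jsoup or Flying Saucer, and the check does not validate against the XHTML DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a markup string parses as well-formed XML.
final class XhtmlCheck {
    private XhtmlCheck() {}

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: the JDK's built-in parser is in use; this feature stops it from fetching external DTDs.
            factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup, so it is not well-formed XML.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser could not be set up or read the input", e);
        }
    }
}
```

With this sketch, a string such as "<html><body><br></body></html>" is rejected, while the same markup with a self-closing "<br/>" is accepted.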

Maven Dependency

Add the following dependencies to the pom.xml file to get the above code to compile and run.

<dependency>
   <groupId>org.jsoup</groupId>
   <artifactId>jsoup</artifactId>
   <version>1.14.3</version>
</dependency>
<dependency>
   <groupId>org.xhtmlrenderer</groupId>
   <artifactId>flying-saucer-core</artifactId>
   <version>9.1.22</version>
</dependency>

<dependency>
   <groupId>org.xhtmlrenderer</groupId>
   <artifactId>flying-saucer-pdf-openpdf</artifactId>
   <version>9.1.22</version>
</dependency>

You can find more information here.
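
Whichever library is used, a message such as "PDF created" only shows that the conversion call returned. A quick sanity check on the result is to confirm that the output starts with the "%PDF-" signature that PDF files begin with. The sketch below uses only the standard library and assumes Java 11 or later (for InputStream.readNBytes); PdfSanityCheck and hasPdfHeader are hypothetical names, and passing the check does not prove the file is a valid PDF.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: checks for the "%PDF-" signature at the start of the data.
final class PdfSanityCheck {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSanityCheck() {}

    /** Returns true if the byte array begins with the "%PDF-" signature. */
    static boolean hasPdfHeader(byte[] data) {
        return data.length >= PDF_MAGIC.length
                && Arrays.equals(Arrays.copyOf(data, PDF_MAGIC.length), PDF_MAGIC);
    }

    /** Reads only the first few bytes of the file and checks them for the signature. */
    static boolean hasPdfHeader(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            return hasPdfHeader(in.readNBytes(PDF_MAGIC.length));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the examples above, this could be called after the conversion as PdfSanityCheck.hasPdfHeader(Path.of("out.pdf")).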

3. Comparison of All Libraries

OpenHTMLtoPDF, iTextPDF, and Flying Saucer are three popular libraries used by developers to create PDF documents. These libraries vary in their approach, usability, and feature sets.

This comparison aims to provide a clear overview of each, helping developers choose the one that best suits their project requirements.

Feature	OpenHTMLtoPDF	iTextPDF	Flying Saucer
Rendering Engine	Uses its own PDFBox-based renderer	Own rendering engine	Uses iText as the rendering engine
HTML/CSS Support	Good support for CSS 2.1	Excellent HTML and CSS support including some HTML5 features	Limited to CSS 2.1 and XHTML
Extensibility	Moderate	High, with extensive customization options	Moderate
Performance	Good	Very good, optimized for performance	Good, but dependent on iText
Community and Support	Growing community, responsive support	Large community, professional support available at a cost	Smaller community, limited updates
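Licensing	LGPL (builds on PDFBox, which uses the Apache License 2.0)	AGPL license for open source; commercial license needed for proprietary use	LGPL; the flying-saucer-pdf-openpdf artifact used above builds on OpenPDF (LGPL/MPL) rather than AGPL-licensed iText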
Ease of Use	Easy to use for basic PDF generation	Steep learning curve but versatile	Relatively easy to integrate
Special Features	Direct image rendering, web page to PDF	Advanced features like PDF manipulation, merging, splitting	Mainly focused on HTML to PDF conversion

First, note that Flying Saucer is based on iText, so the two are closely related. Flying Saucer provides an easier approach for simple HTML to PDF conversions, but with more limited HTML and CSS capability than the other two.

OpenHTMLtoPDF, however, is based on a different library, Apache PDFBox. PDFBox is a well-maintained, open-source library released under the Apache License 2.0, whereas iTextPDF is licensed under the AGPL.

OpenHTMLtoPDF is also considered faster than Flying Saucer. In addition, OpenHTMLtoPDF is an excellent choice for straightforward PDF generation with good HTML and CSS support, and it benefits from a license that is less restrictive than the AGPL.

iTextPDF can be considered much more resource-efficient than PDFBox, as it processes the text chunk by chunk and has an event-oriented architecture.

On the other hand, OpenHTMLtoPDF provides built-in plugins for SVG and MathML and better support for CSS3 transforms. One drawback of OpenHTMLtoPDF is that it does not support OpenType fonts.

4. Conclusion

In this article, we discussed how to generate PDFs from HTML files using Java. We briefly introduced three libraries, OpenHTMLtoPDF, iTextPDF, and Flying Saucer, and compared them on properties such as HTML/CSS support, performance, licensing, ease of use, and features.

Finally, if you want a tool with all the features of these libraries and more, I recommend that you check out APITemplate.io.

APITemplate.io is a cloud-based PDF generation API that helps you generate PDFs quickly and is compatible with CSS, JavaScript, and Python. It also comes with predefined templates that you can reuse and edit.

Sign up for a free account with APITemplate.io now and start automating your PDF generation.

Ahmed Hashesh

A learner, Content Writer, Embedded SW Engineer, Machine learning and Autonomous Vehicles Enthusiast.
