Created At: September 6, 2023
Most real-world applications encounter the requirement of generating PDFs based on some content. This includes generating PDFs from custom HTML content or even generating PDFs directly from a website URL.
Earlier, we needed to write a lot of custom code to get this done, and it used to take a lot of time. But now, there are many great libraries and tools that can do this with just a few lines of code.
In this article, we will look at four popular libraries, Puppeteer, jsPDF, Playwright, and html-pdf, that we can use to generate PDFs from HTML in Node.js.
If you are looking for ways to generate PDF documents on the client side or in the browser, we have an article for you: Generate PDFs in JavaScript (Browser) with 4 Popular Methods.
Let's get started!
1. Four Popular Node.js Libraries that Convert HTML to PDF
Converting HTML pages to PDF is a common task that developers encounter, whether it’s for generating reports, invoices, or contracts. In this article, we will delve into four popular Node.js libraries for converting HTML to PDF:
- Puppeteer
- jsPDF
- Playwright
- html-pdf
We’ll explore their features, benefits, and limitations, along with code examples to help you make an informed choice for your next project. So if you’re looking for a reliable way to convert HTML to PDF in a Node.js environment, read on.
Prerequisites
Before going into the implementations, make sure that you have a Node.js environment ready. If not, you can install Node.js from the official website, nodejs.org/en.
We will also use the npm package manager to handle our dependencies.
i. Puppeteer
Puppeteer is a Node library that provides a high-level API to control headless browsers via the DevTools Protocol. It is commonly used for web scraping, automated testing, and generating PDFs or screenshots. It can perform actions such as clicking buttons, filling out forms, and capturing screenshots or PDFs.
Install Puppeteer
npm install puppeteer
Generate PDF from a website URL
const puppeteer = require('puppeteer');
async function generatePDF(url, outputPath) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(url);
await page.pdf({ path: outputPath, format: 'A4' });
await browser.close();
}
// Usage
generatePDF('https://google.com', 'google.pdf')
.then(() => console.log('PDF generated successfully'))
.catch(err => console.error('Error generating PDF:', err));
In the above code, the generatePDF method does the following:
- First, we launch Puppeteer, which starts a headless browser.
- We create a new page and navigate to the given url.
- The important part is page.pdf(), which takes the output path and the page format of the PDF.
- Finally, we close the browser once the PDF is generated.
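The steps above can be hardened a little. The following is a minimal sketch, not a definitive implementation: it waits for network activity to settle before printing and closes the browser even when an error occurs. The names buildPdfOptions and generatePDFSafely, as well as the margin values, are assumptions made for illustration.

```javascript
// Merge caller overrides into a set of default page.pdf() options.
// The defaults below are illustrative, not values from the article.
function buildPdfOptions(outputPath, overrides = {}) {
  return {
    path: outputPath,
    format: 'A4',
    printBackground: true, // include CSS backgrounds in the PDF
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
    ...overrides,
  };
}

async function generatePDFSafely(url, outputPath, overrides = {}) {
  // Required lazily so buildPdfOptions can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity has settled so late-loading content is rendered.
    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.pdf(buildPdfOptions(outputPath, overrides));
  } finally {
    // Close the browser even if navigation or rendering throws.
    await browser.close();
  }
}
```

Calling generatePDFSafely('https://google.com', 'google.pdf', { format: 'Letter' }) would override only the page format and keep the other defaults.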
Generate PDF from Custom HTML content
const puppeteer = require('puppeteer');
async function generatePDFfromHTML(htmlContent, outputPath) {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent(htmlContent);
await page.pdf({ path: outputPath, format: 'A4' });
await browser.close();
}
// Usage
const htmlContent = '<h1>Hello World</h1><p>This is custom HTML content.</p>';
generatePDFfromHTML(htmlContent, 'custom.pdf')
.then(() => console.log('PDF generated successfully'))
.catch(err => console.error('Error generating PDF:', err));
In the above code, the generatePDFfromHTML method does the following:
- First, we launch a headless browser and create a new page.
- Now, we only need to set our custom HTML content, which will be used to generate the PDF.
- Finally, we generate the PDF and close the browser.
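In practice, the custom HTML is usually built from application data rather than written by hand. The sketch below shows one way to do that while escaping user-supplied values. The helpers escapeHtml and renderReportHtml are hypothetical names introduced for this example, and the markup is only a placeholder layout.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Build a small HTML document from a title and a list of rows.
function renderReportHtml(title, rows) {
  const items = rows.map(row => `<li>${escapeHtml(row)}</li>`).join('');
  return (
    '<!DOCTYPE html><html><head><meta charset="utf-8">' +
    `<title>${escapeHtml(title)}</title></head>` +
    `<body><h1>${escapeHtml(title)}</h1><ul>${items}</ul></body></html>`
  );
}
```

The returned string can be passed directly as the htmlContent argument of generatePDFfromHTML shown above.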
ii. jsPDF
jsPDF is a JavaScript library that allows you to generate PDF files programmatically. It works in both Node.js and browser environments. It’s highly customizable, allowing you to add text, images, and even vector graphics.
Install jsPDF
npm install jspdf
Generate PDF from a website URL
const axios = require('axios');
const { jsPDF } = require('jspdf'); // jsPDF 2.x exposes the constructor as a named export
async function generatePDFfromURL(url, outputPath) {
try {
const response = await axios.get(url);
const textContent = response.data;
const doc = new jsPDF();
doc.text(textContent, 10, 10);
doc.save(outputPath);
console.log('PDF generated successfully');
} catch (error) {
console.error('Error fetching URL:', error);
}
}
// Usage
generatePDFfromURL('https://google.com', 'google.pdf');
jsPDF does not support generating PDFs directly from website URLs out of the box. We need to use a library like axios to first fetch the content of the webpage and then pass that content to jsPDF. Note that doc.text() writes the string as plain text, so the HTML markup is not rendered. jsPDF also lets you pass coordinates for each element, and the element is rendered at that location in the PDF.
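Because doc.text() writes a raw string, the fetched markup would otherwise appear in the PDF verbatim. A rough sketch of one workaround is to strip the tags and wrap the remaining text to the page width. The function names htmlToPlainText and savePlainTextPdf are assumptions for this example, and the regex-based stripping is deliberately naive; it is not a real HTML parser.

```javascript
// Naive tag stripping: drops script/style blocks and tags, decodes a few entities.
function htmlToPlainText(html) {
  return html
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&') // decoded last to avoid double-decoding
    .replace(/\s+/g, ' ')
    .trim();
}

function savePlainTextPdf(html, outputPath) {
  // Required lazily so htmlToPlainText can be used without jsPDF installed.
  const { jsPDF } = require('jspdf');
  const doc = new jsPDF(); // defaults: portrait A4, units in mm
  // Wrap to 190 mm so text stays inside an A4 page (210 mm wide) with 10 mm margins.
  const lines = doc.splitTextToSize(htmlToPlainText(html), 190);
  doc.text(lines, 10, 10);
  doc.save(outputPath);
}
```

Long documents would still run past the bottom of the first page with this sketch, so it only suits short content.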
Generate PDF from Custom HTML content
const { jsPDF } = require('jspdf'); // jsPDF 2.x exposes the constructor as a named export
function generatePDFfromHTML(htmlContent, outputPath) {
const doc = new jsPDF();
doc.text(htmlContent, 10, 10);
doc.save(outputPath);
console.log('PDF generated successfully');
}
// Usage
const htmlContent = 'Hello World. This is custom HTML content.';
generatePDFfromHTML(htmlContent, 'custom.pdf');
Generating a PDF from custom content is straightforward with jsPDF and works the same way as the website URL example above.
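Since jsPDF places every element at explicit coordinates, multi-line output needs a small amount of layout logic. The sketch below computes a y position for each line and starts a new page when the next line would overflow. layoutLines and saveLinesPdf are hypothetical helpers, and the spacing values are assumptions based on an A4 page measured in millimetres.

```javascript
// Compute where each line lands; move to a new page when it would overflow.
function layoutLines(
  lines,
  { startY = 20, lineHeight = 8, pageHeight = 297, bottomMargin = 20 } = {}
) {
  const placed = [];
  let page = 1;
  let y = startY;
  for (const text of lines) {
    if (y > pageHeight - bottomMargin) {
      page += 1;
      y = startY;
    }
    placed.push({ text, page, y });
    y += lineHeight;
  }
  return placed;
}

function saveLinesPdf(lines, outputPath) {
  // Required lazily so layoutLines can be used without jsPDF installed.
  const { jsPDF } = require('jspdf');
  const doc = new jsPDF();
  let currentPage = 1;
  for (const { text, page, y } of layoutLines(lines)) {
    if (page !== currentPage) {
      doc.addPage(); // subsequent text() calls draw on the new page
      currentPage = page;
    }
    doc.text(text, 10, y); // x = 10 mm from the left edge
  }
  doc.save(outputPath);
}
```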
iii. Playwright
Playwright is a Node.js library that allows for automation of Chromium, Firefox, and WebKit browsers.
Playwright not only offers capabilities to perform actions on a browser programmatically, but it is also useful for tasks such as web scraping, testing, and generating PDFs from web content. In this section, we will use Playwright to generate PDF documents from a website URL and from custom HTML content.
Install Playwright
First, install Playwright:
npm install playwright
Playwright needs browser binaries to run, and page.pdf() is only supported in Chromium. To install the browsers, run the following command:
npx playwright install
Generate PDF from a Website URL
The following snippet generates a PDF from a website URL using Playwright:
const playwright = require('playwright');
async function generatePDFfromURL(url, outputPath) {
const browser = await playwright.chromium.launch();
const page = await browser.newPage();
await page.goto(url);
await page.pdf({ path: outputPath });
console.log('PDF generated successfully');
await browser.close();
}
// Usage
generatePDFfromURL('https://google.com', 'custom.pdf');
Playwright directly supports PDF generation from website URLs, unlike libraries like jsPDF that require fetching the webpage content first.
Playwright’s PDF method captures the full page rendering, making it suitable for generating high-fidelity PDF documents from web pages.
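When the URL comes from user input, it is worth validating it before handing it to the browser. The following sketch, under the assumption that only http and https pages should be rendered, rejects other schemes and closes the browser in a finally block. assertHttpUrl and generatePDFfromCheckedURL are illustrative names, not part of Playwright.

```javascript
// Accept only absolute http(s) URLs before handing them to the browser.
function assertHttpUrl(input) {
  const parsed = new URL(input); // throws a TypeError on malformed input
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}

async function generatePDFfromCheckedURL(url, outputPath) {
  const target = assertHttpUrl(url);
  // Required lazily so assertHttpUrl can be used without Playwright installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle before printing.
    await page.goto(target, { waitUntil: 'networkidle' });
    await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    await browser.close();
  }
}
```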
Generate PDF from Custom HTML Content
const playwright = require('playwright');
async function generatePDFfromHTML(htmlContent, outputPath) {
const browser = await playwright.chromium.launch();
const page = await browser.newPage();
await page.setContent(htmlContent);
await page.pdf({ path: outputPath });
console.log('PDF generated successfully');
await browser.close();
}
// Usage
const htmlContent = '<p>Hello World. This is custom HTML content.</p>';
generatePDFfromHTML(htmlContent, 'custom.pdf');
Generating a PDF from custom HTML content using Playwright is as straightforward as generating it from a live webpage.
By using page.setContent(), you can load any HTML content directly into the browser context and then generate a PDF, making it a versatile tool for creating PDF documents from dynamically generated HTML content.
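A web server often needs the PDF in memory rather than on disk. When page.pdf() is called without a path, it resolves to the PDF bytes, which can then be sent as an HTTP response. This is a minimal sketch; renderPdfBuffer and pdfResponseHeaders are hypothetical helpers, and the file name is assumed to be plain ASCII without quotes.

```javascript
// Build response headers for sending a PDF to the client as a download.
function pdfResponseHeaders(filename, byteLength) {
  return {
    'Content-Type': 'application/pdf',
    'Content-Disposition': `attachment; filename="${filename}"`,
    'Content-Length': String(byteLength),
  };
}

// Render HTML to a PDF held in memory instead of written to disk.
async function renderPdfBuffer(htmlContent) {
  // Required lazily so pdfResponseHeaders can be used without Playwright installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.setContent(htmlContent);
    return await page.pdf({ format: 'A4' }); // no path: resolves to a Buffer
  } finally {
    await browser.close();
  }
}
```

Inside a Node.js http handler, this could be used as res.writeHead(200, pdfResponseHeaders('report.pdf', buffer.length)) followed by res.end(buffer).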
iv. html-pdf
html-pdf is a Node.js library that generates PDFs from HTML using PhantomJS. It is a simple yet powerful tool that allows for easy PDF generation from HTML templates.
Note from the Author: This repo isn’t maintained anymore as phantomjs got deprecated a long time ago. Please migrate to headless chrome/puppeteer.
* You may find out more about Puppeteer in the first section of the libraries
Install html-pdf
npm install html-pdf
Generate PDF from a website URL
const axios = require('axios');
const pdf = require('html-pdf');
async function generatePDFfromURL(url, outputPath) {
try {
const response = await axios.get(url);
const htmlContent = response.data;
pdf.create(htmlContent).toFile(outputPath, (err, res) => {
if (err) return console.log(err);
console.log('PDF generated successfully:', res);
});
} catch (error) {
console.error('Error fetching URL:', error);
}
}
// Usage
generatePDFfromURL('https://google.com', 'google.pdf');
html-pdf does not support generating PDFs from website URLs out of the box, which is why we first need to fetch the website content using axios. Once we have the content of the webpage, we can use the html-pdf library to generate a PDF from it.
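One detail worth noting is that toFile() is callback-based, so the async function above resolves before the file has actually been written. A possible way around this, sketched below, is to wrap the call in a Promise. createPdfFile is a hypothetical helper, and its pdfLib parameter exists only so that the wrapper can be exercised with a stub instead of the real library.

```javascript
// Wrap html-pdf's callback API in a Promise so callers can await completion.
// `pdfLib` is injectable for testing; by default the real library is loaded.
function createPdfFile(htmlContent, outputPath, options = { format: 'A4' }, pdfLib) {
  const lib = pdfLib || require('html-pdf');
  return new Promise((resolve, reject) => {
    lib.create(htmlContent, options).toFile(outputPath, (err, res) => {
      if (err) reject(err);
      else resolve(res); // res describes the written file
    });
  });
}
```

With this wrapper, a caller can write await createPdfFile(htmlContent, 'google.pdf') and be sure the file exists once the promise resolves.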
Generate PDF from Custom HTML content
const pdf = require('html-pdf');
function generatePDFfromHTML(htmlContent, outputPath) {
pdf.create(htmlContent).toFile(outputPath, (err, res) => {
if (err) return console.log(err);
console.log('PDF generated successfully:', res);
});
}
// Usage
const htmlContent = '<h1>Hello World</h1><p>This is custom HTML content.</p>';
generatePDFfromHTML(htmlContent, 'custom.pdf');
In the code above, we use the .create() method from the library, which takes our custom HTML content, and then .toFile() writes the generated PDF to the output path.
2. Comparison of Four Node.js Libraries for HTML to PDF Conversion
In this section, we will compare four popular Node.js libraries (Puppeteer, jsPDF, Playwright, and html-pdf) and see how they differ from one another.
The comparison table aims to provide a side-by-side overview of these libraries across various dimensions such as speed, capabilities, and dependencies, helping you choose the one that best suits your project requirements.
Feature | Puppeteer | jsPDF | Playwright | html-pdf |
---|---|---|---|---|
Rendering Engine | Chromium and Firefox | JavaScript Renderer | Chromium, WebKit or Firefox | PhantomJS (QtWebKit back end) |
Execution Environment | Server-side and headless browser | Client-side or Server-side | Server-side and headless browser | Server-side |
Output Quality | High (webpage accuracy) | Moderate (depends on use) | High (webpage accuracy) | Good (with some limitations) |
Ease of Use | Moderate | Easy | Moderate | Easy |
Custom Fonts Support | Yes | Yes | Yes | Yes |
JavaScript Execution | Full support | No execution, static content | Full support | Limited |
CSS Support | Full | Basic | Full | Basic |
PDF Options | Size, format, margins | Compression, format, size | Size, format, margins | Size, format, margins |
Performance | Good (can be resource-heavy) | Fast | Good (can be resource-heavy) | Moderate |
Use Case | Perfect for complex webpages | Simple documents, charts | Advanced web pages, testing | Simple HTML pages |
Community and Support | Strong | Very Strong | Growing | Declining (PhantomJS is deprecated) |
In conclusion:
- Puppeteer is highly versatile and is ideal for projects that require full web features, including JavaScript execution, complex layouts, and CSS styles. It has the best support for modern web technologies but can be slower and has a heavier dependency (Chromium).
- jsPDF is excellent for creating basic to moderately complex PDFs directly through JavaScript. It’s fast and lightweight, but its ability to convert HTML and CSS to PDF is limited compared to the others. Ideal for generating PDFs that don’t require HTML/CSS rendering.
- Playwright is similar to Puppeteer. It is an excellent choice for applications requiring high-quality rendering of complex web pages, including those with heavy JavaScript or modern CSS. Both libraries leverage modern browsers to provide a high level of detail and accuracy in rendering.
- html-pdf is quicker for simpler tasks and can still convert HTML to PDF. However, it relies on the deprecated PhantomJS engine, so it might not fully support all modern web features. Its use is generally advised for simpler tasks or legacy projects.
In short, for developers requiring high fidelity and complex interactions, Puppeteer or Playwright would be the best choices. Those needing quick, simple PDF generation, particularly on the client-side, might find jsPDF more appealing.
jsPDF shines in generating simpler PDFs directly from scripts and can even run client-side, making it ideal for applications with less complex requirements.
Although html-pdf offers a straightforward approach to converting basic HTML to PDF, its reliance on the now-deprecated PhantomJS might pose long-term sustainability issues.
3. Create PDF from HTML using APITemplate.io
Above are some examples of how we can use libraries to convert HTML to PDF and web pages to PDF in Node.js. However, when it comes to generating PDFs using templates or keeping track of generated PDFs, we need to do a lot of extra work to handle everything.
We need to have our own PDF generator tracker to keep track of the files generated. If we want to use custom templates such as invoice generators, we also need to create and manage those templates.
APITemplate.io is an API-based PDF generation platform that offers the perfect solution for all of the above use cases. In addition, APITemplate.io’s PDF generation API uses a Chromium-based rendering engine that supports JavaScript, CSS, and HTML.
Let’s see how we can use APITemplate.io to generate PDFs.
i. Generate Template-based PDF
APITemplate.io allows you to manage your templates. Go to Manage Templates from the dashboard.
From Manage Template, you can create your own templates. The following is a sample invoice template. There are many templates available that you can choose from and customize based on your requirements.
To start using APITemplate.io APIs, you need to obtain your API Key, which can be found in the API Integration tab.
Now that you have your APITemplate account ready, let’s take some action and integrate it with our application. We will use the template to generate PDFs.
const axios = require('axios');
// Initialize HTTP client
const client = axios.create();
// API URL
const url = "https://rest.apitemplate.io/v2/create-pdf?template_id=YOUR_TEMPLATE_ID";
// Payload data
const payload = {
date: "15/05/2022",
invoice_no: "435568799",
sender_address1: "3244 Jurong Drive",
sender_address2: "Falmouth Maine 1703",
sender_phone: "255-781-6789",
sender_email: "[email protected]",
rece_addess1: "2354 Lakeside Drive",
rece_addess2: "New York 234562",
rece_phone: "34333-84-223",
rece_email: "[email protected]",
items: [
{item_name: "Oil", unit: 1, unit_price: 100, total: 100},
{item_name: "Rice", unit: 2, unit_price: 200, total: 400},
{item_name: "Mangoes", unit: 3, unit_price: 300, total: 900},
{item_name: "Cloth", unit: 4, unit_price: 400, total: 1600},
{item_name: "Orange", unit: 7, unit_price: 20, total: 140},
{item_name: "Mobiles", unit: 1, unit_price: 500, total: 500},
{item_name: "Bags", unit: 9, unit_price: 60, total: 540},
{item_name: "Shoes", unit: 2, unit_price: 30, total: 60},
],
total: "total",
footer_email: "[email protected]",
};
// Set headers
const headers = {
"X-API-KEY": "YOUR_API_KEY",
"Content-Type": "application/json",
};
// Make the POST request
client.post(url, payload, { headers })
.then(response => {
// Read the response
const responseString = JSON.stringify(response.data);
// Print the response
console.log(responseString);
})
.catch(error => {
console.error('Error:', error);
});
And if we check responseString, we have the following:
{
"download_url":"PDF_URL",
"transaction_ref":"8cd2aced-b2a2-40fb-bd45-392c777d6f6",
"status":"success",
"template_id":"YOUR_TEMPLATE_ID"
}
In the above code, it’s very easy to use APITemplate.io to convert HTML to PDF because we don’t need to install any PDF library. We just need to call one simple API with our data as the request body, and that’s it!
You can use the download_url from the response to download or distribute the generated PDF.
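To save the generated file locally, the download_url can be fetched like any other URL. The sketch below assumes the response has the shape shown above (a status of "success" plus a download_url); how errors are reported is not shown in this article, so the failure check is an assumption. extractDownloadUrl and downloadPdf are hypothetical helper names.

```javascript
// Pull the download URL out of a create-pdf response, failing loudly otherwise.
// Assumes the response shape shown in the sample JSON above.
function extractDownloadUrl(responseData) {
  if (!responseData || responseData.status !== 'success' || !responseData.download_url) {
    throw new Error(`PDF generation failed: ${JSON.stringify(responseData)}`);
  }
  return responseData.download_url;
}

async function downloadPdf(responseData, outputPath) {
  // Required lazily so extractDownloadUrl can be used without axios installed.
  const axios = require('axios');
  const url = extractDownloadUrl(responseData);
  // Ask axios for raw bytes instead of a parsed body.
  const file = await axios.get(url, { responseType: 'arraybuffer' });
  await require('fs').promises.writeFile(outputPath, Buffer.from(file.data));
  return outputPath;
}
```

In the earlier example, this would be called as downloadPdf(response.data, 'invoice.pdf') inside the .then() callback.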
ii. Generate PDF from website URL
APITemplate.io also supports generating PDFs from website URLs.
const axios = require('axios');
async function main() {
const api_key = "YOUR_API_KEY";
const data = {
url: "https://en.wikipedia.org/wiki/Sceloporus_malachiticus",
settings: {
paper_size: "A4",
orientation: "1",
header_font_size: "9px",
margin_top: "40",
margin_right: "10",
margin_bottom: "40",
margin_left: "10",
print_background: "1",
displayHeaderFooter: true,
custom_header: `<style>#header, #footer { padding: 0 !important; }</style>
<table style="width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px">
<tr>
<td style="text-align:left; width:30%!important;"><span class="date"></span></td>
<td style="text-align:center; width:30%!important;"><span class="pageNumber"></span></td>
<td style="text-align:right; width:30%!important;"><span class="totalPages"></span></td>
</tr>
</table>`,
custom_footer: `<style>#header, #footer { padding: 0 !important; }</style>
<table style="width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px">
<tr>
<td style="text-align:left; width:30%!important;"><span class="date"></span></td>
<td style="text-align:center; width:30%!important;"><span class="pageNumber"></span></td>
<td style="text-align:right; width:30%!important;"><span class="totalPages"></span></td>
</tr>
</table>`
}
};
try {
const response = await axios.post(
"https://rest.apitemplate.io/v2/create-pdf-from-url",
data,
{
headers: {
"X-API-KEY": api_key
}
}
);
console.log('PDF generated successfully:', response.data);
} catch (error) {
console.error('Error:', error);
}
}
main();
In the above code, we can provide the URL in the request body along with the settings for the PDF. APITemplate will use this request body to generate a PDF and return a download URL for your PDF.
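The settings object is fairly long, and the header and footer markup is identical. A small helper can keep the defaults in one place, as in the sketch below. It uses only the setting names that appear in the request above; buildPdfSettings and PAGE_BANNER are names invented for this example.

```javascript
// Shared header/footer markup: date on the left, page number in the middle,
// total page count on the right (copied from the request above).
const PAGE_BANNER = `<style>#header, #footer { padding: 0 !important; }</style>
<table style="width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px">
<tr>
<td style="text-align:left; width:30%!important;"><span class="date"></span></td>
<td style="text-align:center; width:30%!important;"><span class="pageNumber"></span></td>
<td style="text-align:right; width:30%!important;"><span class="totalPages"></span></td>
</tr>
</table>`;

// Return the default settings, with any caller overrides applied on top.
function buildPdfSettings(overrides = {}) {
  return {
    paper_size: 'A4',
    orientation: '1',
    header_font_size: '9px',
    margin_top: '40',
    margin_right: '10',
    margin_bottom: '40',
    margin_left: '10',
    print_background: '1',
    displayHeaderFooter: true,
    custom_header: PAGE_BANNER,
    custom_footer: PAGE_BANNER,
    ...overrides,
  };
}
```

The request body then shrinks to { url: '...', settings: buildPdfSettings({ margin_top: '20' }) }.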
iii. Generate PDF from custom HTML content
If you want to generate PDFs using your own custom HTML content, APITemplate.io also supports that.
const axios = require('axios');
async function main() {
const api_key = "YOUR_API_KEY";
const data = {
body: "<h1> hello world {{name}} </h1>",
css: "<style>.bg{background: red};</style>",
data: {
name: "This is a title"
},
settings: {
paper_size: "A4",
orientation: "1",
header_font_size: "9px",
margin_top: "40",
margin_right: "10",
margin_bottom: "40",
margin_left: "10",
print_background: "1",
displayHeaderFooter: true,
custom_header: `<style>#header, #footer { padding: 0 !important; }</style>
<table style="width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px">
<tr>
<td style="text-align:left; width:30%!important;"><span class="date"></span></td>
<td style="text-align:center; width:30%!important;"><span class="pageNumber"></span></td>
<td style="text-align:right; width:30%!important;"><span class="totalPages"></span></td>
</tr>
</table>`,
custom_footer: `<style>#header, #footer { padding: 0 !important; }</style>
<table style="width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px">
<tr>
<td style="text-align:left; width:30%!important;"><span class="date"></span></td>
<td style="text-align:center; width:30%!important;"><span class="pageNumber"></span></td>
<td style="text-align:right; width:30%!important;"><span class="totalPages"></span></td>
</tr>
</table>`
}
};
try {
const response = await axios.post(
"https://rest.apitemplate.io/v2/create-pdf-from-html",
data,
{
headers: {
"X-API-KEY": api_key
}
}
);
console.log('PDF generated successfully:', response.data);
} catch (error) {
console.error('Error:', error);
}
}
main();
Similar to generating a PDF from a website URL, the API request above takes the body and CSS as part of the payload to generate a PDF.
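The {{name}} placeholder in the body is filled from the data object on the server. For a quick local preview of simple placeholders, something like the following sketch can help. It is only an approximation written for this article: the real template engine runs on APITemplate.io's side and may support more syntax than plain {{key}} substitution, and renderPlaceholders is a hypothetical name.

```javascript
// Replace {{key}} placeholders with values from `data`.
// Unknown keys are left untouched so missing data is easy to spot.
function renderPlaceholders(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}
```

For example, rendering the body from the request above with { name: "This is a title" } gives a heading that reads "hello world This is a title".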
4. Performance Considerations
Open-source third-party libraries work fine in most cases. However, when it comes to generating PDFs from HTML at scale, you need to handle all the scaling and edge cases yourself.
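As a rough illustration of what handling scale yourself can involve, the sketch below reuses a single Puppeteer browser for many documents and caps how many pages render at the same time. mapWithConcurrency and renderMany are hypothetical helpers, the default limit of 3 is an arbitrary assumption, and limit is expected to be at least 1.

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight.
// Results are returned in the same order as the input.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const index = next++; // claim the next unprocessed item
      results[index] = await worker(items[index], index);
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Render many HTML documents with one shared browser instance.
async function renderMany(htmlDocs, limit = 3) {
  // Required lazily so mapWithConcurrency can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    return await mapWithConcurrency(htmlDocs, limit, async (html) => {
      const page = await browser.newPage();
      try {
        await page.setContent(html);
        return await page.pdf({ format: 'A4' }); // PDF bytes, kept in memory
      } finally {
        await page.close(); // release the tab even if rendering fails
      }
    });
  } finally {
    await browser.close();
  }
}
```

Launching a browser is the expensive step, so sharing one instance and limiting open pages keeps memory use predictable. Retries, timeouts, and queueing across processes are still left to the application.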
By using APITemplate.io, you don’t need to worry about performance or scaling issues because the platform handles them for you. It also supports JavaScript and CSS.
5. Conclusion
PDF generation is now a part of many business applications. We have seen how we can use third-party libraries to generate PDFs if our use case is simple.
However, if we have complex use cases, such as maintaining templates, APITemplate.io provides a solution just for that using simple API calls.
Sign up for a free account with us now and start automating your PDF generation or click here to learn more about our PDF generation.