How to convert HTML to PDF using C#

Most real-world applications face the need to generate PDFs from various content sources. This task often involves creating PDFs from custom HTML or directly from website URLs.

Generating PDFs used to require a lot of hand-written code and time. Today, many mature libraries and tools can do the job in a few lines of code.

In this article, we will look at several approaches to generating PDFs from HTML using C#.

Why use HTML for PDF conversion

i. Open and Mature Technology: HTML is an open standard, ensuring that the tools and technologies developed around it are widely available and well-understood. Its maturity signifies that most challenges and peculiarities have been thoroughly documented, making troubleshooting easier.

ii. Cost-effective: A wide range of tools, libraries, and APIs (both free and paid) are available to convert HTML to PDF. This diminishes the need for specialized PDF creation software.

iii. Embed Multimedia: HTML facilitates embedding multimedia elements like images, video, and audio. Although not all of these elements can be directly converted to PDF, using HTML as a source offers opportunities to create documents enriched with multimedia.

iv. Styling with CSS: Cascading Style Sheets (CSS) offer robust styling capabilities for HTML content. This enables effective branding, theming, and visual consistency, which are then mirrored in the resultant PDF.

v. Easy to Learn and Use: Learning the basics of HTML is straightforward, making it accessible for many users to generate content.

In summary, converting HTML to PDF combines the best aspects of both formats: the flexibility, accessibility, and interactivity of HTML with the portability and standardization of PDFs.

Converting HTML to PDF using C# Libraries

PuppeteerSharp

PuppeteerSharp is a .NET port of Puppeteer that provides a high-level API to control headless browsers. PuppeteerSharp is used to scrape web content, automate testing, generate PDFs, or take screenshots of websites. With PuppeteerSharp, you can easily convert HTML to PDF or website to PDF.

Generate PDF from a website URL

using System.Threading.Tasks;
using PuppeteerSharp;

class Program
{
    public static async Task Main()
    {
        await new BrowserFetcher().DownloadAsync();
        using var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
        var page = await browser.NewPageAsync();
        await page.GoToAsync("http://www.google.com");
        await page.PdfAsync("website.pdf");
    }
}

In the above code, we are doing the following:

  • The BrowserFetcher downloads the required version of Chromium.
  • Puppeteer.LaunchAsync starts a headless browser instance.
  • NewPageAsync creates a new tab.
  • We use GoToAsync to navigate to our website URL.
  • Finally, PdfAsync generates the PDF from the website’s current state.
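
PdfAsync also accepts a PdfOptions argument when the defaults are not enough. The sketch below shows one way to wait for network activity to settle before printing and to set paper size, margins, and background printing. The PdfSettings class, the URL, the margin values, and the output file name are illustrative assumptions, not recommendations.

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;
using PuppeteerSharp.Media;

public static class PdfSettings
{
    // Illustrative settings only: A4 paper, CSS backgrounds, 20 mm margins.
    public static PdfOptions Build() => new PdfOptions
    {
        Format = PaperFormat.A4,
        PrintBackground = true,
        MarginOptions = new MarginOptions
        {
            Top = "20mm",
            Bottom = "20mm",
            Left = "20mm",
            Right = "20mm"
        }
    };
}

public class Program
{
    public static async Task Main()
    {
        await new BrowserFetcher().DownloadAsync();
        using var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
        using var page = await browser.NewPageAsync();

        // Wait until the network has gone idle so that late-loading
        // styles and images are present before the page is printed.
        await page.GoToAsync("https://example.com", WaitUntilNavigation.Networkidle0);
        await page.PdfAsync("website-a4.pdf", PdfSettings.Build());
    }
}
```

Keeping the options in one place makes it easier to reuse the same page setup for every PDF the application produces.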

Generate PDF from Custom HTML content

using System.Threading.Tasks;
using PuppeteerSharp;

class Program
{
    public static async Task Main()
    {
        await new BrowserFetcher().DownloadAsync();
        using var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
        using var page = await browser.NewPageAsync();
        await page.SetContentAsync("<div>My Custom Content</div>");
        await page.PdfAsync("customContent.pdf");
    }
}

In the above code, we are generating PDFs from custom HTML content:

  1. BrowserFetcher downloads a new instance of Chromium.
  2. We then launch this instance as a headless browser.
  3. A new page is created in our headless browser with NewPageAsync, where we set our custom HTML content.
  4. Finally, PdfAsync is used to generate the PDF from the custom HTML content set on the page.
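
In practice the HTML passed to SetContentAsync is usually built from application data rather than written as a literal. The following is a minimal sketch of that step using only the .NET base library; the ReportHtml class and its Build method are hypothetical names. Values are HTML-encoded so that characters such as < or & appear as text in the PDF instead of being parsed as markup.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class ReportHtml
{
    // Builds a small HTML document from a title and a list of rows.
    // Every value is HTML-encoded before it is inserted into the markup.
    public static string Build(string title, IEnumerable<string> rows)
    {
        var html = new StringBuilder();
        html.Append("<html><body>");
        html.Append("<h1>").Append(WebUtility.HtmlEncode(title)).Append("</h1>");
        html.Append("<ul>");
        foreach (var row in rows)
        {
            html.Append("<li>").Append(WebUtility.HtmlEncode(row)).Append("</li>");
        }
        html.Append("</ul></body></html>");
        return html.ToString();
    }
}

public class Program
{
    public static void Main()
    {
        string html = ReportHtml.Build("Q1 <Draft>", new[] { "Oil & Gas", "Rice" });

        // The resulting string can be handed to page.SetContentAsync(html)
        // in place of the literal used in the example above.
        Console.WriteLine(html);
    }
}
```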

HtmlRenderer.PdfSharp

HtmlRenderer.PdfSharp is a C# library used to generate PDFs. This library enables the creation of PDF documents from HTML snippets using static rendering code.

While HtmlRenderer.PdfSharp doesn’t support generating PDFs directly from website URLs by default, we can first extract the website content and then utilize the library for PDF generation.

Convert Website to PDF

using System;
using System.Net;
using PdfSharp;
using PdfSharp.Pdf; // PdfDocument lives in this namespace
using TheArtOfDev.HtmlRenderer.PdfSharp;

class Program
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            string htmlCode = client.DownloadString("http://example.com");
            PdfDocument pdf = PdfGenerator.GeneratePdf(htmlCode, PageSize.A4);
            pdf.Save("url_to_pdf.pdf");
        }
    }
}

In the above code, we are performing the following steps to generate a PDF:

  1. Since this library does not support direct URL content fetching, we use WebClient to get the content from the website URL.

  2. We extract the website content from the URL using client.DownloadString.

  3. Using PdfGenerator.GeneratePdf, we generate the PDF with the extracted content.

  4. Finally, we save the generated PDF to disk with pdf.Save.
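
WebClient is marked obsolete in .NET 6 and later, where HttpClient is the recommended replacement. A minimal sketch of the download step with HttpClient follows; the HtmlFetcher class and its method name are hypothetical. The returned string can be passed to PdfGenerator.GeneratePdf exactly as in the example above.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HtmlFetcher
{
    // Downloads the page at the given absolute URL and returns its HTML.
    // Throws UriFormatException for strings that are not absolute URIs and
    // HttpRequestException for network errors or non-success status codes.
    public static async Task<string> FetchHtmlAsync(HttpClient http, string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        using var response = await http.GetAsync(uri);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

public class Program
{
    public static async Task Main()
    {
        using var http = new HttpClient();
        try
        {
            string html = await HtmlFetcher.FetchHtmlAsync(http, "http://example.com");
            Console.WriteLine($"Downloaded {html.Length} characters of HTML.");
        }
        catch (HttpRequestException e)
        {
            Console.Error.WriteLine($"Download failed: {e.Message}");
        }
    }
}
```

In a long-running application, a single shared HttpClient instance is generally preferable to creating one per request.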

Generate PDF from Custom HTML content

using System;
using PdfSharp;
using PdfSharp.Pdf; // PdfDocument lives in this namespace
using TheArtOfDev.HtmlRenderer.PdfSharp;

namespace html_to_pdf
{
    class Program
    {
        static void Main(string[] args)
        {
            string htmlString = "<h1>Document</h1> <p>This is an HTML document which is converted to a pdf file.</p>";
            PdfDocument pdfDocument = PdfGenerator.GeneratePdf(htmlString, PageSize.A4);
            pdfDocument.Save("html_to_pdf.pdf");
        }
    }
}

Generating a PDF from custom HTML follows the same approach. The only difference is that we pass our own HTML string to PdfGenerator.GeneratePdf instead of content downloaded from a URL.

iTextSharp

iTextSharp is a widely-used .NET library for creating and manipulating PDF documents. It allows developers to generate PDFs from different sources, including HTML content.

However, iTextSharp does not automatically fetch HTML content from URLs. Its main function is to convert provided HTML content into PDFs.

Convert Website to PDF

using System;
using System.IO;
using System.Net;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;

class Program
{
    static void Main()
    {
        string htmlContent;
        using (var client = new WebClient())
        {
            htmlContent = client.DownloadString("http://example.com");
        }

        using (var memoryStream = new MemoryStream())
        {
            using (var document = new Document())
            {
                PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
                document.Open();
                using (var stringReader = new StringReader(htmlContent))
                {
                    HTMLWorker htmlParser = new HTMLWorker(document);
                    htmlParser.Parse(stringReader);
                }
                document.Close();
            }

            // Save PDF to file
            File.WriteAllBytes("website.pdf", memoryStream.ToArray());
        }
    }
}

Because iTextSharp cannot fetch a URL itself, the example works in three steps:

  1. First, we download the website content using WebClient.

  2. We then use HTMLWorker to parse that HTML into a Document, whose output PdfWriter writes to a MemoryStream.

  3. Finally, we write the bytes of the generated PDF to a file with File.WriteAllBytes.
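
HTMLWorker is marked obsolete in iTextSharp 5. Its successor is XMLWorker, which is distributed as the separate itextsharp.xmlworker NuGet package. The sketch below assumes that package is installed and that the input is well-formed XHTML with every tag closed, which XMLWorker expects. The HtmlToPdf class and its Render method are hypothetical names.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdf
{
    // Renders well-formed XHTML into an in-memory PDF and returns its bytes.
    public static byte[] Render(string xhtml)
    {
        using (var memoryStream = new MemoryStream())
        {
            using (var document = new Document())
            {
                PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
                document.Open();
                using (var reader = new StringReader(xhtml))
                {
                    // XMLWorker parses the XHTML and adds the elements to the document.
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
                }
                // Closing the document finalizes the PDF in the stream.
                document.Close();
            }
            return memoryStream.ToArray();
        }
    }
}

public class Program
{
    public static void Main()
    {
        byte[] pdf = HtmlToPdf.Render(
            "<html><body><h1>Hello World</h1><p>This is a test HTML string.</p></body></html>");
        File.WriteAllBytes("xmlworker.pdf", pdf);
    }
}
```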

Generate PDF from Custom HTML content

using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;

class Program
{
    static void Main()
    {
        string htmlContent = "<h1>Hello World</h1><p>This is a test HTML string.</p>";
        using (var memoryStream = new MemoryStream())
        {
            using (var document = new Document())
            {
                PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
                document.Open();
                using (var stringReader = new StringReader(htmlContent))
                {
                    HTMLWorker htmlParser = new HTMLWorker(document);
                    htmlParser.Parse(stringReader);
                }
                document.Close();
            }

            File.WriteAllBytes("customHtmlContent.pdf", memoryStream.ToArray());
        }
    }
}

Generating PDFs from custom HTML is similar to generating them from website content. In this case, we directly obtain the HTML content without using WebClient. We then follow similar steps to create the PDFs.

Comparison of PuppeteerSharp, HtmlRenderer.PdfSharp, and iTextSharp

Generating PDF documents is a common requirement for many .NET applications. This section compares three popular C# libraries for PDF generation: PuppeteerSharp, HtmlRenderer.PdfSharp, and iTextSharp.

Each of these libraries offers unique features and capabilities, making them suitable for different scenarios. We will evaluate them based on various criteria to help developers make an informed choice for their specific needs.

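| Feature / Library                 | PuppeteerSharp       | HtmlRenderer.PdfSharp | iTextSharp                  |
|-----------------------------------|----------------------|-----------------------|-----------------------------|
| Rendering Engine                  | Headless Chrome      | PDFSharp Engine       | iText Engine                |
| HTML to PDF Conversion            | Yes                  | Yes                   | Yes                         |
| CSS Support                       | Full (Web standards) | Limited               | Extensive                   |
| JavaScript Support                | Yes                  | No                    | Limited                     |
| Editing Existing PDFs             | No                   | No                    | Yes                         |
| Form Filling & Manipulation       | No                   | No                    | Yes                         |
| License                           | Apache 2.0           | MIT                   | AGPL (Commercial available) |
| Ease of Use                       | Moderate             | Easy                  | Moderate                    |
| Performance                       | High                 | Moderate              | High                        |
| Documentation & Community Support | Good                 | Good                  | Excellent                   |
| Custom Fonts Support              | Yes                  | Yes                   | Yes                         |
| Image Embedding                   | Yes                  | Yes                   | Yes                         |
| Platform Compatibility            | Cross-platform       | .NET dependent        | Cross-platform              |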

The choice of a PDF generation library in C# largely depends on the specific needs of the project. PuppeteerSharp is ideal for accurate rendering of web content, HtmlRenderer.PdfSharp for simple PDF generation tasks, and iTextSharp for more complex PDF manipulation and creation requirements.

Developers should consider factors like ease of use, performance, and feature set when selecting a library for their application.

Converting HTML to PDF using APITemplate.io

The examples above demonstrate how we can use various libraries to convert HTML and web pages to PDFs in C#. However, the process becomes more complex when generating PDFs using templates or monitoring generated PDFs. We need a system to track the PDFs we create.

Additionally, if we wish to use specific templates, like those for generating invoices, we also have to develop and maintain these templates ourselves. This extra work involves creating our own PDF generator tracker and managing template designs.

APITemplate.io is an API-based PDF generation platform that offers the perfect solution for all of the above use cases.

Let’s see how we can use APITemplate.io to generate PDFs.

Generate Template Based PDF

APITemplate.io allows you to manage your templates. Go to “Manage Templates” from the dashboard.

From Manage Template, you can create your own templates. The following is a sample invoice template. There are many templates available that you can choose from and customize based on your requirements.

To start using the APITemplate.io APIs, you need your API key, which is available on the API Integration tab.

Now that you have your APITemplate account ready, let’s take some action and integrate it with our application. We will use the template to generate PDFs.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Newtonsoft.Json;

class Program
{
    static async Task Main(string[] args)
    {
        // API URL
        string url = "https://rest.apitemplate.io/v2/create-pdf?template_id=YOUR_TEMPLATE_ID";

        // Payload data
        var payload = new
        {
            date = "15/05/2022",
            invoice_no = "435568799",
            sender_address1 = "3244 Jurong Drive",
            sender_address2 = "Falmouth Maine 1703",
            sender_phone = "255-781-6789",
            sender_email = "[email protected]",
            rece_addess1 = "2354 Lakeside Drive",
            rece_addess2 = "New York 234562",
            rece_phone = "34333-84-223",
            rece_email = "[email protected]",
            items = new[]
            {
                new { item_name = "Oil", unit = 1, unit_price = 100, total = 100 },
                new { item_name = "Rice", unit = 2, unit_price = 200, total = 400 },
                // ... Add other items here
            },
            total = "total",
            footer_email = "[email protected]",
        };

        string jsonPayload = JsonConvert.SerializeObject(payload);

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("X-API-KEY", "YOUR_API_KEY");
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

            try
            {
                var content = new StringContent(jsonPayload, System.Text.Encoding.UTF8, "application/json");
                var response = await client.PostAsync(url, content);

                string responseString = await response.Content.ReadAsStringAsync();

                Console.WriteLine(responseString);
            }
            catch (HttpRequestException e)
            {
                Console.Error.WriteLine($"Error: {e.Message}");
            }
        }
    }
}

The responseString printed above looks like this:

{
    "download_url":"PDF_URL",
    "transaction_ref":"8cd2aced-b2a2-40fb-bd45-392c777d6f6",
    "status":"success",
    "template_id":"YOUR_TEMPLATE_ID"
}

As the code above shows, converting with APITemplate.io requires no PDF rendering library on our side. We call a single API endpoint with our data as the JSON request body, and that’s it!

You can use the download_url from the response to download or distribute the generated PDF.
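
A minimal sketch of that last step follows. It assumes the response has the shape of the sample shown above (a status field and a download_url field); the PdfResponse class and its method names are hypothetical. It uses System.Text.Json instead of Newtonsoft.Json only to keep the sketch free of external packages.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class PdfResponse
{
    // Reads download_url from the API response. Returns false when the
    // status is not "success" or the field is missing.
    public static bool TryGetDownloadUrl(string responseJson, out string downloadUrl)
    {
        downloadUrl = "";
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        JsonElement root = doc.RootElement;

        if (!root.TryGetProperty("status", out JsonElement status)
            || status.ValueKind != JsonValueKind.String
            || status.GetString() != "success")
        {
            return false;
        }

        if (!root.TryGetProperty("download_url", out JsonElement url)
            || url.ValueKind != JsonValueKind.String)
        {
            return false;
        }

        downloadUrl = url.GetString() ?? "";
        return downloadUrl.Length > 0;
    }

    // Downloads the generated PDF and writes it to the given path.
    public static async Task SaveAsync(HttpClient http, string downloadUrl, string path)
    {
        byte[] pdf = await http.GetByteArrayAsync(downloadUrl);
        await File.WriteAllBytesAsync(path, pdf);
    }
}

public class Program
{
    public static void Main()
    {
        // Sample response from the article; PDF_URL is a placeholder.
        string responseString = "{\"download_url\":\"PDF_URL\",\"status\":\"success\"}";

        if (PdfResponse.TryGetDownloadUrl(responseString, out string url))
        {
            Console.WriteLine($"PDF available at: {url}");
            // With a real URL, the file could then be saved locally:
            //   using var http = new HttpClient();
            //   await PdfResponse.SaveAsync(http, url, "invoice.pdf");
        }
        else
        {
            Console.Error.WriteLine("The response did not contain a download URL.");
        }
    }
}
```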

Convert Website to PDF using APITemplate.io

APITemplate.io also supports generating PDFs from website URLs.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Newtonsoft.Json;

class Program
{
    static async Task Main()
    {
        const string apiKey = "YOUR_API_KEY";

        var data = new
        {
            url = "https://en.wikipedia.org/wiki/Sceloporus_malachiticus",
            settings = new
            {
                paper_size = "A4",
                orientation = "1",
                header_font_size = "9px",
                margin_top = "40",
                margin_right = "10",
                margin_bottom = "40",
                margin_left = "10",
                print_background = "1",
                displayHeaderFooter = true,
                custom_header = @"<style>#header, #footer { padding: 0 !important; }</style>
                    <table style='width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px'>
                        <tr>
                            <td style='text-align:left; width:30%!important;'><span class='date'></span></td>
                            <td style='text-align:center; width:30%!important;'><span class='pageNumber'></span></td>
                            <td style='text-align:right; width:30%!important;'><span class='totalPages'></span></td>
                        </tr>
                    </table>",
                custom_footer = @"<style>#header, #footer { padding: 0 !important; }</style>
                    <table style='width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px'>
                        <tr>
                            <td style='text-align:left; width:30%!important;'><span class='date'></span></td>
                            <td style='text-align:center; width:30%!important;'><span class='pageNumber'></span></td>
                            <td style='text-align:right; width:30%!important;'><span class='totalPages'></span></td>
                        </tr>
                    </table>"
            }
        };

        var json = JsonConvert.SerializeObject(data);

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("X-API-KEY", apiKey);

            try
            {
                var content = new StringContent(json, System.Text.Encoding.UTF8, "application/json");
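                var response = await client.PostAsync("https://rest.apitemplate.io/v2/create-pdf-from-url", content);
                var responseString = await response.Content.ReadAsStringAsync();

                // Only report success when the API returned a 2xx status code.
                if (response.IsSuccessStatusCode)
                    Console.WriteLine("PDF generated successfully: " + responseString);
                else
                    Console.Error.WriteLine("Request failed (" + (int)response.StatusCode + "): " + responseString);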
            }
            catch (HttpRequestException e)
            {
                Console.Error.WriteLine("Error: " + e.Message);
            }
        }
    }
}

In the above code, we can provide the URL in the request body along with the settings for the PDF. APITemplate will use this request body to generate a PDF and return a download URL for your PDF.

Generate PDF from custom HTML content

If you want to generate PDFs from your own custom HTML content, APITemplate.io supports that as well.

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using Newtonsoft.Json;

class Program
{
    static async Task Main()
    {
        const string apiKey = "YOUR_API_KEY";

        var data = new
        {
            body = "<h1> hello world {{name}} </h1>",
            css = "<style>.bg{background: red};</style>",
            data = new
            {
                name = "This is a title"
            },
            settings = new
            {
                paper_size = "A4",
                orientation = "1",
                header_font_size = "9px",
                margin_top = "40",
                margin_right = "10",
                margin_bottom = "40",
                margin_left = "10",
                print_background = "1",
                displayHeaderFooter = true,
                custom_header = @"<style>#header, #footer { padding: 0 !important; }</style>
                    <table style='width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px'>
                        <tr>
                            <td style='text-align:left; width:30%!important;'><span class='date'></span></td>
                            <td style='text-align:center; width:30%!important;'><span class='pageNumber'></span></td>
                            <td style='text-align:right; width:30%!important;'><span class='totalPages'></span></td>
                        </tr>
                    </table>",
                custom_footer = @"<style>#header, #footer { padding: 0 !important; }</style>
                    <table style='width: 100%; padding: 0px 5px;margin: 0px!important;font-size: 15px'>
                        <tr>
                            <td style='text-align:left; width:30%!important;'><span class='date'></span></td>
                            <td style='text-align:center; width:30%!important;'><span class='pageNumber'></span></td>
                            <td style='text-align:right; width:30%!important;'><span class='totalPages'></span></td>
                        </tr>
                    </table>"
            }
        };

        var json = JsonConvert.SerializeObject(data);

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("X-API-KEY", apiKey);

            try
            {
                var content = new StringContent(json, System.Text.Encoding.UTF8, "application/json");
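                var response = await client.PostAsync("https://rest.apitemplate.io/v2/create-pdf-from-html", content);
                var responseString = await response.Content.ReadAsStringAsync();

                // Only report success when the API returned a 2xx status code.
                if (response.IsSuccessStatusCode)
                    Console.WriteLine("PDF generated successfully: " + responseString);
                else
                    Console.Error.WriteLine("Request failed (" + (int)response.StatusCode + "): " + responseString);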
            }
            catch (HttpRequestException e)
            {
                Console.Error.WriteLine("Error: " + e.Message);
            }
        }
    }
}

Similar to generating a PDF from a website URL, the API request above takes the body and CSS as part of the payload to generate a PDF.

Performance Considerations

Open source third-party libraries are generally effective for most needs. Yet, for generating PDFs from HTML on a large scale, managing scaling and various edge cases becomes your responsibility.
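
One example of that responsibility is limiting how many conversions run at once, since each headless browser page or rendering job consumes memory and CPU. The sketch below shows one common way to cap concurrency with SemaphoreSlim. The ThrottledConverter class is a hypothetical name, and the convertAsync delegate stands in for whichever library call performs the actual conversion; here it is replaced by a fake that only waits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledConverter
{
    // Runs convertAsync for every document, but never more than
    // maxConcurrency conversions at the same time. Results keep input order.
    public static async Task<string[]> ConvertAllAsync(
        IEnumerable<string> htmlDocuments,
        Func<string, Task<string>> convertAsync,
        int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        List<Task<string>> tasks = htmlDocuments.Select(async html =>
        {
            await gate.WaitAsync();
            try
            {
                return await convertAsync(html);
            }
            finally
            {
                gate.Release();
            }
        }).ToList();

        return await Task.WhenAll(tasks);
    }
}

public class Program
{
    public static async Task Main()
    {
        // Stand-in for a real converter: waits briefly and returns a fake file name.
        async Task<string> FakeConvert(string html)
        {
            await Task.Delay(50);
            return $"doc-{html.Length}.pdf";
        }

        string[] files = await ThrottledConverter.ConvertAllAsync(
            new[] { "<p>a</p>", "<p>bb</p>", "<p>ccc</p>" }, FakeConvert, maxConcurrency: 2);

        Console.WriteLine(string.Join(", ", files));
    }
}
```

A suitable limit depends on the library and the hardware, so it is best found by measuring rather than guessing.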

With APITemplate.io, these concerns about performance and scaling are taken care of for you. APITemplate.io also handles error situations, providing a more streamlined and reliable solution for large-scale PDF generation.

Conclusion

PDF generation has become an essential feature in modern business applications. We’ve explored using third-party libraries for simple PDF generation tasks.

However, for more complex scenarios, like managing templates, APITemplate.io offers a tailored solution through straightforward API calls. This approach simplifies handling intricate use cases, providing a user-friendly alternative for complex PDF generation requirements.

Sign up for a free account with us now and start automating your PDF generation.
