Generating PDFs has become a crucial feature for many modern web applications, from creating invoices to producing complex data-driven reports. Over the years, various HTML to PDF tools have emerged, each with its own approach and capabilities. In this post, we'll take a brief look at the evolution of PDF generation solutions, from wkhtmltopdf and PrinceXML (via DocRaptor) to Puppeteer and Playwright, and then walk through a short tutorial on using Playwright (Node.js) to generate a sample PDF in 2025.
A Brief History of PDF Generation Tools
wkhtmltopdf
wkhtmltopdf is a command-line tool that uses the WebKit rendering engine (the foundation of early Safari) to convert HTML and CSS into PDF. Historically, it was one of the most popular open-source solutions for server-side PDF generation.
➡️ How it works:
- Renders a static HTML document (with CSS) in a WebKit-based environment.
- Outputs a PDF file that closely (but not always perfectly) matches the layout seen in a WebKit browser.
- Integrates easily into scripts or back-end services via CLI.
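As a rough illustration of the CLI integration described above, the sketch below shells out to wkhtmltopdf from Node.js. It assumes the wkhtmltopdf binary is installed and available on the PATH; the helper names (buildWkhtmltopdfArgs, convertWithWkhtmltopdf) and the file names are hypothetical.

```javascript
const { execFile } = require('node:child_process');

// Build the argument list: options first, then the input and output paths.
function buildWkhtmltopdfArgs(inputPath, outputPath, pageSize = 'A4') {
  return ['--page-size', pageSize, inputPath, outputPath];
}

// Run the wkhtmltopdf binary (assumed to be on the PATH) and resolve with the output path.
function convertWithWkhtmltopdf(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    execFile('wkhtmltopdf', buildWkhtmltopdfArgs(inputPath, outputPath), (error) => {
      if (error) reject(error);
      else resolve(outputPath);
    });
  });
}

// Usage (file names are placeholders):
// convertWithWkhtmltopdf('invoice.html', 'invoice.pdf').then(console.log, console.error);
```

Using execFile with an argument array, rather than a single shell string, avoids quoting problems when paths contain spaces.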
✅ Pros:
- Open-source and free to use.
- Simple CLI usage makes it straightforward to integrate with different back-end languages.
- Mature project with a large user base.
❌ Cons:
- Limited support for modern JavaScript and CSS features like flexbox or grid.
- Inconsistent behavior with more complex layouts or interactive elements.
- Can be slower and produce lower-quality rendering compared to headless Chrome.
DocRaptor (PrinceXML)
DocRaptor is a cloud-based API that leverages the commercial PrinceXML rendering engine to convert HTML (and CSS) into high-quality PDF or Excel documents. PrinceXML itself is known for its advanced typesetting capabilities and thorough CSS support, including complex paged media features.
➡️ How it works:
- You send your HTML/CSS (and optionally JavaScript) via an API call.
- The service uses PrinceXML under the hood to produce a precisely formatted PDF.
- You receive the final PDF via the API response.
✅ Pros:
- High-fidelity rendering for complex layouts, including advanced typographic and pagination features.
- Thorough CSS support, handling paged media, footnotes, multi-column layouts, and more.
- Backed by a commercial solution with consistent updates and dedicated support.
❌ Cons:
- Very high licensing cost: PrinceXML (and by extension DocRaptor) can be prohibitively expensive for many smaller projects or startups.
- Not a full headless browser engine: Because PrinceXML isn't based on Chromium/Firefox, it may struggle with dynamic JavaScript frameworks (React, Vue, Angular) that rely heavily on live browser rendering.
Headless Chrome as a Game Changer
For a long time, generating PDFs that perfectly mirrored a modern browser's rendering was difficult. Older engines like WebKit in wkhtmltopdf or specialized solutions (e.g., PrinceXML) often lagged behind the latest HTML/CSS/JS capabilities found in Chrome. That changed significantly when Google introduced Headless Chrome, allowing developers to run the browser without a visible UI.
Shortly after, Puppeteer and then Playwright emerged as powerful tools for automating Headless Chrome, enabling developers to generate PDFs and screenshots with precision.
Puppeteer
Puppeteer is a Node.js library that provides an extensive API for automating tasks in headless (or full) Chrome/Chromium. Initially released by the Google Chrome team, it quickly became popular for web scraping, testing, and HTML to PDF conversion.
➡️ How it works:
- Puppeteer spins up a headless Chrome/Chromium instance.
- Navigates to a given web page or loads an HTML string.
- Waits for the content (including JavaScript) to render fully, then exports a PDF that matches Chrome's final layout.
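The three steps above might look roughly like this in code. This is a minimal sketch, assuming the puppeteer package is installed; htmlToPdfWithPuppeteer and defaultPdfOptions are hypothetical names, and the HTML string in the usage comment is a placeholder.

```javascript
// Hypothetical default options passed to page.pdf().
function defaultPdfOptions(outputPath) {
  return { path: outputPath, format: 'A4', printBackground: true };
}

async function htmlToPdfWithPuppeteer(html, outputPath) {
  // Required lazily so that this file still loads where puppeteer is not installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(); // 1. spin up headless Chrome/Chromium
  try {
    const page = await browser.newPage();
    // 2. load an HTML string and wait until network activity settles
    await page.setContent(html, { waitUntil: 'networkidle0' });
    // 3. export a PDF of the rendered page
    await page.pdf(defaultPdfOptions(outputPath));
  } finally {
    await browser.close(); // always release the browser process
  }
  return outputPath;
}

// Usage:
// htmlToPdfWithPuppeteer('<h1>Hello</h1>', 'hello.pdf').then(console.log, console.error);
```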
✅ Pros:
- Full support for modern HTML, CSS, and dynamic JavaScript.
- Highly configurable: page size, margins, custom headers/footers, etc.
- Maintained by the Google Chrome team, ensuring updates align with Chromium.
❌ Cons:
- Language focus: Primarily supported in Node.js, so it's less convenient for developers in other languages.
- Resource usage: Spinning up a headless browser can be heavier compared to simpler tools.
- Scaling complexity: Requires more infrastructure for large workloads.
Playwright
Playwright was developed by Microsoft and shares many similarities with Puppeteer. It supports automated testing and browser manipulation for Chromium, Firefox, and WebKit, although PDF generation currently works only in Chromium. Despite that limitation, Playwright's design and multiple language SDKs make it a powerful option for diverse teams.
➡️ How it works:
- Playwright launches a headless Chrome/Chromium instance.
- It navigates to a specified webpage or processes an HTML string.
- The tool waits until the content, including any JavaScript, is fully rendered, and then generates a PDF that reflects the final layout of the browser.
✅ Pros:
- Multiple language SDKs (Python, Java, C#/.NET, JavaScript/TypeScript) accommodate varied development teams.
- Modern standards compatibility: Because it uses a real Chromium engine for PDF, flexbox, grid, and other advanced CSS features are fully supported.
- Flexible configuration: Allows custom margins, paper sizes, headers/footers, and dynamic page manipulation before exporting.
- Scalable & actively maintained: Backed by Microsoft and the open-source community, it evolves quickly and is well-documented. You can containerize headless Chromium and distribute load for large-scale PDF tasks.
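To give a sense of the configuration options mentioned above, here is a hedged sketch of an options object for Playwright's page.pdf(). The helper name buildPdfOptions, the margin values, and the footer markup are illustrative assumptions, not recommendations.

```javascript
// Build an options object for page.pdf(); the values here are only examples.
function buildPdfOptions(outputPath) {
  return {
    path: outputPath,
    format: 'A4',
    printBackground: true,
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
    displayHeaderFooter: true,
    // An intentionally empty header.
    headerTemplate: '<span></span>',
    // Chromium fills in elements with the classes "pageNumber" and "totalPages".
    footerTemplate:
      '<div style="font-size:9px; width:100%; text-align:center;">' +
      'Page <span class="pageNumber"></span> of <span class="totalPages"></span>' +
      '</div>',
  };
}

// Usage inside a Playwright script:
// await page.pdf(buildPdfOptions('report.pdf'));
```

Keeping the options in one function makes it easier to reuse the same page setup across different documents.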
❌ Cons:
- PDF limited to Chromium: Even though Playwright supports Firefox and WebKit for automation, PDF generation is restricted to Chromium.
- High resource usage: Like Puppeteer, running a headless browser is heavier than simpler command-line tools.
- Overkill for minimal needs: If you only require simple, static PDFs, spinning up Chromium might be more complex than necessary.
Puppeteer vs Playwright
Both Puppeteer and Playwright are powerful tools for PDF generation, leveraging the same Chromium engine. The choice between the two largely depends on your specific needs, such as your preferred programming language, the complexity of your project, and whether you need to automate multiple browsers beyond PDF generation.
Here's a comparison to help you choose the right tool for your needs:
Step-by-Step Guide: Generating Invoice PDF with Playwright
In this quick example, we'll show how to generate an invoice PDF by rendering an EJS template in Node.js and converting it to a PDF using Playwright.
The entire example is available on GitHub if you'd like to view or clone the full project.
1️⃣ Set Up the Project
1. Install Node.js.
Make sure you have Node.js installed. If not, download and install it from the Node.js website.
2. Create a New Project Directory.
Open a Terminal and run:
mkdir invoice-generator
cd invoice-generator
3. Initialize a New Node.js Project.
Create a package.json file with the following command:
npm init -y
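4. Install Required Packages.
Install ejs for templating and playwright for generating PDFs:
npm install ejs playwright
Depending on the Playwright version, browser binaries may not be downloaded during npm install. If the script later reports a missing browser executable, download Chromium explicitly:
npx playwright install chromium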
2️⃣ Organize Your Project Structure
Here's a suggested structure for better organization:
invoice-generator/
├── data/                     // Directory for data files
│   └── invoice-data.json     // JSON file for invoice data
├── templates/                // Directory for HTML templates
│   └── invoice.ejs           // Template for the invoice
├── generate-invoice.js       // Main script to generate PDFs
└── package.json              // Project configuration file
3️⃣ Create an EJS Template
- To start generating PDFs, you'll first need an HTML template for your invoice. In this example, we will use EJS, a simple templating language that allows you to generate HTML markup using plain JavaScript.
- Save this template as invoice.ejs in the templates directory. It should define the structure of your invoice and include placeholders for dynamic data.
You can find my example template on GitHub.
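For readers who have not used EJS before, the sketch below shows what such placeholders can look like. The template markup and the field names (invoiceNumber, customer.name, items) are hypothetical and may not match the example project; the code assumes the ejs package installed in step 1.

```javascript
// A tiny inline EJS template: <%= %> outputs an escaped value, <% %> runs plain JavaScript.
const INVOICE_TEMPLATE = `
<h1>Invoice <%= invoiceNumber %></h1>
<p>Billed to: <%= customer.name %></p>
<ul>
<% items.forEach((item) => { %>
  <li><%= item.description %>: <%= item.quantity %> x <%= item.unitPrice %></li>
<% }); %>
</ul>`;

// Render the template with ejs; required lazily so the template string can be inspected without it.
function renderInvoiceHtml(data) {
  const ejs = require('ejs');
  return ejs.render(INVOICE_TEMPLATE, data);
}

// Usage:
// renderInvoiceHtml({
//   invoiceNumber: 'INV-001',
//   customer: { name: 'Acme Ltd.' },
//   items: [{ description: 'Consulting', quantity: 2, unitPrice: 100 }],
// });
```

In the tutorial itself the template lives in its own .ejs file and is rendered with ejs.renderFile, but the placeholder syntax is the same.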
4️⃣ Add Invoice Data
- Save your invoice details as invoice-data.json in the data directory.
- This file will hold the dynamic data, such as customer details, that will be used to populate the placeholders in the invoice.ejs template.
- By keeping the data in a separate file, you ensure that the template can be reused with different data sets.
You can find the example invoice data on GitHub.
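The real structure of invoice-data.json depends on your template. As an illustration only, the sketch below shows one possible shape (all field names are assumptions) together with a small hypothetical helper, prepareInvoice, that checks the fields a template might rely on and derives a total before rendering.

```javascript
// One possible shape for data/invoice-data.json (field names are hypothetical).
const sampleInvoice = {
  invoiceNumber: 'INV-001',
  customer: { name: 'Acme Ltd.', email: 'billing@example.com' },
  items: [
    { description: 'Consulting', quantity: 2, unitPrice: 100 },
    { description: 'Hosting', quantity: 1, unitPrice: 50 },
  ],
};

// Validate the minimum fields and add a computed total for the template.
function prepareInvoice(invoice) {
  if (!invoice.invoiceNumber || !Array.isArray(invoice.items)) {
    throw new Error('Invoice data needs an invoiceNumber and an items array');
  }
  const total = invoice.items.reduce(
    (sum, item) => sum + item.quantity * item.unitPrice,
    0
  );
  return { ...invoice, total };
}

console.log(prepareInvoice(sampleInvoice).total); // 250
```

Failing early on malformed data tends to give a clearer error than a template that silently renders empty fields.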
5️⃣ Create the PDF Generator Script
- To generate a PDF from your HTML template, you'll need a script that renders the template and uses Playwright for PDF conversion.
- Save this file as generate-invoice.js in your project directory.
Below is the complete script:
const ejs = require('ejs');
const fs = require('fs');
const path = require('path');
const { chromium } = require('playwright');

(async () => {
  let browser;
  try {
    // Load invoice data from the JSON file
    const dataPath = path.join(__dirname, 'data', 'invoice-data.json');
    const invoiceData = JSON.parse(fs.readFileSync(dataPath, 'utf8'));

    // Build a file-name-safe timestamp (colons and dots replaced with hyphens)
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');

    // Render the EJS template to HTML
    const templatePath = path.join(__dirname, 'templates', 'invoice.ejs');
    const html = await ejs.renderFile(templatePath, invoiceData);

    // Launch a headless browser using Playwright
    browser = await chromium.launch();
    const page = await browser.newPage();

    // Load the rendered HTML into the browser
    await page.setContent(html, { waitUntil: 'load' });

    // Generate the PDF and save it with a timestamped filename
    const pdfPath = `invoice-${timestamp}.pdf`;
    await page.pdf({
      path: pdfPath,
      format: 'A4',
      printBackground: true,
      // Additional parameters can be added here
    });

    console.log(`PDF successfully created at: ${pdfPath}`);
  } catch (error) {
    console.error('An error occurred while generating the invoice:', error);
    process.exitCode = 1;
  } finally {
    // Close the browser even if rendering fails, so the process can exit
    if (browser) await browser.close();
  }
})();
You can also find the complete script on GitHub.
6️⃣ Run the Script
1. Open a terminal and navigate to the project directory:
cd invoice-generator
2. Run the script:
node generate-invoice.js
7️⃣ Check the Output
Once the script has executed successfully, you'll find the following file in your project directory:
- Generated PDF: invoice-<timestamp>.pdf, the final invoice ready for sharing or printing.
- Open the PDF to ensure it displays the invoice correctly.
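The timestamp part of the file name comes from the ISO date string built in the script, with colons and dots replaced by hyphens so the name works on common file systems. A small sketch of that step (the function name invoiceFileName is hypothetical):

```javascript
// Turn a Date into a file-system-friendly invoice file name.
function invoiceFileName(date = new Date()) {
  const timestamp = date.toISOString().replace(/[:.]/g, '-');
  return `invoice-${timestamp}.pdf`;
}

console.log(invoiceFileName(new Date('2025-01-15T10:30:00.000Z')));
// invoice-2025-01-15T10-30-00-000Z.pdf
```

Because the timestamp has millisecond resolution, two runs started within the same millisecond would produce the same name, which is unlikely for a manually run script.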
It's done! You've successfully created the invoice PDF. 🎉
Below is a preview of the generated invoice PDF:
Conclusion
Playwright (and headless browsers in general) represents the modern standard for HTML to PDF conversion, providing excellent support for contemporary web layouts and interactive elements. While some projects still rely on wkhtmltopdf or specialized engines like PrinceXML, using headless Chromium ensures your PDFs accurately reflect the latest HTML/CSS/JS capabilities.
Don't want to manage browser instances yourself? If you're looking for a simpler route, you could opt for an API-based solution such as PDFBolt, an approach that offloads the maintenance and scaling concerns, letting you focus on your core application logic.
No matter which method you choose, we hope this brief history and tutorial will help you generate PDFs with confidence in 2025 and beyond!