How to convert HTML to PDF using Python
Converting HTML to PDF programmatically is a common requirement for generating reports, invoices, or documentation. Python has several libraries for this task. One popular option is pdfkit, a thin wrapper around the wkhtmltopdf command-line tool, which does the actual rendering.
In this article, we will explore how to convert HTML files to PDF using Python with the pdfkit library.
Prerequisites and Installation
Step 1: Install pdfkit Library
First, install the pdfkit library using pip:
pip install pdfkit
Step 2: Install wkhtmltopdf
The pdfkit library requires wkhtmltopdf as its backend engine. Follow these steps to install it:
- Download wkhtmltopdf from the official website
- Run the installer. On Windows, the default install location places the executable in C:\Program Files\wkhtmltopdf\bin
- Add that folder to your system's Path environment variable:
  - Open System Properties → Advanced → Environment Variables
  - Select "Path" and click Edit
  - Add C:\Program Files\wkhtmltopdf\bin
  - Click OK to save
- Verify the installation by running wkhtmltopdf --version in a new Command Prompt window
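The same check can be run from Python, which is useful at the start of a script that depends on the tool. The sketch below assumes the binary is named wkhtmltopdf and is reachable through PATH; `wkhtmltopdf_version` is a hypothetical helper name for this article, not part of pdfkit.

```python
import shutil
import subprocess


def wkhtmltopdf_version(name="wkhtmltopdf"):
    """Return the `--version` output of the binary, or None if it is not on PATH."""
    executable = shutil.which(name)  # None when the binary cannot be found
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit code
    )
    return result.stdout.strip() or None
```

If the function returns None, the PATH entry is probably missing or the terminal was opened before the environment variable was changed.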
Converting HTML Files to PDF
Method 1: Converting Local HTML Files
Here's how to convert a local HTML file to PDF:
import pdfkit
# Configure the path to wkhtmltopdf (Windows)
config = pdfkit.configuration(wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')
# Convert HTML file to PDF
pdfkit.from_file('example.html', 'output.pdf', configuration=config)
print("HTML file successfully converted to PDF!")
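In a real script it helps to check the input file and handle conversion failures instead of printing a success message unconditionally. The following is a minimal sketch under a few assumptions: `convert_html_file` is a hypothetical wrapper, the output name defaults to the input name with a .pdf extension, and pdfkit signals a missing binary or a failed wkhtmltopdf run with an IOError (an alias of OSError in Python 3).

```python
import os


def convert_html_file(html_path, pdf_path=None, wkhtmltopdf_path=None):
    """Convert html_path to PDF and return the path of the generated file."""
    if not os.path.isfile(html_path):
        raise FileNotFoundError(f"Input HTML file not found: {html_path}")
    if pdf_path is None:
        # Default output: same name as the input, with a .pdf extension
        pdf_path = os.path.splitext(html_path)[0] + ".pdf"

    # Imported here so the input check above also runs when pdfkit is missing
    import pdfkit

    try:
        config = None
        if wkhtmltopdf_path:
            config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
        pdfkit.from_file(html_path, pdf_path, configuration=config)
    except OSError as exc:
        # Assumption: pdfkit reports a missing binary or a wkhtmltopdf error this way
        raise RuntimeError(f"Conversion failed: {exc}") from exc
    return pdf_path
```

Calling `convert_html_file('example.html')` would then write example.pdf next to the input, or raise an error that says what went wrong.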
Method 2: Converting HTML Strings
You can also convert HTML content directly from a string:
import pdfkit
# HTML content as string
html_content = """
<html>
<head><title>Sample Document</title></head>
<body>
<h1>Hello World!</h1>
<p>This is a sample HTML document converted to PDF.</p>
</body>
</html>
"""
# Configure path (adjust for your system)
config = pdfkit.configuration(wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')
# Convert HTML string to PDF
pdfkit.from_string(html_content, 'string_output.pdf', configuration=config)
print("HTML string successfully converted to PDF!")
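String conversion is most useful when the HTML is generated from data, for example a report table. The sketch below builds the markup with the standard library and escapes every value so that characters such as < and & do not break the document. The names `build_report_html` and `report_to_pdf` are illustrative, not part of pdfkit.

```python
import html


def build_report_html(title, rows):
    """Return a small HTML document with one table row per (name, value) pair."""
    body_rows = "\n".join(
        f"<tr><td>{html.escape(str(name))}</td><td>{html.escape(str(value))}</td></tr>"
        for name, value in rows
    )
    return (
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        "<body>\n"
        f"<h1>{html.escape(title)}</h1>\n"
        f"<table>\n{body_rows}\n</table>\n"
        "</body>\n"
        "</html>"
    )


def report_to_pdf(title, rows, pdf_path, configuration=None):
    """Render the report and write it to pdf_path using pdfkit."""
    import pdfkit  # imported lazily so build_report_html works without pdfkit

    pdfkit.from_string(build_report_html(title, rows), pdf_path, configuration=configuration)
```

For larger documents a template engine is usually a better fit than string formatting, but the conversion call stays the same.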
Method 3: Converting Web URLs
You can convert web pages directly from URLs:
import pdfkit
# Configure path
config = pdfkit.configuration(wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')
# Convert URL to PDF
pdfkit.from_url('https://www.example.com', 'webpage.pdf', configuration=config)
print("Webpage successfully converted to PDF!")
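When the URL comes from user input or a configuration file, it may be worth validating it before handing it to the converter. The sketch below accepts only http and https URLs with a host name; `is_http_url` and `url_to_pdf` are hypothetical helpers, and the policy itself is an assumption you may want to loosen or tighten.

```python
from urllib.parse import urlparse


def is_http_url(url):
    """Return True for URLs with an http(s) scheme and a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def url_to_pdf(url, pdf_path, configuration=None):
    """Convert a web page to PDF after a basic sanity check on the URL."""
    if not is_http_url(url):
        raise ValueError(f"Refusing to convert non-HTTP(S) URL: {url!r}")

    import pdfkit  # imported lazily so is_http_url works without pdfkit

    pdfkit.from_url(url, pdf_path, configuration=configuration)
```

Note that the page is fetched and rendered by wkhtmltopdf itself, so the machine running the script needs network access to the target site.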
Customizing PDF Output
You can customize the PDF output using various options:
import pdfkit
config = pdfkit.configuration(wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')
# PDF customization options
options = {
    'page-size': 'A4',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'no-outline': None,
    'enable-local-file-access': None
}
# Convert with custom options
pdfkit.from_file('example.html', 'custom_output.pdf', options=options, configuration=config)
print("Custom PDF generated successfully!")
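The option keys in the dictionary are wkhtmltopdf command-line flags without the leading dashes, and a value of None marks a switch that takes no argument (such as no-outline). The function below is only an illustration of that mapping, not pdfkit's actual implementation; `options_to_args` is a hypothetical name.

```python
def options_to_args(options):
    """Flatten an options dict into wkhtmltopdf-style command-line arguments."""
    args = []
    for key, value in options.items():
        args.append(f"--{key}")
        if value is not None:
            # Options with a value are followed by it; switches stand alone
            args.append(str(value))
    return args


example = {"page-size": "A4", "margin-top": "0.75in", "no-outline": None}
print(options_to_args(example))
# ['--page-size', 'A4', '--margin-top', '0.75in', '--no-outline']
```

Thinking of the dictionary this way makes it easier to look up further options in the wkhtmltopdf help output and translate them into keys.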
Cross-Platform Configuration
For better compatibility across operating systems, choose the wkhtmltopdf path based on the platform the script is running on:
import pdfkit
import platform
# Auto-detect wkhtmltopdf path based on OS
def get_wkhtmltopdf_path():
    system = platform.system()
    if system == "Windows":
        return r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe'
    elif system == "Darwin":  # macOS
        return '/usr/local/bin/wkhtmltopdf'
    else:  # Linux
        return '/usr/bin/wkhtmltopdf'
# Configure based on system
config = pdfkit.configuration(wkhtmltopdf=get_wkhtmltopdf_path())
# Convert HTML to PDF
pdfkit.from_file('example.html', 'cross_platform_output.pdf', configuration=config)
print("PDF created successfully on", platform.system())
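Hard-coded paths only work when the tool was installed to its default location. A more forgiving variant, sketched below, looks for the binary in three places: an environment variable override, the system PATH, and finally the default locations used above. The variable name WKHTMLTOPDF_PATH and the function `resolve_wkhtmltopdf` are assumptions made for this example, not something pdfkit defines.

```python
import os
import shutil

# Default install locations used earlier in this article; adjust for your system
DEFAULT_LOCATIONS = [
    r"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe",
    "/usr/local/bin/wkhtmltopdf",
    "/usr/bin/wkhtmltopdf",
]


def resolve_wkhtmltopdf(env_var="WKHTMLTOPDF_PATH", name="wkhtmltopdf",
                        fallbacks=DEFAULT_LOCATIONS):
    """Return the path to the wkhtmltopdf binary, or None if it cannot be found."""
    # 1. Explicit override through an environment variable (hypothetical name)
    override = os.environ.get(env_var)
    if override and os.path.isfile(override):
        return override
    # 2. Whatever the system PATH resolves to
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # 3. Known default install locations
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None
```

The result can be passed to `pdfkit.configuration(wkhtmltopdf=...)` when it is not None; when it is None, the script can stop early with a clear message instead of failing during conversion.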
Common Options
| Option | Description | Example Value |
|---|---|---|
| page-size | Paper size | 'A4', 'Letter' |
| orientation | Page orientation | 'Portrait', 'Landscape' |
| margin-top/bottom/left/right | Page margins | '0.75in', '10mm' |
| encoding | Character encoding | 'UTF-8' |
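To avoid repeating the same dictionary in every script, the options from the table can be assembled by a small helper. This is a sketch with assumed defaults (A4, portrait, 0.75in margins); `build_options` is a hypothetical name, and the orientation check simply mirrors the two values listed in the table.

```python
def build_options(page_size="A4", orientation="Portrait", margin="0.75in",
                  encoding="UTF-8"):
    """Return a pdfkit options dict with the same margin on all four sides."""
    if orientation not in ("Portrait", "Landscape"):
        raise ValueError(f"Unsupported orientation: {orientation!r}")
    return {
        "page-size": page_size,
        "orientation": orientation,
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": encoding,
    }
```

The returned dictionary can be passed as the options argument of from_file, from_string, or from_url, for example `options=build_options(orientation="Landscape", margin="10mm")`.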
Conclusion
Converting HTML to PDF with pdfkit takes only a few lines of Python. The library can convert local files, HTML strings, and URLs, and it passes options such as page size, orientation, margins, and encoding through to wkhtmltopdf. Remember that both pdfkit and wkhtmltopdf must be installed, and that pdfkit needs to be able to find the wkhtmltopdf executable, either through the system path or an explicit configuration.
