Create, read, and write Word and PDF files in Python

GemBox.Document is a .NET library that enables you to process Word files from any .NET application. It is also accessible through COM, so you can use it from Python as well.

System Requirements

To use GemBox.Document in Python, you'll need to:

  1. Download and install GemBox.Document Setup.
  2. Expose GemBox.Document to COM Interop with the Regasm.exe tool:
    :: Add GemBox.Document to COM registry for x86 (32-bit) applications.
    C:\Windows\Microsoft.NET\Framework\v4.0.30319\RegAsm.exe [path to installed GemBox.Document.dll]
    :: Add GemBox.Document to COM registry for x64 (64-bit) applications.
    C:\Windows\Microsoft.NET\Framework64\v4.0.30319\RegAsm.exe [path to installed GemBox.Document.dll]
  3. Install the Python for Windows extensions (pywin32):
    :: Install Python extension for Windows.
    pip install pywin32
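
Step 2 lists two registrations, and the one that matters is presumably the one matching the bitness of the Python interpreter that will create the COM object. The following is a minimal sketch for checking this. The helper names interpreter_bits and regasm_for are hypothetical, and the two paths are copied from the commands above.

```python
import struct

# Paths copied from the registration commands above.
REGASM_X86 = r"C:\Windows\Microsoft.NET\Framework\v4.0.30319\RegAsm.exe"
REGASM_X64 = r"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\RegAsm.exe"


def interpreter_bits():
    """Return 32 or 64, based on the pointer size of the running Python."""
    return struct.calcsize("P") * 8


def regasm_for(bits):
    """Pick the RegAsm.exe matching the given bitness (hypothetical helper)."""
    if bits == 64:
        return REGASM_X64
    if bits == 32:
        return REGASM_X86
    raise ValueError(f"Unsupported bitness: {bits}")


if __name__ == "__main__":
    bits = interpreter_bits()
    print(f"Python is {bits}-bit; register with: {regasm_for(bits)}")
```

Running the script prints the bitness of the current interpreter and the RegAsm.exe path to use for it.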

Working with Word files in Python

The following example shows how you can read a template Word file from Python, edit its content (with Find and Replace, Mail Merge, and Modify Bookmarks operations), and save the result as a PDF file.

Reading Word file and writing PDF file in Python
Screenshot of a PDF file created from updated Word template
import os
import win32com.client as COM

# Create ComHelper object.
comHelper = COM.Dispatch("GemBox.Document.ComHelper")
# If using the Professional version, put your serial key below.
comHelper.ComSetLicense("FREE-LIMITED-KEY")

# Read Word document.
document = comHelper.Load(os.getcwd() + "\\ComTemplate.docx")

# Find and replace text.
document.Content.Replace("PLACEHOLDER1", "Sample Value 1")
document.Content.Replace("PLACEHOLDER2", "Sample Value 2")
document.Content.Replace("PLACEHOLDER3", "Sample Value 3")

# Execute mail merge process.
source = COM.Dispatch("System.Collections.Hashtable")
source.Add("Name", "John")
source.Add("Surname", "Doe")
source.Add("Age", 30)
# Merge the values into the document's merge fields.
document.MailMerge.Execute(source)

# Modify bookmarks content.
document.Bookmarks.Item("Bookmark1").GetContent(True).LoadText("Sample Content 1.")
document.Bookmarks.Item("Bookmark2").GetContent(True).LoadText("Sample Content 2.")

# Write document as PDF.
document.Save(os.getcwd() + "\\ComExample.pdf")
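
The load, replace, and save steps above can also be wrapped in a small reusable function. This is only a sketch: build_path and fill_template are hypothetical helper names, and the function uses only the Load, Content.Replace, and Save calls that appear in the example above. The ComHelper object is passed in as a parameter, so the logic can be exercised without COM being available.

```python
import os


def build_path(folder, file_name):
    """Join a folder and a file name into an absolute path."""
    return os.path.abspath(os.path.join(folder, file_name))


def fill_template(com_helper, template_path, output_path, replacements):
    """Load a document, apply find-and-replace pairs, and save it.

    com_helper is expected to be the object returned by
    COM.Dispatch("GemBox.Document.ComHelper"). Only Load, Content.Replace
    and Save are called, as in the example above.
    """
    document = com_helper.Load(template_path)
    for placeholder, value in replacements.items():
        document.Content.Replace(placeholder, value)
    # The output format follows the file extension, as with ComExample.pdf above.
    document.Save(output_path)
    return document


# Usage on Windows, assuming GemBox.Document is registered for COM:
#   import win32com.client as COM
#   helper = COM.Dispatch("GemBox.Document.ComHelper")
#   fill_template(helper,
#                 build_path(os.getcwd(), "ComTemplate.docx"),
#                 build_path(os.getcwd(), "ComExample.pdf"),
#                 {"PLACEHOLDER1": "Sample Value 1"})
```

Building the paths with os.path.join avoids hand-written backslash escapes in the file names.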

Wrapper Library

Not all members of GemBox.Document are accessible through COM because of COM limitations, such as unsupported static and overloaded methods. For that reason, the ComHelper class provides alternatives for some members that cannot be called with COM Interop.

However, if you need to use many GemBox.Document members from Python, the recommended approach is to create a .NET wrapper library instead. Your wrapper library should do all the work internally and expose a minimal set of classes and methods to the unmanaged code.

This will enable you to take advantage of GemBox.Document's full capabilities, avoid any COM limitations, and improve performance by reducing the number of COM Callable Wrappers created at runtime.
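
From the Python side, such a wrapper could reduce the whole workflow to a single call. The sketch below is purely illustrative: the ProgID "MyCompany.ReportWrapper" and its Generate method are hypothetical and not part of GemBox.Document, and the field values are passed as a JSON string on the assumption that plain strings are the simplest thing to send across the COM boundary.

```python
import json


def render_report(dispatch, template_path, output_path, fields):
    """Call a hypothetical .NET wrapper through one coarse-grained COM method.

    dispatch is a callable such as win32com.client.Dispatch. The ProgID
    "MyCompany.ReportWrapper" and the Generate method are made up for
    illustration; you would define them in your own wrapper library.
    """
    wrapper = dispatch("MyCompany.ReportWrapper")
    # Serialize the values once so only strings are passed to the wrapper.
    payload = json.dumps(fields)
    return wrapper.Generate(template_path, output_path, payload)


# Usage on Windows, assuming such a wrapper has been built and registered:
#   import win32com.client as COM
#   render_report(COM.Dispatch, "ComTemplate.docx", "ComExample.pdf",
#                 {"Name": "John", "Surname": "Doe", "Age": 30})
```

With this shape, only one wrapper object is created from Python, and the wrapper itself performs the replace, mail merge, and bookmark operations in .NET.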

Next steps

GemBox.Document is a .NET component that enables you to read, write, edit, convert, and print document files from your .NET applications using one simple API. How about testing it today?