Python - Working with the python-docx module


Word documents hold formatted text in three levels of objects: Run objects at the lowest level, Paragraph objects in the middle, and the Document object at the top. A run is a stretch of text within a paragraph that shares the same formatting, such as bold or italic.

Because of this structure, a .docx file is not plain text, so we cannot work with it in an ordinary text editor. We can, however, create and modify Word documents in Python using the python-docx module.

  • The first step is to install the third-party module python-docx, for example with pip: pip install python-docx
  • After installation, import the module as docx, not python-docx.
  • Call docx.Document() to create a new Word document, or pass it the path of an existing .docx file to open that file.

Example

# the package is installed as python-docx but imported as docx
import docx
# create a new, empty Word document
doc = docx.Document()
# add a heading of level 0 (formatted with the document's Title style)
doc.add_heading('Heading for the document', 0)
# add a paragraph and keep the returned Paragraph object
# so that more runs can be appended to it
doc_para = doc.add_paragraph('Your paragraph goes here, ')
# add runs; each run can carry its own character
# formatting such as bold, italic, underline, etc.
doc_para.add_run('hey there, bold here').bold = True
doc_para.add_run(', and ')
doc_para.add_run('these words are italic').italic = True
# add a page break to start a new page
doc.add_page_break()
# add a heading of level 2
doc.add_heading('Heading level 2', 2)
# pictures can also be added to the document;
# replace 'path_to_picture' with the path of an existing image file
# (an optional width argument, e.g. docx.shared.Inches(4), scales the image)
doc.add_picture('path_to_picture')
# save the document; replace 'path_to_document' with a path ending in .docx
doc.save('path_to_document')

Updated on: 08-Aug-2020
