Replacing Text in Word Documents with Python While Preserving Formatting

When you replace text in a Word document with Python, a naive replacement can discard the original formatting. One way to avoid this is to drive the word processor itself through COM automation with the win32com package (part of pywin32), so that the application's own find-and-replace does the work. This article shows how to search and replace content using the Find.Execute method.

The following is Microsoft's official definition of the method's parameters:

| Name | Required/Optional | Data Type | Description |
|---|---|---|---|
| FindText | Optional | Variant | The text to search for. Use an empty string ("") to search for formatting only. You can search for special characters by specifying the appropriate character codes. For example, "^p" corresponds to a paragraph mark and "^t" corresponds to a tab character. |
| MatchCase | Optional | Variant | True to make the search case-sensitive. Corresponds to the "Match case" checkbox in the "Find and Replace" dialog (Edit menu). |
| MatchWholeWord | Optional | Variant | True to match whole words only, not text that is part of a longer word. Corresponds to the "Find whole words only" checkbox in the "Find and Replace" dialog. |
| MatchWildcards | Optional | Variant | True to treat the find text as a pattern containing special search operators. Corresponds to the "Use wildcards" checkbox in the "Find and Replace" dialog. |
| MatchSoundsLike | Optional | Variant | True to find words that sound similar to the find text. Corresponds to the "Sounds like" checkbox in the "Find and Replace" dialog. |
| MatchAllWordForms | Optional | Variant | True to find all forms of the find text (for example, "sit" also finds "sits" and "sat"). Corresponds to the "Find all word forms" checkbox in the "Find and Replace" dialog. |
| Forward | Optional | Variant | True to search forward (toward the end of the document). |
| Wrap | Optional | Variant | Controls what happens if the search begins somewhere other than the beginning of the document and reaches the end of the document (or the reverse, if Forward is False). It also controls what happens if there is a selection or range and the search text is not found in it. Can be one of the WdFindWrap constants. |
| Format | Optional | Variant | True to have the find operation locate formatting in addition to, or instead of, the find text. |
| ReplaceWith | Optional | Variant | The replacement text. To delete the text specified by the FindText argument, use an empty string (""). You can specify special characters and advanced search criteria, just as for the FindText argument. To use a graphic object or other non-text item as the replacement, copy the item to the Clipboard and specify "^c" for ReplaceWith. |
| Replace | Optional | Variant | Specifies how many replacements to perform: one, all, or none. Can be any WdReplace constant. |
| MatchKashida | Optional | Variant | True to match kashidas in Arabic documents. This argument may not be available, depending on the language support you have selected or installed (for example, US English). |
| MatchDiacritics | Optional | Variant | True to match diacritics in documents written in right-to-left languages. This argument may not be available, depending on the language support you have selected or installed (for example, US English). |
| MatchAlefHamza | Optional | Variant | True to match alef hamzas in Arabic documents. This argument may not be available, depending on the language support you have selected or installed (for example, US English). |
| MatchControl | Optional | Variant | True to match bidirectional control characters in documents written in right-to-left languages. This argument may not be available, depending on the language support you have selected or installed (for example, US English). |
| MatchPrefix | Optional | Variant | True to match words that begin with the search string. Corresponds to the "Match prefix" checkbox in the "Find and Replace" dialog. |
| MatchSuffix | Optional | Variant | True to match words that end with the search string. Corresponds to the "Match suffix" checkbox in the "Find and Replace" dialog. |
| MatchPhrase | Optional | Variant | True to ignore all white space and control characters between words. |
| IgnoreSpace | Optional | Variant | True to ignore white space between words. Corresponds to the "Ignore white-space characters" checkbox in the "Find and Replace" dialog. |
| IgnorePunct | Optional | Variant | True to ignore all punctuation characters between words. Corresponds to the "Ignore punctuation characters" checkbox in the "Find and Replace" dialog. |
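
The following example replaces every occurrence of "hello" with "world" in demo.docx:

import os
import win32com.client

docx_path = 'demo.docx'
# "Kwps.Application" is the WPS Writer ProgID; use "Word.Application" for Microsoft Word
app = win32com.client.DispatchEx("Kwps.Application")
# COM needs an absolute path; ReadOnly=0 opens the file for editing
doc = app.Documents.Open(os.path.abspath(docx_path), ReadOnly=0)
oldstr = "hello"
newstr = "world"
# Positional arguments: FindText, MatchCase, MatchWholeWord, MatchWildcards,
# MatchSoundsLike, MatchAllWordForms, Forward, Wrap, Format, ReplaceWith, Replace
app.Selection.Find.Execute(oldstr, False, False, False, False, False, True, 1, False, newstr, 2)
doc.Save()  # write the replacements back to the file before closing
doc.Close()
app.Quit()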
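
If the replacement raises an exception, the script above leaves the application process running in the background. The following is a minimal sketch of one way to guard against that with try/finally. The function name `replace_in_document`, the `prog_id` default, and the `dispatch` parameter are all assumptions made for illustration. The sketch also searches `doc.Content` (the whole document as a Range) instead of `app.Selection`, which is a substitution and not what the original script does.

```python
import os

WD_FIND_CONTINUE = 1  # WdFindWrap.wdFindContinue
WD_REPLACE_ALL = 2    # WdReplace.wdReplaceAll


def replace_in_document(path, old, new, prog_id="Kwps.Application", dispatch=None):
    """Replace every occurrence of `old` with `new`, then save the document.

    `dispatch` is a hypothetical hook for supplying a COM factory; when it is
    omitted, win32com.client.DispatchEx is used (requires pywin32 on Windows).
    Pass prog_id="Word.Application" to target Microsoft Word.
    """
    if dispatch is None:
        import win32com.client  # imported lazily so the module loads anywhere
        dispatch = win32com.client.DispatchEx

    app = dispatch(prog_id)
    doc = None
    try:
        doc = app.Documents.Open(os.path.abspath(path))
        # Same positional arguments as the script above, applied to the
        # whole-document range instead of the current selection.
        found = doc.Content.Find.Execute(
            old, False, False, False, False, False,
            True, WD_FIND_CONTINUE, False, new, WD_REPLACE_ALL)
        doc.Save()
        return bool(found)
    finally:
        # Always release the document and the application process.
        if doc is not None:
            doc.Close()
        app.Quit()
```

On a machine with pywin32 and the word processor installed, `replace_in_document("demo.docx", "hello", "world")` should return True when at least one match was found.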

Parameter Description

The arguments passed to Execute in the example are, in order:

  • oldstr (FindText): the text to search for.
  • False (MatchCase): True would make the search case-sensitive.
  • False (MatchWholeWord): True would match whole words only, not parts of words.
  • False (MatchWildcards): True would enable wildcards.
  • False (MatchSoundsLike): True would match words that sound like the search text.
  • False (MatchAllWordForms): True would find all forms of the word.
  • True (Forward): search toward the end of the document.
  • 1 (Wrap): wdFindContinue, which continues the search from the other end of the document when the end is reached.
  • False (Format): True would make the search match formatting in addition to, or instead of, the text.
  • newstr (ReplaceWith): the replacement text.
  • 2 (Replace): the number of replacements, where 0 means no replacement, 1 means replace only the first match, and 2 means replace all matches.
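
A long run of positional True/False values is easy to get wrong. As a rough sketch, the call can be made more readable by naming the numeric constants and building the argument tuple with keyword options. The helper `find_execute_args` and the `WD_*` names are hypothetical and not part of win32com; the numeric values are those of the WdFindWrap and WdReplace enumerations.

```python
# Values of the WdFindWrap and WdReplace enumerations
WD_FIND_STOP, WD_FIND_CONTINUE, WD_FIND_ASK = 0, 1, 2
WD_REPLACE_NONE, WD_REPLACE_ONE, WD_REPLACE_ALL = 0, 1, 2


def find_execute_args(find_text, replace_with, *,
                      match_case=False, match_whole_word=False,
                      match_wildcards=False, match_sounds_like=False,
                      match_all_word_forms=False, forward=True,
                      wrap=WD_FIND_CONTINUE, match_format=False,
                      replace=WD_REPLACE_ALL):
    """Hypothetical helper: return the first eleven positional arguments
    of Find.Execute in the order the method expects them."""
    return (find_text, match_case, match_whole_word, match_wildcards,
            match_sounds_like, match_all_word_forms, forward, wrap,
            match_format, replace_with, replace)


args = find_execute_args("hello", "world", match_case=True)
print(args)
# ('hello', True, False, False, False, False, True, 1, False, 'world', 2)
```

The resulting tuple can then be unpacked into the call, for example `app.Selection.Find.Execute(*args)`.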

For the full reference, see the official documentation:

https://learn.microsoft.com/en-us/office/vba/api/word.find.execute
