In this tutorial, we will make a simple command-line program that takes a .docx file path and the words that need replacing.
We start with the imports.
The re library is essential here because we can use its sub() function to replace matches of a pattern with other text in a given string.
We also need the sys module so we can get the command-line arguments with sys.argv.
Last but not least, we import the Document class from docx so we can work with Word files. The package has to be installed first with:
$ pip install python-docx
Let's get started:
# Import re for regex functions
import re
# Import sys for getting the command line arguments
import sys
# Import docx to work with .docx files.
# Must be installed: pip install python-docx
from docx import Document
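As a quick illustration of what re.sub() does before we apply it to a document, here is a minimal sketch on a plain string. The sample sentence is made up for this example and is not taken from the test document.

```python
import re

# A made-up sample sentence, used only for illustration
text = "TCP uses a SYN packet; SYN starts the handshake."

# re.sub(pattern, replacement, string) returns a new string in which
# every match of the pattern is replaced; the original string is unchanged
result = re.sub("SYN", "TEST", text)
print(result)  # TCP uses a TEST packet; TEST starts the handshake.
```

Note that the first argument is a regular expression, so characters such as "." or "*" in a search word have a special meaning.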
Next, we get to the command-line arguments. We want to check that the inputs are valid.
If the sys.argv list is shorter than three items, we know that the user didn't provide enough information. The first item is always the path of the Python script itself. The second one should be the path of the file in which the text will be replaced.
The rest of the arguments are pairs of the form text=replacewith, which tell us what to replace with what. That's what we check in the for loop.
Finally, we save the file path to a variable, so we don't have to type out sys.argv[1] every time.
# Check if command-line arguments are passed.
if len(sys.argv) < 3:
    print('Not enough arguments were supplied')
    sys.exit()
# Check if replacers are in a valid schema (exactly one '=' per argument)
for replaceArg in sys.argv[2:]:
    if len(replaceArg.split('=')) != 2:
        print('Faulty replace argument given')
        print('-> ', replaceArg)
        sys.exit()
# Store file path from CL Arguments.
file_path = sys.argv[1]
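The same validation can also be written as a small function that returns the parsed pairs, which makes it easier to try out without running the whole script. This is only a sketch: parse_replacements is a hypothetical helper name, not part of the original program, and it raises an error instead of calling sys.exit().

```python
def parse_replacements(args):
    """Turn ['old=new', ...] into a list of (old, new) tuples.

    Hypothetical helper for illustration; the tutorial script does
    this check inline on sys.argv[2:].
    """
    pairs = []
    for arg in args:
        parts = arg.split('=')
        # A valid argument contains exactly one '='
        if len(parts) != 2:
            raise ValueError(f'Faulty replace argument given: {arg}')
        pairs.append((parts[0], parts[1]))
    return pairs


print(parse_replacements(['SYN=TEST', 'Linux=Windows']))
# [('SYN', 'TEST'), ('Linux', 'Windows')]
```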
If the file path ends with .docx, we know we can use the docx module. We first create a new Document object from our file path. Then we loop over the replacement arguments.
For each replacement, we loop through the document's paragraphs and, inside each paragraph, through its runs. Runs are the spans of text within a paragraph that share the same formatting. We replace the text in each run and finally save the document with the save() method.
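One consequence of replacing text run by run is that a word split across two runs (for example, because part of it is formatted differently) is never seen as a whole. The sketch below uses plain Python strings as stand-ins for run texts, so it runs without python-docx; the sample strings and the way the word is split are assumptions made for illustration.

```python
import re

# Plain strings standing in for the text of three runs in one paragraph.
# Assumption: "Linux" is split because "Lin" has different formatting.
run_texts = ["I use ", "Lin", "ux every day."]

# Replacing run by run: no single run contains "Linux"
replaced_runs = [re.sub("Linux", "Windows", t) for t in run_texts]
print("".join(replaced_runs))  # I use Linux every day.

# Matching against the joined paragraph text does find the word
joined = "".join(run_texts)
print(re.sub("Linux", "Windows", joined))  # I use Windows every day.
```

Working on the joined text finds more matches, but writing it back into the document would need extra care to keep each run's formatting, which is why the per-run approach is the simpler choice here.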
if file_path.endswith('.docx'):
    doc = Document(file_path)
    # Loop through replacer arguments
    occurences = {}
    for replaceArgs in sys.argv[2:]:
        # split the word=replacedword into a list
        replaceArg = replaceArgs.split('=')
        # initialize the number of occurrences of this word to 0
        occurences[replaceArg[0]] = 0
        # Loop through paragraphs
        for para in doc.paragraphs:
            # Loop through runs (style spans)
            for run in para.runs:
                # if there is text on this run, replace it
                if run.text:
                    # get the replacement text (all matches in the run are replaced)
                    replaced_text = re.sub(replaceArg[0], replaceArg[1], run.text)
                    if replaced_text != run.text:
                        # if the replaced text is not the same as the original,
                        # replace the text and increment the counter
                        # (once per changed run)
                        run.text = replaced_text
                        occurences[replaceArg[0]] += 1
    # print the number of occurrences of each word
    for word, count in occurences.items():
        print(f"The word {word} was found and replaced {count} times.")
    # make a new file name by adding "_new" to the original file name
    new_file_path = file_path.replace(".docx", "_new.docx")
    # save the new docx file
    doc.save(new_file_path)
else:
    print('The file type is invalid, only .docx are supported')
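Two details of the script are worth a closer look: the counter goes up once per changed run rather than once per match, and each search word is treated as a regular expression. The sketch below shows one possible way to handle both, using a made-up string as a stand-in for run.text: re.subn() also returns the number of substitutions, and re.escape() makes a search term match literally.

```python
import re

# Made-up text standing in for the contents of a single run
run_text = "SYN, SYN-ACK, then ACK. Version 1.0 vs 1x0."

# re.subn returns a tuple: (new_string, number_of_substitutions)
new_text, n = re.subn("SYN", "TEST", run_text)
print(new_text)  # TEST, TEST-ACK, then ACK. Version 1.0 vs 1x0.
print(n)         # 2

# Unescaped, "." matches any character, so "1.0" also matches "1x0"
_, n_regex = re.subn("1.0", "2.0", run_text)
# Escaped, only the literal text "1.0" matches
_, n_literal = re.subn(re.escape("1.0"), "2.0", run_text)
print(n_regex, n_literal)  # 2 1
```

Whether to escape the search words depends on what you want: leaving them unescaped lets users pass real patterns on the command line, while escaping is safer for plain words.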
Let's run it on this document file:
$ python docx_text_replacer.py doc.docx SYN=TEST Linux=Windows TCP=UDP
The word SYN was found and replaced 5 times.
The word Linux was found and replaced 1 times.
The word TCP was found and replaced 1 times.
I wanted to replace "SYN" with "TEST", "Linux" with "Windows", and "TCP" with "UDP" in the document, and it was successful!
Excellent! You have successfully created a text replacement program for .docx files in Python! See how you can add more features to this program, such as support for more file formats.
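As a starting point for supporting another format, here is a minimal sketch for plain-text files. The function name replace_in_text_file is hypothetical, and the sketch assumes the path ends with .txt, the file is UTF-8 encoded, and the replacements are already parsed into (old, new) pairs.

```python
import re


def replace_in_text_file(file_path, replacements):
    """Apply (old, new) pairs to a .txt file and save a *_new.txt copy.

    Hypothetical helper for illustration; assumes file_path ends
    with '.txt' and the file is UTF-8 encoded.
    """
    with open(file_path, encoding='utf-8') as f:
        text = f.read()
    # Apply each replacement to the whole file content
    for old, new in replacements:
        text = re.sub(old, new, text)
    # Build the output name by inserting "_new" before the extension
    new_file_path = file_path[:-len('.txt')] + '_new.txt'
    with open(new_file_path, 'w', encoding='utf-8') as f:
        f.write(text)
    return new_file_path
```

In the main script, this could be called from an extra branch that checks file_path.endswith('.txt') before the final else.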
Learn also: How to Convert PDF to Docx in Python.
Happy coding ♥