Python and MS Word and win32com

Overview

I once wrote a script that automatically reformats Word documents. The interesting part is opening and editing the Word file, which is not very easy, but I found a useful Python library for it: win32com. There is also a nice book, Python Programming On Win32, which I recommend reading. That is why I chose Python as the main language for this task.

Word with Python

Once the Word file is open, you can edit it. Note that you cannot open the same file in several instances at once, because Word locks a document while it is open.

from win32com.client import Dispatch

myWord = Dispatch('Word.Application')
myWord.Visible = 1  # comment out for production

myWord.Documents.Open(working_file)  # open file

# ... doing something

myWord.ActiveDocument.SaveAs(new_file)
myWord.Quit() # close Word.Application
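
If the editing step raises an exception, the snippet above never reaches Quit() and a Word process can stay behind in the background. Here is a minimal sketch of one way to guard against that, assuming the application object is created with Dispatch as above and passed in. The names edit_and_save and edit are hypothetical helpers, not part of win32com:

```python
def edit_and_save(word_app, working_file, new_file, edit):
    """Open working_file, apply edit(document), save as new_file, then quit Word.

    word_app is expected to be the object returned by
    Dispatch('Word.Application'); edit is a hypothetical callback that
    receives the opened document and modifies it.
    """
    try:
        document = word_app.Documents.Open(working_file)
        edit(document)
        document.SaveAs(new_file)
    finally:
        # Runs even if Open, edit or SaveAs raised, so Word is always closed.
        # 0 = wdDoNotSaveChanges, which avoids a "save changes?" prompt.
        word_app.Quit(0)
```

Called as edit_and_save(myWord, working_file, new_file, my_edit), this replaces the open / edit / save / quit sequence with a single call.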

Alternatively, you can use a second approach, which generates early-bound wrappers for the Word object model:

from win32com import client

app = client.gencache.EnsureDispatch("Word.Application")
app.Documents.Open(working_file) # open file
app.ActiveDocument.ActiveWindow.View.Type = 3  # 3 = wdPrintView (print layout); prevents Word from opening itself
app.Quit()
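
The value 3 assigned to View.Type is a magic number from Word's WdViewType enumeration. As a small sketch, the values can be given names so the intent is readable. The enum and the set_print_view helper below are illustrative names of my own. After EnsureDispatch, the same value should also be available as client.constants.wdPrintView.

```python
from enum import IntEnum


class WdViewType(IntEnum):
    """Subset of Word's WdViewType enumeration."""
    NORMAL = 1         # wdNormalView
    OUTLINE = 2        # wdOutlineView
    PRINT = 3          # wdPrintView (print layout)
    PRINT_PREVIEW = 4  # wdPrintPreview
    WEB = 6            # wdWebView
    READING = 7        # wdReadingView


def set_print_view(document):
    """Switch the document's active window to print layout view."""
    document.ActiveWindow.View.Type = int(WdViewType.PRINT)
```

With this in place, set_print_view(app.ActiveDocument) does the same as the assignment with the literal 3.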

Format other document

To format multiple Word documents, I use one format file and apply it to the others in a loop, picking up the files according to the date of their last update. I open both files (the format file and the file that should be formatted) and compare them. If there is a difference, I overwrite the file with a properly formatted version.

orig_document = application.Documents.Open(orig_full_path)
format_document = application.Documents.Open(format_full_path)

compare_documents = application.CompareDocuments(orig_document,
                                                 format_document,
                                                 CompareFormatting=True)  # also compare formatting (the default)
application.ActiveDocument.ActiveWindow.View.Type = 3  # 3 = wdPrintView; prevents Word from opening itself
application.ActiveDocument.SaveAs(cmp_file)  # temporary file
application.Documents.Close()  # close all open documents

And when I need to apply the format:

if compare_documents:
    application.Documents.Open(orig_full_path)  # reopen file
    application.ActiveDocument.CopyStylesFromTemplate(format_full_path)  # apply format
    application.ActiveDocument.SaveAs(new_file)  # save to new
    application.Documents.Close()
else:
    application.Documents.Close()
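
The loop that picks up files by the date of their last update is not shown above. A minimal sketch of that selection step, assuming the documents are .docx files in a single folder, might look like this. The function name and parameters are illustrative only:

```python
from datetime import datetime, timedelta
from pathlib import Path


def recently_updated_documents(folder, since, pattern="*.docx"):
    """Return documents in folder modified at or after since, sorted by path."""
    cutoff = since.timestamp()
    return sorted(
        path
        for path in Path(folder).glob(pattern)
        # Skip Word's lock files (~$name.docx), which exist while a document is open.
        if path.is_file()
        and not path.name.startswith("~$")
        and path.stat().st_mtime >= cutoff
    )
```

For example, recently_updated_documents(folder, datetime.now() - timedelta(days=1)) would return the documents changed in the last day, and each path can then be passed to the compare and format steps.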

After all operations I clean up all temporary files.
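
The clean-up step could be sketched as follows, assuming the script keeps a list of the temporary files it created (such as cmp_file). The helper name is hypothetical:

```python
from pathlib import Path


def remove_temporary_files(paths):
    """Delete the given files and return the list of paths actually removed."""
    removed = []
    for path in map(Path, paths):
        try:
            path.unlink()
        except FileNotFoundError:
            # Already gone, nothing to clean up.
            continue
        removed.append(path)
    return removed
```

This should run only after the documents have been closed, because a file that Word still holds open usually cannot be deleted on Windows.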

OneDrive communication

All of these files were synced with OneDrive cloud storage, and Python offers some useful operations here as well. First, you need to find out where the application (.exe file) is located. You also need to load the configuration used for authentication and authorization.

import logging


def find_one_drive_path():
    # OneDriveConfiguration is a custom class of this script (not shown here).
    one_config = OneDriveConfiguration('configuration.yml')
    logging.info("Check configuration OneDrive path...")
    logging.info(one_config.location_of_onedrive_app)
    logging.info(one_config.location_of_directory)
    return one_config
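
The OneDriveConfiguration class itself is not shown. As a rough sketch of what such a configuration object might hold, assuming only the two attributes logged above, here is a small dataclass. The class name OneDriveSettings and the from_mapping constructor are hypothetical. In the real script the values come from configuration.yml, for example via a YAML parser.

```python
from dataclasses import dataclass
from pathlib import Path

REQUIRED_KEYS = ("location_of_onedrive_app", "location_of_directory")


@dataclass(frozen=True)
class OneDriveSettings:
    """Hypothetical stand-in for the configuration object used above."""
    location_of_onedrive_app: Path  # path to the OneDrive .exe file
    location_of_directory: Path     # path to the synced OneDrive folder

    @classmethod
    def from_mapping(cls, data):
        """Build settings from a dict, e.g. the parsed contents of a YAML file."""
        missing = [key for key in REQUIRED_KEYS if not data.get(key)]
        if missing:
            raise ValueError("missing configuration keys: " + ", ".join(missing))
        return cls(
            location_of_onedrive_app=Path(data["location_of_onedrive_app"]),
            location_of_directory=Path(data["location_of_directory"]),
        )
```

Failing early on missing keys gives a clearer error than a later failure when the path is first used.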

If the application is not running or has other problems, Python can help by starting the process for us. In this scenario, that is the OneDrive application.

import logging
import subprocess

import psutil


def check_if_process_running(process_name):
    for process in psutil.process_iter():
        try:
            # Check if process name contains the given name string.
            if process_name.lower() in process.name().lower():
                return True
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass
    return False


def start_the_process(param):
    try:
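        p = subprocess.Popen(param)  # start the application without waiting for it
        psutil.Process(p.pid)  # raises psutil.NoSuchProcess if it has already exited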
    except Exception as e:
        logging.error(e)
        raise SystemExit
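
The two functions above can be combined into a single "start it only if needed" step. The sketch below is one possible way to do that and is not taken from the original script. The check and start functions are passed in as arguments, and the timeout and polling interval are assumed values.

```python
import time


def ensure_process_running(process_name, command, is_running, start,
                           timeout=10.0, poll_interval=0.5):
    """Start command unless process_name is already running.

    Returns False if the process was already running and True if it was
    started and then seen running. Raises TimeoutError if it never appears.
    """
    if is_running(process_name):
        return False
    start(command)
    deadline = time.monotonic() + timeout
    while True:
        if is_running(process_name):
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{process_name} did not appear within {timeout} s")
        time.sleep(poll_interval)
```

With the functions from this section, a call could look like ensure_process_running("OneDrive", [str(one_config.location_of_onedrive_app)], check_if_process_running, start_the_process).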

Once we know the OneDrive folder location, we can build absolute paths to the files inside it and pass them to Word or another application.
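
Building such a path could be sketched like this, assuming the OneDrive folder comes from the configuration shown earlier. The helper name is hypothetical. It also rejects paths that would end up outside the OneDrive folder.

```python
from pathlib import Path


def onedrive_document_path(onedrive_directory, *relative_parts):
    """Return an absolute path (as str) to a file inside the OneDrive folder."""
    base = Path(onedrive_directory).resolve()
    target = base.joinpath(*relative_parts).resolve()
    # Refuse paths such as "../other.docx" that escape the OneDrive folder.
    if target != base and base not in target.parents:
        raise ValueError(f"{target} is outside {base}")
    return str(target)
```

The resulting string can be passed directly to Documents.Open, for example application.Documents.Open(onedrive_document_path(one_config.location_of_directory, "reports", "summary.docx")). The folder and file names in that call are made up for illustration.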

Michal Slovík
Java developer and Cloud DevOps

My job interests include DevOps, Java development and Docker / Kubernetes technologies.