In this quick post I am using the function demonstrated in this post. I have wrapped it into a module and used it in a notebook to read a MS Project file, then output the results of the flattened MS Project to excel workbooks which I then formatted using a quick and dirty self made excel formatting module (code can be found here).
This is an example of a function I use a lot. The scenario is that many team members that need to see their tasks in MS Project either do not have MS Project and/or don’t want to use MS Project but are quite happy to use excel spreadsheets. I use this function, alongside a couple of others that I will publish later, to “Flatten” an MS Project file, place the contents in a Python Pandas DataFrame, manipulate the Pandas DataFrame to get subsets of tasks I want to publish and output these to excel (typically) or to word or PDF. By “Flatten” I mean to collapse the summary tasks above a single to one line as follows:
MS Project File
…to Pandas Table….
The following Jupyter Notebook performed this transformation:
It looks like a big file but only because I have a number of comments added. Let’s break it down.
I won’t go over the first three notebook cells, I would be repeating earlier blogs on using win32com and generation of constants.
Let’s get to the function “create_project_data_frame” and the points to note:
I have explained these in the notebook above
Initialise some variables to used by the function
projectDataFrame = pd.DataFrame(columns=headers)
Creates an empty Pandas DataFrame with column headers (from the KeyWord “header”). There are some mandatory headers: I see these as outputs required from any Project file.
Note I have used UniqueID rather than task ID. That’s because they never change
This is the header where the summary level tasks to the task being output are added e.g. “Level 1 > Level 2 > Level 3”
Name of the task
Other headers can be added but you must use the field names. IN the example above I added
summary_tasks_to_task = 
This list stores the current summary level e.g. [“Level 1”, “Level 2”, “Level 3]
Tasks is a collection object from the project application. We use it to hold all the tasks in the projects that we then iterate over.
We then iterate over the collection of tasks with “for t in task_collection:“
We only want to capture information on tasks, not summary tasks. “if (not t.Summary) & ~(t.UniqueID in UniqueIDs_to_Ignore). On occassions I need to ignore some tasks. The function is provide a list of UniqueIDs to ignore, hence “~(t.UniqueID in UniqueIDs_to_Ignore)“.
This steps through the TaskDependencies collection and appends them to a list called dep
Similarly, a task can have many resources associated with it and this routine iterates through the object “Assignments” and appends the resources to a list called “res“
Sometimes tasks mid way through a project have no summary tasks. Not good, but it happens. This routine clears the summary_task_to_task list to empty
This code then starts the creation of a temporary list of strings that will create a task row in the pandas dataframe:
This code then iterates over the list of headers requested and appends the values to the temp list above. The last bit of code appends the string to the pandas dataframe
The next code handles what to do with a task that is a summary task (t.Summary). If the summary task is at a level greater than the previous is appended to the list summary_tasks_to_task, if it is smaller the last summary task is popped from the list.
The last bit of code in this notebook cell sets the index of the dataframe to the UniqueID and then converts the Finish and Start columns to datatime types (rather than text). The dataframe is then returned.
The last cell create the headers list, creates is of UniqueIDs to be ignored and then calls the routine above and assigns the returned Pandas DataFrame to the variable “frame“
Once you have a Pandas data it is easy to manipulate the data and create alternative dataframes with subsets of data that can be output to excel spreadsheets. For example I produce filtered dataframe by resource name then create a workbook for each resource that contains only their tasks. It is possible to create separate worksheets for each day or week or month.
My initial months working with win32com saw me searching and reading MSDN Object Model References, reading other peoples win32com python code, and trawling VBA snippets whilst bashing away in a python IDE trying to get the function I was after. So here are my tips in reading the MSDN object references with the intent of replicating the functionality using win32com in Python.
The structure is roughly:
Let’s start simple and working from where this article left off but with constants.
Step 1: Import modules and constants
import win32com.client from MSOconstants.MSO import constants as MSO from MSOconstants.MSPPT import constants as MSPPT
Step 2: Create and set-up references to a powerpoint application, presentation, slide, add a diamond and save the presentation.
So far so good. I have a powerpoint slide with diamond thus:
Let’s say I want to add a bar to the slide that represents timeline between two dates such that I can place myDiamond on the timeline at a place that represents that point in time along the timeline; say 1/4 along the timeline.
I need to:
Determine Slide Dimension
Draw a box the width of the slide using the slide dimension
place myDiamond 1/4 away along the box
Step 3: Determine Slide Dimension
A slide is dimensioned in pixels. I want to get the dimensions of my slide. I google “msdn powerpoint slide width” and get this link as the first entry.
So we need an object that represents the presentation “Application.ActivePresentation” so that we can call the PageSetup objects SlideWidth property. In our case we have a reference to the presentation in the variable “presentation“. So we call the presentation.PageSetup.SlideWidth property
for me it is 960 pixels.
Note: we can’t use “with” statements in Python.
Step 4: Draw a box the width of the slide using the slide dimension
I google “msdn powerpoint add shape” and get this link first
The snippet below creates four python module for each of Microsoft Office, Excel, Word and Powerpoint in the “MSOconstants” directory of my Jupyter Lab notebook location (I’m using Jupyter Lab notebooks. You can reference any folder location of your choice).
Note: if you have Microsoft Project then you can create the equivalent by referencing “Microsoft Office 14.0 Object Library” (14.0 is the version I happen to have, you may have a different version). This is important as in a later blog I will outlining how to use win32com with MS Project.
Here we are importing the key module that will allow us to access Microsoft Office applications through the COM object.
But what is a COM object? It stands for Component Object Model, which, to me at least, means it gives , through an interface (COM object), to the Microsoft Office applications and their associated methods and attributes.
This blogging business is hard work, at least harder than I thought is was going to be. It is only once I started writing my first blog page that I realised how much I knew, took for granted and/or I felt I needed to explain. I find it helps to have a picture of the reader in mind. My reader is someone like myself; knows a little something about the Python language, hunts around for quick fixes (and gets frustrated that they can’t remember them from one week to the next) and believes Python and general script automation should be able to address some of the long winded or repetitive tasks encountered work and, specifically in my case, a project manager.
I have no special talents. I am only passionately curious. — Einstein