In this quick post I am using the function demonstrated in this post. I have wrapped it into a module and used it in a notebook to read a MS Project file, then output the results of the flattened MS Project to excel workbooks which I then formatted using a quick and dirty self made excel formatting module (code can be found here).
This is an example of a function I use a lot. The scenario is that many team members that need to see their tasks in MS Project either do not have MS Project and/or don’t want to use MS Project but are quite happy to use excel spreadsheets. I use this function, alongside a couple of others that I will publish later, to “Flatten” an MS Project file, place the contents in a Python Pandas DataFrame, manipulate the Pandas DataFrame to get subsets of tasks I want to publish and output these to excel (typically) or to word or PDF. By “Flatten” I mean to collapse the summary tasks above a single to one line as follows:
MS Project File
…to Pandas Table….
The following Jupyter Notebook performed this transformation:
It looks like a big file but only because I have a number of comments added. Let’s break it down.
I won’t go over the first three notebook cells, I would be repeating earlier blogs on using win32com and generation of constants.
Let’s get to the function “create_project_data_frame” and the points to note:
I have explained these in the notebook above
Initialise some variables to used by the function
projectDataFrame = pd.DataFrame(columns=headers)
Creates an empty Pandas DataFrame with column headers (from the KeyWord “header”). There are some mandatory headers: I see these as outputs required from any Project file.
Note I have used UniqueID rather than task ID. That’s because they never change
This is the header where the summary level tasks to the task being output are added e.g. “Level 1 > Level 2 > Level 3”
Name of the task
Other headers can be added but you must use the field names. IN the example above I added
summary_tasks_to_task = 
This list stores the current summary level e.g. [“Level 1”, “Level 2”, “Level 3]
Tasks is a collection object from the project application. We use it to hold all the tasks in the projects that we then iterate over.
We then iterate over the collection of tasks with “for t in task_collection:“
We only want to capture information on tasks, not summary tasks. “if (not t.Summary) & ~(t.UniqueID in UniqueIDs_to_Ignore). On occassions I need to ignore some tasks. The function is provide a list of UniqueIDs to ignore, hence “~(t.UniqueID in UniqueIDs_to_Ignore)“.
This steps through the TaskDependencies collection and appends them to a list called dep
Similarly, a task can have many resources associated with it and this routine iterates through the object “Assignments” and appends the resources to a list called “res“
Sometimes tasks mid way through a project have no summary tasks. Not good, but it happens. This routine clears the summary_task_to_task list to empty
This code then starts the creation of a temporary list of strings that will create a task row in the pandas dataframe:
This code then iterates over the list of headers requested and appends the values to the temp list above. The last bit of code appends the string to the pandas dataframe
The next code handles what to do with a task that is a summary task (t.Summary). If the summary task is at a level greater than the previous is appended to the list summary_tasks_to_task, if it is smaller the last summary task is popped from the list.
The last bit of code in this notebook cell sets the index of the dataframe to the UniqueID and then converts the Finish and Start columns to datatime types (rather than text). The dataframe is then returned.
The last cell create the headers list, creates is of UniqueIDs to be ignored and then calls the routine above and assigns the returned Pandas DataFrame to the variable “frame“
Once you have a Pandas data it is easy to manipulate the data and create alternative dataframes with subsets of data that can be output to excel spreadsheets. For example I produce filtered dataframe by resource name then create a workbook for each resource that contains only their tasks. It is possible to create separate worksheets for each day or week or month.