This page lists smaller projects I've done to showcase my experience using different tools or techniques. For the most part, these are not complete stand-alone projects but examples of data visualization or programming using open data. The following section (in yellow) links to a summary of several larger projects I've done over the years. Below that are links to the individual projects.
This page shows a series of example projects because I can only describe the larger projects I've done at work. These small analytics-oriented projects show some proficiency with a set of languages (R, Python), platforms (cloud or data), and other technologies.
Should Kitchener-Waterloo implement speed cameras at all schools in the Region? I looked at traffic collision data from the City of Kitchener and found little evidence that speed cameras would provide much value. First, relatively few accidents involving pedestrians or cyclists occur near schools during school hours (an average of roughly 3.7 non-fatal accidents per year). Second, the majority of accidents in Kitchener have occurred at intersections near downtown schools or Williamsburg. I suggest that traditional traffic calming measures (speed bumps, crosswalks, and crossing attendants) would give citizens better safety and value.
This example data visualization uses R leaflet and open data from the Canadian government detailing the Temporary Foreign Worker Program (TFWP). Over the past few years, the TFWP has been controversial: the UN recently criticized it for creating conditions ripe for modern slavery. I take no position on the matter, but present the data from Q1 2024 in a visualization so you can make up your own mind.
First, a visual analysis of the wages paid to executives and employees at Ontario colleges. The data is from the Ontario Sunshine List.
Second, a more in-depth analysis based on inferential and predictive machine learning models. The models and visualizations are done in R.
I wrote this Python application for Google Cloud as a Docker app that can be run as a Cloud Run service. It authenticates to your Google services (Gmail or Google Drive) and reads a series of documents or emails you specify. It then conducts a text analysis and clusters the documents into thematic groups. The program generates a series of detailed visualizations and data files that can be downloaded and used for further analysis.
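To give a feel for the clustering step described above, here is a minimal sketch using only the Python standard library. It is not the code from the application: it assumes TF-IDF weighting with cosine similarity and a simple greedy "leader" grouping, and the function names, sample documents, and threshold value are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so no weight is zero.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_documents(docs, threshold=0.2):
    """Greedy leader clustering: join the first group whose first member
    is similar enough, otherwise start a new group.
    The threshold is an illustrative assumption, not a tuned value."""
    vectors = tfidf_vectors(docs)
    groups = []  # each group is a list of document indices
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine(vector, vectors[group[0]]) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])
    return groups


# Hypothetical sample documents, for illustration only.
sample_docs = [
    "invoice payment due invoice",
    "payment invoice overdue",
    "hiking trail mountain",
    "mountain trail map",
]
print(group_documents(sample_docs))
```

A real pipeline would likely use a library tokenizer and a proper clustering algorithm; the greedy grouping here just keeps the example short and deterministic.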
This is a small part of an old research project. In this Python Jupyter notebook, I examine and visualize telecommunications patents from the USPTO. This is interesting because it shows the growth of international players, especially Chinese firms (Huawei in particular), over the past 15 years.
This series of Jupyter notebooks examines changes in people's perception of Huawei. This too was part of a research project.
This shows the economic relationships between different industries in New Zealand and China. The Power BI visualization allows for dynamic exploration of the graph.
I often work with networks. I thought it would be fun to create a dynamic visualization of the economic relationships between industries in New Zealand and China. The visualization uses OECD data from the international input-output tables to show those ties.
You can see the dynamic visualization and read more about my process here.
This project shows how different kinds of communication goods, ranging from newspapers to mobile phones, were adopted in the USA and Canada. A simple Bass model is used to forecast sales for each good. A discussion shows how much innovative consumers contribute to sales relative to consumers who imitate others in their purchases. The historical data shows that modern consumer goods like personal computers and mobile phones have stronger network effects encouraging consumers to purchase than older communication technologies like radio, newspapers, or television.
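For readers unfamiliar with the Bass model mentioned above, here is a minimal sketch of its standard discrete-time form. In each period, sales equal (p + q × cumulative adopters / m) × (m − cumulative adopters), where p is the coefficient of innovation, q is the coefficient of imitation, and m is the market potential. The parameter values below are illustrative assumptions, not estimates from the project, and the function name is hypothetical.

```python
def bass_forecast(p, q, m, periods):
    """Return per-period sales under a discrete-time Bass diffusion model.

    p: coefficient of innovation (adoption independent of other adopters)
    q: coefficient of imitation (adoption driven by existing adopters)
    m: market potential (total eventual adopters)
    """
    cumulative = 0.0
    sales = []
    for _ in range(periods):
        # Innovators adopt at rate p; imitators adopt in proportion
        # to the share of the market that has already adopted.
        new_adopters = (p + q * cumulative / m) * (m - cumulative)
        sales.append(new_adopters)
        cumulative += new_adopters
    return sales


# Illustrative (assumed) parameters: imitation outweighs innovation,
# so sales rise to a peak before declining.
with_imitation = bass_forecast(p=0.03, q=0.38, m=1000, periods=20)

# With no imitation (q = 0), sales only decline from the first period.
innovation_only = bass_forecast(p=0.03, q=0.0, m=1000, periods=20)

for period, (a, b) in enumerate(zip(with_imitation, innovation_only), start=1):
    print(f"period {period:2d}: with imitation {a:7.1f}, innovation only {b:7.1f}")
```

Comparing the two series shows the role of imitators: a larger q relative to p delays the sales peak and makes it higher, which is the pattern associated with goods that have strong network effects.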
I have also written a series of technical reports while working as a contractor and during my Master's co-op at Statistics Canada.
Follow the link for descriptions of some of my previous work-related data science projects.