Welcome! I'm Jon MacKay, a data scientist and analytics consultant specializing in transforming complex business challenges into actionable insights.
What You'll Find Here
This site serves as both a demonstration of my technical expertise and a window into my problem-solving approach. Each case study showcases real-world applications of data science techniques I've employed to drive business value.
For Potential Clients
Browse through my portfolio to see how similar analytical approaches could benefit your business. Each example includes:
- The business context and challenge
- Technical methodology and tools used
- Measurable outcomes and insights gained
- Potential applications across industries
For Employers
This portfolio demonstrates my hands-on experience with:
- Advanced analytics and statistical modeling
- Data visualization and storytelling
- Programming proficiency (Python, R, SQL)
- End-to-end project execution
Featured Work
Below you'll find a curated selection of technical examples that showcase my approach to data analysis. Each project includes detailed documentation and code samples, demonstrating both technical proficiency and business acumen.
Should Kitchener-Waterloo implement speed cameras at all Public and Catholic Board schools in the Region? I looked at traffic collision data from the City of Kitchener and found little evidence that speed cameras would provide much value. There are relatively few accidents involving pedestrians near schools during school hours (an average of about 4.7 accidents yearly). In Kitchener, the majority of these accidents have occurred at intersections near downtown schools or Williamsburg. I suggest that traditional traffic calming measures (speed bumps, crosswalks, and crossing attendants) provide citizens with better safety and value.
Kitchener and Waterloo are considering a dramatic change to our roads: automated speed cameras at every school, creating "community safety zones" with 30 km/h speed limits. Any speed above that, even a single kilometer over, could earn you a ticket in the mail. In this post, I focus on one question: what will this do to our driving patterns?
This example data visualization uses R leaflet and open data from the Canadian government detailing the Temporary Foreign Worker Program (TFWP). Over the past few years, the TFWP has been controversial: the UN recently criticized it for creating conditions ripe for modern slavery. I take no position on the matter, but present the data from Q1 2024 in a visualization so you can make up your own mind.
First, a visual analysis of the wages paid to executives and employees at Ontario colleges. The data is from the Ontario Sunshine List.
Second, a more in-depth analysis based on inferential and predictive machine learning models. The models and visualizations are done in R.
I wrote this Python application for Google Cloud. It is packaged as a Docker app that can be run as a Cloud Run service. It authenticates to your Google services (Gmail or Google Drive) and reads the documents or emails you specify. It then conducts a text analysis and clusters the documents into thematic groups. The program generates a series of detailed visualizations and data files that can be downloaded and used for further analysis.
This is a small part of an old research project. In this Python Jupyter notebook, I examine and visualize telecommunications patents from the USPTO. This is interesting because it shows the growth of international players, especially Chinese firms (Huawei, in particular), over the past 15 years.
A series of Jupyter notebooks examining changes in people's perception of Huawei. This too was part of a research project.
This shows the economic relationships between different industries in New Zealand and China. The Power BI visualization allows for dynamic exploration of the graph.
I often work with networks. I thought it would be fun to create a dynamic visualization of the economic relationships between industries in New Zealand and China. The visualization uses OECD data from the international input-output tables to show those ties.
You can see the dynamic visualization and read more about my process here.
This project shows how different kinds of communication goods, ranging from newspapers to mobile phones, were adopted in the USA and Canada. A Bass model is used to forecast sales for each good. The discussion compares how much innovative consumers contribute to sales relative to consumers who imitate others in their purchases. The historical data shows that modern consumer goods like personal computers and mobile phones have strong network effects that encourage consumers to purchase, compared to other kinds of communication technologies like radio, newspapers, or television.
I have also written a series of technical reports while working as a contractor and during my Master's co-op at Statistics Canada.
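For readers unfamiliar with the Bass model, here is a minimal Python sketch of its standard closed form. It is not the code from the project. The function names and the parameter values (market size `m`, innovation coefficient `p`, imitation coefficient `q`) are illustrative assumptions, not estimates from the adoption data described above.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Fraction of eventual adopters who have adopted by time t.

    p is the coefficient of innovation (buying independently of others),
    q is the coefficient of imitation (buying because others have).
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_period_sales(m, p, q, periods):
    """New adoptions in each period 1..periods for a market of m adopters."""
    return [
        m * (bass_cumulative_fraction(t, p, q)
             - bass_cumulative_fraction(t - 1, p, q))
        for t in range(1, periods + 1)
    ]


def bass_peak_time(p, q):
    """Continuous time at which the adoption rate peaks (requires q > p)."""
    return math.log(q / p) / (p + q)


# Hypothetical parameter values, chosen only to illustrate the curve's shape.
sales = bass_period_sales(m=1000.0, p=0.03, q=0.38, periods=15)
peak_period = max(range(len(sales)), key=sales.__getitem__) + 1

for period, units in enumerate(sales, start=1):
    print(f"period {period:2d}: {units:7.1f}")
print(f"peak period: {peak_period}, analytic peak time: {bass_peak_time(0.03, 0.38):.2f}")
```

A larger `q` relative to `p` means imitation drives more of the sales, which produces the later, sharper peak associated with goods that have strong network effects.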
Follow the link for descriptions of some of my previous work-related data science projects.