Overview

I use an app called “Speedtest” more than any other, except perhaps social media. It lets you test how fast your internet is at any given moment, which should be obvious from the name.

An interesting feature of the app is that it stores a list of past tests and allows you to email a CSV (Comma Separated Values) file of all your past tests.

Well, I just learned how to import those things into Python and extract the delicious data from them. So of course that’s what I plan to do here.

Background:

I currently live in Wilmington, NC and our internet provider is Spectrum. The data points, while well over a hundred in number, were not taken in any scientific manner. More often than not I only measured when things were bad because I wanted to know it wasn’t my imagination. So there is some selection bias.

Also, Spectrum doesn’t accept speed results from third-party apps, so this isn’t intended to prove anything to the company about their service.

Project

Internet Data Speed Statistical Analysis

What I Did

Using Python, I imported and cleaned data from a standard CSV file, then graphed the result and ran some simple statistical calculations on it.
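For anyone curious what the import and clean step can look like, here is a minimal sketch using only the standard library. The column names (Date, ConnType, Download, Upload), the date format, and the sample rows are all assumptions made up for illustration; the real export may well be laid out differently, and this is not necessarily how my own script did it.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the shape of an exported file. The real column
# names and date format in the app's export may differ.
SAMPLE_CSV = """Date,ConnType,Download,Upload
2019-07-04 18:30,Wifi,23.5,11.2
2019-07-04 21:10,Wifi,0.8,0.4
2019-10-12 09:00,Cell,41.0,9.9
2019-10-13 09:00,Wifi,,
"""


def load_tests(handle):
    """Read speed test rows, skipping any without a usable download value."""
    rows = []
    for raw in csv.DictReader(handle):
        try:
            download = float(raw["Download"])
        except (TypeError, ValueError):
            continue  # blank or malformed reading: drop it
        rows.append({
            "when": datetime.strptime(raw["Date"], "%Y-%m-%d %H:%M"),
            "conn": raw["ConnType"].strip().lower(),
            "download_mbps": download,
        })
    return rows


tests = load_tests(io.StringIO(SAMPLE_CSV))
print(len(tests), "usable readings")
```

To read a real file instead of the sample string, pass an open file handle, for example `open("export.csv", newline="")`, where the filename is whatever you saved the emailed attachment as.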

Internet Speed Results from July 2019 to January 2020

The blue bars are results from my home wifi. The red bars are from other sources. I included that second data set separately because I spent a month in Wyoming last fall, and leaving a large gap in the graph made it look incomplete.

Plus I wanted to emphasize that I was getting better internet on the outskirts of Laramie than I was in Wilmington. Not a good look for Spectrum.

Darker bars indicate multiple readings on the same day, because the scale isn’t fine enough to show them as completely separate bars.

I also had to remove a single data point from my friend’s house with super internet. That one reading of over 300Mbps stretched the scale so much that the rest of the graph was unreadable.
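Dropping an outlier like that can be as simple as a threshold filter. This is a rough sketch with made-up readings; the 300 Mbps cutoff comes from the reading described above, while the variable names are just for illustration.

```python
# Made-up download speeds in Mbps, including one outlier.
readings_mbps = [23.5, 0.8, 312.0, 24.1, 18.7]

# Anything above this is treated as an outlier for plotting purposes.
CUTOFF_MBPS = 300.0

kept = [r for r in readings_mbps if r <= CUTOFF_MBPS]
dropped = [r for r in readings_mbps if r > CUTOFF_MBPS]

print("kept:", kept)
print("dropped:", dropped)
```

Keeping the dropped values in their own list makes it easy to mention them alongside the graph instead of silently losing them.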

The broad takeaway is that prior to January my internet had a rough cap of around 25Mbps and in January that cap nearly doubled.

This is because we discovered that Spectrum had neglected to upgrade our service when they increased the base speed of their lowest tier offering. So they were content to let us continue to pay full price for a lower quality product.

Nice…

Statistical Analysis

As you can see from the graphs the average (mean) and median values for download speed increased by a significant amount once the service was upgraded. So in general the speeds are greater now.

The main problem with the service is its variability (measured here by standard deviation). While the caps are higher, the speeds still frequently drop below 1Mbps, just as they did with the prior service. So the speed fluctuates in an extreme way and the service upgrade appears to have done little to fix that problem.
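To make the comparison concrete, here is a small sketch of the three numbers involved, computed with Python's built-in statistics module. The before and after lists are invented values, not my actual readings, and `summarize` is a hypothetical helper name.

```python
from statistics import mean, median, stdev


def summarize(speeds):
    """Return the mean, median, and sample standard deviation of a list."""
    return {
        "mean": mean(speeds),
        "median": median(speeds),
        "stdev": stdev(speeds),
    }


# Invented download speeds in Mbps, before and after a service upgrade.
before = [24.0, 22.0, 0.5, 25.0, 18.5]
after = [48.0, 45.0, 0.5, 47.0, 30.0]

for label, speeds in (("before", before), ("after", after)):
    stats = summarize(speeds)
    print(label, {name: round(value, 1) for name, value in stats.items()})
```

A higher mean and median with a standard deviation that stays large (or grows) is the pattern described above: faster on a good day, still just as erratic.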

Conclusions and Improvements

In conclusion, faster speeds are better and the upgrade in January clearly improved things in that area. However, the service is still unreliable, though it is hard to tell how unreliable simply from the data collected.

I would like to know what percent of the readings were below 1Mbps but I already admitted that the readings are likely skewed towards bad service since that was when I would take more tests. While finding out would be a simple implementation, that personal bias would skew the results.
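For the record, the calculation itself would look something like the sketch below. The readings are made up, `percent_below` is a hypothetical helper, and any number it produced from my real data would carry the same selection bias.

```python
def percent_below(speeds, threshold_mbps=1.0):
    """Return the percentage of readings strictly below the threshold."""
    if not speeds:
        raise ValueError("need at least one reading")
    slow = sum(1 for s in speeds if s < threshold_mbps)
    return 100.0 * slow / len(speeds)


# Made-up download speeds in Mbps.
sample = [24.0, 0.5, 0.9, 25.0]
print(percent_below(sample), "percent of readings under 1 Mbps")
```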

I think this would be a good task to revisit with a more scientific set of data. It could even be interesting to set up an automatic script that runs speed tests at regular intervals.
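Here is one rough shape such a script could take. It is only a sketch: `collect` and `fake_measure` are hypothetical names, and the actual speed measurement is left as a function you plug in, since how you run a real test (a command line tool, a library, etc.) is a separate decision. The demo uses a stand-in that always reports the same value.

```python
import csv
import os
import tempfile
import time
from datetime import datetime


def collect(measure, path, count, interval_s, sleep=time.sleep):
    """Call measure() count times, appending timestamped results to a CSV."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for i in range(count):
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp, measure()])
            f.flush()  # keep earlier rows on disk if a later test fails
            if i < count - 1:
                sleep(interval_s)


def fake_measure():
    """Stand-in for a real speed test; always reports 25.0 Mbps."""
    return 25.0


# Demo: three readings with no delay, written to a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "speeds.csv")
collect(fake_measure, demo_path, count=3, interval_s=0)

with open(demo_path, newline="") as f:
    demo_rows = list(csv.reader(f))
print(len(demo_rows), "rows written")
```

Sampling on a fixed schedule, rather than whenever the connection feels slow, is what would remove the selection bias in the current data.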

Lots of good ideas for the future…

Additional things I learned:

This was the first time that I had to use breakpoints and step through code to track down a bug in my Python. I had lots of experience doing so in Java/Eclipse but I had never done that in Visual Studio before.

Good to have that under my belt finally.
