I vividly remember a grade school science project recording weather conditions for a selection of cities around the country. I sat at our dining room table on a Sunday night with a week’s worth of newspapers, graph paper, and my colored pencils, diligently plotting temperature highs and lows, then connecting the points to make a rainbow of parallel lines. It was fascinating and beautiful.

Am I starting too far back? Probably. It's true, though. From a young age, I had an interest in statistics and data visualization, although I didn't consider or even know those terms at the time.

As a data science student, I found that most of the lectures and labs I encountered covered topics in the abstract. We spent more time discussing irises, Titanic passengers, real estate in King County, WA, and a seemingly endless array of Pokémon than I ever expected we would. However, the more I explored research papers and publications, the more I found real-world examples of data science and machine learning in action that I could relate to. This one, in particular, leapt out at me.

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes by Himabindu Lakkaraju et al.

Before studying data science, I spent over fifteen years working in education and academic administration.

Bring your data to life with just one line of code

As a data science student, I am often reminded that data scientists can spend 60% to 80% of their time cleaning and managing data, and that’s why Exploratory Data Analysis (EDA) is so important. EDA is not the most glamorous task, but it lays the foundation for the rest of the work you will do. It should be approached mindfully and methodically.
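To make that concrete, here is a minimal sketch of what a manual first pass at EDA might look like in plain pandas. The tiny weather table and the `quick_eda` helper are both made up for illustration; they are assumptions of mine, not part of any library, and a real dataset would call for more checks than these.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame(
    {
        "city": ["Seattle", "Austin", "Boston", "Austin", None],
        "high_temp_f": [61.0, 88.0, 70.0, None, 75.0],
        "low_temp_f": [48.0, 67.0, 55.0, 66.0, 58.0],
    }
)


def quick_eda(frame):
    """Collect a few basic per-column facts into one summary table.

    This is an illustrative helper, not a pandas function.
    """
    return pd.DataFrame(
        {
            "dtype": frame.dtypes.astype(str),  # storage type of each column
            "missing": frame.isna().sum(),      # count of missing values
            "unique": frame.nunique(),          # distinct non-missing values
        }
    )


summary = quick_eda(df)
print(summary)

# Basic descriptive statistics for the numeric columns.
print(df.describe())
```

Even for three columns this takes a handful of separate calls, and it still says nothing about distributions or correlations. That repetition is a large part of why automated reporting tools are appealing.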

I am also interested in adding as many tools to my collection as I can, which is why I was so intrigued when I stumbled across Pandas Profiling.

So what does it do?

