The learning theories behind Advancing into Analytics

[This article was first published on George J. Mount, and kindly contributed to python-bloggers].

Technical books are curious in a lot of ways, including this one: most technical authors don’t teach or write for a living. They’re technicians who happen to write a book. That means that while you may get the most brilliant technical know-how, you may not receive it in a format best suited to understanding and retaining it. Lots of technical books feel like a battle of wits against the author, and readers quickly lose whatever tenuous grasp of the material they were offered.

Now, I’m by no means a trained instructional designer or learning theorist and, as with many academic pursuits, I think the fluff/nugget ratio in these fields is pretty high. But I have spent enough time adjacent to them to identify those nuggets and incorporate useful learning theories into Advancing into Analytics.

What this means in theory (pun intended) is that Advancing into Analytics is written for you to learn and retain the most knowledge possible, without having to work too hard at it.

Here are some of the topics and techniques I used to do that. I rely especially on Make it Stick: The Science of Successful Learning by Peter C. Brown et al. and Powerful Teaching: Unleash the Science of Learning by Pooja K. Agarwal and Patricia M. Bain.

Transfer learning

Learning happens by relating new knowledge to existing knowledge. Transfer learning is the practice of explicitly making this connection part of the learning.

I’ve said it before and I’ll say it again: Excel kicks off a great learning path to more advanced analytics. Spreadsheet users know from experience the main operations and tasks of data cleaning and analysis. Technical elites too often sneer at spreadsheets and attempt to write their audience’s knowledge about data down to zero, so they can start from a “purer” approach. Talk about negative yardage!

In my book, I instead directly relate Excel knowledge to broader analytics equivalents:

  • What does VLOOKUP() tell us about database joins?
  • How do you recreate Excel’s Custom Sort menu in R?
  • How do the Rows and Values areas of a PivotTable relate to grouping and aggregating fields in Python?
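
To make the first and third questions concrete, here is a minimal sketch in plain Python. It is not taken from the book: the data, the field names, and the helper functions left_join and pivot_sum are all hypothetical, made up for illustration. The idea is that an exact-match VLOOKUP() behaves like a left join against a lookup table, and that a PivotTable’s Rows and Values areas correspond to a group-by field and an aggregated field.

```python
from collections import defaultdict

# Hypothetical data, invented for illustration (not from the book).
orders = [
    {"order_id": 1, "product_id": "A", "units": 3},
    {"order_id": 2, "product_id": "B", "units": 1},
    {"order_id": 3, "product_id": "A", "units": 2},
    {"order_id": 4, "product_id": "C", "units": 5},
]
products = {"A": "Widget", "B": "Gadget"}  # lookup table: product_id -> name


def left_join(rows, lookup, key, new_field):
    """Attach a looked-up value to every row, like an exact-match VLOOKUP().

    Rows without a match get None, roughly where Excel would show #N/A.
    """
    return [{**row, new_field: lookup.get(row[key])} for row in rows]


def pivot_sum(rows, row_field, value_field):
    """Group by row_field and sum value_field.

    row_field plays the part of the PivotTable's Rows area,
    value_field the part of its Values area (summarized by Sum).
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row[row_field]] += row[value_field]
    return dict(totals)


joined = left_join(orders, products, key="product_id", new_field="product_name")
units_by_product = pivot_sum(joined, row_field="product_name", value_field="units")
print(units_by_product)
```

The order for product "C" has no match in the lookup table, so it ends up grouped under None rather than being dropped, which is the left-join behavior a spreadsheet user already knows from a column of lookups. In everyday Python work these two steps are usually done with pandas, via DataFrame.merge and DataFrame.groupby, rather than hand-written helpers like these.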

Active recall

Have you ever read and re-read a book, thinking you’ve nailed the content, only to find that you can’t remember any of it when tested? Maybe you even used a highlighter and sticky notes, but to no avail.

The issue with learning this way is that it focuses purely on the consumption of material and not its implementation. To really master a subject, you need to actively apply it to new material. As Pooja K. Agarwal and Patricia M. Bain write in Powerful Teaching: “One of the best ways to make sure something sticks and get stored is to focus on the retrieval stage, not the encoding stage.”

Now, I’ll admit that I tend to skip end-of-chapter book exercises. They’re usually dull (literal) textbook exercises, and it can be hard to find the solutions anyway. Why not just continue reading and keep the book’s momentum going? I’ve noticed that many technical authors don’t even include book exercises, likely for these reasons (and because, let’s be honest, exercises take more work).

I provide exercises for nearly all chapters of Advancing into Analytics, using real-life datasets to practice data exploration and hypothesis testing in Excel, Python and R. What’s more, all exercise solutions are conveniently available in the book’s public GitHub repository. If you read the book, please do these exercises. It’s how you’ll remember the content.

Interleaving

As my business’s name might attest, I am a (mostly erstwhile) musician. One of the many lessons I learned from music is the power of interleaving.

It’s tempting to practice a piece from start to end each time, but that’s not so effective. The problem is that gaps may form in the music covered (i.e., you may only practice the beginning of a piece, or your favorite or easiest parts). You can easily fall into a slump when you know what order to expect each time you practice.

A better approach is to mix it up. Pick a random part of a piece and start playing and re-playing. Try sections out-of-order or even backwards. Add some variety to the way in which you practice.

Learning often follows a blocked approach, where one topic is studied very thoroughly before moving on to the next, often in the same order. By contrast, interleaving mixes topics in a spaced, often varying, order.

Advancing into Analytics is arranged into three sections: first, the statistical foundations of analytics are demonstrated in Excel. The reader then learns analytics in R and later Python.

[Figure: Blocking versus interleaving]

Rather than treat these topics as three disconnected parts, I interleave related concepts among them. For example, readers will recreate the same analysis of a dataset using all three applications. Statistical and data cleaning know-how is introduced and re-introduced in different contexts, so that we’ll conceptualize a data table in Excel, then build it in R and Python.

Now, if you’re thinking that this is how knowledge usually works in reality anyway… well, you’re probably right. Learning tends to be iterative and incremental, and there’s no clean break between mastering one topic and getting started in another. Traditional education isn’t always modeled this way, but Advancing into Analytics is. It’s just not possible to master hypothesis testing, for example, in a single chapter, so you’ll see the topic appear in different contexts throughout the book.

Learn analytics fast

Getting into analytics isn’t easy. In many cases, it literally requires learning a new language (of the programming variety). You’ve got enough on your plate: a battle of wits with a technically gifted but pedagogically unaware author shouldn’t be there.

In Advancing into Analytics, you’ll learn not one but two programming languages. Not only that, you’ll discover hypothesis testing, data wrangling, even a smidge of what could be called machine learning, all in 250 pages. This is possible with the help of learning theory. I hope the book can serve as the straightest path to analytics out there.

Learn more about Advancing into Analytics: From Excel to Python and R
