Why most “coding for spreadsheet users” training fails

[This article was first published on George J. Mount, and kindly contributed to python-bloggers].

Excel is arguably the world’s largest coding community, with 750 million users worldwide. Many of us, myself included, were introduced to the data world through Excel.

But the thing is… Excel is not coding, not exactly. Like any software, Excel has its limitations, which a full-blown programming language like R or Python can make up for.

This is a huge market, so it’s not surprising to find a bevy of educational offerings on “how to learn coding from an Excel user’s perspective.” In fact, I’m surprised there isn’t more.

The problem is that little of this training follows solid instructional design principles. It does not help the user make a graceful pivot from Excel to coding.

Here’s what I see wrong with most “coding for Excel users” programs.

It’s too specific, too fast

There is a stream of training, particularly for Python, that aims to teach users how to automate the production of Excel workbooks or conduct basic data analysis, by populating data, formatting worksheets and so forth.

The idea here is that it doesn’t actually take much knowledge of Python to make great strides in automating workbooks.

To me, this is like saying you don’t really need to know how to drive a car “that well” if you are just driving to the corner.

This is a counterproductive approach: it may seem like a time-saver to cut corners and half-train, but over time it’s inviting serious errors that will take lots of cleanup.

While I encourage analysts to merge their abilities in spreadsheets and coding (in fact, often the most fruitful data products come from such mashups), I highly discourage learning coding just to automate spreadsheets. In the long run, it does not offer a solid footing in coding.

It doesn’t actually relate concepts

Fortunately, not all spreadsheets-to-coding training jumps the gun like the above. Often, it does provide step-by-step fundamentals to learning a programming language.

A common problem with this approach is that it doesn’t do nearly enough to explicitly help students carry the “mental model” of spreadsheets over into coding.

This is such an overlooked teaching tool! After all:

Students learn new ideas by relating them to what they already know, and then transferring them into their long-term memory.

“How People Learn: An Evidence-Based Approach,” Paul Bruno (source: Edutopia)

I so often look at “introduction to coding for spreadsheet users” training and think that it could just as easily be a generic “introduction to coding” course: nothing in it is unique to the perspective of a spreadsheet user.

Here are some approaches I take in relating new ideas about coding to what students already know about spreadsheets:

  • What is known as a PivotTable in Excel is known in programming as a group by or aggregation.
  • What we’re doing with VLOOKUP() is really building a left outer join of sorts.
  • In Excel, we operate directly on the input data. In programming, we usually import that data and operate on it via an assigned variable.
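
As a minimal sketch of these three mappings, here is how they might look in plain Python. The two small tables, their column names and every variable name are made up for illustration; they stand in for two worksheets and are not from any real workbook.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for two worksheets.
SALES_CSV = """region,rep,amount
East,Ana,100
West,Bo,250
East,Cy,50
North,Di,75
"""
MANAGERS_CSV = """region,manager
East,Pat
West,Sam
"""

# Import the data and assign it to variables instead of editing it in place.
sales = list(csv.DictReader(io.StringIO(SALES_CSV)))
managers = list(csv.DictReader(io.StringIO(MANAGERS_CSV)))

# PivotTable equivalent: group by region and sum the amounts.
totals_by_region = defaultdict(int)
for row in sales:
    totals_by_region[row["region"]] += int(row["amount"])

# VLOOKUP equivalent: a left outer join that keeps every sales row and
# fills in None where no match exists (Excel would show #N/A instead).
manager_lookup = {row["region"]: row["manager"] for row in managers}
sales_with_manager = [
    {**row, "manager": manager_lookup.get(row["region"])} for row in sales
]

print(dict(totals_by_region))
print(sales_with_manager)
```

In day-to-day analysis these steps are more often written with pandas, where `DataFrame.groupby` and `DataFrame.merge(how="left")` play the same roles. The plain-Python version is used here only to keep the sketch free of extra installs.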

It’s framed as “either/or”

This last one is more of a course “attitude” than an instructional approach, but it may be the most detrimental of them all.

The attitude sounds like this:

You’ve been using Excel when you really should be coding. Look at all these problems with using spreadsheets! Time to kick the habit.

This is the wrong attitude to take for a couple of reasons:

It’s demotivating

Earlier I mentioned the importance of helping students learn new ideas by relating them to what they already know.

Guess what? It’s hard to relate new ideas when you’re told what you already know is garbage.

Excel users intuitively understand how to work with data: they can sort, filter, group and join. Now it’s just a matter of pairing code to concept rather than starting from scratch. This is not wasted effort by a long shot.

What a drag! Telling students that what they know is worthless is no motivator. It also throws away Excel as a bridge to new concepts: if the message is that Excel should be burned down, why would anyone try to build on it?

It’s not accurate

In addition to being a horrible teaching tactic, this attitude is just incorrect: there is no reason to pitch Excel from the data workflow!

I make a big deal of the data analytics stack because it serves to contextualize data tools as being “slices” of the same stack. This way, we can see tools as a “yes, and” relationship instead of “either/or.”

Excel absolutely has a place in data analytics. So does programming. Learning and using one does not negate the other.

The attitude should sound something like this instead:

Excel is a great tool for data analysis, but it’s not the only tool. Python is a valuable tool that can make up for Excel’s limitations. But this doesn’t mean you should throw out Excel entirely! It will always be a great tool for data prototyping and providing interactive data models for end-users.

You’ve also learned way more than you realized about coding from using Excel. Many of the tasks you perform on data all the time can also be done in Python.

An approach like this is more honest and more encouraging.

So, take a look at your “coding for spreadsheet users” content. How is it purposefully using the “mental model” of spreadsheets to teach coding? Is the course predicated on the idea that spreadsheets suck and shouldn’t be used anymore?

Bridging the divide from spreadsheets to coding

Learning coding is no small feat, so training programs that treat it as one usually fail. That’s the problem with the “learn a bit of Python to automate your Excel workbooks” school of thought.

At the same time, Excel users are not the average newbie coder: they’ve been manipulating data and writing functions for some time. It’s important that this training dig into these strengths. And, yes, spreadsheet mastery is a strength which should not be discarded.

These are topics which I address head-on in my book, Advancing into Analytics: From Excel to Python and R.

Spreadsheet users: I am one of you. Let me help you level up your data skills. No snark, no cheap shortcuts. Just the straightest learning path from spreadsheets to coding. I look forward to your thoughts on the book.
