Anscombe’s quartet in Python

This article was first published on Stringfest Analytics and kindly contributed to python-bloggers.
Anscombe’s quartet is a well-known statistical parlor trick with a powerful message: four small datasets share nearly identical summary statistics yet look completely different when plotted. Fortunately, with the help of the seaborn and pandas packages, it can be demonstrated in Python with just a few lines of code.
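The notebook embedded in the original post is not reproduced here, so what follows is a minimal sketch of the demonstration rather than the author's exact code. It types in the published quartet values directly (seaborn can also fetch them with `sns.load_dataset("anscombe")`, but that needs a network connection). The names `x_shared`, `quartet`, `summary`, and `grid` are my own choices for illustration.

```python
import pandas as pd
import seaborn as sns

# Anscombe's four datasets; I, II and III share the same x values.
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Stack the four datasets into one long-format DataFrame.
anscombe = pd.concat(
    [pd.DataFrame({"dataset": name, "x": xs, "y": ys})
     for name, (xs, ys) in quartet.items()],
    ignore_index=True,
)

# Summary statistics per dataset: means, sample variances, and the
# Pearson correlation between x and y.
rows = {}
for name, group in anscombe.groupby("dataset"):
    rows[name] = {
        "x_mean": group["x"].mean(),
        "y_mean": group["y"].mean(),
        "x_var": group["x"].var(),
        "y_var": group["y"].var(),
        "corr": group["x"].corr(group["y"]),
    }
summary = pd.DataFrame.from_dict(rows, orient="index")
print(summary.round(2))

# One scatter plot with a fitted regression line per dataset.
# ci=None turns off the bootstrapped confidence band.
grid = sns.lmplot(data=anscombe, x="x", y="y", col="dataset",
                  col_wrap=2, ci=None, height=3)
```

The printed table is nearly the same row after row: an x mean of 9, a y mean of about 7.5, and a correlation of about 0.82 for every dataset. The plots tell four different stories: a roughly linear cloud, a smooth curve, a tight line with one outlier, and a vertical stack of points with a single high-leverage observation. In a notebook the figure renders inline; in a script, call `grid.savefig(...)` or `matplotlib.pyplot.show()` to see it.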

Moral of the story

What I love about Anscombe’s quartet is how it shows that data visualization is not an optional or inferior part of analytics. Differences we would have missed entirely by looking only at the summary statistics were clear as day once we plotted the data.

Of course, a secondary moral is to be very skeptical of small sample sizes (each dataset here has only eleven points), although that takeaway doesn’t have quite the “wow” moment that the visualization does.

One more takeaway from our example: did you see how easy it was to do that in Python, and what an attractive report came out of it? If you’d like to learn more Python for analytics, check out my book Advancing into Analytics.
