Using the data algebra for Statistics and Data Science

[This article was first published on python – Win Vector LLC, and kindly contributed to python-bloggers.]
Want to share your content on python-bloggers? click here.

I have a new intermediate introduction to the data algebra up here: Using the data algebra for Statistics and Data Science.

The data algebra is a Python tool for data processing that runs on top of any of Pandas, Google BigQuery, PostgreSQL, MySQL, Spark, or SQLite. It lets you develop data processing pipelines incrementally, then use and re-use them on different data sets in different data stores.
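To make the "define once, run in different data stores" idea concrete, here is a minimal sketch using only the Python standard library. It is not the data algebra's API: the step tuples and the helper names `run_in_memory` and `to_sql` are hypothetical, invented for this illustration. The sketch describes a pipeline as plain data, builds it up one step at a time, and then runs the same description both on in-memory rows and as SQL against SQLite.

```python
import sqlite3

# Hypothetical step formats (assumptions for this sketch, not data algebra syntax):
#   ("extend", new_col, a, b)          adds new_col = a + b
#   ("select_rows", col, threshold)    keeps rows where col > threshold


def run_in_memory(steps, rows):
    """Apply the pipeline to a list of dicts, leaving the input untouched."""
    rows = [dict(r) for r in rows]
    for step in steps:
        if step[0] == "extend":
            _, new_col, a, b = step
            for r in rows:
                r[new_col] = r[a] + r[b]
        elif step[0] == "select_rows":
            _, col, threshold = step
            rows = [r for r in rows if r[col] > threshold]
        else:
            raise ValueError(f"unknown step: {step[0]}")
    return rows


def to_sql(steps, table_name):
    """Translate the same pipeline into a nested SQL query."""
    query = f'SELECT * FROM "{table_name}"'
    for i, step in enumerate(steps):
        sub = f"({query}) AS s{i}"
        if step[0] == "extend":
            _, new_col, a, b = step
            query = f'SELECT *, "{a}" + "{b}" AS "{new_col}" FROM {sub}'
        elif step[0] == "select_rows":
            _, col, threshold = step
            # float() ensures only a numeric literal is placed in the SQL text
            query = f'SELECT * FROM {sub} WHERE "{col}" > {float(threshold)!r}'
        else:
            raise ValueError(f"unknown step: {step[0]}")
    return query


# Build the pipeline incrementally: one step first, then a second.
pipeline = [("extend", "total", "x", "y")]
pipeline.append(("select_rows", "total", 5))

data = [{"x": 1, "y": 2}, {"x": 3, "y": 4}, {"x": 5, "y": 6}]

# Run 1: in memory.
local_result = run_in_memory(pipeline, data)

# Run 2: the same pipeline description, executed by SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO d VALUES (?, ?)", [(r["x"], r["y"]) for r in data])
cur = conn.execute(to_sql(pipeline, "d"))
cols = [c[0] for c in cur.description]
db_result = [dict(zip(cols, row)) for row in cur.fetchall()]
conn.close()
```

Both runs keep the same two rows, since only those have `total` above 5. The point of the design is that the pipeline is a description separate from any one execution engine, which is the property the paragraph above attributes to the data algebra; the linked article shows the real package's syntax.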

Please check it out.

To leave a comment for the author, please follow the link and comment on their blog: python – Win Vector LLC.
