Articles by John Mount

Advanced Data Reshaping in Python and R

September 4, 2019 | John Mount

This note is a simple data wrangling example worked using both the Python data_algebra package and the R cdata package. Both of these packages make data wrangling easy through the use of coordinatized data concepts (relying heavily on Codd’s “rule of access”). The advantages of data_algebra and ... [...Read more...]

New Getting Started with vtreat Documentation

September 2, 2019 | John Mount

Win Vector LLC’s Dr. Nina Zumel has just released some new vtreat documentation. vtreat is an all-in-one step data preparation system that helps defend your machine learning algorithms from: missing values; large cardinality categorical variables; and novel levels from categorical variables. I hoped she could get the Python ... [...Read more...]

Introducing data_algebra

August 26, 2019 | John Mount

This article introduces the data_algebra project: a data processing tool family available in R and Python. These tools are designed to transform data either in-memory or on remote databases. In particular we will discuss the Python implementation (also called data_algebra) and its relation to the mature R implementations (... [...Read more...]

Eliminating Tail Calls in Python Using Exceptions

August 23, 2019 | John Mount

I was working through Kyle Miller’s excellent note: “Tail call recursion in Python”, and decided to experiment with variations of the techniques. The idea is: one may want to eliminate use of the Python language call-stack in the case of “tail calls” (a function call where the result ... [...Read more...]
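
To make the idea in the excerpt above concrete, here is a minimal sketch of one way to replace tail calls with exceptions. It is an illustration only, not the code from the article or from Kyle Miller's note; the names `TailCall`, `run`, and `sum_to` are hypothetical and chosen for this example. Instead of calling itself, the function raises an exception describing the next call, and a small driver loop catches it and makes that call from a fixed stack depth.

```python
class TailCall(Exception):
    """Hypothetical helper: describes the next call instead of making it."""

    def __init__(self, fn, *args, **kwargs):
        super().__init__()
        self.fn = fn
        self.call_args = args
        self.call_kwargs = kwargs


def run(fn, *args, **kwargs):
    """Driver loop: call fn, and keep following TailCall requests
    until some call returns a value normally."""
    while True:
        try:
            return fn(*args, **kwargs)
        except TailCall as tc:
            # The raising frame has already been unwound, so the
            # call stack does not grow from one step to the next.
            fn, args, kwargs = tc.fn, tc.call_args, tc.call_kwargs


def sum_to(n, acc=0):
    """Sum the integers 1..n, written in tail-call style."""
    if n <= 0:
        return acc
    # A direct `return sum_to(n - 1, acc + n)` would add a stack frame
    # per step; raising hands the next call to the driver instead.
    raise TailCall(sum_to, n - 1, acc + n)


result = run(sum_to, 100000)
print(result)
```

Written as ordinary recursion, `sum_to(100000)` would exceed Python's default recursion limit; routed through `run`, each step executes at the same stack depth. The trade-off is that every step pays the cost of raising and catching an exception.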