New vtreat Feature: Nested Model Bias Warning

January 11, 2020 | John Mount

For quite a while we have been teaching that estimating variable re-encodings on the exact same data that is later naively used to train a model leads to an undesirable nested model bias. The vtreat package (both the R and Python versions) incorporates a cross-frame method that allows ... [...Read more...]

A Richer Category for Data Wrangling

December 22, 2019 | John Mount

I’ve been writing a lot about category theory interpretations of data-processing pipelines and some of the improvements we feel they are driving in both the data_algebra and rquery/rqdatatable. I think I’ve found an even better category theory re-formulation of the package, which I will ... [...Read more...]

SKEWed perceptions

December 20, 2019 | OSM

The CBOE’s SKEW index has attracted some headlines in the press and blogosphere, as readings approach levels not seen in the last year. If the index continues to draw attention, doomsayers will likely say this predicts the next correction or bear ... [...Read more...]