Python-bloggers

reticulate: R interface to Python

This article was first published on Python on RStudio , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

We are pleased to announce the reticulate package, a comprehensive set of tools for interoperability between Python and R. The package includes facilities for:

Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability. If you are an R developer that uses Python for some of your work or a member of data science team that uses both languages, reticulate can dramatically streamline your workflow!

You can install the reticulate pacakge from CRAN as follows:

install.packages("reticulate")

Read on to learn more about the features of reticulate, or see the reticulate website for detailed documentation on using the package.

Python in R Markdown

The reticulate package includes a Python engine for R Markdown with the following features:

Built in conversion for many Python object types is provided, including NumPy arrays and Pandas data frames. From example, you can use Pandas to read and manipulate data then easily plot the Pandas data frame using ggplot2:

Note that the reticulate Python engine is enabled by default within R Markdown whenever reticulate is installed.

See the R Markdown Python Engine documentation for additional details.

Importing Python modules

You can use the import() function to import any Python module and call it from R. For example, this code imports the Python os module and calls the listdir() function:

library(reticulate)
os <- import("os")
os$listdir(".")
 [1] ".git"             ".gitignore"       ".Rbuildignore"    ".RData"          
 [5] ".Rhistory"        ".Rproj.user"      ".travis.yml"      "appveyor.yml"    
 [9] "DESCRIPTION"      "docs"             "external"         "index.html"      
[13] "index.Rmd"        "inst"             "issues"           "LICENSE"         
[17] "man"              "NAMESPACE"        "NEWS.md"          "pkgdown"         
[21] "R"                "README.md"        "reticulate.Rproj" "src"             
[25] "tests"            "vignettes"      

Functions and other data within Python modules and classes can be accessed via the $ operator (analogous to the way you would interact with an R list, environment, or reference class).

Imported Python modules support code completion and inline help:

See Calling Python from R for additional details on interacting with Python objects from within R.

Sourcing Python scripts

You can source any Python script just as you would source an R script using the source_python() function. For example, if you had the following Python script flights.py:

import pandas

def read_flights(file):
  flights = pandas.read_csv(file)
  flights = flights[flights['dest'] == "ORD"]
  flights = flights[['carrier', 'dep_delay', 'arr_delay']]
  flights = flights.dropna()
  return flights

Then you can source the script and call the read_flights() function as follows:

source_python("flights.py")
flights <- read_flights("flights.csv")

library(ggplot2)
ggplot(flights, aes(carrier, arr_delay)) + geom_point() + geom_jitter()

See the source_python() documentation for additional details on sourcing Python code.

Python REPL

If you want to work with Python interactively you can call the repl_python() function, which provides a Python REPL embedded within your R session. Objects created within the Python REPL can be accessed from R using the py object exported from reticulate. For example:

Enter exit within the Python REPL to return to the R prompt.

Note that Python code can also access objects from within the R session using the r object (e.g. r.flights). See the repl_python() documentation for additional details on using the embedded Python REPL.

Type conversions

When calling into Python, R data types are automatically converted to their equivalent Python types. When values are returned from Python to R they are converted back to R types. Types are converted as follows:

R Python Examples
Single-element vector Scalar 1, 1L, TRUE, "foo"
Multi-element vector List c(1.0, 2.0, 3.0), c(1L, 2L, 3L)
List of multiple types Tuple list(1L, TRUE, "foo")
Named list Dict list(a = 1L, b = 2.0), dict(x = x_data)
Matrix/Array NumPy ndarray matrix(c(1,2,3,4), nrow = 2, ncol = 2)
Data Frame Pandas DataFrame data.frame(x = c(1,2,3), y = c("a", "b", "c"))
Function Python function function(x) x + 1
NULL, TRUE, FALSE None, True, False NULL, TRUE, FALSE

If a Python object of a custom class is returned then an R reference to that object is returned. You can call methods and access properties of the object just as if it was an instance of an R reference class.

Learning more

The reticulate website includes comprehensive documentation on using the package, including the following articles that cover various aspects of using reticulate:

Why reticulate?

From the Wikipedia article on the reticulated python:

The reticulated python is a speicies of python found in Southeast Asia. They are the world’s longest snakes and longest reptiles…The specific name, reticulatus, is Latin meaning “net-like”, or reticulated, and is a reference to the complex colour pattern.

From the Merriam-Webster definition of reticulate:

1: resembling a net or network; especially : having veins, fibers, or lines crossing a reticulate leaf. 2: being or involving evolutionary change dependent on genetic recombination involving diverse interbreeding populations.

The package enables you to reticulate Python code into R, creating a new breed of project that weaves together the two languages.

UPDATE: Nov. 27, 2019
Learn more about how R and Python work together in RStudio.

To leave a comment for the author, please follow the link and comment on their blog: Python on RStudio .

Want to share your content on python-bloggers? click here.
Exit mobile version