Python-bloggers

Why RStudio Supports Python for Data Science

This article was first published on Python on RStudio and kindly contributed to python-bloggers.

As RStudio’s products have increasingly supported Python over the past year, some of our seasoned customers have given us quizzical looks and asked, “Why are you adding Python support? I thought you were an R company!”

Just to set the record straight, RStudio does love R and the R community, and we have no plans to change that. However, if RStudio’s goal is to “enhance the production and consumption of knowledge by everyone, regardless of economic means” (which is what we say in our mission statement), that means we have to be open to all ways of approaching that goal, not just the R-based ones.

This still leaves open the question of why we would embrace a language that some in the data science world think of as a competitor. And while I can’t claim we have a definitive answer, we do have something more than anecdotes to encourage R users to embrace Python as well. We have data.

Survey Data Says “R and Python Are Used for Different Things”

“In God we trust; others must provide data.”

– Attributed to W. Edwards Deming and others, including Anonymous

Over the past two years, RStudio has run a broad-based survey of people who use or intend to use R. In the 2019 edition of the survey, we asked our more than 2,000 respondents to answer two questions:

“What applications do you use R for most?”

and

“What applications do you use Python for most?”

Respondents were allowed to check as many answers as they wished in both cases. They were also allowed to enter their own application categories as an open-ended response. It is important to note that while this data is indicative of user attitudes, it is by no means conclusive.
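
Because respondents could check several answers, the shares for a single question can add up to more than 100%. Here is a minimal Python sketch of that kind of tally. It uses made-up responses rather than the actual survey data, and the function name `tally_multiselect` is a hypothetical one chosen for illustration:

```python
from collections import Counter

# Hypothetical responses: each respondent may select several applications.
# These rows are invented for illustration; they are NOT the survey data.
responses = [
    {"visualization", "statistical analysis"},
    {"data transformation", "machine learning"},
    {"visualization", "data transformation"},
    {"statistical analysis"},
]


def tally_multiselect(responses):
    """Return, for each option, the fraction of respondents who selected it."""
    counts = Counter(option for selected in responses for option in selected)
    total = len(responses)
    return {option: count / total for option, count in counts.items()}


shares = tally_multiselect(responses)

# Print options from most to least selected, breaking ties alphabetically.
for option, share in sorted(shares.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{option}: {share:.0%}")
```

Each share is computed against the number of respondents, not the number of checked boxes, so an option's percentage reads as "this fraction of respondents use the language for this purpose."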

Below are the summary plots for the results of these survey questions.

Figure 1: R is used most commonly for visualization, statistical analysis, and data transformation.

Figure 2: R users employ Python most commonly for data transformation and machine learning.

Taking these charts at face value (again, read the next section before you do that), we can draw some interesting conclusions:

Think of These Results As Directional Instead of Hard Numbers

While these analyses are interesting and the sample sizes reasonable, readers should understand that these results aren’t really representative of all data scientists. As the creator and primary analyst for this survey, I can give you several reasons why you shouldn’t put too much stock in these numbers beyond their overall direction:

The best way to think of this survey is that it represents the views of a few thousand of RStudio’s friends and customers. While this doesn’t give us any conclusions about the general population of data scientists or programmers, we can use it to think about what we can do to make those people more productive.

RStudio Should (and Does) Support Both R and Python

Although we can’t use this survey to draw general conclusions, we can use its data to think about how RStudio should support our customers and the data science community in their work:

While RStudio already offers Python support in its products, we’ll be adding to that support in new versions that will be released in the coming months. Those announcements will appear both here on blog.rstudio.com and on the main website, so check back regularly for news of those releases.


Survey Details

RStudio fielded its 2019 R community survey beginning on December 13, 2019. We closed the survey on January 10, 2020 after it had accumulated 2,176 responses. Its details are as follows:
