Google Sheets API using Python

[This article was first published on PyShark, and kindly contributed to python-bloggers]. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

In this article we will discuss how to access and edit Google Sheets using Python.

Table of Contents

  • Introduction
  • Creating a sample Google Sheets document
  • Creating Google API credentials
  • Opening a Google sheet using Python
  • Selecting/creating/deleting a worksheet using Python
  • Editing data in a Google sheet using Python
  • Conclusion

Introduction

Accessing data from multiple sources using Python becomes a standard requirement for nearly any position in data science and analytics. Working with Excel spreadsheets and internal CSV files happens everywhere.

But what if your data is now stored on Google Drive as a Google Sheet? Of course you can download it in any format of your choice. Yet it is not a scalable solution since it requires constant human input.

Let’s see how we can solve this in a programmatic way and even automate some of the tasks when working with Google Sheets using Python.

To continue following this tutorial we will need two Python libraries: gspread and oauth2client.

If you don’t have them installed, please open “Command Prompt” (on Windows) and install them using the following code:

pip install gspread
pip install oauth2client

Creating a sample Google Sheets document

You probably already have a Google Sheets document if you are reading this article. And you would like to work with your file. For the purposes of this tutorial, I will create a simple Google Sheets file where I will replicate the students’ grades dataset. Essentially we will have three columns: first name, last name, and grade.

To create your first Google Sheets file, go to Google Drive page and login using your Google account. Once you are in, in the top left corner, click New and then Google Sheets. This will create a blank document that looks like this:

This new document is all empty and doesn’t have any data or it’s own filename.

In the top left corner, where you see “Untitled spreadsheet”, let’s rename it to “My Google Sheet” for convenience.

And let’s add some sample data to work with. I added some simple fields:

Great. Now we have a file that we will continue working with.


Creating Google API credentials

The first step to start working with Google Sheets using Python is to create API credentials for the Google Drive and Google Sheets that will allow us to connect to our files.

To get started we are going to head to Google Cloud Console for developers and login with out Google account.

Once we are in, at the very top, you will see the following button to create a project:

Click on it and it will take you to a new page where it will ask you to create a name for your project. I called mine “gsheets-pyshark”, and click “Create“.

Now, in the top right corner, click on the “bell” icon, and you will see a notification that the project has been created. From that notification list click View to get to the project page.

This will take you to the API dashboard for your project. It should look like this:

Perfect. So far we created our own unique project for working with Google Sheets using Python. The next step is to set up the APIs.

Go to navigation menu (click the three horizontal lines in the top right corner), and choose APIs & Services and then Dashboard. You should arrive at the following page:

As you can see, we don’t have any APIs working just yet. We need to add them. Follow the blue link in the middle of the screen that says API library to get to the list of all available Google APIs:

Here we will need to add the “Google Drive” API. Find it by typing the name in the search box, click on it, and then click Enable. It may take a few seconds to load. Once it’s enabled, we arrive here:

Wonderful. The Google Drive API is now enabled. To use it we will need to generate credentials for it. Click on Create Credentials in the top right corner and fill out the form like this:

After you filled out the form above, continue with clicking the blue button What credentials do I need?

In the next form, you should choose your account name (I set it to misha-pyshark) and the account’s role for the project (I chose Owner). The “Key type” should be set to JSON, because we would like to download the credentials as a .json file:

Then, click Continue and your credentials will be downloaded automatically and you will see a pop-up saying your API credentials have been created.

Important note: Please rename the downloaded JSON file to “mycredentials” as it will be much easier to reference it later in the code. Also, you should place it in the same directory/folder where your Python code will be (so we can save time on specifying the location of the file).

So far we have enabled our Google Drive API and downloaded credentials for it. The last step is to enable the Google Sheets API. Let’s go back to Google API library and find “Google Sheets API”. Click into it and click Enable to get it running.

Now we are all set to access Google Sheets using Python.


Opening a Google Sheet using Python

There are multiple ways to preform the authorization step in Python. The method I use in this tutorial is widely used across majority of the articles on this topics, so I decided to do it the same way.

As the first step, we will need to import the required libraries:

import gspread
from oauth2client.service_account import ServiceAccountCredentials

Now it’s time to configure the client with our credentials. This will create the gspread client:

gc = gspread.service_account(filename='mycredentials.json')

Everything is set up to retrieve the Google Sheet we’ve created earlier and get all the records from it.

There are 3 ways to open a Google Sheet using Python:

1. Open Google Sheet by Name

Here, we simply need to input the actual name of the Google Sheet that we created:

gsheet = gc.open("my_google_sheet")

2. Open Google Sheet by URL

To open the Google Sheet using a URL, you would need to open your Google Sheet in the browser and copy its address. For my file it is: https://docs.google.com/spreadsheets/d/1L7cYfMVPIiYPkTYe1bDwKPGfhAJXp8HCeg34Bh7VYl0/

Now, we use .open_by_url() method and pass our URL as an argument:

gsheet = gc.open_by_url("https://docs.google.com/spreadsheets/d/1L7cYfMVPIiYPkTYe1bDwKPGfhAJXp8HCeg34Bh7VYl0/")

3. Open Google Sheet by Key

Opening the Google Sheet by key is very similar to the previous option. What is the key to our file? It’s very easy to find. The key is the last component of the URL between the two last slashes (“/”). In my case it’s: 1L7cYfMVPIiYPkTYe1bDwKPGfhAJXp8HCeg34Bh7VYl0

Now, we use .open_by_key() method and pass our URL as an argument:

gsheet = gc.open_by_key("1L7cYfMVPIiYPkTYe1bDwKPGfhAJXp8HCeg34Bh7VYl0")

Whichever way you decided to continue with, it will create an object in our memory and store it as gsheet. Now what exactly does it contain? Simply, it’s contents are exactly what we have entered when we created this Google Sheet. Let’s now retrieve the information from it:

mydata = gsheet.sheet1.get_all_records()
print(mydata)

In the above chunk of code we opened our retrieved all the data from “Sheet 1” and printed it. It should look like this:

[{'first name': 'James', 'grade': 77, 'last name': 'Smith'},
{'first name': 'Maria', 'grade': 68, 'last name': 'Johnson'},
{'first name': 'Sam', 'grade': 59, 'last name': 'Erickson'},
{'first name': 'David', 'grade': 89, 'last name': 'Williams'}]

We ended up getting a list with values for each row. Also notice how Google Sheets by default set the first row as names for the columns.


Bonus: Google Sheet can be easily converted to a Pandas dataframe using the following code:

import pandas as pd

df= pd.DataFrame(mydata)
print(df)

And we get a much more familiar output:

first name last name grade
0 James Smith 77
1 Maria Johnson 68
2 Sam Erickson 59
3 David Williams 89

Selecting/Creating/Deleting a Worksheet using Python

After we were able to access the Google Sheet, let’s see what we can start with. The first checkpoint is our flexibility in working with different worksheets of our Google Sheet.

My sample Google Sheet contains only one worksheet that is called “Sheet 1”. Yet, it is very common that you will have a multi worksheet file. Of course when we retrieve the data we would need to specify from which worksheet the data should be taken. And this takes us to the first interaction option.


Selecting a Worksheet using Python

Recall that our Google Sheet is saved in local memory as gsheet. Now we would like to access a specific worksheet by its name from the file. What we are going to do is create a subset of the main Google Sheet that only contains data from “Sheet1” and print its content:

wsheet = gsheet.worksheet("Sheet1")

mydata = wsheet.get_all_records()
print(mydata)

The result should be identical to the previous section since we only have a single worksheet in our file:

first name last name grade
0 James Smith 77
1 Maria Johnson 68
2 Sam Erickson 59
3 David Williams 89

Creating a Worksheet using Python

Now, what if you wanted another empty worksheet? Of course you can login into your Google Drive via browser and create it manually. But if your script is running as a part if an automated process (which is why you are probably reading this article), we want to make everything work from our Python code.

The functionality of gspread library allows us to create new worksheets as well:

newsheet = gsheet.add_worksheet(title="New Worksheet", rows="100", cols="20")

The above code will create a new worksheet in out Google Sheet with the given parameters. Keep in mind, you must specify a new name for the worksheet being added as well as the number of rows and columns.

To check that it worked, login into your Google Drive and take a look at the Google Sheet file, and you will see that a second worksheet “New Worksheet” has been added to your file.


Deleting a Worksheet using Python

Alternatively, you may want to delete the worksheet you have just created. Now, here is a little catch: you can only delete the sheets that you have created using Python. In my example, it’s newsheet and the reason is the formatting that is stored in local memory to access that particular worksheet:

gsheet.del_worksheet(newsheet)

And if you check your Google Sheet again, you will see that the “New Worksheet” has been removed.


Editing data in a Google Sheet using Python

The majority of the most useful functionality will be discussed in this section. This is where we actually get to editing data in the main file using Python. To get started, we will need to define the worksheet we will be working with:

wsheet = gsheet.worksheet("Sheet1")

Recall that the data we are working with looks like this:

Getting a Cell Value

Let’s say we want to retrieve the data from a specific cell of the worksheet, and let that cell be “A2” (and the data we are looking to get is “James”). We would do it using the following code:

cval = wsheet.acell('A2').value

print(cval)

And we get exactly what we expected:

James

Updating a Cell Value

There can also be a case when you would like to update a value in a cell. Let’s say we made a wrong entry and need to change the name in A2 cell from “James” to “John”. It can be simply changed using the following code:

wsheet.update('A2', 'John')

And reusing the little code chunk from the previous section to get the updated value:

cval = wsheet.acell('A2').value

print(cval)

We get:

John

Getting All Values from a Row

Alternatively, you may be interested in retrieving the entire row of data (rather than a single cell). In our case, let’s assume we are interested to get the data for the first student in our worksheet.

The only caveat is that we need to know the index of the row that we want to retrieve the data from. Assuming first row is the column headers, the row of interest has an index of 2. Now we can get its values:

row_index = 2
values_row = wsheet.row_values(row_index)

print(values_row)

And we get a list of values in the row:

['John', 'Smith', '77']

Getting All Values from a Column

In another scenario, you may wish to get the data from a column (instead of a row). Assume for our scenario we want to get all data from the “grade” column.

We will again need to find its index (index = 3) and run code similar to the section above:

col_index = 3
values_column = wsheet.col_values(col_index)

print(values_column)

And we get a list of values in the column:

['grade', '77', '68', '59', '89']

Inserting a New Row

Probably one of the more popular tasks we usually do with spreadsheets is adding new data.

In our example, assume there is a new student who just wrote his/her exam and got a grade for it. We would like to update our Google Sheet using Python with a new entry.

To do this, we will need two pieces of information: students data (first name, last name, and grade) and the index for a row that we are going to insert:

student_data = ['Emily', 'Watson', 89]
new_row_index = 6

The reason the new index is 6 because we know that we already have 5 rows with data, and want to add at the end of the list. Keep in mind you can choose any index value, and it just push the remaining rows down.

Now, let’s add this row to our Google Sheet and check if it was added:

wsheet.insert_row(student_data, new_row_index)

values_row = wsheet.row_values(new_row_index)
print(values_row)

And we get a list of values in the row we just added:

['Emily', 'Watson', '89']

Note: the functionality of the whole package is quite extensive and a lot of it is very case specific. For a complete set of methods with examples of the gspread package is available here.


Conclusion

In this article we discussed how to access/edit Google Sheets using Python as well as covered the features of the gspread library.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Python Programming articles.

The post Google Sheets API using Python appeared first on PyShark.

To leave a comment for the author, please follow the link and comment on their blog: PyShark.

Want to share your content on python-bloggers? click here.