Firing Up Firestore

Posted on March 20, 2022 by Python - datawookie in Data science

This article was first published on Python - datawookie, and kindly contributed to python-bloggers.

I’ve just started collaborating on a new project, Votela, with Luke. We’re going to be using Firestore for stashing our data. I’ve never worked with Firestore before, so one of my first tasks was just figuring out how to get connected and how to shift some data to and from the database.

Luckily for me, Luke had already set up the Firestore instance, so I just needed to connect and let rip. In reality it took me a while to “just connect” because the setup was quite different from what I’m accustomed to.

Authentication

The authentication procedure is covered in the detailed documentation from Google. This is what I did.

First install the Google Cloud CLI.

sudo snap install --classic google-cloud-cli

Initiate authentication.

gcloud auth application-default login

That will open a page in your browser. You’ll be asked to select a Google account and then to allow access to that account. This always feels a little risky, but I steeled myself and accepted the potential consequences, and was rewarded with a cheering confirmation message.
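To double-check from Python that the login left credentials behind, here is a minimal sketch using only the standard library. It assumes the usual locations for Application Default Credentials: the file named by the GOOGLE_APPLICATION_CREDENTIALS environment variable if that is set, otherwise application_default_credentials.json under ~/.config/gcloud on Linux and macOS (or under %APPDATA%\gcloud on Windows). The helper name find_adc_file is hypothetical and not part of any Google library, and a custom gcloud config directory is not handled.

```python
import os
from pathlib import Path

ADC_FILENAME = "application_default_credentials.json"


def find_adc_file(env=None, home=None):
    """Return the path of an existing credentials file, or None.

    Assumption: credentials live either where GOOGLE_APPLICATION_CREDENTIALS
    points or in the default location used by
    `gcloud auth application-default login`.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        candidates.append(Path(explicit))
    if os.name == "nt" and env.get("APPDATA"):
        candidates.append(Path(env["APPDATA"]) / "gcloud" / ADC_FILENAME)
    else:
        candidates.append(home / ".config" / "gcloud" / ADC_FILENAME)

    # Report the first candidate that actually exists on disk.
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Calling find_adc_file() with no arguments inspects the real environment. A result of None suggests that the login step has not completed.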

Install the Python Package

Since I planned on creating some test data in Python I needed to get the required client library installed. After creating and activating a virtual environment I installed the google-cloud-firestore package.

pip3 install google-cloud-firestore

TBH I actually created a requirements.txt file which included a few other packages I’d be using.

google-cloud-firestore==2.4.0
lorem==0.1.1
names==0.3.0
python-dotenv==0.19.2
scipy==1.8.0

Then I installed them all.

pip3 install -r requirements.txt

Connecting

Exciting times! I was ready to connect.

from google.cloud import firestore

db = firestore.Client(project="votela")

Miraculously that just worked. The resulting db is an instance of the Client class. We’ll use that a little later.
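If you'd rather not hard-code the project ID, one option is to read it from the environment. The sketch below assumes the GOOGLE_CLOUD_PROJECT environment variable and falls back to "votela"; the helper names resolve_project and make_client are hypothetical.

```python
import os


def resolve_project(env=None, default="votela"):
    """Pick the project ID: GOOGLE_CLOUD_PROJECT if set, else the default."""
    env = os.environ if env is None else env
    return env.get("GOOGLE_CLOUD_PROJECT") or default


def make_client(project=None):
    """Create a Firestore client for the resolved project."""
    # Imported lazily so that resolve_project() works without the package.
    from google.cloud import firestore

    return firestore.Client(project=project or resolve_project())
```

With that in place, db = make_client() behaves like the explicit call above unless the environment variable says otherwise.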

Creating Data

Right, now I’m ready to create some dummy data. We’ll start by creating a few imaginary users.

import re
import random
import names
import lorem
import uuid

# Create a seeded RNG so that the generated UUIDs are reproducible.
rng = random.Random(13)

NUSER = 5

# Each user will be identified by a unique UUID.
USERS = [{"uuid": uuid.UUID(int=rng.getrandbits(128)).hex} for _ in range(NUSER)]

for user in USERS:
    # Generate a random name.
    user["name"] = names.get_full_name()
    # Derive a handle from the name.
    user["handle"] = re.sub(" +", "_", user["name"].lower())
    # Create a random biography (in Latin because we're fancy like that).
    user["bio"] = lorem.paragraph()

The resulting USERS is a list of dictionaries, one for each synthetic user.
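To make the two deterministic pieces of that script easier to see, here is a small standalone sketch of the seeded UUID generation and the handle derivation. The function names make_uuids and make_handle are hypothetical. Note that only the UUIDs come from the seeded generator: the names and lorem packages appear to draw from Python's module-level random generator (not verified for these exact versions), so the names and biographies would probably differ between runs unless random.seed() is called as well.

```python
import random
import re
import uuid


def make_uuids(seed, count):
    """Generate `count` hex UUID strings from an RNG seeded with `seed`."""
    rng = random.Random(seed)
    return [uuid.UUID(int=rng.getrandbits(128)).hex for _ in range(count)]


def make_handle(name):
    """Lower-case the name and collapse runs of spaces into underscores."""
    return re.sub(" +", "_", name.lower())


# The same seed always gives the same IDs.
print(make_uuids(13, 5) == make_uuids(13, 5))  # True
print(make_handle("Catherine Waters"))         # catherine_waters
```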

Creating a Collection

Each user will be stored as a separate document in a collection on Firestore. We’ll create a users collection using the Client object that we instantiated earlier. The user documents will be inserted into this collection.

# Get a reference to the collection (Firestore creates it when the first document is written).
users = db.collection("users")

The result is a CollectionReference object.

Add Documents to the Collection

And now we can insert the user documents into the collection using the add() method.

for user in USERS:
    users.add(document_data=user, document_id=user["uuid"])

However, this will raise an AlreadyExists exception if the document already exists. Since I was iterating on this script it was more convenient to use the document() method, which returns a DocumentReference for the specified ID whether or not the document exists yet. That DocumentReference has a set() method, which creates the document if it’s missing and overwrites its contents if it’s already there.

for user in USERS:
    # The uuid field is used as the document ID.
    users.document(user["uuid"]).set(user)

Inserting one document at a time feels inefficient and I’m sure that it must be possible to do a batch insert. However, being super pragmatic at this stage, it works and executes quickly so I’m not too fussed.
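The batch insert speculated about above can be sketched with the client's batch() method, which returns a write batch that queues set() calls and sends them together on commit(). The helper name batch_insert and the chunk size of 500 are assumptions made for illustration; check the current Firestore limits before relying on a particular chunk size. The function takes the client as an argument, so it has no imports of its own.

```python
def batch_insert(db, collection_name, records, id_field="uuid", chunk_size=500):
    """Write records to a collection in batches; return the number of commits.

    `db` is expected to be a firestore.Client (or anything offering the same
    collection() and batch() methods).
    """
    collection = db.collection(collection_name)
    commits = 0
    for start in range(0, len(records), chunk_size):
        batch = db.batch()
        for record in records[start:start + chunk_size]:
            # Queue the write; nothing is sent until commit().
            batch.set(collection.document(record[id_field]), record)
        batch.commit()
        commits += 1
    return commits
```

For the five users here, batch_insert(db, "users", USERS) would send everything in a single commit.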

Retrieving Documents from the Collection

At this stage I was able to check the Firestore console to confirm that the documents had been added to the collection (I’m happy to report that they had indeed). However, in the interests of completeness it would be useful to retrieve them in code.

Let’s retrieve all of the document IDs.

# Get all of the document IDs.
for user in users.list_documents():
    print(user.id)

00c2f09186ce51bd17b8b123a524bf3f
01769a3c092936e8d5a2038fda048969
082018facc7eb77d4beb197c350cc530
09f0992ab1f5d8538b16bb0dce98225d
0b0fb71cde14bff2eed7a24a6c9fee24
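The list_documents() method yields references only, so no field data comes back with them. If you want the contents of every document in one pass, the collection's stream() method yields document snapshots instead. A minimal sketch, with the hypothetical helper name fetch_all:

```python
def fetch_all(collection):
    """Return {document ID: fields} for every document in the collection.

    `collection` is expected to be a CollectionReference such as `users`.
    """
    return {snapshot.id: snapshot.to_dict() for snapshot in collection.stream()}
```

Calling fetch_all(users) would give a dictionary keyed by the document IDs listed above.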

Now we can focus on a single document and retrieve it by ID.

user = users.document("00c2f09186ce51bd17b8b123a524bf3f").get()

The resulting DocumentSnapshot object has an id attribute and a get() method for accessing its fields.

user.id

'00c2f09186ce51bd17b8b123a524bf3f'

user.get("name")

'Catherine Waters'

# The document ID is also stored as the uuid field in the document.
user.get("uuid")

'00c2f09186ce51bd17b8b123a524bf3f'

You can also use the to_dict() method to retrieve all of the fields as a dictionary.
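One thing to watch for: calling get() on a reference to a document that doesn't exist still returns a snapshot, but its exists attribute is False. A small sketch that combines that check with to_dict(), using the hypothetical helper name fetch_fields:

```python
def fetch_fields(collection, document_id):
    """Return a document's fields as a dict, or None if it does not exist."""
    snapshot = collection.document(document_id).get()
    if not snapshot.exists:
        return None
    return snapshot.to_dict()
```

For example, fetch_fields(users, "00c2f09186ce51bd17b8b123a524bf3f") would return a dictionary with the uuid, name, handle and bio fields.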

Conclusion

I hit my head against this for quite a while this weekend, mostly trying to figure out how to authenticate. Thanks to my patient collaborator, Luke, for tolerating my whingeing. Now that I’ve got access sorted out though I’m feeling empowered and excited about the next steps on this project.

FYI I was wondering about the difference between Firebase and Firestore. Turns out that the former, Firebase, is an entire platform for developing apps, while the latter, Firestore, is a scalable NoSQL database which is part of Firebase.
