API as a package: Structure

Posted on September 15, 2022 by The Jumping Rivers Blog in Data science

This article was first published on The Jumping Rivers Blog and kindly contributed to python-bloggers.

This is part one of our three-part series:

Part 1: API as a package: Structure (this post)
Part 2: API as a package: Logging (to be published)
Part 3: API as a package: Testing (to be published)

Introduction

At Jumping Rivers we were recently tasked with taking a prototype
application built in {shiny} to a public-facing production environment
for a public sector organisation. During the scoping exercise we
determined that a better fit for the requirements was to build the
application around a {plumber} API, which provides the interface to the
Bayesian network model and other application tools written in R.

When building applications in {shiny} we have for some time been using
the “app as a package” approach popularised by tools like {golem} and
{leprechaun}. This is in large part due to the convenience of leveraging
the testing and dependency structure that our R developers are already
comfortable with from authoring packages, and the ease with which one
can then install and run an application in a new environment. For this
project we looked to carry some of these ideas over to a {plumber}
application. This blog post discusses some of the thinking behind, and
the structure that resulted from, that process.

As I began to flesh out this blog post I realised that it was becoming
very long, and there were a number of different aspects that I wanted to
discuss: structure, logging and testing to name a few. To try to keep
this a bit more palatable I will instead do a mini-series of blog posts
around the API as a package idea and focus predominantly on the
structure elements here.

API as a package

There are a few things I really like about the {shiny} app as a package
approach that I wanted to reflect in the design and build of a {plumber}
application as a package.

- It encourages a regular structure and organisation for an
  application. All modules have a consistent naming pattern and
  structure.
- It encourages leveraging the {testthat} package and including some
  common tests across a series of applications, see
  golem::use_recommended_tests() for example.
- An instance of the app can be created via a single function call
  which does all the necessary set up, say my_package::run_app().

Primarily I wanted these features in a form that could be reused across
the {plumber} applications that we create both internally and for our
clients. As far as I know there isn’t a similar package that provides an
opinionated way of laying out a {plumber} application as a package, and
it is my intention to create one as a follow-up to this work.

Regular structure

When developing the solution for this particular project I had in the
back of my mind that I wanted to create as much reusable structure as
possible for future projects of this sort. In particular I wanted an
easy way to build out an API with nested routes from a package
structure, using code that could easily transfer to another package.

I opted for a structure that utilised the inst/extdata/api/routes
directory of a package as a basis with the idea that the following file
structure

inst/extdata/api/routes/
|
| - model.R
| - reports/
    |
    | - pdf.R

with example route definitions inside

# model.R
#* @post /prediction
exported_function_from_my_package

# pdf.R
#* @post /weekly
exported_function_from_my_package

would translate to an API with the following endpoints

/model/prediction
/reports/pdf/weekly
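
The mapping from file layout to endpoints can be sketched in a few lines of base R. This is only an illustration of the naming rule described above, not part of the package code: route_prefix() is a hypothetical helper, and the declared paths are simply copied from the annotations in the two example files.

```r
# Hypothetical helper: the URL prefix implied by a route file's path,
# relative to inst/extdata/api/routes
route_prefix = function(file) {
  paste0("/", sub("\\.R$", "", file))
}

# Relative file path -> path declared in that file's plumber annotation
declared = c("model.R" = "/prediction", "reports/pdf.R" = "/weekly")

# The final endpoint is the file-derived prefix plus the declared path
endpoints = paste0(route_prefix(names(declared)), declared)
print(endpoints)
# [1] "/model/prediction"   "/reports/pdf/weekly"
```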

A few simple function definitions would allow us to do this for any
given package that uses this file structure.

The first function here just grabs the directory from the current
package where I will define the endpoints that make up my API.

get_internal_routes = function(path = ".") {
  system.file("extdata", "api", "routes", path,
              package = utils::packageName(),
              mustWork = TRUE)
}

create_routes() recursively lists all of the .R files within the chosen
directory and names each one after its file path relative to that
directory, minus the extension. This makes it easy to build out a
number of “nested” routers that will all be mounted into the same API,
achieving the compartmentalisation that we desire. For example the two
files at <my_package>/inst/extdata/api/routes/model.R and
<my_package>/inst/extdata/api/routes/reports/pdf.R will take on the
names "model" and "reports/pdf" respectively.

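add_default_route_names = function(routes, dir) {
  # Strip the directory prefix; fixed() treats dir as a literal string
  # rather than a regular expression
  names = stringr::str_remove(routes, pattern = stringr::fixed(dir))
  # Drop the file extension
  names = stringr::str_remove(names, pattern = "\\.R$")
  names(routes) = names
  routes
}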

create_routes = function(dir) {
  # Recursively find every .R file below dir
  routes = list.files(
    dir, recursive = TRUE,
    full.names = TRUE, pattern = "\\.R$"
  )
  add_default_route_names(routes, dir)
}
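
To see the naming rule in action without installing a package, the following sketch builds a throwaway directory and applies the same idea using base R only. It is a dependency-free variant for illustration, not the code used in the project: create_routes_base() is a hypothetical name, and it lists relative paths directly rather than stripping the directory prefix afterwards.

```r
# Hypothetical base R variant of the route discovery step
create_routes_base = function(dir) {
  # Relative paths such as "reports/pdf.R"; only files ending in .R
  relative = list.files(dir, recursive = TRUE, pattern = "\\.R$")
  routes = file.path(dir, relative)
  # Name each file after its relative path, minus the extension
  names(routes) = sub("\\.R$", "", relative)
  routes
}

# Build a temporary directory that mimics inst/extdata/api/routes
demo_dir = tempfile("routes-")
dir.create(file.path(demo_dir, "reports"), recursive = TRUE)
invisible(file.create(file.path(demo_dir, "model.R")))
invisible(file.create(file.path(demo_dir, "reports", "pdf.R")))
invisible(file.create(file.path(demo_dir, "notes.md")))  # ignored: not .R

demo_routes = create_routes_base(demo_dir)
print(names(demo_routes))
# [1] "model"       "reports/pdf"
```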

The final few pieces of the puzzle are as follows. ensure_slash() makes
sure that a string begins with /, for the purpose of mounting
components to my router. add_plumber_definition() calls the necessary
functions from {plumber} to process a new route file, i.e. it creates
the routes from the decorated functions in the file and then mounts them
at a given path on an existing router object. For example, given a file
“test.R” that has a #* @get /identity decorator against a function
definition, and endpoint = "test", we would add /test/identity to the
existing router. generate_api() takes a named vector/list of file
paths, ensures each name has a leading slash and mounts them all to a
new Plumber router object.

ensure_slash = function(string) {
  has_slash = grepl("^/", string)
  if (has_slash) string else paste0("/", string)
}

add_plumber_definition = function(pr, endpoint, file, ...) {
  router = plumber::pr(file = file, ...)
  plumber::pr_mount(pr = pr,
                    path = endpoint,
                    router = router
  )
}

generate_api = function(routes, ...) {
  endpoints = purrr::map_chr(names(routes), ensure_slash)
  purrr::reduce2(
    .x = endpoints, .y = routes,
    .f = add_plumber_definition, ...,
    .init =  plumber::pr(NULL)
  )
}
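
The “test.R” example from the paragraph above can be reproduced in isolation. The sketch below assumes {plumber} version 1.0 or later is installed; the route file is written to a temporary location purely for illustration, and the handler body is made up.

```r
library(plumber)

# Write a throwaway route file containing one annotated handler
route_file = tempfile(fileext = ".R")
writeLines(c(
  "#* @get /identity",
  "function(x = \"\") {",
  "  list(x = x)",
  "}"
), route_file)

# Build a sub-router from the file, then mount it under /test on an
# otherwise empty router; this is the same pair of calls that
# add_plumber_definition() makes
sub_router = pr(file = route_file)
api = pr_mount(pr(), path = "/test", router = sub_router)

# Printing the router shows /identity nested under /test
print(api)
```

Running the resulting router with pr_run() would then serve the handler at /test/identity.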

With these defined I can then, as I develop my package, add new routes
by defining functions and adding {plumber} tag annotations to files
under inst/extdata/api/routes, and rebuild the new API with

get_internal_routes() %>%
  create_routes() %>%
  generate_api()

and nothing about this code is specific to my current package so is
transferable. As a concrete, but very much simplified example, I might
have the following collection of files/annotations under
<my_package>/inst/extdata/api/routes

## File: /example.R
# Taken from plumber quickstart documentation
# https://www.rplumber.io/articles/quickstart.html
#* @get /echo
function(msg="") {
  list(msg = paste0("The message is: '", msg, "'"))
}


## File: /test.R
#* @get /is_alive
function() {
  list(alive = TRUE)
}


## File: /nested/example.R
# Taken from plumber quickstart documentation
# https://www.rplumber.io/articles/quickstart.html
#* @get /echo
function(msg="") {
  list(msg = paste0("The message is: '", msg, "'"))
}

which would give me

get_internal_routes() %>%
  create_routes() %>%
  generate_api()

# # Plumber router with 0 endpoints, 4 filters, and 3 sub-routers.
# # Use `pr_run()` on this object to start the API.
# ├──[queryString]
# ├──[body]
# ├──[cookieParser]
# ├──[sharedSecret]
# ├──/example
# │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  ├──[queryString]
# │  ├──[body]
# │  ├──[cookieParser]
# │  ├──[sharedSecret]
# │  └──/echo (GET)
# ├──/nested
# │  ├──/example
# │  │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  │  ├──[queryString]
# │  │  ├──[body]
# │  │  ├──[cookieParser]
# │  │  ├──[sharedSecret]
# │  │  └──/echo (GET)
# ├──/test
# │  │ # Plumber router with 1 endpoint, 4 filters, and 0 sub-routers.
# │  ├──[queryString]
# │  ├──[body]
# │  ├──[cookieParser]
# │  ├──[sharedSecret]
# │  └──/is_alive (GET)

This {cookieCutter} example is available to view at our GitHub blog
repo.
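
One of the goals listed earlier was being able to create an instance of the application via a single function call. A minimal sketch of such an entry point is given here, assuming that get_internal_routes(), create_routes() and generate_api() are defined in the same package. The name run_api() and its default host and port are hypothetical choices for illustration, not something taken from the project.

```r
# Hypothetical exported entry point for the package.
# Assumes get_internal_routes(), create_routes() and generate_api()
# from this post are defined alongside it.
run_api = function(host = "127.0.0.1", port = 8000, ...) {
  routes = create_routes(get_internal_routes())
  api = generate_api(routes)
  # Further arguments in ... are passed on to plumber::pr_run()
  plumber::pr_run(api, host = host, port = port, ...)
}
```

A user of the package could then start the API with my_package::run_api(), mirroring my_package::run_app() in the {shiny} world.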

Basic testing

In my real project I refrained from putting any actual function
definitions in inst/. Instead each function that was part of the
exposed API was a proper exported function from my package (the
filenames for said functions also followed a regular structure,
api_<topic>.R). This allows for leveraging {testthat} against the logic
of each of the functions, as well as using other tools like {lintr} and
ensuring that dependencies, documentation etc. are all dealt with
appropriately. Testing individual functions that will be exposed as
routes can be a little different to testing other R functions, in that
the objects passed as arguments come from a request. As alluded to in
the introduction, I will prepare another blog post detailing some
elements of testing for API as a package, but a short snippet that I
found particularly helpful for testing that a running API is functioning
as I expect is included here.

The following code could be used to set up (and subsequently tear down)
a running API that is expecting requests for a package cookieCutter

# tests/testthat/setup.R

## run before any tests
# pick a random available port to serve your app locally
port = httpuv::randomPort()

# start a background R process that launches an instance of the API
# serving on that random port
running_api = callr::r_bg(
  function(port) {
    dir = cookieCutter::get_internal_routes()
    routes = cookieCutter::create_routes(dir)
    api = cookieCutter::generate_api(routes)
    api$run(port = port, host = "0.0.0.0")
  }, list(port = port)
)

# Small wait for the background process to ensure it
# starts properly
Sys.sleep(1)

## run after all tests
withr::defer(running_api$kill(), testthat::teardown_env())
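
A fixed one second wait may be too short on a slow machine and is longer than needed on a fast one. One possible alternative, sketched here with base R only, is to poll the port until something accepts a connection. wait_for_port() is a hypothetical helper and its timeout values are arbitrary; it was not part of the original setup.

```r
# Hypothetical helper: poll until something accepts TCP connections on
# the given port, or give up after `timeout` seconds
wait_for_port = function(port, host = "127.0.0.1", timeout = 10) {
  deadline = Sys.time() + timeout
  while (Sys.time() < deadline) {
    con = tryCatch(
      suppressWarnings(
        socketConnection(host = host, port = port, open = "r+",
                         blocking = TRUE, timeout = 1)
      ),
      # A refused connection raises an error: treat it as "not ready yet"
      error = function(e) NULL
    )
    if (!is.null(con)) {
      close(con)
      return(TRUE)
    }
    Sys.sleep(0.1)
  }
  FALSE
}
```

In setup.R the Sys.sleep(1) call could then be swapped for something like if (!wait_for_port(port)) stop("API did not start in time").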

A simple test to ensure that our is_alive endpoint works then might look
like

test_that("is alive", {
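  # The API listens on 0.0.0.0 (all interfaces); connect via loopback,
  # since 0.0.0.0 is not a valid client address on every platform
  res = httr::GET(glue::glue("http://127.0.0.1:{port}/test/is_alive"))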
  expect_equal(res$status_code, 200)
})
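
Because the handlers are ordinary exported functions, their logic can also be tested directly, without a running API. The sketch below uses a hypothetical api_echo() function modelled on the echo example earlier in this post; in a package it would live in a file such as R/api_example.R, and the route file would only reference it beneath its annotation.

```r
library(testthat)

# Hypothetical exported handler, modelled on the echo example above
api_echo = function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}

test_that("api_echo wraps the supplied message", {
  expect_equal(api_echo("hello")$msg, "The message is: 'hello'")
  # Falls back to an empty message when no value is supplied
  expect_equal(api_echo()$msg, "The message is: ''")
})
```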

Logging

{shiny} has some useful packages for adding logging; in particular
{shinylogger} is very helpful at giving you plenty of logging for little
effort on the user's part. As far as I could find, nothing similar
exists for {plumber}, so I set up a bunch of hooks using the {logger}
package to write information to both file and terminal. Since that could
form its own blog post, I will save that discussion for the future.
