Chapter 3 Structuring your Project

3.1 Shiny App as a Package

In the next chapter you will be introduced to the {golem} (Guyader et al. 2020) package, an opinionated framework for building production-ready Shiny Applications. This framework is used throughout this book, and it relies on the idea that every Shiny application should be built as an R package.

But in a world where Shiny Applications are mostly created as a series of files, why bother with a package?

3.1.1 What is in a production-grade Shiny App?

You probably haven’t realized it yet, but if you have built a significant (in terms of code base) Shiny application, chances are you have been using a package-like structure without knowing it.

Think about your last Shiny application, created either as a single file (app.R) or as a two-file app (ui.R and server.R). On top of these files, what did you need to make it a production-ready application?

3.1.1.1 Metadata

First of all, metadata. In other words, all the necessary information for something that runs in production: the name of the app, the version number (which is crucial to any serious, production-level project), what the application does, who to contact if something goes wrong, etc.
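In an R package, this metadata lives in the DESCRIPTION file. As a minimal sketch (the package name, version, and contact below are invented for illustration), the following writes such a file to a temporary directory and reads it back with base R's read.dcf():

```r
# Hypothetical metadata for an app called "myuberapp" (all values are made up)
desc_lines <- c(
  "Package: myuberapp",
  "Title: An Example Shiny Application",
  "Version: 0.1.0",
  "Authors@R: person('Jane', 'Doe', email = 'jane@example.com', role = c('aut', 'cre'))",
  "Description: Explores an example dataset in the browser.",
  "Imports: shiny"
)

desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, desc_file)

# read.dcf() returns a character matrix with one column per field
meta <- read.dcf(desc_file)
meta[1, "Package"]
meta[1, "Version"]
```

Anyone (or any tool) can then answer "which app is this, and which version?" without opening the application code.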

3.1.1.2 Dependencies

Second, you need to find a way to handle the dependencies. When you want to push your app into production, you do not want to have this conversation with the IT team:

  • IT: Hey, I tried to source("app.R") as you said, but I got an error.

  • R-dev: What is the error?

  • IT: It says “could not find package ‘shiny’”.

  • R-dev: Ah yes, you need to install {shiny}. Try to run install.packages("shiny").

  • IT: OK nice. What else?

  • R-dev: Let me think, try also install.packages("DT")… good? Now try install.packages("ggplot2"), and …

  • […]

  • IT: Ok, now I source the ‘app.R’, right?

  • R-dev: Sure!

  • IT: Ok, so it says “could not find function ‘runApp’”.

  • R-dev: Ah, you got to do library(shiny) at the beginning of your script. And library(purrr), and library(jsonlite), and…

Here, for example, calling library(purrr) and library(jsonlite) leads to a NAMESPACE conflict on the flatten() function, which both packages export, and that can cause you some debugging headaches (trust us, we have been there before). So, hey, it would be cool if we could have a Shiny app that only imports specific functions from a package, right?

We cannot stress enough that dependencies matter. You need to handle them, and handle them correctly, if you want to ensure a smooth deployment to production.
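Inside a package, selective imports are usually declared with {roxygen2} tags, or sidestepped entirely with the pkg::fun() syntax. The sketch below uses only packages that ship with R so that it runs anywhere; the summarise_top() helper is hypothetical:

```r
# The roxygen2 tags below ask for two specific functions only; when the
# package is documented, they become importFrom() lines in NAMESPACE.

#' Median of the n largest values
#'
#' @param x A numeric vector.
#' @param n Number of values to keep.
#' @importFrom stats median
#' @importFrom utils head
summarise_top <- function(x, n = 3) {
  largest <- head(sort(x, decreasing = TRUE), n)
  median(largest)
}

# Outside a package (or to be fully explicit), spell out the namespace
stats::median(utils::head(c(9, 7, 5, 3, 1), 3))
```

Either way, nothing is attached wholesale with library(), so two packages exporting the same function name cannot silently mask each other.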

3.1.1.3 Functions

Third, let’s say you are building a big application. Something with thousands of lines of code. You cannot build this large application by writing one or two files, as it is simply impossible to maintain in the long run or use on a daily basis. So, if we are developing a large application, we should split everything into smaller files. And maybe we can store those files in a specific directory. For instance, we can name this directory R/.

3.1.1.4 Documentation

Last but not least, we want our app to live long and prosper, which means we need to document it. It would be nice to find a way to comment each small piece of code to explain what these specific lines do, so that either the end users or the developers that take over the code will be able to maintain it.

3.1.1.5 Test

The other thing we need for our application to be successful in the long run is a testing infrastructure, so that we are sure we are not introducing any regression during development.

3.1.1.6 Build and Deploy

Oh, and it would also be nice if people could get a tar.gz and install it on their computer and have access to a local copy of the app!

3.1.1.7 Shiny App as a Package

OK, so let’s sum up what we need for our application:

  • metadata and dependencies, which is what you get from the DESCRIPTION + NAMESPACE files of a package. Even more useful is the fact that you can do “selective namespace extraction” inside a package, i.e. you can say “I want this function from this package”.

  • the app split into smaller .R files stored in a specific directory, which is exactly how a package is organized.

  • documentation and tests. We cannot emphasize enough how central documentation is to any package, so that question is solved too, and the same goes for the testing toolkit.

  • And of course, the “install everywhere” wish comes for free when a Shiny App is in a package.

3.1.2 Document and test

As with any production-grade software, documentation and testing will help ensure the usability and sustainability of your app.

3.1.2.1 Documenting your app

Documenting your Shiny app involves explaining features to the end users and also to the future developers (chances are this future developer will be you). The good news is that using the R package structure helps you leverage the common tools for documentation in R:

  • A README file that you will put at the root of your package, which will document how to install the package, and some information about how to use the package. Note that in many cases developers go for a .md file (short for markdown) because this format is automatically rendered on services like GitHub, GitLab, or any other main version control system.
  • Vignettes are longer-form documentation that explains in more depth how to use your app. They are also useful if you need to detail the core functions of the application using a static document. In a perfect world, you would create one vignette per Shiny page/tab.
  • Function documentation. Every function in your package should come with its own documentation, even if it is just for your future self. “Exported” functions, the ones that are available once you run library(myapp), should be fully documented and will be listed in the package help page. Internal functions need less documentation, but documenting them is the best way to be sure you can come back to the app in a few months and still know why things are the way they are, what the pieces of the app are used for, and how to use these functions.
  • If needed, you can build a {pkgdown} website that can either be deployed on the web or kept internally. It can contain installation steps for I.T., notes on internal features for developers, a user guide, etc.
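As an illustration of function documentation, here is a small sketch written with {roxygen2} comments. Both functions are hypothetical; the first is exported and gets a help page, while the second uses the @noRd tag so that its documentation stays in the source file only:

```r
#' Label a count for display in the UI
#'
#' @param n An integer count.
#' @param unit A string naming what is being counted.
#'
#' @return A length-one character vector such as "3 rows".
#' @export
#'
#' @examples
#' format_count(3L, "rows")
format_count <- function(n, unit = "rows") {
  paste(n, unit)
}

#' Internal helper: collapse repeated whitespace
#'
#' Documented for developers, but no help page is generated.
#'
#' @param x A character vector.
#' @noRd
squish <- function(x) {
  trimws(gsub("\\s+", " ", x))
}
```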

3.1.2.2 Testing

Nothing should go to production without being tested. Nothing.

Testing production apps is a broad question that we will come back to in another chapter, but let’s talk briefly about why using a package structure helps with testing.

Frameworks for package testing are robust and widely documented in the R world, and if you choose to embrace the “Shiny App as a Package” structure, you do not have to put in any extra effort to test your application back-end: use a canonical testing framework like {testthat} (Wickham 2020). Learning how to use it is not the subject of this chapter, so feel free to refer to the documentation, and see also Chapter 5 of the workshop: “Building a package that lasts”.

We will come back to testing in the “Build Yourself a Safety Net” chapter.

3.1.3 Deploy

Once your application is ready, you will want to send it to production. Most of the time, if not always, that means running it on someone else’s computer.

3.1.3.1 Local deployment

When adopting the package structure, you can use classical tools to locally install your Shiny application. A Shiny App as a package can be built as a tar.gz, sent to your colleagues, friends, and family, and even to CRAN. It can also be installed in any R-package repository. You can then install your packaged application along with its dependencies using the appropriate remotes::install_*() command. And, if you built your app with {golem}, you can launch the app using:

library(myuberapp)
run_app()

3.1.3.2 RStudio Connect, Shiny Server, shinyapps.io

Sending a Shiny application to an RStudio product currently requires placing an R script at the root of your package directory.

To run your application on these platforms, you will need to use a “standard” Shiny filename pattern, i.e. an app.R file or ui.R / server.R, and send the whole thing to the server. To integrate your “Shiny App as a package” into Connect or Shiny Server, you can adopt two strategies:

  • Use an internal package manager like RStudio Package Manager, where the package app is installed. Once the package is available in your internal repository, you can create an app.R file with only this small piece of code:

library(myuberapp)
run_app()

  • Upload the complete content of the package directory to the server. You will need an app.R file at the root of the package:

# Load all R scripts and functions
pkgload::load_all()
# Launch the application
shiny::shinyApp(ui = app_ui(), server = app_server)

This is the file you will get if you run one of the three RStudio-related functions from {golem}, for example golem::add_rconnect_file().

3.1.3.3 Docker containers

Docker containers can be used to embed a frozen OS that will launch your application in a safe environment. In order to dockerize your app, create a Dockerfile that lists your package to be installed as in the local deployment with the appropriate remotes::install_*() function. Then, use as a CMD R -e 'options("shiny.port" = 80, shiny.host = "0.0.0.0"); myuberapp::run_app()' so that your app will be launched when starting the Docker container. Change the output port according to your needs.

Note that {golem} provides you the Dockerfile you need with golem::add_dockerfile().

We will return to Docker containers in a few chapters, notably in the context of building a Dockerfile for {golem}-based applications.

3.1.4 Resources

In the rest of this book, we will assume you are comfortable with building an R package. If you need to read some resources before continuing, feel free to have a look at these links:

3.2 Using Shiny Modules

Modules are one of the most powerful tools for building Shiny Applications in a maintainable and sustainable manner.

3.2.1 Why are we using Shiny modules?

Small is beautiful. Being able to properly cut a codebase into small modules will help developers build a mental model of the application (Remember “What is a complex Shiny Application?”). But what are Shiny modules?

Shiny modules address the namespacing problem in Shiny UI and server logic, adding a level of abstraction beyond functions.

Let us first untangle this quote with an example about the Shiny namespace problem.

3.2.1.1 One million “Validate” buttons

A big Shiny application usually requires reusing pieces of ui/server, which makes it hard to name and identify similar inputs and outputs.

Shiny requires its outputs and inputs to have a unique id. And, unfortunately, we cannot bypass that: when you send a plot from R to the browser, i.e. from the server to the ui, the browser needs to know exactly where to put this element. This “exactly where” is handled through the use of an id. Ids are not Shiny specific: they are at the very root of the way web pages work. Understanding all of this is not the purpose of this chapter: just remember that Shiny input and output ids have to be unique, just as any id on a webpage, so that the browser knows where to put what it receives from R, and R knows what to listen to from the browser. This uniqueness requirement is made a little more complex by the way Shiny handles names, as it shares a global pool for all the id names, with no native way to use namespaces. Wait, namespaces?

Namespaces are a computer science concept created to handle a common issue: how to share the same name for a variable in various places of your program without them conflicting. In other words, how to use an object called plop several times in the program, and still be sure that it is correctly used depending on the context.

R itself has a system for namespaces; this is what packages do and why you can have purrr::flatten and jsonlite::flatten on the same computer and inside the same script: the function names are the same, but the two live in different namespaces, and the behavior of both functions can be totally different as the symbol is evaluated inside two different namespaces. If you want to learn more about namespaces, please refer to the 7.4 Special environments chapter from Advanced R, or turn to any computer science book: namespaces are pretty common in any programming language.
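To make the idea concrete without loading either package, here is a small sketch in which two plain environments stand in for two namespaces. The environments and both flatten() definitions are hypothetical stand-ins, not the real {purrr} or {jsonlite} functions:

```r
# Two "namespaces", modelled here as plain environments
ns_a <- new.env()
ns_b <- new.env()

# Same name, different behavior, no conflict
ns_a$flatten <- function(x) unlist(x, recursive = TRUE)   # flattens every level
ns_b$flatten <- function(x) unlist(x, recursive = FALSE)  # flattens one level only

nested <- list(1, list(2, list(3)))

flat_a <- ns_a$flatten(nested)  # a numeric vector
flat_b <- ns_b$flatten(nested)  # still a list, one level of nesting removed

is.numeric(flat_a)
is.list(flat_b)
```

Which flatten() runs depends only on where the name is looked up, which is exactly what the pkg::fun() syntax makes explicit for real packages.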

So, that is what modules are made for: creating small namespaces where you can safely define ids without conflicting with other ids in the app. Why do we need to do that? Think about the number of times you created an “OK” or “validate” button. How have you been handling that so far? By creating validate1, validate2, and so on and so forth. But if you think about it, you are mimicking a namespacing process: a validate in namespace 1, another in namespace 2.

Consider the following Shiny application:

library(shiny)
ui <- function() {
  fluidPage(
    sliderInput(
      inputId = "choice1", 
      label = "choice 1", 
      min = 1, max = 10, value = 5
    ),
    actionButton(
      inputId = "validate1",
      label = "Validate choice 1"
    ),
    sliderInput(
      inputId = "choice2",
      label = "choice 2", 
      min = 1, max = 10, value = 5
    ),
    actionButton(
      inputId = "validate2", 
      label = "Validate choice 2"
    )
  )
}

server <- function(input, output, session) {
  observeEvent( input$validate1 , {
    print(input$choice1)
  })
  observeEvent( input$validate2 , {
    print(input$choice2)
  })
}

shinyApp(ui, server)

This, of course, is an approach that works. Well, it works as long as your code base is small. But how can you be sure that you are not creating validate6 on line 55 and another on line 837? How can you be sure that you are deleting the correct combination of UI/server components if they are named that way? Also, how do you work smoothly in a context where you have to scroll from sliderInput("choice1" to observeEvent( input$validate1 , { which might be separated by thousands of lines?

3.2.1.2 A bite-sized code base

Build your application through multiple smaller applications that are easier to understand, develop and maintain, using {shiny} (Chang et al. 2020) modules.

We assume that you know the saying that “if you copy and paste something more than twice, you should make a function”. So, in a Shiny application, how can we refactor a partially repetitive piece of code so that it is reusable?

Yes, you guessed right: using Shiny modules. Shiny modules aim at three things: simplifying “id” namespacing, splitting the code base into a series of functions, and allowing UI/server parts of your app to be reused. Most of the time, modules are used to do the first two. In our case, we could say that 90% of the modules we write are never reused; they are here to allow us to split the code base into smaller, more manageable pieces.

With Shiny modules, you will be writing a combination of UI and server functions. Think of them as small, standalone Shiny apps, which handle a fraction of your global application. If you develop R packages, chances are you have split your functions into a series of smaller functions. With Shiny modules, you are doing the exact same thing: with just a little bit of tweaking, you can split your application into a series of smaller applications.

3.2.2 When should you modularize?

No matter how big your application is, it is always safe to start modularizing from the very beginning. The sooner you use modules, the easier downstream development will be. It is even easier if you are working with {golem}, which promotes the use of modules from the very beginning of your application.

“Yes, but I just want to write a small app, nothing fancy”

Production apps almost always start as a small Proof of Concept. Then, the small PoC becomes an interesting idea. Then, this idea becomes a strategic asset. And before you know it, your ‘not-that-fancy’ app needs to become larger and larger. So, you will be better off starting on solid foundations from the very beginning.

3.2.3 A practical walk through

An example is worth a thousand words, so let’s explore the code of a very small Shiny application that is split into modules.

3.2.3.1 Your first Shiny Module

Let’s try to transform the above example (the one with two sliders and two action buttons) into an application with a module. The module is a small shiny application separated from the main code:

# Re-usable module
mod_ui <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(
      inputId = ns("choice"), 
      label = "Choice",
      min = 1, max = 10, value = 5
    ),
    actionButton(
      inputId = ns("validate"),
      label = "Validate Choice"
    )
  )
}

mod_server <- function(input, output, session) {
  ns <- session$ns
  
  observeEvent( input$validate , {
    print(input$choice)
  })
  
}

# Main application
library(shiny)
app_ui <- function() {
  fluidPage(
    mod_ui(id = "mod_ui_1"),
    mod_ui(id = "mod_ui_2")
  )
}

app_server <- function(input, output, session) {
  callModule(mod_server, id = "mod_ui_1")
  callModule(mod_server, id = "mod_ui_2")
}

shinyApp(app_ui, app_server)

Let’s stop for a minute and decompose what we have here.

The server function of the module (mod_server()) is pretty much the same as before: you use the same code as the one you would use in any server part of a Shiny application.

The ui function of the module (mod_ui()) requires specific things. There are two new things: ns <- NS(id) and ns(inputId). That is where the namespacing happens. Remember the previous version where we identified our two “validate” buttons with slightly different namespaces: validate1 and validate2? Here, we create namespaces with the ns() function, built with ns <- NS(id). This line, ns <- NS(id), is added on top of all module ui functions and will allow building namespaces with the module id.

To understand what it does, let us try and run it outside Shiny:

id <- "mod_ui_1"
ns <- NS(id)
ns("choice")
[1] "mod_ui_1-choice"

And here it is, our namespaced id!
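Conceptually, NS() does little more than build a function that prefixes ids. The following base R sketch mimics that behavior; make_ns() is a hypothetical stand-in written for illustration, not the actual {shiny} implementation:

```r
# A function factory: each call returns a prefixing function bound to one id
make_ns <- function(id) {
  function(input_id) paste(id, input_id, sep = "-")
}

ns_1 <- make_ns("mod_ui_1")
ns_2 <- make_ns("mod_ui_2")

# The same short name gives two distinct, non-conflicting ids
ns_1("validate")
ns_2("validate")
```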

Each call to a module with callModule() requires a different id argument that will allow creating various internal namespaces, preventing id conflicts. Then you can have as many validate inputs as you want in your app, as long as this validate has a unique id inside your module.
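Note that since {shiny} 1.5.0, module servers can also be written with moduleServer(), which the {shiny} documentation recommends over callModule(). The sketch below is the same two-slider module rewritten in that style; it is an alternative formulation, not the code used in the rest of this chapter:

```r
library(shiny)

# Module UI: unchanged, ids are namespaced with ns()
mod_ui <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(ns("choice"), "Choice", min = 1, max = 10, value = 5),
    actionButton(ns("validate"), "Validate Choice")
  )
}

# Module server: takes the id and wraps its logic in moduleServer()
mod_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    observeEvent(input$validate, {
      print(input$choice)
    })
  })
}

app_ui <- function() {
  fluidPage(
    mod_ui("mod_ui_1"),
    mod_ui("mod_ui_2")
  )
}

app_server <- function(input, output, session) {
  # No callModule(): the module server is called like a regular function
  mod_server("mod_ui_1")
  mod_server("mod_ui_2")
}

# Only launch when run interactively
if (interactive()) {
  shinyApp(app_ui, app_server)
}
```

The namespacing logic is identical; only the way the server part is declared and called changes.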

3.2.3.2 Passing arguments to your modules

Shiny modules will potentially be reused and may need specific user interface and inputs. This requires using extra arguments to generate the UI and server. As UI and server are functions, you can set parameters that will be used to configure the internals of the result.

As you can see, the app_ui contains a series of calls to the mod_ui(unique_id, ...) function, which accepts additional arguments like any other function:

mod_ui <- function(id, button_label) {
  ns <- NS(id)
  tagList(
    actionButton(ns("validate"), button_label)
  )
}

mod_ui("mod_ui_1", button_label = "Validate Choice")
mod_ui("mod_ui_2", button_label = "Validate Choice, again")
<button id="mod_ui_1-validate" type="button" class="btn btn-default action-button">Validate Choice</button>

<button id="mod_ui_2-validate" type="button" class="btn btn-default action-button">Validate Choice, again</button>

The app_server side contains a series of callModule(mod_server, unique_id, ...), also allowing additional parameters, just like any other function.

As a live example, we can have a look at mod_dataviz.R from the {tidytuesday201942} (Fay 2020l) Shiny application, available at https://connect.thinkr.fr/tidytuesday201942/.

This application contains 6 tabs, 4 of them being pretty much alike: a side bar with inputs, a main panel with a button, and the plot. This is a typical case where you should reuse modules: if two or more parts are relatively similar, it is easier to bundle them inside a reusable module, and condition the ui/server with function arguments.

FIGURE 3.1: Snapshot of the {tidytuesday201942} Shiny application.

Here is an extract of how it works in the module UI:

mod_dataviz_ui <- function(
  id, 
  type = c("point", "hist", "boxplot", "bar")
) {
  # Build the namespacing function and resolve `type` to a single value
  ns <- NS(id)
  type <- match.arg(type)
  tagList(
    h4(
      sprintf( "Create a geom_%s", type )
    ),
    # names_that_are() is a helper defined elsewhere in the application
    if (type %in% c("boxplot", "bar")) {
      selectInput(
        ns("x"),
        "x", 
        choices = names_that_are(c("logical", "character"))
      )
    } else {
      selectInput(
        ns("x"),
        "x", 
        choices = names_that_are("numeric")
      )
    }
  )
}

And in the module server:

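mod_dataviz_server <- function(
  input, 
  output, 
  session, 
  type
) {
  # Extract: big_epa_cars and color_values() are defined elsewhere 
  # in the application
  r <- reactiveValues(plot = NULL)
  if (type == "point") {
    # Inputs can only be read inside a reactive context, hence observe()
    observe({
      req(input$x, input$y, input$color)
      x <- rlang::sym(input$x)
      y <- rlang::sym(input$y)
      color <- rlang::sym(input$color)
      r$plot <- ggplot(
        big_epa_cars, 
        aes(!!x, !!y, color = !!color)
      )  +
        geom_point() + 
        scale_color_manual(
          values = color_values(
            seq_along(unique(pull(big_epa_cars, !!color))), 
            palette = input$palette
          )
        )
    })
  } 
}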

Then, the UI of the entire application is:

app_ui <- function() {
  # [...]
  tagList(
    fluidRow(
      id = "geom_point", mod_dataviz_ui("dataviz_ui_1", "point")
    ), 
    fluidRow(
      id = "geom_hist", mod_dataviz_ui("dataviz_ui_2", "hist")
    )
  )
}

And the app_server() of the application:

app_server <- function(input, output, session) {
  #callModule(mod_raw_server, "raw_ui_1")
  callModule(mod_dataviz_server, "dataviz_ui_1", type = "point")
  callModule(mod_dataviz_server, "dataviz_ui_2", type = "hist")
  callModule(mod_dataviz_server, "dataviz_ui_3", type = "boxplot")
  callModule(mod_dataviz_server, "dataviz_ui_4", type = "bar")
}

3.2.4 Communication between modules

One of the hardest parts of using modules is sharing data across them. There are at least three approaches:

  • Returning a reactive function
  • The “stratégie du petit r” (to be pronounced with a French accent of course)
  • The “stratégie du grand R6”.

3.2.4.1 Returning values from the module

One common approach is to return a reactive function from one module, and pass it to another in the general app_server() function.

Here is an example that illustrates this pattern.

# Module 1
mod_ui <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(ns("choice"), "Choice", 1, 10, 5)
  )
}

mod_server <- function(input, output, session) {
  return(
    reactive({
      input$choice
    })
  )
}

# Module 2
mod_b_ui <- function(id) {
  ns <- NS(id)
  tagList(
    actionButton(ns("validate"), "Print")
  )
}

mod_b_server <- function(input, output, session, react) {
  observeEvent( input$validate , {
    print(react())
  })
}

# Application
library(shiny)
app_ui <- function() {
  fluidPage(
    mod_ui("mod_ui_1"),
    mod_b_ui("mod_ui_2")
  )
}

app_server <- function(input, output, session) {
  res <- callModule(mod_server, "mod_ui_1")
  callModule(mod_b_server, "mod_ui_2", react = res)
}

shinyApp(app_ui, app_server)

This strategy works well, but for large Shiny Apps it might be hard to handle large lists of reactive outputs / inputs and to keep track of how things are organized. It might also create some reactivity issues, as a large number of reactive function calls is harder to control and can lead to too much computation on the server.

3.2.4.2 The “stratégie du petit r”

In this strategy, we create a global reactiveValues list that is passed along to other modules. The idea is that it lets you worry less about what your module takes as input and what it outputs. You can think of this approach as creating a small, internal database that is passed along to all the modules of your application.

Below, we create a “global” (in the sense that it is initiated at the top of the module hierarchy) reactiveValues() object in the app_server() function. It will then go through all modules, passed as a function argument.

# Module 1
mod_ui <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(ns("choice"), "Choice", 1, 10, 5)
  )
}

mod_server <- function(input, output, session, r) {
  observeEvent( input$choice , {
    r$choice <- input$choice
  })
  
}

# Module 2
mod_b_ui <- function(id) {
  ns <- NS(id)
  tagList(
    actionButton(ns("validate"), "Print")
  )
}

mod_b_server <- function(input, output, session, r) {
  ns <- session$ns
  observeEvent( input$validate , {
    print(r$choice)
  })
  
}

# Application
library(shiny)
ui <- function() {
  fluidPage(
    mod_ui("mod_ui_1"),
    mod_b_ui("mod_ui_2")
  )
}

server <- function(input, output, session) {
  r <- reactiveValues()
  callModule(mod_server, "mod_ui_1", r)
  callModule(mod_b_server, "mod_ui_2", r)
}

shinyApp(ui, server)

The good thing about this method is that whenever you add something in one module, it is immediately available in all other modules where r is present. The downside is that it can make it harder to reason about the app, as the input/content of the r is not specified anywhere unless you explicitly document it: the parameter to your server function being “r” only, you need to be a little bit more zealous when it comes to documenting it.

Note that if you want to share your module, for example in a package, you should document the structure of the r. For example:

#' @param r a `reactiveValues()` list with a `choice` element in it. 
#' This `r$choice` will be printed to the R console.

3.2.4.3 The “stratégie du grand R6”

Similarly to the “stratégie du petit r”, we can create an R6 object, which is passed along inside the modules. As this R6 object is not a reactive object and is not meant to be used as such, this reduces uncontrolled reactivity in the application, and thus reduces the complexity of handling chain reactions across modules. Of course, you need to have another special tool in your app to trigger elements. All this will be explained in detail in the chapter Reactivity anti-patterns of this book, and you can find an example of this pattern inside the {hexmake} (Fay 2020g) application.
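As a minimal sketch of the idea, assuming the {R6} package is installed: the AppState class and the two plain functions standing in for module servers below are hypothetical. The point is only that an R6 object has reference semantics, so a change made in one place is visible everywhere the object was passed, without any reactivity involved:

```r
library(R6)

# A hypothetical, non-reactive container for application state
AppState <- R6Class(
  "AppState",
  public = list(
    choice = NULL,
    set_choice = function(value) {
      self$choice <- value
      invisible(self)
    }
  )
)

# Stand-ins for two module servers that both receive the same object
module_writer <- function(state, value) {
  state$set_choice(value)
}
module_reader <- function(state) {
  state$choice
}

state <- AppState$new()
module_writer(state, 7)
module_reader(state)
```

Because nothing here is reactive, the reading side is not notified of the change; this is why a separate triggering mechanism is needed in a real application.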

3.3 Structuring your app

3.3.1 Business logic & application logic

A Shiny application has two main components: the application logic and the business logic.

  • Application logic is what makes your Shiny app interactive: structure, buttons, tables, interactivity, etc. These components are not specific to your core business: you could use them for any other line of work or professional context. They have no use case other than your interactive application: they are not meant to be used outside your app, and you would not use them in a markdown report, for instance.

  • Business logic consists of the components with the core algorithms and functions that make your application specific to your area of work. You can recognize these elements as the ones that can be run outside any interactive context. This is the case for specific computations and algorithms, custom plots or geoms for {ggplot2} (Wickham, Chang, et al. 2020), specific calls to a database, etc.

These two components do not have to live together. And in reality, they should not live together if you want to keep your sanity when you are building an app. If you keep everything together in the same file, you will end up having to rerun the app from scratch and spend five minutes clicking everywhere just to be sure you have correctly set the color palette for the graph on the last tabPanel().

Trust us, we have been there, and it is not pretty.

So what is the way to go? Extract the business function from the reactive functions. Literally. Compare this pattern:

# Application
library(shiny)
library(dplyr)
ui <- function() {
  tagList(
    tableOutput("tbl")
  )
}

server <- function(input, output, session) {
  output$tbl <- renderTable({
    mtcars %>%
      # [...] %>% 
      # [...] %>% 
      # [...] %>% 
      # [...] %>% 
      # [...] %>% 
      top_n(10)
  })
}

shinyApp(ui, server)

To this one:

library(shiny)
library(dplyr)

# Business logic
top_this <- function(tbl) {
  tbl %>%
    # [...] %>% 
    # [...] %>% 
    # [...] %>% 
    # [...] %>% 
    top_n(10)
}

# Application
ui <- function() {
  tagList(
    tableOutput("tbl")
  )
}

server <- function(input, output, session) {
  output$tbl <- renderTable({
    top_this(mtcars)
  })
}

shinyApp(ui, server)

Both scripts do the exact same thing. The difference is that the second code can be easily explored without having to relaunch the app. You will be able to build a reproducible example to explore, illustrate and improve top_this(). This function can be tested, documented and reused outside the application. Moreover, this approach lowers the cognitive load when debugging: you either debug an application issue, or a business logic issue. You never debug both at the same time.
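For instance, a business-logic function can be checked from a plain R session, with no app running. The sketch below uses a hypothetical base R helper, top_rows(), as a stand-in for top_this(); in a package, the same checks would typically live in tests/testthat/ as {testthat} expectations:

```r
# Hypothetical business-logic function: the n rows with the largest
# values in a given column
top_rows <- function(tbl, column, n = 10) {
  ordered <- tbl[order(tbl[[column]], decreasing = TRUE), , drop = FALSE]
  head(ordered, n)
}

res <- top_rows(mtcars, "mpg", n = 10)

# Plain assertions: no browser, no clicking, no reactive context
stopifnot(
  nrow(res) == 10,
  res$mpg[1] == max(mtcars$mpg),
  !is.unsorted(rev(res$mpg))
)
```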

Even more, think about the future: how likely are the colors or the UI to change, compared to how likely the core algorithms are to change? As said in The Art of Unix Programming, “Fashions in the look and feel of GUI toolkits may come and go, but raster operations and compositing are forever” (Raymond 2003). In other words, the core back-end, once consolidated, will potentially stay unchanged forever. On the other hand, the front-end might change: new colors, new graphic designs, new interactions, new visualization libraries… Whenever this happens, you will be happy you have separated the business logic from the application logic, as you will have to change less code.

How to do that? Put your application logic inside a file (typically, a module), and the business logic in another R script (typically starting with fct_ or utils_). You can even write the business logic inside another package, making these functions really reusable outside your application.

3.3.2 Small is beautiful (bis repetita)

There are a lot of reasons for splitting your application into smaller pieces, including the fact that it is easier to maintain, easier to decipher, and it facilitates collaboration.

There is nothing harder to maintain than a Shiny app only made of a unique 1000-line long app.R file. Well, there still is the 10000-line long app.R file, but you get the idea. Long scripts are almost always synonymous with complexity when it comes to building software. Of course, small and numerous scripts do not systematically prevent codebase complexity, but they do simplify collaboration and maintenance, and divide the application logic into smaller, easier-to-understand bits of code.

So yes, big files are complex to handle and make development harder. Here is what happens when you work on an application for production:

  • You will work during a long period of time (either in one run or split across several months) on your codebase. Hence, you will have to get back to pieces of code you wrote a long time ago.

  • You will possibly develop with other developers. Maintaining a code base when several people work on the same directory is already a complex thing: from time to time you might work on the same file separately, a situation where you will have to be careful about what and how to merge things when changes are implemented. It is almost impossible to work together on one same file all along the project without losing your mind: even more if this file is thousands of lines long.

  • You will implement numerous features. Numerous features imply a lot of UI & server interactions. In an app.R file containing thousands of lines, it is very hard to match the UI element with its server counterpart. When the UI is on line 50 and the server on line 570, you will be scrolling a lot when working on these elements.

3.3.3 Conventions matter

In this section you will find a suggestion for a naming convention for your app files that will help you and your team be organized.

Splitting files is good. Splitting files using a defined convention is better. Why? Because using a common convention for your files helps the other developers (and potentially you) to know exactly what is contained in a specific file, making it easier to navigate through the codebase, be it for newcomers or for developers already familiar with the software.

As developed in “Refactoring at Scale” (Lemaire 2020), lacking a defined file structure for a codebase leads to slower productivity in the long run, notably when new engineers join the team: engineers with a knowledge of the file structure have learned how to navigate through the codebase, but newcomers will find it hard to understand how everything is organized. And of course, in the long run, even developers with a knowledge of the structure can get lost, even more if getting back to a project that they have not

Because it’s easier to maintain the status quo, instead of proactively beginning to organize related files […], engineers instead learn to navigate the increasingly sprawling code. New engineers introduced to the growing chaos raise a warning flag and encourage the team to begin splitting up the code, but these concerns fall to deaf ears […]. Eventually, the codebase reaches a critical mass where the persistent lack of organization has dramatically slowed productivity across the engineering team. Only then does the team take the time to draft a plan for grooming the codebase, at which point the number of variables to consider is far greater than had they made a concerted effort to tackle the problem months (or even years) earlier.

Using a convention allows everyone to know where to look when debugging, refactoring, or implementing new features. For example, if you follow {golem}’s convention (which is the one developed in this section), you will know immediately that a file starting with mod_ contains a module. If you take over a project, look in the R/ folder, and see files starting with these three letters, you will know immediately that these files contain modules.
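As a minimal sketch of what this prefix buys you, the modules of a project can be found with a simple pattern match on file names. The file names below are hypothetical and only stand in for the content of an R/ folder; in a real project you could get the same result with list.files("R", pattern = "^mod_").

```r
# Hypothetical file names, standing in for the content of an R/ folder
r_files <- c(
  "app_ui.R",
  "app_server.R",
  "mod_01_import.R",
  "mod_02_plot.R",
  "utils_ui.R"
)

# Keep only the files whose name starts with the mod_ prefix
module_files <- grep("^mod_", r_files, value = TRUE)
module_files
```

Because the prefix carries the meaning, no file has to be opened to know which ones hold modules.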

Here is our proposal for a convention defining how to split your application into smaller pieces.

First of all, put everything into an R/ folder. If you build your app using the {golem} framework, this is already the case. We use the package convention to hold the functions of our application.

The naming convention in {golem} is the following:

  • app_*.R (typically app_ui.R and app_server.R) contain the top level functions defining your user interface and your server function.

  • fct_* files contain the business logic, potentially large functions. They are the backbone of the application and may not be specific to a given module. They can be added using {golem} with the add_fct("name") function.

  • mod_* files each contain a single module. Many Shiny apps contain a series of tabs, or at least a tab-like pattern, so we suggest that you number them according to their step in the application. Tabs are almost always named in the user interface, so you can use the tab name in the file name. For example, if you build a dashboard where the first tab is called “Import”, you should name your file mod_01_import.R. You can create this file with a module skeleton using golem::add_module("01_import").

  • utils_* are files that contain utilities, which are small helper functions. For example, you might want to have a not_na, which is not_na <- Negate(is.na), a not_null, or small tools that you will be using application-wide. Note that you can also create utils for a specific module.

  • *_ui_* files, for example utils_ui.R, contain anything related to the user interface.

  • *_server_* are files that contain anything related to the application’s back-end. For example fct_connection_server.R will contain functions that are related to the connection to a database, and which are specifically used from the server side.
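To make the utils_* entry more concrete, here is a minimal sketch of what such a file could hold. The file name R/utils_helpers.R is hypothetical; not_na is defined exactly as in the text, and not_null is assumed to follow the same pattern.

```r
# Sketch of a hypothetical R/utils_helpers.R file

# Returns TRUE for every element that is not NA
not_na <- Negate(is.na)

# Returns TRUE when the object is not NULL (assumed to mirror not_na)
not_null <- Negate(is.null)

# Example usage: drop the missing values from a vector
values <- c(1, NA, 3)
clean_values <- values[not_na(values)]
clean_values
```

Helpers of this kind are small enough to live together in one file, and the utils_ prefix tells readers they contain no business logic.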

Note that when building a module file with {golem}, you can also create fct_ and utils_ files that will hold functions and utilities for this specific module. For example, golem::add_module("01_import", fct = "readr", utils = "ui") will create R/mod_01_import.R, R/mod_01_import_fct_readr.R and R/mod_01_import_utils_ui.R.
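The naming pattern behind these three files can be sketched in a few lines of base R. The function module_file_names() below is a hypothetical helper written for illustration only: it is not part of {golem} and does not create any file, it just spells out the file names that the convention implies.

```r
# Hypothetical helper (not part of {golem}): builds the file names implied
# by the naming convention for a module and its optional fct/utils files.
module_file_names <- function(name, fct = NULL, utils = NULL) {
  base <- paste0("mod_", name)
  files <- paste0(base, ".R")
  if (!is.null(fct)) {
    # Functions specific to this module
    files <- c(files, paste0(base, "_fct_", fct, ".R"))
  }
  if (!is.null(utils)) {
    # Utilities specific to this module
    files <- c(files, paste0(base, "_utils_", utils, ".R"))
  }
  file.path("R", files)
}

import_files <- module_file_names("01_import", fct = "readr", utils = "ui")
import_files
```

Every file related to the module shares the mod_01_import prefix, so the files sort together in the R/ folder.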

Of course, as with any convention, you might occasionally feel like deviating from the general pattern. Your app may not have that many functions, or maybe the functions can all fit into one utils_ file. But be it one or thousands of files, it is always a good practice to stick to a formalized pattern as much as possible.

References

Chang, Winston, Joe Cheng, JJ Allaire, Yihui Xie, and Jonathan McPherson. 2020. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.

Fay, Colin. 2020g. Hexmake: Hex Stickers Maker. https://github.com/colinfay/hexmake.

Fay, Colin. 2020l. Tidytuesday201942: A Golem App for Tidy Tuesday. http://www.github.com/ColinFay/tidytuesday201942.

Guyader, Vincent, Colin Fay, Sébastien Rochette, and Cervan Girard. 2020. Golem: A Framework for Robust Shiny Applications. https://github.com/ThinkR-open/golem.

Lemaire, Maude. 2020. Refactoring at Scale. O'Reilly Media. https://learning.oreilly.com/library/view/refactoring-at-scale/9781492075523/.

Raymond, Eric S. 2003. The Art of Unix Programming (the Addison-Wesley Professional Computing Series). Addison-Wesley.

Wickham, Hadley. 2020. Testthat: Unit Testing for R. https://CRAN.R-project.org/package=testthat.

Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2020. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.


  1. {roxygen2} comes with a @noRd tag, which prevents the documentation from being built. This lets you write the documentation using the same tags as for an exported function, without the internal functions being documented in the final package. For example, that is why by default the modules built with {golem} version > 0.2.0 come with @noRd: you should document them, but chances are you do not need to export them.↩︎

  2. Most of the time, pieces / panels of the app are too unique to be reused elsewhere.↩︎

  3. Well, of course you can still have inner module id conflicts, but they are easier to avoid, detect, and fix.↩︎

