Chapter 10 Building an “ipsum-app”

10.1 Prototyping is crucial

10.1.1 Prototype, then polish

Prototyping first may help keep you from investing far too much time for marginal gains.

And yet another Rule from the Art of Unix Programming: “Rule of Optimization: Prototype before polishing. Get it working before you optimize it.”

Getting things to work before trying to optimize the app is always a good approach:

  • Making things work before working on low-level optimization makes the whole engineering process easier: having a “minimum viable product” that works, even if slowly and not perfectly, gives a stronger sense of success to the project. For example, if you are building a vehicle, it feels like more of a success to start with a skateboard than with a wheel: you quickly have a product that can be used to move, instead of waiting for the end of the project before finally having something useful.

FIGURE 10.1: Building a Minimum Viable Product

  • Abstraction is hard, and makes the code base harder to work with. You have probably heard many times that if you are copying and pasting something more than twice, you should write a function. And with Shiny, if you are writing a piece of the app more than twice, you should write modules. But while these kinds of abstractions are elegant and optimized, they can make the software harder to work on while building it. So before focusing on turning something into a function, make it work first. As said in R for Data Science (Wickham and Grolemund 2017) about abstraction with {purrr} (Henry and Wickham 2020):

Once you master these functions, you’ll find it takes much less time to solve iteration problems. But you should never feel bad about using a for loop instead of a map function. The map functions are a step up a tower of abstraction, and it can take a long time to get your head around how they work. The important thing is that you solve the problem that you’re working on, not write the most concise and elegant code (although that’s definitely something you want to strive towards!).

As a small example, we can refer to the binding module from {hexmake} (Fay 2020g): this module manipulates namespaces, inputs and session to automatically bind inputs to the R6 object containing the image (see the implementation in the {hexmake} repository). That’s an elegant solution: instead of duplicating content, we use functions to automatically bind events. But that is a higher level of abstraction: we manipulate different levels of namespacing and inputs, making the code harder to reason about when you have to change the code base.

  • It’s hard to identify upfront the real bottlenecks of the app. As long as the app is not in a working state, it is very hard to identify the real pieces of code that need to be optimized. Chances are that if you ask yourself upfront what the app bottlenecks will be, you will not aim right. So instead of losing time focusing on specific pieces of code you think need to be optimized, start by having something that works, then optimize the code. In other words, “Make It Work. Make It Right. Make It Fast” (Kent Beck).

  • It’s easier to spot mistakes when you have something that can run. If a piece of software runs, it is straightforward to check whether a change in the codebase breaks the software or not: it either still runs or it does not.
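As a small illustration of the R for Data Science quote above, the sketch below solves the same toy problem twice: first with a plain for loop, then with map_dbl() from {purrr}. The samples list is made up for this example; the point is only that the loop version is a perfectly valid first draft.

```r
library(purrr)

# Hypothetical input: three numeric vectors whose means we want
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# First draft: a plain for loop that simply works
means_loop <- numeric(length(samples))
for (i in seq_along(samples)) {
  means_loop[i] <- mean(samples[[i]])
}
names(means_loop) <- names(samples)

# Later refactoring: the same computation, one step up the abstraction tower
means_map <- map_dbl(samples, mean)

# Both versions give the same named numeric vector
all.equal(means_loop, means_map)
```

Once the loop version runs, moving to the map version is a safe change: you can compare the two results before deleting the first draft.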

10.1.2 The “UI first” approach

Using what can be called a “UI first” approach when building an app is in most cases the safest way to go, for two main reasons.

10.1.2.1 Agreeing on specifications

First of all, it helps everybody involved in the application agree on what the app is supposed to do, and once the UI is set, there should be no “surprise implementation”. Well, at least, this is the best way to reduce the number of changes in the app, as the sooner we have a global idea of the app, the better. It is hard to implement a core new feature once the app is 90% finished, while it would have been far easier to implement if it had been identified from the very start. Indeed, implementing core features once the app is very advanced can be critical, as our application might not have been designed to work the way it now needs to work, so adding certain elements might require changes to the core architecture of the app. Once we agree on what elements compose the app, there should be no sudden “oh the app needs to do that thing now, sorry I hadn’t realized that before”.

We cannot blame the person ordering the app for not realizing everything needed to build the app: it is really hard to have a mental model of the whole software when we are writing specifications, not to mention when reading them. On the other hand, having a mock application with the UI really helps in realizing what the app is doing and how it works, and in agreeing with the developer that this is actually what we want our application to do (or in realizing that this is not something we actually need).

Prototyping the UI first should require the least possible computation from the server-side of your application. You focus on the appearance of the app: buttons, figures, tables, graphs… and how they interact with each other. At that stage of the design process, you will not be focusing on the correctness of the results or graphs: you will be placing elements on the front-end so that you can be sure that everything is there, even if some buttons do not trigger anything. At that point, the idea is to get the people ordering the app to think about what they actually need, and some questions might arise, like “oh, where is the button to download the results as a PDF?”. That precise moment is the perfect time for a change in specification.
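A mock of this kind can be sketched with a UI definition and an empty server function. Everything below (the dashboard title, the input and output ids, the button labels) is hypothetical and only illustrates the idea of placing elements before wiring them.

```r
library(shiny)

# Mock UI: every element the final app should contain is placed on the page
ui <- fluidPage(
  titlePanel("Sales dashboard (mock)"),
  sidebarLayout(
    sidebarPanel(
      selectInput("region", "Region", choices = c("North", "South")),
      actionButton("refresh", "Refresh"),
      # Placeholder button: visible to the client, not wired to anything yet
      actionButton("download_pdf", "Download results as PDF")
    ),
    mainPanel(
      plotOutput("trend"),
      tableOutput("summary")
    )
  )
)

# Empty server: nothing is computed at this stage
server <- function(input, output, session) {}

# Only launch the app when run interactively
if (interactive()) {
  shinyApp(ui, server)
}
```

Showing such a page to the people ordering the app is often enough to surface a missing button or panel before any back-end code exists.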

10.1.2.2 Organizing work

A pre-defined UI allows every person involved in the coding process to know which part of the app they are working on, and helps make sure that nothing is forgotten. As you might be working on the app as a team, you will need to find a strategy for efficiently splitting the work between coders. And it’s much easier to work on a piece of the app you can visually identify and integrate in a complete app scenario. In other words, it is easier to be told “you will be working on the ‘Summary’ panel from that mock UI” than “you will be working on bullet points 45 to 78 of the specifications”.
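One possible way to make this split concrete is to give each panel of the mock UI its own Shiny module, so that each developer owns one pair of functions. The module names below (mod_summary_*, mod_details_*) and the panel contents are hypothetical; this is a sketch of the idea, not a prescribed layout.

```r
library(shiny)

# Hypothetical module owned by the developer in charge of the "Summary" panel
mod_summary_ui <- function(id) {
  ns <- NS(id)
  tagList(
    h3("Summary"),
    tableOutput(ns("table"))
  )
}

mod_summary_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    # To be implemented by the owner of this panel
  })
}

# Hypothetical module owned by another developer
mod_details_ui <- function(id) {
  ns <- NS(id)
  tagList(
    h3("Details"),
    plotOutput(ns("plot"))
  )
}

mod_details_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    # To be implemented by the owner of this panel
  })
}

# The mock UI assembles the panels; each tab maps to one module
ui <- fluidPage(
  tabsetPanel(
    tabPanel("Summary", mod_summary_ui("summary")),
    tabPanel("Details", mod_details_ui("details"))
  )
)

server <- function(input, output, session) {
  mod_summary_server("summary")
  mod_details_server("details")
}

if (interactive()) {
  shinyApp(ui, server)
}
```

With this structure, “you will be working on the ‘Summary’ panel” translates directly into “you own mod_summary_ui() and mod_summary_server()”.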

10.2 Prototyping Shiny

In this section, you will be introduced to two packages that can be used when prototyping a user interface: {shinipsum} (Fay and Rochette 2020b) and {fakir} (Fay and Rochette 2020a).

10.2.1 Fast UI Prototyping with {shinipsum}

When prototyping the UI for an application, we will not be focusing on building the actual computation: what we need is to create a draft with visual components, so that we can have visual clues about the end result. To do that, you can use the {shinipsum} package, which has been designed to generate random {shiny} (Chang et al. 2020) elements. If you are familiar with “lorem ipsum”, the fake text generator that is used in software design as a placeholder for text, the idea is the same: generating placeholders for Shiny outputs.

{shinipsum} can be installed from CRAN with:

install.packages("shinipsum")

You can also install this package from GitHub with:

remotes::install_github("Thinkr-open/shinipsum")

This package provides a series of functions that generate random placeholders. For example, random_ggplot() generates random {ggplot2} (Wickham, Chang, et al. 2020) elements. If we run this code twice, we should get different results[1]:

library(shinipsum)
library(ggplot2)
random_ggplot() + 
  labs(title = "Random plot") 

FIGURE 10.2: A random plot

random_ggplot() + 
  labs(title = "Random plot") 

FIGURE 10.3: A random plot

Of course, the idea is to combine this with a Shiny interface, for example random_ggplot() will be used with a renderPlot() and plotOutput(). And as we want to prototype but still be close to what the app might look like, these functions take arguments that can shape the output: for example, random_ggplot() has a type parameter that can help you select a specific geom.

library(shiny)
library(shinipsum)
library(DT)

# UI: one placeholder of each kind
ui <- fluidPage(
  h2("A Random DT"),
  DTOutput("data_table"),
  h2("A Random Plot"),
  plotOutput("plot"),
  h2("A Random Text"),
  # textOutput() is the output function that pairs with renderText()
  textOutput("text")
)

# Server: every output is filled with a random {shinipsum} element
server <- function(input, output, session) {
  output$data_table <- DT::renderDT({
    random_DT(5, 5)
  })
  output$plot <- renderPlot({
    random_ggplot()
  })
  output$text <- renderText({
    random_text(nwords = 50)
  })
}
shinyApp(ui, server)

FIGURE 10.4: An app built with {shinipsum}
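The type parameter of random_ggplot() mentioned earlier can be sketched as follows. The value "col" is assumed here to be one of the accepted types; check ?random_ggplot for the list supported by your installed version.

```r
library(shinipsum)
library(ggplot2)

# Ask for a specific geom instead of a random one
# ("col" is assumed to be an accepted value of `type`)
p <- random_ggplot(type = "col") +
  labs(title = "Placeholder bar chart")

p
```

Fixing the geom keeps the prototype closer to the final layout: a panel that will eventually hold a bar chart shows a bar chart from the first mock-up, even though the data behind it is random.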

Other {shinipsum} functions include:

  • tables:
random_table(nrow = 3, ncol = 10)
  agegp     alcgp    tobgp ncases ncontrols agegp.1
1 25-34 0-39g/day 0-9g/day      0        40   25-34
2 25-34 0-39g/day    10-19      0        10   25-34
3 25-34 0-39g/day    20-29      0         6   25-34
    alcgp.1  tobgp.1 ncases.1 ncontrols.1
1 0-39g/day 0-9g/day        0          40
2 0-39g/day    10-19        0          10
3 0-39g/day    20-29        0           6
  • print outputs:
random_print(type = "model")

	Shapiro-Wilk normality test

data:  datasets::airquality$Ozone
W = 0.88, p-value = 3e-08

… and text, image, ggplotly, dygraph, and DT.

10.2.2 Using {fakir} for fake data generation

Generating random placeholders for Shiny might not be enough: maybe you also need example datasets.

This can be accomplished using the {fakir} package, which was primarily created to provide fake datasets for R tutorials and exercises, but that can easily be used inside a Shiny application.

At the time of writing these lines, the package is only available on GitHub, and can be installed with:

remotes::install_github("Thinkr-open/fakir")

This package contains three datasets that are randomly generated when you call the corresponding functions:

  • fake_base_clients() generates a fake dataset for a ticketing service

  • fake_sondage_answers() is a fake survey about transportation

  • fake_visits() is a fake dataset for the visits on a website

library(fakir)
fake_visits(from = "2017-01-01", to = "2017-01-31")
# A tibble: 31 x 8
   timestamp   year month   day  home about  blog
 * <date>     <dbl> <dbl> <int> <int> <int> <int>
 1 2017-01-01  2017     1     1   369   220   404
 2 2017-01-02  2017     1     2   159   250   414
 3 2017-01-03  2017     1     3   436   170   498
 4 2017-01-04  2017     1     4    NA   258   526
 5 2017-01-05  2017     1     5   362    NA   407
 6 2017-01-06  2017     1     6   245   145   576
 7 2017-01-07  2017     1     7    NA    NA   484
 8 2017-01-08  2017     1     8   461   103   441
 9 2017-01-09  2017     1     9   337   113   673
10 2017-01-10  2017     1    10    NA   169   308
# … with 21 more rows, and 1 more variable:
#   contact <int>

The idea with these datasets is to combine various formats that can reflect “real life” datasets: they contain dates, numeric and character variables, and have missing values. They can also be manipulated with the included {sf} (Pebesma 2020) geographical dataset fra_sf, allowing for map creation.

Fake datasets created with {fakir} can be used to build light examples on the use of the inputs, for filters or interactive maps, or as examples for the internal functions and their corresponding documentation.
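As a minimal sketch of such a light example, the code below uses fake_visits() to try out a date filter. The helper filter_visits() and the input and output ids are hypothetical; only fake_visits() and its timestamp column come from the output shown above.

```r
library(shiny)
library(fakir)

# Fake website visits for January 2017, as in the example above
visits <- fake_visits(from = "2017-01-01", to = "2017-01-31")

# Hypothetical helper: keep the rows between two dates (inclusive)
filter_visits <- function(data, from, to) {
  keep <- data$timestamp >= as.Date(from) & data$timestamp <= as.Date(to)
  data[keep, ]
}

# Plain R check of the helper, outside of any app
first_week <- filter_visits(visits, "2017-01-01", "2017-01-07")

# The same helper, driven by a date range input
ui <- fluidPage(
  dateRangeInput(
    "period", "Period",
    start = "2017-01-01", end = "2017-01-31"
  ),
  tableOutput("visits")
)

server <- function(input, output, session) {
  output$visits <- renderTable({
    shown <- filter_visits(visits, input$period[1], input$period[2])
    # Format dates as text so the table displays them as dates
    shown$timestamp <- format(shown$timestamp)
    shown
  })
}

if (interactive()) {
  shinyApp(ui, server)
}
```

Because the fake data contains missing values, this kind of example also shows early on how the filter and the table behave with NA.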

10.3 Building with RMarkdown

While on one side you are building the user interface, you (or someone from your team) can start working on the back-end implementation. Again, this should be done outside of any reactive logic: the back-end should not depend on any reactive context. And because documentation is gold, you should start by writing the back-end documentation directly as package documentation, inside your Vignettes, or in inst/. This is what we call “Rmd-first”.

10.3.1 Define the content of the application

RMarkdown files are the perfect spot to sandbox the back-end of your application: inside the file, you don’t have to think about any reactive behavior, as you are just working with plain old R code: data wrangling operations, multi-parameter models, summary tables, graphical outputs…

And the nice thing is that you can share the output of the rendered file as an HTML or PDF with your client, your boss, or anyone involved in the project. That way, you can focus on the core algorithm, not on some UI implementation detail like “I want the button to be blue” when what you need to know is whether the output of the model is correct. In other words, you are applying the rule of separation of concerns, i.e. you help the person “reading” the outputs focus on one part of the application without adding any cognitive load. And, last but not least, if you have to implement changes to the back-end functions, it is far easier to check and share them in a static file than in an application.

When doing that, the best way is again to separate things: do not be afraid of writing multiple RMarkdown files, one for each part of the end application. Again, this will help everybody focus on what matters: be it you, your team, or the person ordering the application.

Building the back-end in Rmd files is also a good way to make the back-end “application independent”, in the sense that it helps document how the algorithms you have been building can be used outside of the application. In many cases, when you are building an application, you are creating functions that contain business logic/domain expertise, and that can in fact be used outside of the application. Writing down these functions and how they work together forces you to think about them, and also gives a good starting point for anybody familiar with R who would want to start using this back-end toolkit. Of course, as you are building your application as a package, it is far easier now: you can share a package with the application inside it, along with a function to launch the app but also functions that can be used outside.

And if you need some data to use as an example, feel free to pick one from {fakir}! (See Section 10.2.2.)
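The idea of an “application independent” back-end can be sketched with one plain function that is first explored in an Rmd file and later called from the server. The function total_visits(), the demo data frame and the ids input$page and output$total are all hypothetical.

```r
# Hypothetical back-end function, meant to live in the R/ folder:
# it takes plain R objects and knows nothing about reactivity
total_visits <- function(visits, page) {
  stopifnot(is.data.frame(visits), page %in% names(visits))
  sum(visits[[page]], na.rm = TRUE)
}

# In an Rmd file: called as plain R, easy to render and share
demo <- data.frame(home = c(369, 159, NA), blog = c(404, 414, 498))
total_visits(demo, "home")
# 528

# Later, in the app: the server is only a thin reactive wrapper
# around the very same function
server <- function(input, output, session) {
  output$total <- shiny::renderText({
    total_visits(demo, input$page)
  })
}
```

Keeping the computation in total_visits() means it can be documented, tested and reused without launching the application.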

10.3.2 Using the Rmd files as a laboratory notebook

Rmd can also be used as the place to keep track of what you have in mind while creating the application: most of the time, you will create the functions inside the R/ folder, but it might not be the perfect place to document your thought process. On the other hand, using Markdown as a kind of “software laboratory notebook” to keep track of your ideas is a good way to document all the choices you have made about your data wrangling, models, and visualizations, so that you can use it as a common knowledge base throughout the life of the application: you can share this with your client, with the rest of your team, or with anybody involved in the project.

Developing in multiple Rmd files also helps split the work between multiple developers, and will reduce code conflicts during development.

10.3.3 Rmd, Vignettes, and documentation first

Working with the {golem} (Guyader et al. 2020) framework implies that you will build the application as an R package. And of course, an R package implies writing documentation: one of the main goals of the Vignettes, in an R package, is to document how to use the package. And the good news is that when checking a package, i.e. when running check() from {devtools} (Wickham, Hester, and Chang 2020) or R CMD check, the Vignettes are going to be built, and the process will fail if at least one of the Vignettes fails to render. That way, you can use the documentation of the back-end as an extra tool for doing unit-testing!

One radical approach to the “Rmd first” philosophy is to write everything in an Rmd from the very beginning of your project: write the function code, their roxygen tags, their tests, etc., then move everything to the correct spot in the package infrastructure once you are happy with everything. And of course, when you need to add another feature to your app, open a new markdown and start the process of development and documentation again.
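As a rough sketch of what the R chunks of such an Rmd could contain, the code below keeps a function, its roxygen tags, a usage example and a test side by side. The function share_missing() and the demo data are hypothetical; the test uses {testthat}, which is assumed to be installed.

```r
#' Share of missing values in a column
#'
#' @param data A data frame.
#' @param column Name of the column to inspect, as a string.
#'
#' @return A number between 0 and 1.
#' @export
share_missing <- function(data, column) {
  mean(is.na(data[[column]]))
}

# Example shown to the reader of the Rmd
demo <- data.frame(home = c(369, NA, 436, NA))
share_missing(demo, "home")
# 0.5

# Test written next to the function; it can later move to tests/testthat/
testthat::test_that("share_missing() counts NA values", {
  testthat::expect_equal(share_missing(demo, "home"), 0.5)
})
```

Once the chunk behaves as expected, the function and its roxygen block can move to R/, the test to tests/testthat/, and the example can stay in the Vignette, where a failure will stop the Vignette from rendering during the package check.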

References

Chang, Winston, Joe Cheng, JJ Allaire, Yihui Xie, and Jonathan McPherson. 2020. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.

Fay, Colin. 2020g. Hexmake: Hex Stickers Maker. https://github.com/colinfay/hexmake.

Fay, Colin, and Sebastien Rochette. 2020a. Fakir: Create Fake Data in R for Tutorials. https://github.com/Thinkr-open/fakir.

Fay, Colin, and Sebastien Rochette. 2020b. Shinipsum: Lorem-Ipsum-Like Helpers for Fast Shiny Prototyping. https://github.com/Thinkr-open/shinipsum.

Guyader, Vincent, Colin Fay, Sébastien Rochette, and Cervan Girard. 2020. Golem: A Framework for Robust Shiny Applications. https://github.com/ThinkR-open/golem.

Henry, Lionel, and Hadley Wickham. 2020. Purrr: Functional Programming Tools. https://CRAN.R-project.org/package=purrr.

Pebesma, Edzer. 2020. Sf: Simple Features for R. https://CRAN.R-project.org/package=sf.

Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2020. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. https://CRAN.R-project.org/package=ggplot2.

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media. https://www.xarg.org/ref/a/1491910399/.

Wickham, Hadley, Jim Hester, and Winston Chang. 2020. Devtools: Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools.


  1. Well, there is a probability that we will get the same plot twice, but that is the beauty of randomness.

