Chapter 4 Examples
Shiny apps come in many different shapes and forms. We cannot represent this vast diversity, so we picked a few apps that showcase common patterns and that fit reasonably well onto the pages of a printed book.
We will use 3 Shiny apps as examples. All 3 are implemented in both R and Python:
- faithful: a “Hello Shiny!” app that displays the Old Faithful geyser waiting times as a histogram, with a slider for adjusting the number of bins. This app demonstrates the very basics of reactivity, and it is very short.
- bananas: an app that classifies the ripeness of banana fruits based on their colour composition (green, yellow, brown). This app demonstrates a more complex use case with dependencies, and it relies on a machine learning model, so it better reflects real-world use cases.
- lbtest: an app to test load balancing when scaling Shiny apps to multiple instances.
Let’s learn about the example apps.
4.1 Old Faithful
This is the classic “Hello Shiny!” app that you can try in R by running
shiny::runExample("01_hello"). The app displays the Old Faithful geyser
waiting times data as a histogram, with a slider that lets you adjust
the number of bins used in the histogram (Fig. 4.1).
The R version of the app was originally written by the Shiny package authors
(Chang et al. 2024).
The “Hello Shiny!” app in R has no dependencies other than shiny.
The Old Faithful app in Python has more requirements besides shiny,
because the Python standard library does not have the geyser data readily
available, and you need a plotting library, e.g. matplotlib (Hunter 2007),
for the histogram.
We wrote the Python version as a mirror translation of the R version,
so that you can see the similarities and the differences.
In R, the data set datasets::faithful (R Core Team 2024) contains the waiting
time between eruptions and the duration of each eruption for the Old Faithful
geyser in Yellowstone National Park, Wyoming, USA. For the Python version, we
loaded the data set from the Seaborn library with
seaborn.load_dataset("geyser") (Waskom 2021).
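To make the role of the slider concrete, here is a minimal sketch of how a "number of bins" value can be turned into equally spaced breaks and counts. It uses only the Python standard library. The waiting times are made-up placeholder values, not the geyser data, and the function names `bin_edges` and `histogram_counts` are hypothetical helpers rather than part of the app's source code. Plotting libraries differ in how they treat values that fall exactly on a break; this sketch uses left-closed intervals and puts the maximum into the last bin.

```python
def bin_edges(values, n_bins):
    """Return n_bins + 1 equally spaced break points from min to max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


def histogram_counts(values, n_bins):
    """Count how many values fall into each of the n_bins intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical: put everything into the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin, so clamp the index.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Placeholder waiting times in minutes (illustrative only, NOT the real data)
waiting = [50, 54, 58, 62, 70, 74, 78, 80, 82, 90]

# Moving the slider corresponds to calling the functions with a new n_bins
for n_bins in (2, 4):
    print(n_bins, bin_edges(waiting, n_bins), histogram_counts(waiting, n_bins))
```

In a Shiny app, the slider value would play the role of `n_bins`, and the histogram would be redrawn whenever that input changes.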
The source code for the different builds of the Old Faithful Shiny app is at
https://github.com/h10y/faithful. You can download the GitHub repository
as a zip file from GitHub, or clone the repository with:
git clone https://github.com/h10y/faithful.git
4.2 Bananas
The bananas app was born out of a “stuck-in-the-house” COVID-19 project, when
one of the authors bought some green bananas at the store and took daily
photographs of each fruit.
Later, the data set was used for teaching workshops. The motivation for the app
is that it follows a workflow that is fairly common in all kinds of data science
projects:
- Have a question to answer: Is my banana ripe?
- Collect data: Go to the store, buy bananas, set up a ring light and take pictures every day over 3 weeks.
- Compile the training data: Classify colour pixels and calculate the relative proportions, score pictures according to ripeness status.
- Run exploratory data analysis: Let’s explore and visualize the data set.
- Train a classification model: Estimate the banana ripeness class and probability given the colour composition.
- Build a “scoring engine”: Given some colour inputs for a new fruit, tell me the probability for the ripeness classes.
- Build a user interface: Let a non-technical user do the data exploration and classification as part of a web application.
4.2.1 The Bananas Data Set
The data set tracks the colour composition of ripening banana fruits daily over a
3-week period. The full data set can be found in the GitHub repository
and R package bananas, which you can install with
install.packages("bananas", repos = "https://psolymos.r-universe.dev").
The subset used in the book and the Shiny app consists of the 6 fruits that were kept at room temperature.
The table has the following fields:
- fruit: the identifier of the fruit,
- day: a number between 0 and 20, the number of days since the first set of photographs,
- ripeness: the ripeness class of the fruit based on Péter’s personal judgement (Under, Ripe, Very, Over),
- green, yellow, brown: the colour composition, these 3 values add up to 1 (100%).
The colour composition was determined by colour mapping the pixel values of the banana fruits and converting the pixel-based 2-dimensional areas to proportions.
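The conversion from pixel areas to proportions can be sketched as follows. This is only an illustration of the arithmetic: the pixel counts are hypothetical, the function name `colour_proportions` is made up, and the colour mapping step that assigns each pixel to green, yellow, or brown is assumed to have happened already.

```python
def colour_proportions(pixel_counts):
    """Convert per-colour pixel counts into proportions that sum to 1."""
    total = sum(pixel_counts.values())
    if total == 0:
        raise ValueError("no banana pixels were classified")
    return {colour: n / total for colour, n in pixel_counts.items()}


# Hypothetical pixel counts for one photographed fruit (not real data)
counts = {"green": 1200, "yellow": 6000, "brown": 800}
props = colour_proportions(counts)
print(props)
```

Because each value is divided by the same total, the three proportions always add up to 1, which is what allows the data to be shown on a ternary plot.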
The following summary presents the ripeness and the percentage values of green, yellow, brown colours.
Figure 4.2 shows the change in colour composition over the 3 weeks of the experiment. You can see that the proportion of green went down while the proportion of yellow rose to a peak around day 5. After that, yellow started decreasing while the proportion of brown started increasing.
We can also present the same information according to the ripeness classes (Fig. 4.3). You can see that the under-ripe class is characterized by a high proportion of green and the absence of brown. The ripe class is characterized by the highest proportion of yellow. Very ripe bananas have a higher proportion of brown, while yellow is still the most common colour. Over-ripe bananas are mostly brown.
4.2.2 Model Training
We chose Support Vector Machines (SVM) to model a multi-level response variable (Under, Ripe, Very, Over) as a function of the green, yellow, and brown colours.
We used the e1071 package (Meyer et al. 2023) in R, and the SVM model’s prediction accuracy on the training data was 90.8%.
We saved the trained model object as an R binary .rds file:
library(e1071)
# Read the bananas data
x <- read.csv("bananas.csv")
x$ripeness <- factor(x$ripeness, c("Under", "Ripe", "Very", "Over"))
# Multinomial classification with Support Vector Machines
m <- svm(ripeness ~ green + yellow + brown,
  data = x,
  probability = TRUE
)
# Two-way table to test prediction accuracy
table(x$ripeness, predict(m))
sum(diag(table(x$ripeness, predict(m)))) / nrow(x)
# Predict ripeness class
predict(m, data.frame(green = 1, yellow = 0, brown = 0),
  probability = TRUE)
predict(m, data.frame(green = 0, yellow = 1, brown = 0),
  probability = TRUE)
predict(m, data.frame(green = 0, yellow = 0, brown = 1),
  probability = TRUE)
predict(m, data.frame(green = 0.1, yellow = 0.2, brown = 0.7),
  probability = TRUE)
# Save the model object
saveRDS(m, "bananas-svm.rds")
We can fit a similar SVM model in Python using scikit-learn (sklearn) (Pedregosa et al. 2011):
import pandas as pd
from joblib import dump
from sklearn import svm
# Read the bananas data
x = pd.read_csv('bananas.csv')
# Encode the ripeness classes as numeric targets
x.loc[x.ripeness == 'Under', 'target'] = 0
x.loc[x.ripeness == 'Ripe', 'target'] = 1
x.loc[x.ripeness == 'Very', 'target'] = 2
x.loc[x.ripeness == 'Over', 'target'] = 3
# Features and target as arrays
data_X = x[['green', 'yellow', 'brown']].to_numpy()
data_y = x.target.values
# Train SVM with probability estimates enabled
svm_model = svm.SVC(probability=True)
svm_model.fit(data_X, data_y)
# Predict ripeness class probabilities
svm_model.predict_proba([[1, 0, 0]])
svm_model.predict_proba([[0, 1, 0]])
svm_model.predict_proba([[0, 0, 1]])
svm_model.predict_proba([[0.1, 0.2, 0.7]])
# Write model object to file
dump(svm_model, 'bananas-svm.joblib')
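The R listing checks prediction accuracy with a two-way table, a step that the Python listing above leaves out. The following sketch shows the same calculation using only the Python standard library. The observed and predicted labels are toy values chosen for illustration, not results from the bananas data, and `two_way_table` and `accuracy` are hypothetical helper names. With the fitted model, the predictions would presumably come from `svm_model.predict(data_X)`, which returns the numeric targets (0 to 3) rather than the class names used here.

```python
from collections import Counter

CLASSES = ["Under", "Ripe", "Very", "Over"]


def two_way_table(observed, predicted):
    """Rows are observed classes, columns are predicted classes."""
    pairs = Counter(zip(observed, predicted))
    return [[pairs[(o, p)] for p in CLASSES] for o in CLASSES]


def accuracy(table):
    """Share of cases on the diagonal, i.e. correctly classified."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return correct / total


# Toy labels for illustration only (NOT results from the bananas data)
observed = ["Under", "Under", "Ripe", "Ripe", "Very", "Over"]
predicted = ["Under", "Ripe", "Ripe", "Ripe", "Very", "Over"]

table = two_way_table(observed, predicted)
for name, row in zip(CLASSES, table):
    print(name, row)
print(accuracy(table))
```

Note that accuracy computed on the same data that was used for training tends to be optimistic compared to accuracy on new fruits.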
4.2.3 The Shiny App
The Shiny app consists of a ternary plot showing the daily colour composition of each banana fruit, alongside the new point to be classified (in red), as shown in Figure 4.4. The three numeric inputs on the left-hand side of the plot control the position of the red dot. The classification results based on these inputs are shown on the right-hand side of the ternary plot. You can see the probabilities of the under-ripe, ripe, very ripe, and over-ripe classes, and the class with the highest probability is assigned as the label.
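The last step, turning class probabilities into a label, can be sketched in a few lines. The probabilities below are hypothetical, and `classify` is a made-up helper name, not a function from the app. The order of `LABELS` assumes the numeric coding used during training (0 = Under, 1 = Ripe, 2 = Very, 3 = Over). With scikit-learn, the columns returned by `predict_proba` follow the model's `classes_` attribute, so it is worth checking that attribute before relying on a fixed order.

```python
LABELS = ["Under", "Ripe", "Very", "Over"]


def classify(probabilities):
    """Pair each probability with its label and pick the most likely class."""
    if len(probabilities) != len(LABELS):
        raise ValueError("expected one probability per ripeness class")
    by_label = dict(zip(LABELS, probabilities))
    best = max(by_label, key=by_label.get)
    return best, by_label


# Hypothetical probabilities for one new fruit (illustrative only)
probs = [0.05, 0.15, 0.60, 0.20]
label, by_label = classify(probs)
print(label)
print(by_label)
```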
The source code for the different builds of the Bananas Shiny app is at
https://github.com/h10y/bananas. You can download the GitHub repository
as a zip file from GitHub, or clone the repository with:
git clone https://github.com/h10y/bananas.git
4.3 Load Balancing Test
Shiny apps can run multiple sessions in the same app instance. A common problem when scaling the number of replicas for Shiny apps is that a client's requests might not all reach the instance that holds its session, and thus the app might randomly fail. This app is used to determine if the HTTP requests made by the client are correctly routed back to the same R or Python process for the session.
Both the Python and the R versions of the app register a dynamic route for the client to try to connect to. The JavaScript code on the client side will repeatedly hit the dynamic route. The server will send a 200 OK status code only if the client reached the correct Shiny session, where it originally came from (Fig. 4.5).
The original Python app was written by Joe Cheng and is from the rstudio/py-shiny
GitHub repository. We wrote the R version to mirror the Python version.
This app will be useful when the deployment includes load balancing between multiple replicas. For such deployments, session affinity (or sticky sessions) needs to be available. This app can be used to test such setups. If the test fails, it will stop before the counter reaches 100 and will say Failure! If the app succeeds 100 times, you’ll see Test complete. The app is not useful for testing a single instance deployment, or with Shinylive, because these setups won’t fail, but you can still try it.
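The following toy simulation illustrates why such a test fails without session affinity. It is not the lbtest code: the routing functions, the session identifier, and the replica count are all hypothetical, and real load balancers typically implement affinity with cookies or client IP hashing rather than the simple byte sum used here.

```python
def round_robin(session_id, request_no, n_replicas):
    """No affinity: each request goes to the next replica in turn."""
    return request_no % n_replicas


def sticky(session_id, request_no, n_replicas):
    """Affinity: the replica depends only on the session identifier."""
    return sum(session_id.encode()) % n_replicas


def run_test(route, n_replicas=3, n_requests=100):
    """Return how many consecutive requests reached the session's replica."""
    session_id = "session-abc"  # hypothetical session identifier
    # The first request creates the session on some replica.
    home = route(session_id, 0, n_replicas)
    for i in range(1, n_requests + 1):
        if route(session_id, i, n_replicas) != home:
            return i - 1  # stop at the first failure, like the test app
    return n_requests


print("sticky, 3 replicas:", run_test(sticky))
print("round robin, 3 replicas:", run_test(round_robin))
print("round robin, 1 replica:", run_test(round_robin, n_replicas=1))
```

With sticky routing the counter reaches 100, while round-robin routing across 3 replicas fails on the first follow-up request. With a single replica there is nothing to misroute, which matches the remark that the test cannot fail on a single instance deployment.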
The source code for the different builds of the load balancing test Shiny app is at
https://github.com/h10y/lbtest. You can download the GitHub repository
as a zip file from GitHub, or clone the repository with:
git clone https://github.com/h10y/lbtest.git
4.4 Summary
This is the end of Part I. We covered all the fundamentals that the rest of the book builds upon. In the next part, we’ll cover the technical details of Shiny hosting that happen on your local machine.
We recommend making the example repositories mentioned in this chapter
available on your computer. This way you will be able to follow all the examples
in the following chapters and won’t have to copy and paste text from the
book into files. Visit the GitHub organization h10y, which stands for
hostingshiny (there are 10 letters between the first h and the last y):
https://github.com/h10y/.