R Cheat Sheet, Data Visualization with ggplot2, Cheat Sheet of Advanced Computer Programming

Cheat sheet for the ggplot2 package in R. Explore data visualization with the grammar of graphics.

Typology: Cheat Sheet

2020/2021

Uploaded on 04/23/2021

anuradha
Geoms - Use a geom function to represent data points; use the geom's aesthetic properties to represent variables. Each function returns a layer.
Three Variables
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
l <- ggplot(seals, aes(long, lat))
l + geom_contour(aes(z = z))
x, y, z, alpha, colour, group, linetype, size, weight
l + geom_raster(aes(fill = z), hjust = 0.5,
vjust = 0.5, interpolate = FALSE)
x, y, alpha, fill
l + geom_tile(aes(fill = z))
x, y, alpha, color, fill, linetype, size, width
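The three-variable geoms above can be tried with a short sketch, assuming ggplot2 is installed (the seals data ships with it). The object names p_tile and p_contour are illustrative and not part of the cheat sheet.

```r
library(ggplot2)

# z is the length of each seal movement vector, as defined in the cheat sheet
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
l <- ggplot(seals, aes(long, lat))

# geom_tile() draws one filled rectangle per (long, lat) observation
p_tile <- l + geom_tile(aes(fill = z))

# geom_contour() draws lines of equal z over the same grid
p_contour <- l + geom_contour(aes(z = z))

# layer_data() returns the data frame that the layer will draw
tile_data <- layer_data(p_tile)
```

Printing p_tile or p_contour renders the plot; tile_data is only there to show that each observation becomes one tile.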
Two Variables
Discrete X, Discrete Y
g <- ggplot(diamonds, aes(cut, color))
g + geom_count()
x, y, alpha, color, fill, shape, size, stroke
Discrete X, Continuous Y
f <- ggplot(mpg, aes(class, hwy))
f + geom_col()
x, y, alpha, color, fill, group, linetype, size
f + geom_boxplot()
x, y, lower, middle, upper, ymax, ymin, alpha,
color, fill, group, linetype, shape, size, weight
f + geom_dotplot(binaxis = "y",
stackdir = "center")
x, y, alpha, color, fill, group
f + geom_violin(scale = "area")
x, y, alpha, color, fill, group, linetype, size,
weight
Continuous X, Continuous Y
e <- ggplot(mpg, aes(cty, hwy))
e + geom_label(aes(label = cty), nudge_x = 1,
nudge_y = 1)
x, y, label, alpha, angle, color, family, fontface,
hjust, lineheight, size, vjust
e + geom_jitter(height = 2, width = 2)
x, y, alpha, color, fill, shape, size
e + geom_point()
x, y, alpha, color, fill, shape, size, stroke
e + geom_quantile()
x, y, alpha, color, group, linetype, size, weight
e + geom_rug(sides = "bl")
x, y, alpha, color, linetype, size
e + geom_smooth(method = lm)
x, y, alpha, color, fill, group, linetype, size, weight
e + geom_text(aes(label = cty), nudge_x = 1,
nudge_y = 1, check_overlap = TRUE)
x, y, label, alpha, angle, color, family, fontface,
hjust, lineheight, size, vjust
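Because each geom function returns a layer, the two-variable geoms above can be stacked on one plot. The following is a minimal sketch, assuming ggplot2 is installed; the names p and fit are illustrative.

```r
library(ggplot2)

e <- ggplot(mpg, aes(cty, hwy))

# Layer 1: raw points; layer 2: a linear fit with its confidence band
p <- e + geom_point(alpha = 0.4) + geom_smooth(method = "lm")

# Inspect what the second layer computed: fitted y plus ymin/ymax of the band
fit <- layer_data(p, 2)
head(fit[, c("x", "y", "ymin", "ymax")])
```

The alpha value is set outside aes(), so it is a fixed property of the points rather than a mapping to a variable.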
Continuous Function
i <- ggplot(economics, aes(date, unemploy))
i + geom_area()
x, y, alpha, color, fill, linetype, size
i + geom_line()
x, y, alpha, color, group, linetype, size
i + geom_step(direction = "hv")
x, y, alpha, color, group, linetype, size
Continuous Bivariate Distribution
h <- ggplot(diamonds, aes(carat, price))
h + geom_bin2d(binwidth = c(0.25, 500))
x, y, alpha, color, fill, linetype, size, weight
h + geom_density2d()
x, y, alpha, colour, group, linetype, size
h + geom_hex()
x, y, alpha, colour, fill, size
Visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))
j + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, group,
linetype, size
j + geom_errorbar()
x, ymax, ymin, alpha, color, group, linetype,
size, width (also geom_errorbarh())
j + geom_linerange()
x, ymin, ymax, alpha, color, group, linetype, size
j + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, group,
linetype, shape, size
Maps
data <- data.frame(murder = USArrests$Murder,
state = tolower(rownames(USArrests)))
map <- map_data("state")
k <- ggplot(data, aes(fill = murder))
k + geom_map(aes(map_id = state), map = map) +
expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size
Data Visualization
with ggplot2
Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com Learn more at docs.ggplot2.org and www.ggplot2-exts.org • ggplot2 2.1.0 • Updated: 11/16
Basics
ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms (visual marks that represent data points).
To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.
ggplot(data = mpg, aes(x = cty, y = hwy))
Begins a plot that you finish by adding layers to. Add one geom function per layer.
Complete the template below to build a graph.
Earlier edition of this cheat sheet: ggplot2 0.9.3.1 • Updated: 3/15 • Learn more at docs.ggplot2.org
One Variable
Continuous
a <- ggplot(mpg, aes(hwy))
a + geom_area(stat = "bin")
x, y, alpha, color, fill, linetype, size
a + geom_area(aes(y = ..density..), stat = "bin")
a + geom_density(kernel = "gaussian")
x, y, alpha, color, fill, linetype, size, weight
a + geom_density(aes(y = ..count..))
a + geom_dotplot()
x, y, alpha, color, fill
a + geom_freqpoly()
x, y, alpha, color, linetype, size
a + geom_freqpoly(aes(y = ..density..))
a + geom_histogram(binwidth = 5)
x, y, alpha, color, fill, linetype, size, weight
a + geom_histogram(aes(y = ..density..))
Discrete
b <- ggplot(mpg, aes(fl))
b + geom_bar()
x, alpha, color, fill, linetype, size, weight
Two Variables
Discrete X, Discrete Y
h <- ggplot(diamonds, aes(cut, color))
h + geom_jitter()
x, y, alpha, color, fill, shape, size
Discrete X, Continuous Y
g <- ggplot(mpg, aes(class, hwy))
g + geom_bar(stat = "identity")
x, y, alpha, color, fill, linetype, size, weight
g + geom_boxplot()
lower, middle, upper, x, ymax, ymin, alpha,
color, fill, linetype, shape, size, weight
g + geom_dotplot(binaxis = "y",
stackdir = "center")
x, y, alpha, color, fill
g + geom_violin(scale = "area")
x, y, alpha, color, fill, linetype, size, weight
Continuous X, Continuous Y
f <- ggplot(mpg, aes(cty, hwy))
f + geom_blank()
f + geom_jitter()
x, y, alpha, color, fill, shape, size
f + geom_point()
x, y, alpha, color, fill, shape, size
f + geom_quantile()
x, y, alpha, color, linetype, size, weight
f + geom_rug(sides = "bl")
alpha, color, linetype, size
f + geom_smooth(method = lm)
x, y, alpha, color, fill, linetype, size, weight
f + geom_text(aes(label = cty))
x, y, label, alpha, angle, color, family, fontface,
hjust, lineheight, size, vjust
Three Variables
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
i <- ggplot(seals, aes(long, lat))
i + geom_contour(aes(z = z))
x, y, z, alpha, colour, linetype, size, weight
i + geom_raster(aes(fill = z), hjust = 0.5,
vjust = 0.5, interpolate = FALSE)
x, y, alpha, fill
i + geom_tile(aes(fill = z))
x, y, alpha, color, fill, linetype, size
Continuous Function
g <- ggplot(economics, aes(date, unemploy))
g + geom_area()
x, y, alpha, color, fill, linetype, size
g + geom_line()
x, y, alpha, color, linetype, size
g + geom_step(direction = "hv")
x, y, alpha, color, linetype, size
Continuous Bivariate Distribution
h <- ggplot(ggplot2movies::movies, aes(year, rating))  # movies moved to the ggplot2movies package in ggplot2 2.0.0
h + geom_bin2d(binwidth = c(5, 0.5))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size, weight
h + geom_density2d()
x, y, alpha, colour, linetype, size
h + geom_hex()
x, y, alpha, colour, fill, size
Graphical Primitives
map <- map_data("state")
c <- ggplot(map, aes(long, lat))
c + geom_polygon(aes(group = group))
x, y, alpha, color, fill, linetype, size
g + geom_path(lineend = "butt",
linejoin = "round", linemitre = 1)
x, y, alpha, color, linetype, size
g + geom_ribbon(aes(ymin = unemploy - 900,
ymax = unemploy + 900))
x, ymax, ymin, alpha, color, fill, linetype, size
d <- ggplot(seals, aes(x = long, y = lat))
d + geom_segment(aes(
xend = long + delta_long,
yend = lat + delta_lat))
x, xend, y, yend, alpha, color, linetype, size
d + geom_rect(aes(xmin = long, ymin = lat,
xmax = long + delta_long,
ymax = lat + delta_lat))
xmax, xmin, ymax, ymin, alpha, color, fill,
linetype, size
Visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
e <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))
e + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, linetype,
size
e + geom_errorbar()
x, ymax, ymin, alpha, color, linetype, size,
width (also geom_errorbarh())
e + geom_linerange()
x, ymin, ymax, alpha, color, linetype, size
e + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, linetype,
shape, size
Maps
data <- data.frame(murder = USArrests$Murder,
state = tolower(rownames(USArrests)))
e <- ggplot(data, aes(fill = murder))
e + geom_map(aes(map_id = state), map = map) +
expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size
[Figure: data + geom (x = F, y = A; optionally color = F, size = A) + coordinate system = plot]
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(
    mapping = aes(<MAPPINGS>),
    stat = <STAT>,
    position = <POSITION>
  ) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION> +
  <SCALE_FUNCTION> +
  <THEME_FUNCTION>
Required: <DATA>, <GEOM_FUNCTION>, <MAPPINGS>
Not required, sensible defaults supplied: <STAT>, <POSITION>, <COORDINATE_FUNCTION>, <FACET_FUNCTION>, <SCALE_FUNCTION>, <THEME_FUNCTION>
qplot(x = cty, y = hwy, data = mpg, geom = "point")
Creates a complete plot with given data, geom, and mappings (here x and y are the aesthetic mappings). Supplies many useful defaults.
last_plot()
Returns the last plot.
ggsave("plot.png", width = 5, height = 5)
Saves last plot as a 5 in x 5 in file named "plot.png" in the working directory. Matches file type to file extension.
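One way to read the template is to fill in every slot explicitly. The sketch below assumes ggplot2 is installed; the particular geom, facet, scale and theme choices are illustrative, and the PDF file name is hypothetical (a PDF is used only because that device is available in every R installation).

```r
library(ggplot2)

p <- ggplot(data = mpg) +                # <DATA>
  geom_point(                            # <GEOM_FUNCTION>
    mapping = aes(x = cty, y = hwy),     # <MAPPINGS>
    stat = "identity",                   # <STAT>: the default for geom_point()
    position = "identity"                # <POSITION>: the default for geom_point()
  ) +
  coord_cartesian() +                    # <COORDINATE_FUNCTION>
  facet_wrap(~ drv) +                    # <FACET_FUNCTION>
  scale_x_continuous() +                 # <SCALE_FUNCTION>
  theme_bw()                             # <THEME_FUNCTION>

# ggsave() picks the graphics device from the file extension
out <- file.path(tempdir(), "template-example.pdf")
ggsave(out, plot = p, width = 5, height = 5)
```

Only the first three slots are needed in practice; dropping the remaining lines gives the same scatterplot without facets and with the default theme.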
Graphical Primitives
a <- ggplot(economics, aes(date, unemploy))
b <- ggplot(seals, aes(x = long, y = lat))
a + geom_blank()
(Useful for expanding limits)
b + geom_curve(aes(yend = lat + 1,
xend = long + 1), curvature = 1)
x, xend, y, yend, alpha, angle, color, curvature, linetype, size
a + geom_path(lineend = "butt",
linejoin = "round", linemitre = 1)
x, y, alpha, color, group, linetype, size
ggplot(map_data("state"), aes(long, lat)) + geom_polygon(aes(group = group))
x, y, alpha, color, fill, group, linetype, size
b + geom_rect(aes(xmin = long, ymin = lat,
xmax = long + 1, ymax = lat + 1))
xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size
a + geom_ribbon(aes(ymin = unemploy - 900,
ymax = unemploy + 900))
x, ymax, ymin, alpha, color, fill, group, linetype, size
Line Segments
common aesthetics: x, y, alpha, color, linetype, size
b + geom_abline(aes(intercept=0, slope=1))
b + geom_hline(aes(yintercept = lat))
b + geom_vline(aes(xintercept = long))
b + geom_segment(aes(yend=lat+1, xend=long+1))
b + geom_spoke(aes(angle = 1:1155, radius = 1))
One Variable
Continuous
c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg)
c + geom_area(stat = "bin")
x, y, alpha, color, fill, linetype, size
c + geom_density(kernel = "gaussian")
x, y, alpha, color, fill, group, linetype, size, weight
c + geom_dotplot()
x, y, alpha, color, fill
c + geom_freqpoly()
x, y, alpha, color, group, linetype, size
c + geom_histogram(binwidth = 5)
x, y, alpha, color, fill, linetype, size, weight
c2 + geom_qq(aes(sample = hwy))
x, y, alpha, color, fill, linetype, size, weight
Discrete
d <- ggplot(mpg, aes(fl))
d + geom_bar()
x, alpha, color, fill, linetype, size, weight
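One-variable geoms such as geom_histogram() compute their y values from the data. A small sketch, assuming ggplot2 is installed, shows what the binning produces; the names p_hist and bins are illustrative.

```r
library(ggplot2)

p_hist <- ggplot(mpg, aes(hwy)) + geom_histogram(binwidth = 5)

# layer_data() exposes the computed bins: one row per bar
bins <- layer_data(p_hist)
bins[, c("xmin", "xmax", "count", "density")]

# Every observation falls in exactly one bin,
# and density times bin width sums to 1
total_count <- sum(bins$count)
total_area  <- sum(bins$density * (bins$xmax - bins$xmin))
```

The count column is what the bars show by default; mapping y to the computed density instead rescales the bars so their total area is 1.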
Coordinate Systems

r <- d + geom_bar()

r + coord_cartesian(xlim = c(0, 5))
xlim, ylim
The default cartesian coordinate system

r + coord_fixed(ratio = 1/2)
ratio, xlim, ylim
Cartesian coordinates with fixed aspect ratio between x and y units

r + coord_flip()
xlim, ylim
Flipped Cartesian coordinates

r + coord_polar(theta = "x", direction = 1)
theta, start, direction
Polar coordinates

r + coord_trans(y = "sqrt")
x, y, xlim, ylim (limx, limy in ggplot2 2.1.0)
Transformed cartesian coordinates. Set x and y to the name of a transformation, such as "sqrt" or "log10".

π + coord_quickmap()

π + coord_map(projection = "ortho",
orientation = c(41, -74, 0))
projection, orientation, xlim, ylim
Map projections from the mapproj package (mercator (default), azequalarea, lagrange, etc.)

Position Adjustments

Position adjustments determine how to arrange geoms that would otherwise occupy the same space.

s <- ggplot(mpg, aes(fl, fill = drv))

s + geom_bar(position = "dodge")

Arrange elements side by side

s + geom_bar(position = "fill")

Stack elements on top of one another, normalize height

e + geom_point(position = "jitter")

Add random noise to X and Y position of each element to avoid overplotting

e + geom_label(position = "nudge")

Nudge labels away from points

s + geom_bar(position = "stack")

Stack elements on top of one another

Each position adjustment can be recast as a function with manual width and height arguments

s + geom_bar(position = position_dodge(width = 1))
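The effect of a position adjustment can be checked numerically as well as visually. The sketch below is one way to do that, assuming the mpg data from the text; the names p_stack, p_fill, and p_dodge are introduced only for this example.

```r
library(ggplot2)

s <- ggplot(mpg, aes(fl, fill = drv))

p_stack <- s + geom_bar(position = "stack")
p_fill  <- s + geom_bar(position = "fill")
p_dodge <- s + geom_bar(position = position_dodge(width = 1))

# ymax is the top edge of each bar segment after the position adjustment.
max(layer_data(p_stack)$ymax)   # tallest stack: largest total count for one fl
max(layer_data(p_fill)$ymax)    # stacks are normalized to height 1
max(layer_data(p_dodge)$ymax)   # tallest single bar: largest count for one (fl, drv) pair
```

Stacking accumulates the drv counts within each fl, filling rescales each stack to 1, and dodging leaves each (fl, drv) count at its own height.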

Themes

r + theme_bw()

White background with grid lines

r + theme_gray()

Grey background (default theme)

r + theme_dark()

Dark for contrast

r + theme_classic()

r + theme_light()

r + theme_linedraw()

r + theme_minimal()

Minimal themes

r + theme_void()

Empty theme

Zooming

Without clipping (preferred)

t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))

With clipping (removes unseen data points)

t + xlim(0, 100) + ylim(10, 20)

t + scale_x_continuous(limits = c(0, 100)) + scale_y_continuous(limits = c(0, 100))
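The difference between the two ways of zooming matters most when a stat is involved, because scale limits drop data before the stat runs. The sketch below illustrates this with a histogram rather than the scatterplot t used in the text; that substitution, the limits 20 to 30, and the names h, zoomed, and clipped are assumptions made for the example.

```r
library(ggplot2)

h <- ggplot(mpg, aes(hwy)) + geom_histogram(binwidth = 5)

# Zoom only: every observation is still binned, the view is just narrower.
zoomed <- h + coord_cartesian(xlim = c(20, 30))

# Scale limits: values outside 20..30 are dropped before binning.
# ggplot2 warns about the removed rows, so warnings are silenced here.
clipped <- h + xlim(20, 30)

n_zoomed  <- sum(layer_data(zoomed)$count)
n_clipped <- suppressWarnings(sum(layer_data(clipped)$count))

c(all = nrow(mpg), zoomed = n_zoomed, clipped = n_clipped)
```

The zoomed plot still counts every row of mpg, while the plot with scale limits counts fewer, which is why summaries such as smoothers or boxplots can change when limits are set on the scale.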

Legends

n + theme(legend.position = "bottom")

Place legend at "bottom", "top", "left", or "right"

n + guides(fill = "none")

Set legend type for each aesthetic: colorbar, legend, or none (no legend)

n + scale_fill_discrete(name = "Title", labels = c("A", "B", "C", "D", "E"))

Set legend title and labels with a scale function.

Faceting

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

Facets divide a plot into subplots based on the values of one or more discrete variables.

t + facet_grid(. ~ fl)

facet into columns based on fl

t + facet_grid(year ~ .)

facet into rows based on year

t + facet_grid(year ~ fl)

facet into both rows and columns
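The facet_grid() formulas above can be checked by counting the panels they produce. This is a minimal sketch assuming the mpg data from the text; base_plot, by_cols, and by_rows are names introduced only for the example.

```r
library(ggplot2)

base_plot <- ggplot(mpg, aes(cty, hwy)) + geom_point()

by_cols <- base_plot + facet_grid(. ~ fl)     # one column per fuel type
by_rows <- base_plot + facet_grid(year ~ .)   # one row per model year

# Each point is assigned to a panel; counting distinct panels
# shows how many subplots the facet specification produced.
length(unique(layer_data(by_cols)$PANEL))
length(unique(layer_data(by_rows)$PANEL))
```

The variable on the left of the ~ defines rows and the one on the right defines columns, with a dot standing for "no variable on this side".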

t + facet_wrap(~ fl)

wrap facets into a rectangular layout

Set scales to let axis limits vary across facets

t + facet_grid(drv ~ fl, scales = "free")

x and y axis limits adjust to individual facets

  • "free_x" - x axis limits adjust
  • "free_y" - y axis limits adjust

Set labeller to adjust facet labels

t + facet_grid(. ~ fl, labeller = label_both)

t + facet_grid(fl ~ ., labeller = label_bquote(alpha ^ .(fl)))

t + facet_grid(. ~ fl, labeller = label_parsed)

Learn more at docs.ggplot2.org and www.ggplot2-exts.org • ggplot2 2.1.0 • Updated: 11/16

Stats - An alternative way to build a layer

A stat builds new variables to plot (e.g., count, prop). Visualize a stat by changing the default stat of a geom function, geom_bar(stat = "count"), or by using a stat function, stat_count(geom = "bar"), which calls a default geom to make a layer (equivalent to a geom function). Use ..name.. syntax to map stat variables to aesthetics.

i + stat_density2d(aes(fill = ..level..), geom = "polygon")

1D distributions

c + stat_bin(binwidth = 1, origin = 10)

x, y | ..count.., ..ncount.., ..density.., ..ndensity..

c + stat_count(width = 1)

x, y | ..count.., ..prop..

c + stat_density(adjust = 1, kernel = "gaussian")

x, y | ..count.., ..density.., ..scaled..

2D distributions

e + stat_bin_2d(bins = 30, drop = T)

x, y, fill | ..count.., ..density..

e + stat_bin_hex(bins = 30)

x, y, fill | ..count.., ..density..

e + stat_density_2d(contour = TRUE, n = 100)

x, y, color, size | ..level..

e + stat_ellipse(level = 0.95, segments = 51, type = "t")

3 Variables

l + stat_contour(aes(z = z))

x, y, z, order | ..level..

l + stat_summary_hex(aes(z = z), bins = 30, fun = max)

x, y, z, fill | ..value..

l + stat_summary_2d(aes(z = z), bins = 30, fun = mean)

x, y, z, fill | ..value..

Comparisons

f + stat_boxplot(coef = 1.5)

x, y | ..lower.., ..middle.., ..upper.., ..width.., ..ymin.., ..ymax..

f + stat_ydensity(kernel = "gaussian", scale = "area")

x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..

Functions

e + stat_ecdf(n = 40)

x, y | ..x.., ..y..

e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~ log(x), method = "rq")

x, y | ..quantile..

e + stat_smooth(method = "lm", formula = y ~ x, se = T, level = 0.95)

x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..

General Purpose

ggplot() + stat_function(aes(x = -3:3), n = 99, fun = dnorm, args = list(sd = 0.5))

x | ..x.., ..y..

e + stat_identity(na.rm = TRUE)

ggplot() + stat_qq(aes(sample = 1:100), dist = qt, dparam = list(df = 5))

sample, x, y | ..sample.., ..theoretical..

e + stat_sum()

x, y, size | ..n.., ..prop..

e + stat_summary(fun.data = "mean_cl_boot")

h + stat_summary_bin(fun.y = "mean", geom = "bar")

e + stat_unique()
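The claim that a stat function and a geom function are two routes to the same layer can be verified directly. The sketch below assumes the mpg data and the d plot from the text; via_geom and via_stat are names introduced only for the example.

```r
library(ggplot2)

d <- ggplot(mpg, aes(fl))

via_geom <- d + geom_bar(stat = "count")    # geom with its default stat spelled out
via_stat <- d + stat_count(geom = "bar")    # stat with its default geom spelled out

# Both layers compute the same new variable, count, one value per fuel type.
layer_data(via_geom)$count
layer_data(via_stat)$count
```

Either form can be used; choosing the stat function tends to read better when the computed variable, rather than the shape drawn, is the point of the layer.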

Labels

t + labs(x = "New x axis label", y = "New y axis label", title = "Add a title above the plot", subtitle = "Add a subtitle below title", caption = "Add a caption below plot", <AES> = "New <AES> legend title")

t + annotate(geom = "text", x = 8, y = 9, label = "A")

geom - geom to place; x, y, label - manual values for the geom's aesthetics

Use scale functions to update legend labels

Scales

Scales map data values to the visual values of an aesthetic. To change a mapping, add a new scale.

(n <- d + geom_bar(aes(fill = fl)))

n + scale_fill_manual(values = c("skyblue", "royalblue", "blue", "navy"), limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"), name = "fuel", labels = c("D", "E", "P", "R"))

  • fill - aesthetic to adjust
  • manual - prepackaged scale to use
  • values - scale-specific arguments
  • limits - range of values to include in mapping
  • breaks - breaks to use in legend/axis
  • name - title to use in legend/axis
  • labels - labels to use in legend/axis

General Purpose scales

Use with most aesthetics

scale_*_continuous() - map continuous values to visual ones

scale_*_discrete() - map discrete values to visual ones

scale_*_identity() - use data values as visual ones

scale_*_manual(values = c()) - map discrete values to manually chosen visual ones

scale_*_date(date_labels = "%m/%d", date_breaks = "2 weeks") - treat data values as dates

scale_*_datetime() - treat data x values as date times. Use same arguments as scale_x_date(). See ?strptime for label formats.

X and Y location scales

Use with x or y aesthetics (x shown here)

scale_x_log10() - Plot x on log10 scale

scale_x_reverse() - Reverse direction of x axis

scale_x_sqrt() - Plot x on square root scale

Color and fill scales (Discrete)

n <- d + geom_bar(aes(fill = fl))

n + scale_fill_brewer(palette = "Blues")

For palette choices: RColorBrewer::display.brewer.all()

n + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")

Color and fill scales (Continuous)

o <- c + geom_dotplot(aes(fill = ..x..))

o + scale_fill_distiller(palette = "Blues")

o + scale_fill_gradient(low = "red", high = "yellow")

o + scale_fill_gradient2(low = "red", high = "blue", mid = "white", midpoint = 25)

o + scale_fill_gradientn(colours = topo.colors(6))

Also: rainbow(), heat.colors(), terrain.colors(), cm.colors(), RColorBrewer::brewer.pal()

Shape and size scales

p <- e + geom_point(aes(shape = fl, size = cyl))

p + scale_shape() + scale_size()

p + scale_shape_manual(values = c(3:7))

(Figure: chart of manual shape values 0 to 25)

p + scale_radius(range = c(1, 6))

p + scale_size_area(max_size = 6)

Maps to radius of circle, or area
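Adding a scale replaces the default mapping from data values to visual values. The sketch below shows this for a manual fill scale, assuming the mpg data from the text. The palette fuel_cols is hypothetical (the color names are illustrative choices), and it uses a named vector rather than the limits argument so that every fuel type gets a color.

```r
library(ggplot2)

# Hypothetical palette: the color names are illustrative choices.
fuel_cols <- c(c = "grey60", d = "skyblue", e = "royalblue",
               p = "blue", r = "navy")

n <- ggplot(mpg, aes(fl)) + geom_bar(aes(fill = fl))

# A named values vector matches colors to levels by name,
# so the order of the vector does not matter.
recolored <- n + scale_fill_manual(values = fuel_cols, name = "fuel")

# Inspect which fill color each bar received.
layer_data(recolored)[, c("x", "fill")]
```

Naming the values is a defensive choice: if the levels of fl were reordered, each fuel type would keep its color.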