
Summarise Cases
group_by(.data, ..., add = FALSE)
Returns copy of table grouped by …
g_iris <- group_by(iris, Species)
ungroup(x, …)
Returns ungrouped copy of table.
ungroup(g_iris)
Use group_by() to create a "grouped" copy of a table.
dplyr functions will manipulate each "group" separately and
then combine the results.
mtcars %>%
group_by(cyl) %>%
summarise(avg = mean(mpg))
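To make the per-group behaviour concrete, here is a minimal sketch that runs the same summary with and without grouping on the built-in mtcars data. The object names overall and by_cyl are illustrative only.

```r
library(dplyr)

# Ungrouped: summarise() collapses the whole table to one row.
overall <- mtcars %>%
  summarise(avg = mean(mpg))

# Grouped: the same summary is computed once per value of cyl,
# and the per-group results are combined into one table.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(avg = mean(mpg))

print(overall)
print(by_cyl)
```

The grouped result has one row per distinct value of cyl, plus the grouping column itself.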
These apply summary functions to columns to create a new
table. Summary functions take vectors as input and return one
value (see back).
VARIATIONS
summarise_all() - Apply funs to every column.
summarise_at() - Apply funs to specific columns.
summarise_if() - Apply funs to all cols of one type.
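A minimal sketch of the three variations follows. It assumes the installed dplyr accepts a bare function such as mean in the funs position; that is an assumption not stated on this sheet, and the object names are illustrative.

```r
library(dplyr)

# summarise_all(): apply mean to every column of the table.
all_means <- summarise_all(faithful, mean)

# summarise_at(): apply mean only to the columns picked by vars().
sepal_means <- summarise_at(iris, vars(starts_with("Sepal")), mean)

# summarise_if(): apply mean to every column that passes the
# predicate, here every numeric column.
numeric_means <- summarise_if(iris, is.numeric, mean)
```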
summarise(.data, …)
Compute table of summaries. Also summarise_().
summarise(mtcars, avg = mean(mpg))
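summarise() can compute several summaries at once. The sketch below is one way to do that; the column names avg_mpg, med_mpg, max_hp and n_cars are made up for illustration.

```r
library(dplyr)

# Each argument has the form name = summary_function(column).
# Every summary function reduces a whole column (a vector)
# to a single value, so the result has exactly one row.
stats <- summarise(
  mtcars,
  avg_mpg = mean(mpg),
  med_mpg = median(mpg),
  max_hp  = max(hp),
  n_cars  = n()        # n() counts the rows being summarised
)

print(stats)
```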
count(x, ..., wt = NULL, sort = FALSE)
Count number of rows in each group defined by the variables in … Also tally().
count(iris, Species)
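As a rough illustration of how count() relates to group_by() and tally(), the sketch below builds the same counts both ways. Object names are illustrative.

```r
library(dplyr)

# count() groups by the given variables and counts rows per group.
counted <- count(iris, Species)

# The longer form: group first, then tally() the rows per group.
tallied <- iris %>%
  group_by(Species) %>%
  tally()

# sort = TRUE orders the result from the largest count down.
by_cyl <- count(mtcars, cyl, sort = TRUE)
```

Both forms put the counts in a column named n.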
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more with browseVignettes(package = c("dplyr", "tibble")) • dplyr 0.5.0 • tibble 1.2.0 • Updated: 2017-01
dplyr functions work with pipes and expect tidy data. In tidy data:
Each variable is in its own column
Each observation, or case, is in its own row
pipes
x %>% f(y) becomes f(x, y)
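The pipe rewrite rule can be checked directly. This sketch writes the same computation as a nested call and as a pipeline; the names nested and piped are illustrative.

```r
library(dplyr)

# Nested call: read from the inside out.
nested <- summarise(group_by(mtcars, cyl), avg = mean(mpg))

# Piped call: x %>% f(y) is evaluated as f(x, y), so each step
# receives the previous result as its first argument.
piped <- mtcars %>%
  group_by(cyl) %>%
  summarise(avg = mean(mpg))
```

The two objects hold the same values; the pipeline only changes the reading order.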
filter(.data, …)
Extract rows that meet logical criteria. Also filter_().
filter(iris, Sepal.Length > 7)
distinct(.data, ..., .keep_all = FALSE)
Remove rows with duplicate values. Also distinct_().
distinct(iris, Species)
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = parent.frame())
Randomly select fraction of rows.
sample_frac(iris, 0.5, replace = TRUE)
sample_n(tbl, size, replace = FALSE, weight = NULL, .env = parent.frame())
Randomly select size rows.
sample_n(iris, 10, replace = TRUE)
slice(.data, …)
Select rows by position. Also slice_().
slice(iris, 10:15)
top_n(x, n, wt)
Select the top n entries ranked by wt (by group if grouped data).
top_n(iris, 5, Sepal.Width)
Row functions return a subset of rows as a new table. Use a
variant that ends in _ for non-standard evaluation friendly code.
See ?base::Logic and ?Comparison for help.
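A short sketch of how comparison and logical operators combine inside filter(), alongside two of the other row functions. Object names are illustrative only.

```r
library(dplyr)

# A comparison operator builds the logical test applied to each row.
long_sepals <- filter(iris, Sepal.Length > 7)

# Conditions separated by commas are combined with AND.
big_setosa <- filter(iris, Species == "setosa", Sepal.Width > 4)

# | means OR; %in% tests membership in a set of values.
four_or_six <- filter(mtcars, cyl == 4 | cyl == 6)
same_rows   <- filter(mtcars, cyl %in% c(4, 6))

# distinct() keeps unique values; slice() selects by position.
species       <- distinct(iris, Species)
rows_10_to_15 <- slice(iris, 10:15)
```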
arrange(.data, …)
Order rows by values of a column or columns (low to high). Use with desc() to order from high to low.
arrange(mtcars, mpg)
arrange(mtcars, desc(mpg))
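When several columns are given, later columns break ties in earlier ones. A minimal sketch, with illustrative object names:

```r
library(dplyr)

# Ascending order is the default.
low_to_high <- arrange(mtcars, mpg)

# desc() reverses the order for one column.
high_to_low <- arrange(mtcars, desc(mpg))

# Sort by cyl ascending, then by mpg descending within each cyl.
by_cyl_then_mpg <- arrange(mtcars, cyl, desc(mpg))
```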
add_row(.data, ..., .before = NULL, .after = NULL)
Add one or more rows to a table.
add_row(faithful, eruptions = 1, waiting = 1)
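The .before and .after arguments control where the new row lands. A small sketch, loading tibble directly since add_row() is defined there; object names are illustrative.

```r
library(tibble)

# With no position given, the row is appended at the end.
appended <- add_row(faithful, eruptions = 1, waiting = 1)

# .before = 1 inserts the new row ahead of the current first row.
prepended <- add_row(faithful, eruptions = 1, waiting = 1, .before = 1)
```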
Group Cases
Manipulate Cases
EXTRACT VARIABLES
ADD CASES
ARRANGE CASES
Logical and boolean operators to use with filter()
Column functions return a set of columns as a new table. Use a
variant that ends in _ for non-standard evaluation friendly code.
contains(match)
ends_with(match)
matches(match)
:, e.g. mpg:cyl
-, e.g. -Species
num_range(prefix, range)
one_of(…)
starts_with(match)
select(.data, …)
Extract columns by name. Also select_if()
select(iris, Sepal.Length, Species)
Manipulate Variables
Use these helpers with select(),
e.g. select(iris, starts_with("Sepal"))
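A quick sketch of several helpers side by side, using the built-in iris and mtcars tables. The object names are illustrative.

```r
library(dplyr)

sepal_cols  <- select(iris, starts_with("Sepal"))  # name prefix
width_cols  <- select(iris, ends_with("Width"))    # name suffix
petal_cols  <- select(iris, contains("etal"))      # substring
dotted_cols <- select(iris, matches("\\."))        # regular expression
a_range     <- select(mtcars, mpg:hp)              # mpg through hp
no_species  <- select(iris, -Species)              # all but Species
```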
These apply vectorized functions to columns. Vectorized funs take
vectors as input and return vectors of the same length as output
(see back).
mutate(.data, …)
Compute new column(s).
mutate(mtcars, gpm = 1/mpg)
transmute(.data, …)
Compute new column(s), drop others.
transmute(mtcars, gpm = 1/mpg)
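The difference between mutate() and transmute() shows up in the columns they return. A minimal sketch with illustrative object names:

```r
library(dplyr)

# mutate() keeps every existing column and appends the new one.
with_gpm <- mutate(mtcars, gpm = 1 / mpg)

# transmute() returns only the columns it computes.
only_gpm <- transmute(mtcars, gpm = 1 / mpg)

# 1 / mpg is vectorized: one output value per input value,
# so the new column always matches the table's row count.
print(dim(with_gpm))
print(dim(only_gpm))
```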
mutate_all(.tbl, .funs, …)
Apply funs to every column. Use with funs().
mutate_all(faithful, funs(log(.), log2(.)))
mutate_at(.tbl, .cols, .funs, …)
Apply funs to specific columns. Use with funs(), vars() and the helper functions for select().
mutate_at(iris, vars(-Species), funs(log(.)))
mutate_if(.tbl, .predicate, .funs, …)
Apply funs to all columns of one type. Use with funs().
mutate_if(iris, is.numeric, funs(log(.)))
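A sketch of the three scoped variants side by side. It passes the bare function log rather than a funs() wrapper, on the assumption that the installed dplyr accepts that form (recent releases deprecate funs()); this is not stated on the sheet. Object names are illustrative.

```r
library(dplyr)

# Every column of faithful is numeric, so log() applies to all.
log_all <- mutate_all(faithful, log)

# vars() accepts the same helpers as select(); -Species leaves
# the one non-numeric column out of the set to transform.
log_at <- mutate_at(iris, vars(-Species), log)

# The predicate is.numeric picks the columns by type instead.
log_if <- mutate_if(iris, is.numeric, log)
```

On iris, mutate_at() and mutate_if() transform the same four columns here and leave Species untouched.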
add_column(.data, ..., .before = NULL, .after = NULL)
Add new column(s).
add_column(mtcars, new = 1:32)
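As with add_row(), the .before and .after arguments set the position. A small sketch on the faithful table; the column name id is made up for illustration.

```r
library(tibble)

# By default the new column is appended after the last column.
at_end <- add_column(faithful, id = seq_len(nrow(faithful)))

# .before = 1 places it ahead of the current first column.
at_start <- add_column(faithful, id = seq_len(nrow(faithful)), .before = 1)
```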
rename(.data, …) Rename columns.
rename(iris, Length = Sepal.Length)
MAKE NEW VARIABLES
EXTRACT CASES
summary function
vectorized function
Data Transformation with dplyr :: CHEAT SHEET