Simplifying vs. preserving subsetting
http://adv-r.had.co.nz/Subsetting.html
Simplifying subsetting returns the simplest possible data structure that can represent the output, and is useful interactively because it usually gives you what you want.
Preserving subsetting keeps the structure of the output the same as the input, and is generally better for programming because the result will always be the same type.
Omitting drop = FALSE when subsetting matrices and data frames is one of the most common sources of programming errors. (It will work for your test cases, but then someone will pass in a single-column data frame and it will fail in an unexpected and unclear way.)
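A minimal sketch of that failure mode. The helpers `col_means_fragile()` and `col_means_safe()` and the sample data frames are hypothetical, made up only to illustrate the point:

```r
# Hypothetical helper: keep the numeric columns, then average each one.
# Without drop = FALSE, a single remaining column is simplified to a vector.
col_means_fragile <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  num <- df[, is_num]                # simplifies when only one column matches
  colMeans(num)                      # colMeans() needs at least two dimensions
}

# Same helper, but the subset is always a data frame.
col_means_safe <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  num <- df[, is_num, drop = FALSE]
  colMeans(num)
}

two_cols <- data.frame(a = 1:3, b = 4:6)
one_col  <- data.frame(a = 1:3)

col_means_fragile(two_cols)          # works: the subset is still a data frame

# With a single column the subset is a plain vector and colMeans() errors.
fragile_result <- try(col_means_fragile(one_col), silent = TRUE)
inherits(fragile_result, "try-error")   # TRUE

col_means_safe(one_col)              # works for any number of columns
```

The two-column call is the "test case" that passes; only the single-column input exposes the missing drop = FALSE.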
Unfortunately, how you switch between simplifying and preserving differs for different data types,
as summarised in the table below.
Data type | Simplifying | Preserving |
---|---|---|
Vector | x[[1]] | x[1] |
List | x[[1]] | x[1] |
Factor | x[1:4, drop = TRUE] | x[1:4] |
Array, Matrix | x[1, ] or x[, 1] | x[1, , drop = FALSE] or x[, 1, drop = FALSE] |
Data frame | x[[1]] or x[, 1] | x[1] or x[, 1, drop = FALSE] |
Preserving is the same for all data types: you get the same type of output as input.
Simplifying behaviour varies slightly between different data types, as described below:
Atomic vector: removes names.
Code | Result |
---|---|
x <- c(a = 1, b = 2) | named vector with a = 1, b = 2 |
x[[1]] | [1] 1 (name removed) |
x[1] | a = 1 (name kept) |
List: returns the object inside the list, not a single-element list.
Code | Result |
---|---|
y <- list(a = 1, b = 2) | list with elements $a = 1 and $b = 2 |
y[[1]] | [1] 1 |
str(y[[1]]) | num 1 |
y[1] | $a [1] 1 |
str(y[1]) | List of 1, $ a: num 1 |
Factor: drops any unused levels.
Code | Result |
---|---|
z <- factor(c("a", "b")) | [1] a b, Levels: a b |
z[1, drop = TRUE] | [1] a, Levels: a |
z[1] | [1] a, Levels: a b |
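As a sketch of level dropping on a slightly larger factor (the three-level factor `f` is an illustrative assumption, not part of the original example). It also shows that `[[` on a factor extracts one element but keeps every level, so drop = TRUE is what actually simplifies:

```r
f <- factor(c("a", "b", "c"))

kept    <- f[1:2]                # preserving: unused level "c" is retained
dropped <- f[1:2, drop = TRUE]   # simplifying: unused level "c" is dropped

levels(kept)                     # "a" "b" "c"
levels(dropped)                  # "a" "b"

# `[[` returns a single element but does not drop unused levels.
levels(f[[1]])                   # "a" "b" "c"

# droplevels() removes unused levels after the fact.
levels(droplevels(kept))         # "a" "b"
```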
Matrix or array: if any of the dimensions has length 1, drops that dimension.
Code | Result |
---|---|
a <- matrix(1:4, nrow = 2) | 2 x 2 matrix with rows (1, 3) and (2, 4) |
a[1, ] | [1] 1 3 (plain vector, no dimensions) |
a[1, , drop = FALSE] | 1 x 2 matrix with row (1, 3) |
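The same rule applies to higher-dimensional arrays, where more than one dimension can be dropped at once. A small sketch, assuming a hypothetical 2 x 2 x 2 array `arr`:

```r
arr <- array(1:8, dim = c(2, 2, 2))

# One index of length 1: that dimension is dropped, leaving a 2 x 2 matrix.
dim(arr[1, , ])                  # 2 2

# Two indices of length 1: both dimensions are dropped, leaving a plain vector.
dim(arr[1, 1, ])                 # NULL

# drop = FALSE keeps all three dimensions, even those of length 1.
dim(arr[1, 1, , drop = FALSE])   # 1 1 2
```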
Data frame: if output is a single column, returns a vector instead of a data frame.
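A minimal sketch of this behaviour, using a hypothetical two-column data frame `df` (the data are an assumption for illustration only):

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

# Simplifying: a single column comes back as a plain vector.
is.data.frame(df[[1]])                 # FALSE
is.data.frame(df[, 1])                 # FALSE

# Preserving: the result is still a data frame with one column.
is.data.frame(df[1])                   # TRUE
is.data.frame(df[, 1, drop = FALSE])   # TRUE

# With more than one column selected there is nothing to simplify.
is.data.frame(df[, 1:2])               # TRUE
```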