Simplifying vs. preserving subsetting

Published by onesixx on

http://adv-r.had.co.nz/Subsetting.html

Simplifying subsetting
returns the simplest possible data structure that can represent the output, and
is useful interactively because it usually gives you what you want.

Preserving subsetting
keeps the structure of the output the same as the input, and
is generally better for programming because the result will always be the same type.
Omitting drop = FALSE when subsetting matrices and data frames is one of the most common sources of programming errors. (It will work for your test cases, but then someone will pass in a single column data frame and it will fail in an unexpected and unclear way.)
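A minimal sketch of that failure mode. The helper names `n_rows_selected()` and `n_rows_safe()` and the sample data are illustrative assumptions, not part of the original text:

```r
# Hypothetical helper: select some columns, then count the rows of the result.
n_rows_selected <- function(df, cols) {
  sub <- df[, cols]                # simplifying: one column becomes a vector
  nrow(sub)
}

# Same helper with preserving subsetting.
n_rows_safe <- function(df, cols) {
  sub <- df[, cols, drop = FALSE]  # preserving: always a data frame
  nrow(sub)
}

df <- data.frame(a = 1:3, b = 4:6)

n_rows_selected(df, c("a", "b"))   # 3
n_rows_selected(df, "a")           # NULL, because nrow() of a plain vector is NULL
n_rows_safe(df, "a")               # 3
```

The two-column call behaves as expected, so a test that only uses several columns would not reveal the problem.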

Unfortunately, how you switch between simplifying and preserving differs for different data types,
as summarised in the table below.

               Simplifying            Preserving
Vector         x[[1]]                 x[1]
List           x[[1]]                 x[1]
Factor         x[1:4, drop = TRUE]    x[1:4]
Array, Matrix  x[1, ] or x[, 1]       x[1, , drop = FALSE] or x[, 1, drop = FALSE]
Data frame     x[, 1] or x[[1]]       x[, 1, drop = FALSE] or x[1]

Preserving is the same for all data types: you get the same type of output as input.
Simplifying behaviour varies slightly between different data types, as described below:
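As a quick check of the first claim, the sketch below compares the type of the result for each preserving and simplifying form. The objects `x`, `y`, `m` and `df` are small illustrative examples chosen here:

```r
x  <- c(a = 1, b = 2)
y  <- list(a = 1, b = 2)
m  <- matrix(1:4, nrow = 2)
df <- data.frame(a = 1:2, b = 3:4)

# Preserving: the result has the same type as the input.
is.list(y[1])                          # TRUE
is.matrix(m[1, , drop = FALSE])        # TRUE
is.data.frame(df[, 1, drop = FALSE])   # TRUE
is.data.frame(df[1])                   # TRUE
!is.null(names(x[1]))                  # TRUE, the name is kept

# Simplifying: the result is a simpler structure.
is.list(y[[1]])                        # FALSE
is.matrix(m[1, ])                      # FALSE
is.data.frame(df[, 1])                 # FALSE
is.null(names(x[[1]]))                 # TRUE, the name is dropped
```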

Atomic vector: removes names.

x <- c(a = 1, b = 2)
x
#> a b
#> 1 2

x[[1]]   # simplifying: the name is dropped
#> [1] 1

x[1]     # preserving: the name is kept
#> a
#> 1

List: returns the object inside the list, not a single-element list.

y <- list(a = 1, b = 2)
y
#> $a
#> [1] 1
#>
#> $b
#> [1] 2

y[[1]]        # simplifying: the element itself
#> [1] 1
str(y[[1]])
#>  num 1

y[1]          # preserving: a list of length one
#> $a
#> [1] 1
str(y[1])
#> List of 1
#>  $ a: num 1

Factor: drops any unused levels.

z <- factor(c("a", "b"))
z
#> [1] a b
#> Levels: a b

z[1, drop = TRUE]   # simplifying: the unused level "b" is dropped
#> [1] a
#> Levels: a

z[1]                # preserving: all levels are kept
#> [1] a
#> Levels: a b

Matrix or array: if any of the dimensions has length 1, drops that dimension.

a <- matrix(1:4, nrow = 2)
a
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4

a[1, ]                 # simplifying: the row dimension is dropped, leaving a vector
#> [1] 1 3

a[1, , drop = FALSE]   # preserving: still a 1 x 2 matrix
#>      [,1] [,2]
#> [1,]    1    3
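The same rule applies to arrays with more than two dimensions. A small sketch, using an illustrative 2 x 3 x 4 array named `arr` (an assumption made for this example):

```r
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr[1, , ])                 # 3 4: the first dimension is dropped
dim(arr[1, 1, ])                # NULL: two dimensions dropped, leaving a plain vector
dim(arr[1, , , drop = FALSE])   # 1 3 4: all three dimensions are kept
```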

Data frame: if output is a single column, returns a vector instead of a data frame.
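A short sketch of this behaviour for a base R data frame. The data frame `df` is an illustrative example, and the comments describe what `str()` reports rather than reproducing its output verbatim:

```r
df <- data.frame(a = 1:2, b = 1:2)

str(df[1])                     # preserving: a data frame with one column
str(df[, "a", drop = FALSE])   # preserving: a data frame with one column

str(df[[1]])                   # simplifying: an integer vector
str(df[, "a"])                 # simplifying: an integer vector
```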


Categories: R Reshaping
