cbind(), rbind(), merge(): Combining Data

Published by onesixx

 

 

 

rbind, rbindlist

https://github.com/Rdatatable/data.table/issues/600

Sample data (defined column by column)

library(data.table)

dt1 <- data.table(A=c("a", "b", "c"),
                  B=c(10, 20, 30),
                  C=c(FALSE, TRUE, TRUE))
dt2 <- data.table(A=c("d", "e"),
                  B=c(40, 50),
                  C=c(TRUE, FALSE))
> dt1                    > dt2              
   A  B     C               A  B     C
1: a 10 FALSE            1: d 40  TRUE
2: b 20  TRUE            2: e 50 FALSE
3: c 30  TRUE                         
rbind(dt1, dt2)                        # stringsAsFactors is not an argument of data.table's rbind
rbindlist(list(dt1, dt2))              # same result
   A  B     C
1: a 10 FALSE
2: b 20  TRUE
3: c 30  TRUE
4: d 40  TRUE
5: e 50 FALSE
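As a quick check of the claim that the two calls agree, the sketch below compares their results and also tries the idcol argument of rbindlist(). The list names first and second and the column name source are made up for illustration.

```r
library(data.table)

dt1 <- data.table(A = c("a", "b", "c"), B = c(10, 20, 30), C = c(FALSE, TRUE, TRUE))
dt2 <- data.table(A = c("d", "e"), B = c(40, 50), C = c(TRUE, FALSE))

by_rbind     <- rbind(dt1, dt2)
by_rbindlist <- rbindlist(list(dt1, dt2))

# Both calls should give the same five-row table.
print(all.equal(by_rbind, by_rbindlist))

# With a named list, idcol records which element each row came from.
# "first", "second" and "source" are hypothetical names.
tagged <- rbindlist(list(first = dt1, second = dt2), idcol = "source")
print(tagged)
```

rbindlist() takes a single list, so it is the natural choice when the tables are already collected in a list, for example as the result of lapply().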

 

Sample data (defined row by row)

dl <- list(r1=c("a", 10, FALSE),
           r2=c("b", 20, TRUE), 
           r3=c("c", 30, TRUE),
           r4=c("d", 40, FALSE),
           r5=c("e", 50, TRUE))
$r1
[1] "a"     "10"    "FALSE"
$r2
[1] "b"    "20"   "TRUE"
$r3
[1] "c"    "30"   "TRUE"
$r4
[1] "d"     "40"    "FALSE"
$r5
[1] "e"    "50"   "TRUE"

The result is a matrix, and every element is of type character, because c() had already coerced each mixed vector to character.

do.call(rbind, dl)
do.call(cbind, transpose(dl))    # same values (data.table::transpose), but without row names
> do.call(rbind, dl)         > do.call(cbind, transpose(dl))
   [,1] [,2] [,3]                   [,1] [,2] [,3] 
r1 "a"  "10" "FALSE"           [1,] "a"  "10" "FALSE"
r2 "b"  "20" "TRUE"            [2,] "b"  "20" "TRUE" 
r3 "c"  "30" "TRUE"            [3,] "c"  "30" "TRUE"
r4 "d"  "40" "FALSE"           [4,] "d"  "40" "FALSE"
r5 "e"  "50" "TRUE"            [5,] "e"  "50" "TRUE" 
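Because everything ends up as character, the numeric and logical columns need to be converted back if they are to be used as such. The following is one possible sketch, not the only way to do it: it turns the matrix into a data.table and re-infers the types with type.convert(). The column names row, A, B and C are assumptions chosen to mirror the earlier example.

```r
library(data.table)

dl <- list(r1 = c("a", 10, FALSE),
           r2 = c("b", 20, TRUE),
           r3 = c("c", 30, TRUE),
           r4 = c("d", 40, FALSE),
           r5 = c("e", 50, TRUE))

m <- do.call(rbind, dl)   # 5 x 3 character matrix with row names r1..r5

# keep.rownames = TRUE stores the row names in a column called "rn".
typed <- as.data.table(m, keep.rownames = TRUE)
setnames(typed, c("row", "A", "B", "C"))   # hypothetical column names

# Re-infer the types of B and C from their text representation.
typed[, c("B", "C") := lapply(.SD, type.convert, as.is = TRUE), .SDcols = c("B", "C")]
str(typed)
```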

rbindlist() fails with the error below, because it cannot bind list elements that are plain vectors.

rbindlist(dl)
Error in rbindlist(dl) : 
  Item 1 of list input is not a data.frame, data.table or list

Workarounds

ldply(dl, rbind)                     # plyr

transpose(dl) %>% as.data.table()    # data.table, with the magrittr pipe
as.data.table(dl) %>% transpose()

bind_rows(dl) %>% t()                # dplyr; t() returns a character matrix
bind_cols(dl) %>% t()

rbindlist(lapply(seq_along(dl), function(i) data.frame(dl[[i]][1], dl[[i]][2], dl[[i]][3])))
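Another option, offered here as a sketch rather than as the author's method, is to convert each vector to a list first, since rbindlist() accepts list items. The names row, A, B and C are again assumptions.

```r
library(data.table)

dl <- list(r1 = c("a", 10, FALSE),
           r2 = c("b", 20, TRUE),
           r3 = c("c", 30, TRUE),
           r4 = c("d", 40, FALSE),
           r5 = c("e", 50, TRUE))

# as.list() turns each atomic vector into a list of three length-one values,
# which rbindlist() can treat as one row.
bound <- rbindlist(lapply(dl, as.list), idcol = "row")
setnames(bound, c("row", "A", "B", "C"))   # hypothetical column names
print(bound)
```

All columns are still character at this point, so a type conversion step is needed afterwards if numeric or logical columns are wanted.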

 

Merge = Join 

http://onesixx.com/data-table/#joinmerge
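For a quick illustration of merge as a join, here is a minimal sketch using two made-up tables, left and right, which are not taken from the linked page. It shows an inner join, a left join with merge(), and the same left join written with data.table's on= syntax.

```r
library(data.table)

# Hypothetical tables sharing the key column A.
left  <- data.table(A = c("a", "b", "c"), B = c(10, 20, 30))
right <- data.table(A = c("a", "c", "d"), D = c("x", "y", "z"))

inner     <- merge(left, right, by = "A")                # keys present in both tables
left_join <- merge(left, right, by = "A", all.x = TRUE)  # every row of left is kept

# The same left join in data.table's own syntax: look up each row of left in right.
joined <- right[left, on = "A"]
print(joined)
```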

dplyr bind_rows, bind_cols (rbind, cbind)

http://dplyr.tidyverse.org/reference/bind.html

bind_rows

bind_rows(..., .id = NULL)

bind_cols(...)

combine(...)

 

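  • Fast: an efficient replacement for the do.call(rbind, dfs) and do.call(cbind, dfs) patterns.
  • The .id argument adds a column identifying which data.frame each row came from.
bind_rows("group 1" = one, "group 2" = two, .id = "groups")
  • With bind_rows(), columns missing from one input are filled with NA. bind_cols() does not pad with NA; its inputs must have the same number of rows.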
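The .id and NA-filling behaviour can be seen in a small sketch. The data frames one and two below are made-up stand-ins with deliberately different columns.

```r
library(dplyr)

# Hypothetical inputs: both have x, only one has y, only two has z.
one <- data.frame(x = 1:2, y = c("a", "b"), stringsAsFactors = FALSE)
two <- data.frame(x = 3L, z = TRUE)

combined <- bind_rows("group 1" = one, "group 2" = two, .id = "groups")
print(combined)
```

The result has the columns groups, x, y and z. The y value is NA for the row that came from two, and z is NA for the rows that came from one.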

 

 

Categories: Reshaping
