tstrsplit

Equivalent to transpose(strsplit(…)).

> strsplit("A text I want to display with spaces", NULL)
[[1]]
[1] "A" " " "t" "e" "x" "t" " " "I" " " "w" "a" "n" "t" " " "t" "o" " " "d" "i" "s" "p" "l" "a" "y" " " "w" "i" "t" Read more…
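To make the equivalence concrete, here is a minimal sketch, assuming the data.table package is installed. The vector `x` and the "_" separator are made up for illustration and do not come from the post.

```r
library(data.table)

# Hypothetical input: three strings, each with two "_"-separated fields
x <- c("a_1", "b_2", "c_3")

# strsplit() returns one list element per input string ...
by_string <- strsplit(x, "_", fixed = TRUE)

# ... while tstrsplit() returns one list element per field (the transpose)
by_field <- tstrsplit(x, "_", fixed = TRUE)

# Transposing the strsplit() result by hand gives the same per-field layout
by_hand <- transpose(by_string)
```

The per-field layout is what makes tstrsplit() convenient on the right-hand side of `:=`, where each list element becomes one new column.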

data.table

> A <- as.data.frame(a)
> A
      a
1    NA
2  TRUE
3 FALSE
4    NA
> AA <- as.data.table(A)
> AA[,(nn):=NULL]
Error in eval(lhs, parent.frame(), parent.frame()) : object ‘nn’ not found
> AA[,(nn):=NA]
Error in eval(lhs, parent.frame(), parent.frame()) : object ‘nn’ not found
> AA[,.(nn):=NA]
Error in eval(lhs, parent.frame(), Read more…
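The errors in this excerpt arise because `(nn)` on the left of `:=` is evaluated as a variable, and no such variable exists. The following is a minimal sketch of the working pattern, assuming data.table is installed. The table contents and the value stored in `nn` are hypothetical; the excerpt is truncated, so this may not be the point the post goes on to make.

```r
library(data.table)

# Hypothetical table with two columns
AA <- data.table(a = c(NA, TRUE, FALSE, NA), b = 1:4)

# Parentheses on the left of := mean "use the column names stored in this
# variable", so nn has to exist before the call
nn <- "b"

AA[, (nn) := NA]     # overwrite column b with NA, by reference
AA[, (nn) := NULL]   # then drop column b entirely
```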

fst: Lightning Fast Serialization of Data Frames for R

https://cran.r-project.org/web/packages/fst/index.html   0.8.4
https://cran.r-project.org/web/packages/fst/fst.pdf
http://www.fstpackage.org/

A fast, easy and flexible way to serialize data frames.

write/read

Test data frame: generate a random data frame with 1 million rows and various column types.

nr_of_rows <- 1e6
df <- data.frame(
  Logical = sample(c(TRUE, FALSE, NA), prob = c(0.85, 0.1, 0.05), nr_of_rows, replace = TRUE), Read more…

APPLY

https://stackoverflow.com/questions/3505701/grouping-functions-tapply-by-aggregate-and-the-apply-family

M <- matrix(seq(1,16), 4, 4)
D <- data.frame(M)
T <- data.table(M)
L <- list(a=1, b=1:3, c=10:16)

> M
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16
> D
  X1 X2 X3 X4 Read more…
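As a small illustration of how the apply family behaves on these sample objects, here is a base R sketch. It reuses `M` and `L` as defined in the excerpt; the choice of `sum` and `length` as the applied functions is mine.

```r
# Same sample objects as in the excerpt (base R only)
M <- matrix(seq(1, 16), 4, 4)
L <- list(a = 1, b = 1:3, c = 10:16)

row_sums <- apply(M, 1, sum)    # MARGIN = 1: call sum() on each row
col_sums <- apply(M, 2, sum)    # MARGIN = 2: call sum() on each column

len_list <- lapply(L, length)   # always returns a list
len_vec  <- sapply(L, length)   # simplifies to a named vector when it can
```

lapply() is the safer choice in functions because its return type never changes; sapply() is handy interactively.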

cbind(), rbind(), merge(): combining tables

rbind, rbindlist
https://github.com/Rdatatable/data.table/issues/600

Sample data (defined column by column)

dt1 <- data.table(A=c("a", "b", "c"), B=c(10, 20, 30), C=c(FALSE, TRUE, TRUE))
dt2 <- data.table(A=c("d", "e"), B=c(40, 50), C=c(TRUE, FALSE))

> dt1               > dt2
   A  B     C          A  B     C
1: a 10 FALSE       1: d 40  TRUE
2: b Read more…
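The sample tables above can be stacked as in the following minimal sketch, which assumes data.table is installed. `dt3` is a hypothetical extra table, not from the post, added only to show what happens when the column sets differ.

```r
library(data.table)

dt1 <- data.table(A = c("a", "b", "c"), B = c(10, 20, 30), C = c(FALSE, TRUE, TRUE))
dt2 <- data.table(A = c("d", "e"), B = c(40, 50), C = c(TRUE, FALSE))

# Both calls stack the rows of dt2 under dt1
stacked_rbind <- rbind(dt1, dt2)
stacked_list  <- rbindlist(list(dt1, dt2))

# Hypothetical table without column C: use.names matches columns by name,
# fill = TRUE pads the missing column with NA
dt3 <- data.table(A = "f", B = 60)
padded <- rbindlist(list(dt1, dt3), use.names = TRUE, fill = TRUE)
```

rbindlist() takes a list, so it suits the case where the number of tables is not known in advance.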

Simplifying vs. preserving subsetting

http://adv-r.had.co.nz/Subsetting.html

Simplifying subsets returns the simplest possible data structure that can represent the output, and is useful interactively because it usually gives you what you want. Preserving subsetting keeps the structure of the output the same as the input, and is generally better for programming because the result will always Read more…
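A short base R sketch of the distinction follows. The list `x` and the data frame `df` are made-up examples, not taken from the linked chapter.

```r
# Hypothetical list and data frame
x  <- list(a = 1, b = "two")
df <- data.frame(a = 1:3, b = letters[1:3])

kept       <- x["a"]     # preserving: still a list, of length 1
simplified <- x[["a"]]   # simplifying: the element itself (a number)

col_simplified <- df[, "a"]                 # simplifying: a plain vector
col_preserved  <- df[, "a", drop = FALSE]   # preserving: a one-column data frame
```

Inside functions, `drop = FALSE` avoids the case where selecting a single column silently changes the return type.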

truncated

[list output truncated]
str(data, list.len=ncol(data))
data %>% str(list.len=ncol(.))

reached getOption("max.print") -- omitted
options(max.print=666666)

Resetting the option: when the R session restarts, the option is reset to its default value.

Logical AND (&, &&) and logical OR (|, ||)

https://stackoverflow.com/questions/6558921/r-boolean-operators-and
http://www.burns-stat.com/pages/Tutor/R_inferno.pdf
http://stat.ethz.ch/R-manual/R-patched/library/base/html/Logic.html

&& returns a single result: it evaluates from left to right and compares only the first element of each vector. (Since R 4.3.0, passing a vector longer than one element to && or || is an error rather than a silent truncation.)

((-2:2) >= 0) && ((-2:2) <= 0)
[1] FALSE
-2 -2 FALSE

According to the help page, && and || are "appropriate for programming control-flow and [is] typically Read more…
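The difference between the two forms can be sketched in base R as follows. The example keeps both operands of `&&` at length one so that it also runs on R 4.3.0 and later, where longer operands are an error; the variable names are mine.

```r
x <- -2:2

# & works element by element and returns a vector as long as its inputs
elementwise <- (x >= 0) & (x <= 0)

# && expects length-one operands; reduce vectors with all() or any() first
single <- all(x >= -2) && any(x == 0)

# && also short-circuits: the right-hand side is not evaluated
# when the left-hand side is already FALSE
safe <- FALSE && stop("never evaluated")
```

This is why `&&` and `||` belong in `if` conditions, while `&` and `|` belong in vectorised filters.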

package :: tidyverse

Pre-install

dependencies ‘httr’, ‘rvest’, ‘xml2’ are not available for package ‘tidyverse’

sudo apt install libssl-dev
sudo apt install libcurl4-openssl-dev
sudo apt install libxml2-dev

Installation

install.packages("tidyverse")
Installing package into ‘/Users/onesixx/R/x86_64-pc-linux-gnu-library/3.4’
(as ‘lib’ is unspecified)
also installing the dependencies ‘colorspace’, ‘mnormt’, ‘bindr’, ‘RColorBrewer’, ‘dichromat’, ‘munsell’, ‘labeling’, ‘viridisLite’, ‘rematch’, ‘plyr’, ‘psych’, Read more…

data.table reshaping

join (merge)

dtProd <- data.table(CustomerId = c(1:6), Product = c(rep("iphone", 3), rep("galaxy", 3)))
dtAddr <- data.table(CustomerId = c(3, 5, 7), Address = c(rep("Seoul", 2), rep("Pusan", 1)))

# full outer join
merge(dtProd, dtAddr, by="CustomerId", all=TRUE)
# with several key columns (this call assumes both tables also have a Sex column)
merge(dtProd, dtAddr, by=c("CustomerId", "Sex"), all=TRUE)

# inner join
merge(dtProd, dtAddr, by="CustomerId")
dtProd[dtAddr, nomatch=0L, on="CustomerId"]

# anti join - use `!` operator Read more…
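The join types listed above can be compared side by side in the following sketch, which assumes data.table is installed and reuses the two sample tables. The anti join line shows the standard `!` form; the excerpt is cut off before its own example, so this is not necessarily the author's exact code.

```r
library(data.table)

dtProd <- data.table(CustomerId = 1:6,
                     Product = c(rep("iphone", 3), rep("galaxy", 3)))
dtAddr <- data.table(CustomerId = c(3, 5, 7),
                     Address = c(rep("Seoul", 2), rep("Pusan", 1)))

# Full outer join: every CustomerId from either table
full <- merge(dtProd, dtAddr, by = "CustomerId", all = TRUE)

# Inner join: only CustomerIds present in both tables
inner <- dtProd[dtAddr, nomatch = 0L, on = "CustomerId"]

# Anti join: rows of dtProd whose CustomerId does NOT appear in dtAddr
anti <- dtProd[!dtAddr, on = "CustomerId"]
```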

data.table Tip

To list all objects in the data.table package:

> ls("package:data.table")
 [1] ":=" "address" "alloc.col" "as.chron.IDate" "as.chron.ITime"
 [6] "as.data.table" "as.Date.IDate" "as.IDate" "as.ITime" "as.xts.data.table"
[11] "between" "%between%" "chgroup" "%chin%" "chmatch"
[16] "chorder" "CJ" "copy" "data.table" "dcast"
[21] "dcast.data.table" "fintersect" "first" "foverlaps" "frank"
[26] "frankv" "fread" "fsetdiff" "fsetequal" "fsort"
[31] "funion" Read more…

data.table |C|R|U|D

Create

Convert from data.frame

setDT() turns an object into a data.table by reference, without making a copy or changing its memory location. In other words, instead of creating a copy, setDT converts the object itself into a data.table.

setDF(), setDT()
setDF(my_dt) # Convert data.table to data.frame
setDT(my_dt) # Convert data.frame to data.table

data.table()
diamondsDT <- data.table(diamonds)

Create table (from scratch)

data.frame and Read more…
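The by-reference conversion can be seen in a minimal sketch like the one below, assuming data.table is installed. The data frame `df` is a made-up example.

```r
library(data.table)

# Hypothetical data frame
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# setDT() converts df itself; no assignment is needed
setDT(df)
is_dt_after_setDT <- is.data.table(df)

# setDF() converts the same object back, again in place
setDF(df)
is_dt_after_setDF <- is.data.table(df)

# data.table() and as.data.table(), by contrast, build a new object
dt_copy <- as.data.table(df)
```

Because no copy is made, setDT() is the usual choice for large data frames, while as.data.table() is safer when the original must stay a data.frame.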

data.table

Data.table-crud
Data.table-tip

What is data.table?

https://github.com/Rdatatable/data.table/wiki
http://datatable.r-forge.r-project.org/datatable-faq.pdf
https://stackoverflow.com/users/403310/matt-dowle

data.table, written by Matt Dowle

> library(data.table)
data.table 1.10.4.2
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
data.table 1.10.4
**********
This installation of data.table has not detected OpenMP support. It will still Read more…