factor

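    cha <- c("A", "B", "C", "D", "E")
    id <- as.factor(c(1, 3, 5))

    # Indexing with a factor uses its integer codes (1, 2, 3), not its labels.
    cha[id] <- "F"
    cha
    #> [1] "F" "F" "F" "D" "E"

    # Reset cha, then convert the labels back to numbers to index positions 1, 3, 5.
    cha <- c("A", "B", "C", "D", "E")
    cha[as.numeric(as.character(id))] <- "F"
    cha
    #> [1] "F" "B" "F" "D" "F"

    # The values of a factor are its levels.
    id
    #> [1] 1 3 5
    #> Levels: 1 3 5
    as.integer(id)
    #> [1] 1 2 3
    as.character(id)
    #> [1] "1" "3" "5"

Read more…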

4. Model :: intro

http://r4ds.had.co.nz/model-intro.html

In model basics, you’ll learn how models work mechanistically, focusing on the important family of linear models. You’ll learn general tools for gaining insight into what a predictive model tells you about your data, focusing on simple simulated datasets. In model building, you’ll learn how to use models Read more…

Program :: Iteration with map

http://r4ds.had.co.nz/iteration.html#the-map-functions
https://www.rstudio.com/resources/videos/happy-r-users-purrr-tutorial/
https://github.com/rstudio/rstudio-conf/tree/master/2017/Happy_R_Users_Purrr-Charlotte_Wickham
http://purrr.tidyverse.org/reference/map.html
http://statkclee.github.io/parallel-r/ds-fp-purrr.html
https://stackoverflow.com/questions/38403111/ways-to-add-multiple-columns-to-data-frame-using-plyr-dplyr-purrr
https://www.r-bloggers.com/rebuilding-map-example-with-apply-functions/

Each map function makes a:

map()      list (compare lapply(), sapply())
map_lgl()  logical vector
map_int()  integer vector
map_dbl()  double vector
map_chr()  character vector

Each map function takes a vector as input, and for each Read more…
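As a rough illustration of the return types listed above, the following sketch assumes the purrr package is installed. The input list `x` and the result names are made up for this example and do not come from the original post.

```r
library(purrr)

# Hypothetical input: a named list of three numeric vectors.
x <- list(a = c(1, 2, 3), b = c(4, 5), c = 10)

as_list <- map(x, range)                           # list, one element per input
lens    <- map_int(x, length)                      # integer vector
means   <- map_dbl(x, mean)                        # double vector
long    <- map_lgl(x, ~ length(.x) > 1)            # logical vector
labels  <- map_chr(x, ~ paste(.x, collapse = "-")) # character vector
```

Each typed variant stops with an error if the function does not return a single value of the expected type, which is the main practical difference from sapply().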

Program :: Iteration with function

http://r4ds.had.co.nz/iteration.html

Intro. To reduce code duplication (copy and paste), we either use functions or use iteration.

Functions: identify the repeated patterns in the code and extract them into independent pieces that are easy to update and reuse.

Iteration: perform the same operation repeatedly on multiple inputs (different columns, different datasets).

Two paradigms of iteration: imperative programming: commands, for Read more…
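A minimal sketch of the function approach described above, using base R only. The helper `rescale01()` and the data frame `df` are illustrative assumptions, not code from the original post.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical data with two numeric columns.
df <- data.frame(a = c(1, 5, 10), b = c(2, 4, 6))

# Instead of copy-pasting the rescaling expression for every column,
# call the extracted function.
df$a <- rescale01(df$a)
df$b <- rescale01(df$b)
```

If the rescaling rule changes later, only the body of the function needs to be edited.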

Program :: Iteration with loop

http://r4ds.had.co.nz/iteration.html

Intro. To reduce code duplication (copy and paste), we either use functions or use iteration.

Functions: identify the repeated patterns in the code and extract them into independent pieces that are easy to update and reuse.

Iteration: perform the same operation repeatedly on multiple inputs (different columns, different datasets).

Two paradigms of iteration: imperative programming: commands, for Read more…
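A minimal sketch of the imperative (for loop) style of iteration mentioned above, in base R. The data frame `df` and the choice of `median()` are assumptions made for this example.

```r
# Hypothetical data frame with two numeric columns.
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 60))

# 1. Allocate the output, 2. loop over the column indices, 3. fill each slot.
medians <- vector("double", ncol(df))
for (i in seq_along(df)) {
  medians[[i]] <- median(df[[i]])
}
```

Allocating `medians` before the loop avoids growing the vector on every iteration.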

3. Program :: intro

http://r4ds.had.co.nz/program-intro.html

Pipes
18.1 Intro
18.2 Piping alternatives
18.3 When not to use the pipe

The pipe is a powerful tool, but it’s not the only tool at your disposal, and it doesn’t solve every problem! Pipes are most useful for rewriting a fairly short linear sequence of operations. I Read more…
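To illustrate what a short linear sequence of operations looks like with and without the pipe, here is a small sketch. It assumes the magrittr package (which provides `%>%`) is installed; the vector `x` is made up for the example.

```r
library(magrittr)

# Hypothetical input vector.
x <- c(4, 9, 16)

# Nested form: has to be read from the inside out.
nested <- round(mean(sqrt(x)), 1)

# Piped form: the same steps, read left to right.
piped <- x %>% sqrt() %>% mean() %>% round(1)
```

Both forms compute the same value; the piped version only changes how the sequence reads.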

2. wrangle-intro

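http://r4ds.had.co.nz/wrangle-intro.html

tibbles

Overview: instead of data.frame, use its variant, the tibble.

    vignette("tibble")

Creating tibbles

Converting a data frame to a tibble (tbl_df() is deprecated in recent dplyr versions; as_tibble() is the preferred form):

    as_tibble(iris) %>% str()
    tbl_df(iris) %>% str()

Conversely, when an older function does not work with a tibble, convert back:

    as.data.frame(tb)

Creating a new tibble:

    tibble( a = lubridate::now() + runif(1e1) * 86400, b = lubridate::today() + runif(1e1) * 30,

Read more…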

Layer::position ggplot2

Position adjustments

Mainly used to tune where objects are placed for discrete values, especially the stacking behaviour of bar plots and histograms.

"identity" (places each object exactly where it falls, with no adjustment; note that geom_bar() itself defaults to "stack")
"dodge"
"fill"

    ggplot(diamonds) + geom_bar(aes(x=cut, fill=cut))
    ggplot(diamonds) + geom_bar(aes(x=cut, fill=clarity))
    ggplot(diamonds) + geom_bar(aes(x=cut, fill=clarity), position="identity")
    ggplot(diamonds) + geom_bar(aes(x=cut, fill=clarity), position="dodge")
    ggplot(diamonds) + geom_bar(aes(x=cut, fill=clarity), position="fill")

When many points overlap: overplotting. Read more…

1. explore :: intro

http://r4ds.had.co.nz/explore-intro.html

Exploratory data analysis
- data visualisation
- data transformation (http://onesixx.com/dplyr/)
- EDA

1. Generate questions about your data.
2. Search for answers by visualising, transforming, and modelling your data.
3. Use what you learn to refine your questions and/or generate new questions.

Read more…

Intro :: tidy

http://r4ds.had.co.nz/introduction.html
https://blog.rstudio.org/

2 Introduction

Steps of a typical data science project:

Import: take data stored in a file, database, or web API, and load it into a data frame in R.

Wrangling: tidying and transforming are called wrangling, because getting your data in a form that’s natural to work with often feels Read more…

R for Data Science (Pdf)

Translated from http://brettklamer.com/diversions/statistical/compile-r-for-data-science-to-a-pdf/

Compile the "R for Data Science" source on GitHub to a PDF

1. Download the repository from https://github.com/hadley/r4ds, unzip it, and change the working directory:

    setwd("~/Downloads/r4ds-master/")

2. Install the required packages. In R, use devtools to install from GitHub:

    devtools::install_github("hadley/r4ds")
    devtools::install_github("wch/webshot")

    Downloading GitHub repo hadley/r4ds@master from URL https://api.github.com/repos/hadley/r4ds/zipball/master
    Installing r4ds
    Installing condvis
    Installing htmltools
    Installing Rcpp
    Installing

Read more…

R for Data Science

http://r4ds.had.co.nz/ <-- https://bookdown.org/

Garrett Grolemund, Hadley Wickham

http://tidyverse.org/
https://github.com/hadley/r4ds
http://style.tidyverse.org/

library(tidyverse) loads the core tidyverse packages in one step:

ggplot2, for data visualisation.
dplyr, for data manipulation.
tidyr, for data tidying.
readr, for data import.
purrr, for functional programming.
tibble, Read more…