# data.table reshape (dcast,melt)

R

https://github.com/rstudio/cheatsheets/blob/master/data-import.pdf

R

R

R

R

## melt.data.table() & dcast.data.table()

data.table에 맞게 reshape2 package의 함수를 수정한 dcast & melt

#### Convert `DT` to long form where each `dob` is a separate observation.

rs <- dcast.data.table(rs,”Step~variable”)

data_melt <- melt.data.table(data, id.vars = c(“A”, “B”, “C”))
data_cast <- dcast.data.table(result, id ~ Param)

R

## Examples

Q1. Calculate total number of rows by month and then sort on descending order

mydata[, .N, by = month] [order(-N)]

The .N operator is used to find count.

Q2. Find top 3 months with high mean arrival delay

mydata[, .(mean_arr_delay = mean(arr_delay, na.rm = TRUE)), by = month][order(-mean_arr_delay)][1:3]

Q3. Find origin of flights having average total delay is greater than 20 minutes

mydata[, lapply(.SD, mean, na.rm = TRUE), .SDcols = c(“arr_delay”, “dep_delay”), by = origin][(arr_delay + dep_delay) > 20]

Q4.  Extract average of arrival and departure delays for carrier == ‘DL’ by ‘origin’ and ‘dest’ variables

mydata[carrier == “DL”,
lapply(.SD, mean, na.rm = TRUE),
by = .(origin, dest),
.SDcols = c(“arr_delay”, “dep_delay”)]

Q5. Pull first value of ‘air_time’ by ‘origin’ and then sum the returned values when it is greater than 300

mydata[, .SD, .SDcols=”air_time”, by=origin][air_time > 300, sum(air_time)]

Categories: Reshaping

#### onesixx

Blog Owner 