data.table reshape (cast,melt)

Published by onesixx on

cast하면 key 가 생긴다.

R &

data.table에 맞게 reshape2 package의 함수를 수정한 dcast & melt

Convert DT to long form where each dob is a separate observation.

rs <,”Step~variable”)

data_melt <-, id.vars = c(“A”, “B”, “C”))
data_cast <-, id ~ Param)


Examples for Practise

Q1. Calculate total number of rows by month and then sort on descending order

mydata[, .N, by = month] [order(-N)]

The .N operator is used to find count.   Q2. Find top 3 months with high mean arrival delay

mydata[, .(mean_arr_delay = mean(arr_delay, na.rm = TRUE)), by = month][order(-mean_arr_delay)][1:3]

Q3. Find origin of flights having average total delay is greater than 20 minutes

mydata[, lapply(.SD, mean, na.rm = TRUE), .SDcols = c(“arr_delay”, “dep_delay”), by = origin][(arr_delay + dep_delay) > 20]

Q4.  Extract average of arrival and departure delays for carrier == ‘DL’ by ‘origin’ and ‘dest’ variables

mydata[carrier == “DL”,
        lapply(.SD, mean, na.rm = TRUE),
        by = .(origin, dest),
        .SDcols = c(“arr_delay”, “dep_delay”)]

Q5. Pull first value of ‘air_time’ by ‘origin’ and then sum the returned values when it is greater than 300

mydata[, .SD[1], .SDcols=”air_time”, by=origin][air_time > 300, sum(air_time)]

Timeserise to data.table

Categories: Reshaping


Blog Owner

Inline Feedbacks
View all comments
Would love your thoughts, please comment.x