melt.data.table {data.table} | R Documentation |
An S3 method for melting data.tables, written entirely in C for speed. It also avoids unnecessary copies by handling all arguments internally in a memory-efficient manner.
From 1.9.6, it is no longer necessary to load reshape2 in order to melt or cast data.tables. If you do need reshape2, load the reshape2 package before loading data.table.
NEW: melt.data.table now allows melting into multiple columns simultaneously. See the details and examples sections.
## fast melt a data.table
## S3 method for class 'data.table'
melt(data, id.vars, measure.vars,
    variable.name = "variable", value.name = "value",
    ..., na.rm = FALSE, variable.factor = TRUE,
    value.factor = FALSE,
    verbose = getOption("datatable.verbose"))
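data | A data.table object to melt. |
id.vars | Vector of id variables. Can be an integer vector (id column numbers) or a character vector (id column names). If missing, all non-measure columns are assigned to it. |
measure.vars | Vector of measure variables. Can be an integer vector (measure column numbers) or a character vector (measure column names). If missing, all non-id columns are assigned to it. NEW: measure.vars also accepts a list of character or integer vectors, to melt into multiple value columns simultaneously. |
variable.name | Name for the measured variable names column. The default name is 'variable'. |
value.name | Name for the molten data values column. The default name is 'value'. |
na.rm | If TRUE, NA values are removed from the molten data. The default is FALSE. |
variable.factor | If TRUE, the variable column is converted to a factor; otherwise it is a character column. The default is TRUE. |
value.factor | If TRUE, the value column is returned as a factor. The default is FALSE. |
verbose | TRUE turns on status and information messages to the console. The default is taken from getOption("datatable.verbose"). |
... | Any other arguments to be passed to/from other methods. |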
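As a rough illustration of how the naming and NA arguments interact, the sketch below melts a small made-up table. The table `wide` and the column names `metric` and `reading` are hypothetical and chosen only for this example.

```r
library(data.table)

# Hypothetical toy data: one id column and two numeric measure columns.
wide <- data.table(id = 1:3, a = c(1.5, NA, 3.5), b = c(10, 20, 30))

# Rename the two output columns and drop rows whose molten value is NA.
long <- melt(wide, id.vars = "id", measure.vars = c("a", "b"),
             variable.name = "metric", value.name = "reading",
             na.rm = TRUE)

print(long)
```

With the default variable.factor = TRUE, the `metric` column should come back as a factor. The single NA in column `a` is dropped, so five of the six molten rows remain.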
If id.vars and measure.vars are both missing, all columns that are not numeric, integer or logical are assigned as id variables and the rest as measure variables. If only one of id.vars or measure.vars is supplied, the remaining columns are assigned to the other. Both id.vars and measure.vars can contain the same column more than once, and the same column can be used as both an id and a measure variable.
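The case where only one of the two arguments is supplied can be sketched as follows. The table `dt` is hypothetical. The two calls are expected to select the same measure columns, since whatever is not named on one side is assigned to the other.

```r
library(data.table)

# Hypothetical toy data: one character column and two integer columns.
dt <- data.table(name = c("x", "y"), v1 = 1:2, v2 = 3:4)

# Only id.vars supplied: v1 and v2 become the measure variables.
by_id <- melt(dt, id.vars = "name")

# Only measure.vars supplied: name becomes the id variable.
by_measure <- melt(dt, measure.vars = c("v1", "v2"))

print(by_id)
print(by_measure)
```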
melt.data.table also accepts list columns for both id and measure variables.
When the measure.vars are not all of the same type, they are coerced according to the hierarchy list > character > numeric > integer > logical. For example, if any of the measure variables is a list, then the entire value column is coerced to a list. Note that if the value column is a list, na.rm = TRUE has no effect.
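The coercion hierarchy can be checked on a small example. The table `dt` below is hypothetical. The calls are wrapped in suppressWarnings() on the assumption that some data.table versions warn when measure variables differ in type; this wrapper is a convenience for the sketch, not a recommendation.

```r
library(data.table)

# Hypothetical toy data with integer, character and logical measure columns.
dt <- data.table(id = 1:2, n = c(1L, 2L), s = c("a", "b"),
                 flag = c(TRUE, FALSE))

# integer + character: character is higher in the hierarchy.
mixed_chr <- suppressWarnings(
  melt(dt, id.vars = "id", measure.vars = c("n", "s")))

# integer + logical: integer is higher in the hierarchy.
mixed_int <- suppressWarnings(
  melt(dt, id.vars = "id", measure.vars = c("n", "flag")))

print(class(mixed_chr$value))
print(class(mixed_int$value))
```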
From version 1.9.6, measure.vars in melt also accepts a list of character or integer vectors, which melts into multiple columns efficiently in a single function call. See the examples section for usage.
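A minimal sketch of melting into two value columns at once is given below. The table `dt` and the output names `height` and `weight` are hypothetical. Passing a character vector to value.name to name each value column is assumed here; patterns() is the helper used in the examples section.

```r
library(data.table)

# Hypothetical toy data: two repeated measurements of height and weight.
dt <- data.table(id = 1:2,
                 h_1 = c(150, 160), h_2 = c(152, 161),
                 w_1 = c(50, 60),   w_2 = c(51, 62))

# Each list element becomes its own value column.
long <- melt(dt, id.vars = "id",
             measure.vars = list(c("h_1", "h_2"), c("w_1", "w_2")),
             value.name = c("height", "weight"))

# The same grouping, selected by regular expression instead of by name.
via_patterns <- melt(dt, id.vars = "id",
                     measure.vars = patterns("^h_", "^w_"),
                     value.name = c("height", "weight"))

print(long)
```

In this form the `variable` column indexes the position within each group (first or second measurement) rather than holding the original column names.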
Attributes are preserved if all value columns are of the same type. By default, any column to be melted that is of type factor is coerced to character. This is for compatibility with reshape2's melt.data.frame. To get a factor column, set value.factor = TRUE. melt.data.table also preserves ordered factors.
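The effect of value.factor and of attribute preservation can be seen on two small hypothetical tables, `dt` and `dates`, as sketched below.

```r
library(data.table)

# Hypothetical toy data with a single factor measure column.
dt <- data.table(id = 1:3, f = factor(c("lo", "hi", "lo")))

# Default: the factor is coerced to character.
as_chr <- melt(dt, id.vars = "id", measure.vars = "f")

# value.factor = TRUE keeps the value column as a factor.
as_fct <- melt(dt, id.vars = "id", measure.vars = "f", value.factor = TRUE)

# Two Date columns share a type, so the Date class should be preserved.
dates <- data.table(id = 1:2,
                    d1 = as.Date(c("2020-01-01", "2020-01-02")),
                    d2 = as.Date(c("2021-01-01", "2021-01-02")))
molten_dates <- melt(dates, id.vars = "id")

print(class(as_chr$value))
print(class(as_fct$value))
print(class(molten_dates$value))
```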
An unkeyed data.table containing the molten data.
dcast, http://had.co.nz/reshape/
set.seed(45)
require(data.table)
DT <- data.table(
  i_1 = c(1:5, NA),
  i_2 = c(NA,6,7,8,9,10),
  f_1 = factor(sample(c(letters[1:3], NA), 6, TRUE)),
  f_2 = factor(c("z", "a", "x", "c", "x", "x"), ordered=TRUE),
  c_1 = sample(c(letters[1:3], NA), 6, TRUE),
  d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
  d_2 = as.Date(6:1, origin="2012-01-01"))
# add a couple of list cols
DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5,1)))), by = i_1]$c]
DT[, l_2 := DT[, list(c=list(rep(c_1, sample(5,1)))), by = i_1]$c]

# id, measure as character/integer/numeric vectors
melt(DT, id=1:2, measure="f_1")
melt(DT, id=c("i_1", "i_2"), measure=3) # same as above
melt(DT, id=1:2, measure=3L, value.factor=TRUE) # same, but 'value' is factor
melt(DT, id=1:2, measure=3:4, value.factor=TRUE) # 'value' is *ordered* factor

# preserves attribute when types are identical, ex: Date
melt(DT, id=3:4, measure=c("d_1", "d_2"))
melt(DT, id=3:4, measure=c("i_1", "d_1")) # attribute not preserved

# on list
melt(DT, id=1, measure=c("l_1", "l_2")) # value is a list
melt(DT, id=1, measure=c("c_1", "l_1")) # c1 coerced to list

# on character
melt(DT, id=1, measure=c("c_1", "f_1")) # value is char
melt(DT, id=1, measure=c("c_1", "i_2")) # i2 coerced to char

# on na.rm=TRUE. NAs are removed efficiently, from within C
melt(DT, id=1, measure=c("c_1", "i_2"), na.rm=TRUE) # remove NA

# NEW FEATURE: measure.vars can be a list
# melt "f_1,f_2" and "d_1,d_2" simultaneously, retain 'factor' attribute
# convenient way using internal function patterns()
melt(DT, id=1:2, measure=patterns("^f_", "^d_"), value.factor=TRUE)
# same as above, but provide list of columns directly by column names or indices
melt(DT, id=1:2, measure=list(3:4, c("d_1", "d_2")), value.factor=TRUE)

# na.rm=TRUE removes rows with NAs in any 'value' columns
melt(DT, id=1:2, measure=patterns("f_", "d_"), value.factor=TRUE, na.rm=TRUE)

# return 'NA' for missing columns, 'na.rm=TRUE' ignored due to list column
melt(DT, id=1:2, measure=patterns("l_", "c_"), na.rm=TRUE)