我有麻烦重新安排以下数据帧:
set.seed(45)
dat1 <- data.frame(
name = rep(c("firstName", "secondName"), each=4),
numbers = rep(1:4, 2),
value = rnorm(8)
)
dat1
name numbers value
1 firstName 1 0.3407997
2 firstName 2 -0.7033403
3 firstName 3 -0.3795377
4 firstName 4 -0.7460474
5 secondName 1 -0.8981073
6 secondName 2 -0.3347941
7 secondName 3 -0.5013782
8 secondName 4 -0.1745357
我想重塑它,以便每个唯一的“name”变量都是一个行名,“值”作为该行的观察值,“数字”作为冒号。就像这样:
name 1 2 3 4
1 firstName 0.3407997 -0.7033403 -0.3795377 -0.7460474
5 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357
我试过熔化和铸造,还有其他一些方法,但似乎都不行。
如果考虑性能,另一个选择是使用数据。表格对reshape2的melt和dcast函数的扩展
(参考:使用data.tables进行高效重塑)
library(data.table)
setDT(dat1)
dcast(dat1, name ~ numbers, value.var = "value")
# name 1 2 3 4
# 1: firstName 0.1836433 -0.8356286 1.5952808 0.3295078
# 2: secondName -0.8204684 0.4874291 0.7383247 0.5757814
至于数据。表v1.9.6可以对多个列进行强制转换
## add an extra column
dat1[, value2 := value * 2]
## cast multiple value columns
dcast(dat1, name ~ numbers, value.var = c("value", "value2"))
# name value_1 value_2 value_3 value_4 value2_1 value2_2 value2_3 value2_4
# 1: firstName 0.1836433 -0.8356286 1.5952808 0.3295078 0.3672866 -1.6712572 3.190562 0.6590155
# 2: secondName -0.8204684 0.4874291 0.7383247 0.5757814 -1.6409368 0.9748581 1.476649 1.1515627
其他两种选择:
基本包:
df <- unstack(dat1, form = value ~ numbers)
rownames(df) <- unique(dat1$name)
df
sqldf包:
library(sqldf)
sqldf('SELECT name,
MAX(CASE WHEN numbers = 1 THEN value ELSE NULL END) x1,
MAX(CASE WHEN numbers = 2 THEN value ELSE NULL END) x2,
MAX(CASE WHEN numbers = 3 THEN value ELSE NULL END) x3,
MAX(CASE WHEN numbers = 4 THEN value ELSE NULL END) x4
FROM dat1
GROUP BY name')