How to reshape data from long to wide format

I'm having trouble rearranging the following data frame:

set.seed(45)
dat1 <- data.frame(
    name = rep(c("firstName", "secondName"), each=4),
    numbers = rep(1:4, 2),
    value = rnorm(8)
    )

dat1
       name  numbers      value
1  firstName       1  0.3407997
2  firstName       2 -0.7033403
3  firstName       3 -0.3795377
4  firstName       4 -0.7460474
5 secondName       1 -0.8981073
6 secondName       2 -0.3347941
7 secondName       3 -0.5013782
8 secondName       4 -0.1745357

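I want to reshape it so that each unique "name" value becomes a row name, with the "value" entries as the observations along that row and the "numbers" entries as the column names. Something like this: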

     name          1          2          3         4
1  firstName  0.3407997 -0.7033403 -0.3795377 -0.7460474
5 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357

I've looked at melt and cast and a few other approaches, but none of them seem to do the job.

Current answer

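The new (in 2014) tidyr package also does this simply, with gather()/spread() being its terms for melt/cast.

Edit: Now, in 2019, tidyr v1.0 has been released and has set spread and gather on a deprecation path, preferring pivot_wider and pivot_longer instead, which you can find described in this answer. Read on if you want a brief glimpse into the short life of spread/gather.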

library(tidyr)
spread(dat1, key = numbers, value = value)

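From GitHub:

tidyr is a reframing of reshape2 designed to accompany the tidy data framework, and to work hand-in-hand with magrittr and dplyr to build a solid pipeline for data analysis. Just as reshape2 did less than reshape, tidyr does less than reshape2. It is designed specifically for tidying data, not the general reshaping that reshape2 does, or the general aggregation that reshape did. In particular, built-in methods only work for data frames, and tidyr provides no margins or aggregation.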
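Since the edit above points to pivot_wider as the preferred replacement for spread, here is a minimal sketch of the same reshape with it. It assumes tidyr 1.0 or later is installed and rebuilds the question's dat1 so that it runs on its own.

```r
library(tidyr)

# Same example data as in the question
set.seed(45)
dat1 <- data.frame(
    name = rep(c("firstName", "secondName"), each = 4),
    numbers = rep(1:4, 2),
    value = rnorm(8)
)

# names_from supplies the new column names, values_from fills the cells
dat_wide <- pivot_wider(dat1, names_from = numbers, values_from = value)
dat_wide
```

The result is a tibble with one row per name and one column per distinct value of numbers.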

2014-07-29 19:37:09

Other answers

A much easier way!

devtools::install_github("yikeshu0611/onetree") #install onetree package

library(onetree)
widedata=reshape_toWide(data = dat1,id = "name",j = "numbers",value.var.prefix = "value")
widedata

        name     value1     value2     value3     value4
   firstName  0.3407997 -0.7033403 -0.3795377 -0.7460474
  secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357

If you want to go back from wide to long, just change toWide to toLong in the function name; the arguments stay the same.

reshape_toLong(data = widedata,id = "name",j = "numbers",value.var.prefix = "value")

        name numbers      value
   firstName       1  0.3407997
  secondName       1 -0.8981073
   firstName       2 -0.7033403
  secondName       2 -0.3347941
   firstName       3 -0.3795377
  secondName       3 -0.5013782
   firstName       4 -0.7460474
  secondName       4 -0.1745357

2019-07-26 05:47:41

You can do this with the reshape() function, or with the melt() / cast() functions in the reshape package. For the second option, example code is

library(reshape)
cast(dat1, name ~ numbers)

Or using reshape2

library(reshape2)
dcast(dat1, name ~ numbers)
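The first option, base R's reshape(), is mentioned above without code. A minimal sketch, using only base R and the question's data, might look like this:

```r
# Same example data as in the question
set.seed(45)
dat1 <- data.frame(
    name = rep(c("firstName", "secondName"), each = 4),
    numbers = rep(1:4, 2),
    value = rnorm(8)
)

dat_wide <- reshape(dat1,
                    idvar = "name",       # one output row per name
                    timevar = "numbers",  # its values become column suffixes
                    direction = "wide")
dat_wide
```

The new columns are named value.1 to value.4 by default; the sep argument of reshape() changes the separator.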

2011-05-04 22:42:14

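There is a very powerful new package from the genius data scientists at Win-Vector (the folks who made vtreat, seplyr and replyr) called cdata. It implements the "coordinatized data" principles described in this document and in this blog post. The idea is that, regardless of how you organize your data, it should be possible to identify individual data points using a system of "data coordinates". Here is an excerpt from the recent blog post by John Mount: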

The whole system is based on two primitives or operators cdata::moveValuesToRowsD() and cdata::moveValuesToColumnsD(). These operators have pivot, un-pivot, one-hot encode, transpose, moving multiple rows and columns, and many other transforms as simple special cases. It is easy to write many different operations in terms of the cdata primitives. These operators can work-in memory or at big data scale (with databases and Apache Spark; for big data use the cdata::moveValuesToRowsN() and cdata::moveValuesToColumnsN() variants). The transforms are controlled by a control table that itself is a diagram of (or picture of) the transform.

We will first build the control table (see the blog post for details) and then perform the move of data from rows to columns.

library(cdata)
# first build the control table
pivotControlTable <- buildPivotControlTableD(table = dat1, # reference to dataset
                        columnToTakeKeysFrom = 'numbers', # this will become column headers
                        columnToTakeValuesFrom = 'value', # this contains data
                        sep="_")                          # optional for making column names

# perform the move of data to columns
dat_wide <- moveValuesToColumnsD(tallTable =  dat1, # reference to dataset
                    keyColumns = c('name'),         # this(these) column(s) should stay untouched 
                    controlTable = pivotControlTable# control table above
                    ) 
dat_wide

#>         name  numbers_1  numbers_2  numbers_3  numbers_4
#> 1  firstName  0.3407997 -0.7033403 -0.3795377 -0.7460474
#> 2 secondName -0.8981073 -0.3347941 -0.5013782 -0.1745357

2017-12-23 23:01:37

Using your example data frame, we could:

xtabs(value ~ name + numbers, data = dat1)
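Note that xtabs() returns a contingency table rather than a data frame, and it sums value within each name/numbers cell (here every cell holds exactly one observation, so the sums equal the original values). If a data frame is needed, one possible follow-up step, sketched with base R only, is:

```r
# Same example data as in the question
set.seed(45)
dat1 <- data.frame(
    name = rep(c("firstName", "secondName"), each = 4),
    numbers = rep(1:4, 2),
    value = rnorm(8)
)

tab <- xtabs(value ~ name + numbers, data = dat1)

# Convert the 2-d table to a data frame, keeping its wide layout;
# the names end up as row names and the numbers as column names
dat_wide <- as.data.frame.matrix(tab)
dat_wide
```

Calling plain as.data.frame() on the table would instead give the long layout back, which is why as.data.frame.matrix() is used here.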

2011-05-04 22:58:48

If performance is a concern, another option is to use data.table's extension of reshape2's melt and dcast functions

(Reference: Efficient reshaping using data.tables)

library(data.table)

setDT(dat1)
dcast(dat1, name ~ numbers, value.var = "value")

#          name          1          2         3         4
# 1:  firstName  0.1836433 -0.8356286 1.5952808 0.3295078
# 2: secondName -0.8204684  0.4874291 0.7383247 0.5757814

And, as of data.table v1.9.6, we can cast on multiple columns

## add an extra column
dat1[, value2 := value * 2]

## cast multiple value columns
dcast(dat1, name ~ numbers, value.var = c("value", "value2"))

#          name    value_1    value_2   value_3   value_4   value2_1   value2_2 value2_3  value2_4
# 1:  firstName  0.1836433 -0.8356286 1.5952808 0.3295078  0.3672866 -1.6712572 3.190562 0.6590155
# 2: secondName -0.8204684  0.4874291 0.7383247 0.5757814 -1.6409368  0.9748581 1.476649 1.1515627
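For completeness, the melt half of the same data.table extension can reverse the cast. The following is only a sketch, assuming data.table is installed; it rebuilds the question's data under the hypothetical name dt so that it runs on its own.

```r
library(data.table)

# Same example data as in the question, built directly as a data.table
set.seed(45)
dt <- data.table(
    name = rep(c("firstName", "secondName"), each = 4),
    numbers = rep(1:4, 2),
    value = rnorm(8)
)

# long -> wide
wide <- dcast(dt, name ~ numbers, value.var = "value")

# wide -> long again; the former column names come back as a factor column
long <- melt(wide, id.vars = "name",
             variable.name = "numbers", value.name = "value")
long
```

The round trip returns the same eight values, although the row order differs from the original and numbers is now a factor rather than an integer.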

2016-03-27 22:35:51
