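Sort (order) data frame rows by multiple columns

I want to sort a data frame by multiple columns. For example, with the data frame below I would like to sort by column 'z' (descending), then by column 'b' (ascending):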

dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"), 
      levels = c("Low", "Med", "Hi"), ordered = TRUE),
      x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
      z = c(1, 1, 1, 2))
dd
    b x y z
1  Hi A 8 1
2 Med D 3 1
3  Hi A 9 1
4 Low C 9 2

Current answer

Suppose you have a data.frame A and you want to sort it by a column named x in descending order. Call the sorted data.frame newdata:

newdata <- A[order(-A$x),]

If you want ascending order, simply remove the "-". You can also sort on several columns at once, for example:

newdata <- A[order(-A$x, A$y, -A$z),]

where x, y and z are columns of data.frame A. This sorts A by x descending, then y ascending, then z descending.
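Applied to the dd data frame from the question, the same pattern might look like the sketch below. One caveat: unary minus only works on numeric columns, so the factor b cannot simply be negated. The decreasing vector with method = "radix" (assuming R 3.3.0 or later) and xtfrm() are two base R ways around that.

```r
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# z descending (numeric, so it can be negated), then b ascending
sorted <- dd[order(-dd$z, dd$b), ]

# Same keys without negation: one decreasing flag per key (needs method = "radix")
sorted_radix <- dd[order(dd$z, dd$b, decreasing = c(TRUE, FALSE), method = "radix"), ]

# A non-numeric key can be reversed by negating its xtfrm() values, e.g. b descending
b_desc <- dd[order(-dd$z, -xtfrm(dd$b)), ]

rownames(sorted)  # "4" "2" "1" "3"
```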

2011-01-25 13:10:21

Other answers

In response to a comment added to the OP about how to sort programmatically:

Using dplyr and data.table

library(dplyr)
library(data.table)

dplyr

Just use arrange_, which is the standard evaluation version of arrange.

df1 <- tbl_df(iris)
#using strings or formula
arrange_(df1, c('Petal.Length', 'Petal.Width'))
arrange_(df1, ~Petal.Length, ~Petal.Width)
    Source: local data frame [150 x 5]

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          (dbl)       (dbl)        (dbl)       (dbl)  (fctr)
1           4.6         3.6          1.0         0.2  setosa
2           4.3         3.0          1.1         0.1  setosa
3           5.8         4.0          1.2         0.2  setosa
4           5.0         3.2          1.2         0.2  setosa
5           4.7         3.2          1.3         0.2  setosa
6           5.4         3.9          1.3         0.4  setosa
7           5.5         3.5          1.3         0.2  setosa
8           4.4         3.0          1.3         0.2  setosa
9           5.0         3.5          1.3         0.3  setosa
10          4.5         2.3          1.3         0.3  setosa
..          ...         ...          ...         ...     ...


#Or using a variable
sortBy <- c('Petal.Length', 'Petal.Width')
arrange_(df1, .dots = sortBy)
    Source: local data frame [150 x 5]

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          (dbl)       (dbl)        (dbl)       (dbl)  (fctr)
1           4.6         3.6          1.0         0.2  setosa
2           4.3         3.0          1.1         0.1  setosa
3           5.8         4.0          1.2         0.2  setosa
4           5.0         3.2          1.2         0.2  setosa
5           4.7         3.2          1.3         0.2  setosa
6           5.5         3.5          1.3         0.2  setosa
7           4.4         3.0          1.3         0.2  setosa
8           4.4         3.2          1.3         0.2  setosa
9           5.0         3.5          1.3         0.3  setosa
10          4.5         2.3          1.3         0.3  setosa
..          ...         ...          ...         ...     ...

#Doing the same operation except sorting Petal.Length in descending order
sortByDesc <- c('desc(Petal.Length)', 'Petal.Width')
arrange_(df1, .dots = sortByDesc)

More information here: https://cran.r-project.org/web/packages/dplyr/vignettes/nse.html

It is better to use a formula, because it also captures the environment in which to evaluate the expression.
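The underscore verbs such as arrange_ are deprecated in current dplyr releases. A rough equivalent of the programmatic sort above, assuming a recent dplyr with rlang installed, could be sketched as follows; it works on iris directly and is not part of the original answer.

```r
library(dplyr)

sortBy <- c("Petal.Length", "Petal.Width")

# Ascending on every key: splice the column names in as symbols
asc <- arrange(iris, !!!rlang::syms(sortBy))

# Descending on the first key: refer to columns through the .data pronoun
desc_first <- arrange(iris, desc(.data[[sortBy[1]]]), .data[[sortBy[2]]])

head(asc)
head(desc_first)
```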

data.table

dt1 <- data.table(iris) #not really required, as you can work directly on your data.frame
sortBy <- c('Petal.Length', 'Petal.Width')
sortType <- c(-1, 1)
setorderv(dt1, sortBy, sortType)
dt1
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          7.7         2.6          6.9         2.3 virginica
  2:          7.7         2.8          6.7         2.0 virginica
  3:          7.7         3.8          6.7         2.2 virginica
  4:          7.6         3.0          6.6         2.1 virginica
  5:          7.9         3.8          6.4         2.0 virginica
 ---                                                            
146:          5.4         3.9          1.3         0.4    setosa
147:          5.8         4.0          1.2         0.2    setosa
148:          5.0         3.2          1.2         0.2    setosa
149:          4.3         3.0          1.1         0.1    setosa
150:          4.6         3.6          1.0         0.2    setosa
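For comparison, the same programmatic sort can be sketched in base R without either package. The helper name sort_by_names and its desc argument are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: sort a data frame by column names held in a character vector.
# `desc` carries one logical per key; TRUE means that key is sorted descending.
sort_by_names <- function(df, keys, desc = rep(FALSE, length(keys))) {
  stopifnot(all(keys %in% names(df)), length(desc) == length(keys))
  cols <- Map(function(k, d) {
    v <- xtfrm(df[[k]])   # numeric sort key, also for factors and characters
    if (d) -v else v      # negate to reverse the direction of this key
  }, keys, desc)
  # unname() so that column names are never matched to order()'s own arguments
  df[do.call(order, unname(cols)), , drop = FALSE]
}

sortBy <- c("Petal.Length", "Petal.Width")
sorted <- sort_by_names(iris, sortBy, desc = c(TRUE, FALSE))
head(sorted)
```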

2016-02-05 21:11:52

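Just like the mechanical card sorters of long ago, first sort by the least significant key, then by the next most significant, and so on. No library is required, and it works with any number of keys and any combination of ascending and descending keys.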

 dd <- dd[order(dd$b, decreasing = FALSE),]

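Now we are ready to do the most significant key. The sort is stable, so any ties in the most significant key have already been resolved by the earlier pass.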

dd <- dd[order(dd$z, decreasing = TRUE),]

This may not be the fastest approach, but it is certainly simple and reliable.
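The two passes above generalize to a loop over the keys, from least to most significant. The sketch below uses a hypothetical named logical vector keys (TRUE meaning descending) and checks the result against a single order() call; it relies on order() leaving ties in their existing order.

```r
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# Keys listed from most to least significant; TRUE means descending
keys <- c(z = TRUE, b = FALSE)

multi_pass <- dd
for (k in rev(names(keys))) {   # least significant key first
  multi_pass <- multi_pass[order(multi_pass[[k]], decreasing = keys[[k]]), ]
}

# One-pass equivalent for comparison
single_pass <- dd[order(-dd$z, dd$b), ]
stopifnot(identical(rownames(multi_pass), rownames(single_pass)))
```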

2015-01-15 04:28:25

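I was struggling with the above solutions when I wanted to automate the sorting process for n columns, because the column names could be different each time. I found a very helpful function in the psych package that does this directly: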

dfOrder(myDf, columnIndices)

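where columnIndices holds the indices of one or more columns, in the order in which you want to sort by them. More information here:

dfOrder function from the 'psych' package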

2018-10-24 22:32:43

arrange() from dplyr is my favorite option. Use the pipe operator and go from the least important key to the most important one:

dd1 <- dd %>%
    arrange(z) %>%
    arrange(desc(x))
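For the ordering asked about in the question (z descending, then b ascending), a single arrange() call is an alternative sketch. Keys are listed from most to least significant, so no second pass is needed. This assumes dplyr is installed; dd2 is a name chosen for this illustration.

```r
library(dplyr)

dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"),
  y = c(8, 3, 9, 9),
  z = c(1, 1, 1, 2)
)

# z descending first, then b ascending within equal z
dd2 <- dd %>%
  arrange(desc(z), b)
dd2
```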

2018-10-29 16:56:46

I learned about order with the following example, which then confused me for a long time:

set.seed(1234)

ID        = 1:10
Age       = round(rnorm(10, 50, 1))
diag      = c("Depression", "Bipolar")
Diagnosis = sample(diag, 10, replace=TRUE)

data = data.frame(ID, Age, Diagnosis)

databyAge = data[order(Age),]
databyAge

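The only reason this example works is that order is sorting by the vector Age, not by the column named Age in the data frame data.

To see this, create an identical data frame using read.table, with slightly different column names and without making use of any of the above vectors: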

my.data <- read.table(text = '

  id age  diagnosis
   1  49 Depression
   2  50 Depression
   3  51 Depression
   4  48 Depression
   5  50 Depression
   6  51    Bipolar
   7  49    Bipolar
   8  49    Bipolar
   9  49    Bipolar
  10  49 Depression

', header = TRUE)

The line structure used with order above no longer works, because there is no vector named age:

databyage = my.data[order(age),]

The following line works because order sorts on the column age in my.data.

databyage = my.data[order(my.data$age),]
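An alternative that avoids repeating my.data$ is with(), which evaluates order(age) using the columns of the data frame. The sketch below rebuilds the same data with data.frame() instead of read.table so that it is self-contained.

```r
my.data <- data.frame(
  id = 1:10,
  age = c(49, 50, 51, 48, 50, 51, 49, 49, 49, 49),
  diagnosis = c(rep("Depression", 5), rep("Bipolar", 4), "Depression")
)

# with() looks up `age` inside my.data, so no free-standing vector is needed
databyage <- my.data[with(my.data, order(age)), ]
databyage$id  # 4 1 7 8 9 10 2 5 3 6
```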

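I thought this was worth posting, given how long I was confused by this example. If this post is not considered appropriate for this thread, I can remove it.

EDIT: May 13, 2014

Below is a generalized way of sorting a data frame by every column without specifying column names. The code below shows how to sort from left to right or from right to left. This works if every column is numeric. I have not tried it with a character column added.

I found the do.call code a month or two ago in an old post on a different site, but only after an extensive and difficult search. I am not sure I could find that post again now. The present thread is the first hit for ordering a data.frame in R, so I thought my expanded version of that original do.call code might be useful.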

set.seed(1234)

v1  <- c(0,0,0,0, 0,0,0,0, 1,1,1,1, 1,1,1,1)
v2  <- c(0,0,0,0, 1,1,1,1, 0,0,0,0, 1,1,1,1)
v3  <- c(0,0,1,1, 0,0,1,1, 0,0,1,1, 0,0,1,1)
v4  <- c(0,1,0,1, 0,1,0,1, 0,1,0,1, 0,1,0,1)

df.1 <- data.frame(v1, v2, v3, v4) 
df.1

rdf.1 <- df.1[sample(nrow(df.1), nrow(df.1), replace = FALSE),]
rdf.1

order.rdf.1 <- rdf.1[do.call(order, as.list(rdf.1)),]
order.rdf.1

order.rdf.2 <- rdf.1[do.call(order, rev(as.list(rdf.1))),]
order.rdf.2

rdf.3 <- data.frame(rdf.1$v2, rdf.1$v4, rdf.1$v3, rdf.1$v1) 
rdf.3

order.rdf.3 <- rdf.1[do.call(order, as.list(rdf.3)),]
order.rdf.3
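The answer notes that the do.call approach was not tried with character columns. As a small sketch on a hypothetical two-column data frame (mixed, not from the original post), order() also accepts character vectors, so the same pattern appears to carry over. The unname() call is an added precaution so that a column name cannot be mistaken for one of order()'s own arguments.

```r
mixed <- data.frame(
  g = c("b", "a", "b", "a"),
  n = c(2, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Left to right: g is the primary key, n breaks ties
left_to_right <- mixed[do.call(order, unname(as.list(mixed))), ]

# Right to left: n is the primary key, g breaks ties
right_to_left <- mixed[do.call(order, unname(rev(as.list(mixed)))), ]

left_to_right
right_to_left
```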

2013-09-02 19:28:56
