在R中,mean()和median()是标准函数,它们执行您所期望的功能。Mode()告诉您对象的内部存储模式,而不是参数中出现次数最多的值。但是是否存在一个标准库函数来实现向量(或列表)的统计模式?


当前回答

计算包含离散值的向量“v”的MODE的一个简单方法是:

names(sort(table(v)))[length(sort(table(v)))]

其他回答

为了生成模式,我写了下面的代码。

MODE <- function(dataframe){
    DF <- as.data.frame(dataframe)

    MODE2 <- function(x){      
        if (is.numeric(x) == FALSE){
            df <- as.data.frame(table(x))  
            df <- df[order(df$Freq), ]         
            m <- max(df$Freq)        
            MODE1 <- as.vector(as.character(subset(df, Freq == m)[, 1]))

            if (sum(df$Freq)/length(df$Freq)==1){
                warning("No Mode: Frequency of all values is 1", call. = FALSE)
            }else{
                return(MODE1)
            }

        }else{ 
            df <- as.data.frame(table(x))  
            df <- df[order(df$Freq), ]         
            m <- max(df$Freq)        
            MODE1 <- as.vector(as.numeric(as.character(subset(df, Freq == m)[, 1])))

            if (sum(df$Freq)/length(df$Freq)==1){
                warning("No Mode: Frequency of all values is 1", call. = FALSE)
            }else{
                return(MODE1)
            }
        }
    }

    return(as.vector(lapply(DF, MODE2)))
}

让我们试试吧:

MODE(mtcars)
MODE(CO2)
MODE(ToothGrowth)
MODE(InsectSprays)

有一个包谦和提供单变量单模态(有时是多模态)数据的模态估计和通常概率分布的模态值。

mySamples <- c(19, 4, 5, 7, 29, 19, 29, 13, 25, 19)

library(modeest)
mlv(mySamples, method = "mfv")

Mode (most likely value): 19 
Bickel's modal skewness: -0.1 
Call: mlv.default(x = mySamples, method = "mfv")

欲了解更多信息,请参阅本页

你也可以在CRAN任务视图:概率分布中寻找“模式估计”。已经提出了两个新的一揽子计划。

您还可以计算一个实例在您的集合中出现的次数,并找到最大次数。如。

> temp <- table(as.vector(x))
> names (temp)[temp==max(temp)]
[1] "1"
> as.data.frame(table(x))
r5050 Freq
1     0   13
2     1   15
3     2    6
> 

下面是可以用来找到R中矢量变量的模式的代码。

a <- table([vector])

names(a[a==max(a)])

模式并不是在所有情况下都有用。所以函数应该处理这种情况。试试下面的函数。

Mode <- function(v) {
  # checking unique numbers in the input
  uniqv <- unique(v)
  # frquency of most occured value in the input data
  m1 <- max(tabulate(match(v, uniqv)))
  n <- length(tabulate(match(v, uniqv)))
  # if all elements are same
  same_val_check <- all(diff(v) == 0)
  if(same_val_check == F){
    # frquency of second most occured value in the input data
    m2 <- sort(tabulate(match(v, uniqv)),partial=n-1)[n-1]
    if (m1 != m2) {
      # Returning the most repeated value
      mode <- uniqv[which.max(tabulate(match(v, uniqv)))]
    } else{
      mode <- "Two or more values have same frequency. So mode can't be calculated."
    }
  } else {
    # if all elements are same
    mode <- unique(v)
  }
  return(mode)
}

输出,

x1 <- c(1,2,3,3,3,4,5)
Mode(x1)
# [1] 3

x2 <- c(1,2,3,4,5)
Mode(x2)
# [1] "Two or more varibles have same frequency. So mode can't be calculated."

x3 <- c(1,1,2,3,3,4,5)
Mode(x3)
# [1] "Two or more values have same frequency. So mode can't be calculated."