我有一个很大的数据集,我想阅读特定的列或放弃所有其他列。
data <- read.dta("file.dta")
我选择我不感兴趣的列:
var.out <- names(data)[!names(data) %in% c("iden", "name", "x_serv", "m_serv")]
然后我想做的事情是:
for(i in 1:length(var.out)) {
paste("data$", var.out[i], sep="") <- NULL
}
删除所有不需要的列。这是最优解吗?
我试图在使用包数据时删除一列。表并得到了意想不到的结果。我觉得下面的内容可能值得发表。只是一个小警告。
[编辑:Matthew…]
DF = read.table(text = "
fruit state grade y1980 y1990 y2000
apples Ohio aa 500 100 55
apples Ohio bb 0 0 44
apples Ohio cc 700 0 33
apples Ohio dd 300 50 66
", sep = "", header = TRUE, stringsAsFactors = FALSE)
DF[ , !names(DF) %in% c("grade")] # all columns other than 'grade'
fruit state y1980 y1990 y2000
1 apples Ohio 500 100 55
2 apples Ohio 0 0 44
3 apples Ohio 700 0 33
4 apples Ohio 300 50 66
library('data.table')
DT = as.data.table(DF)
DT[ , !names(dat4) %in% c("grade")] # not expected !! not the same as DF !!
[1] TRUE TRUE FALSE TRUE TRUE TRUE
DT[ , !names(DT) %in% c("grade"), with=FALSE] # that's better
fruit state y1980 y1990 y2000
1: apples Ohio 500 100 55
2: apples Ohio 0 0 44
3: apples Ohio 700 0 33
4: apples Ohio 300 50 66
基本上就是数据的语法。table与data.frame并不完全相同。实际上有很多不同之处,参见FAQ 1.1和FAQ 2.17。我警告过你!
我试图在使用包数据时删除一列。表并得到了意想不到的结果。我觉得下面的内容可能值得发表。只是一个小警告。
[编辑:Matthew…]
DF = read.table(text = "
fruit state grade y1980 y1990 y2000
apples Ohio aa 500 100 55
apples Ohio bb 0 0 44
apples Ohio cc 700 0 33
apples Ohio dd 300 50 66
", sep = "", header = TRUE, stringsAsFactors = FALSE)
DF[ , !names(DF) %in% c("grade")] # all columns other than 'grade'
fruit state y1980 y1990 y2000
1 apples Ohio 500 100 55
2 apples Ohio 0 0 44
3 apples Ohio 700 0 33
4 apples Ohio 300 50 66
library('data.table')
DT = as.data.table(DF)
DT[ , !names(dat4) %in% c("grade")] # not expected !! not the same as DF !!
[1] TRUE TRUE FALSE TRUE TRUE TRUE
DT[ , !names(DT) %in% c("grade"), with=FALSE] # that's better
fruit state y1980 y1990 y2000
1: apples Ohio 500 100 55
2: apples Ohio 0 0 44
3: apples Ohio 700 0 33
4: apples Ohio 300 50 66
基本上就是数据的语法。table与data.frame并不完全相同。实际上有很多不同之处,参见FAQ 1.1和FAQ 2.17。我警告过你!