R in Action - Basic Data Management

Posted 2020-10-06 你的踏板车要滑向哪里



8. Sorting data

> leadership$age
[1] 32 45 25 39 NA
> newdata <- leadership[order(leadership$age),]
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
1       1 2008-10-24      US      M  32     5     4     5     5     5
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
2       2 2008-10-28      US      F  45     3     5     2     5     5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
  stringAsFactors agecat
3           FALSE  Young
1           FALSE  Young
4           FALSE  Young
2           FALSE  Young
5           FALSE   <NA>
> 
> 
> attach(leadership)
The following objects are masked _by_ .GlobalEnv:

    age, country, gender, manager

> newdata <- leadership[order(gender, age),]
> detach(leadership)
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
2       2 2008-10-28      US      F  45     3     5     2     5     5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
1       1 2008-10-24      US      M  32     5     4     5     5     5
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
  stringAsFactors agecat
3           FALSE  Young
2           FALSE  Young
5           FALSE   <NA>
1           FALSE  Young
4           FALSE  Young
> 
> attach(leadership)
The following objects are masked _by_ .GlobalEnv:

    age, country, gender, manager

> newdata <- leadership[order(gender, -age),]
> detach(leadership)
> newdata
  manager   testDate country gender age item1 item2 item3 item4 item5
5       5 2009-05-01      UK      F  NA     2     2     1     2     1
2       2 2008-10-28      US      F  45     3     5     2     5     5
3       3 2008-10-01      UK      F  25     3     5     5     5     2
4       4 2008-10-12      UK      M  39     3     3     4    NA    NA
1       1 2008-10-24      US      M  32     5     4     5     5     5
  stringAsFactors agecat
5           FALSE   <NA>
2           FALSE  Young
3           FALSE  Young
4           FALSE  Young
1           FALSE  Young
>
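
The "masked by .GlobalEnv" messages above mean that order() saw the global variables age and gender rather than the columns of leadership. In the last sort the NA row comes first within F, whereas order() on the column itself puts NA last by default. That difference is consistent with a masked global age having been used, though the transcript does not show the global values. A minimal sketch that avoids attach() altogether, assuming a leadership data frame rebuilt from the printed output (only three columns are kept here):

```r
# Rebuilt from the printed output above; only a subset of the columns.
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA),
  stringsAsFactors = FALSE
)

# with() evaluates order() inside the data frame, so same-named
# variables in the global environment cannot mask the columns.
by_gender_age <- leadership[with(leadership, order(gender, age)), ]

# Descending age within gender. order() places NA last by default
# (na.last = TRUE), also when the key is negated.
by_gender_age_desc <- leadership[with(leadership, order(gender, -age)), ]

print(by_gender_age)
print(by_gender_age_desc)
```

Writing leadership$gender and leadership$age inside order() has the same effect as with().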

9. Merging datasets

9.1 Adding columns

> patientID <- c(1, 2, 3, 4)
> age <- c(25, 34, 28, 52)
> status <- c("poor", "improved", "excellent", "poor")
> gender <- c("F", "M", "M", "F")
> dataframeA <- data.frame(patientID, gender)
> dataframeA
  patientID gender
1         1      F
2         2      M
3         3      M
4         4      F
> dataframeB <- data.frame(patientID, age, status)
> dataframeB
  patientID age    status
1         1  25      poor
2         2  34  improved
3         3  28 excellent
4         4  52      poor
> total <- merge(dataframeA, dataframeB, by="ID")
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> total <- merge(dataframeA, dataframeB, by="patientID")
> total
  patientID gender age    status
1         1      F  25      poor
2         2      M  34  improved
3         3      M  28 excellent
4         4      F  52      poor
> total <- merge(dataframeA, dataframeB, by=c("gender", "age"))
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> total <- merge(dataframeA, dataframeB, by=c("patientID", "age"))
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
> 
> total <- cbind(dataframeA, dataframeB)
> total
  patientID gender patientID age    status
1         1      F         1  25      poor
2         2      M         2  34  improved
3         3      M         3  28 excellent
4         4      F         4  52      poor
>
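
The three failed merge() calls above all name a key that is missing from at least one of the two data frames ("ID" is in neither, gender is only in dataframeA, age is only in dataframeB). The cbind() result repeats patientID because cbind() does no key matching. A short sketch of both points, reusing the same data; the names shared and total_cbind are introduced here only for illustration:

```r
patientID  <- c(1, 2, 3, 4)
dataframeA <- data.frame(patientID, gender = c("F", "M", "M", "F"))
dataframeB <- data.frame(patientID,
                         age    = c(25, 34, 28, 52),
                         status = c("poor", "improved", "excellent", "poor"))

# A column can serve as a merge key only if both data frames have it.
shared <- intersect(names(dataframeA), names(dataframeB))
print(shared)

total <- merge(dataframeA, dataframeB, by = shared)

# cbind() pastes columns side by side without matching rows, so check
# that the rows are already aligned, then drop the repeated key column.
stopifnot(identical(dataframeA$patientID, dataframeB$patientID))
extra_cols  <- setdiff(names(dataframeB), shared)
total_cbind <- cbind(dataframeA, dataframeB[, extra_cols, drop = FALSE])
print(total_cbind)
```

When the key has different names in the two data frames, merge() also accepts by.x and by.y.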

9.2 Adding rows

> total <- rbind(dataframeA, dataframeB)
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match
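
The error above arises because rbind() on data frames needs both arguments to have the same variables; dataframeA has two columns and dataframeB has three. A minimal sketch of a call that works, using two hypothetical extra data frames (dataframeC and dataframeD are not in the original session):

```r
dataframeA <- data.frame(patientID = c(1, 2), gender = c("F", "M"),
                         stringsAsFactors = FALSE)

# Hypothetical second batch: same variables, different column order.
dataframeC <- data.frame(gender = c("M", "F"), patientID = c(3, 4),
                         stringsAsFactors = FALSE)

# For data frames, rbind() matches columns by name, not by position.
total <- rbind(dataframeA, dataframeC)

# Hypothetical batch that lacks gender: add the variable as NA first.
dataframeD <- data.frame(patientID = c(5, 6))
dataframeD$gender <- NA
total2 <- rbind(total, dataframeD)
print(total2)
```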

10. Subsetting datasets

10.1 Selecting (keeping) variables
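
The original post leaves this section empty. As a minimal sketch of the usual idiom, assuming a cut-down leadership data frame rebuilt from the output in section 8, variables can be kept by name or by position:

```r
# Cut-down version of leadership, rebuilt from the output in section 8.
leadership <- data.frame(
  manager = 1:5,
  age     = c(32, 45, 25, 39, NA),
  item1   = c(5, 3, 3, 3, 2),
  item2   = c(4, 5, 5, 3, 2)
)

# Keep variables by name.
items_by_name <- leadership[, c("item1", "item2")]

# Keep the same variables by position (columns 3 and 4 here).
items_by_pos <- leadership[, 3:4]

print(items_by_name)
```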

10.2 Excluding (dropping) variables
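
No example is given for this section in the original post. One common way to drop variables, sketched on a cut-down leadership data frame rebuilt from section 8, is a logical index over the column names:

```r
# Cut-down version of leadership, rebuilt from the output in section 8.
leadership <- data.frame(
  manager = 1:5,
  age     = c(32, 45, 25, 39, NA),
  item1   = c(5, 3, 3, 3, 2),
  item2   = c(4, 5, 5, 3, 2)
)

# TRUE for every column that should be dropped.
drop_me <- names(leadership) %in% c("item1", "item2")

# Keep everything else; drop = FALSE guards against a single
# remaining column collapsing into a plain vector.
newdata <- leadership[, !drop_me, drop = FALSE]
print(newdata)
```

Assigning NULL to a column (for example leadership$item1 <- NULL) removes it in place instead.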

10.3 Selecting observations
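
This section has no example in the original post. A minimal sketch of row selection by condition, on a cut-down leadership data frame rebuilt from section 8; it also shows why which() is useful when the column contains NA:

```r
# Cut-down version of leadership, rebuilt from the output in section 8.
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA),
  stringsAsFactors = FALSE
)

# A logical index containing NA yields an extra all-NA row.
with_na_row <- leadership[leadership$age > 30, ]

# which() returns only the positions that are TRUE, so rows whose
# condition evaluates to NA are left out.
older <- leadership[which(leadership$age > 30), ]

print(nrow(with_na_row))
print(older)
```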

10.4 The subset() function
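
The original post gives no example here. A short sketch of subset() on a cut-down leadership data frame rebuilt from section 8; the age thresholds are arbitrary values chosen for illustration:

```r
# Cut-down version of leadership, rebuilt from the output in section 8.
leadership <- data.frame(
  manager = 1:5,
  gender  = c("M", "F", "F", "M", "F"),
  age     = c(32, 45, 25, 39, NA),
  stringsAsFactors = FALSE
)

# Second argument: row condition (rows where it is NA are dropped).
# select: the variables to keep, written without quotes.
newdata <- subset(leadership, age >= 35 | age < 24,
                  select = c(manager, age))
print(newdata)
```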

10.5 Random sampling
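
No example accompanies this section in the original post. A minimal sketch of drawing rows at random with sample(), on a cut-down leadership data frame rebuilt from section 8; the sample size and seed are arbitrary:

```r
# Cut-down version of leadership, rebuilt from the output in section 8.
leadership <- data.frame(
  manager = 1:5,
  age     = c(32, 45, 25, 39, NA)
)

set.seed(1234)  # arbitrary seed, used only to make the draw repeatable

# Draw 3 distinct row numbers, then use them as a row index.
picked   <- sample(nrow(leadership), 3, replace = FALSE)
mysample <- leadership[picked, ]
print(mysample)
```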
