Sorting dataframe with 1 column

【Title】Sorting dataframe with 1 column 【Posted】2022-01-17 11:08:21 【Question】:

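I have a data frame of names with a single column. I tried several variations of order(), and I also converted it to a list and tried sort() in a few different ways, with no success.

Here is the dput() output for reference: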

> dput(names.ordered)
structure(list(Directors = c("Darabont, Frank", "Nolan, Christopher", 
"Lumet, Sidney", "Spielberg, Steven", "Jackson, Peter", "Tarantino, Quentin", 
"Leone, Sergio", "Fincher, David", "Zemeckis, Robert", "Kershner, Irvin", 
"Wachowski, Lana", "Scorsese, Martin", "Forman, Milos", "Kurosawa, Akira", 
"Demme, Jonathan", "Meirelles, Fernando", "Benigni, Roberto", 
"Capra, Frank", "Lucas, George", "Miyazaki, Hayao", "Besson, Luc", 
"Kobayashi, Masaki", "Polanski, Roman", "Cameron, James", "Singer, Bryan", 
"Hitchcock, Alfred", "Allers, Roger", "Chaplin, Charles", "Kaye, Tony", 
"Takahata, Isao", "Chazelle, Damien", "Scott, Ridley", "Nakache, Olivier", 
"Curtiz, Michael", "Tornatore, Giuseppe", "Kubrick, Stanley", 
"Wilder, Billy", "Stanton, Andrew", "Russo, Anthony", "Persichetti, Bob", 
"Chan-Wook, Park", "Phillips, Todd", "Shinkai, Makoto", "Unkrich, Lee", 
"Labaki, Nadine", "Petersen, Wolfgang", "Hirani, Rajkumar", "Lasseter, John", 
"Mendes, Sam", "Gibson, Mel", "Kail, Thomas", "Marquand, Richard", 
"Klimov, Elem", "Lang, Fritz", "Khan, Aamir", "Welles, Orson", 
"Vinterberg, Thomas", "Aronofsky, Darren", "Donen, Stanley", 
"Gondry, Michel", "Lean, David", "Tiwari, Nitesh", "Villeneuve, Denis", 
"Zeller, Florian", "Farhadi, Asghar", "Ray, Satyajit", "Ritchie, Guy", 
"Jeunet, Jean-Pierre", "Mulligan, Robert", "Docter, Pete", "Mann, Michael", 
"Hanson, Curtis", "McTiernan, John", "Gnanavel, T.J.", "Farrelly, Peter", 
"Hirschbiegel, Oliver", "Gilliam, Terry", "Eastwood, Clint", 
"Majidi, Majid", "Kramer, Stanley", "Sturges, John", "Huston, John", 
"Howard, Ron", "Coen, Ethan", "Carpenter, John", "Bergman, Ingmar", 
"McDonagh, Martin", "Pablos, Sergio", "Lynch, David", "Weir, Peter", 
"Reed, Carol", "McTeigue, James", "Boyle, Danny", "Coen, Joel", 
"O'Connor, Gavin", "Fleming, Victor", "Ozu, Yasujirô", "Kazan, Elia", 
"Irmak, Cagan", "Szifron, Damián", "Tarkovsky, Andrei", "Cimino, Michael", 
"Costa-Gavras, Costa-Gavras,", "Anderson, Wes", "Keaton, Buster", 
"Bruckman, Clyde", "Linklater, Richard", "Elliot, Adam", "Sheridan, Jim", 
"Abrahamson, Lenny", "Raghavan, Sriram", "Mangold, James", "McQueen, Steve", 
"Lubitsch, Ernst", "DeBlois, Dean", "Miller, George", "Wyler, William", 
"Yates, David", "Clouzot, Henri-Georges", "Reiner, Rob", "Kashyap, Anurag", 
"Rosenberg, Stuart", "Hallström, Lasse", "Kassovitz, Mathieu", 
"Truffaut, François", "Yamada, Naoko", "Stone, Oliver", "McCarthy, Tom", 
"Jones, Terry", "George, Terry", "Turgul, Yavuz", "Wong, Kar-Wai", 
"Penn, Sean", "Anno, Hideaki", "Pontecorvo, Gillo", "Fellini, Federico", 
"Wenders, Wim", "Kieslowski, Krzysztof", "Kumar, Ram", "Coppola, Francis Ford", 
"Joon Ho, Bong", "von Donnersmarck, Florian Henckel", "Van Sant, Gus", 
"De Sica, Vittorio", "Hill, George Roy", "De Palma, Brian", "Mankiewicz, Joseph L.", 
"Anderson, Paul Thomas", "del Toro, Guillermo", "Campanella, Juan José", 
"Shyamalan, M. Night", "Dreyer, Carl Theodor", "Avildsen, John G.", 
"Iñárritu, Alejandro G.")), row.names = c(NA, -154L), class = "data.frame")

A few things I have already tried, which either return an error or have no effect:

> names.ordered <- names.ordered[order(names.ordered$Directors)]
Error in `[.data.frame`(names.ordered, order(names.ordered$Directors)) : 
  undefined columns selected

> names.ordered <- names.ordered[order(1)] 

#after converting to list
> names.ordered <- sort(names.ordered)
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
  'x' must be atomic

【Answer 1】:

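Even when a data frame contains only one column, you still need to specify which column to sort or order by.

If you want to keep the original order of names.ordered, use order() to create an index: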

idx <- order(names.ordered$Directors)
head(names.ordered)
           Directors
1    Darabont, Frank
2 Nolan, Christopher
3      Lumet, Sidney
4  Spielberg, Steven
5     Jackson, Peter
6 Tarantino, Quentin
head(names.ordered[idx, ])
# [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren" 
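
The index approach can be sketched on a small stand-in data frame. The name `df` and its three rows are made up for illustration and are not taken from the question. The point is that order() returns row positions and the original data frame is left untouched:

```r
# Hypothetical three-row stand-in for names.ordered
df <- data.frame(Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)

# order() returns the row positions that would sort the column
idx <- order(df$Directors)
idx
# [1] 2 3 1

# Indexing with idx gives the sorted values; df itself is unchanged
sorted <- df[idx, ]
sorted
# [1] "Darabont, Frank"    "Lumet, Sidney"      "Nolan, Christopher"
df$Directors[1]
# [1] "Nolan, Christopher"
```

Because the index is kept separately, the same `idx` could be reused to reorder any other vector of the same length.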

If you want to rearrange names.ordered itself, use sort():

names.ordered$Directors <- sort(names.ordered$Directors)
head(names.ordered$Directors)
# [1] "Abrahamson, Lenny"     "Allers, Roger"         "Anderson, Paul Thomas" "Anderson, Wes"         "Anno, Hideaki"         "Aronofsky, Darren"    
tail(names.ordered$Directors)
# [1] "Wong, Kar-Wai"    "Wyler, William"   "Yamada, Naoko"    "Yates, David"     "Zeller, Florian"  "Zemeckis, Robert"
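
As a quick check, sort() on a vector gives the same values as indexing that vector with its own order(). The sketch below uses a made-up vector `x`. One caveat worth keeping in mind: assigning a sorted column back only makes sense for a single-column data frame, because any other columns would not be reordered along with it.

```r
# Hypothetical example vector
x <- c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney")

# sort(x) is equivalent to x[order(x)]
same <- identical(sort(x), x[order(x)])
same
# [1] TRUE

# decreasing = TRUE reverses the direction
rev_sorted <- sort(x, decreasing = TRUE)
rev_sorted[1]
# [1] "Nolan, Christopher"
```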

【Answer 2】:

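I think your main problem is that you tried to order columns. The syntax for extracting elements from a data frame is x[i, j, ..., drop = TRUE] or x[j], where i refers to rows and j to columns. Note the comma, which is always required when referring to rows. Because you did not use a comma, R assumed you meant x[j] and that you wanted to reorder columns. So put order() before the comma to order by rows.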
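
The difference between the two indexing forms can be seen on a small made-up data frame (`df` is hypothetical, not the data from the question). Without a comma the index is treated as a column selector, which is why a row index of length 3 asks for columns that do not exist, and why indexing with order(1), which is just 1, silently returns the data frame unchanged:

```r
df <- data.frame(Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)

# No comma: the index selects columns. Column 1 exists, so the whole data frame comes back.
one_col <- df[order(1)]
dim(one_col)
# [1] 3 1

# No comma with a row index: columns 2 and 3 do not exist, so R raises an error.
res <- tryCatch(df[order(df$Directors)], error = function(e) conditionMessage(e))
res
# [1] "undefined columns selected"

# With a comma: the index before the comma selects rows.
# With only one column, the result is simplified to a vector.
by_row <- df[order(df$Directors), ]
by_row
```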

Inside the order() call, simply pass the vector whose ordering should be used to rearrange the data frame.

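A small complication is that you only have one column, so the result is coerced to the lowest possible dimension (a vector in this case). To avoid that, use the argument drop=FALSE: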

names_ordered <- names.ordered[order(names.ordered$Directors), , drop=FALSE]

head(names_ordered)
#                 Directors
# 110     Abrahamson, Lenny
# 27          Allers, Roger
# 148 Anderson, Paul Thomas
# 104         Anderson, Wes
# 134         Anno, Hideaki
# 58      Aronofsky, Darren
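
The effect of drop can be checked directly. This is a sketch on a hypothetical one-column data frame `df`, not the data from the question:

```r
df <- data.frame(Directors = c("Nolan, Christopher", "Darabont, Frank", "Lumet, Sidney"),
                 stringsAsFactors = FALSE)
idx <- order(df$Directors)

# Default: a single-column result is simplified to a character vector
class(df[idx, ])
# [1] "character"

# drop = FALSE keeps the data frame; the original row names travel with the rows
kept <- df[idx, , drop = FALSE]
class(kept)
# [1] "data.frame"
rownames(kept)
# [1] "2" "3" "1"

# Optional: renumber the rows after sorting
rownames(kept) <- NULL
```

Resetting the row names is a matter of taste; leaving them in place makes it easy to see where each row came from.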
