Apache Spark 2.0: Expression-string to orderBy() / sort() column in descending order

Posted 2023-03-31


[Question] (posted 2018-12-14 01:14:56):

I am looking at a book example similar to the following (practically identical):

>>> from pyspark.sql import functions as sFn
>>>   # Note: I import Spark functions this way to avoid name collisions w/ Python.
>>>   # Usage below: sFn.expr(), sFn.col(), etc.

>>> col0 = [0, 1, 2, 3]
>>> col1 = [4, 5, 6, 7]

>>> myDF = spark.createDataFrame(zip(col0, col1),
...                              schema=['col0', 'col1'])
>>> print(myDF)
>>> myDF.show()
>>> myDF.orderBy(sFn.expr('col0 desc')).show() # <--- Problem line. Doesn't descend.

The book claims that the last statement sorts by col0 in descending order, but it does not:

DataFrame[col0: bigint, col1: bigint]

+----+----+
|col0|col1|
+----+----+
|   0|   4|
|   1|   5|
|   2|   6|
|   3|   7|
+----+----+

+----+----+
|col0|col1|
+----+----+
|   0|   4|
|   1|   5|
|   2|   6|
|   3|   7|
+----+----+

However, this syntax variant has always worked for me:

myDF.orderBy(sFn.col("col0").desc()).show()

Is the problematic variant above a typo or an erratum? If so, what adjustment is needed to make it work?

Thanks.


[Answer 1]:

In sFn.expr('col0 desc'), desc is parsed as a column alias rather than as an ORDER BY modifier, as you can see by typing the expression into the console:

sFn.expr('col0 desc')
# Column<col0 AS `desc`>

Depending on your needs, there are several other options:

myDF.orderBy('col0', ascending=False).show()
+----+----+
|col0|col1|
+----+----+
|   3|   7|
|   2|   6|
|   1|   5|
|   0|   4|
+----+----+


myDF.orderBy(sFn.desc('col0')).show()
+----+----+
|col0|col1|
+----+----+
|   3|   7|
|   2|   6|
|   1|   5|
|   0|   4|
+----+----+

myDF.orderBy(myDF.col0.desc()).show()
+----+----+
|col0|col1|
+----+----+
|   3|   7|
|   2|   6|
|   1|   5|
|   0|   4|
+----+----+
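If the sort order really has to arrive as a string such as 'col0 desc', one workaround is to split the string yourself and pass the pieces to orderBy(), which accepts a list of column names together with a list of ascending flags. The following is only a minimal sketch: parse_sort_spec and order_by_specs are hypothetical helper names, not part of PySpark, and the parsing assumes plain column names without spaces or nested expressions.

```python
def parse_sort_spec(spec):
    """Split a string such as 'col0 desc' into (column_name, ascending).

    Hypothetical helper, not a PySpark function. Only handles
    '<name>', '<name> asc' and '<name> desc'.
    """
    parts = spec.split()
    if len(parts) == 1:
        # No direction given: default to ascending, as orderBy() does.
        return parts[0], True
    if len(parts) == 2 and parts[1].lower() in ("asc", "desc"):
        return parts[0], parts[1].lower() == "asc"
    raise ValueError("unsupported sort spec: %r" % (spec,))


def order_by_specs(df, *specs):
    """Sort df by specs like 'col0 desc' using orderBy's ascending flags."""
    parsed = [parse_sort_spec(s) for s in specs]
    names = [name for name, _ in parsed]
    flags = [asc for _, asc in parsed]
    # DataFrame.orderBy accepts a list of names plus a list of booleans.
    return df.orderBy(names, ascending=flags)
```

With the DataFrame from the question, order_by_specs(myDF, 'col0 desc').show() should then be equivalent to myDF.orderBy('col0', ascending=False).show().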

[Comments]:

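Just as you demonstrated, I evaluated it too, and I also tried adding AS to the string, but nothing worked. It looks like an error in the book. I like your variations. Accepted as the answer. Thanks!

You're welcome.

I have never been fond of SQL (because directives and keywords are sloppily interspersed with identifiers); I prefer the DataFrame API. The variations above also give visitors clear examples that they can apply to their own variants of this problem. IMHO. =:)

Agreed. SQL syntax is flexible, but as code piles up it can become very hard to maintain and reason about later.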
