Clojure thread-first with filter function

Posted 2021-04-27


I'm having trouble threading some forms together to do some ETL on a result set returned by a korma function.

This is what I get back from korma sql:

({:id 1 :some_field "asd" :children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}] :another_field "qwe"})

I want to filter this result set by getting the "children" whose :a key is 1.

My attempt:

;mock of korma result
(def data '({:id 1 :some_field "asd" :children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}] :another_field "qwe"}))

(-> data 
    first 
    :children 
    (filter #(= (% :a) 1)))

What I'm expecting here is a vector of hash maps in which :a is 1, i.e.:

[{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4}]

However, I get the following error:

IllegalArgumentException Don't know how to create ISeq from: xxx.core$eval3145$fn__3146  clojure.lang.RT.seqFrom (RT.java:505)

From the error I gather that it is trying to create a sequence from a function, but I can't connect the dots as to why.

Also, if I pull the filter call out of the threading form entirely, like this:

(let [children (-> data first :children)] 
    (filter #(= (% :a) 1) children))

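it works. I'm not sure why the first threading form isn't applying the filter function with the :children vector passed in as the coll argument.

Any and all help is much appreciated.

Thanks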

Answer

You want the thread-last macro:

(->> data first :children (filter #(= (% :a) 1)))

which yields

({:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4})

The thread-first macro in your original code is equivalent to writing:

(filter (:children (first data)) #(= (% :a) 1))

This fails because filter expects a collection as its second argument, and your anonymous function is not a sequence.
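If you want to check the expansion yourself, you can ask Clojure to expand both macros. The following is a small sketch rather than part of the original answer. The named predicate a-is-1? is a hypothetical stand-in for the anonymous function, used only so that the expanded forms are easy to read and compare.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical named predicate standing in for #(= (% :a) 1).
(defn a-is-1? [m] (= (:a m) 1))

;; The forms are quoted, so nothing is evaluated here;
;; we only inspect what the macros expand to.
(def thread-first-expansion
  (walk/macroexpand-all '(-> data first :children (filter a-is-1?))))
;; => (filter (:children (first data)) a-is-1?)

(def thread-last-expansion
  (walk/macroexpand-all '(->> data first :children (filter a-is-1?))))
;; => (filter a-is-1? (:children (first data)))
```

With thread-first the threaded value lands in the predicate position of filter; with thread-last it lands in the collection position.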

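Another answer

The thread-first (->) and thread-last (->>) macros are always problematic, because it is easy to make a mistake when choosing one over the other (or to mix them up, as happened here). Break the steps down like this: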

(ns tstclj.core
  (:use tupelo.core)  ; provides spyx and it->; see https://github.com/cloojure/tupelo/
  (:gen-class))

(def data [ {:id 1 :some_field "asd" 
             :children [ {:a 1 :b 2 :c 3} 
                          {:a 1 :b 3 :c 4}
                          {:a 2 :b 2 :c 3} ] 
             :another_field "qwe"} ] )

(def v1    (first data))
(def v2    (:children v1))
(def v3    (filter #(= (% :a) 1) v2))

(spyx v1)    ; from tupelo.core/spyx
(spyx v2)
(spyx v3)

You will get results like this:

v1 => {:children [{:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1} {:c 3, :b 2, :a 2}], :another_field "qwe", :id 1, :some_field "asd"}
v2 => [{:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1} {:c 3, :b 2, :a 2}]
v3 => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

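This is what you wanted. The problem is that the filter form really needs thread-last. The most reliable way to avoid this problem is to always be explicit and use the Clojure as-> threading form or, even better, it-> from the Tupelo library: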

(def result (it-> data 
                  (first it)
                  (:children  it)
                  (filter #(= (% :a) 1) it)))

By using thread-first, you accidentally wrote the equivalent of this:

(def result (it-> data 
                  (first it)
                  (:children  it)
                  (filter it #(= (% :a) 1))))

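The error reflects the fact that the function #(= (% :a) 1) cannot be converted into a seq. Sometimes it pays to use a let form and give names to the intermediate results: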

(let [result-map        (first data)
      children-vec      (:children  result-map)
      a1-maps           (filter #(= (% :a) 1) children-vec) ]
  (spyx a1-maps))
;;-> a1-maps => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

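We can also look at either of the two preceding solutions and notice that the output of each stage is used as the last argument to the next function in the pipeline. So we can also solve it with thread-last: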

(def result3  (->>  data
                    first
                    :children
                    (filter #(= (% :a) 1))))
(spyx result3)
;;-> result3 => ({:c 3, :b 2, :a 1} {:c 4, :b 3, :a 1})

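Unless your processing chain is very simple, I find it is almost always clearer to use the it-> form, which makes explicit how each stage of the pipeline uses the intermediate value.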
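The as-> form mentioned above ships with clojure.core, so it needs no extra dependency. As a minimal sketch (not from the original answer), the same pipeline could look like this. The binding name it is an arbitrary choice made here to mirror the Tupelo examples, and the data is the mock result set from the question.

```clojure
;; Mock of the korma result, copied from the question.
(def data [{:id 1 :some_field "asd"
            :children [{:a 1 :b 2 :c 3}
                       {:a 1 :b 3 :c 4}
                       {:a 2 :b 2 :c 3}]
            :another_field "qwe"}])

;; as-> binds the running value to a name of your choosing ("it" here),
;; so each step states explicitly where that value goes.
(def result
  (as-> data it
    (first it)
    (:children it)
    (filter #(= (:a %) 1) it)))
;; => ({:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4})
```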

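Another answer

"I'm not sure why the first threading form isn't applying the filter function with the :children vector passed in as the coll argument."

This follows directly from what the thread-first macro does.

From clojuredocs.org:

"Threads the expr through the forms. Inserts x as the second item in the first form, making a list of it if it is not a list already."

So, in your case, the application of filter ends up being: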

(filter [...] #(= (% :a) 1))
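To see where the ISeq message comes from, here is a small reproduction sketch that is not part of the original answer. The names children, a-is-1?, swapped, and error-message are hypothetical and exist only for this illustration. Note that filter is lazy, so the swapped call only fails once something (such as the REPL printing the result) tries to realize the sequence.

```clojure
(def children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}])

;; Hypothetical named predicate standing in for #(= (% :a) 1).
(defn a-is-1? [m] (= (:a m) 1))

;; Arguments in the order thread-first produces: the vector sits in the
;; predicate slot and the function sits in the collection slot.
;; No error yet, because filter returns a lazy sequence.
(def swapped (filter children a-is-1?))

;; Realizing the sequence makes filter call seq on the function,
;; which is what throws.
(def error-message
  (try
    (doall swapped)
    nil
    (catch IllegalArgumentException e
      (.getMessage e))))
;; error-message starts with "Don't know how to create ISeq from"
```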

If you must use thread-first (rather than thread-last), you can work around this by partially applying filter to its predicate:

(->
  data
  first
  :children
  ((partial filter #(= (:a %) 1)))
  vec)

; [{:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4}]
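Two other spellings may be worth knowing. This is a hedged sketch rather than part of the original answer: the first nests a thread-last form inside the thread-first pipeline, and the second uses filterv from clojure.core, which returns a vector directly. The var names nested-result and filterv-result are made up for this example.

```clojure
;; Mock of the korma result, copied from the question.
(def data '({:id 1 :some_field "asd"
             :children [{:a 1 :b 2 :c 3} {:a 1 :b 3 :c 4} {:a 2 :b 2 :c 3}]
             :another_field "qwe"}))

;; Option 1: a nested ->> receives the threaded value as its first item,
;; then threads it into the LAST position of the filter call.
(def nested-result
  (-> data
      first
      :children
      (->> (filter #(= (:a %) 1)))
      vec))

;; Option 2: thread-last throughout, with filterv so the result
;; is already a vector and no trailing vec is needed.
(def filterv-result
  (->> data
       first
       :children
       (filterv #(= (:a %) 1))))
;; both => [{:a 1, :b 2, :c 3} {:a 1, :b 3, :c 4}]
```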
