Using walk to recursively aggregate resources in a terraform state with rego

Posted 2023-04-13

【Asked】: 2020-11-06 16:47:51 【Question】:

I'm using Open Policy Agent to write policy against the JSON output of my terraform state.

Here is the structure of the state file:


{
  "format_version": "0.1",
  "terraform_version": "0.12.28",
  "values": {
    "root_module": {
      "resources": [],
      "child_modules": [
        {
          "resources": [],
          "address": "",
          "child_modules": [
            {
              "resources": [],
              "address": "",
              "child_modules": [
                {}
              ]
            }
          ]
        }
      ]
    }
  }
}

I have this nasty rule defined to achieve what I want, but it is obviously not the ideal way to aggregate these resources.

resources[resource_type] = all {
    some resource_type
    resource_types[resource_type]
    rm := tfstate.values.root_module

    # I think the below can be simplified with the built in "walk" function TODO: do that.
    root_resources := [name |
        name := rm.resources[_]
        name.type == resource_type
    ]

    cmone_resources := [name |
        name := rm.child_modules[_].resources[_]
        name.type == resource_type
    ]

    cmtwo_resources := [name |
        name := rm.child_modules[_].child_modules[_].resources[_]
        name.type == resource_type
    ]

    cm := array.concat(cmone_resources, cmtwo_resources)

    all := array.concat(cm, root_resources)
}

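I've read the documentation for the built-in function walk(x, [path, value]). I believe this function does what I want, but based on the documentation given and the admittedly sparse examples I've found elsewhere, I can't figure out how to get it to work as I expect.

I've included a playground with a very basic setup and the current rule I've defined. Any help would be much appreciated.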

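【Comments】:

【Answer 1】:

You are on the right track: walk is definitely a good way to collect arbitrarily nested child resources.

First, let's explore what walk does. It iterates over every node in the object we are walking and, for each node, gives us the "path" to that node and the node's current value. The path is an array of keys, so for an object like:

{"a": {"b": {"c": 123}}}

If we walk over it (the examples below use the opa run REPL):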

> [path, value] = walk({"a": {"b": {"c": 123}}})
+---------------+-----------------------+
|     path      |         value         |
+---------------+-----------------------+
| []            | {"a":{"b":{"c":123}}} |
| ["a"]         | {"b":{"c":123}}       |
| ["a","b"]     | {"c":123}             |
| ["a","b","c"] | 123                   |
+---------------+-----------------------+

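We see every combination of path and value that walk produces. You can capture any of these values while iterating in a partial rule (like your resources rule) or in a comprehension.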
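As a minimal sketch of both options, the snippet below walks the same small object once in a partial set rule and once in a comprehension. The package name, `doc`, `number_paths`, and `leaves` are hypothetical names chosen for illustration. It uses the same pre-1.0 Rego syntax as the rest of this thread; OPA 1.0 and later expect the `if` and `contains` keywords in rule heads.

```rego
package walk_demo

# Hypothetical sample document, mirroring the REPL example above.
doc := {"a": {"b": {"c": 123}}}

# Partial set rule: collect every path whose value is a number.
number_paths[path] {
    [path, value] := walk(doc)
    is_number(value)
}

# Comprehension: collect every value that is neither an object nor an array.
leaves := [value |
    [_, value] := walk(doc)
    not is_object(value)
    not is_array(value)
]
```

For this document the only numeric node sits at path ["a","b","c"], so both rules pick up that single node.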

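So, taking that to the terraform stuff: if we modify the playground example to walk over the sample input (modified slightly to give things some unique names), we get: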

walk_example[path] = value {
    [path, value] := walk(tfstate)
}

https://play.openpolicyagent.org/p/2u5shGbrV2

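If you look at the resulting value of walk_example, you can see all the paths and values we should expect to deal with.

From there it is a matter of filtering, similar to what you did with resource_types in your resources rule. Instead of iterating over that set, we use it as a lookup to check whether each type is wanted, and we first build one complete set of all resources (without grouping them by type). The reason is that walking every node of the input JSON is expensive, so we only want to do it once. Afterwards we can iterate over the complete set of resources much faster in a second pass that groups them by type (as needed).

The updated version looks like this: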

walk_resources[resource] {
    [path, value] := walk(tfstate)

    # Attempt to iterate over "resources" of the value. If the key doesn't
    # exist it's OK: this iteration of walk will be undefined and excluded
    # from the results.
    # Note: If you needed to be sure it was a "real" resource, and not some
    # other key, you can perform additional validation on the path here!
    resource := value.resources[_]

    # check if the resource type was contained in the set of desired resource types
    resource_types[resource.type]
}

https://play.openpolicyagent.org/p/TyqMKDyWyh
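The comment in walk_resources mentions that the path can be validated as well. One possible way to do that, as a sketch rather than the answer's own code, is to start the walk at the root module and accept a node only when its path looks like a module path. The package name, `is_module_path`, `walk_module_resources`, the `tfstate := input` binding, and the contents of `resource_types` are all assumptions made for illustration.

```rego
package terraform_walk

# Assumption: the state JSON is supplied as the policy input.
tfstate := input

# Hypothetical set of resource types to collect.
resource_types := {"aws_s3_bucket"}

# Stricter variant of walk_resources: only nodes reached through
# "child_modules" (or the root module itself) may contribute resources.
walk_module_resources[resource] {
    [path, value] := walk(tfstate.values.root_module)
    is_module_path(path)
    resource := value.resources[_]
    resource_types[resource.type]
}

# The root module: walk reports the starting node with an empty path.
is_module_path(path) {
    count(path) == 0
}

# A nested module: its path ends in ["child_modules", <array index>].
is_module_path(path) {
    n := count(path)
    n >= 2
    key_index := n - 2
    item_index := n - 1
    path[key_index] == "child_modules"
    is_number(path[item_index])
}
```

This is still a heuristic: it only inspects the last two path elements, so it rules out a stray "resources" key nested inside a resource's own attributes, but not a structure that happens to imitate the module layout.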

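^ The playground input has been updated to include another level of nesting and more types in the example. You can see that the original resources output is missing the depth-3 resources, while the walk_resources set contains all the expected ones.

For the last part, if you want to group them by type, add a complete rule such as: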

# list of all resources of a given type. given type must be defined in the resource_types variable above
resources = {resource_type: resources |
    some resource_type
    resource_types[resource_type]
    resources := {resource |
        walk_resources[resource]
        resource.type == resource_type
    }
}

https://play.openpolicyagent.org/p/RlRZwibij9

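This replaces the original resources rule with a comprehension that iterates over each resource type and then collects the resources matching that type.

One extra pointer, a gotcha I've seen with these terraform resource helper rules: you will want to reference that "complete" rule (see https://www.openpolicyagent.org/docs/latest/policy-language/#complete-definitions for details on what that means) rather than the "partial" rules (in this case, the ones that build a set of resources instead of assigning a value from a comprehension result). The problem is that, at the time of writing, OPA internally caches the values of "complete" rules but not those of partial rules. So if you then go and write a bunch of rules like: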

deny[msg] {
    r := resources["foo"]
    # enforce something for resources of type "foo"...
    ...
}

deny[msg] {
    r := resources["bar"]
    # enforce something for resources of type "bar"...
    ...
}

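You want to make sure each of them uses the cached value of resources instead of recomputing the set. The original version of your resources rule would suffer from that problem, as would using the walk_resources rule shown in these examples directly. It is something to be mindful of, because it can have a pretty big performance impact if you have a large input tfplan.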
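Putting the pieces together, here is a hedged end-to-end sketch of a deny rule that reads from the complete resources rule. The package name, the `tfstate := input` binding, the `aws_s3_bucket` type, and the `values.acl` check are hypothetical and only illustrate the shape of such a rule; the inner comprehension variable is named `matching` here so that it does not shadow the rule name.

```rego
package terraform_deny

# Assumption: the state JSON is supplied as the policy input.
tfstate := input

# Hypothetical set of resource types this policy cares about.
resource_types := {"aws_s3_bucket"}

# Partial (set) rule from the answer: every resource of a wanted type.
walk_resources[resource] {
    [_, value] := walk(tfstate)
    resource := value.resources[_]
    resource_types[resource.type]
}

# Complete rule from the answer: resources grouped by type.
resources := {resource_type: matching |
    some resource_type
    resource_types[resource_type]
    matching := {resource |
        walk_resources[resource]
        resource.type == resource_type
    }
}

# Hypothetical check: flag buckets whose ACL is "public-read".
deny[msg] {
    bucket := resources["aws_s3_bucket"][_]
    bucket.values.acl == "public-read"
    msg := sprintf("%s must not use a public-read ACL", [bucket.address])
}
```

Each further deny rule would look up its own type in resources in the same way, so the walk over the input happens in one place only.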

【Comments】:
