CloudWatch alarm for a list of servers

Posted: 2021-04-09 20:33:15

Question:

I am trying to set up some alarms across a list of servers. My servers are defined in a locals block as follows:

locals {
  my_list = [
    "server1",
    "server2"
  ]
}

I then define my CloudWatch alarms like this (this is one such alarm):

resource "aws_cloudwatch_metric_alarm" "ec2-high-cpu-warning" {
  for_each            = toset(local.my_list)
  alarm_name          = "ec2-high-cpu-warning-for-${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = "1"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  dimensions = {
    # Both values below are lists, while dimensions expects a map of strings
    instanceid   = values(data.aws_instances.my_instances)[*].ids
    instancename = local.my_list
  }

  period                    = "60"
  statistic                 = "Average"
  threshold                 = "11"
  alarm_description         = "This warning is for high cpu utilization for ${each.key}"
  actions_enabled           = true
  alarm_actions             = [data.aws_sns_topic.my_sns.arn]
  insufficient_data_actions = []
  treat_missing_data        = "notBreaching"
}

I also define the data source like this:

data "aws_instances" "my_instances" {
  for_each = toset(local.my_list)

  instance_tags = {
    Name = each.key
  }
}

Now when I run terraform plan I get an error:

| data.aws_instances.my_instances is object with 2 attributes

Inappropriate value for attribute "dimensions": element "instanceid": string required.
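The message follows from the shape of the data source: with for_each, data.aws_instances.my_instances is a map keyed by server name, and each entry exposes a list attribute ids, so the splat expression in the alarm yields a list of lists instead of one string. A minimal sketch for inspecting that shape; the output name instance_ids_by_server is hypothetical and not part of the original configuration:

```hcl
# Hypothetical debugging output, not part of the original configuration.
# Shows, per server name, the list of instance IDs matched by the Name tag.
output "instance_ids_by_server" {
  value = {
    for name, inst in data.aws_instances.my_instances : name => inst.ids
  }
}
```

Running terraform plan with this output in place should list one entry per server name, each holding a list of IDs, which is why a single dimension value cannot be taken from it directly.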

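Comments:

Do you want to define one alarm per instance? So in your example, you would be creating two alarms?

@marcin I want several different alarms per instance, for example CPU, memory and so on; I only posted the CPU alarm here. And yes, in my example above two alarms are created, one per instance, correct.

Answer 1: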

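In your for_each you should use data.aws_instance.my_instances. Note that this refers to the singular aws_instance data source, which exposes a single id; the plural aws_instances data source from the question exposes a list, ids, instead.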

resource "aws_cloudwatch_metric_alarm" "ec2-high-cpu-warning" {

  for_each            = data.aws_instance.my_instances

  alarm_name          = "ec2-high-cpu-warning-for-${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = "1"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"

  dimensions = {
    instanceid   = each.value.id
    instancename = each.key
  }

  period                    = "60"
  statistic                 = "Average"
  threshold                 = "11"
  alarm_description         = "This warning is for high cpu utilization for ${each.key}"
  actions_enabled           = true
  alarm_actions             = [data.aws_sns_topic.my_sns.arn]
  insufficient_data_actions = []
  treat_missing_data        = "notBreaching"
}

The above will create two alarms for your two instances (one alarm per instance), where instancename will be server1 or server2.
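Putting the pieces together, here is a minimal sketch of how the data source and the alarm could be declared so that they agree with each other. It rests on two assumptions that go beyond the original answer: each Name tag matches exactly one instance (the singular aws_instance data source fails otherwise), and the alarm should target the metric that EC2 publishes under the InstanceId dimension. CloudWatch dimension names are case-sensitive, so the lowercase instanceid and the extra instancename key would most likely leave the alarm without matching data. The resource name ec2_high_cpu_warning is chosen for illustration only.

```hcl
# Assumption: exactly one instance carries each Name tag; the singular
# data source returns an error if zero or several instances match.
data "aws_instance" "my_instances" {
  for_each = toset(local.my_list)

  instance_tags = {
    Name = each.key
  }
}

# One CPU alarm per server name in local.my_list.
resource "aws_cloudwatch_metric_alarm" "ec2_high_cpu_warning" {
  for_each = data.aws_instance.my_instances

  alarm_name          = "ec2-high-cpu-warning-for-${each.key}"
  alarm_description   = "This warning is for high cpu utilization for ${each.key}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 11

  # Assumption: match the metric on InstanceId only, the dimension
  # under which AWS/EC2 publishes per-instance CPUUtilization.
  dimensions = {
    InstanceId = each.value.id
  }

  actions_enabled           = true
  alarm_actions             = [data.aws_sns_topic.my_sns.arn]
  insufficient_data_actions = []
  treat_missing_data        = "notBreaching"
}
```

Further alarms (memory, disk and so on) could follow the same pattern with their own resource blocks iterating over the same data source, although metrics outside the AWS/EC2 namespace may require different dimensions.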
