AWS ECS Fargate container health check command
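[Title]: AWS ECS Fargate Container Healthcheck command
[Posted]: 2019-10-29 16:06:22
[Question]: I am trying to set up an AWS ECS Fargate deployment configuration. I can run the container without a container health check, but I also want to run one. I have tried every approach I could think of, without success.
I tried the following commands, which AWS recommends at the URL below, to check the container's health.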
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definition_healthcheck
[ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]
[ "CMD-SHELL", "curl -f 127.0.0.1 || exit 1" ]
I tried both commands above, but neither works as expected. Please help me find a health check command that works for the container.
Below is my Dockerfile:
FROM centos:latest
RUN yum update -y
RUN yum install httpd httpd-tools curl -y
EXPOSE 80
CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
HEALTHCHECK CMD curl --fail http://localhost:80/ || exit 1
FROM microsoft/dotnet:2.1-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
FROM microsoft/dotnet:2.1-sdk AS build
WORKDIR /DockerDemoApi
COPY ./DockerDemoApi.csproj DockerDemoApi/
RUN dotnet restore DockerDemoApi/DockerDemoApi.csproj
COPY . .
WORKDIR /DockerDemoApi
RUN dotnet build DockerDemoApi.csproj -c Release -o /app
FROM build AS publish
RUN dotnet publish DockerDemoApi.csproj -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "DockerDemoApi.dll"]
I ran the curl command inside my container and it works there. However, if I use the same command as the health check in the AWS task definition, it fails.
Task definition JSON:
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "secretOptions": null,
        "options": {
          "awslogs-group": "/ecs/mall-health-check-task",
          "awslogs-region": "ap-south-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "entryPoint": [],
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "command": [],
      "linuxParameters": null,
      "cpu": 256,
      "environment": [],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "secrets": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": 512,
      "volumesFrom": [],
      "stopTimeout": null,
      "image": "xxxx.dkr.ecr.ap-south-1.amazonaws.com/autoaml/api/dev/alpine:latest",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": null,
      "interactive": null,
      "healthCheck": null,
      "essential": true,
      "links": [],
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "sample-app"
    }
  ],
  "placementConstraints": [],
  "memory": "512",
  "taskRoleArn": "arn:aws:iam::xxxx:role/ecsTaskExecutionRole",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "arn:aws:ecs:ap-south-1:xxx:task-definition/mall-health-check-task:9",
  "family": "mall-health-check-task",
  "requiresAttributes": [
    { "targetId": null, "targetType": null, "value": null, "name": "ecs.capability.execution-role-ecr-pull" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18" },
    { "targetId": null, "targetType": null, "value": null, "name": "ecs.capability.task-eni" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.ecr-auth" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.task-iam-role" },
    { "targetId": null, "targetType": null, "value": null, "name": "ecs.capability.execution-role-awslogs" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.logging-driver.awslogs" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21" },
    { "targetId": null, "targetType": null, "value": null, "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19" }
  ],
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "256",
  "revision": 9,
  "status": "ACTIVE",
  "proxyConfiguration": null,
  "volumes": []
}
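[Discussion]:
Is your health check failing, or is Fargate throwing an error?
Yes. It always shows the health status as UNKNOWN.
I can't see anything wrong with the command you are passing, unless there is a problem with the container's health itself. One more question: did you mark your container as "essential=true"?
I did not add essential=true, @Haran. But what would that do?
I added essential=true and it still shows the same status.
[Answer 1]: The documentation mentions the following: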
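When registering a task definition in the AWS Management Console, use a comma-separated list of commands, which is automatically converted to a string after the task definition is created. An example input for a health check could be:
CMD-SHELL, curl -f http://localhost/ || exit 1
When registering a task definition using the AWS Management Console JSON panel, the AWS CLI, or the APIs, you should enclose the list of commands in brackets. An example input for a health check could be:
[ "CMD-SHELL", "curl -f http://localhost/ || exit 1" ]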
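To make the difference between the two input forms concrete, here is a minimal Python sketch that turns the console's comma-separated form into the bracketed list form. The function name `console_to_json_command` is hypothetical, and the splitting rule is an assumption for illustration; it is not a description of what the console does internally.

```python
import json


def console_to_json_command(console_value: str) -> list[str]:
    """Convert 'CMD-SHELL, <shell command>' into the JSON list form."""
    # Split on the first comma only, so commas inside the command survive.
    mode, _, rest = console_value.partition(",")
    mode = mode.strip()
    if mode not in ("CMD", "CMD-SHELL"):
        raise ValueError(f"unexpected health check mode: {mode!r}")
    if mode == "CMD-SHELL":
        # CMD-SHELL carries the whole shell command as one string.
        return [mode, rest.strip()]
    # CMD carries every argument as its own list element.
    return [mode] + [part.strip() for part in rest.split(",") if part.strip()]


print(json.dumps(console_to_json_command(
    "CMD-SHELL, curl -f http://localhost/ || exit 1")))
```

The printed list is what would go into the `command` field of `healthCheck` when the task definition is written as JSON.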
Did you verify your health check command? I mean, http://127.0.0.1 is valid, right? Check that your container returns a success response when you hit http://127.0.0.1 (without a port).
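One way to check the response without curl is a small probe that roughly mirrors `curl -f`: it reports failure when the server answers with HTTP 400 or above, or does not answer at all. This is a sketch only; the function name `probe` is hypothetical, and unlike plain `curl -f` it follows redirects.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> int:
    """Return 0 if the URL answers with a status below 400, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if response.status < 400 else 1
    except (urllib.error.URLError, OSError):
        # HTTP 4xx/5xx responses raise HTTPError (a URLError subclass);
        # refused connections and timeouts end up here as well.
        return 1
```

Calling `probe("http://127.0.0.1/")` from inside the container should return 0 before the same URL is used in the task definition's health check.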
Below is a sample task definition. It starts a Tomcat server in the container and checks its health (localhost:8080).
- Modify the task definition as needed (for example, the role ARN).
- Create an ECS service and map the task definition to it.
- Create the configured log group.
- Start the ECS service; your task should show as HEALTHY.
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "secretOptions": null,
        "options": {
          "awslogs-group": "/test/test-task",
          "awslogs-region": "us-east-2",
          "awslogs-stream-prefix": "test"
        }
      },
      "entryPoint": null,
      "portMappings": [
        {
          "hostPort": 8080,
          "protocol": "tcp",
          "containerPort": 8080
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 0,
      "environment": [],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "secrets": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": null,
      "volumesFrom": [],
      "stopTimeout": null,
      "image": "tomcat",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": false,
      "interactive": null,
      "healthCheck": {
        "retries": 3,
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8080/ || exit 1"
        ],
        "timeout": 5,
        "interval": 30,
        "startPeriod": null
      },
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "tomcat"
    }
  ],
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::accountid:role/taskExecutionRole",
  "family": "test-task",
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "512",
  "proxyConfiguration": null,
  "volumes": []
}
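[Discussion]:
I tried that too before posting this question, @Haran. The output did not change. If you have a working example, please share it and I will try it.
Added a sample task definition.
I have added my Dockerfile. Can you tell whether there is a mistake in the health check definition?
Regarding the health check: since you pass the command in the task definition, it overrides the Dockerfile health check, and ECS starts the container with something like "docker run -d --health-cmd='curl … || exit 1' --health-interval=5s --health-timeout=3s tomcat". Please share your task definition JSON. First verify that your health check works in a local Docker environment, then try it in ECS Fargate.
I have added my task definition JSON. Please take a look, @Haran. FYI, I removed the health check command from my task definition because it was failing.
[Answer 2]: Does the Docker image you are using have curl installed as part of the package?
Based on your screenshot, it looks like you are using the httpd:2.4 Docker image directly. If so, curl is not part of the package.
You need to build your own Docker image with httpd:2.4 as the base. Below is sample Dockerfile content that makes curl part of the image.
Example: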
FROM httpd:2.4
# Chain with && so the build stops if the package index update fails.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl
Then build the image and push it to your Docker Hub account or a private Docker repository.
docker build -t my-apache2 .
docker run -dit --name my-running-app -p 80:80 my-apache2
With the image above, you should be able to get the health check command working.
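To try the same check locally before deploying, the ECS `healthCheck` fields can be mapped onto the `--health-*` flags of `docker run`. The sketch below only builds the command line; the function name `docker_run_health_args` and the example values are assumptions for illustration, and `my-apache2` is the image tag from the build command above.

```python
import shlex

# ECS healthCheck field -> docker run flag; ECS values are in seconds.
FLAG_MAP = (
    ("interval", "--health-interval", "s"),
    ("timeout", "--health-timeout", "s"),
    ("retries", "--health-retries", ""),
    ("startPeriod", "--health-start-period", "s"),
)


def docker_run_health_args(health_check: dict, image: str) -> list[str]:
    """Build a docker run argv that applies the given ECS-style health check."""
    mode, *rest = health_check["command"]
    # CMD-SHELL holds one shell string; CMD holds separate argv items.
    command = rest[0] if mode == "CMD-SHELL" else shlex.join(rest)
    args = ["docker", "run", "-d", f"--health-cmd={command}"]
    for key, flag, unit in FLAG_MAP:
        value = health_check.get(key)
        if value is not None:  # unset fields are simply left out
            args.append(f"{flag}={value}{unit}")
    return args + [image]


example = {
    "command": ["CMD-SHELL", "curl -f http://localhost/ || exit 1"],
    "interval": 30,
    "timeout": 5,
    "retries": 3,
}
print(shlex.join(docker_run_health_args(example, "my-apache2")))
```

After starting the container with the printed command, `docker inspect --format '{{.State.Health.Status}}' <container>` shows whether Docker considers it healthy.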
https://hub.docker.com/_/httpd
https://github.com/docker-library/httpd/blob/master/2.4/Dockerfile
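[Discussion]:
Yes, @Imran. I installed curl as part of the package. FYI, please see the code below: FROM centos:latest RUN yum update -y RUN yum install httpd httpd-tools curl -y EXPOSE 80
The problem persists.
@Arjun Can you edit the question to include the full details of your Dockerfile? The code in the comment above is missing the ENTRYPOINT details, so I am not sure whether you are starting the service correctly.
I have added my Dockerfile. Please let me know.
[Answer 3]: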
I don't know why, but changing http://localhost to http://127.0.0.1 (not just 127.0.0.1) fixed the problem.
I followed the advice here and it resolved my health check problem.
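One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that `localhost` resolves to the IPv6 loopback address `::1` inside the container while the web server listens on IPv4 only. The sketch below lists what a host name resolves to; the function name `resolve` is hypothetical.

```python
import socket


def resolve(host: str, port: int = 80) -> list[str]:
    """Return the distinct addresses a host name resolves to, in order."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name could not be resolved at all
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


print(resolve("localhost"))
print(resolve("127.0.0.1"))
```

If the first entry for `localhost` is `::1`, using `http://127.0.0.1` in the health check forces an IPv4 connection.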
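[Discussion]:
[Answer 4]: I ran into the same problem and found a solution for my use case:
Three containers in one task definition:
- an nginx sidecar
- two Node.js applications
I used an ecs-params.yml file to declare the health checks: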
version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 2GB
    cpu_limit: 1024
  services:
    nginx-sidecar:
      healthcheck:
        test: curl -f http://localhost || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
    <service 2>:
      healthcheck:
        test: curl -f http://localhost:3023 || exit 0
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
    <service 3>:
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:3019/health"]
        interval: 10s
        timeout: 3s
        retries: 3
        start_period: 5s
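A note on the `|| exit 0` fallback used in the first two checks: Docker health checks treat exit code 0 as healthy and 1 as unhealthy, so a command ending in `|| exit 0` reports healthy even when curl fails. The sketch below shows the difference, using `false` as a stand-in for a failing curl call (an assumption, so that it runs without a server); the function name `shell_exit_code` is hypothetical.

```python
import subprocess


def shell_exit_code(command: str) -> int:
    """Run a command through sh -c and return its exit code."""
    return subprocess.run(["sh", "-c", command]).returncode


failing_probe = "false"  # stand-in for a curl call that fails

# With "|| exit 0" the failure is swallowed: the check reports success.
print(shell_exit_code(f"{failing_probe} || exit 0"))
# With "|| exit 1" the failure is passed on to the health check.
print(shell_exit_code(f"{failing_probe} || exit 1"))
```

If the goal is for an unreachable endpoint to mark the container unhealthy, `|| exit 1`, as in the question and in Answer 1, is the form that does so.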
Make sure curl is available in your Docker image and that you can also invoke it locally.
My Dockerfile:
FROM node:14.17-alpine
RUN apk add --update curl
You can put either of the following health check commands in ecs-params.yml:
test: curl -f http://localhost || exit 0
test: ["CMD", "curl", "-f", "http://localhost"]
Both work in my use case. I hope this helps, since none of the other answers worked for me.
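The two `test` forms differ in how the command is executed. Under the Compose file convention, a plain string is run through the shell (equivalent to `CMD-SHELL`), while a list starting with `CMD` is executed directly. The sketch below normalises both into the list form used by the `healthCheck.command` field; that ecs-cli applies the same convention is an assumption here, and the function name `to_ecs_command` is hypothetical.

```python
def to_ecs_command(test) -> list[str]:
    """Normalise a Compose-style healthcheck test into an ECS command list."""
    if isinstance(test, str):
        # A bare string is a shell command line.
        return ["CMD-SHELL", test]
    if isinstance(test, list) and test and test[0] in ("CMD", "CMD-SHELL", "NONE"):
        # A list already names its mode in the first element.
        return list(test)
    raise ValueError(f"unsupported healthcheck test: {test!r}")


print(to_ecs_command("curl -f http://localhost || exit 0"))
print(to_ecs_command(["CMD", "curl", "-f", "http://localhost"]))
```

Note that the `CMD` form has no shell, so operators such as `||` cannot be used in it; curl's own exit code decides the result.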
[Discussion]: