AWS ECS Fargate - Task Not Running
Posted: 2019-09-25 04:25:53
Question:
I followed this AWS tutorial to set up a task on AWS ECS Fargate: https://docs.aws.amazon.com/AmazonECS/latest/userguide/ECS_AWSCLI_Fargate.html
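I have an image in ECR, and I have set up the cluster, the service, and the task definition, but nothing is running. I have to run the task manually in the AWS console to start it, and when I update the service with a new task definition, the running task is not updated, even with a forced deployment.
I want a very simple setup, so I have no ELB or Auto Scaling policy, and the service uses the following settings:
Number of tasks: 1
Minimum healthy percent: 100
Maximum percent: 200
Deployment type: rolling update
I feel like I am missing something: my task does not start automatically, and it is not updated when the service is updated.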
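One place to look when a service never starts its tasks is the service event list, where ECS records failed launch attempts. A minimal sketch, assuming the AWS CLI is installed and configured; show_service_events is a hypothetical helper name, and the cluster and service names in the usage comment are guesses derived from the naming scheme of the deployment script, not values confirmed in the post:

```shell
#!/bin/bash
# Hypothetical helper: print the messages of the five most recent
# events of an ECS service. Arguments: cluster name, service name.
show_service_events() {
  local cluster=$1 service=$2
  aws ecs describe-services \
    --cluster "$cluster" \
    --services "$service" \
    --query 'services[0].events[:5].message' \
    --output json
}

# Usage (assumed names, adjust to your own):
# show_service_events myapp-production-cluster myapp-production-website-service
```

If the events mention a role that cannot be assumed or an image that cannot be pulled, the task definition's execution role is a likely place to check next.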
In case it helps, I have attached below the deployment code I use in my Bitbucket pipeline:
#!/bin/bash
set -e
# Options: -b (base / app name), -i (image version), -e (deploy env), -s (service id), -r (execution role)
while getopts b:i:e:s:r: option
do
  case "$option" in
    b) BASE_NAME=$OPTARG;;
    i) IMG_VERSION=$OPTARG;;
    e) DEPLOY_ENV=$OPTARG;;
    s) SERVICE_ID=$OPTARG;;
    r) EXECUTION_ROLE=$OPTARG;;
    *) exit 1;;
  esac
done
echo "BASE_NAME: $BASE_NAME"
echo "IMG_VERSION: $IMG_VERSION"
echo "DEPLOY_ENV: $DEPLOY_ENV"
echo "SERVICE_ID: $SERVICE_ID"
echo "EXECUTION_ROLE: $EXECUTION_ROLE"
# Exit with a non-zero status if a required option is missing,
# so the pipeline step is reported as failed
if [ -z "$BASE_NAME" ]; then
  echo "exit: No BASE_NAME specified"
  exit 1
fi
if [ -z "$SERVICE_ID" ]; then
  echo "exit: No SERVICE_ID specified"
  exit 1
fi
if [ -z "$DEPLOY_ENV" ]; then
  echo "exit: No DEPLOY_ENV specified"
  exit 1
fi
if [ -z "$IMG_VERSION" ]; then
  echo "exit: No IMG_VERSION specified"
  exit 1
fi
if [ -z "$EXECUTION_ROLE" ]; then
  echo "exit: No EXECUTION_ROLE specified"
  exit 1
fi
# Define variables
TASK_FAMILY=$BASE_NAME-$DEPLOY_ENV-$SERVICE_ID
SERVICE_NAME=$BASE_NAME-$DEPLOY_ENV-$SERVICE_ID-service
CLUSTER_NAME=$BASE_NAME-$DEPLOY_ENV-cluster
IMAGE_PLACEHOLDER="<IMAGE_VERSION>"
CONTAINER_DEFINITION_FILE=$(cat "$BASE_NAME-$SERVICE_ID.container-definition.json")
# Replace every occurrence of the placeholder with the image version
CONTAINER_DEFINITION="${CONTAINER_DEFINITION_FILE//$IMAGE_PLACEHOLDER/$IMG_VERSION}"
# Register a new task definition revision and capture its revision number
export TASK_VERSION=$(aws ecs register-task-definition \
  --family "$TASK_FAMILY" \
  --container-definitions "$CONTAINER_DEFINITION" \
  --requires-compatibilities '["FARGATE"]' \
  --cpu "512" --memory "1024" --network-mode "awsvpc" \
  --execution-role-arn "$EXECUTION_ROLE" \
  | jq --raw-output '.taskDefinition.revision')
echo "Registered ECS Task Definition: $TASK_VERSION"
if [ -n "$TASK_VERSION" ]; then
  echo "Update ECS Cluster: $CLUSTER_NAME"
  echo "Service: $SERVICE_NAME"
  echo "Task Definition: $TASK_FAMILY:$TASK_VERSION"
  # Update ECS Service
  DEPLOYED_SERVICE=$(aws ecs update-service \
    --cluster "$CLUSTER_NAME" \
    --service "$SERVICE_NAME" \
    --task-definition "$TASK_FAMILY:$TASK_VERSION" \
    --force-new-deployment \
    | jq --raw-output '.service.serviceName')
  echo "Deployment of $DEPLOYED_SERVICE complete"
else
  echo "exit: No task definition"
  exit 1
fi
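The placeholder step in the script relies on bash pattern replacement, where the form with a double slash replaces every occurrence of a pattern inside a variable. A minimal sketch of that step in isolation; the JSON string, the example/repo image name, and the version 1.4.2 are made up for illustration, since the real container definition file is not shown in the post:

```shell
#!/bin/bash
# Hypothetical container definition; the real one is read from a JSON file.
TEMPLATE='[{"name":"dig-website","image":"example/repo:<IMAGE_VERSION>"}]'
PLACEHOLDER="<IMAGE_VERSION>"
IMG_VERSION="1.4.2"   # illustrative value

# ${var//pattern/replacement} replaces all matches of the pattern.
# Quoting the pattern makes bash treat it as a literal string.
RENDERED="${TEMPLATE//"$PLACEHOLDER"/$IMG_VERSION}"
echo "$RENDERED"
```

Without the surrounding braces, bash expands only the variable name and leaves the rest as literal text, so the placeholder is never replaced.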
Edit:
Here is my task definition:
{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::<Account-id>:role/:arn:aws:iam::<Account-id>:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": null,
      "entryPoint": [],
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        },
        {
          "hostPort": 443,
          "protocol": "tcp",
          "containerPort": 443
        }
      ],
      "command": [],
      "linuxParameters": null,
      "cpu": 0,
      "environment": [],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": "/usr/share/nginx/html/",
      "secrets": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": null,
      "volumesFrom": [],
      "stopTimeout": null,
      "image": "<Account-id>.dkr.ecr.us-east-1.amazonaws.com/<my-ecr-image>:latest",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": null,
      "interactive": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "dig-website"
    }
  ],
  "placementConstraints": [],
  "memory": "1024",
  "taskRoleArn": null,
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "arn:aws:ecs:us-east-1:<Account-id>:task-definition/myapp-production-website:11",
  "family": "myapp-production-website",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-ecr-pull"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.task-eni"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    }
  ],
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "512",
  "revision": 11,
  "status": "ACTIVE",
  "proxyConfiguration": null,
  "volumes": []
}
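Comments:
Please share your task definition.
@congbaoguier I have just added it.
Did your codepipeline deployment script work before (for example, when deploying as an EC2 task)? What is the output when your deployment script runs? I have not used codepipeline before. If this pipeline requires an IAM role, are you sure that role has all the required permissions?
Answer 1:
Solved my problem. The error stemmed from the arguments I was passing to my Bitbucket pipeline.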
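I had an environment variable in my pipeline to populate the execution role. What I did not know was that all I needed to pass to the AWS CLI for aws ecs register-task-definition was the name of the role, not the full ARN, as shown below: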
"executionRoleArn": "arn:aws:iam::<Account-id>:role/:arn:aws:iam::<Account-id>:role/ecsTaskExecutionRole"
It should instead be:
"executionRoleArn": "arn:aws:iam::<Account-id>:role/ecsTaskExecutionRole"
Because the malformed ARN could not be resolved, an error was thrown saying that the role did not have the correct permissions.
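A doubled prefix like the one above can be caught before deployment by checking the executionRoleArn value of the registered task definition. A minimal sketch, assuming a simplified pattern for IAM role ARNs; is_role_arn is a hypothetical helper name and 123456789012 is a placeholder account id:

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the argument looks like a single,
# well-formed IAM role ARN, 1 otherwise. The pattern is a simplification:
# it allows an optional path after "role/" but no further colons.
is_role_arn() {
  local re='^arn:aws[a-z-]*:iam::[0-9]{12}:role/[A-Za-z0-9+=,.@_/-]+$'
  [[ $1 =~ $re ]]
}

GOOD="arn:aws:iam::123456789012:role/ecsTaskExecutionRole"
BAD="arn:aws:iam::123456789012:role/:arn:aws:iam::123456789012:role/ecsTaskExecutionRole"

is_role_arn "$GOOD" && echo "ok: $GOOD"
is_role_arn "$BAD" || echo "rejected: $BAD"
```

The check is meant for the final ARN stored in the task definition, not for the value passed on the command line, so a bare role name is also rejected by it.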