ansible-playbook 在设置时挂起

Posted

技术标签:

【中文标题】ansible-playbook 在设置时挂起【英文标题】:ansible-playbook hangs at setup 【发布时间】:2017-01-31 15:37:01 【问题描述】:

我正在尝试运行 ansible-plabook,但它在设置时挂起。我的剧本做了很多工作,比如调用不同的角色和模块,它还收集事实。它以前工作正常,但现在我不确定出了什么问题,感谢任何帮助

主机操作系统为 RHEL 7 在这些系统之间设置了 ssh 无密码身份验证 我的清单文件仅包含 1 个主机系统

我运行的命令是

 ansible-playbook  -i /tmp/tmpBo5Xmj -vvvvv playbook.yml -c ssh

这里是详细日志

TASK [setup] *******************************************************************
<172.17.239.193> ESTABLISH SSH CONNECTION FOR USER: ansible
<172.17.239.193> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ansible_password/ansible_ssh_pass not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
<172.17.239.193> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.239.193> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.239.193> SSH: PlayContext set ssh_common_args: ()
<172.17.239.193> SSH: PlayContext set ssh_extra_args: ()
<172.17.239.193> SSH: EXEC ssh -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ansible -o ConnectTimeout=10 172.17.239.193 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801 `" && echo ansible-tmp-1474582282.38-93511913696801="` echo $HOME/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801 `" ) && sleep 0'"'"''
<172.17.239.193> PUT /tmp/tmpAKnqv6 TO /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup
<172.17.239.193> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ansible_password/ansible_ssh_pass not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
<172.17.239.193> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.239.193> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.239.193> SSH: PlayContext set ssh_common_args: ()
<172.17.239.193> SSH: PlayContext set sftp_extra_args: ()
<172.17.239.193> SSH: EXEC sftp -b - -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ansible -o ConnectTimeout=10 '[172.17.239.193]'
<172.17.239.193> ESTABLISH SSH CONNECTION FOR USER: ansible
<172.17.239.193> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.239.193> SSH: ansible_password/ansible_ssh_pass not set: (-o)(KbdInteractiveAuthentication=no)(-o)(PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey)(-o)(PasswordAuthentication=no)
<172.17.239.193> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.239.193> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.239.193> SSH: PlayContext set ssh_common_args: ()
<172.17.239.193> SSH: PlayContext set ssh_extra_args: ()
<172.17.239.193> **SSH: EXEC ssh -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ansible -o ConnectTimeout=10 -tt 172.17.239.193 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-njtihbebdvbpospbpivnpwbhrqtnfylc; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/" > /dev/null 2>&1'"'"'"'"'"'"'"'"' && sleep 0'"'"''**

在目标系统上,我可以看到以下 python 进程正在运行

[root@odcrac01 ~]# ps -ef | grep python| grep ansible
ansible  12600 12568  0 07:18 pts/0    00:00:00 /bin/sh -c sudo -H -S  -p "[sudo via ansible, key=tdtazugynuyekapktrkwjrwuawfvgkme] password: " -u root /bin/sh -c 'echo BECOME-SUCCESS-tdtazugynuyekapktrkwjrwuawfvgkme; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/" > /dev/null 2>&1' && sleep 0
root     12613 12600  0 07:18 pts/0    00:00:00 sudo -H -S -p [sudo via ansible, key=tdtazugynuyekapktrkwjrwuawfvgkme] password:  -u root /bin/sh -c echo BECOME-SUCCESS-tdtazugynuyekapktrkwjrwuawfvgkme; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/" > /dev/null 2>&1
root     12614 12613  0 07:18 pts/0    00:00:00 /bin/sh -c echo BECOME-SUCCESS-tdtazugynuyekapktrkwjrwuawfvgkme; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/" > /dev/null 2>&1
root     12615 12614  0 07:18 pts/0    00:00:00 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582204.64-194542154309618/setup
root     12616 12615  0 07:18 pts/0    00:00:00 /usr/bin/python /tmp/ansible_0loivr/ansible_module_setup.py
ansible  15436 15435  0 07:20 pts/1    00:00:00 /bin/sh -c sudo -H -S -n -u root /bin/sh -c 'echo BECOME-SUCCESS-njtihbebdvbpospbpivnpwbhrqtnfylc; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/" > /dev/null 2>&1' && sleep 0
root     15449 15436  0 07:20 pts/1    00:00:00 sudo -H -S -n -u root /bin/sh -c echo BECOME-SUCCESS-njtihbebdvbpospbpivnpwbhrqtnfylc; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/" > /dev/null 2>&1
root     15450 15449  0 07:20 pts/1    00:00:00 /bin/sh -c echo BECOME-SUCCESS-njtihbebdvbpospbpivnpwbhrqtnfylc; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/" > /dev/null 2>&1
root     15451 15450  0 07:20 pts/1    00:00:00 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474582282.38-93511913696801/setup
root     15452 15451  0 07:20 pts/1    00:00:00 /usr/bin/python /tmp/ansible_PJZfVt/ansible_module_setup.py

这是一个简单的剧本,我设置了成为:是和成为用户:root,不知何故,当我设置成为:是的它不起作用,它挂起

 - name: list files in target system
   hosts: clonedb
   user: ansible
   become: yes
   become_user: root
   gather_facts: yes
   tasks:
   - name: list files in target system
     command: ls
     always_run: true
     tags: list

如果我评论 become 和 become_user 它工作正常。我已将用户 ansible 添加到目标系统的 sudoers 列表中,但它仍然挂起

在“ansible”用户的目标系统上,我已通过将其添加到 sudoers 列表来授予 sudo 权限

ansible         ALL=(ALL)       NOPASSWD: ALL

我尝试在目标系统上以 ansible 用户身份运行 sudo 命令,它工作正常

[ansible@odcrac01 ~]$ sudo ls ~root
anaconda-ks.cfg  cvuqdisk-1.0.9-1.rpm  install.log  install.log.syslog  remove_disk.sh

但在另一个系统上它工作正常

(virtualapp) [ansible@OEL72-37-70 lib]$ python odcansible.py
sys path:['/home/ansible/virtualapp/pypi_portal/lib', '/home/ansible/virtualapp/lib64/python27.zip', '/home/ansible/virtualapp/lib64/python2.7', '/home/ansible/virtualapp/lib64/python2.7/plat-linux2', '/home/ansible/virtualapp/lib64/python2.7/lib-tk', '/home/ansible/virtualapp/lib64/python2.7/lib-old', '/home/ansible/virtualapp/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7', '/usr/lib/python2.7', '/home/ansible/virtualapp/lib/python2.7/site-packages']
PLAY [create temporary directory in target system] *****************************

TASK [setup] *******************************************************************
<172.17.58.95> ESTABLISH SSH CONNECTION FOR USER: ansible
<172.17.58.95> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)(-o)(IdentitiesOnly=yes)(-o)(ControlMaster=auto)(-o)(ControlPersist=60s)
<172.17.58.95> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.58.95> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.58.95> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.58.95> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r)
<172.17.58.95> SSH: EXEC sshpass -d14 ssh -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=ansible -o ConnectTimeout=10 -o ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r 172.17.58.95 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1474661320.71-273658467725557 `" && echo ansible-tmp-1474661320.71-273658467725557="` echo $HOME/.ansible/tmp/ansible-tmp-1474661320.71-273658467725557 `" ) && sleep 0'"'"''
<172.17.58.95> PUT /tmp/tmpITvUgQ TO /home/ansible/.ansible/tmp/ansible-tmp-1474661320.71-273658467725557/setup
<172.17.58.95> SSH: disable batch mode for sshpass: (-o)(BatchMode=no)
<172.17.58.95> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)(-o)(IdentitiesOnly=yes)(-o)(ControlMaster=auto)(-o)(ControlPersist=60s)
<172.17.58.95> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.58.95> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.58.95> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.58.95> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r)
<172.17.58.95> SSH: EXEC sshpass -d14 sftp -o BatchMode=no -b - -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=ansible -o ConnectTimeout=10 -o ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r '[172.17.58.95]'
<172.17.58.95> ESTABLISH SSH CONNECTION FOR USER: ansible
<172.17.58.95> SSH: ansible.cfg set ssh_args: (-o)(UserKnownHostsFile=/dev/null)(-o)(StrictHostKeyChecking=no)(-o)(IdentitiesOnly=yes)(-o)(ControlMaster=auto)(-o)(ControlPersist=60s)
<172.17.58.95> SSH: ANSIBLE_HOST_KEY_CHECKING/host_key_checking disabled: (-o)(StrictHostKeyChecking=no)
<172.17.58.95> SSH: ANSIBLE_REMOTE_USER/remote_user/ansible_user/user/-u set: (-o)(User=ansible)
<172.17.58.95> SSH: ANSIBLE_TIMEOUT/timeout set: (-o)(ConnectTimeout=10)
<172.17.58.95> SSH: found only ControlPersist; added ControlPath: (-o)(ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r)
<172.17.58.95> SSH: EXEC sshpass -d14 ssh -C -vvv -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=ansible -o ConnectTimeout=10 -o ControlPath=/home/ansible/.ansible/cp/ansible-ssh-%h-%p-%r -tt 172.17.58.95 '/bin/sh -c '"'"'sudo -H -S  -p "[sudo via ansible, key=wcazqfwywctzrpesmznhbpbibluqmkqg] password: " -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-wcazqfwywctzrpesmznhbpbibluqmkqg; LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python /home/ansible/.ansible/tmp/ansible-tmp-1474661320.71-273658467725557/setup; rm -rf "/home/ansible/.ansible/tmp/ansible-tmp-1474661320.71-273658467725557/" > /dev/null 2>&1'"'"'"'"'"'"'"'"' && sleep 0'"'"''
ok: [172.17.58.95]

【问题讨论】:

sudo 正在运行似乎很奇怪,因为它仍处于设置步骤。但这让我认为问题实际上出在您的 sudoers 配置上。 我也面临这个问题,从 ssh 密钥恢复到普通用户名/密码有帮助..你有合适的解决方案吗? 【参考方案1】:

你能澄清一下你的问题吗?

您说您的目标是收集有关主机的事实 - 但您的剧本并未反映这一点。

如果您的唯一目标是收集有关主机的事实,您可以使用设置模块来完成该任务。您也不需要剧本来收集有关主机的事实。

ansible clonedb -m setup -u ansible

上面的 ad-hoc 命令使用用户“ansible”为您的“clonedb”主机组收集事实来进行身份验证。如果您不使用 SSH 密钥对服务器进行身份验证,则还需要传递“-k”选项以提示输入 SSH 密码。

不过,收集事实的最佳方式是通过剧本。您可以进一步简化您的剧本,只需执行以下操作:

---
- hosts: clonedb
  user: ansible

  tasks:
  - name: gather facts
    action: setup

您不需要特权帐户来收集有关主机的事实。

“gather_facts”选项默认设置为 True。除非您在 ansible.cfg 中明确将其设置为“False”,否则无需在您的 playbook 中指定它。

您应该将事实存储在 redis 中或通过 json 文件存储,因为一旦剧本完成,事实将从内存中删除。

http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching

编辑:

剧本的简化版本:

---
- hosts: clonedb
  user: ansible
  become: yes

  tasks:
  - name: list files
    command: ls
    always_run: true

    register: listfiles
  - debug: var=listfiles

【讨论】:

感谢您的回复,我做的不仅仅是收集事实。我只是举了一个任务(ls)的例子。我有几个角色,我在那个剧本中称之为。其中一些以 root 用户、oracle 用户、grid 用户身份运行。所以我需要 user become: yes 和 become_user: root ,不知何故这是行不通的。 我建议简化你的剧本。不必使用 become_user: root 也不必使用 become: yes。 become:yes 是一种特权升级。您的剧本在我的测试 VM 上运行,因此远程计算机上的 sudoers 文件可能存在问题。 我更新了我的答案,包括一个简化版本的剧本,它将显示文件列表。 @VivekBasappa - 我无法评论你的答案,所以我会在这里问。你能解释一下为什么我的剧本不起作用吗?我在提供之前进行了测试,效果很好。同样,默认升级用户是 root,因此同时拥有“become: yes”和“become_user: root”是多余的。如果您将“成为”放在您的任务中,您将不得不为您的剧本中需要升级的每个任务调用该模块,这就是为什么您不应该将它放在您的任务中。 我不知道为什么,我仍在试图找出为什么这对我不起作用(仅在一个系统上不起作用)。【参考方案2】:

尝试检查尝试连接到目标计算机的用户主文件夹中的权限。我在我自己的目标机器上运行这个命令/home/ansible:chmod 775 -R /home/ansible。您应该选择您的用户名。

【讨论】:

【参考方案3】:

我遇到了同样的问题,我通过登录远程框并删除由 ansible 创建的目录来解决它:

(remote_box)$ rm -Rf ~/.ansible

这是因为我打断了之前的 ansible 会话。

【讨论】:

【参考方案4】:

检查,问题主机没有损坏的 NFS 挂载。

我在特定主机上遇到了同样的问题,问题是通过运行 setup.py 找到的:

$ python ./setup.py 
(after continuous time)
NFS server 192.168.1.22 not responding still trying

你可以通过 mount 命令检查挂载

mount
...
/mnt/share on 192.168.1.22:/share remote/read/write/setuid/nodevices/rstchow/xattr/zone=dom87zone4/sharezone=2/dev=9640001 on Wed Feb 20 13:23:43 2019

【讨论】:

【参考方案5】:

在做了一些调试和网络搜索后,我发现了这个问题 https://github.com/ansible/ansible/issues/12025

这是在 ansible 2.0 上 如果我在任务挂起之前在主级别设置成为:是的,但如果我在任务中设置它它可以工作

下面的剧本将不起作用

- hosts: clonedb
  user: ansible
  become: yes

  tasks:
  - name: list files
    command: ls
    always_run: true

    register: listfiles
  - debug: var=listfiles

但这一个有效 - 主机:克隆数据库 用户:ansible

  tasks:
  - name: list files
    command: ls
    always_run: true
    become: yes
    become_user: root

    register: listfiles
  - debug: var=listfiles

【讨论】:

以上是关于ansible-playbook 在设置时挂起的主要内容,如果未能解决你的问题,请参考以下文章

centos 7.5 supermicro x11sta-t - 当我设置内核参数 memmap= 内核在启动时挂起

MudBlazor WASM 项目在启动时挂起

在读取 hbase 表时挂起 Mapreduce 作业

linux:启动时挂起进程

Jenkins 从 Git 获取代码时挂起

在Mac上完成多处理时挂起,但不在Windows上挂起