Using cloud-init on an Azure VM to mount a data disk fails

Posted: 2020-07-19 23:13:27

Question:

This is similar to an earlier SO question, from which I adapted my code: "How can i use cloud-init to load a datadisk on an ubuntu VM in azure".

I am using a cloud-config file passed in through Terraform:

    #cloud-config
    disk_setup:
      /dev/disk/azure/scsi1/lun0:
        table_type: gpt
        layout: true
        overwrite: false
    fs_setup:
      - device: /dev/disk/azure/scsi1/lun0
        partition: 1
        filesystem: ext4
    mounts:
      - [
          "/dev/disk/azure/scsi1/lun0-part1",
          "/opt/data",
          auto,
          "defaults,noexec,nofail",
        ]

    data "template_file" "cloudconfig" {
      template = file("${path.module}/cloud-init.tpl")
    }

    data "template_cloudinit_config" "config" {
      gzip          = true
      base64_encode = true

      part {
        content_type = "text/cloud-config"
        content      = "${data.template_file.cloudconfig.rendered}"
      }
    }

    module "nexus_test_vm" {
      # unnecessary details omitted - 1 VM with 1 external disk, fixed lun of 0, ubuntu 18.04
      vm_size             = "Standard_B2S"
      cloud_init_template = data.template_cloudinit_config.config.rendered
    }

Relevant part of the module (VM creation):

    resource "azurerm_virtual_machine" "generic-vm" {
      count                 = var.number
      name                  = "${local.my_name}-${count.index}-vm"
      location              = var.location
      resource_group_name   = var.resource_group_name
      network_interface_ids = [azurerm_network_interface.generic-nic[count.index].id]
      vm_size               = var.vm_size

      delete_os_disk_on_termination = true

      storage_image_reference {
        id = var.image_id
      }

      storage_os_disk {
        name              = "${local.my_name}-${count.index}-os"
        caching           = "ReadWrite"
        create_option     = "FromImage"
        managed_disk_type = "Standard_LRS"
        disk_size_gb      = var.os_disk_size
      }

      os_profile {
        computer_name  = "${local.my_name}-${count.index}"
        admin_username = local.my_admin_user_name
        custom_data    = var.cloud_init_template
      }

      os_profile_linux_config {
        disable_password_authentication = true

        ssh_keys {
          path = "/home/${local.my_admin_user_name}/.ssh/authorized_keys"
          //key_data = tls_private_key.vm_ssh_key.public_key_openssh
          key_data = var.public_key_openssh
        }
      }

      tags = {
        Name        = "${local.my_name}-${count.index}"
        Deployment  = local.my_deployment
        Prefix      = var.prefix
        Environment = var.env
        Location    = var.location
        Volatile    = var.volatile
        Terraform   = "true"
      }
    }

    resource "azurerm_managed_disk" "generic-disk" {
      name                 = "${azurerm_virtual_machine.generic-vm.*.name[0]}-1-generic-disk"
      location             = var.rg_location
      resource_group_name  = var.rg_name
      storage_account_type = "Standard_LRS"
      create_option        = "Empty"
      disk_size_gb         = var.external_disk_size
    }

    resource "azurerm_virtual_machine_data_disk_attachment" "generic-disk" {
      managed_disk_id    = azurerm_managed_disk.generic-disk.id
      virtual_machine_id = azurerm_virtual_machine.generic-vm.*.id[0]
      lun                = 0
      caching            = "ReadWrite"
    }

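I get a lot of strange errors indicating that the disk does not exist when cloud-init runs. However, when I SSH into the VM, the disk is there! Is this a race condition? Is it possible to configure a wait in cloud-init, or can someone give me a better idea of what might be happening?

Relevant logs from the VM:
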
head -n 5000 /var/log/cloud-init.log | grep lun
2020-04-07 16:30:51,296 - cc_disk_setup.py[DEBUG]: Partitioning disks: '/dev/disk/azure/scsi1/lun0': 'layout': True, 'overwrite': False, 'table_type': 'gpt', '/dev/disk/cloud/azure_resource': 'table_type': 'gpt', 'layout': [100], 'overwrite': True, '_origname': 'ephemeral0'
2020-04-07 16:30:51,318 - util.py[DEBUG]: Creating partition on /dev/disk/azure/scsi1/lun0 took 0.021 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,601 - cc_disk_setup.py[DEBUG]: setting up filesystems: ['device': '/dev/disk/azure/scsi1/lun0', 'filesystem': 'ext4', 'partition': 1]
2020-04-07 16:30:51,725 - util.py[DEBUG]: Creating fs for /dev/disk/azure/scsi1/lun0 took 0.124 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,733 - cc_mounts.py[DEBUG]: mounts configuration is [['/dev/disk/azure/scsi1/lun0-part1', '/opt/data', 'auto', 'defaults,noexec,nofail']]
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Attempting to determine the real name of /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: changed /dev/disk/azure/scsi1/lun0-part1 => None
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Ignoring nonexistent named mount /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,736 - cc_mounts.py[DEBUG]: Changes to fstab: ['+ /dev/disk/azure/scsi1/lun0-part1 /opt/data auto defaults,noexec,nofail,comment=cloudconfig 0 2']
ls -l /dev/disk/azure/scsi1/lun0
lrwxrwxrwx 1 root root 12 Apr 7 16:32 /dev/disk/azure/scsi1/lun0 -> ../../../sdc
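
Comments:

Can you show the contents of the module you created?
Updated with the relevant part of the module.
I don't see any storage_data_disk in the VM. How do you attach the data disk to the VM?
Added that too!
OK, I see. You use a separate attachment resource to attach the data disk. I think the ordering is the problem. You could try adding the data disk to the VM resource itself, in a storage_data_disk block.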
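
Answer 1:

I think this problem comes down to the ordering of the data disk, the VM, and cloud-init. As far as I know, cloud-init runs when the VM boots for the first time. In your Terraform configuration the data disk appears to be created and attached after the VM, and therefore after cloud-init has already run, which is why you get the error.

So the solution is to declare the data disk in a storage_data_disk block inside the VM resource, so that the VM is created with the data disk already attached before cloud-init runs.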
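
A minimal sketch of what that could look like, reusing the names from the question's module. The disk name suffix "-data" is a hypothetical choice, and this assumes the inline block replaces the separate azurerm_managed_disk and azurerm_virtual_machine_data_disk_attachment resources rather than being used alongside them for the same disk:

```hcl
resource "azurerm_virtual_machine" "generic-vm" {
  # ... all other arguments and blocks unchanged from the question ...

  # Assumption: with the disk declared inline, the VM is created with the
  # disk already attached, so /dev/disk/azure/scsi1/lun0 should exist by the
  # time cloud-init runs disk_setup, fs_setup and mounts on first boot.
  storage_data_disk {
    name              = "${local.my_name}-${count.index}-data" # hypothetical name
    lun               = 0 # must match the lun0 path in the cloud-config
    caching           = "ReadWrite"
    create_option     = "Empty"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.external_disk_size
  }
}
```

Keeping lun = 0 here matters because the cloud-config refers to the disk by its LUN-based path; if the LUN changes, the paths in disk_setup, fs_setup and mounts would need to change with it.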