Diving deeper into cloud-init
Applies to: ✔️ Linux VMs ✔️ Flexible scale sets
To learn more about cloud-init or troubleshoot it at a deeper level, you need to understand how it works. This document highlights the important parts, and explains the Azure specifics.
When cloud-init is included in a generalized image and a VM is created from that image, it processes configurations and runs through five stages during the initial boot. These stages show you at what point cloud-init applies its configurations.
Understand Cloud-Init configuration
Configuring a VM to run on a platform means that cloud-init has to apply multiple configurations. The main configuration you interact with is User data (customData), which supports multiple formats. For more information, see User-Data Formats in the cloud-init 21.2 documentation. You can also add and run scripts (/var/lib/cloud/scripts) for other configuration.
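As a minimal sketch of one user data format, the following shell function writes a small cloud-config file. The function name write_cloud_config, the default file name, and the nginx package are illustrative assumptions, not something this article prescribes.

```shell
# Hypothetical helper: write a minimal cloud-config user data file.
# The first line must be "#cloud-config" so cloud-init treats the
# file as cloud-config YAML rather than another user data format.
write_cloud_config() {
  out_file="${1:-cloud-init.txt}"   # assumed default file name
  cat > "$out_file" <<'EOF'
#cloud-config
package_upgrade: true
packages:
  - nginx
EOF
}
```

The resulting file could then be supplied as customData, for example through the --custom-data parameter of az vm create.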
Preconfigured Azure Marketplace images
Some configurations are already baked into Azure Marketplace images that come with cloud-init.
Cloud data source - cloud-init contains code that can interact with cloud platforms. These pieces of code are called 'datasources'. When a VM is created from a cloud-init image in Azure, cloud-init loads the Azure datasource, which interacts with the Azure metadata endpoints to get the VM-specific configuration.
Runtime config (/run/cloud-init).
Image config (/etc/cloud), like /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/*.cfg. As an example of where this configuration is used in Azure, it's common for Linux OS images with cloud-init to include an Azure datasource directive that tells cloud-init which datasource to use. This configuration saves cloud-init time:
sudo cat /etc/cloud/cloud.cfg.d/90_dpkg.cfg
# to update this file, run dpkg-reconfigure cloud-init
datasource_list: [ Azure ]
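To check which datasource directive an image carries without opening each file, a small helper can search the image config files. This is only a sketch: show_datasource_list is a hypothetical name, and it assumes the directive appears at the start of a line, as in the 90_dpkg.cfg example.

```shell
# Hypothetical helper: print every datasource_list directive found in the
# image config, prefixed with the file it came from.
show_datasource_list() {
  cfg_root="${1:-/etc/cloud}"
  for f in "$cfg_root/cloud.cfg" "$cfg_root"/cloud.cfg.d/*.cfg; do
    # Skip paths that don't exist (including an unmatched glob).
    [ -f "$f" ] || continue
    # -H prints the file name; a file without the directive is not an error.
    grep -H '^datasource_list' "$f" || true
  done
}
```

On a VM, reading these files may require root, so the function would typically be run from a root shell.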
Cloud-init boot stages (processing configuration)
When you're provisioning VMs with cloud-init, there are five configuration boot stages. The output from these stages is visible in the logs.
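One way to find the stage output in the logs is to search for the lines cloud-init writes when a stage starts. The sketch below assumes the log is at /var/log/cloud-init.log and that stage starts are logged as running 'init-local', running 'init', running 'modules:config', and running 'modules:final'. Both are assumptions to verify on your image, and the generator stage isn't matched by this pattern. show_stage_starts is a hypothetical helper name.

```shell
# Hypothetical helper: list the log lines that mark the start of a stage.
# The log path and the quoted stage names are assumptions about the log format.
show_stage_starts() {
  log_file="${1:-/var/log/cloud-init.log}"
  grep -E "running '(init-local|init|modules:config|modules:final)'" "$log_file"
}
```

Running show_stage_starts with no argument reads the assumed default log. Pass another path to inspect a copied log file.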
Generator Stage: The cloud-init systemd generator starts and determines whether cloud-init should be included in the boot goals. If so, it enables cloud-init.
Cloud-init Local Stage: Here, cloud-init looks for the local "Azure" datasource, which enables cloud-init to interface with Azure and apply a networking configuration, including fallback.
Cloud-init init Stage (Network): Networking should be online, and the NIC and route table information should be generated. At this stage, the modules listed in cloud_init_modules in /etc/cloud/cloud.cfg are run. The VM in Azure is mounted, the ephemeral disk is formatted, the hostname is set, along with other tasks. The following are some of the cloud_init_modules:
- migrator
- seed_random
- bootcmd
- write-files
- growpart
- resizefs
- disk_setup
- mounts
- set_hostname
- update_hostname
- ssh
After this stage, cloud-init sends a signal to the Azure platform that the VM has been provisioned successfully. Some modules may have failed; however, not all module failures automatically result in a provisioning failure.
Cloud-init Config Stage: At this stage, the modules in cloud_config_modules, defined and listed in /etc/cloud/cloud.cfg, run.
Cloud-init Final Stage: At this final stage, the modules in cloud_final_modules, listed in /etc/cloud/cloud.cfg, run. Modules that need to run late in the boot process run here, such as package installations and run scripts.
- During this stage, you can run scripts by placing them in the directories under /var/lib/cloud/scripts:
  - per-boot - scripts within this directory run on every reboot
  - per-instance - scripts within this directory run when a new instance is first booted
  - per-once - scripts within this directory run only once
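Placing a script in one of these directories can be sketched as follows. install_cloud_script is a hypothetical helper, not a cloud-init command. It copies a script into the chosen directory and marks it executable, on the assumption that cloud-init only runs executable files there.

```shell
# Hypothetical helper: copy a script into one of the cloud-init script
# directories and make it executable.
install_cloud_script() {
  kind="$1"                                  # per-boot, per-instance, or per-once
  src="$2"                                   # path of the script to install
  root="${3:-/var/lib/cloud/scripts}"        # overridable for testing
  case "$kind" in
    per-boot|per-instance|per-once) ;;
    *) echo "unknown script directory: $kind" >&2; return 1 ;;
  esac
  mkdir -p "$root/$kind"
  # install copies the file and sets its mode in one step.
  install -m 0755 "$src" "$root/$kind/"
}
```

For example, install_cloud_script per-boot ./hello.sh would place a hypothetical hello.sh where it runs on every boot. Writing under /var/lib/cloud typically requires root.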
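To see which modules each stage runs on a given image, the cloud_init_modules, cloud_config_modules, and cloud_final_modules lists can be read from /etc/cloud/cloud.cfg. The following is a rough sketch rather than a YAML parser: list_modules is a hypothetical helper that assumes each list is written as a top-level key followed by "- item" lines.

```shell
# Hypothetical helper: print the entries of one module list from cloud.cfg.
# Assumes "key:" on its own line followed by "- item" lines (not a YAML parser).
list_modules() {
  key="$1"                                   # for example, cloud_init_modules
  cfg="${2:-/etc/cloud/cloud.cfg}"
  awk -v key="$key" '
    # Start collecting at the requested top-level key.
    $0 ~ ("^" key ":[ \t]*$") { in_list = 1; next }
    # Print each list item without its leading dash.
    in_list && /^[ \t]*-/ { sub(/^[ \t]*-[ \t]*/, ""); print; next }
    # Any other top-level, non-comment line ends the list.
    in_list && /^[^ \t#]/ { in_list = 0 }
  ' "$cfg"
}
```

For example, list_modules cloud_final_modules prints the modules that run in the final stage on that image.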