IoT Edge fails to start after abrupt host shutdown

Pavel Livshits 10 Reputation points
2023-05-10T09:13:12.9866667+00:00

Hi,

I'm using IoT Hub for our IoT setup, running on an Ubuntu Linux machine. This is the second time that, after an abrupt power cut, iotedge completely fails to start on boot, complaining about a read-only file system (see logs below). I tried several options but nothing helped. Only after completely removing and reinstalling iotedge could I get the system back to life.

Has anyone experienced the same? Any suggestions?

For reference, the reinstall that got it working again was:

sudo apt-get autoremove --purge aziot-edge
and then
sudo apt-get update; sudo apt-get install aziot-edge defender-iot-micro-agent-edge

Logs:

$ sudo iotedge system logs -- -f

-- Logs begin at Tue 2023-02-28 02:37:19 UTC. --

May 10 05:34:36 aziot-edged[2779421]: 2023-05-10T05:34:36Z [WARN] - container runtime error

May 10 05:34:36 aziot-edged[2779421]: Caused by:

May 10 05:34:36 aziot-edged[2779421]: HTTP 500 Internal Server Error: error while creating mount source path '/var/lib/aziot/edged/mnt/edgeAgent.sock': mkdir /var/lib/aziot: read-only file system

May 10 05:34:36 aziot-edged[2779421]: 2023-05-10T05:34:36Z [WARN] - Error in watchdog: Failed to start Edge runtime: runtime operation error: start module "edgeAgent"

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Watchdog checking Edge runtime status

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Edge runtime status is stopped, starting module now...

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Starting module edgeAgent...

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Starting new listener for module edgeAgent

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Starting workload API...

May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Workload API stopped

May 10 05:35:48 aziot-edged[2779421]: 2023-05-10T05:35:48Z [WARN] - container runtime error

May 10 05:35:48 aziot-edged[2779421]: Caused by:

May 10 05:35:48 aziot-edged[2779421]: HTTP 500 Internal Server Error: error while creating mount source path '/var/lib/aziot/edged/mnt/edgeAgent.sock': mkdir /var/lib/aziot: read-only file system

May 10 05:35:48 aziot-edged[2779421]: 2023-05-10T05:35:48Z [WARN] - Error in watchdog: Failed to start Edge runtime: runtime operation error: start module "edgeAgent"

Thanks,

Pavel

Azure IoT Edge
An Azure service that is used to deploy cloud workloads to run on internet of things (IoT) edge devices via standard containers.
Azure
A cloud computing platform and infrastructure for building, deploying and managing applications and services through a worldwide network of Microsoft-managed datacenters.
Azure IoT Hub
An Azure service that enables bidirectional communication between internet of things (IoT) devices and applications.

2 answers

  1. AshokPeddakotla-MSFT 27,801 Reputation points
    2023-05-10T13:49:43.7266667+00:00

    @Pavel Livshits Welcome to Microsoft Q&A forum!

    I'm using IoT Hub for our IoT setup, running on an Ubuntu Linux machine. This is the second time that, after an abrupt power cut, iotedge completely fails to start on boot, complaining about a read-only file system (see logs below). I tried several options but nothing helped. Only after completely removing and reinstalling iotedge could I get the system back to life.

    We are sorry for the inconvenience this has caused. I understand that reinstalling IoT Edge resolved the issue.

    aziot-edged[2779421]: HTTP 500 Internal Server Error: error while creating mount source path '/var/lib/aziot/edged/mnt/edgeAgent.sock': mkdir /var/lib/aziot: read-only file system

    This error is what prevents the IoT Edge runtime from starting: the container runtime cannot create /var/lib/aziot because the file system is read-only. Read-only file system errors can have several causes, such as file system corruption or hardware issues.
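
    Since the log points at a read-only file system, one quick check is whether the mount holding /var/lib is currently read-only. The following is a minimal sketch, not an official IoT Edge procedure: the function names are hypothetical, and it assumes findmnt from util-linux is installed. Many Ubuntu installs mount the root file system with errors=remount-ro, so file system errors after a power cut can switch it to read-only; whether that is what happened here is an assumption.

    ```shell
    # Sketch: check whether the filesystem holding a path is mounted read-only.
    # Function names are hypothetical; findmnt (util-linux) is assumed to be installed.

    # opts_are_readonly OPTS: succeed if the comma-separated mount options include "ro".
    # Wrapping the list in commas avoids a false match on "errors=remount-ro".
    opts_are_readonly() {
        case ",$1," in
            *,ro,*) return 0 ;;
            *)      return 1 ;;
        esac
    }

    # path_is_readonly PATH: 0 = read-only, 1 = writable, 2 = could not be determined.
    path_is_readonly() {
        opts=$(findmnt -n -o OPTIONS -T "$1" 2>/dev/null) || return 2
        opts_are_readonly "$opts"
    }

    check_path=/var/lib
    rc=0
    path_is_readonly "$check_path" || rc=$?
    case $rc in
        0) echo "$check_path is on a read-only filesystem" ;;
        1) echo "$check_path is on a writable filesystem" ;;
        *) echo "could not determine mount options for $check_path" ;;
    esac
    ```

    If this reports a read-only file system, the kernel log (for example sudo dmesg) may show why the file system was remounted.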

    I suggest reviewing Solutions to common issues for Azure IoT Edge, which describes a common file system related issue and its resolution.

    Symptoms

    The security daemon fails to start and module containers aren't created. The edgeAgent, edgeHub and other custom modules aren't started by IoT Edge service. In aziot-edged logs, you see this error:

    • The daemon could not start up successfully: Could not start management service
    • caused by: An error occurred for path /var/run/iotedge/mgmt.sock
    • caused by: Permission denied (os error 13)

    Cause

    For all Linux distros except CentOS 7, IoT Edge's default configuration is to use systemd socket activation. A permission error happens if you change the configuration file to not use socket activation but leave the URLs as /var/run/iotedge/*.sock, since the iotedge user can't write to /var/run/iotedge, meaning it can't unlock and mount the sockets itself.

    Solution

    You don't need to disable socket activation on a distribution where socket activation is supported. However, if you prefer to not use socket activation at all, put the sockets in /var/lib/iotedge/.

    1. Run systemctl disable iotedge.socket iotedge.mgmt.socket to disable the socket units so that systemd doesn't start them unnecessarily.
    2. Change the iotedge config to use /var/lib/iotedge/*.sock in both the connect and listen sections.
    3. If you already have modules, they have the old /var/run/iotedge/*.sock mounts, so docker rm -f them.
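
    The steps above can be sketched as a dry-run script. This is an illustration under assumptions: the run helper and the DRY_RUN variable are hypothetical, the container names edgeAgent and edgeHub are taken from the symptoms above (custom modules would need to be added), and step 2 is left as a manual edit. By default it only prints the commands.

    ```shell
    # Sketch: dry-run of the socket-activation workaround steps.
    # DRY_RUN and run() are hypothetical helpers, not part of IoT Edge.
    DRY_RUN=${DRY_RUN:-1}

    # run CMD...: print the command, and execute it only when DRY_RUN is not 1.
    run() {
        echo "+ $*"
        if [ "$DRY_RUN" != "1" ]; then
            "$@"
        fi
    }

    # Step 1: disable the socket units so systemd doesn't start them.
    run sudo systemctl disable iotedge.socket iotedge.mgmt.socket

    # Step 2 is manual: edit the iotedge config so the connect and listen
    # sections use /var/lib/iotedge/*.sock.

    # Step 3: remove existing module containers that still carry the old mounts.
    for module in edgeAgent edgeHub; do
        run sudo docker rm -f "$module"
    done
    ```

    Setting DRY_RUN=0 makes the script execute the commands instead of only printing them.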

    Hope this helps. If this answers your query, please click Accept Answer and Yes to mark the answer as helpful. If you have any further questions, do let us know.

    1 person found this answer helpful.

  2. Mohr Clemens 5 Reputation points
    2023-07-20T08:21:30.4+00:00

    I have exactly the same problem. When I restart docker and run config apply everything works again.

    sudo systemctl restart docker
    sudo iotedge config apply
    

    The next time I reboot the System, I get the error message again. Nothing has been changed in the socket activation.
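
    To confirm after a reboot that it is the same error, the log can be filtered for the read-only message. A minimal sketch: count_ro_errors is a hypothetical helper name, and the two sample lines are copied from the log in the question.

    ```shell
    # Sketch: count log lines that report a read-only filesystem.
    # count_ro_errors is a hypothetical helper; it reads log text on stdin.
    count_ro_errors() {
        # grep -c exits non-zero when nothing matches, so ignore its status.
        grep -c 'read-only file system' || true
    }

    # Two sample lines from the question's log; only the second one matches.
    printf '%s\n' \
        "May 10 05:35:31 aziot-edged[2779421]: 2023-05-10T05:35:31Z [INFO] - Starting module edgeAgent..." \
        "May 10 05:35:48 aziot-edged[2779421]: HTTP 500 Internal Server Error: error while creating mount source path '/var/lib/aziot/edged/mnt/edgeAgent.sock': mkdir /var/lib/aziot: read-only file system" \
        | count_ro_errors
    # prints 1
    ```

    On the device, the same function could be fed from sudo iotedge system logs, assuming that command prints the log and exits when -f is not passed.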

    1 person found this answer helpful.