frequent MQTT disconnections with cloud in Python Azure IOT SDK

Satish Narayana 0 Reputation points
2024-03-19T13:01:22.4466667+00:00

Frequent MQTT disconnections with azure device client. This happens whenever low-level Rx packet loss (WIFI) while communicating with azure IOT cloud device client.

  • Python Azure IOT SDK.
  • VERSION = "2.12.0"
  • git tag : 2022-08-23
  1. We are observing frequent MQTT disconnections, it’s trying to reconnect after 6-7 minutes.
  2. We are seeing MQTT receiving payload failure this may be because of packet loss observed in low level network(wifi) driver.

Azure disconnections error log: [ 4710.753541] python3.8[7287]: [CLOUD MGR:03-05 14:03:11:DEBUG:2529]:client:_easy_log: Sending PINGREQ [ 4711.002754] python3.8[7287]: [CLOUD MGR:03-05 14:03:11:DEBUG:2529]:client:_easy_log: Received PINGRESP

[ 4741.178599] python3.8[7287]: [CLOUD MGR:03-05 14:03:41:DEBUG:2529]:client:_easy_log: Sending PINGREQ [ 4751.786725] python3.8[507]: PING 8.8.8.8 (8.8.8.8): 56 data bytes [ 4751.786995] python3.8[507]: --- 8.8.8.8 ping statistics --- [ 4751.787101] python3.8[507]: 3 packets transmitted, 0 packets received, 100% packet loss [ 4756.795731] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:2529]:client:_easy_log: Received PUBLISH (d0, q0, r0, m0), '$iothub/methods/POST/ExecuteControlCommand /?$rid=29', ... (492 bytes) [ 4756.804362] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:INFO:0253]:mqtt_transport:on_message: message received on $iothub/methods/POST/ExecuteControlCommand/?$rid=2 9 [ 4756.816165] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_mqtt_message_received in pipeline thread [ 4756.824252] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0337]:pipeline_stages_mqtt:_on_mqtt_message_received: MQTTTransportStage: message received on topic $i othub/methods/POST/ExecuteControlCommand/?$rid=29 [ 4756.827691] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_pipeline_event in callback thread [ 4756.849254] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0337]:sync_handler_manager:_receiver_handler_runner: HANDLER RUNNER (_on_method_request_received): Inv oking handler [ 4756.859652] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:INFO:0036]:azure_receive_handler:received_message_handler: Executing Control Command [ 4756.859913] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:INFO:0046]:azure_receive_handler:received_message_handler: <----- {"metadata": {"msg_type": "CTL"}, "request ": {"req_command": "toggle_favorite", "response_msg_type": "json", "request_params": null, "cloud_msg_payload": {"data": {"type": "Control", "code": "FAVORITE_CTRL","attributes": [{"id": "2", "name": "favb", "state": "ON"}]}}}} [ 4756.860255] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0079]:external_io_zeromq:send_external_msg: Sending message : {"metadata": {"msg_type": "CTL"}, "reque st": {"req_command": "toggle_favorite", "response_msg_type": "json", "request_params": null, "cloud_msg_payload": {"data": {"type": "Control", "code": "FAVORITE_CTRL", "attributes": [{"id": "2", "name": "favb", "state": "ON"}]}}}} [ 4756.871643] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:INFO:0408]:sync_clients:send_method_response: Sending method response to Hub... [ 4756.876009] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0401]:mqtt_pipeline:send_method_response: Starting SendMethodResponseOperation on the pipeline [ 4756.890352] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0104]:pipeline_thread:wrapper: Starting run_op in pipeline thread [ 4757.235028] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0191]:pipeline_ops_base:spawn_worker_op: SendMethodResponseOperation: creating worker op of type MQTTPublishOperation [ 4757.235124] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0261]:pipeline_stages_mqtt:_run_op: MQTTTransportStage(MQTTPublishOperation): publishing on $iothub/methods/res/200/?$rid=29 [ 4757.235198] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:INFO:0541]:mqtt_transport:publish: publishing on $iothub/methods/res/200/?$rid=29 [ 4757.235273] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:2529]:client:_easy_log: Sending PUBLISH (d0, q1, r0, m223), 'b'$iothub/methods/res/200/?$rid=29'', ...(4 bytes) [ 4757.235343] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0550]:mqtt_transport:publish: _mqtt_client.publish returned rc=0 [ 4757.235409] python3.8[7287]: [CLOUD MGR:03-05 14:03:57:DEBUG:0594]:mqtt_transport:establish_operation: Waiting for response on MID: 223 [ 4771.456567] python3.8[7287]: [CLOUD MGR:03-05 14:04:11:DEBUG:0023]:cloud_sysd_mgr:notify_systemd: Systemd Notified : Cloud Mgr [ 4771.937553] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0204]:mqtt_transport:on_disconnect: disconnected with result code: 16 [ 4772.087988] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0208]:mqtt_transport:on_disconnect: File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap [ 4772.088275] python3.8[7287]: self._bootstrap_inner() [ 4772.088378] python3.8[7287]: File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner [ 4772.088457] python3.8[7287]: self.run() [ 4772.088529] python3.8[7287]: File "/usr/lib/python3.8/threading.py", line 870, in run [ 4772.088597] python3.8[7287]: self._target(*self._args, **self._kwargs) [ 4772.088666] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 3591, in _thread_main [ 4772.088734] python3.8[7287]: self.loop_forever(retry_first_connection=True) [ 4772.088799] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 1756, in loop_forever [ 4772.088863] python3.8[7287]: rc = self._loop(timeout) [ 4772.088929] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 1185, in _loop [ 4772.088995] python3.8[7287]: return self.loop_misc() [ 4772.089059] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 1610, in loop_misc [ 4772.089123] python3.8[7287]: self._check_keepalive() [ 4772.089194] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 2560, in _check_keepalive [ 4772.089258] python3.8[7287]: self._do_on_disconnect(rc) [ 4772.089472] python3.8[7287]: File "/usr/lib/python3.8/site-packages/paho/mqtt/client.py", line 3475, in _do_on_disconnect [ 4772.089608] python3.8[7287]: on_disconnect(self, self._userdata, rc) [ 4772.089679] python3.8[7287]: File "/usr/lib/python3.8/site-packages/azure/iot/device/common/mqtt_transport.py", line 208, in on_disconnect [ 4772.089745] python3.8[7287]: logger.debug("".join(traceback.format_stack())) [ 4772.096111] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0295]:mqtt_transport:_force_transport_disconnect_and_cleanup: Forcing paho disconnect to prevent it fro m automatically reconnecting [ 4772.108473] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0314]:mqtt_transport:_force_transport_disconnect_and_cleanup: in paho thread. nulling _thread [ 4772.120756] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0317]:mqtt_transport:_force_transport_disconnect_and_cleanup: Done forcing paho disconnect [ 4772.131663] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_mqtt_disconnected in pipeline thread [ 4772.150808] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0204]:mqtt_transport:on_disconnect: disconnected with result code: 0 [ 4772.160699] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0398]:pipeline_stages_mqtt:_on_mqtt_disconnected: MQTTTransportStage: _on_mqtt_disconnect called: Clien t or broker did not communicate in the keepalive interval. [ 4772.160954] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:1139]:pipeline_stages_base:_handle_pipeline_event: ConnectionStateStage(DisconnectedEvent): State chan ges CONNECTED -> DISCONNECTED. Attempting to reconnect [ 4772.164237] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_mqtt_disconnected in pipeline thread [ 4772.175905] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0310]:pipeline_stages_base:_handle_pipeline_event: PipelineRootStage: DisconnectedEvent received. Call ing on_disconnected_handler [ 4772.186868] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_disconnected in callback thread [ 4772.205217] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0104]:pipeline_thread:wrapper: Starting on_reconnect_timer_expired in pipeline thread [ 4772.207427] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0138]:abstract_clients:_on_disconnected: Connection State - Disconnected [ 4772.211749] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0147]:abstract_clients:_on_disconnected: Cleared all pending method requests due to disconnect [ 4772.225863] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0365]:sync_handler_manager:_client_event_handler_runner: HANDLER RUNNER (CLIENT EVENT): <azure.iot.dev ice.iothub.client_event.ClientEvent object at 0xae6b7e68> event received. Invoking <bound method AzureCloudConnMgr.handle_on_connection_state_change of <cloud_provider.azure.azure_cloud_conn_mgr.AzureCloudConnMgr object at 0xae6df568>> handler [ 4772.226898] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0444]:pipeline_stages_mqtt:_on_mqtt_disconnected: MQTTTransportStage: Unexpected disconnect (no pending connection op) [ 4772.229529] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0328]:pipeline_stages_base:_handle_pipeline_event: PipelineRootStage: BackgroundExceptionEvent receive d. Calling on_background_exception_handler [ 4772.246178] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:DEBUG:0104]:pipeline_thread:wrapper: Starting _on_background_exception in callback thread [ 4772.267531] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0144]:azure_cloud_conn_mgr:handle_on_connection_state_change: handle_on_connection_state_change disconn ected. reconnecting. [ 4772.267786] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0146]:azure_cloud_conn_mgr:handle_on_connection_state_change: Network is Down!!! [ 4772.269590] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:INFO:0426]:oob_cb:send_no_network: Azure Client is disconnected from Azure Hub [ 4772.275210] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:WARNING:0028]:handle_exceptions:handle_background_exception: Exception caught in background thread. Unable to handle. [ 4772.276419] python3.8[7287]: [CLOUD MGR:03-05 14:04:12:WARNING:0029]:handle_exceptions:handle_background_exception: ['azure.iot.device.common.transport_exceptions. ConnectionDroppedError: Unexpected disconnection\n']

Please find the attached file for the error log with retry cloud connection.Cloud_disconnection.txt

Azure IoT SDK
Azure IoT SDK
An Azure software development kit that facilitates building applications that connect to Azure IoT services.
222 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. LeelaRajeshSayana-MSFT 15,801 Reputation points Microsoft Employee
    2024-03-19T21:42:14.03+00:00

    Hi @Satish Narayana Greetings! Welcome to Microsoft Q&A forum. Thank you for posting this question here.

    Inspecting the logs you have provided and looking at the behavior if the issue stated in the question, we can conclude that spotty network on the device end is causing this behavior.

    Here are a few things you can try to improve the behavior of the device functioning.

    1. Increase the keep_alive time of the Device Client connection. This parameter controls the rate at which the client will send ping messages to the broker. You can configure this setting by adding the 'keep_alive' parameter when you create the device client using create_from_connection_string method
    2. Embed a Retry pattern in the code to prevent frequent reconnect requests by catching ConnectionFailedError exceptions in the code. Here is a sample implementation in Python which has a retry pattern based on CredentialError. You can refer a sample code to catch connection failures in the code.
    3. Create a device-side buffer to capture the generated data when the device disconnects so that the data can be transferred once the connection re-establishes.

    If you anticipate the network on the device end to be spotty, you may consider opting for alternate approaches such as using a gateway device and sharing the data to the IoT Hub through the Gateway. Here is an article that provides you the steps on the configurations needed for this architecture Configure an IoT Edge device to act as a transparent gateway. Hope this helps. Please let us know if you need any additional assistance or further guidance on this issue.


    If the response helped, please do click Accept Answer and Yes for the answer provided. Doing so would help other community members with similar issue identify the solution. I highly appreciate your contribution to the community.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.