D2C messages sending retry mechanism on Azure Sphere

Dmytro Seliverstov 41 Reputation points
2023-02-20T12:33:58.4266667+00:00

Hi, everyone. I am developing an application that runs on Azure Sphere, and I have run into an issue with my current implementation. The main problem is that the SDK destroys any pending messages in the queue upon disconnection. So I ended up implementing a mechanism that checks the send result and retries sending on failure.

I need your help or comments on the resending code. Is there an established practice or an example of similar functionality somewhere? For now, my concerns are the semaphore and the extra thread spawned for every message. I could get rid of the semaphore by simply polling in __resender_handler with "while (iothub_client_h == NULL)", but that is not power efficient.

Here are the code pieces:

static sem_t resender_sem;
static IOTHUB_DEVICE_CLIENT_LL_HANDLE iothub_client_h = NULL;

__attribute__((constructor))
static void init_module(void)
{
    sem_init(&resender_sem, 0, 0);
}

__attribute__((destructor))
static void deinit_module(void)
{
    sem_destroy(&resender_sem);
}

static void SendMessageCallback(IOTHUB_CLIENT_CONFIRMATION_RESULT result, void *context);

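// Thread that retries sending a single message until the client accepts it.
// Caution: the _LL_ client API is not thread-safe, so calls made here must be
// serialized with the calls made from the main loop.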
static int __resender_handler(void* thrd_ctx)
{
    IOTHUB_MESSAGE_HANDLE messageHandle = (IOTHUB_MESSAGE_HANDLE)thrd_ctx;
    int ret = 0;

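    while (1) {
        // Wait until __setup_azure_client() signals that the client handle is valid.
        if (sem_wait(&resender_sem) != 0) {
            continue; // interrupted, try again
        }
        // Hold the semaphore during the call so the handle cannot be destroyed
        // underneath us, then release it for other waiters.
        ret = IoTHubDeviceClient_LL_SendEventAsync(iothub_client_h, messageHandle,
                                                   SendMessageCallback, (void*)messageHandle);
        sem_post(&resender_sem);

        if (ret != IOTHUB_CLIENT_OK) {
            Log_Debug("WARNING: failed to hand over the message to IoTHubClient %d\n", ret);
            sleep(10);
        } else {
            Log_Debug("INFO: IoTHubClient accepted the message for delivery\n");
            break;
        }
    }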

    return 0;
}

static void SendMessageCallback(IOTHUB_CLIENT_CONFIRMATION_RESULT result, void *context)
{
    Log_Debug("INFO: Message received by IoT Hub. Result is: %d\n", result);
    thrd_t thread_handle = NULL;

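    if (result != IOTHUB_CLIENT_CONFIRMATION_OK &&
        thrd_create(&thread_handle, __resender_handler, context) == thrd_success) {
        // The retry thread now owns the message; detach it because it is never joined.
        thrd_detach(thread_handle);
    } else {
        // Delivered, or no retry thread could be started: release the message.
        IoTHubMessage_Destroy((IOTHUB_MESSAGE_HANDLE)context);
    }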
}

// D2C library module for sending telemetry to the cloud
void D2C_send_message(char* payload, size_t size)
{
    if (iothub_client_h == NULL) {
        Log_Debug("ERROR: iothub_client_h is NULL\n");
        return;
    }

    if (payload == NULL) {
        Log_Debug("ERROR: payload is NULL\n");
        return;
    }

    IOTHUB_MESSAGE_HANDLE messageHandle = NULL;
    if (size) {
        Log_Debug("Sending IoT Hub Message, payload is raw data\n");

        messageHandle = IoTHubMessage_CreateFromByteArray(payload, size);
    } else {
        Log_Debug("Sending IoT Hub Message payload: %s\n", payload);

        messageHandle = IoTHubMessage_CreateFromString(payload);
    }

    if (IoTHubDeviceClient_LL_SendEventAsync(iothub_client_h, messageHandle, SendMessageCallback,
                                             (void*)messageHandle) != IOTHUB_CLIENT_OK) {
        Log_Debug("WARNING: failed to hand over the message to IoTHubClient\n");
        IoTHubMessage_Destroy(messageHandle);
    } else {
        Log_Debug("INFO: IoTHubClient accepted the message for delivery\n");
    }
}

// standard connection to cloud set up that runs in the uloop
static void __setup_azure_client(void)
{
    if (iothub_client_h) {
        sem_wait(&resender_sem);
        IoTHubDeviceClient_LL_Destroy(iothub_client_h);
        iothub_client_h = NULL;
    }

    AZURE_SPHERE_PROV_RETURN_VALUE provResult =
        IoTHubDeviceClient_LL_CreateWithAzureSphereDeviceAuthProvisioning(__scope_id, 10000,
                                                                          &iothub_client_h);
    if (provResult.result != AZURE_SPHERE_PROV_RESULT_OK) {
    // timeouts, etc.
    ...
    }
    // setoptions, etc.
    ...
    
    IoTHubDeviceClient_LL_SetConnectionStatusCallback(iothub_client_h,
                                                      __hub_connection_status_callback, NULL);
    sem_post(&resender_sem);
}

Accepted answer
  1. AshokPeddakotla-MSFT 27,491 Reputation points
    2023-02-24T08:42:20.4866667+00:00

    Dmytro Seliverstov, welcome to the Microsoft Q&A forum!

    Apologies for the delayed response.

    I am developing an application that runs on Azure Sphere, and I have run into an issue with my current implementation. The main problem is that the SDK destroys any pending messages in the queue upon disconnection. So I ended up implementing a mechanism that checks the send result and retries sending on failure.

    I understand that you are facing an issue with your Azure Sphere application where the SDK destroys any pending message in the queue upon disconnection.

    I need your help or comments on the resending code. Is there an established practice or an example of similar functionality somewhere?

    Did you consider using IoTHubDeviceClient_LL_SetRetryPolicy for retry logic?

    Currently, the default retry policy in the Azure IoT Device Client C SDK is IOTHUB_CLIENT_RETRY_EXPONENTIAL_BACKOFF_WITH_JITTER (with no timeout), but it can be changed with the following SDK functions.

    In iothub_client module:

    IOTHUB_CLIENT_RESULT IoTHubClient_SetRetryPolicy( IOTHUB_CLIENT_HANDLE iotHubClientHandle, IOTHUB_CLIENT_RETRY_POLICY retryPolicy, size_t retryTimeoutLimitInSeconds);
    

    Or if using the iothub_client_ll module:

    IOTHUB_CLIENT_RESULT IoTHubClient_LL_SetRetryPolicy( IOTHUB_CLIENT_LL_HANDLE iotHubClientHandle, IOTHUB_CLIENT_RETRY_POLICY retryPolicy, size_t retryTimeoutLimitInSeconds);
    

    Note that if retryTimeoutLimitInSeconds is set to 0 (zero), the timeout for retry policies is disabled.
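
To make the policy name and the zero-timeout rule more concrete, here is a rough sketch of how an exponential backoff with jitter can be computed, and how a timeout limit of 0 can mean "retry forever". This is only an illustration, not the SDK's actual implementation: backoff_delay_ms, retry_allowed, and all of their parameters are hypothetical names made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not an SDK function): delay in milliseconds before
 * retry number `attempt` (0-based). The base delay doubles on every attempt
 * and is capped at max_ms; a random jitter of up to jitter_pct percent of
 * the delay is then added. Assumes max_ms is far below UINT32_MAX. */
uint32_t backoff_delay_ms(unsigned attempt, uint32_t base_ms,
                          uint32_t max_ms, uint32_t jitter_pct)
{
    uint32_t delay = base_ms;
    for (unsigned i = 0; i < attempt && delay < max_ms; i++) {
        /* Double the delay, saturating at max_ms instead of overflowing. */
        delay = (delay > max_ms / 2) ? max_ms : delay * 2;
    }
    if (delay > max_ms) {
        delay = max_ms;
    }

    /* Jitter spreads out retries so devices do not reconnect in lockstep. */
    uint32_t span = (uint32_t)(((uint64_t)delay * jitter_pct) / 100);
    uint32_t jitter = span ? (uint32_t)rand() % (span + 1) : 0;
    return delay + jitter;
}

/* Hypothetical helper mirroring the documented rule: a limit of 0 disables
 * the timeout, otherwise retries stop once the limit has elapsed. */
bool retry_allowed(size_t elapsed_s, size_t timeout_limit_s)
{
    return timeout_limit_s == 0 || elapsed_s < timeout_limit_s;
}
```

With a base of 100 ms, a cap of 10 s, and no jitter, the delays would grow as 100, 200, 400, 800 ms and so on until they reach the cap.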

    See Azure IoT Device Client C SDK for more details.

    To learn more about the Azure IoT device SDKs and managing retries, see Retry patterns.

    You can also consider implementing a persistent storage mechanism for pending messages.

    Azure IoT Hub supports several methods for storing messages, including message queues and message routing. By using one of these methods, you could ensure that no messages are lost due to disconnection. The best approach will depend on the specific requirements and constraints of your use case. See "Use IoT Hub message routing to send device-to-cloud messages to different endpoints".
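
On the device side, one possible shape for such a mechanism is a small fixed-size buffer that keeps every message until its delivery is confirmed, so no thread per message is needed. The sketch below is an assumption-heavy illustration: it keeps messages in RAM only (true persistence would also write them to storage), it does not preserve strict send order, and every name in it (pending_queue_t, pending_add, and so on) as well as the capacity and payload limits are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PENDING_CAPACITY 8    /* assumed limit; size it to the available RAM */
#define PAYLOAD_MAX      128  /* assumed maximum payload size in bytes */

typedef enum { SLOT_FREE = 0, SLOT_QUEUED, SLOT_IN_FLIGHT } slot_state_t;

typedef struct {
    slot_state_t state;
    size_t size;
    unsigned char payload[PAYLOAD_MAX];
} pending_slot_t;

typedef struct {
    pending_slot_t slots[PENDING_CAPACITY];
} pending_queue_t;

/* Mark every slot as free. */
void pending_init(pending_queue_t *q)
{
    memset(q, 0, sizeof(*q));
}

/* Copy a payload into a free slot. Returns the slot index, or -1 if the
 * payload is invalid or the buffer is full. */
int pending_add(pending_queue_t *q, const void *payload, size_t size)
{
    if (payload == NULL || size == 0 || size > PAYLOAD_MAX) {
        return -1;
    }
    for (int i = 0; i < PENDING_CAPACITY; i++) {
        if (q->slots[i].state == SLOT_FREE) {
            memcpy(q->slots[i].payload, payload, size);
            q->slots[i].size = size;
            q->slots[i].state = SLOT_QUEUED;
            return i;
        }
    }
    return -1;
}

/* Index of the next message waiting to be handed to the client, or -1. */
int pending_next(const pending_queue_t *q)
{
    for (int i = 0; i < PENDING_CAPACITY; i++) {
        if (q->slots[i].state == SLOT_QUEUED) {
            return i;
        }
    }
    return -1;
}

/* Call after the client has accepted the message for delivery. */
void pending_mark_in_flight(pending_queue_t *q, int slot)
{
    if (slot >= 0 && slot < PENDING_CAPACITY &&
        q->slots[slot].state == SLOT_QUEUED) {
        q->slots[slot].state = SLOT_IN_FLIGHT;
    }
}

/* Call from the send confirmation callback: free the slot on success,
 * put the message back in the queue on failure. */
void pending_on_result(pending_queue_t *q, int slot, bool delivered)
{
    if (slot < 0 || slot >= PENDING_CAPACITY ||
        q->slots[slot].state != SLOT_IN_FLIGHT) {
        return;
    }
    q->slots[slot].state = delivered ? SLOT_FREE : SLOT_QUEUED;
}

/* Call when the connection drops: everything in flight must be resent.
 * Returns the number of messages put back in the queue. */
int pending_requeue_in_flight(pending_queue_t *q)
{
    int requeued = 0;
    for (int i = 0; i < PENDING_CAPACITY; i++) {
        if (q->slots[i].state == SLOT_IN_FLIGHT) {
            q->slots[i].state = SLOT_QUEUED;
            requeued++;
        }
    }
    return requeued;
}
```

In an application like the one in the question, the main loop could call pending_next() whenever the client handle is valid, pass the payload to IoTHubDeviceClient_LL_SendEventAsync with the slot index as the callback context, and call pending_on_result() from the confirmation callback. Because everything then runs on the main loop, neither the semaphore nor the per-message threads would be needed.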

    If you need further help in this matter, please comment in the below section and we are happy to discuss!


    If this answers your query, do click Accept Answer and Yes for this answer as helpful. And, if you have any further query do let us know by commenting in the below section.
