Recommendations for developing production-ready applications

As you develop applications for your Azure Sphere devices, there are several things to consider to ensure that your applications are production ready. This topic includes a checklist of best practices to verify that your applications are ready for pilot or production deployment. Confirming that these items are complete can reduce the number of issues you encounter in production and make it easier to diagnose any issues that do arise.

When you develop an Azure Sphere application, decide whether it will run on the high-level (HL) core, the real-time (RT) core, or as a hybrid of both. High-level applications run containerized on the Azure Sphere OS, and real-time capable applications (RTApps) run on bare metal or with a real-time operating system (RTOS) on the real-time cores.

The recommendations provided here are intended to help you increase quality and productivity in your production-ready applications. The checklist below provides a concise list of design suggestions for both application types, as well as recommended coding fundamentals and solution design considerations, including links to topics that discuss each point in more detail. These suggestions are derived from our partnerships with customers, including field analysis, code reviews, and support interactions of production-deployed applications in real-world solutions and device designs.

Coding fundamentals

  • Common issues

    • Ensure that production-ready applications don't use beta toolsets.
    • When targeting the API set, use the latest CMake and Azure Sphere tools.
    • To get full code optimization and the smallest image size, consider compiling the final image packages in Release mode. Test the Release package itself before deploying the application to production.
    • Use a zero-warnings policy when performing a full build to ensure compiler warnings are intentionally addressed.
    • Set up a consistent CI/CD pipeline and use a proper branching strategy.
  • Memory-related issues

    • When possible, define all common fixed strings as global const char* values instead of hard-coding them at each use, so they can be used as data pointers.
    • If global data structures are reasonably small, consider giving fixed lengths to the array members rather than using pointers to dynamically allocated memory.
    • Avoid dynamic memory allocation whenever possible.
    • For functions that return a pointer to a memory buffer, consider converting them to functions that give the caller a reference to the buffer together with its size.
  • Dynamic containers and buffers

    • Consider using an incremental allocation approach for containers such as lists and vectors.
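
The memory-related suggestions above can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not code from the Azure Sphere SDK: every name in it (kDeviceLabel, DeviceState, IntVector, the increment of 8, and the cap of 64 items) is hypothetical and chosen only for the example. It shows a fixed string defined once as a global constant, a small global structure with fixed-length members, an accessor that hands back a buffer reference together with its size, and a container that grows in fixed increments up to a known bound.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical and exist only for this sketch. */

/* Fixed string defined once as a global constant. */
static const char *const kDeviceLabel = "sensor-node";

/* Small global structure: fixed-length members instead of heap pointers. */
typedef struct {
    char label[32];
    uint8_t payload[64];
    size_t payloadSize;
} DeviceState;

static DeviceState g_state;

static void DeviceState_Init(void)
{
    memset(&g_state, 0, sizeof g_state);
    /* The buffer was zeroed above, so the copy stays NUL-terminated. */
    strncpy(g_state.label, kDeviceLabel, sizeof g_state.label - 1);
}

/* Copies data into the fixed payload; rejects input that does not fit. */
static int DeviceState_SetPayload(const uint8_t *data, size_t size)
{
    if (data == NULL || size > sizeof g_state.payload) {
        return -1;
    }
    memcpy(g_state.payload, data, size);
    g_state.payloadSize = size;
    return 0;
}

/* Hands back a reference to the buffer together with its size. */
static int DeviceState_GetPayload(const uint8_t **outData, size_t *outSize)
{
    if (outData == NULL || outSize == NULL) {
        return -1;
    }
    *outData = g_state.payload;
    *outSize = g_state.payloadSize;
    return 0;
}

/* Container that grows by a fixed increment, up to a fixed bound. */
#define VEC_INCREMENT 8u
#define VEC_MAX_CAPACITY 64u

typedef struct {
    int *items;
    size_t count;
    size_t capacity;
} IntVector;

static int IntVector_Push(IntVector *v, int value)
{
    if (v->count == v->capacity) {
        if (v->capacity + VEC_INCREMENT > VEC_MAX_CAPACITY) {
            return -1; /* bound reached; the caller decides how to react */
        }
        size_t newCapacity = v->capacity + VEC_INCREMENT;
        int *grown = realloc(v->items, newCapacity * sizeof *grown);
        if (grown == NULL) {
            return -1; /* the existing items remain valid */
        }
        v->items = grown;
        v->capacity = newCapacity;
    }
    v->items[v->count++] = value;
    return 0;
}

static void IntVector_Free(IntVector *v)
{
    free(v->items);
    v->items = NULL;
    v->count = 0;
    v->capacity = 0;
}
```

Growing by a small fixed step keeps each allocation request predictable, and the explicit upper bound turns runaway growth into an error the application can handle rather than an out-of-memory condition.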

High-level core application design suggestions

  • General fundamentals

    • Properly initialize and destroy all handlers upon exit or error.
    • Always use exit codes.
    • If an application detects that it's in an unrecoverable state and requires a restart, ensure that it's always handled as a "clean" application exit, rather than risking a deadlock state.
    • Implement error handling and logging. For more information, see Error handling and logging.
    • Use a system timer as a watchdog to detect whether the application is in an unrecoverable state or has stalled (for example, because of a deadlock, exhausted memory, or connectivity that does not recover through the implemented logic), and effect proper recovery. For more information, see Use a system timer as a watchdog.
  • Handling concurrency

    • Use EventLoop whenever possible.
    • Look for efficiency in concurrent tasks.
    • Evaluate when to use threads, and scope them to specific tasks only. For more information on when to use threads, see Handling concurrency.
  • Connectivity monitoring

    • Implement a proper connectivity health-checking task based on a robust state machine that regularly checks the status of the internet connection.
    • For solutions that require power management, power down the Azure Sphere chip after sending data, track the total up-time, and set a shutdown timer.
    • cURL has recently updated its callback behavior and best practices. Although Azure Sphere has taken steps to ensure that older cURL behavior continues to work as expected, follow the latest guidance for security and reliability when using curl_multi, because recursive callbacks can result in unexpected crashes, connectivity outages, and potential security vulnerabilities. If a TimerCallback fires with a timeout of 0 ms, treat it as a timeout of 1 ms to avoid recursive callbacks. Also call curl_multi_socket_action explicitly at least once after calls to curl_multi_add_handle.
  • Memory management and usage

    • Track application memory usage with the Azure Sphere OS APIs and ensure applications react appropriately to unexpected memory use.
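
The curl_multi timer guidance above can be sketched as a small decision helper. This is an illustration only: TimerRequest and TimerRequestFromCurlTimeout are hypothetical names, not part of libcurl or the Azure Sphere SDK, and the code deliberately avoids any real timer or libcurl call so that it stays self-contained. It assumes the libcurl convention that a timeout of -1 passed to the timer callback means the timer should be deleted.

```c
#include <stdbool.h>

/* Hypothetical helper type: what the application should do with its
   one-shot timer after libcurl reports a new timeout. */
typedef struct {
    bool armed;     /* true: (re)arm the timer; false: disarm it */
    long timeoutMs; /* delay to arm with; meaningful only when armed */
} TimerRequest;

/* Maps the timeout that libcurl passes to the timer callback onto a
   timer request. A timeout of 0 ms is treated as 1 ms so the work is
   deferred to the event loop instead of re-entering libcurl from
   inside its own callback. */
static TimerRequest TimerRequestFromCurlTimeout(long timeoutMs)
{
    TimerRequest request = { false, 0 };

    if (timeoutMs < 0) {
        /* libcurl passes -1 to ask for the timer to be deleted. */
        return request;
    }

    request.armed = true;
    request.timeoutMs = (timeoutMs == 0) ? 1 : timeoutMs;
    return request;
}
```

In an application, the callback registered with CURLMOPT_TIMERFUNCTION would apply the returned request to its event-loop timer and return without calling back into libcurl. The timer's own handler would then call curl_multi_socket_action with CURL_SOCKET_TIMEOUT, and the same call would be made once after each curl_multi_add_handle.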

Real-time core application design suggestions

  • Enable the MT3620 watchdog timer to detect deadlock and implement proper recovery logic.
  • Implement inter-core communications for hybrid HL-core and RT-core applications.
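
One common way to make a hardware watchdog detect a deadlock in any task, rather than only in the main loop, is to refresh it only after every supervised task has reported progress. The sketch below is hardware-agnostic and makes several assumptions: the names (WatchdogSupervisor, Supervisor_CheckIn, and so on) are hypothetical, the number of tasks is arbitrary, and FakeWatchdogRefresh is a stand-in for whatever routine actually refreshes the MT3620 watchdog in your RTApp. No MT3620 register access is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of supervised tasks (must be less than 32). */
#define TASK_COUNT 3u

/* Stand-in for the routine that refreshes the hardware watchdog. */
typedef void (*WatchdogRefreshFn)(void);

typedef struct {
    uint32_t checkedInMask;    /* bit n set: task n reported progress */
    WatchdogRefreshFn refresh; /* called only when all tasks checked in */
} WatchdogSupervisor;

static void Supervisor_Init(WatchdogSupervisor *s, WatchdogRefreshFn refresh)
{
    s->checkedInMask = 0;
    s->refresh = refresh;
}

/* Called by each task when it completes a unit of work. If tasks run in
   interrupt context, this update would need to be made atomic. */
static void Supervisor_CheckIn(WatchdogSupervisor *s, unsigned taskId)
{
    if (taskId < TASK_COUNT) {
        s->checkedInMask |= (uint32_t)1u << taskId;
    }
}

/* Called periodically from the main loop. Refreshes the watchdog only
   if every task has checked in since the previous refresh; otherwise
   the hardware timer keeps counting and eventually resets the core. */
static bool Supervisor_Service(WatchdogSupervisor *s)
{
    const uint32_t allTasks = ((uint32_t)1u << TASK_COUNT) - 1u;

    if (s->checkedInMask != allTasks) {
        return false;
    }
    s->checkedInMask = 0;
    if (s->refresh != NULL) {
        s->refresh();
    }
    return true;
}

/* Demonstration stub that counts refreshes instead of touching hardware. */
static unsigned g_refreshCount;

static void FakeWatchdogRefresh(void)
{
    g_refreshCount++;
}
```

With this arrangement a single stalled task is enough to stop the refreshes, so the watchdog timeout should be chosen to be longer than the slowest task's normal check-in interval.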

Solution design considerations

  • Connectivity requirements and troubleshooting

    • Ensure all network prerequisites are met. For more information, see Connectivity requirements and troubleshooting.
    • Troubleshoot connectivity issues by using OSNetworkRequirementChecker-HLApp and OSNetworkRequirementChecker-PC.

For additional items to consider when moving an IoT solution to a production environment, see Move an IoT solution from test to production.