Azure Cloud Services for Tello Drones: How to Analyze Images by Azure Computer Vision

1/17/2024

Introduction

Tello is a programmable mini drone that is popular with beginners. Users can control it with programming languages such as Scratch, Python, and Swift. Microsoft Azure provides a variety of cloud computing services, including artificial intelligence, machine learning, IoT, storage, security, networking, media, and integration. Azure Computer Vision provides advanced algorithms that process images and return information based on the visual features you are interested in. Optical Character Recognition (OCR), Image Analysis, and Spatial Analysis are three of its commonly used services. Additionally, Azure Computer Vision can detect human faces within an image and return the rectangle coordinates of each detected face.

In this article, we walk through the steps required to capture an image from the Tello and send it to Azure Computer Vision, using the Azure Cognitive Services Computer Vision SDK for Python, to obtain an analysis of the image, such as the objects and human faces it contains.

Prerequisites

Tello drone
Azure Storage Blobs client library for Python
IDE: PyCharm Community
Azure subscription

Network Access for PC

Since the Tello connects to the PC over Wi-Fi, the PC needs two network interface cards: one for connecting to the Tello and the other for connecting to the Internet.

Fig. 1 Network Access
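
Before running the application, it can help to confirm that the PC reaches each network through a different interface. The following is a minimal sketch and not part of the original project: it asks the operating system which local address it would use to reach a given host. It assumes the Tello's default address 192.168.10.1 and command port 8889; the function name local_ip_for and the public address used for the Internet check are illustrative choices.

```python
import socket


def local_ip_for(host, port):
    """Return the local IP the OS would use to reach host, or None if there is no route."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route
        sock.connect((host, port))
        return sock.getsockname()[0]
    except OSError:
        return None
    finally:
        sock.close()


if __name__ == "__main__":
    # Assumed defaults: Tello at 192.168.10.1:8889, a public DNS server for the Internet check
    print("Interface used for Tello:   ", local_ip_for("192.168.10.1", 8889))
    print("Interface used for Internet:", local_ip_for("8.8.8.8", 53))
```

When the PC has joined the Tello's Wi-Fi, the first address is normally in the 192.168.10.x range and differs from the second one. If both lines show the same address, the PC is probably not connected to the drone.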

Image Analysis of Azure Computer Vision

The Computer Vision Image Analysis service can extract a wide variety of visual features from your images. For example, it can determine whether an image contains adult content, find specific brands or objects, or find human faces.
The latest version of Image Analysis, 4.0, which is now in public preview, has new features like synchronous OCR and people detection.
In this project, we use Image Analysis to tag visual features and detect faces in the image captured by the Tello.

Create Necessary Services on Azure

In this project, we will use the Azure Computer Vision and Azure Storage services. The “Create an Azure Storage Account” section of the wiki article “Azure Cloud Services for Tello Drones: How to Send Images to Azure Blob Storage” guides you through creating the Azure Storage service. For Azure Computer Vision, the official documentation “Quickstart: Create a Cognitive Services resource using the Azure portal” is a good starting point for creating your own Computer Vision resource. You will need the key and endpoint of the resource you create to connect your application to the Computer Vision service. The free pricing tier (F0) is enough to try the service.
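
The key and endpoint do not have to be pasted into the source file. As a minimal sketch, assuming they are stored in two environment variables, the helper below reads and validates them. The variable names VISION_KEY and VISION_ENDPOINT and the function name load_vision_settings are assumptions made for this example, not names required by the SDK.

```python
import os


def load_vision_settings(env=os.environ):
    """Read the Computer Vision key and endpoint from environment variables."""
    # VISION_KEY and VISION_ENDPOINT are hypothetical names chosen for this sketch
    key = env.get("VISION_KEY", "").strip()
    endpoint = env.get("VISION_ENDPOINT", "").strip()
    if not key or not endpoint:
        raise RuntimeError("Set VISION_KEY and VISION_ENDPOINT before running the application.")
    return key, endpoint
```

In the application, the two returned values could then be assigned to subscription_key and endpoint instead of hard-coded strings.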

Install Python Packages

In this project, we will install the “djitellopy”, “azure-storage-blob”, “pygame” and “azure-cognitiveservices-vision-computervision” packages to accelerate development. Please refer to the “Install Python Azure IoT SDK and Tello Python SDK” section of the article “Azure Cloud Services for Tello Drones: How to send telemetry to Azure IoTHub” to complete this step.
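
After installation, a quick check can confirm that the modules imported by the application are available in the interpreter that PyCharm uses. This is a small sketch using only the standard library; the helper name missing_modules is illustrative, and the list contains the import names used in the code of this article (cv2 is the import name of OpenCV).

```python
import importlib.util

# Import names used by the application in this article
REQUIRED_MODULES = [
    "djitellopy",
    "azure.storage.blob",
    "azure.cognitiveservices.vision.computervision",
    "pygame",
    "cv2",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example "azure") is not installed
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```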

Create and Debug Python Code on Your PC

Copy and paste the following code into your PyCharm project.

from djitellopy import tello
import KeyPressModule as kp  # local keyboard helper module that opens the Pygame window
import os
import time
import cv2
from azure.core.exceptions import AzureError
from azure.storage.blob import ContentSettings, BlobClient
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

kp.init()
me = tello.Tello()
me.connect()
print(me.get_battery())
me.streamon()

# latest video frame, updated on every pass of the loop in main()
img = None
# local path of the captured image
image_path = "Resources/Images/capture.jpg"
# connection string for Azure Blob storage
conn_str = "your connection string"
container_name = "raspberrypic"
blob_name = "capture"
# subscription information for your Azure Computer Vision
subscription_key = "your subscription key"
endpoint = "your service endpoint"


def getKeyboardInput():
    lr, fb, ud, yv = 0, 0, 0, 0
    speed = 50

    if kp.getKey("LEFT"):
        lr = -speed
    elif kp.getKey("RIGHT"):
        lr = speed

    if kp.getKey("UP"):
        fb = speed
    elif kp.getKey("DOWN"):
        fb = -speed

    if kp.getKey("w"):
        ud = speed
    elif kp.getKey("s"):
        ud = -speed

    if kp.getKey("a"):
        yv = speed
    elif kp.getKey("d"):
        yv = -speed

    if kp.getKey("q"):
        me.land()
    if kp.getKey("e"):
        me.takeoff()

    # capture, analyze and upload the current frame (skipped until a frame exists)
    if kp.getKey("z") and img is not None:
        # cv2.imwrite fails silently if the target folder is missing
        os.makedirs(os.path.dirname(image_path), exist_ok=True)
        cv2.imwrite(image_path, img)
        time.sleep(0.3)
        # create computer vision client
        computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

        print("===== Tag the image =====")
        # Open the local image file and call the API with it
        with open(image_path, "rb") as local_image:
            tags_result_local = computervision_client.tag_image_in_stream(local_image)
        # Print results with confidence score
        print("Tags in the local image: ")
        if len(tags_result_local.tags) == 0:
            print("No tags detected.")
        else:
            for tag in tags_result_local.tags:
                print("'{}' with confidence {:.2f}%".format(tag.name, tag.confidence * 100))
        print()

        print("===== Detect Faces =====")
        local_image_features = ["faces"]
        with open(image_path, "rb") as image:
            detect_faces_results_local = computervision_client.analyze_image_in_stream(image, local_image_features)
        # Print the attributes and bounding box of each face
        print("Faces in the local image: ")
        if len(detect_faces_results_local.faces) == 0:
            print("No faces detected.")
        else:
            for face in detect_faces_results_local.faces:
                left = face.face_rectangle.left
                top = face.face_rectangle.top
                right = face.face_rectangle.left + face.face_rectangle.width
                bottom = face.face_rectangle.top + face.face_rectangle.height
                print("'{}' of age {} at location {}, {}, {}, {}".format(face.gender, face.age,
                                                                         left, top, right, bottom))

        # upload the image to Azure Blob Storage, Overwrite if it already exists!
        blob = BlobClient.from_connection_string(conn_str, container_name, blob_name)
        image_content_setting = ContentSettings(content_type='image/jpeg')
        with open(image_path, "rb") as data:
            try:
                blob.upload_blob(data, overwrite=True, content_settings=image_content_setting)
                print("Blob storage uploading completed")
            except (AzureError, ValueError) as error:
                print("Blob storage uploading error:", error)
    return [lr, fb, ud, yv]


def main():
    global img
    print("Capture, analyse and send Tello image to Azure Blob Storage")
    while True:
        vals = getKeyboardInput()
        me.send_rc_control(vals[0], vals[1], vals[2], vals[3])
        img = me.get_frame_read().frame
        img = cv2.resize(img, (1280, 720))
        cv2.putText(img, str(me.get_current_state()), (10, 60), cv2.FONT_HERSHEY_PLAIN, 0.9, (255, 0, 255), 1)
        cv2.imshow("image", img)
        cv2.waitKey(50)


if __name__ == "__main__":
    main()

In this Python application, the function getKeyboardInput reads the user's keyboard input and returns the control parameters to the loop in the main function. It is much the same as the one designed in the article “Azure Cloud Services for Tello Drones: How to Control Tello by Azure C2D Messages”.
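
The key-to-velocity mapping inside getKeyboardInput can also be expressed as a small pure function, which makes it easy to check without a drone or a Pygame window. The sketch below is an illustrative rewrite, not the code used in the application: rc_from_keys and AXES are hypothetical names, and the function takes a set of pressed key names instead of calling kp.getKey.

```python
# (first key, its sign, second key, its sign) for lr, fb, ud, yv, in that order
AXES = [
    ("LEFT", -1, "RIGHT", 1),
    ("UP", 1, "DOWN", -1),
    ("w", 1, "s", -1),
    ("a", 1, "d", -1),
]


def rc_from_keys(pressed, speed=50):
    """Map a set of pressed key names to the [lr, fb, ud, yv] velocities."""
    values = []
    for first, first_sign, second, second_sign in AXES:
        # The first key of each pair wins, mirroring the if/elif pairs in getKeyboardInput
        if first in pressed:
            values.append(first_sign * speed)
        elif second in pressed:
            values.append(second_sign * speed)
        else:
            values.append(0)
    return values
```

For example, rc_from_keys({"LEFT", "UP"}) yields the values for moving left and forward at the same time, and the resulting list can be passed to send_rc_control in the same way as vals in the main loop.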

To perform the image analysis, the application first stores the image in the project folder “Resources/Images/” when we press “z” on the keyboard. Then a computervision_client is created to send the image to the Azure Computer Vision service, which returns the tags and faces found in the image. The results are displayed on the screen as soon as the application receives the reply.
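
The service returns every tag together with a confidence between 0 and 1, so low-confidence tags can be filtered out before printing. The helper below is a hedged sketch of that idea: format_tags is a hypothetical name, the 0.5 threshold is an arbitrary assumption, and the function works on plain (name, confidence) pairs so that it does not depend on the SDK.

```python
def format_tags(tags, min_confidence=0.5):
    """Return printable lines for (name, confidence) pairs at or above min_confidence."""
    kept = [(name, confidence) for name, confidence in tags if confidence >= min_confidence]
    # Most confident tags first
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return ["'{}' with confidence {:.2f}%".format(name, confidence * 100)
            for name, confidence in kept]
```

With the result object from the application, the pairs could be built as (tag.name, tag.confidence) for each tag in tags_result_local.tags.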

Substitute the connection string, subscription key, and endpoint with the values of the resources you created earlier. Then power on the Tello and connect your PC to it over Wi-Fi. The LED on the Tello will flash yellow quickly. Press the Run or Debug button to start the application.

After a few seconds, the real-time image streamed from the Tello and the Pygame window will appear on the screen, as shown in Fig. 2.

Next, click the Pygame window to give it keyboard focus. After that, we can use “w, s, a, d, e, q, up arrow, down arrow, left arrow, right arrow” to control the movement of the Tello drone. The device status of the Tello is written along the top of the image.

Fig. 2 Real time image and Pygame window

When we press “z” on the keyboard, a computervision_client is created to send the image to the Azure Computer Vision service to get the tags and faces in the image. The results are shown in the output window, as in Fig. 3.

Fig. 3 Computer Vision Results of the captured image from Tello
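
Each detected face is reported as a rectangle with left, top, width, and height, which the application converts to corner coordinates before printing. The following sketch isolates that conversion and additionally clamps the corners to the image bounds. The function name face_corners is illustrative, and the default size of 1280 x 720 is taken from the resize call in the main loop.

```python
def face_corners(left, top, width, height, image_width=1280, image_height=720):
    """Convert a face rectangle to (left, top, right, bottom), clamped to the image."""
    right = min(left + width, image_width)
    bottom = min(top + height, image_height)
    return max(left, 0), max(top, 0), right, bottom
```

The returned corners could, for instance, be passed to cv2.rectangle as the two opposite points of a box drawn on the captured frame.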

After that, the image is uploaded to Azure Blob Storage. The log information appears in the debug window, and we can use Azure Storage Explorer to check the image.
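
Because the application always uploads to the same blob name with overwrite=True, only the most recent capture is kept in the container. If every capture should be preserved, one option is to derive the blob name from the capture time. This is a minimal sketch under that assumption; timestamped_blob_name and its naming pattern are hypothetical and not part of the original project.

```python
from datetime import datetime, timezone


def timestamped_blob_name(prefix="capture", when=None):
    """Build a blob name such as capture-20240117T083005Z.jpg from a UTC timestamp."""
    # when is expected to be a timezone-aware UTC datetime; defaults to the current time
    when = when or datetime.now(timezone.utc)
    return "{}-{}.jpg".format(prefix, when.strftime("%Y%m%dT%H%M%SZ"))
```

The returned string could be passed to BlobClient.from_connection_string in place of the fixed blob_name.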

Summary

In this tutorial, we presented the steps and Python code for sending an image streamed from a Tello drone to the Azure Computer Vision service to obtain insights about it.

Resources

MS Docs for Azure Computer Vision.
MS Docs for Create a storage account.
MS Docs for Manage storage account access keys.
