Shelf Product Recognition (preview): Analyze shelf images using a pretrained model

The fastest way to start using Product Recognition is to use the built-in pretrained AI models. With the Product Understanding API, you can upload a shelf image and get the locations of products and gaps.

Photo of a retail shelf with products and gaps highlighted with rectangles.

Note

The brands shown in the images are not affiliated with Microsoft and do not indicate any form of endorsement of Microsoft or Microsoft products by the brand owners, or an endorsement of the brand owners or their products by Microsoft.

Prerequisites

  • An Azure subscription - Create one for free
  • Once you have your Azure subscription, create a Vision resource in the Azure portal. It must be deployed in the East US or West US 2 region. After it deploys, select Go to resource.
    • You'll need the key and endpoint from the resource you create to connect your application to the Azure AI Vision service. You'll paste your key and endpoint into the curl command later in this guide.
  • An Azure Storage resource with a blob storage container. Create one
  • cURL installed. Or, you can use a different REST platform, like Swagger or the REST Client extension for VS Code.
  • A shelf image. You can download our sample image or bring your own images. The maximum file size per image is 20 MB.

Analyze shelf images

To analyze a shelf image, follow these steps:

  1. Upload the images you'd like to analyze to your blob storage container, and get the absolute URL of each image.

  2. Copy the following curl command into a text editor.

    curl -X PUT -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -H "Content-Type: application/json" "<endpoint>/computervision/productrecognition/ms-pretrained-product-detection/runs/<your_run_name>?api-version=2023-04-01-preview" -d "{\"url\":\"<your_url_string>\"}"
    
  3. Make the following changes in the command where needed:

    1. Replace the <subscriptionKey> with your Vision resource key.
    2. Replace the <endpoint> with your Vision resource endpoint. For example: https://YourResourceName.cognitiveservices.azure.com.
    3. Replace <your_run_name> with a unique run name for the task queue. The API is asynchronous, and the run name is what you use to retrieve the API response later. For example, .../runs/test1?api-version...
    4. Replace <your_url_string> with the blob URL of the image.
  4. Open a command prompt window.

  5. Paste your edited curl command from the text editor into the command prompt window, and then run the command.
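
If you would rather script the call than paste it into a command prompt, the same request can be assembled in Python. The following is a minimal sketch using only the standard library. The helper name `build_run_request` is hypothetical, and the function only builds the request so you can inspect it before sending. The URL path, headers, and body mirror the curl command above.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2023-04-01-preview"
MODEL_NAME = "ms-pretrained-product-detection"


def build_run_request(endpoint, key, run_name, image_url):
    """Build (but do not send) the PUT request that starts an analysis run.

    Hypothetical helper: it mirrors the curl command in this guide.
    """
    url = (
        f"{endpoint.rstrip('/')}/computervision/productrecognition/"
        f"{MODEL_NAME}/runs/{urllib.parse.quote(run_name, safe='')}"
        f"?api-version={API_VERSION}"
    )
    # The request body is a JSON object holding the blob URL of the image.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_run_request(
        "https://YourResourceName.cognitiveservices.azure.com",
        "<subscriptionKey>",
        "test1",
        "<your_url_string>",
    )
    print(request.get_method(), request.full_url)
    # To actually send it, replace the placeholders and call:
    # urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to check the run name and endpoint before any call is billed against your resource.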

Examine the response

A successful response is returned in JSON. The Product Understanding API results are returned in a ProductUnderstandingResultApiModel JSON field:

{
  "imageMetadata": {
    "width": 2000,
    "height": 1500
  },
  "products": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 12,
        "h": 12
      },
      "classifications": [
        {
          "confidence": 0.9,
          "label": "string"
        }
      ]
    }
  ],
  "gaps": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 123,
        "h": 123
      },
      "classifications": [
        {
          "confidence": 0.8,
          "label": "string"
        }
      ]
    }
  ]
}

See the following sections for definitions of each JSON field.

Product Understanding Result API model

Results from the product understanding operation.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| imageMetadata | ImageMetadataApiModel | The image metadata information such as height, width and format. | Yes |
| products | DetectedObjectApiModel | Products detected in the image. | Yes |
| gaps | DetectedObjectApiModel | Gaps detected in the image. | Yes |
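
As a rough illustration of how the three top-level fields fit together, the sketch below parses a trimmed copy of the sample response and reports the image size and the number of detected products and gaps. The function name `summarize` is hypothetical, and the code assumes you pass it the ProductUnderstandingResultApiModel object itself, shaped like the sample shown earlier.

```python
import json

# Trimmed copy of the sample response from this guide.
SAMPLE_RESPONSE = """
{
  "imageMetadata": {"width": 2000, "height": 1500},
  "products": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 12, "h": 12},
     "classifications": [{"confidence": 0.9, "label": "string"}]}
  ],
  "gaps": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 123, "h": 123},
     "classifications": [{"confidence": 0.8, "label": "string"}]}
  ]
}
"""


def summarize(result):
    """Return the image size and the product and gap counts for one result."""
    meta = result["imageMetadata"]
    return {
        "image_size": (meta["width"], meta["height"]),
        "product_count": len(result["products"]),
        "gap_count": len(result["gaps"]),
    }


if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE_RESPONSE)))
```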

Image Metadata API model

The image metadata information such as height, width and format.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| width | integer | The width of the image in pixels. | Yes |
| height | integer | The height of the image in pixels. | Yes |

Detected Object API model

Describes a detected object in an image.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| id | string | ID of the detected object. | No |
| boundingBox | BoundingBoxApiModel | A bounding box for an area inside an image. | Yes |
| classifications | ImageClassificationApiModel | Classification confidences of the detected object. | Yes |

Bounding Box API model

A bounding box for an area inside an image.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| x | integer | Left-coordinate of the top left point of the area, in pixels. | Yes |
| y | integer | Top-coordinate of the top left point of the area, in pixels. | Yes |
| w | integer | Width measured from the top-left point of the area, in pixels. | Yes |
| h | integer | Height measured from the top-left point of the area, in pixels. | Yes |
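
Because the box is given as a top-left point plus a width and height, drawing or cropping code usually needs it converted to corner coordinates first. The sketch below shows one way to do that. The `BoundingBox` class and `fits_in_image` helper are hypothetical names, not part of the API, and the sample values come from the response shown earlier in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A box in pixels: top-left point (x, y) plus width w and height h."""
    x: int
    y: int
    w: int
    h: int

    @classmethod
    def from_json(cls, box):
        """Build a box from a BoundingBoxApiModel-shaped dict."""
        return cls(box["x"], box["y"], box["w"], box["h"])

    def corners(self):
        """Return (left, top, right, bottom) in pixels."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)

    def area(self):
        """Return the box area in square pixels."""
        return self.w * self.h


def fits_in_image(box, image_width, image_height):
    """Check that the box lies fully inside an image of the given size."""
    left, top, right, bottom = box.corners()
    return left >= 0 and top >= 0 and right <= image_width and bottom <= image_height


if __name__ == "__main__":
    product_box = BoundingBox.from_json({"x": 1234, "y": 1234, "w": 12, "h": 12})
    print(product_box.corners())  # (1234, 1234, 1246, 1246)
    print(fits_in_image(product_box, 2000, 1500))  # True
```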

Image Classification API model

Describes the image classification confidence of a label.

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| confidence | float | Confidence of the classification prediction. | Yes |
| label | string | Label of the classification prediction. | Yes |
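
Each detected object carries a list of classifications, so client code often wants just the most confident one. The following sketch picks it and discards low-confidence predictions. The function name `top_classification` and the default threshold of 0.5 are assumptions made for illustration, not values defined by the service, and the labels in the usage example are placeholders.

```python
def top_classification(detected_object, min_confidence=0.5):
    """Return the highest-confidence classification, or None.

    detected_object is a DetectedObjectApiModel-shaped dict.
    min_confidence is an illustrative cutoff, not an API setting.
    """
    candidates = [
        c for c in detected_object.get("classifications", [])
        if c["confidence"] >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c["confidence"])


if __name__ == "__main__":
    detected = {
        "id": "string",
        "classifications": [
            {"confidence": 0.9, "label": "label-a"},  # placeholder labels
            {"confidence": 0.6, "label": "label-b"},
        ],
    }
    best = top_classification(detected)
    print(best["label"], best["confidence"])  # label-a 0.9
```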

Next steps

In this guide, you learned how to make a basic analysis call using the pretrained Product Understanding REST API. Next, learn how to use a custom Product Recognition model to better meet your business needs.