Shelf Product Recognition (preview): Analyze shelf images using a pretrained model

02/14/2024

The fastest way to start using Product Recognition is to use the built-in pretrained AI models. With the Product Understanding API, you can upload a shelf image and get the locations of products and gaps.

Photo of a retail shelf with products and gaps highlighted with rectangles.

Note

The brands shown in the images are not affiliated with Microsoft and do not indicate any form of endorsement of Microsoft or Microsoft products by the brand owners, or an endorsement of the brand owners or their products by Microsoft.

Prerequisites

- An Azure subscription. You can create one for free.
- Once you have your Azure subscription, create a Vision resource in the Azure portal. It must be deployed in the East US or West US 2 region. After it deploys, select Go to resource.
  - You'll need the key and endpoint from the resource you create to connect your application to the Azure AI Vision service. You'll paste your key and endpoint into the code later in the guide.
- An Azure Storage resource with a blob storage container.
- cURL installed. Alternatively, you can use a different REST platform, like Swagger or the REST Client extension for VS Code.
- A shelf image. You can download our sample image or bring your own images. The maximum file size per image is 20 MB.
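
Before uploading, it can help to check an image against the 20 MB limit locally. The following is a minimal sketch, not part of the service: the function name `is_within_size_limit` is hypothetical, and it assumes that 1 MB means 1,048,576 bytes, which this guide does not specify.

```python
import os

# Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def is_within_size_limit(path: str, limit: int = MAX_IMAGE_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

If the stricter reading (20,000,000 bytes) matters for your images, pass it as `limit`.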

Analyze shelf images

To analyze a shelf image, follow these steps:

Upload the images you'd like to analyze to your blob storage container, and get the absolute URL.

Copy the following curl command into a text editor.

curl -X PUT -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -H "Content-Type: application/json" "<endpoint>/computervision/productrecognition/ms-pretrained-product-detection/runs/<your_run_name>?api-version=2023-04-01-preview" -d "{
    'url':'<your_url_string>'
}"

Make the following changes in the command where needed:
1. Replace <subscriptionKey> with your Vision resource key.
2. Replace <endpoint> with your Vision resource endpoint. For example: https://YourResourceName.cognitiveservices.azure.com.
3. Replace <your_run_name> with a unique run name for the task queue. The API is asynchronous, and the run name lets you retrieve the API response later. For example, .../runs/test1?api-version...
4. Replace <your_url_string> with the blob URL of the image.

Open a command prompt window.

Paste your edited curl command from the text editor into the command prompt window, and then run the command.
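
If you prefer to script the call, the same PUT request can be assembled in Python with only the standard library. This is a sketch under stated assumptions: `build_run_request` is a hypothetical helper name, the URL path and API version are copied from the curl command above, and the body is written as standard double-quoted JSON. The function only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Values copied from the curl command in this guide.
API_VERSION = "2023-04-01-preview"
MODEL_NAME = "ms-pretrained-product-detection"


def build_run_request(endpoint: str, key: str, run_name: str, image_url: str):
    """Build (but do not send) the PUT request that starts an analysis run."""
    url = (
        f"{endpoint.rstrip('/')}/computervision/productrecognition/"
        f"{MODEL_NAME}/runs/{urllib.parse.quote(run_name, safe='')}"
        f"?api-version={API_VERSION}"
    )
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the JSON body from the response.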

Examine the response

A successful response is returned in JSON. The Product Understanding API results are returned in a ProductUnderstandingResultApiModel JSON field:

{
  "imageMetadata": {
    "width": 2000,
    "height": 1500
  },
  "products": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 12,
        "h": 12
      },
      "classifications": [
        {
          "confidence": 0.9,
          "label": "string"
        }
      ]
    }
  ],
  "gaps": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 123,
        "h": 123
      },
      "classifications": [
        {
          "confidence": 0.8,
          "label": "string"
        }
      ]
    }
  ]
}
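
A response of this shape can be summarized with a few lines of Python. The sketch below assumes the result object has already been extracted from the response and parsed into a dictionary; `summarize` is a hypothetical helper, and the sample data is a trimmed copy of the example above.

```python
import json


def summarize(result: dict) -> dict:
    """Return the image size and the number of detected products and gaps."""
    meta = result["imageMetadata"]
    return {
        "image_size": (meta["width"], meta["height"]),
        "product_count": len(result.get("products", [])),
        "gap_count": len(result.get("gaps", [])),
    }


# Trimmed copy of the example response shown in this guide.
sample = json.loads("""
{
  "imageMetadata": {"width": 2000, "height": 1500},
  "products": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 12, "h": 12},
     "classifications": [{"confidence": 0.9, "label": "string"}]}
  ],
  "gaps": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 123, "h": 123},
     "classifications": [{"confidence": 0.8, "label": "string"}]}
  ]
}
""")

print(summarize(sample))
```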

See the following sections for definitions of each JSON field.

Product Understanding Result API model

Results from the product understanding operation.

Name	Type	Description	Required
`imageMetadata`	ImageMetadataApiModel	The image metadata information such as height, width and format.	Yes
`products`	array of DetectedObjectApiModel	Products detected in the image.	Yes
`gaps`	array of DetectedObjectApiModel	Gaps detected in the image.	Yes

Image Metadata API model

The image metadata information such as height, width and format.

Name	Type	Description	Required
`width`	integer	The width of the image in pixels.	Yes
`height`	integer	The height of the image in pixels.	Yes

Detected Object API model

Describes a detected object in an image.

Name	Type	Description	Required
`id`	string	ID of the detected object.	No
`boundingBox`	BoundingBoxApiModel	A bounding box for an area inside an image.	Yes
`classifications`	array of ImageClassificationApiModel	Classification confidences of the detected object.	Yes

Bounding Box API model

A bounding box for an area inside an image.

Name	Type	Description	Required
`x`	integer	X-coordinate (left edge) of the top-left point of the area, in pixels.	Yes
`y`	integer	Y-coordinate (top edge) of the top-left point of the area, in pixels.	Yes
`w`	integer	Width measured from the top-left point of the area, in pixels.	Yes
`h`	integer	Height measured from the top-left point of the area, in pixels.	Yes
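
Because the box is given as a top-left point plus a width and height, drawing or overlap code often needs the corner coordinates instead. The following sketch shows that conversion, and also estimates what fraction of the image a box covers. Both helper names are hypothetical, and clipping the box to the image bounds is a defensive assumption rather than documented service behavior.

```python
def box_corners(box: dict) -> tuple:
    """Convert {x, y, w, h} into (left, top, right, bottom) pixel coordinates."""
    return box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"]


def area_fraction(box: dict, image_width: int, image_height: int) -> float:
    """Fraction of the image covered by the box, after clipping to the image."""
    left, top, right, bottom = box_corners(box)
    # Clip to the image in case a box extends past an edge (assumption).
    left, top = max(0, left), max(0, top)
    right, bottom = min(image_width, right), min(image_height, bottom)
    width, height = max(0, right - left), max(0, bottom - top)
    return (width * height) / (image_width * image_height)
```

Summing `area_fraction` over the `gaps` array, using `width` and `height` from `imageMetadata`, gives a rough measure of how much of the image is empty shelf space, assuming the gap boxes do not overlap.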

Image Classification API model

Describes the image classification confidence of a label.

Name	Type	Description	Required
`confidence`	float	Confidence of the classification prediction.	Yes
`label`	string	Label of the classification prediction.	Yes
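
Each detected object carries a list of classifications, so a common post-processing step is to pick the highest-confidence label and drop weak detections. The sketch below illustrates one way to do this. The helper names and the threshold value are hypothetical; this guide does not recommend a specific cutoff.

```python
def top_classification(detection: dict):
    """Return the classification with the highest confidence, or None."""
    classes = detection.get("classifications", [])
    if not classes:
        return None
    return max(classes, key=lambda c: c["confidence"])


def filter_by_confidence(detections: list, threshold: float) -> list:
    """Keep detections whose best classification meets the threshold."""
    kept = []
    for detection in detections:
        best = top_classification(detection)
        if best is not None and best["confidence"] >= threshold:
            kept.append(detection)
    return kept
```

For example, `filter_by_confidence(result["products"], 0.5)` would keep only the products whose best label has a confidence of at least 0.5, where `result` is the parsed response and 0.5 is an illustrative value.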

Next steps

In this guide, you learned how to make a basic analysis call using the pretrained Product Understanding REST API. Next, learn how to use a custom Product Recognition model to better meet your business needs.

Train a custom model for Product Recognition