Product Recognition (version 4.0 preview)

02/15/2024

The Product Recognition APIs let you analyze photos of shelves in a retail store. You can detect the presence of products and get their bounding box coordinates. Use them in combination with model customization to train a model to identify your specific products. You can also compare Product Recognition results to your store's planogram document.

Try out the capabilities of Product Recognition quickly and easily in your browser using Vision Studio.


Photo of a shelf with products and gaps outlined in rectangles.

Note

The brands shown in the images are not affiliated with Microsoft and do not indicate any form of endorsement of Microsoft or Microsoft products by the brand owners, or an endorsement of the brand owners or their products by Microsoft.

Important

You can train a custom model for product recognition using either the Custom Vision service or the Image Analysis 4.0 Product Recognition APIs. The following table compares the two services.

Areas	Products on Shelves – Custom Vision	Product Recognition – Image Analysis API/Customization
Features	Custom product understanding	Image stitching & rectification, Pretrained product understanding, Custom product understanding, Planogram matching
Base model	CNN	Florence transformer model
Labeling	Customvision.ai	AML Studio
Web Portal	Customvision.ai	Vision Studio
Libraries	REST, SDK	REST, Python Sample
Minimum training data needed	15 images per category	2-5 images per category
Training data storage	Uploaded to service	Customer’s blob storage account
Model hosting	Cloud and edge	Cloud hosting only, edge container hosting to come
AI quality (top-1 accuracy, 14 datasets), 1 shot (catalog)	29.4	86.9
AI quality, 2 shot	57.1	88.8
AI quality, 3 shot	66.7	89.8
AI quality, 5 shot	80.8	90.3
AI quality, 10 shot	86.4	91.0
AI quality, full	94.9	95.4
Pricing	Custom Vision pricing	Image Analysis pricing

Product Recognition features

Shelf image composition

The stitching and rectification APIs let you modify images to improve the accuracy of the Product Understanding results. You can use these APIs to:

Stitch together multiple images of a shelf to create a single image.
Rectify an image to remove perspective distortion.

Shelf product recognition (pretrained model)

The Product Understanding API lets you analyze a shelf image using the out-of-the-box pretrained model. This operation detects products and gaps in the shelf image and returns the bounding box coordinates of each product and gap, along with a confidence score for each.

The following JSON response illustrates what the Product Understanding API returns.

{
  "imageMetadata": {
    "width": 2000,
    "height": 1500
  },
  "products": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 12,
        "h": 12
      },
      "classifications": [
        {
          "confidence": 0.9,
          "label": "string"
        }
      ]
    }
  ],
  "gaps": [
    {
      "id": "string",
      "boundingBox": {
        "x": 1234,
        "y": 1234,
        "w": 123,
        "h": 123
      },
      "classifications": [
        {
          "confidence": 0.8,
          "label": "string"
        }
      ]
    }
  ]
}
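As a rough sketch of how a client might consume a response of this shape, the following Python snippet keeps only the products and gaps whose best classification meets a confidence threshold, and converts each bounding box from x/y/w/h to corner coordinates. The function names, the default threshold of 0.5, and the reading of x and y as the top-left corner in pixels are assumptions made for illustration; they are not part of the API.

```python
import json


def top_classification(detection):
    """Return the highest-confidence classification of a product or gap entry."""
    return max(detection["classifications"], key=lambda c: c["confidence"])


def to_corners(box):
    """Convert an x/y/w/h box to (left, top, right, bottom).

    Assumption: x and y mark the top-left corner, in pixels.
    """
    return (box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"])


def filter_detections(response, min_confidence=0.5):
    """Collect products and gaps whose best classification meets the threshold."""
    kept = []
    for kind in ("products", "gaps"):
        for detection in response.get(kind, []):
            best = top_classification(detection)
            if best["confidence"] >= min_confidence:
                kept.append({
                    "kind": kind,
                    "id": detection["id"],
                    "label": best["label"],
                    "confidence": best["confidence"],
                    "corners": to_corners(detection["boundingBox"]),
                })
    return kept


# Placeholder data with the same shape as the sample response.
sample = json.loads("""
{
  "imageMetadata": {"width": 2000, "height": 1500},
  "products": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 12, "h": 12},
     "classifications": [{"confidence": 0.9, "label": "string"}]}
  ],
  "gaps": [
    {"id": "string",
     "boundingBox": {"x": 1234, "y": 1234, "w": 123, "h": 123},
     "classifications": [{"confidence": 0.8, "label": "string"}]}
  ]
}
""")

for item in filter_detections(sample, min_confidence=0.85):
    print(item["kind"], item["label"], item["corners"])
```

Raising the threshold trades recall for precision; a suitable value depends on your images and is best chosen by checking results against a few manually reviewed shelves.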

Shelf product recognition (customized model)

The Product Understanding API can also be used with a custom-trained model to detect your specific products. This operation returns the bounding box coordinates of each product and gap, along with the label of each product.

The following JSON response illustrates what the Product Understanding API returns when used with a custom model.

{
  "detectedProducts": {
    "imageMetadata": {
      "width": 21,
      "height": 25
    },
    "products": [
      {
        "id": "01",
        "boundingBox": {
          "x": 123,
          "y": 234,
          "w": 34,
          "h": 45
        },
        "classifications": [
          {
            "confidence": 0.8,
            "label": "Product1"
          }
        ]
      }
    ],
    "gaps": [
      {
        "id": "02",
        "boundingBox": {
          "x": 12,
          "y": 123,
          "w": 1234,
          "h": 123
        },
        "classifications": [
          {
            "confidence": 0.9,
            "label": "Product1"
          }
        ]
      }
    ]
  }
}
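One common use of the custom-model output is a per-label tally of what is on the shelf. The sketch below counts detected products by their highest-confidence label. It accepts the payload either nested under a "detectedProducts" key or as the bare object; the function name, the threshold, and the sample data are hypothetical.

```python
from collections import Counter


def count_products_by_label(result, min_confidence=0.5):
    """Tally detected products by their best label.

    Accepts either {"detectedProducts": {...}} or the inner object itself.
    """
    detected = result.get("detectedProducts", result)
    counts = Counter()
    for product in detected.get("products", []):
        classifications = product.get("classifications", [])
        if not classifications:
            continue  # nothing to count for an unlabeled detection
        best = max(classifications, key=lambda c: c["confidence"])
        if best["confidence"] >= min_confidence:
            counts[best["label"]] += 1
    return counts


# Hypothetical payload in the same shape as the sample response.
sample_result = {
    "detectedProducts": {
        "products": [
            {"id": "01", "classifications": [{"confidence": 0.8, "label": "Product1"}]},
            {"id": "02", "classifications": [{"confidence": 0.7, "label": "Product1"}]},
            {"id": "03", "classifications": [{"confidence": 0.9, "label": "Product2"}]},
            {"id": "04", "classifications": [{"confidence": 0.2, "label": "Product2"}]},
        ],
        "gaps": [],
    }
}

print(count_products_by_label(sample_result))
```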

Shelf planogram compliance

The Planogram matching API lets you compare the results of the Product Understanding API to a planogram document. This operation matches each detected product and gap to its corresponding position in the planogram document.

It returns a JSON response that accounts for each position in the planogram document, whether that position is occupied by a product or a gap.

{
  "matchedResultsPerPosition": [
    {
      "positionId": "01",
      "detectedObject": {
        "id": "01",
        "boundingBox": {
          "x": 12,
          "y": 1234,
          "w": 123,
          "h": 12345
        },
        "classifications": [
          {
            "confidence": 0.9,
            "label": "Product1"
          }
        ]
      }
    }
  ]
}
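To turn a matching response like this into a simple compliance report, you could compare the label detected at each position with the label you expect there. The following sketch assumes you maintain your own mapping from position IDs to expected labels; that mapping (expected_labels) and the function name are hypothetical and not part of the API response.

```python
def check_positions(match_response, expected_labels):
    """Compare the detected label at each planogram position to the expected one.

    expected_labels is a caller-supplied dict of positionId -> label
    (a hypothetical input, not returned by the service).
    """
    report = {}
    for entry in match_response.get("matchedResultsPerPosition", []):
        position_id = entry["positionId"]
        classifications = entry.get("detectedObject", {}).get("classifications", [])
        if classifications:
            best = max(classifications, key=lambda c: c["confidence"])
            detected = best["label"]
        else:
            detected = None
        expected = expected_labels.get(position_id)
        report[position_id] = {
            "expected": expected,
            "detected": detected,
            "match": detected is not None and detected == expected,
        }
    return report


# Same shape as the sample response, trimmed to the fields used here.
sample_match = {
    "matchedResultsPerPosition": [
        {
            "positionId": "01",
            "detectedObject": {
                "id": "01",
                "boundingBox": {"x": 12, "y": 1234, "w": 123, "h": 12345},
                "classifications": [{"confidence": 0.9, "label": "Product1"}],
            },
        }
    ]
}

for position, row in check_positions(sample_match, {"01": "Product1"}).items():
    print(position, row)
```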

Limitations

Product Recognition is only available in the East US and West US 2 Azure regions.
Shelf images can be up to 20 MB in size. The recommended size is 4 MB.
We recommend stitching and rectifying shelf images before you upload them for analysis.
Using a custom model is optional in Product Recognition, but it's required for the planogram matching function.
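The size limits above can be checked on the client before an image is uploaded. This is a minimal sketch; the function name and return values are illustrative, and it assumes 1 MB means 1,048,576 bytes, which the limits above do not specify.

```python
import os

MB = 1024 * 1024              # assumption: binary megabytes
MAX_BYTES = 20 * MB           # stated upper limit for shelf images
RECOMMENDED_BYTES = 4 * MB    # stated recommended size


def classify_image_size(num_bytes):
    """Classify an image size against the stated limits."""
    if num_bytes > MAX_BYTES:
        return "too_large"
    if num_bytes > RECOMMENDED_BYTES:
        return "above_recommended"
    return "ok"


def check_image_file(path):
    """Classify a file on disk by its size in bytes."""
    return classify_image_size(os.path.getsize(path))


print(classify_image_size(3 * MB))
print(classify_image_size(10 * MB))
print(classify_image_size(25 * MB))
```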

Next steps

Get started with Product Recognition by trying out the stitching and rectification APIs. Then do basic analysis with the Product Understanding API.