Call the Image Analysis 4.0 Analyze API
This article demonstrates how to call the Image Analysis 4.0 API to return information about an image's visual features. It also shows you how to parse the returned information.
This guide assumes you've followed the steps in the quickstart. This means:
- You've created a Computer Vision resource and obtained a key and endpoint URL.
- You've installed the appropriate SDK package and have a working quickstart application, which you can modify using the code examples here.
To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
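The code samples in this guide read the key and endpoint from those two environment variables. One way to define them is shown below; this is a minimal sketch using POSIX shell syntax (on Windows, use set or setx instead), and the values are placeholders you replace with your own resource's endpoint and key.

```shell
# Define the environment variables the samples read.
# The values below are placeholders; substitute your resource's values.
export VISION_ENDPOINT="https://<your-resource-name>.cognitiveservices.azure.com/"
export VISION_KEY="<your-key>"
```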
Start by creating an ImageAnalysisClient object. For example:
string endpoint = Environment.GetEnvironmentVariable("VISION_ENDPOINT");
string key = Environment.GetEnvironmentVariable("VISION_KEY");
// Create an Image Analysis client.
ImageAnalysisClient client = new ImageAnalysisClient(
new Uri(endpoint),
new AzureKeyCredential(key));
You can select an image by providing a publicly accessible image URL, or by passing binary data to the SDK. See Image requirements for supported image formats.
Create a Uri object for the image you want to analyze.
Uri imageURL = new Uri("https://aka.ms/azsdk/image-analysis/sample.jpg");
Alternatively, you can pass the image data to the SDK through a BinaryData object. For example, read from a local image file you want to analyze.
using FileStream stream = new FileStream("sample.jpg", FileMode.Open);
BinaryData imageData = BinaryData.FromStream(stream);
The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.
Important
The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.
VisualFeatures visualFeatures =
VisualFeatures.Caption |
VisualFeatures.DenseCaptions |
VisualFeatures.Objects |
VisualFeatures.Read |
VisualFeatures.Tags |
VisualFeatures.People |
VisualFeatures.SmartCrops;
Use an ImageAnalysisOptions object to specify various options for the Analyze Image API call.
- Language: You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
- Gender neutral captions: If you're extracting captions or dense captions (using VisualFeatures.Caption or VisualFeatures.DenseCaptions), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
- Crop aspect ratio: An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SmartCrops was selected as part of the visual feature list. If you select VisualFeatures.SmartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio of its choosing. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
ImageAnalysisOptions options = new ImageAnalysisOptions {
GenderNeutralCaption = true,
Language = "en",
SmartCropsAspectRatios = new float[] { 0.9F, 1.33F }};
This section shows you how to make an analysis call to the service.
Call the Analyze method on the ImageAnalysisClient object, as shown here. The call is synchronous, and blocks execution until the service returns the results or an error occurs. Alternatively, you can call the non-blocking AnalyzeAsync method.
Use the input objects created in the previous sections. To analyze from an image buffer instead of a URL, replace imageURL in the method call with the imageData variable.
ImageAnalysisResult result = client.Analyze(
imageURL,
visualFeatures,
options);
The following code shows you how to parse the results of the various Analyze operations.
Console.WriteLine("Image analysis results:");
// Print caption results to the console
Console.WriteLine(" Caption:");
Console.WriteLine($" '{result.Caption.Text}', Confidence {result.Caption.Confidence:F4}");
// Print dense caption results to the console
Console.WriteLine(" Dense Captions:");
foreach (DenseCaption denseCaption in result.DenseCaptions.Values)
{
Console.WriteLine($" '{denseCaption.Text}', Confidence {denseCaption.Confidence:F4}, Bounding box {denseCaption.BoundingBox}");
}
// Print object detection results to the console
Console.WriteLine(" Objects:");
foreach (DetectedObject detectedObject in result.Objects.Values)
{
Console.WriteLine($" '{detectedObject.Tags.First().Name}', Bounding box {detectedObject.BoundingBox.ToString()}");
}
// Print text (OCR) analysis results to the console
Console.WriteLine(" Read:");
foreach (DetectedTextBlock block in result.Read.Blocks)
foreach (DetectedTextLine line in block.Lines)
{
Console.WriteLine($" Line: '{line.Text}', Bounding Polygon: [{string.Join(" ", line.BoundingPolygon)}]");
foreach (DetectedTextWord word in line.Words)
{
Console.WriteLine($" Word: '{word.Text}', Confidence {word.Confidence.ToString("#.####")}, Bounding Polygon: [{string.Join(" ", word.BoundingPolygon)}]");
}
}
// Print tags results to the console
Console.WriteLine(" Tags:");
foreach (DetectedTag tag in result.Tags.Values)
{
Console.WriteLine($" '{tag.Name}', Confidence {tag.Confidence:F4}");
}
// Print people detection results to the console
Console.WriteLine(" People:");
foreach (DetectedPerson person in result.People.Values)
{
Console.WriteLine($" Person: Bounding box {person.BoundingBox.ToString()}, Confidence {person.Confidence:F4}");
}
// Print smart-crops analysis results to the console
Console.WriteLine(" SmartCrops:");
foreach (CropRegion cropRegion in result.SmartCrops.Values)
{
Console.WriteLine($" Aspect ratio: {cropRegion.AspectRatio}, Bounding box: {cropRegion.BoundingBox}");
}
// Print metadata
Console.WriteLine(" Metadata:");
Console.WriteLine($" Model: {result.ModelVersion}");
Console.WriteLine($" Image width: {result.Metadata.Width}");
Console.WriteLine($" Image height: {result.Metadata.Height}");
When you interact with Image Analysis using the .NET SDK, any response from the service that doesn't have a 200 (success) status code results in an exception being thrown. For example, if you try to analyze an image that isn't accessible because of a broken URL, the service returns a 400 status code indicating a bad request, and a corresponding exception is thrown.
In the following snippet, errors are handled gracefully by catching the exception and displaying additional information about the error.
var imageUrl = new Uri("https://some-host-name.com/non-existing-image.jpg");
try
{
var result = client.Analyze(imageUrl, VisualFeatures.Caption);
}
catch (RequestFailedException e)
{
if (e.Status != 200)
{
Console.WriteLine("Error analyzing image.");
Console.WriteLine($"HTTP status code {e.Status}: {e.Message}");
}
else
{
throw;
}
}
You can learn more about how to enable SDK logging here.
This guide assumes you've followed the steps of the quickstart. This means:
- You've created a Computer Vision resource and obtained a key and endpoint URL.
- You've installed the appropriate SDK package and have a working quickstart application. You can modify this quickstart application based on the code examples here.
To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
Start by creating an ImageAnalysisClient object using one of the constructors. For example:
client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key)
)
You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.
You can use the following sample image URL.
# Define image URL
image_url = "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png"
Alternatively, you can pass in the image as a bytes object. For example, read from a local image file you want to analyze.
# Load image to analyze into a 'bytes' object
with open("sample.jpg", "rb") as f:
image_data = f.read()
The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.
Important
The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.
visual_features =[
VisualFeatures.TAGS,
VisualFeatures.OBJECTS,
VisualFeatures.CAPTION,
VisualFeatures.DENSE_CAPTIONS,
VisualFeatures.READ,
VisualFeatures.SMART_CROPS,
VisualFeatures.PEOPLE,
]
The following code calls the analyze_from_url method on the client with the features you selected above and the other options defined below. To analyze from an image buffer instead of a URL, call the analyze method instead, with image_data=image_data as the first argument.
# Analyze all visual features from an image stream. This is a synchronous (blocking) call.
result = client.analyze_from_url(
image_url=image_url,
visual_features=visual_features,
smart_crops_aspect_ratios=[0.9, 1.33],
gender_neutral_caption=True,
language="en"
)
An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SMART_CROPS was selected as part of the visual feature list. If you select VisualFeatures.SMART_CROPS but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio of its choosing. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
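You can check the numeric constraint above on the client side before calling the service. The following helper is purely illustrative (it isn't part of the SDK); it mirrors the documented 0.75 to 1.8 range for requested aspect ratios:

```python
def validate_aspect_ratios(ratios):
    """Raise ValueError if any requested smart-crop aspect ratio is
    outside the documented supported range of 0.75 to 1.8 (inclusive)."""
    for ratio in ratios:
        if not 0.75 <= ratio <= 1.8:
            raise ValueError(
                f"Aspect ratio {ratio} is outside the supported range 0.75-1.8")
    return ratios

# Example: 0.9 and 1.33 are valid; a value like 2.5 would raise ValueError.
validate_aspect_ratios([0.9, 1.33])
```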
If you're extracting captions or dense captions (using VisualFeatures.CAPTION or VisualFeatures.DENSE_CAPTIONS), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
The following code shows you how to parse the results from the analyze_from_url or analyze operations.
# Print all analysis results to the console
print("Image analysis results:")
if result.caption is not None:
print(" Caption:")
print(f" '{result.caption.text}', Confidence {result.caption.confidence:.4f}")
if result.dense_captions is not None:
print(" Dense Captions:")
for caption in result.dense_captions.list:
print(f" '{caption.text}', {caption.bounding_box}, Confidence: {caption.confidence:.4f}")
if result.read is not None:
print(" Read:")
for line in result.read.blocks[0].lines:
print(f" Line: '{line.text}', Bounding box {line.bounding_polygon}")
for word in line.words:
print(f" Word: '{word.text}', Bounding polygon {word.bounding_polygon}, Confidence {word.confidence:.4f}")
if result.tags is not None:
print(" Tags:")
for tag in result.tags.list:
print(f" '{tag.name}', Confidence {tag.confidence:.4f}")
if result.objects is not None:
print(" Objects:")
for detected_object in result.objects.list:
print(f" '{detected_object.tags[0].name}', {detected_object.bounding_box}, Confidence: {detected_object.tags[0].confidence:.4f}")
if result.people is not None:
print(" People:")
for person in result.people.list:
print(f" {person.bounding_box}, Confidence {person.confidence:.4f}")
if result.smart_crops is not None:
print(" Smart Cropping:")
for smart_crop in result.smart_crops.list:
print(f" Aspect ratio {smart_crop.aspect_ratio}: Smart crop {smart_crop.bounding_box}")
print(f" Image height: {result.metadata.height}")
print(f" Image width: {result.metadata.width}")
print(f" Model version: {result.model_version}")
The analyze methods raise an HttpResponseError exception for a non-success HTTP status code response from the service. The exception's status_code is the HTTP response status code. The exception's error.message contains a detailed message that lets you diagnose the issue:
try:
result = client.analyze( ... )
except HttpResponseError as e:
print(f"Status code: {e.status_code}")
print(f"Reason: {e.reason}")
print(f"Message: {e.error.message}")
For example, when you provide a wrong authentication key:
Status code: 401
Reason: PermissionDenied
Message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
Or when you provide an image URL that doesn't exist or is not accessible:
Status code: 400
Reason: Bad Request
Message: The provided image url is not accessible.
The client uses the standard Python logging library. The SDK logs HTTP request and response details, which may be useful in troubleshooting. To log to stdout, add the following:
import sys
import logging
# Acquire the logger for this client library. Use 'azure' to affect both
# 'azure.core' and 'azure.ai.vision.imageanalysis' libraries.
logger = logging.getLogger("azure")
# Set the desired logging level. logging.INFO or logging.DEBUG are good options.
logger.setLevel(logging.INFO)
# Direct logging output to stdout (the default):
handler = logging.StreamHandler(stream=sys.stdout)
# Or direct logging output to a file:
# handler = logging.FileHandler(filename = 'sample.log')
logger.addHandler(handler)
# Optional: change the default logging format. Here we add a timestamp.
formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
handler.setFormatter(formatter)
By default, logs redact the values of URL query strings, the values of some HTTP request and response headers (including Ocp-Apim-Subscription-Key, which holds the key), and the request and response payloads. To create logs without redaction, set the method argument logging_enable=True when you create the ImageAnalysisClient, or when you call analyze on the client.
# Create an Image Analysis client with non-redacted logging
client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key),
logging_enable=True
)
Non-redacted logs are generated at log level logging.DEBUG only. Be sure to protect non-redacted logs to avoid compromising security. For more information, see Configure logging in the Azure libraries for Python.
This guide assumes you've followed the steps in the quickstart page. This means:
- You have created a Computer Vision resource and obtained a key and endpoint URL.
- You have the appropriate SDK package installed and you have a running quickstart application. You can modify this quickstart application based on code examples here.
To authenticate with the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
Start by creating an ImageAnalysisClient object. For example:
String endpoint = System.getenv("VISION_ENDPOINT");
String key = System.getenv("VISION_KEY");
if (endpoint == null || key == null) {
System.out.println("Missing environment variable 'VISION_ENDPOINT' or 'VISION_KEY'.");
System.out.println("Set them before running this sample.");
System.exit(1);
}
// Create a synchronous Image Analysis client.
ImageAnalysisClient client = new ImageAnalysisClientBuilder()
.endpoint(endpoint)
.credential(new KeyCredential(key))
.buildClient();
You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.
Create an imageUrl string to hold the publicly accessible URL of the image you want to analyze.
String imageUrl = "https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png";
Alternatively, you can pass in the image as a memory buffer using a BinaryData object. For example, read from a local image file you want to analyze.
BinaryData imageData = BinaryData.fromFile(new File("sample.png").toPath());
The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.
Important
The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.
// visualFeatures: Select one or more visual features to analyze.
List<VisualFeatures> visualFeatures = Arrays.asList(
VisualFeatures.SMART_CROPS,
VisualFeatures.CAPTION,
VisualFeatures.DENSE_CAPTIONS,
VisualFeatures.OBJECTS,
VisualFeatures.PEOPLE,
VisualFeatures.READ,
VisualFeatures.TAGS);
Use an ImageAnalysisOptions object to specify various options for the Analyze API call.
- Language: You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
- Gender neutral captions: If you're extracting captions or dense captions (using VisualFeatures.CAPTION or VisualFeatures.DENSE_CAPTIONS), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
- Crop aspect ratio: An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SMART_CROPS was selected as part of the visual feature list. If you select VisualFeatures.SMART_CROPS but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio of its choosing. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
// Specify analysis options (or set `options` to null for defaults)
ImageAnalysisOptions options = new ImageAnalysisOptions()
.setLanguage("en")
.setGenderNeutralCaption(true)
.setSmartCropsAspectRatios(Arrays.asList(0.9, 1.33, 1.78));
This section shows you how to make an analysis call to the service.
Call the analyzeFromUrl method on the ImageAnalysisClient object, as shown here. The call is synchronous, and blocks until the service returns the results or an error occurs. Alternatively, you can use an ImageAnalysisAsyncClient object instead, and call its analyzeFromUrl method, which is non-blocking.
To analyze from an image buffer instead of a URL, call the analyze method instead, and pass imageData as the first argument.
try {
// Analyze all visual features from an image URL. This is a synchronous (blocking) call.
ImageAnalysisResult result = client.analyzeFromUrl(
imageUrl,
visualFeatures,
options);
printAnalysisResults(result);
} catch (HttpResponseException e) {
System.out.println("Exception: " + e.getClass().getSimpleName());
System.out.println("Status code: " + e.getResponse().getStatusCode());
System.out.println("Message: " + e.getMessage());
} catch (Exception e) {
System.out.println("Message: " + e.getMessage());
}
The following code shows you how to parse the results from the analyzeFromUrl and analyze operations.
// Print all analysis results to the console
public static void printAnalysisResults(ImageAnalysisResult result) {
System.out.println("Image analysis results:");
if (result.getCaption() != null) {
System.out.println(" Caption:");
System.out.println(" \"" + result.getCaption().getText() + "\", Confidence "
+ String.format("%.4f", result.getCaption().getConfidence()));
}
if (result.getDenseCaptions() != null) {
System.out.println(" Dense Captions:");
for (DenseCaption denseCaption : result.getDenseCaptions().getValues()) {
System.out.println(" \"" + denseCaption.getText() + "\", Bounding box "
+ denseCaption.getBoundingBox() + ", Confidence " + String.format("%.4f", denseCaption.getConfidence()));
}
}
if (result.getRead() != null) {
System.out.println(" Read:");
for (DetectedTextLine line : result.getRead().getBlocks().get(0).getLines()) {
System.out.println(" Line: '" + line.getText()
+ "', Bounding polygon " + line.getBoundingPolygon());
for (DetectedTextWord word : line.getWords()) {
System.out.println(" Word: '" + word.getText()
+ "', Bounding polygon " + word.getBoundingPolygon()
+ ", Confidence " + String.format("%.4f", word.getConfidence()));
}
}
}
if (result.getTags() != null) {
System.out.println(" Tags:");
for (DetectedTag tag : result.getTags().getValues()) {
System.out.println(" \"" + tag.getName() + "\", Confidence " + String.format("%.4f", tag.getConfidence()));
}
}
if (result.getObjects() != null) {
System.out.println(" Objects:");
for (DetectedObject detectedObject : result.getObjects().getValues()) {
System.out.println(" \"" + detectedObject.getTags().get(0).getName() + "\", Bounding box "
+ detectedObject.getBoundingBox() + ", Confidence " + String.format("%.4f", detectedObject.getTags().get(0).getConfidence()));
}
}
if (result.getPeople() != null) {
System.out.println(" People:");
for (DetectedPerson person : result.getPeople().getValues()) {
System.out.println(" Bounding box "
+ person.getBoundingBox() + ", Confidence " + String.format("%.4f", person.getConfidence()));
}
}
if (result.getSmartCrops() != null) {
System.out.println(" Crop Suggestions:");
for (CropRegion cropRegion : result.getSmartCrops().getValues()) {
System.out.println(" Aspect ratio "
+ cropRegion.getAspectRatio() + ": Bounding box " + cropRegion.getBoundingBox());
}
}
System.out.println(" Image height = " + result.getMetadata().getHeight());
System.out.println(" Image width = " + result.getMetadata().getWidth());
System.out.println(" Model version = " + result.getModelVersion());
}
The analyze methods throw an HttpResponseException when the service responds with a non-success HTTP status code. The exception's getResponse().getStatusCode() holds the HTTP response status code. The exception's getMessage() contains a detailed message that lets you diagnose the issue:
try {
ImageAnalysisResult result = client.analyze(...);
} catch (HttpResponseException e) {
System.out.println("Exception: " + e.getClass().getSimpleName());
System.out.println("Status code: " + e.getResponse().getStatusCode());
System.out.println("Message: " + e.getMessage());
} catch (Exception e) {
System.out.println("Message: " + e.getMessage());
}
For example, when you provide a wrong authentication key:
Exception: ClientAuthenticationException
Status code: 401
Message: Status code 401, "{"error":{"code":"401","message":"Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource."}}"
Or when you provide an image in a format that isn't recognized:
Exception: HttpResponseException
Status code: 400
Message: Status code 400, "{"error":{"code":"InvalidRequest","message":"Image format is not valid.","innererror":{"code":"InvalidImageFormat","message":"Input data is not a valid image."}}}"
Reviewing the HTTP request sent or response received over the wire to the Image Analysis service can be useful in troubleshooting. The Image Analysis client library supports a built-in console logging framework for temporary debugging purposes. It also supports more advanced logging using the SLF4J interface. For detailed information, see Use logging in the Azure SDK for Java.
The following sections discuss how to enable console logging using the built-in framework.
You can enable console logging of HTTP requests and responses for your entire application by setting the following two environment variables. This change affects every Azure client that supports HTTP request and response logging.
- Set environment variable AZURE_LOG_LEVEL to debug
- Set environment variable AZURE_HTTP_LOG_DETAIL_LEVEL to one of the following values:
Value | Logging level
---|---
none | HTTP request and response logging is disabled.
basic | Logs only URLs, HTTP methods, and the time to finish the request.
headers | Logs everything in basic, plus all request and response headers.
body | Logs everything in basic, plus the request and response body.
body_and_headers | Logs everything in headers and body.
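For example, to turn on the most verbose level for every Azure client in the process, you might set the variables as follows. This is a sketch in POSIX shell syntax; on Windows, use set (current session) or setx (persistent) instead.

```shell
# Enable Azure SDK client logging at debug level, with full HTTP detail
# (all headers and bodies). Affects every Azure client in this process.
export AZURE_LOG_LEVEL=debug
export AZURE_HTTP_LOG_DETAIL_LEVEL=body_and_headers
```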
To enable console logging of HTTP requests and responses for a single client:
- Set environment variable AZURE_LOG_LEVEL to debug
- Add a call to httpLogOptions when building the ImageAnalysisClient:
ImageAnalysisClient client = new ImageAnalysisClientBuilder()
.endpoint(endpoint)
.credential(new KeyCredential(key))
.httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS))
.buildClient();
The enum HttpLogDetailLevel defines the supported logging levels.
By default, when logging, certain HTTP header and query parameter values are redacted. It's possible to override this default by specifying which headers and query parameters are safe to log:
ImageAnalysisClient client = new ImageAnalysisClientBuilder()
.endpoint(endpoint)
.credential(new KeyCredential(key))
.httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS)
.addAllowedHeaderName("safe-to-log-header-name")
.addAllowedQueryParamName("safe-to-log-query-parameter-name"))
.buildClient();
For example, to get a complete un-redacted log of the HTTP request, apply the following:
.httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS)
.addAllowedHeaderName("Ocp-Apim-Subscription-Key")
.addAllowedQueryParamName("features")
.addAllowedQueryParamName("language")
.addAllowedQueryParamName("gender-neutral-caption")
.addAllowedQueryParamName("smartcrops-aspect-ratios")
.addAllowedQueryParamName("model-version"))
Add more allowed header names and query parameters to the above to also get an un-redacted HTTP response. When you share an un-redacted log, make sure it doesn't contain secrets such as your subscription key.
This guide assumes you have followed the steps mentioned in the quickstart. This means:
- You have created a Computer Vision resource and obtained a key and endpoint URL.
- You have the appropriate SDK package installed and you have a running quickstart application. You can modify this quickstart application based on the code examples here.
To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
Start by creating an ImageAnalysisClient object. For example:
// Load the .env file if it exists
require("dotenv").config();
const endpoint = process.env['VISION_ENDPOINT'] || '<your_endpoint>';
const key = process.env['VISION_KEY'] || '<your_key>';
const credential = new AzureKeyCredential(key);
const client = createClient(endpoint, credential);
You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.
You can use the following sample image URL.
const imageUrl = 'https://learn.microsoft.com/azure/ai-services/computer-vision/media/quickstarts/presentation.png';
Alternatively, you can pass in the image as a data array. For example, read from a local image file you want to analyze.
const imagePath = '../sample.jpg';
const imageData = fs.readFileSync(imagePath);
The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the Overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.
Important
The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.
const features = [
'Caption',
'DenseCaptions',
'Objects',
'People',
'Read',
'SmartCrops',
'Tags'
];
The following code calls the Analyze Image API with the features you selected above and the other options defined next. To analyze from an image buffer instead of a URL, replace imageUrl in the method call with imageData.
const result = await client.path('/imageanalysis:analyze').post({
body: {
url: imageUrl
},
queryParameters: {
features: features,
'language': 'en',
'gender-neutral-caption': 'true',
'smartCrops-aspect-ratios': [0.9, 1.33]
},
contentType: 'application/json'
});
An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SmartCrops was selected as part of the visual feature list. If you select VisualFeatures.SmartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio of its choosing. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
If you're extracting captions or dense captions (using VisualFeatures.Caption or VisualFeatures.DenseCaptions), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
The following code shows you how to parse the results of the various analyze operations.
const iaResult = result.body;
console.log(`Model Version: ${iaResult.modelVersion}`);
console.log(`Image Metadata: ${JSON.stringify(iaResult.metadata)}`);
if (iaResult.captionResult) {
console.log(`Caption: ${iaResult.captionResult.text} (confidence: ${iaResult.captionResult.confidence})`);
}
if (iaResult.denseCaptionsResult) {
iaResult.denseCaptionsResult.values.forEach(denseCaption => console.log(`Dense Caption: ${JSON.stringify(denseCaption)}`));
}
if (iaResult.objectsResult) {
iaResult.objectsResult.values.forEach(object => console.log(`Object: ${JSON.stringify(object)}`));
}
if (iaResult.peopleResult) {
iaResult.peopleResult.values.forEach(person => console.log(`Person: ${JSON.stringify(person)}`));
}
if (iaResult.readResult) {
iaResult.readResult.blocks.forEach(block => console.log(`Text Block: ${JSON.stringify(block)}`));
}
if (iaResult.smartCropsResult) {
iaResult.smartCropsResult.values.forEach(smartCrop => console.log(`Smart Crop: ${JSON.stringify(smartCrop)}`));
}
if (iaResult.tagsResult) {
iaResult.tagsResult.values.forEach(tag => console.log(`Tag: ${JSON.stringify(tag)}`));
}
Enabling logging may help uncover useful information about failures. To see a log of HTTP requests and responses, set the AZURE_LOG_LEVEL environment variable to info. Alternatively, logging can be enabled at runtime by calling setLogLevel in the @azure/logger package:
const { setLogLevel } = require("@azure/logger");
setLogLevel("info");
For more detailed instructions on how to enable logs, you can look at the @azure/logger package docs.
This guide assumes you have successfully followed the steps mentioned in the quickstart page. This means:
- You have created a Computer Vision resource and obtained a key and endpoint URL.
- You have successfully made a curl.exe call to the service (or used an alternative tool). You can modify the curl.exe call based on the examples here.
To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL.
Important
If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.
For more information about AI services security, see Authenticate requests to Azure AI services.
The SDK example assumes that you defined the environment variables VISION_KEY
and VISION_ENDPOINT
with your key and endpoint.
Authentication is done by adding the HTTP request header Ocp-Apim-Subscription-Key and setting it to your vision key. The call is made to the URL <endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01
, where <endpoint>
is your unique computer vision endpoint URL. You add query strings based on your analysis options.
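If you prefer to script the REST call instead of using curl, the URL and header construction can be sketched in JavaScript (Node.js 18+, which provides a global fetch). The fallback endpoint and key values below are hypothetical placeholders; in real use they come from your own resource via VISION_ENDPOINT and VISION_KEY.

```javascript
// Read the endpoint and key from environment variables; the fallback
// values are hypothetical placeholders, not real resource names.
const endpoint = process.env.VISION_ENDPOINT || "https://my-resource.cognitiveservices.azure.com";
const key = process.env.VISION_KEY || "<your-key>";

// The analyze URL; query strings are added based on your analysis options.
const analyzeUrl = `${endpoint}/computervision/imageanalysis:analyze` +
  "?api-version=2024-02-01&features=caption,tags";

// Authentication is a single request header carrying your vision key.
const headers = {
  "Ocp-Apim-Subscription-Key": key,
  "Content-Type": "application/json" // the JSON body carries the image URL
};

// Sketch of the call itself (defined here, not executed):
async function analyze(imageUrl) {
  const response = await fetch(analyzeUrl, {
    method: "POST",
    headers,
    body: JSON.stringify({ url: imageUrl })
  });
  return response.json();
}

console.log(analyzeUrl);
```

The same URL and headers work with curl or any other HTTP client; only the header name and endpoint path matter to the service.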
The code in this guide uses remote images referenced by URL. You may want to try different images on your own to see the full capability of the Image Analysis features.
When analyzing a remote image, you specify the image's URL by formatting the request body like this: {"url":"https://learn.microsoft.com/azure/cognitive-services/computer-vision/images/windows-kitchen.jpg"}
. The Content-Type should be application/json
.
To analyze a local image, you'd put the binary image data in the HTTP request body. The Content-Type should be application/octet-stream
or multipart/form-data
.
The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.
Visual features 'Captions' and 'DenseCaptions' are only supported in certain Azure regions: see Region availability.
Note
The REST API uses the terms Smart Crops and Smart Crops Aspect Ratios. The SDK uses the terms Crop Suggestions and Cropping Aspect Ratios. They both refer to the same service operation. Similarly, the REST API uses the term Read for detecting text in the image using Optical Character Recognition (OCR), whereas the SDK uses the term Text for the same operation.
You can specify which features you want to use by setting the URL query parameters of the Analysis 4.0 API. A parameter can have multiple values, separated by commas.
| URL parameter | Value | Description |
|---|---|---|
| features | read | Reads the visible text in the image and outputs it as structured JSON data. |
| features | caption | Describes the image content with a complete sentence in supported languages. |
| features | denseCaptions | Generates detailed captions for up to 10 prominent image regions. |
| features | smartCrops | Finds the rectangle coordinates that would crop the image to a desired aspect ratio while preserving the area of interest. |
| features | objects | Detects various objects within an image, including the approximate location. The objects feature is only available in English. |
| features | tags | Tags the image with a detailed list of words related to the image content. |
| features | people | Detects people appearing in images, including the approximate locations. |
A populated URL might look like this:
<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=tags,read,caption,denseCaptions,smartCrops,objects,people
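If you assemble the query string in code, a small helper keeps the feature list readable. This is a sketch; the function name is hypothetical and <endpoint> stands in for your Computer Vision endpoint URL.

```javascript
// Compose the analyze URL from a list of visual features.
// Multiple feature values are joined with commas, as the API expects.
function buildAnalyzeUrl(endpoint, features, apiVersion = "2024-02-01") {
  return `${endpoint}/computervision/imageanalysis:analyze` +
    `?api-version=${apiVersion}&features=${features.join(",")}`;
}

const url = buildAnalyzeUrl("<endpoint>",
  ["tags", "read", "caption", "denseCaptions", "smartCrops", "objects", "people"]);
console.log(url);
// <endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=tags,read,caption,denseCaptions,smartCrops,objects,people
```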
You can also do image analysis with a custom trained model. To create and train a model, see Create a custom Image Analysis model. Once your model is trained, all you need is the model's name. You do not need to specify visual features if you use a custom model.
To use a custom model, don't use the features query parameter. Instead, set the model-name
parameter to the name of your model as shown here. Replace MyCustomModelName
with your custom model name.
<endpoint>/computervision/imageanalysis:analyze?api-version=2023-02-01&model-name=MyCustomModelName
You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
The language option only applies when you're using the standard model.
The following URL query parameter specifies the language. The default value is en
.
| URL parameter | Value | Description |
|---|---|---|
| language | en | English |
| language | es | Spanish |
| language | ja | Japanese |
| language | pt | Portuguese |
| language | zh | Simplified Chinese |
A populated URL might look like this:
<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=caption&language=en
If you're extracting captions or dense captions, you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
The gender-neutral caption option only applies when you're using the standard model.
Add the optional query string gender-neutral-caption
with values true
or false
(the default).
A populated URL might look like this:
<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=caption&gender-neutral-caption=true
An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when smartCrops was selected as part of the features query parameter. If you select smartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio of its choosing. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
The smart-cropping aspect ratios option only applies when you're using the standard model.
Add the optional query string smartcrops-aspect-ratios
, with one or more aspect ratios separated by a comma.
A populated URL might look like this:
<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=smartCrops&smartcrops-aspect-ratios=0.8,1.2
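Since out-of-range ratios produce a 400 error, you might validate them client-side against the documented 0.75 to 1.8 range before building the query string. A minimal sketch; the helper name is hypothetical:

```javascript
// Validate smart-crop aspect ratios against the supported range [0.75, 1.8]
// and format them for the smartcrops-aspect-ratios query parameter.
function smartCropsParam(ratios) {
  for (const r of ratios) {
    if (r < 0.75 || r > 1.8) {
      throw new RangeError(`Aspect ratio ${r} is outside the allowed range [0.75, 1.8]`);
    }
  }
  return `smartcrops-aspect-ratios=${ratios.join(",")}`;
}

console.log(smartCropsParam([0.8, 1.2])); // smartcrops-aspect-ratios=0.8,1.2
```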
This section shows you how to make an analysis call to the service using the standard model, and get the results.
The service returns a 200
HTTP response, and the body contains the returned data in the form of a JSON string. The following text is an example of a JSON response.
{
"modelVersion": "string",
"captionResult": {
"text": "string",
"confidence": 0.0
},
"denseCaptionsResult": {
"values": [
{
"text": "string",
"confidence": 0.0,
"boundingBox": {
"x": 0,
"y": 0,
"w": 0,
"h": 0
}
}
]
},
"metadata": {
"width": 0,
"height": 0
},
"tagsResult": {
"values": [
{
"name": "string",
"confidence": 0.0
}
]
},
"objectsResult": {
"values": [
{
"id": "string",
"boundingBox": {
"x": 0,
"y": 0,
"w": 0,
"h": 0
},
"tags": [
{
"name": "string",
"confidence": 0.0
}
]
}
]
},
"readResult": {
"blocks": [
{
"lines": [
{
"text": "string",
"boundingPolygon": [
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
}
],
"words": [
{
"text": "string",
"boundingPolygon": [
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
},
{
"x": 0,
"y": 0
}
],
"confidence": 0.0
}
]
}
]
}
]
},
"smartCropsResult": {
"values": [
{
"aspectRatio": 0.0,
"boundingBox": {
"x": 0,
"y": 0,
"w": 0,
"h": 0
}
}
]
},
"peopleResult": {
"values": [
{
"boundingBox": {
"x": 0,
"y": 0,
"w": 0,
"h": 0
},
"confidence": 0.0
}
]
}
}
On error, the Image Analysis service response contains a JSON payload that includes an error code and error message. It may also include other details in the form of an inner error code and message. For example:
{
"error":
{
"code": "InvalidRequest",
"message": "Analyze query is invalid.",
"innererror":
{
"code": "NotSupportedVisualFeature",
"message": "Specified feature type is not valid"
}
}
}
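When handling failures in code, the error payload can be unpacked into a single readable message, as sketched here (the helper name is hypothetical):

```javascript
// Flatten the service's error payload into one readable message,
// including the optional inner error when present.
function describeError(body) {
  const err = body.error || {};
  let message = `${err.code}: ${err.message}`;
  if (err.innererror) {
    message += ` (${err.innererror.code}: ${err.innererror.message})`;
  }
  return message;
}

// Example using the error payload shown above.
const sample = {
  error: {
    code: "InvalidRequest",
    message: "Analyze query is invalid.",
    innererror: {
      code: "NotSupportedVisualFeature",
      message: "Specified feature type is not valid"
    }
  }
};
console.log(describeError(sample));
// InvalidRequest: Analyze query is invalid. (NotSupportedVisualFeature: Specified feature type is not valid)
```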
Following is a list of common errors and their causes. List items are presented in the following format:

- HTTP response code
  - Error code and message in the JSON response
    - [Optional] Inner error code and message in the JSON response
List of common errors:

400 Bad Request

- InvalidRequest - Image URL is badly formatted or not accessible. Make sure the image URL is valid and publicly accessible.
- InvalidRequest - The image size is not allowed to be zero or larger than 20971520 bytes. Reduce the size of the image by compressing and/or resizing it, and resubmit your request.
- InvalidRequest - The feature 'Caption' is not supported in this region. The feature is only supported in specific Azure regions. See Quickstart prerequisites for the list of supported Azure regions.
- InvalidRequest - The provided image content type ... is not supported. The HTTP header Content-Type in the request isn't an allowed type:
  - For an image URL, Content-Type should be application/json.
  - For binary image data, Content-Type should be application/octet-stream or multipart/form-data.
- InvalidRequest - Either 'features' or 'model-name' needs to be specified in the query parameter.
- InvalidRequest - Image format is not valid
  - InvalidImageFormat - Image format is not valid. See the Image requirements section for supported image formats.
- InvalidRequest - Analyze query is invalid
  - NotSupportedVisualFeature - Specified feature type is not valid. Make sure the features query string has a valid value.
  - NotSupportedLanguage - The input language is not supported. Make sure the language query string has a valid value for the selected visual feature, based on the language table above.
  - BadArgument - 'smartcrops-aspect-ratios' aspect ratio is not in allowed range [0.75 to 1.8]

401 PermissionDenied

- 401 - Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.

404 Resource Not Found

- 404 - Resource not found. The service couldn't find the custom model based on the name provided by the model-name query string.
- Explore the concept articles to learn more about each feature.
- Explore the SDK code samples on GitHub:
- See the REST API reference to learn more about the API functionality.