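Image Analysis cognitive skill
The Image Analysis skill extracts a rich set of visual features based on the image content. For example, you can generate a caption from an image, generate tags, or identify celebrities and landmarks. This article is the reference documentation for the Image Analysis skill. For usage instructions, see Extract text and information from images.
This skill uses the machine learning models provided by Azure AI Vision in Azure AI services. Image Analysis only works on images that meet the following requirements:
- The image must be in JPEG, PNG, GIF, or BMP format
- The file size of the image must be less than 4 MB
- The dimensions of the image must be greater than 50 x 50 pixels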
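The three requirements above can be checked before an image is sent through the pipeline. The following is a minimal sketch, not part of the skill itself: the function name `check_image_requirements` is hypothetical, the caller is assumed to already know the image format, file size, and pixel dimensions, and "4 MB" is assumed to mean 4 * 1024 * 1024 bytes.

```python
# Hypothetical pre-check that mirrors the three documented requirements.
SUPPORTED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}
MAX_FILE_SIZE_BYTES = 4 * 1024 * 1024  # assumption: 4 MB read as 4 * 1024 * 1024 bytes
MIN_DIMENSION_PX = 50


def check_image_requirements(image_format, size_bytes, width, height):
    """Return the list of violated requirements; an empty list means the image qualifies."""
    problems = []
    if image_format.upper() not in SUPPORTED_FORMATS:
        problems.append("unsupported format")
    # The file size must be strictly less than the limit.
    if size_bytes >= MAX_FILE_SIZE_BYTES:
        problems.append("file too large")
    # Both dimensions must be strictly greater than 50 pixels.
    if width <= MIN_DIMENSION_PX or height <= MIN_DIMENSION_PX:
        problems.append("dimensions too small")
    return problems


if __name__ == "__main__":
    print(check_image_requirements("png", 250_000, 500, 300))
    print(check_image_requirements("tiff", 5 * 1024 * 1024, 50, 50))
```

Returning a list of problems rather than a single boolean makes it easier to log why a particular image was skipped.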
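This skill is implemented using the AI Image Analysis API version 3.2. If your solution requires calling a newer version of the service API (such as version 4.0), consider implementing it through a Web API custom skill.
Note
This skill is bound to Azure AI services and requires a billable resource for transactions that exceed 20 documents per indexer per day. Execution of built-in skills is charged at the existing Azure AI services pay-as-you-go price.
In addition, image extraction is billable by Azure AI Search.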
@odata.type
Microsoft.Skills.Vision.ImageAnalysisSkill
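Skill parameters
Parameters are case-sensitive.
Parameter name | Description |
---|---|
defaultLanguageCode | A string indicating the language to return. The service returns recognition results in the specified language. If this parameter isn't specified, the default value is "en". The supported languages are a subset of the generally available languages of Azure AI Vision. When a language newly becomes generally available in the AI Vision service, expect a delay before it's fully integrated into this skill. |
visualFeatures | An array of strings indicating the visual feature types to return. Valid visual feature types include adult, brands, categories, description, faces, objects, and tags. Support for each visual feature varies by defaultLanguageCode. |
details | An array of strings indicating which domain-specific details to return. Valid detail types include celebrities and landmarks. |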
Skill inputs
Input name | Description |
---|---|
image | Complex type. Currently only works with the "/document/normalized_images" field, which the Azure blob indexer produces when imageAction is set to a value other than none. |
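Because the image input depends on imageAction being set on the indexer, it can help to see where that setting lives. The sketch below builds an indexer definition as a Python dictionary and serializes it to JSON. The helper name `build_indexer_definition` and all resource names are hypothetical placeholders; the sketch assumes the blob indexer's `parameters.configuration` section with `imageAction` set to `generateNormalizedImages`, which is one of the values other than none.

```python
import json


def build_indexer_definition(name, data_source, index, skillset):
    """Build an indexer definition that asks the blob indexer to emit normalized images.

    All four arguments are placeholder resource names supplied by the caller.
    """
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "skillsetName": skillset,
        "parameters": {
            "configuration": {
                # Extract both text content and metadata from each blob.
                "dataToExtract": "contentAndMetadata",
                # Any value other than "none" produces /document/normalized_images.
                "imageAction": "generateNormalizedImages",
            }
        },
    }


if __name__ == "__main__":
    definition = build_indexer_definition(
        "my-indexer", "my-blob-datasource", "my-index", "my-skillset"
    )
    print(json.dumps(definition, indent=2))
```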
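Skill outputs
Output name | Description |
---|---|
adult | Output is a single adult object of a complex type, consisting of boolean fields (isAdultContent, isGoryContent, isRacyContent) and double-type scores (adultScore, goreScore, racyScore). |
brands | Output is an array of brand objects, where each object is a complex type consisting of name (string) and a confidence score (double). It also returns a rectangle with four bounding box coordinates (x, y, w, h, in pixels) indicating placement inside the image. For the rectangle, x and y are the top left. Bottom left is x, y+h. Top right is x+w, y. Bottom right is x+w, y+h. |
categories | Output is an array of category objects, where each category object is a complex type consisting of a name (string), score (double), and optional detail that contains celebrity or landmark details. For the full list of category names, see the category taxonomy. A detail is a nested complex type. A celebrity detail consists of a name, confidence score, and face bounding box. A landmark detail consists of a name and confidence score. |
description | Output is a single description object of a complex type, consisting of a list of tags and a captions array, where each caption consists of text (string) and confidence (double). |
faces | Complex type consisting of age, gender, and a faceBoundingBox with four bounding box coordinates (in pixels) indicating placement inside the image. The coordinates are top, left, width, height. |
objects | Output is an array of visual feature objects. Each object is a complex type consisting of object (string), confidence (double), rectangle (with four bounding box coordinates indicating placement inside the image), and a parent that contains an object name and confidence. |
tags | Output is an array of imageTag objects, where a tag object is a complex type consisting of name (string), hint (string), and confidence (double). Hints are rare and are only generated when a tag is ambiguous. For example, an image tagged as "curling" might have a hint of "sports" to better indicate its content. |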
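The rectangle returned for brands and objects stores only the top-left corner plus a width and height. As a small illustration of the corner arithmetic described in the brands row, the following sketch derives all four corners. The function name `rectangle_corners` is hypothetical; the input is assumed to be a dictionary with the `x`, `y`, `w`, and `h` keys shown in the sample output.

```python
def rectangle_corners(rect):
    """Return the four corner points (in pixels) of an x/y/w/h rectangle."""
    x, y, w, h = rect["x"], rect["y"], rect["w"], rect["h"]
    return {
        "top_left": (x, y),
        "top_right": (x + w, y),
        "bottom_left": (x, y + h),
        "bottom_right": (x + w, y + h),
    }


if __name__ == "__main__":
    # Rectangle values taken from the brands entry in the sample output.
    print(rectangle_corners({"x": 20, "y": 97, "w": 62, "h": 52}))
```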
Sample skill definition
{
"description": "Extract image analysis.",
"@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
"context": "/document/normalized_images/*",
"defaultLanguageCode": "en",
"visualFeatures": [
"adult",
"brands",
"categories",
"description",
"faces",
"objects",
"tags"
],
"inputs": [
{
"name": "image",
"source": "/document/normalized_images/*"
}
],
"outputs": [
{
"name": "adult"
},
{
"name": "brands"
},
{
"name": "categories"
},
{
"name": "description"
},
{
"name": "faces"
},
{
"name": "objects"
},
{
"name": "tags"
}
]
}
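Sample index
For single objects (such as adult and description), you can structure them in the index as a Collection(Edm.ComplexType) to return the adult and description output for all of them. For more information about mapping outputs to index fields, see Flattening information from complex types.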
{
"fields": [
{
"name": "metadata_storage_name",
"type": "Edm.String",
"key": true,
"searchable": true,
"filterable": false,
"facetable": false,
"sortable": true
},
{
"name": "metadata_storage_path",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false,
"sortable": true
},
{
"name": "content",
"type": "Edm.String",
"sortable": false,
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "adult",
"type": "Edm.ComplexType",
"fields": [
{
"name": "isAdultContent",
"type": "Edm.Boolean",
"searchable": false,
"filterable": true,
"facetable": true
},
{
"name": "isGoryContent",
"type": "Edm.Boolean",
"searchable": false,
"filterable": true,
"facetable": true
},
{
"name": "isRacyContent",
"type": "Edm.Boolean",
"searchable": false,
"filterable": true,
"facetable": true
},
{
"name": "adultScore",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "goreScore",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "racyScore",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
},
{
"name": "brands",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "rectangle",
"type": "Edm.ComplexType",
"fields": [
{
"name": "x",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "y",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "w",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "h",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
},
{
"name": "categories",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "score",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "detail",
"type": "Edm.ComplexType",
"fields": [
{
"name": "celebrities",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "faceBoundingBox",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "x",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "y",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
}
]
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
},
{
"name": "landmarks",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
}
]
},
{
"name": "description",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "tags",
"type": "Collection(Edm.String)",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "captions",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "text",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
},
{
"name": "faces",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "age",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "gender",
"type": "Edm.String",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "faceBoundingBox",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "top",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "left",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "width",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "height",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
},
{
"name": "objects",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "object",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "rectangle",
"type": "Edm.ComplexType",
"fields": [
{
"name": "x",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "y",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "w",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
},
{
"name": "h",
"type": "Edm.Int32",
"searchable": false,
"filterable": false,
"facetable": false
}
]
},
{
"name": "parent",
"type": "Edm.ComplexType",
"fields": [
{
"name": "object",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
},
{
"name": "tags",
"type": "Collection(Edm.ComplexType)",
"fields": [
{
"name": "name",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "hint",
"type": "Edm.String",
"searchable": true,
"filterable": false,
"facetable": false
},
{
"name": "confidence",
"type": "Edm.Double",
"searchable": false,
"filterable": false,
"facetable": false
}
]
}
]
}
Sample output field mapping
The target field can be a complex field or a collection. The index definition specifies any subfields.
"outputFieldMappings": [
{
"sourceFieldName": "/document/normalized_images/*/adult",
"targetFieldName": "adult"
},
{
"sourceFieldName": "/document/normalized_images/*/brands/*",
"targetFieldName": "brands"
},
{
"sourceFieldName": "/document/normalized_images/*/categories/*",
"targetFieldName": "categories"
},
{
"sourceFieldName": "/document/normalized_images/*/description",
"targetFieldName": "description"
},
{
"sourceFieldName": "/document/normalized_images/*/faces/*",
"targetFieldName": "faces"
},
{
"sourceFieldName": "/document/normalized_images/*/objects/*",
"targetFieldName": "objects"
},
{
"sourceFieldName": "/document/normalized_images/*/tags/*",
"targetFieldName": "tags"
}
]
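Variation on output field mappings (nested properties)
You can define output field mappings to lower-level properties, such as just celebrities or landmarks. In this case, make sure your index schema has a field to contain each detail specifically.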
"outputFieldMappings": [
{
"sourceFieldName": "/document/normalized_images/*/categories/detail/celebrities/*",
"targetFieldName": "celebrities"
},
{
"sourceFieldName": "/document/normalized_images/*/categories/detail/landmarks/*",
"targetFieldName": "landmarks"
}
]
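To make the nested path concrete, the sketch below walks a categories array of the shape shown in this article and gathers the names found under `detail`. It is an illustration of where celebrities and landmarks sit in the output, not how the indexer resolves field mappings internally; the function name `collect_detail_names` is hypothetical.

```python
def collect_detail_names(categories, kind):
    """Collect names from categories[*].detail[kind], where kind is
    "celebrities" or "landmarks". Categories without a detail are skipped."""
    names = []
    for category in categories:
        detail = category.get("detail") or {}
        for item in detail.get(kind, []):
            names.append(item["name"])
    return names


if __name__ == "__main__":
    # Trimmed-down categories array in the shape used by the sample output.
    categories = [
        {"name": "abstract_", "score": 0.00390625},
        {
            "name": "people_",
            "score": 0.83984375,
            "detail": {
                "celebrities": [{"name": "Satya Nadella", "confidence": 0.999028444}],
                "landmarks": [],
            },
        },
    ]
    print(collect_detail_names(categories, "celebrities"))
    print(collect_detail_names(categories, "landmarks"))
```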
Sample input
{
"values": [
{
"recordId": "1",
"data": {
"image": {
"data": "BASE64 ENCODED STRING OF A JPEG IMAGE",
"width": 500,
"height": 300,
"originalWidth": 5000,
"originalHeight": 3000,
"rotationFromOriginal": 90,
"contentOffset": 500,
"pageNumber": 2
}
}
}
]
}
Sample output
{
"values": [
{
"recordId": "1",
"data": {
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceBoundingBox": [
{
"x": 273,
"y": 309
},
{
"x": 395,
"y": 309
},
{
"x": 395,
"y": 431
},
{
"x": 273,
"y": 431
}
],
"confidence": 0.999028444
}
],
"landmarks": [ ]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"isGoryContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281,
"goreScore": 0.08928389008070282
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
]
},
"faces": [
{
"age": 44,
"gender": "Male",
"faceBoundingBox": [
{
"x": 1601,
"y": 395
},
{
"x": 1653,
"y": 395
},
{
"x": 1653,
"y": 447
},
{
"x": 1601,
"y": 447
}
]
}
],
"objects": [
{
"rectangle": {
"x": 25,
"y": 43,
"w": 172,
"h": 140
},
"object": "person",
"confidence": 0.931
}
],
"brands":[
{
"name":"Microsoft",
"confidence": 0.903,
"rectangle":{
"x":20,
"y":97,
"w":62,
"h":52
}
}
]
}
}
]
}
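A consumer of this output often needs only a few values from it. The following sketch, which assumes the `data` object has the shape shown in the sample output above, picks the highest-confidence caption, keeps tags at or above a confidence threshold, and converts a faceBoundingBox given as corner points into left, top, width, and height. The function names and the 0.9 threshold are hypothetical choices for illustration.

```python
def best_caption(data):
    """Return the text of the highest-confidence caption, or None if there are none."""
    captions = data.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda caption: caption["confidence"])["text"]


def confident_tags(data, threshold=0.9):
    """Return tag names whose confidence is at or above the threshold, in output order."""
    return [tag["name"] for tag in data.get("tags", []) if tag["confidence"] >= threshold]


def bounding_box_from_points(points):
    """Convert a list of {"x", "y"} corner points into left/top/width/height."""
    xs = [point["x"] for point in points]
    ys = [point["y"] for point in points]
    return {
        "left": min(xs),
        "top": min(ys),
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
    }


if __name__ == "__main__":
    # Values copied from the sample output, trimmed to the fields used here.
    data = {
        "tags": [
            {"name": "person", "confidence": 0.98979085683822632},
            {"name": "window", "confidence": 0.89513939619064331},
        ],
        "description": {
            "captions": [
                {"text": "Satya Nadella sitting on a bench", "confidence": 0.48293603002174407}
            ]
        },
    }
    print(best_caption(data))
    print(confident_tags(data))
```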
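Error cases
In the following error cases, no elements are extracted.
Error code | Description |
---|---|
NotSupportedLanguage | The language provided isn't supported. |
InvalidImageUrl | The image URL is badly formatted or not accessible. |
InvalidImageFormat | The input data isn't a valid image. |
InvalidImageSize | The input image is too large. |
NotSupportedVisualFeature | The specified feature type isn't valid. |
NotSupportedImage | Unsupported image, for example child pornography. |
InvalidDetails | Unsupported domain-specific model. |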
If you get an error similar to "One or more skills are invalid. Details: Error in skill #<num>: Outputs are not supported by skill: Landmarks", check the path. Both celebrities and landmarks are properties under detail.
"categories":[
{
"name":"building_",
"score":0.97265625,
"detail":{
"landmarks":[
{
"name":"Forbidden City",
"confidence":0.92013400793075562
}
]