Get started with AI imaging

2025-05-20

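The AI imaging features in Windows AI Foundry support the following capabilities:

Image Super Resolution: scale and sharpen images.
Image Description: generate text that describes an image.
Image Segmentation: identify objects within an image.
Object Erase: remove objects from an image.

For API details, see the API reference for AI imaging features.

For content moderation details, see Content safety with generative AI APIs.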

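Important

The following list shows the Windows AI features and the Windows App SDK release in which each is currently supported.

Version 1.8 Experimental (1.8.0-experimental1) - Object Erase, Phi Silica, LoRA fine-tuning for Phi Silica, Conversation Summary (Text Intelligence)

Private preview - Semantic Search

Version 1.7.1 (1.7.250401001) - All other APIs

These APIs work only on Windows Insider Preview (WIP) devices that have received the May 7 update. An optional update will be released to non-WIP devices on May 28-29, followed by a further update on June 10. This update brings the AI models that the Windows AI APIs need in order to function. After these updates, an app can use the Windows AI APIs only once it has been granted package identity at runtime.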

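What can I do with Image Super Resolution?

The Image Super Resolution APIs enable image sharpening and scaling.

Scaling is limited to a maximum factor of 8x, because higher scale factors can introduce artifacts and compromise image accuracy. If either the final width or the final height is greater than 8 times its original value, an exception is thrown.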
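
The 8x limit can be checked before calling the API, so that an oversized request is rejected without relying on the exception. The following is a minimal sketch in standard C++. The helper name IsSupportedTargetSize and the constant kMaxScaleFactor are hypothetical and are not part of the Windows App SDK.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the Windows App SDK.
// Maximum scale factor described in the text above.
constexpr std::int64_t kMaxScaleFactor = 8;

// Returns true when neither target dimension exceeds 8x the source dimension.
bool IsSupportedTargetSize(std::int32_t srcWidth, std::int32_t srcHeight,
                           std::int32_t dstWidth, std::int32_t dstHeight)
{
    // Non-positive sizes are never valid inputs for scaling.
    if (srcWidth <= 0 || srcHeight <= 0 || dstWidth <= 0 || dstHeight <= 0)
    {
        return false;
    }

    // The multiplication is done in 64 bits so it cannot overflow.
    return dstWidth <= kMaxScaleFactor * srcWidth &&
           dstHeight <= kMaxScaleFactor * srcHeight;
}
```

For a 100 x 50 source image, a target of 800 x 400 passes this check, while 801 x 400 does not.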

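What can I do with Image Description?

Important

Image Description is currently unavailable in China.

The Image Description APIs let you generate various types of text descriptions for an image.

The following types of text descriptions are supported:

Accessibility - Provides a long description with details intended for users with accessibility needs.
Caption - Provides a short description suitable for an image caption. This is the default if no value is specified.
DetailedNarration - Provides a long description.
OfficeCharts - Provides a description suitable for charts and diagrams.

Because these APIs use Machine Learning (ML) models, the generated text occasionally describes the image incorrectly. We therefore do not recommend using these APIs for images in the following scenarios:

Where the image contains potentially sensitive content and an inaccurate description could be controversial, such as flags, maps, globes, cultural symbols, or religious symbols.
Where accurate descriptions are critical, such as medical advice or diagnosis, legal content, or financial documents.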

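Get a text description from an image

The Image Description API takes an image, the desired text description type (optional), and the level of content moderation you want to apply (optional) to protect against harmful use.

The following example shows how to get a text description for an image.

Note

The image must be an ImageBuffer object, because SoftwareBitmap is not currently supported. This example demonstrates how to convert a SoftwareBitmap to an ImageBuffer.

Ensure the Image Description model is available by calling the ImageDescriptionGenerator.GetReadyState method and then waiting for the ImageDescriptionGenerator.EnsureReadyAsync method to return successfully.
Once the Image Description model is available, create an ImageDescriptionGenerator object to reference it.
(Optional) Create a ContentFilterOptions object and specify your preferred values. If you choose to use the default values, you can pass in a null object.
Get the image description (LanguageModelResponse.Response) by calling the ImageDescriptionGenerator.DescribeAsync method with the original image, an enum value for the description type (optional), and the ContentFilterOptions object (optional).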

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.Management.Deployment;  
using Microsoft.Windows.AI;
using Microsoft.Windows.AI.ContentModeration;
using Windows.Storage;
using Windows.Storage.Streams;  
using Windows.Graphics.Imaging;

if (ImageDescriptionGenerator.GetReadyState() == AIFeatureReadyState.EnsureNeeded) 
{
    var result = await ImageDescriptionGenerator.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}

ImageDescriptionGenerator imageDescriptionGenerator = await ImageDescriptionGenerator.CreateAsync();

// Convert already available softwareBitmap to ImageBuffer.
ImageBuffer inputImage = ImageBuffer.CreateCopyFromBitmap(softwareBitmap);  

// Create content moderation thresholds object.
ContentFilterOptions filterOptions = new ContentFilterOptions();
filterOptions.PromptMinSeverityLevelToBlock.ViolentContentSeverity = SeverityLevel.Medium;
filterOptions.ResponseMinSeverityLevelToBlock.ViolentContentSeverity = SeverityLevel.Medium;

// Get text description.
LanguageModelResponse languageModelResponse = await imageDescriptionGenerator.DescribeAsync(inputImage, ImageDescriptionScenario.Caption, filterOptions);
string response = languageModelResponse.Response;

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Microsoft.Windows.AI.ContentSafety.h>
#include <winrt/Microsoft.Windows.AI.h>
#include <winrt/Windows.Foundation.h>
#include <winrt/Windows.Graphics.Imaging.h> 
#include <winrt/Windows.Storage.Streams.h>
#include <winrt/Windows.Storage.h>

using namespace winrt::Microsoft::Graphics::Imaging; 
using namespace winrt::Microsoft::Windows::AI;
using namespace winrt::Microsoft::Windows::AI::ContentSafety; 
using namespace winrt::Microsoft::Windows::AI::Imaging; 
using namespace winrt::Windows::Foundation; 
using namespace winrt::Windows::Graphics::Imaging;
using namespace winrt::Windows::Storage::Streams;
using namespace winrt::Windows::Storage;

if (ImageDescriptionGenerator::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageDescriptionGenerator::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageDescriptionGenerator imageDescriptionGenerator = 
    ImageDescriptionGenerator::CreateAsync().get();

// Convert already available softwareBitmap to ImageBuffer.
auto inputImage = ImageBuffer::CreateForSoftwareBitmap(softwareBitmap);

// Create content moderation thresholds object.

ContentFilterOptions contentFilter{};
contentFilter.PromptMaxAllowedSeverityLevel().Violent(SeverityLevel::Medium);
contentFilter.ResponseMaxAllowedSeverityLevel().Violent(SeverityLevel::Medium);

// Get text description.
auto response = imageDescriptionGenerator.DescribeAsync(inputImage, ImageDescriptionKind::BriefDescription, contentFilter).get();
winrt::hstring text = response.Description();

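What can I do with Image Segmentation?

Image segmentation can be used to identify specific objects in an image. The model takes both an image and a "hints" object and returns a mask of the identified object.

Hints can be provided through any combination of the following:

Coordinates of points that belong to what you're identifying.
Coordinates of points that don't belong to what you're identifying.
A coordinate rectangle that encloses what you're identifying.

The more hints you provide, the more precise the model can be. Follow these hint guidelines to minimize inaccurate results or errors.

Avoid using multiple rectangles in a hint, because they can produce an inaccurate mask.
Avoid using exclude points alone, without include points or a rectangle.
Don't specify more than the supported maximum of 32 coordinates (a point counts as 1, a rectangle as 2), because doing so returns an error.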
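
The hint guidelines above can be checked in code before a hint is submitted. The following is a minimal sketch in standard C++. The PointInt32 and RectInt32 structs are local stand-ins for the WinRT types, and HintCheck, CountHintCoordinates, and CheckHint are hypothetical names that are not part of the Windows App SDK.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins for the WinRT structs so the sketch compiles on its own.
struct PointInt32 { std::int32_t X; std::int32_t Y; };
struct RectInt32 { std::int32_t X; std::int32_t Y; std::int32_t Width; std::int32_t Height; };

// Hypothetical result type: which guideline, if any, a hint violates.
enum class HintCheck { Ok, TooManyCoordinates, OnlyExcludePoints, MultipleRectangles };

// Supported maximum stated in the guidelines above.
constexpr std::size_t kMaxHintCoordinates = 32;

// A point counts as 1 coordinate and a rectangle counts as 2.
std::size_t CountHintCoordinates(const std::vector<RectInt32>& includeRects,
                                 const std::vector<PointInt32>& includePoints,
                                 const std::vector<PointInt32>& excludePoints)
{
    return 2 * includeRects.size() + includePoints.size() + excludePoints.size();
}

// Checks the hard limit first, then the two "avoid" recommendations.
HintCheck CheckHint(const std::vector<RectInt32>& includeRects,
                    const std::vector<PointInt32>& includePoints,
                    const std::vector<PointInt32>& excludePoints)
{
    if (CountHintCoordinates(includeRects, includePoints, excludePoints) > kMaxHintCoordinates)
    {
        return HintCheck::TooManyCoordinates;
    }
    if (!excludePoints.empty() && includePoints.empty() && includeRects.empty())
    {
        return HintCheck::OnlyExcludePoints;
    }
    if (includeRects.size() > 1)
    {
        return HintCheck::MultipleRectangles;
    }
    return HintCheck::Ok;
}
```

Only the 32-coordinate limit is described as returning an error. The other two results correspond to recommendations, so an app might treat them as warnings instead of rejecting the hint.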

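The returned mask is in Gray8 (8-bit grayscale) format. Pixels that belong to the identified object have a value of 255, and all other pixels have a value of 0.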
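
To illustrate how such a mask can be consumed, the following minimal sketch in standard C++ finds the bounding box of the identified object. It assumes the mask pixels have already been copied out of the bitmap into a tightly packed buffer (one byte per pixel, with stride equal to width). MaskBounds and FindMaskBounds are hypothetical names, not part of the Windows App SDK.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: inclusive pixel bounds of the masked object.
struct MaskBounds
{
    bool found;
    int left;
    int top;
    int right;
    int bottom;
};

// Scans a tightly packed Gray8 buffer for pixels equal to 255.
// Assumption: stride == width; a real bitmap buffer may use a larger stride.
MaskBounds FindMaskBounds(const std::vector<std::uint8_t>& gray8, int width, int height)
{
    MaskBounds bounds{false, 0, 0, 0, 0};
    if (width <= 0 || height <= 0 ||
        gray8.size() < static_cast<std::size_t>(width) * static_cast<std::size_t>(height))
    {
        return bounds;
    }

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            if (gray8[static_cast<std::size_t>(y) * width + x] != 255)
            {
                continue; // Background pixel.
            }
            if (!bounds.found)
            {
                bounds = MaskBounds{true, x, y, x, y};
            }
            else
            {
                bounds.left = std::min(bounds.left, x);
                bounds.top = std::min(bounds.top, y);
                bounds.right = std::max(bounds.right, x);
                bounds.bottom = std::max(bounds.bottom, y);
            }
        }
    }
    return bounds;
}
```

When no pixel is set to 255, found stays false, which an app could interpret as the model not having identified an object for the given hints.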

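Identify an object within an image

The following examples show how to identify an object within an image. They assume that you already have a software bitmap object (softwareBitmap) to use as input.

Ensure the Image Segmentation model is available by calling the GetReadyState method and waiting for the EnsureReadyAsync method to return successfully.
Once the Image Segmentation model is available, create an ImageObjectExtractor object to reference it.
Pass the image to ImageObjectExtractor.CreateWithSoftwareBitmapAsync.
Create an ImageObjectExtractorHint object. Other ways to create a hint object with different inputs are demonstrated later.
Submit the hint to the model by using the GetSoftwareBitmapObjectMask method, which returns the final result.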

using System.Collections.Generic;
using Microsoft.Graphics.Imaging;
using Microsoft.Windows.AI;
using Microsoft.Windows.Management.Deployment;
using Windows.Graphics;
using Windows.Graphics.Imaging;

if (ImageObjectExtractor.GetReadyState() == AIFeatureReadyState.EnsureNeeded)
{
    var result = await ImageObjectExtractor.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}

ImageObjectExtractor imageObjectExtractor = await ImageObjectExtractor.CreateWithSoftwareBitmapAsync(softwareBitmap);

ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: null,
    includePoints:
        new List<PointInt32> { new PointInt32(306, 212),
                               new PointInt32(216, 336) },
    excludePoints: null);

SoftwareBitmap finalImage = imageObjectExtractor.GetSoftwareBitmapObjectMask(hint);

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.h>
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Foundation.h>
using namespace winrt;
using namespace winrt::Microsoft::Graphics::Imaging;
using namespace winrt::Microsoft::Windows::AI;
using namespace winrt::Microsoft::Windows::AI::Imaging;
using namespace winrt::Windows::Graphics::Imaging;
using namespace winrt::Windows::Foundation;

if (ImageObjectExtractor::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageObjectExtractor::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageObjectExtractor imageObjectExtractor = ImageObjectExtractor::CreateWithSoftwareBitmapAsync(softwareBitmap).get();

ImageObjectExtractorHint hint(
    {},
    {
        Windows::Graphics::PointInt32{306, 212},        
        Windows::Graphics::PointInt32{216, 336}
    },
    {}
);

Windows::Graphics::Imaging::SoftwareBitmap finalImage = imageObjectExtractor.GetSoftwareBitmapObjectMask(hint);

Specify hints with included and excluded points

This code snippet demonstrates how to use both included and excluded points as hints.

ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: null,
    includePoints: 
        new List<PointInt32> { new PointInt32(150, 90), 
                               new PointInt32(216, 336), 
                               new PointInt32(550, 330)},
    excludePoints: 
        new List<PointInt32> { new PointInt32(306, 212) });

ImageObjectExtractorHint hint(
    {}, 
    { 
        PointInt32{150, 90}, 
        PointInt32{216, 336}, 
        PointInt32{550, 330}
    },
    { 
        PointInt32{306, 212}
    }
);

Specify hints with a rectangle

This code snippet demonstrates how to use a rectangle (RectInt32 takes X, Y, Width, Height) as a hint.

ImageObjectExtractorHint hint = new ImageObjectExtractorHint(
    includeRects: 
        new List<RectInt32> {new RectInt32(370, 278, 285, 126)},
    includePoints: null,
    excludePoints: null );

ImageObjectExtractorHint hint(
    { 
        RectInt32{370, 278, 285, 126}
    }, 
    {},
    {}
);

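What can I do with Object Erase?

Object Erase can be used to remove objects from images. The model takes both an image and a grayscale mask that indicates the object to remove, erases the masked area from the image, and replaces the erased area with the image background.

Remove unwanted objects from an image

The following example shows how to remove an object from an image. It assumes that you already have software bitmap objects (softwareBitmap) for both the image and the mask. The mask must be in Gray8 format, with each pixel of the area to remove set to 255 and all other pixels set to 0.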
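
The mask layout described above can be illustrated with a small sketch in standard C++ that fills a rectangular region with 255 and leaves every other pixel at 0. BuildRectMask is a hypothetical helper, not part of the Windows App SDK, and the resulting byte buffer would still need to be copied into a Gray8 software bitmap before it can be passed to the API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a tightly packed Gray8 mask (one byte per pixel)
// in which the given rectangle is 255 (erase) and everything else is 0 (keep).
// The rectangle is clipped to the image bounds.
std::vector<std::uint8_t> BuildRectMask(int width, int height,
                                        int rectX, int rectY,
                                        int rectWidth, int rectHeight)
{
    if (width <= 0 || height <= 0)
    {
        return {};
    }

    std::vector<std::uint8_t> mask(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0);

    // 64-bit arithmetic keeps rectX + rectWidth from overflowing.
    const std::int64_t x0 = std::max<std::int64_t>(rectX, 0);
    const std::int64_t y0 = std::max<std::int64_t>(rectY, 0);
    const std::int64_t x1 = std::min<std::int64_t>(static_cast<std::int64_t>(rectX) + rectWidth, width);
    const std::int64_t y1 = std::min<std::int64_t>(static_cast<std::int64_t>(rectY) + rectHeight, height);

    // x1 and y1 are exclusive; an empty or fully clipped rectangle writes nothing.
    for (std::int64_t y = y0; y < y1; ++y)
    {
        for (std::int64_t x = x0; x < x1; ++x)
        {
            mask[static_cast<std::size_t>(y * width + x)] = 255;
        }
    }
    return mask;
}
```

A rectangular mask is the simplest case. A mask returned by image segmentation already uses the same 255 and 0 convention.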

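Ensure the Object Erase model is available by calling the GetReadyState method and waiting for the EnsureReadyAsync method to return successfully.
Once the Object Erase model is available, create an ImageObjectRemover object to reference it.
Finally, submit the image and the mask to the model by using the RemoveFromSoftwareBitmap method, which returns the final result.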

using Microsoft.Graphics.Imaging;
using Microsoft.Windows.AI;
using Microsoft.Windows.Management.Deployment;
using Windows.Graphics.Imaging;

if (ImageObjectRemover.GetReadyState() == AIFeatureReadyState.EnsureNeeded)
{
    var result = await ImageObjectRemover.EnsureReadyAsync();
    if (result.Status != PackageDeploymentStatus.CompletedSuccess)
    {
        throw result.ExtendedError;
    }
}
ImageObjectRemover imageObjectRemover = await ImageObjectRemover.CreateAsync();
SoftwareBitmap finalImage = imageObjectRemover.RemoveFromSoftwareBitmap(imageBitmap, maskBitmap); // Insert your own imagebitmap and maskbitmap

#include <winrt/Microsoft.Graphics.Imaging.h>
#include <winrt/Microsoft.Windows.AI.h>
#include <winrt/Microsoft.Windows.AI.Imaging.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Foundation.h>
using namespace winrt;
using namespace winrt::Microsoft::Graphics::Imaging;
using namespace winrt::Microsoft::Windows::AI;
using namespace winrt::Microsoft::Windows::AI::Imaging;
using namespace winrt::Windows::Graphics::Imaging;
using namespace winrt::Windows::Foundation;

if (ImageObjectRemover::GetReadyState() == AIFeatureReadyState::NotReady)
{
    auto loadResult = ImageObjectRemover::EnsureReadyAsync().get();

    if (loadResult.Status() != AIFeatureReadyResultState::Success)
    {
        throw winrt::hresult_error(loadResult.ExtendedError());
    }
}

ImageObjectRemover imageObjectRemover = ImageObjectRemover::CreateAsync().get();
// Insert your own imagebitmap and maskbitmap
Windows::Graphics::Imaging::SoftwareBitmap buffer = 
    imageObjectRemover.RemoveFromSoftwareBitmap(imageBitmap, maskBitmap);

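Responsible AI

We have used a combination of the following steps to ensure that these imaging APIs are trustworthy, secure, and built responsibly. We recommend reviewing the best practices described in Responsible Generative AI Development on Windows when implementing AI features in your app.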