How to recognize intents with custom entity pattern matching

Article
01/21/2024

The Azure AI services Speech SDK has a built-in feature that provides intent recognition through simple language pattern matching. An intent is something the user wants to do: close a window, mark a checkbox, insert some text, and so on.

In this guide, you use the Speech SDK to develop a console application that derives intents from speech utterances spoken through your device's microphone. You learn how to:

Create a Visual Studio project that references the Speech SDK NuGet package.
Create a speech configuration and get an intent recognizer.
Add intents and patterns by using the Speech SDK API.
Add custom entities by using the Speech SDK API.
Use asynchronous, event-driven continuous recognition.

When to use pattern matching

Use pattern matching if:

You're only interested in matching strictly what the user said. These patterns match more aggressively than conversational language understanding (CLU).
You don't have access to a CLU model, but still want intents.

For more information, see the pattern matching overview.

Prerequisites

Be sure you have the following items before you begin this guide:

An Azure AI services resource or a unified Speech resource.
Visual Studio 2019 (any edition).

Create a project

Create a new C# console application project in Visual Studio 2019 and install the Speech SDK.

Start with some boilerplate code

Open Program.cs and add some code that works as a skeleton for our project.

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Intent;

namespace helloworld
{
    class Program
    {
        static void Main(string[] args)
        {
            IntentPatternMatchingWithMicrophoneAsync().Wait();
        }

        private static async Task IntentPatternMatchingWithMicrophoneAsync()
        {
            var config = SpeechConfig.FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");
        }
    }
}

Create a Speech configuration

Before you can initialize an IntentRecognizer object, you need to create a configuration that uses the key and Azure region for your Azure AI services prediction resource.

Replace "YOUR_SUBSCRIPTION_KEY" with your Azure AI services prediction key.
Replace "YOUR_SUBSCRIPTION_REGION" with your Azure AI services resource region.

This sample uses the FromSubscription() method to build the SpeechConfig. For a full list of available methods, see SpeechConfig Class.

Initialize an IntentRecognizer

Now create an IntentRecognizer. Insert this code right below your Speech configuration.

using (var recognizer = new IntentRecognizer(config))
{
    
}

Add some intents

Note

You can add multiple patterns to a PatternMatchingIntent.

Insert this code inside the using block.

// Creates a Pattern Matching model and adds specific intents from your model. The
// Id is used to identify this model from others in the collection.
var model = new PatternMatchingModel("YourPatternMatchingModelId");

// Creates a pattern that uses groups of optional words. "[Go | Take me]" will match either "Go", "Take me", or "".
var patternWithOptionalWords = "[Go | Take me] to [floor|level] {floorName}";

// Creates a pattern that uses an optional entity and group that could be used to tie commands together.
var patternWithOptionalEntity = "Go to parking [{parkingLevel}]";

// You can also have multiple entities of the same name in a single pattern by appending a unique identifier
// to distinguish between the instances. For example:
var patternWithTwoOfTheSameEntity = "Go to floor {floorName:1} [and then go to floor {floorName:2}]";
// NOTE: Both floorName:1 and floorName:2 are tied to the same list of entries. The identifier can be a string
//       and is separated from the entity name by a ':'

// Creates the pattern matching intents and adds them to the model
model.Intents.Add(new PatternMatchingIntent("ChangeFloors", patternWithOptionalWords, patternWithOptionalEntity, patternWithTwoOfTheSameEntity));
model.Intents.Add(new PatternMatchingIntent("DoorControl", "{action} the doors", "{action} doors", "{action} the door", "{action} door"));

Add some custom entities

To take full advantage of the pattern matcher, you can customize your entities. We make floorName a list of the available floors, and parkingLevel an integer entity.

Insert this code below your intents:

// Creates the "floorName" entity and set it to type list.
// Adds acceptable values. NOTE the default entity type is Any and so we do not need
// to declare the "action" entity.
model.Entities.Add(PatternMatchingEntity.CreateListEntity("floorName", EntityMatchMode.Strict, "ground floor", "lobby", "1st", "first", "one", "1", "2nd", "second", "two", "2"));

// Creates the "parkingLevel" entity as a pre-built integer
model.Entities.Add(PatternMatchingEntity.CreateIntegerEntity("parkingLevel"));

Apply the model to the recognizer

Now apply the model to the IntentRecognizer. Several models can be used at once, so the API takes a collection of models.

Insert this code below your entities:

var modelCollection = new LanguageUnderstandingModelCollection();
modelCollection.Add(model);

recognizer.ApplyLanguageModels(modelCollection);

Recognize an intent

From the IntentRecognizer object, call the RecognizeOnceAsync() method. This method asks the Speech service to recognize speech in a single phrase and to stop recognizing once the phrase is identified.

Insert this code after applying the language models:

Console.WriteLine("Say something...");

var result = await recognizer.RecognizeOnceAsync();

Display the recognition results (or errors)

When the Speech service returns the recognition result, we print it.

Insert this code below var result = await recognizer.RecognizeOnceAsync();.

if (result.Reason == ResultReason.RecognizedIntent)
{
    Console.WriteLine($"RECOGNIZED: Text={result.Text}");
    Console.WriteLine($"       Intent Id={result.IntentId}.");

    var entities = result.Entities;
    switch (result.IntentId)
    {
        case "ChangeFloors":
            if (entities.TryGetValue("floorName", out string floorName))
            {
                Console.WriteLine($"       FloorName={floorName}");
            }

            if (entities.TryGetValue("floorName:1", out floorName))
            {
                Console.WriteLine($"     FloorName:1={floorName}");
            }

            if (entities.TryGetValue("floorName:2", out floorName))
            {
                Console.WriteLine($"     FloorName:2={floorName}");
            }

            if (entities.TryGetValue("parkingLevel", out string parkingLevel))
            {
                Console.WriteLine($"    ParkingLevel={parkingLevel}");
            }

            break;

        case "DoorControl":
            if (entities.TryGetValue("action", out string action))
            {
                Console.WriteLine($"          Action={action}");
            }
            break;
    }
}
else if (result.Reason == ResultReason.RecognizedSpeech)
{
    Console.WriteLine($"RECOGNIZED: Text={result.Text}");
    Console.WriteLine($"    Intent not recognized.");
}
else if (result.Reason == ResultReason.NoMatch)
{
    Console.WriteLine($"NOMATCH: Speech could not be recognized.");
}
else if (result.Reason == ResultReason.Canceled)
{
    var cancellation = CancellationDetails.FromResult(result);
    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

    if (cancellation.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
        Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
    }
}

Check your code

At this point, your code should look like this:

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Intent;

namespace helloworld
{
    class Program
    {
        static void Main(string[] args)
        {
            IntentPatternMatchingWithMicrophoneAsync().Wait();
        }

        private static async Task IntentPatternMatchingWithMicrophoneAsync()
        {
            var config = SpeechConfig.FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");

            using (var recognizer = new IntentRecognizer(config))
            {
                // Creates a Pattern Matching model and adds specific intents from your model. The
                // Id is used to identify this model from others in the collection.
                var model = new PatternMatchingModel("YourPatternMatchingModelId");

                // Creates a pattern that uses groups of optional words. "[Go | Take me]" will match either "Go", "Take me", or "".
                var patternWithOptionalWords = "[Go | Take me] to [floor|level] {floorName}";

                // Creates a pattern that uses an optional entity and group that could be used to tie commands together.
                var patternWithOptionalEntity = "Go to parking [{parkingLevel}]";

                // You can also have multiple entities of the same name in a single pattern by appending a unique identifier
                // to distinguish between the instances. For example:
                var patternWithTwoOfTheSameEntity = "Go to floor {floorName:1} [and then go to floor {floorName:2}]";
                // NOTE: Both floorName:1 and floorName:2 are tied to the same list of entries. The identifier can be a string
                //       and is separated from the entity name by a ':'

                // Adds some intents to look for specific patterns.
                model.Intents.Add(new PatternMatchingIntent("ChangeFloors", patternWithOptionalWords, patternWithOptionalEntity, patternWithTwoOfTheSameEntity));
                model.Intents.Add(new PatternMatchingIntent("DoorControl", "{action} the doors", "{action} doors", "{action} the door", "{action} door"));

                // Creates the "floorName" entity and set it to type list.
                // Adds acceptable values. NOTE the default entity type is Any and so we do not need
                // to declare the "action" entity.
                model.Entities.Add(PatternMatchingEntity.CreateListEntity("floorName", EntityMatchMode.Strict, "ground floor", "lobby", "1st", "first", "one", "1", "2nd", "second", "two", "2"));

                // Creates the "parkingLevel" entity as a pre-built integer
                model.Entities.Add(PatternMatchingEntity.CreateIntegerEntity("parkingLevel"));

                var modelCollection = new LanguageUnderstandingModelCollection();
                modelCollection.Add(model);

                recognizer.ApplyLanguageModels(modelCollection);

                Console.WriteLine("Say something...");

                var result = await recognizer.RecognizeOnceAsync();

                if (result.Reason == ResultReason.RecognizedIntent)
                {
                    Console.WriteLine($"RECOGNIZED: Text={result.Text}");
                    Console.WriteLine($"       Intent Id={result.IntentId}.");

                    var entities = result.Entities;
                    switch (result.IntentId)
                    {
                        case "ChangeFloors":
                            if (entities.TryGetValue("floorName", out string floorName))
                            {
                                Console.WriteLine($"       FloorName={floorName}");
                            }

                            if (entities.TryGetValue("floorName:1", out floorName))
                            {
                                Console.WriteLine($"     FloorName:1={floorName}");
                            }

                            if (entities.TryGetValue("floorName:2", out floorName))
                            {
                                Console.WriteLine($"     FloorName:2={floorName}");
                            }

                            if (entities.TryGetValue("parkingLevel", out string parkingLevel))
                            {
                                Console.WriteLine($"    ParkingLevel={parkingLevel}");
                            }

                            break;

                        case "DoorControl":
                            if (entities.TryGetValue("action", out string action))
                            {
                                Console.WriteLine($"          Action={action}");
                            }
                            break;
                    }
                }
                else if (result.Reason == ResultReason.RecognizedSpeech)
                {
                    Console.WriteLine($"RECOGNIZED: Text={result.Text}");
                    Console.WriteLine($"    Intent not recognized.");
                }
                else if (result.Reason == ResultReason.NoMatch)
                {
                    Console.WriteLine($"NOMATCH: Speech could not be recognized.");
                }
                else if (result.Reason == ResultReason.Canceled)
                {
                    var cancellation = CancellationDetails.FromResult(result);
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                    if (cancellation.Reason == CancellationReason.Error)
                    {
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
                        Console.WriteLine($"CANCELED: Did you set the speech resource key and region values?");
                    }
                }
            }
        }
    }
}

Build and run your app

Now you're ready to build your app and test speech recognition by using the Speech service.

Compile the code. From the Visual Studio menu bar, choose Build > Build Solution.
Start your app. From the menu bar, choose Debug > Start Debugging or press F5.
Start recognition. The app prompts you to say something. The default language is English. Your speech is sent to the Speech service, transcribed as text, and rendered in the console.

For example, if you say "Take me to floor 2", the output should look like this:

Say something...
RECOGNIZED: Text=Take me to floor 2.
       Intent Id=ChangeFloors.
       FloorName=2

Another example: if you say "Take me to floor 7", the output should look like this:

Say something...
RECOGNIZED: Text=Take me to floor 7.
    Intent not recognized.

The intent wasn't recognized because 7 isn't in the list of valid values for floorName.

Create a project

Create a new C++ console application project in Visual Studio 2019 and install the Speech SDK.

Start with some boilerplate code

Open helloworld.cpp and add some code that works as a skeleton for our project.

#include <iostream>
#include <memory>
#include <vector>
#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Intent;

int main()
{
    std::cout << "Hello World!\n";

    auto config = SpeechConfig::FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");
}

Create a Speech configuration

Replace "YOUR_SUBSCRIPTION_KEY" with your Azure AI services prediction key.
Replace "YOUR_SUBSCRIPTION_REGION" with your Azure AI services resource region.

Initialize an IntentRecognizer

Now create an IntentRecognizer. Insert this code right below your Speech configuration.

    auto intentRecognizer = IntentRecognizer::FromConfig(config);

Add some intents

You need to associate some patterns with a PatternMatchingModel and apply it to the IntentRecognizer. We start by creating a PatternMatchingModel and adding a few intents to it. A PatternMatchingIntent is a struct, so we just use the inline syntax.

Note

You can add multiple patterns to a PatternMatchingIntent.

auto model = PatternMatchingModel::FromId("myNewModel");

model->Intents.push_back({"Take me to floor {floorName}.", "Go to floor {floorName}."} , "ChangeFloors");
model->Intents.push_back({"{action} the door."}, "OpenCloseDoor");

Add some custom entities

To take full advantage of the pattern matcher, you can customize your entities. We make floorName a list of the available floors.

model->Entities.push_back({ "floorName" , Intent::EntityType::List, Intent::EntityMatchMode::Strict, {"one", "1", "two", "2", "lobby", "ground floor"} });

Apply the model to the recognizer

std::vector<std::shared_ptr<LanguageUnderstandingModel>> collection;

collection.push_back(model);
intentRecognizer->ApplyLanguageModels(collection);

Recognize an intent

Insert this code after applying the language models:

std::cout << "Say something ..." << std::endl;
auto result = intentRecognizer->RecognizeOnceAsync().get();

Display the recognition results (or errors)

When the Speech service returns the recognition result, we print it.

Insert this code below auto result = intentRecognizer->RecognizeOnceAsync().get();.

switch (result->Reason)
{
case ResultReason::RecognizedSpeech:
    std::cout << "RECOGNIZED: Text = " << result->Text.c_str() << std::endl;
    std::cout << "NO INTENT RECOGNIZED!" << std::endl;
    break;
case ResultReason::RecognizedIntent:
{
    // Braces keep `entities` local to this case, so later case labels don't skip its initialization.
    std::cout << "RECOGNIZED: Text = " << result->Text.c_str() << std::endl;
    std::cout << "  Intent Id = " << result->IntentId.c_str() << std::endl;
    auto entities = result->GetEntities();
    if (entities.find("floorName") != entities.end())
    {
        std::cout << "  Floor name: = " << entities["floorName"].c_str() << std::endl;
    }

    if (entities.find("action") != entities.end())
    {
        std::cout << "  Action: = " << entities["action"].c_str() << std::endl;
    }

    break;
}
case ResultReason::NoMatch:
{
    auto noMatch = NoMatchDetails::FromResult(result);
    switch (noMatch->Reason)
    {
    case NoMatchReason::NotRecognized:
        std::cout << "NOMATCH: Speech was detected, but not recognized." << std::endl;
        break;
    case NoMatchReason::InitialSilenceTimeout:
        std::cout << "NOMATCH: The start of the audio stream contains only silence, and the service timed out waiting for speech." << std::endl;
        break;
    case NoMatchReason::InitialBabbleTimeout:
        std::cout << "NOMATCH: The start of the audio stream contains only noise, and the service timed out waiting for speech." << std::endl;
        break;
    case NoMatchReason::KeywordNotRecognized:
        std::cout << "NOMATCH: Keyword not recognized." << std::endl;
        break;
    }
    break;
}
case ResultReason::Canceled:
{
    auto cancellation = CancellationDetails::FromResult(result);

    if (!cancellation->ErrorDetails.empty())
    {
        std::cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails.c_str() << std::endl;
        std::cout << "CANCELED: Did you set the speech resource key and region values?" << std::endl;
    }
    // Explicit break so this case does not fall through to default.
    break;
}
default:
    break;
}

Check your code

At this point, your code should look like this:

#include <iostream>
#include <memory>
#include <vector>
#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Intent;

int main()
{
    auto config = SpeechConfig::FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");
    auto intentRecognizer = IntentRecognizer::FromConfig(config);

    auto model = PatternMatchingModel::FromId("myNewModel");

    model->Intents.push_back({"Take me to floor {floorName}.", "Go to floor {floorName}."} , "ChangeFloors");
    model->Intents.push_back({"{action} the door."}, "OpenCloseDoor");

    model->Entities.push_back({ "floorName" , Intent::EntityType::List, Intent::EntityMatchMode::Strict, {"one", "1", "two", "2", "lobby", "ground floor"} });

    std::vector<std::shared_ptr<LanguageUnderstandingModel>> collection;

    collection.push_back(model);
    intentRecognizer->ApplyLanguageModels(collection);

    std::cout << "Say something ..." << std::endl;

    auto result = intentRecognizer->RecognizeOnceAsync().get();

    switch (result->Reason)
    {
    case ResultReason::RecognizedSpeech:
        std::cout << "RECOGNIZED: Text = " << result->Text.c_str() << std::endl;
        std::cout << "NO INTENT RECOGNIZED!" << std::endl;
        break;
    case ResultReason::RecognizedIntent:
    {
        // Braces keep `entities` local to this case, so later case labels don't skip its initialization.
        std::cout << "RECOGNIZED: Text = " << result->Text.c_str() << std::endl;
        std::cout << "  Intent Id = " << result->IntentId.c_str() << std::endl;
        auto entities = result->GetEntities();
        if (entities.find("floorName") != entities.end())
        {
            std::cout << "  Floor name: = " << entities["floorName"].c_str() << std::endl;
        }

        if (entities.find("action") != entities.end())
        {
            std::cout << "  Action: = " << entities["action"].c_str() << std::endl;
        }

        break;
    }
    case ResultReason::NoMatch:
    {
        auto noMatch = NoMatchDetails::FromResult(result);
        switch (noMatch->Reason)
        {
        case NoMatchReason::NotRecognized:
            std::cout << "NOMATCH: Speech was detected, but not recognized." << std::endl;
            break;
        case NoMatchReason::InitialSilenceTimeout:
            std::cout << "NOMATCH: The start of the audio stream contains only silence, and the service timed out waiting for speech." << std::endl;
            break;
        case NoMatchReason::InitialBabbleTimeout:
            std::cout << "NOMATCH: The start of the audio stream contains only noise, and the service timed out waiting for speech." << std::endl;
            break;
        case NoMatchReason::KeywordNotRecognized:
            std::cout << "NOMATCH: Keyword not recognized." << std::endl;
            break;
        }
        break;
    }
    case ResultReason::Canceled:
    {
        auto cancellation = CancellationDetails::FromResult(result);

        if (!cancellation->ErrorDetails.empty())
        {
            std::cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails.c_str() << std::endl;
            std::cout << "CANCELED: Did you set the speech resource key and region values?" << std::endl;
        }
        // Explicit break so this case does not fall through to default.
        break;
    }
    default:
        break;
    }
}

Build and run your app

Now you're ready to build your app and test speech recognition by using the Speech service.

Compile the code. From the Visual Studio menu bar, choose Build > Build Solution.
Start your app. From the menu bar, choose Debug > Start Debugging or press F5.
Start recognition. The app prompts you to say something. The default language is English. Your speech is sent to the Speech service, transcribed as text, and rendered in the console.

For example, if you say "Take me to floor 2", the output should look like this:

Say something ...
RECOGNIZED: Text = Take me to floor 2.
  Intent Id = ChangeFloors
  Floor name: = 2

Another example: if you say "Take me to floor 7", the output should look like this:

Say something ...
RECOGNIZED: Text = Take me to floor 7.
NO INTENT RECOGNIZED!

The intent ID is empty because 7 wasn't in our list.

Reference documentation | Additional samples on GitHub

This quickstart describes how to install the Speech SDK for Java.

Platform requirements

Choose your target environment:

Java Runtime
Android

The Speech SDK for Java is compatible with Windows, Linux, and macOS.

On Windows, you must use the 64-bit target architecture. Windows 10 or later is required.

Install the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, and 2022 for your platform. Installing this package for the first time might require a restart.

The Speech SDK for Java doesn't support Windows on ARM64.

Important

This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and plan accordingly. For more information, see the CentOS End Of Life guidance.

The Speech SDK for Java supports the following distributions on the x64, ARM32 (Debian/Ubuntu), and ARM64 (Debian/Ubuntu) architectures:

Ubuntu 18.04/20.04
Debian 10/11
Red Hat Enterprise Linux (RHEL) 7/8
CentOS 7

Important

Use the most recent LTS release of the Linux distribution. For example, if you're using Ubuntu 20.04 LTS, use the latest release of Ubuntu 20.04.X.

The Speech SDK depends on the following Linux system libraries:

The shared libraries of the GNU C library, including the POSIX Threads Programming library, libpthreads.
The OpenSSL library (libssl) version 1.x and certificates (ca-certificates).
The shared library for ALSA applications (libasound).

ca-certificates must also be installed to establish a secure WebSocket connection and avoid the WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED error.

Important

The Speech SDK doesn't yet support OpenSSL 3.0, which is the default in Ubuntu 22.04 and Debian 12.

Run these commands:

sudo apt-get update
sudo apt-get install build-essential libssl-dev ca-certificates libasound2 wget

To use the Speech SDK in Alpine Linux, create a Debian chroot environment as documented in the Alpine Linux Wiki on running glibc programs, and then follow the Debian instructions.

sudo apt-get update
sudo apt-get install build-essential libssl-dev ca-certificates libasound2 wget

On RHEL/CentOS, install the development tools and libraries:

sudo yum update
sudo yum groupinstall "Development tools"
sudo yum install alsa-lib openssl wget

Important

If you're using RHEL/CentOS 7, follow the instructions on how to configure RHEL/CentOS 7 for the Speech SDK.
On RHEL, follow the instructions on how to configure OpenSSL for Linux.

Install a Java Development Kit such as Azul Zulu OpenJDK. The Microsoft Build of OpenJDK or your preferred JDK should also work.

Install the Speech SDK for Java

Some of the instructions use a specific SDK version such as 1.24.2. To check the latest version, search our GitHub repository.

This guide shows how to install the Speech SDK for Java on the Java Runtime.

Supported operating systems

The Speech SDK for Java package is available for these operating systems:

Windows: 64-bit only.
Mac: macOS X version 10.14 or later.
Linux: See the supported Linux distributions and target architectures.

Follow these steps to install the Speech SDK for Java by using Apache Maven:

Install Apache Maven.
Open a command prompt where you want the new project, and create a new pom.xml file.

Copy the following XML content into pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.microsoft.cognitiveservices.speech.samples</groupId>
    <artifactId>quickstart-eclipse</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.7.0</version>
            <configuration>
            <source>1.8</source>
            <target>1.8</target>
            </configuration>
        </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
        <groupId>com.microsoft.cognitiveservices.speech</groupId>
        <artifactId>client-sdk</artifactId>
        <version>1.37.0</version>
        </dependency>
    </dependencies>
</project>

Run the following Maven command to install the Speech SDK and dependencies.
```
mvn clean dependency:copy-dependencies
```

Create an Eclipse project and install the Speech SDK

Install the Eclipse Java IDE. This IDE requires Java to be installed.
Launch Eclipse.
In the Eclipse Launcher, in the Workspace box, enter the name of a new workspace directory. Then select Launch.
In a moment, the main window of the Eclipse IDE appears. If there's a Welcome screen in it, close it.
On the Eclipse menu bar, create a new project by selecting File > New > Project.
The New Project dialog appears. Select Java Project, and then select Next.
The New Java Project wizard starts. In the Project name field, enter quickstart. Choose JavaSE-1.8 as the execution environment. Select Finish.
If the Open Associated Perspective? window appears, select Open Perspective.
In Package Explorer, right-click the quickstart project. Select Configure > Convert to Maven Project from the context menu.
The Create new POM window appears. In the Group Id field, enter com.microsoft.cognitiveservices.speech.samples. In the Artifact Id field, enter quickstart. Then select Finish.
Open the pom.xml file and edit it:
1. At the end of the file, before the closing tag </project>, add a dependencies element with the Speech SDK as a dependency:
```
<dependencies>
  <dependency>
    <groupId>com.microsoft.cognitiveservices.speech</groupId>
    <artifactId>client-sdk</artifactId>
    <version>1.37.0</version>
  </dependency>
</dependencies>
```
1. Save the changes.

Gradle configurations

Gradle configurations require an explicit reference to the .jar dependency extension:

// build.gradle

dependencies {
    implementation group: 'com.microsoft.cognitiveservices.speech', name: 'client-sdk', version: "1.37.0", ext: "jar"
}

Start with some boilerplate code

Open Main.java from the src directory.
Replace the contents of the file with the following code:

import java.util.ArrayList;
import java.util.Dictionary;
import java.util.concurrent.ExecutionException;


import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.intent.*;

public class Main {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        IntentPatternMatchingWithMicrophone();
    }

    public static void IntentPatternMatchingWithMicrophone() throws InterruptedException, ExecutionException {
        SpeechConfig config = SpeechConfig.fromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");
    }
}

Create a Speech configuration

Replace "YOUR_SUBSCRIPTION_KEY" with your Azure AI services prediction key.
Replace "YOUR_SUBSCRIPTION_REGION" with your Azure AI services resource region.

This sample uses the fromSubscription() method to build the SpeechConfig. For a full list of available methods, see SpeechConfig Class.
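
Hard-coding the key in source code makes it easy to leak. As a minimal sketch, assuming the key and region are stored in environment variables with the hypothetical names SPEECH_KEY and SPEECH_REGION (these names and the SpeechSettings helper are not defined by the Speech SDK), the values could be read and validated like this before they are passed to fromSubscription():

```java
import java.util.Map;

// Illustrative helper, not part of the Speech SDK.
final class SpeechSettings {
    // Hypothetical variable names chosen for this sketch; the SDK does not define them.
    static final String KEY_VAR = "SPEECH_KEY";
    static final String REGION_VAR = "SPEECH_REGION";

    final String key;
    final String region;

    private SpeechSettings(String key, String region) {
        this.key = key;
        this.region = region;
    }

    // Reads both settings from the given map (normally System.getenv()) and
    // fails fast if either one is missing or blank.
    static SpeechSettings fromEnvironment(Map<String, String> env) {
        return new SpeechSettings(require(env, KEY_VAR), require(env, REGION_VAR));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set.");
        }
        return value.trim();
    }
}
```

With this helper, the configuration line could become SpeechConfig.fromSubscription(settings.key, settings.region), where settings is the result of SpeechSettings.fromEnvironment(System.getenv()).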

Initialize an IntentRecognizer

Now create an IntentRecognizer. Insert this code right below your Speech configuration. We use a try-with-resources statement so that the recognizer, which is auto-closeable, is closed automatically.

try (IntentRecognizer recognizer = new IntentRecognizer(config)) {

}
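The try-with-resources statement calls close() on the resource when the block exits, whether it exits normally or through an exception. The sketch below demonstrates that ordering with a stand-in class; FakeRecognizer is hypothetical and only records what happens to it, so no Speech SDK call is involved.

```java
import java.util.ArrayList;
import java.util.List;

final class AutoCloseDemo {
    // Hypothetical stand-in for a resource such as IntentRecognizer.
    static final class FakeRecognizer implements AutoCloseable {
        private final List<String> log;

        FakeRecognizer(List<String> log) {
            this.log = log;
            log.add("open");
        }

        void recognize(boolean fail) {
            log.add("recognize");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Runs one recognition and returns the recorded sequence of events.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (FakeRecognizer recognizer = new FakeRecognizer(log)) {
            recognizer.recognize(fail);
        } catch (IllegalStateException e) {
            // close() has already run by the time the catch block executes.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [open, recognize, close]
        System.out.println(run(true));  // [open, recognize, close, caught]
    }
}
```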

Add some intents

Note

You can add multiple patterns to a PatternMatchingIntent.

Insert this code inside the try block:

// Creates a Pattern Matching model and adds specific intents from your model. The
// Id is used to identify this model from others in the collection.
PatternMatchingModel model = new PatternMatchingModel("YourPatternMatchingModelId");

// Creates a pattern that uses groups of optional words. "[Go | Take me]" will match either "Go", "Take me", or "".
String patternWithOptionalWords = "[Go | Take me] to [floor|level] {floorName}";

// Creates a pattern that uses an optional entity and group that could be used to tie commands together.
String patternWithOptionalEntity = "Go to parking [{parkingLevel}]";

// You can also have multiple entities of the same name in a single pattern by appending a unique identifier
// to distinguish between the instances. For example:
String patternWithTwoOfTheSameEntity = "Go to floor {floorName:1} [and then go to floor {floorName:2}]";
// NOTE: Both floorName:1 and floorName:2 are tied to the same list of entries. The identifier can be a string
//       and is separated from the entity name by a ':'

// Creates the pattern matching intents and adds them to the model
model.getIntents().put(new PatternMatchingIntent("ChangeFloors", patternWithOptionalWords, patternWithOptionalEntity, patternWithTwoOfTheSameEntity));
model.getIntents().put(new PatternMatchingIntent("DoorControl", "{action} the doors", "{action} doors", "{action} the door", "{action} door"));
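To see which phrasings a pattern with optional groups covers, it can help to enumerate them. The sketch below is only an illustration of the pattern syntax described in the comments above, not the way the SDK matches speech: it expands each non-nested [a | b] group into its alternatives plus the empty string and leaves {entity} placeholders untouched. The class name PatternExpander is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PatternExpander {
    // Expands non-nested optional groups; "{entity}" placeholders are kept as written.
    static Set<String> expand(String pattern) {
        List<String> partial = new ArrayList<>();
        partial.add("");
        int i = 0;
        while (i < pattern.length()) {
            int open = pattern.indexOf('[', i);
            if (open < 0) {
                // No more groups: the rest of the pattern is literal text.
                partial = append(partial, Collections.singletonList(pattern.substring(i)));
                break;
            }
            int close = pattern.indexOf(']', open);
            if (close < 0) {
                throw new IllegalArgumentException("Unclosed '[' in: " + pattern);
            }
            partial = append(partial, Collections.singletonList(pattern.substring(i, open)));
            List<String> alternatives = new ArrayList<>();
            for (String alternative : pattern.substring(open + 1, close).split("\\|")) {
                alternatives.add(alternative.trim());
            }
            alternatives.add(""); // the whole group may be omitted
            partial = append(partial, alternatives);
            i = close + 1;
        }
        Set<String> result = new LinkedHashSet<>();
        for (String phrase : partial) {
            result.add(phrase.trim().replaceAll("\\s+", " ")); // collapse leftover spaces
        }
        return result;
    }

    // Concatenates every prefix with every suffix.
    private static List<String> append(List<String> prefixes, List<String> suffixes) {
        List<String> combined = new ArrayList<>();
        for (String prefix : prefixes) {
            for (String suffix : suffixes) {
                combined.add(prefix + suffix);
            }
        }
        return combined;
    }

    public static void main(String[] args) {
        expand("[Go | Take me] to [floor|level] {floorName}").forEach(System.out::println);
        expand("Go to parking [{parkingLevel}]").forEach(System.out::println);
    }
}
```

The first pattern expands to nine phrasings, from "Go to floor {floorName}" down to the bare "to {floorName}"; the second expands to two, with and without the parking level.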

Add some custom entities

Insert this code below your intents:

// Creates the "floorName" entity and sets it to type list.
// Adds acceptable values. NOTE the default entity type is Any and so we do not need
// to declare the "action" entity.
model.getEntities().put(PatternMatchingEntity.CreateListEntity("floorName", PatternMatchingEntity.EntityMatchMode.Strict, "ground floor", "lobby", "1st", "first", "one", "1", "2nd", "second", "two", "2"));

// Creates the "parkingLevel" entity as a pre-built integer
model.getEntities().put(PatternMatchingEntity.CreateIntegerEntity("parkingLevel"));
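Each {name} or {name:id} placeholder in a pattern refers to an entity by name, and several placeholders can share one definition. The following sketch, which uses a regular expression rather than any SDK call, lists the placeholders in a pattern together with the entity name each one relies on. The class name EntityReferences is hypothetical, and the parsing rule is inferred from the comments in the intent code above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntityReferences {
    // Matches "{name}" or "{name:id}"; group 1 is the name, group 2 the optional id.
    private static final Pattern REFERENCE =
            Pattern.compile("\\{([^{}:]+)(?::([^{}]+))?\\}");

    // Maps each reference as written in the pattern (for example "floorName:1")
    // to the entity name whose definition it uses (for example "floorName").
    static Map<String, String> find(String pattern) {
        Map<String, String> references = new LinkedHashMap<>();
        Matcher matcher = REFERENCE.matcher(pattern);
        while (matcher.find()) {
            String name = matcher.group(1);
            String instance = matcher.group(2); // null when there is no ":id" suffix
            references.put(instance == null ? name : name + ":" + instance, name);
        }
        return references;
    }

    public static void main(String[] args) {
        // prints {floorName:1=floorName, floorName:2=floorName}
        System.out.println(find("Go to floor {floorName:1} [and then go to floor {floorName:2}]"));
        // prints {action=action}
        System.out.println(find("{action} the doors"));
    }
}
```

Here both floorName:1 and floorName:2 resolve to the single floorName list entity declared above, while action has no declaration and therefore uses the default entity type.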

Apply our model to the Recognizer

Insert this code below your entities:

ArrayList<LanguageUnderstandingModel> modelCollection = new ArrayList<LanguageUnderstandingModel>();
modelCollection.add(model);

recognizer.applyLanguageModels(modelCollection);

Recognize an intent

Insert this code after applying the language models:

System.out.println("Say something...");

IntentRecognitionResult result = recognizer.recognizeOnceAsync().get();
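The call to get() blocks until a result is available. If the object returned by recognizeOnceAsync() is a standard java.util.concurrent.Future, as the declared InterruptedException and ExecutionException suggest, the wait can be bounded with get(timeout, unit). The sketch below shows that technique on stand-in futures only; BoundedWait and its await method are hypothetical names, and no recognizer is involved.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedWait {
    // Waits for the result up to the given limit; returns the fallback on timeout.
    static <T> T await(Future<T> pending, long timeoutMillis, T fallback) {
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; whether the underlying work
            // actually stops depends on the Future implementation.
            pending.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        Future<String> finished = CompletableFuture.completedFuture("Take me to floor 2.");
        Future<String> neverFinishes = new CompletableFuture<>();
        System.out.println(await(finished, 100, "(timed out)"));
        System.out.println(await(neverFinishes, 100, "(timed out)"));
    }
}
```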

Display the recognition results (or errors)

When the Speech service returns the recognition result, we print it.

Insert this code below IntentRecognitionResult result = recognizer.recognizeOnceAsync().get();:

if (result.getReason() == ResultReason.RecognizedSpeech) {
    System.out.println("RECOGNIZED: Text= " + result.getText());
    System.out.println(String.format("%17s", "Intent not recognized."));
}
else if (result.getReason() == ResultReason.RecognizedIntent)
{
    System.out.println("RECOGNIZED: Text= " + result.getText());
    System.out.println(String.format("%17s %s", "Intent Id=", result.getIntentId() + "."));
    Dictionary<String, String> entities = result.getEntities();

    switch (result.getIntentId())
    {
        case "ChangeFloors":
            if (entities.get("floorName") != null) {
                System.out.println(String.format("%17s %s", "FloorName=", entities.get("floorName")));
            }
            if (entities.get("floorName:1") != null) {
                System.out.println(String.format("%17s %s", "FloorName:1=", entities.get("floorName:1")));
            }
            if (entities.get("floorName:2") != null) {
                System.out.println(String.format("%17s %s", "FloorName:2=", entities.get("floorName:2")));
            }
            if (entities.get("parkingLevel") != null) {
                System.out.println(String.format("%17s %s", "ParkingLevel=", entities.get("parkingLevel")));
            }
            break;
        case "DoorControl":
            if (entities.get("action") != null) {
                System.out.println(String.format("%17s %s", "Action=", entities.get("action")));
            }
            break;
    }
}
else if (result.getReason() == ResultReason.NoMatch) {
    System.out.println("NOMATCH: Speech could not be recognized.");
}
else if (result.getReason() == ResultReason.Canceled) {
    CancellationDetails cancellation = CancellationDetails.fromResult(result);
    System.out.println("CANCELED: Reason=" + cancellation.getReason());

    if (cancellation.getReason() == CancellationReason.Error)
    {
        System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
        System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
        System.out.println("CANCELED: Did you update the subscription info?");
    }
}
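The per-key null checks above can also be written as a loop over whatever entities were returned. The sketch below is one possible variant, using a Hashtable as a stand-in for the Dictionary that getEntities() returns; EntityPrinter is a hypothetical name, and the labels it prints are the raw entity keys rather than the capitalized labels used above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Dictionary;
import java.util.Hashtable;
import java.util.List;

final class EntityPrinter {
    // Formats every entity as a right-aligned "name=" label followed by its value.
    static List<String> format(Dictionary<String, String> entities) {
        List<String> names = Collections.list(entities.keys());
        Collections.sort(names); // Dictionary defines no order; sort for stable output
        List<String> lines = new ArrayList<>();
        for (String name : names) {
            lines.add(String.format("%17s %s", name + "=", entities.get(name)));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in data; in the tutorial this would come from result.getEntities().
        Dictionary<String, String> entities = new Hashtable<>();
        entities.put("floorName:1", "lobby");
        entities.put("floorName:2", "2nd");
        format(entities).forEach(System.out::println);
    }
}
```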

Check your code

At this point, your code should look like this:

package quickstart;
import java.util.ArrayList;
import java.util.concurrent.ExecutionException;
import java.util.Dictionary;

import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.intent.*;

public class Main {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        IntentPatternMatchingWithMicrophone();
    }

    public static void IntentPatternMatchingWithMicrophone() throws InterruptedException, ExecutionException {
        SpeechConfig config = SpeechConfig.fromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_SUBSCRIPTION_REGION");
        try (IntentRecognizer recognizer = new IntentRecognizer(config)) {
            // Creates a Pattern Matching model and adds specific intents from your model. The
            // Id is used to identify this model from others in the collection.
            PatternMatchingModel model = new PatternMatchingModel("YourPatternMatchingModelId");

            // Creates a pattern that uses groups of optional words. "[Go | Take me]" will match either "Go", "Take me", or "".
            String patternWithOptionalWords = "[Go | Take me] to [floor|level] {floorName}";

            // Creates a pattern that uses an optional entity and group that could be used to tie commands together.
            String patternWithOptionalEntity = "Go to parking [{parkingLevel}]";

            // You can also have multiple entities of the same name in a single pattern by appending a unique identifier
            // to distinguish between the instances. For example:
            String patternWithTwoOfTheSameEntity = "Go to floor {floorName:1} [and then go to floor {floorName:2}]";
            // NOTE: Both floorName:1 and floorName:2 are tied to the same list of entries. The identifier can be a string
            // and is separated from the entity name by a ':'

            // Creates the pattern matching intents and adds them to the model
            model.getIntents().put(new PatternMatchingIntent("ChangeFloors", patternWithOptionalWords, patternWithOptionalEntity, patternWithTwoOfTheSameEntity));
            model.getIntents().put(new PatternMatchingIntent("DoorControl", "{action} the doors", "{action} doors", "{action} the door", "{action} door"));

            // Creates the "floorName" entity and sets it to type list.
            // Adds acceptable values. NOTE the default entity type is Any and so we do not need
            // to declare the "action" entity.
            model.getEntities().put(PatternMatchingEntity.CreateListEntity("floorName", PatternMatchingEntity.EntityMatchMode.Strict, "ground floor", "lobby", "1st", "first", "one", "1", "2nd", "second", "two", "2"));

            // Creates the "parkingLevel" entity as a pre-built integer
            model.getEntities().put(PatternMatchingEntity.CreateIntegerEntity("parkingLevel"));

            ArrayList<LanguageUnderstandingModel> modelCollection = new ArrayList<LanguageUnderstandingModel>();
            modelCollection.add(model);

            recognizer.applyLanguageModels(modelCollection);

            System.out.println("Say something...");

            IntentRecognitionResult result = recognizer.recognizeOnceAsync().get();

            if (result.getReason() == ResultReason.RecognizedSpeech) {
                System.out.println("RECOGNIZED: Text= " + result.getText());
                System.out.println(String.format("%17s", "Intent not recognized."));
            }
            else if (result.getReason() == ResultReason.RecognizedIntent)
            {
                System.out.println("RECOGNIZED: Text= " + result.getText());
                System.out.println(String.format("%17s %s", "Intent Id=", result.getIntentId() + "."));
                Dictionary<String, String> entities = result.getEntities();

                switch (result.getIntentId())
                {
                    case "ChangeFloors":
                        if (entities.get("floorName") != null) {
                            System.out.println(String.format("%17s %s", "FloorName=", entities.get("floorName")));
                        }
                        if (entities.get("floorName:1") != null) {
                            System.out.println(String.format("%17s %s", "FloorName:1=", entities.get("floorName:1")));
                        }
                        if (entities.get("floorName:2") != null) {
                            System.out.println(String.format("%17s %s", "FloorName:2=", entities.get("floorName:2")));
                        }
                        if (entities.get("parkingLevel") != null) {
                            System.out.println(String.format("%17s %s", "ParkingLevel=", entities.get("parkingLevel")));
                        }
                        break;

                    case "DoorControl":
                        if (entities.get("action") != null) {
                            System.out.println(String.format("%17s %s", "Action=", entities.get("action")));
                        }
                        break;
                }
            }
            else if (result.getReason() == ResultReason.NoMatch) {
                System.out.println("NOMATCH: Speech could not be recognized.");
            }
            else if (result.getReason() == ResultReason.Canceled) {
                CancellationDetails cancellation = CancellationDetails.fromResult(result);
                System.out.println("CANCELED: Reason=" + cancellation.getReason());

                if (cancellation.getReason() == CancellationReason.Error)
                {
                    System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                    System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                    System.out.println("CANCELED: Did you update the subscription info?");
                }
            }
        }
    }
}

Build and run your app

Now you're ready to build your app and test intent recognition using the Speech service and the embedded pattern matcher.

Select the run button in Eclipse or press Ctrl+F11, then watch the output for the "Say something..." prompt. Once it appears, speak your utterance and watch the output.

For example, if you say "Take me to floor 2", this should be the output:

Say something...
RECOGNIZED: Text=Take me to floor 2.
       Intent Id=ChangeFloors.
       FloorName=2

As another example, if you say "Take me to floor 7", this should be the output:

Say something...
RECOGNIZED: Text=Take me to floor 7.
    Intent not recognized.

No intent was recognized because 7 is not in the list of valid values for floorName.
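The effect of the Strict match mode can be pictured as a membership test: the words spoken in the {floorName} position must equal one of the listed values. The sketch below is a simplified model of that idea, not the SDK's matcher; the class name StrictListEntity is hypothetical, and the case-insensitive comparison and stripping of trailing punctuation are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class StrictListEntity {
    private final Set<String> allowed = new HashSet<>();

    StrictListEntity(String... values) {
        for (String value : values) {
            allowed.add(normalize(value));
        }
    }

    // Assumption: compare case-insensitively and ignore trailing punctuation.
    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("[.!?]+$", "");
    }

    // True only when the spoken text equals one of the listed values.
    boolean accepts(String spoken) {
        return allowed.contains(normalize(spoken));
    }

    public static void main(String[] args) {
        StrictListEntity floorName = new StrictListEntity(
                "ground floor", "lobby", "1st", "first", "one", "1",
                "2nd", "second", "two", "2");
        System.out.println(floorName.accepts("2.")); // true
        System.out.println(floorName.accepts("7.")); // false
    }
}
```

To accept floor 7 in the tutorial, the value would have to be added to the list passed to CreateListEntity.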
