Use Azure AI Language text analytics in Fabric with REST API and SynapseML (preview)

Important

This feature is in preview.

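Azure AI Language is an Azure AI service that lets you perform text mining and text analysis with natural language processing (NLP) features.

In this article, you learn how to use the Azure AI Language service directly in Microsoft Fabric to analyze text. By the end of this article, you can:

Detect sentiment labels at the sentence or document level
Identify the language of a given text input
Extract key phrases from text
Identify different entities in text and categorize them into predefined classes or types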

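Prerequisites

Get a Microsoft Fabric subscription. Or, sign up for a free Microsoft Fabric trial.
Sign in to Microsoft Fabric.
Use the experience switcher on the bottom-left side of your home page to switch to Fabric.

Create a new notebook.
Attach your notebook to a lakehouse. On the left side of your notebook, select Add to add an existing lakehouse or create a new one.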

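Note

This article uses the prebuilt AI services that are built into Microsoft Fabric, which handle authentication automatically. You don't need to obtain a separate Azure AI services key; authentication is managed through your Fabric workspace. For more information, see Prebuilt AI models in Fabric (preview).

The code samples in this article use libraries that are preinstalled in Microsoft Fabric notebooks:

SynapseML: preinstalled in Fabric notebooks for machine learning capabilities
PySpark: available by default in Fabric Spark compute
Standard Python libraries: json and uuid are part of the Python standard library

Note

Microsoft Fabric notebooks come with many common libraries preinstalled. The SynapseML library, which provides MLflow integration and text analytics capabilities, is automatically available in the Spark environment.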

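Choose your approach

This article provides two ways to use the Azure AI Language service in Fabric:

REST API approach: direct HTTP calls to the service (recommended for beginners)
SynapseML approach: uses Spark DataFrames for larger-scale processing

Tip

New users should start with the REST API approach, because it's easier to understand and debug. The SynapseML approach is better suited to processing large datasets with Spark.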

REST API
SynapseML

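Set up authentication and endpoints

To set up the connection to the Azure AI Language service for the REST API approach, copy and paste this code into the first cell of your Fabric notebook:

Note

This code uses Fabric's built-in authentication. The get_fabric_env_config function automatically retrieves your workspace credentials and connects to the prebuilt AI service. No API key is required.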

# Get workload endpoints and access token
from synapse.ml.fabric.service_discovery import get_fabric_env_config
from synapse.ml.fabric.token_utils import TokenUtils
import json
import requests

fabric_env_config = get_fabric_env_config().fabric_env_config
auth_header = TokenUtils().get_openai_auth_header()

# Make a RESTful request to the AI service
prebuilt_AI_base_host = fabric_env_config.ml_workload_endpoint + "cognitive/textanalytics/"
print("Workload endpoint for AI service: \n" + prebuilt_AI_base_host)

service_url = prebuilt_AI_base_host + "language/:analyze-text?api-version=2022-05-01"
print("Service URL: \n" + service_url)

auth_headers = {
    "Authorization" : auth_header
}

def print_response(response):
    if response.status_code == 200:
        print(json.dumps(response.json(), indent=2))
    else:
        print(f"Error: {response.status_code}, {response.content}")
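Each REST request later in this article posts a JSON body with the same three top-level fields: kind, parameters, and analysisInput. As a minimal sketch of that shared shape, the following helper assembles such a body from a list of strings. The name build_payload and its arguments are hypothetical and exist only for illustration; they are not part of Fabric or SynapseML.

```python
import json


def build_payload(kind, texts, language=None, **parameters):
    """Build a request body for the analyze-text endpoint.

    Hypothetical helper for illustration; not part of Fabric or SynapseML.
    """
    documents = []
    for index, text in enumerate(texts, start=1):
        document = {"id": str(index), "text": text}
        if language is not None:
            # Language detection requests leave out the language hint.
            document["language"] = language
        documents.append(document)
    return {
        "kind": kind,
        "parameters": {"modelVersion": "latest", **parameters},
        "analysisInput": {"documents": documents},
    }


payload = build_payload(
    "KeyPhraseExtraction",
    ["Dr. Smith has a very modern medical office, and she has great staff."],
    language="en",
)
print(json.dumps(payload, indent=2))
```

In a Fabric notebook, the resulting dictionary can be sent with requests.post(service_url, json=payload, headers=auth_headers), using the names defined in the setup cell.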

Import required libraries

For the SynapseML approach, copy and paste this code into the first cell of your Fabric notebook:

import synapse.ml.core
from synapse.ml.cognitive.language import AnalyzeText
from pyspark.sql.functions import col

# Note: 'spark' and 'display()' are automatically available in Fabric notebooks

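Sentiment analysis

Sentiment analysis detects sentiment labels (such as "negative", "neutral", and "positive") and confidence scores at the sentence and document level. For each document, and for each sentence within it, the service returns a confidence score between 0 and 1 for positive, neutral, and negative sentiment. For the list of enabled languages, see Sentiment Analysis and Opinion Mining language support.

Analyze the sentiment of text

For the REST API approach, copy this code into a new cell in your notebook to analyze the sentiment of a sample text: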

payload = {
    "kind": "SentimentAnalysis",
    "parameters": {
        "modelVersion": "latest",
        # Boolean flag; requests serializes Python True as JSON true
        "opinionMining": True
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "language":"en",
                "text": "The food and service were unacceptable. The concierge was nice, however."
            }
        ]
    }
} 

response = requests.post(service_url, json=payload, headers=auth_headers)


# Output all information of the request process
print_response(response)

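Tip

You can replace the value of the "text" field with your own content to analyze. The service returns sentiment scores and identifies which parts of the text are positive, negative, or neutral.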
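When you analyze several of your own texts in one request, each document needs its own id so that results can be matched back to the inputs. The following sketch uses the standard uuid module for that. The helpers make_documents and sentiment_by_text are hypothetical names, and the response is a stand-in with made-up labels, shaped like the service output, so that the example runs without calling the service.

```python
import uuid


def make_documents(texts, language="en"):
    """Wrap raw strings as request documents, each with a unique id."""
    return [
        {"id": str(uuid.uuid4()), "language": language, "text": text}
        for text in texts
    ]


def sentiment_by_text(documents, response_json):
    """Match each result back to its input text through the document id."""
    text_by_id = {doc["id"]: doc["text"] for doc in documents}
    return {
        text_by_id[result["id"]]: result["sentiment"]
        for result in response_json["results"]["documents"]
    }


documents = make_documents(["What a sad story!", "Staff are friendly and helpful."])
payload = {
    "kind": "SentimentAnalysis",
    "parameters": {"modelVersion": "latest"},
    "analysisInput": {"documents": documents},
}

# Stand-in response with made-up labels (an assumption, not real service output).
# The results are listed in a different order to show that matching relies on ids.
stand_in_response = {
    "results": {
        "documents": [
            {"id": documents[1]["id"], "sentiment": "positive"},
            {"id": documents[0]["id"], "sentiment": "negative"},
        ]
    }
}
print(sentiment_by_text(documents, stand_in_response))
```

Matching by id rather than by position keeps the mapping correct even if the order of results differs from the order of inputs.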

Expected output

When you run the previous code successfully, you should see output similar to the following:

{
  "kind": "SentimentAnalysisResults",
  "results": {
    "documents": [
      {
        "id": "1",
        "sentiment": "negative",
        "confidenceScores": {
          "positive": 0.0,
          "neutral": 0.0,
          "negative": 1.0
        },
        "sentences": [
          {
            "sentiment": "negative",
            "confidenceScores": {
              "positive": 0.0,
              "neutral": 0.0,
              "negative": 1.0
            },
            "offset": 0,
            "length": 40,
            "text": "The food and service were unacceptable. ",
            "targets": [
              {
                "sentiment": "negative",
                "confidenceScores": {
                  "positive": 0.01,
                  "negative": 0.99
                },
                "offset": 4,
                "length": 4,
                "text": "food",
                "relations": [
                  {
                    "relationType": "assessment",
                    "ref": "#/documents/0/sentences/0/assessments/0"
                  }
                ]
              },
              {
                "sentiment": "negative",
                "confidenceScores": {
                  "positive": 0.01,
                  "negative": 0.99
                },
                "offset": 13,
                "length": 7,
                "text": "service",
                "relations": [
                  {
                    "relationType": "assessment",
                    "ref": "#/documents/0/sentences/0/assessments/0"
                  }
                ]
              }
            ],
            "assessments": [
              {
                "sentiment": "negative",
                "confidenceScores": {
                  "positive": 0.01,
                  "negative": 0.99
                },
                "offset": 26,
                "length": 12,
                "text": "unacceptable",
                "isNegated": false
              }
            ]
          },
          {
            "sentiment": "neutral",
            "confidenceScores": {
              "positive": 0.22,
              "neutral": 0.75,
              "negative": 0.04
            },
            "offset": 40,
            "length": 32,
            "text": "The concierge was nice, however.",
            "targets": [
              {
                "sentiment": "positive",
                "confidenceScores": {
                  "positive": 1.0,
                  "negative": 0.0
                },
                "offset": 44,
                "length": 9,
                "text": "concierge",
                "relations": [
                  {
                    "relationType": "assessment",
                    "ref": "#/documents/0/sentences/1/assessments/0"
                  }
                ]
              }
            ],
            "assessments": [
              {
                "sentiment": "positive",
                "confidenceScores": {
                  "positive": 1.0,
                  "negative": 0.0
                },
                "offset": 58,
                "length": 4,
                "text": "nice",
                "isNegated": false
              }
            ]
          }
        ],
        "warnings": []
      }
    ],
    "errors": [],
    "modelVersion": "2025-01-01"
  }
}
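In the output above, each opinion target points to its assessment through a "ref" string such as "#/documents/0/sentences/0/assessments/0", which is a path into the "results" object. As a minimal sketch of how to follow those references, the code below pairs every target with the assessment it links to. The helpers resolve_ref and opinions are hypothetical names, and sample_results is a trimmed copy of the output above that keeps only the fields the sketch reads.

```python
def resolve_ref(results, ref):
    """Follow a '#/documents/0/sentences/0/assessments/0' style path."""
    node = results
    path = ref[2:] if ref.startswith("#/") else ref
    for part in path.split("/"):
        # List levels are addressed by index, object levels by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def opinions(results):
    """Return (target, target sentiment, assessment) triples."""
    triples = []
    for document in results["documents"]:
        for sentence in document["sentences"]:
            for target in sentence.get("targets", []):
                for relation in target["relations"]:
                    if relation["relationType"] == "assessment":
                        assessment = resolve_ref(results, relation["ref"])
                        triples.append(
                            (target["text"], target["sentiment"], assessment["text"])
                        )
    return triples


def _target(text, sentiment, ref):
    return {
        "text": text,
        "sentiment": sentiment,
        "relations": [{"relationType": "assessment", "ref": ref}],
    }


# Trimmed copy of the "results" object from the expected output.
sample_results = {
    "documents": [
        {
            "id": "1",
            "sentences": [
                {
                    "targets": [
                        _target("food", "negative", "#/documents/0/sentences/0/assessments/0"),
                        _target("service", "negative", "#/documents/0/sentences/0/assessments/0"),
                    ],
                    "assessments": [{"text": "unacceptable", "sentiment": "negative"}],
                },
                {
                    "targets": [
                        _target("concierge", "positive", "#/documents/0/sentences/1/assessments/0"),
                    ],
                    "assessments": [{"text": "nice", "sentiment": "positive"}],
                },
            ],
        }
    ]
}

for triple in opinions(sample_results):
    print(triple)
# ('food', 'negative', 'unacceptable')
# ('service', 'negative', 'unacceptable')
# ('concierge', 'positive', 'nice')
```

With a live response, the same function can be called as opinions(response.json()["results"]).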

The SynapseML approach runs the same sentiment analysis over a Spark DataFrame, with one text per row. The following code returns the document-level sentiment label for each row:

df = spark.createDataFrame([
    ("Great atmosphere. Close to plenty of restaurants, hotels, and transit! Staff are friendly and helpful.",),
    ("What a sad story!",)
], ["text"])

model = (AnalyzeText()
        .setTextCol("text")
        .setKind("SentimentAnalysis")
        .setOutputCol("response"))

result = model.transform(df)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("sentiment", col("documents.sentiment"))

display(result.select("text", "sentiment"))

Language detector

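The language detector evaluates the text input of each document and returns a language identifier with a score that indicates the strength of the analysis. This capability is useful for content stores that collect arbitrary text whose language is unknown. For the list of enabled languages, see Supported languages for language detection.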

payload = {
    "kind": "LanguageDetection",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "text": "This is a document written in English."
            }
        ]
    }
}

response = requests.post(service_url, json=payload, headers=auth_headers)

# Output all information of the request process
print_response(response)

Output

{
  "kind": "LanguageDetectionResults",
  "results": {
    "documents": [
      {
        "id": "1",
        "warnings": [],
        "detectedLanguage": {
          "name": "English",
          "iso6391Name": "en",
          "confidenceScore": 0.95
        }
      }
    ],
    "errors": [],
    "modelVersion": "2024-11-01"
  }
}
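The confidenceScore in this output can be used to decide whether to trust a detection. As a minimal sketch, the function below maps each document id to its ISO 639-1 code and returns None when the score falls below a cutoff. The name detected_languages and the default cutoff of 0.8 are assumptions chosen for illustration, not values recommended by the service.

```python
# The response shown above, reduced to the fields the sketch reads.
sample_response = {
    "kind": "LanguageDetectionResults",
    "results": {
        "documents": [
            {
                "id": "1",
                "warnings": [],
                "detectedLanguage": {
                    "name": "English",
                    "iso6391Name": "en",
                    "confidenceScore": 0.95,
                },
            }
        ],
        "errors": [],
    },
}


def detected_languages(response_json, min_confidence=0.8):
    """Map document id to language code, or None for low-confidence results."""
    languages = {}
    for document in response_json["results"]["documents"]:
        detected = document["detectedLanguage"]
        confident = detected["confidenceScore"] >= min_confidence
        languages[document["id"]] = detected["iso6391Name"] if confident else None
    return languages


print(detected_languages(sample_response))
# {'1': 'en'}
print(detected_languages(sample_response, min_confidence=0.99))
# {'1': None}
```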

# SynapseML approach: detect the language of each row in a Spark DataFrame
df = spark.createDataFrame([
    (["Hello world"],),
    (["Bonjour tout le monde", "Hola mundo", "Tumhara naam kya hai?"],),
    (["你好"],),
    (["日本国（にほんこく、にっぽんこく、英"],)
], ["text"])

model = (AnalyzeText()
        .setTextCol("text")
        .setKind("LanguageDetection")
        .setOutputCol("response"))

result = model.transform(df)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("detectedLanguage", col("documents.detectedLanguage.name"))

display(result.select("text", "detectedLanguage"))

Key phrase extractor

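Key phrase extraction evaluates unstructured text and returns a list of key phrases. This capability is useful when you need to quickly identify the main points in a collection of documents. For the list of enabled languages, see Supported languages for key phrase extraction.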

payload = {
    "kind": "KeyPhraseExtraction",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "language":"en",
                "text": "Dr. Smith has a very modern medical office, and she has great staff."
            }
        ]
    }
}

response = requests.post(service_url, json=payload, headers=auth_headers)

# Output all information of the request process
print_response(response)

Output

{
  "kind": "KeyPhraseExtractionResults",
  "results": {
    "documents": [
      {
        "id": "1",
        "keyPhrases": [
          "modern medical office",
          "Dr. Smith",
          "great staff"
        ],
        "warnings": []
      }
    ],
    "errors": [],
    "modelVersion": "2022-10-01"
  }
}
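When a request contains many documents, the per-document keyPhrases lists can be combined to see which phrases recur across the collection. The following sketch counts phrases with collections.Counter, using the response above as input. The function name count_key_phrases is hypothetical, and lowercasing phrases before counting is a design choice made here, not something the service does.

```python
from collections import Counter

# The response shown above, reduced to the fields the sketch reads.
sample_response = {
    "kind": "KeyPhraseExtractionResults",
    "results": {
        "documents": [
            {
                "id": "1",
                "keyPhrases": ["modern medical office", "Dr. Smith", "great staff"],
                "warnings": [],
            }
        ],
        "errors": [],
    },
}


def count_key_phrases(response_json):
    """Count how often each key phrase appears across all documents."""
    counts = Counter()
    for document in response_json["results"]["documents"]:
        # Lowercase so that "Great Staff" and "great staff" count together.
        counts.update(phrase.lower() for phrase in document["keyPhrases"])
    return counts


counts = count_key_phrases(sample_response)
for phrase, count in counts.most_common():
    print(f"{count}  {phrase}")
```

With a single document every phrase has a count of 1; the counts become informative once the request holds a larger batch.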

# SynapseML approach: extract key phrases from each row in a Spark DataFrame
df = spark.createDataFrame([
    ("en", "Microsoft was founded by Bill Gates and Paul Allen."),
    ("en", "Text Analytics is one of the Azure Cognitive Services."),
    ("en", "My cat might need to see a veterinarian.")
], ["language", "text"])

model = (AnalyzeText()
        .setTextCol("text")
        .setKind("KeyPhraseExtraction")
        .setOutputCol("response"))

result = model.transform(df)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("keyPhrases", col("documents.keyPhrases"))

display(result.select("text", "keyPhrases"))

Named Entity Recognition (NER)

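Named Entity Recognition (NER) identifies different entities in text and categorizes them into predefined classes or types, such as person, location, event, product, and organization. For the list of supported languages, see NER language support.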

payload = {
    "kind": "EntityRecognition",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "language": "en",
                "text": "I had a wonderful trip to Seattle last week."
            }
        ]
    }
}

response = requests.post(service_url, json=payload, headers=auth_headers)

# Output all information of the request process
print_response(response)

Output

{
  "kind": "EntityRecognitionResults",
  "results": {
    "documents": [
      {
        "id": "1",
        "entities": [
          {
            "text": "trip",
            "category": "Event",
            "offset": 18,
            "length": 4,
            "confidenceScore": 0.66
          },
          {
            "text": "Seattle",
            "category": "Location",
            "subcategory": "City",
            "offset": 26,
            "length": 7,
            "confidenceScore": 1.0
          },
          {
            "text": "last week",
            "category": "DateTime",
            "subcategory": "DateRange",
            "offset": 34,
            "length": 9,
            "confidenceScore": 1.0
          }
        ],
        "warnings": []
      }
    ],
    "errors": [],
    "modelVersion": "2025-02-01"
  }
}
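Each entity in this output carries an offset and a length that locate it in the input text. As a minimal sketch, the code below groups the entities by category and checks that each offset and length really slice the reported text out of the input. The function name entities_by_category and the optional min_confidence filter are assumptions for illustration. The check treats offsets as Python string positions, which holds for this ASCII sample but may not hold for text with characters outside the Basic Multilingual Plane or combined characters.

```python
from collections import defaultdict

text = "I had a wonderful trip to Seattle last week."

# The document entry from the response shown above.
document = {
    "id": "1",
    "entities": [
        {"text": "trip", "category": "Event",
         "offset": 18, "length": 4, "confidenceScore": 0.66},
        {"text": "Seattle", "category": "Location", "subcategory": "City",
         "offset": 26, "length": 7, "confidenceScore": 1.0},
        {"text": "last week", "category": "DateTime", "subcategory": "DateRange",
         "offset": 34, "length": 9, "confidenceScore": 1.0},
    ],
    "warnings": [],
}


def entities_by_category(text, document, min_confidence=0.0):
    """Group entity spans by category, verifying offsets against the input."""
    grouped = defaultdict(list)
    for entity in document["entities"]:
        if entity["confidenceScore"] < min_confidence:
            continue
        start = entity["offset"]
        span = text[start:start + entity["length"]]
        if span != entity["text"]:
            raise ValueError(f"offset mismatch for {entity['text']!r}")
        grouped[entity["category"]].append(span)
    return dict(grouped)


print(entities_by_category(text, document))
# {'Event': ['trip'], 'Location': ['Seattle'], 'DateTime': ['last week']}
print(entities_by_category(text, document, min_confidence=0.9))
# {'Location': ['Seattle'], 'DateTime': ['last week']}
```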

# SynapseML approach: recognize named entities in each row of a Spark DataFrame
df = spark.createDataFrame([
    ("en", "Microsoft was founded by Bill Gates and Paul Allen."),
    ("en", "Pike place market is my favorite Seattle attraction.")
], ["language", "text"])

model = (AnalyzeText()
        .setTextCol("text")
        .setKind("EntityRecognition")
        .setOutputCol("response"))

result = model.transform(df)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("entityNames", col("documents.entities.text"))

display(result.select("text", "entityNames"))

Entity linking

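There are no REST API steps in this section.

Entity linking identifies and disambiguates the identity of entities found in text. For example, in the sentence "We went to Seattle last week.", the service identifies "Seattle" and provides a link to more information on Wikipedia. For the list of enabled languages, see Supported languages for entity linking.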

df = spark.createDataFrame([
    ("en", "Microsoft was founded by Bill Gates and Paul Allen."),
    ("en", "Pike place market is my favorite Seattle attraction.")
], ["language", "text"])

model = (AnalyzeText()
        .setTextCol("text")
        .setKind("EntityLinking")
        .setOutputCol("response"))

result = model.transform(df)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("entityNames", col("documents.entities.name"))

display(result)

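Related content

Use prebuilt Text Analytics in Fabric with SynapseML
Use prebuilt Azure AI Translator in Fabric with REST API
Use prebuilt Azure AI Translator in Fabric with SynapseML
Use prebuilt Azure OpenAI in Fabric with REST API
Use prebuilt Azure OpenAI in Fabric with Python SDK
Use prebuilt Azure OpenAI in Fabric with SynapseML
SynapseML GitHub repository - Source code and documentation for SynapseML
Azure AI Language documentation - Complete reference for the Azure AI Language service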


Last updated on 2025-08-28