Extract key phrases

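Key phrase extraction is a feature of Azure Language. It identifies the key phrases or main concepts in a piece of text.

There are several ways to call the key phrase extraction API. Here, you use the azure_ai extension to extract key phrases from within a SQL query.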

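Prerequisites

You need an Azure Database for PostgreSQL flexible server with the azure_ai extension enabled and configured. You also need to authorize the extension with Azure Cognitive Services by setting the key and endpoint of your language resource.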
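
As a minimal sketch of that setup, assuming the azure_ai extension is already allow-listed on the server, the following statements enable the extension in the current database and store the language resource's endpoint and key. The endpoint and key shown are placeholders, not real values.

```sql
-- Enable the extension in the current database
-- (assumes it is already allow-listed on the server).
CREATE EXTENSION IF NOT EXISTS azure_ai;

-- Placeholder values: substitute the endpoint and key of your own language resource.
SELECT azure_ai.set_setting('azure_cognitive.endpoint', 'https://<your-resource>.cognitiveservices.azure.com');
SELECT azure_ai.set_setting('azure_cognitive.subscription_key', '<your-key>');

-- Read a setting back to confirm it was stored.
SELECT azure_ai.get_setting('azure_cognitive.endpoint');
```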

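Scenarios

Key phrase extraction is useful for a variety of tasks:

  • Summarization: use key phrases to condense lengthy documents into their core topics, for example to identify the topics discussed in an audio transcript or in meeting notes.
  • Content categorization: use key phrases to index documents for search and browsing. Key phrases can also be used to visualize documents in a word cloud.
  • Document clustering: use key phrases to cluster and analyze large volumes of support tickets, product reviews, and other unstructured input.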

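Use key phrase extraction in SQL with Azure Cognitive Services

The azure_ai extension for Azure Database for PostgreSQL flexible server provides user-defined functions (UDFs) that give you access to AI features directly from within SQL. The key phrase extraction API is available through the azure_cognitive.extract_key_phrases function.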

azure_cognitive.extract_key_phrases(
 text TEXT,
 language TEXT,
 timeout_ms INTEGER DEFAULT 3600000,
 throw_on_error BOOLEAN DEFAULT TRUE,
 disable_service_logs BOOLEAN DEFAULT FALSE
)

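The required parameters are text, which is the input, and language, which is the language that text is written in. For example, en-us is US English and fr is French. For the full list of available languages, see Language support.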

By default, key phrase extraction stops if it has not completed within 3,600,000 milliseconds, which is 1 hour. Change timeout_ms to customize this limit.
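
To illustrate, a call with a shorter timeout might look like the following sketch. It uses PostgreSQL named notation with the parameter name from the signature above. The 5-second value is an arbitrary choice for illustration.

```sql
-- Pass text and language positionally, then override timeout_ms by name.
SELECT azure_cognitive.extract_key_phrases(
    'The food was delicious and the staff were wonderful.',
    'en-us',
    timeout_ms => 5000  -- illustrative value: 5 seconds instead of the 1-hour default
);
```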

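If an error occurs, the default behavior is to throw an exception, which causes the transaction to roll back. You can disable this behavior by setting throw_on_error to false.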
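
A sketch of switching off exceptions, again using named notation, is shown here. The input text is a made-up example. What the function returns for a failed call is not described in this unit, so inspect the result before relying on it.

```sql
-- Keep the default timeout, but do not raise an exception if the call fails.
SELECT azure_cognitive.extract_key_phrases(
    'The room was clean and the view was great.',  -- hypothetical input text
    'en-us',
    throw_on_error => false
);
```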

For the full parameter documentation, see the Azure Cognitive Services extension documentation.

For example, running this query:

SELECT azure_cognitive.extract_key_phrases('The food was delicious and the staff were wonderful.', 'en-us');

returns this result:

 extract_key_phrases 
---------------------
 {food,staff}

You can also use a table column as the input text:

SELECT description, azure_cognitive.extract_key_phrases(description, 'en-us')
FROM listings LIMIT 1;

This returns the following (with \x enabled for expanded display):

description    | Welcome! If you stay here you will be living in a light filled two bedroom upper and ground level apartment (in a two apartment home). During your stay you will be welcome to share in our fresh eggs from the chickens and garden produce in season! Welcome! Come enjoy your time in Seattle at a lovely urban farmstead. There are two bedrooms each with a queen bed, full bath, living room and kitchen with wood floors throughout. During your stay you will be welcome to eat fresh eggs from the chickens and possibly fruit/veggies from the garden if you are in luck! We are family friendly and have a down to earth atmosphere. There is a large covered back porch and grill for hanging out especially in summer and a treehouse for up in the trees hammock time! Walking distance to Othello Light Rail Station for easy access to downtown. Also nearby is the fantastic Seward Park and the Kubota Gardens for outdoorsy loveliness. New last year is out beautiful Rainier Beach indoor swimming pool comp
extract_key_phrases | {"beautiful Rainier Beach indoor swimming pool","large covered back porch","Othello Light Rail Station","ground level apartment","lovely urban farmstead","fantastic Seward Park","two bedroom upper","two apartment home","two bedrooms","fresh eggs","queen bed","full bath","living room","wood floors","earth atmosphere","Walking distance","easy access","Kubota Gardens","outdoorsy loveliness","garden produce","hammock time",stay,chickens,season,Seattle,kitchen,fruit/veggies,luck,grill,summer,treehouse,trees,downtown,last}
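
Because the result is displayed as a PostgreSQL array, the phrases can be expanded into rows and aggregated, which is one way to approach the indexing and clustering scenarios described earlier. The following is a rough sketch, assuming the function returns a text array and reusing the listings table from the example above. The 20-row sample size and the column aliases are arbitrary choices.

```sql
-- Count how often each key phrase appears across a small sample of listings.
-- The inner LIMIT keeps the number of calls to the language service small.
SELECT phrase, count(*) AS occurrences
FROM (SELECT description FROM listings LIMIT 20) AS sample
CROSS JOIN LATERAL unnest(
    azure_cognitive.extract_key_phrases(sample.description, 'en-us')
) AS phrase
GROUP BY phrase
ORDER BY occurrences DESC, phrase
LIMIT 10;
```

Each row of the sample triggers one call to the service, so widening the sample increases both run time and service usage.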

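Summary

Key phrase extraction picks out the main concepts from text. The Azure Cognitive Services language models distill natural language down to key words or phrases. The azure_ai extension for Azure Database for PostgreSQL provides the azure_cognitive.extract_key_phrases API, which lets you access key phrase extraction directly in SQL queries.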