TripPin 第 6 部分 - 架構

文章
12/13/2023

此多部分教學課程涵蓋如何建立Power Query的新數據源延伸模組。本教學課程旨在循序完成，每個課程都是以先前課程中建立的連接器為基礎，以累加方式將新功能新增至您的連接器。

在本課程中，您將會：

定義 REST API 的固定架構
動態設定數據行的數據類型
強制執行數據表結構，以避免因為遺漏數據行而發生轉換錯誤
隱藏結果集中的數據行

OData 服務在標準 REST API 上的優點之一是其 $metadata定義。 $metadata檔描述此服務上找到的數據，包括其所有實體（數據表）和欄位（數據行）的架構。函 OData.Feed 式會使用此架構定義來自動設定數據類型資訊，因此，使用者不會取得所有文字和數位欄位（就像您一 Json.Document樣），而是取得日期、整數、時間等等，以提供更佳的整體用戶體驗。

許多 REST API 無法以程式設計方式判斷其架構。在這些情況下，您必須在連接器中包含架構定義。在這一課，您將定義每個數據表的簡單硬式編碼架構，並在您從服務讀取的數據上強制執行架構。

注意

此處所述的方法應該適用於許多 REST 服務。未來的課程會以遞歸方式在結構化數據行（記錄、清單、數據表）上強制執行架構，並提供範例實作，以程式設計方式從 CSDL 或 JSON 架構檔產生架構數據表。

整體而言，對連接器所傳回的數據強制執行架構具有多個優點，例如：

設定正確的數據類型
移除不需要向使用者顯示的資料列（例如內部識別碼或狀態資訊）
藉由新增回應中可能遺漏的任何數據行，確保每個數據頁都有相同的圖形（REST API 表示字段應該是 Null 的常見方式）

使用 Table.Schema 檢視現有的架構

在上一課中建立的連接器會顯示 TripPin 服務中的三個數據表，AirportsAirlines以及 People。執行下列查詢以檢視 Airlines 資料表：

let
    source = TripPin.Contents(),
    data = source{[Name="Airlines"]}[Data]
in
    data

在結果中，您會看到四個傳回的數據行：

@odata.id
@odata.editLink
AirlineCode
名稱

航空公司沒有架構。

“@odata.*” 資料行是 OData 通訊協定的一部分，而不是您想要或需要向連接器的終端使用者顯示的專案。 AirlineCode 和 Name 是您想要保留的兩個數據行。如果您查看數據表的架構（使用方便的 Table.Schema 函式），您可以看到數據表中的所有數據行都有數據類型 Any.Type。

let
    source = TripPin.Contents(),
    data = source{[Name="Airlines"]}[Data]
in
    Table.Schema(data)

Airlines Table.Schema。

Table.Schema 會傳回數據表中數據行的許多元數據，包括名稱、位置、類型資訊，以及許多進階屬性，例如 Precision、Scale 和 MaxLength。未來的課程將提供設定這些進階屬性的設計模式，但現在您只需要關注已刻有的類型（）、基本類型（TypeNameKind），以及數據行值是否可能是 Null （IsNullable）。

定義簡單的架構數據表

您的架構資料表將由兩個資料列組成：

資料行	詳細資料
名稱	資料行名稱。這必須符合服務所傳回結果中的名稱。
類型	您要設定的 M 資料類型。這可以是基本類型（`text`、、 `datetimenumber`等），或已刻有的類型（`Int64.Type`、`Currency.Type`等等）。

資料表的 Airlines 硬式編碼架構數據表會將其 AirlineCode 和 Name 資料行設定為 text，如下所示：

Airlines = #table({"Name", "Type"}, {
        {"AirlineCode", type text},
        {"Name", type text}
    });

資料表 Airports 有四個您想要保留的欄位（包括其中一種類型 record）：

Airports = #table({"Name", "Type"}, {
        {"IcaoCode", type text},
        {"Name", type text},
        {"IataCode", type text},
        {"Location", type record}
    });

最後，People數據表有七個字段，包括 list （Emails、 AddressInfo）、可為 Null 的數據行（Gender），以及具有已標寫類型的數據行（Concurrency）。

People = #table({"Name", "Type"}, {
        {"UserName", type text},
        {"FirstName", type text},
        {"LastName", type text},
        {"Emails", type list},
        {"AddressInfo", type list},
        {"Gender", type nullable text},
        {"Concurrency", Int64.Type}
    })

SchemaTransformTable 協助程式函式

SchemaTransformTable以下所述的協助程式函式將用來強制執行數據上的架構。它需要以下參數：

參數	類型	Description
table	table	您要強制執行架構的數據表。
schema	table	要從中讀取數據行資訊的架構數據表，具有下列類型： `type table [Name = text, Type = type]`。
enforceSchema	數值	（選擇性）控制函式行為的列舉。預設值（`EnforceSchema.Strict = 1`）可確保輸出資料表符合新增任何遺漏資料行所提供的架構數據表，以及移除額外的數據行。 `EnforceSchema.IgnoreExtraColumns = 2`選項可用來保留結果中的額外數據行。使用時 `EnforceSchema.IgnoreMissingColumns = 3` ，將會忽略遺漏的數據行和額外的數據行。

此函式的邏輯看起來像這樣：

判斷源數據表中是否有任何遺漏的數據行。
判斷是否有任何額外的數據行。
忽略結構化資料列（類型 list為、 record和 table），並將資料行設定為 type any。
使用 Table.TransformColumnTypes 來設定每個數據行類型。
根據它們出現在架構數據表中的順序來重新排序數據行。
使用 Value.ReplaceType 設定數據表本身的類型。

注意

在查詢編輯器中檢視結果時，設定數據表類型的最後一個步驟將會移除Power Query UI 推斷類型資訊的需求。這會移除您在上一個教學課程結尾看到的雙重要求問題。

下列協助程式程式代碼可以複製並貼到您的延伸模組中：

EnforceSchema.Strict = 1;               // Add any missing columns, remove extra columns, set table type
EnforceSchema.IgnoreExtraColumns = 2;   // Add missing columns, do not remove extra columns
EnforceSchema.IgnoreMissingColumns = 3; // Do not add or remove columns

SchemaTransformTable = (table as table, schema as table, optional enforceSchema as number) as table =>
    let
        // Default to EnforceSchema.Strict
        _enforceSchema = if (enforceSchema <> null) then enforceSchema else EnforceSchema.Strict,

        // Applies type transforms to a given table
        EnforceTypes = (table as table, schema as table) as table =>
            let
                map = (t) => if Type.Is(t, type list) or Type.Is(t, type record) or t = type any then null else t,
                mapped = Table.TransformColumns(schema, {"Type", map}),
                omitted = Table.SelectRows(mapped, each [Type] <> null),
                existingColumns = Table.ColumnNames(table),
                removeMissing = Table.SelectRows(omitted, each List.Contains(existingColumns, [Name])),
                primativeTransforms = Table.ToRows(removeMissing),
                changedPrimatives = Table.TransformColumnTypes(table, primativeTransforms)
            in
                changedPrimatives,

        // Returns the table type for a given schema
        SchemaToTableType = (schema as table) as type =>
            let
                toList = List.Transform(schema[Type], (t) => [Type=t, Optional=false]),
                toRecord = Record.FromList(toList, schema[Name]),
                toType = Type.ForRecord(toRecord, false)
            in
                type table (toType),

        // Determine if we have extra/missing columns.
        // The enforceSchema parameter determines what we do about them.
        schemaNames = schema[Name],
        foundNames = Table.ColumnNames(table),
        addNames = List.RemoveItems(schemaNames, foundNames),
        extraNames = List.RemoveItems(foundNames, schemaNames),
        tmp = Text.NewGuid(),
        added = Table.AddColumn(table, tmp, each []),
        expanded = Table.ExpandRecordColumn(added, tmp, addNames),
        result = if List.IsEmpty(addNames) then table else expanded,
        fullList =
            if (_enforceSchema = EnforceSchema.Strict) then
                schemaNames
            else if (_enforceSchema = EnforceSchema.IgnoreMissingColumns) then
                foundNames
            else
                schemaNames & extraNames,

        // Select the final list of columns.
        // These will be ordered according to the schema table.
        reordered = Table.SelectColumns(result, fullList, MissingField.Ignore),
        enforcedTypes = EnforceTypes(reordered, schema),
        withType = if (_enforceSchema = EnforceSchema.Strict) then Value.ReplaceType(enforcedTypes, SchemaToTableType(schema)) else enforcedTypes
    in
        withType;

更新 TripPin 連接器

您現在將會對連接器進行下列變更，以使用新的架構強制執行程序代碼。

定義保存您所有架構定義的主要架構資料表（SchemaTable）。
TripPin.Feed更新、 GetPage和 GetAllPagesByNextLink 以接受 schema 參數。
在中 GetPage強制執行架構。
更新導覽數據表程序代碼，以使用對新函式的GetEntity呼叫來包裝每個數據表，這可讓您在未來更彈性地操作數據表定義。

主要架構數據表

您現在會將架構定義合併成單一數據表，並新增協助程式函式（GetSchemaForEntity），可讓您根據實體名稱來查閱定義（例如， GetSchemaForEntity("Airlines")）。

SchemaTable = #table({"Entity", "SchemaTable"}, {
    {"Airlines", #table({"Name", "Type"}, {
        {"AirlineCode", type text},
        {"Name", type text}
    })},    
    
    {"Airports", #table({"Name", "Type"}, {
        {"IcaoCode", type text},
        {"Name", type text},
        {"IataCode", type text},
        {"Location", type record}
    })},

    {"People", #table({"Name", "Type"}, {
        {"UserName", type text},
        {"FirstName", type text},
        {"LastName", type text},
        {"Emails", type list},
        {"AddressInfo", type list},
        {"Gender", type nullable text},
        {"Concurrency", Int64.Type}
    })}
});

GetSchemaForEntity = (entity as text) as table => try SchemaTable{[Entity=entity]}[SchemaTable] otherwise error "Couldn't find entity: '" & entity &"'";

將架構支援新增至數據函式

您現在會將選擇性 schema 參數新增至 TripPin.Feed、 GetPage和 GetAllPagesByNextLink 函式。這可讓您將架構向下傳遞至分頁函式，其中會套用至您從服務取回的結果。

TripPin.Feed = (url as text, optional schema as table) as table => ...
GetPage = (url as text, optional schema as table) as table => ...
GetAllPagesByNextLink = (url as text, optional schema as table) as table => ...

您也會更新對這些函式的所有呼叫，以確保您已正確傳遞架構。

強制執行架構

實際的架構強制執行將會在您的函式中 GetPage 完成。

GetPage = (url as text, optional schema as table) as table =>
    let
        response = Web.Contents(url, [ Headers = DefaultRequestHeaders ]),        
        body = Json.Document(response),
        nextLink = GetNextLink(body),
        data = Table.FromRecords(body[value]),
        // enforce the schema
        withSchema = if (schema <> null) then SchemaTransformTable(data, schema) else data
    in
        withSchema meta [NextLink = nextLink];

注意

此 GetPage 實作會使用 Table.FromRecords 將 JSON 回應中的記錄清單轉換為數據表。使用 Table.FromRecords 的主要缺點是，它會假設清單中的所有記錄都有相同的欄位集。這適用於 TripPin 服務，因為 OData 記錄會保證包含相同的欄位，但這可能是所有 REST API 的情況。更健全的實作會使用 Table.FromList 和 Table.ExpandRecordColumn 的組合。稍後的教學課程會變更實作，以從架構數據表取得數據行清單，確保 JSON 轉譯期間不會遺失或遺失任何數據行。

新增 GetEntity 函式

函式 GetEntity 會包裝對 TripPin.Feed 的呼叫。它會根據實體名稱查閱架構定義，並建置完整的要求URL。

GetEntity = (url as text, entity as text) as table => 
    let
        fullUrl = Uri.Combine(url, entity),
        schemaTable = GetSchemaForEntity(entity),
        result = TripPin.Feed(fullUrl, schemaTable)
    in
        result;

接著，您將更新函 TripPinNavTable 式以呼叫 GetEntity，而不是內嵌進行所有呼叫。這一點的主要優點是，它可讓您繼續修改實體建置程序代碼，而不需要觸控瀏覽數據表邏輯。

TripPinNavTable = (url as text) as table =>
    let
        entitiesAsTable = Table.FromList(RootEntities, Splitter.SplitByNothing()),
        rename = Table.RenameColumns(entitiesAsTable, {{"Column1", "Name"}}),
        // Add Data as a calculated column
        withData = Table.AddColumn(rename, "Data", each GetEntity(url, [Name]), type table),
        // Add ItemKind and ItemName as fixed text values
        withItemKind = Table.AddColumn(withData, "ItemKind", each "Table", type text),
        withItemName = Table.AddColumn(withItemKind, "ItemName", each "Table", type text),
        // Indicate that the node should not be expandable
        withIsLeaf = Table.AddColumn(withItemName, "IsLeaf", each true, type logical),
        // Generate the nav table
        navTable = Table.ToNavigationTable(withIsLeaf, {"Name"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;

融會貫通

進行所有程式代碼變更之後，編譯並重新執行呼叫 Table.Schema Airlines 數據表的測試查詢。

let
    source = TripPin.Contents(),
    data = source{[Name="Airlines"]}[Data]
in
    Table.Schema(data)

您現在會看到 Airlines 資料表在其架構中只定義兩個資料列：

具有架構的航空公司。

如果您針對人員資料表執行相同的程式代碼...

let
    source = TripPin.Contents(),
    data = source{[Name="People"]}[Data]
in
    Table.Schema(data)

您會看到您使用的已指定類型（Int64.Type）也已正確設定。

使用架構人員。

請務必注意的是，的這個實SchemaTransformTable作不會修改和 record 資料行的類型list，但 Emails 和 AddressInfo 資料行仍會型別為 list。這是因為 Json.Document 會將 JSON 陣列正確對應至 M 清單，並將 JSON 物件對應至 M 記錄。如果您要在 Power Query 中展開清單或記錄資料行，您會看到所有展開的數據行都會是任何類型。未來的教學課程將改善實作，以遞歸方式設定巢狀複雜型別的類型資訊。

結論

本教學課程提供範例實作，以針對從 REST 服務傳回的 JSON 數據強制執行架構。雖然此範例使用簡單的硬式編碼架構數據表格式，但此方法可以透過從另一個來源動態建置架構數據表定義，例如 JSON 架構檔案，或數據源所公開的元數據服務/端點，來擴充此方法。

除了修改數據行類型（和值），您的程式代碼也會在數據表本身上設定正確的類型資訊。在 Power Query 內執行時，設定此類型資訊有利於效能，因為用戶體驗一律會嘗試推斷類型資訊，以向使用者顯示正確的 UI 佇列，而推斷呼叫最終可能會觸發基礎數據 API 的其他呼叫。

如果您使用上一課的 TripPin 連接器來檢視人員資料表，您會看到所有數據行都有「任何類型」圖示（甚至是包含清單的數據行）：

沒有架構的人員。

從本課程使用 TripPin 連接器執行相同的查詢，您現在會看到類型資訊已正確顯示。