Exercise - Optimize queries by using the materialize function

Completed

In this exercise, you use the materialize function to optimize your queries.

Recall that the materialize function caches the results of a subquery when it runs, so that other parts of the query can reference the partial result.

Use the materialize function

In our retail company scenario, your sales team asks you for the top customers in the top 10 states, by revenue and how much they contribute, as a percentage, to the states' sales.

To provide this information, you want to break down the query into stages so that you can see results at each stage, as follows:

  1. Run the following query to get the top states, by revenue. Use the let statement to assign subqueries to the variables USCustomers and USCustomerSales.

    Run the query

    let USCustomers = Customers | where RegionCountryName == 'United States';
    let USCustomerSales =
        SalesFact
        | summarize USTotalCustomerSales = sum(SalesAmount) by CustomerKey
        | join kind=inner USCustomers on CustomerKey;
    USCustomerSales
    | summarize USTotalStateSales = round(sum(USTotalCustomerSales)) by StateProvinceName
    

    Take a look at the resulting list. Notice that the table contains columns for StateProvinceName and USTotalStateSales. You want to show the province name in the results, so you need to join the Customers table to the results.

    Screenshot of the `materialize` function, showing the first block of the query.

  2. Run the following query to add the province and the details about the top customers in the states to the results. You add this data by joining the USCustomerSales subquery to the results from the previous query.

    Now, optimize the query by using the materialize function in the USCustomerSales table let statement. Doing so runs the assigned subquery only once and caches the results.

    Run the query

    let USCustomers = Customers | where RegionCountryName == 'United States';
    let USCustomerSales = materialize(
        SalesFact
        | summarize USTotalCustomerSales = sum(SalesAmount) by CustomerKey
        | join kind=inner USCustomers on CustomerKey);
    USCustomerSales
    | summarize USTotalStateSales = round(sum(USTotalCustomerSales)) by StateProvinceName
    | lookup (
        USCustomerSales
        | summarize arg_max(USTotalCustomerSales, *) by StateProvinceName
        )
        on StateProvinceName
    | top 10 by USTotalStateSales
    

    Your results should look like those in the following image:

    Screenshot of the materialize function, showing the second block of the query.

  3. Run the following query to output the columns for the report and calculate the top customers' contributions to their state's sales, as a percentage.

    Run the query

    let Pcent = (portion: real, total: real) { round(100 * portion / total, 2) };
    let USCustomers = Customers | where RegionCountryName == 'United States';
    let USCustomerSales = materialize(
        SalesFact
        | summarize USTotalCustomerSales = sum(SalesAmount) by CustomerKey
        | join kind=inner USCustomers on CustomerKey);
    USCustomerSales
    | summarize USTotalStateSales = round(sum(USTotalCustomerSales)) by StateProvinceName
    | lookup (
        USCustomerSales
        | summarize arg_max(USTotalCustomerSales, *) by StateProvinceName
        )
        on StateProvinceName
    | top 10 by USTotalStateSales
    | project
        StateProvinceName,
        StateSales = USTotalStateSales,
        TopCustomerFirstName = FirstName,
        TopCustomerLastName = LastName,
        TopCustomerSales = round(USTotalCustomerSales),
        TopCustomerPercentage = Pcent(USTotalCustomerSales, USTotalStateSales)
    

    Your results should look like those in the following image:

    Screenshot of the materialize function, showing the entire query.