Dynamic aggregation , grouping and order by in Azure Cognitive Search

Anurag Sharma 0 Reputation points
2023-06-22T03:01:54.4633333+00:00

Our Dataset

Our data set is a flat list of attributes identified by a unique id. Overall we have ~130 attributes. There is an example record in the Appendix for reference.

We have over 52000 unique ids.

  • We want to achieve the following as part of our search request
  • Be able to filter by a single attribute or a combination of attributes using AND/OR criteria
  • Group the results using a combination of multiple fields (4 in our case)
  • Calculate aggregated values for a subset of fields as part of grouping. We are only interested in using max/min functions for grouping. e.g. max value of a field value
  • Include the aggregated values at the group level in the response
  • Have the records that constitute the group as a nested array within the group record
  • Order the grouped records based on the value of the aggregated fields that were created as part of grouping.
  • Return the grouped and ordered records including aggregated fields and nested array of constituent records using pagination.
  • We need to be able to select the fields returned in the records.
  • Please note that the aggregated values at the group level cannot be pre calculated as they are dependent on the filter criteria.
  • We need to achieve this as part of a REST API response. i.e. search results need to be displayed to an end user with maximum tolerable delay ~1-2 secs.

Please advice if this can be achieved using Azure Cognitive Search product.

Appendix

Sampel Record

        "document": {

            "vehicle_uid": "8137116e411c9e6edffdd86bd50cfdd8e9cd6af",

            "search_text": "Porsche Taycan 79kWh 4S sedan",

            "vehicle_status": "discontinued",

            "vehicle_status_correct_case": null,

            "hero_image_url": "https://caradvisoryinternal.blob.core.windows.net/ca-media/images/190654/494d5a/1/0012.png",

            "limited_edition": "null",

            "year_low": 2022,

            "year_high": 2022,

            "segment": "luxury",

            "vehicle_make": "porsche",

            "vehicle_model": "taycan",

            "vehicle_trim": "4s",

            "vehicle_trim_group": "4staycanporsche",

             ....

             ....

             ~130 attributes in total 

        }

Expected Response Record

Grouping is required based on trim_status_id field

group_*fields need to be calculated dynamically per search request, after application of filter criteria.

documents in results need to be ordered using one or more group_* fields

variants array inside each document is made up of the records that were grouped together.

{

"results": [

    {        

       "document": {

            "id": "dG95b3RhX2hpbHV4X3RyZF9kaXNjb250aW51ZWQ=",

            "vehicle_make": "toyota",

            "vehicle_model": "hilux",

            "vehicle_trim": "trd",

            "vehicle_make_correct_case": "Toyota",

            "vehicle_model_correct_case": "Hilux",

            "vehicle_trim_correct_case": "TRD",

            "trim_status_id": "toyota_hilux_trd_discontinued",

            "vehicle_status": "discontinued",

            "group_year_high": 2017, >>> Example of dynamically aggregated field calculated after application of filter criteria

            "group_year_low": 2017,

            "group_fuel_score_high": 0,

            "group_fuel_score_low": 0,

            "group_environment_score_high": 1,

            "group_environment_score_low": 1,

            ....

            ....

            //additional grouped fields  ~40. in number

            

            "variants": [

                {

                    "hero_image_url": "https://caradvisoryinternal.blob.core.windows.net/ca-media/images/1100103/ffffff/1/0012.png",

                    "vehicle_uid": "78253654bf7e8c8ebdc28c04142d50ed405491c",

                    "year_low": 2017,

                    "year_high": 2017,

                    "vehicle_version_correct_case": "2.8 T/Diesel TRD (White) Double Cab 4X4",

                    "environment_score": 1,

                    "driven_wheels": "4x4",

                    "vehicle_seats": 5,

                    "vehicle_doors": 4,

                    "fuel_type": "diesel",

                    "fuel_score": 0,

                    "retail_price": 58990,

                    .... 

                    ....

                    //fields selected in search query

                },

                ....

                ....

            ]

        }
Azure AI Search
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
1,339 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. brtrach-MSFT 17,731 Reputation points Microsoft Employee Moderator
    2023-06-28T01:41:36.7033333+00:00

    Here is an example of how to perform filtering, grouping, and ordering of search results in Azure Cognitive Search using the Lucene query syntax:

    https://[service name].search.windows.net/indexes/[index name]/docs?search=[search term]&$filter=[filter criteria]&$orderby=[order by criteria]&$top=[number of results to return]&$skip=[number of results to skip]

    • [service name] is the name of your search service.
    • [index name] is the name of your search index.
    • [search term] is the term you want to search for.
    • [filter criteria] is the criteria you want to use to filter the search results.
    • [order by criteria] is the criteria you want to use to order the search results.
    • [number of results to return] is the number of search results you want to return.
    • [number of results to skip] is the number of search results you want to skip.

    You can use the $filter parameter to filter the search results by a single attribute or a combination of attributes using AND/OR criteria. You can use the $orderby parameter to order the search results based on the value of the aggregated fields that were created as part of grouping. You can use the $top and $skip parameters to implement pagination.

    To group the search results using a combination of multiple fields and calculate aggregated values for a subset of fields as part of grouping, you can use the groupby parameter in the search request. You can specify the fields to group by and the aggregation functions to use for each field. The search results will be grouped by the specified fields, and the aggregated values will be calculated for each group.


  2. brtrach-MSFT 17,731 Reputation points Microsoft Employee Moderator
    2023-07-19T21:28:31.4966667+00:00

    @Anurag Sharma

    You can use the $orderby parameter to order the search results based on the value of the aggregated fields that were created as part of grouping. Here is an example of how to use groupby and orderby together in a search request:

    https://[service name].search.windows.net/indexes/[index name]/docs?search=[search term]&$filter=[filter criteria]&$top=[number of results to return]&$skip=[number of results to skip]&$count=true&$select=[fields to return]&$groupby=[fields to group by]&$select=[fields to return]&$orderby=[order by criteria]

    You can specify the fields to group by and the aggregation functions to use for each field in the $groupby parameter. You can specify the fields to return in the $select parameter. You can specify the order in which to return the search results in the $orderby parameter.

    Please note that Azure Cognitive Search does not support user-defined functions (UDFs) for grouping and creating aggregated fields. However, you can use the built-in aggregation functions such as count, sum, average, min, and max to calculate aggregated values for each group.

    If you found this helpful, we ask that you take a new survey or mark this as an answer, so we know this was able to resolve your query.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.