Share via


Custom query specification - Apps and Games API

Partners can use this query specification to easily formulate custom queries for extracting data from analytics tables. The queries can be used to select only the desired columns and metrics that match a certain criterion. At the heart of the language specification is the dataset definition on which a custom query can be written.

Datasets

In the same way that some queries are run against a database that has tables and columns, a custom query works on Datasets that have columns and metrics. The full list of available datasets for formulating a query can be obtained by using the datasets API.

Here's an example of a dataset shown as a JSON.

{ 
                            "datasetName": "Acquisitions", 
                            "selectableColumns": [ 
                                "TitleId", 
                                "ProductId", 
                                "XboxProductId", 
                                "ProductTypeName", 
                                "TitleName", 
                                "CatalogId", 
                                "SandboxId", 
                                "SkuId", 
                                "SkuTypeName", 
                                "SkuDisplayName", 
                                "AvailabilityId", 
                                "RegionName", 
                                "CountryName", 
                                "Market", 
                                "PaymentType", 
                                "StoreClientName", 
                                "StoreClientCategory", 
                                "ParentProductName", 
                                "ParentProductId", 
                                "XboxParentProductId", 
                                "AcquisitionType", 
                                "PurchaseTaxType", 
                                "LocalCurrencyCode", 
                                "SupportedPlatform", 
                                "Age", 
                                "Gender", 
                                "OsVersion", 
                                "DeviceType", 
                                "DateStamp" 
                            ], 
                            "availableMetrics": [ 
                                "PurchaseQuantity", 
                                "PurchasePriceUSDAmount", 
                                "PurchaseTaxUSDAmount", 
                                "PurchasePriceLocalAmount", 
                                "PurchaseTaxLocalAmount" 
                            ], 
                            "availableDateRanges": [ 
                                "LAST_72_HOURS", 
                                "LAST_30_DAYS", 
                                "LAST_3_MONTHS", 
                                "LAST_6_MONTHS", 
                                "LAST_12_MONTHS", 
                                "LAST_2_YEARS", 
                                "LAST_3_YEARS", 
                                "LAST_4_YEARS" 
                            ], 
                            "columnFilters": {}, 
                            "aggregationToDateRangeMapping": { 
                                "Hourly": "LAST_72_HOURS", 
                                "Daily": "LAST_30_DAYS,LAST_3_MONTHS", 
                                "Weekly": "LAST_6_MONTHS,LAST_12_MONTHS", 
                                "Monthly": "LAST_2_YEARS,LAST_3_YEARS,LAST_4_YEARS" 
                            }, 
                        } 

Parts of a dataset

A dataset name is like a database table name. For example, Acquisitions. A dataset has a list of columns that can be selected, such as AcquisitionType.

A dataset also has metrics, which are like aggregation functions in a database. For example, PurchaseQuantity.

There are fixed time spans over which data can be exported.

Formulating a query on a dataset

The following are sample queries that show how to extract various types of data.

Query Description
SELECT AcquisitionType, ProductId FROM Acquisitions TIMESPAN LAST_3_MONTHS This query gets every AcquisitionType and its corresponding ProductId in the last three months.
SELECT ProductId, PurchaseQuantity FROM Acquisitions ORDER BY PurchaseQuantity LIMIT 10 This query gets the top 10 Products in decreasing order of their Purchase Quantity.

Query specification

This section describes the query definition and structure.

Grammar reference

This table describes the symbols used in queries.

Symbol Meaning
? Optional
* Zero or more
+ One or more
| Or/One of the list

Query definition

The query statement has the following clauses: SelectClause, FromClause, WhereClause?, OrderClause?, LimitClause?, and TimeSpan?.

  • SelectClause: SELECT ColumOrMetricName (, ColumOrMetricName)*
    • ColumOrMetricName: Columns and Metrics defined within the Dataset
  • FromClause: FROM DatasetName
    • DatasetName: Dataset name defined within the Dataset
  • WhereClause: WHERE FilterCondition (**AND** FilterCondition)*
    • FilterCondition: ColumOrMetricName Operator Value
      • Operator: = | > | < | >= | <= | != | LIKE | NOT LIKE | IN | NOT IN
      • Value: Number | StringLiteral | MultiNumberList | MultiStringList
        • Number: -? [0-9]+ (. [0-9] [0-9]*)?
        • StringLiteral: ' [a-zA-Z0-9_]*'
        • MultiNumberList: (Number (,Number)*)
        • MultiStringList: (StringLiteral (,StringLiteral)*)
  • OrderClause: ORDER BY OrderCondition (,OrderCondition)*
    • OrderCondition: ColumOrMetricName (**ASC** | **DESC**)*
  • LimitClause: LIMIT [0-9]+
  • TimeSpan: TIMESPAN ( TODAY | YESTERDAY | LAST_7_DAYS | LAST_14_DAYS | LAST_30_DAYS | LAST_90_DAYS | LAST_180_DAYS | LAST_365_DAYS | LAST_MONTH | LAST_3_MONTHS | LAST_6_MONTHS | LAST_1_YEAR)

Query structure

A Report query is made up of multiple parts:

  • SELECT
  • FROM
  • WHERE
  • ORDER BY
  • LIMIT
  • TIMESPAN

Each part is described as follows:

SELECT

This part of the query specifies the columns that get exported. The columns that can be selected are the fields listed in selectableColumns and availableMetrics sections of a dataset. If there's a metric column included in the selected field list, then metrics are calculated for every distinct combination of the nonmetric columns.

Example:

  • SELECT ProductId, PurchaseQuantity

DISTINCT

Adding DISTINCT keyword after SELECT ensures the final exported data doesn't have any duplicate rows. DISTINCT keyword works irrespective of whether a metric column is selected.

Example:

  • SELECT DISTINCT ProductId, PurchaseQuantity

FROM

This part of the query indicates the dataset from which data needs to be exported. The dataset name given here needs to be a valid dataset name returned by the datasets API.

Example:

  • FROM Acquisition
  • FROM Usage

WHERE

This part of the query is used to specify filter conditions on the dataset. Only rows matching all the conditions listed in this clause are present in the final exported file. The filter condition can be on any of the columns listed in selectableColumns and availableMetrics. The values specified in the filter condition can be a list of numbers or a list of strings only when the operator is IN or NOT IN. The values can always be given as a literal string and they're converted to the native types of columns. Multiple filter conditions need to be separated with an AND operation.

Example:

  • WHERE ProductId = '8da-86836-3c12dc7e62957d-4959-899260'
  • WHERE TitleName LIKE '%Grand%'
  • WHERE CountryName NOT IN (3243, 321, 2123)
  • WHERE PurchaseQuantity =100
  • WHERE PurchaseQuantity ='100'
  • WHERE ProductId ='aaaabbbb-0000-cccc-1111-dddd2222eeee' AND CountryName =' Canada'

ORDER BY

This part of the query specifies the ordering criteria for the exported rows. The columns on which ordering can be defined need to be from the selectableColumns and availableMetrics of the dataset. If there's no ordering direction specified, it's defaulted to DESC on the column. Ordering can be defined on multiple columns by separating the criteria with a comma.

Example:

  • ORDER BY PurchaseQuantity ASC, TitleName (CC) DESC
  • ORDER BY TitleName ASC, PurchaseQuantity

LIMIT

This part of the query specifies the number of rows that are exported. The number you specify needs to be a positive nonzero integer.

TIMESPAN

This part of the query specifies the time duration for which the data needs to be exported. The possible values should be from the availableDateRanges field in the dataset definition.

Case sensitivity in query specification

The specification is completely case insensitive. Predefined keywords, column names, and values can be specified using upper or lower case.

Custom query example

curl --location 'https://manage.devcenter.microsoft.com/consumer/insights/v1.1/ScheduledReport' \ 
 --header 'Authorization: \ 
 --header 'Content-Type: application/json' \ 
 --data '{ 
 "Description": "Testing sample report", 
 "QueryId": "<id of the query>", 
 "ReportName": "Test Reports", 
 "executeNow": true, 
 "QueryStartTime": "2024-06-01", 
 "QueryEndTime": "2024-06-13" 
 }'

The provided date range must adhere to the date-to-aggregation mapping for the dataset. For instance, if "last_30_days" is mapped to daily aggregation and a custom date range exceeding 30 days is specified with daily aggregation, an error will be thrown indicating that daily aggregation isn't supported for this date range. Similarly, if "last_3_months" is mapped to weekly aggregation, then for a date range difference between 31 and 90 days, weekly aggregation must be used.