This document is a temporary API reference for the Data Science Toolkit services used to create Logistic Regression models, Lookup tables, and Hashed Table Predictors.
Descriptor list
Scalar descriptor
Type:
"scalar_descriptor"
Features:
- appnexus_audited
- cookie_age
- estimated_average_price
- estimated_clearing_price
- predicted_iab_view_rate
- predicted_video_completion_rate
- self_audited
- size
- creative_size
- spend_protection
- uniform
- user_age
Note
The size descriptor is represented as a string in your models ("300x250", for instance), though converted to a scalar in our bidder. Any size is technically valid in our system, hence this feature being treated as a scalar rather than a categorical feature.
Example:
{
"type": "scalar_descriptor",
"feature_keyword": "cookie_age",
"default_value": 0, //Value returned by the descriptor if no match is found
"initial_range_log": 4, //Used for log bucketing, initial range
"bucket_count_log_per_range": 2 //used for log bucketing, # of buckets per range
}
Segment descriptor
Type:
"segment_descriptor"
Features:
- segment_value
- segment_age
- segment_presence
Example:
{
"type": "segment_descriptor",
"feature_keyword": "segment_age",
"segment_id": 2, //ID of referenced segment
"default_value": 0, //Value returned by the descriptor if no match is found
"initial_range_log": 4, //Used for log bucketing, initial range
"bucket_count_log_per_range": 2 //used for log bucketing, # of buckets per range
}
Frequency/Recency descriptor
Type:
"frequency_recency_descriptor"
Features:
- frequency_life
- frequency_daily
- recency
Available object types for this descriptor:
- advertiser
- line_item
- campaign
Example:
{
"type": "frequency_recency_descriptor",
"feature_keyword": "frequency_life",
"object_type": "advertiser",
"object_id": 1, //ID of the referenced advertiser
"default_value": 0, //Value returned by the descriptor if no match is found
"initial_range_log": 4, //Used for log bucketing, initial range
"bucket_count_log_per_range": 2 //used for log bucketing, # of buckets per range
}
Categorical descriptor
Type:
"categorical_descriptor"
Features:
- country
- region
- city
- dma
- postal_code
- user_day
- user_hour
- os_family
- os_extended
- browser
- language
- user_gender
- domain
- ip_address
- position
- placement
- placement_group
- publisher
- seller_member_id
- supply_type
- device_type
- device_model
- carrier
- mobile_app
- mobile_app_instance
- mobile_app_bundle
- appnexus_intended_audience
- seller_intended_audience
- spend_protection
- user_group_id
- advertiser_id
- brand_category
- creative
- inventory_url_id
- media_type
Example:
{
"type": "categorical_descriptor",
"feature_keyword": "city"
}
Hash table descriptor
Type:
"hashed"
Example:
{
"type": "hashed",
"keys": [ //Array of one to five descriptors, each of one of the following types:
//scalar_descriptor,
//custom_model_descriptor,
//frequency_recency_descriptor,
//segment_descriptor,
//categorical_descriptor
],
"hash_seeds": [42, 42, 42, 42, 42, 42], //Seeds passed to the MurmurHash3_x64_128 function. Only the first one is used for now; the array is for planned future hash functions that need more than one seed
"hash_id": <existing hash table ID>,
"default_value": 0, //Value returned by the descriptor if no match is found in your hash table
"hash_table_size_log": 20 //Log (base 2) of the upper bound for keys in your table. Keys must be smaller than 2^hash_table_size_log or they will be rejected. Max for hash_table_size_log is 64 (no bucketing)
}
Hash tables
Use this endpoint to submit a pre-hashed table. The bucket_index0 and bucket_index1 fields, each 64 bits long, support hashing algorithms that produce long values as keys. Currently, only one hashing algorithm is supported: MurmurHash3_x64_128, which produces two 64-bit integers; only the lower 64 bits of the hash are used.
Values in bucket_index0 must always be smaller than 2 ^ hash_table_size_log, or they will be rejected.
Values in bucket_index1 are currently ignored; the field is reserved for future expansion. The parameter is optional, but if a value is sent for bucket_index1, it must be 0.
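The two rules above can be checked on the client before a table is submitted. The following is a minimal sketch, not part of the service: the function name `validate_coefficients` is hypothetical, the entry layout follows the `coefficients` array shown in the Example POST, and the requirement that `bucket_index0` be non-negative is an assumption.

```python
from typing import Iterable, Mapping


def validate_coefficients(coefficients: Iterable[Mapping], hash_table_size_log: int) -> None:
    """Raise ValueError if an entry breaks the bucket_index rules described above."""
    limit = 1 << hash_table_size_log  # 2 ^ hash_table_size_log
    for i, entry in enumerate(coefficients):
        index0 = entry["bucket_index0"]
        # Assumption: bucket indexes are non-negative, since they are hash buckets.
        if not 0 <= index0 < limit:
            raise ValueError(f"entry {i}: bucket_index0 {index0} is not below {limit}")
        # bucket_index1 is optional; when it is sent, it must be 0.
        if entry.get("bucket_index1", 0) != 0:
            raise ValueError(f"entry {i}: bucket_index1 must be 0")
```

Calling `validate_coefficients(rows, 20)` returns silently when every row is acceptable and raises on the first row that is not.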
Hash table keys
Each of your Hash Table keys must be a uint32 value. These values should be the IDs of the objects you are referencing in our system: domain_id, for instance, rather than the domain string value. The uint32 keys are then transformed into a little-endian byte array and hashed.
Python example
import mmh3  # third-party MurmurHash3 bindings
hash_bucket = mmh3.hash64(key_bytes, seed)[0] % table_size  # key_bytes: little-endian uint32 keys
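The byte-array step can be sketched with the standard `struct` module. This is an illustration under stated assumptions rather than the bidder's implementation: the helper names `pack_keys` and `bucket_index` are hypothetical, and concatenating several keys in descriptor order is an assumption the text above does not confirm. The hash function is passed in as `hash64` so the sketch stays free of third-party imports.

```python
import struct
from typing import Callable, Sequence


def pack_keys(ids: Sequence[int]) -> bytes:
    """Encode each ID as a little-endian uint32 and concatenate the results.

    Assumption: multiple keys are concatenated in descriptor order.
    struct raises struct.error for values outside the uint32 range.
    """
    return struct.pack(f"<{len(ids)}I", *ids)


def bucket_index(
    ids: Sequence[int],
    seed: int,
    hash_table_size_log: int,
    hash64: Callable[[bytes, int], int],
) -> int:
    """Reduce the 64-bit hash of the packed keys to a bucket in the table."""
    table_size = 1 << hash_table_size_log  # 2 ^ hash_table_size_log
    return hash64(pack_keys(ids), seed) % table_size
```

With the `mmh3` package installed, `hash64` could be `lambda data, seed: mmh3.hash64(data, seed, signed=False)[0]`, which mirrors the one-line example above.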
Logit function
Model creation and update are similar; both use the same request format.
| Method | Endpoint | Purpose |
|---|---|---|
| GET | /custom-model-logit | Retrieve a Logit function associated with the parameters provided. |
| PUT | /custom-model-logit | Update an existing Logit function with data in the JSON payload. |
| POST | /custom-model-logit | Create a new Logit function from data in the JSON payload. |
| DELETE | /custom-model-logit | Delete an existing Logit function matching the parameters provided. |
Parameters
| Name | Data Type | Parameter Type | Required On | Example |
|---|---|---|---|---|
| id | int | Query | GET, PUT, DELETE | ?id=1 |
| member_id | int | Query | PUT, POST | ?member_id=1 |
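The "Required On" column can be turned into a small client-side check. The sketch below only builds the path and query string; the host name and authentication are outside the scope of this section and are deliberately left out. The names `REQUIRED` and `build_path` are hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "/custom-model-logit"

# Query parameters required for each method, taken from the table above.
REQUIRED = {
    "GET": ("id",),
    "PUT": ("id", "member_id"),
    "POST": ("member_id",),
    "DELETE": ("id",),
}


def build_path(method: str, **params: int) -> str:
    """Return the endpoint path with its query string, or raise if a parameter is missing."""
    method = method.upper()
    missing = [name for name in REQUIRED[method] if name not in params]
    if missing:
        raise ValueError(f"{method} requires: {', '.join(missing)}")
    return f"{ENDPOINT}?{urlencode(params)}"
```

For instance, `build_path("PUT", id=1, member_id=1)` yields `/custom-model-logit?id=1&member_id=1`.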
Example POST
{"custom-model-logit": {
"member_id": 1,
"beta0": 1.2,
"max": 5,
"min": 0, //optional, will be set to 0 if not passed
"name": "Sample LRE model",
"offset": 0.3, //optional, will be set to 0 if not passed
"scale": 1.5, //optional, will be set to 1 if not passed
"predictors": [
{
"coefficient": 0.2,
"feature_descriptor": {
"bucket_count_log_per_range": 31,
"default_value": 0,
"feature_keyword": "size",
"initial_range_log": 31,
"type": "scalar_descriptor"
},
"type": "scalar"
},
{
"coefficient": 0.3,
"feature_descriptor": {
"bucket_count_log_per_range": 31,
"custom_model_id": 2,
"default_value": 0,
"feature_keyword": "custom_model",
"initial_range_log": 31,
"type": "custom_model_descriptor"
},
"type": "scalar"
},
{
"coefficient": 0.4,
"feature_descriptor": {
"bucket_count_log_per_range": 31,
"default_value": 0,
"feature_keyword": "frequency_life",
"initial_range_log": 31,
"object_id": 1,
"object_type": "advertiser",
"type": "frequency_recency_descriptor"
},
"type": "scalar"
},
{
"coefficient": 0.5,
"feature_descriptor": {
"bucket_count_log_per_range": 31,
"default_value": 0,
"feature_keyword": "segment_age",
"initial_range_log": 31,
"segment_id": 2,
"type": "segment_descriptor"
},
"type": "scalar"
},
{
"type": "hashed",
"keys": [
{
"type": "categorical_descriptor",
"feature_keyword": "advertiser_id"
},
{
"type": "scalar_descriptor",
"feature_keyword": "user_age",
"default_value": 0
}
],
"hash_seeds": [42, 42, 42, 42, 42, 42],
"default_value": 0,
"hash_table_size_log": 20,
"coefficients": [
{"bucket_index0": 0, "bucket_index1": 0, "weight": 1.3},
{"bucket_index0": 1, "bucket_index1": 0, "weight": 0.7},
{"bucket_index0": 2, "bucket_index1": 0, "weight": 1.5},
{"bucket_index0": 3, "bucket_index1": 0, "weight": 0.9}
]
},
{
"type": "lookup",
"default_value": 0.1,
"features": [
{
"type": "categorical_descriptor",
"feature_keyword": "advertiser_id"
},
{
"type": "scalar_descriptor",
"feature_keyword": "user_age",
"default_value": 0
}
],
"coefficients": [
{"weight": 1.1, "key": [1, 1]},
{"weight": 1.3, "key": [2, 2]},
{"weight": 1.2, "key": [3, 3]}
]
}
]
}}
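The `//` comments in the example above are annotations; a strict JSON parser rejects them, so building the payload programmatically is one way to avoid sending them by accident. The sketch below assembles a payload with a single lookup predictor and serializes it with the standard `json` module. The helper `lookup_predictor` is hypothetical, and the field values are copied from the example rather than meant as recommendations.

```python
import json
from typing import Mapping, Sequence, Tuple


def lookup_predictor(
    features: Sequence[Mapping],
    weights_by_key: Mapping[Tuple[int, ...], float],
    default_value: float = 0.0,
) -> dict:
    """Build a "lookup" predictor with one coefficient entry per key tuple."""
    return {
        "type": "lookup",
        "default_value": default_value,
        "features": list(features),
        "coefficients": [
            {"weight": weight, "key": list(key)}
            for key, weight in weights_by_key.items()
        ],
    }


features = [
    {"type": "categorical_descriptor", "feature_keyword": "advertiser_id"},
    {"type": "scalar_descriptor", "feature_keyword": "user_age", "default_value": 0},
]
payload = {
    "custom-model-logit": {
        "member_id": 1,
        "beta0": 1.2,
        "max": 5,
        "name": "Sample LRE model",
        "predictors": [
            lookup_predictor(features, {(1, 1): 1.1, (2, 2): 1.3, (3, 3): 1.2}, 0.1)
        ],
    }
}
body = json.dumps(payload)  # strict JSON: no comments, double-quoted strings
```

The optional `min`, `offset`, and `scale` fields are omitted here, so the defaults noted in the example would apply.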