How to deploy a TimeGEN-1 model with Azure AI Foundry
Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
In this article, you learn how to use Azure AI Foundry to deploy the TimeGEN-1 model as a serverless API with pay-as-you-go billing. To find the TimeGEN-1 model in the model catalog, filter on the Nixtla collection.
The Nixtla TimeGEN-1 is a generative, pretrained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
Important
Models that are in preview are marked as preview on their model cards in the model catalog.
Deploy TimeGEN-1 as a serverless API
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
You can deploy TimeGEN-1 as a serverless API with pay-as-you-go billing. Nixtla offers TimeGEN-1 through Microsoft Azure Marketplace. Nixtla can change or update the terms of use and pricing of this model.
Prerequisites
An Azure subscription with a valid payment method. Free or trial Azure subscriptions don't work. If you don't have an Azure subscription, create a paid Azure account to begin.
An Azure AI Foundry hub. The serverless API model deployment offering for Nixtla's TimeGEN-1 model is only available with hubs created in specific regions. For a list of these regions, see Region availability for models in serverless API endpoints.
Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Foundry portal. To perform the steps in this article, your user account must be assigned the Azure AI Developer role on the resource group. For more information on permissions, visit Role-based access control in Azure AI Foundry portal.
Estimate the number of tokens needed
Before you create a deployment, it's useful to estimate the number of tokens that you plan to consume and be billed for. One token corresponds to one data point in your input dataset or output dataset.
Suppose you have the following input time series dataset:
Unique_id | Timestamp | Target Variable | Exogenous Variable 1 | Exogenous Variable 2 |
---|---|---|---|---|
BE | 2016-10-22 00:00:00 | 70.00 | 49593.0 | 57253.0 |
BE | 2016-10-22 01:00:00 | 37.10 | 46073.0 | 51887.0 |
To determine the number of tokens, multiply the number of rows (two in this example) by the number of columns used for forecasting (three in this example, because the unique_id and timestamp columns don't count). The result is a total of six tokens.
Given the following output dataset:
Unique_id | Timestamp | Forecasted Target Variable |
---|---|---|
BE | 2016-10-22 02:00:00 | 46.57 |
BE | 2016-10-22 03:00:00 | 48.57 |
For the output, determine the number of tokens by counting the data points that the forecast returns. In this example, the number of tokens is two.
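The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the service: the function name count_tokens and the column names target, exog_1, and exog_2 are made up for the example, and each dataset is represented as a plain list of dictionaries.

```python
def count_tokens(rows, id_col="unique_id", time_col="timestamp"):
    """Count tokens as rows x value columns, skipping the ID and timestamp columns."""
    if not rows:
        return 0
    value_columns = [name for name in rows[0] if name not in (id_col, time_col)]
    return len(rows) * len(value_columns)


# Input dataset from the example: 2 rows, 1 target + 2 exogenous columns.
input_rows = [
    {"unique_id": "BE", "timestamp": "2016-10-22 00:00:00",
     "target": 70.00, "exog_1": 49593.0, "exog_2": 57253.0},
    {"unique_id": "BE", "timestamp": "2016-10-22 01:00:00",
     "target": 37.10, "exog_1": 46073.0, "exog_2": 51887.0},
]

# Output dataset from the example: 2 rows, 1 forecast column.
output_rows = [
    {"unique_id": "BE", "timestamp": "2016-10-22 02:00:00", "forecast": 46.57},
    {"unique_id": "BE", "timestamp": "2016-10-22 03:00:00", "forecast": 48.57},
]

input_tokens = count_tokens(input_rows)    # 2 rows x 3 columns
output_tokens = count_tokens(output_rows)  # 2 rows x 1 column
print(input_tokens, output_tokens)
```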
Estimate pricing based on tokens
There are four pricing meters that determine the price you pay. These meters are as follows:
Pricing Meter | Description |
---|---|
paygo-inference-input-tokens | Costs associated with the tokens used as input for inference when finetune_steps = 0 |
paygo-inference-output-tokens | Costs associated with the tokens used as output for inference when finetune_steps = 0 |
paygo-finetuned-model-inference-input-tokens | Costs associated with the tokens used as input for inference when finetune_steps > 0 |
paygo-finetuned-model-inference-output-tokens | Costs associated with the tokens used as output for inference when finetune_steps > 0 |
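As a rough sketch of how the meters in the table relate to a request, the following Python maps a finetune_steps value to the pair of meters that would apply and combines them with token counts. The function names and the per-token prices are hypothetical placeholders; actual prices appear on the Pricing and terms tab when you deploy the model.

```python
def pricing_meters(finetune_steps):
    """Return the (input, output) meter names that apply to a request."""
    if finetune_steps < 0:
        raise ValueError("finetune_steps must be 0 or greater")
    if finetune_steps == 0:
        return ("paygo-inference-input-tokens",
                "paygo-inference-output-tokens")
    return ("paygo-finetuned-model-inference-input-tokens",
            "paygo-finetuned-model-inference-output-tokens")


def estimate_cost(input_tokens, output_tokens, finetune_steps, price_per_token):
    """Estimate cost from token counts and a {meter name: price per token} mapping."""
    input_meter, output_meter = pricing_meters(finetune_steps)
    return (input_tokens * price_per_token[input_meter]
            + output_tokens * price_per_token[output_meter])


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
placeholder_prices = {
    "paygo-inference-input-tokens": 0.5,
    "paygo-inference-output-tokens": 0.25,
    "paygo-finetuned-model-inference-input-tokens": 1.0,
    "paygo-finetuned-model-inference-output-tokens": 0.5,
}

print(estimate_cost(6, 2, finetune_steps=0, price_per_token=placeholder_prices))
```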
Create a new deployment
These steps demonstrate the deployment of TimeGEN-1. To create a deployment:
- Sign in to Azure AI Foundry.
- If you’re not already in your project, select it.
- Select Model catalog from the left navigation pane.
- Select the model card of the model you want to deploy. In this article, you select TimeGEN-1 to open the Model Details page.
- Select Deploy to open a serverless API deployment window for the model.
Alternatively, you can initiate a deployment from your project in the Azure AI Foundry portal as follows:
- From the left sidebar of your project, select Models + endpoints.
- Select + Deploy model > Deploy base model.
- Search for and select TimeGEN-1 to open the Model Details page.
- Select Confirm to open a serverless API deployment window for the model.
- In the deployment wizard, select the link to Azure Marketplace Terms to learn more about the terms of use.
- Select the Pricing and terms tab to learn about pricing for the selected model.
- Select the Subscribe and Deploy button. If this is your first time deploying the model in the project, you have to subscribe your project for the particular offering.
Note
This step requires that your account has the Azure AI Developer role permissions on the resource group, as listed in the prerequisites. Models that are offered by non-Microsoft providers (for example, Nixtla TimeGEN-1) are billed through Azure Marketplace. For such models, you're required to subscribe your project to the particular model offering. Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending. Currently, you can have only one deployment for each model within a project.
Once you subscribe the project for the particular Azure Marketplace offering, subsequent deployments of the same offering in the same project don't require subscribing again. If this scenario applies to you, there's a Continue to deploy option to select.
- Give the deployment a name. This name becomes part of the deployment API URL. This URL must be unique in each Azure region.
- Select Deploy. Wait until the deployment is ready and you're redirected to the Model deployments page.
- Return to the Deployments page, select the deployment, and note the endpoint's Target URI and the Secret Key. For more information on using the APIs, see the reference section.
- You can always find the endpoint's details, URL, and access keys by navigating to your project's Management center from the left navigation pane. Then, select Models + endpoints.
To learn about billing for the TimeGEN-1 model deployed as a serverless API with pay-as-you-go token-based billing, see Cost and quota considerations for the TimeGEN-1 family of models deployed as a service.
Consume the TimeGEN-1 model as a service
You can consume TimeGEN-1 models by using the forecast API.
- From the left navigation pane of your project, select My assets > Models + endpoints.
- Find and select the deployment you created.
- Copy the Target URI and the Key value.
Try the samples here:
Use Case | Description | Sample Notebook |
---|---|---|
Quick Start Forecast | Nixtla TimeGEN-1 is a generative, pretrained forecasting model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values as inputs. | Quick Start Forecast |
Fine-tuning | Fine-tuning is a powerful process to utilize TimeGEN-1 more effectively. Foundation models - for example, TimeGEN-1 - are pretrained on vast amounts of data, to capture wide-ranging features and patterns. These models can then be specialized for specific contexts or domains. Fine-tuning refines the model parameters to forecast a new task, allowing it to tailor its vast pre-existing knowledge towards the requirements of the new data. In this way, fine-tuning serves as a crucial bridge, linking the broad TimeGEN-1 capabilities to the specifics of your tasks. Concretely, the fine-tuning process involves performing some training iterations on your input data, to minimize the forecasting error. The forecasts are produced with the updated model. To control the number of iterations, use the finetune_steps argument of the forecast method. | Fine-tuning |
Anomaly Detection | Anomaly detection in time series data is important across various industries - for example, finance and healthcare. It involves monitoring ordered data points to spot irregularities that might signal issues or threats. Organizations can then swiftly act to prevent, improve, or safeguard their operations. | Anomaly Detection |
Exogenous Variables | Exogenous variables are external factors that can influence forecasts. Some of these variables are categorical: they take on one of a limited, fixed number of possible values, and induce a grouping of your observations. For example, if you're forecasting daily product demand for a retailer, you could benefit from an event variable that tells you what kind of event takes place on a given day, for example 'None', 'Sporting', or 'Cultural'. You might also include external factors such as weather. | Exogenous Variables |
Demand Forecasting | Demand forecasting involves application of historical data and other analytical information, to build models that help predict future estimates of customer demand, for specific products, over a specific time period. It helps shape product road map, inventory production, and inventory allocation, among other things. | Demand Forecasting |
For more information about use of the APIs, visit the reference section.
Reference for TimeGEN-1 deployed as a serverless API
Forecast API
Use the method POST to send the request to the /forecast route:
Request
POST /forecast HTTP/1.1
Host: <DEPLOYMENT_URI>
Authorization: Bearer <TOKEN>
Content-type: application/json
Request schema
The payload is a JSON-formatted string that contains these parameters:
Key | Type | Default | Description |
---|---|---|---|
Forecast Horizon (fh) | int | No default. This value must be specified. | Forecast horizon |
Frequency (freq) | str | None | Frequency of the data. By default, the frequency is inferred automatically. For more information, visit pandas available frequencies. |
Identifying Column (id_col) | str | unique_id | Column that identifies each series. |
Time Column (time_col) | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
Target Column (target_col) | str | y | Column that contains the target. |
Exogenous DataFrame (X_df) | DataFrame | None | DataFrame with [unique_id, ds] columns and the df future exogenous variables. |
Prediction Intervals (level) | List[Union[int, float]] | None | Confidence levels between 0 and 100 for prediction intervals. |
Quantiles (quantiles) | List[float] | None | List of quantiles to forecast, each between 0 and 1. level and quantiles shouldn't be used simultaneously. The output dataframe has the quantile columns formatted as TimeGEN-q-(100 * q) for each q value. The term 100 * q represents percentiles, but we choose this notation to avoid the appearance of dots in column names. |
Fine-tuning Steps (finetune_steps) | int | 0 | Number of steps used to fine-tune TimeGEN-1 on the new data. |
Fine-tuning Loss (finetune_loss) | str | default | Loss function to use for fine-tuning. Options: mae, mse, rmse, mape, smape |
Clean Exogenous First (clean_ex_first) | bool | True | Clean exogenous signal before making forecasts using TimeGEN-1. |
Validate API Key (validate_api_key) | bool | False | If true, validates API key before sending requests. |
Add History (add_history) | bool | False | Return fitted values of the model. |
Date Features (date_features) | Union | False | Features computed from the dates. Can be pandas date attributes or functions that take the dates as input. If true, automatically adds the most used date features for the frequency of df. |
One-Hot Encoding Date Features (date_features_to_one_hot) | Union | True | Apply one-hot encoding to these date features. If date_features=True then all date features are one-hot encoded by default. |
Model (model) | str | azureai | azureai |
Number of Partitions (num_partitions) | int | None | Number of partitions to use. If none, the number of partitions matches the available parallel resources in distributed environments. |
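A few of the constraints in the table (fh is required, level and quantiles shouldn't be combined, and the allowed ranges for each) can be checked on the client before a request is sent. The helper below is a hypothetical sketch: build_payload isn't part of any SDK, it covers only a subset of the parameters, and treating fh as a positive integer is an assumption.

```python
def build_payload(y, fh, freq=None, level=None, quantiles=None,
                  finetune_steps=0, finetune_loss="default", clean_ex_first=True):
    """Assemble a forecast payload and check constraints listed in the table."""
    # fh has no default, so it must be supplied (assumed here: a positive integer).
    if isinstance(fh, bool) or not isinstance(fh, int) or fh < 1:
        raise ValueError("fh must be a positive integer")
    # The table says level and quantiles shouldn't be used simultaneously.
    if level is not None and quantiles is not None:
        raise ValueError("use either level or quantiles, not both")
    if level is not None and any(not 0 <= value <= 100 for value in level):
        raise ValueError("each level must be between 0 and 100")
    if quantiles is not None and any(not 0 < q < 1 for q in quantiles):
        raise ValueError("each quantile must be between 0 and 1")

    payload = {
        "model": "azureai",
        "fh": fh,
        "y": y,
        "clean_ex_first": clean_ex_first,
        "finetune_steps": finetune_steps,
        "finetune_loss": finetune_loss,
    }
    # Optional keys are included only when a value is given.
    if freq is not None:
        payload["freq"] = freq
    if level is not None:
        payload["level"] = level
    if quantiles is not None:
        payload["quantiles"] = quantiles
    return payload


sample = build_payload({"2015-12-02": 8.71177264560569}, fh=7, freq="D")
print(sorted(sample))
```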
Example
payload = {
"model": "azureai",
"freq": "D",
"fh": 7,
"y": {
"2015-12-02": 8.71177264560569,
"2015-12-03": 8.05610965954506,
"2015-12-04": 8.08147504013705,
"2015-12-05": 7.45876269238096,
"2015-12-06": 8.01400499477946,
"2015-12-07": 8.49678638163858,
"2015-12-08": 7.98104975966596,
"2015-12-09": 7.77779262633883,
"2015-12-10": 8.2602342916073,
"2015-12-11": 7.86633892304654,
"2015-12-12": 7.31055015853442,
"2015-12-13": 7.71824095195932,
"2015-12-14": 8.31947369244219,
"2015-12-15": 8.23668532271246,
"2015-12-16": 7.80751004221619,
"2015-12-17": 7.59186171488993,
"2015-12-18": 7.52886925664225,
"2015-12-19": 7.17165682276851,
"2015-12-20": 7.89133075766189,
"2015-12-21": 8.36007143564403,
"2015-12-22": 8.11042723757502,
"2015-12-23": 7.77527584648686,
"2015-12-24": 7.34729970074316,
"2015-12-25": 7.30182234213793,
"2015-12-26": 7.12044437239249,
"2015-12-27": 8.87877607170755,
"2015-12-28": 9.25061821847475,
"2015-12-29": 9.24792513230345,
"2015-12-30": 8.39140318535794,
"2015-12-31": 8.00469951054955,
"2016-01-01": 7.58933582317062,
"2016-01-02": 7.82524529143177,
"2016-01-03": 8.24931374626064,
"2016-01-04": 9.29514097366865,
"2016-01-05": 8.56826646160024,
"2016-01-06": 8.35255436947459,
"2016-01-07": 8.29579811063615,
"2016-01-08": 8.29029259122431,
"2016-01-09": 7.78572089653462,
"2016-01-10": 8.28172399041139,
"2016-01-11": 8.4707303170059,
"2016-01-12": 8.13505390861157,
"2016-01-13": 8.06714903991011
},
"clean_ex_first": True,
"finetune_steps": 0,
"finetune_loss": "default"
}
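One way to send a payload like the one above is with Python's standard library. The sketch below builds the POST request with the headers shown in the Request section. It assumes that the deployment's Target URI is a base URL (including the scheme) to which /forecast is appended, and the environment variable names TIMEGEN_TARGET_URI and TIMEGEN_KEY are made up for this example. The series is shortened for brevity.

```python
import json
import os
import urllib.request


def build_forecast_request(target_uri, key, payload):
    """Build (but don't send) a POST request for the /forecast route."""
    url = target_uri.rstrip("/") + "/forecast"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-type": "application/json",
        },
        method="POST",
    )


# Shortened series for illustration; a real request would carry the full history.
example_payload = {
    "model": "azureai",
    "freq": "D",
    "fh": 7,
    "y": {"2016-01-12": 8.13505390861157, "2016-01-13": 8.06714903991011},
    "clean_ex_first": True,
    "finetune_steps": 0,
    "finetune_loss": "default",
}

# Hypothetical environment variables that hold the Target URI and key.
target_uri = os.environ.get("TIMEGEN_TARGET_URI")
key = os.environ.get("TIMEGEN_KEY")

# The request is sent only when both values are configured.
if target_uri and key:
    request = build_forecast_request(target_uri, key, example_payload)
    with urllib.request.urlopen(request, timeout=60) as response:
        print(json.loads(response.read().decode("utf-8")))
```

Reading the key from the environment keeps it out of source code and notebooks.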
Response schema
The response is a data frame of type pandas.DataFrame that contains the TimeGEN-1 forecasts for point predictions and probabilistic predictions.
Example
This JSON sample is an example response:
{
"status": 200,
"data": {
"timestamp": [
"2016-01-14 00:00:00",
"2016-01-15 00:00:00",
"2016-01-16 00:00:00",
"2016-01-17 00:00:00",
"2016-01-18 00:00:00",
"2016-01-19 00:00:00",
"2016-01-20 00:00:00"
],
"value": [
7.960582256317139,
7.7414960861206055,
7.728490352630615,
8.267574310302734,
8.543140411376953,
8.298713684082031,
8.105557441711426
],
"input_tokens": 43,
"output_tokens": 7,
"finetune_tokens": 0
},
"message": "success",
"details": "request successful",
"code": "B10",
"support": "If you have questions or need support, please email ops@nixtla.io",
"requestID": "2JHQL2LDUX"
}
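A response shaped like the sample above can be unpacked with the standard json module. The following sketch pairs each timestamp with its forecast value and reads the token counts. The function name parse_forecast is hypothetical, and the code assumes that real responses follow the same field layout as the sample.

```python
import json


def parse_forecast(response_text):
    """Return ([(timestamp, value), ...], token usage) from a forecast response."""
    body = json.loads(response_text)
    if body.get("status") != 200:
        raise RuntimeError(f"forecast failed: {body.get('message')}")
    data = body["data"]
    forecast = list(zip(data["timestamp"], data["value"]))
    usage = {name: data.get(name, 0)
             for name in ("input_tokens", "output_tokens", "finetune_tokens")}
    return forecast, usage


# Abbreviated copy of the sample response (support and requestID fields omitted).
sample_response = json.dumps({
    "status": 200,
    "data": {
        "timestamp": ["2016-01-14 00:00:00", "2016-01-15 00:00:00",
                      "2016-01-16 00:00:00", "2016-01-17 00:00:00",
                      "2016-01-18 00:00:00", "2016-01-19 00:00:00",
                      "2016-01-20 00:00:00"],
        "value": [7.960582256317139, 7.7414960861206055, 7.728490352630615,
                  8.267574310302734, 8.543140411376953, 8.298713684082031,
                  8.105557441711426],
        "input_tokens": 43,
        "output_tokens": 7,
        "finetune_tokens": 0,
    },
    "message": "success",
    "details": "request successful",
    "code": "B10",
})

forecast, usage = parse_forecast(sample_response)
for timestamp, value in forecast:
    print(timestamp, round(value, 2))
print(usage)
```

The input_tokens and output_tokens values match the counting rule from the token estimate section: 43 historical data points in the request and 7 forecast points in the response.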
Cost and quotas
Cost and quota considerations for TimeGEN-1 deployed as a serverless API
Nixtla offers TimeGEN-1 deployed as a serverless API through Azure Marketplace. TimeGEN-1 is integrated with Azure AI Foundry for use. You can find more information about Azure Marketplace pricing when you deploy the model.
Each time a project subscribes to a given offer from Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference; however, multiple meters are available to track each scenario independently.
For more information about how to track costs, visit monitor costs for models offered through Azure Marketplace.
Quota is managed per deployment. Each deployment has a rate limit of 200,000 tokens per minute and 1,000 API requests per minute. However, each project is currently limited to one deployment per model. Contact Microsoft Azure Support if the current rate limits are insufficient for your scenarios.
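To get a feel for these limits, the following sketch checks a planned workload against both of them. It assumes that the tokens-per-minute limit applies to the sum of input and output tokens across requests, which this article doesn't state explicitly. The function names are illustrative.

```python
TOKENS_PER_MINUTE_LIMIT = 200_000
REQUESTS_PER_MINUTE_LIMIT = 1_000


def fits_rate_limits(requests_per_minute, tokens_per_request):
    """Check a steady workload against both per-deployment limits."""
    total_tokens = requests_per_minute * tokens_per_request
    return (requests_per_minute <= REQUESTS_PER_MINUTE_LIMIT
            and total_tokens <= TOKENS_PER_MINUTE_LIMIT)


def max_requests_per_minute(tokens_per_request):
    """Largest request rate that stays within both limits."""
    return min(REQUESTS_PER_MINUTE_LIMIT,
               TOKENS_PER_MINUTE_LIMIT // tokens_per_request)


# A request like the sample uses 43 input + 7 output = 50 tokens,
# so the request limit is the binding one.
print(max_requests_per_minute(50))
# Larger requests of 500 tokens each are bounded by the token limit instead.
print(max_requests_per_minute(500))
```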