Conditional Hyperparameters in Sweep Job

Tom Owen 0 Reputation points
2024-06-28T13:20:33.82+00:00

Hello,

I'm working on a hyperparameter tuning job in Azure ML using the CLI/YAML schema. I want to include different optimizers in the hyperparameter search space and also tune each optimizer's own hyperparameters, but I'm concerned that doing this will cause the sweep job to trial hyperparameters that aren't actually used by the chosen optimizer, wasting compute time and resources.

Let's say I have the following function to do this:

from keras.optimizers import SGD, Adam, RMSprop, Adagrad, Adadelta

def get_optimizer(args):
    """Return the Keras optimizer named by args.optimizer, configured from the parsed CLI arguments."""
    name = args.optimizer
    if name == 'SGD':
        return SGD(learning_rate=args.learning_rate, momentum=args.momentum)
    elif name == 'Adam':
        return Adam(learning_rate=args.learning_rate, beta_1=args.beta_1, beta_2=args.beta_2, epsilon=args.epsilon)
    elif name == 'RMSprop':
        return RMSprop(learning_rate=args.learning_rate, rho=args.rho, epsilon=args.epsilon)
    elif name == 'Adagrad':
        return Adagrad(learning_rate=args.learning_rate, epsilon=args.epsilon)
    elif name == 'Adadelta':
        return Adadelta(learning_rate=args.learning_rate, rho=args.rho, epsilon=args.epsilon)
    else:
        raise ValueError(f"Unknown optimizer: {name}")

The hyperparameter 'momentum' is only relevant for Stochastic Gradient Descent (SGD), so if I use a 'choice' input for the sweep job and it chooses 'Adam' during the hyperparameter sweep, I don't want it to trial lots of different values for 'momentum', as this wastes trials and can confuse a Bayesian optimization algorithm.

Ideally, it'd be good to be able to define the search space in the YAML schema for the sweep job as such:

search_space:
  optimizer:
    type: choice
    values: ['SGD', 'Adam', 'RMSprop', 'Adagrad', 'Adadelta']
  learning_rate:
    type: uniform
    min_value: 0.0001
    max_value: 0.01
  beta_1:
    type: uniform
    min_value: 0.85
    max_value: 0.99
    conditional:
      - parent: optimizer
        value: 'Adam'
  beta_2:
    type: uniform
    min_value: 0.9
    max_value: 0.999
    conditional:
      - parent: optimizer
        value: 'Adam'
  momentum:
    type: uniform
    min_value: 0.5
    max_value: 0.9
    conditional:
      - parent: optimizer
        value: 'SGD'

Does anyone know if there's a different way I can achieve what I'm after, or do I need to re-think the hyperparameter tuning strategy and use Hyperopt in Databricks or something like that?

Many thanks

Azure Machine Learning

1 answer

  1. YutongTie-MSFT 52,776 Reputation points
    2024-06-29T05:37:28.7033333+00:00

    Hello Tom,

    Thanks for reaching out to us. There are two alternative approaches you may want to consider; please take a look and see whether either of them works better for you.

    Custom Script for Dynamic Optimization: If Azure ML’s YAML schema doesn’t fully meet your needs, you might consider writing a custom Python script where you programmatically define hyperparameter search spaces based on the chosen optimizer. This requires more manual setup but offers flexibility.
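    For instance, one workable pattern is to submit one sweep job per optimizer, where each sweep defines only the hyperparameters that optimizer actually uses, so Bayesian sampling never sees irrelevant dimensions. Below is a minimal sketch using the Azure ML Python SDK v2; the script name (train.py), environment (keras-env@latest), compute target (gpu-cluster) and logged metric name (val_loss) are placeholders for your own setup, and it assumes train.py accepts the corresponding command-line flags.

    from azure.ai.ml import MLClient, command
    from azure.ai.ml.sweep import Uniform
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient.from_config(credential=DefaultAzureCredential())

    # Only the hyperparameters each optimizer actually uses are swept for it.
    optimizer_spaces = {
        "SGD":      {"momentum": Uniform(min_value=0.5, max_value=0.9)},
        "Adam":     {"beta_1": Uniform(min_value=0.85, max_value=0.99),
                     "beta_2": Uniform(min_value=0.9, max_value=0.999)},
        "RMSprop":  {"rho": Uniform(min_value=0.8, max_value=0.99)},
        "Adagrad":  {},
        "Adadelta": {"rho": Uniform(min_value=0.8, max_value=0.99)},
    }

    for optimizer_name, extra_space in optimizer_spaces.items():
        base_job = command(
            code="./src",
            command=(
                "python train.py --optimizer ${{inputs.optimizer}} "
                "--learning_rate ${{inputs.learning_rate}}"
                + "".join(f" --{p} ${{{{inputs.{p}}}}}" for p in extra_space)
            ),
            inputs={"optimizer": optimizer_name, "learning_rate": 0.001,
                    **{p: 0.0 for p in extra_space}},
            environment="keras-env@latest",   # placeholder environment
            compute="gpu-cluster",            # placeholder compute target
        )

        # Bind the search space: learning_rate is shared, the rest is per-optimizer.
        job_for_sweep = base_job(
            learning_rate=Uniform(min_value=0.0001, max_value=0.01),
            **extra_space,
        )

        sweep_job = job_for_sweep.sweep(
            sampling_algorithm="bayesian",
            primary_metric="val_loss",        # must match the metric your script logs
            goal="Minimize",
        )
        sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=4)
        sweep_job.display_name = f"sweep-{optimizer_name.lower()}"

        ml_client.jobs.create_or_update(sweep_job)

    The trade-off is that the results land in several smaller sweep jobs rather than one combined study, but each sweep then has a clean, unconditional search space.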

    Hyperparameter Optimization Libraries: Using libraries like Hyperopt or Optuna outside of Azure ML might provide additional flexibility in defining complex search spaces and conditional parameters, but integrating them into Azure ML workflows could require more effort.
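    As a concrete illustration of the second option, Optuna's define-by-run API can express exactly this kind of conditional search space, because a hyperparameter is only suggested inside the branch that needs it, so no trial wastes a dimension on an unused parameter. A minimal sketch, assuming a hypothetical build_and_train helper that compiles and fits your Keras model with the given optimizer and returns the validation loss:

    import optuna
    from keras.optimizers import SGD, Adam, RMSprop, Adagrad, Adadelta

    def objective(trial):
        # Shared hyperparameter.
        learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)

        # Conditional hyperparameters: each is suggested only when the chosen
        # optimizer actually uses it.
        name = trial.suggest_categorical(
            "optimizer", ["SGD", "Adam", "RMSprop", "Adagrad", "Adadelta"])
        if name == "SGD":
            optimizer = SGD(learning_rate=learning_rate,
                            momentum=trial.suggest_float("momentum", 0.5, 0.9))
        elif name == "Adam":
            optimizer = Adam(learning_rate=learning_rate,
                             beta_1=trial.suggest_float("beta_1", 0.85, 0.99),
                             beta_2=trial.suggest_float("beta_2", 0.9, 0.999))
        elif name == "RMSprop":
            optimizer = RMSprop(learning_rate=learning_rate,
                                rho=trial.suggest_float("rho", 0.8, 0.99))
        elif name == "Adagrad":
            optimizer = Adagrad(learning_rate=learning_rate)
        else:
            optimizer = Adadelta(learning_rate=learning_rate,
                                 rho=trial.suggest_float("rho", 0.8, 0.99))

        # build_and_train is a hypothetical helper: compile the model with this
        # optimizer, fit it, and return the validation loss to minimize.
        return build_and_train(optimizer)

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=50)
    print(study.best_params)

    Hyperopt supports the same idea through nested hp.choice search spaces, where each choice carries only its own sub-parameters.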

    I hope this helps. Please take a look and let us know how it works.

    Regards,

    Yutong

    -Please kindly accept the answer if you find it helpful, so it can support the community. Thanks a lot.

