How do I train models in Python
- Load pre-trained checkpointed model and continue retraining?
- Relate alpha, beta1, beta2 and epsilon to learning rate and momentum in adam_sgd?
- Train two or more models jointly?
- Train with a weighted loss?
- Train a multilabel classifier in Python?
- Train an Image auto encoder using Deconvolution and Unpooling?
- Object Detection using Fast R-CNN?
- Build your own image classifier using Transfer Learning?
Load pre trained checkpointed model and continue retraining
You have a CNTK trainer
object and save a checkpoint file.
checkpoint_file = "mModel_checkpointed.dnn"
trainer.save_checkpoint(checkpoint_file)
At a later time I load the ".dnn"
file as my model
mymodel.load_model(checkpoint_file)
When you restart training and it would load the model but not the weights. To be able to restore training from the previously checkpointed model. You need to use trainer.restore_from_checkpoint
instead, to recreate the trainer and learners. One important thing is that in your Python script the order of network creation (not only structure! but the order in which you create your nodes) should stay the same at the point when you create a checkpoint and when you restore from it.
Relate alpha, beta1, beta2 and epsilon to learning rate and momentum in adam_sgd
- Alpha is the learning_rate
- Beta1 is momentum parameter
- Beta2 is variance_momentum parameter
Train two or more models jointly
There are two cases that this can arise. One is multitask learning the other is you have one processing pipeline for some part of your input and another pipeline for the other.
- Multitask learning is when you have multiple criteria you want your network to be good at. The standard way to deal with this is to construct a single number out of all these criteria. For example
cross_entropy = cross_entropy_with_softmax(predictions, labels)
reconstruction_error = squared_error(reconstruction, features)
combined_loss = 0.3 * reconstruction_error + cross_entropy
- Multiple pipelines such as the one below are also tricky
q_embed = qmodel(question)
a_embed = amodel(answer)
The trainer takes a single function whose parameters should be modified by the training procedure. How can we pass the union of qmodel
and amodel
or the union of cross_entropy
and reconstruction_error
? The answer is to find a point where all the parameters are part of the same function. That happens at the very least in the final loss.
i.e. loss.parameters
contains all parameters the loss depends on. Typically however our model should not
include the loss, as it only makes sense for training. Therefore a better approach is to use combine to create a combined model
final_model = combine(predictions, reconstruction)
For the separate pipeline case there is probably a place where everything gets combined
qa_score = score(q_embed,a_embed)
then qa_score
can play the role of final_model
above.
Train with a weighted loss
The specification of a per-example weight in the loss is as simple as
weight = input_variable((1))
weighted_loss = weight * loss
where loss
is any builtin or user-defined loss function. Of course during training each minibatch will need to have a mapping from weight to actual values (one for each example).
Train a multilabel classifier in Python
For multilabel classification you should avoid using CrossEntropy as it can only handle input vectors that sum to 1. A sensible alternative is to use hamming loss.
def my_hamming_loss:
y = placeholder()
p = placeholder()
hammingLoss = reduce_sum (not_equal (y, (greater (p, 0.5))))
return hammingLoss
which you then can call as hammingLoss(labels, prob) and pass as the error metric to the Trainer.
Train an Image Auto Encoder Using Deconvolution And Unpooling
There are instructions here.
Train Object Detection using Fast R CNN
There are instructions here.
Train an image classifier using Transfer Learning
There are instructions here