Fastai error when running an experiment on Azure ML

SANSONE Sara 1 Reputation point
2021-02-19T16:55:12.05+00:00

Hello everybody,
I’m working with fastai (version 2.1.7) on Azure Machine Learning (Azure ML) and I’ve run into an issue.

If I train a model directly in the notebook, everything works fine.
When I run exactly the same Python code as an experiment, I get the following error:

Traceback (most recent call last):
  File "train.py", line 75, in <module>
    learn.fit_one_cycle(8, 3e-3)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/callback/schedule.py", line 112, in fit_one_cycle
    self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 205, in fit
    self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 154, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 196, in _do_fit
    self._with_events(self._do_epoch, 'epoch', CancelEpochException)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 154, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 190, in _do_epoch
    self._do_epoch_train()
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 182, in _do_epoch_train
    self._with_events(self.all_batches, 'train', CancelTrainException)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 154, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/learner.py", line 160, in all_batches
    for o in enumerate(self.dl): self.one_batch(*o)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/data/load.py", line 103, in __iter__
    yield self.after_batch(b)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastcore/transform.py", line 198, in __call__
    def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastcore/transform.py", line 150, in compose_tfms
    x = f(x, **kwargs)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/vision/augment.py", line 34, in __call__
    self.before_call(b, split_idx=split_idx)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/vision/augment.py", line 377, in before_call
    self.do,self.mat = True,self._get_affine_mat(b)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/vision/augment.py", line 388, in _get_affine_mat
    aff_m = _init_mat(x)
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/fastai/vision/augment.py", line 286, in _init_mat
    mat = torch.eye(3, device=x.device).float()
AttributeError: 'list' object has no attribute 'device'
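
The last frame shows that the batch reaching `_init_mat` is a plain Python list, so `x.device` fails. Since the same code works in the notebook, one thing worth ruling out is a difference in installed package versions between the notebook kernel and the experiment environment. This is only an assumption; the traceback alone does not confirm it. A minimal sketch that could be placed at the top of `train.py` and also run in the notebook to compare the two outputs (the helper name `report_versions` and the package list are illustrative, chosen from the paths in the traceback):

```python
import importlib
import sys


def report_versions(package_names):
    """Return {name: version string} for the interpreter and each package."""
    versions = {"python": sys.version.split()[0]}
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from this environment.
            versions[name] = "not installed"
            continue
        # Not every package exposes __version__, so fall back to "unknown".
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    # Assumed list of relevant packages; adjust as needed.
    packages = ["fastai", "fastcore", "torch", "torchvision"]
    for name, version in report_versions(packages).items():
        print(f"{name}: {version}")
```

If the versions printed by the notebook and by the experiment run differ, that would point to the environment definition rather than the training code.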

Has anyone run into the same issue?
Do you have any idea what might be causing it?
Thanks a lot

Azure Machine Learning