Visualize TensorBoard logs for a job running on Azure ML

Question

Ahsan Iqbal 0

I have a training job running on Azure ML. The job is submitted using the Azure CLI. In the job.yaml I configured the TensorBoard service as specified here:

services:
  my_tensor_board:
    type: tensor_board
    log_dir: "outputs"
    nodes: all

To get the TensorBoard link, I run the following command:

az ml job show-services --name my_job_name --resource-group my_resource_grp --workspace-name my_workspace_name

It returns a JSON response with a link to TensorBoard. However, when I follow the link, no logs are shown.

Am I doing something wrong?

dupammi 8,615 Reputation points Microsoft External Staff

2024-01-11T08:29:32.9066667+00:00

Hi @Ahsan Iqbal, we haven't heard from you since the last response and are just checking back to see whether you had a chance to review it. Thank you.

dupammi 8,615 Reputation points Microsoft External Staff

2024-01-12T11:21:59.6766667+00:00

Hi @Ahsan Iqbal, we haven't heard from you since the last response and are just checking back to see whether you had a chance to review it. Thank you.

1 answer


Answer 1

Hi @Ahsan Iqbal,

Thank you for using the Microsoft Q&A forum.

To debug the issue, I suggest first running the train.py script on its own to confirm that it runs without errors.

Once the script runs successfully, make sure it saves its TensorBoard logs to the same directory that log_dir points to in the job.yaml. That way you can confirm that the logs are actually being generated and saved in the ML environment.
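One rough way to confirm that logs are being written is to look for TensorBoard event files in the log directory after a run. The snippet below is only a minimal sketch: it assumes the script writes standard TensorBoard event files (named events.out.tfevents.*) and that the directory is the "outputs" folder from the job.yaml in the question. The helper names find_event_files and describe_log_dir are made up for illustration and are not part of any Azure ML or TensorBoard API.

```python
from pathlib import Path


def find_event_files(log_dir):
    """Return TensorBoard event files found anywhere under log_dir, sorted by path."""
    root = Path(log_dir)
    if not root.is_dir():
        # A missing directory simply means nothing has been written yet.
        return []
    # Assumption: the writer uses the standard "events.out.tfevents.<...>" file naming.
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


def describe_log_dir(log_dir):
    """Summarize whether any event files exist under log_dir."""
    files = find_event_files(log_dir)
    if not files:
        return f"No TensorBoard event files under {log_dir!r}"
    total_bytes = sum(p.stat().st_size for p in files)
    return f"{len(files)} event file(s), {total_bytes} bytes under {log_dir!r}"


if __name__ == "__main__":
    # "outputs" matches the log_dir value from the job.yaml in the question.
    print(describe_log_dir("outputs"))
```

If this reports no event files after a local run of train.py, the script is probably writing its logs somewhere other than the directory TensorBoard is pointed at.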

When you create a job with the Azure Machine Learning CLI, it runs on the compute target specified in the compute field of the job.yml file. To check whether the job compute can access the script without issues, navigate to the job and check its logs.

By following these steps, you can isolate the issue and determine whether it lies in the job.yml file or in the train.py script.

I hope this helps! Thank you.
