Error in creating cyclecloud cluster (No nodes found for nodearray hpc)

Question

De Matteis Tiziano 26

Hello,

I'm trying to set up my first CycleCloud cluster, but I keep getting an error in the initialization phase.

In particular, it complains about not finding nodes for "nodearray hpc".

The full error message:

CycleCloud Version: 8.2.0-1616
Cluster: Test2 (version 8.2.x)
==============================

Status: Error [Software Configuration] (retrying)
Start Time: 2022-03-13T17:31:29.377Z

Description: Unable to execute command `"bash"  "/tmp/chef-script20220313-15831-14pjjqq"` (exit code 1)

Detail: 
STDOUT: 
STDERR: Upgrade not required!
Bucket has a max_count <= 0, defined for machinetype=='Standard_F2s_v2'. Skipping
/opt/cycle/slurm/cyclecloud_slurm.py:571: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
  logging.warn("No nodes were created for nodearray %s using name format %s and offset %s: %s", request_set.nodearray, request_set.name_format,
No nodes were created for nodearray hpc using name format hpc-pg0-%d and offset 1: Limited by 200 total cores (10 of Standard_D4_v2) quota in eastus
Bucket has a max_count <= 0, defined for machinetype=='Standard_F2s_v2'. Skipping
Unhandled failure.
Traceback (most recent call last):
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 1101, in <module>
    main()
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 1078, in main
    args.func(**kwargs)
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 296, in generate_slurm_conf
    _generate_slurm_conf(partitions, writer, subprocess)
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 222, in _generate_slurm_conf
    raise RuntimeError("No nodes found for nodearray %s. Please run 'cyclecloud_slurm.sh create_nodes' first!" % partition.nodearray)
RuntimeError: No nodes found for nodearray hpc. Please run 'cyclecloud_slurm.sh create_nodes' first!
Traceback (most recent call last):
  File "/opt/cycle/jetpack/system/embedded/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/cycle/jetpack/system/embedded/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 1101, in <module>
    main()
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 1078, in main
    args.func(**kwargs)
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 296, in generate_slurm_conf
    _generate_slurm_conf(partitions, writer, subprocess)
  File "/opt/cycle/slurm/cyclecloud_slurm.py", line 222, in _generate_slurm_conf
    raise RuntimeError("No nodes found for nodearray %s. Please run 'cyclecloud_slurm.sh create_nodes' first!" % partition.nodearray)
RuntimeError: No nodes found for nodearray hpc. Please run 'cyclecloud_slurm.sh create_nodes' first!
EXCEPTION: bash[Create cyclecloud.conf] (slurm::scheduler line 156) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'


Affected Nodes (1):
---
Node Name: scheduler
Hostname: ip-0A000005
IP Address: 10.0.0.5
Azure Resource ID: /subscriptions/8e7cef8b-5b7e-469a-8dd8-4cb4835f1727/resourceGroups/Test2-MEYTCMLBMU2GMLJQGU4DCLJUGA/providers/Microsoft.Compute/virtualMachines/scheduler-MIZGGNZQGRQWGLJVGUYGCLJUGB
Azure VM ID: df70aba9-e8a0-4f15-8d86-764423838920
Cluster-Init: slurm:default:2.4.7, slurm:scheduler:2.4.7
Node ID: be007f51-9d51-4a42-ae4b-0eab606c563a

Any suggestion on what could be the cause?

My cluster configuration is the following:

CycleCloud 8.2
Slurm as the cluster manager
Standard_D12_v2 for the scheduler, Standard_D4_v2 for the HPC instances, and Standard_F2s_v2 for the HTC instances
16 as the maximum number of HPC cores (I wanted just two machines)
CentOS 7 as the image for all instances

Thank you for your support!

vipullag-MSFT 26,487 Reputation points Moderator

2022-03-14T09:52:21.317+00:00

@De Matteis Tiziano

I think the issue is related to quota. Can you please check whether you have sufficient quota for the hpc node array?

Check the quotas in your subscription; maybe you can use a different SKU and test the configuration. Besides, the D-series v2 is quite outdated, and v3 and v4 are available. Can you try a more recent D-series size?

De Matteis Tiziano 26 Reputation points

2022-03-14T10:18:16.067+00:00

Hi @vipullag-MSFT

I'm not sure about the "HPC node array" quota (I didn't find it on my quotas page), but changing the SKU to v4 did the trick and the initialization was successful.

Thanks!

vipullag-MSFT 26,487 Reputation points Moderator

2022-03-14T11:23:58.813+00:00

@De Matteis Tiziano

Thanks for confirming that the issue is resolved. Please 'Accept as answer' so that it can help others in the community looking for help on similar topics.
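
The quota diagnosis in the comments above can be read directly from the STDERR in the question: the final RuntimeError is a downstream symptom, while the earlier "No nodes were created" line names the limit that was hit. As a minimal sketch, the following Python pulls that line out of a pasted log. The function name `find_creation_failures` and the regular expression are illustrative assumptions based only on the message format shown in this thread; they are not part of CycleCloud.

```python
import re

# Matches the runtime message printed by cyclecloud_slurm.py in the log above.
# Requiring a numeric offset (\d+) skips the echoed source line that still
# contains the unformatted "%s" placeholders.
NO_NODES = re.compile(
    r"No nodes were created for nodearray (?P<nodearray>\S+) "
    r"using name format (?P<name_format>\S+) and offset (?P<offset>\d+): "
    r"(?P<reason>.+)"
)


def find_creation_failures(stderr_text):
    """Return (nodearray, reason) pairs for every node-creation failure found."""
    failures = []
    for line in stderr_text.splitlines():
        match = NO_NODES.search(line)
        if match:
            failures.append((match.group("nodearray"), match.group("reason").strip()))
    return failures


# Lines copied from the STDERR in the question.
sample = "\n".join([
    "Bucket has a max_count <= 0, defined for machinetype=='Standard_F2s_v2'. Skipping",
    '  logging.warn("No nodes were created for nodearray %s using name format %s and offset %s: %s", request_set.nodearray, request_set.name_format,',
    "No nodes were created for nodearray hpc using name format hpc-pg0-%d and offset 1: "
    "Limited by 200 total cores (10 of Standard_D4_v2) quota in eastus",
    "RuntimeError: No nodes found for nodearray hpc. Please run 'cyclecloud_slurm.sh create_nodes' first!",
])

for nodearray, reason in find_creation_failures(sample):
    print(f"{nodearray}: {reason}")
```

On the log in this thread, the only line it reports is the quota message for the hpc node array, which is consistent with the moderator's suggestion to check quota first.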

Accepted answer

Answer 1

vipullag-MSFT 26,487 Moderator

@De Matteis Tiziano

Thanks for reaching out on Microsoft Q&A Platform.

You can use a different SKU and test the configuration. Besides, the D-series v2 is quite outdated, and v3 and v4 are available. Can you try a more recent D-series size?

Hope this helps.
Please 'Accept as answer' if it helped, so that it can help others in the community looking for help on similar topics.
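
Before switching SKUs, it can help to check whether the requested nodes fit in the remaining vCPU quota for the VM family. The sketch below is only illustrative arithmetic: the vCPU counts are the published sizes of the three VM types mentioned in the question, but the quota limit and usage figures are hypothetical placeholders, not values from this thread. Real figures would come from the subscription's quota page or, for example, `az vm list-usage --location eastus`. The helper name `nodes_that_fit` is also an assumption.

```python
# vCPUs per VM size for the sizes mentioned in the question.
VCPUS = {
    "Standard_D4_v2": 8,
    "Standard_D12_v2": 4,
    "Standard_F2s_v2": 2,
}


def nodes_that_fit(vm_size, quota_limit, quota_used):
    """Return how many whole VMs of vm_size fit in the unused vCPU quota."""
    free_vcpus = max(quota_limit - quota_used, 0)
    return free_vcpus // VCPUS[vm_size]


# Hypothetical figures: a family limit of 20 vCPUs with 4 already in use
# (for instance by a scheduler VM counted against the same family).
print(nodes_that_fit("Standard_D4_v2", quota_limit=20, quota_used=4))

# With a hypothetical limit of 10 vCPUs, not even one 8-vCPU node fits.
print(nodes_that_fit("Standard_D4_v2", quota_limit=10, quota_used=4))
```

If the result is smaller than the number of nodes the cluster needs (two 8-vCPU machines for the 16-core maximum in the question), either a quota increase or a VM family with spare quota is needed.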
