SSH launch fails when host file has more than 64 hosts #6198

Open
@bwbarrett

We're seeing launch failures when the hostfile has more than 64 hosts. Passing the --mca routed direct MCA parameter works around the problem. The platform was x86_64 Linux in EC2; each instance has 2 cores (4 hyperthreads). The hostfile looked like this:

172.31.16.122
172.31.16.222
172.31.16.67
172.31.16.80
172.31.17.114
172.31.17.135
172.31.17.173
172.31.17.178
172.31.17.181
172.31.17.235
172.31.17.244
172.31.17.254
172.31.17.26
172.31.17.7
172.31.18.106
172.31.18.143
172.31.18.187
172.31.18.28
172.31.18.36
172.31.18.82
172.31.19.153
172.31.19.31
172.31.19.64
172.31.19.99
172.31.20.109
172.31.20.139
172.31.20.45
172.31.20.48
172.31.20.54
172.31.20.92
172.31.21.198
172.31.21.247
172.31.21.35
172.31.21.49
172.31.22.105
172.31.22.187
172.31.22.233
172.31.22.96
172.31.22.97
172.31.23.139
172.31.23.15
172.31.23.17
172.31.23.176
172.31.23.18
172.31.23.197
172.31.23.226
172.31.23.240
172.31.23.46
172.31.23.59
172.31.24.106
172.31.24.125
172.31.24.134
172.31.24.153
172.31.24.159
172.31.24.190
172.31.24.59
172.31.24.64
172.31.25.105
172.31.25.147
172.31.25.204
172.31.25.205
172.31.25.74
172.31.26.126
172.31.26.146
172.31.26.232
172.31.26.254
172.31.26.65
172.31.26.69
172.31.27.129
172.31.27.148
172.31.27.184
172.31.27.198
172.31.27.234
172.31.27.28
172.31.27.35
172.31.28.13
172.31.28.22
172.31.28.221
172.31.28.30
172.31.28.38
172.31.28.75
172.31.29.20
172.31.29.232
172.31.29.40
172.31.29.42
172.31.29.46
172.31.29.63
172.31.29.78
172.31.30.21
172.31.30.245
172.31.30.31
172.31.30.48
172.31.30.82
172.31.31.126
172.31.31.159
172.31.31.82
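
For anyone trying to reproduce this, here is a rough sketch of a launcher helper that applies the workaround mentioned above only when the hostfile is large. The 64-host threshold is just the number observed in this report, not a confirmed limit, and `count_hosts`, `build_mpirun_argv`, `hosts.txt`, and `./a.out` are hypothetical names used for illustration.

```python
import shlex

# Threshold taken from this report; an observation, not a confirmed limit.
HOST_THRESHOLD = 64


def count_hosts(hostfile_text):
    """Count non-empty, non-comment lines in the hostfile text."""
    return sum(
        1
        for line in hostfile_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def build_mpirun_argv(hostfile_path, hostfile_text, program):
    """Build an mpirun argument list, adding the workaround for large hostfiles."""
    argv = ["mpirun", "--hostfile", hostfile_path]
    if count_hosts(hostfile_text) > HOST_THRESHOLD:
        # Workaround reported in this issue: use the direct routed component.
        argv += ["--mca", "routed", "direct"]
    argv.append(program)
    return argv


if __name__ == "__main__":
    # Placeholder addresses standing in for a 96-host hostfile like the one above.
    hosts = "\n".join(f"10.0.0.{i}" for i in range(1, 97))
    print(shlex.join(build_mpirun_argv("hosts.txt", hosts, "./a.out")))
```

This only builds the command line; it does not launch anything, so it can be checked without an MPI installation.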
