Open
Labels: bug (Something isn't working)
Description
What happened?
The built-in Slurm controller, when enabled, binds to the first interface it finds when it tries to establish connections between the nodes. If the nodes happen to have public internet access, this means that they will try to talk to each other over the public internet, which is usually firewalled. One needs to pass Julia the --bind-to parameter (e.g. julia --bind-to=10.8.0.1), but I couldn't see a way to do it.
I ended up with this interesting workaround:
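# Replace the julia binary with a wrapper that always passes --bind-to.
cd /opt/pysr-env/julia_env/pyjuliapkg/install/bin
# Keep the original binary as julia_real (only on the first run).
if [ ! -f julia_real ]; then
    mv julia julia_real
fi
# Unquoted heredoc: $PWD and ${PRIVATE_IP} expand now; \$@ is left for the wrapper.
cat << EOT > julia
#!/bin/bash
exec "$PWD/julia_real" --bind-to=${PRIVATE_IP} "\$@"
EOT
chmod +x julia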
cd -

This might be a problem with other distributed controllers as well.
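The wrapper above bakes in whatever PRIVATE_IP holds at the moment it is written, so that variable has to be set on each node first. Here is a minimal sketch of one way to derive it, assuming the cluster network uses an RFC 1918 private range. The function name pick_private_ip is hypothetical and not part of PySR or Julia; it only filters a list of addresses supplied on stdin.

```shell
#!/bin/bash
# Hypothetical helper (not part of PySR or Julia): print the first
# RFC 1918 private IPv4 address found on stdin, one address per line.
# Returns non-zero if no private address is present.
pick_private_ip() {
    local addr
    while IFS= read -r addr; do
        case "$addr" in
            # 10.0.0.0/8, 192.168.0.0/16, 172.16.0.0/12
            10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*)
                printf '%s\n' "$addr"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a Linux node where `hostname -I` is available (an assumption about the environment), it could be used as `PRIVATE_IP=$(hostname -I | tr ' ' '\n' | pick_private_ip)` before running the workaround. If a node has several private interfaces, the first match may not be the right one, so the result is worth checking by hand.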