@everywhere is slow on HPC with multi-node environment #39291

Closed
@algorithmx


remotecall_eval(Main, procs, ex)

Please check here for descriptions of the problem by three Julia users:

https://discourse.julialang.org/t/everywhere-takes-a-very-long-time-when-using-a-cluster/35724

I have tested @everywhere and pmap() on an HPC cluster. The test code and results are available here:
https://github.com/algorithmx/nodeba

Basically, I just put timestamps between the lines. The t*.log files show that the largest gap is the one between timestamps 3 and 4. More interestingly, I found that this gap grows linearly as nworkers() increases. I believe the gap represents the execution time of the @everywhere macro, as seen from the master process.
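A minimal sketch of this kind of measurement, for anyone who wants to reproduce the trend without the linked repository. It assumes local workers added with addprocs() stand in for the cluster nodes; the name `gaps` and the trivial expression `1 + 1` are illustrative only and are not taken from the test code above. Absolute timings on a single machine will differ from those on a multi-node HPC system.

```julia
using Distributed

# Assumption: local workers stand in for cluster nodes. On an HPC system the
# workers would normally be launched through a cluster manager instead.
gaps = Dict{Int,Float64}()   # nworkers() => seconds spent in @everywhere

for _ in 1:3
    addprocs(1)              # grow the worker pool by one local process
    @everywhere 1 + 1        # warm-up call, so first-use compilation is not timed
    # Wall-clock time of a trivial @everywhere call, as seen from the master.
    gaps[nworkers()] = @elapsed @everywhere 1 + 1
end

# Print the measured gap for each pool size, smallest pool first.
for n in sort(collect(keys(gaps)))
    println("nworkers = ", n, "  gap = ", gaps[n], " s")
end

rmprocs(workers())           # clean up the workers started above
```

Because the expression being evaluated is trivial, the measured time is mostly dispatch and synchronization overhead, which is the part expected to scale with the number of workers.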

The version info is:

julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c7 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD EPYC 7452 32-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver2)
Environment:
  JULIA_PKG_SERVER = https://mirrors.tuna.tsinghua.edu.cn/julia


Labels: parallelism (Parallel or distributed computation)
