Closed
Description
Open MPI version
git clone of ompi v3.0.x (2f13cce)
Details of the problem
If an OMPI installation is moved to a different location, the OPAL_PREFIX
env var is used to identify the new location, and the installdirs/env
component handles this correctly.
PMIx also has an MCA infrastructure and a similar installdirs component. In PMIx, PMIX_INSTALL_PREFIX
plays the role of OPAL_PREFIX. The important difference is that OPAL_PREFIX
is an OMPI variable and it gets propagated to the orte daemons, as this launch template shows:
[cn01:17570] [[10714,0],0] plm:rsh: final template argv:
/usr/bin/ssh <template> \
OPAL_PREFIX=<ompi-path>/ompi-v3.0.x ; export OPAL_PREFIX; \
PATH=<ompi-path>/ompi-v3.0.x/bin:$PATH ; export PATH ; \
LD_LIBRARY_PATH=<ompi-path>/ompi-v3.0.x/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; \
DYLD_LIBRARY_PATH=<ompi-path>/ompi-v3.0.x/lib:$DYLD_LIBRARY_PATH ; export DYLD_LIBRARY_PATH ; \
<ompi-path>/ompi-v3.0.x/bin/orted -mca orte_debug_daemons "1" -mca ess "env" -mca ess_base_jobid "702152704" \
-mca ess_base_vpid "<template>" -mca ess_base_num_procs "2" -mca orte_node_regex "cn01,cn02@0(2)" \
-mca orte_hnp_uri "702152704.0;tcp://<IP1>,<IP2>;ud://<UD>" -mca coll_hcoll_enable "1" -mca pml "yalla" \
--mca plm_base_verbose "100" -mca plm "rsh" -mca rmaps_base_mapping_policy "node" \
-mca hwloc_base_binding_policy "core" -mca rmaps_base_display_map "1"
As can be seen there, PMIX_INSTALL_PREFIX
is not propagated, so the orteds fail with:
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
listener-thread-start
But I couldn't open the help file:
<OLD-INSTALLATION_PATH>/ompi-v3.0.x/share/pmix/help-pmix-server.txt: No such file or directory. Sorry!
This happens because the orteds can't find the PMIx MCA ptl components.
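To illustrate the missing step, here is a minimal sketch (not the actual fix) that derives PMIX_INSTALL_PREFIX from OPAL_PREFIX in the environment an orted starts in. The helper name `sync_pmix_prefix` is hypothetical. The sketch assumes the embedded PMIx lives under the same prefix as OMPI, which the `share/pmix` help-file path above suggests but does not prove. It would have to run on each remote node before orted starts (for example from a shell startup file that the ssh session reads), because the rsh template does not carry the variable.

```shell
# Hypothetical helper: mirror OPAL_PREFIX into PMIX_INSTALL_PREFIX.
# Assumption: the embedded PMIx is installed under the same prefix as OMPI.
sync_pmix_prefix() {
    # Act only when OPAL_PREFIX is set and PMIX_INSTALL_PREFIX is not,
    # so an explicit user-provided PMIx prefix is never overwritten.
    if [ -n "${OPAL_PREFIX:-}" ] && [ -z "${PMIX_INSTALL_PREFIX:-}" ]; then
        PMIX_INSTALL_PREFIX="$OPAL_PREFIX"
        export PMIX_INSTALL_PREFIX
    fi
}

sync_pmix_prefix
```

An explicitly set PMIX_INSTALL_PREFIX is left alone, so this only fills the gap for relocated installs where OPAL_PREFIX is the sole hint about the new location.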