Commit ed85512

Update to track PMIx v2.0.1

Signed-off-by: Ralph Castain <rhc@open-mpi.org>

Author: Ralph Castain
1 parent e31e8b9, commit ed85512

12 files changed, with 206 additions and 156 deletions

opal/mca/pmix/pmix2x/pmix/AUTHORS

Lines changed: 3 additions & 12 deletions
@@ -9,31 +9,22 @@ Email Name Affiliation(s)
 alinask Elena Shipunova Mellanox
 annu13 Annapurna Dasari Intel
 artpol84 Artem Polyakov Mellanox
-ashleypittman Ashley Pittman Intel
 dsolt Dave Solt IBM
-garlick Jim Garlick LLNL
 ggouaillardet Gilles Gouaillardet RIST
 hjelmn Nathan Hjelm LANL
 igor-ivanov Igor Ivanov Mellanox
 jladd-mlnx Joshua Ladd Mellanox
-jjhursey Joshua Hursey IBM
-jsquyres Jeff Squyres Cisco
-karasevb Boris Karasev Mellanox
-kawashima-fj Takahiro Kawashima Fujitsu
+jsquyres Jeff Squyres Cisco, IU
 nkogteva Nadezhda Kogteva Mellanox
-nysal Nysal Jan KA IBM
-PHHargrove Paul Hargrove LBNL
-rhc54 Ralph Castain Intel
+rhc54 Ralph Castain LANL, Cisco, Intel
 ------------------------------- --------------------------- -------------------
 
 Affiliation abbreviations:
 --------------------------
 Cisco = Cisco Systems, Inc.
-Fujitsu = Fujitsu
 IBM = International Business Machines, Inc.
 Intel = Intel, Inc.
+IU = Indiana University
 LANL = Los Alamos National Laboratory
-LBNL = Lawrence Berkeley National Laboratory
-LLNL = Lawrence Livermore National Laboratory
 Mellanox = Mellanox
 RIST = Research Organization for Information Science and Technology

opal/mca/pmix/pmix2x/pmix/INSTALL

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@ This file is a *very* short overview of building and installing
 the PMIx library. Much more information is available on the
 PMIx web site (e.g., see the FAQ section):
 
-http://pmix.github.io/pmix/pmix
+http://pmix.github.io/pmix/master
 
 
 Developer Builds
@@ -34,7 +34,7 @@ If you have checked out a DEVELOPER'S COPY of PMIx (i.e., you checked
 out from Git), you should read the HACKING file before attempting to
 build PMIx. You must then run:
 
-shell$ ./autogen.pl
+shell$ ./autogen.sh
 
 You will need very recent versions of GNU Autoconf, Automake, and
 Libtool. If autogen.sh fails, read the HACKING file. If anything

opal/mca/pmix/pmix2x/pmix/NEWS

Lines changed: 0 additions & 59 deletions
@@ -24,65 +24,6 @@ current release as well as the "stable" bug fix release branch.
 Master (not on release branches yet)
 ------------------------------------
 
-
-2.0.0
-------
-**** NOTE: This release implements the complete PMIX v2.0 Standard
-**** and therefore includes a number of new APIs and features. These
-**** can be tracked by their RFC's in the RFC repository at:
-**** https://github.com/pmix/RFCs. A formal standards document will
-**** be included in a later v2.x release. Some of the changes are
-**** identified below.
-- Added the Modular Component Architecture (MCA) plugin manager and
-  converted a number of operations to plugins, thereby allowing easy
-  customization and extension (including proprietary offerings)
-- Added support for TCP sockets instead of Unix domain sockets for
-  client-server communications
-- Added support for on-the-fly Allocation requests, including requests
-  for additional resources, extension of time for currently allocated
-  resources, and return of identified allocated resources to the scheduler
-  (RFC 0005 - https://github.com/pmix/RFCs/blob/master/RFC0005.md)
-- Tightened rules on the processing of PMIx_Get requests, including
-  reservation of the "pmix" prefix for attribute keys and specifying
-  behaviors associated with the PMIX_RANK_WILDCARD value
-  (RFC 0009 - https://github.com/pmix/RFCs/blob/master/RFC0009.md)
-- Extended support for tool interactions with a PMIx server aimed at
-  meeting the needs of debuggers and other tools. Includes support
-  for rendezvousing with a system-level PMIx server for interacting
-  with the system management stack (SMS) outside of an allocated
-  session, and adds two new APIs:
-    - PMIx_Query: request general information such as the process
-      table for a specified job, and available SMS capabilities
-    - PMIx_Log: log messages (e.g., application progress) to a
-      system-hosted persistent store
-  (RFC 0010 - https://github.com/pmix/RFCs/blob/master/RFC0010.md)
-- Added support for fabric/network interactions associated with
-  "instant on" application startup
-  (RFC 0012 - https://github.com/pmix/RFCs/blob/master/RFC0012.md)
-- Added an attribute to support getting the time remaining in an
-  allocation via the PMIx_Query interface
-  (RFC 0013 - https://github.com/pmix/RFCs/blob/master/RFC0013.md)
-- Added interfaces to support job control and monitoring requests,
-  including heartbeat and file monitors to detect stalled applications.
-  Job control interface supports standard signal-related operations
-  (pause, kill, resume, etc.) as well as checkpoint/restart requests.
-  The interface can also be used by an application to indicate it is
-  willing to be pre-empted, with the host RM providing an event
-  notification when the preemption is desired.
-  (RFC 0015 - https://github.com/pmix/RFCs/blob/master/RFC0015.md)
-- Extended the event notification system to support notifications
-  across threads in the same process, and the ability to direct
-  ordering of notifications when registering event handlers.
-  (RFC 0018 - https://github.com/pmix/RFCs/blob/master/RFC0018.md)
-- Expose the buffer manipulation functions via a new set of APIs
-  to support heterogeneous data transfers within the host RM
-  environment
-  (RFC 0020 - https://github.com/pmix/RFCs/blob/master/RFC0020.md)
-- Fix a number of race condition issues that arose at scale
-- Enable PMIx servers to generate notifications to the host RM
-  and to themselves
-
-
 1.2.2 -- 21 March 2017
 ----------------------
 - Compiler fix for Sun/Oracle CC (PR #322)

opal/mca/pmix/pmix2x/pmix/VERSION

Lines changed: 5 additions & 5 deletions
@@ -13,7 +13,7 @@
 # major, minor, and release are generally combined in the form
 # <major>.<minor>.<release>.
 
-major=2
+major=3
 minor=0
 release=0
 
@@ -23,14 +23,14 @@ release=0
 # The only requirement is that it must be entirely printable ASCII
 # characters and have no white space.
 
-greek=
+greek=a1
 
 # If repo_rev is empty, then the repository version number will be
 # obtained during "make dist" via the "git describe --tags --always"
 # command, or with the date (if "git describe" fails) in the form of
 # "date<date>".
 
-repo_rev=git6fb501d
+repo_rev=git4c2c8d0
 
 # If tarball_version is not empty, it is used as the version string in
 # the tarball filename, regardless of all other versions listed in
@@ -44,7 +44,7 @@ tarball_version=
 
 # The date when this release was created
 
-date="Jun 19, 2017"
+date="Jun 25, 2017"
 
 # The shared library version of each of PMIx's public libraries.
 # These versions are maintained in accordance with the "Library
@@ -75,4 +75,4 @@ date="Jun 19, 2017"
 # Version numbers are described in the Libtool current:revision:age
 # format.
 
-libpmix_so_version=3:0:1
+libpmix_so_version=0:0:0
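
Note on the `libpmix_so_version` change: the value is a Libtool current:revision:age triple, not a dotted release number. As a rough illustration only, the sketch below assumes the usual Linux/ELF mapping, in which Libtool names the installed library with the suffix `<current-age>.<age>.<revision>`. The helper name `so_suffix` is hypothetical and is not part of PMIx or Libtool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse a Libtool "current:revision:age" string and
 * write the Linux-style suffix "<current-age>.<age>.<revision>" into out.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
static int so_suffix(const char *version_info, char *out, size_t outlen)
{
    unsigned current, revision, age;

    if (3 != sscanf(version_info, "%u:%u:%u", &current, &revision, &age)) {
        return -1;
    }
    if (age > current) {
        return -1;  /* Libtool requires age <= current */
    }
    int n = snprintf(out, outlen, "%u.%u.%u", current - age, age, revision);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under that assumption, the old value 3:0:1 corresponds to a `.so.2.1.0` suffix and the new value 0:0:0 to `.so.0.0.0`.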

opal/mca/pmix/pmix2x/pmix/include/pmix_common.h

Lines changed: 2 additions & 0 deletions
@@ -124,6 +124,8 @@ typedef uint32_t pmix_rank_t;
 #define PMIX_CONNECT_SYSTEM_FIRST "pmix.cnct.sys.first" // (bool) Preferentially look for a system-level PMIx server first
 #define PMIX_REGISTER_NODATA "pmix.reg.nodata" // (bool) Registration is for nspace only, do not copy job data
 #define PMIX_SERVER_ENABLE_MONITORING "pmix.srv.monitor" // (bool) Enable PMIx internal monitoring by server
+#define PMIX_SERVER_NSPACE "pmix.srv.nspace" // (char*) Name of the nspace to use for this server
+#define PMIX_SERVER_RANK "pmix.srv.rank" // (pmix_rank_t) Rank of this server
 
 
 /* identification attributes */

opal/mca/pmix/pmix2x/pmix/src/buffer_ops/copy.c

Lines changed: 1 addition & 1 deletion
@@ -425,7 +425,7 @@ PMIX_EXPORT pmix_status_t pmix_value_xfer(pmix_value_t *p, pmix_value_t *src)
             break;
         }
         /* allocate space and do the copy */
-        switch (src->type) {
+        switch (src->data.darray->type) {
             case PMIX_UINT8:
             case PMIX_INT8:
             case PMIX_BYTE:
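
Note on the `copy.c` fix: inside the data-array branch the outer value's type is always the array type, so the per-element copy has to dispatch on the type stored in the array itself. The sketch below illustrates that idea with simplified stand-in types. Every `toy_` name is hypothetical, and none of this is the real PMIx definition of `pmix_value_t` or `pmix_data_array_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins, NOT the real PMIx types */
typedef enum { T_UINT8, T_INT32, T_DATA_ARRAY } toy_type_t;

typedef struct {
    toy_type_t type;   /* type of each element */
    size_t size;       /* number of elements */
    void *array;       /* element storage */
} toy_darray_t;

typedef struct {
    toy_type_t type;       /* T_DATA_ARRAY when darray is in use */
    toy_darray_t *darray;
} toy_value_t;

/* Size of one element of the given type; 0 if unsupported */
static size_t toy_elem_size(toy_type_t t)
{
    switch (t) {
    case T_UINT8: return sizeof(uint8_t);
    case T_INT32: return sizeof(int32_t);
    default:      return 0;
    }
}

/* Deep-copy an array-valued src into dst; 0 on success, -1 on error */
static int toy_array_xfer(toy_value_t *dst, const toy_value_t *src)
{
    if (T_DATA_ARRAY != src->type || NULL == src->darray) {
        return -1;
    }
    /* Dispatch on the ELEMENT type. src->type is always T_DATA_ARRAY
     * here, so switching on it would never match an element case. */
    size_t esz = toy_elem_size(src->darray->type);
    if (0 == esz) {
        return -1;
    }
    toy_darray_t *d = malloc(sizeof(*d));
    if (NULL == d) {
        return -1;
    }
    d->type = src->darray->type;
    d->size = src->darray->size;
    d->array = NULL;
    if (0 < d->size) {
        d->array = malloc(d->size * esz);
        if (NULL == d->array) {
            free(d);
            return -1;
        }
        memcpy(d->array, src->darray->array, d->size * esz);
    }
    dst->type = T_DATA_ARRAY;
    dst->darray = d;
    return 0;
}
```

With the pre-fix logic, the equivalent of `toy_elem_size(src->type)` would be called with `T_DATA_ARRAY` and fall through to the unsupported case.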

opal/mca/pmix/pmix2x/pmix/src/buffer_ops/unpack.c

Lines changed: 1 addition & 0 deletions
@@ -769,6 +769,7 @@ pmix_status_t pmix_bfrop_unpack_info(pmix_buffer_t *buffer, void *dest,
             return ret;
         }
         if (NULL == tmp) {
+            PMIX_ERROR_LOG(PMIX_ERROR);
             return PMIX_ERROR;
         }
         (void)strncpy(ptr[i].key, tmp, PMIX_MAX_KEYLEN);

opal/mca/pmix/pmix2x/pmix/src/client/pmix_client_get.c

Lines changed: 45 additions & 21 deletions
@@ -111,7 +111,7 @@ PMIX_EXPORT pmix_status_t PMIx_Get(const pmix_proc_t *proc, const char key[],
     PMIX_RELEASE(cb);
 
     pmix_output_verbose(2, pmix_globals.debug_output,
-                        "pmix:client get completed");
+                        "pmix:client get completed %d", rc);
 
     return rc;
 }
@@ -464,7 +464,7 @@ static pmix_status_t process_val(pmix_value_t *val,
     }
     nvals = 0;
     for (n=0; n < nsize; n++) {
-        if (PMIX_SUCCESS != (rc = pmix_pointer_array_add(results, &info[n]))) {
+        if (0 > (rc = pmix_pointer_array_add(results, &info[n]))) {
             return rc;
         }
         ++nvals;
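
Note on the `process_val` hunk: the new `0 >` test suggests that the add call reports the index of the stored element on success and a negative value on failure, so comparing its result against a zero-valued success code would treat every index after the first as an error. The sketch below shows that pattern with a toy fixed-size array. The `toy_` names, the capacity, and the error value are all assumptions made for illustration and do not reproduce `pmix_pointer_array_add`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SUCCESS 0
#define TOY_ERR_OUT_OF_RESOURCE (-2)   /* hypothetical error code */
#define TOY_CAPACITY 4

typedef struct {
    void *slots[TOY_CAPACITY];
    int used;
} toy_ptr_array_t;

/* Returns the index the pointer was stored at (>= 0), or a negative error */
static int toy_ptr_array_add(toy_ptr_array_t *a, void *p)
{
    if (a->used >= TOY_CAPACITY) {
        return TOY_ERR_OUT_OF_RESOURCE;
    }
    a->slots[a->used] = p;
    return a->used++;
}

/* Adds n items; returns how many were stored, or the first negative error */
static int toy_add_all(toy_ptr_array_t *a, int *items, int n)
{
    int stored = 0;
    for (int i = 0; i < n; i++) {
        int rc = toy_ptr_array_add(a, &items[i]);
        /* Only negative values are errors. Testing rc != TOY_SUCCESS
         * would bail out as soon as the returned index reached 1. */
        if (0 > rc) {
            return rc;
        }
        ++stored;
    }
    return stored;
}
```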
@@ -536,25 +536,45 @@ static void _getnbfn(int fd, short flags, void *cbdata)
     /* if the rank is WILDCARD, then they want all the job-level info,
      * so no need to check the modex */
     if (PMIX_RANK_WILDCARD != cb->rank) {
+        rc = PMIX_ERR_NOT_FOUND;
 #if defined(PMIX_ENABLE_DSTORE) && (PMIX_ENABLE_DSTORE == 1)
-        if (PMIX_SUCCESS == (rc = pmix_dstore_fetch(nptr->nspace, cb->rank, NULL, &val))) {
-#else
-        if (PMIX_SUCCESS == (rc = pmix_hash_fetch(&nptr->modex, cb->rank, NULL, &val))) {
+        /* my own data is in the hash table, so don't bother looking
+         * in the dstore if that is what they want */
+        if (pmix_globals.myid.rank != cb->rank) {
+            if (PMIX_SUCCESS == (rc = pmix_dstore_fetch(nptr->nspace, cb->rank, NULL, &val))) {
+                pmix_output_verbose(2, pmix_globals.debug_output,
+                                    "pmix_get[%d]: value retrieved from dstore", __LINE__);
+                if (PMIX_SUCCESS != (rc = process_val(val, &nvals, &results))) {
+                    cb->value_cbfunc(rc, NULL, cb->cbdata);
+                    /* cleanup */
+                    if (NULL != val) {
+                        PMIX_VALUE_RELEASE(val);
+                    }
+                    PMIX_RELEASE(cb);
+                    return;
+                }
+            }
+        }
 #endif /* PMIX_ENABLE_DSTORE */
-            pmix_output_verbose(2, pmix_globals.debug_output,
-                                "pmix_get[%d]: value retrieved from dstore", __LINE__);
-            if (PMIX_SUCCESS != (rc = process_val(val, &nvals, &results))) {
-                cb->value_cbfunc(rc, NULL, cb->cbdata);
-                /* cleanup */
-                if (NULL != val) {
-                    PMIX_VALUE_RELEASE(val);
+        if (PMIX_SUCCESS != rc) {
+            /* if the user was asking about themselves, or we aren't using the dstore,
+             * then we need to check the hash table */
+            if (PMIX_SUCCESS == (rc = pmix_hash_fetch(&nptr->modex, cb->rank, NULL, &val))) {
+                pmix_output_verbose(2, pmix_globals.debug_output,
+                                    "pmix_get[%d]: value retrieved from hash", __LINE__);
+                if (PMIX_SUCCESS != (rc = process_val(val, &nvals, &results))) {
+                    cb->value_cbfunc(rc, NULL, cb->cbdata);
+                    /* cleanup */
+                    if (NULL != val) {
+                        PMIX_VALUE_RELEASE(val);
+                    }
+                    PMIX_RELEASE(cb);
+                    return;
                 }
-                PMIX_RELEASE(cb);
-                return;
+                PMIX_VALUE_RELEASE(val);
             }
-            /* cleanup */
-            PMIX_VALUE_RELEASE(val);
-        } else {
+        }
+        if (PMIX_SUCCESS != rc) {
             /* if we didn't find a modex for this rank, then we need
              * to go get it. Thus, the caller wants -all- information for
              * the specified rank, not just the job-level info. */
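
Note on the `_getnbfn` hunk above: the rewritten lookup starts from a not-found status, consults the dstore only for ranks other than the caller's own, falls back to the hash table, and requests the data only if both miss. The sketch below models that ordering with toy backends. All `toy_` names, the function-pointer interface, and the sample ranks and values are hypothetical and only illustrate the control flow, not the PMIx implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SUCCESS 0
#define TOY_ERR_NOT_FOUND (-1)

typedef enum { SRC_NONE, SRC_DSTORE, SRC_HASH } toy_source_t;

/* Fetch callback: returns TOY_SUCCESS and fills *out, or TOY_ERR_NOT_FOUND */
typedef int (*toy_fetch_fn)(int rank, int *out);

/* Two-tier lookup: shared store first (never for our own rank), then the
 * local hash table. A final TOY_ERR_NOT_FOUND means the caller must go
 * and request the data. */
static int toy_lookup(int my_rank, int rank, bool use_dstore,
                      toy_fetch_fn dstore, toy_fetch_fn hash,
                      int *out, toy_source_t *src)
{
    int rc = TOY_ERR_NOT_FOUND;   /* start pessimistic */
    *src = SRC_NONE;

    /* our own data lives in the hash table, so skip the shared store */
    if (use_dstore && my_rank != rank) {
        rc = dstore(rank, out);
        if (TOY_SUCCESS == rc) {
            *src = SRC_DSTORE;
        }
    }
    if (TOY_SUCCESS != rc) {
        rc = hash(rank, out);
        if (TOY_SUCCESS == rc) {
            *src = SRC_HASH;
        }
    }
    return rc;
}

/* Toy backends: the shared store knows rank 1, the hash knows ranks 0 and 2 */
static int toy_dstore_fetch(int rank, int *out)
{
    if (1 == rank) {
        *out = 100;
        return TOY_SUCCESS;
    }
    return TOY_ERR_NOT_FOUND;
}

static int toy_hash_fetch(int rank, int *out)
{
    if (0 == rank || 2 == rank) {
        *out = 200 + rank;
        return TOY_SUCCESS;
    }
    return TOY_ERR_NOT_FOUND;
}
```

Initializing the status to not-found is what lets the hash-table check run both when the dstore is compiled out and when the dstore step is skipped for the caller's own rank.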
@@ -572,12 +592,17 @@ static void _getnbfn(int fd, short flags, void *cbdata)
             PMIX_RELEASE(cb);
             return;
         }
-        /* cleanup */
         PMIX_VALUE_RELEASE(val);
     }
     /* now let's package up the results */
     PMIX_VALUE_CREATE(val, 1);
     val->type = PMIX_DATA_ARRAY;
+    val->data.darray = (pmix_data_array_t*)malloc(sizeof(pmix_data_array_t));
+    if (NULL == val->data.darray) {
+        PMIX_VALUE_RELEASE(val);
+        cb->value_cbfunc(PMIX_ERR_NOMEM, NULL, cb->cbdata);
+        return;
+    }
     val->data.darray->type = PMIX_INFO;
     val->data.darray->size = nvals;
     PMIX_INFO_CREATE(iptr, nvals);
@@ -597,14 +622,13 @@ static void _getnbfn(int fd, short flags, void *cbdata)
             } else {
                 pmix_value_xfer(&iptr[n].value, &info->value);
             }
-            PMIX_INFO_FREE(info, 1);
+            PMIX_INFO_DESTRUCT(info);
         }
     }
     /* done with results array */
     PMIX_DESTRUCT(&results);
-    /* return the result to the caller */
+    /* return the result to the caller - they are responsible for releasing it */
     cb->value_cbfunc(PMIX_SUCCESS, val, cb->cbdata);
-    PMIX_VALUE_FREE(val, 1);
     PMIX_RELEASE(cb);
     return;
 }
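
Note on the final hunk: dropping the free after the callback, together with the updated comment, moves ownership of the returned value to the callback recipient. The sketch below shows that hand-off convention with toy types. The `toy_` names and the live-object counter are hypothetical and exist only to make the ownership rule checkable; this is not the PMIx callback signature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_ERR_NOMEM (-1)

typedef struct { char *data; } toy_value_t;
typedef void (*toy_cbfunc_t)(int status, toy_value_t *val, void *cbdata);

static int toy_live_values = 0;   /* counts values not yet released */

static toy_value_t *toy_value_create(const char *s)
{
    toy_value_t *v = malloc(sizeof(*v));
    if (NULL == v) {
        return NULL;
    }
    v->data = malloc(strlen(s) + 1);
    if (NULL == v->data) {
        free(v);
        return NULL;
    }
    strcpy(v->data, s);
    ++toy_live_values;
    return v;
}

static void toy_value_release(toy_value_t *v)
{
    if (NULL == v) {
        return;
    }
    free(v->data);
    free(v);
    --toy_live_values;
}

/* Producer: builds a result and hands it to the callback. It does NOT
 * release the value afterwards, because the callback now owns it. */
static void toy_deliver(toy_cbfunc_t cb, void *cbdata)
{
    toy_value_t *val = toy_value_create("ok");
    if (NULL == val) {
        cb(TOY_ERR_NOMEM, NULL, cbdata);
        return;
    }
    cb(TOY_SUCCESS, val, cbdata);
}

/* Consumer: uses the value, then releases it as the new owner */
static void toy_consumer(int status, toy_value_t *val, void *cbdata)
{
    if (TOY_SUCCESS == status && NULL != val) {
        *(char *)cbdata = val->data[0];
        toy_value_release(val);
    }
}
```

If the producer also released the value after the callback returned, a consumer following this convention would trigger a double free, which is one reason the two sides have to agree on who owns the result.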
