Skip to content

Conversation

@devreal
Copy link
Contributor

@devreal devreal commented Jul 29, 2020

This PR fixes the handling of info values throughout the code. In particular:

  1. It reverts a "fix" introduced in 9b88e60, which was meant to force opal_info_get to always copy the full info value but could lead to out-of-bounds reads and potentially non-terminated strings because it does not match the description of the interface in the header file. The interface expects a buffer of valuelen+1 bytes, where valuelen is the maximum expected length of the info value. The info value may be shorter than that.
  2. It fixes the allocation of stack buffers in several places to include the extra byte required for the null-terminator while allowing for the full OPAL_MAX_INFO_VAL string length. In ompio, I opted to limit the length of the buffer to 128B where it is clear that the info value should contain a numeric value (e.g., stripe size). If the value is larger than that it cannot be properly handled anyway. I'm happy to revert that change (the previous implementation was missing the extra byte though).
  3. It fixes the info handling in the osc/portals4 component, which was too complicated and tried to read non-existing mca parameters.

The PR also fixes some whitespace issues I found (tabs and trailing spaces). I'll be happy to take them out if that is an issue.

Fixes #7961

devreal added 4 commits July 29, 2020 18:49
This reverts commit 9b88e60.

Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Copy link
Member

@edgargabriel edgargabriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the ompio functionality I am fine with the changes.

@bosilca
Copy link
Member

bosilca commented Jul 30, 2020

You need to check with a Fortran expert here before such a major change, as the manipulation of the info in Fortran is "different". Let me tag @jsquyres our in-house Fortran guru.

@devreal
Copy link
Contributor Author

devreal commented Aug 5, 2020

I tried to understand the Fortran string handling and it looks to me like MPI_Info_get in both mpi-f.h and mpi-f08 is mapped to ompi_info_get_f, which allocates a static char array of size MPI_MAX_INFO_VAL+1 that is eventually passed to opal_info_get and later copied to the Fortran string. That should be safe with the changes in this PR. Maybe @jsquyres can confirm that?

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I applaud this PR for trying to clean up our info string handling. It's currently somewhat of a mess, and really should be cleaned up. The fact that most real-world MPI applications don't seem to bump up against MPI_MAX_INFO_VAL and MPI_MAX_INFO_KEY doesn't change the fact that our info implementation is... messy.

Second, let me confirm: yes, the Fortran handling is as @devreal describes. The one difference to keep in mind between the C and Fortran handling of strings is that C expects strings to be \0-terminated, but Fortran expects them to be space-padded on the right. Meaning: Fortran treats string copies more like memcpy -- there's a pointer and a length. E.g., if you have an 8-character string in Fortran and you put the value "dog" in there, the memory will be exactly 8 bytes long and its contents are guaranteed to be: dog<SP><SP><SP><SP><SP> (vs. C, where the contents of memory will be dog\0****, where * can be any value).

All this being said, I'm not convinced that this PR is taking the right tact.

The real issue is that MPI-3.1 does not define that info keys or values must be \0-terminated in C. Indeed, it makes very confusing/ambiguous statements about the "end of string character" (it doesn't mandate that they are present, but does insist that the valuelen does not include them). I think some improvements may have been made in MPI-4.0, but we'll still have the legacy issue about what to do for MPI-3.1.

MPI-3.1 MPI_INFO_GET says:

valuelen is the number of characters available in value. If it is less than the actual size of the value, the value is truncated. In C, valuelen should be one less than the amount of allocated space to allow for the null terminator.

Note that this doesn't mandate that \0 is there (yes, it's a mess).

Also note that in Open MPI:

  • OPAL_INFO_MAX_VAL is 256
  • MPI_INFO_MAX_VAL is 256 (i.e., it's defined to be OPAL_INFO_MAX_VAL) in C and 255 in Fortran (i.e., it's a constant equal to OPAL_INFO_MAX_VAL-1)

But with this change, we have to add one to OPAL_INFO_MAX_VAL everywhere (in C) to make it long enough. This is problematic for two reasons:

  1. The name OPAL_INFO_MAX_VAL is a lie: it's not the "max" length. It's one less than the max length. My $0.02 is that this degrades the readability of this already-confusing code.
  2. C MPI applications cannot use char foo[MPI_INFO_MAX_VAL] because it's too short.

The second point is what kills the approach of this PR for me.

Why don't we change the internals of OPAL info handling to be more like a memcpy with an associated length? E.g., something like:

typedef struct {
    char *key, *value;
    // MPI-3.1 does not guarantee that key or value are \0-terminated,
    // so we maintain their lengths manually.  These length values
    // will include the \0 _if it is present_.
    size_t key_len_in_bytes, value_len_in_bytes;
} opal_info_entry_t;

typedef struct {
    opal_list_t info_entries;
    // ...
} opal_info_t;

void opal_info_get(opal_info_t *info, const char *key, char *value, int value_len_in_bytes, ...)
{
    opal_info_entry_t *entry = opal_find_info_entry(info, key);
    if (NULL == entry) return;
    if (value_len_in_bytes < entry->value_len_in_bytes) {
        memcpy(value, entry->value, value_len_in_bytes);
    } else {
        memcpy(value, entry->value, entry->value_len_in_bytes);
    }
}

This way, the OPAL info implementation can store however many bytes are relevant for the info and key, and we always know exactly how long they are (including the \0, if it is present).

...I just read the changes in MPI-4.0 for info:

  1. MPI_INFO_SET defines that C key and value strings are null-terminated.
    • That clears up some of the ambiguities.
    • Meaning: if the user declares char foo[MPI_INFO_MAX_VAL], that must include the \0 character.
  2. MPI_INFO_GET was deprecated (i.e., not fixed / ambiguities not clarified).
  3. MPI_INFO_GET_VALUELEN was deprecated (i.e., not fixed / ambiguities not clarified).
  4. MPI_INFO_GET_STRING was introduced. It seems to have clear descriptions of string handling, including the \0 character.

I think that the implementation I proposed above will allow for both the MPI-3.1-style MPI_INFO_GET / MPI_INFO_GET_VALUELEN and the new MPI-4.0-style MPI_INFO_SET / MPI_GET_INFO_STRING. I.e., there can be a little extra logic in the top MPI layer for handling the MPI-3.1 and MPI-4.0 semantics (because the OPAL layer will be a dumb char-and-length storage/lookup layer).

Thoughts?

* We have found the element, so we can return the value
* Set the flag and value
* NOTE: the interface states that "'valuelen' should be one less than
* the allocated space to allow for the null terminator."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What interface are you referring to here? This is a static function in opal/util/info.c. There is no comment describing the interface to opal_info_get_nolock() as far as I can see...?

Are you referring to MPI_INFO_GET? Because if so, MPI_INFO_GET != opal_info_get_nolock().

struct opal_info_t {
opal_list_t super;
opal_mutex_t *i_lock;
opal_mutex_t *i_lock;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to worry about this PR, but in the future, it would be nice if whitespace/style/comment typo fixes could be separated out into their own commits.


if (NULL != fh) {
char char_stripe[MPI_MAX_INFO_VAL];
char char_stripe[128];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your discussion of this in the PR description, but I guess I still don't understand why this is necessary...? Hard-coding array lengths is, in general, "bad".

Going from MPI_MAX_INFO_VAL (255) to 128 saves a few bytes, but is it significant? These are stack variables, after all -- it's not like we're allocating these strings and leaving them around/needlessly consuming extra memory that we know is not going to be used.

char stdin_target[OPAL_MAX_INFO_VAL+1];
char params[OPAL_MAX_INFO_VAL+1];
char mapper[OPAL_MAX_INFO_VAL+1];
char slot_list[OPAL_MAX_INFO_VAL+1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my main question about this PR.

@lanl-ompi
Copy link
Contributor

Can one of the admins verify this patch?

@devreal
Copy link
Contributor Author

devreal commented Nov 22, 2020

Sorry for the long silence on this one, too many other things crossed my path. While I agree with @jsquyres that info handling in MPI in particular and string handling in C are a mess, I think both interfaces propose here are not a cure for our problem, but rather a band-aid with plenty of room for error. Even with the interface proposed by @jsquyres the problem that the call has to allocate the right amount of memory remains.

Taking a step back here: why do we copy the string in opal_info_get in the first place? From what I can see it is so that a) we maintain the string in a consistent state even if other threads set a new value; and b) to prevent callers from modifying the value and key stored in the info object. For OMPI-internal use a) is legitimate, while I believe that b) is less of a problem (given that we have full control over our code and code reviews should catch any nefarious behavior). At the same time, all of the OMPI-internal use cases I have seen are reading and comparing the info value, never modifying it. So instead of copying the string into a buffer that is provided by the caller, why not return an immutable info value object that is guaranteed to stay valid for the duration the caller uses them?

The way to do that is by storing both strings in ref-counted objects and making them const char[]:

struct opal_info_key_t {
    opal_object_t super;
    const size_t len; // length of the info key
    const char key[OPAL_MAX_INFO_KEY + 1];
};

struct opal_info_value_t {
    opal_object_t super;
    const size_t len; // length of the info key
    const char value[]; // value, flexible array member
};

struct opal_info_entry_t {
    opal_list_item_t super; /**< required for opal_list_t type */
    opal_info_value_t *ie_value; /**< value part of the (key, value) pair. */
    opal_info_key_t *ie_key; /**< "key" part of the (key, value)
                                     * pair */
};

Setting a new value for an existing info key simply becomes:

int opal_info_set_nolock (opal_info_t *info, const char *key, const char *value) {
    opal_info_entry_t *old_info;

    old_info = info_find_key (info, key);
    if (NULL != old_info) {
        size_t len = strlen(value);
        // allocate space for the object and the string
        opal_info_value_t *info_val  = malloc(sizeof(*info_val) + len);
        OBJ_CONSTRUCT(info_val, opal_info_value_t);
        // have to cast away const here
        strcpy((char*)&info_val->val, value);
        *(size_t*)(&info_val->len) = len;
        // release the old object and set the new one
        // info_old->ie_value will remain valid until the last user has released it
        OBJ_RELEASE(old_info->ie_value);
        old_info->ie_value = info_val;
    } else {
        // set a new value, similar to the above
    }

Querying an info value will return an opal_info_value_t object:

opal_info_value_t
opal_info_get_nolock (opal_info_t *info, const char *key)
{
    opal_info_entry_t *search;

    search = info_find_key (info, key);

    if (NULL == search){
        return NULL;
    } else {
        OBJ_RETAIN(search->ie_value);
        return search->ie_value;
    }
}

A caller may then look at the returned object and finally release it:

opal_info_value_t *info_val = opal_info_get(info, "info_key");
if (info_val != NULL) {
    if (0 == strcmp(info_val->value, "true")) {
        // do whatever you have to do
    }
    OBJ_RELEASE(info_val);
}

So: no temporary memory to be allocated (either on stack or heap), no getting the string terminator right, no string copying. Of course, the caller could cast away the const of info_val->value but we should be able to trust our own code not to do such nefarious things. If the caller requires a string that is mutable (e.g., to pass it to strtok) then they can strdup it into their own memory and work on that. For the vast majority of use cases, this is not actually necessary.

As for the upper MPI layer: with the provided opal_info_value_t, the upper layers can handle the string as they need to: the Fortran interface might handle it differently than the C interface and MPI_INFO_GET might handle it differently than MPI_INFO_GET_STRING. OPAL just doesn't have to care anymore and we eliminate all the sources of errors in the internal info handling.

I admit that we end up with slightly more objects that have to be allocated but info object modification should not be in the critical path so I don't think that is a major concern. The proposed interface might make info querying faster by avoiding copying of the string.

I'd like to get some freedback from @jsquyres and anyone else who wants to chime in before making such a sweeping change (plenty of places to touch throughout the code). I think we should get a fix in for 5.0 and I'm happy to work on it.

@jsquyres
Copy link
Member

In general, I think @devreal's proposal sounds fine: have the OPAL info stuff do some bare-bones minimum of handling, and the MPI layer can add on additional semantics necessary for MPI_Info functionality. I think that was the original design, but the two layers were a little more dependent upon each other than we intended.

A few minor points:

  1. I'm a little confused why there is a different treatment of char array lengths between keys and values. Why is one a hard-coded length and the other is not?
  2. Also, I see the use of OPAL_MAX_INFO_KEY + 1 which I think is misleading. Is it the max length or not?
  3. I have a friendly amendment proposal: what about making an opal_string_t class?
    • It is largely like what @devreal proposed for his key/value classes, but with a few minor differences.
    • There is no length / limit on the internal char array (it will likely just be a pointer).
    • The opal_string_t class will take care of all memory allocation / reallocation / freeing, analogous to C++ std::string and string entities in other languages.
    • And since the OPAL info handling will have no concept of max lengths for keys/values, the MPI layer can take care to never set a key longer than MPI_MAX_INFO_KEY or a value longer than MPI_MAX_INFO_VAL per the MPI-3.1 and MPI-4.0 definitions, as relevant.

@devreal
Copy link
Contributor Author

devreal commented Mar 12, 2021

Superseded by #8330, closing.

@devreal devreal closed this Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opal_info_get: why use memcpy without range check?

5 participants