Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined behavior in Vec::operator[] #446

Open
cary-ilm opened this issue Oct 10, 2024 · 7 comments
Open

Undefined behavior in Vec::operator[] #446

cary-ilm opened this issue Oct 10, 2024 · 7 comments

Comments

@cary-ilm
Copy link
Member

The Vec2, Vec3, and Vec4 operator[](int) relies on undefined behavior, since indexing outside of the &x isn't legal.

template <class T>
IMATH_CONSTEXPR14 IMATH_HOSTDEVICE inline T&
Vec2<T>::operator[] (int i) IMATH_NOEXCEPT
{
    return (&x)[i]; 
}

See AcademySoftwareFoundation/OpenShadingLanguage#1865 for a real-world failure case and investigation of the cause, involving the Intel oneAPI compiler.

The C++ specification doesn't guarantee that the x and y fields are adjacent without padding.

One alternative might be to add #pragma packed or __attribute__(packed) to the class declaration to ensure there is no padding, but padding may not be the issue, since the problem may involve compiler optimizations with temporary objects.

Replacing the &x with reinterpret_cast<T*> may be slightly preferable but is still non-standard and may still not resolve the issue.

Is the only truly valid way of implementing the indexing operation to use a switch statement? That would incur a performance penalty. But this is not a method that should be called in performance-critical situations anyway, much better to use the explicit member references. If operator[] is doomed to be a less-than-optimal method, would it be acceptable to make it even slightly slower but guaranteed correct?

Suggestions?

@meshula
Copy link
Contributor

meshula commented Oct 14, 2024

Pragmas and attributes don't protect against issues like optimizations or undefined behavior in certain compilers. Pragmas or attributes may not help with alignment-related performance on some architectures either.

A union would ensure contiguity and fix UB related to packing with no performance penalty, but doesn't address OOB.

union {
    struct { T x, y; };
    T data[2];
};

@lgritz
Copy link
Contributor

lgritz commented Oct 14, 2024

@meshula That also breaks modern type punning rules, but so far it has worked on every compiler.

@lgritz
Copy link
Contributor

lgritz commented Oct 14, 2024

The union approach is the one I tend to use (caveats notwithstanding).

@AlexMWells has a favourite idiom for dealing with subscripted access that is truly C++ compliant and generates great code, but is complicated and a bit daunting to a naive reader to read and make sense of.

@meshula
Copy link
Contributor

meshula commented Oct 14, 2024

The reason I suggested the union is that it doesn't introduce branching for basic access.

Is subscripting with no array access an option for Imath 4? Since EXRCore is C, it doesn't use Imath at all, so it's within the realm of possibility to simply leave a problematic pattern behind.

@AlexMWells
Copy link

AlexMWells commented Oct 15, 2024

About the union approach. I don't think the solution to C++ undefined behavior is to rely on another C++ undefined behavior even if it "appears" to work for most compilers.
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c183-dont-use-a-union-for-type-punning

A nice summary of situation here:
https://tttapa.github.io/Pages/Programming/Cpp/Practices/type-punning.html

I personally have had to track down issues with union approaches where mixing of .x, .y, .z and [i] access via a union caused undefined behavior.

@meshula
Copy link
Contributor

meshula commented Oct 16, 2024

Yeah, so I come back to proposing we omit the bracket operators completely. That would have been impossible before because OpenEXR relied on the indexing, but OpenEXRCore doesn't use the Imath definitions at all.

@AlexMWells
Copy link

AlexMWells commented Oct 16, 2024

I've done some prototyping to explore implementing dynamic index array subscript support and created a godbolt project ontop of Imath that has a wrapper class Vec3Subscript where
enum class Subscript {
UndefinedBehavior,
None,
Selection,
Dynamic,
TempArray,
CopyToArray
};
Vec3_Subscript intent is to explore different array subscript options. It is not being proposed this templated subscript behavior be added to underlying Vec3. Instead we can use this to easily explore different implementations options

https://godbolt.org/z/qEPo4a1c8

When vectorizing kernels, the Subscript::Selection generates vastly better code as Scalar Replacement Of Aggregates (SROA) succeeds allow the local variable to be kept in registers (which matters for GPU's).

There also is some examples in there on statically unrolling loops to implement generic algorithms that directly access .x, .y. ,z data members vs compile time known std::integral_constant<int, 0>, std::integral_constant<int, 1>, std::integral_constant<int, 2>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants