Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate shareRaw #1229

Merged
merged 12 commits into from
Oct 25, 2022
Merged

Deprecate shareRaw #1229

merged 12 commits into from
Oct 25, 2022

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Mar 14, 2022

Idea: Instead of rc.storeChunk(shareRaw(ptr), …), have users write rc.storeChunkRaw(ptr, …) which is a bit more transparent in what is happening. Partially resolves #1216 by introducing a clearer API that distinguishes between owned and non-owned resources.

TODO

  • Documentation
  • The diff on examples, tests and Python bindings shows the proposed API changes. Discuss if they are ok like this. (Especially the changes in determineDatatype)
  • Maybe revert some usage changes, so the CI can test if shareRaw continues working
  • Introduce an internal version of shareRaw, so every creation of a non-owning shared pointer goes through the same function. Ideally, in the end a shared pointer is either owning or has been created by that function.

@franzpoeschel franzpoeschel added the api: new additions to the API label Mar 14, 2022
@franzpoeschel franzpoeschel force-pushed the deprecate-shareRaw branch 4 times, most recently from 2c5ba0a to 19d71ca Compare March 22, 2022 10:50
Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and as discussed, let's proceed.

Only thing I would add is another overload for contiguous containers to determineDatatype for symmetry with storeChunk.

@ax3l
Copy link
Member

ax3l commented Apr 15, 2022

@franzpoeschel pls resolve the merge conflict :)

@franzpoeschel
Copy link
Contributor Author

@franzpoeschel pls resolve the merge conflict :)

Done

@franzpoeschel franzpoeschel force-pushed the deprecate-shareRaw branch 2 times, most recently from 37d966e to d5547d0 Compare August 17, 2022 09:05
Comment on lines 285 to 294
template <typename T>
inline constexpr Datatype determineDatatypeRaw(T const *)
{
return determineDatatype<T>();
}

template <typename T_ContiguousContainer>
inline constexpr Datatype determineDatatypeContiguous(T_ContiguousContainer &&)
{
using T_ContiguousContainer_stripped =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For determineDatatype, I tend to just use one API function name with overloads for pointers and cont. containers.
There is no lifetime question here and additional naming is not really needed, adding unnecessary API calls to learn.

What do you think?

Copy link
Contributor Author

@franzpoeschel franzpoeschel Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I did it like this is a slight inconsistency that we would get:

determineDatatype<std::vector<float>>(); // evaluates to Datatype::VECTOR_FLOAT
determineDatatype(std::vector<float>()); // evaluates to Datatype::FLOAT, interpreting the vector as a contiguous container

As long as we don't add the following overload, this would technically work, but still be extremely confusing imo:

template <typename T>
inline constexpr Datatype determineDatatype(T)
{
     // ignoring reference decaying for this example now
    return determineDatatype<T>();
}

I see the appeal of not introducing new API calls left and right, but we should be aware of the possible source for confusion.

Copy link
Contributor Author

@franzpoeschel franzpoeschel Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the PR again, and we could theoretically do the entire thing only with new overloads, instead of new API calls (even replace storeChunkRaw with storeChunk). So this is entirely a design decision:

  • storeChunkRaw(): I like making it explicit that we are dealing with non-owned, raw memory
  • determineDatatypeContiguous(): Avoids confusion, as described above
  • determineDatatypeRaw(): Would then be consistent with the change above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slide tendency to overloads, simply because it's C++ pattern (vs. C).

Copy link
Member

@ax3l ax3l Oct 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this case, we could make it consistently evaluate to Datatype::VECTOR_FLOAT:

determineDatatype<std::vector<float>>();
determineDatatype(std::vector<float>());

and ask the user to call via ::value_type or .data() to get the fundamental type aka Datatype::FLOAT.

determineDatatype<std::vector<float>::value_type>();
determineDatatype(std::vector<float>().data());

I think I overdid the original detection support here for STL :)

If this changes existing behavior, then let us note it in NEWS.rst pls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would maybe take your suggestion, but make it a little bit stricter:

If you want a direct translation between C++ types and our openPMD::Datatype enum, then use determineDatatype<T>() without argument (exactly as it is before this PR).

We provide overloads for pointer types, i.e. determineDatatype<T>(std::shared_ptr<T>) (already exists) and determineDatatype<T>(T const *) (new).

If a user passes a std::vector or a std::array, we don't even start guessing what is meant, but throw a detailed compile-time error:

Error: Passed a contiguous container type to determineDatatype<>().                                 [10/1843]
These types are not directly supported due to colliding semantics.                                           
Assuming a vector object `std::vector<float> vec;`,                                                          
use one of the following alternatives:                                                                       
                                                                                                             
1) If what you want is a direct openPMD::Datatype equivalent of the container                                
   type, use:                                                                                                
   `determineDatatype<decltype(vec)>()`                                                                      
   OR                                                                                                        
   `determineDatatype<std::vector<float>>()`.                                                                
   The result will be `Datatype::VECTOR_FLOAT`.                                                              
2) If what you want is the openPMD::Datatype equivalent of the *contained type*,
   use the raw pointer overload by:
   `determineDatatype(vec.data())`
   This is the variant that you likely wish to use if intending to write data
   from the vector via `storeChunk()`, e.g. `storeChunk(vec, {0}, {10})`.

Corresponding implementation:

template <typename T_ContiguousContainer>
inline constexpr Datatype determineDatatype(T_ContiguousContainer &&)
{
    using T_ContiguousContainer_stripped =
        std::remove_reference_t<T_ContiguousContainer>;
    if constexpr (auxiliary::IsContiguousContainer_v<
                      T_ContiguousContainer_stripped>)
    {
        static_assert(auxiliary::dependent_false_v<T_ContiguousContainer>, R"(
Error: Passed a contiguous container type to determineDatatype<>().
These types are not directly supported due to colliding semantics.
Assuming a vector object `std::vector<float> vec;`,
use one of the following alternatives:

1) If what you want is a direct openPMD::Datatype equivalent of the container
   type, use:
   `determineDatatype<decltype(vec)>()`
   OR
   `determineDatatype<std::vector<float>>()`.
   The result will be `Datatype::VECTOR_FLOAT`.
2) If what you want is the openPMD::Datatype equivalent of the *contained type*,
   use the raw pointer overload by:
   `determineDatatype(vec.data())`
   This is the variant that you likely wish to use if intending to write data
   from the vector via `storeChunk()`, e.g. `storeChunk(vec, {0}, {10})`.
        )");
    }
    else
    {
        static_assert(auxiliary::dependent_false_v<T_ContiguousContainer>, R"(
Unknown datatype passed to determineDatatype<>().
For a direct translation from C++ type to the openPMD::Datatype enum, use:
`auto determineDatatype<T>() -> Datatype`.

For detecting the contained datatpye of a pointer type (shared or raw pointer),
use either of the following overloads:
`auto determineDatatype<T>(std::shared_ptr<T>) -> Datatype` or
`auto determineDatatype<T>(T const *) -> Datatype`.
        )")
    }
}

This would entirely avoid any API inconsistencies, without introducing unexpected behavior where we need to guess what the user wanted.

* extent in the record component will be selected.
*/
template <typename T>
void loadChunk(std::shared_ptr<T> data, Offset offset, Extent extent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a std::shared_ptr<T[]>, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I forgot that in #1296. Should I open a separate PR or just add it into this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, I updated it now

include/openPMD/RecordComponent.hpp Outdated Show resolved Hide resolved
@ax3l ax3l merged commit a8436ab into openPMD:dev Oct 25, 2022
eschnett added a commit to eschnett/openPMD-api that referenced this pull request Nov 11, 2022
* dev: (70 commits)
  Docs: Recommend Static Build for Superbuilds (openPMD#1325)
  Python 3.11 (openPMD#1323)
  pybind11: v2.10.1+ (openPMD#1322)
  Add Attribute::getOptional<T>() and use to add some more dynamic datatype conversions at read time (openPMD#1278)
  Mapping between ADIOS steps and openPMD iterations (openPMD#949)
  Deprecate shareRaw (openPMD#1229)
  Fix append mode double attributes (openPMD#1302)
  Constant scalars: Don't flush double (openPMD#1315)
  Remove caching cmake vars (openPMD#1313)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1311)
  storeChunk: Add an overload for shared_ptr<T[]> (openPMD#1296)
  Fix `operationAsString` Export (openPMD#1309)
  ADIOS2: more fine-grained control for file endings (openPMD#1218)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1307)
  Fix file existence check in parallel tests (openPMD#1303)
  ADIOS2: Flush to disk within a step (openPMD#1207)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1304)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1295)
  Update catch2 to v2.13.9 (openPMD#1299)
  [pre-commit.ci] pre-commit autoupdate (openPMD#1292)
  ...

# Conflicts:
#	.github/workflows/linux.yml
@ax3l ax3l mentioned this pull request Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: new additions to the API documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ADIOS2 BP5 optimization: Allow users to "give up" their data into openPMD
2 participants