Description
Unified memory isn't natively supported in PyTorch and was considered a potential blocker for the custom-ops refactor.
We found a workaround at the time, along with a simple viability proof.
However, it's not clear how this fits together with the currently open PR #1544 and RFC #1545; this needs to be fleshed out.
Questions:
- Are the needed changes to the code base deeply rooted, or relatively superficial?
- Is it impactful to work on this right now, or should we first focus on finalizing the non-optimizer-related custom_ops?
- Is it perhaps straightforward to implement this already while prototyping? If so, we could open a new PR or make it part of the currently open PR.