feat(aten): implement as_strided, resize_, _reshape_alias; refactor empty ops #5

Merged: izzalDev merged 1 commit into main from feat/minimal on May 21, 2026

@izzalDev

Owner

Summary

This PR implements three previously stubbed ATen native operations for the OpenCL backend: as_strided, resize_, and _reshape_alias. It also refactors the existing empty_memory_format and empty_strided implementations to delegate to the upstream at::detail allocation helpers, which removes the bespoke make_opencl_tensor path.

Changes

  • Add as_strided, resize_, and _reshape_alias to Minimal.cpp / Minimal.h, and register all three in TORCH_LIBRARY_IMPL.
  • Refactor empty_memory_format to use at::detail::empty_generic and empty_strided to use at::detail::empty_strided_generic; remove the internal make_opencl_tensor helper.
  • Remove the CPUFallback include from OpenCLMinimal.cpp; replace the previously commented-out stub declarations in the header with real signatures.
  • Tighten TORCH_CHECK messages across all ops (pin memory, non-strided layout, peer-to-peer copy).
  • Reformat all touched files to the 80-column limit introduced in .clang-format.
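
For readers less familiar with the view semantics behind as_strided and _reshape_alias, here is a minimal pure-Python model of how a (sizes, strides, storage offset) triple maps logical indices onto existing storage without copying. It is an illustration only, not the backend's C++ code: the helper names `contiguous_strides` and `strided_view` are hypothetical, and a plain list stands in for the device buffer.

```python
from itertools import product


def contiguous_strides(sizes):
    """Row-major strides (in elements) for a dense tensor of `sizes`."""
    strides = []
    running = 1
    for dim in reversed(sizes):
        strides.append(running)
        # Treat zero-size dims as 1 so the remaining strides stay non-zero.
        running *= max(dim, 1)
    return tuple(reversed(strides))


def strided_view(storage, sizes, strides, storage_offset=0):
    """Read the elements a (sizes, strides, offset) view would expose.

    Returns a flat list in row-major order of the view's logical indices.
    """
    out = []
    for index in product(*(range(n) for n in sizes)):
        position = storage_offset + sum(i * s for i, s in zip(index, strides))
        out.append(storage[position])
    return out


storage = list(range(12))  # stands in for a device buffer of 12 elements

# A contiguous 3x4 layout uses strides (4, 1).
assert contiguous_strides((3, 4)) == (4, 1)

# Transposed 4x3 view of the same storage: swap the strides, copy nothing.
transposed = strided_view(storage, (4, 3), (1, 4))

# Overlapping windows of length 3, one element apart, starting at offset 5.
windows = strided_view(storage, (2, 3), (1, 1), storage_offset=5)
```

In this model a reshape alias is the same kind of view: new sizes and strides over unchanged storage, with the source's storage offset kept as-is.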

Tests

  • test_allocator.py: add zero-size allocation, multi-dtype, pin_memory error, and unsupported layout tests.
  • test_minimal.py: add TestAsStrided, TestResize, and TestReshapeAlias test classes; extend TestCopyFrom with dtype mismatch, numel mismatch, and non-contiguous source cases.
  • test_device.py: add lifecycle, is_in_bad_fork, and exchange_device binding tests.

Notes

  • resize_ performs a memcpy-based reallocation when the new size exceeds the current storage; shrinking is metadata-only.
  • Negative strides and negative storage offsets are explicitly rejected in as_strided; this is a current OpenCL backend limitation, not a general ATen constraint.
  • The _copy_from peer-to-peer error message was reformatted to fit the 80-column limit; no functional change.
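
The first two notes can be summarised as a small model. This sketch is an assumption about the policy described above, not a copy of the C++ implementation: `Tensor`, `check_as_strided_args`, and `resize_` are hypothetical stand-ins, a bytearray plays the role of the device buffer, and each element is taken to be one byte wide.

```python
from dataclasses import dataclass
from math import prod


def check_as_strided_args(sizes, strides, storage_offset=0):
    """Reject arguments the notes say this backend does not support."""
    if len(sizes) != len(strides):
        raise ValueError("sizes and strides must have the same length")
    if storage_offset < 0:
        raise ValueError("negative storage offsets are not supported")
    if any(s < 0 for s in strides):
        raise ValueError("negative strides are not supported")


@dataclass
class Tensor:
    """Toy tensor: a byte buffer plus size metadata (1 byte per element)."""
    storage: bytearray
    sizes: tuple


def resize_(tensor, new_sizes):
    """Grow by reallocating and copying; shrink by editing metadata only."""
    needed = prod(new_sizes)  # prod(()) == 1, so a scalar needs one element
    if needed > len(tensor.storage):
        # New allocation. Zero-filled in this model; a real backend would
        # typically leave the newly added elements uninitialized.
        grown = bytearray(needed)
        grown[:len(tensor.storage)] = tensor.storage  # the "memcpy" step
        tensor.storage = grown
    tensor.sizes = tuple(new_sizes)
    return tensor
```

Shrinking a four-element tensor to `(2,)` leaves `tensor.storage` as the very same buffer object, while a later `resize_(tensor, (2, 3))` swaps in a six-byte buffer whose first four bytes match the old contents.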

Checklist

  • Code compiles cleanly against the target libtorch / OpenCL stack
  • All existing tests pass
  • New tests added for each new operation
  • Formatted with clang-format (80-column limit)
  • Docs / changelog updated (if applicable)

…mpty ops

- Add as_strided, resize_, and _reshape_alias native implementations and
  register them in TORCH_LIBRARY_IMPL
- Refactor empty_memory_format and empty_strided to use
  at::detail::empty_generic / empty_strided_generic instead of the
  custom make_opencl_tensor helper
- Remove CPUFallback include; drop commented-out stub declarations from
  header and replace with real signatures
- Tighten error messages (pin_memory, layout, peer-to-peer copy)
- Reformat to 80-column limit throughout
- Expand test coverage: zero-size alloc, dtype variants, pin_memory /
  layout error paths, as_strided (basic, negative stride/offset,
  size-stride mismatch), resize_ (shrink, scalar), _reshape_alias, and
  additional _copy_from error cases
@izzalDev merged commit 88d9720 into main on May 21, 2026
6 checks passed
@izzalDev deleted the feat/minimal branch on May 21, 2026 21:42