Skip to content

Remove Element classes!? #1458

@kohr-h

Description

@kohr-h

This is a somewhat more radical extension of #1457 to remove complexity in ODL. I'd love to have some feedback on it.


With a bit of distance to the weeds of the code, I've started thinking a bit about decisions we made (implicitly or explicitly) early on, and how well they served us so far. All in all I think we made good choices in many places, and the usefulness, flexibility and expressive power of ODL are indications of that.

At the same time, however, there are a couple of places where our solutions are more complex than they have to be. Structures like Discretization were simply mapped directly from mathematics, and sometimes there wasn't so much consideration put into the question "do we need this abstraction?" Rather, it was easy to do at the time and corresponded nicely to the math. I did that a lot.

Some of these rather innocent decisions have come back to hit us really hard. To me, the single most important case of this was the decision to introduce ...Element classes that wrap data containers. I can speak from the experience of implementing __array_ufunc__ for the Tensor class that it was a huge pain to write and test the code for all corner cases, just to make sure that Tensor is almost indistinguishable from NumPy arrays (but still different enough to be annoying quite regularly, see below).
Another think I've grown increasingly annoyed with is the necessity to use .asarray() all the time, just to be sure that every array functionality works as expected and we're not subject by arbitrary limitations of Tensor, either because it's ambiguous (some cases with indexing), or just not yet implemented.
And finally, all the tedious work with the CuPy backend, just to wrap it into a Tensor class and make it behave nicely, seems kind of pointless work when the array itself already works as expected.

So the big question is: Why don't we work with arrays directly?

Why don't we scrap all those element classes and use arrays directly? That would probably remove 1/4 of the whole code while touching almost no functionality. What exactly would we lose? I can't think of anything more important than these:

  • x.inner(y) would have to become space.inner(x, y), same with norm and dist.
  • We couldn't ask for x.space. So what? In the vast majority we have an operator that we can ask.
  • Anything else??

On the other hand, I think the benefits would be massive:

  • Given the right abstractions in TensorSpace and the likes, support for new array implementations right away. In the worst case, a wrapper module with name aliases would be needed, to provide a common interface (similar to the get_array_module function in CuPy)
  • A huge reduction in complexity by removing all the wrapping code that needs to be maintained and probably has additional bugs in it.
  • No more asarray casting, no more worrying "is this supported by the Tensor class?"
  • No more decision required regarding "should this method support array-like input or not?" For instance, operator calls do, but inner and similar methods don't, which is confusing.
  • One thing less to worry about for users who know NumPy already, but not ODL.

Of course, it would be quite a big change. But in my opinion, we'd benefit from it big time.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions