Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array reinterpret operations #3366

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

mshinwell
Copy link
Collaborator

@mshinwell mshinwell commented Dec 12, 2024

This is a different approach from the one I took previously to adding the "array reinterpret" operations, e.g. given an int64# array, get the two values corresponding to the type #(float# * float#) from byte offset 8 (which is unaligned with respect to the product type as a whole). I think this way is easier: the Lambda primitives now mirror what happens in the middle end, namely they take both the array kind and the array ref/set kind. (This means that all this information basically just passes through Lambda_to_flambda, which in fact could now become simpler, as some conversion functions could now be deleted were it not for the float array optimization.)

  • The array kind determines indexing
  • The array ref/set kind determines what is actually read or written.

We should be able to expose reinterpret primitives in Translprim which map to these existing primitives, just with the kinds appropriately set, I hope.

@mshinwell mshinwell marked this pull request as draft December 12, 2024 09:57
@mshinwell mshinwell force-pushed the array-reinterpret2 branch 3 times, most recently from fa8e3b4 to 435c54d Compare December 12, 2024 14:22
@mshinwell mshinwell force-pushed the array-reinterpret2 branch 3 times, most recently from aa191ec to 56115ec Compare January 14, 2025 10:17
@mshinwell mshinwell marked this pull request as ready for review January 15, 2025 11:30
@mshinwell mshinwell requested a review from ccasin January 15, 2025 11:30
@ccasin ccasin force-pushed the array-reinterpret2 branch from 70abdeb to b4cbe79 Compare February 19, 2025 11:22
@ccasin ccasin force-pushed the array-reinterpret2 branch 3 times, most recently from 402e2ed to 75e64ac Compare February 28, 2025 00:46
@ccasin ccasin requested review from TheNumbat and rtjoa February 28, 2025 00:47
@ccasin
Copy link
Contributor

ccasin commented Feb 28, 2025

This PR is now ready for review.

  • @rtjoa: Could you review the typing and lambda bits (and corresponding tests)?
  • @TheNumbat: Could you review everything from the middle-end onwards? Please also check if you think the tests of dynamic behavior are sufficient.

@@ -0,0 +1,183 @@

(* One of the tests. *)
type t =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs tests with mixed unboxed products and reinterpreting non-value data as values/immediates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to have these but I think these cases are intentionally not supported by the current implementation and we plan to ship without them, as they aren't needed by my initial use case

Is that right @mshinwell ?

If it actually is supported to have a mixture of unboxed things and immediates in an array I will definitely write tests for that because it would simplify one aspect of the use case

Copy link
Contributor

@TheNumbat TheNumbat Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc unboxed products currently have to be all-scannable or all-unfriendly, but it didn't look like this pr prohibited loading e.g. #(int * int) from an int64# array? (Also the tests only create tuples where the fields are all the same layout; in any case it should test like #(int64 * float64))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding "unboxed things and immediates in an array": I think that should be supported? See ignorable_product_element_kind in lambda.mli.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding reinterpreting non-value data as values: I think we should prohibit this, and I have added checks to try to do so.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding reinterpreting non-value data as immediates: this is allowed, and the bottom bit is set now upon reading, as for the existing (non-array) reinterpret primitives.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like here we maybe just need tests for "a mixture of unboxed things and immediates in an array"?

Copy link
Contributor

@rtjoa rtjoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to typing, lambda, and the testsuite look good to me.

Thinking aloud: there are a few mechanisms that interact with each other in Translprim:

  • The glb_array_* functions use the "cheat" that treats Pgenarray* as a super kind of all other array kinds.
  • lookup_primitive defaults the array kind for array reinterpret operations to Pgenarray, then specialize_primitive takes the GLB of this kind and the array_kind_of_elt ~elt_sort:None of the relevant part of its type.
  • array_kind_of_elt, when elt_sort is None, defaults to calling classify with the sort of its parameter ty
  • classify, when called with sort value and a type not found in the environment, returns Any, causing array_kind_of_elt to return Pgenarray.

Squinting at the above makes me wonder if we can accidentally reinterpret-get something as a Pgenarray. I think we can't, but wanted to say this all aloud for those more familiar with this part of the compiler.

  • I assume the existing code, which relies on array_kind_of_elt in a similar way, is correct 🙂.
  • When I make classify assert false in the "not found" case, all the tests still pass.

@mshinwell mshinwell force-pushed the array-reinterpret2 branch from 75e64ac to d1f5847 Compare March 7, 2025 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants