This repository has been archived by the owner on Apr 28, 2023. It is now read-only.
[WIP][DO NOT MERGE] Experimental vector types #513
This is a WIP experiment; please do not review.
I am looking for feedback on how best to propagate vector types through Halide, following up on the discussion in #511 and #512.
The first 2 commits in the stack are irrelevant.
In a first experiment, I'm interested in using a type annotation in TC to express that a particular type is a vector with proper alignment (i.e. that it can be loaded exactly into an x86 vector register and that I can define operators on the type thanks to intrinsics). In that experiment, using a TC vector type in the language serves to blackbox it in the TC mapper and to guarantee that low-level SIMD code is generated.
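To make the intent concrete, here is a minimal sketch of what such an aligned vector type with intrinsic-backed operators could look like on the C++ side. It is not code from this PR: the names float4, is_aligned16 and TC_SKETCH_HAVE_SSE are hypothetical, and I assume a 4-lane float vector mapped to a 128-bit SSE register, with a scalar fallback so the snippet also builds on non-x86 targets.

```cpp
#include <cstdint>

// Use SSE when the compiler advertises it; otherwise fall back to scalar code.
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define TC_SKETCH_HAVE_SSE 1
#else
#define TC_SKETCH_HAVE_SSE 0
#endif

// Hypothetical 4-lane float vector. alignas(16) guarantees that every
// instance can be loaded with an aligned 128-bit load.
struct alignas(16) float4 {
  float v[4];
};

static_assert(sizeof(float4) == 16, "float4 must fill exactly one 128-bit register");
static_assert(alignof(float4) == 16, "float4 must be 16-byte aligned");

// Lane-wise addition, implemented with intrinsics when available.
inline float4 operator+(const float4& a, const float4& b) {
  float4 r;
#if TC_SKETCH_HAVE_SSE
  _mm_store_ps(r.v, _mm_add_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
#else
  for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
#endif
  return r;
}

// Lane-wise multiplication, same pattern as operator+.
inline float4 operator*(const float4& a, const float4& b) {
  float4 r;
#if TC_SKETCH_HAVE_SSE
  _mm_store_ps(r.v, _mm_mul_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
#else
  for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
#endif
  return r;
}

// Helper to check the alignment guarantee at run time.
inline bool is_aligned16(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With a type like this, an expression such as a + b * c on float4 values lowers to packed SSE instructions without relying on loop vectorization, which is the "blackboxed" behavior I am after.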
The fact that Halide makes the design decision to make vectorization a property of the loop is orthogonal to my experiment and is not something I plan to inherit in TC before experimenting. Of course I'd like to convey the information through Halide in a proper way, if possible.
So what would you recommend in this case?
Using Halide's vector type seemed quite natural, and it seems to get the job done (i.e. test_compile_and_run.cc actually produces code that runs without crashing).
In light of your comments on #512, do I understand correctly that lanes are only meant to be used internally within Halide?