This repository was archived by the owner on Apr 28, 2023. It is now read-only.
Repromote from shared to private memory #554
Open
Description
If a tensor reference group is promoted to shared memory at some scope, it may be worthwhile to promote it further, to registers, at some deeper scope. There are two possibilities:
- promote to registers instead of promoting to shared (freeing the shared memory for other uses or for increased occupancy);
- promote to registers from shared, hiding global access latency and/or having more coalescing when copying from global to shared.
#161 and #217 attempted this behavior: first by demoting from shared memory, then by promoting from shared to private memory. Demotion from shared was mostly harmful, principally because promotion to registers happened too deep in the schedule and was rarely beneficial by itself. The effect may be different with tunable promotion depth, so we can start by putting this behavior behind a flag.
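To make the proposal concrete, here is a minimal sketch of how the two possibilities and the controlling flag could be modeled. All names (`Repromotion`, `PromotionOptions`, `spaceAtDepth`, `usesSharedMemory`) and the default depths are hypothetical and are not part of the existing code base. The sketch also assumes `privateDepth` is deeper than `sharedDepth`.

```cpp
#include <cstddef>

// Hypothetical names throughout; none of these exist in the code base.
enum class MemorySpace { Global, Shared, Private };

// The proposed flag, modeled as a three-way choice.
enum class Repromotion {
  None,             // keep the group in shared memory at every deeper scope
  InsteadOfShared,  // first possibility: global -> registers, no shared copy
  FromShared        // second possibility: global -> shared -> registers
};

struct PromotionOptions {
  std::size_t sharedDepth = 1;   // depth at which the group is promoted to shared
  std::size_t privateDepth = 3;  // tunable depth at which registers take over
  Repromotion repromotion = Repromotion::None;
};

// Memory space the group is accessed from at schedule depth `depth`.
MemorySpace spaceAtDepth(const PromotionOptions& o, std::size_t depth) {
  const bool repromote =
      o.repromotion != Repromotion::None && depth >= o.privateDepth;
  if (repromote) {
    return MemorySpace::Private;
  }
  if (o.repromotion == Repromotion::InsteadOfShared) {
    // The shared copy is never created, so shallower scopes read global memory.
    return MemorySpace::Global;
  }
  return depth >= o.sharedDepth ? MemorySpace::Shared : MemorySpace::Global;
}

// Whether a shared-memory buffer is still allocated for the group.
bool usesSharedMemory(const PromotionOptions& o) {
  return o.repromotion != Repromotion::InsteadOfShared;
}
```

With `Repromotion::None` the current behavior is preserved, which is what makes a flag a safe starting point. `FromShared` keeps the shared buffer and adds a register copy below `privateDepth`, while `InsteadOfShared` frees the shared buffer entirely.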