hsa_enqueue_kernel should be called from the HSA LocaleModel rather than directly from the compiler's generated code.
Generally speaking, the compiler modifications should focus on marking the functions that are to be compiled into GPU kernels and on generating that code. Launching these kernels should be primarily under the control of LocaleModel.chpl, with help from the C runtime code. (suggested by MF)
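
A rough sketch of the proposed layering, written in C only for illustration. In the real design the middle layer would live in LocaleModel.chpl. The signature of hsa_enqueue_kernel shown here is an assumption (the actual runtime signature is not given in this note), and locale_model_launch, generated_call_site and the kernel name "chpl_kernel_0" are hypothetical names introduced for the example.

```c
#include <stddef.h>
#include <string.h>

/* Bookkeeping so the stub below can be observed; illustration only. */
static int  enqueue_calls = 0;
static char last_kernel[64] = "";

/* Stand-in for the C runtime's hsa_enqueue_kernel.
 * ASSUMPTION: the real signature is not stated in this note; this stub
 * only records the request instead of dispatching to a GPU. */
static int hsa_enqueue_kernel(const char *kernel_name, int work_items) {
    enqueue_calls++;
    strncpy(last_kernel, kernel_name, sizeof last_kernel - 1);
    last_kernel[sizeof last_kernel - 1] = '\0';
    (void)work_items;
    return 0; /* 0 means success in this sketch */
}

/* HYPOTHETICAL locale-model entry point. In the proposal this logic
 * would sit in LocaleModel.chpl; it owns the decision of how and
 * whether a kernel is launched. */
static int locale_model_launch(const char *kernel_name, int work_items) {
    if (kernel_name == NULL || work_items <= 0) {
        return -1; /* reject bad requests before touching the runtime */
    }
    return hsa_enqueue_kernel(kernel_name, work_items);
}

/* What compiler-generated code would look like under the proposal:
 * it names the kernel and calls the locale model, and never calls
 * hsa_enqueue_kernel itself. */
static int generated_call_site(void) {
    return locale_model_launch("chpl_kernel_0", 256);
}
```

With this split, the generated code depends only on the locale-model entry point, so the launch policy can change without touching the compiler.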