-
Notifications
You must be signed in to change notification settings - Fork 667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AOT] Memory Info Restore Mechanism with Better Performance #4113
base: main
Are you sure you want to change the base?
[AOT] Memory Info Restore Mechanism with Better Performance #4113
Conversation
Thank you for the keen observation and the intriguing analysis of the root cause. IIUC, the rationale behind storing // in aot_check_memory_overflow()
/* Get memory base address and memory data size */
if (func_ctx->mem_space_unchanged
#if WASM_ENABLE_SHARED_MEMORY != 0
|| is_shared_memory
#endif
) {
mem_base_addr = func_ctx->mem_info[0].mem_base_addr; // This branch should be used in the majority of cases.
}
else {
if (!(mem_base_addr = LLVMBuildLoad2(
comp_ctx->builder, OPQ_PTR_TYPE,
func_ctx->mem_info[0].mem_base_addr, "mem_base"))) {
aot_set_last_error("llvm build load failed.");
goto fail;
}
} Using the address of the memory base address does not allow optimization passes to recognize the pattern and decide to eliminate superfluous load instructions. However, I concur that the conditions for setting // in create_memory_info
bool mem_space_unchanged = true; // (!func->has_op_memory_grow && !func->has_op_func_call) || (!module->possible_memory_grow); Therefore, I believe the concept of "reloading the base address when the memory might change" is excellent. If you're in agreement with my perspective, we can begin refactoring the PR by concentrating on "reloading the base address when the memory might change" and eliminating "keeping the address of the memory base address." |
@lum1n0us Completely agree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're in agreement with my perspective, we can begin refactoring the PR by concentrating on "reloading the base address when the memory might change" and eliminating "keeping the address of the memory base address."
I agree. Let me know what I can do to help with the refactoring. |
I believe the approach should be:
|
I found that my program runs slower in WAMR AOT mode compared to other WASM runtimes, e.g. WAVM. I compared their LLVM IRs and found that WAMR emits more load operations of memory base.
In WAMR, functions with
mem_space_unchanged
keep memory base address inmem_base_addr
ofAOTMemInfo
, while others keep the address of memory base address in that field. When emit instructions like load/store, the former use the base address directly, while the later should load base address from its address at first. This reload is redundant when there is no possibility of changing memory between two consecutive load/store instructions:Optimization passes won’t recognize this redundancy because the reloaded memory base is accessed within the context.
In WAVM, the base address is reloaded when the memory possibly changes, e.g. after calling another function or after
memory.grow
. This can be redundant if there are no subsequent load/store instructions, but the dead code elimination pass handles this:Performance
Here is a sample C++ program
substr.cc
:Compiled with emcc (version:
3.1.59 (0e4c5994eb5b8defd38367a416d0703fd506ad81)
)Then ran wamrc and iwasm(linux) and compared the performance:
product-mini/platforms/posix/main.c:
result:
commit e3dcf4f
commit 3f268e5
IR(optimized) comparison: