Closed
Description
The DLRM bot workload fails when run with the following command:
LD_PRELOAD=/opt/miniforge3/lib/libiomp5.so:/home/yifei/ipex_env/gperftools-2.7.90/.libs/libtcmalloc.so numactl -C 0-55 -m 0 python3 ./tools/main.py --driver=mlp --batch_size=128 --hidden_size_list=13x512x256x128 --has_bias=512x256x128 --act_type=relu --dtype=bf16
It aborts with the following error message:
python3: ../lib/gc/ExecutionEngine/CPURuntime/MemoryPool.cpp:215: void {anonymous}::FILOMemoryPool::dealloc(void*): Assertion `current->allocated > chunk->size' failed.
bench_mlp.sh: line 9: 1393883 Aborted (core dumped) LD_PRELOAD=/opt/miniforge3/lib/libiomp5.so:/home/yifei/ipex_env/gperftools-2.7.90/.libs/libtcmalloc.so numactl -C 0-55 -m 0 python3 ./tools/main.py --driver=mlp --batch_size=128 --hidden_size_list=13x512x256x128 --has_bias=512x256x128 --act_type=relu --dtype=bf16 -p
Further analysis is needed. The current observation is that running any single layer on its own does not trigger the assertion; so far it has only been seen when the full multi-layer workload is run.
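Since single layers pass and the full chain fails, one way to narrow this down is to try every run of consecutive layers and find the shortest one that still hits the assertion. The helper below is only a sketch for generating those candidate size lists. It assumes `--hidden_size_list` is an `x`-separated list of layer widths in which each adjacent pair is one layer; `sub_chains` is a hypothetical name and is not part of `tools/main.py`. Per-layer flags such as `--has_bias` would have to be trimmed to match each candidate, which is not shown here.

```python
def sub_chains(hidden_size_list: str, num_layers: int) -> list[str]:
    """Return every run of `num_layers` consecutive layers as an 'x'-joined size list.

    Assumption: adjacent sizes in `hidden_size_list` form one layer, so
    "13x512x256x128" describes three layers (13->512, 512->256, 256->128).
    """
    sizes = hidden_size_list.split("x")
    total_layers = len(sizes) - 1
    if not 1 <= num_layers <= total_layers:
        raise ValueError(f"num_layers must be between 1 and {total_layers}")
    # A run of n layers spans n + 1 consecutive sizes.
    return [
        "x".join(sizes[start:start + num_layers + 1])
        for start in range(total_layers - num_layers + 1)
    ]


# Candidate values for --hidden_size_list, shortest chains first.
FULL_CHAIN = "13x512x256x128"
candidates = {n: sub_chains(FULL_CHAIN, n) for n in range(1, 4)}
```

Here `candidates[1]` holds the single-layer cases already reported as passing, and `candidates[2]` holds the two-layer chains, which are the next ones worth running under the same `LD_PRELOAD` and `numactl` settings.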