Branch to reproduce: `yifei/mlp_benching_new`. Command: `numactl -C 0-55 -m 0 python3 ./tools/main.py --driver=mlp --batch_size=128 --hidden_size_list=512x1024 --has_bias=1024 --act_type=relu --dtype=bf16 -p`