Closed
Description
In '/vllm/model_executor/weight_utils.py', the function 'hf_model_weights_iterator' collects every '.bin' file in the checkpoint folder with 'hf_bin_files = glob.glob(os.path.join(hf_folder, "*.bin"))'. Some checkpoints store 'training_args.bin' alongside 'pytorch_model.bin', so both files are picked up and treated as weight files, which causes an error when starting 'api_server.py'. Excluding 'training_args.bin' from the list makes the loader more robust.
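
A minimal sketch of the proposed filtering, assuming the exclusion is done by file name right after the glob. The helper 'list_weight_bin_files' and the set 'NON_WEIGHT_BIN_FILES' are hypothetical names used for illustration; they are not part of vLLM, and the actual fix may be structured differently.

```python
import glob
import os
from typing import List

# Hypothetical constant: .bin files that are known not to contain model weights.
NON_WEIGHT_BIN_FILES = {"training_args.bin"}


def list_weight_bin_files(hf_folder: str) -> List[str]:
    """Return the .bin files in hf_folder, skipping known non-weight files."""
    # Same glob as in the description: matches every .bin file in the folder.
    hf_bin_files = glob.glob(os.path.join(hf_folder, "*.bin"))
    # Drop files such as training_args.bin before they are loaded as weights.
    return sorted(
        path
        for path in hf_bin_files
        if os.path.basename(path) not in NON_WEIGHT_BIN_FILES
    )
```

Filtering by base name keeps sharded weight files (any other '*.bin' in the folder) untouched and only skips the names listed in the set, so further non-weight files could be added there if they turn up.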