You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
if [ -f "nohup.out" ]; then
lsof_output=$(lsof nohup.out)
if [ -z "$lsof_output" ]; then
# 如果没有其他进程正在使用nohup.out,重命名文件
mv nohup.out "nohup_${current_time}.out"
else
echo "nohup.out is currently in use by another process. Exiting."
exit 1
fi
fi
xinference 部署方式如下
#!/bin/bash
获取当前日期和时间,格式为YYYY-MM-DD_HH-MM-SS
current_time=$(date +"%Y-%m-%d_%H-%M-%S")
检查当前目录下是否存在 nohup.out 文件,并确保没有其他进程正在使用它
if [ -f "nohup.out" ]; then
lsof_output=$(lsof nohup.out)
if [ -z "$lsof_output" ]; then
# 如果没有其他进程正在使用nohup.out,重命名文件
mv nohup.out "nohup_${current_time}.out"
else
echo "nohup.out is currently in use by another process. Exiting."
exit 1
fi
fi
设置环境变量
export XINFERENCE_HOME=/data/xinference_home
export XINFERENCE_MODEL_SRC=modelscope
启动服务
nohup xinference-local --host 0.0.0.0 --port 9997 >nohup.out 2>&1 &
提示用户服务已启动
echo "Service has been started and running in the background. Logs are being written to nohup.out."
以此方式部署qwen2.5
当采用以下python脚本进行并发测试的时候
当并发量大于16之后 速度明显下降 16之前并发之后的token生成速率约为25左右 但是到17之后并发量为2tokens/s
设备采用的是8卡3090服务器
之前的版本设置多副本之后并发量也是成倍数增长的 但是最新版之后并发性能明显下降。通过实时监控设备GPU使用情况,发现GPU使用率在高并发时明显上不去。
The text was updated successfully, but these errors were encountered: