Description
I first ran https://github.com/microsoft/DeepSpeed-MII/blob/main/mii/legacy/examples/local/text-generation-bloom560m-example.py
and then ran the following code:
import mii
generator = mii.mii_query_handle("bloom560m_deployment")
result = generator.query({"query": ["DeepSpeed is", "Seattle is"]}, do_sample=True, max_new_tokens=30)
print(result)
But I got the following error:
[2025-01-13 09:23:10,377] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-01-13 09:23:12,916] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter hf_auth_token is deprecated. Parameter will be removed. Please use the pipeline_kwargs
field to pass kwargs to the HuggingFace pipeline creation.
[2025-01-13 09:23:12,916] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter trust_remote_code is deprecated. Parameter will be removed. Please use the pipeline_kwargs
field to pass kwargs to the HuggingFace pipeline creation.
query_kwargs {
key: "max_new_tokens"
value {
ivalue: 30
}
}
query_kwargs {
key: "do_sample"
value {
bvalue: true
}
}
method GeneratorReply
responseis <grpc.aio.EOF>
Traceback (most recent call last):
File "query.py", line 3, in <module>
result = generator.query({"query": ["DeepSpeed is"]}, do_sample=True, max_new_tokens=30)
File "/usr/local/lib/python3.8/dist-packages/mii/legacy/client.py", line 80, in query
return self.asyncio_loop.run_until_complete(
File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/usr/local/lib/python3.8/dist-packages/mii/legacy/client.py", line 75, in _request_async_response
proto_response = await getattr(self.stub, task_methods.method)(proto_request)
File "/usr/local/lib/python3.8/dist-packages/grpc/aio/_call.py", line 328, in __await__
raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Exception calling application: <AioRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Exception calling application: not enough values to unpack (expected 2, got 0)"
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Exception calling application: not enough values to unpack (expected 2, got 0)", grpc_status:2, created_time:"2025-01-13T09:23:13.311084448+08:00"}"
"
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2025-01-13T09:23:13.31192943+08:00", grpc_status:2, grpc_message:"Exception calling application: <AioRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = "Exception calling application: not enough values to unpack (expected 2, got 0)"\n\tdebug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Exception calling application: not enough values to unpack (expected 2, got 0)", grpc_status:2, created_time:"2025-01-13T09:23:13.311084448+08:00"}"\n>"}"
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1736731393.313577 3176 fork_posix.cc:75] Other threads are currently calling into gRPC, skipping fork() handlers
I0000 00:00:1736731393.324543 3176 fork_posix.cc:75] Other threads are currently calling into gRPC, skipping fork() handlers
I0000 00:00:1736731393.337308 3176 fork_posix.cc:75] Other threads are currently calling into gRPC, skipping fork() handlers
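For reference, the innermost message in the trace ("not enough values to unpack (expected 2, got 0)") is the standard Python ValueError raised when an empty sequence is unpacked into two names. The sketch below reproduces only that message in isolation. It is not MII code, the helper name unpack_pair_message is hypothetical, and the log above does not show where on the server side the unpacking actually happens.

```python
def unpack_pair_message(values):
    """Try to unpack `values` into two names.

    Returns the error text if unpacking fails, or None on success.
    Hypothetical helper for illustration only; not part of MII.
    """
    try:
        first, second = values  # needs exactly two items
    except ValueError as exc:
        return str(exc)
    return None


# An empty sequence gives the same message seen in the gRPC error details.
print(unpack_pair_message([]))
```

This suggests that some server-side call returned an empty result where a pair was expected, but confirming that would need the server log or traceback.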
Terminating the deployment with the following code also fails to run:
import mii
mii.terminate("bloom560m_deployment")