[Bug]: error helper for TypeError: _extractNVMLErrorsAsClasses.<locals>.gen_new.<locals>.new() takes 1 positional argument but 2 were given #12906

@youkaichao

Your current environment

None

🐛 Describe the bug

If you encounter this error, this issue tracks the problem.

The root cause is a bug in https://pypi.org/project/nvidia-ml-py : its dynamically created exception classes can be serialized (pickled) but cannot be deserialized.

A minimal reproducible example:

# pip install -U nvidia-ml-py
import pickle
import pynvml

error_data = None
try:
    pynvml.nvmlInit()
    # device index 1000 does not exist, so this call raises an NVML error
    pynvml.nvmlDeviceGetHandleByIndex(1000)
except Exception as e:
    # serializing the exception succeeds
    error_data = pickle.dumps(e)

# deserializing fails here:
# TypeError: _extractNVMLErrorsAsClasses.<locals>.gen_new.<locals>.new() takes 1 positional argument but 2 were given
print(pickle.loads(error_data))
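To illustrate why unpickling fails, here is a minimal sketch that runs without a GPU or pynvml. `DemoError`, `register`, `gen_new_broken`, `gen_new_fixed` and `rebuild` are hypothetical stand-ins written for this illustration, not pynvml code, and the class layout is only assumed to resemble what pynvml does. The sketch relies on the standard behavior that an exception is rebuilt by calling its class with the saved `args`, so a generated `__new__` that accepts only `typ` breaks as soon as `args` is non-empty.

```python
class DemoError(Exception):
    """Simplified, hypothetical stand-in for pynvml.NVMLError."""
    _val_class_mapping = {}

    def __new__(typ, value):
        # The base class dispatches to a per-error-code subclass.
        if typ is DemoError:
            typ = DemoError._val_class_mapping.get(value, typ)
        obj = Exception.__new__(typ)
        obj.value = value
        return obj


def gen_new_broken(val):
    def new(typ):  # accepts no extra arguments
        return DemoError.__new__(typ, val)
    return new


def gen_new_fixed(val):
    def new(typ, *args):  # tolerates the arguments replayed on unpickling
        return DemoError.__new__(typ, val)
    return new


def register(name, val, gen_new):
    """Create a subclass dynamically and map the error code to it."""
    cls = type(name, (DemoError,), {"__new__": gen_new(val)})
    DemoError._val_class_mapping[val] = cls
    return cls


Broken = register("Broken", 2, gen_new_broken)
Fixed = register("Fixed", 3, gen_new_fixed)


def rebuild(exc):
    """Replay the constructor call that pickle.loads would make."""
    cls, args = exc.__reduce__()[:2]
    return cls(*args)


exc = DemoError(2)           # an instance of Broken with args == (2,)
try:
    rebuild(exc)             # calls Broken(2), which reaches new(typ, 2)
except TypeError as err:
    print("broken variant:", err)

print("fixed variant:", type(rebuild(DemoError(3))).__name__)
```

The sketch calls `__reduce__` directly instead of `pickle.loads` so that it does not depend on the dynamically created classes being importable by name; the failing constructor call is the same one that unpickling performs.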

The failure to deserialize the exception becomes a bigger problem when combined with Ray, because Ray tries to deserialize the error in the driver process:

def f():
    import pynvml
    pynvml.nvmlInit()
    pynvml.nvmlDeviceGetHandleByIndex(1000)

# called directly, we get a clear error message:
# NVMLError_InvalidArgument: Invalid Argument
try:
    f()
except Exception as e:
    print(f"{type(e).__name__}: {e}")

import ray
ray.init()
# called through ray, we do not get a clear error message.
# the error will be:
# RuntimeError: Failed to unpickle serialized exception
# TypeError: _extractNVMLErrorsAsClasses.<locals>.gen_new.<locals>.new() takes 1 positional argument but 2 were given
ray.get(ray.remote(f).remote())

This is a bug in the pynvml library. The fix is to change the `new` function generated inside `_extractNVMLErrorsAsClasses`:

        def gen_new(val):
            def new(typ, *args):  # <-- change here: add `, *args` so it accepts the arguments passed during unpickling
                # the extra args are ignored; the error code `val` comes from the closure
                obj = NVMLError.__new__(typ, val)
                return obj
            return new

After that change, the real error is visible, usually NVMLError_InvalidArgument.
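If patching the installed pynvml is not an option, one possible workaround is to convert exceptions that do not survive a pickle round trip into a plain `RuntimeError` before they leave the worker. The sketch below is an assumption-laden illustration, not part of pynvml, Ray, or vLLM: `ensure_picklable` and `picklable_errors` are hypothetical helper names introduced here.

```python
import functools
import pickle


def ensure_picklable(exc):
    """Return exc if it survives a pickle round trip, else a plain RuntimeError."""
    try:
        pickle.loads(pickle.dumps(exc))
        return exc
    except Exception:
        # Keep the original class name and message in the replacement error.
        return RuntimeError(f"{type(exc).__name__}: {exc}")


def picklable_errors(func):
    """Decorator (hypothetical helper): re-raise non-picklable exceptions as RuntimeError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            safe = ensure_picklable(exc)
            if safe is exc:
                raise
            raise safe from exc
    return wrapper
```

Wrapping the function before handing it to Ray, for example `ray.remote(picklable_errors(f))`, should then let the driver receive a readable message such as `NVMLError_InvalidArgument: Invalid Argument`, assuming the wrapped function is otherwise serializable. The exception type is lost in the conversion, so this trades precision for a readable error.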

Labels: bug, stale