Skip to content

pybind11 ignores exceptions raised by len(arg), calls std::vector::reserve() with bad value #2076

Closed
@nmusolino

Description

@nmusolino

Issue description

When converting a sequence-like argument to an std::vector<T>, pybind11 ignores exceptions raised by len(arg) and erroneously calls std::vector<T>::reserve() with a very large value. The result is a confusing exception of the following form:

Traceback (most recent call last):
  File "./pybind11_repro.py", line 11, in <module>
    my_module.f(a)
ValueError: vector::reserve

Reproducible example code

C++ code:

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include <vector>

namespace py = pybind11;
using namespace py::literals;

namespace my_module {

void f(const std::vector<int>&) {}

}  // end namespace my_module

PYBIND11_MODULE(my_module, m){
    m.def("f", &my_module::f);
};

This C++ code was compiled with:

g++-8 -Wall -shared -std=c++17 -fPIC -o my_module$(python3-config --extension-suffix) $(python3-config --ldflags) pybind11_repro.cpp -I.<pybind11 include dir> -I/usr/local/include/python3.8  

Python code:

import numpy
import my_module

my_module.f([1, 2])  # Okay

a = numpy.array(2.0)
my_module.f(a)       # raises: ValueError: vector::reserve

I used a zero-dimensional numpy array to demonstrate this problem, but it actually occurs with any object that is sequence-like (has __getitem__) and raises from __len__:

class BadLen(): 
    def __getitem__(self, i): 
        return i 
           
    def __len__(self): 
        raise Exception() 

my_module.f(BadLen())    # raises: ValueError: vector::reserve

Result and expected result

When running the Python code above, a C++ exception is thrown by std::vector<int>::reserve(). pybind11 propagates this to the Python caller, but this is confusing and does not indicate why the operation failed.

Traceback (most recent call last):
  File "./pybind11_repro.py", line 11, in <module>
    my_module.f(a)
ValueError: vector::reserve

This exception occurred because len(a) in the example above raises a Python exception (see Diagnosis section below):

>>> len(a)
Traceback (most recent call last):
  File "pybind11_repro.py", line 9, in <module>
    len(a)
TypeError: len() of unsized object

I think that pybind11 should propagate this original Python exception to the caller, instead of triggering an unrelated exception at a later point in time.

Diagnosis

This occurs through the following steps:

  1. pybind11::sequence::size() calls PySequence_Size(m_ptr), which returns a signed Py_ssize_t value. The pybind11 member function unconditionally casts this value to size_t. If PySequence_Size returns -1 (indicating failure), pybind11::sequence::size() will return SIZE_MAX, a very large positive number.
class sequence : public object {
public:
    PYBIND11_OBJECT_DEFAULT(sequence, object, PySequence_Check)
    size_t size() const { return (size_t) PySequence_Size(m_ptr); }
    /*  ...   */
};
  1. The return value of size() is passed to vector<T>::reserve(). The reserve() member function can throw std::length_error or std::bad_alloc if the requested size is too large.
template <typename Type, typename Value> struct list_caster {
    /*  ...   */
private:
    template <typename T = Type,
              enable_if_t<std::is_same<decltype(std::declval<T>().reserve(0)), void>::value, int> = 0>
    void reserve_maybe(sequence s, Type *) { value.reserve(s.size()); }
    /*  ...   */
};

I think this problem could be fixed by checking the return value of PySequence_Size, and raising the active CPython exception if that call has failed, as indicated by its return value.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions