Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix buffer protocol implementation #5407

Merged
merged 6 commits into from
Nov 5, 2024
Merged

Conversation

QuLogic
Copy link
Contributor

@QuLogic QuLogic commented Oct 11, 2024

Description

According to the buffer protocol, ndim is a required field [1], and should always be set correctly. Additionally, shape should be set if flags includes PyBUF_ND or higher [2]. The current implementation only set those fields if flags was PyBUF_STRIDES.

Note, the tests currently leak, since they don't call PyBuffer_Release, as figuring out how to make a custom deleter work there is not something I've understood. Should I just immediately convert to a SimpleNamespace or similar instead or wrapping the struct? I changed it to a SimpleNamespace.

[1] https://docs.python.org/3/c-api/buffer.html#request-independent-fields
[2] https://docs.python.org/3/c-api/buffer.html#shape-strides-suboffsets

Suggested changelog entry:

* Fix buffer protocol implementation

PS, conventional commits are only mentioned in the PR template; they should be mentioned in https://github.com/pybind/pybind11/blob/af67e87393b0f867ccffc2702885eea12de063fc/.github/CONTRIBUTING.md so one knows that before committing...

According to the buffer protocol, `ndim` is a _required_ field [1], and
should always be set correctly. Additionally, `shape` should be set if
flags includes `PyBUF_ND` or higher [2]. The current implementation only
set those fields if flags was `PyBUF_STRIDES`.

[1] https://docs.python.org/3/c-api/buffer.html#request-independent-fields
[2] https://docs.python.org/3/c-api/buffer.html#shape-strides-suboffsets
@QuLogic
Copy link
Contributor Author

QuLogic commented Oct 22, 2024

I changed the test to return a SimpleNamespace of the information instead of a pybind11-wrapped struct thing. This way, I can release the buffer and not leak stuff.

@QuLogic
Copy link
Contributor Author

QuLogic commented Oct 22, 2024

I still don't know what's wrong with PyPy and GraalPy; I can't get a proper backtrace out of PyPy, and no idea how to install GraalPy.

@msimacek
Copy link
Contributor

I'll have a look at the GraalPy failure

@msimacek
Copy link
Contributor

Ok, so failure is on test-buffers.py:112. What happens is that the struct.unpack_from call requests a buffer with PyBUF_SIMPLE, but you give it a buffer that has ndim = 2, but shape = NULL. GraalPy internally constructs a memoryview of the buffer and memoryview constructor assumes that if ndim > 1 then shape is filled.
You'd run into the same problem on CPython if you do PyObject_GetBuffer(obj, &buffer, PyBUF_SIMPLE) and then do PyMemoryView_FromBuffer(buffer). It's going to dereference shape without checking it for NULL at https://github.com/python/cpython/blob/main/Objects/memoryobject.c#L572.
In conclusion, setting ndim > 1 without PyBUF_ND flag is invalid. The doc isn't clear on that, but apparently all 3 implementations have that assumption. I think it makes sense if you think about the semantics. If the caller didn't specify they handle multidimensional buffers, you should't give them one, you should make the buffer one-dimensional. So I think you should keep setting ndim = 1 unless PyBUF_ND flag is set. memoryview itself does the same, multidimensional memoryview would return one-dimensional buffer from PyObject_GetBuffer(mv, &buffer, PyBUF_SIMPLE) or error our if it's not C contiguous.

@cryos
Copy link
Contributor

cryos commented Oct 31, 2024

I have taken a look at this, and read through the docs too. Whilst it clearly states that ndim must always be set and valid it would seem that the value is incorrectly >1 in certain situations in several implementations. Reading around I also don't see the value in supporting a value of ndim != 1 unless it is PyBUF_ND or higher - what does that mean practically?

It looks like restoring the setting of ndim to 1 by default and moving the setting otherwise to the PyBUF_ND would yield the correct behavior even if it is a little at odds with the documentation. Otherwise it would be good to see examples of PyBUF_SIMPLE with a value other than ndim = 1.

Copy link
Contributor

@cryos cryos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there is a good example of ndim > 1 for PyBUF_SIMPLE, which I can't see or find then I think continuing to default to 1 is reasonable given what current implementations do.

@@ -614,10 +614,11 @@ extern "C" inline int pybind11_getbuffer(PyObject *obj, Py_buffer *view, int fla
if ((flags & PyBUF_FORMAT) == PyBUF_FORMAT) {
view->format = const_cast<char *>(info->format.c_str());
}
if ((flags & PyBUF_ND) == PyBUF_ND) {
view->shape = info->shape.data();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think moving view->ndim = (int) info->ndim; to this block and restoring the default of view->ndim = 1; in the above block would make this work with all implementations, I agree that this disagrees with what is stated in the documentation linked to.

info = m.get_py_buffer(mat, m.PyBUF_SIMPLE)
assert info.itemsize == ctypes.sizeof(ctypes.c_float)
assert info.len == mat.rows() * mat.cols() * info.itemsize
assert info.ndim == 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change this to

assert info.ndim == 1  # See PR #5407

so that we can link back to this conversation.

view->internal = info;
view->buf = info->ptr;
view->ndim = (int) info->ndim;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the default with a link to this PR would be ideal so that it is clear why we chose to do this.

view->ndim = 1; // See PR #5407

@QuLogic
Copy link
Contributor Author

QuLogic commented Nov 1, 2024

@msimacek Thanks for the investigation. Reading through your explanation, and the docs again, I think I understand the confusion better now.

From the docs, we have:

The following fields are not influenced by flags and must always be filled in with the correct values: obj, buf, len, itemsize, ndim.

which seems to suggest the values should match the buffer as it exists. So I interpreted the flags to mean "I only need to know this information", and it seems like (other than ndim) that's how pybind11_getbuffer is implemented.

But then looking back at the previous section:

Buffers are usually obtained by sending a buffer request to an exporting object via PyObject_GetBuffer(). Since the complexity of the logical structure of the memory can vary drastically, the consumer uses the flags argument to specify the exact buffer type it can handle.

which indicates that flags is more of a collaboration, i.e., the caller is saying "please give me this type of buffer, or less, but no more complex than that."

So in that sense, the "request-independent fields" should always be correct, but that term is not specified. I read that to be correct "with regards to the underlying buffer", but it is actually correct "as the buffer is exposed", which are not necessarily the same things. For example, a 2D (M, N)-shaped buffer can be exported as an (M*N) 1D buffer if it's contiguous. But then the "request-independent" in the section title is a bit of a misnomer, as ndim does depend on the flags.

What I think that means though is that there is a different underlying bug, because pybind11_getbuffer does not verify that the buffer is contiguous if flags are < PyBUF_STRIDES, but simply exposes the buffer as-is.

@cryos
Copy link
Contributor

cryos commented Nov 1, 2024

So in that sense, the "request-independent fields" should always be correct, but that term is not specified. I read that to be correct "with regards to the underlying buffer", but it is actually correct "as the buffer is exposed", which are not necessarily the same things. For example, a 2D (M, N)-shaped buffer can be exported as an (M*N) 1D buffer if it's contiguous. But then the "request-independent" in the section title is a bit of a misnomer, as ndim does depend on the flags.

If shape, strides and suboffsets are all null how would the shape be represented in such a scenario? Doesn't PyBUF_SIMPLE imply contiguous with no field to represent shape in this case?

@rwgk
Copy link
Collaborator

rwgk commented Nov 1, 2024

But then the "request-independent" in the section title is a bit of a misnomer, as ndim does depend on the flags.

Documentation is usually the least reliable part of a software distribution. It was written by humans that probably don't want to spend their time writing documentation and may have omissions and inconsistencies. I'm looking at documentation as useful to get the big picture, but easily misleading when it comes to details. If that happens, it's important to not take the documentation too literally, and instead focus on what works best for the ecosystem.

In this particular case: PyBUF_SIMPLE goes with ndim = 1. I wouldn't look any further than that. Unless someone comes back here later and presents an actual problem.

For now, let's just fix the obvious.

@QuLogic
Copy link
Contributor Author

QuLogic commented Nov 2, 2024

If shape, strides and suboffsets are all null how would the shape be represented in such a scenario?

It won't be, but that's my point: pybind11 should raise an error if the memory block doesn't fit the requirements of the caller, and it doesn't do so right now.

Doesn't PyBUF_SIMPLE imply contiguous with no field to represent shape in this case?

It implies contiguity of the memory block, but not necessarily the shape of the buffer itself. If the buffer can be reduced to a contiguous 1D buffer, then it is allowed.

Here's an example from NumPy using struct.unpack_from (which @msimacek noted above uses PyBUF_SIMPLE):

>>> a = np.full((3, 5), 30.0)  # Allocate 2D buffer at once.
>>> a.flags.c_contiguous
True
>>> a.size
15
>>> a.shape
(3, 5)
>>> a.strides
(40, 8)

so we have a 2D buffer with a C-contiguous block of memory. In this case, struct.unpack_from works:

>>> struct.unpack_from(f'{a.size}d', a)
(30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0)

If we then take a slice and make the block discontiguous:

>>> b = a[::2, ::2]
>>> b.size
6
>>> b.shape
(2, 3)
>>> b.strides
(80, 16)
>>> b.flags.c_contiguous
False
>>> b.flags.f_contiguous
False
>>> b.flags.contiguous
False

then struct.unpack_from no longer works:

>>> struct.unpack_from(f'{b.size}d', b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: ndarray is not C-contiguous

Unless someone comes back here later and presents an actual problem.

We can replicate something like the above in pybind11:

class Block {
public:
    Block(py::size_t height, py::size_t width) : m_height(height), m_width(width) {
        // Allocate a buffer that is 10 times bigger in each direction.
        m_data.resize(height * 10 * width * 10);
        // Just something to see what's found in memory.
        auto count = 0;
        for (auto &&val : m_data) {
            val = count;
            count++;
        }
    }

    double *get_buffer() {
        return m_data.data();
    }

    py::size_t height() const { return m_height; }
    py::size_t width() const { return m_width; }

private:
    py::size_t m_height;
    py::size_t m_width;
    std::vector<double> m_data;
};

PYBIND11_MODULE(test, m, py::mod_gil_not_used())
{
    py::class_<Block>(m, "Block", py::is_final(), py::buffer_protocol())
        .def(py::init<py::size_t, py::size_t>())
        .def_buffer([](Block &self) -> py::buffer_info {
            std::vector<py::size_t> shape { self.height(), self.width() };
            std::vector<py::size_t> strides { self.width()*10 * 10*sizeof(double), 10*sizeof(double) };
            return py::buffer_info(self.get_buffer(), shape, strides);
        });
}

If we ask NumPy, which obeys strides:

>>> b = Block(3, 5)
>>> np.array(b)
array([[   0.,   10.,   20.,   30.,   40.],
       [ 500.,  510.,  520.,  530.,  540.],
       [1000., 1010., 1020., 1030., 1040.]])

If we ask struct.unpack_from (which asks for a simple buffer):

>>> struct.unpack_from('15d', b)
(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0)

These values are completely wrong, and pybind11 should raise like NumPy.

@rwgk
Copy link
Collaborator

rwgk commented Nov 2, 2024 via email

@QuLogic
Copy link
Contributor Author

QuLogic commented Nov 2, 2024

I applied the changes separately, but the last commit basically flips the implementation around. Instead of starting minimal and adding more information based on the flags, it starts with all information and pares it back to match the request from the caller. If the underlying buffer cannot be exposed in a simplified form, then an error is raised.

I added tests for C/Fortran/any contiguities, and that the strides mapped into NumPy correctly.

If a contiguous buffer is requested, and the underlying buffer isn't,
then that should raise. This matches NumPy behaviour if you do something
like:
```
struct.unpack_from('5d', np.arange(20.0)[::4])  # Raises for contiguity
```

Also, if a buffer is contiguous, then it can masquerade as a
less-complex buffer, either by dropping strides, or even pretending to
be 1D. This matches NumPy behaviour if you do something like:
```
a = np.full((3, 5), 30.0)
struct.unpack_from('15d', a)  # --> Produces 1D tuple from 2D buffer.
```
@QuLogic
Copy link
Contributor Author

QuLogic commented Nov 2, 2024

To compare with NumPy, you can swap:

  • Matrix(5, 4) with np.empty((5, 4), dtype=np.float32)
  • FortranMatrix(5, 4) with np.empty((5, 4), dtype=np.float32, order='F')
  • DiscontiguousMatrix(5, 4, 2, 3) with np.empty((5*2, 4*3), dtype=np.float32)[::2, ::3]
    Also replace mat.rows() with 5 and mat.cols() with 4 (maybe I should do that so we're always comparing with 'known' values instead of internals?)

Then running the tests show two differences:

  • for PyBUF_SIMPLE, NumPy returns ndim=0, which I guess is truer in spirit, and we could do here too.
  • for contiguity mismatches, it raises ValueError instead of BufferError, which is maybe a bug on their part, but close enough to the same.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going in an awesome direction!

Some quick-ish drive-by comments/requests. I hope @cryos will be able to look more closely when he's back.

view->internal = info;
view->buf = info->ptr;
view->itemsize = info->itemsize;
view->len = view->itemsize;
for (auto s : info->shape) {
view->len *= s;
}
view->ndim = (int) info->ndim;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please change this to static_cast<int>(info->ndim)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to ask if we should prefer C++ static_cast over C-style casts.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although we're not set up to catch C-style casts automatically.

A battle for another day. In the meantime I'm simply changing C-style casts to C++ casts in code that's touched in some way.

if ((flags & PyBUF_C_CONTIGUOUS) == PyBUF_C_CONTIGUOUS) {
if (PyBuffer_IsContiguous(view, 'C') == 0) {
std::memset(view, 0, sizeof(Py_buffer));
delete info;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a follow-on PR: We should use std::unique_ptr to make sure there aren't leaks (e.g. when there are C++ exceptions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will that work when put into view->internal?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QuLogic Is there a chance that you could try this in a follow-on PR, now that this one is merged?

return -1;
}

// If no strides are requested, the buffer must be C-contiguous.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this comment to the top of the else if block two lines down?


view->strides = nullptr;

// Since this is a contiguous buffer, it can also pretend to be 1D.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure, maybe better to follow numpy (IIUC?) and make this 0?



def test_to_pybuffer():
mat = m.Matrix(5, 4)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To compare with NumPy, you can swap:

  • Matrix(5, 4) with np.empty((5, 4), dtype=np.float32)
  • FortranMatrix(5, 4) with np.empty((5, 4), dtype=np.float32, order='F')
  • DiscontiguousMatrix(5, 4, 2, 3) with np.empty((5*2, 4*3), dtype=np.float32)[::2, ::3]
    Also replace mat.rows() with 5 and mat.cols() with 4 (maybe I should do that so we're always comparing with 'known' values instead of internals?)

Then running the tests show two differences:

  • for PyBUF_SIMPLE, NumPy returns ndim=0, which I guess is truer in spirit, and we could do here too.
  • for contiguity mismatches, it raises ValueError instead of BufferError, which is maybe a bug on their part, but close enough to the same.

Could you please try that? Roughly:

  • Split the added tests into multiple smaller ones.
  • Use @pytest.mark.parametrize to loop over the np.empty() equivalents above.

Motivation behind my suggestion:

  • We want to ensure that we're compatible with numpy, and that it stays that way in the future.

(I think you can use pytest.importorskip("numpy"), although we only want to skip the tests that depend on numpy (not all tests in this file). Not sure exactly how to best make that work.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think you can use pytest.importorskip("numpy"), although we only want to skip the tests that depend on numpy (not all tests in this file). Not sure exactly how to best make that work.)

No need; it's already at the top of the file.

@cryos
Copy link
Contributor

cryos commented Nov 4, 2024

This looks great, and I really appreciate you finding examples @QuLogic that illustrate why the simpler solution is not appropriate. I agree with suggestions from @rwgk, this is a great enhancement and it will really help with corner cases I had not initially considered.

Copy link
Contributor

@cryos cryos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me, thank you for addressing the review comments. I think this is ready to merge.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look in great detail, high-level this is definitely awesome. I'll merge this now. Thanks a lot for the comprehensive overhaul and the extensive testing! Thanks so much @cryos for the careful review!

@rwgk rwgk merged commit bc041de into pybind:master Nov 5, 2024
80 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Nov 5, 2024
@QuLogic QuLogic deleted the fix-getbuffer branch November 5, 2024 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs changelog Possibly needs a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants