Skip to content

[Jellyfin] Rework buffer into C++ class #120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

dmitrylyzo
Copy link
Contributor

@dmitrylyzo dmitrylyzo commented Jan 29, 2022

Cherry-picked from: jellyfin@6d3a5c7
Cherry-picked from (part of): jellyfin@0cc2bf8

This is the only thing I can extract with such strict requirements in your repo. 😐
* and my own principles of porting the results of teamwork from the fork.

@dmitrylyzo dmitrylyzo marked this pull request as draft January 29, 2022 18:38
@dmitrylyzo dmitrylyzo force-pushed the refactor-buffer branch 2 times, most recently from 4fa40c1 to 825c98a Compare January 29, 2022 18:42
@dmitrylyzo dmitrylyzo marked this pull request as ready for review January 29, 2022 19:11
Copy link
Member

@TheOneric TheOneric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[dimtrylyzo:] This is the only thing I can extract with such strict requirements in your repo.

There are also fixes for seemingly pre-existing issues like 026b744 (in the linked example, at first glance it appears like it only needs a polished commit message).

Comment on lines 260 to 269
float* buf = (float*)buffer_resize(&m_blend, sizeof(float) * width * height * 4, 0);
float* buf = (float*)m_blend.take(sizeof(float) * width * height * 4, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing issue (so new commit to fix), but this line has potentially a big issue:
Afaict there's no prior checks for this and the multiplications for the new_size parameter can overflow the bounds of int, which is in itself already undefined behaviour, but if your lucky only truncates. Worse though, the buffer is later accessed by values recalculated from width and height, so if the multiplication here overflowed (and the overflow itself doesn't blow up), an out-of-bound write happens either blowing up or corrupting some other memory.
Is ReusableBuffer ever used for something else than an float-RGBA canvas alter down the line in jellyfin-patches? If not it's probably best to pass width, height and member_size as separate size_t-type values to the resize function and do all the overflow checking inside the function.

Copy link
Contributor Author

@dmitrylyzo dmitrylyzo Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if your lucky only truncates.

No, we will definitely get an overflow in the loop.

Is ReusableBuffer ever used for something else than an float-RGBA canvas alter down the line in jellyfin-patches?

ReusableBuffer is only used by blend renderer as a frame buffer.

If not it's probably best to pass width, height and member_size as separate size_t-type values to the resize function and do all the overflow checking inside the function.

I like that idea, but maybe do it in a future refactor?
member_size becomes unnecessary if I replace void with float.

At the same time, having something like std::vector would help for memory management in the future.

UPD:
As a quick fix, we could add the maximum buffer size, say 8K_resolution * sizeof(float) * 4, and return NULL if new_size is greater than this limit.

UPD2:
I chose the more preferable (your) way.
#124

Comment on lines +355 to +369
#ifdef __EMSCRIPTEN__
#include "./SubOctpInterface.cpp"
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, unrelated to the buffer refactor.
Second: Are there some tests or some other use for compiling this without emscripten? If not, I see no use in adding the guard in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows you to compile the file alone, iirc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but is there even any useful standalone program or tests you can build with this file? For some reason SubtitleOctopus.cpp has a main function, but all it does is to immediately exit, which is not remotely useful.
If there is no use for compiling it without emscripten, then the guard is not necessary and even misleading as it suggest there is some use in compiling it without emscripten.
If, on the other hand, there are some hidden tests which use this standalone we should integrate them into our CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just use g++ SubtitleOctopus.cpp to see if it compiles without having to run the full build process.

This has the same purpose:

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#else
// make IDE happy
#define emscripten_get_now() 0.0
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make -C src would also do, but if it’s already partly guarded anyway .. oh well. At least mention in the commit-message that this is only to quickly check for compiler warnings/errors and there’s no use to the compiled program or otheriwse.

@dmitrylyzo
Copy link
Contributor Author

dmitrylyzo commented Jan 31, 2022

There are also fixes for seemingly pre-existing issues like 026b744 (in the linked example, at first glance it appears like it only needs a polished commit message).

Yes, that error could theoretically happen even without dropAnimation. I don't think I can reproduce it (without dropAnimation), but there is no harm in those checks.

#123

@dmitrylyzo dmitrylyzo force-pushed the refactor-buffer branch 2 times, most recently from de8888f to eac55b6 Compare February 6, 2022 18:39
@dmitrylyzo
Copy link
Contributor Author

dmitrylyzo commented Feb 6, 2022

The commits fix the error message printing and Allow to compile CPP-file alone (for testing only) can be extracted from the branch if you want to do a merge commit.
I can also split the m_ prepending if you want a clearer diff.

@TheOneric
Copy link
Member

  • […] (for testing only) is still ambigous as it could refer to unit tests and imho the explanation is better placed into the description rather than the title.
  • I see no benefit in prepending m_
  • I would have placed intsize_t in the overflow fix branch and/or a more general Buffer overhaul (see below), but it's fine here too
  • all other messages (also errros) are all logged to stdout
  • rather than returning a internal pointer in ensure_resize and memsetting this, creating a new function to memset the buffer or memsetting the buffer when keep_content=false. (see below) For just refactoring the buffer struct into a class (the scope of this PR) it can be left as is for now

Atm keep_content=false doesn't make much sense; if it's set to true but the old data is not intened to be kept, the buffer will be filled with known leftover garbage. If it’s set to false a fresh buffer is allocated which is filled with unknown garbage. Renaming it to zero_mem or so and letting it do a memset or removing it an introducing a dedicated function to zero the buffer seems more sensible.
The only reason I can think of for the framesize being re-ensured every frame (with a request counter), is that it’s a workaround for continuos resizes firing multiple requests in rapid succession. Ideally the resize should only happen on worker init and canvas resize (with any continous-resize-workarounds contained in the JS-side)

For just refactoring the struct into a class, hte (initial) scope of this PR, I think just the refactoring itself, with a renaming of resize to ensure_size for clarity is fine, but keeping also the “standalone” (with a better commit message) and optionally the “size_t” commits as a collateral is ok.
Eg (if without “size_t”) like this: compile alone + refactor to class

@dmitrylyzo
Copy link
Contributor Author

dmitrylyzo commented Feb 6, 2022

  • I see no benefit in prepending m_

As here:

buffer_t m_blend;
RenderBlendResult m_blendResult;

Or it could be an underscore at the end.
Originally I wanted to add the size method (will conflict with the size data member). But now it may be redundant if the size is saved by the caller of ensure_size.

  • I would have placed intsize_t in the overflow fix branch and/or a more general Buffer overhaul (see below), but it's fine here too

I am currently driven by this idea. In this case the overflow protection goes after the refactor.
UPD: But looking at smart blend, maybe a simple check before ensure_size will be enough (couldn't use the FrameBuffer class there).

  • all other messages (also errros) are all logged to stdout

Do you want them also output to stderr?
#125

  • rather than returning a internal pointer in ensure_resize and memsetting this, creating a new function to memset the buffer or memsetting the buffer when keep_content=false.

It may not be necessary to zero the memory (even with keep_content=false) because you will manually set the actual values. If we are talking about a general-purpose buffer.

Atm keep_content=false doesn't make much sense; if it's set to true but the old data is not intened to be kept, the buffer will be filled with known leftover garbage. If it’s set to false a fresh buffer is allocated which is filled with unknown garbage. Renaming it to zero_mem or so and letting it do a memset or removing it an introducing a dedicated function to zero the buffer seems more sensible.

The value of keep_content depends on the usage of the buffer:

  • it is false if you are not interested in old data.
  • it is true if you are working with the buffer as an array. But this only makes sense for the appending.

"zero-mem" is a sub-option of the first. ensure_size_zero? I think external memsetting provides more flexibility.

The only reason I can think of for the framesize being re-ensured every frame (with a request counter), is that it’s a workaround for continuos resizes firing multiple requests in rapid succession. Ideally the resize should only happen on worker init and canvas resize (with any continous-resize-workarounds contained in the JS-side)

Originally maybe, but now smart blend uses 9 such buffers: https://github.com/jellyfin/JavascriptSubtitlesOctopus/blob/f4625ac313b318bd5d2e0ae18679ff516370bae6/src/SubtitleOctopus.cpp#L98-L103
They may have a different size on each frame, but their "capacity" will only grow to a certain value.

@dmitrylyzo
Copy link
Contributor Author

Currently, if #124 is fine, it should go first.

dmitrylyzo and others added 4 commits March 19, 2022 13:42
Only for testing the compilation of this file.

[part of] Rework buffer into C++ class
Cherry-picked from: jellyfin@6d3a5c7
Cherry-picked from: jellyfin@6d3a5c7

Add 'ReusableBuffer::capacity' from [Restructure renderBlend so it can return multiple pieces; JS side not ready yet]
Cherry-picked from: jellyfin@0cc2bf8

Move private data members to the top.
Rename 'ReusableBuffer::take' -> 'ReusableBuffer::ensure_size'.
@TheOneric
Copy link
Member

The “Allow to compile CPP-file alone” commit has been merged.

Currently, if #124 is fine, it should go first.

Using some custom magic constant for the canvas dimensions is not the right way to prevent typewidth overflows.

I have an untested branch started several weeks ago, which would replace both this and the overflow PR, but hadn't the time to finish + test it yet. Can you clarify what requirements are actually necessary for concretly planned or in jellyfin already existing features?

@dmitrylyzo
Copy link
Contributor Author

Using some custom magic constant for the canvas dimensions is not the right way to prevent typewidth overflows.

But you can trim the canvas to a reasonable size.
Checking the size on each frame may degrade performance.
And How hard do we need to defend ourselves against libass?

I have an untested branch started several weeks ago, which would replace both this and the overflow PR, but hadn't the time to finish + test it yet. Can you clarify what requirements are actually necessary for concretly planned or in jellyfin already existing features?

  1. Here we need to know the size before allocation.
    This is smart-blend part.
  2. The requested size may be less than capacity - zeroing by capacity is more than necessary.

@TheOneric
Copy link
Member

TheOneric commented Mar 23, 2022

Type overflows have nothing to do with libass or any notion of a "reasonable" canvas size (if you want to do the latter there’s an JSO option for this), but with the sizes of the underlying architecture and how those are mapped to C types. Any custom magic constant is incorrect in that it either is below the minimum size required by ISO and may needlessly limit, or it is above the minimum and may not actually prevent overflows.
The correct way to guard against type overflows is using the values from limits.h.

  1. Here we need to know the size before allocation.

You obviously still know what size you need beforehand, you just might need to do an overflow check before — or since size_t is unsigned, after if possible — blindly multiplying values.

  1. The requested size may be less than capacity - zeroing by capacity is more than necessary.

We could instead add an option to query a zero-initialised buffer from get_rawbuf. Or just recalculate the actual size, if get_rawbuf succeeded the multiplication is safe and I doubt this multiplication is in any under real-world conditions measurable way performance-relevant.

@dmitrylyzo
Copy link
Contributor Author

Type overflows have nothing to do with libass

Those width and height (which initiated the overflow problem) are from libass, no?
Does libass protect itself from overflow due to resolution?

I ask because if you know the range of data you're working with, you don't have to get too defensive.
You don't create a public class that can be used by anyone but you.

or any notion of a "reasonable" canvas size (if you want to do the latter there’s an JSO option for this), but with the sizes of the underlying architecture and how those are mapped to C types. Any custom magic constant is incorrect in that it either is below the minimum size required by ISO and may needlessly limit, or it is above the minimum and may not actually prevent overflows.

If you don't limit them, you will constantly get memory allocation errors.
Yes, you can check as you did in your branch, but 4GB (for 32 bit) is too much for a framebuffer, imo.
I think it's better to fix them or throw an error from the beginning.

You obviously still know what size you need beforehand

Then the size will be calculated twice: outside the buffer and inside the buffer.

Or just recalculate the actual size, if get_rawbuf succeeded the multiplication is safe and I doubt this multiplication is in any under real-world conditions measurable way performance-relevant.

Little by little such immeasurable places can grow into measurable one.
And, again, you will have a size calculation in two places.

@TheOneric
Copy link
Member

TheOneric commented Mar 23, 2022

Those width and height (which initiated the overflow problem) are from libass, no?

No. And the overflow happens in pure C.

4GB (for 32 bit) is too much for a framebuffer, imo.

32bit is well above the minimum required by ISO, and unless it’s actually requested it won't grow that big anyway.

@TheOneric
Copy link
Member

The alternative has now been tested and I didn't spot any issues. Let’s move further discussion to #133.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants