[video_player_avfoundation] enable more than 30 fps #7466

Open · wants to merge 7 commits into main

Conversation

@misos1 (Contributor) commented Aug 21, 2024

In the player, 30 fps was hardcoded when setting up the video composition. It now uses the timing from the source track, with minFrameDuration as a fallback, since frameDuration apparently must always be set to something and takes over in some situations, as mentioned in the documentation for sourceTrackIDForFrameTiming. Video composition setup is also skipped when it is not needed, i.e. when preferredTransform is the identity.
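A minimal sketch of the new timing setup (the AVFoundation names are real; the surrounding plugin code and the exact fallback value are simplified here):

AVMutableVideoComposition *videoComposition = [AVMutableVideoComposition videoComposition];
// Prefer the source track's own frame timing over a hardcoded rate.
videoComposition.sourceTrackIDForFrameTiming = videoTrack.trackID;
// frameDuration must still be set; it takes over whenever track timing cannot be used.
videoComposition.frameDuration = CMTIME_IS_VALID(videoTrack.minFrameDuration)
    ? videoTrack.minFrameDuration
    : CMTimeMake(1, 30);  // Fallback value; see the review discussion below.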

The function updatePlayingState is often called right after setupEventSinkIfReadyToPlay, but this seems to have been forgotten in onListenWithArguments. It also cannot simply be called that way there, because setupEventSinkIfReadyToPlay may finish asynchronously when it calls itself again via [self performSelector:_cmd onThread:NSThread.mainThread withObject:self waitUntilDone:NO]. So updatePlayingState is now called right after _isInitialized = YES, which is what it needs in order to do anything at all.
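A sketch of the resulting ordering (the method names are those mentioned above; the elided setup is unchanged):

- (void)setupEventSinkIfReadyToPlay {
  // ... existing setup, which may re-invoke this method asynchronously on the main thread ...
  _isInitialized = YES;
  // Moved here from callers such as onListenWithArguments, so that it only runs
  // once initialization has actually completed.
  [self updatePlayingState];
}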

There was one more obstacle to playing 60 fps videos on a 60 Hz screen. At least two "display links" are involved when playing video: one calls displayLinkFired, and another, in the Flutter engine, calls copyPixelBuffer, but only if textureFrameAvailable was called previously. The order in which those two fire is undefined, so copyPixelBuffer may be called 16 ms after displayLinkFired, followed immediately by the next displayLinkFired, and so on. But copyPixelBuffer steals the newest pixel buffer from the video player output, so in displayLinkFired, hasNewPixelBufferForItemTime will not report another pixel buffer for a time close to that one. copyPixelBuffer is then not called for the next frame, because textureFrameAvailable was not called, and in this way every second frame is skipped, playing the video at 30 fps. There was also a synchronization problem with lastKnownAvailableTime. Pixel buffers are now produced and reported in a single place, displayLinkFired, and received with correct synchronization in copyPixelBuffer. Ideally there would be just a single "display link", the Flutter engine's, if it also supported pulling frames instead of only pushing (pushing is good enough for a camera, where the system pushes frames to us, but frames must be pulled from a player's video output). Calling textureFrameAvailable every frame could accomplish that, but this line in the Flutter engine appears to be executed even when copyPixelBuffer returns NULL, and it may be expensive (although there is no need to call it in that case):

sk_sp<flutter::DlImage> image = [self wrapExternalPixelBuffer:_lastPixelBuffer context:context];

There seems to be a bug in the video player with this Flutter engine on macOS. The video plays normally but then starts "tearing": it displays frames normally, but once in a while it shows some frame from the past, like a previously cached frame. This happens on the main branch; rendering at 60 fps merely exaggerates it (it is not caused by this PR).


@@ -21,27 +21,38 @@ @interface FVPFrameUpdater : NSObject
@property(nonatomic, weak, readonly) NSObject<FlutterTextureRegistry> *registry;
// The output that this updater is managing.
@property(nonatomic, weak) AVPlayerItemVideoOutput *videoOutput;
// The last time that has been validated as available according to hasNewPixelBufferForItemTime:.
@property(nonatomic, assign) CMTime lastKnownAvailableTime;
@property(nonatomic) CVPixelBufferRef latestPixelBuffer;
Contributor:

All properties need comments, per the style guide. Please explain in comments what the purpose of these two new properties is.

if (self.latestPixelBuffer) {
CFRelease(self.latestPixelBuffer);
}
self.latestPixelBuffer = [self.videoOutput copyPixelBufferForItemTime:outputItemTime
Contributor:

This is essentially doubling the memory usage for video output, isn't it? Why doesn't the previous approach of only storing the timestamp work? The PR description discusses the early-consume problem, but it seems like that could be addressed simply by changing copyPixelBuffer to prefer the last time instead of the current time.

Contributor Author (@misos1, Aug 23, 2024):

This is essentially doubling the memory usage for video output, isn't it?

Can you please explain how? Memory usage with the original approach would be one frame less only if the Flutter engine released its previous pixel buffer right before calling copyPixelBuffer, but it would not want to do that, because copyPixelBuffer can also return NULL and it then needs the latest buffer to show.

Contributor:

Can you please explain how?

Aren't you keeping an extra copy of the frame besides the one kept by the player and the one kept by the engine?

I guess not doubled, but increasing by one frame relative to the current implementation.

(Also looking again, the current PR code appears to be leaking every frame it consumes.)

Contributor Author:

(Also looking again, the current PR code appears to be leaking every frame it consumes.)

I cannot see it.

Contributor:

Oh, I see now. The memory flow of the copied buffer is pretty hard to follow as currently written, with the buffer sometimes freed by the frame updater and sometimes handed off to the engine.

Which brings us back to the initial question: why do we need to copy the buffer proactively when the display link fires instead of storing the timestamp?

Contributor Author:

Aren't you keeping an extra copy of the frame besides the one kept by the player and the one kept by the engine?

Both versions have a worst case of 2+N pixel buffers, where N is the number held by the player. This case occurs between copyPixelBufferForItemTime and the moment the engine replaces its own buffer after copyPixelBuffer. The current version can merely stay at 2+N a little longer, especially when displayLinkFired is called after copyPixelBuffer on each frame. By the way, if the player kept only a single frame, then storing a timestamp instead of the buffer would not work.

Which brings us back to the initial question: why do we need to copy the buffer proactively when the display link fires instead of storing the timestamp?

That would mean fetching pixel buffers from the past, especially when displayLinkFired is called after copyPixelBuffer on each frame. But that rests on the undocumented, and possibly wrong, assumption that the player always leaves at least one past frame ready for us. What if that is not the case? Actually, it is not: the player seems to periodically flush all past pixel buffers. Once there are some 3 pixel buffers in the past, the player flushes them all.

I tried an implementation that sent timestamps instead of pixel buffers, and copyPixelBufferForItemTime often returned NULL for a timestamp for which hasNewPixelBufferForItemTime had returned true just before. It had 4x more frame drops on average during my tests compared to a slightly modified current implementation (with copyPixelBufferForItemTime moved outside of pixelBufferSynchronizationQueue).

There are several causes of frame drops. This modified implementation minimizes the cases where copyPixelBufferForItemTime returns NULL, and incidentally also the frame drops caused by artefacts of using two "display links", namely displayLinkFired and copyPixelBuffer changing order between frames. Because the work on pixelBufferSynchronizationQueue is short (in time) and a new pixel buffer is generated after some delay, even when copyPixelBuffer is called after displayLinkFired on some frame (while previously it was the other way around), it has a chance to obtain the pixel buffer from the previous frame's displayLinkFired and to not clear the _textureFrameAvailable set by the latest displayLinkFired. This is of course not ideal; a more proper way would be to wait until the middle of the frame and then send a new pixel buffer and call textureFrameAvailable, but an even more proper solution would be to not use two "display links" at all.

Another cause of frame drops is hasNewPixelBufferForItemTime returning false, which is now the prevalent one. This seems to be caused by irregularities in when displayLinkFired is called, which produce irregular timestamps from CACurrentMediaTime. I tested using CADisplayLink.timestamp instead, and frame drops fell to almost zero, below 0.1%, around 20x less than the (modified) current implementation on average during my tests. But this would need access to the CADisplayLink through FVPDisplayLink, and some other implementation for macOS.
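A sketch of that experiment (this assumes direct access to the CADisplayLink, which the cross-platform FVPDisplayLink wrapper does not currently expose):

- (void)displayLinkFired:(CADisplayLink *)link {
  // link.timestamp advances in regular refresh-duration steps, unlike
  // CACurrentMediaTime(), which reflects whenever this callback happened to run.
  CMTime outputItemTime = [self.videoOutput itemTimeForHostTime:link.timestamp];
  if ([self.videoOutput hasNewPixelBufferForItemTime:outputItemTime]) {
    // ... copy and report the new pixel buffer ...
  }
}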

I also tested an implementation without a display link, where textureFrameAvailable and copyPixelBuffer were called for each frame; it had around 3x fewer frame drops than the (modified) current implementation. Unfortunately I could not use CADisplayLink.timestamp there, so there were still frame drops due to irregularities. The Flutter engine would need to provide something similar, or maybe it can be obtained simply by rounding CACurrentMediaTime down to the display's refresh duration, but that would then require the display's update frequency from the engine (or something like what FVPDisplayLink does for macOS).

Contributor:

That would mean fetching pixel buffers from the past, especially when displayLinkFired is called after copyPixelBuffer on each frame. But that rests on the undocumented, and possibly wrong, assumption that the player always leaves at least one past frame ready for us. What if that is not the case? Actually, it is not: the player seems to periodically flush all past pixel buffers. Once there are some 3 pixel buffers in the past, the player flushes them all.

I tried an implementation that sent timestamps instead of pixel buffers, and copyPixelBufferForItemTime often returned NULL for a timestamp for which hasNewPixelBufferForItemTime had returned true just before. It had 4x more frame drops on average during my tests compared to a slightly modified current implementation

Very interesting, thanks for the details! That definitely seems worth the slight memory hit.

Another cause of frame drops is hasNewPixelBufferForItemTime returning false, which is now the prevalent one.

Could you file an issue with the details of what you've found here for us to follow up on in the future? It sounds like you've done a lot of great investigation here that we should be sure to capture and track.

// Currently set at a constant 30 FPS
videoComposition.frameDuration = CMTimeMake(1, 30);
videoComposition.sourceTrackIDForFrameTiming = videoTrack.trackID;
videoComposition.frameDuration = videoTrack.minFrameDuration;
Contributor:

What happens if minFrameDuration is kCMTimeInvalid, which is a documented possibility?

Contributor Author (@misos1, Sep 3, 2024):

Setting frameDuration to kCMTimeInvalid causes an Objective-C exception. But setting some magic constant as a fallback does not sound good either; it could even be CMTimeMake(1, 1). My thought at the time was that if this happens, someone will report it, whereas with even a bad fallback that is less likely. Right now we do not know why it could happen or how common it is: if it happens only for some degenerate video files, a magic constant may be ok, but if it happens often, a more elaborate fix could be needed (like using the display refresh rate, which could be overkill if this never happens at all). Maybe it would be better to use some magic constant as a fallback (or perhaps better to not set up the video composition at all if minFrameDuration is not valid?) but also report it as an error or warning.

By definition, minFrameDuration should be defined if the video has at least two frames. I tested with a video with just a single frame and minFrameDuration still had a valid CMTime value; I tried to test with a video without any video samples, but that video sadly also did not contain any video tracks. The only way I was able to get kCMTimeInvalid from minFrameDuration was by using it on an audio track.

Contributor (@stuartmorgan, Sep 17, 2024):

We should default to the previous value of 1/30; we should not knowingly write code that throws an exception in a case that's documented to be possible.

We could log that we are doing fallback and suggest filing an issue with details, but I don't think it's necessary. If nobody notices any problems that result from the fallback if/when it happens, then it doesn't really matter in practice.

Contributor Author:

Not everyone reports everything. Even if something is hard to notice, that does not mean users will not notice it.

Contributor:

I don't have any objection to once-per-video logging about this as long as it's:

  • only compiled in debug mode, and
  • has clear, actionable steps for a developer hitting it

Contributor Author:

Currently it is NSLog(@"Warning: videoTrack.minFrameDuration for input video is invalid, please report this."); that is why I asked what is typically used in packages for such logs.

Contributor:

that is why I asked what is typically used in packages for such logs

Typically we don't do logging like this, because it's very hard to notice; instead we return errors so they will be surfaced using the usual process.

In this case I don't think adding a new error return that didn't previously exist is worth the risk of regression though.

Contributor:

As for the message, it would need to be much more specific about the action: report to whom, and how? The Flutter team? The source of the video? Someone else?

@@ -283,6 +294,10 @@ - (instancetype)initWithPlayerItem:(AVPlayerItem *)item
error:nil] == AVKeyValueStatusLoaded) {
// Rotate the video by using a videoComposition and the preferredTransform
self->_preferredTransform = FVPGetStandardizedTransformForTrack(videoTrack);
// do not use video composition when it is not needed
Contributor:

Comments should be properly capitalized and punctuated sentences.

@@ -283,6 +294,10 @@ - (instancetype)initWithPlayerItem:(AVPlayerItem *)item
error:nil] == AVKeyValueStatusLoaded) {
// Rotate the video by using a videoComposition and the preferredTransform
self->_preferredTransform = FVPGetStandardizedTransformForTrack(videoTrack);
// do not use video composition when it is not needed
if (CGAffineTransformIsIdentity(self->_preferredTransform)) {
Contributor:

How is this change related to the PR? I don't see it discussed in the PR description.

Contributor Author (@misos1, Aug 23, 2024):

This composition may add overhead (including memory). And judging from the fact that it needs its own frame rate, it probably pulls frames independently of calls to copyPixelBufferForItemTime, so it may also add latency if its per-frame processing happens to run after video_player's (maybe not, if it processes frames ahead, which would mean even more memory usage).

@@ -320,6 +335,10 @@ - (instancetype)initWithPlayerItem:(AVPlayerItem *)item
_videoOutput = [avFactory videoOutputWithPixelBufferAttributes:pixBuffAttributes];
frameUpdater.videoOutput = _videoOutput;

_pixelBufferSynchronizationQueue =
Contributor:

Why is this being stored in two different places, rather than being read out of the frame updater? The two must be the same queue in order for things to work, so storing it twice just introduces a potential source of bugs.

@@ -320,6 +335,10 @@ - (instancetype)initWithPlayerItem:(AVPlayerItem *)item
_videoOutput = [avFactory videoOutputWithPixelBufferAttributes:pixBuffAttributes];
frameUpdater.videoOutput = _videoOutput;

_pixelBufferSynchronizationQueue =
dispatch_queue_create("io.flutter.video_player.pixelBufferSynchronizationQueue", NULL);
frameUpdater.pixelBufferSynchronizationQueue = _pixelBufferSynchronizationQueue;
Contributor:

Is there a reason this is being created here rather than in the frame updater initialization?

__block CVPixelBufferRef buffer = NULL;
dispatch_sync(self.pixelBufferSynchronizationQueue, ^{
buffer = self.frameUpdater.latestPixelBuffer;
self.frameUpdater.latestPixelBuffer = NULL;
Contributor:

Since this pair is the only way the buffer should be consumed, a method on the frame updater that does both would be better than making the buffer publicly writeable.
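For example, a hypothetical consume method on the frame updater (the name is illustrative, not from the PR):

- (CVPixelBufferRef)takeLatestPixelBuffer {
  __block CVPixelBufferRef buffer = NULL;
  dispatch_sync(self.pixelBufferSynchronizationQueue, ^{
    buffer = self.latestPixelBuffer;
    self.latestPixelBuffer = NULL;
  });
  return buffer;  // Ownership transfers to the caller.
}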

@stuartmorgan (Contributor):

@misos1 Are you still planning on addressing the remaining comments?

@misos1 (Contributor Author) commented Sep 17, 2024

@stuartmorgan Yes, I was waiting for your input, as I thought you wanted to handle minFrameDuration being kCMTimeInvalid differently. I thought of setting frameDuration to some constant in that case but also emitting a warning log message, as I have sometimes seen the Flutter engine do, but I did not find how that is normally done in plugins.

@stuartmorgan (Contributor):

Sorry, I didn't realize that was waiting for my input. I'll respond there.

@end

@implementation FVPFrameUpdater
- (FVPFrameUpdater *)initWithRegistry:(NSObject<FlutterTextureRegistry> *)registry {
self = [super init];
NSAssert(self, @"super init cannot be nil");
if (self == nil) return nil;
_registry = registry;
_lastKnownAvailableTime = kCMTimeInvalid;
return self;
}

- (void)displayLinkFired {
// Only report a new frame if one is actually available.
CMTime outputItemTime = [self.videoOutput itemTimeForHostTime:CACurrentMediaTime()];
if ([self.videoOutput hasNewPixelBufferForItemTime:outputItemTime]) {
Contributor:

Hm, shouldn't these two lines be inside the dispatch_async? AVPlayerItemVideoOutput doesn't seem to be marked as threadsafe.
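The suggested shape, assuming the copy already happens on the synchronization queue (a sketch, not the PR's code):

- (void)displayLinkFired {
  dispatch_async(self.pixelBufferSynchronizationQueue, ^{
    // Keep all AVPlayerItemVideoOutput access on one queue, since the class is
    // not documented to be thread-safe.
    CMTime outputItemTime = [self.videoOutput itemTimeForHostTime:CACurrentMediaTime()];
    if (![self.videoOutput hasNewPixelBufferForItemTime:outputItemTime]) {
      return;
    }
    // ... copy the buffer and notify the registry ...
  });
}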


- (void)dealloc {
if (_latestPixelBuffer) {
CFRelease(_latestPixelBuffer);
Contributor:

Minor note: the releases of the buffer should use CVBufferRelease, which doesn't need to be null-checked.
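That is, something like:

- (void)dealloc {
  // CVBufferRelease is a no-op when passed NULL, so no check is needed.
  CVBufferRelease(_latestPixelBuffer);
}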

@misos1 (Contributor Author) commented Sep 26, 2024

There seems to be a bug in the video player with this Flutter engine on macOS. The video plays normally but then starts "tearing": it displays frames normally, but once in a while it shows some frame from the past, like a previously cached frame. This happens on the main branch; rendering at 60 fps merely exaggerates it (it is not caused by this PR).

This seems to be caused by a bug in the Flutter engine. The documentation says: "You need to maintain a strong reference to textureOut until the GPU finishes execution of commands accessing the texture, because the system doesn't automatically retain it." But here textureOut is released right after CVMetalTextureCacheCreateTextureFromImage:

https://github.com/flutter/engine/blob/6f802b39ab0669eb6ba3272dff1d34e85febeb77/shell/platform/darwin/graphics/FlutterDarwinExternalTextureMetal.mm#L233-L249
https://github.com/flutter/engine/blob/6f802b39ab0669eb6ba3272dff1d34e85febeb77/shell/platform/darwin/graphics/FlutterDarwinExternalTextureMetal.mm#L177-L197

I thought it was flashing video frames from the past, but I looked closely at a video recorded from the screen and, for example, at one moment it showed, for 1/60 of a second, a frame from 12 frames into the future. Maybe when the underlying textureOut is released, its memory can be reused and overwritten by AVPlayer while decoding subsequent video frames.

This does not explain as well another thing I noticed: if copyPixelBuffer returns NULL for some frame, the engine shows a transparent image. But I will assume this is also caused by that use-after-free bug.

There is also this: "Note that Core Video doesn't explicitly declare any pixel format types as Metal compatible. Specify true for the kCVPixelBufferMetalCompatibilityKey option to create Metal-compatible buffers when creating or requesting Core Video pixel buffers." Maybe the player should also specify this in pixBuffAttributes when creating the AVPlayerItemVideoOutput?
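A sketch of what that could look like (the other keys shown are assumptions about the plugin's existing attributes, not part of this PR):

NSDictionary *pixBuffAttributes = @{
  (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA),
  (id)kCVPixelBufferIOSurfacePropertiesKey : @{},
  (id)kCVPixelBufferMetalCompatibilityKey : @YES,  // Proposed addition.
};
_videoOutput = [avFactory videoOutputWithPixelBufferAttributes:pixBuffAttributes];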

@stuartmorgan (Contributor) commented Sep 26, 2024

This seems to be caused by a bug in the Flutter engine.

Please definitely file an issue with details if you haven't already!

Edited to add: You can cross-reference the engine issue with flutter/flutter#135999, which sounds like the macOS playback issue you are describing here if I'm understanding correctly.

@stuartmorgan (Contributor):

There is also this: "Note that Core Video doesn't explicitly declare any pixel format types as Metal compatible. Specify true for the kCVPixelBufferMetalCompatibilityKey option to create Metal-compatible buffers when creating or requesting Core Video pixel buffers." Maybe the player should also specify this in pixBuffAttributes when creating the AVPlayerItemVideoOutput?

This would probably be a question for the #hackers-engine channel; I'm not familiar with the current end-to-end pipeline requirements for the engine texture rendering path.

@stuartmorgan (Contributor):

In terms of moving this forward, from my perspective:

  • Structurally everything here makes sense to me at this point; thanks again for the clear explanations!
  • I don't consider the macOS issue blocking as it's an existing issue. If we get feedback that the experience is substantially worse we can always do a follow-up to disable just >30fps for macOS, while leaving all the other improvements (e.g., threading model fixes) in place.

So once the smaller feedback items still open are addressed in an updated version of the PR, I can re-review, and I expect we'll be on track for getting this landed.

Does that sound right?

@misos1 (Contributor Author) commented Sep 26, 2024

If we get feedback that the experience is substantially worse we can always do a follow-up to disable just >30fps for macOS, while leaving all the other improvements (e.g., threading model fixes) in place.

Yes; the video composition can be set every time, even with the affine transform and fps forced to 30, and it still happens. I wonder why no one reported this. And there are other things that seemingly help a little with it, like the frame updater retaining the pixel buffer, or creating a fresh copy of the pixel buffer (I first thought that AVPlayer at some point rewrites pixel buffers already returned by copyPixelBufferForItemTime, but that is not the case). At least these worked on my specific system, but nothing is certain when there is UB; someone may experience it even worse with a 30 fps version.

@stuartmorgan (Contributor):

I wonder why no one reported this.

I think they have, per my edit to this comment above. Unless that's not the same behavior you're describing?

And there are other things that seemingly help a little with it, like the frame updater retaining the pixel buffer, or creating a fresh copy of the pixel buffer

If we have lifetime issues within the engine pipeline itself, that's definitely something we should fix in the engine rather than try to hack around at the video_player level.

@misos1 (Contributor Author) commented Sep 26, 2024

I think they have, per my edit to this comment above. Unless that's not the same behavior you're describing?

Maybe partially; as I wrote, if copyPixelBuffer returns NULL for some frame, the engine shows a transparent image, meaning it flickers to the background. I did not experience this on iOS, although the two seem to share the same code, so the UB should also exist on iOS.

@misos1 (Contributor Author) commented Sep 29, 2024

Would it be doable to add a new capability to the Flutter engine and depend on it in this package, or does the package need to also work with engine versions prior to such a change?

@stuartmorgan (Contributor):

Whether we try to work around engine bugs at the plugin level is something we decide on a case-by-case basis. If the macOS engine texture pipeline is buggy, I would not attempt to work around that at the plugin level without a very compelling reason to do so.

@misos1 (Contributor Author) commented Sep 30, 2024

No, I do not mean that; rather, the engine being able to pull pixel buffers instead of needing textureFrameAvailable. Something like a flag so the engine calls copyPixelBuffer for every frame: if it returns NULL, the engine does nothing; if it returns a pixel buffer, the engine updates its internal buffers.

@stuartmorgan (Contributor):

I'm not really sure what you mean by "for every frame", but if you want to propose a new engine API the place to start would be a design document. Details of exactly when video_player would adopt a new proposed API would be figured out much later in the process.

@misos1 (Contributor Author) commented Sep 30, 2024

By "for every frame" I mean that it would be called as if _textureFrameAvailable was always true. _textureFrameAvailable is set to true by calling textureFrameAvailable and set to false soon after is copyPixelBuffer called.

@misos1 (Contributor Author) commented Oct 10, 2024

Regarding #7466 (comment), it seems that macOS actually uses a different code path than iOS, but it is the same here: cvMetalTexture is released right after CVMetalTextureCacheCreateTextureFromImage:

https://github.com/flutter/engine/blob/6f802b39ab0669eb6ba3272dff1d34e85febeb77/shell/platform/darwin/macos/framework/Source/FlutterExternalTexture.mm#L119-L135
https://github.com/flutter/engine/blob/6f802b39ab0669eb6ba3272dff1d34e85febeb77/shell/platform/darwin/macos/framework/Source/FlutterExternalTexture.mm#L77-L97

For both, the change appears to be some 4 years old, so I am not sure why such similar things were implemented twice.

And this is eventually called from embedder_external_texture_metal.mm, which handles things differently: it does not use a flag like the _textureFrameAvailable used in FlutterDarwinExternalTextureMetal.mm, but instead nullifies the last stored image when textureFrameAvailable is called. So if copyPixelBuffer returns NULL, it shows nothing and periodically calls it until it returns non-NULL, which explains what I observed.

I am not sure whether this is intended, but FlutterTexture says nothing about whether copyPixelBuffer can return NULL or what should happen in that case:

/**
 * Copy the contents of the texture into a `CVPixelBuffer`.
 *
 * The type of the pixel buffer is one of the following:
 * - `kCVPixelFormatType_32BGRA`
 * - `kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange`
 * - `kCVPixelFormatType_420YpCbCr8BiPlanarFullRange`
 */
- (CVPixelBufferRef _Nullable)copyPixelBuffer;

So I will assume that returning NULL from copyPixelBuffer after previously returning non-NULL is undefined (returning NULL at least once at the start is practically unavoidable, because the engine calls it until it returns non-NULL even without textureFrameAvailable being called). Then both the old and the current implementation are not entirely correct, because they can return NULL from copyPixelBuffer, since copyPixelBufferForItemTime can return NULL at any time. The player therefore needs to remember the last returned pixel buffer and, if it does not have anything new, return that remembered buffer again until a new one is available.
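A sketch of that caching behavior, where newBuffer stands for the result of the latest copyPixelBufferForItemTime: call (it mirrors the diff reviewed further below):

if (newBuffer) {  // A fresh buffer was available this time.
  CVBufferRelease(self.latestPixelBuffer);
  self.latestPixelBuffer = newBuffer;
}
// Hand the engine a retained reference to whatever frame was delivered last.
return CVBufferRetain(self.latestPixelBuffer);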

Contributor (@stuartmorgan) left a comment:

Sorry for the delay, I needed a block of time where I could swap all the context back in here since the implementation has changed significantly since my last review.

// The display link that drives frameUpdater.
@property(nonatomic) FVPDisplayLink *displayLink;
// The time interval between screen refresh updates.
@property(nonatomic) _Atomic CFTimeInterval duration;
Contributor:

Why is this using stdatomic instead of just making the property atomic? A nonatomic _Atomic property seems needlessly confusing.

Contributor Author:

I think it is not documented whether an atomic property with a primitive type can be lock free, so it looked like overkill, although there are some hints that it probably can be. So there is probably no problem changing it to an atomic property.

Contributor Author:

I consider std atomics readable enough, and there is a nice choice of memory orderings, but yes, an atomic property would probably make for simpler code even with the potential small overhead (it is unknown to me which ordering it uses when it is lock free).

Contributor:

I consider std atomics readable enough

You are not the only person who will be reading and maintaining this code. Obj-C developers are overwhelmingly going to be more familiar with atomic than _Atomic (in fact, I don't think I've ever seen _Atomic in Obj-C code in literally decades of Obj-C development).

Unless you have benchmark data showing that atomic is an issue in practice, please use it.
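For reference, the suggested form is an ordinary atomic property, e.g. (using the name proposed in a naming nit further below):

@property(atomic, assign) CFTimeInterval frameDuration;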

Contributor Author:

You are not the only person who will be reading and maintaining this code.

I did not want to imply that. It was part of a sentence concluding that an atomic property probably has the better tradeoff of simplicity (readability) vs performance, i.e. is the better choice.

[_registry textureFrameAvailable:_textureId];
}
// Display link duration is in an undefined state until displayLinkFired is called at least once
// so it should not be used directly.
Contributor:

This seems like it belongs on the ivar declaration rather than here.

} else {
NSLog(@"Warning: videoTrack.minFrameDuration for input video is invalid, please report this to "
@"https://github.com/flutter/flutter/issues with input video attached.");
videoComposition.frameDuration = CMTimeMake(1, 120);
Contributor:

Why 120 fps rather than 30 as the fallback?

Contributor Author:

And why not 60? There is no good universal value. I think some displays have 120 Hz and this should not be ever used anyway.

Contributor:

And why not 60?

Because I would like the fallback to be conservative, and I think 30 is a reasonable baseline value for video. Which is why I proposed it at the beginning of the review.

this should not be ever used anyway

If you believe this code path will never be used, I'm not sure why you have pushed back repeatedly against my suggestions for how we handle it.

Contributor Author:

A hardcoded constant of 30 fps is what led to this PR in the first place. With 30, the worst case would be worse playback for videos with higher fps, while something higher is more inclusive, and its worst case might be performance related, but I did not observe that even when I set 1000 or 1000000 here, and it never produces more frames than are in the source video. Actually my constant is conservative: it does not account for slow-motion videos (a video can have an actual frame rate of, say, 240 fps rather than 30 fps, with an instruction to play at 0.125 rate; in that case frameDuration needs to be CMTimeMake(1, 240), and with 120 it would play at 15 fps). Also I think the more "standard" rate is actually 24 fps (devised some 100 years ago), not 30, and 120 is divisible by 24.

Contributor:

A hardcoded constant of 30 fps is what led to this PR in the first place.

And this PR changes the use of the hard-coded constant from "always" to "cases where the frame duration can't be computed, of which we currently have no known examples", which constitutes a substantial improvement.

With 30, the worst case would be worse playback for videos with higher fps, while something higher is more inclusive, and its worst case might be performance related

Yes, I am fully aware of that. I would prefer that cases we don't have a good understanding of and are actively reporting via logging as not handled correctly err on the side of worse playback rather than app performance problems.

For instance, we know from issue reports that people use this package to play audio files even though we do not recommend that, and I would rather that have worse (imaginary) video playback than performance problems if that's a case that can trigger this fallback.

Also I think the more "standard" rate is actually 24 fps (devised some 100 years ago), not 30, and 120 is divisible by 24.

For a fallback where we have no information about the video's framerate, I am more concerned with screen rate than film standards. Many common phone screen refresh rates are not divisible by 24.

Contributor Author:

Yes, I am fully aware of that. I would prefer that cases we don't have a good understanding of and are actively reporting via logging as not handled correctly err on the side of worse playback rather than app performance problems.

I do not think it would have worse performance with 120 vs 30 (I can imagine a passive implementation that does not even require some sort of timer calling a function at that frequency). But even if it had a small impact, there is still sourceTrackIDForFrameTiming, so it should mostly use the track for timing anyway and only fall back to frameDuration on those rarer occasions.

For instance, we know from issue reports that people use this package to play audio files even though we do not recommend that, and I would rather that have worse (imaginary) video playback than performance problems if that's a case that can trigger this fallback.

Yes; as I wrote, audio tracks are the only ones for which I noticed an invalid minFrameDuration, but the video composition is used only when there is at least one video track (and specifically for the first video track). It would crash long before it could get there and use that 30 or 120 fps.

For a fallback where we have no information about the video's framerate, I am more concerned with screen rate than film standards. Many common phone screen refresh rates are not divisible by 24.

By the way, on macOS the display refresh rate can be set to 24, and the combination of 120 in frameDuration with a 24 fps video makes for smooth playback.

So why not use 24 instead of 30? Or maybe just put CMTimeMake(1, 1) here for these imaginary videos?

Contributor:

for these imaginary videos?

Again: If you believe this code path will never be used, I'm not sure why you have pushed back repeatedly against my suggestions for how we handle it.

Contributor Author:

I do not know whether it will ever be used or not. As I wrote, 120 seems to me a better choice than 30. Actually, from some points of view, 30 may be worse than even some ridiculous value like 10, where the problem would at least be very noticeable, rather than just barely noticeable but not quite right.

Contributor:

I do not know whether it will ever be used or not.

It is unconstructive to repeatedly assert that this case will "not be ever used" and is "imaginary" as a way to dismiss my position, and then argue the reverse when it comes to your position.

If you believe this cannot happen, then there is literally no downside to just making the change that I am asking for; if you believe it could potentially happen, then please stop asserting that it can't.

As I wrote, 120 seems to me a better choice than 30.

Yes, I understand that, I just don't agree. I do not find the arguments you've presented compelling enough to warrant approving a behavioral change (relative to the version that is already in production) for a case we have no actual examples of, and thus no ability to evaluate concretely.

Contributor Author:

When I mentioned "imaginary" I was referring to your earlier "...have worse (imaginary) video..."; it was just a reaction to what you wrote. I wrote "this should not be ever used", not "(will) not be ever used", which does not mean I "believe" it will certainly never happen (in that case I would have written "this will not be ever used", as if it were literally dead code). I think the chance is small, but it is really unknown until I can test every existing video in the world.

So if I understand correctly your arguments are:

  1. Better to use something that was here before as fallback due to behavioral change.
  2. It should be "harmonic" with refresh rate, so nothing like 24 fps, to avoid resulting artifacts and worse playback.
  3. It should be something high enough as to not cause worse playback than 30 fps would, so nothing less than 30 like 15 or 1.
  4. It should not be too high due to performance concerns, even 60 is too much.

I would at least reconsider using 60 fps here, as it covers both 30 and 60 fps videos; although 30 fps is probably still more common, both are very popular, and 60 simply covers both. And even if setting 60 here meant the video composition checked for a new frame twice as often, consider that even with 30 here, the player plugin itself already needs to check for a new frame at the display refresh rate, which is at least 60 Hz (apparently even 30 Hz and lower is possible, but it is opt-in in CADisplayLink).

// The display link that drives frameUpdater.
@property(nonatomic) FVPDisplayLink *displayLink;
// The time interval between screen refresh updates.
@property(nonatomic) _Atomic CFTimeInterval duration;
Contributor:

Doesn't this need a default starting value (probably 30fps) rather than 0 for when it's read before the display link fires?

Contributor Author:

While it is 0 at the beginning, targetTime will fall back to CACurrentMediaTime, which is ok (the condition in copyPixelBuffer is either true, when CACurrentMediaTime is not equal to targetTime, or false, once targetTime already equals CACurrentMediaTime).

// outside of which targetTime is reset should be narrow enough to make possible lag as small as
// possible and at the same time wide enough to avoid too frequent resets which would lead to
// irregular sampling. Ideally there would be a targetTimestamp of display link used by flutter
// engine (FlutterTexture can provide timestamp and duration or timestamp and targetTimestamp).
Contributor:

I think the parenthetical is describing a desired feature? As worded, it sounds like it's describing something that currently exists, which is confusing. I would replace this entire sentence with a TODO referencing an issue that requests the feature in detail, so that it's clearer what the context is and where the feature is tracked.

// Display link duration is in an undefined state until displayLinkFired is called at least once
// so it should not be used directly.
atomic_store_explicit(&_duration, _displayLink.duration, memory_order_relaxed);
[_registry textureFrameAvailable:_textureId];
Contributor:

Wouldn't it be better to manage targetTime within this class and conditionalize textureFrameAvailable: calls on the timing logic that's currently in copyPixelBuffer?

E.g., let's say the engine is slow to call copyPixelBuffer for whatever reason, and we call textureFrameAvailable: twice, then after that we get two copyPixelBuffer calls. The first one will skip a frame due to the target time being too old, and then the second one will be a somewhat expensive no-op, right? Or will the engine debounce copyPixelBuffer calls anyway, so we don't have to worry about it?

Contributor Author (@misos1, Nov 7, 2024):

In such a case I could just use targetTimestamp from the display link. That would be the previous implementation, just with targetTimestamp added; I had it ready at some point for both iOS and macOS, and although it was better with targetTimestamp than with CACurrentMediaTime, it still had the inherent race problem that this new version solves.

The engine calls copyPixelBuffer only once regardless of how many times textureFrameAvailable was called (once its effect has taken place), and that is actually the problem. If textureFrameAvailable could be used to enqueue multiple (at least 2) calls to copyPixelBuffer (provided they are not made during the same screen update), then that previous implementation could be fine (with some changes).

if ([self.videoOutput hasNewPixelBufferForItemTime:outputItemTime]) {
buffer = [self.videoOutput copyPixelBufferForItemTime:outputItemTime itemTimeForDisplay:NULL];
if (buffer) {
CVBufferRelease(self.latestPixelBuffer);
Contributor:

I think the ownership is non-trivial enough that it's worth commenting all the calls to explain what they balance. So here: // Balance the owned reference from `copyPixelBufferForItemTime:`.

}

// Better to avoid returning NULL as it is unspecified what should be displayed in such a case.
return CVBufferRetain(self.latestPixelBuffer);
Contributor:

And then here: // Add a retain for the engine, since the `copyPixelBufferForItemTime:` has already been accounted for, and the engine expects an owning reference.

if (CVDisplayLinkGetCurrentTime(self.displayLink, &timestamp) != kCVReturnSuccess) {
return 0;
}
return 1.0 * timestamp.videoRefreshPeriod / timestamp.videoTimeScale;
Contributor:

What is the 1.0 * for? If it's to make this a double, just cast to a double.
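That is:

return (double)timestamp.videoRefreshPeriod / timestamp.videoTimeScale;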

});
}

// Better to avoid returning NULL as it is unspecified what should be displayed in such a case.
Contributor:

I think it would be better to comment about this on the declaration of latestPixelBuffer rather than here. Something like "The last buffer returned in copyPixelBuffer. This is stored so that it can be returned again if nothing new is available from the video output, since the engine has undefined behavior when returning NULL."
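As a declaration, that would read something like:

// The last buffer returned in copyPixelBuffer. Stored so that it can be returned again if
// nothing new is available from the video output, since the engine has undefined behavior
// when copyPixelBuffer returns NULL.
@property(nonatomic) CVPixelBufferRef latestPixelBuffer;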

@stuartmorgan added the triage-ios (Should be looked at in iOS triage) label on Nov 8, 2024
@@ -125,6 +125,7 @@ @interface StubFVPDisplayLinkFactory : NSObject <FVPDisplayLinkFactory>

/** This display link to return. */
@property(nonatomic, strong) FVPDisplayLink *displayLink;
@property(nonatomic) void (^fireDisplayLink)(void);
Contributor:

nit: (nonatomic, copy)

@property(nonatomic, assign) CMTime lastKnownAvailableTime;
// The display link that drives frameUpdater.
@property(nonatomic) FVPDisplayLink *displayLink;
// The time interval between screen refresh updates.
Contributor:

Nit: Can you call it something like frameDuration or frameDelta or displayLinkDuration

@@ -543,16 +562,25 @@ - (void)setPlaybackSpeed:(double)speed {
}

- (CVPixelBufferRef)copyPixelBuffer {
// Ensure video sampling at regular intervals. This function is not called at exact time intervals
// so CACurrentMediaTime returns irregular timestamps which causes missed video frames. The range
// outside of which targetTime is reset should be narrow enough to make possible lag as small as
Contributor:

nit: make lags (due to skipping frames?) as infrequent as possible.

// so CACurrentMediaTime returns irregular timestamps which causes missed video frames. The range
// outside of which targetTime is reset should be narrow enough to make possible lag as small as
// possible and at the same time wide enough to avoid too frequent resets which would lead to
// irregular sampling. Ideally there would be a targetTimestamp of display link used by flutter
Contributor:

Can you add more explanation on how it leads to irregular sampling?

Can we do an experiment to always reset the time (remove the if check below) and see how it performs on a sample video?
