Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detecting motion vectors / handling scrolling better #1232

Closed
totaam opened this issue Jun 17, 2016 · 68 comments
Closed

detecting motion vectors / handling scrolling better #1232

totaam opened this issue Jun 17, 2016 · 68 comments

Comments

@totaam
Copy link
Collaborator

totaam commented Jun 17, 2016

Issue migrated from trac ticket # 1232

component: encodings | priority: blocker | resolution: fixed

2016-06-17 04:50:26: antoine created the issue


Scrolling can make us use a video region for the area being scrolled, we should be able to supply the motion vectors to the encoder: we know that the whole region moves as one unit so this should save a lot of CPU and bandwidth.

Plan:

  • add optional codec API to pass motion vectors to a test video codec (either x264 or vpx)
  • verify that these motion vectors are used: smaller output? metadata? save-to-file feature from r12144 then playing it back with a player that show motion vectors (ie: ffdshow does)
  • compare with and the current search algorithms (try EXA, HEX, etc..) and find their limitations - unit test it all
  • write simple motion detection for video region (exact match only and vertical only, for now) - unit test it, see Search Algorithms for Block-Matching in Motion Estimation
  • pass the motion vector to the encoder (and turn off deblocking?)
  • provide a DBUS api for passing those motion vectors from the app

Alternatively, maybe we could also use the motion vector internally: send the delta (using a fast stream compressor like lz4) + motion: still send the newly exposed region separately (different encoding)
We may want to force an I frame for the next video frame since the delta will be bigger.

Also worth thinking about: distinguish video from scrolling. Video has high framerate and lasts longer, we can then:

  • use a different motion search algorithm: use ESA (aka exhaustive search) for video, for scrolling, we should stick with HEX, DIA or UHM (see X264 - Motion Estimation Method- Comparison)
  • disable "subme" (subpixel motion) for scrolling
  • disable partitions?
  • only use a single reference frame for scrolling (much faster)

There are probably some limitations on motion vectors range, etc

@totaam
Copy link
Collaborator Author

totaam commented Jun 19, 2016

2016-06-19 11:44:41: antoine uploaded file detect-scrolling.patch (2.5 KiB)

unoptimized code to detect scrolling

@totaam
Copy link
Collaborator Author

totaam commented Jun 19, 2016

2016-06-19 11:53:49: antoine changed status from new to assigned

@totaam
Copy link
Collaborator Author

totaam commented Jun 19, 2016

2016-06-19 11:53:49: antoine changed title from motion vectors for video regions to detecting motion vectors / handling scrolling better

@totaam
Copy link
Collaborator Author

totaam commented Jun 19, 2016

2016-06-19 11:53:49: antoine commented


The good news first: with the patch above, I can detect scrolling in the majority of cases where the distance is small enough, ie: with Firefox:

best scroll guess, matches 91% of 464 lines: 10
best scroll guess, matches 89% of 464 lines: 13
best scroll guess, matches 88% of 464 lines: 15
best scroll guess, matches 95% of 464 lines: 6
best scroll guess, matches 93% of 464 lines: 8
best scroll guess, matches 89% of 464 lines: 13
best scroll guess, matches 95% of 464 lines: 6
best scroll guess, matches 90% of 464 lines: 12
best scroll guess, matches 95% of 464 lines: 5
best scroll guess, matches 96% of 464 lines: 5
best scroll guess, matches 97% of 464 lines: 3
best scroll guess, matches 95% of 464 lines: 7
best scroll guess, matches 97% of 464 lines: 3
best scroll guess, matches 98% of 464 lines: 0
best scroll guess, matches 90% of 464 lines: -14
best scroll guess, matches 73% of 464 lines: -42
best scroll guess, matches 64% of 464 lines: -52
best scroll guess, matches 74% of 464 lines: -32
best scroll guess, matches 73% of 464 lines: -32
best scroll guess, matches 52% of 464 lines: -60
best scroll guess, matches 77% of 464 lines: -28
best scroll guess, matches 64% of 464 lines: -45
best scroll guess, matches 85% of 464 lines: -17
best scroll guess, matches 85% of 464 lines: -18

The bad news is that x264 does not support supplying motion vectors: Manually feeding x264 with my own motion data?: Short answer: No you can't feed in your motion estimation data to x264.
NVENC does, but it's very poorly documented (as usual) and it is per block... and blocks are small.

So I think we'll just have to handle scrolling without using video, unlike the current test patch which hooks in the video path. But we still want to only fire it when there is a good chance it will be useful (as we have to scan the whole picture, and then compare all the checksums)

@totaam
Copy link
Collaborator Author

totaam commented Jun 20, 2016

2016-06-20 09:31:51: antoine uploaded file detect-scrolling-v3.patch (6.5 KiB)

make it fast with cython zero-copy and CRC64 instead of sha1sum

@totaam
Copy link
Collaborator Author

totaam commented Jun 21, 2016

2016-06-21 16:39:50: antoine commented


Some preparatory work in r12877 and r12878, an updated patch is attached.

What works:

  • scrolling is detected, even multiple regions scrolling in different directions by different amounts each
  • the client can handle "scroll" paint messages and update the screen accordingly

Still todo:

  • prevent non-scroll updates whilst scrolling, they mess things up
  • probably want to reset the video encoder after scrolling, or at least force an I frame
  • handle sending more than one scroll group - not very hard
  • move scrolling outside video path, so we can scroll the window before applying the video mask (rounding of coordinates to even numbers)
  • handle refresh updates: we need to tell the refresh logic to update its list of damaged rectangles - hard! (maybe just refresh everything instead.. which is costly)
  • any issues with desktop scaling?
  • CRC64 is not fast enough: 20ms for scanning one 1080p image, assembly to the rescue? ([http://create.stephan-brumme.com/crc32/] or liblzma crc64_x86.S - pylzma is fast but does not work for memoryviews..), maybe use a different algo on 32-bit
  • support for scrolling in the pixmap backing? meh
  • support for scrolling in the html5 client?
  • sync paints: we should flush after all the updates are received
  • non-scroll packets bypass too much of the send accounting logic
  • also handle transparency
  • opengl: don't keep temporary fbo around for nothing, use the same code to tackle opengl backend should preserve fbo contents when resizing #478
  • maybe split the crc and pixel functions from cystats into its own module, may be imported from ximage?
  • would be nice to merge non-scroll regions separated by small gaps (less than N, N<=2? depends on image width?), as sending those using separate packets is a bit of a waste of CPU and bandwidth
  • make the client code handle any scrolling by default (both horizontal and vertical) - even if the server side may take much longer, same packet format will be easier!
  • interpolated smooth scrolling? if we know we're not getting more than 50 fps (batch delay 20ms) and our monitor can do 150fps, then we can do smoother scrolling by scrolling 3 times in smaller increments

@totaam
Copy link
Collaborator Author

totaam commented Jun 22, 2016

2016-06-22 03:39:40: antoine uploaded file scrolling-pixmap.patch (2.0 KiB)

failed attempt at implementing scrolling in the pixmap backing - just crashes randomly..

@totaam
Copy link
Collaborator Author

totaam commented Jun 22, 2016

2016-06-22 14:16:54: antoine commented


More preparatory work in r12881 + r12885 + r12886 + r12887, and tests in r12880 + r12882 + r12883 + r12884

@totaam
Copy link
Collaborator Author

totaam commented Jun 22, 2016

2016-06-22 17:10:32: antoine commented


Testing notes for my own benefit since I may have to get back to other things:

  • start the server with:
XPRA_VIDEO_SKIP_EDGE=1 XPRA_B_FRAMES=0 XPRA_ENCODING_STRICT_MODE=1 \
xpra start :100 --start=xterm
  • client:
xpra attach --speaker=off  --remote-logging=no --no-mmap --opengl=yes --desktop-scaling=no
  • launch test app or firefox and make sure to hide the xterm (avoid a GL paint bug with more than one window shown):
vglrun python ./tests/xpra/clients/test_gl_vscrollup.py 1 10000 50

Doing a focus change causes the whole window to be "repainted", without any actual pixel changes, the server correctly sees it:

encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 630, 480), '{0: 480, 1: 478, 2: 476, 3: .. 470, -4: 472, -3: 474, -2: 476}', [], [], {})
 will send scroll packets for yscroll={0: 480, 1: 478, 2: 476...}
 480 lines have not changed: '[0, 1, 2, 3, 4, 5, 6, 7, 8, .. , 474, 475, 476, 477, 478, 479]'

No data is sent to the client at all which is great.


Then the vscroll gl test application scrolls the window up by 50 pixels and repaints the bottom 50 pixels, again the server correctly identifies this:

encode_scrolling(XShmImageWrapper(BGRX: 0, 0, 630, 480), '{0: 380, 1: 379, 2: 378, 3: .. 385, -4: 384, -3: 383, -2: 382}', [], [], {})
 will send scroll packets for yscroll={0: 380, -60: ...  -52: 426, -51: 428, -50: 430, -49: 429, -48: 428, ..}
 scroll groups for distance=-50 : '[0, 1, 2, 3, 4, 5, 6, 7, 8, .. , 424, 425, 426, 427, 428, 429]'=[(0, 430)]
 non scroll: 1 packets: '[430, 431, 432, 433, 434, 4 .. , 474, 475, 476, 477, 478, 479]'=[(430, 50)]

And the client gets the matching paint requests:

do_scroll_paints(((0, 50, 630, 430, -50),), 1)
draw_region(0, 430, 630, 50, png, 604 bytes, 0, {'compress_level': 2}, [<function record_decode_time at 0x7f95402250c8>, <function after_draw_refresh at 0x7f9540225050>])

Yet the window appears black after this first scroll event.
That's weird considering that the client and the gl vscroll test app actually use the exact same code for painting.
The GL paint code seems to have issues when there is more than one window shown, maybe there are other issues too. Maybe fixing the pixmap backend would help confirm that the paint data is correct (or maybe also save it to file for manual inspection).

Ideally, we would want to choose the "scroll" encoding early, but the encoding selection logic code doesn't have access to the pixels.. This may have to be changed. We could then more efficiently discard unnecessary repaints.

@totaam
Copy link
Collaborator Author

totaam commented Jun 22, 2016

2016-06-22 17:40:55: antoine uploaded file scrolling-pixmap-v2.patch (2.6 KiB)

almost working pixmap implementation (needs XPRA_REPAINT_ALL=1), shows the same corruption

@totaam
Copy link
Collaborator Author

totaam commented Jun 26, 2016

2016-06-26 12:17:02: antoine uploaded file detect-scrolling-v16.patch (24.1 KiB)

updated patch

@totaam
Copy link
Collaborator Author

totaam commented Jun 26, 2016

2016-06-26 16:51:04: antoine commented


Lots of changes related to scrolling have been merged:

  • r12912 server side support
  • r12914 makes it possible to record and replay scrolling data
  • r12913 tests for the new motion class
  • r12901 sets XPRA_SCROLL_ENCODING=0 by default client-side since the gl backend has problems with more than one window..
  • r12894 adds horizontal scrolling support in client backends (not supported by the server as detection would be much more costly)
  • r12893 client backend work
  • r12897 + r12896 tests improvements
  • r12909 ensures that a refresh will also match the window size (even when the video encoder would round the dimensions down to an even number - as is the case with an xterm)

Still TODO:

  • fix opengl backing with multiple windows (and fix pixmap backing?)
  • refresh handling: maybe do something simple, like extending the current refresh regions by the maximum scroll distances
  • move scroll handling out of the video-only path, and make it work with the video region code
  • check for issues with desktop scaling or transparency
  • support for scrolling in the html5 client
  • non-scroll packets bypass too much of the send accounting logic: maybe return a list of packets from the encode function?
  • fix issues with b-frames allow B frames with video encoders #800

Maybes (see comment:2 for details):

  • reset the video encoder after scrolling
  • move scrolling outside video path
  • handle refresh updates "correctly", moving and splitting the list of damaged rectangles - hard
  • crc would be quicker if we did it whilst copying the image (when we call restride)
  • sub-images don't need to copy the pixels (use reference counting)
  • opengl: don't keep temporary fbo around for nothing, use the same code to tackle opengl backend should preserve fbo contents when resizing #478
  • merge non-scroll regions separated by small gaps
  • interpolated smooth scrolling

@totaam
Copy link
Collaborator Author

totaam commented Jun 27, 2016

2016-06-27 15:17:25: antoine commented


  • r12920 finally fixes the opengl client, and scrolling is now enabled by default, can still be disabled by setting XPRA_SCROLL_ENCODING=0
  • r12921 allows us to override the default XPRA_SCROLL_MIN_PERCENT=40, below this percentage, we won't be using scroll encoding, defaulting back to video
  • r12928 was missing from previous commits, you need this to build the new cython "motion" code

Until this is integrated more tightly, you need to run the server with XPRA_ENCODING_STRICT_MODE=1 to trigger the video code path, which is the place where the scrolling detection is currently hooked up.
You can get the scrolling debug with "-d scroll".
The color shown with opengl paint box debugging is orange, it only draws around the area we copied to (not from).

@totaam
Copy link
Collaborator Author

totaam commented Jul 12, 2016

2016-07-12 17:52:23: antoine commented


Milestone renamed

@totaam
Copy link
Collaborator Author

totaam commented Aug 20, 2016

2016-08-20 13:46:00: antoine commented


r13402 adds an SSE 4.2 accelerated CRC32 version which is a lot faster than the standard zlib.crc32 on supported hardware (tested on Skylake).

@totaam
Copy link
Collaborator Author

totaam commented Aug 20, 2016

2016-08-20 16:38:20: antoine commented


There was a silly bug in the code, which means that the performance was overestimated for crc32c. But as of r13405 we switch to xxhash via python-xxhash, and we get 7.5GB/s! (with some better tests)

@totaam
Copy link
Collaborator Author

totaam commented Aug 23, 2016

2016-08-23 09:42:03: antoine commented


Lots of updates: r13433, r13434, r13435, r13436.
scrolling detection is enabled again, and seems to work well enough. As of r13442 we only enable scrolling detection if "xxhash" is available (the alternatives are just too slow).
Still todo: #1257, try to manage conflicts between video regions and scrolling... very difficult to have both at the same time.

@totaam
Copy link
Collaborator Author

totaam commented Aug 25, 2016

2016-08-25 15:45:35: antoine changed status from assigned to new

@totaam
Copy link
Collaborator Author

totaam commented Aug 25, 2016

2016-08-25 15:45:35: antoine changed owner from antoine to afarr

@totaam
Copy link
Collaborator Author

totaam commented Aug 25, 2016

2016-08-25 15:45:35: antoine commented


Looks ready for a first round of testing.
How I tested: using firefox and scrolling around (without video playing at the same time). You can try to see what encoding is used with opengl paint debug ("Meta+Shift+F12" to turn on for a window), but since the scrolling will accumulate previous paint boxes... it can get messy quickly. A better way is to check what encodings are actually being used with xpra info | egrep "total_frames|total_pixels".
Then for extra verbose details, there's "-d scroll".

Important: you must have python-xxhash installed on the server for scroll detection.

@totaam
Copy link
Collaborator Author

totaam commented Aug 26, 2016

2016-08-26 10:37:33: antoine commented


FYI: the win32 and osx shadow servers now take advantage of scrolling detection to save a huge amount of bandwidth and CPU, they can be useful for testing and helped uncover an absolute monster of a bug: 13479

@totaam
Copy link
Collaborator Author

totaam commented Aug 29, 2016

2016-08-29 07:00:34: antoine commented


@totaam
Copy link
Collaborator Author

totaam commented Sep 7, 2016

2016-09-07 10:13:19: antoine commented


Just hit this:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/client/gl/gl_window_backing_base.py", line 514, in do_scroll_paints
    assert y+ydelta>=0 and y+h+ydelta<=bh, "vertical scroll by %i: rectangle %s overflows the backing buffer size %s" % (ydelta, (x, y, w, h), self.size)
AssertionError: vertical scroll by 299: rectangle (17, 2, 1002, 299) overflows the backing buffer size (1021, 591)

Not sure how to reproduce it.

@totaam
Copy link
Collaborator Author

totaam commented Sep 8, 2016

2016-09-08 17:30:20: antoine commented


The bug in comment:14 is caused by a mismatch between the server window size and the backing client-side (which can happen more easily with desktop-scaling enabled, ie: #1098), r13618 is a workaround for that.
r13617 also improves error handling end reporting.

@totaam
Copy link
Collaborator Author

totaam commented Sep 9, 2016

2016-09-09 12:05:53: antoine commented


Minor fix in r13623: we don't use scrolling for "real" video, which may use b-frames (#800). "real" video can be identified in xpra info as:

$ xpra info | grep video-mode
window.1.video_subregion.video-mode=True

@totaam
Copy link
Collaborator Author

totaam commented Sep 21, 2016

2016-09-21 09:14:43: antoine commented


Minor change: as of r13797 + r13799, we run the scrolling detection in more cases, even when we detected a video region that really does look like real video, but not if the API has been used to set this video region.

Notes:

@totaam
Copy link
Collaborator Author

totaam commented Oct 7, 2016

2016-10-07 23:37:29: maxmylyn changed owner from afarr to antoine

@totaam
Copy link
Collaborator Author

totaam commented Oct 7, 2016

2016-10-07 23:37:29: maxmylyn commented


Running a Fedora 23 r14042 trunk server with a Fedora 23 r14042 trunk client:

  • I'm starting my server with XPRA_SCROLL_ENCODING=1 XPRA_ENCODING_STRICT_MODE=1 xpra start :13 --no-daemon --start-child=firefox -d scroll --bind-tcp=0.0.0.0:2200 and connecting with XPRA_SCROLL_ENCODING=1 xpra attach tcp:ip:port

And, I'm not able to trigger the new encoding at all. I can tell this because running with XPRA_OPENGL_PAINT_BOX=1, upon scrolling all I see is h264, none of the new paints mentioned earlier.

Also, I keep seeing this pop up in the logs: [/attachment/ticket/1232/zero.log]

(..)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_video_source.py", line 1120, in calculate_scaling
    ffps = int(pixels/(width*height)/(time.time() - otime))
ZeroDivisionError: integer division or modulo by zero

Passing back to you with the tracebacks.

Also, I'd like (for myself, afarr, and anyone else interested) for you to indicate what flags and options are required to get the new scrolling working. As far as I can tell all that's required is XPRA_SCROLL_ENCODING=1 and XPRA_ENCODING_STRICT_MODE=1 on the server, and just scroll encoding on the client.

@totaam
Copy link
Collaborator Author

totaam commented Oct 8, 2016

2016-10-08 09:29:31: antoine uploaded file zero.log (3.6 KiB)

ZeroDivisionError in calculate_scaling

@totaam
Copy link
Collaborator Author

totaam commented Oct 8, 2016

2016-10-08 10:52:34: antoine changed owner from antoine to afarr

@totaam
Copy link
Collaborator Author

totaam commented Nov 20, 2016

2016-11-20 17:00:01: antoine changed priority from major to blocker

@totaam
Copy link
Collaborator Author

totaam commented Nov 20, 2016

2016-11-20 17:00:01: antoine commented


Confirmed and fairly easy to reproduce.
r14465 improves the code and adds XPRA_OPENGL_SAVE_BUFFERS to the opengl client, making it possible to save the state of the client-side FBO, frame by frame.

Other TODO:

  • scroll data size mismatch: 876x1027 vs 876x1031 - we could trim or pad the checksum array

@totaam
Copy link
Collaborator Author

totaam commented Nov 21, 2016

2016-11-21 09:07:56: antoine changed status from assigned to new

@totaam
Copy link
Collaborator Author

totaam commented Nov 21, 2016

2016-11-21 09:07:56: antoine changed owner from antoine to maxmylyn

@totaam
Copy link
Collaborator Author

totaam commented Nov 21, 2016

2016-11-21 09:07:56: antoine commented


More changes in r14466, see commit message.
Using the newly introduced XPRA_SAVE_VIDEO_FRAMES=jpeg server side switch and the existing XPRA_OPENGL_SAVE_BUFFERS=jpeg client side switch.

How to use this cool new feature (should go on the wiki somewhere) for debugging any video picture issues:

  • enable both env vars
  • enable "compress" debug logging (and more? "scroll", whatever)
  • wait for problem to occur
  • inspect the recent sequence of images until we see the exact point where things don't look right
  • grab the log sample around that point in time (the image filenames use the same format as the log timestamp)

Back to the bug: r14467 fixes that, I can no longer reproduce it.
Note however that we just repaint the area, it may still appear out of order (looks like the picture is flickering a bit when it does), but the next screen refresh fixes it.

I'm still seeing the occasional lower than desired visual quality when scrolling very very fast. It should go back up more quickly than it does after you stop scrolling - but this is not really scrolling related and belongs in another ticket (related to #1257 and #1135).

It would be nice if we could also update the image checksum when we process the next auto-refresh (scrolling is currently considered lossy), but this would be quite hard to implement.

@totaam
Copy link
Collaborator Author

totaam commented Nov 21, 2016

2016-11-21 23:47:58: maxmylyn commented


Upped to r14467:

Honestly, it looks and behaves much better now - I'm not seeing any more of the issues I described earlier.

Even after almost a full work day of using it, I still can't find any outstanding issues.

Passing back to you; probably for closure.

@totaam
Copy link
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 06:10:41: antoine changed status from new to closed

@totaam
Copy link
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 06:10:41: antoine set resolution to fixed

@totaam
Copy link
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 06:10:41: antoine commented


Let's finally close this.

@totaam totaam closed this as completed Nov 22, 2016
@totaam
Copy link
Collaborator Author

totaam commented Feb 2, 2017

2017-02-02 11:33:06: antoine commented


Follow up in #1426 and #1320

@totaam
Copy link
Collaborator Author

totaam commented Apr 25, 2019

2019-04-25 07:28:44: antoine commented


See also #2248: more aggressive use of "scroll" encoding.

@totaam
Copy link
Collaborator Author

totaam commented May 8, 2020

2020-05-08 15:11:30: antoine commented


A bug has been fixed 3 years later! #2757

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant