Sixel support #371

Closed
joseluis opened this issue Apr 25, 2021 · 56 comments
Labels
enhancement New feature or request

Comments

@joseluis

Several terminals already support Sixels, like xterm (in vt340 mode), wezterm, mlterm, foot, (kitty has its own thing) while others are working on it like Alacritty and Microsoft Terminal.

The thing is, right now there's not yet a terminal multiplexer that really supports sixels. Screen and tmux don't and probably never will. (Although there's been a series of tmux forks trying to fix that (latest), they're unofficial, unstable, and personally I've never been able to make them work.)

It would be beyond fantastic if Zellij were the first one to support this feature, and helped bridge the gap on the road to a future where CLI apps can display pixel graphics inside terminal emulators. Soon more interesting apps will be created that make use of it.

Disclaimer: I'm currently developing the Rust bindings for notcurses which will feature pixel support in the next version 2.3.0.

@a-kenji a-kenji added the enhancement New feature or request label Apr 25, 2021
@a-kenji
Contributor

a-kenji commented Apr 25, 2021

Thank you for this issue and for putting this on our radar, I think this would be fantastic.

@qballer
Member

qballer commented Apr 25, 2021

We discussed this a couple of months back and would surely like to support something like this in order to enable video/image rendering in the terminal though I'm not sure this is going to take priority. IMO Zellij has some way to go in terms of compatibility, performance, and UX before we continue to venture into this (very exciting!) option. Other maintainers might have other thoughts in this regard

@joseluis
Author

Cool. I'm happy to confirm you are as excited about this as I am, and I also understand there are more urgent issues to tackle before working on this. I just wanted to make sure you knew about it.

@dankamongmen

this is great, though if you want to do serious video and are willing to blaze a new trail, the sixel protocol is sadly terrible, and we could do much, much better. i'm the lead developer of Notcurses, pretty much the only game in town when it comes to mixing glyphs and bitmaps freely, and have some thoughts on the topic. in the meantime you might want to check out this essay. if you'd like to talk, mail me at nickblack@linux.com, or respond here.

if you want to stick to something that already exists, the kitty protocol is itself in many ways superior to sixel, and i heartily recommend implementing it (ideally in its 0.20 version, including the animation method) over sixel. sixel is kinda fundamentally underspecified and very difficult to cleanly mix with text (i've had to go through an obscene number of hoops to unify Sixel+Kitty). anyway, sorry to butt in.

@bluss

bluss commented May 6, 2021

Some discussion of interoperability/terminal protocol here https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/12

iTerm2's protocol seems to have buy-in from several terminals too.

I want to mention a use case: IPython REPL with inline plots and other image output, including latex math. This already works in terminals, outside of multiplexers, but would be very valuable inside. For me that's a more tangible usecase than just a hack to play videos in the terminal 🙂 . (Animations/videos are a common output in IPython too, btw).

@ghost

ghost commented May 31, 2021

The thing is, right now there's not yet a terminal multiplexer that really supports sixels.

Jexer has supported sixel in a terminal multiplexer environment for a couple years now. XtermWM is an easy demo of those capabilities.

Some notes from that time:

For zellij, you would be looking at both decoding and encoding:

  • Decoding is pretty easy. My decoder is super-verbose and highly-commented Java, yet still less than 1000 LOC (jexer.tterminal.Sixel.java).
  • The encoder is quite a bit worse. Instead of looking at mine (which is in jexer.backend.ECMA48Terminal.java), I would suggest checking out @dankamongmen 's work on sixel support in notcurses, beginning here. I got mine up to the "works alright for me" stage, but getting more performance out of it would require some non-trivial rework around the way I'm using Java BufferedImages; Nick's work would be a better starting point for new projects.
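
To give a sense of why the decoding side is described as the easier half, here is a minimal sketch of a sixel data decoder. It is not taken from Jexer, wezterm, or Zellij; the names `decode_sixel` and `read_number` are hypothetical. It assumes the payload has already been stripped of the DCS introducer and the terminating ST, and it deliberately skips color definitions and raster attributes instead of interpreting them.

```rust
use std::collections::HashMap;

/// Reads a run of ASCII digits starting at `*i` and advances `*i` past it.
fn read_number(bytes: &[u8], i: &mut usize) -> usize {
    let mut n = 0usize;
    while *i < bytes.len() && bytes[*i].is_ascii_digit() {
        n = n.saturating_mul(10).saturating_add((bytes[*i] - b'0') as usize);
        *i += 1;
    }
    n
}

/// Decodes a sixel payload into a sparse map of (x, y) -> color register.
/// Sketch-only simplification: `#` color definitions and `"` raster
/// attributes are skipped, so only register numbers are recorded.
pub fn decode_sixel(payload: &str) -> HashMap<(usize, usize), u16> {
    let bytes = payload.as_bytes();
    let mut pixels = HashMap::new();
    let (mut x, mut band, mut color) = (0usize, 0usize, 0u16);
    let mut repeat = 1usize;
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        i += 1;
        match b {
            b'#' => {
                // Select a color register, then skip any ";Pu;Px;Py;Pz" definition.
                color = read_number(bytes, &mut i) as u16;
                while i < bytes.len() && (bytes[i] == b';' || bytes[i].is_ascii_digit()) {
                    i += 1;
                }
            }
            b'"' => {
                // Skip raster attributes entirely in this sketch.
                while i < bytes.len() && (bytes[i] == b';' || bytes[i].is_ascii_digit()) {
                    i += 1;
                }
            }
            // "!<n>" repeats the next data character n times.
            b'!' => repeat = read_number(bytes, &mut i).max(1),
            // "$" returns to the start of the current six-pixel band.
            b'$' => x = 0,
            // "-" moves to the start of the next band.
            b'-' => {
                x = 0;
                band += 1;
            }
            // Data characters encode six vertical pixels, lowest bit on top.
            b'?'..=b'~' => {
                let bits = b - b'?';
                for _ in 0..repeat {
                    for row in 0..6 {
                        if bits & (1 << row) != 0 {
                            pixels.insert((x, band * 6 + row), color);
                        }
                    }
                    x += 1;
                }
                repeat = 1;
            }
            _ => {}
        }
    }
    pixels
}
```

A real decoder would also track the palette and handle malformed input, which is where most of the remaining lines in a production implementation tend to go.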

@timsofteng

Any updates?

@imsnif
Member

imsnif commented Jun 26, 2021

Any updates?

No one that I'm aware of is working on this. If anyone's interested and needs some guidance to get around the app - I'd be happy to help.

@ghost

ghost commented Jun 26, 2021

FYI if you want a start on the decoding side, wezterm's sixel decoder is in terminalstate.

@ghost ghost mentioned this issue Jan 17, 2022
@imsnif
Member

imsnif commented Mar 26, 2022

I'm starting to look into this again. Been reading all the helpful content linked to in this issue (thanks all!) and doing some basic tinkering. Talking about Sixel, I think the major question I have for @dankamongmen, @AutumnMeowMeow and @joseluis is if the behaviour of the sixel images on screen is defined somewhere that I'm missing?

Questions like:

  1. Should they be mapped to specific cells? If so, what happens when lines are wrapped?
  2. Does video work by only re-rendering the damaged areas, and if so what happens when the user scrolls or adds line-breaks for example?
  3. Are these sort of things the responsibility of the video app/tool or the terminal emulator?

The reason I'm asking these is that a way I'm interested in taking the Zellij implementation in is to automatically open any rendered sixel assets in a new floating pane. Logically this floating pane would "belong" to the terminal pane that opened it, any updates to that area would be happening in it, only it would be draggable and in a different place. This seems to me like a better UX than most of the implementations I've encountered - but I'm new to this protocol and am wondering if this would break other assumptions and expectations people usually have when developing/using Sixel?

@dankamongmen

i don't believe there to be a single place where sixel behavior is canonically outlined in all its detail. there are a few places to look:

  • obviously i'm biased, but as far as i'm aware, notcurses has the most complete definition of how things work in various software terminals. jexer is very likely just as complete. having authored the former, i can confirm that there is very real divergence in existing implementations.
  • @j4james and @hackerb9 are building up a thorough historical record of hardware terminal behavior in hackerb9/vt340test, based on observation of real hardware
  • the DEC manuals have plenty of information about sixel, but are IMHO superseded by the work of @j4james/@hackerb9 mentioned above

alacritty's ayosec/graphics branch is highly divergent from other implementations in terms of sixel removal (but has not yet been merged, and might never be). for other implementations, you've got the following basic semantics:

(a) painting a sixel is done as a single logical event, distinct from events before and after
(b) text emitted after the sixel annihilates the target cell's sixel content
(c) text emitted before the sixel is annihilated by sixel content unless the sixel is blitted with transparency
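
Semantics (b) and (c) can be modeled on a per-cell grid. The following is a rough sketch under simplifying assumptions, not a description of any particular terminal: the `Cell` and `Grid` types are hypothetical, each sixel paint is identified by an arbitrary `id`, and transparency is treated as a whole-image flag even though real sixel transparency is per pixel.

```rust
/// One character cell: an optional glyph and an optional reference to the
/// sixel paint event currently covering it.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Cell {
    pub glyph: Option<char>,
    pub image: Option<u32>,
}

pub struct Grid {
    width: usize,
    height: usize,
    cells: Vec<Cell>,
}

impl Grid {
    pub fn new(width: usize, height: usize) -> Self {
        Grid { width, height, cells: vec![Cell::default(); width * height] }
    }

    pub fn cell(&self, x: usize, y: usize) -> &Cell {
        &self.cells[y * self.width + x]
    }

    /// (b): text written after a sixel destroys the sixel content in that cell.
    pub fn put_text(&mut self, x: usize, y: usize, glyph: char) {
        let cell = &mut self.cells[y * self.width + x];
        cell.glyph = Some(glyph);
        cell.image = None;
    }

    /// (a) + (c): one paint event covers a rectangle of cells (clipped to the
    /// grid); earlier text survives only if the image is transparent.
    pub fn put_image(&mut self, id: u32, x: usize, y: usize, w: usize, h: usize, transparent: bool) {
        for row in y..(y + h).min(self.height) {
            for col in x..(x + w).min(self.width) {
                let cell = &mut self.cells[row * self.width + col];
                cell.image = Some(id);
                if !transparent {
                    cell.glyph = None;
                }
            }
        }
    }
}
```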

to answer your questions:

  1. yes; this is necessary to effect (b) and (c) above. when the display is scrolled, all sixels are scrolled, unless they're not, according to decsdm. in the alternate screen, they ought not be scrolled.
  2. each time the sixel is emitted, you're going to a coordinate beforehand. so i emit frame 0 at 0,0. user scrolls. it gets scrolled up, sure. maybe i can detect this, maybe i can't; if i do, maybe i clear and redraw. maybe i don't, figuring the next frame will handle it. either way, next frame comes up, i once again go to 0,0 to emit frame 1. so it all works out in the end.
  3. which kind of thing? i think anything atop the semantics i outlined above are the client's job, simply by virtue of i'm unfamiliar with any assistance currently being rendered. in general, a full-screen TUI application is not going to be admitting scroll events, so it's moot?

@hackerb9

Hi @imsnif,

I have a DEC VT340+, which I believe was the ultimate hardware sixel terminal emulator Digital sold. I'm always happy to test programs to answer any questions which the documentation is vague about. Just file an issue at github.com/hackerb9/vt340test.

1. Should they be mapped to specific cells? If so, what happens when lines are wrapped?

Typically sixels are mapped to specific cells, but it may be possible to abstract that away for a terminal multiplexer. Regardless, sixels never wrap.

3. Are these sort of things the responsibility of the video app/tool or the terminal emulator?

Your video app/tool must handle everything. Sixel is great for command line use (the stereotypical glass TTY with past commands scrolling up as if on infinite fan-fold paper). But it is too low-level for the sort of full-screen interaction you are talking about. That's why @dankamongmen's notcurses is such a welcome library.

2. Does video work by only re-rendering the damaged areas, and if so what happens when the user scrolls or adds line-breaks for example?

By "damaged", you mean the delta, right? You can re-render just the changed areas, if you want. Sixel supports 1-bit transparency, same as GIF, so it is possible for video to be optimized quite a bit further than current video players take advantage of. However, I wrote the simplest possible video player which splats every frame of sixels and was surprised that performance was more than sufficient when playing local files. (Someday, when I get around to it, I'll optimize it to work better over ssh.)

If the user scrolls or adds line breaks, it would cause damage your application wouldn't know about. Since the application is responsible, you'll have to find another way around the problem.

You can either prevent that from happening or you can simply send the whole I-frame every once in a while. I believe notcurses is the answer you're looking for as it should make it trivial to prevent the user from causing the damage in the first place.

The reason I'm asking these is that a way I'm interested in taking the Zellij implementation in is to automatically open any rendered sixel assets in a new floating pane. Logically this floating pane would "belong" to the terminal pane that opened it, any updates to that area would be happening in it, only it would be draggable and in a different place. This seems to me like a better UX than most of the implementations I've encountered - but I'm new to this protocol and am wondering if this would break other assumptions and expectations people usually have when developing/using Sixel?

I'm not sure I understand what you're saying. Unlike Kitty, sixel has no "assets". (Okay, technically, one can store sixel images in page memory, but that's probably not what you're talking about.) Are you thinking of something like XTerm's floating Tek4014 pane for drawing graphics? That's probably not what you mean, but if so, Tektronix graphics was a different protocol and treating sixels that way would certainly break my assumptions as a developer. To me, the biggest benefit of sixel is that it is integrated with the text.

@dankamongmen

Sixel supports 1-bit transparency, same as GIF, so it is possible for video to be optimized quite a bit further than current video players take advantage of.

[blink] you know what, i don't think i'd considered this. maybe i had? i don't think my data model can take advantage of it immediately, but definitely something to keep in the back of my head.

@dankamongmen

Your video app/tool must handle everything. Sixel is great for command line use (the stereotypical glass TTY with past commands scrolling up as if on infinite fan-fold paper). But it is too low-level for the sort of full-screen interaction you are talking about. That's why @dankamongmen's notcurses is such a welcome library.

oh yeah; i had assumed that you couldn't use notcurses for some reason, but if you can, its entire reason for existence is abstracting all this crap away

@timsofteng

Hello. I'm no expert on this topic, but I saw this repo about a new terminal image protocol implementation.

https://github.com/contour-terminal/terminal-good-image-protocol

And this article:
https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/26

The point is that sixel is neither the best nor the only image protocol.

@dankamongmen

Hello. I'm no expert on this topic, but I saw this repo about a new terminal image protocol implementation.

https://github.com/contour-terminal/terminal-good-image-protocol

And this article: https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/26

The point is that sixel is neither the best nor the only image protocol.

sure, you might enjoy reading https://nick-black.com/dankwiki/index.php?title=Theory_and_Practice_of_Sprixels, which discusses how these various backends can be united by a client library

@timsofteng

@dankamongmen thank you!
What is your favorite image protocol for the terminal?

@dankamongmen

@dankamongmen thank you! What is your favorite image protocol for the terminal?

i sat down to design my ideal protocol, and it ended up looking so much like kitty's that i just recommend using that. kitty's is far superior to sixel imho, save with regard to portability. of course if you use notcurses, it handles various backends for you =].

@timsofteng

timsofteng commented Mar 27, 2022

i sat down to design my ideal protocol, and it ended up looking so much like kitty's that i just recommend using that. kitty's is far superior to sixel imho, save with regard to portability. of course if you use notcurses, it handles various backends for you =].

Thanks! looks like my favorite terminal emulator foot doesn't support kitty image protocol...

btw, fzf is based on notcurses, right? Slightly off-topic, but do you know whether it supports image previews with any of the image protocols?

@dankamongmen

Thanks! looks like my favorite terminal emulator foot doesn't support kitty image protocol...

foot absolutely supports sixel

btw, fzf is based on notcurses, right? Slightly off-topic, but do you know whether it supports image previews with any of the image protocols?

it does not use notcurses. it uses both sixel and (iirc) kitty itself.

@imsnif
Member

imsnif commented Mar 28, 2022

@dankamongmen @hackerb9 - thanks for the explanations. I think I'm starting to get the hang of what's required of me.

(also @hackerb9 - I didn't know about the Xterm floating pane, this looks really cool! I did mean something very similar, only rendered as a floating zellij pane - you can see something similar in the gif on this repo)

A few more questions, if I may:

  1. How about Sixel on Sixel? I've observed two different behaviours when testing out some stuff: one is to delete the entire cell (wezterm iirc), the other is to kind of merge the graphics into the existing sixel (xterm with flag iirc)... I guess this is one of the undefined corners?
  2. How are the " Raster attributes handled? I've also seen differing behaviour on whether they apply to the entire sixel sequence or to whatever comes after them (as the specification seems to imply? https://vt100.net/docs/vt3xx-gp/chapter14.html)
  3. Also regarding Raster attributes: should they actually be handled? Should I query to find a pixel to cell ratio?

About using notcurses:
Currently I'm leaning toward rolling our own implementation (and breaking out whatever parts make sense as external crates), but I'm not 100% decided yet. My main concerns are:

  1. What I essentially need is something that would translate between the on-wire sixel bytes from the pty and our internal state (and the other way, ofc). Since we would have to develop the latter anyway (seeing as sixels should be bound to cells), and our rendering engine would need to be aware of all the particularities anyway as well, I don't think we would be saving a lot of work here. Maybe I'm missing something though, having not yet waded into all the implementation details.
  2. Depending on an external C library would be possible, but might complicate a lot of our build and distribution processes. Not a deal breaker, but def a concern.
  3. I'm a tiny bit concerned about the performance implications of translating the state in 1, but this can also be handled if we decide to go this route.

@dankamongmen

What I essentially need is something that would translate between the on-wire sixel bytes from the pty and our internal state (and the other way, ofc). Since we would have to develop the latter anyway (seeing as sixels should be bound to cells), and our rendering engine would need to be aware of all the particularities anyway as well, I don't think we would be saving a lot of work here. Maybe I'm missing something though, having not yet waded into all the implementation details.

so i took a look through your code, and i think i agree with what you say here: given your current infrastructure, it makes sense for you to do it. what i think would make the most sense, though (and obviously once again i'm biased), is for you to undo some of your current infrastructure, and adopt notcurses for it. your entire system of planes and subwindows and z-axes are notcurses's bread and butter. as i wrote in chapter 8 of the book:

It’s useful to note that there is never a need for more than one plane, as demonstrated by the simple fact that each rendered frame is a two-dimensional area. Planes do not add power to the system: any algorithm which can be expressed using multiple planes can be expressed using a single plane and external state. Instead, they add expressiveness, and supply order to the state needed beyond the data present in the rendered frame. The color blending performed for transparent or translucent planes can be simulated by the programmer. Redrawing the parts of an underlying plane exposed by moving another can be managed by the programmer. Mapping a base glyph to null cells of different planes can be done by keeping an index for each null cell, etc. etc. In a great many cases, this external structure would be reproducing the algorithms and data structures of planes. Planes are provided as a concise and efficient implementation of the codes frequently necessary to implement TUIs. The rule for using planes is thus simply to use a plane whenever you find yourself implementing code already provided by planes.

you could eliminate a ton of your code if you used ncplanes directly, and quite probably imho save yourself a lot of horrible uninteresting little bugs that i've addressed over the past three years (especially once you get into e.g. wide glyphs and RTL).

all this sixel-atop-sixel annoyance would be handled for you, and you'd get kitty, etc.

of course, if you're uninterested in doing that, i can understand.

@imsnif
Member

imsnif commented Mar 28, 2022

you could eliminate a ton of your code if you used ncplanes directly, and quite probably imho save yourself a lot of horrible uninteresting little bugs that i've addressed over the past three years (especially once you get into e.g. wide glyphs and RTL).

We support wide-glyphs (bugs still pop up here and there, but it's becoming less of an issue with time). I'd be curious to hear more about the concerns of RTL languages (I speak one and as far as I'm aware aside from the wide-glyph part, dealing with the directionality is mostly the app's responsibility?) but maybe we can do this in another issue.

About the library itself: not using an external library for rendering is definitely a conscious decision I've made early on. Granted, as you say this causes a lot of grief (more than I was initially aware of), but since this is our bread and butter as well, I think it allowed us to add lots of specialized features and performance optimizations that I'm super happy about. I'm not saying we do them better, just that it's good to have this sort of freedom in our core offering.

@joseluis
Author

As a possibility, it may be worth considering offering alternative pixel support behind a non-default notcurses feature, which would depend on the libnotcurses-sys that I'm a maintainer of, so I'd be happy to help with any needs.

Ideally, a future zero-dependency, 100% Rust sixel (& kitty) solution for pixels in the terminal would be very nice to have, especially as a separate library, but that will probably require many more hours of work.

@imsnif
Member

imsnif commented Mar 28, 2022

As a possibility, it may be worth considering offering alternative pixel support behind a non-default notcurses feature, which would depend on the libnotcurses-sys that I'm a maintainer of, so I'd be happy to help with any needs.

Maybe I'm not understanding something, but wouldn't this also require us to do just as much work translating the notcurses state to our state? Or alternatively changing many parts of our architecture to use notcurses instead?

@joseluis
Author

Maybe I'm not understanding something, but wouldn't this also require us to do just as much work translating the notcurses state to our state? Or alternatively changing many parts of our architecture to use notcurses instead?

That may very well be the case; I've not yet investigated how this library is built internally. If the work needed in both cases turns out to be similar while achieving similar compatibility and feature set, it would probably be worth going full Rust.

@AutumnMeowMeow
Contributor

(I am generally away from F/OSS for the time being (all projects archived), so may be quite sporadic or late on responses. Trans stuff. Sorry. 🤷‍♀️)

@imsnif @joseluis

As @dankamongmen and I have both seen, sixel implementations are quite different by terminal. When you get to testing, I would recommend testing on xterm first, and then: mlterm, foot, mintty, wezterm. (Do it without manipulating DECSDM at all. Assume that the bottom text row is not available for sixel images.) If your solution works the same on all those, then you should be able to justify filing bugs on any others that look different.

Two main issues are:

  1. The implicit Z axis: image-over-image, image-over-text, text-over-image. Assuming a right-hand coordinate system with positive Z going into the screen (away from user), then most terminals place text at Z=1 and image data at Z=0. Text usually fully destroys image, image usually overwrites-with-transparency image, and image-with-transparency is drawn over text. But not always: on xterm the cursor position will expose text under the image. And konsole differs on "default background" (SGR 49) vs explicit background, so (as of last month, haven't looked more recently) it will place image at Z=1, text at Z=0, and show image underneath the text, IF the background color is "default" via SGR 49 or happens to match the same RGB colors as background.
  2. More importantly is image destruction. Most terminals have different or undefined behavior when an image is overwritten by something else. Enough that I designed jexer's approach to assume that touching any text cell on an image could destroy the entire image. (Which breaks for konsole, because of the point above. But works everywhere else I have tested.)

Point 2 above is also a serious challenge for terminals when you use lots of small images as jexer does now, and notcurses might in the future if it goes to a mosaics design: almost all of the terminals I tested had to fix crashes and bad screen artifacts, due to not being designed from the get-go for frequent cell-sized overwrites of larger images.

Which answers the animation question: for sixel, you must replace the changed area for each frame. The 1-bit transparency for GIF-like animations would likely work for the majority of terminals, but not wezterm. I would be unsurprised if it also exposed memory pressure/bugs on others.

Raster attributes: @j4james would probably know better than anyone, but so far I have not used or seen any cases (on current terminals) for a non-1:1 pixel aspect ratio. The image width/height are used though, for defining rectangles of current background color. Though some of the DEC standards suggest staying within raster attributes, in practice you can always go outside them: they are a minimum size of the image pixel data, not a maximum.
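
The "minimum, not maximum" reading of raster attributes can be sketched as follows. This is an illustration under assumptions rather than any terminal's actual logic: `RasterAttributes`, `parse_raster_attributes` and `effective_size` are hypothetical names, and the parser assumes all four parameters of `"Pan;Pad;Ph;Pv` are present.

```rust
/// The four raster attribute parameters that follow the `"` introducer.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct RasterAttributes {
    pub pan: usize,    // pixel aspect ratio numerator
    pub pad: usize,    // pixel aspect ratio denominator
    pub width: usize,  // declared width in pixels (Ph)
    pub height: usize, // declared height in pixels (Pv)
}

/// Parses a parameter string such as "1;1;100;50".
/// Sketch-only assumption: all four parameters must be present and numeric.
pub fn parse_raster_attributes(params: &str) -> Option<RasterAttributes> {
    let mut parts = params.split(';').map(|p| p.trim().parse::<usize>().ok());
    let pan = parts.next()??;
    let pad = parts.next()??;
    let width = parts.next()??;
    let height = parts.next()??;
    Some(RasterAttributes { pan, pad, width, height })
}

/// Treats the declared size as a minimum: pixel data drawn outside the
/// declared rectangle still enlarges the image.
pub fn effective_size(attrs: Option<RasterAttributes>, data_extent: (usize, usize)) -> (usize, usize) {
    match attrs {
        Some(a) => (a.width.max(data_extent.0), a.height.max(data_extent.1)),
        None => data_extent,
    }
}
```

With a declared size of 100x50 and pixel data reaching 120x30, this yields 120x50: the declared height pads the image and the data widens it.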

Within the pty: implement the stuff on the bottom of this and the applications inside will be good to go.

On the encoder to the terminal side:

  • notcurses and chafa both feature wicked fast encoders. Both also feature nice developers who I'm sure could provide guidance on efficient encoding. 😉
  • I loved the math behind chafa's solution, really a lightbulb moment (plus eigenvalues!), so I modeled my new one after that, and it can be run standalone. (Mine is still super slow though. I blame Java. Yours will smoke it.)
  • Doing small tiles/mosaics lets you use different custom palettes for each small image piece, so you can easily get something that looks like 16-bit depth in practice. (And you do want to use different palettes. My first design (LegacySixelEncoder) using a uniform palette was a mistake.)
  • Assume a 1000x1000 max pixel limit, due to xterm. You can also use XTSMGRAPHICS to get the true max image size. Break up larger images as needed on output.
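
Breaking up larger images on output could look roughly like the sketch below. `Tile` and `split_into_tiles` are hypothetical names, and rounding the tile height down to a multiple of six (so tiles line up with sixel bands) is a design choice of this sketch, not something prescribed above. It assumes `max_side` is at least 6.

```rust
/// A rectangular piece of a larger image, in pixel coordinates.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Tile {
    pub x: usize,
    pub y: usize,
    pub width: usize,
    pub height: usize,
}

/// Splits a `width` x `height` image into tiles no larger than `max_side`
/// on either axis, row by row from the top-left.
pub fn split_into_tiles(width: usize, height: usize, max_side: usize) -> Vec<Tile> {
    let tile_w = max_side.max(1);
    // Align tile height to whole six-pixel sixel bands (assumes max_side >= 6).
    let tile_h = ((max_side / 6) * 6).max(6);
    let mut tiles = Vec::new();
    let mut y = 0;
    while y < height {
        let h = tile_h.min(height - y);
        let mut x = 0;
        while x < width {
            let w = tile_w.min(width - x);
            tiles.push(Tile { x, y, width: w, height: h });
            x += w;
        }
        y += h;
    }
    tiles
}
```

In practice `max_side` would come from the conservative 1000-pixel assumption or from whatever limit the terminal reports.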

Have fun! 🐱

💗

@imsnif
Member

imsnif commented Mar 29, 2022

Thanks for all the details @AutumnMeowMeow ! I'm going to start hacking on this and will totally hit you (and the others on this thread) up with questions as they come. Hope you are well.

@imsnif
Member

imsnif commented Apr 12, 2022

More progress! Just merged #1316 which adds support for XTWINOPS 14 + 16 (we already supported 18). This both queries the terminal emulator in which Zellij runs for this pixel data on startup and SIGWINCH - so that we can know how many pixels fit inside a character cell when rendering Sixels, and also responds to similar queries from terminals running inside Zellij panes (adjusted for their size ofc).
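
For readers unfamiliar with these queries: as documented for xterm, `CSI 14 t`, `CSI 16 t` and `CSI 18 t` are answered with `CSI 4 ; height ; width t`, `CSI 6 ; height ; width t` and `CSI 8 ; rows ; cols t` respectively. The sketch below parses such replies; it is not the code from #1316, and `SizeReport`, `parse_size_report` and `derive_cell_pixels` are hypothetical names. The fallback of dividing the text area by the cell count when a terminal does not answer `CSI 16 t` is an assumption of this sketch.

```rust
/// A parsed reply to one of the XTWINOPS size queries.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum SizeReport {
    /// Reply to `CSI 14 t`: text area size in pixels.
    TextAreaPixels { height: usize, width: usize },
    /// Reply to `CSI 16 t`: character cell size in pixels.
    CellPixels { height: usize, width: usize },
    /// Reply to `CSI 18 t`: text area size in character cells.
    TextAreaCells { rows: usize, cols: usize },
}

/// Parses a complete reply such as "\x1b[6;20;10t".
pub fn parse_size_report(reply: &str) -> Option<SizeReport> {
    let body = reply.strip_prefix("\x1b[")?.strip_suffix('t')?;
    let mut parts = body.split(';').map(|p| p.parse::<usize>().ok());
    let kind = parts.next()??;
    let first = parts.next()??;
    let second = parts.next()??;
    match kind {
        4 => Some(SizeReport::TextAreaPixels { height: first, width: second }),
        6 => Some(SizeReport::CellPixels { height: first, width: second }),
        8 => Some(SizeReport::TextAreaCells { rows: first, cols: second }),
        _ => None,
    }
}

/// Fallback: derive (cell height, cell width) in pixels from the text area
/// size in pixels and in cells.
pub fn derive_cell_pixels(area: SizeReport, cells: SizeReport) -> Option<(usize, usize)> {
    match (area, cells) {
        (SizeReport::TextAreaPixels { height, width }, SizeReport::TextAreaCells { rows, cols })
            if rows > 0 && cols > 0 =>
        {
            Some((height / rows, width / cols))
        }
        _ => None,
    }
}
```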

@imsnif
Member

imsnif commented Apr 25, 2022

Making good progress - and unless there are surprises, I think the bulk of the work is behind me at this point.

I created: https://github.com/zellij-org/sixel-image - a sixel serializer / deserializer which (if I dare say so myself) is pretty fast. Takes me less than 300ms to serialize and deserialize the infamous lady-of-shalott image from here: https://jexer.sourceforge.io/sixel.html (including writing it to the HD). Haven't done any thorough benchmarking though, so just a first impression.

I'm hoping that having this as an external pure-rust crate with very few dependencies will encourage use outside of Zellij as well in the future.

I have a local branch using this to successfully render images into Zellij panes, but there are naturally some adjustments to work out. I'll keep this thread posted.

@imsnif
Member

imsnif commented Apr 25, 2022

I do have a question for @j4james meanwhile: I implemented raster attributes so that the image is padded with the current background color to their full width/height. I thought this was what was suggested in this thread, but both implementations I've seen (xterm with the flag and wezterm) don't do this. What do you think is the most common and expected approach?

@j4james

j4james commented Apr 25, 2022

The expected approach would be for you to match the behavior of the VT340 as demonstrated by the test cases here:

https://github.com/hackerb9/vt340test/blob/main/j4james/raster_dimensions.sh
https://github.com/hackerb9/vt340test/blob/main/j4james/raster_dimensions.png

The most common approach, though, is to studiously ignore the specifications and do whatever is most convenient. This need not be the same as anyone else, since they'll all be doing something different anyway.

@imsnif
Member

imsnif commented Apr 26, 2022

The most common approach, though, is to studiously ignore the specifications and do whatever is most convenient. This need not be the same as anyone else, since they'll all be doing something different anyway.

That's what I like hearing. :)

Seriously though - since Zellij is an intermediary here, I want to try and make our behaviour in this regard as predictable as possible. I'd even be happy to go with "what most people do". That being said, I'll throw some stuff together and release experimental support ASAP and might ping you to give it a try and let me know what you think if you're willing.

@j4james

j4james commented Apr 26, 2022

As an intermediary, the ideal approach would be for you to interpret the protocol correctly when parsing images into your internal buffer, but use a dumbed-down version of sixel when forwarding that content to the downstream clients. On the parsing side, though, most people probably won't care as long as you can handle a basic sixel image without exploding.

@imsnif
Member

imsnif commented Apr 26, 2022

Yeah, that's kind of the approach I've been following so far. For example, when we interpret a raster attribute with padding, we pad the internal buffer, but when we serialize it outside we add the padding explicitly rather than use a raster attribute.

@AutumnMeowMeow
Contributor

An interesting approach on deserialize/serialize, in that you aren't actually quantizing a 24-bit image into sixel. That drastically reduces the work for the terminal multiplexing case. (FYI hanging on to the palette/registers is also what xterm does internally -- which is great for sixel, but a pain for adding support for 24/32-bit images.) It makes layered transparency of the text-window-over-image variety really straightforward too: just alpha-blend the background color of a floating window against the palette, and otherwise serialize as normal.
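
The palette-blending idea can be sketched in a few lines. This is only an illustration of the technique, with hypothetical names (`Rgb`, `blend_channel`, `blend_palette`); it assumes the palette is held internally as 8-bit RGB, whereas color definitions on the wire use a different scale, and it is not how sixel-image or xterm represent registers.

```rust
/// An 8-bit RGB color (an assumed internal representation for this sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Rgb {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

/// Blends one channel: `alpha` = 0 keeps `under`, `alpha` = 255 yields `over`.
fn blend_channel(over: u8, under: u8, alpha: u8) -> u8 {
    let (over, under, a) = (over as u32, under as u32, alpha as u32);
    // Integer blend with rounding to the nearest value.
    ((over * a + under * (255 - a) + 127) / 255) as u8
}

/// Returns a new palette where every register has been blended with the
/// overlay color. The pixel data itself is untouched: the image can then be
/// serialized as usual with the blended registers.
pub fn blend_palette(palette: &[Rgb], overlay: Rgb, alpha: u8) -> Vec<Rgb> {
    palette
        .iter()
        .map(|c| Rgb {
            r: blend_channel(overlay.r, c.r, alpha),
            g: blend_channel(overlay.g, c.g, alpha),
            b: blend_channel(overlay.b, c.b, alpha),
        })
        .collect()
}
```

The appeal is that the cost scales with the number of color registers rather than the number of pixels.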

Another door opens on new things to do. :-) Maybe a command-line tool to fast-crop a sixel image read from stdin, that could be plugged into a larger program...🤔

@imsnif
Member

imsnif commented Apr 26, 2022

FYI hanging on to the palette/registers is also what xterm does internally -- which is great for sixel, but a pain for adding support for 24/32-bit images.

Good to know xterm had the same idea. It shouldn't be too much of a problem to adjust the library's internal representation, keep the speed and allow other paths for deserializing these images.

I don't have plans for supporting 24/32-bit images in the near future (who knows though, if someone else wants to do it).
My next steps are:

  1. Make this work inside Zellij
  2. Support terminals without Sixel support by integrating chafa or some similar solution
  3. Add image encoding as part of plugins, so that plugin developers don't have to worry about this and can just ask to display an image - having Zellij do the heavy lifting for them.

Might take me a little bit because I'm plugging holes in other parts of Zellij (Sixels are my fun side project), but I hope to get at least 1 out of the way in the near future.

Another door opens on new things to do. :-) Maybe a command-line tool to fast-crop a sixel image read from stdin, that could be plugged into a larger program...🤔

Yeah, that would be cool! My idea is providing a filter method and letting people do image thumbnails (or just general resizing).

@imsnif
Member

imsnif commented May 9, 2022

More progress, and an obligatory screenshot:
[Screenshot: img-2022-05-09-165854]

Got a local branch (mostly) working. There are still some issues to hammer out, namely the whole text-on-image implicit z-index story, but otherwise things are working pretty well. Performance is pretty smooth with the aforementioned libs. I'm also caching the serialized assets, so at the moment the bottleneck after the first render is pretty much the terminal emulator itself.

Considering optimizing this further by caching each line so that scrolling is smooth from the get-go - maybe redoing this on SIGWINCH. Just initial thoughts though.

I hope to have this merged soon (and the crate-API finalized so we can officially publish them as well for others to use).

@imsnif
Member

imsnif commented May 9, 2022

I implemented this by having an "anchor cell" (the top-left cell of an image) and then serializing the relevant parts of the image on render, provided it intersects the pane in one way or another. The cropping support in the sixel-image lib also let me render only the changed part of the image rather than the whole image every time.

The anchor cell is really only used for stuff like SIGWINCH (font size change) and scrollback overflow. Otherwise we know where the image is. This made things tremendously easier (I find) than marking each cell as being potentially under one or more images.
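A rough sketch of the anchor-cell idea, under some assumptions: the anchor is given in cell coordinates relative to the pane, the cell size in pixels is known, and all names here (`Rect`, `intersect`, `visible_crop`) are hypothetical rather than Zellij's actual types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping area of two rects, or None if they don't overlap."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.width, b.x + b.width)
    bottom = min(a.y + a.height, b.y + b.height)
    if right <= left or bottom <= top:
        return None
    return Rect(left, top, right - left, bottom - top)


def visible_crop(
    anchor_cell: Tuple[int, int],   # (column, row) of the image's top-left cell
    image_size: Tuple[int, int],    # (width, height) in pixels
    cell_size: Tuple[int, int],     # (width, height) of one cell in pixels
    pane_cells: Rect,               # pane viewport, in cells
) -> Optional[Rect]:
    """Return the part of the image to serialize, in image-local pixels."""
    cell_w, cell_h = cell_size
    image_px = Rect(anchor_cell[0] * cell_w, anchor_cell[1] * cell_h, *image_size)
    pane_px = Rect(
        pane_cells.x * cell_w,
        pane_cells.y * cell_h,
        pane_cells.width * cell_w,
        pane_cells.height * cell_h,
    )
    overlap = intersect(image_px, pane_px)
    if overlap is None:
        return None  # image is fully outside the pane, nothing to render
    # Translate from pane pixels back to coordinates inside the image.
    return Rect(
        overlap.x - image_px.x, overlap.y - image_px.y, overlap.width, overlap.height
    )
```

For example, an image whose anchor has scrolled two rows above the top of the pane yields a crop that skips the first two cell-heights of pixel rows.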

@imsnif imsnif mentioned this issue May 10, 2022
4 tasks
@imsnif
Member

imsnif commented May 17, 2022

More progress!

New things that work:

  1. Moving a floating pane over a sixel image - this works by temporarily cropping the sixel-image below it to the right shape
  2. Moving a floating pane with a sixel image - works by re-rendering the pane and the panes around it (and so generally moving panes around with Sixel images works - this was the most complex case)
  3. The infamous implicit z-index when writing text above an existing image - for this to work I added a new cut_out method to the SixelImage library that lets you give it a rect (x/y/width/height) and it cuts out those pixels from the image.
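As an illustration of what cutting out a rect could look like, here is a sketch that is not the SixelImage library's implementation. It assumes the image is a list of rows of palette indices, with `None` meaning "transparent, draw nothing here".

```python
from typing import List, Optional

Pixels = List[List[Optional[int]]]


def cut_out(pixels: Pixels, x: int, y: int, width: int, height: int) -> Pixels:
    """Return a copy of `pixels` with the given rect made transparent.

    The rect is clamped to the image, so it may extend past any edge.
    """
    result = [row[:] for row in pixels]
    for row_index in range(max(y, 0), min(y + height, len(result))):
        row = result[row_index]
        for col_index in range(max(x, 0), min(x + width, len(row))):
            row[col_index] = None
    return result
```

Text written over an image would then map to a cut-out of the cells it occupies, converted to pixels, so the image no longer paints over those cells on the next render.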

What's left is handling DECSDM, reaping the images and handling SIGWINCH. Also, finding a way to test all of this and cleaning stuff up.

Will keep this thread updated.

@imsnif
Member

imsnif commented May 19, 2022

Hey, a question for the thread denizens regarding raster attribute padding once more (specifically for @j4james and @hackerb9).

The behavior I'm seeing with some sixel assets (as well as lsix) in xterm+wezterm+contour is that at least the vertical padding does not happen beyond the defined image itself. If there are extra lines below it that were padded with the raster attribute, they are omitted.

Is my interpretation correct? Or am I running into a different issue here? I'm asking because this is different than what I understood from the discussion above.

@j4james

j4james commented May 19, 2022

I'm not entirely clear what you're asking, but if your raster attributes are declared as say 120x120, but the sixel data for the image only takes up 60x60, then it should be padded out to 120x120 (assuming the background mode hasn't been set to transparent). Again, you can look at the test cases I linked above to see how this is meant to work.
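A small sketch of the padding rule described here, assuming the decoded image is a list of rows of palette indices. The function and parameter names are made up for illustration. It also assumes that declared dimensions smaller than the data do not clip the image.

```python
from typing import List


def pad_to_raster(
    pixels: List[List[int]],
    declared_width: int,
    declared_height: int,
    background: int = 0,
    transparent: bool = False,
) -> List[List[int]]:
    """Pad decoded sixel data out to the size declared in the raster attributes."""
    if transparent:
        # Transparent background mode: nothing is filled in, keep the data as-is.
        return [row[:] for row in pixels]
    actual_height = len(pixels)
    actual_width = max((len(row) for row in pixels), default=0)
    # Never shrink: the declared size only ever adds background pixels.
    width = max(declared_width, actual_width)
    height = max(declared_height, actual_height)
    padded = [row + [background] * (width - len(row)) for row in pixels]
    padded += [[background] * width for _ in range(height - actual_height)]
    return padded
```

With a 120x120 declaration and 60x60 of data, this yields a 120x120 grid where everything outside the top-left 60x60 is the background register.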

It's been a while since I've done any sixel testing, so things may have changed, but last I checked, neither XTerm, WezTerm, or Contour were handling raster attributes correctly.

@imsnif
Member

imsnif commented May 19, 2022

Yeah, that's what I'm asking.

I guess in this case I'll kind of be forced to implement it in the same not-correct way, because our users use those terminals and apps developed for those terminals rather than the VT340.

@imsnif
Member

imsnif commented May 19, 2022

With lsix for example, if I implement it in this way then there's a very large gap between each thumbnail line.

@j4james

j4james commented May 19, 2022

I thought what you were going to do was make sure the attributes you set exactly matched the dimensions of the sixel data, then it shouldn't matter if the terminal emulator gets it wrong.
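For reference, a tiny encoder sketch that always emits raster attributes matching the pixel data, so a terminal that mishandles padding has nothing to get wrong. It assumes a rectangular grid of palette indices (`None` for transparent) and RGB registers in the 0-100 range; `encode_sixel` is a hypothetical name, and the sketch skips run-length compression entirely.

```python
from typing import List, Optional, Tuple


def encode_sixel(
    pixels: List[List[Optional[int]]],
    palette: List[Tuple[int, int, int]],
) -> str:
    """Serialize a pixel grid as a sixel sequence with exact raster attributes."""
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    # DCS q starts sixel mode; "Pan;Pad;Ph;Pv declares a 1:1 aspect ratio
    # and the exact width/height of the data that follows.
    parts = ["\x1bPq", f'"1;1;{width};{height}']
    for index, (red, green, blue) in enumerate(palette):
        parts.append(f"#{index};2;{red};{green};{blue}")
    bands = []
    for band_top in range(0, height, 6):
        band = pixels[band_top:band_top + 6]  # one sixel row is 6 pixels tall
        colors = sorted({c for row in band for c in row if c is not None})
        passes = []
        for color in colors:
            chars = []
            for x in range(width):
                bits = 0
                for bit, row in enumerate(band):
                    if row[x] == color:
                        bits |= 1 << bit  # bit 0 is the top pixel of the band
                chars.append(chr(63 + bits))
            passes.append(f"#{color}" + "".join(chars))
        # "$" returns to the start of the band for the next color pass.
        bands.append("$".join(passes))
    # "-" moves down to the next band; ST (ESC \) ends the sequence.
    parts.append("-".join(bands))
    parts.append("\x1b\\")
    return "".join(parts)
```

Any padding would be baked into `pixels` before encoding, so the declared size and the data size can never disagree.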

@imsnif
Member

imsnif commented May 19, 2022

For sure - the problem now is:

  1. I get a raster attribute that tells me to pad the image to e.g. 500h x 500w
  2. The actual image is just 400h x 500w
  3. I produce a 500h x 500w image, but it isn't what the app (e.g. lsix) is expecting me to do, since this causes an unseemly gap between its thumbnail lines

@j4james

j4james commented May 19, 2022

Ah, what you're probably running into there is the clipping issue. When you're padding an image, that padding can only extend as far as the boundaries of the screen (or the margins if there are any). The padding won't force the viewport to scroll if it extends past the bottom.

So in your example, the image you produce may need to be anything between 400 and 500 in height depending on where on the screen it was rendered. I'm surprised lsix is doing that though.
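A simplified sketch of that clipping rule, with loud caveats: it ignores margins, assumes the sixel data itself always gets its full height (scrolling if needed), and uses made-up names throughout. `cursor_row` is zero-based.

```python
def clipped_padded_height(
    data_height: int,      # pixel rows actually present in the sixel data
    declared_height: int,  # pixel rows declared by the raster attributes
    cursor_row: int,       # zero-based row where the image starts
    screen_rows: int,      # total rows on the screen
    cell_height: int,      # height of one cell in pixels
) -> int:
    """Return the height the image ends up with once padding is clipped."""
    # Pixels available between the image's first row and the screen bottom.
    available = (screen_rows - cursor_row) * cell_height
    padded = max(data_height, declared_height)
    # Padding stops at the bottom of the screen, but never eats into the data.
    return max(data_height, min(padded, available))
```

With 400 rows of data and a declared height of 500, the result lands anywhere from 400 to 500 depending on how close to the bottom of the screen the image starts.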

@imsnif
Member

imsnif commented May 19, 2022

I see! Right. I'll give it a try when I get a chance and report the results.

I'm totally open to this being just a misunderstanding or misuse on my part. But just to make sure, I ran lsix outside of Zellij, redirected its output to a file and cat'ed the file inside Zellij. It still happens. When I remove the raster attribute from the images it looks fine. When I crop the padding it also looks fine.

@imsnif
Member

imsnif commented May 21, 2022

> I'm totally open to this being just a misunderstanding or misuse on my part.

I was right. This was my bug - I switched horizontal/vertical padding around. The gap was the horizontal padding which I mistakenly made vertical. It works now as expected.

@imsnif
Member

imsnif commented Jul 7, 2022

PR is up: #1557

@imsnif
Member

imsnif commented Jul 8, 2022

At long last - this has been implemented!

I just merged Sixel support into main and it will be available in the next release.

Much thanks to all thread denizens for your help and support. It was really fun to see us terminal geeks coming together to improve the ecosystem. This was quite a task and would have been considerably harder without you.

@imsnif imsnif closed this as completed Jul 8, 2022
@christianparpart

> I'm not entirely clear what you're asking, but if your raster attributes are declared as say 120x120, but the sixel data for the image only takes up 60x60, then it should be padded out to 120x120 (assuming the background mode hasn't been set to transparent). Again, you can look at the test cases I linked above to see how this is meant to work.
>
> It's been a while since I've done any sixel testing, so things may have changed, but last I checked, neither XTerm, WezTerm, or Contour were handling raster attributes correctly.

This is interesting. So width/height from raster attributes should certainly be working in contour. If that (or anything else) is broken, we'd have to see if it's relevant at all to modern applications (eyeing the aspect ratio here). Nevertheless, if there is interest in getting a broken implementation fixed, I'm very open to that. (Sorry for being late. I also don't plan to hijack this ticket or forum; it's meant more as a general statement :-) )

@j4james

j4james commented Sep 12, 2022

> So width/height from raster attributes should certainly be working in contour.

When I last tested (which was a long time ago), the problems I saw were with images being clipped when the raster attribute dimensions were smaller than the actual image content, incorrect handling of sizes that didn't align with cell boundaries, and incorrect handling of zero/default values.

> If anything (or anything else) is broken, we'd have to see if they're relevant

If you want to see all the ways in which your terminal doesn't match the original DEC implementation, there's a whole bunch of test cases here:

https://github.com/hackerb9/vt340test/tree/main/j4james

Whether you think any of that is relevant is another matter. But this is a good lesson to bear in mind when you're designing your new image protocol. Future terminal devs will likely be picking and choosing which bits of your spec they feel like following, and implementing it just as badly as current devs have done with sixel.

@hackerb9

> https://github.com/hackerb9/vt340test/tree/main/j4james
>
> Whether you think any of that is relevant is another matter. But this is a good lesson to bear in mind when you're designing your new image protocol. Future terminal devs will likely be picking and choosing which bits of your spec they feel like following, and implementing it just as badly as current devs have done with sixel.

A good point, James.

Personally, I think current devs have done the best they could with the limited information available (a situation which @j4james has done quite a bit to rectify). Also, as much as I love that my VT340 is still useful, picking and choosing (and extending) is what has allowed sixel to evolve as a living protocol. And that's another thing to remember when designing the GIP: don't let perfect be the enemy of the good enough for right now.
