Issue with sixel rendering: stretched bands #22095

Closed
opened 2026-01-31 08:03:15 +00:00 by claunia · 14 comments

Originally created by @PhMajerus on GitHub (Aug 14, 2024).

Windows Terminal version

Canary 1.22.2261.0

Windows build number

10.0.22631.4037 x64

Description of the problem

I believe I may have found an issue with sixel image rendering.

When displaying a sixel image with a transparent background (by using P0;1 and not drawing some pixels in any of the colors), it renders perfectly when displayed on its own.
![image](https://github.com/user-attachments/assets/fef76824-b088-4b1b-b736-9523d39e6db0)
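For readers unfamiliar with the sequence mentioned above, here is a minimal sketch of how a transparent sixel image can be assembled. It is not the contents of the repro file; the helper names `sixel_band` and `transparent_sixel`, the single red palette entry, and the band sizes are all assumptions chosen for illustration.

```python
ESC = "\x1b"

def sixel_band(columns: int, pattern: int = 0b111111) -> str:
    """Encode `columns` identical sixel columns with the '!' repeat introducer.

    `pattern` holds six vertical pixels; bit 0 is the topmost pixel.
    """
    if not 0 <= pattern <= 0b111111:
        raise ValueError("a sixel holds six vertical pixels (0..63)")
    return f"!{columns}{chr(63 + pattern)}"

def transparent_sixel(bands: list[str]) -> str:
    """Wrap bands in DCS 0;1 q ... ST so that unset pixels are left untouched."""
    palette = "#1;2;100;0;0"  # define color 1 as RGB 100%, 0%, 0%
    body = "-".join("#1" + band for band in bands)  # '-' starts the next band
    return f"{ESC}P0;1q{palette}{body}{ESC}\\"

# Two bands: a fully filled one, then one that only sets its top two pixel rows.
image = transparent_sixel([sixel_band(10), sixel_band(10, 0b000011)])
```

Writing `image` to a sixel-capable terminal should draw a 10-pixel-wide red block whose lower rows are left transparent.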

But when it gets close to another sixel image, added either on the previous line or on the second line of the multi-line transparent image, its width becomes distorted.

![image](https://github.com/user-attachments/assets/5491e2ec-9846-4d90-9e00-671a1da62b49)

When the other sixel image is added above, the first band/line of the multi-line image gets slightly stretched.
![image](https://github.com/user-attachments/assets/d8434943-4a84-4686-92bf-09ebfbaedbdf)

When the other sixel image is appended after the second line of the multi-line image, that band gets stretched.
![image](https://github.com/user-attachments/assets/453e4524-f5cb-4f0c-bdda-b73a81f7c10d)

Note this also seems to happen in conhost, but with less visible stretching (only noticeable at some font sizes).

Steps to reproduce

Here is the repro text file: [test.txt](https://github.com/user-attachments/files/16611729/test.txt)

Expected Behavior

The sixel images should keep their dimensions regardless of other content displayed in the terminal.

Actual Behavior

Sixel images sometimes get distorted, with some bands stretched, when other content is displayed near them.

claunia added the Area-Rendering, Issue-Bug, In-PR, Needs-Tag-Fix, and Product-Terminal labels 2026-01-31 08:03:16 +00:00

@DHowett commented on GitHub (Aug 14, 2024):

@lhecker for Atlas

@PhMajerus do you by chance know whether you’re using the D2D or D3D backend?

I think you can test it by switching to a font with a long single-glyph ligature (Cascadia Code works well) and trying to color it in multiple colors:

\e[31m<\e[32m!\e[33m-\e[34m-\e[m

If it shows up in four colors, you’re on D3D. One color, D2D.
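A small sketch of how that test string could be emitted from a script instead of typed by hand. The function name `ligature_color_test` is hypothetical, and the result is only meaningful when the terminal font has a `<!--` ligature.

```python
ESC = "\x1b"

def ligature_color_test() -> str:
    """Color each character of '<!--' differently, then reset attributes."""
    colors = (31, 32, 33, 34)  # SGR red, green, yellow, blue
    glyphs = "<!--"
    colored = "".join(f"{ESC}[{c}m{g}" for c, g in zip(colors, glyphs))
    return colored + f"{ESC}[m"

if __name__ == "__main__":
    print(ligature_color_test())
```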


@lhecker commented on GitHub (Aug 14, 2024):

This happens with both D2D and D3D. It's a lot easier to test now that we can switch between them manually in the "Rendering" settings.


@PhMajerus commented on GitHub (Aug 14, 2024):

@DHowett the <!-- test only works when using a font that has ligatures. Most users, me included, probably use the Cascadia Mono version in Terminal.
I tested with both D2D and D3D using the manual setting in Rendering settings, and as @lhecker said, the problem happens with both, and even with the GDI? renderer in conhost.


@DHowett commented on GitHub (Aug 14, 2024):

> @DHowett the `<!--` test only works when using a font that has ligatures.

I did try to call that out. ;)


@PhMajerus commented on GitHub (Aug 14, 2024):

Oops, sorry, while I was testing I forgot that you had mentioned switching to Cascadia Code.

BTW, showing which rendering engine got selected when set to Automatic in Settings would be nice to have, like a label below the description text that shows the engine currently in use. It would provide a more reliable way to find out without being too intrusive or technical, since the Rendering settings page is pretty much all about those advanced tweaks anyway.


@j4james commented on GitHub (Aug 14, 2024):

I think what's happening here is that the first image is actually 24 pixels high, due to the nature of sixel images (their height is always a multiple of 6). This means it extends over two lines, and the second line's image slice is thus preallocated with a width of 40, even though it's entirely transparent at that point.

Then when you write the second image immediately below that, it's actually overlapping the first, so the first image slice for that image will have a width of 40, but the second slice will have its expected width of 32.

When it comes time to render, we've now got two lines that look like they ought to be the same, but the image slice for the first line has additional transparent padding that's making it wider. Ideally this shouldn't be a problem, but I think there are possibly rounding errors when we scale the image slices that result in them being misaligned.
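The height arithmetic in this explanation can be sketched as follows. The cell height of 20 pixels is an assumption (it is the VT340 cell size mentioned later in the thread), and `effective_height` and `rows_touched` are hypothetical helper names, not functions from the terminal's code.

```python
import math

SIXEL_BAND_HEIGHT = 6  # every sixel character covers six vertical pixels

def effective_height(pixel_height: int) -> int:
    """Round an image height up to a whole number of sixel bands."""
    return math.ceil(pixel_height / SIXEL_BAND_HEIGHT) * SIXEL_BAND_HEIGHT

def rows_touched(pixel_height: int, cell_height: int = 20) -> int:
    """Number of text rows the band-aligned image overlaps."""
    return math.ceil(effective_height(pixel_height) / cell_height)

# A nominally one-line image (20 px) is really 24 px and spills onto a second row.
spill = (effective_height(20), rows_touched(20))
```

Under these assumptions a 20-pixel image reaches 4 pixels into the next row, which is why a slice gets allocated there even though it holds only transparent pixels.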

For the GDI renderer in particular, this calculation looks wrong.

https://github.com/microsoft/terminal/blob/9c1436775e44c1a7090c41cd44624d5a39cb9828/src/renderer/gdi/paint.cpp#L687-L688

I'm assuming it should be rounding, rather than truncating, although it may be more complicated than that.

I don't know if the Atlas engine is affected in the same way, or there's something else involved there.


@j4james commented on GitHub (Aug 15, 2024):

On further investigation, I think the solution may be much simpler. The rounding shouldn't be an issue if the image slices were always a multiple of the virtual cell width, and they already would be if it weren't for this line:

https://github.com/microsoft/terminal/blob/1511d2c2ad4220721afb3f4689d9f1d2895c0bd7/src/buffer/out/ImageSlice.cpp#L68

But I'm not sure that's actually needed. I think it's the byte width of the image buffer that needs to be a multiple of 4, and that's already going to be the case if we're storing our pixels as RGBQUAD. I suspect this code is just left over from when I was experimenting with a palette renderer, which would have had just one byte per pixel.

I need to do some more tests to confirm, but dropping that line seems to fix the issue in both the GDI and Atlas renderers.
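As a rough illustration of why padding can break cell alignment, the sketch below assumes a virtual cell width of 10 pixels and assumes the line in question rounds the slice width up to a multiple of 4 pixels. Neither detail is shown in this thread, and `round_up` and `slice_width` are hypothetical helpers rather than code from the repository.

```python
def round_up(value: int, multiple: int) -> int:
    """Round `value` up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple

def slice_width(columns: int, cell_width: int = 10, pad_to: int = 1) -> int:
    """Pixel width of a slice covering `columns` cells, optionally padded."""
    return round_up(columns * cell_width, pad_to)

# With the assumed 4-pixel padding, a 3-cell slice stops being a whole number of cells.
padded = slice_width(3, pad_to=4)    # 30 px rounded up to 32 px
unpadded = slice_width(3)            # stays at 30 px, exactly 3 cells
```

If the padding really works this way, it would be consistent with the widths of 40 and 32 mentioned in the previous comment, though that reading is a guess.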


@lhecker commented on GitHub (Aug 15, 2024):

At least the CreateBitmap documentation states:

> Each scan line in the rectangle must be word aligned (scan lines that are not word aligned must be padded with zeros).

So, I think removing that line should be safe.


@PhMajerus commented on GitHub (Aug 15, 2024):

> I think it's the byte width of the image buffer that needs to be a multiple of 4, and that's already going to be the case if we're storing our pixels as `RGBQUAD`.

Yeah, the Windows DIB format uses a whole number of bytes per line. A monochrome DIB that's 12px wide will use 2 bytes and waste 4 bits per line.
If you use a 256-color palette or RGBQUADs, you'll always use whole bytes already.
I need to check whether it has to be a multiple of 4; I don't remember that part.


@j4james commented on GitHub (Aug 15, 2024):

The 4-byte alignment requirement is mentioned in the [`BITMAPINFO`](https://learn.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-bitmapinfo) docs:

> each scan line must be padded with zeros to end on a LONG data-type boundary
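The scan-line rule quoted above is usually expressed with the well-known DIB stride formula. The sketch below applies it to a few pixel formats; `dib_stride` is a hypothetical helper name, and the widths are arbitrary examples.

```python
def dib_stride(width_px: int, bits_per_pixel: int) -> int:
    """Bytes per scan line, padded up to a 4-byte (LONG) boundary."""
    return ((width_px * bits_per_pixel + 31) // 32) * 4

# Monochrome, 12 px wide: 2 bytes of pixel data, padded to 4 bytes per line.
mono = dib_stride(12, 1)

# 8-bit palette, 30 px wide: 30 bytes of pixel data, padded to 32.
paletted = dib_stride(30, 8)

# 32-bit RGBQUAD pixels: always width * 4, so no extra padding is ever needed.
rgbquad = dib_stride(30, 32)
```

This matches the point made in the thread: with 4-byte pixels the byte width is a multiple of 4 for any pixel width, so only narrower pixel formats need row padding.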


@j4james commented on GitHub (Aug 15, 2024):

@PhMajerus I forgot to say thank you for spotting this. The early testing is much appreciated.


@PhMajerus commented on GitHub (Aug 15, 2024):

> each scan line must be padded with zeros to end on a LONG data-type boundary

Yeah, I checked my DIB code and it's aligned on 4-byte boundaries.

> I forgot to say thank you for spotting this. The early testing is much appreciated.

You're welcome. I really like the feature and have been experimenting with it. Thanks for implementing it.
I think we can have a sxlDir that shows Windows Explorer icons next to filenames in the terminal. So I'll try to find some time to experiment some more. I basically need to write the converter the other way around, from DIB to Sixel.

Do you have some good reference or recommendations on interleaving text and sixel images?
Is my technique of following a sixel by a ␛[nC to move the cursor to its right before continuing with text the best way to mix text and 1-line images? Which pixel height should we use to be as compatible as possible? Is 20 safe for a 1-line sixel? Can we always assume a width of 10 pixels will fit 1 character cell? If not, how can we reliably move the cursor to the right of the image?


@j4james commented on GitHub (Aug 15, 2024):

> Is my technique of following a sixel by a ␛[nC to move the cursor to its right before continuing with text the best way to mix text and 1-line images?

On a standard Sixel implementation that's what I would do, but on most Linux terminals you won't know how many columns the image will occupy (see below), so you won't know how far right to move the cursor. Some support a proprietary mode (8452) that should leave the cursor positioned to the right of the image, but that may still not be the correct row.

> Is 20 safe for a 1-line sixel? Can we always assume a width of 10 pixels will fit 1 character cell?

Again, if you only need to work with standard Sixel implementations, that would be perfect. That's the cell size of the VT340, and what real terminal emulators are likely to emulate. But that's not going to work on most Linux terminals, because their cell size will be dependent on whatever font size the user happens to have set at the time. They expect you to query the size every time you want to output an image, and then rescale the image to match.

Personally I just don't bother trying to support non-standard terminals. There are VT libraries (like notcurses) that have some level of cross-platform image support, so it is doable, but I think they are more geared towards fullscreen apps.
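For the standard-terminal case discussed above, the cursor-forward technique might look like the following sketch. It assumes the VT340 cell width of 10 pixels and that the cursor returns to the image's starting column after the sixel data, as the technique in this thread implies. `image_then_text` is a hypothetical helper and the sixel payload is a placeholder.

```python
import math

ESC = "\x1b"

def image_then_text(sixel: str, image_width_px: int, text: str,
                    cell_width: int = 10) -> str:
    """Emit a sixel image, skip the columns it covers, then write text."""
    columns = math.ceil(image_width_px / cell_width)  # partially covered cells count
    return f"{sixel}{ESC}[{columns}C{text}"           # CSI n C moves the cursor right

line = image_then_text("<sixel data>", 32, " file.txt")
```

On terminals whose cell size follows the font, `cell_width` would have to be queried first, which is the limitation described in the comment above.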


@j4james commented on GitHub (Aug 15, 2024):

One other thing I forgot to mention: a 20 pixel high image has an effective height of 24 pixels, so be aware that it'll trigger a scroll on the last line of the page. The cursor will still remain on the starting row (now scrolled up), so your layout should be fine, but the scrolling may come as a surprise. If you want to avoid that scrolling, you'll need to limit the image height to 18 pixels, but that means you can't completely fill the line height.
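The 18-pixel limit follows from rounding down to whole sixel bands. A minimal sketch, again assuming a 20-pixel cell height and using a hypothetical helper name:

```python
SIXEL_BAND_HEIGHT = 6  # image heights are effectively multiples of six pixels

def tallest_image_without_spill(rows: int = 1, cell_height: int = 20) -> int:
    """Largest band-aligned image height that stays within `rows` text rows."""
    available = rows * cell_height
    return (available // SIXEL_BAND_HEIGHT) * SIXEL_BAND_HEIGHT

one_row = tallest_image_without_spill()    # three bands fit in 20 px
two_rows = tallest_image_without_spill(2)  # six bands fit in 40 px
```

With these numbers the last 2 pixels of a single row stay unused, which is the trade-off described above.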


Reference: starred/terminal#22095