Issue with sixel rendering: stretched bands #22095

@PhMajerus commented on GitHub (Aug 14, 2024):

@DHowett the `<!--` test only works when using a font that has ligatures. Most users, me included, probably use the Cascadia Mono version in Terminal.
I tested with both D2D and D3D using the manual setting in Rendering settings, and as @lhecker said, the problem happens with both, and even with the GDI? renderer in conhost.

@DHowett commented on GitHub (Aug 14, 2024):

> @DHowett the `<!--` test only works when using a font that has ligatures.

I *did* try to call that out. ;)

@PhMajerus commented on GitHub (Aug 14, 2024):

Oops, sorry, forgot you mentioned switching to Cascadia Code in the meantime while I was testing.

BTW, showing which rendering engine got selected when set to `Automatic` in Settings would be nice to have, like a label below the description text that shows the engine currently in use. It would provide a more reliable way to find out without being too intrusive or technical, since the Rendering settings page is pretty much all about those advanced tweaks anyway.

@j4james commented on GitHub (Aug 14, 2024):

I think what's happening here is that the first image is actually 24 pixels high, due to the nature of sixel images (their height is always a multiple of 6). This means it extends over two lines, and the second line's image slice is thus preallocated with a width of 40, even though it's entirely transparent at that point.

Then when you write the second image immediately below that, it's actually overlapping the first, so the first image slice for that image will have a width of 40, but the second slice will have its expected width of 32.

When it comes time to render, we've now got two lines that look like they ought to be the same, but the image slice for the first line has additional transparent padding that's making it wider. Ideally this shouldn't be a problem, but I think there are possibly rounding errors when we scale the image slices that result in them being misaligned.

For the GDI renderer in particular, this calculation looks wrong:

https://github.com/microsoft/terminal/blob/9c1436775e44c1a7090c41cd44624d5a39cb9828/src/renderer/gdi/paint.cpp#L687-L688

I'm assuming it should be rounding, rather than truncating, although it may be more complicated than that.

I don't know if the Atlas engine is affected in the same way, or there's something else involved there.
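
For illustration, here is a minimal Python sketch of the two effects described in this comment: the sixel height being padded up to a multiple of 6, and truncation versus rounding when a slice width is scaled. The helper names, the 20-pixel cell height, and the scale from 10-pixel to 9-pixel cells are assumptions made up for the example; this is not the code from `paint.cpp`.

```python
import math

SIXEL_BAND = 6  # each sixel band encodes 6 vertical pixels


def sixel_height(requested_px):
    """Height actually occupied: rounded up to a whole number of bands."""
    return math.ceil(requested_px / SIXEL_BAND) * SIXEL_BAND


def rows_covered(height_px, cell_height_px=20):
    """Number of text rows an image of this height touches (assumed 20px cells)."""
    return math.ceil(height_px / cell_height_px)


def scaled_width(width_px, src_cell_px, dst_cell_px, rounding=True):
    """Scale a slice width from virtual cell size to rendered cell size.

    Hypothetical helper: it only contrasts rounding with truncation.
    """
    exact = width_px * dst_cell_px / src_cell_px
    return round(exact) if rounding else math.floor(exact)


if __name__ == "__main__":
    height = sixel_height(20)
    print(height, rows_covered(height))
    for width in (40, 32):
        print(width,
              scaled_width(width, 10, 9, rounding=False),
              scaled_width(width, 10, 9, rounding=True))
```

With these assumed numbers, a 20-pixel-high image becomes 24 pixels and covers two rows. A 32-pixel slice scaled from 10-pixel to 9-pixel cells comes out as 28 when truncated and 29 when rounded, while a 40-pixel slice is 36 either way, so two slices that should line up can end up a pixel apart.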

@j4james commented on GitHub (Aug 15, 2024):

On further investigation, I think the solution may be much simpler. The rounding shouldn't be an issue if the image slices were always a multiple of the virtual cell width, and they already would be if it weren't for this line:

https://github.com/microsoft/terminal/blob/1511d2c2ad4220721afb3f4689d9f1d2895c0bd7/src/buffer/out/ImageSlice.cpp#L68

But I'm not sure that's actually needed. I think it's the byte width of the image buffer that needs to be a multiple of 4, and that's already going to be the case if we're storing our pixels as `RGBQUAD`. I suspect this code is just left over from when I was experimenting with a palette renderer, which would have had just one byte per pixel.

I need to do some more tests to confirm, but dropping that line seems to fix the issue in both the GDI and Atlas renderers.
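
A small sketch of the arithmetic behind this, assuming (as the comment implies, but not quoted from `ImageSlice.cpp`) that the line in question rounds the slice's pixel width up to a multiple of 4 after it has been sized in whole cells. The 10-pixel cell width and the helper names are assumptions for illustration only.

```python
CELL_WIDTH = 10      # assumed virtual cell width in pixels
BYTES_PER_PIXEL = 4  # one RGBQUAD per pixel


def round_up(value, multiple):
    """Round value up to the nearest multiple."""
    return -(-value // multiple) * multiple


def slice_width(image_width_px, pad_to_four):
    """Hypothetical slice sizing: whole cells, optionally padded to 4 pixels."""
    width = round_up(image_width_px, CELL_WIDTH)
    if pad_to_four:
        width = round_up(width, 4)
    return width


def row_bytes(width_px):
    """Byte width of one row when every pixel is stored as an RGBQUAD."""
    return width_px * BYTES_PER_PIXEL


if __name__ == "__main__":
    for padded in (True, False):
        width = slice_width(30, padded)
        print(padded, width, width % CELL_WIDTH == 0, row_bytes(width) % 4 == 0)
```

Under these assumptions, a three-cell image gets a 32-pixel slice with the extra padding, which is no longer a whole number of cells, and a 30-pixel slice without it. In both cases the row is a multiple of 4 bytes, because each pixel already takes 4 bytes.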

@lhecker commented on GitHub (Aug 15, 2024):

At least the `CreateBitmap` documentation states:

> Each scan line in the rectangle must be word aligned (scan lines that are not word aligned must be padded with zeros).

So, I think removing that line should be safe.

@PhMajerus commented on GitHub (Aug 15, 2024):

> I think it's the byte width of the image buffer that needs to be a multiple of 4, and that's already going to be the case if we're storing our pixels as `RGBQUAD`.

Yeah, the Windows DIB format uses a whole number of bytes per line. A monochrome DIB that's 12px wide will use 2 bytes and waste 4 bits per line.
If you use a 256-color palette or `RGBQUAD`s, you'll always use whole bytes already.
I need to check if it has to be a multiple of 4, I don't remember that part.

@j4james commented on GitHub (Aug 15, 2024):

The 4-byte alignment thing is mentioned in the [`BITMAPINFO`](https://learn.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-bitmapinfo) docs:

> each scan line must be padded with zeros to end on a LONG data-type boundary
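
As a rough illustration of the alignment rules quoted in this thread, the sketch below computes the padded scan-line size for a given pixel width and bit depth. The function names are made up for the example; the 32-bit default corresponds to the LONG boundary from the `BITMAPINFO` docs, and passing 16 corresponds to the word alignment quoted from `CreateBitmap`.

```python
def used_bytes(width_px, bits_per_pixel):
    """Bytes that actually hold pixel data on one scan line."""
    return (width_px * bits_per_pixel + 7) // 8


def scanline_bytes(width_px, bits_per_pixel, align_bits=32):
    """Bytes per scan line after zero-padding to an align_bits boundary."""
    bits = width_px * bits_per_pixel
    return (bits + align_bits - 1) // align_bits * (align_bits // 8)


if __name__ == "__main__":
    # (width in pixels, bits per pixel)
    for width, bpp in ((12, 1), (30, 8), (30, 32)):
        print(width, bpp, used_bytes(width, bpp), scanline_bytes(width, bpp))
```

For the 12-pixel monochrome case mentioned earlier, 2 bytes hold data and the line is padded to 4. A 30-pixel line at 8 bits per pixel is padded from 30 to 32 bytes, whereas at 32 bits per pixel it is 120 bytes and needs no padding, which fits the suggestion that the extra width rounding only mattered for a one-byte-per-pixel palette renderer.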

@j4james commented on GitHub (Aug 15, 2024):

@PhMajerus I forgot to say thank you for spotting this. The early testing is much appreciated.

@PhMajerus commented on GitHub (Aug 15, 2024):

> each scan line must be padded with zeros to end on a LONG data-type boundary

Yeah, I checked my DIB code and it's aligned on 4-byte boundaries.

> I forgot to say thank you for spotting this. The early testing is much appreciated.

You're welcome, I really like the feature, and have been experimenting with it. Thanks for implementing it.
I think we can have a sxlDir that shows Windows Explorer icons next to filenames in the terminal. So I'll try to find some time to experiment some more. I basically need to write the converter the other way around, from DIB to Sixel.

Do you have some good reference or recommendations on interleaving text and sixel images?
Is my technique of following a sixel by a ␛[nC to move the cursor to its right before continuing with text the best way to mix text and 1-line images? Which pixel height should we use to be as compatible as possible? Is 20 safe for a 1-line sixel? Can we always assume a width of 10 pixels will fit 1 character cell? If not, how can we reliably move the cursor to the right of the image?

@j4james commented on GitHub (Aug 15, 2024):

> Is my technique of following a sixel by a ␛[nC to move the cursor to its right before continuing with text the best way to mix text and 1-line images?

On a standard Sixel implementation that's what I would do, but on most Linux terminals you won't know how many columns the image will occupy (see below), so you won't know how far right to move the cursor. Some support a proprietary mode (8452) that should leave the cursor positioned to the right of the image, but that may still not be the correct row.
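
A minimal sketch of that approach, assuming a terminal with a fixed 10-pixel cell width: work out how many columns the image covers and build the cursor-forward sequence (`ESC [ n C`) to write after it. The helper names are hypothetical, and the sixel data itself is left out.

```python
import math

ESC = "\x1b"


def columns_for(width_px, cell_width_px=10):
    """Columns an image of this pixel width occupies (assumed 10px cells)."""
    return math.ceil(width_px / cell_width_px)


def cursor_forward(columns):
    """CUF: move the cursor right by the given number of columns."""
    return f"{ESC}[{columns}C"


def skip_past_image(width_px, cell_width_px=10):
    """Sequence to emit after the sixel so text continues to its right."""
    return cursor_forward(columns_for(width_px, cell_width_px))


if __name__ == "__main__":
    # repr() keeps the escape character visible instead of sending it.
    print(repr(skip_past_image(32)))
```

This only holds where the cell width is known in advance; on terminals whose cell size follows the font, the column count cannot be computed this way.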

> Is 20 safe for a 1-line sixel? Can we always assume a width of 10 pixels will fit 1 character cell?

Again, if you only need to work with standard Sixel implementations, that would be perfect. That's the cell size of the VT340, and what real terminal emulators are likely to emulate. But that's not going to work on most Linux terminals, because their cell size will be dependent on whatever font size the user happens to have set at the time. They expect you to query the size every time you want to output an image, and then rescale the image to match.
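
One common way to do that query on Unix-like systems is the `TIOCGWINSZ` ioctl, which reports the window size in both cells and pixels. The sketch below is an assumption about how an application might use it, not something taken from this thread; the function names are hypothetical, and some terminals report zero for the pixel fields, in which case the cell size stays unknown.

```python
import struct


def cell_size(rows, cols, xpixel, ypixel):
    """Cell size in pixels as (width, height), or None if not reported."""
    if rows <= 0 or cols <= 0 or xpixel <= 0 or ypixel <= 0:
        return None
    return xpixel // cols, ypixel // rows


def query_cell_size(fd=1):
    """Ask the terminal attached to fd for its size (Unix only)."""
    import fcntl
    import termios

    raw = fcntl.ioctl(fd, termios.TIOCGWINSZ, bytes(8))
    rows, cols, xpixel, ypixel = struct.unpack("HHHH", raw)
    return cell_size(rows, cols, xpixel, ypixel)


def target_image_size(cells_wide, cells_high, cell):
    """Pixel size an image must be rescaled to in order to fill whole cells."""
    cell_width, cell_height = cell
    return cells_wide * cell_width, cells_high * cell_height


if __name__ == "__main__":
    try:
        size = query_cell_size()
    except OSError:
        size = None  # output is not a terminal
    print(size)
```

An image meant to fill four cells on one row would then be rescaled to `target_image_size(4, 1, size)` before being encoded, instead of assuming the VT340's 10x20 cell.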

Personally I just don't bother trying to support non-standard terminals. There are VT libraries (like notcurses) that have some level of cross-platform image support, so it is doable, but I think they are more geared towards fullscreen apps.
