Problem displaying text which contains a combining character (Combining Acute Accent U+0301) #18208

Closed
opened 2026-01-31 06:06:52 +00:00 by claunia · 2 comments

Originally created by @gusbzs on GitHub (Aug 18, 2022).

Windows Terminal version

1.14.1963.0

Windows build number

10.0.22000.0

Other Software

WSL version: 0.65.3.0
Kernel version: 5.15.57.1
WSLg version: 1.0.41
MSRDC version: 1.2.3213
Direct3D version: 1.601.0
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22000.856

Ubuntu 20.04.4 LTS

Steps to reproduce

  1. Open an Ubuntu 20.04.4 LTS tab inside Windows Terminal.
  2. Execute the following command to generate the file "test.txt", which contains the Spanish word "automático" spelled with the Unicode combining acute accent (U+0301), encoded in UTF-8 as the bytes 0xCC 0x81.
    echo "6175746f6d61cc817469636f0a" | xxd -r -p - >test.txt
  3. Execute the following command to display the contents of the generated "test.txt" file:
    cat test.txt
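
To confirm what the generated file actually contains, the same hex payload can be decoded and inspected one code point at a time. This is a minimal sketch, not part of the original report; the variable names are illustrative. It also shows that NFC normalization folds the "a" plus U+0301 pair into the single precomposed character U+00E1, which is a possible workaround on the producing side rather than a fix for the terminal.

```python
import unicodedata

# Same hex payload as the xxd command in step 2.
HEX_PAYLOAD = "6175746f6d61cc817469636f0a"

raw = bytes.fromhex(HEX_PAYLOAD)
text = raw.decode("utf-8")

# List every code point; the combining accent appears as its own entry.
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, repr(ch))}")

# NFC composes "a" + U+0301 into the precomposed U+00E1.
composed = unicodedata.normalize("NFC", text)
print(len(text), len(composed))
```

The decoded string has one more code point than its NFC form, which is the extra element the terminal has to place somewhere.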

Expected Behavior

The Spanish word "automático" should be displayed inside Windows Terminal without any extra spacing between letters, just as Windows Notepad correctly displays the same file:
image: https://user-images.githubusercontent.com/7851513/185462112-55284ea7-9b11-48c3-bfaa-32e700c46cdc.png

Actual Behavior

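The Spanish word "automático" is displayed inside Windows Terminal with extra spacing before and after the combining character:
image: https://user-images.githubusercontent.com/7851513/185460695-6d0aee0b-9949-4e71-bca0-893daa458a23.png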

claunia added the Issue-Bug and Resolution-Duplicate labels 2026-01-31 06:06:52 +00:00

@DHowett commented on GitHub (Aug 18, 2022):

Unfortunately, this is one of those complicated things that consoles cannot handle very well. Here's what's happening:

Each code unit is assigned one cell... and for the "zero-width" character class that encompasses combining diacritical marks, that throws us off.

It ends up like this:

|0|1|2|3|4|5|6|7|8|9|10|
|-|-|-|-|-|-|-|-|-|-|-|
|a|u|t|o|m|a|◌́|t|i|c|o|
|`61`|`75`|`74`|`6f`|`6d`|`61`|`0301`|`74`|`69`|`63`|`6f`|

When we give this back to the text rendering engine, two cells have been "correctly" allocated at positions 5 and 6, but the renderer rejoins the "a" and the accent into a single glyph instead of rendering them as the separate parts shown in the diagram.
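
The mismatch described above can be illustrated with a small sketch. It is not Windows Terminal's implementation: the helper names `naive_cells` and `clustered_cells` are hypothetical, and attaching characters with a non-zero canonical combining class to the previous cell is a simplification of full grapheme cluster segmentation. The first function mirrors the table (one cell per element), while the second keeps the accent in the same cell as its base letter.

```python
import unicodedata

# "automático" spelled with a combining acute accent (U+0301) after the "a".
WORD = "automa\u0301tico"


def naive_cells(text):
    """One cell per code point, as in the table above."""
    return list(text)


def clustered_cells(text):
    """Attach combining marks to the preceding cell (simplified clustering)."""
    cells = []
    for ch in text:
        if unicodedata.combining(ch) and cells:
            cells[-1] += ch  # the mark shares the base letter's cell
        else:
            cells.append(ch)
    return cells


print(len(naive_cells(WORD)), len(clustered_cells(WORD)))
```

With the naive layout the word occupies one cell more than with the clustered layout, which corresponds to the extra spacing in the screenshot.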

This will be fixed, broadly, by /dupe #8000.

Thanks!


@ghost commented on GitHub (Aug 18, 2022):

Hi! We've identified this issue as a duplicate of another one that already exists on this Issue Tracker. This specific instance is being closed in favor of tracking the concern over on the referenced thread. Thanks for your report!


Reference: starred/terminal#18208