Decomposed unicode characters take space of two characters #13855

Open
opened 2026-01-31 03:54:02 +00:00 by claunia · 0 comments
Owner

Originally created by @samuliasmala on GitHub (May 20, 2021).

Windows Terminal version (or Windows build number)

1.7.1033.0

Other Software

No response

Steps to reproduce

Run the following command in Ubuntu bash:

echo $'"\u00E4"\n"\u0061\u0308"'

Expected Behavior

Because U+00E4 and U+0061 U+0308 are equivalent Unicode forms, I expected the following output:
image
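
For reference, the equivalence here is canonical equivalence: U+00E4 decomposes to U+0061 U+0308, so both sequences represent the same character even though their encoded bytes differ. A minimal sketch of that difference, assuming a POSIX shell with `od`, `tr`, and `sed` available; the UTF-8 bytes are written as octal escapes so the result does not depend on the locale, and the names `precomposed`, `decomposed`, and `hex_bytes` are illustrative only:

```bash
# UTF-8 bytes given as octal escapes, so the shell's locale does not matter.
precomposed=$(printf '\303\244')   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
decomposed=$(printf 'a\314\210')   # U+0061 LATIN SMALL LETTER A + U+0308 COMBINING DIAERESIS

# Print the bytes of a string as space-separated lowercase hex.
hex_bytes() {
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' \n' '  ' | sed 's/^ *//; s/ *$//'
}

echo "precomposed: $(hex_bytes "$precomposed")"   # c3 a4
echo "decomposed:  $(hex_bytes "$decomposed")"    # 61 cc 88
```

Both strings should render as a single "ä" occupying one cell, which is the expectation described above.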

Actual Behavior

The actual output I got was the following:
image

The second form, which takes the space of two characters, is a combination of Latin Small Letter A and Combining Diaeresis and should be displayed as a single character. Currently it is displayed as two characters, which causes two issues:

  • Readability of such text suffers since sentences have half-spaces around the character
  • Programs such as the less command calculate the row length using single-character spacing, which causes overlapping text and disappearing lines when the line length is exceeded because of the Combining Diaeresis or a similar character
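
The width mismatch described in the second point can be illustrated with a small sketch. This is not how `less` or Windows Terminal computes widths; it is a simplified assumption that only treats U+0300..U+036F (the Combining Diacritical Marks block) as zero-width and every other code point as one cell. The function names `count_codepoints` and `count_cells` are hypothetical, and the script assumes a POSIX shell with `od` and `awk`:

```bash
# Count code points in a UTF-8 string: every byte that is not a
# continuation byte (0x80..0xBF) starts a new code point.
count_codepoints() {
  printf '%s' "$1" | od -An -v -tu1 | awk '
    { for (i = 1; i <= NF; i++) if ($i + 0 < 128 || $i + 0 >= 192) n++ }
    END { print n + 0 }'
}

# Count cells, additionally skipping U+0300..U+036F, which encode as
# 0xCC 0x80..0xBF or 0xCD 0x80..0xAF in UTF-8.
count_cells() {
  printf '%s' "$1" | od -An -v -tu1 | awk '
    { for (i = 1; i <= NF; i++) b[++n] = $i + 0 }
    END {
      cells = 0
      for (i = 1; i <= n; i++) {
        if (b[i] >= 128 && b[i] < 192) continue                 # continuation byte
        if (b[i] == 204) continue                               # U+0300..U+033F
        if (b[i] == 205 && b[i + 1] + 0 <= 175) continue        # U+0340..U+036F
        cells++
      }
      print cells
    }'
}

decomposed=$(printf 'a\314\210')   # U+0061 + U+0308, as UTF-8 bytes

echo "code points: $(count_codepoints "$decomposed")"   # 2
echo "cells:       $(count_cells "$decomposed")"        # 1
```

A program that assumes one cell for the sequence and a terminal that advances two cells will disagree about where each line ends, which is consistent with the overlapping text reported above.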
claunia added the Area-Output, Resolution-Duplicate labels 2026-01-31 03:54:02 +00:00
Reference: starred/terminal#13855