Korean text is not rendered properly #21320

Open
opened 2026-01-31 07:40:30 +00:00 by claunia · 3 comments

Originally created by @pandaninjas on GitHub (Feb 28, 2024).

Windows Terminal version

1.19.10302.0

Windows build number

10.0.22621.3155

Other Software

No response

Steps to reproduce

Certain Korean text doesn't render properly. Example:
`curl.exe https://gist.githubusercontent.com/pandaninjas/388ef63712016c853675dd33f0ab6864/raw/7c4f5b1d3107b2ee60c6a78d7c3a3e882ec2ebd2/korean-text.txt`
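
To check which code points a piece of text actually contains, independent of how any renderer draws it, a small sketch like the following may help. The `code_points` helper is a hypothetical name, and the sample string is only the jamo pair discussed in the comments on this issue, assumed here as a stand-in for the gist's contents.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# Assumption: this sample stands in for the downloaded file's contents.
sample = "\u1105\u1173"
for entry in code_points(sample):
    print(entry)
```

Running the same helper on text copied from a browser would show whether the browser handed over the decomposed pair or a single precomposed syllable.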

Expected Behavior

Renders properly. Ex:
![image](https://github.com/microsoft/terminal/assets/101084582/fa2803c9-8695-42fb-b120-fd283b4fd141)

Actual Behavior

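Korean text is rendered incorrectly: the individual jamo are drawn as separate glyphs instead of being combined into syllables.
![image](https://github.com/microsoft/terminal/assets/101084582/c4b38b2d-24b2-4b9f-8bfa-b8efb3d88c9a)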

claunia added the Area-Rendering, Issue-Bug, Product-Terminal labels 2026-01-31 07:40:32 +00:00

@leejy12 commented on GitHub (Feb 28, 2024):

I think Terminal is correct here. The text file you provided contains two letters, `ᄅ` (U+1105) and `ᅳ` (U+1173).
The browser seems to be combining the two and rendering them as `르` (U+B974), which I don't think should be happening.
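
For reference, the relationship between the two jamo and the precomposed syllable is plain arithmetic. The sketch below follows the Hangul syllable composition formula from the Unicode Standard; the `compose_lv` function name is made up for illustration, and trailing consonants are left out on purpose.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE = 0xAC00, 0x1100, 0x1161
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_lv(lead: str, vowel: str) -> str:
    """Compose a leading consonant and a vowel into one precomposed syllable.

    Hypothetical helper: it handles only the modern L + V case and ignores
    trailing consonants.
    """
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT):
        raise ValueError("not a composable modern L + V pair")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)


composed = compose_lv("\u1105", "\u1173")
print(f"U+{ord(composed):04X}")  # U+B974
```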


@lhecker commented on GitHub (Feb 28, 2024):

> I think Terminal is correct here.

It used to be correct! But nowadays, terminals are supposed to mostly handle grapheme clusters as per UAX#29 correctly (the Unicode Text Segmentation standard). The width of a glyph is the maximum width of each character in a grapheme cluster as per UAX#11 (the East Asian Width standard).
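
As a rough illustration of the width rule just described, the sketch below takes the widest character in a cluster. It is not Windows Terminal's implementation: the function names are hypothetical, and it assumes that only the East Asian Width classes W and F occupy two cells, ignoring ambiguous-width and zero-width characters.

```python
import unicodedata


def char_cells(ch: str) -> int:
    """Cells for one character: 2 for Wide/Fullwidth, else 1 (a simplification)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def cluster_cells(cluster: str) -> int:
    """Width of a grapheme cluster, taken as the widest character in it."""
    return max(char_cells(ch) for ch in cluster)


# The decomposed jamo pair and the precomposed syllable get the same width.
print(cluster_cells("\u1105\u1173"), cluster_cells("\uB974"))
```

Under this rule the decomposed pair needs no more room than the precomposed syllable, which is why drawing the two jamo side by side cannot fit.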

The two glyphs are syllable type L (leading consonant) and V (vowel), as per:

* https://codepoints.net/U+1105?lang=en
* https://codepoints.net/U+1173?lang=en

UAX#29 specifies that L×V do not break apart, which means that Windows Terminal has at least 1 bug: It should treat 르 as a single unit for cursor navigation, etc.
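
A minimal sketch of the Hangul part of that rule set (GB6 to GB8 in UAX#29) could look like this. It deliberately ignores every other grapheme cluster rule (CR/LF, Extend, ZWJ, emoji and so on), and the function and table names are invented for the example.

```python
def hangul_type(ch: str) -> str:
    """Classify a character by Hangul syllable type (ranges from HangulSyllableType.txt)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F or 0xA960 <= cp <= 0xA97C:
        return "L"
    if 0x1160 <= cp <= 0x11A7 or 0xD7B0 <= cp <= 0xD7C6:
        return "V"
    if 0x11A8 <= cp <= 0x11FF or 0xD7CB <= cp <= 0xD7FB:
        return "T"
    if 0xAC00 <= cp <= 0xD7A3:
        # Precomposed syllables without a trailing consonant sit at multiples of 28.
        return "LV" if (cp - 0xAC00) % 28 == 0 else "LVT"
    return "Other"


# For each left-hand type, the right-hand types that must NOT be split off.
NO_BREAK = {
    "L": {"L", "V", "LV", "LVT"},  # GB6
    "LV": {"V", "T"},              # GB7
    "V": {"V", "T"},               # GB7
    "LVT": {"T"},                  # GB8
    "T": {"T"},                    # GB8
}


def hangul_clusters(text: str) -> list[str]:
    """Split text into clusters using only the Hangul rules above."""
    clusters: list[str] = []
    for ch in text:
        prev_type = hangul_type(clusters[-1][-1]) if clusters else None
        if prev_type is not None and hangul_type(ch) in NO_BREAK.get(prev_type, ()):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


print(len(hangul_clusters("\u1105\u1173")))  # 1
```

With the L×V pair kept together, cursor movement and selection can step over it as one unit regardless of how the glyphs end up being drawn.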

I'm not entirely sure whether ᄅ​ᅳ should be drawn as 르, because I'm not a native Korean speaker. But given the above it should definitely only get allocated 1 cell, which would mean that it should be drawn as 르, because it won't fit otherwise. I believe the only reason it currently gets drawn as ᄅ​ᅳ is because we rely on DirectWrite whose builtin support for modern Unicode is still incomplete. For instance, you can try pasting 르 into Word and it'll be broken just like it is in Windows Terminal. Both are built on top of DirectWrite. 😕


@lhecker commented on GitHub (Feb 29, 2024):

Found the answer: https://devblogs.microsoft.com/oldnewthing/20201009-00/?p=104351
tl;dr: Windows (DirectWrite) does it "right", but it's literally the only one to do so, which is why Windows is effectively wrong. There's no alternative to DirectWrite, though... Hmm, I hope I can figure something out, because normalizing the input text to NFC first would be super awkward and make performance optimizations difficult.
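
To illustrate what NFC normalization does to this input and why it is awkward, here is a small sketch using Python's standard `unicodedata` module. It only demonstrates the normalization itself and is not a proposal for how Windows Terminal would apply it.

```python
import unicodedata

decomposed = "\u1105\u1173"  # the two conjoining jamo from the report
composed = unicodedata.normalize("NFC", decomposed)

# NFC changes the number of code points, so offsets into the original
# text no longer line up with offsets into the normalized text.
print(len(decomposed), len(composed))  # 2 1
print(f"U+{ord(composed):04X}")        # U+B974
```

Because the normalized text is shorter than what the application wrote, anything that maps buffer positions back to the original input would need an extra translation step.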

Reference: starred/terminal#21320