Combining Diacritic Marks are broken under AtlasEngine #16101

Closed
opened 2026-01-31 04:57:37 +00:00 by claunia · 3 comments

Originally created by @reli-msft on GitHub (Dec 10, 2021).

Originally assigned to: @lhecker on GitHub.

Windows Terminal version

1.13

Windows build number

Any

Other Software

No response

Steps to reproduce

Type

echo "[e`u{0301}`u{0301}]"

In PowerShell v6+
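For anyone who wants to check what that string actually contains, here is a minimal Python sketch (not part of the original report). It lists the code points and shows that Unicode normalization does not make the problem go away: NFC can fold only the first acute accent into a precomposed é, so the second accent is still a separate combining mark that the renderer has to position.

```python
import unicodedata

# Same text as the PowerShell reproduction: '[', 'e', two U+0301, ']'
text = "[e\u0301\u0301]"

for ch in text:
    # combining() returns the canonical combining class; non-zero means a combining mark
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ccc={unicodedata.combining(ch)}")

# NFC composes e + U+0301 into U+00E9, but there is no precomposed
# character for the second accent, so it remains a combining mark.
nfc = unicodedata.normalize("NFC", text)
print([f"U+{ord(ch):04X}" for ch in nfc])
```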

Expected Behavior

Text like this should appear in the window:

image

The two acute accents should stack above the letter e.

Actual Behavior

image

The acute accents are not properly placed.


@reli-msft commented on GitHub (Dec 10, 2021):

It looks like AtlasEngine doesn't do GPOS at all -- however, GPOS is key to supporting combining diacritics.


@Perlence commented on GitHub (Jan 29, 2022):

Is it related to #3546?


@lhecker commented on GitHub (Jan 29, 2022):

@Perlence It's somewhat related, but the cause is different. The problem isn't "expecting the quantity of glyphs [to] equal [...] the cells allocated for each line". AtlasEngine actually handles Unicode drawing correctly, because it doesn't do its own Unicode drawing in the first place (unlike our previous code). What it's missing is a piece of text analysis that recognizes combining diacritics, so that it doesn't split text in the wrong locations.

The Hebrew word "ﭏלה‎" consists of the 3 clusters "ﭏ", "ל" and "ה", which can be cached independently without affecting text layout. The first one ("ﭏ") is a ligature consisting of "א" and "ל" and may not be split up. The "AtlasEngine" has a system to break Unicode text into exactly such "clusters" of related characters. Each cluster is cached as a unit, so that it only needs to be drawn once and all future drawings can refer to the cached one.
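To make the cluster-and-cache idea concrete, here is a rough Python sketch. It is not AtlasEngine's code: `split_clusters` and `ClusterCache` are hypothetical names, and the segmentation rule (attach every character with a non-zero combining class to the preceding character) is a simplification that ignores ligatures and the full grapheme cluster rules. It only shows why a base letter and its diacritics have to end up in the same cache entry.

```python
import unicodedata

def split_clusters(text):
    """Split text so each combining mark stays with the preceding character."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # mark: extend the current cluster
        else:
            clusters.append(ch)     # anything else starts a new cluster
    return clusters

class ClusterCache:
    """Hypothetical stand-in for a glyph atlas: each distinct cluster is 'drawn' once."""
    def __init__(self):
        self.slots = {}
        self.draw_calls = 0

    def get(self, cluster):
        if cluster not in self.slots:
            self.draw_calls += 1                 # cache miss: would rasterize here
            self.slots[cluster] = len(self.slots)
        return self.slots[cluster]

cache = ClusterCache()
line = "[e\u0301\u0301] [e\u0301\u0301]"
clusters = split_clusters(line)
slots = [cache.get(c) for c in clusters]
print(clusters, slots, cache.draw_calls)
```

If the two accents were split off into clusters of their own, they would be cached and drawn without the "e" they belong to, which matches the misplacement in the screenshot above.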

What this issue implies is that the canBreakShapingAfter field of DWRITE_SHAPING_TEXT_PROPERTIES isn't enough to decide whether a run of glyphs forms a "cluster", because it doesn't account for combining diacritics. That means we unfortunately will have to call IDWriteTextAnalyzer::GetGlyphPlacements. This is quite unfortunate, because we only need to draw text once, and most of the time we just want to segment text and have no use for almost everything else GetGlyphPlacements returns.
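As a purely illustrative sketch of how placement data could feed segmentation: GetGlyphPlacements returns per-glyph advances and offsets, and one possible heuristic (an assumption here, not something this thread confirms as the eventual fix) is to treat a glyph with zero advance as belonging to the previous glyph's cluster. The function name and the advance values below are made up for the example.

```python
def clusters_from_advances(advances):
    """Group glyph indices; a zero-advance glyph joins the preceding group.

    Assumption: combining marks come back from shaping with an advance of 0.
    """
    groups = []
    for index, advance in enumerate(advances):
        if groups and advance == 0:
            groups[-1].append(index)   # mark glyph: same cluster as its base
        else:
            groups.append([index])     # glyph with width: starts a new cluster
    return groups

# Made-up advances for "[", "e", acute, acute, "]" in a monospace font
advances = [8.0, 8.0, 0.0, 0.0, 8.0]
groups = clusters_from_advances(advances)
print(groups)
```

The heuristic breaks down for fonts that give marks a non-zero advance, so treat it as a starting point for discussion rather than a complete rule.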

I'll tackle this issue in the near term though.


Reference: starred/terminal#16101