[PR #17680] Remove IsGlyphFullWidth from InputBuffer #31322

Closed
opened 2026-01-31 09:46:34 +00:00 by claunia · 0 comments

Original Pull Request: https://github.com/microsoft/terminal/pull/17680

State: closed
Merged: Yes


In several places the old conhost codebase appears to assume that any
wide glyph is represented by two codepoints. This is probably an
artifact of the ASCII/DBCS split that conhost used to have.
When conhost was merged into a single UCS-2-aware application,
this artifact was apparently never properly resolved.

To my knowledge there are at least two places where this assumption
exists: the clipboard code, which translates non-wide, non-ASCII
characters to Alt-numpad sequences, and this code. Both are wrong,
because in a Unicode context there's no correlation between
the number of codepoints and the width of the glyph, even with UCS-2.

In a post-UCS-2 world the correct check is for surrogate pairs,
as they must be avoided for the same reason DBCS characters were avoided.
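
As a rough illustration of the check described above, here is a minimal sketch of a surrogate test on UTF-16 code units. The helper names are hypothetical and are not taken from the conhost sources; the only fixed facts are the Unicode surrogate ranges (U+D800..U+DBFF for leading, U+DC00..U+DFFF for trailing).

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not the names used in conhost.

// True for the first half of a surrogate pair (U+D800..U+DBFF).
constexpr bool isLeadingSurrogate(char16_t ch) noexcept
{
    return ch >= 0xD800 && ch <= 0xDBFF;
}

// True for the second half of a surrogate pair (U+DC00..U+DFFF).
constexpr bool isTrailingSurrogate(char16_t ch) noexcept
{
    return ch >= 0xDC00 && ch <= 0xDFFF;
}

// True for either half. Both ranges share the top five bits 11011.
constexpr bool isSurrogate(char16_t ch) noexcept
{
    return (ch & 0xF800) == 0xD800;
}

// Glyph width plays no role in the answer:
static_assert(!isSurrogate(u'A'));       // narrow, one code unit
static_assert(!isSurrogate(u'\u4E2D'));  // wide CJK glyph, still one code unit
static_assert(isLeadingSurrogate(0xD83D));   // first half of U+1F600
static_assert(isTrailingSurrogate(0xDE00));  // second half of U+1F600
```

The point of the sketch is that a wide CJK glyph and a narrow Latin letter are both a single code unit, so a width test cannot tell whether a character spans two input records.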

One could consider this a breaking change of the API,
as this can now result in repeat counts >1 for wide glyphs.
If someone complained about this change in behavior, I'd probably
not change it back, as narrow complex Unicode characters exist too.
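
To make the behavior change concrete, the following is a simplified sketch of key-event coalescing that skips only surrogate halves. The `KeyEvent` struct and `appendCoalesced` function are hypothetical stand-ins, not the actual InputBuffer code, which compares more fields than this.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a key input record.
struct KeyEvent
{
    char16_t ch;
    bool keyDown;
    std::uint16_t repeatCount;
};

// True for either half of a UTF-16 surrogate pair.
constexpr bool isSurrogateUnit(char16_t ch) noexcept
{
    return (ch & 0xF800) == 0xD800;
}

// Appends ev to buffer, merging it into the previous event when both are
// key-down events for the same non-surrogate character. Glyph width is
// deliberately not consulted, so wide glyphs can reach repeat counts > 1.
void appendCoalesced(std::vector<KeyEvent>& buffer, const KeyEvent& ev)
{
    if (!buffer.empty())
    {
        KeyEvent& last = buffer.back();
        const std::uint32_t sum = std::uint32_t{ last.repeatCount } + ev.repeatCount;
        if (ev.keyDown && last.keyDown && last.ch == ev.ch &&
            !isSurrogateUnit(ev.ch) && sum <= 0xFFFF)
        {
            last.repeatCount = static_cast<std::uint16_t>(sum);
            return;
        }
    }
    buffer.push_back(ev);
}
```

Under these assumptions, two consecutive key-down events for a wide CJK character collapse into one record with a repeat count of 2, while the two halves of a surrogate pair always stay as separate records in their original order.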

claunia added the pull-request label 2026-01-31 09:46:34 +00:00

Reference: starred/terminal#31322