[PR #4781] Stop padding text columns for fullwidth characters #25938

opened 2026-01-31 09:12:48 +00:00 by claunia · 0 comments

Original Pull Request: https://github.com/microsoft/terminal/pull/4781

State: closed
Merged: Yes


Summary of the Pull Request

Adjusts the column-padding code in `CustomTextLayout` to pad only for surrogate pairs, not for everything that reports two columns.

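References

  • See also #4747

PR Checklist

  • [x] Closes #4780
  • [x] I work here.
  • [x] Manual tests.
  • [x] No doc, this fixes code to match comment. Oops.
  • [x] Am core contributor. Also discussed with @leonMSFT.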

Detailed Description of the Pull Request / Additional comments

For surrogate pairs like high Unicode emoji, we receive two wchar_ts but only one column count (which is usually 2, because emoji are usually inscribed in the full-width squares). To compensate for this, I added a little padding function at the top of the `CustomTextLayout` construction that adds a column count of 0 aligned with the second half of a surrogate pair, so the text-to-glyph mapping lines up correctly.
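To make the "two wchar_ts, one column count" situation concrete, here is a minimal sketch of the standard UTF-16 encoding step. The names `EncodeUtf16`, `IsLeadingSurrogate`, and `IsTrailingSurrogate` are hypothetical helpers written for illustration and are not taken from the terminal codebase; `char16_t` stands in for the 16-bit `wchar_t` used on Windows.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-16.
// Assumes a valid scalar value (no range or surrogate validation).
std::u16string EncodeUtf16(char32_t codePoint)
{
    if (codePoint < 0x10000)
    {
        // Basic Multilingual Plane: a single code unit.
        return std::u16string(1, static_cast<char16_t>(codePoint));
    }

    // Supplementary planes: split the 20-bit offset into two 10-bit halves.
    const std::uint32_t offset = static_cast<std::uint32_t>(codePoint) - 0x10000;
    const auto leading = static_cast<char16_t>(0xD800 + (offset >> 10));
    const auto trailing = static_cast<char16_t>(0xDC00 + (offset & 0x3FF));
    return std::u16string{ leading, trailing };
}

// Hypothetical helpers: classify a single UTF-16 code unit.
constexpr bool IsLeadingSurrogate(char16_t unit)
{
    return unit >= 0xD800 && unit <= 0xDBFF;
}

constexpr bool IsTrailingSurrogate(char16_t unit)
{
    return unit >= 0xDC00 && unit <= 0xDFFF;
}
```

A full-width character such as U+5C41 encodes to one code unit, while an emoji such as U+1F3E0 encodes to a leading and a trailing surrogate, so the emoji's single column count has one fewer entry than its text has code units.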

Unfortunately, I made a mistake, either in the first place or while responding to PR feedback in #4747: the function padded out extra 0 columns based on the FIRST column count, not based on whether or not there was a trailing surrogate. The correct thing to do is to pad based on the LENGTH of the text associated with the given column count.

This means that full-width characters that can be represented in one wchar_t, like those coming from the IME in most cases (U+5C41, for example), will have a column count of 2. This is perfectly correct for mapping text to glyphs and doesn't need a 0 added after it. A house emoji (U+1F3E0) comes in as two wchar_ts (0xD83C 0xDFE0) with a column count of 2. To ensure that the arrays are aligned, the 2 matches up with the 0xD83C, but the 0xDFE0 needs a 0 on it so it will be skipped over. (Don't worry: because it's a surrogate, it's naturally consumed correctly by the glyph mapper.)
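The length-based rule can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the actual `CustomTextLayout` code: the `Cluster` struct and `BuildTextWidths` function are hypothetical names, and `char16_t` stands in for the 16-bit `wchar_t` used on Windows.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one piece of input: the UTF-16 code units of a
// single character plus the number of columns that character occupies.
struct Cluster
{
    std::u16string_view text; // 1 unit for BMP characters, 2 for a surrogate pair
    std::size_t columns;      // 1 for narrow, 2 for full-width
};

// Builds one width entry per UTF-16 code unit so that widths and text stay
// index-aligned. The first unit carries the real column count; each extra
// unit (the trailing surrogate) gets a 0.
std::vector<std::size_t> BuildTextWidths(const std::vector<Cluster>& clusters)
{
    std::vector<std::size_t> widths;
    for (const auto& cluster : clusters)
    {
        if (cluster.text.empty())
        {
            continue; // nothing to align against
        }

        widths.push_back(cluster.columns);

        // Pad by the LENGTH of the text, not by the column count.
        for (std::size_t i = 1; i < cluster.text.size(); ++i)
        {
            widths.push_back(0);
        }
    }
    return widths;
}
```

Fed a U+5C41 cluster followed by a U+1F3E0 cluster, both two columns wide, this produces the widths 2, 2, 0: the full-width ideograph gets no padding, and only the trailing surrogate 0xDFE0 gets a 0.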

The effect was that every OTHER character inserted by the IME was scaled to 0 size, because an advance of 0 was expected for 0 columns.
The fix restores the intended behavior: those characters no longer have a 0 count associated with them, so they aren't scaled.
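The "every other character" symptom can be reproduced with a small model of the regression. This is a hypothetical reconstruction for illustration only, assuming widths are matched to code units purely by position; `PadByColumnCount` and `ZeroWidthUnits` are invented names and do not appear in the terminal codebase.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the regression: emit the column count, then pad with
// (columns - 1) zeros regardless of how many code units the character used.
std::vector<std::size_t> PadByColumnCount(const std::vector<std::size_t>& columnsPerChar)
{
    std::vector<std::size_t> widths;
    for (const auto columns : columnsPerChar)
    {
        widths.push_back(columns);
        for (std::size_t i = 1; i < columns; ++i)
        {
            widths.push_back(0);
        }
    }
    return widths;
}

// Returns the indices of the code units that end up paired with a width of 0
// when widths are matched to the text by position.
std::vector<std::size_t> ZeroWidthUnits(const std::u16string& text,
                                        const std::vector<std::size_t>& widths)
{
    std::vector<std::size_t> zeroed;
    for (std::size_t i = 0; i < text.size() && i < widths.size(); ++i)
    {
        if (widths[i] == 0)
        {
            zeroed.push_back(i);
        }
    }
    return zeroed;
}
```

For four consecutive U+5C41 characters (one code unit and two columns each), the column-based padding yields 2, 0, 2, 0, and so on, which means the code units at indices 1 and 3 are paired with 0 and would be scaled away. Under the length-based rule no zeros are emitted for that input.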

Validation Steps Performed

  • Opened it up
  • Put in the house emoji like #4747 (U+1F3E0)
  • Put in some characters with simplified Chinese IME (fixed now)
  • Put in the utf83.txt sample text used in #4747
claunia added the pull-request label 2026-01-31 09:12:48 +00:00