[PR #13626] Rewrite ROW to be Unicode capable #29674

Closed
opened 2026-01-31 09:36:15 +00:00 by claunia · 0 comments
Owner

Original Pull Request: https://github.com/microsoft/terminal/pull/13626

State: closed
Merged: Yes


This commit is a from-scratch rewrite of ROW with the primary goal to get
rid of the rather bodgy UnicodeStorage class and improve Unicode support.

Previously a 120x9001 terminal buffer would store a vector of 9001 ROWs
where each ROW stored exactly 120 wchar_t. Glyphs exceeding their
allocated space would be stored in the UnicodeStorage which was basically
a hashmap<Coordinate, String>. Iterating over the text in a ROW would
require us to check each glyph and fetch it from the map conditionally.
On newlines we'd have to invalidate all map entries that are now gone,
so for every invalidated ROW we'd iterate through all glyphs again and if
a single one was stored in UnicodeStorage, we'd then iterate through the
entire hashmap to remove all coordinates that were residing on that ROW.
All in all, this wasn't the most robust nor performant code.

The new implementation is simple (from a design perspective):
Store all text in a ROW in a regular string. Grow the string if needed.
The association between columns and text works by storing character offsets
in a column-wide array. This algorithm is <100 LOC and removes ~1000.

As an aside this PR does a few more things that go hand in hand:

  • Remove most of ROW helper classes, which aren't needed anymore.
  • Allocate backing memory in a single VirtualAlloc call.
  • Rewrite IsCursorDoubleWidth to use DbcsAttrAt directly.
    Improves overall performance by 10-20% and makes this implementation
    faster than the previous NxM storage, despite the added complexity.

Part of #8000

Validation Steps Performed

  • Existing and new unit and feature tests complete
  • Printing Unicode completes without crashing
  • Resizing works without crashing
**Original Pull Request:** https://github.com/microsoft/terminal/pull/13626 **State:** closed **Merged:** Yes --- This commit is a from-scratch rewrite of `ROW` with the primary goal to get rid of the rather bodgy `UnicodeStorage` class and improve Unicode support. Previously a 120x9001 terminal buffer would store a vector of 9001 `ROW`s where each `ROW` stored exactly 120 `wchar_t`. Glyphs exceeding their allocated space would be stored in the `UnicodeStorage` which was basically a `hashmap<Coordinate, String>`. Iterating over the text in a `ROW` would require us to check each glyph and fetch it from the map conditionally. On newlines we'd have to invalidate all map entries that are now gone, so for every invalidated `ROW` we'd iterate through all glyphs again and if a single one was stored in `UnicodeStorage`, we'd then iterate through the entire hashmap to remove all coordinates that were residing on that `ROW`. All in all, this wasn't the most robust nor performant code. The new implementation is simple (from a design perspective): Store all text in a `ROW` in a regular string. Grow the string if needed. The association between columns and text works by storing character offsets in a column-wide array. This algorithm is <100 LOC and removes ~1000. As an aside this PR does a few more things that go hand in hand: * Remove most of `ROW` helper classes, which aren't needed anymore. * Allocate backing memory in a single `VirtualAlloc` call. * Rewrite `IsCursorDoubleWidth` to use `DbcsAttrAt` directly. Improves overall performance by 10-20% and makes this implementation faster than the previous NxM storage, despite the added complexity. Part of #8000 ## Validation Steps Performed * Existing and new unit and feature tests complete ✅ * Printing Unicode completes without crashing ✅ * Resizing works without crashing ✅
claunia added the pull-request label 2026-01-31 09:36:15 +00:00
Sign in to join this conversation.
No Label pull-request
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#29674