[PR #13626] [MERGED] Rewrite ROW to be Unicode capable #29669

Open
opened 2026-01-31 09:36:14 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/microsoft/terminal/pull/13626
Author: @lhecker
Created: 7/29/2022
Status: Merged
Merged: 11/11/2022
Merged by: @lhecker

Base: mainHead: dev/lhecker/8000-text-buffer-rework


📝 Commits (10+)

  • 1761d60 Rewrite ROW to be Unicode capable
  • 3fd2788 Fix resize with wide characters
  • ffcac71 Add test for buffer backup/restore
  • bd9f6ad Fix spell check
  • eddce54 Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework
  • ceea6e3 Finish merge, Address feedback
  • 9200b21 Fix a regression
  • 106b2ab Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework
  • a0c5847 Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework
  • ea16b1a Minor adjustments, Fix build

📊 Changes

59 files changed (+1265 additions, -2117 deletions)

View changed files

📝 .github/actions/spelling/allow/apis.txt (+5 -4)
📝 .github/actions/spelling/expect/expect.txt (+3 -0)
src/buffer/out/AttrRow.cpp (+0 -134)
src/buffer/out/AttrRow.hpp (+0 -68)
src/buffer/out/CharRow.cpp (+0 -281)
src/buffer/out/CharRow.hpp (+0 -117)
src/buffer/out/CharRowCell.cpp (+0 -73)
src/buffer/out/CharRowCell.hpp (+0 -65)
src/buffer/out/CharRowCellReference.cpp (+0 -136)
src/buffer/out/CharRowCellReference.hpp (+0 -63)
📝 src/buffer/out/DbcsAttribute.hpp (+14 -121)
📝 src/buffer/out/LineRendition.hpp (+1 -1)
📝 src/buffer/out/OutputCell.hpp (+1 -1)
📝 src/buffer/out/OutputCellIterator.cpp (+10 -33)
📝 src/buffer/out/OutputCellView.cpp (+1 -14)
📝 src/buffer/out/OutputCellView.hpp (+1 -1)
📝 src/buffer/out/Row.cpp (+565 -76)
📝 src/buffer/out/Row.hpp (+124 -45)
src/buffer/out/UnicodeStorage.cpp (+0 -98)
src/buffer/out/UnicodeStorage.hpp (+0 -65)

...and 39 more files

📄 Description

This commit is a from-scratch rewrite of ROW with the primary goal to get
rid of the rather bodgy UnicodeStorage class and improve Unicode support.

Previously a 120x9001 terminal buffer would store a vector of 9001 ROWs
where each ROW stored exactly 120 wchar_t. Glyphs exceeding their
allocated space would be stored in the UnicodeStorage which was basically
a hashmap<Coordinate, String>. Iterating over the text in a ROW would
require us to check each glyph and fetch it from the map conditionally.
On newlines we'd have to invalidate all map entries that are now gone,
so for every invalidated ROW we'd iterate through all glyphs again and if
a single one was stored in UnicodeStorage, we'd then iterate through the
entire hashmap to remove all coordinates that were residing on that ROW.
All in all, this wasn't the most robust nor performant code.

The new implementation is simple (from a design perspective):
Store all text in a ROW in a regular string. Grow the string if needed.
The association between columns and text works by storing character offsets
in a column-wide array. This algorithm is <100 LOC and removes ~1000.

As an aside this PR does a few more things that go hand in hand:

  • Remove most of ROW helper classes, which aren't needed anymore.
  • Allocate backing memory in a single VirtualAlloc call.
  • Rewrite IsCursorDoubleWidth to use DbcsAttrAt directly.
    Improves overall performance by 10-20% and makes this implementation
    faster than the previous NxM storage, despite the added complexity.

Part of #8000

Validation Steps Performed

  • Existing and new unit and feature tests complete
  • Printing Unicode completes without crashing
  • Resizing works without crashing

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/microsoft/terminal/pull/13626 **Author:** [@lhecker](https://github.com/lhecker) **Created:** 7/29/2022 **Status:** ✅ Merged **Merged:** 11/11/2022 **Merged by:** [@lhecker](https://github.com/lhecker) **Base:** `main` ← **Head:** `dev/lhecker/8000-text-buffer-rework` --- ### 📝 Commits (10+) - [`1761d60`](https://github.com/microsoft/terminal/commit/1761d6061fb65661aa9711632ccc33f542bd68b1) Rewrite ROW to be Unicode capable - [`3fd2788`](https://github.com/microsoft/terminal/commit/3fd27886d8d099efd1e93da962ea488c5a70942e) Fix resize with wide characters - [`ffcac71`](https://github.com/microsoft/terminal/commit/ffcac717a36babfad048de50afe5495dc390302a) Add test for buffer backup/restore - [`bd9f6ad`](https://github.com/microsoft/terminal/commit/bd9f6adce37e0f11822f30b7c47c1b1d123916c2) Fix spell check - [`eddce54`](https://github.com/microsoft/terminal/commit/eddce541e8edd5bc7dc0e4b108873bb34ad32766) Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework - [`ceea6e3`](https://github.com/microsoft/terminal/commit/ceea6e3734b01894797e9ec2e0c0f39ac37d3d87) Finish merge, Address feedback - [`9200b21`](https://github.com/microsoft/terminal/commit/9200b21eb51cf2f9bf674aef44585f4aecd93442) Fix a regression - [`106b2ab`](https://github.com/microsoft/terminal/commit/106b2ab72f840bf80a8044a4c6883babeb152716) Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework - [`a0c5847`](https://github.com/microsoft/terminal/commit/a0c58473f661626976c0b21e496b419794fc7f8b) Merge remote-tracking branch 'origin/main' into dev/lhecker/8000-text-buffer-rework - [`ea16b1a`](https://github.com/microsoft/terminal/commit/ea16b1a8db5a87daa7f9530ed08147eef4e965e3) Minor adjustments, Fix build ### 📊 Changes **59 files changed** (+1265 additions, -2117 deletions) <details> <summary>View changed files</summary> 📝 `.github/actions/spelling/allow/apis.txt` (+5 -4) 📝 `.github/actions/spelling/expect/expect.txt` (+3 -0) ➖ `src/buffer/out/AttrRow.cpp` (+0 -134) ➖ `src/buffer/out/AttrRow.hpp` (+0 -68) ➖ `src/buffer/out/CharRow.cpp` (+0 -281) ➖ `src/buffer/out/CharRow.hpp` (+0 -117) ➖ `src/buffer/out/CharRowCell.cpp` (+0 -73) ➖ `src/buffer/out/CharRowCell.hpp` (+0 -65) ➖ `src/buffer/out/CharRowCellReference.cpp` (+0 -136) ➖ `src/buffer/out/CharRowCellReference.hpp` (+0 -63) 📝 `src/buffer/out/DbcsAttribute.hpp` (+14 -121) 📝 `src/buffer/out/LineRendition.hpp` (+1 -1) 📝 `src/buffer/out/OutputCell.hpp` (+1 -1) 📝 `src/buffer/out/OutputCellIterator.cpp` (+10 -33) 📝 `src/buffer/out/OutputCellView.cpp` (+1 -14) 📝 `src/buffer/out/OutputCellView.hpp` (+1 -1) 📝 `src/buffer/out/Row.cpp` (+565 -76) 📝 `src/buffer/out/Row.hpp` (+124 -45) ➖ `src/buffer/out/UnicodeStorage.cpp` (+0 -98) ➖ `src/buffer/out/UnicodeStorage.hpp` (+0 -65) _...and 39 more files_ </details> ### 📄 Description This commit is a from-scratch rewrite of `ROW` with the primary goal to get rid of the rather bodgy `UnicodeStorage` class and improve Unicode support. Previously a 120x9001 terminal buffer would store a vector of 9001 `ROW`s where each `ROW` stored exactly 120 `wchar_t`. Glyphs exceeding their allocated space would be stored in the `UnicodeStorage` which was basically a `hashmap<Coordinate, String>`. Iterating over the text in a `ROW` would require us to check each glyph and fetch it from the map conditionally. On newlines we'd have to invalidate all map entries that are now gone, so for every invalidated `ROW` we'd iterate through all glyphs again and if a single one was stored in `UnicodeStorage`, we'd then iterate through the entire hashmap to remove all coordinates that were residing on that `ROW`. All in all, this wasn't the most robust nor performant code. The new implementation is simple (from a design perspective): Store all text in a `ROW` in a regular string. Grow the string if needed. The association between columns and text works by storing character offsets in a column-wide array. This algorithm is <100 LOC and removes ~1000. As an aside this PR does a few more things that go hand in hand: * Remove most of `ROW` helper classes, which aren't needed anymore. * Allocate backing memory in a single `VirtualAlloc` call. * Rewrite `IsCursorDoubleWidth` to use `DbcsAttrAt` directly. Improves overall performance by 10-20% and makes this implementation faster than the previous NxM storage, despite the added complexity. Part of #8000 ## Validation Steps Performed * Existing and new unit and feature tests complete ✅ * Printing Unicode completes without crashing ✅ * Resizing works without crashing ✅ --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
claunia added the pull-request label 2026-01-31 09:36:14 +00:00
Sign in to join this conversation.
No Label pull-request
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#29669