[PR #16377] Refactor TextBuffer::GenHTML/RTF to read the buffer directly #30907

Open
opened 2026-01-31 09:43:45 +00:00 by claunia · 0 comments

Original Pull Request: https://github.com/microsoft/terminal/pull/16377

State: closed
Merged: Yes


TextBuffer::GenHTML and TextBuffer::GenRTF now read directly from the TextBuffer.

  • Since we're reading from the buffer, we can now read all the attributes saved in the buffer. Formatted copy now copies most (if not all) font/color attributes in the requested format (RTF/HTML).
  • Use TextBuffer::CopyRequest to pass all copy-related options into text generation functions as one unit.
  • Helper function TextBuffer::CopyRequest::FromConfig() generates a copy request based on Selection mode and user configuration.
  • Both formatted text generation functions now use std::string and fmt::format_to to generate the required strings. Previously, we were using std::ostringstream, which is not recommended due to its potential overhead.
  • Reading attributes from ROW's attribute RLE simplified the logic, as we no longer have to track attribute changes across the text.
  • On the caller side, we do not have to rebuild the plain text string from the vector of strings anymore. TextBuffer::GetPlainText() returns the entire text as one std::string.
  • Removed TextBuffer::TextAndColors.
  • Removed TextBuffer::GetText(). TextBuffer::GetPlainText() took its place.
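
To illustrate why reading the attribute RLE removes the need to track attribute changes, here is a minimal sketch, not the code from this PR. `AttrRun`, `AppendHexColor`, `AppendEscaped`, and `RowToHtml` are hypothetical names, a run is reduced to a foreground color and a length, and plain `std::string` appends stand in for `fmt::format_to` so the sketch depends only on the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one run of a row's attribute RLE:
// `length` consecutive cells share the same foreground color.
struct AttrRun
{
    std::uint32_t rgb;  // 0xRRGGBB
    std::size_t length; // number of cells covered by this run
};

// Appends `rgb` as "#rrggbb".
void AppendHexColor(std::string& out, std::uint32_t rgb)
{
    assert(rgb <= 0xFFFFFF);
    static constexpr char digits[] = "0123456789abcdef";
    out.push_back('#');
    for (int shift = 20; shift >= 0; shift -= 4)
    {
        out.push_back(digits[(rgb >> shift) & 0xF]);
    }
}

// Appends `text` with the HTML-significant characters escaped.
void AppendEscaped(std::string& out, std::string_view text)
{
    for (const char ch : text)
    {
        switch (ch)
        {
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '&': out += "&amp;"; break;
        default: out.push_back(ch); break;
        }
    }
}

// Emits one <span> per attribute run. Because the runs already mark where
// attributes change, no cell-by-cell comparison is needed.
std::string RowToHtml(std::string_view text, const std::vector<AttrRun>& runs)
{
    std::string out;
    std::size_t offset = 0;
    for (const auto& run : runs)
    {
        if (offset >= text.size())
        {
            break;
        }
        // substr clamps the count, so a run that overshoots the text is safe.
        const auto chunk = text.substr(offset, run.length);
        out += "<span style=\"color:";
        AppendHexColor(out, run.rgb);
        out += "\">";
        AppendEscaped(out, chunk);
        out += "</span>";
        offset += chunk.size();
    }
    return out;
}
```

Each run maps to exactly one span, so the generator never has to compare a cell's attributes against the previous cell's.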

This PR also fixes two bugs in the formatted copy:

  • We were applying line breaks after each selected row, even though the row could have been a Wrapped row. This caused the wrapped rows to break when they shouldn't.
  • We mishandled Unicode escapes (\uN) within the RTF copy. Characters that use a surrogate pair or a high code point were going missing from the copied text when pasted into MSWord. The command \uc4 should have been \uc1: \ucN tells the reader how many fallback characters follow each Unicode escape (\u), and we always emit a single ? character as the fallback.
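
The `\uc1` fix can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `EscapeForRtf` and `AppendRtfUnit` are hypothetical helpers, and UTF-32 input is chosen only to make the surrogate split visible. In RTF, a Unicode-aware reader skips the number of fallback characters declared by `\ucN` after each `\u` escape, so declaring more than are actually written lets the reader skip real text. `\uN` takes a signed 16-bit decimal, so code units above 32767 are written as negative numbers.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Appends one UTF-16 code unit as an RTF \uN escape plus a single '?' fallback.
void AppendRtfUnit(std::string& out, unsigned int unit)
{
    // \uN is a signed 16-bit value, so units above 32767 wrap to negative.
    const int value = unit > 32767 ? static_cast<int>(unit) - 65536 : static_cast<int>(unit);
    out += "\\u";
    out += std::to_string(value);
    out.push_back('?');
}

// Escapes `text` for an RTF body. The leading \uc1 declares that exactly one
// fallback character follows every \u escape, matching what AppendRtfUnit writes.
// Control characters such as newlines are not handled in this sketch.
std::string EscapeForRtf(std::u32string_view text)
{
    std::string out = "\\uc1 ";
    for (const char32_t cp : text)
    {
        assert(cp <= 0x10FFFF);
        if (cp < 0x80)
        {
            // Backslash and braces are RTF syntax and need a backslash prefix.
            if (cp == U'\\' || cp == U'{' || cp == U'}')
            {
                out.push_back('\\');
            }
            out.push_back(static_cast<char>(cp));
        }
        else if (cp <= 0xFFFF)
        {
            AppendRtfUnit(out, static_cast<unsigned int>(cp));
        }
        else
        {
            // Split into a UTF-16 surrogate pair; each half gets its own escape.
            const auto v = static_cast<unsigned int>(cp) - 0x10000u;
            AppendRtfUnit(out, 0xD800u + (v >> 10));
            AppendRtfUnit(out, 0xDC00u + (v & 0x3FFu));
        }
    }
    return out;
}
```

For example, `EscapeForRtf(U"a\U0001F600")` yields `\uc1 a\u-10179?\u-8704?`: two escapes for the surrogate pair, each followed by one `?`.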

Closes #16191

References and Relevant Issues

  • #16270

Validation Steps Performed

  • Casual copy-pasting from Terminal or OpenConsole to word editors works as before.
  • Verified HTML copy by copying the generated HTML string and running it through an HTML viewer. Sample: https://codepen.io/tusharvickey/pen/wvNXbVN
  • Verified RTF copy by copy-pasting the generated RTF string into MSWord.
  • SingleLine mode works (Shift + copy)
  • BlockSelection mode works (Alt selection)

PR Checklist

  • Tests added/passed
claunia added the pull-request label 2026-01-31 09:43:45 +00:00
Reference: starred/terminal#30907