terminal

mirror of https://github.com/microsoft/terminal.git synced 2026-04-23 22:53:41 +00:00

Author	SHA1	Message	Date
Leonard Hecker	6b19d21845	Fix an output marks performance regression (#19242) An alternative approach for #18291. Improves perf by ~7%.	2025-08-13 18:10:18 +02:00
Javier	6e89242373	Multiple fixes to address DD CodeQL requirements (#18451 ) After taking in 1.22, our CodeQL process caught a few locations where we weren't following the right guidance: - Performing integer comparisons of different sizes which could lead to an infinite loop if the larger integer goes out of range of the smaller integer - Not checking HResult of a called method Co-authored-by: aphistra <102989060+aphistra@users.noreply.github.com>	2025-03-18 13:26:31 -05:00
Carlos Zamora	64d4fbab17	Make selection an exclusive range (#18106) Selection is generally stored as an inclusive start and end. This PR makes the end exclusive which now allows degenerate selections, namely in mark mode. This also modifies mouse selection to round to the nearest cell boundary (see #5099) and improves word boundaries to be a bit more modern and make sense for degenerate selections (similar to #15787). Closes #5099 Closes #13447 Closes #17892 ## Detailed Description of the Pull Request / Additional comments - Buffer, Viewport, and Point - Introduced a few new functions here to find word boundaries, delimiter class runs, and glyph boundaries. - 📝These new functions should be able to replace a few other functions (i.e. `GetWordStart` --> `GetWordStart2`). That migration is going to be a part of #4423 to reduce the risk of breaking UIA. - Viewport: added a few functions to handle navigating the _exclusive_ bounds (namely allowing RightExclusive as a position for buffer coordinates). This is important for selection to be able to highlight the entire line. - 📝`BottomInclusiveRightExclusive()` will replace `EndExclusive` in the UIA code - Point: `iterate_rows_exclusive` is similar to `iterate_rows`, except it has handling for RightExclusive - Renderer - Use `iterate_rows_exclusive` for proper handling (this actually fixed a lot of our issues) - Remove some workarounds in `_drawHighlighted` (this is a boundary where we got inclusive coords and made them exclusive, but now we don't need that!) - Terminal - fix selection marker rendering - `_ConvertToBufferCell()`: add a param to allow for RightExclusive or clamp it to RightInclusive (original behavior). Both are useful! - Use new `GetWordStart2` and `GetWordEnd2` to improve word boundaries and make them feel right now that the selection is an exclusive range. 
- Convert a few `IsInBounds` --> `IsInExclusiveBounds` for safety and correctness - Add `TriggerSelection` to `SelectNewRegion` - 📝 We normally called `TriggerSelection` in a different layer, but it turns out, UIA's `Select` function wouldn't actually update the renderer. Whoops! This fixes that. - TermControl - `_getTerminalPosition` now has a new param to round to the nearest cell (see #5099) - UIA - `TermControlUIAProvider::GetSelectionRange` no need to convert from inclusive range to exclusive range anymore! - `TextBuffer::GetPlainText` now works on an exclusive range, so no need to convert the range anymore! ## Validation Steps Performed This fundamental change impacts a lot of scenarios: - ✅Rendering selections - ✅Selection markers - ✅Copy text - ✅Session restore - ✅Mark mode navigation (i.e. character, word, line, buffer) - ✅Mouse selection (i.e. click+drag, shift+click, multi-click, alt+click) - ✅Hyperlinks (interaction and rendering) - ✅Accessibility (i.e. get selection, movement, text extraction, selecting text) - [ ] Prev/Next Command/Output (untested) - ✅Unit tests ## Follow-ups - Refs #4423 - Now that selection and UIA are both exclusive ranges, it should be a lot easier to deduplicate code between selection and UIA. We should be able to remove `EndExclusive` as well when we do that. This'll also be an opportunity to modernize that code and use more `til` classes.	2025-01-28 16:54:49 -06:00
Leonard Hecker	75f7ae4bec	AtlasEngine: Implement sixels (#17581) * Add a revision to `ImageSlice` so that the renderers can use it to cache them as bitmaps across frames. * Hooked up the revision tracking to AtlasEngine to cache the slices into `Buffer`s so we can own them into the `Present`. * Hooked up those snapshots to BackendD3D with a straightforward hashmap -> atlas-rect logic. Just like rendering text. * Hooked up BackendD2D with a bad, but simple & direct drawing logic. * Bonus: Modify `ImageSlice` to be returned as raw pointers as this helps performance slightly. (Trivial type == good.) * Bonus: Fixed the `_debugShowDirty` code (disabled by default). ## Validation Steps Performed * `mpv --really-quiet --vo=sixel foo.mp4` looks good ✅ * Scroll up and down & observe dirty rects ✅	2024-07-23 12:39:12 -07:00
Leonard Hecker	04c33f35c3	15x faster reflow in debug builds (#17550 ) STL iterators have a significant overhead. This improves performance of `GetLastNonSpaceColumn` by >100x (it's too large to measure), and reflow by ~15x in debug builds. This makes text reflow in debug builds today ~10x faster than it used to be in release builds before the large rewrites in #15701 and #13626.	2024-07-12 02:24:29 +00:00
James Holderness	236c0030f1	Add support for Sixel images in conhost (#17421 ) ## Summary of the Pull Request This PR introduces basic support for the Sixel graphics protocol in conhost, limited to the GDI renderer. ## References and Relevant Issues This is a first step towards supporting Sixel graphics in Windows Terminal (#448), but that will first require us to have some form of ConPTY passthrough (#1173). ## Detailed Description of the Pull Request / Additional comments There are three main parts to the architecture: * The `SixelParser` class takes care of parsing the incoming Sixel `DCS` sequence. * The resulting image content is stored in the text buffer in a series of `ImageSlice` objects, which represent per-row image content. * The renderer then takes care of painting those image slices for each affected row. The parser is designed to support multiple conformance levels so we can one day provide strict compatibility with the original DEC hardware. But for now the default behavior is intended to work with more modern Sixel applications. This is essentially the equivalent of a VT340 with 256 colors, so it should still work reasonably well as a VT340 emulator too. ## Validation Steps Performed Thanks to the work of @hackerb9, who has done extensive testing on a real VT340, we now have a fairly good understanding of how the original Sixel hardware terminals worked, and I've tried to make sure that our implementation matches that behavior as closely as possible. I've also done some testing with modern Sixel libraries like notcurses and jexer, but those typically rely on the terminal implementing certain proprietary Xterm query sequences which I haven't included in this PR. --------- Co-authored-by: Dustin L. Howett <dustin@howett.net>	2024-07-01 10:57:49 +00:00
Leonard Hecker	cb48babe9d	Implement grapheme clusters (#16916 ) First, this adds `GraphemeTableGen` which * parses `ucd.nounihan.grouped.xml` * computes the cluster break property for each codepoint * computes the East Asian Width property for each codepoint * compresses everything into a 4-stage trie * computes a LUT of cluster break rules between 2 codepoints * and serializes everything to C++ tables and helper functions Next, this adds `GraphemeTestTableGen` which * parses `GraphemeBreakTest.txt` * splits each test into graphemes and break opportunities * and serializes everything to a C++ table for use as unit tests `CodepointWidthDetector.cpp` was rewritten from scratch to * use an iterator struct (`GraphemeState`) to maintain state * accumulate codepoints until a break opportunity arises * accumulate the total width of a grapheme * support 3 different measurement modes: Grapheme clusters, `wcswidth`-style, and a mode identical to the old conhost With this in place the following changes were made: * `ROW::WriteHelper::_replaceTextUnicode` now uses the new grapheme cluster text iterators * The same function was modified to join new text with existing contents of the current cell if they join to form a cluster * Otherwise, a ton of places were modified to funnel the selection of the measurement mode over from WT's settings to ConPTY This is part of #1472 ## Validation Steps Performed * So many tests ✅ * https://github.com/apparebit/demicode works fantastic ✅ * UTF8-torture-test.txt works fantastic ✅	2024-06-26 18:40:27 +00:00
Leonard Hecker	4e7b63c664	A minor TSF refactoring (#17067) Next in the popular series of minor refactorings: Out with the old, in with the new! This PR removes all of the existing TSF code, both for conhost and Windows Terminal. conhost's TSF implementation was awful: It allocated an entire text buffer _per line_ of input. Additionally, its implementation spanned a whopping 40 files and almost 5000 lines of code. Windows Terminal's implementation was absolutely fine in comparison, but it was user unfriendly for two reasons: Its usage of the `CoreTextServices` WinRT API indirectly meant that it used a non-transitory TSF document, which is not the right choice for a terminal. A `TF_SS_TRANSITORY` document (-context) indicates to TSF that it cannot undo a previously completed composition which is exactly what we need: Once composition has completed we send the result to the shell and we cannot undo this later on. The WinRT API does not allow us to use `TF_SS_TRANSITORY` and so it's unsuitable for our application. Additionally, the implementation used XAML to render the composition instead of being part of our text renderer, which resulted in the text looking weird and hard to read. The new implementation spans just 8 files and is ~1000 lines which should make it significantly easier to maintain. The architecture is not particularly great, but it's certainly better than what we had. The implementation is almost entirely identical between both conhost and Windows Terminal and thus they both also behave identically. It fixes an uncountable number of subtle bugs in the conhost TSF implementation, as it failed to check for status codes after calls. It also adds several new features, like support for wavy underlines (as used by the Japanese IME), dashed underlines (the default for various languages now, like Vietnamese), colored underlines, colored foreground/background controlled by the IME, and more! 
I have tried to replicate the following issues and have a high confidence that they're resolved now: Closes #1304 Closes #3730 Closes #4052 Closes #5007 (as it is not applicable anymore) Closes #5110 Closes #6186 Closes #6192 Closes #13805 Closes #14349 Closes #14407 Closes #16180 For the following issues I'm not entirely sure if it'll fix it, but I suspect it's somewhat likely: #13681 #16305 #16817 Lastly, there's one remaining bug that I don't know how to resolve. However, that issue also plagues conhost and Windows Terminal right now, so it's at least not a regression: * Press Win+. (emoji picker) and close it * Move the window around * Press Win+. This will open the emoji picker at the old window location. It also occurs when the cursor moves within the window. While this is super annoying, I could not find a way to fix it. ## Validation Steps Performed * See the above closed issues * Use Vietnamese Telex and type "xin choaf" Results in "xin chào" ✅ * Use the MS Japanese IME and press Alt+` Toggles between the last 2 modes ✅ * Use the MS Japanese IME, type "kyouhaishaheiku", and press Space * The text is converted, underlined and the first part is doubly underlined ✅ * Left/Right moves between the 3 segments ✅ * Home/End moves between start/end ✅ * Esc puts a wavy line under the current segment ✅ * Use the Korean IME, type "gksgks" This results in "한한" ✅ * Use the Korean IME, type "gks", and press Right Ctrl Opens a popup which allows you to navigate with Arrow/Tab keys ✅	2024-04-18 17:47:28 +00:00
Leonard Hecker	4fd15c9937	Remove dependency on IsGlyphFullWidth for IRM/DECSWL (#16903) This gets rid of the implicit dependency on `IsGlyphFullWidth` for the IRM and DECSWL/DECDWL/DECDHL implementations. ## Validation Steps Performed In pwsh: * ``"`e[31mab`e[m`b`e[4h`e[32m$(''10)`e[m`e[4l"`` prints a red "a", 10 green "" and a red "b" ✅ * ``"`e[31mab`e[m`b`e[4h`e[32m$(''1000)`e[m`e[4l"`` prints a red "a" and a couple lines of green "" ✅ * ``"`e[31mf$('o'*70)`e[m`e#6`e#5"`` the right half of the row is erased ✅	2024-04-10 15:12:40 +00:00
Mike Griese	c3f44f7730	Rewrite how marks are stored & add reflow (#16937 ) This is pretty much a huge refactoring of how marks are stored in the buffer. Gone is the list of `ScrollMark`s in the buffer that store regions of text as points marking the ends. Those would be nigh impossible to reflow nicely. Instead, we're going to use `TextAttribute`s to store the kind of output we've got - `Prompt`, `Command`, `Output`, or, the default, `None`. Those already reflow nicely! But we also need to store things like, the exit code for the command. That's why we've now added `ScrollbarData` to `ROW`s. There's really only going to be one prompt->output on a single row. So, we only need to store one ScrollbarData per-row. When a command ends, we can just go update the mark on the row that started that command. But iterating over the whole buffer to find the next/previous prompt/command/output region sounds complicated. So, to avoid everyone needing to do some variant of that, we've added `MarkExtents` (which is literally just the same mark structure as before). TextBuffer can figure out where all the mark regions are, and hand that back to callers. This allows ControlCore to be basically unchanged. _But collecting up all the regions for all the marks sounds expensive! We need to update the scrollbar frequently, we can't just collect those up every time!_ No we can't! But we also don't need to. The scrollbar doesn't need to know where all the marks start and end and if they have commands and this and that - no. We only need to know the rows that have marks on them. So, we've now also got `ScrollMark` to represent just a mark on a scrollbar at a specific row on the buffer. We can get those quickly. * [x] I added a bunch of tests for this. * [x] I played with it and it feels good, even after a reflow (finally) * See: * #11000 * #15057 (I'm not marking this as closed. The stacked PR will close this, when I move marks to Stable)	2024-04-05 20:16:10 +00:00
Leonard Hecker	043d5cd484	Fix bugs in CharToColumnMapper (#16787 ) Aside from overall simplifying `CharToColumnMapper` this fixes 2 bugs: * The backward search loop may have iterated 1 column too far, because it didn't stop at `current <= target`, but rather at `(current - 1) <= target`. This issue was only apparent when surrogate pairs were being used in a row. * When the target offset is that of a trailing surrogate pair the forward search loop may have iterated 1 column too far. It's somewhat unlikely for this to happen since this code is only used through ICU, but you never know. This is a continuation of PR #16775.	2024-02-29 13:59:15 -08:00
Comzyh	e7796e7db3	Fix the hyperlink detection when there are leading wide glyph in the row (#16775 ) ## Summary of the Pull Request URL detection was broken again in #15858. When the regex matched, we calculate the column(cell) by its offset, we use forward or backward iteration of the column to find the correct column that displays the glyphs of `_chars[offset]`. `abf5d9423a/src/buffer/out/Row.cpp (L95-L104)` However, when calculating the `currentOffset` we forget that MSB of `_charOffsets[col]` could be `1`, or col is pointing to another glyph in preceding column. `abf5d9423a/src/buffer/out/Row.hpp (L223-L226)`	2024-02-28 16:34:40 +00:00
Tushar Singh	a3ac337d88	Refactor `TextBuffer::GenHTML/RTF` to read the buffer directly (#16377 ) `TextBuffer::GenHTML` and `TextBuffer::GenRTF` now read directly from the TextBuffer. - Since we're reading from the buffer, we can now read _all_ the attributes saved in the buffer. Formatted copy now copies most (if not all) font/color attributes in the requested format (RTF/HTML). - Use `TextBuffer::CopyRequest` to pass all copy-related options into text generation functions as one unit. - Helper function `TextBuffer::CopyRequest::FromConfig()` generates a copy request based on Selection mode and user configuration. - Both formatted text generation functions now use `std::string` and `fmt::format_to` to generate the required strings. Previously, we were using `std::ostringstream` which is not recommended due to its potential overhead. - Reading attributes from `ROW`'s attribute RLE simplified the logic as we don't have to track attribute change between the text. - On the caller side, we do not have to rebuild the plain text string from the vector of strings anymore. `TextBuffer::GetPlainText()` returns the entire text as one `std::string`. - Removed `TextBuffer::TextAndColors`. - Removed `TextBuffer::GetText()`. `TextBuffer::GetPlainText()` took its place. This PR also fixes two bugs in the formatted copy: - We were applying line breaks after each selected row, even though the row could have been a Wrapped row. This caused the wrapped rows to break when they shouldn't. - We mishandled Unicode text (\uN) within the RTF copy. Every next character that uses a surrogate pair or high codepoint was missing in the copied text when pasted to MSWord. The command `\uc4` should have been `\uc1`, which is used to tell how many fallback characters are used for each Unicode codepoint (\u). We always use one `?` character as the fallback. 
Closes #16191 References and Relevant Issues - #16270 Validation Steps Performed - Casual copy-pasting from Terminal or OpenConsole to word editors works as before. - Verified HTML copy by copying the generated HTML string and running it through an HTML viewer. [Sample](https://codepen.io/tusharvickey/pen/wvNXbVN) - Verified RTF copy by copy-pasting the generated RTF string into MSWord. - SingleLine mode works (<kbd>Shift</kbd>+ copy) - BlockSelection mode works (<kbd>Alt</kbd> selection)	2024-01-29 22:20:33 +00:00
Leonard Hecker	74748394c1	Reimplement TextBuffer::Reflow (#15701 ) Subjectively speaking, this commit makes 3 improvements: * Most importantly, it now would work with arbitrary Unicode text. (No more `IsGlyphFullWidth` or DBCS handling during reflow.) * Due to the simpler implementation it hopefully makes review of future changes and maintenance simpler. (~3x less LOC.) * It improves perf. by 1-2 orders of magnitude. (At 120x9001 with a full buffer I get 60ms -> 2ms.) Unfortunately, I'm not confident that the new code replicates the old code exactly, because I failed to understand it. During development I simply tried to match its behavior with what I think reflow should do. Closes #797 Closes #3088 Closes #4968 Closes #6546 Closes #6901 Closes #15964 Closes MSFT:19446208 Related to #5800 and #8000 ## Validation Steps Performed * Unit tests ✅ * Feature tests ✅ * Reflow with a scrollback ✅ * Reflowing the cursor cell causes a forced line-wrap ✅ (Even at the end of the buffer. ✅) * `color 8f` and reflowing retains the background color ✅ * Enter alt buffer, Resize window, Exit alt buffer ✅	2023-09-25 17:28:51 -07:00
Leonard Hecker	cd80f3c764	Use ICU for text search (#15858) The ultimate goal of this PR was to use ICU for text search to * Improve Unicode support Previously we used `towlower` and only supported BMP glyphs. * Improve search performance (10-100x) This allows us to search for all results in the entire text buffer at once without having to do so asynchronously. Unfortunately, this required some significant changes too: * ICU's search facilities operate on text positions which we need to be mapped back to buffer coordinates. This required the introduction of `CharToColumnMapper` to implement sort of a reverse-`_charOffsets` mapping. It turns text (character) positions back into coordinates. * Previously search restarted every time you clicked the search button. It used the current selection as the starting position for the new search. But since ICU's `uregex` cannot search backwards we're required to accumulate all results in a vector first and so we need to cache that vector in between searches. * We need to know when the cached vector became invalid and so we have to track any changes made to `TextBuffer`. The way this commit solves it is by splitting `GetRowByOffset` into `GetRowByOffset` for `const ROW` access and `GetMutableRowByOffset` which increments a mutation counter on each call. The `Search` instance can then compare its cached mutation count against the previous mutation count. Finally, this commit makes 2 semi-unrelated changes: * URL search now also uses ICU, since it's closely related to regular text search anyways. This significantly improves performance at large window sizes. * A few minor issues in `UiaTracing` were fixed. In particular 2 functions which passed strings as `wstring` by copy are now using `wstring_view` and `TraceLoggingCountedWideString`. 
Related to #6319 and #8000 ## Validation Steps Performed * Search upward/downward in conhost ✅ * Search upward/downward in WT ✅ * Searching for any of ß, ẞ, ss or SS matches any of the other ✅ * Searching for any of Σ, σ, or ς matches any of the other ✅	2023-08-24 22:56:40 +00:00
Leonard Hecker	e9c8391fd5	Fix compilation with Visual Studio 17.8 (#15819 ) This broke with https://github.com/microsoft/STL/pull/3721 It's a minor issue and a minor fix. :)	2023-08-11 15:17:18 +02:00
Leonard Hecker	167819a8f4	Add an ASCII fast-pass to ROW (#15499 ) Performance of printing enwik8.txt at the following block sizes: 4KiB (printf): 78MB/s -> 93MB/s 128KiB (cat): 117MB/s -> 156MB/s The change itself is rather self-explanatory. A tighter, simpler loop runs faster. ## Validation Steps Performed Mixed ASCII/Unicode text output looks generally correct. ✅	2023-07-05 21:26:15 +02:00
Leonard Hecker	94e6b91c78	We've been trying to reach you about your WriteCharsLegacy's extended Emoji support (#15567) This is a complete rewrite of the old `WriteCharsLegacy` function which is used when VT mode is disabled as well as for all interactive console input handling on Windows. The previous code was almost horrifying in some aspects as it first wrote the incoming text into a local buffer, stripping/replacing any control characters. That's not particularly fast and never was. It's unknown why it was like that. It also measured the width of each glyph to correctly determine the cursor position and line wrapping. Presumably this used to work quite well in the original console code, because it would then just copy that local buffer into the destination text buffer, but with the introduction of the broken and extremely slow `OutputCellIterator` abstraction this would end up measuring all text twice and cause disagreements between `WriteCharsLegacy`'s idea of the cursor position and `OutputCellIterator`'s cursor position. Emoji input was basically entirely broken. This PR fixes it by passing any incoming text straight to the `TextBuffer` as well as by using its cursor positioning facilities to correctly implement wrapping and backspace handling. Backspacing over Emojis and an array of other aspects still don't work correctly thanks to cmdline.cpp, but it works quite a lot better now. Related to #8000 Closes #8839 Closes #10808 ## Validation Steps Performed * Printing various Unicode text ✅ * On an fgets() input line * Typing text works ✅ * Inserting text works anywhere ✅ * Ctrl+X is translated to ^X ✅ * Null is translated to ^@ ✅ This was tested by hardcoding the `OutputMode` to 3 instead of 7. 
* Backspace only advances to start of the input ✅ * Backspace deletes the entire preceding tab ✅ * Backspace doesn't delete whitespace preceding a tab ✅ * Backspacing a force-wrapped wide glyph unwraps the line break ✅ * Backspacing ^X deletes both glyphs ✅ * Backspacing a force-wrapped tab deletes trailing whitespace ✅ * When executing ```cpp fputs("foo: ", stdout); fgets(buffer, stdin); ``` pressing tab and then backspace does not delete the whitespace that follows after the "foo:" string (= `sOriginalXPosition`).	2023-06-30 14:51:07 -05:00
Leonard Hecker	f0291c6501	Remove boolean success return values from TextBuffer (#15566) I've removed these because it made some of my new code pretty convoluted for no good reason as most of these functions aren't exception safe to begin with. Basically, their boolean status is often just a pretense because they can crash or throw anyways. Furthermore, `WriteCharsLegacy` failed to check the status code returned by `AdjustCursorPosition` in some of its parts too. In the future we should instead probably strive to continue to make our legacy code more exception safe.	2023-06-22 16:24:10 -07:00
Leonard Hecker	e594d97c90	Allow ROW::CopyRangeFrom to be vectorized (#15267) By rewriting the first major copy loop in `CopyRangeFrom` to use pointers/iterators instead of indices for iteration, the autovectorizer kicks in and neatly rewrites it as an unrolled SIMD loop. This improves performance during traditional window resizes by roughly 2x and will be quite helpful in the future for our more complex reflow resize. Unfortunately, MSVC unrolls the loop by 4x which is too much for our purpose, but there's no option to change that. It's still better than not having any vectorization however, since it kicks in at 32 columns. It also renames the function to `CopyTextFrom` to be more in line with the others and to avoid confusion, because it doesn't copy attributes. ## Validation Steps Performed * Traditional resizing works ✅	2023-06-22 17:17:46 -05:00
Leonard Hecker	b8f402f64b	Reduce cost of resetting row attributes (#15497 ) Performance of printing enwik8.txt at the following block sizes: 4KiB (printf): 54MB/s -> 54MB/s 128KiB (cat): 101MB/s -> 104MB/s ## Validation Steps Performed This change is easily verifiable via review.	2023-06-15 15:34:29 +00:00
Leonard Hecker	f3e2890084	Vectorize ROW initialization (#15501 ) Performance of printing enwik8.txt at the following block sizes: 4KiB (printf): 51MB/s -> 54MB/s 128KiB (cat): 92MB/s -> 103MB/s ## Validation Steps Performed * Rows are properly filled with whitespace at various window sizes as observed under a debugger ✅	2023-06-15 14:45:35 +00:00
Leonard Hecker	c183d12649	Move AdaptDispatch::_FillRect into TextBuffer (#15541) This commit makes 2 changes: * Expose dirty-range information from `ROW::CopyTextFrom` This will allow us to call `TriggerRedraw`, which is an aspect I haven't previously considered as something this API needs. * Add a `FillRect` API to `TextBuffer` and refactor `AdaptDispatch` to use that API. Even if we determine that the new text APIs are unfit (for instance too difficult to use), this will make it simpler to write efficient implementations right inside `TextBuffer`. Since the new `FillRect` API lacks bounds checks the way `WriteLine` has them, it breaks `AdaptDispatch::_EraseAll` which failed to adjust the bottom parameter after scrolling the contents. This would result in more rows being erased than intended. ## Validation Steps Performed * `chcp 65001` * Launch `pwsh` * ``"`e[29483`$x"`` fills the viewport with cats ✅ * `ResizeTraditional` still doesn't work any worse than it used to ✅	2023-06-14 14:34:42 -05:00
Leonard Hecker	612b00cd44	Initialize rows lazily (#15524 ) For a 120x9001 terminal, `a01500f` reduced the private working set of conhost by roughly 0.7MB, presumably due to tighter `ROW` packing, but also increased it by 2.1MB due to the addition of the `_charOffsets` array on each `ROW` instance. An option to fix this would be to only allocate a `_charOffsets` if the first wide or complex Unicode glyph is encountered. But on one hand this would be quite western-centric and unfairly hurt most languages that exist and on another we can get rid of the `_charOffsets` array entirely in the future by injecting ZWNJs if a write begins with a combining glyph and just recount each row from the start. That's still faster than fragmented memory. This commit goes a different way and instead reduces the working set of conhost after it launches from 7MB down to just 2MB, by only committing ROWs when they're first used. Finally, it adds a "scratchpad" row which can be used to build more complex contents, for instance to horizontally scroll them. ## Validation Steps Performed * Traditional resize * Horizontal shrinking works ✅ * Vertical shrinking works ✅ and cursor stays in the viewport ✅ * Reflow works ✅ * Filling the buffer with ASCII works ✅ and no leaks ✅ * Filling the buffer with complex Unicode works ✅ and no leaks ✅ * `^[[3J` erases scrollback ✅ * Test `ScrollRows` with a positive delta ✅ * I don't know how to test `Reset`. ❔ Unit tests use it though	2023-06-10 13:17:18 +00:00
Leonard Hecker	ecb5e37a7d	Use new row primitives for ResizeTraditional (#15105) This will allow us to share the same fundamental text insertion logic for both `ResizeTraditional` and `Reflow`, because both can be implemented with `ROW::CopyRangeFrom`. It also replaces the `BufferAllocator` struct with a `_allocateBuffer` function which will help us allocate scratch buffer rows in the future. Closes #14696 ## PR Checklist * Disable reflow resize in conhost * Print "zhwik8.txt" - an enwik8.txt equivalent of Chinese Wikipedia * Run `color 80` in cmd * Resize windows from 120 to 119 columns * Wide glyphs disappear and are replaced with whitespace ✅ * Resizing the window to >120 columns adds gray whitespace ✅	2023-04-05 09:59:20 -05:00
Leonard Hecker	f20cd3a9d3	Add an efficient text stream write function (#14821) This PR adds a couple foundational functions and classes to make our TextBuffer more performant and allow us to improve our Unicode correctness in the future, by getting rid of our dependence on `OutputCellIterator`. In the future we can then replace the simple UTF-16 code point iterator with a proper grapheme cluster iterator. While my focus is technically on Unicode correctness, the ~4x VT throughput increase in OpenConsole is pretty nice too. This PR adds: * A new, simpler ROW iterator (unused in this PR) * Cursor movement functions (`NavigateToPrevious`, `NavigateToNext`) They're based on functions that align the cursor to the start/end of the _current_ cell, so such functions can be added as well. * `ReplaceText` to write a raw string of text with the possibility to specify a right margin. * `CopyRangeFrom` will allow us to make reflow much faster, as it's able to bulk-copy already measured strings without re-measuring them. Related to #8000 ## Validation Steps Performed * enwik8.txt, zhwik8.txt, emoji-test.txt, all work with proper wide glyph reflow at the end of a row ✅ * This produces "a 咪" where only "a" has a white background: ```sh printf '\e7こん\e8\x1b[107ma\x1b[m\n' ``` * This produces "abん": ```sh stdbuf -o0 printf '\x1b7こん\x1b8a'; printf 'b\n' ``` * This produces "xy" at the end of the line: ```sh stdbuf -o0 printf '\e[999C\bこ\bx'; printf 'y\n' ``` * This produces red whitespace followed by "こ " in the default background color at the end of the line, and "ん" on the next line: ```sh printf '\e[41m\e[K\e[m\e[999C\e[2Dこん\n' ```	2023-03-24 17:20:53 -05:00
Leonard Hecker	9dcdcac0bb	Ignore CHAR_INFO trailers during WriteConsoleOutput (#14840 ) #13626 contains a small "regression" compared to #13321: It now began to store trailers in the buffer wherever possible to allow a region of the buffer to be backed up and restored via Read/WriteConsoleOutput. But we're unfortunately still ill-equipped to handle anything but UCS-2 via WriteConsoleOutput, so it's best to again ignore trailers just like in #13321. ## Validation Steps Performed * Added unit test ✅	2023-02-15 17:40:24 -06:00
Leonard Hecker	a01500f051	Rewrite ROW to be Unicode capable (#13626 ) This commit is a from-scratch rewrite of `ROW` with the primary goal to get rid of the rather bodgy `UnicodeStorage` class and improve Unicode support. Previously a 120x9001 terminal buffer would store a vector of 9001 `ROW`s where each `ROW` stored exactly 120 `wchar_t`. Glyphs exceeding their allocated space would be stored in the `UnicodeStorage` which was basically a `hashmap<Coordinate, String>`. Iterating over the text in a `ROW` would require us to check each glyph and fetch it from the map conditionally. On newlines we'd have to invalidate all map entries that are now gone, so for every invalidated `ROW` we'd iterate through all glyphs again and if a single one was stored in `UnicodeStorage`, we'd then iterate through the entire hashmap to remove all coordinates that were residing on that `ROW`. All in all, this wasn't the most robust nor performant code. The new implementation is simple (from a design perspective): Store all text in a `ROW` in a regular string. Grow the string if needed. The association between columns and text works by storing character offsets in a column-wide array. This algorithm is <100 LOC and removes ~1000. As an aside this PR does a few more things that go hand in hand: * Remove most of `ROW` helper classes, which aren't needed anymore. * Allocate backing memory in a single `VirtualAlloc` call. * Rewrite `IsCursorDoubleWidth` to use `DbcsAttrAt` directly. Improves overall performance by 10-20% and makes this implementation faster than the previous NxM storage, despite the added complexity. Part of #8000 ## Validation Steps Performed * Existing and new unit and feature tests complete ✅ * Printing Unicode completes without crashing ✅ * Resizing works without crashing ✅	2022-11-11 20:34:58 +01:00
Leonard Hecker	ed27737233	Use 32-bit coordinates throughout the project (#13025 ) Previously this project used a great variety of types to present text buffer coordinates: `short`, `unsigned short`, `int`, `unsigned int`, `size_t`, `ptrdiff_t`, `COORD`/`SMALL_RECT` (aka `short`), and more. This massive commit migrates almost all use of those types over to the centralized types `til::point`/`size`/`rect`/`inclusive_rect` and their underlying type `til::CoordType` (aka `int32_t`). Due to the size of the changeset and statistics I expect it to contain bugs. The biggest risk I see is that some code potentially, maybe implicitly, expected arithmetic to be mod 2^16 and that this code now allows it to be mod 2^32. Any narrowing into `short` later on would then throw exceptions. ## PR Checklist * [x] Closes #4015 * [x] I work here * [x] Tests added/passed ## Validation Steps Performed Casual usage of OpenConsole and Windows Terminal. ✅	2022-06-03 23:02:46 +00:00
Leonard Hecker	57c3953aca	Use type inference throughout the project (#12975 ) #4015 requires sweeping changes in order to allow a migration of our buffer coordinates from `int16_t` to `int32_t`. This commit reduces the size of future commits by using type inference wherever possible, dropping the need to manually adjust types throughout the project later. As an added bonus this commit standardizes the alignment of cv qualifiers to be always left of the type (e.g. `const T&` instead of `T const&`). The migration to type inference with `auto` was mostly done using JetBrains Resharper with some manual intervention and the standardization of cv qualifier alignment using clang-format 14. ## References This is preparation work for #4015. ## Validation Steps Performed * Tests pass ✅	2022-04-25 15:40:47 +00:00
Leonard Hecker	a8e4bedae3	Introduce til::rle - a run length encoded vector (#10099 ) ## Summary of the Pull Request Introduces `til::rle`, a vector-like container which stores elements of type T in a run length encoded format. This allows efficient compaction of repeated elements within the vector. ## References * #8000 - Supports buffer rewrite work. A re-use of `til::rle` will be useful as a column counter as we pursue NxM storage and presentation. * #3075 - The new iterators allow skipping forward by multiple units, which wasn't possible under `TextBuffer-/OutputCellIterator`. Additionally, it also allows bulk insertions. * #8787 and #410 - High probability this should be `pmr`-ified like `bitmap` for things like `chafa` and `cacafire` which are changing the run length frequently. ## PR Checklist * [x] Closes #8741 * [x] I work here. * [x] Tests added. * [x] Tests passed. ## Validation Steps Performed * [x] Ran `cacafire` in `OpenConsole.exe` and it looked beautiful * [x] Ran new suite of `RunLengthEncodingTests.cpp` Co-authored-by: Michael Niksa <miniksa@microsoft.com>	2021-05-20 17:27:50 +00:00
James Holderness	4c53c595e7	Add support for double-width/double-height lines in conhost (#8664 ) This PR adds support for the VT line rendition attributes, which allow for double-width and double-height line renditions. These renditions are enabled with the `DECDWL` (double-width line) and `DECDHL` (double-height line) escape sequences. Both reset to the default rendition with the `DECSWL` (single-width line) escape sequence. For now this functionality is only supported by the GDI renderer in conhost. There are a lot of changes, so this is just a general overview of the main areas affected. Previously it was safe to assume that the screen had a fixed width, at least for a given point in time. But now we need to deal with the possibility of different lines having different widths, so all the functions that are constrained by the right border (text wrapping, cursor movement operations, and sequences like `EL` and `ICH`) now need to look up the width of the active line in order to behave correctly. Similarly it used to be safe to assume that buffer and screen coordinates were the same thing, but that is no longer true. Lots of places now need to translate back and forth between coordinate systems dependent on the line rendition. This includes clipboard handling, the conhost color selection and search, accessibility location tracking and screen reading, IME editor positioning, "snapping" the viewport, and of course all the rendering calculations. For the rendering itself, I've had to introduce a new `PrepareLineTransform` method that the render engines can use to set up the necessary transform matrix for a given line rendition. This is also now used to handle the horizontal viewport offset, since that could no longer be achieved just by changing the target coordinates (on a double width line, the viewport offset may be halfway through a character). 
I've also had to change the renderer's existing `InvalidateCursor` method to take a `SMALL_RECT` rather than a `COORD`, to allow for the cursor being a variable width. Technically this was already a problem, because the cursor could occupy two screen cells when over a double-width character, but now it can be anything between one and four screen cells (e.g. a double-width character on a double-width line). In terms of architectural changes, there is now a new `lineRendition` field in the `ROW` class that keeps track of the line rendition for each row, and several new methods in the `ROW` and `TextBuffer` classes for manipulating that state. This includes a few helper methods for handling the various issues discussed above, e.g. position clamping and translating between coordinate systems. ## Validation Steps Performed I've manually confirmed all the double-width and double-height tests in _Vttest_ are now working as expected, and the _VT100 Torture Test_ now renders correctly (at least the line rendition aspects). I've also got my own test scripts that check many of the line rendition boundary cases and have confirmed that those are now passing. I've manually tested as many areas of the conhost UI as I could think of that might be affected by line rendition, including things like searching, selection, copying, and color highlighting. For accessibility, I've confirmed that the _Magnifier_ and _Narrator_ correctly handle double-width lines. And I've also tested the Japanese IME, which while not perfect, is at least usable. Closes #7865	2021-02-18 05:44:50 +00:00
Dustin L. Howett	e7592ec3d4	ROW: clean up in preparation to hide CharRow & AttrRow (#8446 ) Moving things out of CharRow into ROW helps us hide it as an implementation detail. This is part one of many. ### CharRow: Hide ClearCell, use ROW::ClearColumn ### CharRow: Hide GetText, use ROW::GetText ### CharRowBaseTests: remove dead file (never used!) ### CharRow: Move DoubleBytePadded into ROW ### CharRow: Move WrapForced into ROW ### Char/AttrRow: Hide Reset, use ROW::Reset ### Remove RowCellIterator (dead code) RCI was unused; it was replaced by TextBufferCellIterator shortly after its creation ### Move AttrRowTests to ut_textbuffer from ut_host It had no reliance on the host.	2021-01-20 21:16:56 +00:00
Austin Lamb	539a5dc0af	Greatly reduce allocations in the conhost/OpenConsole startup path (#8489 ) I was looking at conhost/OpenConsole and noticed it was being pretty inefficient with allocations due to some usages of std::deque and std::vector that didn't need to be done quite that way. So this uses std::vector for the TextBuffer's storage of ROW objects, which allows one allocation to contiguously reserve space for all the ROWs - on Desktop this is 9001 ROW objects which means it saves 9000 allocations that the std::deque would have done. Plus it has the benefit of increasing locality of the ROW objects since deque is going to chase pointers more often with its data structure. Then, within each ROW there are CharRow and ATTR_ROW objects that use std::vector today. This changes them to use Boost's small_vector, which is a variation of vector that allows for the so-called "small string optimization." Since we know the typical size of these vectors, we can pre-reserve the right number of elements directly in the CharRow/ATTR_ROW instances, avoiding any heap allocations at all for constructing these objects. There are a ton of variations on this "small_vector" concept out there in the world - this one in Boost, LLVM has one called SmallVector, Electronic Arts' STL has a small_vector, Facebook's folly library has one...there are a silly number of these out there. But Boost seems like it's by far the easiest to consume in terms of integration into this repo, the CI/CD pipeline, licensing, and stuff like that, so I went with the boost version. In terms of numbers, I measured the startup path of OpenConsole.exe on my dev box for Release x64 configuration. 
My box is an i7-6700k @ 4 GHz, with 32 GB RAM, not that I think machine config matters much here:
| | Allocation count | Allocated bytes | CPU usage (ms) |
| ------ | ------------------- | ------------------ | -------------- |
| Before | 29,461 | 4,984,640 | 103 |
| After | 2,459 (-91%) | 4,853,931 (-2.6%) | 96 (-7%) |
Along the way, I also fixed a dynamic initializer I happened to spot in the registry code, and updated some docs. ## Validation Steps Performed - Ran "runut", "runft" and "runuia" locally and confirmed results are the same as the main branch - Profiled the before/after numbers in the Visual Studio profiler, for the numbers shown in the table Co-authored-by: Austin Lamb <austinl@microsoft.com>	2020-12-16 10:40:30 -08:00
Michael Niksa	4351f32f5d	Commit attr runs less frequently by accumulating length of color run (#6919 ) The act of calling `InsertAttrRuns` is relatively slow. Instead of calling it a bunch of times to meddle with colors one cell at a time, we'll accumulate a length of color and call it to make it act all at once. This is great for when one color full line is getting replaced with another color full line OR when a line is being replaced with the same color all at once. There are significantly fewer checks to be made inside `InsertAttrRuns` if we can help it out by accumulating the length of each color before asking it to stitch it into the storage. ## Validation - Run `time cat big.txt` and `time cat ls.txt` under VS Performance Profiler.	2020-07-17 17:53:01 +00:00
Carlos Zamora	4dd9f9c180	make filling chars (and, thus, erase line/char) unset wrap (#2831 ) EraseInLine calls `FillConsoleOutputCharacterW()`. In filling the row with chars, we were setting the wrap flag. We need to specifically not do this on ANY _FILL_ operation. Now a fill operation UNSETS the wrap flag if we fill to the end of the line. Originally, we had a boolean `setWrap` that would mean... - true: if writing to the end of the row, SET the wrap value to true - false: if writing to the end of the row, DON'T CHANGE the wrap value Now we're making this bool a std::optional to allow for a ternary state. This allows for us to handle the following cases completely. Refer to the table below:
,- current wrap value
|   ,- are we filling the last cell in the row?
|   |   ,- new wrap value
|   |   |   ,- comments
|-- |-- |-- |
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 0 | 1 | 1 | THIS CASE WAS HANDLED CORRECTLY
| 1 | 0 | 0 | THIS CASE WAS UNHANDLED
| 1 | 0 | 1 |
| 1 | 1 | 1 |
To handle that special case (1-0-0), we need to UNSET the wrap. So now, we have ~setWrap~ `wrap` mean the following: - true: if writing to the end of the row, SET the wrap value to TRUE - false: if writing to the end of the row, SET the wrap value to FALSE - nullopt: leave the wrap value as it is Closes #1126	2019-09-30 18:16:31 -07:00
Michael Niksa	81ab5803aa	C26473, do not cast pointer back to the same type.	2019-09-03 09:44:19 -07:00
Michael Niksa	4f1157c044	C26447,C26440 - is noexcept but can throw or doesn't throw but not noexcept	2019-08-29 15:23:07 -07:00
adiviness	9b92986b49	add clang-format conf to the project, format the c++ code (#1141 )	2019-06-11 13:27:09 -07:00
Dustin Howett	d4d59fa339	Initial release of the Windows Terminal source code This commit introduces all of the Windows Terminal and Console Host source, under the MIT license.	2019-05-02 15:29:04 -07:00

40 Commits
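
The three-state `wrap` parameter described in commit 4dd9f9c180 can be illustrated with a minimal C++17 sketch. This is not the code from the repository: the `Row` struct, its `wrapForced` member, and the `applyWrap` function are hypothetical names chosen for illustration, and the sketch follows only the prose description in that commit message (true sets the flag, false clears it, nullopt leaves it alone).

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a buffer row; it tracks only the wrap flag.
struct Row
{
    bool wrapForced = false;
};

// Applies the tri-state `wrap` argument after a write.
// - true:    if the write reached the end of the row, set the wrap flag
// - false:   if the write reached the end of the row, clear the wrap flag
// - nullopt: leave the wrap flag unchanged
// (Assumption: the flag is only touched when the last cell was written.)
void applyWrap(Row& row, bool reachedEndOfRow, std::optional<bool> wrap)
{
    if (reachedEndOfRow && wrap.has_value())
    {
        row.wrapForced = *wrap;
    }
}
```

Under these assumptions, a fill operation would pass `false` so that filling through the last cell clears a previously set flag, an ordinary text write would pass `true`, and callers that must not disturb the flag would pass `std::nullopt`.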