Commit Graph

18 Commits

Author SHA1 Message Date
Leonard Hecker
c183d12649 Move AdaptDispatch::_FillRect into TextBuffer (#15541)
This commit makes 2 changes:
* Expose dirty-range information from `ROW::CopyTextFrom`
  This will allow us to call `TriggerRedraw`, which is an aspect
  I haven't previously considered as something this API needs.
* Add a `FillRect` API to `TextBuffer` and refactor `AdeptDispatch`
  to use that API. Even if we determine that the new text APIs are
  unfit (for instance too difficult to use), this will make it simpler
  to write efficient implementations right inside `TextBuffer`.

Since the new `FillRect` API lacks bounds checks the way `WriteLine`
has them, it breaks `AdaptDispatch::_EraseAll` which failed to adjust
the bottom parameter after scrolling the contents. This would result
in more rows being erased than intended.

## Validation Steps Performed
* `chcp 65001`
* Launch `pwsh`
* ``"`e[29483`$x"`` fills the viewport with cats 
* `ResizeTraditional` still doesn't work any worse than it used to 
2023-06-14 14:34:42 -05:00
Leonard Hecker
612b00cd44 Initialize rows lazily (#15524)
For a 120x9001 terminal, a01500f reduced the private working set of
conhost by roughly 0.7MB, presumably due to tighter `ROW` packing, but
also increased it by 2.1MB due to the addition of the `_charOffsets`
array on each `ROW` instance. An option to fix this would be to only
allocate a `_charOffsets` if the first wide or complex Unicode glyph
is encountered. But on one hand this would be quite western-centric
and unfairly hurt most languages that exist and on another we can get
rid of the `_charOffsets` array entirely in the future by injecting
ZWNJs if a write begins with a combining glyph and just recount each
row from the start. That's still faster than fragmented memory.

This commit goes a different way and instead reduces the working
set of conhost after it launches from 7MB down to just 2MB,
by only committing ROWs when they're first used.

Finally, it adds a "scratchpad" row which can be used to build
more complex contents, for instance to horizontally scroll them.

## Validation Steps Performed
* Traditional resize
  * Horizontal shrinking works 
  * Vertical shrinking works  and cursor stays in the viewport 
* Reflow works 
* Filling the buffer with ASCII works  and no leaks 
* Filling the buffer with complex Unicode works  and no leaks 
* `^[[3J` erases scrollback 
* Test `ScrollRows` with a positive delta 
* I don't know how to test `Reset`.  Unit tests use it though
2023-06-10 13:17:18 +00:00
Leonard Hecker
ecb5e37a7d Use new row primitives for ResizeTraditional (#15105)
This will allow us to share the same fundamental text insertion
logic for both `ResizeTraditional` and `Reflow`, because both
can be implemented with `ROW::CopyRangeFrom`. It also replaces
the `BufferAllocator` struct with a `_allocateBuffer` function
which will help us allocate scratch buffer rows in the future.

Closes #14696

## PR Checklist
* Disable reflow resize in conhost
* Print "zhwik8.txt" - a enwik8.txt equivalent of Chinese Wikipedia
* Run `color 80` in cmd
* Resize windows from 120 to 119 columns
* Wide glyphs disappear and are replaced with whitespace 
* Resizing the window to >120 columns adds gray whitespace 
2023-04-05 09:59:20 -05:00
Leonard Hecker
f20cd3a9d3 Add an efficient text stream write function (#14821)
This adds PR adds a couple foundational functions and classes to make
our TextBuffer more performant and allow us to improve our Unicode
correctness in the future, by getting rid of our dependence on
`OutputCellIterator`. In the future we can then replace the simple
UTF-16 code point iterator with a proper grapheme cluster iterator.

While my focus is technically on Unicode correctness, the ~4x VT
throughput increase in OpenConsole is pretty nice too.

This PR adds:
* A new, simpler ROW iterator (unused in this PR)
* Cursor movement functions (`NavigateToPrevious`, `NavigateToNext`)
  They're based on functions that align the cursor to the start/end
  of the _current_ cell, so such functions can be added as well.
* `ReplaceText` to write a raw string of text with the possibility to
  specify a right margin.
* `CopyRangeFrom` will allow us to make reflow much faster, as it's able
  to bulk-copy already measured strings without re-measuring them.

Related to #8000

## Validation Steps Performed
* enwik8.txt, zhwik8.txt, emoji-test.txt, all work with proper
  wide glyph reflow at the end of a row 
* This produces "a 咪" where only "a" has a white background:
  ```sh
  printf '\e7こん\e8\x1b[107ma\x1b[m\n'
  ```
* This produces "abん":
  ```sh
  stdbuf -o0 printf '\x1b7こん\x1b8a'; printf 'b\n'
  ```
* This produces "xy" at the end of the line:
  ```sh
  stdbuf -o0 printf '\e[999C\bこ\bx'; printf 'y\n'
  ```
* This produces red whitespace followed by "こ " in the default
  background color at the end of the line, and "ん" on the next line:
  ```sh
  printf '\e[41m\e[K\e[m\e[999C\e[2Dこん\n'
  ```
2023-03-24 17:20:53 -05:00
Leonard Hecker
9dcdcac0bb Ignore CHAR_INFO trailers during WriteConsoleOutput (#14840)
#13626 contains a small "regression" compared to #13321:
It now began to store trailers in the buffer wherever possible to allow
a region
of the buffer to be backed up and restored via Read/WriteConsoleOutput.
But we're unfortunately still ill-equipped to handle anything but UCS-2
via
WriteConsoleOutput, so it's best to again ignore trailers just like in
#13321.

## Validation Steps Performed
* Added unit test 
2023-02-15 17:40:24 -06:00
Leonard Hecker
a01500f051 Rewrite ROW to be Unicode capable (#13626)
This commit is a from-scratch rewrite of `ROW` with the primary goal to get
rid of the rather bodgy `UnicodeStorage` class and improve Unicode support.

Previously a 120x9001 terminal buffer would store a vector of 9001 `ROW`s
where each `ROW` stored exactly 120 `wchar_t`. Glyphs exceeding their
allocated space would be stored in the `UnicodeStorage` which was basically
a `hashmap<Coordinate, String>`. Iterating over the text in a `ROW` would
require us to check each glyph and fetch it from the map conditionally.
On newlines we'd have to invalidate all map entries that are now gone,
so for every invalidated `ROW` we'd iterate through all glyphs again and if
a single one was stored in `UnicodeStorage`, we'd then iterate through the
entire hashmap to remove all coordinates that were residing on that `ROW`.
All in all, this wasn't the most robust nor performant code.

The new implementation is simple (from a design perspective):
Store all text in a `ROW` in a regular string. Grow the string if needed.
The association between columns and text works by storing character offsets
in a column-wide array. This algorithm is <100 LOC and removes ~1000.

As an aside this PR does a few more things that go hand in hand:
* Remove most of `ROW` helper classes, which aren't needed anymore.
* Allocate backing memory in a single `VirtualAlloc` call.
* Rewrite `IsCursorDoubleWidth` to use `DbcsAttrAt` directly.
  Improves overall performance by 10-20% and makes this implementation
  faster than the previous NxM storage, despite the added complexity.

Part of #8000

## Validation Steps Performed
* Existing and new unit and feature tests complete 
* Printing Unicode completes without crashing 
* Resizing works without crashing 
2022-11-11 20:34:58 +01:00
Leonard Hecker
ed27737233 Use 32-bit coordinates throughout the project (#13025)
Previously this project used a great variety of types to present text buffer
coordinates: `short`, `unsigned short`, `int`, `unsigned int`, `size_t`,
`ptrdiff_t`, `COORD`/`SMALL_RECT` (aka `short`), and more.
This massive commit migrates almost all use of those types over to the
centralized types `til::point`/`size`/`rect`/`inclusive_rect` and their
underlying type `til::CoordType` (aka `int32_t`).

Due to the size of the changeset and statistics I expect it to contain bugs.
The biggest risk I see is that some code potentially, maybe implicitly, expected
arithmetic to be mod 2^16 and that this code now allows it to be mod 2^32.
Any narrowing into `short` later on would then throw exceptions.

## PR Checklist
* [x] Closes #4015
* [x] I work here
* [x] Tests added/passed

## Validation Steps Performed
Casual usage of OpenConsole and Windows Terminal. 
2022-06-03 23:02:46 +00:00
Leonard Hecker
57c3953aca Use type inference throughout the project (#12975)
#4015 requires sweeping changes in order to allow a migration of our buffer
coordinates from `int16_t` to `int32_t`. This commit reduces the size of
future commits by using type inference wherever possible, dropping the
need to manually adjust types throughout the project later.

As an added bonus this commit standardizes the alignment of cv qualifiers
to be always left of the type (e.g. `const T&` instead of `T const&`).

The migration to type inference with `auto` was mostly done
using JetBrains Resharper with some manual intervention and the
standardization of cv qualifier alignment using clang-format 14.

## References

This is preparation work for #4015.

## Validation Steps Performed
* Tests pass 
2022-04-25 15:40:47 +00:00
Leonard Hecker
a8e4bedae3 Introduce til::rle - a run length encoded vector (#10099)
## Summary of the Pull Request

Introduces `til::rle`, a vector-like container which stores elements of
type T in a run length encoded format. This allows efficient compaction
of repeated elements within the vector.

## References

* #8000 - Supports buffer rewrite work. A re-use of `til::rle` will be
  useful as a column counter as we pursue NxM storage and presentation.
* #3075 - The new iterators allow skipping forward by multiple units,
  which wasn't possible under `TextBuffer-/OutputCellIterator`.
  Additionally it also allows a bulk insertions.
* #8787 and #410 - High probability this should be `pmr`-ified
  like `bitmap` for things like `chafa` and `cacafire`
  which are changing the run length frequently.

## PR Checklist

* [x] Closes #8741
* [x] I work here.
* [x] Tests added.
* [x] Tests passed.

## Validation Steps Performed

* [x] Ran `cacafire` in `OpenConsole.exe` and it looked beautiful
* [x] Ran new suite of `RunLengthEncodingTests.cpp`

Co-authored-by: Michael Niksa <miniksa@microsoft.com>
2021-05-20 17:27:50 +00:00
James Holderness
4c53c595e7 Add support for double-width/double-height lines in conhost (#8664)
This PR adds support for the VT line rendition attributes, which allow
for double-width and double-height line renditions. These renditions are
enabled with the `DECDWL` (double-width line) and `DECDHL`
(double-height line) escape sequences. Both reset to the default
rendition with the `DECSWL` (single-width line) escape sequence. For now
this functionality is only supported by the GDI renderer in conhost.

There are a lot of changes, so this is just a general overview of the
main areas affected.

Previously it was safe to assume that the screen had a fixed width, at
least for a given point in time. But now we need to deal with the
possibility of different lines have different widths, so all the
functions that are constrained by the right border (text wrapping,
cursor movement operations, and sequences like `EL` and `ICH`) now need
to lookup the width of the active line in order to behave correctly.

Similarly it used to be safe to assume that buffer and screen
coordinates were the same thing, but that is no longer true. Lots of
places now need to translate back and forth between coordinate systems
dependent on the line rendition. This includes clipboard handling, the
conhost color selection and search, accessibility location tracking and
screen reading, IME editor positioning, "snapping" the viewport, and of
course all the rendering calculations.

For the rendering itself, I've had to introduce a new
`PrepareLineTransform` method that the render engines can use to setup
the necessary transform matrix for a given line rendition. This is also
now used to handle the horizontal viewport offset, since that could no
longer be achieved just by changing the target coordinates (on a double
width line, the viewport offset may be halfway through a character).

I've also had to change the renderer's existing `InvalidateCursor`
method to take a `SMALL_RECT` rather than a `COORD`, to allow for the
cursor being a variable width. Technically this was already a problem,
because the cursor could occupy two screen cells when over a
double-width character, but now it can be anything between one and four
screen cells (e.g. a double-width character on the double-width line).

In terms of architectural changes, there is now a new `lineRendition`
field in the `ROW` class that keeps track of the line rendition for each
row, and several new methods in the `ROW` and `TextBuffer` classes for
manipulating that state. This includes a few helper methods for handling
the various issues discussed above, e.g. position clamping and
translating between coordinate systems.

## Validation Steps Performed

I've manually confirmed all the double-width and double-height tests in
_Vttest_ are now working as expected, and the _VT100 Torture Test_ now
renders correctly (at least the line rendition aspects). I've also got
my own test scripts that check many of the line rendition boundary cases
and have confirmed that those are now passing.

I've manually tested as many areas of the conhost UI that I could think
of, that might be affected by line rendition, including things like
searching, selection, copying, and color highlighting. For
accessibility, I've confirmed that the _Magnifier_ and _Narrator_
correctly handle double-width lines. And I've also tested the Japanese
IME, which while not perfect, is at least useable.

Closes #7865
2021-02-18 05:44:50 +00:00
Dustin L. Howett
e7592ec3d4 ROW: clean up in preparation to hide CharRow & AttrRow (#8446)
Moving things out of CharRow into ROW helps us hide it as an implementation detail.
This is part one of many.

### CharRow: Hide ClearCell, use ROW::ClearColumn

### CharRow: Hide GetText, use ROW::GetText

### CharRowBaseTests: remove dead file (never used!)

### CharRow: Move DoubleBytePadded into ROW

### CharRow: Move WrapForced into ROW

### Char/AttrRow: Hide Reset, use ROW::Reset

### Remove RowCellIterator (dead code)

RCI was unused; it was replaced by TextBufferCellIterator shortly after its creation

### Move AttrRowTests to ut_textbuffer from ut_host

It had no reliance on the host.
2021-01-20 21:16:56 +00:00
Austin Lamb
539a5dc0af Greatly reduce allocations in the conhost/OpenConsole startup path (#8489)
I was looking at conhost/OpenConsole and noticed it was being pretty
inefficient with allocations due to some usages of std::deque and
std::vector that didn't need to be done quite that way.

So this uses std::vector for the TextBuffer's storage of ROW objects,
which allows one allocation to contiguously reserve space for all the
ROWs - on Desktop this is 9001 ROW objects which means it saves 9000
allocations that the std::deque would have done.  Plus it has the
benefit of increasing locality of the ROW objects since deque is going
to chase pointers more often with its data structure.

Then, within each ROW there are CharRow and ATTR_ROW objects that use
std::vector today.  This changes them to use Boost's small_vector, which
is a variation of vector that allows for the so-called "small string
optimization."  Since we know the typical size of these vectors, we can
pre-reserve the right number of elements directly in the
CharRow/ATTR_ROW instances, avoiding any heap allocations at all for
constructing these objects.

There are a ton of variations on this "small_vector" concept out there
in the world - this one in Boost, LLVM has one called SmallVector,
Electronic Arts' STL has a small_vector, Facebook's folly library has
one...there are a silly number of these out there.  But Boost seems like
it's by far the easiest to consume in terms of integration into this
repo, the CI/CD pipeline, licensing, and stuff like that, so I went with
the boost version.

In terms of numbers, I measured the startup path of OpenConsole.exe on
my dev box for Release x64 configuration.  My box is an i7-6700k @ 4
Ghz, with 32 GB RAM, not that I think machine config matters much here:

|        | Allocation count    | Allocated bytes    | CPU usage (ms) |
| ------ | ------------------- | ------------------ | -------------- |
| Before | 29,461              | 4,984,640          | 103            |
| After  | 2,459 (-91%)        | 4,853,931 (-2.6%)  | 96 (-7%)       |

Along the way, I also fixed a dynamic initializer I happened to spot in
the registry code, and updated some docs.

## Validation Steps Performed
- Ran "runut", "runft" and "runuia" locally and confirmed results are
  the same as the main branch
- Profiled the before/after numbers in the Visual Studio profiler, for
  the numbers shown in the table

Co-authored-by: Austin Lamb <austinl@microsoft.com>
2020-12-16 10:40:30 -08:00
Michael Niksa
4351f32f5d Commit attr runs less frequently by accumulating length of color run (#6919)
The act of calling `InsertAttrRuns` is relatively slow. Instead of
calling it a bunch of times to meddle with colors one cell at a time,
we'll accumulate a length of color and call it to make it act all at
once. This is great for when one color full line is getting replaced
with another color full line OR when a line is being replaced with the
same color all at once. There's significantly fewer checks to be made
inside `InsertAttrRuns` if we can help it out by accumulating the length
of each color before asking it to stitch it into the storage.

Validation
----------

- Run `time cat big.txt` and `time cat ls.txt` under VS Performance
  Profiler.
2020-07-17 17:53:01 +00:00
Carlos Zamora
4dd9f9c180 make filling chars (and, thus, erase line/char) unset wrap (#2831)
EraseInLine calls `FillConsoleOutputCharacterW()`. In filling the row with
chars, we were setting the wrap flag. We need to specifically not do this on
ANY _FILL_ operation. Now a fill operation UNSETS the wrap flag if we fill to
the end of the line.

Originally, we had a boolean `setWrap` that would mean...
- **true**: if writing to the end of the row, SET the wrap value to true
- **false**: if writing to the end of the row, DON'T CHANGE the wrap value

Now we're making this bool a std::optional to allow for a ternary state. This
allows for us to handle the following cases completely. Refer to the table
below:

,- current wrap value
|     ,- are we filling the last cell in the row?
|     |     ,- new wrap value
|     |     |     ,- comments
|--   |--   |--   |
| 0   | 0   | 0   |
| 0   | 1   | 0   |
| 0   | 1   | 1   | THIS CASE WAS HANDLED CORRECTLY
| 1   | 0   | 0   | THIS CASE WAS UNHANDLED
| 1   | 0   | 1   |
| 1   | 1   | 1   |

To handle that special case (1-0-0), we need to UNSET the wrap. So now, we have
~setWrap~ `wrap` mean the following:
- **true**: if writing to the end of the row, SET the wrap value to TRUE
- **false**: if writing to the end of the row, SET the wrap value to FALSE
- **nullopt**: leave the wrap value as it is

Closes #1126
2019-09-30 18:16:31 -07:00
Michael Niksa
81ab5803aa C26473, do not cast pointer back to the same type. 2019-09-03 09:44:19 -07:00
Michael Niksa
4f1157c044 C26447,C26440 - is noexcept but can throw or doesn't throw but not noexcept 2019-08-29 15:23:07 -07:00
adiviness
9b92986b49 add clang-format conf to the project, format the c++ code (#1141) 2019-06-11 13:27:09 -07:00
Dustin Howett
d4d59fa339 Initial release of the Windows Terminal source code
This commit introduces all of the Windows Terminal and Console Host source,
under the MIT license.
2019-05-02 15:29:04 -07:00