Regression vs legacy conhost: different behavior from ReadConsoleOutputCharacterW when surrogate pair(s) are present. #21415

Open
opened 2026-01-31 07:44:04 +00:00 by claunia · 0 comments
Owner

Originally created by @chrisant996 on GitHub (Mar 18, 2024).

Originally assigned to: @lhecker on GitHub.

Windows Terminal version

1.19.10573.0

Windows build number

10.0.19045.4046

Other Software

This is an API issue, and affects any software that calls ReadConsoleOutputCharacterW.
The API issue only repros with Windows Terminal, not with legacy conhost (nor with ConEmu, ConsoleZ, etc).

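For example, clink (https://github.com/chrisant996/clink) was affected by this when used together with eza (https://github.com/eza-community/eza) or dirx (https://github.com/chrisant996/dirx); see https://github.com/chrisant996/clink/issues/574#issuecomment-1996293488 for details of how this was encountered in real-world usage.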

Steps to reproduce

When a line of text in the console screen buffer contains one or more surrogate pairs, the behavior of the ReadConsoleOutputCharacterW API does not match the documented contract and differs from its behavior in legacy conhost.

The attached repro program demonstrates the behavior:

  • It works as expected in legacy conhost (and ConEmu, ConsoleZ, etc).
  • It malfunctions in Windows Terminal.

repro_surrogate_pairs_issue.zip (https://github.com/microsoft/terminal/files/14640769/repro_surrogate_pairs_issue.zip)

Repro:

  • Use WriteConsoleW to print a line of text that includes one or more surrogate pairs.
  • Use ReadConsoleOutputCharacterW to read back the same line.
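
For readers unfamiliar with the term, the following is a minimal sketch of what "contains a surrogate pair" means for text handed to a W-suffixed console API, which works in UTF-16 code units. The two code points are illustrative assumptions and are not necessarily the icons used in the repro; the helper names are hypothetical.

```python
# Illustrative code points only (assumptions, not taken from the repro program).
BMP_ICON = "\ue61d"                # Private Use Area glyph: one UTF-16 code unit
SUPPLEMENTARY_ICON = "\U000f0001"  # Supplementary Private Use Area-A glyph: two units


def utf16_units(text):
    """Return the UTF-16 code units that represent `text`."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def has_surrogate_pair(text):
    """True if `text` contains a high surrogate immediately followed by a low one."""
    units = utf16_units(text)
    return any(
        0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF
        for hi, lo in zip(units, units[1:])
    )


if __name__ == "__main__":
    for label, icon in (("BMP", BMP_ICON), ("supplementary", SUPPLEMENTARY_ICON)):
        line = "-a--- 11k 13 Mar 14:44 " + icon + " Bbb.xml"
        print(label, [hex(u) for u in utf16_units(icon)], has_surrogate_pair(line))
```

A line built with a code point above U+FFFF is the kind of input the repro steps describe; a line whose icon lies in the Basic Multilingual Plane contains no surrogate pair.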

Easy demonstration program:

  1. Create a folder and extract the files from the .zip file linked above.
  2. Run the signed Repro.exe file.
  3. Alternatively, build the source files and run the locally built copy of Repro.exe.

Expected Behavior

ReadConsoleOutputCharacterW should:

  1. Fill the out param lpNumberOfCharsRead with the number of characters read (e.g. the width of the console).
  2. Fill the out param lpCharacter with characters read from the console.
  3. Return true (success).

The demo program should first write 4 lines, then read each line back and verify that it matches what was originally written.
Each line contains Unicode code points that correspond to certain Nerd Fonts (https://nerdfonts.com) icons.

Sample expected output:

OUTPUT:

-a--- 17k 13 Mar 14:38  Aaa.cpp
-a--- 11k 13 Mar 14:44 󱗀 Bbb.xml
-a--- 10k 13 Mar 14:43 󰅲 Ccc.lisp
-a--- 11k 13 Mar 14:44  Ddd.zip

RESULTS:

Line 1 len 120 matches : -a--- 17k 13 Mar 14:38  Aaa.cpp
Line 2 len 120 matches : -a--- 11k 13 Mar 14:44 󱗀 Bbb.xml
Line 3 len 120 matches : -a--- 10k 13 Mar 14:43 󰅲 Ccc.lisp
Line 4 len 120 matches : -a--- 11k 13 Mar 14:44  Ddd.zip

Actual Behavior

Only in Windows Terminal (all versions; 1.19, 1.20, 1.20 canary), ReadConsoleOutputCharacterW:

  1. Fills the out param lpNumberOfCharsRead with 0.
  2. Does not fill the out param lpCharacter.
  3. Returns true (success).

Problem 1: It reads nothing; it should read text successfully, the same as in legacy conhost.
Problem 2: It reports success; that's inaccurate since it failed to read the text that was present.
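
Until the API behaves as documented, a caller could guard against the "success with nothing read" case described above. This is only a sketch of one possible caller-side check; the function name and parameters are hypothetical and do not come from the repro program or from any affected software.

```python
def read_looks_valid(succeeded, chars_read, chars_requested):
    """Return False when a console read reports success but returned no characters.

    succeeded       -- the API's return value (nonzero/True means success)
    chars_read      -- the value written to lpNumberOfCharsRead
    chars_requested -- the nLength that was passed to the API
    """
    if not succeeded:
        return False
    # A nonzero request that yields zero characters is treated as a failed read.
    if chars_requested > 0 and chars_read == 0:
        return False
    return True


if __name__ == "__main__":
    print(read_looks_valid(True, 120, 120))  # expected behavior
    print(read_looks_valid(True, 0, 120))    # behavior reported in this issue
```

Such a check only detects the condition; it cannot recover the text that the call failed to return.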

Sample actual output:

OUTPUT:

-a--- 17k 13 Mar 14:38  Aaa.cpp
-a--- 11k 13 Mar 14:44 󱗀 Bbb.xml
-a--- 10k 13 Mar 14:43 󰅲 Ccc.lisp
-a--- 11k 13 Mar 14:44  Ddd.zip

RESULTS:

Line 1 len 120 matches : -a--- 17k 13 Mar 14:38 ��� Aaa.cpp
Line 2 len 0   DIFFERS :
Line 3 len 0   DIFFERS :
Line 4 len 120 matches : -a--- 11k 13 Mar 14:44  Ddd.zip
claunia added the Product-Conhost, Area-Output, Needs-Triage, Issue-Bug, Priority-1 labels 2026-01-31 07:44:04 +00:00
Reference: starred/terminal#21415