ReadConsole does not work with utf-8 codepage #20527

Closed
opened 2026-01-31 07:16:35 +00:00 by claunia · 9 comments

Originally created by @tholp on GitHub (Sep 22, 2023).

Windows Terminal version

No response

Windows build number

No response

Other Software

No response

Steps to reproduce

Use SetConsoleCP to select the UTF-8 code page (or run with the UTF-8 beta mode enabled in Windows).
Read the input using ReadConsole.

After typing "abcæøå", the input buffer contains 0x61 0x62 0x63 0x00 0x00 0x00 0xD 0x0A

Expected Behavior

UTF-8 encoded bytes for the letters æøå.

Actual Behavior

nulls
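
For reference, the expected UTF-8 bytes for this input can be worked out offline. The following is a minimal Python sketch, not part of the original report. The helper name hex_bytes is made up for illustration, and the code only encodes the string; it does not call any console API.

```python
# Illustrative only: computes the UTF-8 bytes one would expect for "abcæøå" + CR LF.
TEXT = "abc\u00e6\u00f8\u00e5\r\n"  # "abcæøå" followed by CR LF


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated 0xNN values (hypothetical helper)."""
    return " ".join(f"0x{b:02X}" for b in data)


expected_utf8 = TEXT.encode("utf-8")
print(hex_bytes(expected_utf8))
# 0x61 0x62 0x63 0xC3 0xA6 0xC3 0xB8 0xC3 0xA5 0x0D 0x0A
```

Each of æ, ø and å takes two bytes in UTF-8, so a correct buffer would hold 11 bytes rather than the 8 values reported above.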


@zadjii-msft commented on GitHub (Sep 22, 2023):

Are you using ReadConsoleA or ReadConsoleW? (Does your main take chars or wchar_ts?)

I'd bet this is the thing that @lhecker just re-wrote the entire input buffer to fix ☺️


@tholp commented on GitHub (Sep 22, 2023):

I use ReadConsoleW, and I actually have no problems using UTF-16 or code page 1252. ReadConsoleW reads into a buffer, and I have written out the bytes the buffer contains, i.e. without interpreting anything as characters. So the types char and wchar_t are irrelevant.


@tholp commented on GitHub (Sep 22, 2023):

By the way, the problem is the same both for the Windows Terminal and the Windows Console Host.


@lhecker commented on GitHub (Sep 22, 2023):

@tholp Some parts of your comments must be incorrect. You write that your buffer contains:

0x61 0x62 0x63 0x00 0x00 0x00 0xD 0x0A

And you said (emphasis mine):

ReadConsoleW reads into a buffer and I have written what bytes the buffer contains

If you had truly used the W variant it would've filled your buffer with 16-bit characters which would result in a byte-wise buffer like this:

0x61 0x00, 0x62 0x00, 0x63 0x00, 0x00 0x00, 0x00 0x00, 0x00 0x00, 0x0D 0x00, 0x0A 0x00

That is, each 16-bit integer should have a 0x00 high byte.

So either you used the A variant, or you wrote out the 16-bit integers your buffer contains. Do you know which one it is?
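
The two views of a W-style buffer described above can be compared offline. This is a small Python sketch, assuming the input had been read correctly as UTF-16LE (so æøå show up as 0xE6, 0xF8, 0xE5 rather than nulls). It uses only the standard library and does not call ReadConsoleW; the variable names are illustrative.

```python
import struct

TEXT = "abc\u00e6\u00f8\u00e5\r\n"  # "abcæøå" followed by CR LF

# What a correctly filled UTF-16LE buffer would contain, as raw bytes.
utf16le = TEXT.encode("utf-16-le")

# View 1: dump the buffer byte by byte (low byte first, then the 0x00 high byte).
byte_view = " ".join(f"0x{b:02X}" for b in utf16le)

# View 2: dump the same buffer as little-endian 16-bit integers.
units = struct.unpack(f"<{len(utf16le) // 2}H", utf16le)
unit_view = " ".join(f"0x{u:02X}" for u in units)

print(byte_view)
# 0x61 0x00 0x62 0x00 0x63 0x00 0xE6 0x00 0xF8 0x00 0xE5 0x00 0x0D 0x00 0x0A 0x00
print(unit_view)
# 0x61 0x62 0x63 0xE6 0xF8 0xE5 0x0D 0x0A
```

Printing 16-bit units gives one value per character, the same shape as the dump in the report, while printing bytes shows the interleaved 0x00 high bytes.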


Furthermore, when you say:

Inputting "abcæøå"

how did you enter that string? Did you enter it regularly with your keyboard or did you use WriteConsoleInput? Depending on your answer I think I know which PR fixed your issue. 🙂


@tholp commented on GitHub (Sep 25, 2023):

Sorry, you are right. My code actually obtains the (bad) bytes using ReadFile.
Should that problem be reported here or somewhere else?

Running echo abcæøå | myProgram gives the correct buffer contents, but typing the same text on the keyboard gives the null bytes.


@lhecker commented on GitHub (Sep 25, 2023):

Ah, I see... In that case your issue is fixed by the combination of #14745 and #15783. The former fixes UTF-8 not working via ReadFile and is available in 1.18 and later. The latter fixes interactive Unicode input and is available in 1.19 and later. 1.18 is currently available via Windows Terminal Preview. (It'll be stable in a few weeks.)

I'll mark it as a /duplicate of #14745, because that one affects you more than the other.


@microsoft-github-policy-service[bot] commented on GitHub (Sep 25, 2023):

Hi! We've identified this issue as a duplicate of another one that already exists on this Issue Tracker. This specific instance is being closed in favor of tracking the concern over on the referenced thread. Thanks for your report!


@lhecker commented on GitHub (Sep 25, 2023):

Oh and I should say: It's quite likely that Windows Terminal Preview 1.18 already fixes your issue. If you get a chance, please try it out! 🙂 (If you compare your issue description with #4551 for instance, you'll see that your description is very similar, and #4551 has been fixed in 1.18.)


@tholp commented on GitHub (Sep 26, 2023):

Yes, that sounds exactly like my issue. Thank you, and once more: sorry for the confusion with ReadConsole (I apparently didn't understand our own (old) code and should have written an isolated test).

Not sure I can test it until the change is in the public version (I have in any case updated our software to get the right result).

Reference: starred/terminal#20527