Is there a way for parent process to set the codepage to be used by ConPTY? #12599

Closed
opened 2026-01-31 03:19:58 +00:00 by claunia · 8 comments
Owner

Originally created by @Eli-Zaretskii on GitHub (Feb 15, 2021).

Description of the new feature/enhancement

This could be a feature request, or it could be a question about an existing feature (in which case, apologies for missing that feature).

The problem is that a Pseudo Console seems to assume that the child process sends text encoded in the default console codepage, and it converts the text to UTF-8 under that assumption. When this assumption is false, the text read by the parent process from the other end of the pipe will represent very different characters.

Case in point: in my locale, the ANSI codepage is 1255, but the default console OEM codepage seems to be 437, something utterly inadequate for dealing with Hebrew text.

Another case in point: Git for Windows by default produces UTF-8 output, and while it can be told to use some other encoding, doing so is not recommended and generally causes problems in the long run, as non-UTF-8 sequences seep into the repository.
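To make the two cases above concrete, here is a minimal Python sketch of the conversion as this issue describes it: the child's bytes are interpreted in the assumed console codepage (437) and re-encoded as UTF-8. The function name `as_seen_by_parent` is made up for illustration, and the model is an assumption based on the description above, not ConPTY's actual code.

```python
def as_seen_by_parent(child_bytes: bytes, assumed_codepage: str = "cp437") -> bytes:
    """Model of the described behavior (an assumption, not ConPTY internals):
    interpret the child's bytes in the assumed console codepage, then
    re-encode the result as UTF-8 for the parent's end of the pipe."""
    return child_bytes.decode(assumed_codepage).encode("utf-8")


if __name__ == "__main__":
    hebrew = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word "shalom"

    # Case 1: the child writes in the ANSI codepage 1255.
    from_cp1255 = as_seen_by_parent(hebrew.encode("cp1255")).decode("utf-8")
    # Case 2: the child writes UTF-8, as Git for Windows does by default.
    from_utf8 = as_seen_by_parent(hebrew.encode("utf-8")).decode("utf-8")

    # Neither result matches the original text; print code points to avoid
    # depending on what the local terminal can display.
    print([hex(ord(c)) for c in from_cp1255])
    print([hex(ord(c)) for c in from_utf8])
```

Pure ASCII output survives this conversion unchanged, which is why the problem only shows up with non-ASCII text.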

In sum, there are real-life use cases where (a) the application in the child process emits non-ASCII text encoded in something other than the default console codepage, and (b) the programmer who programs the parent application has no control on the code of the child application, and so can neither affect its encoding nor add a call to SetConsoleOutputCP to that application.

What is needed in this case is some way for the parent process to tell Pseudo Console to use a specific codepage when converting the output of the child process to UTF-8. Is this possible with the current APIs? If not, could such a feature be added?

This feature, if added, would make it possible to use Pseudo Console for communicating with child processes in a way similar to communication via pipes, but with the child process behaving as if it were invoked interactively on a terminal device, not unlike how PTYs are used on Posix systems.

Proposed technical implementation details (optional)

One possible implementation would be an API similar to SetConsoleOutputCP, but one that accepts a handle, either to the Pseudo Console or to one of its pipes.

Thanks.

claunia added the Issue-Question, Resolution-Duplicate, Product-Conpty labels 2026-01-31 03:19:58 +00:00

@DHowett commented on GitHub (Feb 15, 2021):

This is a reasonable request, but I do want to ask:

> no control on the code of the child

> can neither affect its encoding

How does this application work in a normal console window if it's incompatible with the default system codepage?


@Eli-Zaretskii commented on GitHub (Feb 15, 2021):

> How does this application work in a normal console window if it's incompatible with the default system codepage?

It depends on special setup of the console where it runs. For example, Git runs in Git Bash window, which is not the default Command Prompt window. Other applications could require that "someone" runs chcp before starting them.
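As a rough sketch of the "someone runs chcp first" setup, a parent could start the child through a `cmd.exe` wrapper that switches the codepage before the real command runs. The helper name `wrap_with_chcp` is hypothetical, the quoting is only adequate for simple commands, and whether this gives the desired result inside a Pseudo Console is an assumption that has not been verified here.

```python
def wrap_with_chcp(child_command: str, codepage: int = 65001) -> str:
    """Build a cmd.exe command line that runs `chcp <codepage>` and then the
    child command in the same console session.

    65001 is the Windows codepage number for UTF-8. The output of chcp
    itself is discarded with `>nul` so it does not mix with the child's.
    """
    return f'cmd.exe /c "chcp {codepage} >nul && {child_command}"'


if __name__ == "__main__":
    # The resulting string would be passed as the command line when
    # creating the child process attached to the pseudo console.
    print(wrap_with_chcp("git log --oneline"))
    print(wrap_with_chcp("mytool.exe", codepage=1255))
```

This keeps the child's code untouched, but it still relies on the parent knowing which encoding the child emits.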


@lonnywong commented on GitHub (Jul 13, 2022):

Any progress?


@zadjii-msft commented on GitHub (Jul 13, 2022):

No, sorry for letting this fall so far down the triage queue.

This kinda just sounds by design right now.

The "git bash runs in a git bash window" thing - yea, I'm pretty sure they're running in winpty and the console isn't involved at all there. There, the client can just write output to the stdout pipe, and then mintty just reads whatever the bytes are.

> Other applications could require that "someone" runs chcp before starting them

That sounds like exactly the same thing that would happen today

Ultimately, seems like the root of the issue here is:

> (a) the application in the child process emits non-ASCII text encoded in something other than the default console codepage

but, under the assumption that the child process never added a call to SetConsole*Codepage. That kinda seems like an issue that the client application should resolve, right? Maybe I'm missing something. We're all talking in hypotheticals here, a concrete example might be useful.

Maybe there's something I'm missing between

> in my locale, the ANSI codepage is 1255, but the default console OEM codepage seems to be 437


@Eli-Zaretskii commented on GitHub (Jul 13, 2022):

You are basically saying that ConPTY is meant to be used as a console, i.e. its other end is actually displayed to users. Whereas I thought that ConPTY is meant to be used like PTY devices on Unix systems, where they are frequently used instead of pipes to have one process drive another via a bidirectional communications channel. The advantage of using PTYs like that is that the child program which writes to the PTY behaves as it would behave when its output was a console device, and thus the parent program could be a front-end to it. For example, there are programs that colorize their output, but only when the output is a console device. Or there are programs that automatically make their output unbuffered when it is connected to a console device, but not when it's a pipe.
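The colorizing behavior mentioned above usually comes down to a check like the one in this small Python sketch. The function name `colorize` and the choice of green are only for illustration; real programs differ in the details.

```python
import io
import sys


def colorize(text: str, stream) -> str:
    """Wrap text in ANSI green only when the stream is a terminal device.

    When the stream is a pipe or a file, isatty() returns False and the
    text is returned unchanged, which is the behavior a parent process
    sees when it talks to the child over plain pipes.
    """
    if stream.isatty():
        return f"\x1b[32m{text}\x1b[0m"
    return text


if __name__ == "__main__":
    # Colored when run directly in a terminal, plain when piped elsewhere.
    sys.stdout.write(colorize("status: ok", sys.stdout) + "\n")
    # An in-memory stream is never a terminal, so no escape codes are added.
    print(repr(colorize("status: ok", io.StringIO())))
```

A child attached to a PTY or pseudo console takes the first branch, which is exactly why a front-end would want to use one instead of pipes.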

So using PTYs instead of pipes lets the parent run the child process and have it behave as it does when it writes to a console, while the parent process grabs the output and does whatever it likes with it. In these scenarios, it is frequently desirable to use a text encoding other than the system codepage, because the system codepage could be inappropriate. Pipes allow that, but ConPTY doesn't. And if this is the design, then ConPTYs can never be used as the Windows equivalent of the Unix PTYs, because the latter don't have that limitation. Which is IMO too bad.
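For comparison, here is a minimal Python sketch of the pipe case: the parent receives the child's bytes untouched and picks the decoding itself. The helper `run_and_decode` and the stand-in child command are invented for this example; the child simply writes the four cp1255 bytes of a Hebrew word.

```python
import subprocess
import sys


def run_and_decode(argv: list, encoding: str) -> str:
    """Run a child with stdout connected to a pipe and decode its raw
    output with an encoding chosen by the parent."""
    raw = subprocess.run(argv, stdout=subprocess.PIPE, check=True).stdout
    return raw.decode(encoding)


# Stand-in child (an assumption for the demo): writes raw cp1255 bytes.
CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(bytes.fromhex('f9ece5ed'))",
]

if __name__ == "__main__":
    decoded = run_and_decode(CHILD, "cp1255")
    # Print code points rather than the text itself, so the demo does not
    # depend on the local terminal's encoding.
    print([hex(ord(c)) for c in decoded])
```

Nothing sits between the two processes to reinterpret the bytes, so the parent's choice of `encoding` is the only thing that matters. That is the degree of control the request above asks for on the pseudo console side.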


@carlos-zamora commented on GitHub (Dec 7, 2022):

> Which is IMO too bad.

Unfortunately, not all of the Windows console API can easily be converted directly into streamable operations. We're making progress on this in #1173, but we're not there yet. In the interest of maintaining our focus, I'm gonna track this work item over there /dup #1173.


@ghost commented on GitHub (Dec 7, 2022):

Hi! We've identified this issue as a duplicate of another one that already exists on this Issue Tracker. This specific instance is being closed in favor of tracking the concern over on the referenced thread. Thanks for your report!

Reference: starred/terminal#12599