Cross-platform win32-input-mode usage #11516

Closed
opened 2026-01-31 02:49:54 +00:00 by claunia · 15 comments
Owner

Originally created by @o-sdn-o on GitHub (Nov 20, 2020).

# Description of the new feature/enhancement

By creating this issue, I would like to start a public discussion about what could be changed in the current design of the [`win32-input-mode` protocol](https://github.com/microsoft/terminal/blob/main/doc/specs/%234999%20-%20Improved%20keyboard%20handling%20in%20Conpty.md) before it becomes widespread and cross-platform.

The `Considerations` section states:

https://github.com/microsoft/terminal/blob/612b00cd445470fa85d2f71dc94930f03ebe4823/doc/specs/%234999%20-%20Improved%20keyboard%20handling%20in%20Conpty.md?plain=1#L60-L62

At the same time, the `Future Considerations` section allows cross-platform use:

https://github.com/microsoft/terminal/blob/612b00cd445470fa85d2f71dc94930f03ebe4823/doc/specs/%234999%20-%20Improved%20keyboard%20handling%20in%20Conpty.md?plain=1#L411-L416

## Improvements

List of possible improvements, issues and use cases not covered by this keyboard protocol:

- ~~No way to restore the normal mode of the terminal when the connection is lost or the application crashes.~~ https://github.com/microsoft/terminal/issues/8343#issuecomment-1586225095
- Support `win32-input-mode` request acknowledgment with the current state of keyboard modifiers.
  - Req: `^[[?9001h`
  - Ack: `^[[0;0;0;0;CtrlState;0_`
- Support for <kbd>LeftShift</kbd> and <kbd>RightShift</kbd> modifiers along with <kbd>Shift</kbd>.
- Support for Unicode codepoint decimal value (for BaseCharacter) instead of UnicodeChar (Uc), which currently represents one UTF-16 code unit (wchar_t).
  `^[[Vk;Sc;Uc;Kd;Cs;Rc_` => `^[[Vk;Sc;BaseCharacter;Kd;Cs;Rc_`
- Support for passing/receiving an entire [grapheme cluster](https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries) as one keyboard event in order to get it by a single ReadKey call, e.g. getting an accented letter `á` represented as a sequence of [codepoints](https://en.wikipedia.org/wiki/Code_point) (`GC` = `BaseCharacter` + `Codepoint1` + `Cp2` + … + `CpN`). #2853 https://github.com/microsoft/terminal/issues/8343#issuecomment-1586225095 https://github.com/microsoft/terminal/issues/8343#issuecomment-1586288563
  Possible solution:
  - Add continuing codepoints as optional arguments to the event sequence: `^[[Vk;Sc;BaseCharacter;Kd;Cs;Rc [ ; Codepoint1 ; Cp2 ; … ; CpN ] _`
- Sending clipboard data without additional generation of keystroke sequence, since this data must already be enclosed (mandatory [bracketed paste mode](https://en.wikipedia.org/wiki/Bracketed-paste)).
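
To make the sequence layouts in the issue body concrete, here is a minimal Python sketch that parses both the current six-parameter form `^[[Vk;Sc;Uc;Kd;Cs;Rc_` and the proposed form with trailing continuing codepoints. The names `KeyEvent`, `parse_key_event` and `text_of` are invented for illustration. The defaults for omitted parameters (0 everywhere, 1 for `Rc`) are an assumption based on one reading of the spec, and the handling of extra parameters models only the proposal, not anything the existing protocol defines.

```python
import re
from dataclasses import dataclass, field

ESC = "\x1b"
ENABLE_WIN32_INPUT_MODE = ESC + "[?9001h"  # the "Req" sequence quoted in the issue

# ESC [ p1 ; p2 ; ... _  with decimal parameters (empty parameters allowed)
_KEY_EVENT = re.compile(r"\x1b\[([0-9;]*)_")


@dataclass
class KeyEvent:
    vk: int  # virtual key code
    sc: int  # scan code
    uc: int  # UnicodeChar today; BaseCharacter in the proposal
    kd: int  # 1 = key down, 0 = key up
    cs: int  # control key state
    rc: int  # repeat count
    continuing: list = field(default_factory=list)  # proposal only


def parse_key_event(seq: str) -> KeyEvent:
    m = _KEY_EVENT.fullmatch(seq)
    if m is None:
        raise ValueError(f"not a key event sequence: {seq!r}")
    fields = m.group(1).split(";")
    defaults = (0, 0, 0, 0, 0, 1)  # assumed defaults for omitted parameters
    head = [
        int(fields[i]) if i < len(fields) and fields[i] else defaults[i]
        for i in range(6)
    ]
    # Anything after the sixth parameter is treated as a continuing codepoint.
    tail = [int(f) for f in fields[6:] if f]
    return KeyEvent(*head, continuing=tail)


def text_of(ev: KeyEvent) -> str:
    """Text carried by one event: base character plus continuing codepoints."""
    return "".join(chr(cp) for cp in [ev.uc, *ev.continuing] if cp)
```

With this sketch, `parse_key_event("\x1b[65;30;97;1;0;1;769_")` yields an event whose text is `a` followed by U+0301 (combining acute accent), which is the kind of single-event cluster the proposal describes.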
claunia added the Issue-Feature, Needs-Triage, Needs-Tag-Fix labels 2026-01-31 02:49:55 +00:00

@j4james commented on GitHub (Jun 11, 2023):

> No way to restore the normal mode of the terminal when the connection is lost or the application crashes.

This is a general problem that can apply to a lot of terminal states. Some of them you could potentially recover from by blindly typing reset in a WSL shell, or echoing an appropriate escape sequence, but that's not a realistic solution for most users. We need something like a Reset Terminal action that'll allow the user to easily recover from any messed up state - not just this one particular keyboard protocol.

This is a fairly common feature in terminal emulators, and I could have sworn we already had an issue for it, but I can't find it now. It's possible it was just something that was discussed in an unrelated issue, but we definitely should open an issue for it if there isn't one.

> No way to get an entire grapheme cluster via single read key call, i.e. get accented a-letter `á` that represented as a codepoint sequence.

But the win32-input-mode is a direct representation of the key events we receive. If there was a keyboard that allowed you to enter grapheme clusters, that would have to be in the form of multiple key events, and this is something that applications would already be expected to handle. Same goes for any characters outside the BMP for that matter - they're going to appear as two separate key events to represent the surrogate pair. We can't report a UTF-8 representation for combined events that we haven't received yet.
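
As an illustration of the surrogate-pair point, the following Python sketch shows roughly what an application has to do today: collect the `Uc` values of successive key-down events (UTF-16 code units) and join high/low surrogates into one code point. The function name `decode_utf16_units` is hypothetical, and the handling of unpaired surrogates is a simplification chosen for this example.

```python
def decode_utf16_units(units):
    """Yield Unicode code points from a stream of UTF-16 code units.

    Each unit stands for the Uc field of one key-down event.
    An unpaired high surrogate is dropped; an unpaired low surrogate
    is passed through unchanged (a simplification for this sketch).
    """
    pending_high = None
    for unit in units:
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: wait for the next event before emitting anything.
            pending_high = unit
            continue
        if 0xDC00 <= unit <= 0xDFFF and pending_high is not None:
            # Low surrogate completing a pair: combine into one code point.
            yield 0x10000 + ((pending_high - 0xD800) << 10) + (unit - 0xDC00)
            pending_high = None
            continue
        pending_high = None
        yield unit
```

For example, the three code units `0x61, 0xD83D, 0xDE00` decode to the two code points U+0061 and U+1F600.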


@o-sdn-o commented on GitHub (Jun 11, 2023):

I think that it is necessary to support grapheme clusters in this keyboard protocol for the following reason.
The application must be able to compose codepoints into clusters. At the same time, it should mark how the current cluster was composed. In the general case, it makes sense to distinguish a grapheme cluster assembled with several keystrokes from a grapheme cluster generated with one keystroke: when deleting with Backspace, in the first case the codepoints have to be deleted one by one, and in the second the whole cluster is deleted at once. With the Del key, the entire cluster can be deleted in both cases.

For example, a user can compose a cluster by typing codepoint by codepoint using Alt+Num, and in the process correct errors using Backspace. Also, the user can use IME with one keystroke to get a complex cluster, and delete it with Backspace also in one keystroke.

Currently, we can only get fragmented keyboard events/clusters. Tomorrow we may be able to get an entire grapheme cluster from Windows subsystems per call, and the keyboard protocol should support this.


@o-sdn-o commented on GitHub (Jun 13, 2023):

Closing this issue in favor of the [`vt-input-mode`](https://github.com/netxs-group/vtm/issues/400) protocol.


@j4james commented on GitHub (Jun 13, 2023):

> For example, a user can compose a cluster by typing codepoint by codepoint using Alt+Num, and in the process correct errors using Backspace. Also, the user can use IME with one keystroke to get a complex cluster, and delete it with Backspace also in one keystroke.

Assuming this was possible, what happens when your editor saves these two different clusters to disk and later reloads them? Now there's no way to distinguish between the two, and a backspace would work in exactly the same way for both of them. To a user, the backspace behavior would just seem random.

I really cannot imagine an editor going to all that effort, when it's just going to cause confusion. If anything, it might make sense to backspace over the entire cluster with one keystroke in _all_ cases, regardless of how the cluster was entered. That at least would be consistent.


@o-sdn-o commented on GitHub (Jun 13, 2023):

It would be important for me here to clarify that we are talking only about the current open cluster in which code points are currently being added.

After the cluster is closed, it is atomic.


@j4james commented on GitHub (Jun 13, 2023):

Then I don't understand why you'd need to tell the difference between a cluster entered as a single keystroke and one entered with multiple keystrokes. If the keyboard protocol splits a cluster into multiple keystrokes, the final keystroke would assumedly close the cluster, so it's still going to immediately become atomic isn't it?


@o-sdn-o commented on GitHub (Jun 13, 2023):

The last entered cluster must be kept open, since it is not known whether an additional joiner will arrive or not, for example, with a skin tone modifier.


@o-sdn-o commented on GitHub (Jun 13, 2023):

A user can manually construct a grapheme cluster of any length they want by stringing together multiple zero-width joiners.

If Backspace is received immediately after a cluster of more than one codepoint that arrived in a single transaction, it deletes that cluster entirely. If the codepoints were received separately, only the last codepoint is deleted.

Case 1

ReadKey call: BaseChar1
ReadKey call: BaseChar2
ReadKey call: Continuing Codepoint1
ReadKey call: Continuing Codepoint2
ReadKey call: Continuing Codepoint3
ReadKey call: Backspace
---------------
Result: BaseChar1 + BaseChar2+Codepoint1+Codepoint2

Case 2

ReadKey call: BaseChar1
ReadKey call: BaseChar2+Codepoint1+Codepoint2+Codepoint3
ReadKey call: Backspace
---------------
Result: BaseChar1

Case 3

ReadKey call: BaseChar1
ReadKey call: BaseChar2+Codepoint1+Codepoint2
ReadKey call: Continuing Codepoint3
ReadKey call: Backspace
ReadKey call: Backspace
---------------
Result: BaseChar1 + BaseChar2+Codepoint1

Note: Grapheme cluster segmentation is based on the codepoint properties from the Unicode Character Database and [Grapheme Cluster Boundary Rules](https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules).
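
The three cases above can be reproduced with a small Python model. This is only one interpretation that happens to match all three results, not an implementation taken from the thread: a Backspace that directly follows a multi-codepoint transaction removes that whole transaction, and any other Backspace removes a single codepoint. The names `BACKSPACE` and `apply_events` are hypothetical, and codepoints are represented by string labels rather than real characters.

```python
BACKSPACE = object()  # sentinel standing for a Backspace key event


def apply_events(events):
    """Replay ReadKey results and return the remaining codepoints.

    Each non-Backspace event is a list of codepoint labels delivered
    in a single transaction (one ReadKey call).
    """
    buffer = []
    last_burst = 0  # size of the immediately preceding transaction, else 0
    for ev in events:
        if ev is BACKSPACE:
            # Delete the whole transaction only if it arrived just now
            # and carried more than one codepoint; otherwise delete one.
            n = last_burst if last_burst > 1 else 1
            del buffer[max(0, len(buffer) - n):]
            last_burst = 0
        else:
            buffer.extend(ev)
            last_burst = len(ev)
    return buffer


case1 = [["B1"], ["B2"], ["C1"], ["C2"], ["C3"], BACKSPACE]
case2 = [["B1"], ["B2", "C1", "C2", "C3"], BACKSPACE]
case3 = [["B1"], ["B2", "C1", "C2"], ["C3"], BACKSPACE, BACKSPACE]
```

Replaying `case1`, `case2` and `case3` leaves `B1 B2 C1 C2`, `B1` and `B1 B2 C1` respectively, matching the results listed for Case 1, Case 2 and Case 3.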


@j4james commented on GitHub (Jun 13, 2023):

We seem to be going in circles here. I refer you back to my comment here: https://github.com/microsoft/terminal/issues/8343#issuecomment-1588876569

I understand what you're suggesting - I just don't think it's a very good idea. That's just my opinion.


@o-sdn-o commented on GitHub (Jun 13, 2023):

> it might make sense to backspace over the entire cluster with one keystroke in all cases, regardless of how the cluster was entered.

For example, the GitHub web interface lets you backspace a complex cluster one ZWJ+codepoint pair at a time. Try to backspace the following cluster 🏴‍☠️. I am using FF as my desktop browser. It may behave differently in other browsers. Windows Notepad and Notepad++ allow you to backspace codepoint-by-codepoint.


@o-sdn-o commented on GitHub (Jun 13, 2023):

And it's annoying when you insert a complex cluster with a single keystroke, and to remove it you have to press the backspace a dozen times. This does not apply to rare emoji. It seems to me when using a Devanagari-like writing system with complex clusters in each syllable, this is critical.

On the other hand, if the whole cluster is deleted with one backspace, then in case of a typo you need to retype the entire cluster from the beginning.


@o-sdn-o commented on GitHub (Jun 13, 2023):

Probably a compromise here would be to delete the whole cluster at once, with either the Del key or Backspace, but only when it is the cluster that has just arrived. In all other cases, Backspace deletes codepoint-by-codepoint.


@j4james commented on GitHub (Jun 13, 2023):

This seems like something I'd want to configure in my editor though. Sometimes I might prefer backspacing over individual parts of a cluster, and sometimes I might prefer deleting the entire thing with a single backspace. That should be my choice to make, so I know it's going to work exactly the way I expect. I definitely wouldn't want the behavior randomly changing depending on which IME I happened to use, and then potentially changing again when I reload the content from disk.


@o-sdn-o commented on GitHub (Jun 13, 2023):

I agree that whether grapheme clusters should be deleted atomically or piecewise is a matter of settings. But that's not the point.

I am working from the assumption that some operating system can output an entire cluster for a single keystroke, and I consider it fundamentally wrong to split it into a sequence of distinct keypresses and releases, as required by the current win32-input-mode implementation.


@j4james commented on GitHub (Jun 13, 2023):

> I consider it fundamentally wrong to split it into a sequence of distinct keypresses and releases

I sympathise with that viewpoint, but from my understanding of your proposed protocol, you're just creating more work for app devs. They already have to accept a cluster split into distinct keypresses, but now they also have to check for additional parameters at the end of the sequence that might be continuing codepoints. So unless there is some value to them in distinguishing between the two, it's just a bunch of extra work for nothing.

Also, if an app has already been designed to work with win32-input-mode, they're not going to be expecting those continuing codepoints, so they'd lose half of the cluster. That is not what I would consider backwards compatible. So if you want to extend the protocol like this, I'd encourage you to use a different mode number.
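
The compatibility concern can be sketched in Python. Both functions below are hypothetical: `legacy_text` stands in for an application written against the existing six-parameter format (how real applications treat extra parameters is an assumption here), while `extended_text` stands in for one that understands the proposed trailing codepoints. Both assume a well-formed sequence with at least six parameters.

```python
def _params(seq: str) -> list:
    # Strip the leading ESC [ and the trailing _ and split the parameters.
    return [int(p or 0) for p in seq[2:-1].split(";")]


def legacy_text(seq: str) -> str:
    """Hypothetical existing app: reads six parameters, ignores the rest."""
    vk, sc, uc, kd, cs, rc = _params(seq)[:6]
    return chr(uc) if kd and uc else ""


def extended_text(seq: str) -> str:
    """Hypothetical app aware of the proposed continuing codepoints."""
    params = _params(seq)
    uc, kd = params[2], params[3]
    if not (kd and uc):
        return ""
    return "".join(chr(cp) for cp in [uc, *params[6:]])


# 'a' followed by U+0301 (combining acute accent) in one proposed event
event = "\x1b[65;30;97;1;0;1;769_"
```

Here `legacy_text(event)` returns only `a`, silently losing the combining accent, while `extended_text(event)` returns the full two-codepoint cluster. That loss is the reason given above for putting such an extension behind a different mode number.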

The additional viewport and mouse stuff is an even bigger issue, but I don't want to get into a discussion about protocol design. As long as it's a separate mode that an application has specifically requested, you can do whatever you want.

Reference: starred/terminal#11516