Terminal does not handle custom keyboard layouts with dead-keys mapping to themselves #4913

Open
opened 2026-01-31 00:00:34 +00:00 by claunia · 0 comments

Originally created by @springcomp on GitHub (Nov 11, 2019).

The Microsoft Terminal does not seem to correctly handle dead-keys for a custom keyboard layout made with Microsoft Keyboard Layout Creator.

More precisely, it mishandles dead-keys that are mapped to themselves in the keyboard layout.

Environment

| Platform | ServicePack | Version | VersionString |
|---|---|---|---|
| Win32NT | | 10.0.19018.0 | Microsoft Windows NT 10.0.19018.0 |

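Additional software:

  • [Custom AZERTY-NF Keyboard Layout](https://github.com/springcomp/optimized-azerty-win/releases/tag/v1.3.0.0)
  • [Microsoft Keyboard Layout Creator 1.4](https://www.microsoft.com/en-us/download/details.aspx?id=22339)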

Steps to reproduce

This issue happens with many dead-keys. For the purpose of this discussion, let us focus on one particular dead-key.

On a French AZERTY keyboard, the AltGr+2 combination is a dead-key used to input the ~ (TILDE, U+007E) diacritical mark. By itself, this sequence does not produce any output to the Terminal.

When this sequence is followed by a supported character, for instance o, the corresponding õ character is typed into the Terminal. A spacing version of the ~ character can be typed into the Terminal by following the dead-key sequence with a Space.
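Two different tilde code points appear in this report, and they are easy to confuse on screen. The short Python sketch below only inspects them with the standard `unicodedata` module; it is not how a Windows keyboard layout composes characters (that is a lookup table inside the layout), and the helper name `describe` is made up for this illustration.

```python
import unicodedata

SPACING_TILDE = "\u007e"    # what a dead key followed by Space should give
COMBINING_TILDE = "\u0303"  # the character that leaks into the Terminal


def describe(ch):
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The precomposed letter expected after typing the dead key, then o.
composed = unicodedata.normalize("NFC", "o" + COMBINING_TILDE)

for ch in (SPACING_TILDE, COMBINING_TILDE, composed):
    print(describe(ch))
```

The spacing tilde has combining class 0 and stands on its own, whereas U+0303 attaches to the preceding character, which is why a stray one in the Terminal is visually odd.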

However, when using a custom AZERTY-NF keyboard layout, the corresponding sequence does not work.

In order to reproduce this issue:

  • Install the AZERTY-NF keyboard layout referenced above.
  • Select this layout as the currently active layout with Win+Space.
  • In the Microsoft Terminal, type the following sequence: AltGr+n.
  • Observe that a character ~ (COMBINING TILDE, U+0303) has been typed into the Terminal.
  • Proceed to type the following key: o.
  • Observe that the character õ has been typed into the Terminal.

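![image](https://user-images.githubusercontent.com/8488398/68478272-c6306400-022f-11ea-9f81-ff4518b8fafc.png)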

Alternatively, you can reproduce this issue by following this other set of steps:

  • Install MSKLC 1.4.
  • Install the French AZERTY keyboard layout.
  • With MSKLC, using the File|Open Existing Keyboard… menu command, open the French AZERTY keyboard layout (named Français in the list).
  • Add an additional mapping for the AltGr+2 dead key like so:
    • Click on the é key (VK_2) and push the All… button in the corresponding dialog.
    • In the lower part of this extended dialog, locate the mapping for the ctrl-alt-<key> dead key whose value is 0x007e and push the … button.
    • Add a mapping for the base U+007e (TILDE) to the composite U+0303 (COMBINING TILDE) as illustrated by the following screenshot.

    ![image](https://user-images.githubusercontent.com/8488398/68573528-33c6d500-0468-11ea-9586-12f82968ba8c.png)

  • Using the Project|Build DLL and Setup Package menu command, compile the resulting layout. It should be generated in your %USERPROFILE%\Documents\layout01 folder.
  • Install the custom layout by running Setup.exe.
  • Sign out of your Windows session and log in again.
  • Select this layout as the currently active layout with Win+Space.
  • In the Microsoft Terminal, type the following sequence: AltGr+2.
  • As previously described, observe that a character ~ (COMBINING TILDE, U+0303) has been typed into the Terminal.
  • Proceed to type the following key: o.
  • Observe that the character õ has been typed into the Terminal.

Expected behavior

As with the regular French keyboard, the AltGr+n combination on an AZERTY-NF keyboard layout is a dead-key and should not produce any output to the Terminal.

Followed by a supported character, for instance o, this dead-key sequence corresponds to the ~ diacritical mark and should type only the character õ into the Terminal.

Actual behavior

When typing AltGr+n, o, the string ~õ is typed into the Terminal, whereas the string õ was expected instead.

Analysis summary

The heart of the problem lies in the following conditions:

  • Microsoft Terminal handles the WM_CHAR message in a way that ultimately calls the ToUnicodeEx function.
  • Most keyboard layouts do not have a mapping for a double dead-key (a mapping for a dead-key to itself).

I believe calling ToUnicodeEx as part of the WM_CHAR handler is incorrect, because the effect of this function depends on the previous internal keyboard state.

Extended analysis

The expected behaviour is correctly observed in the built-in CMD, PowerShell and WSL terminals on Windows.

By compiling the latest version of the source code for Microsoft Terminal, I can see that the expected behaviour is also correctly observed when running the OpenConsole.exe program. The faulty behaviour only happens in the WindowsTerminal application launched from the CascadiaPackage.

I was able to pinpoint a troublesome behaviour at [this location](https://github.com/microsoft/terminal/blob/3e8a1a78bc456afc58686fa8ce818da0f03edbb3/src/cascadia/TerminalCore/Terminal.cpp#L303) in the source code.

There, the [ToUnicodeEx](https://docs.microsoft.com/fr-fr/windows/win32/api/winuser/nf-winuser-tounicodeex) function does not return the correct result.

The reason is that this function is called inside a WM_CHAR handler, where the internal keyboard state already contains the current character, and the ToUnicodeEx function changes its behaviour depending on the current keyboard state.

For instance, consider the following table, which summarizes the parameters supplied to the ToUnicodeEx function:

| Layout | Sequence | Virtual Key Code | Scan Code | States |
|---|---|---|---|---|
| French AZERTY | AltGr+2 | VK_2 (0x32) | 0x03 | CONTROL+MENU |
| French AZERTY-NF | AltGr+n | VK_N (0x4e) | 0x31 | CONTROL+MENU |
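The States column in the table above refers to the 256-entry key-state array that ToUnicodeEx receives next to the virtual-key code and scan code, where a set high-order bit marks a key as pressed. As a rough illustration, the sketch below builds that array for an AltGr chord in plain Python. It does not call any Windows API; the helper `make_key_state` and the tuple names are invented for this example, while the `VK_CONTROL` and `VK_MENU` values are the standard Win32 virtual-key codes.

```python
VK_CONTROL = 0x11  # Win32 virtual-key code for CTRL
VK_MENU = 0x12     # Win32 virtual-key code for ALT
KEY_DOWN = 0x80    # high-order bit set: the key is currently pressed


def make_key_state(*pressed_vks):
    """Return a 256-byte key-state array with the given keys marked down."""
    state = bytearray(256)
    for vk in pressed_vks:
        state[vk] |= KEY_DOWN
    return state


# (virtual-key code, scan code) pairs taken from the table above.
ALTGR_2_FRENCH = (0x32, 0x03)
ALTGR_N_AZERTY_NF = (0x4E, 0x31)

# AltGr is reported as CONTROL+MENU, so both rows share the same state array.
altgr_state = make_key_state(VK_CONTROL, VK_MENU)
```

With these three inputs being identical between a standalone call and the call made from the handler, the only remaining difference is the hidden dead-key state kept inside the keyboard driver.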

When called by itself in [a simple program](https://gist.github.com/springcomp/f07157d1a50d18dcca84985752f10e33#file-program-cs), the ToUnicodeEx function returns the correct result for both layouts, which is:

  • The result of the function is -1 indicating a dead key.
  • The supplied buffer is set to a string containing "~" (TILDE, U+007E).

However, when it is called from within the WM_CHAR handler, the keyboard state already contains the current dead-key character, so the ToUnicodeEx function tries to combine the character corresponding to the supplied virtual key code, scan code and key states with the current internal keyboard state for the process. For the standard French AZERTY layout:

  • The result of the function is 2 indicating that a two-character string is returned.
  • The supplied buffer is set to a string containing "~~" (TILDE, U+007E; TILDE, U+007E).

In fact, the ToUnicodeEx function combines the supplied character with the previous keyboard state. In the case of virtually all keyboard layouts on the planet, there is no such valid combination. Therefore, the ToUnicodeEx function emits a string where the corresponding dead key character is emitted twice. That is what happens on Windows in all applications when typing the sequence AltGr+2 twice on a French AZERTY keyboard.

Microsoft Terminal takes advantage of this fact, and when the ToUnicodeEx function returns anything other than 1 (a single character) or -1 (a dead key character), it does not produce any output.

As for the AZERTY-NF layout, or the customized layout described in the steps to reproduce this issue, the ToUnicodeEx function behaves like so:

  • The result is 1 indicating that a single character string is returned.
  • The supplied buffer is set to a string containing "~" (COMBINING TILDE, U+0303).

This is consistent because the internal keyboard state already contains the dead key character and when combined with itself again, the keyboard layout is instructed to emit this particular character.

Because this results in a single character string, Microsoft Terminal writes this character to the output.
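The difference between the two layouts can be reproduced with a toy model of a stateful dead-key translator. This is only a sketch of the behaviour reported above, not the Win32 API: the `ToyLayout` class, its composition tables, `second_call` and `terminal_suppresses` are all hypothetical names, and the return convention is a simplification of what ToUnicodeEx reports.

```python
TILDE = "\u007e"
COMBINING_TILDE = "\u0303"


class ToyLayout:
    """Simplified layout: one pending dead key plus a composition table."""

    def __init__(self, compositions):
        # Maps (dead_char, next_char) to the composed string.
        self.compositions = compositions
        self.pending = None  # dead key stored by a previous call, if any

    def translate(self, char, is_dead):
        """Return (count, text); count is -1 when a dead key is stored."""
        if self.pending is None:
            if is_dead:
                self.pending = char
                return -1, char
            return 1, char
        dead, self.pending = self.pending, None
        composed = self.compositions.get((dead, char))
        if composed is not None:
            return len(composed), composed
        # No valid combination: emit the dead character, then the new one.
        return 2, dead + char


def second_call(layout):
    """Translate the dead key once (the system), then again (the handler)."""
    layout.translate(TILDE, is_dead=True)
    return layout.translate(TILDE, is_dead=True)


def terminal_suppresses(count):
    # As described in the report: any result other than 1 or -1 is dropped.
    return count not in (1, -1)


french = ToyLayout({(TILDE, "o"): "\u00f5", (TILDE, " "): TILDE})
azerty_nf = ToyLayout({
    (TILDE, "o"): "\u00f5",
    (TILDE, " "): TILDE,
    (TILDE, TILDE): COMBINING_TILDE,  # the dead key mapped to itself
})

for name, layout in (("French AZERTY", french), ("AZERTY-NF", azerty_nf)):
    count, text = second_call(layout)
    print(name, count, [f"U+{ord(c):04X}" for c in text],
          "suppressed" if terminal_suppresses(count) else "written")
```

In this model the standard layout yields a two-character result that the filter drops, while the layout with a self-mapping yields a single U+0303 that passes the filter, which matches the observed output.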

Conclusion

I understand that this is a freak issue, but I believe virtually all other Windows terminal-like apps handle the custom keyboard layout correctly, which seems to indicate that this is a bug in Microsoft Terminal only.
