Ambiguous width character in CJK environment #552

Open
opened 2026-01-30 21:55:04 +00:00 by claunia · 13 comments
Owner

Originally created by @ghost on GitHub (Feb 10, 2019).

It works perfectly in an English environment, but the behavior in a CJK environment is unstable.
When I type ☆, backspace (\b), ☆, backspace (\b), ☆, and so on, the backspace sequence is not enough to erase the character, so the character shifts one cell to the right each time.

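![c__windows_system32_cmd](https://user-images.githubusercontent.com/44701315/52535751-7a7e8600-2d95-11e9-8714-16ddd776488f.gif)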

I thought it was my mistake, so I tried drawing after querying the cursor position, but that did not solve it.
Do you have a fix?
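One way the reported drift could arise can be sketched with plain cursor arithmetic, assuming the drawing side advances two cells for ☆ while each backspace moves the cursor back only one. The function name `simulateStarBackspace` is hypothetical, and the model covers column bookkeeping only, not the actual behavior of ConPTY or Vim.

```cpp
// Hypothetical model: each cycle prints one character and then one "\b".
// glyphCells is how many cells the character occupies on screen;
// backspaceCells is how far one backspace moves the cursor back.
inline int simulateStarBackspace(int cycles, int glyphCells, int backspaceCells) {
    int column = 0;
    for (int i = 0; i < cycles; ++i) {
        column += glyphCells;        // draw the character
        column -= backspaceCells;    // process the backspace
        if (column < 0) column = 0;  // the cursor cannot leave the left edge
    }
    return column;
}
```

Under this assumption, every ☆ plus backspace cycle leaves the cursor one cell further right when the two widths disagree, and leaves it in place when they agree.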

claunia added the Issue-Feature, Area-Rendering, Product-Conpty labels 2026-01-30 21:55:05 +00:00

@k-takata commented on GitHub (Feb 15, 2019):

I hope this ConPTY issue will be fixed by the next release of Windows (1903?).


@ghost commented on GitHub (Feb 20, 2019):

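![image](https://user-images.githubusercontent.com/44701315/53100385-0731fc80-356b-11e9-8763-13ed979bfa20.png)
As shown here, the block cursor is displayed shifted half a cell to the right, and this is outside my control.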


@ghost commented on GitHub (Feb 21, 2019):

Will endeavor. The voyage to UTF-8 has just begun. Engage!


@k-takata commented on GitHub (Feb 21, 2019):

@nak Why did you close this issue?

I consider this a bug in ConPTY.
'☆' (U+2606) is an ambiguous-width character, and it is shown as full width in a Japanese environment (cp932). Of course, it is also shown as full width in the normal Command Prompt.
However, under ConPTY '☆' is handled as a half-width character (even though it is displayed at full width), so the cursor position and cursor width become wrong. Deleting '☆' with Backspace doesn't work well either: it deletes only half of the character. These should behave the same as in the normal Command Prompt.
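The mismatch described here can be sketched as a width function with an explicit policy for ambiguous characters. This is a minimal illustration, not ConPTY's implementation: `AmbiguousPolicy`, `cellWidth`, and `stringCells` are hypothetical names, and the table lists only a few code points where a real classifier would be generated from the Unicode `EastAsianWidth.txt` data.

```cpp
#include <string>

// How to treat East Asian "ambiguous" characters: a CJK legacy environment
// such as cp932 draws them full width, most other environments half width.
enum class AmbiguousPolicy { Narrow, Wide };

// Tiny illustrative classifier (assumption: only these ranges matter here).
inline int cellWidth(char32_t cp, AmbiguousPolicy policy) {
    if (cp >= 0x3041 && cp <= 0x3096) return 2;  // Hiragana: always wide
    if (cp == 0x2605 || cp == 0x2606)            // black/white star: ambiguous
        return policy == AmbiguousPolicy::Wide ? 2 : 1;
    return 1;                                    // everything else in this sketch
}

// Total number of cells a string occupies under the given policy.
inline int stringCells(const std::u32string& text, AmbiguousPolicy policy) {
    int cells = 0;
    for (char32_t cp : text) cells += cellWidth(cp, policy);
    return cells;
}
```

With `AmbiguousPolicy::Wide` the star takes two cells, as in the cp932 Command Prompt; with `Narrow` it takes one, which is how this comment describes ConPTY treating it. A Backspace that erases `cellWidth` cells under the wrong policy would leave half of the glyph behind.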


@ghost commented on GitHub (Feb 22, 2019):

Reopen it. And I will talk.

That handling in the Command Prompt is an East Asian customization. With this implementation, I think we have entered an era in which the user performs text rendering.
I follow the implementation of the Unicode experts.
If this behavior is a bug, it should be fixed as soon as problems arise.

I wanted to verify whether this implementation is inconsistent. In the meantime the issue had stayed open all day, and I thought that was discourteous.
If you have problems, create an issue for it.


@ghost commented on GitHub (Feb 22, 2019):

The shift to the right by one cell at a time comes from Vim's renderer.
Please ignore that part for now.

The system locale has no effect on the WSL Console.
Even on the Windows side, if you use ConPTY, the system locale is handled as it is on Linux.
Could ConPTY work like cmd.exe when it is under Windows control,
and behave like bash when it is under Linux (WSL) control?
Is that not possible?


@be5invis commented on GitHub (Feb 26, 2019):

So @miniksa @zadjii-msft
Do we need a callback like `HRESULT ConsoleTextShape(_In_ const WCHAR* text, _Inout_ IConsoleCellSink* sink)` that lets the application that consumes ConPTY output do the cell allocation?


@miniksa commented on GitHub (Feb 26, 2019):

@be5invis, theoretically, perhaps. But the performance overhead of having that call go through interprocess communication for every single individual character would likely be prohibitive.

We don't know what the right answer is right now. We haven't been able to invest time in coming up with a more complete solution yet. We hope to one day.


@sedwards2009 commented on GitHub (Feb 26, 2019):

@be5invis

The applications which output these characters can be on a totally different machine and operating system from the application which consumes and displays the characters. A Windows-specific function like `ConsoleTextShape()` could only solve the problem in the Windows + Windows case.

Ideally the Unicode standard would state exactly how wide each character is to be when displayed in a monospace grid. Then we would all just have to agree to follow the standard.


@ghost commented on GitHub (Feb 26, 2019):

With the current escape sequences, the absolute screen position is incorrect.
Use the relative distance from the current cursor position instead.
I would like an API that redraws only the line containing the cursor.
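As a rough illustration of the difference between absolute and relative positioning, the helpers below build the standard VT cursor sequences (`CSI row;col H`, `CSI n C`, `CSI n D`, `CSI 2 K`). The function names are hypothetical, and whether relative movement actually avoids the width disagreement under ConPTY is not established in this thread.

```cpp
#include <string>

// Absolute positioning: CSI row ; column H (both 1-based).
inline std::string moveAbsolute(int row, int column) {
    return "\x1b[" + std::to_string(row) + ";" + std::to_string(column) + "H";
}

// Relative horizontal positioning: CSI n C moves right, CSI n D moves left.
inline std::string moveRelative(int deltaColumns) {
    if (deltaColumns == 0) return "";
    const char direction = deltaColumns > 0 ? 'C' : 'D';
    const int count = deltaColumns > 0 ? deltaColumns : -deltaColumns;
    return "\x1b[" + std::to_string(count) + direction;
}

// Redraw only the cursor's line: carriage return, erase the whole line,
// then write the new contents.
inline std::string redrawCurrentLine(const std::string& text) {
    return "\r\x1b[2K" + text;
}
```

An absolute move depends on both sides agreeing on which column each character ended in, while a relative move and a single-line redraw depend only on the current cursor position.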


@be5invis commented on GitHub (Feb 27, 2019):

@sedwards2009
One obviously complex case is the box-drawing characters: they are full-width under most far east locales, but half-width under others.
Specifying a standard about how to properly shape text under a character grid is the ultimate goal, but providing shaping callbacks could become a valuable solution, since Windows console apps can directly access the character grid.


@be5invis commented on GitHub (Feb 27, 2019):

@miniksa Not every character, but every string flush.
`ConsoleTextShape` would be called once for each text flush. The sink would provide callbacks for associating a text slice with a cell run.

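```C++
#include <windows.h>  // HRESULT, UINT, WCHAR, CALLBACK, SAL annotations

// Per-stream shaping state carried between flushes.
struct ConsoleShapingState {
    // Bidi level, etc.
};

// Receives the cell layout decided by the shaping callback.
class IConsoleCellSink {
public:
    virtual HRESULT acquireScroll(UINT rows) = 0;
    virtual HRESULT putCursor(UINT row, UINT column) = 0;
    // Places cch characters of pwchText into a run of `cells` cells.
    virtual HRESULT putTextRun(UINT row, UINT column, UINT cells,
        UINT cch, const WCHAR* pwchText) = 0;
};

// Called once per text flush, not once per character.
HRESULT CALLBACK ConsoleTextShape(
    _In_ UINT cch,
    _In_ const WCHAR* pwchText,
    _In_ UINT cellMatrixWidth,
    _In_ UINT cellMatrixHeight,
    _In_ UINT startCursorRow,
    _In_ UINT startCursorColumn,
    _Inout_ ConsoleShapingState* state,
    _Inout_ IConsoleCellSink* sink);
```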
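To make the per-flush idea concrete, here is a small portable sketch of what a shaping callback and a sink might do. It does not use the proposed Windows signatures: `TextRun`, `RecordingSink`, `cjkWidth`, and `shapeFlush` are hypothetical stand-ins, the width rule covers only the star characters, and scrolling is ignored.

```cpp
#include <string>
#include <utility>
#include <vector>

// One run of text placed on the grid, mirroring the putTextRun arguments.
struct TextRun {
    unsigned row, column, cells;
    std::u32string text;
};

// Hypothetical in-memory sink that records what the shaper emits.
struct RecordingSink {
    std::vector<TextRun> runs;
    unsigned cursorRow = 0, cursorColumn = 0;
    void putTextRun(unsigned row, unsigned column, unsigned cells, std::u32string text) {
        runs.push_back({row, column, cells, std::move(text)});
    }
    void putCursor(unsigned row, unsigned column) { cursorRow = row; cursorColumn = column; }
};

// Assumed CJK-style rule for this sketch: the star characters take two cells.
inline unsigned cjkWidth(char32_t cp) { return (cp == 0x2605 || cp == 0x2606) ? 2u : 1u; }

// Shape one flushed string: the application, not the console, decides how
// many cells each code point occupies and reports the result to the sink.
inline void shapeFlush(const std::u32string& text, unsigned matrixWidth,
                       unsigned startRow, unsigned startColumn,
                       unsigned (*widthOf)(char32_t), RecordingSink& sink) {
    unsigned row = startRow, column = startColumn;
    for (char32_t cp : text) {
        const unsigned cells = widthOf(cp);
        if (column + cells > matrixWidth) { ++row; column = 0; }  // wrap to the next row
        sink.putTextRun(row, column, cells, std::u32string(1, cp));
        column += cells;
    }
    sink.putCursor(row, column);  // final cursor position after the flush
}
```

The whole string is handed over in one call, so the cost of crossing a process boundary would be paid once per flush rather than once per character.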

@ghost commented on GitHub (Mar 14, 2019):

The implementation is almost complete. It can be used normally.
https://github.com/ntak/vim-1/tree/control_ambiwidth

I would be glad if ConPTY could be used casually.

Reference: starred/terminal#552