Give API to measure the space that a string occupies #320

Open
opened 2026-01-30 21:48:43 +00:00 by claunia · 21 comments
Owner

Originally created by @be5invis on GitHub (Jul 8, 2018).

This is an extension to #57.
Given a particular console/PTY with a specified font family and size, take a string and return the space (a bit mask of the character matrix?) it would occupy.

claunia added the Issue-Feature, Product-Conhost, and Area-Server labels 2026-01-30 21:48:43 +00:00

@zadjii-msft commented on GitHub (Jul 9, 2018):

IIRC, determining the width of a string is actually a pretty hard problem. There are all sorts of crazy Unicode edge cases to handle, and there are some assumptions we make manually in the code (e.g. box-drawing chars are single-width always).

Adding @adiviness as he has been working in that area quite a lot.

I'd really doubt that we'd be adding another API to conhost. Is there an equivalent API on *nix that we could use for inspiration?

@be5invis commented on GitHub (Jul 10, 2018):

> box-drawing chars are single-width always

This has been false for CJK languages since the 1980s with some fonts, but it is true with other fonts (like my Sarasa Gothic).

@be5invis commented on GitHub (Jul 11, 2018):

I have seen multiple libraries trying to "guess" the actual width of a string, like

https://github.com/martinheidegger/varsize-string
https://github.com/sindresorhus/widest-line
https://github.com/sindresorhus/string-width

If we have an accurate API then it would greatly help people writing console applications.

@vapier commented on GitHub (Jul 21, 2018):

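on the unix side, there isn't a great API for terminal emulators. the closest are the wcwidth (http://pubs.opengroup.org/onlinepubs/9699919799/functions/wcwidth.html) and wcswidth (http://pubs.opengroup.org/onlinepubs/9699919799/functions/wcswidth.html) functions. they effectively operate on code points.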

most programs (both terminal emulators and editors/tools) tend to just use wcwidth which mostly works OK as long as you stick to the simpler things: code points that are unambiguously narrow (1) or wide (2) or unprintable (-1), or you assume combining characters (0) always "attach" to the previous printable codepoint. anything involving zero width joiners is out the window, as are any scripts involving more complicated rules, or RTL scripts.

common examples of complicated rules:

  • ತ್ಯ is ತ್ (U+0CA4 U+0CCD) followed by ಯ (U+0CAF). wcwidth would count those as 1/0/1 (which is correct), but when considered together, it should be 1.
  • ఫ్ట్వే is ఫ్ (U+0C2B U+0C4D) followed by ట్ (U+0C1F U+0C4D) followed by వే (U+0C35 U+0C47). wcwidth would count those as 1/0/1/0/1/0 (which is correct), but when considered together, it should be 1.

wcswidth should be able to calculate the right answer, but i don't think most implementations handle these graphemes correctly either.
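
To make the per-code-point behaviour described above concrete, here is a minimal Python sketch of a wcwidth-style lookup. It is not the libc implementation: `naive_cell_width` and `naive_string_width` are hypothetical helpers built on the standard `unicodedata` module, and their answers depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

def naive_cell_width(ch: str) -> int:
    """Approximate wcwidth() for a single code point (hypothetical helper)."""
    if ch == "\0":
        return 0
    category = unicodedata.category(ch)
    if category == "Cc":                 # control characters: unprintable
        return -1
    if category in ("Mn", "Me", "Cf"):   # combining marks and format characters
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2                         # East Asian Wide / Fullwidth
    return 1

def naive_string_width(text: str) -> int:
    """Sum per-code-point widths; -1 if any code point is unprintable."""
    total = 0
    for ch in text:
        width = naive_cell_width(ch)
        if width < 0:
            return -1
        total += width
    return total

# The Kannada cluster from the first example: U+0CA4 U+0CCD U+0CAF.
cluster = "\u0ca4\u0ccd\u0caf"
print([naive_cell_width(ch) for ch in cluster])
print(naive_string_width(cluster))
```

Summing code point by code point reports two cells for this cluster, which is exactly the gap described above: nothing in a per-code-point table knows that the three code points are shaped together.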

the original question was about the rendering box needed for a particular grapheme in a particular font. this shouldn't matter, but in practice, a lot of fonts (including monospace ones) aren't consistent in their widths/heights. they can be narrower or wider than a single cell requiring manual intervention to center/scale them in the respective cells. freetype/fontconfig are the standard font related libraries in the unix world for rendering.

along those lines, wide-characters (i.e. CJK) should be taking up two cells even if the font gets it wrong. otherwise you easily run out of sync with the console's idea of cursor location and the remote application's idea of cursor location. i grok that this might be a fundamental limitation in the existing Windows console code and is not trivial to resolve.

hth.

@JFLarvoire commented on GitHub (Aug 31, 2018):

+1 for having such a function.
I have a command-line tool called dirc.exe (https://github.com/JFLarvoire/SysToolsLib/releases) for comparing directories side by side, and the column alignment breaks if file names contain 0-width characters or double-width ideograms.
I tried using open-source versions of wcswidth(), but this is not fool-proof: The Windows console does not always size characters as expected by that function. What we really need is the console itself telling us what it will display.
And contrary to what @zadjii-msft said in his comment above, I think this is easy to do: The console can simply write the string into a private hidden screen buffer (using the very same code it uses for displaying it in the visible buffer), and report how far the hidden buffer's cursor moved.

@kghost commented on GitHub (Oct 15, 2018):

The most important thing is not how you measure the width; it is that the terminal app and the console app agree on the measurement. When the widths don't match, it messes up all ncurses apps and tmux/screen.

So instead of providing another platform-dependent function, I strongly suggest using a widely used library like utf8proc (https://github.com/JuliaStrings/utf8proc), with this patch (https://github.com/JuliaStrings/utf8proc/pull/83), to determine character width. It mostly follows the Unicode standard (http://unicode.org/reports/tr11/).

Also, some characters have situational width, depending on locale. Make sure your app can handle this, or just use the library.
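
As a rough illustration of "situational width", the Python sketch below treats the East Asian Width class `A` (ambiguous) as a policy choice rather than a fixed number. The `cell_width` helper and its `ambiguous_is_wide` flag are assumptions made for this example; they are not part of utf8proc or of any console API, and the sketch ignores combining and control characters.

```python
import unicodedata

def cell_width(ch: str, ambiguous_is_wide: bool = False) -> int:
    """Width of one printable code point under a chosen ambiguous-width policy."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A":
        # Ambiguous: narrow in most contexts, wide in legacy CJK contexts.
        return 2 if ambiguous_is_wide else 1
    return 1

box = "\u2500"  # BOX DRAWINGS LIGHT HORIZONTAL
print(unicodedata.east_asian_width(box))
print(cell_width(box), cell_width(box, ambiguous_is_wide=True))
```

Box-drawing characters fall into the ambiguous class, which is why the earlier "always single-width" assumption holds with some fonts and locales but not others. Whatever policy is chosen, the application and the terminal have to choose the same one.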

@be5invis commented on GitHub (Feb 11, 2019):

@kghost @miniksa
What we ultimately need is proper support for text shaping in the console, which is not a well-studied area. The concept may be close to justification, which is another not-well-studied area...
But how about letting the terminal (the one that uses ConPTY) guide Conhost on how to associate cells with text runs? Then they could have arbitrarily complex rendering/layout/justification. Just imagine complex scripts like the Indic scripts working in the console!

@zadjii-msft commented on GitHub (Jul 12, 2021):

From @alabuzhev in #10592

> It is not uncommon for text mode apps to organise and display data in a table-like way with multiple columns. To do so, for any arbitrary string an app should be able to calculate its visible length, i.e. the number of screen cells occupied, and truncate it or pad it with spaces to fit into the desired column.
>
> Historically the most popular way to do so is to just take the string size in characters, e.g. string.size() (here and below "character" means wchar_t), assuming that each character occupies exactly one cell. It is extremely easy and for the USA and Europe it usually "just works". Except when it doesn't. Sooner or later unusual characters go slipping through the cracks and that assumption goes out with a bang: the rendered string is actually longer (or shorter) than expected and all the following characters are shifted. And all the following lines as well. Oops. You've probably seen that already somewhere.
>
> So, to make sure that everything works even with unusual characters, apps need to do something smarter and treat different characters differently. There are ways to do that, e.g. using external libraries or Unicode data directly. There's one, just one small problem with that approach: text mode apps don't and can't render anything directly. The actual rendering happens in a different process in an unpredictable way: the number of occupied cells could depend on the OS version, the console host, the console mode, the API used, the output codepage, the active font, the colour of the character (yes), and so on and so forth.
>
> In other words, to do the right thing, it's not enough to fully support Unicode and take into account character widths, grapheme clusters etc. An app needs to ask itself "what would the renderer do?" first. And it's not exactly trivial to find out. Even the methods that worked in the past, e.g. checking the OS version or querying the console font, are now deprecated and either don't work without advanced magic (https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getversionexw) or don't work at all (https://docs.microsoft.com/en-us/windows/console/getcurrentconsolefontex) in Terminal.
>
> So, are there any reasonable ways / recommendations to predict the renderer behaviour and say for sure "if I print this particular string, the cursor will move exactly N characters to the right"? (not even to mention RTL, that's a different PITA).
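
The column-fitting task described in the quote can be sketched in Python as follows. Both helpers are hypothetical, and `cell_width` is only an approximation based on the interpreter's Unicode tables, so this shows the application-side arithmetic and does not predict what any particular renderer will do.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Rough per-code-point width; an assumption, not a renderer guarantee."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def fit_to_cells(text: str, cells: int) -> str:
    """Truncate or space-pad text so it occupies exactly `cells` columns."""
    kept = []
    used = 0
    for ch in text:
        width = cell_width(ch)
        if used + width > cells:
            break                      # the next character would overflow the column
        kept.append(ch)
        used += width
    return "".join(kept) + " " * (cells - used)

for name in ("report.txt", "\u4e2d\u6587\u6587\u4ef6\u540d.txt"):
    print("|" + fit_to_cells(name, 8) + "|")
```

Counting cells instead of `len(text)` keeps the column edges aligned only for as long as `cell_width` matches the terminal, which is the whole point of the request in this thread.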

@viktor-podzigun commented on GitHub (Jan 23, 2024):

fyi: there is a new Unicode Terminal Complex Script Support (TCSS) proposal: https://www.unicode.org/L2/L2023/23107-terminal-suppt.pdf

@tig commented on GitHub (Jan 23, 2024):

> fyi: there is new Unicode Terminal Complex Script Support, or TCSS proposal

@DHowett-MSFT - I'm super interested in helping with this. At the minimum, you can count on Terminal.Gui as being a test case. Please feel free to reach out (tig (at) kindel (dot) com).

@zadjii-msft commented on GitHub (Jan 23, 2024):

> fyi: there is new Unicode Terminal Complex Script Support, or TCSS proposal

In fact, @DHowett is listed on the author line of that proposal 😅

@j4james commented on GitHub (Jan 23, 2024):

Note that there's also Contour's Unicode Core proposal (https://github.com/contour-terminal/terminal-unicode-core), which has already been adopted by a number of other terminals, and at least one application that I'm aware of.

@german-one commented on GitHub (Aug 18, 2024):

FWIW To be in line with what is/becomes the default in Windows Terminal, I used the code effective in v1.22 to measure the displayed width of strings. The results are really promising.
See https://github.com/german-one/wtswidth---Windows-Terminal-string-width
As an aside: I really appreciate the Terminal's evolution towards Unicode correctness. 👍

@unxed commented on GitHub (Aug 29, 2024):

> See https://github.com/german-one/wtswidth---Windows-Terminal-string-width

This page mentions the Proper Complex Script Support in Text Terminals proposal (https://www.unicode.org/L2/L2023/23107-terminal-suppt.pdf) from unicode.org. I also translated it into Russian (https://github.com/unxed/UnicodeTerminals). I hope my Russian-speaking colleagues find it helpful!

@unxed commented on GitHub (Aug 29, 2024):

Because the result of wcwidth() execution is not always reliable in practice (for example, it often returns -1 for characters that do occupy screen space, which needs to be taken into account somehow), I used the following hack: I measured the actual width in terminal cells for the first 1,114,111 Unicode characters.

The measurement was performed using the following algorithm:

  1. move the cursor to position 0, 0
  2. output the character
  3. get the current cursor position from the terminal
  4. measure the offset

And so on for each character. I did this in the GNOME terminal with default settings. I attach the result, as well as the source code of the program for measurement - you can run it in MS Terminal or in any other terminal and compare the results. Be prepared for the process to be quite lengthy: it took me two days.

screen_cell_measure.tar.gz (https://github.com/user-attachments/files/16801771/screen_cell_measure.tar.gz)
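
For readers who do not want to unpack the archive, the four steps above can be sketched in Python as below. This is not the attached program: `parse_cursor_report` and `measure_cells` are hypothetical names, and the sketch assumes a POSIX terminal that answers the standard Device Status Report request (`ESC [ 6 n`) with a Cursor Position Report (`ESC [ row ; col R`).

```python
import re
import sys
from typing import Tuple

CPR_PATTERN = re.compile(rb"\x1b\[(\d+);(\d+)R")

def parse_cursor_report(reply: bytes) -> Tuple[int, int]:
    """Parse a Cursor Position Report into 1-based (row, column)."""
    match = CPR_PATTERN.search(reply)
    if match is None:
        raise ValueError(f"not a cursor position report: {reply!r}")
    return int(match.group(1)), int(match.group(2))

def measure_cells(text: str) -> int:
    """Print text at the top-left corner and return how far the cursor moved.

    Needs an interactive POSIX terminal, so nothing calls it at import time.
    """
    import os
    import termios
    import tty

    fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)                                  # unbuffered input, no echo
        sys.stdout.write("\x1b[1;1H" + text + "\x1b[6n")   # home, print, ask for position
        sys.stdout.flush()
        reply = b""
        while not reply.endswith(b"R"):
            chunk = os.read(fd, 1)
            if not chunk:                                  # EOF: give up
                break
            reply += chunk
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
    _, column = parse_cursor_report(reply)
    return column - 1                                      # the cursor started in column 1
```

Each measurement is a full round trip through the terminal, which fits the two-day runtime reported above. The sketch also ignores line wrapping, so it is only meaningful for text shorter than the terminal width.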

@unxed commented on GitHub (Sep 29, 2024):

There are some quite reasonable thoughts on this here:
https://github.com/magiblot/tvision/issues/51#issuecomment-2367458336

@lhecker commented on GitHub (Sep 30, 2024):

The primary issue with having an API that asks the terminal is that it requires a costly cross-process roundtrip. Console applications on Windows are already some of the worst of any OS when it comes to performance precisely due to this issue. However, we can solve that by exposing the measurement as a function in kernel32.dll. Since we own the OS and platform, we can simply build the internal APIs needed to inject the Terminal's idea of Unicode into the console processes it owns. This would make it as fast as wcswidth on UNIX but with more precise control over the Unicode version/algorithm.

I'm not a big fan of the comment you linked, because text attributes that influence text measurement would hurt performance. Right now, an ideal terminal can infer the cursor position purely from measuring text after VT parsing (even if VT line renditions are used). If the linked comment's idea were adopted, this wouldn't work anymore and the attributes would need to be checked during text iteration. I expect that this would halve the performance, or something of that order.

@o-sdn-o's character geometry proposal on the other hand wouldn't have this issue. The codepoints they're proposing would be part of the grapheme cluster segmentation that would need to happen regardless. It would allow terminal applications precise control over the width of ambiguous width codepoints (or those few awkward codepoints that are narrow/wide but should be the opposite). I'm not sure whether the vertical sizing, rotation, and halving are needed in practice though. I think they should only be adopted if a stronger need for them arises.

I think, if anything, the introduction of a wcswidth-like API on Windows would be the least controversial as it wouldn't require coordination with the Unicode consortium. But o-sdn-o's proposal may be worth consideration for when it comes to an official Unicode spec.

@o-sdn-o commented on GitHub (Sep 30, 2024):

> I'm not sure whether the vertical sizing, rotation, and halving are needed in practice though. I think they should only be adopted if a stronger need for them arises.

Vertical sizing and halving are just side effects. One of the points in this approach: the terminal can operate internally exclusively with 1x1 fragments, this will dramatically simplify its life. Receiving a 3x3 cluster, it breaks it into nine independent (adjacent for the time being) objects of size 1x1. If necessary, for example, for the purpose of copying a selection, it can reassemble this cluster back into a monolith.
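
A small Python sketch of the bookkeeping described here, under heavy assumptions: the `Fragment` record and both functions are invented for illustration and do not reflect the proposal's actual encoding or any terminal's internals. The point is only that every 1x1 cell can carry enough information to rebuild its cluster later.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Fragment:
    cluster: str   # full text of the cluster this cell belongs to
    col: int       # 0-based column of this cell inside the cluster
    row: int       # 0-based row of this cell inside the cluster
    cols: int      # cluster width in cells
    rows: int      # cluster height in cells

def split_cluster(cluster: str, cols: int, rows: int) -> List[Fragment]:
    """Break a cols x rows cluster into independent 1x1 fragments."""
    return [Fragment(cluster, c, r, cols, rows)
            for r in range(rows) for c in range(cols)]

def reassemble(fragments: List[Fragment]) -> Optional[str]:
    """Return the cluster text if the fragments form one complete cluster."""
    if not fragments:
        return None
    first = fragments[0]
    expected = {(c, r) for r in range(first.rows) for c in range(first.cols)}
    seen = set()
    for frag in fragments:
        if (frag.cluster, frag.cols, frag.rows) != (first.cluster, first.cols, first.rows):
            return None                # fragments from different clusters
        seen.add((frag.col, frag.row))
    return first.cluster if seen == expected else None

cells = split_cluster("\U0001F600", 3, 3)
print(len(cells))
print(reassemble(cells) == "\U0001F600")
```

With this layout the screen buffer only ever stores 1x1 cells, and operations such as copying a selection call something like `reassemble` to decide whether a full cluster is present.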

@tig commented on GitHub (Sep 27, 2025):

We (Terminal.Gui) have discovered the following codepoints are reported by wcwidth as 1 column wide, but treated by WT as 2.

Here's our prototype workaround code:

    /// <summary>Gets the number of columns the rune occupies in the terminal.</summary>
    /// <remarks>This is a Terminal.Gui extension method to <see cref="System.Text.Rune"/> to support TUI text manipulation.</remarks>
    /// <param name="rune">The rune to measure.</param>
    /// <returns>
    ///     The number of columns required to fit the rune, 0 if the argument is the null character, or -1 if the value is
    ///     not printable, otherwise the number of columns that the rune occupies.
    /// </returns>
    public static int GetColumns (this Rune rune)
    {
        int value = rune.Value;

        // HACK: This is a workaround for the fact that WT treats these glyphs as 2 columns wide
        // HACK: when Unicode (and wcwidth) say they are 1.
        // HACK: See https://github.com/gui-cs/Terminal.Gui/pull/4255 for more.
        // Check I Ching symbol ranges (hardcoded for reliability)
        if (value is >= 0x2630 and <= 0x2637 ||  // Trigrams
            value is >= 0x268A and <= 0x268F ||  // Monograms/Digrams
            value is >= 0x4DC0 and <= 0x4DFF)    // Hexagrams
        {
            return 2; // Assume double-width due to Windows Terminal font rendering
        }

        // Fallback to original GetWidth for other code points
        return UnicodeCalculator.GetWidth (rune);
    }

Posting this here as another example of why it would be wonderful for there to be a standard way of asking a terminal how wide it will render a glyph.

https://github.com/gui-cs/Terminal.Gui/pull/4255

@j4james commented on GitHub (Sep 27, 2025):

@tig According to the EastAsianWidth.txt documentation (https://www.unicode.org/Public/17.0.0/ucd/EastAsianWidth.txt), they're categorized as W, which is "East Asian Wide".

2630..2637     ; W  # So     [8] TRIGRAM FOR HEAVEN..TRIGRAM FOR EARTH
268A..268F     ; W  # So     [6] MONOGRAM FOR YANG..DIGRAM FOR GREATER YIN
4DC0..4DFF     ; W  # So    [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION

Is there something I'm missing? Or is it possible your wcwidth library is out of date? Either way this sounds like a bug in someone's code which needs to be fixed.
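
One way to check a width library against the data file is to parse the relevant lines directly. The Python sketch below works on the three lines quoted above; `parse_ranges` and `lookup` are hypothetical helpers, and the final line prints whatever the local interpreter's bundled Unicode tables say, which may be older than the file being quoted.

```python
import unicodedata

EAW_LINES = """\
2630..2637     ; W  # So     [8] TRIGRAM FOR HEAVEN..TRIGRAM FOR EARTH
268A..268F     ; W  # So     [6] MONOGRAM FOR YANG..DIGRAM FOR GREATER YIN
4DC0..4DFF     ; W  # So    [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
"""

def parse_ranges(data: str):
    """Parse EastAsianWidth.txt-style lines into (first, last, property) tuples."""
    ranges = []
    for line in data.splitlines():
        line = line.split("#", 1)[0].strip()       # drop the trailing comment
        if not line:
            continue
        span, prop = (part.strip() for part in line.split(";"))
        first, _, last = span.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), prop))
    return ranges

def lookup(ranges, code_point: int, default: str = "N") -> str:
    """Return the property for code_point, or `default` if the excerpt lacks it."""
    for first, last, prop in ranges:
        if first <= code_point <= last:
            return prop
    return default

ranges = parse_ranges(EAW_LINES)
print(lookup(ranges, 0x4DC0))
print(unicodedata.unidata_version, unicodedata.east_asian_width("\u4dc0"))
```

If the second line disagrees with the first, the width tables in use predate the data file, which is the kind of mismatch being asked about here.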

@tig commented on GitHub (Sep 27, 2025):

We're using wcwidth via this package: https://github.com/spectreconsole/wcwidth

Well, shit. It does look like it's out of date.

This is an example:

䷀ Hexagram For The Creative Heaven - U+4DC0

I just tried using https://wcwidth.readthedocs.io and it returns 2 for U+4DC0.

Damn.


Reference: starred/terminal#320