Specify char set for double-click selection instead of word boundaries #23945

Open
opened 2026-01-31 08:56:55 +00:00 by claunia · 1 comment

Originally created by @inzeets on GitHub (Jan 7, 2026).

Originally assigned to: @DHowett on GitHub.

Description of the new feature

Is there any way to make wt's double-click selection pick a chunk of text made up of a predefined character set, in the conventional way that vim (iskeyword) or putty (character classes) do? I know the character set I'm working with and what should make up my double-click selection, so I can enumerate it (ASCII minus some chars). Otherwise I need to list almost the whole Unicode table as delimiters to keep sporadic UI symbols (border elements, file-type icons, etc.) from being selected as part of a filename, keyword, variable, URL, etc. Getting these symbols copied every so often is distracting and gets annoying over time. Is this just because I don't understand what is the logic behind taking "word delimiters" approach and not providing all Unicode chars that's not a word as a default value, so I have to maintain my own list of known delimiters I hit over time? Please explain.

Proposed technical implementation details

Use "Double-click selection delimiter chars" if it's not empty, and "Double-click selection character set" otherwise.
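
To make the proposed fallback concrete, here is a minimal sketch in Python. It is not Windows Terminal code: the function name `expand_selection` and the parameters `delimiters` and `charset` are hypothetical stand-ins for the two settings named above, and treating whitespace as an implicit delimiter is an assumption.

```python
def expand_selection(line, col, delimiters="", charset=""):
    """Return (start, end) of the double-click selection around line[col].

    Hypothetical sketch of the proposal: use the delimiter list when it is
    non-empty, otherwise fall back to an explicit set of word characters.
    """
    if delimiters:
        # Delimiter mode: everything except the listed delimiters is a word
        # character. Assumption: whitespace always breaks a word.
        def is_word(ch):
            return ch not in delimiters and not ch.isspace()
    else:
        # Character-set mode: only the listed characters form a word.
        def is_word(ch):
            return ch in charset

    if not is_word(line[col]):
        return col, col + 1  # clicked on a non-word character: select it alone

    start = col
    while start > 0 and is_word(line[start - 1]):
        start -= 1
    end = col + 1
    while end < len(line) and is_word(line[end]):
        end += 1
    return start, end


ASCII_WORD = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-"
line = "│main.py│"
print(expand_selection(line, 3, delimiters=" /\\"))  # border characters are swept in
print(expand_selection(line, 3, charset=ASCII_WORD))  # only the file name is selected
```

With the delimiter list, the box-drawing characters around the file name end up in the selection because they were never listed. With the character set they are excluded without having to be enumerated.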

claunia added the Issue-Feature, Needs-Triage, Needs-Tag-Fix labels 2026-01-31 08:56:56 +00:00

@DHowett commented on GitHub (Jan 15, 2026):

We spent a little bit of time brainstorming about this. It's interesting: putty was one of our comparables when we first added support for double-click selection. I'm a long-time putty user as well, and so internally we do use a character class system. We just don't expose it for customization.

There's a cloud of issues here:

- #3077
- #3078
- #3196
- #7173
- And for Accessibility, #11313

We should probably do something somewhere in the middle of all this. I don't think we should actually allow users to specify a character class for every character. Putty offers 0-127, which is great. It doesn't offer box and line drawing characters for customization.

Maybe allowing character class nomination via regex character classes (which covers simple cases like [a-zA-Z] all the way through complex cases like Unicode character classes \p{Script=Cyrillic}) would be a good place to end up.

Perhaps it would look something like this?

"wordDelimiterClasses": [
  "[\\x00-\\x1b]",
  "\\s",
  "\\w",
  "\\u{...}",
]
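
One possible reading of that setting, sketched in Python rather than in the terminal's own code: each entry defines one character class, a character belongs to the first class whose pattern matches it, and a double-click selects the run of neighbouring characters in the same class. The helper names `class_of` and `select_run` are hypothetical, and the "first match wins" rule is an assumption. Python's standard `re` module does not support `\p{...}` properties, so the sketch sticks to the simpler patterns.

```python
import re

# Patterns taken from the example setting (minus the elided last entry).
PATTERNS = [r"[\x00-\x1b]", r"\s", r"\w"]
CLASSES = [re.compile(p) for p in PATTERNS]


def class_of(ch, classes=CLASSES):
    """Index of the first class matching ch, or -1 if none matches."""
    for index, pattern in enumerate(classes):
        if pattern.fullmatch(ch):
            return index
    return -1  # unclassified, e.g. box-drawing characters


def select_run(line, col, classes=CLASSES):
    """Return (start, end) of the same-class run around line[col]."""
    target = class_of(line[col], classes)
    start = col
    while start > 0 and class_of(line[start - 1], classes) == target:
        start -= 1
    end = col + 1
    while end < len(line) and class_of(line[end], classes) == target:
        end += 1
    return start, end


line = "│foo_bar baz│"
start, end = select_run(line, 2)
print(line[start:end])  # the word under the click, without the border character
```

Characters that match no class fall into an implicit "other" class here, which is what keeps the border character out of the selected word.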

To answer this:

> I don't understand what is the logic behind taking "word delimiters" approach and not providing all Unicode chars that's not a word as a default value

Well, our contemporaries at the time did it that way too. iTerm 2 defined their delimiters (https://iterm2.com/3.0/documentation-preferences.html) as:

> When you double-click in the terminal window, a "word" is selected. A word is defined as a string delimited by characters of a different class. The classes of characters are whitespace, word characters, and non-word characters. *The characters in this field define the set of non-word characters.*
>
> (emphasis mine)

Konsole has the same:

[Image: https://github.com/user-attachments/assets/5ffab0dc-c0ac-4d86-8143-5bb0d79b8e99]

But filling it with a list of all Unicode characters which are not parts of words would have made the default settings file huge indeed. Because of the presumption that the setting must be able to be set and edited via the UI, it is a single field, and it cannot be "internally, secretly, we use a Unicode table".

Reference: starred/terminal#23945