[PR #7304] Refactor VT control sequence identification #26889

Open
opened 2026-01-31 09:18:44 +00:00 by claunia · 0 comments
Owner

Original Pull Request: https://github.com/microsoft/terminal/pull/7304

State: closed
Merged: Yes


This PR changes the way VT control sequences are identified and
dispatched, to be more efficient and easier to extend. Instead of
parsing the intermediate characters into a vector, and then having to
identify a sequence using both that vector and the final char, we now
use just a single uint64_t value as the identifier.

The identifier is constructed by taking the private parameter prefix,
each of the intermediate characters, and then the final character, and
shifting them into a 64-bit integer one byte at a time, in reverse
order, so the first character ends up in the least significant byte.
For example, the DECTLTC control has a private parameter prefix of ?,
one intermediate of ', and a final character of s. The ASCII values of
those characters are 0x3F, 0x27, and 0x73 respectively, and reversing
them gets you 0x73273F, so that would then be the identifier for the
control.
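
To make the packing concrete, here is a minimal sketch of how such an identifier could be built. `MakeId` is a hypothetical helper written for this illustration and is not the code in the PR; it only assumes the byte layout described above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not taken from the PR): packs the characters of a
// sequence (private prefix, intermediates, final) into a 64-bit identifier.
// The first character lands in the lowest byte, so the characters appear in
// reverse order when the value is written as a hex literal.
constexpr uint64_t MakeId(const char* chars, size_t count)
{
    uint64_t id = 0;
    // Anything past eight characters does not fit and is ignored here.
    for (size_t i = 0; i < count && i < sizeof(uint64_t); ++i)
    {
        // Place each character in the next free byte.
        id |= static_cast<uint64_t>(static_cast<unsigned char>(chars[i])) << (8 * i);
    }
    return id;
}

// DECTLTC: prefix '?' (0x3F), intermediate '\'' (0x27), final 's' (0x73).
static_assert(MakeId("?'s", 3) == 0x73273F, "matches the example above");
```

Because the helper is `constexpr`, identifiers built this way could be used as compile-time constants, for example as enumerator values or `case` labels.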

The reason for storing them in reverse order is that we sometimes need
to look at the first intermediate to determine the operation, and treat
the rest of the sequence as a kind of sub-identifier (the character set
designation sequences are one example of this). In reverse order, this
is easily achieved by extracting the low byte with a mask to get the
first intermediate, and then shifting the value right by 8 bits to get a
new identifier with the rest of the sequence.
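
As a rough illustration of that split, the sketch below uses two hypothetical helpers, `FirstChar` and `SubId` (names invented for this example, not taken from the PR), on the identifier for the character set designation `ESC ( B`.

```cpp
#include <cstdint>

// Hypothetical helpers (names are illustrative, not from the PR).

// Returns the character stored in the lowest byte of the identifier.
constexpr char FirstChar(uint64_t id)
{
    return static_cast<char>(id & 0xFF);
}

// Returns the identifier that remains after dropping the lowest byte.
constexpr uint64_t SubId(uint64_t id)
{
    return id >> 8;
}

// ESC ( B: intermediate '(' (0x28), final 'B' (0x42), packed in reverse.
constexpr uint64_t designateG0Ascii = 0x4228;
static_assert(FirstChar(designateG0Ascii) == '(', "first intermediate selects the operation");
static_assert(SubId(designateG0Ascii) == 0x42, "the rest identifies the character set");
```

Here the first intermediate selects which character set slot is being designated, and the remaining value can be looked up on its own as the character set identifier.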

With 64 bits we have enough space for a private prefix, six
intermediates, and the final char, which is way more than we should ever
need (the DEC STD 070 specification recommends supporting at least
three intermediates, but in practice we're unlikely to see more than
two).

With this new way of identifying controls, it should now be possible for
every action code to be unique (for the most part). So I've also used
this PR to clean up the action codes a bit, splitting the codes for the
escape sequences from the control sequences, and sorting them into
alphabetical order (which also does a reasonable job of clustering
associated controls).
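
One way to picture unique action codes is an enumeration whose values are the identifiers themselves. The following is only a sketch under that assumption: the enum name, the small subset of controls, and the `IsKnownCsiAction` function are all hypothetical and chosen for illustration, not copied from the PR.

```cpp
#include <cstdint>

// Hypothetical subset of control sequence action codes, in alphabetical
// order. Each value is the packed identifier for that control.
enum class CsiAction : uint64_t
{
    CUD = 0x42,         // final 'B'
    CUU = 0x41,         // final 'A'
    DECTLTC = 0x73273F, // '?' prefix, '\'' intermediate, final 's'
};

// Hypothetical dispatch check: a single switch on the identifier replaces
// a comparison of an intermediates vector plus the final character.
constexpr bool IsKnownCsiAction(uint64_t id)
{
    switch (static_cast<CsiAction>(id))
    {
    case CsiAction::CUD:
    case CsiAction::CUU:
    case CsiAction::DECTLTC:
        return true;
    default:
        return false;
    }
}
```

Since the prefix and intermediates are part of the value, a sequence with the same final character but a different prefix or intermediate gets a different code and does not match by accident.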

Validation Steps Performed

I think the existing unit tests should be good enough to confirm that
all sequences are still being dispatched correctly. However, I've also
manually tested a number of sequences to make sure they were still
working as expected, in particular those that used intermediates, since
they were the most affected by the dispatch code refactoring.

Since these changes also affected the input state machine, I've done
some manual testing of the conpty keyboard handling (both with and
without the new Win32 input mode enabled) to make sure the keyboard VT
sequences were processed correctly. I've also manually tested the
various VT mouse modes in Vttest to confirm that they were still working
correctly too.

Closes #7276

claunia added the pull-request label 2026-01-31 09:18:44 +00:00

Reference: starred/terminal#26889