Unit test framework for terminal emulation #2436

Open
opened 2026-01-30 22:55:00 +00:00 by claunia · 3 comments

Originally created by @darrenstarr on GitHub (Jul 1, 2019).

Summary of the new feature/enhancement

Currently, terminal emulation for Unix is extremely incomplete.

  • The parser fails in many circumstances and can't cope with what would be considered backreferences in a regular expression engine. It is missing support for many different modes of operation...
  • The terminal escape sequence interpreter is minimal. It's in approximately the same state that my terminal emulator VtNetCore.UWP (https://github.com/darrenstarr/VtNetCore.UWP) was in on the second or third day of development. It has barely the minimal implementation required to run a few sample cases. Things like Midnight Commander barely function at all (though I'm REALLY impressed by how much does function). vttest more or less considers Microsoft Terminal unusable... look at basic terminal comparison on Windows (https://medium.com/@ITGuyGoneBad/vttest-on-different-terminals-4235d4d7aee6)
  • Unit tests for the parser and the command sequence interpreter are minimal at best, incorrect in some cases, and very coarse-grained.

I don't believe that the current state of the terminal emulator's unit testing framework is suitable for permitting clear pull requests or filing clear issues with test cases.

Anyone building a terminal that is meant to be tested for correctness needs a proper testing framework; without one, it would be impossible for any of us to make a terminal work.

Proposed technical implementation details (optional)

I recommend writing a unit test framework that is part of the CI pipeline and that supports describing and writing unit tests in the following fashion:

  • Name of test
  • Input data
  • Compliance level (minimum viable product/ECMA-48/xterm/academic)
  • Expected cursor state?
  • Expected window characteristics?
  • Expected window text and attributes?
  • Query text characteristics (character width, character height, etc...)?
  • Query window as pixels?
  • Query window as text?
  • Query current buffer?
  • Query mouse pointer?
  • Query mouse position?
  • Query mouse colors?
  • Query protected areas?
  • Query current character set?
  • Query state of sequence parser?
  • Query state of sequence buffer?

I recommend supporting a "function as a service" implementation that would allow describing the test as a JSON file and a JavaScript or TypeScript file for the test itself.
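As a rough illustration of the descriptor idea, here is a minimal TypeScript sketch. Everything in it is an assumption made for the example: the `TerminalTestCase` shape, the `ComplianceLevel` names, the `ToyTerminal` class and `runTestCase` are hypothetical and do not correspond to any API in Windows Terminal or VtNetCore. The toy terminal understands only printable text, CR, LF and CUP, which is just enough to show a descriptor being checked against cursor and screen state.

```typescript
// Hypothetical descriptor shape; the field names are illustrative, not an existing schema.
type ComplianceLevel = "mvp" | "ecma-48" | "xterm" | "academic";

interface TerminalTestCase {
  name: string;
  input: string; // raw characters fed to the terminal
  compliance: ComplianceLevel;
  expectedCursor?: { row: number; column: number }; // 1-based
  expectedText?: string[]; // one entry per row, trailing blanks trimmed
}

// Toy stand-in for a real emulator: printable text, CR, LF and CUP (CSI row;col H) only.
class ToyTerminal {
  row = 1;
  column = 1;
  private rows: number;
  private columns: number;
  private cells: string[][] = [];

  constructor(rows: number, columns: number) {
    this.rows = rows;
    this.columns = columns;
    for (let r = 0; r < rows; r++) {
      const line: string[] = [];
      for (let c = 0; c < columns; c++) line.push(" ");
      this.cells.push(line);
    }
  }

  write(data: string): void {
    // Each match is either a whole CUP sequence or a single character.
    const token = /\x1b\[(\d*)(?:;(\d*))?H|[\s\S]/g;
    let m: RegExpExecArray | null;
    while ((m = token.exec(data)) !== null) {
      const text = m[0];
      if (text.length > 1) {
        // CUP: missing or zero parameters default to 1; clamp to the screen.
        this.row = Math.min(Number(m[1] || 1) || 1, this.rows);
        this.column = Math.min(Number(m[2] || 1) || 1, this.columns);
      } else if (text === "\r") {
        this.column = 1;
      } else if (text === "\n") {
        this.row = Math.min(this.row + 1, this.rows); // no scrolling in this toy
      } else if (text >= " ") {
        this.cells[this.row - 1][this.column - 1] = text;
        this.column = Math.min(this.column + 1, this.columns); // no autowrap
      }
    }
  }

  screenText(): string[] {
    return this.cells.map((line) => line.join("").replace(/ +$/, ""));
  }
}

// Feeds the input to a fresh terminal and returns a list of failure messages.
function runTestCase(tc: TerminalTestCase, rows = 3, columns = 10): string[] {
  const term = new ToyTerminal(rows, columns);
  term.write(tc.input);
  const failures: string[] = [];
  const c = tc.expectedCursor;
  if (c && (c.row !== term.row || c.column !== term.column)) {
    failures.push(`${tc.name}: cursor at ${term.row},${term.column}`);
  }
  const actual = JSON.stringify(term.screenText());
  if (tc.expectedText && JSON.stringify(tc.expectedText) !== actual) {
    failures.push(`${tc.name}: screen was ${actual}`);
  }
  return failures;
}

// The descriptor itself is plain data, so it could equally live in a JSON file.
const cupThenText: TerminalTestCase = {
  name: "CUP moves the cursor before printing",
  input: "\x1b[2;3Hhi",
  compliance: "ecma-48",
  expectedCursor: { row: 2, column: 5 },
  expectedText: ["", "  hi", ""],
};
```

Calling `runTestCase(cupThenText)` returns an empty list when the terminal agrees with the descriptor. Keeping the descriptor as data and the checks as code is what would let the same case run against more than one emulator.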

I intend to work towards this goal with my VtNetCore engine as well. I have about 200 unit tests at this time and have planned another 1000+ for the additional features I currently support. If we do this together, we can standardize a virtual terminal test suite.

If you look at VtNetCore, you can see how to start a legitimate test suite (in this case using xUnit) for compliance. I highly recommend also looking at libvterm (https://github.com/neovim/libvterm); they have done an excellent job of setting an example for all of us. But they have some very serious limitations that can't be solved by writing such a simple query language. We should use something more robust.

If Microsoft is going to make a real attempt at a real terminal emulator and they're going to claim compliance, you absolutely must provide a better means for providing tests. At this time, the test suite (see OutputEngineTest) is just too complex and it's not possible for people like me, or like Thomas Dickey or anyone else to provide a meaningful bug report.

claunia added the Issue-Feature, Area-VT, Product-Terminal labels 2026-01-30 22:55:01 +00:00

@darrenstarr commented on GitHub (Jul 3, 2019):

In my terminal emulator VtNetCore (https://github.com/darrenstarr/VtNetCore), I have implemented A LOT more of the DEC VT and XTerm features into my parser.
More importantly, I've used Microsoft's ClearScript project (https://github.com/microsoft/ClearScript) to embed the V8 JavaScript engine into a test suite and started implementing tests in JavaScript as an example of how it can be done. See VtNetCore.JavaScript.Tests (https://github.com/darrenstarr/VtNetCore/tree/3209e412d1fd9e036a1f78dca66abd9f2ff83575/VtNetCore.JavaScript.Tests), which took very little time to implement and can serve as a foundation for moving forward.


@zadjii-msft commented on GitHub (Jul 3, 2019):

  • The parser fails in many circumstances

Wait which circumstances? Do you have some specific strings we can't parse? I'm pretty sure the state machine that handles our VT parsing is actually quite good. I'd love to be wrong here though. This is the poster that's been hanging in our offices for the last 4 years as reference:

vt500_parser.png

I don't believe we handle sos/pm/apc or any dcs states, but we haven't really needed to yet, and adding support wouldn't be hard. The other one that might be missing is some OSC strings are not parsed the best, but that's more of a case-by-case basis. Since the strings in question aren't really seen in the wild, we haven't needed to implement them.

  • The terminal escape sequence interpreter is minimal.

The Windows Terminal's (https://github.com/microsoft/terminal/blob/master/src/cascadia/TerminalCore/TerminalDispatch.hpp) is quite minimal, yes. This is because we fortunately only need to talk to conpty with the terminal currently, and conpty only emits those sequences. Fortunately, conpty is a much more capable terminal emulator. This is the header (https://github.com/microsoft/terminal/blob/master/src/terminal/adapter/adaptDispatch.hpp) for the adapter that conhost (conpty) uses. Conhost is actually a pretty competent terminal emulator - it's good enough to handle most things that you'll see for applications respecting xterm-256color. Case in point - since WSL came out, conhost has been acting as the terminal for all Linux apps running on Windows, and although the first previews were pretty rough, we've gotten dramatically better.

Does that mean we're a complete terminal emulator? Absolutely not. But we have to triage which features to add and which to leave on the backlog, and while there are a lot of terminal features that are out there, we've decided to stick with implementing the ones that are actually used.

Some day in the near future, we hope to expand the Windows Terminal's adapter as well to be just as complete as conhost's, so the terminal could talk directly to things like ssh or wsl without conhost interpreting and translating them for us, but there are a lot of other things higher on the priority list currently.

  • Our unit tests are bad

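I'm not gonna say they're not bad. They're good enough. The OutputEngineTest certainly isn't a good test, it's really only testing if the parser can, in fact, parse things. The more interesting tests are largely scattered about ScreenBufferTests.cpp (https://github.com/microsoft/terminal/blob/master/src/host/ut_host/ScreenBufferTests.cpp). Look at this great test of VT stuff (https://github.com/microsoft/terminal/blob/master/src/host/ut_host/ScreenBufferTests.cpp#L978-L1107).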

Is it the most elegant? Absolutely not. But again, there's only so much time in the day, and unfortunately setting up a better test framework hasn't found its way into any of my days :/ I'd be happy to review a PR that added a cleaner test framework however :P


@darrenstarr commented on GitHub (Jul 3, 2019):

Let me first say that the reason I filed the report is that I love what you all have done with this terminal. I have decided that I want to contribute tests to accelerate development. I will continue to maintain mine as an academic project with a focus on accuracy. You’ve done a far better job on color support than I have and I hope to use your code as a reference to getting my color matching right. I love it.

Let me also point out that I’ve probably used incompatible terminology when writing the original post. What you call parser, I called reader, and what I call parser processes the tokens from the reader. So I want to be clear that my point was more to make the case for a more versatile test framework.

I see from the graphic you shared that a problem I encountered with the terminal may be related to the “anywhere” entry point into “ESC”. I am on a phone and can’t verify right now. But the problems I’ve seen have been from malformed sequences that don’t appear to trigger the escape if your state machine is already parsing another sequence. I wanted to write a test for this, but couldn’t find the right place to do it.
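To make the “anywhere” rule concrete, here is a small TypeScript sketch of the behaviour in question: an ESC arriving in the middle of a CSI sequence abandons that sequence and starts a new one, and CAN or SUB abort back to ground. The `parseCsi` function and `CsiEvent` type are hypothetical names for this example only, and the state machine is deliberately reduced to three states. It is not the Windows Terminal parser, and the full VT500 diagram has more states (intermediates, ignore, DCS, OSC and so on) than are shown here.

```typescript
type State = "ground" | "escape" | "csi";

interface CsiEvent {
  params: string; // raw parameter bytes, e.g. "1;2"
  final: string; // final byte, e.g. "J"
}

// Returns the CSI sequences that would be dispatched for the given input.
function parseCsi(input: string): CsiEvent[] {
  const events: CsiEvent[] = [];
  let state: State = "ground";
  let params = "";

  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    const code = input.charCodeAt(i);

    // "Anywhere" transitions are checked before the per-state rules.
    if (code === 0x1b) {
      // ESC: drop whatever was being collected and start a new escape.
      state = "escape";
      params = "";
      continue;
    }
    if (code === 0x18 || code === 0x1a) {
      // CAN / SUB: abort the current sequence and return to ground.
      state = "ground";
      params = "";
      continue;
    }

    switch (state) {
      case "escape":
        // Only CSI is modelled; every other escape falls back to ground.
        state = ch === "[" ? "csi" : "ground";
        break;
      case "csi":
        if (code >= 0x30 && code <= 0x3f) {
          params += ch; // parameter bytes: digits, ';' and private markers
        } else if (code >= 0x40 && code <= 0x7e) {
          events.push({ params, final: ch }); // final byte dispatches
          state = "ground";
        } else {
          state = "ground"; // simplification: intermediates and C0 are not modelled
        }
        break;
      case "ground":
        break; // printable text is ignored in this sketch
    }
  }
  return events;
}
```

With this model, `parseCsi("\x1b[1;\x1b[2J")` dispatches a single `J` with parameters `"2"`; the interrupted `\x1b[1;` is discarded. A test of this kind is small enough to attach to a bug report.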

I believe I also saw circumstances where you didn’t handle bad parameter counts. I am writing a series of tests now to verify those. I am under the impression that this could be why you have so many problems with vttest, which abuses that heavily.
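As an example of what a parameter-count test might pin down, the TypeScript sketch below handles CUP (CSI row ; column H) with too few, empty, zero or too many parameters. The helper names `splitParams` and `cursorPosition` are hypothetical, and the policy shown is an assumption for illustration: empty or zero values fall back to the default of 1, extra parameters are ignored, and at most 16 parameters are kept. Whether a given terminal follows this policy is exactly what such tests would establish.

```typescript
// Split a raw CSI parameter string into numbers.
// Empty or non-numeric fields become 0, meaning "use the default".
function splitParams(raw: string, maxParams = 16): number[] {
  if (raw === "") return [];
  return raw
    .split(";")
    .slice(0, maxParams) // assumed cap; parameters past it are dropped
    .map((field) => (/^\d+$/.test(field) ? Number(field) : 0));
}

// CUP: missing or zero values default to 1; anything after the second is ignored.
function cursorPosition(raw: string): { row: number; column: number } {
  const p = splitParams(raw);
  return { row: p[0] || 1, column: p[1] || 1 };
}

// Inputs a tolerant implementation should survive, with the position assumed here.
const cupCases: Array<[string, number, number]> = [
  ["", 1, 1], // no parameters at all
  ["3", 3, 1], // column omitted
  [";5", 1, 5], // row left empty
  ["0;0", 1, 1], // zero means default
  ["2;4;9;9", 2, 4], // extra parameters ignored
];
```

Each entry in `cupCases` is the parameter string followed by the expected row and column, so a loop over the table is all a test runner needs.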

The screen tests are great. I hope you don’t mind if I borrow a few for my tests as well. I believe I can post some pull requests for some of the more severe failures in command processing because of that. And it’s not that you’re violating any specifications. It’s that a lot of software depends on terminals supporting non-standard CSI handling.

I see the state of your adapter. You are well on the way to something good. It is however (like mine was early on) missing a lot of features. Some are necessary, some less so. I am trying to make a list of “minimum features to implement” with documentation of “minimum supported functionality of those features” before you can say “we do xterm”. If I can manage that (a lot of work), I will attempt to create the appropriate unit tests for your code.

Anyway... please keep up the great work and if you have time to help me get the terminal running in an app which can run a suite of unit tests via JavaScript, I’d love to work on it with you.

Reference: starred/terminal#2436