Terminal shows SOH and STX characters in Ruby's pry for some reason #14681

Closed
opened 2026-01-31 04:16:44 +00:00 by claunia · 12 comments
Owner

Originally created by @OmriSama on GitHub (Jul 25, 2021).

Windows Terminal version (or Windows build number)

Terminal 1.8.1521.0, Windows 10.0.19043.0

Other Software

WSL2, running Manjaro Linux: https://github.com/sileshn/ManjaroWSL
Using Zsh and oh-my-zsh (but reproduced with Bash)
Fira Code Retina font (but run into this with other fonts)

ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
Pry version 0.14.1 on Ruby 3.0.2

Steps to reproduce

With everything installed properly (pry in $PATH):

➜  ~  pry
[1] pry(main)> ls Module

Expected Behavior

No special characters or control codes should be shown:

[1] pry(main)> ls Module
Module.methods: constants  nesting  used_modules
Module#methods:
  <              attr_writer              const_set              method_defined?             protected_method_defined?
  <=             autoload                 const_source_location  module_eval                 public_class_method
  <=>            autoload?                constants              module_exec                 public_constant
  ==             class_eval               define_method          name                        public_instance_method
  ===            class_exec               deprecate_constant     prepend                     public_instance_methods
  >              class_variable_defined?  freeze                 pretty_print                public_method_defined?
  >=             class_variable_get       include                pretty_print_cycle          remove_class_variable
  alias_method   class_variable_set       include?               private_class_method        remove_method
  ancestors      class_variables          included_modules       private_constant            singleton_class?
  attr           const_defined?           inspect                private_instance_methods    to_s
  attr_accessor  const_get                instance_method        private_method_defined?     undef_method
  attr_reader    const_missing            instance_methods       protected_instance_methods
[2] pry(main)>

Actual Behavior

For some reason, special characters/control codes are being shown:

image

[1] pry(main)> ls Module
Module.methods: constants  nesting  used_modules
Module#methods:
  <              attr_writer              const_set              method_defined?             protected_method_defined?
  <=             autoload                 const_source_location  module_eval                 public_class_method
  <=>            autoload?                constants              module_exec                 public_constant
  ==             class_eval               define_method          name                        public_instance_method
  ===            class_exec               deprecate_constant     prepend                     public_instance_methods
  >              class_variable_defined?  freeze                 pretty_print                public_method_defined?
  >=             class_variable_get       include                pretty_print_cycle          remove_class_variable
  alias_method   class_variable_set       include?               private_class_method        remove_method
  ancestors      class_variables          included_modules       private_constant            singleton_class?
  attr           const_defined?           inspect                private_instance_methods    to_s
  attr_accessor  const_get                instance_method        private_method_defined?     undef_method
  attr_reader    const_missing            instance_methods       protected_instance_methods
[2] pry(main)>
Originally created by @OmriSama on GitHub (Jul 25, 2021). ### Windows Terminal version (or Windows build number) Terminal 1.8.1521.0, Windows 10.0.19043.0 ### Other Software WSL2, running Manjaro Linux: https://github.com/sileshn/ManjaroWSL Using Zsh and oh-my-zsh (but reproduced with Bash) Fira Code Retina font (but run into this with other fonts) ``` ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux] ``` ``` Pry version 0.14.1 on Ruby 3.0.2 ``` ### Steps to reproduce With everything installed properly (`pry` in `$PATH`): ```sh ➜ ~ pry [1] pry(main)> ls Module ``` ### Expected Behavior No special characters or control codes should be shown: ```ruby [1] pry(main)> ls Module Module.methods: constants nesting used_modules Module#methods: < attr_writer const_set method_defined? protected_method_defined? <= autoload const_source_location module_eval public_class_method <=> autoload? constants module_exec public_constant == class_eval define_method name public_instance_method === class_exec deprecate_constant prepend public_instance_methods > class_variable_defined? freeze pretty_print public_method_defined? >= class_variable_get include pretty_print_cycle remove_class_variable alias_method class_variable_set include? private_class_method remove_method ancestors class_variables included_modules private_constant singleton_class? attr const_defined? inspect private_instance_methods to_s attr_accessor const_get instance_method private_method_defined? undef_method attr_reader const_missing instance_methods protected_instance_methods [2] pry(main)> ``` ### Actual Behavior For some reason, special characters/control codes are being shown: ![image](https://user-images.githubusercontent.com/10534779/126884330-f4cceb7d-39bf-463c-8488-52616d12ea02.png) ```ruby [1] pry(main)> ls Module Module.methods: constants nesting used_modules Module#methods: < attr_writer const_set method_defined? protected_method_defined? <= autoload const_source_location module_eval public_class_method <=> autoload? constants module_exec public_constant == class_eval define_method name public_instance_method === class_exec deprecate_constant prepend public_instance_methods > class_variable_defined? freeze pretty_print public_method_defined? >= class_variable_get include pretty_print_cycle remove_class_variable alias_method class_variable_set include? private_class_method remove_method ancestors class_variables included_modules private_constant singleton_class? attr const_defined? inspect private_instance_methods to_s attr_accessor const_get instance_method private_method_defined? undef_method attr_reader const_missing instance_methods protected_instance_methods [2] pry(main)> ```
Author
Owner

@OmriSama commented on GitHub (Aug 2, 2021):

Another interesting thing, is in cases where the Terminal viewport is too small, the characters that get displayed are different: (^A and ^B in this case)

^A^B^A^BModule.methods^A^B^A^B: constants  nesting  used_modules
^A^B^A^BModule#methods^A^B^A^B:
  <
  <=
  <=>
  ==
  ===
  >
  >=
  alias_method
  ancestors
  attr
  attr_accessor
  attr_reader
  attr_writer
  autoload
  autoload?
  class_eval
  class_exec
  class_variable_defined?
  class_variable_get
  class_variable_set
  class_variables
  const_defined?
  const_get
  const_missing
  const_set
  const_source_location
  constants
@OmriSama commented on GitHub (Aug 2, 2021): Another interesting thing, is in cases where the Terminal viewport is too small, the characters that get displayed are different: (`^A` and `^B` in this case) ```ruby ^A^B^A^BModule.methods^A^B^A^B: constants nesting used_modules ^A^B^A^BModule#methods^A^B^A^B: < <= <=> == === > >= alias_method ancestors attr attr_accessor attr_reader attr_writer autoload autoload? class_eval class_exec class_variable_defined? class_variable_get class_variable_set class_variables const_defined? const_get const_missing const_set const_source_location constants ```
Author
Owner

@zadjii-msft commented on GitHub (Aug 3, 2021):

Does this repro in the vintage console host?

I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them, but maybe they're getting stuck in the conpty buffer then getting round-tripped as U+FFFD. I haven't had enough coffee to remember - Usually @j4james is my expert on these things.

@zadjii-msft commented on GitHub (Aug 3, 2021): Does this repro in the vintage console host? I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them, but maybe they're getting stuck in the conpty buffer then getting round-tripped as U+FFFD. I haven't had enough coffee to remember - Usually @j4james is my expert on these things.
Author
Owner

@j4james commented on GitHub (Aug 3, 2021):

I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them

If we want to match the original DEC terminals, we should technically be eating most of the control characters which don't have defined behaviour, exactly like we do with NUL. That said, legacy DOS/Windows console apps would have expected those controls to be rendered, and we've already had complaints about the way we now drop NUL characters (see #8680).

In a perfect world, I'd eventually like for us to have different compatibility modes, so you could choose between things like strict VT, legacy ANSI.COM, or XTerm compatibility. In the meantime, though, I thought our current behavior was a reasonable compromise. We implement most of the controls that have defined meaning, but output those that aren't typically used in VT apps, so we still have a certain amount of legacy support.

But I don't have a strong opinion on this either way. If we decide we want to be strictly VT-compatible in this regard, I'd be happy to write a PR for it. Just be aware that it might trigger more bug reports from people running legacy apps.

@j4james commented on GitHub (Aug 3, 2021): > I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them If we want to match the original DEC terminals, we should technically be eating most of the control characters which don't have defined behaviour, exactly like we do with `NUL`. That said, legacy DOS/Windows console apps would have expected those controls to be rendered, and we've already had complaints about the way we now drop `NUL` characters (see #8680). In a perfect world, I'd eventually like for us to have different compatibility modes, so you could choose between things like strict VT, legacy ANSI.COM, or XTerm compatibility. In the meantime, though, I thought our current behavior was a reasonable compromise. We implement most of the controls that have defined meaning, but output those that aren't typically used in VT apps, so we still have a certain amount of legacy support. But I don't have a strong opinion on this either way. If we decide we want to be strictly VT-compatible in this regard, I'd be happy to write a PR for it. Just be aware that it might trigger more bug reports from people running legacy apps.
Author
Owner

@OmriSama commented on GitHub (Aug 3, 2021):

Does this repro in the vintage console host?

@zadjii-msft I wish I could tell you, but I can't run WSL in the Legacy Console:

image

Unless you meant, just running the whole thing in cmd.exe. In which case, cmd.exe can't even display some of the extended characters that otherwise work on Terminal 😅

image

@OmriSama commented on GitHub (Aug 3, 2021): > > > Does this repro in the vintage console host? @zadjii-msft I wish I could tell you, but I can't run WSL in the Legacy Console: ![image](https://user-images.githubusercontent.com/10534779/128069745-fcc1f571-b568-4810-8052-580ed25cdb29.png) Unless you meant, just running the whole thing in `cmd.exe`. In which case, `cmd.exe` can't even display some of the extended characters that otherwise work on Terminal 😅 ![image](https://user-images.githubusercontent.com/10534779/128069917-ffc91637-9559-4966-b31c-ca681734b3a1.png)
Author
Owner

@OmriSama commented on GitHub (Aug 3, 2021):

I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them

If we want to match the original DEC terminals, we should technically be eating most of the control characters which don't have defined behaviour, exactly like we do with NUL. That said, legacy DOS/Windows console apps would have expected those controls to be rendered, and we've already had complaints about the way we now drop NUL characters (see #8680).

In a perfect world, I'd eventually like for us to have different compatibility modes, so you could choose between things like strict VT, legacy ANSI.COM, or XTerm compatibility. In the meantime, though, I thought our current behavior was a reasonable compromise. We implement most of the controls that have defined meaning, but output those that aren't typically used in VT apps, so we still have a certain amount of legacy support.

But I don't have a strong opinion on this either way. If we decide we want to be strictly VT-compatible in this regard, I'd be happy to write a PR for it. Just be aware that it might trigger more bug reports from people running legacy apps.

@j4james Though I am not a contributor to Terminal, I think the approach of Compatibility Modes is a sound idea; I believe it's what other Terminal Emulators like iTerm2 do.

@OmriSama commented on GitHub (Aug 3, 2021): > > > > I forget what the actual right behavior for SOH and STX are - I'd assume that we should just be eating those and doing nothing with them > > If we want to match the original DEC terminals, we should technically be eating most of the control characters which don't have defined behaviour, exactly like we do with `NUL`. That said, legacy DOS/Windows console apps would have expected those controls to be rendered, and we've already had complaints about the way we now drop `NUL` characters (see #8680). > > In a perfect world, I'd eventually like for us to have different compatibility modes, so you could choose between things like strict VT, legacy ANSI.COM, or XTerm compatibility. In the meantime, though, I thought our current behavior was a reasonable compromise. We implement most of the controls that have defined meaning, but output those that aren't typically used in VT apps, so we still have a certain amount of legacy support. > > But I don't have a strong opinion on this either way. If we decide we want to be strictly VT-compatible in this regard, I'd be happy to write a PR for it. Just be aware that it might trigger more bug reports from people running legacy apps. @j4james Though I am not a contributor to Terminal, I think the approach of Compatibility Modes is a sound idea; I believe it's what other Terminal Emulators like iTerm2 do.
Author
Owner

@DHowett commented on GitHub (Aug 3, 2021):

We will probably want to ignore more and more control characters that we can't do anything meaningful with when VT mode is enabled. Apps usually¹ opt in to this, so they can be coerced into behaving.

97be53b426 introduced \001 and \002 for... some reason? To "escape" the control sequences.

It looks like they wrap color escapes in \x1 and \x2 so that they can later remove them if the user doesn't want color. That is not the best design, as it rather requires them to create a bunch of information that they then have to destroy!

¹ Of course, Terminal enables it by default. We likely do need some measure of "compatibility modes" like @j4james suggests.

@DHowett commented on GitHub (Aug 3, 2021): We will probably want to ignore more and more control characters that we can't do anything meaningful with _when VT mode is enabled_. Apps usually¹ opt in to this, so they can be coerced into behaving. https://github.com/pry/pry/commit/97be53b426f7f0a762130b54623faa6946f2c09a introduced `\001` and `\002` for... some reason? To "escape" the control sequences. It looks like they wrap color escapes in `\x1` and `\x2` so that they can later *remove* them if the user doesn't want color. That is not the best design, as it rather requires them to create a bunch of information that they then have to destroy! ¹ Of course, Terminal enables it by default. We likely do need some measure of "compatibility modes" like @j4james suggests.
Author
Owner

@zadjii-msft commented on GitHub (Aug 3, 2021):

the upstream bug is https://github.com/pry/pry/issues/2185, though, we'll likely want to address this on our own.

¹ Of course, Terminal enables it by default. We likely do need some measure of "compatibility modes" like j4james suggests.

hot warm take: compatibility mode is conhost. If your app truly depends on being able to output SOH and STX into the output buffer, then use conhost, where VT won't be turned on by default. Otherwise, apps in VT mode2 should do what VT need to do.

2: any vt-like mode. I don't think pure VT vs xterm vs whatever mode is going to affect this one in the future.

@zadjii-msft commented on GitHub (Aug 3, 2021): the upstream bug is https://github.com/pry/pry/issues/2185, though, we'll likely want to address this on our own. > ¹ Of course, Terminal enables it by default. We likely do need some measure of "compatibility modes" like j4james suggests. ~hot~ warm take: compatibility mode is conhost. If your app truly depends on being able to output SOH and STX into the output buffer, then use conhost, where VT won't be turned on by default. Otherwise, apps in VT mode<sup>2</sup> should do what VT need to do. <sup>2</sup>: any vt-like mode. I don't think pure VT vs xterm vs _whatever_ mode is going to affect this one in the future.
Author
Owner

@j4james commented on GitHub (Aug 3, 2021):

I don't think pure VT vs xterm vs whatever mode is going to affect this one in the future.

Well the mode I had in mind that would be affected by this is what is usually referred to as ANSIBBS, which is based on the old DOS ANSI.SYS driver with a few extensions. It's what you'd need if you want to access a BBS (they still exist, just via telnet).

You can get by with a pure VT emulator, but a lot of stuff won't look right. And a key requirement is using codepage 437, and having access to the C0 range as printable characters (for things like the card suites). Obviously not going to happen anytime soon, but I'm hoping it might be feasible one day.

@j4james commented on GitHub (Aug 3, 2021): > I don't think pure VT vs xterm vs _whatever_ mode is going to affect this one in the future. Well the mode I had in mind that would be affected by this is what is usually referred to as _ANSIBBS_, which is based on the old DOS ANSI.SYS driver with a few extensions. It's what you'd need if you want to access a BBS (they still exist, just via telnet). You can get by with a pure VT emulator, but a lot of stuff won't look right. And a key requirement is using codepage 437, and having access to the C0 range as printable characters (for things like the card suites). Obviously not going to happen anytime soon, but I'm hoping it might be feasible one day.
Author
Owner

@OmriSama commented on GitHub (Aug 3, 2021):

@DHowett Just gotta say, I really appreciate the collab with the Pry guys here! Thank y'all for helping uncover that this was also an issue in Pry...

@OmriSama commented on GitHub (Aug 3, 2021): @DHowett Just gotta say, I really appreciate the collab with the Pry guys [here](https://github.com/pry/pry/pull/2209)! Thank y'all for helping uncover that this was also an issue in Pry...
Author
Owner

@D3vil0p3r commented on GitHub (Mar 30, 2023):

When the fix for this issue will be applied?

@D3vil0p3r commented on GitHub (Mar 30, 2023): When the fix for this issue will be applied?
Author
Owner

@zadjii-msft commented on GitHub (Mar 30, 2023):

Whenever someone writes the code 😜 If you'd like to contribute the fix yourself, I can try and help walk you through where I'd start. I'm pretty sure we should be able to just eat them in AdaptDispatch (somewhere)

@zadjii-msft commented on GitHub (Mar 30, 2023): Whenever someone writes the code 😜 If you'd like to contribute the fix yourself, I can try and help walk you through where I'd start. I'm pretty sure we should be able to just eat them in AdaptDispatch (somewhere)
Author
Owner

@j4james commented on GitHub (Mar 30, 2023):

FYI, I've got a PR for this, which I'll hopefully submit later tonight.

@j4james commented on GitHub (Mar 30, 2023): FYI, I've got a PR for this, which I'll hopefully submit later tonight.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#14681