Warn if settings.json is not UTF-8 (at least in key binding) #14111

Open
opened 2026-01-31 04:01:14 +00:00 by claunia · 2 comments
Owner

Originally created by @KalleOlaviNiemitalo on GitHub (Jun 4, 2021).

Description of the new feature/enhancement

If settings.json is encoded in e.g. Windows-1252 rather than UTF-8, then Terminal should warn about that, rather than silently substitute U+FFFD replacement characters in strings. This would help users who edit settings.json with Visual Studio, which saves in the ANSI code page by default if the file does not appear to be UTF-8 already. The silent substitution is especially hard to diagnose if the only non-ASCII character is in a key binding that then just does not work.

Proposed technical implementation details (optional)

Apply settings.json as much as possible, but then pop up a message saying that the encoding is wrong.

AFAICT, jsoncpp gives the raw bytes of JSON string values to Terminal if there is are no backslash escapes. These bytes are UTF-8 if the file was correctly encoded. In Terminal, ConversionTrait<KeyChord>::FromJson calls til::u8u16, which calls MultiByteToWideChar without the MB_ERR_INVALID_CHARS flag. There are more til::u8u16 calls in other ConversionTrait specializations.

https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/cascadia/TerminalSettingsModel/KeyChordSerialization.cpp#L332
https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/cascadia/TerminalSettingsModel/JsonUtils.h#L161
https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/inc/til/u8u16convert.h#L282

Originally created by @KalleOlaviNiemitalo on GitHub (Jun 4, 2021). # Description of the new feature/enhancement If settings.json is encoded in e.g. Windows-1252 rather than UTF-8, then Terminal should warn about that, rather than silently substitute U+FFFD replacement characters in strings. This would help users who edit settings.json with Visual Studio, which saves in the ANSI code page by default if the file does not appear to be UTF-8 already. The silent substitution is especially hard to diagnose if the only non-ASCII character is in a key binding that then just does not work. # Proposed technical implementation details (optional) Apply settings.json as much as possible, but then pop up a message saying that the encoding is wrong. AFAICT, jsoncpp gives the raw bytes of JSON string values to Terminal if there is are no backslash escapes. These bytes are UTF-8 if the file was correctly encoded. In Terminal, ConversionTrait\<KeyChord>::FromJson calls til::u8u16, which calls MultiByteToWideChar without the MB_ERR_INVALID_CHARS flag. There are more til::u8u16 calls in other ConversionTrait specializations. <https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/cascadia/TerminalSettingsModel/KeyChordSerialization.cpp#L332> <https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/cascadia/TerminalSettingsModel/JsonUtils.h#L161> <https://github.com/microsoft/terminal/blob/2531454bcd46eadc14273ca85acc76231b8810cb/src/inc/til/u8u16convert.h#L282>
claunia added the Issue-FeatureHelp WantedArea-SettingsProduct-Terminal labels 2026-01-31 04:01:14 +00:00
Author
Owner

@KalleOlaviNiemitalo commented on GitHub (Jun 4, 2021):

It seems it would be much easier to throw an exception right away than save the error and keep processing the rest of the JSON tree. Wouldn't be much worse for the user, either.

@KalleOlaviNiemitalo commented on GitHub (Jun 4, 2021): It seems it would be much easier to throw an exception right away than save the error and keep processing the rest of the JSON tree. Wouldn't be much worse for the user, either.
Author
Owner

@orcmid commented on GitHub (Jun 7, 2021):

It seems it would be much easier to throw an exception right away than save the error and keep processing the rest of the JSON tree. Wouldn't be much worse for the user, either.

Yes, once it is not known exactly what the damage involves, stopping is better than continuing. A compromise (not unlike preprocessor/VC fails) is to count and quit when it is clear there is no simple continuation. But absent that, stopping and identifying the point of difficulty is appropriate, just like hitting an XML cliff. The error message should not be misleading - i.e., presume too much about the root cause.

@orcmid commented on GitHub (Jun 7, 2021): > It seems it would be much easier to throw an exception right away than save the error and keep processing the rest of the JSON tree. Wouldn't be much worse for the user, either. Yes, once it is not known exactly what the damage involves, stopping is better than continuing. A compromise (not unlike preprocessor/VC fails) is to count and quit when it is clear there is no simple continuation. But absent that, stopping and identifying the point of difficulty is appropriate, just like hitting an XML cliff. The error message should not be misleading - i.e., presume too much about the root cause.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/terminal#14111