[PR #4093] [MERGED] Implement til::u8u16 and til::u16u8 conversion functions #25631

opened 2026-01-31 09:10:44 +00:00 by claunia · 0 comments

📋 Pull Request Information

Original PR: https://github.com/microsoft/terminal/pull/4093
Author: @german-one
Created: 12/31/2019
Status: Merged
Merged: 1/30/2020
Merged by: @DHowett-MSFT

Base: master ← Head: master


📝 Commits (10+)

  • 99f1bbe Implement UTF-8 <--> UTF-16 conversion in user mode.
  • 52cdbf4 formatting issues
  • b855a8f still formatting issues
  • c5e8f9e add Reset methods
  • 7789fb1 extend unit tests
  • 3a6a757 PR feedback for _OutputThread
  • 6671dcc PR feedback for Utf8Utf16Convert
  • b9e2e65 add U8U16Test to src\tools\
  • dbfcd83 remove shrink_to_fit()
  • 4f63dae fix quirks in test loops

📊 Changes

24 files changed (+2245 additions, -312 deletions)

📝 OpenConsole.sln (+23 -0)
📝 src/cascadia/TerminalConnection/ConptyConnection.cpp (+25 -13)
📝 src/cascadia/TerminalConnection/ConptyConnection.h (+4 -0)
📝 src/inc/til.h (+1 -0)
➕ src/inc/til/u8u16convert.h (+458 -0)
📝 src/til/ut_til/til.unit.tests.vcxproj (+1 -0)
➕ src/til/ut_til/u8u16convertTests.cpp (+143 -0)
➕ src/tools/U8U16Test/PropertySheet.props (+16 -0)
➕ src/tools/U8U16Test/U8U16Test.cpp (+780 -0)
📝 src/tools/U8U16Test/U8U16Test.hpp (+44 -27)
➕ src/tools/U8U16Test/U8U16Test.vcxproj (+129 -0)
➕ src/tools/U8U16Test/U8U16Test.vcxproj.filters (+37 -0)
➕ src/tools/U8U16Test/_test.cmd (+6 -0)
➕ src/tools/U8U16Test/en.txt (+2 -0)
➕ src/tools/U8U16Test/fr.txt (+2 -0)
➕ src/tools/U8U16Test/main.cpp (+558 -0)
➕ src/tools/U8U16Test/packages.config (+4 -0)
➕ src/tools/U8U16Test/ru.txt (+2 -0)
➕ src/tools/U8U16Test/zh.txt (+2 -0)
➖ src/types/UTF8OutPipeReader.cpp (+0 -100)

...and 4 more files

📄 Description

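Summary of the Pull Request

  • Remove UTF8OutPipeReader and move the pipe reading back to ConptyConnection::_OutputThread().
  • Implement UTF-8 <--> UTF-16 conversion ~~in user mode. Enable to toggle between ignoring invalid UTF-8 and replacing it with U+FFFD.~~ See #3378
  • Implement reusable partials handling.

PR Checklist

  • [x] Closes #4092
  • [x] CLA signed.
  • [x] Tests added/passed
  • [ ] Requires documentation to be updated
  • [ ] I've discussed this with core contributors already.

Detailed Description of the Pull Request / Additional comments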

On my list:

  1. ✔ Implement UTF-8 <--> UTF-16 conversion ~~in user mode~~ (this PR)
    1.1. ✔ ~~Transpose my U8ToU16() and U16ToU8() C --> C++~~ (obsolete)
    1.2. ✔ Implement functors for partials handling
    1.3. ✔ Implement functors to do both the partials handling and the conversion task at once
    1.4. ✔ Supersede UTF8OutPipeReader and remove it from the code base to avoid further use
  2. ❌ Unify UTF-8 handling (supersede Utf8ToWideCharParser)
    2.1. ❌ Update VtInputThread::_HandleRunInput()
    2.2. ❌ Update ApiRoutines::WriteConsoleAImpl()
    2.3. ❌ (optional / ask the core team) Remove Utf8ToWideCharParser from the code base to avoid further use
  3. ❌ Enable BOM discarding
    3.1. ❌ Implement an enum class containing flags for U8ToU16() and U16ToU8() to enable discarding the BOM, invalid sequences, or both
    3.2. ❌ Replace the 3rd parameter of U8ToU16(), U16ToU8(), and related functors with the enum and update the function code accordingly
    3.3. ❌ Make use of the 3rd parameter to discard the BOM in all current functor callers, or (optional / ask the core team) make it the default for U8ToU16() and U16ToU8()
  4. ❌ Find UTF-16 to UTF-8 conversions and examine if they can be unified, too
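
For readers unfamiliar with the term, "partials handling" in items 1.2 and 1.3 presumably refers to UTF-8 sequences that get cut in half when text arrives in chunks, for example from a pipe read. The following is only a minimal sketch of that idea, not the code in src/inc/til/u8u16convert.h: the class name Utf8PartialsCache and its Feed() method are hypothetical, and the sketch only holds back an incomplete trailing sequence without validating or converting anything.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (NOT the til API): holds back an incomplete UTF-8
// sequence found at the end of a chunk and prepends it to the next chunk.
class Utf8PartialsCache
{
public:
    // Returns (cached partial + chunk) minus any incomplete trailing
    // multi-byte sequence, which is cached for the next call.
    std::string Feed(std::string_view chunk)
    {
        std::string data = std::move(_partial);
        _partial.clear(); // moved-from state is unspecified, so reset it
        data.append(chunk);

        const std::size_t cut = CompleteLength(data);
        _partial.assign(data, cut, std::string::npos);
        data.resize(cut);
        return data;
    }

private:
    // Length of the prefix of s that does not end inside a multi-byte sequence.
    static std::size_t CompleteLength(std::string_view s)
    {
        // Step back over at most three continuation bytes (10xxxxxx).
        std::size_t i = s.size();
        std::size_t back = 0;
        while (i > 0 && back < 3 && (static_cast<unsigned char>(s[i - 1]) & 0xC0) == 0x80)
        {
            --i;
            ++back;
        }
        if (i == 0)
        {
            return s.size(); // empty or only stray continuation bytes: pass through
        }

        const unsigned char lead = static_cast<unsigned char>(s[i - 1]);
        std::size_t need = 0;
        if ((lead & 0xE0) == 0xC0)
            need = 2; // 110xxxxx starts a 2-byte sequence
        else if ((lead & 0xF0) == 0xE0)
            need = 3; // 1110xxxx starts a 3-byte sequence
        else if ((lead & 0xF8) == 0xF0)
            need = 4; // 11110xxx starts a 4-byte sequence
        else
            return s.size(); // ASCII or invalid byte: nothing to hold back

        const std::size_t have = back + 1;
        return have < need ? i - 1 : s.size();
    }

    std::string _partial;
};
```

With this sketch, feeding "a" plus the first two bytes of "€" returns just "a"; the next call, which supplies the third byte, returns the complete character. Invalid input is deliberately passed through untouched so that the conversion step can decide whether to drop it or replace it with U+FFFD.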

Validation Steps Performed

  • Unit tests implemented.
  • Loads of manual tests to evaluate the behavior of the implemented code.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-31 09:10:44 +00:00
Reference: starred/terminal#25631