[PR #4422] Unify UTF-8 handling using til::u8u16 & revise WriteConsoleAImpl #25759

opened 2026-01-31 09:11:37 +00:00 by claunia · 0 comments

Original Pull Request: https://github.com/microsoft/terminal/pull/4422

State: closed
Merged: Yes


Summary of the Pull Request

Replace utf8Parser with til::u8u16 in order to have the same conversion algorithms used in terminal and conhost.
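
A shared converter matters most for chunked input, where a multi-byte UTF-8 sequence can be split across two reads or writes. The sketch below is not the `til::u8u16` implementation. It is a minimal, self-contained illustration of the general technique, assuming a hypothetical `Utf8Chunker` class that holds back an incomplete trailing sequence and prepends it to the next chunk. It uses `char16_t` for portability, and its policy of one U+FFFD per invalid sequence is a simplification chosen for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper for illustration only; NOT the til::u8u16 code.
// Converts UTF-8 chunks to UTF-16 and carries an incomplete trailing
// sequence over to the next Convert() call.
class Utf8Chunker
{
public:
    std::u16string Convert(std::string_view chunk)
    {
        // Prepend whatever was left over from the previous chunk.
        std::string in = _partial;
        in.append(chunk);
        _partial.clear();

        std::u16string out;
        std::size_t i = 0;
        while (i < in.size())
        {
            const auto lead = static_cast<unsigned char>(in[i]);
            const std::size_t len = SequenceLength(lead);
            const std::size_t avail = std::min(len, in.size() - i);

            // Every available byte after the lead must look like 10xxxxxx.
            bool valid = len != 0;
            for (std::size_t k = 1; valid && k < avail; ++k)
            {
                valid = (static_cast<unsigned char>(in[i + k]) & 0xC0) == 0x80;
            }
            if (!valid)
            {
                out.push_back(u'\uFFFD'); // simplified replacement policy
                ++i;
                continue;
            }
            if (avail < len)
            {
                _partial = in.substr(i); // incomplete: keep for the next chunk
                break;
            }

            // Payload bits of the lead byte, then 6 bits per continuation byte.
            char32_t cp = (len == 1) ? lead : (lead & (0xFFu >> (len + 1)));
            for (std::size_t k = 1; k < len; ++k)
            {
                cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3Fu);
            }
            i += len;
            Append(out, cp, len);
        }
        return out;
    }

private:
    // Expected sequence length for a lead byte, or 0 if it cannot start one.
    static std::size_t SequenceLength(unsigned char lead)
    {
        if (lead < 0x80) return 1;
        if (lead >= 0xC2 && lead <= 0xDF) return 2;
        if (lead >= 0xE0 && lead <= 0xEF) return 3;
        if (lead >= 0xF0 && lead <= 0xF4) return 4;
        return 0;
    }

    // Appends cp as UTF-16; overlong forms, surrogates and values above
    // U+10FFFF become U+FFFD.
    static void Append(std::u16string& out, char32_t cp, std::size_t len)
    {
        static constexpr char32_t minimum[] = { 0, 0, 0x80, 0x800, 0x10000 };
        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp < minimum[len] || surrogate || cp > 0x10FFFF)
        {
            out.push_back(u'\uFFFD');
        }
        else if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }

    std::string _partial; // bytes of an unfinished sequence, if any
};
```

With this shape, feeding the two halves of a split character in separate calls yields nothing for the first call and the complete character for the second, which is the behavior both a VT input reader and a `WriteConsoleA`-style writer need.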

References

This PR is a follow-up to #4093.

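PR Checklist

  • [x] Closes #4086, closes #3378
  • [x] CLA signed. If not, go over here (https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
  • [x] Tests added/passed
  • [ ] Requires documentation to be updated
  • [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx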

Detailed Description of the Pull Request / Additional comments

This PR addresses item 2 in this list:

  1. ✉ Implement til::u8u16 and til::u16u8 (done in PR #4093)

  2. ✔ Unify UTF-8 handling using til::u8u16 (this PR)
    2.1. ✔ Update VtInputThread::_HandleRunInput()
    2.2. ✔ Update ApiRoutines::WriteConsoleAImpl()
    2.3. ❌ (optional / ask the core team) Remove Utf8ToWideCharParser from the code base to avoid further use

  3. ❌ Enable BOM discarding (follow-up)
    3.1. ❌ Extend til::u8u16 and til::u16u8 with a third parameter to enable discarding the BOM
    3.2. ❌ Make use of the third parameter to discard the BOM in all current function callers, or (optional / ask the core team) make it the default for til::u8u16 and til::u16u8

  4. ❌ Find UTF-16 to UTF-8 conversions and examine whether they can be unified, too (follow-up)
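
Item 3 is only planned, so this PR does not define the shape of the third parameter. As a rough sketch of the idea, the hypothetical free functions `StripUtf8Bom` and `StripUtf16Bom` below take an assumed `discardBom` flag and drop a leading byte order mark before conversion. None of these names exist in `til`; they are placeholders for illustration.

```cpp
#include <string_view>

// Hypothetical helper: returns `in` without a leading UTF-8 BOM (EF BB BF)
// when discardBom is true; otherwise returns `in` unchanged.
inline std::string_view StripUtf8Bom(std::string_view in, bool discardBom = true) noexcept
{
    constexpr std::string_view bom{ "\xEF\xBB\xBF", 3 };
    if (discardBom && in.substr(0, bom.size()) == bom)
    {
        in.remove_prefix(bom.size());
    }
    return in;
}

// Hypothetical UTF-16 counterpart: the BOM is the single code unit U+FEFF.
inline std::u16string_view StripUtf16Bom(std::u16string_view in, bool discardBom = true) noexcept
{
    if (discardBom && !in.empty() && in.front() == u'\uFEFF')
    {
        in.remove_prefix(1);
    }
    return in;
}
```

A real implementation would also have to handle chunked input: the BOM is only meaningful at the start of a stream, and its three UTF-8 bytes could themselves be split across chunks.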

@miniksa @DHowett-MSFT

  • Please check whether this PR, along with the investigations done in #4093, really closes #3378
  • Please advise whether I should remove Utf8ToWideCharParser now that it is no longer used

Validation Steps Performed

  • Output long UTF-8 files to the console
  • Tested printf as shown in #4086
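
The manual checks are not spelled out in this PR. One way to exercise a similar path, a writer that splits multi-byte sequences across separate writes, is sketched below. `WriteBytewise` is a hypothetical helper, and the sketch assumes that each flushed byte reaches the console host as its own write and that the console output code page is set to UTF-8 (for example with `chcp 65001`).

```cpp
#include <cstddef>
#include <cstdio>
#include <string_view>

// Hypothetical test helper: writes `text` to `stream` one byte at a time and
// flushes after each byte, so multi-byte UTF-8 sequences are split across
// writes. Returns the number of bytes written before any error.
inline std::size_t WriteBytewise(std::FILE* stream, std::string_view text)
{
    std::size_t written = 0;
    for (const char byte : text)
    {
        if (std::fputc(static_cast<unsigned char>(byte), stream) == EOF)
        {
            break;
        }
        std::fflush(stream);
        ++written;
    }
    return written;
}
```

Calling `WriteBytewise(stdout, "\xE2\x82\xAC\n")` sends the three bytes of the euro sign separately. A converter that caches partial sequences should still render a single character.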
claunia added the pull-request label 2026-01-31 09:11:37 +00:00