Get rid of buffering in the partials handling of u8u16 and u16u8 #14869

Open
opened 2026-01-31 04:21:46 +00:00 by claunia · 0 comments
Owner

Originally created by @german-one on GitHub (Aug 13, 2021).

Description of the new feature/enhancement

The state classes used for the partials handling contain a buffer that provides a preprocessed string containing only complete code points for the conversion. This means that every incoming string is copied to the buffer first, and the buffer exists for the lifetime of the state instance. @lhecker already implemented a check that shrinks the buffer if it grows too much. However, a few days later I had a facepalm moment: buffering the whole string is not necessary at all. We just need to complete the partial code point in the cache and convert it separately.
Currently: the whole string is copied into a persistent buffer every time the function is called.
Proposed: no buffering, but possibly calling MB2WC or WC2MB twice (only if partials have been cached).

Proposed technical implementation details (optional)

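Something like this for u8u16 (u16u8 would be quite similar, using an array of 2 wchar_t):
[Image: nobuffer] https://user-images.githubusercontent.com/46659645/129393515-5d75dfd8-9cf2-4c55-b106-8ab812833a19.png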
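
Since the proposed code is only available as a screenshot, here is a rough, portable sketch of the idea as described above. It is not the code from the image. All names (`u8state_sketch`, `convert_complete`, `seq_len`, `trailing_partial`) are hypothetical, `convert_complete` is a hand-rolled stand-in for the `MultiByteToWideChar` call, and the input is assumed to be well-formed UTF-8. The only persistent storage is a 4-byte cache; the bulk of each chunk is converted directly from the caller's memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Length of a UTF-8 sequence, derived from its lead byte (input assumed well-formed).
inline std::size_t seq_len(unsigned char b)
{
    return b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
}

// Hypothetical stand-in for MB2WC: appends the UTF-16 form of complete UTF-8 sequences.
inline void convert_complete(std::string_view in, std::u16string& out)
{
    for (std::size_t i = 0; i < in.size();)
    {
        const auto b = static_cast<unsigned char>(in[i]);
        const std::size_t len = seq_len(b);
        if (i + len > in.size()) break; // guard only; callers pass complete sequences
        char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        if (cp >= 0x10000)
        {
            cp -= 0x10000; // encode as a surrogate pair
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
        else
        {
            out += static_cast<char16_t>(cp);
        }
        i += len;
    }
}

// Number of bytes at the end of `in` that form an incomplete sequence (0 to 3).
inline std::size_t trailing_partial(std::string_view in)
{
    const std::size_t limit = in.size() < 4 ? in.size() : 4;
    for (std::size_t back = 1; back <= limit; ++back)
    {
        const auto b = static_cast<unsigned char>(in[in.size() - back]);
        if ((b & 0xC0) == 0x80) continue; // continuation byte, keep looking for the lead byte
        return seq_len(b) > back ? back : 0;
    }
    return 0;
}

class u8state_sketch
{
    std::array<char, 4> cache_{}; // holds at most one partial code point
    std::size_t cached_ = 0;

public:
    std::u16string convert(std::string_view in)
    {
        std::u16string out;
        if (cached_ != 0)
        {
            // Complete the cached partial with bytes from the front of the new chunk.
            const std::size_t need = seq_len(static_cast<unsigned char>(cache_[0]));
            while (cached_ < need && !in.empty())
            {
                cache_[cached_++] = in.front();
                in.remove_prefix(1);
            }
            if (cached_ < need) return out; // chunk too short, still incomplete
            convert_complete(std::string_view(cache_.data(), cached_), out); // extra call
            cached_ = 0;
        }
        // Convert the rest in place (no copy) and cache a new trailing partial, if any.
        const std::size_t tail = trailing_partial(in);
        assert(tail < cache_.size());
        convert_complete(in.substr(0, in.size() - tail), out);
        for (std::size_t i = 0; i < tail; ++i) cache_[i] = in[in.size() - tail + i];
        cached_ = tail;
        return out;
    }
};
```

The second conversion call only happens when a partial was cached by the previous call, which matches the trade-off described above: one extra conversion of a single code point instead of copying every incoming string.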

claunia added the Area-Output, Resolution-Fix-Committed, Issue-Task, Product-Terminal labels 2026-01-31 04:21:47 +00:00
Reference: starred/terminal#14869