Feature request - Unicode UTF8 translation table #151

Open
opened 2026-01-29 15:13:39 +00:00 by claunia · 0 comments

Originally created by @bobbimanners on GitHub (Dec 18, 2024).

I am enjoying using WRP to enable web browsing on my vintage machines. One feature I would like to see is some sort of table to allow translation of commonly encountered Unicode UTF-8 byte sequences to ASCII equivalents.

For English-language web readers, I note that a lot of newspapers and web pages use Unicode versions of dashes (em-dash, etc.), quotation marks, apostrophes, etc. In my own software I have previously implemented a filter to convert a few of these common cases into ASCII equivalents.
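As a rough illustration of the kind of filter described above, here is a minimal Go sketch built on the standard library's `strings.NewReplacer`. The name `asciiPunct` and the choice of characters are assumptions made for this example; this is not existing WRP code, and the table is nowhere near exhaustive.

```go
package main

import (
	"fmt"
	"strings"
)

// punctReplacer maps a few common Unicode punctuation marks to ASCII.
// The selection is illustrative only; a real table would be longer.
var punctReplacer = strings.NewReplacer(
	"\u2018", "'", // left single quotation mark
	"\u2019", "'", // right single quotation mark / apostrophe
	"\u201C", `"`, // left double quotation mark
	"\u201D", `"`, // right double quotation mark
	"\u2013", "-", // en dash
	"\u2014", "--", // em dash
	"\u2026", "...", // horizontal ellipsis
	"\u00A0", " ", // no-break space
)

// asciiPunct is a hypothetical helper: it rewrites the punctuation listed
// above and leaves every other character untouched.
func asciiPunct(s string) string {
	return punctReplacer.Replace(s)
}

func main() {
	in := "\u201CIt\u2019s here\u201D \u2014 she said\u2026"
	fmt.Println(asciiPunct(in)) // prints: "It's here" -- she said...
}
```

A replacer like this works on UTF-8 strings directly, so it could in principle be applied to page text before it is sent to the client.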

We'll never get them all (and non-European languages are obviously a hopeless case), but we could clean up English text very easily, and probably make French, Spanish, German, etc. much easier to read by simply omitting 'accents' / diacriticals. (Or, to use German as an example, o-umlaut -> "oe".)
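A sketch of the diacritical idea, again in Go and again hypothetical (the names `foldTable` and `foldDiacritics` are made up for this example, and the table covers only a handful of letters). It assumes the input uses precomposed characters; anything not in the table passes through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// foldTable maps accented letters to ASCII stand-ins. German umlauts use
// the conventional two-letter forms; other accents are simply dropped.
var foldTable = map[rune]string{
	'\u00e4': "ae", // a-umlaut
	'\u00f6': "oe", // o-umlaut
	'\u00fc': "ue", // u-umlaut
	'\u00c4': "Ae", // A-umlaut
	'\u00d6': "Oe", // O-umlaut
	'\u00dc': "Ue", // U-umlaut
	'\u00df': "ss", // sharp s
	'\u00e9': "e",  // e-acute
	'\u00e8': "e",  // e-grave
	'\u00ea': "e",  // e-circumflex
	'\u00e0': "a",  // a-grave
	'\u00e7': "c",  // c-cedilla
	'\u00f1': "n",  // n-tilde
	'\u00e1': "a",  // a-acute
	'\u00ed': "i",  // i-acute
	'\u00f3': "o",  // o-acute
	'\u00fa': "u",  // u-acute
}

// foldDiacritics replaces every rune found in foldTable and copies all
// other runes through as they are.
func foldDiacritics(s string) string {
	var b strings.Builder
	for _, r := range s {
		if repl, ok := foldTable[r]; ok {
			b.WriteString(repl)
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(foldDiacritics("sch\u00f6n, caf\u00e9, se\u00f1or")) // prints: schoen, cafe, senor
}
```

One trade-off of a single shared table: "oe" is right for German but would be wrong for the same letter in, say, Swedish or Turkish, so a per-language table might be needed if that matters.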

You may well consider this out-of-scope.


Reference: claunia/wrp#151