[PR #30] Fix ansi filename decoded as gibberish in zip file #798

Open
opened 2026-01-29 22:17:37 +00:00 by claunia · 0 comments
Owner

Original Pull Request: https://github.com/adamhathcock/sharpcompress/pull/30

State: closed
Merged: Yes


I noticed that the same issue was reported before: DecodeString (in ZipFileEntry) fails to properly decode file path (https://sharpcompress.codeplex.com/workitem/32)

The bug occurs when a zip entry filename contains non-ASCII characters and the archive was created by an archive manager that does not support Unicode entry filenames.

According to the Wikipedia article Windows code page (http://en.wikipedia.org/wiki/Windows_code_page):

The OEM code pages (original equipment manufacturer) are used by Win32 console applications, and by virtual DOS, and can be considered a holdover from DOS and the original IBM PC architecture.

ANSI code pages (officially called "Windows code pages"[1] after Microsoft accepted the former term being a misnomer[2]) are used for native non-Unicode (say, byte oriented) applications using a graphical user interface on Windows systems.

If, on an English Windows installation, the Language for non-Unicode programs is set to Chinese (Simplified, PRC), the zip entry filename will be encoded with code page 936 (an ANSI code page), but sharpcompress decodes the filename with code page 437 (an OEM code page).
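The mismatch can be reproduced outside sharpcompress. The following is a minimal Python sketch, not code from this pull request; the filename `中文.txt` is a made-up example. It encodes a name with code page 936 and then decodes the same bytes with code page 437, which is the mix-up described above.

```python
# Hypothetical filename, used only for illustration.
name = "中文.txt"

# What a non-Unicode archive manager would write on a system whose
# ANSI code page is 936 (Simplified Chinese).
raw = name.encode("cp936")

# Decoding with the OEM code page 437 never fails, because every byte
# maps to some character, but the result is unrelated to the original.
wrong = raw.decode("cp437")

# Decoding with the code page that was used for encoding recovers the name.
right = raw.decode("cp936")

print(repr(wrong))
print(repr(right))
```

Because code page 437 assigns a character to every byte value, the wrong decode raises no error, so the problem only shows up as garbled names.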

In other words, the code page used to encode the entry filename depends on the ANSI code page of the system that created the archive.
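One way to act on this is to let the caller choose the legacy code page instead of hard-coding 437. The sketch below is an assumption about how such a decoder could look, not the change made in this pull request: `decode_entry_name` and its parameters are hypothetical names. The only fact it takes from the ZIP format is that general-purpose flag bit 11 (0x0800) marks a UTF-8 filename.

```python
# General-purpose bit 11 in a ZIP header: the entry name is UTF-8.
UTF8_FLAG = 0x0800


def decode_entry_name(raw: bytes, flags: int, legacy_codec: str = "cp437") -> str:
    """Decode a zip entry filename (hypothetical helper).

    raw          -- filename bytes as stored in the archive
    flags        -- the entry's general-purpose bit flags
    legacy_codec -- code page to use when the UTF-8 flag is not set
    """
    if flags & UTF8_FLAG:
        return raw.decode("utf-8")
    # No Unicode flag: the bytes are in whatever code page the creating
    # tool used, so the caller has to supply it.
    return raw.decode(legacy_codec, errors="replace")


# Bytes written by a non-Unicode tool on a code page 936 system.
stored = "中文.txt".encode("cp936")

print(decode_entry_name(stored, flags=0, legacy_codec="cp936"))
print(decode_entry_name("中文.txt".encode("utf-8"), flags=UTF8_FLAG))
```

Keeping 437 as the default preserves the old behaviour for archives that really use the OEM code page, while callers who know the archive came from a system with a different ANSI code page can pass it explicitly.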

claunia added the pull-request label 2026-01-29 22:17:37 +00:00