mirror of
https://github.com/adamhathcock/sharpcompress.git
synced 2026-02-03 21:23:38 +00:00
Tar parsing is too benevolent #459
Originally created by @IS4Code on GitHub (May 6, 2021).
I would like to use this library to detect and open common archive formats, but Tar is recognized far too often, and invalid, broken objects are produced from files in completely different formats.
First, the checksum in the header is not read and verified at all. In my opinion, doing this would remove a lot of the false positives, but it seems there are implementations that store the checksum there in different formats. Perhaps it could be controlled by a specific option, but I feel it should be enabled by default.
Next, the Magic property in the header is not exposed in any way. Modern tar files should have ustar there, so perhaps another option could be to reject old files that do not have this signature.
@adamhathcock commented on GitHub (Jun 4, 2021):
Tar parsing isn't too great. I never liked my implementation which was based on something really old. I need to adapt another more complete implementation from another library.
Suggestions and PRs welcome.
@IS4Code commented on GitHub (Jun 4, 2021):
@adamhathcock I tried to come up with a better check of the header based on the checksum:
I tried this on a lot of files and so far found none that would be mistaken for Tar. However, I haven't checked whether all valid Tar files are correctly accepted. I still tried to account for some different implementations: the checksum could have either spaces or zeros as leading digits, the termination could differ, and both the signed and unsigned checksums are computed. Plus, every 64 bytes, it checks whether there are even enough remaining bytes to reach the checksum.
It's focused more on preventing false positives, so if something like this is included (and you certainly have my permission), I'd prefer to have this toggleable in case of some weird Tar versions.
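For readers following along, here is a minimal sketch of the kind of check described above. It is written in Python purely for illustration and is not the code from this comment or anything in SharpCompress. The names `parse_octal_field`, `looks_like_tar_header`, and the `require_ustar` flag are hypothetical. The field offsets (checksum at byte 148, 8 bytes long; magic at byte 257) are those of the ustar header layout. The sketch accepts space or zero padding, NUL or space termination, and either the unsigned or the signed byte sum. It does not reproduce the incremental "every 64 bytes" early exit.

```python
import tarfile

BLOCK_SIZE = 512      # a tar header occupies one 512-byte block
CHKSUM_OFFSET = 148   # position of the checksum field in the header
CHKSUM_LENGTH = 8
MAGIC_OFFSET = 257    # position of the "ustar" magic in the header


def parse_octal_field(field):
    """Parse an octal field padded or terminated by spaces and NULs.

    Returns None if anything other than octal digits remains.
    """
    text = field.strip(b" \x00")
    if not text or any(c not in b"01234567" for c in text):
        return None
    return int(text, 8)


def looks_like_tar_header(block, require_ustar=False):
    """Return True if the first 512 bytes carry a consistent tar checksum.

    require_ustar is a hypothetical toggle that additionally rejects
    old-style headers lacking the "ustar" magic.
    """
    if len(block) < BLOCK_SIZE:
        return False
    header = block[:BLOCK_SIZE]
    end = CHKSUM_OFFSET + CHKSUM_LENGTH
    stored = parse_octal_field(header[CHKSUM_OFFSET:end])
    if stored is None:
        return False

    # The checksum is computed with its own field treated as 8 spaces.
    outside = header[:CHKSUM_OFFSET] + header[end:]
    padding = 0x20 * CHKSUM_LENGTH
    unsigned_sum = sum(outside) + padding
    # Some old implementations summed the bytes as signed chars.
    signed_sum = sum(b - 256 if b > 127 else b for b in outside) + padding
    if stored not in (unsigned_sum, signed_sum):
        return False

    if require_ustar and header[MAGIC_OFFSET:MAGIC_OFFSET + 5] != b"ustar":
        return False
    return True


# Usage: a header produced by Python's own tarfile module, and a non-tar block.
sample = tarfile.TarInfo("hello.txt").tobuf()
print(looks_like_tar_header(sample, require_ustar=True))
print(looks_like_tar_header(b"PK\x03\x04" + b"\x7f" * 508))
```

Keeping the magic check behind a separate flag mirrors the suggestion in this thread that stricter validation should be toggleable for unusual Tar variants.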