SharpCompress hangs in busy wait reading corrupt/truncated ".tar.gz" file #120

Open
opened 2026-01-29 22:06:53 +00:00 by claunia · 3 comments

Originally created by @erikcturner on GitHub (Aug 28, 2016).

I used one of the samples (reproduced below) and the code fragment hangs on the final MoveToNextEntry call. This same code works fine on a non-truncated .tar.gz file.

```
using (Stream stream = File.OpenRead(@"C:\Code\sharpcompress.tar.gz"))
{
    var reader = ReaderFactory.Open(stream);
    while (reader.MoveToNextEntry())
    {
        if (!reader.Entry.IsDirectory)
        {
            reader.WriteEntryToDirectory(@"C:\temp", ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
        }
    }
}
```

@adamhathcock commented on GitHub (Sep 27, 2016):

Is it supposed to be a valid file? SharpCompress isn't going to check for file validity.


@erikcturner commented on GitHub (Sep 27, 2016):

Adam,

We were generating and downloading a ".tar.gz" file from one of our
servers. Due to a bug in the software, it was returning a shortened
version of the complete ".tar.gz" file.

SharpCompress did not throw an exception under this condition - it just
never returned and seemed to be in a "busy wait" since it was consuming an
entire core's worth of CPU time.

This behavior was unacceptable for our software in the web server, so I
ended up writing my own Deflate/Untar functionality that threw an exception
when it detected the condition of "no more data available" and "not done
with TAR entry". I found out that Deflate has no indication that the
compressed stream is too short, but TAR can recognize a tarball that is too
short (unless the truncation happens to fall _exactly_ at the boundary
between the end of one file and the TAR header at the beginning of the next
file - and even then it would recognize it as a non-standard TAR file
without the block of 512 zeros at the end).
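
For anyone who needs the same safeguard, the check described above can be sketched roughly as follows. This is not the code from this comment and not part of SharpCompress; the class and method names (`TarTruncationCheck`, `TryReadBlock`, `CountEntries`) are hypothetical. It assumes a plain ustar layout (512-byte headers, octal size field at offset 124), treats the first all-zero block as the end-of-archive marker, and does not handle the base-256 size encoding that GNU tar uses for very large files.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: walks an already-decompressed TAR stream and throws
// instead of looping forever when the data runs out early.
public static class TarTruncationCheck
{
    private const int BlockSize = 512;

    // Fills the whole buffer. Returns false only when the stream is already
    // at EOF (zero bytes read); throws if it ends partway through a block.
    private static bool TryReadBlock(Stream source, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = source.Read(buffer, total, buffer.Length - total);
            if (n == 0)
            {
                if (total == 0) return false;
                throw new EndOfStreamException("Stream ended inside a 512-byte TAR block.");
            }
            total += n;
        }
        return true;
    }

    // Returns the number of entries seen before the end-of-archive marker.
    public static int CountEntries(Stream tar)
    {
        var block = new byte[BlockSize];
        int entries = 0;
        while (true)
        {
            // "No more data" on an entry boundary: the zero block is missing.
            if (!TryReadBlock(tar, block))
                throw new EndOfStreamException("Stream ended without an end-of-archive marker.");

            // An all-zero header block marks the end of the archive.
            if (Array.TrueForAll(block, b => b == 0))
                return entries;

            // Size field: 12 bytes at offset 124, octal ASCII digits.
            string sizeText = Encoding.ASCII.GetString(block, 124, 12).Trim('\0', ' ');
            long size = sizeText.Length == 0 ? 0 : Convert.ToInt64(sizeText, 8);

            // Entry data is padded up to a whole number of 512-byte blocks.
            long dataBlocks = (size + BlockSize - 1) / BlockSize;
            for (long i = 0; i < dataBlocks; i++)
            {
                // "No more data" while "not done with TAR entry".
                if (!TryReadBlock(tar, block))
                    throw new EndOfStreamException("Stream ended before the entry data was complete.");
            }
            entries++;
        }
    }
}
```

To check a ".tar.gz" before handing it to an extractor, the file stream could be wrapped in `System.IO.Compression.GZipStream` with `CompressionMode.Decompress` and passed to `CountEntries`. Because a read that returns zero bytes is never retried, a short file ends in an exception rather than a spin.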

I submitted the problem report because I thought you might want to know
about the "busy wait" issue - it totally killed the performance of our 12
core server after enough attempts to download truncated ".tar.gz" files.

Erik Turner


@adamhathcock commented on GitHub (Sep 28, 2016):

Is there any way you can contribute the code you wrote?

Reference: starred/sharpcompress#120