Tar corruption when writing active files #619

Closed
opened 2026-01-29 22:14:43 +00:00 by claunia · 4 comments

Originally created by @lopio on GitHub (Mar 28, 2024).

I use the library to compress logs in a streaming manner, without saving a temporary file or buffering in memory (RAM and disk space restrictions).

When I use the tar writer to archive active files (logs), the size recorded in the tar header doesn't match the data written to the archive, because the file grew between the time the header was created and the time the file was fully read.

This results in the tar entry being corrupted, and sometimes the file in question cannot be recovered.
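
To illustrate why the mismatch breaks the archive, here is a minimal sketch of the offset arithmetic a tar reader relies on. It assumes the plain tar layout (one 512-byte header block followed by data padded to a multiple of 512 bytes) and ignores extension headers such as long-name entries. `TarLayout` and its members are hypothetical names for illustration, not part of SharpCompress.

```csharp
using System;

// Hypothetical helper for illustration only; not a SharpCompress type.
public static class TarLayout
{
    public const int BlockSize = 512;

    // Bytes an entry's data occupies once padded to a whole number of blocks.
    public static long PaddedLength(long dataLength) =>
        (dataLength + BlockSize - 1) / BlockSize * BlockSize;

    // Offset of the next header, measured from the start of this entry's
    // header, as a reader computes it from the size declared in the header.
    public static long NextHeaderOffset(long declaredSize) =>
        BlockSize + PaddedLength(declaredSize);
}
```

For example, if the header declares 1000 bytes, a reader looks for the next header at `NextHeaderOffset(1000)`, which is 1536. If the writer actually emitted 1100 bytes and padded those, the next header really sits at `NextHeaderOffset(1100)`, which is 2048, so the reader ends up parsing file data as a header.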

Repro steps: Create a tar archive containing a file that is actively being written to.

The fix is to stop reading the file at the size set by the tar header. Attached is the proposed fix as a git patch
[tarCorruption.patch](https://github.com/adamhathcock/sharpcompress/files/14793737/tarCorruption.patch)
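
The idea behind the fix (stop reading once the size written to the header has been reached) can be sketched independently of the library. This is a minimal illustration under stated assumptions, not the contents of the patch: `BoundedCopy.CopyAtMost` and the 81920-byte buffer size are hypothetical choices, and only standard `System.IO.Stream` calls are used.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of SharpCompress.
public static class BoundedCopy
{
    // Copies at most `limit` bytes from source to destination and
    // returns the number of bytes actually written.
    public static long CopyAtMost(Stream source, Stream destination, long limit)
    {
        if (limit < 0) throw new ArgumentOutOfRangeException(nameof(limit));

        var buffer = new byte[81920]; // assumed buffer size
        long total = 0;
        while (total < limit)
        {
            // Never request more than the bytes still allowed by the header.
            int want = (int)Math.Min(buffer.Length, limit - total);
            int read = source.Read(buffer, 0, want);
            if (read == 0) break; // source ended before the limit was reached
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Capping the read request, rather than trimming after the read, means no bytes past the declared size are ever pulled from the source. The sketch does not handle the opposite case, where the file shrinks and fewer bytes than declared are available; a caller would have to detect a return value below `limit` and decide how to deal with it.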

@adamhathcock commented on GitHub (Mar 29, 2024):

I seemingly can't use that patch file

Any chance of a PR?


@lopio commented on GitHub (Mar 29, 2024):

I can't find a way to create a branch for the PR. I do have my complete git diff here:

```
PS C:\git\sharpcompress> git diff
diff --git a/src/SharpCompress/Utility.cs b/src/SharpCompress/Utility.cs
index 9471c3e..0493e84 100644
--- a/src/SharpCompress/Utility.cs
+++ b/src/SharpCompress/Utility.cs
@@ -270,16 +270,27 @@ public static class Utility
         return sTime.AddSeconds(unixtime);
     }

-    public static long TransferTo(this Stream source, Stream destination)
+    public static long TransferTo(this Stream source, Stream destination, Int64? size = null)
     {
+        Int64 realSize = size ?? source.Length;
         var array = GetTransferByteArray();
         try
         {
             long total = 0;
             while (ReadTransferBlock(source, array, out var count))
             {
-                total += count;
-                destination.Write(array, 0, count);
+                if (total + count > realSize)
+                {
+                    int bytesToWrite = (int) (realSize - total);
+                    destination.Write(array, 0, (int) bytesToWrite);
+                    total = realSize;
+                    break;
+                }
+                else
+                {
+                    total += count;
+                    destination.Write(array, 0, count);
+                }
             }
             return total;
         }
diff --git a/src/SharpCompress/Writers/Tar/TarWriter.cs b/src/SharpCompress/Writers/Tar/TarWriter.cs
index 2427db3..53ce0ea 100644
--- a/src/SharpCompress/Writers/Tar/TarWriter.cs
+++ b/src/SharpCompress/Writers/Tar/TarWriter.cs
@@ -91,7 +91,7 @@ public class TarWriter : AbstractWriter
         header.Size = realSize;
         header.Write(OutputStream);

-        size = source.TransferTo(OutputStream);
+        size = source.TransferTo(OutputStream, realSize);
         PadTo512(size.Value);
     }
```

@lopio commented on GitHub (Apr 8, 2024):

Fixed [tarCorruption.patch](https://github.com/adamhathcock/sharpcompress/files/14909002/tarCorruption.patch)

@adamhathcock commented on GitHub (Apr 9, 2024):

Modified the patch a little, as I think my client corrupted it.

Please validate the PR

Reference: starred/sharpcompress#619