Expose DataStartPosition for TarEntry so that a lookup table can be created for a tar file #396

Open
opened 2026-01-29 22:11:16 +00:00 by claunia · 8 comments

Originally created by @sameerkattel on GitHub (Apr 23, 2020).

Currently, the start position of an entry's data in the underlying stream is not exposed on a tar entry, so it's not possible to iterate through all the tar entries and build a lookup table.
With a lookup table, it would be possible to randomly access any entry in a tar file.

claunia added the question label 2026-01-29 22:11:16 +00:00

@adamhathcock commented on GitHub (Apr 23, 2020):

TarArchive implicitly makes a lookup table for random access.


@sameerkattel commented on GitHub (Apr 24, 2020):

Can that be exposed so that the lookup table can be serialized and saved? Or would you mind pointing me to where it builds the lookup table for random access?


@adamhathcock commented on GitHub (May 24, 2020):

My point is that you don't need to do it, because it's done implicitly for all archives via the AbstractArchive class.


@sameerkattel commented on GitHub (May 24, 2020):

I want to export the lookup and save it to a file, so that I can use the lookup file to extract a given entry directly by byte position without going through the tar parsing process again.


@adamhathcock commented on GitHub (May 24, 2020):

Propose a solution because all of the pieces of what it sounds like you want to do are there.


@sameerkattel commented on GitHub (May 24, 2020):

I want to do something like this to export the metadata, but DataStartPosition is not available in TarEntry. I'm not sure whether there is another way of achieving this?

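```
using System.Collections.Generic;
using System.IO;
using SharpCompress.Readers;
using SharpCompress.Readers.Tar;

// TarEntryMeta is a caller-defined type holding (key, data start position, size).
public static IEnumerable<TarEntryMeta> GetAllMetaFromArchive(Stream tarStream)
{
    using (var reader = TarReader.Open(tarStream, new ReaderOptions() { LeaveStreamOpen = true }))
    {
        while (reader.MoveToNextEntry())
        {
            if (!reader.Entry.IsDirectory)
            {
                // DataStartPosition is the property this issue asks for; the entry does not expose it today.
                yield return new TarEntryMeta(reader.Entry.Key, reader.Entry.DataStartPosition, reader.Entry.Size);
            }
        }
    }
}
```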

Once DataStartPosition is known for each TarEntry, I can use the lookup to read an entry's content directly from the stream without having to reparse the tar file.
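
To make the intended use of the lookup concrete, here is a minimal sketch of the read side. It uses only the standard .NET `Stream` API and assumes the tar file is uncompressed, the stream is seekable, and the offset and size come from a previously saved lookup. The names `TarEntryLookup` and `ReadEntryData` are hypothetical and not part of SharpCompress.

```csharp
using System;
using System.IO;

public static class TarEntryLookup
{
    // Reads `size` bytes starting at `dataStartPosition` from a seekable,
    // uncompressed tar stream. Both values are assumed to come from a saved lookup.
    public static byte[] ReadEntryData(Stream tarStream, long dataStartPosition, long size)
    {
        if (!tarStream.CanSeek)
        {
            throw new NotSupportedException("Random access needs a seekable stream.");
        }

        tarStream.Seek(dataStartPosition, SeekOrigin.Begin);

        // Sketch only: entries too large for a single array would need chunked copying.
        var buffer = new byte[checked((int)size)];
        int total = 0;
        while (total < buffer.Length)
        {
            // Stream.Read may return fewer bytes than requested, so loop until done.
            int read = tarStream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                throw new EndOfStreamException("Stream ended before the entry was fully read.");
            }
            total += read;
        }
        return buffer;
    }
}
```

This only works if the saved offsets still match the file, so the lookup has to be rebuilt whenever the tar file changes.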


@adamhathcock commented on GitHub (May 24, 2020):

You can do a PR, or compute the entry positions using the sizes. The Entry objects themselves are metadata.
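
The size-based calculation could look roughly like the sketch below. It relies on the tar layout, where each entry is a 512-byte header followed by its data padded up to a multiple of 512 bytes. It assumes a plain uncompressed tar in which every entry has exactly one header block, and that the sizes are supplied in archive order for every entry, including directories and other zero-size entries. Archives that use GNU long-name or PAX extended headers contain extra blocks and would break that assumption. The names `TarOffsets`, `PaddedSize` and `DataStartPositions` are hypothetical and not part of SharpCompress.

```csharp
using System;
using System.Collections.Generic;

public static class TarOffsets
{
    // Tar headers and data are laid out in 512-byte blocks.
    private const long BlockSize = 512;

    // Rounds an entry's data size up to the next multiple of the block size.
    public static long PaddedSize(long size) =>
        (size + BlockSize - 1) / BlockSize * BlockSize;

    // Yields the data start offset of each entry, given the entry sizes in archive order.
    // Assumes exactly one header block per entry (no long-name or PAX extension blocks).
    public static IEnumerable<long> DataStartPositions(IEnumerable<long> sizesInArchiveOrder)
    {
        long headerPosition = 0;
        foreach (long size in sizesInArchiveOrder)
        {
            long dataStart = headerPosition + BlockSize; // data follows the header block
            yield return dataStart;
            headerPosition = dataStart + PaddedSize(size); // next header follows the padded data
        }
    }
}
```

For example, entry sizes of 0, 1, 512 and 513 bytes give data start offsets of 512, 1024, 2048 and 3072. Directories have to be counted even if they are left out of the final lookup, because each one still occupies a header block.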


@sameerkattel commented on GitHub (May 24, 2020):

Ahh, the idea of calculating the start position from the size of each entry completely escaped me. Maybe that will do. Thanks for the wonderful library!

Reference: starred/sharpcompress#396