mirror of
https://github.com/adamhathcock/sharpcompress.git
synced 2026-02-08 05:27:04 +00:00
Get file modify date from stream #480
Originally created by @brandonkish on GitHub (Nov 24, 2021).
Hello,
I apologize if this is the incorrect place for this, but after reading the README, it looks as though you prefer questions and requests to be posted here in the Issues section. If there is a better place to post this question, please let me know and I will move it there.
I would like to request the ability to grab the "modify date" for the next entry in the stream when using the reader. My understanding is that it is part of the standard for most compressed archives to retain each file's modified date (but not the created date). I may be wrong about this.
==Background Info==
I currently have a project I am working on where I am extracting an assortment of 3,000 to 4,000 zip/rar/7zip files. These compressed files are assets for another program and need to be extracted to a database in a particular order, with the files extracted in a specific way. However, there is no set standard for the file structures of these zip files, only folder names.
The issue is that some of the archives have different names but contain the exact same file / folder hierarchy. Also, some of the zip files have an identical file / folder hierarchy (or only contain a few files), but have newer versions of the files. Due to the quantity and size of these compressed files, extracting all files can take hours. Also, the library of assets is updated daily, meaning the extraction process takes place several times a week.
My goal is to identify which zip archives have already been extracted and to not extract them again even if they are named differently or duplicated.
In order to identify whether two compressed files are identical, regardless of their names, I came up with the idea of creating a hash value for each zip, based on the file name, date modified, and size of each file in the compressed file. I would use a look-ahead method of parsing the compressed file with SharpCompress, reading each file's size and date modified to build the hash in memory, which could then be compared against the hashes of other zip files. (If you have another recommendation, it would be greatly appreciated.)
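As a rough illustration of the hashing idea described above, here is a minimal sketch. It uses Python's standard `zipfile` module rather than SharpCompress, so it only covers zip archives and does not show the library's API. The function name `archive_fingerprint` and the `name|size|date` record layout are assumptions made for this example.

```python
import hashlib
import zipfile


def archive_fingerprint(stream):
    """Hash entry names, sizes and modify dates without extracting any data.

    `stream` is a path or a seekable binary file object containing a zip archive.
    """
    digest = hashlib.sha256()
    with zipfile.ZipFile(stream) as archive:
        # Sort by name so the fingerprint does not depend on entry order.
        for info in sorted(archive.infolist(), key=lambda i: i.filename):
            if info.is_dir():
                continue  # only files contribute to the fingerprint
            # Hypothetical record layout: name|uncompressed size|modify date tuple
            record = f"{info.filename}|{info.file_size}|{info.date_time}\n"
            digest.update(record.encode("utf-8"))
    return digest.hexdigest()
```

Two archives with different file names but the same entries, sizes, and modify dates should produce the same fingerprint, so it could serve as a lookup key for "already extracted" archives. Only the archive's metadata is read here; no entry is decompressed.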
However, since two zip files can have the same files, or partial files, I need to determine which zip was created most recently before overwriting the previously extracted files. This is where the request for the ability to view the file's modify date would be very helpful. (Or if you have another suggestion)
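One possible way to decide which of two overlapping archives is more recent is to compare the latest modify date found among their entries. The sketch below again uses Python's standard `zipfile` module as a stand-in, not SharpCompress, and the function name `newest_entry_time` is an assumption for this example.

```python
import zipfile
from datetime import datetime


def newest_entry_time(stream):
    """Return the latest entry modify date in a zip archive, or None if it has no files."""
    with zipfile.ZipFile(stream) as archive:
        times = [
            datetime(*info.date_time)  # (year, month, day, hour, minute, second)
            for info in archive.infolist()
            if not info.is_dir()
        ]
    return max(times, default=None)
```

The archive with the larger `newest_entry_time` value would be treated as the newer one. This assumes the modify dates stored in the archives are trustworthy, which may not hold if files were repackaged with altered timestamps.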
The other aspect to this is that, due to the size and quantity of the zip files, I would like to get this information from the stream to avoid the need to first extract the file and then read the file's properties. I know it might be possible to extract the file first and then read its properties, but this would be a huge performance hit. Is there any way to perform this task by parsing the stream before extracting the file?
Thanks.
@brandonkish commented on GitHub (Nov 24, 2021):
I realized I was going about this the wrong way, and discovered this is already implemented. For anyone else looking for this: