mirror of
https://github.com/aaru-dps/Aaru.git
synced 2026-02-04 00:54:33 +00:00
Add support for NetApp WAFL filesystem #84
Originally created by @darkstar on GitHub (Aug 26, 2016).
NetApp WAFL is an advanced filesystem that works similarly to TUX3 and BTRFS. It includes a RAID subsystem with one or two parity disks (like btrfs) and uses a "walking journal" type write strategy. It has a custom partition scheme, and each disk is self-describing with respect to its position in the RAID tree.
There is no documentation outside of patents and the binaries for their OS (OnTap). If you want to start with this, I have some stuff that can help with that:
Interesting details:
If you're interested I could send you the code, although I'm kinda reluctant to put it up on GitHub directly since it has not been clean-room reverse engineered and I don't want to anger NetApp's lawyers ;-)
@darkstar commented on GitHub (Aug 26, 2016):
Oh, btw, this could be tricky with the current implementation since you'd need to open multiple files concurrently. But this is also true for (full) btrfs support (since it does its own RAID), for some VSMD files (those that are split at 2 GB boundaries), and for things like BIN/CUE etc.
Probably, opening one of these files should either scan the containing directory for "matching" files by itself (using a fixed name, a fixed extension with a wildcard as the name, or the extension of the first file to find all other files with that extension) or prompt the user to select/find the "missing" files.
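To make the directory-scan idea concrete, here is a minimal sketch in Python. It is not DiscImageChef code: the function name `find_companion_files` and the `fixed_names` parameter are made up for illustration, and it only covers the "fixed name" and "same extension as the first file" strategies.

```python
from pathlib import Path


def find_companion_files(first_file, fixed_names=()):
    """Return candidate sibling files for a multi-file image, sorted by path.

    Hypothetical helper: `fixed_names` is an assumed list of file names that
    a given format always expects next to the first file.
    """
    first = Path(first_file)
    directory = first.parent
    candidates = set()

    # Strategy 1: fixed file names expected in the same directory.
    for name in fixed_names:
        path = directory / name
        if path.is_file():
            candidates.add(path)

    # Strategy 2: every other file sharing the first file's extension.
    for path in directory.iterdir():
        if path.is_file() and path.suffix.lower() == first.suffix.lower():
            candidates.add(path)

    # The file the user opened is not a "missing" companion.
    candidates.discard(first)
    return sorted(candidates)
```

If this returns nothing, or fewer files than the format header says it needs, the caller could fall back to prompting the user for the missing files.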
@claunia commented on GitHub (Aug 26, 2016):
Multi-file disc images (e.g. Cuesheet.cue, Track1.bin, Track2.bin) are easily supported. Multi-disk volumes are not; an API would need to be designed for that.
About "clean room reverse engineering": I'm not a lawyer, but if I understood correctly when mine explained it, in the European Union, when it is done for interoperability (DiscImageChef can fall into this category), you can disassemble the code and look at how it works (but not look at confidential source code), as long as you write the interoperable application yourself and do not publish the direct findings or the disassembled code.
In a nutshell: as long as you have not looked at confidential information, you must code it yourself; you cannot send me the information for me to code it.
If the information was obtained just by guessing (no disassembly involved, only looking at the on-disk structures), you can publish it.
Also, the current priority for any filesystem is to identify it and get information about it (implementing only the Identify() and GetInformation() methods). Full read-only support is the lowest priority.
And per your description, read-only support will require parity and compression support. Getting a dynamic Reed-Solomon implementation (one with changeable parameters) and a compression API are high priority right now.
I'm going to add the compression API as soon as I solve all current issues (except variable track sizes), as several disc image formats depend on it.
@darkstar commented on GitHub (Aug 26, 2016):
Okay, I think I understand what you're saying about the reverse-engineering issue...
Compression is not necessarily required, as it is an optional feature in later versions of the OS. RS decoders are also not required, since OnTap uses only XOR calculations in its RAID implementation.
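For reference, this is what generic XOR parity looks like, as a minimal Python sketch. It assumes the textbook single-parity case (parity block = XOR of all data blocks in a stripe) and says nothing about the actual WAFL on-disk layout or how the second parity disk is computed; `xor_blocks` and `rebuild_missing` are illustrative names.

```python
def xor_blocks(blocks):
    """XOR a list of equally sized byte blocks together."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same size")
    result = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block of a stripe.

    Because parity = d0 ^ d1 ^ ... ^ dn, XOR-ing the parity with every
    surviving data block leaves exactly the missing one.
    """
    return xor_blocks(list(surviving_blocks) + [parity])
```

So with one parity disk, reading a stripe with one disk image missing needs nothing more than this; no Reed-Solomon decoder is involved.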
However, I think a CRC API would be good, since lots of disk image formats do some kind of checksum to verify data. I have a very flexible CRC implementation with lots of different CRC parameters/polynomials already in place that I can offer. It could easily be extended to support other checksums as well.
You can find it here: http://pastebin.com/6fYUYNPA
@claunia commented on GitHub (Aug 27, 2016):
There is already a CRC API in https://github.com/claunia/DiscImageChef/tree/master/DiscImageChef.Checksums that includes ANSI CRC-16, ISO CRC-32 and ECMA CRC-64. The API also supports Adler-32, MD5, RIPEMD-160, SHA1, SHA2 and SpamSum.
Teledisk and CopyQM are using their own CRC implementations because they're the only ones using them at all, but the existing implementations could be easily made to work with variable polys and parameters.
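As a rough idea of what "variable polys and parameters" could mean, here is a bit-by-bit CRC sketch in Python using the common width/poly/init/refin/refout/xorout parameter model. It is not the DiscImageChef.Checksums API and not the pastebin code; the function names are illustrative, and the bitwise loop favours clarity over speed (a real implementation would use a lookup table). It assumes a width of at least 8 bits.

```python
def reflect(value, width):
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def crc(data, width, poly, init, refin, refout, xorout):
    """Compute a CRC over `data` for the given parameter set (width >= 8)."""
    mask = (1 << width) - 1
    top_bit = 1 << (width - 1)
    register = init & mask
    for byte in data:
        if refin:
            byte = reflect(byte, 8)
        # Feed the next byte into the top of the register.
        register ^= byte << (width - 8)
        for _ in range(8):
            if register & top_bit:
                register = ((register << 1) ^ poly) & mask
            else:
                register = (register << 1) & mask
    if refout:
        register = reflect(register, width)
    return register ^ xorout


# Widely published parameter sets (not taken from DiscImageChef).
CRC32 = dict(width=32, poly=0x04C11DB7, init=0xFFFFFFFF,
             refin=True, refout=True, xorout=0xFFFFFFFF)
CRC16_ARC = dict(width=16, poly=0x8005, init=0x0000,
                 refin=True, refout=True, xorout=0x0000)
```

With the `CRC32` parameters, `crc(b"123456789", **CRC32)` should agree with Python's `zlib.crc32` on the same input, which is an easy sanity check when adding a new parameter set.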
RS is nonetheless required for CDs, DVDs, Blu-rays, magneto-opticals, SCSI/ATA hard disks, etc., so it's really a priority to have an implementation with variable parameters. Currently there is one with the CD main channel parameters hard-coded (it does not work with the CD subchannel, let alone other media).