
Imploder file formats

From ExoticA

These are the file formats used by The Imploder and related utilities.

All multi-byte values described in this document are stored in big-endian format.

The Imploder

This section has yet to be written. The Imploder creates compressed executable files that unpack themselves when run. There are several variations: the normal Imploder (sub-variants: 3.1, 3.1 pure and 4.0), the library Imploder, which uses the external "explode.library" (sub-variants: 3.1 and 4.0), and the overlayed Imploder, which loads the executable while decrunching it. The compression code is the same as in the Disk Imploder and File Imploder, but the Amiga executable file structure has to be reconstituted as well. MC680x0 code to do this can be found here.

DImp

The Disk Imploder compresses the raw disk structure of standard Amiga disks with the Imploder compression algorithm. The file extension ".DMP" is used for a standard Disk Imploder file, and the extension ".DEX" is used for a self-extracting Disk Imploder file.

Overall DImp file format

The regular Disk Imploder format is given below. The self-extracting format is the same data, preceded by an Amiga executable that extracts it. An Amiga executable always begins with the 32-bit value 0x3F3 (1011 in decimal). DImp 1.00 self-extracting files have the DImp data at offset 3856 decimal. DImp 2.27 self-extracting files have the DImp data at offset 5796 decimal. For other versions, search the entire executable for the "DIMP" identifier of the header.

The regular DImp format comprises the following sections, stored consecutively without gaps, in the order given.

DImp header

The DImp header consists of two 32-bit values. The first is the identifying value 0x44494D50, or "DIMP" in ASCII. The second is the length, in bytes, of the information table that follows. It must be between 4 and 404.
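The header search and validation described above can be sketched as follows. This is a minimal illustration, not code from the DImp utility: the names `be32` and `dimp_find_header` are hypothetical, and the whole file is assumed to be already loaded into memory. For a regular ".DMP" file the search succeeds immediately at offset 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value from p. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: scan a file image for a plausible DImp header.
 * Returns the offset of the "DIMP" identifier, or -1 if none is found.
 * On success, *table_len receives the information table length (4..404).
 */
long dimp_find_header(const unsigned char *buf, size_t len,
                      uint32_t *table_len)
{
    size_t i;

    for (i = 0; i + 8 <= len; i++) {
        if (memcmp(buf + i, "DIMP", 4) == 0) {
            uint32_t n = be32(buf + i + 4);
            if (n >= 4 && n <= 404) {   /* reject out-of-range lengths */
                *table_len = n;
                return (long)i;
            }
        }
    }
    return -1;
}
```

The information table then starts 8 bytes after the returned offset.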

DImp information table

The DImp information table holds all metadata regarding the compression and disk structure. The overall length of the table is 404 (0x194) bytes. If the table length given in the header section is less than 404, only that number of bytes should be read from the DImp file, and the remaining bytes must be filled with zeroes. The table format is as follows:

| Offset | Length | Description |
|--------|--------|-------------|
| 0x00 | 4 | This is a checksum of all data in the information table, except this checksum field itself. In other words, the checksum of 400 (0x190) bytes of data from offset 0x004 to offset 0x193, inclusive. See the checksum format for more information. |
| 0x04 | 2 | This is the level of compression used. As all levels can be unpacked with the same code, it is not needed. |
| 0x06 | 10 | This is an 80-bit bitfield, one bit for each possible cylinder on the compressed disk. The most significant bit of byte 0 represents cylinder 0. The least significant bit of byte 0 represents cylinder 7. The least significant bit of byte 9 represents cylinder 79. If a bit is set, the corresponding cylinder is stored in the DImp file. If a bit is not set, the cylinder is not stored in the DImp file. |
| 0x10 | 28 | This is an explosion table. It stores the state required for the decompressor to unpack the text message, if present. The actual structure comprises 8 16-bit values used as "base offsets" and 12 8-bit values used as "number of extra bits to read". |
| 0x2C | 28 | This is another explosion table, which stores the state required for the decompressor to unpack a cylinder. All cylinders use the same explosion table. |
| 0x48 | 4 | This is the compressed length of the text message. If it is 0, there is no text message present. |
| 0x4C | 4 | This is the uncompressed length of the text message. |
| 0x50 | 4 | This is a checksum of the text message, when uncompressed. See the checksum format for more information. |
| 0x54 | 320 | This is an array of 80 32-bit values, one for each cylinder. See the cylinder section for more information. |
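As a rough sketch of how the table might be handled in C: the short table is zero-padded to its full 404 bytes, after which the cylinder bitmap and the per-cylinder array can be read at fixed offsets. The function names here are hypothetical and chosen only for illustration. Checksum verification is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIMP_TABLE_SIZE 404   /* full size of the information table */

/*
 * Copy table_len bytes (the value read from the header) into a full-size
 * table and zero-fill whatever the file did not supply.
 * Returns 0 on success, -1 if table_len is outside 4..404.
 */
int dimp_load_table(unsigned char table[DIMP_TABLE_SIZE],
                    const unsigned char *src, uint32_t table_len)
{
    if (table_len < 4 || table_len > DIMP_TABLE_SIZE)
        return -1;
    memset(table, 0, DIMP_TABLE_SIZE);
    memcpy(table, src, table_len);
    return 0;
}

/* Bitmap at offset 0x06: the most significant bit of byte 0 is cylinder 0. */
int dimp_cylinder_in_bitmap(const unsigned char *table, unsigned cyl)
{
    if (cyl >= 80)
        return 0;
    return (table[0x06 + cyl / 8] >> (7 - cyl % 8)) & 1;
}

/* 32-bit big-endian entry for cylinder cyl (0..79) from the array at 0x54. */
uint32_t dimp_cylinder_entry(const unsigned char *table, unsigned cyl)
{
    const unsigned char *p = table + 0x54 + 4 * cyl;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```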

DImp text message

The text message, if present in the DImp file, is a stream of Imploder compressed data. The length of this compressed stream is given in the information table at offset 0x48. The length of the stream when uncompressed is given at offset 0x4C. If either of these two values is zero, there is no text message present. The stream should be decompressed with the explosion algorithm, using the explosion table at offset 0x10 in the information table. The resulting uncompressed stream is expected to be printable ISO-8859-1 text, but may contain ANSI codes and Amiga console.device specific escape codes.

DImp cylinder

At this point in the DImp file, between 0 and 80 compressed streams are present. Each compressed stream is individually sized and represents one cylinder of an Amiga disk. They are ordered from cylinder 0 to cylinder 79. If a cylinder is not present in the DImp file, it uses no bytes in this section of the file.

An Amiga disk has two sides, 80 tracks per side, and 11 sectors per track. Each sector is 512 bytes in length. A cylinder therefore comprises 22 512-byte sectors, or exactly 11264 bytes of data. The track number is the same for all sectors in a cylinder, and the uncompressed cylinder data is broken into 512-byte sectors in this order: sector 0 on side 0, sector 0 on side 1, sector 1 on side 0, sector 1 on side 1, sector 2 on side 0, sector 2 on side 1, and so on until sector 10 on side 1.
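The interleaved sector order can be expressed as a small mapping from the index of a 512-byte block within an unpacked cylinder to its side and sector. The sketch below uses hypothetical names. The second function additionally assumes the destination is a track-ordered disk image (cylinder 0 side 0, cylinder 0 side 1, cylinder 1 side 0, and so on), which this document does not itself specify.

```c
#include <stddef.h>

#define SECTOR_SIZE   512
#define SECTORS       11                            /* sectors per track */
#define CYLINDER_SIZE (2 * SECTORS * SECTOR_SIZE)   /* 11264 bytes */

/*
 * Map block index 0..21 within an unpacked cylinder to side and sector,
 * following the order: sector 0 side 0, sector 0 side 1, sector 1 side 0...
 */
void dimp_block_position(unsigned block, unsigned *side, unsigned *sector)
{
    *side   = block % 2;   /* blocks alternate between side 0 and side 1 */
    *sector = block / 2;   /* each sector number appears twice in a row */
}

/*
 * ASSUMPTION: byte offset of that block in a track-ordered disk image.
 * The image layout is not described in this document.
 */
size_t dimp_block_image_offset(unsigned cyl, unsigned block)
{
    unsigned side, sector;

    dimp_block_position(block, &side, &sector);
    return (((size_t)cyl * 2 + side) * SECTORS + sector) * SECTOR_SIZE;
}
```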

To determine whether a cylinder is present in the DImp file, first check the disk bitmap at offset 0x06 in the information table. If the corresponding bit is 0, that cylinder is not present. If the bit is set, take the corresponding 32-bit entry from the cylinder information array at offset 0x54 in the information table, and interpret it as follows:

  1. If the entry is 0x00000000, then the cylinder is not present in the file, despite what the disk bitmap said. This happens when an error occurs while reading the disk at compression time.
  2. If the entry is 0xFFFFFFFF, then the cylinder comprises nothing but zeros. Assume the cylinder expands to 11264 zero bytes, and does not use any bytes from this part of the DImp file for its definition.
  3. In all other cases, the entry must be broken into the most significant 16 bits and the least significant 16 bits.
    • The most significant 16 bits are the compressed size of this cylinder, in bytes. If this value is more than the uncompressed length of a cylinder, 11264, then something is wrong. The DImp utility exits with the message "wierd info-table entry" in this scenario. If this value is exactly 11264, the cylinder is stored uncompressed. Otherwise, the cylinder data is a stream of Imploder compressed data. The uncompressed length of this data is 11264 bytes. The stream should be decompressed with the explosion algorithm, using the explosion table at offset 0x2C in the information table.
    • The least significant 16 bits are the least significant 16 bits of the checksum on the cylinder's bytes stored in the file. See the checksum format for more information.
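The decision rules above can be condensed into one classification function. This is an illustrative sketch: the enum and function names are hypothetical, and the caller is assumed to have already fetched the bitmap bit and the 32-bit entry for the cylinder.

```c
#include <stdint.h>

#define CYLINDER_SIZE 11264u

enum dimp_cyl_kind {
    DIMP_CYL_ABSENT,    /* bitmap bit clear, or entry is 0x00000000 */
    DIMP_CYL_ZERO,      /* entry 0xFFFFFFFF: 11264 zero bytes, nothing stored */
    DIMP_CYL_STORED,    /* 11264 bytes stored uncompressed */
    DIMP_CYL_PACKED,    /* Imploder stream, explosion table at offset 0x2C */
    DIMP_CYL_INVALID    /* stored size exceeds the size of a cylinder */
};

/*
 * Classify one cylinder. For STORED and PACKED cylinders, *stored_size
 * receives the number of bytes the cylinder occupies in the file and
 * *checksum16 the expected low 16 bits of its checksum.
 */
enum dimp_cyl_kind dimp_classify(int bitmap_bit, uint32_t entry,
                                 uint16_t *stored_size, uint16_t *checksum16)
{
    uint16_t size = (uint16_t)(entry >> 16);

    *stored_size = 0;
    *checksum16 = 0;
    if (!bitmap_bit || entry == 0x00000000u)
        return DIMP_CYL_ABSENT;
    if (entry == 0xFFFFFFFFu)
        return DIMP_CYL_ZERO;
    if (size > CYLINDER_SIZE)
        return DIMP_CYL_INVALID;
    *stored_size = size;
    *checksum16 = (uint16_t)(entry & 0xFFFFu);
    return size == CYLINDER_SIZE ? DIMP_CYL_STORED : DIMP_CYL_PACKED;
}
```

Only STORED and PACKED cylinders advance the read position in the cylinder section of the file, by `*stored_size` bytes.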

DImp checksum format

If the length in bytes of the data to be checksummed is not a multiple of 2, treat the data as one byte longer, with a single byte of value 0 appended at the very end.

To derive the checksum, interpret the data to be checksummed as a contiguous array of 16-bit, unsigned, big-endian values. Compute the sum of all these values, then add 7. The least significant 32 bits of the result are the checksum value.
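A direct C rendering of this description might look like the following. The function name `dimp_checksum` is an assumption made for illustration. The code relies on `uint32_t` arithmetic wrapping modulo 2^32 to keep only the least significant 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of big-endian 16-bit words, plus 7, truncated to 32 bits. */
uint32_t dimp_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                    /* odd length: pad with one zero byte */
        sum += (uint32_t)data[len - 1] << 8;
    return sum + 7;                 /* unsigned overflow wraps modulo 2^32 */
}
```

For a cylinder, only the low 16 bits of the result would be compared with the value stored in the information table, for example `(dimp_checksum(p, size) & 0xFFFF) == checksum16`.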

FImp

FImp compresses a single file into the following format:

| Offset | Length | Description |
|--------|--------|-------------|
| 0x00 | 4 | The identifying value 0x494D5021, or "IMP!" in ASCII. Clones of the FImp format use the IDs "ATN!", "BDPI", "CHFI", "Dupa", "EDAM", "FLT!", "M.H.", "PARA" and "RDC9". [1] [2] |
| 0x04 | 4 | The uncompressed length of the file, in bytes. |
| 0x08 | 4 | The offset of the end of the compressed data section that follows, referred to as endoffset. Always even. |
| 0x0C | endoffset - 0x0C | The compressed data section. |
| endoffset | 4 | Compressed data longword 3. |
| endoffset + 0x04 | 4 | Compressed data longword 2. |
| endoffset + 0x08 | 4 | Compressed data longword 1. |
| endoffset + 0x0C | 4 | The initial literal run length. |
| endoffset + 0x10 | 2 | Bit 15 is an indicator of compressed data length; bits 7-0 are the first byte of the compressed data ("initial bit-buffer"). |
| endoffset + 0x12 | 28 | The explosion table (8 16-bit values and 12 8-bit values). |
| endoffset + 0x2E | 4 | Unknown; appears to be a checksum of the preceding bytes, but out by a little. |

Re-ordering the data for decompression

The compressed data is not immediately decompressible. The format is designed so that you can load the file, including headers, into a single decompression buffer and decompress it in place. Because of this, it uses the three longwords (4 bytes each) of the header information as a place to put compressed data, rather than "wasting" 12 bytes.

To reconstitute the data so it can be decompressed with the explosion algorithm, order the data as follows:

| Offset | Length | Contents |
|--------|--------|----------|
| 0x00 | 4 | Compressed data longword 1 |
| 0x04 | 4 | Compressed data longword 2 |
| 0x08 | 4 | Compressed data longword 3 |
| 0x0C | endoffset - 0x0C | Compressed data section (may include the initial bit-buffer) |
| endoffset | 4 | Initial literal run length |
| endoffset + 0x04 | 1 | Initial bit-buffer, if not in compressed data section |

In a "normal" compressed stream, the first five bytes (at the end of the stream; the stream is read backwards) are the first literal run length and the initial byte for the bit buffer. If the length of the input data is odd, then the 1-byte "initial bit-buffer" is placed after the 4-byte "first literal run" in memory. This way, the 4-byte run is at an even memory address, so it can be read directly by the MC680x0. If the length of the input data is even, then the "initial bit-buffer" comes before the 4-byte "first literal run", so the 4-byte run is still at an even memory address.

In FImp, endoffset is always even, but the length of the compressed data is not always even, so this information is stored in the bit-buffer word. Check the bit-buffer word's top bit (bit 15). If it is set, the length of the compressed data is odd: place the lower 8 bits of the bit-buffer word as a byte after the initial literal run length and decompress the data with an input length of endoffset + 5. If bit 15 is not set, the length of the compressed data is even, and the final byte of the compressed data section is padding: write the lower 8 bits of the bit-buffer word into the final byte of the compressed data section (endoffset - 1) and decompress the data with an input length of endoffset + 4.
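The re-ordering can be done in place on a loaded file image, provided the trailing fields are saved before anything is overwritten. The sketch below is one possible arrangement under that assumption. The name `fimp_reorder` and its parameters are hypothetical, the identifier at offset 0x00 is not checked (clones use other IDs), and the explosion algorithm itself is not included.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Rearrange an FImp file image in place so it can be fed to the explosion
 * algorithm. The explosion table is copied to table[28] and the
 * uncompressed length to *unpacked_len, since both header fields are
 * overwritten. Returns the input length for the decompressor
 * (endoffset + 4 or endoffset + 5), or 0 if the image is malformed.
 */
size_t fimp_reorder(unsigned char *buf, size_t len,
                    unsigned char table[28], uint32_t *unpacked_len)
{
    unsigned char tail[18];  /* longwords 3, 2, 1, run length, bit-buffer word */
    uint32_t end;

    if (len < 0x0C + 0x32)
        return 0;
    end = be32(buf + 0x08);
    if (end < 0x0C || (end & 1) || end > len - 0x32)
        return 0;

    *unpacked_len = be32(buf + 0x04);
    memcpy(tail, buf + end, 18);
    memcpy(table, buf + end + 0x12, 28);

    memcpy(buf + 0x00, tail + 8, 4);    /* compressed data longword 1 */
    memcpy(buf + 0x04, tail + 4, 4);    /* compressed data longword 2 */
    memcpy(buf + 0x08, tail + 0, 4);    /* compressed data longword 3 */
    memcpy(buf + end, tail + 12, 4);    /* initial literal run length */

    if (tail[16] & 0x80) {              /* bit 15 set: odd compressed length */
        buf[end + 4] = tail[17];
        return (size_t)end + 5;
    }
    buf[end - 1] = tail[17];            /* replace the padding byte */
    return (size_t)end + 4;
}
```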

Example code

The following standard C program will decompress FImp files. It requires the C code of the explosion algorithm, listed below, to be included in a file called "explode.c".

Explosion

The "implosion" (compression) algorithm, common to all three formats, is an LZ77-family compressor with static Huffman coding. It creates Imploder compressed data. The "explosion" algorithm is the decompressor for Imploder compressed data. It will be described in full in a later version of this document. For now, only C source code is available.

Example code
