=== Deduplication Table (`DDT2`)

The deduplication table is a multi-level table of pointers to LBAs contained in the image. It starts with the following header.

[source,c]
/* Undefined */

==== Field Descriptions

[cols="2,2,2,6",options="header"]
|===
|Type |Size |Name |Description

|uint32_t |4 bytes |identifier |The deduplication table identifier, always `DDT2` or `DDTS`. The first level of a table is always `DDT2` and its presence is mandatory. Subtables use `DDTS`.
|uint16_t |2 bytes |type |The data type pointed to by this table. See Annex B.
|uint16_t |2 bytes |compression |The compression algorithm used in the table. See Annex C.
|uint8_t |1 byte |levels |How many levels of subtables are present. 1 means this is the only level.
|uint8_t |1 byte |tableLevel |The level this table corresponds to.
|uint64_t |8 bytes |previousLevel |Absolute byte offset in the image file where the previous table level resides.
|uint16_t |2 bytes |negative |The negative displacement of LBA numbers. For media that can have negative LBAs, this is the number to subtract from the table entry number.
|uint64_t |8 bytes |start |The first LBA contained in this table. It must be 0 for `DDT2` blocks and can be another number for `DDTS` subtables.
|uint8_t |1 byte |blockAlignmentShift |Determines block alignment boundaries using the formula `2 << blockAlignmentShift`.
|uint8_t |1 byte |dataShift |Determines the maximum number of data items in a block using the formula `2 << dataShift`.
|uint8_t |1 byte |tableShift |Shift used to calculate the number of sectors in a deduplication table entry, using the formula `2 << tableShift`.
|uint8_t |1 byte |sizeType |Size type (see table below).
|uint64_t |8 bytes |entries |How many pointers follow this header.
|uint32_t |4 bytes |cmpLength |The size in bytes of the compressed table that follows this header.
|uint32_t |4 bytes |length |The size in bytes of the table block when decompressed.
|uint64_t |8 bytes |cmpCrc64 |The CRC64-ECMA checksum of the compressed table that follows this header.
|uint64_t |8 bytes |crc64 |The CRC64-ECMA checksum of the decompressed table.
|===

The size type defines the following types of entries:

[cols="1,1,6",options="header"]
|===
|Type |Value |Description

|Mini |0 |Each entry uses two bytes, with the leftmost byte (mask 0xFF00) used for flags, and the rightmost byte used as a pointer to the sector or next level.
|Small |1 |Each entry uses three bytes, with the leftmost byte used for flags and the next two bytes used as a pointer to the sector or next level.
|Medium |2 |Each entry uses four bytes, with the leftmost byte (mask 0xFF000000) used for flags and the next three bytes used as a pointer to the sector or next level.
|Big |3 |Each entry uses five bytes, with the leftmost byte used for flags and the next four bytes used as a pointer to the sector or next level.
|===

==== Interpretation of Deduplication Table Entries

Decoding deduplication tables may seem complex initially, but the logic is structured and manageable. Three parameters are critical for interpreting deduplication table entries:

- *block_alignment_shift*
- *table_shift*
- *data_shift*

These parameters are stored in both the master header and each deduplication table header to support reliable decoding.

===== Block Alignment

Each block in the image is aligned to a boundary of `2 << block_alignment_shift` bytes. This alignment is essential for technical consistency and performance.

===== Table Shift

The `table_shift` parameter defines how many blocks (or sectors) are represented by each entry, based on the deduplication table level. In multi-level tables, the span of an entry shrinks by a factor of `2 << table_shift` at each level down.
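
The alignment and per-level scaling described above can be sketched in C. This is an illustrative sketch only, not part of the specification: the function names `ddt_block_alignment` and `ddt_sectors_per_entry` are hypothetical, levels are assumed to be numbered from 1 (the `DDT2` table) up to `levels` (the last level), and the `2 << shift` formula is taken exactly as written in this section.

```c
#include <stdint.h>

/* Alignment boundary in bytes, using the formula as written in this section. */
uint64_t ddt_block_alignment(uint8_t block_alignment_shift)
{
    return (uint64_t)2 << block_alignment_shift;
}

/*
 * Sectors covered by one entry at `level`, in a table with `levels` levels.
 * Assumption: level 1 is the top (DDT2) table and `levels` is the last level.
 * A last-level entry covers one sector; every level above it multiplies the
 * span by 2 << table_shift. Overflow is not checked in this sketch.
 */
uint64_t ddt_sectors_per_entry(uint8_t table_shift, uint8_t levels, uint8_t level)
{
    uint64_t span = 1;
    for (uint8_t l = level; l < levels; l++)
        span *= (uint64_t)2 << table_shift;
    return span;
}
```

A real reader would probably want to reject shift and level combinations whose span does not fit in 64 bits before trusting the result.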
For example, with three levels and `table_shift = 8`:

[cols="1,2",options="header"]
|===
|Level |Sectors per Entry

|1 |(2 << table_shift)^2 = 262144
|2 |2 << table_shift = 512
|3 |1
|===

Tables with more than two levels are rare, but implementations should be resilient enough to handle unexpected depths gracefully.

===== Entry Format Across Levels

In non-terminal levels (i.e., all except the last), each entry contains:

- Relevant metadata flags for its sector range
- An offset pointing to the next deduplication level

To obtain the byte offset in the image file, multiply this offset by `2 << block_alignment_shift`.

In the last level, the `data_shift` is applied as follows to determine the specific item within a data block:

.Example calculation
[source]
----
Given:
- Entry value = 0x35006
- data_shift = 5
- block_alignment_shift = 9

Step 1: Mask and shift
0x35006 >> 5 = 0x1A80

Step 2: Compute byte offset
0x1A80 * (2 << 9) = 0x6A0000

Step 3: Determine item index
0x35006 & 0x1F = 6

Result:
Sector is stored at byte offset 0x6A0000 as item number 6 in the data block.
----

===== Deduplication table flags

[cols="2,1,6",options="header"]
|===
|Flag |Value |Description

|NotDumped |`0x00` |The sector(s) have not been dumped
|Dumped |`0x01` |The sector(s) have been dumped without errors
|Errored |`0x02` |The sector(s) returned an error on dumping
|Mode1Correct |`0x03` |The sector is MODE 1 and the suffix or prefix is correct and can be regenerated. Must only appear on deduplication tables with types CdSectorPrefixCorrected or CdSectorSuffixCorrected
|Mode2Form1Ok |`0x04` |The suffix for MODE 2 sectors is correct, can be regenerated, and corresponds to a MODE 2 Form 1 sector. Must only appear on deduplication tables with type CdSectorSuffixCorrected
|Mode2Form2Ok |`0x05` |The suffix for MODE 2 sectors is correct, can be regenerated, and corresponds to a MODE 2 Form 2 sector with a valid CRC.
Must only appear on deduplication tables with type CdSectorSuffixCorrected
|Mode2Form2NoCrc |`0x06` |The suffix for MODE 2 sectors is correct, can be regenerated, and corresponds to a MODE 2 Form 2 sector with an empty CRC. Must only appear on deduplication tables with type CdSectorSuffixCorrected
|Twin |`0x07` |The pointer contains a "twin" sector table (see below)
|Unrecorded |`0x08` |The sector was unrecorded and each re-read returns random data
|===

When a flag is present in a table that has sublevels, it applies to all the sectors covered by the subtable, unless the flag specifies otherwise.
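
To tie the pieces of this section together, the following C sketch splits a raw entry into its flag byte and pointer, then applies the last-level calculation from the example above. It is illustrative only and rests on several assumptions: the names `ddt_entry_bits`, `ddt_split_entry`, and `ddt_locate` are hypothetical, the entry has already been read into a host integer (this section does not state the on-disk byte order), and the pointer is assumed to occupy every byte after the leftmost flag byte. The flag constants simply mirror the flags table.

```c
#include <stdint.h>

/* Flag values copied from the flags table in this section. */
enum ddt_flag {
    DDT_NOT_DUMPED         = 0x00,
    DDT_DUMPED             = 0x01,
    DDT_ERRORED            = 0x02,
    DDT_MODE1_CORRECT      = 0x03,
    DDT_MODE2_FORM1_OK     = 0x04,
    DDT_MODE2_FORM2_OK     = 0x05,
    DDT_MODE2_FORM2_NO_CRC = 0x06,
    DDT_TWIN               = 0x07,
    DDT_UNRECORDED         = 0x08
};

struct ddt_entry    { int valid; uint8_t flags; uint64_t pointer; };
struct ddt_location { uint64_t byte_offset; uint64_t item; };

/* Entry width in bits for sizeType 0..3 (2, 3, 4, 5 bytes); 0 if unknown. */
unsigned ddt_entry_bits(uint8_t size_type)
{
    return size_type <= 3 ? (size_type + 2u) * 8u : 0u;
}

/* Split a raw entry: leftmost byte is the flags, the rest is the pointer. */
struct ddt_entry ddt_split_entry(uint64_t raw, uint8_t size_type)
{
    struct ddt_entry e = {0, 0, 0};
    unsigned bits = ddt_entry_bits(size_type);
    if (bits == 0)
        return e;                       /* unknown size type: valid stays 0 */
    unsigned ptr_bits = bits - 8u;
    e.valid   = 1;
    e.flags   = (uint8_t)((raw >> ptr_bits) & 0xFFu);
    e.pointer = raw & (((uint64_t)1 << ptr_bits) - 1u);
    return e;
}

/* Last-level lookup, following the three steps of the example calculation. */
struct ddt_location ddt_locate(uint64_t pointer, uint8_t data_shift,
                               uint8_t block_alignment_shift)
{
    struct ddt_location loc;
    loc.byte_offset = (pointer >> data_shift) * ((uint64_t)2 << block_alignment_shift);
    loc.item        = pointer & (((uint64_t)1 << data_shift) - 1u);
    return loc;
}
```

With a Medium entry of `0x01035006`, `ddt_split_entry` yields the flag `0x01` (Dumped) and the pointer `0x35006`; passing that pointer to `ddt_locate` with `data_shift = 5` and `block_alignment_shift = 9` reproduces the byte offset `0x6A0000` and item number 6 from the example calculation.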