Files
libaaruformat/include/aaruformat/structs/tape.h

455 lines
21 KiB
C
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/*
* This file is part of the Aaru Data Preservation Suite.
* Copyright (c) 2019-2025 Natalia Portillo.
*
* This library is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as
* published by the Free Software Foundation; either version 2.1 of the
* License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
* */
/**
* @file tape.h
* @brief Data structures for tape media support in Aaru disk images.
*
* This header defines structures used to represent tape-specific metadata in the Aaru
* image format. Tape media differs from disk media in several key ways:
*
* **Tape Organization:**
* - Tapes are organized into sequential blocks rather than random-access sectors
* - Multiple logical files can exist on a single tape
* - Tapes may be divided into multiple partitions
* - Files are defined by contiguous ranges of blocks within a partition
*
* **File Structure:**
* Tape files are logical groupings of blocks that represent discrete data units.
* Each file has:
* - A unique file number identifying it within its partition
* - A partition number indicating which physical partition contains it
* - A block range (FirstBlock to LastBlock, inclusive) defining its extent
*
* **Partition Structure:**
* Tape partitions are physical divisions of the tape medium. Each partition:
* - Has a unique partition number
* - Contains a contiguous range of blocks
* - Can contain multiple files or none
* - May have different formatting or characteristics
*
* **Block Addressing:**
* Block numbers are sequential within a partition, starting from 0. The FirstBlock
* and LastBlock values are inclusive, meaning both boundary blocks are part of the
* file or partition. For example, a file with FirstBlock=100 and LastBlock=199
* contains 100 blocks (blocks 100 through 199 inclusive).
*
* **Use Cases:**
* - Archiving legacy tape media (DAT, DLT, LTO, QIC, etc.)
* - Preserving tape structure for restoration to physical media
* - Forensic imaging of tape evidence
* - Migration of tape archives to modern storage
* - Documentation of tape organization for archival purposes
*
* **Storage in Aaru Images:**
* The tape metadata blocks (TapeFileHeader and TapePartitionHeader) are optional
* structural blocks written during image finalization. They allow readers to
* understand the logical organization of the tape without parsing the entire
* data stream.
*
* @note All structures in this file use packed alignment (#pragma pack(push, 1))
* to ensure consistent on-disk layout across different compilers and platforms.
*
* @note Block addresses are 0-based and partition numbers typically start from 0
* (though some tape formats may use 1-based numbering).
*
* @see BlockType for the identifier constants used in TapeFileHeader and TapePartitionHeader
*/
#ifndef LIBAARUFORMAT_TAPE_H
#define LIBAARUFORMAT_TAPE_H
#include <stdint.h>
#pragma pack(push, 1)
/**
* @struct TapeFileEntry
* @brief Describes a single logical file on a tape medium.
*
* A tape file is a contiguous sequence of blocks that represents a discrete logical
* unit of data. Files are the primary organizational unit on tape media, analogous
* to files on a filesystem but simpler in structure. Each file is identified by a
* file number and exists within a specific partition.
*
* **File Number:**
* The file number uniquely identifies the file within its partition. File numbering
* typically starts from 0 or 1 depending on the tape format. Sequential files are
* numbered consecutively, though gaps may exist if files were deleted or the tape
* was formatted with specific file positions reserved.
*
* **Partition Association:**
* Each file belongs to exactly one partition. Multi-partition tapes can have files
* with the same file number in different partitions - the combination of partition
* number and file number uniquely identifies a file on the tape.
*
* **Block Range:**
* The FirstBlock and LastBlock fields define the inclusive range of blocks that
* comprise the file. Both endpoints are included in the file. For example:
* - FirstBlock=0, LastBlock=0: A single-block file (block 0)
* - FirstBlock=10, LastBlock=19: A ten-block file (blocks 10-19 inclusive)
* - FirstBlock=100, LastBlock=99: Invalid (FirstBlock must be ≤ LastBlock)
*
* **Physical Layout:**
* Files occupy contiguous blocks on tape. There are no block pointers or allocation
* tables as found in disk filesystems. The physical ordering matches the logical
* ordering defined by the block range.
*
* **File Marks:**
* On physical tape, files are typically separated by filemarks (also called tape marks).
* This structure represents the logical file boundaries; the actual filemarks may be
* implicit in the block numbering or explicitly stored in the tape data stream.
*
* **Size Calculation:**
* File size in blocks = (LastBlock - FirstBlock + 1)
* File size in bytes = file_size_in_blocks × block_size (where block_size is
* tape-format-specific, commonly 512, 1024, or variable)
*
* @note File numbers and block addresses are format-specific. Some formats use
* 0-based numbering while others use 1-based. The Aaru format preserves
* the original numbering scheme.
*
* @note Files cannot span partitions. If a logical dataset crosses partition
* boundaries, it would be represented as separate files in each partition.
*
* @note Empty files (where FirstBlock == LastBlock but containing no data) may
* exist on some tape formats to serve as markers or placeholders.
*/
typedef struct TapeFileEntry
{
uint32_t File; ///< File number (unique within the partition). Identifies this file among all files in the same
///< partition. Numbering scheme is tape-format-dependent.
uint8_t Partition; ///< Partition number containing this file. References a partition defined in the
///< TapePartitionHeader block. Valid range: 0-255.
uint64_t FirstBlock; ///< First block of the file (inclusive). This is the starting block address of the file data.
///< Block addresses are 0-based within the partition.
uint64_t
LastBlock; ///< Last block of the file (inclusive). This is the ending block address of the file data. Must be
///< ≥ FirstBlock. The file contains all blocks from FirstBlock through LastBlock inclusive.
} TapeFileEntry;
/**
* @struct TapeFileHeader
* @brief Header for a tape file metadata block containing file layout information.
*
* This structure serves as the header for a TapeFileBlock, which documents all
* logical files present on the tape medium. The block consists of this fixed-size
* header followed by a variable-length array of TapeFileEntry structures (one per file).
*
* **Block Structure:**
* ```
* +-------------------------+
* | TapeFileHeader | <- Fixed 24-byte header
* +-------------------------+
* | TapeFileEntry 0 | <- First file entry
* +-------------------------+
* | TapeFileEntry 1 | <- Second file entry
* +-------------------------+
* | ... |
* +-------------------------+
* | TapeFileEntry (n-1) | <- Last file entry
* +-------------------------+
* ```
*
* **Purpose:**
* The tape file block enables:
* - Quick lookup of file locations without scanning the entire tape
* - Validation of file boundaries and partition assignments
* - Random access to specific files in the image
* - Preservation of the original tape's logical organization
* - Documentation of tape structure for archival purposes
*
* **Identifier Field:**
* The identifier field must be set to BlockType::TapeFileBlock to indicate this
* block type. This allows the Aaru format parser to recognize and correctly
* interpret the block during image loading.
*
* **Entry Count:**
* The entries field specifies the number of TapeFileEntry structures that follow
* the header. A tape with no files would have entries=0, though this is unusual.
* Maximum value is 2^32-1 (4,294,967,295 files), though practical limits are
* much lower.
*
* **Block Length:**
* The length field contains the size in bytes of the data following this header
* (i.e., the array of TapeFileEntry structures). This is calculated as:
* length = entries × sizeof(TapeFileEntry)
*
* The header itself is NOT included in the length value. Total block size is:
* total_size = sizeof(TapeFileHeader) + length
*
* **CRC64 Checksum:**
* The crc64 field contains a CRC64-ECMA checksum computed over the entry data
* (the array of TapeFileEntry structures, excluding this header). This provides
* integrity verification to detect corruption in the file table. The CRC is
* calculated using the ECMA polynomial.
*
* **File Order:**
* Entries should be ordered by partition number (ascending), then by file number
* (ascending) within each partition. This ordering facilitates efficient lookup
* and matches the physical organization of most tape formats.
*
* **Alignment:**
* When written to the Aaru image, this block is aligned to the image's block
* alignment boundary (typically 2^blockAlignmentShift bytes). Padding may be
* inserted before the block to achieve proper alignment.
*
* **Index Integration:**
* After writing this block to the image file, an IndexEntry is created with:
* - blockType = TapeFileBlock
* - dataType = 0 (tape file blocks have no subtype)
* - offset = file position where this header was written
*
* @note This block is optional. Tape images without file structure metadata can
* omit it, representing the tape as a simple linear sequence of blocks.
*
* @note If a tape has multiple partitions, files from all partitions are included
* in a single TapeFileBlock (distinguished by their Partition field).
*
* @note Empty tapes (no files) may still have a TapeFileHeader with entries=0,
* length=0, though such blocks are typically omitted.
*
* @warning The entries count must accurately reflect the number of TapeFileEntry
* structures in the block. Mismatches will cause parsing errors.
*
* @warning The CRC64 must be recalculated any time the entry data changes.
* Stale CRC values will cause integrity check failures.
*
* @see TapeFileEntry for the structure of individual file entries
* @see TapePartitionHeader for partition structure metadata
* @see BlockType for block type identifier constants
*/
typedef struct TapeFileHeader
{
uint32_t identifier; ///< Block type identifier. Must be set to BlockType::TapeFileBlock. This magic value allows
///< parsers to identify the block type.
uint32_t entries; ///< Number of file entries following this header. Specifies how many TapeFileEntry structures
///< are in this block. Valid range: 0 to 2^32-1.
uint64_t length; ///< Size of entry data in bytes (excluding this header). Calculated as: entries ×
///< sizeof(TapeFileEntry). This is the number of bytes following the header.
uint64_t crc64; ///< CRC64-ECMA checksum of the entry data. Computed over the array of TapeFileEntry structures
///< only. Does NOT include this header. Used for integrity verification.
} TapeFileHeader;
/**
* @struct TapePartitionEntry
* @brief Describes a single physical partition on a tape medium.
*
* Tape partitions are physical divisions of the tape storage area, each with its
* own block address space. Partitioning allows a single tape to be logically
* divided for organizational purposes, access control, or compatibility with
* different systems.
*
* **Partition Number:**
* The partition number uniquely identifies the partition on the tape. Most tapes
* have a single partition (partition 0), but advanced formats like LTO, DLT, and
* AIT support multiple partitions. The numbering scheme is format-specific:
* - Single-partition tapes: Usually partition 0
* - Multi-partition tapes: Typically 0-255, though most formats support fewer
*
* **Block Range:**
* Each partition has an independent block address space starting from its FirstBlock.
* The FirstBlock and LastBlock fields define the inclusive range of blocks in the
* partition:
* - FirstBlock: The first block number in this partition (often 0)
* - LastBlock: The last block number in this partition (inclusive)
*
* **Block Address Spaces:**
* Block addresses are local to each partition. For example:
* - Partition 0: blocks 0-9999 (10,000 blocks)
* - Partition 1: blocks 0-4999 (5,000 blocks)
*
* Block 0 in partition 0 is distinct from block 0 in partition 1. When referencing
* a block, both the partition number and block number are required for uniqueness.
*
* **Partition Size:**
* Partition size in blocks = (LastBlock - FirstBlock + 1)
* Total tape capacity = sum of all partition sizes
*
* **Physical Organization:**
* On physical tape, partitions are typically laid out sequentially:
* - Partition 0 occupies the beginning of the tape
* - Partition 1 follows partition 0
* - And so on...
*
* Some formats allow non-sequential access to partitions via tape directory
* structures or partition tables stored in a leader area.
*
* **Use Cases:**
* - **Data segregation**: Separate system backups from user data
* - **Multi-system use**: Different partitions for different operating systems
* - **Tiered storage**: Fast-access partition for indices, bulk partition for data
* - **Compatibility**: Legacy system may require specific partition layouts
* - **Write protection**: Some formats allow per-partition write protection
*
* **Format Examples:**
* - **LTO (Linear Tape-Open)**: Supports 2-4 partitions depending on generation
* - **DLT (Digital Linear Tape)**: Typically 1 partition, some variants support more
* - **DAT (Digital Audio Tape)**: Usually 1 partition
* - **AIT (Advanced Intelligent Tape)**: Supports multiple partitions
* - **QIC (Quarter-Inch Cartridge)**: Typically 1 partition
*
* @note Partition numbers are format-specific. Some formats use 0-based numbering,
* others use 1-based. The Aaru format preserves the original numbering.
*
* @note Not all tape formats support partitioning. Single-partition tapes are
* common and have one entry with Number=0.
*
* @note Partitions cannot overlap in the physical block space, though the logical
* block numbering within each partition may start from 0.
*
* @note Empty partitions (reserved but unused space) may exist with FirstBlock >
* LastBlock or with no files allocated to them.
*/
typedef struct TapePartitionEntry
{
uint8_t Number; ///< Partition number (unique identifier for this partition). Identifies this partition among all
///< partitions on the tape. Valid range: 0-255, though most tapes use 0-3.
uint64_t FirstBlock; ///< First block in the partition (inclusive). Starting block address for this partition's
///< address space. Often 0, but format-dependent.
uint64_t LastBlock; ///< Last block in the partition (inclusive). Ending block address for this partition's address
///< space. Must be ≥ FirstBlock. The partition contains all blocks from FirstBlock through
///< LastBlock inclusive.
} TapePartitionEntry;
/**
* @struct TapePartitionHeader
* @brief Header for a tape partition metadata block containing partition layout information.
*
* This structure serves as the header for a TapePartitionBlock, which documents
* the physical partitioning of the tape medium. The block consists of this fixed-size
* header followed by a variable-length array of TapePartitionEntry structures (one
* per partition).
*
* **Block Structure:**
* ```
* +---------------------------+
* | TapePartitionHeader | <- Fixed 21-byte header
* +---------------------------+
* | TapePartitionEntry 0 | <- First partition entry
* +---------------------------+
* | TapePartitionEntry 1 | <- Second partition entry
* +---------------------------+
* | ... |
* +---------------------------+
* | TapePartitionEntry (n-1) | <- Last partition entry
* +---------------------------+
* ```
*
* **Purpose:**
* The tape partition block enables:
* - Quick identification of partition boundaries
* - Validation of partition organization
* - Understanding of tape physical layout
* - Support for multi-partition tape formats
* - Preservation of original partitioning scheme
* - Correct interpretation of file locations (files reference partition numbers)
*
* **Identifier Field:**
* The identifier field must be set to BlockType::TapePartitionBlock to indicate
* this block type. This allows the Aaru format parser to recognize and correctly
* interpret the block during image loading.
*
* **Entry Count:**
* The entries field specifies the number of TapePartitionEntry structures that
* follow the header. Valid range is 0-255 (uint8_t), though most tapes have 1-4
* partitions:
* - entries=0: No partitions (unusual, possibly empty tape)
* - entries=1: Single partition (most common)
* - entries>1: Multi-partition tape
*
* **Block Length:**
* The length field contains the size in bytes of the data following this header
* (i.e., the array of TapePartitionEntry structures). This is calculated as:
* length = entries × sizeof(TapePartitionEntry)
*
* The header itself is NOT included in the length value. Total block size is:
* total_size = sizeof(TapePartitionHeader) + length
*
* **CRC64 Checksum:**
* The crc64 field contains a CRC64-ECMA checksum computed over the entry data
* (the array of TapePartitionEntry structures, excluding this header). This
* provides integrity verification to detect corruption in the partition table.
* The CRC is calculated using the ECMA polynomial.
*
* **Partition Order:**
* Entries should be ordered by partition number (ascending). This matches the
* typical physical organization of tape partitions and facilitates efficient
* lookup operations.
*
* **Alignment:**
* When written to the Aaru image, this block is aligned to the image's block
* alignment boundary (typically 2^blockAlignmentShift bytes). Padding may be
* inserted before the block to achieve proper alignment.
*
* **Index Integration:**
* After writing this block to the image file, an IndexEntry is created with:
* - blockType = TapePartitionBlock
* - dataType = 0 (tape partition blocks have no subtype)
* - offset = file position where this header was written
*
* **Relationship to Files:**
* TapePartitionHeader should be written before TapeFileHeader, as files reference
* partitions by number. Readers should parse the partition block first to validate
* partition numbers in file entries.
*
* **Single-Partition Tapes:**
* Even for single-partition tapes, this block should be present with entries=1,
* containing one TapePartitionEntry for partition 0. This provides consistency
* and avoids special-case handling in readers.
*
* @note This block is optional but highly recommended for tape images. Without it,
* partition information must be inferred from file entries or tape format
* specifications.
*
* @note The sum of all partition sizes (in blocks) represents the total tape
* capacity, assuming partitions are contiguous and non-overlapping.
*
* @note Some tape formats allow dynamic repartitioning (changing partition
* boundaries). The Aaru image captures the partition layout at imaging time.
*
* @warning The entries count must accurately reflect the number of TapePartitionEntry
* structures in the block. Mismatches will cause parsing errors.
*
* @warning The CRC64 must be recalculated any time the entry data changes.
* Stale CRC values will cause integrity check failures.
*
* @warning Partition numbers referenced in TapeFileEntry must exist in this
* partition table. Orphaned file entries (referencing non-existent
* partitions) indicate a corrupt or incomplete image.
*
* @see TapePartitionEntry for the structure of individual partition entries
* @see TapeFileHeader for file structure metadata
* @see BlockType for block type identifier constants
*/
typedef struct TapePartitionHeader
{
uint32_t identifier; ///< Block type identifier. Must be set to BlockType::TapePartitionBlock. This magic value
///< allows parsers to identify the block type.
uint8_t entries; ///< Number of partition entries following this header. Specifies how many TapePartitionEntry
///< structures are in this block. Valid range: 0-255. Most tapes have 1-4 partitions.
uint64_t length; ///< Size of entry data in bytes (excluding this header). Calculated as: entries ×
///< sizeof(TapePartitionEntry). This is the number of bytes following the header.
uint64_t crc64; ///< CRC64-ECMA checksum of the entry data. Computed over the array of TapePartitionEntry
///< structures only. Does NOT include this header. Used for integrity verification.
} TapePartitionHeader;
#pragma pack(pop)
#endif // LIBAARUFORMAT_TAPE_H