mirror of
https://github.com/aaru-dps/docs.git
synced 2025-12-16 11:14:37 +00:00
234 lines
6.4 KiB
HTML
234 lines
6.4 KiB
HTML
|
|
|
||
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||
|
|
|
||
|
|
<html>
|
||
|
|
<title>The QCOW Image Format</title>
|
||
|
|
<body bgcolor="#ffffff">
|
||
|
|
<center><h1>The QCOW Image Format</h1></center>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The QCOW image format is one of the disk image formats supported by
|
||
|
|
the QEMU processor emulator. It is a representation of a fixed size
|
||
|
|
block device in a file. Benefits it offers over using raw dump
|
||
|
|
representation include:
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<ol>
|
||
|
|
<li>Smaller file size, even on filesystems which don't support
|
||
|
|
<i>holes</i> (i.e. sparse files)</li>
|
||
|
|
<li>Snapshot support, where the image only represents changes made
|
||
|
|
to an underlying disk image</li>
|
||
|
|
<li>Optional zlib based compression</li>
|
||
|
|
<li>Optional AES encryption</li>
|
||
|
|
</ol>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The qemu-img command is the most common way of manipulating these
|
||
|
|
images e.g.
|
||
|
|
|
||
|
|
<pre>
|
||
|
|
$> qemu-img create -f qcow test.qcow 4G
|
||
|
|
Formating 'test.qcow', fmt=qcow, size=4194304 kB
|
||
|
|
$> qemu-img convert test.qcow -O raw test.img
|
||
|
|
</pre>
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<h2>The Header</h2>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
Each QCOW file begins with a header, in big endian format, as follows:
|
||
|
|
|
||
|
|
<pre>
|
||
|
|
typedef struct QCowHeader {
|
||
|
|
uint32_t magic;
|
||
|
|
uint32_t version;
|
||
|
|
|
||
|
|
uint64_t backing_file_offset;
|
||
|
|
uint32_t backing_file_size;
|
||
|
|
uint32_t mtime;
|
||
|
|
|
||
|
|
uint64_t size; /* in bytes */
|
||
|
|
|
||
|
|
uint8_t cluster_bits;
|
||
|
|
uint8_t l2_bits;
|
||
|
|
uint32_t crypt_method;
|
||
|
|
|
||
|
|
uint64_t l1_table_offset;
|
||
|
|
} QCowHeader;
|
||
|
|
</pre>
|
||
|
|
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<ul>
|
||
|
|
<li>The first 4 bytes contain the characters 'Q', 'F', 'I' followed
|
||
|
|
by <tt>0xfb</tt>.</li>
|
||
|
|
|
||
|
|
<li>The next 4 bytes contain the format version used by the
|
||
|
|
file. Currently, there has only been a single version of the format,
|
||
|
|
version 1.</li>
|
||
|
|
|
||
|
|
<li>The <tt>backing_file_offset</tt> field gives the offset from the
|
||
|
|
beginning of the file to a string containing the path to a file;
|
||
|
|
<tt>backing_file_size</tt> gives the length of this string, which
|
||
|
|
isn't a nul-terminated. If this image is a snapshot image, then this
|
||
|
|
will be the path to the original file. More on snapshots below.</li>
|
||
|
|
|
||
|
|
<li>The mtime field can be ignored.</li>
|
||
|
|
|
||
|
|
<li>The next 8 bytes contain the size, in bytes, of the block device
|
||
|
|
represented by the image.</li>
|
||
|
|
|
||
|
|
<li>The <tt>cluster_bits</tt> and <tt>l2_bits</tt> fields, between
|
||
|
|
them, describe how to map an image offset address to a location
|
||
|
|
within the file. The <tt>cluster_bits</tt> field determines the
|
||
|
|
number of lower bits of the offset address are used as an index
|
||
|
|
within a cluster; the <tt>l2_bits</tt> field gives the number of
|
||
|
|
bits used as an index withing the L2 table. More on the format's
|
||
|
|
2-level lookup system below.</li>
|
||
|
|
|
||
|
|
<li>The <tt>crypt_method</tt> field is 0 if no encryption has been
|
||
|
|
used, and 1 if AES encryption has been used.</li>
|
||
|
|
|
||
|
|
<li>The l1_table_offset gives the offset within the file of the L1
|
||
|
|
table.</li>
|
||
|
|
</ul>
|
||
|
|
|
||
|
|
<h2>2-Level Lookups</h2>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
With QCOW, the contents of the device are stored in
|
||
|
|
<i>clusters</i>. Each cluster contains a number of 512 byte sectors.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>In order to find the cluster for a given address within the device,
|
||
|
|
you must traverse two levels of tables. The L1 table is an array of
|
||
|
|
file offsets to L2 tables, and each L2 table is an array of file
|
||
|
|
offsets to clusters.</p>
|
||
|
|
|
||
|
|
<p>So, an address is split into three separate offsets according to
|
||
|
|
the <tt>cluster_bits</tt> and <tt>l2_bits</tt> fields. For example, if
|
||
|
|
<tt>cluster_bits</tt> is 12 and <tt>l2_bits</tt> is 9, then the
|
||
|
|
address is split up as follows:
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<ul>
|
||
|
|
<li>the lower 12 is an offset within a 4Kb cluster</li>
|
||
|
|
|
||
|
|
<li>the next 9 bits is offset within a 512 entry array of 64 bit
|
||
|
|
file offsets, the L2 table</li>
|
||
|
|
|
||
|
|
<li>the remaining 43 bits is an offset within another array of 64
|
||
|
|
bit file offsets, the L1 table</li>
|
||
|
|
</ul>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
Note, the size of the L1 table is a function of the size of the
|
||
|
|
represented disk image:
|
||
|
|
<pre>
|
||
|
|
l1_size = ceiling (disk_size / (cluster_size * l2_size))
|
||
|
|
</pre>
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>In other words, in order to map a given disk address to an offset
|
||
|
|
within the image:
|
||
|
|
|
||
|
|
<ol>
|
||
|
|
<li>Obtain the L1 table address using the <tt>l1_table_offset</tt>
|
||
|
|
header field</li>
|
||
|
|
|
||
|
|
<li>Use the top (64 - <tt>l2_bits</tt> - <tt>cluster_bits</tt>) bits
|
||
|
|
of the address to index the L1 table as an array of 64 bit
|
||
|
|
entries
|
||
|
|
</li>
|
||
|
|
|
||
|
|
<li>Obtain the L2 table address using the offset in the L1
|
||
|
|
table</li>
|
||
|
|
|
||
|
|
<li>Use the next <tt>l2_bits</tt> of the address to index the L2
|
||
|
|
table as an array of 64 bit entries</li>
|
||
|
|
|
||
|
|
<li>Obtain the cluster address using the offset in the L2 table.
|
||
|
|
|
||
|
|
(Assuming the most significant bit of the cluster address is
|
||
|
|
zero. See the details on compression below.)
|
||
|
|
</li>
|
||
|
|
|
||
|
|
<li>Use the remaining cluster_bits of the address as an offset
|
||
|
|
within the cluster itself</li>
|
||
|
|
</ol>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
If the offset found in either the L1 or L2 table is zero, that area of
|
||
|
|
the disk is not allocated within the image.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<h2>Snapshots</h2>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The QCOW format can represent the notion of a snapshot. That is, a
|
||
|
|
QCOW image can be used to store the changes to another disk image,
|
||
|
|
without actually affecting the contents of the original image.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The representation is very simple. The snapshot image contains the
|
||
|
|
path to the original disk image, and the snapshot image header
|
||
|
|
gives the location of the path string within the file.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
When you want to read an area from the snapshot, you first check to
|
||
|
|
see if that area is allocated within the snapshot image. If not, you
|
||
|
|
read the area from the original disk image.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<h2>Compression</h2>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The QCOW format supports compression by allowing each cluster to be
|
||
|
|
independently compressed with zlib.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
This is represented in the cluster offset obtained from the L2 table
|
||
|
|
as follows:
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<ul>
|
||
|
|
<li>If the most significant bit of the cluster offset is 1, this is
|
||
|
|
a compressed cluster</li>
|
||
|
|
|
||
|
|
<li>The next cluster_bits of the cluster offset is the size of the
|
||
|
|
compressed cluster</li>
|
||
|
|
|
||
|
|
<li>The remaining bits of the cluster offset is the actual address
|
||
|
|
of the compressed cluster within the image</li>
|
||
|
|
</ul>
|
||
|
|
|
||
|
|
<h2>Encryption</h2>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
The QCOW format also supports the encryption of clusters.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
If the crypt_method header field is 1, then a 16 character password
|
||
|
|
is used as the 128 bit AES key.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
Each sector within each cluster is independently encrypted using AES
|
||
|
|
Cipher Block Chaining mode, using the sector's offset (relative to the
|
||
|
|
start of the device) in little-endian format as the first 64 bits of
|
||
|
|
the 128 bit initialisation vector.
|
||
|
|
</p>
|
||
|
|
|
||
|
|
<p>
|
||
|
|
<small><a href="http://blogs.gnome.org/markmc">Mark McLoughlin</a>.
|
||
|
|
June 21, 2006.</small>
|
||
|
|
</p>
|
||
|
|
|
||
|
|
</body>
|
||
|
|
</html>
|
||
|
|
|