mirror of
https://github.com/aaru-dps/docs.git
synced 2025-12-16 19:24:38 +00:00
Added information about commodore formats
102
Commodore/CPK.TXT
Normal file
@@ -0,0 +1,102 @@
*** CPK
*** Document revision: 1.3
*** Last updated: March 11, 2004
*** Compiler/Editor: Peter Schepers
*** Contributors/sources: Andre Fachat

This format, created by Andre Fachat, was not designed specifically for
emulators, but was made primarily for Andre's own purposes.

It is a very basic format using simple RLE compression, with each file
following in sequential order (as Andre put it, "its similar to a UNIX TAR
file"). There is no central directory, files are not aligned to any fixed
boundary, and because the data is compressed, the stored size of every file
will be different.

      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F        ASCII
      -----------------------------------------------   ----------------
0000: 01 40 41 2E 41 4E 4C 2C 50 00 01 08 24 08 64 00   .@A.ANL,P...$.d.
0010: 99 22 93 20 20 20 41 4E 4C 45 49 54 55 4E 47 20   .".   ANLEITUNG
0020: 5A 55 4D 20 40 41 53 53 45 4D 42 4C 45 52 00 4E   ZUM @ASSEMBLER.N
0030: 08 6E 00 99 22 11 40 41 53 53 20 49 53 54 20 45   .n..".@ASS IST E
0040: 49 4E 20 32 2D 50 41 53 53 2D 41 53 53 45 4D 42   IN 2-PASS-ASSEMB
0050: 4C 45 52 2E 20 44 45 52 00 78 08 78 00 99 22 11   LER. DER.x.x..".

The first byte of the file is the version byte. Presently, only $01 is
supported.

0000: 01 .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..   ................

The filename follows, stored in standard PETASCII, and no padding
characters ($A0) are included.

0000: .. 40 41 2E 41 4E 4C .. .. .. .. .. .. .. .. ..   .@A.ANL.........

The filetype is attached to the end of the filename in the form of ',x',
where x is the filetype used (P, S or U), and it is in PETASCII upper case.
The filename ends with a $00 (null terminated). REL files are *not*
supported as there is no provision made for the RECORD size byte.

Note that not *all* CPK files will have the ",x" extension added on. If
it is absent, assume that the file is a "PRG" type.

0000: .. .. .. .. .. .. .. 2C 50 00 .. .. .. .. .. ..   .......,P.......

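The header layout described above can be sketched in Python as follows. This
is only an illustration, not code from any existing tool: the function names
read_name and read_header are made up for this example, the name is kept as
raw PETASCII bytes rather than converted, and the expansion of S and U to
"SEQ" and "USR" is an assumption (the text above only spells out "PRG").

```python
# Hypothetical helpers; the names are not part of the CPK format itself.
FILETYPES = {ord("P"): "PRG", ord("S"): "SEQ", ord("U"): "USR"}  # SEQ/USR assumed


def read_name(buf, pos):
    """Read one null-terminated name at pos.

    Returns (raw_name, filetype, next_pos), where next_pos is the offset of
    the first byte after the terminating $00.
    """
    end = buf.index(0, pos)  # raises ValueError if no $00 is found
    raw = buf[pos:end]
    filetype = "PRG"  # default when no ",x" suffix is present
    if len(raw) >= 2 and raw[-2] == ord(",") and raw[-1] in FILETYPES:
        filetype = FILETYPES[raw[-1]]
        raw = raw[:-2]
    return raw, filetype, end + 1


def read_header(buf):
    """Check the version byte, then read the first filename."""
    if not buf or buf[0] != 0x01:
        raise ValueError("unsupported CPK version byte")
    return read_name(buf, 1)


# First 16 bytes of the sample dump shown earlier.
sample = bytes.fromhex("01 40 41 2E 41 4E 4C 2C 50 00 01 08 24 08 64 00")
name, filetype, data_start = read_header(sample)
print(name, filetype, data_start)  # b'@A.ANL' PRG 10
```

The returned offset 10 ($0A) is where the program data begins in the sample.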
Following the filename, we get program data.

      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F        ASCII
      -----------------------------------------------   ----------------
0000: .. .. .. .. .. .. .. .. .. .. 01 08 24 08 64 00   ............$.d.
0010: 99 22 93 20 20 20 41 4E 4C 45 49 54 55 4E 47 20   .".   ANLEITUNG
 ...
0270: 00 83 0A E6 00 99 22 11 20 20 31 32 33 F7 08 20   ......".  123..
0280: 2D 44 45 5A 49 4D 41 4C 00 A4 0A F0 00 99 22 11   -DEZIMAL......".
0290: 20 20 24 33 34 35 F7 07 20 2D 48 45 58 41 44 45     $345.. -HEXADE

The data requires some explanation as it uses RLE (Run Length Encoding)
compression. When creating CPK files, the data to be compressed is scanned
for runs of repeating bytes, and when a run of 3 or more (up to 255) is
found, the following sequence of bytes is output:

  $F7 $xx $yy - where $F7 is the code used for "encoded sequence follows",
                $xx is the number of times to repeat the byte and $yy is
                the byte to repeat. Using the sample below, we see the $F7
                code, then "repeat the byte $20 seven times".

0290: .. .. .. .. .. .. F7 07 20 .. .. .. .. .. .. ..   ........ .......

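Decoding this scheme is straightforward. The following Python sketch is a
minimal illustration under the rules given in this document; the function
name unpack is hypothetical, and treating $F7 $00 as the end of the stored
file follows the description of the end marker in this document.

```python
MARKER = 0xF7  # "encoded sequence follows"


def unpack(buf, pos=0):
    """Decode one stored file starting at pos.

    Returns (data, next_pos), where next_pos points just past the
    $F7 $00 end marker.
    """
    out = bytearray()
    while True:
        if pos >= len(buf):
            raise ValueError("data ended without an $F7 $00 marker")
        byte = buf[pos]
        if byte != MARKER:
            out.append(byte)  # literal byte, copied as-is
            pos += 1
            continue
        if pos + 1 >= len(buf):
            raise ValueError("truncated $F7 sequence")
        count = buf[pos + 1]
        if count == 0:  # $F7 $00: end of this stored file
            return bytes(out), pos + 2
        if pos + 2 >= len(buf):
            raise ValueError("truncated $F7 sequence")
        out += bytes([buf[pos + 2]]) * count  # expand the run
        pos += 3


# Part of the row at 0290 from the sample, with an end marker appended
# here only so that the example terminates.
packed = bytes.fromhex("20 20 24 33 34 35 F7 07 20 2D 48 45 58 F7 00")
data, next_pos = unpack(packed)
print(len(data))  # 17 bytes: 6 literal + 7 expanded + 4 literal
```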
Using $F7 as the encoder byte presents one problem: what does the packer do
when it encounters an $F7 in the file being encoded? Simple: it gets encoded
as $F7 $xx $F7, meaning repeat $F7 as many times as needed (if there is only
one $F7, then the value for $xx is $01). The code $F7 was chosen because it
is not a 6502 opcode, a BASIC token, or any commonly used byte, but *not*
because it has the least statistical probability of occurring.

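Putting the run rule and the $F7 escape rule together, a packer could look
roughly like the Python sketch below. It is an assumption-laden illustration,
not the original packer: the name pack is hypothetical, runs of exactly two
bytes are written as literals, runs longer than 255 are split, and the
$F7 $00 end marker is appended by the same function.

```python
MARKER = 0xF7


def pack(data):
    """RLE-pack data and append the $F7 $00 end marker."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        # Measure the run, capped at 255 so the count fits in one byte.
        while run < 255 and i + run < len(data) and data[i + run] == byte:
            run += 1
        if run >= 3 or byte == MARKER:
            # Runs of 3 or more, and every $F7 (even a single one).
            out += bytes([MARKER, run, byte])
        else:
            out += bytes([byte]) * run  # short run: keep as literals
        i += run
    out += bytes([MARKER, 0x00])
    return bytes(out)


print(pack(b"AB" + b" " * 7).hex(" "))  # 41 42 f7 07 20 f7 00
print(pack(b"\xF7").hex(" "))           # f7 01 f7 f7 00
```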
The stored program ends when the sequence $F7 $00 is encountered, since
this sequence cannot occur in the file naturally. If you need to search
through a CPK file for the filenames, do a hex search for all $F7 $00
sequences, since they precede all filenames except the first.

The end of a CPK file can be found in two different ways:

  1. When an EOF (end of file) occurs after an $F7 $00 byte sequence.
     This is the normal method.
  2. When a filename of $00 occurs, meaning there is no filename, just a
     null termination. This is not much used anymore.

Using method #1 for ending the file is more common because it makes adding
files to the CPK file very easy. All you have to do is append the new
filename/data to the container. Using method #2 means you have to check
whether the last three bytes are $F7 $00 $00, and if so start writing the
new file into the container immediately after the first $00.

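The append procedure for both ending methods can be sketched in Python as
below. The function name append_entry is hypothetical, and the new entry is
assumed to be already built (filename, $00, packed data, $F7 $00). Note that
the result always ends the container by method #1; whether a method #2
container should get its trailing $00 back is not stated here, so this sketch
leaves it off.

```python
def append_entry(container, entry):
    """Return container with one pre-built entry appended."""
    if container.endswith(b"\xF7\x00\x00"):
        # Method #2 ending: overwrite the empty filename ($00) that
        # follows the last $F7 $00 end marker.
        container = container[:-1]
    return container + entry


# A tiny made-up container holding one file named "OLD" with data "A".
method1 = b"\x01OLD\x00A\xF7\x00"
method2 = method1 + b"\x00"
new_entry = b"NEW,P\x00HI\xF7\x00"

# Both endings give the same result after appending.
print(append_entry(method1, new_entry) == append_entry(method2, new_entry))
```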
In order to extract *one* specific file, you would need to read the whole
file until you find the filename you want, then output that file only. As
this format has no central directory and no file location references, there
is no other way to do it.

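A sequential scan of this kind could be sketched in Python as follows. The
names unpack, entries and extract are hypothetical, names are compared as raw
bytes including any ",x" suffix, and both ways of ending a container are
handled. If the packing rules are as described in this document, decoding
entry by entry also avoids mistaking a sequence such as $F7 $01 $F7 $00 (a
single $F7 followed by a literal $00) for an end marker, which a raw byte
search for $F7 $00 would do.

```python
MARKER = 0xF7


def unpack(buf, pos):
    """Decode one stored file at pos; return (data, next_pos)."""
    out = bytearray()
    while True:
        if pos + 1 >= len(buf):
            raise ValueError("data ended without an $F7 $00 marker")
        if buf[pos] != MARKER:
            out.append(buf[pos])
            pos += 1
        elif buf[pos + 1] == 0:
            return bytes(out), pos + 2
        elif pos + 2 >= len(buf):
            raise ValueError("truncated $F7 sequence")
        else:
            out += bytes([buf[pos + 2]]) * buf[pos + 1]
            pos += 3


def entries(buf):
    """Yield (raw_name, data) for every file in the container."""
    if not buf or buf[0] != 0x01:
        raise ValueError("unsupported CPK version byte")
    pos = 1
    while pos < len(buf):            # method #1: stop at EOF
        end = buf.index(0, pos)
        name = buf[pos:end]
        if not name:                 # method #2: empty filename
            return
        data, pos = unpack(buf, end + 1)
        yield name, data


def extract(buf, wanted):
    """Scan from the start and return the data of the first match."""
    for name, data in entries(buf):
        if name == wanted:
            return data
    raise KeyError(wanted)


# Made-up container with two files, ended by method #2.
cpk = b"\x01A,P\x00X\xF7\x03Y\xF7\x00B\x00\xF7\x01\xF7\x00\xF7\x00\x00"
print([name for name, _ in entries(cpk)])  # [b'A,P', b'B']
```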
This format has not been used for some time, because D64 and T64 were also
being developed and accepted into common use when it came out. It is
unlikely you will find *any* files in this format. 64COPY V3.2 (and up)
does support extraction of these files just in case any are encountered.