Added information about commodore formats

2025-12-16 19:24:38 +00:00 · 2015-03-25 02:19:24 +00:00
parent a110d79c87
commit 712ac67072
52 changed files with 24638 additions and 0 deletions
--- a/Commodore/LHA.TXT
+++ b/Commodore/LHA.TXT
@@ -0,0 +1,119 @@
+
+*** LHA, LZH, LZS (LHArc compressed files)
+*** Document revision: 1.3
+*** Last updated: March 11, 2004
+*** Compiler/Editor: Peter Schepers
+*** Contributors/samples: Joe Forster/STA, net documents
+
+  These files are created with LHA on the C64 (or C128),  and  can  present
+special problems to the typical PC user. The compression used  is  LH1,  an
+old method used on LZH 1.xx (pre-version 2), so any version of LHA  on  the
+PC can  uncompress  them.  However,  LHA  allows  filenames  of  up  to  18
+characters long, and DOS doesn't know how to handle them (Windows 95  unLHA
+utilities will extract the full  filename).  Usually,  some  of  the  files
+already  uncompressed  will  be  overwritten  by  other  files  just  being
+uncompressed because the name seems the same to DOS. To  LHA  however,  the
+filenames are quite different.
+
+  LHA archives always have a string two bytes into the file ("-L??-") which
+describe the type of compression used. Over the  development  life  of  LHA
+there have been several different compression algorithms used. The "??"  in
+the "-L??-" can be one of several possibilites, but on the C64 it is likely
+limited to "H0" (no compression) and "H1". Newer versions  of  LHA/LZH  use
+other combinations like "H2", "H3", "H4", "H5", "ZS", "Z5", and  "Z4".  The
+letters typically used in the compression string come from a combination of
+the creators initials of the LZ algorithm, Lempel/Ziv, and  the  author  of
+the LHA program, Haruyasu Yoshizaki.
+
+The following is a sample of an LHA header. Note the string to  search  for
+at byte $0002:
+
+      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F        ASCII
+      -----------------------------------------------   ----------------
+0000: 24 93 2D 6C 68 31 2D 39 02 00 00 16 04 00 00 00   ..-lh1-.........
+0010: 08 4C 14 00 00 0E 73 79 73 2E 48 6F 75 73 65 20   ................
+0020: 4D 34 00 53 DE 06 11 1C 12 C4 C8 FA 3A 5B DC CE   ................
+0030: B2 FA 38 1E 46 B0 B6 9E 9B 75 7A 49 71 72 B3 53   ................
+0040: 6E 4E B4 A0 BF 5E 95 B3 05 8A 75 D5 6C E3 03 4A   ................
+0050: 2C 54 F4 AF 05 18 59 E2 F4 34 4A 0A 28 D4 33 E2   ................
+0060: C4 9D 04 D7 C7 8B 91 66 0E E5 DE 98 3C 92 CC B5   ................
+
+  The header layout is fairly basic. The header for each file starts  *two*
+bytes before the "-lh?-" string. The above example has already been trimmed
+down to start at these two bytes. Each header has the same layout, only the
+length varies due to the length of the filename. Here is a breakdown of the
+above example.
+
+    Bytes: $0000: 24 - Length of header (known as "LEN", not including this
+                       and the next byte). If it is zero, we are at the end
+                       of the file.
+            0001: 93                   - Header checksum
+            0002: 2D 6C 68 31 2D       - LHA compression type "-LH1-"
+            0007: 39 02 00 00          - Compressed file size ($00000239)
+            000B: 16 04 00 00          - Uncompressed file size ($00000416)
+            000F: 00 08 4C 14          - Time/date stamp
+            0013: 00                   - File attribute
+            0014: 00                   - Header level
+                                              00 = non-extended header
+                                          01, 02 = extended header
+            0015: 0E                   - Length of the following filename
+            0016: 73 79 73 2E 48 6F 75 - Filename, with a zero and filetype
+                  73 65 20 4D 34 00 53   appended ("SYS.HOUSE M4<4D>S"). The
+                                         name can be up to 18 characters in
+                                         length. Note the length *includes*
+                                         the zero and filetype, making the
+                                         actual filename length 2 bytes
+                                         shorter.
+            0024: DE 06                - File data checksum (starts at LEN)
+            0026: 11 1C 12 C4 C8 FA... - File data (starts at LEN+2)
+
+  The header checksum at byte $0001 is calculated by adding  the  bytes  in
+the header from $0002 (LHA compression type) to LEN+1 (File data checksum),
+without carry.
+
+  The time/date stamp (bytes $000F-$0012), is broken down as follows:
+
+      Bytes:$000F-0010: Time of last modification:
+                        BITS  0- 4: Seconds divided by 2
+                                    (0-58, only even numbers)
+                        BITS  5-10: Minutes (0-59)
+                        BITS 11-15: Hours (0-23, no AM or PM)
+      Bytes:$0011-0012: Date of last modification:
+                        BITS  0- 4: Day (1-31)
+                        BITS  5- 9: Month (1-12)
+                        BITS 10-15: Year minus 1980
+
+  The format of the compressed data is much too complex to get  into  here.
+Understanding the layout would require  knowledge  of  Huffman  coding  and
+sliding dictionaries, and  is  nowhere  near  as  simple  as  ZipCode!  The
+description given in the LHA source  code  for  the  different  compression
+modes are as follows:
+
+      -lh0- no compression, file stored
+      -lh1- 4k sliding dictionary (max 60 bytes) + dynamic Huffman +  fixed
+            encoding of position
+      -lh2- 8k sliding dictionary (max 256 bytes) + dynamic Huffman
+      -lh3- 8k sliding dictionary (max 256 bytes) + static Huffman
+      -lh4- 4k sliding dictionary  (max  256  bytes)  +  static  Huffman  +
+            improved encoding of position and trees
+      -lh5- 8k sliding dictionary  (max  256  bytes)  +  static  Huffman  +
+            improved encoding of position and trees
+      -lzs- 2k sliding dictionary (max 17 bytes)
+      -lz4- no compression, file stored
+      -lz5- 4k sliding dictionary (max 17 bytes)
+
+  There are several utilities that you can use to decompress  these  files,
+like the already-mentioned LHA on the PC, or Star  LHA,  one  of  the  many
+excellent utilities contained in the Star Commander  distribution  package.
+If you use Star LHA, keep in mind it needs the program LHA v2.14 (or newer)
+to extract. If an older version of LHA is used (such as the common  version
+2.13), then the files being extracted will be corrupt. It will extract  the
+files directly into a D64 image, so the long  C64  filenames  will  not  be
+lost.
+
+  To an emulator user there is no use to these files, as  their  only  real
+usage on a C64 was for storage  and  transmission  benefits.  The  standard
+compression program on the PC is PKZIP (or ZIP compatibles), so unless  you
+have some need to send *compressed* files back the C64, there is no use  in
+using LHA.
+