delorie.com/archives/browse.cgi   search  
Mail Archives: djgpp/1996/08/31/07:37:34

From: k3040e4 AT c210 DOT edvz DOT uni-linz DOT ac DOT at (Oberhumer Markus)
Message-Id: <199608311034.MAA27344@c210.edvz.uni-linz.ac.at>
Subject: LZO 0.20 -- a real-time data compression library
To: djgpp-announce AT delorie DOT com
Date: Sat, 31 Aug 1996 12:34:05 -0200 (MET DST)
Return-Read-To: markus DOT oberhumer AT jk DOT uni-linz DOT ac DOT at

-----BEGIN PGP SIGNED MESSAGE-----


 ==============================================================================
 LZO -- a real-time data compression library
 ==============================================================================

 Author  : Markus Franz Xaver Johannes Oberhumer
           <markus DOT oberhumer AT jk DOT uni-linz DOT ac DOT at>
           http://www.infosys.tuwien.ac.at/Staff/lux/marco/lzo.html
 Version : 0.20
 Date    : 11-Aug-1996


 Availability
 ------------
 ftp://hpv17.infosys.tuwien.ac.at/pub/mfx/lzo/lzo-0.20.tar.gz
 ftp://tsx-11.mit.edu/pub/linux/sources/libs/lzo-0.20.tar.gz

 Archive size is about 99 kB.


 Introduction
 ------------
 LZO is a data compression library which is suitable for data
 de-/compression in real-time. This means it favours speed
 over compression ratio.
 I named it LZO standing for Lempel-Ziv-Oberhumer.

 LZO is written in ANSI C. Both the source code and the compressed
 data format are designed to be portable across platforms,
 and I intend to make LZO work on each architecture that
 has at least 32 bit integers. Somewhat limited support for 16 bit
 MSDOS is implemented as well (using 'huge' pointers).

 This release includes 7 algorithms (LZO1, LZO1A, LZO1B, LZO1C,
 LZO1D, LZO1F and LZO1X) each of which can be tuned to different
 compression levels.

 I already published LZO1 and LZO1A via the Internet newsgroups
 comp.compression and comp.compression.research in March 1996.
 LZO1 and LZO1A are mainly included for compatibility reasons.
 LZO1D decompression is too slow, and there is no fast compressor anyway.
 So the algorithms of current interest are LZO1B, LZO1C, LZO1F
 and LZO1X. Each of these has a slightly different characteristics,
 and it seems that the new LZO1X algorithm is a good overall
 choice for a wide range of data.

 The LZO1B/LZO1C/LZO1F/LZO1X algorithm has the following features:

 - Decompression is simple and *very* fast.
 - Requires no memory for decompression.
 - Requires 64kB of memory for compression.
 - Allows you to dial up extra compression at a speed cost in the
   compressor. The speed of the decompressor is not affected.
 - Algorithm is thread safe.
 - Algorithm is lossless.
 - LZO1B is deterministic.


 Design criteria
 ---------------
 LZO was designed with speed in mind. Decompressor speed has been
 favoured over compressor speed. Real-time decompression should be
 possible for virtually any application. I expect an implementation
 of the decompressor in assembler code to run at the third of the
 speed of a memcpy().

 In fact, I first wrote the decompressor of each algorithm thereby
 defining the compressed data format, verified it with manually created
 test data and at last added the compressor.


 Performance
 -----------
 To keep you interested, here is a short overview of the average
 results when compressing the (in-)famous Calgary Corpus test suite
 with a blocksize of 256kB, running on my i486 DX2/66.

 The naming convention of the algorithms goes LZOxx-N, where N is the
 compression level. Range 1-9 indicates the fast standard levels using
 64kB memory for compression. Level 99 offers better compression at the
 cost of more memory (256kB), and is still reasonable fast.
 Level 999 achieves nearly optimal compression - but it is slow
 and uses much memory, and is mainly intended for generating
 pre-compressed data.

 +--------------------------------------------------------------------+
 | Algorithm       Length  CxB  ComLen  %Remn  Bits  Com K/s  Dec K/s |
 | ---------       ------  ---  ------  -----  ----  -------  ------- |
 | LZO1-1          224401    1  118375   53.5  4.28  1704.11  3580.56 |
 |                                                                    |
 | LZO1A-1         224401    1  115988   52.0  4.16  1656.07  3670.48 |
 |                                                                    |
 | LZO1B-1         224401    1  110653   49.6  3.97  1575.28  4411.62 |
 | LZO1B-2         224401    1  107416   48.6  3.89  1451.90  4437.53 |
 | LZO1B-3         224401    1  105537   47.9  3.83  1343.32  4422.70 |
 | LZO1B-4         224401    1  104828   47.4  3.79  1037.02  4514.11 |
 | LZO1B-5         224401    1  102724   46.7  3.73   943.35  4521.54 |
 | LZO1B-6         224401    1  101210   46.0  3.68   883.91  4483.46 |
 | LZO1B-7         224401    1  101388   46.0  3.68   811.64  4572.20 |
 | LZO1B-8         224401    1   99453   45.2  3.62   731.56  4514.98 |
 | LZO1B-9         224401    1   99118   45.0  3.60   592.89  4426.57 |
 |                                                                    |
 | LZO1C-1         224401    1  112051   50.2  4.02  1518.80  4392.48 |
 | LZO1C-2         224401    1  108791   49.1  3.93  1416.69  4413.89 |
 | LZO1C-3         224401    1  106825   48.4  3.87  1321.23  4389.14 |
 | LZO1C-4         224401    1  105717   47.7  3.82  1032.84  4494.94 |
 | LZO1C-5         224401    1  103605   46.9  3.76   940.19  4498.48 |
 | LZO1C-6         224401    1  102585   46.5  3.72   863.84  4452.71 |
 | LZO1C-7         224401    1  101937   46.2  3.70   772.97  4548.76 |
 | LZO1C-8         224401    1  100779   45.6  3.65   708.72  4487.48 |
 | LZO1C-9         224401    1  100252   45.4  3.63   598.06  4396.90 |
 |                                                                    |
 | LZO1F-1         224401    1  116765   51.5  4.12  1665.56  4805.74 |
 |                                                                    |
 | LZO1X-1         224401    1  110368   49.4  3.95  1601.11  4919.72 |
 |                                                                    |
 | LZO1-99         224401    1  101560   46.7  3.73   473.35  3643.55 |
 | LZO1A-99        224401    1   99958   45.5  3.64   471.55  3692.92 |
 | LZO1B-99        224401    1   95399   43.6  3.49   454.05  4461.44 |
 | LZO1C-99        224401    1   97252   44.1  3.53   455.97  4447.05 |
 |                                                                    |
 | LZO1C-999       224401    1   87762   40.2  3.21    87.57  4714.26 |
 | LZO1D-999       224401    1   87901   40.0  3.20    84.90  3208.91 |
 | LZO1F-999       224401    1   89620   40.4  3.23    78.35  5466.84 |
 | LZO1X-999       224401    1   84521   38.5  3.08    53.40  5203.72 |
 |                                                                    |
 | LZRW1-A         224401    1  123194   55.1  4.41  1564.23  3396.10 |
 | LZRW2           224401    1  115399   51.5  4.12  1301.08  2379.13 |
 | LZRW3           224401    1  110942   50.0  4.00  1518.32  1893.92 |
 | LZRW3-A(2)      224401    1  106308   48.1  3.85  1089.16  1989.55 |
 | LZRW3-A(4)      224401    1  103126   46.8  3.74   889.30  2060.22 |
 | LZRW3-A(8)      224401    1  100722   45.8  3.67   607.05  2064.86 |
 | LZV             224401    1  112511   51.4  4.11  1115.18  4247.76 |
 | zlib-8/1        224401    1   85302   38.8  3.10   360.44  1084.91 |
 | memcpy()        224401    1  224401  100.0  8.00 19992.22 20023.29 |
 +--------------------------------------------------------------------+

 More detailed results can be found in the ./doc directory.


 Short documentation
 -------------------
 LZO is a block compression algorithm - it compresses and decompresses
 a block of data. Block size must be the same for compression
 and decompression.

 LZO compresses a block of data into matches (a LZ77 sliding dictionary)
 and runs of non-matching literals. LZO takes care about long matches
 and long literal runs so that it produces good results on highly
 redundant data and deals accecptable with non-compressible data.

 When dealing with uncompressible data, LZO expands the input
 block by a maximum of 16 bytes per 1024 bytes input (probably
 only by a maximum of 8 bytes, I still have to compute the exact
 value).

 Most algorithms come with two decompressors - a fast and a safe one.
 The safe decompressor is designed such a way that it won't crash your
 application when it is fed with incorrect or damaged data but will
 catch all possible overruns and return an error code in this case.
 When using the safe decompressor you must pass the number of
 bytes available in 'dst' via the parameter 'dst_len'.

 I have verified LZO using gcc 2.7.2 with Richard W.M. Jones's
 bounds checking patches and compressed a lot of files when tuning
 some parameters. LZO is free of any known bugs.


 The algorithms
 --------------
 There are (too) many algorithms implemented. But I want to support
 unlimited backwards compatibility, so I will *not* reduce the LZO
 distribution in the future (what I had planned some time ago).

 As the many object files are mostly independent of each other, the size
 overhead for an executable statically linked with the LZO library
 is usually pretty low cause the linker will only add the modules that
 you are actually using.

 My experiments have shown that LZO1B is good with a large blocksize
 or with very redundant data, LZO1F is good with a small blocksize
 or with binary data and that LZO1X is often the best choice of all.
 LZO1F seems to compress better than LZO1C on these files where LZO1C
 is better than LZO1B. Beware, your mileage may vary.


 Portability
 -----------
 I have built and tested LZO successfully on the following platforms:

   i486 - Linux 1.2.6 - gcc 2.6.3
   i486 - MSDOS - djgpp v2 + gcc 2.7.2
   i486 - MSDOS - emx + gcc 2.7.2
   i486 - MSDOS - Watcom C 10.5 (32 bit)
   i486 - MSDOS - Borland C 4.0 (16 bit)

 LZO should work without problems on all machines that have an ANSI C
 compiler and at least 32 bit integers. The supplied makefile requires
 GNU make.

 I need *your* feedback to support more platforms and fix possible
 portability problems. Please take a look at the file PLATFORM.TXT.


 The future
 ----------
 Right now my time is very limited as I'm (still :-) working on finishing
 my diploma thesis (which has nothing to do with compression at all).

 I consider LZO to be pretty finished right now, and I'd also like
 to freeze most of the source code except for fixes - but no promises...

 The following things still need to be done before this could
 be called version 1:
 - test on other 32/64 bit architechtures. I do not expect major problems.
 - thoroughly test on 16 bit systems (already works with Borland C 4.0)
 - implement the missing LZO1F compression levels
 - implement the missing LZO1X compression levels
 - write a more detailed documentation

 I'm also thinking about these features (contributions are welcome):
 - implement an ultra fast LZO1X-0 (at the cost of compression ratio)
 - speed up the compressors using unaligned memory access where possible
 - a stream interface (compatible to zlib ?) would be nice (and slow ?)
 - can someone help me setting up automake/autoconf ?
 - interfaces to Perl and Java


 Some comments about the source code
 -----------------------------------
 If you want to understand how LZO works, carefully study the sources
 in the ./extra directory first.

 Be warned: the main source code in the ./src directory is a
 real pain to understand as I've experimented with hundreds of slightly
 different versions. It contains many #if and some gotos, and
 is completely optimized for speed and not for readability.
 Code sharing of the different algorithms is implemented by stressing
 the preprocessor - this can be really confusing. Lot's of marcos and
 assertions don't make things better.

 Nevertheless the sources compile very quiet on a variety of
 compilers with the highest warning levels turned on, even
 in C++ mode.


 Thanks
 ------
 Laszlo Molnar <lmolnar AT goliat DOT eik DOT bme DOT hu>
 Charles W. Sandmann <sandmann AT clio DOT rice DOT edu>
 Wolfgang Lugmayr <W DOT Lugmayr AT infosys DOT tuwien DOT ac DOT at>
 Natascha


 Copyright
 ---------
 LZO is Copyright (C) 1996 Markus Franz Xaver Johannes Oberhumer

 LZO is distributed under the terms of the GNU Library General
 Public License (LGPL). See the file COPYING.LIB.

 Special licenses for commercial and other applications which
 are not willing to accept the GNU Library General Public License
 are available by contacting the author.




-----BEGIN PGP SIGNATURE-----
Version: 2.6.2i

iQCVAwUBMh3RaW10fyLu8beJAQHZCwP/bE/eEJvRIsI2B3CtQ1g6fzmmt8ZHikXo
wRTdk5gSccTuDIFBYqOqKpMurmEjOKQFlr34e787MbpXZHAWl2yAbh1pPmg9iFZy
7d4uow1a3J/72QIt/ucWavT0mxXzsBl19ylc71E0Auc76xvxg2HgsD5Iax4L70au
PG/mQiOsVxo=
=LXqt
-----END PGP SIGNATURE-----

- Raw text -


  webmaster     delorie software   privacy  
  Copyright © 2019   by DJ Delorie     Updated Jul 2019