Mail Archives: djgpp/1999/03/19/06:25:42

Date: Fri, 19 Mar 1999 12:23:47 +0100
From: Hans-Bernhard Broeker <broeker AT physik DOT rwth-aachen DOT de>
Message-Id: <199903191123.MAA08469@acp3bf.physik.rwth-aachen.de>
To: djgpp AT delorie DOT com
Subject: Re: (fwd) Compression
Newsgroups: comp.os.msdos.djgpp
Organization: RWTH Aachen, III. physikalisches Institut B
X-Newsreader: TIN [version 1.2 PL2]
Reply-To: djgpp AT delorie DOT com

In article <36F17BC3 DOT 78B926E4 AT cableol DOT co DOT uk> you wrote:
> I'm sure I heard somewhere that tgz's are based around the same 
> algorithm as zips, so why the mega space saving?  (Perhaps because
> they use a different algorithm?)

No, the packing algorithm itself is 100% identical. The difference
between .zip and .tgz is in what gets packed: zip compresses each file
*in* the archive separately, while tgz compresses the whole archive as
one stream.

[...]
> DJ Delorie wrote:
[...]
> > file    zip     tgz
> > djdev   1.42M   1.36M
> > djlsr   1.45M   0.87M

Just to complement what DJ already answered: note the difference
between the two examples given. djdev gains much less from the tgz
format than djlsr does. In essence, that's because djlsr contains a
really enormous number of quite similar, and very *small*, files.
That's exactly the situation where zip's approach of packing each file
individually is rather inefficient. Roughly speaking, packing works by
finding and exploiting repetitions in the input, but inside a single,
small file there's not much repetition to be found, and thus little to
be gained from eliminating it.
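
To make the effect concrete, here is a minimal sketch in Python, using
the standard zlib module (the same deflate method that zip and gzip
use). The generated "files" are hypothetical stand-ins for a set of
small, similar sources, not the real contents of djlsr, so the exact
numbers mean nothing; only the direction of the difference matters.

```python
import zlib

# Hypothetical stand-ins for many small, very similar source files.
files = [
    (
        f"/* file{i}.c */\n"
        "#include <stdio.h>\n"
        f"int func{i}(void)\n"
        "{\n"
        f"    return {i};\n"
        "}\n"
    ).encode()
    for i in range(200)
]

# zip-style: every file is deflated on its own, so repetition
# *between* files cannot be exploited.
per_file_total = sum(len(zlib.compress(f, 9)) for f in files)

# tgz-style: the files are joined into one stream that is deflated once.
whole_stream = len(zlib.compress(b"".join(files), 9))

print("raw bytes:       ", sum(len(f) for f in files))
print("packed per file: ", per_file_total)
print("packed as one:   ", whole_stream)
```

The sketch ignores archive headers altogether; it only compares the
compressed payload in the two arrangements.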

Some people have reported that you can do even a bit better than
.tar.gz by using .zip.gz or .zip.zip, i.e.:

	zip -0 temparchive contained_files...
	zip -9 archive temparchive

(or equivalently, replace the 'zip -9' by 'gzip -9'). The trick is
that zip -0 makes a slightly smaller, and more easily packable package
file than tar does.
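
The size of the uncompressed container can be checked with a short
Python sketch built on the standard tarfile, zipfile and gzip modules.
The member names and contents are made up for illustration, and
ZIP_STORED is used here as the counterpart of 'zip -0'. Whether the
compressed .zip.gz actually ends up smaller than the .tar.gz depends on
the data, so the script just prints both results rather than promising
an outcome.

```python
import gzip
import io
import tarfile
import zipfile

# Hypothetical small members; real archives will behave differently.
files = {
    f"src/file{i}.c": f"int func{i}(void) {{ return {i}; }}\n".encode()
    for i in range(200)
}

def make_tar(members):
    """Build an uncompressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def make_stored_zip(members):
    """Build a zip archive with no compression, like 'zip -0'."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()

tar_bytes = make_tar(files)
zip_bytes = make_stored_zip(files)

# Second stage: compress each container as a whole.
print("tar:   ", len(tar_bytes), "-> gz:", len(gzip.compress(tar_bytes, 9)))
print("zip -0:", len(zip_bytes), "-> gz:", len(gzip.compress(zip_bytes, 9)))
```

With members this small, tar's 512-byte header and block padding per
file dominate the uncompressed size, which is why the stored zip
starts out smaller.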

--
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.
