Mail Archives: opendos/2002/08/23/10:26:12

Message-ID: <000201c24a98$482a2080$c03dfea9@atlantis>
From: "Matthias Paul" <Matthias DOT Paul AT post DOT rwth-aachen DOT de>
To: <opendos AT delorie DOT com>
References: <01FD6EC775C6D4119CDF0090273F74A4FD6815 AT emwatent02 DOT meters DOT com DOT au> <005701c24779$78481500$c03dfea9 AT atlantis> <200208191336 DOT g7JDauw17018 AT envy DOT delorie DOT com> <000301c2481d$e761e840$c03dfea9 AT atlantis> <200208201429 DOT g7KETl808796 AT envy DOT delorie DOT com> <005601c24951$d2529000$c03dfea9 AT atlantis> <200208212036 DOT g7LKaQJ01683 AT envy DOT delorie DOT com>
Subject: Re: Remove me
Date: Fri, 23 Aug 2002 12:52:15 +0200
Organization: Aachen University of Technology (RWTH), Germany
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id g7NBTqw17014
Reply-To: opendos AT delorie DOT com

On 2002-08-21, DJ Delorie wrote:

> "automatically" means I add it to my archive system, which includes
> djgpp and cygwin mail archives - hundreds of thousands of emails.
> I would like to think about this some more before committing -

Of course, I can fully understand this. I in no way intended to
pressure you - it was just a suggestion.

> perhaps "last N days" or "this week" might be reasonable compromises?

Hm, this looks like a dynamic solution which would need reprocessing
every day - probably nice to have as well, but not exactly what
/I/ was looking for. I don't know what the others might prefer...

I was thinking more of a static thing for long-term archiving -
so it can still be used ten (or maybe twenty) years from
now (by then it may have some value for historical research):

A number of archives each containing one year's contents (OD1997.ZIP,
OD1998.ZIP, etc.) - preferably as plain ASCII text files in a suitable
(DOS 8.3 SFN) directory structure inside. Maybe - for the current year -
also archives with one month's worth each (OD200207.ZIP etc.).
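
As a rough illustration of this naming scheme, here is a minimal Python
sketch. The helper names (yearly_archive, monthly_archive, is_upper_sfn)
are hypothetical, and the 8.3 check deliberately accepts only a
conservative subset of the characters DOS allows - an assumption made
for simplicity, not a full statement of the DOS rules.

```python
import re

# Conservative 8.3 pattern: 1-8 name characters plus an optional 1-3
# character extension. Only uppercase letters, digits, underscore and
# hyphen are accepted here; DOS permits some further punctuation, so
# this subset is an assumption made for the sketch.
SFN_RE = re.compile(r"[A-Z0-9_-]{1,8}(\.[A-Z0-9_-]{1,3})?")


def is_upper_sfn(name: str) -> bool:
    """Return True if name is an uppercase DOS 8.3 short filename."""
    return SFN_RE.fullmatch(name) is not None


def yearly_archive(year: int) -> str:
    """Archive name for one year's worth of mail, e.g. OD1997.ZIP."""
    return f"OD{year:04d}.ZIP"


def monthly_archive(year: int, month: int) -> str:
    """Archive name for one month's worth of mail, e.g. OD200207.ZIP."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in the range 1..12")
    return f"OD{year:04d}{month:02d}.ZIP"


if __name__ == "__main__":
    for name in (yearly_archive(1997), monthly_archive(2002, 7)):
        print(name, is_upper_sfn(name))
```

Note that the monthly form uses up all eight characters of the base
name, so there is no room left for a longer list prefix.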

> The worst case is when someone uses "download for offline viewing"
> and ends up with a dozen copies of each of the thousands of emails
> because they ended up following links for *all* the download options.

At the moment I receive individual mails and weekly digests, but
I chose the digests only to make archiving easier under Windows (as
it's difficult to make plain ASCII copies of mails under Windows,
and the database itself is in a binary format - who knows whether you
will still be able to find a program that can read OE 5.5 databases in
a few years, I doubt it... ;-)

The long-term solution is to save all important stuff as plain
ASCII files, not in any proprietary format. If such archives
were available, I would no longer need the weekly digests.

So, at least in my case, the availability of archives would actually
help to cut down the traffic (I would first have to download all the
archives with old stuff, but this would be a one-time download).

> Plus, zip is a bad choice for compression.  .tar.bz2 would give
> a much smaller total file size.  But that's harder to use in a
> non-unix environment.

That's true. But I would still opt for .ZIP because this is the most
common archive format under DOS/Windows, and it is also available
on virtually all other platforms. Another good archiver is RAR, but
since this format is not as widely established, I would not bother
storing long term stuff as .RAR. The TAR archivers I have seen
for DOS were cumbersome at best, so this is a good thing for Unix,
but not for DOS. Again, for long term archiving I would always
choose DOS .ZIP as the least common denominator.
To cut down the resulting archive size you could use "solid"
compression by first combining the stuff into a single file
with zero compression, and then compressing this file with maximum
compression. With PKZIP 2.50 for DOS (my default archiver, although
definitely not the most intuitive solution) it would look like:

 PKZIP INARC.ZIP *.* -n -e0 -r -p 
 PKZIP OUTARC.ZIP INARC.ZIP -n -exx

instead of

 PKZIP OUTARC.ZIP *.* -n -exx -r -p

For text files this gives a significantly better total compression
ratio than using -exx right from the start. If the ZIP file viewer
in the Norton Commander is to remain usable (I think it should), it
is important *not* to use any long filenames but plain 8.3 format,
and to store all filenames in *uppercase* letters, as filenames
with lowercase letters won't be accepted.
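
The two-step "solid" approach can also be tried outside of PKZIP. The
following is only a sketch using Python's standard zipfile module, with
Deflate at level 9 standing in for PKZIP's -exx (the two are not
identical, so the numbers will differ). The function names and the
generated sample mails are made up for the illustration.

```python
import io
import zipfile


def zip_bytes(files, compression, level=None):
    """Return a ZIP archive (as bytes) holding a {name: data} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression,
                         compresslevel=level) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def solid_zip_bytes(files, inner_name="INARC.ZIP"):
    """Step 1: store all files uncompressed; step 2: compress that archive."""
    inner = zip_bytes(files, zipfile.ZIP_STORED)
    return zip_bytes({inner_name: inner}, zipfile.ZIP_DEFLATED, level=9)


def sample_mails(count=200):
    """Generate small, similar text files (hypothetical stand-ins for mails)."""
    body = ("Subject: Re: archive formats\r\n"
            "To: opendos list\r\n\r\n"
            "Plain ASCII text keeps mail readable for a long time.\r\n")
    return {
        f"MSG{i:05d}.TXT": f"Message-Number: {i}\r\n{body}".encode("ascii")
        for i in range(count)
    }


if __name__ == "__main__":
    mails = sample_mails()
    per_file = zip_bytes(mails, zipfile.ZIP_DEFLATED, level=9)
    solid = solid_zip_bytes(mails)
    print("per-file compression:", len(per_file), "bytes")
    print("solid compression:   ", len(solid), "bytes")
```

The gain comes from compressing the files as one stream, so that
redundancy between neighbouring files and in the ZIP headers themselves
is exploited. The price is that the outer archive must be unpacked
before individual files can be listed or extracted.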

Greetings,

 Matthias

--
<mailto:Matthias DOT Paul AT post DOT rwth-aachen DOT de>; <mailto:mpaul AT drdos DOT org>
http://www.uni-bonn.de/~uzs180/mpdokeng.html; http://mpaul.drdos.org
"Programs are poems for computers."
---
Help the victims of the disastrous Danube, Moldau, and Elbe floodings
of the century in the Czech Republic, Austria, and Germany: www.ct1.cz;
www.orf.at; www.tagesschau.de; www.drk.de for latest news & donations.

