Message-ID: <000201c24a98$482a2080$c03dfea9@atlantis>
From: "Matthias Paul"
To:
References: <01FD6EC775C6D4119CDF0090273F74A4FD6815 AT emwatent02 DOT meters DOT com DOT au>
 <005701c24779$78481500$c03dfea9 AT atlantis>
 <200208191336 DOT g7JDauw17018 AT envy DOT delorie DOT com>
 <000301c2481d$e761e840$c03dfea9 AT atlantis>
 <200208201429 DOT g7KETl808796 AT envy DOT delorie DOT com>
 <005601c24951$d2529000$c03dfea9 AT atlantis>
 <200208212036 DOT g7LKaQJ01683 AT envy DOT delorie DOT com>
Subject: Re: Remove me
Date: Fri, 23 Aug 2002 12:52:15 +0200
Organization: Aachen University of Technology (RWTH), Germany
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id g7NBTqw17014
Reply-To: opendos AT delorie DOT com

On 2002-08-21, DJ Delorie wrote:

> "automatically" means I add it to my archive system, which includes
> djgpp and cygwin mail archives - hundreds of thousands of emails.
> I would like to think about this some more before committing -

Of course, I fully understand this. I in no way intended to press
you - it was just a suggestion.

> perhaps "last N days" or "this week" might be reasonable compromizes?

Hm, this looks like a dynamic solution which would need reprocessing
every day - probably nice to have as well, but not exactly what /I/
was looking for. I don't know what the others might prefer...

I was thinking more of something static for long-term archiving, so
it can still be used ten (or maybe twenty) years from now (by then it
may have some value for historical research): a number of archives,
each containing one year's contents (OD1997.ZIP, OD1998.ZIP, etc.),
preferably as plain ASCII text files in a suitable directory
structure (DOS 8.3 SFN) inside.
Maybe - for the current year - also archives with one month's worth
each (OD200207.ZIP etc.).

> The worst case is when someone uses "download for offline viewing"
> and ends up with a dozen copies of each of the thousands of emails
> because they ended up following links for *all* the download options.

At the moment I receive both individual mails and weekly digests, but
I chose the digests only to make archiving easier under Windows (it is
difficult to make plain ASCII copies of mails under Windows, and the
mail database itself is in a binary format - who knows if you can
still find a program able to read OE 5.5 databases in a few years,
I doubt it... ;-) The long-term solution is to save all important
stuff as plain ASCII files, not in any proprietary format.

If such archives were available, I would no longer need the weekly
digests. So, at least in my case, the availability of archives would
actually help to cut down the traffic (after I had downloaded all the
archives with the old stuff, but that would be a one-time transfer).

> Plus, zip is a bad choice for compression. .tar.bz2 would give
> a much smaller total file size. But that's harder to use in a
> non-unix environment.

That's true. But I would still opt for .ZIP because it is the most
common archive format under DOS/Windows, and it is also available on
virtually all other platforms. Another good archiver is RAR, but since
that format is not as widely established, I would not bother storing
long-term stuff as .RAR. The TAR archivers I have seen for DOS were
cumbersome at best, so TAR is a good thing for Unix, but not for DOS.
Again, for long-term archiving I would always choose DOS .ZIP as the
least common denominator.

To cut down the resulting archive size you could use "solid"
compression: first combine the files into a single archive with zero
compression, and then compress that one file with maximum compression.
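
A minimal sketch of this two-pass idea, assuming Python's standard
zipfile module rather than any DOS archiver. The function names
(solid_zip, plain_zip, unpack_solid), the inner member name INARC.ZIP
and the MSGnnnnn.TXT sample names are illustrative assumptions, and
the result is not claimed to match what PKZIP produces byte for byte:

```python
import io
import zipfile

# Hypothetical name for the stored (uncompressed) inner archive.
INNER_NAME = "INARC.ZIP"


def solid_zip(files):
    """Two-pass 'solid' packing. files maps member name -> bytes."""
    # Pass 1: collect all files in an inner ZIP with no compression.
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w", compression=zipfile.ZIP_STORED) as inner:
        for name in sorted(files):
            inner.writestr(name, files[name])
    # Pass 2: compress the inner ZIP as one single stream.
    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w", compression=zipfile.ZIP_DEFLATED,
                         compresslevel=9) as outer:
        outer.writestr(INNER_NAME, inner_buf.getvalue())
    return outer_buf.getvalue()


def plain_zip(files):
    """One-pass packing: every file is compressed on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED,
                         compresslevel=9) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buf.getvalue()


def unpack_solid(archive_bytes):
    """Reverse solid_zip: return the original name -> bytes mapping."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as outer:
        inner_bytes = outer.read(INNER_NAME)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return {name: inner.read(name) for name in inner.namelist()}


if __name__ == "__main__":
    # Made-up sample data: many small, similar text files in 8.3 names.
    sample = {
        "MSG%05d.TXT" % i: ("Subject: message %d\r\n\r\nBody of message %d.\r\n"
                            % (i, i)).encode("ascii")
        for i in range(200)
    }
    print("one pass:", len(plain_zip(sample)), "bytes")
    print("two pass:", len(solid_zip(sample)), "bytes")
```

The gain comes from the second pass seeing all files as one stream, so
text repeated across messages (headers, signatures) is compressed
together instead of per file. The price is that the outer archive has
to be unpacked before individual files can be listed or extracted.
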
With PKZIP 2.50 for DOS (my default archiver, although definitely not
the most intuitive solution) it would look like:

  PKZIP INARC.ZIP *.* -n -e0 -r -p
  PKZIP OUTARC.ZIP INARC.ZIP -n -exx

instead of

  PKZIP OUTARC.ZIP *.* -n -exx -r -p

For text files this gives a significantly better total compression
ratio than using -exx right from the start.

If the ZIP file viewer in Norton Commander is to remain usable (and I
think it should), it is important *not* to use any long filenames but
plain 8.3 format, and to store all filenames in *uppercase* letters,
as filenames with lowercase letters won't be accepted.

Greetings,

 Matthias

-- 
; http://www.uni-bonn.de/~uzs180/mpdokeng.html; http://mpaul.drdos.org
"Programs are poems for computers."

---
Help the victims of the disastrous Danube, Moldau, and Elbe floodings
of the century in the Czech Republic, Austria, and Germany:
www.ct1.cz; www.orf.at; www.tagesschau.de; www.drk.de for latest news
& donations.