Mailing-List: contact cygwin-apps-help AT sourceware DOT cygnus DOT com; run by ezmlm
Sender: cygwin-apps-owner AT sourceware DOT cygnus DOT com
Delivered-To: mailing list cygwin-apps AT sources DOT redhat DOT com
Date: Thu, 4 Oct 2001 21:20:30 -0400
From: Christopher Faylor
To: cygwin-patches AT cygwin DOT com
Cc: cygwin-apps AT cygwin DOT com
Subject: Re: File handling in setup.exe
Message-ID: <20011004212030.C1118@redhat.com>
Reply-To: cygwin-apps AT cygwin DOT com
Mail-Followup-To: cygwin-patches AT cygwin DOT com, cygwin-apps AT cygwin DOT com
References: <3BBD05EB DOT 2357D53A AT etr-usa DOT com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3BBD05EB.2357D53A@etr-usa.com>
User-Agent: Mutt/1.3.21i

FWIW, I really like what you've proposed. It feels right.

Feel free to create a branch in the cinstall directory using the tools
in the winsup/maint directory, if you'd like to get working on a real
proof of concept. Although, I guess we should wait for a little more
input first.

Btw, since there was no patch associated with this message, I
redirected this to cygwin-apps, which is slightly more appropriate than
cygwin-patches.

cygwin-developers has been the list where setup.exe was most often
discussed, but maybe we should move setup.exe discussions to cygwin-apps
to spare setup developers from cygwin DLL low-level discussions.

cgf

On Thu, Oct 04, 2001 at 06:59:23PM -0600, Warren Young wrote:
>This is regarding the *.cwp stuff that was discussed last month. It was
>agreed that my initial patch had good ideas, but that as long as I was
>in there, I might as well clean up the code some. I've looked into the
>code, and have realized that I need some input before proceeding.
>
>My initial idea when I agreed to take this on was to just refactor and
>OOP-ify the code around tar.cc some.
>I can do that, but some comments
>from Robert Collins got me on the track of looking into handling
>alternate sources for package files.
>
>This implies some kind of link between archive handling and the current
>NetIO hierarchy. This would also require changes to geturl.cc and the
>code that calls functions in geturl.cc. The foremost issue is, should I
>be chasing this at all, or should I simply refactor the tar handling
>mechanism as it exists right now?
>
>If we want a Grand Refactoring and not just some reworking of tar.cc and
>friends, here's my proposal:
>
>I assume that reading packages from the network would be useful for
>allowing setup.exe to install directly from the network, without writing
>the packages out to disk first as it does today. Yet, we need to keep
>that "caching" mechanism somehow, because it's useful. Currently, file
>handling logic exists in geturl.cc, nio-file.cc, tar.cc, and probably
>other places. To deal with all that, I have in mind something like
>this:
>
>class Source {
>public:
>    Source(out_pathname);
>    virtual int read(buffer, size);
>    virtual int write(buffer, size);
>
>    ...
>private:
>    Source() { }    // can't create Source objects directly
>
>    FILE* fp_out;
>};
>
>class HTTPSource : public Source {
>public:
>    HTTPSource(in_url, out_pathname = 0);
>    ...
>};
>
>
>By default, Source reads data from a file and has the option to cache
>the data it reads out to another file. (If out_pathname == 0, the data
>isn't cached to a file as it's read.) Subclasses override the
>constructor and read() to retrieve data from various network sources.
>(HTTP, FTP, WinInet.dll, etc.) When reading straight from a file, you
>would set the Source to non-cacheable, but when reading via HTTP, you
>could elect to either cache the data to a file, or simply read the data
>in without caching it.
>
>This implies a fairly major refactoring all by itself. As I stated
>above, there's a lot of code that assumes that it can write data out to
>disk and read it back.
>My proposal would mean that everything deals
>with Source objects. Because the data may not be cached, you'd want to
>keep the data pipeline simple: in the HTTP case, you'd read the data
>from the network, pass it to the gz/bz unpacker, and pass that stream to
>the tar file unpacker. That is, go from initial network connection open
>to final unpacking, all in one operation.
>
>This implies two other class hierarchies:
>
>class Decomp {    // a cleaned-up version of class gzbz from tar.cc
>public:
>    // this is decomp_factory(), from my original patch
>    static Decomp* factory(Source*);
>
>    ~Decomp();    // gzbz::close()
>
>    virtual int read(buf, len) = 0;
>    virtual off_t tell() = 0;
>
>protected:
>    FILE* fp;
>
>private:
>    Decomp(Source*);
>};
>
>class GZDecomp : public Decomp ...
>class BZDecomp : public Decomp ...
>
>
>class Archive {
>public:
>    Archive(Decomp*);
>
>    virtual int read(buf, len) = 0;
>    virtual off_t tell() = 0;
>    virtual const char* next_file_name() = 0;
>};
>
>class TarArchive : public Archive ...
>class RPMArchive : public Archive ...
>
>
>These are just "sketches" to give you an idea of where I'm headed with
>all this. Don't worry about critiquing the actual member names or even
>the minor structures I've sketched out. The main thing is the class
>chain structure I've sketched.
>
>As you can see, you create a Source object to retrieve (and optionally
>cache) the data, then you create a Decomp object to read data from the
>Source and decompress it, and finally an Archive object to parse the
>data from the Decomp object, extracting files and other things found in
>tar/rpm/deb/whatever files.
>
>The get_url_*() functions can't exist in this scheme. They only know
>how to read files in from what I'm calling Sources. I haven't traced
>the code out beyond the get_url_* functions to find out how the data
>within the archives is dealt with.
>My idea, however, is to make all
>that code look something like this:
>
>    // Given the URL, the options the user picked, and whether
>    // we have the file locally already or not, create a Source
>    // subclass to read the archive in.
>    Source* source = open_source(url);
>
>    Archive* arch = new Archive(Decomp::factory(source));
>    while (arch) {
>        munch on archive, update UI, spit files out to disk...
>    }
>    delete arch;    // closes cache file (if any) as well
>                    // as network connections, etc.
>
>I'm leaving the issue here until I hear back from the people whose
>opinions matter. :) I don't want to jump in and start all this rework
>if this idea is somehow broken, or simply too grandiose w.r.t. where
>people want to see setup.exe go.
>
>I'm thinking this will take a week of ideal hacking time, which is a lot
>considering that I'm doing all this in my spare time here at work. In
>real terms, this may take a month or more.
>--
>'Net Address: http://www.cyberport.com/~tangent/
>ICBM Address: 36.8274040 N, 108.0204086 W, alt. 1714m

--
cgf AT cygnus DOT com                Red Hat, Inc.
http://sources.redhat.com/           http://www.redhat.com/
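
The Source -> Decomp -> Archive chain described in the quoted proposal
can be sketched in self-contained C++ roughly as follows. This is only
an illustration under stated assumptions, not code from setup.exe or
from the proposal: MemorySource, PassThroughDecomp, LineArchive, and
list_members() are hypothetical names, the "decompressor" passes bytes
through unchanged, and the "archive" format is simply one member name
per line rather than tar.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stage 1: where the raw bytes come from.
class Source {
public:
    virtual ~Source() = default;
    // Copies up to `size` bytes into `buffer`; returns the count, 0 at end.
    virtual std::size_t read(char* buffer, std::size_t size) = 0;
};

// Hypothetical stand-in for a file or HTTP source: serves bytes from memory
// and, if `cache` is non-null, also appends every byte it hands out to it.
class MemorySource : public Source {
public:
    explicit MemorySource(std::string data, std::string* cache = nullptr)
        : data_(std::move(data)), cache_(cache) {}
    std::size_t read(char* buffer, std::size_t size) override {
        assert(pos_ <= data_.size());
        std::size_t n = std::min(size, data_.size() - pos_);
        std::memcpy(buffer, data_.data() + pos_, n);
        if (cache_) cache_->append(buffer, n);
        pos_ += n;
        return n;
    }
private:
    std::string data_;
    std::string* cache_;
    std::size_t pos_ = 0;
};

// Stage 2: decompression layered on top of a Source.
class Decomp {
public:
    virtual ~Decomp() = default;
    virtual std::size_t read(char* buffer, std::size_t size) = 0;
    static std::unique_ptr<Decomp> factory(std::unique_ptr<Source> src);
};

// Toy "decompressor" (an assumption): forwards bytes unchanged.
class PassThroughDecomp : public Decomp {
public:
    explicit PassThroughDecomp(std::unique_ptr<Source> src)
        : src_(std::move(src)) {}
    std::size_t read(char* buffer, std::size_t size) override {
        return src_->read(buffer, size);
    }
private:
    std::unique_ptr<Source> src_;
};

std::unique_ptr<Decomp> Decomp::factory(std::unique_ptr<Source> src) {
    // A fuller factory would inspect the stream to pick a gz or bz subclass.
    return std::unique_ptr<Decomp>(new PassThroughDecomp(std::move(src)));
}

// Stage 3: archive parsing layered on top of a Decomp.
class Archive {
public:
    virtual ~Archive() = default;
    // Stores the next member name in `name`; returns false when exhausted.
    virtual bool next_file_name(std::string& name) = 0;
};

// Toy archive format (an assumption): one member name per line.
class LineArchive : public Archive {
public:
    explicit LineArchive(std::unique_ptr<Decomp> in) : in_(std::move(in)) {}
    bool next_file_name(std::string& name) override {
        name.clear();
        char c;
        while (in_->read(&c, 1) == 1) {
            if (c == '\n') return true;
            name.push_back(c);
        }
        return !name.empty();  // final name without a trailing newline
    }
private:
    std::unique_ptr<Decomp> in_;
};

// Drives the whole chain in one pass, from source open to member listing.
std::vector<std::string> list_members(std::unique_ptr<Source> source) {
    LineArchive arch(Decomp::factory(std::move(source)));
    std::vector<std::string> names;
    std::string name;
    while (arch.next_file_name(name)) names.push_back(name);
    return names;
}
```

Each stage owns the one beneath it, so destroying the Archive tears down
the Decomp and the Source with it, which matches the "delete arch closes
everything" behaviour the proposal asks for. Passing a cache string to
MemorySource plays the role of out_pathname; leaving it null corresponds
to reading without caching.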