Message-Id: Comments: Authenticated sender is From: "Salvador Eduardo Tropea (SET)" Organization: INTI To: Eli Zaretskii , Tim Zastai , djgpp-workers AT delorie DOT com Date: Mon, 14 Dec 1998 13:50:05 +0000 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Subject: Re: Bugs in CVS 1.9 distributed in Simtelnet References: In-reply-to: X-mailer: Pegasus Mail for Windows (v2.54) Reply-To: djgpp-workers AT delorie DOT com Eli Zaretskii wrote: > On Mon, 14 Dec 1998, Salvador Eduardo Tropea (SET) wrote: > > > 1) The import command imports the files using fopen(file,"r"); and for some > > strange reason Tim used _fmode=O_BINARY > > In my experience, it is almost never a good idea to use _fmode=O_BINARY > globally. Most programs need to open some files in text mode and others > in binary mode, so _fmode doesn't save much trouble. Yes I think the same, but looks like Tim found that _fmode=O_BINARY was a quick workaround, and he didn't see the side effects ;-) > > Anyways that's a serious bug because then the text files are saved in the > > repository with \r\n. I think that here CVS should check the mode and open > > the file according it. > > I don't really know much about CVS internals, so what's below are some > general remarks about this text/binary nuisance, mainly in the context of > revision-control software. > > Saving a source file with the DOS-style CRLF EOLs is indeed a Bad Idea > (IMHO). The main problem with that is that the master file is then > non-portable to Unix. So I think source files should be checked in with > the CR characters stripped. Yes, CVS (and RCS) does it. The current copy in Simtel does it for a check-in but fails to do it for an import. > However, CVS can be also used to store non-text files. Clearly, those > should have all their characters preserved. Yes, CVS have a -kb switch (same as RCS) to do it. The sources in Simtel doesn't check this option in the import routine. > Therefore, opening in text mode in all cases is not a good solution > either. CVS should peek at some portion of the file, decide if it is > text or binary (maybe even provide an option for the user to force one of > these), and then strip all CR characters from the CRLF pairs if that's a > text file. That requires a binary open for reading. Not needed, the behavior according the .info is: 1) Normally files are text, the EOL separator is OS dependent, the repository *ever* uses \n as EOL, no mather the OS 2) Using -kb or specifying it the "wrappers" configuration the file is treated as binary and no conversion will be done. So is the user who takes the desition. As the wrappers are configured using regular expressions to say what files use what wrapper that's ok. > Note that it is a mistake to strip ALL the CR characters, even if they are > not followed by an LF: this will have subtle bugs if the source file has > literal CR characters in it. Only a CR before the LF should be stripped > from a text file. This requires to read the file in binary mode, and > then loop through it manually stripping the CRs, since no library > function is smart enough to do this for you. Currently the patch opens the file in *text* mode and lets djgpp's libc do the job of converting \r\n to \n. It should be ok. > A related question is how to write the file when it is checked out. > Clearly, a non-text file should be written verbatim. For a text file, the > answer is less obvious, but I think DOS-style CRLF format is better. For > example, imagine a source file which originally had strings with CRLF > pairs inside it: these would be stripped by check-in, and must be added on > check-out, to prevent program from breaking. Also, many DOS editors still > have trouble with Unix-style EOLs (I was surprised to learn that even > MultiEdit has this bug). The patchs I did does it. That's why I patched check-out to use "wt" to create text files. In djgpp it means \n -> \r\n conversion. > If it is possible, CVS should record the type of the file (text or > binary) in the repository when the file is checked in, and use that info > when it writes the file out later. Yes, CVS does it using the RCS methode, the -kxxx option is stored in some record (expand @ xxxx @; if I remmember well). The b means binary (no $var$ expansion and no EOL conversion). > The above still has some problems, e.g. when a source file has embedded > strings which end in a single literal newline. But these cases are rare. > > > I have a working patch for it that checks the mode and explicitly selects > > "wb" or "wt" (no default assumptions for "w"=xxxx). > > Some old non-ANSI compilers don't support "wt" and "wb". You might be > better off defining a bunch of macros, like FOPEN_WBIN_MODE, so that > users of other compilers could define them as appropriate. Ok, I'll do it with macros. > > I think is much better than RCS > > CVS and RCS are not generally interchangeable, they are designed to work > on different levels and support different development models. RCS is too low level for direct use on projects, CVS solves this problem. And yes they are for different models, but CVS is one step ahead RCS. Now: 1) Is Tim Zastai currently supporting this port? 2) Should I wait the Zastai reply? 3) I think the CVS maintainer doesn't have the djgpp patches (is just an impression, I must dowload the last CVS release). Greetings, SET ------------------------------------ 0 -------------------------------- Visit my home page: http://set-soft.home.ml.org/ or http://www.geocities.com/SiliconValley/Vista/6552/ Salvador Eduardo Tropea (SET). (Electronics Engineer) Alternative e-mail: set-soft AT usa DOT net set AT computer DOT org ICQ: 2951574 Address: Curapaligue 2124, Caseros, 3 de Febrero Buenos Aires, (1678), ARGENTINA TE: +(541) 759 0013