Message-Id: <200506241847.j5OIlIAB032622@delorie.com>
From: "Gary R. Van Sickle"
To: "'Cygwin List'"
Subject: RE: stat file -- cygwin vs. Windows size?
Date: Fri, 24 Jun 2005 13:46:56 -0500
In-Reply-To: <20206A8F2DE2@mail.learnquick.com>

> -----Original Message-----
> From: cygwin-owner AT cygwin DOT com
> [mailto:cygwin-owner AT cygwin DOT com] On Behalf Of Herb Martin
> Sent: Friday, June 24, 2005 1:08 PM
> To: 'Cygwin List'
> Subject: RE: stat file -- cygwin vs. Windows size?
>
> > >My suspicion is that stat is counting cr-lf as two characters but
> > >the input routines are treating these as one.
> > >
> > >If the file has about 20 lines, then that's 20 missing characters???
> >
> > Yes, this is right. And yes, this could be the cause of the
> > situation you're noticing.
>
> Is there a standard Cygwin 'idiom' or function for dealing
> with this mismatch, or should I just re-invent the wheel.
>

As to the former, no, not Cygwin specifically. The problem appears to be
that SpamAssassin is making the incorrect but all-too-common assumption
that "text file" == "file of 8-bit ASCII characters with '\n' EOL
characters". This is as incorrect as thinking "picture file" == "JPEG
file". Cygwin does have a number of features to "band-aid" much of this
broken Unix code, primarily the "text mode mount" feature, but these are
just that: a band-aid, not a fix for the root problem (and in your case,
as in a similar case in mutt, it can't solve the problem).
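
For illustration, the mismatch can be reproduced outside SpamAssassin.
This is only a sketch, in Python rather than SpamAssassin's Perl.
Python's universal-newline text mode stands in here for input routines
that collapse CR-LF into one character, and the helper name
size_vs_text_length is made up for this example.

```python
import os
import tempfile


def size_vs_text_length(raw):
    """Write raw bytes to a temp file; return (stat size, chars read in text mode)."""
    fd, path = tempfile.mkstemp()
    try:
        # Binary mode: the bytes land on disk exactly as given.
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        stat_size = os.stat(path).st_size
        # newline=None enables universal newlines, so each "\r\n" is read as "\n".
        with open(path, "r", encoding="ascii", newline=None) as f:
            text_len = len(f.read())
        return stat_size, text_len
    finally:
        os.remove(path)


# 20 CR-LF terminated lines, as in the question quoted above.
stat_size, text_len = size_vs_text_length(b"line\r\n" * 20)
print(stat_size, text_len, stat_size - text_len)  # prints "120 100 20"
```

One character goes missing per line, which matches the "20 lines, 20
missing characters" observation.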

As others have indicated, the real and true solution here is to open the
file in binary mode and handle the various EOL character combinations in
the SpamAssassin code. Which, yeah, is unfortunately reinventing a wheel
which should have been "permanently reinvented" in the last century. But
hey, it's only the first few years of the 21st century; maybe by the
22nd we'll have this whole CRLF/LF/CR/LFCR thing sorted out.

-- 
Gary R. Van Sickle
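
P.S. A minimal sketch of the binary-mode approach, written in Python
rather than SpamAssassin's own Perl, so treat it as an illustration and
not as a patch. The names split_lines and read_lines are hypothetical,
and counting LFCR as a single line ending is an assumption that may not
suit files that mix conventions.

```python
import re

# Two-byte sequences come first so "\r\n" and "\n\r" each count as ONE
# line ending. Assumption: a file does not mix a bare LF directly
# followed by a bare CR, which would be indistinguishable from LFCR.
_EOL = re.compile(rb"\r\n|\n\r|\n|\r")


def split_lines(data):
    """Split raw bytes into lines, accepting CRLF, LFCR, LF, or CR."""
    lines = _EOL.split(data)
    # A trailing terminator leaves one empty final piece; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


def read_lines(path):
    """Read a file in binary mode so no layer translates line endings."""
    with open(path, "rb") as f:
        return split_lines(f.read())


print(split_lines(b"a\r\nb\nc\rd\n\re"))
```

Because the file is read as raw bytes, the byte count agrees with what
stat reports, and the EOL handling is done explicitly in one place.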