Mail Archives: cygwin/2005/06/24/14:47:24

Message-Id: <200506241847.j5OIlIAB032622@delorie.com>
Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sourceware.org/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Mail-Followup-To: cygwin AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
From: "Gary R. Van Sickle" <g DOT r DOT vansickle AT worldnet DOT att DOT net>
To: "'Cygwin List'" <cygwin AT cygwin DOT com>
Subject: RE: stat file -- cygwin vs. Windows size?
Date: Fri, 24 Jun 2005 13:46:56 -0500
MIME-Version: 1.0
In-Reply-To: <20206A8F2DE2@mail.learnquick.com>
X-IsSubscribed: yes

> -----Original Message-----
> From: cygwin-owner AT cygwin DOT com 
> [mailto:cygwin-owner AT cygwin DOT com] On Behalf Of Herb Martin
> Sent: Friday, June 24, 2005 1:08 PM
> To: 'Cygwin List'
> Subject: RE: stat file -- cygwin vs. Windows size?
> 
> > >My suspicion is that stat is counting cr-lf as two characters but the
> > >input routines are treating these as one.
> > >
> > >If the file has about 20 lines, then that's 20 missing characters???
> > 
> > 
> > Yes, this is right.  And yes, this could be the cause of the situation
> > you're noticing.
> 
> Is there a standard Cygwin 'idiom' or function for dealing 
> with this mismatch, or should I just re-invent the wheel?
> 

As to the former, no, not Cygwin specifically.  The problem appears to be
that SpamAssassin is making the incorrect but all-too-common assumption that
"text file" == "file of 8-bit ASCII characters with '\n' EOL characters".
This is as incorrect as thinking "picture file" == "JPEG file".
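
To make the mismatch concrete, here is a minimal sketch in Python (not
SpamAssassin's own code; the helper names make_crlf_file and
size_vs_text_length are made up for illustration).  It writes 20 CRLF-terminated
lines, then compares the size reported by stat with the number of characters
a newline-translating text-mode read returns.

```python
import os
import tempfile


def make_crlf_file(lines=20):
    """Create a temp file of `lines` CRLF-terminated lines; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        # Each line is 4 characters of text plus the 2-byte "\r\n" terminator.
        f.write(b"line\r\n" * lines)
    return path


def size_vs_text_length(path):
    """Return (bytes on disk per stat, characters seen by a text-mode read)."""
    on_disk = os.stat(path).st_size
    # Python's default text mode translates "\r\n" to "\n" on input,
    # much like a text-mode read does for a CRLF file.
    with open(path, "r", encoding="ascii") as f:
        as_text = len(f.read())
    return on_disk, as_text
```

For a 20-line file the two numbers differ by exactly 20, one per line ending,
which is the discrepancy described in the quoted text above.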

Cygwin does have a number of features to "band-aid" much of this broken Unix
code, primarily the "text mode mount" feature, but these are just that, a
band-aid, not a fix for the root problem (and in your case, as in a
similar case in mutt, it can't solve the problem).  As others have
indicated, the real solution here is to open the file in binary
mode and handle the various EOL character combinations in the SpamAssassin
code.  Which, yeah, is unfortunately reinventing a wheel that should have
been "permanently reinvented" in the last century.  But hey, it's only the
first few years of the 21st century; maybe by the 22nd we'll have this whole
CRLF/LF/CR/LFCR thing sorted out.
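
A rough sketch of the "open in binary and handle the EOLs yourself" approach,
written in Python for brevity rather than in SpamAssassin's Perl.  The
function names normalize_eol and read_lines_any_eol are hypothetical, and the
handling of mixed or ambiguous sequences (e.g. a lone LF directly followed by
a lone CR) is just one reasonable choice, not a standard.

```python
import re

# Two-character terminators are listed first so that "\r\n" and "\n\r"
# are each consumed as a single line ending, not as two.
_EOL = re.compile(rb"\r\n|\n\r|\r|\n")


def normalize_eol(data):
    """Replace any CRLF, LFCR, CR, or LF terminator in `data` with LF."""
    return _EOL.sub(b"\n", data)


def read_lines_any_eol(path):
    """Read `path` in binary mode and return its lines without terminators."""
    with open(path, "rb") as f:
        raw = f.read()
    # len(raw) here matches what stat reports, since nothing was translated.
    text = normalize_eol(raw)
    lines = text.split(b"\n")
    # A trailing terminator yields one empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

Because the file is read untranslated, the byte count agrees with stat, and
the line splitting no longer depends on how the mount or the C library
treats text files.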

-- 
Gary R. Van Sickle


