Mail Archives: djgpp/2002/02/05/08:49:47

X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
Date: Tue, 5 Feb 2002 15:45:34 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
To: Thomas Mueller <tmueller AT bluegrass DOT net>
cc: djgpp AT delorie DOT com
Subject: Re: GNU Emacs DOS (DJGPP) port converts upper-ASCII characters to ASCII 127
In-Reply-To: <a3oha9$194gnj$1@ID-49635.news.dfncis.de>
Message-ID: <Pine.SUN.3.91.1020205153803.24181A-100000@is>
MIME-Version: 1.0
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by delorie.com id g15Dksv11346
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On 5 Feb 2002, Thomas Mueller wrote:

> I tried viewing and editing a big text file, just under 1 MB, using the DOS
> (DJGPP) port of GNU Emacs.  It consists of many concatenated email messages
> including heise online Newsletter, which is in German, and I noticed the upper
> ASCII characters, or some of them, those with umlauts, showed as ASCII 127.

When you say ``which is in German'', what do you mean, exactly?  That is, 
how are the German characters encoded?  Are they encoded in the same 
codepage that is installed as the default on your system, or are they 
encoded in some other encoding, such as Latin-1?

The DJGPP port of Emacs assumes codepage-encoded non-ASCII characters by 
default, unless told otherwise.  To see which codepage your system 
reports to Emacs, evaluate the variable dos-codepage (inside Emacs,
type "M-: dos-codepage RET").  To tell Emacs to read a file encoded in 
latin-1, type "C-x RET c latin-1 RET" immediately before you type "C-x C-f" 
(or immediately before you click Files->Open).

All this is explained in the Emacs manual, of course.

> I tried to remove a bit of junk, and subsequently saved.  I noticed the saved
> file had upper ASCII characters converted to ASCII 127, verified with grep and
> DR-DOS 7.03 EDIT.
> grep -n "düsteren" mbox851.mes
> showed nothing, running from the directory where that file is, while
> grep -n "dsteren" mbox851.mes
> (with ASCII 127 between d and s)
> actually went to the appropriate line.
> 
> This grep came from the DJGPP section of Simtel.
> 
> This is weird.

No, not weird.  Emacs thought that the file was encoded in some codepage, 
probably cp437 or cp850, but that wasn't true.  So it saw some codes that 
are undefined in the codepage, and replaced them with 127, which is the 
glyph used for undefined characters (the Windows and X versions display 
an empty box instead).
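
To make the mismatch concrete, here is a small Python sketch (not Emacs code, and not a model of how Emacs does its replacement).  It assumes the file stores the umlauts in Latin-1, which is only a guess about the original file, and it uses Python's built-in "latin-1", "cp437" and "cp850" codecs as stand-ins for the encodings discussed above.  The helper name decode_as is made up for this illustration.

```python
# Illustration only: Python's codecs stand in for the DOS codepages.
# Assumption: the file on disk holds the word in Latin-1.
WORD = "d\u00fcsteren"                      # "düsteren"
LATIN1_BYTES = WORD.encode("latin-1")       # bytes as (assumed) stored in the file


def decode_as(data: bytes, codec: str) -> str:
    """Interpret the same bytes under a different encoding (hypothetical helper)."""
    return data.decode(codec, errors="replace")


if __name__ == "__main__":
    # The same bytes read back under each encoding.
    for codec in ("latin-1", "cp437", "cp850"):
        print(codec, repr(decode_as(LATIN1_BYTES, codec)))

    # Byte values of the word in the file versus the word as typed
    # at a DOS prompt under codepage 850.
    print("latin-1:", LATIN1_BYTES.hex())
    print("cp850:  ", WORD.encode("cp850").hex())
```

The u-umlaut is one byte value in Latin-1 and a different one in cp437 and cp850, so a program that assumes the codepage will not see a u-umlaut in a Latin-1 file, and a search string typed at the DOS prompt will not match the Latin-1 bytes either.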

> Is this a known bug?

It's not a bug, it's a usage error.
