Mail Archives: djgpp/2002/02/07/06:45:10

X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-bounces using -f
From: "Thomas Mueller" <tmueller AT bluegrass DOT net>
Newsgroups: comp.os.msdos.djgpp
Subject: Re: GNU Emacs DOS (DJGPP) port converts upper-ASCII characters to ASCII 127
Date: 7 Feb 2002 11:38:51 GMT
Lines: 60
Message-ID: <a3tp0a$1b0grj$2@ID-49635.news.dfncis.de>
References: <Pine DOT SUN DOT 3 DOT 91 DOT 1020205153803 DOT 24181A-100000 AT is> <a3oha9$194gnj$1 AT ID-49635 DOT news DOT dfncis DOT de>
NNTP-Posting-Host: dial3-124.bluegrass.net (208.147.34.124)
Mime-Version: 1.0
X-Trace: fu-berlin.de 1013081931 45106035 208.147.34.124 (16 [49635])
X-Mailer: NOS-BOX 2.05
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

from Eli Zaretskii:

> When you say ``which is in German'', what do you mean, exactly?  That is,
  how are the German characters encoded?  Are they encoded in the same
  codepage that is installed as the default on your system, or are they
  encoded in some other encoding, such as Latin-1?

The newsletter is actually in the German language, as opposed to English, and
specifies ISO-8859-1 charset.  From the headers:

Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0

> The DJGPP port of Emacs assumes codepage-encoded non-ASCII characters by
  default, unless told otherwise.  To see what codepage your system
  reports to Emacs, evaluate the variable dos-codepage (inside Emacs,
  type "M-: dos-codepage RET").  To tell Emacs to read a file encoded in
  latin-1, type "C-x RET c latin-1 RET" immediately before you type "C-x C-f"
  (or immediately before you click Files->Open).

> All this is explained in the Emacs manual, of course.

I did what you said (the latter, though maybe not at the right time).  But I
had the same problem even after adding this line to _EMACS:
(set-terminal-coding-system 'iso-latin-1)
I think I read that in a book that did not refer to the DOS version.  But a
smaller file, the heise online newsletter by itself, about 10 KB, displayed
correctly in ISO-8859-1.

Also, with another file slightly exceeding 1.5 MB, DOS Emacs couldn't determine
line numbers (not enough memory allocated by cwsdpmi?), but vim 6 had no such
problem.  DOS Emacs didn't harm that file because I didn't modify or save it.

The Linux version of Emacs, in text mode without X Windows, showed the latin-1
characters correctly, and saved correctly after I deleted parts that were spam.

> No, not weird.  Emacs thought that the file was encoded in some codepage,
  probably cp437 or cp850, but that wasn't true.  So it saw some codes that
  are undefined in the codepage, and replaced them with 127, which is the
  glyph used for undefined characters (the Windows and X versions display
  an empty box instead).
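
To illustrate the mismatch described in the quote above, here is a minimal
Python sketch (Python is not part of the original thread; the sample string
and the helper name view_as are made up for illustration).  It only shows how
the same ISO-8859-1 bytes look when read as cp437 or cp850; it does not claim
to reproduce what Emacs does internally.

```python
# Hypothetical sample text; the actual newsletter is not reproduced here.
SAMPLE = "Grüße aus München"

# The bytes as they would sit in a file declared charset=iso-8859-1.
RAW = SAMPLE.encode("iso-8859-1")


def view_as(data: bytes, codec: str) -> str:
    """Decode the same bytes under the given codec name."""
    return data.decode(codec)


if __name__ == "__main__":
    # ascii() escapes non-ASCII characters so the output is safe on any console.
    for codec in ("iso-8859-1", "cp437", "cp850"):
        print(f"{codec:>10}: {ascii(view_as(RAW, codec))}")
```

Only the iso-8859-1 line gives back the original text.  Under cp437 and cp850
the bytes for u-umlaut and sharp s still decode, but to unrelated characters.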

> It's not a bug, it's a usage error.

A usage error, like I should have used vim 6 or elvis 2.1_4, or maybe setedit,
instead of Emacs?  Vim never messed up the upper-ASCII characters, nor did EPM
in OS/2.  What other text editor converts upper-ASCII characters to 7-bit and
saves the file that way?

Actually codepages 437 and 850 define all 256 characters, 128 = capital
C-cedilla, 129 = lower-case u-umlaut, 130 = e-acute-accent, etc.  I am
accustomed to reading messages displayed in the IBM charset and mentally
transliterating them to ISO-8859-1 (though not entirely).  Even if Emacs
displays the upper-ASCII characters as ASCII 127, Emacs shouldn't actually
write these changes to the file.
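
The codepage claims above can be checked with a short Python sketch, using
Python's own cp437 and cp850 codec tables as the reference (an assumption: they
may differ in detail from a given DOS installation).  The function names are
hypothetical.  The last function shows one way an editor could keep the stored
bytes intact regardless of how it displays them; it is not how Emacs or vim
work internally.

```python
def codepage_is_complete(codec: str) -> bool:
    """Return True if every byte value 0..255 decodes under the codec."""
    try:
        bytes(range(256)).decode(codec)
    except UnicodeDecodeError:
        return False
    return True


def first_high_chars(codec: str) -> str:
    """Characters assigned to byte values 128, 129 and 130."""
    return bytes([128, 129, 130]).decode(codec)


def latin1_to_codepage(data: bytes, codec: str = "cp437") -> bytes:
    """Transliterate ISO-8859-1 bytes to a DOS codepage for display.

    Characters the codepage lacks come out as '?', which is why the
    mapping is "not entirely" complete for cp437.
    """
    return data.decode("iso-8859-1").encode(codec, errors="replace")


def load_and_save_unchanged(data: bytes) -> bytes:
    """Decode and re-encode as ISO-8859-1.

    ISO-8859-1 maps each byte to exactly one character and back, so
    the bytes written out equal the bytes read in.
    """
    text = data.decode("iso-8859-1")
    return text.encode("iso-8859-1")
```

For example, latin1_to_codepage(b"\xfc") yields the cp437 byte for u-umlaut
(129), while the copyright sign (Latin-1 byte 0xA9) has no cp437 equivalent
and becomes '?'.  Display-only conversions like this never need to touch the
file on disk.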

Apparently a lot of people prefer vi and relatives over Emacs.

