Mail Archives: opendos/2001/03/08/01:22:03

Message-ID: <02bf01c0a797$cbde07c0$b008e289@mpaul>
From: "Matthias Paul" <Matthias DOT Paul AT post DOT rwth-aachen DOT de>
To: <opendos AT delorie DOT com>
References: <01FD6EC775C6D4119CDF0090273F74A4021FC2 AT emwatent02 DOT meters DOT com DOT au>
Subject: Re: Text file format .ASC ? (#2)
Date: Thu, 8 Mar 2001 07:19:05 +0100
Organization: Rechenzentrum RWTH Aachen
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id BAA21223
Reply-To: opendos AT delorie DOT com

On 2001-03-08, Joe da Silva wrote:

> That's the problem here - all those other text formats I've found
> seem to retain the first 128 characters and do strange things
> with the upper 128 codes. This one doesn't - it seems to use
> just the upper case letters and other characters below about
> 96 ($60), which to me suggests some non-Roman language,
> in which the Roman letters are of secondary importance ...

Just a guess, but could it be that your file is encoded in one
of these DBCS Code Pages (like Shift-JIS) as used in Asia,
so that it could be a mixed representation of one-byte and
two-byte characters? If the first byte falls within one of
(usually) two ranges, it opens a window into a set of 256 characters
which are addressed by the following byte. Each 1st byte
within these ranges opens a different window, so you can
have thousands of characters in one codepage, and still have
short representations for US-ASCII (which, however, is
normally used only for Western names and similar stuff, so
it would make sense that you can still see some strings that
look familiar like "PCnnn").
Usually the first range is located *somewhere* between
40h..7Eh and the second range between 80h..FCh, but the
actual count of ranges, their location, and extent depend
on the Country and Code Page settings of the system (under
DOS defined by the DBCS strings in COUNTRY.SYS).
Unfortunately, there would be plenty of DBCS Code Pages
to try... However, without a DBCS frontend you won't be able
to display such a file. But even if you loaded such drivers,
if you don't read Japanese, Chinese, Korean, or the like,
you won't be able to understand the contents, anyway...
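
To make the lead-byte idea a bit more concrete, here is a rough
Python sketch, not a definitive tool. It assumes the Shift-JIS
(code page 932) lead-byte ranges 81h..9Fh and E0h..FCh purely as
an example; other DBCS code pages use different ranges. The
names split_dbcs and decodable_as are made up for illustration,
and the candidate list is just a handful of codecs that ship with
Python.

```python
# Assumption: lead-byte ranges of Shift-JIS / code page 932.
# Other DBCS code pages define different ranges.
SJIS_LEAD_RANGES = [(0x81, 0x9F), (0xE0, 0xFC)]


def is_lead_byte(b, lead_ranges=SJIS_LEAD_RANGES):
    """Return True if byte value b opens a two-byte character."""
    return any(lo <= b <= hi for lo, hi in lead_ranges)


def split_dbcs(data, lead_ranges=SJIS_LEAD_RANGES):
    """Split raw bytes into one-byte and two-byte tokens."""
    tokens = []
    i = 0
    while i < len(data):
        # A lead byte takes the following byte with it, if there is one.
        if is_lead_byte(data[i], lead_ranges) and i + 1 < len(data):
            tokens.append(data[i:i + 2])
            i += 2
        else:
            tokens.append(data[i:i + 1])
            i += 1
    return tokens


def decodable_as(data, candidates=("cp932", "gbk", "big5", "cp949")):
    """Return the candidate codecs that decode data without an error.

    Decoding without an error does not prove the guess is right;
    it only rules out the code pages that clearly do not fit.
    """
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


if __name__ == "__main__":
    # "PC" followed by two kanji, encoded in code page 932.
    sample = "PC\u65e5\u672c".encode("cp932")
    print([t.hex() for t in split_dbcs(sample)])
    print(decodable_as(sample))
```

Running split_dbcs over the first few hundred bytes of the
unknown file and looking at how many two-byte tokens come out
would give a quick hint whether a given set of ranges is even
plausible for it.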

Well, not exactly what I would call .ASC, but
who knows... Do you know how old this file is?
Where did you get it from originally?

 Matthias

BTW. Most of these DBCS Code Pages also contain
sets of Roman, Greek, and other characters used in
Western languages.

------------------------------------------------------------
Matthias Paul, Ubierstrasse 28, D-50321 Bruehl, Germany
<Matthias DOT Paul AT post DOT rwth-aachen DOT de> <mpaul AT drdos DOT org>
http://www.uni-bonn.de/~uzs180/mpdokeng.html
------------------------------------------------------------
My homepage has moved, please update your pointers.
