Message-ID: <02bf01c0a797$cbde07c0$b008e289@mpaul>
From: "Matthias Paul"
To:
References: <01FD6EC775C6D4119CDF0090273F74A4021FC2 AT emwatent02 DOT meters DOT com DOT au>
Subject: Re: Text file format .ASC ? (#2)
Date: Thu, 8 Mar 2001 07:19:05 +0100
Organization: Rechenzentrum RWTH Aachen
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id BAA21223
Reply-To: opendos AT delorie DOT com

On 2001-03-08, Joe da Silva wrote:

> That's the problem here - all those other text formats I've found
> seem to retain the first 128 characters and do strange things
> with the upper 128 codes. This one doesn't - it seems to use
> just the upper case letters and other characters below about
> 96 ($60), which to me suggests some non-Roman language,
> in which the Roman letters are of secondary importance ...

Just a guess, but could it be that your file is encoded in one of
these DBCS Code Pages (like Shift-JIS) as used in Asia, so that it
could be a mixed representation of one-byte and two-byte characters?
If the first byte is within one of usually two ranges it opens a
window into a set of 256 characters which are addressed by the
following byte. Each 1st byte within these ranges opens a different
window, so you can have thousands of characters in one codepage,
and still have short representations for US-ASCII (which, however,
is normally used only for Western names and similar stuff, so it
would make sense that you can still see some strings that look
familiar like "PCnnn").
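
The lead-byte mechanism described above can be sketched roughly as
follows in Python. This is only an illustration under stated
assumptions: it uses the Shift-JIS lead-byte ranges 81h..9Fh and
E0h..FCh (as in code page 932), and the names SJIS_LEAD_RANGES,
is_lead_byte and split_dbcs are made up for this example. Other DBCS
code pages would need a different range table.

```python
# Assumed lead-byte ranges for Shift-JIS (code page 932). Other DBCS
# code pages define different ranges, so this table is illustrative only.
SJIS_LEAD_RANGES = [(0x81, 0x9F), (0xE0, 0xFC)]


def is_lead_byte(value, ranges=SJIS_LEAD_RANGES):
    """Return True if the byte value opens a two-byte character."""
    return any(low <= value <= high for low, high in ranges)


def split_dbcs(data, ranges=SJIS_LEAD_RANGES):
    """Split a byte string into one-byte and two-byte units.

    A byte inside a lead-byte range is grouped with the byte that
    follows it; every other byte stands alone. A lead byte at the very
    end of the data (truncated file) is emitted as a single byte.
    """
    units = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i], ranges) and i + 1 < len(data):
            units.append(data[i:i + 2])
            i += 2
        else:
            units.append(data[i:i + 1])
            i += 1
    return units


if __name__ == "__main__":
    # Mixed sample: four US-ASCII bytes followed by two kanji.
    sample = "PC98\u65e5\u672c".encode("shift_jis")
    for unit in split_dbcs(sample):
        print(unit.hex(), len(unit))
```

Note that the second byte of a pair may well fall into the US-ASCII
range, so scanning such a file byte by byte without tracking lead
bytes will show spurious "familiar" characters.
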
Usually the first range is located *somewhere* between 40h..7Eh
and the second range between 80h..FCh, but the actual count of
ranges, their location, and extent depend on the Country and
Code Page settings of the system (under DOS defined by the DBCS
strings in COUNTRY.SYS). Unfortunately, there would be plenty of
DBCS Code Pages to try...

However, without a DBCS frontend you won't be able to display
such a file. But even if you loaded such drivers, if you don't
read Japanese, Chinese, Korean, or the like, you won't be able
to understand the contents, anyway...

Well, not exactly what I would call .ASC, but who knows...

Do you know how old this file is? Where did you get it from
originally?

 Matthias

BTW: Most of these DBCS Code Pages also contain sets of Roman,
Greek, and other characters used in Western languages.

------------------------------------------------------------
Matthias Paul, Ubierstrasse 28, D-50321 Bruehl, Germany
http://www.uni-bonn.de/~uzs180/mpdokeng.html
------------------------------------------------------------
My homepage has moved, please update your pointers.