Mail Archives: djgpp/2001/04/22/13:15:14

Message-ID: <3AE31151.6E8D693D@jps.net>
From: Dennis Yelle <dennis51 AT jps DOT net>
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: string can't read certian characters???
References: <3AE099B3 DOT FE657644 AT jps DOT net>
Lines: 88
Date: Sun, 22 Apr 2001 10:13:53 -0700
NNTP-Posting-Host: 216.119.26.43
X-Complaints-To: abuse AT onemain DOT com
X-Trace: nntp1.onemain.com 987959336 216.119.26.43 (Sun, 22 Apr 2001 13:08:56 EDT)
NNTP-Posting-Date: Sun, 22 Apr 2001 13:08:56 EDT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dennis Yelle wrote:
> 
> When I run this program:
> -------------------------
> #include <cassert>
> #include <cstdio>
> #include <fstream>
> #include <iostream>
> #include <string>
> 
> using namespace std;  // needed on standard-conforming compilers
> 
> unsigned char data[] = {
>   'd', 0xBC, 0xD2, '\r', '\n',
>   'e', 0xB2, 0x7B, '\r', '\n',
>   'd', 0xBC, 0xD2, '\r', '\n',
>   'e', 0xB2, 0x7B, '\r', '\n',
>   'f', 0xB2, 0x7B, '\r', '\n'
> };
> 
> void write_it( const char* name)
> {
>   FILE* out = fopen( name, "wb");
>   assert(out);
>   int count = 0;
>   for( unsigned i=0; i<sizeof(data); ++i) {
>     // Keep the call outside assert() so it still runs under NDEBUG.
>     int rc = fputc( data[i], out);
>     assert( rc != EOF);
>     count++;
>   }
>   int closed = fclose( out);
>   assert( closed == 0);
>   cout << "Wrote " << count
>        << " characters to the file " << name << '\n';
> }
> 
> void read_it( const char* name)
> {
>   ifstream in( name);
>   assert( in);
>   cout << "Reading....\n";
>   string s;
>   while( in>>s) {
>     cout << s.size() << "  " << s << '\n';
>   }
> }
> 
> int main()
> {
>   write_it( "temp");
>   read_it( "temp");
>   return 0;
> }
> ----------------------------
> 
> I get this output:
> ------------------------
> Wrote 25 characters to the file temp
> Reading....
> 3  d+-
> 1  e
> --------------------------
> 
> So it looks like string input from
> a file cannot read some character sequences.
> Is this expected and desired behavior?
> Or is it a bug?

The problem is that when reading with
  in >> s;
the code in the file /djgpp/lang/cxx/std/bastring.cc
on line 448 calls traits::is_del(ch),
and on line 120 of the file \DJGPP\LANG\CXX\STD\STRAITS.H
is_del calls isspace(a).
But isspace is only defined for EOF and for values that fit
in an unsigned char. Where plain char is signed (the default
with gcc on x86), every byte above 0x7f reaches isspace as a
negative int, which is outside that range.
So any time you read a file with
  in >> s;
you end up with undefined behavior if the file contains
any characters above 0x7f.

As far as I know, this is undocumented.

Is this true only of djgpp, or is it also
true of other versions of gcc?

Who should I report this to, in order to get it fixed?

Dennis Yelle
-- 
I am a computer programmer and I am looking for a job.
There is a link to my resume here:  
http://table.jps.net/~vert/

  Copyright © 2019   by DJ Delorie     Updated Jul 2019