Message-ID: <3AE31151.6E8D693D@jps.net>
From: Dennis Yelle
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: string can't read certian characters???
References: <3AE099B3 DOT FE657644 AT jps DOT net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 88
Date: Sun, 22 Apr 2001 10:13:53 -0700
NNTP-Posting-Host: 216.119.26.43
X-Complaints-To: abuse AT onemain DOT com
X-Trace: nntp1.onemain.com 987959336 216.119.26.43 (Sun, 22 Apr 2001 13:08:56 EDT)
NNTP-Posting-Date: Sun, 22 Apr 2001 13:08:56 EDT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dennis Yelle wrote:
>
> When I run this program:
> -------------------------
> #include <stdio.h>
> #include <assert.h>
> #include <iostream>
> #include <fstream>
> #include <string>
>
> unsigned char data[] = {
>   'd', 0xBC, 0xD2, '\r', '\n',
>   'e', 0xB2, 0x7B, '\r', '\n',
>   'd', 0xBC, 0xD2, '\r', '\n',
>   'e', 0xB2, 0x7B, '\r', '\n',
>   'f', 0xB2, 0x7B, '\r', '\n'
> };
>
> void write_it( char* name)
> {
>   FILE* out = fopen( name, "wb");
>   assert(out);
>   int count = 0;
>   for( unsigned i=0; i<sizeof(data); i++) {
>     assert( EOF != fputc( data[i], out));
>     count++;
>   }
>   assert( !fclose( out));
>   cout << "Wrote " << count
>        << " characters to the file " << name << '\n';
> }
>
> void read_it( char* name)
> {
>   ifstream in( name);
>   assert( in);
>   cout << "Reading....\n";
>   string s;
>   for(;in>>s;) {
>     cout << s.size() << " " << s << '\n';
>   }
> }
>
> int main()
> {
>   write_it( "temp");
>   read_it( "temp");
>   return 0;
> }
> ----------------------------
>
> I get this output:
> ------------------------
> Wrote 25 characters to the file temp
> Reading....
> 3 d+-
> 1 e
> --------------------------
>
> So it looks like string input from
> a file cannot read some character sequences.
> Is this expected and desired behavior?
> Or is it a bug?

The problem is that when reading with in >> s; the code in the file
/djgpp/lang/cxx/std/bastring.cc on line 448 calls traits::is_del (ch),
and on line 120 of the file \DJGPP\LANG\CXX\STD\STRAITS.H
is_del calls isspace(a).

But isspace is only defined for arguments that are EOF or fit in an
unsigned char. When it is handed a plain (signed) char, it only works
for characters in the range 0x00 to 0x7f, because anything above 0x7f
arrives as a negative value.

So any time you read a file with in >> s; you end up with
undefined behavior if the file contains any characters above 0x7f.

As far as I know, this is undocumented.

Is this true only of djgpp, or is it also true of other versions of gcc?

Whom should I report this to, in order to get it fixed?

Dennis Yelle
-- 
I am a computer programmer and I am looking for a job.
There is a link to my resume here:
http://table.jps.net/~vert/
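
For illustration only, here is a minimal sketch of the usual way to avoid
the isspace problem described above: convert the char to unsigned char
before classifying it. The helper names is_space_byte and split_on_space
are hypothetical (they are not part of djgpp or libstdc++), and this is
not a patch for is_del, just a stand-alone example of the technique.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: classify one byte as whitespace.
// The cast to unsigned char keeps bytes above 0x7f from reaching
// isspace as negative values, which would be undefined behavior.
inline bool is_space_byte(char ch)
{
    return std::isspace(static_cast<unsigned char>(ch)) != 0;
}

// Hypothetical helper: split text into whitespace-separated words,
// roughly what repeated "in >> s" is expected to do.
std::vector<std::string> split_on_space(const std::string& text)
{
    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (is_space_byte(text[i])) {
            // End of a word: store it if it is not empty.
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current += text[i];
        }
    }
    if (!current.empty())
        words.push_back(current);
    return words;
}
```

In the default "C" locale, feeding the bytes from the test program
through split_on_space should give one 3-character word per line of
data, since none of the bytes above 0x7f is treated as whitespace.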