Mail Archives: djgpp/2001/04/22/13:15:14
Message-ID: <3AE31151.6E8D693D@jps.net>
From: Dennis Yelle <dennis51 AT jps DOT net>
X-Mailer: Mozilla 4.75 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.os.msdos.djgpp
Subject: Re: string can't read certian characters???
References: <3AE099B3 DOT FE657644 AT jps DOT net>
Lines: 88
Date: Sun, 22 Apr 2001 10:13:53 -0700
NNTP-Posting-Host: 216.119.26.43
X-Complaints-To: abuse AT onemain DOT com
X-Trace: nntp1.onemain.com 987959336 216.119.26.43 (Sun, 22 Apr 2001 13:08:56 EDT)
NNTP-Posting-Date: Sun, 22 Apr 2001 13:08:56 EDT
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
Reply-To: djgpp AT delorie DOT com

Dennis Yelle wrote:
>
> When I run this program:
> -------------------------
> #include <cassert>
> #include <cstdio>
> #include <fstream>
> #include <iostream>
> #include <string>
>
> unsigned char data[] = {
>     'd', 0xBC, 0xD2, '\r', '\n',
>     'e', 0xB2, 0x7B, '\r', '\n',
>     'd', 0xBC, 0xD2, '\r', '\n',
>     'e', 0xB2, 0x7B, '\r', '\n',
>     'f', 0xB2, 0x7B, '\r', '\n'
> };
>
> void write_it( char* name)
> {
>     FILE* out = fopen( name, "wb");
>     assert(out);
>     int count = 0;
>     for( unsigned i=0; i<sizeof(data); ++i) {
>         assert( EOF != fputc( data[i], out));
>         count++;
>     }
>     assert( !fclose( out));
>     cout << "Wrote " << count
>          << " characters to the file " << name << '\n';
> }
>
> void read_it( char* name)
> {
>     ifstream in( name);
>     assert( in);
>     cout << "Reading....\n";
>     string s;
>     for(;in>>s;) {
>         cout << s.size() << " " << s << '\n';
>     }
> }
>
> int main()
> {
>     write_it( "temp");
>     read_it( "temp");
>     return 0;
> }
> ----------------------------
>
> I get this output:
> ------------------------
> Wrote 25 characters to the file temp
> Reading....
> 3 d+-
> 1 e
> --------------------------
>
> So it looks like string input from
> a file cannot read some character sequences.
> Is this expected and desired behavior?
> Or is it a bug?
The problem is that when reading with
in>>s;
the code in the file /djgpp/lang/cxx/std/bastring.cc
on line 448 calls traits::is_del (ch)
and on line 120 of the file \DJGPP\LANG\CXX\STD\STRAITS.H
is_del calls isspace(a)
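but isspace is only defined for EOF and for values that fit in an
unsigned char. Here it is handed a plain char, which is signed in
djgpp, so it only works for characters in the range 0x00 to 0x7f.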
So anytime you read a file with
in >> s;
you end up with undefined behavior if the file contains
any characters above 0x7f.
As far as I know, this is undocumented.
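
To illustrate, here is a minimal sketch of the usual way to make such
a call well defined: convert the char to unsigned char before passing
it to isspace. The helper name is_space_safe is made up for this
example; it is not what straits.h contains.

```cpp
#include <cctype>

// Hypothetical helper (not part of the library): classify a plain char
// without undefined behavior. isspace() is only defined for EOF and for
// values representable as unsigned char, so convert before calling it.
bool is_space_safe(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

With this, a byte such as 0xBC is passed to isspace as 188 rather than
as a negative number, so the call stays within the range the function
is defined for.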
Is this true only of djgpp, or is it also
true of other versions of gcc?
Who should I report this to, in order to get it fixed?
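
Until that happens, one possible workaround is to avoid in >> s for
such files and split the words by hand. The following is only a
sketch under that assumption: read_word and split_words are
hypothetical names, and I have not tried it against the djgpp
library itself.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical replacement for "in >> s": reads one whitespace-delimited
// word, classifying every byte through unsigned char.
bool read_word(std::istream& in, std::string& word)
{
    word.clear();
    char c;
    // Skip leading whitespace and keep the first non-space byte.
    while (in.get(c)) {
        if (!std::isspace(static_cast<unsigned char>(c))) {
            word += c;
            break;
        }
    }
    // Collect bytes up to the next whitespace or end of input.
    while (in.get(c)) {
        if (std::isspace(static_cast<unsigned char>(c)))
            break;
        word += c;
    }
    return !word.empty();
}

// Hypothetical convenience wrapper: split an in-memory buffer into words.
std::vector<std::string> split_words(const std::string& data)
{
    std::istringstream in(data);
    std::vector<std::string> words;
    std::string w;
    while (read_word(in, w))
        words.push_back(w);
    return words;
}
```

In read_it, the loop for(;in>>s;) would then become
for(;read_word(in, s);) with the rest left unchanged.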
Dennis Yelle
--
I am a computer programmer and I am looking for a job.
There is a link to my resume here:
http://table.jps.net/~vert/