Date: Mon, 4 Jan 2010 12:29:13 +0000
Subject: Re: Cygwin 1.7.1 sprintf() with format string having 8th bit set
From: Andy Koppe
To: cygwin AT cygwin DOT com

2010/1/4 Joseph Quinsey:
> In Cygwin 1.7.1, sprintf() with the format string having an 8th bit set
> appears to be broken.
> Sample code (where I've indicated the backslashes in
> the comments, in case they are stripped out by the mailer):
>
> #include <stdio.h>
>
> int main (void)
> {
>    unsigned char foo[30] = "";
>    unsigned char bar[30] = "";
>    unsigned char xxx[30] = "";
>    sprintf (foo, "\100%s", "ABCD"); /* this is backslash one zero zero   */
>    sprintf (bar, "\300%s", "ABCD"); /* this is backslash three zero zero */
>    sprintf (xxx, "\300ABCD");       /* this is backslash three zero zero */
>    printf ("%d %d %d %d %d\n", foo[0],foo[1],foo[2],foo[3],foo[4]);
>    printf ("%d %d %d %d %d\n", bar[0],bar[1],bar[2],bar[3],bar[4]);
>    printf ("%d %d %d %d %d\n", xxx[0],xxx[1],xxx[2],xxx[3],xxx[4]);
>    return 0;
> }
>
> gives:
>
> 64 65 66 67 68
> 0 0 0 0 0
> 192 65 66 67 68
>
> The second line of the output should be the same as the third.

The issue here is that the character set of the "C" locale in Cygwin 1.7
is UTF-8, and a \300 byte on its own is not valid UTF-8. To get
well-defined behaviour, you need to invoke setlocale(LC_CTYPE, ...) with
the appropriate locale before calling sprintf(). See the thread at
http://cygwin.com/ml/cygwin/2009-12/msg00980.html for more on this.

Andy
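
For illustration, here is a minimal sketch of the two ways around the problem. The helper names select_ctype_locale() and format_with_lead_byte() are made up for this example and do not come from the thread. The first helper wraps the setlocale(LC_CTYPE, ...) call suggested above and reports whether the requested locale was accepted. The second is a separate workaround, not mentioned in the thread: it keeps the high-bit byte out of the format string by passing it as a %c argument, so the format itself is plain ASCII.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: select the character set used for LC_CTYPE.
   Returns 1 if the locale was accepted, 0 if setlocale() rejected it. */
int select_ctype_locale(const char *name)
{
    return setlocale(LC_CTYPE, name) != NULL;
}

/* Hypothetical helper: write one arbitrary lead byte followed by text.
   The byte is passed as a %c argument rather than embedded in the
   format string, so the format contains only ASCII characters.
   Returns whatever snprintf() returns. */
int format_with_lead_byte(char *out, size_t size,
                          unsigned char lead, const char *text)
{
    return snprintf(out, size, "%c%s", lead, text);
}
```

With these, a caller might first try select_ctype_locale() with a single-byte locale name and check the result, since which locale names are accepted depends on the system. The original failing call could instead be written as format_with_lead_byte(bar, sizeof bar, 0300, "ABCD"), which should not depend on how the format string is decoded.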