From: Bengt Larsson
To: cygwin AT cygwin DOT com
Subject: A bug with UTF-8 output in a console
Date: Mon, 07 Feb 2011 01:31:45 +0100

I think I have found a bug with UTF-8 output in a console in a UTF-8 locale ("C.UTF-8"). If a UTF-8 character straddles a write() boundary, that is, if the bytes of one multibyte character are split across two write() calls, then the output gets garbled. An example program is attached.

Attachment: buftest.c

#include <stdio.h>
#include <stdlib.h>	/* atoi */

int
main(int argc, char *argv[])
{
	int i, buf;

	/* Optional argument: stdout buffer size in bytes (default 5). */
	if (argc > 1) {
		buf = atoi(argv[1]);
		if (buf <= 0) return 0;
	} else {
		buf = 5;
	}

	printf("First a sequence of characters, a-ring, a-diaeresis, o-diaeresis, repeated:\n\n");

	/* Each character is two bytes in UTF-8: c3 a5, c3 a4, c3 b6. */
	for (i=0; i<25; i++)
		printf("\xc3\xa5\xc3\xa4\xc3\xb6");

	printf("\n\n");

	printf("Now the same thing with a buffer of %d:\n\n", buf);

	/* With a small buffer, stdio has to flush in the middle of the
	   stream; with an odd size such as 5, a flush can land between
	   the two bytes of a character. */
	setvbuf(stdout, NULL, _IOFBF, buf);

	for (i=0; i<25; i++)
		printf("\xc3\xa5\xc3\xa4\xc3\xb6");

	printf("\n");

	return 0;
}
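
For illustration only, here is one way a program could sidestep the problem on its own side: hold back any incomplete trailing bytes so that every write() ends on a character boundary. This is a minimal sketch, not a fix for the console code itself. The names utf8_incomplete_tail and write_whole_chars are hypothetical and not part of Cygwin or any library, and the sketch assumes the input is otherwise well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes at the end of buf[0..len)
   start a UTF-8 sequence whose remaining bytes have not arrived yet.
   The result is between 0 and 3. */
size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
	size_t back;

	assert(buf != NULL || len == 0);

	/* The longest incomplete tail is the lead byte of a 4-byte
	   sequence plus two continuation bytes, so look back 3 bytes. */
	for (back = 1; back <= 3 && back <= len; back++) {
		unsigned char c = buf[len - back];
		size_t need;

		if ((c & 0xC0) == 0x80)
			continue;	/* continuation byte: keep looking for the lead */

		if ((c & 0xE0) == 0xC0)
			need = 2;
		else if ((c & 0xF0) == 0xE0)
			need = 3;
		else if ((c & 0xF8) == 0xF0)
			need = 4;
		else
			return 0;	/* ASCII or invalid lead: nothing to hold back */

		return (back < need) ? back : 0;
	}
	return 0;
}

/* Hypothetical wrapper: writes the longest prefix of buf that ends on a
   character boundary, using a single write() call. Returns what write()
   returns. Short writes are not handled in this sketch; the caller keeps
   the unwritten bytes and prepends them to the next chunk. */
ssize_t write_whole_chars(int fd, const unsigned char *buf, size_t len)
{
	size_t todo = len - utf8_incomplete_tail(buf, len);

	return write(fd, buf, todo);
}
```

With the 5-byte buffer from the report, the first chunk would be c3 a5 c3 a4 c3; the helper reports one dangling byte, so only the first four bytes would go out and the final c3 would wait for its continuation byte in the next chunk.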