From: Kyzer
Date: Wed, 25 Mar 2015 14:34:28 +0000
Subject: With bad UTF-8, cygwin can create files it can't read
To: cygwin AT cygwin DOT com

Hello,

I've found that if you use cygwin to create a file with badly-encoded
UTF-8, readdir() gives out an entry with a name that cygwin won't
subsequently accept.

* create a file using a filename with hex bytes F4 8F BF BF
* readdir() reports the filename as hex bytes E2 8E B3 ED BF BF
* attempting to open or unlink the filename E2 8E B3 ED BF BF fails
* attempting to open or unlink the filename F4 8F BF BF succeeds

Here's a test case. Beware that it will delete everything in the
current directory.

#include <stdio.h>
#include <dirent.h>
#include <unistd.h>

int main() {
    DIR *d;
    struct dirent *de;
    const char *fname = "\xF4\x8F\xBF\xBF";

    // touch file
    fclose(fopen(fname, "wb"));

    // iterate through dir, trying to unlink every non-dot entry
    // using the name that readdir() reports
    d = opendir(".");
    while ((de = readdir(d))) {
        if (de->d_name[0] == '.') continue;
        printf("unlink(%s) = %d\n", de->d_name, unlink(de->d_name));
    }
    closedir(d);

    // show that unlink works if you know the real filename
    printf("unlink(%s) = %d\n", fname, unlink(fname));
    return 0;
}

This outputs (piped through hexdump -C)

00000000  75 6e 6c 69 6e 6b 28 e2  8e b3 ed bf bf 29 20 3d  |unlink(......) =|
00000010  20 2d 31 0a 75 6e 6c 69  6e 6b 28 f4 8f bf bf 29  | -1.unlink(....)|
00000020  20 3d 20 30 0a                                    | = 0.|
00000025

i.e.

unlink(\xe2\x8e\xb3\xed\xbf\xbf) = -1
unlink(\xf4\x8f\xbf\xbf) = 0

This is with cygwin package 1.7.35

$ cygcheck -c cygwin
Cygwin Package Information
Package              Version        Status
cygwin               1.7.35-1       OK

Windows / DOS does not have the problem:

c:\test\t>dir
 Volume in drive C has no label.
 Volume Serial Number is ....-....

 Directory of c:\test\t

25/03/2015  14:30    <DIR>          .
25/03/2015  14:30    <DIR>          ..
25/03/2015  14:30                 0 ??
               1 File(s)              0 bytes
               2 Dir(s)  39,906,525,184 bytes free

c:\test\t>del *
c:\test\t\*, Are you sure (Y/N)? y

c:\test\t>dir
 Volume in drive C has no label.
 Volume Serial Number is ....-....

 Directory of c:\test\t

25/03/2015  14:31    <DIR>          .
25/03/2015  14:31    <DIR>          ..
               0 File(s)              0 bytes
               2 Dir(s)  39,906,525,184 bytes free

Regards
Stuart
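
For reference, the two byte sequences in the report can be decoded by hand to see which code points they represent. The following is only a minimal sketch, not anything taken from Cygwin: decode_lenient() and dump_code_points() are hypothetical helper names, and the decoder is deliberately lenient (it accepts surrogates and overlong forms) so that the name returned by readdir() can be inspected at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with n
 * bytes remaining. Stores the code point in *cp and returns the number of
 * bytes consumed, or 0 if the lead byte or a continuation byte is
 * malformed. Deliberately lenient: surrogates (U+D800..U+DFFF) and
 * overlong forms are NOT rejected here. */
static size_t decode_lenient(const unsigned char *s, size_t n,
                             unsigned long *cp)
{
    size_t len, i;
    unsigned long c;

    if (n == 0) return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; }
    else return 0;                  /* stray continuation or invalid lead */

    if (len > n) return 0;          /* sequence is truncated */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    *cp = c;
    return len;
}

/* Hypothetical helper: print every code point in the buffer as U+XXXX and
 * flag surrogates. Returns the number of code points decoded. */
static int dump_code_points(const unsigned char *s, size_t n)
{
    int count = 0;

    assert(s != NULL);
    while (n > 0) {
        unsigned long cp;
        size_t used = decode_lenient(s, n, &cp);
        if (used == 0) {
            printf("(malformed byte %02X)\n", s[0]);
            break;
        }
        printf("U+%04lX%s\n", cp,
               (cp >= 0xD800 && cp <= 0xDFFF)
                   ? " (surrogate, not valid in UTF-8)" : "");
        s += used;
        n -= used;
        count++;
    }
    return count;
}
```

Called from a main() on the four bytes F4 8F BF BF, dump_code_points() prints U+10FFFF. On the six bytes E2 8E B3 ED BF BF it prints U+23B3 followed by U+DFFF, flagged as a surrogate. So the name handed back by readdir() contains a lone low surrogate, which a strict UTF-8 decoder would refuse. This may be related to why the later open and unlink calls fail, though the sketch itself does not show the cause.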