Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Date: Fri, 25 Dec 2015 03:04:51 +0300
From: Andrey Repin
Reply-To: cygwin AT cygwin DOT com
Message-ID: <773876572.20151225030451@yandex.ru>
To: Corinna Vinschen , cygwin AT cygwin DOT com
Subject: Re: stat() lstat() not able to read long filename with cyrillic chars?
In-Reply-To: <20151224192448.GB4275@calimero.vinschen.de>
References: <20151223194440 DOT 5B2A98CFEA AT edrusb DOT is-a-geek DOT org> <20151224192448 DOT GB4275 AT calimero DOT vinschen DOT de>

Greetings, Corinna Vinschen!

>> First, I have read the FAQ and this mailing archive :)
>>
>> Here is the problem I meet:
>>
>> In a directory are placed three files using windows 8's explorer:
>> - a short Cyrillic filename "абваб.txt"
>> - a long Cyrillic filename
>> "абвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабваб.txt"
>> - a long Latin filename
>> "ababababababababababababababababababababababababababababababababababababababababababababababababababababababababababababa.txt"
>>
>>
>> From a C program compiled under Cygwin, I can obtain the corresponding
>> filename strings using readdir_r()...
>>
>> "\320\260\320\261\320\262\320\260\320\261.txt"
>> "\320\260\320\261\320\262\320\260\320\261\320\262\320\260\320\261 [snipped]"
>> "abababababaababababa [snipped]"
>>
>> ... but passing these strings in turn to lstat() or stat() returns 0 as
>> expected for all except for the long Cyrillic filename.

> NAME_MAX is 255. On Windows this is the number of UTF-16 chars
> unfortunately. On POSIX systems (as on Cygwin) this is the number of
> bytes. Long UTF-16 strings in cyrillic take twice as much UTF-8 chars
> as it has UTF-16 chars, so NAME_MAX in utf-8 cyrillics translates into
> a maximum of 127 UTF-16 chars.

Aren't POSIX restrictions a bit different? Namely 128 bytes per path element
and 4096 bytes for file name?

> If you need access to UTF-16 filenames with more characters, you can
> switch to a one-byte charset temporarily, e.g.

> $ LC_ALL=ru_RU your_app

> to switch to iso-8859-5 or

> $ LC_ALL=ru_RU.CP1251

> to switch to Windows codepage 1251. See

> https://cygwin.com/cygwin-ug-net/setup-locale.html

> HTH,
> Corinna


-- 
With best regards,
Andrey Repin
Friday, December 25, 2015 03:03:51

Sorry for my terrible English...
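
P.S. To make the arithmetic in the quoted explanation concrete, here is a minimal sketch (Python, purely illustrative). It measures the same name in UTF-16 code units and in bytes under UTF-8, CP1251 and ISO-8859-5. NAME_MAX = 255 is the figure given in the quoted reply. The helper names and the 126-letter sample name are assumptions made up for this example. The sample is not the exact file name from the original report.

```python
# Illustrative only: measures one file name the two ways the quoted reply
# describes. NAME_MAX = 255 comes from the thread; the helper names and the
# sample names below are hypothetical.
NAME_MAX = 255


def utf16_units(name: str) -> int:
    """Length in UTF-16 code units (the unit the reply attributes to Windows)."""
    return len(name.encode("utf-16-le")) // 2


def byte_len(name: str, charset: str = "utf-8") -> int:
    """Length in bytes once encoded in the given charset (the POSIX unit)."""
    return len(name.encode(charset))


def fits(name: str, charset: str = "utf-8") -> bool:
    """True if the encoded name is no longer than NAME_MAX bytes."""
    return byte_len(name, charset) <= NAME_MAX


short_name = "абваб.txt"
long_name = "абв" * 42 + ".txt"  # 126 Cyrillic letters + 4 ASCII characters

for name in (short_name, long_name):
    print(
        utf16_units(name),            # UTF-16 code units
        byte_len(name),               # UTF-8 bytes
        fits(name),                   # within NAME_MAX as UTF-8?
        byte_len(name, "cp1251"),     # bytes in Windows codepage 1251
        byte_len(name, "iso8859_5"),  # bytes in ISO-8859-5
    )
```

Each Cyrillic letter takes two bytes in UTF-8 but one byte in either single-byte charset, which is why the locale switch suggested above roughly doubles the usable name length for such files.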