Mail Archives: cygwin/2015/12/24/19:05:32

Date: Fri, 25 Dec 2015 03:04:51 +0300
From: Andrey Repin <anrdaemon AT yandex DOT ru>
Reply-To: cygwin AT cygwin DOT com
Message-ID: <773876572.20151225030451@yandex.ru>
To: Corinna Vinschen <cygwin AT cygwin DOT com>, cygwin AT cygwin DOT com
Subject: Re: stat() lstat() not able to read long filename with cyrillic chars?
In-Reply-To: <20151224192448.GB4275@calimero.vinschen.de>
References: <20151223194440 DOT 5B2A98CFEA AT edrusb DOT is-a-geek DOT org> <20151224192448 DOT GB4275 AT calimero DOT vinschen DOT de>

Greetings, Corinna Vinschen!

>> First, I have read the FAQ and this mailing archive :)
>> 
>> Here is the problem I ran into:
>> 
>> In a directory are placed three files using Windows 8's Explorer:
>> - a short Cyrillic filename "абваб.txt"
>> - a long Cyrillic filename
>> "абвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабвабваб.txt"
>> - a long Latin filename
>> "ababababababababababababababababababababababababababababababababababababababababababababababababababababababababababababa.txt"
>> 
>> 
>> From a C program compiled under Cygwin, I can obtain the corresponding
>> filename strings using readdir_r()...
>> 
>> "\320\260\320\261\320\262\320\260\320\261.txt"
>> "\320\260\320\261\320\262\320\260\320\261\320\262\320\260\320\261 [snipped]"
>> "abababababaababababa [snipped]"
>> 
>> ... but passing these strings in turn to lstat() or stat() returns 0 as
>> expected for all except for the long Cyrillic filename.
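
For anyone who wants to reproduce this, here is a minimal sketch of the kind of test loop described above. It is not the original poster's program: it uses readdir() rather than readdir_r(), and the helper name count_lstat_failures is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: call lstat() on every entry of `dir` and report
   the ones that fail. Returns the number of entries that could not be
   checked, or -1 if the directory itself could not be opened. */
static int count_lstat_failures(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int failures = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        char path[4096];
        struct stat st;

        /* Build "dir/name"; count it as a failure if it does not fit. */
        int n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path) {
            fprintf(stderr, "path too long for buffer: %s\n", e->d_name);
            ++failures;
            continue;
        }
        if (lstat(path, &st) != 0) {
            /* Print the byte length of the name next to the error. */
            fprintf(stderr, "lstat failed (%s), name is %zu bytes: %s\n",
                    strerror(errno), strlen(e->d_name), e->d_name);
            ++failures;
        }
    }
    closedir(d);
    return failures;
}
```

Calling count_lstat_failures(".") in the directory with the three files should show which names are rejected and how many bytes each one has.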

> NAME_MAX is 255.  On Windows this is the number of UTF-16 chars
> unfortunately.  On POSIX systems (as on Cygwin) this is the number of
> bytes.  A long Cyrillic name takes twice as many bytes in UTF-8 as it
> has UTF-16 chars, so NAME_MAX for UTF-8 Cyrillic translates into
> a maximum of 127 UTF-16 chars.
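
To make the arithmetic concrete, here is a small sketch that counts both sizes for a UTF-8 name. It assumes valid UTF-8 input and the limit of 255 bytes quoted above; SKETCH_NAME_MAX, utf16_units and fits_name_max are names invented for this example, not anything Cygwin provides.

```c
#include <stddef.h>
#include <string.h>

/* Assumed limit, taken from the quoted explanation (NAME_MAX is 255). */
#define SKETCH_NAME_MAX 255

/* Number of UTF-16 code units needed for a valid UTF-8 string.
   Continuation bytes (10xxxxxx) add nothing; a 4-byte sequence
   needs a surrogate pair, every other sequence one unit. */
static size_t utf16_units(const char *utf8)
{
    size_t units = 0;
    for (const unsigned char *p = (const unsigned char *)utf8; *p != 0; ++p) {
        if ((*p & 0xC0) == 0x80)
            continue;
        units += (*p >= 0xF0) ? 2 : 1;
    }
    return units;
}

/* Nonzero if the name fits a byte-based limit of SKETCH_NAME_MAX. */
static int fits_name_max(const char *utf8)
{
    return strlen(utf8) <= SKETCH_NAME_MAX;
}
```

A name of 128 Cyrillic letters is 128 UTF-16 units but 256 bytes, so fits_name_max() rejects it, while 127 letters (254 bytes) still fit.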

Aren't the POSIX restrictions a bit different?
Namely, 128 bytes per path element and 4096 bytes for the file name?

> If you need access to UTF-16 filenames with more characters, you can
> switch to a one-byte charset temporarily, e.g.

>   $ LC_ALL=ru_RU your_app

> to switch to iso-8859-5 or

>   $ LC_ALL=ru_RU.CP1251

> to switch to Windows codepage 1251.  See
> https://cygwin.com/cygwin-ug-net/setup-locale.html
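
When trying settings like these, it can help to let the program report which character set it actually ends up with. The sketch below only shows the C library's view after setlocale(); that this matches what Cygwin uses for file name conversion is an assumption here. The helper name active_codeset is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: adopt the locale from the environment
   (LC_ALL, LC_CTYPE, LANG) and return the name of its character set. */
static const char *active_codeset(void)
{
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "locale from environment not supported, "
                        "keeping the current one\n");
    return nl_langinfo(CODESET);
}
```

Printing the result of active_codeset() at startup, once with the default environment and once with LC_ALL set as suggested, shows whether the switch took effect.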


> HTH,
> Corinna



-- 
With best regards,
Andrey Repin
Friday, December 25, 2015 03:03:51

Sorry for my terrible English...
