Message-ID: <493546D3.7060808@byu.net>
Date: Tue, 02 Dec 2008 07:31:47 -0700
From: Eric Blake
To: cygwin AT cygwin DOT com
Subject: Re: Avoid duplicate names in /proc/registry (which may crash find) ?
References: <4934461E DOT 5040708 AT t-online DOT de> <20081202120840 DOT GM12905 AT calimero DOT vinschen DOT de>
In-Reply-To: <20081202120840.GM12905@calimero.vinschen.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Corinna Vinschen on 12/2/2008 5:08 AM:
> - If find crashes in this situation, isn't this a bug in find which
>   should be fixed in find?

No.  The problem is that readdir() is returning the same name twice, once
for a directory and once for a file.  Even without d_type support, this
means when you go to stat() that name, you get the same answer twice
(assuming that stat visits the first of the two instances), where the
second answer is wrong; even without d_type, find is unable to traverse
the second of the two identical names.  In other words, the problem is not
in find, but in the fact that Windows allows the registry to violate
filesystem semantics by giving the same name to distinct contents.

>
> - /proc/registry is a convenience for reading the registry.
>   Due to some funny definitions of the registry it's not a full
>   solution.  You can't write this way, you can't even access the
>   "(Default)" key values.  I personally don't worry if some border
>   cases don't work.  For the border cases and for the full access we
>   have regtool.

Maybe the solution for this problem is to make the registry always
populate d_type with DT_UNKNOWN; the readdir will still list the name
twice, but at least with the same d_type, and such that find falls back
to a stat(), which sees the info on the first entry twice, no worse than
it currently is.

If you do go with a workaround, using a full-blown set<> or hash-table in
option 2 seems like overkill.  When enumerating a parent key, the code
currently visits all sub-keys first, then all values.  Isn't it simply a
matter of checking, for each sub-key, whether a value of the same name
exists, in which case you could then munge the sub-key as name\ or
name%5c?  (I'm going with the backslash as the munging character, to make
it obvious that this was the sub-key).

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9 AT byu DOT net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkk1RtIACgkQ84KuGfSFAYA2zwCgiXA6N53ef3p+n3TWjt++XRhc
7rIAoLghCKX5wNylKtj/hW5zXeOX1ddS
=OCmI
-----END PGP SIGNATURE-----
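
For illustration only, the sub-key munging idea from the message above
might look roughly like the sketch below.  It is not Cygwin code: the
function list_registry_dir, the value_exists callback (standing in for a
single registry lookup by name), and the "key"/"value" kind labels are all
hypothetical names made up for this example.  It models the enumeration
order described in the message (all sub-keys first, then all values) and
appends the %5c suffix to a sub-key only when a value of the same name
exists, so no set<> or hash table is built.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical labels for the two entry kinds; a real implementation
# would report them through d_type instead.
KIND_KEY = "key"
KIND_VALUE = "value"


def list_registry_dir(
    subkeys: Iterable[str],
    values: Iterable[str],
    value_exists: Callable[[str], bool],
    suffix: str = "%5c",
) -> Iterator[Tuple[str, str]]:
    """Yield (name, kind) pairs: all sub-keys first, then all values.

    A sub-key that shares its name with a value gets `suffix` appended,
    so the same name is never reported twice.
    """
    for name in subkeys:
        # One lookup per sub-key; no table of previously seen names.
        if value_exists(name):
            name += suffix
        yield name, KIND_KEY
    for name in values:
        yield name, KIND_VALUE


# Example: "Security" is both a sub-key and a value of the parent key.
subkeys = ["Parameters", "Security"]
values = ["Security", "Start"]
for name, kind in list_registry_dir(subkeys, values, lambda n: n in values):
    print(f"{kind:5}  {name}")
```

The sketch does not handle a value whose name already equals the munged
sub-key name, and it leaves the name comparison rule entirely to the
value_exists callback.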