Mail Archives: cygwin/2012/05/14/18:58:01

Message-ID: <4FB18D1E.4050203@redhat.com>
Date: Mon, 14 May 2012 16:54:22 -0600
From: Eric Blake <eblake AT redhat DOT com>
To: cygwin AT cygwin DOT com
Subject: Re: problem with find's -size and -exec options

On 05/14/2012 04:29 PM, j. k. colligan wrote:
> Friends -
>
> I just noticed a difference in behavior between Cygwin's "find" and
> the one in Linux, or so it seems.
>
> I was trying to locate files smaller than a given size, and thus ran
>
>     find . -size -4000c
>
> That worked, and listed the file names only for files < 4000 bytes in
> size.  But if I run
>
>     find . -size -4000c -exec ls -l {} \;
>
> it turns out that *all* files are listed!  (Plus the small ones at the
> end of the list.)  This surprised me.  In other similar cases I've run
> in the past, the earlier-in-the-command-line filters took effect
> before the exec.

Compare 'stat .' on Linux and on Cygwin.  On Linux, directories have
non-negligible size, typically a multiple of 4k that reflects the disk
blocks the directory consumes; so a directory will typically have size
4096 or more, and be filtered out by your '-size -4000c'.  On Cygwin,
directory size is typically faked as 0 (since Windows doesn't really
give Cygwin anything better to work with while still remaining
efficient).  POSIX says that the st_size of a stat() on a directory is
undefined (the Linux definition makes more sense, but Cygwin isn't
breaking POSIX by returning 0 instead of jumping through hoops to
invent a reasonable non-zero size).  Therefore, on Cygwin,
'-size -4000c' doesn't filter out '.'.
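
To check this on your own system, here is a minimal sketch.  It assumes
GNU coreutils 'stat' and GNU findutils; the temporary directory and the
file name are only illustrative.  It prints the st_size reported for a
directory, then shows which entries pass '-size -4000c'.  The numbers
depend on the platform and filesystem, so no expected output is shown.

```shell
# Create a throwaway directory holding one small file (names are illustrative).
tmp=$(mktemp -d)
printf 'hello' > "$tmp/small.txt"

# st_size as reported by stat() for the directory itself.
# Per the explanation above: typically 4096 on Linux, typically 0 on Cygwin.
dir_size=$(stat -c '%s' "$tmp")
echo "directory st_size: $dir_size"

# find applies -size to every entry it visits, directories included,
# so '.' is printed whenever its reported size is under 4000 bytes.
matches=$(cd "$tmp" && find . -size -4000c)
echo "$matches"

rm -rf "$tmp"
```

If '.' shows up in the second listing, your platform reports a small
directory size and you will see the same behavior as the original
poster.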

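The reason -exec is listing everything is that on Cygwin, '.' passes
the -size test, so find ends up running 'ls -l .', and ls given a
directory lists that directory's entire contents.  The small files are
then listed again individually as find reaches them.  You can use
'-type f' to filter out directories.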
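
As a rough illustration of that fix (again assuming GNU findutils and
coreutils, with made-up file names), the sketch below builds a scratch
directory with one small and one large file and restricts the match to
regular files.  The last command shows an alternative that is my own
suggestion rather than part of the advice above: 'ls -ld' lists a
directory entry itself instead of its contents.

```shell
# Scratch directory with one file under and one file over the 4000-byte limit.
tmp=$(mktemp -d)
printf 'small' > "$tmp/small.txt"
head -c 5000 /dev/zero > "$tmp/big.bin"

# -type f drops directories before -size is even consulted,
# so the reported directory size no longer matters.
files_only=$(cd "$tmp" && find . -type f -size -4000c)
echo "$files_only"

# The same filter combined with -exec: ls -l now only ever sees regular files.
(cd "$tmp" && find . -type f -size -4000c -exec ls -l {} \;)

# Alternative: keep directories in the match, but use ls -ld so a matching
# directory is listed as a single entry rather than expanded.
(cd "$tmp" && find . -size -4000c -exec ls -ld {} \;)

rm -rf "$tmp"
```

With '-type f' in place the result should be the same on Linux and on
Cygwin, since only regular files have a st_size that both agree on.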

> Same unexpected (to me) result.  Am I way outta whack here, or is this
> a real problem?

I'll let you draw your own answer to that question :)  But I don't see
any way to change findutils or cygwin to work around this difference in
directory st_size.

-- 
Eric Blake   eblake AT redhat DOT com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


