Subject: Re: updatedb broken as of findutils 4.8.0-1 due to bigram.exe no
 longer being provided
To: "cygwin@cygwin.com" <cygwin@cygwin.com>
References: <986736274.144968.1630167325057.ref@mail.yahoo.com>
 <986736274.144968.1630167325057@mail.yahoo.com>
 <a60ffa68-274a-5072-c90a-0dce7bc93431@harkless.org>
 <3457cee1-18b5-2916-adee-afdfaf9769ea@t-online.de>
From: Dan Harkless <cygwin-list21@harkless.org>
Message-ID: <525a832a-78fd-5a32-e195-5747120da922@harkless.org>
Date: Sun, 29 Aug 2021 05:06:37 -0700
Cc: bug-findutils@gnu.org

On 8/29/2021 4:02 AM, Hans-Bernhard Bröker wrote:
> On 28.08.2021 at 18:23, Dan Harkless wrote:
>> Looks like it's because in findutils 4.8.0-1, the bigram.exe program 
>> is no longer provided, but the /usr/bin/updatedb script (still) 
>> depends on it being there:
>      [...]
>>      + for binary in $find $frcode $bigram $code
>>      + checkbinary /usr/libexec/frcode
>
> The version of updatedb in the 4.8.0-1 package does not actually 
> contain those lines.  Mention of both $bigram and $code has been 
> removed from the loop construct (and from everywhere else in the script).
>
> That's because the old format of find databases, which is the only one 
> actually using bigram and code, was removed from updatedb as of 
> findutils version 4.7, so there really cannot be a need for the bigram 
> tool any more.

Argh!  So sorry for the false report!  I completely forgot that, years
back, I had made a locally patched version of Cygwin updatedb 4.6.0-1
(which comes earlier in my PATH) to troubleshoot and work around
problems I was having with the tool.

I have 12M+ pathnames on my main Windows system, and updatedb suddenly
went from taking less than an hour to taking more than 24 hours, so
that one run would collide with the next job.

It was very awkward to troubleshoot what was going on without a 'find'
log to 'tail', so I patched my local copy of updatedb to write to an
intermediate file, rather than piping directly into 'sort'.
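
A minimal sketch of that intermediate-file idea follows. It is not the
actual patch and not the stock updatedb code; the variable names, the
temporary directories, and the use of LC_ALL=C are assumptions made only
for illustration.

```shell
#!/bin/sh
# Illustrative only: the stock updatedb pipes find into sort instead.
set -eu

scan_root=$(mktemp -d)   # stand-in for the tree being indexed
work_dir=$(mktemp -d)    # stand-in for updatedb's temporary directory
touch "$scan_root/b.txt" "$scan_root/a.txt"

raw_list="$work_dir/find.out"
sorted_list="$work_dir/find.sorted"

# Stage 1: write the raw pathname list to a file, so progress can be
# watched from another terminal with: tail -f "$raw_list"
find "$scan_root" -print > "$raw_list"

# Stage 2: sort the finished list as a separate step.
# LC_ALL=C (an assumption here) gives plain byte-order sorting.
LC_ALL=C sort -o "$sorted_list" "$raw_list"

wc -l < "$sorted_list"
```

The trade-off is the one raised later in this message: the whole
unsorted list lands on disk before the sort starts.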

Another problem was that, although I have 24 GB of RAM on my system, I
would get low-memory popup warnings from the OS whenever the sort
kicked in.  (The warnings misplace the blame on Firefox, because I
usually have big sessions running that take even more RAM than the sort.)

I was hoping that running sort on a _file_ rather than stdin might
reduce its RAM use enough to avoid the warning, but unfortunately (and
unsurprisingly) I still get it with the intermediate file.  This is
just a warning, though; as far as I know, it has never actually run out
of RAM.
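
For reference, GNU sort has a -S (--buffer-size) option that caps its
in-memory buffer and a -T (--temporary-directory) option that says where
spill files go.  The sketch below only shows the invocation; the 64M
figure and the file names are arbitrary assumptions, and whether a cap
like this would silence the Windows warning is untested.

```shell
#!/bin/sh
# Hypothetical example of capping sort's memory; not part of updatedb.
set -eu

work_dir=$(mktemp -d)
raw_list="$work_dir/find.out"
sorted_list="$work_dir/find.sorted"

# Tiny stand-in for the real multi-million-line pathname list.
printf '%s\n' /c/zeta /c/alpha /c/mid > "$raw_list"

# -S limits the main-memory buffer; -T names the directory for the
# temporary files sort writes when the input does not fit the buffer.
LC_ALL=C sort -S 64M -T "$work_dir" -o "$sorted_list" "$raw_list"

cat "$sorted_list"
```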

The final problem I addressed in my patched version was some missing
error-checking, which left me with _no_ filename DB when the update
failed, rather than at least keeping the one from last time.
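
One common way to get that behavior is to build the new database under
a temporary name and move it into place only if the build succeeded.
The sketch below assumes that approach; replace_db_if_ok and the file
names are hypothetical and are not taken from updatedb or from the
patch.

```shell
#!/bin/sh
set -u

# replace_db_if_ok NEW_DB LIVE_DB BUILD_COMMAND...
# Hypothetical helper: runs BUILD_COMMAND (expected to write NEW_DB) and
# replaces LIVE_DB only if the command succeeded and NEW_DB is non-empty.
replace_db_if_ok() {
    new_db=$1
    live_db=$2
    shift 2
    if "$@" && [ -s "$new_db" ]; then
        mv -f "$new_db" "$live_db"
    else
        echo "update failed; keeping previous $live_db" >&2
        rm -f "$new_db"
        return 1
    fi
}

demo_dir=$(mktemp -d)
live="$demo_dir/locatedb"
echo "previous database" > "$live"

# Simulate a failed build: the previous database must survive.
replace_db_if_ok "$live.n" "$live" sh -c 'exit 1' || true
cat "$live"
```

Because the live file is only touched by the final mv, a failure at any
earlier step leaves last time's database usable.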

I could send along my patches, but I don't know that I've solved these 
issues in a general enough way.  For instance, my 12 million+ pathnames 
come out to about 1.4 GiB of UNIX-linefeed-separated UTF-8 strings.  
Writing that much to my HD is not a concern, but obviously some people 
might not want to write that much every time to, say, a small 
flash-based device.
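
As a rough check on those figures, the average works out to about 125
bytes per pathname.  The sketch treats "12M+" as exactly 12,000,000,
which is an assumption, so the true average is somewhat lower.

```shell
#!/bin/sh
# Figures from the paragraph above; path_count is a lower bound.
total_gib=1.4
path_count=12000000

# 1 GiB = 1024 * 1024 * 1024 bytes; awk handles the floating-point math.
avg_bytes=$(awk -v gib="$total_gib" -v n="$path_count" \
    'BEGIN { printf "%.0f", gib * 1024 * 1024 * 1024 / n }')

echo "average bytes per pathname (including newline): $avg_bytes"
```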

Thoughts?

-- 
Dan Harkless




