Message-ID: <4CBCB044.7040407@omegacrash.net>
Date: Mon, 18 Oct 2010 13:38:28 -0700
To: cygwin AT cygwin DOT com
Subject: makewhatis bug (man-1.6f-1)
From: Kevin Goodsell

makewhatis from man-1.6f-1 produces incorrectly formatted output. Here's
an example comparing the badly formatted results to the correctly
formatted results:

$ man -k ssh
git [] (1) - shell - Restricted login shell for GIT-only SSH access
ssh [] (1) - add - adds RSA or DSA identities to the authentication agent
ssh [] (1) - agent - authentication agent
ssh [] (1) - copy-id - install your public key in a remote machine's authorized_keys
ssh [] (1) - keygen - authentication key generation, management and conversion
ssh [] (1) - keyscan - gather ssh public keys
ssh [] (1) - OpenSSH SSH client (remote login program)
ssh [] (8) - keysign - ssh helper program for host - based authentication
ssh [] (8) - pkcs11-helper - ssh - agent helper program for PKCS#11 support
ssh_config [] (5) - OpenSSH SSH client configuration files
sshd [] (8) - OpenSSH SSH daemon
sshd_config [] (5) - OpenSSH SSH daemon configuration file
XAllocClassHint [] (3) - allocate class hints structure and set or read a window's WM_CLASS property
XClassHint [] (3) - allocate class hints structure and set or read a window's WM_CLASS property
XGetClassHint [] (3) - allocate class hints structure and set or read a window's WM_CLASS property
XSetClassHint [] (3) - allocate class hints structure and set or read a window's WM_CLASS property

Using the previous version:

$ man -k ssh
git-shell (1) - Restricted login shell for GIT-only SSH access
ssh (1) - OpenSSH SSH client (remote login program)
ssh_config (5) - OpenSSH SSH client configuration files
ssh-add (1) - adds RSA or DSA identities to the authentication agent
ssh-agent (1) - authentication agent
ssh-copy-id (1) - install your public key in a remote machine's authorized_keys
sshd (8) - OpenSSH SSH daemon
sshd_config (5) - OpenSSH SSH daemon configuration file
ssh-keygen (1) - authentication key generation, management and conversion
ssh-keyscan (1) - gather ssh public keys
ssh-keysign (8) - ssh helper program for host-based authentication
ssh-pkcs11-helper (8) - ssh-agent helper program for PKCS#11 support
XAllocClassHint (3) - allocate class hints structure and set or read a window's WM_CLASS property
XClassHint [XAllocClassHint] (3) - allocate class hints structure and set or read a window's WM_CLASS property
XGetClassHint [XAllocClassHint] (3) - allocate class hints structure and set or read a window's WM_CLASS property
XSetClassHint [XAllocClassHint] (3) - allocate class hints structure and set or read a window's WM_CLASS property

The problem is in the whatis database itself, and is caused by a bug in
makewhatis in the upstream distribution. From the function do_one in the
section of awk code:

      use_zcat = match(filename,"\\.Z$") ||
                 match(filename,"\\.z$") || match(filename,"\\.gz$");
      if (!use_zcat)
        use_bzcat = match(filename,"\\.bz2");
      if(!use_bzcat)
        use_lzcat = match(filename,"\\.lzma");
      if (use_zcat || use_bzcat || use_lzcat ) {
        filename_no_gz = substr(filename, 0, RSTART - 1);
      } else {
        filename_no_gz = filename;
      }

When filename ends in .z, .Z, or .gz, use_zcat gets set, and match() sets
the variable RSTART.
The check for a .bz2 file is careful not to call match() if the file was
already determined to be .gz (etc.), but the newer .lzma-checking code
fails to check for all earlier possibilities, and invokes match() again.
This trashes RSTART (a failed match() resets it to 0), so the later
substr() call is asked for a length of -1 and returns an empty string
instead of the filename without its suffix. How exactly this produces the
weird output above is left as an exercise for the reader, but fixing the
second 'if' fixes the problem.

A simple patch follows. It doesn't attempt to address the underlying
problem (that the decompression handling needs to be generalized); it's
just a quick fix. Interestingly, there are other patches floating around
that add support for .xz files and repeat the .lzma mistake, making the
bug even worse.

-Kevin

--- makewhatis 2010-10-07 13:47:42.578125000 -0700
+++ makewhatis.fixed 2010-10-09 10:06:44.234375000 -0700
@@ -268,7 +268,7 @@
                  match(filename,"\\.z$") || match(filename,"\\.gz$");
       if (!use_zcat)
         use_bzcat = match(filename,"\\.bz2");
-      if(!use_bzcat)
+      if(!use_zcat && !use_bzcat)
         use_lzcat = match(filename,"\\.lzma");
       if (use_zcat || use_bzcat || use_lzcat ) {
         filename_no_gz = substr(filename, 0, RSTART - 1);
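
For anyone who wants to see the RSTART clobbering in isolation, here is a
minimal sketch that runs the same chain of match() calls outside of
makewhatis. The strip_suffix helper and its "buggy"/"fixed" mode argument
are made up for this illustration and are not part of makewhatis. The
sketch also passes a start index of 1 to substr() rather than the 0 used
in the original, to sidestep any implementation-specific handling of a
start index below 1.

```shell
#!/bin/sh
# strip_suffix MODE FILENAME
# Hypothetical helper: MODE is "buggy" (guard as in the original code)
# or "fixed" (guard as in the patch). Prints FILENAME without its
# compression suffix, or an empty line when RSTART has been clobbered.
strip_suffix() {
    awk -v mode="$1" -v filename="$2" 'BEGIN {
        # Each successful match() leaves the match position in RSTART.
        use_zcat = match(filename, "\\.Z$") || match(filename, "\\.z$") || match(filename, "\\.gz$")
        if (!use_zcat)
            use_bzcat = match(filename, "\\.bz2")

        # buggy: only use_bzcat is consulted, so a .gz name still reaches
        #        the .lzma match(), which fails and resets RSTART to 0.
        # fixed: both flags are consulted, so RSTART is left alone.
        skip_lzma = (mode == "fixed") ? (use_zcat || use_bzcat) : use_bzcat
        if (!skip_lzma)
            use_lzcat = match(filename, "\\.lzma")

        if (use_zcat || use_bzcat || use_lzcat)
            print substr(filename, 1, RSTART - 1)   # start index 1: an assumption of this sketch
        else
            print filename
    }'
}

printf 'buggy: [%s]\n' "$(strip_suffix buggy ssh-add.1.gz)"   # prints "buggy: []"
printf 'fixed: [%s]\n' "$(strip_suffix fixed ssh-add.1.gz)"   # prints "fixed: [ssh-add.1]"
```

Only names matched by the first check (.Z, .z, .gz) are affected. A .bz2
name sets use_bzcat, which already skips the .lzma test, and a .lzma name
is handled by the last match() itself, so both come out right in either
mode.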