Message-ID: <3E39923A.3040907@mif.vu.lt>
Date: Thu, 30 Jan 2003 21:59:38 +0100
From: Laurynas Biveinis
Organization: VU MIF
To: DJGPP Workers
Subject: small solve_symlinks speedup
Reply-To: djgpp-workers AT delorie DOT com

In parallel with fixing bugs, and to address one of Eli's concerns,
I decided to measure how fast or slow __solve_symlinks is. I built
a profiled libc and bash, and ran a configure script from the bash
sources. Actual time measurements are lost in the noise, since this
test is I/O bound anyway, so I don't consider timing important in this
case, but here is the call graph for __solve_symlinks:

-----------------------------------------------
                0.00    0.02     305/24148       opendir [216]
                0.01    0.11    1801/24148       go32_exec [55]
                0.01    0.12    2057/24148       stat [54]
                0.03    0.23    3914/24148       __chdir [99]
                0.04    0.32    5268/24148       __open [63]
                0.04    0.32    5354/24148       __access [76]
                0.04    0.33    5449/24148       __solve_dir_symlinks [103]
[40]     0.3    0.17    1.44   24148         __solve_symlinks [40]
                0.11    0.97  114700/116604      __internal_readlink [53]
                0.11    0.15  114700/114700      advance [116]
                0.05    0.00  138848/3295731     strcpy [45]
                0.04    0.00   85143/3696692     strlen [41]
                0.02    0.00   24148/253548      strpbrk [135]
                0.00    0.00      57/32298       __djgpp_exception_processor [52]
-----------------------------------------------

Since most of the paths passed to __solve_symlinks are not symlinks,
IMHO it is reasonable to take a shortcut and check for that at the
very beginning of __solve_symlinks, thus saving a lot of
__internal_readlink calls, which do disk I/O. It is impossible to
catch all cases of non-symlinks cheaply because of /dev/ stuff, FSEXT
and so on.
So I came up with the patch below, which handles only the simple cases
of non-symlinks, but there are lots of them :) With this patch the
call graph above looks like this:

-----------------------------------------------
                0.00    0.02     305/24155       opendir [224]
                0.02    0.10    1801/24155       go32_exec [54]
                0.03    0.12    2059/24155       stat [52]
                0.05    0.22    3914/24155       __chdir [88]
                0.07    0.30    5268/24155       __open [62]
                0.07    0.30    5355/24155       __access [82]
                0.08    0.31    5453/24155       __solve_dir_symlinks [97]
[40]     0.3    0.33    1.35   24155         __solve_symlinks [40]
                0.05    1.00   86360/88266       __internal_readlink [53]
                0.11    0.10   62205/62205       advance [124]
                0.04    0.00   86360/3246148     strcpy [42]
                0.04    0.00   56026/3645530     strlen [27]
                0.01    0.00   13207/137617      strpbrk [186]
                0.00    0.00       1/35745       __djgpp_exception_processor [41]
-----------------------------------------------

So, is the patch below worth it? Is there any cheap way to improve
fast detection of non-symlinks? I'm not checking it in yet.

On an unrelated note, why does OpenBSD on a Pentium MMX 200 MHz with
16 MB RAM run configure scripts *much* faster than DJGPP on my desktop
Pentium II 375 MHz with 192 MB RAM and W2K? I can provide the bash
profiles if anyone is interested...

-- 
Laurynas

Index: xsymlink.c
===================================================================
RCS file: /cvs/djgpp/djgpp/src/libc/compat/unistd/xsymlink.c,v
retrieving revision 1.9
diff -u -r1.9 xsymlink.c
--- xsymlink.c	30 Jan 2003 19:29:28 -0000	1.9
+++ xsymlink.c	30 Jan 2003 19:42:53 -0000
@@ -53,7 +53,20 @@
 
    strcpy(__real_path, __symlink_path);
 
+   /* Take a shortcut by checking if source path is a simple file or
+      directory. ``Simple'' in the sense of DOS, i.e. no /dev/env etc.  */
+   old_errno = errno;
+   bytes_copied = __internal_readlink(__symlink_path, 0, fn_buf, FILENAME_MAX);
+   /* If __internal_readlink finds the path specified but it is not a symlink,
+      it returns -1 and sets errno to EINVAL.  */
+   if ((bytes_copied == -1) && (errno == EINVAL))
+   {
+      errno = old_errno;
+      return 1;
+   }
+   errno = old_errno;
+
    /* Begin by start pointing at the first character and end pointing
       at the first path separator.  In the cases like "/foo" end will
       point to the next path separator.  In all cases, if there are no