Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <002701c17d4c$135cf450$ccef85ce@amr.corp.intel.com>
From: "Tim Prince"
To: "Ralf Habacker", "Cygwin"
References: <006401c17cd6$3c9e2fd0$9a5f07d5 AT BRAMSCHE>
Subject: Re: Old Thread: Cygwin Performance
Date: Tue, 4 Dec 2001 21:17:01 -0800
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_0024_01C17D09.039D4260"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200

------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

I suggest the timing be done with the attached lmbench replacement for
gettimeofday(), written in terms of the Windows QueryPerformanceCounter()
and QueryPerformanceFrequency() calls.  This provides sufficient timer
resolution to run the tests quickly, obtain accurate cache latency data, and
make the cpu clock rate detection work as well as it does on linux.
Unfortunately, some of the file system and communication latencies appear to
be way out of line.  Even so, you have as many tests working correctly as, or
more than, I have seen in earlier attempts to port lmbench to Windows.

----- Original Message -----
From: "Ralf Habacker"
To: "Cygwin"
Sent: Tuesday, December 04, 2001 7:13 AM
Subject: RE: Old Thread: Cygwin Performance

> > -----Original Message-----
> > From: Tim Prince [mailto:tprince AT computer DOT org]
> > Sent: Sunday, December 02, 2001 10:58 PM
> > To: Ralf Habacker
> > Cc: Cygwin
> > Subject: Re: Old Thread: Cygwin Performance
> >
> >
> > Your patch adds lib_cygwin.c to the list of required source files, yet that
> > new file is not included.
>
> Sorry, I only compared the original source files with the patched ones, so it fell through.
> It's appended.
>
> > Also, it causes Makefile to invoke the 'get -s' command, of whose function I am not aware.
>
> I'm not aware of it either. I noticed this in the Makefile, but I ignored it :-)
> >
> > On my laptop, running linux, the lmbench-2beta2 version corrects a hang in
> > the "stable version" code which makes a network connection.  Perhaps that is
> > not supported anyway in your cygwin version.
> >
> > ----- Original Message -----
> > From: "Ralf Habacker"
> > To: "Tim Prince"
> > Cc: "Cygwin"
> > Sent: Sunday, December 02, 2001 10:29 AM
> > Subject: RE: Old Thread: Cygwin Performance
> >
> >
> > > > I'd suggest you offer your patch to the lmbench maintainers.  At one time,
> > > > they were talking about supporting something for Windows.  If they don't
> > > > adopt it, I suppose the other alternative is to offer to maintain a Cygwin
> > > > port as an optional Cygwin package.  I'd certainly like to try your version.
> > >
> > > Perhaps it is best that you look at the patch before it is offered to the lmbench maintainers.
> > > I should note some things about the patch:
> > >
> > > 1. It emulates the rpc functions by adding a file "lib_cygwin.c" which contains empty rpc_...
> > > functions, so that the rpc functions are disabled and will not be tested.
> > >
> > > 2. Because the makefile does not have any platform-dependent parts, generating lat_rpc.exe is
> > > disabled.
> > >
> > > 3. In scripts/lmbench I have added some ' echo -n "*" ' to give visible feedback during the
> > > long execution time of some benchmarks.
> > >
> > > 4. One problem I have noticed is with "lat_select": it hangs during operation.
> > >
> > > 5. Because I have no lmbench running times from other platforms to compare with, I can't say
> > > if this is okay. Some benchmarks need several minutes to run, but this may be okay.
> > >
> > > Regards
> > > Ralf
> > >
> > > > ----- Original Message -----
> > > > From: "Ralf Habacker"
> > > > To: "Tim Prince"
> > > > Cc: "Cygwin"
> > > > Sent: Saturday, December 01, 2001 11:44 AM
> > > > Subject: RE: Old Thread: Cygwin Performance
> > > >
> > > >
> > > > > > cygwin should have made some improvements in piping since then.  Amazing the
> > > > > > things I had time to do last year.  At that time, I got over a few of the
> > > > > > linux specific functions by the use of Chuck Wilson's useful packages, some
> > > > > > of which should be integrated into cygwin now.  I commented out sections of
> > > > > > lmbench which I couldn't figure out how to port.  This would be a useful
> > > > > > port, particularly in view of the new performance issues brought up by XP.
> > > > >
> > > > > I have got lmbench 2.0 running on cygwin with some patches (removing rpc functions).
> > > > >
> > > > > Is there anyone who could verify this patch? To whom should I send this patch?
> > > > >
> > > > > Regards
> > > > > Ralf
> > > > >
> > > > > > However, several of the organizations involved in lmbench are trying to stay
> > > > > > clear of Bill Gates' vendetta against use of open software together with his
> > > > > > products.  I was not employed by such an organization at the time I was
> > > > > > beating on lmbench.
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Piyush Kumar"
> > > > > > To: "Cygwin AT Cygwin. Com"
> > > > > > Sent: Friday, November 30, 2001 6:49 AM
> > > > > > Subject: Old Thread: Cygwin Performance
> > > > > >
> > > > > >
> > > > > > > I picked this old thread from Oct 2000!!!
> > > > > > > Tim reports that cygwin falls short in
> > > > > > > performance compared to a linux box by a
> > > > > > > factor of 2 using lmbench. Is it still
> > > > > > > the case? Or have things improved since
> > > > > > > Oct 13 (Unlucky date!! ;)??
> > > > > > >
> > > > > > > I was trying to compile lmbench 2.0 (Patch 2)
> > > > > > > on my cygwin, no luck!!!! I couldn't compile it!
> > > > > > > Has anyone here tried it before?? Any luck?
> > > > > > > I would be really interested in a lmbench port
> > > > > > > on cygwin! If someone has already done it, please
> > > > > > > let me know!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > --Piyush
> > > > > > >
> > > > > > >
> > > > > > > ============================================================= An Old Thread
> > > > > > >
> > > > > > > Re: Cygwin Performance Info
> > > > > > > To: , "Chris Abbey" at chartermi dot net>
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > > From: "Tim Prince"
> > > > > > > Date: Fri, 13 Oct 2000 19:12:40 -0700
> > > > > > > References: <4 DOT 3 DOT 2 DOT 7 DOT 0 DOT 20001013184237 DOT 00b6cd70 AT pop DOT bresnanlink DOT net>
> > > > > > >
> > > > > > > ------------------------------------------------------------------------------
> > > > > > >
> > > > > > > When I attempted to run lmbench on this old box both under linux and cygwin,
> > > > > > > there were some tests on which cygwin/w2k fell short of linux by a factor of
> > > > > > > 2 or more (opening files, pipe throughput, and the like), and then there
> > > > > > > were the cache statistics on which cygwin beat linux by a small margin.  I
> > > > > > > was expecting lmbench to become better adapted to cygwin, but I have no news
> > > > > > > there.
> > > > > > > ----- Original Message -----
> > > > > > > From: "Chris Abbey"
> > > > > > > To:
> > > > > > > Sent: Friday, October 13, 2000 4:51 PM
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > >
> > > > > > >
> > > > > > > > At 19:23 10/13/00 -0400, Laurence F. Wood wrote:
> > > > > > > > >Can someone tell me where the performance hit is in cygwin unix
> > > > > > > > >emulation?
> > > > > > > >
> > > > > > > > whichever part you use the most inside your tightest inner loop.
> > > > > > > >
> > > > > > > > seriously.
> > > > > > > >
> > > > > > > > that's a big huge open ended question (not about cygwin, about ANY
> > > > > > > > library/platform) that is as specific to your application as you can
> > > > > > > > get. For example, if you spend 75% of your computing day manipulating
> > > > > > > > text files and piping them and grepping them and running file utils
> > > > > > > > against them then the cr/lf translation may be a big hit for you.
> > > > > > > > On the other hand if most of your computation in a day is spent answering
> > > > > > > > requests that come in on tcp/ip sockets then the remapping of winsock
> > > > > > > > to netinet.h functions may be your major headache. (note, I'm not trying
> > > > > > > > to imply that either function has a performance problem, merely that they
> > > > > > > > would be representative places that would have high invocation counts
> > > > > > > > in the course of the given activity.)
> > > > > > > >
> > > > > > > > To really answer that for your application/workload then you need to
> > > > > > > > get some form of performance detailing that can tell you how much time
> > > > > > > > you are spending in any given method and how often it's called.
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Want to unsubscribe from this list?
> > > > > > > > Send a message to cygwin-unsubscribe AT sourceware DOT cygnus DOT com
> > > > > > >
> > > > > > > --
> > > > > > > Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
> > > > > > > Bug reporting: http://cygwin.com/bugs.html
> > > > > > > Documentation: http://cygwin.com/docs.html
> > > > > > > FAQ: http://cygwin.com/faq/

------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: application/octet-stream; name="lib_timing.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="lib_timing.c"

/*
 * a timing utilities library
 *
 * Requires 64bit integers to work.
 *
 * %W% %@%
 *
 * Copyright (c) 1994-1998 Larry McVoy.
 */
#define	_LIB /* bench.h needs this */
#include "bench.h"

#define	nz(x)	((x) == 0 ? 1 : (x))

/*
 * I know you think these should be 2^10 and 2^20, but people are quoting
 * disk sizes in powers of 10, and bandwidths are all power of ten.
 * Deal with it.
 */
#define	MB	(1000*1000.0)
#define	KB	(1000.0)

static struct timeval start_tv, stop_tv;
FILE		*ftiming;
volatile uint64	use_result_dummy;	/* !static for optimizers. */
static uint64	iterations;
static void	init_timing(void);

#if defined(hpux) || defined(__hpux)
#include <sys/mman.h>
#endif

#ifdef RUSAGE
#include <sys/resource.h>
#define	SECS(tv)	(tv.tv_sec + tv.tv_usec / 1000000.0)
#define	mine(f)		(int)(ru_stop.f - ru_start.f)

static struct rusage ru_start, ru_stop;

void
rusage(void)
{
	double sys, user, idle;
	double per;

	sys = SECS(ru_stop.ru_stime) - SECS(ru_start.ru_stime);
	user = SECS(ru_stop.ru_utime) - SECS(ru_start.ru_utime);
	idle = timespent() - (sys + user);
	per = idle / timespent() * 100;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "real=%.2f sys=%.2f user=%.2f idle=%.2f stall=%.0f%% ",
	    timespent(), sys, user, idle, per);
	fprintf(ftiming, "rd=%d wr=%d min=%d maj=%d ctx=%d\n",
	    mine(ru_inblock), mine(ru_oublock),
	    mine(ru_minflt), mine(ru_majflt),
	    mine(ru_nvcsw) + mine(ru_nivcsw));
}

#endif /* RUSAGE */
/*
 * Redirect output someplace else.
 */
void
timing(FILE *out)
{
	ftiming = out;
}

/*
 * Start timing now.
 */
void
start(struct timeval *tv)
{
	if (tv == NULL) {
		tv = &start_tv;
	}
#ifdef RUSAGE
	getrusage(RUSAGE_SELF, &ru_start);
#endif
	(void) gettimeofday(tv, (struct timezone *) 0);
}

/*
 * Stop timing and return real time in microseconds.
 */
uint64
stop(struct timeval *begin, struct timeval *end)
{
	if (end == NULL) {
		end = &stop_tv;
	}
	(void) gettimeofday(end, (struct timezone *) 0);
#ifdef RUSAGE
	getrusage(RUSAGE_SELF, &ru_stop);
#endif

	if (begin == NULL) {
		begin = &start_tv;
	}
	return tvdelta(begin, end);
}

uint64
now(void)
{
	struct timeval t;
	uint64 m;

	(void) gettimeofday(&t, (struct timezone *) 0);
	m = t.tv_sec;
	m *= 1000000;
	m += t.tv_usec;
	return (m);
}

double
Now(void)
{
	struct timeval t;

	(void) gettimeofday(&t, (struct timezone *) 0);
	return (t.tv_sec * 1000000.0 + t.tv_usec);
}

uint64
delta(void)
{
	static struct timeval last;
	struct timeval t;
	struct timeval diff;
	uint64 m;

	(void) gettimeofday(&t, (struct timezone *) 0);
	if (last.tv_usec) {
		tvsub(&diff, &t, &last);
		last = t;
		m = diff.tv_sec;
		m *= 1000000;
		m += diff.tv_usec;
		return (m);
	} else {
		last = t;
		return (0);
	}
}

double
Delta(void)
{
	struct timeval t;
	struct timeval diff;

	(void) gettimeofday(&t, (struct timezone *) 0);
	tvsub(&diff, &t, &start_tv);
	return (diff.tv_sec + diff.tv_usec / 1000000.0);
}

void
save_n(uint64 n)
{
	iterations = n;
}

uint64
get_n(void)
{
	return (iterations);
}

/*
 * Make the time spend be usecs.
 */
void
settime(uint64 usecs)
{
	bzero((void*)&start_tv, sizeof(start_tv));
	stop_tv.tv_sec = usecs / 1000000;
	stop_tv.tv_usec = usecs % 1000000;
}

void
bandwidth(uint64 bytes, uint64 times, int verbose)
{
	struct timeval tdiff;
	double mb, secs;

	tvsub(&tdiff, &stop_tv, &start_tv);
	secs = tdiff.tv_sec;
	secs *= 1000000;
	secs += tdiff.tv_usec;
	secs /= 1000000;
	secs /= times;
	mb = bytes / MB;
	if (!ftiming) ftiming = stderr;
	if (verbose) {
		(void) fprintf(ftiming,
		    "%.4f MB in %.4f secs, %.4f MB/sec\n",
		    mb, secs, mb/secs);
	} else {
		if (mb < 1) {
			(void) fprintf(ftiming, "%.6f ", mb);
		} else {
			(void) fprintf(ftiming, "%.2f ", mb);
		}
		if (mb / secs < 1) {
			(void) fprintf(ftiming, "%.6f\n", mb/secs);
		} else {
			(void) fprintf(ftiming, "%.2f\n", mb/secs);
		}
	}
}

void
kb(uint64 bytes)
{
	struct timeval td;
	double s, bs;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	bs = bytes / nz(s);
	if (!ftiming) ftiming = stderr;
	(void) fprintf(ftiming, "%.0f KB/sec\n", bs / KB);
}

void
mb(uint64 bytes)
{
	struct timeval td;
	double s, bs;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	bs = bytes / nz(s);
	if (!ftiming) ftiming = stderr;
	(void) fprintf(ftiming, "%.2f MB/sec\n", bs / MB);
}

void
latency(uint64 xfers, uint64 size)
{
	struct timeval td;
	double s;

	if (!ftiming) ftiming = stderr;
	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (xfers > 1) {
		fprintf(ftiming, "%d %dKB xfers in %.2f secs, ",
		    (int) xfers, (int) (size / KB), s);
	} else {
		fprintf(ftiming, "%.1fKB in ", size / KB);
	}
	if ((s * 1000 / xfers) > 100) {
		fprintf(ftiming, "%.0f millisec%s, ",
		    s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");
	} else {
		fprintf(ftiming, "%.4f millisec%s, ",
		    s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");
	}
	if (((xfers * size) / (MB * s)) > 1) {
		fprintf(ftiming, "%.2f MB/sec\n", (xfers * size) / (MB * s));
	} else {
		fprintf(ftiming, "%.2f KB/sec\n", (xfers * size) / (KB * s));
	}
}

void
context(uint64 xfers)
{
	struct timeval td;
	double s;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming,
	    "%d context switches in %.2f secs, %.0f microsec/switch\n",
	    (int)xfers, s, s * 1000000 / xfers);
}

void
nano(char *s, uint64 n)
{
	struct timeval td;
	double micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro *= 1000;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %.0f nanoseconds\n", s, micro / n);
}

void
micro(char *s, uint64 n)
{
	struct timeval td;
	double micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro /= n;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);
#if 0
	if (micro >= 100) {
		fprintf(ftiming, "%s: %.1f microseconds\n", s, micro);
	} else if (micro >= 10) {
		fprintf(ftiming, "%s: %.3f microseconds\n", s, micro);
	} else {
		fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);
	}
#endif
}

void
micromb(uint64 sz, uint64 n)
{
	struct timeval td;
	double mb, micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro /= n;
	mb = sz;
	mb /= MB;
	if (!ftiming) ftiming = stderr;
	if (micro >= 10) {
		fprintf(ftiming, "%.6f %.0f\n", mb, micro);
	} else {
		fprintf(ftiming, "%.6f %.3f\n", mb, micro);
	}
}

void
milli(char *s, uint64 n)
{
	struct timeval td;
	uint64 milli;

	tvsub(&td, &stop_tv, &start_tv);
	milli = td.tv_sec * 1000 + td.tv_usec / 1000;
	milli /= n;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %d milliseconds\n", s, (int)milli);
}

void
ptime(uint64 n)
{
	struct timeval td;
	double s;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming,
	    "%d in %.2f secs, %.0f microseconds each\n",
	    (int)n, s, s * 1000000 / n);
}

uint64
tvdelta(struct timeval *start, struct timeval *stop)
{
	struct timeval td;
	uint64 usecs;

	tvsub(&td, stop, start);
	usecs = td.tv_sec;
	usecs *= 1000000;
	usecs += td.tv_usec;
	return (usecs);
}

void
tvsub(struct timeval * tdiff, struct timeval * t1, struct timeval * t0)
{
	tdiff->tv_sec = t1->tv_sec - t0->tv_sec;
	tdiff->tv_usec = t1->tv_usec - t0->tv_usec;
	if (tdiff->tv_usec < 0 && tdiff->tv_sec > 0) {
		tdiff->tv_sec--;
		tdiff->tv_usec += 1000000;
		assert(tdiff->tv_usec >= 0);
	}

	/* time shouldn't go backwards!!! */
	if (tdiff->tv_usec < 0 || t1->tv_sec < t0->tv_sec) {
		tdiff->tv_sec = 0;
		tdiff->tv_usec = 0;
	}
}

uint64
gettime(void)
{
	return (tvdelta(&start_tv, &stop_tv));
}

double
timespent(void)
{
	struct timeval td;

	tvsub(&td, &stop_tv, &start_tv);
	return (td.tv_sec + td.tv_usec / 1000000.0);
}

static char p64buf[10][20];
static int n;

char *
p64(uint64 big)
{
	char *s = p64buf[n++];

	if (n == 10) n = 0;
#ifdef linux
	{
		int *a = (int*)&big;

		if (a[1]) {
			sprintf(s, "0x%x%08x", a[1], a[0]);
		} else {
			sprintf(s, "0x%x", a[0]);
		}
	}
#endif
#ifdef __sgi
	sprintf(s, "0x%llx", big);
#endif
	return (s);
}

char *
p64sz(uint64 big)
{
	double d = big;
	char *tags = " KMGTPE";
	int t = 0;
	char *s = p64buf[n++];

	if (n == 10) n = 0;
	while (d > 512) t++, d /= 1024;
	if (d == 0) {
		return ("0");
	}
	if (d < 100) {
		sprintf(s, "%.4f%c", d, tags[t]);
	} else {
		sprintf(s, "%.2f%c", d, tags[t]);
	}
	return (s);
}

char
last(char *s)
{
	while (*s++)
		;
	return (s[-2]);
}

int
bytes(char *s)
{
	int n = atoi(s);

	if ((last(s) == 'k') || (last(s) == 'K'))
		n *= 1024;
	if ((last(s) == 'm') || (last(s) == 'M'))
		n *= (1024 * 1024);
	return (n);
}

void
use_int(int result) { use_result_dummy += result; }

void
use_pointer(void *result) { use_result_dummy += (int)result; }

void
insertinit(result_t *r)
{
	int i;

	r->N = 0;
	for (i = 0; i < TRIES; i++) {
		r->u[i] = 0;
		r->n[i] = 1;
	}
}

/* biggest to smallest */
void
insertsort(uint64 u, uint64 n, result_t *r)
{
	int i, j;

	if (u == 0) return;

	for (i = 0; i < r->N; ++i) {
		if (u/(double)n > r->u[i]/(double)r->n[i]) {
			for (j = r->N; j > i; --j) {
				r->u[j] = r->u[j-1];
				r->n[j] = r->n[j-1];
			}
			break;
		}
	}
	r->u[i] = u;
	r->n[i] = n;
	r->N++;
}

static result_t results;

void
print_results(void)
{
	int i;

	for (i = 0; i < results.N; ++i) {
		fprintf(stderr, "%.2f ", (double)results.u[i]/results.n[i]);
	}
}

void
get_results(result_t *r)
{
	*r = results;
}

void
save_results(result_t *r)
{
	results = *r;
	save_median();
}

void
save_minimum()
{
	if (results.N == 0) {
		save_n(1);
		settime(0);
	} else {
		save_n(results.n[results.N - 1]);
		settime(results.u[results.N - 1]);
	}
}

void
save_median()
{
	int i = results.N / 2;
	uint64 u, n;

	if (results.N == 0) {
		n = 1;
		u = 0;
	} else if (results.N % 2) {
		n = results.n[i];
		u = results.u[i];
	} else {
		n = (results.n[i] + results.n[i-1]) / 2;
		u = (results.u[i] + results.u[i-1]) / 2;
	}
	save_n(n); settime(u);
}

/*
 * The inner loop tracks bench.h but uses a different results array.
 */
static long *
one_op(register long *p)
{
	BENCH_INNER(p = (long *)*p, 0);
	return (p);
}

static long *
two_op(register long *p, register long *q)
{
	BENCH_INNER(p = (long *)*q; q = (long*)*p, 0);
	return (p);
}

static long *p = (long *)&p;
static long *q = (long *)&q;

double
l_overhead(void)
{
	int i;
	uint64 N_save, u_save;
	static double overhead;
	static int initialized = 0;
	result_t one, two, r_save;

	init_timing();
	if (initialized) return (overhead);

	initialized = 1;
	if (getenv("LOOP_O")) {
		overhead = atof(getenv("LOOP_O"));
	} else {
		get_results(&r_save); N_save = get_n(); u_save = gettime();
		insertinit(&one);
		insertinit(&two);
		for (i = 0; i < TRIES; ++i) {
			use_pointer((void*)one_op(p));
			if (gettime() > t_overhead())
				insertsort(gettime() - t_overhead(), get_n(), &one);
			use_pointer((void *)two_op(p, q));
			if (gettime() > t_overhead())
				insertsort(gettime() - t_overhead(), get_n(), &two);
		}
		/*
		 * u1 = (n1 * (overhead + work))
		 * u2 = (n2 * (overhead + 2 * work))
		 * ==> overhead = 2. * u1 / n1 - u2 / n2
		 */
		save_results(&one); save_minimum();
		overhead = 2. * gettime() / (double)get_n();

		save_results(&two); save_minimum();
		overhead -= gettime() / (double)get_n();

		if (overhead < 0.) overhead = 0.;	/* Gag */

		save_results(&r_save); save_n(N_save); settime(u_save);
	}
	return (overhead);
}

/*
 * Figure out the timing overhead.  This has to track bench.h
 */
uint64
t_overhead(void)
{
	uint64 N_save, u_save;
	static int initialized = 0;
	static uint64 overhead = 0;
	struct timeval tv;
	result_t r_save;

	init_timing();
	if (initialized) return (overhead);

	initialized = 1;
	if (getenv("TIMING_O")) {
		overhead = atof(getenv("TIMING_O"));
	} else if (get_enough(0) <= 50000) {
		/* it is not in the noise, so compute it */
		int i;
		result_t r;

		get_results(&r_save); N_save = get_n(); u_save = gettime();
		insertinit(&r);
		for (i = 0; i < TRIES; ++i) {
			BENCH_INNER(gettimeofday(&tv, 0), 0);
			insertsort(gettime(), get_n(), &r);
		}
		save_results(&r);
		save_minimum();
		overhead = gettime() / get_n();

		save_results(&r_save); save_n(N_save); settime(u_save);
	}
	return (overhead);
}

/*
 * Figure out how long to run it.
 * If enough == 0, then they want us to figure it out.
 * If enough is !0 then return it unless we think it is too short.
 */
static int long_enough;
static int compute_enough();

int
get_enough(int e)
{
	init_timing();
	return (long_enough > e ? long_enough : e);
}


static void
init_timing(void)
{
	static int done = 0;

	if (done) return;
	done = 1;
	long_enough = compute_enough();
	t_overhead();
	l_overhead();
}

typedef long TYPE;

static TYPE **
enough_duration(register long N, register TYPE ** p)
{
#define	ENOUGH_DURATION_TEN(one)	one one one one one one one one one one
	while (N-- > 0) {
		ENOUGH_DURATION_TEN(p = (TYPE **) *p;);
	}
	return (p);
}

static uint64
duration(long N)
{
	uint64 usecs;
	TYPE *x = (TYPE *)&x;
	TYPE **p = (TYPE **)&x;

	start(0);
	p = enough_duration(N, p);
	usecs = stop(0, 0);
	use_pointer((void *)p);
	return (usecs);
}

/*
 * find the minimum time that work "N" takes in "tries" tests
 */
static uint64
time_N(long N)
{
	int i;
	uint64 usecs;
	result_t r;

	insertinit(&r);
	for (i = 1; i < TRIES; ++i) {
		usecs = duration(N);
		insertsort(usecs, N, &r);
	}
	save_results(&r);
	save_minimum();
	return (gettime());
}

/*
 * return the amount of work needed to run "enough" microseconds
 */
static long
find_N(int enough)
{
	int tries;
	static long N = 10000;
	static uint64 usecs = 0;

	if (!usecs) usecs = time_N(N);

	for (tries = 0; tries < 10; ++tries) {
		if (0.98 * enough < usecs && usecs < 1.02 * enough)
			return (N);
		if (usecs < 1000)
			N *= 10;
		else {
			double n = N;

			n /= usecs;
			n *= enough;
			N = n + 1;
		}
		usecs = time_N(N);
	}
	return (-1);
}

/*
 * We want to verify that small modifications proportionally affect the runtime
 */
static double test_points[] = {1.015, 1.02, 1.035};
static int
test_time(int enough)
{
	int i;
	long N;
	uint64 usecs, expected, baseline, diff;

	if ((N = find_N(enough)) <= 0)
		return (0);

	baseline = time_N(N);

	for (i = 0; i < sizeof(test_points) / sizeof(double); ++i) {
		usecs = time_N((int)((double) N * test_points[i]));
		expected = (uint64)((double)baseline * test_points[i]);
		diff = expected > usecs ? expected - usecs : usecs - expected;
		if (diff / (double)expected > 0.0025)
			return (0);
	}
	return (1);
}


/*
 * We want to find the smallest timing interval that has accurate timing
 */
static int possibilities[] = { 5000, 10000, 50000, 100000 };
static int
compute_enough()
{
	int i;

	if (getenv("ENOUGH")) {
		return (atoi(getenv("ENOUGH")));
	}
	for (i = 0; i < sizeof(possibilities) / sizeof(int); ++i) {
		if (test_time(possibilities[i]))
			return (possibilities[i]);
	}

	/*
	 * if we can't find a timing interval that is sufficient,
	 * then use SHORT as a default.
	 */
	return (SHORT);
}

/*
 * This stuff isn't really lib_timing, but ...
 */
void
morefds(void)
{
#ifdef RLIMIT_NOFILE
	struct rlimit r;

	getrlimit(RLIMIT_NOFILE, &r);
	r.rlim_cur = r.rlim_max;
	setrlimit(RLIMIT_NOFILE, &r);
#endif
}

void
touch(char *buf, int nbytes)
{
	static int psize;

	if (!psize) {
		psize = getpagesize();
	}
	while (nbytes > 0) {
		*buf = 1;
		buf += psize;
		nbytes -= psize;
	}
}

#if defined(hpux) || defined(__hpux)
int
getpagesize()
{
	return (sysconf(_SC_PAGE_SIZE));
}
#endif

#if defined(WIN32)
#if !defined(__CYGWIN__)
int
getpagesize()
{
	SYSTEM_INFO s;

	GetSystemInfo(&s);
	return ((int)s.dwPageSize);
}
#endif

/*
 * FILETIME value (100ns units) of the Unix epoch, 1970-01-01 00:00:00.
 */
LARGE_INTEGER
getFILETIMEoffset()
{
	SYSTEMTIME s;
	FILETIME f;
	LARGE_INTEGER t;

	s.wYear = 1970;
	s.wMonth = 1;
	s.wDay = 1;
	s.wHour = 0;
	s.wMinute = 0;
	s.wSecond = 0;
	s.wMilliseconds = 0;
	SystemTimeToFileTime(&s, &f);
	t.QuadPart = f.dwHighDateTime;
	t.QuadPart <<= 32;
	t.QuadPart |= f.dwLowDateTime;
	return (t);
}

/*
 * gettimeofday() replacement.  Uses the high-resolution performance counter
 * when one is available (time is then relative to the first call); otherwise
 * falls back to the system time as FILETIME, relative to the Unix epoch.
 */
int
gettimeofday(struct timeval *tv, struct timezone *tz)
{
	LARGE_INTEGER t;
	FILETIME f;
	double microseconds;
	static LARGE_INTEGER offset;
	static double frequencyToMicroseconds;
	static int initialized = 0;
	static BOOL usePerformanceCounter = 0;

	if (!initialized) {
		LARGE_INTEGER performanceFrequency;
		initialized = 1;
		usePerformanceCounter = QueryPerformanceFrequency(&performanceFrequency);
		if (usePerformanceCounter) {
			QueryPerformanceCounter(&offset);
			frequencyToMicroseconds = (double)performanceFrequency.QuadPart / 1000000.;
		} else {
			offset = getFILETIMEoffset();
			frequencyToMicroseconds = 10.;
		}
	}
	if (usePerformanceCounter) QueryPerformanceCounter(&t);
	else {
		GetSystemTimeAsFileTime(&f);
		t.QuadPart = f.dwHighDateTime;
		t.QuadPart <<= 32;
		t.QuadPart |= f.dwLowDateTime;
	}

	t.QuadPart -= offset.QuadPart;
	microseconds = (double)t.QuadPart / frequencyToMicroseconds;
	t.QuadPart = microseconds;
	tv->tv_sec = t.QuadPart / 1000000;
	tv->tv_usec = t.QuadPart % 1000000;
	return (0);
}
#endif

------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: text/plain; charset=us-ascii

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/

------=_NextPart_000_0024_01C17D09.039D4260--
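
The tick-to-timeval step at the end of the attached gettimeofday() can be
sketched on its own, in portable C, so the arithmetic can be checked without
the Windows headers.  This is only a sketch under stated assumptions: the name
ticks_to_timeval and its parameters are hypothetical and not part of lmbench,
and it uses integer arithmetic where the attachment goes through a double.  On
Windows the two inputs would come from QueryPerformanceCounter() (minus the
starting offset) and QueryPerformanceFrequency(); <sys/time.h> is assumed to be
available, as it is under Cygwin and linux.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not in lmbench): convert an elapsed tick count to a
 * struct timeval, given the counter frequency in ticks per second.
 */
void
ticks_to_timeval(int64_t ticks, int64_t freq, struct timeval *tv)
{
	assert(freq > 0 && ticks >= 0);
	/* Whole seconds first, so only the sub-second remainder is scaled. */
	tv->tv_sec = ticks / freq;
	/* Remainder is < freq; scale it to microseconds, truncating. */
	tv->tv_usec = (ticks % freq) * 1000000 / freq;
}
```

With a 10 MHz counter (the same 100ns unit as the FILETIME fallback), 25 ticks
come out as 0 seconds and 2 microseconds; splitting off the whole seconds
before scaling keeps the intermediate product small for any realistic counter
frequency.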