Mail Archives: cygwin/2001/12/05/00:18:31
------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
I suggest doing the timing with the lmbench replacement for
gettimeofday(), which is written in terms of the Windows
QueryPerformanceCounter() and QueryPerformanceFrequency() calls, as attached.
This provides sufficient timer resolution to perform the tests quickly,
obtain accurate cache latency data, and make the CPU clock rate detection
work as well as it does on Linux.
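As a rough illustration of what the attached gettimeofday() has to do, the
sketch below converts a raw counter reading into seconds and microseconds.
It is not the attached code: the attachment divides by a double-precision
scale factor, while this version uses integer arithmetic, and the names
tv_sketch and ticks_to_tv are made up for the example. The frequency
argument is assumed to be whatever QueryPerformanceFrequency() reports, or
10000000 when falling back to 100 ns FILETIME units.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timeval, so the sketch compiles anywhere. */
struct tv_sketch {
	int64_t sec;	/* whole seconds */
	int64_t usec;	/* remaining microseconds, 0..999999 */
};

/*
 * Convert a tick count into seconds and microseconds.
 * ticks: counter reading minus the reading taken at start-up
 * freq:  ticks per second (assumed to come from QueryPerformanceFrequency(),
 *        or 10000000 for 100 ns FILETIME units)
 */
static struct tv_sketch
ticks_to_tv(int64_t ticks, int64_t freq)
{
	struct tv_sketch tv;

	assert(freq > 0 && ticks >= 0);
	tv.sec = ticks / freq;
	/* Scale only the sub-second remainder to keep the product small. */
	tv.usec = (ticks % freq) * 1000000 / freq;
	return (tv);
}
```

For example, ticks_to_tv(15000000, 10000000) yields 1 second and 500000
microseconds. Splitting off the whole seconds before scaling keeps the
intermediate product small, so it should not overflow 64 bits for any
plausible counter frequency.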
Unfortunately, some of the file system and communication latencies appear to
be way out of line. You do have as many or more tests working correctly as
I have seen in attempts to port earlier lmbench versions to Windows.
----- Original Message -----
From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
To: "Cygwin" <cygwin AT sources DOT redhat DOT com>
Sent: Tuesday, December 04, 2001 7:13 AM
Subject: RE: Old Thread: Cygwin Performance
> > -----Original Message-----
> > From: Tim Prince [mailto:tprince AT computer DOT org]
> > Sent: Sunday, December 02, 2001 10:58 PM
> > To: Ralf Habacker
> > Cc: Cygwin
> > Subject: Re: Old Thread: Cygwin Performance
> >
> >
> > Your patch adds lib_cygwin.c to the list of required source files, yet that
> > new file is not included.
>
> Sorry, I only compared the original source files with the patched ones, so
> the new file slipped through.
> It's appended.
>
> > Also, it causes the Makefile to invoke the 'get -s' command, whose
> > function I am not aware of.
>
> I don't know either. I noticed this in the Makefile but
> ignored it :-)
> >
> > On my laptop, running linux, the lmbench-2beta2 version corrects a hang in
> > the "stable version" code which makes a network connection. Perhaps that is
> > not supported anyway in your cygwin version.
>
> > ----- Original Message -----
> > From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
> > To: "Tim Prince" <tprince AT computer DOT org>
> > Cc: "Cygwin" <cygwin AT sources DOT redhat DOT com>
> > Sent: Sunday, December 02, 2001 10:29 AM
> > Subject: RE: Old Thread: Cygwin Performance
> >
> >
> > > > I'd suggest you offer your patch to the lmbench maintainers. At one time,
> > > > they were talking about supporting something for Windows. If they don't
> > > > adopt it, I suppose the other alternative is to offer to maintain a Cygwin
> > > > port as an optional Cygwin package. I'd certainly like to try your version.
> > >
> > > Perhaps it would be best if you looked at the patch before it is offered
> > > to the lmbench maintainers.
> > > I should note some things about the patch:
> > >
> > > 1. It emulates the rpc functions by adding a file "lib_cygwin.c" which
> > > contains empty rpc_... functions, so that the rpc functions are disabled
> > > and will not be tested.
> > >
> > > 2. Because the makefile does not have any platform-dependent parts,
> > > generating lat_rpc.exe is disabled.
> > >
> > > 3. In scripts/lmbench I have added some ' echo -n "*" ' calls to give
> > > visible feedback during the long execution time of some benchmarks.
> > >
> > > 4. One problem I have noticed is with "lat_select": it hangs during
> > > operation.
> > >
> > > 5. Because I have no lmbench running times from other platforms to compare
> > > against, I can't say whether this is okay. Some benchmarks need several
> > > minutes to run, but this may be okay.
> > >
> > > Regards
> > > Ralf
> > >
> > > > ----- Original Message -----
> > > > From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
> > > > To: "Tim Prince" <tprince AT computer DOT org>
> > > > Cc: "Cygwin" <cygwin AT sources DOT redhat DOT com>
> > > > Sent: Saturday, December 01, 2001 11:44 AM
> > > > Subject: RE: Old Thread: Cygwin Performance
> > > >
> > > >
> > > > > >
> > > > > > cygwin should have made some improvements in piping since then. Amazing the
> > > > > > things I had time to do last year. At that time, I got over a few of the
> > > > > > linux specific functions by the use of Chuck Wilson's useful packages, some
> > > > > > of which should be integrated into cygwin now. I commented out sections of
> > > > > > lmbench which I couldn't figure out how to port. This would be a useful
> > > > > > port, particularly in view of the new performance issues brought up by XP.
> > > > >
> > > > > I have got lmbench 2.0 running on cygwin with some patches (removing the
> > > > > rpc functions).
> > > > >
> > > > > Is there anyone who could verify this patch? To whom should I send this
> > > > > patch?
> > > > >
> > > > > Regards
> > > > > Ralf
> > > > >
> > > > > > However, several of the organizations involved in lmbench are trying to stay
> > > > > > clear of Bill Gates' vendetta against use of open software together with his
> > > > > > products. I was not employed by such an organization at the time I was
> > > > > > beating on lmbench.
> > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Piyush Kumar" <piyush AT acm DOT org>
> > > > > > To: "Cygwin AT Cygwin. Com" <cygwin AT cygwin DOT com>
> > > > > > Sent: Friday, November 30, 2001 6:49 AM
> > > > > > Subject: Old Thread: Cygwin Performance
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I picked this old thread from Oct 2000!!!
> > > > > > > Tim reports that cygwin falls short by
> > > > > > > performance compared to linux box by a
> > > > > > > factor of 2 using lmbench. Is it still
> > > > > > > the case? Or have things improved since
> > > > > > > Oct 13(Unlucky date!! ;)??
> > > > > > >
> > > > > > > I was trying to compile lmbench 2.0 (Patch 2)
> > > > > > > on my cygwin, no luck!!!! I couldn't compile it!
> > > > > > > Has anyone here tried it before?? Any luck?
> > > > > > > I would be really interested in an lmbench port
> > > > > > > on cygwin! If someone has already done it, please
> > > > > > > let me know!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > --Piyush
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > ============================================================= An Old Thread
> > > > > > >
> > > > > > > Re: Cygwin Performance Info
> > > > > > > To: <cygwin at sourceware dot cygnus dot com>, "Chris Abbey" <cabbey at
> > > > > > > chartermi dot net>
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > > From: "Tim Prince" <tprince at computer dot org>
> > > > > > > Date: Fri, 13 Oct 2000 19:12:40 -0700
> > > > > > > References:
> > <4 DOT 3 DOT 2 DOT 7 DOT 0 DOT 20001013184237 DOT 00b6cd70 AT pop DOT bresnanlink DOT net>
> > > > > > >
> > > > > > > ------------------------------------------------------------------------------
> > > > > > >
> > > > > > > When I attempted to run lmbench on this old box both under linux and cygwin,
> > > > > > > there were some tests on which cygwin/w2k fell short of linux by a factor of
> > > > > > > 2 or more (opening files, pipe throughput, and the like), and then there
> > > > > > > were the cache statistics on which cygwin beat linux by a small margin. I
> > > > > > > was expecting lmbench to become better adapted to cygwin, but I have no news
> > > > > > > there.
> > > > > > > ----- Original Message -----
> > > > > > > From: "Chris Abbey" <cabbey AT chartermi DOT net>
> > > > > > > To: <cygwin AT sourceware DOT cygnus DOT com>
> > > > > > > Sent: Friday, October 13, 2000 4:51 PM
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > >
> > > > > > >
> > > > > > > > At 19:23 10/13/00 -0400, Laurence F. Wood wrote:
> > > > > > > > >Can someone tell me where the performance hit is in cygwin unix
> > > > > > > > >emulation?
> > > > > > > >
> > > > > > > > whichever part you use the most inside your tightest inner loop.
> > > > > > > >
> > > > > > > > seriously.
> > > > > > > >
> > > > > > > > that's a big huge open ended question (not about cygwin, about ANY
> > > > > > > > library/platform) that is as specific to your application as you can
> > > > > > > > get. For example, if you spend 75% of your computing day manipulating
> > > > > > > > text files and piping them and greping them and running file utils
> > > > > > > > against them then the cr/lf translation may be a big hit for you.
> > > > > > > > On the other hand if most of your computation in a day is spent answering
> > > > > > > > requests that come in on tcp/ip sockets then the remapping of winsock
> > > > > > > > to netinet.h functions may be your major headache. (note, I'm not trying
> > > > > > > > to imply that either function has a performance problem, merely that they
> > > > > > > > would be representative places that would have high invocation counts
> > > > > > > > in the course of the given activity.)
> > > > > > > >
> > > > > > > > To really answer that for your application/workload then you need to
> > > > > > > > get some form of performance detailing that can tell you how much time
> > > > > > > > you are spending in any given method and how often it's called.
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Want to unsubscribe from this list?
> > > > > > > > Send a message to cygwin-unsubscribe AT sourceware DOT cygnus DOT com
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
> > > > > > > Bug reporting: http://cygwin.com/bugs.html
> > > > > > > Documentation: http://cygwin.com/docs.html
> > > > > > > FAQ: http://cygwin.com/faq/
------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: application/octet-stream;
name="lib_timing.c"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
filename="lib_timing.c"
/*=0A=
* a timing utilities library=0A=
*=0A=
* Requires 64bit integers to work.=0A=
*=0A=
* %W% %@%=0A=
*=0A=
* Copyright (c) 1994-1998 Larry McVoy.=0A=
*/=0A=
#define _LIB /* bench.h needs this */=0A=
#include "bench.h"=0A=
=0A=
#define nz(x) ((x) =3D=3D 0 ? 1 : (x))=0A=
=0A=
/*=0A=
* I know you think these should be 2^10 and 2^20, but people are quoting=0A=
* disk sizes in powers of 10, and bandwidths are all power of ten.=0A=
* Deal with it.=0A=
*/=0A=
#define MB (1000*1000.0)=0A=
#define KB (1000.0)=0A=
=0A=
static struct timeval start_tv, stop_tv;=0A=
FILE *ftiming;=0A=
volatile uint64 use_result_dummy; /* !static for optimizers. */=0A=
static uint64 iterations;=0A=
static void init_timing(void);=0A=
=0A=
#if defined(hpux) || defined(__hpux)=0A=
#include <sys/mman.h>=0A=
#endif=0A=
=0A=
#ifdef RUSAGE=0A=
#include <sys/resource.h>=0A=
#define SECS(tv) (tv.tv_sec + tv.tv_usec / 1000000.0)=0A=
#define mine(f) (int)(ru_stop.f - ru_start.f)=0A=
=0A=
static struct rusage ru_start, ru_stop;=0A=
=0A=
void=0A=
rusage(void)=0A=
{=0A=
double sys, user, idle;=0A=
double per;=0A=
=0A=
sys =3D SECS(ru_stop.ru_stime) - SECS(ru_start.ru_stime);=0A=
user =3D SECS(ru_stop.ru_utime) - SECS(ru_start.ru_utime);=0A=
idle =3D timespent() - (sys + user);=0A=
per =3D idle / timespent() * 100;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming, "real=3D%.2f sys=3D%.2f user=3D%.2f idle=3D%.2f =
stall=3D%.0f%% ",=0A=
timespent(), sys, user, idle, per);=0A=
fprintf(ftiming, "rd=3D%d wr=3D%d min=3D%d maj=3D%d ctx=3D%d\n",=0A=
mine(ru_inblock), mine(ru_oublock),=0A=
mine(ru_minflt), mine(ru_majflt),=0A=
mine(ru_nvcsw) + mine(ru_nivcsw));=0A=
}=0A=
=0A=
#endif /* RUSAGE */=0A=
/*=0A=
* Redirect output someplace else.=0A=
*/=0A=
void=0A=
timing(FILE *out)=0A=
{=0A=
ftiming =3D out;=0A=
}=0A=
=0A=
/*=0A=
* Start timing now.=0A=
*/=0A=
void=0A=
start(struct timeval *tv)=0A=
{=0A=
if (tv =3D=3D NULL) {=0A=
tv =3D &start_tv;=0A=
}=0A=
#ifdef RUSAGE=0A=
getrusage(RUSAGE_SELF, &ru_start);=0A=
#endif=0A=
(void) gettimeofday(tv, (struct timezone *) 0);=0A=
}=0A=
=0A=
/*=0A=
* Stop timing and return real time in microseconds.=0A=
*/=0A=
uint64=0A=
stop(struct timeval *begin, struct timeval *end)=0A=
{=0A=
if (end =3D=3D NULL) {=0A=
end =3D &stop_tv;=0A=
}=0A=
(void) gettimeofday(end, (struct timezone *) 0);=0A=
#ifdef RUSAGE=0A=
getrusage(RUSAGE_SELF, &ru_stop);=0A=
#endif=0A=
=0A=
if (begin =3D=3D NULL) {=0A=
begin =3D &start_tv;=0A=
}=0A=
return tvdelta(begin, end);=0A=
}=0A=
=0A=
uint64=0A=
now(void)=0A=
{=0A=
struct timeval t;=0A=
uint64 m;=0A=
=0A=
(void) gettimeofday(&t, (struct timezone *) 0);=0A=
m =3D t.tv_sec;=0A=
m *=3D 1000000;=0A=
m +=3D t.tv_usec;=0A=
return (m);=0A=
}=0A=
=0A=
double=0A=
Now(void)=0A=
{=0A=
struct timeval t;=0A=
=0A=
(void) gettimeofday(&t, (struct timezone *) 0);=0A=
return (t.tv_sec * 1000000.0 + t.tv_usec);=0A=
}=0A=
=0A=
uint64=0A=
delta(void)=0A=
{=0A=
static struct timeval last;=0A=
struct timeval t;=0A=
struct timeval diff;=0A=
uint64 m;=0A=
=0A=
(void) gettimeofday(&t, (struct timezone *) 0);=0A=
if (last.tv_usec) {=0A=
tvsub(&diff, &t, &last);=0A=
last =3D t;=0A=
m =3D diff.tv_sec;=0A=
m *=3D 1000000;=0A=
m +=3D diff.tv_usec;=0A=
return (m);=0A=
} else {=0A=
last =3D t;=0A=
return (0);=0A=
}=0A=
}=0A=
=0A=
double=0A=
Delta(void)=0A=
{=0A=
struct timeval t;=0A=
struct timeval diff;=0A=
=0A=
(void) gettimeofday(&t, (struct timezone *) 0);=0A=
tvsub(&diff, &t, &start_tv);=0A=
return (diff.tv_sec + diff.tv_usec / 1000000.0);=0A=
}=0A=
=0A=
void=0A=
save_n(uint64 n)=0A=
{=0A=
iterations =3D n;=0A=
}=0A=
=0A=
uint64=0A=
get_n(void)=0A=
{=0A=
return (iterations);=0A=
}=0A=
=0A=
/*=0A=
* Make the time spent be usecs.=0A=
*/=0A=
void=0A=
settime(uint64 usecs)=0A=
{=0A=
bzero((void*)&start_tv, sizeof(start_tv));=0A=
stop_tv.tv_sec =3D usecs / 1000000;=0A=
stop_tv.tv_usec =3D usecs % 1000000;=0A=
}=0A=
=0A=
void=0A=
bandwidth(uint64 bytes, uint64 times, int verbose)=0A=
{=0A=
struct timeval tdiff;=0A=
double mb, secs;=0A=
=0A=
tvsub(&tdiff, &stop_tv, &start_tv);=0A=
secs =3D tdiff.tv_sec;=0A=
secs *=3D 1000000;=0A=
secs +=3D tdiff.tv_usec;=0A=
secs /=3D 1000000;=0A=
secs /=3D times;=0A=
mb =3D bytes / MB;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
if (verbose) {=0A=
(void) fprintf(ftiming,=0A=
"%.4f MB in %.4f secs, %.4f MB/sec\n",=0A=
mb, secs, mb/secs);=0A=
} else {=0A=
if (mb < 1) {=0A=
(void) fprintf(ftiming, "%.6f ", mb);=0A=
} else {=0A=
(void) fprintf(ftiming, "%.2f ", mb);=0A=
}=0A=
if (mb / secs < 1) {=0A=
(void) fprintf(ftiming, "%.6f\n", mb/secs);=0A=
} else {=0A=
(void) fprintf(ftiming, "%.2f\n", mb/secs);=0A=
}=0A=
}=0A=
}=0A=
=0A=
void=0A=
kb(uint64 bytes)=0A=
{=0A=
struct timeval td;=0A=
double s, bs;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
s =3D td.tv_sec + td.tv_usec / 1000000.0;=0A=
bs =3D bytes / nz(s);=0A=
if (!ftiming) ftiming =3D stderr;=0A=
(void) fprintf(ftiming, "%.0f KB/sec\n", bs / KB);=0A=
}=0A=
=0A=
void=0A=
mb(uint64 bytes)=0A=
{=0A=
struct timeval td;=0A=
double s, bs;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
s =3D td.tv_sec + td.tv_usec / 1000000.0;=0A=
bs =3D bytes / nz(s);=0A=
if (!ftiming) ftiming =3D stderr;=0A=
(void) fprintf(ftiming, "%.2f MB/sec\n", bs / MB);=0A=
}=0A=
=0A=
void=0A=
latency(uint64 xfers, uint64 size)=0A=
{=0A=
struct timeval td;=0A=
double s;=0A=
=0A=
if (!ftiming) ftiming =3D stderr;=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
s =3D td.tv_sec + td.tv_usec / 1000000.0;=0A=
if (xfers > 1) {=0A=
fprintf(ftiming, "%d %dKB xfers in %.2f secs, ",=0A=
(int) xfers, (int) (size / KB), s);=0A=
} else {=0A=
fprintf(ftiming, "%.1fKB in ", size / KB);=0A=
}=0A=
if ((s * 1000 / xfers) > 100) {=0A=
fprintf(ftiming, "%.0f millisec%s, ",=0A=
s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");=0A=
} else {=0A=
fprintf(ftiming, "%.4f millisec%s, ",=0A=
s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");=0A=
}=0A=
if (((xfers * size) / (MB * s)) > 1) {=0A=
fprintf(ftiming, "%.2f MB/sec\n", (xfers * size) / (MB * s));=0A=
} else {=0A=
fprintf(ftiming, "%.2f KB/sec\n", (xfers * size) / (KB * s));=0A=
}=0A=
}=0A=
=0A=
void=0A=
context(uint64 xfers)=0A=
{=0A=
struct timeval td;=0A=
double s;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
s =3D td.tv_sec + td.tv_usec / 1000000.0;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming,=0A=
"%d context switches in %.2f secs, %.0f microsec/switch\n",=0A=
(int)xfers, s, s * 1000000 / xfers);=0A=
}=0A=
=0A=
void=0A=
nano(char *s, uint64 n)=0A=
{=0A=
struct timeval td;=0A=
double micro;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
micro =3D td.tv_sec * 1000000 + td.tv_usec;=0A=
micro *=3D 1000;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming, "%s: %.0f nanoseconds\n", s, micro / n);=0A=
}=0A=
=0A=
void=0A=
micro(char *s, uint64 n)=0A=
{=0A=
struct timeval td;=0A=
double micro;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
micro =3D td.tv_sec * 1000000 + td.tv_usec;=0A=
micro /=3D n;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);=0A=
#if 0=0A=
if (micro >=3D 100) {=0A=
fprintf(ftiming, "%s: %.1f microseconds\n", s, micro);=0A=
} else if (micro >=3D 10) {=0A=
fprintf(ftiming, "%s: %.3f microseconds\n", s, micro);=0A=
} else {=0A=
fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);=0A=
}=0A=
#endif=0A=
}=0A=
=0A=
void=0A=
micromb(uint64 sz, uint64 n)=0A=
{=0A=
struct timeval td;=0A=
double mb, micro;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
micro =3D td.tv_sec * 1000000 + td.tv_usec;=0A=
micro /=3D n;=0A=
mb =3D sz;=0A=
mb /=3D MB;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
if (micro >=3D 10) {=0A=
fprintf(ftiming, "%.6f %.0f\n", mb, micro);=0A=
} else {=0A=
fprintf(ftiming, "%.6f %.3f\n", mb, micro);=0A=
}=0A=
}=0A=
=0A=
void=0A=
milli(char *s, uint64 n)=0A=
{=0A=
struct timeval td;=0A=
uint64 milli;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
milli =3D td.tv_sec * 1000 + td.tv_usec / 1000;=0A=
milli /=3D n;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming, "%s: %d milliseconds\n", s, (int)milli);=0A=
}=0A=
=0A=
void=0A=
ptime(uint64 n)=0A=
{=0A=
struct timeval td;=0A=
double s;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
s =3D td.tv_sec + td.tv_usec / 1000000.0;=0A=
if (!ftiming) ftiming =3D stderr;=0A=
fprintf(ftiming,=0A=
"%d in %.2f secs, %.0f microseconds each\n",=0A=
(int)n, s, s * 1000000 / n);=0A=
}=0A=
=0A=
uint64=0A=
tvdelta(struct timeval *start, struct timeval *stop)=0A=
{=0A=
struct timeval td;=0A=
uint64 usecs;=0A=
=0A=
tvsub(&td, stop, start);=0A=
usecs =3D td.tv_sec;=0A=
usecs *=3D 1000000;=0A=
usecs +=3D td.tv_usec;=0A=
return (usecs);=0A=
}=0A=
=0A=
void=0A=
tvsub(struct timeval * tdiff, struct timeval * t1, struct timeval * t0)=0A=
{=0A=
tdiff->tv_sec =3D t1->tv_sec - t0->tv_sec;=0A=
tdiff->tv_usec =3D t1->tv_usec - t0->tv_usec;=0A=
if (tdiff->tv_usec < 0 && tdiff->tv_sec > 0) {=0A=
tdiff->tv_sec--;=0A=
tdiff->tv_usec +=3D 1000000;=0A=
assert(tdiff->tv_usec >=3D 0);=0A=
}=0A=
=0A=
/* time shouldn't go backwards!!! */=0A=
if (tdiff->tv_usec < 0 || t1->tv_sec < t0->tv_sec) {=0A=
tdiff->tv_sec =3D 0;=0A=
tdiff->tv_usec =3D 0;=0A=
}=0A=
}=0A=
=0A=
uint64=0A=
gettime(void)=0A=
{=0A=
return (tvdelta(&start_tv, &stop_tv));=0A=
}=0A=
=0A=
double=0A=
timespent(void)=0A=
{=0A=
struct timeval td;=0A=
=0A=
tvsub(&td, &stop_tv, &start_tv);=0A=
return (td.tv_sec + td.tv_usec / 1000000.0);=0A=
}=0A=
=0A=
static char p64buf[10][20];=0A=
static int n;=0A=
=0A=
char *=0A=
p64(uint64 big)=0A=
{=0A=
char *s =3D p64buf[n++];=0A=
=0A=
if (n =3D=3D 10) n =3D 0;=0A=
#if defined(linux)=0A=
{=0A=
int *a =3D (int*)&big;=0A=
=0A=
if (a[1]) {=0A=
sprintf(s, "0x%x%08x", a[1], a[0]);=0A=
} else {=0A=
sprintf(s, "0x%x", a[0]);=0A=
}=0A=
}=0A=
#else=0A=
/* __sgi and all other platforms, so s is always written */=0A=
sprintf(s, "0x%llx", (unsigned long long)big);=0A=
#endif=0A=
return (s);=0A=
}=0A=
=0A=
char *=0A=
p64sz(uint64 big)=0A=
{=0A=
double d =3D big;=0A=
char *tags =3D " KMGTPE";=0A=
int t =3D 0;=0A=
char *s =3D p64buf[n++];=0A=
=0A=
if (n =3D=3D 10) n =3D 0;=0A=
while (d > 512) t++, d /=3D 1024;=0A=
if (d =3D=3D 0) {=0A=
return ("0");=0A=
}=0A=
if (d < 100) {=0A=
sprintf(s, "%.4f%c", d, tags[t]);=0A=
} else {=0A=
sprintf(s, "%.2f%c", d, tags[t]);=0A=
}=0A=
return (s);=0A=
}=0A=
=0A=
char=0A=
last(char *s)=0A=
{=0A=
while (*s++)=0A=
;=0A=
return (s[-2]);=0A=
}=0A=
=0A=
int=0A=
bytes(char *s)=0A=
{=0A=
int n =3D atoi(s);=0A=
=0A=
if ((last(s) =3D=3D 'k') || (last(s) =3D=3D 'K'))=0A=
n *=3D 1024;=0A=
if ((last(s) =3D=3D 'm') || (last(s) =3D=3D 'M'))=0A=
n *=3D (1024 * 1024);=0A=
return (n);=0A=
}=0A=
=0A=
void=0A=
use_int(int result) { use_result_dummy +=3D result; }=0A=
=0A=
void=0A=
use_pointer(void *result) { use_result_dummy +=3D (int)result; }=0A=
=0A=
void=0A=
insertinit(result_t *r)=0A=
{=0A=
int i;=0A=
=0A=
r->N =3D 0;=0A=
for (i =3D 0; i < TRIES; i++) {=0A=
r->u[i] =3D 0;=0A=
r->n[i] =3D 1;=0A=
}=0A=
}=0A=
=0A=
/* biggest to smallest */=0A=
void=0A=
insertsort(uint64 u, uint64 n, result_t *r)=0A=
{=0A=
int i, j;=0A=
=0A=
if (u =3D=3D 0) return;=0A=
=0A=
for (i =3D 0; i < r->N; ++i) {=0A=
if (u/(double)n > r->u[i]/(double)r->n[i]) {=0A=
for (j =3D r->N; j > i; --j) {=0A=
r->u[j] =3D r->u[j-1];=0A=
r->n[j] =3D r->n[j-1];=0A=
}=0A=
break;=0A=
}=0A=
}=0A=
r->u[i] =3D u;=0A=
r->n[i] =3D n;=0A=
r->N++;=0A=
}=0A=
=0A=
static result_t results;=0A=
=0A=
void=0A=
print_results(void)=0A=
{=0A=
int i;=0A=
=0A=
for (i =3D 0; i < results.N; ++i) {=0A=
fprintf(stderr, "%.2f ", (double)results.u[i]/results.n[i]);=0A=
}=0A=
}=0A=
=0A=
void=0A=
get_results(result_t *r)=0A=
{=0A=
*r =3D results;=0A=
}=0A=
=0A=
void=0A=
save_results(result_t *r)=0A=
{=0A=
results =3D *r;=0A=
save_median();=0A=
}=0A=
=0A=
void=0A=
save_minimum()=0A=
{=0A=
if (results.N =3D=3D 0) {=0A=
save_n(1);=0A=
settime(0);=0A=
} else {=0A=
save_n(results.n[results.N - 1]);=0A=
settime(results.u[results.N - 1]);=0A=
}=0A=
}=0A=
=0A=
void=0A=
save_median()=0A=
{=0A=
int i =3D results.N / 2;=0A=
uint64 u, n;=0A=
=0A=
if (results.N =3D=3D 0) {=0A=
n =3D 1;=0A=
u =3D 0;=0A=
} else if (results.N % 2) {=0A=
n =3D results.n[i];=0A=
u =3D results.u[i];=0A=
} else {=0A=
n =3D (results.n[i] + results.n[i-1]) / 2;=0A=
u =3D (results.u[i] + results.u[i-1]) / 2;=0A=
}=0A=
save_n(n); settime(u);=0A=
}=0A=
=0A=
/*=0A=
* The inner loop tracks bench.h but uses a different results array.=0A=
*/=0A=
static long *=0A=
one_op(register long *p)=0A=
{=0A=
BENCH_INNER(p =3D (long *)*p, 0);=0A=
return (p);=0A=
}=0A=
=0A=
static long *=0A=
two_op(register long *p, register long *q)=0A=
{=0A=
BENCH_INNER(p =3D (long *)*q; q =3D (long*)*p, 0);=0A=
return (p);=0A=
}=0A=
=0A=
static long *p =3D (long *)&p;=0A=
static long *q =3D (long *)&q;=0A=
=0A=
double=0A=
l_overhead(void)=0A=
{=0A=
int i;=0A=
uint64 N_save, u_save;=0A=
static double overhead;=0A=
static int initialized =3D 0;=0A=
result_t one, two, r_save;=0A=
=0A=
init_timing();=0A=
if (initialized) return (overhead);=0A=
=0A=
initialized =3D 1;=0A=
if (getenv("LOOP_O")) {=0A=
overhead =3D atof(getenv("LOOP_O"));=0A=
} else {=0A=
get_results(&r_save); N_save =3D get_n(); u_save =3D gettime(); =0A=
insertinit(&one);=0A=
insertinit(&two);=0A=
for (i =3D 0; i < TRIES; ++i) {=0A=
use_pointer((void*)one_op(p));=0A=
if (gettime() > t_overhead())=0A=
insertsort(gettime() - t_overhead(), get_n(), &one);=0A=
use_pointer((void *)two_op(p, q));=0A=
if (gettime() > t_overhead())=0A=
insertsort(gettime() - t_overhead(), get_n(), &two);=0A=
}=0A=
/*=0A=
* u1 =3D (n1 * (overhead + work))=0A=
* u2 =3D (n2 * (overhead + 2 * work))=0A=
* =3D=3D> overhead =3D 2. * u1 / n1 - u2 / n2=0A=
*/=0A=
save_results(&one); save_minimum();=0A=
overhead =3D 2. * gettime() / (double)get_n();=0A=
=0A=
save_results(&two); save_minimum();=0A=
overhead -=3D gettime() / (double)get_n();=0A=
=0A=
if (overhead < 0.) overhead =3D 0.; /* Gag */=0A=
=0A=
save_results(&r_save); save_n(N_save); settime(u_save); =0A=
}=0A=
return (overhead);=0A=
}=0A=
=0A=
/*=0A=
* Figure out the timing overhead. This has to track bench.h=0A=
*/=0A=
uint64=0A=
t_overhead(void)=0A=
{=0A=
uint64 N_save, u_save;=0A=
static int initialized =3D 0;=0A=
static uint64 overhead =3D 0;=0A=
struct timeval tv;=0A=
result_t r_save;=0A=
=0A=
init_timing();=0A=
if (initialized) return (overhead);=0A=
=0A=
initialized =3D 1;=0A=
if (getenv("TIMING_O")) {=0A=
overhead =3D atof(getenv("TIMING_O"));=0A=
} else if (get_enough(0) <=3D 50000) {=0A=
/* it is not in the noise, so compute it */=0A=
int i;=0A=
result_t r;=0A=
=0A=
get_results(&r_save); N_save =3D get_n(); u_save =3D gettime(); =0A=
insertinit(&r);=0A=
for (i =3D 0; i < TRIES; ++i) {=0A=
BENCH_INNER(gettimeofday(&tv, 0), 0);=0A=
insertsort(gettime(), get_n(), &r);=0A=
}=0A=
save_results(&r);=0A=
save_minimum();=0A=
overhead =3D gettime() / get_n();=0A=
=0A=
save_results(&r_save); save_n(N_save); settime(u_save); =0A=
}=0A=
return (overhead);=0A=
}=0A=
=0A=
/*=0A=
* Figure out how long to run it.=0A=
* If enough =3D=3D 0, then they want us to figure it out.=0A=
* If enough is !0 then return it unless we think it is too short.=0A=
*/=0A=
static int long_enough;=0A=
static int compute_enough();=0A=
=0A=
int=0A=
get_enough(int e)=0A=
{=0A=
init_timing();=0A=
return (long_enough > e ? long_enough : e);=0A=
}=0A=
=0A=
=0A=
static void=0A=
init_timing(void)=0A=
{=0A=
static int done =3D 0;=0A=
=0A=
if (done) return;=0A=
done =3D 1;=0A=
long_enough =3D compute_enough();=0A=
t_overhead();=0A=
l_overhead();=0A=
}=0A=
=0A=
typedef long TYPE;=0A=
=0A=
static TYPE **=0A=
enough_duration(register long N, register TYPE ** p)=0A=
{=0A=
#define ENOUGH_DURATION_TEN(one) one one one one one one one one one one=0A=
while (N-- > 0) {=0A=
ENOUGH_DURATION_TEN(p =3D (TYPE **) *p;);=0A=
}=0A=
return (p);=0A=
}=0A=
=0A=
static uint64=0A=
duration(long N)=0A=
{=0A=
uint64 usecs;=0A=
TYPE *x =3D (TYPE *)&x;=0A=
TYPE **p =3D (TYPE **)&x;=0A=
=0A=
start(0);=0A=
p =3D enough_duration(N, p);=0A=
usecs =3D stop(0, 0);=0A=
use_pointer((void *)p);=0A=
return (usecs);=0A=
}=0A=
=0A=
/*=0A=
* find the minimum time that work "N" takes in "tries" tests=0A=
*/=0A=
static uint64=0A=
time_N(long N)=0A=
{=0A=
int i;=0A=
uint64 usecs;=0A=
result_t r;=0A=
=0A=
insertinit(&r);=0A=
for (i =3D 1; i < TRIES; ++i) {=0A=
usecs =3D duration(N);=0A=
insertsort(usecs, N, &r);=0A=
}=0A=
save_results(&r);=0A=
save_minimum();=0A=
return (gettime());=0A=
}=0A=
=0A=
/*=0A=
* return the amount of work needed to run "enough" microseconds=0A=
*/=0A=
static long=0A=
find_N(int enough)=0A=
{=0A=
int tries;=0A=
static long N =3D 10000;=0A=
static uint64 usecs =3D 0;=0A=
=0A=
if (!usecs) usecs =3D time_N(N);=0A=
=0A=
for (tries =3D 0; tries < 10; ++tries) {=0A=
if (0.98 * enough < usecs && usecs < 1.02 * enough)=0A=
return (N);=0A=
if (usecs < 1000)=0A=
N *=3D 10;=0A=
else {=0A=
double n =3D N;=0A=
=0A=
n /=3D usecs;=0A=
n *=3D enough;=0A=
N =3D n + 1;=0A=
}=0A=
usecs =3D time_N(N);=0A=
}=0A=
return (-1);=0A=
}=0A=
=0A=
/*=0A=
* We want to verify that small modifications proportionally affect the =
runtime=0A=
*/=0A=
static double test_points[] =3D {1.015, 1.02, 1.035};=0A=
static int=0A=
test_time(int enough)=0A=
{=0A=
int i;=0A=
long N;=0A=
uint64 usecs, expected, baseline, diff;=0A=
=0A=
if ((N =3D find_N(enough)) <=3D 0)=0A=
return (0);=0A=
=0A=
baseline =3D time_N(N);=0A=
=0A=
for (i =3D 0; i < sizeof(test_points) / sizeof(double); ++i) {=0A=
usecs =3D time_N((int)((double) N * test_points[i]));=0A=
expected =3D (uint64)((double)baseline * test_points[i]);=0A=
diff =3D expected > usecs ? expected - usecs : usecs - expected;=0A=
if (diff / (double)expected > 0.0025)=0A=
return (0);=0A=
}=0A=
return (1);=0A=
}=0A=
=0A=
=0A=
/*=0A=
* We want to find the smallest timing interval that has accurate timing=0A=
*/=0A=
static int possibilities[] =3D { 5000, 10000, 50000, 100000 };=0A=
static int=0A=
compute_enough()=0A=
{=0A=
int i;=0A=
=0A=
if (getenv("ENOUGH")) {=0A=
return (atoi(getenv("ENOUGH")));=0A=
}=0A=
for (i =3D 0; i < sizeof(possibilities) / sizeof(int); ++i) {=0A=
if (test_time(possibilities[i]))=0A=
return (possibilities[i]);=0A=
}=0A=
=0A=
/* =0A=
* if we can't find a timing interval that is sufficient, =0A=
* then use SHORT as a default.=0A=
*/=0A=
return (SHORT);=0A=
}=0A=
=0A=
/*=0A=
* This stuff isn't really lib_timing, but ...=0A=
*/=0A=
void=0A=
morefds(void)=0A=
{=0A=
#ifdef RLIMIT_NOFILE=0A=
struct rlimit r;=0A=
=0A=
getrlimit(RLIMIT_NOFILE, &r);=0A=
r.rlim_cur =3D r.rlim_max;=0A=
setrlimit(RLIMIT_NOFILE, &r);=0A=
#endif=0A=
}=0A=
=0A=
void=0A=
touch(char *buf, int nbytes)=0A=
{=0A=
static int psize;=0A=
=0A=
if (!psize) {=0A=
psize =3D getpagesize();=0A=
}=0A=
while (nbytes > 0) {=0A=
*buf =3D 1;=0A=
buf +=3D psize;=0A=
nbytes -=3D psize;=0A=
}=0A=
}=0A=
=0A=
#if defined(hpux) || defined(__hpux)=0A=
int=0A=
getpagesize()=0A=
{=0A=
return (sysconf(_SC_PAGE_SIZE));=0A=
}=0A=
#endif=0A=
=0A=
#if defined(WIN32)=0A=
#if !defined(__CYGWIN__)=0A=
int=0A=
getpagesize()=0A=
{=0A=
SYSTEM_INFO s;=0A=
=0A=
GetSystemInfo(&s);=0A=
return ((int)s.dwPageSize);=0A=
}=0A=
#endif=0A=
=0A=
LARGE_INTEGER=0A=
getFILETIMEoffset()=0A=
{=0A=
SYSTEMTIME s;=0A=
FILETIME f;=0A=
LARGE_INTEGER t;=0A=
=0A=
s.wYear =3D 1970;=0A=
s.wMonth =3D 1;=0A=
s.wDay =3D 1;=0A=
s.wHour =3D 0;=0A=
s.wMinute =3D 0;=0A=
s.wSecond =3D 0;=0A=
s.wMilliseconds =3D 0;=0A=
SystemTimeToFileTime(&s, &f);=0A=
t.QuadPart =3D f.dwHighDateTime;=0A=
t.QuadPart <<=3D 32;=0A=
t.QuadPart |=3D f.dwLowDateTime;=0A=
return (t);=0A=
}=0A=
=0A=
int=0A=
gettimeofday(struct timeval *tv, struct timezone *tz)=0A=
{=0A=
LARGE_INTEGER t;=0A=
FILETIME f;=0A=
double microseconds;=0A=
static LARGE_INTEGER offset;=0A=
static double frequencyToMicroseconds;=0A=
static int initialized =3D 0;=0A=
static BOOL usePerformanceCounter =3D 0;=0A=
=0A=
if (!initialized) {=0A=
LARGE_INTEGER performanceFrequency;=0A=
initialized =3D 1;=0A=
usePerformanceCounter =3D =
QueryPerformanceFrequency(&performanceFrequency);=0A=
if (usePerformanceCounter) {=0A=
QueryPerformanceCounter(&offset);=0A=
frequencyToMicroseconds =3D (double)performanceFrequency.QuadPart / =
1000000.;=0A=
} else {=0A=
offset =3D getFILETIMEoffset();=0A=
frequencyToMicroseconds =3D 10.;=0A=
}=0A=
}=0A=
if (usePerformanceCounter) QueryPerformanceCounter(&t);=0A=
else {=0A=
GetSystemTimeAsFileTime(&f);=0A=
t.QuadPart =3D f.dwHighDateTime;=0A=
t.QuadPart <<=3D 32;=0A=
t.QuadPart |=3D f.dwLowDateTime;=0A=
}=0A=
=0A=
t.QuadPart -=3D offset.QuadPart;=0A=
microseconds =3D (double)t.QuadPart / frequencyToMicroseconds;=0A=
t.QuadPart =3D microseconds;=0A=
tv->tv_sec =3D t.QuadPart / 1000000;=0A=
tv->tv_usec =3D t.QuadPart % 1000000;=0A=
return (0);=0A=
}=0A=
#endif=0A=
------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: text/plain; charset=us-ascii
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
------=_NextPart_000_0024_01C17D09.039D4260--