Mail Archives: cygwin/2001/12/05/00:18:31

Mailing-List: contact cygwin-help AT cygwin DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-subscribe AT cygwin DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin/>
List-Post: <mailto:cygwin AT cygwin DOT com>
List-Help: <mailto:cygwin-help AT cygwin DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-owner AT cygwin DOT com
Delivered-To: mailing list cygwin AT cygwin DOT com
Message-ID: <002701c17d4c$135cf450$ccef85ce@amr.corp.intel.com>
From: "Tim Prince" <tprince AT computer DOT org>
To: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>,
"Cygwin" <cygwin AT sources DOT redhat DOT com>
References: <006401c17cd6$3c9e2fd0$9a5f07d5 AT BRAMSCHE>
Subject: Re: Old Thread: Cygwin Performance
Date: Tue, 4 Dec 2001 21:17:01 -0800
MIME-Version: 1.0
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200

------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

I suggest the timing be done using the attached lmbench replacement for
gettimeofday(), which is implemented in terms of the Windows
QueryPerformanceCounter() and QueryPerformanceFrequency() calls.
This provides sufficient timer resolution to perform the tests quickly,
obtain accurate cache latency data, and make the CPU clock rate detection
work as well as on linux.
Unfortunately, some of the file system and communication latencies appear to
be way out of line.  You do have as many or more tests working correctly as
I have seen in attempts to port earlier lmbench versions to Windows.
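
To make the idea concrete, here is a minimal sketch of the conversion step
such a replacement has to perform: turning a raw counter reading into the
struct timeval that lmbench expects. It is an illustration only, not the
attached code. The helper name ticks_to_timeval and its parameters are
hypothetical, ticks_per_sec stands in for whatever
QueryPerformanceFrequency() reports, and the sketch uses integer arithmetic
where the attachment converts through a double.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of lmbench): convert a tick count,
 * measured from some fixed starting point, into a struct timeval.
 * ticks_per_sec is an assumed counter rate, e.g. the value reported
 * by QueryPerformanceFrequency() on Windows.
 */
static void
ticks_to_timeval(int64_t ticks, int64_t ticks_per_sec, struct timeval *tv)
{
	int64_t	rem;

	assert(ticks >= 0 && ticks_per_sec > 0);

	/* Whole seconds first, then scale only the remainder. */
	rem = ticks % ticks_per_sec;
	tv->tv_sec = (time_t)(ticks / ticks_per_sec);

	/*
	 * rem is smaller than ticks_per_sec, so rem * 1000000 stays well
	 * inside 64 bits for any realistic counter rate.
	 */
	tv->tv_usec = (suseconds_t)((rem * 1000000) / ticks_per_sec);
}
```

For example, 5000001 ticks at an assumed rate of 2000000 ticks per second
gives tv_sec = 2 and tv_usec = 500000. Scaling only the remainder keeps the
sub-second part exact to the microsecond even after the counter has been
running for a long time.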
----- Original Message -----
From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
To: "Cygwin" <cygwin AT sources DOT redhat DOT com>
Sent: Tuesday, December 04, 2001 7:13 AM
Subject: RE: Old Thread: Cygwin Performance


> > -----Original Message-----
> > From: Tim Prince [mailto:tprince AT computer DOT org]
> > Sent: Sunday, December 02, 2001 10:58 PM
> > To: Ralf Habacker
> > Cc: Cygwin
> > Subject: Re: Old Thread: Cygwin Performance
> >
> >
> > Your patch adds lib_cygwin.c to the list of required source files, yet
> > that new file is not included.
>
> Sorry, I only compared the original source files with the patched ones,
> so it slipped through.
> It's appended.
>
> > Also, it causes Makefile to invoke the 'get -s' command, of whose
> > function I am not aware.
>
> I'm not aware of it either. I noticed this in the Makefile, but I
> ignored it :-)
> >
> > On my laptop, running linux, the lmbench-2beta2 version corrects a hang
> > in the "stable version" code which makes a network connection.  Perhaps
> > that is not supported anyway in your cygwin version.
>
> > ----- Original Message -----
> > From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
> > To: "Tim Prince" <tprince AT computer DOT org>
> > Cc: "Cygwin" <cygwin AT sources DOT redhat DOT com>
> > Sent: Sunday, December 02, 2001 10:29 AM
> > Subject: RE: Old Thread: Cygwin Performance
> >
> >
> > > > I'd suggest you offer your patch to the lmbench maintainers.  At one
> > > > time, they were talking about supporting something for Windows.  If
> > > > they don't adopt it, I suppose the other alternative is to offer to
> > > > maintain a Cygwin port as an optional Cygwin package.  I'd certainly
> > > > like to try your version.
> > >
> > > Perhaps it is best that you look at the patch before offering it to the
> > > lmbench maintainers.
> > > I should note some things about the patch:
> > >
> > > 1. It emulates the rpc functions by adding a file "lib_cygwin.c" which
> > >    contains empty rpc_... functions, so that the rpc functions are
> > >    disabled and will not be tested.
> > >
> > > 2. Because the makefile does not have any platform-dependent parts,
> > >    generating lat_rpc.exe is disabled.
> > >
> > > 3. In scripts/lmbench I have added some ' echo -n "*" ' to give visible
> > >    feedback during the long execution of some benchmarks.
> > >
> > > 4. One problem I have noticed is with "lat_select": it hangs during
> > >    operation.
> > >
> > > 5. Because I have no comparison of lmbench running times on other
> > >    platforms, I can't say if this is okay. Some benchmarks need several
> > >    minutes to run, but this may be okay.
> > >
> > > Regards
> > > Ralf
> > >
> > > > ----- Original Message -----
> > > > From: "Ralf Habacker" <Ralf DOT Habacker AT freenet DOT de>
> > > > To: "Tim Prince" <tprince AT computer DOT org>
> > > > Cc: "Cygwin" <cygwin AT sources DOT redhat DOT com>
> > > > Sent: Saturday, December 01, 2001 11:44 AM
> > > > Subject: RE: Old Thread: Cygwin Performance
> > > >
> > > >
> > > > > >
> > > > > > cygwin should have made some improvements in piping since then.
> > > > > > Amazing the things I had time to do last year.  At that time, I got
> > > > > > over a few of the linux specific functions by the use of Chuck
> > > > > > Wilson's useful packages, some of which should be integrated into
> > > > > > cygwin now.  I commented out sections of lmbench which I couldn't
> > > > > > figure out how to port.  This would be a useful port, particularly
> > > > > > in view of the new performance issues brought up by XP.
> > > > >
> > > > > I have got lmbench 2.0 running on cygwin with some patches (removing
> > > > > the rpc functions).
> > > > >
> > > > > Is there anyone who could verify this patch? To whom should I send
> > > > > this patch?
> > > > >
> > > > > Regards
> > > > > Ralf
> > > > >
> > > > > > However, several of the organizations involved in lmbench are trying
> > > > > > to stay clear of Bill Gates' vendetta against use of open software
> > > > > > together with his products.  I was not employed by such an
> > > > > > organization at the time I was beating on lmbench.
> > > > >
> > > > > > ----- Original Message -----
> > > > > > From: "Piyush Kumar" <piyush AT acm DOT org>
> > > > > > To: "Cygwin AT Cygwin. Com" <cygwin AT cygwin DOT com>
> > > > > > Sent: Friday, November 30, 2001 6:49 AM
> > > > > > Subject: Old Thread: Cygwin Performance
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I picked this old thread from Oct 2000!!!
> > > > > > > Tim reports that cygwin falls short in
> > > > > > > performance compared to a linux box by a
> > > > > > > factor of 2 using lmbench. Is it still
> > > > > > > the case? Or have things improved since
> > > > > > > Oct 13 (Unlucky date!! ;)??
> > > > > > >
> > > > > > > I was trying to compile lmbench 2.0 (Patch 2)
> > > > > > > on my cygwin, no luck!!!! I couldn't compile it!
> > > > > > > Has anyone here tried it before?? Any luck?
> > > > > > > I would be really interested in a lmbench port
> > > > > > > on cygwin! If someone has already done it, please
> > > > > > > let me know!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > --Piyush
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > =============================================================An Old Thread
> > > > > > >
> > > > > > > Re: Cygwin Performance Info
> > > > > > > To: <cygwin at sourceware dot cygnus dot com>, "Chris Abbey" <cabbey at chartermi dot net>
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > > From: "Tim Prince" <tprince at computer dot org>
> > > > > > > Date: Fri, 13 Oct 2000 19:12:40 -0700
> > > > > > > References: <4 DOT 3 DOT 2 DOT 7 DOT 0 DOT 20001013184237 DOT 00b6cd70 AT pop DOT bresnanlink DOT net>
> > > > > > >
> > > > > > > ------------------------------------------------------------------------------
> > > > > > >
> > > > > > > When I attempted to run lmbench on this old box both under linux and
> > > > > > > cygwin, there were some tests on which cygwin/w2k fell short of linux
> > > > > > > by a factor of 2 or more (opening files, pipe throughput, and the
> > > > > > > like), and then there were the cache statistics on which cygwin beat
> > > > > > > linux by a small margin.  I was expecting lmbench to become better
> > > > > > > adapted to cygwin, but I have no news there.
> > > > > > > ----- Original Message -----
> > > > > > > From: "Chris Abbey" <cabbey AT chartermi DOT net>
> > > > > > > To: <cygwin AT sourceware DOT cygnus DOT com>
> > > > > > > Sent: Friday, October 13, 2000 4:51 PM
> > > > > > > Subject: Re: Cygwin Performance Info
> > > > > > >
> > > > > > >
> > > > > > > > At 19:23 10/13/00 -0400, Laurence F. Wood wrote:
> > > > > > > > >Can someone tell me where the performance hit is in cygwin unix
> > > > > > > > >emulation?
> > > > > > > >
> > > > > > > > whichever part you use the most inside your tightest inner loop.
> > > > > > > >
> > > > > > > > seriously.
> > > > > > > >
> > > > > > > > that's a big huge open ended question (not about cygwin, about ANY
> > > > > > > > library/platform) that is as specific to your application as you can
> > > > > > > > get. For example, if you spend 75% of your computing day manipulating
> > > > > > > > text files and piping them and grepping them and running file utils
> > > > > > > > against them then the cr/lf translation may be a big hit for you.
> > > > > > > > On the other hand if most of your computation in a day is spent
> > > > > > > > answering requests that come in on tcp/ip sockets then the remapping
> > > > > > > > of winsock to netinet.h functions may be your major headache. (note,
> > > > > > > > I'm not trying to imply that either function has a performance
> > > > > > > > problem, merely that they would be representative places that would
> > > > > > > > have high invocation counts in the course of the given activity.)
> > > > > > > >
> > > > > > > > To really answer that for your application/workload then you need to
> > > > > > > > get some form of performance detailing that can tell you how much
> > > > > > > > time you are spending in any given method and how often it's called.
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Want to unsubscribe from this list?
> > > > > > > > Send a message to cygwin-unsubscribe AT sourceware DOT cygnus DOT com
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
> > > > > > > Bug reporting:         http://cygwin.com/bugs.html
> > > > > > > Documentation:         http://cygwin.com/docs.html
> > > > > > > FAQ:                   http://cygwin.com/faq/
> > > > > > >

------=_NextPart_000_0024_01C17D09.039D4260
Content-Type: application/octet-stream;
	name="lib_timing.c"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="lib_timing.c"

/*
 * a timing utilities library
 *
 * Requires 64bit integers to work.
 *
 * %W% %@%
 *
 * Copyright (c) 1994-1998 Larry McVoy.
 */
#define	 _LIB /* bench.h needs this */
#include "bench.h"

#define	nz(x)	((x) == 0 ? 1 : (x))

/*
 * I know you think these should be 2^10 and 2^20, but people are quoting
 * disk sizes in powers of 10, and bandwidths are all power of ten.
 * Deal with it.
 */
#define	MB	(1000*1000.0)
#define	KB	(1000.0)

static struct timeval start_tv, stop_tv;
FILE		*ftiming;
volatile uint64	use_result_dummy;	/* !static for optimizers. */
static	uint64	iterations;
static	void	init_timing(void);

#if defined(hpux) || defined(__hpux)
#include <sys/mman.h>
#endif

#ifdef	RUSAGE
#include <sys/resource.h>
#define	SECS(tv)	(tv.tv_sec + tv.tv_usec / 1000000.0)
#define	mine(f)		(int)(ru_stop.f - ru_start.f)

static struct rusage ru_start, ru_stop;

void
rusage(void)
{
	double  sys, user, idle;
	double  per;

	sys = SECS(ru_stop.ru_stime) - SECS(ru_start.ru_stime);
	user = SECS(ru_stop.ru_utime) - SECS(ru_start.ru_utime);
	idle = timespent() - (sys + user);
	per = idle / timespent() * 100;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "real=%.2f sys=%.2f user=%.2f idle=%.2f stall=%.0f%% ",
	    timespent(), sys, user, idle, per);
	fprintf(ftiming, "rd=%d wr=%d min=%d maj=%d ctx=%d\n",
	    mine(ru_inblock), mine(ru_oublock),
	    mine(ru_minflt), mine(ru_majflt),
	    mine(ru_nvcsw) + mine(ru_nivcsw));
}

#endif	/* RUSAGE */
/*
 * Redirect output someplace else.
 */
void
timing(FILE *out)
{
	ftiming = out;
}

/*
 * Start timing now.
 */
void
start(struct timeval *tv)
{
	if (tv == NULL) {
		tv = &start_tv;
	}
#ifdef	RUSAGE
	getrusage(RUSAGE_SELF, &ru_start);
#endif
	(void) gettimeofday(tv, (struct timezone *) 0);
}

/*
 * Stop timing and return real time in microseconds.
 */
uint64
stop(struct timeval *begin, struct timeval *end)
{
	if (end == NULL) {
		end = &stop_tv;
	}
	(void) gettimeofday(end, (struct timezone *) 0);
#ifdef	RUSAGE
	getrusage(RUSAGE_SELF, &ru_stop);
#endif

	if (begin == NULL) {
		begin = &start_tv;
	}
	return tvdelta(begin, end);
}

uint64
now(void)
{
	struct timeval t;
	uint64	m;

	(void) gettimeofday(&t, (struct timezone *) 0);
	m = t.tv_sec;
	m *= 1000000;
	m += t.tv_usec;
	return (m);
}

double
Now(void)
{
	struct timeval t;

	(void) gettimeofday(&t, (struct timezone *) 0);
	return (t.tv_sec * 1000000.0 + t.tv_usec);
}

uint64
delta(void)
{
	static struct timeval last;
	struct timeval t;
	struct timeval diff;
	uint64	m;

	(void) gettimeofday(&t, (struct timezone *) 0);
	if (last.tv_usec) {
		tvsub(&diff, &t, &last);
		last = t;
		m = diff.tv_sec;
		m *= 1000000;
		m += diff.tv_usec;
		return (m);
	} else {
		last = t;
		return (0);
	}
}

double
Delta(void)
{
	struct timeval t;
	struct timeval diff;

	(void) gettimeofday(&t, (struct timezone *) 0);
	tvsub(&diff, &t, &start_tv);
	return (diff.tv_sec + diff.tv_usec / 1000000.0);
}

void
save_n(uint64 n)
{
	iterations = n;
}

uint64
get_n(void)
{
	return (iterations);
}

/*
 * Make the time spend be usecs.
 */
void
settime(uint64 usecs)
{
	bzero((void*)&start_tv, sizeof(start_tv));
	stop_tv.tv_sec = usecs / 1000000;
	stop_tv.tv_usec = usecs % 1000000;
}

void
bandwidth(uint64 bytes, uint64 times, int verbose)
{
	struct timeval tdiff;
	double  mb, secs;

	tvsub(&tdiff, &stop_tv, &start_tv);
	secs = tdiff.tv_sec;
	secs *= 1000000;
	secs += tdiff.tv_usec;
	secs /= 1000000;
	secs /= times;
	mb = bytes / MB;
	if (!ftiming) ftiming = stderr;
	if (verbose) {
		(void) fprintf(ftiming,
		    "%.4f MB in %.4f secs, %.4f MB/sec\n",
		    mb, secs, mb/secs);
	} else {
		if (mb < 1) {
			(void) fprintf(ftiming, "%.6f ", mb);
		} else {
			(void) fprintf(ftiming, "%.2f ", mb);
		}
		if (mb / secs < 1) {
			(void) fprintf(ftiming, "%.6f\n", mb/secs);
		} else {
			(void) fprintf(ftiming, "%.2f\n", mb/secs);
		}
	}
}

void
kb(uint64 bytes)
{
	struct timeval td;
	double  s, bs;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	bs = bytes / nz(s);
	if (!ftiming) ftiming = stderr;
	(void) fprintf(ftiming, "%.0f KB/sec\n", bs / KB);
}

void
mb(uint64 bytes)
{
	struct timeval td;
	double  s, bs;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	bs = bytes / nz(s);
	if (!ftiming) ftiming = stderr;
	(void) fprintf(ftiming, "%.2f MB/sec\n", bs / MB);
}

void
latency(uint64 xfers, uint64 size)
{
	struct timeval td;
	double  s;

	if (!ftiming) ftiming = stderr;
	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (xfers > 1) {
		fprintf(ftiming, "%d %dKB xfers in %.2f secs, ",
		    (int) xfers, (int) (size / KB), s);
	} else {
		fprintf(ftiming, "%.1fKB in ", size / KB);
	}
	if ((s * 1000 / xfers) > 100) {
		fprintf(ftiming, "%.0f millisec%s, ",
		    s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");
	} else {
		fprintf(ftiming, "%.4f millisec%s, ",
		    s * 1000 / xfers, xfers > 1 ? "/xfer" : "s");
	}
	if (((xfers * size) / (MB * s)) > 1) {
		fprintf(ftiming, "%.2f MB/sec\n", (xfers * size) / (MB * s));
	} else {
		fprintf(ftiming, "%.2f KB/sec\n", (xfers * size) / (KB * s));
	}
}

void
context(uint64 xfers)
{
	struct timeval td;
	double  s;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming,
	    "%d context switches in %.2f secs, %.0f microsec/switch\n",
	    (int)xfers, s, s * 1000000 / xfers);
}

void
nano(char *s, uint64 n)
{
	struct timeval td;
	double  micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro *= 1000;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %.0f nanoseconds\n", s, micro / n);
}

void
micro(char *s, uint64 n)
{
	struct timeval td;
	double	micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro /= n;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);
#if 0
	if (micro >= 100) {
		fprintf(ftiming, "%s: %.1f microseconds\n", s, micro);
	} else if (micro >= 10) {
		fprintf(ftiming, "%s: %.3f microseconds\n", s, micro);
	} else {
		fprintf(ftiming, "%s: %.4f microseconds\n", s, micro);
	}
#endif
}

void
micromb(uint64 sz, uint64 n)
{
	struct timeval td;
	double	mb, micro;

	tvsub(&td, &stop_tv, &start_tv);
	micro = td.tv_sec * 1000000 + td.tv_usec;
	micro /= n;
	mb = sz;
	mb /= MB;
	if (!ftiming) ftiming = stderr;
	if (micro >= 10) {
		fprintf(ftiming, "%.6f %.0f\n", mb, micro);
	} else {
		fprintf(ftiming, "%.6f %.3f\n", mb, micro);
	}
}

void
milli(char *s, uint64 n)
{
	struct timeval td;
	uint64 milli;

	tvsub(&td, &stop_tv, &start_tv);
	milli = td.tv_sec * 1000 + td.tv_usec / 1000;
	milli /= n;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming, "%s: %d milliseconds\n", s, (int)milli);
}

void
ptime(uint64 n)
{
	struct timeval td;
	double  s;

	tvsub(&td, &stop_tv, &start_tv);
	s = td.tv_sec + td.tv_usec / 1000000.0;
	if (!ftiming) ftiming = stderr;
	fprintf(ftiming,
	    "%d in %.2f secs, %.0f microseconds each\n",
	    (int)n, s, s * 1000000 / n);
}

uint64
tvdelta(struct timeval *start, struct timeval *stop)
{
	struct timeval td;
	uint64	usecs;

	tvsub(&td, stop, start);
	usecs = td.tv_sec;
	usecs *= 1000000;
	usecs += td.tv_usec;
	return (usecs);
}

void
tvsub(struct timeval * tdiff, struct timeval * t1, struct timeval * t0)
{
	tdiff->tv_sec = t1->tv_sec - t0->tv_sec;
	tdiff->tv_usec = t1->tv_usec - t0->tv_usec;
	if (tdiff->tv_usec < 0 && tdiff->tv_sec > 0) {
		tdiff->tv_sec--;
		tdiff->tv_usec += 1000000;
		assert(tdiff->tv_usec >= 0);
	}

	/* time shouldn't go backwards!!! */
	if (tdiff->tv_usec < 0 || t1->tv_sec < t0->tv_sec) {
		tdiff->tv_sec = 0;
		tdiff->tv_usec = 0;
	}
}

uint64
gettime(void)
{
	return (tvdelta(&start_tv, &stop_tv));
}

double
timespent(void)
{
	struct timeval td;

	tvsub(&td, &stop_tv, &start_tv);
	return (td.tv_sec + td.tv_usec / 1000000.0);
}

static	char	p64buf[10][20];
static	int	n;

char	*
p64(uint64 big)
{
	char	*s = p64buf[n++];

	if (n == 10) n = 0;
#ifdef  linux
	{
        int     *a = (int*)&big;

        if (a[1]) {
                sprintf(s, "0x%x%08x", a[1], a[0]);
        } else {
                sprintf(s, "0x%x", a[0]);
        }
	}
#endif
#ifdef	__sgi
        sprintf(s, "0x%llx", big);
#endif
	return (s);
}

char	*
p64sz(uint64 big)
{
	double	d = big;
	char	*tags = " KMGTPE";
	int	t = 0;
	char	*s = p64buf[n++];

	if (n == 10) n = 0;
	while (d > 512) t++, d /= 1024;
	if (d == 0) {
		return ("0");
	}
	if (d < 100) {
		sprintf(s, "%.4f%c", d, tags[t]);
	} else {
		sprintf(s, "%.2f%c", d, tags[t]);
	}
	return (s);
}

char
last(char *s)
{
	while (*s++)
		;
	return (s[-2]);
}

int
bytes(char *s)
{
	int	n = atoi(s);

	if ((last(s) == 'k') || (last(s) == 'K'))
		n *= 1024;
	if ((last(s) == 'm') || (last(s) == 'M'))
		n *= (1024 * 1024);
	return (n);
}

void
use_int(int result) { use_result_dummy += result; }

void
use_pointer(void *result) { use_result_dummy += (int)result; }

void
insertinit(result_t *r)
{
	int	i;

	r->N = 0;
	for (i = 0; i < TRIES; i++) {
		r->u[i] = 0;
		r->n[i] = 1;
	}
}

/* biggest to smallest */
void
insertsort(uint64 u, uint64 n, result_t *r)
{
	int	i, j;

	if (u == 0) return;

	for (i = 0; i < r->N; ++i) {
		if (u/(double)n > r->u[i]/(double)r->n[i]) {
			for (j = r->N; j > i; --j) {
				r->u[j] = r->u[j-1];
				r->n[j] = r->n[j-1];
			}
			break;
		}
	}
	r->u[i] = u;
	r->n[i] = n;
	r->N++;
}

static result_t results;

void
print_results(void)
{
	int	i;

	for (i = 0; i < results.N; ++i) {
		fprintf(stderr, "%.2f ", (double)results.u[i]/results.n[i]);
	}
}

void
get_results(result_t *r)
{
	*r = results;
}

void
save_results(result_t *r)
{
	results = *r;
	save_median();
}

void
save_minimum()
{
	if (results.N == 0) {
		save_n(1);
		settime(0);
	} else {
		save_n(results.n[results.N - 1]);
		settime(results.u[results.N - 1]);
	}
}

void
save_median()
{
	int	i = results.N / 2;
	uint64	u, n;

	if (results.N == 0) {
		n = 1;
		u = 0;
	} else if (results.N % 2) {
		n = results.n[i];
		u = results.u[i];
	} else {
		n = (results.n[i] + results.n[i-1]) / 2;
		u = (results.u[i] + results.u[i-1]) / 2;
	}
	save_n(n); settime(u);
}

/*
 * The inner loop tracks bench.h but uses a different results array.
 */
static long *
one_op(register long *p)
{
	BENCH_INNER(p = (long *)*p, 0);
	return (p);
}

static long *
two_op(register long *p, register long *q)
{
	BENCH_INNER(p = (long *)*q; q = (long*)*p, 0);
	return (p);
}

static long	*p = (long *)&p;
static long	*q = (long *)&q;

double
l_overhead(void)
{
	int	i;
	uint64	N_save, u_save;
	static	double overhead;
	static	int initialized = 0;
	result_t one, two, r_save;

	init_timing();
	if (initialized) return (overhead);

	initialized = 1;
	if (getenv("LOOP_O")) {
		overhead = atof(getenv("LOOP_O"));
	} else {
		get_results(&r_save); N_save = get_n(); u_save = gettime();
		insertinit(&one);
		insertinit(&two);
		for (i = 0; i < TRIES; ++i) {
			use_pointer((void*)one_op(p));
			if (gettime() > t_overhead())
				insertsort(gettime() - t_overhead(), get_n(), &one);
			use_pointer((void *)two_op(p, q));
			if (gettime() > t_overhead())
				insertsort(gettime() - t_overhead(), get_n(), &two);
		}
		/*
		 * u1 = (n1 * (overhead + work))
		 * u2 = (n2 * (overhead + 2 * work))
		 * ==> overhead = 2. * u1 / n1 - u2 / n2
		 */
		save_results(&one); save_minimum();
		overhead = 2. * gettime() / (double)get_n();

		save_results(&two); save_minimum();
		overhead -= gettime() / (double)get_n();

		if (overhead < 0.) overhead = 0.;	/* Gag */

		save_results(&r_save); save_n(N_save); settime(u_save);
	}
	return (overhead);
}

/*
 * Figure out the timing overhead.  This has to track bench.h
 */
uint64
t_overhead(void)
{
	uint64		N_save, u_save;
	static int	initialized = 0;
	static uint64	overhead = 0;
	struct timeval	tv;
	result_t	r_save;

	init_timing();
	if (initialized) return (overhead);

	initialized = 1;
	if (getenv("TIMING_O")) {
		overhead = atof(getenv("TIMING_O"));
	} else if (get_enough(0) <= 50000) {
		/* it is not in the noise, so compute it */
		int		i;
		result_t	r;

		get_results(&r_save); N_save = get_n(); u_save = gettime();
		insertinit(&r);
		for (i = 0; i < TRIES; ++i) {
			BENCH_INNER(gettimeofday(&tv, 0), 0);
			insertsort(gettime(), get_n(), &r);
		}
		save_results(&r);
		save_minimum();
		overhead = gettime() / get_n();

		save_results(&r_save); save_n(N_save); settime(u_save);
	}
	return (overhead);
}

/*
 * Figure out how long to run it.
 * If enough == 0, then they want us to figure it out.
 * If enough is !0 then return it unless we think it is too short.
 */
static	int	long_enough;
static	int	compute_enough();

int
get_enough(int e)
{
	init_timing();
	return (long_enough > e ? long_enough : e);
}


static void
init_timing(void)
{
	static	int done = 0;

	if (done) return;
	done = 1;
	long_enough = compute_enough();
	t_overhead();
	l_overhead();
}

typedef long TYPE;

static TYPE **
enough_duration(register long N, register TYPE ** p)
{
#define	ENOUGH_DURATION_TEN(one)	one one one one one one one one one one
	while (N-- > 0) {
		ENOUGH_DURATION_TEN(p = (TYPE **) *p;);
	}
	return (p);
}

static uint64
duration(long N)
{
	uint64	usecs;
	TYPE   *x = (TYPE *)&x;
	TYPE  **p = (TYPE **)&x;

	start(0);
	p = enough_duration(N, p);
	usecs = stop(0, 0);
	use_pointer((void *)p);
	return (usecs);
}

/*
 * find the minimum time that work "N" takes in "tries" tests
 */
static uint64
time_N(long N)
{
	int     i;
	uint64	usecs;
	result_t r;

	insertinit(&r);
	for (i = 1; i < TRIES; ++i) {
		usecs = duration(N);
		insertsort(usecs, N, &r);
	}
	save_results(&r);
	save_minimum();
	return (gettime());
}

/*
 * return the amount of work needed to run "enough" microseconds
 */
static long
find_N(int enough)
{
	int		tries;
	static long	N = 10000;
	static uint64	usecs = 0;

	if (!usecs) usecs = time_N(N);

	for (tries = 0; tries < 10; ++tries) {
		if (0.98 * enough < usecs && usecs < 1.02 * enough)
			return (N);
		if (usecs < 1000)
			N *= 10;
		else {
			double  n = N;

			n /= usecs;
			n *= enough;
			N = n + 1;
		}
		usecs = time_N(N);
	}
	return (-1);
}

/*
 * We want to verify that small modifications proportionally affect the runtime
 */
static double test_points[] = {1.015, 1.02, 1.035};
static int
test_time(int enough)
{
	int     i;
	long	N;
	uint64	usecs, expected, baseline, diff;

	if ((N = find_N(enough)) <= 0)
		return (0);

	baseline = time_N(N);

	for (i = 0; i < sizeof(test_points) / sizeof(double); ++i) {
		usecs = time_N((int)((double) N * test_points[i]));
		expected = (uint64)((double)baseline * test_points[i]);
		diff = expected > usecs ? expected - usecs : usecs - expected;
		if (diff / (double)expected > 0.0025)
			return (0);
	}
	return (1);
}


/*
 * We want to find the smallest timing interval that has accurate timing
 */
static int     possibilities[] = { 5000, 10000, 50000, 100000 };
static int
compute_enough()
{
	int     i;

	if (getenv("ENOUGH")) {
		return (atoi(getenv("ENOUGH")));
	}
	for (i = 0; i < sizeof(possibilities) / sizeof(int); ++i) {
		if (test_time(possibilities[i]))
			return (possibilities[i]);
	}

	/*
	 * if we can't find a timing interval that is sufficient,
	 * then use SHORT as a default.
	 */
	return (SHORT);
}

/*
 * This stuff isn't really lib_timing, but ...
 */
void
morefds(void)
{
#ifdef	RLIMIT_NOFILE
	struct	rlimit r;

	getrlimit(RLIMIT_NOFILE, &r);
	r.rlim_cur = r.rlim_max;
	setrlimit(RLIMIT_NOFILE, &r);
#endif
}

void
touch(char *buf, int nbytes)
{
	static	int psize;	/* explicit type; implicit int is not valid in modern C */

	if (!psize) {
		psize = getpagesize();
	}
	while (nbytes > 0) {
		*buf = 1;
		buf += psize;
		nbytes -= psize;
	}
}

#if defined(hpux) || defined(__hpux)
int
getpagesize()
{
	return (sysconf(_SC_PAGE_SIZE));
}
#endif

#if defined(WIN32)
#if !defined(__CYGWIN__)
int
getpagesize()
{
	SYSTEM_INFO s;

	GetSystemInfo(&s);
	return ((int)s.dwPageSize);
}
#endif

/*
 * FILETIME value (100ns units) corresponding to 1 Jan 1970 00:00:00.
 */
LARGE_INTEGER
getFILETIMEoffset()
{
	SYSTEMTIME s;
	FILETIME f;
	LARGE_INTEGER t;

	s.wYear = 1970;
	s.wMonth = 1;
	s.wDay = 1;
	s.wHour = 0;
	s.wMinute = 0;
	s.wSecond = 0;
	s.wMilliseconds = 0;
	SystemTimeToFileTime(&s, &f);
	t.QuadPart = f.dwHighDateTime;
	t.QuadPart <<= 32;
	t.QuadPart |= f.dwLowDateTime;
	return (t);
}

/*
 * Replacement gettimeofday() for Windows.  When a performance counter is
 * available, times are measured from the first call rather than from the
 * Unix epoch, which is sufficient for interval timing.  Otherwise fall
 * back to the system time as a FILETIME, offset to 1 Jan 1970.
 */
int
gettimeofday(struct timeval *tv, struct timezone *tz)
{
	LARGE_INTEGER			t;
	FILETIME			f;
	double					microseconds;
	static LARGE_INTEGER	offset;
	static double			frequencyToMicroseconds;
	static int				initialized = 0;
	static BOOL				usePerformanceCounter = 0;

	if (!initialized) {
		LARGE_INTEGER performanceFrequency;
		initialized = 1;
		usePerformanceCounter = QueryPerformanceFrequency(&performanceFrequency);
		if (usePerformanceCounter) {
			QueryPerformanceCounter(&offset);
			/* counter ticks per microsecond */
			frequencyToMicroseconds = (double)performanceFrequency.QuadPart / 1000000.;
		} else {
			offset = getFILETIMEoffset();
			/* FILETIME counts 100ns units, i.e. 10 per microsecond */
			frequencyToMicroseconds = 10.;
		}
	}
	if (usePerformanceCounter) QueryPerformanceCounter(&t);
	else {
		GetSystemTimeAsFileTime(&f);
		t.QuadPart = f.dwHighDateTime;
		t.QuadPart <<= 32;
		t.QuadPart |= f.dwLowDateTime;
	}

	t.QuadPart -= offset.QuadPart;
	microseconds = (double)t.QuadPart / frequencyToMicroseconds;
	t.QuadPart = microseconds;
	tv->tv_sec = t.QuadPart / 1000000;
	tv->tv_usec = t.QuadPart % 1000000;
	return (0);
}
#endif


------=_NextPart_000_0024_01C17D09.039D4260--
