From: sandmann AT clio DOT rice DOT edu (Charles Sandmann)
Message-Id: <10303092007.AA19233@clio.rice.edu>
Subject: Win2K/XP uclock() - using rdtsc
To: djgpp-workers AT delorie DOT com (DJGPP developers)
Date: Sun, 9 Mar 2003 14:07:58 -0600 (CST)
X-Mailer: ELM [version 2.5 PL2]
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Reply-To: djgpp-workers AT delorie DOT com

I played with this a bit.  I found out that when the BIOS tic counter
is set varies widely.  I had the process busy-waiting on the BIOS
counter, then checking rdtsc.  For example, each tic took between 4.5
million and 117 million cycles on my machine (in a test of 1000 tics).
It missed 10 tics in that time frame (elapsed was 1010 tics).  A normal
tic on this machine should take 24.7 million cycles.  So trying to use
a single tic to calibrate the clock would potentially add lots of
inaccuracy.  But waiting more tics makes the first call slower.  Even 9
tics (1/2 second) frequently ends up with a 3.3% error; 18 tics is
frequently a 1.3% error.  Because of that noise, saving a single
integer "divider" doesn't sacrifice anything.  The algorithm is:

  wait for a tic change, save the new tic as tic1
  read the tsc
  wait for (n) tic changes, save the new tic as tic2
  read the tsc
  uclock divider = delta tsc / ((tic2 - tic1) * 65536)   (65536 uclocks per tic)

On the 450MHz test machine, the divider should be 377.

So my question is - how accurate should we try to be?  On a 60MHz
system (the slowest ever to support rdtsc), the divider would be 50 -
which means a potential 2% error from the integer truncation alone -
but this seems within the probable calibration loop error.  Comments?
How about a 2% target, with roughly a 1/2 second calibration?  Too long?

I did consider keeping the tic1 which went with a "base" tsc and trying
to re-calibrate on the fly if the divider wasn't in the right interval,
but this seemed overly complex.  (By the way, I also looked at using
the PIT as something to calibrate against, and that seems to just make
things worse - maybe the PIT is emulated as well.)

If this works OK, we should probably modify delay() to use uclock() in
the Win2K case, so we can get higher resolution than a very irregular
clock tick.
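
For concreteness, here's a minimal sketch in C of the calibration loop
described above.  This isn't the actual uclock() patch - the helper
names are made up for illustration.  It assumes DJGPP's _farpeekl() to
read the BIOS tick count at linear address 0x46C, plus a small rdtsc
wrapper:

  #include <go32.h>
  #include <sys/farptr.h>

  /* Read the time stamp counter (edx:eax) as one 64-bit value. */
  static unsigned long long read_tsc(void)
  {
    unsigned long long tsc;
    __asm__ __volatile__ ("rdtsc" : "=A" (tsc));
    return tsc;
  }

  /* BIOS time-of-day tick count lives at 0040:006C (linear 0x46C). */
  static unsigned long bios_tic(void)
  {
    return _farpeekl(_dos_ds, 0x46c);
  }

  /* Time n BIOS tics against the TSC and compute cycles per uclock
     (65536 uclocks per tic). */
  static unsigned long calibrate_divider(unsigned long n)
  {
    unsigned long tic1, tic2;
    unsigned long long tsc1, tsc2;

    tic1 = bios_tic();
    while ((tic2 = bios_tic()) == tic1)    /* wait for a tic change */
      ;
    tic1 = tic2;
    tsc1 = read_tsc();

    while ((tic2 = bios_tic()) - tic1 < n) /* wait for n more changes */
      ;
    tsc2 = read_tsc();

    return (unsigned long)
      ((tsc2 - tsc1) / ((unsigned long long)(tic2 - tic1) * 65536));
  }

With n = 9 (roughly half a second) on the 450MHz machine, this should
come back with a divider near 377; uclock() would then just return
delta tsc / divider.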
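
And a rough sketch of what the delay() change could look like - again
just an illustration, assuming uclock() has already been calibrated as
above (the function name is hypothetical):

  #include <time.h>
  #include <dpmi.h>

  /* Busy-wait on uclock() instead of the irregular clock tick. */
  void uclock_delay(unsigned msec)
  {
    uclock_t end = uclock() + ((uclock_t)msec * UCLOCKS_PER_SEC) / 1000;
    while (uclock() < end)
      __dpmi_yield();   /* give the Win2K scheduler a chance */
  }

The __dpmi_yield() keeps the loop from burning a whole time slice while
spinning, at the cost of possibly overshooting the deadline a little.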