Mail Archives: djgpp/1994/03/30/03:32:08

To: djgpp AT sun DOT soe DOT clarkson DOT edu, eliz AT is DOT elta DOT co DOT il
Subject: Re: DJGPP Speed
Date: Tue, 29 Mar 1994 23:38:37 -0800
From: Darryl Okahata <darrylo AT sr DOT hp DOT com>

>   This summarizes some testing I did for the DJ GCC 1.11maint4 as far as
> the compilation speed goes.
     [ ... ]
>   Conclusion 2: With a fairly good disk cache, the I/O is probably NOT the
>                 decisive factor (at least for the above-mentioned machines).

Here's another data point:

	System: 60MHz Pentium w/512K L2 cache & 16MB RAM, Adaptec 2842
		VLB SCSI controller, all files on a Fujitsu 2624FA disk
		in SCSI-2 mode (this is an "average-performing" disk by
		today's standards: 11ms access, 1700K/sec transfer),
		DOS 6.2, 2MB smartdrv, QEMM 7.03, QDPMI *disabled*

	File: makeinfo.c from the TeXinfo 3.1 distribution
		(DJGPP: gcc -E makeinfo.c | wc == 8766 lines)

	Times for "gcc -c -O2 -DREMOVE_OUTPUT_EXTENSIONS makeinfo.c":
		21.03 sec (~415 lines/sec)	Temp files on 2MB
						ramdisk.  All other
						files on HD.
		25.27 sec (~345 lines/sec)	Temp files on HD,
						smartdrv write cache
						enabled.
		32.07 sec (~270 lines/sec)	Temp files on HD,
						smartdrv write cache
						disabled.

[ Yeah, I know that I didn't specify -DREMOVE_OUTPUT_EXTENSIONS when I
  counted the number of lines.  This only affects about 9 or so lines,
  and so it really doesn't matter.  Besides, I'm lazy. ;-)  ]
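
     For anyone who wants to redo the lines/sec arithmetic on their own
timings, here is a small C sketch.  The function names are made up for
illustration, and times are passed in hundredths of a second (21.03 sec
becomes 2103) so that everything stays in integer math.

```c
#include <assert.h>

/* Whole lines compiled per second.  Elapsed time is given in hundredths
   of a second; integer division truncates the result. */
unsigned long lines_per_sec(unsigned long lines, unsigned long centisec)
{
    assert(centisec > 0);
    return lines * 100UL / centisec;
}

/* Percentage of wall-clock time saved by the faster setup, truncated. */
unsigned long percent_saved(unsigned long slow_cs, unsigned long fast_cs)
{
    assert(slow_cs > 0 && fast_cs <= slow_cs);
    return (slow_cs - fast_cs) * 100UL / slow_cs;
}
```

With the 8766-line count above, lines_per_sec() gives 416, 346 and 273
for the three runs, which I rounded down to ~415, ~345 and ~270.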

     What this really shows is how important a ramdisk is for tmp files.

     With MSDOS/Windows, there is no such thing as a good disk cache.
Why?  Because accessing the cache requires expensive real to protected
mode switches, and vice-versa.  On my PC, it takes about 21us to go from
real to protected, and about 19us to go from protected to real, for a
total of around 40us.  At 60MHz, this is 2400 cycles (average of 1200
cycles/switch).  This is a *LOT* of wasted cycles -- cycles which I
believe aren't wasted with other operating systems (does anyone know how
expensive call gate transitions are?).
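
     The cycle count is just microseconds times clock rate in MHz (one
MHz is one cycle per microsecond).  A minimal C sketch of that
arithmetic, with hypothetical function names, in case you want to plug
in numbers measured on your own machine:

```c
#include <assert.h>

/* CPU cycles that elapse during `usec` microseconds at `mhz` MHz. */
unsigned long cycles_for(unsigned long usec, unsigned long mhz)
{
    assert(mhz > 0);
    return usec * mhz;
}

/* Cost of one round trip: real->protected plus protected->real. */
unsigned long round_trip_cycles(unsigned long to_prot_usec,
                                unsigned long to_real_usec,
                                unsigned long mhz)
{
    return cycles_for(to_prot_usec + to_real_usec, mhz);
}
```

round_trip_cycles(21, 19, 60) returns 2400, the figure quoted above.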

     It gets uglier with 32-bit protected mode code (you know, the stuff
that DJGPP generates ;-).  To do a disk read, you have to do the
following expensive mode switches:

1. One switch to go from DJGPP code (protected mode) to the (real mode)
   disk I/O code in GO32.

2. Two or more switches (real to protected and back again) to check
   whether the data is in the disk cache.  (Hmm.  Maybe the cache's
   lookup table is stored in real-mode RAM?  If so, you wouldn't need
   these switches when the data is not in the cache.)

3. If the data is not in the cache, we read in data from the disk, and
   do two more switches (at least) to put the just-read data into the
   cache.

4. Finally, we do one more switch to get back to the DJGPP code.

At a minimum, this is 4 switches when the data is already in the cache
and 6 when it has to come from the disk.  At an average of 20us/switch,
this wastes a LOT of CPU cycles.  Making things worse is that the GO32
transfer buffer size is 4K; if we want to read a chunk of data larger
than this, we get to repeat those 4-6 switches for every 4K piece.
Yippee.  1/2 ;-)
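
     To put rough numbers on the steps above, here is a C sketch that
counts switches for a read of a given size.  It rests on assumptions:
4 switches per 4K chunk when the data is in the cache and 6 when it is
not (my reading of steps 1-4), and every chunk paying the full cost.
The function names and the cache_hit flag are hypothetical; none of
this is GO32's actual code.

```c
#include <assert.h>

#define TRANSFER_BUF 4096UL   /* GO32 transfer buffer size, per the text */

/* Number of transfer-buffer-sized chunks needed for nbytes (rounds up). */
unsigned long chunks_for(unsigned long nbytes)
{
    return (nbytes + TRANSFER_BUF - 1) / TRANSFER_BUF;
}

/* Assumed mode switches for one read: 4 per chunk on a cache hit,
   6 per chunk on a miss. */
unsigned long switches_for(unsigned long nbytes, int cache_hit)
{
    unsigned long per_chunk = cache_hit ? 4UL : 6UL;
    return chunks_for(nbytes) * per_chunk;
}

/* Time spent purely on mode switching, in microseconds. */
unsigned long overhead_usec(unsigned long nbytes, int cache_hit,
                            unsigned long usec_per_switch)
{
    assert(usec_per_switch > 0);
    return switches_for(nbytes, cache_hit) * usec_per_switch;
}
```

Under those assumptions, a 64K read that misses the cache is 16 chunks
and 96 switches, or 1920us of switching at 20us/switch.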

     I'm not complaining about DJGPP, mind you, but I'd like to point
out where some of the real inefficiencies are.  With real operating
systems, you don't have this spaghetti of time-consuming mode switches
to read data.

     -- Darryl Okahata
	Internet: darrylo AT sr DOT hp DOT com

DISCLAIMER: this message is the author's personal opinion and does not
constitute the support, opinion or policy of Hewlett-Packard or of the
little green men that have been following him all day.
