Mail Archives: djgpp/1995/04/21/02:38:28

Date: Fri, 21 Apr 1995 02:41:57 -0300 (ADT)
From: Bill Davidson <bdavidson AT ra DOT isisnet DOT com>
Subject: Optimization and _farp*
To: djgpp AT sun DOT soe DOT clarkson DOT edu

Hi:
I had a bit of a headscratcher that I thought was worth mentioning.
The inline functions in <sys/farptr.h> only work with optimization on, 
right?  Well, optimization can also mangle them severely!

My program had a macro that expanded as follows:

while ((Kptval & 0x80) == 0)
   _farpokew(_go32_conventional_mem_selector(), 0x400+0x1a, \
	 _farpeekw(_go32_conventional_mem_selector(), 0x400+0x1c));

(I didn't write the program, just porting it!)
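
For readers unfamiliar with the addresses: in the BIOS data area, 0x400+0x1a
holds the keyboard buffer head pointer and 0x400+0x1c the tail pointer, so
copying the tail over the head empties the keyboard buffer. The sketch below
models only that data movement in portable C. The names fake_mem, sim_peekw,
sim_pokew and sim_flush_keyboard are hypothetical stand-ins, not part of
<sys/farptr.h>, and a plain array replaces real conventional memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the first bytes of conventional memory. */
static uint8_t fake_mem[0x500];

/* Offsets taken from the macro in the post. */
enum { KBD_HEAD = 0x400 + 0x1a, KBD_TAIL = 0x400 + 0x1c };

/* Read a 16-bit word from the fake memory (models _farpeekw). */
static uint16_t sim_peekw(uint32_t off)
{
    uint16_t v;
    assert(off + sizeof v <= sizeof fake_mem);
    memcpy(&v, &fake_mem[off], sizeof v);
    return v;
}

/* Write a 16-bit word to the fake memory (models _farpokew). */
static void sim_pokew(uint32_t off, uint16_t v)
{
    assert(off + sizeof v <= sizeof fake_mem);
    memcpy(&fake_mem[off], &v, sizeof v);
}

/* Same data movement as the macro body: head = tail. */
static void sim_flush_keyboard(void)
{
    sim_pokew(KBD_HEAD, sim_peekw(KBD_TAIL));
}

/* The buffer is empty when head and tail are equal. */
static int sim_buffer_is_empty(void)
{
    return sim_peekw(KBD_HEAD) == sim_peekw(KBD_TAIL);
}
```

In the real program the reads and writes go through the selector returned by
_go32_conventional_mem_selector(), which is where the trouble described next
comes from.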
The program crashed (exception 13) in this loop (thank you, symify!).
edebug32 revealed that the source compiled (with -O) into the following:

	mov	ax, [kptval]
	test	al, al
	jl	*someplace else
	mov	ecx, 0x4070
	mov	si, fs:[ecx]	<< this is where it crashed, fs == 0000
	nop
	call	__go32_conventional_mem_selector
	mov	ebx, eax
	call	__go32_conventional_mem_selector
	mov	fs, ax
	mov	fs, bx		<< nice optimization, eh?
etc...

An examination of the _farpeekw() code revealed that it is composed of 
*two* separate asm() statements, and the optimizer reordered them!  I 
replaced it with:

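extern inline unsigned short
_farpeekw (unsigned short selector, unsigned long offset) {
    /* offset is 32 bits wide so it can serve as a base register */
    unsigned short result;
    /* one asm(): the %fs load and the %fs-relative read stay together */
    asm("movw %1, %%fs \n"
        ".byte 0x64 \n"         /* FS segment override prefix */
        "movw (%2), %0 "
        : "=r" (result)
        : "r" (selector), "r" (offset));
    return result;
}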

and everything is fine.  I intend to rewrite *all* these functions this 
way in the next couple of days, and would be happy to upload the result 
if anyone wants.  I believe they are much safer this way, since the 
selector will have to be loaded before it is referenced.
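
The ordering hazard can be modeled in portable C. This is only a sketch under
stated assumptions: sim_fs stands in for the %fs register, SIM_SELECTOR is a
made-up selector value, SIM_FAULT stands in for exception 13, and none of the
sim_* names exist in DJGPP. A C compiler will not reorder these ordinary
function calls; the model just shows why a read that depends on hidden state
is only safe when the load travels with it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_SELECTOR 0x00afu  /* made-up non-zero selector value */
#define SIM_FAULT    (-1)     /* stands in for exception 13 */

static uint8_t  sim_mem[0x500];  /* fake conventional memory */
static uint16_t sim_fs;          /* fake %fs; 0 models the null selector */

/* Step 1 of the two-statement style: load the segment register. */
static void sim_load_fs(uint16_t selector)
{
    sim_fs = selector;
}

/* Step 2: read through whatever sim_fs holds at this moment. */
static int sim_read_fs_word(uint32_t off, uint16_t *out)
{
    if (sim_fs != SIM_SELECTOR)
        return SIM_FAULT;        /* stale or null selector */
    assert(off + sizeof *out <= sizeof sim_mem);
    memcpy(out, &sim_mem[off], sizeof *out);
    return 0;
}

/* Combined style: the load and the read form one unit, so no caller
   (or optimizer, in the real case) can place the read first. */
static int sim_farpeekw(uint16_t selector, uint32_t off, uint16_t *out)
{
    sim_load_fs(selector);
    return sim_read_fs_word(off, out);
}
```

Calling sim_read_fs_word() while sim_fs is still 0 reports SIM_FAULT, which
mirrors the crash with fs == 0000 in the listing; sim_farpeekw() cannot be
misused that way.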

Bill Davidson
bdavidson AT ra DOT isisnet DOT com
