Mail Archives: djgpp/2000/01/17/12:17:14
The source codes in question were developed on a machine with
an Intel Pentium Pro 200 MHz chip. The machine has 128 MB of RAM.
The djgpp on this machine was downloaded on December 15, 1997.
The version number is 2.01. A new malloc.c was downloaded
January 2, 1998.
The source codes were copied to a machine with an AMD K6 333
MHz processor. That machine has 92 MB of RAM. The djgpp on that
machine was downloaded on December 19, 1999 (unzipped from
djdev202.zip). Presumably, this is the latest version of djgpp.
The source codes (which are a few thousand lines in length)
require only 2.4 MB of RAM (in addition to the size of the
executables, which may be as large as 0.7 MB). Therefore, it is
unlikely that memory management has anything to do with the problems
we encountered after copying the source codes from one machine to the
other.
The first problem we dealt with was what appears to us to be
a change in the compiler (we always use
gxx codename.cc -o codename.exe [-O2] [-w]
where [] indicates "optional".) In the source codes developed on
the Intel machine, it was assumed that the default return type for
subroutines is "int". That is, in an include file containing
subroutines, one writes
extern subroutinename(whatever arguments);
subroutinename(whatever arguments)
{.
.
return 0;
}
and the source compiles and the executable runs correctly.
On the AMD machine, the compiler insisted that the type (int)
be stated explicitly:
extern int subroutinename(whatever arguments);
int subroutinename(whatever arguments)
{.
.
return 0;
}
The changes were tedious, but when the changes were made, the
source codes compiled and the executables ran correctly on both
machines.
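A minimal sketch of the explicit form follows. The name subroutine and its arguments are hypothetical, chosen only for illustration. Standard C++ has no implicit "int" rule, so the older g++ was presumably accepting the short form as an extension that the newer one rejects.

```cpp
// Hypothetical example: the declaration as it would appear in the include
// file, with the return type written out.
extern int subroutine(int *total, int value);

// The definition repeats the same explicit return type.
int subroutine(int *total, int value)
{
    *total += value;   // placeholder for the real work
    return 0;          // 0 used as a "success" status, as in the post
}
```

Writing the type out is accepted by both old and new compilers, which matches the observation that the revised sources build on both machines.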
The second problem appeared to us to be a change in the
allowable assembly language (whether due to a difference in the
architecture of the AMD and Intel chips or due to a change in the
assembler we do not know). On the Intel machine, a section of
assembly language that takes an argument was coded as follows:
. //in the C++ part of the code
.
argument=something; //argument is a global variable
mode(argument);
.
.
where, in an include file containing the assembly language codes,
#define mode(argument) \
__asm__ ( \
. \
. \
: \
: "a" (argument) \
: "eax", etc );
which means that the argument is to be passed in the register eax.
This way of coding did not work on the AMD machine. We had to
change to
argument=something; //in the C++ part of the code
mode();
#define mode() \
__asm__ ( \
"movl _argument,%%eax\n\t" \
. \
: \
: \
: "eax", etc );
That is, the assembly language macros could not take an argument.
Again, making these changes was tedious, but the resulting source
code compiled correctly and ran correctly on both machines (as long
as the optimizing -O2 switch was not used; see the fourth problem
below).
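One possible explanation, which cannot be confirmed from the fragments above, is that the original macro names eax both as an input operand ("a") and in the clobber list. The GCC manual says clobbers may not overlap input or output operands, and newer GCC versions reject such code where older ones let it through. The sketch below shows the usual alternative: declare the register as an output and tie the input to it. The function body (doubling the argument) is a hypothetical stand-in for the real assembly, and the non-x86 branch exists only so the example builds anywhere.

```cpp
#include <cstdint>

// Hypothetical stand-in for the post's mode(): the argument arrives in EAX,
// the assembly modifies EAX, and the result is read back from EAX.
static inline std::uint32_t mode(std::uint32_t argument)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    std::uint32_t result;
    __asm__ ("addl %%eax, %%eax"   // placeholder for the real instructions
             : "=a" (result)       // EAX is an output ...
             : "0" (argument)      // ... and the input is tied to operand 0
             : "cc");              // flags change; EAX is NOT listed as a clobber
    return result;
#else
    return argument + argument;    // portable fallback, same result
#endif
}
```

Passing the value as an operand also tells the compiler about the data dependency, which a macro that reads a global variable behind the compiler's back does not.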
There was a third (very minor and probably not worth mentioning)
problem related to assembly language programming on the AMD machine.
Sometimes in assembly language one wants the origin of an array named
arrayname, for instance in this assembly language statement,
movl _arraynameorigin,%%ebx.
arraynameorigin has to be set somewhere. On the Intel machine,
unsigned int arraynameorigin=arrayname;
(this statement occurs in the C++ part of the code, not assembly
language) results in a warning (arraynameorigin has no cast), but
on the AMD machine it results in an error. Apparently, the compiler
has been changed. We do not understand why this should be an error.
However, since arrayname is the origin of the array arrayname, and
movl _arrayname,%%ebx
works on both machines, this problem is just a curiosity having no
importance.
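C++ has no implicit conversion from a pointer to an integer, which would explain the error, and an explicit cast removes it. The sketch below assumes arrayname is an array of unsigned int; that element type and the array size are assumptions. It uses std::uintptr_t so that it also builds on 64-bit hosts; on the 32-bit djgpp target the period equivalent would be a cast to unsigned int.

```cpp
#include <cstdint>

// Assumed declaration; the post does not show the real one.
unsigned int arrayname[4] = {1, 2, 3, 4};

// Explicit cast: the array decays to a pointer to its first element,
// and the pointer is then converted to an integer of pointer width.
std::uintptr_t arraynameorigin = reinterpret_cast<std::uintptr_t>(arrayname);
```

One caution on the assembly side: in AT&T syntax, movl _arrayname,%ebx loads the word stored at that symbol. That is the address of the data only when arrayname is a pointer variable; for a true array, movl $_arrayname,%ebx loads the address.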
The fourth problem, which we have not overcome, is the fact
that, after all of the above revisions, on the AMD machine the
optimization switch -O2 does not work. With the -O2 switch on, the
source codes compile (on both machines) with no reported errors.
With -O2, the executables run 30%-50% faster on the Intel machine,
but they crash on the AMD machine.
The use of assembly language is crucial, resulting in
executables which are roughly FOUR TIMES FASTER than those created
without the use of assembly language, for both the Intel and AMD
processors.
The -O2 switch, in the Intel case, results in another gain in speed
of 30%-50%, a gain which we would like to get for the AMD processor
if we can locate the source of this fourth problem and fix it.
Here is some timing data (run time in ticks; 91 ticks/sec):

source   assembly language   -O2     Intel     AMD
#1       no                  no       9735    4145
#1       no                  yes      4775    2690
#1       yes                 no       2155    1225
#1       yes                 yes      1550   crash
#2       no                  no       9280    3910
#2       no                  yes      4530    2545
#2       yes                 no       1780     985
#2       yes                 yes      1365   crash
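The ratios quoted in the text can be checked directly against the timing data. The two helpers below are only a convenience sketch, and their names are hypothetical.

```cpp
// How many times faster the "after" build is than the "before" build.
constexpr double speedup(double ticks_before, double ticks_after)
{
    return ticks_before / ticks_after;
}

// Convert ticks to seconds at the 91 ticks/sec rate given with the data.
constexpr double seconds(double ticks)
{
    return ticks / 91.0;
}
```

For source #1 without -O2, speedup(9735, 2155) is about 4.5 on the Intel machine and speedup(4145, 1225) is about 3.4 on the AMD machine. Adding -O2 to the assembly language build on the Intel machine gives speedup(2155, 1550), about 1.39.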
The spectacular gains in speed when using assembly language are due
to the use of the instructions adcl, mull, and divl. When the 64-bit
Merced chips appear, these instructions will be dropped (that is what
I call RISC architecture, but others mean something else). The
architecture of the Merced will be similar to the Alpha architecture
(in my naive view). I understand that these instructions will not be
dropped when AMD produces its 64-bit processor. It will be a true
extension of the Intel 486 architecture to 64 bits. And so we are very
interested in the AMD chip.
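To illustrate why these instructions matter, mull produces the full 64-bit product of two 32-bit values in the register pair edx:eax in a single instruction. The sketch below is not the code from the sources in question; the function name is hypothetical, and the fallback branch exists only so the example builds on non-x86 hosts.

```cpp
#include <cstdint>

// Full 32x32 -> 64 bit multiply using mull on x86.
static inline std::uint64_t mul32x32(std::uint32_t a, std::uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    std::uint32_t lo, hi;
    __asm__ ("mull %3"                 // EDX:EAX = EAX * operand
             : "=a" (lo), "=d" (hi)    // low half in EAX, high half in EDX
             : "0" (a), "rm" (b)       // a starts in EAX; b in a register or memory
             : "cc");
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    return static_cast<std::uint64_t>(a) * b;   // portable fallback
#endif
}
```

Both result registers are declared as outputs, so the compiler knows that edx is overwritten and will not keep a live value there across the statement.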
Grendel has asked that we post the crash reports.
Source code #1:
Exiting due to signal SIGFPE
Division by Zero at eip=000040ef, x87 status=0120
eax=00000000 ebx=00000000 ecx=00000036
edx=00000005 esi=00000000 edi=00000038
ebp=000b981c esp=000b97e4 program=C:\directory\codename.EXE
cs: sel=00ef base=83809000 limit=00acafff
ds: sel=00f7 base=83809000 limit=00acafff
es: sel=00f7 base=83809000 limit=00acafff
fs: sel=00cf base=00013d10 limit=0000ffff
gs: sel=0107 base=00000000 limit=0010ffff
ss: sel=00f7 base=83809000 limit=00acafff
App stack: [000b9978..00039978] Exceptn stack: [00039854..00037914]
Call frame traceback EIPs:
0x000040ef
0x00007d67
0x00009373
0x0001ce96
Source code #2:
Exiting due to signal SIGFPE
Division by Zero at eip=000040af, x87 status=0120
eax=00000000 ebx=00000000 ecx=00000036
edx=00000005 esi=00000000 edi=00000038
ebp=000ce118 esp=000ce0e0 program=C:\directory\codename.EXE
cs: sel=00ef base=83809000 limit=00f1afff
ds: sel=00f7 base=83809000 limit=00f1afff
es: sel=00f7 base=83809000 limit=00f1afff
fs: sel=00cf base=00013d10 limit=0000ffff
gs: sel=0107 base=00000000 limit=0010ffff
ss: sel=00f7 base=83809000 limit=00f1afff
App stack: [000ce2a8..0004e2a8] Exceptn stack: [0004e184..0004c244]
Call frame traceback EIPs:
0x000040af
0x0000de02
0x00018929
0x0002daea
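On x86, divl raises the divide error both when the divisor is zero and when the quotient does not fit in 32 bits (that is, when edx is not smaller than the divisor), and both cases are reported as "Division by Zero". The dumps alone do not show which case occurred, or whether -O2 placed different values in the registers than the assembly expects. The sketch below, with hypothetical names, checks both conditions first and declares every register the instruction touches.

```cpp
#include <cstdint>

// Divide the 64-bit value hi:lo by divisor using divl.
// Returns false instead of faulting when divl would raise the divide error.
static inline bool div64by32(std::uint32_t hi, std::uint32_t lo,
                             std::uint32_t divisor,
                             std::uint32_t &quotient, std::uint32_t &remainder)
{
    if (divisor == 0 || hi >= divisor)
        return false;                   // zero divisor or quotient overflow
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    std::uint32_t q, r;
    __asm__ ("divl %4"                  // EAX = EDX:EAX / operand, EDX = remainder
             : "=a" (q), "=d" (r)
             : "0" (lo), "1" (hi), "rm" (divisor)
             : "cc");
    quotient = q;
    remainder = r;
#else
    const std::uint64_t n = (static_cast<std::uint64_t>(hi) << 32) | lo;
    quotient = static_cast<std::uint32_t>(n / divisor);
    remainder = static_cast<std::uint32_t>(n % divisor);
#endif
    return true;
}
```

A check of this kind placed ahead of each divl in a debug build would show whether the operands change between the build without -O2 and the build with it.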
I will be off-line January 20-February 3 and February 23-March 5.