Date: Mon, 9 Mar 1998 11:30:30 +0200 (IST)
From: Eli Zaretskii
To: Randy Sorensen
cc: djgpp AT delorie DOT com
Subject: Re: Optimized code, comparing with Borland C++ 4.5 w/ Power Pack
In-Reply-To: <350334da.0@superego.idcomm.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk

On Sun, 8 Mar 1998, Randy Sorensen wrote:

> Here's my problem. When I do shut down out of DOS and run the exec's
> included on the CD, they run up to 60 fps, where as the code that I ported
> to DJGPP runs at 40 fps with the following optimizations:
> "-O6 -ffast-math -funroll-loops -finline -m486".

Which version of the compiler did you use? I can confirm that for
heavy-duty C++ code (not C code), GCC 2.7.x is not as good as BC++. It
seems like the optimizer in cc1plus needs some work (which isn't
surprising in a language as complex as C++). But I haven't tried my
benchmark, which is a fast chess player, on GCC 2.8 or EGCS (aka PGCC),
so I don't know whether these do better. If you used 2.7.2.1, I suggest
trying these latter versions as well.

One thing you need to be aware of is that the startup code of v2.01
doesn't ensure the stack is aligned even on a word boundary, let alone
double-word (the latter is best for Pentium, so I'm told). I think this
is only important on Windows, but I'm not sure what you meant by ``shut
down out of DOS''--do you reboot in plain DOS, or do you use the
so-called ``DOS mode'' of Windows? (If the latter, try the former.)

Anyway, you could try replacing crt0.o with the version from the latest
v2.02 alpha, which does align the stack.

> Is there any other optimizations that will speed it up? Also, I've
> heard that using high "-O"'s will cause problems.. should I bring it
> down to 4 or 3?

You will have to experiment. The FAQ has some advice in section 14.2,
but it wasn't yet updated to reflect the changes in GCC 2.8 and PGCC.
In general, I find that -funroll-loops sometimes slows down the code, so
make sure you really want it. The newer versions of GCC have some
additional align-related options which you might try.

> Also, since you can't write to video memory in DJGPP by default, I
> went about doing so using __djgpp_nearptr_enable() and adding
> __djgpp_conventional_base to the video memory address. Is there a
> faster way of going about video memory writing?

It depends on the program code. If the code can be switched to far
pointers easily, I suggest trying that. My experience is that far
pointers are no slower than near pointers, but you might avoid the
overhead of the call to `__djgpp_nearptr_enable', which is quite heavy
(it calls a DPMI function).

> And lastly, I had to put some extra type-casts in there, since gcc.exe kept
> giving me warnings about assigning doubles to unsigned char's and stuff.

Huh? Are you sure it was about doubles and not pointers to double? If
the former, then the original program was surely broken.

> Thanks for all your help :) I'd appreciate it if you emailed me rather than
> posted a response, since I don't check the groups or the listserv very
> often.

If you want direct replies, please make the effort to set your headers
correctly, without any anti-spam fakes. Some people reply to so many
messages that they cannot afford to edit the headers.
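
For illustration, here is a minimal sketch of the far-pointer approach to
video memory writing mentioned above. It is not taken from the original
program. It assumes VGA mode 13h (320x200, one byte per pixel, frame
buffer at linear address 0xA0000), which the message does not state, and
it assumes the video mode has already been set elsewhere. The names
put_pixel, get_pixel, fill_row, vga_poke and vga_peek are hypothetical.
Under DJGPP the sketch uses _farpokeb and _farpeekb from <sys/farptr.h>
with the _dos_ds selector from <go32.h>; the #else branch is only a
stand-in array so that the code also compiles on other systems.

```c
#include <assert.h>

/* Assumption: VGA mode 13h, 320x200, one byte per pixel at 0xA0000. */
#define VGA_BASE   0xA0000UL
#define VGA_WIDTH  320
#define VGA_HEIGHT 200

#ifdef __DJGPP__
#include <go32.h>        /* _dos_ds: selector covering conventional memory */
#include <sys/farptr.h>  /* _farpokeb, _farpeekb */

/* Write one byte through the far pointer; no nearptr_enable needed. */
static void vga_poke(unsigned long offset, unsigned char value)
{
    _farpokeb(_dos_ds, VGA_BASE + offset, value);
}

static unsigned char vga_peek(unsigned long offset)
{
    return _farpeekb(_dos_ds, VGA_BASE + offset);
}
#else
/* Stand-in for non-DJGPP hosts only: an ordinary array replaces the
   frame buffer so the sketch can be compiled and checked anywhere. */
static unsigned char fake_vga[VGA_WIDTH * VGA_HEIGHT];

static void vga_poke(unsigned long offset, unsigned char value)
{
    fake_vga[offset] = value;
}

static unsigned char vga_peek(unsigned long offset)
{
    return fake_vga[offset];
}
#endif

/* Hypothetical helper: plot one pixel, checking the coordinates first. */
void put_pixel(int x, int y, unsigned char color)
{
    assert(x >= 0 && x < VGA_WIDTH && y >= 0 && y < VGA_HEIGHT);
    vga_poke((unsigned long)y * VGA_WIDTH + (unsigned long)x, color);
}

/* Hypothetical helper: read one pixel back. */
unsigned char get_pixel(int x, int y)
{
    assert(x >= 0 && x < VGA_WIDTH && y >= 0 && y < VGA_HEIGHT);
    return vga_peek((unsigned long)y * VGA_WIDTH + (unsigned long)x);
}

/* Hypothetical helper: fill one scan line with a single color. */
void fill_row(int y, unsigned char color)
{
    int x;
    for (x = 0; x < VGA_WIDTH; x++)
        put_pixel(x, y, color);
}
```

In a tight inner loop it may be worth loading the selector once with
_farsetsel(_dos_ds) and then using _farnspokeb, which skips the
per-call selector load; whether that helps in a given program is
something to measure rather than assume.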