Mail Archives: cygwin/1997/02/09/14:08:52

From: khan AT xraylith DOT wisc DOT edu (Mumit Khan)
Subject: g77 (v0.5.19) patch for b17
Date: 9 Feb 1997 14:08:52 -0800
Approved: cygnus DOT gnu-win32 AT cygnus DOT com
Distribution: cygnus
Message-ID: <9702092102.AA19670.cygnus.gnu-win32@modi.xraylith.wisc.edu>
Original-To: gnu-win32 AT cygnus DOT com
In-Reply-To: Your message of "Mon, 03 Feb 1997 21:42:23 +0900."
<199702031249 DOT VAA29486 AT PPP01 DOT infoPepper DOT or DOT jp>
Original-Sender: owner-gnu-win32 AT cygnus DOT com

Here's what I had sent the g77 maintainers Dec 9, 1996. I've built g77
as a cross-compiler *only*, and not as a native toolchain. Please make
sure that the correct flags are passed to f77-runtime, otherwise file
I/O goes to hell (see below for more info).

 -- using template mhl.format --
Date:    Mon, 09 Dec 1996 21:04:50 CST
From:    Mumit Khan <khan AT modi DOT xraylith DOT wisc DOT edu>
Subject: g77 changes/comments under gnu-win32 [annoyingly verbose]

I'm appending a tiny patch which, when applied to GNU-win32 beta17, should 
"help" (see item 4 below) build g77 on a Windows '95 or NT machine. Correctness 
is a different issue of course, and remains to be seen for the most part.
Please forward to fortran@ if appropriate.

Here're some thoughts on the patches and on g77 under '95/NT in general.

[general stuff]

1. I'm using both gnu-win32 b16 and b17 for testing, which in turn use
   gcc-2.7.2-960712 and gcc-2.7.2-961023 snapshots respectively. There 
   was a minor back-end interface change between 2.7.2 and 960712 and
   another one (see item 3 below) between 960712 and 961023 that affect
   g77.

2. Back-end bugs: gnu-win32 b16 had some major back-end bugs in numerics 
   that caused programs compiled with -O to produce Inf and NaN ad infinitum,
   so I wanted to wait and test g77 until after those bugs were fixed.
   Looks like b17 does it right.

3. There were some tree interface changes, and g77 gurus should definitely
   critically examine my patches to f/com.c. The deletion of OFFSET_REF
   according to Craig Burley shouldn't affect things (I've simply added
   #ifdef around it), but the new build_complex interface is the one I
   have severe doubts about. My gcc back-end knowledge is about 6 years
   old, and things sure have changed.

[configuration/building]

4. I've built the whole bit as an x-compiler on a Linux (RedHat 4.0)
   box, so some of the configuration problems that I've run into will 
   not happen when built natively under NT. I don't have ready access
   to an NT machine, so I have to live with it.

   The configuration values for the f77-runtime are where all the magic
   happens (just one or two, but they make the difference between a working
   and a non-working compiler :-). Configuration for x-compilers is tricky in 
   general, since there is no way to test for certain run-time parameters 
   (such as USE_STRLEN or the sprintf test). In these cases pessimistic values 
   work, albeit at some performance cost. I haven't made the appropriate
   changes in f/runtime/configure.in, so these are what you have to change
   MANUALLY to make the runtime work.

   Here's the DEFS entry for gcc/f/runtime/lib?77/Makefile:

   DEFS =  -DSTDC_HEADERS=1 -D_POSIX_SOURCE=1 -DRETSIGTYPE=void \
       -DMISSING_FILE_ELEMS=1 -DIEEE_drem=1 -DWANT_LEAD_0=1 \
       -DNON_UNIX_STDIO=1
   
   Here's what configure guesses. Configure really guesses only one
   parameter incorrectly, NON_UNIX_STDIO, but that's the one that makes
   the runtime fail. The rest are simply my personal choices.

   (BAD)DEFS =  -DSTDC_HEADERS=1 -D_POSIX_SOURCE=1 -DRETSIGTYPE=void \
       -DMISSING_FILE_ELEMS=1 -DIEEE_drem=1 -DUSE_STRLEN=1 \
       -DNON_ANSI_RW_MODES=1 -DPad_UDread=1 -DALWAYS_FLUSH=1 \
       -DWANT_LEAD_0=1 

   NON_UNIX_STDIO: configure assumes that the existence of fstat implies 
   UNIX-type stdio, which is not the case for gnu-win32. This is the most 
   crucial parameter, especially if you want your unformatted files to be
   written correctly.

[performance]

5. Runtime performance for unformatted files is HORRIBLE. Code compiled
   with MS Powerstation writes unformatted files so much faster that I
   didn't even bother timing it. The reason is the way libI77 generates
   the START and END records for unformatted records.

   Here's an example:

     real blah
     write (iunit) blah
   
   Here's the skeleton f2c or g77 generated code:
       
     s_wsue
	f__nowwriting
     do_uio
	do_us
     e_wsue
   
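   Adding the libc routines (the byte counts in the comments assume a record
   of three 4-byte integers rather than the single real above):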

     ==============================================================

     s_wsue
	f__nowwriting
	  [
	    loc = ftell(fp)
	    fseek (fp, loc, SEEK_SET)
	  ]
	[
	  /* remember the record location so we can come back and write
	   * the record length after the entire record is written.
	   * f__recloc == 0 if we open a new file, which is the case
	   * here. Advance 4 bytes (sizeof(long)) to start writing the
	   * data. uiolen is long integer (4 bytes) below.
	   */
	  f__recloc = ftell(fp)
	  fseek(fp, sizeof(uiolen), SEEK_CUR)
	]
	
     do_uio
	do_us
	   [
	     fwrite(ptr, sizeof(int), nelem, fp)
	     /* current file ptr should be f__recloc + 8 */
	   ]
     e_wsue
	 [
	   /* record length 12 bytes ( 3 * sizeof(int) ) */
	   reclen = 12
	   fwrite(&reclen, sizeof(int), 1, fp)
	   /* current file ptr should be f__recloc +  20. Remember it
	    * so we can come back to it after writing the record length
	    * at the beginning of the record. 
	    */ 

	   loc = ftell(fp)
	   /* loc should be 20 here, since we've kept room for a 4 byte 
	    * length (in s_wsue) and written 3 4-byte integers (in calls
	    * to do_uio/do_us) and written the length at the end already.
	    * THIS IS WHERE THINGS GO WRONG (ftell returns 16, not 20)!
	    */

	   /* go back to beginning of record (0 offset in this case) */
	   fseek(fp, f__recloc, SEEK_SET)
	   fwrite(&reclen, sizeof(int), 1, fp)
	   /* back to f__recloc + 20, ready for next record */
	   fseek(fp, loc, SEEK_SET)
	 ]

   ================================================================

   So, where's the problem, you ask? All those ftells and fseeks are KILLING
   the performance under gnu-win32 and make the code write so slowly that
   I could probably outrun it by hand. fseek and ftell call fflush, and
   when you're writing a large number of records (which our legacy code
   does, and it's killing us) this is disastrous.
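
   The seek pattern in the trace above can be sketched in plain C. This is
   a minimal illustration, not the libI77 source: write_record_backpatch is
   a hypothetical name, and the int32_t length is an assumption chosen to
   match the 4-byte uiolen mentioned in the trace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a libI77 function): writes one unformatted
 * record the "backpatch" way. Room is left for the leading length, the
 * payload and trailing length are written, and then the stream seeks
 * back to fill in the leading length. Returns 0 on success, -1 on error. */
int write_record_backpatch(FILE *fp, const void *data, size_t nbytes)
{
    int32_t reclen;
    long start, end;

    assert(nbytes <= INT32_MAX);      /* assumed 4-byte record length */
    reclen = (int32_t) nbytes;

    start = ftell(fp);                /* plays the role of f__recloc */
    if (start < 0)
        return -1;
    /* skip over the slot for the leading length */
    if (fseek(fp, (long) sizeof reclen, SEEK_CUR) != 0)
        return -1;

    if (nbytes > 0 && fwrite(data, 1, nbytes, fp) != nbytes)
        return -1;
    /* trailing length */
    if (fwrite(&reclen, sizeof reclen, 1, fp) != 1)
        return -1;

    end = ftell(fp);                  /* remember where the next record starts */
    if (end < 0)
        return -1;
    /* go back and fill in the leading length */
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1;
    if (fwrite(&reclen, sizeof reclen, 1, fp) != 1)
        return -1;
    /* return to the end of the record, ready for the next one */
    return fseek(fp, end, SEEK_SET) == 0 ? 0 : -1;
}
```

   Each record costs two ftell calls and three fseek calls in this sketch;
   in the trace above, f__nowwriting adds one more of each.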

   What I've done is rewrite our unformatted I/O routines in C (since I
   know our records and also the format that libf2c uses), and I suddenly
   get about a 270% performance increase in the I/O. Here are some numbers
   from Linux and sparc-sun-sunos4.1.3 boxen for an intentionally badly
   written FORTRAN unformatted write with an implied-DO loop:

     i586-*-linux:
	C VERSION    : 0.38user 0.69system 0:01.28elapsed 
	G77 VERSION  : 3.17user 3.86system 0:08.88elapsed

	% ls -lg *.out
	-rw-rw-r--   1 khan     staff     5200000 Dec  1 23:39 cwrite.out
	-rw-rw-r--   1 khan     staff     5200000 Dec  1 23:39 fwrite.out
	% diff -s *.out
	Files cwrite.out and fwrite.out are identical
      
     sparc-sun-sunos4.1.3:
	g77-0.5.18   : 9.2 real         3.8 user         5.2 sys
	Sun FORTRAN 2.0.1 patch 100968-02
		     : 3.8 real         2.6 user         1.1 sys
	gcc-2.7.2    : 2.4 real         1.6 user         0.6 sys  

   Here's the fortran code (please don't comment on the quality ;-):

   ================
	    program fort
    c
	    implicit integer (a-z)
	    parameter (npoints = 50000)
	    parameter (ncol = 12)
	    double precision array(18, npoints)
    c
	    do 10 j = 1, npoints
		do 10 i = 1, ncol
		    array(i, j) = j
     10	continue
    c
	    open(11, file='fwrite.out', form='unformatted', status='unknown',
	 $		err=100)
	    rewind(11)
    c
	    do 99 i = 1, npoints
		write(11) (array(j, i), j = 1, ncol)
     99	continue
	    close(11)
	    istatus = 0
	    goto 200
     100	write(*,*) 'error opening file'
	    istatus = 1
     200	call exit(istatus)
	    end

   ================

   And the equivalent C code:
   ================

    #include <stdio.h>

    #define ncol 12
    #define npoints 50000

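    int main(void) {
	/* static: 50000 x 18 doubles is about 7 MB, too large for many stacks */
	static double array[npoints][18];
	int i, j, reclen;
	FILE *fp;

	for (i = 0; i < npoints; ++i) {
	    for (j = 0; j < ncol; ++j) {
		array[i][j] = i+1;
	    }
	}

	fp = fopen("cwrite.out", "wb");
	if (fp == NULL) {
	    perror("cwrite.out");
	    return 1;
	}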
	for (i = 0; i < npoints; ++i) {
	    reclen = sizeof(double) * ncol;
	    fwrite(&reclen, sizeof(int), 1, fp);
	    fwrite(array[i], sizeof(double) * ncol, 1, fp);
	    fwrite(&reclen, sizeof(int), 1, fp);
	}
	fclose(fp);
	return 0;
    }

   ================
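
   A small reader can confirm that a file has this framing (length,
   payload, the same length again). This is a hedged sketch: count_records
   is a hypothetical helper, and it assumes 4-byte native-endian lengths
   as in the C program above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical checker: walks a stream of length-framed records from the
 * current position. Returns the number of well-formed records, or -1 if
 * a record is truncated or its two length fields disagree. */
long count_records(FILE *fp)
{
    long n = 0;
    int32_t head, tail;

    assert(fp != NULL);

    while (fread(&head, sizeof head, 1, fp) == 1) {
        if (head < 0)
            return -1;
        /* skip the payload; only the framing is checked */
        if (fseek(fp, (long) head, SEEK_CUR) != 0)
            return -1;
        if (fread(&tail, sizeof tail, 1, fp) != 1)
            return -1;                /* truncated record */
        if (tail != head)
            return -1;                /* leading and trailing lengths differ */
        ++n;
    }
    return ferror(fp) ? -1 : n;
}
```

   The stream should be opened in binary mode ("rb"). For either of the
   two .out files above, the expected count would be npoints.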

   Now my question for the compiler gurus. Is there any way G77 can
   pre-scan the write statements and WRITE out the record length in
   s_wsue instead of counting the number of bytes written and writing
   it out in e_wsue? Essentially that's what I do in my C code and
   the result is self-evident.
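
   One way to picture the pre-scan idea: if the sizes of all items in the
   I/O list are known before any data is written, the length can be summed
   first and the record written strictly sequentially. This is a minimal
   sketch under that assumption; struct io_item, record_length and
   write_record_prescan are hypothetical names, not part of g77 or libI77.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical descriptor for one item of a WRITE I/O list. */
struct io_item {
    const void *ptr;      /* start of the data */
    size_t elem_size;     /* bytes per element */
    size_t nelem;         /* number of elements */
};

/* Pass 1: total payload bytes for the whole I/O list. */
size_t record_length(const struct io_item *items, size_t nitems)
{
    size_t total = 0, i;

    for (i = 0; i < nitems; ++i)
        total += items[i].elem_size * items[i].nelem;
    return total;
}

/* Pass 2: length, payload, length, with no ftell or fseek at all.
 * Returns 0 on success, -1 on a write error. */
int write_record_prescan(FILE *fp, const struct io_item *items, size_t nitems)
{
    size_t total = record_length(items, nitems), i;
    int32_t reclen;

    assert(total <= INT32_MAX);       /* assumed 4-byte record length */
    reclen = (int32_t) total;

    if (fwrite(&reclen, sizeof reclen, 1, fp) != 1)
        return -1;
    for (i = 0; i < nitems; ++i) {
        if (items[i].elem_size == 0 || items[i].nelem == 0)
            continue;                 /* nothing to write for this item */
        if (fwrite(items[i].ptr, items[i].elem_size, items[i].nelem, fp)
                != items[i].nelem)
            return -1;
    }
    if (fwrite(&reclen, sizeof reclen, 1, fp) != 1)
        return -1;
    return 0;
}
```

   With one item of 12 eight-byte doubles per record, record_length gives
   96 bytes, so each record occupies 104 bytes on disk and 50000 records
   come to 5200000 bytes, the size of the two .out files listed earlier.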
  

[patch]

6. Here's the patch; it should apply cleanly to gnu-win32 b17. For b16,
   leave out the build_complex hunk, since that interface change was
   introduced in b17.

   To successfully get things to work, stop the compilation after
   gcc/f/runtime/configure is done running and substitute the DEFS
   from item 4 above. You might also have to set RANLIB_TEST = true in
   a top-level Makefile to avoid a syntax error when ranlib'ing 
   libf2c.a (pretty obvious ...).

   I have some local fixes to configure.in, but I'll wait till I hear
   about these changes first from g77 maintainers before I do some more
   work on those.

7. I should really write a pause() substitute, but since I don't use
   it (nor shall I in the future), I'll let somebody else do it right.
   Currently it assumes the MSDOS behaviour, since libc in gnu-win32
   lacks pause().

Regards,
Mumit -- khan AT xraylith DOT wisc DOT edu
http://www.xraylith.wisc.edu/~khan/

Index: gcc/f/Make-lang.in
===================================================================
RCS file: /home/khan/src/CVSROOT/gnu/gcc/f/Make-lang.in,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 Make-lang.in
*** Make-lang.in	1996/12/08 07:19:57	1.1.1.1
--- Make-lang.in	1996/12/08 07:30:21
***************
*** 267,273 ****
    $(srcdir)/f/com.h f/proj.h $(srcdir)/f/runtime/Makefile.in \
    $(srcdir)/f/runtime/libF77/Makefile.in \
    $(srcdir)/f/runtime/libI77/Makefile.in \
!   $(GCC_PARTS) $(srcdir)/config/$(xmake_file) $(srcdir)/config/$(tmake_file)
  # The make "stage?" in compiler spec. is fully qualified as above
  	top=`pwd`; \
  	src=`cd $(srcdir); pwd`; \
--- 267,273 ----
    $(srcdir)/f/com.h f/proj.h $(srcdir)/f/runtime/Makefile.in \
    $(srcdir)/f/runtime/libF77/Makefile.in \
    $(srcdir)/f/runtime/libI77/Makefile.in \
!   $(GCC_PARTS) $(xmake_file) $(tmake_file)
  # The make "stage?" in compiler spec. is fully qualified as above
  	top=`pwd`; \
  	src=`cd $(srcdir); pwd`; \
***************
*** 283,289 ****
  #	cd f/f2c; $(MAKE) all
  #
  #f/f2c/Makefile: $(srcdir)/f/f2c/Makefile.in $(GCC_PARTS) \
! #            $(srcdir)/config/$(xmake_file) $(srcdir)/config/$(tmake_file)
  #	top=`pwd`; cd f/f2c; \
  #          $${top}/f/f2c/configure --srcdir=$${top}/f/f2c
  
--- 283,289 ----
  #	cd f/f2c; $(MAKE) all
  #
  #f/f2c/Makefile: $(srcdir)/f/f2c/Makefile.in $(GCC_PARTS) \
! #            $(xmake_file) $(tmake_file)
  #	top=`pwd`; cd f/f2c; \
  #          $${top}/f/f2c/configure --srcdir=$${top}/f/f2c
  
Index: gcc/f/com.c
===================================================================
RCS file: /home/khan/src/CVSROOT/gnu/gcc/f/com.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 com.c
*** com.c	1996/12/08 07:19:57	1.1.1.1
--- com.c	1996/12/08 08:06:55
***************
*** 9879,9885 ****
  	  case FFEINFO_kindtypeANY:
  	    return error_mark_node;
  	  }
! 	item = build_complex (build_real (el_type, real),
  			      build_real (el_type, imag));
  	TREE_TYPE (item) = tree_type;
        }
--- 9879,9885 ----
  	  case FFEINFO_kindtypeANY:
  	    return error_mark_node;
  	  }
! 	item = build_complex (NULL_TREE, build_real (el_type, real),
  			      build_real (el_type, imag));
  	TREE_TYPE (item) = tree_type;
        }
***************
*** 11656,11662 ****
--- 11656,11664 ----
  	  || (TREE_CODE (item) == INDIRECT_REF)
  	  || (TREE_CODE (item) == ARRAY_REF)
  	  || (TREE_CODE (item) == COMPONENT_REF)
+ #ifdef OFFSET_REF
  	  || (TREE_CODE (item) == OFFSET_REF)
+ #endif
  	  || (TREE_CODE (item) == BUFFER_REF)
  	  || (TREE_CODE (item) == REALPART_EXPR)
  	  || (TREE_CODE (item) == IMAGPART_EXPR))
Index: gcc/f/runtime/libF77/s_paus.c
===================================================================
RCS file: /home/khan/src/CVSROOT/gnu/gcc/f/runtime/libF77/s_paus.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 s_paus.c
*** s_paus.c	1996/12/08 07:20:03	1.1.1.1
--- s_paus.c	1996/12/08 07:29:00
***************
*** 60,66 ****
  	if( isatty(fileno(stdin)) )
  		s_1paus(stdin);
  	else {
! #if defined (MSDOS) && !defined (GO32)
  		FILE *fin;
  		fin = fopen("con", "r");
  		if (!fin) {
--- 60,66 ----
  	if( isatty(fileno(stdin)) )
  		s_1paus(stdin);
  	else {
! #if (defined (MSDOS) && !defined (GO32)) || defined(__CYGWIN32__)
  		FILE *fin;
  		fin = fopen("con", "r");
  		if (!fin) {
