Mail Archives: djgpp/1997/02/20/11:51:09


From: kagel AT quasar DOT bloomberg DOT com

Date: Thu, 20 Feb 1997 11:16:09 -0500

Message-Id: <9702201616.AA04348@quasar.bloomberg.com >

To: eliz AT is DOT elta DOT co DOT il

Cc: jbennett AT ti DOT com, djgpp AT delorie DOT com

In-Reply-To: <Pine.SUN.3.91.970219100950.22519I-100000@is> (message from Eli Zaretskii on Wed, 19 Feb 1997 10:11:53 +0200 (IST))

Subject: Re: Netlib code [was Re: flops...]

Reply-To: kagel AT dg1 DOT bloomberg DOT com

   Errors-To: postmaster AT ns1
   Date: Wed, 19 Feb 1997 10:11:53 +0200 (IST)
   From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
   X-Sender: eliz AT is
   Cc: jbennett AT ti DOT com, djgpp AT delorie DOT com
   Mime-Version: 1.0
   Content-Type: TEXT/PLAIN; charset=US-ASCII
   Content-Length: 942


   On Tue, 18 Feb 1997 kagel AT quasar DOT bloomberg DOT com wrote:

   >    say the least.  The problem is not with the performance of the Fortran
   >    code but with the memory bandwidth overhead associated with converting
   >    the C row-major matrices to the Fortran column-major order prior to
   > 
   > What conversion?  The FORTRAN is not converting your arrays.  FORTRAN and C
   > share a common calling convention (ignoring the facts that FORTRAN passes
   > string lengths and always passes pointers).  They just disagree on which
   > dimension to increment first.  You are not inverting the arrays, are you?  Just
   > declare the C arrays with the indices reversed and everything will be fine.

   I don't know whether this is or isn't the problem which causes the
   slow-down, but note that accessing a large array columnwise might hurt
   performance due to CPU cache thrashing and virtual memory thrashing (if
   the array is large enough to exceed the physical RAM).

No, no.  You misunderstand what is happening here.  FORTRAN is still accessing
the actual data in the same order that "C" is.  It is just that the sense of
the indices is inverted in the source code.  In other words, since FORTRAN
insists that the first (row) index be incremented first, FORTRAN compiler
writers, knowing the hardware as they must, make the FORTRAN column the same
physical dimension as the "C" row so that the memory thrashing you mention
does not happen.  This means that if, for example, in FORTRAN you have:

	INTEGER*4 big_array(100,20)

Then in "C" you can declare the same memory as:

   int big_array[20][100];

And these are identical memory image definitions.  Here, check this out:

t.c:

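   #include <stdio.h>

   /* Fortran routines from tf.f.  The trailing underscore is the external
      name g77 generates by default; other compilers may differ. */
   void test_(int t1[][5]);
   void prt_(int t1[][5]);

   int main(void)
   {
       /* int matches INTEGER*4; FORTRAN sees this memory as t1(5,10) */
       int t1[10][5], i, j;

       test_( t1 );

       for (i=0;i<10;i++) {
           for (j=0;j<5;j++) {
               printf( "t1[%d][%d]=%d  ", i, j, t1[i][j] );
           }
           printf( "\n" );
       }
       printf( "\n\n" );
       prt_( t1 );
       return 0;
   }

tf.f:

      subroutine test( t1 )
      integer*4 t1(5,10), i, j, k

      k = 0
      do 100 i=1,10
         do 90 j=1,5
            k = k + 1
            t1(j,i) = k
 90      continue
 100  continue
      end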

      subroutine prt( t1 )
      integer*4 t1(5,10), i, j

      do 100 i=1,10
         do 90 j=1,5
           write( *, 200) j, i, t1(j,i)
 90      continue
 100  continue

 200  format( "T1(",i2,",",i2,")=",i4 )

      end

It all works, it is all efficient and it involves no copying of data! This
prints:

t1[0][0]=1  t1[0][1]=2  t1[0][2]=3  t1[0][3]=4  t1[0][4]=5  
t1[1][0]=6  t1[1][1]=7  t1[1][2]=8  t1[1][3]=9  t1[1][4]=10  
t1[2][0]=11  t1[2][1]=12  t1[2][2]=13  t1[2][3]=14  t1[2][4]=15  
t1[3][0]=16  t1[3][1]=17  t1[3][2]=18  t1[3][3]=19  t1[3][4]=20  
t1[4][0]=21  t1[4][1]=22  t1[4][2]=23  t1[4][3]=24  t1[4][4]=25  
t1[5][0]=26  t1[5][1]=27  t1[5][2]=28  t1[5][3]=29  t1[5][4]=30  
t1[6][0]=31  t1[6][1]=32  t1[6][2]=33  t1[6][3]=34  t1[6][4]=35  
t1[7][0]=36  t1[7][1]=37  t1[7][2]=38  t1[7][3]=39  t1[7][4]=40  
t1[8][0]=41  t1[8][1]=42  t1[8][2]=43  t1[8][3]=44  t1[8][4]=45  
t1[9][0]=46  t1[9][1]=47  t1[9][2]=48  t1[9][3]=49  t1[9][4]=50  


T1( 1, 1)=   1
T1( 2, 1)=   2
T1( 3, 1)=   3
T1( 4, 1)=   4
T1( 5, 1)=   5
T1( 1, 2)=   6
T1( 2, 2)=   7
T1( 3, 2)=   8
T1( 4, 2)=   9
T1( 5, 2)=  10
T1( 1, 3)=  11
T1( 2, 3)=  12
T1( 3, 3)=  13
T1( 4, 3)=  14
T1( 5, 3)=  15
T1( 1, 4)=  16
T1( 2, 4)=  17
T1( 3, 4)=  18
T1( 4, 4)=  19
T1( 5, 4)=  20
T1( 1, 5)=  21
T1( 2, 5)=  22
T1( 3, 5)=  23
T1( 4, 5)=  24
T1( 5, 5)=  25
T1( 1, 6)=  26
T1( 2, 6)=  27
T1( 3, 6)=  28
T1( 4, 6)=  29
T1( 5, 6)=  30
T1( 1, 7)=  31
T1( 2, 7)=  32
T1( 3, 7)=  33
T1( 4, 7)=  34
T1( 5, 7)=  35
T1( 1, 8)=  36
T1( 2, 8)=  37
T1( 3, 8)=  38
T1( 4, 8)=  39
T1( 5, 8)=  40
T1( 1, 9)=  41
T1( 2, 9)=  42
T1( 3, 9)=  43
T1( 4, 9)=  44
T1( 5, 9)=  45
T1( 1,10)=  46
T1( 2,10)=  47
T1( 3,10)=  48
T1( 4,10)=  49
T1( 5,10)=  50


-- 
Art S. Kagel, kagel AT quasar DOT bloomberg DOT com

A proverb is no proverb to you 'till life has illustrated it.  -- John Keats

