Mail Archives: djgpp/1997/02/20/11:51:09
Errors-To: postmaster AT ns1
Date: Wed, 19 Feb 1997 10:11:53 +0200 (IST)
From: Eli Zaretskii <eliz AT is DOT elta DOT co DOT il>
X-Sender: eliz AT is
Cc: jbennett AT ti DOT com, djgpp AT delorie DOT com
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Length: 942
On Tue, 18 Feb 1997 kagel AT quasar DOT bloomberg DOT com wrote:
> say the least. The problem is not with the performance of the Fortran
> code but with the memory bandwidth overhead associated with converting
> the C row-major matrices to the Fortran column-major order prior to
>
> What conversion? The FORTRAN is not converting your arrays. FORTRAN and C
> share a common calling convention (ignoring the facts that FORTRAN passes
> string lengths and always passes pointers). They just disagree on which
> dimension to increment first. You are not inverting the arrays, are you? Just
> declare the C arrays with the indices reversed and everything will be fine.
I don't know whether this is or isn't the problem which causes the
slow-down, but note that accessing a large array columnwise might hurt
performance due to CPU cache thrashing and virtual memory thrashing (if
the array is large enough to exceed the physical RAM).
No, no. You misunderstand what is happening here. FORTRAN is still accessing
the actual data in the same order that "C" is. It is just that the sense of
the indices is inverted in the source code. In other words, since FORTRAN
insists that the row index be incremented first, FORTRAN compiler writers,
knowing the hardware as they must, make the FORTRAN column the same physical
dimension as the "C" row, so that the memory thrashing you mention does not
happen. This means that if, for example, in FORTRAN you have:
INTEGER*4 big_array(100,20)
Then in "C" you can declare the same memory as:
int big_array[20][100];
And these are identical memory image definitions. Here, check this out:
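t.c:
#include <stdio.h>

/* FORTRAN subroutines from tf.f. The trailing underscore is the external
   name that g77 and f2c generate by default. */
void test_(int t1[][5]);
void prt_(int t1[][5]);

int main(void)
{
    int t1[10][5], i, j;   /* same memory image as FORTRAN t1(5,10) */

    test_( t1 );           /* FORTRAN fills the array */
    for (i=0;i<10;i++) {
        for (j=0;j<5;j++) {
            printf( "t1[%d][%d]=%d ", i, j, t1[i][j] );
        }
        printf( "\n" );
    }
    printf( "\n\n" );
    prt_( t1 );            /* FORTRAN prints the same memory */
    return 0;
}
tf.f:
      subroutine test( t1 )
c     Fill t1 in memory order: the first index varies fastest.
      integer*4 t1(5,10), i, j, k
      k = 0
      do 100 i=1,10
         do 90 j=1,5
            k = k + 1
            t1(j,i) = k
 90      continue
 100  continue
      end
      subroutine prt( t1 )
c     Print t1 in memory order.
      integer*4 t1(5,10), i, j
      do 100 i=1,10
         do 90 j=1,5
            write( *, 200) j, i, t1(j,i)
 90      continue
 100  continue
 200  format( "T1(",i2,",",i2,")=",i4 )
      end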
It all works, it is all efficient and it involves no copying of data! This
prints:
t1[0][0]=1 t1[0][1]=2 t1[0][2]=3 t1[0][3]=4 t1[0][4]=5
t1[1][0]=6 t1[1][1]=7 t1[1][2]=8 t1[1][3]=9 t1[1][4]=10
t1[2][0]=11 t1[2][1]=12 t1[2][2]=13 t1[2][3]=14 t1[2][4]=15
t1[3][0]=16 t1[3][1]=17 t1[3][2]=18 t1[3][3]=19 t1[3][4]=20
t1[4][0]=21 t1[4][1]=22 t1[4][2]=23 t1[4][3]=24 t1[4][4]=25
t1[5][0]=26 t1[5][1]=27 t1[5][2]=28 t1[5][3]=29 t1[5][4]=30
t1[6][0]=31 t1[6][1]=32 t1[6][2]=33 t1[6][3]=34 t1[6][4]=35
t1[7][0]=36 t1[7][1]=37 t1[7][2]=38 t1[7][3]=39 t1[7][4]=40
t1[8][0]=41 t1[8][1]=42 t1[8][2]=43 t1[8][3]=44 t1[8][4]=45
t1[9][0]=46 t1[9][1]=47 t1[9][2]=48 t1[9][3]=49 t1[9][4]=50
T1( 1, 1)= 1
T1( 2, 1)= 2
T1( 3, 1)= 3
T1( 4, 1)= 4
T1( 5, 1)= 5
T1( 1, 2)= 6
T1( 2, 2)= 7
T1( 3, 2)= 8
T1( 4, 2)= 9
T1( 5, 2)= 10
T1( 1, 3)= 11
T1( 2, 3)= 12
T1( 3, 3)= 13
T1( 4, 3)= 14
T1( 5, 3)= 15
T1( 1, 4)= 16
T1( 2, 4)= 17
T1( 3, 4)= 18
T1( 4, 4)= 19
T1( 5, 4)= 20
T1( 1, 5)= 21
T1( 2, 5)= 22
T1( 3, 5)= 23
T1( 4, 5)= 24
T1( 5, 5)= 25
T1( 1, 6)= 26
T1( 2, 6)= 27
T1( 3, 6)= 28
T1( 4, 6)= 29
T1( 5, 6)= 30
T1( 1, 7)= 31
T1( 2, 7)= 32
T1( 3, 7)= 33
T1( 4, 7)= 34
T1( 5, 7)= 35
T1( 1, 8)= 36
T1( 2, 8)= 37
T1( 3, 8)= 38
T1( 4, 8)= 39
T1( 5, 8)= 40
T1( 1, 9)= 41
T1( 2, 9)= 42
T1( 3, 9)= 43
T1( 4, 9)= 44
T1( 5, 9)= 45
T1( 1,10)= 46
T1( 2,10)= 47
T1( 3,10)= 48
T1( 4,10)= 49
T1( 5,10)= 50
--
Art S. Kagel, kagel AT quasar DOT bloomberg DOT com
A proverb is no proverb to you 'till life has illustrated it. -- John Keats