Mail Archives: djgpp-workers/1996/10/16/15:40:20
Hi Workers, SET here from Argentina:
I'm working with Robert to enhance RHIDE (or to drive Robert crazy, I
don't know). Anyway, yesterday Robert pointed out to me that memmove is very
slow compared to memcpy. I replied (approximately):
"Why? The only difference between memmove and memcpy is a comparison at the
start to select the direction of the copy."
But he showed me a test in which memmove is 2.5 times slower than
memcpy on a 486 (I tested this on my 5x86, where it is 1.74 times slower).
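Robert's test program is not reproduced in this message. A rough sketch of how such a timing comparison could be made is given below; the names copy_fn, bench_buf and time_copies are hypothetical, and the buffer size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical type: the signature shared by memcpy and memmove. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Assumed working area: two non-overlapping halves of 64 KB each. */
static unsigned char bench_buf[2 * 65536];

/* Times `reps` copies of `size` bytes from the lower half of bench_buf
   to the upper half and returns the elapsed CPU time in seconds.
   Returns -1.0 if `size` does not fit in one half. */
double time_copies(copy_fn fn, size_t size, long reps)
{
    /* A volatile function pointer discourages the compiler from
       replacing the call with an inlined builtin. */
    copy_fn volatile f = fn;
    clock_t start, stop;
    long i;

    assert(fn != NULL);
    if (size > sizeof bench_buf / 2)
        return -1.0;

    start = clock();
    for (i = 0; i < reps; i++)
        f(bench_buf + sizeof bench_buf / 2, bench_buf, size);
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Dividing time_copies(memmove, size, reps) by time_copies(memcpy, size, reps) gives a slowdown factor of the kind quoted above. Choose reps large enough that the memcpy time is well above the resolution of clock(), otherwise the ratio is meaningless.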
I looked at the sources and found the reason: memmove ALWAYS moves one byte
at a time, while memcpy uses the smart routine ___dj_movedata.
I don't know whether this is fixed in the 2.01 alpha, but if not, here is an
attempt to make memmove similar to memcpy in terms of speed:
------------------- memmove.s -------------------------
/* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
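    .file "memmove.s"
    .globl _memmove
_memmove:
    pushl %ebp
    movl %esp,%ebp
    pushl %esi
    pushl %edi
    movl 8(%ebp),%edi           # %edi = destination
    movl 12(%ebp),%esi          # %esi = source
    movl 16(%ebp),%ecx          # %ecx = byte count
    jecxz L2                    # nothing to copy
    cmpl %esi,%edi
    jb L3                       # destination below source: copy forward
    call ___dj_movedata_rev     # otherwise copy from the top down
    jmp L2
L3:
    call ___dj_movedata
L2:
    cld                         # the reverse routine leaves DF set
    popl %edi
    popl %esi
    movl 8(%ebp),%eax           # return the destination pointer
    leave
    ret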
------------------- end of memmove.s -------------------------
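For readers who prefer C, the dispatch done by memmove.s can be approximated as follows. This is only a sketch: copy_forward and copy_backward are hypothetical stand-ins for ___dj_movedata and ___dj_movedata_rev (plain byte loops here, since only the direction matters), and the address comparison assumes a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ___dj_movedata: lowest address first. */
static void copy_forward(unsigned char *d, const unsigned char *s, size_t n)
{
    while (n--)
        *d++ = *s++;
}

/* Hypothetical stand-in for ___dj_movedata_rev: highest address first. */
static void copy_backward(unsigned char *d, const unsigned char *s, size_t n)
{
    d += n;
    s += n;
    while (n--)
        *--d = *--s;
}

/* C model of the control flow in memmove.s. */
void *model_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    assert(n == 0 || (dest != NULL && src != NULL));
    if (n == 0)                             /* jecxz L2 */
        return dest;
    if ((uintptr_t)d < (uintptr_t)s)        /* cmpl %esi,%edi ; jb L3 */
        copy_forward(d, s, n);
    else
        copy_backward(d, s, n);
    return dest;                            /* movl 8(%ebp),%eax */
}
```

Copying forward is safe when the destination lies below the source, and copying backward is safe when it lies above, because in both cases every byte is read before anything overwrites it.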
------------------- djmdr.s -------------------
/* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
/* Modified by SET to copy in reverse order */
# This routine moves %ecx bytes from %ds:%esi to %es:%edi, starting at the
# highest address and working down. It clobbers %eax, %ecx, %esi, %edi, and
# eflags (the direction flag is left set, so the caller must execute cld).
    .file "djmdr.s"
    .text
    .align 4
    .globl ___dj_movedata_rev
___dj_movedata_rev:
    std
    # Point both indices at the last byte of their blocks
    addl %ecx,%edi
    addl %ecx,%esi
    decl %esi
    decl %edi
    cmpl $15,%ecx
    jle small_move
    jmp mod_4_check
    # Transfer bytes until either %esi or %edi points at the last byte
    # of an aligned dword (address mod 4 == 3)
align_mod_4:
    movsb
    decl %ecx
mod_4_check:
    movl %esi,%eax
    andl $3,%eax
    cmpl $3,%eax
    jz big_move
    movl %edi,%eax
    andl $3,%eax
    cmpl $3,%eax
    jnz align_mod_4
big_move:
    movb %cl,%al        # We will store the leftover count in %al
    shrl $2,%ecx
    andb $3,%al
    # Move the indices back 3 positions, to the first byte of the dword
    subl $3,%edi
    subl $3,%esi
    rep
    movsl
    # %ecx is known to be zero here, so insert the leftover count from %al
    movb %al,%cl
    # Advance the indices by 3, to the next byte still to be copied
    addl $3,%edi
    addl $3,%esi
small_move:
    rep
    movsb
    ret
------------------- end of djmdr.s -------------------
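The three phases of the reverse routine (single bytes until a pointer is aligned, whole dwords, leftover bytes) may be easier to follow in C. The function below is a hypothetical model, not the code that would go into libc: it keeps pointers one past the next byte to copy, so the assembly test "address mod 4 == 3" becomes "pointer mod 4 == 0", and it uses memcpy through a temporary for the 32-bit moves to stay within portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of ___dj_movedata_rev: copies n bytes from src to
   dest starting at the highest address, which is the safe order when
   dest lies above src and the blocks overlap. */
void model_movedata_rev(unsigned char *dest, const unsigned char *src, size_t n)
{
    unsigned char *d;
    const unsigned char *s;
    uint32_t word;
    size_t words;

    assert(n == 0 || (dest != NULL && src != NULL));
    if (n == 0)
        return;

    d = dest + n;               /* one past the last destination byte */
    s = src + n;                /* one past the last source byte */

    if (n > 15) {
        /* align_mod_4: single bytes until either pointer sits on a
           4-byte boundary (at most 3 iterations) */
        while (((uintptr_t)s & 3) != 0 && ((uintptr_t)d & 3) != 0) {
            *--d = *--s;
            n--;
        }
        /* big_move: whole dwords, highest first (rep movsl with DF set) */
        for (words = n >> 2; words > 0; words--) {
            s -= 4;
            d -= 4;
            memcpy(&word, s, 4);    /* the load finishes before the store */
            memcpy(d, &word, 4);
        }
        n &= 3;                     /* leftover count, kept in %al */
    }

    /* small_move: remaining bytes (rep movsb) */
    while (n--)
        *--d = *--s;
}
```

Each dword is read completely before it is written, and every write lands above all bytes that are still to be read, so the word-sized moves remain safe even when dest is only one byte above src.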
I tested it and it *seems* to work fine, but:
a) I'm not sure.
b) It can probably be optimized.
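One way to gain more confidence on point a) would be an exhaustive check over small buffers. The sketch below is an assumption about how such a check could look, not the test that was actually run; move_fn, AREA, count_memmove_failures and forward_only_copy are hypothetical names. It tries every source offset, destination offset and length inside a 64-byte area, which covers all alignments and counts on both sides of the 15-byte threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: signature of the routine under test. */
typedef void *(*move_fn)(void *, const void *, size_t);

#define AREA 64     /* assumed test size */

/* Returns the number of (source offset, destination offset, length)
   cases in which fn disagrees with a reference copy made through a
   temporary buffer, or returns the wrong pointer. */
long count_memmove_failures(move_fn fn)
{
    static unsigned char work[AREA], expect[AREA], tmp[AREA];
    long failures = 0;
    size_t so, dof, len, i;

    assert(fn != NULL);
    for (so = 0; so < AREA; so++)
        for (dof = 0; dof < AREA; dof++)
            for (len = 0; so + len <= AREA && dof + len <= AREA; len++) {
                for (i = 0; i < AREA; i++)
                    work[i] = expect[i] = (unsigned char)(i * 7 + 1);
                /* reference result: copy out to tmp, then into place */
                memcpy(tmp, expect + so, len);
                memcpy(expect + dof, tmp, len);
                if (fn(work + dof, work + so, len) != work + dof ||
                    memcmp(work, expect, AREA) != 0)
                    failures++;
            }
    return failures;
}

/* Negative control: a forward-only byte copy, which is wrong whenever
   the destination overlaps the source from above. */
void *forward_only_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dest;
}
```

Calling count_memmove_failures(memmove) against the library built with the new routine should return 0, while the negative control should report failures, which shows that the harness does detect overlap bugs.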
I hope this helps and can be included in 2.01.
SET
********************************************************************************
Salvador Eduardo Tropea (SET) - salvador AT inti DOT edu DOT ar
Work: INTI (National Institute of Industrial Technology) Sector: ICE
(Electronic Control & Instrumentation)
Post (Home): Curapaligue 2124 - Caseros (1678)- Buenos Aires - Argentina