Mail Archives: djgpp-workers/1996/10/16/15:40:20

From: "Salvador Eduardo Tropea (SET)" <salvador AT inti DOT edu DOT ar>
To: djgpp-workers AT delorie DOT com
Subject: memmove v.s. memcpy, trying to solve
Date: Wed, 16 Oct 1996 16:23:12 +0300 (GMT)
Message-ID: <9610161623.aa08018@ailin.inti.edu.ar>

Hi Workers, SET here from Argentina:

  I'm working with Robert to enhance RHIDE (or to drive Robert crazy, I
don't know). Anyway, yesterday Robert pointed out to me that memmove is very
slow in comparison to memcpy. I replied (approx.):
"Why? The only difference between memmove and memcpy is a comparison at the
start to select the direction of the copy."
  But he showed me a test in which memmove is 2.5 times slower than memcpy
on a 486 (I tested this on my 5x86 and it is 1.74 times slower).
  I looked at the sources and discovered the reason: memmove ALWAYS moves
one byte at a time, whereas memcpy uses the smart routine ___dj_movedata.
  I don't know if that's fixed in the 2.01 alpha, but if not, here is an
attempt to make memmove similar to memcpy in terms of speed:

------------------- memmove.s -------------------------
/* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
        .file "memmove.s"
        .globl  _memmove
_memmove:
        pushl   %ebp
        movl    %esp,%ebp
        pushl   %esi
        pushl   %edi
        movl    8(%ebp),%edi
        movl    12(%ebp),%esi
        movl    16(%ebp),%ecx
        jecxz   L2

        cmpl    %esi,%edi
        jb      L3

        call    ___dj_movedata_rev
        jmp     L2
L3:
        call    ___dj_movedata

L2:
        cld
        popl    %edi
        popl    %esi
        movl    8(%ebp),%eax
        leave
        ret
------------------- end of memmove.s -------------------------
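
  For readers who prefer C, the direction choice made in memmove.s can be
modelled roughly as below. This is only a sketch: memmove_model,
copy_forward and copy_backward are hypothetical names, and the two helpers
are plain byte loops standing in for ___dj_movedata and ___dj_movedata_rev,
not the real routines. Comparing the two pointers assumes both point into
the same object, as they do when the regions overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ___dj_movedata: lowest address first. */
static void copy_forward(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical stand-in for ___dj_movedata_rev: highest address first. */
static void copy_backward(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n-- > 0)
        dst[n] = src[n];
}

/* Mirrors the cmpl/jb pair: dst below src copies forward, otherwise backward. */
static void *memmove_model(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n == 0)                 /* jecxz L2 */
        return dst;
    if (d < s)                  /* jb L3 */
        copy_forward(d, s, n);
    else
        copy_backward(d, s, n);
    return dst;                 /* movl 8(%ebp),%eax */
}

/* Small self-check covering both overlap directions. */
static void memmove_model_demo(void)
{
    char up[] = "abcdef";
    char down[] = "abcdef";

    memmove_model(up + 2, up, 4);       /* dst above src: backward copy */
    assert(memcmp(up, "ababcd", 6) == 0);

    memmove_model(down, down + 2, 4);   /* dst below src: forward copy */
    assert(memcmp(down, "cdefef", 6) == 0);
}
```

  Copying backward when dst is above src means each source byte is read
before the copy can overwrite it; copying forward does the same job when
dst is below src.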

------------------- djmdr.s -------------------
/* Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details */
/* Modified by SET to copy in reverse order */
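# This routine moves %ecx bytes from %ds:%esi to %es:%edi, starting at the
# highest address and working downward.  It clobbers %eax, %ecx, %esi, %edi,
# and eflags, and returns with the direction flag set (the caller must cld).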

        .file "djmdr.s"
        .text
        .align 4
        .globl ___dj_movedata_rev
___dj_movedata_rev:
        std
        # Point %esi and %edi at the last byte of each block
        addl    %ecx,%edi
        addl    %ecx,%esi
        decl    %esi
        decl    %edi

        cmpl    $15,%ecx
        jle     small_move
        jmp     mod_4_check
        
        # Transfer bytes until the last-byte address in either %esi or %edi
        # is congruent to 3 mod 4, so the dword ending there is 4-byte aligned
align_mod_4:    
        movsb
        decl    %ecx
mod_4_check:
        movl    %esi,%eax
        andl    $3,%eax
        cmpl    $3,%eax
        jz big_move
        movl    %edi,%eax
        andl    $3,%eax
        cmpl    $3,%eax
        jnz     align_mod_4

big_move:
        movb    %cl,%al  # We will store leftover count in %al
        shrl    $2,%ecx
        andb    $3,%al
        # Now move both indexes back 3 positions, to the start of the last dword
        subl    $3,%edi
        subl    $3,%esi
        rep
        movsl

        # %ecx is known to be zero here, so load the leftover count from %al
        movb    %al,%cl

        # Advance both indexes by 3, back to the last byte not yet copied
        addl    $3,%edi
        addl    $3,%esi

small_move:
        rep
        movsb
        ret
------------------- end of djmdr.s -------------------
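
  The pointer arithmetic in djmdr.s is easy to get wrong, so here is a rough
C model of the same steps. It is a sketch, not a replacement:
movedata_rev_model is a hypothetical name, the pointers are kept one past
the last byte (so the assembly's "address mod 4 == 3" test becomes
"address mod 4 == 0" here), and each dword goes through a temporary to stay
within the C aliasing rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes from src to dst, highest address first. */
static void movedata_rev_model(unsigned char *dst, const unsigned char *src,
                               size_t n)
{
    unsigned char *d = dst + n;         /* one past the last byte */
    const unsigned char *s = src + n;

    if (n > 15) {                       /* cmpl $15,%ecx / jle small_move */
        /* align_mod_4: single bytes until either pointer is dword aligned */
        while (((uintptr_t)s & 3) != 0 && ((uintptr_t)d & 3) != 0) {
            *--d = *--s;
            n--;
        }
        /* big_move: rep movsl with the direction flag set */
        for (size_t dwords = n >> 2; dwords > 0; dwords--) {
            uint32_t w;
            s -= 4;
            d -= 4;
            memcpy(&w, s, sizeof w);    /* read the whole dword first */
            memcpy(d, &w, sizeof w);    /* then write it */
        }
        n &= 3;                         /* leftover count kept in %al */
    }
    /* small_move: rep movsb with the direction flag set */
    while (n-- > 0)
        *--d = *--s;
}

/* Compare against the library memmove on an overlapping upward shift. */
static void movedata_rev_demo(void)
{
    unsigned char a[64], b[64];

    for (size_t i = 0; i < sizeof a; i++)
        a[i] = b[i] = (unsigned char)i;
    movedata_rev_model(a + 5, a, 50);   /* dst above src, regions overlap */
    memmove(b + 5, b, 50);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

  Because each dword is read completely before it is written, and later
reads are always at lower addresses than earlier writes, the copy stays
correct even when dst is fewer than 4 bytes above src.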

  I tested it and it *seems* to work fine, but:

a) I'm not sure.
b) It can probably be optimized.
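
  One way to be more sure about (a) is to compare a candidate routine with a
copy made through a temporary buffer, for every source offset, destination
offset and length in a small buffer. The checker below is a sketch of that
idea; count_move_failures, move_fn, forward_only and the buffer size of 48
are all hypothetical choices, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF = 48 };  /* large enough to exercise lengths above 15 */

typedef void *(*move_fn)(void *, const void *, size_t);

/* Count the (src, dst, len) combinations where fn gives a wrong result. */
static unsigned long count_move_failures(move_fn fn)
{
    unsigned char work[BUF], expect[BUF], tmp[BUF];
    unsigned long failures = 0;

    for (size_t src = 0; src < BUF; src++)
        for (size_t dst = 0; dst < BUF; dst++) {
            size_t max = BUF - (src > dst ? src : dst);
            for (size_t len = 0; len <= max; len++) {
                for (size_t i = 0; i < BUF; i++)
                    work[i] = expect[i] = (unsigned char)(i * 7 + 1);
                /* Reference result: go through tmp, so overlap cannot matter */
                memcpy(tmp, expect + src, len);
                memcpy(expect + dst, tmp, len);
                fn(work + dst, work + src, len);
                if (memcmp(work, expect, BUF) != 0)
                    failures++;
            }
        }
    return failures;
}

/* Deliberately naive forward-only copy, to show the checker catches
   overlap bugs when dst is above src. */
static void *forward_only(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* The library memmove must pass; the forward-only copy must not. */
static void checker_demo(void)
{
    assert(count_move_failures(memmove) == 0);
    assert(count_move_failures(forward_only) > 0);
}
```

  A new _memmove built from the listings above could be passed to the same
checker in place of the library one.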

  Hope this helps and can be included in 2.01.

SET


********************************************************************************
Salvador Eduardo Tropea (SET) - salvador AT inti DOT edu DOT ar
Work: INTI (National Institute of Industrial Technology) Sector: ICE 
(Electronic Control & Instrumentation)
Post (Home): Curapaligue 2124 - Caseros (1678)- Buenos Aires - Argentina 
