X-Authentication-Warning: delorie.com: mailnull set sender to djgpp-workers-bounces using -f
Date: Mon, 11 Feb 2002 09:08:03 -0600
From: Eric Rudd
Subject: Re: Alignment problem
To: djgpp-workers AT delorie DOT com
Message-id: <3C67DE53.9922E8BF@cyberoptics.com>
Organization: CyberOptics
MIME-version: 1.0
X-Mailer: Mozilla 4.72 [en] (Win95; U)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
X-Accept-Language: en,pdf
References: <3C629769 DOT AEAFB611 AT cyberoptics DOT com>
 <379-Fri08Feb2002101042+0200-eliz AT is DOT elta DOT co DOT il>
 <200202081420 DOT g18EKWb06863 AT envy DOT delorie DOT com>
 <7872-Fri08Feb2002203948+0200-eliz AT is DOT elta DOT co DOT il>
 <200202081853 DOT g18IrgO08699 AT envy DOT delorie DOT com>
 <9003-Sat09Feb2002090616+0200-eliz AT is DOT elta DOT co DOT il>
Reply-To: djgpp-workers AT delorie DOT com

Eli Zaretskii wrote:

> > Date: Fri, 8 Feb 2002 13:53:42 -0500
> > From: DJ Delorie
> >
> > Malloc doesn't need to align to "optimum" alignment.  It only needs
> > align to "required" alignment.
>
> I'm not sure I understand this: what is the ``required'' alignment for a
> buffer that is 8 bytes large or larger?  Isn't the required alignment
> for these buffers 8-byte alignment?  That is, if I malloc a double,
> shouldn't I expect to get a double aligned on 8-byte boundary?

I think that some confusion has arisen from the fact that the various
posters are discussing three different questions:

1.  What alignment is required by the C standard?

2.  What alignment is required for the x86 to execute correctly?

3.  What alignment is required for the x86 to execute at optimal speed?

As far as I understand, the C standard only requires that the pointer
returned by malloc be usable to point to any data type or structure that
one might use in C.  The x86 processor requires only that the data be
byte-aligned for correct execution.
(I just looked at the opcode description, and this appears to be the case
even for the {L,S}{I,G}DT that Martin mentions.)  I see no evidence that
the current malloc fails to meet either of these first two requirements.
Thus, the present behavior of malloc does not indicate a bug.

However, there are substantial performance penalties from misaligned
memory references, so from the standpoint of wanting compiled code to
execute at optimal speed, there is a desire to improve the alignment of
pointers returned from malloc.

Here is an excerpt from the "Intel Pentium 4 and Intel Xeon Processor
Optimization Reference Manual":

"Assembly/Compiler Coding Rule 15. (H impact, H generality) Align data on
natural operand size address boundaries.

For best performance, align data as follows:

* Align 8-bit data at any address.
* Align 16-bit data to be contained within an aligned four byte word.
* Align 32-bit data so that its base address is a multiple of four.
* Align 64-bit data so that its base address is a multiple of eight.
* Align 80-bit data so that its base address is a multiple of sixteen.
* Align 128-bit data so that its base address is a multiple of sixteen.

A 64-byte or greater data structure or array should be aligned so that its
base address is a multiple of 64.

Sorting data in decreasing size order is one heuristic for assisting with
natural alignment.  As long as 16-byte boundaries (and cache lines) are
never crossed, natural alignment is not strictly necessary, though it is
an easy way to enforce this."

It appears that there is some merit in aligning to as large as a 64-byte
boundary (the size of one cache line on the Pentium 4), though I think
that most of the needs would be met with 8-byte alignment.

-Eric
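
For anyone who wants to see what alignment a particular malloc actually
hands back, the check can be sketched in a few lines of C.  This is only
an illustration, not a measurement of DJGPP's malloc: the names
pointer_alignment and min_malloc_alignment are hypothetical helpers, the
block sizes are arbitrary, and the cap of 64 simply mirrors the Pentium 4
cache-line size quoted above.  It also assumes a compiler that provides
uintptr_t in <stdint.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest power of two, capped at 64, that divides the address p.
   (Hypothetical helper; the cap is an assumption taken from the
   cache-line size mentioned in the text.) */
static size_t pointer_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    size_t align = 1;

    assert(p != NULL);          /* a null pointer has no useful alignment */

    /* Double the candidate alignment while the corresponding low
       address bit is still zero. */
    while (align < 64 && (addr & align) == 0)
        align <<= 1;
    return align;
}

/* Allocate a few blocks of arbitrary sizes and return the smallest
   alignment observed among them, or 0 if any allocation fails. */
static size_t min_malloc_alignment(void)
{
    static const size_t sizes[] = { 1, 8, 10, 16, 100, 1000 };
    enum { COUNT = sizeof sizes / sizeof sizes[0] };
    void *blocks[COUNT];
    size_t worst = 64;
    size_t got = 0;
    size_t i;

    /* Keep every block alive until the end so the addresses are distinct. */
    for (i = 0; i < (size_t)COUNT; i++) {
        blocks[i] = malloc(sizes[i]);
        if (blocks[i] == NULL)
            break;
        got++;
        if (pointer_alignment(blocks[i]) < worst)
            worst = pointer_alignment(blocks[i]);
    }
    for (i = 0; i < got; i++)
        free(blocks[i]);

    return got == (size_t)COUNT ? worst : 0;
}
```

Calling min_malloc_alignment() from main and printing the result shows
which of the three questions a given library answers: any nonzero result
satisfies the byte-alignment requirement, while a result of 8 or more
would also cover the natural alignment of a double.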