From: "Dave Korn"
Subject: RE: g++ 3.4.0 cygwin, codegen SSE & alignement issues
Date: Wed, 28 Apr 2004 17:55:20 +0100
In-Reply-To: <6.0.1.1.0.20040428085945.01f20a90@imap.myrealbox.com>

> -----Original Message-----
> From: Tim Prince
> Sent: 28 April 2004 17:19

> Because of the different division of responsibilities, if a function built
> by gcc is called by a function built by a commercial compiler (or by gcc
> -Os), the stack has a 75% probability of being mis-aligned. It may be
> possible to overcome this by having a wrapper function between, which is
> built by gcc with alignment specified, but does not use SSE.

  I once wrote a patch for gcc (for the ppc backend, but the principles should be applicable if not the actual code) to add a new -m option, the effect of which was to modify prolog generation code so that instead of just subtracting a constant from the sp to allocate the new frame, it also dynamically calculated how much extra to subtract to get the correct alignment for the resulting new sp value. It was pretty simple, involving just a few extra assembler instructions in each prolog.

[ In fact, it may not be as simple as that (...any more). With the ppc eabi, the effect of allocating more space on the stack than you've actually defined in the stack frame is that a gap opens up between the outgoing args area, which grows up from the bottom of the frame, and the local vars and saved regs area, which grow down from the top of the frame.
This didn't do any harm in 2.95.x, but it might well go wrong in gcc-3.x.x, where the handling of eliminable regs and starting frame offset is different. I'm also unsure about how badly this sort of malarkey might break gdb's understanding of what is going on in a function's frame, but I would imagine it would do so quite badly. ]

  It's a total waste of bytes in a situation where you know that the OS or CRT gets it right for you, but it would be useful in a mixed objects/abis/compilers situation. Looks like there might be call for the same sort of thing for the i.86 backend?

> Presumably, there is a performance advantage to gcc of assuming that the
> caller passes an aligned stack, but not enough to persuade commercial
> compilers to adopt a compatible scheme.

  Well, it's quicker to allocate a constant size stack frame than to dynamically calculate the alignment requirements, but only by two or three fairly trivial instructions. And although aligning the stack just once at startup and then keeping it aligned by always allocating aligned-size stack frames avoids even that cost, in some situations stack memory is a limited resource. Particularly since not all code uses vector registers, there's a lot of stack memory usage to be saved by not making all the stack frames bigger just for the sake of the very few frames for functions that actually use the vector regs. So I'd say it's probably one of those trade-offs for which there's no one 'right' answer.

    cheers,
      DaveK
--
Can't think of a witty .sigline today....
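
  For illustration only, here is the arithmetic that such a prolog would have to do, written out in C rather than assembler. This is a minimal sketch and not the patch itself: the function names frame_padding and aligned_new_sp are made up for this example, and it assumes a stack that grows downwards and an alignment that is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from any gcc patch): number of extra bytes to
   subtract, on top of the constant frame size, so that the new sp ends up
   on an `align`-byte boundary.  Assumes a downward-growing stack and that
   `align` is a power of two. */
uintptr_t frame_padding(uintptr_t sp, uintptr_t frame_size, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Low bits of the unaligned new sp are exactly the excess to drop. */
    return (sp - frame_size) & (align - 1);
}

/* Hypothetical helper: the sp value the function body would run with,
   i.e. the constant allocation plus the dynamically computed padding. */
uintptr_t aligned_new_sp(uintptr_t sp, uintptr_t frame_size, uintptr_t align)
{
    return sp - frame_size - frame_padding(sp, frame_size, align);
}
```

  If the caller only guarantees 4-byte alignment and the callee wants 16, the padding comes out as 0, 4, 8 or 12 depending on the incoming sp, so it is non-zero in three cases out of four, which is presumably where the 75% figure quoted above comes from.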