X-Authentication-Warning: delorie.com: mail set sender to djgpp-workers-bounces using -f
X-Recipient: djgpp-workers AT delorie DOT com
From: "Pierre Muller"
To:
Subject: 3rd try: SSE2 sigsegv due to alignment problem
Date: Tue, 11 Sep 2007 09:25:38 +0200
Message-ID: <001501c7f444$f3c5abc0$db510340$@u-strasbg.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: Acf0RPOm1aJv8lHuQdaHhvlrEwMdOQ==
Content-Language: en-us
Reply-To: djgpp-workers AT delorie DOT com

I sent this message in April, but it never reached the mailing list :(
I finally asked DJ to update my email address in the mailing list.
Thanks DJ

>> Original message starts here

Hi,

I am testing the next release of Free Pascal for go32v2.
This release uses an internal copy of the linker script, which seems to
cause a problem with the 16-byte alignment required for memory operands
that are read or written through xmm registers.

The following code is the source of the problem:

    uses
      mmx;

    { only a small test to see if it works in principle }
    var
      b : byte;
      q : array[0..15] of byte;
    begin
      if is_sse2_cpu then
        asm
          movdqa %xmm1,%xmm2
          movdqa q,%xmm4
          psubq %xmm1,%xmm2
          psubq q,%xmm4
        end;
    end.

The B and Q variables are allocated using .lcomm, but when I compile
with stabs info, I find that both B and Q are at an address ending
with 0x8.

Looking into the linker script, I discovered that the .bss section
contains this code:

    ScriptRes.Add(' .bss SIZEOF(.data) + ADDR(.data) :');
    ScriptRes.Add(' {');
    ScriptRes.Add(' _object.2 = . ;');
    ScriptRes.Add(' . += 24 ;');
    ScriptRes.Add(' *(.bss)');

which mirrors the following:

    .bss SIZEOF(.data) + ADDR(.data) :
    {
      _object.2 = . ;
      . += 24 ;
      *(.bss .bss.* .gnu.linkonce.b.*)

found in CVS revision 1.11 of djgpp/lib/djgpp.djl, which is the default
linker script if I am not mistaken.

I have no clue what these 24 reserved bytes are used for, but if I
change this to

    . += 32 ;

then both the B and Q variables are aligned on a 16-byte boundary, and
the code does not crash anymore.

Question 1: what are these 24 bytes used for?

Question 2: are there any problems with increasing this to 32 bytes,
in order to get a correct alignment?
(I noticed that the first space allocated is sel_buf, so if there is
code that assumes that sel_buf is 24 bytes past the _object.2 symbol,
this might lead to problems.)

Pierre Muller

PS: sorry, but I was unable to translate the code above into
djgpp assembler inside C.