From: avillaci AT ceibo DOT fiec DOT espol DOT edu DOT ec
Date: Tue, 22 May 2001 08:59:40 -0500
Message-Id: <200105221359.IAA08632@localhost.localdomain>
To: djgpp AT delorie DOT com
Cc: bug-gnu-utils AT gnu DOT org
Subject: AS incorrectly sign-extends constants assembled with instructions with .code16gcc
Reply-To: djgpp AT delorie DOT com

While experimenting with the .code16 and .code16gcc directives in order to compile C code for 16-bit real mode, I found that some constants that appeared in the C code as arguments to a procedure were being sign-extended, as if by assignment from 'signed short' to 'signed long'. However, the arguments were unsigned longs such as 0x8000UL, passed to a procedure that takes 'dwords', defined as 'typedef unsigned long dword'.

Examination of the assembly output showed that the compiler was generating the correct instructions with the correct constants (because .code16gcc was given through an __asm__ directive, the compiler could not know that the code would be assembled into a USE16 segment). Examination with DEBUG showed that the constant was sign-extended in the binary opcodes, though not in the assembly output.
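The widening described above can be reproduced in isolation. What follows is a minimal C sketch, not code from the report: the names sign_extend_16_to_32 and check_reported_values are hypothetical, and uint32_t stands in for the report's 'dword' on the assumption that 'unsigned long' is 32 bits wide, as it is under DJGPP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: treat the low 16 bits of v as a signed short and
   widen them to 32 bits, which is the transformation the report observes. */
uint32_t sign_extend_16_to_32(uint32_t v)
{
    uint32_t low = v & 0xFFFFu;

    /* If bit 15 is set, the upper 16 bits are filled with ones. */
    return (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
}

/* Hypothetical check using the constants named in the report. */
void check_reported_values(void)
{
    assert(sign_extend_16_to_32(0x8000u) == 0xFFFF8000u);
    assert(sign_extend_16_to_32(0xFFFFu) == 0xFFFFFFFFu);
    /* A constant with bit 15 clear is left unchanged. */
    assert(sign_extend_16_to_32(0x7FFFu) == 0x7FFFu);
}
```

Under these assumptions, 0x8000 and 0xFFFF come out as 0xFFFF8000 and 0xFFFFFFFF, the same values found in the binary opcodes, while a constant with bit 15 clear passes through untouched.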
This means that, when assembling under the .code16gcc directive, 'as' incorrectly sign-extends 32-bit constants that fit in a 16-bit word and have bit 15 set (that is, constants such as 0x00008000UL whose most significant set bit is bit 15), at least with the 'pushl $constant', 'movl $constant, r/m32', and 'andl/orl $constant, r/m32' instructions. The file 'as_bug.s' is a minimal example that reproduces the bug. The binary output is the following:

    66 68 00 80 FF FF    /* assembled from 'pushl $0x8000', but is really 'pushl $0xFFFF8000' */
    66 58                /* popl %eax */
    66 83 E0 FF          /* assembled from 'andl $0xFFFF, %eax', but is really 'andl $0xFFFFFFFF, %eax' */
    66 B8 FF FF FF FF    /* assembled from 'movl $0xFFFF, %eax', but is really 'movl $0xFFFFFFFF, %eax' */
    90                   /* NOPs for padding */
    90
    90
    90
    90
    66 68 00 80 FF FF    /* assembled from 'pushl $0xFFFF8000' */
    66 58                /* popl %eax */
    66 83 E0 FF          /* assembled from 'andl $0xFFFFFFFF, %eax' */
    66 B8 FF FF FF FF    /* assembled from 'movl $0xFFFFFFFF, %eax' */
    90                   /* NOPs for padding */
    90
    90
    90
    90
    B8 00 4C             /* Normal exit with INT 21h */
    CD 21

Constants assembled in section '.data' with '.long' are not affected by .code16gcc, so a partial workaround is to store constants as initialized global variables, which are then accessed from code. However, this does not work when garbage in the high word of a 32-bit register needs to be masked out with, for example, 'andl $65535, %eax'.
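The opcode bytes listed above can be checked by hand. The C sketch below is illustrative only: all function names are hypothetical, and the reading of 66 as the operand-size prefix, 68 and B8 as the 32-bit-immediate forms of PUSH and of MOV to %eax, and 83 E0 as AND on %eax with a sign-extended 8-bit immediate follows the standard IA-32 encoding rather than anything stated in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit immediate from a dump. */
uint32_t read_imm32_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: widen the 8-bit immediate of the 83 /4 form of AND. */
uint32_t read_imm8_sign_extended(uint8_t b)
{
    return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
}

/* What 'andl $mask, %eax' computes for a given register value. */
uint32_t and_mask(uint32_t eax, uint32_t mask)
{
    return eax & mask;
}

/* Hypothetical check against the bytes shown in the listing. */
void check_reported_bytes(void)
{
    const uint8_t push_bytes[] = { 0x66, 0x68, 0x00, 0x80, 0xFF, 0xFF };
    const uint8_t and_bytes[]  = { 0x66, 0x83, 0xE0, 0xFF };
    const uint8_t mov_bytes[]  = { 0x66, 0xB8, 0xFF, 0xFF, 0xFF, 0xFF };

    /* Skip the prefix and opcode bytes to reach the immediate. */
    assert(read_imm32_le(push_bytes + 2) == 0xFFFF8000u);
    assert(read_imm8_sign_extended(and_bytes[3]) == 0xFFFFFFFFu);
    assert(read_imm32_le(mov_bytes + 2) == 0xFFFFFFFFu);

    /* The intended mask clears the high word; the assembled one does not. */
    assert(and_mask(0xDEAD1234u, 0x0000FFFFu) == 0x00001234u);
    assert(and_mask(0xDEAD1234u, 0xFFFFFFFFu) == 0xDEAD1234u);
}
```

With an arbitrary register value of 0xDEAD1234, the intended mask yields 0x00001234, whereas the mask actually encoded leaves 0xDEAD1234 in place. The high-word garbage therefore survives the mis-assembled 'andl $0xFFFF, %eax'.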
The 'andl $65535, %eax' instruction is emitted every time a function returns a 16-bit value in a 32-bit register.

Output of 'as --version':
-------------------------
GNU assembler 2.9.5
Copyright 1997 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License. This program has absolutely no warranty.
This assembler was configured for a target of `i386-pc-msdosdjgpp'.

Patches applied to 'as' source:
-------------------------------
none

Machine in use:
---------------
AMD K5-PR133 at 100 MHz, Award BIOS 1996.

Operating system:
-----------------
Microsoft Windows 95 4.00.950a
IE 5 5.00.2314.1003

Compiler used to compile 'as':
------------------------------
gcc 2.9.5

Makefile used to invoke 'as':
-----------------------------
OBJECTS =as_bug.o
CC =gcc

as_bug.com: $(OBJECTS) comfile.djl
	ld -o as_bug.com -Tcomfile.djl $(OBJECTS) --oformat binary -M

.s.o:
	$(CC) -c -o $*.o $*.s

Contents of 'comfile.djl':
--------------------------
OUTPUT_FORMAT("coff-go32")
ENTRY(_start)
SECTIONS
{
  .text 0x100 : {
    *(.text)
    . = ALIGN(0x10);
  }
  .data . : {
    *(.data)
    . = ALIGN(0x10);
  }
  .bss . : {
    *(.bss)
    *(COMMON)
    . = ALIGN(0x10);
  }
}

Assembly input file 'as_bug.s' (minimal example that reproduces bug):
----------------------------------------------------------
.code16gcc
.text
.globl _start
_start:
        pushl   $0x8000         /* <-- 0x8000 is being sign extended into 0xFFFF8000 */
        popl    %eax            /* Restore stack pointer */
        andl    $0xFFFF, %eax   /* <-- incorrectly assembled into 'andl $0xFFFFFFFF, %eax' */
        movl    $0xFFFF, %eax   /* <-- incorrectly assembled into 'movl $0xFFFFFFFF, %eax' */
        nop                     /* Padding included to get 'objdump --disassemble' into synch */
        nop                     /* so that opcodes can be checked for equality */
        nop
        nop
        nop

/* The above code assembles exactly like the one below
   (use a hex dump or disassembler to check). */
        pushl   $0xFFFF8000
        popl    %eax
        andl    $0xFFFFFFFF, %eax
        movl    $0xFFFFFFFF, %eax
        nop
        nop
        nop
        nop
        nop

        movw    $0x4c00, %ax    /* Normal exit through INT 21h */
        int     $0x21

(All files are given as attachments to this report)

-----------------------------------------------------------------
This mail was sent through the FIEC Webmail server