Message-ID: <26b71767-a2a5-423a-96cd-8d01f9438527@SystematicSW.ab.ca>
Date: Sun, 13 Oct 2024 16:19:31 -0600
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: cygwin 3.5.4-1: signal handling destroys 'long double' values
To: cygwin AT cygwin DOT com
References: <922a6d7e-3ee1-9bb7-dfd7-b94c53a7b9d4 AT t-online DOT de>
 <20241008202057 DOT abd3dc5bb4df172c530e7655 AT nifty DOT ne DOT jp>
 <79171662-eede-4b14-aaf4-ebd98e6d98de AT SystematicSW DOT ab DOT ca>
 <99f51137-2889-4985-b4c6-a460e05befb8 AT SystematicSW DOT ab DOT ca>
 <20241013081407 DOT f07402abe9f721924f461dcc AT nifty DOT ne DOT jp>
 <51e4e5dd-57ef-4cbc-aff4-572eebb863e2 AT SystematicSW DOT ab DOT ca>
 <20241014050649 DOT ddaa7e0d14365a86d8523f1d AT nifty DOT ne DOT jp>
Organization: Systematic Software
In-Reply-To: <20241014050649.ddaa7e0d14365a86d8523f1d@nifty.ne.jp>
List-Id: General Cygwin discussions and problem reports <cygwin.cygwin.com>
From: Brian Inglis via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Brian Inglis <Brian DOT Inglis AT SystematicSW DOT ab DOT ca>

On 2024-10-13 14:06, Takashi Yano via Cygwin wrote:
> Hi Brian
>
> On Sun, 13 Oct 2024 10:41:58 -0600
> Brian Inglis wrote:
>> On 2024-10-12 17:14, Takashi Yano via Cygwin wrote:
>>> Hi Brian,
>>>
>>> On Tue, 8 Oct 2024 10:37:14 -0600
>>> Brian Inglis wrote:
>>>> On 2024-10-08 10:14, Brian Inglis via Cygwin wrote:
>>>>> On 2024-10-08 05:20, Takashi Yano via Cygwin wrote:
>>>>>> On Mon, 7 Oct 2024 15:11:52 +0200
>>>>>> Christian Franke wrote:
>>>>>>> $ gcc -o sigtest -O2 sigtest.c
>>>>>>>
>>>>>>> $ ./sigtest > out.txt
>>>>>>> (press ^C 42x :-)
>>>>>>>
>>>>>>> $ sort out.txt | uniq -c
>>>>>>>       3 x = 0x1.23456789p+0, y = -nan, d = -nan
>>>>>>>       6 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = -nan
>>>>>>>      33 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = 0x0p+0
>>>>>>>
>>>>>>> The problem also occurs if compiled without -O2, but less often. No
>>>>>>> problem occurs if compiled with -DWORKS which suggests that only 'long
>>>>>>> double' is affected.
>>>>>>
>>>>>> Thanks for the report. I looked into this problem and might find the
>>>>>> cause. It seems due to a bug of scripts/gendef. It generates signal
>>>>>> handler caller (sigfe.s) which stores/restores the registers.
>>>>>>
>>>>>> In sigdelayed, control word is stored/restored by fnstcw/fldcw instruction,
>>>>>> however, fninit instruction destroys some status registers in FPU (x87).
>>>>>>
>>>>>> I think we should use fnstenv/fldenv rather than fnstcw/fldcw and fninit.
>>>>>> However, I'm not familiar with x87 instructions, so I may overlook
>>>>>> something.
>>>>>>
>>>>>> Could anyone expert of x87 instructions and sigfe stuff give some
>>>>>> comments?
>>>>>
>>>>> AIUI x87 FP handling is outdated and mainly unused on current systems, as
>>>>> current systems do more and use more than the legacy x87 instructions and stack.
>>>>>
>>>>> See https://en.cppreference.com/w/c/numeric/fenv and related docs for more
>>>>> modern approaches.
>>>>>
>>>>> You would have to look into the AMD/Intel/IEEE docs for lower level details.
>>>>
>>>> This is basically what ISTR:
>>>>
>>>> https://beta.boost.org/doc/libs/1_82_0/libs/context/doc/html/context/rationale/x86_and_floating_point_env.html
>>>>
>>>> where legacy x87 and MMX registers are not used or preserved on x86_64/amd64, as
>>>> SSE... instructions and XMM registers are used.
>>>
>>> Thanks for the advice. I read through the web pages and related documents
>>> and made a patch which uses fxsave/fxrstor and xsave/xrstor to
>>> cygwin-patches AT cygwin DOT com mailing list.
>>> https://cygwin.com/pipermail/cygwin-patches/2024q4/012804.html
>>>
>>> Is this as you intended?
>>
>> That seems to be the preferred approach now, as long as you can correctly
>> determine adequate space for fxsave and xsave, given the varying feature sets,
>> register counts, and register sizes of recent processors:
>> sse/2/3/4.1/4.2/4a/5/ssse3 avx2/512 128/256/512 bits X/Y/ZMM registers.
>
> Thanks for checking.
>
> According to https://cdrdv2.intel.com/v1/dl/getContent/671110 ,
> fxsave uses 512 bytes fixed length memory to save the current
> state of the x87 FPU, MMX technology, XMM, and MXCSR registers.
>
> The patch allocates 0x238 bytes:
> 0x200 (512 bytes): fxsave area
> 0x008 (  8 bytes): for 16-byte alignment
> 0x010 ( 16 bytes): work area
> 0x020 ( 32 bytes): reserved for later processing

That is just the FPU state, MMX state, and 16 16B XMM registers, etc.
Please also note that 64-bit operands or a REX prefix must be used with
FXSAVE/FXRSTOR to save expanded state rather than legacy state.

> According to https://cdrdv2.intel.com/v1/dl/getContent/671436 ,
> cpuid instruction with eax=0dh and ecx=00h returns the maximum
> size required by xsave in ebx. So the patch allocates:
> ebx + 0x048 bytes.
> 0x018 ( 24 bytes): for 64-byte alignment
> 0x010 ( 16 bytes): work area
> 0x020 ( 32 bytes): reserved for later processing

The size in ebx covers only the features currently enabled in XCR0 (user
state). The size in ecx covers all the registers of all the features the
processor supports, which may be enabled and in use.
You need 2KB to store 32 X/Y/ZMM 64B registers, and new real and virtual
features may require more.
It may be conservative, but I would suggest allocating the space in ecx as
documented, just in case of future changes, and that can be reduced to 512 if
only fxsave is supported.
I suggest you check for fxsave in cpuid 1:0 edx:24, fall back to
fnsave/frstor if not, and keep everything aligned to 64 bytes for safety.

For my AMD A10-9700 /proc/cpuinfo shows:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
        pse36 clflush mmx *fxsr* sse sse2 ht syscall nx mmxext fxsr_opt
        pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl tsc_reliable
        nonstop_tsc cpuid aperfmperf pni pclmuldq monitor ssse3 fma cx16
        sse4_1 sse4_2 movbe popcnt aes *xsave* avx f16c rdrand lahf_lm
        cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse
        3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm
        perfctr_core perfctr_nb bpext ptsc mwaitx cpb hw_pstate fsgsbase
        bmi1 avx2 smep bmi2 xsaveopt arat npt lbrv svm_lock nrip_save
        tsc_scale vmcb_clean flushbyasid decode_assists pausefilter
        pfthreshold avic v_vmsave_vmload vgif overflow_recov

and /usr/bin/cpuid (package cpuid) shows (see my added !):

...
   feature information (1/edx):
      x87 FPU on chip = true
      VME: virtual-8086 mode enhancement = true
      DE: debugging extensions = true
      PSE: page size extensions = true
      TSC: time stamp counter = true
      RDMSR and WRMSR support = true
      PAE: physical address extensions = true
      MCE: machine check exception = true
      CMPXCHG8B inst. = true
      APIC on chip = true
      SYSENTER and SYSEXIT = true
      MTRR: memory type range registers = true
      PTE global bit = true
      MCA: machine check architecture = true
      CMOV: conditional move/compare instr = true
      PAT: page attribute table = true
      PSE-36: page size extension = true
      PSN: processor serial number = false
      CLFLUSH instruction = true
      DS: debug store = false
      ACPI: thermal monitor and clock ctrl = false
      MMX Technology = true
!     FXSAVE/FXRSTOR = true
      SSE extensions = true
      SSE2 extensions = true
      SS: self snoop = false
      hyper-threading / multi-core supported = true
      TM: therm. monitor = false
      IA64 = false
      PBE: pending break event = false
   feature information (1/ecx):
      PNI/SSE3: Prescott New Instructions = true
      PCLMULDQ instruction = true
      DTES64: 64-bit debug store = false
      MONITOR/MWAIT = true
      CPL-qualified debug store = false
      VMX: virtual machine extensions = false
      SMX: safer mode extensions = false
      Enhanced Intel SpeedStep Technology = false
      TM2: thermal monitor 2 = false
      SSSE3 extensions = true
      context ID: adaptive or shared L1 data = false
      SDBG: IA32_DEBUG_INTERFACE = false
      FMA instruction = true
      CMPXCHG16B instruction = true
      xTPR disable = false
      PDCM: perfmon and debug = false
      PCID: process context identifiers = false
      DCA: direct cache access = false
      SSE4.1 extensions = true
      SSE4.2 extensions = true
      x2APIC: extended xAPIC support = false
      MOVBE instruction = true
      POPCNT instruction = true
      time stamp counter deadline = false
      AES instruction = true
      XSAVE/XSTOR states = true
!     OS-enabled XSAVE/XSTOR = true
      AVX: advanced vector extensions = true
      F16C half-precision convert instruction = true
      RDRAND instruction = true
      hypervisor guest status = false
...
   XSAVE features (0xd/0):
      XCR0 valid bit field mask = 0x4000000000000007
      x87 state = true
      SSE state = true
      AVX state = true
      MPX BNDREGS = false
      MPX BNDCSR = false
      AVX-512 opmask = false
      AVX-512 ZMM_Hi256 = false
      AVX-512 Hi16_ZMM = false
      PKRU state = false
      XTILECFG state = false
      XTILEDATA state = false
      bytes required by fields in XCR0 = 0x00000340 (832)
!     bytes required by XSAVE/XRSTOR area = 0x000003c0 (960)
      XSAVEOPT instruction = true
      XSAVEC instruction = false
      XGETBV instruction = false
      XSAVES/XRSTORS instructions = false
      XFD: extended feature disable supported = false
      SAVE area size in bytes = 0x00000000 (0)
      IA32_XSS valid bit field mask = 0x0000000000000000
      PT state = false
      PASID state = false
      CET_U user state = false
      CET_S supervisor state = false
      HDC state = false
      UINTR state = false
      LBR state = false
      HWP state = false
   AVX/YMM features (0xd/2):
      AVX/YMM save state byte size = 0x00000100 (256)
      AVX/YMM save state byte offset = 0x00000240 (576)
      supported in IA32_XSS or XCR0 = XCR0 (user state)
      64-byte alignment in compacted XSAVE = false
      XFD faulting supported = false
   LWP features (0xd/0x3e):
      LWP save state byte size = 0x00000080 (128)
      LWP save state byte offset = 0x00000340 (832)
      supported in IA32_XSS or XCR0 = XCR0 (user state)
      64-byte alignment in compacted XSAVE = false
      XFD faulting supported = false
...

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
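
The checks suggested in the message above (test cpuid leaf 1 edx bit 24 for fxsave, take the xsave size from ecx of leaf 0dh sub-leaf 0, fall back to fnsave/frstor, keep everything 64-byte aligned) could be sketched in C roughly as follows. This is only an illustration, not the code from the Cygwin patch: it assumes the GCC/Clang <cpuid.h> helpers __get_cpuid and __get_cpuid_count, the names round_up and save_area_size are hypothetical, the XSAVE and OSXSAVE tests use leaf 1 ecx bits 26 and 27 (the two XSAVE/XSTOR lines in the cpuid output above), and the 108-byte fnsave area size is taken from the Intel manual for 32-bit protected mode.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrappers around the cpuid instruction */
#endif

/* Hypothetical helper: round n up to a multiple of align (a power of two). */
static unsigned round_up(unsigned n, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: bytes to reserve for the FP/SIMD save area,
   rounded up to 64 bytes. Returns 0 if cpuid cannot be queried
   (including non-x86 builds). */
static unsigned save_area_size(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    int has_fxsr    = (edx >> 24) & 1;  /* FXSAVE/FXRSTOR */
    int has_xsave   = (ecx >> 26) & 1;  /* XSAVE supported by the CPU */
    int has_osxsave = (ecx >> 27) & 1;  /* XSAVE enabled by the OS */

    /* Leaf 0dh, sub-leaf 0: ebx = size for features enabled in XCR0,
       ecx = size for all features the processor supports.
       Use ecx, the conservative choice suggested above. */
    if (has_xsave && has_osxsave &&
        __get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
        return round_up(ecx, 64);

    if (has_fxsr)
        return 512;                     /* fixed-size fxsave area */

    return round_up(108, 64);           /* fnsave/frstor fallback */
#else
    return 0;
#endif
}
```

On the AMD A10-9700 output quoted above, leaf 0dh reports 832 bytes for the fields in XCR0 and 960 bytes for the whole XSAVE/XRSTOR area, both already multiples of 64. A caller would still need to align the buffer itself to 64 bytes, for example with aligned_alloc or by over-allocating on the stack and rounding the pointer up.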