Date: Mon, 21 Oct 2024 06:01:30 +0900
From: Takashi Yano via Cygwin
Reply-To: Takashi Yano
To: cygwin AT cygwin DOT com
Subject: Re: cygwin 3.5.4-1: signal handling destroys 'long double' values
Message-Id: <20241021060130.95323037222f281344896fa5@nifty.ne.jp>
In-Reply-To: <20241014153822.fdd2dbf7cfc5396ebfa6b136@nifty.ne.jp>

Hi Brian,

On Mon, 14 Oct 2024 15:38:22 +0900
Takashi Yano wrote:
> On Mon, 14 Oct 2024 14:59:40 +0900
> Takashi Yano wrote:
> > On Mon, 14 Oct 2024 14:29:58 +0900
> > Takashi Yano wrote:
> > > Hi Brian,
> > >
> > > Thanks for the detailed explanation.
> > >
> > > On Sun, 13 Oct 2024 16:19:31 -0600
> > > Brian Inglis wrote:
> > > > On 2024-10-13 14:06, Takashi Yano via Cygwin wrote:
> > > > > Hi Brian
> > > > >
> > > > > On Sun, 13 Oct 2024 10:41:58 -0600
> > > > > Brian Inglis wrote:
> > > > >> On 2024-10-12 17:14, Takashi Yano via Cygwin wrote:
> > > > >>> Hi Brian,
> > > > >>>
> > > > >>> On Tue, 8 Oct 2024 10:37:14 -0600
> > > > >>> Brian Inglis wrote:
> > > > >>>> On 2024-10-08 10:14, Brian Inglis via Cygwin wrote:
> > > > >>>>> On 2024-10-08 05:20, Takashi Yano via Cygwin wrote:
> > > > >>>>>> On Mon, 7 Oct 2024 15:11:52 +0200
> > > > >>>>>> Christian Franke wrote:
> > > > >>>>>>> $ gcc -o sigtest -O2 sigtest.c
> > > > >>>>>>>
> > > > >>>>>>> $ ./sigtest > out.txt
> > > > >>>>>>> (press ^C 42x :-)
> > > > >>>>>>>
> > > > >>>>>>> $ sort out.txt | uniq -c
> > > > >>>>>>>         3 x = 0x1.23456789p+0, y = -nan, d = -nan
> > > > >>>>>>>         6 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = -nan
> > > > >>>>>>>        33 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = 0x0p+0
> > > > >>>>>>>
> > > > >>>>>>> The problem also occurs if compiled without -O2, but less often. No
> > > > >>>>>>> problem occurs if compiled with -DWORKS which suggests that only 'long
> > > > >>>>>>> double' is affected.
> > > > >>>>>>
> > > > >>>>>> Thanks for the report. I looked into this problem and may have found the
> > > > >>>>>> cause. It seems to be due to a bug in scripts/gendef, which generates the
> > > > >>>>>> signal handler caller (sigfe.s) that stores/restores the registers.
> > > > >>>>>>
> > > > >>>>>> In sigdelayed, the control word is stored/restored by the fnstcw/fldcw
> > > > >>>>>> instructions; however, the fninit instruction destroys some status
> > > > >>>>>> registers in the FPU (x87).
> > > > >>>>>>
> > > > >>>>>> I think we should use fnstenv/fldenv rather than fnstcw/fldcw and fninit.
> > > > >>>>>> However, I'm not familiar with x87 instructions, so I may be overlooking
> > > > >>>>>> something.
> > > > >>>>>>
> > > > >>>>>> Could anyone who is an expert on x87 instructions and the sigfe stuff
> > > > >>>>>> give some comments?
> > > > >>>>>
> > > > >>>>> AIUI x87 FP handling is outdated and mainly unused on current systems, as
> > > > >>>>> current systems do more and use more than the legacy x87 instructions and stack.
> > > > >>>>>
> > > > >>>>> See https://en.cppreference.com/w/c/numeric/fenv and related docs for more
> > > > >>>>> modern approaches.
> > > > >>>>>
> > > > >>>>> You would have to look into the AMD/Intel/IEEE docs for lower level details.
> > > > >>>>
> > > > >>>> This is basically what ISTR:
> > > > >>>>
> > > > >>>> https://beta.boost.org/doc/libs/1_82_0/libs/context/doc/html/context/rationale/x86_and_floating_point_env.html
> > > > >>>>
> > > > >>>> where legacy x87 and MMX registers are not used or preserved on x86_64/amd64, as
> > > > >>>> SSE... instructions and XMM registers are used.
> > > > >>>
> > > > >>> Thanks for the advice. I read through the web pages and related documents
> > > > >>> and posted a patch which uses fxsave/fxrstor and xsave/xrstor to the
> > > > >>> cygwin-patches AT cygwin DOT com mailing list.
> > > > >>> https://cygwin.com/pipermail/cygwin-patches/2024q4/012804.html
> > > > >>>
> > > > >>> Is this what you intended?
> > > > >>
> > > > >> That seems to be the preferred approach now, as long as you can correctly
> > > > >> determine adequate space for fxsave and xsave, given the varying feature sets,
> > > > >> register counts, and register sizes of recent processors:
> > > > >> sse/2/3/4.1/4.2/4a/5/ssse3 avx2/512 128/256/512 bits X/Y/ZMM registers.
> > > > >
> > > > > Thanks for checking.
> > > > >
> > > > > According to https://cdrdv2.intel.com/v1/dl/getContent/671110 ,
> > > > > fxsave uses a fixed-length 512-byte memory area to save the current
> > > > > state of the x87 FPU, MMX technology, XMM, and MXCSR registers.
> > > > >
> > > > > The patch allocates 0x238 bytes:
> > > > > 0x200 (512 bytes): fxsave area
> > > > > 0x008 (  8 bytes): for 16-byte alignment
> > > > > 0x010 ( 16 bytes): work area
> > > > > 0x020 ( 32 bytes): reserved for later processing
> > > >
> > > > That is just the FPU state, MMX state, and 16 16B XMM registers, etc.
> > > > Please also note that 64 bit operands or REX prefix must be used with
> > > > FXSAVE/FXRSTOR to save expanded state rather than legacy state.
> > >
> > > Fixed.
> > >
> > > > > According to https://cdrdv2.intel.com/v1/dl/getContent/671436 ,
> > > > > the cpuid instruction with eax=0dh and ecx=00h returns the maximum
> > > > > size required by xsave in ebx. So the patch allocates:
> > > > > ebx + 0x048 bytes.
> > > > > 0x018 ( 24 bytes): for 64-byte alignment
> > > > > 0x010 ( 16 bytes): work area
> > > > > 0x020 ( 32 bytes): reserved for later processing
> > > >
> > > > That is for features currently enabled in XCR0 user state, not all the values of
> > > > all possible registers, for all possible features, in ecx, which are supported,
> > > > may be enabled, and in use.
> > > > You need 2KB to store 32 X/Y/ZMM 64B registers, and new real and virtual
> > > > features may require more.
> > >
> > > Do you mean we should use the ecx value rather than the ebx value returned
> > > by cpuid (eax=0dh,ecx=0)? I did not understand the difference between the
> > > values of ebx and ecx returned by cpuid.
> > >
> > > Fixed.
> >
> > On second thought, it is not necessary to use the ecx value
> > because the patch uses the EDX:EAX value of cpuid(0d,0) for xsave,
> > is it? This means that only features enabled in XCR0 are saved.
> > The features not enabled in XCR0 cannot be used in user mode, so
> > we do not need to store the states for them.
>
> No, I was wrong. The EDX:EAX value of cpuid(0d,0) means the features
> that CAN BE enabled in XCR0.
>
> Please see the v3 patch. Any suggestions?

-- 
Takashi Yano