Mail Archives: cygwin/2024/10/14/02:00:12

Date: Mon, 14 Oct 2024 14:59:40 +0900
To: cygwin AT cygwin DOT com
Subject: Re: cygwin 3.5.4-1: signal handling destroys 'long double' values
Message-Id: <20241014145940.c8a84f4360cd3b58837493d1@nifty.ne.jp>
In-Reply-To: <20241014142958.ecf5faeb06a11a8c7a5301de@nifty.ne.jp>
References: <922a6d7e-3ee1-9bb7-dfd7-b94c53a7b9d4 AT t-online DOT de>
<20241008202057 DOT abd3dc5bb4df172c530e7655 AT nifty DOT ne DOT jp>
<79171662-eede-4b14-aaf4-ebd98e6d98de AT SystematicSW DOT ab DOT ca>
<99f51137-2889-4985-b4c6-a460e05befb8 AT SystematicSW DOT ab DOT ca>
<20241013081407 DOT f07402abe9f721924f461dcc AT nifty DOT ne DOT jp>
<51e4e5dd-57ef-4cbc-aff4-572eebb863e2 AT SystematicSW DOT ab DOT ca>
<20241014050649 DOT ddaa7e0d14365a86d8523f1d AT nifty DOT ne DOT jp>
<26b71767-a2a5-423a-96cd-8d01f9438527 AT SystematicSW DOT ab DOT ca>
<20241014142958 DOT ecf5faeb06a11a8c7a5301de AT nifty DOT ne DOT jp>
From: Takashi Yano via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>

On Mon, 14 Oct 2024 14:29:58 +0900
Takashi Yano wrote:
> Hi Brian,
> 
> Thanks for the detailed explanation.
> 
> On Sun, 13 Oct 2024 16:19:31 -0600
> Brian Inglis wrote:
> > On 2024-10-13 14:06, Takashi Yano via Cygwin wrote:
> > > Hi Brian
> > > 
> > > On Sun, 13 Oct 2024 10:41:58 -0600
> > > Brian Inglis wrote:
> > >> On 2024-10-12 17:14, Takashi Yano via Cygwin wrote:
> > >>> Hi Brian,
> > >>>
> > >>> On Tue, 8 Oct 2024 10:37:14 -0600
> > >>> Brian Inglis wrote:
> > >>>> On 2024-10-08 10:14, Brian Inglis via Cygwin wrote:
> > >>>>> On 2024-10-08 05:20, Takashi Yano via Cygwin wrote:
> > >>>>>> On Mon, 7 Oct 2024 15:11:52 +0200
> > >>>>>> Christian Franke wrote:
> > >>>>>>> $ gcc -o sigtest -O2 sigtest.c
> > >>>>>>>
> > >>>>>>> $ ./sigtest > out.txt
> > >>>>>>> (press ^C 42x :-)
> > >>>>>>>
> > >>>>>>> $ sort out.txt | uniq -c
> > >>>>>>>           3 x = 0x1.23456789p+0, y = -nan, d = -nan
> > >>>>>>>           6 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = -nan
> > >>>>>>>          33 x = 0x1.23456789p+0, y = 0x1.23456789p+0, d = 0x0p+0
> > >>>>>>>
> > >>>>>>> The problem also occurs if compiled without -O2, but less often. No
> > >>>>>>> problem occurs if compiled with -DWORKS which suggests that only 'long
> > >>>>>>> double' is affected.
> > >>>>>>
> > >>>>>> Thanks for the report. I looked into this problem and may have found the
> > >>>>>> cause. It seems to be due to a bug in scripts/gendef, which generates the
> > >>>>>> signal handler caller (sigfe.s) that stores/restores the registers.
> > >>>>>>
> > >>>>>> In sigdelayed, the control word is stored/restored by the fnstcw/fldcw
> > >>>>>> instructions; however, the fninit instruction destroys some status
> > >>>>>> registers in the FPU (x87).
> > >>>>>>
> > >>>>>> I think we should use fnstenv/fldenv rather than fnstcw/fldcw and fninit.
> > >>>>>> However, I'm not familiar with x87 instructions, so I may be overlooking
> > >>>>>> something.
> > >>>>>>
> > >>>>>> Could anyone who is an expert in x87 instructions and the sigfe code give
> > >>>>>> some comments?
> > >>>>>
> > >>>>> AIUI x87 FP handling is outdated and mainly unused on current systems, as
> > >>>>> current systems do more and use more than the legacy x87 instructions and stack.
> > >>>>>
> > >>>>> See https://en.cppreference.com/w/c/numeric/fenv and related docs for more
> > >>>>> modern approaches.
> > >>>>>
> > >>>>> You would have to look into the AMD/Intel/IEEE docs for lower level details.
> > >>>>
> > >>>> This is basically what ISTR:
> > >>>>
> > >>>> https://beta.boost.org/doc/libs/1_82_0/libs/context/doc/html/context/rationale/x86_and_floating_point_env.html
> > >>>>
> > >>>> where legacy x87 and MMX registers are not used or preserved on x86_64/amd64, as
> > >>>> SSE... instructions and XMM registers are used.
> > >>>
> > >>> Thanks for the advice. I read through the web pages and related documents
> > >>> and posted a patch that uses fxsave/fxrstor and xsave/xrstor to the
> > >>> cygwin-patches AT cygwin DOT com mailing list.
> > >>> https://cygwin.com/pipermail/cygwin-patches/2024q4/012804.html
> > >>>
> > >>> Is this as you intended?
> > >>
> > >> That seems to be the preferred approach now, as long as you can correctly
> > >> determine adequate space for fxsave and xsave, given the varying feature sets,
> > >> register counts, and register sizes of recent processors:
> > >> sse/2/3/4.1/4.2/4a/5/ssse3 avx2/512 128/256/512 bits X/Y/ZMM registers.
> > > 
> > > Thanks for checking.
> > > 
> > > According to https://cdrdv2.intel.com/v1/dl/getContent/671110 ,
> > > fxsave uses a fixed-length 512-byte memory area to save the current
> > > state of the x87 FPU, MMX technology, XMM, and MXCSR registers.
> > > 
> > > The patch allocates 0x238 bytes:
> > >   0x200 (512 bytes): fxsave area
> > >   0x008 (  8 bytes): for 16-byte alignment
> > >   0x010 ( 16 bytes): work area
> > >   0x020 ( 32 bytes): reserved for later processing
> > 
> > That is just the FPU state, MMX state, and 16 16B XMM registers, etc.
> > Please also note that 64 bit operands or REX prefix must be used with 
> > FXSAVE/FXRSTOR to save expanded state rather than legacy state.
> 
> Fixed.
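
The 0x238-byte figure quoted above can be checked with a short C sketch. This is not code from the patch: the enum names and the helpers `align_down` and `fxsave_aligned` are hypothetical, and only the four sizes come from the thread. The sketch also assumes that the stack pointer is 8 bytes off a 16-byte boundary when the frame is allocated, which would explain the 8-byte padding; the thread does not state this.

```c
#include <stdint.h>

/* Sizes quoted in the thread for the fxsave-based frame.
   The enumerator names are hypothetical. */
enum {
    FXSAVE_AREA   = 0x200, /* 512-byte fxsave image */
    ALIGN_PAD     = 0x008, /* padding for 16-byte alignment */
    WORK_AREA     = 0x010, /* work area */
    RESERVED_AREA = 0x020, /* reserved for later processing */
    FXSAVE_FRAME  = FXSAVE_AREA + ALIGN_PAD + WORK_AREA + RESERVED_AREA
};

/* Compile-time check that the parts add up to the 0x238 bytes in the thread. */
_Static_assert(FXSAVE_FRAME == 0x238, "frame size should be 0x238 bytes");

/* Hypothetical helper: round an address down to a multiple of `align`
   (which must be a power of two), as a downward-growing stack would. */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}

/* Hypothetical helper: fxsave requires a 16-byte aligned memory operand. */
int fxsave_aligned(uintptr_t addr)
{
    return (addr % 16) == 0;
}
```

Under the stated assumption, subtracting `FXSAVE_FRAME` from a stack pointer such as 0x10008 gives 0xfdd0, which `fxsave_aligned` accepts.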
> 
> > > According to https://cdrdv2.intel.com/v1/dl/getContent/671436 ,
> > > cpuid instruction with eax=0dh and ecx=00h returns the maximum
> > > size required by xsave in ebx. So the patch allocates:
> > > ebx + 0x048 bytes.
> > >   0x018 ( 24 bytes): for 64-byte alignment
> > >   0x010 ( 16 bytes): work area
> > >   0x020 ( 32 bytes): reserved for later processing
> > 
> > That is for features currently enabled in XCR0 user state, not all the values of 
> > all possible registers, for all possible features, in ecx, which are supported, 
> > may be enabled, and in use.
> > You need 2KB to store 32 X/Y/ZMM 64B registers, and new real and virtual 
> > features may require more.
> 
> Do you mean we should use the ecx value rather than the ebx value returned
> by cpuid (eax=0dh,ecx=0)? I did not understand the difference between the
> values of ebx and ecx returned by cpuid.
> 
> Fixed.

On second thought, it is not necessary to use the ecx value, is it?
The patch passes the EDX:EAX value of cpuid(0dh,0) to xsave as the
save mask, and xsave saves only the components that are set in both
that mask and XCR0. This means that only features enabled in XCR0 are
saved, so the size returned in ebx is sufficient. Features not enabled
in XCR0 cannot be used in user mode, so we do not need to store their
state.
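
To compare the values in question on a given machine, here is a minimal C sketch of the cpuid(0dh,0) query. It is not code from the patch: `struct xsave_info`, `query_xsave_info`, and `xsave_frame_size` are hypothetical names, and the sketch assumes the `<cpuid.h>` helper header shipped with GCC and Clang. The 0x48 constant is the padding, work area, and reserved space quoted earlier in the thread.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper header for the cpuid instruction */
#endif

/* Hypothetical container for the values of cpuid(eax=0dh, ecx=0). */
struct xsave_info {
    uint64_t supported_mask; /* EDX:EAX: XCR0 bits the processor supports */
    uint32_t enabled_size;   /* EBX: bytes needed for features enabled in XCR0 */
    uint32_t max_size;       /* ECX: bytes needed for all supported features */
};

/* Returns 1 and fills *out when the OS has enabled xsave, 0 otherwise. */
int query_xsave_info(struct xsave_info *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* cpuid(1): ECX bit 27 (OSXSAVE) reports that the OS enabled xsave. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return 0;

    /* cpuid(0dh, 0): supported-feature mask and sizes of the xsave area. */
    if (!__get_cpuid_count(0x0d, 0, &eax, &ebx, &ecx, &edx))
        return 0;

    out->supported_mask = ((uint64_t)edx << 32) | eax;
    out->enabled_size = ebx;
    out->max_size = ecx;
    return 1;
#else
    (void)out;
    return 0;  /* not an x86 build: cpuid is unavailable */
#endif
}

/* Frame size as described in the thread: ebx plus 0x48 bytes for
   alignment padding, work area, and reserved space. */
uint32_t xsave_frame_size(const struct xsave_info *info)
{
    return info->enabled_size + 0x48u;
}
```

Printing `enabled_size` and `max_size` side by side shows how much larger the ecx-based allocation would be. On a machine where every supported feature is enabled in XCR0, the two values are equal.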

-- 
Takashi Yano <takashi DOT yano AT nifty DOT ne DOT jp>
