X-Recipient: | archive-cygwin AT delorie DOT com |
Mailing-List: | contact cygwin-help AT cygwin DOT com; run by ezmlm |
List-Id: | <cygwin.cygwin.com> |
List-Subscribe: | <mailto:cygwin-subscribe AT cygwin DOT com> |
List-Archive: | <http://sourceware.org/ml/cygwin/> |
List-Post: | <mailto:cygwin AT cygwin DOT com> |
List-Help: | <mailto:cygwin-help AT cygwin DOT com>, <http://sourceware.org/ml/#faqs> |
Sender: | cygwin-owner AT cygwin DOT com |
Mail-Followup-To: | cygwin AT cygwin DOT com |
Delivered-To: | mailing list cygwin AT cygwin DOT com |
Authentication-Results: | sourceware.org; auth=none |
X-Virus-Found: | No |
X-Spam-SWARE-Status: | No, score=-1.2 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2 |
X-HELO: | homiemail-a36.g.dreamhost.com |
From: | Bengt Larsson <lists DOT cygwin4 AT bengtl DOT net> |
To: | cygwin AT cygwin DOT com |
Subject: | Performance problem with gcc 4.9.2-3 on 64 bit |
Date: | Fri, 27 Feb 2015 17:49:17 +0100 |
Reply-To: | cygwin AT cygwin DOT com |
Message-ID: | <cm61fadkltbqshdmtiil4nb50b0gf1n3qg@4ax.com> |
User-Agent: | ForteAgent/7.20.32.1218 |
MIME-Version: | 1.0 |
X-IsSubscribed: | yes |
Below are two benchmarks that explore maximum floating-point performance.
loopm6 is double-precision floating point and loopm6fp is parallel
single-precision. They are manually unrolled multiply-add loops.

I used to reach 2.8 and 11 GFlops on these. Now I only get 2 and 6. If
you explore the inner loop with gcc -O2 -S you can see that it seems to
use few registers.

If you run them, there is a parameter expected. I use 30000 - 50000.

gcc 4.9.2-3 on 64-bit. I use gcc -O2.

[Attachment: loopm6.c]

#include <stdlib.h>
#include <stdio.h>
#include "timers.h"

static char *program;

static int
usage() {
	fprintf(stderr,"Usage: %s [loops]\n",program);
	exit(1);
}

int
main(int argc, char *argv[])
{
	int i,j,num;
	double sum1,sum2,sum3,sum4,sum5,sum6,sum7,sum8;
	double time,mflop;

	program = argv[0];

	sum1 = 0.0;
	sum2 = 1.0;
	sum3 = 2.0;
	sum4 = 3.0;
	sum5 = 4.0;
	sum6 = 5.0;
	sum7 = 6.0;
	sum8 = 7.0;

	if (argc == 1) {
		while(1);
	} else if (argc == 2) {
		num=atoi(argv[1]);

		if (num <= 0) usage();

		time = CPUTIME();

		for (i=0; i<num; i++) {
			for(j=0; j<num; j++) {
				sum1 = sum1*0.5 + 10.0;
				sum2 = sum2*0.5 + 10.0;
				sum3 = sum3*0.5 + 10.0;
				sum4 = sum4*0.5 + 10.0;
				sum5 = sum5*0.5 + 10.0;
				sum6 = sum6*0.5 + 10.0;
				sum7 = sum7*0.5 + 10.0;
				sum8 = sum8*0.5 + 10.0;
			}
		}

		time = CPUTIME() - time;
	} else {
		usage();
	}

	mflop = (double)num*(double)num*16*1e-6;

	printf("%e %e %e %e %e %e %e %e\n",sum1,sum2,sum3,sum4,sum5,sum6,sum7,sum8);
	printf("time = %.3f\n",time);
	printf("%.2f MFlops\n",mflop/time);

	return 0;
}

[Attachment: loopm6fp.c]

#include <stdlib.h>
#include <stdio.h>
#include "timers.h"

typedef float v4sf __attribute__((vector_size(16)));
typedef union {
		v4sf v;
		float f[4];
	} v4vector;

static char *program;

static int
usage() {
	fprintf(stderr,"Usage: %s [loops]\n",program);
	exit(1);
}

int
main(int argc, char *argv[])
{
	int i,j,num;
	v4vector sum1,sum2,sum3,sum4,sum5,sum6,sum7, sum8, half,ten;
	double time,mflop;

	program = argv[0];

	sum1.v = (v4sf) {0.0f,0.0f,0.0f,0.0f};
	sum2.v = (v4sf) {1.0f,1.0f,1.0f,1.0f};
	sum3.v = (v4sf) {2.0f,2.0f,2.0f,2.0f};
	sum4.v = (v4sf) {3.0f,3.0f,3.0f,3.0f};
	sum5.v = (v4sf) {4.0f,4.0f,4.0f,4.0f};
	sum6.v = (v4sf) {5.0f,5.0f,5.0f,5.0f};
	sum7.v = (v4sf) {6.0f,6.0f,6.0f,6.0f};
	sum8.v = (v4sf) {7.0f,7.0f,7.0f,7.0f};
	half.v = (v4sf) {0.5f,0.5f,0.5f,0.5f};
	ten.v  = (v4sf) {10.0f,10.0f,10.0f,10.0f};

	if (argc == 1) {
		while(1);
	} else if (argc == 2) {
		num=atoi(argv[1]);

		if (num <= 0) usage();

		time = CPUTIME();

		for (i=0; i<num; i++) {
			for(j=0; j<num; j++) {
				sum1.v = sum1.v*half.v + ten.v;
				sum2.v = sum2.v*half.v + ten.v;
				sum3.v = sum3.v*half.v + ten.v;
				sum4.v = sum4.v*half.v + ten.v;
				sum5.v = sum5.v*half.v + ten.v;
				sum6.v = sum6.v*half.v + ten.v;
				sum7.v = sum7.v*half.v + ten.v;
				sum8.v = sum8.v*half.v + ten.v;
			}
		}

		time = CPUTIME() - time;
	} else {
		usage();
	}

	mflop = (double)num*(double)num*16*4*1e-6;

	printf("%e %e %e %e %e %e %e %e\n",
		sum1.f[0],sum2.f[0],sum3.f[0],sum4.f[0],sum5.f[0],sum6.f[0],sum7.f[0],sum8.f[0]);
	printf("time = %.3f\n",time);
	printf("%.2f MFlops\n",mflop/time);

	return 0;
}

[Attachment: timers.h]

#include <sys/time.h>
#include <sys/resource.h>

#ifndef CPUTIME
#define CPUTIME realtime
#endif

static double
usertime()
{
	struct rusage rusage;

	getrusage(RUSAGE_SELF,&rusage);

	return (double)(rusage.ru_utime.tv_sec) + (double)(rusage.ru_utime.tv_usec) * 1.0e-06;
}

static double
systemtime()
{
	struct rusage rusage;

	getrusage(RUSAGE_SELF,&rusage);

	return (double)(rusage.ru_stime.tv_sec) + (double)(rusage.ru_stime.tv_usec) * 1.0e-06;
}

static double
sysusertime()
{
	struct rusage rusage;

	getrusage(RUSAGE_SELF,&rusage);

	return (double)(rusage.ru_utime.tv_sec) + (double)(rusage.ru_stime.tv_sec) +
	       ((double)(rusage.ru_utime.tv_usec) + (double)(rusage.ru_stime.tv_usec)) * 1.0e-06;
}

static double
realtime()
{
	struct timeval t;

	gettimeofday(&t,NULL);
	return (double) t.tv_sec + (double) t.tv_usec * 1e-6;
}

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple