From: MITSUNARI Shigeo
To: cygwin AT cygwin DOT com
Date: Tue, 26 Feb 2013 15:35:40 +0900 (JST)
Subject: wrong performance of malloc/free under multi-threading
Message-ID: <2265626.1033831361860540947.herumi@nifty.com>

Hi.

I found that malloc/free performs very poorly under multi-threading on Cygwin.
The following test program reproduces the problem. It repeatedly calls malloc
and free from multiple threads. I measured the timing on Cygwin and Linux.

timing (sec) |      threadNum
             |    1     |    2
-------------+----------+----------
Linux        |   1.45   |   0.69
-------------+----------+----------
Cygwin       |   2.059  |  53.165
-------------+----------+----------

On Linux the time scales well (two threads take about half as long as one),
but on Cygwin two threads are roughly 26 times slower than one.
Is this intended behavior, or am I using pthreads incorrectly?

env : Core i7-2600 + Windows 7 Ultimate SP1 (64bit) + 8GiB memory

% gcc malloc-free-pthread.c -lpthread -Wall -Wextra -ansi -pedantic -O2 -m32

// results for Linux
% time ./a.out 1
threadNum=1, n=120000
begin=0, end=120000
end
1.432u 0.016s 0:01.45 99.3% 0+0k 0+0io 0pf+0w

% time ./a.out 2
threadNum=2, n=120000
begin=0, end=60000
begin=60000, end=120000
end
1.384u 0.000s 0:00.69 200.0% 0+0k 0+0io 0pf+0w

// results for Cygwin
// Anti-virus software was stopped during the measurement.
$ time ./a.exe 1
threadNum=1, n=120000
begin=0, end=120000
end

real    0m2.059s
user    0m2.059s
sys     0m0.000s

$ time ./a.exe 2
threadNum=2, n=120000
begin=0, end=60000
begin=60000, end=120000
end

real    0m53.165s
user    0m11.949s
sys     0m19.812s

// Linux
% uname -a
Linux i3 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
% gcc -v
4.7

// Cygwin
% uname -mrsv
CYGWIN_NT-6.1-WOW64 1.7.17(0.262/5/3) 2012-10-19 14:39 i686
% gcc -dumpversion
4.5.3

// test code
% cat malloc-free-pthread.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>

void task(int idx)
{
    const int size = 100;
    int i;
    for (i = 1; i < size; i++) {
        char *p = (char*)malloc(i);
        memset(p, idx, i); /* ensure malloc/free are really called under optimization */
        free(p);
    }
}

typedef struct {
    int begin;
    int end;
} Range;

void* run(void *arg)
{
    const Range *range = (const Range*)arg;
    int begin = range->begin;
    int end = range->end;
    printf("begin=%d, end=%d\n", begin, end);
    while (begin != end) {
        task(begin);
        begin++;
    }
    return 0;
}

#define MAX_THREAD_NUM 4

int main(int argc, char *argv[])
{
    const int threadNum = argc == 1 ?
        1 : atoi(argv[1]);
    const int n = 1 * 2 * 3 * 4 * 5000;
    if (threadNum < 0 || threadNum > MAX_THREAD_NUM) {
        printf("threadNum = 0, 1, 2, 3, 4\n");
        return 1;
    }
    printf("threadNum=%d, n=%d\n", threadNum, n);
    if (threadNum == 0) {
        Range range;
        puts("no thread\n");
        range.begin = 0;
        range.end = n;
        run(&range);
    } else {
        const int dn = n / threadNum;
        Range range[MAX_THREAD_NUM];
        pthread_t pt[MAX_THREAD_NUM];
        int i;
        for (i = 0; i < threadNum; i++) {
            range[i].begin = i * dn;
            range[i].end = (i + 1) * dn;
            if (pthread_create(&pt[i], NULL, run, &range[i]) != 0) {
                printf("ERR create %d\n", i);
                return 1;
            }
        }
        for (i = 0; i < threadNum; i++) {
            if (pthread_join(pt[i], NULL) != 0) {
                printf("ERR join %d\n", i);
                return 1;
            }
        }
    }
    puts("end");
    return 0;
}
---
Yours,
 MITSUNARI Shigeo
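
As a cross-check on the figures reported by `time`, the elapsed wall-clock time could also be measured inside the program, around thread creation and join only. The following is a minimal sketch under stated assumptions, not part of the original test: the names `Job`, `worker`, `elapsed_sec` and `time_alloc_loop` are hypothetical, and it assumes `clock_gettime` with `CLOCK_MONOTONIC` is available on the platform. The allocation pattern mirrors `task()` in the test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_THREAD_NUM 4

/* Hypothetical helper type: half-open range [begin, end) of task indices. */
typedef struct {
    int begin;
    int end;
} Job;

/* Same malloc/memset/free pattern as task() in the test program. */
static void *worker(void *arg)
{
    const Job *job = (const Job *)arg;
    int idx, i;
    for (idx = job->begin; idx != job->end; idx++) {
        for (i = 1; i < 100; i++) {
            char *p = (char *)malloc(i);
            if (p == NULL) continue;   /* skip on allocation failure */
            memset(p, idx, i);
            free(p);
        }
    }
    return NULL;
}

/* Difference b - a in seconds. */
static double elapsed_sec(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec)
         + (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Runs n tasks split over threadNum threads and returns the elapsed
   wall-clock time in seconds, or -1.0 on invalid arguments or thread errors. */
double time_alloc_loop(int threadNum, int n)
{
    Job job[MAX_THREAD_NUM];
    pthread_t pt[MAX_THREAD_NUM];
    struct timespec t0, t1;
    int i, dn, created = 0, ok = 1;

    if (threadNum < 1 || threadNum > MAX_THREAD_NUM || n < 0) return -1.0;
    dn = n / threadNum;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < threadNum; i++) {
        job[i].begin = i * dn;
        /* the last thread also takes any remainder of n / threadNum */
        job[i].end = (i == threadNum - 1) ? n : (i + 1) * dn;
        if (pthread_create(&pt[i], NULL, worker, &job[i]) != 0) {
            ok = 0;
            break;
        }
        created++;
    }
    /* join only the threads that were actually created */
    for (i = 0; i < created; i++) {
        if (pthread_join(pt[i], NULL) != 0) ok = 0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ok ? elapsed_sec(&t0, &t1) : -1.0;
}
```

Calling `time_alloc_loop(1, 120000)` and `time_alloc_loop(2, 120000)` from a small `main` and printing both results would give numbers comparable to the `real` column above, without process startup and shell overhead. A monotonic clock is used here so that adjustments to the system time do not distort the interval.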