| Date: | Wed, 25 Feb 2026 21:32:13 +1100 |
| To: | cygwin AT cygwin DOT com |
| Subject: | Re: Memmove causing program crashes, giving SIGTRAP in GDB(?) |
| Message-ID: | <aZ7PrbisVR1R4A7v@dimstar.local.net> |
| Mail-Followup-To: | cygwin AT cygwin DOT com |
| References: | <547312365 DOT 1464244 DOT 1771958282029 AT connect DOT xfinity DOT com> |
| MIME-Version: | 1.0 |
| In-Reply-To: | <547312365.1464244.1771958282029@connect.xfinity.com> |
| From: | Duncan Roe via Cygwin <cygwin AT cygwin DOT com> |
| Reply-To: | Duncan Roe <duncan_roe AT optusnet DOT com DOT au> |
Hi Kennon,
On Tue, Feb 24, 2026 at 10:38:01AM -0800, cygwin wrote:
> Hello,
>
> I am having a problem that is apparently related to memmove and am looking for some advice on how to investigate further. This winter I have been working to simplify the GLZA source code and make it more readable. GLZA is an advanced open-source straight-line grammar compressor first released in 2015. Among these changes was replacing some rather bloated code with memmove and memset in various locations. The program started crashing occasionally, and after extensively reviewing the changes I was unable to find a cause for these crashes. So I installed gdb to try to find out what was going on and was apparently able to find the cause of the problem. As a new gdb user, I am not very comfortable trusting what gdb is showing, but it is pointing directly to one of the code changes I made. I backed out this code change and the program has not crashed after 3 days of nearly continuous testing.
>
> So here is what gdb reports when backtrace is run immediately after reporting a "SIGTRAP":
>
> (gdb) bt full
> #0 0x00007ff9dd8aa98b in KERNELBASE!DebugBreak () from /cygdrive/c/Windows/system32/KERNELBASE.dll
> No symbol table info available.
> #1 0x00007ff9ca3b6417 in cygwin1!.assert () from /cygdrive/c/Windows/cygwin1.dll
> No symbol table info available.
> #2 0x00007ff9ca3cfb18 in secure_getenv () from /cygdrive/c/Windows/cygwin1.dll
> No symbol table info available.
> #3 0x00007ff9e03dd82d in ntdll!.chkstk () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #4 0x00007ff9e038916b in ntdll!RtlRaiseException () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #5 0x00007ff9e03dc9ee in ntdll!KiUserExceptionDispatcher () from /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> No symbol table info available.
> #6 0x00007ff9ca3b12a9 in memmove () from /cygdrive/c/Windows/cygwin1.dll
> No symbol table info available.
> #7 0x0000000100409a7c in rank_scores_thread (arg=0x6ffece890010) at GLZAcompress.c:904
> new_score_rank = 2633
> new_score_lmi2 = 183964750
> new_score_pmi2 = 183964725
> rank = 4380
> max_rank = 2633
> num_symbols = 25
> new_score_lmi = 92079851
> new_score_pmi = 92079826
> thread_data_ptr = 0x6ffece890010
> max_scores = 4883
> candidates_index = 0xa00034470
> score_index = 4380
> node_score_num_symbols = 7
> num_candidates = 4381
> node_ptrs_num = 12224
> local_write_index = 12225
> rank_scores_buffer = 0x6ffece890020
> candidates = 0x6ffece990020
> score = 47.6283531
> #8 0x00007ff9ca412eec in cygwin1!.getreent () from /cygdrive/c/Windows/cygwin1.dll
> No symbol table info available.
> #9 0x00007ff9ca3b47d3 in cygwin1!.assert () from /cygdrive/c/Windows/cygwin1.dll
> No symbol table info available.
> #10 0x0000000000000000 in ?? ()
> No symbol table info available.
>
> GLZAcompress.c line 904 is as follows and is in code that runs as a separate thread created in main:
> memmove(&candidates_index[new_score_rank+1], &candidates_index[new_score_rank], 2 * (rank - new_score_rank));
> This does point directly to where a code change was made.
>
> candidates_index is allocated in main and not ever intentionally changed until deallocated at the end of program execution:
> if (0 == (candidates_index = (uint16_t *)malloc(max_scores * sizeof(uint16_t))))
>     fprintf(stderr, "ERROR - memory allocation failed\n");
> This value is passed to the thread in a structure pointed to by the thread arg. The value 0xa00034470 for candidates_index is similar to what is reported on subsequent runs with added code to print this value so I don't think it's corrupted, but would need to duplicate the crash after checking the initial value to be 100% certain. With gdb reporting that rank = 4380 and new_score_rank = 2633 at the time of the SIGTRAP, this should be a backward move of 1747 uint16_t values by 2 bytes with a 2 byte difference between the source and destination addresses.
>
> Prior to this code change and for the last 3 days I have been using this code instead and not seen any crashes:
> uint16_t * score_ptr = &candidates_index[new_score_rank];
> uint16_t * candidate_ptr = &candidates_index[rank];
> while (candidate_ptr >= score_ptr + 8) {
>     *candidate_ptr = *(candidate_ptr - 1);
>     *(candidate_ptr - 1) = *(candidate_ptr - 2);
>     *(candidate_ptr - 2) = *(candidate_ptr - 3);
>     *(candidate_ptr - 3) = *(candidate_ptr - 4);
>     *(candidate_ptr - 4) = *(candidate_ptr - 5);
>     *(candidate_ptr - 5) = *(candidate_ptr - 6);
>     *(candidate_ptr - 6) = *(candidate_ptr - 7);
>     *(candidate_ptr - 7) = *(candidate_ptr - 8);
>     candidate_ptr -= 8;
> }
> while (candidate_ptr > score_ptr) {
>     *candidate_ptr = *(candidate_ptr - 1);
>     candidate_ptr--;
> }
> Yes, it's bloated code that should do the same thing as the memmove, but most importantly the code has never caused any problems. Interestingly, even this code shows memmove in the assembly code (gcc -S), but only for the second while loop. The looping code for the first while loop looks like this and moves 8 uint16_t's in just 5 instructions, so it is perhaps not as inefficient as the source code looks:
> .L25:
>     movdqu -16(%rax), %xmm1
>     subq $16, %rax
>     movups %xmm1, 2(%rax)
>     cmpq %rdx, %rax
>     jnb .L25
>
> It may or may not matter, but the code this is happening on is very CPU intensive - there can be up to 8 threads running at the same time when this problem occurs. The problem doesn't occur consistently, it seems to be rather random. The program runs about 500 iterations of ranking up to the top 30,000 new grammar rule candidates over nearly 4 hours on my test case and has crashed on different iterations each time it has crashed, even though the thread that seems to be crashing should be seeing exactly the same data each time the program is run. The malloc'ed array address could be changing, I haven't checked that out.
>
> I find it really hard to believe there is a bug in memmove, but that seems to be what gdb and my testing are indicating. So I am looking for advice on how to better understand what is causing the program to crash. I would like to review the code memset is using, but have not been able to figure out how to track that down. Any help in understanding what code the compiler is using for memmove would be helpful. Are there other things I could possibly be overlooking? Are there any other things I should review or report that would be helpful? I could try to write a simplified test case if that would be useful.
>
> Best Regards,
>
> Kennon Conrad
>
>
>
> --
> Problem reports: https://cygwin.com/problems.html
> FAQ: https://cygwin.com/faq/
> Documentation: https://cygwin.com/docs.html
> Unsubscribe info: https://cygwin.com/ml/#unsubscribe-simple
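On the simplified test case offered above: a minimal sketch might look like the following. It is not taken from GLZA. The names shift_up_memmove, shift_up_loop and shifts_agree are hypothetical, the loop is a non-unrolled stand-in for the old code, and MAX_SCORES is simply the max_scores value from the gdb dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_SCORES = 4883 };  /* max_scores as reported in the gdb dump */

/* Shift elements [lo, hi) up by one slot with memmove (the new code path). */
void shift_up_memmove(uint16_t *a, size_t lo, size_t hi)
{
    memmove(&a[lo + 1], &a[lo], sizeof a[0] * (hi - lo));
}

/* Same shift with a plain backward loop (stand-in for the old code path). */
void shift_up_loop(uint16_t *a, size_t lo, size_t hi)
{
    for (size_t i = hi; i > lo; i--)
        a[i] = a[i - 1];
}

/* Returns 1 when both versions leave identical arrays, 0 otherwise. */
int shifts_agree(size_t lo, size_t hi)
{
    static uint16_t x[MAX_SCORES], y[MAX_SCORES];

    assert(lo <= hi && hi < MAX_SCORES);  /* the shift writes a[hi] */
    for (size_t i = 0; i < MAX_SCORES; i++)
        x[i] = y[i] = (uint16_t)(i * 7u + 1u);  /* arbitrary fill pattern */

    shift_up_memmove(x, lo, hi);
    shift_up_loop(y, lo, hi);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling shifts_agree(2633, 4380) from a small main() exercises the same indices as the backtrace. A single-threaded check like this only tests memmove itself; if it always agrees, the threading or the index values become the more likely suspects.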
The memmove() call accesses new_score_rank 3 times while the old code only
accessed it once. Is it possible that another CPU alters new_score_rank between
these accesses?
You could eliminate that possibility by making a local copy of new_score_rank
and using that in the memmove() call. Worth a try?
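Something along these lines, as a rough sketch only. The helper name shift_candidates and its return codes are made up for illustration, and the range check is an extra that is not in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: shift candidates_index[new_score_rank .. rank-1]
 * up by one slot. Returns 0 on success, -1 if the indices are out of range.
 */
int shift_candidates(uint16_t *candidates_index, size_t max_scores,
                     size_t new_score_rank, size_t rank)
{
    /* Arguments are passed by value, so these are private copies that
       cannot change between the three uses below. */
    const size_t to = new_score_rank;
    const size_t from = rank;

    /* Added check (an assumption, not in the original code):
       the shift writes index `from`, so it must be inside the array. */
    if (to > from || from >= max_scores)
        return -1;

    memmove(&candidates_index[to + 1], &candidates_index[to],
            sizeof candidates_index[0] * (from - to));
    return 0;
}
```

If the crash persists with the indices frozen like this and the check never fires, a changing new_score_rank can probably be ruled out.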
Cheers ... Duncan.