Mail Archives: djgpp/2019/06/21/03:19:01
On Thu, 20 Jun 2019 23:33:12 +0200
"J.W. Jagersma (jwjagersma AT gmail DOT com) [via djgpp AT delorie DOT com]"
<djgpp AT delorie DOT com> wrote:
> On 2019-06-20 13:21, Rod Pemberton wrote:
Earlier, you said,
JWJ> Variables changing for no reason. Then a pointer or some offset
JWJ> changes and the next access triggers a page or GP fault.
The most difficult problem I ever tracked down was a race condition.
It resulted in a variable being corrupted by an overwrite. This
was caused by non-reentrant code being reentered. That could be your
issue as well, e.g., if using hardware or software interrupt routines.
If so, you'll see file scope or global variable(s) being corrupted when
the code is reentered, just like you described earlier. The solution
is to eliminate the file scope or global variable, using auto or local
variables instead, or wrap the interrupt routine with CLI/STI to prevent
interruption. You may even need to disable NMI via port writes.
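
As a rough illustration of that reentrancy problem, here is a minimal sketch in portable C. Nothing in it is DJGPP-specific: `fake_irq` is a hypothetical hook that stands in for a real interrupt arriving in the middle of a routine, and every function name is made up for the example. The first routine keeps its result in a file scope buffer, the second uses storage owned by the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an interrupt that fires mid-routine. */
static void (*fake_irq)(void) = NULL;

static void fire_fake_irq(void)
{
    if (fake_irq) {
        void (*handler)(void) = fake_irq;
        fake_irq = NULL;               /* fire once, no endless nesting */
        handler();
    }
}

/* Non-reentrant: every call shares this one file scope buffer. */
static char shared_buf[16];

const char *label_shared(int n)
{
    snprintf(shared_buf, sizeof shared_buf, "id=%d", n);
    fire_fake_irq();                   /* "interrupt" arrives here */
    return shared_buf;                 /* may now hold the handler's data */
}

/* Reentrant: the caller supplies the storage, so nested calls cannot collide. */
const char *label_local(int n, char *buf, size_t len)
{
    snprintf(buf, len, "id=%d", n);
    fire_fake_irq();
    return buf;
}

/* Handlers that reenter the same routine, as an ISR might. */
static void handler_shared(void) { label_shared(99); }
static void handler_local(void)  { char tmp[16]; label_local(99, tmp, sizeof tmp); }

/* Each returns 1 if the interrupted call lost its own result. */
int shared_version_corrupted(void)
{
    fake_irq = handler_shared;
    return strcmp(label_shared(1), "id=1") != 0;
}

int local_version_corrupted(void)
{
    char buf[16];
    fake_irq = handler_local;
    return strcmp(label_local(1, buf, sizeof buf), "id=1") != 0;
}
```

In this sketch `shared_version_corrupted()` reports the overwrite while `local_version_corrupted()` does not, which is the same symptom as a global changing "for no reason". A real interrupt handler would not be this polite about when it fires, so treat it only as a model of the failure.
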
Of course, not saving/restoring the registers that the C/C++ code has in
use does much the same thing. Certain registers must be preserved and
can't be clobbered. These are referred to as callee-saved registers (as
opposed to caller-saved ones). For DJGPP, my notes say that the EBX, EDI,
ESI, EBP, DS, ES, and SS registers must be preserved. I.e., check your
inlined assembly's clobber list.
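
For reference, a small sketch of what a complete clobber list looks like in GCC extended inline assembly. The function name `add_via_ebx` is hypothetical, and the asm uses EBX as scratch purely to make the point. It assumes a GCC-compatible compiler targeting x86 and falls back to plain C elsewhere.

```c
/* Adds a and b, deliberately using EBX as a scratch register. */
int add_via_ebx(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int sum = a;
    __asm__ (
        "movl %1, %%ebx\n\t"   /* overwrites EBX, which must be preserved */
        "addl %%ebx, %0"
        : "+r" (sum)           /* read-write operand: the running sum */
        : "r" (b)              /* input operand */
        : "ebx", "cc"          /* clobbers: EBX and the flags were changed */
    );
    return sum;
#else
    return a + b;              /* other targets: no asm, same result */
#endif
}
```

With `"ebx"` listed, the compiler saves and restores the register around the asm and keeps its own values out of it. Leaving it off would still compile, but anything the surrounding code kept in EBX could be silently overwritten, and the damage would show up somewhere else entirely.
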
> > Does a printf() placed nearby eliminate the issue? (memory
> > allocation)
>
> Doesn't eliminate it, but any code change (including adding
> printf/cout) tends to change the memory location where the corruption
> occurs.
If it was my code, I'd probably work on this angle to "chase" the error
around until I found it or some clue. Or, I'd "shotgun" a bunch of
printf()'s to get an indication of where something was going awry.
E.g., print out letters to trace the code flow before your debugger
break points hit.
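
A minimal sketch of that letter-trail idea, in plain C. The helper `trace_mark` and the stage functions `load_data` and `process` are hypothetical names used only to show where the markers go. Each letter is flushed right away so it is more likely to reach the screen before a crash, and a copy is kept in memory for later inspection.

```c
#include <stdio.h>

#define TRACE_MAX 64

static char trace_log[TRACE_MAX + 1];  /* in-memory copy of the trail */
static int  trace_len = 0;

/* Emit one marker letter immediately and remember it. */
void trace_mark(char c)
{
    fputc(c, stderr);
    fflush(stderr);                    /* push it out before anything can crash */
    if (trace_len < TRACE_MAX) {
        trace_log[trace_len++] = c;
        trace_log[trace_len] = '\0';
    }
}

const char *trace_get(void)
{
    return trace_log;
}

/* Hypothetical program stages, present only to show marker placement. */
static int load_data(int x) { trace_mark('B'); return x * 2; }
static int process(int x)   { trace_mark('C'); return x + 1; }

int run_traced(int x)
{
    trace_mark('A');                   /* entered */
    x = load_data(x);
    x = process(x);
    trace_mark('D');                   /* left normally */
    return x;
}
```

If the trail stops at `C`, the failure happened after `process` was entered and before `run_traced` finished. Note that adding the calls changes code layout, so the corruption may move, as already observed with printf/cout.
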
Having done that in the past (without using a debugger), I know it can
take quite a while to find something to look into. Keep trying.
Obviously, there is no way to know if the source of the error is near
to the problem or far away in either code space or time.
Apparently, you're using a debugger, which I would guess should help
plenty over the rudimentary methods I prefer to use. Perhaps, keep
watching the watch points you've got, but keep working your way back up
the code chain checking for modified pointers.
> > Are you accessing memory that hasn't been allocated? (buffer
> > overflow)
>
> As far as I'm aware, no. (and if I was, I would stop doing it :))
> For the most part I'm using c++ constructs like std::vector and
> std::unique_ptr which are designed to prevent these sort of issues.
Sorry, I don't know C++. Clearly, this is the primary reason you
believe it to be a lower level issue with malloc.
v2.03 libc.a in djdev reports using malloc
v2.04 libc.a in djdev reports using malloc
v2.05 libc.a in djdev reports using nmalloc
So, only v2.05 should be using nmalloc.
If you could produce code that causes mv2freelist() to fail somewhat
consistently, I could see what happens with v2.03 (which uses malloc
instead of nmalloc). I also had v2.04 installed, but I don't have it
on an active DOS partition right now (backed up).
So, maybe you could try a v2.03 install? You'll have to select the
older version files from a DJGPP mirror. This may provide an additional
reference point to go on or a way to compare outcomes. E.g., if your
application works correctly on v2.03, but not on v2.05, then v2.05 has
a problem ... Since I never really used v2.04 and it seemed to always
be in beta, I may install v2.05 but not any time soon.
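
As a starting point for such a comparison, here is a generic allocator stress sketch in standard C. It is only an assumption about what a reproducer might look like: it does not touch DPMI, interrupts, or any of the application's own routines, and the name `stress_malloc`, the slot count, and the size range are arbitrary choices. It fills every block with a known byte and checks that byte before freeing.

```c
#include <stdlib.h>
#include <string.h>

#define SLOTS 64

/* Returns 1 if the block still holds the fill byte everywhere. */
static int block_ok(const unsigned char *p, size_t len, unsigned char fill)
{
    size_t k;
    for (k = 0; k < len; k++)
        if (p[k] != fill)
            return 0;
    return 1;
}

/* Randomly allocates and frees blocks; returns how many were found damaged. */
int stress_malloc(unsigned rounds, unsigned seed)
{
    unsigned char *block[SLOTS] = { 0 };
    size_t size[SLOTS] = { 0 };
    int bad = 0;
    unsigned r;
    size_t i;

    srand(seed);                       /* fixed seed for repeatable runs */
    for (r = 0; r < rounds; r++) {
        i = (size_t)rand() % SLOTS;
        if (block[i]) {                /* slot in use: verify, then free */
            if (!block_ok(block[i], size[i], (unsigned char)i))
                bad++;
            free(block[i]);
            block[i] = NULL;
        } else {                       /* slot empty: allocate and fill */
            size[i] = 1 + (size_t)rand() % 4096;
            block[i] = malloc(size[i]);
            if (block[i])
                memset(block[i], (int)i, size[i]);
        }
    }
    for (i = 0; i < SLOTS; i++) {      /* final sweep over what is left */
        if (block[i]) {
            if (!block_ok(block[i], size[i], (unsigned char)i))
                bad++;
            free(block[i]);
        }
    }
    return bad;
}
```

Calling something like `stress_malloc(10000, 42)` from the real program, once with the suspect interrupt or DPMI code active and once without, and under both library versions, would give a few more reference points. A result of zero from this loop on its own says little about the allocator being at fault or not.
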
> > Are you using any assembly? (register corruption)
> > Are you using any other "advanced" features of DJGPP like DPMI to
> > allocate memory, nearptr's or farptr's, transfer buffer, etc?
>
> There's a lot of that going on, and most of those features I
> implemented myself to be more in line with idiomatic c++ code.
> However I used those same routines in other programs and it doesn't
> cause any issues there.
From personal and professional experience, bugs can hide in code for a
long, long time. Attack the "black box" from many angles. Try and try
again, until you succeed.
I.e., I consider any code to be worthy of review. Sigh, I recently
found 3 bugs in a simple program of mine that I've used for years ...
What I was looking for was a way to speed up the program. :-(
Rod Pemberton
--
Once upon a time, many decades ago in a place far away, humble people
sought their freedom, and lost. "Ideas are bulletproof."