Mail Archives: djgpp/2013/06/18/04:45:15

X-Authentication-Warning: delorie.com: mail set sender to djgpp-bounces using -f
X-Received: by 10.224.129.196 with SMTP id p4mr12014903qas.6.1371543970790;
Tue, 18 Jun 2013 01:26:10 -0700 (PDT)
X-Received: by 10.50.88.101 with SMTP id bf5mr593446igb.0.1371543970571; Tue,
18 Jun 2013 01:26:10 -0700 (PDT)
Newsgroups: comp.os.msdos.djgpp
Date: Tue, 18 Jun 2013 01:26:10 -0700 (PDT)
In-Reply-To: <kpo3gh$4qo$1@speranza.aioe.org>
Complaints-To: groups-abuse AT google DOT com
Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=71.222.72.40; posting-account=jrLHRgkAAABPV01ZW_RN_U6Tm5UnYNUx
NNTP-Posting-Host: 71.222.72.40
References: <36e857f0-9899-496b-9fc6-32251e109888 AT googlegroups DOT com> <kpo3gh$4qo$1 AT speranza DOT aioe DOT org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <858cbded-7989-46e6-a997-93f842cdb3b0@googlegroups.com>
Subject: Re: General Protection Fault error is intermittent
From: "K.J.Williams" <lordwilliams1972 AT gmail DOT com>
Injection-Date: Tue, 18 Jun 2013 08:26:10 +0000
Bytes: 19657
Lines: 721
To: djgpp AT delorie DOT com
DJ-Gateway: from newsgroup comp.os.msdos.djgpp
X-MIME-Autoconverted: from quoted-printable to 8bit by delorie.com id r5I8j2b6020720
Reply-To: djgpp AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: djgpp AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

On Monday, June 17, 2013 3:48:36 PM UTC-7, Rod Pemberton wrote:
> "K.J.Williams" <lordwilliams1972 AT gmail DOT com> wrote in message
> news:36e857f0-9899-496b-9fc6-32251e109888 AT googlegroups DOT com...
> >
> > I have a problem with why a General Protection Fault message is
> > not displayed all the time, for a function called string_parser
> > in two different programs. The function called string_parser
> > uses a string.h function called strtok(). I am trying to
> > implement it to work right, and I know that I am not doing it
> > correctly. But the question is why in a smaller program it
> > compiles fine and doesn't exhibit the General Protection
> > Fault message vs. when I implement it in a larger program
> > it does ?
> 
> 
> 
> This looks just a bit like homework to me...  Is it?
> 
Actually, it is a program that I carried over from an old compiler,
Borland Turbo C++ for DOS v3.0, a 16-bit compiler that became
impossible to use once I wanted to compile programs larger than 64k.
So I switched to DJGPP, which has everything that the old compiler had.
I've been working on this program since 2009, when I discovered how to
use strtok() to break a text line down into separate segments.

> 
> 
> Have you found the problem(s) yet?

Yeah, I am not checking for a NULL pointer on the second use
of strtok() in my string_parser(), which would stop me from copying
the string back to the calling statement. That's not why
I posted my message here. You see, I wrote the program parstext.c
to test my function for the larger program that I want to use
it in. The problem is that DJGPP does not warn me about violating
memory boundaries with the parstext.c program, as it does with the
bigger program WATT (a project of several files). I want to know
why parstext.c runs without complaints.
It's basically a segmentation violation caused by strtok() on the
following calls.
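
For reference, the missing check described above can be sketched as follows. This is not the original string_parser(): nth_token() and its return convention are hypothetical names used only for illustration. The point it shows is that strtok() returns a NULL pointer once the tokens run out, and that passing that pointer to strcpy() is undefined behavior, so a crash is possible but never guaranteed. The same bug can fault in one program and pass unnoticed in another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PARSE_SIZE_LIMIT 81  /* same value as in parstext.c */

/*
 * Hypothetical helper (not the original string_parser): copies the
 * word-th token (1-based) of userstring into target.
 * Returns 0 on success, -1 if there is no such token.
 * target must have room for PARSE_SIZE_LIMIT bytes.
 */
int nth_token(const char *userstring, char *target, char magic, int word)
{
    char subject[PARSE_SIZE_LIMIT];
    char delim[2];
    char *tok;
    int i;

    assert(userstring != NULL && target != NULL);
    target[0] = '\0';
    if (word < 1 || strlen(userstring) >= sizeof subject)
        return -1;

    strcpy(subject, userstring);      /* strtok() modifies its input */
    delim[0] = magic;
    delim[1] = '\0';

    tok = strtok(subject, delim);     /* first call: pass the string */
    for (i = 1; tok != NULL && i < word; i++)
        tok = strtok(NULL, delim);    /* later calls: pass NULL */

    if (tok == NULL)                  /* ran out of tokens: do not copy */
        return -1;

    strcpy(target, tok);
    return 0;
}
```

A caller would test the return value before using target, for example `if (nth_token(worda, uword, ' ', y) != 0) { /* no such word */ }`.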


> 
> 
> 
> What have you tried?
> 
I have debugged my program using printf statements and by commenting out
parts of it. That traced the source of the error to my
string_parser() function, which does not use strtok() the way the
C standard defines it (which was not clear to me at first).

> 
> 
> My first guess would be that you're using strtok() in your large
> program too.  Check to see if your larger program's code is using
> strtok() also.  That could be an issue.
> 
That's what I am doing. I usually write smaller programs around a
function idea that I think of, to make sure it works right before
putting it into my bigger program.
> 
> 
> My next guess would be that your buffers sizes are too small or
> incorrect.  Does changing PARSE_SIZE_LIMIT to a larger value, say
> 512 or 4096, fix the problem?  If so, your buffers are too small.

Actually, this is a command console program which uses the width of the
screen in textmode(C4350); from conio.h. One text line is 80 characters
+ 1 for the null character '\0', hence #define PARSE_SIZE_LIMIT 81

 
> Does eliminating most usage of printf()'s eliminate the problem?
> Or, does using printf()'s eliminate the problem?  If the problem
> appears or disappears with usage of printf(), you have a memory
> allocation or leak problem somewhere.  For some odd reason,

I am actually using cprintf() in my bigger program and printf() in
my smaller program. The difference is that printf() uses stdout and
cprintf() uses direct BIOS calls to the video.

> 
> DJGPP's printf() will reveal memory allocation issues.  Sometimes
> using it will eliminate the crash.  Other times using it will
> cause the crash.  In both cases, it's something wrong with memory
> allocation or buffer overflow, etc.

Yes - as previously mentioned, a misuse of strtok() is causing
a segmentation fault, which is my problem to fix. Again, that's not
what I posted my message here about.


> 
> 
> 
> Does using the C library's fgets() instead of kfgets() eliminate
> the issue?  If so, something in kfgets() is at fault.
> Have you verified the range represented by y and z are correct?
> Have you verified that counta and countb are correct?
> Other suggestions or possible issues in the code below.
> 

I've been told in another programming forum, the C programming section
of www.cprogramming.com, that a C string is defined as a char array
which must end with '\0' - over and over again. Secondly, I've been told
that a char which holds a single character value such as '#' cannot be
used as a token for strtok() - over and over again. I redesigned my
program to use a C string as a token, but I misunderstood the C99
description of strtok(). Lastly, I had not understood that strtok()
does not check for a null character at all, which was giving me
problems in the first place with my larger program.

> 
> 
> > [here is] my source code :
> 
> > /*
> 
> >   program name: parstext.c
> 
> >   author: K.J.Williams
> 
> >
> 
> >   Purpose: This is a sample of how string_parser, which uses
> 
> > strtok() is
> 
> >            used parse commands in the larger program that I am
> 
> > developing
> 
> >            called ( by acronym ) WATT.
> 
> > */
> 
> >
> 
> > #include <stdio.h>
> 
> > #include <string.h>
> 
> >
> 
> > // original design setting:
> 
> > #define PARSE_SIZE_LIMIT 81
> 
> >
> 
> 
> 
> #define PARSE_SIZE_LIMIT 81
> 
> 
> 
> This should probably be split into two defines.   This is to make
> 
> sure the allocated buffer or character array is _always_ larger
> 
> than the quantity read or written to the buffer.  Use one define
> 
> for the character arrays and another for the buffer size.  They
> 
> should be slightly larger by at least two chars for the DOS
> 
> newline:
> 

Remind me again, is the DOS newline "\r\l" or "\r\n"?

Secondly, don't I have to put a null character at the end of my string,
in the +1 after the 80 characters, which is why PARSE_SIZE_LIMIT is 81?
Otherwise it's not a valid C string that I can pass to strtok().
> 
> 
>  #define PARSE_SIZE_LIMIT 80
> 
>  #define BUFFER_SIZE 82
> 

I thought that my function kfgets(), which actually uses fgets(), settled
this problem, since it removes the extra newline after reading in up to
PARSE_SIZE_LIMIT characters.
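
For reference, fgets(line, PARSE_SIZE_LIMIT, stdin) stores at most PARSE_SIZE_LIMIT - 1 characters plus the '\0'. With a limit of 81, a line of exactly 80 typed characters fills the buffer, the newline stays behind in stdin, and line[a-1] = '\0' then removes the 80th real character instead of a newline. A minimal sketch of a reader that handles this case is below. read_line() is a hypothetical name, not the original kfgets(), and discarding the overflow is just one possible policy.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical replacement for kfgets(): reads one line from fp into
 * buf (capacity size, including the terminating '\0').
 * Returns the length stored, or -1 on end of file or error.
 * Any part of the line that does not fit is discarded.
 */
int read_line(FILE *fp, char *buf, int size)
{
    size_t len;
    int ch;

    assert(fp != NULL && buf != NULL && size > 1);

    if (fgets(buf, size, fp) == NULL)   /* stores at most size-1 chars */
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';              /* whole line fit: drop newline */
    } else {
        /* line was too long (or EOF without newline): skip the rest */
        while ((ch = getc(fp)) != EOF && ch != '\n')
            ;
    }
    return (int)len;
}
```

Because the leftover input is consumed, the next call starts at the beginning of the next line rather than in the middle of an over-long one.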

> 
> 
> So, any place you declare a variable that used PARSE_SIZE_LIMIT,
> 
> it would be changed to use BUFFER_SIZE:
> 
> 
> 
> Change:
> 
>  char variable[PARSE_SIZE_LIMIT]
> 
> 
> 
> To:
> 
>  char variable[BUFFER_SIZE]
> 
> 
> 
> This would be done for the declarations of the following
> 
> variables:
> 
>  target, worda, uword in main()
> 
>  subject in string_parser()
> 
>  line in kfgets()
> 
> 
> 
> Leave the other non-declaration uses of PARSE_SIZE_LIMIT alone.
> 
> 
> 
> > [SNIP]
> 
> >
> 
> > int main (void)
> 
> > {
> 
> >    //declare char arrays and initialize garbage with '\0'
> 
> >
> 
> > [SNIP]
> 
> >
> 
> >    printf("Compilation Date : %s @ %s PST (24hr format)\n",__DATE__,__TIME__);
> >    printf("WATT: single text line string parsing tester\n");
> >    printf("Enter AT Command (ATcmd) statment(s) in the syntax format of\n");
> >    printf("[#(ATcmd): arg1 arg2 ... argN ] or [#(ATcmd): ] - for no args, and etc.\n");
> 
> >
> 
> 
> 
> The word "statement" in main() has two letter e's, not just one,
> 
> i.e., misspelled...
> 

- oops - typo...

> 
> 
> >    //prompt user
> 
> >
> 
> >     do
> 
> >     {
> 
> >
> 
> >     //initialize (or reset) these variables
> 
> >     z = 1;
> 
> >     target[0] = '\0';
> 
> >
> 
> > [SNIP]
> 
> >
> 
> >         if(countb > 1)
> 
> >         {
> 
> >           for(y = 1;y <= countb;y++)
> 
> >           {
> 
> >              string_parser(worda,uword,' ',y);
> 
> >              printf("...sub-part#%d : %s\n",y,uword);
> 
> >           }
> 
> >         }
> 
> >
> 
> >       //pause
> 
> 
> 
> z++;
> 
> 
> 
> See below.
> 
> 
> 
> >       if(z < counta); { return_key(); }
> 
> 
> 
> This appears to have a bug.  There is an extra semi-colon between
> 
> what appears to be the correct body of the if() and the if's
> 
> condition.  Removed semicolon:
> 
> 
> 
>        if(z < counta) { return_key(); }

- another typo to change .... 
> 
> 
> 
> >
> 
> >       z++;
> 
> 
> 
> Z appears to be off by one for the if() above.  I moved it up
> 
> above.
> 
> 
> 
> > [SNIP]
> 
> >
> 
> > //parses c strings by using single character tokens with strtok();
> > short int string_parser(char *userstring, char *target, char magic, short int word)
> > {
> 
> 
> 
> Since "magic" is only one character, it's probably easier if you
> 
> used strchr() or strrchr() to parse instead of using strtok().
> 
> 
> 
> >   char subject[PARSE_SIZE_LIMIT]; subject[0] = '\0';
> 
> > //initialize subject
> 
> >
> 
> >   //w1 = token with null character ; w2 = token without null
> 
> > character
> 
> 
> 
> Remove comment for w2.  It's wrong.  Or, change to same comment as
> 
> w1.  A null character, i.e., '\0' in C, and a NULL pointer are
> 
> completely different.

Yes, the C programming section of the www.cprogramming.com forum
drummed this point in over and over in my posts there. Part of my
confusion was with how I understood the description of strtok() in C99.

> 
> 
> 
> >   char w1[2]; w1[0] = magic; w1[1] = '\0';
> 
> >   char w2[1]; w2[0] = magic;
> 
> >
> 
> 
> 
> It appears that there are two bugs here.  w2 in string_parser is a
> 
> standard string.  It should be null terminated, like w1.  It
> 
> should also have enough space for two characters.  I.e.,
> 
> string_parser() needs this here:
> 
> 
> 
>    char w2[2]; w2[0] = magic; w2[1]='\0';
> 

again my confusion...
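
For reference, the two meanings of "null" discussed above can be put side by side in a short sketch. count_tokens() and its 128-byte scratch buffer are hypothetical, chosen only for illustration. The delimiter argument of strtok() is a string, so a single character needs a two-byte array ending in the null character '\0'. The NULL pointer is a different thing: it is what strtok() takes as its first argument on follow-up calls and what it returns when no tokens remain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: counts the tokens in text that are separated
 * by the single character magic. text is left unmodified.
 * Returns -1 if text does not fit in the 128-byte scratch buffer.
 */
int count_tokens(const char *text, char magic)
{
    char copy[128];
    char delim[2];      /* two bytes: the character plus '\0' */
    char *tok;          /* strtok() result; NULL means "no more" */
    int n = 0;

    assert(text != NULL);
    if (strlen(text) >= sizeof copy)
        return -1;
    strcpy(copy, text);

    delim[0] = magic;
    delim[1] = '\0';    /* null character: a byte with value 0 */

    for (tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim))
        n++;
    return n;
}
```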
> 
> 
> >   //temp variables:
> 
> >   char a = 0;//copies x before the if evaluation of x - see
> 
> > below
> 
> >   char b = 0;//copies b after the if evaluation of x - see below
> 
> >   char c = 0;//the difference between a & b - see below
> 
> >   char d = 0;//d = subject[x]; - see below
> 
> >
> 
> >   char x = 0;//for loop counter
> 
> >
> 
> > [SNIP]
> 
> >
> 
> >          //printf(" true : y = %d\n",y);//temp
> 
> >        }
> 
> >        else
> 
> >        {
> 
> >          //printf("false\n");//temp - do nothing
> 
> >        }
> 
> >     //end of for loop
> 
> >     }
> 
> >
> 
> >     if (word == -2) { if(z != y) { return -1; } }
> 
> >
> 
> 
> 
> y++;
> 
> 
> 
> Y seems to be off by one. 

It's intended to count the valid tokens (which are single characters - '#').
They have to be at least 2 characters apart for the strings between them
to count as strings that are parsed out separately.
> 
> 
> 
> > [SNIP]
> 
> >   //** this is where strtok() is used **
> >
> >   /*
> >     note: strtok expects the token to be a cstring without a null character
> >           on the first use, and a token to be a cstring *with* a
> >           null character every use after the first.
> >   */
> 
> >
> 
> 
> 
> No...
> 
> 
> 
> strtok() expects the first argument to be a valid pointer to char,
> 
> i.e., char *, for the first call.  strtok() expects the first
> 
> argument to be a NULL pointer to continue processing the same
> 
> token for additional calls.  Your use of strtok() in the code
> 
> below that comment appears correct.
> 
> 
> 
> > [SNIP]
> 
> > //generic prompt and get text from user
> 
> > short int kfgets(char *target)
> 
> > {
> 
> >    short int a;//temp. variables
> 
> >    char line[PARSE_SIZE_LIMIT]; line[0] = '\0'; //temp string
> 
> > storage
> 
> >
> 
> >    //programmer must provide a prompt for whatever information
> 
> > wanted
> 
> >    //from the user
> 
> >
> 
> >    fgets(line ,PARSE_SIZE_LIMIT, stdin);
> 
> >    a = strlen(line);
> 
> >    line[a-1] = '\0';//get rid of the newline character added by
> 
> > fgets
> 
> >
> 
> 
> 
> I'm not sure how you're detecting the final argument.  You might
> 
> be checking for a '\0' somewhere with strtok(), but I didn't see
> 
> that.  If so, it doesn't seem to be working...  But, space padding
> 
> the string here in kfgets() will allow you parse the final arg.
> 
> 
> 
> Comment out:
> 
>   line[a-1]='\0';
> 
> 
> 
> Add:
> 
>   line[a-1]=' ';
> 
>   line[a]='\0';
> 
> 
> 
> FYI, this is probably not the fix you want...
> 
Actually, line[a-1] = '\0' is a programming practice that www.cprogramming.com
has in its own documentation as a valid way to strip the newline character
stored by fgets();
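
For reference, line[a-1] = '\0' is only safe when the buffer is non-empty and really ends in a newline. In kfgets() the return value of fgets() is not checked, so at end of file the buffer still holds the empty string, a is 0, and the statement writes to line[-1], outside the array. A minimal sketch of a variant that cannot write out of bounds is below; chomp() is a hypothetical helper name, and strcspn() is the standard string.h function.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: cuts line at the first '\r' or '\n'.
 * Safe for an empty string and for a line with no newline at all.
 * Returns the new length.
 */
size_t chomp(char *line)
{
    size_t n;

    assert(line != NULL);
    n = strcspn(line, "\r\n");   /* index of first CR/LF, or strlen */
    line[n] = '\0';
    return n;
}
```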

> 
> 
> > [SNIP]
> 
> >
> 
> > //a ANSI C equivalent of " Press any key to continue "
> 
> >
> 
> > short int return_key(void)
> 
> > {
> 
> >    char x[2];
> 
> >
> 
> >    printf("\n Press Enter...\r");
> 
> >    while (fgets(x,2,stdin) != NULL && x[0] != '\n');
> 
> >
> 
> 
> 
> return_key() attempts to get two characters, but checks for one...
> 
> Two characters for newline under DOS ( '\r\n') would require stdin
> 
> to be set to binary mode.  Off hand, I don't recall if stdin is
> 
> set to binary mode or text mode, but I suspect it's text since I
> 
> generally use binary mode and check for two...  I.e., one
> 
> character ('\n') is all I think you should need to get.
> 
I thought that since DJGPP is a port of GCC, which comes from a Linux (Unix)
environment, all file modes are binary regardless of whether they are
reading or writing - so I thought that GCC was ported with that strict
implementation. And again, I know that a newline feed
in Linux is '\l', but I am not sure for DOS if it's "\r\l" or "\r\n"?
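
For reference, C has no '\l' escape. The line feed is written '\n' (value 10 in ASCII) and the carriage return '\r' (value 13). Unix and Linux end a line with '\n' alone, while DOS text files end a line with the pair "\r\n". DJGPP's C library is generally described as opening stdin in text mode by default, which would mean the pair arrives in the program as a single '\n', but that default is stated here as an assumption and is worth checking against the DJGPP libc documentation. A minimal sketch that tells the two endings apart follows; ending_of() and the enum are hypothetical names.

```c
#include <assert.h>
#include <string.h>

enum line_ending { ENDING_NONE, ENDING_LF, ENDING_CRLF };

/* Hypothetical helper: reports how the string s is terminated. */
enum line_ending ending_of(const char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    if (len >= 2 && s[len - 2] == '\r' && s[len - 1] == '\n')
        return ENDING_CRLF;     /* DOS: CR (13) followed by LF (10) */
    if (len >= 1 && s[len - 1] == '\n')
        return ENDING_LF;       /* Unix/Linux: LF (10) only */
    return ENDING_NONE;
}
```

A stream read in binary mode from a DOS text file would typically yield ENDING_CRLF, and the same file read in text mode ENDING_LF.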

> 
> 
> It seems you're using fgets() to read just a few characters in
> multiple places?  Why not just use getc() or getch()?  Was this in
> preparation for your kfgets() read routine?
> 
Well, in context I am using kfgets() temporarily to test my program and
post it, so that if anyone were to compile it, it would at least be
portable. Otherwise I would be using kbhit();, which is not part of the
C standard.
> 
> 
> That's definitely not everything, but it's a start.
> 
Well, thank you for looking over my code and pointing out some problems
that were confusing to me. However, it doesn't explain why DJGPP will NOT
produce a General Protection Fault warning (I assume this is the
equivalent of a segmentation fault for a memory violation) for my
parstext.c program, but does so every time for my larger program WATT,
for the same memory violation happening in the string_parser() that is
implemented there.

Here is what I am doing...

I am developing a scripting language which will interpret ASCII text
files in MS-DOS to run console programs, in a modified MS-DOS
environment that is booted up from a USB flash drive. This is similar to
how Fedora Core can be booted from a USB flash drive without a hard
drive.

My main program, called WATT, uses a command line interface the same way
that GCC or any other command line program does.

When it runs a script, it copies one text line at a time, containing
either my scripting language commands or text, and does whatever the
interpreter has been told to do with it.

The string_parser() function is the keystone that I am trying to get to
work under DJGPP as it did under Borland Turbo C++ for DOS v3.0, which I
tossed out long ago.

Basically, when a scripting text line is detected (as fed in from the
text file), it is identified by the first character in the line, '#',
which declares it to be scripting code to execute. But first the line has
to be parsed, because there could be more than one scripting statement on
one line.

I define my commands (even ones with arguments) to start with a '#' and
end with a space ' ' - e.g. (#clear: #newline: 1 ).

When string_parser() is used,

#clear: - is copied separately to another C string via strtok() and
strcpy(), and the rest of the line is parsed the same way, to the end of
the line, for other commands.

If the first character of the text line read in is not '#', then it's
treated as ASCII text to print via cprintf();, or handled in whatever
other way the interpreter has been told to treat it.
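
For reference, the line handling described above can be sketched roughly as follows. This is not WATT's actual code: split_commands(), MAX_CMDS and the choice to drop the leading '#' from each command are all assumptions made for illustration. It classifies a line by its first character and, for script lines, splits it in place at each '#', checking every strtok() result for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CMDS 16   /* assumed limit, not from the original program */

/*
 * Hypothetical sketch: splits a script line in place into commands.
 * Each command starts at a '#'; cmds[i] points into line, without
 * the '#' and without trailing spaces.
 * Returns the number of commands, or 0 if the line is plain text
 * (does not begin with '#').
 */
int split_commands(char *line, char *cmds[], int max)
{
    int n = 0;
    char *tok;

    assert(line != NULL && cmds != NULL);
    if (line[0] != '#')
        return 0;                       /* plain text line */

    for (tok = strtok(line, "#"); tok != NULL && n < max;
         tok = strtok(NULL, "#")) {
        size_t len = strlen(tok);
        while (len > 0 && tok[len - 1] == ' ')
            tok[--len] = '\0';          /* trim trailing spaces */
        cmds[n++] = tok;
    }
    return n;
}
```

With the example line "#clear: #newline: 1", this would yield two commands, "clear:" and "newline: 1", and a line that does not start with '#' yields 0 so the caller can print it as text.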

So I wrote parstext.c to test string_parser(), and it worked after I
compiled it with DJGPP. But when I put string_parser() into my bigger
program, which has many files that are thousands of lines long, I get a
segmentation fault (reported as a General Protection Fault at run time
in this case). The cause is that strtok() returns a NULL pointer as an
error, and that pointer is then copied into the string that gets passed
back to the calling statement in my large program. I was not checking
for it, because I didn't know that I should have in the first place.
More importantly, it was a bad use of strtok().

But it doesn't make sense to me that the General Protection Fault warning
doesn't happen with string_parser() in parstext.c, where it would have
really helped me a lot. That's why I am confused.


Karl J. Williams

