Mail Archives: cygwin/2011/08/10/14:10:04

From: "Fischer, Matthew L" <matt DOT fischer AT hp DOT com>
To: "cygwin AT cygwin DOT com" <cygwin AT cygwin DOT com>
Date: Wed, 10 Aug 2011 19:07:39 +0100
Subject: regex_t internals: can we use re_magic to tell whether a regex has been regcomp'd?
Message-ID: <D6F0B7F3CE566B42B24474A7E7B8CBC436891ABDEA@GVW0547EXC.americas.hpqcorp.net>

We are porting code from Linux that tries to determine whether a regular expression has been successfully regcomp'd and not yet regfree'd. The Linux code does this by looking at the buffer pointer inside regex_t. On Cygwin, the equivalent storage (under a different field name) is hidden inside re_guts, and the comment on that field dissuades us from using it for this purpose. However, from reading the Cygwin implementation, it looks like a nonzero re_magic means the regex has been regcomp'd and not regfree'd. Is this interpretation correct?

The port shown below seems to work well, but we're not sure whether re_magic is the best solution for Cygwin. Is the method below the best and, more importantly, the safest option?

bool regexValid()
{
#ifdef __CYGWIN__
    // Cygwin: regex_t has no public buffer field, so test re_magic instead
    return m_reg.re_magic != 0;
#else
    // original Linux code
    return m_reg.buffer != NULL;
#endif
}
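
As a point of comparison, a portable way to answer the same question without touching regex_t internals is to record the compiled state ourselves. The following is only a minimal sketch: the RegexHolder class and its compile/release/matches methods are hypothetical names, not part of our code or of any library. Only regcomp, regexec, and regfree are the standard POSIX calls.

```cpp
#include <regex.h>
#include <cstring>

// Hypothetical wrapper (not from the original code): it tracks the compiled
// state in its own flag instead of reading regex_t internals.
class RegexHolder {
public:
    RegexHolder() : m_compiled(false) { std::memset(&m_reg, 0, sizeof m_reg); }
    ~RegexHolder() { release(); }

    // Returns true if regcomp() succeeded; any previous pattern is freed first.
    bool compile(const char *pattern, int cflags = REG_EXTENDED) {
        release();
        m_compiled = (regcomp(&m_reg, pattern, cflags) == 0);
        return m_compiled;
    }

    // Frees the pattern if one is held; safe to call repeatedly.
    void release() {
        if (m_compiled) {
            regfree(&m_reg);
            m_compiled = false;
        }
    }

    // True only between a successful compile() and the next release().
    bool regexValid() const { return m_compiled; }

    // True if a pattern is held and it matches the given text.
    bool matches(const char *text) const {
        return m_compiled && regexec(&m_reg, text, 0, NULL, 0) == 0;
    }

private:
    RegexHolder(const RegexHolder &);            // non-copyable
    RegexHolder &operator=(const RegexHolder &);

    regex_t m_reg;
    bool m_compiled;
};
```

With something like this, regexValid() would not need an #ifdef at all, at the cost of routing every regcomp and regfree call through the wrapper.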

Structures:
------------------------------------

Linux regex_t - typedef'd to struct re_pattern_buffer:

struct re_pattern_buffer
{
  /* Space that holds the compiled pattern.  It is declared as
     `unsigned char *' because its elements are sometimes used as
     array indexes.  */
  unsigned char *__REPB_PREFIX(buffer);
...
};
typedef struct re_pattern_buffer regex_t;

------------------------------------


Cygwin regex_t:

On Cygwin, the malloc'd space is down in "re_guts" which has a great comment: 
typedef struct {
  int re_magic;
  size_t re_nsub;           /* number of parenthesized subexpressions */
#ifdef __CYGWIN__
  const char *re_endp;      /* end pointer for REG_PEND */
#else
  __const char *re_endp;    /* end pointer for REG_PEND */
#endif
  struct re_guts *re_g;     /* none of your business :-) */
} regex_t;

