Mail Archives: djgpp/2006/12/01/06:30:17
Alexei A. Frounze wrote:
> Alexei A. Frounze wrote:
>> Brian Inglis wrote:
>>> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ
>>> Delorie <dj AT delorie DOT com> wrote:
>>>
>>>>
>>>>> This function indicates if STRING matches the PATTERN. ..."
>>>>>
>>>>> So DJGPP says that "\" doesn't match "\\" while Linux says it
>>>>> does. Well, I say DJGPP is right as the pattern says there should be
>>>>> two
>>>>> backslashes and you only provide one.
>>>>
>>>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>>>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>>>> escaped backslash, whereas a string of "\" is a single backslash.
>>>> They should match.
>>>
>>> switch ((c = *pattern++))
>>> {
>>> ...
>>> ...
>>> ...
>>> case '\\':
>>>   /*+++ pattern already post-incremented to point to next char */
>>>   if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\", pattern[1]))
>>>   /*+++ should be:
>>>     if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
>>>    *+++ as end of input pattern will match end char in escapes string
>>>    */
>>>   {
>>>     /*+++ end of input pattern might be clearer with ! or == '\0' */
>>>     if ((c = *pattern++) == 0)
>>>     {
>>>       c = '\\';
>>>       --pattern;
>>>     }
>>>     if (c != *string++)
>>>       return FNM_NOMATCH;
>>>     break;
>>>   }
>>
>> I don't think the above is enough. There's another problem. With the
>> above code you'd never see (c = *pattern++) == 0. My bet is that the
>> intent was to treat the slash in the last character of pattern as an
>> ordinary character. That would explain the {c = '\\'; --pattern;}
>> thing along with the fallthrough behavior. But the code is broken in
>> this place. Dunno if it was tested against the single unix spec or
>> just a little bit to see that it seems to work (in some basic cases).
>
> One more thing to consider, closing bracket as first character in the
> list/range inside the bracket expression:
> fnmatch("[]]", "]", 0) must return 0, doesn't
> fnmatch("[!]]", "]", 0) must return 1, does (by luck)
> fnmatch("[!]]", "a", 0) must return 0, doesn't
>
> And one more, which seems to be partially wrong even in that same RH9
> linux distro:
> fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in DJGPP
> fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both linux and
> DJGPP -- the spec doesn't make an exception for the escaped opening
> bracket, \[, when it describes the escaping option. Dunno, maybe it would
> be wise not to escape the bracket, but other characters should be
> made escapable w/o a problem.
Actually, I was wrong about linux failing fnmatch("\[abc\]","[abc]",0)==0 --
I didn't put quotation marks around arguments to fnmatch that were passed to
it from the command line and therefore fnmatch wasn't comparing the same
thing (shell stripped some stuff). So, the above two things are only wrong
in DJGPP.
> Alex
> P.S. all the details obtained from fnmatch()'s description in The
> Single Unix Specification V3 2004 issue 6.
> P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't
> support, but the above things are pretty basic and it would be nice
> to have them handled properly, unless I overlook some major
> DOS-related issue for which it would be desirable to deviate from the
> spec.
Another couple of examples revealing incorrect behavior of fnmatch() in
DJGPP:
fnmatch("*\a", "a", 0) must return 0, doesn't
fnmatch("\[a]", "[a]", 0) must return 0, doesn't
fnmatch("\[a]", "a", 0) must return 1, does
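The cases collected in this thread can be put into a small table-driven check. This is only a sketch: struct fn_case and check_fnmatch_cases() are names made up for the illustration, the expected values are the ones claimed above from my reading of the spec, and FNM_NOMATCH is used for the "must return 1" cases on the assumption that this is its value on the platform. The patterns above were passed from the command line, so in C source every backslash has to be doubled.

```c
#include <fnmatch.h>
#include <stdio.h>

/* Hypothetical helper type: one pattern/string pair and the expected result. */
struct fn_case
{
  const char *pattern;
  const char *string;
  int expected;            /* 0 for a match, FNM_NOMATCH otherwise */
};

/* Backslashes are doubled for the C compiler: "\\ab\\c" is the pattern \ab\c */
static const struct fn_case cases[] =
{
  { "\\\\",       "\\",    0 },            /* escaped backslash vs. one backslash */
  { "[]]",        "]",     0 },            /* ']' first in the list is literal */
  { "[!]]",       "]",     FNM_NOMATCH },
  { "[!]]",       "a",     0 },
  { "\\ab\\c",    "abc",   0 },            /* escaping ordinary characters */
  { "\\[abc\\]",  "[abc]", 0 },            /* escaped opening bracket */
  { "*\\a",       "a",     0 },            /* escape right after an asterisk */
  { "\\[a]",      "[a]",   0 },
  { "\\[a]",      "a",     FNM_NOMATCH }
};

/* Runs every case with flags == 0, reports each result,
   and returns the number of cases that disagree with the table. */
int check_fnmatch_cases(void)
{
  size_t i;
  int failures = 0;

  for (i = 0; i < sizeof cases / sizeof cases[0]; i++)
  {
    int got = fnmatch(cases[i].pattern, cases[i].string, 0);
    int ok = (got == cases[i].expected);

    printf("%-4s fnmatch(\"%s\", \"%s\", 0) returned %d, expected %d\n",
           ok ? "ok" : "BAD", cases[i].pattern, cases[i].string,
           got, cases[i].expected);
    if (!ok)
      failures++;
  }
  return failures;
}
```

Calling check_fnmatch_cases() from main() and using its result as the exit status gives a quick way to compare libraries; the printed patterns are shown as fnmatch() sees them, i.e. with single backslashes.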
So, as I understand it, the fnmatch() code flaws are:
1. rangematch() doesn't allow for the following two patterns: []...] and
[!]...] where ] is a valid char in the range
2. asterisk handling doesn't distinguish between the various options for c
and what follows c in this branch:
  else if (isslash(c) && flags & FNM_PATHNAME)
  {
    if ((string = find_slash(string)) == NULL)
      return FNM_NOMATCH;
    break;
  }
The options it should tell apart are:
a) c=='/' // forward slash
b) c=='\', (flags & FNM_NOESCAPE)!=0 // back slash
c) c=='\', (flags & FNM_NOESCAPE)==0, isslash(pattern[1])==1 // any escaped slash
3. '\\' handling is completely broken (wrong indices and logic). If escaping
is on and '\\' is followed by anything from "\\?*[", the case must fall
through to default so that the following char is interpreted as an ordinary
char and not as a special char again. For all other following chars
(including '\0') it's better to break out of the case/switch and let the
other existing cases treat those chars as ordinary.
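Flaws 1 and 3 can be illustrated with a tiny stand-alone matcher. This is not DJGPP's fnmatch() and not a proposed patch, just a sketch of the intended behavior. The names sk_match(), sk_bracket(), SK_NOMATCH and SK_NOESCAPE are hypothetical. It ignores FNM_PATHNAME (so flaw 2 isn't covered), character classes and escapes inside brackets. It also simplifies item 3: an escaped char is always taken literally, and a trailing backslash is treated as an ordinary char, which is only one guess at the intent discussed above.

```c
#include <stddef.h>

#define SK_NOMATCH  1     /* hypothetical stand-in for FNM_NOMATCH */
#define SK_NOESCAPE 0x01  /* hypothetical stand-in for FNM_NOESCAPE */

/* p points just past '['. Sets *hit if c is matched by the bracket
   expression and returns a pointer past the closing ']', or NULL if
   there is no closing ']'. */
static const char *sk_bracket(const char *p, char c, int *hit)
{
  unsigned char uc = (unsigned char)c;
  int negate = 0, first = 1;

  *hit = 0;
  if (*p == '!')
  {
    negate = 1;
    p++;
  }
  /* A ']' in first position (after an optional '!') is a list member. */
  while (*p && (first || *p != ']'))
  {
    unsigned char lo = (unsigned char)*p++, hi = lo;

    first = 0;
    if (p[0] == '-' && p[1] && p[1] != ']')   /* range lo-hi */
    {
      hi = (unsigned char)p[1];
      p += 2;
    }
    if (lo <= uc && uc <= hi)
      *hit = 1;
  }
  if (*p != ']')
    return NULL;
  if (negate)
    *hit = !*hit;
  return p + 1;
}

/* Returns 0 if string s matches pattern p, SK_NOMATCH otherwise. */
int sk_match(const char *p, const char *s, int flags)
{
  for (;;)
  {
    char c = *p++;

    switch (c)
    {
    case '\0':
      return *s ? SK_NOMATCH : 0;
    case '?':
      if (!*s)
        return SK_NOMATCH;
      s++;
      break;
    case '*':
      while (*p == '*')
        p++;
      for (;; s++)                 /* try every possible tail of s */
      {
        if (sk_match(p, s, flags) == 0)
          return 0;
        if (!*s)
          return SK_NOMATCH;
      }
    case '[':
    {
      int hit;
      const char *next;

      if (!*s)
        return SK_NOMATCH;
      next = sk_bracket(p, *s, &hit);
      if (next)
      {
        if (!hit)
          return SK_NOMATCH;
        p = next;
      }
      else if (*s != '[')          /* no closing ']': '[' is ordinary */
        return SK_NOMATCH;
      s++;
      break;
    }
    case '\\':
      if (!(flags & SK_NOESCAPE) && *p)
        c = *p++;                  /* escaped char, taken literally */
      /* fall through: c is now an ordinary char */
    default:
      if (c != *s++)
        return SK_NOMATCH;
      break;
    }
  }
}
```

With this sketch sk_match("[]]", "]", 0) and sk_match("\\[a]", "[a]", 0) both return 0, while sk_match("[!]]", "]", 0) returns SK_NOMATCH, in line with the expectations listed above.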
Alex