Sender: richdawe AT bigfoot DOT com Message-ID: <3A09EAB9.6924045A@bigfoot.com> Date: Thu, 09 Nov 2000 00:07:21 +0000 From: Richard Dawe X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.17 i586) X-Accept-Language: de,fr MIME-Version: 1.0 To: DJGPP workers Subject: Summary of the snprintf() situation Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Reply-To: djgpp-workers AT delorie DOT com Hello. I've created an archive of the snprintf() source that Alain sent me, converted into the right hierachy for the DJGPP source tree. It will be here: http://www.phekda.freeserve.co.uk/richdawe/djgpp/snprintf/ Now, to snprintf(). Here is a summary of the salient information from the thread, from a mix of public and private e-mails. (I got involved in this by inquiring about Alain's implementation of snprintf() distributed with some GNU packages.) There were two main threads: * Avoiding overflows on output. * Return value when specified buffer size == 0. Introduction ------------ Alain Magloire's original mail: "I was playing with djgpp this afternoon and notice snprintf() was missing. It's not ANSI not even POSIX, I beleive. But more and more apps use it because of the security that it provides, sprintf() and vsprintf() provide no buffer checks. I think it's define in Unix98 and most modern OS will have it. Well at least the 3 sitting in front of me (Solaris, Linux, QNX/NTO)." (Rich: I just checked the Unix98 spec and it is there, on the page for fprintf(). You can get the Unix98 spec for free from the Open Group - http://www.opengroup.org/ or http://www.unix.org (I think). C99 has it and I imagine the next Unix200x/POSIX specification will have it.) DJ Delorie expressed concern that the prototypes in stdio.h could conflict with packages that provide their own *snprintf functions. He gave an example: "For example, we don't prototype xmalloc() even though we provide it, because the prototypes used by actual software vary widely." Nate Eldredge voiced his support for the addition of snprintf(). Mark Elbrecht stated that they were in the C9x draft and should therefore be considered ANSI. (Rich: NB: C9x has been ratified as C99.) To DJ's concerns about packages including their own snprintf(), Eli Zaretskii said that he had not seen any packages that use these functions and asked if people could see what declarations packages used. Here is an attempt to explain how it works. Please correct any mistakes. snprintf() creates a FILE structure. The buffer for this file is set to be the buffer given to snprintf(). The space left in the buffer in the FILE is set to specified buffer size ('n') - 1, to leave space for a nul. The FILE flags were set to text mode and _IOSTRG. I believe _IOSTRG stands for I/O from a string, e.g. do all "I/O" from a string buffer, e.g. sscanf(). Anyhow, once the FILE is set up, _doprnt() can be called to do the string output. The Overflow Issue ------------------ Now, in order that snprintf() doesn't overflow the buffer, there has to be some way of terminating its output. There are two paths: * rewrite the code to account for the size of the string buffer (overflows can't happen); * modify the existing code to guard against overflows (overflows prevented when the buffer fills). Alain's original method was to modify PUTC in doprnt.c to this: #define PUTC(ch) \ do { \ if (fp->_flags & _IOSTRG && fp->_flags & _IOWRT && !fp->_cnt) \ return counter; \ (void) putc(ch); \ counter++; \ } while (0) [ PUTC currently calls putc() like this: ] [ ] [ #define PUTC(ch) (void) putc(ch, fp) ] Alain also suggested that _flsbuf() or __putc_raw() could be modified, to be aware of string buffers, to take the appropriate action. Eli replied: "I'd prefer to change __putc (on ) so that it simply returns without doing anything when fp->_cnt is non-positive. After all, __putc has no business calling _flsbuf for fake streams with _IOSTRG flag set, so we might as well say it explicitly, instead of relying on INT_MAX to never be exhausted, which just might become untrue one day." Eli also pointed out that Alain's macro was inconsistent with the function definition. Later Eli said: "Look at , and you will see that putc() actually writes a character only if the buffer is full. If you arrange for the buffer to never fill up, putc() simply accumulates the characters in the buffer of the FILE object." Eli also warned: "Look at what sprintf does, it already handles this situation. Why do we need something different with snprintf? _strbuf._flag = _IOWRT|_IOSTRG; Note that this should also set _IONTERM flag, like sprintf does, otherwise the termios hook might catch it and do the wrong thing." (Rich: This change has been made - _IONTERM is now set by snprintf().) Alain looked at putc_raw() and then commented: "I had a look at the putc_raw() and the answer was obvious. This function should never call _flsbuf() when the stream is a string: p->_flag = _IOSTR; Since flsbuf() role is to get a new buffer malloc(BUFSIZ) when none where allocated." Eli agreed that putc_raw() should never call _flsbuf() . Alain then proposed modifying __putc_raw() to do a no-op, once the buffer was full: static __inline__ int __putc_raw(int const x,FILE *const p) { if(p->_cnt>0) { p->_cnt--; return((unsigned char)(*(p->_ptr++)=(unsigned char)x)); } else if (p->_flag & _IOSTR) { /* For String stream never flush. And we also mimic the * GNU libc by keep on counting. */ p->cnt--; return 0; } return(_flsbuf(x,p)); } (Rich: It seems that between this version and the current version, the 'p->cnt--;' line for string streams has been lost.) Eli asked: "Isn't it better to return x? `putc' is documented to return its argument (although most programs never use that)." Alain said: "I actually think, it should return EOF, since no data was written to the stream. This would signal the error that you've reach the end of the stream, which is what {f,}putc() should do." Eli replied: "But we want the ``stream'' in this case to never reach its end, don't we? The current kludgey implementation makes the stream be INT_MAX characters large, and the goal of the change was to make it even larger. So returning EOF right in the beginning conflicts with that." (Some text omitted/unquoted from mail.) "An example is doprnt.c. Currently, there's no problem with returning EOF in that case, since doprnt.c simply throws away the return value of putc. But imagine that someone some day might want to look at what putc returns and e.g. return a failure indication if it returns EOF." Alain replied: "The goal of the changes was to stop overflow. With other stream type .i.e != _IOSTRG, when the limit of the buffer is reach the stream is flush and the counter is reset. By default the buffer[BUFSIZ] or what ever setvbuf() sets it to." With regard to returning EOF on overflow: "Each function snprintf, printf, sprintf, returns and a negative value when an error occured, obviously overflow is an error, IMO." (Some text omitted/unquoted from mail.) "Basically when the counter roll-over, the action of putc() is a noop and I think it is a good thing to tell the caller that whatever you writing is going to the big bucket and you should take appropriate action to reset the counter or your buffer. There is no side-effect to it." Alain said that sprintf() is actually a special case of snprintf(). Alain recommended making sprintf() & snprintf() call vsnprintf() to do all the work, removing duplication, if return codes could be agreed. Alain also recommended making FILE's _cnt flag a long and using LONG_MAX instead of INT_MAX. His example sprintf(): int sprintf(char *_s, const char *_format, ...) { va_list args; int ret; va_start(args, _format); ret = vsnprintf(_s, INT_MAX, _format, args); va_end(args); return ret; } and snprintf(): int snprintf(char *_s, size_t _n, const char *_format, ...) { va_list args; int ret; va_start(args, _format); ret = vsnprintf(_s, _n, _format, args); va_end(args); return ret; } The Return Code Issue --------------------- Eli was unsure what the behaviour should be when n was zero or negative. Alain's implementation returns EOF. Eli suggested that calling with n = 0 was a way of asking how much buffer space would be need to store the formatted output. Eli inquired what other implementations do. The results were: * Nate replied that glibc returns the number of characters, as Eli suggested. Nate also pointed out a bad cast from int to size_t (the "negative" buffer sizes). Since n is a size_t, it cannot be negative. (Rich: This has been fixed.) * Alain pointed out that Unix98's behaviour was unspecified. (Rich: The Unix98 standard says: "If the value of n is zero on a call to snprintf(), an unspecified value less than 1 is returned.") Alain agreed that the desire to follow the C9x standard was fair. Later on Eli suggested that the C9x draft did not preclude returning the buffer space required and, in fact, hinted about such a feature. Outro ----- That's pretty much it. I hope that's useful. There should be enough info there to add snprintf() to DJGPP. Bye, Rich =] -- Richard Dawe [ mailto:richdawe AT bigfoot DOT com | http://www.bigfoot.com/~richdawe/ ]