From patchwork Thu Jun 27 12:55:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 92959
From: Florian Weimer
To: libc-alpha@sourceware.org
Cc: Reviewed-by@sourceware.org:Paul Eggert
Subject: [PATCH] manual: Document that strnlen, wcsnlen do not need arrays of full size
Date: Thu, 27 Jun 2024 14:55:50 +0200
Message-ID: <878qyqmtnt.fsf@oldenburg.str.redhat.com>

The standards are at best ambiguous as to whether it is valid to call
these functions with array arguments that are shorter than the given
number of elements. Supporting such usage is very useful; see
stdio-common/Xprintf_buffer_puts_1.c for an example. There is no other
string function that provides this functionality.

Also update the description of strndup accordingly because it uses
strnlen internally.

Tested on aarch64-linux-gnu (Neoverse-V2), i686-linux-gnu (Zen 4),
powerpc64le-linux-gnu (POWER10), s390x-linux-gnu (z16),
x86_64-linux-gnu (Zen 4).

---
 manual/string.texi        |  31 ++++++-----
 string/Makefile           |   1 +
 string/test-Xnlen-gnu.c   | 133 ++++++++++++++++++++++++++++++++++++++++++++++
 string/test-strnlen-gnu.c |   4 ++
 wcsmbs/Makefile           |   1 +
 wcsmbs/tst-wcsnlen-gnu.c  |   5 ++
 6 files changed, 163 insertions(+), 12 deletions(-)

base-commit: 21738846a19eb4a36981efd37d9ee7cb6d687494

diff --git a/manual/string.texi b/manual/string.texi
index 0b667bd3fb..0bf3c45d0e 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -311,14 +311,16 @@ This function was introduced in @w{Amendment 1} to @w{ISO C90}.
 @deftypefun size_t strnlen (const char *@var{s}, size_t @var{maxlen})
 @standards{POSIX.1, string.h}
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This returns the offset of the first null byte in the array @var{s},
-except that it returns @var{maxlen} if the first @var{maxlen} bytes
-are all non-null.
-Therefore this function is equivalent to
-@code{(strlen (@var{s}) < @var{maxlen} ? strlen (@var{s}) : @var{maxlen})}
-but it
-is more efficient and works even if @var{s} is not null-terminated so
-long as @var{maxlen} does not exceed the size of @var{s}'s array.
+This function returns the smallest non-negative integer @var{i} less
+than @var{maxlen} such that @code{@var{s}[@var{i}]} is zero, or
+@var{maxlen} if no such integer exists. The array object into which
+@var{s} points must contain a null character at @var{s} or subsequently,
+or there must be at least @var{maxlen} array elements starting at
+@var{s}.
+
+Therefore this function is equivalent to @code{(strlen (@var{s}) <
+@var{maxlen} ? strlen (@var{s}) : @var{maxlen})} but it is more
+efficient and works even if @var{s} is not null-terminated.
 
 @smallexample
 char string[32] = "hello, world";
@@ -330,7 +332,8 @@ strnlen (string, 5)
 
 This function is part of POSIX.1-2008 and later editions, but was
 available in @theglibc{} and other systems as an extension long before
-it was standardized. It is declared in @file{string.h}.
+it was standardized. It is declared in @file{string.h}. Support for
+input arrays shorter than @var{maxlen} characters is a GNU extension.
 @end deftypefun
 
 @deftypefun size_t wcsnlen (const wchar_t *@var{ws}, size_t @var{maxlen})
@@ -339,8 +342,9 @@ it was standardized. It is declared in @file{string.h}.
 @code{wcsnlen} is the wide character equivalent to @code{strnlen}. The
 @var{maxlen} parameter specifies the maximum number of wide characters.
 
-This function is part of POSIX.1-2008 and later editions, and is
-declared in @file{wchar.h}.
+This function is part of POSIX and is declared in @file{wchar.h}.
+Support for input arrays shorter than @var{maxlen} wide characters is a
+GNU extension.
 @end deftypefun
 
 @node Copying Strings and Arrays
@@ -922,7 +926,10 @@ processing strings.
 @standards{GNU, string.h}
 @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
 This function is similar to @code{strdup} but always copies at most
-@var{size} bytes into the newly allocated string.
+@var{size} bytes into the newly allocated string. The array object into
+which @var{s} points must contain a null character at @var{s} or
+subsequently, or there must be at least @var{size} array elements
+starting at @var{s}.
 
 If the length of @var{s} is more than @var{size}, then @code{strndup}
 copies just the first @var{size} bytes and adds a closing null byte.
diff --git a/string/Makefile b/string/Makefile
index 8f31fa49e6..6cc629968a 100644
--- a/string/Makefile
+++ b/string/Makefile
@@ -184,6 +184,7 @@ tests := \
   test-strncpy \
   test-strndup \
   test-strnlen \
+  test-strnlen-gnu \
   test-strpbrk \
   test-strrchr \
   test-strspn \
diff --git a/string/test-Xnlen-gnu.c b/string/test-Xnlen-gnu.c
new file mode 100644
index 0000000000..ab40a2e687
--- /dev/null
+++ b/string/test-Xnlen-gnu.c
@@ -0,0 +1,133 @@
+/* Test GNU extension for non-array inputs to string length functions.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+/* This skeleton file is included from string/test-strnlen-gnu.c and
+   wcsmbs/tst-wcsnlen-gnu.c to test that reading of the array stops at
+   the first null character.
+
+   TEST_IDENTIFIER must be the test function identifier. TEST_NAME is
+   the same as a string.
+
+   CHAR must be defined as the character type. */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+typedef __typeof (TEST_IDENTIFIER) *proto_t;
+
+#define TEST_MAIN
+#include "test-string.h"
+
+IMPL (TEST_IDENTIFIER, 1)
+
+static int
+test_main (void)
+{
+  enum { buffer_length = 256 };
+  TEST_VERIFY_EXIT (sysconf (_SC_PAGESIZE) >= buffer_length);
+
+  test_init ();
+
+  /* Buffer layout: There are a_count 'A' characters followed by
+     zero_count null characters, for a total of buffer_length
+     characters:
+
+     AAAAA...AAAAA 00000 ... 00000   (unmapped page follows)
+     \           / \             /
+       (a_count)     (zero_count)
+     \______ (buffer_length) ____/
+          ^
+          |
+          start_offset
+
+     The buffer length does not change, but a_count (and thus zero_count)
+     and start_offset vary.
+
+     If start_offset == buffer_length, only 0 is a valid length
+     argument. The result is 0.
+
+     Otherwise, if zero_count > 0 (if there are null characters in the
+     buffer), then any length argument is valid. If start_offset <
+     a_count (i.e., there is a non-null character at start_offset), the
+     result is the minimum of a_count - start_offset and the length
+     argument. Otherwise the result is 0.
+
+     Otherwise, there are no null characters before the unmapped page.
+     The length argument must not be greater than buffer_length -
+     start_offset, and the result is the length argument. */
+
+  struct support_next_to_fault ntf
+    = support_next_to_fault_allocate (buffer_length * sizeof (CHAR));
+  CHAR *buffer = (CHAR *) ntf.buffer;
+
+  FOR_EACH_IMPL (impl, 0)
+    {
+      printf ("info: testing %s\n", impl->name);
+      for (size_t i = 0; i < buffer_length; ++i)
+        buffer[i] = 'A';
+
+      for (int zero_count = 0; zero_count <= buffer_length; ++zero_count)
+        {
+          if (zero_count > 0)
+            buffer[buffer_length - zero_count] = 0;
+          int a_count = buffer_length - zero_count;
+          for (int start_offset = 0; start_offset <= buffer_length;
+               ++start_offset)
+            {
+              CHAR *start_pointer = buffer + start_offset;
+              if (start_offset == buffer_length)
+                TEST_COMPARE (CALL (impl, buffer + start_offset, 0), 0);
+              else if (zero_count > 0)
+                for (int length_argument = 0;
+                     length_argument <= 2 * buffer_length;
+                     ++length_argument)
+                  {
+                    if (test_verbose)
+                      printf ("zero_count=%d a_count=%d start_offset=%d"
+                              " length_argument=%d\n",
+                              zero_count, a_count, start_offset,
+                              length_argument);
+                    if (start_offset < a_count)
+                      TEST_COMPARE (CALL (impl, start_pointer, length_argument),
+                                    MIN (a_count - start_offset,
+                                         length_argument));
+                    else
+                      TEST_COMPARE (CALL (impl, start_pointer, length_argument),
+                                    0);
+                  }
+              else
+                for (int length_argument = 0;
+                     length_argument <= buffer_length - start_offset;
+                     ++length_argument)
+                  TEST_COMPARE (CALL (impl, start_pointer, length_argument),
+                                length_argument);
+            }
+        }
+    }
+
+  support_next_to_fault_free (&ntf);
+
+  return 0;
+}
+
+#include
diff --git a/string/test-strnlen-gnu.c b/string/test-strnlen-gnu.c
new file mode 100644
index 0000000000..ffb3f534d0
--- /dev/null
+++ b/string/test-strnlen-gnu.c
@@ -0,0 +1,4 @@
+#define TEST_IDENTIFIER strnlen
+#define TEST_NAME "strnlen"
+typedef char CHAR;
+#include "test-Xnlen-gnu.c"
diff --git a/wcsmbs/Makefile b/wcsmbs/Makefile
index 1cddd8cc6d..1c65cd759b 100644
--- a/wcsmbs/Makefile
+++ b/wcsmbs/Makefile
@@ -184,6 +184,7 @@ tests := \
   tst-wcslcpy \
   tst-wcslcpy2 \
   tst-wcsnlen \
+  tst-wcsnlen-gnu \
   tst-wcstod-nan-locale \
   tst-wcstod-nan-sign \
   tst-wcstod-round \
diff --git a/wcsmbs/tst-wcsnlen-gnu.c b/wcsmbs/tst-wcsnlen-gnu.c
new file mode 100644
index 0000000000..68a17dd1f8
--- /dev/null
+++ b/wcsmbs/tst-wcsnlen-gnu.c
@@ -0,0 +1,5 @@
+#include
+#define TEST_IDENTIFIER wcsnlen
+#define TEST_NAME "wcsnlen"
+typedef wchar_t CHAR;
+#include "../string/test-Xnlen-gnu.c"
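
Illustration only, not part of the patch: a minimal sketch of the
semantics that the revised manual text describes for strnlen, namely
"the smallest i less than maxlen such that s[i] is zero, or maxlen if
no such integer exists". The names ref_strnlen, short_array_length,
agrees_with_library and check_examples are hypothetical helpers made up
for this note; they do not exist in glibc. The sketch deliberately
avoids calling the library strnlen with a bound larger than the array,
since only the documented GNU behaviour permits that and some compilers
may warn about it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference implementation of the documented contract:
   stop at the first null character or after MAXLEN elements, whichever
   comes first.  It never reads past the first null character.  */
static size_t
ref_strnlen (const char *s, size_t maxlen)
{
  size_t i = 0;
  while (i < maxlen && s[i] != '\0')
    ++i;
  return i;
}

/* A 4-element array queried with a much larger bound.  This is valid
   for ref_strnlen because a null character is found at index 2.  */
static size_t
short_array_length (void)
{
  static const char small[4] = "hi";	/* 'h', 'i', 0, 0 */
  return ref_strnlen (small, 100);
}

/* For bounds that stay within the array, the sketch is expected to
   agree with the library function.  Returns 1 on agreement.  */
static int
agrees_with_library (void)
{
  static const char text[32] = "hello, world";
  for (size_t n = 0; n <= sizeof (text); ++n)
    if (ref_strnlen (text, n) != strnlen (text, n))
      return 0;
  return 1;
}

/* Collects the checks above in one place.  */
static void
check_examples (void)
{
  assert (short_array_length () == 2);
  assert (ref_strnlen ("hello, world", 5) == 5);
  assert (agrees_with_library ());
}
```

Calling check_examples () from a small main function is enough to run
the sketch. The short_array_length case mirrors the situation the
manual change is about: the bound (100) exceeds the array size (4), yet
the read stops at the null character at index 2.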