From patchwork Thu Jul 1 01:55:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12353359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_RED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27EB0C11F66 for ; Thu, 1 Jul 2021 01:55:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D335D61420 for ; Thu, 1 Jul 2021 01:55:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D335D61420 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 59E636B00C7; Wed, 30 Jun 2021 21:55:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 528066B00C9; Wed, 30 Jun 2021 21:55:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3C8706B00CA; Wed, 30 Jun 2021 21:55:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0150.hostedemail.com [216.40.44.150]) by kanga.kvack.org (Postfix) with ESMTP id 0E9D46B00C7 for ; Wed, 30 Jun 2021 21:55:19 -0400 (EDT) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id DA6D982E090B for ; Thu, 1 Jul 2021 01:55:18 +0000 (UTC) X-FDA: 78312351516.03.49FD5CA Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP id 847FAF00012B for ; Thu, 1 Jul 2021 01:55:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 9600D61476; Thu, 1 Jul 2021 01:55:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1625104517; bh=HtnzH4et/7uUBBU5CSmAMAU5ohVsctPE28Lz0R7p6/Y=; h=Date:From:To:Subject:In-Reply-To:From; b=qzIViynqW2gKtKx3TArcWNFi4f6ThoQqVvU+pKiSHYBjcVvpV7EJKcp03x0nhXccw mYAXIoHZKBDJhrEGWmECs+3StHjNCHYxGODSkWnHQgpXi2W27iRAzQpxBHg+k/iKvV M1LsFpG0b2MvLC4qpCw8ymspA60H+idmArideipw= Date: Wed, 30 Jun 2021 18:55:17 -0700 From: Andrew Morton To: akpm@linux-foundation.org, andriy.shevchenko@linux.intel.com, bfields@fieldses.org, chuck.lever@oracle.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org, viro@zeniv.linux.org.uk Subject: [patch 152/192] lib/string_helpers: introduce ESCAPE_NAP to escape non-ASCII and non-printable Message-ID: <20210701015517.qp0JG4ivY%akpm@linux-foundation.org> In-Reply-To: <20210630184624.9ca1937310b0dd5ce66b30e7@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 847FAF00012B X-Stat-Signature: ieopcx4usopjxk66siqee9bkxkz1nq7x Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=qzIViynq; dmarc=none; spf=pass (imf17.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-HE-Tag: 1625104518-261595 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Andy Shevchenko Subject: lib/string_helpers: introduce ESCAPE_NAP to escape non-ASCII and non-printable Some users may want to have an ASCII based filter for printable only characters, provided by conjunction of isascii() and isprint() functions. Here is the addition of a such. Link: https://lkml.kernel.org/r/20210504180819.73127-6-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko Cc: Alexander Viro Cc: Chuck Lever Cc: "J. Bruce Fields" Signed-off-by: Andrew Morton --- include/linux/string_helpers.h | 1 + lib/string_helpers.c | 20 ++++++++++++++++---- 2 files changed, 17 insertions(+), 4 deletions(-) --- a/include/linux/string_helpers.h~lib-string_helpers-introduce-escape_nap-to-escape-non-ascii-and-non-printable +++ a/include/linux/string_helpers.h @@ -53,6 +53,7 @@ static inline int string_unescape_any_in #define ESCAPE_ANY_NP (ESCAPE_ANY | ESCAPE_NP) #define ESCAPE_HEX BIT(5) #define ESCAPE_NA BIT(6) +#define ESCAPE_NAP BIT(7) int string_escape_mem(const char *src, size_t isz, char *dst, size_t osz, unsigned int flags, const char *only); --- a/lib/string_helpers.c~lib-string_helpers-introduce-escape_nap-to-escape-non-ascii-and-non-printable +++ a/lib/string_helpers.c @@ -454,9 +454,11 @@ static bool escape_hex(unsigned char c, * * 1. The character is not matched to the one from @only string and thus * must go as-is to the output. - * 2. The character is matched to the printable or ASCII class, if asked, + * 2. The character is matched to the printable and ASCII classes, if asked, * and in case of match it passes through to the output. - * 3. The character is checked if it falls into the class given by @flags. + * 3. The character is matched to the printable or ASCII class, if asked, + * and in case of match it passes through to the output. + * 4. The character is checked if it falls into the class given by @flags. * %ESCAPE_OCTAL and %ESCAPE_HEX are going last since they cover any * character. Note that they actually can't go together, otherwise * %ESCAPE_HEX will be ignored. @@ -489,11 +491,15 @@ static bool escape_hex(unsigned char c, * '\xHH' - byte with hexadecimal value HH (2 digits) * %ESCAPE_NA: * escape only non-ascii characters, checked by isascii() + * %ESCAPE_NAP: + * escape only non-printable or non-ascii characters * - * One notable caveat, the %ESCAPE_NP and %ESCAPE_NA have higher priority - * than the rest of the flags (%ESCAPE_NP is higher than %ESCAPE_NA). + * One notable caveat, the %ESCAPE_NAP, %ESCAPE_NP and %ESCAPE_NA have the + * higher priority than the rest of the flags (%ESCAPE_NAP is the highest). * It doesn't make much sense to use either of them without %ESCAPE_OCTAL * or %ESCAPE_HEX, because they cover most of the other character classes. + * %ESCAPE_NAP can utilize %ESCAPE_SPACE or %ESCAPE_SPECIAL in addition to + * the above. * * Return: * The total size of the escaped output that would be generated for @@ -515,6 +521,8 @@ int string_escape_mem(const char *src, s * Apply rules in the following sequence: * - the @only string is supplied and does not contain a * character under question + * - the character is printable and ASCII, when @flags has + * %ESCAPE_NAP bit set * - the character is printable, when @flags has * %ESCAPE_NP bit set * - the character is ASCII, when @flags has @@ -528,6 +536,10 @@ int string_escape_mem(const char *src, s escape_passthrough(c, &p, end)) continue; + if (isascii(c) && isprint(c) && + flags & ESCAPE_NAP && escape_passthrough(c, &p, end)) + continue; + if (isprint(c) && flags & ESCAPE_NP && escape_passthrough(c, &p, end)) continue;