From patchwork Sat Sep 28 17:32:57 2013
X-Patchwork-Submitter: Roy Franz
X-Patchwork-Id: 2958171
From: Roy Franz
To: linux-kernel@vger.kernel.org, linux-efi@vger.kernel.org,
 matt.fleming@intel.com, linux-arm-kernel@lists.infradead.org,
 linux@arm.linux.org.uk
Cc: Roy Franz, hpa@zytor.com, msalter@redhat.com
Subject: [PATCH] Improve EFI stub command line conversion
Date: Sat, 28 Sep 2013 10:32:57 -0700
Message-Id: <1380389577-27039-2-git-send-email-roy.franz@linaro.org>
In-Reply-To: <1380389577-27039-1-git-send-email-roy.franz@linaro.org>
References: <1380389577-27039-1-git-send-email-roy.franz@linaro.org>
X-Mailer: git-send-email 1.7.10.4

From: "H. Peter Anvin"

Improve the conversion of the UTF-16 EFI command line to UTF-8 for
passing to the kernel.

Signed-off-by: Roy Franz
---
 arch/arm/boot/compressed/efi-stub.c    |  2 +-
 arch/x86/boot/compressed/eboot.c       |  3 +-
 drivers/firmware/efi/efi-stub-helper.c | 93 ++++++++++++++++++++++++--------
 3 files changed, 72 insertions(+), 26 deletions(-)

diff --git a/arch/arm/boot/compressed/efi-stub.c b/arch/arm/boot/compressed/efi-stub.c
index a77cc4f..f720284 100644
--- a/arch/arm/boot/compressed/efi-stub.c
+++ b/arch/arm/boot/compressed/efi-stub.c
@@ -109,7 +109,7 @@ int efi_entry(void *handle, efi_system_table_t *sys_table,
 	 * so this memory just needs to not conflict with boot protocol
 	 * requirements.
 	 */
-	cmdline_ptr = efi_convert_cmdline_to_ascii(sys_table, image,
+	cmdline_ptr = efi_convert_cmdline(sys_table, image,
 					   &cmdline_size);
 	if (!cmdline_ptr) {
 		efi_printk(sys_table, PRINTK_PREFIX"ERROR: Unable to allocate memory for command line.\n");
diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index beb07a4..a5628cd 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -488,8 +488,7 @@ struct boot_params *make_boot_params(void *handle, efi_system_table_t *_table)
 	hdr->type_of_loader = 0x21;
 
 	/* Convert unicode cmdline to ascii */
-	cmdline_ptr = efi_convert_cmdline_to_ascii(sys_table, image,
-						   &options_size);
+	cmdline_ptr = efi_convert_cmdline(sys_table, image, &options_size);
 	if (!cmdline_ptr)
 		goto fail;
 	hdr->cmd_line_ptr = (unsigned long)cmdline_ptr;
diff --git a/drivers/firmware/efi/efi-stub-helper.c b/drivers/firmware/efi/efi-stub-helper.c
index d3448a9..6602001 100644
--- a/drivers/firmware/efi/efi-stub-helper.c
+++ b/drivers/firmware/efi/efi-stub-helper.c
@@ -578,63 +578,110 @@ static efi_status_t efi_relocate_kernel(efi_system_table_t *sys_table_arg,
 }
 
 /*
- * Convert the unicode UEFI command line to ASCII to pass to kernel.
+ * Get the number of UTF-8 bytes corresponding to an UTF-16 character.
+ * This overestimates for surrogates, but that is okay.
+ */
+static int efi_utf8_bytes(u16 c)
+{
+	return 1 + (c >= 0x80) + (c >= 0x800);
+}
+
+/*
+ * Convert an UTF-16 string, not necessarily null terminated, to UTF-8.
+ */
+static u8 *efi_utf16_to_utf8(u8 *dst, const u16 *src, int n)
+{
+	unsigned int c;
+
+	while (n--) {
+		c = *src++;
+		if (n && c >= 0xd800 && c <= 0xdbff &&
+		    *src >= 0xdc00 && *src <= 0xdfff) {
+			c = 0x10000 + ((c & 0x3ff) << 10) + (*src & 0x3ff);
+			src++;
+			n--;
+		}
+		if (c >= 0xd800 && c <= 0xdfff)
+			c = 0xfffd; /* Unmatched surrogate */
+		if (c < 0x80) {
+			*dst++ = c;
+			continue;
+		}
+		if (c < 0x800) {
+			*dst++ = 0xc0 + (c >> 6);
+			goto t1;
+		}
+		if (c < 0x10000) {
+			*dst++ = 0xe0 + (c >> 12);
+			goto t2;
+		}
+		*dst++ = 0xf0 + (c >> 18);
+		*dst++ = 0x80 + ((c >> 12) & 0x3f);
+t2:
+		*dst++ = 0x80 + ((c >> 6) & 0x3f);
+t1:
+		*dst++ = 0x80 + (c & 0x3f);
+	}
+
+	return dst;
+}
+
+/*
+ * Do proper conversion from UTF-16 to UTF-8
  * Size of memory allocated return in *cmd_line_len.
  * Returns NULL on error.
  */
-static char *efi_convert_cmdline_to_ascii(efi_system_table_t *sys_table_arg,
-				      efi_loaded_image_t *image,
-				      int *cmd_line_len)
+static char *efi_convert_cmdline(efi_system_table_t *sys_table_arg,
+				 efi_loaded_image_t *image,
+				 int *cmd_line_len)
 {
-	u16 *s2;
+	const u16 *s2;
 	u8 *s1 = NULL;
 	unsigned long cmdline_addr = 0;
-	int load_options_size = image->load_options_size / 2; /* ASCII */
-	void *options = image->load_options;
-	int options_size = 0;
+	int load_options_chars = image->load_options_size / 2; /* UTF-16 chars */
+	const u16 *options = image->load_options;
+	int options_bytes = 0;  /* UTF-8 bytes */
+	int options_chars = 0;  /* UTF-16 chars */
 	efi_status_t status;
-	int i;
 	u16 zero = 0;
 
 	if (options) {
 		s2 = options;
-		while (*s2 && *s2 != '\n' && options_size < load_options_size) {
-			s2++;
-			options_size++;
+		while (*s2 && *s2 != '\n' && options_chars < load_options_chars) {
+			options_bytes += efi_utf8_bytes(*s2++);
+			options_chars++;
 		}
 	}
 
-	if (options_size == 0) {
-		/* No command line options, so return empty string*/
-		options_size = 1;
+	if (!options_chars) {
+		/* No command line options, so return empty string */
 		options = &zero;
 	}
 
-	options_size++;  /* NUL termination */
+	options_bytes++;  /* NUL termination */
+
 #ifdef CONFIG_ARM
 	/*
 	 * For ARM, allocate at a high address to avoid reserved
 	 * regions at low addresses that we don't know the specfics of
 	 * at the time we are processing the command line.
 	 */
-	status = efi_high_alloc(sys_table_arg, options_size, 0,
+	status = efi_high_alloc(sys_table_arg, options_bytes, 0,
 			    &cmdline_addr, 0xfffff000);
 #else
-	status = efi_low_alloc(sys_table_arg, options_size, 0,
+	status = efi_low_alloc(sys_table_arg, options_bytes, 0,
 			    &cmdline_addr);
 #endif
 	if (status != EFI_SUCCESS)
 		return NULL;
 
 	s1 = (u8 *)cmdline_addr;
-	s2 = (u16 *)options;
-
-	for (i = 0; i < options_size - 1; i++)
-		*s1++ = *s2++;
+	s2 = (const u16 *)options;
+	s1 = efi_utf16_to_utf8(s1, s2, options_chars);
 
 	*s1 = '\0';
 
-	*cmd_line_len = options_size;
+	*cmd_line_len = options_bytes;
 	return (char *)cmdline_addr;
 }
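
For reviewers who want to exercise the conversion outside the EFI stub, here
is a minimal userspace sketch of the same two steps: a per-code-unit size
estimate, then the UTF-16 to UTF-8 encoding with surrogate handling. It is not
the patch code. The names utf8_bytes_upper_bound() and utf16_to_utf8() are
hypothetical, the kernel u8/u16 types are replaced with <stdint.h> types, and
the goto-based fall-through is rewritten as an if/else chain, which should
produce the same bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Upper bound on the UTF-8 bytes needed for one UTF-16 code unit.
 * Each half of a surrogate pair counts as 3, so a pair is estimated
 * at 6 bytes although it encodes to 4. (Hypothetical helper name.)
 */
size_t utf8_bytes_upper_bound(uint16_t unit)
{
	return 1 + (unit >= 0x80) + (unit >= 0x800);
}

/*
 * Encode n UTF-16 code units from src as UTF-8 into dst and return a
 * pointer just past the last byte written. No terminator is written.
 * Unpaired surrogates become U+FFFD. (Hypothetical helper name.)
 */
uint8_t *utf16_to_utf8(uint8_t *dst, const uint16_t *src, size_t n)
{
	size_t i = 0;

	while (i < n) {
		uint32_t c = src[i++];

		if (c >= 0xd800 && c <= 0xdbff && i < n &&
		    src[i] >= 0xdc00 && src[i] <= 0xdfff) {
			/* High surrogate followed by low surrogate. */
			c = 0x10000 + ((c & 0x3ff) << 10) + (src[i] & 0x3ff);
			i++;
		} else if (c >= 0xd800 && c <= 0xdfff) {
			c = 0xfffd; /* Unmatched surrogate */
		}

		if (c < 0x80) {
			*dst++ = (uint8_t)c;
		} else if (c < 0x800) {
			*dst++ = (uint8_t)(0xc0 | (c >> 6));
			*dst++ = (uint8_t)(0x80 | (c & 0x3f));
		} else if (c < 0x10000) {
			*dst++ = (uint8_t)(0xe0 | (c >> 12));
			*dst++ = (uint8_t)(0x80 | ((c >> 6) & 0x3f));
			*dst++ = (uint8_t)(0x80 | (c & 0x3f));
		} else {
			*dst++ = (uint8_t)(0xf0 | (c >> 18));
			*dst++ = (uint8_t)(0x80 | ((c >> 12) & 0x3f));
			*dst++ = (uint8_t)(0x80 | ((c >> 6) & 0x3f));
			*dst++ = (uint8_t)(0x80 | (c & 0x3f));
		}
	}

	return dst;
}
```

A caller would sum utf8_bytes_upper_bound() over the input, add one byte for
the NUL, allocate that much, run utf16_to_utf8(), and store the terminator at
the returned pointer. That mirrors how the patch sizes the buffer before
converting into it.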