From patchwork Mon Oct 10 22:53:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nick Desaulniers
X-Patchwork-Id: 13003320
Date: Mon, 10 Oct 2022 15:53:42 -0700
X-Mailer: git-send-email 2.38.0.rc2.412.g84df46c1b4-goog
Message-ID: <20221010225342.3903590-1-ndesaulniers@google.com>
Subject: [PATCH] ARM: NWFPE: avoid compiler-generated __aeabi_uldivmod
From: Nick Desaulniers
To: Arnd Bergmann, Russell King
Cc: Tom Rix, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, llvm@lists.linux.dev, Miguel Ojeda,
 Ard Biesheuvel, Gary Guo, Craig Topper, Philip Reames, jh@jhauser.us,
 Nick Desaulniers, Nathan Chancellor

clang-15 became more aggressive about eliding loops completely when it
can deduce how a variable is being updated in the loop. Counting one
variable down in steps of another can be replaced by a modulo
operation.

For 64b variables on 32b ARM EABI targets, this can result in the
compiler generating calls to __aeabi_uldivmod, which it does for a
do-while loop in float64_rem().

For the kernel, we'd generally prefer that developers not open code 64b
division via the binary / operator and instead use the more explicit
helpers from div64.h. On arm-linux-gnueabi targets, failure to do so
can result in linkage failures due to undefined references to
__aeabi_uldivmod().

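As a minimal sketch of the transformation described above (this is not
the float64_rem() code; the function names are hypothetical and the
loop is deliberately simplified), the two functions below compute the
same value for any nonzero b. An optimizer that recognizes the loop may
emit the second form, which on a 32b ARM EABI target needs a 64b
division helper such as __aeabi_uldivmod:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical example: reduce a by repeatedly subtracting b.
 * This form needs only 64-bit compare and subtract operations.
 * b must be nonzero, otherwise the loop never terminates.
 */
uint64_t rem_by_loop(uint64_t a, uint64_t b)
{
	assert(b != 0);
	while (a >= b)
		a -= b;
	return a;
}

/*
 * The closed form of the loop above. It returns the same result but
 * requires a 64-bit unsigned division/modulo, which 32-bit ARM has no
 * single instruction for.
 */
uint64_t rem_by_mod(uint64_t a, uint64_t b)
{
	assert(b != 0);
	return a % b;
}
```

For example, rem_by_loop(17, 5) and rem_by_mod(17, 5) both return 2.
Which form is faster depends on how many iterations the loop would
have run.
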
While developers can avoid open coding divisions on 64b variables, the
compiler doesn't know that the Linux kernel has a partial implementation
of a compiler runtime (--rtlib) to enforce this convention.

It's also undecidable for the compiler whether the code in question
would be faster to execute the loop vs elide it and do the 64b division.

While I actively avoid using the internal -mllvm command line flags, I
think we get better code than using barrier() here, which will force
reloads+spills in the loop for all toolchains.

Link: https://github.com/ClangBuiltLinux/linux/issues/1666
Reported-by: Nathan Chancellor
Signed-off-by: Nick Desaulniers
Reviewed-by: Arnd Bergmann
Tested-by: Nathan Chancellor
---
 arch/arm/nwfpe/Makefile | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/nwfpe/Makefile b/arch/arm/nwfpe/Makefile
index 303400fa2cdf..2aec85ab1e8b 100644
--- a/arch/arm/nwfpe/Makefile
+++ b/arch/arm/nwfpe/Makefile
@@ -11,3 +11,9 @@ nwfpe-y += fpa11.o fpa11_cpdo.o fpa11_cpdt.o \
     entry.o
 
 nwfpe-$(CONFIG_FPE_NWFPE_XP) += extended_cpdo.o
+
+# Try really hard to avoid generating calls to __aeabi_uldivmod() from
+# float64_rem() due to loop elision.
+ifdef CONFIG_CC_IS_CLANG
+CFLAGS_softfloat.o += -mllvm -replexitval=never
+endif