From patchwork Mon Aug 6 23:11:08 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Steven Fuerst
X-Patchwork-Id: 1283811
From: Steven Fuerst
To: dri-devel@lists.freedesktop.org
Cc: Steven Fuerst
Subject: [[PATCH][RESENT] 1/3] Replace i2f() in r600_blit.c with an optimized version.
Date: Mon, 6 Aug 2012 16:11:08 -0700
Message-Id: <1344294671-13905-1-git-send-email-svfuerst@gmail.com>
X-Mailer: git-send-email 1.7.10.4

We use __fls() to find the most significant bit.  Using that, the
loop can be avoided.  A second trick is to use the mod(32) behaviour of
the rotate instructions on x86 to expand the range of the unsigned int
to float conversion to the full 32 bits.

The routine is now exact up to 2^24.  Above that, we truncate, which is
equivalent to rounding towards zero.

Signed-off-by: Steven Fuerst
---
 drivers/gpu/drm/radeon/r600_blit.c |   52 +++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit.c b/drivers/gpu/drm/radeon/r600_blit.c
index 3c031a4..f0ce441 100644
--- a/drivers/gpu/drm/radeon/r600_blit.c
+++ b/drivers/gpu/drm/radeon/r600_blit.c
@@ -489,29 +489,37 @@ set_default_state(drm_radeon_private_t *dev_priv)
 	ADVANCE_RING();
 }
 
-static uint32_t i2f(uint32_t input)
+/* 23 bits of float fractional data */
+#define I2F_FRAC_BITS	23
+#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Will be exact from 0 to 2^24.  Above that, we round towards zero
+ * as the fractional bits will not fit in a float.  (It would be better to
+ * round towards even as the fpu does, but that is slower.)
+ * This routine depends on the mod(32) behaviour of the rotate instructions
+ * on x86.
+ */
+static uint32_t i2f(uint32_t x)
 {
-	u32 result, i, exponent, fraction;
-
-	if ((input & 0x3fff) == 0)
-		result = 0; /* 0 is a special case */
-	else {
-		exponent = 140; /* exponent biased by 127; */
-		fraction = (input & 0x3fff) << 10; /* cheat and only
-			handle numbers below 2^^15 */
-		for (i = 0; i < 14; i++) {
-			if (fraction & 0x800000)
-				break;
-			else {
-				fraction = fraction << 1; /* keep
-					shifting left until top bit = 1 */
-				exponent = exponent - 1;
-			}
-		}
-		result = exponent << 23 | (fraction & 0x7fffff); /* mask
-			off top bit; assumed 1 */
-	}
-	return result;
+	uint32_t msb, exponent, fraction;
+
+	/* Zero is special */
+	if (!x) return 0;
+
+	/* Get location of the most significant bit */
+	msb = __fls(x);
+
+	/*
+	 * Use a rotate instead of a shift because that works both leftwards
+	 * and rightwards due to the mod(32) behaviour.  This means we don't
+	 * need to check to see if we are above 2^24 or not.
+	 */
+	fraction = ror32(x, msb - I2F_FRAC_BITS) & I2F_MASK;
+	exponent = (127 + msb) << I2F_FRAC_BITS;
+
+	return fraction + exponent;
 }
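
For anyone who wants to check the rotate trick outside the kernel, here is a
minimal userspace sketch of the same conversion. It is not part of the patch
and rests on a few assumptions: GCC or Clang is used, so __builtin_clz()
stands in for __fls(); and the rotate count is masked explicitly to get the
mod(32) behaviour in portable C rather than relying on the x86 instruction.
The names ror32_mod(), i2f_sketch() and bits_to_float() are hypothetical
helpers made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 23 bits of float fractional data, as in the patch */
#define I2F_FRAC_BITS 23
#define I2F_MASK ((1u << I2F_FRAC_BITS) - 1)

/*
 * Hypothetical helper: rotate right with the count reduced mod 32, so a
 * negative count rotates left.  The masking is an assumption added here
 * to keep the C well defined on any architecture.
 */
static uint32_t ror32_mod(uint32_t word, int shift)
{
	unsigned int s = (unsigned int)shift & 31;

	return (word >> s) | (word << ((32 - s) & 31));
}

/* Same steps as i2f() in the patch, with __builtin_clz() replacing __fls(). */
static uint32_t i2f_sketch(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	/* Zero is special: __builtin_clz(0) is undefined */
	if (!x)
		return 0;

	/* Index of the most significant set bit, 0..31 */
	msb = 31 - (uint32_t)__builtin_clz(x);

	/* Move the msb to bit 23, then mask off it and any wrapped bits */
	fraction = ror32_mod(x, (int)msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Hypothetical helper: reinterpret the IEEE bit pattern as a float. */
static float bits_to_float(uint32_t bits)
{
	float f;

	assert(sizeof(f) == sizeof(bits));
	memcpy(&f, &bits, sizeof(f));
	return f;
}
```

With this sketch, bits_to_float(i2f_sketch(640)) compares equal to 640.0f,
while i2f_sketch(0xffffffff) gives 0x4f7fffff: the low eight bits are
truncated, where a hardware conversion would round up to 2^32.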