
[1/3] Replace i2f() in r600_blit.c with an optimized version.

Message ID CAOqL70=5QYwPSomUt6Q_o7=Rqi0o+bz5SC9NBXTqBMGUb9UbSA@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Steven Fuerst Aug. 6, 2012, 6:14 p.m. UTC
Replace i2f() in r600_blit.c with an optimized version.

We use __fls() to find the most significant bit.  Using that, the
loop can be avoided.  A second trick is to use the mod(32)
behaviour of the rotate instructions on x86 to expand the range
of the unsigned int to float conversion to the full 32 bits.

The routine is now exact up to 2^24.  Above that, we truncate, which
is equivalent to rounding towards zero.

Signed-off-by: Steven Fuerst <svfuerst@gmail.com>
---
 drivers/gpu/drm/radeon/r600_blit.c |   52 +++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 22 deletions(-)
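
The rotate trick from the commit message can be illustrated in userspace. This is only a sketch under stated assumptions: `ror32_sketch()` and `align_to_bit23()` are hypothetical names, not kernel functions, and the rotate count is masked explicitly so that the code does not rely on the x86 mod(32) behaviour the patch depends on.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable rotate right (hypothetical stand-in for the kernel's ror32()).
 * The count is reduced mod 32 explicitly, which is what the x86 ROR
 * instruction does in hardware.
 */
static uint32_t ror32_sketch(uint32_t word, uint32_t shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/*
 * Hypothetical helper: move bit number `msb` of x to bit 23.
 * When msb < 23 the unsigned subtraction wraps around, and rotating right
 * by the wrapped count mod 32 is the same as rotating left by 23 - msb.
 */
static uint32_t align_to_bit23(uint32_t x, uint32_t msb)
{
	return ror32_sketch(x, msb - 23);
}

/* Small self-check covering the leftward, rightward and zero-shift cases. */
static void rotate_self_check(void)
{
	assert(align_to_bit23(0x00000001u, 0) == 0x00800000u);  /* left by 23 */
	assert(align_to_bit23(0x80000000u, 31) == 0x00800000u); /* right by 8 */
	assert(align_to_bit23(0x00800000u, 23) == 0x00800000u); /* no shift */
}
```

Calling `rotate_self_check()` from a test program exercises all three cases. The single rotate expression replaces the branch on whether the value is above or below 2^24.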

Comments

Paul Menzel Aug. 6, 2012, 9:42 p.m. UTC | #1
Dear Steven,


thank you for your patches.

On Monday, 06.08.2012, at 11:14 -0700, Steven Fuerst wrote:
> Replace i2f() in r600_blit.c with an optimized version.
> 
> We use __fls() to find the most significant bit.  Using that, the
> loop can be avoided.  A second trick is to use the mod(32)
> behaviour of the rotate instructions on x86 to expand the range
> of the unsigned int to float conversion to the full 32 bits.
> 
> The routine is now exact up to 2^24.  Above that, we truncate which
> is equivalent to rounding towards zero.
> 
> Signed-off-by: Steven Fuerst <svfuerst@gmail.com>
> ---
>  drivers/gpu/drm/radeon/r600_blit.c |   52
> +++++++++++++++++++++---------------

Unfortunately, your message was not sent as plain text only [1], and the
Google Mail mailer automatically wrapped the lines and thereby mangled the
patch.

Please use a “good” mail client or just `git send-email`. I think the manual
page has a section explaining how to set it up with Google Mail.
It would be great if you could resend your patches as `[PATCH][RESENT]`
(option `--subject-prefix=…` of `git format-patch`), since there is no
functional change.

[…]


Thanks,

Paul


[1] http://en.opensuse.org/openSUSE:Mailing_list_netiquette

Steven Fuerst Aug. 6, 2012, 11:07 p.m. UTC | #2
> Unfortunately you sent your message not just as plain text [1] and the
> Google Mail mailer automatically wrapped the lines and there mangled the
> patch.

Oops, sorry about that.  New patches incoming via git-send-email.  No
changes, other than that the unwanted line-wrapping should now be avoided.

Steven Fuerst

Patch

diff --git a/drivers/gpu/drm/radeon/r600_blit.c b/drivers/gpu/drm/radeon/r600_blit.c
index 3c031a4..f0ce441 100644
--- a/drivers/gpu/drm/radeon/r600_blit.c
+++ b/drivers/gpu/drm/radeon/r600_blit.c
@@ -489,29 +489,37 @@  set_default_state(drm_radeon_private_t *dev_priv)
  ADVANCE_RING();
 }

-static uint32_t i2f(uint32_t input)
+/* 23 bits of float fractional data */
+#define I2F_FRAC_BITS 23
+#define I2F_MASK ((1 << I2F_FRAC_BITS) - 1)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Will be exact from 0 to 2^24.  Above that, we round towards zero
+ * as the fractional bits will not fit in a float.  (It would be better to
+ * round towards even as the fpu does, but that is slower.)
+ * This routine depends on the mod(32) behaviour of the rotate instructions
+ * on x86.
+ */
+static uint32_t i2f(uint32_t x)
 {
- u32 result, i, exponent, fraction;
-
- if ((input & 0x3fff) == 0)
- result = 0; /* 0 is a special case */
- else {
- exponent = 140; /* exponent biased by 127; */
- fraction = (input & 0x3fff) << 10; /* cheat and only
-      handle numbers below 2^^15 */
- for (i = 0; i < 14; i++) {
- if (fraction & 0x800000)
- break;
- else {
- fraction = fraction << 1; /* keep
-     shifting left until top bit = 1 */
- exponent = exponent - 1;
- }
- }
- result = exponent << 23 | (fraction & 0x7fffff); /* mask
-    off top bit; assumed 1 */
- }
- return result;
+ uint32_t msb, exponent, fraction;
+
+ /* Zero is special */
+ if (!x) return 0;
+
+ /* Get location of the most significant bit */
+ msb = __fls(x);
+
+ /*
+ * Use a rotate instead of a shift because that works both leftwards
+ * and rightwards due to the mod(32) behaviour.  This means we don't
+ * need to check to see if we are above 2^24 or not.
+ */
+ fraction = ror32(x, msb - I2F_FRAC_BITS) & I2F_MASK;
+ exponent = (127 + msb) << I2F_FRAC_BITS;
+
+ return fraction + exponent;
 }
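
To make the rounding behaviour described in the patch comment concrete, here is a minimal userspace sketch of the new routine, compared against the compiler's own integer-to-float conversion. Several things here are assumptions rather than part of the patch: `i2f_sketch()`, `ror32_sketch()` and `fpu_bits()` are hypothetical names, `__builtin_clz()` (a GCC/Clang builtin) stands in for the kernel's `__fls()`, the rotate masks its count explicitly instead of relying on x86, and `float` is assumed to be IEEE 754 binary32 with the default round-to-nearest-even mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define I2F_FRAC_BITS 23
#define I2F_MASK ((1u << I2F_FRAC_BITS) - 1)

/* Portable rotate right; the count is reduced mod 32 explicitly. */
static uint32_t ror32_sketch(uint32_t word, uint32_t shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* Userspace sketch of the patched i2f(): truncates above 2^24. */
static uint32_t i2f_sketch(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	if (!x)
		return 0;

	/* Assumption: __builtin_clz() replaces the kernel's __fls(). */
	msb = 31u - (uint32_t)__builtin_clz(x);

	fraction = ror32_sketch(x, msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Bit pattern of (float)x as produced by the compiler/FPU. */
static uint32_t fpu_bits(uint32_t x)
{
	float f = (float)x;
	uint32_t bits;

	memcpy(&bits, &f, sizeof(bits));
	return bits;
}

static void i2f_self_check(void)
{
	uint32_t x;

	/* Exact range: both conversions agree bit for bit. */
	for (x = 0; x <= (1u << 16); x++)
		assert(i2f_sketch(x) == fpu_bits(x));
	assert(i2f_sketch(1u << 24) == fpu_bits(1u << 24));

	/* Above 2^24 the sketch truncates; the FPU rounds to nearest even. */
	assert(i2f_sketch(0xffffffffu) == 0x4f7fffffu);
	assert(fpu_bits(0xffffffffu) == 0x4f800000u);
}
```

Calling `i2f_self_check()` from a test program confirms that the two conversions agree in the exact range and differ for `0xffffffff`, where truncation yields the largest float below 2^32 while the FPU rounds up to 2^32.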