diff mbox

i915 driver fails on i686 userspace + x86_64 kernel

Message ID 20131015174552.3c68055b@IRBT4585 (mailing list archive)
State New, archived
Headers show

Commit Message

Pavel Roskin Oct. 15, 2013, 9:45 p.m. UTC
On Tue, 15 Oct 2013 21:59:08 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> Yikes, that implies we have a size mismatch with the kernel - ideally
> we construct the struct to have the same size when compiled with 32
> or 64 bits.
> 
> Please try:
> 
> commit a63b4d5a0766a7e98efeff8dd520c58e9a1bea88
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Oct 15 21:53:16 2013 +0100
> 
>     sna: Expand packed KMS structs for 64-bit alignment

It helps.  Xorg starts.

However, I feel uneasy about that patch.  I tried to debug the problem,
and it looked not as a data corruption, but as something opposite.  The
kernel writes correct data to the userspace, but the userspace gets the
old value.

My theory is that the compat code incorrectly restores some data on the
stack that should not be restored.  It would happen to the value that
is the closest to the stack pointer.  gcc orders variables on the stack
on its own, so additional arrays would only protect the enc structure
only if gcc would place them below enc.

The attached patch makes ioctl calls with enc and the 4 members of the
encoderid array.  encoderid[0] keeps the original value of 0x12345678.
Other calls return the right value.  The kernel can be changed to show
the pointer address and read back the value with get_user.  The value
and the pointer address are correct.

I tried Linux 3.8.12 and it made no difference.  I also recompiled the
kernel and xf86-video-intel with gcc 4.4.7 (the Ubuntu default is
4.6.3), and it also made no difference.

It's hard to believe that there is a bug in the stack saving code that
went unnoticed for so long, but I cannot think of anything else that
would explain the results I'm observing.

If the compiler gets the size of a certain structure wrong, why would
the compat call restore any data on stack on return?

I have an impression that your patch complicates things too much
without addressing the actual problem.

Comments

Pavel Roskin Oct. 16, 2013, 12:08 a.m. UTC | #1
On Wed, 16 Oct 2013 00:39:34 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> > However, I feel uneasy about that patch.  I tried to debug the
> > problem, and it looked not as a data corruption, but as something
> > opposite.  The kernel writes correct data to the userspace, but the
> > userspace gets the old value.

I added code to print addresses of structures and found that when X
fails, enc starts exactly at the address where conn ends.  So it's not
like enc is too close to the stack pointer.

But for a 64-bit compiler that compiled the kernel, conn ends 4 bytes
later because it's 8-byte aligned.  So enc.encoder_id sits in the same
place as the unused bytes of conn.  That's risky.  The compiler can
assume that conn is only changed by accessing its elements and not by
accessing any other structure or variable.

I see that the kernel is compiled with -fno-strict-aliasing, so it must
be something other that aliasing.  Also, the kernel code doesn't know
where it would write the encoders, so I don't see how the compiler
could optimize it out.  It must be some runtime optimization.

Even though I don't understand the mechanism, your patch doesn't seem
wrong anymore.  Maybe it could be simplified by using the aligned
attribute or by other means.  On the positive side, it's a change
entirely in userspace.

> For other structures we have taken great care to make sure that
> they are padded to 64-bit boundaries for exactly this abi
> incompatibility for 32-bit userspaces. To be strict we should do an
> ioc32 wrapper in the kernel.

I see.  After all, the kernel fails to write the encoders at the
requested address, so perhaps it could be improved as well.

Thank you for the fix!
diff mbox

Patch

Protect array

From: Pavel Roskin <proski@gnu.org>


---
 src/sna/sna_display.c |   24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index 28151ab..27e429c 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -2588,7 +2588,8 @@  sna_output_init(ScrnInfoPtr scrn, struct sna_mode *mode, int num)
 	struct sna *sna = to_sna(scrn);
 	xf86OutputPtr output;
 	struct drm_mode_get_connector conn;
-	struct drm_mode_get_encoder enc;
+	int encoderid[4];
+	static struct drm_mode_get_encoder enc;
 	struct drm_mode_modeinfo dummy;
 	struct sna_output *sna_output;
 	const char *output_name;
@@ -2601,17 +2602,38 @@  sna_output_init(ScrnInfoPtr scrn, struct sna_mode *mode, int num)
 	VG_CLEAR(conn);
 	VG_CLEAR(enc);
 
+	DBG(("%s: &encoderid=%p\n", __FUNCTION__, &encoderid));
+	DBG(("%s: &enc.encoder_id=%p\n", __FUNCTION__, &enc.encoder_id));
+
 	conn.connector_id = mode->kmode->connectors[num];
 	conn.count_props = 0;
 	conn.count_modes = 1; /* skip detect */
 	conn.modes_ptr = (uintptr_t)&dummy;
 	conn.count_encoders = 1;
 	conn.encoders_ptr = (uintptr_t)&enc.encoder_id;
+	enc.encoder_id = 0x12345678;
 
 	if (drmIoctl(sna->kgem.fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn)) {
 		DBG(("%s: GETCONNECTOR failed, ret=%d\n", __FUNCTION__, errno));
 		return false;
 	}
+	DBG(("%s: enc.encoder_id=%#x\n", __FUNCTION__, enc.encoder_id));
+
+	for (i = 0; i < 4; i++) {
+		conn.connector_id = mode->kmode->connectors[num];
+		conn.count_props = 0;
+		conn.count_modes = 1; /* skip detect */
+		conn.modes_ptr = (uintptr_t)&dummy;
+		conn.count_encoders = 1;
+		conn.encoders_ptr = (uintptr_t)&encoderid[i];
+		encoderid[i] = 0x12345678;
+
+		if (drmIoctl(sna->kgem.fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn)) {
+			DBG(("%s: GETCONNECTOR failed, ret=%d\n", __FUNCTION__, errno));
+			return false;
+		}
+		DBG(("%s: encoderid[%d]=%#x\n", __FUNCTION__, i, encoderid[i]));
+	}
 
 	if (conn.count_encoders != 1) {
 		DBG(("%s: unexpected number [%d] of encoders attached\n",