diff mbox

[3/3,media] cx88: use a mutex to protect cx8802_devlist

Message ID 20110402094451.GD17015@elie (mailing list archive)
State Superseded
Headers show

Commit Message

Jonathan Nieder April 2, 2011, 9:44 a.m. UTC
From: Ben Hutchings <ben@decadent.org.uk>
Date: Tue, 29 Mar 2011 03:25:15 +0100

Add and use a mutex to protect the cx88-mpeg device list.  Previously
the BKL prevented races.

[jn: split from a larger patch, with new commit message; also protect
 use in cx8802_probe]

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
That's the end of the series.  Hopefully I haven't mangled the patches
too much --- if we're lucky they might even still work, though I
wouldn't bet on it.  Bug reports and improvements welcome.

Good night,
Jonathan

 drivers/media/video/cx88/cx88-mpeg.c |   20 +++++++++++++++++---
 1 files changed, 17 insertions(+), 3 deletions(-)

Comments

Jonathan Nieder April 2, 2011, 7:29 p.m. UTC | #1
Hi Andreas,

(please turn off HTML mail.)
Andreas Huber wrote:

> There is a reference count bug in the driver code. The driver's
> active_ref count may become negative which leads to unpredictable
> behavior. (mpeg video device inaccessible, etc ...)

Hmm, the patchset didn't touch active_ref handling.

active_ref was added by v2.6.25-rc3~132^2~7 (V4L/DVB (7194):
cx88-mpeg: Allow concurrent access to cx88-mpeg devices, 2008-02-11)
and relies on three assumptions:

 * (successful) calls to cx8802_driver::request_acquire are balanced
   with calls to cx8802_driver::request_release;

 * cx8802_driver::advise_acquire is non-null if and only if
   cx8802_driver::advise_release is (since both are NULL for
   blackbird, non-NULL for dvb);

 * no data races.

I suppose it would be more idiomatic to use an atomic_t, but access to
active_ref was previously protected by the BKL and now it is protected
by core->lock.  So it's not clear to me why this doesn't work.

Any hints?  (e.g., a detailed reproduction recipe, or a log after
adding a printk to find out when exactly active_ref becomes negative)

Thanks for reporting.
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andreas Huber April 2, 2011, 8:13 p.m. UTC | #2
Hi Jonathan, thanks for locking into it.
I'll try to debug more deeply what's going wrong and keep you up to date.
Andi.

On 02.04.2011 21:29, Jonathan Nieder wrote:
> Hi Andreas,
>
> (please turn off HTML mail.)
> Andreas Huber wrote:
>
>> There is a reference count bug in the driver code. The driver's
>> active_ref count may become negative which leads to unpredictable
>> behavior. (mpeg video device inaccessible, etc ...)
> Hmm, the patchset didn't touch active_ref handling.
>
> active_ref was added by v2.6.25-rc3~132^2~7 (V4L/DVB (7194):
> cx88-mpeg: Allow concurrent access to cx88-mpeg devices, 2008-02-11)
> and relies on three assumptions:
>
>   * (successful) calls to cx8802_driver::request_acquire are balanced
>     with calls to cx8802_driver::request_release;
>
>   * cx8802_driver::advise_acquire is non-null if and only if
>     cx8802_driver::advise_release is (since both are NULL for
>     blackbird, non-NULL for dvb);
>
>   * no data races.
>
> I suppose it would be more idiomatic to use an atomic_t, but access to
> active_ref was previously protected by the BKL and now it is protected
> by core->lock.  So it's not clear to me why this doesn't work.
>
> Any hints?  (e.g., a detailed reproduction recipe, or a log after
> adding a printk to find out when exactly active_ref becomes negative)
>
> Thanks for reporting.
> Jonathan

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andreas Huber April 4, 2011, 2:12 a.m. UTC | #3
I added some printk entries and finally got suspicious output after

rmmod cx88_blackbird ; modprobe cx88_blackbird debug=true

[...]
cx88[0]/2: registered device video3 [mpeg]
cx88[0]/2-bb: mpeg_open
[cx88-blackbird.c,mpeg_open(),line 1059] mutex_lock(&dev->core->lock);
cx88[0]/2-bb: Initialize codec
[...]
cx88[0]/2-bb: open dev=video3
[cx88-blackbird.c,mpeg_open(),line 1103] mutex_unlock(&dev->core->lock); 
// normal exit
[...]
cx88[1]/2: subsystem: 0070:9601, board: Hauppauge WinTV-HVR1300 
DVB-T/Hybrid MPEG Encoder [card=56]
[cx88-blackbird.c,mpeg_release(),line 1122] mutex_lock(&dev->core->lock);
cx88[0] core->active_ref=0
[cx88-mpeg.c,cx8802_request_release(),line 655]                         
   // BANG !!!!!!!!!!!!!!! core->active_ref=-1
[cx88-blackbird.c,mpeg_release(),line 1129] drv->request_release(drv); 
// drv->core->active_ref=(0->0)
[cx88-blackbird.c,mpeg_release(),line 1133] 
mutex_unlock(&dev->core->lock); // normal exit
cx88[1]/2-bb: cx8802_blackbird_probe
[...]

Analyzing this lead me to the conclusion, that a call to mpeg_open() in 
cx88-blackbird.c
returns with 0 (= success)
while not actually having increased the active_ref count.

Here's the relevant code fragment ...

static int mpeg_open(struct file *file)
{
[...]
     /* Make sure we can acquire the hardware */
     drv = cx8802_get_driver(dev, CX88_MPEG_BLACKBIRD);      // HOW TO 
DEAL WITH NULL  ??????
     if (drv) {
         err = drv->request_acquire(drv);
         if(err != 0) {
             dprintk(1,"%s: Unable to acquire hardware, %d\n", __func__, 
err);
             mutex_unlock(&dev->core->lock);
             return err;
         }
     }
[...]
     return 0;
}

I suspect, that

drv = cx8802_get_driver(dev, CX88_MPEG_BLACKBIRD);

in my case might evaluates to NULL (not normally but during driver 
initialization!?)
which then leads to

1) mpeg_open() returns with success

and

2) active_ref count has not been increased

which results in a negative active_ref count later on.

But I might as well be totally wrong.

Andi

On 02.04.2011 22:13, Andreas Huber wrote:
> Hi Jonathan, thanks for locking into it.
> I'll try to debug more deeply what's going wrong and keep you up to date.
> Andi.
>
> On 02.04.2011 21:29, Jonathan Nieder wrote:
>> Hi Andreas,
>>
>> (please turn off HTML mail.)
>> Andreas Huber wrote:
>>
>>> There is a reference count bug in the driver code. The driver's
>>> active_ref count may become negative which leads to unpredictable
>>> behavior. (mpeg video device inaccessible, etc ...)
>> Hmm, the patchset didn't touch active_ref handling.
>>
>> active_ref was added by v2.6.25-rc3~132^2~7 (V4L/DVB (7194):
>> cx88-mpeg: Allow concurrent access to cx88-mpeg devices, 2008-02-11)
>> and relies on three assumptions:
>>
>>   * (successful) calls to cx8802_driver::request_acquire are balanced
>>     with calls to cx8802_driver::request_release;
>>
>>   * cx8802_driver::advise_acquire is non-null if and only if
>>     cx8802_driver::advise_release is (since both are NULL for
>>     blackbird, non-NULL for dvb);
>>
>>   * no data races.
>>
>> I suppose it would be more idiomatic to use an atomic_t, but access to
>> active_ref was previously protected by the BKL and now it is protected
>> by core->lock.  So it's not clear to me why this doesn't work.
>>
>> Any hints?  (e.g., a detailed reproduction recipe, or a log after
>> adding a printk to find out when exactly active_ref becomes negative)
>>
>> Thanks for reporting.
>> Jonathan
>

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/media/video/cx88/cx88-mpeg.c b/drivers/media/video/cx88/cx88-mpeg.c
index 9147c16..6b58647 100644
--- a/drivers/media/video/cx88/cx88-mpeg.c
+++ b/drivers/media/video/cx88/cx88-mpeg.c
@@ -78,6 +78,7 @@  static void flush_request_modules(struct cx8802_dev *dev)
 
 
 static LIST_HEAD(cx8802_devlist);
+static DEFINE_MUTEX(cx8802_mutex);
 /* ------------------------------------------------------------------ */
 
 static int cx8802_start_dma(struct cx8802_dev    *dev,
@@ -689,6 +690,8 @@  int cx8802_register_driver(struct cx8802_driver *drv)
 		return err;
 	}
 
+	mutex_lock(&cx8802_mutex);
+
 	list_for_each_entry(dev, &cx8802_devlist, devlist) {
 		printk(KERN_INFO
 		       "%s/2: subsystem: %04x:%04x, board: %s [card=%d]\n",
@@ -698,8 +701,10 @@  int cx8802_register_driver(struct cx8802_driver *drv)
 
 		/* Bring up a new struct for each driver instance */
 		driver = kzalloc(sizeof(*drv),GFP_KERNEL);
-		if (driver == NULL)
-			return -ENOMEM;
+		if (driver == NULL) {
+			err = -ENOMEM;
+			goto out;
+		}
 
 		/* Snapshot of the driver registration data */
 		drv->core = dev->core;
@@ -723,7 +728,10 @@  int cx8802_register_driver(struct cx8802_driver *drv)
 
 	}
 
-	return i ? 0 : -ENODEV;
+	err = i ? 0 : -ENODEV;
+out:
+	mutex_unlock(&cx8802_mutex);
+	return err;
 }
 
 int cx8802_unregister_driver(struct cx8802_driver *drv)
@@ -737,6 +745,8 @@  int cx8802_unregister_driver(struct cx8802_driver *drv)
 	       drv->type_id == CX88_MPEG_DVB ? "dvb" : "blackbird",
 	       drv->hw_access == CX8802_DRVCTL_SHARED ? "shared" : "exclusive");
 
+	mutex_lock(&cx8802_mutex);
+
 	list_for_each_entry(dev, &cx8802_devlist, devlist) {
 		printk(KERN_INFO
 		       "%s/2: subsystem: %04x:%04x, board: %s [card=%d]\n",
@@ -763,6 +773,8 @@  int cx8802_unregister_driver(struct cx8802_driver *drv)
 		mutex_unlock(&dev->core->lock);
 	}
 
+	mutex_unlock(&cx8802_mutex);
+
 	return err;
 }
 
@@ -800,7 +812,9 @@  static int __devinit cx8802_probe(struct pci_dev *pci_dev,
 		goto fail_free;
 
 	INIT_LIST_HEAD(&dev->drvlist);
+	mutex_lock(&cx8802_mutex);
 	list_add_tail(&dev->devlist,&cx8802_devlist);
+	mutex_unlock(&cx8802_mutex);
 
 	/* now autoload cx88-dvb or cx88-blackbird */
 	request_modules(dev);