counter: drop chrdev_lock

Message ID: 20211017185521.3468640-1-david@lechnology.com (mailing list archive)
State: Accepted
Series: counter: drop chrdev_lock

Commit Message

David Lechner Oct. 17, 2021, 6:55 p.m. UTC
This removes the chrdev_lock from the counter subsystem. This was
intended to prevent opening the chrdev more than once. However, this
doesn't work in practice since userspace can duplicate file descriptors
and pass file descriptors to other processes. Since this protection
can't be relied on, it is best to just remove it.

Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: David Lechner <david@lechnology.com>
---
 drivers/counter/counter-chrdev.c |  6 ------
 drivers/counter/counter-sysfs.c  | 13 +++----------
 include/linux/counter.h          |  7 -------
 3 files changed, 3 insertions(+), 23 deletions(-)
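
To illustrate why such a single-open check cannot work, consider a minimal
userspace sketch (not part of the patch; the device path is hypothetical).
A duplicated descriptor reaches the driver without a second open() call, so
an open-count check in open() never observes it:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            /* hypothetical counter character device node */
            int fd = open("/dev/counter0", O_RDWR);
            int fd2;

            if (fd < 0)
                    return 1;

            /*
             * dup() -- like fork() or SCM_RIGHTS fd passing -- duplicates
             * the descriptor without invoking the driver's open() again,
             * so a "single open" check in the chrdev cannot observe it.
             */
            fd2 = dup(fd);
            printf("fd=%d and fd2=%d share one open file\n", fd, fd2);

            return 0;
    }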

Comments

Greg KH Oct. 18, 2021, 6:08 a.m. UTC | #1
On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> This removes the chrdev_lock from the counter subsystem. This was
> intended to prevent opening the chrdev more than once. However, this
> doesn't work in practice since userspace can duplicate file descriptors
> and pass file descriptors to other processes. Since this protection
> can't be relied on, it is best to just remove it.

Much better, thanks!

One remaining question:

> --- a/include/linux/counter.h
> +++ b/include/linux/counter.h
> @@ -297,7 +297,6 @@ struct counter_ops {
>   * @events:		queue of detected Counter events
>   * @events_wait:	wait queue to allow blocking reads of Counter events
>   * @events_lock:	lock to protect Counter events queue read operations
> - * @chrdev_lock:	lock to limit chrdev to a single open at a time
>   * @ops_exist_lock:	lock to prevent use during removal

Why do you still need 2 locks for the same structure?

thanks,

greg k-h
William Breathitt Gray Oct. 18, 2021, 8:58 a.m. UTC | #2
On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > This removes the chrdev_lock from the counter subsystem. This was
> > intended to prevent opening the chrdev more than once. However, this
> > doesn't work in practice since userspace can duplicate file descriptors
> > and pass file descriptors to other processes. Since this protection
> > can't be relied on, it is best to just remove it.
> 
> Much better, thanks!
> 
> One remaining question:
> 
> > --- a/include/linux/counter.h
> > +++ b/include/linux/counter.h
> > @@ -297,7 +297,6 @@ struct counter_ops {
> >   * @events:		queue of detected Counter events
> >   * @events_wait:	wait queue to allow blocking reads of Counter events
> >   * @events_lock:	lock to protect Counter events queue read operations
> > - * @chrdev_lock:	lock to limit chrdev to a single open at a time
> >   * @ops_exist_lock:	lock to prevent use during removal
> 
> Why do you still need 2 locks for the same structure?
> 
> thanks,
> 
> greg k-h

Originally there was only the events_lock mutex. Initially I tried using
it to also limit the chrdev to a single open, but then came across a
"lock held when returning to user space" warning:
https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5

Instead of losing the benefits of a mutex lock for protecting the
events, I ultimately implemented the chrdev_lock separately as an
atomic_t. If the chrdev_lock is removed, then we'll use events_lock
solely from now on for this structure.
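
The rejected approach looked roughly like this (a simplified sketch, not
the code that was actually posted): the lock taken in open() could only be
released in release(), so open() always returned to user space with the
mutex held.

    /* simplified sketch of the rejected single-open mutex approach */
    static int counter_chrdev_open(struct inode *inode, struct file *filp)
    {
            struct counter_device *const counter = container_of(inode->i_cdev,
                                                                typeof(*counter),
                                                                chrdev);

            if (!mutex_trylock(&counter->events_lock))
                    return -EBUSY;

            get_device(&counter->dev);
            filp->private_data = counter;

            /*
             * Returning here with events_lock still held (it would only be
             * released in counter_chrdev_release()) is what trips lockdep's
             * "lock held when returning to user space" warning.
             */
            return 0;
    }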

William Breathitt Gray
Greg KH Oct. 18, 2021, 9:13 a.m. UTC | #3
On Mon, Oct 18, 2021 at 05:58:37PM +0900, William Breathitt Gray wrote:
> On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
> > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > This removes the chrdev_lock from the counter subsystem. This was
> > > intended to prevent opening the chrdev more than once. However, this
> > > doesn't work in practice since userspace can duplicate file descriptors
> > > and pass file descriptors to other processes. Since this protection
> > > can't be relied on, it is best to just remove it.
> > 
> > Much better, thanks!
> > 
> > One remaining question:
> > 
> > > --- a/include/linux/counter.h
> > > +++ b/include/linux/counter.h
> > > @@ -297,7 +297,6 @@ struct counter_ops {
> > >   * @events:		queue of detected Counter events
> > >   * @events_wait:	wait queue to allow blocking reads of Counter events
> > >   * @events_lock:	lock to protect Counter events queue read operations
> > > - * @chrdev_lock:	lock to limit chrdev to a single open at a time
> > >   * @ops_exist_lock:	lock to prevent use during removal
> > 
> > Why do you still need 2 locks for the same structure?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Originally there was only the events_lock mutex. Initially I tried using
> it to also limit the chrdev to a single open, but then came across a
> "lock held when returning to user space" warning:
> https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5
> 
> Instead of losing the benefits of a mutex lock for protecting the
> events, I ultimately implemented the chrdev_lock separately as an
> atomic_t. If the chrdev_lock is removed, then we'll use events_lock
> solely from now on for this structure.

chrdev_lock should be removed, it doesn't really do what you think it
does, as per the thread yesterday :)

So does this mean you can also drop the ops_exist_lock?

thanks,

greg k-h
William Breathitt Gray Oct. 18, 2021, 9:14 a.m. UTC | #4
On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> This removes the chrdev_lock from the counter subsystem. This was
> intended to prevent opening the chrdev more than once. However, this
> doesn't work in practice since userspace can duplicate file descriptors
> and pass file descriptors to other processes. Since this protection
> can't be relied on, it is best to just remove it.
> 
> Suggested-by: Greg KH <gregkh@linuxfoundation.org>
> Signed-off-by: David Lechner <david@lechnology.com>

Hi David,

If userspace can bypass this protection, then we might as well remove the
code as moot. I agree in principle with this change, but I do have some
comments inline below.

> ---
>  drivers/counter/counter-chrdev.c |  6 ------
>  drivers/counter/counter-sysfs.c  | 13 +++----------
>  include/linux/counter.h          |  7 -------
>  3 files changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/counter/counter-chrdev.c b/drivers/counter/counter-chrdev.c
> index 967c94ae95bb..b747dc81cfc6 100644
> --- a/drivers/counter/counter-chrdev.c
> +++ b/drivers/counter/counter-chrdev.c
> @@ -384,10 +384,6 @@ static int counter_chrdev_open(struct inode *inode, struct file *filp)
>  							    typeof(*counter),
>  							    chrdev);
>  
> -	/* Ensure chrdev is not opened more than 1 at a time */
> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> -		return -EBUSY;
> -
>  	get_device(&counter->dev);
>  	filp->private_data = counter;
>  
> @@ -419,7 +415,6 @@ static int counter_chrdev_release(struct inode *inode, struct file *filp)
>  	mutex_unlock(&counter->ops_exist_lock);
>  
>  	put_device(&counter->dev);
> -	atomic_dec(&counter->chrdev_lock);
>  
>  	return ret;
>  }
> @@ -445,7 +440,6 @@ int counter_chrdev_add(struct counter_device *const counter)
>  	mutex_init(&counter->events_lock);
>  
>  	/* Initialize character device */

We can remove this comment because the purpose of calling cdev_init() is
obvious enough from the name.

> -	atomic_set(&counter->chrdev_lock, 0);
>  	cdev_init(&counter->chrdev, &counter_fops);
>  
>  	/* Allocate Counter events queue */
> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> index 1ccd771da25f..7bf8882ff54d 100644
> --- a/drivers/counter/counter-sysfs.c
> +++ b/drivers/counter/counter-sysfs.c
> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
>  					   u64 val)
>  {
>  	DECLARE_KFIFO_PTR(events, struct counter_event);
> -	int err = 0;
> -
> -	/* Ensure chrdev is not opened more than 1 at a time */
> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> -		return -EBUSY;
> +	int err;
>  
>  	/* Allocate new events queue */
>  	err = kfifo_alloc(&events, val, GFP_KERNEL);
>  	if (err)
> -		goto exit_early;
> +		return err;
>  
>  	/* Swap in new events queue */
>  	kfifo_free(&counter->events);
>  	counter->events.kfifo = events.kfifo;

Do we need to hold the events_lock mutex here for this swap in case
counter_chrdev_read() is in the middle of reading the kfifo to
userspace, or do the kfifo macros already protect us from a race
condition here?

William Breathitt Gray

>  
> -exit_early:
> -	atomic_dec(&counter->chrdev_lock);
> -
> -	return err;
> +	return 0;
>  }
>  
>  static struct counter_comp counter_num_signals_comp =
> diff --git a/include/linux/counter.h b/include/linux/counter.h
> index 22b14a552b1d..0fd99e255a50 100644
> --- a/include/linux/counter.h
> +++ b/include/linux/counter.h
> @@ -297,7 +297,6 @@ struct counter_ops {
>   * @events:		queue of detected Counter events
>   * @events_wait:	wait queue to allow blocking reads of Counter events
>   * @events_lock:	lock to protect Counter events queue read operations
> - * @chrdev_lock:	lock to limit chrdev to a single open at a time
>   * @ops_exist_lock:	lock to prevent use during removal
>   */
>  struct counter_device {
> @@ -325,12 +324,6 @@ struct counter_device {
>  	DECLARE_KFIFO_PTR(events, struct counter_event);
>  	wait_queue_head_t events_wait;
>  	struct mutex events_lock;
> -	/*
> -	 * chrdev_lock is locked by counter_chrdev_open() and unlocked by
> -	 * counter_chrdev_release(), so a mutex is not possible here because
> -	 * chrdev_lock will invariably be held when returning to user space
> -	 */
> -	atomic_t chrdev_lock;
>  	struct mutex ops_exist_lock;
>  };
>  
> -- 
> 2.25.1
>
William Breathitt Gray Oct. 18, 2021, 9:51 a.m. UTC | #5
On Mon, Oct 18, 2021 at 11:13:16AM +0200, Greg KH wrote:
> On Mon, Oct 18, 2021 at 05:58:37PM +0900, William Breathitt Gray wrote:
> > On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
> > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > > This removes the chrdev_lock from the counter subsystem. This was
> > > > intended to prevent opening the chrdev more than once. However, this
> > > > doesn't work in practice since userspace can duplicate file descriptors
> > > > and pass file descriptors to other processes. Since this protection
> > > > can't be relied on, it is best to just remove it.
> > > 
> > > Much better, thanks!
> > > 
> > > One remaining question:
> > > 
> > > > --- a/include/linux/counter.h
> > > > +++ b/include/linux/counter.h
> > > > @@ -297,7 +297,6 @@ struct counter_ops {
> > > >   * @events:		queue of detected Counter events
> > > >   * @events_wait:	wait queue to allow blocking reads of Counter events
> > > >   * @events_lock:	lock to protect Counter events queue read operations
> > > > - * @chrdev_lock:	lock to limit chrdev to a single open at a time
> > > >   * @ops_exist_lock:	lock to prevent use during removal
> > > 
> > > Why do you still need 2 locks for the same structure?
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > Originally there was only the events_lock mutex. Initially I tried using
> > it to also limit the chrdev to a single open, but then came across a
> > "lock held when returning to user space" warning:
> > https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5
> > 
> > Instead of losing the benefits of a mutex lock for protecting the
> > events, I ultimately implemented the chrdev_lock separately as an
> > atomic_t. If the chrdev_lock is removed, then we'll use events_lock
> > solely from now on for this structure.
> 
> chrdev_lock should be removed, it doesn't really do what you think it
> does, as per the thread yesterday :)
> 
> So does this mean you can also drop the ops_exist_lock?
> 
> thanks,
> 
> greg k-h

When counter_unregister is called, the ops member is set to NULL to
indicate that the driver will be removed and that no new device
operations should occur (because the ops callbacks will no longer be
valid). The ops_exist_lock is used to allow existing ops callback
dereferences to complete before the driver is removed so that we do not
suffer a page fault.

I don't believe we can remove this protection (or can we?) but perhaps
we can merge the three mutex locks (n_events_list_lock, events_lock, and
ops_exist_lock) into a single "counter_lock" that handles all mutex
locking for this structure.
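
The pattern in question looks roughly like this (a sketch with an
illustrative function name; the real callers are the chrdev and sysfs
read/write paths):

    /* sketch: ops callback dereferences are guarded by ops_exist_lock */
    static int counter_read_guarded(struct counter_device *const counter,
                                    struct counter_count *const count,
                                    u64 *const val)
    {
            int err;

            mutex_lock(&counter->ops_exist_lock);

            /* counter_unregister() sets ops to NULL before removal */
            if (!counter->ops) {
                    err = -ENODEV;
                    goto out_unlock;
            }

            err = counter->ops->count_read(counter, count, val);

    out_unlock:
            mutex_unlock(&counter->ops_exist_lock);
            return err;
    }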

William Breathitt Gray
David Lechner Oct. 18, 2021, 4:03 p.m. UTC | #6
On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
>> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
>> index 1ccd771da25f..7bf8882ff54d 100644
>> --- a/drivers/counter/counter-sysfs.c
>> +++ b/drivers/counter/counter-sysfs.c
>> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
>>   					   u64 val)
>>   {
>>   	DECLARE_KFIFO_PTR(events, struct counter_event);
>> -	int err = 0;
>> -
>> -	/* Ensure chrdev is not opened more than 1 at a time */
>> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
>> -		return -EBUSY;
>> +	int err;
>>   
>>   	/* Allocate new events queue */
>>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
>>   	if (err)
>> -		goto exit_early;
>> +		return err;
>>   
>>   	/* Swap in new events queue */
>>   	kfifo_free(&counter->events);
>>   	counter->events.kfifo = events.kfifo;
> 
> Do we need to hold the events_lock mutex here for this swap in case
> counter_chrdev_read() is in the middle of reading the kfifo to
> userspace, or do the kfifo macros already protect us from a race
> condition here?
> 
Another possibility might be to disallow changing the size while
events are enabled. Otherwise, we also need to protect against
write after free.

I considered this:

	swap(counter->events.kfifo, events.kfifo);
	kfifo_free(&events);

But I'm not sure that would be safe enough.
David Lechner Oct. 18, 2021, 4:14 p.m. UTC | #7
On 10/18/21 4:51 AM, William Breathitt Gray wrote:
> On Mon, Oct 18, 2021 at 11:13:16AM +0200, Greg KH wrote:
>> On Mon, Oct 18, 2021 at 05:58:37PM +0900, William Breathitt Gray wrote:
>>> On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
>>>> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
>>>>> This removes the chrdev_lock from the counter subsystem. This was
>>>>> intended to prevent opening the chrdev more than once. However, this
>>>>> doesn't work in practice since userspace can duplicate file descriptors
>>>>> and pass file descriptors to other processes. Since this protection
>>>>> can't be relied on, it is best to just remove it.
>>>>
>>>> Much better, thanks!
>>>>
>>>> One remaining question:
>>>>
>>>>> --- a/include/linux/counter.h
>>>>> +++ b/include/linux/counter.h
>>>>> @@ -297,7 +297,6 @@ struct counter_ops {
>>>>>    * @events:		queue of detected Counter events
>>>>>    * @events_wait:	wait queue to allow blocking reads of Counter events
>>>>>    * @events_lock:	lock to protect Counter events queue read operations
>>>>> - * @chrdev_lock:	lock to limit chrdev to a single open at a time
>>>>>    * @ops_exist_lock:	lock to prevent use during removal
>>>>
>>>> Why do you still need 2 locks for the same structure?
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>
>>> Originally there was only the events_lock mutex. Initially I tried using
>>> it to also limit the chrdev to a single open, but then came across a
>>> "lock held when returning to user space" warning:
>>> https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5
>>>
>>> Instead of losing the benefits of a mutex lock for protecting the
>>> events, I ultimately implemented the chrdev_lock separately as an
>>> atomic_t. If the chrdev_lock is removed, then we'll use events_lock
>>> solely from now on for this structure.
>>
>> chrdev_lock should be removed, it doesn't really do what you think it
>> does, as per the thread yesterday :)
>>
>> So does this mean you can also drop the ops_exist_lock?
>>
>> thanks,
>>
>> greg k-h
> 
> When counter_unregister is called, the ops member is set to NULL to
> indicate that the driver will be removed and that no new device
> operations should occur (because the ops callbacks will no longer be
> valid). The ops_exist_lock is used to allow existing ops callback
> dereferences to complete before the driver is removed so that we do not
> suffer a page fault.
> 
> I don't believe we can remove this protection (or can we?) but perhaps
> we can merge the three mutex locks (n_events_list_lock, events_lock, and
> ops_exist_lock) into a single "counter_lock" that handles all mutex
> locking for this structure.
> 

The different mutexes protect individual parts of the counter struct
rather than the struct as a whole (a linked list, kfifo reads, and
callback ops), so I think it makes the code clearer having individual
mutexes for each rather than having a global mutex for unrelated
actions.
William Breathitt Gray Oct. 18, 2021, 11:56 p.m. UTC | #8
On Mon, Oct 18, 2021 at 11:14:15AM -0500, David Lechner wrote:
> On 10/18/21 4:51 AM, William Breathitt Gray wrote:
> > On Mon, Oct 18, 2021 at 11:13:16AM +0200, Greg KH wrote:
> >> On Mon, Oct 18, 2021 at 05:58:37PM +0900, William Breathitt Gray wrote:
> >>> On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
> >>>> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> >>>>> This removes the chrdev_lock from the counter subsystem. This was
> >>>>> intended to prevent opening the chrdev more than once. However, this
> >>>>> doesn't work in practice since userspace can duplicate file descriptors
> >>>>> and pass file descriptors to other processes. Since this protection
> >>>>> can't be relied on, it is best to just remove it.
> >>>>
> >>>> Much better, thanks!
> >>>>
> >>>> One remaining question:
> >>>>
> >>>>> --- a/include/linux/counter.h
> >>>>> +++ b/include/linux/counter.h
> >>>>> @@ -297,7 +297,6 @@ struct counter_ops {
> >>>>>    * @events:		queue of detected Counter events
> >>>>>    * @events_wait:	wait queue to allow blocking reads of Counter events
> >>>>>    * @events_lock:	lock to protect Counter events queue read operations
> >>>>> - * @chrdev_lock:	lock to limit chrdev to a single open at a time
> >>>>>    * @ops_exist_lock:	lock to prevent use during removal
> >>>>
> >>>> Why do you still need 2 locks for the same structure?
> >>>>
> >>>> thanks,
> >>>>
> >>>> greg k-h
> >>>
> >>> Originally there was only the events_lock mutex. Initially I tried using
> >>> it to also limit the chrdev to a single open, but then came across a
> >>> "lock held when returning to user space" warning:
> >>> https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5
> >>>
> >>> Instead of losing the benefits of a mutex lock for protecting the
> >>> events, I ultimately implemented the chrdev_lock separately as an
> >>> atomic_t. If the chrdev_lock is removed, then we'll use events_lock
> >>> solely from now on for this structure.
> >>
> >> chrdev_lock should be removed, it doesn't really do what you think it
> >> does, as per the thread yesterday :)
> >>
> >> So does this mean you can also drop the ops_exist_lock?
> >>
> >> thanks,
> >>
> >> greg k-h
> > 
> > When counter_unregister is called, the ops member is set to NULL to
> > indicate that the driver will be removed and that no new device
> > operations should occur (because the ops callbacks will no longer be
> > valid). The ops_exist_lock is used to allow existing ops callback
> > dereferences to complete before the driver is removed so that we do not
> > suffer a page fault.
> > 
> > I don't believe we can remove this protection (or can we?) but perhaps
> > we can merge the three mutex locks (n_events_list_lock, events_lock, and
> > ops_exist_lock) into a single "counter_lock" that handles all mutex
> > locking for this structure.
> > 
> 
> The different mutexes protect individual parts of the counter struct
> rather than the struct as a whole (a linked list, kfifo reads, and
> callback ops), so I think it makes the code clearer having individual
> mutexes for each rather than having a global mutex for unrelated
> actions.

That's a fair point. Because these operations are independent of each
other, keeping the mutexes separate would allow users to configure the
next events watch list without necessarily stalling the current read()
of events. I doubt the difference is significant, but if the purpose of
these locks is well enough defined, then independent mutex locks may not
be a problem.

William Breathitt Gray
William Breathitt Gray Oct. 19, 2021, 6:53 a.m. UTC | #9
On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> >> index 1ccd771da25f..7bf8882ff54d 100644
> >> --- a/drivers/counter/counter-sysfs.c
> >> +++ b/drivers/counter/counter-sysfs.c
> >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> >>   					   u64 val)
> >>   {
> >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> >> -	int err = 0;
> >> -
> >> -	/* Ensure chrdev is not opened more than 1 at a time */
> >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> >> -		return -EBUSY;
> >> +	int err;
> >>   
> >>   	/* Allocate new events queue */
> >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> >>   	if (err)
> >> -		goto exit_early;
> >> +		return err;
> >>   
> >>   	/* Swap in new events queue */
> >>   	kfifo_free(&counter->events);
> >>   	counter->events.kfifo = events.kfifo;
> > 
> > Do we need to hold the events_lock mutex here for this swap in case
> > counter_chrdev_read() is in the middle of reading the kfifo to
> > userspace, or do the kfifo macros already protect us from a race
> > condition here?
> > 
> Another possibility might be to disallow changing the size while
> events are enabled. Otherwise, we also need to protect against
> write after free.
> 
> I considered this:
> 
> 	swap(counter->events.kfifo, events.kfifo);
> 	kfifo_free(&events);
> 
> But I'm not sure that would be safe enough.

I think it depends on whether it's safe to call kfifo_free() while other
kfifo_*() calls are executing. I suspect it is not safe because I don't
think kfifo_free() waits until all kfifo read/write operations are
finished before freeing -- but if I'm wrong here please let me know.

Because of that, we will need to hold the counter->events_lock after all so
that we don't modify the events fifo while a kfifo read/write is going
on, lest we suffer an address fault. This can happen regardless of
whether you swap before or after the kfifo_free() because the old fifo
address could still be in use within those uncompleted kfifo_*() calls
if they were called before the swap but don't complete before the
kfifo_free().

So we have a problem now that I think you have already noticed: the
kfifo_in() call in counter_push_events() also needs protection, but it's
executing within an interrupt context so we can't try to lock a mutex
lest we end up sleeping.

One option we have is as you suggested: we disallow changing size while
events are enabled. However, that will require us to keep track of when
events are disabled and implement a spinlock to ensure that we don't
disable events in the middle of a kfifo_in().

Alternatively, we could change events_lock to a spinlock and use it to
protect all these operations on the counter->events fifo. Would this
alternative be a better option so that we avoid creating another
separate lock?
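
With that alternative, the body of counter_events_queue_size_write() could
look roughly like this (a sketch, assuming events_lock is converted to a
spinlock shared with counter_chrdev_read() and counter_push_events()):

    DECLARE_KFIFO_PTR(events, struct counter_event);
    unsigned long flags;
    int err;

    /* allocate the new queue outside of the critical section */
    err = kfifo_alloc(&events, val, GFP_KERNEL);
    if (err)
            return err;

    /*
     * Swap in the new queue under the lock; every kfifo_*() user holds
     * the same lock, so none can still reference the old buffer below.
     */
    spin_lock_irqsave(&counter->events_lock, flags);
    swap(counter->events.kfifo, events.kfifo);
    spin_unlock_irqrestore(&counter->events_lock, flags);

    /* safe to free: nothing references the old fifo anymore */
    kfifo_free(&events);

    return 0;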

As a side note, you can also remove the atomic.h includes from these
files now as well because we won't be working with an atomic_t anymore.

William Breathitt Gray
Greg KH Oct. 19, 2021, 7:07 a.m. UTC | #10
On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > >> index 1ccd771da25f..7bf8882ff54d 100644
> > >> --- a/drivers/counter/counter-sysfs.c
> > >> +++ b/drivers/counter/counter-sysfs.c
> > >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > >>   					   u64 val)
> > >>   {
> > >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> > >> -	int err = 0;
> > >> -
> > >> -	/* Ensure chrdev is not opened more than 1 at a time */
> > >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > >> -		return -EBUSY;
> > >> +	int err;
> > >>   
> > >>   	/* Allocate new events queue */
> > >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > >>   	if (err)
> > >> -		goto exit_early;
> > >> +		return err;
> > >>   
> > >>   	/* Swap in new events queue */
> > >>   	kfifo_free(&counter->events);
> > >>   	counter->events.kfifo = events.kfifo;
> > > 
> > > Do we need to hold the events_lock mutex here for this swap in case
> > > counter_chrdev_read() is in the middle of reading the kfifo to
> > > userspace, or do the kfifo macros already protect us from a race
> > > condition here?
> > > 
> > Another possibility might be to disallow changing the size while
> > events are enabled. Otherwise, we also need to protect against
> > write after free.
> > 
> > I considered this:
> > 
> > 	swap(counter->events.kfifo, events.kfifo);
> > 	kfifo_free(&events);
> > 
> > But I'm not sure that would be safe enough.
> 
> I think it depends on whether it's safe to call kfifo_free() while other
> kfifo_*() calls are executing. I suspect it is not safe because I don't
> think kfifo_free() waits until all kfifo read/write operations are
> finished before freeing -- but if I'm wrong here please let me know.
> 
> Because of that, we will need to hold the counter->events_lock after all so
> that we don't modify the events fifo while a kfifo read/write is going
> on, lest we suffer an address fault. This can happen regardless of
> whether you swap before or after the kfifo_free() because the old fifo
> address could still be in use within those uncompleted kfifo_*() calls
> if they were called before the swap but don't complete before the
> kfifo_free().
> 
> So we have a problem now that I think you have already noticed: the
> kfifo_in() call in counter_push_events() also needs protection, but it's
> executing within an interrupt context so we can't try to lock a mutex
> lest we end up sleeping.
> 
> One option we have is as you suggested: we disallow changing size while
> events are enabled. However, that will require us to keep track of when
> events are disabled and implement a spinlock to ensure that we don't
> disable events in the middle of a kfifo_in().
> 
> Alternatively, we could change events_lock to a spinlock and use it to
> protect all these operations on the counter->events fifo. Would this
> alternative be a better option so that we avoid creating another
> separate lock?

I would recommend just having a single lock here if at all possible,
until you determine that there is a performance problem that can be
measured that would require it to be split up.

thanks,

greg k-h
William Breathitt Gray Oct. 19, 2021, 7:18 a.m. UTC | #11
On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> > On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > > On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > > >> index 1ccd771da25f..7bf8882ff54d 100644
> > > >> --- a/drivers/counter/counter-sysfs.c
> > > >> +++ b/drivers/counter/counter-sysfs.c
> > > >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > > >>   					   u64 val)
> > > >>   {
> > > >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> > > >> -	int err = 0;
> > > >> -
> > > >> -	/* Ensure chrdev is not opened more than 1 at a time */
> > > >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > > >> -		return -EBUSY;
> > > >> +	int err;
> > > >>   
> > > >>   	/* Allocate new events queue */
> > > >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > > >>   	if (err)
> > > >> -		goto exit_early;
> > > >> +		return err;
> > > >>   
> > > >>   	/* Swap in new events queue */
> > > >>   	kfifo_free(&counter->events);
> > > >>   	counter->events.kfifo = events.kfifo;
> > > > 
> > > > Do we need to hold the events_lock mutex here for this swap in case
> > > > counter_chrdev_read() is in the middle of reading the kfifo to
> > > > userspace, or do the kfifo macros already protect us from a race
> > > > condition here?
> > > > 
> > > Another possibility might be to disallow changing the size while
> > > events are enabled. Otherwise, we also need to protect against
> > > write after free.
> > > 
> > > I considered this:
> > > 
> > > 	swap(counter->events.kfifo, events.kfifo);
> > > 	kfifo_free(&events);
> > > 
> > > But I'm not sure that would be safe enough.
> > 
> > I think it depends on whether it's safe to call kfifo_free() while other
> > kfifo_*() calls are executing. I suspect it is not safe because I don't
> > think kfifo_free() waits until all kfifo read/write operations are
> > finished before freeing -- but if I'm wrong here please let me know.
> > 
> > Because of that, we will need to hold the counter->events_lock after all so
> > that we don't modify the events fifo while a kfifo read/write is going
> > on, lest we suffer an address fault. This can happen regardless of
> > whether you swap before or after the kfifo_free() because the old fifo
> > address could still be in use within those uncompleted kfifo_*() calls
> > if they were called before the swap but don't complete before the
> > kfifo_free().
> > 
> > So we have a problem now that I think you have already noticed: the
> > kfifo_in() call in counter_push_events() also needs protection, but it's
> > executing within an interrupt context so we can't try to lock a mutex
> > lest we end up sleeping.
> > 
> > One option we have is as you suggested: we disallow changing size while
> > events are enabled. However, that will require us to keep track of when
> > events are disabled and implement a spinlock to ensure that we don't
> > disable events in the middle of a kfifo_in().
> > 
> > Alternatively, we could change events_lock to a spinlock and use it to
> > protect all these operations on the counter->events fifo. Would this
> > alternative be a better option so that we avoid creating another
> > separate lock?
> 
> I would recommend just having a single lock here if at all possible,
> until you determine that there is a performance problem that can be
> measured that would require it to be split up.
> 
> thanks,
> 
> greg k-h

All right let's go with a single events_lock spinlock then. David, if
you make those changes and submit a v2, I'll be okay with this patch and
can provide my ack for it.

Thanks,

William Breathitt Gray
Greg KH Oct. 19, 2021, 7:29 a.m. UTC | #12
On Tue, Oct 19, 2021 at 04:18:42PM +0900, William Breathitt Gray wrote:
> On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> > On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> > > On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > > > On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > > > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > > >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > > > >> index 1ccd771da25f..7bf8882ff54d 100644
> > > > >> --- a/drivers/counter/counter-sysfs.c
> > > > >> +++ b/drivers/counter/counter-sysfs.c
> > > > >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > > > >>   					   u64 val)
> > > > >>   {
> > > > >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> > > > >> -	int err = 0;
> > > > >> -
> > > > >> -	/* Ensure chrdev is not opened more than 1 at a time */
> > > > >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > > > >> -		return -EBUSY;
> > > > >> +	int err;
> > > > >>   
> > > > >>   	/* Allocate new events queue */
> > > > >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > > > >>   	if (err)
> > > > >> -		goto exit_early;
> > > > >> +		return err;
> > > > >>   
> > > > >>   	/* Swap in new events queue */
> > > > >>   	kfifo_free(&counter->events);
> > > > >>   	counter->events.kfifo = events.kfifo;
> > > > > 
> > > > > Do we need to hold the events_lock mutex here for this swap in case
> > > > > counter_chrdev_read() is in the middle of reading the kfifo to
> > > > > userspace, or do the kfifo macros already protect us from a race
> > > > > condition here?
> > > > > 
> > > > Another possibility might be to disallow changing the size while
> > > > events are enabled. Otherwise, we also need to protect against
> > > > write after free.
> > > > 
> > > > I considered this:
> > > > 
> > > > 	swap(counter->events.kfifo, events.kfifo);
> > > > 	kfifo_free(&events);
> > > > 
> > > > But I'm not sure that would be safe enough.
> > > 
> > > I think it depends on whether it's safe to call kfifo_free() while other
> > > kfifo_*() calls are executing. I suspect it is not safe because I don't
> > > think kfifo_free() waits until all kfifo read/write operations are
> > > finished before freeing -- but if I'm wrong here please let me know.
> > > 
> > > Because of that, we will need to hold the counter->events_lock after all so
> > > that we don't modify the events fifo while a kfifo read/write is going
> > > on, lest we suffer an address fault. This can happen regardless of
> > > whether you swap before or after the kfifo_free() because the old fifo
> > > address could still be in use within those uncompleted kfifo_*() calls
> > > if they were called before the swap but don't complete before the
> > > kfifo_free().
> > > 
> > > So we have a problem now that I think you have already noticed: the
> > > kfifo_in() call in counter_push_events() also needs protection, but it's
> > > executing within an interrupt context so we can't try to lock a mutex
> > > lest we end up sleeping.
> > > 
> > > One option we have is as you suggested: we disallow changing size while
> > > events are enabled. However, that will require us to keep track of when
> > > events are disabled and implement a spinlock to ensure that we don't
> > > disable events in the middle of a kfifo_in().
> > > 
> > > Alternatively, we could change events_lock to a spinlock and use it to
> > > protect all these operations on the counter->events fifo. Would this
> > > alternative be a better option so that we avoid creating another
> > > separate lock?
> > 
> > I would recommend just having a single lock here if at all possible,
> > until you determine that there is a performance problem that can be
> > measured that would require it to be split up.
> > 
> > thanks,
> > 
> > greg k-h
> 
> All right let's go with a single events_lock spinlock then. David, if
> you make those changes and submit a v2, I'll be okay with this patch and
> can provide my ack for it.

Wait, no, you need one patch to remove the atomic lock for the open
"protection" and then another one for the other locks.  The original
patch here was fine, but can be part of a patch series, don't lump them
all together into one huge change.

thanks,

greg k-h
William Breathitt Gray Oct. 19, 2021, 7:46 a.m. UTC | #13
On Tue, Oct 19, 2021 at 09:29:17AM +0200, Greg KH wrote:
> On Tue, Oct 19, 2021 at 04:18:42PM +0900, William Breathitt Gray wrote:
> > On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> > > On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> > > > On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > > > > On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > > > > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > > > >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > > > > >> index 1ccd771da25f..7bf8882ff54d 100644
> > > > > >> --- a/drivers/counter/counter-sysfs.c
> > > > > >> +++ b/drivers/counter/counter-sysfs.c
> > > > > >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > > > > >>   					   u64 val)
> > > > > >>   {
> > > > > >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> > > > > >> -	int err = 0;
> > > > > >> -
> > > > > >> -	/* Ensure chrdev is not opened more than 1 at a time */
> > > > > >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > > > > >> -		return -EBUSY;
> > > > > >> +	int err;
> > > > > >>   
> > > > > >>   	/* Allocate new events queue */
> > > > > >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > > > > >>   	if (err)
> > > > > >> -		goto exit_early;
> > > > > >> +		return err;
> > > > > >>   
> > > > > >>   	/* Swap in new events queue */
> > > > > >>   	kfifo_free(&counter->events);
> > > > > >>   	counter->events.kfifo = events.kfifo;
> > > > > > 
> > > > > > Do we need to hold the events_lock mutex here for this swap in case
> > > > > > counter_chrdev_read() is in the middle of reading the kfifo to
> > > > > > userspace, or do the kfifo macros already protect us from a race
> > > > > > condition here?
> > > > > > 
> > > > > Another possibility might be to disallow changing the size while
> > > > > events are enabled. Otherwise, we also need to protect against
> > > > > write after free.
> > > > > 
> > > > > I considered this:
> > > > > 
> > > > > 	swap(counter->events.kfifo, events.kfifo);
> > > > > 	kfifo_free(&events);
> > > > > 
> > > > > But I'm not sure that would be safe enough.
> > > > 
> > > > I think it depends on whether it's safe to call kfifo_free() while other
> > > > kfifo_*() calls are executing. I suspect it is not safe because I don't
> > > > think kfifo_free() waits until all kfifo read/write operations are
> > > > finished before freeing -- but if I'm wrong here please let me know.
> > > > 
> > > > Because of that, we will need to hold the counter->events_lock after all so
> > > > that we don't modify the events fifo while a kfifo read/write is going
> > > > on, lest we suffer an address fault. This can happen regardless of
> > > > whether you swap before or after the kfifo_free() because the old fifo
> > > > address could still be in use within those uncompleted kfifo_*() calls
> > > > if they were called before the swap but don't complete before the
> > > > kfifo_free().
> > > > 
> > > > So we have a problem now that I think you have already noticed: the
> > > > kfifo_in() call in counter_push_events() also needs protection, but it's
> > > > executing within an interrupt context so we can't try to lock a mutex
> > > > lest we end up sleeping.
> > > > 
> > > > One option we have is as you suggested: we disallow changing size while
> > > > events are enabled. However, that will require us to keep track of when
> > > > events are disabled and implement a spinlock to ensure that we don't
> > > > disable events in the middle of a kfifo_in().
> > > > 
> > > > Alternatively, we could change events_lock to a spinlock and use it to
> > > > protect all these operations on the counter->events fifo. Would this
> > > > alternative be a better option so that we avoid creating another
> > > > separate lock?
> > > 
> > > I would recommend just having a single lock here if at all possible,
> > > until you determine that there a performance problem that can be
> > > measured that would require it to be split up.
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > All right let's go with a single events_lock spinlock then. David, if
> > you make those changes and submit a v2, I'll be okay with this patch and
> > can provide my ack for it.
> 
> Wait, no, you need one patch to remove the atomic lock for the open
> "protection" and then another one for the other locks.  The original
> patch here was fine, but can be part of a patch series, don't lump them
> all together into one huge change.
> 
> thanks,
> 
> greg k-h

Understood. I'll provide my ack for this patch here then.

Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Greg KH Oct. 19, 2021, 9:36 a.m. UTC | #14
On Tue, Oct 19, 2021 at 04:46:07PM +0900, William Breathitt Gray wrote:
> On Tue, Oct 19, 2021 at 09:29:17AM +0200, Greg KH wrote:
> > On Tue, Oct 19, 2021 at 04:18:42PM +0900, William Breathitt Gray wrote:
> > > On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> > > > On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> > > > > On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > > > > > On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > > > > > > On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > > > > > >> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > > > > > >> index 1ccd771da25f..7bf8882ff54d 100644
> > > > > > >> --- a/drivers/counter/counter-sysfs.c
> > > > > > >> +++ b/drivers/counter/counter-sysfs.c
> > > > > > >> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > > > > > >>   					   u64 val)
> > > > > > >>   {
> > > > > > >>   	DECLARE_KFIFO_PTR(events, struct counter_event);
> > > > > > >> -	int err = 0;
> > > > > > >> -
> > > > > > >> -	/* Ensure chrdev is not opened more than 1 at a time */
> > > > > > >> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > > > > > >> -		return -EBUSY;
> > > > > > >> +	int err;
> > > > > > >>   
> > > > > > >>   	/* Allocate new events queue */
> > > > > > >>   	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > > > > > >>   	if (err)
> > > > > > >> -		goto exit_early;
> > > > > > >> +		return err;
> > > > > > >>   
> > > > > > >>   	/* Swap in new events queue */
> > > > > > >>   	kfifo_free(&counter->events);
> > > > > > >>   	counter->events.kfifo = events.kfifo;
> > > > > > > 
> > > > > > > Do we need to hold the events_lock mutex here for this swap in case
> > > > > > > counter_chrdev_read() is in the middle of reading the kfifo to
> > > > > > > userspace, or do the kfifo macros already protect us from a race
> > > > > > > condition here?
> > > > > > > 
> > > > > > Another possibility might be to disallow changing the size while
> > > > > > events are enabled. Otherwise, we also need to protect against
> > > > > > write after free.
> > > > > > 
> > > > > > I considered this:
> > > > > > 
> > > > > > 	swap(counter->events.kfifo, events.kfifo);
> > > > > > 	kfifo_free(&events);
> > > > > > 
> > > > > > But I'm not sure that would be safe enough.
> > > > > 
> > > > > I think it depends on whether it's safe to call kfifo_free() while other
> > > > > kfifo_*() calls are executing. I suspect it is not safe because I don't
> > > > > think kfifo_free() waits until all kfifo read/write operations are
> > > > > finished before freeing -- but if I'm wrong here please let me know.
> > > > > 
> > > > > Because of that, we will need to hold the counter->events_lock after all so
> > > > > that we don't modify the events fifo while a kfifo read/write is going
> > > > > on, lest we suffer an address fault. This can happen regardless of
> > > > > whether you swap before or after the kfifo_free() because the old fifo
> > > > > address could still be in use within those uncompleted kfifo_*() calls
> > > > > if they were called before the swap but don't complete before the
> > > > > kfifo_free().
> > > > > 
> > > > > So we have a problem now that I think you have already noticed: the
> > > > > kfifo_in() call in counter_push_events() also needs protection, but it's
> > > > > executing within an interrupt context so we can't try to lock a mutex
> > > > > lest we end up sleeping.
> > > > > 
> > > > > One option we have is as you suggested: we disallow changing size while
> > > > > events are enabled. However, that will require us to keep track of when
> > > > > events are disabled and implement a spinlock to ensure that we don't
> > > > > disable events in the middle of a kfifo_in().
> > > > > 
> > > > > Alternatively, we could change events_lock to a spinlock and use it to
> > > > > protect all these operations on the counter->events fifo. Would this
> > > > > alternative be a better option so that we avoid creating another
> > > > > separate lock?
> > > > 
> > > > I would recommend just having a single lock here if at all possible,
> > > > until you determine that there is a performance problem that can be
> > > > measured that would require it to be split up.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > All right let's go with a single events_lock spinlock then. David, if
> > > you make those changes and submit a v2, I'll be okay with this patch and
> > > can provide my ack for it.
> > 
> > Wait, no, you need one patch to remove the atomic lock for the open
> > "protection" and then another one for the other locks.  The original
> > patch here was fine, but can be part of a patch series, don't lump them
> > all together into one huge change.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Understood. I'll provide my ack for this patch here then.
> 
> Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com>

Thanks, now queued up!

greg k-h
David Lechner Oct. 19, 2021, 2:44 p.m. UTC | #15
On 10/19/21 2:18 AM, William Breathitt Gray wrote:
> On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
>> On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
>>> On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
>>>> On 10/18/21 4:14 AM, William Breathitt Gray wrote:
>>>>> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
>>>>>> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
>>>>>> index 1ccd771da25f..7bf8882ff54d 100644
>>>>>> --- a/drivers/counter/counter-sysfs.c
>>>>>> +++ b/drivers/counter/counter-sysfs.c
>>>>>> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
>>>>>>    					   u64 val)
>>>>>>    {
>>>>>>    	DECLARE_KFIFO_PTR(events, struct counter_event);
>>>>>> -	int err = 0;
>>>>>> -
>>>>>> -	/* Ensure chrdev is not opened more than 1 at a time */
>>>>>> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
>>>>>> -		return -EBUSY;
>>>>>> +	int err;
>>>>>>    
>>>>>>    	/* Allocate new events queue */
>>>>>>    	err = kfifo_alloc(&events, val, GFP_KERNEL);
>>>>>>    	if (err)
>>>>>> -		goto exit_early;
>>>>>> +		return err;
>>>>>>    
>>>>>>    	/* Swap in new events queue */
>>>>>>    	kfifo_free(&counter->events);
>>>>>>    	counter->events.kfifo = events.kfifo;
>>>>>
>>>>> Do we need to hold the events_lock mutex here for this swap in case
>>>>> counter_chrdev_read() is in the middle of reading the kfifo to
>>>>> userspace, or do the kfifo macros already protect us from a race
>>>>> condition here?
>>>>>
>>>> Another possibility might be to disallow changing the size while
>>>> events are enabled. Otherwise, we also need to protect against
>>>> write after free.
>>>>
>>>> I considered this:
>>>>
>>>> 	swap(counter->events.kfifo, events.kfifo);
>>>> 	kfifo_free(&events);
>>>>
>>>> But I'm not sure that would be safe enough.
>>>
>>> I think it depends on whether it's safe to call kfifo_free() while other
>>> kfifo_*() calls are executing. I suspect it is not safe because I don't
>>> think kfifo_free() waits until all kfifo read/write operations are
>>> finished before freeing -- but if I'm wrong here please let me know.
>>>
>>> Because of that, we will need to hold the counter->events_lock after all so
>>> that we don't modify the events fifo while a kfifo read/write is going
>>> on, lest we suffer an address fault. This can happen regardless of
>>> whether you swap before or after the kfifo_free() because the old fifo
>>> address could still be in use within those uncompleted kfifo_*() calls
>>> if they were called before the swap but don't complete before the
>>> kfifo_free().
>>>
>>> So we have a problem now that I think you have already noticed: the
>>> kfifo_in() call in counter_push_events() also needs protection, but it's
>>> executing within an interrupt context so we can't try to lock a mutex
>>> lest we end up sleeping.
>>>
>>> One option we have is as you suggested: we disallow changing size while
>>> events are enabled. However, that will require us to keep track of when
>>> events are disabled and implement a spinlock to ensure that we don't
>>> disable events in the middle of a kfifo_in().
>>>
>>> Alternatively, we could change events_lock to a spinlock and use it to
>>> protect all these operations on the counter->events fifo. Would this
>>> alternative be a better option so that we avoid creating another
>>> separate lock?
>>
>> I would recommend just having a single lock here if at all possible,
>> until you determine that there is a performance problem that can be
>> measured that would require it to be split up.
>>
>> thanks,
>>
>> greg k-h
> 
> All right let's go with a single events_lock spinlock then. David, if
> you make those changes and submit a v2, I'll be okay with this patch and
> can provide my ack for it.
> 

We can't use a spin lock for everything since there are operations
that can sleep that need to be in the critical sections. Likewise,
we can't use a mutex for everything since some critical sections
are in interrupt handlers. But, I suppose we can try combining
the existing mutexes. Since the kfifo is accessed from both
contexts, it seems like it still needs more consideration than
just a mutex or a spin lock, e.g. if events are enabled, don't
allow swapping out the kfifo buffer.
William Breathitt Gray Oct. 19, 2021, 8:59 p.m. UTC | #16
On Tue, Oct 19, 2021 at 09:44:04AM -0500, David Lechner wrote:
> On 10/19/21 2:18 AM, William Breathitt Gray wrote:
> > On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> >> On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> >>> On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> >>>> On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> >>>>> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> >>>>>> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> >>>>>> index 1ccd771da25f..7bf8882ff54d 100644
> >>>>>> --- a/drivers/counter/counter-sysfs.c
> >>>>>> +++ b/drivers/counter/counter-sysfs.c
> >>>>>> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> >>>>>>    					   u64 val)
> >>>>>>    {
> >>>>>>    	DECLARE_KFIFO_PTR(events, struct counter_event);
> >>>>>> -	int err = 0;
> >>>>>> -
> >>>>>> -	/* Ensure chrdev is not opened more than 1 at a time */
> >>>>>> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> >>>>>> -		return -EBUSY;
> >>>>>> +	int err;
> >>>>>>    
> >>>>>>    	/* Allocate new events queue */
> >>>>>>    	err = kfifo_alloc(&events, val, GFP_KERNEL);
> >>>>>>    	if (err)
> >>>>>> -		goto exit_early;
> >>>>>> +		return err;
> >>>>>>    
> >>>>>>    	/* Swap in new events queue */
> >>>>>>    	kfifo_free(&counter->events);
> >>>>>>    	counter->events.kfifo = events.kfifo;
> >>>>>
> >>>>> Do we need to hold the events_lock mutex here for this swap in case
> >>>>> counter_chrdev_read() is in the middle of reading the kfifo to
> >>>>> userspace, or do the kfifo macros already protect us from a race
> >>>>> condition here?
> >>>>>
> >>>> Another possibility might be to disallow changing the size while
> >>>> events are enabled. Otherwise, we also need to protect against
> >>>> write after free.
> >>>>
> >>>> I considered this:
> >>>>
> >>>> 	swap(counter->events.kfifo, events.kfifo);
> >>>> 	kfifo_free(&events);
> >>>>
> >>>> But I'm not sure that would be safe enough.
> >>>
> >>> I think it depends on whether it's safe to call kfifo_free() while other
> >>> kfifo_*() calls are executing. I suspect it is not safe because I don't
> >>> think kfifo_free() waits until all kfifo read/write operations are
> >>> finished before freeing -- but if I'm wrong here please let me know.
> >>>
> >>> Because of that, we will need to hold the counter->events_lock after all so
> >>> that we don't modify the events fifo while a kfifo read/write is going
> >>> on, lest we suffer an address fault. This can happen regardless of
> >>> whether you swap before or after the kfifo_free() because the old fifo
> >>> address could still be in use within those uncompleted kfifo_*() calls
> >>> if they were called before the swap but don't complete before the
> >>> kfifo_free().
> >>>
> >>> So we have a problem now that I think you have already noticed: the
> >>> kfifo_in() call in counter_push_events() also needs protection, but it's
> >>> executing within an interrupt context so we can't try to lock a mutex
> >>> lest we end up sleeping.
> >>>
> >>> One option we have is as you suggested: we disallow changing size while
> >>> events are enabled. However, that will require us to keep track of when
> >>> events are disabled and implement a spinlock to ensure that we don't
> >>> disable events in the middle of a kfifo_in().
> >>>
> >>> Alternatively, we could change events_lock to a spinlock and use it to
> >>> protect all these operations on the counter->events fifo. Would this
> >>> alternative be a better option so that we avoid creating another
> >>> separate lock?
> >>
> >> I would recommend just having a single lock here if at all possible,
> >> until you determine that there is a performance problem that can be
> >> measured that would require it to be split up.
> >>
> >> thanks,
> >>
> >> greg k-h
> > 
> > All right let's go with a single events_lock spinlock then. David, if
> > you make those changes and submit a v2, I'll be okay with this patch and
> > can provide my ack for it.
> > 
> 
> We can't use a spin lock for everything since there are operations
> that can sleep that need to be in the critical sections. Likewise,
> we can't use a mutex for everything since some critical sections
> are in interrupt handlers. But, I suppose we can try combining
> the existing mutexes. Since the kfifo is accessed from both
> contexts, it seems like it still needs more consideration than
> just a mutex or a spin lock, e.g. if events are enabled, don't
> allow swapping out the kfifo buffer.

I think there is a deadlock case if we combine the ops_exist_lock with
the n_events_list_lock, so this will need further thought. However, at
the very least the swap occuring in counter_events_queue_size_write()
and the kfifo_in() in counter_push_events() require some sort of
locking; it is trivial to cause a page fault with the code in its
current state.

I think this can be fixed if just events_lock is changed to spinlock for
now without modifying the other locks. We can try to combine the
remaining locks in a subsequent patch, if they are capable of being
combined.
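
Concretely, the push path could then take the same lock from interrupt
context (a sketch, assuming the spinlock conversion; the helper name is
illustrative):

    /* sketch: pushing an event with events_lock as a spinlock */
    static void counter_push_event_locked(struct counter_device *const counter,
                                          const struct counter_event *const event)
    {
            unsigned long flags;

            /* interrupt context: a mutex could sleep here, a spinlock cannot */
            spin_lock_irqsave(&counter->events_lock, flags);
            kfifo_in(&counter->events, event, 1);
            spin_unlock_irqrestore(&counter->events_lock, flags);

            wake_up_poll(&counter->events_wait, EPOLLIN);
    }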

William Breathitt Gray
William Breathitt Gray Oct. 20, 2021, 5:42 a.m. UTC | #17
On Wed, Oct 20, 2021 at 05:59:32AM +0900, William Breathitt Gray wrote:
> On Tue, Oct 19, 2021 at 09:44:04AM -0500, David Lechner wrote:
> > On 10/19/21 2:18 AM, William Breathitt Gray wrote:
> > > On Tue, Oct 19, 2021 at 09:07:48AM +0200, Greg KH wrote:
> > >> On Tue, Oct 19, 2021 at 03:53:08PM +0900, William Breathitt Gray wrote:
> > >>> On Mon, Oct 18, 2021 at 11:03:49AM -0500, David Lechner wrote:
> > >>>> On 10/18/21 4:14 AM, William Breathitt Gray wrote:
> > >>>>> On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
> > >>>>>> diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
> > >>>>>> index 1ccd771da25f..7bf8882ff54d 100644
> > >>>>>> --- a/drivers/counter/counter-sysfs.c
> > >>>>>> +++ b/drivers/counter/counter-sysfs.c
> > >>>>>> @@ -796,25 +796,18 @@ static int counter_events_queue_size_write(struct counter_device *counter,
> > >>>>>>    					   u64 val)
> > >>>>>>    {
> > >>>>>>    	DECLARE_KFIFO_PTR(events, struct counter_event);
> > >>>>>> -	int err = 0;
> > >>>>>> -
> > >>>>>> -	/* Ensure chrdev is not opened more than 1 at a time */
> > >>>>>> -	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
> > >>>>>> -		return -EBUSY;
> > >>>>>> +	int err;
> > >>>>>>    
> > >>>>>>    	/* Allocate new events queue */
> > >>>>>>    	err = kfifo_alloc(&events, val, GFP_KERNEL);
> > >>>>>>    	if (err)
> > >>>>>> -		goto exit_early;
> > >>>>>> +		return err;
> > >>>>>>    
> > >>>>>>    	/* Swap in new events queue */
> > >>>>>>    	kfifo_free(&counter->events);
> > >>>>>>    	counter->events.kfifo = events.kfifo;
> > >>>>>
> > >>>>> Do we need to hold the events_lock mutex here for this swap in case
> > >>>>> counter_chrdev_read() is in the middle of reading the kfifo to
> > >>>>> userspace, or do the kfifo macros already protect us from a race
> > >>>>> condition here?
> > >>>>>
> > >>>> Another possibility might be to disallow changing the size while
> > >>>> events are enabled. Otherwise, we also need to protect against
> > >>>> write after free.
> > >>>>
> > >>>> I considered this:
> > >>>>
> > >>>> 	swap(counter->events.kfifo, events.kfifo);
> > >>>> 	kfifo_free(&events);
> > >>>>
> > >>>> But I'm not sure that would be safe enough.
> > >>>
> > >>> I think it depends on whether it's safe to call kfifo_free() while other
> > >>> kfifo_*() calls are executing. I suspect it is not safe because I don't
> > >>> think kfifo_free() waits until all kfifo read/write operations are
> > >>> finished before freeing -- but if I'm wrong here please let me know.
> > >>>
> > >>> Because of that, we will need to hold the counter->events_lock after all so
> > >>> that we don't modify the events fifo while a kfifo read/write is going
> > >>> on, lest we suffer an address fault. This can happen regardless of
> > >>> whether you swap before or after the kfifo_free() because the old fifo
> > >>> address could still be in use within those uncompleted kfifo_*() calls
> > >>> if they were called before the swap but don't complete before the
> > >>> kfifo_free().
> > >>>
> > >>> So we have a problem now that I think you have already noticed: the
> > >>> kfifo_in() call in counter_push_events() also needs protection, but it's
> > >>> executing within an interrupt context so we can't try to lock a mutex
> > >>> lest we end up sleeping.
> > >>>
> > >>> One option we have is as you suggested: we disallow changing size while
> > >>> events are enabled. However, that will require us to keep track of when
> > >>> events are disabled and implement a spinlock to ensure that we don't
> > >>> disable events in the middle of a kfifo_in().
> > >>>
> > >>> Alternatively, we could change events_lock to a spinlock and use it to
> > >>> protect all these operations on the counter->events fifo. Would this
> > >>> alternative be a better option so that we avoid creating another
> > >>> separate lock?
> > >>
> > >> I would recommend just having a single lock here if at all possible,
> > >> until you determine that there is a performance problem that can be
> > >> measured that would require it to be split up.
> > >>
> > >> thanks,
> > >>
> > >> greg k-h
> > > 
> > > All right, let's go with a single events_lock spinlock then. David, if
> > > you make those changes and submit a v2, I'll be okay with this patch and
> > > can provide my ack for it.
> > > 
> > 
> > We can't use a spin lock for everything, since some critical sections
> > contain operations that can sleep. Likewise, we can't use a mutex for
> > everything, since some critical sections run in interrupt handlers.
> > But I suppose we can try combining
> > the existing mutexes. Since the kfifo is accessed from both
> > contexts, it seems like it still needs more consideration than
> > just a mutex or a spin lock, e.g. if events are enabled, don't
> > allow swapping out the kfifo buffer.
> 
> I think there is a deadlock case if we combine the ops_exist_lock with
> the n_events_list_lock, so this will need further thought. However, at
> the very least the swap occurring in counter_events_queue_size_write()
> and the kfifo_in() in counter_push_events() require some sort of
> locking; it is trivial to cause a page fault with the code in its
> current state.
> 
> I think this can be fixed if just events_lock is changed to a spinlock for
> now without modifying the other locks. We can try to combine the
> remaining locks in a subsequent patch, if they are capable of being
> combined.
> 
> William Breathitt Gray

After considering this further, kfifo_to_user() may sleep, so we can't
use a spinlock here after all. As such, events_lock should remain a
mutex, and instead we'll only allow swapping out the kfifo
buffer when events are disabled.

William Breathitt Gray
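
A minimal sketch of that final direction, assuming a hypothetical
events_enabled flag maintained under events_lock by the events
enable/disable path (the flag and the sketch_resize_events() name are
illustrative only and not part of the patch below):

#include <linux/counter.h>
#include <linux/kfifo.h>
#include <linux/mutex.h>
#include <linux/slab.h>

static int sketch_resize_events(struct counter_device *const counter,
				const u64 val)
{
	DECLARE_KFIFO_PTR(events, struct counter_event);
	int err;

	/* Allocate the replacement queue before taking the lock. */
	err = kfifo_alloc(&events, val, GFP_KERNEL);
	if (err)
		return err;

	mutex_lock(&counter->events_lock);
	if (counter->events_enabled) {	/* hypothetical flag */
		mutex_unlock(&counter->events_lock);
		kfifo_free(&events);
		return -EBUSY;
	}
	/* Events are disabled, so counter_push_events() is not calling
	 * kfifo_in() concurrently and the swap cannot race with it. */
	kfifo_free(&counter->events);
	counter->events.kfifo = events.kfifo;
	mutex_unlock(&counter->events_lock);

	return 0;
}
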
diff mbox series

Patch

diff --git a/drivers/counter/counter-chrdev.c b/drivers/counter/counter-chrdev.c
index 967c94ae95bb..b747dc81cfc6 100644
--- a/drivers/counter/counter-chrdev.c
+++ b/drivers/counter/counter-chrdev.c
@@ -384,10 +384,6 @@  static int counter_chrdev_open(struct inode *inode, struct file *filp)
 							    typeof(*counter),
 							    chrdev);
 
-	/* Ensure chrdev is not opened more than 1 at a time */
-	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
-		return -EBUSY;
-
 	get_device(&counter->dev);
 	filp->private_data = counter;
 
@@ -419,7 +415,6 @@  static int counter_chrdev_release(struct inode *inode, struct file *filp)
 	mutex_unlock(&counter->ops_exist_lock);
 
 	put_device(&counter->dev);
-	atomic_dec(&counter->chrdev_lock);
 
 	return ret;
 }
@@ -445,7 +440,6 @@  int counter_chrdev_add(struct counter_device *const counter)
 	mutex_init(&counter->events_lock);
 
 	/* Initialize character device */
-	atomic_set(&counter->chrdev_lock, 0);
 	cdev_init(&counter->chrdev, &counter_fops);
 
 	/* Allocate Counter events queue */
diff --git a/drivers/counter/counter-sysfs.c b/drivers/counter/counter-sysfs.c
index 1ccd771da25f..7bf8882ff54d 100644
--- a/drivers/counter/counter-sysfs.c
+++ b/drivers/counter/counter-sysfs.c
@@ -796,25 +796,18 @@  static int counter_events_queue_size_write(struct counter_device *counter,
 					   u64 val)
 {
 	DECLARE_KFIFO_PTR(events, struct counter_event);
-	int err = 0;
-
-	/* Ensure chrdev is not opened more than 1 at a time */
-	if (!atomic_add_unless(&counter->chrdev_lock, 1, 1))
-		return -EBUSY;
+	int err;
 
 	/* Allocate new events queue */
 	err = kfifo_alloc(&events, val, GFP_KERNEL);
 	if (err)
-		goto exit_early;
+		return err;
 
 	/* Swap in new events queue */
 	kfifo_free(&counter->events);
 	counter->events.kfifo = events.kfifo;
 
-exit_early:
-	atomic_dec(&counter->chrdev_lock);
-
-	return err;
+	return 0;
 }
 
 static struct counter_comp counter_num_signals_comp =
diff --git a/include/linux/counter.h b/include/linux/counter.h
index 22b14a552b1d..0fd99e255a50 100644
--- a/include/linux/counter.h
+++ b/include/linux/counter.h
@@ -297,7 +297,6 @@  struct counter_ops {
  * @events:		queue of detected Counter events
  * @events_wait:	wait queue to allow blocking reads of Counter events
  * @events_lock:	lock to protect Counter events queue read operations
- * @chrdev_lock:	lock to limit chrdev to a single open at a time
  * @ops_exist_lock:	lock to prevent use during removal
  */
 struct counter_device {
@@ -325,12 +324,6 @@  struct counter_device {
 	DECLARE_KFIFO_PTR(events, struct counter_event);
 	wait_queue_head_t events_wait;
 	struct mutex events_lock;
-	/*
-	 * chrdev_lock is locked by counter_chrdev_open() and unlocked by
-	 * counter_chrdev_release(), so a mutex is not possible here because
-	 * chrdev_lock will invariably be held when returning to user space
-	 */
-	atomic_t chrdev_lock;
 	struct mutex ops_exist_lock;
 };
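
The hunks above delete the atomic_add_unless() single-open check. A
short userspace sketch (the /dev/counter0 path stands in for any
counter chrdev) shows why such a check cannot actually limit the
device to one user:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* The one open() the old chrdev_lock would have permitted. */
	int fd = open("/dev/counter0", O_RDONLY);
	int fd2;

	if (fd < 0)
		return 1;

	/* dup() creates a second descriptor to the same open file
	 * without re-entering counter_chrdev_open(). */
	fd2 = dup(fd);

	/* fork() hands both descriptors to a second process, again with
	 * no open(); SCM_RIGHTS over a UNIX socket can do the same for
	 * an unrelated process. */
	if (fork() == 0)
		printf("child holds fds %d and %d\n", fd, fd2);

	return 0;
}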