diff mbox

[1/2] freezer: add unsafe versions of freezable helpers

Message ID 1367615050-3894-1-git-send-email-ccross@android.com (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

Colin Cross May 3, 2013, 9:04 p.m. UTC
NFS calls the freezable helpers with locks held, which is unsafe
and caused lockdep warnings when 6aa9707 "lockdep: check that no
locks held at freeze time" was applied (reverted in dbf520a).
Add new *_unsafe versions of the helpers that will not run the
lockdep test when 6aa9707 is reapplied, and call them from NFS.

Signed-off-by: Colin Cross <ccross@android.com>
---
 fs/nfs/inode.c          |  2 +-
 fs/nfs/nfs3proc.c       |  2 +-
 fs/nfs/nfs4proc.c       |  4 ++--
 include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
 net/sunrpc/sched.c      |  2 +-
 5 files changed, 46 insertions(+), 6 deletions(-)

Comments

Pavel Machek May 4, 2013, 1 p.m. UTC | #1
Hi!

> NFS calls the freezable helpers with locks held, which is unsafe
> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> locks held at freeze time" was applied (reverted in dbf520a).
> Add new *_unsafe versions of the helpers that will not run the
> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> 
> Signed-off-by: Colin Cross <ccross@android.com>

Looks mostly good.

> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
>  	freezer_count();						\
>  })
>  
> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> +#define freezable_schedule_unsafe()					\
> +({									\
> +	freezer_do_not_count();						\
> +	schedule();							\
> +	freezer_count_unsafe();						\
> +})
> +

Make it inline function? :-). Add short explanation why it is good
idea?

> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> +#define freezable_schedule_timeout_killable_unsafe(timeout)		\
> +({									\
> +	long __retval;							\
> +	freezer_do_not_count();						\
> +	__retval = schedule_timeout_killable(timeout);			\
> +	freezer_count_unsafe();						\
> +	__retval;							\
> +})

Function too?
									Pavel
Colin Cross May 4, 2013, 8:23 p.m. UTC | #2
On Sat, May 4, 2013 at 6:00 AM, Pavel Machek <pavel@ucw.cz> wrote:
> Hi!
>
>> NFS calls the freezable helpers with locks held, which is unsafe
>> and caused lockdep warnings when 6aa9707 "lockdep: check that no
>> locks held at freeze time" was applied (reverted in dbf520a).
>> Add new *_unsafe versions of the helpers that will not run the
>> lockdep test when 6aa9707 is reapplied, and call them from NFS.
>>
>> Signed-off-by: Colin Cross <ccross@android.com>
>
> Looks mostly good.
>
>> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
>>       freezer_count();                                                \
>>  })
>>
>> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> +#define freezable_schedule_unsafe()                                  \
>> +({                                                                   \
>> +     freezer_do_not_count();                                         \
>> +     schedule();                                                     \
>> +     freezer_count_unsafe();                                         \
>> +})
>> +
>
> Make it inline function? :-). Add short explanation why it is good
> idea?

These are exact copies of the existing non-unsafe versions, except
they call freezer_count_unsafe() instead of freezer_count().  The next
version of my other patch stack that goes on top of this has a patch
to convert the macros in this file to static inline functions (at
least the ones that can be).  I'd rather not mix it in with this patch
for ease of comparison with the existing calls.

>> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
>> +({                                                                   \
>> +     long __retval;                                                  \
>> +     freezer_do_not_count();                                         \
>> +     __retval = schedule_timeout_killable(timeout);                  \
>> +     freezer_count_unsafe();                                         \
>> +     __retval;                                                       \
>> +})
>
> Function too?
>                                                                         Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 4, 2013, 10:55 p.m. UTC | #3
Hi!

> >> NFS calls the freezable helpers with locks held, which is unsafe
> >> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> >> locks held at freeze time" was applied (reverted in dbf520a).
> >> Add new *_unsafe versions of the helpers that will not run the
> >> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> >>
> >> Signed-off-by: Colin Cross <ccross@android.com>
> >
> > Looks mostly good.
> >
> >> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
> >>       freezer_count();                                                \
> >>  })
> >>
> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> +#define freezable_schedule_unsafe()                                  \
> >> +({                                                                   \
> >> +     freezer_do_not_count();                                         \
> >> +     schedule();                                                     \
> >> +     freezer_count_unsafe();                                         \
> >> +})
> >> +
> >
> > Make it inline function? :-). Add short explanation why it is good
> > idea?
> 
> These are exact copies of the existing non-unsafe versions, except
> they call freezer_count_unsafe() instead of freezer_count().  The next
> version of my other patch stack that goes on top of this has a patch
> to convert the macros in this file to static inline functions (at
> least the ones that can be).  I'd rather not mix it in with this patch
> for ease of comparison with the existing calls.

Ok.

Acked-by: Pavel Machek <pavel@ucw.cz>
Ingo Molnar May 5, 2013, 9:23 a.m. UTC | #4
* Colin Cross <ccross@android.com> wrote:

> NFS calls the freezable helpers with locks held, which is unsafe
> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> locks held at freeze time" was applied (reverted in dbf520a).
> Add new *_unsafe versions of the helpers that will not run the
> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> 
> Signed-off-by: Colin Cross <ccross@android.com>
> ---
>  fs/nfs/inode.c          |  2 +-
>  fs/nfs/nfs3proc.c       |  2 +-
>  fs/nfs/nfs4proc.c       |  4 ++--
>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
>  net/sunrpc/sched.c      |  2 +-
>  5 files changed, 46 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 1f94167..53cbee5 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
>  {
>  	if (fatal_signal_pending(current))
>  		return -ERESTARTSYS;
> -	freezable_schedule();
> +	freezable_schedule_unsafe();

I'd suggest naming such variants _unkillable() instead of _unsafe().

There's nothing inherently 'unsafe' about it: the user asked for a hard 
NFS mount and is getting it: with the side effect that it exposes the 
machine to network delays in a 'hard' way as well. Which means suspend may 
block indefinitely as well on network failure.

So it's two conflicting user requirements: 'hard NFS mount' and 'suspend 
now'. We pick the lesser evil, the requirement that is considered higher 
prio: the hard NFS mount in this case.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Machek May 5, 2013, 10:12 p.m. UTC | #5
Hi!

> > NFS calls the freezable helpers with locks held, which is unsafe
> > and caused lockdep warnings when 6aa9707 "lockdep: check that no
> > locks held at freeze time" was applied (reverted in dbf520a).
> > Add new *_unsafe versions of the helpers that will not run the
> > lockdep test when 6aa9707 is reapplied, and call them from NFS.
> > 
> > Signed-off-by: Colin Cross <ccross@android.com>
> > ---
> >  fs/nfs/inode.c          |  2 +-
> >  fs/nfs/nfs3proc.c       |  2 +-
> >  fs/nfs/nfs4proc.c       |  4 ++--
> >  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
> >  net/sunrpc/sched.c      |  2 +-
> >  5 files changed, 46 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index 1f94167..53cbee5 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
> >  {
> >  	if (fatal_signal_pending(current))
> >  		return -ERESTARTSYS;
> > -	freezable_schedule();
> > +	freezable_schedule_unsafe();
> 
> I'd suggest naming such variants _unkillable() instead of _unsafe().
> 
> There's nothing inherently 'unsafe' about it: the user asked for a hard 
> NFS mount and is getting it: with the side effect that it exposes the 
> machine to network delays in a 'hard' way as well. Which means suspend may 
> block indefinitely as well on network failure.

You only want to use _unsafe() variants when you enter refrigerator
with locks held.

And entering refrigerator with locks is tricky... and unsafe :-). It
is not directly related to killability.

									Pavel
Peter Zijlstra May 6, 2013, 8:50 a.m. UTC | #6
On Fri, May 03, 2013 at 02:04:09PM -0700, Colin Cross wrote:
> NFS calls the freezable helpers with locks held, which is unsafe
> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> locks held at freeze time" was applied (reverted in dbf520a).
> Add new *_unsafe versions of the helpers that will not run the
> lockdep test when 6aa9707 is reapplied, and call them from NFS.

Am I the only one that would like a bit more information about why NFS does
this and why we need to work around it?

From replies in this thread I surmise its got something to do with hard NFS
mounts. And I suppose I could go apply google to lkml and try and find the
previous discussion, but really this should be in the Changelog.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton May 6, 2013, 10:56 a.m. UTC | #7
On Fri,  3 May 2013 14:04:09 -0700
Colin Cross <ccross@android.com> wrote:

> NFS calls the freezable helpers with locks held, which is unsafe
> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> locks held at freeze time" was applied (reverted in dbf520a).
> Add new *_unsafe versions of the helpers that will not run the
> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> 
> Signed-off-by: Colin Cross <ccross@android.com>
> ---
>  fs/nfs/inode.c          |  2 +-
>  fs/nfs/nfs3proc.c       |  2 +-
>  fs/nfs/nfs4proc.c       |  4 ++--
>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
>  net/sunrpc/sched.c      |  2 +-
>  5 files changed, 46 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 1f94167..53cbee5 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
>  {
>  	if (fatal_signal_pending(current))
>  		return -ERESTARTSYS;
> -	freezable_schedule();
> +	freezable_schedule_unsafe();
>  	return 0;
>  }
>  EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
> index 43ea96c..ce90eb4 100644
> --- a/fs/nfs/nfs3proc.c
> +++ b/fs/nfs/nfs3proc.c
> @@ -33,7 +33,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
>  		res = rpc_call_sync(clnt, msg, flags);
>  		if (res != -EJUKEBOX)
>  			break;
> -		freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
> +		freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
>  		res = -ERESTARTSYS;
>  	} while (!fatal_signal_pending(current));
>  	return res;
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 0ad025e..a236077 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -266,7 +266,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
>  		*timeout = NFS4_POLL_RETRY_MIN;
>  	if (*timeout > NFS4_POLL_RETRY_MAX)
>  		*timeout = NFS4_POLL_RETRY_MAX;
> -	freezable_schedule_timeout_killable(*timeout);
> +	freezable_schedule_timeout_killable_unsafe(*timeout);
>  	if (fatal_signal_pending(current))
>  		res = -ERESTARTSYS;
>  	*timeout <<= 1;
> @@ -4309,7 +4309,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
>  static unsigned long
>  nfs4_set_lock_task_retry(unsigned long timeout)
>  {
> -	freezable_schedule_timeout_killable(timeout);
> +	freezable_schedule_timeout_killable_unsafe(timeout);
>  	timeout <<= 1;
>  	if (timeout > NFS4_LOCK_MAXTIMEOUT)
>  		return NFS4_LOCK_MAXTIMEOUT;
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index e70df40..5b31e21c 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -46,7 +46,11 @@ extern int freeze_kernel_threads(void);
>  extern void thaw_processes(void);
>  extern void thaw_kernel_threads(void);
>  
> -static inline bool try_to_freeze(void)
> +/*
> + * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
> + * If try_to_freeze causes a lockdep warning it means the caller may deadlock
> + */
> +static inline bool try_to_freeze_unsafe(void)
>  {
>  	might_sleep();
>  	if (likely(!freezing(current)))
> @@ -54,6 +58,11 @@ static inline bool try_to_freeze(void)
>  	return __refrigerator(false);
>  }
>  
> +static inline bool try_to_freeze(void)
> +{
> +	return try_to_freeze_unsafe();
> +}
> +
>  extern bool freeze_task(struct task_struct *p);
>  extern bool set_freezable(void);
>  
> @@ -115,6 +124,14 @@ static inline void freezer_count(void)
>  	try_to_freeze();
>  }
>  
> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> +static inline void freezer_count_unsafe(void)
> +{
> +	current->flags &= ~PF_FREEZER_SKIP;
> +	smp_mb();
> +	try_to_freeze_unsafe();
> +}
> +
>  /**
>   * freezer_should_skip - whether to skip a task when determining frozen
>   *			 state is reached
> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
>  	freezer_count();						\
>  })
>  
> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> +#define freezable_schedule_unsafe()					\
> +({									\
> +	freezer_do_not_count();						\
> +	schedule();							\
> +	freezer_count_unsafe();						\
> +})
> +
>  /* Like schedule_timeout_killable(), but should not block the freezer. */
>  #define freezable_schedule_timeout_killable(timeout)			\
>  ({									\
> @@ -162,6 +187,16 @@ static inline bool freezer_should_skip(struct task_struct *p)
>  	__retval;							\
>  })
>  
> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> +#define freezable_schedule_timeout_killable_unsafe(timeout)		\
> +({									\
> +	long __retval;							\
> +	freezer_do_not_count();						\
> +	__retval = schedule_timeout_killable(timeout);			\
> +	freezer_count_unsafe();						\
> +	__retval;							\
> +})
> +
>  /*
>   * Freezer-friendly wrappers around wait_event_interruptible(),
>   * wait_event_killable() and wait_event_interruptible_timeout(), originally
> @@ -225,9 +260,14 @@ static inline void set_freezable(void) {}
>  
>  #define freezable_schedule()  schedule()
>  
> +#define freezable_schedule_unsafe()  schedule()
> +
>  #define freezable_schedule_timeout_killable(timeout)			\
>  	schedule_timeout_killable(timeout)
>  
> +#define freezable_schedule_timeout_killable_unsafe(timeout)		\
> +	schedule_timeout_killable(timeout)
> +
>  #define wait_event_freezable(wq, condition)				\
>  		wait_event_interruptible(wq, condition)
>  
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index f8529fc..8dcfadc 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -254,7 +254,7 @@ static int rpc_wait_bit_killable(void *word)
>  {
>  	if (fatal_signal_pending(current))
>  		return -ERESTARTSYS;
> -	freezable_schedule();
> +	freezable_schedule_unsafe();
>  	return 0;
>  }
>  

Looks reasonable, but note that CIFS uses wait_event_freezekillable
with locks held too, which will likely have the same problem and will
need the same workaround for now.
Jeff Layton May 6, 2013, 10:58 a.m. UTC | #8
On Mon, 6 May 2013 10:50:25 +0200
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Fri, May 03, 2013 at 02:04:09PM -0700, Colin Cross wrote:
> > NFS calls the freezable helpers with locks held, which is unsafe
> > and caused lockdep warnings when 6aa9707 "lockdep: check that no
> > locks held at freeze time" was applied (reverted in dbf520a).
> > Add new *_unsafe versions of the helpers that will not run the
> > lockdep test when 6aa9707 is reapplied, and call them from NFS.
> 
> Am I the only one that would like a bit more information about why NFS does
> this and why we need to work around it?
> 

NFS does this because this is how we initially "fixed" it to not block
the freezer. It sucks and is prone to deadlocking if you selectively
freeze processes via the cgroup freezer. It does basically "work" in
most cases though if the freezer is running to suspend the whole system.

I'm looking at fixing this by instead setting points within the NFS/RPC
layers that cause these calls to return something like ERESTARTSYS
instead when a freeze event comes in, depending on whether a call has
already been transmitted to the server.

That's likely to take a while though, and in the meantime we don't want
lockdep complaining every time someone fires off a RPC call with locks
held. So for now, we'd like to "opt out" of these lockdep warnings.
Tejun Heo May 6, 2013, 6:55 p.m. UTC | #9
On Fri, May 03, 2013 at 02:04:09PM -0700, Colin Cross wrote:
> NFS calls the freezable helpers with locks held, which is unsafe
> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> locks held at freeze time" was applied (reverted in dbf520a).
> Add new *_unsafe versions of the helpers that will not run the
> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> 
> Signed-off-by: Colin Cross <ccross@android.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.
Colin Cross May 6, 2013, 7:57 p.m. UTC | #10
On Mon, May 6, 2013 at 3:56 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Fri,  3 May 2013 14:04:09 -0700
> Colin Cross <ccross@android.com> wrote:
>
>> NFS calls the freezable helpers with locks held, which is unsafe
>> and caused lockdep warnings when 6aa9707 "lockdep: check that no
>> locks held at freeze time" was applied (reverted in dbf520a).
>> Add new *_unsafe versions of the helpers that will not run the
>> lockdep test when 6aa9707 is reapplied, and call them from NFS.
>>
>> Signed-off-by: Colin Cross <ccross@android.com>
>> ---
>>  fs/nfs/inode.c          |  2 +-
>>  fs/nfs/nfs3proc.c       |  2 +-
>>  fs/nfs/nfs4proc.c       |  4 ++--
>>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
>>  net/sunrpc/sched.c      |  2 +-
>>  5 files changed, 46 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> index 1f94167..53cbee5 100644
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
>>  {
>>       if (fatal_signal_pending(current))
>>               return -ERESTARTSYS;
>> -     freezable_schedule();
>> +     freezable_schedule_unsafe();
>>       return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
>> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
>> index 43ea96c..ce90eb4 100644
>> --- a/fs/nfs/nfs3proc.c
>> +++ b/fs/nfs/nfs3proc.c
>> @@ -33,7 +33,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
>>               res = rpc_call_sync(clnt, msg, flags);
>>               if (res != -EJUKEBOX)
>>                       break;
>> -             freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
>> +             freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
>>               res = -ERESTARTSYS;
>>       } while (!fatal_signal_pending(current));
>>       return res;
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 0ad025e..a236077 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -266,7 +266,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
>>               *timeout = NFS4_POLL_RETRY_MIN;
>>       if (*timeout > NFS4_POLL_RETRY_MAX)
>>               *timeout = NFS4_POLL_RETRY_MAX;
>> -     freezable_schedule_timeout_killable(*timeout);
>> +     freezable_schedule_timeout_killable_unsafe(*timeout);
>>       if (fatal_signal_pending(current))
>>               res = -ERESTARTSYS;
>>       *timeout <<= 1;
>> @@ -4309,7 +4309,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
>>  static unsigned long
>>  nfs4_set_lock_task_retry(unsigned long timeout)
>>  {
>> -     freezable_schedule_timeout_killable(timeout);
>> +     freezable_schedule_timeout_killable_unsafe(timeout);
>>       timeout <<= 1;
>>       if (timeout > NFS4_LOCK_MAXTIMEOUT)
>>               return NFS4_LOCK_MAXTIMEOUT;
>> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
>> index e70df40..5b31e21c 100644
>> --- a/include/linux/freezer.h
>> +++ b/include/linux/freezer.h
>> @@ -46,7 +46,11 @@ extern int freeze_kernel_threads(void);
>>  extern void thaw_processes(void);
>>  extern void thaw_kernel_threads(void);
>>
>> -static inline bool try_to_freeze(void)
>> +/*
>> + * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
>> + * If try_to_freeze causes a lockdep warning it means the caller may deadlock
>> + */
>> +static inline bool try_to_freeze_unsafe(void)
>>  {
>>       might_sleep();
>>       if (likely(!freezing(current)))
>> @@ -54,6 +58,11 @@ static inline bool try_to_freeze(void)
>>       return __refrigerator(false);
>>  }
>>
>> +static inline bool try_to_freeze(void)
>> +{
>> +     return try_to_freeze_unsafe();
>> +}
>> +
>>  extern bool freeze_task(struct task_struct *p);
>>  extern bool set_freezable(void);
>>
>> @@ -115,6 +124,14 @@ static inline void freezer_count(void)
>>       try_to_freeze();
>>  }
>>
>> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> +static inline void freezer_count_unsafe(void)
>> +{
>> +     current->flags &= ~PF_FREEZER_SKIP;
>> +     smp_mb();
>> +     try_to_freeze_unsafe();
>> +}
>> +
>>  /**
>>   * freezer_should_skip - whether to skip a task when determining frozen
>>   *                    state is reached
>> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
>>       freezer_count();                                                \
>>  })
>>
>> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> +#define freezable_schedule_unsafe()                                  \
>> +({                                                                   \
>> +     freezer_do_not_count();                                         \
>> +     schedule();                                                     \
>> +     freezer_count_unsafe();                                         \
>> +})
>> +
>>  /* Like schedule_timeout_killable(), but should not block the freezer. */
>>  #define freezable_schedule_timeout_killable(timeout)                 \
>>  ({                                                                   \
>> @@ -162,6 +187,16 @@ static inline bool freezer_should_skip(struct task_struct *p)
>>       __retval;                                                       \
>>  })
>>
>> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
>> +({                                                                   \
>> +     long __retval;                                                  \
>> +     freezer_do_not_count();                                         \
>> +     __retval = schedule_timeout_killable(timeout);                  \
>> +     freezer_count_unsafe();                                         \
>> +     __retval;                                                       \
>> +})
>> +
>>  /*
>>   * Freezer-friendly wrappers around wait_event_interruptible(),
>>   * wait_event_killable() and wait_event_interruptible_timeout(), originally
>> @@ -225,9 +260,14 @@ static inline void set_freezable(void) {}
>>
>>  #define freezable_schedule()  schedule()
>>
>> +#define freezable_schedule_unsafe()  schedule()
>> +
>>  #define freezable_schedule_timeout_killable(timeout)                 \
>>       schedule_timeout_killable(timeout)
>>
>> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
>> +     schedule_timeout_killable(timeout)
>> +
>>  #define wait_event_freezable(wq, condition)                          \
>>               wait_event_interruptible(wq, condition)
>>
>> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> index f8529fc..8dcfadc 100644
>> --- a/net/sunrpc/sched.c
>> +++ b/net/sunrpc/sched.c
>> @@ -254,7 +254,7 @@ static int rpc_wait_bit_killable(void *word)
>>  {
>>       if (fatal_signal_pending(current))
>>               return -ERESTARTSYS;
>> -     freezable_schedule();
>> +     freezable_schedule_unsafe();
>>       return 0;
>>  }
>>
>
> Looks reasonable, but note that CIFS uses wait_event_freezekillable
> with locks held too, which will likely have the same problem and will
> need the same workaround for now.

I didn't see any lockdep warnings reported on the mailing list when
the lockdep patch was previously applied, can you point me to the lock
that is held when wait_event_freezkillable is called?  I don't want to
add an _unsafe call where its not needed and cause more confusion.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton May 6, 2013, 9:43 p.m. UTC | #11
On Mon, 6 May 2013 12:57:54 -0700
Colin Cross <ccross@android.com> wrote:

> On Mon, May 6, 2013 at 3:56 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Fri,  3 May 2013 14:04:09 -0700
> > Colin Cross <ccross@android.com> wrote:
> >
> >> NFS calls the freezable helpers with locks held, which is unsafe
> >> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> >> locks held at freeze time" was applied (reverted in dbf520a).
> >> Add new *_unsafe versions of the helpers that will not run the
> >> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> >>
> >> Signed-off-by: Colin Cross <ccross@android.com>
> >> ---
> >>  fs/nfs/inode.c          |  2 +-
> >>  fs/nfs/nfs3proc.c       |  2 +-
> >>  fs/nfs/nfs4proc.c       |  4 ++--
> >>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
> >>  net/sunrpc/sched.c      |  2 +-
> >>  5 files changed, 46 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> >> index 1f94167..53cbee5 100644
> >> --- a/fs/nfs/inode.c
> >> +++ b/fs/nfs/inode.c
> >> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
> >>  {
> >>       if (fatal_signal_pending(current))
> >>               return -ERESTARTSYS;
> >> -     freezable_schedule();
> >> +     freezable_schedule_unsafe();
> >>       return 0;
> >>  }
> >>  EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
> >> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
> >> index 43ea96c..ce90eb4 100644
> >> --- a/fs/nfs/nfs3proc.c
> >> +++ b/fs/nfs/nfs3proc.c
> >> @@ -33,7 +33,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
> >>               res = rpc_call_sync(clnt, msg, flags);
> >>               if (res != -EJUKEBOX)
> >>                       break;
> >> -             freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
> >> +             freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
> >>               res = -ERESTARTSYS;
> >>       } while (!fatal_signal_pending(current));
> >>       return res;
> >> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> >> index 0ad025e..a236077 100644
> >> --- a/fs/nfs/nfs4proc.c
> >> +++ b/fs/nfs/nfs4proc.c
> >> @@ -266,7 +266,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
> >>               *timeout = NFS4_POLL_RETRY_MIN;
> >>       if (*timeout > NFS4_POLL_RETRY_MAX)
> >>               *timeout = NFS4_POLL_RETRY_MAX;
> >> -     freezable_schedule_timeout_killable(*timeout);
> >> +     freezable_schedule_timeout_killable_unsafe(*timeout);
> >>       if (fatal_signal_pending(current))
> >>               res = -ERESTARTSYS;
> >>       *timeout <<= 1;
> >> @@ -4309,7 +4309,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
> >>  static unsigned long
> >>  nfs4_set_lock_task_retry(unsigned long timeout)
> >>  {
> >> -     freezable_schedule_timeout_killable(timeout);
> >> +     freezable_schedule_timeout_killable_unsafe(timeout);
> >>       timeout <<= 1;
> >>       if (timeout > NFS4_LOCK_MAXTIMEOUT)
> >>               return NFS4_LOCK_MAXTIMEOUT;
> >> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> >> index e70df40..5b31e21c 100644
> >> --- a/include/linux/freezer.h
> >> +++ b/include/linux/freezer.h
> >> @@ -46,7 +46,11 @@ extern int freeze_kernel_threads(void);
> >>  extern void thaw_processes(void);
> >>  extern void thaw_kernel_threads(void);
> >>
> >> -static inline bool try_to_freeze(void)
> >> +/*
> >> + * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
> >> + * If try_to_freeze causes a lockdep warning it means the caller may deadlock
> >> + */
> >> +static inline bool try_to_freeze_unsafe(void)
> >>  {
> >>       might_sleep();
> >>       if (likely(!freezing(current)))
> >> @@ -54,6 +58,11 @@ static inline bool try_to_freeze(void)
> >>       return __refrigerator(false);
> >>  }
> >>
> >> +static inline bool try_to_freeze(void)
> >> +{
> >> +     return try_to_freeze_unsafe();
> >> +}
> >> +
> >>  extern bool freeze_task(struct task_struct *p);
> >>  extern bool set_freezable(void);
> >>
> >> @@ -115,6 +124,14 @@ static inline void freezer_count(void)
> >>       try_to_freeze();
> >>  }
> >>
> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> +static inline void freezer_count_unsafe(void)
> >> +{
> >> +     current->flags &= ~PF_FREEZER_SKIP;
> >> +     smp_mb();
> >> +     try_to_freeze_unsafe();
> >> +}
> >> +
> >>  /**
> >>   * freezer_should_skip - whether to skip a task when determining frozen
> >>   *                    state is reached
> >> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
> >>       freezer_count();                                                \
> >>  })
> >>
> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> +#define freezable_schedule_unsafe()                                  \
> >> +({                                                                   \
> >> +     freezer_do_not_count();                                         \
> >> +     schedule();                                                     \
> >> +     freezer_count_unsafe();                                         \
> >> +})
> >> +
> >>  /* Like schedule_timeout_killable(), but should not block the freezer. */
> >>  #define freezable_schedule_timeout_killable(timeout)                 \
> >>  ({                                                                   \
> >> @@ -162,6 +187,16 @@ static inline bool freezer_should_skip(struct task_struct *p)
> >>       __retval;                                                       \
> >>  })
> >>
> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
> >> +({                                                                   \
> >> +     long __retval;                                                  \
> >> +     freezer_do_not_count();                                         \
> >> +     __retval = schedule_timeout_killable(timeout);                  \
> >> +     freezer_count_unsafe();                                         \
> >> +     __retval;                                                       \
> >> +})
> >> +
> >>  /*
> >>   * Freezer-friendly wrappers around wait_event_interruptible(),
> >>   * wait_event_killable() and wait_event_interruptible_timeout(), originally
> >> @@ -225,9 +260,14 @@ static inline void set_freezable(void) {}
> >>
> >>  #define freezable_schedule()  schedule()
> >>
> >> +#define freezable_schedule_unsafe()  schedule()
> >> +
> >>  #define freezable_schedule_timeout_killable(timeout)                 \
> >>       schedule_timeout_killable(timeout)
> >>
> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
> >> +     schedule_timeout_killable(timeout)
> >> +
> >>  #define wait_event_freezable(wq, condition)                          \
> >>               wait_event_interruptible(wq, condition)
> >>
> >> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> >> index f8529fc..8dcfadc 100644
> >> --- a/net/sunrpc/sched.c
> >> +++ b/net/sunrpc/sched.c
> >> @@ -254,7 +254,7 @@ static int rpc_wait_bit_killable(void *word)
> >>  {
> >>       if (fatal_signal_pending(current))
> >>               return -ERESTARTSYS;
> >> -     freezable_schedule();
> >> +     freezable_schedule_unsafe();
> >>       return 0;
> >>  }
> >>
> >
> > Looks reasonable, but note that CIFS uses wait_event_freezekillable
> > with locks held too, which will likely have the same problem and will
> > need the same workaround for now.
> 
> I didn't see any lockdep warnings reported on the mailing list when
> the lockdep patch was previously applied, can you point me to the lock
> that is held when wait_event_freezkillable is called?  I don't want to
> add an _unsafe call where its not needed and cause more confusion.

It's pretty much all of the same VFS-level locks...

Basically, when a process wants to send a synchronous SMB to a CIFS
server, it'll send off the request and then call wait_for_response() to
wait on the reply. If you need a particular call stack, then you can
look in the rmdir() codepath as an example:

vfs_rmdir takes the i_mutex
   cifs_rmdir (via the inode ops)
      CIFSSMBRmDir (via the smb version ops)
          SendReceive
               wait_for_response

...at that point a freeze can occur while you're still holding the
i_mutex.

There are many other possibilities for other codepaths that end up in
wait_for_response(). Once we get a solution in place for NFS, we'll
need to do something very similar for CIFS.
Colin Cross May 6, 2013, 9:54 p.m. UTC | #12
On Mon, May 6, 2013 at 2:43 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Mon, 6 May 2013 12:57:54 -0700
> Colin Cross <ccross@android.com> wrote:
>
>> On Mon, May 6, 2013 at 3:56 AM, Jeff Layton <jlayton@redhat.com> wrote:
>> > On Fri,  3 May 2013 14:04:09 -0700
>> > Colin Cross <ccross@android.com> wrote:
>> >
>> >> NFS calls the freezable helpers with locks held, which is unsafe
>> >> and caused lockdep warnings when 6aa9707 "lockdep: check that no
>> >> locks held at freeze time" was applied (reverted in dbf520a).
>> >> Add new *_unsafe versions of the helpers that will not run the
>> >> lockdep test when 6aa9707 is reapplied, and call them from NFS.
>> >>
>> >> Signed-off-by: Colin Cross <ccross@android.com>
>> >> ---
>> >>  fs/nfs/inode.c          |  2 +-
>> >>  fs/nfs/nfs3proc.c       |  2 +-
>> >>  fs/nfs/nfs4proc.c       |  4 ++--
>> >>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
>> >>  net/sunrpc/sched.c      |  2 +-
>> >>  5 files changed, 46 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> >> index 1f94167..53cbee5 100644
>> >> --- a/fs/nfs/inode.c
>> >> +++ b/fs/nfs/inode.c
>> >> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
>> >>  {
>> >>       if (fatal_signal_pending(current))
>> >>               return -ERESTARTSYS;
>> >> -     freezable_schedule();
>> >> +     freezable_schedule_unsafe();
>> >>       return 0;
>> >>  }
>> >>  EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
>> >> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
>> >> index 43ea96c..ce90eb4 100644
>> >> --- a/fs/nfs/nfs3proc.c
>> >> +++ b/fs/nfs/nfs3proc.c
>> >> @@ -33,7 +33,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
>> >>               res = rpc_call_sync(clnt, msg, flags);
>> >>               if (res != -EJUKEBOX)
>> >>                       break;
>> >> -             freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
>> >> +             freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
>> >>               res = -ERESTARTSYS;
>> >>       } while (!fatal_signal_pending(current));
>> >>       return res;
>> >> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> >> index 0ad025e..a236077 100644
>> >> --- a/fs/nfs/nfs4proc.c
>> >> +++ b/fs/nfs/nfs4proc.c
>> >> @@ -266,7 +266,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
>> >>               *timeout = NFS4_POLL_RETRY_MIN;
>> >>       if (*timeout > NFS4_POLL_RETRY_MAX)
>> >>               *timeout = NFS4_POLL_RETRY_MAX;
>> >> -     freezable_schedule_timeout_killable(*timeout);
>> >> +     freezable_schedule_timeout_killable_unsafe(*timeout);
>> >>       if (fatal_signal_pending(current))
>> >>               res = -ERESTARTSYS;
>> >>       *timeout <<= 1;
>> >> @@ -4309,7 +4309,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
>> >>  static unsigned long
>> >>  nfs4_set_lock_task_retry(unsigned long timeout)
>> >>  {
>> >> -     freezable_schedule_timeout_killable(timeout);
>> >> +     freezable_schedule_timeout_killable_unsafe(timeout);
>> >>       timeout <<= 1;
>> >>       if (timeout > NFS4_LOCK_MAXTIMEOUT)
>> >>               return NFS4_LOCK_MAXTIMEOUT;
>> >> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
>> >> index e70df40..5b31e21c 100644
>> >> --- a/include/linux/freezer.h
>> >> +++ b/include/linux/freezer.h
>> >> @@ -46,7 +46,11 @@ extern int freeze_kernel_threads(void);
>> >>  extern void thaw_processes(void);
>> >>  extern void thaw_kernel_threads(void);
>> >>
>> >> -static inline bool try_to_freeze(void)
>> >> +/*
>> >> + * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
>> >> + * If try_to_freeze causes a lockdep warning it means the caller may deadlock
>> >> + */
>> >> +static inline bool try_to_freeze_unsafe(void)
>> >>  {
>> >>       might_sleep();
>> >>       if (likely(!freezing(current)))
>> >> @@ -54,6 +58,11 @@ static inline bool try_to_freeze(void)
>> >>       return __refrigerator(false);
>> >>  }
>> >>
>> >> +static inline bool try_to_freeze(void)
>> >> +{
>> >> +     return try_to_freeze_unsafe();
>> >> +}
>> >> +
>> >>  extern bool freeze_task(struct task_struct *p);
>> >>  extern bool set_freezable(void);
>> >>
>> >> @@ -115,6 +124,14 @@ static inline void freezer_count(void)
>> >>       try_to_freeze();
>> >>  }
>> >>
>> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> >> +static inline void freezer_count_unsafe(void)
>> >> +{
>> >> +     current->flags &= ~PF_FREEZER_SKIP;
>> >> +     smp_mb();
>> >> +     try_to_freeze_unsafe();
>> >> +}
>> >> +
>> >>  /**
>> >>   * freezer_should_skip - whether to skip a task when determining frozen
>> >>   *                    state is reached
>> >> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
>> >>       freezer_count();                                                \
>> >>  })
>> >>
>> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> >> +#define freezable_schedule_unsafe()                                  \
>> >> +({                                                                   \
>> >> +     freezer_do_not_count();                                         \
>> >> +     schedule();                                                     \
>> >> +     freezer_count_unsafe();                                         \
>> >> +})
>> >> +
>> >>  /* Like schedule_timeout_killable(), but should not block the freezer. */
>> >>  #define freezable_schedule_timeout_killable(timeout)                 \
>> >>  ({                                                                   \
>> >> @@ -162,6 +187,16 @@ static inline bool freezer_should_skip(struct task_struct *p)
>> >>       __retval;                                                       \
>> >>  })
>> >>
>> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
>> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
>> >> +({                                                                   \
>> >> +     long __retval;                                                  \
>> >> +     freezer_do_not_count();                                         \
>> >> +     __retval = schedule_timeout_killable(timeout);                  \
>> >> +     freezer_count_unsafe();                                         \
>> >> +     __retval;                                                       \
>> >> +})
>> >> +
>> >>  /*
>> >>   * Freezer-friendly wrappers around wait_event_interruptible(),
>> >>   * wait_event_killable() and wait_event_interruptible_timeout(), originally
>> >> @@ -225,9 +260,14 @@ static inline void set_freezable(void) {}
>> >>
>> >>  #define freezable_schedule()  schedule()
>> >>
>> >> +#define freezable_schedule_unsafe()  schedule()
>> >> +
>> >>  #define freezable_schedule_timeout_killable(timeout)                 \
>> >>       schedule_timeout_killable(timeout)
>> >>
>> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
>> >> +     schedule_timeout_killable(timeout)
>> >> +
>> >>  #define wait_event_freezable(wq, condition)                          \
>> >>               wait_event_interruptible(wq, condition)
>> >>
>> >> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> >> index f8529fc..8dcfadc 100644
>> >> --- a/net/sunrpc/sched.c
>> >> +++ b/net/sunrpc/sched.c
>> >> @@ -254,7 +254,7 @@ static int rpc_wait_bit_killable(void *word)
>> >>  {
>> >>       if (fatal_signal_pending(current))
>> >>               return -ERESTARTSYS;
>> >> -     freezable_schedule();
>> >> +     freezable_schedule_unsafe();
>> >>       return 0;
>> >>  }
>> >>
>> >
>> > Looks reasonable, but note that CIFS uses wait_event_freezekillable
>> > with locks held too, which will likely have the same problem and will
>> > need the same workaround for now.
>>
>> I didn't see any lockdep warnings reported on the mailing list when
>> the lockdep patch was previously applied, can you point me to the lock
>> that is held when wait_event_freezkillable is called?  I don't want to
>> add an _unsafe call where its not needed and cause more confusion.
>
> It's pretty much all of the same VFS-level locks...
>
> Basically, when a process wants to send a synchronous SMB to a CIFS
> server, it'll send off the request and then call wait_for_response() to
> wait on the reply. If you need a particular call stack, then you can
> look in the rmdir() codepath as an example:
>
> vfs_rmdir takes the i_mutex
>    cifs_rmdir (via the inode ops)
>       CIFSSMBRmDir (via the smb version ops)
>           SendReceive
>                wait_for_response
>
> ...at that point a freeze can occur while you're still holding the
> i_mutex.
>
> There are many other possibilities for other codepaths that end up in
> wait_for_response(). Once we get a solution in place for NFS, we'll
> need to do something very similar for CIFS.

Makes sense, I will add CIFS to the patch.  Would you prefer it in the
same or separate patches.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds May 6, 2013, 9:58 p.m. UTC | #13
On Mon, May 6, 2013 at 2:54 PM, Colin Cross <ccross@android.com> wrote:
>>
>> There are many other possibilities for other codepaths that end up in
>> wait_for_response(). Once we get a solution in place for NFS, we'll
>> need to do something very similar for CIFS.
>
> Makes sense, I will add CIFS to the patch.  Would you prefer it in the
> same or separate patches.

Quite frankly, is it worth resurrecting these patches at all?

The only things it actually complained about are not worth the pain
fixing and are getting explicitly not warned about - is there any
reason to believe the patches are worth maintaining and the extra
complexity is worth it?

           Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton May 6, 2013, 9:59 p.m. UTC | #14
On Mon, 6 May 2013 14:54:53 -0700
Colin Cross <ccross@android.com> wrote:

> On Mon, May 6, 2013 at 2:43 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Mon, 6 May 2013 12:57:54 -0700
> > Colin Cross <ccross@android.com> wrote:
> >
> >> On Mon, May 6, 2013 at 3:56 AM, Jeff Layton <jlayton@redhat.com> wrote:
> >> > On Fri,  3 May 2013 14:04:09 -0700
> >> > Colin Cross <ccross@android.com> wrote:
> >> >
> >> >> NFS calls the freezable helpers with locks held, which is unsafe
> >> >> and caused lockdep warnings when 6aa9707 "lockdep: check that no
> >> >> locks held at freeze time" was applied (reverted in dbf520a).
> >> >> Add new *_unsafe versions of the helpers that will not run the
> >> >> lockdep test when 6aa9707 is reapplied, and call them from NFS.
> >> >>
> >> >> Signed-off-by: Colin Cross <ccross@android.com>
> >> >> ---
> >> >>  fs/nfs/inode.c          |  2 +-
> >> >>  fs/nfs/nfs3proc.c       |  2 +-
> >> >>  fs/nfs/nfs4proc.c       |  4 ++--
> >> >>  include/linux/freezer.h | 42 +++++++++++++++++++++++++++++++++++++++++-
> >> >>  net/sunrpc/sched.c      |  2 +-
> >> >>  5 files changed, 46 insertions(+), 6 deletions(-)
> >> >>
> >> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> >> >> index 1f94167..53cbee5 100644
> >> >> --- a/fs/nfs/inode.c
> >> >> +++ b/fs/nfs/inode.c
> >> >> @@ -79,7 +79,7 @@ int nfs_wait_bit_killable(void *word)
> >> >>  {
> >> >>       if (fatal_signal_pending(current))
> >> >>               return -ERESTARTSYS;
> >> >> -     freezable_schedule();
> >> >> +     freezable_schedule_unsafe();
> >> >>       return 0;
> >> >>  }
> >> >>  EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
> >> >> diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
> >> >> index 43ea96c..ce90eb4 100644
> >> >> --- a/fs/nfs/nfs3proc.c
> >> >> +++ b/fs/nfs/nfs3proc.c
> >> >> @@ -33,7 +33,7 @@ nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
> >> >>               res = rpc_call_sync(clnt, msg, flags);
> >> >>               if (res != -EJUKEBOX)
> >> >>                       break;
> >> >> -             freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
> >> >> +             freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
> >> >>               res = -ERESTARTSYS;
> >> >>       } while (!fatal_signal_pending(current));
> >> >>       return res;
> >> >> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> >> >> index 0ad025e..a236077 100644
> >> >> --- a/fs/nfs/nfs4proc.c
> >> >> +++ b/fs/nfs/nfs4proc.c
> >> >> @@ -266,7 +266,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
> >> >>               *timeout = NFS4_POLL_RETRY_MIN;
> >> >>       if (*timeout > NFS4_POLL_RETRY_MAX)
> >> >>               *timeout = NFS4_POLL_RETRY_MAX;
> >> >> -     freezable_schedule_timeout_killable(*timeout);
> >> >> +     freezable_schedule_timeout_killable_unsafe(*timeout);
> >> >>       if (fatal_signal_pending(current))
> >> >>               res = -ERESTARTSYS;
> >> >>       *timeout <<= 1;
> >> >> @@ -4309,7 +4309,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
> >> >>  static unsigned long
> >> >>  nfs4_set_lock_task_retry(unsigned long timeout)
> >> >>  {
> >> >> -     freezable_schedule_timeout_killable(timeout);
> >> >> +     freezable_schedule_timeout_killable_unsafe(timeout);
> >> >>       timeout <<= 1;
> >> >>       if (timeout > NFS4_LOCK_MAXTIMEOUT)
> >> >>               return NFS4_LOCK_MAXTIMEOUT;
> >> >> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> >> >> index e70df40..5b31e21c 100644
> >> >> --- a/include/linux/freezer.h
> >> >> +++ b/include/linux/freezer.h
> >> >> @@ -46,7 +46,11 @@ extern int freeze_kernel_threads(void);
> >> >>  extern void thaw_processes(void);
> >> >>  extern void thaw_kernel_threads(void);
> >> >>
> >> >> -static inline bool try_to_freeze(void)
> >> >> +/*
> >> >> + * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
> >> >> + * If try_to_freeze causes a lockdep warning it means the caller may deadlock
> >> >> + */
> >> >> +static inline bool try_to_freeze_unsafe(void)
> >> >>  {
> >> >>       might_sleep();
> >> >>       if (likely(!freezing(current)))
> >> >> @@ -54,6 +58,11 @@ static inline bool try_to_freeze(void)
> >> >>       return __refrigerator(false);
> >> >>  }
> >> >>
> >> >> +static inline bool try_to_freeze(void)
> >> >> +{
> >> >> +     return try_to_freeze_unsafe();
> >> >> +}
> >> >> +
> >> >>  extern bool freeze_task(struct task_struct *p);
> >> >>  extern bool set_freezable(void);
> >> >>
> >> >> @@ -115,6 +124,14 @@ static inline void freezer_count(void)
> >> >>       try_to_freeze();
> >> >>  }
> >> >>
> >> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> >> +static inline void freezer_count_unsafe(void)
> >> >> +{
> >> >> +     current->flags &= ~PF_FREEZER_SKIP;
> >> >> +     smp_mb();
> >> >> +     try_to_freeze_unsafe();
> >> >> +}
> >> >> +
> >> >>  /**
> >> >>   * freezer_should_skip - whether to skip a task when determining frozen
> >> >>   *                    state is reached
> >> >> @@ -152,6 +169,14 @@ static inline bool freezer_should_skip(struct task_struct *p)
> >> >>       freezer_count();                                                \
> >> >>  })
> >> >>
> >> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> >> +#define freezable_schedule_unsafe()                                  \
> >> >> +({                                                                   \
> >> >> +     freezer_do_not_count();                                         \
> >> >> +     schedule();                                                     \
> >> >> +     freezer_count_unsafe();                                         \
> >> >> +})
> >> >> +
> >> >>  /* Like schedule_timeout_killable(), but should not block the freezer. */
> >> >>  #define freezable_schedule_timeout_killable(timeout)                 \
> >> >>  ({                                                                   \
> >> >> @@ -162,6 +187,16 @@ static inline bool freezer_should_skip(struct task_struct *p)
> >> >>       __retval;                                                       \
> >> >>  })
> >> >>
> >> >> +/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
> >> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
> >> >> +({                                                                   \
> >> >> +     long __retval;                                                  \
> >> >> +     freezer_do_not_count();                                         \
> >> >> +     __retval = schedule_timeout_killable(timeout);                  \
> >> >> +     freezer_count_unsafe();                                         \
> >> >> +     __retval;                                                       \
> >> >> +})
> >> >> +
> >> >>  /*
> >> >>   * Freezer-friendly wrappers around wait_event_interruptible(),
> >> >>   * wait_event_killable() and wait_event_interruptible_timeout(), originally
> >> >> @@ -225,9 +260,14 @@ static inline void set_freezable(void) {}
> >> >>
> >> >>  #define freezable_schedule()  schedule()
> >> >>
> >> >> +#define freezable_schedule_unsafe()  schedule()
> >> >> +
> >> >>  #define freezable_schedule_timeout_killable(timeout)                 \
> >> >>       schedule_timeout_killable(timeout)
> >> >>
> >> >> +#define freezable_schedule_timeout_killable_unsafe(timeout)          \
> >> >> +     schedule_timeout_killable(timeout)
> >> >> +
> >> >>  #define wait_event_freezable(wq, condition)                          \
> >> >>               wait_event_interruptible(wq, condition)
> >> >>
> >> >> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> >> >> index f8529fc..8dcfadc 100644
> >> >> --- a/net/sunrpc/sched.c
> >> >> +++ b/net/sunrpc/sched.c
> >> >> @@ -254,7 +254,7 @@ static int rpc_wait_bit_killable(void *word)
> >> >>  {
> >> >>       if (fatal_signal_pending(current))
> >> >>               return -ERESTARTSYS;
> >> >> -     freezable_schedule();
> >> >> +     freezable_schedule_unsafe();
> >> >>       return 0;
> >> >>  }
> >> >>
> >> >
> >> > Looks reasonable, but note that CIFS uses wait_event_freezekillable
> >> > with locks held too, which will likely have the same problem and will
> >> > need the same workaround for now.
> >>
> >> I didn't see any lockdep warnings reported on the mailing list when
> >> the lockdep patch was previously applied, can you point me to the lock
> >> that is held when wait_event_freezkillable is called?  I don't want to
> >> add an _unsafe call where its not needed and cause more confusion.
> >
> > It's pretty much all of the same VFS-level locks...
> >
> > Basically, when a process wants to send a synchronous SMB to a CIFS
> > server, it'll send off the request and then call wait_for_response() to
> > wait on the reply. If you need a particular call stack, then you can
> > look in the rmdir() codepath as an example:
> >
> > vfs_rmdir takes the i_mutex
> >    cifs_rmdir (via the inode ops)
> >       CIFSSMBRmDir (via the smb version ops)
> >           SendReceive
> >                wait_for_response
> >
> > ...at that point a freeze can occur while you're still holding the
> > i_mutex.
> >
> > There are many other possibilities for other codepaths that end up in
> > wait_for_response(). Once we get a solution in place for NFS, we'll
> > need to do something very similar for CIFS.
> 
> Makes sense, I will add CIFS to the patch.  Would you prefer it in the
> same or separate patches.

A new patch on top of this one would be fine with me, but I don't feel
strongly either way.

Thanks,
Jeff Layton May 6, 2013, 10:05 p.m. UTC | #15
On Mon, 6 May 2013 14:58:31 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, May 6, 2013 at 2:54 PM, Colin Cross <ccross@android.com> wrote:
> >>
> >> There are many other possibilities for other codepaths that end up in
> >> wait_for_response(). Once we get a solution in place for NFS, we'll
> >> need to do something very similar for CIFS.
> >
> > Makes sense, I will add CIFS to the patch.  Would you prefer it in the
> > same or separate patches.
> 
> Quite frankly, is it worth resurrecting these patches at all?
> 
> The only things it actually complained about are not worth the pain
> fixing and are getting explicitly not warned about - is there any
> reason to believe the patches are worth maintaining and the extra
> complexity is worth it?
> 
>            Linus

Well, these problems are worth the pain of fixing, I think. It's just
going to take us a while to get there since it involves some
significant surgery.

As to whether the warnings themselves are worthwhile now that we're
excluding the most egregious offenders from them, I don't much care
either way.
Colin Cross May 6, 2013, 10:11 p.m. UTC | #16
On Mon, May 6, 2013 at 2:58 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, May 6, 2013 at 2:54 PM, Colin Cross <ccross@android.com> wrote:
>>>
>>> There are many other possibilities for other codepaths that end up in
>>> wait_for_response(). Once we get a solution in place for NFS, we'll
>>> need to do something very similar for CIFS.
>>
>> Makes sense, I will add CIFS to the patch.  Would you prefer it in the
>> same or separate patches.
>
> Quite frankly, is it worth resurrecting these patches at all?
>
> The only things it actually complained about are not worth the pain
> fixing and are getting explicitly not warned about - is there any
> reason to believe the patches are worth maintaining and the extra
> complexity is worth it?

There was at least one real other case caught when this patch was
applied: https://lkml.org/lkml/2013/3/4/390.  Tejun asked that I
resurrect it because I'm adding some additional APIs similar to
freezable_schedule() and he wanted to make sure they didn't get used
improperly in the future.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo May 6, 2013, 10:14 p.m. UTC | #17
Hello, Linus.

On Mon, May 06, 2013 at 02:58:31PM -0700, Linus Torvalds wrote:
> Quite frankly, is it worth resurrecting these patches at all?
> 
> The only things it actually complained about are not worth the pain
> fixing and are getting explicitly not warned about - is there any
> reason to believe the patches are worth maintaining and the extra
> complexity is worth it?

It might not be too useful for nfs but Colin has a patchset spreading
the use of the freezable schedules in a number of places in an attempt
to reduce freeze latency for android, which includes introducing a lot
of schedule_*_freezable() constructs, so I think we need these
warnings one way or the other so that we don't end up with more
problems like nfs currently has.

Thakns.
diff mbox

Patch

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 1f94167..53cbee5 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -79,7 +79,7 @@  int nfs_wait_bit_killable(void *word)
 {
 	if (fatal_signal_pending(current))
 		return -ERESTARTSYS;
-	freezable_schedule();
+	freezable_schedule_unsafe();
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nfs_wait_bit_killable);
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 43ea96c..ce90eb4 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -33,7 +33,7 @@  nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags)
 		res = rpc_call_sync(clnt, msg, flags);
 		if (res != -EJUKEBOX)
 			break;
-		freezable_schedule_timeout_killable(NFS_JUKEBOX_RETRY_TIME);
+		freezable_schedule_timeout_killable_unsafe(NFS_JUKEBOX_RETRY_TIME);
 		res = -ERESTARTSYS;
 	} while (!fatal_signal_pending(current));
 	return res;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 0ad025e..a236077 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -266,7 +266,7 @@  static int nfs4_delay(struct rpc_clnt *clnt, long *timeout)
 		*timeout = NFS4_POLL_RETRY_MIN;
 	if (*timeout > NFS4_POLL_RETRY_MAX)
 		*timeout = NFS4_POLL_RETRY_MAX;
-	freezable_schedule_timeout_killable(*timeout);
+	freezable_schedule_timeout_killable_unsafe(*timeout);
 	if (fatal_signal_pending(current))
 		res = -ERESTARTSYS;
 	*timeout <<= 1;
@@ -4309,7 +4309,7 @@  int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4
 static unsigned long
 nfs4_set_lock_task_retry(unsigned long timeout)
 {
-	freezable_schedule_timeout_killable(timeout);
+	freezable_schedule_timeout_killable_unsafe(timeout);
 	timeout <<= 1;
 	if (timeout > NFS4_LOCK_MAXTIMEOUT)
 		return NFS4_LOCK_MAXTIMEOUT;
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index e70df40..5b31e21c 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -46,7 +46,11 @@  extern int freeze_kernel_threads(void);
 extern void thaw_processes(void);
 extern void thaw_kernel_threads(void);
 
-static inline bool try_to_freeze(void)
+/*
+ * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
+ * If try_to_freeze causes a lockdep warning it means the caller may deadlock
+ */
+static inline bool try_to_freeze_unsafe(void)
 {
 	might_sleep();
 	if (likely(!freezing(current)))
@@ -54,6 +58,11 @@  static inline bool try_to_freeze(void)
 	return __refrigerator(false);
 }
 
+static inline bool try_to_freeze(void)
+{
+	return try_to_freeze_unsafe();
+}
+
 extern bool freeze_task(struct task_struct *p);
 extern bool set_freezable(void);
 
@@ -115,6 +124,14 @@  static inline void freezer_count(void)
 	try_to_freeze();
 }
 
+/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
+static inline void freezer_count_unsafe(void)
+{
+	current->flags &= ~PF_FREEZER_SKIP;
+	smp_mb();
+	try_to_freeze_unsafe();
+}
+
 /**
  * freezer_should_skip - whether to skip a task when determining frozen
  *			 state is reached
@@ -152,6 +169,14 @@  static inline bool freezer_should_skip(struct task_struct *p)
 	freezer_count();						\
 })
 
+/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
+#define freezable_schedule_unsafe()					\
+({									\
+	freezer_do_not_count();						\
+	schedule();							\
+	freezer_count_unsafe();						\
+})
+
 /* Like schedule_timeout_killable(), but should not block the freezer. */
 #define freezable_schedule_timeout_killable(timeout)			\
 ({									\
@@ -162,6 +187,16 @@  static inline bool freezer_should_skip(struct task_struct *p)
 	__retval;							\
 })
 
+/* DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION */
+#define freezable_schedule_timeout_killable_unsafe(timeout)		\
+({									\
+	long __retval;							\
+	freezer_do_not_count();						\
+	__retval = schedule_timeout_killable(timeout);			\
+	freezer_count_unsafe();						\
+	__retval;							\
+})
+
 /*
  * Freezer-friendly wrappers around wait_event_interruptible(),
  * wait_event_killable() and wait_event_interruptible_timeout(), originally
@@ -225,9 +260,14 @@  static inline void set_freezable(void) {}
 
 #define freezable_schedule()  schedule()
 
+#define freezable_schedule_unsafe()  schedule()
+
 #define freezable_schedule_timeout_killable(timeout)			\
 	schedule_timeout_killable(timeout)
 
+#define freezable_schedule_timeout_killable_unsafe(timeout)		\
+	schedule_timeout_killable(timeout)
+
 #define wait_event_freezable(wq, condition)				\
 		wait_event_interruptible(wq, condition)
 
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index f8529fc..8dcfadc 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -254,7 +254,7 @@  static int rpc_wait_bit_killable(void *word)
 {
 	if (fatal_signal_pending(current))
 		return -ERESTARTSYS;
-	freezable_schedule();
+	freezable_schedule_unsafe();
 	return 0;
 }