mbox series

[v2,0/2] seqlock,mm: lockdep annotation + write_seqlock_irqsave()

Message ID 20230623171232.892937-1-bigeasy@linutronix.de (mailing list archive)
Headers show
Series seqlock,mm: lockdep annotation + write_seqlock_irqsave() | expand

Message

Sebastian Andrzej Siewior June 23, 2023, 5:12 p.m. UTC
Hi,

this has been a single patch (2/2) but then it was pointed out that the
lockdep annotation in seqlock needs to be adjusted to fully close the
printk window so that there is no printing after the seq-lock has been
acquired and before printk_deferred_enter() takes effect.

I'm sending both patches in this series so both sides (locking and mm)
are aware of the situation. 
I hope that both patches can be applied independently via their subsystem
tree (the lockdep splat + deadlock is low risk).

The original thread starts at
	https://lore.kernel.org/20230621104034.HT6QnNkQ@linutronix.de

Sebastian

Comments

Tetsuo Handa June 25, 2023, 2:27 a.m. UTC | #1
On 2023/06/24 2:12, Sebastian Andrzej Siewior wrote:
> Hi,
> 
> this has been a single patch (2/2) but then it was pointed out that the
> lockdep annotation in seqlock needs to be adjusted to fully close the
> printk window so that there is no printing after the seq-lock has been
> acquired and before printk_deferred_enter() takes effect.
> 
> I'm sending both patches in this series so both sides (locking and mm)
> are aware of the situation. 
> I hope that both patches can be applied independently via their subsystem
> tree (the lockdep splat + deadlock is low risk).
> 
> The original thread starts at
> 	https://lore.kernel.org/20230621104034.HT6QnNkQ@linutronix.de
> 
> Sebastian

The original thread is too long to read. Below is a full summary for locking
maintainers to accept [PATCH 1/2].



When commit 1ca7d67cf5d5 ("seqcount: Add lockdep functionality to
seqcount/seqlock structures") added seqcount_acquire(&s->dep_map) check to
write_seqcount_begin_nested() and seqcount_release(&s->dep_map) check to
write_seqcount_end(), the ordering of updating s->sequence and doing lockdep
annotation was not important.

But since commit 3d36424b3b58 ("mm/page_alloc: fix race condition between
build_all_zonelists and page allocation") started calling
read_seqbegin(&zonelist_update_seq)/read_seqretry(&zonelist_update_seq) from
kmalloc(GFP_ATOMIC) path, commit 1007843a9190 ("mm/page_alloc: fix potential
deadlock on zonelist_update_seq seqlock") tried to close the race window using

 __build_all_zonelists() {
+    local_irq_save(flags);
+    printk_deferred_enter();
     write_seqlock(&zonelist_update_seq);
     (...snipped...)
     write_sequnlock(&zonelist_update_seq);
+    printk_deferred_exit();
+    local_irq_restore(flags);
 }

pattern. The reason behind this ordering was to

  satisfy "printk_deferred_enter() depends on local IRQs being disabled"

and

  make sure that "no synchronous printk() (for whatever reasons, not only
  printk() from build_zonelists() from __build_all_zonelists(), but also
  including printk() from lockdep, soft-lockup, KCSAN etc.) happens between
  write_seqlock() and write_sequnlock()

. However, Sebastian Andrzej Siewior mentioned that this ordering is
problematic if CONFIG_PREEMPT_RT=y, for disabling local IRQs conflicts with
"spin_lock(&zonelist_update_seq.lock) from write_seqlock(&zonelist_update_seq)
needs to be able to sleep", and Sebastian is proposing

 __build_all_zonelists() {
-    local_irq_save(flags);
-    printk_deferred_enter();
-    write_seqlock(&zonelist_update_seq);
+    write_seqlock_irqsave(&zonelist_update_seq, flags);
+    printk_deferred_enter();
     (...snipped...)
+    printk_deferred_exit();
+    write_sequnlock_irqrestore(&zonelist_update_seq, flags);
-    write_sequnlock(&zonelist_update_seq);
-    printk_deferred_exit();
-    local_irq_restore(flags);
 }

change as [PATCH 2/2]. Since write_seqlock_irqsave() becomes write_seqlock()
if CONFIG_PREEMPT_RT=y, this change can solve the conflict.

In order to accept this proposal, we need to make sure that

  no synchronous printk() happens between

    write_seqlock_irqsave(&zonelist_update_seq, flags) made
    zonelist_update_seq.seqcount odd

  and

    printk_deferred_enter() takes effect

and

  no synchronous printk() happens between

    printk_deferred_exit() took effect

  and

    write_sequnlock_irqrestore(&zonelist_update_seq, flags) makes
    zonelist_update_seq.seqcount even

, and Sebastian is proposing

 static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass)
 {
-    do_raw_write_seqcount_begin(s);
     seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_);
+    do_raw_write_seqcount_begin(s);
 }

 static inline void do_write_seqcount_end(seqcount_t *s)
 {
-    seqcount_release(&s->dep_map, _RET_IP_);
     do_raw_write_seqcount_end(s);
+    seqcount_release(&s->dep_map, _RET_IP_);
 }

as [PATCH 1/2].

With [PATCH 1/2] and [PATCH 2/2], possibility of synchronous printk() changes like below.

__build_all_zonelists() {
  write_seqlock_irqsave(&zonelist_update_seq, flags) {
    __write_seqlock_irqsave(&zonelist_update_seq) {
      spin_lock_irqsave(&zonelist_update_seq.lock, flags); // local IRQs disabled = synchronous printk() from IRQs is disabled here
      do_write_seqcount_begin(&zonelist_update_seq.seqcount.seqcount) {
        do_write_seqcount_begin_nested(&zonelist_update_seq.seqcount.seqcount, 0) {
          seqcount_acquire(&zonelist_update_seq.seqcount.seqcount.dep_map, 0, 0, _RET_IP_); // synchronous printk() from lockdep might happen here
          do_raw_write_seqcount_begin(&zonelist_update_seq.seqcount.seqcount) {
            seqcount_acquire(&zonelist_update_seq.seqcount.seqcount.dep_map, 0, 0, _RET_IP_); // zonelist_update_seq.seqcount.seqcount.sequence is guaranteed to be even = kmalloc(GFP_ATOMIC) with port lock held is safe = synchronous printk() is safe
            kcsan_nestable_atomic_begin(); // KCSAN is diabled = synchronous printk() from KCSAN is disabled here
            zonelist_update_seq.seqcount.seqcount.sequence++; // zonelist_update_seq.seqcount.seqcount.sequence is now odd = kmalloc(GFP_ATOMIC) with port lock held is not safe = synchronous printk() is not safe
          }
        }
      }
    }
  }
  printk_deferred_enter(); // synchronous printk() from whatever reason is disabled here
  (...snipped...)
  printk_deferred_exit(); // synchronous printk() from whatever reason is enabled here
  write_sequnlock_irqrestore(&zonelist_update_seq, flags) {
    do_write_seqcount_end(&zonelist_update_seq.seqcount.seqcount) {
      do_raw_write_seqcount_end(&zonelist_update_seq.seqcount.seqcount) {
        zonelist_update_seq.seqcount.seqcount.sequence++; // zonelist_update_seq.seqcount.seqcount.sequence is now even = kmalloc(GFP_ATOMIC) with port lock held is safe = synchronous printk() is safe
        kcsan_nestable_atomic_end(); // KCSAN is enabled = synchronous printk() from KCSAN is enabled here
      }
      seqcount_release(&zonelist_update_seq.seqcount.seqcount.dep_map, _RET_IP_); // synchronous printk() from lockdep might happen here
    }
    spin_unlock_irqrestore(&zonelist_update_seq.lock, flags); // local IRQs enabled = synchronous printk() from IRQs is enabled here
  }
}