From patchwork Sat Apr 29 10:06:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 13226972 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8274BC77B60 for ; Sat, 29 Apr 2023 10:07:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229610AbjD2KHC (ORCPT ); Sat, 29 Apr 2023 06:07:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45686 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230458AbjD2KHB (ORCPT ); Sat, 29 Apr 2023 06:07:01 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61C37199A; Sat, 29 Apr 2023 03:06:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=52g8YkWqIE5+lEwxe0tRmHooEDbLRSkh62LI9agimvo=; b=LCWFIu2qf9PT2y0ld14/1SGXX1 Qi1OYcxAkNP+joQHC7HHn683is8ickYteSDAR0AIFmwlJ0UnMowhafmlsu7vfnc9UC4uY8eOTpRUF hUHn1qy6eXftBAdC9gLqbDtRx+6wsiJSUH6R20yC+WjAbTzELPa1teq6++/9XPOZJWaftGBexmSXo xyBKug5yGkWWZG8hZZcdpWj+DAkCrS5jar7tq1/uybvOvNO5a1Thfalr9l8dC+hO0DpSG9tNJ+t6n z+JHuMQHxb9gCXACbvr0EWHqIZM8sXuhSOcqh3OqiqjxOa/naEeRH2Sf8ITd9om2TVzT5o9Q68FCT 0CLpSq5w==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pshT9-005SJ6-3a; Sat, 29 Apr 2023 10:06:19 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 1F192300348; Sat, 29 Apr 2023 12:06:14 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id C8C43241CA941; Sat, 29 Apr 2023 12:06:14 +0200 (CEST) Date: Sat, 29 Apr 2023 12:06:14 +0200 From: Peter Zijlstra To: Qi Zheng Cc: Vlastimil Babka , "Zhang, Qiang1" , Boqun Feng , "42.hyeyoo@gmail.com" <42.hyeyoo@gmail.com>, "akpm@linux-foundation.org" , "roman.gushchin@linux.dev" , "iamjoonsoo.kim@lge.com" , "rientjes@google.com" , "penberg@kernel.org" , "cl@linux.com" , "linux-mm@kvack.org" , "linux-kernel@vger.kernel.org" , Zhao Gongyi , Sebastian Andrzej Siewior , Thomas Gleixner , RCU , "Paul E . McKenney" Subject: [PATCH] debugobjects,locking: Annotate __debug_object_init() wait type violation Message-ID: <20230429100614.GA1489784@hirez.programming.kicks-ass.net> References: <932bf921-a076-e166-4f95-1adb24d544cf@bytedance.com> <20230412124735.GE628377@hirez.programming.kicks-ass.net> <76e15f10-d3f1-2cab-63cb-25aa3b4f2cd4@bytedance.com> <20230413074013.GD4253@hirez.programming.kicks-ass.net> <20230425150313.GA1340063@hirez.programming.kicks-ass.net> <6582d31f-e4d5-a982-4797-c22b2dd8b150@bytedance.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <6582d31f-e4d5-a982-4797-c22b2dd8b150@bytedance.com> Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org On Tue, Apr 25, 2023 at 11:51:05PM +0800, Qi Zheng wrote: > I just tested the following code and > it can resolve the warning I encountered. :) Excellent; I've spruiced up the code with a few comments and helpers (as suggested by Vlastimil on IRC) and will queue the below commit. --- Subject: debugobjects,locking: Annotate __debug_object_init() wait type violation From: Peter Zijlstra Date: Tue, 25 Apr 2023 17:03:13 +0200 There is an explicit wait-type violation in __debug_object_init() for PREEMPT_RT=n kernels which allows them to more easily fill the object pool and reduce the chance of allocation failures. Lockdep's wait-type checks are designed to check the PREEMPT_RT locking rules even for PREEMPT_RT=n kernels and object to this, so create a lockdep annotation to allow this to stand. Specifically, create a 'lock' type that overrides the inner wait-type while it is held -- allowing one to temporarily raise it, such that the violation is hidden. Reported-by: Vlastimil Babka Reported-by: Qi Zheng Signed-off-by: Peter Zijlstra (Intel) Tested-by: Qi Zheng --- include/linux/lockdep.h | 14 ++++++++++++++ include/linux/lockdep_types.h | 1 + kernel/locking/lockdep.c | 28 +++++++++++++++++++++------- lib/debugobjects.c | 15 +++++++++++++-- 4 files changed, 49 insertions(+), 9 deletions(-) --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -339,6 +339,16 @@ extern void lock_unpin_lock(struct lockd #define lockdep_repin_lock(l,c) lock_repin_lock(&(l)->dep_map, (c)) #define lockdep_unpin_lock(l,c) lock_unpin_lock(&(l)->dep_map, (c)) +/* + * Must use lock_map_aquire_try() with override maps to avoid + * lockdep thinking they participate in the block chain. + */ +#define DEFINE_WAIT_OVERRIDE_MAP(_name, _wait_type) \ + struct lockdep_map _name = { \ + .name = #_name "-wait-type-override", \ + .wait_type_inner = _wait_type, \ + .lock_type = LD_LOCK_WAIT_OVERRIDE, } + #else /* !CONFIG_LOCKDEP */ static inline void lockdep_init_task(struct task_struct *task) @@ -427,6 +437,9 @@ extern int lockdep_is_held(const void *) #define lockdep_repin_lock(l, c) do { (void)(l); (void)(c); } while (0) #define lockdep_unpin_lock(l, c) do { (void)(l); (void)(c); } while (0) +#define DEFINE_WAIT_OVERRIDE_MAP(_name, _wait_type) \ + struct lockdep_map __maybe_unused _name = {} + #endif /* !LOCKDEP */ enum xhlock_context_t { @@ -551,6 +564,7 @@ do { \ #define rwsem_release(l, i) lock_release(l, i) #define lock_map_acquire(l) lock_acquire_exclusive(l, 0, 0, NULL, _THIS_IP_) +#define lock_map_acquire_try(l) lock_acquire_exclusive(l, 0, 1, NULL, _THIS_IP_) #define lock_map_acquire_read(l) lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_) #define lock_map_acquire_tryread(l) lock_acquire_shared_recursive(l, 0, 1, NULL, _THIS_IP_) #define lock_map_release(l) lock_release(l, _THIS_IP_) --- a/include/linux/lockdep_types.h +++ b/include/linux/lockdep_types.h @@ -33,6 +33,7 @@ enum lockdep_wait_type { enum lockdep_lock_type { LD_LOCK_NORMAL = 0, /* normal, catch all */ LD_LOCK_PERCPU, /* percpu */ + LD_LOCK_WAIT_OVERRIDE, /* annotation */ LD_LOCK_MAX, }; --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -2253,6 +2253,9 @@ static inline bool usage_match(struct lo static inline bool usage_skip(struct lock_list *entry, void *mask) { + if (entry->class->lock_type == LD_LOCK_NORMAL) + return false; + /* * Skip local_lock() for irq inversion detection. * @@ -2279,14 +2282,16 @@ static inline bool usage_skip(struct loc * As a result, we will skip local_lock(), when we search for irq * inversion bugs. */ - if (entry->class->lock_type == LD_LOCK_PERCPU) { - if (DEBUG_LOCKS_WARN_ON(entry->class->wait_type_inner < LD_WAIT_CONFIG)) - return false; + if (entry->class->lock_type == LD_LOCK_PERCPU && + DEBUG_LOCKS_WARN_ON(entry->class->wait_type_inner < LD_WAIT_CONFIG)) + return false; - return true; - } + /* + * Skip WAIT_OVERRIDE for irq inversion detection -- it's not actually + * a lock and only used to override the wait_type. + */ - return false; + return true; } /* @@ -4752,7 +4757,8 @@ static int check_wait_context(struct tas for (; depth < curr->lockdep_depth; depth++) { struct held_lock *prev = curr->held_locks + depth; - u8 prev_inner = hlock_class(prev)->wait_type_inner; + struct lock_class *class = hlock_class(prev); + u8 prev_inner = class->wait_type_inner; if (prev_inner) { /* @@ -4762,6 +4768,14 @@ static int check_wait_context(struct tas * Also due to trylocks. */ curr_inner = min(curr_inner, prev_inner); + + /* + * Allow override for annotations -- this is typically + * only valid/needed for code that only exists when + * CONFIG_PREEMPT_RT=n. + */ + if (unlikely(class->lock_type == LD_LOCK_WAIT_OVERRIDE)) + curr_inner = prev_inner; } } --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -563,10 +563,21 @@ __debug_object_init(void *addr, const st /* * On RT enabled kernels the pool refill must happen in preemptible - * context: + * context -- for !RT kernels we rely on the fact that spinlock_t and + * raw_spinlock_t are basically the same type and this lock-type + * inversion works just fine. */ - if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) + if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) { + /* + * Annotate away the spinlock_t inside raw_spinlock_t warning + * by temporarily raising the wait-type to WAIT_SLEEP, matching + * the preemptible() condition above. + */ + static DEFINE_WAIT_OVERRIDE_MAP(fill_pool_map, LD_WAIT_SLEEP); + lock_map_acquire_try(&fill_pool_map); fill_pool(); + lock_map_release(&fill_pool_map); + } db = get_bucket((unsigned long) addr);