From patchwork Fri Oct 15 22:32:42 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562891
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 01/10] RDMA/rxe: Make rxe_alloc() take pool lock
Date: Fri, 15 Oct 2021 17:32:42 -0500
Message-Id: <20211015223250.6501-2-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

In rxe_pool.c there are two separate pool APIs for creating a new
object: rxe_alloc() and rxe_alloc_locked(). Currently they are
identical. Make rxe_alloc() take the pool lock, which is in line with
the other APIs in the library and was the original intent.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 21 ++++-----------------
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 7b4cb46edfd9..6553ea160d4f 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -354,27 +354,14 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 
 void *rxe_alloc(struct rxe_pool *pool)
 {
-	struct rxe_type_info *info = &rxe_type_info[pool->type];
-	struct rxe_pool_entry *elem;
+	unsigned long flags;
 	u8 *obj;
 
-	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
-		goto out_cnt;
-
-	obj = kzalloc(info->size, GFP_KERNEL);
-	if (!obj)
-		goto out_cnt;
-
-	elem = (struct rxe_pool_entry *)(obj + info->elem_offset);
-
-	elem->pool = pool;
-	kref_init(&elem->ref_cnt);
+	write_lock_irqsave(&pool->pool_lock, flags);
+	obj = rxe_alloc_locked(pool);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
 
 	return obj;
-
-out_cnt:
-	atomic_dec(&pool->num_elem);
-	return NULL;
 }
 
 int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
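The locked/unlocked split used here is a common kernel pattern: the
_locked variant assumes the caller already holds the pool lock, while
the plain variant takes the lock itself. A minimal sketch of the
pattern, using hypothetical demo_* names rather than the actual rxe
code:

	#include <linux/spinlock.h>
	#include <linux/slab.h>

	struct demo_pool {
		rwlock_t lock;
		size_t elem_size;
	};

	/* _locked variant: caller holds pool->lock, so the allocation
	 * must not sleep (GFP_ATOMIC)
	 */
	static void *demo_alloc_locked(struct demo_pool *pool)
	{
		return kzalloc(pool->elem_size, GFP_ATOMIC);
	}

	/* plain variant: takes the lock itself, as rxe_alloc() does
	 * after this patch
	 */
	static void *demo_alloc(struct demo_pool *pool)
	{
		unsigned long flags;
		void *obj;

		write_lock_irqsave(&pool->lock, flags);
		obj = demo_alloc_locked(pool);
		write_unlock_irqrestore(&pool->lock, flags);

		return obj;
	}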
From patchwork Fri Oct 15 22:32:43 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562895
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 02/10] RDMA/rxe: Copy setup parameters into rxe_pool
Date: Fri, 15 Oct 2021 17:32:43 -0500
Message-Id: <20211015223250.6501-3-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

In rxe_pool.c copy the remaining pool setup parameters from
rxe_type_info into rxe_pool. This saves looking up rxe_type_info in
the performance path.
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 49 ++++++++++++----------------
 drivers/infiniband/sw/rxe/rxe_pool.h |  6 ++--
 2 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 6553ea160d4f..7250c40037b5 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -7,9 +7,8 @@
 #include "rxe.h"
 #include "rxe_loc.h"
 
-/* info about object pools
- */
-struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = {
+/* info about object pools */
+static const struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = {
 	[RXE_TYPE_UC] = {
 		.name		= "rxe-uc",
 		.size		= sizeof(struct rxe_ucontext),
@@ -88,11 +87,6 @@ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = {
 	},
 };
 
-static inline const char *pool_name(struct rxe_pool *pool)
-{
-	return rxe_type_info[pool->type].name;
-}
-
 static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min)
 {
 	int err = 0;
@@ -127,35 +121,36 @@ int rxe_pool_init(
 	enum rxe_elem_type	type,
 	unsigned int		max_elem)
 {
+	const struct rxe_type_info *info = &rxe_type_info[type];
 	int			err = 0;
-	size_t			size = rxe_type_info[type].size;
 
 	memset(pool, 0, sizeof(*pool));
 
 	pool->rxe		= rxe;
+	pool->name		= info->name;
 	pool->type		= type;
 	pool->max_elem		= max_elem;
-	pool->elem_size		= ALIGN(size, RXE_POOL_ALIGN);
-	pool->flags		= rxe_type_info[type].flags;
+	pool->elem_size		= ALIGN(info->size, RXE_POOL_ALIGN);
+	pool->elem_offset	= info->elem_offset;
+	pool->flags		= info->flags;
 	pool->index.tree	= RB_ROOT;
 	pool->key.tree		= RB_ROOT;
-	pool->cleanup		= rxe_type_info[type].cleanup;
+	pool->cleanup		= info->cleanup;
 
 	atomic_set(&pool->num_elem, 0);
 
 	rwlock_init(&pool->pool_lock);
 
-	if (rxe_type_info[type].flags & RXE_POOL_INDEX) {
-		err = rxe_pool_init_index(pool,
-					  rxe_type_info[type].max_index,
-					  rxe_type_info[type].min_index);
+	if (info->flags & RXE_POOL_INDEX) {
+		err = rxe_pool_init_index(pool, info->max_index,
+					  info->min_index);
 		if (err)
 			goto out;
 	}
 
-	if (rxe_type_info[type].flags & RXE_POOL_KEY) {
-		pool->key.key_offset = rxe_type_info[type].key_offset;
-		pool->key.key_size = rxe_type_info[type].key_size;
+	if (info->flags & RXE_POOL_KEY) {
+		pool->key.key_offset = info->key_offset;
+		pool->key.key_size = info->key_size;
 	}
 
 out:
@@ -166,7 +161,7 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
 {
 	if (atomic_read(&pool->num_elem) > 0)
 		pr_warn("%s pool destroyed with unfree'd elem\n",
-			pool_name(pool));
+			pool->name);
 
 	kfree(pool->index.table);
 }
@@ -329,18 +324,17 @@ void __rxe_drop_index(struct rxe_pool_entry *elem)
 
 void *rxe_alloc_locked(struct rxe_pool *pool)
 {
-	struct rxe_type_info *info = &rxe_type_info[pool->type];
 	struct rxe_pool_entry *elem;
 	u8 *obj;
 
 	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
 		goto out_cnt;
 
-	obj = kzalloc(info->size, GFP_ATOMIC);
+	obj = kzalloc(pool->elem_size, GFP_ATOMIC);
 	if (!obj)
 		goto out_cnt;
 
-	elem = (struct rxe_pool_entry *)(obj + info->elem_offset);
+	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
 
 	elem->pool = pool;
 	kref_init(&elem->ref_cnt);
@@ -384,14 +378,13 @@ void rxe_elem_release(struct kref *kref)
 	struct rxe_pool_entry *elem =
 		container_of(kref, struct rxe_pool_entry, ref_cnt);
 	struct rxe_pool *pool = elem->pool;
-	struct rxe_type_info *info = &rxe_type_info[pool->type];
 	u8 *obj;
 
 	if (pool->cleanup)
 		pool->cleanup(elem);
 
 	if (!(pool->flags & RXE_POOL_NO_ALLOC)) {
-		obj = (u8 *)elem - info->elem_offset;
+		obj = (u8 *)elem - pool->elem_offset;
 		kfree(obj);
 	}
 
@@ -400,7 +393,6 @@ void rxe_elem_release(struct kref *kref)
 
 void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 {
-	struct rxe_type_info *info = &rxe_type_info[pool->type];
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
 	u8 *obj;
@@ -420,7 +412,7 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 
 	if (node) {
 		kref_get(&elem->ref_cnt);
-		obj = (u8 *)elem - info->elem_offset;
+		obj = (u8 *)elem - pool->elem_offset;
 	} else {
 		obj = NULL;
 	}
@@ -442,7 +434,6 @@ void *rxe_pool_get_index(struct rxe_pool *pool, u32 index)
 
 void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 {
-	struct rxe_type_info *info = &rxe_type_info[pool->type];
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
 	u8 *obj;
@@ -466,7 +457,7 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 
 	if (node) {
 		kref_get(&elem->ref_cnt);
-		obj = (u8 *)elem - info->elem_offset;
+		obj = (u8 *)elem - pool->elem_offset;
 	} else {
 		obj = NULL;
 	}
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index 1feca1bffced..fb10e0098415 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -44,8 +44,6 @@ struct rxe_type_info {
 	size_t			key_size;
 };
 
-extern struct rxe_type_info rxe_type_info[];
-
 struct rxe_pool_entry {
 	struct rxe_pool		*pool;
 	struct kref		ref_cnt;
@@ -61,14 +59,16 @@ struct rxe_pool_entry {
 
 struct rxe_pool {
 	struct rxe_dev		*rxe;
+	const char		*name;
 	rwlock_t		pool_lock; /* protects pool add/del/search */
-	size_t			elem_size;
 	void			(*cleanup)(struct rxe_pool_entry *obj);
 	enum rxe_pool_flags	flags;
 	enum rxe_elem_type	type;
 
 	unsigned int		max_elem;
	atomic_t		num_elem;
+	size_t			elem_size;
+	size_t			elem_offset;
 
 	/* only used if indexed */
 	struct {
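The change amounts to caching immutable per-type parameters in the
pool at init time so the hot paths never index the static table again.
A minimal sketch of the idea, assuming simplified hypothetical demo_*
types:

	struct demo_type_info {
		const char *name;
		size_t size;
	};

	struct demo_pool {
		const char *name;	/* cached copies of type info */
		size_t elem_size;
	};

	static void demo_pool_init(struct demo_pool *pool,
				   const struct demo_type_info *info)
	{
		/* copy once here; allocators then read pool->elem_size
		 * directly instead of demo_type_info[type].size on
		 * every call
		 */
		pool->name = info->name;
		pool->elem_size = info->size;
	}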
From patchwork Fri Oct 15 22:32:44 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562893
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 03/10] RDMA/rxe: Save object pointer in pool element
Date: Fri, 15 Oct 2021 17:32:44 -0500
Message-Id: <20211015223250.6501-4-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

In rxe_pool.c there are currently many places where it is necessary to
compute the offset from a pool element struct to the object containing
it in a type-independent way. By saving a pointer to the object in the
element when it is created, this extra work can be avoided.
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 25 ++++++++++++++-----------
 drivers/infiniband/sw/rxe/rxe_pool.h |  1 +
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 7250c40037b5..bbc8dc63f53d 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -220,7 +220,8 @@ static int rxe_insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new)
 		elem = rb_entry(parent, struct rxe_pool_entry, key_node);
 
 		cmp = memcmp((u8 *)elem + pool->key.key_offset,
-			     (u8 *)new + pool->key.key_offset, pool->key.key_size);
+			     (u8 *)new + pool->key.key_offset,
+			     pool->key.key_size);
 
 		if (cmp == 0) {
 			pr_warn("key already exists!\n");
@@ -325,7 +326,7 @@ void __rxe_drop_index(struct rxe_pool_entry *elem)
 void *rxe_alloc_locked(struct rxe_pool *pool)
 {
 	struct rxe_pool_entry *elem;
-	u8 *obj;
+	void *obj;
 
 	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
 		goto out_cnt;
@@ -337,6 +338,7 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
 
 	elem->pool = pool;
+	elem->obj = obj;
 	kref_init(&elem->ref_cnt);
 
 	return obj;
@@ -349,7 +351,7 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 void *rxe_alloc(struct rxe_pool *pool)
 {
 	unsigned long flags;
-	u8 *obj;
+	void *obj;
 
 	write_lock_irqsave(&pool->pool_lock, flags);
 	obj = rxe_alloc_locked(pool);
@@ -364,6 +366,7 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 		goto out_cnt;
 
 	elem->pool = pool;
+	elem->obj = (u8 *)elem - pool->elem_offset;
 	kref_init(&elem->ref_cnt);
 
 	return 0;
@@ -378,13 +381,13 @@ void rxe_elem_release(struct kref *kref)
 	struct rxe_pool_entry *elem =
 		container_of(kref, struct rxe_pool_entry, ref_cnt);
 	struct rxe_pool *pool = elem->pool;
-	u8 *obj;
+	void *obj;
 
 	if (pool->cleanup)
 		pool->cleanup(elem);
 
 	if (!(pool->flags & RXE_POOL_NO_ALLOC)) {
-		obj = (u8 *)elem - pool->elem_offset;
+		obj = elem->obj;
 		kfree(obj);
 	}
 
@@ -395,7 +398,7 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 {
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
-	u8 *obj;
+	void *obj;
 
 	node = pool->index.tree.rb_node;
 
@@ -412,7 +415,7 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 
 	if (node) {
 		kref_get(&elem->ref_cnt);
-		obj = (u8 *)elem - pool->elem_offset;
+		obj = elem->obj;
 	} else {
 		obj = NULL;
 	}
@@ -422,7 +425,7 @@ void *rxe_pool_get_index(struct rxe_pool *pool, u32 index)
 {
-	u8 *obj;
+	void *obj;
 	unsigned long flags;
 
 	read_lock_irqsave(&pool->pool_lock, flags);
@@ -436,7 +439,7 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 {
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
-	u8 *obj;
+	void *obj;
 	int cmp;
 
 	node = pool->key.tree.rb_node;
@@ -457,7 +460,7 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 
 	if (node) {
 		kref_get(&elem->ref_cnt);
-		obj = (u8 *)elem - pool->elem_offset;
+		obj = elem->obj;
 	} else {
 		obj = NULL;
 	}
@@ -467,7 +470,7 @@ void *rxe_pool_get_key(struct rxe_pool *pool, void *key)
 {
-	u8 *obj;
+	void *obj;
 	unsigned long flags;
 
 	read_lock_irqsave(&pool->pool_lock, flags);
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index fb10e0098415..e9bda4b14f86 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -46,6 +46,7 @@ struct rxe_type_info {
 
 struct rxe_pool_entry {
 	struct rxe_pool		*pool;
+	void			*obj;
 	struct kref		ref_cnt;
 	struct list_head	list;
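The saved pointer replaces repeated pointer arithmetic. A small
illustration with hypothetical demo_* types (not the actual rxe
structs):

	struct demo_entry {
		void *obj;		/* back pointer saved at creation */
	};

	struct demo_obj {
		int payload;
		struct demo_entry pelem;	/* embedded at elem_offset */
	};

	/* before: recover the object with per-pool offset arithmetic,
	 * essentially an open-coded container_of()
	 */
	static void *obj_by_offset(struct demo_entry *elem, size_t elem_offset)
	{
		return (unsigned char *)elem - elem_offset;
	}

	/* after: a single load of the pointer stored when the object
	 * was created
	 */
	static void *obj_by_ptr(struct demo_entry *elem)
	{
		return elem->obj;
	}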
From patchwork Fri Oct 15 22:32:45 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562901
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 04/10] RDMA/rxe: Combine rxe_add_index with rxe_alloc
Date: Fri, 15 Oct 2021 17:32:45 -0500
Message-Id: <20211015223250.6501-5-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Currently rxe objects which have an index require adding and dropping
the indices in an API call separate from allocating and freeing the
object. These operations are always performed together, so this patch
combines them into a single operation.

By taking a single pool lock around allocating the object and adding
the index metadata, or dropping the index metadata and releasing the
object, the possibility of a race condition where the metadata is not
consistent with the state of the object is removed.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c    |  1 -
 drivers/infiniband/sw/rxe/rxe_mw.c    |  5 +--
 drivers/infiniband/sw/rxe/rxe_pool.c  | 59 +++++++++++++++------------
 drivers/infiniband/sw/rxe/rxe_pool.h  | 22 ----------
 drivers/infiniband/sw/rxe/rxe_verbs.c | 13 ------
 5 files changed, 33 insertions(+), 67 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 53271df10e47..6e71f67ccfe9 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -693,7 +693,6 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 	mr->state = RXE_MR_STATE_INVALID;
 
 	rxe_drop_ref(mr_pd(mr));
-	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
 
 	return 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index 9534a7fe1a98..854d0c283521 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -20,7 +20,6 @@ int rxe_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 		return ret;
 	}
 
-	rxe_add_index(mw);
 	mw->rkey = ibmw->rkey = (mw->pelem.index << 8) | rxe_get_next_key(-1);
 	mw->state = (mw->ibmw.type == IB_MW_TYPE_2) ?
 			RXE_MW_STATE_FREE : RXE_MW_STATE_VALID;
@@ -335,7 +334,5 @@ struct rxe_mw *rxe_lookup_mw(struct rxe_qp *qp, int access, u32 rkey)
 
 void rxe_mw_cleanup(struct rxe_pool_entry *elem)
 {
-	struct rxe_mw *mw = container_of(elem, typeof(*mw), pelem);
-
-	rxe_drop_index(mw);
+	/* nothing to do currently */
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index bbc8dc63f53d..b030a774c251 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -166,12 +166,16 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
 	kfree(pool->index.table);
 }
 
+/* should never fail because there are at least as many indices as
 + * max objects
 + */
 static u32 alloc_index(struct rxe_pool *pool)
 {
 	u32 index;
 	u32 range = pool->index.max_index - pool->index.min_index + 1;
 
-	index = find_next_zero_bit(pool->index.table, range, pool->index.last);
+	index = find_next_zero_bit(pool->index.table, range,
+				   pool->index.last);
 	if (index >= range)
 		index = find_first_zero_bit(pool->index.table, range);
 
@@ -192,7 +196,8 @@ static int rxe_insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new)
 		elem = rb_entry(parent, struct rxe_pool_entry, index_node);
 
 		if (elem->index == new->index) {
-			pr_warn("element already exists!\n");
+			pr_warn("element with index = 0x%x already exists!\n",
+				new->index);
 			return -EINVAL;
 		}
 
@@ -281,31 +286,21 @@ void __rxe_drop_key(struct rxe_pool_entry *elem)
 	write_unlock_irqrestore(&pool->pool_lock, flags);
 }
 
-int __rxe_add_index_locked(struct rxe_pool_entry *elem)
+static int rxe_add_index(struct rxe_pool_entry *elem)
 {
 	struct rxe_pool *pool = elem->pool;
 	int err;
 
 	elem->index = alloc_index(pool);
 	err = rxe_insert_index(pool, elem);
+	if (err)
+		clear_bit(elem->index - pool->index.min_index,
+			  pool->index.table);
 
 	return err;
 }
 
-int __rxe_add_index(struct rxe_pool_entry *elem)
-{
-	struct rxe_pool *pool = elem->pool;
-	unsigned long flags;
-	int err;
-
-	write_lock_irqsave(&pool->pool_lock, flags);
-	err = __rxe_add_index_locked(elem);
-	write_unlock_irqrestore(&pool->pool_lock, flags);
-
-	return err;
-}
-
-void __rxe_drop_index_locked(struct rxe_pool_entry *elem)
+static void rxe_drop_index(struct rxe_pool_entry *elem)
 {
 	struct rxe_pool *pool = elem->pool;
 
@@ -313,20 +308,11 @@ void __rxe_drop_index_locked(struct rxe_pool_entry *elem)
 	rb_erase(&elem->index_node, &pool->index.tree);
 }
 
-void __rxe_drop_index(struct rxe_pool_entry *elem)
-{
-	struct rxe_pool *pool = elem->pool;
-	unsigned long flags;
-
-	write_lock_irqsave(&pool->pool_lock, flags);
-	__rxe_drop_index_locked(elem);
-	write_unlock_irqrestore(&pool->pool_lock, flags);
-}
-
 void *rxe_alloc_locked(struct rxe_pool *pool)
 {
 	struct rxe_pool_entry *elem;
 	void *obj;
+	int err;
 
 	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
 		goto out_cnt;
@@ -341,6 +327,14 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 	elem->obj = obj;
 	kref_init(&elem->ref_cnt);
 
+	if (pool->flags & RXE_POOL_INDEX) {
+		err = rxe_add_index(elem);
+		if (err) {
+			kfree(obj);
+			goto out_cnt;
+		}
+	}
+
 	return obj;
 
 out_cnt:
@@ -362,6 +356,8 @@ void *rxe_alloc(struct rxe_pool *pool)
 
 int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 {
+	int err;
+
 	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
 		goto out_cnt;
 
@@ -369,6 +365,12 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 	elem->obj = (u8 *)elem - pool->elem_offset;
 	kref_init(&elem->ref_cnt);
 
+	if (pool->flags & RXE_POOL_INDEX) {
+		err = rxe_add_index(elem);
+		if (err)
+			goto out_cnt;
+	}
+
 	return 0;
 
 out_cnt:
@@ -386,6 +388,9 @@ void rxe_elem_release(struct kref *kref)
 	if (pool->cleanup)
 		pool->cleanup(elem);
 
+	if (pool->flags & RXE_POOL_INDEX)
+		rxe_drop_index(elem);
+
 	if (!(pool->flags & RXE_POOL_NO_ALLOC)) {
 		obj = elem->obj;
 		kfree(obj);
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index e9bda4b14f86..f76addf87b4a 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -109,28 +109,6 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem);
 
 #define rxe_add_to_pool(pool, obj) __rxe_add_to_pool(pool, &(obj)->pelem)
 
-/* assign an index to an indexed object and insert object into
- * pool's rb tree holding and not holding the pool_lock
- */
-int __rxe_add_index_locked(struct rxe_pool_entry *elem);
-
-#define rxe_add_index_locked(obj) __rxe_add_index_locked(&(obj)->pelem)
-
-int __rxe_add_index(struct rxe_pool_entry *elem);
-
-#define rxe_add_index(obj) __rxe_add_index(&(obj)->pelem)
-
-/* drop an index and remove object from rb tree
- * holding and not holding the pool_lock
- */
-void __rxe_drop_index_locked(struct rxe_pool_entry *elem);
-
-#define rxe_drop_index_locked(obj) __rxe_drop_index_locked(&(obj)->pelem)
-
-void __rxe_drop_index(struct rxe_pool_entry *elem);
-
-#define rxe_drop_index(obj) __rxe_drop_index(&(obj)->pelem)
-
 /* assign a key to a keyed object and insert object into
  * pool's rb tree holding and not holding pool_lock
  */
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 0aa0d7e52773..84ea03bc6a26 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -181,7 +181,6 @@ static int rxe_create_ah(struct ib_ah *ibah,
 		return err;
 
 	/* create index > 0 */
-	rxe_add_index(ah);
 	ah->ah_num = ah->pelem.index;
 
 	if (uresp) {
@@ -189,7 +188,6 @@ static int rxe_create_ah(struct ib_ah *ibah,
 		err = copy_to_user(&uresp->ah_num, &ah->ah_num,
 					 sizeof(uresp->ah_num));
 		if (err) {
-			rxe_drop_index(ah);
 			rxe_drop_ref(ah);
 			return -EFAULT;
 		}
@@ -230,7 +228,6 @@ static int rxe_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct rxe_ah *ah = to_rah(ibah);
 
-	rxe_drop_index(ah);
 	rxe_drop_ref(ah);
 	return 0;
 }
@@ -438,7 +435,6 @@ static int rxe_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init,
 	if (err)
 		return err;
 
-	rxe_add_index(qp);
 	err = rxe_qp_from_init(rxe, qp, pd, init, uresp, ibqp->pd, udata);
 	if (err)
 		goto qp_init;
@@ -446,7 +442,6 @@ static int rxe_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init,
 	return 0;
 
 qp_init:
-	rxe_drop_index(qp);
 	rxe_drop_ref(qp);
 	return err;
 }
@@ -491,7 +486,6 @@ static int rxe_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 	struct rxe_qp *qp = to_rqp(ibqp);
 
 	rxe_qp_destroy(qp);
-	rxe_drop_index(qp);
 	rxe_drop_ref(qp);
 	return 0;
 }
@@ -898,7 +892,6 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access)
 	if (!mr)
 		return ERR_PTR(-ENOMEM);
 
-	rxe_add_index(mr);
 	rxe_add_ref(pd);
 	rxe_mr_init_dma(pd, access, mr);
 
@@ -922,8 +915,6 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
 		goto err2;
 	}
 
-	rxe_add_index(mr);
-
 	rxe_add_ref(pd);
 
 	err = rxe_mr_init_user(pd, start, length, iova, access, mr);
@@ -934,7 +925,6 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
 
 err3:
 	rxe_drop_ref(pd);
-	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
 err2:
 	return ERR_PTR(err);
@@ -957,8 +947,6 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 		goto err1;
 	}
 
-	rxe_add_index(mr);
-
 	rxe_add_ref(pd);
 
 	err = rxe_mr_init_fast(pd, max_num_sg, mr);
@@ -969,7 +957,6 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 
 err2:
 	rxe_drop_ref(pd);
-	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
 err1:
 	return ERR_PTR(err);
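The shape of the race being closed: with separate calls, a concurrent
lookup can run between the two critical sections and observe an
allocated object that is not yet indexed, or an index whose object is
half torn down. A minimal sketch, with hypothetical demo_* names:

	/* Before: two independently locked steps; a concurrent
	 * lookup can run between them.
	 *
	 *	obj = rxe_alloc(pool);		// lock; alloc; unlock
	 *	rxe_add_index(obj);		// lock; insert; unlock
	 *
	 * After: one critical section covers both steps.
	 */
	static void *demo_alloc_indexed(struct demo_pool *pool)
	{
		unsigned long flags;
		void *obj;

		write_lock_irqsave(&pool->lock, flags);
		obj = demo_alloc_and_index(pool);	/* hypothetical helper */
		write_unlock_irqrestore(&pool->lock, flags);

		return obj;
	}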
From patchwork Fri Oct 15 22:32:46 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562899
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 05/10] RDMA/rxe: Combine rxe_add_key with rxe_alloc
Date: Fri, 15 Oct 2021 17:32:46 -0500
Message-Id: <20211015223250.6501-6-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Currently adding a key to and dropping a key from an rxe object
require API calls separate from allocating and freeing the object, but
these are always performed together. This patch combines them into
single APIs, which requires adding new rxe_alloc_with_key(_locked)
APIs.

By combining allocating an object and adding its key metadata inside a
single locked sequence, and likewise dropping the key metadata and
releasing the object, the possibility of a race condition where the
object state and key metadata state are inconsistent is removed.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_mcast.c |  5 +-
 drivers/infiniband/sw/rxe/rxe_pool.c  | 81 +++++++++++++--------------
 drivers/infiniband/sw/rxe/rxe_pool.h  | 24 ++------
 3 files changed, 45 insertions(+), 65 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c
index 1c1d1b53312d..337dc2c68051 100644
--- a/drivers/infiniband/sw/rxe/rxe_mcast.c
+++ b/drivers/infiniband/sw/rxe/rxe_mcast.c
@@ -15,18 +15,16 @@ static struct rxe_mc_grp *create_grp(struct rxe_dev *rxe,
 	int err;
 	struct rxe_mc_grp *grp;
 
-	grp = rxe_alloc_locked(&rxe->mc_grp_pool);
+	grp = rxe_alloc_with_key_locked(&rxe->mc_grp_pool, mgid);
 	if (!grp)
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&grp->qp_list);
 	spin_lock_init(&grp->mcg_lock);
 	grp->rxe = rxe;
-	rxe_add_key_locked(grp, mgid);
 
 	err = rxe_mcast_add(rxe, mgid);
 	if (unlikely(err)) {
-		rxe_drop_key_locked(grp);
 		rxe_drop_ref(grp);
 		return ERR_PTR(err);
 	}
@@ -174,6 +172,5 @@ void rxe_mc_cleanup(struct rxe_pool_entry *arg)
 	struct rxe_mc_grp *grp = container_of(arg, typeof(*grp), pelem);
 	struct rxe_dev *rxe = grp->rxe;
 
-	rxe_drop_key(grp);
 	rxe_mcast_delete(rxe, &grp->mgid);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index b030a774c251..b0963eca75c7 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -245,47 +245,6 @@ static int rxe_insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new)
 	return 0;
 }
 
-int __rxe_add_key_locked(struct rxe_pool_entry *elem, void *key)
-{
-	struct rxe_pool *pool = elem->pool;
-	int err;
-
-	memcpy((u8 *)elem + pool->key.key_offset, key, pool->key.key_size);
-	err = rxe_insert_key(pool, elem);
-
-	return err;
-}
-
-int __rxe_add_key(struct rxe_pool_entry *elem, void *key)
-{
-	struct rxe_pool *pool = elem->pool;
-	unsigned long flags;
-	int err;
-
-	write_lock_irqsave(&pool->pool_lock, flags);
-	err = __rxe_add_key_locked(elem, key);
-	write_unlock_irqrestore(&pool->pool_lock, flags);
-
-	return err;
-}
-
-void __rxe_drop_key_locked(struct rxe_pool_entry *elem)
-{
-	struct rxe_pool *pool = elem->pool;
-
-	rb_erase(&elem->key_node, &pool->key.tree);
-}
-
-void __rxe_drop_key(struct rxe_pool_entry *elem)
-{
-	struct rxe_pool *pool = elem->pool;
-	unsigned long flags;
-
-	write_lock_irqsave(&pool->pool_lock, flags);
-	__rxe_drop_key_locked(elem);
-	write_unlock_irqrestore(&pool->pool_lock, flags);
-}
-
 static int rxe_add_index(struct rxe_pool_entry *elem)
 {
 	struct rxe_pool *pool = elem->pool;
@@ -342,6 +301,31 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 	return NULL;
 }
 
+void *rxe_alloc_with_key_locked(struct rxe_pool *pool, void *key)
+{
+	struct rxe_pool_entry *elem;
+	u8 *obj;
+	int err;
+
+	obj = rxe_alloc_locked(pool);
+	if (!obj)
+		return NULL;
+
+	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
+	memcpy((u8 *)elem + pool->key.key_offset, key, pool->key.key_size);
+	err = rxe_insert_key(pool, elem);
+	if (err) {
+		kfree(obj);
+		goto out_cnt;
+	}
+
+	return obj;
+
+out_cnt:
+	atomic_dec(&pool->num_elem);
+	return NULL;
+}
+
 void *rxe_alloc(struct rxe_pool *pool)
 {
 	unsigned long flags;
@@ -354,6 +338,18 @@ void *rxe_alloc(struct rxe_pool *pool)
 	return obj;
 }
 
+void *rxe_alloc_with_key(struct rxe_pool *pool, void *key)
+{
+	unsigned long flags;
+	void *obj;
+
+	write_lock_irqsave(&pool->pool_lock, flags);
+	obj = rxe_alloc_with_key_locked(pool, key);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
+
+	return obj;
+}
+
 int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 {
 	int err;
@@ -391,6 +387,9 @@ void rxe_elem_release(struct kref *kref)
 	if (pool->flags & RXE_POOL_INDEX)
 		rxe_drop_index(elem);
 
+	if (pool->flags & RXE_POOL_KEY)
+		rb_erase(&elem->key_node, &pool->key.tree);
+
 	if (!(pool->flags & RXE_POOL_NO_ALLOC)) {
 		obj = elem->obj;
 		kfree(obj);
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index f76addf87b4a..e0242d488cc8 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -104,31 +104,15 @@ void *rxe_alloc_locked(struct rxe_pool *pool);
 
 void *rxe_alloc(struct rxe_pool *pool);
 
+void *rxe_alloc_with_key_locked(struct rxe_pool *pool, void *key);
+
+void *rxe_alloc_with_key(struct rxe_pool *pool, void *key);
+
 /* connect already allocated object to pool */
 int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem);
 
 #define rxe_add_to_pool(pool, obj) __rxe_add_to_pool(pool, &(obj)->pelem)
 
-/* assign a key to a keyed object and insert object into
- * pool's rb tree holding and not holding pool_lock
- */
-int __rxe_add_key_locked(struct rxe_pool_entry *elem, void *key);
-
-#define rxe_add_key_locked(obj, key) __rxe_add_key_locked(&(obj)->pelem, key)
-
-int __rxe_add_key(struct rxe_pool_entry *elem, void *key);
-
-#define rxe_add_key(obj, key) __rxe_add_key(&(obj)->pelem, key)
-
-/* remove elem from rb tree holding and not holding the pool_lock */
-void __rxe_drop_key_locked(struct rxe_pool_entry *elem);
-
-#define rxe_drop_key_locked(obj) __rxe_drop_key_locked(&(obj)->pelem)
-
-void __rxe_drop_key(struct rxe_pool_entry *elem);
-
-#define rxe_drop_key(obj) __rxe_drop_key(&(obj)->pelem)
-
 /* lookup an indexed object from index holding and not holding the pool_lock.
  * takes a reference on object
  */
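Usage-wise, callers like create_grp() in rxe_mcast.c now pass the key
at allocation time instead of making a second call; a condensed sketch
of the before/after caller, based on the diff above:

	/* before: two steps, two failure paths */
	grp = rxe_alloc_locked(&rxe->mc_grp_pool);
	rxe_add_key_locked(grp, mgid);

	/* after: one call; the object is allocated and made findable
	 * by key atomically
	 */
	grp = rxe_alloc_with_key_locked(&rxe->mc_grp_pool, mgid);
	if (!grp)
		return ERR_PTR(-ENOMEM);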
From patchwork Fri Oct 15 22:32:47 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562897
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 06/10] RDMA/rxe: Fix potential race condition in rxe_pool
Date: Fri, 15 Oct 2021 17:32:47 -0500
Message-Id: <20211015223250.6501-7-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Currently there is a possible race condition related to rxe indexed or
keyed objects, where one thread is the last one holding a reference to
an object and drops that reference, triggering a call to
rxe_elem_release(), while at the same time another thread looks up the
object from its index or key by calling rxe_pool_get_index(/_key).
This can happen if an unexpected packet arrives as a result of a retry
attempt and looks up its rkey, or if a multicast packet arrives just
as the verbs consumer drops the mcast group.

Add locking to prevent looking up an object from its index or key
while another thread is trying to destroy the object.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 51 +++++++++++++++++-----------
 drivers/infiniband/sw/rxe/rxe_pool.h | 15 ++++++--
 2 files changed, 45 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index b0963eca75c7..e9c5e4e887c3 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -267,7 +267,7 @@ static void rxe_drop_index(struct rxe_pool_entry *elem)
 	rb_erase(&elem->index_node, &pool->index.tree);
 }
 
-void *rxe_alloc_locked(struct rxe_pool *pool)
+static void *__rxe_alloc_locked(struct rxe_pool *pool)
 {
 	struct rxe_pool_entry *elem;
 	void *obj;
@@ -280,11 +280,10 @@
 	if (!obj)
 		goto out_cnt;
 
-	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
+	elem = (struct rxe_pool_entry *)((u8 *)obj + pool->elem_offset);
 
 	elem->pool = pool;
 	elem->obj = obj;
-	kref_init(&elem->ref_cnt);
 
 	if (pool->flags & RXE_POOL_INDEX) {
 		err = rxe_add_index(elem);
@@ -301,17 +300,32 @@
 	return NULL;
 }
 
+void *rxe_alloc_locked(struct rxe_pool *pool)
+{
+	struct rxe_pool_entry *elem;
+	void *obj;
+
+	obj = __rxe_alloc_locked(pool);
+	if (!obj)
+		return NULL;
+
+	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
+	kref_init(&elem->ref_cnt);
+
+	return obj;
+}
+
 void *rxe_alloc_with_key_locked(struct rxe_pool *pool, void *key)
 {
 	struct rxe_pool_entry *elem;
-	u8 *obj;
+	void *obj;
 	int err;
 
-	obj = rxe_alloc_locked(pool);
+	obj = __rxe_alloc_locked(pool);
 	if (!obj)
 		return NULL;
 
-	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
+	elem = (struct rxe_pool_entry *)((u8 *)obj + pool->elem_offset);
 	memcpy((u8 *)elem + pool->key.key_offset, key, pool->key.key_size);
 	err = rxe_insert_key(pool, elem);
 	if (err) {
@@ -319,6 +333,8 @@ void *rxe_alloc_with_key_locked(struct rxe_pool *pool, void *key)
 		goto out_cnt;
 	}
 
+	kref_init(&elem->ref_cnt);
+
 	return obj;
 
 out_cnt:
@@ -352,14 +368,15 @@ void *rxe_alloc_with_key(struct rxe_pool *pool, void *key)
 
 int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 {
+	unsigned long flags;
 	int err;
 
+	write_lock_irqsave(&pool->pool_lock, flags);
 	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
 		goto out_cnt;
 
 	elem->pool = pool;
 	elem->obj = (u8 *)elem - pool->elem_offset;
-	kref_init(&elem->ref_cnt);
 
 	if (pool->flags & RXE_POOL_INDEX) {
 		err = rxe_add_index(elem);
@@ -367,10 +384,14 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 			goto out_cnt;
 	}
 
+	kref_init(&elem->ref_cnt);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
+
 	return 0;
 
 out_cnt:
 	atomic_dec(&pool->num_elem);
+	write_unlock_irqrestore(&pool->pool_lock, flags);
 	return -EINVAL;
 }
 
@@ -402,7 +423,7 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 {
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
-	void *obj;
+	void *obj = NULL;
 
 	node = pool->index.tree.rb_node;
 
@@ -417,12 +438,8 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 			break;
 	}
 
-	if (node) {
-		kref_get(&elem->ref_cnt);
+	if (node && kref_get_unless_zero(&elem->ref_cnt))
 		obj = elem->obj;
-	} else {
-		obj = NULL;
-	}
 
 	return obj;
 }
@@ -443,7 +460,7 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 {
 	struct rb_node *node;
 	struct rxe_pool_entry *elem;
-	void *obj;
+	void *obj = NULL;
 	int cmp;
 
 	node = pool->key.tree.rb_node;
@@ -462,12 +479,8 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 			break;
 	}
 
-	if (node) {
-		kref_get(&elem->ref_cnt);
+	if (node && kref_get_unless_zero(&elem->ref_cnt))
 		obj = elem->obj;
-	} else {
-		obj = NULL;
-	}
 
 	return obj;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index e0242d488cc8..50083cb9530e 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -131,9 +131,20 @@ void *rxe_pool_get_key(struct rxe_pool *pool, void *key);
 void rxe_elem_release(struct kref *kref);
 
 /* take a reference on an object */
-#define rxe_add_ref(elem) kref_get(&(elem)->pelem.ref_cnt)
+static inline int __rxe_add_ref(struct rxe_pool_entry *elem)
+{
+	int ret = kref_get_unless_zero(&elem->ref_cnt);
+
+	if (unlikely(!ret))
+		pr_warn("Taking a reference on a %s object that is already destroyed\n",
+			elem->pool->name);
+
+	return (ret) ? 0 : -EINVAL;
+}
+
+#define rxe_add_ref(obj) __rxe_add_ref(&(obj)->pelem)
 
 /* drop a reference on an object */
-#define rxe_drop_ref(elem) kref_put(&(elem)->pelem.ref_cnt, rxe_elem_release)
+#define rxe_drop_ref(obj) kref_put(&(obj)->pelem.ref_cnt, rxe_elem_release)
 
 #endif /* RXE_POOL_H */
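For readers unfamiliar with the idiom, the key change in the lookup
paths is replacing an unconditional kref_get() with
kref_get_unless_zero(). A sketch of the use-after-free window it
closes (illustrative timeline, not code from the patch):

	/* Interleaving that plain kref_get() permits:
	 *
	 *   CPU0 (destroyer)               CPU1 (lookup)
	 *   ----------------               -------------
	 *   kref_put(): refcount 1 -> 0
	 *                                  finds elem in rb tree
	 *                                  kref_get(): 0 -> 1,
	 *                                  resurrecting a dying object
	 *   rxe_elem_release() kfree()s obj
	 *                                  uses obj  <-- use after free
	 *
	 * kref_get_unless_zero() instead fails once the count has hit
	 * zero, and the read lock on pool_lock orders the lookup
	 * against the rb-tree erase performed while destroying the
	 * object.
	 */
	if (node && kref_get_unless_zero(&elem->ref_cnt))
		obj = elem->obj;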
From patchwork Fri Oct 15 22:32:48 2021
X-Patchwork-Submitter: Bob Pearson <rpearsonhpe@gmail.com>
X-Patchwork-Id: 12562903
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson <rpearsonhpe@gmail.com>
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-next v2 07/10] RDMA/rxe: Separate out last rxe_drop_ref
Date: Fri, 15 Oct 2021 17:32:48 -0500
Message-Id: <20211015223250.6501-8-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Currently rxe_drop_ref() can be called from a destroy or deallocate
verb or from someone releasing a reference protecting a pointer. This
leads to the possibility that the object will be destroyed after
rdma-core thinks it has been. This was the original intent of using
krefs, but it can cause other problems.

This patch separates out the last rxe_drop_ref() as rxe_fini_ref(),
which can return an error if the object is not ready to be destroyed,
and changes rxe_drop_ref() to return an error if the object has
already been destroyed. Now the destroy verbs will return -EBUSY if
the object cannot be destroyed, and other add/drop references will
return -EINVAL if the object has already been or would be destroyed.
Correct programs should not normally see these values.

Locking is added so that cleanup is executed atomically. Some
reference drops have their order changed so that references are
dropped after the pointer they protect has been successfully removed.

This patch exposes some referencing errors, which are addressed in the
following patches.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_cq.c    |   9 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |   3 +-
 drivers/infiniband/sw/rxe/rxe_mr.c    |  10 +-
 drivers/infiniband/sw/rxe/rxe_mw.c    |  23 ++--
 drivers/infiniband/sw/rxe/rxe_pool.c  | 148 ++++++++++++++++++++------
 drivers/infiniband/sw/rxe/rxe_pool.h  |  37 ++++---
 drivers/infiniband/sw/rxe/rxe_qp.c    |   1 +
 drivers/infiniband/sw/rxe/rxe_srq.c   |   8 ++
 drivers/infiniband/sw/rxe/rxe_verbs.c |  40 ++++---
 9 files changed, 192 insertions(+), 87 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index 6848426c074f..0c05d612ae63 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -141,18 +141,15 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	return 0;
 }
 
-void rxe_cq_disable(struct rxe_cq *cq)
+void rxe_cq_cleanup(struct rxe_pool_entry *arg)
 {
+	struct rxe_cq *cq = container_of(arg, typeof(*cq), pelem);
 	unsigned long flags;
 
+	/* TODO get rid of this */
 	spin_lock_irqsave(&cq->cq_lock, flags);
 	cq->is_dying = true;
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
-}
-
-void rxe_cq_cleanup(struct rxe_pool_entry *arg)
-{
-	struct rxe_cq *cq = container_of(arg, typeof(*cq), pelem);
 
 	if (cq->queue)
 		rxe_queue_cleanup(cq->queue);
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 1ca43b859d80..a25d1c9f6adb 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -35,8 +35,6 @@ int rxe_cq_resize_queue(struct rxe_cq *cq, int new_cqe,
 
 int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited);
 
-void rxe_cq_disable(struct rxe_cq *cq);
-
 void rxe_cq_cleanup(struct rxe_pool_entry *arg);
 
 /* rxe_mcast.c */
@@ -187,6 +185,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
 		      struct ib_srq_attr *attr, enum ib_srq_attr_mask mask,
 		      struct rxe_modify_srq_cmd *ucmd, struct ib_udata *udata);
+void rxe_srq_cleanup(struct rxe_pool_entry *arg);
 
 void rxe_dealloc(struct ib_device *ib_dev);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 6e71f67ccfe9..6c50c8562fd8 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -684,6 +684,8 @@ int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr)
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
+	struct rxe_pd *pd = mr_pd(mr);
+	int err;
 
 	if (atomic_read(&mr->num_mw) > 0) {
 		pr_warn("%s: Attempt to deregister an MR while bound to MWs\n",
@@ -692,8 +694,12 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 	}
 
 	mr->state = RXE_MR_STATE_INVALID;
-	rxe_drop_ref(mr_pd(mr));
-	rxe_drop_ref(mr);
+
+	err = rxe_fini_ref(mr);
+	if (err)
+		return err;
+
+	rxe_drop_ref(pd);
 
 	return 0;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index 854d0c283521..599699f93332 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -56,12 +56,15 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
 	struct rxe_mw *mw = to_rmw(ibmw);
 	struct rxe_pd *pd = to_rpd(ibmw->pd);
 	unsigned long flags;
+	int err;
 
 	spin_lock_irqsave(&mw->lock, flags);
 	rxe_do_dealloc_mw(mw);
 	spin_unlock_irqrestore(&mw->lock, flags);
 
-	rxe_drop_ref(mw);
+	err = rxe_fini_ref(mw);
+	if (err)
+		return err;
 	rxe_drop_ref(pd);
 
 	return 0;
@@ -177,9 +180,9 @@ static void rxe_do_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 	}
 
 	if (mw->length) {
+		/* take over ref on mr from caller */
 		mw->mr = mr;
 		atomic_inc(&mr->num_mw);
-		rxe_add_ref(mr);
 	}
 
 	if (mw->ibmw.type == IB_MW_TYPE_2) {
@@ -192,7 +195,7 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 {
 	int ret;
 	struct rxe_mw *mw;
-	struct rxe_mr *mr;
+	struct rxe_mr *mr = NULL;
 	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
 	u32 mw_rkey = wqe->wr.wr.mw.mw_rkey;
 	u32 mr_lkey = wqe->wr.wr.mw.mr_lkey;
@@ -217,25 +220,23 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 		}
 
 		if (unlikely(mr->lkey != mr_lkey)) {
+			rxe_drop_ref(mr);
 			ret = -EINVAL;
-			goto err_drop_mr;
+			goto err_drop_mw;
 		}
-	} else {
-		mr = NULL;
 	}
 
 	spin_lock_irqsave(&mw->lock, flags);
-
 	ret = rxe_check_bind_mw(qp, wqe, mw, mr);
-	if (ret)
+	if (ret) {
+		if (mr)
+			rxe_drop_ref(mr);
 		goto err_unlock;
+	}
 
 	rxe_do_bind_mw(qp, wqe, mw, mr);
 err_unlock:
 	spin_unlock_irqrestore(&mw->lock, flags);
-err_drop_mr:
-	if (mr)
-		rxe_drop_ref(mr);
 err_drop_mw:
 	rxe_drop_ref(mw);
 err:
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index e9c5e4e887c3..955bb283c76a 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -33,6 +33,7 @@ static const struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = {
 		.name		= "rxe-srq",
 		.size		= sizeof(struct rxe_srq),
 		.elem_offset	= offsetof(struct rxe_srq, pelem),
+		.cleanup	= rxe_srq_cleanup,
 		.flags		= RXE_POOL_INDEX | RXE_POOL_NO_ALLOC,
 		.min_index	= RXE_MIN_SRQ_INDEX,
 		.max_index	= RXE_MAX_SRQ_INDEX,
@@ -195,8 +196,11 @@ static int rxe_insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new)
 		parent = *link;
 		elem = rb_entry(parent, struct rxe_pool_entry, index_node);
 
-		if (elem->index == new->index) {
-			pr_warn("element with index = 0x%x already exists!\n",
+		/* this can happen if memory was recycled and/or the
+		 * old object was not deleted from the pool index
+		 */
+		if (unlikely(elem == new || elem->index == new->index)) {
+			pr_warn("%s#%d: already in pool\n", pool->name,
 				new->index);
 			return -EINVAL;
 		}
@@ -310,7 +314,7 @@ void *rxe_alloc_locked(struct rxe_pool *pool)
 		return NULL;
 
 	elem = (struct rxe_pool_entry *)(obj + pool->elem_offset);
-	kref_init(&elem->ref_cnt);
+	refcount_set(&elem->refcnt, 1);
 
 	return obj;
 }
@@ -333,7 +337,7 @@ void *rxe_alloc_with_key_locked(struct rxe_pool *pool, void *key)
 		goto out_cnt;
 	}
 
-	kref_init(&elem->ref_cnt);
+	refcount_set(&elem->refcnt, 1);
 
 	return obj;
 
@@ -384,7 +388,7 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 			goto out_cnt;
 	}
 
-	kref_init(&elem->ref_cnt);
+	refcount_set(&elem->refcnt, 1);
 	write_unlock_irqrestore(&pool->pool_lock, flags);
 
 	return 0;
@@ -395,30 +399,6 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_entry *elem)
 	return -EINVAL;
 }
 
-void rxe_elem_release(struct kref *kref)
-{
-	struct rxe_pool_entry *elem =
-		container_of(kref, struct rxe_pool_entry, ref_cnt);
-	struct rxe_pool *pool = elem->pool;
-	void *obj;
-
-	if (pool->cleanup)
-		pool->cleanup(elem);
-
-	if (pool->flags & RXE_POOL_INDEX)
-		rxe_drop_index(elem);
-
-	if (pool->flags & RXE_POOL_KEY)
-		rb_erase(&elem->key_node, &pool->key.tree);
-
-	if (!(pool->flags & RXE_POOL_NO_ALLOC)) {
-		obj = elem->obj;
-		kfree(obj);
-	}
-
-	atomic_dec(&pool->num_elem);
-}
-
 void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 {
 	struct rb_node *node;
@@ -438,7 +418,7 @@ void *rxe_pool_get_index_locked(struct rxe_pool *pool, u32 index)
 			break;
 	}
 
-	if (node && kref_get_unless_zero(&elem->ref_cnt))
+	if (node && refcount_inc_not_zero(&elem->refcnt))
 		obj = elem->obj;
 
 	return obj;
}
@@ -479,7 +459,7 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key)
 			break;
 	}
 
-	if (node && kref_get_unless_zero(&elem->ref_cnt))
+	if (node && refcount_inc_not_zero(&elem->refcnt))
 		obj = elem->obj;
 
 	return obj;
 }
@@ -496,3 +476,109 @@ void *rxe_pool_get_key(struct rxe_pool *pool, void *key)
 
 	return obj;
 }
+
+int __rxe_add_ref_locked(struct rxe_pool_entry *elem)
+{
+	int done;
+
+	done = refcount_inc_not_zero(&elem->refcnt);
+	if (done)
+		return 0;
+	else
+		return -EINVAL;
+}
+
+int __rxe_add_ref(struct rxe_pool_entry *elem)
+{
+	struct rxe_pool *pool = elem->pool;
+	unsigned long flags;
+	int ret;
+
+	read_lock_irqsave(&pool->pool_lock, flags);
+	ret = __rxe_add_ref_locked(elem);
+	read_unlock_irqrestore(&pool->pool_lock, flags);
+
+	return ret;
+}
+
+int __rxe_drop_ref_locked(struct rxe_pool_entry *elem)
+{
+	int done;
+
+	done = refcount_dec_not_one(&elem->refcnt);
+	if (done)
+		return 0;
+	else
+		return -EINVAL;
+}
+
+int __rxe_drop_ref(struct rxe_pool_entry *elem)
+{
+	struct rxe_pool *pool = elem->pool;
+	unsigned long flags;
+	int ret;
+
+	read_lock_irqsave(&pool->pool_lock, flags);
+	ret = __rxe_drop_ref_locked(elem);
+	read_unlock_irqrestore(&pool->pool_lock, flags);
+
+	return ret;
+}
+
+static int __rxe_fini(struct rxe_pool_entry *elem)
+{
+	struct rxe_pool *pool = elem->pool;
+	int done;
+
+	done = refcount_dec_if_one(&elem->refcnt);
+	if (done) {
+		if (pool->flags & RXE_POOL_INDEX)
+			rxe_drop_index(elem);
+		if (pool->flags & RXE_POOL_KEY)
+			rb_erase(&elem->key_node, &pool->key.tree);
+		atomic_dec(&pool->num_elem);
+		return 0;
+	} else {
+		return -EBUSY;
+	}
+}
+
+/* can only be used by pools that have a cleanup
+ * routine that can run while holding a spinlock
+ */
+int
__rxe_fini_ref_locked(struct rxe_pool_entry *elem) +{ + struct rxe_pool *pool = elem->pool; + int ret; + + ret = __rxe_fini(elem); + + if (!ret) { + if (pool->cleanup) + pool->cleanup(elem); + if (!(pool->flags & RXE_POOL_NO_ALLOC)) + kfree(elem->obj); + } + + return ret; +} + +int __rxe_fini_ref(struct rxe_pool_entry *elem) +{ + struct rxe_pool *pool = elem->pool; + unsigned long flags; + int ret; + + read_lock_irqsave(&pool->pool_lock, flags); + ret = __rxe_fini(elem); + read_unlock_irqrestore(&pool->pool_lock, flags); + + if (!ret) { + if (pool->cleanup) + pool->cleanup(elem); + if (!(pool->flags & RXE_POOL_NO_ALLOC)) + kfree(elem->obj); + } + + return ret; +} diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h index 50083cb9530e..2fe1009145a5 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.h +++ b/drivers/infiniband/sw/rxe/rxe_pool.h @@ -7,6 +7,8 @@ #ifndef RXE_POOL_H #define RXE_POOL_H +#include + #define RXE_POOL_ALIGN (16) #define RXE_POOL_CACHE_FLAGS (0) @@ -47,7 +49,7 @@ struct rxe_type_info { struct rxe_pool_entry { struct rxe_pool *pool; void *obj; - struct kref ref_cnt; + refcount_t refcnt; struct list_head list; /* only used if keyed */ @@ -127,24 +129,33 @@ void *rxe_pool_get_key_locked(struct rxe_pool *pool, void *key); void *rxe_pool_get_key(struct rxe_pool *pool, void *key); -/* cleanup an object when all references are dropped */ -void rxe_elem_release(struct kref *kref); - /* take a reference on an object */ -static inline int __rxe_add_ref(struct rxe_pool_entry *elem) -{ - int ret = kref_get_unless_zero(&elem->ref_cnt); +int __rxe_add_ref_locked(struct rxe_pool_entry *elem); - if (unlikely(!ret)) - pr_warn("Taking a reference on a %s object that is already destroyed\n", - elem->pool->name); +#define rxe_add_ref_locked(obj) __rxe_add_ref_locked(&(obj)->pelem) - return (ret) ? 
0 : -EINVAL; -} +int __rxe_add_ref(struct rxe_pool_entry *elem); #define rxe_add_ref(obj) __rxe_add_ref(&(obj)->pelem) /* drop a reference on an object */ -#define rxe_drop_ref(obj) kref_put(&(obj)->pelem.ref_cnt, rxe_elem_release) +int __rxe_drop_ref_locked(struct rxe_pool_entry *elem); + +#define rxe_drop_ref_locked(obj) __rxe_drop_ref_locked(&(obj)->pelem) + +int __rxe_drop_ref(struct rxe_pool_entry *elem); + +#define rxe_drop_ref(obj) __rxe_drop_ref(&(obj)->pelem) + +/* drop last reference on an object */ +int __rxe_fini_ref_locked(struct rxe_pool_entry *elem); + +#define rxe_fini_ref_locked(obj) __rxe_fini_ref_locked(&(obj)->pelem) + +int __rxe_fini_ref(struct rxe_pool_entry *elem); + +#define rxe_fini_ref(obj) __rxe_fini_ref(&(obj)->pelem) + +#define rxe_read_ref(obj) refcount_read(&(obj)->pelem.refcnt) #endif /* RXE_POOL_H */ diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c index 975321812c87..7503aebddcf4 100644 --- a/drivers/infiniband/sw/rxe/rxe_qp.c +++ b/drivers/infiniband/sw/rxe/rxe_qp.c @@ -835,5 +835,6 @@ void rxe_qp_cleanup(struct rxe_pool_entry *arg) { struct rxe_qp *qp = container_of(arg, typeof(*qp), pelem); + rxe_qp_destroy(qp); execute_in_process_context(rxe_qp_do_cleanup, &qp->cleanup_work); } diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c index eb1c4c3b3a78..bb00643a2929 100644 --- a/drivers/infiniband/sw/rxe/rxe_srq.c +++ b/drivers/infiniband/sw/rxe/rxe_srq.c @@ -154,3 +154,11 @@ int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq, srq->rq.queue = NULL; return err; } + +void rxe_srq_cleanup(struct rxe_pool_entry *arg) +{ + struct rxe_srq *srq = container_of(arg, typeof(*srq), pelem); + + if (srq->rq.queue) + rxe_queue_cleanup(srq->rq.queue); +} diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 84ea03bc6a26..ec1f72332737 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -115,7 +115,7 @@ static void rxe_dealloc_ucontext(struct ib_ucontext *ibuc) { struct rxe_ucontext *uc = to_ruc(ibuc); - rxe_drop_ref(uc); + rxe_fini_ref(uc); } static int rxe_port_immutable(struct ib_device *dev, u32 port_num, @@ -149,8 +149,7 @@ static int rxe_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) { struct rxe_pd *pd = to_rpd(ibpd); - rxe_drop_ref(pd); - return 0; + return rxe_fini_ref(pd); } static int rxe_create_ah(struct ib_ah *ibah, @@ -228,8 +227,7 @@ static int rxe_destroy_ah(struct ib_ah *ibah, u32 flags) { struct rxe_ah *ah = to_rah(ibah); - rxe_drop_ref(ah); - return 0; + return rxe_fini_ref(ah); } static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr) @@ -313,8 +311,8 @@ static int rxe_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *init, return 0; err2: + rxe_fini_ref(srq); rxe_drop_ref(pd); - rxe_drop_ref(srq); err1: return err; } @@ -367,12 +365,15 @@ static int rxe_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr) static int rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) { struct rxe_srq *srq = to_rsrq(ibsrq); + struct rxe_pd *pd = srq->pd; + int err; - if (srq->rq.queue) - rxe_queue_cleanup(srq->rq.queue); + err = rxe_fini_ref(srq); + if (err) + return err; + + rxe_drop_ref(pd); - rxe_drop_ref(srq->pd); - rxe_drop_ref(srq); return 0; } @@ -437,12 +438,12 @@ static int rxe_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init, err = rxe_qp_from_init(rxe, qp, pd, init, uresp, ibqp->pd, udata); if (err) - goto qp_init; + goto err_fini; 
return 0;
 
-qp_init:
-	rxe_drop_ref(qp);
+err_fini:
+	rxe_fini_ref(qp);
 
 	return err;
 }
 
@@ -485,9 +486,7 @@ static int rxe_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 {
 	struct rxe_qp *qp = to_rqp(ibqp);
 
-	rxe_qp_destroy(qp);
-	rxe_drop_ref(qp);
-	return 0;
+	return rxe_fini_ref(qp);
 }
 
 static int validate_send_wr(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
@@ -797,10 +796,7 @@ static int rxe_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct rxe_cq *cq = to_rcq(ibcq);
 
-	rxe_cq_disable(cq);
-
-	rxe_drop_ref(cq);
-	return 0;
+	return rxe_fini_ref(cq);
 }
 
 static int rxe_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
@@ -924,8 +920,8 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd,
 	return &mr->ibmr;
 
 err3:
+	rxe_fini_ref(mr);
 	rxe_drop_ref(pd);
-	rxe_drop_ref(mr);
 err2:
 	return ERR_PTR(err);
 }
@@ -956,8 +952,8 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 	return &mr->ibmr;
 
 err2:
+	rxe_fini_ref(mr);
 	rxe_drop_ref(pd);
-	rxe_drop_ref(mr);
 err1:
 	return ERR_PTR(err);
 }

From patchwork Fri Oct 15 22:32:49 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12562905
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH for-next v2 08/10] RDMA/rxe: Rewrite rxe_mcast.c
Date: Fri, 15 Oct 2021 17:32:49 -0500
Message-Id: <20211015223250.6501-9-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Rewrite rxe_mcast.c to support rxe_fini_ref() and to clean up
referencing. Drop mc_elem as an rxe pool object and just use
kmalloc()/kfree(). Take QP references as well as mc_grp references to
force all multicast groups to be detached before their QPs are
destroyed. Simplify the API exposed to rxe_verbs.c.

Signed-off-by: Bob Pearson
---
 drivers/infiniband/sw/rxe/rxe.c       |   8 -
 drivers/infiniband/sw/rxe/rxe_loc.h   |  11 +-
 drivers/infiniband/sw/rxe/rxe_mcast.c | 205 +++++++++++++++++---------
 drivers/infiniband/sw/rxe/rxe_net.c   |  22 ---
 drivers/infiniband/sw/rxe/rxe_pool.c  |   6 -
 drivers/infiniband/sw/rxe/rxe_pool.h  |   1 -
 drivers/infiniband/sw/rxe/rxe_verbs.c |  12 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |   3 +-
 8 files changed, 143 insertions(+), 125 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 8e0f9c489cab..4298a1d20ad5 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -31,7 +31,6 @@ void rxe_dealloc(struct ib_device *ib_dev)
 	rxe_pool_cleanup(&rxe->mr_pool);
 	rxe_pool_cleanup(&rxe->mw_pool);
 	rxe_pool_cleanup(&rxe->mc_grp_pool);
-	rxe_pool_cleanup(&rxe->mc_elem_pool);
 
 	if (rxe->tfm)
 		crypto_free_shash(rxe->tfm);
@@ -165,15 +164,8 @@ static int rxe_init_pools(struct rxe_dev *rxe)
 	if (err)
 		goto err9;
 
-	err = rxe_pool_init(rxe, &rxe->mc_elem_pool, RXE_TYPE_MC_ELEM,
-			    rxe->attr.max_total_mcast_qp_attach);
-	if (err)
-		goto err10;
-
 	return 0;
 
-err10:
-	rxe_pool_cleanup(&rxe->mc_grp_pool);
 err9:
 	rxe_pool_cleanup(&rxe->mw_pool);
 err8:
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index a25d1c9f6adb..78312df8eea3 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -38,19 +38,12 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited);
 void rxe_cq_cleanup(struct rxe_pool_entry *arg);
 
 /* rxe_mcast.c */
-int rxe_mcast_get_grp(struct rxe_dev *rxe, union ib_gid *mgid,
-		      struct rxe_mc_grp **grp_p);
-
 int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
-			   struct rxe_mc_grp *grp);
-
+			   union ib_gid *mgid);
 int rxe_mcast_drop_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
 			    union ib_gid *mgid);
-
 void rxe_drop_all_mcast_groups(struct rxe_qp *qp);
 
-void rxe_mc_cleanup(struct rxe_pool_entry *arg);
-
 /* rxe_mmap.c */
 struct rxe_mmap_info {
 	struct list_head
pending_mmaps; @@ -104,8 +97,6 @@ int rxe_prepare(struct rxe_pkt_info *pkt, struct sk_buff *skb); int rxe_xmit_packet(struct rxe_qp *qp, struct rxe_pkt_info *pkt, struct sk_buff *skb); const char *rxe_parent_name(struct rxe_dev *rxe, unsigned int port_num); -int rxe_mcast_add(struct rxe_dev *rxe, union ib_gid *mgid); -int rxe_mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid); /* rxe_qp.c */ int rxe_qp_chk_init(struct rxe_dev *rxe, struct ib_qp_init_attr *init); diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c index 337dc2c68051..f5a1492e9e48 100644 --- a/drivers/infiniband/sw/rxe/rxe_mcast.c +++ b/drivers/infiniband/sw/rxe/rxe_mcast.c @@ -5,116 +5,192 @@ */ #include "rxe.h" -#include "rxe_loc.h" -/* caller should hold mc_grp_pool->pool_lock */ -static struct rxe_mc_grp *create_grp(struct rxe_dev *rxe, - struct rxe_pool *pool, - union ib_gid *mgid) +static int rxe_mcast_add(struct rxe_dev *rxe, union ib_gid *mgid) { int err; + unsigned char ll_addr[ETH_ALEN]; + + ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr); + err = dev_mc_add(rxe->ndev, ll_addr); + + return err; +} + +static int rxe_mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid) +{ + int err; + unsigned char ll_addr[ETH_ALEN]; + + ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr); + err = dev_mc_del(rxe->ndev, ll_addr); + + return err; +} + +static int rxe_mcast_get_grp(struct rxe_dev *rxe, union ib_gid *mgid, + struct rxe_mc_grp **grp_p) +{ + struct rxe_pool *pool = &rxe->mc_grp_pool; struct rxe_mc_grp *grp; + unsigned long flags; + int err = 0; + + /* Perform this while holding the mc_grp_pool lock + * to prevent races where two coincident calls fail to lookup the + * same group and then both create the same group. + */ + write_lock_irqsave(&pool->pool_lock, flags); + grp = rxe_pool_get_key_locked(pool, mgid); + if (grp) + goto done; grp = rxe_alloc_with_key_locked(&rxe->mc_grp_pool, mgid); - if (!grp) - return ERR_PTR(-ENOMEM); + if (!grp) { + err = -ENOMEM; + goto done; + } INIT_LIST_HEAD(&grp->qp_list); spin_lock_init(&grp->mcg_lock); grp->rxe = rxe; err = rxe_mcast_add(rxe, mgid); - if (unlikely(err)) { - rxe_drop_ref(grp); - return ERR_PTR(err); + if (err) { + rxe_fini_ref_locked(grp); + grp = NULL; + goto done; } - return grp; + /* match the reference taken by get_key */ + rxe_add_ref_locked(grp); +done: + *grp_p = grp; + write_unlock_irqrestore(&pool->pool_lock, flags); + + return err; } -int rxe_mcast_get_grp(struct rxe_dev *rxe, union ib_gid *mgid, - struct rxe_mc_grp **grp_p) +static void rxe_mcast_put_grp(struct rxe_mc_grp *grp) { - int err; - struct rxe_mc_grp *grp; + struct rxe_dev *rxe = grp->rxe; struct rxe_pool *pool = &rxe->mc_grp_pool; unsigned long flags; - if (rxe->attr.max_mcast_qp_attach == 0) - return -EINVAL; - write_lock_irqsave(&pool->pool_lock, flags); - grp = rxe_pool_get_key_locked(pool, mgid); - if (grp) - goto done; + rxe_drop_ref_locked(grp); - grp = create_grp(rxe, pool, mgid); - if (IS_ERR(grp)) { - write_unlock_irqrestore(&pool->pool_lock, flags); - err = PTR_ERR(grp); - return err; + if (rxe_read_ref(grp) == 1) { + rxe_mcast_delete(rxe, &grp->mgid); + rxe_fini_ref_locked(grp); } -done: write_unlock_irqrestore(&pool->pool_lock, flags); - *grp_p = grp; - return 0; } +/** + * rxe_mcast_add_grp_elem() - Associate a multicast address with a QP + * @rxe: the rxe device + * @qp: the QP + * @mgid: the multicast address + * + * Each multicast group can be associated with one or more QPs and + * each QP can be associated with zero or more 
multicast groups. + * Between each multicast group associated with a QP there is a + * rxe_mc_elem struct which has two list head structs and is joined + * both to a list of QPs on the multicast group and a list of groups + * on the QP. The elem has pointers to the group and the QP and + * takes a reference for each one. + * + * Return: 0 on success or an error on failure. + */ int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp, - struct rxe_mc_grp *grp) + union ib_gid *mgid) { - int err; struct rxe_mc_elem *elem; + struct rxe_mc_grp *grp; + int err; + + if (rxe->attr.max_mcast_qp_attach == 0) + return -EINVAL; + + /* takes a ref on grp if successful */ + err = rxe_mcast_get_grp(rxe, mgid, &grp); + if (err) + return err; - /* check to see of the qp is already a member of the group */ spin_lock_bh(&qp->grp_lock); spin_lock_bh(&grp->mcg_lock); + + /* check to see if the qp is already a member of the group */ list_for_each_entry(elem, &grp->qp_list, qp_list) { - if (elem->qp == qp) { - err = 0; - goto out; - } + if (elem->qp == qp) + goto drop_ref; } if (grp->num_qp >= rxe->attr.max_mcast_qp_attach) { err = -ENOMEM; - goto out; + goto drop_ref; } - elem = rxe_alloc_locked(&rxe->mc_elem_pool); - if (!elem) { + if (atomic_read(&rxe->total_mcast_qp_attach) >= + rxe->attr.max_total_mcast_qp_attach) { err = -ENOMEM; - goto out; + goto drop_ref; } - /* each qp holds a ref on the grp */ - rxe_add_ref(grp); + elem = kmalloc(sizeof(*elem), GFP_KERNEL); + if (!elem) { + err = -ENOMEM; + goto drop_ref; + } + atomic_inc(&rxe->total_mcast_qp_attach); grp->num_qp++; + rxe_add_ref(qp); elem->qp = qp; + /* still holding a ref on grp */ elem->grp = grp; list_add(&elem->qp_list, &grp->qp_list); list_add(&elem->grp_list, &qp->grp_list); - err = 0; -out: + goto done; + +drop_ref: + rxe_drop_ref(grp); + +done: spin_unlock_bh(&grp->mcg_lock); spin_unlock_bh(&qp->grp_lock); + return err; } +/** + * rxe_mcast_drop_grp_elem() - Dissociate multicast address and QP + * @rxe: the rxe device + * @qp: the QP + * @mgid: the multicast group + * + * Walk the list of group elements to find one which matches QP + * Then delete from group and qp lists and free pointers and the elem. + * Check to see if we have removed the last qp from group and delete + * it if so. 
+ * + * Return: 0 on success else an error on failure + */ int rxe_mcast_drop_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp, union ib_gid *mgid) { - struct rxe_mc_grp *grp; struct rxe_mc_elem *elem, *tmp; + struct rxe_mc_grp *grp; + int err = 0; grp = rxe_pool_get_key(&rxe->mc_grp_pool, mgid); if (!grp) - goto err1; + return -EINVAL; spin_lock_bh(&qp->grp_lock); spin_lock_bh(&grp->mcg_lock); @@ -123,26 +199,28 @@ int rxe_mcast_drop_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp, if (elem->qp == qp) { list_del(&elem->qp_list); list_del(&elem->grp_list); + rxe_drop_ref(grp); + rxe_drop_ref(qp); grp->num_qp--; - - spin_unlock_bh(&grp->mcg_lock); - spin_unlock_bh(&qp->grp_lock); - rxe_drop_ref(elem); - rxe_drop_ref(grp); /* ref held by QP */ - rxe_drop_ref(grp); /* ref from get_key */ - return 0; + kfree(elem); + atomic_dec(&rxe->total_mcast_qp_attach); + goto done; } } + err = -EINVAL; +done: spin_unlock_bh(&grp->mcg_lock); spin_unlock_bh(&qp->grp_lock); - rxe_drop_ref(grp); /* ref from get_key */ -err1: - return -EINVAL; + + rxe_mcast_put_grp(grp); + + return err; } void rxe_drop_all_mcast_groups(struct rxe_qp *qp) { + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); struct rxe_mc_grp *grp; struct rxe_mc_elem *elem; @@ -162,15 +240,10 @@ void rxe_drop_all_mcast_groups(struct rxe_qp *qp) list_del(&elem->qp_list); grp->num_qp--; spin_unlock_bh(&grp->mcg_lock); - rxe_drop_ref(grp); - rxe_drop_ref(elem); - } -} -void rxe_mc_cleanup(struct rxe_pool_entry *arg) -{ - struct rxe_mc_grp *grp = container_of(arg, typeof(*grp), pelem); - struct rxe_dev *rxe = grp->rxe; - - rxe_mcast_delete(rxe, &grp->mgid); + kfree(elem); + atomic_dec(&rxe->total_mcast_qp_attach); + rxe_drop_ref(qp); + rxe_mcast_put_grp(grp); + } } diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index 2cb810cb890a..fcdf998ec896 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -20,28 +20,6 @@ static struct rxe_recv_sockets recv_sockets; -int rxe_mcast_add(struct rxe_dev *rxe, union ib_gid *mgid) -{ - int err; - unsigned char ll_addr[ETH_ALEN]; - - ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr); - err = dev_mc_add(rxe->ndev, ll_addr); - - return err; -} - -int rxe_mcast_delete(struct rxe_dev *rxe, union ib_gid *mgid) -{ - int err; - unsigned char ll_addr[ETH_ALEN]; - - ipv6_eth_mc_map((struct in6_addr *)mgid->raw, ll_addr); - err = dev_mc_del(rxe->ndev, ll_addr); - - return err; -} - static struct dst_entry *rxe_find_route4(struct net_device *ndev, struct in_addr *saddr, struct in_addr *daddr) diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c index 955bb283c76a..59f1a1919e30 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.c +++ b/drivers/infiniband/sw/rxe/rxe_pool.c @@ -76,16 +76,10 @@ static const struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { .name = "rxe-mc_grp", .size = sizeof(struct rxe_mc_grp), .elem_offset = offsetof(struct rxe_mc_grp, pelem), - .cleanup = rxe_mc_cleanup, .flags = RXE_POOL_KEY, .key_offset = offsetof(struct rxe_mc_grp, mgid), .key_size = sizeof(union ib_gid), }, - [RXE_TYPE_MC_ELEM] = { - .name = "rxe-mc_elem", - .size = sizeof(struct rxe_mc_elem), - .elem_offset = offsetof(struct rxe_mc_elem, pelem), - }, }; static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min) diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h index 2fe1009145a5..f04df69c52ba 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.h +++ 
b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -28,7 +28,6 @@ enum rxe_elem_type {
 	RXE_TYPE_MR,
 	RXE_TYPE_MW,
 	RXE_TYPE_MC_GRP,
-	RXE_TYPE_MC_ELEM,
 	RXE_NUM_TYPES,		/* keep me last */
 };
 
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index ec1f72332737..1b5084fd10ab 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -984,20 +984,10 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 
 static int rxe_attach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid)
 {
-	int err;
 	struct rxe_dev *rxe = to_rdev(ibqp->device);
 	struct rxe_qp *qp = to_rqp(ibqp);
-	struct rxe_mc_grp *grp;
-
-	/* takes a ref on grp if successful */
-	err = rxe_mcast_get_grp(rxe, mgid, &grp);
-	if (err)
-		return err;
-
-	err = rxe_mcast_add_grp_elem(rxe, qp, grp);
-	rxe_drop_ref(grp);
-	return err;
+	return rxe_mcast_add_grp_elem(rxe, qp, mgid);
 }
 
 static int rxe_detach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid)
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 35e041450090..4f1d7777f755 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -403,7 +403,8 @@ struct rxe_dev {
 	struct rxe_pool		mr_pool;
 	struct rxe_pool		mw_pool;
 	struct rxe_pool		mc_grp_pool;
-	struct rxe_pool		mc_elem_pool;
+
+	atomic_t		total_mcast_qp_attach;
 
 	spinlock_t		pending_lock; /* guard pending_mmaps */
 	struct list_head	pending_mmaps;

From patchwork Fri Oct 15 22:32:50 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12562907
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH for-next v2 09/10] RDMA/rxe: Fix ref error in rxe_av.c
Date: Fri, 15 Oct 2021 17:32:50 -0500
Message-Id: <20211015223250.6501-10-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

The commit referenced below can take a reference on the AH which is
never dropped. This only happens in the UD request path. This patch
optionally passes that AH back to the caller so that it can hold the
reference while the AV is being accessed and then drop it. Code to do
this is added to rxe_req.c. The AV is also passed to rxe_prepare() in
rxe_net.c as an optimization.
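The resulting calling convention can be sketched as follows. The
helper name build_one_ud_packet() is hypothetical, but rxe_get_av(),
rxe_prepare() and rxe_drop_ref() are used as this patch defines them;
this is a minimal sketch of the intended usage, not a hunk from the
patch itself:

/* hypothetical caller in the UD request path, assuming the rxe
 * internal headers are in scope
 */
static int build_one_ud_packet(struct rxe_pkt_info *pkt, struct sk_buff *skb)
{
	struct rxe_ah *ah = NULL;
	struct rxe_av *av;
	int err;

	av = rxe_get_av(pkt, &ah);	/* may take a reference on ah */
	if (unlikely(!av))
		return -EINVAL;

	err = rxe_prepare(av, pkt, skb);	/* av now passed explicitly */

	if (ah)
		rxe_drop_ref(ah);	/* matches the ref taken by rxe_get_av */

	return err;
}

The point of the design is that the AH reference is held across every
access to the AV and released by the same function that caused it to
be taken.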
Fixes: e2fe06c90806 ("RDMA/rxe: Lookup kernel AH from ah index in UD WQEs") Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_av.c | 24 ++++++++++-- drivers/infiniband/sw/rxe/rxe_loc.h | 5 ++- drivers/infiniband/sw/rxe/rxe_net.c | 17 +++++---- drivers/infiniband/sw/rxe/rxe_req.c | 55 +++++++++++++++++----------- drivers/infiniband/sw/rxe/rxe_resp.c | 2 +- 5 files changed, 67 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_av.c b/drivers/infiniband/sw/rxe/rxe_av.c index 38c7b6fb39d7..8a6910a01e66 100644 --- a/drivers/infiniband/sw/rxe/rxe_av.c +++ b/drivers/infiniband/sw/rxe/rxe_av.c @@ -99,11 +99,14 @@ void rxe_av_fill_ip_info(struct rxe_av *av, struct rdma_ah_attr *attr) av->network_type = type; } -struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt) +struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt, struct rxe_ah **ahp) { struct rxe_ah *ah; u32 ah_num; + if (ahp) + *ahp = NULL; + if (!pkt || !pkt->qp) return NULL; @@ -117,10 +120,25 @@ struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt) if (ah_num) { /* only new user provider or kernel client */ ah = rxe_pool_get_index(&pkt->rxe->ah_pool, ah_num); - if (!ah || ah->ah_num != ah_num || rxe_ah_pd(ah) != pkt->qp->pd) { - pr_warn("Unable to find AH matching ah_num\n"); + if (!ah) { + pr_warn("%s: Unable to find AH matching ah_num\n", + __func__); + return NULL; + } + + if (rxe_ah_pd(ah) != pkt->qp->pd) { + pr_warn("%s: PDs don't match for AH and QP\n", + __func__); + rxe_drop_ref(ah); return NULL; } + + /* let caller hold ref to ah */ + if (ahp) + *ahp = ah; + else + rxe_drop_ref(ah); + return &ah->av; } diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 78312df8eea3..a689ee8386b8 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -19,7 +19,7 @@ void rxe_av_to_attr(struct rxe_av *av, struct rdma_ah_attr *attr); void rxe_av_fill_ip_info(struct rxe_av *av, struct rdma_ah_attr *attr); -struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt); +struct rxe_av *rxe_get_av(struct rxe_pkt_info *pkt, struct rxe_ah **ahp); /* rxe_cq.c */ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq, @@ -93,7 +93,8 @@ void rxe_mw_cleanup(struct rxe_pool_entry *arg); /* rxe_net.c */ struct sk_buff *rxe_init_packet(struct rxe_dev *rxe, struct rxe_av *av, int paylen, struct rxe_pkt_info *pkt); -int rxe_prepare(struct rxe_pkt_info *pkt, struct sk_buff *skb); +int rxe_prepare(struct rxe_av *av, struct rxe_pkt_info *pkt, + struct sk_buff *skb); int rxe_xmit_packet(struct rxe_qp *qp, struct rxe_pkt_info *pkt, struct sk_buff *skb); const char *rxe_parent_name(struct rxe_dev *rxe, unsigned int port_num); diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index fcdf998ec896..996aabd6e57b 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -271,13 +271,13 @@ static void prepare_ipv6_hdr(struct dst_entry *dst, struct sk_buff *skb, ip6h->payload_len = htons(skb->len - sizeof(*ip6h)); } -static int prepare4(struct rxe_pkt_info *pkt, struct sk_buff *skb) +static int prepare4(struct rxe_av *av, struct rxe_pkt_info *pkt, + struct sk_buff *skb) { struct rxe_qp *qp = pkt->qp; struct dst_entry *dst; bool xnet = false; __be16 df = htons(IP_DF); - struct rxe_av *av = rxe_get_av(pkt); struct in_addr *saddr = &av->sgid_addr._sockaddr_in.sin_addr; struct in_addr *daddr = &av->dgid_addr._sockaddr_in.sin_addr; @@ -297,11 +297,11 @@ static int prepare4(struct rxe_pkt_info *pkt, struct 
sk_buff *skb) return 0; } -static int prepare6(struct rxe_pkt_info *pkt, struct sk_buff *skb) +static int prepare6(struct rxe_av *av, struct rxe_pkt_info *pkt, + struct sk_buff *skb) { struct rxe_qp *qp = pkt->qp; struct dst_entry *dst; - struct rxe_av *av = rxe_get_av(pkt); struct in6_addr *saddr = &av->sgid_addr._sockaddr_in6.sin6_addr; struct in6_addr *daddr = &av->dgid_addr._sockaddr_in6.sin6_addr; @@ -322,16 +322,17 @@ static int prepare6(struct rxe_pkt_info *pkt, struct sk_buff *skb) return 0; } -int rxe_prepare(struct rxe_pkt_info *pkt, struct sk_buff *skb) +int rxe_prepare(struct rxe_av *av, struct rxe_pkt_info *pkt, + struct sk_buff *skb) { int err = 0; if (skb->protocol == htons(ETH_P_IP)) - err = prepare4(pkt, skb); + err = prepare4(av, pkt, skb); else if (skb->protocol == htons(ETH_P_IPV6)) - err = prepare6(pkt, skb); + err = prepare6(av, pkt, skb); - if (ether_addr_equal(skb->dev->dev_addr, rxe_get_av(pkt)->dmac)) + if (ether_addr_equal(skb->dev->dev_addr, av->dmac)) pkt->mask |= RXE_LOOPBACK_MASK; return err; diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c index 0c9d2af15f3d..891cf98c74a0 100644 --- a/drivers/infiniband/sw/rxe/rxe_req.c +++ b/drivers/infiniband/sw/rxe/rxe_req.c @@ -361,6 +361,7 @@ static inline int get_mtu(struct rxe_qp *qp) } static struct sk_buff *init_req_packet(struct rxe_qp *qp, + struct rxe_av *av, struct rxe_send_wqe *wqe, int opcode, int payload, struct rxe_pkt_info *pkt) @@ -368,7 +369,6 @@ static struct sk_buff *init_req_packet(struct rxe_qp *qp, struct rxe_dev *rxe = to_rdev(qp->ibqp.device); struct sk_buff *skb; struct rxe_send_wr *ibwr = &wqe->wr; - struct rxe_av *av; int pad = (-payload) & 0x3; int paylen; int solicited; @@ -378,21 +378,9 @@ static struct sk_buff *init_req_packet(struct rxe_qp *qp, /* length from start of bth to end of icrc */ paylen = rxe_opcode[opcode].length + payload + pad + RXE_ICRC_SIZE; - - /* pkt->hdr, port_num and mask are initialized in ifc layer */ - pkt->rxe = rxe; - pkt->opcode = opcode; - pkt->qp = qp; - pkt->psn = qp->req.psn; - pkt->mask = rxe_opcode[opcode].mask; - pkt->paylen = paylen; - pkt->wqe = wqe; + pkt->paylen = paylen; /* init skb */ - av = rxe_get_av(pkt); - if (!av) - return NULL; - skb = rxe_init_packet(rxe, av, paylen, pkt); if (unlikely(!skb)) return NULL; @@ -453,13 +441,13 @@ static struct sk_buff *init_req_packet(struct rxe_qp *qp, return skb; } -static int finish_packet(struct rxe_qp *qp, struct rxe_send_wqe *wqe, - struct rxe_pkt_info *pkt, struct sk_buff *skb, - int paylen) +static int finish_packet(struct rxe_qp *qp, struct rxe_av *av, + struct rxe_send_wqe *wqe, struct rxe_pkt_info *pkt, + struct sk_buff *skb, int paylen) { int err; - err = rxe_prepare(pkt, skb); + err = rxe_prepare(av, pkt, skb); if (err) return err; @@ -614,6 +602,7 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe) int rxe_requester(void *arg) { struct rxe_qp *qp = (struct rxe_qp *)arg; + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); struct rxe_pkt_info pkt; struct sk_buff *skb; struct rxe_send_wqe *wqe; @@ -625,6 +614,8 @@ int rxe_requester(void *arg) struct rxe_send_wqe rollback_wqe; u32 rollback_psn; struct rxe_queue *q = qp->sq.queue; + struct rxe_ah *ah; + struct rxe_av *av; rxe_add_ref(qp); @@ -711,14 +702,28 @@ int rxe_requester(void *arg) payload = mtu; } - skb = init_req_packet(qp, wqe, opcode, payload, &pkt); + pkt.rxe = rxe; + pkt.opcode = opcode; + pkt.qp = qp; + pkt.psn = qp->req.psn; + pkt.mask = rxe_opcode[opcode].mask; + pkt.wqe = wqe; + + 
av = rxe_get_av(&pkt, &ah);
+	if (unlikely(!av)) {
+		pr_err("qp#%d Failed no address vector\n", qp_num(qp));
+		wqe->status = IB_WC_LOC_QP_OP_ERR;
+		goto err_drop_ah;
+	}
+
+	skb = init_req_packet(qp, av, wqe, opcode, payload, &pkt);
 	if (unlikely(!skb)) {
 		pr_err("qp#%d Failed allocating skb\n", qp_num(qp));
 		wqe->status = IB_WC_LOC_QP_OP_ERR;
-		goto err;
+		goto err_drop_ah;
 	}
 
-	ret = finish_packet(qp, wqe, &pkt, skb, payload);
+	ret = finish_packet(qp, av, wqe, &pkt, skb, payload);
 	if (unlikely(ret)) {
 		pr_debug("qp#%d Error during finish packet\n", qp_num(qp));
 		if (ret == -EFAULT)
@@ -726,9 +731,12 @@ int rxe_requester(void *arg)
 		else
 			wqe->status = IB_WC_LOC_QP_OP_ERR;
 		kfree_skb(skb);
-		goto err;
+		goto err_drop_ah;
 	}
 
+	if (ah)
+		rxe_drop_ref(ah);
+
 	/*
 	 * To prevent a race on wqe access between requester and completer,
 	 * wqe members state and psn need to be set before calling
@@ -757,6 +765,9 @@ int rxe_requester(void *arg)
 
 	goto next_wqe;
 
+err_drop_ah:
+	if (ah)
+		rxe_drop_ref(ah);
 err:
 	wqe->state = wqe_state_error;
 	__rxe_do_task(&qp->comp.task);
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index e8f435fa6e4d..f589f4dde35c 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -632,7 +632,7 @@ static struct sk_buff *prepare_ack_packet(struct rxe_qp *qp,
 	if (ack->mask & RXE_ATMACK_MASK)
 		atmack_set_orig(ack, qp->resp.atomic_orig);
 
-	err = rxe_prepare(ack, skb);
+	err = rxe_prepare(&qp->pri_av, ack, skb);
 	if (err) {
 		kfree_skb(skb);
 		return NULL;

From patchwork Fri Oct 15 22:32:51 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12562909
X-Patchwork-Delegate: jgg@ziepe.ca
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH for-next v2 10/10] RDMA/rxe: Replace mr by rkey in responder resources
Date: Fri, 15 Oct 2021 17:32:51 -0500
Message-Id: <20211015223250.6501-11-rpearsonhpe@gmail.com>
In-Reply-To: <20211015223250.6501-1-rpearsonhpe@gmail.com>
References: <20211015223250.6501-1-rpearsonhpe@gmail.com>

Currently rxe saves a copy of the MR in the responder resources for
RDMA reads. Since the responder resources are never freed, just
overwritten when more are needed, this MR may not have its reference
dropped until the QP is destroyed. This patch stores the rkey instead
of the MR, and on each subsequent packet of a multi-packet read reply
message it looks the MR up from the rkey. This makes it possible for
a user to deregister an MR or unbind an MW on the fly and still get
correct behaviour.
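The per-packet revalidation at the heart of this change looks roughly
like the sketch below. The helper send_next_read_chunk() is
hypothetical, while rxe_recheck_mr(), res->read.rkey and the
RESPST_ERR_RKEY_VIOLATION state are taken from the patch:

/* sketch of handling a later packet of a multi-packet read reply */
static enum resp_states send_next_read_chunk(struct rxe_qp *qp,
					     struct resp_res *res)
{
	struct rxe_mr *mr;

	/* look the MR up again by rkey for every packet after the
	 * first; this fails if the MR was deregistered or the MW
	 * invalidated in the meantime
	 */
	mr = rxe_recheck_mr(qp, res->read.rkey);
	if (!mr)
		return RESPST_ERR_RKEY_VIOLATION;

	/* ... build and send the next read response packet from mr ... */

	rxe_drop_ref(mr);	/* drop the ref taken by rxe_recheck_mr */
	return RESPST_DONE;
}

Holding no long-lived MR pointer in the resource means the only state
that can go stale between packets is the rkey itself, and a stale
rkey is reported as an rkey violation rather than a use-after-free.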
Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_qp.c | 10 +-- drivers/infiniband/sw/rxe/rxe_resp.c | 123 ++++++++++++++++++-------- drivers/infiniband/sw/rxe/rxe_verbs.h | 1 - 3 files changed, 87 insertions(+), 47 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c index 7503aebddcf4..23b4ffe23c4f 100644 --- a/drivers/infiniband/sw/rxe/rxe_qp.c +++ b/drivers/infiniband/sw/rxe/rxe_qp.c @@ -135,12 +135,8 @@ static void free_rd_atomic_resources(struct rxe_qp *qp) void free_rd_atomic_resource(struct rxe_qp *qp, struct resp_res *res) { - if (res->type == RXE_ATOMIC_MASK) { + if (res->type == RXE_ATOMIC_MASK) kfree_skb(res->atomic.skb); - } else if (res->type == RXE_READ_MASK) { - if (res->read.mr) - rxe_drop_ref(res->read.mr); - } res->type = 0; } @@ -816,10 +812,8 @@ static void rxe_qp_do_cleanup(struct work_struct *work) if (qp->pd) rxe_drop_ref(qp->pd); - if (qp->resp.mr) { + if (qp->resp.mr) rxe_drop_ref(qp->resp.mr); - qp->resp.mr = NULL; - } if (qp_type(qp) == IB_QPT_RC) sk_dst_reset(qp->sk->sk); diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index f589f4dde35c..c776289842e5 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -641,6 +641,78 @@ static struct sk_buff *prepare_ack_packet(struct rxe_qp *qp, return skb; } +static struct resp_res *rxe_prepare_read_res(struct rxe_qp *qp, + struct rxe_pkt_info *pkt) +{ + struct resp_res *res; + u32 pkts; + + res = &qp->resp.resources[qp->resp.res_head]; + rxe_advance_resp_resource(qp); + free_rd_atomic_resource(qp, res); + + res->type = RXE_READ_MASK; + res->replay = 0; + res->read.va = qp->resp.va + qp->resp.offset; + res->read.va_org = qp->resp.va + qp->resp.offset; + res->read.resid = qp->resp.resid; + res->read.length = qp->resp.resid; + res->read.rkey = qp->resp.rkey; + + pkts = max_t(u32, (reth_len(pkt) + qp->mtu - 1)/qp->mtu, 1); + res->first_psn = pkt->psn; + res->cur_psn = pkt->psn; + res->last_psn = (pkt->psn + pkts - 1) & BTH_PSN_MASK; + + res->state = rdatm_res_state_new; + + return res; +} + +/** + * rxe_recheck_mr - revalidate MR from rkey and get a reference + * @qp: the qp + * @rkey: the rkey + * + * This code allows the MR to be invalidated or deregistered or + * the MW if one was used to be invalidated or deallocated. + * It is assumed that the access permissions if originally good + * are OK and the mappings to be unchanged. + * + * Return: mr on success else NULL + */ +static struct rxe_mr *rxe_recheck_mr(struct rxe_qp *qp, u32 rkey) +{ + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); + struct rxe_mr *mr; + struct rxe_mw *mw; + + if (rkey_is_mw(rkey)) { + mw = rxe_pool_get_index(&rxe->mw_pool, rkey >> 8); + if (!mw || mw->rkey != rkey) + return NULL; + + if (mw->state != RXE_MW_STATE_VALID) { + rxe_drop_ref(mw); + return NULL; + } + + mr = mw->mr; + rxe_drop_ref(mw); + } else { + mr = rxe_pool_get_index(&rxe->mr_pool, rkey >> 8); + if (!mr || mr->rkey != rkey) + return NULL; + } + + if (mr->state != RXE_MR_STATE_VALID) { + rxe_drop_ref(mr); + return NULL; + } + + return mr; +} + /* RDMA read response. If res is not NULL, then we have a current RDMA request * being processed or replayed. */ @@ -655,53 +727,26 @@ static enum resp_states read_reply(struct rxe_qp *qp, int opcode; int err; struct resp_res *res = qp->resp.res; + struct rxe_mr *mr; if (!res) { - /* This is the first time we process that request. 
Get a - * resource - */ - res = &qp->resp.resources[qp->resp.res_head]; - - free_rd_atomic_resource(qp, res); - rxe_advance_resp_resource(qp); - - res->type = RXE_READ_MASK; - res->replay = 0; - - res->read.va = qp->resp.va + - qp->resp.offset; - res->read.va_org = qp->resp.va + - qp->resp.offset; - - res->first_psn = req_pkt->psn; - - if (reth_len(req_pkt)) { - res->last_psn = (req_pkt->psn + - (reth_len(req_pkt) + mtu - 1) / - mtu - 1) & BTH_PSN_MASK; - } else { - res->last_psn = res->first_psn; - } - res->cur_psn = req_pkt->psn; - - res->read.resid = qp->resp.resid; - res->read.length = qp->resp.resid; - res->read.rkey = qp->resp.rkey; - - /* note res inherits the reference to mr from qp */ - res->read.mr = qp->resp.mr; - qp->resp.mr = NULL; - - qp->resp.res = res; - res->state = rdatm_res_state_new; + res = rxe_prepare_read_res(qp, req_pkt); + qp->resp.res = res; } if (res->state == rdatm_res_state_new) { + mr = qp->resp.mr; + qp->resp.mr = NULL; + if (res->read.resid <= mtu) opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_ONLY; else opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST; } else { + mr = rxe_recheck_mr(qp, res->read.rkey); + if (!mr) + return RESPST_ERR_RKEY_VIOLATION; + if (res->read.resid > mtu) opcode = IB_OPCODE_RC_RDMA_READ_RESPONSE_MIDDLE; else @@ -717,10 +762,12 @@ static enum resp_states read_reply(struct rxe_qp *qp, if (!skb) return RESPST_ERR_RNR; - err = rxe_mr_copy(res->read.mr, res->read.va, payload_addr(&ack_pkt), + err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt), payload, RXE_FROM_MR_OBJ); if (err) pr_err("Failed copying memory\n"); + if (mr) + rxe_drop_ref(mr); if (bth_pad(&ack_pkt)) { u8 *pad = payload_addr(&ack_pkt) + payload; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 4f1d7777f755..0cfbef7a36c9 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -157,7 +157,6 @@ struct resp_res { struct sk_buff *skb; } atomic; struct { - struct rxe_mr *mr; u64 va_org; u32 rkey; u32 length;