From patchwork Thu Feb 27 22:39:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995276 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0289927FE8E for ; Thu, 27 Feb 2025 22:39:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.153.30 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695992; cv=none; b=H0sBpcL2xbt4micUUzyoqluX6MdP4n9knqt/893bsp5oJbfkUT5T4tdP/czE17QbJ/s6zkSzRpOBCUcSS3WF7CQeZo95rogf4/wUnLKIhS8qXBKKeRA/Uwg6uDn7GO2Zi5w8xmJNaC5EJQjHnxvI+TPbLkBmlQwCPRK7mmBpnUQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695992; c=relaxed/simple; bh=dHxCVDya3+j5PBBwUvC3UohhRNjgvI3ccXZHig0pYfE=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=sI9ZLiADLAK0630JBuvdGWlpKvGLMw0ulSNIA7/PJA1iRP5p30SV0zd8gcTZGu7wGKN6EioB7KlwZEln1VAI3oIcdT6CLliBtBKmXKLlxVYWd7vnenWHi0MKTvwtpAVabfE1akWH8vMJChURtvRop/52h6HsaFI/ND4cJupA3uM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=aIPfjGx1; arc=none smtp.client-ip=67.231.153.30 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="aIPfjGx1" Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 51RMdi6U011870 for ; 
Thu, 27 Feb 2025 14:39:49 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=717Fucz3MwnGMkuf5ctouUv4BVsW93ozRQBD9opCJkQ=; b=aIPfjGx1z/yV 5TRQN8Qjk/WqZZSDDuNdIFzdWFk0Gy7NfDnXp9GMaB/r+bEgwPL5MvZPTBzCM5u5 AoRF13MrFDX9FRaUygErP2dXgM0hf1DGj4f1d6pTClkFBhyUig4FjryZ50kUpBGU SQP1Sq4hakF3cfgWmmqOC+9Qa7/xd4TSfB2tf/OMmZZ37XMM5GzqqfzL1ezDr69O C4/n28L2+MzlNp8Z0ZwOGYochL6Si4+ZEp+L/qt7Q5lrtJhhwXhrHEfAwMqER90u fCFgdRTihblFGkkQfHcjzg9ZoT8ZYCcqqbdJng3L5ykKIgi4gQzTL6eBxJJoTsHo fx5nFy7gqQ== Received: from maileast.thefacebook.com ([163.114.135.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 452wtb9ntv-8 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:39:49 -0800 (PST) Received: from twshared40462.17.frc2.facebook.com (2620:10d:c0a8:1b::2d) by mail.thefacebook.com (2620:10d:c0a9:6f::237c) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.1544.14; Thu, 27 Feb 2025 22:39:14 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id 6DE311888280C; Thu, 27 Feb 2025 14:39:17 -0800 (PST) From: Keith Busch To: , , , , CC: , , Keith Busch Subject: [PATCHv8 1/6] io_uring/rw: move buffer_select outside generic prep Date: Thu, 27 Feb 2025 14:39:11 -0800 Message-ID: <20250227223916.143006-2-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: q4n5a7dtYgeiII92u-ydMcYk9-G-4nzl X-Proofpoint-ORIG-GUID: q4n5a7dtYgeiII92u-ydMcYk9-G-4nzl X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 
definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Keith Busch Cleans up the generic rw prep to not require the do_import flag. Use a different prep function for callers that might need buffer select. Based-on-a-patch-by: Jens Axboe Signed-off-by: Keith Busch Reviewed-by: Ming Lei --- io_uring/rw.c | 45 ++++++++++++++++++++++++++++----------------- 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/io_uring/rw.c b/io_uring/rw.c index 788f06fbd7db1..b21b423b3cf8f 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -248,8 +248,8 @@ static int io_prep_rw_pi(struct io_kiocb *req, struct io_rw *rw, int ddir, return ret; } -static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, - int ddir, bool do_import) +static int __io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, + int ddir) { struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); unsigned ioprio; @@ -285,14 +285,6 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, rw->len = READ_ONCE(sqe->len); rw->flags = READ_ONCE(sqe->rw_flags); - if (do_import && !io_do_buffer_select(req)) { - struct io_async_rw *io = req->async_data; - - ret = io_import_rw_buffer(ddir, req, io, 0); - if (unlikely(ret)) - return ret; - } - attr_type_mask = READ_ONCE(sqe->attr_type_mask); if (attr_type_mask) { u64 attr_ptr; @@ -307,26 +299,45 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, return 0; } +static int io_rw_do_import(struct io_kiocb *req, int ddir) +{ + if (io_do_buffer_select(req)) + return 0; + + return io_import_rw_buffer(ddir, req, req->async_data, 0); +} + +static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, + int ddir) +{ + int ret; + + ret = __io_prep_rw(req, sqe, ddir); + if (unlikely(ret)) + return ret; + + return io_rw_do_import(req, ddir); +} + int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe) { - return io_prep_rw(req, sqe, ITER_DEST, true); + return io_prep_rw(req, 
sqe, ITER_DEST); } int io_prep_write(struct io_kiocb *req, const struct io_uring_sqe *sqe) { - return io_prep_rw(req, sqe, ITER_SOURCE, true); + return io_prep_rw(req, sqe, ITER_SOURCE); } static int io_prep_rwv(struct io_kiocb *req, const struct io_uring_sqe *sqe, int ddir) { - const bool do_import = !(req->flags & REQ_F_BUFFER_SELECT); int ret; - ret = io_prep_rw(req, sqe, ddir, do_import); + ret = io_prep_rw(req, sqe, ddir); if (unlikely(ret)) return ret; - if (do_import) + if (!(req->flags & REQ_F_BUFFER_SELECT)) return 0; /* @@ -353,7 +364,7 @@ static int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe struct io_async_rw *io; int ret; - ret = io_prep_rw(req, sqe, ddir, false); + ret = __io_prep_rw(req, sqe, ddir); if (unlikely(ret)) return ret; @@ -386,7 +397,7 @@ int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) if (!(req->flags & REQ_F_BUFFER_SELECT)) return -EINVAL; - ret = io_prep_rw(req, sqe, ITER_DEST, false); + ret = __io_prep_rw(req, sqe, ITER_DEST); if (unlikely(ret)) return ret; From patchwork Thu Feb 27 22:39:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995272 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B657C23E35D for ; Thu, 27 Feb 2025 22:39:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.145.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695967; cv=none; b=hWwyv+s6Sz8JyJcHInGGSz61UD6QJDqBagK5ejcokxrkQunYEZc1mYhYLLujC1Gvtql9J0H2oJMsQSfuQP3dtbTDo3Be0n/dlnh/7LHlDd/dFjOgOgdvgsTA4LJQL37nWlD9m8vf6KPsVvglwdqOft9D1kvZ9mCbERR0CPnr7Q4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695967; 
c=relaxed/simple; bh=k7dm2mYm+tEvw2wjaJRDLcdLZPDwdFZt1/ZEhdyjjqc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=d3g4wpfirtZ8/Of7AwwjQn86SBL9/1kffHeAkdAcworlIUCQcwsaTQT4QHMo96qMfgDcmEPyid+TIFW7TxCAjOMsjl+le3oIiXzq06pasDF5uIGpzsslKM4ufzPUKLudwTLcP1A0ZHjnGg83FcLQbrAZz36sIW3/1nNJGs/TCHY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=RWt2pHWc; arc=none smtp.client-ip=67.231.145.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="RWt2pHWc" Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 51RMbmhg032434 for ; Thu, 27 Feb 2025 14:39:25 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=Oz2RWLCKlHp+FkPvD5TrQfmOyHekSvWyNtpF3IJ/nNs=; b=RWt2pHWcTnAI q0UTWv2qYlVNG84RUYWHmHqI0jBZYi87ShcebDLa1k1lWt/gQtcY1Kq1oc3anOps XhvHb6WfN3RH9OZvxmx9aKFpGnhDzELYziYVUf0BLHHK0SrMJ31ahej4AZNduqFU 1ViNj5J0jY70QhQNO4XMQn8ifPHR0BOImONnnl+Pgp4Q7VoCrb9PjERmGRLohS7O I4LgKxO+Wi1bNlOLlVb0bUhKe8LTtk2x1XFiEvEDbv2lGwleJgpSA1eTYkziAf4a rCF2Kx8nookQZD2YaSgmQlgsY98Mf1zE6jc6NElcVxbngSTqS/peJYI/zKhywPkr xKXGj4r4ng== Received: from mail.thefacebook.com ([163.114.134.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 4530qwg44m-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:39:24 -0800 (PST) Received: from twshared32179.32.frc3.facebook.com (2620:10d:c085:208::f) by 
mail.thefacebook.com (2620:10d:c08b:78::c78f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.1544.14; Thu, 27 Feb 2025 22:39:13 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id 844791888280D; Thu, 27 Feb 2025 14:39:17 -0800 (PST) From: Keith Busch To: , , , , CC: , , Keith Busch Subject: [PATCHv8 2/6] io_uring/rw: move fixed buffer import to issue path Date: Thu, 27 Feb 2025 14:39:12 -0800 Message-ID: <20250227223916.143006-3-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: wv3UMmiZwjtik8VJKQZL0PSGQfx7FFhk X-Proofpoint-GUID: wv3UMmiZwjtik8VJKQZL0PSGQfx7FFhk X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Keith Busch Registered buffers may depend on a linked command, which makes the prep path too early to import. Move to the issue path when the node is actually needed like all the other users of fixed buffers. 
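The effect of deferring the import can be sketched in plain userspace C. This is only a model under stated assumptions: `model_req`, `model_import`, `model_init_fixed` and `model_issue` are hypothetical stand-ins, not kernel structures or functions. It mirrors the shape of io_init_rw_fixed() in the diff below, where the import runs from the issue handler and is skipped once bytes_done is non-zero (presumably to cover a retry after partial progress).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a request; not struct io_kiocb. */
struct model_req {
	size_t bytes_done;	/* progress recorded by an earlier attempt */
	int import_calls;	/* how many times the buffer was imported */
};

/* Stand-in for the registered buffer import; always succeeds here. */
static int model_import(struct model_req *req)
{
	req->import_calls++;
	return 0;
}

/* Import at issue time, but only if no progress has been made yet. */
static int model_init_fixed(struct model_req *req)
{
	if (req->bytes_done)
		return 0;
	return model_import(req);
}

/* Issue handler: set up the buffer first, then record 'progress' bytes. */
static int model_issue(struct model_req *req, size_t progress)
{
	int ret;

	assert(req != NULL);
	ret = model_init_fixed(req);
	if (ret)
		return ret;
	req->bytes_done += progress;
	return 0;
}
```

In this model a prep step never touches the buffer table, so a linked command that installs the buffer before the read or write is issued would be visible to the import.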
Signed-off-by: Keith Busch Reviewed-by: Ming Lei --- io_uring/opdef.c | 4 ++-- io_uring/rw.c | 39 ++++++++++++++++++++++++++++++--------- io_uring/rw.h | 2 ++ 3 files changed, 34 insertions(+), 11 deletions(-) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 89f50ecadeaf3..9511262c513e4 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -105,7 +105,7 @@ const struct io_issue_def io_issue_defs[] = { .iopoll_queue = 1, .async_size = sizeof(struct io_async_rw), .prep = io_prep_read_fixed, - .issue = io_read, + .issue = io_read_fixed, }, [IORING_OP_WRITE_FIXED] = { .needs_file = 1, @@ -119,7 +119,7 @@ const struct io_issue_def io_issue_defs[] = { .iopoll_queue = 1, .async_size = sizeof(struct io_async_rw), .prep = io_prep_write_fixed, - .issue = io_write, + .issue = io_write_fixed, }, [IORING_OP_POLL_ADD] = { .needs_file = 1, diff --git a/io_uring/rw.c b/io_uring/rw.c index b21b423b3cf8f..7bc23802a388e 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -357,31 +357,30 @@ int io_prep_writev(struct io_kiocb *req, const struct io_uring_sqe *sqe) return io_prep_rwv(req, sqe, ITER_SOURCE); } -static int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe, +static int io_init_rw_fixed(struct io_kiocb *req, unsigned int issue_flags, int ddir) { struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); - struct io_async_rw *io; + struct io_async_rw *io = req->async_data; int ret; - ret = __io_prep_rw(req, sqe, ddir); - if (unlikely(ret)) - return ret; + if (io->bytes_done) + return 0; - io = req->async_data; - ret = io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir, 0); + ret = io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir, + issue_flags); iov_iter_save_state(&io->iter, &io->iter_state); return ret; } int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe) { - return io_prep_rw_fixed(req, sqe, ITER_DEST); + return __io_prep_rw(req, sqe, ITER_DEST); } int io_prep_write_fixed(struct io_kiocb *req, const struct 
io_uring_sqe *sqe) { - return io_prep_rw_fixed(req, sqe, ITER_SOURCE); + return __io_prep_rw(req, sqe, ITER_SOURCE); } /* @@ -1147,6 +1146,28 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags) } } +int io_read_fixed(struct io_kiocb *req, unsigned int issue_flags) +{ + int ret; + + ret = io_init_rw_fixed(req, issue_flags, ITER_DEST); + if (unlikely(ret)) + return ret; + + return io_read(req, issue_flags); +} + +int io_write_fixed(struct io_kiocb *req, unsigned int issue_flags) +{ + int ret; + + ret = io_init_rw_fixed(req, issue_flags, ITER_SOURCE); + if (unlikely(ret)) + return ret; + + return io_write(req, issue_flags); +} + void io_rw_fail(struct io_kiocb *req) { int res; diff --git a/io_uring/rw.h b/io_uring/rw.h index a45e0c71b59d6..bf121b81ebe84 100644 --- a/io_uring/rw.h +++ b/io_uring/rw.h @@ -38,6 +38,8 @@ int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_prep_write(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_read(struct io_kiocb *req, unsigned int issue_flags); int io_write(struct io_kiocb *req, unsigned int issue_flags); +int io_read_fixed(struct io_kiocb *req, unsigned int issue_flags); +int io_write_fixed(struct io_kiocb *req, unsigned int issue_flags); void io_readv_writev_cleanup(struct io_kiocb *req); void io_rw_fail(struct io_kiocb *req); void io_req_rw_complete(struct io_kiocb *req, io_tw_token_t tw); From patchwork Thu Feb 27 22:39:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995278 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0C2026F469 for ; Thu, 27 Feb 2025 22:40:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.153.30 ARC-Seal: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740696006; cv=none; b=qyi2pX8Tv1RYKcjIbbE+cPOy08dPx0w6OL/EXGBB1caOeDySTO+YW/EUlbvys5SyE6sjLxbvjX8ijVPwak2nU5+N4Slpm9/ljkWyNuWF6dYNM/kUpb7i1GqJUlULuJx0TYXqxh9HiPZhY9MNm6i7z6zoDpATBYoY/j55S+r9xag= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740696006; c=relaxed/simple; bh=VCYCRmQ8GMsCSkoa2Oba5wZwysPaX545Y5agb0250v0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=T21X1u9bQlhhCI1Iamtcu5mIvy4SY2tPC3NlNRoW1GKHp3LZccnYS0Yr0HFIEDfq6QGRfCjCBHd2z0nw8tapEPFEjK9z80VH750PKj0F1sVzzcf5JZY42pru3F/YLFlBThVeqbD8oRtnzasH3aoReq/pQxr+v7NYKAc8VCyT/18= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=Akr2HU5E; arc=none smtp.client-ip=67.231.153.30 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="Akr2HU5E" Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 51RMdiR9011881 for ; Thu, 27 Feb 2025 14:40:03 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=c/rvD21U8zQ6S1LUw7Ft4KPhPGl3wO16d7e5rYdZ1sM=; b=Akr2HU5EQlFn Ei4zV5oDaV2kKUYbt/L7XlG3V5JUGz6bI19UNQ3x6u7r8gms+h3VrQ8PqYWMazhR KQ9vCXFmfOjfJADjrvtLYaPyvy5k9/pUCrmSYBlPfDg21K4X+TfMFD+LV2IbXfFp YQVtdDQE1EJr17Qp0+7IaljDqwfPiEadsgK7yxbEEjPHAlCbi7QXScyvW1KidVId 2n14Ixw8CYdhrlLw9b1/+a7um08H8Z4z5/261pWJwWF2GzZrclfHE5OR+L4rxGj0 
VYHaBUi0U8MO//xYYey57/2FHKVnfZWWtW64vi9B6i3bx/N0OWgDXoFJI5uzt7zY 7VuHBm4uZA== Received: from maileast.thefacebook.com ([163.114.135.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 452wtb9nsk-17 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:40:03 -0800 (PST) Received: from twshared40462.17.frc2.facebook.com (2620:10d:c0a8:fe::f072) by mail.thefacebook.com (2620:10d:c0a9:6f::8fd4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.1544.14; Thu, 27 Feb 2025 22:39:14 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id C14AF18882811; Thu, 27 Feb 2025 14:39:17 -0800 (PST) From: Keith Busch To: , , , , CC: , , Xinyu Zhang , Keith Busch Subject: [PATCHv8 3/6] nvme: map uring_cmd data even if address is 0 Date: Thu, 27 Feb 2025 14:39:13 -0800 Message-ID: <20250227223916.143006-4-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: JyZcHzVH-RReOtcrp5O5aBrDG3NEIgna X-Proofpoint-ORIG-GUID: JyZcHzVH-RReOtcrp5O5aBrDG3NEIgna X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Xinyu Zhang When using kernel registered bvec fixed buffers, the "address" is actually the offset into the bvec rather than a userspace address. Therefore it can be 0. We can skip checking whether the address is NULL before mapping uring_cmd data. A bad userspace address will be handled properly later when the user buffer is imported. With this patch, we will be able to use the kernel registered bvec fixed buffers in io_uring NVMe passthru with ublk zero-copy support.
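The change in the mapping condition can be illustrated with a small userspace sketch. The helper names `should_map_old`, `should_map_new` and `model_check_mapping` are hypothetical and exist only for this illustration; they model the `if` condition touched in nvme_uring_cmd_io(), nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition before the patch: a zero address suppresses the mapping. */
static bool should_map_old(uint64_t addr, uint32_t data_len)
{
	return addr && data_len;
}

/* Condition after the patch: only the data length decides. */
static bool should_map_new(uint64_t addr, uint32_t data_len)
{
	(void)addr;	/* may legitimately be 0: an offset into the bvec */
	return data_len != 0;
}

/* Walk through the cases that matter for a kernel registered buffer. */
static void model_check_mapping(void)
{
	/* Offset 0 with 4096 bytes of data: skipped before, mapped now. */
	assert(!should_map_old(0, 4096));
	assert(should_map_new(0, 4096));

	/* No data: never mapped, regardless of address. */
	assert(!should_map_new(0, 0));
	assert(!should_map_new(0x1000, 0));

	/* Non-zero address with data: both variants agree. */
	assert(should_map_old(0x1000, 4096) == should_map_new(0x1000, 4096));
}
```

The only behavioural difference in this model is the (addr == 0, data_len != 0) case, which is exactly the one a buffer offset of 0 produces.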
Reviewed-by: Caleb Sander Mateos Reviewed-by: Jens Axboe Reviewed-by: Ming Lei Signed-off-by: Xinyu Zhang Signed-off-by: Keith Busch --- drivers/nvme/host/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index fb266cf1f8c66..98a0750c0cda5 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -512,7 +512,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, return PTR_ERR(req); req->timeout = d.timeout_ms ? msecs_to_jiffies(d.timeout_ms) : 0; - if (d.addr && d.data_len) { + if (d.data_len) { ret = nvme_map_user_request(req, d.addr, d.data_len, nvme_to_user_ptr(d.metadata), d.metadata_len, ioucmd, vec, issue_flags); From patchwork Thu Feb 27 22:39:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995277 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4749B22B5BC for ; Thu, 27 Feb 2025 22:40:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.153.30 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740696003; cv=none; b=rtHcMNBeOQXCFY0wZWD43SvRa1L+0ek2t7veqX5zhbVbZjEZN/6kJ96QoIb9qqSZf32v8Nv+eCI3TMQy9OpFwcfHJ1DPaeiVHDvNJLmunLOjdG7LQoaTddH0HjNvmudpgqPcppWvZfp/Nj7PRd/8IHTzIWPsZcsb2Eww+yy+JFM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740696003; c=relaxed/simple; bh=WdX7C6heYWDbxjcINIFdpEstg6W2EbKyUpQ9SZC8K1E=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=ZmqriHYjRpiwEqqEAXcmi1tPhDqy98670fuXL6p0FT3iLrunMpHPK8YdlCD+IecgCspDbb5ngGydkBsqFNiKJxE/OkbrkcXvYTA0kluqHqKA2gKxQYdqNxemdHX3vJhWHubXVv51xbZOdh357oE3NgPRNk1hu86YfnIFK1wO198= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=hA9zVb0s; arc=none smtp.client-ip=67.231.153.30 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="hA9zVb0s" Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 51RMdiR2011881 for ; Thu, 27 Feb 2025 14:40:00 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=ElZ0wok2zPHHu3RVXvRiuOKG8By28N0zN2EuqI29MbQ=; b=hA9zVb0s7SGf U1MYAirrfOcnlQqIr6zXMSGjlFEU2zwi6GC8y9+opAPtFLvrCORPb/3QC3IcMuEv okLPUlQIjYd75FFPWmD74ZLgJsc21cOn+hPfqbzmUP2iRhe4hcZ9uzGO5Iz/MwhK oBcLPLpn1MSo3HkR+vc1KECy8Tf2OBvMqQkhzqgt+qHFH2sIWaucF8vgZZAlcIYz dyVCRbnmb93eGWoBBhArQKxsDhEuVtCz5fHVFxBKxifqWprXe5GGz5dH1sI4oLLs CO1YF2iE8AdlKJWdu5GofwPXoR/0kX/SlWrVsHqAmZDfUhv6htNAP+fJWol19VQq vGnhck+9wA== Received: from maileast.thefacebook.com ([163.114.135.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 452wtb9nsk-11 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:39:59 -0800 (PST) Received: from twshared40462.17.frc2.facebook.com (2620:10d:c0a8:fe::f072) by mail.thefacebook.com (2620:10d:c0a9:6f::8fd4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 
15.2.1544.14; Thu, 27 Feb 2025 22:39:14 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id 1A82518882814; Thu, 27 Feb 2025 14:39:17 -0800 (PST) From: Keith Busch To: , , , , CC: , , Keith Busch Subject: [PATCHv8 4/6] io_uring: add support for kernel registered bvecs Date: Thu, 27 Feb 2025 14:39:14 -0800 Message-ID: <20250227223916.143006-5-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: VtZoIV1DKzfYt5Qaqw3SR2zb8xfOWFiX X-Proofpoint-ORIG-GUID: VtZoIV1DKzfYt5Qaqw3SR2zb8xfOWFiX X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Keith Busch Provide an interface for the kernel to leverage the existing pre-registered buffers that io_uring provides. User space can reference these later to achieve zero-copy IO. User space must register an empty fixed buffer table with io_uring in order for the kernel to make use of it. 
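The slot semantics of the new interface can be modelled in userspace C. This is a rough sketch, not the kernel implementation: every `model_*` and `MODEL_*` name is hypothetical, the direction values 0 (dest) and 1 (source) are an assumption about ITER_DEST and ITER_SOURCE, and the -EFAULT returned for an empty slot is likewise an assumption of the model. The checks follow io_buffer_register_bvec(), io_buffer_unregister_bvec() and the direction test added to io_import_fixed() in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed values, modelled on ITER_DEST / ITER_SOURCE. */
enum { MODEL_ITER_DEST = 0, MODEL_ITER_SOURCE = 1 };
enum {
	MODEL_IMU_DEST   = 1 << MODEL_ITER_DEST,
	MODEL_IMU_SOURCE = 1 << MODEL_ITER_SOURCE,
};

struct model_buf {
	unsigned char dir;	/* allowed direction mask */
	int in_use;		/* slot holds a registered buffer */
};

/* Stand-in for the fixed buffer table; user space sizes it up front. */
struct model_table {
	struct model_buf *slots;
	unsigned int nr;
};

/* Install a buffer at 'index'; a write request may only act as a source. */
static int model_register(struct model_table *t, unsigned int index, int is_write)
{
	assert(t != NULL);
	if (index >= t->nr)
		return -EINVAL;
	if (t->slots[index].in_use)
		return -EBUSY;
	t->slots[index].in_use = 1;
	t->slots[index].dir = is_write ? MODEL_IMU_SOURCE : MODEL_IMU_DEST;
	return 0;
}

/* Clear a slot; out-of-range or empty slots are silently ignored. */
static void model_unregister(struct model_table *t, unsigned int index)
{
	if (index >= t->nr || !t->slots[index].in_use)
		return;
	t->slots[index].in_use = 0;
	t->slots[index].dir = 0;
}

/* Import for direction 'ddir'; fails if the slot forbids that direction. */
static int model_import(const struct model_table *t, unsigned int index, int ddir)
{
	if (index >= t->nr || !t->slots[index].in_use)
		return -EFAULT;
	if (!(t->slots[index].dir & (1 << ddir)))
		return -EFAULT;
	return 0;
}
```

The table has to exist before anything can be installed, which is why an empty (sparse) table must be registered from user space first; in the model that corresponds to allocating `slots` with every `in_use` field set to 0.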
Signed-off-by: Keith Busch Reviewed-by: Ming Lei --- include/linux/io_uring/cmd.h | 7 ++ io_uring/io_uring.c | 3 + io_uring/rsrc.c | 123 +++++++++++++++++++++++++++++++++-- io_uring/rsrc.h | 9 +++ io_uring/rw.c | 3 + 5 files changed, 138 insertions(+), 7 deletions(-) diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h index 87150dc0a07cf..cf8d80d847344 100644 --- a/include/linux/io_uring/cmd.h +++ b/include/linux/io_uring/cmd.h @@ -4,6 +4,7 @@ #include #include +#include /* only top 8 bits of sqe->uring_cmd_flags for kernel internal use */ #define IORING_URING_CMD_CANCELABLE (1U << 30) @@ -125,4 +126,10 @@ static inline struct io_uring_cmd_data *io_uring_cmd_get_async_data(struct io_ur return cmd_to_io_kiocb(cmd)->async_data; } +int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, + void (*release)(void *), unsigned int index, + unsigned int issue_flags); +void io_buffer_unregister_bvec(struct io_uring_cmd *cmd, unsigned int index, + unsigned int issue_flags); + #endif /* _LINUX_IO_URING_CMD_H */ diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index db1c0792def63..2f5dd47e7dbf5 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3947,6 +3947,9 @@ static int __init io_uring_init(void) io_uring_optable_init(); + /* imu->dir is u8 */ + BUILD_BUG_ON((IO_IMU_DEST | IO_IMU_SOURCE) > U8_MAX); + /* * Allow user copy in the per-command field, which starts after the * file in io_kiocb and until the opcode field. 
The openat2 handling diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index f814526982c36..0eceaf2e03777 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -9,6 +9,7 @@ #include #include #include +#include #include @@ -101,17 +102,23 @@ int io_buffer_validate(struct iovec *iov) return 0; } -static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_rsrc_node *node) +static void io_release_ubuf(void *priv) { - struct io_mapped_ubuf *imu = node->buf; + struct io_mapped_ubuf *imu = priv; unsigned int i; - if (!refcount_dec_and_test(&imu->refs)) - return; for (i = 0; i < imu->nr_bvecs; i++) unpin_user_page(imu->bvec[i].bv_page); +} + +static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu) +{ + if (!refcount_dec_and_test(&imu->refs)) + return; + if (imu->acct_pages) io_unaccount_mem(ctx, imu->acct_pages); + imu->release(imu->priv); kvfree(imu); } @@ -451,7 +458,7 @@ void io_free_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node *node) break; case IORING_RSRC_BUFFER: if (node->buf) - io_buffer_unmap(ctx, node); + io_buffer_unmap(ctx, node->buf); break; default: WARN_ON_ONCE(1); @@ -761,6 +768,10 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, imu->len = iov->iov_len; imu->nr_bvecs = nr_pages; imu->folio_shift = PAGE_SHIFT; + imu->release = io_release_ubuf; + imu->priv = imu; + imu->is_kbuf = false; + imu->dir = IO_IMU_DEST | IO_IMU_SOURCE; if (coalesced) imu->folio_shift = data.folio_shift; refcount_set(&imu->refs, 1); @@ -857,6 +868,95 @@ int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg, return ret; } +int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, + void (*release)(void *), unsigned int index, + unsigned int issue_flags) +{ + struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx; + struct io_rsrc_data *data = &ctx->buf_table; + struct req_iterator rq_iter; + struct io_mapped_ubuf *imu; + struct io_rsrc_node *node; + struct bio_vec bv, *bvec; + u16 nr_bvecs; 
+ int ret = 0; + + io_ring_submit_lock(ctx, issue_flags); + if (index >= data->nr) { + ret = -EINVAL; + goto unlock; + } + index = array_index_nospec(index, data->nr); + + if (data->nodes[index]) { + ret = -EBUSY; + goto unlock; + } + + node = io_rsrc_node_alloc(IORING_RSRC_BUFFER); + if (!node) { + ret = -ENOMEM; + goto unlock; + } + + nr_bvecs = blk_rq_nr_phys_segments(rq); + imu = kvmalloc(struct_size(imu, bvec, nr_bvecs), GFP_KERNEL); + if (!imu) { + kfree(node); + ret = -ENOMEM; + goto unlock; + } + + imu->ubuf = 0; + imu->len = blk_rq_bytes(rq); + imu->acct_pages = 0; + imu->folio_shift = PAGE_SHIFT; + imu->nr_bvecs = nr_bvecs; + refcount_set(&imu->refs, 1); + imu->release = release; + imu->priv = rq; + imu->is_kbuf = true; + + if (op_is_write(req_op(rq))) + imu->dir = IO_IMU_SOURCE; + else + imu->dir = IO_IMU_DEST; + + bvec = imu->bvec; + rq_for_each_bvec(bv, rq, rq_iter) + *bvec++ = bv; + + node->buf = imu; + data->nodes[index] = node; +unlock: + io_ring_submit_unlock(ctx, issue_flags); + return ret; +} +EXPORT_SYMBOL_GPL(io_buffer_register_bvec); + +void io_buffer_unregister_bvec(struct io_uring_cmd *cmd, unsigned int index, + unsigned int issue_flags) +{ + struct io_ring_ctx *ctx = cmd_to_io_kiocb(cmd)->ctx; + struct io_rsrc_data *data = &ctx->buf_table; + struct io_rsrc_node *node; + + io_ring_submit_lock(ctx, issue_flags); + if (index >= data->nr) + goto unlock; + index = array_index_nospec(index, data->nr); + + node = data->nodes[index]; + if (!node || !node->buf->is_kbuf) + goto unlock; + + io_put_rsrc_node(ctx, node); + data->nodes[index] = NULL; +unlock: + io_ring_submit_unlock(ctx, issue_flags); +} +EXPORT_SYMBOL_GPL(io_buffer_unregister_bvec); + static int io_import_fixed(int ddir, struct iov_iter *iter, struct io_mapped_ubuf *imu, u64 buf_addr, size_t len) @@ -871,6 +971,8 @@ static int io_import_fixed(int ddir, struct iov_iter *iter, /* not inside the mapped region */ if (unlikely(buf_addr < imu->ubuf || buf_end > (imu->ubuf + imu->len))) return 
-EFAULT; + if (!(imu->dir & (1 << ddir))) + return -EFAULT; /* * Might not be a start of buffer, set size appropriately @@ -883,8 +985,8 @@ static int io_import_fixed(int ddir, struct iov_iter *iter, /* * Don't use iov_iter_advance() here, as it's really slow for * using the latter parts of a big fixed buffer - it iterates - * over each segment manually. We can cheat a bit here, because - * we know that: + * over each segment manually. We can cheat a bit here for user + * registered nodes, because we know that: * * 1) it's a BVEC iter, we set it up * 2) all bvecs are the same in size, except potentially the @@ -898,8 +1000,15 @@ static int io_import_fixed(int ddir, struct iov_iter *iter, */ const struct bio_vec *bvec = imu->bvec; + /* + * Kernel buffer bvecs, on the other hand, don't necessarily + * have the size property of user registered ones, so we have + * to use the slow iter advance. + */ if (offset < bvec->bv_len) { iter->iov_offset = offset; + } else if (imu->is_kbuf) { + iov_iter_advance(iter, offset); } else { unsigned long seg_skip; diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index f0e9080599646..7600e2736eeb3 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -20,6 +20,11 @@ struct io_rsrc_node { }; }; +enum { + IO_IMU_DEST = 1 << ITER_DEST, + IO_IMU_SOURCE = 1 << ITER_SOURCE, +}; + struct io_mapped_ubuf { u64 ubuf; unsigned int len; @@ -27,6 +32,10 @@ struct io_mapped_ubuf { unsigned int folio_shift; refcount_t refs; unsigned long acct_pages; + void (*release)(void *); + void *priv; + bool is_kbuf; + u8 dir; struct bio_vec bvec[] __counted_by(nr_bvecs); }; diff --git a/io_uring/rw.c b/io_uring/rw.c index 7bc23802a388e..5ee9f8949e8ba 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -629,6 +629,7 @@ static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb) */ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter) { + struct io_kiocb *req = cmd_to_io_kiocb(rw); struct kiocb *kiocb = &rw->kiocb; struct file *file = 
kiocb->ki_filp; ssize_t ret = 0; @@ -644,6 +645,8 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter) if ((kiocb->ki_flags & IOCB_NOWAIT) && !(kiocb->ki_filp->f_flags & O_NONBLOCK)) return -EAGAIN; + if ((req->flags & REQ_F_BUF_NODE) && req->buf_node->buf->is_kbuf) + return -EFAULT; ppos = io_kiocb_ppos(kiocb); From patchwork Thu Feb 27 22:39:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995274 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A6A2270EC8 for ; Thu, 27 Feb 2025 22:39:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.145.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695969; cv=none; b=ifgJrcNB5+orwNZOBVdykv4M043sDP+21iomqy22RhhXTXb3EDIEZokpdvKKq89VmqAvTMUkqZeeJNLFXi8M2hPY3k364ONFkN2Iph+t2ngZ82HlNQpmMpu3pC6RkFCristl7tHY7LolHI+1ev+ptfeB8EnKfijBVTHYGiFXTP0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695969; c=relaxed/simple; bh=0+0C9RGt9fCy4+ZjRlJjHdJWrlJg1ksiB/2RJboOotI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TACXQmLOXj1yi8UDz954iqb6B5GZlcc2RNqgd/El6jDtonzxR4WXO5FZCYmKPbVbnuhXaDQww+dMjS5zB4yHp/sUAuacBPNCv5QloT+QSwSbCYAQUjq0+zzAelBOKpmJb7JcU27BHqAkuaG3r/ThbRIHnuUDVoxPCUQRuw6pOw0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=NPpadwCw; arc=none smtp.client-ip=67.231.145.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com 
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="NPpadwCw" Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 51RMbmhm032434 for ; Thu, 27 Feb 2025 14:39:27 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=J9Cg0chcoH6lM3GPPTxsII2bjK54fyiWhbfyyN7RC7Y=; b=NPpadwCwYYKy rGz5bbrX001Yx5x4diyqsS202KAWf4ybc7S3fwRCnG1xPQdJ5+pDLkqNr5xgCKn0 kzzSjkCNqsDGE/vFI0xCHjXoV1odszXD5BZjpv4nDGMpwQpmnPP8feqWVndbAQtW aZ1wE9DMkKiCle1r7H8gMVi3r2dBrcWAb2sxElQfLVfY0JzURqbs3WT+PISEQQHf u9aMHRxOCoizI6PsTLGxS8aSkvSP/N7rRK+UroWJ4B0gCzxOQXHs5eheMHScf4WZ uvqvDOZfEyECKEp9bW4P1RXSZTkb/lMRTo/uIval2rLuJ+BY14PQ33Gk6+cu2bLU k6vDSv2CzA== Received: from mail.thefacebook.com ([163.114.134.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 4530qwg44m-11 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:39:27 -0800 (PST) Received: from twshared9216.15.frc2.facebook.com (2620:10d:c085:108::150d) by mail.thefacebook.com (2620:10d:c08b:78::c78f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.1544.14; Thu, 27 Feb 2025 22:39:15 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id 7815718882818; Thu, 27 Feb 2025 14:39:18 -0800 (PST) From: Keith Busch To: , , , , CC: , , Keith Busch Subject: [PATCHv8 5/6] ublk: zc register/unregister bvec Date: Thu, 27 Feb 2025 14:39:15 -0800 Message-ID: <20250227223916.143006-6-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: 
io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: a9MRyGbpaYNGuh08FOviisymq4I_ukyX X-Proofpoint-GUID: a9MRyGbpaYNGuh08FOviisymq4I_ukyX X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Keith Busch Provide new operations for the user to request mapping an active request to an io uring instance's buf_table. The user has to provide the index it wants to install the buffer. A reference count is taken on the request to ensure it can't be completed while it is active in a ring's buf_table. Signed-off-by: Keith Busch Reviewed-by: Ming Lei --- drivers/block/ublk_drv.c | 59 ++++++++++++++++++++++++++++++----- include/uapi/linux/ublk_cmd.h | 4 +++ 2 files changed, 56 insertions(+), 7 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index e8f52d8341fba..8d7d6862a80f5 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -51,6 +51,9 @@ /* private ioctl command mirror */ #define UBLK_CMD_DEL_DEV_ASYNC _IOC_NR(UBLK_U_CMD_DEL_DEV_ASYNC) +#define UBLK_IO_REGISTER_IO_BUF _IOC_NR(UBLK_U_IO_REGISTER_IO_BUF) +#define UBLK_IO_UNREGISTER_IO_BUF _IOC_NR(UBLK_U_IO_UNREGISTER_IO_BUF) + /* All UBLK_F_* have to be included into UBLK_F_ALL */ #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \ | UBLK_F_URING_CMD_COMP_IN_TASK \ @@ -197,12 +200,14 @@ struct ublk_params_header { static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq); +static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub, + struct ublk_queue *ubq, int tag, size_t offset); static inline unsigned int ublk_req_build_flags(struct request *req); static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq, int tag); static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub) { - return ub->dev_info.flags & 
UBLK_F_USER_COPY; + return ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY); } static inline bool ublk_dev_is_zoned(const struct ublk_device *ub) @@ -592,7 +597,7 @@ static void ublk_apply_params(struct ublk_device *ub) static inline bool ublk_support_user_copy(const struct ublk_queue *ubq) { - return ubq->flags & UBLK_F_USER_COPY; + return ubq->flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY); } static inline bool ublk_need_req_ref(const struct ublk_queue *ubq) @@ -1758,6 +1763,45 @@ static inline void ublk_prep_cancel(struct io_uring_cmd *cmd, io_uring_cmd_mark_cancelable(cmd, issue_flags); } +static void ublk_io_release(void *priv) +{ + struct request *rq = priv; + struct ublk_queue *ubq = rq->mq_hctx->driver_data; + + ublk_put_req_ref(ubq, rq); +} + +static int ublk_register_io_buf(struct io_uring_cmd *cmd, + struct ublk_queue *ubq, unsigned int tag, + const struct ublksrv_io_cmd *ub_cmd, + unsigned int issue_flags) +{ + struct ublk_device *ub = cmd->file->private_data; + int index = (int)ub_cmd->addr, ret; + struct request *req; + + req = __ublk_check_and_get_req(ub, ubq, tag, 0); + if (!req) + return -EINVAL; + + ret = io_buffer_register_bvec(cmd, req, ublk_io_release, index, + issue_flags); + if (ret) { + ublk_put_req_ref(ubq, req); + return ret; + } + + return 0; +} + +static int ublk_unregister_io_buf(struct io_uring_cmd *cmd, + const struct ublksrv_io_cmd *ub_cmd, + unsigned int issue_flags) +{ + io_buffer_unregister_bvec(cmd, ub_cmd->addr, issue_flags); + return 0; +} + static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags, const struct ublksrv_io_cmd *ub_cmd) @@ -1809,6 +1853,10 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd, ret = -EINVAL; switch (_IOC_NR(cmd_op)) { + case UBLK_IO_REGISTER_IO_BUF: + return ublk_register_io_buf(cmd, ubq, tag, ub_cmd, issue_flags); + case UBLK_IO_UNREGISTER_IO_BUF: + return ublk_unregister_io_buf(cmd, ub_cmd, issue_flags); case UBLK_IO_FETCH_REQ: /* 
UBLK_IO_FETCH_REQ is only allowed before queue is setup */ if (ublk_queue_ready(ubq)) { @@ -2475,7 +2523,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd) * buffer by pwrite() to ublk char device, which can't be * used for unprivileged device */ - if (info.flags & UBLK_F_USER_COPY) + if (info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY)) return -EINVAL; } @@ -2543,9 +2591,6 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd) goto out_free_dev_number; } - /* We are not ready to support zero copy */ - ub->dev_info.flags &= ~UBLK_F_SUPPORT_ZERO_COPY; - ub->dev_info.nr_hw_queues = min_t(unsigned int, ub->dev_info.nr_hw_queues, nr_cpu_ids); ublk_align_max_io_size(ub); @@ -2876,7 +2921,7 @@ static int ublk_ctrl_get_features(struct io_uring_cmd *cmd) { const struct ublksrv_ctrl_cmd *header = io_uring_sqe_cmd(cmd->sqe); void __user *argp = (void __user *)(unsigned long)header->addr; - u64 features = UBLK_F_ALL & ~UBLK_F_SUPPORT_ZERO_COPY; + u64 features = UBLK_F_ALL; if (header->len != UBLK_FEATURES_LEN || !header->addr) return -EINVAL; diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h index 8093acdeaa114..7255b36b5cf63 100644 --- a/include/uapi/linux/ublk_cmd.h +++ b/include/uapi/linux/ublk_cmd.h @@ -94,6 +94,10 @@ _IOWR('u', UBLK_IO_COMMIT_AND_FETCH_REQ, struct ublksrv_io_cmd) #define UBLK_U_IO_NEED_GET_DATA \ _IOWR('u', UBLK_IO_NEED_GET_DATA, struct ublksrv_io_cmd) +#define UBLK_U_IO_REGISTER_IO_BUF \ + _IOWR('u', 0x23, struct ublksrv_io_cmd) +#define UBLK_U_IO_UNREGISTER_IO_BUF \ + _IOWR('u', 0x24, struct ublksrv_io_cmd) /* only ABORT means that no re-fetch */ #define UBLK_IO_RES_OK 0 From patchwork Thu Feb 27 22:39:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13995275 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7ED8F274259 for ; Thu, 27 Feb 2025 22:39:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=67.231.153.30 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695978; cv=none; b=Q1Cn0PSWKbHjcX3CBx6nus0oo7PQjjzZtD5A1GWKPaszTaruUUUMF6a3RrpA+kl0E4/21FSYKAsTNeWWIppppwbOvp/NAop0mufx3LAsRUkCfVklt6RBHqnyZJ6S47dSDAxdOMjfkKbs7a5LMlktLse5XiXbKEI9oqOYcE/lFOs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740695978; c=relaxed/simple; bh=CTqOkhMmDRNpT9rLqdAGDeq34k3ZtCMxmHRaMLiexZA=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DuM/8aNEjRBdFwFI5A2r9TRs6V0bNMLkQSrHM1AsbrTeJejxWwS7RKY6tJ8r18b1EXDE3OwarwI284CsDbwpUD9GxkflcFFOizJS2bETbgq7UUQbbZPo6Y4OgeKO3jgH12WHL789M0V5EAFDB15cZ/XZvYucAsGsC/gOfoewGVk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com; spf=pass smtp.mailfrom=meta.com; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b=NOGrHNhK; arc=none smtp.client-ip=67.231.153.30 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=meta.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="NOGrHNhK" Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.18.1.2/8.18.1.2) with ESMTP id 51RMbZ5C025607 for ; Thu, 27 Feb 2025 14:39:35 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=cc :content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=s2048-2021-q4; bh=UK4VKrnbbqRbsvlFbP+kOC2S+sGsNhIP2BWp/9Ksa3U=; b=NOGrHNhKfI1T 
By7qcndvAVJqWeQXauCmBIF6mP6FRyBDf69DzRKde/y8lTgD6BXDa8+NYkb+tdHo IdfYDKYHAbUBlnOGoWimnMlz2LcSRlO+4c+WTQZFvnjAuETBAhiAmXZhppEKY7aN c92Jwi9UkpLdKTM3jxUcvmB9CpRAxL8z9GlGU/KoUUIHy5xQHZx17dp4nieYQ30c JLBphyyhxkBlM6MGE6U08Kk6haVhhqXRjkpL0XTTcvQMiZ3U+klTeynvPP73g+Pp bSfMHRxEAb1tp5tRBf7p9OxWzpUzA57kQFf8v9AdHcRXa5ukV8hK4SlxuKtuoxow 2KvAAmgEsw== Received: from mail.thefacebook.com ([163.114.134.16]) by m0001303.ppops.net (PPS) with ESMTPS id 452w27j4a9-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 27 Feb 2025 14:39:35 -0800 (PST) Received: from twshared18153.09.ash9.facebook.com (2620:10d:c085:108::150d) by mail.thefacebook.com (2620:10d:c08b:78::c78f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.1544.14; Thu, 27 Feb 2025 22:39:24 +0000 Received: by devbig638.nha1.facebook.com (Postfix, from userid 544533) id 8A5E21888281A; Thu, 27 Feb 2025 14:39:18 -0800 (PST) From: Keith Busch To: , , , , CC: , , Keith Busch Subject: [PATCHv8 6/6] io_uring: cache nodes and mapped buffers Date: Thu, 27 Feb 2025 14:39:16 -0800 Message-ID: <20250227223916.143006-7-kbusch@meta.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250227223916.143006-1-kbusch@meta.com> References: <20250227223916.143006-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: 6po2IQL67PAJSKZnHz6nI2W4pY1l2XlR X-Proofpoint-GUID: 6po2IQL67PAJSKZnHz6nI2W4pY1l2XlR X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-02-27_08,2025-02-27_01,2024-11-22_01 From: Keith Busch Frequent alloc/free cycles on these is pretty costly. Use an io cache to more efficiently reuse these buffers. 
Signed-off-by: Keith Busch --- include/linux/io_uring_types.h | 2 + io_uring/filetable.c | 2 +- io_uring/io_uring.c | 2 + io_uring/rsrc.c | 70 +++++++++++++++++++++++++++------- io_uring/rsrc.h | 4 +- 5 files changed, 64 insertions(+), 16 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index c0fe8a00fe53a..3ce87dcd99eec 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -294,6 +294,8 @@ struct io_ring_ctx { struct io_file_table file_table; struct io_rsrc_data buf_table; + struct io_alloc_cache node_cache; + struct io_alloc_cache imu_cache; struct io_submit_state submit_state; diff --git a/io_uring/filetable.c b/io_uring/filetable.c index dd8eeec97acf6..a21660e3145ab 100644 --- a/io_uring/filetable.c +++ b/io_uring/filetable.c @@ -68,7 +68,7 @@ static int io_install_fixed_file(struct io_ring_ctx *ctx, struct file *file, if (slot_index >= ctx->file_table.data.nr) return -EINVAL; - node = io_rsrc_node_alloc(IORING_RSRC_FILE); + node = io_rsrc_node_alloc(ctx, IORING_RSRC_FILE); if (!node) return -ENOMEM; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 2f5dd47e7dbf5..8f542a5f20a60 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -292,6 +292,7 @@ static void io_free_alloc_caches(struct io_ring_ctx *ctx) io_alloc_cache_free(&ctx->uring_cache, kfree); io_alloc_cache_free(&ctx->msg_cache, kfree); io_futex_cache_free(ctx); + io_rsrc_cache_free(ctx); } static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) @@ -339,6 +340,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) ret |= io_alloc_cache_init(&ctx->msg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_kiocb), 0); ret |= io_futex_cache_init(ctx); + ret |= io_rsrc_cache_init(ctx); if (ret) goto free_ref; init_completion(&ctx->ref_comp); diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 0eceaf2e03777..450b4c039334d 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ 
-33,6 +33,8 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, #define IORING_MAX_FIXED_FILES (1U << 20) #define IORING_MAX_REG_BUFFERS (1U << 14) +#define IO_CACHED_BVECS_SEGS 32 + int __io_account_mem(struct user_struct *user, unsigned long nr_pages) { unsigned long page_limit, cur_pages, new_pages; @@ -111,6 +113,22 @@ static void io_release_ubuf(void *priv) unpin_user_page(imu->bvec[i].bv_page); } +static struct io_mapped_ubuf *io_alloc_imu(struct io_ring_ctx *ctx, + int nr_bvecs) +{ + if (nr_bvecs <= IO_CACHED_BVECS_SEGS) + return io_cache_alloc(&ctx->imu_cache, GFP_KERNEL); + return kvmalloc(struct_size_t(struct io_mapped_ubuf, bvec, nr_bvecs), + GFP_KERNEL); +} + +static void io_free_imu(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu) +{ + if (imu->nr_bvecs > IO_CACHED_BVECS_SEGS || + !io_alloc_cache_put(&ctx->imu_cache, imu)) + kvfree(imu); +} + static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu) { if (!refcount_dec_and_test(&imu->refs)) @@ -119,22 +137,44 @@ static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu) if (imu->acct_pages) io_unaccount_mem(ctx, imu->acct_pages); imu->release(imu->priv); - kvfree(imu); } -struct io_rsrc_node *io_rsrc_node_alloc(int type) +struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx, int type) { struct io_rsrc_node *node; - node = kzalloc(sizeof(*node), GFP_KERNEL); + node = io_cache_alloc(&ctx->node_cache, GFP_KERNEL); if (node) { node->type = type; node->refs = 1; + node->tag = 0; + node->file_ptr = 0; } return node; } -__cold void io_rsrc_data_free(struct io_ring_ctx *ctx, struct io_rsrc_data *data) +bool io_rsrc_cache_init(struct io_ring_ctx *ctx) +{ + const int imu_cache_size = struct_size_t(struct io_mapped_ubuf, bvec, + IO_CACHED_BVECS_SEGS); + const int node_size = sizeof(struct io_rsrc_node); + bool ret; + + ret = io_alloc_cache_init(&ctx->node_cache, IO_ALLOC_CACHE_MAX, + node_size, 0); + ret |= 
io_alloc_cache_init(&ctx->imu_cache, IO_ALLOC_CACHE_MAX, + imu_cache_size, 0); + return ret; +} + +void io_rsrc_cache_free(struct io_ring_ctx *ctx) +{ + io_alloc_cache_free(&ctx->node_cache, kfree); + io_alloc_cache_free(&ctx->imu_cache, kfree); +} + +__cold void io_rsrc_data_free(struct io_ring_ctx *ctx, + struct io_rsrc_data *data) { if (!data->nr) return; @@ -207,7 +247,7 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx, err = -EBADF; break; } - node = io_rsrc_node_alloc(IORING_RSRC_FILE); + node = io_rsrc_node_alloc(ctx, IORING_RSRC_FILE); if (!node) { err = -ENOMEM; fput(file); @@ -465,7 +505,8 @@ void io_free_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node *node) break; } - kfree(node); + if (!io_alloc_cache_put(&ctx->node_cache, node)) + kvfree(node); } int io_sqe_files_unregister(struct io_ring_ctx *ctx) @@ -527,7 +568,7 @@ int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg, goto fail; } ret = -ENOMEM; - node = io_rsrc_node_alloc(IORING_RSRC_FILE); + node = io_rsrc_node_alloc(ctx, IORING_RSRC_FILE); if (!node) { fput(file); goto fail; @@ -732,7 +773,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, if (!iov->iov_base) return NULL; - node = io_rsrc_node_alloc(IORING_RSRC_BUFFER); + node = io_rsrc_node_alloc(ctx, IORING_RSRC_BUFFER); if (!node) return ERR_PTR(-ENOMEM); node->buf = NULL; @@ -752,10 +793,11 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, coalesced = io_coalesce_buffer(&pages, &nr_pages, &data); } - imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL); + imu = io_alloc_imu(ctx, nr_pages); if (!imu) goto done; + imu->nr_bvecs = nr_pages; ret = io_buffer_account_pin(ctx, pages, nr_pages, imu, last_hpage); if (ret) { unpin_user_pages(pages, nr_pages); @@ -766,7 +808,6 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, /* store original address for later verification */ imu->ubuf = (unsigned long) iov->iov_base; 
imu->len = iov->iov_len; - imu->nr_bvecs = nr_pages; imu->folio_shift = PAGE_SHIFT; imu->release = io_release_ubuf; imu->priv = imu; @@ -789,7 +830,8 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, } done: if (ret) { - kvfree(imu); + if (imu) + io_free_imu(ctx, imu); if (node) io_put_rsrc_node(ctx, node); node = ERR_PTR(ret); @@ -893,14 +935,14 @@ int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, goto unlock; } - node = io_rsrc_node_alloc(IORING_RSRC_BUFFER); + node = io_rsrc_node_alloc(ctx, IORING_RSRC_BUFFER); if (!node) { ret = -ENOMEM; goto unlock; } nr_bvecs = blk_rq_nr_phys_segments(rq); - imu = kvmalloc(struct_size(imu, bvec, nr_bvecs), GFP_KERNEL); + imu = io_alloc_imu(ctx, nr_bvecs); if (!imu) { kfree(node); ret = -ENOMEM; @@ -1137,7 +1179,7 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx if (!src_node) { dst_node = NULL; } else { - dst_node = io_rsrc_node_alloc(IORING_RSRC_BUFFER); + dst_node = io_rsrc_node_alloc(ctx, IORING_RSRC_BUFFER); if (!dst_node) { ret = -ENOMEM; goto out_free; diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index 7600e2736eeb3..27e545694d01e 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -48,7 +48,9 @@ struct io_imu_folio_data { unsigned int nr_folios; }; -struct io_rsrc_node *io_rsrc_node_alloc(int type); +bool io_rsrc_cache_init(struct io_ring_ctx *ctx); +void io_rsrc_cache_free(struct io_ring_ctx *ctx); +struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx, int type); void io_free_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node *node); void io_rsrc_data_free(struct io_ring_ctx *ctx, struct io_rsrc_data *data); int io_rsrc_data_alloc(struct io_rsrc_data *data, unsigned nr);
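
For readers following the allocation change in patch 6/6, here is a minimal userspace sketch of the size-thresholded cache idea behind io_alloc_imu() and io_free_imu(). It is not the kernel code: struct buf_cache, buf_alloc(), buf_free(), cache_drain() and CACHE_MAX are hypothetical names made up for illustration, and malloc()/free() stand in for io_cache_alloc()/kvmalloc() and io_alloc_cache_put()/kvfree(). The point it tries to show is that every cached object is allocated at the maximum cached size (CACHED_SEGS segments, mirroring IO_CACHED_BVECS_SEGS), so any request at or below the threshold can reuse any cached object, while larger requests bypass the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHED_SEGS 32	/* mirrors IO_CACHED_BVECS_SEGS from the patch */
#define CACHE_MAX   4	/* hypothetical cap on the number of cached objects */

struct seg {
	void *page;
	unsigned int len;
	unsigned int off;
};

/* Stand-in for struct io_mapped_ubuf: a header plus a flexible array. */
struct mapped_buf {
	unsigned int nr_segs;
	struct seg segs[];
};

/* Hypothetical, simplified stand-in for struct io_alloc_cache. */
struct buf_cache {
	void *entries[CACHE_MAX];
	unsigned int nr;
};

static size_t buf_size(unsigned int nr_segs)
{
	return sizeof(struct mapped_buf) + nr_segs * sizeof(struct seg);
}

/*
 * Small requests reuse a cached object when one is available, otherwise
 * they allocate one at the full cached size. Large requests always get
 * an exact-size allocation that never enters the cache.
 */
static struct mapped_buf *buf_alloc(struct buf_cache *c, unsigned int nr_segs)
{
	struct mapped_buf *b;

	if (nr_segs > CACHED_SEGS)
		b = malloc(buf_size(nr_segs));
	else if (c->nr)
		b = c->entries[--c->nr];
	else
		b = malloc(buf_size(CACHED_SEGS));

	/* The free path reads nr_segs, so set it before anything can fail. */
	if (b)
		b->nr_segs = nr_segs;
	return b;
}

/* Return small objects to the cache if there is room, else free them. */
static void buf_free(struct buf_cache *c, struct mapped_buf *b)
{
	if (b->nr_segs <= CACHED_SEGS && c->nr < CACHE_MAX) {
		c->entries[c->nr++] = b;
		return;
	}
	free(b);
}

/* Release everything still held by the cache. */
static void cache_drain(struct buf_cache *c)
{
	while (c->nr)
		free(c->entries[--c->nr]);
}
```

In this model, allocating 8 segments, freeing them, and then allocating 16 segments hands back the same object, whereas a request for CACHED_SEGS + 1 segments is never cached. The sketch also suggests why the patch assigns imu->nr_bvecs immediately after io_alloc_imu() in io_sqe_buffer_register(): io_free_imu() decides between the cache and kvfree() based on that field, so it presumably has to be valid before the first error path can run.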