From patchwork Thu Oct 17 16:09:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13840308
From: Keith Busch
CC: Keith Busch
Subject: [PATCHv8 2/6] block: use generic u16 for write hints
Date: Thu, 17 Oct 2024 09:09:33 -0700
Message-ID: <20241017160937.2283225-3-kbusch@meta.com>
In-Reply-To: <20241017160937.2283225-1-kbusch@meta.com>
References: <20241017160937.2283225-1-kbusch@meta.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Keith Busch

This is still backwards compatible with lifetime hints. It just doesn't
constrain the hints to that definition.

Signed-off-by: Keith Busch
Reviewed-by: Hannes Reinecke
---
 include/linux/blk-mq.h    | 3 +--
 include/linux/blk_types.h | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 59e9adf815a49..bf007a4081d9b 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -8,7 +8,6 @@
 #include
 #include
 #include
-#include
 
 struct blk_mq_tags;
 struct blk_flush_queue;
@@ -156,7 +155,7 @@ struct request {
 	struct blk_crypto_keyslot *crypt_keyslot;
 #endif
 
-	enum rw_hint write_hint;
+	unsigned short write_hint;
 	unsigned short ioprio;
 
 	enum mq_rq_state state;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index dce7615c35e7e..56b7fb961e0c7 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -219,7 +219,7 @@ struct bio {
 	 */
 	unsigned short		bi_flags;	/* BIO_* below */
 	unsigned short		bi_ioprio;
-	enum rw_hint		bi_write_hint;
+	unsigned short		bi_write_hint;
 	blk_status_t		bi_status;
 	atomic_t		__bi_remaining;
 

From patchwork Thu Oct 17 16:09:34 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13840309
From: Keith Busch
CC: Keith Busch
Subject: [PATCHv8 3/6] block: introduce max_write_hints queue limit
Date: Thu, 17 Oct 2024 09:09:34 -0700
Message-ID: <20241017160937.2283225-4-kbusch@meta.com>
In-Reply-To: <20241017160937.2283225-1-kbusch@meta.com>
References: <20241017160937.2283225-1-kbusch@meta.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Keith Busch

Drivers with hardware that support write hints need a way to export how
many are available so applications can generically query this.

Signed-off-by: Keith Busch
Reviewed-by: Hannes Reinecke
---
 Documentation/ABI/stable/sysfs-block |  7 +++++++
 block/blk-settings.c                 |  3 +++
 block/blk-sysfs.c                    |  3 +++
 block/fops.c                         |  2 ++
 include/linux/blkdev.h               | 12 ++++++++++++
 5 files changed, 27 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 8353611107154..f2db2cabb8e75 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -506,6 +506,13 @@ Description:
 		[RO] Maximum size in bytes of a single element in a DMA
 		scatter/gather list.
 
+What:		/sys/block//queue/max_write_hints
+Date:		October 2024
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RO] Maximum number of write hints supported, 0 if not
+		supported. If supported, valid values are 1 through
+		max_write_hints, inclusive.
 
 What:		/sys/block//queue/max_segments
 Date:		March 2010
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a446654ddee5e..921fb4d334fa4 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -43,6 +43,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 	lim->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
 
 	/* Inherit limits from component devices */
+	lim->max_write_hints = USHRT_MAX;
 	lim->max_segments = USHRT_MAX;
 	lim->max_discard_segments = USHRT_MAX;
 	lim->max_hw_sectors = UINT_MAX;
@@ -544,6 +545,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	t->max_segment_size = min_not_zero(t->max_segment_size,
 					   b->max_segment_size);
 
+	t->max_write_hints = min(t->max_write_hints, b->max_write_hints);
+
 	alignment = queue_limit_alignment_offset(b, start);
 
 	/* Bottom device has different alignment.  Check that it is
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 741b95dfdbf6f..85f48ca461049 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -104,6 +104,7 @@ QUEUE_SYSFS_LIMIT_SHOW(max_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_discard_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_integrity_segments)
 QUEUE_SYSFS_LIMIT_SHOW(max_segment_size)
+QUEUE_SYSFS_LIMIT_SHOW(max_write_hints)
 QUEUE_SYSFS_LIMIT_SHOW(logical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(physical_block_size)
 QUEUE_SYSFS_LIMIT_SHOW(chunk_sectors)
@@ -457,6 +458,7 @@ QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb");
 QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
 QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
 QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
+QUEUE_RO_ENTRY(queue_max_write_hints, "max_write_hints");
 QUEUE_RW_LOAD_MODULE_ENTRY(elv_iosched, "scheduler");
 
 QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
@@ -591,6 +593,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_max_discard_segments_entry.attr,
 	&queue_max_integrity_segments_entry.attr,
 	&queue_max_segment_size_entry.attr,
+	&queue_max_write_hints_entry.attr,
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
diff --git a/block/fops.c b/block/fops.c
index 85b9b97d372c8..d0b16d3975fd6 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -376,6 +376,8 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 	if (blkdev_dio_invalid(bdev, iocb->ki_pos, iter, is_atomic))
 		return -EINVAL;
+	if (iocb->ki_write_hint > bdev_max_write_hints(bdev))
+		return -EINVAL;
 
 	nr_pages = bio_iov_vecs_to_alloc(iter, BIO_MAX_VECS + 1);
 	if (likely(nr_pages <= BIO_MAX_VECS)) {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6b78a68e0bd9c..01aba0ffeff6e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -393,6 +393,8 @@ struct queue_limits {
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
 
+	unsigned short		max_write_hints;
+
 	unsigned int		max_open_zones;
 	unsigned int		max_active_zones;
 
@@ -1183,6 +1185,11 @@ static inline unsigned short queue_max_segments(const struct request_queue *q)
 	return q->limits.max_segments;
 }
 
+static inline unsigned short queue_max_write_hints(struct request_queue *q)
+{
+	return q->limits.max_write_hints;
+}
+
 static inline unsigned short queue_max_discard_segments(const struct request_queue *q)
 {
 	return q->limits.max_discard_segments;
@@ -1230,6 +1237,11 @@ static inline unsigned int bdev_max_segments(struct block_device *bdev)
 	return queue_max_segments(bdev_get_queue(bdev));
 }
 
+static inline unsigned short bdev_max_write_hints(struct block_device *bdev)
+{
+	return queue_max_write_hints(bdev_get_queue(bdev));
+}
+
 static inline unsigned queue_logical_block_size(const struct request_queue *q)
 {
 	return q->limits.logical_block_size;

From patchwork Thu Oct 17 16:09:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13840310
From: Keith Busch
CC: Keith Busch
Subject: [PATCHv8 4/6] fs: introduce per-io hint support flag
Date: Thu, 17 Oct 2024 09:09:35 -0700
Message-ID: <20241017160937.2283225-5-kbusch@meta.com>
In-Reply-To: <20241017160937.2283225-1-kbusch@meta.com>
References: <20241017160937.2283225-1-kbusch@meta.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Keith Busch

A block device may support write hints on a per-io basis. The raw block
file operations can effectively use these, but real filesystems are not
ready to make use of this. Provide a file_operations flag to indicate
support, and set it for the block file operations.

Signed-off-by: Keith Busch
Reviewed-by: Hannes Reinecke
---
 block/fops.c       | 2 +-
 include/linux/fs.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/fops.c b/block/fops.c
index d0b16d3975fd6..15a63e26161ea 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -869,7 +869,7 @@ const struct file_operations def_blk_fops = {
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
 	.uring_cmd	= blkdev_uring_cmd,
-	.fop_flags	= FOP_BUFFER_RASYNC,
+	.fop_flags	= FOP_BUFFER_RASYNC | FOP_PER_IO_HINTS,
 };
 
 static __init int blkdev_init(void)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 04e875a37f604..026dc9801dc20 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2117,6 +2117,8 @@ struct file_operations {
 #define FOP_HUGE_PAGES		((__force fop_flags_t)(1 << 4))
 /* Treat loff_t as unsigned (e.g., /dev/mem) */
 #define FOP_UNSIGNED_OFFSET	((__force fop_flags_t)(1 << 5))
+/* File system can handle per-io hints */
+#define FOP_PER_IO_HINTS	((__force fop_flags_t)(1 << 6))
 
 /* Wrap a directory iterator that needs exclusive inode access */
 int wrap_directory_iterator(struct file *, struct dir_context *,

From patchwork Thu Oct 17 16:09:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13840311
From: Keith Busch
CC: Nitesh Shetty, Keith Busch
Subject: [PATCHv8 5/6] io_uring: enable per-io hinting capability
Date: Thu, 17 Oct 2024 09:09:36 -0700
Message-ID: <20241017160937.2283225-6-kbusch@meta.com>
In-Reply-To: <20241017160937.2283225-1-kbusch@meta.com>
References: <20241017160937.2283225-1-kbusch@meta.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Kanchan Joshi

With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and
all the subsequent writes on the file pass that hint value down. This
can be limiting for block device as all the writes can be tagged with
only one lifetime hint value. Concurrent writes (with different hint
values) are hard to manage. Per-IO hinting solves that problem.

Allow userspace to pass additional metadata in the SQE.

	__u16 write_hint;

This accepts all hint values that the file allows.

The write handlers (io_prep_rw, io_write) send the hint value to
lower-layer using kiocb.
This is good for supporting direct IO, but not when kiocb is not
available (e.g., buffered IO).

When per-io hints are not passed, the per-inode hint values are set in
the kiocb (as before). Otherwise, per-io hints take precedence over
per-inode hints.

Signed-off-by: Kanchan Joshi
Signed-off-by: Nitesh Shetty
Signed-off-by: Keith Busch
Reviewed-by: Hannes Reinecke
---
 include/uapi/linux/io_uring.h |  4 ++++
 io_uring/rw.c                 | 11 +++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 86cb385fe0b53..bd9acc0053318 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,10 @@ struct io_uring_sqe {
 			__u16	addr_len;
 			__u16	__pad3[1];
 		};
+		struct {
+			__u16	write_hint;
+			__u16	__pad4[1];
+		};
 	};
 	union {
 		struct {
diff --git a/io_uring/rw.c b/io_uring/rw.c
index ffd637ca0bd17..9a6d3ba76af4f 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -279,7 +279,11 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		rw->kiocb.ki_ioprio = get_current_ioprio();
 	}
 	rw->kiocb.dio_complete = NULL;
-
+	if (ddir == ITER_SOURCE &&
+	    req->file->f_op->fop_flags & FOP_PER_IO_HINTS)
+		rw->kiocb.ki_write_hint = READ_ONCE(sqe->write_hint);
+	else
+		rw->kiocb.ki_write_hint = WRITE_LIFE_NOT_SET;
 	rw->addr = READ_ONCE(sqe->addr);
 	rw->len = READ_ONCE(sqe->len);
 	rw->flags = READ_ONCE(sqe->rw_flags);
@@ -1027,7 +1031,10 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		return ret;
 	req->cqe.res = iov_iter_count(&io->iter);
-	rw->kiocb.ki_write_hint = file_write_hint(rw->kiocb.ki_filp);
+
+	/* Use per-file hint only if per-io hint is not set. */
+	if (rw->kiocb.ki_write_hint == WRITE_LIFE_NOT_SET)
+		rw->kiocb.ki_write_hint = file_write_hint(rw->kiocb.ki_filp);
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
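
Reader's note, not part of the series: the hint selection described in 5/6,
combined with the range check that 3/6 adds to blkdev_direct_IO(), can be
modelled in plain userspace C for the write path. This is only a sketch under
stated assumptions. The struct and the model_* names are hypothetical and do
not exist in the kernel, and MODEL_HINT_NOT_SET stands in for
WRITE_LIFE_NOT_SET, assumed here to be 0 as RWH_WRITE_LIFE_NOT_SET is in the
uapi fcntl header.

```c
/* Userspace model of write-hint selection; hypothetical names, not kernel code. */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors WRITE_LIFE_NOT_SET, taken to be 0. */
#define MODEL_HINT_NOT_SET 0

/* Hypothetical bundle of the inputs the kernel reads for one write. */
struct model_write {
	bool fop_per_io_hints;    /* file's fop_flags has FOP_PER_IO_HINTS */
	uint16_t sqe_hint;        /* value userspace put in sqe->write_hint */
	uint16_t inode_hint;      /* what file_write_hint() would return */
	uint16_t max_write_hints; /* queue limit of the target block device */
};

/* Per-io hint first (io_prep_rw), per-inode hint as fallback (io_write). */
static uint16_t model_effective_hint(const struct model_write *w)
{
	uint16_t hint = MODEL_HINT_NOT_SET;

	if (w->fop_per_io_hints)
		hint = w->sqe_hint;
	if (hint == MODEL_HINT_NOT_SET)
		hint = w->inode_hint;
	return hint;
}

/* Direct IO check from 3/6: a hint above the queue limit is rejected. */
static int model_submit(const struct model_write *w, uint16_t *hint_out)
{
	uint16_t hint = model_effective_hint(w);

	if (hint > w->max_write_hints)
		return -EINVAL;
	*hint_out = hint;
	return 0;
}
```

With max_write_hints = 4, an SQE hint of 3 wins over an inode hint of 2, an
SQE hint of 0 falls back to the inode hint, and an SQE hint of 5 is rejected
with -EINVAL. Note that the range check applies to whichever hint ends up in
the kiocb, so in this model an inode hint above the limit is rejected too.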