From patchwork Wed Jul 19 20:19:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikulas Patocka X-Patchwork-Id: 13319454 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BDF5C001DE for ; Wed, 19 Jul 2023 20:20:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230111AbjGSUUJ (ORCPT ); Wed, 19 Jul 2023 16:20:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231428AbjGSUT5 (ORCPT ); Wed, 19 Jul 2023 16:19:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 976D7189 for ; Wed, 19 Jul 2023 13:19:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689797955; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=Zmj5nzJTSHzE8zf1I1zUcK3keiQbhtTkmdKSaDQ9f0U=; b=Cl8P64NISxeFroQFTmSBwDZB+c6Pb+BQxa/YCKmRgqbBfH1DXjdBjgwe8eZUofeQmkCWDs UBuxERLR+ZSdQIHhGVFX5PsBiH2Ldm7h3esqhaLFsZv57Jni3y+Z+xXLDpGx1vY6hyYq5Z yo9AKVrIPnk8wBRIURGZSX3wELX9TYE= Received: from mimecast-mx02.redhat.com (66.187.233.73 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-275-DPzm9kHEPC-FG72qJFmPkw-1; Wed, 19 Jul 2023 16:19:10 -0400 X-MC-Unique: DPzm9kHEPC-FG72qJFmPkw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0FBEA1C31C46; Wed, 19 Jul 2023 20:19:10 +0000 (UTC) Received: from file1-rdu.file-001.prod.rdu2.dc.redhat.com (unknown [10.11.5.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id F3B1640C206F; Wed, 19 Jul 2023 20:19:09 +0000 (UTC) Received: by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix, from userid 12668) id EDE063096A42; Wed, 19 Jul 2023 20:19:09 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix) with ESMTP id ECE523F7CF; Wed, 19 Jul 2023 22:19:09 +0200 (CEST) Date: Wed, 19 Jul 2023 22:19:09 +0200 (CEST) From: Mikulas Patocka To: Jens Axboe cc: Li Nan , Zdenek Kabelac , Christoph Hellwig , linux-block@vger.kernel.org, dm-devel@redhat.com Subject: [PATCH 1/3] brd: extend the rcu regions to cover read and write Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch extends the rcu regions, so that lookup followed by a read or write of a page is done inside rcu read lock. This is needed for the following patch that enables discard. Signed-off-by: Mikulas Patocka --- drivers/block/brd.c | 8 ++++++++ 1 file changed, 8 insertions(+) Index: linux-2.6/drivers/block/brd.c =================================================================== --- linux-2.6.orig/drivers/block/brd.c +++ linux-2.6/drivers/block/brd.c @@ -150,23 +150,27 @@ static void copy_to_brd(struct brd_devic size_t copy; copy = min_t(size_t, n, PAGE_SIZE - offset); + rcu_read_lock(); page = brd_lookup_page(brd, sector); BUG_ON(!page); dst = kmap_atomic(page); memcpy(dst + offset, src, copy); kunmap_atomic(dst); + rcu_read_unlock(); if (copy < n) { src += copy; sector += copy >> SECTOR_SHIFT; copy = n - copy; + rcu_read_lock(); page = brd_lookup_page(brd, sector); BUG_ON(!page); dst = kmap_atomic(page); memcpy(dst, src, copy); kunmap_atomic(dst); + rcu_read_unlock(); } } @@ -182,6 +186,7 @@ static void copy_from_brd(void *dst, str size_t copy; copy = min_t(size_t, n, PAGE_SIZE - offset); + rcu_read_lock(); page = brd_lookup_page(brd, sector); if (page) { src = kmap_atomic(page); @@ -189,11 +194,13 @@ static void copy_from_brd(void *dst, str kunmap_atomic(src); } else memset(dst, 0, copy); + rcu_read_unlock(); if (copy < n) { dst += copy; sector += copy >> SECTOR_SHIFT; copy = n - copy; + rcu_read_lock(); page = brd_lookup_page(brd, sector); if (page) { src = kmap_atomic(page); @@ -201,6 +208,7 @@ static void copy_from_brd(void *dst, str kunmap_atomic(src); } else memset(dst, 0, copy); + rcu_read_unlock(); } } From patchwork Wed Jul 19 20:20:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikulas Patocka X-Patchwork-Id: 13319464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEF7FC0015E for ; Wed, 19 Jul 2023 20:21:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229450AbjGSUVg (ORCPT ); Wed, 19 Jul 2023 16:21:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229707AbjGSUVe (ORCPT ); Wed, 19 Jul 2023 16:21:34 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CB92213D for ; Wed, 19 Jul 2023 13:20:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689798021; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=sUdi2Mx7+ghCvg1Vjjj6YdB102CEZh7ccSFO54ddLPA=; b=iNISLfSur8YEjLY2Bp63VS6UzvTmjsAEuMBg6K6Eq5J4CsQDB8GaOAU77xiys0fpjBqpCz GZGSJaR7TWGoGwWnol6o2CjFcJvr3/fRYOJLjRPU8Su70DzIChOkQyWgCMXOtCwbRDdBfq MuSBG6PKj8AzlBtuBQ7oGXpMwaLOWLw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-326-lCOtgq1VOk-gQSkRafRJ3g-1; Wed, 19 Jul 2023 16:20:18 -0400 X-MC-Unique: lCOtgq1VOk-gQSkRafRJ3g-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0C8F68F1846; Wed, 19 Jul 2023 20:20:18 +0000 (UTC) Received: from file1-rdu.file-001.prod.rdu2.dc.redhat.com (unknown [10.11.5.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EB1CA2166B25; Wed, 19 Jul 2023 20:20:17 +0000 (UTC) Received: by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix, from userid 12668) id E503A3096A42; Wed, 19 Jul 2023 20:20:17 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix) with ESMTP id E416C3F7CF; Wed, 19 Jul 2023 22:20:17 +0200 (CEST) Date: Wed, 19 Jul 2023 22:20:17 +0200 (CEST) From: Mikulas Patocka To: Jens Axboe cc: Li Nan , Zdenek Kabelac , Christoph Hellwig , linux-block@vger.kernel.org, dm-devel@redhat.com Subject: [PATCH 2/3] brd: enable discard Message-ID: <3fcf222-4894-992-ae81-c72ca983d82@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch implements discard in the brd driver. We use RCU to free the page, so that if there are any concurrent readers or writes, they won't touch the page after it is freed. Note that we replace "BUG_ON(!page);" with "if (page) ..." in copy_to_brd - the page can't be NULL under normal circumstances, it can only be NULL if REQ_OP_WRITE races with REQ_OP_DISCARD. If these two bios race with each other on the same page, the result is undefined, so we can handle this race condition just by skipping the copying. Signed-off-by: Mikulas Patocka --- drivers/block/brd.c | 85 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 74 insertions(+), 11 deletions(-) Index: linux-2.6/drivers/block/brd.c =================================================================== --- linux-2.6.orig/drivers/block/brd.c +++ linux-2.6/drivers/block/brd.c @@ -46,6 +46,8 @@ struct brd_device { u64 brd_nr_pages; }; +static bool discard; + /* * Look up and return a brd's page for a given sector. */ @@ -100,6 +102,26 @@ static int brd_insert_page(struct brd_de return ret; } +static void brd_free_page_rcu(struct rcu_head *head) +{ + struct page *page = container_of(head, struct page, rcu_head); + __free_page(page); +} + +static void brd_free_page(struct brd_device *brd, sector_t sector) +{ + struct page *page; + pgoff_t idx; + + idx = sector >> PAGE_SECTORS_SHIFT; + page = xa_erase(&brd->brd_pages, idx); + + if (page) { + BUG_ON(page->index != idx); + call_rcu(&page->rcu_head, brd_free_page_rcu); + } +} + /* * Free all backing store pages and xarray. This must only be called when * there are no other users of the device. @@ -152,11 +174,11 @@ static void copy_to_brd(struct brd_devic copy = min_t(size_t, n, PAGE_SIZE - offset); rcu_read_lock(); page = brd_lookup_page(brd, sector); - BUG_ON(!page); - - dst = kmap_atomic(page); - memcpy(dst + offset, src, copy); - kunmap_atomic(dst); + if (page) { + dst = kmap_atomic(page); + memcpy(dst + offset, src, copy); + kunmap_atomic(dst); + } rcu_read_unlock(); if (copy < n) { @@ -165,11 +187,11 @@ static void copy_to_brd(struct brd_devic copy = n - copy; rcu_read_lock(); page = brd_lookup_page(brd, sector); - BUG_ON(!page); - - dst = kmap_atomic(page); - memcpy(dst, src, copy); - kunmap_atomic(dst); + if (page) { + dst = kmap_atomic(page); + memcpy(dst, src, copy); + kunmap_atomic(dst); + } rcu_read_unlock(); } } @@ -248,13 +270,44 @@ out: return err; } +void brd_do_discard(struct brd_device *brd, struct bio *bio) +{ + sector_t sector, len, front_pad; + + if (unlikely(!discard)) { + bio->bi_status = BLK_STS_NOTSUPP; + return; + } + + sector = bio->bi_iter.bi_sector; + len = bio_sectors(bio); + front_pad = -sector & (PAGE_SECTORS - 1); + sector += front_pad; + if (unlikely(len <= front_pad)) + return; + len -= front_pad; + len = round_down(len, PAGE_SECTORS); + while (len) { + brd_free_page(brd, sector); + sector += PAGE_SECTORS; + len -= PAGE_SECTORS; + cond_resched(); + } +} + static void brd_submit_bio(struct bio *bio) { struct brd_device *brd = bio->bi_bdev->bd_disk->private_data; - sector_t sector = bio->bi_iter.bi_sector; + sector_t sector; struct bio_vec bvec; struct bvec_iter iter; + if (bio_op(bio) == REQ_OP_DISCARD) { + brd_do_discard(brd, bio); + goto endio; + } + + sector = bio->bi_iter.bi_sector; bio_for_each_segment(bvec, bio, iter) { unsigned int len = bvec.bv_len; int err; @@ -276,6 +329,7 @@ static void brd_submit_bio(struct bio *b sector += len >> SECTOR_SHIFT; } +endio: bio_endio(bio); } @@ -299,6 +353,10 @@ static int max_part = 1; module_param(max_part, int, 0444); MODULE_PARM_DESC(max_part, "Num Minors to reserve between devices"); +static bool discard = false; +module_param(discard, bool, 0444); +MODULE_PARM_DESC(discard, "Support discard"); + MODULE_LICENSE("GPL"); MODULE_ALIAS_BLOCKDEV_MAJOR(RAMDISK_MAJOR); MODULE_ALIAS("rd"); @@ -364,6 +422,11 @@ static int brd_alloc(int i) */ blk_queue_physical_block_size(disk->queue, PAGE_SIZE); + if (discard) { + disk->queue->limits.discard_granularity = PAGE_SIZE; + blk_queue_max_discard_sectors(disk->queue, round_down(UINT_MAX, PAGE_SECTORS)); + } + /* Tell the block layer that this is not a rotational device */ blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue); blk_queue_flag_set(QUEUE_FLAG_SYNCHRONOUS, disk->queue); From patchwork Wed Jul 19 20:21:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikulas Patocka X-Patchwork-Id: 13319465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FD4CC0015E for ; Wed, 19 Jul 2023 20:22:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229456AbjGSUWz (ORCPT ); Wed, 19 Jul 2023 16:22:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229714AbjGSUWy (ORCPT ); Wed, 19 Jul 2023 16:22:54 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E36BC135 for ; Wed, 19 Jul 2023 13:22:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689798121; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=sRo3gN/LJXiVMx/g1vtWeFh4Eo71p22bbKf6S4YuYe4=; b=Bwr2KUfdM5yedKaLXjISXHGH1ShzzOZKW/jRSktr8v5YqJBmrExssKGnkPWRDqI4p88Z8B SugoOxxTMcVQ2o6X6mPcvedFpFex2nrw/8Y6m7wkHmDxTPNFPoxmash3PQ1AskGNJmKHKy /dwFfZa96XTIt9JAmjcUqQ+9BoTZOTU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-271-nmesSBlfMDSn_4MPSD-aXw-1; Wed, 19 Jul 2023 16:21:46 -0400 X-MC-Unique: nmesSBlfMDSn_4MPSD-aXw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 75766856F66; Wed, 19 Jul 2023 20:21:14 +0000 (UTC) Received: from file1-rdu.file-001.prod.rdu2.dc.redhat.com (unknown [10.11.5.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 66F2B40D2839; Wed, 19 Jul 2023 20:21:14 +0000 (UTC) Received: by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix, from userid 12668) id 6129F3096A42; Wed, 19 Jul 2023 20:21:14 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by file1-rdu.file-001.prod.rdu2.dc.redhat.com (Postfix) with ESMTP id 5FF873F7CF; Wed, 19 Jul 2023 22:21:14 +0200 (CEST) Date: Wed, 19 Jul 2023 22:21:14 +0200 (CEST) From: Mikulas Patocka To: Jens Axboe cc: Li Nan , Zdenek Kabelac , Christoph Hellwig , linux-block@vger.kernel.org, dm-devel@redhat.com Subject: [PATCH 3/3] brd: implement write zeroes Message-ID: <73c46137-c06e-348f-3d37-8c305ad4c4f3@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org This patch implements REQ_OP_WRITE_ZEROES on brd. Write zeroes will free the pages just like discard, but the difference is that it writes zeroes to the preceding and following page if the range is not aligned on page boundary. Signed-off-by: Mikulas Patocka --- drivers/block/brd.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) Index: linux-2.6/drivers/block/brd.c =================================================================== --- linux-2.6.orig/drivers/block/brd.c +++ linux-2.6/drivers/block/brd.c @@ -272,7 +272,8 @@ out: void brd_do_discard(struct brd_device *brd, struct bio *bio) { - sector_t sector, len, front_pad; + bool zero_padding = bio_op(bio) == REQ_OP_WRITE_ZEROES; + sector_t sector, len, front_pad, end_pad; if (unlikely(!discard)) { bio->bi_status = BLK_STS_NOTSUPP; @@ -282,11 +283,22 @@ void brd_do_discard(struct brd_device *b sector = bio->bi_iter.bi_sector; len = bio_sectors(bio); front_pad = -sector & (PAGE_SECTORS - 1); + + if (zero_padding && unlikely(front_pad != 0)) + copy_to_brd(brd, page_address(ZERO_PAGE(0)), + sector, min(len, front_pad) << SECTOR_SHIFT); + sector += front_pad; if (unlikely(len <= front_pad)) return; len -= front_pad; - len = round_down(len, PAGE_SECTORS); + + end_pad = len & (PAGE_SECTORS - 1); + if (zero_padding && unlikely(end_pad != 0)) + copy_to_brd(brd, page_address(ZERO_PAGE(0)), + sector + len - end_pad, end_pad << SECTOR_SHIFT); + len -= end_pad; + while (len) { brd_free_page(brd, sector); sector += PAGE_SECTORS; @@ -302,7 +314,8 @@ static void brd_submit_bio(struct bio *b struct bio_vec bvec; struct bvec_iter iter; - if (bio_op(bio) == REQ_OP_DISCARD) { + if (bio_op(bio) == REQ_OP_DISCARD || + bio_op(bio) == REQ_OP_WRITE_ZEROES) { brd_do_discard(brd, bio); goto endio; } @@ -355,7 +368,7 @@ MODULE_PARM_DESC(max_part, "Num Minors t static bool discard = false; module_param(discard, bool, 0444); -MODULE_PARM_DESC(discard, "Support discard"); +MODULE_PARM_DESC(discard, "Support discard and write zeroes"); MODULE_LICENSE("GPL"); MODULE_ALIAS_BLOCKDEV_MAJOR(RAMDISK_MAJOR); @@ -425,6 +438,7 @@ static int brd_alloc(int i) if (discard) { disk->queue->limits.discard_granularity = PAGE_SIZE; blk_queue_max_discard_sectors(disk->queue, round_down(UINT_MAX, PAGE_SECTORS)); + blk_queue_max_write_zeroes_sectors(disk->queue, round_down(UINT_MAX, PAGE_SECTORS)); } /* Tell the block layer that this is not a rotational device */