From patchwork Thu Jun 27 02:49:08 2019
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11018677
From: Damien Le Moal
To: linux-scsi@vger.kernel.org, "Martin K. Petersen", linux-block@vger.kernel.org, Jens Axboe
Cc: Christoph Hellwig, Bart Van Assche
Subject: [PATCH V4 1/3] block: Allow mapping of vmalloc-ed buffers
Date: Thu, 27 Jun 2019 11:49:08 +0900
Message-Id: <20190627024910.23987-2-damien.lemoal@wdc.com>
In-Reply-To: <20190627024910.23987-1-damien.lemoal@wdc.com>
References: <20190627024910.23987-1-damien.lemoal@wdc.com>
List-ID: linux-block@vger.kernel.org

To allow the SCSI subsystem scsi_execute_req() function to issue requests
using large buffers that are better allocated with vmalloc() rather than
kmalloc(), modify bio_map_kern() and bio_copy_kern() to allow passing a
buffer allocated with vmalloc().

To do so, detect vmalloc-ed buffers using is_vmalloc_addr(). For
vmalloc-ed buffers, flush the buffer using flush_kernel_vmap_range(), use
vmalloc_to_page() instead of virt_to_page() to obtain the pages of the
buffer, and invalidate the buffer addresses with
invalidate_kernel_vmap_range() on completion of read BIOs. This last step
is handled by the new function bio_invalidate_vmalloc_pages(), which has a
non-empty definition only if the architecture defines
ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE, that is, only if the architecture
actually needs the invalidation.
Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation")
Fixes: e76239a3748c ("block: add a report_zones method")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
---
 block/bio.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index ce797d73bb43..1c21d1e7f1b8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include

 #include "blk.h"
@@ -1479,8 +1480,26 @@ void bio_unmap_user(struct bio *bio)
 	bio_put(bio);
 }

+#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
+static void bio_invalidate_vmalloc_pages(struct bio *bio)
+{
+	if (bio->bi_private) {
+		struct bvec_iter_all iter_all;
+		struct bio_vec *bvec;
+		unsigned long len = 0;
+
+		bio_for_each_segment_all(bvec, bio, iter_all)
+			len += bvec->bv_len;
+		invalidate_kernel_vmap_range(bio->bi_private, len);
+	}
+}
+#else
+static void bio_invalidate_vmalloc_pages(struct bio *bio) {}
+#endif
+
 static void bio_map_kern_endio(struct bio *bio)
 {
+	bio_invalidate_vmalloc_pages(bio);
 	bio_put(bio);
 }

@@ -1501,6 +1520,8 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 	unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	unsigned long start = kaddr >> PAGE_SHIFT;
 	const int nr_pages = end - start;
+	bool is_vmalloc = is_vmalloc_addr(data);
+	struct page *page;
 	int offset, i;
 	struct bio *bio;

@@ -1508,6 +1529,12 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 	if (!bio)
 		return ERR_PTR(-ENOMEM);

+	if (is_vmalloc) {
+		flush_kernel_vmap_range(data, len);
+		if ((!op_is_write(bio_op(bio))))
+			bio->bi_private = data;
+	}
+
 	offset = offset_in_page(kaddr);
 	for (i = 0; i < nr_pages; i++) {
 		unsigned int bytes = PAGE_SIZE - offset;
@@ -1518,7 +1545,11 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 		if (bytes > len)
 			bytes = len;

-		if (bio_add_pc_page(q, bio, virt_to_page(data), bytes,
+		if (!is_vmalloc)
+			page = virt_to_page(data);
+		else
+			page = vmalloc_to_page(data);
+		if (bio_add_pc_page(q, bio, page, bytes,
 				    offset) < bytes) {
 			/* we don't support partial mappings */
 			bio_put(bio);
@@ -1531,6 +1562,7 @@ struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
 	}

 	bio->bi_end_io = bio_map_kern_endio;
+
 	return bio;
 }
 EXPORT_SYMBOL(bio_map_kern);
@@ -1543,6 +1575,7 @@ static void bio_copy_kern_endio(struct bio *bio)

 static void bio_copy_kern_endio_read(struct bio *bio)
 {
+	unsigned long len = 0;
 	char *p = bio->bi_private;
 	struct bio_vec *bvec;
 	struct bvec_iter_all iter_all;
@@ -1550,8 +1583,12 @@ static void bio_copy_kern_endio_read(struct bio *bio)
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		memcpy(p, page_address(bvec->bv_page), bvec->bv_len);
 		p += bvec->bv_len;
+		len += bvec->bv_len;
 	}

+	if (is_vmalloc_addr(bio->bi_private))
+		invalidate_kernel_vmap_range(bio->bi_private, len);
+
 	bio_copy_kern_endio(bio);
 }

@@ -1572,6 +1609,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
 	unsigned long kaddr = (unsigned long)data;
 	unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	unsigned long start = kaddr >> PAGE_SHIFT;
+	bool is_vmalloc = is_vmalloc_addr(data);
 	struct bio *bio;
 	void *p = data;
 	int nr_pages = 0;
@@ -1587,6 +1625,9 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
 	if (!bio)
 		return ERR_PTR(-ENOMEM);

+	if (is_vmalloc)
+		flush_kernel_vmap_range(data, len);
+
 	while (len) {
 		struct page *page;
 		unsigned int bytes = PAGE_SIZE;

From patchwork Thu Jun 27 02:49:09 2019
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11018681
From: Damien Le Moal
To: linux-scsi@vger.kernel.org, "Martin K. Petersen", linux-block@vger.kernel.org, Jens Axboe
Cc: Christoph Hellwig, Bart Van Assche
Subject: [PATCH V4 2/3] sd_zbc: Fix report zones buffer allocation
Date: Thu, 27 Jun 2019 11:49:09 +0900
Message-Id: <20190627024910.23987-3-damien.lemoal@wdc.com>
In-Reply-To: <20190627024910.23987-1-damien.lemoal@wdc.com>
References: <20190627024910.23987-1-damien.lemoal@wdc.com>
List-ID: linux-block@vger.kernel.org

During disk scan and revalidation done with sd_revalidate(), the zones of
a zoned disk are checked using the helper function
blk_revalidate_disk_zones() if a configuration change is detected (a
change in the number of zones or in the zone size).
blk_revalidate_disk_zones() issues report_zones calls that are very large,
that is, it obtains zone information for all zones of the disk with a
single command.
The size of the report zones command buffer needed for such a large
request is generally lower than the disk max_hw_sectors limit and
KMALLOC_MAX_SIZE (4MB), and the allocation succeeds at boot time (when
memory is not yet fragmented), but it often fails at run time (e.g. on a
disk hot-plug event). This causes the disk revalidation to fail and the
disk capacity to be changed to 0.

This problem can be avoided by using vmalloc() instead of kmalloc() for
the buffer allocation. To limit the amount of memory to be allocated,
this patch also introduces the arbitrary SD_ZBC_REPORT_MAX_ZONES maximum
number of zones to report with a single report zones command. This limit
may be lowered further to satisfy the disk max_hw_sectors limit. Finally,
to ensure that the vmalloc-ed buffer can always be mapped in a request,
the buffer size is further limited to at most queue_max_segments() pages,
allowing successful mapping of the buffer even in the worst case scenario
where none of the buffer pages are contiguous.

Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation")
Fixes: e76239a3748c ("block: add a report_zones method")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
---
 drivers/scsi/sd_zbc.c | 83 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7334024b64f1..ecd967fb39c1 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -9,6 +9,7 @@
  */

 #include
+#include

 #include
@@ -50,7 +51,7 @@ static void sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf,
 /**
  * sd_zbc_do_report_zones - Issue a REPORT ZONES scsi command.
  * @sdkp: The target disk
- * @buf: Buffer to use for the reply
+ * @buf: vmalloc-ed buffer to use for the reply
  * @buflen: the buffer size
  * @lba: Start LBA of the report
  * @partial: Do partial report
@@ -79,6 +80,7 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	put_unaligned_be32(buflen, &cmd[10]);
 	if (partial)
 		cmd[14] = ZBC_REPORT_ZONE_PARTIAL;
+	memset(buf, 0, buflen);

 	result = scsi_execute_req(sdp, cmd, DMA_FROM_DEVICE,
@@ -103,6 +105,48 @@ static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }

+/*
+ * Maximum number of zones to get with one report zones command.
+ */
+#define SD_ZBC_REPORT_MAX_ZONES		8192U
+
+/**
+ * Allocate a buffer for report zones reply.
+ * @disk: The target disk
+ * @nr_zones: Maximum number of zones to report
+ * @buflen: Size of the buffer allocated
+ * @gfp_mask: Memory allocation mask
+ *
+ */
+static void *sd_zbc_alloc_report_buffer(struct request_queue *q,
+					unsigned int nr_zones, size_t *buflen,
+					gfp_t gfp_mask)
+{
+	size_t bufsize;
+	void *buf;
+
+	/*
+	 * Report zone buffer size should be at most 64B times the number of
+	 * zones requested plus the 64B reply header, but should be at least
+	 * SECTOR_SIZE for ATA devices.
+	 * Make sure that this size does not exceed the hardware capabilities.
+	 * Furthermore, since the report zone command cannot be split, make
+	 * sure that the allocated buffer can always be mapped by limiting the
+	 * number of pages allocated to the HBA max segments limit.
+	 */
+	nr_zones = min(nr_zones, SD_ZBC_REPORT_MAX_ZONES);
+	bufsize = roundup((nr_zones + 1) * 64, 512);
+	bufsize = min_t(size_t, bufsize,
+			queue_max_hw_sectors(q) << SECTOR_SHIFT);
+	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
+
+	buf = __vmalloc(bufsize, gfp_mask, PAGE_KERNEL);
+	if (buf)
+		*buflen = bufsize;
+
+	return buf;
+}
+
 /**
  * sd_zbc_report_zones - Disk report zones operation.
  * @disk: The target disk
@@ -118,9 +162,9 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 			gfp_t gfp_mask)
 {
 	struct scsi_disk *sdkp = scsi_disk(disk);
-	unsigned int i, buflen, nrz = *nr_zones;
+	unsigned int i, nrz = *nr_zones;
 	unsigned char *buf;
-	size_t offset = 0;
+	size_t buflen = 0, offset = 0;
 	int ret = 0;

 	if (!sd_is_zoned(sdkp))
@@ -132,16 +176,14 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 	 * without exceeding the device maximum command size. For ATA disks,
 	 * buffers must be aligned to 512B.
 	 */
-	buflen = min(queue_max_hw_sectors(disk->queue) << 9,
-		     roundup((nrz + 1) * 64, 512));
-	buf = kmalloc(buflen, gfp_mask);
+	buf = sd_zbc_alloc_report_buffer(disk->queue, nrz, &buflen, gfp_mask);
 	if (!buf)
 		return -ENOMEM;

 	ret = sd_zbc_do_report_zones(sdkp, buf, buflen,
 			sectors_to_logical(sdkp->device, sector), true);
 	if (ret)
-		goto out_free_buf;
+		goto out;

 	nrz = min(nrz, get_unaligned_be32(&buf[0]) / 64);
 	for (i = 0; i < nrz; i++) {
@@ -152,8 +194,8 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,

 	*nr_zones = nrz;

-out_free_buf:
-	kfree(buf);
+out:
+	kvfree(buf);

 	return ret;
 }
@@ -287,8 +329,6 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
 	return 0;
 }

-#define SD_ZBC_BUF_SIZE 131072U
-
 /**
  * sd_zbc_check_zones - Check the device capacity and zone sizes
  * @sdkp: Target disk
@@ -304,22 +344,23 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
  */
 static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
 {
+	size_t bufsize, buflen;
 	u64 zone_blocks = 0;
 	sector_t max_lba, block = 0;
 	unsigned char *buf;
 	unsigned char *rec;
-	unsigned int buf_len;
-	unsigned int list_length;
 	int ret;
 	u8 same;

 	/* Get a buffer */
-	buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL);
+	buf = sd_zbc_alloc_report_buffer(sdkp->disk->queue,
+					 SD_ZBC_REPORT_MAX_ZONES,
+					 &bufsize, GFP_NOIO);
 	if (!buf)
 		return -ENOMEM;

 	/* Do a report zone to get max_lba and the same field */
-	ret = sd_zbc_do_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE, 0, false);
+	ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, 0, false);
 	if (ret)
 		goto out_free;
@@ -355,12 +396,12 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)

 	do {
 		/* Parse REPORT ZONES header */
-		list_length = get_unaligned_be32(&buf[0]) + 64;
+		buflen = min_t(size_t, get_unaligned_be32(&buf[0]) + 64,
+			       bufsize);
 		rec = buf + 64;
-		buf_len = min(list_length, SD_ZBC_BUF_SIZE);

 		/* Parse zone descriptors */
-		while (rec < buf + buf_len) {
+		while (rec < buf + buflen) {
 			u64 this_zone_blocks = get_unaligned_be64(&rec[8]);

 			if (zone_blocks == 0) {
@@ -376,8 +417,8 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
 		}

 		if (block < sdkp->capacity) {
-			ret = sd_zbc_do_report_zones(sdkp, buf, SD_ZBC_BUF_SIZE,
-						     block, true);
+			ret = sd_zbc_do_report_zones(sdkp, buf, bufsize, block,
+						     true);
 			if (ret)
 				goto out_free;
 		}
@@ -408,7 +449,7 @@ static int sd_zbc_check_zones(struct scsi_disk *sdkp, u32 *zblocks)
 	}

 out_free:
-	kfree(buf);
+	kvfree(buf);

 	return ret;
 }

From patchwork Thu Jun 27 02:49:10 2019
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 11018685
From: Damien Le Moal
To: linux-scsi@vger.kernel.org, "Martin K. Petersen", linux-block@vger.kernel.org, Jens Axboe
Cc: Christoph Hellwig, Bart Van Assche
Subject: [PATCH V4 3/3] block: Limit zone array allocation size
Date: Thu, 27 Jun 2019 11:49:10 +0900
Message-Id: <20190627024910.23987-4-damien.lemoal@wdc.com>
In-Reply-To: <20190627024910.23987-1-damien.lemoal@wdc.com>
References: <20190627024910.23987-1-damien.lemoal@wdc.com>
List-ID: linux-block@vger.kernel.org

Limit the size of the struct blk_zone array used in
blk_revalidate_disk_zones() to avoid memory allocation failures leading to
disk revalidation failure. Further reduce the likelihood of these failures
by using kvmalloc() instead of directly allocating contiguous pages.

Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation")
Fixes: e76239a3748c ("block: add a report_zones method")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
Reviewed-by: Bart Van Assche
---
 block/blk-zoned.c      | 29 +++++++++++++----------------
 include/linux/blkdev.h |  5 +++++
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index ae7e91bd0618..26f878b9b5f5 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -373,22 +373,20 @@ static inline unsigned long *blk_alloc_zone_bitmap(int node,
  * Allocate an array of struct blk_zone to get nr_zones zone information.
  * The allocated array may be smaller than nr_zones.
  */
-static struct blk_zone *blk_alloc_zones(int node, unsigned int *nr_zones)
+static struct blk_zone *blk_alloc_zones(unsigned int *nr_zones)
 {
-	size_t size = *nr_zones * sizeof(struct blk_zone);
-	struct page *page;
-	int order;
-
-	for (order = get_order(size); order >= 0; order--) {
-		page = alloc_pages_node(node, GFP_NOIO | __GFP_ZERO, order);
-		if (page) {
-			*nr_zones = min_t(unsigned int, *nr_zones,
-				(PAGE_SIZE << order) / sizeof(struct blk_zone));
-			return page_address(page);
-		}
+	struct blk_zone *zones;
+	size_t nrz = min(*nr_zones, BLK_ZONED_REPORT_MAX_ZONES);
+
+	zones = kvcalloc(nrz, sizeof(struct blk_zone), GFP_NOIO);
+	if (!zones) {
+		*nr_zones = 0;
+		return NULL;
 	}

-	return NULL;
+	*nr_zones = nrz;
+
+	return zones;
 }

 void blk_queue_free_zone_bitmaps(struct request_queue *q)
@@ -443,7 +441,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)

 	/* Get zone information and initialize seq_zones_bitmap */
 	rep_nr_zones = nr_zones;
-	zones = blk_alloc_zones(q->node, &rep_nr_zones);
+	zones = blk_alloc_zones(&rep_nr_zones);
 	if (!zones)
 		goto out;

@@ -480,8 +478,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	blk_mq_unfreeze_queue(q);

 out:
-	free_pages((unsigned long)zones,
-		   get_order(rep_nr_zones * sizeof(struct blk_zone)));
+	kvfree(zones);
 	kfree(seq_zones_wlock);
 	kfree(seq_zones_bitmap);

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 592669bcc536..f7faac856017 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -344,6 +344,11 @@ struct queue_limits {

 #ifdef CONFIG_BLK_DEV_ZONED

+/*
+ * Maximum number of zones to report with a single report zones command.
+ */
+#define BLK_ZONED_REPORT_MAX_ZONES	8192U
+
 extern unsigned int blkdev_nr_zones(struct block_device *bdev);
 extern int blkdev_report_zones(struct block_device *bdev, sector_t sector,
 			       struct blk_zone *zones,