From patchwork Fri Jun 29 15:50:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bart Van Assche X-Patchwork-Id: 10496991 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A18E960230 for ; Fri, 29 Jun 2018 15:50:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8DFCE29BDA for ; Fri, 29 Jun 2018 15:50:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 826E829BE1; Fri, 29 Jun 2018 15:50:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C104529C05 for ; Fri, 29 Jun 2018 15:50:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935609AbeF2PuF (ORCPT ); Fri, 29 Jun 2018 11:50:05 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:17315 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932641AbeF2PuF (ORCPT ); Fri, 29 Jun 2018 11:50:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1530287405; x=1561823405; h=subject:to:cc:references:from:message-id:date: mime-version:in-reply-to; bh=0124dnSKsYvPQdiLOCXTBsrBRKcZlOxpvecu/QwbNZ8=; b=L2mFeH+4/Vub3Z3VnTL1oGjRi6RAxbDvDijEPUr4BD1xPwZCFDSfBW+S iK+jS5DO/vx17wEW3FfXagSXcAxX/bh6iT6I0MBFdex1ZgW8nujq5y0DI xqkVZZTZ3Y8ifiEDUT+O0MTeqPlxdDmicxihXhvyLYvM5u03k/Y8O2a2f kMLYGpoY4CUOXHLiKvGFc8AchCFMov329JYFdobULdBgk0x+EU1mVGNow NEKjQxsDughOjls8Bz8vGZZWuRtuTF/6bdeGgTGSiNT5bbUBlY2uZuotV 05RyN6RLmX1wxML6ZTZkjopIAZqu3liYarmFJBJVm1s/RSWE7590nDEY2 w==; X-IronPort-AV: E=Sophos;i="5.51,285,1526313600"; d="scan'208,223";a="83678544" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 29 Jun 2018 23:50:04 +0800 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP; 29 Jun 2018 08:39:33 -0700 Received: from thinkpad-bart.sdcorp.global.sandisk.com (HELO [10.111.67.248]) ([10.111.67.248]) by uls-op-cesaip02.wdc.com with ESMTP; 29 Jun 2018 08:50:05 -0700 Subject: Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME To: Christoph Hellwig , "Martin K. Petersen" Cc: Jens Axboe , "linux-block@vger.kernel.org" References: <20180626001009.16557-1-bart.vanassche@wdc.com> <20180629085827.GA16441@lst.de> From: Bart Van Assche Message-ID: <9c9f117b-0d70-bb31-1dd1-febe1be21645@wdc.com> Date: Fri, 29 Jun 2018 08:50:04 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180629085827.GA16441@lst.de> Content-Language: en-US Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On 06/29/18 01:58, Christoph Hellwig wrote: > Maybe it is time to remove REQ_OP_WRITE_SAME finally. We don't have > a user except for drbd which is passing it from the other side, and > it creates way too many special cases like this. Hello Christoph, If REQ_OP_WRITE_SAME would be removed from the block layer, how is software like e.g. LIO ever expected to pass down WRITE SAME requests to block devices? What if in the future we would want to add support for a new request type to the block layer for which - just like for write same commands - the number of bytes affected on the medium differs from the data out buffer size? The SCSI spec already support several such commands today. One variant of the SCSI VERIFY command verifies multiple logical blocks on the medium but does not have a Data Out buffer. Another example is the COMPARE AND WRITE command, for which the size of the Data Out buffer is the double of the number of bytes affected on the medium. It's not impossible that an equivalent of COMPARE AND WRITE will ever be added to the NVMe spec. Have you considered to modify the block layer such that blk_rq_payload_bytes() does not have to be made more complicated than returning a single member from the request data structure? How about the following approach: * Add a new member to struct request that represents what is called the Data Out buffer size in the SCSI specs and keep using __data_len for the number of bytes affected on the medium. Modify blk_rq_bio_prep() and __blk_rq_prep_clone() such that these functions set both __data_len and the new struct request member instead of only __data_len. * Remove RQF_SPECIAL_PAYLOAD and request.special_vec. Modify all block driver code that sets the RQF_SPECIAL_PAYLOAD and request.special_vec such that rq->bio is replaced by a bio with the proper payload and such that the original bio is restored before completing the request. As you are most likely aware there is already code in the block layer that replaces and restores the original request bio, namely the code in blk-flush.c. If you would like to see the patches I came up with to add REQ_OP_WRITE_SAME support to LIO, I have attached these to this e-mail. Thanks, Bart. From 3d8c13d953221411d5d803e76e2e6cabd506d0d4 Mon Sep 17 00:00:00 2001 From: Bart Van Assche Date: Thu, 28 Jun 2018 10:12:58 -0700 Subject: [PATCH 2/2] target: Use REQ_OP_WRITE_SAME to implement WRITE SAME Use blkdev_submit_write_same() instead of open-coding it. Note: as one can see in target_alloc_sgl(), the target core sets the offset in a scatter/gather list to zero for all allocated pages. Signed-off-by: Bart Van Assche --- drivers/target/target_core_iblock.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c index 1bc9b14236d8..fa4dd4f5c8b3 100644 --- a/drivers/target/target_core_iblock.c +++ b/drivers/target/target_core_iblock.c @@ -465,6 +465,8 @@ iblock_execute_write_same(struct se_cmd *cmd) sector_t block_lba = target_to_linux_sector(dev, cmd->t_task_lba); sector_t sectors = target_to_linux_sector(dev, sbc_get_write_same_sectors(cmd)); + struct blk_plug plug; + int ret; if (cmd->prot_op) { pr_err("WRITE_SAME: Protection information with IBLOCK" @@ -481,6 +483,7 @@ iblock_execute_write_same(struct se_cmd *cmd) return TCM_INVALID_CDB_FIELD; } + /* 1. Use REQ_OP_WRITE_ZEROES if supported and if appropriate. */ if (bdev_write_zeroes_sectors(bdev)) { if (!iblock_execute_zero_out(bdev, cmd)) return 0; @@ -491,6 +494,20 @@ iblock_execute_write_same(struct se_cmd *cmd) goto fail; cmd->priv = ibr; + /* 2. Try REQ_OP_WRITE_SAME. */ + blk_start_plug(&plug); + ret = blkdev_submit_write_same(bdev, block_lba, sectors, GFP_KERNEL, + sg_page(sg), iblock_bio_done, cmd); + blk_finish_plug(&plug); + if (ret == 0) + return 0; + if (ret != -EOPNOTSUPP) + goto fail; + + /* + * 3. If neither REQ_OP_WRITE_SAME nor REQ_OP_WRITE_ZEROES are + * supported, use REQ_OP_WRITE. + */ bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, 0); if (!bio) goto fail_free_ibr; -- 2.17.1