From patchwork Mon May 1 17:53:13 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 9706769
From: damien.lemoal@wdc.com
To: dm-devel@redhat.com, Mike Snitzer, Alasdair Kergon
Cc: Hannes Reinecke, Christoph Hellwig, Bart Van Assche,
 linux-block@vger.kernel.org, Damien Le Moal
Subject: [PATCH v2 09/10] dm-kcopyd: Add sequential write feature
Date: Tue, 2 May 2017 02:53:13 +0900
Message-Id: <20170501175314.10922-10-damien.lemoal@wdc.com>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170501175314.10922-1-damien.lemoal@wdc.com>
References: <20170501175314.10922-1-damien.lemoal@wdc.com>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Damien Le Moal

When copying blocks to host-managed zoned block devices, writes must be
sequential. However, dm_kcopyd_copy() does not guarantee this: writes
are issued in the completion order of reads, and reads may complete out
of order even though they are issued sequentially.

Fix this by introducing the DM_KCOPYD_WRITE_SEQ flag. The flag can be
specified by the caller of dm_kcopyd_copy() and is set automatically if
one of the destinations is a host-managed zoned block device. For a
split job, the master job maintains the write position at which the
next write must be issued. This position is checked in the pop()
function, which is modified so that it does not return any write I/O
sub job that is not at the correct write position.

When DM_KCOPYD_WRITE_SEQ is specified for a job, errors cannot be
ignored, and the flag DM_KCOPYD_IGNORE_ERROR is cleared even if the
caller specified it.

Signed-off-by: Damien Le Moal
Reviewed-by: Hannes Reinecke
---
 drivers/md/dm-kcopyd.c    | 68 +++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/dm-kcopyd.h |  1 +
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 9e9d04cb..477846e 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -356,6 +356,7 @@ struct kcopyd_job {
 	struct mutex lock;
 	atomic_t sub_jobs;
 	sector_t progress;
+	sector_t write_ofst;
 
 	struct kcopyd_job *master_job;
 };
@@ -386,6 +387,34 @@ void dm_kcopyd_exit(void)
  * Functions to push and pop a job onto the head of a given job
  * list.
  */
+static struct kcopyd_job *pop_io_job(struct list_head *jobs,
+				     struct dm_kcopyd_client *kc)
+{
+	struct kcopyd_job *job;
+
+	/*
+	 * For I/O jobs, pop any read, any write without sequential write
+	 * constraint and sequential writes that are at the right position.
+	 */
+	list_for_each_entry(job, jobs, list) {
+
+		if (job->rw == READ ||
+		    !test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags)) {
+			list_del(&job->list);
+			return job;
+		}
+
+		if (job->write_ofst == job->master_job->write_ofst) {
+			job->master_job->write_ofst += job->source.count;
+			list_del(&job->list);
+			return job;
+		}
+
+	}
+
+	return NULL;
+}
+
 static struct kcopyd_job *pop(struct list_head *jobs,
 			      struct dm_kcopyd_client *kc)
 {
@@ -395,8 +424,12 @@ static struct kcopyd_job *pop(struct list_head *jobs,
 	spin_lock_irqsave(&kc->job_lock, flags);
 
 	if (!list_empty(jobs)) {
-		job = list_entry(jobs->next, struct kcopyd_job, list);
-		list_del(&job->list);
+		if (jobs == &kc->io_jobs) {
+			job = pop_io_job(jobs, kc);
+		} else {
+			job = list_entry(jobs->next, struct kcopyd_job, list);
+			list_del(&job->list);
+		}
 	}
 	spin_unlock_irqrestore(&kc->job_lock, flags);
 
@@ -506,6 +539,14 @@ static int run_io_job(struct kcopyd_job *job)
 		.client = job->kc->io_client,
 	};
 
+	/*
+	 * If we need to write sequentially and some reads or writes failed,
+	 * no point in continuing.
+	 */
+	if (test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags) &&
+	    job->master_job->write_err)
+		return -EIO;
+
 	io_job_start(job->kc->throttle);
 
 	if (job->rw == READ)
@@ -655,6 +696,7 @@ static void segment_complete(int read_err, unsigned long write_err,
 		int i;
 
 		*sub_job = *job;
+		sub_job->write_ofst = progress;
 		sub_job->source.sector += progress;
 		sub_job->source.count = count;
 
@@ -723,6 +765,27 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	job->num_dests = num_dests;
 	memcpy(&job->dests, dests, sizeof(*dests) * num_dests);
 
+	/*
+	 * If one of the destination is a host-managed zoned block device,
+	 * we need to write sequentially. If one of the destination is a
+	 * host-aware device, then leave it to the caller to choose what to do.
+	 */
+	if (!test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags)) {
+		for (i = 0; i < job->num_dests; i++) {
+			if (bdev_zoned_model(dests[i].bdev) == BLK_ZONED_HM) {
+				set_bit(DM_KCOPYD_WRITE_SEQ, &job->flags);
+				break;
+			}
+		}
+	}
+
+	/*
+	 * If we need to write sequentially, errors cannot be ignored.
+	 */
+	if (test_bit(DM_KCOPYD_WRITE_SEQ, &job->flags) &&
+	    test_bit(DM_KCOPYD_IGNORE_ERROR, &job->flags))
+		clear_bit(DM_KCOPYD_IGNORE_ERROR, &job->flags);
+
 	if (from) {
 		job->source = *from;
 		job->pages = NULL;
@@ -746,6 +809,7 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 	job->fn = fn;
 	job->context = context;
 	job->master_job = job;
+	job->write_ofst = 0;
 
 	if (job->source.count <= SUB_JOB_SIZE)
 		dispatch_job(job);
diff --git a/include/linux/dm-kcopyd.h b/include/linux/dm-kcopyd.h
index f486d63..cfac858 100644
--- a/include/linux/dm-kcopyd.h
+++ b/include/linux/dm-kcopyd.h
@@ -20,6 +20,7 @@
 #define DM_KCOPYD_MAX_REGIONS 8
 
 #define DM_KCOPYD_IGNORE_ERROR 1
+#define DM_KCOPYD_WRITE_SEQ 2
 
 struct dm_kcopyd_throttle {
 	unsigned throttle;
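
For readers who want to see the write-position check in isolation, here
is a rough userspace sketch of the selection rule that the patch adds to
pop(). It is not kernel code and not part of the patch: struct sub_job,
struct master, model_pop_io_job() and drain() are hypothetical names
invented for this illustration, a plain array stands in for the kernel
job list, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kcopyd sub job (illustration only). */
struct sub_job {
	unsigned long write_ofst; /* offset of this chunk in the master job */
	unsigned long count;      /* chunk length in sectors */
	bool is_write;            /* false: read job, true: write job */
	bool write_seq;           /* models the DM_KCOPYD_WRITE_SEQ flag */
	bool queued;              /* still waiting in the I/O queue */
};

/* Models the master job, which owns the next expected write position. */
struct master {
	unsigned long write_ofst;
};

/*
 * Return the index of the first queued job that may be issued, or -1.
 * Reads and unconstrained writes are always eligible. A sequential
 * write is eligible only when it starts at the master's write position,
 * which then advances by the length of that write.
 */
static int model_pop_io_job(struct sub_job *q, size_t n, struct master *m)
{
	for (size_t i = 0; i < n; i++) {
		if (!q[i].queued)
			continue;
		if (!q[i].is_write || !q[i].write_seq) {
			q[i].queued = false;
			return (int)i;
		}
		if (q[i].write_ofst == m->write_ofst) {
			m->write_ofst += q[i].count;
			q[i].queued = false;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Pop until nothing is eligible, recording the offset of every issued
 * job in issued[] (which must hold at least n entries). Returns the
 * number of jobs issued.
 */
static size_t drain(struct sub_job *q, size_t n, struct master *m,
		    unsigned long *issued)
{
	size_t k = 0;
	int i;

	assert(q && m && issued);
	while (k < n && (i = model_pop_io_job(q, n, m)) >= 0)
		issued[k++] = q[i].write_ofst;
	return k;
}
```

With three 8-sector sequential writes queued in read-completion order
16, 0, 8 and the master position at 0, drain() issues them at offsets
0, 8, 16. If only the writes at 16 and 8 are queued, model_pop_io_job()
returns -1 until the write at offset 0 arrives, which is the stall that
keeps writes to a host-managed zone in order.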