From patchwork Thu Aug 25 13:44:13 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Butsykin
X-Patchwork-Id: 9299429
From: Pavel Butsykin
Date: Thu, 25 Aug 2016 16:44:13 +0300
Message-ID: <20160825134421.20231-15-pbutsykin@virtuozzo.com>
X-Mailer: git-send-email 2.8.3
In-Reply-To: <20160825134421.20231-1-pbutsykin@virtuozzo.com>
References: <20160825134421.20231-1-pbutsykin@virtuozzo.com>
Subject: [Qemu-devel] [PATCH RFC 14/22] block/pcache: add support for rescheduling requests
Cc: kwolf@redhat.com, den@openvz.org, jsnow@redhat.com, stefanha@redhat.com

Nodes can no longer be dropped before an aio write request completes:
between the start of the request and its completion, an overlapping chunk
of blocks could be cached, which would leave stale data in the cache.

It is also now possible for an aio write to overlap a PCNode whose status
is NODE_WAIT_STATUS. Because the nodes are dropped in the aio callback,
such pending nodes can be skipped: by the time the aio read for a pending
node is processed, the data on disk is guaranteed to be up to date.

Signed-off-by: Pavel Butsykin
---
 block/pcache.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 112 insertions(+), 24 deletions(-)

diff --git a/block/pcache.c b/block/pcache.c
index 1ff4c6a..cb5f884 100644
--- a/block/pcache.c
+++ b/block/pcache.c
@@ -43,6 +43,11 @@ typedef struct RbNodeKey {
     uint32_t size;
 } RbNodeKey;
 
+typedef struct ACBEntryLink {
+    QTAILQ_ENTRY(ACBEntryLink) entry;
+    struct PrefCacheAIOCB *acb;
+} ACBEntryLink;
+
 typedef struct BlockNode {
     struct RbNode rb_node;
     union {
@@ -58,6 +63,10 @@ typedef struct BlockNode {
 
 typedef struct PCNode {
     BlockNode cm;
+    struct {
+        QTAILQ_HEAD(acb_head, ACBEntryLink) list;
+        uint32_t cnt;
+    } wait;
     uint32_t status;
     uint32_t ref;
     uint8_t *data;
@@ -181,7 +190,6 @@ static inline PCNode *pcache_node_ref(PCNode *node)
 {
     assert(node->status == NODE_SUCCESS_STATUS ||
            node->status == NODE_WAIT_STATUS);
-    assert(atomic_read(&node->ref) == 0);/* XXX: only for sequential requests */
     atomic_inc(&node->ref);
 
     return node;
@@ -277,6 +285,8 @@ static inline void *pcache_node_alloc(RbNodeKey* key)
     node->status = NODE_WAIT_STATUS;
     qemu_co_mutex_init(&node->lock);
     node->data = g_malloc(node->cm.nb_sectors << BDRV_SECTOR_BITS);
+    node->wait.cnt = 0;
+    QTAILQ_INIT(&node->wait.list);
 
     return node;
 }
@@ -308,15 +318,33 @@ static void pcache_node_drop(BDRVPCacheState *s, PCNode *node)
     pcache_node_unref(s, node);
 }
 
+static inline PCNode *pcache_get_most_unused_node(BDRVPCacheState *s)
+{
+    PCNode *node;
+    assert(!QTAILQ_EMPTY(&s->pcache.lru.list));
+
+    qemu_co_mutex_lock(&s->pcache.lru.lock);
+    node = PCNODE(QTAILQ_LAST(&s->pcache.lru.list, lru_head));
+    pcache_node_ref(node);
+    qemu_co_mutex_unlock(&s->pcache.lru.lock);
+
+    return node;
+}
+
 static void pcache_try_shrink(BDRVPCacheState *s)
 {
     while (s->pcache.curr_size > s->cfg_cache_size) {
-        qemu_co_mutex_lock(&s->pcache.lru.lock);
-        assert(!QTAILQ_EMPTY(&s->pcache.lru.list));
-        PCNode *rmv_node = PCNODE(QTAILQ_LAST(&s->pcache.lru.list, lru_head));
-        qemu_co_mutex_unlock(&s->pcache.lru.lock);
+        PCNode *rmv_node;
+        /* it can happen if all nodes are waiting */
+        if (QTAILQ_EMPTY(&s->pcache.lru.list)) {
+            DPRINTF("lru list is empty, but curr_size: %d\n",
+                    s->pcache.curr_size);
+            break;
+        }
+        rmv_node = pcache_get_most_unused_node(s);
 
         pcache_node_drop(s, rmv_node);
+        pcache_node_unref(s, rmv_node);
 #ifdef PCACHE_DEBUG
         atomic_inc(&s->shrink_cnt_node);
 #endif
@@ -392,7 +420,7 @@ static uint64_t ranges_overlap_size(uint64_t node1, uint32_t size1,
     return MIN(node1 + size1, node2 + size2) - MAX(node1, node2);
 }
 
-static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
+static inline void pcache_node_read_buf(PrefCacheAIOCB *acb, PCNode* node)
 {
     uint64_t qiov_offs = 0, node_offs = 0;
     uint32_t size;
@@ -407,15 +435,41 @@ static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
                                node->cm.sector_num, node->cm.nb_sectors)
            << BDRV_SECTOR_BITS;
 
+    qemu_co_mutex_lock(&node->lock); /* XXX: use rw lock */
+    copy = \
+        qemu_iovec_from_buf(acb->qiov, qiov_offs, node->data + node_offs, size);
+    qemu_co_mutex_unlock(&node->lock);
+    assert(copy == size);
+}
+
+static inline void pcache_node_read_wait(PrefCacheAIOCB *acb, PCNode *node)
+{
+    ACBEntryLink *link = g_slice_alloc(sizeof(*link));
+    link->acb = acb;
+
+    atomic_inc(&node->wait.cnt);
+    QTAILQ_INSERT_HEAD(&node->wait.list, link, entry);
+    acb->ref++;
+}
+
+static void pcache_node_read(PrefCacheAIOCB *acb, PCNode* node)
+{
     assert(node->status == NODE_SUCCESS_STATUS ||
+           node->status == NODE_WAIT_STATUS ||
            node->status == NODE_REMOVE_STATUS);
     assert(node->data != NULL);
 
     qemu_co_mutex_lock(&node->lock);
-    copy = \
-        qemu_iovec_from_buf(acb->qiov, qiov_offs, node->data + node_offs, size);
-    assert(copy == size);
+    if (node->status == NODE_WAIT_STATUS) {
+        pcache_node_read_wait(acb, node);
+        qemu_co_mutex_unlock(&node->lock);
+
+        return;
+    }
     qemu_co_mutex_unlock(&node->lock);
+
+    pcache_node_read_buf(acb, node);
+    pcache_node_unref(acb->s, node);
 }
 
 static inline void prefetch_init_key(PrefCacheAIOCB *acb, RbNodeKey* key)
@@ -446,10 +500,11 @@ static void pcache_pickup_parts_of_cache(PrefCacheAIOCB *acb, PCNode *node,
             size -= up_size;
             num += up_size;
         }
-        pcache_node_read(acb, node);
         up_size = MIN(node->cm.sector_num + node->cm.nb_sectors - num, size);
-
-        pcache_node_unref(acb->s, node);
+        pcache_node_read(acb, node); /* don't use node after pcache_node_read,
+                                      * node may be freed.
+                                      */
+        node = NULL;
 
         size -= up_size;
         num += up_size;
@@ -488,7 +543,6 @@ static int32_t pcache_prefetch(PrefCacheAIOCB *acb)
                                                          acb->nb_sectors)
         {
             pcache_node_read(acb, node);
-            pcache_node_unref(acb->s, node);
             return PREFETCH_FULL_UP;
         }
         pcache_pickup_parts_of_cache(acb, node, key.num, key.size);
@@ -513,6 +567,31 @@ static void complete_aio_request(PrefCacheAIOCB *acb)
     }
 }
 
+static void pcache_complete_acb_wait_queue(BDRVPCacheState *s, PCNode *node)
+{
+    ACBEntryLink *link, *next;
+
+    if (atomic_read(&node->wait.cnt) == 0) {
+        return;
+    }
+
+    QTAILQ_FOREACH_SAFE(link, &node->wait.list, entry, next) {
+        PrefCacheAIOCB *wait_acb = link->acb;
+
+        QTAILQ_REMOVE(&node->wait.list, link, entry);
+        g_slice_free1(sizeof(*link), link);
+
+        pcache_node_read_buf(wait_acb, node);
+
+        assert(node->ref != 0);
+        pcache_node_unref(s, node);
+
+        complete_aio_request(wait_acb);
+        atomic_dec(&node->wait.cnt);
+    }
+    assert(atomic_read(&node->wait.cnt) == 0);
+}
+
 static void pcache_node_submit(PrefCachePartReq *req)
 {
     PCNode *node = req->node;
@@ -539,14 +618,17 @@ static void pcache_merge_requests(PrefCacheAIOCB *acb)
 
     qemu_co_mutex_lock(&acb->requests.lock);
     QTAILQ_FOREACH_SAFE(req, &acb->requests.list, entry, next) {
+        PCNode *node = req->node;
         QTAILQ_REMOVE(&acb->requests.list, req, entry);
 
         assert(req != NULL);
-        assert(req->node->status == NODE_WAIT_STATUS);
+        assert(node->status == NODE_WAIT_STATUS);
 
         pcache_node_submit(req);
 
-        pcache_node_read(acb, req->node);
+        pcache_node_read_buf(acb, node);
+
+        pcache_complete_acb_wait_queue(acb->s, node);
 
         pcache_node_unref(acb->s, req->node);
 
@@ -559,22 +641,27 @@ static void pcache_try_node_drop(PrefCacheAIOCB *acb)
 {
     BDRVPCacheState *s = acb->s;
     RbNodeKey key;
+    PCNode *node;
+    uint64_t end_offs = acb->sector_num + acb->nb_sectors;
 
-    prefetch_init_key(acb, &key);
-
+    key.num = acb->sector_num;
     do {
-        PCNode *node;
-        qemu_co_mutex_lock(&s->pcache.tree.lock);
+        key.size = end_offs - key.num;
+
+        qemu_co_mutex_lock(&s->pcache.tree.lock); /* XXX: use get_next_node */
         node = pcache_node_search(&s->pcache.tree.root, &key);
         qemu_co_mutex_unlock(&s->pcache.tree.lock);
         if (node == NULL) {
-            break;
+            return;
         }
-
-        pcache_node_drop(s, node);
+        if (node->status != NODE_WAIT_STATUS) {
+            assert(node->status == NODE_SUCCESS_STATUS);
+            pcache_node_drop(s, node);
+        }
+        key.num = node->cm.sector_num + node->cm.nb_sectors;
 
         pcache_node_unref(s, node);
-    } while (true);
+    } while (end_offs > key.num);
 }
 
 static void pcache_aio_cb(void *opaque, int ret)
@@ -586,6 +673,8 @@ static void pcache_aio_cb(void *opaque, int ret)
             return;
         }
         pcache_merge_requests(acb);
+    } else { /* QEMU_AIO_WRITE */
+        pcache_try_node_drop(acb); /* XXX: use write through */
     }
 
     complete_aio_request(acb);
@@ -649,7 +738,6 @@ static BlockAIOCB *pcache_aio_writev(BlockDriverState *bs,
 {
     PrefCacheAIOCB *acb = pcache_aio_get(bs, sector_num, qiov, nb_sectors,
                                          cb, opaque, QEMU_AIO_WRITE);
-    pcache_try_node_drop(acb); /* XXX: use write through */
 
     bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
                     pcache_aio_cb, acb);
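
Reading aid for reviewers (not part of the patch): the rescheduling idea can be modelled outside QEMU roughly as below. This is only a sketch under stated assumptions. Request, Waiter, Node, request_part_done(), node_read() and node_fill_complete() are hypothetical stand-ins for PrefCacheAIOCB, ACBEntryLink, PCNode, complete_aio_request(), pcache_node_read() and pcache_complete_acb_wait_queue(). A hand-rolled singly linked list replaces QTAILQ, buffers are a fixed 8 bytes, and locking, coroutines and the LRU are left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for NODE_WAIT_STATUS / NODE_SUCCESS_STATUS. */
enum { NODE_WAIT, NODE_SUCCESS };

/* Hypothetical model of a read request (PrefCacheAIOCB in the patch). */
typedef struct Request {
    uint8_t buf[8];   /* destination buffer of the read */
    int pending;      /* outstanding parts, plays the role of acb->ref */
    int done;         /* set once every part has completed */
} Request;

/* Hypothetical model of ACBEntryLink: one queued reader of a pending node. */
typedef struct Waiter {
    struct Waiter *next;
    Request *req;
} Waiter;

/* Hypothetical model of PCNode. */
typedef struct Node {
    int status;
    int ref;
    uint8_t data[8];
    Waiter *waiters;
    unsigned wait_cnt;
} Node;

/* Drop one outstanding part; the request completes when none are left. */
static void request_part_done(Request *req)
{
    assert(req->pending > 0);
    if (--req->pending == 0) {
        req->done = 1;
    }
}

/*
 * Read from a node the caller holds a reference on. A filled node is
 * copied immediately and the reference is released. A pending node gets
 * the request queued instead; the node reference stays with the waiter.
 */
static void node_read(Request *req, Node *node)
{
    if (node->status == NODE_WAIT) {
        Waiter *w = malloc(sizeof(*w));
        assert(w != NULL);
        w->req = req;
        w->next = node->waiters;
        node->waiters = w;
        node->wait_cnt++;
        req->pending++;   /* keep the request alive until the node is filled */
        return;
    }
    memcpy(req->buf, node->data, sizeof(req->buf));
    node->ref--;
}

/* The fill of a pending node finished: publish data and serve all waiters. */
static void node_fill_complete(Node *node, const uint8_t *data)
{
    memcpy(node->data, data, sizeof(node->data));
    node->status = NODE_SUCCESS;

    while (node->waiters != NULL) {
        Waiter *w = node->waiters;
        node->waiters = w->next;

        memcpy(w->req->buf, node->data, sizeof(node->data));
        assert(node->ref > 0);
        node->ref--;              /* release the reference the waiter held */
        node->wait_cnt--;
        request_part_done(w->req);
        free(w);
    }
    assert(node->wait_cnt == 0);
}
```

In this model a reader that hits a pending node neither blocks nor issues a second disk read. It parks on the node and is completed from the fill path, which is the behaviour the commit message describes for NODE_WAIT_STATUS nodes.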