From patchwork Tue Oct 28 17:09:17 2014
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 5178611
Message-ID: <544FCDBD.7070106@kernel.dk>
Date: Tue, 28 Oct 2014 11:09:17 -0600
From: Jens Axboe
To: Ketor D
CC: Mark Kirkwood, Mark Nelson, "fio@vger.kernel.org", "xan.peng", "ceph-devel@vger.kernel.org"
Subject: Re: fio rbd completions (Was: fio rbd hang for block sizes > 1M)
List-ID: ceph-devel@vger.kernel.org

On 2014-10-28 09:49, Ketor D wrote:
> Cannot get the new commited code from github now.
> When I get the newest code, I will test.

So here's another idea, which applies on top of current -git. Basically it makes rbd wait for the oldest event, not just the first one in the array of all ios.
This is the saner thing to do, as hopefully the oldest event will be the one to complete first. At least it has a much higher chance of being the right thing to do than just waiting on a random event. Completely untested, so you might have to fiddle a bit with it to ensure that it actually works...

diff --git a/engines/rbd.c b/engines/rbd.c
index cf7be0acd1e3..f3129044c430 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -20,6 +20,7 @@ struct rbd_data {
 	rados_ioctx_t io_ctx;
 	rbd_image_t image;
 	struct io_u **aio_events;
+	struct io_u **sort_events;
 };
 
 struct rbd_options {
@@ -80,20 +81,19 @@ static int _fio_setup_rbd_data(struct thread_data *td,
 	if (td->io_ops->data)
 		return 0;
 
-	rbd_data = malloc(sizeof(struct rbd_data));
+	rbd_data = calloc(1, sizeof(struct rbd_data));
 	if (!rbd_data)
 		goto failed;
 
-	memset(rbd_data, 0, sizeof(struct rbd_data));
-
-	rbd_data->aio_events = malloc(td->o.iodepth * sizeof(struct io_u *));
+	rbd_data->aio_events = calloc(td->o.iodepth, sizeof(struct io_u *));
 	if (!rbd_data->aio_events)
 		goto failed;
 
-	memset(rbd_data->aio_events, 0, td->o.iodepth * sizeof(struct io_u *));
+	rbd_data->sort_events = calloc(td->o.iodepth, sizeof(struct io_u *));
+	if (!rbd_data->sort_events)
+		goto failed;
 
 	*rbd_data_ptr = rbd_data;
-
 	return 0;
 
 failed:
@@ -218,14 +218,32 @@ static inline int fri_check_complete(struct rbd_data *rbd_data,
 	return 0;
 }
 
+static int rbd_io_u_cmp(const void *p1, const void *p2)
+{
+	const struct io_u **a = (const struct io_u **) p1;
+	const struct io_u **b = (const struct io_u **) p2;
+	uint64_t at, bt;
+
+	at = utime_since_now(&(*a)->start_time);
+	bt = utime_since_now(&(*b)->start_time);
+
+	if (at < bt)
+		return -1;
+	else if (at == bt)
+		return 0;
+	else
+		return 1;
+}
+
 static int rbd_iter_events(struct thread_data *td, unsigned int *events,
 			   unsigned int min_evts, int wait)
 {
 	struct rbd_data *rbd_data = td->io_ops->data;
 	unsigned int this_events = 0;
 	struct io_u *io_u;
-	int i;
+	int i, sort_idx;
 
+	sort_idx = 0;
 	io_u_qiter(&td->io_u_all, io_u, i) {
 		struct fio_rbd_iou *fri = io_u->engine_data;
 
@@ -236,16 +254,39 @@ static int rbd_iter_events(struct thread_data *td, unsigned int *events,
 		if (fri_check_complete(rbd_data, io_u, events))
 			this_events++;
-		else if (wait) {
-			rbd_aio_wait_for_complete(fri->completion);
+		else if (wait)
+			rbd_data->sort_events[sort_idx++] = io_u;
 
-			if (fri_check_complete(rbd_data, io_u, events))
-				this_events++;
-		}
 		if (*events >= min_evts)
 			break;
 	}
 
+	if (!wait || !sort_idx)
+		return this_events;
+
+	qsort(rbd_data->sort_events, sort_idx, sizeof(struct io_u *), rbd_io_u_cmp);
+
+	for (i = 0; i < sort_idx; i++) {
+		struct fio_rbd_iou *fri;
+
+		io_u = rbd_data->sort_events[i];
+		fri = io_u->engine_data;
+
+		if (fri_check_complete(rbd_data, io_u, events)) {
+			this_events++;
+			continue;
+		}
+		if (!wait)
+			continue;
+
+		rbd_aio_wait_for_complete(fri->completion);
+
+		if (fri_check_complete(rbd_data, io_u, events))
+			this_events++;
+
+		if (wait && *events >= min_evts)
+			wait = 0;
+	}
+
 	return this_events;
 }
 
@@ -359,6 +400,7 @@ static void fio_rbd_cleanup(struct thread_data *td)
 	if (rbd_data) {
 		_fio_rbd_disconnect(rbd_data);
 		free(rbd_data->aio_events);
+		free(rbd_data->sort_events);
 		free(rbd_data);
 	}