From patchwork Fri Feb 10 23:17:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Shilovsky
X-Patchwork-Id: 9567597
From: Pavel Shilovsky
To: linux-cifs@vger.kernel.org
Subject: [PATCH 2/2] CIFS: Add asynchronous write support through kernel AIO
Date: Fri, 10 Feb 2017 15:17:58 -0800
Message-Id: <1486768678-36802-3-git-send-email-pshilov@microsoft.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1486768678-36802-1-git-send-email-pshilov@microsoft.com>
References: <1486768678-36802-1-git-send-email-pshilov@microsoft.com>
X-Mailing-List: linux-cifs@vger.kernel.org

This patch adds support to process write calls passed by io_submit()
asynchronously.
It is based on the previously introduced async context, which allows i/o
responses to be processed in a separate thread and returns to the caller
immediately for asynchronous calls. This improves write performance of
single-threaded applications as the i/o queue depth increases.

Signed-off-by: Pavel Shilovsky
---
 fs/cifs/cifsglob.h |   2 +
 fs/cifs/file.c     | 189 ++++++++++++++++++++++++++++++++++++++---------------
 2 files changed, 140 insertions(+), 51 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 80771ca..caa6f1a 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1119,6 +1119,7 @@ struct cifs_aio_ctx {
 	struct cifsFileInfo *cfile;
 	struct page **pages;
 	struct bio_vec *bv;
+	loff_t pos;
 	unsigned int npages;
 	ssize_t rc;
 	unsigned int len;
@@ -1166,6 +1167,7 @@ struct cifs_writedata {
 	enum writeback_sync_modes sync_mode;
 	struct work_struct work;
 	struct cifsFileInfo *cfile;
+	struct cifs_aio_ctx *ctx;
 	__u64 offset;
 	pid_t pid;
 	unsigned int bytes;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 6ceeed2..8f80397 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2561,11 +2561,14 @@ cifs_uncached_writedata_release(struct kref *refcount)
 	struct cifs_writedata *wdata = container_of(refcount,
 					struct cifs_writedata, refcount);
 
+	kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
 	for (i = 0; i < wdata->nr_pages; i++)
 		put_page(wdata->pages[i]);
 	cifs_writedata_release(refcount);
 }
 
+static void collect_uncached_write_data(struct cifs_aio_ctx *ctx);
+
 static void
 cifs_uncached_writev_complete(struct work_struct *work)
 {
@@ -2581,7 +2584,8 @@ cifs_uncached_writev_complete(struct work_struct *work)
 	spin_unlock(&inode->i_lock);
 
 	complete(&wdata->done);
-
+	collect_uncached_write_data(wdata->ctx);
+	/* the below call can possibly free the last ref to aio ctx */
 	kref_put(&wdata->refcount, cifs_uncached_writedata_release);
 }
 
@@ -2630,7 +2634,8 @@ wdata_fill_from_iovec(struct cifs_writedata *wdata, struct iov_iter *from,
 static int
 cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
 		     struct cifsFileInfo *open_file,
-		     struct cifs_sb_info *cifs_sb, struct list_head *wdata_list)
+		     struct cifs_sb_info *cifs_sb, struct list_head *wdata_list,
+		     struct cifs_aio_ctx *ctx)
 {
 	int rc = 0;
 	size_t cur_len;
@@ -2698,6 +2703,8 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
 		wdata->pagesz = PAGE_SIZE;
 		wdata->tailsz = cur_len - ((nr_pages - 1) * PAGE_SIZE);
 		wdata->credits = credits;
+		wdata->ctx = ctx;
+		kref_get(&ctx->refcount);
 
 		if (!wdata->cfile->invalidHandle ||
 		    !cifs_reopen_file(wdata->cfile, false))
@@ -2723,81 +2730,61 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
 	return rc;
 }
 
-ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
+static void collect_uncached_write_data(struct cifs_aio_ctx *ctx)
 {
-	struct file *file = iocb->ki_filp;
-	ssize_t total_written = 0;
-	struct cifsFileInfo *open_file;
+	struct cifs_writedata *wdata, *tmp;
 	struct cifs_tcon *tcon;
 	struct cifs_sb_info *cifs_sb;
-	struct cifs_writedata *wdata, *tmp;
-	struct list_head wdata_list;
-	struct iov_iter saved_from = *from;
+	struct dentry *dentry = ctx->cfile->dentry;
+	unsigned int i;
 	int rc;
 
-	/*
-	 * BB - optimize the way when signing is disabled. We can drop this
-	 * extra memory-to-memory copying and use iovec buffers for constructing
-	 * write request.
-	 */
-
-	rc = generic_write_checks(iocb, from);
-	if (rc <= 0)
-		return rc;
-
-	INIT_LIST_HEAD(&wdata_list);
-	cifs_sb = CIFS_FILE_SB(file);
-	open_file = file->private_data;
-	tcon = tlink_tcon(open_file->tlink);
-
-	if (!tcon->ses->server->ops->async_writev)
-		return -ENOSYS;
+	tcon = tlink_tcon(ctx->cfile->tlink);
+	cifs_sb = CIFS_SB(dentry->d_sb);
 
-	rc = cifs_write_from_iter(iocb->ki_pos, iov_iter_count(from), from,
-				  open_file, cifs_sb, &wdata_list);
+	mutex_lock(&ctx->aio_mutex);
 
-	/*
-	 * If at least one write was successfully sent, then discard any rc
-	 * value from the later writes. If the other write succeeds, then
-	 * we'll end up returning whatever was written. If it fails, then
-	 * we'll get a new rc value from that.
-	 */
-	if (!list_empty(&wdata_list))
-		rc = 0;
+	if (list_empty(&ctx->list)) {
+		mutex_unlock(&ctx->aio_mutex);
+		return;
+	}
 
+	rc = ctx->rc;
 	/*
 	 * Wait for and collect replies for any successful sends in order of
-	 * increasing offset. Once an error is hit or we get a fatal signal
-	 * while waiting, then return without waiting for any more replies.
+	 * increasing offset. Once an error is hit, then return without waiting
+	 * for any more replies.
 	 */
 restart_loop:
-	list_for_each_entry_safe(wdata, tmp, &wdata_list, list) {
+	list_for_each_entry_safe(wdata, tmp, &ctx->list, list) {
 		if (!rc) {
-			/* FIXME: freezable too? */
-			rc = wait_for_completion_killable(&wdata->done);
-			if (rc)
-				rc = -EINTR;
-			else if (wdata->result)
+			if (!try_wait_for_completion(&wdata->done)) {
+				mutex_unlock(&ctx->aio_mutex);
+				return;
+			}
+
+			if (wdata->result)
 				rc = wdata->result;
 			else
-				total_written += wdata->bytes;
+				ctx->total_len += wdata->bytes;
 
 			/* resend call if it's a retryable error */
 			if (rc == -EAGAIN) {
 				struct list_head tmp_list;
-				struct iov_iter tmp_from = saved_from;
+				struct iov_iter tmp_from = ctx->iter;
 
 				INIT_LIST_HEAD(&tmp_list);
 				list_del_init(&wdata->list);
 
 				iov_iter_advance(&tmp_from,
-						 wdata->offset - iocb->ki_pos);
+						 wdata->offset - ctx->pos);
 
 				rc = cifs_write_from_iter(wdata->offset,
 						wdata->bytes, &tmp_from,
-						open_file, cifs_sb, &tmp_list);
+						ctx->cfile, cifs_sb, &tmp_list,
+						ctx);
 
-				list_splice(&tmp_list, &wdata_list);
+				list_splice(&tmp_list, &ctx->list);
 
 				kref_put(&wdata->refcount,
 					 cifs_uncached_writedata_release);
@@ -2808,12 +2795,112 @@ ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
 		kref_put(&wdata->refcount, cifs_uncached_writedata_release);
 	}
 
+	for (i = 0; i < ctx->npages; i++)
+		put_page(ctx->bv[i].bv_page);
+
+	cifs_stats_bytes_written(tcon, ctx->total_len);
+	set_bit(CIFS_INO_INVALID_MAPPING, &CIFS_I(dentry->d_inode)->flags);
+
+	ctx->rc = (rc == 0) ? ctx->total_len : rc;
+
+	mutex_unlock(&ctx->aio_mutex);
+
+	if (ctx->iocb && ctx->iocb->ki_complete)
+		ctx->iocb->ki_complete(ctx->iocb, ctx->rc, 0);
+	else
+		complete(&ctx->done);
+}
+
+ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct file *file = iocb->ki_filp;
+	ssize_t total_written = 0;
+	struct cifsFileInfo *cfile;
+	struct cifs_tcon *tcon;
+	struct cifs_sb_info *cifs_sb;
+	struct cifs_aio_ctx *ctx;
+	struct iov_iter saved_from = *from;
+	int rc;
+
+	/*
+	 * BB - optimize the way when signing is disabled. We can drop this
+	 * extra memory-to-memory copying and use iovec buffers for constructing
+	 * write request.
+	 */
+
+	rc = generic_write_checks(iocb, from);
+	if (rc <= 0)
+		return rc;
+
+	cifs_sb = CIFS_FILE_SB(file);
+	cfile = file->private_data;
+	tcon = tlink_tcon(cfile->tlink);
+
+	if (!tcon->ses->server->ops->async_writev)
+		return -ENOSYS;
+
+	ctx = cifs_aio_ctx_alloc();
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->cfile = cifsFileInfo_get(cfile);
+
+	if (!is_sync_kiocb(iocb))
+		ctx->iocb = iocb;
+
+	ctx->pos = iocb->ki_pos;
+
+	rc = setup_aio_ctx_iter(ctx, from, WRITE);
+	if (rc) {
+		kref_put(&ctx->refcount, cifs_aio_ctx_release);
+		return rc;
+	}
+
+	/* grab a lock here due to read response handlers can access ctx */
+	mutex_lock(&ctx->aio_mutex);
+
+	rc = cifs_write_from_iter(iocb->ki_pos, ctx->len, &saved_from,
+				  cfile, cifs_sb, &ctx->list, ctx);
+
+	/*
+	 * If at least one write was successfully sent, then discard any rc
+	 * value from the later writes. If the other write succeeds, then
+	 * we'll end up returning whatever was written. If it fails, then
+	 * we'll get a new rc value from that.
+	 */
+	if (!list_empty(&ctx->list))
+		rc = 0;
+
+	mutex_unlock(&ctx->aio_mutex);
+
+	if (rc) {
+		kref_put(&ctx->refcount, cifs_aio_ctx_release);
+		return rc;
+	}
+
+	if (!is_sync_kiocb(iocb)) {
+		kref_put(&ctx->refcount, cifs_aio_ctx_release);
+		return -EIOCBQUEUED;
+	}
+
+	/* FIXME: freezable sleep too? */
+	rc = wait_for_completion_killable(&ctx->done);
+	if (rc) {
+		mutex_lock(&ctx->aio_mutex);
+		ctx->rc = rc = -EINTR;
+		total_written = ctx->total_len;
+		mutex_unlock(&ctx->aio_mutex);
+	} else {
+		rc = ctx->rc;
+		total_written = ctx->total_len;
+	}
+
+	kref_put(&ctx->refcount, cifs_aio_ctx_release);
+
 	if (unlikely(!total_written))
 		return rc;
 
 	iocb->ki_pos += total_written;
-	set_bit(CIFS_INO_INVALID_MAPPING, &CIFS_I(file_inode(file))->flags);
-	cifs_stats_bytes_written(tcon, total_written);
 	return total_written;
 }
 
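
Illustrative note, not part of the patch: the collection step performed by
collect_uncached_write_data() can be modelled in plain userspace C. This is
only a minimal sketch under stated assumptions. The names model_wdata,
model_ctx and model_collect are hypothetical, an array stands in for the
kernel list, a boolean stands in for the completion, and the aio_mutex
locking, the refcounting and the -EAGAIN resend path are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cifs_writedata: one write request. */
struct model_wdata {
	bool done;		/* models try_wait_for_completion(&wdata->done) */
	int result;		/* 0 on success, negative errno on failure */
	unsigned int bytes;	/* bytes carried by this request */
};

/* Hypothetical stand-in for struct cifs_aio_ctx. */
struct model_ctx {
	struct model_wdata *reqs;	/* requests in order of increasing offset */
	size_t nreqs;
	size_t next;			/* first request not yet collected */
	long total_len;			/* bytes confirmed written so far */
	long rc;			/* final result, valid once finished */
	bool finished;			/* models ki_complete() / complete(&ctx->done) */
};

/*
 * Called after every request completion. Replies are collected in offset
 * order; if the next reply has not arrived yet the function returns early
 * and a later completion resumes the walk. Only the call that gets through
 * the whole list publishes the final result.
 */
static void model_collect(struct model_ctx *ctx)
{
	int rc = 0;

	assert(ctx != NULL);
	if (ctx->finished || ctx->nreqs == 0)
		return;

	while (ctx->next < ctx->nreqs) {
		struct model_wdata *w = &ctx->reqs[ctx->next];

		if (!rc) {
			if (!w->done)
				return;	/* reply still pending */
			if (w->result)
				rc = w->result;
			else
				ctx->total_len += w->bytes;
		}
		/* after an error, later requests are dropped without waiting */
		ctx->next++;
	}

	ctx->rc = rc ? rc : ctx->total_len;
	ctx->finished = true;
}
```

In this model a completion that arrives out of order leaves the context
unfinished, and whichever completion lets the walk reach the end of the list
is the one that reports the byte count or the first error, which is the
behaviour the patch relies on when it calls the collector from
cifs_uncached_writev_complete().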