From patchwork Fri Nov 1 09:31:46 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Li Wang
X-Patchwork-Id: 3123631
From: Li Wang
To: ceph-devel@vger.kernel.org
Cc: Sage Weil, Li Wang, Yunchuan Wen
Subject: [RFC PATCH] ceph: Capture stride readahead
Date: Fri, 1 Nov 2013 17:31:46 +0800
Message-Id: <1383298306-8492-1-git-send-email-liwang@ubuntukylin.com>
X-Mailer: git-send-email 1.7.9.5
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

Enable ceph to detect strided reads and issue readahead for them. The
algorithm is simple: if a read has the same length as the previous one
and starts exactly one stride after it, prefetch the range one stride
further ahead.

In the future, this may be enabled only when the user explicitly
requests it through a mount option.

Signed-off-by: Yunchuan Wen
Signed-off-by: Li Wang
---
 fs/ceph/file.c  |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/ceph/super.h |    8 ++++++++
 2 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3de8982..16a3981 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 
 #include "super.h"
 #include "mds_client.h"
@@ -635,6 +636,60 @@ out:
 	return ret;
 }
 
+static void ceph_stride_readahead(struct file *file, loff_t pos, size_t length)
+{
+	struct address_space *mapping = file->f_mapping;
+	struct ceph_file_info *fi = file->private_data;
+	struct ceph_file_stride_ra_info *info = &fi->stride;
+	struct blk_plug plug;
+	LIST_HEAD(page_pool);
+	loff_t next_pos;
+	pgoff_t start, end, page_idx;
+	unsigned int nr_pages = 0;
+
+	if (info->length != length)
+		goto skip;
+	if (pos != info->pos + info->stride)
+		goto skip;
+
+	next_pos = pos + info->stride;
+	start = next_pos >> PAGE_CACHE_SHIFT;
+	end = (next_pos + length - 1) >> PAGE_CACHE_SHIFT;
+	end = min(end, start + file->f_ra.ra_pages);
+
+	for (page_idx = start; page_idx <= end; ++page_idx) {
+		struct page *page;
+
+		rcu_read_lock();
+		page = radix_tree_lookup(&mapping->page_tree, page_idx);
+		rcu_read_unlock();
+
+		if (page)
+			continue;
+
+		page = page_cache_alloc_readahead(mapping);
+		if (!page)
+			break;
+		page->index = page_idx;
+		list_add(&page->lru, &page_pool);
+
+		++nr_pages;
+	}
+
+	if (!nr_pages)
+		goto skip;
+
+	blk_start_plug(&plug);
+	mapping->a_ops->readpages(file, mapping, &page_pool, nr_pages);
+	put_pages_list(&page_pool);
+	blk_finish_plug(&plug);
+
+skip:
+	info->length = length;
+	info->stride = pos - info->pos;
+	info->pos = pos;
+}
+
 /*
  * Wrap generic_file_aio_read with checks for cap bits on the inode.
  * Atomically grab references, so that those bits are not released
@@ -675,8 +730,11 @@ again:
 	    (fi->flags & CEPH_F_SYNC))
 		/* hmm, this isn't really async... */
 		ret = ceph_sync_read(filp, base, len, ppos, &checkeof);
-	else
+	else {
 		ret = generic_file_aio_read(iocb, iov, nr_segs, pos);
+		if (ret >= 0)
+			ceph_stride_readahead(filp, pos, iocb->ki_nbytes);
+	}
 
 out:
 	dout("aio_read %p %llx.%llx dropping cap refs on %s = %d\n",
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 6014b0a..72b4382 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -567,6 +567,12 @@ extern void ceph_reservation_status(struct ceph_fs_client *client,
 #define CEPH_F_SYNC 1
 #define CEPH_F_ATEND 2
 
+struct ceph_file_stride_ra_info {
+	loff_t pos;
+	size_t length;
+	loff_t stride;
+};
+
 struct ceph_file_info {
 	short fmode; /* initialized on open */
 	short flags; /* CEPH_F_* */
@@ -585,6 +591,8 @@ struct ceph_file_info {
 	/* used for -o dirstat read() on directory thing */
 	char *dir_info;
 	int dir_info_len;
+
+	struct ceph_file_stride_ra_info stride;
 };
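
For readers who want to try the detection logic outside the kernel, here is a minimal user-space sketch of the bookkeeping done by ceph_stride_readahead(). It is not part of the patch. The names stride_ra_info, stride_predict, stride_page_range and SKETCH_PAGE_SHIFT are hypothetical, and a 4 KiB page size is assumed in place of PAGE_CACHE_SHIFT. The sketch only computes what would be prefetched; it does not allocate pages or issue I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, standing in for the kernel's PAGE_CACHE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical user-space mirror of struct ceph_file_stride_ra_info. */
struct stride_ra_info {
	long long pos;		/* offset of the previous read */
	size_t length;		/* length of the previous read */
	long long stride;	/* distance between the two previous reads */
};

/*
 * Return true and store the predicted offset in *next_pos when the
 * current read has the same length as the previous one and starts
 * exactly one stride after it. The current read is always recorded
 * afterwards, as the patch does under its "skip" label.
 */
static bool stride_predict(struct stride_ra_info *info, long long pos,
			   size_t length, long long *next_pos)
{
	bool hit;

	assert(info && next_pos);

	hit = info->length == length && pos == info->pos + info->stride;
	if (hit)
		*next_pos = pos + info->stride;

	info->length = length;
	info->stride = pos - info->pos;
	info->pos = pos;
	return hit;
}

/*
 * Compute the inclusive page index range covering the predicted read,
 * with the end clamped to start + ra_pages as in the patch.
 */
static void stride_page_range(long long next_pos, size_t length,
			      unsigned long ra_pages,
			      unsigned long *start, unsigned long *end)
{
	unsigned long last;

	assert(start && end);

	*start = (unsigned long)(next_pos >> SKETCH_PAGE_SHIFT);
	last = (unsigned long)((next_pos + (long long)length - 1)
			       >> SKETCH_PAGE_SHIFT);
	*end = last < *start + ra_pages ? last : *start + ra_pages;
}
```

Starting from a zeroed stride_ra_info, 4096-byte reads at offsets 0, 1 MiB and 2 MiB miss twice and hit on the third call, because two reads are needed before a stride is known. A read whose length differs from the previous one resets the pattern.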