From patchwork Sat Mar 10 18:18:46 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andiry Xu
X-Patchwork-Id: 10273961
From: Andiry Xu
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org
Subject: [RFC v2 65/83] File operation: read.
Date: Sat, 10 Mar 2018 10:18:46 -0800
Message-Id: <1520705944-6723-66-git-send-email-jix024@eng.ucsd.edu>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
References: <1520705944-6723-1-git-send-email-jix024@eng.ucsd.edu>
Cc: coughlan@redhat.com, miklos@szeredi.hu, Andiry Xu, david@fromorbit.com,
 jack@suse.com, swanson@cs.ucsd.edu, swhiteho@redhat.com,
 andiry.xu@gmail.com

From: Andiry Xu

NOVA is a DAX file system and does not use the page cache. For reads,
NOVA looks up the file write entry in the radix tree and copies data
from pmem pages directly to the user buffer.

Signed-off-by: Andiry Xu
---
 fs/nova/file.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)

diff --git a/fs/nova/file.c b/fs/nova/file.c
index f60fdf3..842da45 100644
--- a/fs/nova/file.c
+++ b/fs/nova/file.c
@@ -113,9 +113,153 @@ static int nova_open(struct inode *inode, struct file *filp)
 	return generic_file_open(inode, filp);
 }
 
+static ssize_t
+do_dax_mapping_read(struct file *filp, char __user *buf,
+	size_t len, loff_t *ppos)
+{
+	struct inode *inode = filp->f_mapping->host;
+	struct super_block *sb = inode->i_sb;
+	struct nova_inode_info *si = NOVA_I(inode);
+	struct nova_inode_info_header *sih = &si->header;
+	struct nova_file_write_entry *entry;
+	pgoff_t index, end_index;
+	unsigned long offset;
+	loff_t isize, pos;
+	size_t copied = 0, error = 0;
+	timing_t memcpy_time;
+
+	pos = *ppos;
+	index = pos >> PAGE_SHIFT;
+	offset = pos & ~PAGE_MASK;
+
+	if (!access_ok(VERIFY_WRITE, buf, len)) {
+		error = -EFAULT;
+		goto out;
+	}
+
+	isize = i_size_read(inode);
+	if (!isize)
+		goto out;
+
+	nova_dbgv("%s: inode %lu, offset %lld, count %lu, size %lld\n",
+		__func__, inode->i_ino, pos, len, isize);
+
+	if (len > isize - pos)
+		len = isize - pos;
+
+	if (len <= 0)
+		goto out;
+
+	end_index = (isize - 1) >> PAGE_SHIFT;
+	do {
+		unsigned long nr, left;
+		unsigned long nvmm;
+		void *dax_mem = NULL;
+		int zero = 0;
+
+		/* nr is the maximum number of bytes to copy from this page */
+		if (index >= end_index) {
+			if (index > end_index)
+				goto out;
+			nr = ((isize - 1) & ~PAGE_MASK) + 1;
+			if (nr <= offset)
+				goto out;
+		}
+
+		entry = nova_get_write_entry(sb, sih, index);
+		if (unlikely(entry == NULL)) {
+			nova_dbgv("Required extent not found: pgoff %lu, inode size %lld\n",
+				index, isize);
+			nr = PAGE_SIZE;
+			zero = 1;
+			goto memcpy;
+		}
+
+		/* Find contiguous blocks */
+		if (index < entry->pgoff ||
+			index - entry->pgoff >= entry->num_pages) {
+			nova_err(sb, "%s ERROR: %lu, entry pgoff %llu, num %u, blocknr %llu\n",
+				__func__, index, entry->pgoff,
+				entry->num_pages, entry->block >> PAGE_SHIFT);
+			return -EINVAL;
+		}
+		if (entry->reassigned == 0) {
+			nr = (entry->num_pages - (index - entry->pgoff))
+				* PAGE_SIZE;
+		} else {
+			nr = PAGE_SIZE;
+		}
+
+		nvmm = get_nvmm(sb, sih, entry, index);
+		dax_mem = nova_get_block(sb, (nvmm << PAGE_SHIFT));
+
+memcpy:
+		nr = nr - offset;
+		if (nr > len - copied)
+			nr = len - copied;
+
+		NOVA_START_TIMING(memcpy_r_nvmm_t, memcpy_time);
+
+		if (!zero)
+			left = __copy_to_user(buf + copied,
+						dax_mem + offset, nr);
+		else
+			left = __clear_user(buf + copied, nr);
+
+		NOVA_END_TIMING(memcpy_r_nvmm_t, memcpy_time);
+
+		if (left) {
+			nova_dbg("%s ERROR!: bytes %lu, left %lu\n",
+				__func__, nr, left);
+			error = -EFAULT;
+			goto out;
+		}
+
+		copied += (nr - left);
+		offset += (nr - left);
+		index += offset >> PAGE_SHIFT;
+		offset &= ~PAGE_MASK;
+	} while (copied < len);
+
+out:
+	*ppos = pos + copied;
+	if (filp)
+		file_accessed(filp);
+
+	NOVA_STATS_ADD(read_bytes, copied);
+
+	nova_dbgv("%s returned %zu\n", __func__, copied);
+	return copied ? copied : error;
+}
+
+/*
+ * Wrappers. We need to use the read lock to avoid
+ * concurrent truncate operation. No problem for write because we held
+ * lock.
+ */
+static ssize_t nova_dax_file_read(struct file *filp, char __user *buf,
+	size_t len, loff_t *ppos)
+{
+	struct inode *inode = filp->f_mapping->host;
+	struct nova_inode_info *si = NOVA_I(inode);
+	struct nova_inode_info_header *sih = &si->header;
+	ssize_t res;
+	timing_t dax_read_time;
+
+	NOVA_START_TIMING(dax_read_t, dax_read_time);
+	inode_lock_shared(inode);
+	sih_lock_shared(sih);
+	res = do_dax_mapping_read(filp, buf, len, ppos);
+	sih_unlock_shared(sih);
+	inode_unlock_shared(inode);
+	NOVA_END_TIMING(dax_read_t, dax_read_time);
+	return res;
+}
+
 
 const struct file_operations nova_dax_file_operations = {
 	.llseek = nova_llseek,
+	.read = nova_dax_file_read,
 	.open = nova_open,
 	.fsync = nova_fsync,
 	.flush = nova_flush,
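
Illustration only, not part of the patch: the read path described in the
commit message (clamp the request to the file size, look up the entry for
each page, copy from the backing page or zero-fill a hole) can be sketched
in userspace C. Everything below is an assumption made for illustration:
struct sketch_file, sketch_read() and the SK_* macros are hypothetical, a
plain array of page pointers stands in for the radix tree of write entries,
and memcpy()/memset() stand in for __copy_to_user()/__clear_user(). Unlike
the patch, the sketch copies at most one page per iteration and does not
model multi-page extents, locking, or error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  ((size_t)1 << SK_PAGE_SHIFT)

/* Hypothetical stand-in for a file: one pointer per page, NULL marks a hole. */
struct sketch_file {
	const unsigned char **pages;	/* pages[i] is SK_PAGE_SIZE bytes, or NULL */
	size_t isize;			/* file size in bytes */
};

/*
 * Copy up to len bytes starting at *ppos into buf and advance *ppos.
 * Returns the number of bytes copied (0 at or past end of file).
 */
static size_t sketch_read(const struct sketch_file *f, unsigned char *buf,
			  size_t len, size_t *ppos)
{
	size_t pos = *ppos, copied = 0;

	assert(f != NULL && buf != NULL);

	/* Clamp the request to the file size, as the patch does with isize. */
	if (pos >= f->isize)
		return 0;
	if (len > f->isize - pos)
		len = f->isize - pos;

	while (copied < len) {
		size_t index = (pos + copied) >> SK_PAGE_SHIFT;
		size_t offset = (pos + copied) & (SK_PAGE_SIZE - 1);
		size_t nr = SK_PAGE_SIZE - offset;	/* bytes left in this page */
		const unsigned char *page = f->pages[index];

		if (nr > len - copied)
			nr = len - copied;

		if (page)
			memcpy(buf + copied, page + offset, nr);	/* mapped page */
		else
			memset(buf + copied, 0, nr);			/* hole reads as zeros */

		copied += nr;
	}

	*ppos = pos + copied;
	return copied;
}
```

A read that straddles a mapped page and a hole returns the mapped bytes
followed by zeros, and a read that starts at or beyond the file size
returns 0 without touching the buffer.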