From patchwork Mon Jul 13 16:30:52 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660369
Subject: [PATCH 01/32] iov_iter: Add ITER_MAPPING
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:30:52 +0100
Message-ID: <159465785214.1376674.6062549291411362531.stgit@warthog.procyon.org.uk>

Add an iterator, ITER_MAPPING, that walks through a set of pages attached to an address_space, starting at a given page and offset and covering the specified number of bytes. The caller must guarantee that the pages are all present, and that they are locked using PG_locked, PG_writeback or PG_fscache to prevent them from going away or being migrated whilst they're being accessed.

This is useful for copying data from socket buffers to inodes in network filesystems and for transferring data between those inodes and the cache using direct I/O.

Whilst it is true that ITER_BVEC could be used instead, that would require a bio_vec array to be allocated to refer to all the pages - which should be redundant if inode->i_pages also points to all these pages.

This could also be turned into an ITER_XARRAY, taking an xarray pointer instead of a mapping pointer. It would be mostly trivial, except for the use of find_get_pages_contig() by iov_iter_get_pages*().
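To illustrate the intended use, here is a sketch (not code from this series; the function name is made up) of how a network filesystem might receive socket data straight into the pagecache:

	/*
	 * Hypothetical example: copy "len" bytes from a socket buffer into
	 * the pages backing "mapping", starting at file position "pos".
	 * The caller must already hold the target pages locked with
	 * PG_locked, PG_writeback or PG_fscache, as described above.
	 */
	static int example_receive_into_mapping(struct sk_buff *skb,
						struct address_space *mapping,
						loff_t pos, size_t len)
	{
		struct iov_iter iter;

		/* READ: data flows into the buffers the iterator describes */
		iov_iter_mapping(&iter, READ, mapping, pos, len);
		return skb_copy_datagram_iter(skb, 0, &iter, len);
	}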
Signed-off-by: David Howells cc: Matthew Wilcox --- include/linux/uio.h | 11 ++ lib/iov_iter.c | 286 +++++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 274 insertions(+), 23 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 9576fd8158d7..a0321a740f51 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -11,6 +11,7 @@ #include struct page; +struct address_space; struct pipe_inode_info; struct kvec { @@ -25,6 +26,7 @@ enum iter_type { ITER_BVEC = 16, ITER_PIPE = 32, ITER_DISCARD = 64, + ITER_MAPPING = 128, }; struct iov_iter { @@ -40,6 +42,7 @@ struct iov_iter { const struct iovec *iov; const struct kvec *kvec; const struct bio_vec *bvec; + struct address_space *mapping; struct pipe_inode_info *pipe; }; union { @@ -48,6 +51,7 @@ struct iov_iter { unsigned int head; unsigned int start_head; }; + loff_t mapping_start; }; }; @@ -81,6 +85,11 @@ static inline bool iov_iter_is_discard(const struct iov_iter *i) return iov_iter_type(i) == ITER_DISCARD; } +static inline bool iov_iter_is_mapping(const struct iov_iter *i) +{ + return iov_iter_type(i) == ITER_MAPPING; +} + static inline unsigned char iov_iter_rw(const struct iov_iter *i) { return i->type & (READ | WRITE); @@ -222,6 +231,8 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_ void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe, size_t count); void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count); +void iov_iter_mapping(struct iov_iter *i, unsigned int direction, struct address_space *mapping, + loff_t start, size_t count); ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, size_t *start); ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, diff --git a/lib/iov_iter.c b/lib/iov_iter.c index bf538c2bec77..e4a073523b76 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -75,7 +75,40 @@ } \ } -#define iterate_all_kinds(i, n, v, I, B, K) { \ +#define iterate_mapping(i, n, __v, skip, STEP) { \ + struct page *page; \ + size_t wanted = n, seg, offset; \ + loff_t start = i->mapping_start + skip; \ + pgoff_t index = start >> PAGE_SHIFT; \ + \ + XA_STATE(xas, &i->mapping->i_pages, index); \ + \ + rcu_read_lock(); \ + for (page = xas_load(&xas); page; page = xas_next(&xas)) { \ + if (xas_retry(&xas, page)) \ + continue; \ + if (WARN_ON(xa_is_value(page))) \ + break; \ + if (WARN_ON(PageHuge(page))) \ + break; \ + if (!page) \ + break; \ + __v.bv_page = find_subpage(page, xas.xa_index); \ + offset = (i->mapping_start + skip) & ~PAGE_MASK; \ + seg = PAGE_SIZE - offset; \ + __v.bv_offset = offset; \ + __v.bv_len = min(n, seg); \ + (void)(STEP); \ + n -= __v.bv_len; \ + skip += __v.bv_len; \ + if (n == 0) \ + break; \ + } \ + rcu_read_unlock(); \ + n = wanted - n; \ +} + +#define iterate_all_kinds(i, n, v, I, B, K, M) { \ if (likely(n)) { \ size_t skip = i->iov_offset; \ if (unlikely(i->type & ITER_BVEC)) { \ @@ -87,6 +120,9 @@ struct kvec v; \ iterate_kvec(i, n, v, kvec, skip, (K)) \ } else if (unlikely(i->type & ITER_DISCARD)) { \ + } else if (unlikely(i->type & ITER_MAPPING)) { \ + struct bio_vec v; \ + iterate_mapping(i, n, v, skip, (M)); \ } else { \ const struct iovec *iov; \ struct iovec v; \ @@ -95,7 +131,7 @@ } \ } -#define iterate_and_advance(i, n, v, I, B, K) { \ +#define iterate_and_advance(i, n, v, I, B, K, M) { \ if (unlikely(i->count < n)) \ n = i->count; \ if (i->count) { \ @@ -120,6 +156,9 @@ i->kvec = kvec; \ } else 
if (unlikely(i->type & ITER_DISCARD)) { \ skip += n; \ + } else if (unlikely(i->type & ITER_MAPPING)) { \ + struct bio_vec v; \ + iterate_mapping(i, n, v, skip, (M)) \ } else { \ const struct iovec *iov; \ struct iovec v; \ @@ -629,7 +668,9 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i) copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len), memcpy_to_page(v.bv_page, v.bv_offset, (from += v.bv_len) - v.bv_len, v.bv_len), - memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len) + memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len), + memcpy_to_page(v.bv_page, v.bv_offset, + (from += v.bv_len) - v.bv_len, v.bv_len) ) return bytes; @@ -747,6 +788,15 @@ size_t _copy_to_iter_mcsafe(const void *addr, size_t bytes, struct iov_iter *i) bytes = curr_addr - s_addr - rem; return bytes; } + }), + ({ + rem = memcpy_mcsafe_to_page(v.bv_page, v.bv_offset, + (from += v.bv_len) - v.bv_len, v.bv_len); + if (rem) { + curr_addr = (unsigned long) from; + bytes = curr_addr - s_addr - rem; + return bytes; + } }) ) @@ -768,7 +818,9 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i) copyin((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), - memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len) + memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), + memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) return bytes; @@ -794,7 +846,9 @@ bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i) 0;}), memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), - memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len) + memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), + memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) iov_iter_advance(i, bytes); @@ -814,7 +868,9 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i) v.iov_base, v.iov_len), memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), - memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len) + memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), + memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) return bytes; @@ -849,7 +905,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i) memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), memcpy_flushcache((to += v.iov_len) - v.iov_len, v.iov_base, - v.iov_len) + v.iov_len), + memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) return bytes; @@ -873,7 +931,9 @@ bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i) 0;}), memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), - memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len) + memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), + memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) iov_iter_advance(i, bytes); @@ -910,7 +970,7 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes, { if (unlikely(!page_copy_sane(page, offset, bytes))) return 0; - if (i->type & (ITER_BVEC|ITER_KVEC)) { + if (i->type & (ITER_BVEC | ITER_KVEC | ITER_MAPPING)) { void *kaddr = kmap_atomic(page); size_t wanted = copy_to_iter(kaddr + offset, bytes, i); kunmap_atomic(kaddr); @@ 
-933,7 +993,7 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes, WARN_ON(1); return 0; } - if (i->type & (ITER_BVEC|ITER_KVEC)) { + if (i->type & (ITER_BVEC | ITER_KVEC | ITER_MAPPING)) { void *kaddr = kmap_atomic(page); size_t wanted = _copy_from_iter(kaddr + offset, bytes, i); kunmap_atomic(kaddr); @@ -977,7 +1037,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *i) iterate_and_advance(i, bytes, v, clear_user(v.iov_base, v.iov_len), memzero_page(v.bv_page, v.bv_offset, v.bv_len), - memset(v.iov_base, 0, v.iov_len) + memset(v.iov_base, 0, v.iov_len), + memzero_page(v.bv_page, v.bv_offset, v.bv_len) ) return bytes; @@ -1001,7 +1062,9 @@ size_t iov_iter_copy_from_user_atomic(struct page *page, copyin((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page, v.bv_offset, v.bv_len), - memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len) + memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len), + memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page, + v.bv_offset, v.bv_len) ) kunmap_atomic(kaddr); return bytes; @@ -1072,7 +1135,13 @@ void iov_iter_advance(struct iov_iter *i, size_t size) i->count -= size; return; } - iterate_and_advance(i, size, v, 0, 0, 0) + if (unlikely(iov_iter_is_mapping(i))) { + /* We really don't want to fetch pages if we can avoid it */ + i->iov_offset += size; + i->count -= size; + return; + } + iterate_and_advance(i, size, v, 0, 0, 0, 0) } EXPORT_SYMBOL(iov_iter_advance); @@ -1116,7 +1185,12 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll) return; } unroll -= i->iov_offset; - if (iov_iter_is_bvec(i)) { + if (iov_iter_is_mapping(i)) { + BUG(); /* We should never go beyond the start of the specified + * range since we might then be straying into pages that + * aren't pinned. + */ + } else if (iov_iter_is_bvec(i)) { const struct bio_vec *bvec = i->bvec; while (1) { size_t n = (--bvec)->bv_len; @@ -1153,9 +1227,9 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i) return i->count; // it is a silly place, anyway if (i->nr_segs == 1) return i->count; - if (unlikely(iov_iter_is_discard(i))) + if (unlikely(iov_iter_is_discard(i) || iov_iter_is_mapping(i))) return i->count; - else if (iov_iter_is_bvec(i)) + if (iov_iter_is_bvec(i)) return min(i->count, i->bvec->bv_len - i->iov_offset); else return min(i->count, i->iov->iov_len - i->iov_offset); @@ -1203,6 +1277,32 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction, } EXPORT_SYMBOL(iov_iter_pipe); +/** + * iov_iter_mapping - Initialise an I/O iterator to use the pages in a mapping + * @i: The iterator to initialise. + * @direction: The direction of the transfer. + * @mapping: The mapping to access. + * @start: The start file position. + * @count: The size of the I/O buffer in bytes. + * + * Set up an I/O iterator to either draw data out of the pages attached to an + * inode or to inject data into those pages. The pages *must* be prevented + * from evaporation, either by taking a ref on them or locking them by the + * caller. + */ +void iov_iter_mapping(struct iov_iter *i, unsigned int direction, + struct address_space *mapping, + loff_t start, size_t count) +{ + BUG_ON(direction & ~1); + i->type = ITER_MAPPING | (direction & (READ | WRITE)); + i->mapping = mapping; + i->mapping_start = start; + i->count = count; + i->iov_offset = 0; +} +EXPORT_SYMBOL(iov_iter_mapping); + /** * iov_iter_discard - Initialise an I/O iterator that discards data * @i: The iterator to initialise. 
@@ -1236,7 +1336,8 @@ unsigned long iov_iter_alignment(const struct iov_iter *i) iterate_all_kinds(i, size, v, (res |= (unsigned long)v.iov_base | v.iov_len, 0), res |= v.bv_offset | v.bv_len, - res |= (unsigned long)v.iov_base | v.iov_len + res |= (unsigned long)v.iov_base | v.iov_len, + res |= v.bv_offset | v.bv_len ) return res; } @@ -1258,7 +1359,9 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i) (res |= (!res ? 0 : (unsigned long)v.bv_offset) | (size != v.bv_len ? size : 0)), (res |= (!res ? 0 : (unsigned long)v.iov_base) | - (size != v.iov_len ? size : 0)) + (size != v.iov_len ? size : 0)), + (res |= (!res ? 0 : (unsigned long)v.bv_offset) | + (size != v.bv_len ? size : 0)) ); return res; } @@ -1308,6 +1411,48 @@ static ssize_t pipe_get_pages(struct iov_iter *i, return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start); } +static ssize_t iter_mapping_get_pages(struct iov_iter *i, + struct page **pages, size_t maxsize, + unsigned maxpages, size_t *_start_offset) +{ + unsigned nr, offset; + pgoff_t index, count; + size_t size = maxsize, actual; + loff_t pos; + + if (!size || !maxpages) + return 0; + + pos = i->mapping_start + i->iov_offset; + index = pos >> PAGE_SHIFT; + offset = pos & ~PAGE_MASK; + *_start_offset = offset; + + count = 1; + if (size > PAGE_SIZE - offset) { + size -= PAGE_SIZE - offset; + count += size >> PAGE_SHIFT; + size &= ~PAGE_MASK; + if (size) + count++; + } + + if (count > maxpages) + count = maxpages; + + nr = find_get_pages_contig(i->mapping, index, count, pages); + if (nr == 0) + return 0; + + actual = PAGE_SIZE * nr; + actual -= offset; + if (nr == count && size > 0) { + unsigned last_offset = (nr > 1) ? 0 : offset; + actual -= PAGE_SIZE - (last_offset + size); + } + return actual; +} + ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, size_t *start) @@ -1317,6 +1462,8 @@ ssize_t iov_iter_get_pages(struct iov_iter *i, if (unlikely(iov_iter_is_pipe(i))) return pipe_get_pages(i, pages, maxsize, maxpages, start); + if (unlikely(iov_iter_is_mapping(i))) + return iter_mapping_get_pages(i, pages, maxsize, maxpages, start); if (unlikely(iov_iter_is_discard(i))) return -EFAULT; @@ -1343,7 +1490,8 @@ ssize_t iov_iter_get_pages(struct iov_iter *i, return v.bv_len; }),({ return -EFAULT; - }) + }), + 0 ) return 0; } @@ -1387,6 +1535,51 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i, return n; } +static ssize_t iter_mapping_get_pages_alloc(struct iov_iter *i, + struct page ***pages, size_t maxsize, + size_t *_start_offset) +{ + struct page **p; + unsigned nr, offset; + pgoff_t index, count; + size_t size = maxsize, actual; + loff_t pos; + + if (!size) + return 0; + + pos = i->mapping_start + i->iov_offset; + index = pos >> PAGE_SHIFT; + offset = pos & ~PAGE_MASK; + *_start_offset = offset; + + count = 1; + if (size > PAGE_SIZE - offset) { + size -= PAGE_SIZE - offset; + count += size >> PAGE_SHIFT; + size &= ~PAGE_MASK; + if (size) + count++; + } + + p = get_pages_array(count); + if (!p) + return -ENOMEM; + *pages = p; + + nr = find_get_pages_contig(i->mapping, index, count, p); + if (nr == 0) + return 0; + + actual = PAGE_SIZE * nr; + actual -= offset; + if (nr == count && size > 0) { + unsigned last_offset = (nr > 1) ? 
0 : offset; + actual -= PAGE_SIZE - (last_offset + size); + return actual; +} + ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, size_t maxsize, size_t *start) @@ -1398,6 +1591,8 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, if (unlikely(iov_iter_is_pipe(i))) return pipe_get_pages_alloc(i, pages, maxsize, start); + if (unlikely(iov_iter_is_mapping(i))) + return iter_mapping_get_pages_alloc(i, pages, maxsize, start); if (unlikely(iov_iter_is_discard(i))) return -EFAULT; @@ -1430,7 +1625,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, return v.bv_len; }),({ return -EFAULT; - }) + }), 0 ) return 0; } @@ -1469,6 +1664,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, v.iov_base, v.iov_len, sum, off); off += v.iov_len; + }), ({ + char *p = kmap_atomic(v.bv_page); + next = csum_partial_copy_nocheck(p + v.bv_offset, + (to += v.bv_len) - v.bv_len, + v.bv_len, 0); + kunmap_atomic(p); + sum = csum_block_add(sum, next, off); + off += v.bv_len; }) ) *csum = sum; @@ -1511,6 +1714,14 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, v.iov_base, v.iov_len, sum, off); off += v.iov_len; + }), ({ + char *p = kmap_atomic(v.bv_page); + next = csum_partial_copy_nocheck(p + v.bv_offset, + (to += v.bv_len) - v.bv_len, + v.bv_len, 0); + kunmap_atomic(p); + sum = csum_block_add(sum, next, off); + off += v.bv_len; }) ) *csum = sum; @@ -1557,6 +1768,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, (from += v.iov_len) - v.iov_len, v.iov_len, sum, off); off += v.iov_len; + }), ({ + char *p = kmap_atomic(v.bv_page); + next = csum_partial_copy_nocheck((from += v.bv_len) - v.bv_len, + p + v.bv_offset, + v.bv_len, 0); + kunmap_atomic(p); + sum = csum_block_add(sum, next, off); + off += v.bv_len; }) ) *csum = sum; @@ -1606,6 +1825,21 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages) npages = pipe_space_for_user(iter_head, pipe->tail, pipe); if (npages >= maxpages) return maxpages; + } else if (unlikely(iov_iter_is_mapping(i))) { + unsigned offset; + + offset = (i->mapping_start + i->iov_offset) & ~PAGE_MASK; + + npages = 1; + if (size > PAGE_SIZE - offset) { + size -= PAGE_SIZE - offset; + npages += size >> PAGE_SHIFT; + size &= ~PAGE_MASK; + if (size) + npages++; + } + if (npages >= maxpages) + return maxpages; } else iterate_all_kinds(i, size, v, ({ unsigned long p = (unsigned long)v.iov_base; npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE) @@ -1622,7 +1856,8 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages) - p / PAGE_SIZE; if (npages >= maxpages) return maxpages; - }) + }), + 0 ) return npages; } @@ -1635,7 +1870,7 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags) WARN_ON(1); return NULL; } - if (unlikely(iov_iter_is_discard(new))) + if (unlikely(iov_iter_is_discard(new) || iov_iter_is_mapping(new))) return NULL; if (iov_iter_is_bvec(new)) return new->bvec = kmemdup(new->bvec, @@ -1747,7 +1982,12 @@ int iov_iter_for_each_range(struct iov_iter *i, size_t bytes, kunmap(v.bv_page); err;}), ({ w = v; - err = f(&w, context);}) + err = f(&w, context);}), ({ + w.iov_base = kmap(v.bv_page) + v.bv_offset; + w.iov_len = v.bv_len; + err = f(&w, context); + kunmap(v.bv_page); + err;}) ) return err; }

From patchwork Mon Jul 13 16:31:04 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660373
Subject: [PATCH 02/32] vm: Add wait/unlock functions for PG_fscache
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:31:04 +0100
Message-ID: <159465786405.1376674.15704703594171055681.stgit@warthog.procyon.org.uk>

Add functions to unlock and wait for unlock of PG_fscache, analogous to those provided for PG_locked.
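A sketch of the intended calling pattern (illustrative only; PG_fscache is an alias of PG_private_2, which is what these helpers test):

	/* Starting a write from the page to the cache: */
	SetPagePrivate2(page);		/* i.e. set PG_fscache */

	/* In the cache backend, once the write completes: */
	unlock_page_fscache(page);	/* clears PG_fscache and wakes waiters */

	/* Anywhere that must wait for the cache to let go of the page: */
	wait_on_page_fscache(page);	/* sleeps only while PG_fscache is set */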
Signed-off-by: David Howells --- include/linux/pagemap.h | 14 ++++++++++++++ mm/filemap.c | 18 ++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index cf2468da68e9..0b917990dc1e 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -501,6 +501,7 @@ extern int __lock_page_killable(struct page *page); extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm, unsigned int flags); extern void unlock_page(struct page *page); +extern void unlock_page_fscache(struct page *page); /* * Return true if the page was successfully locked @@ -575,6 +576,19 @@ static inline int wait_on_page_locked_killable(struct page *page) return wait_on_page_bit_killable(compound_head(page), PG_locked); } +/** + * wait_on_page_fscache - Wait for PG_fscache to be cleared on a page + * @page: The page + * + * Wait for the fscache mark to be removed from a page, usually signifying the + * completion of a write from that page to the cache. + */ +static inline void wait_on_page_fscache(struct page *page) +{ + if (PagePrivate2(page)) + wait_on_page_bit(compound_head(page), PG_fscache); +} + extern void put_and_wait_on_page_locked(struct page *page); void wait_on_page_writeback(struct page *page); diff --git a/mm/filemap.c b/mm/filemap.c index f0ae9a6308cb..4894e9705d34 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1293,6 +1293,24 @@ void unlock_page(struct page *page) } EXPORT_SYMBOL(unlock_page); +/** + * unlock_page_fscache - Unlock a page pinned with PG_fscache + * @page: The page + * + * Unlocks the page and wakes up sleepers in wait_on_page_fscache(). Also + * wakes those waiting for the lock and writeback bits because the wakeup + * mechanism is shared. But that's OK - those sleepers will just go back to + * sleep. 
+ */ +void unlock_page_fscache(struct page *page) +{ + page = compound_head(page); + VM_BUG_ON_PAGE(!PagePrivate2(page), page); + clear_bit_unlock(PG_fscache, &page->flags); + wake_up_page_bit(page, PG_fscache); +} +EXPORT_SYMBOL(unlock_page_fscache); + /** * end_page_writeback - end writeback against a page * @page: the page

From patchwork Mon Jul 13 16:31:12 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660381
Subject: [PATCH 03/32] vfs: Export rw_verify_area() for use by cachefiles
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:31:12 +0100
Message-ID: <159465787270.1376674.9709773455326854521.stgit@warthog.procyon.org.uk>

Export rw_verify_area() so that cachefiles can use it before issuing call_read_iter() and call_write_iter() to effect async DIO operations against the cache.

Signed-off-by: David Howells
---
fs/internal.h | 5 ----- fs/read_write.c | 1 + include/linux/fs.h | 1 + 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/internal.h b/fs/internal.h index 9b863a7bd708..8213a29a972f 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -156,11 +156,6 @@ extern char *simple_dname(struct dentry *, char *, int); extern void dput_to_list(struct dentry *, struct list_head *); extern void shrink_dentry_list(struct list_head *); -/* - * read_write.c - */ -extern int rw_verify_area(int, struct file *, const loff_t *, size_t); /* * pipe.c */ diff --git a/fs/read_write.c b/fs/read_write.c index bbfa9b12b15e..eb18270a1e14 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -400,6 +400,7 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t return security_file_permission(file, read_write == READ ?
MAY_READ : MAY_WRITE); } +EXPORT_SYMBOL(rw_verify_area); static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 3f881a892ea7..aa3e3af92220 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2905,6 +2905,7 @@ extern int notify_change(struct dentry *, struct iattr *, struct inode **); extern int inode_permission(struct inode *, int); extern int generic_permission(struct inode *, int); extern int __check_sticky(struct inode *dir, struct inode *inode); +extern int rw_verify_area(int, struct file *, const loff_t *, size_t); static inline bool execute_ok(struct inode *inode) {

From patchwork Mon Jul 13 16:31:24 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660391
Subject: [PATCH 04/32] vfs: Provide S_CACHE_FILE inode flag
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:31:24 +0100
Message-ID: <159465788421.1376674.17851071117062513659.stgit@warthog.procyon.org.uk>

Provide an S_CACHE_FILE inode flag that cachefiles can set to ward off other kernel services and drivers (including itself) from using its cache files.

Signed-off-by: David Howells
---
include/linux/fs.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/fs.h b/include/linux/fs.h index aa3e3af92220..33d30742b26e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2003,6 +2003,7 @@ struct super_operations { #define S_ENCRYPTED 16384 /* Encrypted file (using fs/crypto/) */ #define S_CASEFOLD 32768 /* Casefolded file */ #define S_VERITY 65536 /* Verity file (using fs/verity/) */ +#define S_CACHE_FILE 0x20000 /* File is in use as cache file (eg. fs/cachefiles) */ /* * Note that nosuid etc flags are inode-specific: setting some file-system
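For illustration, a user of the flag would claim a backing inode roughly as follows (patch 06 adds a cachefiles helper of this shape):

	/* Sketch: claim an inode for exclusive use as a cache file. */
	inode_lock(inode);
	if (inode->i_flags & S_CACHE_FILE) {
		ret = -EBUSY;			/* already backing a cache */
	} else {
		inode->i_flags |= S_CACHE_FILE;	/* ward off other users */
		ret = 0;
	}
	inode_unlock(inode);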
From patchwork Mon Jul 13 16:31:33 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660395
Subject: [PATCH 05/32] mm: Provide lru_to_last_page() to get last of a page list
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:31:33 +0100
Message-ID: <159465789337.1376674.4843292427395367970.stgit@warthog.procyon.org.uk>

Provide a macro, lru_to_last_page(), to find the last page in a page list (the opposite of lru_to_page()).

Signed-off-by: David Howells
---
include/linux/mm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index dc7b87310c10..9692b9b58b06 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -216,6 +216,7 @@ int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *, #define PAGE_ALIGNED(addr) IS_ALIGNED((unsigned long)(addr), PAGE_SIZE) #define lru_to_page(head) (list_entry((head)->prev, struct page, lru)) +#define lru_to_last_page(head) (list_entry((head)->next, struct page, lru)) /* * Linux kernel virtual memory manager primitives.
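A hypothetical use, assuming the ordering that ->readpages() supplies (lowest-indexed page at lru_to_page(), highest at lru_to_last_page()):

	/* Sketch: find the file extent covered by a ->readpages() list. */
	struct page *first = lru_to_page(pages);
	struct page *last = lru_to_last_page(pages);
	loff_t start = (loff_t)first->index << PAGE_SHIFT;
	size_t len = (size_t)(last->index - first->index + 1) << PAGE_SHIFT;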
From patchwork Mon Jul 13 16:31:44 2020
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660409
Subject: [PATCH 06/32] cachefiles: Remove tree of active files and use S_CACHE_FILE inode flag
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro, Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:31:44 +0100
Message-ID: <159465790476.1376674.873862963768715193.stgit@warthog.procyon.org.uk>

Remove the tree of active dentries from the cachefiles_cache struct and instead set a flag, S_CACHE_FILE, on the backing inode to indicate that this file is in use by the kernel so as to ward off other kernel users. This simplifies the code a lot and also prevents two overlain caches from fighting with each other.

Signed-off-by: David Howells
---
fs/cachefiles/daemon.c | 4 fs/cachefiles/interface.c | 20 -- fs/cachefiles/internal.h | 10 - fs/cachefiles/namei.c | 375 +++++++------------------------- include/trace/events/cachefiles.h | 29 --- 5 files changed, 78 insertions(+), 360 deletions(-) diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 752c1e43416f..8a937d6d5e22 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -102,8 +102,6 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file) } mutex_init(&cache->daemon_mutex); - cache->active_nodes = RB_ROOT; - rwlock_init(&cache->active_lock); init_waitqueue_head(&cache->daemon_pollwq); /* set default caching limits @@ -138,8 +136,6 @@ static int cachefiles_daemon_release(struct inode *inode, struct file *file) cachefiles_daemon_unbind(cache); - ASSERT(!cache->active_nodes.rb_node); - /* clean up the control file interface */ cache->cachefilesd = NULL; file->private_data = NULL; diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 99f42d216ef7..b868afb970ad 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -36,7 +36,6 @@ static struct fscache_object *cachefiles_alloc_object( ASSERTCMP(object->backer, ==, NULL); - BUG_ON(test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)); atomic_set(&object->usage, 1); fscache_object_init(&object->fscache, cookie, &cache->cache); @@ -74,7 +73,6 @@ static struct fscache_object *cachefiles_alloc_object( nomem_key: kfree(buffer); nomem_buffer: - BUG_ON(test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)); kmem_cache_free(cachefiles_object_jar, object); fscache_object_destroyed(&cache->cache); nomem_object: @@ -190,8 +188,6 @@ static void cachefiles_drop_object(struct fscache_object *_object) struct cachefiles_object *object; struct cachefiles_cache *cache; const struct cred *saved_cred; - struct inode *inode; - blkcnt_t i_blocks = 0; ASSERT(_object); @@ -218,10 +214,6 @@ static void cachefiles_drop_object(struct fscache_object *_object) _object != cache->cache.fsdef ) { _debug("- retire object OBJ%x", object->fscache.debug_id); - inode = d_backing_inode(object->dentry); - if (inode) - i_blocks = inode->i_blocks; -
cachefiles_begin_secure(cache, &saved_cred); cachefiles_delete_object(cache, object); cachefiles_end_secure(cache, saved_cred); @@ -231,14 +223,11 @@ static void cachefiles_drop_object(struct fscache_object *_object) if (object->backer != object->dentry) dput(object->backer); object->backer = NULL; - } - /* note that the object is now inactive */ - if (test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)) - cachefiles_mark_object_inactive(cache, object, i_blocks); - - dput(object->dentry); - object->dentry = NULL; + cachefiles_unmark_inode_in_use(object, object->dentry); + dput(object->dentry); + object->dentry = NULL; + } _leave(""); } @@ -274,7 +263,6 @@ static void cachefiles_put_object(struct fscache_object *_object, if (u == 0) { _debug("- kill object OBJ%x", object->fscache.debug_id); - ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)); ASSERTCMP(object->fscache.parent, ==, NULL); ASSERTCMP(object->backer, ==, NULL); ASSERTCMP(object->dentry, ==, NULL); diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index a5d48f271ce1..f8f308ce7385 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -38,12 +38,9 @@ struct cachefiles_object { struct dentry *dentry; /* the file/dir representing this object */ struct dentry *backer; /* backing file */ loff_t i_size; /* object size */ - unsigned long flags; -#define CACHEFILES_OBJECT_ACTIVE 0 /* T if marked active */ atomic_t usage; /* object usage count */ uint8_t type; /* object type */ uint8_t new; /* T if object new */ - struct rb_node active_node; /* link in active tree (dentry is key) */ }; extern struct kmem_cache *cachefiles_object_jar; @@ -59,8 +56,6 @@ struct cachefiles_cache { const struct cred *cache_cred; /* security override for accessing cache */ struct mutex daemon_mutex; /* command serialisation mutex */ wait_queue_head_t daemon_pollwq; /* poll waitqueue for daemon */ - struct rb_root active_nodes; /* active nodes (can't be culled) */ - rwlock_t active_lock; /* lock for active_nodes */ atomic_t gravecounter; /* graveyard uniquifier */ atomic_t f_released; /* number of objects released lately */ atomic_long_t b_released; /* number of blocks released lately */ @@ -126,9 +121,8 @@ extern char *cachefiles_cook_key(const u8 *raw, int keylen, uint8_t type); /* * namei.c */ -extern void cachefiles_mark_object_inactive(struct cachefiles_cache *cache, - struct cachefiles_object *object, - blkcnt_t i_blocks); +extern void cachefiles_unmark_inode_in_use(struct cachefiles_object *object, + struct dentry *dentry); extern int cachefiles_delete_object(struct cachefiles_cache *cache, struct cachefiles_object *object); extern int cachefiles_walk_to_object(struct cachefiles_object *parent, diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 924042e8cced..818d1bca1904 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -21,251 +21,51 @@ #define CACHEFILES_KEYBUF_SIZE 512 /* - * dump debugging info about an object + * Mark the backing file as being a cache file if it's not already in use.
*/ -static noinline -void __cachefiles_printk_object(struct cachefiles_object *object, - const char *prefix) +static bool cachefiles_mark_inode_in_use(struct cachefiles_object *object, + struct dentry *dentry) { - struct fscache_cookie *cookie; - const u8 *k; - unsigned loop; - - pr_err("%sobject: OBJ%x\n", prefix, object->fscache.debug_id); - pr_err("%sobjstate=%s fl=%lx wbusy=%x ev=%lx[%lx]\n", - prefix, object->fscache.state->name, - object->fscache.flags, work_busy(&object->fscache.work), - object->fscache.events, object->fscache.event_mask); - pr_err("%sops=%u\n", - prefix, object->fscache.n_ops); - pr_err("%sparent=%p\n", - prefix, object->fscache.parent); - - spin_lock(&object->fscache.lock); - cookie = object->fscache.cookie; - if (cookie) { - pr_err("%scookie=%p [pr=%p fl=%lx]\n", - prefix, - object->fscache.cookie, - object->fscache.cookie->parent, - object->fscache.cookie->flags); - pr_err("%skey=[%u] '", prefix, cookie->key_len); - k = (cookie->key_len <= sizeof(cookie->inline_key)) ? - cookie->inline_key : cookie->key; - for (loop = 0; loop < cookie->key_len; loop++) - pr_cont("%02x", k[loop]); - pr_cont("'\n"); - } else { - pr_err("%scookie=NULL\n", prefix); - } - spin_unlock(&object->fscache.lock); -} - -/* - * dump debugging info about a pair of objects - */ -static noinline void cachefiles_printk_object(struct cachefiles_object *object, - struct cachefiles_object *xobject) -{ - if (object) - __cachefiles_printk_object(object, ""); - if (xobject) - __cachefiles_printk_object(xobject, "x"); -} - -/* - * mark the owner of a dentry, if there is one, to indicate that that dentry - * has been preemptively deleted - * - the caller must hold the i_mutex on the dentry's parent as required to - * call vfs_unlink(), vfs_rmdir() or vfs_rename() - */ -static void cachefiles_mark_object_buried(struct cachefiles_cache *cache, - struct dentry *dentry, - enum fscache_why_object_killed why) -{ - struct cachefiles_object *object; - struct rb_node *p; - - _enter(",'%pd'", dentry); - - write_lock(&cache->active_lock); - - p = cache->active_nodes.rb_node; - while (p) { - object = rb_entry(p, struct cachefiles_object, active_node); - if (object->dentry > dentry) - p = p->rb_left; - else if (object->dentry < dentry) - p = p->rb_right; - else - goto found_dentry; - } - - write_unlock(&cache->active_lock); - trace_cachefiles_mark_buried(NULL, dentry, why); - _leave(" [no owner]"); - return; + struct inode *inode = d_backing_inode(dentry); + bool can_use = false; - /* found the dentry for */ -found_dentry: - kdebug("preemptive burial: OBJ%x [%s] %p", - object->fscache.debug_id, - object->fscache.state->name, - dentry); + _enter(",%p", object); - trace_cachefiles_mark_buried(object, dentry, why); + inode_lock(inode); - if (fscache_object_is_live(&object->fscache)) { - pr_err("\n"); - pr_err("Error: Can't preemptively bury live object\n"); - cachefiles_printk_object(object, NULL); + if (!(inode->i_flags & S_CACHE_FILE)) { + inode->i_flags |= S_CACHE_FILE; + trace_cachefiles_mark_active(object, dentry); + can_use = true; } else { - if (why != FSCACHE_OBJECT_IS_STALE) - fscache_object_mark_killed(&object->fscache, why); + pr_notice("cachefiles: Inode already in use: %pd\n", dentry); } - write_unlock(&cache->active_lock); - _leave(" [owner marked]"); + inode_unlock(inode); + return can_use; } /* - * record the fact that an object is now active + * Unmark a backing inode. 
*/ -static int cachefiles_mark_object_active(struct cachefiles_cache *cache, - struct cachefiles_object *object) +void cachefiles_unmark_inode_in_use(struct cachefiles_object *object, + struct dentry *dentry) { - struct cachefiles_object *xobject; - struct rb_node **_p, *_parent = NULL; - struct dentry *dentry; - - _enter(",%p", object); - -try_again: - write_lock(&cache->active_lock); - - dentry = object->dentry; - trace_cachefiles_mark_active(object, dentry); - - if (test_and_set_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags)) { - pr_err("Error: Object already active\n"); - cachefiles_printk_object(object, NULL); - BUG(); - } - - _p = &cache->active_nodes.rb_node; - while (*_p) { - _parent = *_p; - xobject = rb_entry(_parent, - struct cachefiles_object, active_node); - - ASSERT(xobject != object); - - if (xobject->dentry > dentry) - _p = &(*_p)->rb_left; - else if (xobject->dentry < dentry) - _p = &(*_p)->rb_right; - else - goto wait_for_old_object; - } - - rb_link_node(&object->active_node, _parent, _p); - rb_insert_color(&object->active_node, &cache->active_nodes); - - write_unlock(&cache->active_lock); - _leave(" = 0"); - return 0; - - /* an old object from a previous incarnation is hogging the slot - we - * need to wait for it to be destroyed */ -wait_for_old_object: - trace_cachefiles_wait_active(object, dentry, xobject); - clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags); - - if (fscache_object_is_live(&xobject->fscache)) { - pr_err("\n"); - pr_err("Error: Unexpected object collision\n"); - cachefiles_printk_object(object, xobject); - } - atomic_inc(&xobject->usage); - write_unlock(&cache->active_lock); - - if (test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)) { - wait_queue_head_t *wq; - - signed long timeout = 60 * HZ; - wait_queue_entry_t wait; - bool requeue; - - /* if the object we're waiting for is queued for processing, - * then just put ourselves on the queue behind it */ - if (work_pending(&xobject->fscache.work)) { - _debug("queue OBJ%x behind OBJ%x immediately", - object->fscache.debug_id, - xobject->fscache.debug_id); - goto requeue; - } - - /* otherwise we sleep until either the object we're waiting for - * is done, or the fscache_object is congested */ - wq = bit_waitqueue(&xobject->flags, CACHEFILES_OBJECT_ACTIVE); - init_wait(&wait); - requeue = false; - do { - prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE); - if (!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)) - break; - - requeue = fscache_object_sleep_till_congested(&timeout); - } while (timeout > 0 && !requeue); - finish_wait(wq, &wait); - - if (requeue && - test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)) { - _debug("queue OBJ%x behind OBJ%x after wait", - object->fscache.debug_id, - xobject->fscache.debug_id); - goto requeue; - } - - if (timeout <= 0) { - pr_err("\n"); - pr_err("Error: Overlong wait for old active object to go away\n"); - cachefiles_printk_object(object, xobject); - goto requeue; - } - } - - ASSERT(!test_bit(CACHEFILES_OBJECT_ACTIVE, &xobject->flags)); - - cache->cache.ops->put_object(&xobject->fscache, - (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_retry); - goto try_again; + struct inode *inode = d_backing_inode(dentry); -requeue: - cache->cache.ops->put_object(&xobject->fscache, - (enum fscache_obj_ref_trace)cachefiles_obj_put_wait_timeo); - _leave(" = -ETIMEDOUT"); - return -ETIMEDOUT; + inode_lock(inode); + inode->i_flags &= ~S_CACHE_FILE; + inode_unlock(inode); + trace_cachefiles_mark_inactive(object, dentry, inode); } /* * Mark an object as being inactive. 
*/ -void cachefiles_mark_object_inactive(struct cachefiles_cache *cache, - struct cachefiles_object *object, - blkcnt_t i_blocks) +static void cachefiles_mark_object_inactive(struct cachefiles_cache *cache, + struct cachefiles_object *object) { - struct dentry *dentry = object->dentry; - struct inode *inode = d_backing_inode(dentry); - - trace_cachefiles_mark_inactive(object, dentry, inode); - - write_lock(&cache->active_lock); - rb_erase(&object->active_node, &cache->active_nodes); - clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags); - write_unlock(&cache->active_lock); - - wake_up_bit(&object->flags, CACHEFILES_OBJECT_ACTIVE); + blkcnt_t i_blocks = d_backing_inode(object->dentry)->i_blocks; /* This object can now be culled, so we need to let the daemon know * that there is something it can remove if it needs to. @@ -286,7 +86,6 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, struct cachefiles_object *object, struct dentry *dir, struct dentry *rep, - bool preemptive, enum fscache_why_object_killed why) { struct dentry *grave, *trap; @@ -310,9 +109,6 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, } else { trace_cachefiles_unlink(object, rep, why); ret = vfs_unlink(d_inode(dir), rep, NULL); - - if (preemptive) - cachefiles_mark_object_buried(cache, rep, why); } inode_unlock(d_inode(dir)); @@ -373,8 +169,7 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, return -ENOMEM; } - cachefiles_io_error(cache, "Lookup error %ld", - PTR_ERR(grave)); + cachefiles_io_error(cache, "Lookup error %ld", PTR_ERR(grave)); return -EIO; } @@ -416,9 +211,6 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, if (ret != 0 && ret != -ENOMEM) cachefiles_io_error(cache, "Rename failed with error %d", ret); - - if (preemptive) - cachefiles_mark_object_buried(cache, rep, why); } unlock_rename(cache->graveyard, dir); @@ -446,26 +238,18 @@ int cachefiles_delete_object(struct cachefiles_cache *cache, inode_lock_nested(d_inode(dir), I_MUTEX_PARENT); - if (test_bit(FSCACHE_OBJECT_KILLED_BY_CACHE, &object->fscache.flags)) { - /* object allocation for the same key preemptively deleted this - * object's file so that it could create its own file */ - _debug("object preemptively buried"); + /* We need to check that our parent is _still_ our parent - it may have + * been renamed. + */ + if (dir == object->dentry->d_parent) { + ret = cachefiles_bury_object(cache, object, dir, object->dentry, + FSCACHE_OBJECT_WAS_RETIRED); + } else { + /* It got moved, presumably by cachefilesd culling it, so it's + * no longer in the key path and we can ignore it. 
+ */ inode_unlock(d_inode(dir)); ret = 0; - } else { - /* we need to check that our parent is _still_ our parent - it - * may have been renamed */ - if (dir == object->dentry->d_parent) { - ret = cachefiles_bury_object(cache, object, dir, - object->dentry, false, - FSCACHE_OBJECT_WAS_RETIRED); - } else { - /* it got moved, presumably by cachefilesd culling it, - * so it's no longer in the key path and we can ignore - * it */ - inode_unlock(d_inode(dir)); - ret = 0; - } } dput(dir); @@ -487,6 +271,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, struct path path; unsigned long start; const char *name; + bool marked = false; int ret, nlen; _enter("OBJ%x{%p},OBJ%x,%s,", @@ -529,6 +314,7 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, cachefiles_hist(cachefiles_lookup_histogram, start); if (IS_ERR(next)) { trace_cachefiles_lookup(object, next, NULL); + ret = PTR_ERR(next); goto lookup_error; } @@ -628,6 +414,13 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, /* we've found the object we were looking for */ object->dentry = next; + /* note that we're now using this object */ + if (!cachefiles_mark_inode_in_use(object, object->dentry)) { + ret = -EBUSY; + goto check_error_unlock; + } + marked = true; + /* if we've found that the terminal object exists, then we need to * check its attributes and delete it if it's out of date */ if (!object->new) { @@ -640,13 +433,12 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, object->dentry = NULL; ret = cachefiles_bury_object(cache, object, dir, next, - true, FSCACHE_OBJECT_IS_STALE); dput(next); next = NULL; if (ret < 0) - goto delete_error; + goto error_out2; _debug("redo lookup"); fscache_object_retrying_stale(&object->fscache); @@ -654,16 +446,10 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, } } - /* note that we're now using this object */ - ret = cachefiles_mark_object_active(cache, object); - inode_unlock(d_inode(dir)); dput(dir); dir = NULL; - if (ret == -ETIMEDOUT) - goto mark_active_timed_out; - _debug("=== OBTAINED_OBJECT ==="); if (object->new) { @@ -712,26 +498,19 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, cachefiles_io_error(cache, "Create/mkdir failed"); goto error; -mark_active_timed_out: - _debug("mark active timed out"); - goto release_dentry; - +check_error_unlock: + inode_unlock(d_inode(dir)); + dput(dir); check_error: - _debug("check error %d", ret); - cachefiles_mark_object_inactive( - cache, object, d_backing_inode(object->dentry)->i_blocks); -release_dentry: + if (marked) + cachefiles_unmark_inode_in_use(object, object->dentry); + cachefiles_mark_object_inactive(cache, object); dput(object->dentry); object->dentry = NULL; goto error_out; -delete_error: - _debug("delete error %d", ret); - goto error_out2; - lookup_error: - _debug("lookup error %ld", PTR_ERR(next)); - ret = PTR_ERR(next); + _debug("lookup error %d", ret); if (ret == -EIO) cachefiles_io_error(cache, "Lookup failed"); next = NULL; @@ -861,8 +640,6 @@ static struct dentry *cachefiles_check_active(struct cachefiles_cache *cache, struct dentry *dir, char *filename) { - struct cachefiles_object *object; - struct rb_node *_n; struct dentry *victim; unsigned long start; int ret; @@ -892,34 +669,9 @@ static struct dentry *cachefiles_check_active(struct cachefiles_cache *cache, return ERR_PTR(-ENOENT); } - /* check to see if we're using this object */ - read_lock(&cache->active_lock); - - _n = cache->active_nodes.rb_node; - - while (_n) { - object = 
rb_entry(_n, struct cachefiles_object, active_node); - - if (object->dentry > victim) - _n = _n->rb_left; - else if (object->dentry < victim) - _n = _n->rb_right; - else - goto object_in_use; - } - - read_unlock(&cache->active_lock); - //_leave(" = %p", victim); return victim; -object_in_use: - read_unlock(&cache->active_lock); - inode_unlock(d_inode(dir)); - dput(victim); - //_leave(" = -EBUSY [in use]"); - return ERR_PTR(-EBUSY); - lookup_error: inode_unlock(d_inode(dir)); ret = PTR_ERR(victim); @@ -948,6 +700,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, char *filename) { struct dentry *victim; + struct inode *inode; int ret; _enter(",%pd/,%s", dir, filename); @@ -956,6 +709,19 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, if (IS_ERR(victim)) return PTR_ERR(victim); + /* check to see if someone is using this object */ + inode = d_inode(victim); + inode_lock(inode); + if (inode->i_flags & S_CACHE_FILE) { + ret = -EBUSY; + } else { + inode->i_flags |= S_CACHE_FILE; + ret = 0; + } + inode_unlock(inode); + if (ret < 0) + goto error_unlock; + _debug("victim -> %p %s", victim, d_backing_inode(victim) ? "positive" : "negative"); @@ -971,7 +737,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, /* actually remove the victim (drops the dir mutex) */ _debug("bury"); - ret = cachefiles_bury_object(cache, NULL, dir, victim, false, + ret = cachefiles_bury_object(cache, NULL, dir, victim, FSCACHE_OBJECT_WAS_CULLED); if (ret < 0) goto error; @@ -1008,6 +774,7 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir, char *filename) { struct dentry *victim; + int ret = 0; //_enter(",%pd/,%s", // dir, filename); @@ -1017,7 +784,9 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir, return PTR_ERR(victim); inode_unlock(d_inode(dir)); + if (d_inode(victim)->i_flags & S_CACHE_FILE) + ret = -EBUSY; dput(victim); //_leave(" = 0"); - return 0; + return ret; } diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index 9a448fe9355d..c877035c2946 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -237,35 +237,6 @@ TRACE_EVENT(cachefiles_mark_active, __entry->obj, __entry->de) ); -TRACE_EVENT(cachefiles_wait_active, - TP_PROTO(struct cachefiles_object *obj, - struct dentry *de, - struct cachefiles_object *xobj), - - TP_ARGS(obj, de, xobj), - - /* Note that obj may be NULL */ - TP_STRUCT__entry( - __field(unsigned int, obj ) - __field(unsigned int, xobj ) - __field(struct dentry *, de ) - __field(u16, flags ) - __field(u16, fsc_flags ) - ), - - TP_fast_assign( - __entry->obj = obj->fscache.debug_id; - __entry->de = de; - __entry->xobj = xobj->fscache.debug_id; - __entry->flags = xobj->flags; - __entry->fsc_flags = xobj->fscache.flags; - ), - - TP_printk("o=%08x d=%p wo=%08x wf=%x wff=%x", - __entry->obj, __entry->de, __entry->xobj, - __entry->flags, __entry->fsc_flags) - ); - TRACE_EVENT(cachefiles_mark_inactive, TP_PROTO(struct cachefiles_object *obj, struct dentry *de, From patchwork Mon Jul 13 16:31:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660413 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B4F221510 for ; Mon, 13 Jul 2020 16:32:12 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9DC5B206F5 for ; Mon, 13 Jul 2020 16:32:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QscXOp+K" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730525AbgGMQcJ (ORCPT ); Mon, 13 Jul 2020 12:32:09 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:37907 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730267AbgGMQcJ (ORCPT ); Mon, 13 Jul 2020 12:32:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594657926; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FFWzOvKlQKljLhANCdGMPaz1s06wEFCG5wrNzS5w+bY=; b=QscXOp+Kixilq7eyPleSOZKDPCtHAfJ5f9rrqcD+/aLVg7LhSAxKD+n0iCHvEX2Z5dJo+e CWdcItKp5D5R0j4ruIF1S5vt/5AKFv66/LYJNZtoKUXgtor8e6n+On8yJ+OsYOfdYWILXy mekSLQFId5kSFeLCzPU+ZD4rpHj4koM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-332-tfYVvGyfMaSbVREs3bDUZA-1; Mon, 13 Jul 2020 12:32:04 -0400 X-MC-Unique: tfYVvGyfMaSbVREs3bDUZA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B290A80183C; Mon, 13 Jul 2020 16:32:02 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 06AD310013C0; Mon, 13 Jul 2020 16:31:56 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 07/32] fscache: Provide a simple thread pool for running ops asynchronously From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:31:56 +0100 Message-ID: <159465791622.1376674.11171480091432676587.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Provide a simple thread pool that can be used to run cookie management operations in the background and a dispatcher infrastructure to punt operations to the pool if threads are available or to just run the operation in the calling thread if not. A future patch will replace all the object state machine stuff with whole routines that do all the work in one go without trying to interleave bits from various objects. 
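By way of illustration, a caller might punt an operation to the pool with something like the following sketch (do_cookie_op() is a hypothetical op function used for illustration; fscache_dispatch() is what this patch adds):

	static void do_cookie_op(struct fscache_cookie *cookie,
				 struct fscache_object *object,
				 int param)
	{
		/* Do the cookie management work here */
	}

	/* Runs in a pool thread if one is idle; otherwise runs
	 * synchronously in the calling thread */
	fscache_dispatch(cookie, object, 0, do_cookie_op);

If the work is queued, a reference is taken on the cookie and the dispatcher thread puts it again once the function has been run.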
Signed-off-by: David Howells --- fs/fscache/Makefile | 1 fs/fscache/dispatcher.c | 144 ++++++++++++++++++++++++++++++++++++++++ fs/fscache/internal.h | 8 ++ fs/fscache/main.c | 7 ++ include/trace/events/fscache.h | 6 +- 5 files changed, 165 insertions(+), 1 deletion(-) create mode 100644 fs/fscache/dispatcher.c diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile index ac3fcd909fff..7b10c6aad157 100644 --- a/fs/fscache/Makefile +++ b/fs/fscache/Makefile @@ -6,6 +6,7 @@ fscache-y := \ cache.o \ cookie.o \ + dispatcher.o \ fsdef.o \ main.o \ netfs.o \ diff --git a/fs/fscache/dispatcher.c b/fs/fscache/dispatcher.c new file mode 100644 index 000000000000..fba71b99c951 --- /dev/null +++ b/fs/fscache/dispatcher.c @@ -0,0 +1,144 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Object dispatcher + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define FSCACHE_DEBUG_LEVEL OPERATION +#include +#include +#include +#include +#include "internal.h" + +#define FSCACHE_DISPATCHER_POOL_SIZE 8 + +static LIST_HEAD(fscache_pending_work); +static DEFINE_SPINLOCK(fscache_work_lock); +static DECLARE_WAIT_QUEUE_HEAD(fscache_dispatcher_pool); +static struct completion fscache_dispatcher_pool_done[FSCACHE_DISPATCHER_POOL_SIZE]; +static bool fscache_dispatcher_stop; + +struct fscache_work { + struct list_head link; + struct fscache_cookie *cookie; + struct fscache_object *object; + int param; + void (*func)(struct fscache_cookie *, struct fscache_object *, int); +}; + +/* + * Attempt to queue some work to do. If there's too much asynchronous work + * already queued, we'll do it here in this thread instead. + */ +void fscache_dispatch(struct fscache_cookie *cookie, + struct fscache_object *object, + int param, + void (*func)(struct fscache_cookie *, + struct fscache_object *, int)) +{ + struct fscache_work *work; + bool queued = false; + + work = kzalloc(sizeof(struct fscache_work), GFP_KERNEL); + if (work) { + work->cookie = cookie; + work->object = object; + work->param = param; + work->func = func; + + spin_lock(&fscache_work_lock); + if (waitqueue_active(&fscache_dispatcher_pool) || + list_empty(&fscache_pending_work)) { + fscache_cookie_get(cookie, fscache_cookie_get_work); + list_add_tail(&work->link, &fscache_pending_work); + wake_up(&fscache_dispatcher_pool); + queued = true; + } + spin_unlock(&fscache_work_lock); + } + + if (!queued) { + kfree(work); + func(cookie, object, param); + } +} + +/* + * A dispatcher thread. + */ +static int fscache_dispatcher(void *data) +{ + struct completion *done = data; + + for (;;) { + if (!list_empty(&fscache_pending_work)) { + struct fscache_work *work = NULL; + + spin_lock(&fscache_work_lock); + if (!list_empty(&fscache_pending_work)) { + work = list_entry(fscache_pending_work.next, + struct fscache_work, link); + list_del_init(&work->link); + } + spin_unlock(&fscache_work_lock); + + if (work) { + work->func(work->cookie, work->object, work->param); + fscache_cookie_put(work->cookie, fscache_cookie_put_work); + kfree(work); + } + continue; + } else if (fscache_dispatcher_stop) { + break; + } + + wait_event_freezable(fscache_dispatcher_pool, + (fscache_dispatcher_stop || + !list_empty(&fscache_pending_work))); + } + + complete_and_exit(done, 0); +} + +/* + * Start up the dispatcher threads. 
+ */ +int fscache_init_dispatchers(void) +{ + struct task_struct *t; + int i; + + for (i = 0; i < FSCACHE_DISPATCHER_POOL_SIZE; i++) { + t = kthread_create(fscache_dispatcher, + &fscache_dispatcher_pool_done[i], + "kfsc/%d", i); + if (IS_ERR(t)) + goto failed; + wake_up_process(t); + } + + return 0; + +failed: + fscache_dispatcher_stop = true; + wake_up_all(&fscache_dispatcher_pool); + for (i--; i >= 0; i--) + wait_for_completion(&fscache_dispatcher_pool_done[i]); + return PTR_ERR(t); +} + +/* + * Kill off the dispatcher threads. + */ +void fscache_kill_dispatchers(void) +{ + int i; + + fscache_dispatcher_stop = true; + wake_up_all(&fscache_dispatcher_pool); + + for (i = 0; i < FSCACHE_DISPATCHER_POOL_SIZE; i++) + wait_for_completion(&fscache_dispatcher_pool_done[i]); +} diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index bc5539d2157b..2100e2222884 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -75,6 +75,14 @@ extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *); extern void fscache_cookie_put(struct fscache_cookie *, enum fscache_cookie_trace); +/* + * dispatcher.c + */ +extern void fscache_dispatch(struct fscache_cookie *, struct fscache_object *, int, + void (*func)(struct fscache_cookie *, struct fscache_object *, int)); +extern int fscache_init_dispatchers(void); +extern void fscache_kill_dispatchers(void); + /* * fsdef.c */ diff --git a/fs/fscache/main.c b/fs/fscache/main.c index c1e6cc9091aa..c8f1beafa8e1 100644 --- a/fs/fscache/main.c +++ b/fs/fscache/main.c @@ -125,6 +125,10 @@ static int __init fscache_init(void) for_each_possible_cpu(cpu) init_waitqueue_head(&per_cpu(fscache_object_cong_wait, cpu)); + ret = fscache_init_dispatchers(); + if (ret < 0) + goto error_dispatchers; + ret = fscache_proc_init(); if (ret < 0) goto error_proc; @@ -159,6 +163,8 @@ static int __init fscache_init(void) unregister_sysctl_table(fscache_sysctl_header); error_sysctl: #endif + fscache_kill_dispatchers(); +error_dispatchers: fscache_proc_cleanup(); error_proc: destroy_workqueue(fscache_op_wq); @@ -183,6 +189,7 @@ static void __exit fscache_exit(void) unregister_sysctl_table(fscache_sysctl_header); #endif fscache_proc_cleanup(); + fscache_kill_dispatchers(); destroy_workqueue(fscache_op_wq); destroy_workqueue(fscache_object_wq); pr_notice("Unloaded\n"); diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h index 08d7de72409d..fb3fdf2921ee 100644 --- a/include/trace/events/fscache.h +++ b/include/trace/events/fscache.h @@ -26,11 +26,13 @@ enum fscache_cookie_trace { fscache_cookie_get_attach_object, fscache_cookie_get_reacquire, fscache_cookie_get_register_netfs, + fscache_cookie_get_work, fscache_cookie_put_acquire_nobufs, fscache_cookie_put_dup_netfs, fscache_cookie_put_relinquish, fscache_cookie_put_object, fscache_cookie_put_parent, + fscache_cookie_put_work, }; #endif @@ -45,11 +47,13 @@ enum fscache_cookie_trace { EM(fscache_cookie_get_attach_object, "GET obj") \ EM(fscache_cookie_get_reacquire, "GET raq") \ EM(fscache_cookie_get_register_netfs, "GET net") \ + EM(fscache_cookie_get_work, "GET wrk") \ EM(fscache_cookie_put_acquire_nobufs, "PUT nbf") \ EM(fscache_cookie_put_dup_netfs, "PUT dnt") \ EM(fscache_cookie_put_relinquish, "PUT rlq") \ EM(fscache_cookie_put_object, "PUT obj") \ - E_(fscache_cookie_put_parent, "PUT prn") + EM(fscache_cookie_put_parent, "PUT prn") \ + E_(fscache_cookie_put_work, "PUT wrk") /* * Export enum symbols via userspace. 
From patchwork Mon Jul 13 16:32:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660425 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 60593618 for ; Mon, 13 Jul 2020 16:32:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3ACF7206F0 for ; Mon, 13 Jul 2020 16:32:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WjFedn4q" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730544AbgGMQcd (ORCPT ); Mon, 13 Jul 2020 12:32:33 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:40360 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730537AbgGMQcc (ORCPT ); Mon, 13 Jul 2020 12:32:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594657949; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j6wJEF6h9RyZlI1lvZu1g53t5XRJdv+VdZXfwMx2rgA=; b=WjFedn4qppJhYrpAdw1QQ5OSY5XEZafNNUXvlRwLJcUma64tTA5G54BHz7RZsWdUCBsfsW O2fAJiWJUtAdi84QjrnQ8HUFzAH5xidctaK2kRBV1tdySzeuLi9kvg9PnuZUkMSFdiImgZ ty5fTBlBVYrvLzaWXEnqnu8tRfUbz8E= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-174-iNjn9-t1P7aRbJc_SOSc0w-1; Mon, 13 Jul 2020 12:32:28 -0400 X-MC-Unique: iNjn9-t1P7aRbJc_SOSc0w-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2811B107ACCA; Mon, 13 Jul 2020 16:32:26 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id ABDCF5C1D0; Mon, 13 Jul 2020 16:32:20 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 09/32] fscache: Rewrite the I/O API based on iov_iter From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:32:19 +0100 Message-ID: <159465793995.1376674.8648007758551605034.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org

Rewrite the fscache I/O API by introducing a number of new routines based on the following principles:

 (1) The cache provides *only* write-to-cache and read-from-cache calls for transferring data to/from the cache.

 (2) The bufferage for I/O to/from the cache is supplied with an iov_iter. There is no requirement that the iov_iters involved have anything to do with an inode's pagecache, though if they do, an ITER_MAPPING iterator is available.

 (3) I/O to/from any particular cache object is done in one of a number of modes, set for the cache object at cookie acquisition time:

     (A) Single blob. The blob must be written in its entirety in one go.

     (B) Granular. Writes to the cache should be done in granule-sized blocks, where, for the moment, a granule will be 256KiB, but could be variable. This allows the metadata indicating which granules are present to be smaller at the cost of using more disk space.

     In both cases, reads from the cache may be done in smaller chunks and small update writes may be done inside a block that exists.

 (4) I/O to/from the cache must be aligned to the DIO block size of the backing filesystem. The cache tells the caller what it should consider the DIO block size to be. This will never be larger than page size.

 (5) Completion of the I/O results in a callback - after which the cache no longer knows about it.

 (6) The cache doesn't retain any pointers back into the netfs, either the code, its state or its pagecache.

To do granular I/O, the netfs has to take the read or write request it got from the VFS/VM and 'shape' it to fit the caching parameters. It does this by filling in a form to indicate the extent of the operation it might like to make:

	struct fscache_request_shape {
		/* Parameters */
		loff_t		i_size;
		pgoff_t		proposed_start;
		unsigned int	proposed_nr_pages;
		unsigned int	max_io_pages;
		bool		for_write;

		/* Result */
		unsigned int	to_be_done;
		unsigned int	granularity;
		unsigned int	dio_block_size;
		unsigned int	actual_nr_pages;
		pgoff_t		actual_start;
	};

and then it calls:

	void fscache_shape_request(struct fscache_cookie *cookie,
				   struct fscache_request_shape *shape);

to shape it.

The netfs should set 'proposed_start' to be the first page to read, 'proposed_nr_pages' to indicate the size of the request and 'i_size' to indicate the size that the file should be considered to be. 'max_io_pages' should be set to the maximum size of a transaction, up to UINT_MAX, and 'for_write' should be set to true if this is for a write to the cache.
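For example, a netfs wanting to read nr_pages pages starting at page index of a file might fill the form in as follows (a sketch only; the local variable names are illustrative and not part of the API):

	struct fscache_request_shape shape = {
		.i_size			= i_size_read(inode),
		.proposed_start		= index,
		.proposed_nr_pages	= nr_pages,
		.max_io_pages		= UINT_MAX,
		.for_write		= false,
	};

	fscache_shape_request(cookie, &shape);

	/* shape.actual_start and shape.actual_nr_pages now delimit the
	 * operation; shape.to_be_done says where the data must come from
	 * and whether it should also be written back to the cache. */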
The cache will then shape the proposed read to fit a blocking factor appropriate for the cache object and region of the file. It may extend the start forward and may shrink or extend the request to fit the granularity of the cache. This will be trimmed to the end of the file as specified by the proposed file size.

Upon return, 'to_be_done' will be set to one of FSCACHE_READ_FROM_SERVER, FSCACHE_READ_FROM_CACHE, FSCACHE_FILL_WITH_ZERO, and may have FSCACHE_WRITE_TO_CACHE bitwise-OR'd onto it. 'actual_start' and 'actual_nr_pages' will be set to indicate the cache's proposal for the desired size and position of the operation. 'granularity' will be set to hold the cache block granularity (in pages); the transaction can be shortened to a multiple of this. Note that the shaped request will always include the proposed_start page. 'dio_block_size' will be set to whatever I/O size the cache must communicate with its storage in. This is necessary to set up the iov_iter to be passed to the cache for reading and writing so that it can do direct I/O.

Once the netfs has set up its request, if FSCACHE_READ_FROM_CACHE was set, it should then call:

	int fscache_read(struct fscache_io_request *req, struct iov_iter *iter)

to read data from the cache. To do this, it needs to fill out a request descriptor:

	struct fscache_io_request {
		const struct fscache_io_request_ops *ops;
		struct fscache_cookie	*cookie;
		loff_t			pos;
		loff_t			len;
		short			error;
		void (*io_done)(struct fscache_io_request *);
		...
	};

The ops pointer, cookie, position and length should be set to describe the I/O operation to be performed. An 'is_still_valid' method may be provided in the ops table to check whether the operation should still go ahead after a wait in case it got invalidated by the server. An 'io_done' function may be provided that will be called to finalise the operation. If provided, the 'io_done' function will always be called, even when the operation doesn't take place because there's no cache. If no io_done function is provided, the operation will be synchronous.

Note that the pages must be pinned - typically by locking them.

If FSCACHE_WRITE_TO_CACHE was set, then once the data is read from the server, the netfs should write it to the cache by calling:

	int fscache_write(struct fscache_io_request *req, struct iov_iter *iter)

The request descriptor is set as for fscache_read(). Note that the pages must be pinned. In this case, PG_fscache can be set on the page and the pages can be unlocked; the bit can then be cleared by the done handler. The releasepage, invalidatepage, launderpage and page_mkwrite functions should be used to suspend progress until the bit is cleared.
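Putting the pieces together, the write side might look something like the sketch below, continuing the shaping example above (my_io_ops and my_write_done() are hypothetical netfs-side names, and the iterator setup assumes the ITER_MAPPING initialiser added by the first patch in this series, taken here to be iov_iter_mapping()):

	struct fscache_io_request *req;
	struct iov_iter iter;

	req = kzalloc(sizeof(*req), GFP_KERNEL);
	if (!req)
		return -ENOMEM;
	fscache_init_io_request(req, cookie, &my_io_ops);
	req->pos = (loff_t)shape.actual_start << PAGE_SHIFT;
	req->len = (loff_t)shape.actual_nr_pages << PAGE_SHIFT;
	req->io_done = my_write_done;	/* Clears PG_fscache on the pages */

	iov_iter_mapping(&iter, WRITE, mapping, req->pos, req->len);
	fscache_write(req, &iter);	/* May return -EIOCBQUEUED */

The done handler is then the place where PG_fscache is cleared, allowing anything waiting on the bit to resume.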
The following functions are made available in an earlier patch for this: void unlock_page_fscache(struct page *page); void wait_on_page_fscache(struct page *page) Signed-off-by: David Howells --- fs/fscache/Makefile | 1 fs/fscache/io.c | 170 +++++++++++++++++++++++++++++++ include/linux/fscache-cache.h | 28 +++++ include/linux/fscache.h | 201 +++++++++++++++++++++++++++++++++++++ include/trace/events/cachefiles.h | 2 5 files changed, 402 insertions(+) create mode 100644 fs/fscache/io.c diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile index 396e1b5fdc28..3caf66810e7b 100644 --- a/fs/fscache/Makefile +++ b/fs/fscache/Makefile @@ -8,6 +8,7 @@ fscache-y := \ cookie.o \ dispatcher.o \ fsdef.o \ + io.o \ main.o \ netfs.o \ obj.o \ diff --git a/fs/fscache/io.c b/fs/fscache/io.c new file mode 100644 index 000000000000..8d7f79551699 --- /dev/null +++ b/fs/fscache/io.c @@ -0,0 +1,170 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Data I/O routines + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define FSCACHE_DEBUG_LEVEL OPERATION +#include +#include +#include +#include "internal.h" + +/* + * Initialise an I/O request + */ +void __fscache_init_io_request(struct fscache_io_request *req, + struct fscache_cookie *cookie) +{ + req->cookie = fscache_cookie_get(cookie, fscache_cookie_get_ioreq); +} +EXPORT_SYMBOL(__fscache_init_io_request); + +/* + * Clean up an I/O request + */ +void __fscache_free_io_request(struct fscache_io_request *req) +{ + if (req->object) + req->object->cache->ops->put_object(req->object, + fscache_obj_put_ioreq); + fscache_cookie_put(req->cookie, fscache_cookie_put_ioreq); +} +EXPORT_SYMBOL(__fscache_free_io_request); + +enum fscache_want_stage { + FSCACHE_WANT_PARAMS, + FSCACHE_WANT_WRITE, + FSCACHE_WANT_READ, +}; + +/* + * Begin an I/O operation on the cache, waiting till we reach the right state. + * + * Returns a pointer to the object to use or an error. If an object is + * returned, it will have an extra ref on it. 
+ */ +static struct fscache_object *fscache_begin_io_operation( + struct fscache_cookie *cookie, + enum fscache_want_stage want, + struct fscache_io_request *req) +{ + struct fscache_object *object; + enum fscache_cookie_stage stage; + +again: + spin_lock(&cookie->lock); + + stage = cookie->stage; + _enter("c=%08x{%u},%x", cookie->debug_id, stage, want); + + switch (stage) { + case FSCACHE_COOKIE_STAGE_QUIESCENT: + case FSCACHE_COOKIE_STAGE_DEAD: + goto not_live; + case FSCACHE_COOKIE_STAGE_INITIALISING: + case FSCACHE_COOKIE_STAGE_LOOKING_UP: + case FSCACHE_COOKIE_STAGE_INVALIDATING: + goto wait_and_validate; + + case FSCACHE_COOKIE_STAGE_NO_DATA_YET: + if (want == FSCACHE_WANT_READ) + goto no_data_yet; + /* Fall through */ + case FSCACHE_COOKIE_STAGE_ACTIVE: + goto ready; + } + +ready: + object = hlist_entry(cookie->backing_objects.first, + struct fscache_object, cookie_link); + + if (fscache_cache_is_broken(object)) + goto not_live; + + object->cache->ops->grab_object(object, fscache_obj_get_ioreq); + + atomic_inc(&cookie->n_ops); + spin_unlock(&cookie->lock); + return object; + +wait_and_validate: + spin_unlock(&cookie->lock); + wait_var_event(&cookie->stage, cookie->stage != stage); + if (req && + req->ops->is_still_valid && + !req->ops->is_still_valid(req)) { + _leave(" = -ESTALE"); + return ERR_PTR(-ESTALE); + } + goto again; + +no_data_yet: + spin_unlock(&cookie->lock); + _leave(" = -ENODATA"); + return ERR_PTR(-ENODATA); + +not_live: + spin_unlock(&cookie->lock); + _leave(" = -ENOBUFS"); + return ERR_PTR(-ENOBUFS); +} + +/* + * Determine the size of an allocation granule or a region of data in the + * cache. + */ +void __fscache_shape_request(struct fscache_cookie *cookie, + struct fscache_request_shape *shape) +{ + struct fscache_object *object = + fscache_begin_io_operation(cookie, FSCACHE_WANT_PARAMS, NULL); + + if (!IS_ERR(object)) { + object->cache->ops->shape_request(object, shape); + object->cache->ops->put_object(object, fscache_obj_put_ioreq); + fscache_end_io_operation(cookie); + } +} +EXPORT_SYMBOL(__fscache_shape_request); + +/* + * Read data from the cache. + */ +int __fscache_read(struct fscache_io_request *req, struct iov_iter *iter) +{ + struct fscache_object *object = + fscache_begin_io_operation(req->cookie, FSCACHE_WANT_READ, req); + + if (!IS_ERR(object)) { + req->object = object; + return object->cache->ops->read(object, req, iter); + } else { + req->error = PTR_ERR(object); + if (req->io_done) + req->io_done(req); + return req->error; + } +} +EXPORT_SYMBOL(__fscache_read); + +/* + * Write data to the cache. 
+ */ +int __fscache_write(struct fscache_io_request *req, struct iov_iter *iter) +{ + struct fscache_object *object = + fscache_begin_io_operation(req->cookie, FSCACHE_WANT_WRITE, req); + + if (!IS_ERR(object)) { + req->object = object; + return object->cache->ops->write(object, req, iter); + } else { + req->error = PTR_ERR(object); + if (req->io_done) + req->io_done(req); + return req->error; + } +} +EXPORT_SYMBOL(__fscache_write); diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index de1cffb2558e..81a41e37f07b 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -22,11 +22,13 @@ struct fscache_cache; struct fscache_cache_ops; struct fscache_object; +struct fscache_io_operations; enum fscache_obj_ref_trace { fscache_obj_get_attach, fscache_obj_get_exists, fscache_obj_get_inval, + fscache_obj_get_ioreq, fscache_obj_get_wait, fscache_obj_get_withdraw, fscache_obj_new, @@ -37,6 +39,7 @@ enum fscache_obj_ref_trace { fscache_obj_put_drop_child, fscache_obj_put_drop_obj, fscache_obj_put_inval, + fscache_obj_put_ioreq, fscache_obj_put_lookup_fail, fscache_obj_put_withdraw, fscache_obj_ref__nr_traces @@ -134,6 +137,20 @@ struct fscache_cache_ops { /* reserve space for an object's data and associated metadata */ int (*reserve_space)(struct fscache_object *object, loff_t i_size); + + /* Shape the extent of a read or write */ + void (*shape_request)(struct fscache_object *object, + struct fscache_request_shape *shape); + + /* Read data from the cache */ + int (*read)(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter); + + /* Write data to the cache */ + int (*write)(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter); }; extern struct fscache_cookie fscache_fsdef_index; @@ -239,4 +256,15 @@ static inline void fscache_end_io_operation(struct fscache_cookie *cookie) wake_up_var(&cookie->n_ops); } +static inline void fscache_get_io_request(struct fscache_io_request *req) +{ + req->ops->get(req); +} + +static inline void fscache_put_io_request(struct fscache_io_request *req) +{ + if (req) + req->ops->put(req); +} + #endif /* _LINUX_FSCACHE_CACHE_H */ diff --git a/include/linux/fscache.h b/include/linux/fscache.h index 11b18761a3b6..aec75fc0d297 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -42,9 +42,11 @@ /* pattern used to fill dead space in an index entry */ #define FSCACHE_INDEX_DEADFILL_PATTERN 0x79 +struct iov_iter; struct fscache_cache_tag; struct fscache_cookie; struct fscache_netfs; +struct fscache_io_request_ops; enum fscache_cookie_type { FSCACHE_COOKIE_TYPE_INDEX, @@ -122,6 +124,73 @@ struct fscache_cookie { }; }; +/* + * The size and shape of a request to the cache, adjusted for cache + * granularity, for the data available on doing a read, the page size and + * non-contiguities and for the netfs's own I/O patterning. + * + * Before calling fscache_shape_request(), @proposed_start and @proposed_end + * must be set to indicate the bounds of the request and @max_io_pages to the + * limit the netfs is willing to accept on the size of an I/O operation. + * @i_size should be set to the size the file should be considered to be and + * @for_write should be set if a write request is being shaped. + * + * After shaping, @actual_start and @actual_end will mark out the size of the + * shaped request. 
@granularity will convey the size of a cache block, should
+ * the request need to be reduced in scope, either due to memory constraints
+ * or netfs I/O constraints.  @dio_block_size will be set to the direct I/O
+ * size for the cache - fscache_read/write() can't be expected to read/write
+ * chunks smaller than this or at positions that aren't aligned to this.
+ *
+ * Finally, @to_be_done will be set by the shaper to indicate whether the
+ * region can be read from the cache or filled with zeros and whether it
+ * should be written to the cache after being read from the server or
+ * cleared.
+ */
+struct fscache_request_shape {
+	/* Parameters */
+	loff_t		i_size;		/* The file size to use in calculations */
+	pgoff_t		proposed_start;	/* First page in the proposed request */
+	unsigned int	proposed_nr_pages; /* Number of pages in the proposed request */
+	unsigned int	max_io_pages;	/* Max pages in a netfs I/O request (or UINT_MAX) */
+	bool		for_write;	/* Set if shaping a write */
+
+	/* Result */
+#define FSCACHE_READ_FROM_SERVER	0x00
+#define FSCACHE_READ_FROM_CACHE	0x01
+#define FSCACHE_WRITE_TO_CACHE		0x02
+#define FSCACHE_FILL_WITH_ZERO		0x04
+	unsigned int	to_be_done;	/* What should be done by the caller */
+	unsigned int	granularity;	/* Cache granularity in pages */
+	unsigned int	dio_block_size;	/* Block size required for direct I/O */
+	unsigned int	actual_nr_pages; /* Number of pages in the shaped request */
+	pgoff_t		actual_start;	/* First page in the shaped request */
+};
+
+/*
+ * Descriptor for an fscache I/O request.
+ */
+struct fscache_io_request {
+	const struct fscache_io_request_ops *ops;
+	struct fscache_cookie	*cookie;
+	struct fscache_object	*object;
+	loff_t			pos;		/* Where to start the I/O */
+	loff_t			len;		/* Size of the I/O */
+	loff_t			transferred;	/* Amount of data transferred */
+	short			error;		/* 0 or error that occurred */
+	unsigned long		flags;
+#define FSCACHE_IO_DATA_FROM_SERVER	0	/* Set if data was read from server */
+#define FSCACHE_IO_DATA_FROM_CACHE	1	/* Set if data was read from the cache */
+	void (*io_done)(struct fscache_io_request *);
+};
+
+struct fscache_io_request_ops {
+	bool (*is_still_valid)(struct fscache_io_request *);
+	void (*issue_op)(struct fscache_io_request *);
+	void (*done)(struct fscache_io_request *);
+	void (*get)(struct fscache_io_request *);
+	void (*put)(struct fscache_io_request *);
+};
+
 /*
  * slow-path functions for when there is actually caching available, and the
  * netfs does actually have a valid token
@@ -149,6 +218,12 @@ extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool);
 extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *);
 extern void __fscache_invalidate(struct fscache_cookie *);
 extern void __fscache_wait_on_invalidate(struct fscache_cookie *);
+extern void __fscache_shape_request(struct fscache_cookie *, struct fscache_request_shape *);
+extern void __fscache_init_io_request(struct fscache_io_request *,
+				      struct fscache_cookie *);
+extern void __fscache_free_io_request(struct fscache_io_request *);
+extern int __fscache_read(struct fscache_io_request *, struct iov_iter *);
+extern int __fscache_write(struct fscache_io_request *, struct iov_iter *);
 
 /**
  * fscache_register_netfs - Register a filesystem as desiring caching services
@@ -407,4 +482,130 @@ void fscache_wait_on_invalidate(struct fscache_cookie *cookie)
 	__fscache_wait_on_invalidate(cookie);
 }
 
+/**
+ * fscache_init_io_request - Initialise an I/O request
+ * @req: The I/O request to initialise
+ * @cookie: The I/O cookie to access
+ * @ops: The operations table to set
+ */
+static inline void fscache_init_io_request(struct fscache_io_request *req,
+					   struct fscache_cookie *cookie,
+					   const struct fscache_io_request_ops *ops)
+{
+	req->ops = ops;
+	if (fscache_cookie_valid(cookie))
+		__fscache_init_io_request(req, cookie);
+}
+
+/**
+ * fscache_free_io_request - Clean up an I/O request
+ * @req: The I/O request to clean
+ */
+static inline
+void fscache_free_io_request(struct fscache_io_request *req)
+{
+	if (req->cookie)
+		__fscache_free_io_request(req);
+}
+
+/**
+ * fscache_shape_request - Shape a request to fit the cache granularity
+ * @cookie: The cache cookie to access
+ * @shape: The request proposed by the VM/filesystem (gets modified).
+ *
+ * Shape the size and position of a cache I/O request such that either the
+ * region will entirely be read from the server or entirely read from the
+ * cache.  The proposed region may be adjusted by a combination of extending
+ * the front forward and/or extending or shrinking the end.  In any case, the
+ * first page of the proposed request will be contained in the revised extent.
+ *
+ * The function sets shape->to_be_done to FSCACHE_READ_FROM_CACHE to indicate
+ * that the data is resident in the cache and can be read from there,
+ * FSCACHE_WRITE_TO_CACHE to indicate that the data isn't present, but the
+ * netfs should write it, FSCACHE_FILL_WITH_ZERO to indicate that the data
+ * should be all zeros on the server and can just be fabricated locally or
+ * FSCACHE_READ_FROM_SERVER to indicate that there's no cache or an error
+ * occurred and the netfs should just read from the server.
+ */
+static inline
+void fscache_shape_request(struct fscache_cookie *cookie,
+			   struct fscache_request_shape *shape)
+{
+	shape->to_be_done	= FSCACHE_READ_FROM_SERVER;
+	shape->granularity	= 1;
+	shape->dio_block_size	= 1;
+	shape->actual_nr_pages	= shape->proposed_nr_pages;
+	shape->actual_start	= shape->proposed_start;
+
+	if (fscache_cookie_valid(cookie))
+		__fscache_shape_request(cookie, shape);
+}
+
+/**
+ * fscache_read - Read data from the cache.
+ * @req: The I/O request descriptor
+ * @iter: The buffer to read into
+ *
+ * The cache will attempt to read from the object referred to by the cookie,
+ * using the size and position described in the request.  The data will be
+ * transferred to the buffer described by the given iterator.
+ *
+ * If this fails or can't be done, an error will be set in the request
+ * descriptor and the netfs must reissue the read to the server.
+ *
+ * Note that the length and position of the request should be aligned to the
+ * DIO block size returned by fscache_shape_request().
+ *
+ * If req->io_done is set, the request will be submitted as asynchronous I/O
+ * and -EIOCBQUEUED may be returned to indicate that the operation is in
+ * progress.  The done function will be called when the operation is concluded
+ * either way.
+ *
+ * If req->io_done is not set, the request will be submitted as synchronous
+ * I/O and will be completed before the function returns.
+ */
+static inline
+int fscache_read(struct fscache_io_request *req, struct iov_iter *iter)
+{
+	if (fscache_cookie_valid(req->cookie))
+		return __fscache_read(req, iter);
+	req->error = -ENODATA;
+	if (req->io_done)
+		req->io_done(req);
+	return -ENODATA;
+}
+
+
+/**
+ * fscache_write - Write data to the cache.
+ * @req: The I/O request description + * @iter: The data to write + * + * The cache will attempt to write to the object referred to by the cookie, + * using the size and position described in the request. The data will be + * transferred from the iterator specified in the request. + * + * If this fails or can't be done, an error will be set in the request + * descriptor. + * + * Note that the length and position of the request should be aligned to the DIO + * block size returned by fscache_shape_request(). + * + * If req->io_done is set, the request will be submitted as asynchronous I/O and + * -EIOCBQUEUED may be returned to indicate that the operation is in progress. + * The done function will be called when the operation is concluded either way. + * + * If req->io_done is not set, the request will be submitted as synchronous I/O and + * will be completed before the function returns. + */ +static inline +int fscache_write(struct fscache_io_request *req, struct iov_iter *iter) +{ + if (fscache_cookie_valid(req->cookie)) + return __fscache_write(req, iter); + req->error = -ENOBUFS; + if (req->io_done) + req->io_done(req); + return -ENOBUFS; +} + #endif /* _LINUX_FSCACHE_H */ diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index 4fedc2e9c428..0aa3f3126f6e 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -39,6 +39,7 @@ enum cachefiles_obj_ref_trace { EM(fscache_obj_get_attach, "GET attach") \ EM(fscache_obj_get_exists, "GET exists") \ EM(fscache_obj_get_inval, "GET inval") \ + EM(fscache_obj_get_ioreq, "GET ioreq") \ EM(fscache_obj_get_wait, "GET wait") \ EM(fscache_obj_get_withdraw, "GET withdraw") \ EM(fscache_obj_new, "NEW obj") \ @@ -49,6 +50,7 @@ enum cachefiles_obj_ref_trace { EM(fscache_obj_put_drop_child, "PUT drop_child") \ EM(fscache_obj_put_drop_obj, "PUT drop_obj") \ EM(fscache_obj_put_inval, "PUT inval") \ + EM(fscache_obj_put_ioreq, "PUT ioreq") \ EM(fscache_obj_put_withdraw, "PUT withdraw") \ EM(fscache_obj_put_lookup_fail, "PUT lookup_fail") \ EM(cachefiles_obj_put_wait_retry, "PUT wait_retry") \ From patchwork Mon Jul 13 16:32:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660435 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BF45113A4 for ; Mon, 13 Jul 2020 16:33:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A73DC2065F for ; Mon, 13 Jul 2020 16:33:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="dFDrs7x1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730556AbgGMQcu (ORCPT ); Mon, 13 Jul 2020 12:32:50 -0400 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:21900 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730412AbgGMQco (ORCPT ); Mon, 13 Jul 2020 12:32:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594657963; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=D1wPsoBbNSccmnM5FewiFx0xZyD377Fb6Hp+rWU3DnI=; b=dFDrs7x13G3yiGpJLZuYd9j+Ci6nkLywiRgZS2DO6YXFgbHYMeWnaSYepDggPDGXMbqM/O DR4z2f0zRNXK5MwilIYrfoiSIopSyNkNQvtSUZX6tz0/N40uOjxRR2x69SNbzPhizhePfz Widj8+CWh78QZ1fitKk7Jzz6GAdxu00= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-236-aVjucGd_MIeHprorq-cBOQ-1; Mon, 13 Jul 2020 12:32:39 -0400 X-MC-Unique: aVjucGd_MIeHprorq-cBOQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 97D8110059B5; Mon, 13 Jul 2020 16:32:37 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 25C8760BF3; Mon, 13 Jul 2020 16:32:32 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 10/32] fscache: Remove fscache_wait_on_invalidate() From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:32:31 +0100 Message-ID: <159465795136.1376674.599056208279354471.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Remove fscache_wait_on_invalidate() as the invalidation wait is now built into the I/O path. Signed-off-by: David Howells --- fs/fscache/cookie.c | 14 -------------- include/linux/fscache.h | 17 ----------------- 2 files changed, 31 deletions(-) diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index a8aa1639e93b..a1eba3be9ce8 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -492,20 +492,6 @@ void __fscache_invalidate(struct fscache_cookie *cookie) } EXPORT_SYMBOL(__fscache_invalidate); -/* - * Wait for object invalidation to complete. - */ -void __fscache_wait_on_invalidate(struct fscache_cookie *cookie) -{ - _enter("%p", cookie); - - wait_on_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING, - TASK_UNINTERRUPTIBLE); - - _leave(""); -} -EXPORT_SYMBOL(__fscache_wait_on_invalidate); - /* * Update the index entries backing a cookie. The writeback is done lazily. 
*/ diff --git a/include/linux/fscache.h b/include/linux/fscache.h index aec75fc0d297..56fdd0e74a88 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -217,7 +217,6 @@ extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool); extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *); extern void __fscache_invalidate(struct fscache_cookie *); -extern void __fscache_wait_on_invalidate(struct fscache_cookie *); extern void __fscache_shape_request(struct fscache_cookie *, struct fscache_request_shape *); extern void __fscache_init_io_request(struct fscache_io_request *, struct fscache_cookie *); @@ -466,22 +465,6 @@ void fscache_invalidate(struct fscache_cookie *cookie) __fscache_invalidate(cookie); } -/** - * fscache_wait_on_invalidate - Wait for invalidation to complete - * @cookie: The cookie representing the cache object - * - * Wait for the invalidation of an object to complete. - * - * See Documentation/filesystems/caching/netfs-api.rst for a complete - * description. - */ -static inline -void fscache_wait_on_invalidate(struct fscache_cookie *cookie) -{ - if (fscache_cookie_valid(cookie)) - __fscache_wait_on_invalidate(cookie); -} - /** * fscache_init_io_request - Initialise an I/O request * @req: The I/O request to initialise From patchwork Mon Jul 13 16:32:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660431 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC0F51510 for ; Mon, 13 Jul 2020 16:32:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AD7B12065F for ; Mon, 13 Jul 2020 16:32:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="fO9w4ebQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730366AbgGMQc5 (ORCPT ); Mon, 13 Jul 2020 12:32:57 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:40699 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730572AbgGMQcz (ORCPT ); Mon, 13 Jul 2020 12:32:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594657973; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tP6l1C0alH26x13TIIslzPNEfR9MrU1nOyotTY1c1j0=; b=fO9w4ebQVUU6gO4GZsJiJSjo74vEkxSnY2kOGANcDakts0xLIl7G3EDbnx9RmgSzwLKaPh cGFr8zAr/KGfTw2h9cZIb2CksATZx8UGrvkQELo3cr/dQ5elsTspchQ+jIIfJZbdoL2aFe on7q88P4cpinkwGiqUV9dWMmI1p95pA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-435-MC_DnXy_Nt2kzlCaveItmw-1; Mon, 13 Jul 2020 12:32:51 -0400 X-MC-Unique: MC_DnXy_Nt2kzlCaveItmw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B68611902EA2; Mon, 13 Jul 2020 16:32:49 +0000 (UTC) Received: 
from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id AA436724C3; Mon, 13 Jul 2020 16:32:43 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 11/32] fscache: Keep track of size of a file last set independently on the server From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:32:42 +0100 Message-ID: <159465796283.1376674.15372489386955555864.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org

Keep track of the size of a file that we're caching as last set independently on the server by another client. As long as this does not change, we can make the assumption that anything over that boundary, if not represented in the local cache, will not be represented on the server either and can just be cleared rather than being read, thereby saving a trip to the server.

This only works if we make space in the cache by zapping whole files and not just punching bits out of them: if we write to the server but don't keep a copy in the cache, the assumption mentioned above no longer holds true. We also need to update this size when invalidation occurs.

Signed-off-by: David Howells --- fs/afs/inode.c | 2 +- fs/fscache/cookie.c | 8 +++++++- include/linux/fscache.h | 8 +++++--- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/fs/afs/inode.c b/fs/afs/inode.c index 49d897437998..b0772e64a844 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -569,7 +569,7 @@ static void afs_zap_data(struct afs_vnode *vnode) _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode); #ifdef CONFIG_AFS_FSCACHE - fscache_invalidate(vnode->cache); + fscache_invalidate(vnode->cache, i_size_read(&vnode->vfs_inode)); #endif /* nuke all the non-dirty pages that aren't locked, mapped or being diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index a1eba3be9ce8..5c53027d3f53 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -159,6 +159,7 @@ struct fscache_cookie *fscache_alloc_cookie( cookie->key_len = index_key_len; cookie->aux_len = aux_data_len; cookie->object_size = object_size; + cookie->zero_point = object_size; strlcpy(cookie->type_name, type_name, sizeof(cookie->type_name)); if (fscache_set_key(cookie, index_key, index_key_len) < 0) @@ -473,7 +474,7 @@ void fscache_set_cookie_stage(struct fscache_cookie *cookie, /* * Invalidate an object. Callable with spinlocks held.
*/ -void __fscache_invalidate(struct fscache_cookie *cookie) +void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size) { _enter("{%s}", cookie->type_name); @@ -486,6 +487,11 @@ void __fscache_invalidate(struct fscache_cookie *cookie) */ ASSERTCMP(cookie->type, ==, FSCACHE_COOKIE_TYPE_DATAFILE); + spin_lock(&cookie->lock); + cookie->object_size = new_size; + cookie->zero_point = new_size; + spin_unlock(&cookie->lock); + if (!hlist_empty(&cookie->backing_objects) && test_and_set_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags)) fscache_dispatch(cookie, NULL, 0, fscache_invalidate_object); diff --git a/include/linux/fscache.h b/include/linux/fscache.h index 56fdd0e74a88..bfb28cebfcfd 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -102,6 +102,7 @@ struct fscache_cookie { struct list_head proc_link; /* Link in proc list */ char type_name[8]; /* Cookie type name */ loff_t object_size; /* Size of the netfs object */ + loff_t zero_point; /* Size after which no data on server */ unsigned long flags; #define FSCACHE_COOKIE_INVALIDATING 4 /* T if cookie is being invalidated */ @@ -216,8 +217,8 @@ extern void __fscache_use_cookie(struct fscache_cookie *, bool); extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const loff_t *); extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool); extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *); -extern void __fscache_invalidate(struct fscache_cookie *); extern void __fscache_shape_request(struct fscache_cookie *, struct fscache_request_shape *); +extern void __fscache_invalidate(struct fscache_cookie *, loff_t); extern void __fscache_init_io_request(struct fscache_io_request *, struct fscache_cookie *); extern void __fscache_free_io_request(struct fscache_io_request *); @@ -448,6 +449,7 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie) /** * fscache_invalidate - Notify cache that an object needs invalidation * @cookie: The cookie representing the cache object + * @size: The revised size of the object. * * Notify the cache that an object is needs to be invalidated and that it * should abort any retrievals or stores it is doing on the cache. The object @@ -459,10 +461,10 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie) * description. 
*/ static inline -void fscache_invalidate(struct fscache_cookie *cookie) +void fscache_invalidate(struct fscache_cookie *cookie, loff_t size) { if (fscache_cookie_valid(cookie)) - __fscache_invalidate(cookie); + __fscache_invalidate(cookie, size); } /** From patchwork Mon Jul 13 16:32:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660447 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1DBE5618 for ; Mon, 13 Jul 2020 16:33:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 06A04206F5 for ; Mon, 13 Jul 2020 16:33:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="FqFYIQCY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730577AbgGMQdN (ORCPT ); Mon, 13 Jul 2020 12:33:13 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:40856 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730586AbgGMQdI (ORCPT ); Mon, 13 Jul 2020 12:33:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594657986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jQyN5nMptUdr+FXFnO+0cz+AZazAi5FEXN8n0OwGMPM=; b=FqFYIQCYE3wPVIwLVJek3Y/3zdwK1O3+2dl9N5JctErzDEznmbhagsabEo6CBAhq2XKHuU q+G09+eR/0mRGPc1eR/Oc5qQZTDTb8OHGcQJHc+U1SotMZDl16ya58VbcY4Tb2E7WVdFQH IfLfeH2EcLdQ0cFwCTT8f8vQ0LaeSF4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-501-6akDkgNxPv2r1vJGIyslDQ-1; Mon, 13 Jul 2020 12:33:03 -0400 X-MC-Unique: 6akDkgNxPv2r1vJGIyslDQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4A6FB100A8C0; Mon, 13 Jul 2020 16:33:01 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id B34915D9CC; Mon, 13 Jul 2020 16:32:55 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
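To illustrate the warning being fixed (a sketch, not from the series; do_something() is hypothetical):

	/* With CONFIG_FSCACHE_HISTOGRAM=n and the old stub
	 * "#define fscache_hist(hist, start_jif) do {} while (0)",
	 * start_jif below is set but never used, so gcc can emit
	 * -Wunused-but-set-variable; the empty static inline added by this
	 * patch consumes its arguments and still compiles away to nothing.
	 */
	static void example_op(void)
	{
		unsigned long start_jif = jiffies;

		do_something();		/* hypothetical work being timed */
		fscache_hist(fscache_ops_histogram, start_jif);
	}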
Signed-off-by: David Howells --- fs/cachefiles/internal.h | 7 +++++-- fs/fscache/internal.h | 6 ++++-- 2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index b89f76a03546..16d15291a629 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -143,11 +143,11 @@ extern int cachefiles_check_in_use(struct cachefiles_cache *cache, /* * proc.c */ -#ifdef CONFIG_CACHEFILES_HISTOGRAM extern atomic_t cachefiles_lookup_histogram[HZ]; extern atomic_t cachefiles_mkdir_histogram[HZ]; extern atomic_t cachefiles_create_histogram[HZ]; +#ifdef CONFIG_CACHEFILES_HISTOGRAM extern int __init cachefiles_proc_init(void); extern void cachefiles_proc_cleanup(void); static inline @@ -162,7 +162,10 @@ void cachefiles_hist(atomic_t histogram[], unsigned long start_jif) #else #define cachefiles_proc_init() (0) #define cachefiles_proc_cleanup() do {} while (0) -#define cachefiles_hist(hist, start_jif) do {} while (0) +static inline +void cachefiles_hist(atomic_t histogram[], unsigned long start_jif) +{ +} #endif /*
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index 443671310e31..a70c1a612309 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -95,13 +95,13 @@ extern struct fscache_cookie fscache_fsdef_index; /* * histogram.c */ -#ifdef CONFIG_FSCACHE_HISTOGRAM extern atomic_t fscache_obj_instantiate_histogram[HZ]; extern atomic_t fscache_objs_histogram[HZ]; extern atomic_t fscache_ops_histogram[HZ]; extern atomic_t fscache_retrieval_delay_histogram[HZ]; extern atomic_t fscache_retrieval_histogram[HZ]; +#ifdef CONFIG_FSCACHE_HISTOGRAM static inline void fscache_hist(atomic_t histogram[], unsigned long start_jif) { unsigned long jif = jiffies - start_jif; @@ -113,7 +113,9 @@ static inline void fscache_hist(atomic_t histogram[], unsigned long start_jif) extern const struct seq_operations fscache_histogram_ops; #else -#define fscache_hist(hist, start_jif) do {} while (0) +static inline void fscache_hist(atomic_t histogram[], unsigned long start_jif) +{ +} #endif /*

From patchwork Mon Jul 13 16:33:06 2020
Subject: [PATCH 13/32] fscache: Recast assertion in terms of cookie not being an index
From: David Howells
Date: Mon, 13 Jul 2020 17:33:06 +0100
Message-ID: <159465798650.1376674.16738070178705686097.stgit@warthog.procyon.org.uk>

Recast the assertion in __fscache_invalidate() in terms of the cookie not being an index rather than it being a datafile.
Signed-off-by: David Howells --- fs/fscache/cookie.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index 5c53027d3f53..2d9d147411cd 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -485,7 +485,7 @@ void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size) * there, and if it's doing that, it may as well just retire the * cookie. */ - ASSERTCMP(cookie->type, ==, FSCACHE_COOKIE_TYPE_DATAFILE); + ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX); spin_lock(&cookie->lock); cookie->object_size = new_size;

From patchwork Mon Jul 13 16:33:17 2020
Subject: [PATCH 14/32] cachefiles: Remove some redundant checks on unsigned values
From: David Howells
Date: Mon, 13 Jul 2020 17:33:17 +0100
Message-ID: <159465799796.1376674.7066284663368258060.stgit@warthog.procyon.org.uk>

Remove some redundant checks for unsigned values being >= 0; such comparisons are always true.
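A minimal sketch of why the removed checks were dead code (illustrative, not from the patch; fstop_in_range() is a hypothetical name):

	/* An unsigned quantity can never be negative, so the "x >= 0" and
	 * "x < 0" halves of the removed tests were tautologies that
	 * -Wtype-limits flags. Only the upper-bound half does anything. */
	static bool fstop_in_range(unsigned int fstop, unsigned int fcull)
	{
		return fstop < fcull;	/* "fstop >= 0 &&" would always hold */
	}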
Signed-off-by: David Howells --- fs/cachefiles/bind.c | 6 ++---- fs/cachefiles/daemon.c | 6 +++--- 2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c index 4c59e1ef4500..84fe89d5999e 100644 --- a/fs/cachefiles/bind.c +++ b/fs/cachefiles/bind.c @@ -36,13 +36,11 @@ int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args) args); /* start by checking things over */ - ASSERT(cache->fstop_percent >= 0 && - cache->fstop_percent < cache->fcull_percent && + ASSERT(cache->fstop_percent < cache->fcull_percent && cache->fcull_percent < cache->frun_percent && cache->frun_percent < 100); - ASSERT(cache->bstop_percent >= 0 && - cache->bstop_percent < cache->bcull_percent && + ASSERT(cache->bstop_percent < cache->bcull_percent && cache->bcull_percent < cache->brun_percent && cache->brun_percent < 100);
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c index 8a937d6d5e22..e8ab3ab57147 100644 --- a/fs/cachefiles/daemon.c +++ b/fs/cachefiles/daemon.c @@ -221,7 +221,7 @@ static ssize_t cachefiles_daemon_write(struct file *file, if (test_bit(CACHEFILES_DEAD, &cache->flags)) return -EIO; - if (datalen < 0 || datalen > PAGE_SIZE - 1) + if (datalen > PAGE_SIZE - 1) return -EOPNOTSUPP; /* drag the command string into the kernel so we can parse it */ @@ -378,7 +378,7 @@ static int cachefiles_daemon_fstop(struct cachefiles_cache *cache, char *args) if (args[0] != '%' || args[1] != '\0') return -EINVAL; - if (fstop < 0 || fstop >= cache->fcull_percent) + if (fstop >= cache->fcull_percent) return cachefiles_daemon_range_error(cache, args); cache->fstop_percent = fstop; @@ -450,7 +450,7 @@ static int cachefiles_daemon_bstop(struct cachefiles_cache *cache, char *args) if (args[0] != '%' || args[1] != '\0') return -EINVAL; - if (bstop < 0 || bstop >= cache->bcull_percent) + if (bstop >= cache->bcull_percent) return cachefiles_daemon_range_error(cache, args); cache->bstop_percent = bstop;

From patchwork Mon Jul 13 16:33:29 2020
Subject: [PATCH 15/32] cachefiles: trace: Log coherency checks
From: David Howells
Date: Mon, 13 Jul 2020 17:33:29 +0100
Message-ID: <159465800942.1376674.11074050532334474977.stgit@warthog.procyon.org.uk>

Add a cachefiles tracepoint that logs the result of coherency management when the coherency data on a file in the cache is checked or committed.
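Once applied, the new event can be watched like any other tracepoint, for example via ftrace (assuming tracefs is mounted in the usual place):

	# cd /sys/kernel/debug/tracing
	# echo 1 > events/cachefiles/cachefiles_coherency/enable
	# cat trace_pipe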
Signed-off-by: David Howells --- fs/cachefiles/xattr.c | 45 ++++++++++++++++++++++-------- include/trace/events/cachefiles.h | 56 +++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+), 12 deletions(-) diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index 5b2f6da91cc8..17c16c2bd07e 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -125,12 +125,21 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object, ret = vfs_setxattr(dentry, cachefiles_xattr_cache, buf, sizeof(struct cachefiles_xattr) + len, xattr_flags); - kfree(buf); - if (ret < 0 && ret != -ENOMEM) - cachefiles_io_error_obj( - object, - "Failed to set xattr with error %d", ret); + if (ret < 0) { + trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, + 0, + cachefiles_coherency_set_fail); + if (ret != -ENOMEM) + cachefiles_io_error_obj( + object, + "Failed to set xattr with error %d", ret); + } else { + trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, + 0, + cachefiles_coherency_set_ok); + } + kfree(buf); _leave(" = %d", ret); return ret; } @@ -144,7 +153,9 @@ int cachefiles_check_auxdata(struct cachefiles_object *object) struct dentry *dentry = object->dentry; unsigned int len = object->fscache.cookie->aux_len, tlen; const void *p = fscache_get_aux(object->fscache.cookie); - ssize_t ret; + enum cachefiles_coherency_trace why; + ssize_t xlen; + int ret = -ESTALE; ASSERT(dentry); ASSERT(d_backing_inode(dentry)); @@ -154,14 +165,24 @@ int cachefiles_check_auxdata(struct cachefiles_object *object) if (!buf) return -ENOMEM; - ret = vfs_getxattr(dentry, cachefiles_xattr_cache, buf, tlen); - if (ret == tlen && - buf->type == object->fscache.cookie->type && - memcmp(buf->data, p, len) == 0) + xlen = vfs_getxattr(dentry, cachefiles_xattr_cache, buf, tlen); + if (xlen != tlen) { + if (xlen == -EIO) + cachefiles_io_error_obj( + object, + "Failed to read aux with error %zd", xlen); + why = cachefiles_coherency_check_xattr; + } else if (buf->type != object->fscache.cookie->type) { + why = cachefiles_coherency_check_type; + } else if (memcmp(buf->data, p, len) != 0) { + why = cachefiles_coherency_check_aux; + } else { + why = cachefiles_coherency_check_ok; ret = 0; - else - ret = -ESTALE; + } + trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, + 0, why); kfree(buf); return ret; } diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index 0aa3f3126f6e..bf588c3f4a07 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -24,6 +24,19 @@ enum cachefiles_obj_ref_trace { cachefiles_obj_ref__nr_traces }; +enum cachefiles_coherency_trace { + cachefiles_coherency_check_aux, + cachefiles_coherency_check_content, + cachefiles_coherency_check_dirty, + cachefiles_coherency_check_len, + cachefiles_coherency_check_objsize, + cachefiles_coherency_check_ok, + cachefiles_coherency_check_type, + cachefiles_coherency_check_xattr, + cachefiles_coherency_set_fail, + cachefiles_coherency_set_ok, +}; + #endif /* @@ -56,6 +69,18 @@ enum cachefiles_obj_ref_trace { EM(cachefiles_obj_put_wait_retry, "PUT wait_retry") \ E_(cachefiles_obj_put_wait_timeo, "PUT wait_timeo") +#define cachefiles_coherency_traces \ + EM(cachefiles_coherency_check_aux, "BAD aux ") \ + EM(cachefiles_coherency_check_content, "BAD cont") \ + EM(cachefiles_coherency_check_dirty, "BAD dirt") \ + EM(cachefiles_coherency_check_len, "BAD len ") \ + EM(cachefiles_coherency_check_objsize, "BAD osiz") \ + EM(cachefiles_coherency_check_ok, "OK ") \ + 
EM(cachefiles_coherency_check_type, "BAD type") \ + EM(cachefiles_coherency_check_xattr, "BAD xatt") \ + EM(cachefiles_coherency_set_fail, "SET fail") \ + E_(cachefiles_coherency_set_ok, "SET ok ") + /* * Export enum symbols via userspace. */ @@ -66,6 +91,7 @@ enum cachefiles_obj_ref_trace { cachefiles_obj_kill_traces; cachefiles_obj_ref_traces; +cachefiles_coherency_traces; /* * Now redefine the EM() and E_() macros to map the enums to the strings that @@ -295,6 +321,36 @@ TRACE_EVENT(cachefiles_mark_buried, __print_symbolic(__entry->why, cachefiles_obj_kill_traces)) ); +TRACE_EVENT(cachefiles_coherency, + TP_PROTO(struct cachefiles_object *obj, + ino_t ino, + int content, + enum cachefiles_coherency_trace why), + + TP_ARGS(obj, ino, content, why), + + /* Note that obj may be NULL */ + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(enum cachefiles_coherency_trace, why ) + __field(int, content ) + __field(u64, ino ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->why = why; + __entry->content = content; + __entry->ino = ino; + ), + + TP_printk("o=%08x %s i=%llx c=%u", + __entry->obj, + __print_symbolic(__entry->why, cachefiles_coherency_traces), + __entry->ino, + __entry->content) + ); + #endif /* _TRACE_CACHEFILES_H */ /* This part must be outside protection */

From patchwork Mon Jul 13 16:33:38 2020
Subject: [PATCH 16/32] cachefiles: Split cachefiles_drop_object() up a bit
From: David Howells
Date: Mon, 13 Jul 2020 17:33:38 +0100
Message-ID: <159465801837.1376674.800536726710094793.stgit@warthog.procyon.org.uk>

Split cachefiles_drop_object() up a bit to make it easier to modify later.
Signed-off-by: David Howells --- fs/cachefiles/interface.c | 58 ++++++++++++++++++++++++++++++--------------- 1 file changed, 39 insertions(+), 19 deletions(-)
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index e4d1a82b9f33..56ed6f203e1c 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -192,6 +192,42 @@ static void cachefiles_update_object(struct fscache_object *_object) _leave(""); } +/* + * Commit changes to the object as we drop it. + */ +static void cachefiles_commit_object(struct cachefiles_object *object, + struct cachefiles_cache *cache) +{ +} + +/* + * Finalise an object and close the VFS structs that we have. + */ +static void cachefiles_clean_up_object(struct cachefiles_object *object, + struct cachefiles_cache *cache, + bool invalidate) +{ + if (invalidate && &object->fscache != cache->cache.fsdef) { + _debug("- inval object OBJ%x", object->fscache.debug_id); + cachefiles_delete_object(cache, object); + } else { + cachefiles_commit_object(object, cache); + } + + /* close the filesystem stuff attached to the object */ + if (object->backing_file) + fput(object->backing_file); + object->backing_file = NULL; + + if (object->backer != object->dentry) + dput(object->backer); + object->backer = NULL; + + cachefiles_unmark_inode_in_use(object, object->dentry); + dput(object->dentry); + object->dentry = NULL; +} + /* * discard the resources pinned by an object and effect retirement if * requested @@ -223,25 +259,9 @@ static void cachefiles_drop_object(struct fscache_object *_object, * before we set it up. */ if (object->dentry) { - if (invalidate && _object != cache->cache.fsdef) { - _debug("- inval object OBJ%x", object->fscache.debug_id); - cachefiles_begin_secure(cache, &saved_cred); - cachefiles_delete_object(cache, object); - cachefiles_end_secure(cache, saved_cred); - } - - /* close the filesystem stuff attached to the object */ - if (object->backing_file) - fput(object->backing_file); - object->backing_file = NULL; - - if (object->backer != object->dentry) - dput(object->backer); - object->backer = NULL; - - cachefiles_unmark_inode_in_use(object, object->dentry); - dput(object->dentry); - object->dentry = NULL; + cachefiles_begin_secure(cache, &saved_cred); + cachefiles_clean_up_object(object, cache, invalidate); + cachefiles_end_secure(cache, saved_cred); } _leave("");

From patchwork Mon Jul 13 16:33:50 2020
Subject: [PATCH 17/32] cachefiles: Implement new fscache I/O backend API
From: David Howells
Date: Mon, 13 Jul 2020 17:33:50 +0100
Message-ID: <159465803035.1376674.12906653212889524200.stgit@warthog.procyon.org.uk>

Implement the new fscache I/O backend API in cachefiles. The cachefiles_object struct carries a non-accounted file pointer to the backing file (non-accounted so that holding many cache files open doesn't cause ENFILE).
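A minimal sketch of how the fscache core might dispatch through these new ops (illustrative only: fscache_do_read() is hypothetical and the object->cache->ops chain is assumed; the op signatures match those declared in fs/cachefiles/internal.h below):

	/* Hypothetical dispatch wrapper, for illustration only. */
	static int fscache_do_read(struct fscache_object *object,
				   struct fscache_io_request *req,
				   struct iov_iter *iter)
	{
		const struct fscache_cache_ops *ops = object->cache->ops;

		if (!ops->read)
			return -ENOBUFS;	/* the cache can't help */
		return ops->read(object, req, iter); /* -> cachefiles_read() */
	}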
Signed-off-by: David Howells --- fs/cachefiles/Makefile | 1 + fs/cachefiles/interface.c | 3 ++ fs/cachefiles/internal.h | 13 +++++++ fs/cachefiles/io.c | 87 +++++++++++++++++++++++++++++++++++++++++++++ fs/cachefiles/namei.c | 3 ++ 5 files changed, 107 insertions(+) create mode 100644 fs/cachefiles/io.c
diff --git a/fs/cachefiles/Makefile b/fs/cachefiles/Makefile index 3455d3646547..d894d317d6e7 100644 --- a/fs/cachefiles/Makefile +++ b/fs/cachefiles/Makefile @@ -7,6 +7,7 @@ cachefiles-y := \ bind.o \ daemon.o \ interface.o \ + io.o \ key.o \ main.o \ namei.o \
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 56ed6f203e1c..4ce7ab5c75db 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -465,4 +465,7 @@ const struct fscache_cache_ops cachefiles_cache_ops = { .put_object = cachefiles_put_object, .get_object_usage = cachefiles_get_object_usage, .sync_cache = cachefiles_sync_cache, + .shape_request = cachefiles_shape_request, + .read = cachefiles_read, + .write = cachefiles_write, };
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 16d15291a629..b82e7f8b00bd 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -115,6 +115,19 @@ extern const struct fscache_cache_ops cachefiles_cache_ops; extern struct fscache_object *cachefiles_grab_object(struct fscache_object *_object, enum fscache_obj_ref_trace why); +/* + * io.c + */ +extern void cachefiles_shape_request(struct fscache_object *object, + struct fscache_request_shape *shape); +extern int cachefiles_read(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter); +extern int cachefiles_write(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter); +extern bool cachefiles_open_object(struct cachefiles_object *obj); + /* * key.c */
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c new file mode 100644 index 000000000000..89fd4a24e613 --- /dev/null +++ b/fs/cachefiles/io.c @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Data I/O routines + * + * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include "internal.h" + +/* + * Determine the size of a data extent in a cache object. This must be written + * as a whole unit, but can be read piecemeal. + */ +void cachefiles_shape_request(struct fscache_object *object, + struct fscache_request_shape *shape) +{ +} + +/* + * Initiate a read from the cache. + */ +int cachefiles_read(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter) +{ + req->error = -ENODATA; + if (req->io_done) + req->io_done(req); + return -ENODATA; +} + +/* + * Initiate a write to the cache. + */ +int cachefiles_write(struct fscache_object *object, + struct fscache_io_request *req, + struct iov_iter *iter) +{ + req->error = -ENOBUFS; + if (req->io_done) + req->io_done(req); + return -ENOBUFS; +} + +/* + * Open a cache object. + */ +bool cachefiles_open_object(struct cachefiles_object *object) +{ + struct cachefiles_cache *cache = + container_of(object->fscache.cache, struct cachefiles_cache, cache); + struct file *file; + struct path path; + + path.mnt = cache->mnt; + path.dentry = object->backer; + + file = open_with_fake_path(&path, + O_RDWR | O_LARGEFILE | O_DIRECT, + d_backing_inode(object->backer), + cache->cache_cred); + if (IS_ERR(file)) + goto error; + + if (!S_ISREG(file_inode(file)->i_mode)) + goto error_file; + + if (unlikely(!file->f_op->read_iter) || + unlikely(!file->f_op->write_iter)) { + pr_notice("Cache does not support read_iter and write_iter\n"); + goto error_file; + } + + object->backing_file = file; + return true; + +error_file: + fput(file); +error: + return false; +}
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index d1c8828ebbbb..d9c9a7d7eb8a 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -492,6 +492,9 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, } else { BUG(); // TODO: open file in data-class subdir } + + if (!cachefiles_open_object(object)) + goto check_error; } if (object->new)

From patchwork Mon Jul 13 16:33:59 2020
Subject: [PATCH 18/32] cachefiles: Merge object->backer into object->dentry
From: David Howells
Date: Mon, 13 Jul 2020 17:33:59 +0100
Message-ID: <159465803914.1376674.8451362224962725376.stgit@warthog.procyon.org.uk>

Merge the object->backer pointer into the object->dentry pointer and assume that data objects are always going to be just regular files. object->dentry can then more easily be overridden later by invalidation without having two different things to update the xattrs on. object->old maintains a pointer to the old file so that we can unlink it later.
Signed-off-by: David Howells --- fs/cachefiles/interface.c | 35 +++++++++++++++++------------------ fs/cachefiles/internal.h | 2 +- fs/cachefiles/io.c | 4 ++-- fs/cachefiles/namei.c | 4 +++- 4 files changed, 23 insertions(+), 22 deletions(-) diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 4ce7ab5c75db..6384fba652eb 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -171,16 +171,16 @@ static void cachefiles_update_object(struct fscache_object *_object) cachefiles_begin_secure(cache, &saved_cred); object_size = object->fscache.cookie->object_size; - if (i_size_read(d_inode(object->backer)) > object_size) { + if (i_size_read(d_inode(object->dentry)) > object_size) { struct path path = { .mnt = cache->mnt, - .dentry = object->backer + .dentry = object->dentry }; - _debug("trunc %llx -> %llx", i_size_read(d_inode(object->backer)), object_size); + _debug("trunc %llx -> %llx", i_size_read(d_inode(object->dentry)), object_size); ret = vfs_truncate(&path, object_size); if (ret < 0) { cachefiles_io_error_obj(object, "Trunc-to-size failed"); - cachefiles_remove_object_xattr(cache, object->backer); + cachefiles_remove_object_xattr(cache, object->dentry); goto out; } } @@ -219,9 +219,8 @@ static void cachefiles_clean_up_object(struct cachefiles_object *object, fput(object->backing_file); object->backing_file = NULL; - if (object->backer != object->dentry) - dput(object->backer); - object->backer = NULL; + dput(object->old); + object->old = NULL; cachefiles_unmark_inode_in_use(object, object->dentry); dput(object->dentry); @@ -295,7 +294,7 @@ static void cachefiles_put_object(struct fscache_object *_object, if (u == 0) { _debug("- kill object OBJ%x", object->fscache.debug_id); - ASSERTCMP(object->backer, ==, NULL); + ASSERTCMP(object->old, ==, NULL); ASSERTCMP(object->dentry, ==, NULL); ASSERTCMP(object->fscache.n_children, ==, 0); @@ -360,17 +359,17 @@ static int cachefiles_attr_changed(struct cachefiles_object *object) if (ni_size == object->i_size) return 0; - if (!object->backer) + if (!object->dentry) return -ENOBUFS; - ASSERT(d_is_reg(object->backer)); + ASSERT(d_is_reg(object->dentry)); - oi_size = i_size_read(d_backing_inode(object->backer)); + oi_size = i_size_read(d_backing_inode(object->dentry)); if (oi_size == ni_size) return 0; cachefiles_begin_secure(cache, &saved_cred); - inode_lock(d_inode(object->backer)); + inode_lock(d_inode(object->dentry)); /* if there's an extension to a partial page at the end of the backing * file, we need to discard the partial page so that we pick up new @@ -379,17 +378,17 @@ static int cachefiles_attr_changed(struct cachefiles_object *object) _debug("discard tail %llx", oi_size); newattrs.ia_valid = ATTR_SIZE; newattrs.ia_size = oi_size & PAGE_MASK; - ret = notify_change(object->backer, &newattrs, NULL); + ret = notify_change(object->dentry, &newattrs, NULL); if (ret < 0) goto truncate_failed; } newattrs.ia_valid = ATTR_SIZE; newattrs.ia_size = ni_size; - ret = notify_change(object->backer, &newattrs, NULL); + ret = notify_change(object->dentry, &newattrs, NULL); truncate_failed: - inode_unlock(d_inode(object->backer)); + inode_unlock(d_inode(object->dentry)); cachefiles_end_secure(cache, saved_cred); if (ret == -EIO) { @@ -422,10 +421,10 @@ static void cachefiles_invalidate_object(struct fscache_object *_object) _enter("{OBJ%x},[%llu]", object->fscache.debug_id, (unsigned long long)ni_size); - if (object->backer) { - ASSERT(d_is_reg(object->backer)); + if (object->dentry) { + ASSERT(d_is_reg(object->dentry)); - 
path.dentry = object->backer; + path.dentry = object->dentry; path.mnt = cache->mnt; cachefiles_begin_secure(cache, &saved_cred);
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index b82e7f8b00bd..a00ffb63baf4 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -35,7 +35,7 @@ extern unsigned cachefiles_debug; struct cachefiles_object { struct fscache_object fscache; /* fscache handle */ struct dentry *dentry; /* the file/dir representing this object */ - struct dentry *backer; /* backing file */ + struct dentry *old; /* backing file */ struct file *backing_file; /* File open on backing storage */ loff_t i_size; /* object size */ atomic_t usage; /* object usage count */
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index 89fd4a24e613..d17734455af2 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -59,11 +59,11 @@ bool cachefiles_open_object(struct cachefiles_object *object) struct path path; path.mnt = cache->mnt; - path.dentry = object->backer; + path.dentry = object->dentry; file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT, - d_backing_inode(object->backer), + d_backing_inode(object->dentry), cache->cache_cred); if (IS_ERR(file)) goto error;
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index d9c9a7d7eb8a..3dc64ae5dde8 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -488,7 +488,7 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, goto check_error; } - object->backer = object->dentry; + object->old = dget(object->dentry); } else { BUG(); // TODO: open file in data-class subdir } @@ -523,7 +523,9 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, cachefiles_unmark_inode_in_use(object, object->dentry); cachefiles_mark_object_inactive(cache, object); dput(object->dentry); + dput(object->old); object->dentry = NULL; + object->old = NULL; goto error_out; lookup_error:

From patchwork Mon Jul 13 16:34:10 2020
Subject: [PATCH 19/32] cachefiles: Implement a content-present indicator and bitmap
From: David Howells
Date: Mon, 13 Jul 2020 17:34:10 +0100
Message-ID: <159465805087.1376674.13636976053799223498.stgit@warthog.procyon.org.uk>

Implement a content indicator that indicates the presence or absence of content and a bitmap that indicates which blocks of granular content are present in a granular file. This is added to the xattr that stores the netfs coherency data, along with the file size and the file zero point (the point after which it can be assumed that the server doesn't have any data).

In the content bitmap, if present, each bit indicates whether the corresponding 256KiB granule of the cache file is present. The bitmap is stored in a separate xattr, which is loaded when the first I/O handle is created on that cache object and saved when the object is discarded from memory.

Non-index objects in the cache can be monolithic or granular. The content map isn't used for monolithic objects (FSCACHE_COOKIE_ADV_SINGLE_CHUNK) as they are expected to be all-or-nothing, so the content indicator alone suffices. Examples of this would be AFS directory or symlink content.
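To make the sizing concrete, here is a small sketch (illustrative only; content_map_size_for() is a hypothetical name) of the map-size calculation mirrored from cachefiles_expand_content_map() below. For a 1GiB file it yields 4096 granule bits, i.e. a 512-byte map:

	/* One bit per 256KiB granule, rounded up to whole bytes and then to
	 * a power of two, with a floor of 8 bytes.
	 * E.g. 1GiB / 256KiB = 4096 bits = 512 bytes. */
	static size_t content_map_size_for(loff_t object_size)
	{
		size_t bits  = DIV_ROUND_UP(object_size, 256 * 1024);
		size_t bytes = DIV_ROUND_UP(bits, 8);

		return bytes <= 8 ? 8 : roundup_pow_of_two(bytes);
	}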
Signed-off-by: David Howells --- fs/cachefiles/Makefile | 1 fs/cachefiles/bind.c | 1 fs/cachefiles/content-map.c | 251 +++++++++++++++++++++++++++++++++++++ fs/cachefiles/interface.c | 5 + fs/cachefiles/internal.h | 31 +++++ fs/cachefiles/io.c | 4 + fs/cachefiles/xattr.c | 24 +++- include/trace/events/cachefiles.h | 4 - 8 files changed, 313 insertions(+), 8 deletions(-) create mode 100644 fs/cachefiles/content-map.c diff --git a/fs/cachefiles/Makefile b/fs/cachefiles/Makefile index d894d317d6e7..84615aca866a 100644 --- a/fs/cachefiles/Makefile +++ b/fs/cachefiles/Makefile @@ -5,6 +5,7 @@ cachefiles-y := \ bind.o \ + content-map.o \ daemon.o \ interface.o \ io.o \ diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c index 84fe89d5999e..40377633e3d9 100644 --- a/fs/cachefiles/bind.c +++ b/fs/cachefiles/bind.c @@ -102,6 +102,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache) goto error_root_object; atomic_set(&fsdef->usage, 1); + rwlock_init(&fsdef->content_map_lock); fsdef->type = FSCACHE_COOKIE_TYPE_INDEX; _debug("- fsdef %p", fsdef); diff --git a/fs/cachefiles/content-map.c b/fs/cachefiles/content-map.c new file mode 100644 index 000000000000..594624cb1cb9 --- /dev/null +++ b/fs/cachefiles/content-map.c @@ -0,0 +1,251 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Datafile content management + * + * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include "internal.h" + +static const char cachefiles_xattr_content_map[] = + XATTR_USER_PREFIX "CacheFiles.content"; + +static bool cachefiles_granule_is_present(struct cachefiles_object *object, + size_t granule) +{ + bool res; + + if (granule / 8 >= object->content_map_size) + return false; + read_lock_bh(&object->content_map_lock); + res = test_bit_le(granule, object->content_map); + read_unlock_bh(&object->content_map_lock); + return res; +} + +/* + * Mark the content map to indicate stored granule. + */ +void cachefiles_mark_content_map(struct fscache_io_request *req) +{ + struct cachefiles_object *object = + container_of(req->object, struct cachefiles_object, fscache); + loff_t pos = req->pos; + + _enter("%llx", pos); + + read_lock_bh(&object->content_map_lock); + + if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { + if (pos == 0) { + object->content_info = CACHEFILES_CONTENT_SINGLE; + set_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->fscache.flags); + } + } else { + pgoff_t granule; + loff_t end = pos + req->len; + + pos = round_down(pos, CACHEFILES_GRAN_SIZE); + do { + granule = pos / CACHEFILES_GRAN_SIZE; + if (granule / 8 >= object->content_map_size) + break; + + set_bit_le(granule, object->content_map); + object->content_map_changed = true; + pos += CACHEFILES_GRAN_SIZE; + + } while (pos < end); + + if (object->content_info != CACHEFILES_CONTENT_MAP) { + object->content_info = CACHEFILES_CONTENT_MAP; + set_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->fscache.flags); + } + } + + read_unlock_bh(&object->content_map_lock); +} + +/* + * Expand the content map to a larger file size. + */ +void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t size) +{ + u8 *map, *zap; + + /* Determine the size. There's one bit per granule. We size it in + * terms of 8-byte chunks, where a 64-bit span * 256KiB bytes granules + * covers 16MiB of file space. At that, 512B will cover 1GiB. 
+ */ + if (size > 0) { + size += CACHEFILES_GRAN_SIZE - 1; + size /= CACHEFILES_GRAN_SIZE; + size += 8 - 1; + size /= 8; + size = roundup_pow_of_two(size); + } else { + size = 8; + } + + if (size <= object->content_map_size) + return; + + map = kzalloc(size, GFP_KERNEL); + if (!map) + return; + + write_lock_bh(&object->content_map_lock); + if (size > object->content_map_size) { + zap = object->content_map; + memcpy(map, zap, object->content_map_size); + object->content_map = map; + object->content_map_size = size; + } else { + zap = map; + } + write_unlock_bh(&object->content_map_lock); + + kfree(zap); +} + +/* + * Adjust the content map when we shorten a backing object. + * + * We need to unmark any granules that are going to be discarded. + */ +void cachefiles_shorten_content_map(struct cachefiles_object *object, + loff_t new_size) +{ + struct fscache_cookie *cookie = object->fscache.cookie; + loff_t granule, o_granule; + + if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) + return; + + write_lock_bh(&object->content_map_lock); + + if (object->content_info == CACHEFILES_CONTENT_MAP) { + if (cookie->zero_point > new_size) + cookie->zero_point = new_size; + + granule = new_size; + granule += CACHEFILES_GRAN_SIZE - 1; + granule /= CACHEFILES_GRAN_SIZE; + + o_granule = cookie->object_size; + o_granule += CACHEFILES_GRAN_SIZE - 1; + o_granule /= CACHEFILES_GRAN_SIZE; + + for (; o_granule > granule; o_granule--) + clear_bit_le(o_granule, object->content_map); + } + + write_unlock_bh(&object->content_map_lock); +} + +/* + * Load the content map. + */ +bool cachefiles_load_content_map(struct cachefiles_object *object) +{ + struct cachefiles_cache *cache = container_of(object->fscache.cache, + struct cachefiles_cache, cache); + const struct cred *saved_cred; + ssize_t got; + loff_t size; + u8 *map = NULL; + + _enter("c=%08x,%llx", + object->fscache.cookie->debug_id, + object->fscache.cookie->object_size); + + object->content_info = CACHEFILES_CONTENT_NO_DATA; + if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { + /* Single-chunk object. The presence or absence of the content + * map xattr is sufficient indication. + */ + size = 0; + } else { + /* Granulated object. There's one bit per granule. We size it + * in terms of 8-byte chunks, where a 64-bit span * 256KiB + * bytes granules covers 16MiB of file space. At that, 512B + * will cover 1GiB. + */ + size = object->fscache.cookie->object_size; + if (size > 0) { + size += CACHEFILES_GRAN_SIZE - 1; + size /= CACHEFILES_GRAN_SIZE; + size += 8 - 1; + size /= 8; + if (size < 8) + size = 8; + size = roundup_pow_of_two(size); + } else { + size = 8; + } + + map = kzalloc(size, GFP_KERNEL); + if (!map) + return false; + } + + cachefiles_begin_secure(cache, &saved_cred); + got = vfs_getxattr(object->dentry, cachefiles_xattr_content_map, + map, size); + cachefiles_end_secure(cache, saved_cred); + if (got < 0 && got != -ENODATA) { + kfree(map); + _leave(" = f [%zd]", got); + return false; + } + + if (size == 0) { + if (got != -ENODATA) + object->content_info = CACHEFILES_CONTENT_SINGLE; + _leave(" = t [%zd]", got); + } else { + object->content_map = map; + object->content_map_size = size; + object->content_info = CACHEFILES_CONTENT_MAP; + _leave(" = t [%zd/%llu %*phN]", got, size, (int)size, map); + } + + return true; +} + +/* + * Save the content map. 
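+ * (Called when the object is committed if the map was modified; trailing
+ * zero bytes are trimmed so that only the used part of the map is
+ * written to the xattr.)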
+ */ +void cachefiles_save_content_map(struct cachefiles_object *object) +{ + ssize_t ret; + size_t size; + u8 *map; + + _enter("c=%08x", object->fscache.cookie->debug_id); + + if (object->content_info != CACHEFILES_CONTENT_MAP) + return; + + size = object->content_map_size; + map = object->content_map; + + /* Don't save trailing zeros, but do save at least one byte */ + for (; size > 0; size--) + if (map[size - 1]) + break; + + ret = vfs_setxattr(object->dentry, cachefiles_xattr_content_map, + map, size, 0); + if (ret < 0) { + cachefiles_io_error_obj(object, "Unable to set xattr"); + return; + } + + _leave(" = %zd", ret); +} diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 6384fba652eb..de4fb41103a6 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -37,6 +37,7 @@ struct fscache_object *cachefiles_alloc_object(struct fscache_cookie *cookie, return NULL; } + rwlock_init(&object->content_map_lock); fscache_object_init(&object->fscache, cookie, &cache->cache); object->fscache.parent = parent; object->fscache.stage = FSCACHE_OBJECT_STAGE_LOOKING_UP; @@ -198,6 +199,8 @@ static void cachefiles_update_object(struct fscache_object *_object) static void cachefiles_commit_object(struct cachefiles_object *object, struct cachefiles_cache *cache) { + if (object->content_map_changed) + cachefiles_save_content_map(object); } /* @@ -298,6 +301,8 @@ static void cachefiles_put_object(struct fscache_object *_object, ASSERTCMP(object->dentry, ==, NULL); ASSERTCMP(object->fscache.n_children, ==, 0); + kfree(object->content_map); + cache = object->fscache.cache; fscache_object_destroy(&object->fscache); kmem_cache_free(cachefiles_object_jar, object); diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index a00ffb63baf4..4085c1185693 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -19,6 +19,11 @@ #include #include +/* Cachefile granularity */ +#define CACHEFILES_GRAN_SIZE (256 * 1024) +#define CACHEFILES_GRAN_PAGES (CACHEFILES_GRAN_SIZE / PAGE_SIZE) +#define CACHEFILES_DIO_BLOCK_SIZE 4096 + struct cachefiles_cache; struct cachefiles_object; @@ -29,6 +34,16 @@ extern unsigned cachefiles_debug; #define cachefiles_gfp (__GFP_RECLAIM | __GFP_NORETRY | __GFP_NOMEMALLOC) +enum cachefiles_content { + /* These values are saved on disk */ + CACHEFILES_CONTENT_NO_DATA = 0, /* No content stored */ + CACHEFILES_CONTENT_SINGLE = 1, /* Content is monolithic, all is present */ + CACHEFILES_CONTENT_ALL = 2, /* Content is all present, no map */ + CACHEFILES_CONTENT_MAP = 3, /* Content is piecemeal, map in use */ + CACHEFILES_CONTENT_DIRTY = 4, /* Content is dirty (only seen on disk) */ + nr__cachefiles_content +}; + /* * node records */ @@ -41,6 +56,13 @@ struct cachefiles_object { atomic_t usage; /* object usage count */ uint8_t type; /* object type */ bool new; /* T if object new */ + + /* Map of the content blocks in the object */ + enum cachefiles_content content_info:8; /* Info about content presence */ + bool content_map_changed; + u8 *content_map; /* Content present bitmap */ + unsigned int content_map_size; /* Size of buffer */ + rwlock_t content_map_lock; }; extern struct kmem_cache *cachefiles_object_jar; @@ -100,6 +122,15 @@ static inline void cachefiles_state_changed(struct cachefiles_cache *cache) extern int cachefiles_daemon_bind(struct cachefiles_cache *cache, char *args); extern void cachefiles_daemon_unbind(struct cachefiles_cache *cache); +/* + * content-map.c + */ +extern void cachefiles_mark_content_map(struct 
fscache_io_request *req); +extern void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t size); +extern void cachefiles_shorten_content_map(struct cachefiles_object *object, loff_t new_size); +extern bool cachefiles_load_content_map(struct cachefiles_object *object); +extern void cachefiles_save_content_map(struct cachefiles_object *object); + /* * daemon.c */ diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index d17734455af2..e324b835b1a0 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -61,6 +61,10 @@ bool cachefiles_open_object(struct cachefiles_object *object) path.mnt = cache->mnt; path.dentry = object->dentry; + if (object->content_info == CACHEFILES_CONTENT_MAP && + !cachefiles_load_content_map(object)) + goto error; + file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT, d_backing_inode(object->dentry), diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index 17c16c2bd07e..a1d4a3d1db69 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -16,8 +16,11 @@ #include "internal.h" struct cachefiles_xattr { - uint8_t type; - uint8_t data[]; + __be64 object_size; /* Actual size of the object */ + __be64 zero_point; /* Size after which server has no data not written by us */ + __u8 type; /* Type of object */ + __u8 content; /* Content presence (enum cachefiles_content) */ + __u8 data[]; /* netfs coherency data */ } __packed; static const char cachefiles_xattr_cache[] = @@ -118,7 +121,10 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object, if (!buf) return -ENOMEM; - buf->type = object->fscache.cookie->type; + buf->object_size = cpu_to_be64(object->fscache.cookie->object_size); + buf->zero_point = cpu_to_be64(object->fscache.cookie->zero_point); + buf->type = object->fscache.cookie->type; + buf->content = object->content_info; if (len > 0) memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len); @@ -127,7 +133,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object, xattr_flags); if (ret < 0) { trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, - 0, + buf->content, cachefiles_coherency_set_fail); if (ret != -ENOMEM) cachefiles_io_error_obj( @@ -135,7 +141,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object, "Failed to set xattr with error %d", ret); } else { trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, - 0, + buf->content, cachefiles_coherency_set_ok); } @@ -174,15 +180,21 @@ int cachefiles_check_auxdata(struct cachefiles_object *object) why = cachefiles_coherency_check_xattr; } else if (buf->type != object->fscache.cookie->type) { why = cachefiles_coherency_check_type; + } else if (buf->content >= nr__cachefiles_content) { + why = cachefiles_coherency_check_content; } else if (memcmp(buf->data, p, len) != 0) { why = cachefiles_coherency_check_aux; + } else if (be64_to_cpu(buf->object_size) != object->fscache.cookie->object_size) { + why = cachefiles_coherency_check_objsize; } else { + object->fscache.cookie->zero_point = be64_to_cpu(buf->zero_point); + object->content_info = buf->content; why = cachefiles_coherency_check_ok; ret = 0; } trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, - 0, why); + buf->content, why); kfree(buf); return ret; } diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index bf588c3f4a07..e7af1d683009 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -324,7 +324,7 @@ TRACE_EVENT(cachefiles_mark_buried, TRACE_EVENT(cachefiles_coherency, 
	    TP_PROTO(struct cachefiles_object *obj,
		     ino_t ino,
-		     int content,
+		     enum cachefiles_content content,
		     enum cachefiles_coherency_trace why),

	    TP_ARGS(obj, ino, content, why),

@@ -333,7 +333,7 @@ TRACE_EVENT(cachefiles_coherency,
	    TP_STRUCT__entry(
		    __field(unsigned int,			obj	)
		    __field(enum cachefiles_coherency_trace,	why	)
-		    __field(int,				content	)
+		    __field(enum cachefiles_content,		content	)
		    __field(u64,				ino	)
		    ),

From patchwork Mon Jul 13 16:34:22 2020
Subject: [PATCH 20/32] cachefiles: Implement extent shaper
From: David Howells
Date: Mon, 13 Jul 2020 17:34:22 +0100
Message-ID: <159465806243.1376674.18132863053001232748.stgit@warthog.procyon.org.uk>

Implement the function that shapes extents to map onto the granules in a
cache file.

When preparing to fetch data from the server to be cached, the extent is
expanded to align with the granule size and trimmed so that it doesn't
cross the boundary between a non-present extent and a present extent.

When preparing to read data from the cache, the extent is trimmed so that
it doesn't cross the boundary between a present extent and a non-present
extent.

If no caching is taking place, the requested extent is passed through
unaltered.

Signed-off-by: David Howells
---
 fs/cachefiles/content-map.c |  217 ++++++++++++++++++++++++++++++++++++-------
 fs/cachefiles/internal.h    |    4 -
 fs/cachefiles/io.c          |   10 --
 3 files changed, 184 insertions(+), 47 deletions(-)

diff --git a/fs/cachefiles/content-map.c b/fs/cachefiles/content-map.c
index 594624cb1cb9..91c44bb39a93 100644
--- a/fs/cachefiles/content-map.c
+++ b/fs/cachefiles/content-map.c
@@ -15,6 +15,31 @@
 static const char cachefiles_xattr_content_map[] =
	XATTR_USER_PREFIX "CacheFiles.content";

+/*
+ * Determine the map size for a granulated object.
+ *
+ * There's one bit per granule.  The map is sized in terms of 8-byte
+ * chunks: one 64-bit chunk of 256KiB granules covers 16MiB of file space,
+ * so 512B of map covers 1GiB.
+ */
+static size_t cachefiles_map_size(loff_t i_size)
+{
+	loff_t size;
+	size_t granules, bits, bytes, map_size;
+
+	if (i_size <= CACHEFILES_GRAN_SIZE * 64)
+		return 8;
+
+	size = i_size + CACHEFILES_GRAN_SIZE - 1;
+	granules = size / CACHEFILES_GRAN_SIZE;
+	bits = granules + (64 - 1);
+	bits &= ~(64 - 1);
+	bytes = bits / 8;
+	map_size = roundup_pow_of_two(bytes);
+	_leave(" = %zx [i=%llx g=%zu b=%zu]", map_size, i_size, granules, bits);
+	return map_size;
+}
+
 static bool cachefiles_granule_is_present(struct cachefiles_object *object,
					  size_t granule)
 {
@@ -28,6 +53,145 @@ static bool cachefiles_granule_is_present(struct cachefiles_object *object,
	return res;
 }

+/*
+ * Shape the extent of a single-chunk data object.
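+ * (Such an object is cached all-or-nothing: if the content is already
+ * present the read is pointed at the cache; otherwise the whole file,
+ * up to EOF, is scheduled for writing to the cache.)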
+ */ +static void cachefiles_shape_single(struct fscache_object *obj, + struct fscache_request_shape *shape) +{ + struct cachefiles_object *object = + container_of(obj, struct cachefiles_object, fscache); + pgoff_t eof; + + _enter("{%lx,%x,%x},%llx,%d", + shape->proposed_start, shape->proposed_nr_pages, + shape->max_io_pages, shape->i_size, shape->for_write); + + shape->dio_block_size = CACHEFILES_DIO_BLOCK_SIZE; + + if (object->content_info == CACHEFILES_CONTENT_SINGLE) { + shape->to_be_done = FSCACHE_READ_FROM_CACHE; + } else { + eof = (shape->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT; + + shape->actual_start = 0; + shape->actual_nr_pages = eof; + shape->granularity = 0; + shape->to_be_done = FSCACHE_WRITE_TO_CACHE; + } +} + +/* + * Determine the size of a data extent in a cache object. + * + * In cachefiles, a data cache object is divided into granules of 256KiB, each + * of which must be written as a whole unit when the cache is being loaded. + * Data may be read out piecemeal. + * + * The extent is resized, but the result will always contain the starting page + * from the extent. + * + * If the granule does not exist in the cachefile, the start may be brought + * forward to align with the beginning of a granule boundary, and the end may be + * moved either way to align also. The extent will be cut off it it would cross + * over the boundary between what's cached and what's not. + * + * If the starting granule does exist in the cachefile, the extent will be + * shortened, if necessary, so that it doesn't cross over into a region that is + * not present. + * + * If the granule does not exist and we cannot cache it for lack of space, the + * requested extent is left unaltered. + */ +void cachefiles_shape_request(struct fscache_object *obj, + struct fscache_request_shape *shape) +{ + struct cachefiles_object *object = + container_of(obj, struct cachefiles_object, fscache); + unsigned int max_pages; + pgoff_t start, end, eof, bend; + size_t granule; + + if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { + cachefiles_shape_single(obj, shape); + goto out; + } + + start = shape->proposed_start; + end = shape->proposed_start + shape->proposed_nr_pages; + max_pages = shape->max_io_pages; + _enter("{%lx,%lx,%x},%llx,%d", + start, end, max_pages, shape->i_size, shape->for_write); + + max_pages = round_down(max_pages, CACHEFILES_GRAN_PAGES); + if (end - start > max_pages) + end = start + max_pages; + + /* If the content map didn't get expanded for some reason - simply + * ignore this granule. + */ + granule = start / CACHEFILES_GRAN_PAGES; + if (granule / 8 >= object->content_map_size) + return; + + if (cachefiles_granule_is_present(object, granule)) { + /* The start of the requested extent is present in the cache - + * restrict the returned extent to the maximum length of what's + * available. + */ + bend = round_up(start + 1, CACHEFILES_GRAN_PAGES); + while (bend < end) { + pgoff_t i = round_up(bend + 1, CACHEFILES_GRAN_PAGES); + granule = i / CACHEFILES_GRAN_PAGES; + if (!cachefiles_granule_is_present(object, granule)) + break; + bend = i; + } + + if (end > bend) + end = bend; + shape->to_be_done = FSCACHE_READ_FROM_CACHE; + } else { + /* Otherwise expand the extent in both directions to cover what + * we want for caching purposes. 
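+		 * For example, with 4KiB pages a granule is 64 pages: a
+		 * proposed extent of pages 70-75 in an uncached region is
+		 * expanded to pages 64-127, one whole granule, for writing
+		 * to the cache.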
+ */ + start = round_down(start, CACHEFILES_GRAN_PAGES); + end = round_up(end, CACHEFILES_GRAN_PAGES); + + /* But trim to the end of the file and the starting page */ + eof = (shape->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT; + if (eof <= shape->proposed_start) + eof = shape->proposed_start + 1; + if (end > eof) + end = eof; + + if ((start << PAGE_SHIFT) >= object->fscache.cookie->zero_point) { + /* The start of the requested extent is beyond the + * original EOF of the file on the server - therefore + * it's not going to be found on the server. + */ + end = round_up(start + 1, CACHEFILES_GRAN_PAGES); + shape->to_be_done = FSCACHE_FILL_WITH_ZERO; + } else { + end = start + CACHEFILES_GRAN_PAGES; + if (end > eof) + end = eof; + shape->to_be_done = FSCACHE_WRITE_TO_CACHE; + } + + /* TODO: Check we have space in the cache */ + } + + shape->actual_start = start; + shape->actual_nr_pages = end - start; + shape->granularity = CACHEFILES_GRAN_PAGES; + shape->dio_block_size = CACHEFILES_DIO_BLOCK_SIZE; + +out: + _leave(" [%x,%lx,%x]", + shape->to_be_done, shape->actual_start, shape->actual_nr_pages); +} + /* * Mark the content map to indicate stored granule. */ @@ -74,23 +238,14 @@ void cachefiles_mark_content_map(struct fscache_io_request *req) /* * Expand the content map to a larger file size. */ -void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t size) +void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t i_size) { + size_t size; u8 *map, *zap; - /* Determine the size. There's one bit per granule. We size it in - * terms of 8-byte chunks, where a 64-bit span * 256KiB bytes granules - * covers 16MiB of file space. At that, 512B will cover 1GiB. - */ - if (size > 0) { - size += CACHEFILES_GRAN_SIZE - 1; - size /= CACHEFILES_GRAN_SIZE; - size += 8 - 1; - size /= 8; - size = roundup_pow_of_two(size); - } else { - size = 8; - } + size = cachefiles_map_size(i_size); + + _enter("%llx,%zx,%x", i_size, size, object->content_map_size); if (size <= object->content_map_size) return; @@ -122,7 +277,7 @@ void cachefiles_shorten_content_map(struct cachefiles_object *object, loff_t new_size) { struct fscache_cookie *cookie = object->fscache.cookie; - loff_t granule, o_granule; + size_t granule, tmp, bytes; if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) return; @@ -137,12 +292,16 @@ void cachefiles_shorten_content_map(struct cachefiles_object *object, granule += CACHEFILES_GRAN_SIZE - 1; granule /= CACHEFILES_GRAN_SIZE; - o_granule = cookie->object_size; - o_granule += CACHEFILES_GRAN_SIZE - 1; - o_granule /= CACHEFILES_GRAN_SIZE; + tmp = granule; + tmp = round_up(granule, 64); + bytes = tmp / 8; + if (bytes < object->content_map_size) + memset(object->content_map + bytes, 0, + object->content_map_size - bytes); - for (; o_granule > granule; o_granule--) - clear_bit_le(o_granule, object->content_map); + if (tmp > granule) + for (tmp--; tmp > granule; tmp--) + clear_bit_le(tmp, object->content_map); } write_unlock_bh(&object->content_map_lock); @@ -157,7 +316,7 @@ bool cachefiles_load_content_map(struct cachefiles_object *object) struct cachefiles_cache, cache); const struct cred *saved_cred; ssize_t got; - loff_t size; + size_t size; u8 *map = NULL; _enter("c=%08x,%llx", @@ -176,19 +335,7 @@ bool cachefiles_load_content_map(struct cachefiles_object *object) * bytes granules covers 16MiB of file space. At that, 512B * will cover 1GiB. 
*/ - size = object->fscache.cookie->object_size; - if (size > 0) { - size += CACHEFILES_GRAN_SIZE - 1; - size /= CACHEFILES_GRAN_SIZE; - size += 8 - 1; - size /= 8; - if (size < 8) - size = 8; - size = roundup_pow_of_two(size); - } else { - size = 8; - } - + size = cachefiles_map_size(object->fscache.cookie->object_size); map = kzalloc(size, GFP_KERNEL); if (!map) return false; @@ -212,7 +359,7 @@ bool cachefiles_load_content_map(struct cachefiles_object *object) object->content_map = map; object->content_map_size = size; object->content_info = CACHEFILES_CONTENT_MAP; - _leave(" = t [%zd/%llu %*phN]", got, size, (int)size, map); + _leave(" = t [%zd/%zu %*phN]", got, size, (int)size, map); } return true; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 4085c1185693..2ea469b77712 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -125,6 +125,8 @@ extern void cachefiles_daemon_unbind(struct cachefiles_cache *cache); /* * content-map.c */ +extern void cachefiles_shape_request(struct fscache_object *object, + struct fscache_request_shape *shape); extern void cachefiles_mark_content_map(struct fscache_io_request *req); extern void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t size); extern void cachefiles_shorten_content_map(struct cachefiles_object *object, loff_t new_size); @@ -149,8 +151,6 @@ extern struct fscache_object *cachefiles_grab_object(struct fscache_object *_obj /* * io.c */ -extern void cachefiles_shape_request(struct fscache_object *object, - struct fscache_request_shape *shape); extern int cachefiles_read(struct fscache_object *object, struct fscache_io_request *req, struct iov_iter *iter); diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index e324b835b1a0..ddb44ec5a199 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -12,16 +12,6 @@ #include #include "internal.h" -/* - * Determine the size of a data extent in a cache object. This must be written - * as a whole unit, but can be read piecemeal. - */ -void cachefiles_shape_request(struct fscache_object *object, - struct fscache_request_shape *shape) -{ - return 0; -} - /* * Initiate a read from the cache. 
 */

From patchwork Mon Jul 13 16:34:34 2020
Subject: [PATCH 21/32] cachefiles: Round the cachefile size up to DIO block size
From: David Howells
Date: Mon, 13 Jul 2020 17:34:34 +0100
Message-ID: <159465807406.1376674.8117873071279426760.stgit@warthog.procyon.org.uk>

Round the size of a cachefile up to the DIO block size so that we can
always read back the last partial page of a file using direct I/O.

Signed-off-by: David Howells
---
 fs/cachefiles/interface.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index de4fb41103a6..054d5cc794b5 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -184,6 +184,17 @@ static void cachefiles_update_object(struct fscache_object *_object)
			cachefiles_remove_object_xattr(cache, object->dentry);
			goto out;
		}
+
+		object_size = round_up(object_size, CACHEFILES_DIO_BLOCK_SIZE);
+		_debug("trunc %llx -> %llx", i_size_read(d_inode(object->dentry)), object_size);
+		if (i_size_read(d_inode(object->dentry)) < object_size) {
+			ret = vfs_truncate(&path, object_size);
+			if (ret < 0) {
+				cachefiles_io_error_obj(object, "Trunc-to-dio-size failed");
+				cachefiles_remove_object_xattr(cache, object->dentry);
+				goto out;
+			}
+		}
	}

	cachefiles_set_object_xattr(object, XATTR_REPLACE);
@@ -354,6 +365,7 @@ static int cachefiles_attr_changed(struct cachefiles_object *object)
	int ret;

	ni_size = object->fscache.cookie->object_size;
+	ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);

	_enter("{OBJ%x},[%llu]",
	       object->fscache.debug_id, (unsigned long long) ni_size);
@@ -422,6 +434,7 @@ static void cachefiles_invalidate_object(struct fscache_object *_object)
					   struct cachefiles_cache, cache);

	ni_size = object->fscache.cookie->object_size;
+	ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE);

	_enter("{OBJ%x},[%llu]",
	       object->fscache.debug_id, (unsigned long long)ni_size);

From patchwork Mon Jul 13 16:34:45 2020
Subject: [PATCH 22/32] cachefiles: Implement read and write parts of new I/O API
From: David Howells
Date: Mon, 13 Jul 2020 17:34:45 +0100
Message-ID: <159465808553.1376674.11788737980809596736.stgit@warthog.procyon.org.uk>

Implement writing into the cache and reading back from the cache inside
cachefiles using asynchronous direct I/O from the specified iterator.

The size and position of the request should be aligned to the reported
dio_block_size.  Errors and completion are reported by callback.
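As a minimal sketch of how a caller is expected to drive this API (the
request fields and cachefiles_read() are as shown in this series; the
my_*() glue is hypothetical):

	/* Completion runs via ->io_done, possibly in softirq context. */
	static void my_read_done(struct fscache_io_request *req)
	{
		if (req->error)
			pr_warn("cache read failed: %d\n", req->error);
		else
			pr_debug("got %llu bytes from the cache\n",
				 req->transferred);
	}

	static int my_start_cache_read(struct fscache_object *obj,
				       struct fscache_io_request *req,
				       struct iov_iter *iter)
	{
		/* req->pos and req->len must be dio_block_size-aligned */
		req->io_done = my_read_done;
		return cachefiles_read(obj, req, iter);
	}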
Signed-off-by: David Howells --- fs/cachefiles/io.c | 208 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 202 insertions(+), 6 deletions(-) diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index ddb44ec5a199..42e0d620d778 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -12,30 +12,226 @@ #include #include "internal.h" +struct cachefiles_kiocb { + struct kiocb iocb; + struct fscache_io_request *req; + refcount_t ki_refcnt; +}; + +static inline void cachefiles_put_kiocb(struct cachefiles_kiocb *ki) +{ + if (refcount_dec_and_test(&ki->ki_refcnt)) { + fscache_put_io_request(ki->req); + fput(ki->iocb.ki_filp); + kfree(ki); + } +} + +/* + * Handle completion of a read from the cache. + */ +static void cachefiles_read_complete(struct kiocb *iocb, long ret, long ret2) +{ + struct cachefiles_kiocb *ki = container_of(iocb, struct cachefiles_kiocb, iocb); + struct fscache_io_request *req = ki->req; + + _enter("%llx,%ld,%ld", req->len, ret, ret2); + + fscache_end_io_operation(req->cookie); + + if (ret < 0) { + req->error = ret; + } else if (ret != req->len) { + req->error = -ENODATA; + } else { + req->transferred = ret; + set_bit(FSCACHE_IO_DATA_FROM_CACHE, &req->flags); + } + if (req->io_done) + req->io_done(req); + cachefiles_put_kiocb(ki); +} + /* * Initiate a read from the cache. */ -int cachefiles_read(struct fscache_object *object, +int cachefiles_read(struct fscache_object *obj, struct fscache_io_request *req, struct iov_iter *iter) { - req->error = -ENODATA; + struct cachefiles_object *object = + container_of(obj, struct cachefiles_object, fscache); + struct cachefiles_kiocb *ki; + struct file *file = object->backing_file; + ssize_t ret = -ENOBUFS; + + _enter("%pD,%li,%llx,%llx/%llx", + file, file_inode(file)->i_ino, req->pos, req->len, i_size_read(file->f_inode)); + + ki = kzalloc(sizeof(struct cachefiles_kiocb), GFP_KERNEL); + if (!ki) + goto presubmission_error; + + refcount_set(&ki->ki_refcnt, 2); + ki->iocb.ki_filp = get_file(file); + ki->iocb.ki_pos = req->pos; + ki->iocb.ki_flags = IOCB_DIRECT; + ki->iocb.ki_hint = ki_hint_validate(file_write_hint(file)); + ki->iocb.ki_ioprio = get_current_ioprio(); + ki->req = req; + + if (req->io_done) + ki->iocb.ki_complete = cachefiles_read_complete; + + ret = rw_verify_area(READ, file, &ki->iocb.ki_pos, iov_iter_count(iter)); + if (ret < 0) + goto presubmission_error_free; + + fscache_get_io_request(req); + ret = call_read_iter(file, &ki->iocb, iter); + switch (ret) { + case -EIOCBQUEUED: + goto in_progress; + + case -ERESTARTSYS: + case -ERESTARTNOINTR: + case -ERESTARTNOHAND: + case -ERESTART_RESTARTBLOCK: + /* There's no easy way to restart the syscall since other AIO's + * may be already running. Just fail this IO with EINTR. + */ + ret = -EINTR; + /* Fall through */ + default: + cachefiles_read_complete(&ki->iocb, ret, 0); + if (ret > 0) + ret = 0; + break; + } + +in_progress: + cachefiles_put_kiocb(ki); + _leave(" = %zd", ret); + return ret; + +presubmission_error_free: + fput(file); + kfree(ki); +presubmission_error: + req->error = -ENOMEM; + if (req->io_done) + req->io_done(req); + return -ENOMEM; +} + +/* + * Handle completion of a write to the cache. 
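+ * (The submission path took SB_FREEZE_WRITE protection on the backing
+ * filesystem and told lockdep it was released; it is actually dropped
+ * here, on whichever thread runs the completion.)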
+ */ +static void cachefiles_write_complete(struct kiocb *iocb, long ret, long ret2) +{ + struct cachefiles_kiocb *ki = container_of(iocb, struct cachefiles_kiocb, iocb); + struct fscache_io_request *req = ki->req; + struct inode *inode = file_inode(ki->iocb.ki_filp); + + _enter("%llx,%ld,%ld", req->len, ret, ret2); + + /* Tell lockdep we inherited freeze protection from submission thread */ + __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); + __sb_end_write(inode->i_sb, SB_FREEZE_WRITE); + + fscache_end_io_operation(req->cookie); + + if (ret < 0) + req->error = ret; + else if (ret != req->len) + req->error = -ENOBUFS; + else + cachefiles_mark_content_map(req); if (req->io_done) req->io_done(req); - return -ENODATA; + cachefiles_put_kiocb(ki); } /* * Initiate a write to the cache. */ -int cachefiles_write(struct fscache_object *object, +int cachefiles_write(struct fscache_object *obj, struct fscache_io_request *req, struct iov_iter *iter) { - req->error = -ENOBUFS; + struct cachefiles_object *object = + container_of(obj, struct cachefiles_object, fscache); + struct cachefiles_kiocb *ki; + struct inode *inode; + struct file *file = object->backing_file; + ssize_t ret = -ENOBUFS; + + _enter("%pD,%li,%llx,%llx/%llx", + file, file_inode(file)->i_ino, req->pos, req->len, i_size_read(file->f_inode)); + + ki = kzalloc(sizeof(struct cachefiles_kiocb), GFP_KERNEL); + if (!ki) + goto presubmission_error; + + refcount_set(&ki->ki_refcnt, 2); + ki->iocb.ki_filp = get_file(file); + ki->iocb.ki_pos = req->pos; + ki->iocb.ki_flags = IOCB_DIRECT | IOCB_WRITE; + ki->iocb.ki_hint = ki_hint_validate(file_write_hint(file)); + ki->iocb.ki_ioprio = get_current_ioprio(); + ki->req = req; + + if (req->io_done) + ki->iocb.ki_complete = cachefiles_write_complete; + + ret = rw_verify_area(WRITE, file, &ki->iocb.ki_pos, iov_iter_count(iter)); + if (ret < 0) + goto presubmission_error_free; + + /* Open-code file_start_write here to grab freeze protection, which + * will be released by another thread in aio_complete_rw(). Fool + * lockdep by telling it the lock got released so that it doesn't + * complain about the held lock when we return to userspace. + */ + inode = file_inode(file); + __sb_start_write(inode->i_sb, SB_FREEZE_WRITE, true); + __sb_writers_release(inode->i_sb, SB_FREEZE_WRITE); + + fscache_get_io_request(req); + ret = call_write_iter(file, &ki->iocb, iter); + switch (ret) { + case -EIOCBQUEUED: + goto in_progress; + + case -ERESTARTSYS: + case -ERESTARTNOINTR: + case -ERESTARTNOHAND: + case -ERESTART_RESTARTBLOCK: + /* There's no easy way to restart the syscall since other AIO's + * may be already running. Just fail this IO with EINTR. 
+	 */
+	ret = -EINTR;
+	/* Fall through */
+	default:
+		cachefiles_write_complete(&ki->iocb, ret, 0);
+		if (ret > 0)
+			ret = 0;
+		break;
+	}
+
+in_progress:
+	cachefiles_put_kiocb(ki);
+	_leave(" = %zd", ret);
+	return ret;
+
+presubmission_error_free:
+	fput(file);
+	kfree(ki);
+presubmission_error:
+	req->error = -ENOMEM;
	if (req->io_done)
		req->io_done(req);
-	return -ENOBUFS;
+	return -ENOMEM;
 }

 /*

From patchwork Mon Jul 13 16:34:57 2020
Subject: [PATCH 23/32] cachefiles: Add I/O tracepoints
From: David Howells
Date: Mon, 13 Jul 2020 17:34:57 +0100
Message-ID: <159465809699.1376674.8132002248953593870.stgit@warthog.procyon.org.uk>

---
 fs/cachefiles/interface.c         |   16 +++--
 fs/cachefiles/io.c                |    2 +
 include/trace/events/cachefiles.h |  123 +++++++++++++++++++++++++++++++++++++
 3 files changed, 136 insertions(+), 5 deletions(-)

diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 054d5cc794b5..e73de62d0e73 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -160,7 +160,8 @@ static void cachefiles_update_object(struct fscache_object *_object)
	struct cachefiles_object *object;
	struct cachefiles_cache *cache;
	const struct cred *saved_cred;
-	loff_t object_size;
+	struct inode *inode;
+	loff_t object_size, i_size;
	int ret;

	_enter("{OBJ%x}", _object->debug_id);
@@ -172,12 +173,15 @@ static void cachefiles_update_object(struct fscache_object *_object)
	cachefiles_begin_secure(cache, &saved_cred);
	object_size = object->fscache.cookie->object_size;
-	if (i_size_read(d_inode(object->dentry)) > object_size) {
+	inode = d_inode(object->dentry);
+	i_size = i_size_read(inode);
+	if (i_size > object_size) {
		struct path path = {
			.mnt	= cache->mnt,
			.dentry	= object->dentry
		};
-		_debug("trunc %llx -> %llx", i_size_read(d_inode(object->dentry)), object_size);
+		_debug("trunc %llx -> %llx", i_size, object_size);
+		trace_cachefiles_trunc(object, inode, i_size, object_size);
		ret = vfs_truncate(&path, object_size);
		if (ret < 0) {
			cachefiles_io_error_obj(object, "Trunc-to-size failed");
@@ -186,8 +190,10 @@ static void cachefiles_update_object(struct fscache_object *_object)
	}

	object_size = round_up(object_size, CACHEFILES_DIO_BLOCK_SIZE);
-	_debug("trunc %llx -> %llx", i_size_read(d_inode(object->dentry)), object_size);
-	if (i_size_read(d_inode(object->dentry)) < object_size) {
+	i_size = i_size_read(inode);
+	_debug("trunc %llx -> %llx", i_size, object_size);
+	if (i_size < object_size) {
+		trace_cachefiles_trunc(object, inode, i_size, object_size);
		ret = vfs_truncate(&path, object_size);
		if (ret < 0) {
			cachefiles_io_error_obj(object, "Trunc-to-dio-size failed");

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index 42e0d620d778..268e6f69ba9c 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -88,6 +88,7 @@ int cachefiles_read(struct fscache_object *obj,
		goto presubmission_error_free;

	fscache_get_io_request(req);
+	trace_cachefiles_read(object, file_inode(file), req);
	ret = call_read_iter(file, &ki->iocb, iter);
	switch (ret) {
	case -EIOCBQUEUED:
@@ -198,6 +199,7 @@ int cachefiles_write(struct fscache_object *obj,
	__sb_writers_release(inode->i_sb, SB_FREEZE_WRITE);

	fscache_get_io_request(req);
+	trace_cachefiles_write(object, inode, req);
	ret =
call_write_iter(file, &ki->iocb, iter); switch (ret) { case -EIOCBQUEUED: diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h index e7af1d683009..d83568e8fee8 100644 --- a/include/trace/events/cachefiles.h +++ b/include/trace/events/cachefiles.h @@ -351,6 +351,129 @@ TRACE_EVENT(cachefiles_coherency, __entry->content) ); +TRACE_EVENT(cachefiles_read, + TP_PROTO(struct cachefiles_object *obj, + struct inode *backer, + struct fscache_io_request *req), + + TP_ARGS(obj, backer, req), + + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(unsigned int, backer ) + __field(unsigned int, len ) + __field(loff_t, pos ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->backer = backer->i_ino; + __entry->pos = req->pos; + __entry->len = req->len; + ), + + TP_printk("o=%08x b=%08x p=%llx l=%x", + __entry->obj, + __entry->backer, + __entry->pos, + __entry->len) + ); + +TRACE_EVENT(cachefiles_write, + TP_PROTO(struct cachefiles_object *obj, + struct inode *backer, + struct fscache_io_request *req), + + TP_ARGS(obj, backer, req), + + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(unsigned int, backer ) + __field(unsigned int, len ) + __field(loff_t, pos ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->backer = backer->i_ino; + __entry->pos = req->pos; + __entry->len = req->len; + ), + + TP_printk("o=%08x b=%08x p=%llx l=%x", + __entry->obj, + __entry->backer, + __entry->pos, + __entry->len) + ); + +TRACE_EVENT(cachefiles_trunc, + TP_PROTO(struct cachefiles_object *obj, struct inode *backer, + loff_t from, loff_t to), + + TP_ARGS(obj, backer, from, to), + + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(unsigned int, backer ) + __field(loff_t, from ) + __field(loff_t, to ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->backer = backer->i_ino; + __entry->from = from; + __entry->to = to; + ), + + TP_printk("o=%08x b=%08x l=%llx->%llx", + __entry->obj, + __entry->backer, + __entry->from, + __entry->to) + ); + +TRACE_EVENT(cachefiles_tmpfile, + TP_PROTO(struct cachefiles_object *obj, struct inode *backer), + + TP_ARGS(obj, backer), + + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(unsigned int, backer ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->backer = backer->i_ino; + ), + + TP_printk("o=%08x b=%08x", + __entry->obj, + __entry->backer) + ); + +TRACE_EVENT(cachefiles_link, + TP_PROTO(struct cachefiles_object *obj, struct inode *backer), + + TP_ARGS(obj, backer), + + TP_STRUCT__entry( + __field(unsigned int, obj ) + __field(unsigned int, backer ) + ), + + TP_fast_assign( + __entry->obj = obj->fscache.debug_id; + __entry->backer = backer->i_ino; + ), + + TP_printk("o=%08x b=%08x", + __entry->obj, + __entry->backer) + ); + #endif /* _TRACE_CACHEFILES_H */ /* This part must be outside protection */ From patchwork Mon Jul 13 16:35:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660551 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A9AB417EE for ; Mon, 13 Jul 2020 16:35:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 84E7420738 for ; Mon, 13 Jul 2020 16:35:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) 
header.d=redhat.com header.i=@redhat.com header.b="RMOKuN7t" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730550AbgGMQf2 (ORCPT ); Mon, 13 Jul 2020 12:35:28 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:28233 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730542AbgGMQf0 (ORCPT ); Mon, 13 Jul 2020 12:35:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594658122; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JHC4V3SNcNWEeoqr0+yZp8TAkthvX50zruMsPOPj7go=; b=RMOKuN7toljblqAo2BiG8WzbxbHIaaEDJVCCpn1TpCi9KecS3xg3JAgqQyCIeh7Lbotqdq DSgBkeJHyxmyMkuK83uySxXGIh7TAS2JB17tryHILgINuVYESwlcShT5hQcPeFIdUG2Ajt wAzEpKDQALAfBO8+PgUDj38aUvWOx98= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-280-q2J_Mu_0PfeF9CIG-wJgyA-1; Mon, 13 Jul 2020 12:35:17 -0400 X-MC-Unique: q2J_Mu_0PfeF9CIG-wJgyA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 41E038015CB; Mon, 13 Jul 2020 16:35:15 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6CE0A6FDD1; Mon, 13 Jul 2020 16:35:09 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH 24/32] fscache: Add read helper From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:35:08 +0100 Message-ID: <159465810864.1376674.10267227421160756746.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add a trio of helper functions: fscache_read_helper_page_list(); fscache_read_helper_locked_page(); fscache_read_helper_for_write(); to do the work of shaping read requests, attempting to read from the cache, issuing or reissuing requests to the filesystem to pass to the server and writing back to the filesystem. The filesystem passes in a prepared request descriptor with the fscache descriptor embedded in it to one of the helper functions. The caller also indicates which page(s) it is interested in and provides some operations to issue reads and manage the request descriptor. 
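A minimal sketch of the netfs side, assuming an ops table wired to the
issue_op and done hooks that the helper invokes (the structure name and
exact signatures are assumptions; only req->ops->issue_op() and
req->ops->done() are visible in this excerpt):

	/* Hypothetical netfs glue, for illustration only. */
	static void my_issue_op(struct fscache_io_request *req)
	{
		/* Ask the server for req->pos/req->len, then call
		 * req->io_done(req) once the data is in the pagecache. */
	}

	static void my_done(struct fscache_io_request *req)
	{
		/* Final per-request accounting and cleanup. */
	}

	static const struct fscache_io_request_ops my_ops = {
		.issue_op	= my_issue_op,
		.done		= my_done,
	};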
The helper is placed into its own module, fsinfo_support.ko, which must be enabled unconditionally by any filesystem which wishes to use the helper even if CONFIG_FSCACHE=no. This module is selected by CONFIG_FSCACHE_SUPPORT. About half of the code is optimised away by CONFIG_FSCACHE=no. Also add a tracepoint to track calls. A set of 'notes' are taken to record the path through the function and this is dumped into the trace. Signed-off-by: David Howells --- fs/Makefile | 2 fs/fscache/Kconfig | 4 fs/fscache/Makefile | 3 fs/fscache/internal.h | 8 fs/fscache/main.c | 1 fs/fscache/read_helper.c | 656 ++++++++++++++++++++++++++++++++ include/linux/fscache.h | 26 + include/trace/events/fscache_support.h | 91 ++++ 8 files changed, 789 insertions(+), 2 deletions(-) create mode 100644 fs/fscache/read_helper.c create mode 100644 include/trace/events/fscache_support.h diff --git a/fs/Makefile b/fs/Makefile index 2ce5112b02c8..8b0a5b5b1d86 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -68,7 +68,7 @@ obj-$(CONFIG_PROFILING) += dcookies.o obj-$(CONFIG_DLM) += dlm/ # Do not add any filesystems before this line -obj-$(CONFIG_FSCACHE) += fscache/ +obj-$(CONFIG_FSCACHE_SUPPORT) += fscache/ obj-$(CONFIG_REISERFS_FS) += reiserfs/ obj-$(CONFIG_EXT4_FS) += ext4/ # We place ext4 before ext2 so that clean ext3 root fs's do NOT mount using the diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig index ce6f731065d0..369c12ef0167 100644 --- a/fs/fscache/Kconfig +++ b/fs/fscache/Kconfig @@ -1,7 +1,11 @@ # SPDX-License-Identifier: GPL-2.0-only +config FSCACHE_SUPPORT + tristate "Support for local caching of network filesystems" + config FSCACHE tristate "General filesystem local caching manager" + depends on FSCACHE_SUPPORT help This option enables a generic filesystem caching manager that can be used by various network and other filesystems to cache data locally. diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile index 3caf66810e7b..0a5c8c654942 100644 --- a/fs/fscache/Makefile +++ b/fs/fscache/Makefile @@ -20,3 +20,6 @@ fscache-$(CONFIG_FSCACHE_HISTOGRAM) += histogram.o fscache-$(CONFIG_FSCACHE_OBJECT_LIST) += object-list.o obj-$(CONFIG_FSCACHE) := fscache.o + +fscache_support-y := read_helper.o +obj-$(CONFIG_FSCACHE_SUPPORT) += fscache_support.o diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index a70c1a612309..2674438ccafd 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -30,6 +30,8 @@ #include #include +#if IS_ENABLED(CONFIG_FSCACHE) + #define FSCACHE_MIN_THREADS 4 #define FSCACHE_MAX_THREADS 32 @@ -266,6 +268,12 @@ void fscache_update_aux(struct fscache_cookie *cookie, cookie->object_size = *object_size; } +#else /* CONFIG_FSCACHE */ + +#define fscache_op_wq system_wq + +#endif /* CONFIG_FSCACHE */ + /*****************************************************************************/ /* * debug tracing diff --git a/fs/fscache/main.c b/fs/fscache/main.c index b2691439377b..ac4fd4d59479 100644 --- a/fs/fscache/main.c +++ b/fs/fscache/main.c @@ -39,6 +39,7 @@ MODULE_PARM_DESC(fscache_debug, struct kobject *fscache_root; struct workqueue_struct *fscache_op_wq; +EXPORT_SYMBOL(fscache_op_wq); /* these values serve as lower bounds, will be adjusted in fscache_init() */ static unsigned fscache_object_max_active = 4; diff --git a/fs/fscache/read_helper.c b/fs/fscache/read_helper.c new file mode 100644 index 000000000000..62fed27aa938 --- /dev/null +++ b/fs/fscache/read_helper.c @@ -0,0 +1,656 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Read helper. 
Signed-off-by: David Howells --- fs/Makefile | 2 fs/fscache/Kconfig | 4 fs/fscache/Makefile | 3 fs/fscache/internal.h | 8 fs/fscache/main.c | 1 fs/fscache/read_helper.c | 656 ++++++++++++++++++++++++++++++++ include/linux/fscache.h | 26 + include/trace/events/fscache_support.h | 91 ++++ 8 files changed, 789 insertions(+), 2 deletions(-) create mode 100644 fs/fscache/read_helper.c create mode 100644 include/trace/events/fscache_support.h diff --git a/fs/Makefile b/fs/Makefile index 2ce5112b02c8..8b0a5b5b1d86 100644 --- a/fs/Makefile +++ b/fs/Makefile @@ -68,7 +68,7 @@ obj-$(CONFIG_PROFILING) += dcookies.o obj-$(CONFIG_DLM) += dlm/ # Do not add any filesystems before this line -obj-$(CONFIG_FSCACHE) += fscache/ +obj-$(CONFIG_FSCACHE_SUPPORT) += fscache/ obj-$(CONFIG_REISERFS_FS) += reiserfs/ obj-$(CONFIG_EXT4_FS) += ext4/ # We place ext4 before ext2 so that clean ext3 root fs's do NOT mount using the diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig index ce6f731065d0..369c12ef0167 100644 --- a/fs/fscache/Kconfig +++ b/fs/fscache/Kconfig @@ -1,7 +1,11 @@ # SPDX-License-Identifier: GPL-2.0-only +config FSCACHE_SUPPORT + tristate "Support for local caching of network filesystems" + config FSCACHE tristate "General filesystem local caching manager" + depends on FSCACHE_SUPPORT help This option enables a generic filesystem caching manager that can be used by various network and other filesystems to cache data locally. diff --git a/fs/fscache/Makefile b/fs/fscache/Makefile index 3caf66810e7b..0a5c8c654942 100644 --- a/fs/fscache/Makefile +++ b/fs/fscache/Makefile @@ -20,3 +20,6 @@ fscache-$(CONFIG_FSCACHE_HISTOGRAM) += histogram.o fscache-$(CONFIG_FSCACHE_OBJECT_LIST) += object-list.o obj-$(CONFIG_FSCACHE) := fscache.o + +fscache_support-y := read_helper.o +obj-$(CONFIG_FSCACHE_SUPPORT) += fscache_support.o diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index a70c1a612309..2674438ccafd 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -30,6 +30,8 @@ #include #include +#if IS_ENABLED(CONFIG_FSCACHE) + #define FSCACHE_MIN_THREADS 4 #define FSCACHE_MAX_THREADS 32 @@ -266,6 +268,12 @@ void fscache_update_aux(struct fscache_cookie *cookie, cookie->object_size = *object_size; } +#else /* CONFIG_FSCACHE */ + +#define fscache_op_wq system_wq + +#endif /* CONFIG_FSCACHE */ + /*****************************************************************************/ /* * debug tracing diff --git a/fs/fscache/main.c b/fs/fscache/main.c index b2691439377b..ac4fd4d59479 100644 --- a/fs/fscache/main.c +++ b/fs/fscache/main.c @@ -39,6 +39,7 @@ MODULE_PARM_DESC(fscache_debug, struct kobject *fscache_root; struct workqueue_struct *fscache_op_wq; +EXPORT_SYMBOL(fscache_op_wq); /* these values serve as lower bounds, will be adjusted in fscache_init() */ static unsigned fscache_object_max_active = 4; diff --git a/fs/fscache/read_helper.c b/fs/fscache/read_helper.c new file mode 100644 index 000000000000..62fed27aa938 --- /dev/null +++ b/fs/fscache/read_helper.c @@ -0,0 +1,656 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Read helper.
+ * + * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define FSCACHE_DEBUG_LEVEL OPERATION +#include +#include +#include +#include +#include +#include +#include "internal.h" +#define CREATE_TRACE_POINTS +#include + +MODULE_DESCRIPTION("FS Cache Manager Support"); +MODULE_AUTHOR("Red Hat, Inc."); +MODULE_LICENSE("GPL"); + +#define FSCACHE_RHLP_NOTE_READ_FROM_CACHE FSCACHE_READ_FROM_CACHE +#define FSCACHE_RHLP_NOTE_WRITE_TO_CACHE FSCACHE_WRITE_TO_CACHE +#define FSCACHE_RHLP_NOTE_FILL_WITH_ZERO FSCACHE_FILL_WITH_ZERO +#define FSCACHE_RHLP_NOTE_READ_FOR_WRITE 0x00000100 /* Type: FSCACHE_READ_FOR_WRITE */ +#define FSCACHE_RHLP_NOTE_READ_LOCKED_PAGE 0x00000200 /* Type: FSCACHE_READ_LOCKED_PAGE */ +#define FSCACHE_RHLP_NOTE_READ_PAGE_LIST 0x00000300 /* Type: FSCACHE_READ_PAGE_LIST */ +#define FSCACHE_RHLP_NOTE_LIST_NOMEM 0x00001000 /* Page list: ENOMEM */ +#define FSCACHE_RHLP_NOTE_LIST_U2D 0x00002000 /* Page list: page uptodate */ +#define FSCACHE_RHLP_NOTE_LIST_ERROR 0x00004000 /* Page list: add error */ +#define FSCACHE_RHLP_NOTE_TRAILER_ADD 0x00010000 /* Trailer: Creating */ +#define FSCACHE_RHLP_NOTE_TRAILER_NOMEM 0x00020000 /* Trailer: ENOMEM */ +#define FSCACHE_RHLP_NOTE_TRAILER_U2D 0x00040000 /* Trailer: Uptodate */ +#define FSCACHE_RHLP_NOTE_U2D_IN_PREFACE 0x00100000 /* Uptodate page in preface */ +#define FSCACHE_RHLP_NOTE_UNDERSIZED 0x00200000 /* Undersized block */ +#define FSCACHE_RHLP_NOTE_AFTER_EOF 0x00400000 /* After EOF */ +#define FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE 0x00800000 /* Actually write to the cache */ +#define FSCACHE_RHLP_NOTE_CANCELLED 0x80000000 /* Operation cancelled by netfs */ + +unsigned fscache_rh_debug; +module_param_named(debug, fscache_rh_debug, uint, S_IWUSR | S_IRUGO); +MODULE_PARM_DESC(fscache_rh_debug, "FS-Cache read helper debugging mask"); +#define fscache_debug fscache_rh_debug + +enum fscache_read_type { + FSCACHE_READ_PAGE_LIST, /* Read the list of pages (readpages) */ + FSCACHE_READ_LOCKED_PAGE, /* requested_page is added and locked */ + FSCACHE_READ_FOR_WRITE, /* This read is a prelude to write_begin */ +}; + +static void fscache_read_from_server(struct fscache_io_request *req) +{ + req->ops->issue_op(req); +} + +/* + * Deal with the completion of writing the data to the cache. We have to clear + * the PG_fscache bits on the pages involved and release the caller's ref. + */ +static void fscache_read_copy_done(struct fscache_io_request *req) +{ + struct page *page; + pgoff_t index = req->pos >> PAGE_SHIFT; + pgoff_t last = index + req->nr_pages - 1; + + XA_STATE(xas, &req->mapping->i_pages, index); + + _enter("%lx,%x,%llx", index, req->nr_pages, req->transferred); + + /* Clear PG_fscache on the pages that were being written out. */ + rcu_read_lock(); + xas_for_each(&xas, page, last) { + BUG_ON(xa_is_value(page)); + BUG_ON(PageCompound(page)); + + unlock_page_fscache(page); + } + rcu_read_unlock(); +} + +/* + * Write a completed read request to the cache.
+ */ +static void fscache_do_read_copy_to_cache(struct work_struct *work) +{ + struct fscache_io_request *req = + container_of(work, struct fscache_io_request, work); + struct iov_iter iter; + + _enter(""); + + iov_iter_mapping(&iter, WRITE, req->mapping, req->pos, + round_up(req->len, req->dio_block_size)); + + req->io_done = fscache_read_copy_done; + fscache_write(req, &iter); + fscache_put_io_request(req); +} + +static void fscache_read_copy_to_cache(struct fscache_io_request *req) +{ + fscache_get_io_request(req); + + if (!in_softirq()) + return fscache_do_read_copy_to_cache(&req->work); + + BUG_ON(work_pending(&req->work)); + INIT_WORK(&req->work, fscache_do_read_copy_to_cache); + if (!queue_work(fscache_op_wq, &req->work)) + BUG(); +} + +/* + * Clear the unread part of the file on a short read. + */ +static void fscache_clear_unread(struct fscache_io_request *req) +{ + struct iov_iter iter; + + iov_iter_mapping(&iter, WRITE, req->mapping, + req->pos + req->transferred, + req->len - req->transferred); + + _debug("clear %zx @%llx", iov_iter_count(&iter), iter.mapping_start); + + iov_iter_zero(iov_iter_count(&iter), &iter); +} + +/* + * Handle completion of a read operation. This may be called in softirq + * context. + */ +static void fscache_read_done(struct fscache_io_request *req) +{ + struct page *page; + pgoff_t start = req->pos >> PAGE_SHIFT; + pgoff_t last = start + req->nr_pages - 1; + + XA_STATE(xas, &req->mapping->i_pages, start); + + _enter("%lx,%x,%llx,%d", + start, req->nr_pages, req->transferred, req->error); + + if (req->transferred < req->len) + fscache_clear_unread(req); + + if (!test_bit(FSCACHE_IO_DONT_UNLOCK_PAGES, &req->flags)) { + rcu_read_lock(); + xas_for_each(&xas, page, last) { + if (test_bit(FSCACHE_IO_WRITE_TO_CACHE, &req->flags)) + SetPageFsCache(page); + if (page == req->no_unlock_page) + SetPageUptodate(page); + else + page_endio(page, false, 0); + put_page(page); + } + rcu_read_unlock(); + } + + task_io_account_read(req->transferred); + req->ops->done(req); + if (test_and_clear_bit(FSCACHE_IO_READ_IN_PROGRESS, &req->flags)) + wake_up_bit(&req->flags, FSCACHE_IO_READ_IN_PROGRESS); + + if (test_bit(FSCACHE_IO_WRITE_TO_CACHE, &req->flags)) + fscache_read_copy_to_cache(req); +} + +/* + * Reissue the read against the server. + */ +static void fscache_reissue_read(struct work_struct *work) +{ + struct fscache_io_request *req = + container_of(work, struct fscache_io_request, work); + + _debug("DOWNLOAD: %llu", req->len); + + req->io_done = fscache_read_done; + fscache_read_from_server(req); + fscache_put_io_request(req); +} + +/* + * Handle completion of a read from cache operation. If the read failed, we + * need to reissue the request against the server. We might, however, be + * called in softirq mode and need to punt. + */ +static void fscache_file_read_maybe_reissue(struct fscache_io_request *req) +{ + _enter("%d", req->error); + + if (req->error == 0) { + fscache_read_done(req); + } else { + INIT_WORK(&req->work, fscache_reissue_read); + fscache_get_io_request(req); + queue_work(fscache_op_wq, &req->work); + } +} + +/* + * Issue a read against the cache. + */ +static void fscache_read_from_cache(struct fscache_io_request *req) +{ + struct iov_iter iter; + + iov_iter_mapping(&iter, READ, req->mapping, req->pos, req->len); + fscache_read(req, &iter); +} + +/* + * Discard the locks and page refs that we obtained on a sequence of pages. 
+ */ +static void fscache_ignore_pages(struct address_space *mapping, + pgoff_t start, pgoff_t end) +{ + struct page *page; + + _enter("%lx,%lx", start, end); + + if (end > start) { + XA_STATE(xas, &mapping->i_pages, start); + + rcu_read_lock(); + xas_for_each(&xas, page, end - 1) { + _debug("- ignore %lx", page->index); + BUG_ON(xa_is_value(page)); + BUG_ON(PageCompound(page)); + + unlock_page(page); + put_page(page); + } + rcu_read_unlock(); + } +} + +/** + * fscache_read_helper - Helper to manage a read request + * @req: The initialised request structure to use + * @requested_page: Singular page to include (LOCKED_PAGE/FOR_WRITE) + * @pages: Unattached pages to include (PAGE_LIST) + * @page_to_be_written: The index of the primary page (FOR_WRITE) + * @max_pages: The maximum number of pages to read in one transaction + * @type: FSCACHE_READ_* + * @aop_flags: AOP_FLAG_* + * + * Read a sequence of pages appropriately sized for an fscache allocation + * block. Pages are added at both ends and to fill in the gaps as appropriate + * to make it the right size. + * + * req->mapping should indicate the mapping to which the pages will be attached. + * + * The operations pointed to by req->ops will be used to issue or reissue a + * read against the server in case the cache is unavailable, incomplete or + * generates an error. req->iter will be set up to point to the iterator + * representing the buffer to be filled in. + * + * A ref on @req is consumed eventually by this function or one of its + * eventually-dispatched callees. + */ +static int fscache_read_helper(struct fscache_io_request *req, + struct page **requested_page, + struct list_head *pages, + pgoff_t page_to_be_written, + pgoff_t max_pages, + enum fscache_read_type type, + unsigned int aop_flags) +{ + struct fscache_request_shape shape; + struct address_space *mapping = req->mapping; + struct page *page; + enum fscache_read_helper_trace what; + unsigned int notes; + pgoff_t eof, cursor, start; + loff_t new_size; + int ret; + + shape.granularity = 1; + shape.max_io_pages = max_pages; + shape.i_size = i_size_read(mapping->host); + shape.for_write = false; + + switch (type) { + case FSCACHE_READ_PAGE_LIST: + shape.proposed_start = lru_to_page(pages)->index; + shape.proposed_nr_pages = + lru_to_last_page(pages)->index - shape.proposed_start + 1; + break; + + case FSCACHE_READ_LOCKED_PAGE: + shape.proposed_start = (*requested_page)->index; + shape.proposed_nr_pages = 1; + break; + + case FSCACHE_READ_FOR_WRITE: + new_size = (loff_t)(page_to_be_written + 1) << PAGE_SHIFT; + if (new_size > shape.i_size) + shape.i_size = new_size; + shape.proposed_start = page_to_be_written; + shape.proposed_nr_pages = 1; + break; + + default: + BUG(); + } + + _enter("%lx,%x", shape.proposed_start, shape.proposed_nr_pages); + + eof = (shape.i_size + PAGE_SIZE - 1) >> PAGE_SHIFT; + + fscache_shape_request(req->cookie, &shape); + if (req->ops->reshape) + req->ops->reshape(req, &shape); + notes = shape.to_be_done; + + req->dio_block_size = shape.dio_block_size; + + start = cursor = shape.actual_start; + + /* Add pages to the pagecache. We keep the pages ref'd and locked + * until the read is complete. We may also need to add pages to both + * sides of the request to make it up to the cache allocation granule + * alignment and size. + * + * Note that it's possible for the file size to change whilst we're + * doing this, but we rely on the server returning less than we asked + * for if the file shrank. 
We also rely on this to deal with a partial + * page at the end of the file. + * + * If we're going to end up loading from the server and writing to the + * cache, we start by inserting blank pages before the first page being + * examined. If we can fetch from the cache or we're not going to + * write to the cache, it's unnecessary. + */ + if (notes & FSCACHE_RHLP_NOTE_WRITE_TO_CACHE) { + notes |= FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE; + while (cursor < shape.proposed_start) { + page = find_or_create_page(mapping, cursor, + readahead_gfp_mask(mapping)); + if (!page) + goto nomem; + if (!PageUptodate(page)) { + req->nr_pages++; /* Add to the reading list */ + cursor++; + continue; + } + + /* There's an up-to-date page in the preface - just + * fetch the requested pages and skip saving to the + * cache. + */ + notes |= FSCACHE_RHLP_NOTE_U2D_IN_PREFACE; + notes &= ~FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE; + fscache_ignore_pages(mapping, start, cursor + 1); + start = cursor = shape.proposed_start; + req->nr_pages = 0; + break; + } + page = NULL; + } else { + notes &= ~FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE; + start = cursor = shape.proposed_start; + req->nr_pages = 0; + } + + switch (type) { + case FSCACHE_READ_FOR_WRITE: + /* We're doing a prefetch for a write on a single page. We get + * or create the requested page if we weren't given it and lock + * it. + */ + notes |= FSCACHE_RHLP_NOTE_READ_FOR_WRITE; + if (*requested_page) { + _debug("prewrite req %lx", cursor); + page = *requested_page; + ret = -ERESTARTSYS; + if (lock_page_killable(page) < 0) + goto dont; + } else { + _debug("prewrite new %lx %lx", cursor, eof); + page = grab_cache_page_write_begin(mapping, shape.proposed_start, + aop_flags); + if (!page) + goto nomem; + *requested_page = page; + } + + if (PageUptodate(page)) { + notes |= FSCACHE_RHLP_NOTE_LIST_U2D; + + trace_fscache_read_helper(req->cookie, + start, start + req->nr_pages, + notes, fscache_read_helper_race); + req->ops->done(req); + ret = 0; + goto cancelled; + } + + get_page(page); + req->no_unlock_page = page; + req->nr_pages++; + cursor++; + page = NULL; + ret = 0; + break; + + case FSCACHE_READ_LOCKED_PAGE: + /* We've got a single page preattached to the inode and locked. + * Get our own ref on it. + */ + _debug("locked"); + notes |= FSCACHE_RHLP_NOTE_READ_LOCKED_PAGE; + get_page(*requested_page); + req->nr_pages++; + cursor++; + ret = 0; + break; + + case FSCACHE_READ_PAGE_LIST: + /* We've been given a contiguous list of pages to add. 
*/ + notes |= FSCACHE_RHLP_NOTE_READ_PAGE_LIST; + do { + _debug("given %lx", cursor); + + page = lru_to_page(pages); + if (WARN_ON(page->index != cursor)) + break; + + list_del(&page->lru); + + ret = add_to_page_cache_lru(page, mapping, cursor, + readahead_gfp_mask(mapping)); + switch (ret) { + case 0: + /* Add to the reading list */ + req->nr_pages++; + cursor++; + page = NULL; + break; + + case -EEXIST: + put_page(page); + + _debug("conflict %lx %d", cursor, ret); + page = find_or_create_page(mapping, cursor, + readahead_gfp_mask(mapping)); + if (!page) { + notes |= FSCACHE_RHLP_NOTE_LIST_NOMEM; + goto stop; + } + + if (PageUptodate(page)) { + unlock_page(page); + put_page(page); /* Avoid overwriting */ + ret = 0; + notes |= FSCACHE_RHLP_NOTE_LIST_U2D; + goto stop; + } + + req->nr_pages++; /* Add to the reading list */ + cursor++; + break; + + default: + _debug("add fail %lx %d", cursor, ret); + put_page(page); + page = NULL; + notes |= FSCACHE_RHLP_NOTE_LIST_ERROR; + goto stop; + } + + /* Trim the fetch to the cache granularity so we don't + * get a chain-failure of blocks being unable to be + * used because the previous uncached read spilt over. + */ + if ((notes & FSCACHE_RHLP_NOTE_U2D_IN_PREFACE) && + cursor == shape.actual_start + shape.granularity) + break; + + } while (!list_empty(pages) && req->nr_pages < shape.actual_nr_pages); + ret = 0; + break; + + default: + BUG(); + } + + /* If we're going to be writing to the cache, insert pages after the + * requested block to make up the numbers. + */ + if (notes & FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE) { + notes |= FSCACHE_RHLP_NOTE_TRAILER_ADD; + while (req->nr_pages < shape.actual_nr_pages) { + _debug("after %lx", cursor); + page = find_or_create_page(mapping, cursor, + readahead_gfp_mask(mapping)); + if (!page) { + notes |= FSCACHE_RHLP_NOTE_TRAILER_NOMEM; + goto stop; + } + if (PageUptodate(page)) { + unlock_page(page); + put_page(page); /* Avoid overwriting */ + notes |= FSCACHE_RHLP_NOTE_TRAILER_U2D; + goto stop; + } + + req->nr_pages++; /* Add to the reading list */ + cursor++; + } + } + +stop: + _debug("have %u", req->nr_pages); + if (req->nr_pages == 0) + goto dont; + + if (cursor <= shape.proposed_start) { + _debug("v.short"); + goto nomem_unlock; /* We wouldn't've included the first page */ + } + +submit_anyway: + if ((notes & FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE) && + req->nr_pages < shape.actual_nr_pages) { + /* The request is short of what we need to be able to cache the + * entire set of pages and the trailer, so trim it to cache + * granularity if we can without reducing it to nothing. + */ + unsigned int down_to = round_down(req->nr_pages, shape.granularity); + _debug("short %u", down_to); + + notes |= FSCACHE_RHLP_NOTE_UNDERSIZED; + + if (down_to > 0) { + fscache_ignore_pages(mapping, shape.actual_start + down_to, cursor); + req->nr_pages = down_to; + } else { + notes &= ~FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE; + } + } + + req->len = req->nr_pages * PAGE_SIZE; + req->pos = start; + req->pos <<= PAGE_SHIFT; + + if (start >= eof) { + notes |= FSCACHE_RHLP_NOTE_AFTER_EOF; + what = fscache_read_helper_skip; + } else if (notes & FSCACHE_RHLP_NOTE_FILL_WITH_ZERO) { + what = fscache_read_helper_zero; + } else if (notes & FSCACHE_RHLP_NOTE_READ_FROM_CACHE) { + what = fscache_read_helper_read; + } else { + what = fscache_read_helper_download; + } + + ret = 0; + if (req->ops->is_req_valid) { + /* Allow the netfs to decide if the request is still valid + * after all the pages are locked. 
+ */ + ret = req->ops->is_req_valid(req); + if (ret < 0) + notes |= FSCACHE_RHLP_NOTE_CANCELLED; + } + + trace_fscache_read_helper(req->cookie, start, start + req->nr_pages, + notes, what); + + if (notes & FSCACHE_RHLP_NOTE_CANCELLED) + goto cancelled; + + if (notes & FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE) + __set_bit(FSCACHE_IO_WRITE_TO_CACHE, &req->flags); + + __set_bit(FSCACHE_IO_READ_IN_PROGRESS, &req->flags); + + switch (what) { + case fscache_read_helper_skip: + /* The read is entirely beyond the end of the file, so skip the + * actual operation and let the done handler deal with clearing + * the pages. + */ + _debug("SKIP READ: %llu", req->len); + fscache_read_done(req); + break; + case fscache_read_helper_zero: + _debug("ZERO READ: %llu", req->len); + fscache_read_done(req); + break; + case fscache_read_helper_read: + req->io_done = fscache_file_read_maybe_reissue; + fscache_read_from_cache(req); + break; + case fscache_read_helper_download: + _debug("DOWNLOAD: %llu", req->len); + req->io_done = fscache_read_done; + fscache_read_from_server(req); + break; + default: + BUG(); + } + + _leave(" = 0"); + return 0; + +nomem: + if (cursor > shape.proposed_start) + goto submit_anyway; +nomem_unlock: + ret = -ENOMEM; +cancelled: + fscache_ignore_pages(mapping, start, cursor); +dont: + _leave(" = %d", ret); + return ret; +} + +int fscache_read_helper_page_list(struct fscache_io_request *req, + struct list_head *pages, + pgoff_t max_pages) +{ + ASSERT(pages); + ASSERT(!list_empty(pages)); + return fscache_read_helper(req, NULL, pages, 0, max_pages, + FSCACHE_READ_PAGE_LIST, 0); +} +EXPORT_SYMBOL(fscache_read_helper_page_list); + +int fscache_read_helper_locked_page(struct fscache_io_request *req, + struct page *page, + pgoff_t max_pages) +{ + ASSERT(page); + return fscache_read_helper(req, &page, NULL, 0, max_pages, + FSCACHE_READ_LOCKED_PAGE, 0); +} +EXPORT_SYMBOL(fscache_read_helper_locked_page); + +int fscache_read_helper_for_write(struct fscache_io_request *req, + struct page **page, + pgoff_t index, + pgoff_t max_pages, + unsigned int aop_flags) +{ + ASSERT(page); + ASSERTIF(*page, (*page)->index == index); + return fscache_read_helper(req, page, NULL, index, max_pages, + FSCACHE_READ_FOR_WRITE, aop_flags); +} +EXPORT_SYMBOL(fscache_read_helper_for_write); diff --git a/include/linux/fscache.h b/include/linux/fscache.h index bfb28cebfcfd..0aee6edef672 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -181,12 +181,24 @@ struct fscache_io_request { unsigned long flags; #define FSCACHE_IO_DATA_FROM_SERVER 0 /* Set if data was read from server */ #define FSCACHE_IO_DATA_FROM_CACHE 1 /* Set if data was read from the cache */ +#define FSCACHE_IO_DONT_UNLOCK_PAGES 2 /* Don't unlock the pages on completion */ +#define FSCACHE_IO_READ_IN_PROGRESS 3 /* Cleared and woken upon completion of the read */ +#define FSCACHE_IO_WRITE_TO_CACHE 4 /* Set if should write to cache */ void (*io_done)(struct fscache_io_request *); + struct work_struct work; + + /* Bits for readpages helper */ + struct address_space *mapping; /* The mapping being accessed */ + unsigned int nr_pages; /* Number of pages involved in the I/O */ + unsigned int dio_block_size; /* Rounding for direct I/O in the cache */ + struct page *no_unlock_page; /* Don't unlock this page after read */ }; struct fscache_io_request_ops { + int (*is_req_valid)(struct fscache_io_request *); bool (*is_still_valid)(struct fscache_io_request *); void (*issue_op)(struct fscache_io_request *); + void (*reshape)(struct fscache_io_request *, 
struct fscache_request_shape *); void (*done)(struct fscache_io_request *); void (*get)(struct fscache_io_request *); void (*put)(struct fscache_io_request *); @@ -489,7 +501,7 @@ static inline void fscache_init_io_request(struct fscache_io_request *req, static inline void fscache_free_io_request(struct fscache_io_request *req) { - if (req->cookie) + if (fscache_cookie_valid(req->cookie)) __fscache_free_io_request(req); } @@ -593,4 +605,16 @@ int fscache_write(struct fscache_io_request *req, struct iov_iter *iter) return -ENOBUFS; } +extern int fscache_read_helper_page_list(struct fscache_io_request *, + struct list_head *, + pgoff_t); +extern int fscache_read_helper_locked_page(struct fscache_io_request *, + struct page *, + pgoff_t); +extern int fscache_read_helper_for_write(struct fscache_io_request *, + struct page **, + pgoff_t, + pgoff_t, + unsigned int); + #endif /* _LINUX_FSCACHE_H */ diff --git a/include/trace/events/fscache_support.h b/include/trace/events/fscache_support.h new file mode 100644 index 000000000000..0d94650ad637 --- /dev/null +++ b/include/trace/events/fscache_support.h @@ -0,0 +1,91 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* FS-Cache support module tracepoints + * + * Copyright (C) 2020 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM fscache_support + +#if !defined(_TRACE_FSCACHE_SUPPORT_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_FSCACHE_SUPPORT_H + +#include +#include + +/* + * Define enums for tracing information. + */ +#ifndef __FSCACHE_SUPPORT_DECLARE_TRACE_ENUMS_ONCE_ONLY +#define __FSCACHE_SUPPORT_DECLARE_TRACE_ENUMS_ONCE_ONLY + +enum fscache_read_helper_trace { + fscache_read_helper_download, + fscache_read_helper_race, + fscache_read_helper_read, + fscache_read_helper_skip, + fscache_read_helper_zero, +}; + +#endif + +#define fscache_read_helper_traces \ + EM(fscache_read_helper_download, "DOWN") \ + EM(fscache_read_helper_race, "RACE") \ + EM(fscache_read_helper_read, "READ") \ + EM(fscache_read_helper_skip, "SKIP") \ + E_(fscache_read_helper_zero, "ZERO") + + +/* + * Export enum symbols via userspace. + */ +#undef EM +#undef E_ +#define EM(a, b) TRACE_DEFINE_ENUM(a); +#define E_(a, b) TRACE_DEFINE_ENUM(a); + +fscache_read_helper_traces; + +/* + * Now redefine the EM() and E_() macros to map the enums to the strings that + * will be printed in the output. + */ +#undef EM +#undef E_ +#define EM(a, b) { a, b }, +#define E_(a, b) { a, b } + +TRACE_EVENT(fscache_read_helper, + TP_PROTO(struct fscache_cookie *cookie, pgoff_t start, pgoff_t end, + unsigned int notes, enum fscache_read_helper_trace what), + + TP_ARGS(cookie, start, end, notes, what), + + TP_STRUCT__entry( + __field(unsigned int, cookie ) + __field(pgoff_t, start ) + __field(pgoff_t, end ) + __field(unsigned int, notes ) + __field(enum fscache_read_helper_trace, what ) + ), + + TP_fast_assign( + __entry->cookie = cookie ? 
cookie->debug_id : 0; + __entry->start = start; + __entry->end = end; + __entry->what = what; + __entry->notes = notes; + ), + + TP_printk("c=%08x %s n=%08x p=%lx-%lx", + __entry->cookie, + __print_symbolic(__entry->what, fscache_read_helper_traces), + __entry->notes, + __entry->start, __entry->end) + ); + +#endif /* _TRACE_FSCACHE_SUPPORT_H */ + +/* This part must be outside protection */ +#include From patchwork Mon Jul 13 16:35:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660543 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 12D9313B4 for ; Mon, 13 Jul 2020 16:35:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EAEC8206F5 for ; Mon, 13 Jul 2020 16:35:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KcEfADtG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730711AbgGMQff (ORCPT ); Mon, 13 Jul 2020 12:35:35 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:27389 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730411AbgGMQfd (ORCPT ); Mon, 13 Jul 2020 12:35:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594658131; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wQkmMwKqdoqPES4EgtSnMTmwzuuUgpaX2PYBWluvNRQ=; b=KcEfADtGP9yOrQqOQnz5o1b51hxzLhZR+da10RwFfeijMFDwcy0JHt9rPAeii4q+Sz2hVj 6f7Rq+PaECk0qbuDOjedjIpHdxd3E6V8mTWbhLfUKzYU0w+7FDaif1cVH4RI2V80Q3fRK4 8IDADu6Sv8yIyVDZEEtPHzncNFKFE1Q= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-191-FznXJ4y8M4qbT_ZvwuEGwg-1; Mon, 13 Jul 2020 12:35:28 -0400 X-MC-Unique: FznXJ4y8M4qbT_ZvwuEGwg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 07B1E107ACCA; Mon, 13 Jul 2020 16:35:27 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4E22B60CD0; Mon, 13 Jul 2020 16:35:21 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 25/32] fscache: Display cache-specific data in /proc/fs/fscache/objects From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:35:20 +0100 Message-ID: <159465812047.1376674.773368234437370021.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow the cache to add information in /proc/fs/fscache/objects instead of displaying cookie key and aux data - which can be seen in the cookies file. Signed-off-by: David Howells --- fs/cachefiles/content-map.c | 41 +++++++++++++++++++++++++++++++++++++++++ fs/cachefiles/interface.c | 1 + fs/cachefiles/internal.h | 1 + fs/fscache/object-list.c | 33 +++------------------------------ include/linux/fscache-cache.h | 4 ++++ 5 files changed, 50 insertions(+), 30 deletions(-) diff --git a/fs/cachefiles/content-map.c b/fs/cachefiles/content-map.c index 91c44bb39a93..f2a10e8d8d6d 100644 --- a/fs/cachefiles/content-map.c +++ b/fs/cachefiles/content-map.c @@ -396,3 +396,44 @@ void cachefiles_save_content_map(struct cachefiles_object *object) _leave(" = %zd", ret); } + +/* + * Display object information in proc. 
+ */ +int cachefiles_display_object(struct seq_file *m, struct fscache_object *_object) +{ + struct cachefiles_object *object = + container_of(_object, struct cachefiles_object, fscache); + + if (object->fscache.cookie->type == FSCACHE_COOKIE_TYPE_INDEX) { + if (object->content_info != CACHEFILES_CONTENT_NO_DATA) + seq_printf(m, " ???%u???", object->content_info); + } else { + switch (object->content_info) { + case CACHEFILES_CONTENT_NO_DATA: + seq_puts(m, " "); + break; + case CACHEFILES_CONTENT_SINGLE: + seq_puts(m, " "); + break; + case CACHEFILES_CONTENT_ALL: + seq_puts(m, " "); + break; + case CACHEFILES_CONTENT_MAP: + read_lock_bh(&object->content_map_lock); + if (object->content_map) { + seq_printf(m, " %*phN", + object->content_map_size, + object->content_map); + } + read_unlock_bh(&object->content_map_lock); + break; + default: + seq_printf(m, " <%u>", object->content_info); + break; + } + } + + seq_putc(m, '\n'); + return 0; +} diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index e73de62d0e73..78180d269c5f 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -491,4 +491,5 @@ const struct fscache_cache_ops cachefiles_cache_ops = { .shape_request = cachefiles_shape_request, .read = cachefiles_read, .write = cachefiles_write, + .display_object = cachefiles_display_object, }; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index 2ea469b77712..c91a9b3c5bd5 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -132,6 +132,7 @@ extern void cachefiles_expand_content_map(struct cachefiles_object *object, loff extern void cachefiles_shorten_content_map(struct cachefiles_object *object, loff_t new_size); extern bool cachefiles_load_content_map(struct cachefiles_object *object); extern void cachefiles_save_content_map(struct cachefiles_object *object); +extern int cachefiles_display_object(struct seq_file *m, struct fscache_object *object); /* * daemon.c diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c index 5777f909d31a..361610e124bd 100644 --- a/fs/fscache/object-list.c +++ b/fs/fscache/object-list.c @@ -155,7 +155,6 @@ static int fscache_objlist_show(struct seq_file *m, void *v) struct fscache_cookie *cookie; unsigned long config = data->config; char _type[3], *type; - u8 *p; if ((unsigned long) v == 1) { seq_puts(m, "OBJECT PARENT USE CHLDN OPS FL S" @@ -201,8 +200,6 @@ static int fscache_objlist_show(struct seq_file *m, void *v) obj->stage); if (obj->cookie) { - uint16_t keylen = 0, auxlen = 0; - switch (cookie->type) { case 0: type = "IX"; @@ -211,8 +208,7 @@ static int fscache_objlist_show(struct seq_file *m, void *v) type = "DT"; break; default: - snprintf(_type, sizeof(_type), "%02u", - cookie->type); + snprintf(_type, sizeof(_type), "%02x", cookie->type); type = _type; break; } @@ -223,34 +219,11 @@ static int fscache_objlist_show(struct seq_file *m, void *v) type, cookie->stage, cookie->flags); - - if (config & FSCACHE_OBJLIST_CONFIG_KEY) - keylen = cookie->key_len; - - if (config & FSCACHE_OBJLIST_CONFIG_AUX) - auxlen = cookie->aux_len; - - if (keylen > 0 || auxlen > 0) { - seq_puts(m, " "); - p = keylen <= sizeof(cookie->inline_key) ? - cookie->inline_key : cookie->key; - for (; keylen > 0; keylen--) - seq_printf(m, "%02x", *p++); - if (auxlen > 0) { - if (config & FSCACHE_OBJLIST_CONFIG_KEY) - seq_puts(m, ", "); - p = auxlen <= sizeof(cookie->inline_aux) ? 
- cookie->inline_aux : cookie->aux; - for (; auxlen > 0; auxlen--) - seq_printf(m, "%02x", *p++); - } - } - - seq_puts(m, "\n"); } else { seq_puts(m, "\n"); } - return 0; + + return obj->cache->ops->display_object(m, obj); } static const struct seq_operations fscache_objlist_ops = { diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index 81a41e37f07b..1357c44d371b 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -19,6 +19,7 @@ #define NR_MAXCACHES BITS_PER_LONG +struct seq_file; struct fscache_cache; struct fscache_cache_ops; struct fscache_object; @@ -151,6 +152,9 @@ struct fscache_cache_ops { int (*write)(struct fscache_object *object, struct fscache_io_request *req, struct iov_iter *iter); + + /* Display object info in /proc/fs/fscache/objects */ + int (*display_object)(struct seq_file *m, struct fscache_object *object); }; extern struct fscache_cookie fscache_fsdef_index; From patchwork Mon Jul 13 16:35:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660559 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 71C6F1510 for ; Mon, 13 Jul 2020 16:35:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 56287206F5 for ; Mon, 13 Jul 2020 16:35:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Z+OksT5G" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730731AbgGMQfr (ORCPT ); Mon, 13 Jul 2020 12:35:47 -0400 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:23346 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730347AbgGMQfq (ORCPT ); Mon, 13 Jul 2020 12:35:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594658144; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=82R50oM3fLGYFRzRYFN6AIMM8NnlH3BJGDV9nGHGEXg=; b=Z+OksT5GEdD1RishtKrnzTACgjS72G8eUuXa6HEE0qFvhVBrnsGC9tW+LbP+yrdzg9vdtM fp3+SzCDhlNQZlQpu5/8TLcQDJLtz+Tib2xITigTSgr/sBRWvnagTtipH1enIKf6XMlJop xKOn/CqFhmmGCoZEKvl1E704e/zWDlE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-305-Q3WUxPnjPveEQof5IUo9kA-1; Mon, 13 Jul 2020 12:35:41 -0400 X-MC-Unique: Q3WUxPnjPveEQof5IUo9kA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D1C871080; Mon, 13 Jul 2020 16:35:38 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 18FBF19C66; Mon, 13 Jul 2020 16:35:32 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 26/32] fscache: Remove more obsolete stats From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:35:32 +0100 Message-ID: <159465813226.1376674.13527511953573909880.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Remove some more stats that have become obsolete. Signed-off-by: David Howells --- fs/fscache/internal.h | 18 ++---------------- fs/fscache/obj.c | 6 +++--- fs/fscache/stats.c | 50 +++++++++---------------------------------------- 3 files changed, 14 insertions(+), 60 deletions(-) diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index 2674438ccafd..d2b856aa5f0e 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -165,25 +165,13 @@ extern void fscache_proc_cleanup(void); * stats.c */ #ifdef CONFIG_FSCACHE_STATS -extern atomic_t fscache_n_op_pend; -extern atomic_t fscache_n_op_run; -extern atomic_t fscache_n_op_enqueue; -extern atomic_t fscache_n_op_deferred_release; -extern atomic_t fscache_n_op_initialised; -extern atomic_t fscache_n_op_release; -extern atomic_t fscache_n_op_gc; -extern atomic_t fscache_n_op_cancelled; -extern atomic_t fscache_n_op_rejected; - extern atomic_t fscache_n_acquires; extern atomic_t fscache_n_acquires_null; extern atomic_t fscache_n_acquires_no_cache; extern atomic_t fscache_n_acquires_ok; -extern atomic_t fscache_n_acquires_nobufs; extern atomic_t fscache_n_acquires_oom; extern atomic_t fscache_n_invalidates; -extern atomic_t fscache_n_invalidates_run; extern atomic_t fscache_n_updates; extern atomic_t fscache_n_updates_null; @@ -202,15 +190,13 @@ extern atomic_t fscache_n_object_no_alloc; extern atomic_t fscache_n_object_lookups; extern atomic_t fscache_n_object_lookups_negative; extern atomic_t fscache_n_object_lookups_positive; -extern atomic_t fscache_n_object_lookups_timed_out; -extern atomic_t fscache_n_object_created; +extern atomic_t fscache_n_object_creates; extern atomic_t fscache_n_object_avail; extern atomic_t fscache_n_object_dead; extern atomic_t fscache_n_cop_alloc_object; extern atomic_t fscache_n_cop_lookup_object; -extern atomic_t fscache_n_cop_lookup_complete; -extern atomic_t fscache_n_cop_grab_object; +extern atomic_t fscache_n_cop_create_object; extern atomic_t fscache_n_cop_invalidate_object; extern atomic_t fscache_n_cop_update_object; extern atomic_t fscache_n_cop_drop_object; diff --git a/fs/fscache/obj.c b/fs/fscache/obj.c index 63373b99ac34..baab7c465142 100644 --- a/fs/fscache/obj.c +++ b/fs/fscache/obj.c @@ -47,10 +47,10 @@ static int fscache_do_lookup_object(struct fscache_object *object, void *data) static int fscache_do_create_object(struct fscache_object *object, void *data) { int ret; - fscache_stat(&fscache_n_object_lookups); - fscache_stat(&fscache_n_cop_lookup_object); + fscache_stat(&fscache_n_object_creates); + fscache_stat(&fscache_n_cop_create_object); ret 
= object->cache->ops->create_object(object, data); - fscache_stat_d(&fscache_n_cop_lookup_object); + fscache_stat_d(&fscache_n_cop_create_object); return ret; } diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c index 583817f4f113..ccca0016fd26 100644 --- a/fs/fscache/stats.c +++ b/fs/fscache/stats.c @@ -14,25 +14,13 @@ /* * operation counters */ -atomic_t fscache_n_op_pend; -atomic_t fscache_n_op_run; -atomic_t fscache_n_op_enqueue; -atomic_t fscache_n_op_deferred_release; -atomic_t fscache_n_op_initialised; -atomic_t fscache_n_op_release; -atomic_t fscache_n_op_gc; -atomic_t fscache_n_op_cancelled; -atomic_t fscache_n_op_rejected; - atomic_t fscache_n_acquires; atomic_t fscache_n_acquires_null; atomic_t fscache_n_acquires_no_cache; atomic_t fscache_n_acquires_ok; -atomic_t fscache_n_acquires_nobufs; atomic_t fscache_n_acquires_oom; atomic_t fscache_n_invalidates; -atomic_t fscache_n_invalidates_run; atomic_t fscache_n_updates; atomic_t fscache_n_updates_null; @@ -51,15 +39,13 @@ atomic_t fscache_n_object_no_alloc; atomic_t fscache_n_object_lookups; atomic_t fscache_n_object_lookups_negative; atomic_t fscache_n_object_lookups_positive; -atomic_t fscache_n_object_lookups_timed_out; -atomic_t fscache_n_object_created; +atomic_t fscache_n_object_creates; atomic_t fscache_n_object_avail; atomic_t fscache_n_object_dead; atomic_t fscache_n_cop_alloc_object; atomic_t fscache_n_cop_lookup_object; -atomic_t fscache_n_cop_lookup_complete; -atomic_t fscache_n_cop_grab_object; +atomic_t fscache_n_cop_create_object; atomic_t fscache_n_cop_invalidate_object; atomic_t fscache_n_cop_update_object; atomic_t fscache_n_cop_drop_object; @@ -90,25 +76,21 @@ int fscache_stats_show(struct seq_file *m, void *v) atomic_read(&fscache_n_object_avail), atomic_read(&fscache_n_object_dead)); - seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u" - " oom=%u\n", + seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u oom=%u\n", atomic_read(&fscache_n_acquires), atomic_read(&fscache_n_acquires_null), atomic_read(&fscache_n_acquires_no_cache), atomic_read(&fscache_n_acquires_ok), - atomic_read(&fscache_n_acquires_nobufs), atomic_read(&fscache_n_acquires_oom)); - seq_printf(m, "Lookups: n=%u neg=%u pos=%u crt=%u tmo=%u\n", + seq_printf(m, "Lookups: n=%u neg=%u pos=%u crt=%u\n", atomic_read(&fscache_n_object_lookups), atomic_read(&fscache_n_object_lookups_negative), atomic_read(&fscache_n_object_lookups_positive), - atomic_read(&fscache_n_object_created), - atomic_read(&fscache_n_object_lookups_timed_out)); + atomic_read(&fscache_n_object_creates)); - seq_printf(m, "Invals : n=%u run=%u\n", - atomic_read(&fscache_n_invalidates), - atomic_read(&fscache_n_invalidates_run)); + seq_printf(m, "Invals : n=%u\n", + atomic_read(&fscache_n_invalidates)); seq_printf(m, "Updates: n=%u nul=%u run=%u\n", atomic_read(&fscache_n_updates), @@ -120,23 +102,9 @@ int fscache_stats_show(struct seq_file *m, void *v) atomic_read(&fscache_n_relinquishes_null), atomic_read(&fscache_n_relinquishes_retire)); - seq_printf(m, "Ops : pend=%u run=%u enq=%u can=%u rej=%u\n", - atomic_read(&fscache_n_op_pend), - atomic_read(&fscache_n_op_run), - atomic_read(&fscache_n_op_enqueue), - atomic_read(&fscache_n_op_cancelled), - atomic_read(&fscache_n_op_rejected)); - seq_printf(m, "Ops : ini=%u dfr=%u rel=%u gc=%u\n", - atomic_read(&fscache_n_op_initialised), - atomic_read(&fscache_n_op_deferred_release), - atomic_read(&fscache_n_op_release), - atomic_read(&fscache_n_op_gc)); - - seq_printf(m, "CacheOp: alo=%d luo=%d luc=%d gro=%d\n", + seq_printf(m, "CacheOp: 
alo=%d luo=%d\n", atomic_read(&fscache_n_cop_alloc_object), - atomic_read(&fscache_n_cop_lookup_object), - atomic_read(&fscache_n_cop_lookup_complete), - atomic_read(&fscache_n_cop_grab_object)); + atomic_read(&fscache_n_cop_lookup_object)); seq_printf(m, "CacheOp: inv=%d upo=%d dro=%d pto=%d atc=%d syn=%d\n", atomic_read(&fscache_n_cop_invalidate_object), atomic_read(&fscache_n_cop_update_object), From patchwork Mon Jul 13 16:35:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660569 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 70ED813B4 for ; Mon, 13 Jul 2020 16:36:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5330C2067D for ; Mon, 13 Jul 2020 16:36:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Nrv+psB6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730382AbgGMQgF (ORCPT ); Mon, 13 Jul 2020 12:36:05 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:39271 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730179AbgGMQf7 (ORCPT ); Mon, 13 Jul 2020 12:35:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594658156; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8uBgeBrOUUnKfxtv55796AFwjNnsDy70aj5ixeVep7k=; b=Nrv+psB6LVIvyLQ8ooVVO4cbWtyPPu+67nYAG4yo46T9YTgIiQtzvwUat9YkMea09xJohx WLo1uNV+096Th25GM814uGcCzddNMcSfnTXh6ADTJ4qq4sK2h0e1tiZWYs2hfzP4fCGJae 6vo9r1NJXkM59XHhdIvQ+CaY9x1AqHc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-299-PQAz-ha4Od-Kn70DbuelAA-1; Mon, 13 Jul 2020 12:35:52 -0400 X-MC-Unique: PQAz-ha4Od-Kn70DbuelAA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 73EB980572E; Mon, 13 Jul 2020 16:35:50 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id CCA0878A45; Mon, 13 Jul 2020 16:35:44 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH 27/32] fscache: New stats From: David Howells To: Trond Myklebust , Anna Schumaker , Steve French , Alexander Viro , Matthew Wilcox Cc: Jeff Layton , Dave Wysochanski , dhowells@redhat.com, linux-cachefs@redhat.com, linux-afs@lists.infradead.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Mon, 13 Jul 2020 17:35:44 +0100 Message-ID: <159465814408.1376674.14594678302267796541.stgit@warthog.procyon.org.uk> In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.22 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Create some new stat counters appropriate to the new routines and display them in /proc/fs/fscache/stats. Signed-off-by: David Howells --- fs/fscache/dispatcher.c | 6 ++++ fs/fscache/internal.h | 25 +++++++++++++++++ fs/fscache/io.c | 2 + fs/fscache/read_helper.c | 38 +++++++++++++++++++++++-- fs/fscache/stats.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 137 insertions(+), 3 deletions(-) diff --git a/fs/fscache/dispatcher.c b/fs/fscache/dispatcher.c index fba71b99c951..489b8ab8cccd 100644 --- a/fs/fscache/dispatcher.c +++ b/fs/fscache/dispatcher.c @@ -41,6 +41,8 @@ void fscache_dispatch(struct fscache_cookie *cookie, struct fscache_work *work; bool queued = false; + fscache_stat(&fscache_n_dispatch_count); + work = kzalloc(sizeof(struct fscache_work), GFP_KERNEL); if (work) { work->cookie = cookie; @@ -57,10 +59,13 @@ void fscache_dispatch(struct fscache_cookie *cookie, queued = true; } spin_unlock(&fscache_work_lock); + if (queued) + fscache_stat(&fscache_n_dispatch_deferred); } if (!queued) { kfree(work); + fscache_stat(&fscache_n_dispatch_inline); func(cookie, object, param); } } @@ -86,6 +91,7 @@ static int fscache_dispatcher(void *data) if (work) { work->func(work->cookie, work->object, work->param); + fscache_stat(&fscache_n_dispatch_in_pool); fscache_cookie_put(work->cookie, fscache_cookie_put_work); kfree(work); } diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index d2b856aa5f0e..d9391d3974d1 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -209,6 +209,30 @@ extern atomic_t fscache_n_cache_stale_objects; extern atomic_t fscache_n_cache_retired_objects; extern atomic_t fscache_n_cache_culled_objects; +extern atomic_t fscache_n_dispatch_count; +extern atomic_t fscache_n_dispatch_deferred; +extern atomic_t fscache_n_dispatch_inline; +extern atomic_t fscache_n_dispatch_in_pool; + +extern atomic_t fscache_n_read; +extern atomic_t fscache_n_write; + +extern atomic_t fscache_n_read_helper; +extern atomic_t fscache_n_read_helper_stop_nomem; +extern atomic_t fscache_n_read_helper_stop_uptodate; +extern atomic_t fscache_n_read_helper_stop_exist; +extern atomic_t fscache_n_read_helper_stop_kill; +extern atomic_t fscache_n_read_helper_read; +extern atomic_t fscache_n_read_helper_download; +extern atomic_t fscache_n_read_helper_zero; +extern atomic_t fscache_n_read_helper_beyond_eof; +extern atomic_t fscache_n_read_helper_reissue; +extern atomic_t fscache_n_read_helper_read_done; +extern atomic_t fscache_n_read_helper_read_failed; +extern atomic_t fscache_n_read_helper_copy; +extern atomic_t 
fscache_n_read_helper_copy_done; +extern atomic_t fscache_n_read_helper_copy_failed; + static inline void fscache_stat(atomic_t *stat) { atomic_inc(stat); @@ -256,6 +280,7 @@ void fscache_update_aux(struct fscache_cookie *cookie, #else /* CONFIG_FSCACHE */ +#define fscache_stat(stat) do {} while(0) #define fscache_op_wq system_wq #endif /* CONFIG_FSCACHE */ diff --git a/fs/fscache/io.c b/fs/fscache/io.c index 8d7f79551699..d38101d77d27 100644 --- a/fs/fscache/io.c +++ b/fs/fscache/io.c @@ -138,6 +138,7 @@ int __fscache_read(struct fscache_io_request *req, struct iov_iter *iter) fscache_begin_io_operation(req->cookie, FSCACHE_WANT_READ, req); if (!IS_ERR(object)) { + fscache_stat(&fscache_n_read); req->object = object; return object->cache->ops->read(object, req, iter); } else { @@ -158,6 +159,7 @@ int __fscache_write(struct fscache_io_request *req, struct iov_iter *iter) fscache_begin_io_operation(req->cookie, FSCACHE_WANT_WRITE, req); if (!IS_ERR(object)) { + fscache_stat(&fscache_n_write); req->object = object; return object->cache->ops->write(object, req, iter); } else { diff --git a/fs/fscache/read_helper.c b/fs/fscache/read_helper.c index 62fed27aa938..227b326a54e2 100644 --- a/fs/fscache/read_helper.c +++ b/fs/fscache/read_helper.c @@ -68,6 +68,11 @@ static void fscache_read_copy_done(struct fscache_io_request *req) _enter("%lx,%x,%llx", index, req->nr_pages, req->transferred); + if (req->error == 0) + fscache_stat(&fscache_n_read_helper_copy_done); + else + fscache_stat(&fscache_n_read_helper_copy_failed); + /* Clear PG_fscache on the pages that were being written out. */ rcu_read_lock(); xas_for_each(&xas, page, last) { @@ -90,6 +95,8 @@ static void fscache_do_read_copy_to_cache(struct work_struct *work) _enter(""); + fscache_stat(&fscache_n_read_helper_copy); + iov_iter_mapping(&iter, WRITE, req->mapping, req->pos, round_up(req->len, req->dio_block_size)); @@ -142,6 +149,11 @@ static void fscache_read_done(struct fscache_io_request *req) _enter("%lx,%x,%llx,%d", start, req->nr_pages, req->transferred, req->error); + if (req->error == 0) + fscache_stat(&fscache_n_read_helper_read_done); + else + fscache_stat(&fscache_n_read_helper_read_failed); + if (req->transferred < req->len) fscache_clear_unread(req); @@ -195,6 +207,7 @@ static void fscache_file_read_maybe_reissue(struct fscache_io_request *req) if (req->error == 0) { fscache_read_done(req); } else { + fscache_stat(&fscache_n_read_helper_reissue); INIT_WORK(&req->work, fscache_reissue_read); fscache_get_io_request(req); queue_work(fscache_op_wq, &req->work); @@ -279,6 +292,8 @@ static int fscache_read_helper(struct fscache_io_request *req, loff_t new_size; int ret; + fscache_stat(&fscache_n_read_helper); + shape.granularity = 1; shape.max_io_pages = max_pages; shape.i_size = i_size_read(mapping->host); @@ -341,8 +356,10 @@ static int fscache_read_helper(struct fscache_io_request *req, while (cursor < shape.proposed_start) { page = find_or_create_page(mapping, cursor, readahead_gfp_mask(mapping)); - if (!page) + if (!page) { + fscache_stat(&fscache_n_read_helper_stop_nomem); goto nomem; + } if (!PageUptodate(page)) { req->nr_pages++; /* Add to the reading list */ cursor++; @@ -355,6 +372,7 @@ static int fscache_read_helper(struct fscache_io_request *req, */ notes |= FSCACHE_RHLP_NOTE_U2D_IN_PREFACE; notes &= ~FSCACHE_RHLP_NOTE_DO_WRITE_TO_CACHE; + fscache_stat(&fscache_n_read_helper_stop_uptodate); fscache_ignore_pages(mapping, start, cursor + 1); start = cursor = shape.proposed_start; req->nr_pages = 0; @@ -378,18 +396,23 @@ 
static int fscache_read_helper(struct fscache_io_request *req, _debug("prewrite req %lx", cursor); page = *requested_page; ret = -ERESTARTSYS; - if (lock_page_killable(page) < 0) + if (lock_page_killable(page) < 0) { + fscache_stat(&fscache_n_read_helper_stop_kill); goto dont; + } } else { _debug("prewrite new %lx %lx", cursor, eof); page = grab_cache_page_write_begin(mapping, shape.proposed_start, aop_flags); - if (!page) + if (!page) { + fscache_stat(&fscache_n_read_helper_stop_nomem); goto nomem; + } *requested_page = page; } if (PageUptodate(page)) { + fscache_stat(&fscache_n_read_helper_stop_uptodate); notes |= FSCACHE_RHLP_NOTE_LIST_U2D; trace_fscache_read_helper(req->cookie, @@ -450,12 +473,14 @@ static int fscache_read_helper(struct fscache_io_request *req, readahead_gfp_mask(mapping)); if (!page) { notes |= FSCACHE_RHLP_NOTE_LIST_NOMEM; + fscache_stat(&fscache_n_read_helper_stop_nomem); goto stop; } if (PageUptodate(page)) { unlock_page(page); put_page(page); /* Avoid overwriting */ + fscache_stat(&fscache_n_read_helper_stop_exist); ret = 0; notes |= FSCACHE_RHLP_NOTE_LIST_U2D; goto stop; @@ -468,6 +493,7 @@ static int fscache_read_helper(struct fscache_io_request *req, default: _debug("add fail %lx %d", cursor, ret); put_page(page); + fscache_stat(&fscache_n_read_helper_stop_nomem); page = NULL; notes |= FSCACHE_RHLP_NOTE_LIST_ERROR; goto stop; @@ -500,12 +526,14 @@ static int fscache_read_helper(struct fscache_io_request *req, readahead_gfp_mask(mapping)); if (!page) { notes |= FSCACHE_RHLP_NOTE_TRAILER_NOMEM; + fscache_stat(&fscache_n_read_helper_stop_nomem); goto stop; } if (PageUptodate(page)) { unlock_page(page); put_page(page); /* Avoid overwriting */ notes |= FSCACHE_RHLP_NOTE_TRAILER_U2D; + fscache_stat(&fscache_n_read_helper_stop_uptodate); goto stop; } @@ -587,18 +615,22 @@ static int fscache_read_helper(struct fscache_io_request *req, * the pages. 
*/ _debug("SKIP READ: %llu", req->len); + fscache_stat(&fscache_n_read_helper_beyond_eof); fscache_read_done(req); break; case fscache_read_helper_zero: _debug("ZERO READ: %llu", req->len); + fscache_stat(&fscache_n_read_helper_zero); fscache_read_done(req); break; case fscache_read_helper_read: + fscache_stat(&fscache_n_read_helper_read); req->io_done = fscache_file_read_maybe_reissue; fscache_read_from_cache(req); break; case fscache_read_helper_download: _debug("DOWNLOAD: %llu", req->len); + fscache_stat(&fscache_n_read_helper_download); req->io_done = fscache_read_done; fscache_read_from_server(req); break; diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c index ccca0016fd26..63fb4d831f4d 100644 --- a/fs/fscache/stats.c +++ b/fs/fscache/stats.c @@ -58,6 +58,46 @@ atomic_t fscache_n_cache_stale_objects; atomic_t fscache_n_cache_retired_objects; atomic_t fscache_n_cache_culled_objects; +atomic_t fscache_n_dispatch_count; +atomic_t fscache_n_dispatch_deferred; +atomic_t fscache_n_dispatch_inline; +atomic_t fscache_n_dispatch_in_pool; + +atomic_t fscache_n_read; +atomic_t fscache_n_write; + +atomic_t fscache_n_read_helper; +atomic_t fscache_n_read_helper_stop_nomem; +atomic_t fscache_n_read_helper_stop_uptodate; +atomic_t fscache_n_read_helper_stop_exist; +atomic_t fscache_n_read_helper_stop_kill; +atomic_t fscache_n_read_helper_read; +atomic_t fscache_n_read_helper_download; +atomic_t fscache_n_read_helper_zero; +atomic_t fscache_n_read_helper_beyond_eof; +atomic_t fscache_n_read_helper_reissue; +atomic_t fscache_n_read_helper_read_done; +atomic_t fscache_n_read_helper_read_failed; +atomic_t fscache_n_read_helper_copy; +atomic_t fscache_n_read_helper_copy_done; +atomic_t fscache_n_read_helper_copy_failed; + +EXPORT_SYMBOL(fscache_n_read_helper); +EXPORT_SYMBOL(fscache_n_read_helper_stop_nomem); +EXPORT_SYMBOL(fscache_n_read_helper_stop_uptodate); +EXPORT_SYMBOL(fscache_n_read_helper_stop_exist); +EXPORT_SYMBOL(fscache_n_read_helper_stop_kill); +EXPORT_SYMBOL(fscache_n_read_helper_read); +EXPORT_SYMBOL(fscache_n_read_helper_download); +EXPORT_SYMBOL(fscache_n_read_helper_zero); +EXPORT_SYMBOL(fscache_n_read_helper_beyond_eof); +EXPORT_SYMBOL(fscache_n_read_helper_reissue); +EXPORT_SYMBOL(fscache_n_read_helper_read_done); +EXPORT_SYMBOL(fscache_n_read_helper_read_failed); +EXPORT_SYMBOL(fscache_n_read_helper_copy); +EXPORT_SYMBOL(fscache_n_read_helper_copy_done); +EXPORT_SYMBOL(fscache_n_read_helper_copy_failed); + /* * display the general statistics */ @@ -117,5 +157,34 @@ int fscache_stats_show(struct seq_file *m, void *v) atomic_read(&fscache_n_cache_stale_objects), atomic_read(&fscache_n_cache_retired_objects), atomic_read(&fscache_n_cache_culled_objects)); + + seq_printf(m, "Disp : n=%u il=%u df=%u pl=%u\n", + atomic_read(&fscache_n_dispatch_count), + atomic_read(&fscache_n_dispatch_inline), + atomic_read(&fscache_n_dispatch_deferred), + atomic_read(&fscache_n_dispatch_in_pool)); + + seq_printf(m, "IO : rd=%u wr=%u\n", + atomic_read(&fscache_n_read), + atomic_read(&fscache_n_write)); + + seq_printf(m, "RdHelp : nm=%u ud=%u ex=%u kl=%u\n", + atomic_read(&fscache_n_read_helper_stop_nomem), + atomic_read(&fscache_n_read_helper_stop_uptodate), + atomic_read(&fscache_n_read_helper_stop_exist), + atomic_read(&fscache_n_read_helper_stop_kill)); + seq_printf(m, "RdHelp : n=%u rd=%u dl=%u zr=%u eo=%u\n", + atomic_read(&fscache_n_read_helper), + atomic_read(&fscache_n_read_helper_read), + atomic_read(&fscache_n_read_helper_download), + atomic_read(&fscache_n_read_helper_zero), + 
atomic_read(&fscache_n_read_helper_beyond_eof)); + seq_printf(m, "RdHelp : ri=%u dn=%u fl=%u cp=%u cd=%u cf=%u\n", + atomic_read(&fscache_n_read_helper_reissue), + atomic_read(&fscache_n_read_helper_read_done), + atomic_read(&fscache_n_read_helper_read_failed), + atomic_read(&fscache_n_read_helper_copy), + atomic_read(&fscache_n_read_helper_copy_done), + atomic_read(&fscache_n_read_helper_copy_failed)); return 0; } From patchwork Mon Jul 13 16:35:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 11660573 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5713D1510 for ; Mon, 13 Jul 2020 16:36:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 31C01206F5 for ; Mon, 13 Jul 2020 16:36:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ku9E4tSc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730418AbgGMQgK (ORCPT ); Mon, 13 Jul 2020 12:36:10 -0400 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:53622 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730258AbgGMQgJ (ORCPT ); Mon, 13 Jul 2020 12:36:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1594658166; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zp0Zx5Rtne2iAp2FELym6VDTOjz+nqoJf2nUNSuz50w=; b=Ku9E4tScg1qq3zgoy4czi6K022Y0VEH5uZf1HyeYK+yfTW5FcRozGv2z/lk3Lqo5W7INLj 5tgiUaGYjKOajSIxqVFVbPiXEVnBO066w/scVMGoIc7ljYqqqoqGhE90QIBRA1v8vMIPZS VVftkmiZvKO1qcj3vPiO2YhZR/j9dvE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-354-vEKL5XtbPim6o-hYRlTNDg-1; Mon, 13 Jul 2020 12:36:04 -0400 X-MC-Unique: vEKL5XtbPim6o-hYRlTNDg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E8FB11009600; Mon, 13 Jul 2020 16:36:02 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-112-113.rdu2.redhat.com [10.10.112.113]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7546E5FC34; Mon, 13 Jul 2020 16:35:56 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
From patchwork Mon Jul 13 16:35:55 2020 Subject: [PATCH 28/32] fscache, cachefiles: Rewrite invalidation From: David Howells Date: Mon, 13 Jul 2020 17:35:55 +0100 Message-ID: <159465815567.1376674.11728768649953111384.stgit@warthog.procyon.org.uk> Rewrite the cache object invalidation code in fscache and cachefiles. The following changes are made to fscache: (1) Invalidation is now ignored or allowed to proceed depending on the 'stage' a non-index cookie is in with respect to the backing object. (2) If invalidation proceeds, it pins the object and holds an operation count for the duration. (3) The fscache_object struct is given an invalidation counter that is incremented any time fscache_invalidate() is called, even if the cookie is at a stage in which the invalidation cannot be applied. The counter can, however, be noted and applied retroactively later. (4) The invalidation counter is noted in the operation struct when a cache operation is begun and can be checked on operation completion to find out if any consequent metadata changes should be dropped. (5) New operations aren't allowed to proceed if the object is being invalidated. and to cachefiles: (1) If an open object is invalidated, the open backing file is replaced with a tmpfile (as if opened O_TMPFILE). This is held unlinked until the object is released from memory, at which point the file is simply abandoned if it was retired, or the old file is unlinked and the new one is linked into its place. Note: This would be easier if linkat() could be given a flag to indicate the destination should be overwritten or if RENAME_EXCHANGE could be applied to tmpfiles, effectively unlinking the destination. (2) Upon invalidation, the content map is replaced with a blank one.
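To make points (3) and (4) above concrete, here is a minimal sketch of how the invalidation counter is meant to be used (illustrative only, not part of the patch; example_io_done is a hypothetical completion handler):

/*
 * req->inval_counter is noted from object->inval_counter when the
 * operation is begun (see the fs/fscache/io.c hunk below); if the
 * object has been invalidated in the meantime, the counters no longer
 * match and any consequent metadata changes are simply dropped.
 */
static void example_io_done(struct fscache_io_request *req,
			    struct fscache_object *object)
{
	if (req->inval_counter != object->inval_counter)
		return;	/* Superseded by invalidation; drop the changes */

	/* ... commit metadata changes, e.g. mark content-map bits ... */
}

cachefiles_mark_content_map() in this patch performs exactly this check before marking granules as present.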
Signed-off-by: David Howells --- fs/afs/inode.c | 8 ++- fs/cachefiles/content-map.c | 32 ++++++++++ fs/cachefiles/interface.c | 130 ++++++++++++++++++++++++++++++++++------- fs/cachefiles/internal.h | 9 ++- fs/cachefiles/namei.c | 69 ++++++++++++++++++++-- fs/cachefiles/xattr.c | 6 +- fs/fscache/cookie.c | 47 +++++++++++++-- fs/fscache/io.c | 2 + fs/fscache/obj.c | 31 +++------- include/linux/fscache-cache.h | 5 +- include/linux/fscache.h | 15 +++-- 11 files changed, 283 insertions(+), 71 deletions(-) diff --git a/fs/afs/inode.c b/fs/afs/inode.c index b0772e64a844..eab191b9c01d 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -569,7 +569,13 @@ static void afs_zap_data(struct afs_vnode *vnode) _enter("{%llx:%llu}", vnode->fid.vid, vnode->fid.vnode); #ifdef CONFIG_AFS_FSCACHE - fscache_invalidate(vnode->cache, i_size_read(&vnode->vfs_inode)); + { + struct afs_vnode_cache_aux aux = { + .data_version = vnode->status.data_version, + }; + fscache_invalidate(afs_vnode_cache(vnode), &aux, + i_size_read(&vnode->vfs_inode), 0); + } #endif /* nuke all the non-dirty pages that aren't locked, mapped or being diff --git a/fs/cachefiles/content-map.c b/fs/cachefiles/content-map.c index f2a10e8d8d6d..3e310fd58497 100644 --- a/fs/cachefiles/content-map.c +++ b/fs/cachefiles/content-map.c @@ -192,6 +192,34 @@ void cachefiles_shape_request(struct fscache_object *obj, shape->to_be_done, shape->actual_start, shape->actual_nr_pages); } +/* + * Allocate a new content map. + */ +u8 *cachefiles_new_content_map(struct cachefiles_object *object, + unsigned int *_size) +{ + size_t size; + u8 *map = NULL; + + _enter(""); + + if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { + /* Single-chunk object. The presence or absence of the content + * map xattr is sufficient indication. + */ + *_size = 0; + return NULL; + } + + /* Granular object. */ + size = cachefiles_map_size(object->fscache.cookie->object_size); + map = kzalloc(size, GFP_KERNEL); + if (!map) + return ERR_PTR(-ENOMEM); + *_size = size; + return map; +} + /* * Mark the content map to indicate stored granule. */ @@ -205,7 +233,9 @@ void cachefiles_mark_content_map(struct fscache_io_request *req) read_lock_bh(&object->content_map_lock); - if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { + if (req->inval_counter != object->fscache.inval_counter) { + _debug("inval mark"); + } else if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK) { if (pos == 0) { object->content_info = CACHEFILES_CONTENT_SINGLE; set_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->fscache.flags); diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 78180d269c5f..76f3a89d3e6c 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -203,7 +203,7 @@ static void cachefiles_update_object(struct fscache_object *_object) } } - cachefiles_set_object_xattr(object, XATTR_REPLACE); + cachefiles_set_object_xattr(object); out: cachefiles_end_secure(cache, saved_cred); @@ -213,11 +213,15 @@ static void cachefiles_update_object(struct fscache_object *_object) /* * Commit changes to the object as we drop it.
*/ -static void cachefiles_commit_object(struct cachefiles_object *object, +static bool cachefiles_commit_object(struct cachefiles_object *object, struct cachefiles_cache *cache) { if (object->content_map_changed) cachefiles_save_content_map(object); + + if (test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) + return cachefiles_commit_tmpfile(cache, object); + return true; } /* @@ -424,47 +428,127 @@ static int cachefiles_attr_changed(struct cachefiles_object *object) } /* - * Invalidate an object + * Create a temporary file and leave it unattached and un-xattr'd until the + * time comes to discard the object from memory. */ -static void cachefiles_invalidate_object(struct fscache_object *_object) +static struct file *cachefiles_create_tmpfile(struct cachefiles_object *object) { - struct cachefiles_object *object; struct cachefiles_cache *cache; const struct cred *saved_cred; + struct file *file; struct path path; uint64_t ni_size; - int ret; + long ret; - object = container_of(_object, struct cachefiles_object, fscache); cache = container_of(object->fscache.cache, struct cachefiles_cache, cache); ni_size = object->fscache.cookie->object_size; ni_size = round_up(ni_size, CACHEFILES_DIO_BLOCK_SIZE); + cachefiles_begin_secure(cache, &saved_cred); + + path.mnt = cache->mnt; + path.dentry = vfs_tmpfile(cache->graveyard, S_IFREG, O_RDWR); + if (IS_ERR(path.dentry)) { + if (PTR_ERR(path.dentry) == -EIO) + cachefiles_io_error_obj(object, "Failed to create tmpfile"); + file = ERR_CAST(path.dentry); + goto out; + } + + trace_cachefiles_tmpfile(object, d_inode(path.dentry)); + + if (ni_size > 0) { + trace_cachefiles_trunc(object, d_inode(path.dentry), 0, ni_size); + ret = vfs_truncate(&path, ni_size); + if (ret < 0) { + file = ERR_PTR(ret); + goto out_dput; + } + } + + file = open_with_fake_path(&path, + O_RDWR | O_LARGEFILE | O_DIRECT, + d_backing_inode(path.dentry), + cache->cache_cred); +out_dput: + dput(path.dentry); +out: + cachefiles_end_secure(cache, saved_cred); + return file; +} + +/* + * Invalidate an object + */ +static bool cachefiles_invalidate_object(struct fscache_object *_object, + unsigned int flags) +{ + struct cachefiles_object *object; + struct file *file, *old_file; + u8 *map, *old_map; + unsigned int map_size; + + object = container_of(_object, struct cachefiles_object, fscache); + _enter("{OBJ%x},[%llu]", - object->fscache.debug_id, (unsigned long long)ni_size); + object->fscache.debug_id, _object->cookie->object_size); + + if ((flags & FSCACHE_INVAL_LIGHT) && + test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) { + _leave(" = t [light]"); + return true; + } if (object->dentry) { ASSERT(d_is_reg(object->dentry)); - path.dentry = object->dentry; - path.mnt = cache->mnt; - - cachefiles_begin_secure(cache, &saved_cred); - ret = vfs_truncate(&path, 0); - if (ret == 0) - ret = vfs_truncate(&path, ni_size); - cachefiles_end_secure(cache, saved_cred); - - if (ret != 0) { - if (ret == -EIO) - cachefiles_io_error_obj(object, - "Invalidate failed"); - } + file = cachefiles_create_tmpfile(object); + if (IS_ERR(file)) + goto failed; + + map = cachefiles_new_content_map(object, &map_size); + if (IS_ERR(map)) + goto failed_fput; + + /* Substitute the VFS target */ + _debug("sub"); + dget(file->f_path.dentry); /* Do outside of content_map_lock */ + spin_lock(&object->fscache.lock); + write_lock_bh(&object->content_map_lock); + + if (!object->old) + /* Save the dentry carrying the path information */ + object->old = object->dentry; + + old_file = object->backing_file; + 
old_map = object->content_map; + object->backing_file = file; + object->dentry = file->f_path.dentry; + object->content_info = CACHEFILES_CONTENT_NO_DATA; + object->content_map = map; + object->content_map_size = map_size; + object->content_map_changed = true; + set_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags); + set_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->fscache.flags); + + write_unlock_bh(&object->content_map_lock); + spin_unlock(&object->fscache.lock); + _debug("subbed"); + + kfree(old_map); + fput(old_file); } - _leave(""); + _leave(" = t [tmpfile]"); + return true; + +failed_fput: + fput(file); +failed: + _leave(" = f"); + return false; } static unsigned int cachefiles_get_object_usage(const struct fscache_object *_object) diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index c91a9b3c5bd5..ba60fc9dda0a 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -54,6 +54,8 @@ struct cachefiles_object { struct file *backing_file; /* File open on backing storage */ loff_t i_size; /* object size */ atomic_t usage; /* object usage count */ + unsigned long flags; +#define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Object has a tmpfile that needs linking */ uint8_t type; /* object type */ bool new; /* T if object new */ @@ -127,6 +129,7 @@ extern void cachefiles_daemon_unbind(struct cachefiles_cache *cache); */ extern void cachefiles_shape_request(struct fscache_object *object, struct fscache_request_shape *shape); +extern u8 *cachefiles_new_content_map(struct cachefiles_object *object, unsigned int *_size); extern void cachefiles_mark_content_map(struct fscache_io_request *req); extern void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t size); extern void cachefiles_shorten_content_map(struct cachefiles_object *object, loff_t new_size); @@ -185,6 +188,9 @@ extern int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, extern int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir, char *filename); +extern bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache, + struct cachefiles_object *object); + /* * proc.c */ @@ -237,8 +243,7 @@ static inline void cachefiles_end_secure(struct cachefiles_cache *cache, * xattr.c */ extern int cachefiles_check_object_type(struct cachefiles_object *object); -extern int cachefiles_set_object_xattr(struct cachefiles_object *object, - unsigned int xattr_flags); +extern int cachefiles_set_object_xattr(struct cachefiles_object *object); extern int cachefiles_check_auxdata(struct cachefiles_object *object); extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache, struct dentry *dentry); diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index 3dc64ae5dde8..e63ee4b88268 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -468,7 +468,7 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, if (object->new) { /* attach data to a newly constructed terminal object */ - ret = cachefiles_set_object_xattr(object, XATTR_CREATE); + ret = cachefiles_set_object_xattr(object); if (ret < 0) goto check_error; } else { @@ -487,8 +487,6 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, pr_warn("cachefiles: Block size too large\n"); goto check_error; } - - object->old = dget(object->dentry); } else { BUG(); // TODO: open file in data-class subdir } @@ -523,9 +521,7 @@ bool cachefiles_walk_to_object(struct cachefiles_object *parent, cachefiles_unmark_inode_in_use(object, object->dentry);
cachefiles_mark_object_inactive(cache, object); dput(object->dentry); - dput(object->old); object->dentry = NULL; - object->old = NULL; goto error_out; lookup_error: @@ -811,3 +807,66 @@ int cachefiles_check_in_use(struct cachefiles_cache *cache, struct dentry *dir, //_leave(" = 0"); return ret; } + +/* + * Attempt to link a temporary file into its rightful place in the cache. + */ +bool cachefiles_commit_tmpfile(struct cachefiles_cache *cache, + struct cachefiles_object *object) +{ + struct dentry *dir, *dentry, *old; + char *name; + unsigned int namelen; + bool success = false; + int ret; + + _enter(",%pd", object->old); + + namelen = object->old->d_name.len; + name = kmemdup_nul(object->old->d_name.name, namelen, GFP_KERNEL); + if (!name) + goto out; + + dir = dget_parent(object->old); + + inode_lock_nested(d_inode(dir), I_MUTEX_PARENT); + ret = cachefiles_bury_object(cache, object, dir, object->old, + FSCACHE_OBJECT_IS_STALE); + dput(object->old); + object->old = NULL; + if (ret < 0 && ret != -ENOENT) { + _debug("bury fail %d", ret); + goto out_name; + } + + inode_lock_nested(d_inode(dir), I_MUTEX_PARENT); + dentry = lookup_one_len(name, dir, namelen); + if (IS_ERR(dentry)) { + _debug("lookup fail %ld", PTR_ERR(dentry)); + goto out_unlock; + } + + ret = vfs_link(object->dentry, d_inode(dir), dentry, NULL); + if (ret < 0) { + _debug("link fail %d", ret); + dput(dentry); + } else { + trace_cachefiles_link(object, d_inode(object->dentry)); + spin_lock(&object->fscache.lock); + old = object->dentry; + object->dentry = dentry; + success = true; + clear_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags); + spin_unlock(&object->fscache.lock); + dput(old); + } + +out_unlock: + inode_unlock(d_inode(dir)); +out_name: + kfree(name); + dput(dir); +out: + _leave(" = %u", success); + return success; +} diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index a1d4a3d1db69..22c56ca2fd0b 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -104,8 +104,7 @@ int cachefiles_check_object_type(struct cachefiles_object *object) /* * set the state xattr on a cache file */ -int cachefiles_set_object_xattr(struct cachefiles_object *object, - unsigned int xattr_flags) +int cachefiles_set_object_xattr(struct cachefiles_object *object) { struct cachefiles_xattr *buf; struct dentry *dentry = object->dentry; @@ -129,8 +128,7 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object, memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len); ret = vfs_setxattr(dentry, cachefiles_xattr_cache, - buf, sizeof(struct cachefiles_xattr) + len, - xattr_flags); + buf, sizeof(struct cachefiles_xattr) + len, 0); if (ret < 0) { trace_cachefiles_coherency(object, d_inode(dentry)->i_ino, buf->content, diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index 2d9d147411cd..fc93f4b69198 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -472,10 +472,14 @@ void fscache_set_cookie_stage(struct fscache_cookie *cookie, } /* - * Invalidate an object. Callable with spinlocks held. + * Invalidate an object. 
*/ -void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size) +void __fscache_invalidate(struct fscache_cookie *cookie, + const void *aux_data, loff_t new_size, + unsigned int flags) { + struct fscache_object *object = NULL; + _enter("{%s}", cookie->type_name); fscache_stat(&fscache_n_invalidates); @@ -488,13 +492,42 @@ void __fscache_invalidate(struct fscache_cookie *cookie, loff_t new_size) ASSERTCMP(cookie->type, !=, FSCACHE_COOKIE_TYPE_INDEX); spin_lock(&cookie->lock); - cookie->object_size = new_size; + fscache_update_aux(cookie, aux_data, &new_size); cookie->zero_point = new_size; - spin_unlock(&cookie->lock); - if (!hlist_empty(&cookie->backing_objects) && - test_and_set_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags)) - fscache_dispatch(cookie, NULL, 0, fscache_invalidate_object); + if (!hlist_empty(&cookie->backing_objects)) { + object = hlist_entry(cookie->backing_objects.first, + struct fscache_object, cookie_link); + object->inval_counter++; + } + + switch (cookie->stage) { + case FSCACHE_COOKIE_STAGE_QUIESCENT: + case FSCACHE_COOKIE_STAGE_DEAD: + case FSCACHE_COOKIE_STAGE_INITIALISING: /* Assume later checks will catch it */ + case FSCACHE_COOKIE_STAGE_INVALIDATING: /* is_still_valid will catch it */ + spin_unlock(&cookie->lock); + _leave(" [no %u]", cookie->stage); + return; + + case FSCACHE_COOKIE_STAGE_LOOKING_UP: + spin_unlock(&cookie->lock); + _leave(" [look %x]", object->inval_counter); + return; + + case FSCACHE_COOKIE_STAGE_NO_DATA_YET: + case FSCACHE_COOKIE_STAGE_ACTIVE: + cookie->stage = FSCACHE_COOKIE_STAGE_INVALIDATING; + wake_up_var(&cookie->stage); + + atomic_inc(&cookie->n_ops); + object->cache->ops->grab_object(object, fscache_obj_get_inval); + spin_unlock(&cookie->lock); + + fscache_dispatch(cookie, object, flags, fscache_invalidate_object); + _leave(" [inv]"); + return; + } } EXPORT_SYMBOL(__fscache_invalidate); diff --git a/fs/fscache/io.c b/fs/fscache/io.c index d38101d77d27..1885cfbe7f04 100644 --- a/fs/fscache/io.c +++ b/fs/fscache/io.c @@ -84,6 +84,8 @@ static struct fscache_object *fscache_begin_io_operation( goto not_live; object->cache->ops->grab_object(object, fscache_obj_get_ioreq); + if (req) + req->inval_counter = object->inval_counter; atomic_inc(&cookie->n_ops); spin_unlock(&cookie->lock); diff --git a/fs/fscache/obj.c b/fs/fscache/obj.c index baab7c465142..a7064a4cb486 100644 --- a/fs/fscache/obj.c +++ b/fs/fscache/obj.c @@ -241,32 +241,21 @@ void fscache_lookup_object(struct fscache_cookie *cookie, } /* - * Invalidate an object + * Invalidate an object. param passes the invalidation flags. 
*/ void fscache_invalidate_object(struct fscache_cookie *cookie, - struct fscache_object *unused, int param) + struct fscache_object *object, int flags) { - struct fscache_object *object = NULL; + bool success; - spin_lock(&cookie->lock); - - if (!hlist_empty(&cookie->backing_objects)) { - object = hlist_entry(cookie->backing_objects.first, - struct fscache_object, - cookie_link); - object = object->cache->ops->grab_object(object, - fscache_obj_get_inval); - } - - spin_unlock(&cookie->lock); - - if (object) { - object->cache->ops->invalidate_object(object); - fscache_do_put_object(object, fscache_obj_put_inval); - } + success = object->cache->ops->invalidate_object(object, flags); + fscache_do_put_object(object, fscache_obj_put_inval); - clear_bit_unlock(FSCACHE_COOKIE_INVALIDATING, &cookie->flags); - wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING); + if (success) + fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_NO_DATA_YET); + else + fscache_set_cookie_stage(cookie, FSCACHE_COOKIE_STAGE_DEAD); + fscache_end_io_operation(cookie); } /* diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index 1357c44d371b..da85eb15b3c9 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -120,7 +120,8 @@ struct fscache_cache_ops { void (*update_object)(struct fscache_object *object); /* Invalidate an object */ - void (*invalidate_object)(struct fscache_object *object); + bool (*invalidate_object)(struct fscache_object *object, + unsigned int flags); /* discard the resources pinned by an object and effect retirement if * necessary */ @@ -176,10 +177,12 @@ enum fscache_object_stage { struct fscache_object { int debug_id; /* debugging ID */ int n_children; /* number of child objects */ + unsigned int inval_counter; /* Number of invalidations applied */ enum fscache_object_stage stage; /* Stage of object's lifecycle */ spinlock_t lock; /* state and operations lock */ unsigned long flags; +#define FSCACHE_OBJECT_NEEDS_INVAL 8 /* T if object needs invalidation */ #define FSCACHE_OBJECT_NEEDS_UPDATE 9 /* T if object attrs need writing to disk */ struct list_head cache_link; /* link in cache->object_list */ diff --git a/include/linux/fscache.h b/include/linux/fscache.h index 0aee6edef672..c313950afd8a 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -57,6 +57,8 @@ enum fscache_cookie_type { #define FSCACHE_ADV_WRITE_CACHE 0x00 /* Do cache if written to locally */ #define FSCACHE_ADV_WRITE_NOCACHE 0x02 /* Don't cache if written to locally */ +#define FSCACHE_INVAL_LIGHT 0x01 /* Don't re-invalidate if temp object */ + /* * fscache cached network filesystem type * - name, version and ops must be filled in before registration @@ -105,7 +107,6 @@ struct fscache_cookie { loff_t zero_point; /* Size after which no data on server */ unsigned long flags; -#define FSCACHE_COOKIE_INVALIDATING 4 /* T if cookie is being invalidated */ #define FSCACHE_COOKIE_ACQUIRED 5 /* T if cookie is in use */ #define FSCACHE_COOKIE_RELINQUISHED 6 /* T if cookie has been relinquished */ @@ -178,6 +179,7 @@ struct fscache_io_request { loff_t len; /* Size of the I/O */ loff_t transferred; /* Amount of data transferred */ short error; /* 0 or error that occurred */ + unsigned int inval_counter; /* object->inval_counter at begin_op */ unsigned long flags; #define FSCACHE_IO_DATA_FROM_SERVER 0 /* Set if data was read from server */ #define FSCACHE_IO_DATA_FROM_CACHE 1 /* Set if data was read from the cache */ @@ -230,7 +232,7 @@ extern void 
__fscache_unuse_cookie(struct fscache_cookie *, const void *, const extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool); extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *); extern void __fscache_shape_request(struct fscache_cookie *, struct fscache_request_shape *); -extern void __fscache_invalidate(struct fscache_cookie *, loff_t); +extern void __fscache_invalidate(struct fscache_cookie *, const void *, loff_t, unsigned int); extern void __fscache_init_io_request(struct fscache_io_request *, struct fscache_cookie *); extern void __fscache_free_io_request(struct fscache_io_request *); @@ -461,22 +463,23 @@ void fscache_unpin_cookie(struct fscache_cookie *cookie) /** * fscache_invalidate - Notify cache that an object needs invalidation * @cookie: The cookie representing the cache object + * @aux_data: The updated auxiliary data for the cookie (may be NULL) * @size: The revised size of the object. + * @flags: Invalidation flags (FSCACHE_INVAL_*) * * Notify the cache that an object needs to be invalidated and that it * should abort any retrievals or stores it is doing on the cache. The object * is then marked non-caching until such time as the invalidation is complete. * - * This can be called with spinlocks held. - * * See Documentation/filesystems/caching/netfs-api.rst for a complete * description. */ static inline -void fscache_invalidate(struct fscache_cookie *cookie, loff_t size) +void fscache_invalidate(struct fscache_cookie *cookie, + const void *aux_data, loff_t size, unsigned int flags) { if (fscache_cookie_valid(cookie)) - __fscache_invalidate(cookie, size); + __fscache_invalidate(cookie, aux_data, size, flags); } /**
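The afs conversion at the top of this patch shows the new calling convention; generically, a netfs would now do something like the following (illustrative sketch only; the example_* names are hypothetical stand-ins for netfs-specific code):

/*
 * Pass the updated coherency data and the revised size in one call.
 * Passing FSCACHE_INVAL_LIGHT as the flags instead of 0 asks the cache
 * not to re-invalidate an object that a previous invalidation has
 * already switched over to a tmpfile.
 */
struct example_aux {
	u64	data_version;	/* netfs-specific coherency datum */
};

static void example_invalidate(struct fscache_cookie *cookie,
			       struct inode *inode, u64 data_version)
{
	struct example_aux aux = {
		.data_version = data_version,
	};

	fscache_invalidate(cookie, &aux, i_size_read(inode), 0);
}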
From patchwork Mon Jul 13 16:36:08 2020 Subject: [PATCH 29/32] fscache: Implement "will_modify" parameter on fscache_use_cookie() From: David Howells Date: Mon, 13 Jul 2020 17:36:08 +0100 Message-ID: <159465816816.1376674.16552237991218497564.stgit@warthog.procyon.org.uk> Implement the "will_modify" parameter passed to fscache_use_cookie(). Setting this to true will henceforth cause the affected object to be marked as dirty on disk, subject to conflict resolution in the event of a power failure or crash, or if the filesystem operates in disconnected mode. The dirty flag is removed when the fscache_object is discarded from memory. A cache hook is provided to prepare for writing, and this can be used to mark the object dirty on disk. Signed-off-by: David Howells --- fs/cachefiles/interface.c | 65 +++++++++++++++++++++++++++++++++++++++++ fs/cachefiles/internal.h | 2 + fs/cachefiles/xattr.c | 21 +++++++++++++ fs/fscache/cookie.c | 17 +++++++++-- fs/fscache/internal.h | 1 + fs/fscache/obj.c | 28 +++++++++++++++--- include/linux/fscache-cache.h | 4 +++ 7 files changed, 129 insertions(+), 9 deletions(-) diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index 76f3a89d3e6c..c626cc4248a7 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -210,14 +210,78 @@ static void cachefiles_update_object(struct fscache_object *_object) _leave(""); } +/* + * Shorten the backing object to discard any dirty data and free up + * any unused granules.
+ */ +static bool cachefiles_shorten_object(struct cachefiles_object *object, loff_t new_size) +{ + struct cachefiles_cache *cache; + struct inode *inode; + struct path path; + loff_t i_size; + + cache = container_of(object->fscache.cache, + struct cachefiles_cache, cache); + path.mnt = cache->mnt; + path.dentry = object->dentry; + + inode = d_inode(object->dentry); + trace_cachefiles_trunc(object, inode, i_size_read(inode), new_size); + if (vfs_truncate(&path, new_size) < 0) { + cachefiles_io_error_obj(object, "Trunc-to-size failed"); + cachefiles_remove_object_xattr(cache, object->dentry); + return false; + } + + new_size = round_up(new_size, CACHEFILES_DIO_BLOCK_SIZE); + i_size = i_size_read(inode); + if (i_size < new_size) { + trace_cachefiles_trunc(object, inode, i_size, new_size); + if (vfs_truncate(&path, new_size) < 0) { + cachefiles_io_error_obj(object, "Trunc-to-dio-size failed"); + cachefiles_remove_object_xattr(cache, object->dentry); + return false; + } + } + + return true; +} + +/* + * Trim excess stored data off of an object. + */ +static bool cachefiles_trim_object(struct cachefiles_object *object) +{ + loff_t object_size; + + _enter("{OBJ%x}", object->fscache.debug_id); + + object_size = object->fscache.cookie->object_size; + if (i_size_read(d_inode(object->dentry)) <= object_size) + return true; + + return cachefiles_shorten_object(object, object_size); +} + /* * Commit changes to the object as we drop it. */ static bool cachefiles_commit_object(struct cachefiles_object *object, struct cachefiles_cache *cache) { + bool update = false; + if (object->content_map_changed) cachefiles_save_content_map(object); + if (test_and_clear_bit(FSCACHE_OBJECT_LOCAL_WRITE, &object->fscache.flags)) + update = true; + if (test_and_clear_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->fscache.flags)) + update = true; + if (update) { + if (cachefiles_trim_object(object)) + cachefiles_set_object_xattr(object); + } if (test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) return cachefiles_commit_tmpfile(cache, object); @@ -575,5 +639,6 @@ const struct fscache_cache_ops cachefiles_cache_ops = { .shape_request = cachefiles_shape_request, .read = cachefiles_read, .write = cachefiles_write, + .prepare_to_write = cachefiles_prepare_to_write, .display_object = cachefiles_display_object, }; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index ba60fc9dda0a..bfe56eb53104 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -247,7 +247,7 @@ extern int cachefiles_set_object_xattr(struct cachefiles_object *object); extern int cachefiles_check_auxdata(struct cachefiles_object *object); extern int cachefiles_remove_object_xattr(struct cachefiles_cache *cache, struct dentry *dentry); - +extern int cachefiles_prepare_to_write(struct fscache_object *object); /* * error handling diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index 22c56ca2fd0b..456301b7abb0 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -124,6 +124,8 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object) buf->zero_point = cpu_to_be64(object->fscache.cookie->zero_point); buf->type = object->fscache.cookie->type; buf->content = object->content_info; + if (test_bit(FSCACHE_OBJECT_LOCAL_WRITE, &object->fscache.flags)) + buf->content = CACHEFILES_CONTENT_DIRTY; if (len > 0) memcpy(buf->data, fscache_get_aux(object->fscache.cookie), len); @@ -184,10 +186,16 @@ int cachefiles_check_auxdata(struct cachefiles_object *object) why = cachefiles_coherency_check_aux; } else if 
(be64_to_cpu(buf->object_size) != object->fscache.cookie->object_size) { why = cachefiles_coherency_check_objsize; + } else if (buf->content == CACHEFILES_CONTENT_DIRTY) { + // TODO: Begin conflict resolution + pr_warn("Dirty object in cache\n"); + why = cachefiles_coherency_check_dirty; } else { object->fscache.cookie->zero_point = be64_to_cpu(buf->zero_point); object->content_info = buf->content; why = cachefiles_coherency_check_ok; + object->fscache.cookie->zero_point = be64_to_cpu(buf->zero_point); + object->content_info = buf->content; ret = 0; } @@ -219,3 +227,16 @@ int cachefiles_remove_object_xattr(struct cachefiles_cache *cache, _leave(" = %d", ret); return ret; } + +/* + * Stick a marker on the cache object to indicate that it's dirty. + */ +int cachefiles_prepare_to_write(struct fscache_object *_object) +{ + struct cachefiles_object *object = + container_of(_object, struct cachefiles_object, fscache); + + _enter("c=%08x", object->fscache.cookie->debug_id); + + return cachefiles_set_object_xattr(object); +} diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index fc93f4b69198..22cb8efe261f 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -342,6 +342,8 @@ EXPORT_SYMBOL(__fscache_acquire_cookie); void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify) { enum fscache_cookie_stage stage; + struct fscache_object *object; + bool write_set; _enter("c=%08x", cookie->debug_id); @@ -360,7 +362,7 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify) /* The lookup job holds its own active increment */ atomic_inc(&cookie->n_active); - fscache_dispatch(cookie, NULL, 0, fscache_lookup_object); + fscache_dispatch(cookie, NULL, will_modify, fscache_lookup_object); break; case FSCACHE_COOKIE_STAGE_INITIALISING: @@ -373,8 +375,17 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify) case FSCACHE_COOKIE_STAGE_NO_DATA_YET: case FSCACHE_COOKIE_STAGE_ACTIVE: case FSCACHE_COOKIE_STAGE_INVALIDATING: - // TODO: Handle will_modify - spin_unlock(&cookie->lock); + if (will_modify) { + object = hlist_entry(cookie->backing_objects.first, + struct fscache_object, cookie_link); + write_set = test_and_set_bit(FSCACHE_OBJECT_LOCAL_WRITE, + &object->flags); + spin_unlock(&cookie->lock); + if (!write_set) + fscache_dispatch(cookie, object, 0, fscache_prepare_to_write); + } else { + spin_unlock(&cookie->lock); + } break; case FSCACHE_COOKIE_STAGE_DEAD: diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index d9391d3974d1..120bb68f74b1 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -136,6 +136,7 @@ extern void fscache_lookup_object(struct fscache_cookie *, struct fscache_object extern void fscache_invalidate_object(struct fscache_cookie *, struct fscache_object *, int); extern void fscache_drop_object(struct fscache_cookie *, struct fscache_object *, bool); extern void fscache_discard_objects(struct fscache_cookie *, struct fscache_object *, int); +extern void fscache_prepare_to_write(struct fscache_cookie *, struct fscache_object *, int); /* * object-list.c diff --git a/fs/fscache/obj.c b/fs/fscache/obj.c index a7064a4cb486..139b59472628 100644 --- a/fs/fscache/obj.c +++ b/fs/fscache/obj.c @@ -117,7 +117,8 @@ static bool fscache_wrangle_object(struct fscache_cookie *cookie, * Create an object chain, making sure that the index chain is fully created. 
*/ static struct fscache_object *fscache_lookup_object_chain(struct fscache_cookie *cookie, - struct fscache_cache *cache) + struct fscache_cache *cache, + bool will_modify) { struct fscache_object *object = NULL, *parent, *xobject; @@ -131,7 +132,7 @@ static struct fscache_object *fscache_lookup_object_chain(struct fscache_cookie spin_unlock(&cookie->lock); /* Recurse to look up/create the parent index. */ - parent = fscache_lookup_object_chain(cookie->parent, cache); + parent = fscache_lookup_object_chain(cookie->parent, cache, false); if (IS_ERR(parent)) goto error; @@ -146,6 +147,9 @@ static struct fscache_object *fscache_lookup_object_chain(struct fscache_cookie if (!object) goto error; + if (will_modify) + __set_bit(FSCACHE_OBJECT_LOCAL_WRITE, &object->flags); + xobject = fscache_attach_object(cookie, object); if (xobject != object) { fscache_do_put_object(object, fscache_obj_put_alloc_dup); @@ -203,7 +207,8 @@ static struct fscache_object *fscache_lookup_object_chain(struct fscache_cookie * - this must make sure the index chain is instantiated and instantiate the * object representation too */ -static void fscache_lookup_object_locked(struct fscache_cookie *cookie) +static void fscache_lookup_object_locked(struct fscache_cookie *cookie, + bool will_modify) { struct fscache_object *object; struct fscache_cache *cache; @@ -221,12 +226,16 @@ static void fscache_lookup_object_locked(struct fscache_cookie *cookie) _debug("cache %s", cache->tag->name); - object = fscache_lookup_object_chain(cookie, cache); + object = fscache_lookup_object_chain(cookie, cache, will_modify); if (!object) { _leave(" [fail]"); return; } + if (will_modify && + test_and_set_bit(FSCACHE_OBJECT_LOCAL_WRITE, &object->flags)) + fscache_prepare_to_write(cookie, object, 0); + fscache_do_put_object(object, fscache_obj_put); _leave(" [done]"); } @@ -235,7 +244,7 @@ void fscache_lookup_object(struct fscache_cookie *cookie, struct fscache_object *object, int param) { down_read(&fscache_addremove_sem); - fscache_lookup_object_locked(cookie); + fscache_lookup_object_locked(cookie, param); up_read(&fscache_addremove_sem); __fscache_unuse_cookie(cookie, NULL, NULL); } @@ -339,3 +348,12 @@ void fscache_discard_objects(struct fscache_cookie *cookie, up_read(&fscache_addremove_sem); _leave(""); } + +/* + * Prepare a cache object to be written to. 
+ */ +void fscache_prepare_to_write(struct fscache_cookie *cookie, + struct fscache_object *object, int param) +{ + object->cache->ops->prepare_to_write(object); +} diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index da85eb15b3c9..3625fd431d9f 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -154,6 +154,9 @@ struct fscache_cache_ops { struct fscache_io_request *req, struct iov_iter *iter); + /* Prepare to write to a live cache object */ + int (*prepare_to_write)(struct fscache_object *object); + /* Display object info in /proc/fs/fscache/objects */ int (*display_object)(struct seq_file *m, struct fscache_object *object); }; @@ -182,6 +185,7 @@ struct fscache_object { spinlock_t lock; /* state and operations lock */ unsigned long flags; +#define FSCACHE_OBJECT_LOCAL_WRITE 1 /* T if the object is being modified locally */ #define FSCACHE_OBJECT_NEEDS_INVAL 8 /* T if object needs invalidation */ #define FSCACHE_OBJECT_NEEDS_UPDATE 9 /* T if object attrs need writing to disk */
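Tying the pieces together, the intended netfs-side usage looks something like this (illustrative sketch only; example_inode_cookie() is a hypothetical accessor for the cookie attached to the netfs inode):

/*
 * Using a cookie with will_modify set causes FSCACHE_OBJECT_LOCAL_WRITE
 * to be set on the backing object and, on the first such use,
 * fscache_prepare_to_write() to be dispatched - which in cachefiles
 * stamps CACHEFILES_CONTENT_DIRTY into the coherency xattr.
 */
static int example_file_open(struct inode *inode, struct file *file)
{
	struct fscache_cookie *cookie = example_inode_cookie(inode);

	fscache_use_cookie(cookie, file->f_mode & FMODE_WRITE);
	return 0;
}

A matching fscache_unuse_cookie() call would then be made from the netfs's ->release() handler once the writes have been committed.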
From patchwork Mon Jul 13 16:36:22 2020 Subject: [PATCH 30/32] fscache: Provide resize operation From: David Howells Date: Mon, 13 Jul 2020 17:36:22 +0100 Message-ID: <159465818273.1376674.5693474446095659046.stgit@warthog.procyon.org.uk> Provide a cache operation to resize an object. This is intended to be run synchronously rather than being deferred, as it really needs to run inside the inode lock on the netfs inode from ->setattr() to correctly order with respect to other truncates and writes. Signed-off-by: David Howells --- fs/cachefiles/interface.c | 24 ++++++++++++++++++++++++ fs/fscache/internal.h | 3 +++ fs/fscache/io.c | 27 +++++++++++++++++++++++++++ fs/fscache/stats.c | 9 +++++++-- include/linux/fscache-cache.h | 2 ++ include/linux/fscache.h | 18 ++++++++++++++++++ 6 files changed, 81 insertions(+), 2 deletions(-) diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index c626cc4248a7..d4172a40ddc9 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -248,6 +248,29 @@ static bool cachefiles_shorten_object(struct cachefiles_object *object, loff_t n return true; } +/* + * Resize the backing object. + */ +static void cachefiles_resize_object(struct fscache_object *_object, loff_t new_size) +{ + struct cachefiles_object *object = + container_of(_object, struct cachefiles_object, fscache); + loff_t old_size = object->fscache.cookie->object_size; + + _enter("%llu->%llu", old_size, new_size); + + if (new_size < old_size) { + cachefiles_shorten_content_map(object, new_size); + cachefiles_shorten_object(object, new_size); + return; + } + + /* The file is being expanded. We don't need to do anything in + * particular. cookie->initial_size doesn't change and so the point + * before which we have to download doesn't change. + */ +} + /* * Trim excess stored data off of an object.
*/ @@ -631,6 +654,7 @@ const struct fscache_cache_ops cachefiles_cache_ops = { .free_lookup_data = cachefiles_free_lookup_data, .grab_object = cachefiles_grab_object, .update_object = cachefiles_update_object, + .resize_object = cachefiles_resize_object, .invalidate_object = cachefiles_invalidate_object, .drop_object = cachefiles_drop_object, .put_object = cachefiles_put_object, diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index 120bb68f74b1..eb61e0716e20 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -178,6 +178,9 @@ extern atomic_t fscache_n_updates; extern atomic_t fscache_n_updates_null; extern atomic_t fscache_n_updates_run; +extern atomic_t fscache_n_resizes; +extern atomic_t fscache_n_resizes_null; + extern atomic_t fscache_n_relinquishes; extern atomic_t fscache_n_relinquishes_null; extern atomic_t fscache_n_relinquishes_retire; diff --git a/fs/fscache/io.c b/fs/fscache/io.c index 1885cfbe7f04..1a074f9c4bbe 100644 --- a/fs/fscache/io.c +++ b/fs/fscache/io.c @@ -172,3 +172,30 @@ int __fscache_write(struct fscache_io_request *req, struct iov_iter *iter) } } EXPORT_SYMBOL(__fscache_write); + +/* + * Change the size of a backing object. + */ +void __fscache_resize_cookie(struct fscache_cookie *cookie, loff_t new_size) +{ + struct fscache_object *object; + + ASSERT(cookie->type != FSCACHE_COOKIE_TYPE_INDEX); + + object = fscache_begin_io_operation(cookie, FSCACHE_WANT_WRITE, NULL); + if (!IS_ERR(object)) { + fscache_stat(&fscache_n_resizes); + set_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->flags); + + /* We cannot defer a resize as we need to do it inside the + * netfs's inode lock so that we're serialised with respect to + * writes. + */ + object->cache->ops->resize_object(object, new_size); + object->cache->ops->put_object(object, fscache_obj_put_ioreq); + fscache_end_io_operation(cookie); + } else { + fscache_stat(&fscache_n_resizes_null); + } +} +EXPORT_SYMBOL(__fscache_resize_cookie); diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c index 63fb4d831f4d..33cea7f527db 100644 --- a/fs/fscache/stats.c +++ b/fs/fscache/stats.c @@ -26,6 +26,9 @@ atomic_t fscache_n_updates; atomic_t fscache_n_updates_null; atomic_t fscache_n_updates_run; +atomic_t fscache_n_resizes; +atomic_t fscache_n_resizes_null; + atomic_t fscache_n_relinquishes; atomic_t fscache_n_relinquishes_null; atomic_t fscache_n_relinquishes_retire; @@ -132,10 +135,12 @@ int fscache_stats_show(struct seq_file *m, void *v) seq_printf(m, "Invals : n=%u\n", atomic_read(&fscache_n_invalidates)); - seq_printf(m, "Updates: n=%u nul=%u run=%u\n", + seq_printf(m, "Updates: n=%u nul=%u run=%u rsz=%u rsn=%u\n", atomic_read(&fscache_n_updates), atomic_read(&fscache_n_updates_null), - atomic_read(&fscache_n_updates_run)); + atomic_read(&fscache_n_updates_run), + atomic_read(&fscache_n_resizes), + atomic_read(&fscache_n_resizes_null)); seq_printf(m, "Relinqs: n=%u nul=%u rtr=%u\n", atomic_read(&fscache_n_relinquishes), diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index 3625fd431d9f..ba0ad89a968e 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -118,6 +118,8 @@ struct fscache_cache_ops { /* store the updated auxiliary data on an object */ void (*update_object)(struct fscache_object *object); + /* Change the size of a data object */ + void (*resize_object)(struct fscache_object *object, loff_t new_size); /* Invalidate an object */ bool (*invalidate_object)(struct fscache_object *object, diff --git a/include/linux/fscache.h 
b/include/linux/fscache.h index c313950afd8a..cd8b6dc81c52 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -232,6 +232,7 @@ extern void __fscache_unuse_cookie(struct fscache_cookie *, const void *, const extern void __fscache_relinquish_cookie(struct fscache_cookie *, bool); extern void __fscache_update_cookie(struct fscache_cookie *, const void *, const loff_t *); extern void __fscache_shape_request(struct fscache_cookie *, struct fscache_request_shape *); +extern void __fscache_resize_cookie(struct fscache_cookie *, loff_t); extern void __fscache_invalidate(struct fscache_cookie *, const void *, loff_t, unsigned int); extern void __fscache_init_io_request(struct fscache_io_request *, struct fscache_cookie *); @@ -431,6 +432,23 @@ void fscache_update_cookie(struct fscache_cookie *cookie, const void *aux_data, __fscache_update_cookie(cookie, aux_data, object_size); } +/** + * fscache_resize_cookie - Request that a cache object be resized + * @cookie: The cookie representing the cache object + * @new_size: The new size of the object + * + * Request that the size of an object be changed. + * + * See Documentation/filesystems/caching/netfs-api.rst for a complete + * description. + */ +static inline +void fscache_resize_cookie(struct fscache_cookie *cookie, loff_t new_size) +{ + if (fscache_cookie_valid(cookie)) + __fscache_resize_cookie(cookie, new_size); +} + /** * fscache_pin_cookie - Pin a data-storage cache object in its cache * @cookie: The cookie representing the cache object
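As the commit message notes, the resize is meant to be issued synchronously from ->setattr() while the netfs inode lock is held; a sketch of such a call site follows (illustrative only; example_inode_cookie() is a hypothetical accessor and the server update is elided):

/*
 * The VFS calls ->setattr() with the inode lock held, so the
 * synchronous cache resize is ordered against concurrent truncates
 * and writes on the same inode.
 */
static int example_setattr(struct dentry *dentry, struct iattr *attr)
{
	struct inode *inode = d_inode(dentry);
	int ret;

	ret = setattr_prepare(dentry, attr);
	if (ret < 0)
		return ret;

	if (attr->ia_valid & ATTR_SIZE)
		fscache_resize_cookie(example_inode_cookie(inode),
				      attr->ia_size);

	/* ... propagate the change to the server and the local inode ... */
	return 0;
}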
From patchwork Mon Jul 13 16:36:37 2020 Subject: [PATCH 31/32] fscache: Remove the update operation From: David Howells Date: Mon, 13 Jul 2020 17:36:37 +0100 Message-ID: <159465819792.1376674.9917789832076544130.stgit@warthog.procyon.org.uk> Remove the cache-side of the object update operation as it doesn't serialise with other setattr, O_TRUNC and write operations. Signed-off-by: David Howells --- fs/cachefiles/interface.c | 59 ----------------------------------------- fs/fscache/internal.h | 1 - fs/fscache/obj.c | 14 ---------- fs/fscache/stats.c | 4 +-- include/linux/fscache-cache.h | 2 - 5 files changed, 1 insertion(+), 79 deletions(-) diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c index d4172a40ddc9..21a06dd575ca 100644 --- a/fs/cachefiles/interface.c +++ b/fs/cachefiles/interface.c @@ -152,64 +152,6 @@ struct fscache_object *cachefiles_grab_object(struct fscache_object *_object, return &object->fscache; } -/* - * update the auxiliary data for an object object on disk - */ -static void cachefiles_update_object(struct fscache_object *_object) -{ - struct cachefiles_object *object; - struct cachefiles_cache *cache; - const struct cred *saved_cred; - struct inode *inode; - loff_t object_size, i_size; - int ret; - - _enter("{OBJ%x}", _object->debug_id); - - object = container_of(_object, struct cachefiles_object, fscache); - cache = container_of(object->fscache.cache, struct cachefiles_cache, - cache); - - cachefiles_begin_secure(cache, &saved_cred); - - object_size = object->fscache.cookie->object_size; - inode = d_inode(object->dentry); - i_size = i_size_read(inode); - if (i_size > object_size) { - struct path path = { - .mnt = cache->mnt, - .dentry = object->dentry - }; - _debug("trunc %llx -> %llx", i_size, object_size); - trace_cachefiles_trunc(object, inode, i_size, object_size); - ret = vfs_truncate(&path, object_size); - if (ret < 0) { - cachefiles_io_error_obj(object, "Trunc-to-size failed"); - cachefiles_remove_object_xattr(cache, object->dentry); - goto out; - } - - object_size = round_up(object_size, CACHEFILES_DIO_BLOCK_SIZE); - i_size = i_size_read(inode); - _debug("trunc %llx -> %llx", i_size, object_size); - if (i_size < object_size) { - trace_cachefiles_trunc(object, inode, i_size, object_size); - ret = vfs_truncate(&path,
object_size); - if (ret < 0) { - cachefiles_io_error_obj(object, "Trunc-to-dio-size failed"); - cachefiles_remove_object_xattr(cache, object->dentry); - goto out; - } - } - } - - cachefiles_set_object_xattr(object); - -out: - cachefiles_end_secure(cache, saved_cred); - _leave(""); -} - /* * Shorten the backing object to discard any dirty data and free up * any unused granules. @@ -653,7 +595,6 @@ const struct fscache_cache_ops cachefiles_cache_ops = { .lookup_object = cachefiles_lookup_object, .free_lookup_data = cachefiles_free_lookup_data, .grab_object = cachefiles_grab_object, - .update_object = cachefiles_update_object, .resize_object = cachefiles_resize_object, .invalidate_object = cachefiles_invalidate_object, .drop_object = cachefiles_drop_object, diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h index eb61e0716e20..ca74b0090e15 100644 --- a/fs/fscache/internal.h +++ b/fs/fscache/internal.h @@ -202,7 +202,6 @@ extern atomic_t fscache_n_cop_alloc_object; extern atomic_t fscache_n_cop_lookup_object; extern atomic_t fscache_n_cop_create_object; extern atomic_t fscache_n_cop_invalidate_object; -extern atomic_t fscache_n_cop_update_object; extern atomic_t fscache_n_cop_drop_object; extern atomic_t fscache_n_cop_put_object; extern atomic_t fscache_n_cop_sync_cache; diff --git a/fs/fscache/obj.c b/fs/fscache/obj.c index 139b59472628..a2306f32044c 100644 --- a/fs/fscache/obj.c +++ b/fs/fscache/obj.c @@ -54,14 +54,6 @@ static int fscache_do_create_object(struct fscache_object *object, void *data) return ret; } -static void fscache_do_update_object(struct fscache_object *object) -{ - fscache_stat(&fscache_n_updates_run); - fscache_stat(&fscache_n_cop_update_object); - object->cache->ops->update_object(object); - fscache_stat_d(&fscache_n_cop_update_object); -} - static void fscache_do_drop_object(struct fscache_cache *cache, struct fscache_object *object, bool invalidate) @@ -282,12 +274,6 @@ void fscache_drop_object(struct fscache_cookie *cookie, _enter("{OBJ%x,%d},%u", object->debug_id, object->n_children, invalidate); - if (!invalidate && - test_bit(FSCACHE_OBJECT_NEEDS_UPDATE, &object->flags)) { - _debug("final update"); - fscache_do_update_object(object); - } - spin_lock(&cache->object_list_lock); list_del_init(&object->cache_link); spin_unlock(&cache->object_list_lock); diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c index 33cea7f527db..f35f22f9a7f3 100644 --- a/fs/fscache/stats.c +++ b/fs/fscache/stats.c @@ -50,7 +50,6 @@ atomic_t fscache_n_cop_alloc_object; atomic_t fscache_n_cop_lookup_object; atomic_t fscache_n_cop_create_object; atomic_t fscache_n_cop_invalidate_object; -atomic_t fscache_n_cop_update_object; atomic_t fscache_n_cop_drop_object; atomic_t fscache_n_cop_put_object; atomic_t fscache_n_cop_sync_cache; @@ -150,9 +149,8 @@ int fscache_stats_show(struct seq_file *m, void *v) seq_printf(m, "CacheOp: alo=%d luo=%d\n", atomic_read(&fscache_n_cop_alloc_object), atomic_read(&fscache_n_cop_lookup_object)); - seq_printf(m, "CacheOp: inv=%d upo=%d dro=%d pto=%d atc=%d syn=%d\n", + seq_printf(m, "CacheOp: inv=%d dro=%d pto=%d atc=%d syn=%d\n", atomic_read(&fscache_n_cop_invalidate_object), - atomic_read(&fscache_n_cop_update_object), atomic_read(&fscache_n_cop_drop_object), atomic_read(&fscache_n_cop_put_object), atomic_read(&fscache_n_cop_attr_changed), diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index ba0ad89a968e..14ee82de2d79 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -116,8 +116,6 @@ struct 
From patchwork Mon Jul 13 16:36:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 11660607
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
 Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom.
 Registered in England and Wales under Company Registration No. 3798903
Subject: [PATCH 32/32] cachefiles: Shape write requests
From: David Howells
To: Trond Myklebust, Anna Schumaker, Steve French, Alexander Viro,
 Matthew Wilcox
Cc: Jeff Layton, Dave Wysochanski, dhowells@redhat.com,
 linux-cachefs@redhat.com, linux-afs@lists.infradead.org,
 linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org,
 ceph-devel@vger.kernel.org, v9fs-developer@lists.sourceforge.net,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 13 Jul 2020 17:36:49 +0100
Message-ID: <159465820921.1376674.16898427212445252830.stgit@warthog.procyon.org.uk>
In-Reply-To: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk>
References: <159465784033.1376674.18106463693989811037.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.22
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

In cachefiles_shape_request(), shape a write request so that it always
writes to the cache. The assumption is made that the caller has read the
entire cache granule beforehand if necessary.

Possibly this should be amended so that writes only take place to granules
that are marked present and to granules that lie beyond the EOF.

Signed-off-by: David Howells
---
 fs/cachefiles/content-map.c |   21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/cachefiles/content-map.c b/fs/cachefiles/content-map.c
index 3e310fd58497..592fc426500b 100644
--- a/fs/cachefiles/content-map.c
+++ b/fs/cachefiles/content-map.c
@@ -69,7 +69,8 @@ static void cachefiles_shape_single(struct fscache_object *obj,
 
 	shape->dio_block_size = CACHEFILES_DIO_BLOCK_SIZE;
 
-	if (object->content_info == CACHEFILES_CONTENT_SINGLE) {
+	if (!shape->for_write &&
+	    object->content_info == CACHEFILES_CONTENT_SINGLE) {
 		shape->to_be_done = FSCACHE_READ_FROM_CACHE;
 	} else {
 		eof = (shape->i_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
@@ -127,14 +128,20 @@ void cachefiles_shape_request(struct fscache_object *obj,
 	if (end - start > max_pages)
 		end = start + max_pages;
 
-	/* If the content map didn't get expanded for some reason - simply
-	 * ignore this granule.
-	 */
 	granule = start / CACHEFILES_GRAN_PAGES;
-	if (granule / 8 >= object->content_map_size)
-		return;
+	if (granule / 8 >= object->content_map_size) {
+		cachefiles_expand_content_map(object, shape->i_size);
+		if (granule / 8 >= object->content_map_size)
+			return;
+	}
 
-	if (cachefiles_granule_is_present(object, granule)) {
+	if (shape->for_write) {
+		/* Assume that the preparation to write involved preloading any
+		 * bits of the cache that weren't to be written and filling any
+		 * gaps that didn't end up being written.
+		 */
+		shape->to_be_done = FSCACHE_WRITE_TO_CACHE;
+	} else if (cachefiles_granule_is_present(object, granule)) {
 		/* The start of the requested extent is present in the cache -
 		 * restrict the returned extent to the maximum length of what's
 		 * available.
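As the hunk above shows, the content map is a plain bitmap with one bit
per granule and eight granules per byte, which is where the granule / 8
bounds check against content_map_size comes from. Below is a
self-contained sketch of the presence test; the helper name is
hypothetical and the value of CACHEFILES_GRAN_PAGES is an assumption for
illustration (the real constant is defined elsewhere in cachefiles).

	#include <stdbool.h>
	#include <stddef.h>
	#include <stdio.h>

	#define CACHEFILES_GRAN_PAGES 64   /* assumed pages per granule */

	/* Hypothetical stand-in for cachefiles_granule_is_present(): the
	 * map holds one bit per granule, so the granule's byte is
	 * granule / 8 and its bit is granule % 8.
	 */
	static bool granule_is_present(const unsigned char *map, size_t map_size,
				       unsigned long start_page)
	{
		unsigned long granule = start_page / CACHEFILES_GRAN_PAGES;

		if (granule / 8 >= map_size)	/* off the end of the map; the */
			return false;		/* caller may expand and retry */
		return map[granule / 8] & (1u << (granule % 8));
	}

	int main(void)
	{
		unsigned char map[2] = { 0x01, 0x00 };	/* only granule 0 present */

		printf("%d %d\n",
		       granule_is_present(map, sizeof(map), 0),   /* 1: granule 0 */
		       granule_is_present(map, sizeof(map), 64)); /* 0: granule 1 */
		return 0;
	}

This also shows why the shaping code expands the map before giving up:
a page index past the allocated bitmap says nothing about presence, so
for a write the map must first grow to cover the new i_size.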