Message ID | 9a458c9c553c6a8d5416c91650a9b152458459d0.1723670362.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | fanotify: add pre-content hooks | expand |
On Wed 14-08-24 17:25:29, Josef Bacik wrote: > With page faults we can trigger readahead on the file, and then > subsequent faults can find these pages and insert them into the file > without emitting an fanotify event. To avoid this case, disable > readahead if we have pre-content watches on the file. This way we are > guaranteed to get an event for every range we attempt to access on a > pre-content watched file. > > Reviewed-by: Christian Brauner <brauner@kernel.org> > Signed-off-by: Josef Bacik <josef@toxicpanda.com> ... > @@ -674,6 +675,14 @@ void page_cache_sync_ra(struct readahead_control *ractl, > { > bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM); > > + /* > + * If we have pre-content watches we need to disable readahead to make > + * sure that we don't find 0 filled pages in cache that we never emitted > + * events for. > + */ > + if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file)) > + return; > + There are callers which don't pass struct file to readahead (either to page_cache_sync_ra() or page_cache_async_ra()). Luckily these are very few - cramfs for a block device (we don't care) and btrfs from code paths like send-receive or defrag. Now if you tell me you're fine breaking these corner cases for btrfs, I'll take your word for it but it looks like a nasty trap to me. Now doing things like defrag or send-receive on offline files on HSM managed filesystem doesn't look like a terribly good idea anyway so perhaps we just want btrfs to check and refuse such things? Honza
On Thu, Aug 29, 2024 at 12:48:05PM +0200, Jan Kara wrote: > On Wed 14-08-24 17:25:29, Josef Bacik wrote: > > With page faults we can trigger readahead on the file, and then > > subsequent faults can find these pages and insert them into the file > > without emitting an fanotify event. To avoid this case, disable > > readahead if we have pre-content watches on the file. This way we are > > guaranteed to get an event for every range we attempt to access on a > > pre-content watched file. > > > > Reviewed-by: Christian Brauner <brauner@kernel.org> > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > > ... > > > @@ -674,6 +675,14 @@ void page_cache_sync_ra(struct readahead_control *ractl, > > { > > bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM); > > > > + /* > > + * If we have pre-content watches we need to disable readahead to make > > + * sure that we don't find 0 filled pages in cache that we never emitted > > + * events for. > > + */ > > + if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file)) > > + return; > > + > > There are callers which don't pass struct file to readahead (either to > page_cache_sync_ra() or page_cache_async_ra()). Luckily these are very few > - cramfs for a block device (we don't care) and btrfs from code paths like > send-receive or defrag. Now if you tell me you're fine breaking these > corner cases for btrfs, I'll take your word for it but it looks like a > nasty trap to me. Now doing things like defrag or send-receive on offline > files on HSM managed filesystem doesn't look like a terribly good idea > anyway so perhaps we just want btrfs to check and refuse such things? > We can't have HSM on a send subvolume because they have to be read only. I hadn't thought of defrag, I'll respin and add a patch to disallow defrag on a file that has content watches. Thanks, Josef
diff --git a/mm/filemap.c b/mm/filemap.c index ca8c8d889eef..8b1684b62177 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3122,6 +3122,14 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) unsigned long vm_flags = vmf->vma->vm_flags; unsigned int mmap_miss; + /* + * If we have pre-content watches we need to disable readahead to make + * sure that we don't populate our mapping with 0 filled pages that we + * never emitted an event for. + */ + if (fsnotify_file_has_pre_content_watches(file)) + return fpin; + #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* Use the readahead code, even if readahead is disabled */ if ((vm_flags & VM_HUGEPAGE) && HPAGE_PMD_ORDER <= MAX_PAGECACHE_ORDER) { @@ -3190,6 +3198,10 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf, struct file *fpin = NULL; unsigned int mmap_miss; + /* See comment in do_sync_mmap_readahead. */ + if (fsnotify_file_has_pre_content_watches(file)) + return fpin; + /* If we don't want any read-ahead, don't bother */ if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages) return fpin; diff --git a/mm/readahead.c b/mm/readahead.c index 817b2a352d78..bc068d9218e3 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -128,6 +128,7 @@ #include <linux/blk-cgroup.h> #include <linux/fadvise.h> #include <linux/sched/mm.h> +#include <linux/fsnotify.h> #include "internal.h" @@ -674,6 +675,14 @@ void page_cache_sync_ra(struct readahead_control *ractl, { bool do_forced_ra = ractl->file && (ractl->file->f_mode & FMODE_RANDOM); + /* + * If we have pre-content watches we need to disable readahead to make + * sure that we don't find 0 filled pages in cache that we never emitted + * events for. + */ + if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file)) + return; + /* * Even if readahead is disabled, issue this request as readahead * as we'll need it to satisfy the requested range. The forced @@ -704,6 +713,10 @@ void page_cache_async_ra(struct readahead_control *ractl, if (!ractl->ra->ra_pages) return; + /* See the comment in page_cache_sync_ra. */ + if (ractl->file && fsnotify_file_has_pre_content_watches(ractl->file)) + return; + /* * Same bit is used for PG_readahead and PG_reclaim. */