Message ID | 20201006230930.3908-1-rcampbell@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ext4/xfs: add page refcount helper | expand |
On Tue, Oct 6, 2020 at 4:09 PM Ralph Campbell <rcampbell@nvidia.com> wrote: > > There are several places where ZONE_DEVICE struct pages assume a reference > count == 1 means the page is idle and free. Instead of open coding this, > add a helper function to hide this detail. > > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > --- > > I'm resending this as a separate patch since I think it is ready to > merge. Originally, this was part of an RFC and is unchanged from v3: > https://lore.kernel.org/linux-mm/20201001181715.17416-1-rcampbell@nvidia.com > > It applies cleanly to linux-5.9.0-rc7-mm1 but doesn't really > depend on anything, just simple merge conflicts when applied to > other trees. > I'll let the various maintainers decide which tree and when to merge. > It isn't urgent since it is a clean up patch. Thanks Ralph, it looks good to me. Jan, or Ted care to ack? I don't have much else pending for dax at the moment as Andrew is carrying my dax updates for this cycle. Andrew please take this into -mm if you get a chance. Otherwise I'll cycle back to it when some other dax updates arrive in my queue.
On Tue 06-10-20 16:09:30, Ralph Campbell wrote: > There are several places where ZONE_DEVICE struct pages assume a reference > count == 1 means the page is idle and free. Instead of open coding this, > add a helper function to hide this detail. > > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> Looks as sane direction but if we are going to abstract checks when ZONE_DEVICE page is idle, we should also update e.g. mm/swap.c:put_devmap_managed_page() or mm/gup.c:__unpin_devmap_managed_user_page() (there may be more places like this but I found at least these two...). Maybe Dan has more thoughts about this. Honza > diff --git a/fs/dax.c b/fs/dax.c > index 5b47834f2e1b..85c63f735909 100644 > --- a/fs/dax.c > +++ b/fs/dax.c > @@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping, > for_each_mapped_pfn(entry, pfn) { > struct page *page = pfn_to_page(pfn); > > - WARN_ON_ONCE(trunc && page_ref_count(page) > 1); > + WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page)); > WARN_ON_ONCE(page->mapping && page->mapping != mapping); > page->mapping = NULL; > page->index = 0; > @@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry) > for_each_mapped_pfn(entry, pfn) { > struct page *page = pfn_to_page(pfn); > > - if (page_ref_count(page) > 1) > + if (!dax_layout_is_idle_page(page)) > return page; > } > return NULL; > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 771ed8b1fadb..132620cbfa13 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -3937,10 +3937,7 @@ int ext4_break_layouts(struct inode *inode) > if (!page) > return 0; > > - error = ___wait_var_event(&page->_refcount, > - atomic_read(&page->_refcount) == 1, > - TASK_INTERRUPTIBLE, 0, 0, > - ext4_wait_dax_page(ei)); > + error = dax_wait_page(ei, page, ext4_wait_dax_page); > } while (error == 0); > > return error; > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 3d1b95124744..a5304aaeaa3a 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -749,9 +749,7 @@ xfs_break_dax_layouts( > return 0; > > *retry = true; > - return ___wait_var_event(&page->_refcount, > - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, > - 0, 0, xfs_wait_dax_page(inode)); > + return dax_wait_page(inode, page, xfs_wait_dax_page); > } > > int > diff --git a/include/linux/dax.h b/include/linux/dax.h > index b52f084aa643..8909a91cd381 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space *mapping) > return mapping->host && IS_DAX(mapping->host); > } > > +static inline bool dax_layout_is_idle_page(struct page *page) > +{ > + return page_ref_count(page) == 1; > +} > + > +#define dax_wait_page(_inode, _page, _wait_cb) \ > + ___wait_var_event(&(_page)->_refcount, \ > + dax_layout_is_idle_page(_page), \ > + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode)) > + > #ifdef CONFIG_DEV_DAX_HMEM_DEVICES > void hmem_register_device(int target_nid, struct resource *r); > #else > -- > 2.20.1 >
On Tue, Oct 06, 2020 at 07:40:05PM -0700, Dan Williams wrote: > On Tue, Oct 6, 2020 at 4:09 PM Ralph Campbell <rcampbell@nvidia.com> wrote: > > > > There are several places where ZONE_DEVICE struct pages assume a reference > > count == 1 means the page is idle and free. Instead of open coding this, > > add a helper function to hide this detail. > > > > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > --- > > > > I'm resending this as a separate patch since I think it is ready to > > merge. Originally, this was part of an RFC and is unchanged from v3: > > https://lore.kernel.org/linux-mm/20201001181715.17416-1-rcampbell@nvidia.com > > > > It applies cleanly to linux-5.9.0-rc7-mm1 but doesn't really > > depend on anything, just simple merge conflicts when applied to > > other trees. > > I'll let the various maintainers decide which tree and when to merge. > > It isn't urgent since it is a clean up patch. > > Thanks Ralph, it looks good to me. Jan, or Ted care to ack? I don't > have much else pending for dax at the moment as Andrew is carrying my > dax updates for this cycle. Andrew please take this into -mm if you > get a chance. Otherwise I'll cycle back to it when some other dax > updates arrive in my queue. Acked-by: Theodore Ts'o <tytso@mit.edu> # for fs/ext4/inode.c
On Tue, Oct 06, 2020 at 04:09:30PM -0700, Ralph Campbell wrote: > There are several places where ZONE_DEVICE struct pages assume a reference > count == 1 means the page is idle and free. Instead of open coding this, > add a helper function to hide this detail. > > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > --- > > I'm resending this as a separate patch since I think it is ready to > merge. Originally, this was part of an RFC and is unchanged from v3: > https://lore.kernel.org/linux-mm/20201001181715.17416-1-rcampbell@nvidia.com > > It applies cleanly to linux-5.9.0-rc7-mm1 but doesn't really > depend on anything, just simple merge conflicts when applied to > other trees. > I'll let the various maintainers decide which tree and when to merge. > It isn't urgent since it is a clean up patch. > > fs/dax.c | 4 ++-- > fs/ext4/inode.c | 5 +---- > fs/xfs/xfs_file.c | 4 +--- > include/linux/dax.h | 10 ++++++++++ > 4 files changed, 14 insertions(+), 9 deletions(-) > > diff --git a/fs/dax.c b/fs/dax.c > index 5b47834f2e1b..85c63f735909 100644 > --- a/fs/dax.c > +++ b/fs/dax.c > @@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping, > for_each_mapped_pfn(entry, pfn) { > struct page *page = pfn_to_page(pfn); > > - WARN_ON_ONCE(trunc && page_ref_count(page) > 1); > + WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page)); > WARN_ON_ONCE(page->mapping && page->mapping != mapping); > page->mapping = NULL; > page->index = 0; > @@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry) > for_each_mapped_pfn(entry, pfn) { > struct page *page = pfn_to_page(pfn); > > - if (page_ref_count(page) > 1) > + if (!dax_layout_is_idle_page(page)) > return page; > } > return NULL; > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 771ed8b1fadb..132620cbfa13 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -3937,10 +3937,7 @@ int ext4_break_layouts(struct inode *inode) > if (!page) > return 0; > > - error = ___wait_var_event(&page->_refcount, > - atomic_read(&page->_refcount) == 1, > - TASK_INTERRUPTIBLE, 0, 0, > - ext4_wait_dax_page(ei)); > + error = dax_wait_page(ei, page, ext4_wait_dax_page); > } while (error == 0); > > return error; > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 3d1b95124744..a5304aaeaa3a 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -749,9 +749,7 @@ xfs_break_dax_layouts( > return 0; > > *retry = true; > - return ___wait_var_event(&page->_refcount, > - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, > - 0, 0, xfs_wait_dax_page(inode)); > + return dax_wait_page(inode, page, xfs_wait_dax_page); I don't mind this open-coded soup getting cleaned up into a macro, though my general opinion is that if the mm/dax developers are ok with this then: Acked-by: Darrick J. Wong <darrick.wong@oracle.com> --D > } > > int > diff --git a/include/linux/dax.h b/include/linux/dax.h > index b52f084aa643..8909a91cd381 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space *mapping) > return mapping->host && IS_DAX(mapping->host); > } > > +static inline bool dax_layout_is_idle_page(struct page *page) > +{ > + return page_ref_count(page) == 1; > +} > + > +#define dax_wait_page(_inode, _page, _wait_cb) \ > + ___wait_var_event(&(_page)->_refcount, \ > + dax_layout_is_idle_page(_page), \ > + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode)) > + > #ifdef CONFIG_DEV_DAX_HMEM_DEVICES > void hmem_register_device(int target_nid, struct resource *r); > #else > -- > 2.20.1 >
On 10/7/20 1:25 AM, Jan Kara wrote: > On Tue 06-10-20 16:09:30, Ralph Campbell wrote: >> There are several places where ZONE_DEVICE struct pages assume a reference >> count == 1 means the page is idle and free. Instead of open coding this, >> add a helper function to hide this detail. >> >> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> >> Reviewed-by: Christoph Hellwig <hch@lst.de> > > Looks as sane direction but if we are going to abstract checks when > ZONE_DEVICE page is idle, we should also update e.g. > mm/swap.c:put_devmap_managed_page() or > mm/gup.c:__unpin_devmap_managed_user_page() (there may be more places like > this but I found at least these two...). Maybe Dan has more thoughts about > this. > > Honza I think this is a good point but I would like to make that a follow on patch rather than add to this one.
On Wed, Oct 7, 2020 at 1:25 AM Jan Kara <jack@suse.cz> wrote: > > On Tue 06-10-20 16:09:30, Ralph Campbell wrote: > > There are several places where ZONE_DEVICE struct pages assume a reference > > count == 1 means the page is idle and free. Instead of open coding this, > > add a helper function to hide this detail. > > > > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > Looks as sane direction but if we are going to abstract checks when > ZONE_DEVICE page is idle, we should also update e.g. > mm/swap.c:put_devmap_managed_page() or > mm/gup.c:__unpin_devmap_managed_user_page() (there may be more places like > this but I found at least these two...). Maybe Dan has more thoughts about > this. Yes, but I think that cleanup comes once the idle page count is unified to be 0 across typical and ZONE_DEVICE pages. Then free_devmap_managed_page() can be moved internal to __put_page(). For this patch it's just hiding the "idle == 1" assumption from dax-filesystems.
diff --git a/fs/dax.c b/fs/dax.c index 5b47834f2e1b..85c63f735909 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -358,7 +358,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping, for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); - WARN_ON_ONCE(trunc && page_ref_count(page) > 1); + WARN_ON_ONCE(trunc && !dax_layout_is_idle_page(page)); WARN_ON_ONCE(page->mapping && page->mapping != mapping); page->mapping = NULL; page->index = 0; @@ -372,7 +372,7 @@ static struct page *dax_busy_page(void *entry) for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); - if (page_ref_count(page) > 1) + if (!dax_layout_is_idle_page(page)) return page; } return NULL; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 771ed8b1fadb..132620cbfa13 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3937,10 +3937,7 @@ int ext4_break_layouts(struct inode *inode) if (!page) return 0; - error = ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, - TASK_INTERRUPTIBLE, 0, 0, - ext4_wait_dax_page(ei)); + error = dax_wait_page(ei, page, ext4_wait_dax_page); } while (error == 0); return error; diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 3d1b95124744..a5304aaeaa3a 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -749,9 +749,7 @@ xfs_break_dax_layouts( return 0; *retry = true; - return ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, - 0, 0, xfs_wait_dax_page(inode)); + return dax_wait_page(inode, page, xfs_wait_dax_page); } int diff --git a/include/linux/dax.h b/include/linux/dax.h index b52f084aa643..8909a91cd381 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -243,6 +243,16 @@ static inline bool dax_mapping(struct address_space *mapping) return mapping->host && IS_DAX(mapping->host); } +static inline bool dax_layout_is_idle_page(struct page *page) +{ + return page_ref_count(page) == 1; +} + +#define dax_wait_page(_inode, _page, _wait_cb) \ + ___wait_var_event(&(_page)->_refcount, \ + dax_layout_is_idle_page(_page), \ + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode)) + #ifdef CONFIG_DEV_DAX_HMEM_DEVICES void hmem_register_device(int target_nid, struct resource *r); #else