diff mbox series

[RFC] Documentation for folio_lock() and friends

Message ID YkspW4HDL54xEg69@casper.infradead.org (mailing list archive)
State New
Headers show
Series [RFC] Documentation for folio_lock() and friends | expand

Commit Message

Matthew Wilcox April 4, 2022, 5:22 p.m. UTC
It's a shame to not have these functions documented.  I'm sure I've
missed a few things that would be useful to document here.

Comments

NeilBrown April 5, 2022, 5:48 a.m. UTC | #1
On Tue, 05 Apr 2022, Matthew Wilcox wrote:
> It's a shame to not have these functions documented.  I'm sure I've
> missed a few things that would be useful to document here.
> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index ab47579af434..47b7851f1b64 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -888,6 +888,18 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
>  void unlock_page(struct page *page);
>  void folio_unlock(struct folio *folio);
>  
> +/**
> + * folio_trylock() - Attempt to lock a folio.
> + * @folio: The folio to attempt to lock.
> + *
> + * Sometimes it is undesirable to wait for a folio to be unlocked (eg
> + * when the locks are being taken in the wrong order, or if making
> + * progress through a batch of folios is more important than processing
> + * them in order).  Usually folio_lock() is the correct function to call.

Usually?
I think a "see also" type reference to folio_lock() is useful, but I
don't think "usually" is helpful.

> + *
> + * Context: Any context.
> + * Return: Whether the lock was successfully acquired.
> + */
>  static inline bool folio_trylock(struct folio *folio)
>  {
>  	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
> @@ -901,6 +913,26 @@ static inline int trylock_page(struct page *page)
>  	return folio_trylock(page_folio(page));
>  }
>  
> +/**
> + * folio_lock() - Lock this folio.
> + * @folio: The folio to lock.
> + *
> + * The folio lock protects against many things, probably more than it
> + * should.  It is primarily held while a folio is being read from storage,
> + * either from its backing file or from swap.  It is also held while a
> + * folio is being truncated from its address_space.
> + *
> + * Holding the lock usually prevents the contents of the folio from being
> + * modified by other kernel users, although it does not prevent userspace
> + * from modifying data if it's mapped.  The lock is not consistently held
> + * while a folio is being modified by DMA.

I don't think this paragraph is helpful...  maybe if it listed which
change *are* prevented by the lock, rather than a few which aren't?

I think it is significant that the lock prevents removal from the page
cache, and so ->mapping is only stable while the lock is held.  It might
be worth adding something about that.

> + *
> + * Context: May sleep.  If you need to acquire the locks of two or
> + * more folios, they must be in order of ascending index, if they are
> + * in the same address_space.  If they are in different address_spaces,
> + * acquire the lock of the folio which belongs to the address_space which
> + * has the lowest address in memory first.
> + */
>  static inline void folio_lock(struct folio *folio)
>  {
>  	might_sleep();
> @@ -908,6 +940,17 @@ static inline void folio_lock(struct folio *folio)
>  		__folio_lock(folio);
>  }
>  
> +/**
> + * lock_page() - Lock the folio containing this page.
> + * @page: The page to lock.
> + *
> + * See folio_lock() for a description of what the lock protects.
> + * This is a legacy function and new code should probably use folio_lock()
> + * instead.
> + *
> + * Context: May sleep.  Pages in the same folio share a lock, so do not
> + * attempt to lock two pages which share a folio.
> + */
>  static inline void lock_page(struct page *page)
>  {
>  	struct folio *folio;
> @@ -918,6 +961,16 @@ static inline void lock_page(struct page *page)
>  		__folio_lock(folio);
>  }
>  
> +/**
> + * folio_lock_killable() - Lock this folio, interruptible by a fatal signal.
> + * @folio: The folio to lock.
> + *
> + * Attempts to lock the folio, like folio_lock(), except that the sleep
> + * to acquire the lock is interruptible by a fatal signal.
> + *
> + * Context: May sleep; see folio_lock().
> + * Return: 0 if the lock was acquired; -EINTR if a fatal signal was received.
> + */
>  static inline int folio_lock_killable(struct folio *folio)
>  {
>  	might_sleep();
> @@ -964,8 +1017,8 @@ int folio_wait_bit_killable(struct folio *folio, int bit_nr);
>   * Wait for a folio to be unlocked.
>   *
>   * This must be called with the caller "holding" the folio,
> - * ie with increased "page->count" so that the folio won't
> - * go away during the wait..
> + * ie with increased folio reference count so that the folio won't
> + * go away during the wait.
>   */
>  static inline void folio_wait_locked(struct folio *folio)
>  {
> 
> 

Thanks,
NeilBrown
Matthew Wilcox April 28, 2022, 5:28 p.m. UTC | #2
On Tue, Apr 05, 2022 at 03:48:19PM +1000, NeilBrown wrote:
> On Tue, 05 Apr 2022, Matthew Wilcox wrote:
> > It's a shame to not have these functions documented.  I'm sure I've
> > missed a few things that would be useful to document here.
> > 
> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index ab47579af434..47b7851f1b64 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -888,6 +888,18 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> >  void unlock_page(struct page *page);
> >  void folio_unlock(struct folio *folio);
> >  
> > +/**
> > + * folio_trylock() - Attempt to lock a folio.
> > + * @folio: The folio to attempt to lock.
> > + *
> > + * Sometimes it is undesirable to wait for a folio to be unlocked (eg
> > + * when the locks are being taken in the wrong order, or if making
> > + * progress through a batch of folios is more important than processing
> > + * them in order).  Usually folio_lock() is the correct function to call.
> 
> Usually?
> I think a "see also" type reference to folio_lock() is useful, but I
> don't think "usually" is helpful.

That was supposed to stand in contrast to the "Sometimes".  How about
this:

 * folio_lock() is a sleeping function.  If sleeping is not the right
 * behaviour (eg when the locks are being taken in the wrong order,
 * or if making progress through a batch of folios is more important
 * than processing them in order) then you can use folio_trylock().
 * It is never appropriate to implement a spinlock using folio_trylock().

... if not, could you suggest some better wording?

> > +/**
> > + * folio_lock() - Lock this folio.
> > + * @folio: The folio to lock.
> > + *
> > + * The folio lock protects against many things, probably more than it
> > + * should.  It is primarily held while a folio is being read from storage,
> > + * either from its backing file or from swap.  It is also held while a
> > + * folio is being truncated from its address_space.
> > + *
> > + * Holding the lock usually prevents the contents of the folio from being
> > + * modified by other kernel users, although it does not prevent userspace
> > + * from modifying data if it's mapped.  The lock is not consistently held
> > + * while a folio is being modified by DMA.
> 
> I don't think this paragraph is helpful...  maybe if it listed which
> change *are* prevented by the lock, rather than a few which aren't?

I put that in because it's a common misconception ("holding the page
locked will prevent it from being modified").

> I think it is significant that the lock prevents removal from the page
> cache, and so ->mapping is only stable while the lock is held.  It might
> be worth adding something about that.

That's implied by the last sentence of the first paragraph, but I can
include that.  Actually, ->mapping is also stable if the page is mapped
and you hold the page table lock of a page table it's mapped in.  That's
not theoretical BTW, it's the conditions under which ->dirty_folio() is
called -- either the folio lock is held, OR the folio is mapped and the
PTL is held.  That prevents truncation because truncation unmaps pages
before clearing ->mapping, and it needs to take the PTL to unmap the page.

Hard to express that in a lockdep expression because the PTL isn't
passed to folio_mark_dirty().  That explanation probably doesn't
belong here, so how about ...

 * folio_lock() - Lock this folio.
 * @folio: The folio to lock.
 *
 * The folio lock protects against many things, probably more than it
 * should.  It is primarily held while a folio is being brought uptodate,
 * either from its backing file or from swap.  It is also held while a
 * folio is being truncated from its address_space, so holding the lock
 * is sufficient to keep folio->mapping stable.
 *
 * The folio lock is also held while write() is modifying the page to
 * provide POSIX atomicity guarantees (as long as the write does not
 * cross a page boundary).  Other modifications to the data in the folio
 * do not hold the folio lock and can race with writes, eg DMA and stores
 * to mapped pages.
 *
 * Context: May sleep.  If you need to acquire the locks of two or
 * more folios, they must be in order of ascending index, if they are
 * in the same address_space.  If they are in different address_spaces,
 * acquire the lock of the folio which belongs to the address_space which
 * has the lowest address in memory first.


Looking at the comment on folio_mark_dirty(), it's a bit unhelpful.
How about this:

  * folio_mark_dirty - Mark a folio as being modified.
  * @folio: The folio.
  *
- * For folios with a mapping this should be done with the folio lock held
- * for the benefit of asynchronous memory errors who prefer a consistent
- * dirty state. This rule can be broken in some special cases,
- * but should be better not to.
+ * The folio may not be truncated while this function is running.
+ * Holding the folio lock is sufficient to prevent truncation, but some
+ * callers cannot acquire a sleeping lock.  These callers instead hold
+ * the page table lock for a page table which contains at least one page
+ * in this folio.  Truncation will block on the page table lock as it
+ * unmaps pages before removing the folio from its mapping.
  *
  * Return: True if the folio was newly dirtied, false if it was already dirty.
diff mbox series

Patch

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ab47579af434..47b7851f1b64 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -888,6 +888,18 @@  bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
 void unlock_page(struct page *page);
 void folio_unlock(struct folio *folio);
 
+/**
+ * folio_trylock() - Attempt to lock a folio.
+ * @folio: The folio to attempt to lock.
+ *
+ * Sometimes it is undesirable to wait for a folio to be unlocked (eg
+ * when the locks are being taken in the wrong order, or if making
+ * progress through a batch of folios is more important than processing
+ * them in order).  Usually folio_lock() is the correct function to call.
+ *
+ * Context: Any context.
+ * Return: Whether the lock was successfully acquired.
+ */
 static inline bool folio_trylock(struct folio *folio)
 {
 	return likely(!test_and_set_bit_lock(PG_locked, folio_flags(folio, 0)));
@@ -901,6 +913,26 @@  static inline int trylock_page(struct page *page)
 	return folio_trylock(page_folio(page));
 }
 
+/**
+ * folio_lock() - Lock this folio.
+ * @folio: The folio to lock.
+ *
+ * The folio lock protects against many things, probably more than it
+ * should.  It is primarily held while a folio is being read from storage,
+ * either from its backing file or from swap.  It is also held while a
+ * folio is being truncated from its address_space.
+ *
+ * Holding the lock usually prevents the contents of the folio from being
+ * modified by other kernel users, although it does not prevent userspace
+ * from modifying data if it's mapped.  The lock is not consistently held
+ * while a folio is being modified by DMA.
+ *
+ * Context: May sleep.  If you need to acquire the locks of two or
+ * more folios, they must be in order of ascending index, if they are
+ * in the same address_space.  If they are in different address_spaces,
+ * acquire the lock of the folio which belongs to the address_space which
+ * has the lowest address in memory first.
+ */
 static inline void folio_lock(struct folio *folio)
 {
 	might_sleep();
@@ -908,6 +940,17 @@  static inline void folio_lock(struct folio *folio)
 		__folio_lock(folio);
 }
 
+/**
+ * lock_page() - Lock the folio containing this page.
+ * @page: The page to lock.
+ *
+ * See folio_lock() for a description of what the lock protects.
+ * This is a legacy function and new code should probably use folio_lock()
+ * instead.
+ *
+ * Context: May sleep.  Pages in the same folio share a lock, so do not
+ * attempt to lock two pages which share a folio.
+ */
 static inline void lock_page(struct page *page)
 {
 	struct folio *folio;
@@ -918,6 +961,16 @@  static inline void lock_page(struct page *page)
 		__folio_lock(folio);
 }
 
+/**
+ * folio_lock_killable() - Lock this folio, interruptible by a fatal signal.
+ * @folio: The folio to lock.
+ *
+ * Attempts to lock the folio, like folio_lock(), except that the sleep
+ * to acquire the lock is interruptible by a fatal signal.
+ *
+ * Context: May sleep; see folio_lock().
+ * Return: 0 if the lock was acquired; -EINTR if a fatal signal was received.
+ */
 static inline int folio_lock_killable(struct folio *folio)
 {
 	might_sleep();
@@ -964,8 +1017,8 @@  int folio_wait_bit_killable(struct folio *folio, int bit_nr);
  * Wait for a folio to be unlocked.
  *
  * This must be called with the caller "holding" the folio,
- * ie with increased "page->count" so that the folio won't
- * go away during the wait..
+ * ie with increased folio reference count so that the folio won't
+ * go away during the wait.
  */
 static inline void folio_wait_locked(struct folio *folio)
 {