Message ID | 20200526195123.29053-5-axboe@kernel.dk (mailing list archive)
---|---
State | New, archived
Series | Add support for async buffered reads
On Tue, May 26, 2020 at 01:51:15PM -0600, Jens Axboe wrote:
> Normally waiting for a page to become unlocked, or locking the page,
> requires waiting for IO to complete. Add support for lock_page_async()
> and wait_on_page_locked_async(), which are callback based instead. This

wait_on_page_locked_async() is actually in the next patch, requiring
some back and forth to review. I wonder if this and the next patch
could be merged to have the new API and callers introduced together?

> allows a caller to get notified when a page becomes unlocked, rather
> than wait for it.
>
> We add a new iocb field, ki_waitq, to pass in the necessary data for this
> to happen. We can unionize this with ki_cookie, since that is only used
> for polled IO. Polled IO can never co-exist with async callbacks, as it is
> (by definition) polled completions. struct wait_page_key is made public,
> and we define struct wait_page_async as the interface between the caller
> and the core.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On 5/26/20 3:59 PM, Johannes Weiner wrote:
> On Tue, May 26, 2020 at 01:51:15PM -0600, Jens Axboe wrote:
>> Normally waiting for a page to become unlocked, or locking the page,
>> requires waiting for IO to complete. Add support for lock_page_async()
>> and wait_on_page_locked_async(), which are callback based instead. This
>
> wait_on_page_locked_async() is actually in the next patch, requiring
> some back and forth to review. I wonder if this and the next patch
> could be merged to have the new API and callers introduced together?

I'm fine with that, if that is preferable. Don't feel strongly about
that at all, just tried to do it as piecemeal as possible to make
it easier to review.
On Tue, May 26, 2020 at 04:01:07PM -0600, Jens Axboe wrote:
> On 5/26/20 3:59 PM, Johannes Weiner wrote:
> > On Tue, May 26, 2020 at 01:51:15PM -0600, Jens Axboe wrote:
> >> Normally waiting for a page to become unlocked, or locking the page,
> >> requires waiting for IO to complete. Add support for lock_page_async()
> >> and wait_on_page_locked_async(), which are callback based instead. This
> >
> > wait_on_page_locked_async() is actually in the next patch, requiring
> > some back and forth to review. I wonder if this and the next patch
> > could be merged to have the new API and callers introduced together?
>
> I'm fine with that, if that is preferable. Don't feel strongly about
> that at all, just tried to do it as piecemeal as possible to make
> it easier to review.

Not worth sending a new iteration over, IMO.
On Tue, May 26, 2020 at 01:51:15PM -0600, Jens Axboe wrote:
> +static int __wait_on_page_locked_async(struct page *page,
> +				       struct wait_page_queue *wait, bool set)
> +{
> +	struct wait_queue_head *q = page_waitqueue(page);
> +	int ret = 0;
> +
> +	wait->page = page;
> +	wait->bit_nr = PG_locked;
> +
> +	spin_lock_irq(&q->lock);
> +	if (set)
> +		ret = !trylock_page(page);
> +	else
> +		ret = PageLocked(page);
> +	if (ret) {
> +		__add_wait_queue_entry_tail(q, &wait->wait);
> +		SetPageWaiters(page);
> +		if (set)
> +			ret = !trylock_page(page);
> +		else
> +			ret = PageLocked(page);

Between the callers and this function, we actually look at PG_lock three
times; once in the caller, then after taking the spinlock, then after
adding ourselves to the waitqueue. I understand the first and third, but
is it really worth doing the second test? It feels unlikely to succeed
and only saves us setting PageWaiters.
On 6/1/20 8:26 AM, Matthew Wilcox wrote:
> On Tue, May 26, 2020 at 01:51:15PM -0600, Jens Axboe wrote:
>> +static int __wait_on_page_locked_async(struct page *page,
>> +				       struct wait_page_queue *wait, bool set)
>> +{
>> +	struct wait_queue_head *q = page_waitqueue(page);
>> +	int ret = 0;
>> +
>> +	wait->page = page;
>> +	wait->bit_nr = PG_locked;
>> +
>> +	spin_lock_irq(&q->lock);
>> +	if (set)
>> +		ret = !trylock_page(page);
>> +	else
>> +		ret = PageLocked(page);
>> +	if (ret) {
>> +		__add_wait_queue_entry_tail(q, &wait->wait);
>> +		SetPageWaiters(page);
>> +		if (set)
>> +			ret = !trylock_page(page);
>> +		else
>> +			ret = PageLocked(page);
>
> Between the callers and this function, we actually look at PG_lock three
> times; once in the caller, then after taking the spinlock, then after
> adding ourselves to the waitqueue. I understand the first and third, but
> is it really worth doing the second test? It feels unlikely to succeed
> and only saves us setting PageWaiters.

That's probably true, and we can skip the 2nd one. I'll make the change.
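[Editor's sketch, not part of the thread: a minimal illustration of what the
helper could look like with the check taken right after the spinlock dropped,
as agreed above. This is not necessarily the exact version that was merged.]

	static int __wait_on_page_locked_async(struct page *page,
					       struct wait_page_queue *wait, bool set)
	{
		struct wait_queue_head *q = page_waitqueue(page);
		int ret;

		wait->page = page;
		wait->bit_nr = PG_locked;

		spin_lock_irq(&q->lock);
		/* Queue unconditionally; the only check under the lock is below. */
		__add_wait_queue_entry_tail(q, &wait->wait);
		SetPageWaiters(page);
		if (set)
			ret = !trylock_page(page);
		else
			ret = PageLocked(page);
		/*
		 * If the page is available now, we are still on the waitqueue and
		 * still hold q->lock, so it is safe to dequeue and report success;
		 * the wake callback cannot have run yet.
		 */
		if (!ret)
			__remove_wait_queue(q, &wait->wait);
		else
			ret = -EIOCBQUEUED;
		spin_unlock_irq(&q->lock);
		return ret;
	}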
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d3ebb49189df..ba1fff0e7bca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -314,6 +314,8 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+/* iocb->ki_waitq is valid */
+#define IOCB_WAITQ		(1 << 8)
 
 struct kiocb {
 	struct file		*ki_filp;
@@ -327,7 +329,10 @@ struct kiocb {
 	int			ki_flags;
 	u16			ki_hint;
 	u16			ki_ioprio; /* See linux/ioprio.h */
-	unsigned int		ki_cookie; /* for ->iopoll */
+	union {
+		unsigned int		ki_cookie; /* for ->iopoll */
+		struct wait_page_queue	*ki_waitq; /* for async buffered IO */
+	};
 
 	randomized_struct_fields_end
 };
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 53d980f2208d..d3e63c9c61ae 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -495,6 +495,7 @@ static inline int wake_page_match(struct wait_page_queue *wait_page,
 
 extern void __lock_page(struct page *page);
 extern int __lock_page_killable(struct page *page);
+extern int __lock_page_async(struct page *page, struct wait_page_queue *wait);
 extern int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 				unsigned int flags);
 extern void unlock_page(struct page *page);
@@ -531,6 +532,14 @@ static inline int lock_page_killable(struct page *page)
 	return 0;
 }
 
+static inline int lock_page_async(struct page *page,
+				  struct wait_page_queue *wait)
+{
+	if (!trylock_page(page))
+		return __lock_page_async(page, wait);
+	return 0;
+}
+
 /*
  * lock_page_or_retry - Lock the page, unless this would block and the
  * caller indicated that it can handle a retry.
diff --git a/mm/filemap.c b/mm/filemap.c
index e891b5bee8fd..c746541b1d49 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1183,6 +1183,42 @@ int wait_on_page_bit_killable(struct page *page, int bit_nr)
 }
 EXPORT_SYMBOL(wait_on_page_bit_killable);
 
+static int __wait_on_page_locked_async(struct page *page,
+				       struct wait_page_queue *wait, bool set)
+{
+	struct wait_queue_head *q = page_waitqueue(page);
+	int ret = 0;
+
+	wait->page = page;
+	wait->bit_nr = PG_locked;
+
+	spin_lock_irq(&q->lock);
+	if (set)
+		ret = !trylock_page(page);
+	else
+		ret = PageLocked(page);
+	if (ret) {
+		__add_wait_queue_entry_tail(q, &wait->wait);
+		SetPageWaiters(page);
+		if (set)
+			ret = !trylock_page(page);
+		else
+			ret = PageLocked(page);
+		/*
+		 * If we were succesful now, we know we're still on the
+		 * waitqueue as we're still under the lock. This means it's
+		 * safe to remove and return success, we know the callback
+		 * isn't going to trigger.
+		 */
+		if (!ret)
+			__remove_wait_queue(q, &wait->wait);
+		else
+			ret = -EIOCBQUEUED;
+	}
+	spin_unlock_irq(&q->lock);
+	return ret;
+}
+
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
  * @page: The page to wait for.
@@ -1345,6 +1381,11 @@ int __lock_page_killable(struct page *__page)
 }
 EXPORT_SYMBOL_GPL(__lock_page_killable);
 
+int __lock_page_async(struct page *page, struct wait_page_queue *wait)
+{
+	return __wait_on_page_locked_async(page, wait, true);
+}
+
 /*
  * Return values:
  * 1 - page is locked; mmap_sem is still held.
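[Editor's sketch, not part of the patch: the callers that consume IOCB_WAITQ
and lock_page_async() are only introduced in the follow-up patches, so the
code below is an illustration of the intended usage pattern. The function
names read_page_async() and my_async_buf_wake() are made up for this example,
and it assumes the wake_page_match() helper from the earlier patch in the
series acts as a boolean page/bit match.]

	#include <linux/fs.h>
	#include <linux/pagemap.h>
	#include <linux/wait.h>

	/* Wake callback installed on wpq->wait; runs when the page unlocks. */
	static int my_async_buf_wake(struct wait_queue_entry *wq_entry,
				     unsigned mode, int sync, void *arg)
	{
		struct wait_page_queue *wpq = container_of(wq_entry,
							   struct wait_page_queue, wait);
		struct wait_page_key *key = arg;

		if (!wake_page_match(wpq, key))
			return 0;

		/* Page is now unlocked: dequeue and schedule the retry of the read. */
		list_del_init(&wq_entry->entry);
		return 1;
	}

	/* Try to lock the page without sleeping; arm the callback otherwise. */
	static int read_page_async(struct kiocb *iocb, struct page *page,
				   struct wait_page_queue *wpq)
	{
		int ret;

		if (!(iocb->ki_flags & IOCB_WAITQ))
			return -EAGAIN;		/* caller did not opt in */

		init_waitqueue_func_entry(&wpq->wait, my_async_buf_wake);
		iocb->ki_waitq = wpq;

		ret = lock_page_async(page, wpq);
		if (ret == -EIOCBQUEUED)
			return ret;	/* queued; my_async_buf_wake() fires later */

		/* Got the lock without blocking; continue synchronously. */
		unlock_page(page);
		return 0;
	}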
Normally waiting for a page to become unlocked, or locking the page,
requires waiting for IO to complete. Add support for lock_page_async()
and wait_on_page_locked_async(), which are callback based instead. This
allows a caller to get notified when a page becomes unlocked, rather
than wait for it.

We add a new iocb field, ki_waitq, to pass in the necessary data for this
to happen. We can unionize this with ki_cookie, since that is only used
for polled IO. Polled IO can never co-exist with async callbacks, as it is
(by definition) polled completions. struct wait_page_key is made public,
and we define struct wait_page_async as the interface between the caller
and the core.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/fs.h      |  7 ++++++-
 include/linux/pagemap.h |  9 +++++++++
 mm/filemap.c            | 41 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+), 1 deletion(-)
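[Editor's note: the ki_cookie/ki_waitq union above is safe only because polled
IO (IOCB_HIPRI) and waitqueue-based async callbacks (IOCB_WAITQ) never coexist
on one kiocb. The helper below is made up purely to spell out that invariant;
it is not part of the patch.]

	/* Illustrative only: document the invariant that makes the union safe. */
	static inline void kiocb_check_wait_vs_poll(const struct kiocb *iocb)
	{
		WARN_ON_ONCE((iocb->ki_flags & (IOCB_HIPRI | IOCB_WAITQ)) ==
			     (IOCB_HIPRI | IOCB_WAITQ));
	}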