Message ID | CA+55aFwC3nfmum6oPoBRU325CKjK-g3MuqW+bhq3+9DnE_4fgw@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
On 5 December 2016 at 21:35, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Note for Ingo and Peter: this patch has not been tested at all. But
> Vegard did test an earlier patch of mine that just verified that yes,
> the issue really was that wait queue entries remained on the wait
> queue head just as we were about to return and free it.

The second patch has been running for 1h+ without any problems of any
kind. I should typically have seen 2 crashes by now. I'll let it run
overnight to be sure.

Vegard
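[Editor's note: for readers following the thread, the lingering entries Linus describes come from the stock wake function. autoremove_wake_function() only dequeues a waiter when default_wake_function() actually woke it; a waiter that was already awake keeps its entry queued until its own finish_wait() runs, which can happen after the on-stack wait queue head has been freed. Shown for context, from kernel/sched/wait.c as it read in this era (v4.9):

int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
{
        int ret = default_wake_function(wait, mode, sync, key);

        /*
         * Only dequeue on a successful wakeup: if the waiter was
         * already running, its entry stays on the queue for its own
         * finish_wait() to remove - too late when the queue head
         * lives on another task's stack.
         */
        if (ret)
                list_del_init(&wait->task_list);
        return ret;
}

The patch below replaces this with a variant that performs the list_del_init() unconditionally, under the waitqueue lock, so no entry can outlive the wake_up_all().]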
On 5 December 2016 at 22:33, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On 5 December 2016 at 21:35, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> Note for Ingo and Peter: this patch has not been tested at all. But
>> Vegard did test an earlier patch of mine that just verified that yes,
>> the issue really was that wait queue entries remained on the wait
>> queue head just as we were about to return and free it.
>
> The second patch has been running for 1h+ without any problems of any
> kind. I should typically have seen 2 crashes by now. I'll let it run
> overnight to be sure.

Alright, so nearly 12 hours later I see neither the new warning nor the
original crash, so feel free to add:

Tested-by: Vegard Nossum <vegard.nossum@oracle.com>

That said, my 8 VMs had all panicked in some way due to OOMs (which is
new since v4.8): some got page allocation stalls for >20s and died
because "khugepaged blocked for more than 120 seconds", while others got
"Out of memory and no killable processes".

Vegard
diff --git a/mm/shmem.c b/mm/shmem.c
index 166ebf5d2bce..17beb44e9f4f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1848,6 +1848,19 @@ alloc_nohuge:		page = shmem_alloc_and_acct_page(gfp, info, sbinfo,
 	return error;
 }
 
+/*
+ * This is like autoremove_wake_function, but it removes the wait queue
+ * entry unconditionally - even if something else had already woken the
+ * target.
+ */
+static int synchronous_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
+{
+	int ret = default_wake_function(wait, mode, sync, key);
+	list_del_init(&wait->task_list);
+	return ret;
+}
+
+
 static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
 	struct inode *inode = file_inode(vma->vm_file);
@@ -1883,7 +1896,7 @@ static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 			    vmf->pgoff >= shmem_falloc->start &&
 			    vmf->pgoff < shmem_falloc->next) {
 				wait_queue_head_t *shmem_falloc_waitq;
-				DEFINE_WAIT(shmem_fault_wait);
+				DEFINE_WAIT_FUNC(shmem_fault_wait, synchronous_wake_function);
 
 				ret = VM_FAULT_NOPAGE;
 				if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
@@ -2665,6 +2678,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		spin_lock(&inode->i_lock);
 		inode->i_private = NULL;
 		wake_up_all(&shmem_falloc_waitq);
+		WARN_ON_ONCE(!list_empty(&shmem_falloc_waitq.task_list));
 		spin_unlock(&inode->i_lock);
 		error = 0;
 		goto out;
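[Editor's note: why the unconditional removal suffices. The waiter in shmem_fault() eventually calls finish_wait() on the hole-puncher's on-stack queue head, and finish_wait() only takes the queue lock - and hence only dereferences the possibly-freed queue head q - when its own entry is still linked. Abbreviated from kernel/sched/wait.c (v4.9-era, long comment elided):

void finish_wait(wait_queue_head_t *q, wait_queue_t *wait)
{
        unsigned long flags;

        __set_current_state(TASK_RUNNING);
        /*
         * The lockless check only touches the waiter's own on-stack
         * entry, never the queue head q itself.
         */
        if (!list_empty_careful(&wait->task_list)) {
                spin_lock_irqsave(&q->lock, flags);
                list_del_init(&wait->task_list);
                spin_unlock_irqrestore(&q->lock, flags);
        }
}

With synchronous_wake_function(), wake_up_all() unlinks every entry under the waitqueue lock before it returns, so the WARN_ON_ONCE() added by the patch should never fire and shmem_fallocate() can safely let its on-stack shmem_falloc_waitq go out of scope; any already-woken waiter then takes the list_empty_careful() fast path above and never touches the dead queue head.]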