Message ID | 20241118194048.2355180-1-Liam.Howlett@oracle.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [6.12.y] mm/mmap: fix __mmap_region() error handling in rare merge failure case | expand |
Okay, before I get yelled at... This commit is only necessary for 6.12.y until Lorenzo's other fixes to older stables land (and I'll have to figure out what to do in each). The commit will not work on mm-unstable, because it doesn't exist due to refactoring. The commit does not have a tag about "upstream commit" because there isn't one - the closest thing I could point to does not have a stable git id. So here I am with a fix for a kernel that was released a few hours ago that is not necessary in v6.13, for a bug that's out there on syzkaller. Also, it's very unlikely to happen unless you inject failures like syzkaller. But hey, pretty decent turn-around on finding a fix - so that's a rosy outlook. * Liam R. Howlett <Liam.Howlett@oracle.com> [241118 14:41]: > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > The mmap_region() function tries to install a new vma, which requires a > pre-allocation for the maple tree write due to the complex locking > scenarios involved. > > Recent efforts to simplify the error recovery required the relocation of > the preallocation of the maple tree nodes (via vma_iter_prealloc() > calling mas_preallocate()) higher in the function. > > The relocation of the preallocation meant that, if there was a file > associated with the vma and the driver call (mmap_file()) modified the > vma flags, then a new merge of the new vma with existing vmas is > attempted. > > During the attempt to merge the existing vma with the new vma, the vma > iterator is used - the same iterator that would be used for the next > write attempt to the tree. In the event of needing a further allocation > and if the new allocations fails, the vma iterator (and contained maple > state) will cleaned up, including freeing all previous allocations and > will be reset internally. > > Upon returning to the __mmap_region() function, the error reason is lost > and the function sets the vma iterator limits, and then tries to > continue to store the new vma using vma_iter_store() - which expects > preallocated nodes. > > A preallocation should be performed in case the allocations were lost > during the failure scenario - there is no risk of over allocating. The > range is already set in the vma_iter_store() call below, so it is not > necessary. > > Reported-by: syzbot+bc6bfc25a68b7a020ee1@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000 > Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour") > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Jann Horn <jannh@google.com> > Cc: <stable@vger.kernel.org> > --- > mm/mmap.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 79d541f1502b2..5cef9a1981f1b 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > vm_flags = vma->vm_flags; > goto file_expanded; > } > - vma_iter_config(&vmi, addr, end); > + if (vma_iter_prealloc(&vmi, vma)) { > + error = -ENOMEM; > + goto unmap_and_free_file_vma; > + } > } > > vm_flags = vma->vm_flags; > -- > 2.43.0 >
On Mon, Nov 18, 2024 at 03:32:14PM -0500, Liam R. Howlett wrote: > Okay, before I get yelled at... > > This commit is only necessary for 6.12.y until Lorenzo's other fixes to > older stables land (and I'll have to figure out what to do in each). > > The commit will not work on mm-unstable, because it doesn't exist due to > refactoring. > > The commit does not have a tag about "upstream commit" because there > isn't one - the closest thing I could point to does not have a stable > git id. > > So here I am with a fix for a kernel that was released a few hours ago > that is not necessary in v6.13, for a bug that's out there on syzkaller. > > Also, it's very unlikely to happen unless you inject failures like > syzkaller. But hey, pretty decent turn-around on finding a fix - so > that's a rosy outlook. Why isn't this needed in 6.13.y? What's going to be different in there that this isn't needed? Do you just want me to take this for the 6.12.y tree now? I'll be glad to, just confused a bit. thanks, greg k-h
* Greg KH <gregkh@linuxfoundation.org> [241119 09:17]: > On Mon, Nov 18, 2024 at 03:32:14PM -0500, Liam R. Howlett wrote: > > Okay, before I get yelled at... > > > > This commit is only necessary for 6.12.y until Lorenzo's other fixes to > > older stables land (and I'll have to figure out what to do in each). > > > > The commit will not work on mm-unstable, because it doesn't exist due to > > refactoring. > > > > The commit does not have a tag about "upstream commit" because there > > isn't one - the closest thing I could point to does not have a stable > > git id. > > > > So here I am with a fix for a kernel that was released a few hours ago > > that is not necessary in v6.13, for a bug that's out there on syzkaller. > > > > Also, it's very unlikely to happen unless you inject failures like > > syzkaller. But hey, pretty decent turn-around on finding a fix - so > > that's a rosy outlook. > > Why isn't this needed in 6.13.y? What's going to be different in there > that this isn't needed? The code has been refactored and avoids the scenario. I'd name the refactoring commit as the upstream commit, but it does not have a stable git id as it's in mm-unstable. So I'm at a bit of a loss of how to follow the process. > > Do you just want me to take this for the 6.12.y tree now? I'll be glad > to, just confused a bit. Yes, please. Thanks, Liam
On 11/19/24 15:25, Liam R. Howlett wrote: > * Greg KH <gregkh@linuxfoundation.org> [241119 09:17]: >> On Mon, Nov 18, 2024 at 03:32:14PM -0500, Liam R. Howlett wrote: >> > Okay, before I get yelled at... >> > >> > This commit is only necessary for 6.12.y until Lorenzo's other fixes to >> > older stables land (and I'll have to figure out what to do in each). >> > >> > The commit will not work on mm-unstable, because it doesn't exist due to >> > refactoring. >> > >> > The commit does not have a tag about "upstream commit" because there >> > isn't one - the closest thing I could point to does not have a stable >> > git id. >> > >> > So here I am with a fix for a kernel that was released a few hours ago >> > that is not necessary in v6.13, for a bug that's out there on syzkaller. >> > >> > Also, it's very unlikely to happen unless you inject failures like >> > syzkaller. But hey, pretty decent turn-around on finding a fix - so >> > that's a rosy outlook. >> >> Why isn't this needed in 6.13.y? What's going to be different in there >> that this isn't needed? > > The code has been refactored and avoids the scenario. I'd name the > refactoring commit as the upstream commit, but it does not have a stable > git id as it's in mm-unstable. So I'm at a bit of a loss of how to > follow the process. Is it not in mm-stable now, given we're in a merge window? Anyway AFAIU if the stable-specific fix is completely different from the upstream refactoring, we don't even try to pretend it's the same "commit XYZ upstream" no? >> >> Do you just want me to take this for the 6.12.y tree now? I'll be glad >> to, just confused a bit. > > Yes, please. > > > Thanks, > Liam
On Mon, Nov 18, 2024 at 02:40:48PM -0500, Liam R. Howlett wrote: > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > The mmap_region() function tries to install a new vma, which requires a > pre-allocation for the maple tree write due to the complex locking > scenarios involved. > > Recent efforts to simplify the error recovery required the relocation of > the preallocation of the maple tree nodes (via vma_iter_prealloc() > calling mas_preallocate()) higher in the function. > > The relocation of the preallocation meant that, if there was a file > associated with the vma and the driver call (mmap_file()) modified the > vma flags, then a new merge of the new vma with existing vmas is > attempted. > > During the attempt to merge the existing vma with the new vma, the vma > iterator is used - the same iterator that would be used for the next > write attempt to the tree. In the event of needing a further allocation > and if the new allocations fails, the vma iterator (and contained maple > state) will cleaned up, including freeing all previous allocations and > will be reset internally. > > Upon returning to the __mmap_region() function, the error reason is lost > and the function sets the vma iterator limits, and then tries to > continue to store the new vma using vma_iter_store() - which expects > preallocated nodes. > > A preallocation should be performed in case the allocations were lost > during the failure scenario - there is no risk of over allocating. The > range is already set in the vma_iter_store() call below, so it is not > necessary. > > Reported-by: syzbot+bc6bfc25a68b7a020ee1@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000 > Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour") > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Jann Horn <jannh@google.com> > Cc: <stable@vger.kernel.org> > --- > mm/mmap.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 79d541f1502b2..5cef9a1981f1b 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > vm_flags = vma->vm_flags; > goto file_expanded; > } > - vma_iter_config(&vmi, addr, end); > + if (vma_iter_prealloc(&vmi, vma)) { > + error = -ENOMEM; > + goto unmap_and_free_file_vma; > + } This breaks the whole thing of these changes which is to avoid having to abort after invoking mmap_file(), so we can't do it this way. As discussed via IRC, it's probably best to do this with __GFP_NOFAIL or similar to avoid the abort altogether. But reaching this point is very common, since drivers will often change the flags, and failing to merge is also very common so we need to be sure this is a no-op if no pre-allocation is required in the _very rare_ (if even possible in reality) case of a merge failure due to OOM. We store in the vmg state whether a merge failed due to oom, which you can test for explicitly via vmg_nomem(). > } > > vm_flags = vma->vm_flags; > -- > 2.43.0 >
* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [241119 09:59]: > On Mon, Nov 18, 2024 at 02:40:48PM -0500, Liam R. Howlett wrote: > > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com> > > > > The mmap_region() function tries to install a new vma, which requires a > > pre-allocation for the maple tree write due to the complex locking > > scenarios involved. > > > > Recent efforts to simplify the error recovery required the relocation of > > the preallocation of the maple tree nodes (via vma_iter_prealloc() > > calling mas_preallocate()) higher in the function. > > > > The relocation of the preallocation meant that, if there was a file > > associated with the vma and the driver call (mmap_file()) modified the > > vma flags, then a new merge of the new vma with existing vmas is > > attempted. > > > > During the attempt to merge the existing vma with the new vma, the vma > > iterator is used - the same iterator that would be used for the next > > write attempt to the tree. In the event of needing a further allocation > > and if the new allocations fails, the vma iterator (and contained maple > > state) will cleaned up, including freeing all previous allocations and > > will be reset internally. > > > > Upon returning to the __mmap_region() function, the error reason is lost > > and the function sets the vma iterator limits, and then tries to > > continue to store the new vma using vma_iter_store() - which expects > > preallocated nodes. > > > > A preallocation should be performed in case the allocations were lost > > during the failure scenario - there is no risk of over allocating. The > > range is already set in the vma_iter_store() call below, so it is not > > necessary. > > > > Reported-by: syzbot+bc6bfc25a68b7a020ee1@syzkaller.appspotmail.com > > Closes: https://syzkaller.appspot.com/x/log.txt?x=17b0ace8580000 > > Fixes: 5de195060b2e2 ("mm: resolve faulty mmap_region() error path behaviour") > > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> > > Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > Cc: Vlastimil Babka <vbabka@suse.cz> > > Cc: Jann Horn <jannh@google.com> > > Cc: <stable@vger.kernel.org> > > --- > > mm/mmap.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > index 79d541f1502b2..5cef9a1981f1b 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, > > vm_flags = vma->vm_flags; > > goto file_expanded; > > } > > - vma_iter_config(&vmi, addr, end); > > + if (vma_iter_prealloc(&vmi, vma)) { > > + error = -ENOMEM; > > + goto unmap_and_free_file_vma; > > + } > > This breaks the whole thing of these changes which is to avoid having to > abort after invoking mmap_file(), so we can't do it this way. > > As discussed via IRC, it's probably best to do this with __GFP_NOFAIL or > similar to avoid the abort altogether. > > But reaching this point is very common, since drivers will often change the > flags, and failing to merge is also very common so we need to be sure this > is a no-op if no pre-allocation is required in the _very rare_ (if even > possible in reality) case of a merge failure due to OOM. The tree will do nothing if there is enough or more than needed preallocated for the operation. If this is common though, we should try to avoid the call into the tree with the vmg_nomem() test. > > We store in the vmg state whether a merge failed due to oom, which you can > test for explicitly via vmg_nomem(). Okay, I will respin. Thanks, Liam
diff --git a/mm/mmap.c b/mm/mmap.c index 79d541f1502b2..5cef9a1981f1b 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1491,7 +1491,10 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr, vm_flags = vma->vm_flags; goto file_expanded; } - vma_iter_config(&vmi, addr, end); + if (vma_iter_prealloc(&vmi, vma)) { + error = -ENOMEM; + goto unmap_and_free_file_vma; + } } vm_flags = vma->vm_flags;