Message ID | 20220401072926.45051-1-linmiaohe@huawei.com |
---|---|
State | New |
Series | mm/swapfile: unuse_pte can map random data if swap read fails |
On 01.04.22 09:29, Miaohe Lin wrote:
> There is a bug in unuse_pte(): when swap page happens to be unreadable,
> page filled with random data is mapped into user address space. The fix
> is to check for PageUptodate and fail swapoff in case of error.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/swapfile.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 63c61f8b2611..e72a35de7a0f 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		ret = 0;
>  		goto out;
>  	}
> +	if (unlikely(!PageUptodate(page))) {
> +		ret = -EIO;
> +		goto out;
> +	}

Yeah, we have the same handling in do_swap_page(), whereby we send a
SIGBUS because we're dealing with an actual access.

Interestingly, folio_test_uptodate() states:

"Anonymous and CoW folios are always uptodate."

@Willy, is that true or is the swapin case not documented there?

On Mon, Apr 04, 2022 at 03:37:36PM +0200, David Hildenbrand wrote:
> On 01.04.22 09:29, Miaohe Lin wrote:
> > There is a bug in unuse_pte(): when swap page happens to be unreadable,
> > page filled with random data is mapped into user address space. The fix
> > is to check for PageUptodate and fail swapoff in case of error.
> >
> > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> > ---
> >  mm/swapfile.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 63c61f8b2611..e72a35de7a0f 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
> >  		ret = 0;
> >  		goto out;
> >  	}
> > +	if (unlikely(!PageUptodate(page))) {
> > +		ret = -EIO;
> > +		goto out;
> > +	}
>
> Yeah, we have the same handling in do_swap_page(), whereby we send a
> SIGBUS because we're dealing with an actual access.
>
> Interestingly, folio_test_uptodate() states:
>
> "Anonymous and CoW folios are always uptodate."
>
> @Willy, is that true or is the swapin case not documented there?

Why do we keep a !Uptodate page in the swap cache? If it can't be
read in from swap, I thought we just freed the page. Since Miaohe
has observed that not happening, I guess it doesn't work that way,
but why not make it work that way?

On Fri, 1 Apr 2022 15:29:26 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> There is a bug in unuse_pte(): when swap page happens to be unreadable,
> page filled with random data is mapped into user address space. The fix
> is to check for PageUptodate and fail swapoff in case of error.
>
> ...
>
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		ret = 0;
>  		goto out;
>  	}
> +	if (unlikely(!PageUptodate(page))) {
> +		ret = -EIO;
> +		goto out;
> +	}
>
>  	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>  	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);

Failing the swapoff after -EIO seems a bit rude. The user ends up with
a permanently mounted swap because a sector was bad?

That would be like failing truncate() or close() or umount after -EIO
on a regular file. Somewhat.

Can we do something better? Such as shooting down the page anyway and
permitting the swapoff to proceed? Worst case, just leak the dang page
with an apologetic message.

On 2022/4/5 6:53, Andrew Morton wrote:
> On Fri, 1 Apr 2022 15:29:26 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
>
>> There is a bug in unuse_pte(): when swap page happens to be unreadable,
>> page filled with random data is mapped into user address space. The fix
>> is to check for PageUptodate and fail swapoff in case of error.
>>
>> ...
>>
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>  		ret = 0;
>>  		goto out;
>>  	}
>> +	if (unlikely(!PageUptodate(page))) {
>> +		ret = -EIO;
>> +		goto out;
>> +	}
>>
>>  	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>  	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
>
> Failing the swapoff after -EIO seems a bit rude. The user ends up with
> a permanently mounted swap because a sector was bad?
>

This is really unfortunate. :(

> That would be like failing truncate() or close() or umount after -EIO
> on a regular file. Somewhat.
>
> Can we do something better? Such as shooting down the page anyway and
> permitting the swapoff to proceed? Worst case, just leak the dang page
> with an apologetic message.
>

We must have a way to prevent the user from accessing the wrong data.
One way is to keep the page in the swap cache and kill the user when the
page is accessed. But this will end up with a permanently mounted swap.
Another way I can think of now is to set the page table entry to some
special swap entry, such as SWP_EIO (like SWP_HWPOISON), so that we can
kill the user when the page is accessed while swapoff can still proceed.
But this makes the code more complicated... Any suggestions?

Many thanks!

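
A rough userspace sketch of the special-entry idea floated above, only to make the intended behaviour concrete. Everything here is hypothetical: the `model_*` names are invented for illustration and are not kernel types or functions, and the sketch says nothing about how a real `SWP_EIO`-style entry would be encoded. It assumes only what the message describes: swapoff replaces the unreadable entry with a marker and succeeds, and a later access to that marker fails (where the kernel would send SIGBUS).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_pte_kind {
	MODEL_PTE_SWAP,         /* still refers to a slot on the swap device */
	MODEL_PTE_PRESENT,      /* maps a page with valid contents */
	MODEL_PTE_SWAPIN_ERROR  /* marker: contents were lost to a read error */
};

struct model_pte {
	enum model_pte_kind kind;
};

/*
 * Model of the swapoff path for one entry. Entries that are not swap
 * entries are left alone. A failed read installs the marker instead of
 * mapping the page, and swapoff still succeeds (returns 0).
 */
static int model_unuse_pte(struct model_pte *pte, bool page_uptodate)
{
	if (pte->kind != MODEL_PTE_SWAP)
		return 0;
	pte->kind = page_uptodate ? MODEL_PTE_PRESENT : MODEL_PTE_SWAPIN_ERROR;
	return 0;
}

/*
 * Model of a later access. Returns 0 when data is available and -EIO
 * where a real kernel would deliver SIGBUS to the task.
 */
static int model_access(const struct model_pte *pte)
{
	switch (pte->kind) {
	case MODEL_PTE_PRESENT:
		return 0;
	case MODEL_PTE_SWAPIN_ERROR:
		return -EIO;
	default:
		return -EAGAIN; /* still swapped out: would trigger swap-in */
	}
}
```

In this model, calling `model_unuse_pte()` with `page_uptodate == false` returns 0, so swapoff is not blocked, and only a task that later touches the entry through `model_access()` sees the error. The cost mentioned in the message, extra code to recognise the marker on every path that walks page tables, is not modelled here.
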
On 2022/4/4 22:11, Matthew Wilcox wrote:
> On Mon, Apr 04, 2022 at 03:37:36PM +0200, David Hildenbrand wrote:
>> On 01.04.22 09:29, Miaohe Lin wrote:
>>> There is a bug in unuse_pte(): when swap page happens to be unreadable,
>>> page filled with random data is mapped into user address space. The fix
>>> is to check for PageUptodate and fail swapoff in case of error.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>  mm/swapfile.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>>> index 63c61f8b2611..e72a35de7a0f 100644
>>> --- a/mm/swapfile.c
>>> +++ b/mm/swapfile.c
>>> @@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>>  		ret = 0;
>>>  		goto out;
>>>  	}
>>> +	if (unlikely(!PageUptodate(page))) {
>>> +		ret = -EIO;
>>> +		goto out;
>>> +	}
>>
>> Yeah, we have the same handling in do_swap_page(), whereby we send a
>> SIGBUS because we're dealing with an actual access.
>>
>> Interestingly, folio_test_uptodate() states:
>>
>> "Anonymous and CoW folios are always uptodate."
>>
>> @Willy, is that true or is the swapin case not documented there?
>
> Why do we keep a !Uptodate page in the swap cache? If it can't be
> read in from swap, I thought we just freed the page. Since Miaohe

We could free the bad page. But we still need a way to prevent the user
from accessing the wrong data.

> has observed that not happening, I guess it doesn't work that way,
> but why not make it work that way?

How could we make it work that way? Could you please explain in more
detail? Or any suggestions?

Many thanks!

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63c61f8b2611..e72a35de7a0f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1795,6 +1795,10 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		ret = 0;
 		goto out;
 	}
+	if (unlikely(!PageUptodate(page))) {
+		ret = -EIO;
+		goto out;
+	}
 
 	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
 	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);

There is a bug in unuse_pte(): when swap page happens to be unreadable,
page filled with random data is mapped into user address space. The fix
is to check for PageUptodate and fail swapoff in case of error.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/swapfile.c | 4 ++++
 1 file changed, 4 insertions(+)
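
For readers who want the effect of the patch as posted without the surrounding kernel code, here is a minimal userspace model of the control flow the added check introduces. It is a sketch under stated assumptions, not the kernel implementation: `model_page`, `model_entry` and `model_unuse_pte_checked()` are hypothetical names, and the `uptodate` field merely stands in for what `PageUptodate(page)` reports.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not kernel types. */
struct model_page {
	bool uptodate; /* false when the read from swap failed */
};

struct model_entry {
	bool mapped;   /* true once the page is mapped into the address space */
};

/*
 * Mirrors the control flow of the added check: a page that is not
 * up to date is never mapped, and the caller sees -EIO.
 */
static int model_unuse_pte_checked(struct model_entry *e,
				   const struct model_page *page)
{
	if (!page->uptodate)
		return -EIO;
	e->mapped = true;
	return 0;
}
```

The model shows both sides of the review discussion: the bad page never becomes visible to the task, but the error also propagates to the caller, which is why swapoff as a whole fails in this version of the fix.
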