diff mbox series

[3/4] iommufd: Do not corrupt the pfn list when doing batch carry

Message ID 3-v1-ceab6a4d7d7a+94-iommufd_syz_jgg@nvidia.com (mailing list archive)
State Accepted
Commit 13a0d1ae7ee6b438f5537711a8c60cba00554943
Headers show
Series Fix three syzkaller splats in iommufd | expand

Commit Message

Jason Gunthorpe March 31, 2023, 3:32 p.m. UTC
If batch->end is 0 then setting npfns[0] before computing the new value of
pfns will fail to adjust the pfn and result in various page accounting
corruptions. It should be ordered after.

This seems to result in various kinds of page meta-data corruption related
failures:

  WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740
  Modules linked in:
  CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:try_grab_folio+0x503/0x740
  Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89
  RSP: 0018:ffffc90000f37908 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26
  RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002
  RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008
  R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540
  R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540
  FS:  00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0
  PKRU: 55555554
  Call Trace:
   <TASK>
   internal_get_user_pages_fast+0xd32/0x2200
   pin_user_pages_fast+0x65/0x90
   pfn_reader_user_pin+0x376/0x390
   pfn_reader_next+0x14a/0x7b0
   pfn_reader_first+0x140/0x1b0
   iopt_area_fill_domain+0x74/0x210
   iopt_table_add_domain+0x30e/0x6e0
   iommufd_device_selftest_attach+0x7f/0x140
   iommufd_test+0x10ff/0x16f0
   iommufd_fops_ioctl+0x206/0x330
   __x64_sys_ioctl+0x10e/0x160
   do_syscall_64+0x3b/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

Cc: <stable@vger.kernel.org>
Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Link: https://lore.kernel.org/r/ZBExkEW/On0ue68q@xpf.sh.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommufd/pages.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Pengfei Xu April 1, 2023, 6:29 a.m. UTC | #1
Hi Jason,

On 2023-03-31 at 12:32:26 -0300, Jason Gunthorpe wrote:
> If batch->end is 0 then setting npfns[0] before computing the new value of
> pfns will fail to adjust the pfn and result in various page accounting
> corruptions. It should be ordered after.
> 
> This seems to result in various kinds of page meta-data corruption related
> failures:
> 
>   WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740
>   Modules linked in:
>   CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:try_grab_folio+0x503/0x740
>   Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89
>   RSP: 0018:ffffc90000f37908 EFLAGS: 00010046
>   RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26
>   RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002
>   RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008
>   R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540
>   R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540
>   FS:  00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0
>   PKRU: 55555554
>   Call Trace:
>    <TASK>
>    internal_get_user_pages_fast+0xd32/0x2200
>    pin_user_pages_fast+0x65/0x90
>    pfn_reader_user_pin+0x376/0x390
>    pfn_reader_next+0x14a/0x7b0
>    pfn_reader_first+0x140/0x1b0
>    iopt_area_fill_domain+0x74/0x210
>    iopt_table_add_domain+0x30e/0x6e0
>    iommufd_device_selftest_attach+0x7f/0x140
>    iommufd_test+0x10ff/0x16f0
>    iommufd_fops_ioctl+0x206/0x330
>    __x64_sys_ioctl+0x10e/0x160
>    do_syscall_64+0x3b/0x90
>    entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
> Cc: <stable@vger.kernel.org>
> Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
> Reported-by: Pengfei Xu <pengfei.xu@intel.com>
> Link: https://lore.kernel.org/r/ZBExkEW/On0ue68q@xpf.sh.intel.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/iommufd/pages.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> index b11aace836542d..3c47846cc5efe8 100644
> --- a/drivers/iommu/iommufd/pages.c
> +++ b/drivers/iommu/iommufd/pages.c
> @@ -294,9 +294,9 @@ static void batch_clear_carry(struct pfn_batch *batch, unsigned int keep_pfns)
>  			batch->npfns[batch->end - 1] < keep_pfns);
>  
>  	batch->total_pfns = keep_pfns;
> -	batch->npfns[0] = keep_pfns;
>  	batch->pfns[0] = batch->pfns[batch->end - 1] +
>  			 (batch->npfns[batch->end - 1] - keep_pfns);
> +	batch->npfns[0] = keep_pfns;
>  	batch->end = 0;
>  }
>  
> -- 
  I tested the reproduced code in the kernel with all 3 fixed patches.
  Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230313_234302_try_grab_folio/repro.c
  This issue was gone and the issue was fixed.

  Thanks!
  BR.

> 2.40.0
>
Pengfei Xu April 3, 2023, 7:02 a.m. UTC | #2
Hi Jason,

  Could you add "Tested-by" tag from me?

  Thanks!
  BR.

On 2023-04-01 at 14:29:33 +0800, Pengfei Xu wrote:
> Hi Jason,
> 
> On 2023-03-31 at 12:32:26 -0300, Jason Gunthorpe wrote:
> > If batch->end is 0 then setting npfns[0] before computing the new value of
> > pfns will fail to adjust the pfn and result in various page accounting
> > corruptions. It should be ordered after.
> > 
> > This seems to result in various kinds of page meta-data corruption related
> > failures:
> > 
> >   WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740
> >   Modules linked in:
> >   CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
> >   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> >   RIP: 0010:try_grab_folio+0x503/0x740
> >   Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8 2f bf dd ff 48 8b 5d c8 31 ff 48 89
> >   RSP: 0018:ffffc90000f37908 EFLAGS: 00010046
> >   RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26
> >   RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002
> >   RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008
> >   R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540
> >   R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540
> >   FS:  00007fecbf659740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
> >   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0
> >   PKRU: 55555554
> >   Call Trace:
> >    <TASK>
> >    internal_get_user_pages_fast+0xd32/0x2200
> >    pin_user_pages_fast+0x65/0x90
> >    pfn_reader_user_pin+0x376/0x390
> >    pfn_reader_next+0x14a/0x7b0
> >    pfn_reader_first+0x140/0x1b0
> >    iopt_area_fill_domain+0x74/0x210
> >    iopt_table_add_domain+0x30e/0x6e0
> >    iommufd_device_selftest_attach+0x7f/0x140
> >    iommufd_test+0x10ff/0x16f0
> >    iommufd_fops_ioctl+0x206/0x330
> >    __x64_sys_ioctl+0x10e/0x160
> >    do_syscall_64+0x3b/0x90
> >    entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > 
> > Cc: <stable@vger.kernel.org>
> > Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
> > Reported-by: Pengfei Xu <pengfei.xu@intel.com>
> > Link: https://lore.kernel.org/r/ZBExkEW/On0ue68q@xpf.sh.intel.com
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/iommu/iommufd/pages.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
> > index b11aace836542d..3c47846cc5efe8 100644
> > --- a/drivers/iommu/iommufd/pages.c
> > +++ b/drivers/iommu/iommufd/pages.c
> > @@ -294,9 +294,9 @@ static void batch_clear_carry(struct pfn_batch *batch, unsigned int keep_pfns)
> >  			batch->npfns[batch->end - 1] < keep_pfns);
> >  
> >  	batch->total_pfns = keep_pfns;
> > -	batch->npfns[0] = keep_pfns;
> >  	batch->pfns[0] = batch->pfns[batch->end - 1] +
> >  			 (batch->npfns[batch->end - 1] - keep_pfns);
> > +	batch->npfns[0] = keep_pfns;
> >  	batch->end = 0;
> >  }
> >  
> > -- 
>   I tested the reproduced code in the kernel with all 3 fixed patches.
>   Syzkaller reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230313_234302_try_grab_folio/repro.c
>   This issue was gone and the issue was fixed.
> 
>   Thanks!
>   BR.
> 
> > 2.40.0
> >
Tian, Kevin April 4, 2023, 9:44 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, March 31, 2023 11:32 PM
> 
> If batch->end is 0 then setting npfns[0] before computing the new value of
> pfns will fail to adjust the pfn and result in various page accounting
> corruptions. It should be ordered after.
> 
> This seems to result in various kinds of page meta-data corruption related
> failures:
> 
>   WARNING: CPU: 1 PID: 527 at mm/gup.c:75 try_grab_folio+0x503/0x740
>   Modules linked in:
>   CPU: 1 PID: 527 Comm: repro Not tainted 6.3.0-rc2-eeac8ede1755+ #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-
> gd239552ce722-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:try_grab_folio+0x503/0x740
>   Code: e3 01 48 89 de e8 6d c1 dd ff 48 85 db 0f 84 7c fe ff ff e8 4f bf dd ff 49
> 8d 47 ff 48 89 45 d0 e9 73 fe ff ff e8 3d bf dd ff <0f> 0b 31 db e9 d0 fc ff ff e8
> 2f bf dd ff 48 8b 5d c8 31 ff 48 89
>   RSP: 0018:ffffc90000f37908 EFLAGS: 00010046
>   RAX: 0000000000000000 RBX: 00000000fffffc02 RCX: ffffffff81504c26
>   RDX: 0000000000000000 RSI: ffff88800d030000 RDI: 0000000000000002
>   RBP: ffffc90000f37948 R08: 000000000003ca24 R09: 0000000000000008
>   R10: 000000000003ca00 R11: 0000000000000023 R12: ffffea000035d540
>   R13: 0000000000000001 R14: 0000000000000000 R15: ffffea000035d540
>   FS:  00007fecbf659740(0000) GS:ffff88807dd00000(0000)
> knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00000000200011c3 CR3: 000000000ef66006 CR4: 0000000000770ee0
>   PKRU: 55555554
>   Call Trace:
>    <TASK>
>    internal_get_user_pages_fast+0xd32/0x2200
>    pin_user_pages_fast+0x65/0x90
>    pfn_reader_user_pin+0x376/0x390
>    pfn_reader_next+0x14a/0x7b0
>    pfn_reader_first+0x140/0x1b0
>    iopt_area_fill_domain+0x74/0x210
>    iopt_table_add_domain+0x30e/0x6e0
>    iommufd_device_selftest_attach+0x7f/0x140
>    iommufd_test+0x10ff/0x16f0
>    iommufd_fops_ioctl+0x206/0x330
>    __x64_sys_ioctl+0x10e/0x160
>    do_syscall_64+0x3b/0x90
>    entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
> Cc: <stable@vger.kernel.org>
> Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
> Reported-by: Pengfei Xu <pengfei.xu@intel.com>
> Link: https://lore.kernel.org/r/ZBExkEW/On0ue68q@xpf.sh.intel.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jason Gunthorpe April 4, 2023, 1:01 p.m. UTC | #4
On Mon, Apr 03, 2023 at 03:02:56PM +0800, Pengfei Xu wrote:
> Hi Jason,
> 
>   Could you add "Tested-by" tag from me?

Yes, I did, in future you can respond to the cover letter with that
tag and the tools will pick it up

Jason
Pengfei Xu April 4, 2023, 1:36 p.m. UTC | #5
Hi Jason,

On 2023-04-04 at 10:01:55 -0300, Jason Gunthorpe wrote:
> On Mon, Apr 03, 2023 at 03:02:56PM +0800, Pengfei Xu wrote:
> > Hi Jason,
> > 
> >   Could you add "Tested-by" tag from me?
> 
> Yes, I did, in future you can respond to the cover letter with that
> tag and the tools will pick it up
> 
  Ah, I see, I will do that next time, thanks for suggestion!

  BR.
  Thanks!

> Jason
diff mbox series

Patch

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index b11aace836542d..3c47846cc5efe8 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -294,9 +294,9 @@  static void batch_clear_carry(struct pfn_batch *batch, unsigned int keep_pfns)
 			batch->npfns[batch->end - 1] < keep_pfns);
 
 	batch->total_pfns = keep_pfns;
-	batch->npfns[0] = keep_pfns;
 	batch->pfns[0] = batch->pfns[batch->end - 1] +
 			 (batch->npfns[batch->end - 1] - keep_pfns);
+	batch->npfns[0] = keep_pfns;
 	batch->end = 0;
 }