Message ID | 1455266355-44676-1-git-send-email-hare@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 02/12/2016 01:39 AM, Hannes Reinecke wrote: > Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for > current->mm to see if we have a user space context and only copies data > if we do. Now if an IO gets interrupted by a signal data isn't copied > into user space any more (as we don't have a user space context) but > user space isn't notified about it. > > This patch modifies the behaviour to return -EINTR from bio_uncopy_user() > to notify userland that a signal has interrupted the syscall, otherwise > it could lead to a situation where the caller may get a buffer with > no data returned. > > This can be reproduced by issuing SG_IO ioctl()s in one thread while > constantly sending signals to it. Good catch, fix looks good to me. Applied for 4.5.
On 16-02-12 03:39 AM, Hannes Reinecke wrote: > Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for > current->mm to see if we have a user space context and only copies data > if we do. Now if an IO gets interrupted by a signal data isn't copied > into user space any more (as we don't have a user space context) but > user space isn't notified about it. > > This patch modifies the behaviour to return -EINTR from bio_uncopy_user() > to notify userland that a signal has interrupted the syscall, otherwise > it could lead to a situation where the caller may get a buffer with > no data returned. Interesting, the "f091" commit has been in the kernel since 2013 hence your reference to v.3.11 . I always had the feeling that handling signals that interrupted SG_IO calls was skating on thin ice. Hence in ddpt (but not sg_dd nor sgp_dd) the code masks out all signals (that it can) during the SG_IO calls then opens a signal window briefly after a SG_IO ioctl has finished and before the next one starts. This approach used by ddpt is borrowed from dd (in coreutils) which masks signals during its read() and write() calls. Any idea how accurate resid is in this scenario? Doug Gilbert > This can be reproduced by issuing SG_IO ioctl()s in one thread while > constantly sending signals to it. > > Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > Signed-off-by: Hannes Reinecke <hare@suse.de> > Cc: stable@vger.kernel.org # v.3.11+ > --- > block/bio.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index dbabd48..24e5b69 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1090,9 +1090,12 @@ int bio_uncopy_user(struct bio *bio) > if (!bio_flagged(bio, BIO_NULL_MAPPED)) { > /* > * if we're in a workqueue, the request is orphaned, so > - * don't copy into a random user address space, just free. > + * don't copy into a random user address space, just free > + * and return -EINTR so user space doesn't expect any data. > */ > - if (current->mm && bio_data_dir(bio) == READ) > + if (!current->mm) > + ret = -EINTR; > + else if (bio_data_dir(bio) == READ) > ret = bio_copy_to_iter(bio, bmd->iter); > if (bmd->is_our_pages) > bio_free_pages(bio); > -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/12/2016 05:05 PM, Douglas Gilbert wrote: > On 16-02-12 03:39 AM, Hannes Reinecke wrote: >> Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for >> current->mm to see if we have a user space context and only copies data >> if we do. Now if an IO gets interrupted by a signal data isn't copied >> into user space any more (as we don't have a user space context) but >> user space isn't notified about it. >> >> This patch modifies the behaviour to return -EINTR from bio_uncopy_user() >> to notify userland that a signal has interrupted the syscall, otherwise >> it could lead to a situation where the caller may get a buffer with >> no data returned. > > Interesting, the "f091" commit has been in the kernel since 2013 > hence your reference to v.3.11 . I always had the feeling that > handling signals that interrupted SG_IO calls was skating on thin > ice. Yeah; that bug was really annoying, as occasionally receiving no data whatsoever without any indication makes it really, really hard to debug. Kudos go to Johannes and Ewan for pointing to the offending function. > Hence in ddpt (but not sg_dd nor sgp_dd) the code masks out > all signals (that it can) during the SG_IO calls then opens a > signal window briefly after a SG_IO ioctl has finished and before > the next one starts. This approach used by ddpt is borrowed from > dd (in coreutils) which masks signals during its read() and > write() calls. > > Any idea how accurate resid is in this scenario? > Bah. F*sk knows. That takes far more POSIX knowledge than I have. I would suspect that by masking out signals they'll be marked as pending for the process, and will be delivered once you unmask them. Or dropped, depending. In either case, once they are blocked out the kernel part shouldn't receive a signal, and hence we should be able to receive the data properly. But as noted I'm not a POSIX expert. Cheers, Hannes
On Fri, 2016-02-12 at 09:39 +0100, Hannes Reinecke wrote: > Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for > current->mm to see if we have a user space context and only copies data > if we do. Now if an IO gets interrupted by a signal data isn't copied > into user space any more (as we don't have a user space context) but > user space isn't notified about it. > > This patch modifies the behaviour to return -EINTR from bio_uncopy_user() > to notify userland that a signal has interrupted the syscall, otherwise > it could lead to a situation where the caller may get a buffer with > no data returned. > > This can be reproduced by issuing SG_IO ioctl()s in one thread while > constantly sending signals to it. Well, this is definitely an improvement, since it now indicates to the user that there was a problem instead of silently returning with no data and no error, but it doesn't completely fix the issue. For one thing, if the SG_IO is performed to a sequential media device, the I/O is actually performed, and the position changes, it is just the data that is not copied back. So e.g. a user program making calls to read from a tape device that gets signals and retrying on -EINTR will skip blocks. There is a similar problem already with SG_IO and -ERESTARTSYS. I believe that that only way around this is to mask the signals. (This is also problem with non-idempotent commands, such as COMPARE AND WRITE.) I should mention that the patch "sg: Fix user memory corruption when SG_IO is interrupted by a signal" broke a 3rd-party kernel module that happened to use SG_IO to pass *kernel* addresses for I/O. Which worked for them until the current->mm test was added. (I don't know why they didn't just put I/O on the request queue like everyone else, though.) This patch should go in, though. Reviewed-by: Ewan D. Milne <emilne@redhat.com> > > Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> > Signed-off-by: Hannes Reinecke <hare@suse.de> > Cc: stable@vger.kernel.org # v.3.11+ > --- > block/bio.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index dbabd48..24e5b69 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1090,9 +1090,12 @@ int bio_uncopy_user(struct bio *bio) > if (!bio_flagged(bio, BIO_NULL_MAPPED)) { > /* > * if we're in a workqueue, the request is orphaned, so > - * don't copy into a random user address space, just free. > + * don't copy into a random user address space, just free > + * and return -EINTR so user space doesn't expect any data. > */ > - if (current->mm && bio_data_dir(bio) == READ) > + if (!current->mm) > + ret = -EINTR; > + else if (bio_data_dir(bio) == READ) > ret = bio_copy_to_iter(bio, bmd->iter); > if (bmd->is_our_pages) > bio_free_pages(bio); -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/block/bio.c b/block/bio.c index dbabd48..24e5b69 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1090,9 +1090,12 @@ int bio_uncopy_user(struct bio *bio) if (!bio_flagged(bio, BIO_NULL_MAPPED)) { /* * if we're in a workqueue, the request is orphaned, so - * don't copy into a random user address space, just free. + * don't copy into a random user address space, just free + * and return -EINTR so user space doesn't expect any data. */ - if (current->mm && bio_data_dir(bio) == READ) + if (!current->mm) + ret = -EINTR; + else if (bio_data_dir(bio) == READ) ret = bio_copy_to_iter(bio, bmd->iter); if (bmd->is_our_pages) bio_free_pages(bio);