Message ID | 1447964734-16010-1-git-send-email-jack@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu 19-11-15 21:25:34, Jan Kara wrote: > Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and > we issue direct IO read at offset 1024, blockdev_direct_IO() reads the > tail of the last block and the logic for handling short DIO reads in > dio_complete() results in a return value -24 (1000 - 1024) which > obviously confuses userspace. > > Fix the problem by bailing out early once we sample i_size and can > reliably check that direct IO read starts beyond i_size. > > Reported-by: Avi Kivity <avi@scylladb.com> > Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 > CC: stable@vger.kernel.org > CC: Steven Whitehouse <swhiteho@redhat.com> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/direct-io.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > Avi, this patch fixes the issue for me. Jens, can you pick up this fix please? Thanks! Honza > diff --git a/fs/direct-io.c b/fs/direct-io.c > index 18e7554cf94c..08094c9d8172 100644 > --- a/fs/direct-io.c > +++ b/fs/direct-io.c > @@ -1163,6 +1163,15 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > } > } > > + /* Once we sampled i_size check for reads beyond EOF */ > + dio->i_size = i_size_read(inode); > + if (iov_iter_rw(iter) == READ && offset >= dio->i_size) { > + if (dio->flags & DIO_LOCKING) > + mutex_unlock(&inode->i_mutex); > + kmem_cache_free(dio_cache, dio); > + goto out; > + } > + > /* > * For file extending writes updating i_size before data writeouts > * complete can expose uninitialized blocks in dumb filesystems. > @@ -1216,7 +1225,6 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > sdio.next_block_for_io = -1; > > dio->iocb = iocb; > - dio->i_size = i_size_read(inode); > > spin_lock_init(&dio->bio_lock); > dio->refcount = 1; > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/30/2015 06:10 AM, Jan Kara wrote: > On Thu 19-11-15 21:25:34, Jan Kara wrote: >> Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and >> we issue direct IO read at offset 1024, blockdev_direct_IO() reads the >> tail of the last block and the logic for handling short DIO reads in >> dio_complete() results in a return value -24 (1000 - 1024) which >> obviously confuses userspace. >> >> Fix the problem by bailing out early once we sample i_size and can >> reliably check that direct IO read starts beyond i_size. >> >> Reported-by: Avi Kivity <avi@scylladb.com> >> Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 >> CC: stable@vger.kernel.org >> CC: Steven Whitehouse <swhiteho@redhat.com> >> Signed-off-by: Jan Kara <jack@suse.cz> >> --- >> fs/direct-io.c | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> Avi, this patch fixes the issue for me. > > Jens, can you pick up this fix please? Thanks! Yup, added for 4.4, thanks!
On 11/19/2015 10:25 PM, Jan Kara wrote: > Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and > we issue direct IO read at offset 1024, blockdev_direct_IO() reads the > tail of the last block and the logic for handling short DIO reads in > dio_complete() results in a return value -24 (1000 - 1024) which > obviously confuses userspace. > > Fix the problem by bailing out early once we sample i_size and can > reliably check that direct IO read starts beyond i_size. > > Reported-by: Avi Kivity <avi@scylladb.com> > Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 > CC: stable@vger.kernel.org While this patch made it into upstream, it did not appear in 4.3.4. Did it slip through the proverbial cracks? Can it be queued for 4.3.5? > CC: Steven Whitehouse <swhiteho@redhat.com> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/direct-io.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > Avi, this patch fixes the issue for me. > > Honza > > diff --git a/fs/direct-io.c b/fs/direct-io.c > index 18e7554cf94c..08094c9d8172 100644 > --- a/fs/direct-io.c > +++ b/fs/direct-io.c > @@ -1163,6 +1163,15 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > } > } > > + /* Once we sampled i_size check for reads beyond EOF */ > + dio->i_size = i_size_read(inode); > + if (iov_iter_rw(iter) == READ && offset >= dio->i_size) { > + if (dio->flags & DIO_LOCKING) > + mutex_unlock(&inode->i_mutex); > + kmem_cache_free(dio_cache, dio); > + goto out; > + } > + > /* > * For file extending writes updating i_size before data writeouts > * complete can expose uninitialized blocks in dumb filesystems. > @@ -1216,7 +1225,6 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, > sdio.next_block_for_io = -1; > > dio->iocb = iocb; > - dio->i_size = i_size_read(inode); > > spin_lock_init(&dio->bio_lock); > dio->refcount = 1; -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 27, 2016 at 12:38:14PM +0200, Avi Kivity wrote: > On 11/19/2015 10:25 PM, Jan Kara wrote: > >Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and > >we issue direct IO read at offset 1024, blockdev_direct_IO() reads the > >tail of the last block and the logic for handling short DIO reads in > >dio_complete() results in a return value -24 (1000 - 1024) which > >obviously confuses userspace. > > > >Fix the problem by bailing out early once we sample i_size and can > >reliably check that direct IO read starts beyond i_size. > > > >Reported-by: Avi Kivity <avi@scylladb.com> > >Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 > >CC: stable@vger.kernel.org > > While this patch made it into upstream, it did not appear in 4.3.4. Did it > slip through the proverbial cracks? Can it be queued for 4.3.5? There are over 400 patches right now in my queue that haven't made it into a 4.3.x kernel. These are in the queue, in good company :) I'll go dig these out as I guess people care about them more than others... thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/27/2016 07:13 PM, Greg KH wrote: > On Wed, Jan 27, 2016 at 12:38:14PM +0200, Avi Kivity wrote: >> On 11/19/2015 10:25 PM, Jan Kara wrote: >>> Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and >>> we issue direct IO read at offset 1024, blockdev_direct_IO() reads the >>> tail of the last block and the logic for handling short DIO reads in >>> dio_complete() results in a return value -24 (1000 - 1024) which >>> obviously confuses userspace. >>> >>> Fix the problem by bailing out early once we sample i_size and can >>> reliably check that direct IO read starts beyond i_size. >>> >>> Reported-by: Avi Kivity <avi@scylladb.com> >>> Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 >>> CC: stable@vger.kernel.org >> While this patch made it into upstream, it did not appear in 4.3.4. Did it >> slip through the proverbial cracks? Can it be queued for 4.3.5? > There are over 400 patches right now in my queue that haven't made it > into a 4.3.x kernel. Many projects would consider 400 patches a major release, and here they are behind two dots... > These are in the queue, in good company :) > > I'll go dig these out as I guess people care about them more than > others... > > Much appreciated. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 27, 2016 at 07:16:32PM +0200, Avi Kivity wrote: > On 01/27/2016 07:13 PM, Greg KH wrote: > >On Wed, Jan 27, 2016 at 12:38:14PM +0200, Avi Kivity wrote: > >>On 11/19/2015 10:25 PM, Jan Kara wrote: > >>>Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and > >>>we issue direct IO read at offset 1024, blockdev_direct_IO() reads the > >>>tail of the last block and the logic for handling short DIO reads in > >>>dio_complete() results in a return value -24 (1000 - 1024) which > >>>obviously confuses userspace. > >>> > >>>Fix the problem by bailing out early once we sample i_size and can > >>>reliably check that direct IO read starts beyond i_size. > >>> > >>>Reported-by: Avi Kivity <avi@scylladb.com> > >>>Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 > >>>CC: stable@vger.kernel.org > >>While this patch made it into upstream, it did not appear in 4.3.4. Did it > >>slip through the proverbial cracks? Can it be queued for 4.3.5? > >There are over 400 patches right now in my queue that haven't made it > >into a 4.3.x kernel. > > Many projects would consider 400 patches a major release, and here they are > behind two dots... Look at the percentages, it's just a tiny drop in the bucket compared to what is hitting Linus's tree :) thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 27, 2016 at 07:16:32PM +0200, Avi Kivity wrote: > > These are in the queue, in good company :) > > > >I'll go dig these out as I guess people care about them more than > >others... > > > > > > Much appreciated. Note, they didn't build on 3.14 so I had to revert both of them. If you could provide a fix for that release, that would be great. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/27/2016 07:46 PM, Greg KH wrote: > On Wed, Jan 27, 2016 at 07:16:32PM +0200, Avi Kivity wrote: >>> These are in the queue, in good company :) >>> >>> I'll go dig these out as I guess people care about them more than >>> others... >>> >>> >> Much appreciated. > Note, they didn't build on 3.14 I think the regression was introduced post 14, let me check. > so I had to revert both of them. Both of what? There was only one patch. Or did you mean you drop this patch for 4.3.5? > If you > could provide a fix for that release, that would be great. > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/27/2016 07:49 PM, Avi Kivity wrote: > On 01/27/2016 07:46 PM, Greg KH wrote: >> On Wed, Jan 27, 2016 at 07:16:32PM +0200, Avi Kivity wrote: >>>> These are in the queue, in good company :) >>>> >>>> I'll go dig these out as I guess people care about them more than >>>> others... >>>> >>>> >>> Much appreciated. >> Note, they didn't build on 3.14 > > I think the regression was introduced post 14, let me check. > Actually it was introduced in 3.14-rc1. I could try a backport, but it would be better if someone who actually knows the code did this. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 27, 2016 at 07:49:15PM +0200, Avi Kivity wrote: > On 01/27/2016 07:46 PM, Greg KH wrote: > >On Wed, Jan 27, 2016 at 07:16:32PM +0200, Avi Kivity wrote: > >>> These are in the queue, in good company :) > >>> > >>>I'll go dig these out as I guess people care about them more than > >>>others... > >>> > >>> > >>Much appreciated. > >Note, they didn't build on 3.14 > > I think the regression was introduced post 14, let me check. Not according to the stable tag in the commit :) > > so I had to revert both of them. > > Both of what? There was only one patch. Or did you mean you drop this > patch for 4.3.5? There was a follow-on fix for this patch as well, commit 2d4594acbf6d8f75a27f3578476b6a27d8b13ebb. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/direct-io.c b/fs/direct-io.c index 18e7554cf94c..08094c9d8172 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -1163,6 +1163,15 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, } } + /* Once we sampled i_size check for reads beyond EOF */ + dio->i_size = i_size_read(inode); + if (iov_iter_rw(iter) == READ && offset >= dio->i_size) { + if (dio->flags & DIO_LOCKING) + mutex_unlock(&inode->i_mutex); + kmem_cache_free(dio_cache, dio); + goto out; + } + /* * For file extending writes updating i_size before data writeouts * complete can expose uninitialized blocks in dumb filesystems. @@ -1216,7 +1225,6 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, sdio.next_block_for_io = -1; dio->iocb = iocb; - dio->i_size = i_size_read(inode); spin_lock_init(&dio->bio_lock); dio->refcount = 1;
Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and we issue direct IO read at offset 1024, blockdev_direct_IO() reads the tail of the last block and the logic for handling short DIO reads in dio_complete() results in a return value -24 (1000 - 1024) which obviously confuses userspace. Fix the problem by bailing out early once we sample i_size and can reliably check that direct IO read starts beyond i_size. Reported-by: Avi Kivity <avi@scylladb.com> Fixes: 9fe55eea7e4b444bafc42fa0000cc2d1d2847275 CC: stable@vger.kernel.org CC: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/direct-io.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) Avi, this patch fixes the issue for me. Honza