Message ID | 9a978571-8648-e830-5735-1f4748ce2e30@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tmpfs: fix regressions from wider use of ZERO_PAGE | expand |
On Fri, Apr 08, 2022 at 01:38:41PM -0700, Hugh Dickins wrote: > + } else if (iter_is_iovec(to)) { > + /* > + * Copy to user tends to be so well optimized, but > + * clear_user() not so much, that it is noticeably > + * faster to copy the zero page instead of clearing. > + */ > + ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to); Is the offset and length guaranteed to be less than PAGE_SIZE here? Either way I'd rather do this optimization in iov_iter_zero rather than hiding it in tmpfs.
On Sat, 9 Apr 2022, Christoph Hellwig wrote: > On Fri, Apr 08, 2022 at 01:38:41PM -0700, Hugh Dickins wrote: > > + } else if (iter_is_iovec(to)) { > > + /* > > + * Copy to user tends to be so well optimized, but > > + * clear_user() not so much, that it is noticeably > > + * faster to copy the zero page instead of clearing. > > + */ > > + ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to); > > Is the offset and length guaranteed to be less than PAGE_SIZE here? Almost :) The offset is guaranteed to be less than PAGE_SIZE here, and the length is guaranteed to be less than or equal to PAGE_SIZE - offset. > > Either way I'd rather do this optimization in iov_iter_zero rather > than hiding it in tmpfs. Let's see what others say. I think we would all prefer clear_user() to be enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero(). But that careful work won't get done by magic, nor by me. And iov_iter_zero() has to deal with a wider range of possibilities, when pulling in cache lines of ZERO_PAGE(0) will be less advantageous, than in tmpfs doing a large dd - the case I'm aiming not to regress here (tmpfs has been copying ZERO_PAGE(0) like this for years). Hugh
On Fri, Apr 08, 2022 at 11:08:29PM -0700, Hugh Dickins wrote: > > > > Either way I'd rather do this optimization in iov_iter_zero rather > > than hiding it in tmpfs. > > Let's see what others say. I think we would all prefer clear_user() to be > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero(). > But that careful work won't get done by magic, nor by me. I agree with that. > And iov_iter_zero() has to deal with a wider range of possibilities, > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous, > than in tmpfs doing a large dd - the case I'm aiming not to regress here > (tmpfs has been copying ZERO_PAGE(0) like this for years). Maybe. OTOH I'd hate to have iov_iter_zero not used much because it sucks too much. So how can we entice someone with the right knowledge to implement a decent clear_user for x86?
On Fri, 8 Apr 2022 23:08:29 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote: > > > > Either way I'd rather do this optimization in iov_iter_zero rather > > than hiding it in tmpfs. > > Let's see what others say. I think we would all prefer clear_user() to be > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero(). > But that careful work won't get done by magic, nor by me. > > And iov_iter_zero() has to deal with a wider range of possibilities, > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous, > than in tmpfs doing a large dd - the case I'm aiming not to regress here > (tmpfs has been copying ZERO_PAGE(0) like this for years). We do need something to get 5.18 fixed. Christoph, do you think we should proceed with this patch for 5.18?
On Tue, Apr 12, 2022 at 04:22:21PM -0700, Andrew Morton wrote: > On Fri, 8 Apr 2022 23:08:29 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote: > > > > > > > Either way I'd rather do this optimization in iov_iter_zero rather > > > than hiding it in tmpfs. > > > > Let's see what others say. I think we would all prefer clear_user() to be > > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero(). > > But that careful work won't get done by magic, nor by me. > > > > And iov_iter_zero() has to deal with a wider range of possibilities, > > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous, > > than in tmpfs doing a large dd - the case I'm aiming not to regress here > > (tmpfs has been copying ZERO_PAGE(0) like this for years). > > We do need something to get 5.18 fixed. Christoph, do you think we > should proceed with this patch for 5.18? Well, let's queue it up then.
On Tue, Apr 12, 2022 at 06:57:57AM +0200, Christoph Hellwig wrote: > On Fri, Apr 08, 2022 at 11:08:29PM -0700, Hugh Dickins wrote: > > > > > > Either way I'd rather do this optimization in iov_iter_zero rather > > > than hiding it in tmpfs. > > > > Let's see what others say. I think we would all prefer clear_user() to be > > enhanced, and hack around it neither here in tmpfs nor in iov_iter_zero(). > > But that careful work won't get done by magic, nor by me. > > I agree with that. > > > And iov_iter_zero() has to deal with a wider range of possibilities, > > when pulling in cache lines of ZERO_PAGE(0) will be less advantageous, > > than in tmpfs doing a large dd - the case I'm aiming not to regress here > > (tmpfs has been copying ZERO_PAGE(0) like this for years). > > Maybe. OTOH I'd hate to have iov_iter_zero not used much because it > sucks too much. > > So how can we entice someone with the right knowledge to implement a > decent clear_user for x86? Apparently that already happened, but it needs finishing up: https://lore.kernel.org/lkml/Yk9yBcj78mpXOOLL@zx2c4.com/
--- 5.18-rc1/mm/filemap.c +++ linux/mm/filemap.c @@ -1063,12 +1063,6 @@ void __init pagecache_init(void) init_waitqueue_head(&folio_wait_table[i]); page_writeback_init(); - - /* - * tmpfs uses the ZERO_PAGE for reading holes: it is up-to-date, - * and splice's page_cache_pipe_buf_confirm() needs to see that. - */ - SetPageUptodate(ZERO_PAGE(0)); } /* --- 5.18-rc1/mm/shmem.c +++ linux/mm/shmem.c @@ -2513,7 +2513,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) pgoff_t end_index; unsigned long nr, ret; loff_t i_size = i_size_read(inode); - bool got_page; end_index = i_size >> PAGE_SHIFT; if (index > end_index) @@ -2570,24 +2569,34 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) */ if (!offset) mark_page_accessed(page); - got_page = true; + /* + * Ok, we have the page, and it's up-to-date, so + * now we can copy it to user space... + */ + ret = copy_page_to_iter(page, offset, nr, to); + put_page(page); + + } else if (iter_is_iovec(to)) { + /* + * Copy to user tends to be so well optimized, but + * clear_user() not so much, that it is noticeably + * faster to copy the zero page instead of clearing. + */ + ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to); } else { - page = ZERO_PAGE(0); - got_page = false; + /* + * But submitting the same page twice in a row to + * splice() - or others? - can result in confusion: + * so don't attempt that optimization on pipes etc. + */ + ret = iov_iter_zero(nr, to); } - /* - * Ok, we have the page, and it's up-to-date, so - * now we can copy it to user space... - */ - ret = copy_page_to_iter(page, offset, nr, to); retval += ret; offset += ret; index += offset >> PAGE_SHIFT; offset &= ~PAGE_MASK; - if (got_page) - put_page(page); if (!iov_iter_count(to)) break; if (ret < nr) {