Message ID: 20220718002645.28817-1-fmdefrancesco@gmail.com (mailing list archive)
State: New, archived
Series: module: Replace kmap() with kmap_local_page()
On Mon, Jul 18, 2022 at 02:26:45AM +0200, Fabio M. De Francesco wrote:
> kmap() is being deprecated in favor of kmap_local_page().
>
> Two main problems with kmap(): (1) It comes with an overhead as mapping
> space is restricted and protected by a global lock for synchronization and
> (2) it also requires global TLB invalidation when the kmap's pool wraps
> and it might block when the mapping space is fully utilized until a slot
> becomes available.

Neat!

> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).

Yes but the mapping can only be accessed from within this thread and the
thread is bound to the CPU while the mapping is active. So, what if we
get a preemption during decompression here? What happens?

> It is faster than kmap() in kernels with HIGHMEM enabled. Its use in
> module_gzip_decompress() and in module_xz_decompress() is safe and
> should be preferred.
>
> Therefore, replace kmap() with kmap_local_page().

While this churn is going on everywhere I was wondering why not
go ahead and adopt kmap_local_folio() instead?

  Luis

> Tested on a QEMU/KVM x86_32 VM with 4GB RAM, booting kernels with
> HIGHMEM64GB enabled. Modules compressed with XZ or GZIP decompress
> properly.
>
> Suggested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> ---
>  kernel/module/decompress.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/module/decompress.c b/kernel/module/decompress.c
> index 2fc7081dd7c1..4d0bcb3d9e44 100644
> --- a/kernel/module/decompress.c
> +++ b/kernel/module/decompress.c
> @@ -119,10 +119,10 @@ static ssize_t module_gzip_decompress(struct load_info *info,
>  			goto out_inflate_end;
>  		}
>
> -		s.next_out = kmap(page);
> +		s.next_out = kmap_local_page(page);
>  		s.avail_out = PAGE_SIZE;
>  		rc = zlib_inflate(&s, 0);
> -		kunmap(page);
> +		kunmap_local(s.next_out);
>
>  		new_size += PAGE_SIZE - s.avail_out;
>  	} while (rc == Z_OK);
> @@ -178,11 +178,11 @@ static ssize_t module_xz_decompress(struct load_info *info,
>  			goto out;
>  		}
>
> -		xz_buf.out = kmap(page);
> +		xz_buf.out = kmap_local_page(page);
>  		xz_buf.out_pos = 0;
>  		xz_buf.out_size = PAGE_SIZE;
>  		xz_ret = xz_dec_run(xz_dec, &xz_buf);
> -		kunmap(page);
> +		kunmap_local(xz_buf.out);
>
>  		new_size += xz_buf.out_pos;
>  	} while (xz_buf.out_pos == PAGE_SIZE && xz_ret == XZ_OK);
> --
> 2.37.1
On Tuesday, July 19, 2022 at 00:19:50 CEST Luis Chamberlain wrote:
> On Mon, Jul 18, 2022 at 02:26:45AM +0200, Fabio M. De Francesco wrote:
> > kmap() is being deprecated in favor of kmap_local_page().
> >
> > Two main problems with kmap(): (1) It comes with an overhead as mapping
> > space is restricted and protected by a global lock for synchronization and
> > (2) it also requires global TLB invalidation when the kmap's pool wraps
> > and it might block when the mapping space is fully utilized until a slot
> > becomes available.
>
> Neat!

Thanks!

> > With kmap_local_page() the mappings are per thread, CPU local, can take
> > page faults, and can be called from any context (including interrupts).
>
> Yes but the mapping can only be accessed from within this thread and the
> thread is bound to the CPU while the mapping is active. So, what if we
> get a preemption during decompression here? What happens?

No problems at all. Please read commit f3ba3c710ac5 ("mm/highmem: Provide
kmap_local*") from Thomas Gleixner:

"[] Now that the kmap atomic index is stored in task struct provide a
preemptible variant. On context switch the maps of an outgoing task are
removed and the map of the incoming task are restored. That's obviously
slow, but highmem is slow anyway.

The kmap_local.*() functions can be invoked from both preemptible and
atomic context. kmap local sections disable migration to keep the
resulting virtual mapping address correct, but disable neither pagefaults
nor preemption."

However, yours is a good question. It made me realize that this specific
topic is not discussed properly in highmem.rst (which also embeds kdocs
from highmem.h and highmem-internal.h).

Some weeks ago I reworked highmem.rst and, within a couple of days, I'll
also send a second round of additions and clean-ups to the documentation
of kmap_local_page(). I'll use this opportunity to explain more clearly
what happens to the local mappings when a task is preempted and then
restored. It looks like this documentation currently lacks clarity about
the fact that tasks which use pointers returned by kmap_local_page() _can_
be preempted and that these pointers are still valid when the tasks are
rescheduled to run again.

In the meantime, I'll submit a v2 of this patch with a slightly extended
commit message covering this specific topic. Again, thanks for pointing
out that the commit message of this patch should be clearer about what
happens with preemption.

> > It is faster than kmap() in kernels with HIGHMEM enabled. Its use in
> > module_gzip_decompress() and in module_xz_decompress() is safe and
> > should be preferred.
> >
> > Therefore, replace kmap() with kmap_local_page().
>
> While this churn is going on everywhere I was wondering why not
> go ahead and adopt kmap_local_folio() instead?

I'm sorry but, due to my lack of knowledge and experience, I'm not sure I
understand how kmap_local_folio() could help here. My fault. I'm going to
do some research and ask for help from more experienced developers.

> Luis
>
> > Tested on a QEMU/KVM x86_32 VM with 4GB RAM, booting kernels with
> > HIGHMEM64GB enabled. Modules compressed with XZ or GZIP decompress
> > properly.
> >
> > Suggested-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> > ---
> > kernel/module/decompress.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)

[snip]

Thanks,

Fabio
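The semantics Fabio quotes from commit f3ba3c710ac5 can be illustrated with a tiny user-space model. This sketch is an assumption for illustration only, not kernel code: the names (`task_model`, `kmap_local_model`, `kunmap_local_model`) and the slot count are invented, and the real API returns a per-CPU fixmap address rather than the page pointer itself. What it does capture is the per-task stack of live mappings that lets the scheduler tear slots down on switch-out and replay them on switch-in, which is why kmap_local_page() pointers stay valid across preemption, and why kunmap_local() calls must nest in LIFO order:

```c
#include <assert.h>
#include <stddef.h>

#define KM_MAX 16 /* hypothetical slot count, for illustration only */

/* One entry per live mapping, kept per task precisely so a context
 * switch can unmap the outgoing task's slots and replay the incoming
 * task's stack at the same virtual addresses. */
struct task_model {
	void *page[KM_MAX];
	int idx; /* mirrors the kmap index kept in task_struct */
};

/* Model of kmap_local_page(): push a page onto this task's stack. */
static void *kmap_local_model(struct task_model *t, void *pg)
{
	assert(t->idx < KM_MAX);
	t->page[t->idx++] = pg;
	return pg; /* the real API returns a fixmap address instead */
}

/* Model of kunmap_local(): mappings must be released in LIFO order,
 * most recent mapping first. */
static void kunmap_local_model(struct task_model *t, void *addr)
{
	assert(t->idx > 0);
	assert(t->page[t->idx - 1] == addr);
	t->page[--t->idx] = NULL;
}
```

In the decompression loops above, each iteration pushes exactly one mapping and pops it before the next iteration, so the stack depth never exceeds one and the LIFO rule is trivially satisfied.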
On Tue, Jul 19, 2022 at 11:19:24AM +0200, Fabio M. De Francesco wrote:
> On Tuesday, July 19, 2022 at 00:19:50 CEST Luis Chamberlain wrote:
> > > Therefore, replace kmap() with kmap_local_page().
> >
> > While this churn is going on everywhere I was wondering why not
> > go ahead and adopt kmap_local_folio() instead?
>
> I'm sorry but, due to my lack of knowledge and experience, I'm not sure I
> understand how kmap_local_folio() could help here. My fault. I'm going to
> do some research and ask for help from more experienced developers.

I haven't made this suggestion to Fabio before for a few reasons.

First, it makes his work harder. He not only has to understand the
implications of the kmap semantic changes but also the implications of
the folio change.

Then, I'm not sure that I necessarily have enough infrastructure in place
for doing a folio conversion everywhere that he's doing a kmap/kmap_atomic
to kmap_local_page conversion.

What makes it particularly tricky is that you can only kmap a single
page out of a folio at a time; there's no ability to kmap the entire
folio, no matter how large it is. I've looked at doing the conversion
for ext2 directories, and it's _tricky_. There's only one 'checked'
flag for the entire folio, but ext2_check_page() needs to take a mapped
page. So now we have to make a judgement call about whether to support
caching ext2 directories with large folios or whether to restrict them
to single-page folios.

So yes, there's probably a second patch coming for maintainers to look
at that will convert the kmap_local_page() to kmap_local_folio().
However, I think it's actually less of a burden for maintainers if
these two different conversions happen separately because there are very
different considerations to review. Also, there's no equivalent to kmap()
or kmap_atomic() for folios (deliberately), so the more conversions to
kmap_local_page() Fabio gets done, the easier it will be for a later
folio conversion.
On Tue, Jul 19, 2022 at 09:23:49PM +0100, Matthew Wilcox wrote:
> On Tue, Jul 19, 2022 at 11:19:24AM +0200, Fabio M. De Francesco wrote:
> > On Tuesday, July 19, 2022 at 00:19:50 CEST Luis Chamberlain wrote:
> > > > Therefore, replace kmap() with kmap_local_page().
> > >
> > > While this churn is going on everywhere I was wondering why not
> > > go ahead and adopt kmap_local_folio() instead?
> >
> > I'm sorry but, due to my lack of knowledge and experience, I'm not sure I
> > understand how kmap_local_folio() could help here. My fault. I'm going to
> > do some research and ask for help from more experienced developers.
>
> I haven't made this suggestion to Fabio before for a few reasons.
>
> First, it makes his work harder. He not only has to understand the
> implications of the kmap semantic changes but also the implications of
> the folio change.
>
> Then, I'm not sure that I necessarily have enough infrastructure in place
> for doing a folio conversion everywhere that he's doing a kmap/kmap_atomic
> to kmap_local_page conversion.
>
> What makes it particularly tricky is that you can only kmap a single
> page out of a folio at a time; there's no ability to kmap the entire
> folio, no matter how large it is. I've looked at doing the conversion
> for ext2 directories, and it's _tricky_. There's only one 'checked'
> flag for the entire folio, but ext2_check_page() needs to take a mapped
> page. So now we have to make a judgement call about whether to support
> caching ext2 directories with large folios or whether to restrict them
> to single-page folios.
>
> So yes, there's probably a second patch coming for maintainers to look
> at that will convert the kmap_local_page() to kmap_local_folio().
> However, I think it's actually less of a burden for maintainers if
> these two different conversions happen separately because there are very
> different considerations to review. Also, there's no equivalent to kmap()
> or kmap_atomic() for folios (deliberately), so the more conversions to
> kmap_local_page() Fabio gets done, the easier it will be for a later
> folio conversion.

Makes sense, thanks for the feedback. I'll wrestle with ensuring the
first step to kmap_local_page() doesn't break things where I see them
suggested first.

  Luis
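For context on the point Matthew raises, a hedged sketch of what such a later folio conversion of the gzip loop might look like. This is not a tested patch: the `folio` and `offset` locals are hypothetical (the current code iterates over individual pages), and the key constraint is that kmap_local_folio() maps exactly one page of the folio at the given byte offset, so a per-page loop remains even with large folios:

```c
/* Hypothetical follow-up conversion, for illustration only. */
do {
	/* ... allocation and error handling as in the current code ... */

	/* Map one page of the folio at 'offset'; large folios still
	 * require advancing page by page. */
	s.next_out = kmap_local_folio(folio, offset);
	s.avail_out = PAGE_SIZE;
	rc = zlib_inflate(&s, 0);
	kunmap_local(s.next_out);

	new_size += PAGE_SIZE - s.avail_out;
	offset += PAGE_SIZE; /* next page within the folio */
} while (rc == Z_OK);
```

This keeps the map/unmap pattern of the kmap_local_page() version unchanged, which is Matthew's point: the kmap_local_page() conversion does most of the semantic work, and the folio step is then mechanical.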
diff --git a/kernel/module/decompress.c b/kernel/module/decompress.c
index 2fc7081dd7c1..4d0bcb3d9e44 100644
--- a/kernel/module/decompress.c
+++ b/kernel/module/decompress.c
@@ -119,10 +119,10 @@ static ssize_t module_gzip_decompress(struct load_info *info,
 			goto out_inflate_end;
 		}
 
-		s.next_out = kmap(page);
+		s.next_out = kmap_local_page(page);
 		s.avail_out = PAGE_SIZE;
 		rc = zlib_inflate(&s, 0);
-		kunmap(page);
+		kunmap_local(s.next_out);
 
 		new_size += PAGE_SIZE - s.avail_out;
 	} while (rc == Z_OK);
@@ -178,11 +178,11 @@ static ssize_t module_xz_decompress(struct load_info *info,
 			goto out;
 		}
 
-		xz_buf.out = kmap(page);
+		xz_buf.out = kmap_local_page(page);
 		xz_buf.out_pos = 0;
 		xz_buf.out_size = PAGE_SIZE;
 		xz_ret = xz_dec_run(xz_dec, &xz_buf);
-		kunmap(page);
+		kunmap_local(xz_buf.out);
 
 		new_size += xz_buf.out_pos;
 	} while (xz_buf.out_pos == PAGE_SIZE && xz_ret == XZ_OK);
kmap() is being deprecated in favor of kmap_local_page().

Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) it also requires global TLB invalidation when the kmap's pool wraps
and it might block when the mapping space is fully utilized until a slot
becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Its use in
module_gzip_decompress() and in module_xz_decompress() is safe and
should be preferred.

Therefore, replace kmap() with kmap_local_page().

Tested on a QEMU/KVM x86_32 VM with 4GB RAM, booting kernels with
HIGHMEM64GB enabled. Modules compressed with XZ or GZIP decompress
properly.

Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
 kernel/module/decompress.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)