Message ID | 20230109205336.3665937-9-surenb@google.com (mailing list archive)
---|---
State | New
Series | Per-VMA locks
On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:

>This configuration variable will be used to build the support for VMA
>locking during page fault handling.
>
>This is enabled by default on supported architectures with SMP and MMU
>set.
>
>The architecture support is needed since the page fault handler is called
>from the architecture's page faulting code which needs modifications to
>handle faults under VMA lock.

I don't think that per-vma locking should be something that is
user-configurable. It should just be dependent on the arch. So maybe
just remove CONFIG_PER_VMA_LOCK?

Thanks,
Davidlohr
On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
>
> >This configuration variable will be used to build the support for VMA
> >locking during page fault handling.
> >
> >This is enabled by default on supported architectures with SMP and MMU
> >set.
> >
> >The architecture support is needed since the page fault handler is called
> >from the architecture's page faulting code which needs modifications to
> >handle faults under VMA lock.
>
> I don't think that per-vma locking should be something that is
> user-configurable. It should just be dependent on the arch. So maybe
> just remove CONFIG_PER_VMA_LOCK?

Thanks for the suggestion! I would be happy to make that change if
there are no objections. I think the only pushback might have been the
vma size increase, but with the latest optimization in the last patch
maybe that's less of an issue?

> Thanks,
> Davidlohr
On Tue 10-01-23 16:44:42, Suren Baghdasaryan wrote:
> On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
> >
> > On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
> >
> > >This configuration variable will be used to build the support for VMA
> > >locking during page fault handling.
> > >
> > >This is enabled by default on supported architectures with SMP and MMU
> > >set.
> > >
> > >The architecture support is needed since the page fault handler is called
> > >from the architecture's page faulting code which needs modifications to
> > >handle faults under VMA lock.
> >
> > I don't think that per-vma locking should be something that is
> > user-configurable. It should just be dependent on the arch. So maybe
> > just remove CONFIG_PER_VMA_LOCK?
>
> Thanks for the suggestion! I would be happy to make that change if
> there are no objections. I think the only pushback might have been the
> vma size increase, but with the latest optimization in the last patch
> maybe that's less of an issue?

Has vma size ever been a real problem? Sure, there might be a lot of
those, but your patch increases it by an rwsem (without the last patch),
which is something like 40B on top of a 136B vma, so we are talking
about 176B in total, which even with wild mapcount limits shouldn't
really be prohibitive. With a default map count limit we are talking
about a 2M increase at most (per address space).

Or are you aware of any specific usecases where vma size is a real
problem?
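For reference, the "2M at most" figure works out from the default map
count limit; the default value of vm.max_map_count (65530) is context
added here, not spelled out in the thread:

	65530 VMAs x 40 B per rwsem ≈ 2.6 MB per address space (upper bound)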
* Michal Hocko <mhocko@suse.com> wrote:

> On Tue 10-01-23 16:44:42, Suren Baghdasaryan wrote:
> > On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
> > >
> > > On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
> > >
> > > >This configuration variable will be used to build the support for VMA
> > > >locking during page fault handling.
> > > >
> > > >This is enabled by default on supported architectures with SMP and MMU
> > > >set.
> > > >
> > > >The architecture support is needed since the page fault handler is called
> > > >from the architecture's page faulting code which needs modifications to
> > > >handle faults under VMA lock.
> > >
> > > I don't think that per-vma locking should be something that is
> > > user-configurable. It should just be dependent on the arch. So maybe
> > > just remove CONFIG_PER_VMA_LOCK?
> >
> > Thanks for the suggestion! I would be happy to make that change if
> > there are no objections. I think the only pushback might have been the
> > vma size increase, but with the latest optimization in the last patch
> > maybe that's less of an issue?
>
> Has vma size ever been a real problem? Sure, there might be a lot of
> those, but your patch increases it by an rwsem (without the last patch),
> which is something like 40B on top of a 136B vma, so we are talking
> about 176B in total, which even with wild mapcount limits shouldn't
> really be prohibitive. With a default map count limit we are talking
> about a 2M increase at most (per address space).
>
> Or are you aware of any specific usecases where vma size is a real
> problem?

40 bytes for the rwsem, plus the patch also adds a 32-bit sequence counter:

  +	int vm_lock_seq;
  +	struct rw_semaphore lock;

So it's +44 bytes.

Thanks,

	Ingo
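To see why both fields are needed, here is roughly how the series pairs
them during page faults. This is a simplified sketch following the posted
patches (vma_read_trylock, vm_lock_seq, mm_lock_seq); the exact kernel
code differs in details:

static inline bool vma_read_trylock(struct vm_area_struct *vma)
{
	/* Check before locking: if the VMA was write-locked in the
	 * current mm_lock_seq window, fall back to mmap_lock. */
	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
		return false;

	if (!down_read_trylock(&vma->lock))
		return false;

	/* Recheck under the lock: a writer may have locked this VMA
	 * between the check above and the trylock. */
	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) {
		up_read(&vma->lock);
		return false;
	}
	return true;
}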
From: Ingo Molnar
> Sent: 11 January 2023 09:54
>
> * Michal Hocko <mhocko@suse.com> wrote:
>
> > On Tue 10-01-23 16:44:42, Suren Baghdasaryan wrote:
> > > On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
> > > >
> > > > On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
> > > >
> > > > >This configuration variable will be used to build the support for VMA
> > > > >locking during page fault handling.
> > > > >
> > > > >This is enabled by default on supported architectures with SMP and MMU
> > > > >set.
> > > > >
> > > > >The architecture support is needed since the page fault handler is called
> > > > >from the architecture's page faulting code which needs modifications to
> > > > >handle faults under VMA lock.
> > > >
> > > > I don't think that per-vma locking should be something that is
> > > > user-configurable. It should just be dependent on the arch. So maybe
> > > > just remove CONFIG_PER_VMA_LOCK?
> > >
> > > Thanks for the suggestion! I would be happy to make that change if
> > > there are no objections. I think the only pushback might have been the
> > > vma size increase, but with the latest optimization in the last patch
> > > maybe that's less of an issue?
> >
> > Has vma size ever been a real problem? Sure, there might be a lot of
> > those, but your patch increases it by an rwsem (without the last patch),
> > which is something like 40B on top of a 136B vma, so we are talking
> > about 176B in total, which even with wild mapcount limits shouldn't
> > really be prohibitive. With a default map count limit we are talking
> > about a 2M increase at most (per address space).
> >
> > Or are you aware of any specific usecases where vma size is a real
> > problem?
>
> 40 bytes for the rwsem, plus the patch also adds a 32-bit sequence counter:
>
>   +	int vm_lock_seq;
>   +	struct rw_semaphore lock;
>
> So it's +44 bytes.

Depending on whether vm_lock_seq goes into a padding hole or not, it
will be 40 or 48 bytes.

But if these structures are allocated individually (not as an array),
then it also depends on how many items kmalloc() fits into a page (or
2, or 4).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
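David's 40-vs-48 point can be reproduced with a small userspace mock; the
field sizes below are assumptions for illustration only (the real layout
comes from running pahole on vm_area_struct):

#include <stdio.h>

struct rwsem_mock { long a[5]; };	/* ~40B, 8-byte aligned, like rw_semaphore */

/* Existing fields end on an 8-byte boundary: the int forces 4B of
 * padding before the rwsem, so the struct grows by 48 bytes. */
struct vma_no_hole {
	long existing[17];		/* 136B of pre-existing fields */
	int vm_lock_seq;
	struct rwsem_mock lock;
};					/* sizeof == 184 */

/* Existing fields leave a 4B trailing hole: vm_lock_seq fills it,
 * so the struct grows by only 40 bytes. */
struct vma_with_hole {
	long existing[16];
	int some_int;			/* 132B used, 4B hole before the additions */
	int vm_lock_seq;
	struct rwsem_mock lock;
};					/* sizeof == 176 */

int main(void)
{
	printf("no hole: %zu, with hole: %zu\n",
	       sizeof(struct vma_no_hole), sizeof(struct vma_with_hole));
	return 0;
}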
On Wed, Jan 11, 2023 at 2:03 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Ingo Molnar
> > Sent: 11 January 2023 09:54
> >
> > * Michal Hocko <mhocko@suse.com> wrote:
> >
> > > On Tue 10-01-23 16:44:42, Suren Baghdasaryan wrote:
> > > > On Tue, Jan 10, 2023 at 4:39 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
> > > > >
> > > > > On Mon, 09 Jan 2023, Suren Baghdasaryan wrote:
> > > > >
> > > > > >This configuration variable will be used to build the support for VMA
> > > > > >locking during page fault handling.
> > > > > >
> > > > > >This is enabled by default on supported architectures with SMP and MMU
> > > > > >set.
> > > > > >
> > > > > >The architecture support is needed since the page fault handler is called
> > > > > >from the architecture's page faulting code which needs modifications to
> > > > > >handle faults under VMA lock.
> > > > >
> > > > > I don't think that per-vma locking should be something that is
> > > > > user-configurable. It should just be dependent on the arch. So maybe
> > > > > just remove CONFIG_PER_VMA_LOCK?
> > > >
> > > > Thanks for the suggestion! I would be happy to make that change if
> > > > there are no objections. I think the only pushback might have been the
> > > > vma size increase, but with the latest optimization in the last patch
> > > > maybe that's less of an issue?
> > >
> > > Has vma size ever been a real problem? Sure, there might be a lot of
> > > those, but your patch increases it by an rwsem (without the last patch),
> > > which is something like 40B on top of a 136B vma, so we are talking
> > > about 176B in total, which even with wild mapcount limits shouldn't
> > > really be prohibitive. With a default map count limit we are talking
> > > about a 2M increase at most (per address space).
> > >
> > > Or are you aware of any specific usecases where vma size is a real
> > > problem?

Well, when fixing the cacheline bouncing problem in the initial design I
was adding 44 bytes to the 152-byte vm_area_struct (CONFIG_NUMA enabled),
pushing it just above 192 bytes, while allocating these structures from a
cacheline-aligned slab (keeping the lock in a separate cacheline to
prevent cacheline bouncing). That would use the whole 256 bytes per VMA,
and it did make me nervous. The current design, with no need to
cacheline-align vm_area_structs and with the 44-byte overhead trimmed
down to 16 bytes, seems much more palatable.

> >
> > 40 bytes for the rwsem, plus the patch also adds a 32-bit sequence counter:
> >
> >   +	int vm_lock_seq;
> >   +	struct rw_semaphore lock;
> >
> > So it's +44 bytes.

Correct.

>
> Depending on whether vm_lock_seq goes into a padding hole or not, it
> will be 40 or 48 bytes.
>
> But if these structures are allocated individually (not as an array),
> then it also depends on how many items kmalloc() fits into a page (or
> 2, or 4).

Yep. Depends on how we arrange the fields.

Anyhow. Sounds like the overhead of the current design is small enough
to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
support?

Thanks,
Suren.

> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
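For completeness, the write side that keeps this cheap looks roughly like
the sketch below (again simplified from the posted patches; writers
already hold mmap_lock for writing, and all write-locked VMAs are
released in bulk by incrementing mm->mm_lock_seq when mmap_lock is
dropped):

static inline void vma_write_lock(struct vm_area_struct *vma)
{
	int mm_lock_seq;

	mmap_assert_write_locked(vma->vm_mm);
	mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);

	/* Already write-locked in this mmap_lock critical section. */
	if (vma->vm_lock_seq == mm_lock_seq)
		return;

	down_write(&vma->lock);
	vma->vm_lock_seq = mm_lock_seq;
	up_write(&vma->lock);
}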
On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
[...]
> Anyhow. Sounds like the overhead of the current design is small enough
> to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> support?

Yes. Further optimizations can be done on top. Let's not over-optimize
at this stage.
On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
> [...]
> > Anyhow. Sounds like the overhead of the current design is small enough
> > to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> > support?
>
> Yes. Further optimizations can be done on top. Let's not over-optimize
> at this stage.

Sure, I won't optimize any further.
Just to expand on your question: the original design would have been
problematic for embedded systems like Android. It notoriously has a high
number of VMAs, due to anonymous VMAs being named, which prevents them
from merging. A 2M per-process increase would raise questions, so I felt
the need to optimize the memory overhead, which is done in the last
patch.
Thanks for the feedback!

> --
> Michal Hocko
> SUSE Labs
On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote:
> On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
> > [...]
> > > Anyhow. Sounds like the overhead of the current design is small enough
> > > to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> > > support?
> >
> > Yes. Further optimizations can be done on top. Let's not over-optimize
> > at this stage.
>
> Sure, I won't optimize any further.
> Just to expand on your question: the original design would have been
> problematic for embedded systems like Android. It notoriously has a high
> number of VMAs, due to anonymous VMAs being named, which prevents them
> from merging.

What is the usual number of VMAs in that environment?
On Wed, Jan 11, 2023 at 9:37 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote:
> > On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
> > > [...]
> > > > Anyhow. Sounds like the overhead of the current design is small enough
> > > > to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> > > > support?
> > >
> > > Yes. Further optimizations can be done on top. Let's not over-optimize
> > > at this stage.
> >
> > Sure, I won't optimize any further.
> > Just to expand on your question: the original design would have been
> > problematic for embedded systems like Android. It notoriously has a high
> > number of VMAs, due to anonymous VMAs being named, which prevents them
> > from merging.
>
> What is the usual number of VMAs in that environment?

I've seen some games which had over 4000 VMAs, but that's on the upper
side. In my calculations I used 40000 VMAs as a ballpark number, and
rough calculations showed that, before the size optimization, memory
consumption would increase by ~2M (depending on the lock placement in
vm_area_struct it would vary a bit). In Android, the performance team
flags any change that exceeds 500KB, so it would raise questions.

> --
> Michal Hocko
> SUSE Labs
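Worked out, that ballpark uses the pre-optimization per-VMA overhead
established earlier in the thread:

	40000 VMAs x 44 B ≈ 1.76 MB, i.e. the "~2M" figure — well above
	Android's 500KB review threshold.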
On Wed 11-01-23 09:49:08, Suren Baghdasaryan wrote:
> On Wed, Jan 11, 2023 at 9:37 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote:
> > > On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko <mhocko@suse.com> wrote:
> > > >
> > > > On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
> > > > [...]
> > > > > Anyhow. Sounds like the overhead of the current design is small enough
> > > > > to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> > > > > support?
> > > >
> > > > Yes. Further optimizations can be done on top. Let's not over-optimize
> > > > at this stage.
> > >
> > > Sure, I won't optimize any further.
> > > Just to expand on your question: the original design would have been
> > > problematic for embedded systems like Android. It notoriously has a high
> > > number of VMAs, due to anonymous VMAs being named, which prevents them
> > > from merging.
> >
> > What is the usual number of VMAs in that environment?
>
> I've seen some games which had over 4000 VMAs, but that's on the upper
> side. In my calculations I used 40000 VMAs as a ballpark number, and
> rough calculations showed that, before the size optimization, memory
> consumption would increase by ~2M (depending on the lock placement in
> vm_area_struct it would vary a bit). In Android, the performance team
> flags any change that exceeds 500KB, so it would raise questions.

Thanks, that is useful information! This is just slightly off-topic, but
I am wondering how much memory those vma names consume. Are there that
many unique names, or do they just happen to alternate so that
neighboring ones tend to be different?
On Wed, Jan 11, 2023 at 10:03 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 11-01-23 09:49:08, Suren Baghdasaryan wrote:
> > On Wed, Jan 11, 2023 at 9:37 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote:
> > > > On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko <mhocko@suse.com> wrote:
> > > > >
> > > > > On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote:
> > > > > [...]
> > > > > > Anyhow. Sounds like the overhead of the current design is small enough
> > > > > > to remove CONFIG_PER_VMA_LOCK and let it depend only on architecture
> > > > > > support?
> > > > >
> > > > > Yes. Further optimizations can be done on top. Let's not over-optimize
> > > > > at this stage.
> > > >
> > > > Sure, I won't optimize any further.
> > > > Just to expand on your question: the original design would have been
> > > > problematic for embedded systems like Android. It notoriously has a high
> > > > number of VMAs, due to anonymous VMAs being named, which prevents them
> > > > from merging.
> > >
> > > What is the usual number of VMAs in that environment?
> >
> > I've seen some games which had over 4000 VMAs, but that's on the upper
> > side. In my calculations I used 40000 VMAs as a ballpark number, and
> > rough calculations showed that, before the size optimization, memory
> > consumption would increase by ~2M (depending on the lock placement in
> > vm_area_struct it would vary a bit). In Android, the performance team
> > flags any change that exceeds 500KB, so it would raise questions.
>
> Thanks, that is useful information! This is just slightly off-topic, but
> I am wondering how much memory those vma names consume. Are there that
> many unique names, or do they just happen to alternate so that
> neighboring ones tend to be different?

Good question. I don't have a ready answer to that, but I will try to
collect some stats. I know that many names are standardized, but I
haven't looked at how they are distributed in the address space. Will
follow up once I collect the data.
Thanks,
Suren.

> --
> Michal Hocko
> SUSE Labs
diff --git a/mm/Kconfig b/mm/Kconfig
index ff7b209dec05..0aeca3794972 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1183,6 +1183,19 @@ config LRU_GEN_STATS
 	  This option has a per-memcg and per-node memory overhead.
 # }
 
+config ARCH_SUPPORTS_PER_VMA_LOCK
+	def_bool n
+
+config PER_VMA_LOCK
+	bool "Per-vma locking support"
+	default y
+	depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+	help
+	  Allow per-vma locking during page fault handling.
+
+	  This feature allows locking each virtual memory area separately when
+	  handling page faults instead of taking mmap_lock.
+
 source "mm/damon/Kconfig"
 
 endmenu
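Architectures opt in through ARCH_SUPPORTS_PER_VMA_LOCK in later patches
of the series; the shape of that change is roughly the following
(illustrative only, using x86 as an example — the exact placement and
condition vary per architecture):

config X86_64
	...
	select ARCH_SUPPORTS_PER_VMA_LOCK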
This configuration variable will be used to build the support for VMA
locking during page fault handling.

This is enabled by default on supported architectures with SMP and MMU
set.

The architecture support is needed since the page fault handler is called
from the architecture's page faulting code which needs modifications to
handle faults under VMA lock.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/Kconfig | 13 +++++++++++++
 1 file changed, 13 insertions(+)