Message ID | 20200327021058.221911-8-walken@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add a new mmap locking API wrapping mmap_sem calls |

On Thu, 26 Mar 2020, Michel Lespinasse wrote:

>Add a couple APIs to allow splitting mmap_read_unlock() into two calls:
>- mmap_read_release(), called by the task that had taken the mmap lock;
>- mmap_read_unlock_non_owner(), called from a work queue.
>
>These apis are used by kernel/bpf/stackmap.c only.

I'm not crazy about the idea generalizing such calls into an mm api.
We try to stay away from non-owner semantics in locking - granted
the IS_ENABLED(CONFIG_PREEMPT_RT) warning, but still.

Could this give future users the wrong impression? What about just
using rwsem calls directly in bpf?

Thanks,
Davidlohr

On Thu, Mar 26, 2020 at 9:48 PM Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> On Thu, 26 Mar 2020, Michel Lespinasse wrote:
>
> >Add a couple APIs to allow splitting mmap_read_unlock() into two calls:
> >- mmap_read_release(), called by the task that had taken the mmap lock;
> >- mmap_read_unlock_non_owner(), called from a work queue.
> >
> >These apis are used by kernel/bpf/stackmap.c only.
>
> I'm not crazy about the idea generalizing such calls into an mm api.
> We try to stay away from non-owner semantics in locking - granted
> the IS_ENABLED(CONFIG_PREEMPT_RT) warning, but still.
>
> Could this give future users the wrong impression? What about just
> using rwsem calls directly in bpf?

I see what you mean and I certainly don't want to encourage any new
non-owner call sites to appear....

This bpf stackmap site is a small pain point in my larger range
locking patchset too. I am not sure what is the proper response to it;
the opposite side of your argument could be that using a direct rwsem
call there hides the issue and makes it less likely for someone to fix
it ?

I don't have a very strong opinion on this, as I think it can be
argued either way... But at a minimum, I think it'd be worth adding a
comment asking people not to add new call sites to the
mmap_read_release() and mmap_read_unlock_non_owner() APIs ?
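
For illustration, the comment suggested above might look roughly like the sketch below. The comment wording is hypothetical and does not come from the patch or from any later revision. The struct definitions and the two stubbed rwsem functions at the top are userspace stand-ins (assumptions) so that the fragment compiles outside a kernel tree; only the two helper bodies match the posted diff.

```c
#include <assert.h>

/* Userspace stand-ins (assumptions) so this compiles outside the kernel. */
struct lockdep_map { int held; };
struct rw_semaphore { int readers; struct lockdep_map dep_map; };
struct mm_struct { struct rw_semaphore mmap_sem; };

/* Stub: models lockdep forgetting that the caller holds the lock. */
static void rwsem_release(struct lockdep_map *map, unsigned long ip)
{
	(void)ip;
	map->held--;
}

/* Stub: models dropping the read lock itself. */
static void up_read_non_owner(struct rw_semaphore *sem)
{
	sem->readers--;
}

/*
 * Hypothetical comment wording, not taken from the patch:
 *
 * mmap_read_release() only tells lockdep that the current task no longer
 * owns the read lock. The lock itself stays held until a later call to
 * mmap_read_unlock_non_owner().
 *
 * This pair exists solely for kernel/bpf/stackmap.c.
 * Please do not add new call sites.
 */
static inline void mmap_read_release(struct mm_struct *mm, unsigned long ip)
{
	rwsem_release(&mm->mmap_sem.dep_map, ip);
}

/* Drops the read lock from a context that did not take it. See above. */
static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
{
	up_read_non_owner(&mm->mmap_sem);
}
```

Keeping the warning on the helpers themselves, rather than at the single call site, is one way to make it visible to anyone who finds the API through the header.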

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 40a972a26857..00d6cc02581d 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -62,6 +62,16 @@ static inline void mmap_read_unlock(struct mm_struct *mm)
 	up_read(&mm->mmap_sem);
 }
 
+static inline void mmap_read_release(struct mm_struct *mm, unsigned long ip)
+{
+	rwsem_release(&mm->mmap_sem.dep_map, ip);
+}
+
+static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
+{
+	up_read_non_owner(&mm->mmap_sem);
+}
+
 static inline bool mmap_is_locked(struct mm_struct *mm)
 {
 	return rwsem_is_locked(&mm->mmap_sem) != 0;
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index f2115f691577..413b512a99eb 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -33,7 +33,7 @@ struct bpf_stack_map {
 /* irq_work to run up_read() for build_id lookup in nmi context */
 struct stack_map_irq_work {
 	struct irq_work irq_work;
-	struct rw_semaphore *sem;
+	struct mm_struct *mm;
 };
 
 static void do_up_read(struct irq_work *entry)
@@ -41,8 +41,7 @@ static void do_up_read(struct irq_work *entry)
 	struct stack_map_irq_work *work;
 
 	work = container_of(entry, struct stack_map_irq_work, irq_work);
-	up_read_non_owner(work->sem);
-	work->sem = NULL;
+	mmap_read_unlock_non_owner(work->mm);
 }
 
 static DEFINE_PER_CPU(struct stack_map_irq_work, up_read_work);
@@ -332,14 +331,14 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
 	if (!work) {
 		mmap_read_unlock(current->mm);
 	} else {
-		work->sem = &current->mm->mmap_sem;
+		work->mm = current->mm;
 		irq_work_queue(&work->irq_work);
 		/*
 		 * The irq_work will release the mmap_sem with
 		 * up_read_non_owner(). The rwsem_release() is called
 		 * here to release the lock from lockdep's perspective.
 		 */
-		rwsem_release(&current->mm->mmap_sem.dep_map, _RET_IP_);
+		mmap_read_release(current->mm, _RET_IP_);
 	}
 }

Add a couple APIs to allow splitting mmap_read_unlock() into two calls:
- mmap_read_release(), called by the task that had taken the mmap lock;
- mmap_read_unlock_non_owner(), called from a work queue.

These apis are used by kernel/bpf/stackmap.c only.

Signed-off-by: Michel Lespinasse <walken@google.com>
---
 include/linux/mmap_lock.h | 10 ++++++++++
 kernel/bpf/stackmap.c     |  9 ++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)
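
The two-step unlock described in the commit message can be modeled in plain C to make the ordering easier to follow. This is a toy, single-threaded sketch, not kernel code: every name in it (`model_lock`, `deferred_work`, `owner_hand_off`, `run_deferred`, and so on) is hypothetical, the reader count stands in for the rwsem, and the `tracked_owners` counter stands in for lockdep's view of who holds it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model; all names are hypothetical and not part of the kernel. */
struct model_lock {
	int readers;        /* actual lock state */
	int tracked_owners; /* what the lock debugger believes */
};

struct deferred_work {
	void (*fn)(struct deferred_work *work);
	struct model_lock *lock;
};

/* Taking the lock updates both the lock state and the ownership tracking. */
static inline void model_read_lock(struct model_lock *l)
{
	l->readers++;
	l->tracked_owners++;
}

/* Step 1, run by the task that took the lock: drop ownership tracking only. */
static inline void model_read_release(struct model_lock *l)
{
	l->tracked_owners--;
}

/* Step 2, run later from another context: drop the lock itself. */
static inline void model_read_unlock_non_owner(struct model_lock *l)
{
	l->readers--;
}

/* Callback executed by the deferred work. */
static inline void do_deferred_unlock(struct deferred_work *work)
{
	model_read_unlock_non_owner(work->lock);
}

/* Owner side: set up the deferred unlock, then release tracking. */
static inline void owner_hand_off(struct model_lock *l, struct deferred_work *work)
{
	work->lock = l;
	work->fn = do_deferred_unlock;
	model_read_release(l);
}

/* Stand-in for the deferred work running at some later point. */
static inline void run_deferred(struct deferred_work *work)
{
	work->fn(work);
}
```

After `owner_hand_off()` the model still reports one reader but no tracked owner, which is the intermediate state the patch exposes through `mmap_read_release()`. The reader count only reaches zero once `run_deferred()` has executed.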