Message ID: 20211026173822.502506-2-pasha.tatashin@soleen.com (mailing list archive)
State: New
Series: Hardening page _refcount
On Tue, Oct 26, 2021 at 05:38:15PM +0000, Pasha Tatashin wrote:
>  static inline void page_ref_add(struct page *page, int nr)
>  {
> -	atomic_add(nr, &page->_refcount);
> +	int ret;
> +
> +	VM_BUG_ON(nr <= 0);
> +	ret = atomic_add_return(nr, &page->_refcount);
> +	VM_BUG_ON_PAGE(ret <= 0, page);

This isn't right.  _refcount is allowed to overflow into the negatives.
See page_ref_zero_or_close_to_overflow() and the conversations that led
to it being added.
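For reference, here is a minimal userspace sketch (illustrative only, not
the kernel source) of the unsigned-wraparound trick behind that macro:
casting a negative count to unsigned int and adding 127u maps the range
[-127, 0] onto [0, 127], so a single comparison catches both a zero
refcount and one that has just wrapped past INT_MAX.

#include <assert.h>
#include <limits.h>

/* Mirrors the comparison in page_ref_zero_or_close_to_overflow(). */
static int zero_or_close_to_overflow(int refcount)
{
	return (unsigned int)refcount + 127u <= 127u;
}

int main(void)
{
	assert(zero_or_close_to_overflow(0));        /* count hit zero */
	assert(zero_or_close_to_overflow(-1));       /* wrapped past INT_MAX */
	assert(zero_or_close_to_overflow(-127));     /* edge of the window */
	assert(!zero_or_close_to_overflow(1));       /* ordinary reference */
	assert(!zero_or_close_to_overflow(INT_MAX)); /* about to wrap, not yet */
	return 0;
}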
On Tue, Oct 26, 2021 at 3:50 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 26, 2021 at 05:38:15PM +0000, Pasha Tatashin wrote:
> >  static inline void page_ref_add(struct page *page, int nr)
> >  {
> > -	atomic_add(nr, &page->_refcount);
> > +	int ret;
> > +
> > +	VM_BUG_ON(nr <= 0);
> > +	ret = atomic_add_return(nr, &page->_refcount);
> > +	VM_BUG_ON_PAGE(ret <= 0, page);
>
> This isn't right.  _refcount is allowed to overflow into the negatives.
> See page_ref_zero_or_close_to_overflow() and the conversations that led
> to it being added.

#define page_ref_zero_or_close_to_overflow(page) \
	((unsigned int) page_ref_count(page) + 127u <= 127u)

Uh, right, I saw the macro but did not realize there was an
(unsigned int) cast.

OK, I think we can move this macro into include/linux/page_ref.h
and modify it to something like this:

#define page_ref_zero_or_close_to_overflow(page) \
	((unsigned int) page_ref_count(page) + v + 127u <= v + 127u)

The sub/dec variants can also be fixed to ensure that we do not
underflow while still working with the fact that we use all 32 bits
of _refcount.

Pasha
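A sketch of what the proposed v-parameterized check would compute, under
the assumption that v stands for the pending increment (the macro as
quoted does not say where v comes from): widening both sides by v grows
the trap window to [-(v + 127), 0], and with v == 0 it reduces to the
existing macro.

/*
 * Hypothetical generalization of the check; 'v' is assumed to be the
 * number of references about to be added.
 */
static int zero_or_close_to_overflow_v(int refcount, unsigned int v)
{
	return (unsigned int)refcount + v + 127u <= v + 127u;
}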
On Tue, Oct 26, 2021 at 5:34 PM Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
>
> On Tue, Oct 26, 2021 at 3:50 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Oct 26, 2021 at 05:38:15PM +0000, Pasha Tatashin wrote:
> > >  static inline void page_ref_add(struct page *page, int nr)
> > >  {
> > > -	atomic_add(nr, &page->_refcount);
> > > +	int ret;
> > > +
> > > +	VM_BUG_ON(nr <= 0);
> > > +	ret = atomic_add_return(nr, &page->_refcount);
> > > +	VM_BUG_ON_PAGE(ret <= 0, page);
> >
> > This isn't right.  _refcount is allowed to overflow into the negatives.
> > See page_ref_zero_or_close_to_overflow() and the conversations that led
> > to it being added.
>
> #define page_ref_zero_or_close_to_overflow(page) \
> 	((unsigned int) page_ref_count(page) + 127u <= 127u)
>
> Uh, right, I saw the macro but did not realize there was an
> (unsigned int) cast.
>
> OK, I think we can move this macro into include/linux/page_ref.h
> and modify it to something like this:
>
> #define page_ref_zero_or_close_to_overflow(page) \
> 	((unsigned int) page_ref_count(page) + v + 127u <= v + 127u)
>
> The sub/dec variants can also be fixed to ensure that we do not
> underflow while still working with the fact that we use all 32 bits
> of _refcount.

I think we can do that by using atomic_fetch_*() and checking for
overflow/underflow after the operation. I will send the updated
series soon.

Pasha
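A kernel-style sketch of the atomic_fetch_*() idea, assuming a shape
like the patch's page_ref_add() (this is not the posted code): the
fetch variants return the value before the modification, so both the
old and the new count are available for range checks.

static inline void page_ref_add(struct page *page, int nr)
{
	int old;

	VM_BUG_ON(nr <= 0);
	old = atomic_fetch_add(nr, &page->_refcount);
	/*
	 * old < 0: the count was already bad before this call.
	 * old + nr < 0: this addition wrapped past INT_MAX. The wrap
	 * test is well defined in the kernel, which is built with
	 * -fno-strict-overflow.
	 */
	VM_BUG_ON_PAGE(old < 0 || old + nr < 0, page);
}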
On Tue, Oct 26, 2021 at 09:21:29PM -0400, Pasha Tatashin wrote:
> I think we can do that by using atomic_fetch_*() and checking for
> overflow/underflow after the operation. I will send the updated
> series soon.

Please include performance measurements.  You're adding code to some
hot paths.
On Wed, Oct 27, 2021 at 1:38 AM Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
>
> The problems with page->_refcount are hard to debug, because by the
> time they are detected, the damage has usually occurred long before.
> Yet, problems with an invalid page refcount can be catastrophic and
> lead to memory corruption.
>
> Reduce the scope of when the _refcount problems manifest themselves by
> adding checks for underflows and overflows into the functions that
> modify _refcount.
>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

I see that some atomic_add/dec calls are replaced with
atomic_add/dec_return. The helpers that return a value imply a full
memory barrier around the operation, while those that do not return a
value imply none. Do you have any numbers to show the impact? Maybe
atomic_add/dec_return_relaxed can help here.

Thanks.
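For context, the ordering rules Muchun refers to, as spelled out in the
kernel's Documentation/atomic_t.txt:

atomic_add(1, &v);                      /* RMW only, no ordering implied */
ret = atomic_add_return(1, &v);         /* fully ordered: acts as a full
                                         * memory barrier before and after */
ret = atomic_add_return_relaxed(1, &v); /* returns the new value but
                                         * implies no ordering */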
> I see that some atomic_add/dec calls are replaced with
> atomic_add/dec_return.

I am going to replace the -return variants with -fetch variants,
potentially the relaxed ones.

> The helpers that return a value imply a full memory barrier around
> the operation, while those that do not return a value imply none. Do
> you have any numbers to show the impact? Maybe
> atomic_add/dec_return_relaxed can help here.

The generic variant uses arch_cmpxchg() for all atomic variants
without any extra barriers. Therefore, on platforms that use the
generic implementations there won't be a performance difference,
except for an extra branch that checks the result when VM_BUG_ON is
enabled.

On x86 the difference between the two is the following:

atomic_add:
	lock add %eax,(%rsi)

atomic_fetch_add:
	lock xadd %eax,(%rsi)

atomic_fetch_add_relaxed:
	lock xadd %eax,(%rsi)

There is no difference between the relaxed and non-relaxed variants;
however, we use lock xadd instead of lock add. I am not sure whether
the performance difference is going to be measurable.

Pasha
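The codegen split above follows from whether the result of the
read-modify-write is consumed, as a small sketch with the GCC/Clang
builtins shows (typical x86-64 output noted in the comments; exact
output depends on the compiler):

void add_only(int *v)
{
	/* Result unused: the compiler may emit "lock add". */
	__atomic_fetch_add(v, 1, __ATOMIC_RELAXED);
}

int add_and_check(int *v)
{
	/* Result consumed: the old value must come back, so the
	 * compiler emits "lock xadd". */
	return __atomic_fetch_add(v, 1, __ATOMIC_RELAXED);
}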
On Tue, Oct 26, 2021 at 11:05 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 26, 2021 at 09:21:29PM -0400, Pasha Tatashin wrote:
> > I think we can do that by using atomic_fetch_*() and checking for
> > overflow/underflow after the operation. I will send the updated
> > series soon.
>
> Please include performance measurements.  You're adding code to some
> hot paths.

I will run boot performance tests to ensure we do not regress
initializing "struct pages" on larger machines. Do you have a
suggestion for another perf test?

Thanks,
Pasha
On Thu, Oct 28, 2021 at 2:22 AM Pasha Tatashin <pasha.tatashin@soleen.com> wrote:
>
> > I see that some atomic_add/dec calls are replaced with
> > atomic_add/dec_return.
>
> I am going to replace the -return variants with -fetch variants,
> potentially the relaxed ones.
>
> > The helpers that return a value imply a full memory barrier around
> > the operation, while those that do not return a value imply none. Do
> > you have any numbers to show the impact? Maybe
> > atomic_add/dec_return_relaxed can help here.
>
> The generic variant uses arch_cmpxchg() for all atomic variants
> without any extra barriers. Therefore, on platforms that use the
> generic implementations there won't be a performance difference,
> except for an extra branch that checks the result when VM_BUG_ON is
> enabled.
>
> On x86 the difference between the two is the following:
>
> atomic_add:
> 	lock add %eax,(%rsi)
>
> atomic_fetch_add:
> 	lock xadd %eax,(%rsi)
>
> atomic_fetch_add_relaxed:
> 	lock xadd %eax,(%rsi)
>
> There is no difference between the relaxed and non-relaxed variants;
> however, we

Right. There is no difference on x86, but there may be differences on
other architectures.

> use lock xadd instead of lock add. I am not sure whether
> the performance difference is going to be measurable.
>
> Pasha
diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 7ad46f45df39..b3ec2b231fc7 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -90,21 +90,35 @@ static inline void init_page_count(struct page *page)
 
 static inline void page_ref_add(struct page *page, int nr)
 {
-	atomic_add(nr, &page->_refcount);
+	int ret;
+
+	VM_BUG_ON(nr <= 0);
+	ret = atomic_add_return(nr, &page->_refcount);
+	VM_BUG_ON_PAGE(ret <= 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod))
 		__page_ref_mod(page, nr);
 }
 
 static inline void page_ref_sub(struct page *page, int nr)
 {
-	atomic_sub(nr, &page->_refcount);
+	int ret;
+
+	VM_BUG_ON(nr <= 0);
+	ret = atomic_sub_return(nr, &page->_refcount);
+	VM_BUG_ON_PAGE(ret < 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod))
 		__page_ref_mod(page, -nr);
 }
 
 static inline int page_ref_sub_return(struct page *page, int nr)
 {
-	int ret = atomic_sub_return(nr, &page->_refcount);
+	int ret;
+
+	VM_BUG_ON(nr <= 0);
+	ret = atomic_sub_return(nr, &page->_refcount);
+	VM_BUG_ON_PAGE(ret < 0, page);
 
 	if (page_ref_tracepoint_active(page_ref_mod_and_return))
 		__page_ref_mod_and_return(page, -nr, ret);
@@ -113,31 +127,43 @@ static inline int page_ref_sub_return(struct page *page, int nr)
 
 static inline void page_ref_inc(struct page *page)
 {
-	atomic_inc(&page->_refcount);
+	int ret = atomic_inc_return(&page->_refcount);
+
+	VM_BUG_ON_PAGE(ret <= 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod))
 		__page_ref_mod(page, 1);
 }
 
 static inline void page_ref_dec(struct page *page)
 {
-	atomic_dec(&page->_refcount);
+	int ret = atomic_dec_return(&page->_refcount);
+
+	VM_BUG_ON_PAGE(ret < 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod))
 		__page_ref_mod(page, -1);
 }
 
 static inline int page_ref_sub_and_test(struct page *page, int nr)
 {
-	int ret = atomic_sub_and_test(nr, &page->_refcount);
+	int ret;
+
+	VM_BUG_ON(nr <= 0);
+	ret = atomic_sub_return(nr, &page->_refcount);
+	VM_BUG_ON_PAGE(ret < 0, page);
 
 	if (page_ref_tracepoint_active(page_ref_mod_and_test))
 		__page_ref_mod_and_test(page, -nr, ret);
-	return ret;
+	return ret == 0;
 }
 
 static inline int page_ref_inc_return(struct page *page)
 {
 	int ret = atomic_inc_return(&page->_refcount);
 
+	VM_BUG_ON_PAGE(ret <= 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod_and_return))
 		__page_ref_mod_and_return(page, 1, ret);
 	return ret;
@@ -145,17 +171,21 @@ static inline int page_ref_inc_return(struct page *page)
 
 static inline int page_ref_dec_and_test(struct page *page)
 {
-	int ret = atomic_dec_and_test(&page->_refcount);
+	int ret = atomic_dec_return(&page->_refcount);
+
+	VM_BUG_ON_PAGE(ret < 0, page);
 
 	if (page_ref_tracepoint_active(page_ref_mod_and_test))
 		__page_ref_mod_and_test(page, -1, ret);
-	return ret;
+	return ret == 0;
 }
 
 static inline int page_ref_dec_return(struct page *page)
 {
 	int ret = atomic_dec_return(&page->_refcount);
 
+	VM_BUG_ON_PAGE(ret < 0, page);
+
 	if (page_ref_tracepoint_active(page_ref_mod_and_return))
 		__page_ref_mod_and_return(page, -1, ret);
 	return ret;
@@ -163,16 +193,23 @@ static inline int page_ref_dec_return(struct page *page)
 
 static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
-	int ret = atomic_add_unless(&page->_refcount, nr, u);
+	int ret;
+
+	VM_BUG_ON(nr <= 0 || u < 0);
+	ret = atomic_fetch_add_unless(&page->_refcount, nr, u);
+	VM_BUG_ON_PAGE(ret < 0, page);
 
 	if (page_ref_tracepoint_active(page_ref_mod_unless))
 		__page_ref_mod_unless(page, nr, ret);
-	return ret;
+	return ret != u;
 }
 
 static inline int page_ref_freeze(struct page *page, int count)
 {
-	int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
+	int ret;
+
+	VM_BUG_ON(count <= 0);
+	ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
 
 	if (page_ref_tracepoint_active(page_ref_freeze))
 		__page_ref_freeze(page, count, ret);
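One subtlety in the diff: page_ref_sub_and_test() now builds its boolean
from atomic_sub_return(), and "return ret == 0;" preserves the old
contract because atomic_sub_and_test() is true exactly when the new
value is zero. A userspace sketch of that equivalence (names are
illustrative):

#include <assert.h>
#include <stdatomic.h>

/* sub_and_test semantics reconstructed from a returning subtract. */
static int sub_and_test(atomic_int *v, int nr)
{
	/* atomic_fetch_sub() returns the old value; old - nr is the new */
	return atomic_fetch_sub(v, nr) - nr == 0;
}

int main(void)
{
	atomic_int ref = 2;

	assert(!sub_and_test(&ref, 1)); /* 2 -> 1: not the last reference */
	assert(sub_and_test(&ref, 1));  /* 1 -> 0: last reference dropped */
	return 0;
}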
The problems with page->_refcount are hard to debug, because by the
time they are detected, the damage has usually occurred long before.
Yet, problems with an invalid page refcount can be catastrophic and
lead to memory corruption.

Reduce the scope of when the _refcount problems manifest themselves by
adding checks for underflows and overflows into the functions that
modify _refcount.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 include/linux/page_ref.h | 61 ++++++++++++++++++++++++++++++++--------
 1 file changed, 49 insertions(+), 12 deletions(-)