Message ID | 20190214062339.7139-1-mpe@ellerman.id.au (mailing list archive) |
---|---|
State | New, archived |
Series | powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present() |
On Thu 14-02-19 17:23:39, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
>
>   static inline int pgd_present(pgd_t pgd)
>   {
>   -       return !pgd_none(pgd);
>   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>   }
>
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because
> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
>
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:

Wow, good catch. I wouldn't believe there are so few bad effects from such a
major breakage! :)

Honza

>
>   WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
>   CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
>   ...
>   NIP .ext4_set_page_dirty+0x70/0xb0
>   LR  .set_page_dirty+0xa0/0x150
>   Call Trace:
>     .set_page_dirty+0xa0/0x150
>     .unmap_page_range+0xbf0/0xe10
>     .unmap_vmas+0x84/0x130
>     .unmap_region+0xe8/0x190
>     .__do_munmap+0x2f0/0x510
>     .__vm_munmap+0x80/0x110
>     .__se_sys_munmap+0x14/0x30
>     system_call+0x5c/0x70
>
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
>
> Thanks to Jan Kara and Aneesh for help with debugging.
>
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: stable@vger.kernel.org # v4.20+
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>
>  static inline int pud_present(pud_t pud)
>  {
> -       return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +       return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
>  }
>
>  extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>
>  static inline int pgd_present(pgd_t pgd)
>  {
> -       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> +       return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>  }
>
>  static inline pte_t pgd_pte(pgd_t pgd)
> --
> 2.20.1
>
On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> rather than just checking that the value is non-zero, e.g.:
>
>   static inline int pgd_present(pgd_t pgd)
>   {
>   -       return !pgd_none(pgd);
>   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>   }
>
> Unfortunately this is broken on big endian, as the result of the
> bitwise && is truncated to int, which is always zero because

Not sure why that should happen, why is the result an int? What
causes the casting of pgd_t & be64 to be truncated to an int.

> _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and
> pud_present() are always false at compile time, and the compiler
> elides the subsequent code.
>
> Remarkably with that bug present we are still able to boot and run
> with few noticeable effects. However under some work loads we are able
> to trigger a warning in the ext4 code:
>
>   WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0
>   CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1
>   ...
>   NIP .ext4_set_page_dirty+0x70/0xb0
>   LR  .set_page_dirty+0xa0/0x150
>   Call Trace:
>     .set_page_dirty+0xa0/0x150
>     .unmap_page_range+0xbf0/0xe10
>     .unmap_vmas+0x84/0x130
>     .unmap_region+0xe8/0x190
>     .__do_munmap+0x2f0/0x510
>     .__vm_munmap+0x80/0x110
>     .__se_sys_munmap+0x14/0x30
>     system_call+0x5c/0x70
>
> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.
>
> Thanks to Jan Kara and Aneesh for help with debugging.
>
> Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
> Cc: stable@vger.kernel.org # v4.20+
> Reported-by: Erhard F. <erhard_f@mailbox.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c9bfe526ca9d..d8c8d7c9df15 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
>
>  static inline int pud_present(pud_t pud)
>  {
> -       return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +       return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
>  }
>
>  extern struct page *pud_page(pud_t pud);
> @@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
>
>  static inline int pgd_present(pgd_t pgd)
>  {
> -       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> +       return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>  }
>

Care to put a big FAT warning, so that we don't repeat this again
(as in authors planning on changing these bits).

Balbir Singh.
On Feb 14 2019, Michael Ellerman <mpe@ellerman.id.au> wrote:

> The fix is simple, we need to convert the result of the bitwise && to
> an int before returning it.

Alternatively, the return type could be changed to bool, so that the
compiler does the right thing by itself.

Andreas.
Hi all,

On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > rather than just checking that the value is non-zero, e.g.:
> >
> >   static inline int pgd_present(pgd_t pgd)
> >   {
> >   -       return !pgd_none(pgd);
> >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >   }
> >
> > Unfortunately this is broken on big endian, as the result of the
> > bitwise && is truncated to int, which is always zero because

(Bitwise "&" of course).

> Not sure why that should happen, why is the result an int? What
> causes the casting of pgd_t & be64 to be truncated to an int.

Yes, it's not obvious as written... It's simply that the return type of
pgd_present is int. So it is truncated _after_ the bitwise and.

Segher
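The truncation Segher describes can be reproduced outside the kernel. What follows is a minimal user-space sketch under stated assumptions, not the kernel code: `present_truncated`, `present_fixed`, `PRESENT_BE`, `PRESENT_LE` and `demo_present_truncation` are hypothetical names, a plain `uint64_t` stands in for the raw pgd/pud value, and the two masks are written out as the values `cpu_to_be64(_PAGE_PRESENT)` would take on big and little endian. It also assumes the usual GCC/Clang behaviour of keeping the low 32 bits when a 64-bit value is converted to `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Big endian: cpu_to_be64() changes nothing, so the mask keeps its top bit. */
#define PRESENT_BE UINT64_C(0x8000000000000000)
/* Little endian: the byte swap moves that bit into the lowest byte. */
#define PRESENT_LE UINT64_C(0x0000000000000080)

/* Pre-fix shape: the 64-bit AND result is implicitly converted to int. */
static inline int present_truncated(uint64_t raw, uint64_t mask)
{
	return (raw & mask);
}

/* Post-fix shape: "!!" collapses the 64-bit value to 0 or 1 first. */
static inline int present_fixed(uint64_t raw, uint64_t mask)
{
	return !!(raw & mask);
}

/* Hypothetical helper that exercises both shapes. */
void demo_present_truncation(void)
{
	/* Top bit set, but the low 32 bits are all zero: reads as "not present". */
	assert(present_truncated(PRESENT_BE, PRESENT_BE) == 0);
	/* The same input with the fix applied. */
	assert(present_fixed(PRESENT_BE, PRESENT_BE) == 1);
	/* The swapped mask survives truncation, which is why LE was unaffected. */
	assert(present_truncated(PRESENT_LE, PRESENT_LE) != 0);
}
```

Because the mask is a compile-time constant, a compiler can see that the truncated result is always zero on big endian, which matches the patch description's remark that the subsequent code is elided.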
On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> Hi all,
>
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> > > rather than just checking that the value is non-zero, e.g.:
> > >
> > >   static inline int pgd_present(pgd_t pgd)
> > >   {
> > >   -       return !pgd_none(pgd);
> > >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> > >   }
> > >
> > > Unfortunately this is broken on big endian, as the result of the
> > > bitwise && is truncated to int, which is always zero because
>
> (Bitwise "&" of course).
>
> > Not sure why that should happen, why is the result an int? What
> > causes the casting of pgd_t & be64 to be truncated to an int.
>
> Yes, it's not obvious as written... It's simply that the return type of
> pgd_present is int. So it is truncated _after_ the bitwise and.
>

Thanks, I am surprised the compiler does not complain about the truncation
of bits. I wonder if we are missing -Wconversion

Balbir
Andreas Schwab <schwab@linux-m68k.org> writes:
> On Feb 14 2019, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
>> The fix is simple, we need to convert the result of the bitwise && to
>> an int before returning it.
>
> Alternatively, the return type could be changed to bool, so that the
> compiler does the right thing by itself.

Yes that would be preferable.

All other architectures return an int so I wasn't game to switch to bool
for a fix. But I don't see why it should matter so I'll do a patch using
bool for next.

cheers
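The bool alternative discussed here can be sketched in user space as follows. This is an illustration only, not the follow-up patch: `present_bool`, `PRESENT_MASK` and `demo_present_bool` are hypothetical names, and `uint64_t` again stands in for the raw page table entry. It relies on the C rule that converting any non-zero scalar to `bool` yields 1, so no bits can be lost on return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for cpu_to_be64(_PAGE_PRESENT) as seen on big endian. */
#define PRESENT_MASK UINT64_C(0x8000000000000000)

/* With a bool return type the conversion tests for non-zero instead of truncating. */
static inline bool present_bool(uint64_t raw)
{
	return raw & PRESENT_MASK;
}

/* Hypothetical helper that exercises the function. */
void demo_present_bool(void)
{
	assert(present_bool(PRESENT_MASK) == true);          /* top bit set */
	assert(present_bool(UINT64_C(0x7fffffff)) == false); /* only low bits set */
	assert(present_bool(0) == false);
}
```

The effect is the same as the `!!` in the posted fix; the difference is that the return type enforces it, so a later edit that drops the `!!` cannot reintroduce the truncation.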
Balbir Singh <bsingharora@gmail.com> writes:
> On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> Hi all,
>>
>> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > > rather than just checking that the value is non-zero, e.g.:
>> > >
>> > >   static inline int pgd_present(pgd_t pgd)
>> > >   {
>> > >   -       return !pgd_none(pgd);
>> > >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> > >   }
>> > >
>> > > Unfortunately this is broken on big endian, as the result of the
>> > > bitwise && is truncated to int, which is always zero because
>>
>> (Bitwise "&" of course).
>>
>> > Not sure why that should happen, why is the result an int? What
>> > causes the casting of pgd_t & be64 to be truncated to an int.
>>
>> Yes, it's not obvious as written... It's simply that the return type of
>> pgd_present is int. So it is truncated _after_ the bitwise and.
>>
>
> Thanks, I am surprised the compiler does not complain about the truncation
> of bits. I wonder if we are missing -Wconversion

Good luck with that :)

What I should start doing is building with it enabled and then comparing
the output before and after commits to make sure we're not introducing
new cases.

cheers
Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi all,
>
> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> > rather than just checking that the value is non-zero, e.g.:
>> >
>> >   static inline int pgd_present(pgd_t pgd)
>> >   {
>> >   -       return !pgd_none(pgd);
>> >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> >   }
>> >
>> > Unfortunately this is broken on big endian, as the result of the
>> > bitwise && is truncated to int, which is always zero because
>
> (Bitwise "&" of course).

Thanks, I fixed that up.

cheers
On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> Hi all,
> >>
> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> > > rather than just checking that the value is non-zero, e.g.:
> >> > >
> >> > >   static inline int pgd_present(pgd_t pgd)
> >> > >   {
> >> > >   -       return !pgd_none(pgd);
> >> > >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> > >   }
> >> > >
> >> > > Unfortunately this is broken on big endian, as the result of the
> >> > > bitwise && is truncated to int, which is always zero because
> >>
> >> (Bitwise "&" of course).
> >>
> >> > Not sure why that should happen, why is the result an int? What
> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >>
> >> Yes, it's not obvious as written... It's simply that the return type of
> >> pgd_present is int. So it is truncated _after_ the bitwise and.
> >>
> >
> > Thanks, I am surprised the compiler does not complain about the truncation
> > of bits. I wonder if we are missing -Wconversion
>
> Good luck with that :)
>
> What I should start doing is building with it enabled and then comparing
> the output before and after commits to make sure we're not introducing
> new cases.
>

Fair enough, my point was that the compiler can help out. I'll see what
-Wconversion finds on my local build :)

Balbir Singh.
Balbir Singh <bsingharora@gmail.com> writes:
> On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
>> Balbir Singh <bsingharora@gmail.com> writes:
>> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
>> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
>> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
>> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
>> >> > > rather than just checking that the value is non-zero, e.g.:
>> >> > >
>> >> > >   static inline int pgd_present(pgd_t pgd)
>> >> > >   {
>> >> > >   -       return !pgd_none(pgd);
>> >> > >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
>> >> > >   }
>> >> > >
>> >> > > Unfortunately this is broken on big endian, as the result of the
>> >> > > bitwise && is truncated to int, which is always zero because
>> >>
>> >> (Bitwise "&" of course).
>> >>
>> >> > Not sure why that should happen, why is the result an int? What
>> >> > causes the casting of pgd_t & be64 to be truncated to an int.
>> >>
>> >> Yes, it's not obvious as written... It's simply that the return type of
>> >> pgd_present is int. So it is truncated _after_ the bitwise and.
>> >>
>> >
>> > Thanks, I am surprised the compiler does not complain about the truncation
>> > of bits. I wonder if we are missing -Wconversion
>>
>> Good luck with that :)
>>
>> What I should start doing is building with it enabled and then comparing
>> the output before and after commits to make sure we're not introducing
>> new cases.
>
> Fair enough, my point was that the compiler can help out. I'll see what
> -Wconversion finds on my local build :)

I get about 43MB of warnings here :)

cheers
On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > On Sun, Feb 17, 2019 at 07:34:20PM +1100, Michael Ellerman wrote:
> >> Balbir Singh <bsingharora@gmail.com> writes:
> >> > On Sat, Feb 16, 2019 at 08:22:12AM -0600, Segher Boessenkool wrote:
> >> >> On Sat, Feb 16, 2019 at 09:55:11PM +1100, Balbir Singh wrote:
> >> >> > On Thu, Feb 14, 2019 at 05:23:39PM +1100, Michael Ellerman wrote:
> >> >> > > In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT
> >> >> > > rather than just checking that the value is non-zero, e.g.:
> >> >> > >
> >> >> > >   static inline int pgd_present(pgd_t pgd)
> >> >> > >   {
> >> >> > >   -       return !pgd_none(pgd);
> >> >> > >   +       return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
> >> >> > >   }
> >> >> > >
> >> >> > > Unfortunately this is broken on big endian, as the result of the
> >> >> > > bitwise && is truncated to int, which is always zero because
> >> >>
> >> >> (Bitwise "&" of course).
> >> >>
> >> >> > Not sure why that should happen, why is the result an int? What
> >> >> > causes the casting of pgd_t & be64 to be truncated to an int.
> >> >>
> >> >> Yes, it's not obvious as written... It's simply that the return type of
> >> >> pgd_present is int. So it is truncated _after_ the bitwise and.
> >> >>
> >> >
> >> > Thanks, I am surprised the compiler does not complain about the truncation
> >> > of bits. I wonder if we are missing -Wconversion
> >>
> >> Good luck with that :)
> >>
> >> What I should start doing is building with it enabled and then comparing
> >> the output before and after commits to make sure we're not introducing
> >> new cases.
> >
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
>
> I get about 43MB of warnings here :)
>

I got about 181M with a failed build :(, but the warnings pointed to
some cases that can be a good project for cleanup.

For example:

1.
    static inline long regs_return_value(struct pt_regs *regs)
    {
            if (is_syscall_success(regs))
                    return regs->gpr[3];
            else
                    return -regs->gpr[3];
    }

In the case of is_syscall_success() returning false, we should ensure that
regs->gpr[3] is negative and capped within a certain limit, but it might
be an expensive check.

2.
    static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
                                            unsigned int index, unsigned int hidx)
    {
            hpte_slot_array[index] = (hidx << 1) | 0x1;
    }

hidx is 3 bits, but the argument is unsigned int. The caller probably does a
hidx & 0x7, but it's not clear from the code.

3. hash__pmd_bad (pmd_bad) and hash__pud_bad (pud_bad) have issues similar to
what was found, but since the page table indices are below 32, the macros
are safe :)

And a few more, but I am not sure why I spent time looking at possible issues,
maybe I am being stupid or overly pessimistic :)

Balbir
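The narrowing in example 2 can be made explicit with a mask and a cast. The sketch below is a user-space illustration, not a proposed kernel change: `mark_slot_unmasked`, `mark_slot_masked` and `demo_slot_marking` are hypothetical names, and the 3-bit width of `hidx` is taken from the remark above rather than from the kernel headers.

```c
#include <assert.h>

/* Shape quoted above: the unsigned int expression is silently narrowed to a byte. */
static inline void mark_slot_unmasked(unsigned char *slots,
				      unsigned int index, unsigned int hidx)
{
	slots[index] = (hidx << 1) | 0x1;
}

/* Explicit variant: keep only the low 3 bits of hidx, then narrow on purpose. */
static inline void mark_slot_masked(unsigned char *slots,
				    unsigned int index, unsigned int hidx)
{
	slots[index] = (unsigned char)(((hidx & 0x7u) << 1) | 0x1u);
}

/* Hypothetical helper that contrasts the two for an out-of-range hidx. */
void demo_slot_marking(void)
{
	unsigned char slots[2] = { 0, 0 };

	mark_slot_unmasked(slots, 0, 0xff); /* 0x1ff is narrowed to 0xff */
	mark_slot_masked(slots, 1, 0xff);   /* 0x7 is kept: (7 << 1) | 1 == 0x0f */

	assert(slots[0] == 0xff);
	assert(slots[1] == 0x0f);
}
```

For values of `hidx` that already fit in 3 bits the two functions store the same byte; the masked form only documents the assumption at the point where the narrowing happens.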
On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> > Fair enough, my point was that the compiler can help out. I'll see what
> > -Wconversion finds on my local build :)
>
> I get about 43MB of warnings here :)

Yes, -Wconversion complains about a lot of things that are idiomatic C.
There is a reason -Wconversion is not in -Wall or -Wextra.

Segher
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
>> Balbir Singh <bsingharora@gmail.com> writes:
>> > Fair enough, my point was that the compiler can help out. I'll see what
>> > -Wconversion finds on my local build :)
>>
>> I get about 43MB of warnings here :)
>
> Yes, -Wconversion complains about a lot of things that are idiomatic C.
> There is a reason -Wconversion is not in -Wall or -Wextra.

Actually a lot of those go away when I add -Wno-sign-conversion.

And what's left seems mostly reasonable, they all indicate the
possibility of a bug I think.

In fact this works and would have caught the bug:

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index d8c8d7c9df15..3114e3f368e2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)
 
 static inline int pud_present(pud_t pud)
 {
+	__diag_push();
+	__diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
+
 	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+
+	__diag_pop();
 }
 
 extern struct page *pud_page(pud_t pud);

Obviously we're not going to instrument every function like that. But we
could start instrumenting particular files.

cheers
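Outside the kernel, the same per-function instrumentation can be approximated with GCC's diagnostic pragmas, which Clang also accepts. This is a user-space sketch rather than the kernel's `__diag_*` macros: `present_checked` and `PRESENT_MASK` are hypothetical names, and `uint64_t` stands in for the raw pud value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for cpu_to_be64(_PAGE_PRESENT) as seen on big endian. */
#define PRESENT_MASK UINT64_C(0x8000000000000000)

/* Turn -Wconversion on for this one function only, then restore the old state. */
#pragma GCC diagnostic push
#pragma GCC diagnostic warning "-Wconversion"

static inline int present_checked(uint64_t raw)
{
	/* "!!" yields an int, so there is no narrowing conversion to warn about. */
	return !!(raw & PRESENT_MASK);
}

#pragma GCC diagnostic pop
```

With the `!!` removed, the return statement would convert a 64-bit value to `int`, which is the kind of narrowing `-Wconversion` is meant to report. The exact wording of the warning depends on the compiler and version, so none is quoted here.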
On Wed, Feb 20, 2019 at 10:18:38PM +1100, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Mon, Feb 18, 2019 at 11:49:18AM +1100, Michael Ellerman wrote:
> >> Balbir Singh <bsingharora@gmail.com> writes:
> >> > Fair enough, my point was that the compiler can help out. I'll see what
> >> > -Wconversion finds on my local build :)
> >>
> >> I get about 43MB of warnings here :)
> >
> > Yes, -Wconversion complains about a lot of things that are idiomatic C.
> > There is a reason -Wconversion is not in -Wall or -Wextra.
>
> Actually a lot of those go away when I add -Wno-sign-conversion.
>
> And what's left seems mostly reasonable, they all indicate the
> possibility of a bug I think.
>
> In fact this works and would have caught the bug:
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index d8c8d7c9df15..3114e3f368e2 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -904,7 +904,12 @@ static inline int pud_none(pud_t pud)
>
>  static inline int pud_present(pud_t pud)
>  {
> +       __diag_push();
> +       __diag_warn(GCC, 8, "-Wconversion", "ulong -> int");
> +
>         return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
> +
> +       __diag_pop();
>  }
>
>  extern struct page *pud_page(pud_t pud);
>
>
>
> Obviously we're not going to instrument every function like that. But we
> could start instrumenting particular files.

So you want to instrument the functions that you know are buggy, using
some weird incantations to catch only those errors you already know about?

(I am worried this does not scale, in many dimensions).

Segher
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c9bfe526ca9d..d8c8d7c9df15 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -904,7 +904,7 @@ static inline int pud_none(pud_t pud)
 
 static inline int pud_present(pud_t pud)
 {
-	return (pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
+	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PRESENT));
 }
 
 extern struct page *pud_page(pud_t pud);
@@ -951,7 +951,7 @@ static inline int pgd_none(pgd_t pgd)
 
 static inline int pgd_present(pgd_t pgd)
 {
-	return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
 }
 
 static inline pte_t pgd_pte(pgd_t pgd)