diff mbox series

arch/{mips, sparc, microblaze, powerpc}: Don't enable pagefault/preempt twice

Message ID 20200518184843.3029640-1-ira.weiny@intel.com (mailing list archive)
State New, archived
Headers show
Series arch/{mips, sparc, microblaze, powerpc}: Don't enable pagefault/preempt twice | expand

Commit Message

Ira Weiny May 18, 2020, 6:48 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

The kunmap_atomic clean up failed to remove one set of pagefault/preempt
enables when vaddr is not in the fixmap.

Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 arch/microblaze/mm/highmem.c | 5 +----
 arch/mips/mm/highmem.c       | 5 +----
 arch/powerpc/mm/highmem.c    | 5 +----
 arch/sparc/mm/highmem.c      | 5 +----
 4 files changed, 4 insertions(+), 16 deletions(-)

Comments

Guenter Roeck May 19, 2020, 4:54 p.m. UTC | #1
On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> enables when vaddr is not in the fixmap.
> 
> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

microblazeel works with this patch, as do the nosmp sparc32 boot tests,
but sparc32 boot tests with SMP enabled still fail with lots of messages
such as:

BUG: Bad page state in process swapper/0  pfn:006a1
page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
flags: 0x0()
raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
[f00e7ab8 :
bad_page+0xa8/0x108 ]
[f00e8b54 :
free_pcppages_bulk+0x154/0x52c ]
[f00ea024 :
free_unref_page+0x54/0x6c ]
[f00ed864 :
free_reserved_area+0x58/0xec ]
[f0527104 :
kernel_init+0x14/0x110 ]
[f000b77c :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

Code path leading to that message is different but always the same
from free_unref_page().

Still testing ppc images.

Guenter
Ira Weiny May 19, 2020, 6:40 p.m. UTC | #2
On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > enables when vaddr is not in the fixmap.
> > 
> > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> microblazeel works with this patch,

Awesome...  Andrew in my rush yesterday I should have put a reported by on the
patch for Guenter as well.

Sorry about that Guenter,
Ira

> as do the nosmp sparc32 boot tests,
> but sparc32 boot tests with SMP enabled still fail with lots of messages
> such as:
> 
> BUG: Bad page state in process swapper/0  pfn:006a1
> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> flags: 0x0()
> raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> page dumped because: nonzero mapcount
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> [f00e7ab8 :
> bad_page+0xa8/0x108 ]
> [f00e8b54 :
> free_pcppages_bulk+0x154/0x52c ]
> [f00ea024 :
> free_unref_page+0x54/0x6c ]
> [f00ed864 :
> free_reserved_area+0x58/0xec ]
> [f0527104 :
> kernel_init+0x14/0x110 ]
> [f000b77c :
> ret_from_kernel_thread+0xc/0x38 ]
> [00000000 :
> 0x0 ]
> 
> Code path leading to that message is different but always the same
> from free_unref_page().
> 
> Still testing ppc images.
> 
> Guenter
Guenter Roeck May 19, 2020, 7:42 p.m. UTC | #3
On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> > On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > > enables when vaddr is not in the fixmap.
> > > 
> > > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > microblazeel works with this patch,
> 
> Awesome...  Andrew in my rush yesterday I should have put a reported by on the
> patch for Guenter as well.
> 
> Sorry about that Guenter,

No worries.

> Ira
> 
> > as do the nosmp sparc32 boot tests,
> > but sparc32 boot tests with SMP enabled still fail with lots of messages
> > such as:
> > 
> > BUG: Bad page state in process swapper/0  pfn:006a1
> > page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> > flags: 0x0()
> > raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> > page dumped because: nonzero mapcount
> > Modules linked in:
> > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> > [f00e7ab8 :
> > bad_page+0xa8/0x108 ]
> > [f00e8b54 :
> > free_pcppages_bulk+0x154/0x52c ]
> > [f00ea024 :
> > free_unref_page+0x54/0x6c ]
> > [f00ed864 :
> > free_reserved_area+0x58/0xec ]
> > [f0527104 :
> > kernel_init+0x14/0x110 ]
> > [f000b77c :
> > ret_from_kernel_thread+0xc/0x38 ]
> > [00000000 :
> > 0x0 ]
> > 
> > Code path leading to that message is different but always the same
> > from free_unref_page().
> > 
> > Still testing ppc images.
> > 

ppc image tests are passing with this patch.

Guenter
Ira Weiny May 20, 2020, 5:02 a.m. UTC | #4
On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
> > On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> > > On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > > > enables when vaddr is not in the fixmap.
> > > > 
> > > > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > microblazeel works with this patch,
> > 
> > Awesome...  Andrew in my rush yesterday I should have put a reported by on the
> > patch for Guenter as well.
> > 
> > Sorry about that Guenter,
> 
> No worries.
> 
> > Ira
> > 
> > > as do the nosmp sparc32 boot tests,
> > > but sparc32 boot tests with SMP enabled still fail with lots of messages
> > > such as:
> > > 
> > > BUG: Bad page state in process swapper/0  pfn:006a1
> > > page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> > > flags: 0x0()
> > > raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> > > page dumped because: nonzero mapcount
> > > Modules linked in:
> > > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> > > [f00e7ab8 :
> > > bad_page+0xa8/0x108 ]
> > > [f00e8b54 :
> > > free_pcppages_bulk+0x154/0x52c ]
> > > [f00ea024 :
> > > free_unref_page+0x54/0x6c ]
> > > [f00ed864 :
> > > free_reserved_area+0x58/0xec ]
> > > [f0527104 :
> > > kernel_init+0x14/0x110 ]
> > > [f000b77c :
> > > ret_from_kernel_thread+0xc/0x38 ]
> > > [00000000 :
> > > 0x0 ]

I'm really not seeing how this is related to the kmap clean up.

But just to make sure I'm trying to run your environment for sparc and having
less luck than with microblaze.

Could you give me the command which is failing above?

Ira

> > > 
> > > Code path leading to that message is different but always the same
> > > from free_unref_page().
> > > 
> > > Still testing ppc images.
> > > 
> 
> ppc image tests are passing with this patch.
> 
> Guenter
Ira Weiny May 20, 2020, 5:13 a.m. UTC | #5
On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
> > On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> > > On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > > > enables when vaddr is not in the fixmap.
> > > > 
> > > > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > microblazeel works with this patch,
> > 
> > Awesome...  Andrew in my rush yesterday I should have put a reported by on the
> > patch for Guenter as well.
> > 
> > Sorry about that Guenter,
> 
> No worries.
> 
> > Ira
> > 
> > > as do the nosmp sparc32 boot tests,
> > > but sparc32 boot tests with SMP enabled still fail with lots of messages
> > > such as:
> > > 
> > > BUG: Bad page state in process swapper/0  pfn:006a1
> > > page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> > > flags: 0x0()
> > > raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> > > page dumped because: nonzero mapcount
> > > Modules linked in:
> > > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> > > [f00e7ab8 :
> > > bad_page+0xa8/0x108 ]
> > > [f00e8b54 :
> > > free_pcppages_bulk+0x154/0x52c ]
> > > [f00ea024 :
> > > free_unref_page+0x54/0x6c ]
> > > [f00ed864 :
> > > free_reserved_area+0x58/0xec ]
> > > [f0527104 :
> > > kernel_init+0x14/0x110 ]
> > > [f000b77c :
> > > ret_from_kernel_thread+0xc/0x38 ]
> > > [00000000 :
> > > 0x0 ]
> > > 
> > > Code path leading to that message is different but always the same
> > > from free_unref_page().

Actually it occurs to me that the patch consolidating kmap_prot is odd for
sparc 32 bit...

Its a long shot but could you try reverting this patch?

4ea7d2419e3f kmap: consolidate kmap_prot definitions

Alternately I will need to figure out how to run the sparc on qemu here...

Thanks very much for all the testing though!  :-D

Ira

> > > 
> > > Still testing ppc images.
> > > 
> 
> ppc image tests are passing with this patch.
> 
> Guenter
Ira Weiny May 21, 2020, 3:14 p.m. UTC | #6
On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
> > On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> > > as do the nosmp sparc32 boot tests,
> > > but sparc32 boot tests with SMP enabled still fail with lots of messages
> > > such as:
> > > 
> > > BUG: Bad page state in process swapper/0  pfn:006a1
> > > page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> > > flags: 0x0()
> > > raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> > > page dumped because: nonzero mapcount
> > > Modules linked in:
> > > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> > > [f00e7ab8 :
> > > bad_page+0xa8/0x108 ]
> > > [f00e8b54 :
> > > free_pcppages_bulk+0x154/0x52c ]
> > > [f00ea024 :
> > > free_unref_page+0x54/0x6c ]
> > > [f00ed864 :
> > > free_reserved_area+0x58/0xec ]
> > > [f0527104 :
> > > kernel_init+0x14/0x110 ]
> > > [f000b77c :
> > > ret_from_kernel_thread+0xc/0x38 ]
> > > [00000000 :
> > > 0x0 ]
> > > 
> > > Code path leading to that message is different but always the same
> > > from free_unref_page().
> > > 
> > > Still testing ppc images.
> > > 
> 
> ppc image tests are passing with this patch.

How about sparc?  I finally got your scripts to run on sparc and everything
looks to be passing?

Are we all good now?

Thanks again!
Ira
Guenter Roeck May 21, 2020, 4:05 p.m. UTC | #7
On 5/19/20 10:13 PM, Ira Weiny wrote:
> On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
>> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
>>> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
>>>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
>>>>> From: Ira Weiny <ira.weiny@intel.com>
>>>>>
>>>>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
>>>>> enables when vaddr is not in the fixmap.
>>>>>
>>>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>>>>
>>>> microblazeel works with this patch,
>>>
>>> Awesome...  Andrew in my rush yesterday I should have put a reported by on the
>>> patch for Guenter as well.
>>>
>>> Sorry about that Guenter,
>>
>> No worries.
>>
>>> Ira
>>>
>>>> as do the nosmp sparc32 boot tests,
>>>> but sparc32 boot tests with SMP enabled still fail with lots of messages
>>>> such as:
>>>>
>>>> BUG: Bad page state in process swapper/0  pfn:006a1
>>>> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
>>>> flags: 0x0()
>>>> raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
>>>> page dumped because: nonzero mapcount
>>>> Modules linked in:
>>>> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
>>>> [f00e7ab8 :
>>>> bad_page+0xa8/0x108 ]
>>>> [f00e8b54 :
>>>> free_pcppages_bulk+0x154/0x52c ]
>>>> [f00ea024 :
>>>> free_unref_page+0x54/0x6c ]
>>>> [f00ed864 :
>>>> free_reserved_area+0x58/0xec ]
>>>> [f0527104 :
>>>> kernel_init+0x14/0x110 ]
>>>> [f000b77c :
>>>> ret_from_kernel_thread+0xc/0x38 ]
>>>> [00000000 :
>>>> 0x0 ]
>>>>
>>>> Code path leading to that message is different but always the same
>>>> from free_unref_page().
> 
> Actually it occurs to me that the patch consolidating kmap_prot is odd for
> sparc 32 bit...
> 
> Its a long shot but could you try reverting this patch?
> 
> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> 

That is not easy to revert, unfortunately, due to several follow-up patches.

Guenter

> Alternately I will need to figure out how to run the sparc on qemu here...
> 
> Thanks very much for all the testing though!  :-D
> 
> Ira
> 
>>>>
>>>> Still testing ppc images.
>>>>
>>
>> ppc image tests are passing with this patch.
>>
>> Guenter
Al Viro May 21, 2020, 5:27 p.m. UTC | #8
On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > enables when vaddr is not in the fixmap.
> > 
> > Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> microblazeel works with this patch, as do the nosmp sparc32 boot tests,
> but sparc32 boot tests with SMP enabled still fail with lots of messages
> such as:

BTW, what's your setup for sparc32 boot tests?  IOW, how do you manage to
shrink the damn thing enough to have the loader cope with it?  I hadn't
been able to do that for the current mainline ;-/
Ira Weiny May 21, 2020, 5:42 p.m. UTC | #9
On Thu, May 21, 2020 at 09:05:41AM -0700, Guenter Roeck wrote:
> On 5/19/20 10:13 PM, Ira Weiny wrote:
> > On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
> >> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
> >>> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> >>>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> >>>>> From: Ira Weiny <ira.weiny@intel.com>
> >>>>>
> >>>>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> >>>>> enables when vaddr is not in the fixmap.
> >>>>>
> >>>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> >>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >>>>
> >>>> microblazeel works with this patch,
> >>>
> >>> Awesome...  Andrew in my rush yesterday I should have put a reported by on the
> >>> patch for Guenter as well.
> >>>
> >>> Sorry about that Guenter,
> >>
> >> No worries.
> >>
> >>> Ira
> >>>
> >>>> as do the nosmp sparc32 boot tests,
> >>>> but sparc32 boot tests with SMP enabled still fail with lots of messages
> >>>> such as:
> >>>>
> >>>> BUG: Bad page state in process swapper/0  pfn:006a1
> >>>> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> >>>> flags: 0x0()
> >>>> raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
> >>>> page dumped because: nonzero mapcount
> >>>> Modules linked in:
> >>>> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
> >>>> [f00e7ab8 :
> >>>> bad_page+0xa8/0x108 ]
> >>>> [f00e8b54 :
> >>>> free_pcppages_bulk+0x154/0x52c ]
> >>>> [f00ea024 :
> >>>> free_unref_page+0x54/0x6c ]
> >>>> [f00ed864 :
> >>>> free_reserved_area+0x58/0xec ]
> >>>> [f0527104 :
> >>>> kernel_init+0x14/0x110 ]
> >>>> [f000b77c :
> >>>> ret_from_kernel_thread+0xc/0x38 ]
> >>>> [00000000 :
> >>>> 0x0 ]
> >>>>
> >>>> Code path leading to that message is different but always the same
> >>>> from free_unref_page().
> > 
> > Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > sparc 32 bit...
> > 
> > Its a long shot but could you try reverting this patch?
> > 
> > 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > 
> 
> That is not easy to revert, unfortunately, due to several follow-up patches.

I have gotten your sparc tests to run and they all pass...

08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
Build reference: v5.7-rc4-17-g852b6f2edc0f

Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:nosmp:initrd ... running ......... passed
Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed

Is there another test I need to run?

Ira


> 
> Guenter
> 
> > Alternately I will need to figure out how to run the sparc on qemu here...
> > 
> > Thanks very much for all the testing though!  :-D
> > 
> > Ira
> > 
> >>>>
> >>>> Still testing ppc images.
> >>>>
> >>
> >> ppc image tests are passing with this patch.
> >>
> >> Guenter
>
Guenter Roeck May 21, 2020, 10:20 p.m. UTC | #10
On 5/21/20 10:27 AM, Al Viro wrote:
> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
>>> From: Ira Weiny <ira.weiny@intel.com>
>>>
>>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
>>> enables when vaddr is not in the fixmap.
>>>
>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>>
>> microblazeel works with this patch, as do the nosmp sparc32 boot tests,
>> but sparc32 boot tests with SMP enabled still fail with lots of messages
>> such as:
> 
> BTW, what's your setup for sparc32 boot tests?  IOW, how do you manage to
> shrink the damn thing enough to have the loader cope with it?  I hadn't
> been able to do that for the current mainline ;-/
> 

defconfig seems to work just fine, even after enabling various debug
and file system options.

Guenter
Guenter Roeck May 21, 2020, 10:27 p.m. UTC | #11
On 5/21/20 10:42 AM, Ira Weiny wrote:
> On Thu, May 21, 2020 at 09:05:41AM -0700, Guenter Roeck wrote:
>> On 5/19/20 10:13 PM, Ira Weiny wrote:
>>> On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
>>>> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
>>>>> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
>>>>>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
>>>>>>> From: Ira Weiny <ira.weiny@intel.com>
>>>>>>>
>>>>>>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
>>>>>>> enables when vaddr is not in the fixmap.
>>>>>>>
>>>>>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
>>>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>>>>>>
>>>>>> microblazeel works with this patch,
>>>>>
>>>>> Awesome...  Andrew in my rush yesterday I should have put a reported by on the
>>>>> patch for Guenter as well.
>>>>>
>>>>> Sorry about that Guenter,
>>>>
>>>> No worries.
>>>>
>>>>> Ira
>>>>>
>>>>>> as do the nosmp sparc32 boot tests,
>>>>>> but sparc32 boot tests with SMP enabled still fail with lots of messages
>>>>>> such as:
>>>>>>
>>>>>> BUG: Bad page state in process swapper/0  pfn:006a1
>>>>>> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
>>>>>> flags: 0x0()
>>>>>> raw: 00000000 00000100 00000122 00000000 00000001 00000000 00000000 00000000
>>>>>> page dumped because: nonzero mapcount
>>>>>> Modules linked in:
>>>>>> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-20200518-00002-gb178d2d56f29 #1
>>>>>> [f00e7ab8 :
>>>>>> bad_page+0xa8/0x108 ]
>>>>>> [f00e8b54 :
>>>>>> free_pcppages_bulk+0x154/0x52c ]
>>>>>> [f00ea024 :
>>>>>> free_unref_page+0x54/0x6c ]
>>>>>> [f00ed864 :
>>>>>> free_reserved_area+0x58/0xec ]
>>>>>> [f0527104 :
>>>>>> kernel_init+0x14/0x110 ]
>>>>>> [f000b77c :
>>>>>> ret_from_kernel_thread+0xc/0x38 ]
>>>>>> [00000000 :
>>>>>> 0x0 ]
>>>>>>
>>>>>> Code path leading to that message is different but always the same
>>>>>> from free_unref_page().
>>>
>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
>>> sparc 32 bit...
>>>
>>> Its a long shot but could you try reverting this patch?
>>>
>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
>>>
>>
>> That is not easy to revert, unfortunately, due to several follow-up patches.
> 
> I have gotten your sparc tests to run and they all pass...
> 
> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> Build reference: v5.7-rc4-17-g852b6f2edc0f
> 

That doesn't look like it is linux-next, which I guess means that something
else in linux-next breaks it. What is your qemu version ?

Thanks,
Guenter

> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> 
> Is there another test I need to run?
> 
> Ira
> 
> 
>>
>> Guenter
>>
>>> Alternately I will need to figure out how to run the sparc on qemu here...
>>>
>>> Thanks very much for all the testing though!  :-D
>>>
>>> Ira
>>>
>>>>>>
>>>>>> Still testing ppc images.
>>>>>>
>>>>
>>>> ppc image tests are passing with this patch.
>>>>
>>>> Guenter
>>
Ira Weiny May 21, 2020, 10:36 p.m. UTC | #12
> On 5/21/20 10:42 AM, Ira Weiny wrote:
> > On Thu, May 21, 2020 at 09:05:41AM -0700, Guenter Roeck wrote:
> >> On 5/19/20 10:13 PM, Ira Weiny wrote:
> >>> On Tue, May 19, 2020 at 12:42:15PM -0700, Guenter Roeck wrote:
> >>>> On Tue, May 19, 2020 at 11:40:32AM -0700, Ira Weiny wrote:
> >>>>> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> >>>>>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com
> wrote:
> >>>>>>> From: Ira Weiny <ira.weiny@intel.com>
> >>>>>>>
> >>>>>>> The kunmap_atomic clean up failed to remove one set of
> >>>>>>> pagefault/preempt enables when vaddr is not in the fixmap.
> >>>>>>>
> >>>>>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate
> >>>>>>> code")
> >>>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >>>>>>
> >>>>>> microblazeel works with this patch,
> >>>>>
> >>>>> Awesome...  Andrew in my rush yesterday I should have put a
> >>>>> reported by on the patch for Guenter as well.
> >>>>>
> >>>>> Sorry about that Guenter,
> >>>>
> >>>> No worries.
> >>>>
> >>>>> Ira
> >>>>>
> >>>>>> as do the nosmp sparc32 boot tests, but sparc32 boot tests with
> >>>>>> SMP enabled still fail with lots of messages such as:
> >>>>>>
> >>>>>> BUG: Bad page state in process swapper/0  pfn:006a1
> >>>>>> page:f0933420 refcount:0 mapcount:1 mapping:(ptrval) index:0x1
> >>>>>> flags: 0x0()
> >>>>>> raw: 00000000 00000100 00000122 00000000 00000001 00000000
> >>>>>> 00000000 00000000 page dumped because: nonzero mapcount
> Modules
> >>>>>> linked in:
> >>>>>> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B             5.7.0-rc6-next-
> 20200518-00002-gb178d2d56f29 #1
> >>>>>> [f00e7ab8 :
> >>>>>> bad_page+0xa8/0x108 ]
> >>>>>> [f00e8b54 :
> >>>>>> free_pcppages_bulk+0x154/0x52c ]
> >>>>>> [f00ea024 :
> >>>>>> free_unref_page+0x54/0x6c ]
> >>>>>> [f00ed864 :
> >>>>>> free_reserved_area+0x58/0xec ]
> >>>>>> [f0527104 :
> >>>>>> kernel_init+0x14/0x110 ]
> >>>>>> [f000b77c :
> >>>>>> ret_from_kernel_thread+0xc/0x38 ]
> >>>>>> [00000000 :
> >>>>>> 0x0 ]
> >>>>>>
> >>>>>> Code path leading to that message is different but always the
> >>>>>> same from free_unref_page().
> >>>
> >>> Actually it occurs to me that the patch consolidating kmap_prot is
> >>> odd for sparc 32 bit...
> >>>
> >>> Its a long shot but could you try reverting this patch?
> >>>
> >>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> >>>
> >>
> >> That is not easy to revert, unfortunately, due to several follow-up
> patches.
> >
> > I have gotten your sparc tests to run and they all pass...
> >
> > 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh
> > Build reference: v5.7-rc4-17-g852b6f2edc0f
> >
> 
> That doesn't look like it is linux-next, which I guess means that something
> else in linux-next breaks it. What is your qemu version ?

Ah yea that was just 5.7-rc4 with my patch set applied.  Yes must be something else or an interaction with my patch set.

Did I see another email with Mike which may fix this?

Ira

> 
> Thanks,
> Guenter
> 
> > Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> > Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> > Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> >
> > Is there another test I need to run?
> >
> > Ira
> >
> >
> >>
> >> Guenter
> >>
> >>> Alternately I will need to figure out how to run the sparc on qemu here...
> >>>
> >>> Thanks very much for all the testing though!  :-D
> >>>
> >>> Ira
> >>>
> >>>>>>
> >>>>>> Still testing ppc images.
> >>>>>>
> >>>>
> >>>> ppc image tests are passing with this patch.
> >>>>
> >>>> Guenter
> >>
Al Viro May 21, 2020, 10:46 p.m. UTC | #13
On Thu, May 21, 2020 at 03:20:46PM -0700, Guenter Roeck wrote:
> On 5/21/20 10:27 AM, Al Viro wrote:
> > On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> >> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> >>> From: Ira Weiny <ira.weiny@intel.com>
> >>>
> >>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> >>> enables when vaddr is not in the fixmap.
> >>>
> >>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> >>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> >>
> >> microblazeel works with this patch, as do the nosmp sparc32 boot tests,
> >> but sparc32 boot tests with SMP enabled still fail with lots of messages
> >> such as:
> > 
> > BTW, what's your setup for sparc32 boot tests?  IOW, how do you manage to
> > shrink the damn thing enough to have the loader cope with it?  I hadn't
> > been able to do that for the current mainline ;-/
> > 
> 
> defconfig seems to work just fine, even after enabling various debug
> and file system options.

The hell?  How do you manage to get the kernel in?  sparc32_defconfig
ends up with 5316876 bytes unpacked...
Al Viro May 22, 2020, 12:46 a.m. UTC | #14
On Thu, May 21, 2020 at 11:46:12PM +0100, Al Viro wrote:
> On Thu, May 21, 2020 at 03:20:46PM -0700, Guenter Roeck wrote:
> > On 5/21/20 10:27 AM, Al Viro wrote:
> > > On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
> > >> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
> > >>> From: Ira Weiny <ira.weiny@intel.com>
> > >>>
> > >>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
> > >>> enables when vaddr is not in the fixmap.
> > >>>
> > >>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
> > >>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > >>
> > >> microblazeel works with this patch, as do the nosmp sparc32 boot tests,
> > >> but sparc32 boot tests with SMP enabled still fail with lots of messages
> > >> such as:
> > > 
> > > BTW, what's your setup for sparc32 boot tests?  IOW, how do you manage to
> > > shrink the damn thing enough to have the loader cope with it?  I hadn't
> > > been able to do that for the current mainline ;-/
> > > 
> > 
> > defconfig seems to work just fine, even after enabling various debug
> > and file system options.
> 
> The hell?  How do you manage to get the kernel in?  sparc32_defconfig
> ends up with 5316876 bytes unpacked...

Incidentally, trying to load it via -kernel/-initrd leads to
Configuration device id QEMU version 1 machine id 64
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 15 offset 0
Invalid FCode start byte
CPUs: 1 x TI,TMS390Z55
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2018 19:17
  Type 'help' for detailed information
[sparc] Kernel already loaded
switching to new context:
PROMLIB: obio_ranges 1
PROMLIB: Sun Boot Prom Version 3 Revision 2
Linux version 5.7.0-rc1-00002-gcf51e129b968 (al@duke) (gcc version 6.3.0 20170516 (Debian 6.3.0-18), GNU ld (GNU Binutils for Debian) 2.28) #32 Thu May 21 18:36:07 EDT 2020
printk: bootconsole [earlyprom0] enabled
ARCH: SUN4M
TYPE: Sun4m SparcStation10/20
Ethernet address: 52:54:00:12:34:56
303MB HIGHMEM available.
OF stdout device is: /obio/zs@0,100000:a
PROM: Built device tree with 30051 bytes of memory.
Booting Linux...
Power off control detected.
Kernel panic - not syncing: Failed to allocate memory for percpu areas.
CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc1-00002-gcf51e129b968 #32
[f04f92a8 : 
setup_per_cpu_areas+0x58/0x90 ] 
[f04edbf4 : 
start_kernel+0xc0/0x4a0 ] 
[f04ed43c : 
continue_boot+0x324/0x334 ] 
[00000000 : 
0x0 ] 

Press Stop-A (L1-A) from sun keyboard or send break
twice on console to return to the boot prom
---[ end Kernel panic - not syncing: Failed to allocate memory for percpu areas. ]---

Giving guest more RAM doesn't change the outcome (well, the number HIGHMEM line is
obviously higher, but that's it).

So which sparc32 kernel have you booted with defconfig and how have you done
that?
Guenter Roeck May 22, 2020, 1:11 a.m. UTC | #15
On 5/21/20 5:46 PM, Al Viro wrote:
> On Thu, May 21, 2020 at 11:46:12PM +0100, Al Viro wrote:
>> On Thu, May 21, 2020 at 03:20:46PM -0700, Guenter Roeck wrote:
>>> On 5/21/20 10:27 AM, Al Viro wrote:
>>>> On Tue, May 19, 2020 at 09:54:22AM -0700, Guenter Roeck wrote:
>>>>> On Mon, May 18, 2020 at 11:48:43AM -0700, ira.weiny@intel.com wrote:
>>>>>> From: Ira Weiny <ira.weiny@intel.com>
>>>>>>
>>>>>> The kunmap_atomic clean up failed to remove one set of pagefault/preempt
>>>>>> enables when vaddr is not in the fixmap.
>>>>>>
>>>>>> Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
>>>>>> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>>>>>
>>>>> microblazeel works with this patch, as do the nosmp sparc32 boot tests,
>>>>> but sparc32 boot tests with SMP enabled still fail with lots of messages
>>>>> such as:
>>>>
>>>> BTW, what's your setup for sparc32 boot tests?  IOW, how do you manage to
>>>> shrink the damn thing enough to have the loader cope with it?  I hadn't
>>>> been able to do that for the current mainline ;-/
>>>>
>>>
>>> defconfig seems to work just fine, even after enabling various debug
>>> and file system options.
>>
>> The hell?  How do you manage to get the kernel in?  sparc32_defconfig
>> ends up with 5316876 bytes unpacked...
> 
> Incidentally, trying to load it via -kernel/-initrd leads to
> Configuration device id QEMU version 1 machine id 64
> Probing SBus slot 0 offset 0
> Probing SBus slot 1 offset 0
> Probing SBus slot 2 offset 0
> Probing SBus slot 3 offset 0
> Probing SBus slot 15 offset 0
> Invalid FCode start byte
> CPUs: 1 x TI,TMS390Z55
> UUID: 00000000-0000-0000-0000-000000000000
> Welcome to OpenBIOS v1.1 built on Dec 27 2018 19:17
>   Type 'help' for detailed information
> [sparc] Kernel already loaded
> switching to new context:
> PROMLIB: obio_ranges 1
> PROMLIB: Sun Boot Prom Version 3 Revision 2
> Linux version 5.7.0-rc1-00002-gcf51e129b968 (al@duke) (gcc version 6.3.0 20170516 (Debian 6.3.0-18), GNU ld (GNU Binutils for Debian) 2.28) #32 Thu May 21 18:36:07 EDT 2020
> printk: bootconsole [earlyprom0] enabled
> ARCH: SUN4M
> TYPE: Sun4m SparcStation10/20
> Ethernet address: 52:54:00:12:34:56
> 303MB HIGHMEM available.
> OF stdout device is: /obio/zs@0,100000:a
> PROM: Built device tree with 30051 bytes of memory.
> Booting Linux...
> Power off control detected.
> Kernel panic - not syncing: Failed to allocate memory for percpu areas.
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc1-00002-gcf51e129b968 #32
> [f04f92a8 : 
> setup_per_cpu_areas+0x58/0x90 ] 
> [f04edbf4 : 
> start_kernel+0xc0/0x4a0 ] 
> [f04ed43c : 
> continue_boot+0x324/0x334 ] 
> [00000000 : 
> 0x0 ] 
> 
> Press Stop-A (L1-A) from sun keyboard or send break
> twice on console to return to the boot prom
> ---[ end Kernel panic - not syncing: Failed to allocate memory for percpu areas. ]---
> 
> Giving guest more RAM doesn't change the outcome (well, the number HIGHMEM line is
> obviously higher, but that's it).
> 
> So which sparc32 kernel have you booted with defconfig and how have you done
> that?
> 

Mainline, with:

qemu-system-sparc -M SS-4 -kernel arch/sparc/boot/zImage -no-reboot \
	-snapshot -drive file=rootfs.ext2,format=raw,if=scsi \
	-append "panic=-1 slub_debug=FZPUA root=/dev/sda console=ttyS0"
	-nographic -monitor none

The machine doesn't really matter, though. The root file system is built
with buildroot.

Note that I carry two reverts in my qemu images.

Revert "tcx: switch to load_image_mr() and remove prom_addr hack"
Revert "cg3: switch to load_image_mr() and remove prom-addr hack"

I have been carrying those since ~2017. I didn't check recently
if they are still needed.

If sparc32 is no longer supported in the upstream kernel,
would it possibly make sense remove its support ?

Guenter
Al Viro May 22, 2020, 1:29 a.m. UTC | #16
On Thu, May 21, 2020 at 06:11:08PM -0700, Guenter Roeck wrote:

> Mainline, with:
> 
> qemu-system-sparc -M SS-4 -kernel arch/sparc/boot/zImage -no-reboot \
> 	-snapshot -drive file=rootfs.ext2,format=raw,if=scsi \
> 	-append "panic=-1 slub_debug=FZPUA root=/dev/sda console=ttyS0"
> 	-nographic -monitor none
> 
> The machine doesn't really matter, though.

It does, unfortunately - try that with SS-10 and watch what happens ;-/
Al Viro May 22, 2020, 1:35 a.m. UTC | #17
On Fri, May 22, 2020 at 02:29:50AM +0100, Al Viro wrote:
> On Thu, May 21, 2020 at 06:11:08PM -0700, Guenter Roeck wrote:
> 
> > Mainline, with:
> > 
> > qemu-system-sparc -M SS-4 -kernel arch/sparc/boot/zImage -no-reboot \
> > 	-snapshot -drive file=rootfs.ext2,format=raw,if=scsi \
> > 	-append "panic=-1 slub_debug=FZPUA root=/dev/sda console=ttyS0"
> > 	-nographic -monitor none
> > 
> > The machine doesn't really matter, though.
> 
> It does, unfortunately - try that with SS-10 and watch what happens ;-/

Ugh...  It's actually something in -m handling: -m 256 passes, -m 512
leads to that panic.
Guenter Roeck May 22, 2020, 2:28 a.m. UTC | #18
On Fri, May 22, 2020 at 02:35:23AM +0100, Al Viro wrote:
> On Fri, May 22, 2020 at 02:29:50AM +0100, Al Viro wrote:
> > On Thu, May 21, 2020 at 06:11:08PM -0700, Guenter Roeck wrote:
> > 
> > > Mainline, with:
> > > 
> > > qemu-system-sparc -M SS-4 -kernel arch/sparc/boot/zImage -no-reboot \
> > > 	-snapshot -drive file=rootfs.ext2,format=raw,if=scsi \
> > > 	-append "panic=-1 slub_debug=FZPUA root=/dev/sda console=ttyS0"
> > > 	-nographic -monitor none
> > > 
> > > The machine doesn't really matter, though.
> > 
> > It does, unfortunately - try that with SS-10 and watch what happens ;-/
> 
> Ugh...  It's actually something in -m handling: -m 256 passes, -m 512
> leads to that panic.

Default seems to be 128M. Anyway, see below log for SS-10.

Guenter

---
Configuration device id QEMU version 1 machine id 64
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 15 offset 0
Invalid FCode start byte
CPUs: 1 x TI,TMS390Z55
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Oct 28 2019 17:08
  Type 'help' for detailed information
[sparc] Kernel already loaded
switching to new context:
PROMLIB: obio_ranges 1
PROMLIB: Sun Boot Prom Version 3 Revision 2
Linux version 5.7.0-rc6-00026-g03fb3acae4be (groeck@server.roeck-us.net) (gcc version 6.5.0 (Buildroot 2018.11-rc2-00071-g4310260), GNU ld (GNU Binutils) 2.31.1) #1 Thu May 21 19:17:48 PDT 2020
printk: bootconsole [earlyprom0] enabled
ARCH: SUN4M
TYPE: Sun4m SparcStation10/20
Ethernet address: 52:54:00:12:34:56
OF stdout device is: /obio/zs@0,100000:a
PROM: Built device tree with 30586 bytes of memory.
Booting Linux...
Power off control detected.
Built 1 zonelists, mobility grouping on.  Total pages: 25012
Kernel command line: panic=-1 slub_debug=FZPUA root=/dev/sda console=ttyS0
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
Sorting __ex_table...
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 103428K/100944K available (5050K kernel code, 178K rwdata, 1212K rodata, 176K init, 158K bss, 4294964812K reserved, 0K cma-reserved, 0K highmem)
NR_IRQS: 64
Console: colour dummy device 80x25
------------------------
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
                 A-B-B-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
             A-B-B-C-C-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
             A-B-C-A-B-C deadlock:failed|failed|  ok  |failed|failed|failed|failed|
         A-B-B-C-C-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
         A-B-C-D-B-D-D-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
         A-B-C-D-B-C-D-A deadlock:failed|failed|  ok  |failed|failed|failed|failed|
                    double unlock:  ok  |  ok  |failed|  ok  |failed|failed|  ok  |
                  initialize held:failed|failed|failed|failed|failed|failed|failed|
  --------------------------------------------------------------------------
              recursive read-lock:             |  ok  |             |failed|
           recursive read-lock #2:             |  ok  |             |failed|
            mixed read-write-lock:             |failed|             |failed|
            mixed write-read-lock:             |failed|             |failed|
  mixed read-lock/lock-write ABBA:             |failed|             |failed|
   mixed read-lock/lock-read ABBA:             |  ok  |             |failed|
 mixed write-lock/lock-write ABBA:             |failed|             |failed|
  --------------------------------------------------------------------------
     hard-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/12:failed|failed|  ok  |
     hard-irqs-on + irq-safe-A/21:failed|failed|  ok  |
     soft-irqs-on + irq-safe-A/21:failed|failed|  ok  |
       sirq-safe-A => hirqs-on/12:failed|failed|  ok  |
       sirq-safe-A => hirqs-on/21:failed|failed|  ok  |
         hard-safe-A + irqs-on/12:failed|failed|  ok  |
         soft-safe-A + irqs-on/12:failed|failed|  ok  |
         hard-safe-A + irqs-on/21:failed|failed|  ok  |
         soft-safe-A + irqs-on/21:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/123:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/123:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/132:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/132:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/213:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/213:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/231:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/231:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/312:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/312:failed|failed|  ok  |
    hard-safe-A + unsafe-B #1/321:failed|failed|  ok  |
    soft-safe-A + unsafe-B #1/321:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/123:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/123:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/132:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/132:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/213:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/213:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/231:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/231:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/312:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/312:failed|failed|  ok  |
    hard-safe-A + unsafe-B #2/321:failed|failed|  ok  |
    soft-safe-A + unsafe-B #2/321:failed|failed|  ok  |
      hard-irq lock-inversion/123:failed|failed|  ok  |
      soft-irq lock-inversion/123:failed|failed|  ok  |
      hard-irq lock-inversion/132:failed|failed|  ok  |
      soft-irq lock-inversion/132:failed|failed|  ok  |
      hard-irq lock-inversion/213:failed|failed|  ok  |
      soft-irq lock-inversion/213:failed|failed|  ok  |
      hard-irq lock-inversion/231:failed|failed|  ok  |
      soft-irq lock-inversion/231:failed|failed|  ok  |
      hard-irq lock-inversion/312:failed|failed|  ok  |
      soft-irq lock-inversion/312:failed|failed|  ok  |
      hard-irq lock-inversion/321:failed|failed|  ok  |
      soft-irq lock-inversion/321:failed|failed|  ok  |
      hard-irq read-recursion/123:  ok  |
      soft-irq read-recursion/123:  ok  |
      hard-irq read-recursion/132:  ok  |
      soft-irq read-recursion/132:  ok  |
      hard-irq read-recursion/213:  ok  |
      soft-irq read-recursion/213:  ok  |
      hard-irq read-recursion/231:  ok  |
      soft-irq read-recursion/231:  ok  |
      hard-irq read-recursion/312:  ok  |
      soft-irq read-recursion/312:  ok  |
      hard-irq read-recursion/321:  ok  |
      soft-irq read-recursion/321:  ok  |
  --------------------------------------------------------------------------
  | Wound/wait tests |
  ---------------------
                  ww api failures:  ok  |  ok  |  ok  |
               ww contexts mixing:failed|  ok  |
             finishing ww context:  ok  |  ok  |  ok  |  ok  |
               locking mismatches:  ok  |  ok  |  ok  |
                 EDEADLK handling:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
           spinlock nest unlocked:failed|
  -----------------------------------------------------
                                 |block | try  |context|
  -----------------------------------------------------
                          context:failed|  ok  |  ok  |
                              try:failed|  ok  |failed|
                            block:failed|  ok  |failed|
                         spinlock:failed|  ok  |failed|
--------------------------------------------------------
164 out of 262 testcases failed, as expected. |
----------------------------------------------------
clocksource: timer_cs: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 1763180808480 ns
Calibrating delay loop... 518.14 BogoMIPS (lpj=1036288)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
devtmpfs: initialized
random: get_random_u32 called from bucket_table_alloc.isra.22+0x50/0x1e4 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 256 (order: 0, 7168 bytes, linear)
NET: Registered protocol family 16
IOMMU: impl 0 vers 3 table 0x(ptrval)[262144 B] map [65536 b]
vgaarb: loaded
SCSI subsystem initialized
clocksource: Switched to clocksource timer_cs
NET: Registered protocol family 2
tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 6144 bytes, linear)
TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
TCP bind hash table entries: 1024 (order: 2, 20480 bytes, linear)
TCP: Hash tables configured (established 1024 bind 1024)
UDP hash table entries: 256 (order: 1, 12288 bytes, linear)
UDP-Lite hash table entries: 256 (order: 1, 12288 bytes, linear)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
PCI: CLS 0 bytes, default 32
apc: power management initialized
random: fast init done
workingset: timestamp_bits=30 max_order=15 bucket_order=0
squashfs: version 4.0 (2009/01/31) Phillip Lougher
jitterentropy: Initialization failed with host not compliant with requirements: 2
io scheduler mq-deadline registered
io scheduler kyber registered
String selftests succeeded
test_firmware: interface ready
test passed
test_bitmap: loaded.
test_bitmap: parselist: 14: input is '0-2047:128/256' OK, Time: 2000
test_bitmap: parselist_user: 14: input is '0-2047:128/256' OK, Time: 3000
test_bitmap: all 1663 tests passed
test_uuid: all 18 tests passed
crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
crc32: self tests passed, processed 225944 bytes in 450500 nsec
crc32c: CRC_LE_BITS = 64
crc32c: self tests passed, processed 225944 bytes in 249000 nsec
crc32_combine: 8373 self tests passed
crc32c_combine: 8373 self tests passed
rbtree testing
 -> test 1 (latency of nnodes insert+delete): 0 cycles
 -> test 2 (latency of nnodes cached insert+delete): 0 cycles
 -> test 3 (latency of inorder traversal): 0 cycles
 -> test 4 (latency to fetch first node)
        non-cached: 0 cycles
        cached: 0 cycles
augmented rbtree testing
 -> test 1 (latency of nnodes insert+delete): 0 cycles
 -> test 2 (latency of nnodes cached insert+delete): 0 cycles
interval tree insert/remove
 -> 0 cycles
interval tree search
 -> 0 cycles (2692 results)
ffd374f8: ttyS0 at MMIO 0xf1100000 (irq = 5, base_baud = 307200) is a zs
Console: ttyS0 (SunZilog zs0)
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [earlyprom0] disabled
printk: bootconsole [earlyprom0] disabled
ffd374f8: ttyS1 at MMIO 0xf1100004 (irq = 5, base_baud = 307200) is a zs
ffd37770: Keyboard at MMIO 0xf1000000 (irq = 5) is a zs
ffd37770: Mouse at MMIO 0xf1000004 (irq = 5) is a zs
brd: module loaded
esp ffd392ec: esp0: regs[(ptrval):(ptrval)] irq[2]
esp ffd392ec: esp0: is a FAS100A, 40 MHz (ccf=0), SCSI ID 7
scsi host0: esp
scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
scsi target0:0:0: Beginning Domain Validation
scsi target0:0:0: Domain Validation skipping write tests
scsi target0:0:0: Ending Domain Validation
scsi 0:0:2:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI: 5
scsi target0:0:2: Beginning Domain Validation
scsi target0:0:2: Domain Validation skipping write tests
scsi target0:0:2: Ending Domain Validation
sr 0:0:2:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
cdrom: Uniform CD-ROM driver Revision: 3.20
sd 0:0:0:0: [sda] 40960 512-byte logical blocks: (21.0 MB/20.0 MiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] Attached SCSI disk
ioremap: done with statics, switching to malloc
SunLance: using auto-carrier-detection.
eth0: LANCE 52:54:00:12:34:56
rtc-m48t59 rtc-m48t59.0: IRQ index 0 not found
rtc-m48t59 rtc-m48t59.0: registered as rtc0
rtc-m48t59 rtc-m48t59.0: setting system clock to 2020-05-22T02:19:50 UTC (1590113990)
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
TAP version 14
    # Subtest: sysctl_test
    1..10
    ok 1 - sysctl_test_api_dointvec_null_tbl_data
    ok 2 - sysctl_test_api_dointvec_table_maxlen_unset
    ok 3 - sysctl_test_api_dointvec_table_len_is_zero
    ok 4 - sysctl_test_api_dointvec_table_read_but_position_set
    ok 5 - sysctl_test_dointvec_read_happy_single_positive
    ok 6 - sysctl_test_dointvec_read_happy_single_negative
    ok 7 - sysctl_test_dointvec_write_happy_single_positive
    ok 8 - sysctl_test_dointvec_write_happy_single_negative
    ok 9 - sysctl_test_api_dointvec_write_single_less_int_min
    ok 10 - sysctl_test_api_dointvec_write_single_greater_int_max
ok 1 - sysctl_test
    # Subtest: ext4_inode_test
    1..1
    ok 1 - inode_test_xtimestamp_decoding
ok 2 - ext4_inode_test
    # Subtest: kunit-try-catch-test
    1..2
    ok 1 - kunit_test_try_catch_successful_try_no_catch
    ok 2 - kunit_test_try_catch_unsuccessful_try_does_catch
ok 3 - kunit-try-catch-test
    # Subtest: kunit-resource-test
    1..5
    ok 1 - kunit_resource_test_init_resources
    ok 2 - kunit_resource_test_alloc_resource
    ok 3 - kunit_resource_test_destroy_resource
    ok 4 - kunit_resource_test_cleanup_resources
    ok 5 - kunit_resource_test_proper_free_ordering
ok 4 - kunit-resource-test
    # Subtest: kunit-log-test
    1..1
put this in log.
this too.
add to suite log.
along with this.
    ok 1 - kunit_log_test
ok 5 - kunit-log-test
    # Subtest: string-stream-test
    1..3
    ok 1 - string_stream_test_empty_on_creation
    ok 2 - string_stream_test_not_empty_after_add
    ok 3 - string_stream_test_get_string
ok 6 - string-stream-test
    # Subtest: list-kunit-test
    1..36
    ok 1 - list_test_list_init
    ok 2 - list_test_list_add
    ok 3 - list_test_list_add_tail
    ok 4 - list_test_list_del
    ok 5 - list_test_list_replace
    ok 6 - list_test_list_replace_init
    ok 7 - list_test_list_swap
    ok 8 - list_test_list_del_init
    ok 9 - list_test_list_move
    ok 10 - list_test_list_move_tail
    ok 11 - list_test_list_bulk_move_tail
    ok 12 - list_test_list_is_first
    ok 13 - list_test_list_is_last
    ok 14 - list_test_list_empty
    ok 15 - list_test_list_empty_careful
    ok 16 - list_test_list_rotate_left
    ok 17 - list_test_list_rotate_to_front
    ok 18 - list_test_list_is_singular
    ok 19 - list_test_list_cut_position
    ok 20 - list_test_list_cut_before
    ok 21 - list_test_list_splice
    ok 22 - list_test_list_splice_tail
    ok 23 - list_test_list_splice_init
    ok 24 - list_test_list_splice_tail_init
    ok 25 - list_test_list_entry
    ok 26 - list_test_list_first_entry
    ok 27 - list_test_list_last_entry
    ok 28 - list_test_list_first_entry_or_null
    ok 29 - list_test_list_next_entry
    ok 30 - list_test_list_prev_entry
    ok 31 - list_test_list_for_each
    ok 32 - list_test_list_for_each_prev
    ok 33 - list_test_list_for_each_safe
    ok 34 - list_test_list_for_each_prev_safe
    ok 35 - list_test_list_for_each_entry
    ok 36 - list_test_list_for_each_entry_reverse
ok 7 - list-kunit-test
    # Subtest: qos-kunit-test
    1..3
    ok 1 - freq_qos_test_min
    ok 2 - freq_qos_test_maxdef
    ok 3 - freq_qos_test_readd
ok 8 - qos-kunit-test
EXT4-fs (sda): mounted filesystem without journal. Opts: (null)
VFS: Mounted root (ext4 filesystem) readonly on device 8:0.
devtmpfs: mounted
Freeing unused kernel memory: 176K
This architecture does not have kernel memory protection.
Run /sbin/init as init process
EXT4-fs (sda): re-mounted. Opts: (null)
ext4 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
Starting syslogd: OK
Starting klogd: OK
Initializing random number generator... random: dd: uninitialized urandom read (512 bytes read)
done.
Starting network: OK
Found console ttyS0

Linux version 5.7.0-rc6-00026-g03fb3acae4be (groeck@server.roeck-us.net) (gcc version 6.5.0 (Buildroot 2018.11-rc2-00071-g4310260), GNU ld (GNU Binutils) 2.31.1) #1 Thu May 21 19:17:48 PDT 2020
Boot successful.
Rebooting
Found console ttyS0
Stopping network: OK
Saving random seed... random: dd: uninitialized urandom read (512 bytes read)
done.
Stopping klogd: OK
Stopping syslogd: OK
umount: devtmpfs busy - remounted read-only
EXT4-fs (sda): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
sd 0:0:0:0: [sda] Synchronizing SCSI cache
reboot: Restarting system
Andrew Morton June 3, 2020, 8:57 p.m. UTC | #19
On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:

> > > 
> > > Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > > sparc 32 bit...
> > > 
> > > Its a long shot but could you try reverting this patch?
> > > 
> > > 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > > 
> > 
> > That is not easy to revert, unfortunately, due to several follow-up patches.
> 
> I have gotten your sparc tests to run and they all pass...
> 
> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> Build reference: v5.7-rc4-17-g852b6f2edc0f
> 
> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> 
> Is there another test I need to run?

This all petered out, but as I understand it, this patchset still might
have issues on various architectures.

Can folks please provide an update on the testing status?
Ira Weiny June 3, 2020, 9:14 p.m. UTC | #20
On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> 
> > > > 
> > > > Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > > > sparc 32 bit...
> > > > 
> > > > Its a long shot but could you try reverting this patch?
> > > > 
> > > > 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > > > 
> > > 
> > > That is not easy to revert, unfortunately, due to several follow-up patches.
> > 
> > I have gotten your sparc tests to run and they all pass...
> > 
> > 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> > Build reference: v5.7-rc4-17-g852b6f2edc0f
> > 
> > Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> > Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> > Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> > Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> > Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> > Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> > Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> > 
> > Is there another test I need to run?
> 
> This all petered out, but as I understand it, this patchset still might
> have issues on various architectures.
> 
> Can folks please provide an update on the testing status?

I believe the tests were failing for Guenter due to another patch set...[1]

My tests with just this series are working.

From my understanding the other failures were unrelated.[2]

	<quote Mike Rapoport>
	I've checked the patch above on top of the mmots which already has
	Ira's patches and it booted fine. I've used sparc32_defconfig to build
	the kernel and qemu-system-sparc with default machine and CPU.
	</quote>

Mike, am I wrong?  Do you think the kmap() patches are still causing issues?

Ira

[1] https://lore.kernel.org/lkml/2807E5FD2F6FDA4886F6618EAC48510E92EAB1DE@CRSMSX101.amr.corp.intel.com/
[2] https://lore.kernel.org/lkml/20200520195110.GH1118872@kernel.org/
Guenter Roeck June 3, 2020, 11:44 p.m. UTC | #21
On 6/3/20 2:14 PM, Ira Weiny wrote:
> On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
>> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
>>
>>>>>
>>>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
>>>>> sparc 32 bit...
>>>>>
>>>>> Its a long shot but could you try reverting this patch?
>>>>>
>>>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
>>>>>
>>>>
>>>> That is not easy to revert, unfortunately, due to several follow-up patches.
>>>
>>> I have gotten your sparc tests to run and they all pass...
>>>
>>> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
>>> Build reference: v5.7-rc4-17-g852b6f2edc0f
>>>
>>> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
>>> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
>>> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
>>> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
>>> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
>>> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
>>>
>>> Is there another test I need to run?
>>
>> This all petered out, but as I understand it, this patchset still might
>> have issues on various architectures.
>>
>> Can folks please provide an update on the testing status?
> 
> I believe the tests were failing for Guenter due to another patch set...[1]
> 
> My tests with just this series are working.
> 
>>From my understanding the other failures were unrelated.[2]
> 
> 	<quote Mike Rapoport>
> 	I've checked the patch above on top of the mmots which already has
> 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> 	the kernel and qemu-system-sparc with default machine and CPU.
> 	</quote>
> 
> Mike, am I wrong?  Do you think the kmap() patches are still causing issues?
> 

For my part, all I can say is that -next is in pretty bad shape right now.
The summary of my tests says:

Build results:
	total: 151 pass: 130 fail: 21
Qemu test results:
	total: 430 pass: 375 fail: 55

sparc32 smp images in next-20200603 still crash for me with a spinlock
recursion. s390 images hang early in boot. Several others (alpha, arm64,
various ppc) don't even compile. I can run some more bisects over time,
but this is becoming a full-time job :-(.

Guenter
Mike Rapoport June 4, 2020, 6:18 a.m. UTC | #22
On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> On 6/3/20 2:14 PM, Ira Weiny wrote:
> > On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
> >> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> >>
> >>>>>
> >>>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
> >>>>> sparc 32 bit...
> >>>>>
> >>>>> Its a long shot but could you try reverting this patch?
> >>>>>
> >>>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> >>>>>
> >>>>
> >>>> That is not easy to revert, unfortunately, due to several follow-up patches.
> >>>
> >>> I have gotten your sparc tests to run and they all pass...
> >>>
> >>> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> >>> Build reference: v5.7-rc4-17-g852b6f2edc0f
> >>>
> >>> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> >>> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> >>> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> >>> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> >>> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> >>> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> >>> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> >>> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> >>> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> >>>
> >>> Is there another test I need to run?
> >>
> >> This all petered out, but as I understand it, this patchset still might
> >> have issues on various architectures.
> >>
> >> Can folks please provide an update on the testing status?
> > 
> > I believe the tests were failing for Guenter due to another patch set...[1]
> > 
> > My tests with just this series are working.
> > 
> >>From my understanding the other failures were unrelated.[2]
> > 
> > 	<quote Mike Rapoport>
> > 	I've checked the patch above on top of the mmots which already has
> > 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> > 	the kernel and qemu-system-sparc with default machine and CPU.
> > 	</quote>
> > 
> > Mike, am I wrong?  Do you think the kmap() patches are still causing issues?

sparc32 UP and microblaze work for me with next-20200603, but I didn't
test other architectures. 
 
> For my part, all I can say is that -next is in pretty bad shape right now.
> The summary of my tests says:
> 
> Build results:
> 	total: 151 pass: 130 fail: 21
> Qemu test results:
> 	total: 430 pass: 375 fail: 55
> 
> sparc32 smp images in next-20200603 still crash for me with a spinlock
> recursion.

I think this is because Will's fixes [1] are not yet in -next.

> s390 images hang early in boot. Several others (alpha, arm64,
> various ppc) don't even compile. I can run some more bisects over time,
> but this is becoming a full-time job :-(.
> 
> Guenter

[1] https://lore.kernel.org/lkml/20200526173302.377-1-will@kernel.org
Ira Weiny June 4, 2020, 6:22 a.m. UTC | #23
On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> On 6/3/20 2:14 PM, Ira Weiny wrote:
> > On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
> >> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> >>

...

> >>
> >> This all petered out, but as I understand it, this patchset still might
> >> have issues on various architectures.
> >>
> >> Can folks please provide an update on the testing status?
> > 
> > I believe the tests were failing for Guenter due to another patch set...[1]
> > 
> > My tests with just this series are working.
> > 
> >>From my understanding the other failures were unrelated.[2]
> > 
> > 	<quote Mike Rapoport>
> > 	I've checked the patch above on top of the mmots which already has
> > 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> > 	the kernel and qemu-system-sparc with default machine and CPU.
> > 	</quote>
> > 
> > Mike, am I wrong?  Do you think the kmap() patches are still causing issues?
> > 
> 
> For my part, all I can say is that -next is in pretty bad shape right now.
> The summary of my tests says:
> 
> Build results:
> 	total: 151 pass: 130 fail: 21
> Qemu test results:
> 	total: 430 pass: 375 fail: 55
> 
> sparc32 smp images in next-20200603 still crash for me with a spinlock
> recursion. s390 images hang early in boot. Several others (alpha, arm64,
> various ppc) don't even compile. I can run some more bisects over time,
> but this is becoming a full-time job :-(.
> 

I'm not sure what the process here is.  I just applied my series[1] on
Linus' Master branch[2] and ran sparc32 and s290 from your tests.

sparc32: (passes)

21:43:49 > /home/iweiny/dev/linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
Build reference: v5.7-7188-g67a7a97e8a0f

Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:nosmp:initrd ... running ......... passed
Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ...... .... passed
Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed


s390: (does not compile)

<stdin>:1511:2: warning: #warning syscall clone3 not implemented [-Wcpp]
In file included from ./arch/sparc/include/asm/bug.h:6:0,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/mmdebug.h:5,
                 from ./include/linux/mm.h:9,
                 from mm/huge_memory.c:8:
mm/huge_memory.c: In function 'hugepage_init':
./include/linux/compiler.h:403:38: error: call to '__compiletime_assert_127' declared with attribute error: BUILD_BUG_ON failed: ((13 + (13-3))-13) >= 9
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
./include/linux/compiler.h:384:4: note: in definition of macro '__compiletime_assert'
    prefix ## suffix();    \
    ^~~~~~
./include/linux/compiler.h:403:2: note: in expansion of macro '_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
  ^~~~~~~~~~~~~~~~
./include/linux/bug.h:24:4: note: in expansion of macro 'BUILD_BUG_ON'
    BUILD_BUG_ON(cond);             \
    ^~~~~~~~~~~~
mm/huge_memory.c:403:2: note: in expansion of macro 'MAYBE_BUILD_BUG_ON'
  MAYBE_BUILD_BUG_ON(HPAGE_PMD_ORDER >= MAX_ORDER);
  ^~~~~~~~~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:267: mm/huge_memory.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:1735: mm] Error 2
make: *** Waiting for unfinished jobs....
------------


The s390 error is the same on Linus' master and linux-next.  So whatever is
causing that has slipped into mainline and/or is something I've broken in the
test scripts.


With linux-next on sparc I too see the spinlock issue; something like:

...
Starting syslogd: BUG: spinlock recursion on CPU#0, S01syslogd/139
 lock: 0xf53ef350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-next-20200603 #1
[f0067d00 : 
do_raw_spin_lock+0xa8/0xd8 ] 
[f00d598c : 
copy_page_range+0x328/0x804 ] 
[f0025c34 : 
dup_mm+0x334/0x434 ] 
[f0027198 : 
copy_process+0x1248/0x12d4 ] 
[f00273b8 : 
_do_fork+0x54/0x30c ] 
[f00276e4 : 
do_fork+0x5c/0x6c ] 
[f000de44 : 
sparc_do_fork+0x18/0x38 ] 
[f000b7f4 : 
do_syscall+0x34/0x40 ] 
[5010cd4c : 
0x5010cd4c ] 

qemu-system-sparc: terminating on signal 15 from pid 2000056 (/bin/bash)
...


FWIW I don't see any of this being an issue with the kmap() code but I agree
things could be cleaner.  How can we back linux-next off a bit?  I'm not an
expert here with how linux-next works.

For example I just picked the latest patch from me within the linux-next tree:

  2e483306d5a8 arch/{mips,sparc,microblaze,powerpc}: don't enable pagefault/preempt twice

And built from there it looks good for sparc.

23:01:31 > /home/iweiny/dev/linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
Build reference: v5.7-719-g2e483306d5a8

Building sparc32:SPARCClassic:nosmp:scsi:hd ... running .......... passed
Building sparc32:SPARCbook:nosmp:scsi:cd ... running .......... passed
Building sparc32:LX:nosmp:noapc:scsi:hd ... running .......... passed
Building sparc32:SS-4:nosmp:initrd ... running .......... passed
Building sparc32:SS-5:nosmp:scsi:hd ... running .......... passed
Building sparc32:SS-10:nosmp:scsi:cd ... running .......... passed
Building sparc32:SS-20:nosmp:scsi:hd ... running .......... passed
Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:smp:scsi:hd ... running ......^[[1;2D... passed
Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed


I'm going to bisect between there and HEAD.


Ira


[1]

67a7a97e8a0f arch/{mips,sparc,microblaze,powerpc}: Don't enable pagefault/preempt twice
4a3dd9ec36d8 kmap: Consolidate kmap_prot definitions
a3b39b1668ac sparc: Remove unnecessary includes
452195c6e8a8 parisc/kmap: Remove duplicate kmap code
317e4af1da94 kmap: Remove kmap_atomic_to_page()
e11e52415a4d drm: Remove drm specific kmap_atomic code
afd4911f0cfb arch/kmap: Define kmap_atomic_prot() for all arch's
2a5524d63341 arch/kmap: Don't hard code kmap_prot values
c94bbaab0296 arch/kmap: Ensure kmap_prot visibility
6f29a6b66d3b arch/kunmap_atomic: Consolidate duplicate code
0c7122ef07d1 arch/kmap_atomic: Consolidate duplicate code
63b8bbf47723 {x86,powerpc,microblaze}/kmap: Move preempt disable
23b3175de76f arch/kunmap: Remove duplicate kunmap implementations
9514dd54fda8 arch/kmap: Remove redundant arch specific kmaps
e92e53c0080b arch/xtensa: Move kmap build bug out of the way
cab1afa4f6ac arch/kmap: Remove BUG_ON()

[2] cb8e59cc8720 (linus/master, linus-master) Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Mike Rapoport June 4, 2020, 6:37 a.m. UTC | #24
On Wed, Jun 03, 2020 at 11:22:26PM -0700, Ira Weiny wrote:
> On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> 
> With linux-next on sparc I too see the spinlock issue; something like:
> 
> ...
> Starting syslogd: BUG: spinlock recursion on CPU#0, S01syslogd/139
>  lock: 0xf53ef350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-next-20200603 #1
> [f0067d00 : 
> do_raw_spin_lock+0xa8/0xd8 ] 
> [f00d598c : 
> copy_page_range+0x328/0x804 ] 
> [f0025c34 : 
> dup_mm+0x334/0x434 ] 
> [f0027198 : 
> copy_process+0x1248/0x12d4 ] 
> [f00273b8 : 
> _do_fork+0x54/0x30c ] 
> [f00276e4 : 
> do_fork+0x5c/0x6c ] 
> [f000de44 : 
> sparc_do_fork+0x18/0x38 ] 
> [f000b7f4 : 
> do_syscall+0x34/0x40 ] 
> [5010cd4c : 
> 0x5010cd4c ] 
> 
> 
> I'm going to bisect between there and HEAD.

The sparc issue should be fixed by 

https://lore.kernel.org/lkml/20200526173302.377-1-will@kernel.org
 
> Ira
Ira Weiny June 4, 2020, 6:43 a.m. UTC | #25
On Thu, Jun 04, 2020 at 09:18:05AM +0300, Mike Rapoport wrote:
> On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> > On 6/3/20 2:14 PM, Ira Weiny wrote:
> > > On Wed, Jun 03, 2020 at 01:57:36PM -0700, Andrew Morton wrote:
> > >> On Thu, 21 May 2020 10:42:50 -0700 Ira Weiny <ira.weiny@intel.com> wrote:
> > >>
> > >>>>>
> > >>>>> Actually it occurs to me that the patch consolidating kmap_prot is odd for
> > >>>>> sparc 32 bit...
> > >>>>>
> > >>>>> Its a long shot but could you try reverting this patch?
> > >>>>>
> > >>>>> 4ea7d2419e3f kmap: consolidate kmap_prot definitions
> > >>>>>
> > >>>>
> > >>>> That is not easy to revert, unfortunately, due to several follow-up patches.
> > >>>
> > >>> I have gotten your sparc tests to run and they all pass...
> > >>>
> > >>> 08:10:34 > ../linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
> > >>> Build reference: v5.7-rc4-17-g852b6f2edc0f
> > >>>
> > >>> Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
> > >>> Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-4:nosmp:initrd ... running ......... passed
> > >>> Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
> > >>> Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
> > >>> Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
> > >>> Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
> > >>> Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
> > >>> Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed
> > >>>
> > >>> Is there another test I need to run?
> > >>
> > >> This all petered out, but as I understand it, this patchset still might
> > >> have issues on various architectures.
> > >>
> > >> Can folks please provide an update on the testing status?
> > > 
> > > I believe the tests were failing for Guenter due to another patch set...[1]
> > > 
> > > My tests with just this series are working.
> > > 
> > >>From my understanding the other failures were unrelated.[2]
> > > 
> > > 	<quote Mike Rapoport>
> > > 	I've checked the patch above on top of the mmots which already has
> > > 	Ira's patches and it booted fine. I've used sparc32_defconfig to build
> > > 	the kernel and qemu-system-sparc with default machine and CPU.
> > > 	</quote>
> > > 
> > > Mike, am I wrong?  Do you think the kmap() patches are still causing issues?
> 
> sparc32 UP and microblaze work for me with next-20200603, but I didn't
> test other architectures. 
>  
> > For my part, all I can say is that -next is in pretty bad shape right now.
> > The summary of my tests says:
> > 
> > Build results:
> > 	total: 151 pass: 130 fail: 21
> > Qemu test results:
> > 	total: 430 pass: 375 fail: 55
> > 
> > sparc32 smp images in next-20200603 still crash for me with a spinlock
> > recursion.
> 
> I think this is because Will's fixes [1] are not yet in -next.
> 
> > s390 images hang early in boot. Several others (alpha, arm64,
> > various ppc) don't even compile. I can run some more bisects over time,
> > but this is becoming a full-time job :-(.
> > 
> > Guenter
> 
> [1] https://lore.kernel.org/lkml/20200526173302.377-1-will@kernel.org

I abandoned the bisect and tested with this fix.[1]  It passes.  Guenter, on
the original thread we had microblaze and ppc working with my fix.

https://lore.kernel.org/lkml/20200519194215.GA71941@roeck-us.net/

Sounds like the current failures above are from something much newer in the
tree.

Ira

[1]
23:26:24 > /home/iweiny/dev/linux-build-test/rootfs/sparc/run-qemu-sparc.sh 
Build reference: next-20200603-3-gf5afe92a2135

Building sparc32:SPARCClassic:nosmp:scsi:hd ... running ......... passed
Building sparc32:SPARCbook:nosmp:scsi:cd ... running ......... passed
Building sparc32:LX:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:nosmp:initrd ... running ......... passed
Building sparc32:SS-5:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-10:nosmp:scsi:cd ... running ......... passed
Building sparc32:SS-20:nosmp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:nosmp:scsi:hd ... running ......... passed
Building sparc32:Voyager:nosmp:noapc:scsi:hd ... running ......... passed
Building sparc32:SS-4:smp:scsi:hd ... running ......... passed
Building sparc32:SS-5:smp:scsi:cd ... running ......... passed
Building sparc32:SS-10:smp:scsi:hd ... running ......... passed
Building sparc32:SS-20:smp:scsi:hd ... running ......... passed
Building sparc32:SS-600MP:smp:scsi:hd ... running ......... passed
Building sparc32:Voyager:smp:noapc:scsi:hd ... running ......... passed


> -- 
> Sincerely yours,
> Mike.
Ira Weiny June 4, 2020, 6:44 a.m. UTC | #26
On Thu, Jun 04, 2020 at 09:37:45AM +0300, Mike Rapoport wrote:
> On Wed, Jun 03, 2020 at 11:22:26PM -0700, Ira Weiny wrote:
> > On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> > 
> > With linux-next on sparc I too see the spinlock issue; something like:
> > 
> > ...
> > Starting syslogd: BUG: spinlock recursion on CPU#0, S01syslogd/139
> >  lock: 0xf53ef350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-next-20200603 #1
> > [f0067d00 : 
> > do_raw_spin_lock+0xa8/0xd8 ] 
> > [f00d598c : 
> > copy_page_range+0x328/0x804 ] 
> > [f0025c34 : 
> > dup_mm+0x334/0x434 ] 
> > [f0027198 : 
> > copy_process+0x1248/0x12d4 ] 
> > [f00273b8 : 
> > _do_fork+0x54/0x30c ] 
> > [f00276e4 : 
> > do_fork+0x5c/0x6c ] 
> > [f000de44 : 
> > sparc_do_fork+0x18/0x38 ] 
> > [f000b7f4 : 
> > do_syscall+0x34/0x40 ] 
> > [5010cd4c : 
> > 0x5010cd4c ] 
> > 
> > 
> > I'm going to bisect between there and HEAD.
> 
> The sparc issue should be fixed by 
> 
> https://lore.kernel.org/lkml/20200526173302.377-1-will@kernel.org

Saw your other email.  And yes they are!

Thanks!
Ira

>  
> > Ira
> 
> -- 
> Sincerely yours,
> Mike.
Guenter Roeck June 4, 2020, 9:38 a.m. UTC | #27
On 6/3/20 11:22 PM, Ira Weiny wrote:
[ ... ]
> 
> s390: (does not compile)
> 
> <stdin>:1511:2: warning: #warning syscall clone3 not implemented [-Wcpp]
> In file included from ./arch/sparc/include/asm/bug.h:6:0,
>                  from ./include/linux/bug.h:5,
>                  from ./include/linux/mmdebug.h:5,
>                  from ./include/linux/mm.h:9,
>                  from mm/huge_memory.c:8:
> mm/huge_memory.c: In function 'hugepage_init':
> ./include/linux/compiler.h:403:38: error: call to '__compiletime_assert_127' declared with attribute error: BUILD_BUG_ON failed: ((13 + (13-3))-13) >= 9
>   _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>                                       ^
> ./include/linux/compiler.h:384:4: note: in definition of macro '__compiletime_assert'
>     prefix ## suffix();    \
>     ^~~~~~
> ./include/linux/compiler.h:403:2: note: in expansion of macro '_compiletime_assert'
>   _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>   ^~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
>  #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>                                      ^~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
>   BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
>   ^~~~~~~~~~~~~~~~
> ./include/linux/bug.h:24:4: note: in expansion of macro 'BUILD_BUG_ON'
>     BUILD_BUG_ON(cond);             \
>     ^~~~~~~~~~~~
> mm/huge_memory.c:403:2: note: in expansion of macro 'MAYBE_BUILD_BUG_ON'
>   MAYBE_BUILD_BUG_ON(HPAGE_PMD_ORDER >= MAX_ORDER);
>   ^~~~~~~~~~~~~~~~~~
> make[1]: *** [scripts/Makefile.build:267: mm/huge_memory.o] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [Makefile:1735: mm] Error 2
> make: *** Waiting for unfinished jobs....
> ------------
> 
> 
> The s390 error is the same on Linus' master and linux-next.  So whatever is
> causing that has slipped into mainline and/or is something I've broken in the
> test scripts.
> 

Compiler version related. gcc version 8.x and later no longer work.
Bisect points to commit a148866489f ("sched: Replace rq::wake_list").
Oddly enough x86 images are broken as well. You'll have to use an
older version of gcc (or presumably clang) until this is fixed.

Guenter
Mike Rapoport June 4, 2020, 9:41 a.m. UTC | #28
On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> 
> sparc32 smp images in next-20200603 still crash for me with a spinlock
> recursion. s390 images hang early in boot. Several others (alpha, arm64,
> various ppc) don't even compile. I can run some more bisects over time,
> but this is becoming a full-time job :-(.

I've been able to bisect s390 hang to commit b614345f52bc ("x86/entry:
Clarify irq_{enter,exit}_rcu()").

After this commit, lockdep_hardirq_exit() is called twice on s390 (and
others) - one time in irq_exit_rcu() and another one in irq_exit():

/**
 * irq_exit_rcu() - Exit an interrupt context without updating RCU
 *
 * Also processes softirqs if needed and possible.
 */
void irq_exit_rcu(void)
{
	__irq_exit_rcu();
	 /* must be last! */
	lockdep_hardirq_exit();
}

/**
 * irq_exit - Exit an interrupt context, update RCU and lockdep
 *
 * Also processes softirqs if needed and possible.
 */
void irq_exit(void)
{
	irq_exit_rcu();
	rcu_irq_exit();
	 /* must be last! */
	lockdep_hardirq_exit();
}

Removing the call in irq_exit() make s390 boot again, and judgung by the
x86 entry code, the comment /* must be last! */ is stale...

@Peter, @Thomas, can you comment please?

From e51d50ee6f4d1f446decf91c2c67230da14ff82c Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Thu, 4 Jun 2020 12:37:03 +0300
Subject: [PATCH] softirq: don't call lockdep_hardirq_exit() twice

After commit b614345f52bc ("x86/entry: Clarify irq_{enter,exit}_rcu()")
lockdep_hardirq_exit() is called twice on every architecture that uses
irq_exit(): one time in irq_exit_rcu() and another one in irq_exit().

Remove the extra call in irq_exit().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 kernel/softirq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index a3eb6eba8c41..7523f4ce4c1d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -427,7 +427,6 @@ static inline void __irq_exit_rcu(void)
 void irq_exit_rcu(void)
 {
 	__irq_exit_rcu();
-	 /* must be last! */
 	lockdep_hardirq_exit();
 }
 
@@ -440,8 +439,6 @@ void irq_exit(void)
 {
 	irq_exit_rcu();
 	rcu_irq_exit();
-	 /* must be last! */
-	lockdep_hardirq_exit();
 }
 
 /*
Ira Weiny June 4, 2020, 8:51 p.m. UTC | #29
On Thu, Jun 04, 2020 at 12:41:33PM +0300, Mike Rapoport wrote:
> On Wed, Jun 03, 2020 at 04:44:17PM -0700, Guenter Roeck wrote:
> > 
> > sparc32 smp images in next-20200603 still crash for me with a spinlock
> > recursion. s390 images hang early in boot. Several others (alpha, arm64,
> > various ppc) don't even compile. I can run some more bisects over time,
> > but this is becoming a full-time job :-(.
> 
> I've been able to bisect s390 hang to commit b614345f52bc ("x86/entry:
> Clarify irq_{enter,exit}_rcu()").
> 
> After this commit, lockdep_hardirq_exit() is called twice on s390 (and
> others) - one time in irq_exit_rcu() and another one in irq_exit():
> 
> /**
>  * irq_exit_rcu() - Exit an interrupt context without updating RCU
>  *
>  * Also processes softirqs if needed and possible.
>  */
> void irq_exit_rcu(void)
> {
> 	__irq_exit_rcu();
> 	 /* must be last! */
> 	lockdep_hardirq_exit();
> }
> 
> /**
>  * irq_exit - Exit an interrupt context, update RCU and lockdep
>  *
>  * Also processes softirqs if needed and possible.
>  */
> void irq_exit(void)
> {
> 	irq_exit_rcu();
> 	rcu_irq_exit();
> 	 /* must be last! */
> 	lockdep_hardirq_exit();
> }
> 
> Removing the call in irq_exit() make s390 boot again, and judgung by the
> x86 entry code, the comment /* must be last! */ is stale...

FWIW I got s390 to compile and this patch fixes s390 booting for me as well.

13:05:25 > /home/iweiny/dev/linux-build-test/rootfs/s390/run-qemu-s390.sh 
Build reference: next-20200603-4-g840714292d8c

Building s390:defconfig:initrd ... running ........... passed
Building s390:defconfig:virtio-blk-ccw:rootfs ... running ........... passed
Building s390:defconfig:scsi[virtio-ccw]:rootfs ... running ..............  passed
Building s390:defconfig:virtio-pci:rootfs ... running ........... passed
Building s390:defconfig:scsi[virtio-pci]:rootfs ... running ........... passed

Ira

> 
> @Peter, @Thomas, can you comment please?
> 
> From e51d50ee6f4d1f446decf91c2c67230da14ff82c Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.ibm.com>
> Date: Thu, 4 Jun 2020 12:37:03 +0300
> Subject: [PATCH] softirq: don't call lockdep_hardirq_exit() twice
> 
> After commit b614345f52bc ("x86/entry: Clarify irq_{enter,exit}_rcu()")
> lockdep_hardirq_exit() is called twice on every architecture that uses
> irq_exit(): one time in irq_exit_rcu() and another one in irq_exit().
> 
> Remove the extra call in irq_exit().
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/softirq.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index a3eb6eba8c41..7523f4ce4c1d 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -427,7 +427,6 @@ static inline void __irq_exit_rcu(void)
>  void irq_exit_rcu(void)
>  {
>  	__irq_exit_rcu();
> -	 /* must be last! */
>  	lockdep_hardirq_exit();
>  }
>  
> @@ -440,8 +439,6 @@ void irq_exit(void)
>  {
>  	irq_exit_rcu();
>  	rcu_irq_exit();
> -	 /* must be last! */
> -	lockdep_hardirq_exit();
>  }
>  
>  /*
> -- 
> 2.26.2
> 
> 
> 
> > Guenter
> 
> -- 
> Sincerely yours,
> Mike.
diff mbox series

Patch

diff --git a/arch/microblaze/mm/highmem.c b/arch/microblaze/mm/highmem.c
index ee8a422b2b76..92e0890416c9 100644
--- a/arch/microblaze/mm/highmem.c
+++ b/arch/microblaze/mm/highmem.c
@@ -57,11 +57,8 @@  void kunmap_atomic_high(void *kvaddr)
 	int type;
 	unsigned int idx;
 
-	if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
-		pagefault_enable();
-		preempt_enable();
+	if (vaddr < __fix_to_virt(FIX_KMAP_END))
 		return;
-	}
 
 	type = kmap_atomic_idx();
 
diff --git a/arch/mips/mm/highmem.c b/arch/mips/mm/highmem.c
index 37e244cdb14e..8e8726992720 100644
--- a/arch/mips/mm/highmem.c
+++ b/arch/mips/mm/highmem.c
@@ -41,11 +41,8 @@  void kunmap_atomic_high(void *kvaddr)
 	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
 	int type __maybe_unused;
 
-	if (vaddr < FIXADDR_START) { // FIXME
-		pagefault_enable();
-		preempt_enable();
+	if (vaddr < FIXADDR_START)
 		return;
-	}
 
 	type = kmap_atomic_idx();
 #ifdef CONFIG_DEBUG_HIGHMEM
diff --git a/arch/powerpc/mm/highmem.c b/arch/powerpc/mm/highmem.c
index 35071c2913f1..624b4438aff9 100644
--- a/arch/powerpc/mm/highmem.c
+++ b/arch/powerpc/mm/highmem.c
@@ -44,11 +44,8 @@  void kunmap_atomic_high(void *kvaddr)
 {
 	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
 
-	if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
-		pagefault_enable();
-		preempt_enable();
+	if (vaddr < __fix_to_virt(FIX_KMAP_END))
 		return;
-	}
 
 	if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM)) {
 		int type = kmap_atomic_idx();
diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index d237d902f9c3..6ff6e2a9f9b3 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -86,11 +86,8 @@  void kunmap_atomic_high(void *kvaddr)
 	unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
 	int type;
 
-	if (vaddr < FIXADDR_START) { // FIXME
-		pagefault_enable();
-		preempt_enable();
+	if (vaddr < FIXADDR_START)
 		return;
-	}
 
 	type = kmap_atomic_idx();