Message ID | 817e43c5-d88d-e616-7074-5715de29d319@Netapp.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, 26 Oct 2016 16:15:24 +0000, Yotam Gigi wrote:
> >-----Original Message-----
> >From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
> >Sent: Wednesday, October 26, 2016 5:40 PM
> >To: Yotam Gigi <yotamg@mellanox.com>; Jakub Kicinski <kubakici@wp.pl>; Andy
> >Adamson <andros@netapp.com>; Anna Schumaker
> ><Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
> >Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>;
> >Yotam Gigi <yotam.gi@gmail.com>; mlxsw <mlxsw@mellanox.com>
> >Subject: Re: nfs NULL-dereferencing in net-next
> >
> >On 10/25/2016 01:19 PM, Yotam Gigi wrote:
> >>
> >>> -----Original Message-----
> >>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jakub Kicinski
> >>> Sent: Monday, October 17, 2016 10:20 PM
> >>> To: Andy Adamson <andros@netapp.com>; Anna Schumaker
> >>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
> >>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>
> >>> Subject: nfs NULL-dereferencing in net-next
> >>>
> >>> Hi!
> >>>
> >>> I'm hitting this reliably on net-next, HEAD at 3f3177bb680f
> >>> ("fsl/fman: fix error return code in mac_probe()").
> >>
> >>
> >> I see the same thing. It happens constantly on some of my machines, making them
> >> completely unusable.
> >>
> >> I bisected it and got to the commit:
> >>
> >> commit 04ea1b3e6d8ed4978bb608c1748530af3de8c274
> >> Author: Andy Adamson <andros@netapp.com>
> >> Date:   Fri Sep 9 09:22:27 2016 -0400
> >>
> >>     NFS add xprt switch addrs test to match client
> >>
> >>     Signed-off-by: Andy Adamson <andros@netapp.com>
> >>     Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
> >
> >Thanks for reporting on this everyone!  Does this patch help?
>
> Actually, I still see the same bug with the same trace.

I rebuild the latest net-next and I'm not seeing the trace any more...
I'm only seeing this (with or without your patch):

[   23.465877] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
[   23.473784] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
[   23.588890] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
[   23.596746] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
[   23.781574] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
[   23.789599] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0

HTH
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/26/2016 02:08 PM, Jakub Kicinski wrote:
> On Wed, 26 Oct 2016 16:15:24 +0000, Yotam Gigi wrote:
>>> -----Original Message-----
>>> From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
>>> Sent: Wednesday, October 26, 2016 5:40 PM
>>> To: Yotam Gigi <yotamg@mellanox.com>; Jakub Kicinski <kubakici@wp.pl>; Andy
>>> Adamson <andros@netapp.com>; Anna Schumaker
>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>;
>>> Yotam Gigi <yotam.gi@gmail.com>; mlxsw <mlxsw@mellanox.com>
>>> Subject: Re: nfs NULL-dereferencing in net-next
>>>
>>> On 10/25/2016 01:19 PM, Yotam Gigi wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jakub Kicinski
>>>>> Sent: Monday, October 17, 2016 10:20 PM
>>>>> To: Andy Adamson <andros@netapp.com>; Anna Schumaker
>>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>
>>>>> Subject: nfs NULL-dereferencing in net-next
>>>>>
>>>>> Hi!
>>>>>
>>>>> I'm hitting this reliably on net-next, HEAD at 3f3177bb680f
>>>>> ("fsl/fman: fix error return code in mac_probe()").
>>>>
>>>>
>>>> I see the same thing. It happens constantly on some of my machines, making them
>>>> completely unusable.
>>>>
>>>> I bisected it and got to the commit:
>>>>
>>>> commit 04ea1b3e6d8ed4978bb608c1748530af3de8c274
>>>> Author: Andy Adamson <andros@netapp.com>
>>>> Date:   Fri Sep 9 09:22:27 2016 -0400
>>>>
>>>>     NFS add xprt switch addrs test to match client
>>>>
>>>>     Signed-off-by: Andy Adamson <andros@netapp.com>
>>>>     Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>>
>>> Thanks for reporting on this everyone!  Does this patch help?
>>
>> Actually, I still see the same bug with the same trace.

Well, it was worth a shot.  I'll keep poking at it.

>
> I rebuild the latest net-next and I'm not seeing the trace any more...
> I'm only seeing this (with or without your patch):
>
> [   23.465877] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> [   23.473784] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> [   23.588890] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> [   23.596746] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> [   23.781574] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> [   23.789599] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0

Interesting, I get that too when I try to use NFS v4.1.  It's weird that the crash would
stop happening like that, so maybe something is racy in this area.

Thanks for testing, Yotam and Jakub!
Anna

>
> HTH
>
>-----Original Message-----
>From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
>Sent: Wednesday, October 26, 2016 9:17 PM
>To: Jakub Kicinski <kubakici@wp.pl>
>Cc: Yotam Gigi <yotamg@mellanox.com>; Andy Adamson <andros@netapp.com>;
>linux-nfs@vger.kernel.org; netdev@vger.kernel.org; Trond Myklebust
><Trond.Myklebust@netapp.com>; Yotam Gigi <yotam.gi@gmail.com>; mlxsw
><mlxsw@mellanox.com>
>Subject: Re: nfs NULL-dereferencing in net-next
>
>On 10/26/2016 02:08 PM, Jakub Kicinski wrote:
>> On Wed, 26 Oct 2016 16:15:24 +0000, Yotam Gigi wrote:
>>>> -----Original Message-----
>>>> From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
>>>> Sent: Wednesday, October 26, 2016 5:40 PM
>>>> To: Yotam Gigi <yotamg@mellanox.com>; Jakub Kicinski <kubakici@wp.pl>; Andy
>>>> Adamson <andros@netapp.com>; Anna Schumaker
>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>;
>>>> Yotam Gigi <yotam.gi@gmail.com>; mlxsw <mlxsw@mellanox.com>
>>>> Subject: Re: nfs NULL-dereferencing in net-next
>>>>
>>>> On 10/25/2016 01:19 PM, Yotam Gigi wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jakub Kicinski
>>>>>> Sent: Monday, October 17, 2016 10:20 PM
>>>>>> To: Andy Adamson <andros@netapp.com>; Anna Schumaker
>>>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>
>>>>>> Subject: nfs NULL-dereferencing in net-next
>>>>>>
>>>>>> Hi!
>>>>>>
>>>>>> I'm hitting this reliably on net-next, HEAD at 3f3177bb680f
>>>>>> ("fsl/fman: fix error return code in mac_probe()").
>>>>>
>>>>>
>>>>> I see the same thing. It happens constantly on some of my machines, making them
>>>>> completely unusable.
>>>>>
>>>>> I bisected it and got to the commit:
>>>>>
>>>>> commit 04ea1b3e6d8ed4978bb608c1748530af3de8c274
>>>>> Author: Andy Adamson <andros@netapp.com>
>>>>> Date:   Fri Sep 9 09:22:27 2016 -0400
>>>>>
>>>>>     NFS add xprt switch addrs test to match client
>>>>>
>>>>>     Signed-off-by: Andy Adamson <andros@netapp.com>
>>>>>     Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>>>
>>>> Thanks for reporting on this everyone!  Does this patch help?
>>>
>>> Actually, I still see the same bug with the same trace.
>
>Well, it was worth a shot.  I'll keep poking at it.
>
>>
>> I rebuild the latest net-next and I'm not seeing the trace any more...
>> I'm only seeing this (with or without your patch):
>>
>> [   23.465877] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>> [   23.473784] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>> [   23.588890] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>> [   23.596746] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>> [   23.781574] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>> [   23.789599] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>
>Interesting, I get that too when I try to use NFS v4.1.  It's weird that the crash would
>stop happening like that, so maybe something is racy in this area.
>
>Thanks for testing, Yotam and Jakub!
>Anna

I just found out that it happens on any of my machines, once I put two nfs entries in
my fstab. If I put only one, I don't see the problem.

I hope it might be helpful :)

>
>>
>> HTH
>>
On Thu, 27 Oct 2016 06:50:22 +0000, Yotam Gigi wrote:
> >-----Original Message-----
> >From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
> >Sent: Wednesday, October 26, 2016 9:17 PM
> >To: Jakub Kicinski <kubakici@wp.pl>
> >Cc: Yotam Gigi <yotamg@mellanox.com>; Andy Adamson <andros@netapp.com>;
> >linux-nfs@vger.kernel.org; netdev@vger.kernel.org; Trond Myklebust
> ><Trond.Myklebust@netapp.com>; Yotam Gigi <yotam.gi@gmail.com>; mlxsw
> ><mlxsw@mellanox.com>
> >Subject: Re: nfs NULL-dereferencing in net-next
> >
> >On 10/26/2016 02:08 PM, Jakub Kicinski wrote:
> >> On Wed, 26 Oct 2016 16:15:24 +0000, Yotam Gigi wrote:
> >>>> -----Original Message-----
> >>>> From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
> >>>> Sent: Wednesday, October 26, 2016 5:40 PM
> >>>> To: Yotam Gigi <yotamg@mellanox.com>; Jakub Kicinski <kubakici@wp.pl>; Andy
> >>>> Adamson <andros@netapp.com>; Anna Schumaker
> >>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
> >>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>;
> >>>> Yotam Gigi <yotam.gi@gmail.com>; mlxsw <mlxsw@mellanox.com>
> >>>> Subject: Re: nfs NULL-dereferencing in net-next
> >>>>
> >>>> On 10/25/2016 01:19 PM, Yotam Gigi wrote:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jakub Kicinski
> >>>>>> Sent: Monday, October 17, 2016 10:20 PM
> >>>>>> To: Andy Adamson <andros@netapp.com>; Anna Schumaker
> >>>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
> >>>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>
> >>>>>> Subject: nfs NULL-dereferencing in net-next
> >>>>>>
> >>>>>> Hi!
> >>>>>>
> >>>>>> I'm hitting this reliably on net-next, HEAD at 3f3177bb680f
> >>>>>> ("fsl/fman: fix error return code in mac_probe()").
> >>>>>
> >>>>>
> >>>>> I see the same thing. It happens constantly on some of my machines, making them
> >>>>> completely unusable.
> >>>>>
> >>>>> I bisected it and got to the commit:
> >>>>>
> >>>>> commit 04ea1b3e6d8ed4978bb608c1748530af3de8c274
> >>>>> Author: Andy Adamson <andros@netapp.com>
> >>>>> Date:   Fri Sep 9 09:22:27 2016 -0400
> >>>>>
> >>>>>     NFS add xprt switch addrs test to match client
> >>>>>
> >>>>>     Signed-off-by: Andy Adamson <andros@netapp.com>
> >>>>>     Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
> >>>>
> >>>> Thanks for reporting on this everyone!  Does this patch help?
> >>>
> >>> Actually, I still see the same bug with the same trace.
> >
> >Well, it was worth a shot.  I'll keep poking at it.
> >
> >>
> >> I rebuild the latest net-next and I'm not seeing the trace any more...
> >> I'm only seeing this (with or without your patch):
> >>
> >> [   23.465877] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >> [   23.473784] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >> [   23.588890] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >> [   23.596746] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >> [   23.781574] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >> [   23.789599] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
> >
> >Interesting, I get that too when I try to use NFS v4.1.  It's weird that the crash would
> >stop happening like that, so maybe something is racy in this area.
> >
> >Thanks for testing, Yotam and Jakub!
> >Anna
>
> I just found out that it happens on any of my machines, once I put two nfs entries in
> my fstab. If I put only one, I don't see the problem.
>
> I hope it might be helpful :)

Hi Anna,

any updates on this one? The crash came back half an hour after I
reported that it was gone...

Over the weekend David Miller rebased net-next on top of 4.9.0-rc3 and
the bug is still there :( FWIW I also have multiple nfs mounts on my
setup, 2 in fstab and one in a startup script. Following Yotam I
dropped one of the fstab entries and things seem to be working (even
though I still have multiple mounts, the other one just comes a bit
later).
On 10/31/2016 11:25 AM, Jakub Kicinski wrote:
> On Thu, 27 Oct 2016 06:50:22 +0000, Yotam Gigi wrote:
>>> -----Original Message-----
>>> From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
>>> Sent: Wednesday, October 26, 2016 9:17 PM
>>> To: Jakub Kicinski <kubakici@wp.pl>
>>> Cc: Yotam Gigi <yotamg@mellanox.com>; Andy Adamson <andros@netapp.com>;
>>> linux-nfs@vger.kernel.org; netdev@vger.kernel.org; Trond Myklebust
>>> <Trond.Myklebust@netapp.com>; Yotam Gigi <yotam.gi@gmail.com>; mlxsw
>>> <mlxsw@mellanox.com>
>>> Subject: Re: nfs NULL-dereferencing in net-next
>>>
>>> On 10/26/2016 02:08 PM, Jakub Kicinski wrote:
>>>> On Wed, 26 Oct 2016 16:15:24 +0000, Yotam Gigi wrote:
>>>>>> -----Original Message-----
>>>>>> From: Anna Schumaker [mailto:Anna.Schumaker@Netapp.com]
>>>>>> Sent: Wednesday, October 26, 2016 5:40 PM
>>>>>> To: Yotam Gigi <yotamg@mellanox.com>; Jakub Kicinski <kubakici@wp.pl>; Andy
>>>>>> Adamson <andros@netapp.com>; Anna Schumaker
>>>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>;
>>>>>> Yotam Gigi <yotam.gi@gmail.com>; mlxsw <mlxsw@mellanox.com>
>>>>>> Subject: Re: nfs NULL-dereferencing in net-next
>>>>>>
>>>>>> On 10/25/2016 01:19 PM, Yotam Gigi wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jakub Kicinski
>>>>>>>> Sent: Monday, October 17, 2016 10:20 PM
>>>>>>>> To: Andy Adamson <andros@netapp.com>; Anna Schumaker
>>>>>>>> <Anna.Schumaker@Netapp.com>; linux-nfs@vger.kernel.org
>>>>>>>> Cc: netdev@vger.kernel.org; Trond Myklebust <Trond.Myklebust@netapp.com>
>>>>>>>> Subject: nfs NULL-dereferencing in net-next
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> I'm hitting this reliably on net-next, HEAD at 3f3177bb680f
>>>>>>>> ("fsl/fman: fix error return code in mac_probe()").
>>>>>>>
>>>>>>>
>>>>>>> I see the same thing. It happens constantly on some of my machines, making them
>>>>>>> completely unusable.
>>>>>>>
>>>>>>> I bisected it and got to the commit:
>>>>>>>
>>>>>>> commit 04ea1b3e6d8ed4978bb608c1748530af3de8c274
>>>>>>> Author: Andy Adamson <andros@netapp.com>
>>>>>>> Date:   Fri Sep 9 09:22:27 2016 -0400
>>>>>>>
>>>>>>>     NFS add xprt switch addrs test to match client
>>>>>>>
>>>>>>>     Signed-off-by: Andy Adamson <andros@netapp.com>
>>>>>>>     Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>>>>>>
>>>>>> Thanks for reporting on this everyone!  Does this patch help?
>>>>>
>>>>> Actually, I still see the same bug with the same trace.
>>>
>>> Well, it was worth a shot.  I'll keep poking at it.
>>>
>>>>
>>>> I rebuild the latest net-next and I'm not seeing the trace any more...
>>>> I'm only seeing this (with or without your patch):
>>>>
>>>> [   23.465877] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>> [   23.473784] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>> [   23.588890] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>> [   23.596746] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>> [   23.781574] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>> [   23.789599] NFS: set_pnfs_layoutdriver: cl_exchange_flags 0x0
>>>
>>> Interesting, I get that too when I try to use NFS v4.1.  It's weird that the crash would
>>> stop happening like that, so maybe something is racy in this area.
>>>
>>> Thanks for testing, Yotam and Jakub!
>>> Anna
>>
>> I just found out that it happens on any of my machines, once I put two nfs entries in
>> my fstab. If I put only one, I don't see the problem.
>>
>> I hope it might be helpful :)
>
> Hi Anna,
>
> any updates on this one? The crash came back half an hour after I
> reported that it was gone...

Aww, I was hoping that patch would work. It still seemed to fix some
issues for me when mounting multiple servers, so I'm planning to keep
it. Unfortunately I'm out of town this week, so I haven't had much of
a chance to keep poking at this issue. I should be able to get back
to it next week!

Thanks for the update!
Anna

>
> Over the weekend David Miller rebased net-next on top of 4.9.0-rc3 and
> the bug is still there :( FWIW I also have multiple nfs mounts on my
> setup, 2 in fstab and one in a startup script. Following Yotam I
> dropped one of the fstab entries and things seem to be working (even
> though I still have multiple mounts, the other one just comes a bit
> later).
>
On Thu, Nov 03, Anna Schumaker wrote:

> Aww, I was hoping that patch would work. It still seemed to fix some
> issues for me when mounting multiple servers, so I'm planning to keep
> it. Unfortunately I'm out of town this week, so I haven't had much of
> a chance to keep poking at this issue. I should be able to get back
> to it next week!

Is this supposed to be fixed already? I get an oops in
rpc_clnt_xprt_switch_has_addr+0xc/0x40 with 4.9.0-rc4.

Olaf
On 11/10/2016 09:47 AM, Olaf Hering wrote:
> On Thu, Nov 03, Anna Schumaker wrote:
>
>> Aww, I was hoping that patch would work. It still seemed to fix some
>> issues for me when mounting multiple servers, so I'm planning to keep
>> it. Unfortunately I'm out of town this week, so I haven't had much of
>> a chance to keep poking at this issue. I should be able to get back
>> to it next week!
>
> Is this supposed to be fixed already? I get an oops in
> rpc_clnt_xprt_switch_has_addr+0xc/0x40 with 4.9.0-rc4.

Not yet, sorry. I'm waiting on one bugfix patch before sending my pull request.

Thanks for waiting patiently,
Anna

>
> Olaf
>
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 34dd7b2..62a4827 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2753,14 +2753,18 @@ EXPORT_SYMBOL_GPL(rpc_cap_max_reconnect_timeout);
 
 void rpc_clnt_xprt_switch_put(struct rpc_clnt *clnt)
 {
+	rcu_read_lock();
 	xprt_switch_put(rcu_dereference(clnt->cl_xpi.xpi_xpswitch));
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(rpc_clnt_xprt_switch_put);
 
 void rpc_clnt_xprt_switch_add_xprt(struct rpc_clnt *clnt, struct rpc_xprt *xprt)
 {
+	rcu_read_lock();
 	rpc_xprt_switch_add_xprt(rcu_dereference(clnt->cl_xpi.xpi_xpswitch),
 				 xprt);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(rpc_clnt_xprt_switch_add_xprt);
 
@@ -2770,9 +2774,8 @@ bool rpc_clnt_xprt_switch_has_addr(struct rpc_clnt *clnt,
 	struct rpc_xprt_switch *xps;
 	bool ret;
 
-	xps = rcu_dereference(clnt->cl_xpi.xpi_xpswitch);
-
 	rcu_read_lock();
+	xps = rcu_dereference(clnt->cl_xpi.xpi_xpswitch);
 	ret = rpc_xprt_switch_has_addr(xps, sap);
 	rcu_read_unlock();
 	return ret;
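
For anyone following along, here is a small userspace sketch of the ordering this patch changes in rpc_clnt_xprt_switch_has_addr(): the RCU-protected pointer is loaded inside the read-side section rather than before it. Everything prefixed demo_ is a hypothetical stand-in written for this illustration, not a kernel API. The "lock" is just a depth counter and the dereference helper only counts loads made with no section held. It shows the ordering rule only and says nothing about whether the patch cures the oops reported in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
int demo_rcu_depth;          /* nesting depth of read-side sections */
int demo_unprotected_derefs; /* pointer loads made outside any section */

void demo_rcu_read_lock(void)
{
	demo_rcu_depth++;
}

void demo_rcu_read_unlock(void)
{
	assert(demo_rcu_depth > 0);
	demo_rcu_depth--;
}

/* Counts (rather than warns about) a load made with no read lock held. */
void *demo_rcu_dereference(void *p)
{
	if (demo_rcu_depth == 0)
		demo_unprotected_derefs++;
	return p;
}

struct demo_xprt_switch {
	int nr_addrs;
};

struct demo_client {
	struct demo_xprt_switch *xpswitch;
};

bool demo_switch_has_addr(const struct demo_xprt_switch *xps)
{
	return xps != NULL && xps->nr_addrs > 0;
}

/* Ordering before the patch: the pointer is loaded, then the lock is taken. */
bool demo_has_addr_before(struct demo_client *clnt)
{
	struct demo_xprt_switch *xps;
	bool ret;

	xps = demo_rcu_dereference(clnt->xpswitch);
	demo_rcu_read_lock();
	ret = demo_switch_has_addr(xps);
	demo_rcu_read_unlock();
	return ret;
}

/* Ordering after the patch: the load happens inside the read-side section. */
bool demo_has_addr_after(struct demo_client *clnt)
{
	struct demo_xprt_switch *xps;
	bool ret;

	demo_rcu_read_lock();
	xps = demo_rcu_dereference(clnt->xpswitch);
	ret = demo_switch_has_addr(xps);
	demo_rcu_read_unlock();
	return ret;
}
```

Calling demo_has_addr_before() on a client bumps demo_unprotected_derefs, while demo_has_addr_after() leaves it unchanged. Both return the same answer in this single-threaded toy. The difference only matters when another thread can replace or free the switch between the load and the use.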