Message ID | 1493111408-27692-1-git-send-email-arend.vanspriel@broadcom.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 1ed760c9aca0b69c5b24c6fd454bcc573a01df99 |
Delegated to: | Kalle Valo |
On 25 April 2017 at 10:10, Arend van Spriel <arend.vanspriel@broadcom.com> wrote:
> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
> callback was actually cloned. So instead of checking for sufficient
> headroom it should also be writable. Hence use skb_cow_head() to
> check and expand the headroom appropriately.
>
> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
> ---
> Hi Kalle,
>
> Did a recursive grep in drivers/net/wireless and found a similar
> case in ath6kl. I do not have the hardware to test so this is
> only compile tested.
>
> Regards,
> Arend
> ---
>  drivers/net/wireless/ath/ath6kl/txrx.c | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath6kl/txrx.c b/drivers/net/wireless/ath/ath6kl/txrx.c
> index a531e0c..e6b2517 100644
> --- a/drivers/net/wireless/ath/ath6kl/txrx.c
> +++ b/drivers/net/wireless/ath/ath6kl/txrx.c
> @@ -399,15 +399,10 @@ int ath6kl_data_tx(struct sk_buff *skb, struct net_device *dev)
>  			csum_dest = skb->csum_offset + csum_start;
>  		}
>
> -		if (skb_headroom(skb) < dev->needed_headroom) {
> -			struct sk_buff *tmp_skb = skb;
> -
> -			skb = skb_realloc_headroom(skb, dev->needed_headroom);
> -			kfree_skb(tmp_skb);
> -			if (skb == NULL) {
> -				dev->stats.tx_dropped++;
> -				return 0;
> -			}
> +		if (skb_cow_head(skb, dev->needed_headroom)) {
> +			dev->stats.tx_dropped++;
> +			kfree_skb(skb);
> +			return 0;
>  		}
>
>  		if (ath6kl_wmi_dix_2_dot3(ar->wmi, skb)) {
> --
> 1.9.1
>

Not sure if this is the right place to comment on this, but I've had a
quick look around various network drivers, and there are similar
constructs in a LOT of drivers. I've picked two at random, and both
seem to show this issue. When the issue first came up in a USB
attached smsc ethernet driver, at least 6 other drivers with similar
faults were found in the net/usb tree. Now I could just be being
paranoid, and am missing something, so here are the files I looked
at...

drivers/net/wireless/marvell/mwifiex/uap_txrx.c line 161 - no relevant skb_cow
operations in this file, but changes are made to the buffers

drivers/net/ethernet/sun/niu.c line 6657 - ditto

I'm a bit of a beginner at this stuff, so not sure how this should be
taken forward.

James
On 25-4-2017 11:36, James Hughes wrote:
> On 25 April 2017 at 10:10, Arend van Spriel <arend.vanspriel@broadcom.com> wrote:
>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>> callback was actually cloned. So instead of checking for sufficient
>> headroom it should also be writable. Hence use skb_cow_head() to
>> check and expand the headroom appropriately.
>>
>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>> ---
>> Hi Kalle,
>>
>> Did a recursive grep in drivers/net/wireless and found a similar
>> case in ath6kl. I do not have the hardware to test so this is
>> only compile tested.
>>
>> Regards,
>> Arend
>> ---
>>  drivers/net/wireless/ath/ath6kl/txrx.c | 13 ++++---------
>>  1 file changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath6kl/txrx.c b/drivers/net/wireless/ath/ath6kl/txrx.c
>> index a531e0c..e6b2517 100644
>> --- a/drivers/net/wireless/ath/ath6kl/txrx.c
>> +++ b/drivers/net/wireless/ath/ath6kl/txrx.c
>> @@ -399,15 +399,10 @@ int ath6kl_data_tx(struct sk_buff *skb, struct net_device *dev)
>>  			csum_dest = skb->csum_offset + csum_start;
>>  		}
>>
>> -		if (skb_headroom(skb) < dev->needed_headroom) {
>> -			struct sk_buff *tmp_skb = skb;
>> -
>> -			skb = skb_realloc_headroom(skb, dev->needed_headroom);
>> -			kfree_skb(tmp_skb);
>> -			if (skb == NULL) {
>> -				dev->stats.tx_dropped++;
>> -				return 0;
>> -			}
>> +		if (skb_cow_head(skb, dev->needed_headroom)) {
>> +			dev->stats.tx_dropped++;
>> +			kfree_skb(skb);
>> +			return 0;
>>  		}
>>
>>  		if (ath6kl_wmi_dix_2_dot3(ar->wmi, skb)) {
>> --
>> 1.9.1
>>
>
> Not sure if this is the right place to comment on this, but I've had a
> quick look around various network drivers, and there are similar
> constructs in a LOT of drivers. I've picked two at random, and both
> seem to show this issue. When the issue first came up in a USB
> attached smsc ethernet driver, at least 6 other drivers with similar
> faults were found in the net/usb tree. Now I could just be being
> paranoid, and am missing something, so here are the files I looked
> at...
>
> drivers/net/wireless/marvell/mwifiex/uap_txrx.c line 161 - no relevant skb_cow
> operations in this file, but changes are made to the buffers

This piece of code is used in rx. They have in-driver bridging
implemented in mwifiex. Surprised to see such a feature in an upstream
driver.

> drivers/net/ethernet/sun/niu.c line 6657 - ditto

Looks suspicious indeed.

> I'm a bit of a beginner at this stuff, so not sure how this should be
> taken forward.

I looked at the wireless drivers specifically and initial grep was for
skb_push(), but that gave a lot of results. So just did a grep for
drivers touching struct net_device::needed_headroom. Admittedly that is
more of a glance than a proper look and it would probably be best if
driver maintainers would check for such headroom constructs in their
driver(s).

Regards,
Arend
On 25 April 2017 at 12:10, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
> On 25-4-2017 11:36, James Hughes wrote:
>> On 25 April 2017 at 10:10, Arend van Spriel <arend.vanspriel@broadcom.com> wrote:
>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>> callback was actually cloned. So instead of checking for sufficient
>>> headroom it should also be writable. Hence use skb_cow_head() to
>>> check and expand the headroom appropriately.
>>>
>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>> ---
>>> Hi Kalle,
>>>
>>> Did a recursive grep in drivers/net/wireless and found a similar
>>> case in ath6kl. I do not have the hardware to test so this is
>>> only compile tested.
>>>
>>> Regards,
>>> Arend
>>> ---
>>>  drivers/net/wireless/ath/ath6kl/txrx.c | 13 ++++---------
>>>  1 file changed, 4 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath6kl/txrx.c b/drivers/net/wireless/ath/ath6kl/txrx.c
>>> index a531e0c..e6b2517 100644
>>> --- a/drivers/net/wireless/ath/ath6kl/txrx.c
>>> +++ b/drivers/net/wireless/ath/ath6kl/txrx.c
>>> @@ -399,15 +399,10 @@ int ath6kl_data_tx(struct sk_buff *skb, struct net_device *dev)
>>>  			csum_dest = skb->csum_offset + csum_start;
>>>  		}
>>>
>>> -		if (skb_headroom(skb) < dev->needed_headroom) {
>>> -			struct sk_buff *tmp_skb = skb;
>>> -
>>> -			skb = skb_realloc_headroom(skb, dev->needed_headroom);
>>> -			kfree_skb(tmp_skb);
>>> -			if (skb == NULL) {
>>> -				dev->stats.tx_dropped++;
>>> -				return 0;
>>> -			}
>>> +		if (skb_cow_head(skb, dev->needed_headroom)) {
>>> +			dev->stats.tx_dropped++;
>>> +			kfree_skb(skb);
>>> +			return 0;
>>>  		}
>>>
>>>  		if (ath6kl_wmi_dix_2_dot3(ar->wmi, skb)) {
>>> --
>>> 1.9.1
>>>
>>
>> Not sure if this is the right place to comment on this, but I've had a
>> quick look around various network drivers, and there are similar
>> constructs in a LOT of drivers. I've picked two at random, and both
>> seem to show this issue. When the issue first came up in a USB
>> attached smsc ethernet driver, at least 6 other drivers with similar
>> faults were found in the net/usb tree. Now I could just be being
>> paranoid, and am missing something, so here are the files I looked
>> at...
>>
>> drivers/net/wireless/marvell/mwifiex/uap_txrx.c line 161 - no relevant skb_cow
>> operations in this file, but changes are made to the buffers
>
> This piece of code is used in rx. They have in-driver bridging
> implemented in mwifiex. Surprised to see such a feature in an upstream
> driver.
>
>> drivers/net/ethernet/sun/niu.c line 6657 - ditto
>
> Looks suspicious indeed.
>
>> I'm a bit of a beginner at this stuff, so not sure how this should be
>> taken forward.
>
> I looked at the wireless drivers specifically and initial grep was for
> skb_push(), but that gave a lot of results. So just did a grep for
> drivers touching struct net_device::needed_headroom. Admittedly that is
> more of a glance than a proper look and it would probably be best if
> driver maintainers would check for such headroom constructs in their
> driver(s).
>
> Regards,
> Arend

I only checked those two so I suspect more will be in there. There is
also a lot of boilerplate code that could be removed simply by using
skb_cow_head()... is there a standard way of telling all maintainers to
check their drivers for particular issues?

I did a grep for skb_headroom to find an initial list of possibilities,
since it seems unlikely that it would be required except in
circumstances like this, but I don't have time to check all the hits!
Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
> callback was actually cloned. So instead of checking for sufficient
> headroom it should also be writable. Hence use skb_cow_head() to
> check and expand the headroom appropriately.
>
> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>

Steve, would you have time to run a quick test with this?

Patch set to Deferred.
On 4/26/2017 10:53 AM, Kalle Valo wrote:
> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>> callback was actually cloned. So instead of checking for sufficient
>> headroom it should also be writable. Hence use skb_cow_head() to
>> check and expand the headroom appropriately.
>>
>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>
> Steve, would you have time to run a quick test with this?
>
> Patch set to Deferred.

Just a hint. I tested the equivalent patch in brcmfmac by doing a
skb_clone() just before the headroom if-statement (and a kfree_skb()
afterwards obviously ;-) ).

Regards,
Arend
On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>> callback was actually cloned. So instead of checking for sufficient
>> headroom it should also be writable. Hence use skb_cow_head() to
>> check and expand the headroom appropriately.
>>
>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>
> Steve, would you have time to run a quick test with this?
>
> Patch set to Deferred.
>

Happy to give it a quick spin on both of my platforms.

@Arend: is there some demonstrable before/after that shows a problem I
can detect at runtime? I understand your thought about putting a
skb_clone() in there, but what are the expectations? And is any
problem evident without explicitly modding the code with the clone?

- Steve
On 26-4-2017 17:44, Steve deRosier wrote:
> On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>> callback was actually cloned. So instead of checking for sufficient
>>> headroom it should also be writable. Hence use skb_cow_head() to
>>> check and expand the headroom appropriately.
>>>
>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>
>> Steve, would you have time to run a quick test with this?
>>
>> Patch set to Deferred.
>>
>
> Happy to give it a quick spin on both of my platforms.
>
> @Arend: is there some demonstrable before/after that shows a problem I
> can detect at runtime? I understand your thought about putting a
> skb_clone() in there, but what are the expectations? And is any
> problem evident without explicitly modding the code with the clone?

Ok. So the root cause is explained in an email to the netdev mailing
list, but I cannot find it. The sender was probably not a member. I will
forward that email to you and cc: linux-wireless.

Basically, you need to set up a bridge and run hostapd in bridged mode.
Incoming multicast traffic will be cloned by the bridge and sent to all
interfaces in the bridge. If more than one driver puts additional
payload in the headroom they are basically mucking about in the same
buffer space, so packets probably never end up in the devices.

In case of ath6kl the patch is in the area where the driver/device
determines the IP checksum if it is supported (if I am not mistaken).
So not sure how easy it is to replicate without patching it for testing.

Regards,
Arend
On 26 April 2017 at 19:03, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>
>
> On 26-4-2017 17:44, Steve deRosier wrote:
>> On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>>> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>>> callback was actually cloned. So instead of checking for sufficient
>>>> headroom it should also be writable. Hence use skb_cow_head() to
>>>> check and expand the headroom appropriately.
>>>>
>>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>>
>>> Steve, would you have time to run a quick test with this?
>>>
>>> Patch set to Deferred.
>>>
>>
>> Happy to give it a quick spin on both of my platforms.
>>
>> @Arend: is there some demonstrable before/after that shows a problem I
>> can detect at runtime? I understand your thought about putting a
>> skb_clone() in there, but what are the expectations? And is any
>> problem evident without explicitly modding the code with the clone?
>
> Ok. So the root cause is explained in an email to the netdev mailing
> list, but I cannot find it. The sender was probably not a member. I will
> forward that email to you and cc: linux-wireless.
>
> Basically, you need to set up a bridge and run hostapd in bridged mode.
> Incoming multicast traffic will be cloned by the bridge and sent to all
> interfaces in the bridge. If more than one driver puts additional
> payload in the headroom they are basically mucking about in the same
> buffer space, so packets probably never end up in the devices.
>
> In case of ath6kl the patch is in the area where the driver/device
> determines the IP checksum if it is supported (if I am not mistaken).
> So not sure how easy it is to replicate without patching it for testing.
>
> Regards,
> Arend

That was me. The full mechanism can be seen on the Raspberry Pi github
issue tracker here:

https://github.com/raspberrypi/firmware/issues/673

In brief, when bridging between two devices, if both devices fail to
'unclone' then header corruption can occur if both devices make
header changes, since they are both looking at the same data. The Pi
has a smsc95xx ethernet device and the Brcm wireless chip - both had
the fault, so we were getting corrupted headers, and eventual failure
of ethernet.

The same fault appears in a large subset of drivers in my brief
examinations.

James
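
For readers following along, here is a rough userspace sketch of the mechanism described above. It is not kernel code: struct buf, struct pkt and the pkt_*() helpers are hypothetical stand-ins for the shared skbuff data area, struct sk_buff, skb_clone(), skb_push() and skb_cow_head(), and the sketch does not model growing the headroom. It only shows that two clones pushing a header into the same headroom overwrite each other unless the writer first gets a private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical stand-in for the data area shared between skbuff clones. */
struct buf {
	int users;			/* number of handles sharing data[] */
	unsigned char data[BUF_SIZE];
};

/* Hypothetical stand-in for struct sk_buff: a private view on a buf. */
struct pkt {
	struct buf *b;
	size_t head;			/* offset of first used byte (headroom) */
};

static struct pkt pkt_new(size_t headroom)
{
	struct pkt p = { calloc(1, sizeof(struct buf)), headroom };

	if (p.b)
		p.b->users = 1;
	return p;
}

/* Like a clone: new handle, same underlying data. */
static struct pkt pkt_clone(struct pkt p)
{
	p.b->users++;
	return p;
}

static void pkt_free(struct pkt p)
{
	if (--p.b->users == 0)
		free(p.b);
}

/* Make the headroom private before writing; 0 on success, -1 on failure. */
static int pkt_cow_head(struct pkt *p, size_t need)
{
	struct buf *copy;

	if (p->head < need)
		return -1;		/* growing headroom is not modelled here */
	if (p->b->users == 1)
		return 0;		/* already the only user, safe to write */
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	memcpy(copy, p->b, sizeof(*copy));
	copy->users = 1;
	p->b->users--;
	p->b = copy;
	return 0;
}

/* Prepend len bytes of fill, as a driver would prepend its own header. */
static void pkt_push(struct pkt *p, unsigned char fill, size_t len)
{
	p->head -= len;
	memset(p->b->data + p->head, fill, len);
}

/* Returns 1 if driver A's header got overwritten by driver B, 0 if not. */
int headers_collide(int use_cow)
{
	struct pkt a = pkt_new(8);
	struct pkt b;
	int corrupted;

	assert(a.b);
	b = pkt_clone(a);		/* the bridge hands a clone to each port */

	if (use_cow && (pkt_cow_head(&a, 4) || pkt_cow_head(&b, 4))) {
		pkt_free(a);
		pkt_free(b);
		return -1;
	}
	pkt_push(&a, 0xAA, 4);		/* driver A writes its header */
	pkt_push(&b, 0xBB, 4);		/* driver B writes its header */
	corrupted = a.b->data[a.head] != 0xAA;

	pkt_free(a);
	pkt_free(b);
	return corrupted;
}
```

Calling headers_collide(0) models both drivers skipping the unclone step, and headers_collide(1) models both drivers making the headroom private first, which is the property the patch relies on skb_cow_head() to provide.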
On Wed, Apr 26, 2017 at 12:54 PM, James Hughes <james.hughes@raspberrypi.org> wrote:
> On 26 April 2017 at 19:03, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>
>>
>> On 26-4-2017 17:44, Steve deRosier wrote:
>>> On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>>>> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>>>> callback was actually cloned. So instead of checking for sufficient
>>>>> headroom it should also be writable. Hence use skb_cow_head() to
>>>>> check and expand the headroom appropriately.
>>>>>
>>>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>>>
>>>> Steve, would you have time to run a quick test with this?
>>>>
>>>> Patch set to Deferred.
>>>>
>>>
>>> Happy to give it a quick spin on both of my platforms.
>>>

@ Arend and James, thanks for the info. I understand it, but
unfortunately I can't seem to replicate the problems on my platforms
with the limited time I have available to test it. It also may have to
do with my platforms having special custom bridging related code, or
just me having set up too simple a test.

That said...

@Kalle: I have tested on both my 6004 and 6003 platforms. I didn't
notice any incorrect behavior in my testing. But I don't have a test
setup that would have shown the original problem as reported on the
brcm driver so I can't say that the change actually _fixes_ anything.
Only that in my testing it doesn't seem to break anything.

Tested-by: Steve deRosier <derosier@gmail.com>

- Steve
On 27 April 2017 at 05:55, Steve deRosier <derosier@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 12:54 PM, James Hughes <james.hughes@raspberrypi.org> wrote:
>> On 26 April 2017 at 19:03, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>
>>>
>>> On 26-4-2017 17:44, Steve deRosier wrote:
>>>> On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>>>>> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>>>>> callback was actually cloned. So instead of checking for sufficient
>>>>>> headroom it should also be writable. Hence use skb_cow_head() to
>>>>>> check and expand the headroom appropriately.
>>>>>>
>>>>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>>>>
>>>>> Steve, would you have time to run a quick test with this?
>>>>>
>>>>> Patch set to Deferred.
>>>>>
>>>>
>>>> Happy to give it a quick spin on both of my platforms.
>>>>
>
> @ Arend and James, thanks for the info. I understand it, but
> unfortunately I can't seem to replicate the problems on my platforms
> with the limited time I have available to test it. It also may have to
> do with my platforms having special custom bridging related code, or
> just me having set up too simple a test.
>
> That said...
>
> @Kalle: I have tested on both my 6004 and 6003 platforms. I didn't
> notice any incorrect behavior in my testing. But I don't have a test
> setup that would have shown the original problem as reported on the
> brcm driver so I can't say that the change actually _fixes_ anything.
> Only that in my testing it doesn't seem to break anything.
>
> Tested-by: Steve deRosier <derosier@gmail.com>
>
> - Steve

It was quite difficult to reproduce on the Pi. In general the system
seems to recover from corrupted headers, but on the Pi the Wifi driver
was writing some information into the header, then checking it again
later (after, I think, some sort of loopback, but I'm not sure), and by
then it was corrupted.

It requires BOTH drivers to have the same fault, i.e. both failed to
unclone, and it also required both drivers to be writing something to
the header in a place that was subsequently checked by one of the
drivers to see if the data was valid. Even then it only appeared to
happen on certain packet types; in my case DHCP packets using IPv6
seemed to kick it off. So it is quite unpredictable when an error may
occur.
Steve deRosier <derosier@gmail.com> writes:
> On Wed, Apr 26, 2017 at 12:54 PM, James Hughes <james.hughes@raspberrypi.org> wrote:
>> On 26 April 2017 at 19:03, Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>
>>>
>>> On 26-4-2017 17:44, Steve deRosier wrote:
>>>> On Wed, Apr 26, 2017 at 1:53 AM, Kalle Valo <kvalo@codeaurora.org> wrote:
>>>>> Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
>>>>>> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
>>>>>> callback was actually cloned. So instead of checking for sufficient
>>>>>> headroom it should also be writable. Hence use skb_cow_head() to
>>>>>> check and expand the headroom appropriately.
>>>>>>
>>>>>> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
>>>>>
>>>>> Steve, would you have time to run a quick test with this?
>>>>>
>>>>> Patch set to Deferred.
>>>>>
>>>>
>>>> Happy to give it a quick spin on both of my platforms.
>>>>
>
> @ Arend and James, thanks for the info. I understand it, but
> unfortunately I can't seem to replicate the problems on my platforms
> with the limited time I have available to test it. It also may have to
> do with my platforms having special custom bridging related code, or
> just me having set up too simple a test.
>
> That said...
>
> @Kalle: I have tested on both my 6004 and 6003 platforms. I didn't
> notice any incorrect behavior in my testing. But I don't have a test
> setup that would have shown the original problem as reported on the
> brcm driver so I can't say that the change actually _fixes_ anything.
> Only that in my testing it doesn't seem to break anything.
>
> Tested-by: Steve deRosier <derosier@gmail.com>

Yeah, I was mostly worried about regression. I didn't expect you to
replicate the bug. Thanks for testing, I'll add this patch to my queue.
Arend Van Spriel <arend.vanspriel@broadcom.com> wrote:
> An issue was found brcmfmac driver in which a skbuff in .start_xmit()
> callback was actually cloned. So instead of checking for sufficient
> headroom it should also be writable. Hence use skb_cow_head() to
> check and expand the headroom appropriately.
>
> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
> Tested-by: Steve deRosier <derosier@gmail.com>

Patch applied to ath-next branch of ath.git, thanks.

1ed760c9aca0 ath6kl: assure headroom of skbuff is writable in .start_xmit()
diff --git a/drivers/net/wireless/ath/ath6kl/txrx.c b/drivers/net/wireless/ath/ath6kl/txrx.c
index a531e0c..e6b2517 100644
--- a/drivers/net/wireless/ath/ath6kl/txrx.c
+++ b/drivers/net/wireless/ath/ath6kl/txrx.c
@@ -399,15 +399,10 @@ int ath6kl_data_tx(struct sk_buff *skb, struct net_device *dev)
 			csum_dest = skb->csum_offset + csum_start;
 		}
 
-		if (skb_headroom(skb) < dev->needed_headroom) {
-			struct sk_buff *tmp_skb = skb;
-
-			skb = skb_realloc_headroom(skb, dev->needed_headroom);
-			kfree_skb(tmp_skb);
-			if (skb == NULL) {
-				dev->stats.tx_dropped++;
-				return 0;
-			}
+		if (skb_cow_head(skb, dev->needed_headroom)) {
+			dev->stats.tx_dropped++;
+			kfree_skb(skb);
+			return 0;
 		}
 
 		if (ath6kl_wmi_dix_2_dot3(ar->wmi, skb)) {
An issue was found brcmfmac driver in which a skbuff in .start_xmit()
callback was actually cloned. So instead of checking for sufficient
headroom it should also be writable. Hence use skb_cow_head() to
check and expand the headroom appropriately.

Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
---
Hi Kalle,

Did a recursive grep in drivers/net/wireless and found a similar
case in ath6kl. I do not have the hardware to test so this is
only compile tested.

Regards,
Arend
---
 drivers/net/wireless/ath/ath6kl/txrx.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)