diff mbox

[v2,1/1] ARM: orion5x: use mac_pton() helper

Message ID bce65520-7f11-49f5-c806-59e508001321@the2masters.de (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Hellermann Feb. 22, 2018, 11:18 p.m. UTC
Am 22.02.2018 um 22:42 schrieb Andrew Lunn:
> On Thu, Feb 22, 2018 at 06:45:51PM +0100, Stefan Hellermann wrote:
>> Hi!
>>
>> My QNAP TS-209 NAS Device is crashing with the following commit, which went
>> in the kernel as commit 4904dbda41c860fd117b20f3c48adb2780eee37e
>>
>> I cannot provide a boot log, the device panics before enabling the serial
>> console.
> Hi Stefan
>
> Did you try earlyprintk? You might need to recompile your kernel to
> enable it.
Yes, I tried it, but it didn't help. I even changed qnap_ts209_init() to 
configure uart before ethernet, but it didn't help either.
>
> Looking at the code, i don't see anything obviously wrong. So i think
> i would start by looking how many times it goes through the loop in
> qnap_tsx09_find_mac_addr() with this patch reverted, and what address
> it finds the MAC address at.
>
> Then see what happens with the current crashing code. Is it failing to
> recognise the MAC address, and so keep looping around?
I think I found the failure:
The code is looping through an ext2 filesystem on a 384kB mtd partition 
(factory configuration put there by QNAP). There it looks on every 1kB 
boundary if there is a valid MAC address. The filesystem has a 1kB block 
size, so this seems to work. The MAC address is on the 37th 1kB block.

But: On the 27th block is a large file (1,5kB) without 0 bytes inside.
The code in qnap_tsx09_find_mac_addr() maps 1kB into memory (not a whole 
file or the whole 384kB) and then calls qnap_tsx09_check_mac_addr() -> 
mac_pton() -> strlen() on this memory block. as there is no 0 byte in 
the file on the 27th block, strlen runs into bad memory and the machine 
panics. The old code had no strlen().

I changed mac_pton() to use strnlen(), and now the panic is gone. I 
don't know why strlen is actually needed in mac_pton. The string is 
checked in the following loop, if there is a zero byte somewhere, the 
loop will be returned immediately. So I think the strlen() superfluous. 
Is the following patch correct?

Comments

Andrew Lunn Feb. 22, 2018, 11:34 p.m. UTC | #1
> But: On the 27th block is a large file (1,5kB) without 0 bytes inside.
> The code in qnap_tsx09_find_mac_addr() maps 1kB into memory (not a whole
> file or the whole 384kB) and then calls qnap_tsx09_check_mac_addr() ->
> mac_pton() -> strlen() on this memory block. as there is no 0 byte in the
> file on the 27th block, strlen runs into bad memory and the machine panics.
> The old code had no strlen().

Yes, that sounds look a good explanation. 

> I changed mac_pton() to use strnlen(), and now the panic is gone. I don't
> know why strlen is actually needed in mac_pton. The string is checked in the
> following loop, if there is a zero byte somewhere, the loop will be returned
> immediately. So I think the strlen() superfluous. Is the following patch
> correct?

The patch has been corrupted by you email client. But otherwise, yes.

Please take a look at:

https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html

It will give you hits about correctly formatting the patch. In
addition it should have:

Fixes: 4cd5773a2ae6 ("net: core: move mac_pton() to lib/net_utils.c")

before the --- line, to indicate what it is fixing.

This patch should be against
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git,
since it is a fix, and sent to <netdev@vger.kernel.org>.

Thanks
      Andrew
Andy Shevchenko Feb. 23, 2018, 10:09 a.m. UTC | #2
On Fri, 2018-02-23 at 00:34 +0100, Andrew Lunn wrote:
> > But: On the 27th block is a large file (1,5kB) without 0 bytes
> > inside.
> > The code in qnap_tsx09_find_mac_addr() maps 1kB into memory (not a
> > whole
> > file or the whole 384kB) and then calls qnap_tsx09_check_mac_addr()
> > ->
> > mac_pton() -> strlen() on this memory block. as there is no 0 byte
> > in the
> > file on the 27th block, strlen runs into bad memory and the machine
> > panics.
> > The old code had no strlen().
> 
> Yes, that sounds look a good explanation. 
> 
> > I changed mac_pton() to use strnlen(), and now the panic is gone. I
> > don't
> > know why strlen is actually needed in mac_pton. The string is
> > checked in the
> > following loop, if there is a zero byte somewhere, the loop will be
> > returned
> > immediately. So I think the strlen() superfluous. Is the following
> > patch
> > correct?
> 
> The patch has been corrupted by you email client. But otherwise, yes.
> 
> Please take a look at:
> 
> https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html
> 
> It will give you hits about correctly formatting the patch. In
> addition it should have:
> 
> Fixes: 4cd5773a2ae6 ("net: core: move mac_pton() to lib/net_utils.c")
> 
> before the --- line, to indicate what it is fixing.
> 
> This patch should be against
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git,
> since it is a fix, and sent to <netdev@vger.kernel.org>.

Guys, consider this one instead:
https://patchwork.ozlabs.org/patch/851008/
Andrew Lunn Feb. 23, 2018, 3:01 p.m. UTC | #3
> > The patch has been corrupted by you email client. But otherwise, yes.
> > 
> > Please take a look at:
> > 
> > https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html
> > 
> > It will give you hits about correctly formatting the patch. In
> > addition it should have:
> > 
> > Fixes: 4cd5773a2ae6 ("net: core: move mac_pton() to lib/net_utils.c")
> > 
> > before the --- line, to indicate what it is fixing.
> > 
> > This patch should be against
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git,
> > since it is a fix, and sent to <netdev@vger.kernel.org>.
> 
> Guys, consider this one instead:
> https://patchwork.ozlabs.org/patch/851008/

Hi Andy

Thanks for pointing this patch out.

What is the advantage of doing to the strnlen()? As Stefan says, the
code which follows will detect a short string, in that a NULL is not
in [0-9a-f], nor a : .

Thanks
	Andrew
Andy Shevchenko Feb. 23, 2018, 3:18 p.m. UTC | #4
+Cc Alexey

On Fri, 2018-02-23 at 16:01 +0100, Andrew Lunn wrote:
> > > The patch has been corrupted by you email client. But otherwise,
> > > yes.
> > > 
> > > Please take a look at:
> > > 
> > > https://www.kernel.org/doc/html/v4.12/process/submitting-patches.h
> > > tml
> > > 
> > > It will give you hits about correctly formatting the patch. In
> > > addition it should have:
> > > 
> > > Fixes: 4cd5773a2ae6 ("net: core: move mac_pton() to
> > > lib/net_utils.c")
> > > 
> > > before the --- line, to indicate what it is fixing.
> > > 
> > > This patch should be against
> > > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git,
> > > since it is a fix, and sent to <netdev@vger.kernel.org>.
> > 
> > Guys, consider this one instead:
> > https://patchwork.ozlabs.org/patch/851008/
> 
> Hi Andy
> 
> Thanks for pointing this patch out.
> 
> What is the advantage of doing to the strnlen()? As Stefan says, the
> code which follows will detect a short string, in that a NULL is not
> in [0-9a-f], nor a : .

I'm not sure, but my understanding is that, the strchr() call in the
original code or isxdigit() in the follow up change will trash a cache a
bit. Besides that some of the users are (often?) supplying empty strings
to convert from, and in this case makes sense to bail out fast.

Alexey, can you shed a light here?
Andrew Lunn Feb. 23, 2018, 3:51 p.m. UTC | #5
> > Hi Andy
> > 
> > Thanks for pointing this patch out.
> > 
> > What is the advantage of doing to the strnlen()? As Stefan says, the
> > code which follows will detect a short string, in that a NULL is not
> > in [0-9a-f], nor a : .
> 
> I'm not sure, but my understanding is that, the strchr() call in the
> original code or isxdigit() in the follow up change will trash a cache a
> bit. Besides that some of the users are (often?) supplying empty strings
> to convert from, and in this case makes sense to bail out fast.

Is this function being called on a hot path? In the case which is
crashing, it is during early boot, and it gets called ~ 40 times, in
quick succession. The first call to isxdigit() will need to fetch part
of the _ctype array into cache, but since the caller is only walking
memory, i hope it is still in cache for the next call to mac_pton().

	Andrew
Stefan Hellermann Feb. 23, 2018, 4:36 p.m. UTC | #6
2018-02-23 16:51 GMT+01:00 Andrew Lunn <andrew@lunn.ch>:
>> > Hi Andy
>> >
>> > Thanks for pointing this patch out.
>> >
>> > What is the advantage of doing to the strnlen()? As Stefan says, the
>> > code which follows will detect a short string, in that a NULL is not
>> > in [0-9a-f], nor a : .
>>
>> I'm not sure, but my understanding is that, the strchr() call in the
>> original code or isxdigit() in the follow up change will trash a cache a
>> bit. Besides that some of the users are (often?) supplying empty strings
>> to convert from, and in this case makes sense to bail out fast.
>
> Is this function being called on a hot path? In the case which is
> crashing, it is during early boot, and it gets called ~ 40 times, in
> quick succession. The first call to isxdigit() will need to fetch part
> of the _ctype array into cache, but since the caller is only walking
> memory, i hope it is still in cache for the next call to mac_pton().
>
>         Andrew

In my case mac_pton is not called on a hot path, the slow sata disks
in the NAS dominate the boot time. For me code size matters, and on my
device it's rarely called on a zero string. It's probably slower with
the strnlen as without it. I think if there are users out there
calling it in a hot path on a zero string, they should check the zero
string themselves, this is not the common use case.

Should I send a patch to netdev, can I get some Acked-by: ?

    Stefan
Andy Shevchenko Feb. 23, 2018, 4:57 p.m. UTC | #7
On Fri, 2018-02-23 at 17:36 +0100, Stefan Hellermann wrote:
> 2018-02-23 16:51 GMT+01:00 Andrew Lunn <andrew@lunn.ch>:
> > > > Hi Andy
> > > > 
> > > > Thanks for pointing this patch out.
> > > > 
> > > > What is the advantage of doing to the strnlen()? As Stefan says,
> > > > the
> > > > code which follows will detect a short string, in that a NULL is
> > > > not
> > > > in [0-9a-f], nor a : .
> > > 
> > > I'm not sure, but my understanding is that, the strchr() call in
> > > the
> > > original code or isxdigit() in the follow up change will trash a
> > > cache a
> > > bit. Besides that some of the users are (often?) supplying empty
> > > strings
> > > to convert from, and in this case makes sense to bail out fast.
> > 
> > Is this function being called on a hot path? In the case which is
> > crashing, it is during early boot, and it gets called ~ 40 times, in
> > quick succession. The first call to isxdigit() will need to fetch
> > part
> > of the _ctype array into cache, but since the caller is only walking
> > memory, i hope it is still in cache for the next call to mac_pton().
> > 
> >         Andrew
> 
> In my case mac_pton is not called on a hot path, the slow sata disks
> in the NAS dominate the boot time. For me code size matters, and on my
> device it's rarely called on a zero string. It's probably slower with
> the strnlen as without it. I think if there are users out there
> calling it in a hot path on a zero string, they should check the zero
> string themselves, this is not the common use case.
> 

> Should I send a patch to netdev, can I get some Acked-by: ?

Not from me, sorry.

Alexey would be best person to give a such.
Please, Cc him.
Andrew Lunn Feb. 23, 2018, 5:23 p.m. UTC | #8
> Should I send a patch to netdev, can I get some Acked-by: ?

Please do. I will ack it, if it follows all the usual rules, not
corrupt, etc.

       Andrew
Alexey Dobriyan Feb. 23, 2018, 6:20 p.m. UTC | #9
On Fri, Feb 23, 2018 at 05:18:48PM +0200, Andy Shevchenko wrote:
> +Cc Alexey
> 
> On Fri, 2018-02-23 at 16:01 +0100, Andrew Lunn wrote:
> > > > The patch has been corrupted by you email client. But otherwise,
> > > > yes.
> > > > 
> > > > Please take a look at:
> > > > 
> > > > https://www.kernel.org/doc/html/v4.12/process/submitting-patches.h
> > > > tml
> > > > 
> > > > It will give you hits about correctly formatting the patch. In
> > > > addition it should have:
> > > > 
> > > > Fixes: 4cd5773a2ae6 ("net: core: move mac_pton() to
> > > > lib/net_utils.c")
> > > > 
> > > > before the --- line, to indicate what it is fixing.
> > > > 
> > > > This patch should be against
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git,
> > > > since it is a fix, and sent to <netdev@vger.kernel.org>.
> > > 
> > > Guys, consider this one instead:
> > > https://patchwork.ozlabs.org/patch/851008/
> > 
> > Hi Andy
> > 
> > Thanks for pointing this patch out.
> > 
> > What is the advantage of doing to the strnlen()? As Stefan says, the
> > code which follows will detect a short string, in that a NULL is not
> > in [0-9a-f], nor a : .
> 
> I'm not sure, but my understanding is that, the strchr() call in the
> original code or isxdigit() in the follow up change will trash a cache a
> bit. Besides that some of the users are (often?) supplying empty strings
> to convert from, and in this case makes sense to bail out fast.
> 
> Alexey, can you shed a light here?

I went for the simplest code.
map_pton() is never on the fastpath, so code size matters more.

In this sense, conversion to mac_pton() should be done, and strnlen()
probably should not.
diff mbox

Patch

diff --git a/lib/net_utils.c b/lib/net_utils.c
index 148fc6e99ef6..e7785cf20f59 100644
--- a/lib/net_utils.c
+++ b/lib/net_utils.c
@@ -7,10 +7,6 @@  bool mac_pton(const char *s, u8 *mac)
  {
      int i;

-    /* XX:XX:XX:XX:XX:XX */
-    if (strlen(s) < 3 * ETH_ALEN - 1)
-        return false;
-
      /* Don't dirty result unless string is valid MAC. */
      for (i = 0; i < ETH_ALEN; i++) {
          if (!isxdigit(s[i * 3]) || !isxdigit(s[i * 3 + 1]))