[PATCHv2] net: bpf: reject invalid shifts

Message ID	1452626228-15742-1-git-send-email-rabin@rab.in (mailing list archive)
State	New, archived
From: Rabin Vincent <rabin@rab.in>
To: davem@davemloft.net
Cc: linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org, Rabin Vincent <rabin@rab.in>, ast@kernel.org, daniel@iogearbox.net
Subject: [PATCHv2] net: bpf: reject invalid shifts
Date: Tue, 12 Jan 2016 20:17:08 +0100
Message-Id: <1452626228-15742-1-git-send-email-rabin@rab.in>
In-Reply-To: <20160112185121.GA34045@ast-mbp.thefacebook.com>
References: <20160112185121.GA34045@ast-mbp.thefacebook.com>

Rabin Vincent Jan. 12, 2016, 7:17 p.m. UTC

On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
constant shift that can't be encoded in the immediate field of the
UBFM/SBFM instructions is passed to the JIT.  Since these shift
amounts, which are negative or >= regsize, are invalid, reject them in
the eBPF verifier and the classic BPF filter checker, for all
architectures.

Signed-off-by: Rabin Vincent <rabin@rab.in>
---
v2: handle BPF_ARSH too

 kernel/bpf/verifier.c | 10 ++++++++++
 net/core/filter.c     |  5 +++++
 2 files changed, 15 insertions(+)
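
The rule stated in the commit message can be sketched in userspace C as
follows. This is a minimal illustration, not the kernel code:
shift_imm_is_valid() and its parameters are hypothetical names, and the
register size is assumed to be 64 bits for 64-bit ALU instructions and
32 bits otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true when a
 * constant shift amount fits a register of the given width. The commit
 * message treats amounts that are negative or >= regsize as invalid.
 */
bool shift_imm_is_valid(int32_t imm, bool is_64bit)
{
	int regsize = is_64bit ? 64 : 32;

	return imm >= 0 && imm < regsize;
}
```

A checker built on this would return an error such as -EINVAL for any
constant LSH, RSH or ARSH whose immediate fails the test.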

Alexei Starovoitov Jan. 12, 2016, 7:26 p.m. UTC | #1

On Tue, Jan 12, 2016 at 08:17:08PM +0100, Rabin Vincent wrote:
> On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> constant shift that can't be encoded in the immediate field of the
> UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> amounts, which are negative or >= regsize, are invalid, reject them in
> the eBPF verifier and the classic BPF filter checker, for all
> architectures.
> 
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> ---
> v2: handle BPF_ARSH too

Thanks.
Acked-by: Alexei Starovoitov <ast@kernel.org>

Daniel Borkmann Jan. 12, 2016, 7:35 p.m. UTC | #2

On 01/12/2016 08:17 PM, Rabin Vincent wrote:
> On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> constant shift that can't be encoded in the immediate field of the
> UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> amounts, which are negative or >= regsize, are invalid, reject them in
> the eBPF verifier and the classic BPF filter checker, for all
> architectures.
>
> Signed-off-by: Rabin Vincent <rabin@rab.in>

Fine with me as well, thanks for following up!

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Eric Dumazet Jan. 12, 2016, 7:48 p.m. UTC | #3

On Tue, 2016-01-12 at 20:17 +0100, Rabin Vincent wrote:
> On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> constant shift that can't be encoded in the immediate field of the
> UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> amounts, which are negative or >= regsize, are invalid, reject them in
> the eBPF verifier and the classic BPF filter checker, for all
> architectures.
> 

Hmm...

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 672eefbfbe99..37157c4c1a78 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -777,6 +777,11 @@ static int bpf_check_classic(const struct sock_filter *filter,
>  			if (ftest->k == 0)
>  				return -EINVAL;
>  			break;
> +		case BPF_ALU | BPF_LSH | BPF_K:
> +		case BPF_ALU | BPF_RSH | BPF_K:
> +			if (ftest->k >= 32)
> +				return -EINVAL;
> +			break;
>  		case BPF_LD | BPF_MEM:
>  		case BPF_LDX | BPF_MEM:
>  		case BPF_ST:

These weak filters used to have undefined behavior, maybe in a never
taken branch, and will now fail hard, possibly breaking old
applications.

I believe we should add a one time warning to give a clue to poor users
hitting this problem.

Not everybody has perfect BPF filters, since most of the time they were
hand coded.

Alexei Starovoitov Jan. 12, 2016, 7:53 p.m. UTC | #4

On Tue, Jan 12, 2016 at 11:48:38AM -0800, Eric Dumazet wrote:
> On Tue, 2016-01-12 at 20:17 +0100, Rabin Vincent wrote:
> > On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> > constant shift that can't be encoded in the immediate field of the
> > UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> > amounts, which are negative or >= regsize, are invalid, reject them in
> > the eBPF verifier and the classic BPF filter checker, for all
> > architectures.
> > 
> 
> Hmm...
> 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 672eefbfbe99..37157c4c1a78 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -777,6 +777,11 @@ static int bpf_check_classic(const struct sock_filter *filter,
> >  			if (ftest->k == 0)
> >  				return -EINVAL;
> >  			break;
> > +		case BPF_ALU | BPF_LSH | BPF_K:
> > +		case BPF_ALU | BPF_RSH | BPF_K:
> > +			if (ftest->k >= 32)
> > +				return -EINVAL;
> > +			break;
> >  		case BPF_LD | BPF_MEM:
> >  		case BPF_LDX | BPF_MEM:
> >  		case BPF_ST:
> 
> These weak filters used to have undefined behavior, maybe in a never
> taken branch, and will now fail hard, possibly breaking old
> applications.
> 
> I believe we should add a one time warning to give a clue to poor users
> hitting this problem.

you mean like warn_on_once() here?
Makes sense I guess.

> Not everybody has perfect BPF filters, since most of the time they were
> hand coded.

yep and we all know who was able to code hundreds of cBPF insns by hand ;)
But I'm sure that code doesn't have such broken shifts. :)))

Daniel Borkmann Jan. 12, 2016, 8:42 p.m. UTC | #5

On 01/12/2016 08:53 PM, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 11:48:38AM -0800, Eric Dumazet wrote:
>> On Tue, 2016-01-12 at 20:17 +0100, Rabin Vincent wrote:
>>> On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
>>> constant shift that can't be encoded in the immediate field of the
>>> UBFM/SBFM instructions is passed to the JIT.  Since these shifts
>>> amounts, which are negative or >= regsize, are invalid, reject them in
>>> the eBPF verifier and the classic BPF filter checker, for all
>>> architectures.
>>>
>>
>> Hmm...
>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 672eefbfbe99..37157c4c1a78 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -777,6 +777,11 @@ static int bpf_check_classic(const struct sock_filter *filter,
>>>   			if (ftest->k == 0)
>>>   				return -EINVAL;
>>>   			break;
>>> +		case BPF_ALU | BPF_LSH | BPF_K:
>>> +		case BPF_ALU | BPF_RSH | BPF_K:
>>> +			if (ftest->k >= 32)
>>> +				return -EINVAL;
>>> +			break;
>>>   		case BPF_LD | BPF_MEM:
>>>   		case BPF_LDX | BPF_MEM:
>>>   		case BPF_ST:
>>
>> These weak filters used to have undefined behavior, maybe in a never
>> taken branch, and will now fail hard, possibly breaking old
>> applications.
>>
>> I believe we should add a one time warning to give a clue to poor users
>> hitting this problem.
>
> you mean like warn_on_once() here?
> Makes sense I guess.

Hmm, WARN_ON_ONCE() would then throw a stack trace also for unprivileged users,
I doubt we want to scare admins. ;)

Or, you mean pr_warn_once()?

>> Not everybody has perfect BPF filters, since most of the time they were
>> hand coded.
>
> yep and we all know who was able to code hundreds of cBPF insns by hand ;)
> But I'm sure that code doesn't have such broken shifts. :)))

libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
could be to just mask them here, but not in eBPF verifier, but that would be
even more inconsistent (on the other hand, we also allow holes in BPF but not
in eBPF, so wouldn't be the first time we make things different), hmm.

   [1] https://github.com/the-tcpdump-group/libpcap/commit/273455d58b3bbd83d757bda4f4544e3e5cc8c20a

Alexei Starovoitov Jan. 12, 2016, 8:46 p.m. UTC | #6

On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
> On 01/12/2016 08:53 PM, Alexei Starovoitov wrote:
> >On Tue, Jan 12, 2016 at 11:48:38AM -0800, Eric Dumazet wrote:
> >>On Tue, 2016-01-12 at 20:17 +0100, Rabin Vincent wrote:
> >>>On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> >>>constant shift that can't be encoded in the immediate field of the
> >>>UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> >>>amounts, which are negative or >= regsize, are invalid, reject them in
> >>>the eBPF verifier and the classic BPF filter checker, for all
> >>>architectures.
> >>>
> >>
> >>Hmm...
> >>
> >>>diff --git a/net/core/filter.c b/net/core/filter.c
> >>>index 672eefbfbe99..37157c4c1a78 100644
> >>>--- a/net/core/filter.c
> >>>+++ b/net/core/filter.c
> >>>@@ -777,6 +777,11 @@ static int bpf_check_classic(const struct sock_filter *filter,
> >>>  			if (ftest->k == 0)
> >>>  				return -EINVAL;
> >>>  			break;
> >>>+		case BPF_ALU | BPF_LSH | BPF_K:
> >>>+		case BPF_ALU | BPF_RSH | BPF_K:
> >>>+			if (ftest->k >= 32)
> >>>+				return -EINVAL;
> >>>+			break;
> >>>  		case BPF_LD | BPF_MEM:
> >>>  		case BPF_LDX | BPF_MEM:
> >>>  		case BPF_ST:
> >>
> >>These weak filters used to have undefined behavior, maybe in a never
> >>taken branch, and will now fail hard, possibly breaking old
> >>applications.
> >>
> >>I believe we should add a one time warning to give a clue to poor users
> >>hitting this problem.
> >
> >you mean like warn_on_once() here?
> >Makes sense I guess.
> 
> Hmm, WARN_ON_ONCE() would then throw a stack trace also for unprived users,
> I doubt we want to scare admins. ;)
> 
> Or, you mean pr_warn_once()?

yes. there is no need for stack trace of course.

> >>Not everybody has perfect BPF filters, since most of the time they were
> >>hand coded.
> >
> >yep and we all know who was able to code hundreds of cBPF insns by hand ;)
> >But I'm sure that code doesn't have such broken shifts. :)))
> 
> libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
> could be to just mask them here, but not in eBPF verifier, but that would be
> even more inconsistent (on the other hand, we also allow holes in BPF but not
> in eBPF, so wouldn't be the first time we make things different), hmm.

I would rather see broken classic bpf programs fixed instead of continuing
to run them with undefined behavior.

David Miller Jan. 12, 2016, 8:56 p.m. UTC | #7

From: Rabin Vincent <rabin@rab.in>
Date: Tue, 12 Jan 2016 20:17:08 +0100

> On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
> constant shift that can't be encoded in the immediate field of the
> UBFM/SBFM instructions is passed to the JIT.  Since these shifts
> amounts, which are negative or >= regsize, are invalid, reject them in
> the eBPF verifier and the classic BPF filter checker, for all
> architectures.
> 
> Signed-off-by: Rabin Vincent <rabin@rab.in>
> ---
> v2: handle BPF_ARSH too

Applied and queued up for -stable, thanks.

Eric Dumazet Jan. 12, 2016, 10:34 p.m. UTC | #8

On Tue, 2016-01-12 at 11:53 -0800, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 11:48:38AM -0800, Eric Dumazet wrote:

> > 
> > I believe we should add a one time warning to give a clue to poor users
> > hitting this problem.
> 
> you mean like warn_on_once() here?
> Makes sense I guess.


pr_err("DEPRECATED: %s (pid %d) "
       "invalid shift in BPF program.\n",
       current->comm, task_pid_nr(current));

Or something like that.
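
The "one time" part of the warning discussed in this thread can be
sketched in userspace C as below. This is only an analogue under stated
assumptions: warn_invalid_shift_once() and warnings_printed are
hypothetical names, the static flag is not thread-safe, and the kernel's
own print-once helpers are not reproduced here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Count of messages actually printed; exposed only so the sketch can be checked. */
int warnings_printed;

/*
 * Hypothetical userspace stand-in for a print-once helper: the first
 * call prints the message, later calls stay silent.
 */
void warn_invalid_shift_once(const char *comm, int pid)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	warnings_printed++;
	fprintf(stderr, "%s (pid %d): invalid shift in BPF program\n",
		comm, pid);
}
```

Calling the function twice with the same arguments prints a single line,
which is the behavior wanted here: one hint in the log, no stack trace.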

Eric Dumazet Jan. 12, 2016, 11:28 p.m. UTC | #9

On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:

> > >yep and we all know who was able to code hundreds of cBPF insns by hand ;)
> > >But I'm sure that code doesn't have such broken shifts. :)))
> > 
> > libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
> > could be to just mask them here, but not in eBPF verifier, but that would be
> > even more inconsistent (on the other hand, we also allow holes in BPF but not
> > in eBPF, so wouldn't be the first time we make things different), hmm.
> 
> I would rather see broken classic bpf program fixed instead of continue
> running them with undefined behavior.

This is your choice, because you are a developer.

Some people might be stuck with old software they can not update,
because they do not have the money to pay developers.

And no, I did not code BPF programs like that, but maybe others did, and
I feel the pain of customers that might be stuck.

Linus Torvalds always made clear we must provide backward compatibility,
and really this discussion should not even take place.

As I said, we used to load such BPF program in the past.

The fact that ARM64 crashes because of a faulty JIT implementation is
not an excuse.

Alexei Starovoitov Jan. 12, 2016, 11:47 p.m. UTC | #10

On Tue, Jan 12, 2016 at 03:28:22PM -0800, Eric Dumazet wrote:
> On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
> > On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
> 
> > > >yep and we all know who was able to code hundreds of cBPF insns by hand ;)
> > > >But I'm sure that code doesn't have such broken shifts. :)))
> > > 
> > > libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
> > > could be to just mask them here, but not in eBPF verifier, but that would be
> > > even more inconsistent (on the other hand, we also allow holes in BPF but not
> > > in eBPF, so wouldn't be the first time we make things different), hmm.
> > 
> > I would rather see broken classic bpf program fixed instead of continue
> > running them with undefined behavior.
> 
> This is your choice, because you are a developer.
> 
> Some people might be stuck with old software they can not update,
> because they do not have the money to pay developers.
> 
> And no, I did not code BPF programs like that, but maybe others did, and
> I feel the pain of customers that might be stuck.
> 
> Linus Torvalds always made clear we must provide backward compatibility,
> and really this discussion should not even take place.
> 
> As I said, we used to load such BPF program in the past.
> 
> The fact that ARM64 crashes because of a faulty JIT implementation is
> not an excuse.

I would agree if those loaded programs would do something sensible,
but they're broken. As shown, arm and arm64 would execute them
differently without JIT, because HW treats such shifts differently.
I also checked that libpcap is sane and doesn't generate broken shifts.
imo we're not breaking backward compatibility here.

Hannes Frederic Sowa Jan. 12, 2016, 11:59 p.m. UTC | #11

On 13.01.2016 00:47, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2016 at 03:28:22PM -0800, Eric Dumazet wrote:
>> On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
>>> On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
>>
>>>>> yep and we all know who was able to code hundreds of cBPF insns by hand ;)
>>>>> But I'm sure that code doesn't have such broken shifts. :)))
>>>>
>>>> libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
>>>> could be to just mask them here, but not in eBPF verifier, but that would be
>>>> even more inconsistent (on the other hand, we also allow holes in BPF but not
>>>> in eBPF, so wouldn't be the first time we make things different), hmm.
>>>
>>> I would rather see broken classic bpf program fixed instead of continue
>>> running them with undefined behavior.
>>
>> This is your choice, because you are a developer.
>>
>> Some people might be stuck with old software they can not update,
>> because they do not have the money to pay developers.
>>
>> And no, I did not code BPF programs like that, but maybe others did, and
>> I feel the pain of customers that might be stuck.
>>
>> Linus Torvalds always made clear we must provide backward compatibility,
>> and really this discussion should not even take place.
>>
>> As I said, we used to load such BPF program in the past.
>>
>> The fact that ARM64 crashes because of a faulty JIT implementation is
>> not an excuse.
>
> I would agree if those loaded programs would do something sensible,
> but they're broken. As shown arm and arm64 would execute them
> differently without JIT, because HW treats such shifts differently.
> I also checked that libpcap is sane and doesn't generate broken shifts.
> imo we're not breaking backward compatiblity here.

But on one specific platform those programs did something deterministic, 
reproducible and observable, no? Probably most developers only cared 
about that, probably especially in the embedded segment.

Bye,
Hannes

Hannes Frederic Sowa Jan. 13, 2016, 12:17 a.m. UTC | #12

On 13.01.2016 00:59, Hannes Frederic Sowa wrote:
> On 13.01.2016 00:47, Alexei Starovoitov wrote:
>> On Tue, Jan 12, 2016 at 03:28:22PM -0800, Eric Dumazet wrote:
>>> On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
>>>> On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
>>>
>>>>>> yep and we all know who was able to code hundreds of cBPF insns by
>>>>>> hand ;)
>>>>>> But I'm sure that code doesn't have such broken shifts. :)))
>>>>>
>>>>> libpcap certainly supports raw filters now thanks to Chema [1].
>>>>> Alternative
>>>>> could be to just mask them here, but not in eBPF verifier, but that
>>>>> would be
>>>>> even more inconsistent (on the other hand, we also allow holes in
>>>>> BPF but not
>>>>> in eBPF, so wouldn't be the first time we make things different), hmm.
>>>>
>>>> I would rather see broken classic bpf program fixed instead of continue
>>>> running them with undefined behavior.
>>>
>>> This is your choice, because you are a developer.
>>>
>>> Some people might be stuck with old software they can not update,
>>> because they do not have the money to pay developers.
>>>
>>> And no, I did not code BPF programs like that, but maybe others did, and
>>> I feel the pain of customers that might be stuck.
>>>
>>> Linus Torvalds always made clear we must provide backward compatibility,
>>> and really this discussion should not even take place.
>>>
>>> As I said, we used to load such BPF program in the past.
>>>
>>> The fact that ARM64 crashes because of a faulty JIT implementation is
>>> not an excuse.
>>
>> I would agree if those loaded programs would do something sensible,
>> but they're broken. As shown arm and arm64 would execute them
>> differently without JIT, because HW treats such shifts differently.
>> I also checked that libpcap is sane and doesn't generate broken shifts.
>> imo we're not breaking backward compatiblity here.
>
> But on one specific platform those programs did something deterministic,
> reproducible and observable, no? Probably most developers only cared
> about that, probably especially in the embedded segment.

By the way, we can annotate the JIT interpreter with an 
__attribute__((no_sanitize_undefined)) to get away with the ubsan report.

Then only the BUG_ONs in arm64 code emit lib are a problem, no?

Bye,
Hannes

Alexei Starovoitov Jan. 13, 2016, 12:19 a.m. UTC | #13

On Wed, Jan 13, 2016 at 12:59:46AM +0100, Hannes Frederic Sowa wrote:
> On 13.01.2016 00:47, Alexei Starovoitov wrote:
> >On Tue, Jan 12, 2016 at 03:28:22PM -0800, Eric Dumazet wrote:
> >>On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
> >>>On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
> >>
> >>>>>yep and we all know who was able to code hundreds of cBPF insns by hand ;)
> >>>>>But I'm sure that code doesn't have such broken shifts. :)))
> >>>>
> >>>>libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
> >>>>could be to just mask them here, but not in eBPF verifier, but that would be
> >>>>even more inconsistent (on the other hand, we also allow holes in BPF but not
> >>>>in eBPF, so wouldn't be the first time we make things different), hmm.
> >>>
> >>>I would rather see broken classic bpf program fixed instead of continue
> >>>running them with undefined behavior.
> >>
> >>This is your choice, because you are a developer.
> >>
> >>Some people might be stuck with old software they can not update,
> >>because they do not have the money to pay developers.
> >>
> >>And no, I did not code BPF programs like that, but maybe others did, and
> >>I feel the pain of customers that might be stuck.
> >>
> >>Linus Torvalds always made clear we must provide backward compatibility,
> >>and really this discussion should not even take place.
> >>
> >>As I said, we used to load such BPF program in the past.
> >>
> >>The fact that ARM64 crashes because of a faulty JIT implementation is
> >>not an excuse.
> >
> >I would agree if those loaded programs would do something sensible,
> >but they're broken. As shown arm and arm64 would execute them
> >differently without JIT, because HW treats such shifts differently.
> >I also checked that libpcap is sane and doesn't generate broken shifts.
> >imo we're not breaking backward compatiblity here.
> 
> But on one specific platform those programs did something deterministic,
> reproducible and observable, no? Probably most developers only cared about
> that, probably especially in the embedded segment.

No, they were not. Say we do mask K&31 instead. That may match
what x86 CPUs do, but it will not match arm. You just cannot
define previously undefined behavior without breaking something.
And with error the users can actually fix their stuff.
If their software is so old and cannot be upgraded, then
they shouldn't be upgrading the kernel either, something else will break.
Starting from kernel version. Remember 2.x -> 3.x -> 4.x ?
Also the arm64 JIT crash was noticed only because of fancy fuzzing,
so let's be sensible in our risk estimations.
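
The difference between masking and the hardware behavior on 32-bit ARM
can be sketched with two small models. These are illustrations, not
kernel or JIT code, and the function names are hypothetical. They assume
the documented ISA behavior: x86 masks the count of a 32-bit shift to
its low 5 bits, while a register-specified LSL on 32-bit ARM uses the
low byte of the count and yields 0 for counts of 32 or more.

```c
#include <stdint.h>

/* x86 model: for 32-bit operands the shift count is masked to 5 bits. */
uint32_t shl_x86_model(uint32_t value, uint32_t count)
{
	return value << (count & 31);
}

/*
 * 32-bit ARM model, register-specified LSL: only the low byte of the
 * count is used, and a count of 32 or more produces 0.
 */
uint32_t lsl_arm32_model(uint32_t value, uint32_t count)
{
	count &= 0xff;
	return count >= 32 ? 0 : value << count;
}
```

For a shift of 1 by 45 the first model gives 1 << 13 and the second
gives 0, so masking K to K & 31 would pick one of two results that
existing hardware already disagrees on.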

Hannes Frederic Sowa Jan. 13, 2016, 12:42 a.m. UTC | #14

On 13.01.2016 01:19, Alexei Starovoitov wrote:
> On Wed, Jan 13, 2016 at 12:59:46AM +0100, Hannes Frederic Sowa wrote:
>> On 13.01.2016 00:47, Alexei Starovoitov wrote:
>>> On Tue, Jan 12, 2016 at 03:28:22PM -0800, Eric Dumazet wrote:
>>>> On Tue, 2016-01-12 at 12:46 -0800, Alexei Starovoitov wrote:
>>>>> On Tue, Jan 12, 2016 at 09:42:39PM +0100, Daniel Borkmann wrote:
>>>>
>>>>>>> yep and we all know who was able to code hundreds of cBPF insns by hand ;)
>>>>>>> But I'm sure that code doesn't have such broken shifts. :)))
>>>>>>
>>>>>> libpcap certainly supports raw filters now thanks to Chema [1]. Alternative
>>>>>> could be to just mask them here, but not in eBPF verifier, but that would be
>>>>>> even more inconsistent (on the other hand, we also allow holes in BPF but not
>>>>>> in eBPF, so wouldn't be the first time we make things different), hmm.
>>>>>
>>>>> I would rather see broken classic bpf program fixed instead of continue
>>>>> running them with undefined behavior.
>>>>
>>>> This is your choice, because you are a developer.
>>>>
>>>> Some people might be stuck with old software they can not update,
>>>> because they do not have the money to pay developers.
>>>>
>>>> And no, I did not code BPF programs like that, but maybe others did, and
>>>> I feel the pain of customers that might be stuck.
>>>>
>>>> Linus Torvalds always made clear we must provide backward compatibility,
>>>> and really this discussion should not even take place.
>>>>
>>>> As I said, we used to load such BPF program in the past.
>>>>
>>>> The fact that ARM64 crashes because of a faulty JIT implementation is
>>>> not an excuse.
>>>
>>> I would agree if those loaded programs would do something sensible,
>>> but they're broken. As shown arm and arm64 would execute them
>>> differently without JIT, because HW treats such shifts differently.
>>> I also checked that libpcap is sane and doesn't generate broken shifts.
>>> imo we're not breaking backward compatiblity here.
>>
>> But on one specific platform those programs did something deterministic,
>> reproducible and observable, no? Probably most developers only cared about
>> that, probably especially in the embedded segment.
>
> No, they were not. Say we do mask K&31 instead. That may match
> what x86 cpu do, but it will not match arm. You just cannot
> define previously undefined behavior without breaking something.
> And with error the users can actually fix their stuff.

At least the ARM spec says only the least significant byte is used for 
variable input.

My idea was to simply do nothing and leave it as is, getting rid of the 
BUG_ONs in arm64, maybe replacing them with something sensible according 
to the arm64 spec so it matches non-immediate input. It would define new 
behavior only for arm64.

(Hmm, non-optimizing gcc seems to simply not emit such an instruction 
with constant out of the bound. Maybe also an idea to just skip them in 
arm64? Sounds inelegant...)

> If their software is so old and cannot be upgraded, then
> they shouldn't be upgrading the kernel either, something else will break.
> Starting from kernel version. Remember 2.x -> 3.x -> 4.x ?
> Also the arm64 JIT crash was noticed only because of fancy fuzzing,
> so let's be sensible in our risk estimations.

But that wasn't a crash during execution of the program but during the 
generation of the op codes?

I don't have a strong opinion on that and it seems a fix has already 
been applied.

Bye,
Hannes

Eric Dumazet Jan. 13, 2016, 2:11 a.m. UTC | #15

On Tue, 2016-01-12 at 15:47 -0800, Alexei Starovoitov wrote:

> I would agree if those loaded programs would do something sensible,
> but they're broken. As shown arm and arm64 would execute them
> differently without JIT, because HW treats such shifts differently.
> I also checked that libpcap is sane and doesn't generate broken shifts.
> imo we're not breaking backward compatiblity here.
> 

How did you prove a particular code path was even taken in a BPF
program ? This is new to me.

As I said, it is possible some guys never noticed their BPF programs were
'broken' because this invalid shift was hidden in a dead code part.

So a program might appear as 'weak' when in fact its behavior was
absolutely correct.

You assume everybody uses libpcap, this is wrong, and for very obvious
reasons.

Try to encode the QUEUE, RXHASH, or CPU instructions in libpcap, for a
start.

Alexei Starovoitov Jan. 13, 2016, 2:24 a.m. UTC | #16

On Tue, Jan 12, 2016 at 06:11:38PM -0800, Eric Dumazet wrote:
> On Tue, 2016-01-12 at 15:47 -0800, Alexei Starovoitov wrote:
> 
> > I would agree if those loaded programs would do something sensible,
> > but they're broken. As shown arm and arm64 would execute them
> > differently without JIT, because HW treats such shifts differently.
> > I also checked that libpcap is sane and doesn't generate broken shifts.
> > imo we're not breaking backward compatiblity here.
> > 
> 
> How did you prove a particular code path was even taken in a BPF
> program ? This is new to me.

Simple. I only found absolute constants for shift instructions
in libpcap source.

> As I said, it is possible some guys never noticed their BPF program were
> 'broken' because this invalid shift was hidden in a dead code part.
> 
> So a program might appear as 'weak' when in fact its behavior was
> absolutely correct.
> 
> You assume everybody uses libpcap, this is wrong, and for very obvious
> reasons.

I didn't imply that.
Obviously there is chromium, libseccomp, lxd, dhclient, nmap and tons
of other apps. The point was for the library the most frequently
associated with classic bpf.

I think adding pr_err_once() to bpf_check_classic() as you
suggested makes the most sense to me at this point.
If anyone wants to submit a patch that masks K &= 31, I would be ok with
it as well, but imo it's a disservice to classic bpf users.
Leaving it as-is and waiting for other jits to blow up is not an option.

David Miller Jan. 13, 2016, 2:43 a.m. UTC | #17

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 12 Jan 2016 18:11:38 -0800

> As I said, it is possible some guys never noticed their BPF program
> were 'broken' because this invalid shift was hidden in a dead code
> part.

We should not hide bugs and unintended uses of operations with
undefined behavior.

David Miller Jan. 13, 2016, 2:45 a.m. UTC | #18

From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Tue, 12 Jan 2016 18:24:16 -0800

> If anyone wants to submit a patch that masks K &= 31, I would ok
> with it as well, but imo it's a disservice to classic bpf users.

This is how I feel as well.  I hate when some developer of a tool
thinks it's ok to silently let me do something which it can strictly
determine is questionable without my explicitly asking it to do so.

Eric Dumazet Jan. 13, 2016, 4:07 a.m. UTC | #19

On Tue, 2016-01-12 at 21:43 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 12 Jan 2016 18:11:38 -0800
> 
> > As I said, it is possible some guys never noticed their BPF program
> > were 'broken' because this invalid shift was hidden in a dead code
> > part.
> 
> We should not hide bugs and unintended uses of operations with
> undefined behavior.

   JUMP 2:
   SHR  45
2: RET  10


was a valid program.

But a dumb loader decided to know better.
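
The point about dead code can be made concrete with a userspace sketch.
It is an illustration under stated assumptions: the pseudo-assembly
above is mapped to BPF_JMP|BPF_JA, BPF_ALU|BPF_RSH|BPF_K and
BPF_RET|BPF_K, struct insn mirrors the layout of struct sock_filter, and
check_shifts() and run() are hypothetical toy functions that handle only
these three opcodes.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_JA	0x05	/* BPF_JMP | BPF_JA */
#define OP_RSH	0x74	/* BPF_ALU | BPF_RSH | BPF_K */
#define OP_RET	0x06	/* BPF_RET | BPF_K */

struct insn {
	uint16_t code;
	uint8_t jt, jf;
	uint32_t k;
};

/* JUMP 2: / SHR 45 / 2: RET 10; the shift is never reached. */
const struct insn dead_shift_prog[] = {
	{ OP_JA,  0, 0, 1 },
	{ OP_RSH, 0, 0, 45 },
	{ OP_RET, 0, 0, 10 },
};

/* Load-time check in the style of the patch: every instruction is inspected. */
int check_shifts(const struct insn *prog, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (prog[i].code == OP_RSH && prog[i].k >= 32)
			return -1;
	return 0;
}

/* Toy interpreter; returns the value of the first RET it executes. */
uint32_t run(const struct insn *prog, size_t len)
{
	uint32_t acc = 0;

	for (size_t pc = 0; pc < len; pc++) {
		switch (prog[pc].code) {
		case OP_JA:
			pc += prog[pc].k;	/* skip k instructions */
			break;
		case OP_RSH:
			acc >>= prog[pc].k & 31; /* masked only to keep the toy defined */
			break;
		case OP_RET:
			return prog[pc].k;
		}
	}
	return 0;
}
```

Running the program returns 10 without ever touching the shift, while
the load-time check rejects it. That is the gap being argued over: the
checker cannot tell a harmless dead instruction from a reachable one
without analysing control flow.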

Hannes Frederic Sowa Jan. 13, 2016, 4:27 a.m. UTC | #20

On Wed, Jan 13, 2016, at 03:43, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 12 Jan 2016 18:11:38 -0800
> 
> > As I said, it is possible some guys never noticed their BPF program
> > were 'broken' because this invalid shift was hidden in a dead code
> > part.
> 
> We should not hide bugs and unintended uses of operations with
> undefined behavior.

The term 'undefined behavior' is defined in terms of the C
specification. We tend to implement the BPF interpreter with C in the
kernel, so we get in contact with that, but BPF programs were mostly
written by hand in BPF assembler or by generators. So the term
'undefined behavior' seems not to be fitting well here. As pointed out
BPF sadly does rely on some specific processors ISAs but not on the C
specification. This also is an advantage as otherwise the JITs would
need to handle all those invalid shifts at runtime and generate checking
code.

I think it makes sense to adapt BPF towards the ISAs or, in the case of
the interpreter, towards gcc behavior (which sadly can change, too).
ISAs describe quite strictly what the CPUs do when a variable shift
operand larger than the register bit size is applied.

Bye,
Hannes

David Miller Jan. 13, 2016, 5 a.m. UTC | #21

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 12 Jan 2016 20:07:44 -0800

> On Tue, 2016-01-12 at 21:43 -0500, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Tue, 12 Jan 2016 18:11:38 -0800
>> 
>> > As I said, it is possible some guys never noticed their BPF program
>> > were 'broken' because this invalid shift was hidden in a dead code
>> > part.
>> 
>> We should not hide bugs and unintended uses of operations with
>> undefined behavior.
> 
>    JUMP 2:
>    SHR  45
> 2: RET  10
> 
> 
> was a valid program.
> 
> But a dumb loader decided to know better.

I guess you are uninterested in knowing your programs contain such
garbage.
