[RFC,0/2] add ppp_generic ioctl to bridge channels

Message ID 20201106181647.16358-1-tparkin@katalix.com (mailing list archive)

Message

Tom Parkin Nov. 6, 2020, 6:16 p.m. UTC
This small RFC series implements a suggestion from Guillaume Nault in
response to my previous submission to add an ac/pppoe driver to the l2tp
subsystem[1].

Following Guillaume's advice, this series adds an ioctl to the ppp code
to allow a ppp channel to be bridged to another.  Quoting Guillaume:

"It's just a matter of extending struct channel (in ppp_generic.c) with
a pointer to another channel, then testing this pointer in ppp_input().
If the pointer is NULL, use the classical path, if not, forward the PPP
frame using the ->start_xmit function of the peer channel."

This allows userspace to easily take PPP frames from e.g. a PPPoE
session, and forward them over a PPPoL2TP session; accomplishing the
same thing my earlier ac/pppoe driver in l2tp did but in much less code!
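
In rough outline, the mechanism looks like this (a simplified sketch,
not a verbatim excerpt from the patches):

    /* sketch: struct channel in ppp_generic.c grows a peer pointer */
    struct channel {
            /* ... existing fields ... */
            struct channel *bridge; /* peer channel, NULL when not bridged */
    };

    /* sketch: in ppp_input(), bridged frames bypass the classical path */
    if (pch->bridge) {
            pch->bridge->chan->ops->start_xmit(pch->bridge->chan, skb);
            return;
    }
    /* otherwise continue with the classical receive path */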

Since I am not an expert in the ppp code, this patch set is RFC to
gather any comments prior to making a proper submission.  I have tested
this code using go-l2tp[2] and l2tp-ktest[3], but I have some
uncertainties about the current implementation:

 * I believe that the fact we're not explicitly locking anything in the
   ppp_input path for access to the channel bridge field is OK since:
   
    - ppp_input is called from the socket backlog recv

    - pppox_unbind (which calls ppp_channel_unregister, which unsets the
      channel bridge field) is called from the socket release

   As such I think the bridge pointer cannot change in the recv
   path since, as the pppoe.c code says, "Semantics of backlog rcv
   preclude any code from executing in lock_sock()/release_sock()
   bounds".

 * When userspace makes a PPPIOCBRIDGECHAN ioctl call, the channel the
   ioctl is called on is updated to point to the channel identified
   using the index passed in the ioctl call.

   As such, to allow PPP frames to pass in both directions between
   channel A and channel B, userspace must call the ioctl twice: once
   to bridge A to B, and once to bridge B to A.

   This approach makes the kernel coding easier, because the ioctl
   handler doesn't need to do anything to lock the channel which is
   identified by index: it's sufficient to find it in the per-net list
   (under protection of the list lock) and take a reference on it.

   The downside is that userspace must make two ioctl calls to fully set
   up the bridge.
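
To make the two-call semantics concrete, here is a minimal userspace
sketch (error handling omitted; chan_a_fd and chan_b_fd are assumed to
be /dev/ppp file descriptors attached to channels A and B, whose
channel indices are idx_a and idx_b):

    /* bridge A -> B: frames received on channel A are sent on channel B */
    ioctl(chan_a_fd, PPPIOCBRIDGECHAN, &idx_b);

    /* bridge B -> A: frames received on channel B are sent on channel A */
    ioctl(chan_b_fd, PPPIOCBRIDGECHAN, &idx_a);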

Any comments on the design welcome, especially thoughts on the two
points above.

Thanks :-)

[1]. Previous l2tp ac/pppoe patch set:

https://lore.kernel.org/netdev/20200930210707.10717-1-tparkin@katalix.com/

[2]. go-l2tp: a Go library for building L2TP applications on Linux
systems; support for the PPPIOCBRIDGECHAN ioctl is on a branch:

https://github.com/katalix/go-l2tp/tree/tp_002_pppoe_2

[3]. l2tp-ktest: a test suite for the Linux Kernel L2TP subsystem

https://github.com/katalix/l2tp-ktest

Tom Parkin (2):
  ppp: add PPPIOCBRIDGECHAN ioctl
  docs: update ppp_generic.rst to describe ioctl PPPIOCBRIDGECHAN

 Documentation/networking/ppp_generic.rst |  5 ++++
 drivers/net/ppp/ppp_generic.c            | 35 +++++++++++++++++++++++-
 include/uapi/linux/ppp-ioctl.h           |  1 +
 3 files changed, 40 insertions(+), 1 deletion(-)

Comments

Guillaume Nault Nov. 9, 2020, 10:51 p.m. UTC | #1
On Fri, Nov 06, 2020 at 06:16:45PM +0000, Tom Parkin wrote:
> This small RFC series implements a suggestion from Guillaume Nault in
> response to my previous submission to add an ac/pppoe driver to the l2tp
> subsystem[1].
> 
> Following Guillaume's advice, this series adds an ioctl to the ppp code
> to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> 
> "It's just a matter of extending struct channel (in ppp_generic.c) with
> a pointer to another channel, then testing this pointer in ppp_input().
> If the pointer is NULL, use the classical path, if not, forward the PPP
> frame using the ->start_xmit function of the peer channel."
> 
> This allows userspace to easily take PPP frames from e.g. a PPPoE
> session, and forward them over a PPPoL2TP session; accomplishing the
> same thing my earlier ac/pppoe driver in l2tp did but in much less code!

Nice to see this RFC. Thanks!

> Since I am not an expert in the ppp code, this patch set is RFC to
> gather any comments prior to making a proper submission.  I have tested
> this code using go-l2tp[2] and l2tp-ktest[3], but I have some
> uncertainties about the current implementation:
> 
>  * I believe that the fact we're not explicitly locking anything in the
>    ppp_input path for access to the channel bridge field is OK since:
>    
>     - ppp_input is called from the socket backlog recv
> 
>     - pppox_unbind (which calls ppp_channel_unregister, which unsets the
>       channel bridge field) is called from the socket release
> 
>    As such I think the bridge pointer cannot change in the recv
>    path since, as the pppoe.c code says, "Semantics of backlog rcv
>    preclude any code from executing in lock_sock()/release_sock()
>    bounds".

But ppp_input() is used beyond pppoe. For example, I'm pretty sure these
pre-conditions aren't met for L2TP (pppol2tp_recv() processes packets
directly, packets aren't queued by sk_receive_skb()).

To avoid locking the channel bridge in the data path, you can protect
the pointer with RCU.
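
For example, a rough sketch of the RCU approach (field names assumed,
this isn't code from the patches):

    struct channel *pbridge;

    /* writer side (e.g. the ioctl handler), channel suitably locked */
    rcu_assign_pointer(pch->bridge, peer);

    /* reader side, in ppp_input() */
    rcu_read_lock();
    pbridge = rcu_dereference(pch->bridge);
    if (pbridge)
            pbridge->chan->ops->start_xmit(pbridge->chan, skb);
    rcu_read_unlock();

    /* teardown: clear the pointer, wait for in-flight readers, then
     * drop the reference taken on the peer channel at bridge time */
    RCU_INIT_POINTER(pch->bridge, NULL);
    synchronize_rcu();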

>  * When userspace makes a PPPIOCBRIDGECHAN ioctl call, the channel the
>    ioctl is called on is updated to point to the channel identified
>    using the index passed in the ioctl call.
> 
> >    As such, to allow PPP frames to pass in both directions between
> >    channel A and channel B, userspace must call the ioctl twice: once
> >    to bridge A to B, and once to bridge B to A.
> 
>    This approach makes the kernel coding easier, because the ioctl
>    handler doesn't need to do anything to lock the channel which is
>    identified by index: it's sufficient to find it in the per-net list
>    (under protection of the list lock) and take a reference on it.
> 
>    The downside is that userspace must make two ioctl calls to fully set
>    up the bridge.

That's probably okay, but that'd allow for very strange setups, like
channel A pointing to channel B and channel B being used by a PPP unit.
I'd prefer to avoid having to think about such scenarios when reasoning
about the code.

I think that the channel needs to be locked anyway to safely modify the
bridge pointer. So the "no lock needed" benefit of the 2 ioctl calls
approach doesn't seem to stand.

> Any comments on the design welcome, especially thoughts on the two
> points above.

I haven't gone through all the details yet, but the general design
looks good to me. I'll comment inline with more precise feedback.

BTW, shouldn't we have an "UNBRIDGE" command to remove the bridge
between two channels?
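
Hypothetically, something mirroring the BRIDGE call (sketch only; no
such ioctl exists in the RFC as posted):

    /* hypothetical: tear down the bridge from either end */
    ioctl(chan_fd, PPPIOCUNBRIDGECHAN);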

> Thanks :-)
> 
> [1]. Previous l2tp ac/pppoe patch set:
> 
> https://lore.kernel.org/netdev/20200930210707.10717-1-tparkin@katalix.com/
> 
> [2]. go-l2tp: a Go library for building L2TP applications on Linux
> systems; support for the PPPIOCBRIDGECHAN ioctl is on a branch:
> 
> https://github.com/katalix/go-l2tp/tree/tp_002_pppoe_2
> 
> [3]. l2tp-ktest: a test suite for the Linux Kernel L2TP subsystem
> 
> https://github.com/katalix/l2tp-ktest
> 
> Tom Parkin (2):
>   ppp: add PPPIOCBRIDGECHAN ioctl
>   docs: update ppp_generic.rst to describe ioctl PPPIOCBRIDGECHAN
> 
>  Documentation/networking/ppp_generic.rst |  5 ++++
>  drivers/net/ppp/ppp_generic.c            | 35 +++++++++++++++++++++++-
>  include/uapi/linux/ppp-ioctl.h           |  1 +
>  3 files changed, 40 insertions(+), 1 deletion(-)
> 
> -- 
> 2.17.1
>
Jakub Kicinski Nov. 9, 2020, 11:52 p.m. UTC | #2
On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:
> This small RFC series implements a suggestion from Guillaume Nault in
> response to my previous submission to add an ac/pppoe driver to the l2tp
> subsystem[1].
> 
> Following Guillaume's advice, this series adds an ioctl to the ppp code
> to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> 
> "It's just a matter of extending struct channel (in ppp_generic.c) with
> a pointer to another channel, then testing this pointer in ppp_input().
> If the pointer is NULL, use the classical path, if not, forward the PPP
> frame using the ->start_xmit function of the peer channel."
> 
> This allows userspace to easily take PPP frames from e.g. a PPPoE
> session, and forward them over a PPPoL2TP session; accomplishing the
> same thing my earlier ac/pppoe driver in l2tp did but in much less code!

I have little understanding of the ppp code, but I can't help but
wonder why this special channel connection is needed? We have a great
many ways to redirect traffic between interfaces - bpf, tc, netfilter,
is there anything ppp specific that is required here?
Guillaume Nault Nov. 10, 2020, 9:28 a.m. UTC | #3
On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:
> > This small RFC series implements a suggestion from Guillaume Nault in
> > response to my previous submission to add an ac/pppoe driver to the l2tp
> > subsystem[1].
> > 
> > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> > 
> > "It's just a matter of extending struct channel (in ppp_generic.c) with
> > a pointer to another channel, then testing this pointer in ppp_input().
> > If the pointer is NULL, use the classical path, if not, forward the PPP
> > frame using the ->start_xmit function of the peer channel."
> > 
> > This allows userspace to easily take PPP frames from e.g. a PPPoE
> > session, and forward them over a PPPoL2TP session; accomplishing the
> > same thing my earlier ac/pppoe driver in l2tp did but in much less code!
> 
> I have little understanding of the ppp code, but I can't help but
> wonder why this special channel connection is needed? We have a great
> many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> is there anything ppp specific that is required here?

I can see two viable ways to implement this feature. The one described
in this patch series is the simplest. The reason why it doesn't reuse
existing infrastructure is because it has to work at the link layer
(no netfilter) and also has to work on PPP channels (no network
device).

The alternative is to implement a virtual network device for the
protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
and teach tunnel_key about them. Then we could use iproute2 commands
like:
 # ip link add name pppoe0 up type pppoe external
 # ip link add name l2tp0 up type l2tp external
 # tc qdisc add dev pppoe0 ingress
 # tc qdisc add dev l2tp0 ingress
 # tc filter add dev pppoe0 ingress matchall                        \
     action tunnel_key set l2tp_version 2 l2tp_tid XXX l2tp_sid YYY \
     action mirred egress redirect dev l2tp0
 # tc filter add dev l2tp0 ingress matchall  \
     action tunnel_key set pppoe_sid ZZZ     \
     action mirred egress redirect dev pppoe0

Note: I've used matchall for simplicity, but a real use case would
have to filter on the L2TP session and tunnel IDs and on the PPPoE
session ID.

As I said in my reply to the original thread, I like this idea, but
haven't thought much about the details. So there might be some road
blocks. Beyond modernising PPP and making it better integrated into the
stack, that should also bring the possibility of hardware offload (but
would any NIC vendor be interested?).

I think the question is more about long-term maintenance. Do we want
to keep the PPP-related modules self-contained, with low-maintenance code
(the current proposal)? Or are we willing to modernise the
infrastructure, add support and maintain PPP features in other modules
like flower, tunnel_key, etc.?

Of course, I might have missed other ways to implement this feature.
But that's all I could think of for now.

And if anyone wants a quick recap about PPP (what are these PPP channel
and unit things? what's the relationship between PPPoE, L2TP and PPP?
etc.), just let me know.

Hope this clarifies the situation.
Tom Parkin Nov. 10, 2020, 11:54 a.m. UTC | #4
On  Mon, Nov 09, 2020 at 23:51:53 +0100, Guillaume Nault wrote:
> >  * I believe that the fact we're not explicitly locking anything in the
> >    ppp_input path for access to the channel bridge field is OK since:
> >    
> >     - ppp_input is called from the socket backlog recv
> > 
> >     - pppox_unbind (which calls ppp_channel_unregister, which unsets the
> >       channel bridge field) is called from the socket release
> > 
> >    As such I think the bridge pointer cannot change in the recv
> >    path since, as the pppoe.c code says, "Semantics of backlog rcv
> >    preclude any code from executing in lock_sock()/release_sock()
> >    bounds".
> 
> But ppp_input() is used beyond pppoe. For example, I'm pretty sure these
> pre-conditions aren't met for L2TP (pppol2tp_recv() processes packets
> directly, packets aren't queued by sk_receive_skb()).

Yes, that's true.  I was basing my assumption on the fact that the
l2tp/pppox recv path made similar checks to those in pppoe.c, e.g.
sk_state.

I take your point more widely though: ppp_input is used by multiple
pppox drivers, so it probably makes more sense to protect the bridge
with a lock than rely on the driver implementation(s) all behaving in
the same way.

> To avoid locking the channel bridge in the data path, you can protect
> the pointer with RCU.

Ack, I'll look at doing so.

> >  * When userspace makes a PPPIOCBRIDGECHAN ioctl call, the channel the
> >    ioctl is called on is updated to point to the channel identified
> >    using the index passed in the ioctl call.
> > 
> >    As such, to allow PPP frames to pass in both directions between
> >    channel A and channel B, userspace must call the ioctl twice: once
> >    to bridge A to B, and once to bridge B to A.
> > 
> >    This approach makes the kernel coding easier, because the ioctl
> >    handler doesn't need to do anything to lock the channel which is
> >    identified by index: it's sufficient to find it in the per-net list
> >    (under protection of the list lock) and take a reference on it.
> > 
> >    The downside is that userspace must make two ioctl calls to fully set
> >    up the bridge.
> 
> That's probably okay, but that'd allow for very strange setups, like
> channel A pointing to channel B and channel B being used by a PPP unit.
> I'd prefer to avoid having to think about such scenarios when reasoning
> about the code.

Good point about the cognitive load.  I agree with you there.

> I think that the channel needs to be locked anyway to safely modify the
> bridge pointer. So the "no lock needed" benefit of the 2 ioctl calls
> approach doesn't seem to stand.

Agreed.

> BTW, shouldn't we have an "UNBRIDGE" command to remove the bridge
> between two channels?

I'm not sure of the use case for it, to be honest.  Do you have
something specific in mind?


Thanks very much for your review and comments, it's much appreciated
:-)
Tom Parkin Nov. 10, 2020, 12:42 p.m. UTC | #5
On  Tue, Nov 10, 2020 at 10:28:34 +0100, Guillaume Nault wrote:
> On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> > On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:
> > > This small RFC series implements a suggestion from Guillaume Nault in
> > > response to my previous submission to add an ac/pppoe driver to the l2tp
> > > subsystem[1].
> > > 
> > > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > > to allow a ppp channel to be bridged to another.
> > 
> > I have little understanding of the ppp code, but I can't help but
> > wonder why this special channel connection is needed? We have a great
> > many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> > is there anything ppp specific that is required here?
> 
> I can see two viable ways to implement this feature. The one described
> in this patch series is the simplest. The reason why it doesn't reuse
> existing infrastructure is because it has to work at the link layer
> (no netfilter) and also has to work on PPP channels (no network
> device).
> 
> The alternative is to implement a virtual network device for the
> protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
> and teach tunnel_key about them.

One potential downside of this approach is the addition of two virtual
interfaces for each pppoe->pppol2tp mapping: the concern here
primarily being the cost of doing so.

I'm not saying the cost is necessarily prohibitive, but the "bridge the
channels" approach in the RFC is certainly cheaper.

Another concern would be the possibility of the virtual devices being
misconfigured in such a way as to e.g. allow locally generated
broadcast packets to go out on one of the interfaces.  Possibly this
would be easy to avoid, I'm not sure.

> I think the question is more about long-term maintenance. Do we want
> to keep the PPP-related modules self-contained, with low-maintenance code
> (the current proposal)? Or are we willing to modernise the
> infrastructure, add support and maintain PPP features in other modules
> like flower, tunnel_key, etc.?

FWIW I would tend to agree.
Guillaume Nault Nov. 10, 2020, 3:02 p.m. UTC | #6
On Tue, Nov 10, 2020 at 12:42:24PM +0000, Tom Parkin wrote:
> On  Tue, Nov 10, 2020 at 10:28:34 +0100, Guillaume Nault wrote:
> > On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> > > On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:
> > > > This small RFC series implements a suggestion from Guillaume Nault in
> > > > response to my previous submission to add an ac/pppoe driver to the l2tp
> > > > subsystem[1].
> > > > 
> > > > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > > > to allow a ppp channel to be bridged to another.
> > > 
> > > I have little understanding of the ppp code, but I can't help but
> > > wonder why this special channel connection is needed? We have a great
> > > many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> > > is there anything ppp specific that is required here?
> > 
> > I can see two viable ways to implement this feature. The one described
> > in this patch series is the simplest. The reason why it doesn't reuse
> > existing infrastructure is because it has to work at the link layer
> > (no netfilter) and also has to work on PPP channels (no network
> > device).
> > 
> > The alternative is to implement a virtual network device for the
> > protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
> > and teach tunnel_key about them.
> 
> One potential downside of this approach is the addition of two virtual
> interfaces for each pppoe->pppol2tp mapping: the concern here
> primarily being the cost of doing so.

No, this is a fixed cost. There'd be only one PPPoE interface for
handling all the PPPoE sessions and one for L2TP. These virtual
interfaces wouldn't be specific to a particular session ID. Instead,
the encapsulation information would be attached to the skb, and the
virtual PPPoE or L2TP device would build the header based on this
metadata.

> I'm not saying the cost is necessarily prohibitive, but the "bridge the
> channels" approach in the RFC is certainly cheaper.
> 
> Another concern would be the possibility of the virtual devices being
> misconfigured in such a way as to e.g. allow locally generated
> broadcast packets to go out on one of the interfaces.  Possibly this
> would be easy to avoid, I'm not sure.

I'm not too worried about that. A PPPoE or L2TP interface in external
mode couldn't build its header and forward the packet if the skb it
received didn't have the proper metadata attached to it. So packets
couldn't be inadvertently sent through these interfaces; something
would have to attach the tunnel metadata to the skb beforehand.
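
Something like the following hypothetical xmit guard, reusing
ip_tunnel_info purely for illustration (a real PPPoE or L2TP device
would likely need its own metadata type):

    static netdev_tx_t pppoe_md_xmit(struct sk_buff *skb,
                                     struct net_device *dev)
    {
            struct ip_tunnel_info *info = skb_tunnel_info(skb);

            if (!info) {
                    /* no tunnel metadata: nothing to build a header from */
                    dev->stats.tx_dropped++;
                    kfree_skb(skb);
                    return NETDEV_TX_OK;
            }
            /* ... build the encapsulation header from the metadata ... */
            return NETDEV_TX_OK;
    }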

But I agree that the setup becomes visible to the administrator, while
the ioctl() approach keeps the whole kernel configuration in the hands
of the control-plane implementation. I think that's a good thing
(easier testing and troubleshooting), but yes, that also opens the
possibility for fat-finger mistakes.

> > I think the question is more about long-term maintenance. Do we want
> > to keep the PPP-related modules self-contained, with low-maintenance code
> > (the current proposal)? Or are we willing to modernise the
> > infrastructure, add support and maintain PPP features in other modules
> > like flower, tunnel_key, etc.?
> 
> FWIW I would tend to agree.

Yes, it's really about how much we're ready to invest in PPP-related
features.
Jakub Kicinski Nov. 10, 2020, 4:47 p.m. UTC | #7
On Tue, 10 Nov 2020 10:28:34 +0100 Guillaume Nault wrote:
> On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> > On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:  
> > > This small RFC series implements a suggestion from Guillaume Nault in
> > > response to my previous submission to add an ac/pppoe driver to the l2tp
> > > subsystem[1].
> > > 
> > > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > > to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> > > 
> > > "It's just a matter of extending struct channel (in ppp_generic.c) with
> > > a pointer to another channel, then testing this pointer in ppp_input().
> > > If the pointer is NULL, use the classical path, if not, forward the PPP
> > > frame using the ->start_xmit function of the peer channel."
> > > 
> > > This allows userspace to easily take PPP frames from e.g. a PPPoE
> > > session, and forward them over a PPPoL2TP session; accomplishing the
> > > same thing my earlier ac/pppoe driver in l2tp did but in much less code!  
> > 
> > I have little understanding of the ppp code, but I can't help but
> > wonder why this special channel connection is needed? We have a great
> > many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> > is there anything ppp specific that is required here?  
> 
> I can see two viable ways to implement this feature. The one described
> in this patch series is the simplest. The reason why it doesn't reuse
> existing infrastructure is because it has to work at the link layer
> (no netfilter) and also has to work on PPP channels (no network
> device).
> 
> The alternative is to implement a virtual network device for the
> protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
> and teach tunnel_key about them. Then we could use iproute2 commands
> like:
>  # ip link add name pppoe0 up type pppoe external
>  # ip link add name l2tp0 up type l2tp external
>  # tc qdisc add dev pppoe0 ingress
>  # tc qdisc add dev l2tp0 ingress
>  # tc filter add dev pppoe0 ingress matchall                        \
>      action tunnel_key set l2tp_version 2 l2tp_tid XXX l2tp_sid YYY \
>      action mirred egress redirect dev l2tp0
>  # tc filter add dev l2tp0 ingress matchall  \
>      action tunnel_key set pppoe_sid ZZZ     \
>      action mirred egress redirect dev pppoe0
> 
> Note: I've used matchall for simplicity, but a real use case would
> have to filter on the L2TP session and tunnel IDs and on the PPPoE
> session ID.
> 
> As I said in my reply to the original thread, I like this idea, but
> haven't thought much about the details. So there might be some road
> blocks. Beyond modernising PPP and making it better integrated into the
> stack, that should also bring the possibility of hardware offload (but
> would any NIC vendor be interested?).

Integrating with the stack gives you access to all its features, other
types of encap, firewalling, bpf, etc.

> I think the question is more about long-term maintenance. Do we want
> to keep the PPP-related modules self-contained, with low-maintenance code
> (the current proposal)? Or are we willing to modernise the
> infrastructure, add support and maintain PPP features in other modules
> like flower, tunnel_key, etc.?

Right, it's really not great to see new IOCTLs being added to drivers,
but the alternative would require easily 50 times more code.
 
> Of course, I might have missed other ways to implement this feature.
> But that's all I could think of for now.
> 
> And if anyone wants a quick recap about PPP (what are these PPP channel
> and unit things? what's the relationship between PPPoE, L2TP and PPP?
> etc.), just let me know.

Some pointers would be appreciated if you don't mind :)
Guillaume Nault Nov. 15, 2020, 11:59 a.m. UTC | #8
On Tue, Nov 10, 2020 at 11:54:07AM +0000, Tom Parkin wrote:
> On  Mon, Nov 09, 2020 at 23:51:53 +0100, Guillaume Nault wrote:
> > BTW, shouldn't we have an "UNBRIDGE" command to remove the bridge
> > between two channels?
> 
> I'm not sure of the use case for it, to be honest.  Do you have
> something specific in mind?

I don't know if there'd be a real production use case. I proposed it
because, in my experience, the difficult part of any new feature is
the "undo" operation. That's where many race conditions are found.

Having a way to directly revert a BRIDGE operation might help testing
the undo path (otherwise it's just triggered as a side effect of
closing a file descriptor). I personally find that having symmetrical
"do" and "undo" operations helps me thinking precisely about how to
manage concurency. But that's probably a matter of preference. And that
can even be done without exposing the "undo" operation to user space
(it's just more difficult to test).

Anyway, that was just a suggestion. I have no strong opinion.

> Thanks very much for your review and comments, it's much appreciated
> :-)

Thanks! :)
Tom Parkin Nov. 17, 2020, 12:12 p.m. UTC | #9
On  Sun, Nov 15, 2020 at 12:59:59 +0100, Guillaume Nault wrote:
> On Tue, Nov 10, 2020 at 11:54:07AM +0000, Tom Parkin wrote:
> > On  Mon, Nov 09, 2020 at 23:51:53 +0100, Guillaume Nault wrote:
> > > BTW, shouldn't we have an "UNBRIDGE" command to remove the bridge
> > > between two channels?
> > 
> > I'm not sure of the use case for it, to be honest.  Do you have
> > something specific in mind?
> 
> I don't know if there'd be a real production use case. I proposed it
> because, in my experience, the difficult part of any new feature is
> the "undo" operation. That's where many race conditions are found.
> 
> Having a way to directly revert a BRIDGE operation might help testing
> the undo path (otherwise it's just triggered as a side effect of
> closing a file descriptor). I personally find that having symmetrical
> "do" and "undo" operations helps me thinking precisely about how to
> manage concurency. But that's probably a matter of preference. And that
> can even be done without exposing the "undo" operation to user space
> (it's just more difficult to test).
> 
> Anyway, that was just a suggestion. I have no strong opinion.

Thanks for clarifying the point -- I agree with you about the "undo"
operation helping to expose race conditions.
Tom Parkin Nov. 17, 2020, 12:54 p.m. UTC | #10
On  Tue, Nov 10, 2020 at 08:47:40 -0800, Jakub Kicinski wrote:
> On Tue, 10 Nov 2020 10:28:34 +0100 Guillaume Nault wrote:
> > On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> > > On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:  
> > > > This small RFC series implements a suggestion from Guillaume Nault in
> > > > response to my previous submission to add an ac/pppoe driver to the l2tp
> > > > subsystem[1].
> > > > 
> > > > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > > > to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> > > > 
> > > > "It's just a matter of extending struct channel (in ppp_generic.c) with
> > > > a pointer to another channel, then testing this pointer in ppp_input().
> > > > If the pointer is NULL, use the classical path, if not, forward the PPP
> > > > frame using the ->start_xmit function of the peer channel."
> > > > 
> > > > This allows userspace to easily take PPP frames from e.g. a PPPoE
> > > > session, and forward them over a PPPoL2TP session; accomplishing the
> > > > same thing my earlier ac/pppoe driver in l2tp did but in much less code!  
> > > 
> > > I have little understanding of the ppp code, but I can't help but
> > > wonder why this special channel connection is needed? We have a great
> > > many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> > > is there anything ppp specific that is required here?  
> > 
> > I can see two viable ways to implement this feature. The one described
> > in this patch series is the simplest. The reason why it doesn't reuse
> > existing infrastructure is because it has to work at the link layer
> > (no netfilter) and also has to work on PPP channels (no network
> > device).
> > 
> > The alternative is to implement a virtual network device for the
> > protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
> > and teach tunnel_key about them. Then we could use iproute2 commands
> > like:
> >  # ip link add name pppoe0 up type pppoe external
> >  # ip link add name l2tp0 up type l2tp external
> >  # tc qdisc add dev pppoe0 ingress
> >  # tc qdisc add dev l2tp0 ingress
> >  # tc filter add dev pppoe0 ingress matchall                        \
> >      action tunnel_key set l2tp_version 2 l2tp_tid XXX l2tp_sid YYY \
> >      action mirred egress redirect dev l2tp0
> >  # tc filter add dev l2tp0 ingress matchall  \
> >      action tunnel_key set pppoe_sid ZZZ     \
> >      action mirred egress redirect dev pppoe0
> > 
> > Note: I've used matchall for simplicity, but a real use case would
> > have to filter on the L2TP session and tunnel IDs and on the PPPoE
> > session ID.
> > 
> > As I said in my reply to the original thread, I like this idea, but
> > haven't thought much about the details. So there might be some road
> > blocks. Beyond modernising PPP and making it better integrated into the
> > stack, that should also bring the possibility of hardware offload (but
> > would any NIC vendor be interested?).
> 
> Integrating with the stack gives you access to all its features, other
> types of encap, firewalling, bpf, etc.
> 
> > I think the question is more about long-term maintenance. Do we want
> > to keep the PPP-related modules self-contained, with low-maintenance code
> > (the current proposal)? Or are we willing to modernise the
> > infrastructure, add support and maintain PPP features in other modules
> > like flower, tunnel_key, etc.?
> 
> Right, it's really not great to see new IOCTLs being added to drivers,
> but the alternative would require easily 50 times more code.

Jakub, could I quickly poll you on your current gut-feel level of
opposition to the ioctl-based approach?

Guillaume has given good feedback on the RFC code which I can work
into an actual patch submission, but I don't really want to if you're
totally opposed to the whole idea :-)

I appreciate you may want to reserve judgement pending a recap of the
ppp subsystem as it stands.

Thanks!
Guillaume Nault Nov. 17, 2020, 2:17 p.m. UTC | #11
On Tue, Nov 17, 2020 at 12:54:22PM +0000, Tom Parkin wrote:
> On  Tue, Nov 10, 2020 at 08:47:40 -0800, Jakub Kicinski wrote:
> > On Tue, 10 Nov 2020 10:28:34 +0100 Guillaume Nault wrote:
> > > I think the question is more about long-term maintenance. Do we want
> > > to keep the PPP-related modules self-contained, with low-maintenance code
> > > (the current proposal)? Or are we willing to modernise the
> > > infrastructure, add support and maintain PPP features in other modules
> > > like flower, tunnel_key, etc.?
> > 
> > Right, it's really not great to see new IOCTLs being added to drivers,
> > but the alternative would require easily 50 times more code.
> 
> Jakub, could I quickly poll you on your current gut-feel level of
> opposition to the ioctl-based approach?
> 
> Guillaume has given good feedback on the RFC code which I can work
> into an actual patch submission, but I don't really want to if you're
> totally opposed to the whole idea :-)
> 
> I appreciate you may want to reserve judgement pending a recap of the
> ppp subsystem as it stands.

I've started writing some general explanations about the protocol and
the actual kernel implementation. I'm planning to send them in the days
to come. I just have to finish some higher priority tasks first. Sorry
for the delay.
Jakub Kicinski Nov. 17, 2020, 4:52 p.m. UTC | #12
On Tue, 17 Nov 2020 12:54:22 +0000 Tom Parkin wrote:
> > > I think the question is more about long-term maintenance. Do we want
> > > to keep the PPP-related modules self-contained, with low-maintenance code
> > > (the current proposal)? Or are we willing to modernise the
> > > infrastructure, add support and maintain PPP features in other modules
> > > like flower, tunnel_key, etc.?  
> > 
> > Right, it's really not great to see new IOCTLs being added to drivers,
> > but the alternative would require easily 50 times more code.  
> 
> Jakub, could I quickly poll you on your current gut-feel level of
> opposition to the ioctl-based approach?
> 
> Guillaume has given good feedback on the RFC code which I can work
> into an actual patch submission, but I don't really want to if you're
> totally opposed to the whole idea :-)

I'll merge it if no one else speaks up in opposition.
Guillaume Nault Nov. 18, 2020, 8:24 p.m. UTC | #13
On Tue, Nov 10, 2020 at 08:47:40AM -0800, Jakub Kicinski wrote:
> On Tue, 10 Nov 2020 10:28:34 +0100 Guillaume Nault wrote:
> > On Mon, Nov 09, 2020 at 03:52:37PM -0800, Jakub Kicinski wrote:
> > > On Fri,  6 Nov 2020 18:16:45 +0000 Tom Parkin wrote:  
> > > > This small RFC series implements a suggestion from Guillaume Nault in
> > > > response to my previous submission to add an ac/pppoe driver to the l2tp
> > > > subsystem[1].
> > > > 
> > > > Following Guillaume's advice, this series adds an ioctl to the ppp code
> > > > to allow a ppp channel to be bridged to another.  Quoting Guillaume:
> > > > 
> > > > "It's just a matter of extending struct channel (in ppp_generic.c) with
> > > > a pointer to another channel, then testing this pointer in ppp_input().
> > > > If the pointer is NULL, use the classical path, if not, forward the PPP
> > > > frame using the ->start_xmit function of the peer channel."
> > > > 
> > > > This allows userspace to easily take PPP frames from e.g. a PPPoE
> > > > session, and forward them over a PPPoL2TP session; accomplishing the
> > > > same thing my earlier ac/pppoe driver in l2tp did but in much less code!  
> > > 
> > > I have little understanding of the ppp code, but I can't help but
> > > wonder why this special channel connection is needed? We have a great
> > > many ways to redirect traffic between interfaces - bpf, tc, netfilter,
> > > is there anything ppp specific that is required here?  
> > 
> > I can see two viable ways to implement this feature. The one described
> > in this patch series is the simplest. The reason why it doesn't reuse
> > existing infrastructure is because it has to work at the link layer
> > (no netfilter) and also has to work on PPP channels (no network
> > device).
> > 
> > The alternative is to implement a virtual network device for the
> > protocols we want to support (at least PPPoE and L2TP, maybe PPTP)
> > and teach tunnel_key about them. Then we could use iproute2 commands
> > like:
> >  # ip link add name pppoe0 up type pppoe external
> >  # ip link add name l2tp0 up type l2tp external
> >  # tc qdisc add dev pppoe0 ingress
> >  # tc qdisc add dev l2tp0 ingress
> >  # tc filter add dev pppoe0 ingress matchall                        \
> >      action tunnel_key set l2tp_version 2 l2tp_tid XXX l2tp_sid YYY \
> >      action mirred egress redirect dev l2tp0
> >  # tc filter add dev l2tp0 ingress matchall  \
> >      action tunnel_key set pppoe_sid ZZZ     \
> >      action mirred egress redirect dev pppoe0
> > 
> > Note: I've used matchall for simplicity, but a real use case would
> > have to filter on the L2TP session and tunnel IDs and on the PPPoE
> > session ID.
> > 
> > As I said in my reply to the original thread, I like this idea, but
> > haven't thought much about the details. So there might be some road
> > blocks. Beyond modernising PPP and making it better integrated into the
> > stack, that should also bring the possibility of hardware offload (but
> > would any NIC vendor be interested?).
> 
> Integrating with the stack gives you access to all its features, other
> types of encap, firewalling, bpf, etc.
> 
> > I think the question is more about long-term maintenance. Do we want
> > to keep the PPP-related modules self-contained, with low-maintenance code
> > (the current proposal)? Or are we willing to modernise the
> > infrastructure, add support and maintain PPP features in other modules
> > like flower, tunnel_key, etc.?
> 
> Right, it's really not great to see new IOCTLs being added to drivers,
> but the alternative would require easily 50 times more code.
>  
> > Of course, I might have missed other ways to implement this feature.
> > But that's all I could think of for now.
> > 
> > And if anyone wants a quick recap about PPP (what are these PPP channel
> > and unit things? what's the relationship between PPPoE, L2TP and PPP?
> > etc.), just let me know.
> 
> Some pointers would be appreciated if you don't mind :)

Here's a high level view of:
  * the protocol,
  * the kernel implementation,
  * the context of this RFC,
  * and a few pointers at the end :)

Hope this helps. I've tried to keep it short. Feel free to ask for
clarifications and details.


The Point-to-Point Protocol
===========================

PPP is a layer 2 protocol. The header is a single field that identifies
the upper protocol (just like an Ethertype). PPP is point-to-point, so
there's no need for source and destination link layer addresses in the
header: whatever is sent on one end of the pipe is received by the host
on the other end; it's that simple (no shared medium, no switching).
Some protocols have been defined to tunnel PPP packets (PPPoE, L2TP).

PPP doesn't just define how to handle data frames; it also has control
protocols. The Link Control Protocol (LCP) is used to negotiate
link-layer parameters (maximum packet size, optionally request the peer
to authenticate, etc.). LCP is part of the PPP specification. All other
control protocols are defined in different RFCs, but they use the same
protocol structure as LCP.

Once both peers agree on the link parameters, they can proceed to the
optional authentication phase (if that was negotiated during the LCP
phase). There are several authentication protocols available; the one
to use is selected during the LCP phase.

Finally, the peers can negotiate whatever network protocol they want to
use: with PPP, all network protocols need to have an equivalent NCP
(Network Control Protocol). For example, IPv4 has IPCP, IPv6 has IPv6CP,
MPLS has MPLSCP, etc. In some cases, the NCP is used to negotiate
network specific parameters. For example IPCP allows each peer to
advertise its IPv4 address or to request an address from the remote
peer. NCPs are generally very simple. Some don't even have any
parameter to negotiate (like MPLSCP).

Once an NCP has been negotiated, the peers can exchange data packets of
that protocol. Of course several network protocols can be used
simultaneously.
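
As a rough sketch of how this shows up on the wire, a receiver only has
to demultiplex on the single PPP protocol field (ignoring
protocol-field compression; numbers per RFC 1661 and the PPP assigned
numbers registry):

    u16 proto = (skb->data[0] << 8) | skb->data[1];

    switch (proto) {
    case 0xc021:    /* LCP */
    case 0xc023:    /* PAP (authentication) */
    case 0x8021:    /* IPCP */
            /* control traffic: hand off to the user-space daemon */
            break;
    case 0x0021:    /* IPv4 */
    case 0x0057:    /* IPv6 */
            /* data traffic: deliver to the network stack */
            break;
    }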

PPP can run over physical links or be tunnelled into other protocols.
For example, PPPoE carries PPP over Ethernet and L2TP tunnels PPP into
UDP.

Kernel Implementation
=====================

The Linux kernel implementation exposes a /dev/ppp virtual file that's
used by user space to implement the control and authentication
protocols.

Typically, user space starts by opening /dev/ppp. By calling the
PPPIOCATTCHAN ioctl, it attaches the file descriptor to a lower layer
that implements the rx and tx handlers. The lower layer may be a serial
link, an L2TP or PPPoE session, etc. We don't have a networking device
yet, but the file descriptor can now receive and send data over the
link, which is enough to implement LCP and authentication protocols.
This is what ppp_generic.c calls a PPP channel.

Then, to create a network device, one needs to open another file
descriptor on /dev/ppp and call the PPPIOCNEWUNIT ioctl. Alternatively,
it's possible to use a netlink call instead of PPPIOCNEWUNIT to create
the netdevice and attach it to the new file descriptor. We now have
what ppp_generic.c calls a PPP unit.

The unit currently doesn't know how to send data on the wire, so one
needs to connect it to the channel. This is done by another ioctl:
PPPIOCCONNECT. Now, the PPP networking device is able to send data
packets on the wire, and the unit file descriptor can be used to
implement the network control protocols.

The reason for having channels and units is PPP multilink: one can
connect several channels to a given unit. In this case, the unit
will use all channels when sending packets. That's the PPP way to
do link aggregation.
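
Putting the above together, a minimal user-space sketch of the sequence
(error handling omitted; sockfd is assumed to be a connected PPPoX
socket providing the lower layer):

    int chindex, unit = -1;
    int chfd, unitfd;

    /* look up the channel backing the PPPoX socket */
    ioctl(sockfd, PPPIOCGCHAN, &chindex);

    /* attach a /dev/ppp fd to the channel: enough to run LCP and the
     * authentication protocols */
    chfd = open("/dev/ppp", O_RDWR);
    ioctl(chfd, PPPIOCATTCHAN, &chindex);

    /* create a PPP unit: this instantiates the pppN network device */
    unitfd = open("/dev/ppp", O_RDWR);
    ioctl(unitfd, PPPIOCNEWUNIT, &unit);

    /* connect the channel to the unit so the netdevice can transmit;
     * for multilink, connect further channels the same way */
    ioctl(chfd, PPPIOCCONNECT, &unit);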

Overlays
========

It's possible to encapsulate PPP into other protocols. For example,
the Linux kernel supports PPTP, L2TP and PPPoE. PPTP is (was?) often
used for VPNs. PPPoE and L2TP are typically used by ISPs to provide
DSL connections. The kernel implementation of these protocols provides
the lower layer necessary for PPP channels to send and receive data.

There's an ISP use case that isn't covered by the current
implementation though: "bridging" a PPPoE and an L2TP session together
(or two L2TP sessions together). This is used to stretch a PPP
connection across an IP network (essentially simulating circuit
switching on top of IP).

Tom's RFC addresses this use case by adding a new ioctl to bridge two
channels together (each channel can run on a different lower layer
technology). Units aren't necessary in this use case, because only
the LCP and authentication protocols need to run on the hosts that do
only bridging: once the authentication succeeds, every packet received
on one channel is forwarded over the other channel. NCPs are still
negotiated end to end.

The other solution envisioned in this thread (virtual L2TP and PPPoE
devices in collect_md mode) wouldn't use ppp_generic.c at all:
act_mirred would directly redirect the PPP packets between the virtual
PPPoE or L2TP devices. I don't have any code for this approach though.

Pointers
========

  * Documentation/networking/ppp_generic.rst:
      Documentation for the kernel implementation (including the ioctls).

  * drivers/net/ppp/*:
      Kernel implementation of PPP, PPPoE, PPTP...

  * net/l2tp/*:
      Kernel implementation of L2TP (v2 and v3). The PPP-specific part
      is in l2tp_ppp.c.

  * RFC 1661:
      The PPP specification, including LCP (IPCP is in RFC 1332, IPv6CP
      in RFC 5072, MPLSCP in RFC 3032 section 4).

  * RFC 1990:
      The PPP multilink specification.

  * RFC 2516:
      The PPPoE specification.

  * RFC 2637:
      The PPTP specification.

  * RFC 2661:
      The L2TPv2 specification (L2TPv3, defined in RFC 3931, was defined
      later as a generalisation of L2TPv2 that could transport more than
      just PPP).
Jakub Kicinski Nov. 20, 2020, 1:17 a.m. UTC | #14
On Wed, 18 Nov 2020 21:24:53 +0100 Guillaume Nault wrote:
> Here's a high level view of:
>   * the protocol,
>   * the kernel implementation,
>   * the context of this RFC,
>   * and a few pointers at the end :)
> 
> Hope this helps. I've tried to keep it short. Feel free to ask for
> clarifications and details.

Thanks for the write-up, much appreciated!