Message ID | 20241029103656.2151-1-dqfext@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Series | [RFC,net-next] net: ppp: convert to IFF_NO_QUEUE |
+ Toke

On Tue, Oct 29, 2024 at 06:36:56PM +0800, Qingfang Deng wrote:
> When testing the parallel TX performance of a single PPPoE interface
> over a 2.5GbE link with multiple hardware queues, the throughput could
> not exceed 1.9Gbps, even with low CPU usage.
>
> This issue arises because the PPP interface is registered with a single
> queue and a tx_queue_len of 3. This default behavior dates back to Linux
> 2.3.13, which was suitable for slower serial ports. However, in modern
> devices with multiple processors and hardware queues, this configuration
> can lead to congestion.
>
> For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
> set IFF_NO_QUEUE. For PPP over a serial port, we don't benefit from a
> qdisc with such a short TX queue, so handling TX queueing in the driver
> and setting IFF_NO_QUEUE is more effective.
>
> With this change, PPPoE interfaces can now fully saturate a 2.5GbE link.
>
> Signed-off-by: Qingfang Deng <dqfext@gmail.com>

Hi Toke,

I'm wondering if you could offer an opinion on this.

> ---
>  drivers/net/ppp/ppp_generic.c | 27 ++++++++++++---------------
>  1 file changed, 12 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index 4b2971e2bf48..5470e0fe1f9b 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -236,8 +236,8 @@ struct ppp_net {
>  /* Get the PPP protocol number from a skb */
>  #define PPP_PROTO(skb) get_unaligned_be16((skb)->data)
>
> -/* We limit the length of ppp->file.rq to this (arbitrary) value */
> -#define PPP_MAX_RQLEN 32
> +/* We limit the length of ppp->file.rq/xq to this (arbitrary) value */
> +#define PPP_MAX_QLEN 32
>
>  /*
>   * Maximum number of multilink fragments queued up.
> @@ -920,8 +920,6 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>  				break;
>  			} else {
>  				ppp->npmode[i] = npi.mode;
> -				/* we may be able to transmit more packets now (??) */
> -				netif_wake_queue(ppp->dev);
>  			}
>  			err = 0;
>  			break;
> @@ -1639,6 +1637,7 @@ static void ppp_setup(struct net_device *dev)
>  	dev->tx_queue_len = 3;
>  	dev->type = ARPHRD_PPP;
>  	dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
> +	dev->priv_flags |= IFF_NO_QUEUE;
>  	dev->priv_destructor = ppp_dev_priv_destructor;
>  	netif_keep_dst(dev);
>  }
> @@ -1654,17 +1653,15 @@ static void __ppp_xmit_process(struct ppp *ppp, struct sk_buff *skb)
>  	if (!ppp->closing) {
>  		ppp_push(ppp);
>
> -		if (skb)
> -			skb_queue_tail(&ppp->file.xq, skb);
> +		if (skb) {
> +			if (ppp->file.xq.qlen > PPP_MAX_QLEN)
> +				kfree_skb(skb);
> +			else
> +				skb_queue_tail(&ppp->file.xq, skb);
> +		}
>  		while (!ppp->xmit_pending &&
>  		       (skb = skb_dequeue(&ppp->file.xq)))
>  			ppp_send_frame(ppp, skb);
> -		/* If there's no work left to do, tell the core net
> -		   code that we can accept some more. */
> -		if (!ppp->xmit_pending && !skb_peek(&ppp->file.xq))
> -			netif_wake_queue(ppp->dev);
> -		else
> -			netif_stop_queue(ppp->dev);
>  	} else {
>  		kfree_skb(skb);
>  	}
> @@ -1850,7 +1847,7 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
>  	 * queue it up for pppd to receive.
>  	 */
>  	if (ppp->flags & SC_LOOP_TRAFFIC) {
> -		if (ppp->file.rq.qlen > PPP_MAX_RQLEN)
> +		if (ppp->file.rq.qlen > PPP_MAX_QLEN)
>  			goto drop;
>  		skb_queue_tail(&ppp->file.rq, skb);
>  		wake_up_interruptible(&ppp->file.rwait);
> @@ -2319,7 +2316,7 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
>  		/* put it on the channel queue */
>  		skb_queue_tail(&pch->file.rq, skb);
>  		/* drop old frames if queue too long */
> -		while (pch->file.rq.qlen > PPP_MAX_RQLEN &&
> +		while (pch->file.rq.qlen > PPP_MAX_QLEN &&
>  		       (skb = skb_dequeue(&pch->file.rq)))
>  			kfree_skb(skb);
>  		wake_up_interruptible(&pch->file.rwait);
> @@ -2472,7 +2469,7 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
>  		/* control or unknown frame - pass it to pppd */
>  		skb_queue_tail(&ppp->file.rq, skb);
>  		/* limit queue length by dropping old frames */
> -		while (ppp->file.rq.qlen > PPP_MAX_RQLEN &&
> +		while (ppp->file.rq.qlen > PPP_MAX_QLEN &&
>  		       (skb = skb_dequeue(&ppp->file.rq)))
>  			kfree_skb(skb);
>  		/* wake up any process polling or blocking on read */
> --
> 2.34.1
>
>
Simon Horman <horms@kernel.org> writes:

> + Toke
>
> On Tue, Oct 29, 2024 at 06:36:56PM +0800, Qingfang Deng wrote:
>> When testing the parallel TX performance of a single PPPoE interface
>> over a 2.5GbE link with multiple hardware queues, the throughput could
>> not exceed 1.9Gbps, even with low CPU usage.
>>
>> This issue arises because the PPP interface is registered with a single
>> queue and a tx_queue_len of 3. This default behavior dates back to Linux
>> 2.3.13, which was suitable for slower serial ports. However, in modern
>> devices with multiple processors and hardware queues, this configuration
>> can lead to congestion.
>>
>> For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
>> set IFF_NO_QUEUE. For PPP over a serial port, we don't benefit from a
>> qdisc with such a short TX queue, so handling TX queueing in the driver
>> and setting IFF_NO_QUEUE is more effective.
>>
>> With this change, PPPoE interfaces can now fully saturate a 2.5GbE link.
>>
>> Signed-off-by: Qingfang Deng <dqfext@gmail.com>
>
> Hi Toke,
>
> I'm wondering if you could offer an opinion on this.

Hi Simon

Thanks for bringing this to my attention; I'll reply to the parent
directly :)

-Toke
Qingfang Deng <dqfext@gmail.com> writes:

> When testing the parallel TX performance of a single PPPoE interface
> over a 2.5GbE link with multiple hardware queues, the throughput could
> not exceed 1.9Gbps, even with low CPU usage.
>
> This issue arises because the PPP interface is registered with a single
> queue and a tx_queue_len of 3. This default behavior dates back to Linux
> 2.3.13, which was suitable for slower serial ports. However, in modern
> devices with multiple processors and hardware queues, this configuration
> can lead to congestion.
>
> For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
> set IFF_NO_QUEUE.

This bit makes sense - the PPPoE and PPTP channel types call through to
the underlying network stack, and their start_xmit() ops never return
anything other than 1 (so there's no pushback against the upper PPP
device anyway). The same goes for the L2TP PPP channel driver.

> For PPP over a serial port, we don't benefit from a qdisc with such a
> short TX queue, so handling TX queueing in the driver and setting
> IFF_NO_QUEUE is more effective.

However, this bit is certainly not true. For the channel drivers that
do push back (which is everything apart from the three mentioned above,
AFAICT), we absolutely do want a qdisc to store the packets, instead of
this arbitrary 32-packet FIFO inside the driver. Your comment about the
short TX queue only holds for the pfifo_fast qdisc (that's the only one
that uses the tx_queue_len for anything), anything else will do its own
thing.

(Side note: don't use pfifo_fast!)

I suppose one option here could be to set the IFF_NO_QUEUE flag
conditionally depending on whether the underlying channel driver does
pushback against the PPP device or not (add a channel flag to indicate
this, or something), and then call the netif_{wake,stop}_queue()
functions conditionally depending on this. But setting the noqueue flag
unconditionally like this patch does, is definitely not a good idea!

-Toke
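[Editor's note: a rough sketch of the conditional scheme Toke outlines above, not code from this thread. The tx_no_pushback member is hypothetical; IFF_NO_QUEUE and netif_{stop,wake}_queue() are the existing kernel APIs involved, and the helper is imagined as living inside drivers/net/ppp/ppp_generic.c.]

```c
/*
 * Illustrative only: keep the qdisc (and the existing
 * netif_{stop,wake}_queue() backpressure) for channels that push back,
 * and bypass it for direct-xmit channels such as PPPoE/PPTP/L2TP.
 * "tx_no_pushback" is a hypothetical field, not an upstream member of
 * struct ppp_channel.
 */
static void ppp_apply_channel_queue_mode(struct ppp *ppp,
					 const struct ppp_channel *chan)
{
	if (chan->tx_no_pushback)
		/* Lower device owns the qdisc; skip the PPP-level queue. */
		ppp->dev->priv_flags |= IFF_NO_QUEUE;
	else
		/* Serial-style channels keep the qdisc for buffering. */
		ppp->dev->priv_flags &= ~IFF_NO_QUEUE;
}
```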
Hi Toke,

On Tue, Nov 5, 2024 at 8:24 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Qingfang Deng <dqfext@gmail.com> writes:
>
> > When testing the parallel TX performance of a single PPPoE interface
> > over a 2.5GbE link with multiple hardware queues, the throughput could
> > not exceed 1.9Gbps, even with low CPU usage.
> >
> > This issue arises because the PPP interface is registered with a single
> > queue and a tx_queue_len of 3. This default behavior dates back to Linux
> > 2.3.13, which was suitable for slower serial ports. However, in modern
> > devices with multiple processors and hardware queues, this configuration
> > can lead to congestion.
> >
> > For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
> > set IFF_NO_QUEUE.
>
> This bit makes sense - the PPPoE and PPTP channel types call through to
> the underlying network stack, and their start_xmit() ops never return
> anything other than 1 (so there's no pushback against the upper PPP
> device anyway). The same goes for the L2TP PPP channel driver.
>
> > For PPP over a serial port, we don't benefit from a qdisc with such a
> > short TX queue, so handling TX queueing in the driver and setting
> > IFF_NO_QUEUE is more effective.
>
> However, this bit is certainly not true. For the channel drivers that
> do push back (which is everything apart from the three mentioned above,
> AFAICT), we absolutely do want a qdisc to store the packets, instead of
> this arbitrary 32-packet FIFO inside the driver. Your comment about the
> short TX queue only holds for the pfifo_fast qdisc (that's the only one
> that uses the tx_queue_len for anything), anything else will do its own
> thing.
>
> (Side note: don't use pfifo_fast!)
>
> I suppose one option here could be to set the IFF_NO_QUEUE flag
> conditionally depending on whether the underlying channel driver does
> pushback against the PPP device or not (add a channel flag to indicate
> this, or something), and then call the netif_{wake,stop}_queue()
> functions conditionally depending on this. But setting the noqueue flag
> unconditionally like this patch does, is definitely not a good idea!

I agree. Then the problem becomes how to test if a PPP device is a PPPoE.
It seems like PPPoE is the only one that implements
ops->fill_forward_path, should I use that? Or is there a better way?

- Qingfang

>
> -Toke
>
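[Editor's note: for reference, the check Qingfang floats above would look roughly like the sketch below. fill_forward_path is a real member of struct ppp_channel_ops (PPPoE implements it for flow offload), but using its presence as a queueing hint is only an idea from this thread; Toke argues for an explicit field instead in the next reply.]

```c
#include <linux/ppp_channel.h>

/* Heuristic sketch only: treat "implements ->fill_forward_path" as
 * "the lower device owns the qdisc".  This works as long as PPPoE is
 * the only PPP channel providing the op (as Qingfang notes above),
 * but it is an indirect signal rather than an explicit capability.
 */
static bool ppp_channel_is_direct_xmit(const struct ppp_channel *chan)
{
	return chan->ops && chan->ops->fill_forward_path;
}
```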
Qingfang Deng <dqfext@gmail.com> writes:

> Hi Toke,
>
> On Tue, Nov 5, 2024 at 8:24 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Qingfang Deng <dqfext@gmail.com> writes:
>>
>> > When testing the parallel TX performance of a single PPPoE interface
>> > over a 2.5GbE link with multiple hardware queues, the throughput could
>> > not exceed 1.9Gbps, even with low CPU usage.
>> >
>> > This issue arises because the PPP interface is registered with a single
>> > queue and a tx_queue_len of 3. This default behavior dates back to Linux
>> > 2.3.13, which was suitable for slower serial ports. However, in modern
>> > devices with multiple processors and hardware queues, this configuration
>> > can lead to congestion.
>> >
>> > For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
>> > set IFF_NO_QUEUE.
>>
>> This bit makes sense - the PPPoE and PPTP channel types call through to
>> the underlying network stack, and their start_xmit() ops never return
>> anything other than 1 (so there's no pushback against the upper PPP
>> device anyway). The same goes for the L2TP PPP channel driver.
>>
>> > For PPP over a serial port, we don't benefit from a qdisc with such a
>> > short TX queue, so handling TX queueing in the driver and setting
>> > IFF_NO_QUEUE is more effective.
>>
>> However, this bit is certainly not true. For the channel drivers that
>> do push back (which is everything apart from the three mentioned above,
>> AFAICT), we absolutely do want a qdisc to store the packets, instead of
>> this arbitrary 32-packet FIFO inside the driver. Your comment about the
>> short TX queue only holds for the pfifo_fast qdisc (that's the only one
>> that uses the tx_queue_len for anything), anything else will do its own
>> thing.
>>
>> (Side note: don't use pfifo_fast!)
>>
>> I suppose one option here could be to set the IFF_NO_QUEUE flag
>> conditionally depending on whether the underlying channel driver does
>> pushback against the PPP device or not (add a channel flag to indicate
>> this, or something), and then call the netif_{wake,stop}_queue()
>> functions conditionally depending on this. But setting the noqueue flag
>> unconditionally like this patch does, is definitely not a good idea!
>
> I agree. Then the problem becomes how to test if a PPP device is a PPPoE.
> It seems like PPPoE is the only one that implements
> ops->fill_forward_path, should I use that? Or is there a better way?

Just add a new field to struct ppp_channel and have the PPPoE channel
driver set that? Could be a flags field, or even just a 'bool
direct_xmit' field...

-Toke
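[Editor's note: concretely, Toke's suggestion would amount to something like the following. The direct_xmit member and its placement are assumptions for illustration, not an upstream change; the existing fields are as declared in include/linux/ppp_channel.h around the time of this thread.]

```c
/* include/linux/ppp_channel.h -- sketch with a hypothetical new field */
struct ppp_channel {
	void		*private;	/* channel private data */
	const struct ppp_channel_ops *ops; /* operations for this channel */
	int		mtu;		/* max transmit packet size */
	int		hdrlen;		/* amount of headroom channel needs */
	void		*ppp;		/* opaque to channel */
	int		speed;		/* transfer rate (bytes/second) */
	int		latency;	/* overhead time in milliseconds */
	bool		direct_xmit;	/* start_xmit() never pushes back */
};

/* drivers/net/ppp/pppoe.c would then opt in before registering its
 * channel (illustrative placement, not a literal diff):
 *
 *	po->chan.direct_xmit = true;
 *	// ... followed by the existing ppp_register_net_channel() call
 */
```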
On Tue, Nov 5, 2024 at 10:35 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Qingfang Deng <dqfext@gmail.com> writes:
>
> > Hi Toke,
> >
> > On Tue, Nov 5, 2024 at 8:24 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Qingfang Deng <dqfext@gmail.com> writes:
> >>
> >> > When testing the parallel TX performance of a single PPPoE interface
> >> > over a 2.5GbE link with multiple hardware queues, the throughput could
> >> > not exceed 1.9Gbps, even with low CPU usage.
> >> >
> >> > This issue arises because the PPP interface is registered with a single
> >> > queue and a tx_queue_len of 3. This default behavior dates back to Linux
> >> > 2.3.13, which was suitable for slower serial ports. However, in modern
> >> > devices with multiple processors and hardware queues, this configuration
> >> > can lead to congestion.
> >> >
> >> > For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
> >> > set IFF_NO_QUEUE.
> >>
> >> This bit makes sense - the PPPoE and PPTP channel types call through to
> >> the underlying network stack, and their start_xmit() ops never return
> >> anything other than 1 (so there's no pushback against the upper PPP
> >> device anyway). The same goes for the L2TP PPP channel driver.
> >>
> >> > For PPP over a serial port, we don't benefit from a qdisc with such a
> >> > short TX queue, so handling TX queueing in the driver and setting
> >> > IFF_NO_QUEUE is more effective.
> >>
> >> However, this bit is certainly not true. For the channel drivers that
> >> do push back (which is everything apart from the three mentioned above,
> >> AFAICT), we absolutely do want a qdisc to store the packets, instead of
> >> this arbitrary 32-packet FIFO inside the driver. Your comment about the
> >> short TX queue only holds for the pfifo_fast qdisc (that's the only one
> >> that uses the tx_queue_len for anything), anything else will do its own
> >> thing.
> >>
> >> (Side note: don't use pfifo_fast!)
> >>
> >> I suppose one option here could be to set the IFF_NO_QUEUE flag
> >> conditionally depending on whether the underlying channel driver does
> >> pushback against the PPP device or not (add a channel flag to indicate
> >> this, or something), and then call the netif_{wake,stop}_queue()
> >> functions conditionally depending on this. But setting the noqueue flag
> >> unconditionally like this patch does, is definitely not a good idea!
> >
> > I agree. Then the problem becomes how to test if a PPP device is a PPPoE.
> > It seems like PPPoE is the only one that implements
> > ops->fill_forward_path, should I use that? Or is there a better way?
>
> Just add a new field to struct ppp_channel and have the PPPoE channel
> driver set that? Could be a flags field, or even just a 'bool
> direct_xmit' field...

Okay, I'll send a patch later.

>
> -Toke
>
On Tue, Nov 05, 2024 at 12:47:27PM +0100, Toke Høiland-Jørgensen wrote:
> Simon Horman <horms@kernel.org> writes:
>
> > + Toke
> >
> > On Tue, Oct 29, 2024 at 06:36:56PM +0800, Qingfang Deng wrote:
> >> When testing the parallel TX performance of a single PPPoE interface
> >> over a 2.5GbE link with multiple hardware queues, the throughput could
> >> not exceed 1.9Gbps, even with low CPU usage.
> >>
> >> This issue arises because the PPP interface is registered with a single
> >> queue and a tx_queue_len of 3. This default behavior dates back to Linux
> >> 2.3.13, which was suitable for slower serial ports. However, in modern
> >> devices with multiple processors and hardware queues, this configuration
> >> can lead to congestion.
> >>
> >> For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
> >> set IFF_NO_QUEUE. For PPP over a serial port, we don't benefit from a
> >> qdisc with such a short TX queue, so handling TX queueing in the driver
> >> and setting IFF_NO_QUEUE is more effective.
> >>
> >> With this change, PPPoE interfaces can now fully saturate a 2.5GbE link.
> >>
> >> Signed-off-by: Qingfang Deng <dqfext@gmail.com>
> >
> > Hi Toke,
> >
> > I'm wondering if you could offer an opinion on this.
>
> Hi Simon
>
> Thanks for bringing this to my attention; I'll reply to the parent
> directly :)

Likewise, thanks for following-up on this.
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 4b2971e2bf48..5470e0fe1f9b 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -236,8 +236,8 @@ struct ppp_net {
 /* Get the PPP protocol number from a skb */
 #define PPP_PROTO(skb) get_unaligned_be16((skb)->data)
 
-/* We limit the length of ppp->file.rq to this (arbitrary) value */
-#define PPP_MAX_RQLEN 32
+/* We limit the length of ppp->file.rq/xq to this (arbitrary) value */
+#define PPP_MAX_QLEN 32
 
 /*
  * Maximum number of multilink fragments queued up.
@@ -920,8 +920,6 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 				break;
 			} else {
 				ppp->npmode[i] = npi.mode;
-				/* we may be able to transmit more packets now (??) */
-				netif_wake_queue(ppp->dev);
 			}
 			err = 0;
 			break;
@@ -1639,6 +1637,7 @@ static void ppp_setup(struct net_device *dev)
 	dev->tx_queue_len = 3;
 	dev->type = ARPHRD_PPP;
 	dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
+	dev->priv_flags |= IFF_NO_QUEUE;
 	dev->priv_destructor = ppp_dev_priv_destructor;
 	netif_keep_dst(dev);
 }
@@ -1654,17 +1653,15 @@ static void __ppp_xmit_process(struct ppp *ppp, struct sk_buff *skb)
 	if (!ppp->closing) {
 		ppp_push(ppp);
 
-		if (skb)
-			skb_queue_tail(&ppp->file.xq, skb);
+		if (skb) {
+			if (ppp->file.xq.qlen > PPP_MAX_QLEN)
+				kfree_skb(skb);
+			else
+				skb_queue_tail(&ppp->file.xq, skb);
+		}
 		while (!ppp->xmit_pending &&
 		       (skb = skb_dequeue(&ppp->file.xq)))
 			ppp_send_frame(ppp, skb);
-		/* If there's no work left to do, tell the core net
-		   code that we can accept some more. */
-		if (!ppp->xmit_pending && !skb_peek(&ppp->file.xq))
-			netif_wake_queue(ppp->dev);
-		else
-			netif_stop_queue(ppp->dev);
 	} else {
 		kfree_skb(skb);
 	}
@@ -1850,7 +1847,7 @@ ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
 	 * queue it up for pppd to receive.
 	 */
 	if (ppp->flags & SC_LOOP_TRAFFIC) {
-		if (ppp->file.rq.qlen > PPP_MAX_RQLEN)
+		if (ppp->file.rq.qlen > PPP_MAX_QLEN)
 			goto drop;
 		skb_queue_tail(&ppp->file.rq, skb);
 		wake_up_interruptible(&ppp->file.rwait);
@@ -2319,7 +2316,7 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
 		/* put it on the channel queue */
 		skb_queue_tail(&pch->file.rq, skb);
 		/* drop old frames if queue too long */
-		while (pch->file.rq.qlen > PPP_MAX_RQLEN &&
+		while (pch->file.rq.qlen > PPP_MAX_QLEN &&
 		       (skb = skb_dequeue(&pch->file.rq)))
 			kfree_skb(skb);
 		wake_up_interruptible(&pch->file.rwait);
@@ -2472,7 +2469,7 @@ ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
 		/* control or unknown frame - pass it to pppd */
 		skb_queue_tail(&ppp->file.rq, skb);
 		/* limit queue length by dropping old frames */
-		while (ppp->file.rq.qlen > PPP_MAX_RQLEN &&
+		while (ppp->file.rq.qlen > PPP_MAX_QLEN &&
 		       (skb = skb_dequeue(&ppp->file.rq)))
 			kfree_skb(skb);
 		/* wake up any process polling or blocking on read */
When testing the parallel TX performance of a single PPPoE interface
over a 2.5GbE link with multiple hardware queues, the throughput could
not exceed 1.9Gbps, even with low CPU usage.

This issue arises because the PPP interface is registered with a single
queue and a tx_queue_len of 3. This default behavior dates back to Linux
2.3.13, which was suitable for slower serial ports. However, in modern
devices with multiple processors and hardware queues, this configuration
can lead to congestion.

For PPPoE/PPTP, the lower interface should handle qdisc, so we need to
set IFF_NO_QUEUE. For PPP over a serial port, we don't benefit from a
qdisc with such a short TX queue, so handling TX queueing in the driver
and setting IFF_NO_QUEUE is more effective.

With this change, PPPoE interfaces can now fully saturate a 2.5GbE link.

Signed-off-by: Qingfang Deng <dqfext@gmail.com>
---
 drivers/net/ppp/ppp_generic.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)
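[Editor's note: for context on why the flag removes the bottleneck described above, here is a simplified model of how the core picks a default qdisc, loosely patterned on attach_one_default_qdisc() in net/sched/sch_generic.c rather than copied from it. A noqueue device has no ->enqueue(), so __dev_queue_xmit() hands packets straight to the driver instead of serialising them through the PPP device's 3-packet queue; for PPPoE the real queueing then happens on the underlying Ethernet device.]

```c
#include <net/sch_generic.h>

/* Simplified model of the default-qdisc choice; the real code also
 * honours the net.core.default_qdisc sysctl instead of hardcoding
 * pfifo_fast.
 */
static const struct Qdisc_ops *
example_default_qdisc(const struct net_device *dev)
{
	if (dev->priv_flags & IFF_NO_QUEUE)
		return &noqueue_qdisc_ops;	/* no ->enqueue(): direct xmit */

	return &pfifo_fast_ops;			/* bounded by dev->tx_queue_len */
}
```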