Message ID: 20221102183837.157966-1-nnac123@linux.ibm.com (mailing list archive)
State: Changes Requested
Delegated to: Netdev Maintainers
Series: [v2,net] ibmveth: Reduce maximum tx queues to 8
On Wed, 2 Nov 2022 13:38:37 -0500 Nick Child wrote:
> Previously, the maximum number of transmit queues allowed was 16. Due to
> resource concerns, limit to 8 queues instead.
>
> Since the driver is virtualized away from the physical NIC, the purpose
> of multiple queues is purely to allow for parallel calls to the
> hypervisor. Therefore, there is no noticeable effect on performance by
> reducing queue count to 8.

I'm not sure if that's the point Dave was making but we should be
influencing the default, not the MAX. Why limit the MAX?
On 11/3/22 22:59, Jakub Kicinski wrote:
> On Wed, 2 Nov 2022 13:38:37 -0500 Nick Child wrote:
>> Previously, the maximum number of transmit queues allowed was 16. Due to
>> resource concerns, limit to 8 queues instead.
>>
>> Since the driver is virtualized away from the physical NIC, the purpose
>> of multiple queues is purely to allow for parallel calls to the
>> hypervisor. Therefore, there is no noticeable effect on performance by
>> reducing queue count to 8.
>
> I'm not sure if that's the point Dave was making but we should be
> influencing the default, not the MAX. Why limit the MAX?

The MAX is always allocated in the driver's probe function. In the
driver's open and ethtool set-channels functions we set
real_num_tx_queues. So the number of allocated queues is always MAX
but the number of queues actually in use may differ and can be set by
the user.

I hope this explains. Otherwise, please let me know.

Thanks again for taking the time to review,
Nick Child
On Fri, 4 Nov 2022 09:06:02 -0500 Nick Child wrote:
> On 11/3/22 22:59, Jakub Kicinski wrote:
>> On Wed, 2 Nov 2022 13:38:37 -0500 Nick Child wrote:
>>> Previously, the maximum number of transmit queues allowed was 16. Due to
>>> resource concerns, limit to 8 queues instead.
>>>
>>> Since the driver is virtualized away from the physical NIC, the purpose
>>> of multiple queues is purely to allow for parallel calls to the
>>> hypervisor. Therefore, there is no noticeable effect on performance by
>>> reducing queue count to 8.
>>
>> I'm not sure if that's the point Dave was making but we should be
>> influencing the default, not the MAX. Why limit the MAX?
>
> The MAX is always allocated in the driver's probe function. In the
> driver's open and ethtool set-channels functions we set
> real_num_tx_queues. So the number of allocated queues is always MAX
> but the number of queues actually in use may differ and can be set by
> the user.
> I hope this explains. Otherwise, please let me know.

Perhaps I don't understand the worry. Is allowing 16 queues a problem
because it limits how many instances the hypervisor can support? Or is
the concern coming from your recent work on BQL and having many queues
exacerbating buffer bloat?
On 11/4/22 12:59, Jakub Kicinski wrote:
> On Fri, 4 Nov 2022 09:06:02 -0500 Nick Child wrote:
>> On 11/3/22 22:59, Jakub Kicinski wrote:
>>> On Wed, 2 Nov 2022 13:38:37 -0500 Nick Child wrote:
>>>> Previously, the maximum number of transmit queues allowed was 16. Due to
>>>> resource concerns, limit to 8 queues instead.
>>>>
>>>> Since the driver is virtualized away from the physical NIC, the purpose
>>>> of multiple queues is purely to allow for parallel calls to the
>>>> hypervisor. Therefore, there is no noticeable effect on performance by
>>>> reducing queue count to 8.
>>>
>>> I'm not sure if that's the point Dave was making but we should be
>>> influencing the default, not the MAX. Why limit the MAX?
>>
>> The MAX is always allocated in the driver's probe function. In the
>> driver's open and ethtool set-channels functions we set
>> real_num_tx_queues. So the number of allocated queues is always MAX
>> but the number of queues actually in use may differ and can be set by
>> the user.
>> I hope this explains. Otherwise, please let me know.
>
> Perhaps I don't understand the worry. Is allowing 16 queues a problem
> because it limits how many instances the hypervisor can support?

No, the hypervisor is unaware of the number of netdev queues. The reason
for adding more netdev queues in the first place is to allow the higher
networking layers to make parallel calls to the driver's xmit function,
which the hypervisor can handle.

> Or is the concern coming from your recent work on BQL and having many
> queues exacerbating buffer bloat?

Yes, and Dave can jump in here if I am wrong, but, from my
understanding, if the NIC cannot send packets at the rate that they are
queued then these queues will inevitably fill to txqueuelen. In this
case, having more queues will not mean better throughput but will result
in a large number of allocations sitting in queues (bufferbloat).
I believe Dave's point was: if more queues do not allow for better
performance (and can risk bufferbloat), then why have so many at all?
After going through testing and seeing no difference in performance with
8 vs 16 queues, I would rather not have the driver be a culprit of
potential resource hogging.
On Fri, 4 Nov 2022 13:15:39 -0500 Nick Child wrote:
> > Or is the concern coming from your recent work on BQL and having many
> > queues exacerbating buffer bloat?
>
> Yes, and Dave can jump in here if I am wrong, but, from my
> understanding, if the NIC cannot send packets at the rate that
> they are queued then these queues will inevitably fill to txqueuelen.
> In this case, having more queues will not mean better throughput but
> will result in a large number of allocations sitting in queues
> (bufferbloat). I believe Dave's point was: if more queues do not
> allow for better performance (and can risk bufferbloat), then why
> have so many at all?
>
> After going through testing and seeing no difference in performance
> with 8 vs 16 queues, I would rather not have the driver be a culprit
> of potential resource hogging.

Right, so my point was that a user can always shoot themselves in the
foot with bad configs. You can leave the MAX at 16, in case someone
needs it. Limit the default real queues setting instead; most users
will use the default.
From: Nick Child
> Sent: 04 November 2022 18:16
...
> Yes, and Dave can jump in here if I am wrong, but, from my
> understanding, if the NIC cannot send packets at the rate that
> they are queued then these queues will inevitably fill to txqueuelen.
> In this case, having more queues will not mean better throughput but
> will result in a large number of allocations sitting in queues
> (bufferbloat). I believe Dave's point was, if more queues does not
> allow for better performance (and can risk bufferbloat) then why
> have so many at all.

Isn't there another issue (noted in the tg3 driver): if the underlying
hardware (or other consumer) is doing a round-robin scan of the transmit
queues then (IIRC) a lot of small packets going through one queue can
starve queues sending big packets. Certainly tg3 has multiple tx queues
disabled.

There is an associated problem with drivers that force the number of
transmit and receive rings (or whatever) to be the same. The receive
processing is far more expensive than transmit (it is also much more
critical - it doesn't really matter if transmits get slightly delayed,
but dropping rx packets is a PITA).

If threaded NAPI is enabled (to avoid issues with softint processing)
then you don't really need lots of threads for transmit queues, but you
do need them for the rx queues. I had to use threaded NAPI with the
threads running under the RT scheduler to avoid packet loss (at 500,000
pkts/sec). With tg3, four rx queues and one tx worked fine.

	David (not Dave)
diff --git a/drivers/net/ethernet/ibm/ibmveth.h b/drivers/net/ethernet/ibm/ibmveth.h
index 4f8357187292..6b5faf1feb0b 100644
--- a/drivers/net/ethernet/ibm/ibmveth.h
+++ b/drivers/net/ethernet/ibm/ibmveth.h
@@ -99,7 +99,7 @@ static inline long h_illan_attributes(unsigned long unit_address,
 #define IBMVETH_FILT_LIST_SIZE 4096
 #define IBMVETH_MAX_BUF_SIZE (1024 * 128)
 #define IBMVETH_MAX_TX_BUF_SIZE (1024 * 64)
-#define IBMVETH_MAX_QUEUES 16U
+#define IBMVETH_MAX_QUEUES 8U
 
 static int pool_size[] = { 512, 1024 * 2, 1024 * 16, 1024 * 32, 1024 * 64 };
 static int pool_count[] = { 256, 512, 256, 256, 256 };
Previously, the maximum number of transmit queues allowed was 16. Due to
resource concerns, limit to 8 queues instead.

Since the driver is virtualized away from the physical NIC, the purpose
of multiple queues is purely to allow for parallel calls to the
hypervisor. Therefore, there is no noticeable effect on performance by
reducing queue count to 8.

Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
Relevant links:
- Introduce multiple tx queues (accepted in v6.1):
  https://lore.kernel.org/netdev/20220928214350.29795-2-nnac123@linux.ibm.com/
- Resource concerns with 16 queues:
  https://lore.kernel.org/netdev/CAA93jw5reJmaOvt9vw15C1fo1AN7q5jVKzUocbAoNDC-cpi=KQ@mail.gmail.com/
- v1 (only change is commit message length and typo in Reported-by tag):
  https://lore.kernel.org/netdev/20221102153040.149244-1-nnac123@linux.ibm.com/

 drivers/net/ethernet/ibm/ibmveth.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)