diff mbox

[RFC,v2,1/4] mac80211: Add TXQ scheduling API

Message ID 153115422491.7447.12479559048433925372.stgit@alrua-x1 (mailing list archive)
State RFC
Delegated to: Johannes Berg
Headers show

Commit Message

Toke Høiland-Jørgensen July 9, 2018, 4:37 p.m. UTC
This adds an API to mac80211 to handle scheduling of TXQs. The API
consists of two new functions: ieee80211_next_txq() and
ieee80211_schedule_txq(). The former returns the next TXQ that should be
scheduled, and is how the driver gets a queue to pull packets from. The
latter is called internally by mac80211 to start scheduling a queue, and
the driver is supposed to call it to re-schedule the TXQ after it is
finished pulling packets from it (unless the queue emptied). Drivers can
optionally filter TXQs on ac to support per-AC hardware queue designs,
and a sequence number mechanism is used to support drivers looping over
all available TXQs exactly once.

Using this API allows drivers to take advantage of mac80211 scheduling
features such as airtime fairness (added in a subsequent commit).
However, usage of the new API is optional, so support can be added to
individual drivers one at a time.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 include/net/mac80211.h     |   50 +++++++++++++++++++++++++++++---
 net/mac80211/agg-tx.c      |    2 +
 net/mac80211/ieee80211_i.h |    7 ++++
 net/mac80211/main.c        |    3 ++
 net/mac80211/sta_info.c    |    3 ++
 net/mac80211/tx.c          |   69 ++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 130 insertions(+), 4 deletions(-)

Comments

Rajkumar Manoharan July 11, 2018, 12:14 a.m. UTC | #1
On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
[...]
> +/**
> + * ieee80211_schedule_txq - add txq to scheduling loop
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @txq: pointer obtained from station or virtual interface
> + * @reset_seqno: Whether to reset the internal scheduling sequence 
> number,
> + *               allowing this txq to appear again in the current 
> scheduling
> + *               round (see doc for ieee80211_next_txq()).
> + *
> + * Returns %true if the txq was actually added to the scheduling,
> + * %false otherwise.
> + */
> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
> +			    struct ieee80211_txq *txq,
> +			    bool reset_seqno);
> +
> +/**
> + * ieee80211_next_txq - get next tx queue to pull packets from
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @ac: filter returned txqs with this AC number. Pass -1 for no 
> filtering.
> + * @inc_seqno: Whether to increase the scheduling sequence number. 
> Setting this
> + *             to true signifies the start of a new scheduling round. 
> Each TXQ
> + *             will only be returned exactly once in each round 
> (unless its
> + *             sequence number is explicitly reset when calling
> + *             ieee80211_schedule_txq()).
> + *
Toke,

Seems like seqno is internal to mac80211 and meant for active_txq list 
manipulation.
If so, why would drivers have to worry about increment or resetting 
seqno? IMHO to
avoid over serving same txq, two lists (activeq and waitq) can be used 
and always
add new txq into waitq list. So that driver will not worry about 
mac80211 txq
manipulation. Please correct me If Im wrong.

ieee80211_schedule_txq
    - if schedule_order empty, add txq into waitq list tail.

ieee80211_next_txq
    - if activeq empty,
         - move waitq list into activeq

    - if activeq not empty
         - fetch appropriate txq from activeq
         - remove txq from activeq list.

    - If txq found, return txq else return NULL

-Rajkumar
Toke Høiland-Jørgensen July 11, 2018, 8:48 p.m. UTC | #2
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
> [...]
>> +/**
>> + * ieee80211_schedule_txq - add txq to scheduling loop
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @txq: pointer obtained from station or virtual interface
>> + * @reset_seqno: Whether to reset the internal scheduling sequence 
>> number,
>> + *               allowing this txq to appear again in the current 
>> scheduling
>> + *               round (see doc for ieee80211_next_txq()).
>> + *
>> + * Returns %true if the txq was actually added to the scheduling,
>> + * %false otherwise.
>> + */
>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>> +			    struct ieee80211_txq *txq,
>> +			    bool reset_seqno);
>> +
>> +/**
>> + * ieee80211_next_txq - get next tx queue to pull packets from
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @ac: filter returned txqs with this AC number. Pass -1 for no 
>> filtering.
>> + * @inc_seqno: Whether to increase the scheduling sequence number. 
>> Setting this
>> + *             to true signifies the start of a new scheduling round. 
>> Each TXQ
>> + *             will only be returned exactly once in each round 
>> (unless its
>> + *             sequence number is explicitly reset when calling
>> + *             ieee80211_schedule_txq()).
>> + *
> Toke,
>
> Seems like seqno is internal to mac80211 and meant for active_txq list
> manipulation. If so, why would drivers have to worry about increment
> or resetting seqno?

The drivers need to be able to signal when they start a new "scheduling
round" (which is the parameter to next_txq()), and when a queue should
be immediately rescheduled (which is the parameter to schedule_txq()).
See the subsequent patch to ath9k for how this is used; the latter is
signalled when a TXQ successfully dequeued an aggregate...

Now, you're right that the choice to track this via a sequence number is
an implementation detail internal to mac80211... so maybe the parameters
should be called something different.

> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
> be used and always add new txq into waitq list. So that driver will
> not worry about mac80211 txq manipulation. Please correct me If Im
> wrong.
>
> ieee80211_schedule_txq
>     - if schedule_order empty, add txq into waitq list tail.
>
> ieee80211_next_txq
>     - if activeq empty,
>          - move waitq list into activeq
>
>     - if activeq not empty
>          - fetch appropriate txq from activeq
>          - remove txq from activeq list.
>
>     - If txq found, return txq else return NULL


Erm, how would this prevent an infinite loop? With this scheme, at some
point, ieee80211_next_txq() removes the last txq from activeq, and
returns that. Then, when it is called again the next time the driver
loops, it's back to the first case (activeq empty, waitq non-empty); so
it'll move waitq back as activeq and start over... Only the driver
really knows when it is starting a logical "loop through all active
TXQs".

Also, for airtime fairness, the queues are not scheduled strictly
round-robin, so the dual-list scheme wouldn't work there either...

-Toke
Rajkumar Manoharan July 12, 2018, 10:40 p.m. UTC | #3
On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>> [...]
>>> +/**
>>> + * ieee80211_schedule_txq - add txq to scheduling loop
>>> + *
>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>> + * @txq: pointer obtained from station or virtual interface
>>> + * @reset_seqno: Whether to reset the internal scheduling sequence
>>> number,
>>> + *               allowing this txq to appear again in the current
>>> scheduling
>>> + *               round (see doc for ieee80211_next_txq()).
>>> + *
>>> + * Returns %true if the txq was actually added to the scheduling,
>>> + * %false otherwise.
>>> + */
>>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>>> +			    struct ieee80211_txq *txq,
>>> +			    bool reset_seqno);
>>> +
>>> +/**
>>> + * ieee80211_next_txq - get next tx queue to pull packets from
>>> + *
>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>> + * @ac: filter returned txqs with this AC number. Pass -1 for no
>>> filtering.
>>> + * @inc_seqno: Whether to increase the scheduling sequence number.
>>> Setting this
>>> + *             to true signifies the start of a new scheduling 
>>> round.
>>> Each TXQ
>>> + *             will only be returned exactly once in each round
>>> (unless its
>>> + *             sequence number is explicitly reset when calling
>>> + *             ieee80211_schedule_txq()).
>>> + *
>> Toke,
>> 
>> Seems like seqno is internal to mac80211 and meant for active_txq list
>> manipulation. If so, why would drivers have to worry about increment
>> or resetting seqno?
> 
> The drivers need to be able to signal when they start a new "scheduling
> round" (which is the parameter to next_txq()), and when a queue should
> be immediately rescheduled (which is the parameter to schedule_txq()).
> See the subsequent patch to ath9k for how this is used; the latter is
> signalled when a TXQ successfully dequeued an aggregate...
> 
> Now, you're right that the choice to track this via a sequence number 
> is
> an implementation detail internal to mac80211... so maybe the 
> parameters
> should be called something different.
> 
>> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
>> be used and always add new txq into waitq list. So that driver will
>> not worry about mac80211 txq manipulation. Please correct me If Im
>> wrong.
>> 
>> ieee80211_schedule_txq
>>     - if schedule_order empty, add txq into waitq list tail.
>> 
>> ieee80211_next_txq
>>     - if activeq empty,
>>          - move waitq list into activeq
>> 
>>     - if activeq not empty
>>          - fetch appropriate txq from activeq
>>          - remove txq from activeq list.
>> 
>>     - If txq found, return txq else return NULL
> 
> 
> Erm, how would this prevent an infinite loop? With this scheme, at some
> point, ieee80211_next_txq() removes the last txq from activeq, and
> returns that. Then, when it is called again the next time the driver
> loops, it's back to the first case (activeq empty, waitq non-empty); so
> it'll move waitq back as activeq and start over... Only the driver
> really knows when it is starting a logical "loop through all active
> TXQs".
> 
Oops.. My bad.. The idea is that ieee80211_next_txq process txq from 
activeq only
and keep processed txqs separately. Having single list eventually needs 
tracking
mechanism. The point is that once activeq becomes empty, splice waitq 
list and
return NULL. So that driver can break from the loop.

ieee80211_next_txq
      - if activeq empty,
           - move waitq list into activeq
           - return NULL

      - if activeq not empty
           - fetch appropriate txq from activeq
           - remove txq from activeq list.

      - If txq found, return txq else return NULL

> Also, for airtime fairness, the queues are not scheduled strictly
> round-robin, so the dual-list scheme wouldn't work there either...
> 
As you know, ath10k driver will operate in two tx modes (push-only, 
push-pull).
These modes will be switched dynamically depends on firmware load or 
resource
availability. In push-pull mode, firmware will query N number of frames 
for set of
STA, TID. So the driver will directly dequeue N number of frames from 
given txq.
In this case txq ordering alone wont help. I am planning to add below 
changes in
exiting API and add new API ieee80211_reorder_txq.

ieee80211_txq_get_depth
      - return deficit status along with frm_cnt

ieee80211_reorder_txq
      - if txq deficit > 0
            - return;
      - if txq is last
             - return
      - delete txq from list
      - move it to tail
      - update deficit by quantum

ath10k_htt_rx_tx_fetch_ind
      - get txq deficit status
      - if txq deficit > 0
            - dequeue skb
      - else if deficit < 0
            - return NULL

Please share your thoughts.

-Rajkumar
Toke Høiland-Jørgensen July 12, 2018, 11:13 p.m. UTC | #4
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
>>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>>> [...]
>>>> +/**
>>>> + * ieee80211_schedule_txq - add txq to scheduling loop
>>>> + *
>>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>> + * @txq: pointer obtained from station or virtual interface
>>>> + * @reset_seqno: Whether to reset the internal scheduling sequence
>>>> number,
>>>> + *               allowing this txq to appear again in the current
>>>> scheduling
>>>> + *               round (see doc for ieee80211_next_txq()).
>>>> + *
>>>> + * Returns %true if the txq was actually added to the scheduling,
>>>> + * %false otherwise.
>>>> + */
>>>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>>>> +			    struct ieee80211_txq *txq,
>>>> +			    bool reset_seqno);
>>>> +
>>>> +/**
>>>> + * ieee80211_next_txq - get next tx queue to pull packets from
>>>> + *
>>>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>> + * @ac: filter returned txqs with this AC number. Pass -1 for no
>>>> filtering.
>>>> + * @inc_seqno: Whether to increase the scheduling sequence number.
>>>> Setting this
>>>> + *             to true signifies the start of a new scheduling 
>>>> round.
>>>> Each TXQ
>>>> + *             will only be returned exactly once in each round
>>>> (unless its
>>>> + *             sequence number is explicitly reset when calling
>>>> + *             ieee80211_schedule_txq()).
>>>> + *
>>> Toke,
>>> 
>>> Seems like seqno is internal to mac80211 and meant for active_txq list
>>> manipulation. If so, why would drivers have to worry about increment
>>> or resetting seqno?
>> 
>> The drivers need to be able to signal when they start a new "scheduling
>> round" (which is the parameter to next_txq()), and when a queue should
>> be immediately rescheduled (which is the parameter to schedule_txq()).
>> See the subsequent patch to ath9k for how this is used; the latter is
>> signalled when a TXQ successfully dequeued an aggregate...
>> 
>> Now, you're right that the choice to track this via a sequence number 
>> is
>> an implementation detail internal to mac80211... so maybe the 
>> parameters
>> should be called something different.
>> 
>>> IMHO to avoid over serving same txq, two lists (activeq and waitq) can
>>> be used and always add new txq into waitq list. So that driver will
>>> not worry about mac80211 txq manipulation. Please correct me If Im
>>> wrong.
>>> 
>>> ieee80211_schedule_txq
>>>     - if schedule_order empty, add txq into waitq list tail.
>>> 
>>> ieee80211_next_txq
>>>     - if activeq empty,
>>>          - move waitq list into activeq
>>> 
>>>     - if activeq not empty
>>>          - fetch appropriate txq from activeq
>>>          - remove txq from activeq list.
>>> 
>>>     - If txq found, return txq else return NULL
>> 
>> 
>> Erm, how would this prevent an infinite loop? With this scheme, at some
>> point, ieee80211_next_txq() removes the last txq from activeq, and
>> returns that. Then, when it is called again the next time the driver
>> loops, it's back to the first case (activeq empty, waitq non-empty); so
>> it'll move waitq back as activeq and start over... Only the driver
>> really knows when it is starting a logical "loop through all active
>> TXQs".
>> 
> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
> activeq only and keep processed txqs separately. Having single list
> eventually needs tracking mechanism. The point is that once activeq
> becomes empty, splice waitq list and return NULL. So that driver can
> break from the loop.
>
> ieee80211_next_txq
>       - if activeq empty,
>            - move waitq list into activeq
>            - return NULL
>
>       - if activeq not empty
>            - fetch appropriate txq from activeq
>            - remove txq from activeq list.
>
>       - If txq found, return txq else return NULL

Right, so this would ensure the driver only sees each TXQ once *if it
keeps looping*. But it doesn't necessarily; if the hardware queues fill
up, for instance, it might abort earlier. In that case it would need to
signal mac80211 that it is done for now, which is equivalent to
signalling when it starts a scheduler round.

>> Also, for airtime fairness, the queues are not scheduled strictly
>> round-robin, so the dual-list scheme wouldn't work there either...
>> 
> As you know, ath10k driver will operate in two tx modes (push-only,
> push-pull). These modes will be switched dynamically depends on
> firmware load or resource availability. In push-pull mode, firmware
> will query N number of frames for set of STA, TID.

Ah, so it will look up the TXQ without looping through the list of
pending queues at all? Didn't realise that is what it's doing; yeah,
that would be problematic for airtime fairness :)

> So the driver will directly dequeue N number of frames from given txq.
> In this case txq ordering alone wont help. I am planning to add below
> changes in exiting API and add new API ieee80211_reorder_txq.
>
> ieee80211_txq_get_depth
>       - return deficit status along with frm_cnt
>
> ieee80211_reorder_txq
>       - if txq deficit > 0
>             - return;
>       - if txq is last
>              - return
>       - delete txq from list
>       - move it to tail
>       - update deficit by quantum
>
> ath10k_htt_rx_tx_fetch_ind
>       - get txq deficit status
>       - if txq deficit > 0
>             - dequeue skb
>       - else if deficit < 0
>             - return NULL
>
> Please share your thoughts.

Hmm, not sure exactly how this would work; seems a little complicated?
Also, I'd rather if drivers were completely oblivious to the deficit;
that is a bit of an implementation detail...

We could have an ieee80211_txq_may_pull(); or maybe just have
ieee80211_tx_dequeue() return NULL if the deficit is negative? I think
the reasonable thing for the driver to do, then, would be to ask
ieee80211_next_txq() for another TXQ to pull from if the current one
doesn't work for whatever reason.

Would that work for push-pull mode?

-Toke
Rajkumar Manoharan July 13, 2018, 12:33 a.m. UTC | #5
On 2018-07-12 16:13, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
>> On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>>> 
>>>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>>>> [...]
>>>>> +/**
[...]

>>> Erm, how would this prevent an infinite loop? With this scheme, at 
>>> some
>>> point, ieee80211_next_txq() removes the last txq from activeq, and
>>> returns that. Then, when it is called again the next time the driver
>>> loops, it's back to the first case (activeq empty, waitq non-empty); 
>>> so
>>> it'll move waitq back as activeq and start over... Only the driver
>>> really knows when it is starting a logical "loop through all active
>>> TXQs".
>>> 
>> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
>> activeq only and keep processed txqs separately. Having single list
>> eventually needs tracking mechanism. The point is that once activeq
>> becomes empty, splice waitq list and return NULL. So that driver can
>> break from the loop.
>> 
>> ieee80211_next_txq
>>       - if activeq empty,
>>            - move waitq list into activeq
>>            - return NULL
>> 
>>       - if activeq not empty
>>            - fetch appropriate txq from activeq
>>            - remove txq from activeq list.
>> 
>>       - If txq found, return txq else return NULL
> 
> Right, so this would ensure the driver only sees each TXQ once *if it
> keeps looping*. But it doesn't necessarily; if the hardware queues fill
> up, for instance, it might abort earlier. In that case it would need to
> signal mac80211 that it is done for now, which is equivalent to
> signalling when it starts a scheduler round.
> 
Hmm... I thought driver will call ieee80211_schedule_txq when it runs 
out
of hardware descriptor and break the loop. The serving txq will be added
back to head of activeq list. no?

>>> Also, for airtime fairness, the queues are not scheduled strictly
>>> round-robin, so the dual-list scheme wouldn't work there either...
>>> 
>> As you know, ath10k driver will operate in two tx modes (push-only,
>> push-pull). These modes will be switched dynamically depends on
>> firmware load or resource availability. In push-pull mode, firmware
>> will query N number of frames for set of STA, TID.
> 
> Ah, so it will look up the TXQ without looping through the list of
> pending queues at all? Didn't realise that is what it's doing; yeah,
> that would be problematic for airtime fairness :)
> 
>> So the driver will directly dequeue N number of frames from given txq.
>> In this case txq ordering alone wont help. I am planning to add below
>> changes in exiting API and add new API ieee80211_reorder_txq.
>> 
>> ieee80211_txq_get_depth
>>       - return deficit status along with frm_cnt
>> 
>> ieee80211_reorder_txq
>>       - if txq deficit > 0
>>             - return;
>>       - if txq is last
>>              - return
>>       - delete txq from list
>>       - move it to tail
>>       - update deficit by quantum
>> 
>> ath10k_htt_rx_tx_fetch_ind
>>       - get txq deficit status
>>       - if txq deficit > 0
>>             - dequeue skb
>>       - else if deficit < 0
>>             - return NULL
>> 
>> Please share your thoughts.
> 
> Hmm, not sure exactly how this would work; seems a little complicated?
> Also, I'd rather if drivers were completely oblivious to the deficit;
> that is a bit of an implementation detail...
> 
Agree.. Initially I thought of adding deficit check in 
ieee80211_tx_dequeue.
But It will be overhead of taking activeq_lock for every skbs. Perhaps
it can be renamed as allowed_to_dequeue instead of deficit.

> We could have an ieee80211_txq_may_pull(); or maybe just have
> ieee80211_tx_dequeue() return NULL if the deficit is negative?
> 
As I said earlier, checking deficit for every skb will be an overhead.
It should be done once before accessing txq.

> the reasonable thing for the driver to do, then, would be to ask
> ieee80211_next_txq() for another TXQ to pull from if the current one
> doesn't work for whatever reason.
> 
> Would that work for push-pull mode?
> 
Not really. Driver shouldn't send other txq data instead of asked one.
In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
So the driver not supposed to send other txq's data.

-Rajkumar
Toke Høiland-Jørgensen July 13, 2018, 1:39 p.m. UTC | #6
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-12 16:13, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
>>> On 2018-07-11 13:48, Toke Høiland-Jørgensen wrote:
>>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>>>> 
>>>>> On 2018-07-09 09:37, Toke Høiland-Jørgensen wrote:
>>>>> [...]
>>>>>> +/**
> [...]
>
>>>> Erm, how would this prevent an infinite loop? With this scheme, at 
>>>> some
>>>> point, ieee80211_next_txq() removes the last txq from activeq, and
>>>> returns that. Then, when it is called again the next time the driver
>>>> loops, it's back to the first case (activeq empty, waitq non-empty); 
>>>> so
>>>> it'll move waitq back as activeq and start over... Only the driver
>>>> really knows when it is starting a logical "loop through all active
>>>> TXQs".
>>>> 
>>> Oops.. My bad.. The idea is that ieee80211_next_txq process txq from
>>> activeq only and keep processed txqs separately. Having single list
>>> eventually needs tracking mechanism. The point is that once activeq
>>> becomes empty, splice waitq list and return NULL. So that driver can
>>> break from the loop.
>>> 
>>> ieee80211_next_txq
>>>       - if activeq empty,
>>>            - move waitq list into activeq
>>>            - return NULL
>>> 
>>>       - if activeq not empty
>>>            - fetch appropriate txq from activeq
>>>            - remove txq from activeq list.
>>> 
>>>       - If txq found, return txq else return NULL
>> 
>> Right, so this would ensure the driver only sees each TXQ once *if it
>> keeps looping*. But it doesn't necessarily; if the hardware queues fill
>> up, for instance, it might abort earlier. In that case it would need to
>> signal mac80211 that it is done for now, which is equivalent to
>> signalling when it starts a scheduler round.
>> 
> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
> out of hardware descriptor and break the loop. The serving txq will be
> added back to head of activeq list. no?

Yes, and then the next one will be serviced... It's basically:

while (!hwq_is_full()) {
 txq = next_txq():
 build_one_aggr(txq); // may or may not succeed
 if (!empty(txq))
   schedule_txq(txq);
}

It is not generally predictable how many times this will loop before
exiting...

>>>> Also, for airtime fairness, the queues are not scheduled strictly
>>>> round-robin, so the dual-list scheme wouldn't work there either...
>>>> 
>>> As you know, ath10k driver will operate in two tx modes (push-only,
>>> push-pull). These modes will be switched dynamically depends on
>>> firmware load or resource availability. In push-pull mode, firmware
>>> will query N number of frames for set of STA, TID.
>> 
>> Ah, so it will look up the TXQ without looping through the list of
>> pending queues at all? Didn't realise that is what it's doing; yeah,
>> that would be problematic for airtime fairness :)
>> 
>>> So the driver will directly dequeue N number of frames from given txq.
>>> In this case txq ordering alone wont help. I am planning to add below
>>> changes in exiting API and add new API ieee80211_reorder_txq.
>>> 
>>> ieee80211_txq_get_depth
>>>       - return deficit status along with frm_cnt
>>> 
>>> ieee80211_reorder_txq
>>>       - if txq deficit > 0
>>>             - return;
>>>       - if txq is last
>>>              - return
>>>       - delete txq from list
>>>       - move it to tail
>>>       - update deficit by quantum
>>> 
>>> ath10k_htt_rx_tx_fetch_ind
>>>       - get txq deficit status
>>>       - if txq deficit > 0
>>>             - dequeue skb
>>>       - else if deficit < 0
>>>             - return NULL
>>> 
>>> Please share your thoughts.
>> 
>> Hmm, not sure exactly how this would work; seems a little complicated?
>> Also, I'd rather if drivers were completely oblivious to the deficit;
>> that is a bit of an implementation detail...
>> 
> Agree.. Initially I thought of adding deficit check in 
> ieee80211_tx_dequeue.
> But It will be overhead of taking activeq_lock for every skbs. Perhaps
> it can be renamed as allowed_to_dequeue instead of deficit.
>
>> We could have an ieee80211_txq_may_pull(); or maybe just have
>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>> 
> As I said earlier, checking deficit for every skb will be an overhead.
> It should be done once before accessing txq.

Well, it could conceivably be done in a way that doesn't require taking
the activeq_lock. Adding another STOP flag to the TXQ, for instance.
From an API point of view I think that is more consistent with what we
have already...

>> the reasonable thing for the driver to do, then, would be to ask
>> ieee80211_next_txq() for another TXQ to pull from if the current one
>> doesn't work for whatever reason.
>> 
>> Would that work for push-pull mode?
>> 
> Not really. Driver shouldn't send other txq data instead of asked one.

I didn't necessarily mean immediately. As long as it does it eventually.
If a TXQ's deficit runs negative, that TXQ will not be allowed to send
again until its deficit has been restored to positive through enough
cycles of the loop in next_txq(). 

> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
> So the driver not supposed to send other txq's data.

Hmm, it'll actually be interesting to see how the airtime fairness
scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
good way; the DRR scheduler generally only restores one TXQ to positive
deficit at a time, so it may be that MU-MIMO will break completely and
we'll have to come up with another scheduling algorithm.

How does the firmware the airtime of a MU-MIMO transmission to each of
the involved stations?

-Toke
Rajkumar Manoharan July 17, 2018, 1:06 a.m. UTC | #7
On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
[...]

>> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
>> out of hardware descriptor and break the loop. The serving txq will be
>> added back to head of activeq list. no?
> 
> Yes, and then the next one will be serviced... It's basically:
> 
> while (!hwq_is_full()) {
>  txq = next_txq():
>  build_one_aggr(txq); // may or may not succeed
>  if (!empty(txq))
>    schedule_txq(txq);
> }
> 
> It is not generally predictable how many times this will loop before
> exiting...
> 
Agree.. It would be better If the driver does not worry about txq 
sequence
numbering. Perhaps one more API (ieee80211_first_txq) could solve this.
Will leave it to you.

>>>> 
>>>> ieee80211_txq_get_depth
>>>>       - return deficit status along with frm_cnt
>>>> 
>>>> ieee80211_reorder_txq
>>>>       - if txq deficit > 0
>>>>             - return;
>>>>       - if txq is last
>>>>              - return
>>>>       - delete txq from list
>>>>       - move it to tail
>>>>       - update deficit by quantum
>>>> 
>>>> ath10k_htt_rx_tx_fetch_ind
>>>>       - get txq deficit status
>>>>       - if txq deficit > 0
>>>>             - dequeue skb
>>>>       - else if deficit < 0
>>>>             - return NULL
>>>> 
>>>> Please share your thoughts.
>>> 
>>> Hmm, not sure exactly how this would work; seems a little 
>>> complicated?
>>> Also, I'd rather if drivers were completely oblivious to the deficit;
>>> that is a bit of an implementation detail...
>>> 
>> Agree.. Initially I thought of adding deficit check in
>> ieee80211_tx_dequeue.
>> But It will be overhead of taking activeq_lock for every skbs. Perhaps
>> it can be renamed as allowed_to_dequeue instead of deficit.
>> 
>>> We could have an ieee80211_txq_may_pull(); or maybe just have
>>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>>> 
>> As I said earlier, checking deficit for every skb will be an overhead.
>> It should be done once before accessing txq.
> 
> Well, it could conceivably be done in a way that doesn't require taking
> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
> From an API point of view I think that is more consistent with what we
> have already...
> 

Make sense. ieee80211_txq_may_pull would be better place to decide 
whether
given txq is allowed for transmission. It also makes drivers do not have 
to
worry about deficit. Still I may need ieee80211_reorder_txq API after
processing txq. isn't it?

>>> the reasonable thing for the driver to do, then, would be to ask
>>> ieee80211_next_txq() for another TXQ to pull from if the current one
>>> doesn't work for whatever reason.
>>> 
>>> Would that work for push-pull mode?
>>> 
>> Not really. Driver shouldn't send other txq data instead of asked one.
> 
> I didn't necessarily mean immediately. As long as it does it 
> eventually.
> If a TXQ's deficit runs negative, that TXQ will not be allowed to send
> again until its deficit has been restored to positive through enough
> cycles of the loop in next_txq().
> 

Thats true. Are you suggesting to run the loop until the txq deficit 
becomes
positive?

>> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
>> So the driver not supposed to send other txq's data.
> 
> Hmm, it'll actually be interesting to see how the airtime fairness
> scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
> good way; the DRR scheduler generally only restores one TXQ to positive
> deficit at a time, so it may be that MU-MIMO will break completely and
> we'll have to come up with another scheduling algorithm.
> 

In push-pull method, driver reports to firmware that number of frames
queued for each tid per station by wake_tx_queue. Later firmware will 
query
N frames from each TID and after dequeue driver will update remaining 
frames
for that tid. In ATF case, when driver is not able to dequeue frames, 
driver
will simply update remaining frames. The consecutive fetch_ind get 
opportunity
to dequeue the frames. By This way, transmission for serving client will 
be paused
for a while and opportunity will be given to others.

-Rajkumar
Toke Høiland-Jørgensen July 19, 2018, 2:18 p.m. UTC | #8
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
> [...]
>
>>> Hmm... I thought driver will call ieee80211_schedule_txq when it runs
>>> out of hardware descriptor and break the loop. The serving txq will be
>>> added back to head of activeq list. no?
>> 
>> Yes, and then the next one will be serviced... It's basically:
>> 
>> while (!hwq_is_full()) {
>>  txq = next_txq():
>>  build_one_aggr(txq); // may or may not succeed
>>  if (!empty(txq))
>>    schedule_txq(txq);
>> }
>> 
>> It is not generally predictable how many times this will loop before
>> exiting...
>> 
> Agree.. It would be better If the driver does not worry about txq
> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
> solve this. Will leave it to you.

That is basically what the second parameter to next_txq() does in the
current patchset. It just needs to be renamed...

>>>>> 
>>>>> ieee80211_txq_get_depth
>>>>>       - return deficit status along with frm_cnt
>>>>> 
>>>>> ieee80211_reorder_txq
>>>>>       - if txq deficit > 0
>>>>>             - return;
>>>>>       - if txq is last
>>>>>              - return
>>>>>       - delete txq from list
>>>>>       - move it to tail
>>>>>       - update deficit by quantum
>>>>> 
>>>>> ath10k_htt_rx_tx_fetch_ind
>>>>>       - get txq deficit status
>>>>>       - if txq deficit > 0
>>>>>             - dequeue skb
>>>>>       - else if deficit < 0
>>>>>             - return NULL
>>>>> 
>>>>> Please share your thoughts.
>>>> 
>>>> Hmm, not sure exactly how this would work; seems a little 
>>>> complicated?
>>>> Also, I'd rather if drivers were completely oblivious to the deficit;
>>>> that is a bit of an implementation detail...
>>>> 
>>> Agree.. Initially I thought of adding deficit check in
>>> ieee80211_tx_dequeue.
>>> But It will be overhead of taking activeq_lock for every skbs. Perhaps
>>> it can be renamed as allowed_to_dequeue instead of deficit.
>>> 
>>>> We could have an ieee80211_txq_may_pull(); or maybe just have
>>>> ieee80211_tx_dequeue() return NULL if the deficit is negative?
>>>> 
>>> As I said earlier, checking deficit for every skb will be an overhead.
>>> It should be done once before accessing txq.
>> 
>> Well, it could conceivably be done in a way that doesn't require taking
>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>> From an API point of view I think that is more consistent with what we
>> have already...
>> 
>
> Make sense. ieee80211_txq_may_pull would be better place to decide
> whether given txq is allowed for transmission. It also makes drivers
> do not have to worry about deficit. Still I may need
> ieee80211_reorder_txq API after processing txq. isn't it?

The way I was assuming this would work (and what ath9k does), is that a
driver only operates on one TXQ at a time; so it can get that txq,
process it, and re-schedule it. In which case no other API is needed;
the rotating can be done in next_txq(), and schedule_txq() can insert
the txq back into the rotation as needed.

However, it sounds like this is not how ath10k does things? See below.

>>>> the reasonable thing for the driver to do, then, would be to ask
>>>> ieee80211_next_txq() for another TXQ to pull from if the current one
>>>> doesn't work for whatever reason.
>>>> 
>>>> Would that work for push-pull mode?
>>>> 
>>> Not really. Driver shouldn't send other txq data instead of asked one.
>> 
>> I didn't necessarily mean immediately. As long as it does it 
>> eventually.
>> If a TXQ's deficit runs negative, that TXQ will not be allowed to send
>> again until its deficit has been restored to positive through enough
>> cycles of the loop in next_txq().
>> 
>
> Thats true. Are you suggesting to run the loop until the txq deficit
> becomes positive?

Yeah, or rather until the first station with a positive deficit is
found.

>>> In MU-MIMO, firmware will query N packets from given set of {STA,TID}.
>>> So the driver not supposed to send other txq's data.
>> 
>> Hmm, it'll actually be interesting to see how the airtime fairness
>> scheduler interacts with MU-MIMO. I'm not quite sure that it'll be in a
>> good way; the DRR scheduler generally only restores one TXQ to positive
>> deficit at a time, so it may be that MU-MIMO will break completely and
>> we'll have to come up with another scheduling algorithm.
>> 
>
> In push-pull method, driver reports to firmware that number of frames
> queued for each tid per station by wake_tx_queue. Later firmware will
> query N frames from each TID and after dequeue driver will update
> remaining frames for that tid. In ATF case, when driver is not able to
> dequeue frames, driver will simply update remaining frames. The
> consecutive fetch_ind get opportunity to dequeue the frames. By This
> way, transmission for serving client will be paused for a while and
> opportunity will be given to others.

This sounds like the driver doesn't do anything other than notify the
firmware that a txq has pending frames, and everything else is handled
by the firmware? Is this the case?

And if so, how does that interact with ath10k_mac_tx_push_pending()?
That function is basically doing the same thing that I explained above
for ath9k; in the previous version of this patch series I modified that
to use next_txq(). But is that a different TX path, or am I
misunderstanding you?

If you could point me to which parts of the ath10k code I should be
looking at, that would be helpful as well :)

-Toke
Rajkumar Manoharan July 21, 2018, 1:01 a.m. UTC | #9
On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

>>> It is not generally predictable how many times this will loop before
>>> exiting...
>>> 
>> Agree.. It would be better If the driver does not worry about txq
>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>> solve this. Will leave it to you.
> 
> That is basically what the second parameter to next_txq() does in the
> current patchset. It just needs to be renamed...
> 
Agree. As next_txq() increments seqno while starting loop for each AC,
It seems bit confusing. i.e A single ath_txq_schedule_all call will bump
schedule_seqno by 4. right?

Let assume below case where CPU1 is dequeuing skb from isr context and
CPU2 is enqueuing skbs into same txq.

CPU1                                          CPU2
----                                          ----
ath_txq_schedule
   -> ieee80211_next_txq(first)
         -> inc_seq
         -> delete from list
         -> txq->seqno = local->seqno
                                              ieee80211_tx/fast_xmit
                                                 -> ieee80211_queue_skb
                                                     -> 
ieee80211_schedule_txq(reset_seqno)
                                                          -> list_add
                                                          -> txqi->seqno 
= local->seqno - 1

In above sequence, it is quite possible that the same txq
will be served repeatedly and other txqs will be starved. am I right?
IMHO seqno mechanism will not guarantee that txqs will be processed
only once in an iteration.

[...]

>>> Well, it could conceivably be done in a way that doesn't require 
>>> taking
>>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>>> From an API point of view I think that is more consistent with what 
>>> we
>>> have already...
>>> 
>> 
>> Make sense. ieee80211_txq_may_pull would be better place to decide
>> whether given txq is allowed for transmission. It also makes drivers
>> do not have to worry about deficit. Still I may need
>> ieee80211_reorder_txq API after processing txq. isn't it?
> 
> The way I was assuming this would work (and what ath9k does), is that a
> driver only operates on one TXQ at a time; so it can get that txq,
> process it, and re-schedule it. In which case no other API is needed;
> the rotating can be done in next_txq(), and schedule_txq() can insert
> the txq back into the rotation as needed.
> 
> However, it sounds like this is not how ath10k does things? See below.
> 
correct. The current rotation works only in push-only mode. i.e when 
firmware
is not deciding txqs and the driver picks priority txq from active_txqs 
list.
Unfortunately rotation won't work when the driver selects txq other than 
first
in the list. In any case separate API needed for rotation when the 
driver is
processing few packets from txq instead of all pending frames.

>> In push-pull method, driver reports to firmware that number of frames
>> queued for each tid per station by wake_tx_queue. Later firmware will
>> query N frames from each TID and after dequeue driver will update
>> remaining frames for that tid. In ATF case, when driver is not able to
>> dequeue frames, driver will simply update remaining frames. The
>> consecutive fetch_ind get opportunity to dequeue the frames. By This
>> way, transmission for serving client will be paused for a while and
>> opportunity will be given to others.
> 
> This sounds like the driver doesn't do anything other than notify the
> firmware that a txq has pending frames, and everything else is handled
> by the firmware? Is this the case?
> 

Correct. In push-pull method, DRR algorithm running in firmware and so
firmware decides txq.

> And if so, how does that interact with ath10k_mac_tx_push_pending()?
> That function is basically doing the same thing that I explained above
> for ath9k; in the previous version of this patch series I modified that
> to use next_txq(). But is that a different TX path, or am I
> misunderstanding you?
> 
> If you could point me to which parts of the ath10k code I should be
> looking at, that would be helpful as well :)
> 
Depends on firmware htt_tx_mode (push/push_pull), 
ath10k_mac_tx_push_pending()
downloads all/partial frames to firmware. Please take a look at 
ath10k_mac_tx_can_push()
in push_pending(). Let me know If you need any other details.

-Rajkumar
Toke Høiland-Jørgensen July 21, 2018, 8:54 p.m. UTC | #10
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
>>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>
>>>> It is not generally predictable how many times this will loop before
>>>> exiting...
>>>> 
>>> Agree.. It would be better If the driver does not worry about txq
>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>> solve this. Will leave it to you.
>> 
>> That is basically what the second parameter to next_txq() does in the
>> current patchset. It just needs to be renamed...
>> 
> Agree. As next_txq() increments seqno while starting loop for each AC,
> It seems bit confusing. i.e A single ath_txq_schedule_all call will
> bump schedule_seqno by 4. right?

Hmm, good point. Guess there should be one seqno per ac...

> Let assume below case where CPU1 is dequeuing skb from isr context and
> CPU2 is enqueuing skbs into same txq.
>
> CPU1                                          CPU2
> ----                                          ----
> ath_txq_schedule
>    -> ieee80211_next_txq(first)
>          -> inc_seq
>          -> delete from list
>          -> txq->seqno = local->seqno
>                                               ieee80211_tx/fast_xmit
>                                                  -> ieee80211_queue_skb
>                                                      -> 
> ieee80211_schedule_txq(reset_seqno)
>                                                           -> list_add
>                                                           -> txqi->seqno 
> = local->seqno - 1
>
> In above sequence, it is quite possible that the same txq will be
> served repeatedly and other txqs will be starved. am I right? IMHO
> seqno mechanism will not guarantee that txqs will be processed only
> once in an iteration.

Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was experimenting
with that, and guess I ended up picking the wrong value. Thanks for
pointing that out :)

>>>> Well, it could conceivably be done in a way that doesn't require 
>>>> taking
>>>> the activeq_lock. Adding another STOP flag to the TXQ, for instance.
>>>> From an API point of view I think that is more consistent with what 
>>>> we
>>>> have already...
>>>> 
>>> 
>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>> whether given txq is allowed for transmission. It also makes drivers
>>> do not have to worry about deficit. Still I may need
>>> ieee80211_reorder_txq API after processing txq. isn't it?
>> 
>> The way I was assuming this would work (and what ath9k does), is that a
>> driver only operates on one TXQ at a time; so it can get that txq,
>> process it, and re-schedule it. In which case no other API is needed;
>> the rotating can be done in next_txq(), and schedule_txq() can insert
>> the txq back into the rotation as needed.
>> 
>> However, it sounds like this is not how ath10k does things? See below.
>> 
> correct. The current rotation works only in push-only mode. i.e when
> firmware is not deciding txqs and the driver picks priority txq from
> active_txqs list. Unfortunately rotation won't work when the driver
> selects txq other than first in the list. In any case separate API
> needed for rotation when the driver is processing few packets from txq
> instead of all pending frames.

Rotation is not dependent on the TXQ draining completely. Dequeueing a
few packets, then rescheduling the TXQ is fine.

>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>> That function is basically doing the same thing that I explained above
>> for ath9k; in the previous version of this patch series I modified that
>> to use next_txq(). But is that a different TX path, or am I
>> misunderstanding you?
>> 
>> If you could point me to which parts of the ath10k code I should be
>> looking at, that would be helpful as well :)
>> 
> Depends on firmware htt_tx_mode (push/push_pull),
> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
> me know If you need any other details.

Right, so looking at this, it looks like the driver will loop through
all the available TXQs, trying to dequeue from each of them if
ath10k_mac_tx_can_push() returns true, right? This should actually work
fine with the next_txq() / schedule_txq() API. Without airtime fairness
that will just translate into basically what the driver is doing now,
and I don't think more API functions are needed, as long as that is the
only point in the driver that pushes packets to the device...

With airtime fairness, what is going to happen is that the loop is only
going to get a single TXQ (the first one with a positive deficit), then
exit. Once that station has transmitted something and its deficit runs
negative, it'll get rotated and the next one will get a chance. So I
expect that airtime fairness will probably work, but MU-MIMO will
break...

Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
to replace it with something different. But I think the same next_txq()
/ schedule_txq() API will work for that as well...

-Toke
Rajkumar Manoharan July 24, 2018, 12:42 a.m. UTC | #11
On 2018-07-21 13:54, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
>> On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>>> 
>>>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
>>>>> It is not generally predictable how many times this will loop 
>>>>> before
>>>>> exiting...
>>>>> 
>>>> Agree.. It would be better If the driver does not worry about txq
>>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>>> solve this. Will leave it to you.
>>> 
>>> That is basically what the second parameter to next_txq() does in the
>>> current patchset. It just needs to be renamed...
>>> 
>> Agree. As next_txq() increments seqno while starting loop for each AC,
>> It seems bit confusing. i.e A single ath_txq_schedule_all call will
>> bump schedule_seqno by 4. right?
> 
> Hmm, good point. Guess there should be one seqno per ac...
> 
I would prefer to keep separate list for each AC ;) Prepared one on top 
of
your earlier merged changes. Now it needs revisit.

>> Let assume below case where CPU1 is dequeuing skb from isr context and
>> CPU2 is enqueuing skbs into same txq.
>> 
>> CPU1                                          CPU2
>> ----                                          ----
>> ath_txq_schedule
>>    -> ieee80211_next_txq(first)
>>          -> inc_seq
>>          -> delete from list
>>          -> txq->seqno = local->seqno
>>                                               ieee80211_tx/fast_xmit
>>                                                  -> 
>> ieee80211_queue_skb
>>                                                      ->
>> ieee80211_schedule_txq(reset_seqno)
>>                                                           -> list_add
>>                                                           -> 
>> txqi->seqno
>> = local->seqno - 1
>> 
>> In above sequence, it is quite possible that the same txq will be
>> served repeatedly and other txqs will be starved. am I right? IMHO
>> seqno mechanism will not guarantee that txqs will be processed only
>> once in an iteration.
> 
> Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was 
> experimenting
> with that, and guess I ended up picking the wrong value. Thanks for
> pointing that out :)
> 
A simple change in argument may break algo. What would be seqno of first
packet of txq if queue_skb() isn't reset seqno? I am fine in passing 
start
of loop as arg to next_ntx(). But prefer to keep loop detecting within
mac80211 itself by tracking txq same as ath10k. So that no change is 
needed
in schedule_txq() arg list. thoughts?

>>>>> Well, it could conceivably be done in a way that doesn't require
>>>>> taking
>>>>> the activeq_lock. Adding another STOP flag to the TXQ, for 
>>>>> instance.
>>>>> From an API point of view I think that is more consistent with what
>>>>> we
>>>>> have already...
>>>>> 
>>>> 
>>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>>> whether given txq is allowed for transmission. It also makes drivers
>>>> do not have to worry about deficit. Still I may need
>>>> ieee80211_reorder_txq API after processing txq. isn't it?
>>> 
>>> The way I was assuming this would work (and what ath9k does), is that 
>>> a
>>> driver only operates on one TXQ at a time; so it can get that txq,
>>> process it, and re-schedule it. In which case no other API is needed;
>>> the rotating can be done in next_txq(), and schedule_txq() can insert
>>> the txq back into the rotation as needed.
>>> 
>>> However, it sounds like this is not how ath10k does things? See 
>>> below.
>>> 
>> correct. The current rotation works only in push-only mode. i.e when
>> firmware is not deciding txqs and the driver picks priority txq from
>> active_txqs list. Unfortunately rotation won't work when the driver
>> selects txq other than first in the list. In any case separate API
>> needed for rotation when the driver is processing few packets from txq
>> instead of all pending frames.
> 
> Rotation is not dependent on the TXQ draining completely. Dequeueing a
> few packets, then rescheduling the TXQ is fine.
> 
The problem is that when the driver accesses txq directly, it wont call
next_txq(). So the txq will not dequeued from list and schedule_txq() 
wont
be effective. right?

ieee80211_txq_may_pull() - check whether txq is allowed for tx_dequeue()
                           that helps to keep deficit check in mac80211 
itself

ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
                           txq will be deleted from list and if there are 
pending
                           frames, it will be requeued.

>>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>>> That function is basically doing the same thing that I explained 
>>> above
>>> for ath9k; in the previous version of this patch series I modified 
>>> that
>>> to use next_txq(). But is that a different TX path, or am I
>>> misunderstanding you?
>>> 
>>> If you could point me to which parts of the ath10k code I should be
>>> looking at, that would be helpful as well :)
>>> 
>> Depends on firmware htt_tx_mode (push/push_pull),
>> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
>> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
>> me know If you need any other details.
> 
> Right, so looking at this, it looks like the driver will loop through
> all the available TXQs, trying to dequeue from each of them if
> ath10k_mac_tx_can_push() returns true, right? This should actually work
> fine with the next_txq() / schedule_txq() API. Without airtime fairness
> that will just translate into basically what the driver is doing now,
> and I don't think more API functions are needed, as long as that is the
> only point in the driver that pushes packets to the device...
> 
In push-only mode, this will work with next_txq() / schedule_txq() APIs.
In ath10k, packets are downloaded to firmware in three places.

wake_tx_queue()
tx_push_pending()
htt_rx_tx_fetch_ind() - do txq_lookup() and push_txq()

the above mentioned new APIs are needed to take care of fetch_ind().

> With airtime fairness, what is going to happen is that the loop is only
> going to get a single TXQ (the first one with a positive deficit), then
> exit. Once that station has transmitted something and its deficit runs
> negative, it'll get rotated and the next one will get a chance. So I
> expect that airtime fairness will probably work, but MU-MIMO will
> break...
> 
Agree.. In your earlier series, you did similar changes in ath10k 
especially
in wake_tx_queue and push_pending().

> Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
> to replace it with something different. But I think the same next_txq()
> / schedule_txq() API will work for that as well...
> 
As mentioned above, have to take of fetch_ind(). All 10.4 based chips 
(QCA99x0,
QCA9984, QCA9888, QCA4019, etc.) operates in push-pull mode, when the 
number of
connected station increased. As target does not have enough memory for 
buffering
frames for each station, it relied on host memory. ath10k driver can not 
identify
that whether the received fetch_ind is for MU-MIMO or regular 
transmission.

If we don't handle this case, then ath10k driver can not claim mac80211 
ATF support.
Agree that MU-MIMO won't work with DDR scheduler. and it will impact 
MU-MIMO performace
in ath10k when mac80211 ATF is claimed by ath10k.

-Rajkumar
Toke Høiland-Jørgensen July 24, 2018, 11:08 a.m. UTC | #12
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

> On 2018-07-21 13:54, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>> 
>>> On 2018-07-19 07:18, Toke Høiland-Jørgensen wrote:
>>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>>>> 
>>>>> On 2018-07-13 06:39, Toke Høiland-Jørgensen wrote:
>>>>>> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
>>> 
>>>>>> It is not generally predictable how many times this will loop 
>>>>>> before
>>>>>> exiting...
>>>>>> 
>>>>> Agree.. It would be better If the driver does not worry about txq
>>>>> sequence numbering. Perhaps one more API (ieee80211_first_txq) could
>>>>> solve this. Will leave it to you.
>>>> 
>>>> That is basically what the second parameter to next_txq() does in the
>>>> current patchset. It just needs to be renamed...
>>>> 
>>> Agree. As next_txq() increments seqno while starting loop for each AC,
>>> It seems bit confusing. i.e A single ath_txq_schedule_all call will
>>> bump schedule_seqno by 4. right?
>> 
>> Hmm, good point. Guess there should be one seqno per ac...
>> 
> I would prefer to keep separate list for each AC ;) Prepared one on
> top of your earlier merged changes. Now it needs revisit.

Fine with me. Only reason I used a single list + the filtering mechanism
was to keep things compatible with ath10k ;)

>>> Let assume below case where CPU1 is dequeuing skb from isr context and
>>> CPU2 is enqueuing skbs into same txq.
>>> 
>>> CPU1                                          CPU2
>>> ----                                          ----
>>> ath_txq_schedule
>>>    -> ieee80211_next_txq(first)
>>>          -> inc_seq
>>>          -> delete from list
>>>          -> txq->seqno = local->seqno
>>>                                               ieee80211_tx/fast_xmit
>>>                                                  -> 
>>> ieee80211_queue_skb
>>>                                                      ->
>>> ieee80211_schedule_txq(reset_seqno)
>>>                                                           -> list_add
>>>                                                           -> 
>>> txqi->seqno
>>> = local->seqno - 1
>>> 
>>> In above sequence, it is quite possible that the same txq will be
>>> served repeatedly and other txqs will be starved. am I right? IMHO
>>> seqno mechanism will not guarantee that txqs will be processed only
>>> once in an iteration.
>> 
>> Yeah, ieee80211_queue_skb() shouldn't set reset_seqno; was 
>> experimenting
>> with that, and guess I ended up picking the wrong value. Thanks for
>> pointing that out :)
>> 
> A simple change in argument may break algo. What would be seqno of
> first packet of txq if queue_skb() isn't reset seqno?

It would be 0, which would be less than the current seqno in all cases
except just when the seqno counter wraps.

> I am fine in passing start of loop as arg to next_ntx(). But prefer to
> keep loop detecting within mac80211 itself by tracking txq same as
> ath10k. So that no change is needed in schedule_txq() arg list.
> thoughts?

If we remove the reset argument to schedule_txq we lose the ability for
the driver to signal that it is OK to dequeue another series of packets
from the same TXQ in the current scheduling round. I think that things
would mostly still work, but we may risk the hardware running idle for
short periods of time (in ath9k at least). I am not quite sure which
conditions this would happen under, though; and I haven't been able to
trigger it myself with my test hardware. So one approach would be to try
it out without the parameter and put it back later if it turns out to be
problematic...

>>>>>> Well, it could conceivably be done in a way that doesn't require
>>>>>> taking
>>>>>> the activeq_lock. Adding another STOP flag to the TXQ, for 
>>>>>> instance.
>>>>>> From an API point of view I think that is more consistent with what
>>>>>> we
>>>>>> have already...
>>>>>> 
>>>>> 
>>>>> Make sense. ieee80211_txq_may_pull would be better place to decide
>>>>> whether given txq is allowed for transmission. It also makes drivers
>>>>> do not have to worry about deficit. Still I may need
>>>>> ieee80211_reorder_txq API after processing txq. isn't it?
>>>> 
>>>> The way I was assuming this would work (and what ath9k does), is that 
>>>> a
>>>> driver only operates on one TXQ at a time; so it can get that txq,
>>>> process it, and re-schedule it. In which case no other API is needed;
>>>> the rotating can be done in next_txq(), and schedule_txq() can insert
>>>> the txq back into the rotation as needed.
>>>> 
>>>> However, it sounds like this is not how ath10k does things? See 
>>>> below.
>>>> 
>>> correct. The current rotation works only in push-only mode. i.e when
>>> firmware is not deciding txqs and the driver picks priority txq from
>>> active_txqs list. Unfortunately rotation won't work when the driver
>>> selects txq other than first in the list. In any case separate API
>>> needed for rotation when the driver is processing few packets from txq
>>> instead of all pending frames.
>> 
>> Rotation is not dependent on the TXQ draining completely. Dequeueing a
>> few packets, then rescheduling the TXQ is fine.
>> 
> The problem is that when the driver accesses txq directly, it wont
> call next_txq(). So the txq will not dequeued from list and
> schedule_txq() wont be effective. right?
>
> ieee80211_txq_may_pull() - check whether txq is allowed for tx_dequeue()
>                            that helps to keep deficit check in mac80211 
> itself
>
> ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
>                            txq will be deleted from list and if there are 
> pending
>                            frames, it will be requeued.

This could work, but reorder_txq() can't do the reordering from the
middle of the rotation. I.e, it can only reorder if the TXQ being passed
in is currently at the head of the list of active TXQs.

If we do go for this, I think it would make sense to use it everywhere.
I.e., next_txq() doesn't remove the queue, it just returns what is
currently at the head; and the driver will have to call reorder() after
processing a TXQ, instead of schedule_txq().

>>>> And if so, how does that interact with ath10k_mac_tx_push_pending()?
>>>> That function is basically doing the same thing that I explained 
>>>> above
>>>> for ath9k; in the previous version of this patch series I modified 
>>>> that
>>>> to use next_txq(). But is that a different TX path, or am I
>>>> misunderstanding you?
>>>> 
>>>> If you could point me to which parts of the ath10k code I should be
>>>> looking at, that would be helpful as well :)
>>>> 
>>> Depends on firmware htt_tx_mode (push/push_pull),
>>> ath10k_mac_tx_push_pending() downloads all/partial frames to firmware.
>>> Please take a look at ath10k_mac_tx_can_push() in push_pending(). Let
>>> me know If you need any other details.
>> 
>> Right, so looking at this, it looks like the driver will loop through
>> all the available TXQs, trying to dequeue from each of them if
>> ath10k_mac_tx_can_push() returns true, right? This should actually work
>> fine with the next_txq() / schedule_txq() API. Without airtime fairness
>> that will just translate into basically what the driver is doing now,
>> and I don't think more API functions are needed, as long as that is the
>> only point in the driver that pushes packets to the device...
>> 
> In push-only mode, this will work with next_txq() / schedule_txq() APIs.
> In ath10k, packets are downloaded to firmware in three places.
>
> wake_tx_queue()
> tx_push_pending()
> htt_rx_tx_fetch_ind() - do txq_lookup() and push_txq()
>
> the above mentioned new APIs are needed to take care of fetch_ind().

Ah! This was what I was missing; thanks!

Right, so with the current DRR scheduler, the only thing we can do with
this setup is throttle fetch_ind() if the TXQ isn't eligible for
scheduling. What happens if we do that? As far as I can tell,
fetch_ind() is triggered on tx completion from the same TXQ, right? So
if we throttle that, the driver falls back to the tx_push_pending()
rotation?

I think adding the throttling to tx_fetch_ind() would basically amount
to disabling that mechanism for most transmission scenarios...

>> With airtime fairness, what is going to happen is that the loop is only
>> going to get a single TXQ (the first one with a positive deficit), then
>> exit. Once that station has transmitted something and its deficit runs
>> negative, it'll get rotated and the next one will get a chance. So I
>> expect that airtime fairness will probably work, but MU-MIMO will
>> break...
>> 
> Agree.. In your earlier series, you did similar changes in ath10k
> especially in wake_tx_queue and push_pending().
>
>> Don't think we can do MU-MIMO with a DRR airtime scheduler; we'll have
>> to replace it with something different. But I think the same next_txq()
>> / schedule_txq() API will work for that as well...
>> 
> As mentioned above, have to take of fetch_ind(). All 10.4 based chips
> (QCA99x0, QCA9984, QCA9888, QCA4019, etc.) operates in push-pull mode,
> when the number of connected station increased. As target does not
> have enough memory for buffering frames for each station, it relied on
> host memory. ath10k driver can not identify that whether the received
> fetch_ind is for MU-MIMO or regular transmission.
>
> If we don't handle this case, then ath10k driver can not claim
> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
> scheduler. and it will impact MU-MIMO performace in ath10k when
> mac80211 ATF is claimed by ath10k.

Yeah, so the question is if this is an acceptable tradeoff? Do you have
any idea how often MU-MIMO actually provides a benefit in the real
world?

One approach could be to implement the API and driver support with the
current DRR scheduler, and then evolve the scheduler into something that
can handle MU-MIMO afterwards. I have some ideas for this, but not sure
if they will work; and we want to avoid O(n) behaviour in the number of
associated stations.

-Toke
Rajkumar Manoharan July 30, 2018, 9:10 p.m. UTC | #13
On 2018-07-24 04:08, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 

Sorry for the late reply.

>> I would prefer to keep separate list for each AC ;) Prepared one on
>> top of your earlier merged changes. Now it needs revisit.
> 
> Fine with me. Only reason I used a single list + the filtering 
> mechanism
> was to keep things compatible with ath10k ;)
> 
Thanks :). If so, I will defer the per-ac-list later.

>> A simple change in argument may break algo. What would be seqno of
>> first packet of txq if queue_skb() isn't reset seqno?
> 
> It would be 0, which would be less than the current seqno in all cases
> except just when the seqno counter wraps.
> 

One point should be mentioned in comment section of next_txq() that
the driver has to ensure that next_txq() iteration is serialized i.e
no parallel txq traversal are encouraged. Though txqs_list is maintained
in mac80211, parallel iteration may change the behavior. For example
I thought of get rid of txqs_lock in ath10k as txqs_list is moved to 
mac80211.
But it is still needed. right?

>> I am fine in passing start of loop as arg to next_ntx(). But prefer to
>> keep loop detecting within mac80211 itself by tracking txq same as
>> ath10k. So that no change is needed in schedule_txq() arg list.
>> thoughts?
> 
> If we remove the reset argument to schedule_txq we lose the ability for
> the driver to signal that it is OK to dequeue another series of packets
> from the same TXQ in the current scheduling round. I think that things
> would mostly still work, but we may risk the hardware running idle for
> short periods of time (in ath9k at least). I am not quite sure which
> conditions this would happen under, though; and I haven't been able to
> trigger it myself with my test hardware. So one approach would be to 
> try
> it out without the parameter and put it back later if it turns out to 
> be
> problematic...
> 
Agree. Based on experiments, additional arguments can be extended later.
Currently in ath10k, txqs are added back to list when there are pending
frames and are not considered again in the same iteration. It will be
processed in next loop.


>> ieee80211_reorder_txq() - after dequeuing skb (in direct txq access),
>>                            txq will be deleted from list and if there 
>> are
>> pending
>>                            frames, it will be requeued.
> 
> This could work, but reorder_txq() can't do the reordering from the
> middle of the rotation. I.e, it can only reorder if the TXQ being 
> passed
> in is currently at the head of the list of active TXQs.
> 
> If we do go for this, I think it would make sense to use it everywhere.
> I.e., next_txq() doesn't remove the queue, it just returns what is
> currently at the head; and the driver will have to call reorder() after
> processing a TXQ, instead of schedule_txq().
> 
Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care of
reordering list though the driver is accessing txqs directly.

And also as we discussed earlier, I will drop txq_may_pull() API and 
move
the deficit check under tx_dequeue().

> Right, so with the current DRR scheduler, the only thing we can do with
> this setup is throttle fetch_ind() if the TXQ isn't eligible for
> scheduling. What happens if we do that? As far as I can tell,
> fetch_ind() is triggered on tx completion from the same TXQ, right? So
> if we throttle that, the driver falls back to the tx_push_pending()
> rotation?
> 
The fallback push will be effective only if current queued count is 
lesser than
allowed. Otherwise push_pending() won't be effective even after 
throttling
fetch_ind(). The same txq will be served in further fetch_ind() when the 
driver
reports to firmware about pending counts.

> I think adding the throttling to tx_fetch_ind() would basically amount
> to disabling that mechanism for most transmission scenarios...
> 
If you don't throttle fetch_ind(), fairness won't be effective. Also I 
am
thinking of reordering push_pending() after processing fetch_ind_q. This 
will
give enough chances for fetch_ind().

>> If we don't handle this case, then ath10k driver can not claim
>> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
>> scheduler. and it will impact MU-MIMO performace in ath10k when
>> mac80211 ATF is claimed by ath10k.
> 
> Yeah, so the question is if this is an acceptable tradeoff? Do you have
> any idea how often MU-MIMO actually provides a benefit in the real
> world?
> 
Hmm.. We have some criteria of ~1.9 gain in Mu-MIMO test cases with 50 
11ac clients.

-Rajkumar
Toke Høiland-Jørgensen July 30, 2018, 10:48 p.m. UTC | #14
Rajkumar Manoharan <rmanohar@codeaurora.org> writes:

>>> A simple change in argument may break algo. What would be seqno of
>>> first packet of txq if queue_skb() isn't reset seqno?
>> 
>> It would be 0, which would be less than the current seqno in all cases
>> except just when the seqno counter wraps.
>> 
>
> One point should be mentioned in comment section of next_txq() that
> the driver has to ensure that next_txq() iteration is serialized i.e
> no parallel txq traversal are encouraged. Though txqs_list is maintained
> in mac80211, parallel iteration may change the behavior. For example
> I thought of get rid of txqs_lock in ath10k as txqs_list is moved to 
> mac80211.
> But it is still needed. right?

Hmm, the driver really shouldn't need to do any locking apart from using
the next_txq() API. But I think you are right that the seqno mechanism
doesn't play well with unserialised access to through next_txq() from
multiple CPUs. I'll see what I can do about that, and also incorporate
the other changes we've been discussing into a new RFC series.

> Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care
> of reordering list though the driver is accessing txqs directly.

Right, as long as the next_txq() path is still called even while
fetch_ind() is active, that should be fine.

>>> If we don't handle this case, then ath10k driver can not claim
>>> mac80211 ATF support. Agree that MU-MIMO won't work with DDR
>>> scheduler. and it will impact MU-MIMO performace in ath10k when
>>> mac80211 ATF is claimed by ath10k.
>> 
>> Yeah, so the question is if this is an acceptable tradeoff? Do you have
>> any idea how often MU-MIMO actually provides a benefit in the real
>> world?
>> 
> Hmm.. We have some criteria of ~1.9 gain in Mu-MIMO test cases with 50
> 11ac clients.

Hmm, yeah, that would be a shame to lose; although I do think 50-client
deployments are still relatively rare for many people. So maybe we can
make airtime fairness something that can be switched on and off for
ath10k, depending on whether users think they will be needing MU-MIMO?
Until we come up with a scheduler that can handle it, of course...

-Toke
Rajkumar Manoharan July 31, 2018, 12:19 a.m. UTC | #15
On 2018-07-30 15:48, Toke Høiland-Jørgensen wrote:
> Rajkumar Manoharan <rmanohar@codeaurora.org> writes:
> 
> Hmm, the driver really shouldn't need to do any locking apart from 
> using
> the next_txq() API. But I think you are right that the seqno mechanism
> doesn't play well with unserialised access to through next_txq() from
> multiple CPUs. I'll see what I can do about that, and also incorporate
> the other changes we've been discussing into a new RFC series.
> 
Great.. :)

>> Hmm.. reorder_txq() API may not needed. Calling next_txq() takes care
>> of reordering list though the driver is accessing txqs directly.
> 
> Right, as long as the next_txq() path is still called even while
> fetch_ind() is active, that should be fine.
> 
I am still wondering how and when to refill deficit in case next_txq()
won't be called. :(

Will post RFC change on top of yours.

-Rajkumar
Johannes Berg Aug. 29, 2018, 7:36 a.m. UTC | #16
Rather belatedly reviewing this too ...

> + * The driver can't access the queue directly. To dequeue a frame from a
> + * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
> + * queue, it calls the .wake_tx_queue driver op.
> + *
> + * Drivers can optionally delegate responsibility for scheduling queues to
> + * mac80211, to take advantage of airtime fairness accounting. In this case, to
> + * obtain the next queue to pull frames from, the driver calls
> + * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
> + * using ieee80211_schedule_txq() if it is still active after the driver has
> + * finished pulling packets from it.

Maybe you could clarify what "is still active" means here? I'm guessing
it means "tx_dequeue() would return non-NULL" but I doubt you really
want such a strong requirement, perhaps "the last tx_dequeue() returned
non-NULL" is sufficient?


We're also working on adding a TXQ for (bufferable) management packets -
I wonder how that should interact here? Always be scheduled first?

> +/**
> + * ieee80211_schedule_txq - add txq to scheduling loop
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @txq: pointer obtained from station or virtual interface
> + * @reset_seqno: Whether to reset the internal scheduling sequence number,
> + *               allowing this txq to appear again in the current scheduling
> + *               round (see doc for ieee80211_next_txq()).
> + *
> + * Returns %true if the txq was actually added to the scheduling,
> + * %false otherwise.
> + */
> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
> +			    struct ieee80211_txq *txq,
> +			    bool reset_seqno);

I concur with Rajkumar in a way that "seqno" is a really bad name for
this since it's so loaded in the context of wifi. He didn't say it this
way, but said it was an "internal to mac80211" concept here, and was
perhaps also/more referring to the manipulations by drivers.

Perhaps calling it something like scheduling_round or something would be
better? That's not really what it is, schedule_id? Even schedule_seq[no]
would be clearer.

> +/**
> + * ieee80211_next_txq - get next tx queue to pull packets from
> + *
> + * @hw: pointer as obtained from ieee80211_alloc_hw()
> + * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
> + * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
> + *             to true signifies the start of a new scheduling round. Each TXQ
> + *             will only be returned exactly once in each round (unless its
> + *             sequence number is explicitly reset when calling
> + *             ieee80211_schedule_txq()).

Here you already talk about "scheduling sequence number" after all :)

> +	ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
>  	drv_wake_tx_queue(local, txqi);

These always seem to come paired - perhaps a helper is useful?

Although it looks like there are sometimes different locking contexts,
not sure if that's really necessary though.

johannes
Toke Høiland-Jørgensen Aug. 29, 2018, 9:22 a.m. UTC | #17
Johannes Berg <johannes@sipsolutions.net> writes:

> Rather belatedly reviewing this too ...
>
>> + * The driver can't access the queue directly. To dequeue a frame from a
>> + * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
>> + * queue, it calls the .wake_tx_queue driver op.
>> + *
>> + * Drivers can optionally delegate responsibility for scheduling queues to
>> + * mac80211, to take advantage of airtime fairness accounting. In this case, to
>> + * obtain the next queue to pull frames from, the driver calls
>> + * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
>> + * using ieee80211_schedule_txq() if it is still active after the driver has
>> + * finished pulling packets from it.
>
> Maybe you could clarify what "is still active" means here? I'm
> guessing it means "tx_dequeue() would return non-NULL" but I doubt you
> really want such a strong requirement, perhaps "the last tx_dequeue()
> returned non-NULL" is sufficient?

Yeah, the latter should suffice. Really it should be put back "if it is
not known to be empty", but, well, that doesn't read so well...

> We're also working on adding a TXQ for (bufferable) management packets
> - I wonder how that should interact here? Always be scheduled first?

Ah, cool! It may be that it should be given priority, yeah. Presently,
the multicast queue just rotates in the RR with the others, but is never
throttled as it doesn't have an airtime measure (though perhaps it
should)? But that might not be desirable for management frames, as
presumable those need to go out as fast as possible?

>> +/**
>> + * ieee80211_schedule_txq - add txq to scheduling loop
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @txq: pointer obtained from station or virtual interface
>> + * @reset_seqno: Whether to reset the internal scheduling sequence number,
>> + *               allowing this txq to appear again in the current scheduling
>> + *               round (see doc for ieee80211_next_txq()).
>> + *
>> + * Returns %true if the txq was actually added to the scheduling,
>> + * %false otherwise.
>> + */
>> +bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
>> +			    struct ieee80211_txq *txq,
>> +			    bool reset_seqno);
>
> I concur with Rajkumar in a way that "seqno" is a really bad name for
> this since it's so loaded in the context of wifi. He didn't say it this
> way, but said it was an "internal to mac80211" concept here, and was
> perhaps also/more referring to the manipulations by drivers.
>
> Perhaps calling it something like scheduling_round or something would be
> better? That's not really what it is, schedule_id? Even schedule_seq[no]
> would be clearer.

Yeah, definitely going to change that :)

>> +/**
>> + * ieee80211_next_txq - get next tx queue to pull packets from
>> + *
>> + * @hw: pointer as obtained from ieee80211_alloc_hw()
>> + * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
>> + * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
>> + *             to true signifies the start of a new scheduling round. Each TXQ
>> + *             will only be returned exactly once in each round (unless its
>> + *             sequence number is explicitly reset when calling
>> + *             ieee80211_schedule_txq()).
>
> Here you already talk about "scheduling sequence number" after all :)
>
>> +	ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
>>  	drv_wake_tx_queue(local, txqi);
>
> These always seem to come paired - perhaps a helper is useful?
>
> Although it looks like there are sometimes different locking contexts,
> not sure if that's really necessary though.

Hmm, I seem to recall thinking about just putting the call to
schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
dropping that; but will take another look at whether it makes sense to
combine things.

-Toke
Johannes Berg Aug. 29, 2018, 9:24 a.m. UTC | #18
On Wed, 2018-08-29 at 11:22 +0200, Toke Høiland-Jørgensen wrote:

> > We're also working on adding a TXQ for (bufferable) management packets
> > - I wonder how that should interact here? Always be scheduled first?
> 
> Ah, cool! It may be that it should be given priority, yeah. Presently,
> the multicast queue just rotates in the RR with the others, but is never
> throttled as it doesn't have an airtime measure (though perhaps it
> should)? But that might not be desirable for management frames, as
> presumable those need to go out as fast as possible?

Good question. I guess the multicast should have an airtime measure, but
I don't think we want to do accounting on the management. That really
should be few frames, and we want them out ASAP in most cases.

> Hmm, I seem to recall thinking about just putting the call to
> schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
> dropping that; but will take another look at whether it makes sense to
> combine things.

I was thinking the other way around - but that doesn't work since you'd
loop from the driver to itself. This way might work, I guess, hadn't
considered that.

Might be better anyway though to make a new inline that does both, since
the drv_() calls usually really don't do much on their own (other than
checks, and in one case backward compatibility code, but still)

johannes
Toke Høiland-Jørgensen Aug. 29, 2018, 10:09 a.m. UTC | #19
Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2018-08-29 at 11:22 +0200, Toke Høiland-Jørgensen wrote:
>
>> > We're also working on adding a TXQ for (bufferable) management packets
>> > - I wonder how that should interact here? Always be scheduled first?
>> 
>> Ah, cool! It may be that it should be given priority, yeah. Presently,
>> the multicast queue just rotates in the RR with the others, but is never
>> throttled as it doesn't have an airtime measure (though perhaps it
>> should)? But that might not be desirable for management frames, as
>> presumable those need to go out as fast as possible?
>
> Good question. I guess the multicast should have an airtime measure,
> but I don't think we want to do accounting on the management. That
> really should be few frames, and we want them out ASAP in most cases.

Yup, makes sense.

>> Hmm, I seem to recall thinking about just putting the call to
>> schedule_txq() into drv_wake_tx_queue; don't remember why I ended up
>> dropping that; but will take another look at whether it makes sense to
>> combine things.
>
> I was thinking the other way around - but that doesn't work since you'd
> loop from the driver to itself. This way might work, I guess, hadn't
> considered that.
>
> Might be better anyway though to make a new inline that does both, since
> the drv_() calls usually really don't do much on their own (other than
> checks, and in one case backward compatibility code, but still)

ACK, I'll look into that.

-Toke
diff mbox

Patch

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 5790f55c241d..18e43193b614 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -107,9 +107,16 @@ 
  * The driver is expected to initialize its private per-queue data for stations
  * and interfaces in the .add_interface and .sta_add ops.
  *
- * The driver can't access the queue directly. To dequeue a frame, it calls
- * ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a queue, it
- * calls the .wake_tx_queue driver op.
+ * The driver can't access the queue directly. To dequeue a frame from a
+ * txq, it calls ieee80211_tx_dequeue(). Whenever mac80211 adds a new frame to a
+ * queue, it calls the .wake_tx_queue driver op.
+ *
+ * Drivers can optionally delegate responsibility for scheduling queues to
+ * mac80211, to take advantage of airtime fairness accounting. In this case, to
+ * obtain the next queue to pull frames from, the driver calls
+ * ieee80211_next_txq(). The driver is then expected to re-schedule the txq
+ * using ieee80211_schedule_txq() if it is still active after the driver has
+ * finished pulling packets from it.
  *
  * For AP powersave TIM handling, the driver only needs to indicate if it has
  * buffered packets in the driver specific data structures by calling
@@ -5971,13 +5978,48 @@  void ieee80211_unreserve_tid(struct ieee80211_sta *sta, u8 tid);
  * ieee80211_tx_dequeue - dequeue a packet from a software tx queue
  *
  * @hw: pointer as obtained from ieee80211_alloc_hw()
- * @txq: pointer obtained from station or virtual interface
+ * @txq: pointer obtained from station or virtual interface, or from
+ *       ieee80211_next_txq()
  *
  * Returns the skb if successful, %NULL if no frame was available.
  */
 struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 				     struct ieee80211_txq *txq);
 
+/**
+ * ieee80211_schedule_txq - add txq to scheduling loop
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ * @reset_seqno: Whether to reset the internal scheduling sequence number,
+ *               allowing this txq to appear again in the current scheduling
+ *               round (see doc for ieee80211_next_txq()).
+ *
+ * Returns %true if the txq was actually added to the scheduling,
+ * %false otherwise.
+ */
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq,
+			    bool reset_seqno);
+
+/**
+ * ieee80211_next_txq - get next tx queue to pull packets from
+ *
+ * @hw: pointer as obtained from ieee80211_alloc_hw()
+ * @ac: filter returned txqs with this AC number. Pass -1 for no filtering.
+ * @inc_seqno: Whether to increase the scheduling sequence number. Setting this
+ *             to true signifies the start of a new scheduling round. Each TXQ
+ *             will only be returned exactly once in each round (unless its
+ *             sequence number is explicitly reset when calling
+ *             ieee80211_schedule_txq()).
+ *
+ * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
+ * is returned, it will have been removed from the scheduler queue and needs to
+ * be re-scheduled with ieee80211_schedule_txq() to continue to be active.
+ */
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+					 bool inc_seqno);
+
 /**
  * ieee80211_txq_get_depth - get pending frame/byte count of given txq
  *
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index 69e831bc317b..0a2e0d64fc11 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -227,6 +227,8 @@  ieee80211_agg_start_txq(struct sta_info *sta, int tid, bool enable)
 		clear_bit(IEEE80211_TXQ_AMPDU, &txqi->flags);
 
 	clear_bit(IEEE80211_TXQ_STOP, &txqi->flags);
+	ieee80211_schedule_txq(&sta->sdata->local->hw, txq, true);
+
 	local_bh_disable();
 	rcu_read_lock();
 	drv_wake_tx_queue(sta->sdata->local, txqi);
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 172aeae21ae9..6bd68639c699 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -835,6 +835,8 @@  struct txq_info {
 	struct codel_vars def_cvars;
 	struct codel_stats cstats;
 	struct sk_buff_head frags;
+	struct list_head schedule_order;
+	u16 schedule_seqno;
 	unsigned long flags;
 
 	/* keep last! */
@@ -1126,6 +1128,11 @@  struct ieee80211_local {
 	struct codel_vars *cvars;
 	struct codel_params cparams;
 
+	/* protects active_txqs and txqi->schedule_order */
+	spinlock_t active_txq_lock;
+	struct list_head active_txqs;
+	u16 schedule_seqno;
+
 	const struct ieee80211_ops *ops;
 
 	/*
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 4fb2709cb527..abaca5e1a59f 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -634,6 +634,9 @@  struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 	spin_lock_init(&local->rx_path_lock);
 	spin_lock_init(&local->queue_stop_reason_lock);
 
+	INIT_LIST_HEAD(&local->active_txqs);
+	spin_lock_init(&local->active_txq_lock);
+
 	INIT_LIST_HEAD(&local->chanctx_list);
 	mutex_init(&local->chanctx_mtx);
 
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index f34202242d24..bb92bf516b6b 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1244,6 +1244,9 @@  void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
 			if (!txq_has_queue(sta->sta.txq[i]))
 				continue;
 
+			ieee80211_schedule_txq(&local->hw,
+					       sta->sta.txq[i],
+					       true);
 			drv_wake_tx_queue(local, to_txq_info(sta->sta.txq[i]));
 		}
 	}
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 5b93bde248fd..24766566a380 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1437,6 +1437,7 @@  void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
 	codel_vars_init(&txqi->def_cvars);
 	codel_stats_init(&txqi->cstats);
 	__skb_queue_head_init(&txqi->frags);
+	INIT_LIST_HEAD(&txqi->schedule_order);
 
 	txqi->txq.vif = &sdata->vif;
 
@@ -1460,6 +1461,9 @@  void ieee80211_txq_purge(struct ieee80211_local *local,
 
 	fq_tin_reset(fq, tin, fq_skb_free_func);
 	ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
+	spin_lock_bh(&local->active_txq_lock);
+	list_del_init(&txqi->schedule_order);
+	spin_unlock_bh(&local->active_txq_lock);
 }
 
 void ieee80211_txq_set_params(struct ieee80211_local *local)
@@ -1576,6 +1580,7 @@  static bool ieee80211_queue_skb(struct ieee80211_local *local,
 	ieee80211_txq_enqueue(local, txqi, skb);
 	spin_unlock_bh(&fq->lock);
 
+	ieee80211_schedule_txq(&local->hw, &txqi->txq, true);
 	drv_wake_tx_queue(local, txqi);
 
 	return true;
@@ -3574,6 +3579,70 @@  struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 }
 EXPORT_SYMBOL(ieee80211_tx_dequeue);
 
+bool ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq,
+			    bool reset_seqno)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+	bool ret = false;
+
+	spin_lock_bh(&local->active_txq_lock);
+
+	if (list_empty(&txqi->schedule_order)) {
+		list_add_tail(&txqi->schedule_order, &local->active_txqs);
+		if (reset_seqno)
+			txqi->schedule_seqno = local->schedule_seqno - 1;
+		ret = true;
+	}
+
+	spin_unlock_bh(&local->active_txq_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, s8 ac,
+					 bool inc_seqno)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = NULL;
+
+	spin_lock_bh(&local->active_txq_lock);
+
+	if (inc_seqno)
+		local->schedule_seqno++;
+
+	if (list_empty(&local->active_txqs))
+		goto out;
+
+	list_for_each_entry(txqi, &local->active_txqs, schedule_order) {
+		if (ac < 0 || txqi->txq.ac == ac) {
+			/* If seqnos are equal, we've seen this txqi before in
+			 * this scheduling round, so abort.
+			 */
+			if (txqi->schedule_seqno == local->schedule_seqno)
+				txqi = NULL;
+			else
+				list_del_init(&txqi->schedule_order);
+			goto out;
+		}
+	}
+
+	/* no txq with requested ac found */
+	txqi = NULL;
+
+out:
+	spin_unlock_bh(&local->active_txq_lock);
+
+	if (!txqi)
+		return NULL;
+
+	txqi->schedule_seqno = local->schedule_seqno;
+	return &txqi->txq;
+}
+EXPORT_SYMBOL(ieee80211_next_txq);
+
 void __ieee80211_subif_start_xmit(struct sk_buff *skb,
 				  struct net_device *dev,
 				  u32 info_flags)