diff mbox series

[v6,3/4] mac80211: Implement Airtime-based Queue Limit (AQL)

Message ID 157182474287.150713.12867638269538730397.stgit@toke.dk (mailing list archive)
State New, archived
Headers show
Series Add Airtime Queue Limits (AQL) to mac80211 | expand

Commit Message

Toke Høiland-Jørgensen Oct. 23, 2019, 9:59 a.m. UTC
From: Kan Yan <kyan@google.com>

In order for the Fq_CoDel algorithm integrated in mac80211 layer to operate
effectively to control excessive queueing latency, the CoDel algorithm
requires an accurate measure of how long packets stays in the queue, AKA
sojourn time. The sojourn time measured at the mac80211 layer doesn't
include queueing latency in the lower layer (firmware/hardware) and CoDel
expects lower layer to have a short queue. However, most 802.11ac chipsets
offload tasks such TX aggregation to firmware or hardware, thus have a deep
lower layer queue.

Without a mechanism to control the lower layer queue size, packets only
stay in mac80211 layer transiently before being sent to firmware queue.
As a result, the sojourn time measured by CoDel in the mac80211 layer is
almost always lower than the CoDel latency target, hence CoDel does little
to control the latency, even when the lower layer queue causes excessive
latency.

The Byte Queue Limits (BQL) mechanism is commonly used to address the
similar issue with wired network interface. However, this method cannot be
applied directly to the wireless network interface. "Bytes" is not a
suitable measure of queue depth in the wireless network, as the data rate
can vary dramatically from station to station in the same network, from a
few Mbps to over Gbps.

This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work
effectively with wireless drivers that utilized firmware/hardware
offloading. AQL allows each txq to release just enough packets to the lower
layer to form 1-2 large aggregations to keep hardware fully utilized and
retains the rest of the frames in mac80211 layer to be controlled by the
CoDel algorithm.

Signed-off-by: Kan Yan <kyan@google.com>
[ Toke: Keep API to set pending airtime internal, fix nits in commit msg ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/net/cfg80211.h     |    7 ++++
 include/net/mac80211.h     |   12 +++++++
 net/mac80211/debugfs.c     |   78 ++++++++++++++++++++++++++++++++++++++++++++
 net/mac80211/debugfs_sta.c |   43 +++++++++++++++++++-----
 net/mac80211/ieee80211_i.h |    4 ++
 net/mac80211/main.c        |    9 +++++
 net/mac80211/sta_info.c    |   32 ++++++++++++++++++
 net/mac80211/sta_info.h    |    8 +++++
 net/mac80211/tx.c          |   46 ++++++++++++++++++++++++--
 9 files changed, 225 insertions(+), 14 deletions(-)

Comments

Johannes Berg Nov. 8, 2019, 10:09 a.m. UTC | #1
On Wed, 2019-10-23 at 11:59 +0200, Toke Høiland-Jørgensen wrote:
> 
>  
> +void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
> +					  struct sta_info *sta, u8 ac,
> +					  u16 tx_airtime, bool tx_completed)
> +{
> +	spin_lock_bh(&local->active_txq_lock[ac]);
> +	if (tx_completed) {
> +		if (sta) {
> +			if (WARN_ONCE(sta->airtime[ac].aql_tx_pending < tx_airtime,
> +				      "TXQ pending airtime underflow: %u, %u",
> +				      sta->airtime[ac].aql_tx_pending, tx_airtime))

Maybe add the STA/AC to the message?

johannes
Toke Høiland-Jørgensen Nov. 8, 2019, 10:56 a.m. UTC | #2
Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2019-10-23 at 11:59 +0200, Toke Høiland-Jørgensen wrote:
>> 
>>  
>> +void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
>> +					  struct sta_info *sta, u8 ac,
>> +					  u16 tx_airtime, bool tx_completed)
>> +{
>> +	spin_lock_bh(&local->active_txq_lock[ac]);
>> +	if (tx_completed) {
>> +		if (sta) {
>> +			if (WARN_ONCE(sta->airtime[ac].aql_tx_pending < tx_airtime,
>> +				      "TXQ pending airtime underflow: %u, %u",
>> +				      sta->airtime[ac].aql_tx_pending, tx_airtime))
>
> Maybe add the STA/AC to the message?

Can do. Any idea why we might be seeing underflows (as Kan reported)?

-Toke
Johannes Berg Nov. 8, 2019, 10:59 a.m. UTC | #3
On Fri, 2019-11-08 at 11:56 +0100, Toke Høiland-Jørgensen wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > On Wed, 2019-10-23 at 11:59 +0200, Toke Høiland-Jørgensen wrote:
> > >  
> > > +void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
> > > +					  struct sta_info *sta, u8 ac,
> > > +					  u16 tx_airtime, bool tx_completed)
> > > +{
> > > +	spin_lock_bh(&local->active_txq_lock[ac]);
> > > +	if (tx_completed) {
> > > +		if (sta) {
> > > +			if (WARN_ONCE(sta->airtime[ac].aql_tx_pending < tx_airtime,
> > > +				      "TXQ pending airtime underflow: %u, %u",
> > > +				      sta->airtime[ac].aql_tx_pending, tx_airtime))
> > 
> > Maybe add the STA/AC to the message?
> 
> Can do. Any idea why we might be seeing underflows (as Kan reported)?

No, I really have no idea. The shifting looked OK to me, though I didn't
review it carefully enough to say I've really looked at all places ...

johannes
Toke Høiland-Jørgensen Nov. 8, 2019, 11:10 a.m. UTC | #4
Johannes Berg <johannes@sipsolutions.net> writes:

> On Fri, 2019-11-08 at 11:56 +0100, Toke Høiland-Jørgensen wrote:
>> Johannes Berg <johannes@sipsolutions.net> writes:
>> 
>> > On Wed, 2019-10-23 at 11:59 +0200, Toke Høiland-Jørgensen wrote:
>> > >  
>> > > +void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
>> > > +					  struct sta_info *sta, u8 ac,
>> > > +					  u16 tx_airtime, bool tx_completed)
>> > > +{
>> > > +	spin_lock_bh(&local->active_txq_lock[ac]);
>> > > +	if (tx_completed) {
>> > > +		if (sta) {
>> > > +			if (WARN_ONCE(sta->airtime[ac].aql_tx_pending < tx_airtime,
>> > > +				      "TXQ pending airtime underflow: %u, %u",
>> > > +				      sta->airtime[ac].aql_tx_pending, tx_airtime))
>> > 
>> > Maybe add the STA/AC to the message?
>> 
>> Can do. Any idea why we might be seeing underflows (as Kan reported)?
>
> No, I really have no idea. The shifting looked OK to me, though I didn't
> review it carefully enough to say I've really looked at all places ...

Right, bugger. I was thinking maybe there's a case where skbs can be
cloned (and retain the tx_time_est field) and then released twice? Or
maybe somewhere that steps on the skb->cb field in some other way?
Couldn't find anything obvious on a first perusal of the TX path code,
but maybe you could think of something?

Otherwise I guess we'll be forced to go and do some actual,
old-fashioned debugging ;)

-Toke
Johannes Berg Nov. 8, 2019, 11:17 a.m. UTC | #5
On Fri, 2019-11-08 at 12:10 +0100, Toke Høiland-Jørgensen wrote:

> Right, bugger. I was thinking maybe there's a case where skbs can be
> cloned (and retain the tx_time_est field) and then released twice? 

They could be cloned, but I don't see how that'd be while *inside* the
stack and then they get reported twice - unless the driver did something
like that?

I mean, TCP surely does that for example, but it's before we even get to
mac80211.

> Or
> maybe somewhere that steps on the skb->cb field in some other way?
> Couldn't find anything obvious on a first perusal of the TX path code,
> but maybe you could think of something?

No, sorry. But I also didn't actually look at the driver at all.

> Otherwise I guess we'll be forced to go and do some actual,
> old-fashioned debugging ;)

:)

johannes
Kan Yan Nov. 9, 2019, 1:22 a.m. UTC | #6
It is most likely just insufficient locking. active_txq_lock is per
AC, can't protect local->aql_total_pending_airtime against racing
conditions:
void ieee80211_sta_update_pending_airtime(...)
{
        spin_lock_bh(&local->active_txq_lock[ac]);
        ...
        local->aql_total_pending_airtime -= tx_airtime;
        ...
        spin_unlock_bh(&local->active_txq_lock[ac]);
}

After changing it to atomic_t, no more aql_total_pending_airtime
underflow so far :). Using atomic operation should also help reduce
CPU overhead due to lock contention, as
ieee80211_sta_update_pending_airtime() is often called from the tx
completion routine triggered by interrupts, often in a different core
than where __ieee80211_schedule_txq() is running.

I will post a new version a bit later if the test goes well.

Regards,
Kan


On Fri, Nov 8, 2019 at 3:17 AM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> On Fri, 2019-11-08 at 12:10 +0100, Toke Høiland-Jørgensen wrote:
>
> > Right, bugger. I was thinking maybe there's a case where skbs can be
> > cloned (and retain the tx_time_est field) and then released twice?
>
> They could be cloned, but I don't see how that'd be while *inside* the
> stack and then they get reported twice - unless the driver did something
> like that?
>
> I mean, TCP surely does that for example, but it's before we even get to
> mac80211.
>
> > Or
> > maybe somewhere that steps on the skb->cb field in some other way?
> > Couldn't find anything obvious on a first perusal of the TX path code,
> > but maybe you could think of something?
>
> No, sorry. But I also didn't actually look at the driver at all.
>
> > Otherwise I guess we'll be forced to go and do some actual,
> > old-fashioned debugging ;)
>
> :)
>
> johannes
>
Toke Høiland-Jørgensen Nov. 9, 2019, 11:22 a.m. UTC | #7
Kan Yan <kyan@google.com> writes:

> It is most likely just insufficient locking. active_txq_lock is per
> AC, can't protect local->aql_total_pending_airtime against racing
> conditions:
> void ieee80211_sta_update_pending_airtime(...)
> {
>         spin_lock_bh(&local->active_txq_lock[ac]);
>         ...
>         local->aql_total_pending_airtime -= tx_airtime;
>         ...
>         spin_unlock_bh(&local->active_txq_lock[ac]);
> }

Ohh, right; didn't even realise those were not per-AC as well...

> After changing it to atomic_t, no more aql_total_pending_airtime
> underflow so far :). Using atomic operation should also help reduce
> CPU overhead due to lock contention, as
> ieee80211_sta_update_pending_airtime() is often called from the tx
> completion routine triggered by interrupts, often in a different core
> than where __ieee80211_schedule_txq() is running.
>
> I will post a new version a bit later if the test goes well.

Awesome! :)

-Toke
diff mbox series

Patch

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index ff45c3e1abff..8d50c0a60dbd 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -2602,6 +2602,13 @@  enum wiphy_params_flags {
 
 #define IEEE80211_DEFAULT_AIRTIME_WEIGHT	256
 
+/* The per TXQ device queue limit in airtime */
+#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_L	4000
+#define IEEE80211_DEFAULT_AQL_TXQ_LIMIT_H	8000
+
+/* The per interface airtime threshold to switch to lower queue limit */
+#define IEEE80211_AQL_THRESHOLD			24000
+
 /**
  * struct cfg80211_pmksa - PMK Security Association
  *
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 8a3e0544a026..2bc0f2538a36 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -5565,6 +5565,18 @@  void ieee80211_send_eosp_nullfunc(struct ieee80211_sta *pubsta, int tid);
 void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
 				    u32 tx_airtime, u32 rx_airtime);
 
+/**
+ * ieee80211_txq_airtime_check - check if a txq can send frame to device
+ *
+ * @hw: pointer obtained from ieee80211_alloc_hw()
+ * @txq: pointer obtained from station or virtual interface
+ *
+ * Return true if the AQL's airtime limit has not been reached and the txq can
+ * continue to send more packets to the device. Otherwise return false.
+ */
+bool
+ieee80211_txq_airtime_check(struct ieee80211_hw *hw, struct ieee80211_txq *txq);
+
 /**
  * ieee80211_iter_keys - iterate keys programmed into the device
  * @hw: pointer obtained from ieee80211_alloc_hw()
diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 568b3b276931..d77ea0e51c1d 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -148,6 +148,80 @@  static const struct file_operations aqm_ops = {
 	.llseek = default_llseek,
 };
 
+static ssize_t aql_txq_limit_read(struct file *file,
+				  char __user *user_buf,
+				  size_t count,
+				  loff_t *ppos)
+{
+	struct ieee80211_local *local = file->private_data;
+	char buf[400];
+	int len = 0;
+
+	len = scnprintf(buf, sizeof(buf),
+			"AC	AQL limit low	AQL limit high\n"
+			"VO	%u		%u\n"
+			"VI	%u		%u\n"
+			"BE	%u		%u\n"
+			"BK	%u		%u\n",
+			local->aql_txq_limit_low[IEEE80211_AC_VO],
+			local->aql_txq_limit_high[IEEE80211_AC_VO],
+			local->aql_txq_limit_low[IEEE80211_AC_VI],
+			local->aql_txq_limit_high[IEEE80211_AC_VI],
+			local->aql_txq_limit_low[IEEE80211_AC_BE],
+			local->aql_txq_limit_high[IEEE80211_AC_BE],
+			local->aql_txq_limit_low[IEEE80211_AC_BK],
+			local->aql_txq_limit_high[IEEE80211_AC_BK]);
+	return simple_read_from_buffer(user_buf, count, ppos,
+				       buf, len);
+}
+
+static ssize_t aql_txq_limit_write(struct file *file,
+				   const char __user *user_buf,
+				   size_t count,
+				   loff_t *ppos)
+{
+	struct ieee80211_local *local = file->private_data;
+	char buf[100];
+	size_t len;
+	u32 ac, q_limit_low, q_limit_high;
+	struct sta_info *sta;
+
+	if (count > sizeof(buf))
+		return -EINVAL;
+
+	if (copy_from_user(buf, user_buf, count))
+		return -EFAULT;
+
+	buf[sizeof(buf) - 1] = 0;
+	len = strlen(buf);
+	if (len > 0 && buf[len - 1] == '\n')
+		buf[len - 1] = 0;
+
+	if (sscanf(buf, "%u %u %u", &ac, &q_limit_low, &q_limit_high) != 3)
+		return -EINVAL;
+
+	if (ac >= IEEE80211_NUM_ACS)
+		return -EINVAL;
+
+	local->aql_txq_limit_low[ac] = q_limit_low;
+	local->aql_txq_limit_high[ac] = q_limit_high;
+
+	mutex_lock(&local->sta_mtx);
+	list_for_each_entry(sta, &local->sta_list, list) {
+		sta->airtime[ac].aql_limit_low = q_limit_low;
+		sta->airtime[ac].aql_limit_high = q_limit_high;
+	}
+	mutex_unlock(&local->sta_mtx);
+	return count;
+}
+
+static const struct file_operations aql_txq_limit_ops = {
+	.write = aql_txq_limit_write,
+	.read = aql_txq_limit_read,
+	.open = simple_open,
+	.llseek = default_llseek,
+};
+
 static ssize_t force_tx_status_read(struct file *file,
 				    char __user *user_buf,
 				    size_t count,
@@ -441,6 +515,10 @@  void debugfs_hw_add(struct ieee80211_local *local)
 	debugfs_create_u16("airtime_flags", 0600,
 			   phyd, &local->airtime_flags);
 
+	DEBUGFS_ADD(aql_txq_limit);
+	debugfs_create_u32("aql_threshold", 0600,
+			   phyd, &local->aql_threshold);
+
 	statsd = debugfs_create_dir("statistics", phyd);
 
 	/* if the dir failed, don't put all the other things into the root! */
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index c8ad20c28c43..9f9b8f5ed86a 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -197,10 +197,12 @@  static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
 {
 	struct sta_info *sta = file->private_data;
 	struct ieee80211_local *local = sta->sdata->local;
-	size_t bufsz = 200;
+	size_t bufsz = 400;
 	char *buf = kzalloc(bufsz, GFP_KERNEL), *p = buf;
 	u64 rx_airtime = 0, tx_airtime = 0;
 	s64 deficit[IEEE80211_NUM_ACS];
+	u32 q_depth[IEEE80211_NUM_ACS];
+	u32 q_limit_l[IEEE80211_NUM_ACS], q_limit_h[IEEE80211_NUM_ACS];
 	ssize_t rv;
 	int ac;
 
@@ -212,19 +214,22 @@  static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
 		rx_airtime += sta->airtime[ac].rx_airtime;
 		tx_airtime += sta->airtime[ac].tx_airtime;
 		deficit[ac] = sta->airtime[ac].deficit;
+		q_limit_l[ac] = sta->airtime[ac].aql_limit_low;
+		q_limit_h[ac] = sta->airtime[ac].aql_limit_high;
+		q_depth[ac] = sta->airtime[ac].aql_tx_pending;
 		spin_unlock_bh(&local->active_txq_lock[ac]);
 	}
 
 	p += scnprintf(p, bufsz + buf - p,
 		"RX: %llu us\nTX: %llu us\nWeight: %u\n"
-		"Deficit: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
-		rx_airtime,
-		tx_airtime,
-		sta->airtime_weight,
-		deficit[0],
-		deficit[1],
-		deficit[2],
-		deficit[3]);
+		"Deficit: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n"
+		"Q depth: VO: %u us VI: %u us BE: %u us BK: %u us\n"
+		"Q limit[low/high]: VO: %u/%u VI: %u/%u BE: %u/%u BK: %u/%u\n",
+		rx_airtime, tx_airtime, sta->airtime_weight,
+		deficit[0], deficit[1], deficit[2], deficit[3],
+		q_depth[0], q_depth[1], q_depth[2], q_depth[3],
+		q_limit_l[0], q_limit_h[0], q_limit_l[1], q_limit_h[1],
+		q_limit_l[2], q_limit_h[2], q_limit_l[3], q_limit_h[3]),
 
 	rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
 	kfree(buf);
@@ -236,7 +241,25 @@  static ssize_t sta_airtime_write(struct file *file, const char __user *userbuf,
 {
 	struct sta_info *sta = file->private_data;
 	struct ieee80211_local *local = sta->sdata->local;
-	int ac;
+	u32 ac, q_limit_l, q_limit_h;
+	char _buf[100] = {}, *buf = _buf;
+
+	if (count > sizeof(_buf))
+		return -EINVAL;
+
+	if (copy_from_user(buf, userbuf, count))
+		return -EFAULT;
+
+	buf[sizeof(_buf) - 1] = '\0';
+	if (sscanf(buf, "queue limit %u %u %u", &ac, &q_limit_l, &q_limit_h)
+	    != 3)
+		return -EINVAL;
+
+	if (ac >= IEEE80211_NUM_ACS)
+		return -EINVAL;
+
+	sta->airtime[ac].aql_limit_low = q_limit_l;
+	sta->airtime[ac].aql_limit_high = q_limit_h;
 
 	for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
 		spin_lock_bh(&local->active_txq_lock[ac]);
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 225ea4e3cd76..6fa690757388 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1142,6 +1142,10 @@  struct ieee80211_local {
 	u16 schedule_round[IEEE80211_NUM_ACS];
 
 	u16 airtime_flags;
+	u32 aql_txq_limit_low[IEEE80211_NUM_ACS];
+	u32 aql_txq_limit_high[IEEE80211_NUM_ACS];
+	u32 aql_threshold;
+	u32 aql_total_pending_airtime;
 
 	const struct ieee80211_ops *ops;
 
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index aba094b4ccfc..0792c9b9c850 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -667,8 +667,15 @@  struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
 		INIT_LIST_HEAD(&local->active_txqs[i]);
 		spin_lock_init(&local->active_txq_lock[i]);
+		local->aql_txq_limit_low[i] = IEEE80211_DEFAULT_AQL_TXQ_LIMIT_L;
+		local->aql_txq_limit_high[i] =
+			IEEE80211_DEFAULT_AQL_TXQ_LIMIT_H;
 	}
-	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
+
+	local->airtime_flags = AIRTIME_USE_TX |
+			       AIRTIME_USE_RX |
+			       AIRTIME_USE_AQL;
+	local->aql_threshold = IEEE80211_AQL_THRESHOLD;
 
 	INIT_LIST_HEAD(&local->chanctx_list);
 	mutex_init(&local->chanctx_mtx);
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index bd11fef2139f..64bacf4f068c 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -396,6 +396,9 @@  struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
 		skb_queue_head_init(&sta->ps_tx_buf[i]);
 		skb_queue_head_init(&sta->tx_filtered[i]);
 		sta->airtime[i].deficit = sta->airtime_weight;
+		sta->airtime[i].aql_tx_pending = 0;
+		sta->airtime[i].aql_limit_low = local->aql_txq_limit_low[i];
+		sta->airtime[i].aql_limit_high = local->aql_txq_limit_high[i];
 	}
 
 	for (i = 0; i < IEEE80211_NUM_TIDS; i++)
@@ -1893,6 +1896,35 @@  void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
 }
 EXPORT_SYMBOL(ieee80211_sta_register_airtime);
 
+void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
+					  struct sta_info *sta, u8 ac,
+					  u16 tx_airtime, bool tx_completed)
+{
+	spin_lock_bh(&local->active_txq_lock[ac]);
+	if (tx_completed) {
+		if (sta) {
+			if (WARN_ONCE(sta->airtime[ac].aql_tx_pending < tx_airtime,
+				      "TXQ pending airtime underflow: %u, %u",
+				      sta->airtime[ac].aql_tx_pending, tx_airtime))
+				sta->airtime[ac].aql_tx_pending = 0;
+			else
+				sta->airtime[ac].aql_tx_pending -= tx_airtime;
+		}
+
+		if (WARN_ONCE(local->aql_total_pending_airtime < tx_airtime,
+			      "Device pending airtime underflow: %u, %u",
+			      local->aql_total_pending_airtime, tx_airtime))
+			local->aql_total_pending_airtime = 0;
+		else
+			local->aql_total_pending_airtime -= tx_airtime;
+	} else {
+		if (sta)
+			sta->airtime[ac].aql_tx_pending += tx_airtime;
+		local->aql_total_pending_airtime += tx_airtime;
+	}
+	spin_unlock_bh(&local->active_txq_lock[ac]);
+}
+
 int sta_info_move_state(struct sta_info *sta,
 			enum ieee80211_sta_state new_state)
 {
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 369c2dddce52..4e4d76e81b0f 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -127,13 +127,21 @@  enum ieee80211_agg_stop_reason {
 /* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
 #define AIRTIME_USE_TX		BIT(0)
 #define AIRTIME_USE_RX		BIT(1)
+#define AIRTIME_USE_AQL		BIT(2)
 
 struct airtime_info {
 	u64 rx_airtime;
 	u64 tx_airtime;
 	s64 deficit;
+	u32 aql_tx_pending; /* Estimated airtime for frames pending in queue */
+	u32 aql_limit_low;
+	u32 aql_limit_high;
 };
 
+void ieee80211_sta_update_pending_airtime(struct ieee80211_local *local,
+					  struct sta_info *sta, u8 ac,
+					  u16 tx_airtime, bool tx_completed);
+
 struct sta_info;
 
 /**
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index a16c2f863702..12653d873b8c 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3665,7 +3665,8 @@  struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 	struct ieee80211_txq *ret = NULL;
-	struct txq_info *txqi = NULL;
+	struct txq_info *txqi = NULL, *head = NULL;
+	bool found_eligible_txq = false;
 
 	spin_lock_bh(&local->active_txq_lock[ac]);
 
@@ -3676,13 +3677,26 @@  struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac)
 	if (!txqi)
 		goto out;
 
+	if (txqi == head && !found_eligible_txq)
+		goto out;
+
+	if (!head)
+		head = txqi;
+
 	if (txqi->txq.sta) {
 		struct sta_info *sta = container_of(txqi->txq.sta,
-						struct sta_info, sta);
+						    struct sta_info, sta);
+		bool aql_check = ieee80211_txq_airtime_check(hw, &txqi->txq);
+		s64 deficit = sta->airtime[txqi->txq.ac].deficit;
+
+		if (aql_check)
+			found_eligible_txq = true;
 
-		if (sta->airtime[txqi->txq.ac].deficit < 0) {
+		if (deficit < 0)
 			sta->airtime[txqi->txq.ac].deficit +=
 				sta->airtime_weight;
+
+		if (deficit < 0 || !aql_check) {
 			list_move_tail(&txqi->schedule_order,
 				       &local->active_txqs[txqi->txq.ac]);
 			goto begin;
@@ -3736,6 +3750,32 @@  void __ieee80211_schedule_txq(struct ieee80211_hw *hw,
 }
 EXPORT_SYMBOL(__ieee80211_schedule_txq);
 
+bool ieee80211_txq_airtime_check(struct ieee80211_hw *hw,
+				 struct ieee80211_txq *txq)
+{
+	struct sta_info *sta;
+	struct ieee80211_local *local = hw_to_local(hw);
+
+	if (!(local->airtime_flags & AIRTIME_USE_AQL))
+		return true;
+
+	if (!txq->sta)
+		return true;
+
+	sta = container_of(txq->sta, struct sta_info, sta);
+	if (sta->airtime[txq->ac].aql_tx_pending <
+	    sta->airtime[txq->ac].aql_limit_low)
+		return true;
+
+	if (local->aql_total_pending_airtime < local->aql_threshold &&
+	    sta->airtime[txq->ac].aql_tx_pending <
+	    sta->airtime[txq->ac].aql_limit_high)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL(ieee80211_txq_airtime_check);
+
 bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 				struct ieee80211_txq *txq)
 {