diff mbox series

[RFC,v3,07/12] iwlwifi: Extended Key ID support (NATIVE)

Message ID 20190210210620.31181-8-alexander@wetzel-home.de (mailing list archive)
State RFC
Delegated to: Johannes Berg
Headers show
Series Draft for Extended Key ID support | expand

Commit Message

Alexander Wetzel Feb. 10, 2019, 9:06 p.m. UTC
This is not ready for merge and has known issues.
The patch is only for discussions to sort out how to handle it correctly!

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---

iwlwifi intel cards had two big surprises:

Assuming I did not make some stupid errors it looks like my old
"Intel Corporation Centrino Ultimate-N 6300 (rev 3e)" using ucode 
9.221.4.1 build 25532 is perfectly fine with two keys uploaded to
harware and honoring the keyid in the MPDUs. For a card launched 2011
that's a pleasant surprise:-)

A much shorter test with a modern "Intel Corporation Wireless 8265 /
8175 (rev 78)" using ucode version 36.e91976c0.0 shows what seems to be
MPDUs decoded with the wrong key at each rekey and therefore a candidate
for the COMPAT support only..
So the bad news seems to be, that the modern card dropped the support.

It also seems to force us to add some per-card or per-firmware depending
check to decide which card can use the Native Extended Key ID support
and use the Compat mode for the rest.
Is there any way to find out which cards/firmware can be used with
Extended Key ID?
I also have tested patch for iwldvm using the Compat mode and I think
mvm cards will also work with that.

 drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c | 5 +++++
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 5 +++++
 2 files changed, 10 insertions(+)

Comments

Johannes Berg Feb. 15, 2019, 11:52 a.m. UTC | #1
On Sun, 2019-02-10 at 22:06 +0100, Alexander Wetzel wrote:
> This is not ready for merge and has known issues.
> The patch is only for discussions to sort out how to handle it correctly!
> 
> Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
> ---
> 
> iwlwifi intel cards had two big surprises:
> 
> Assuming I did not make some stupid errors it looks like my old
> "Intel Corporation Centrino Ultimate-N 6300 (rev 3e)" using ucode 
> 9.221.4.1 build 25532 is perfectly fine with two keys uploaded to
> harware and honoring the keyid in the MPDUs. For a card launched 2011
> that's a pleasant surprise:-)

:-)

> A much shorter test with a modern "Intel Corporation Wireless 8265 /
> 8175 (rev 78)" using ucode version 36.e91976c0.0 shows what seems to be
> MPDUs decoded with the wrong key at each rekey and therefore a candidate
> for the COMPAT support only..
> So the bad news seems to be, that the modern card dropped the support.

Probably just a firmware bug.

> It also seems to force us to add some per-card or per-firmware depending
> check to decide which card can use the Native Extended Key ID support
> and use the Compat mode for the rest.
> Is there any way to find out which cards/firmware can be used with
> Extended Key ID?

No, but if you have a good test case we can check out what the firmware
bug is and fix it. Perhaps not for all, but for the future at least.
Maybe we can still figure out where it was introduced and thus see where
it's good to use native mode.

> I also have tested patch for iwldvm using the Compat mode and I think
> mvm cards will also work with that.

No they don't, no firmware is available for that.

johannes
Alexander Wetzel Feb. 22, 2019, 8:50 p.m. UTC | #2
Am 15.02.19 um 12:52 schrieb Johannes Berg:
> On Sun, 2019-02-10 at 22:06 +0100, Alexander Wetzel wrote:
>> This is not ready for merge and has known issues.
>> The patch is only for discussions to sort out how to handle it correctly!
>>
>> Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
>> ---
>>
>> iwlwifi intel cards had two big surprises:
>>
>> Assuming I did not make some stupid errors it looks like my old
>> "Intel Corporation Centrino Ultimate-N 6300 (rev 3e)" using ucode
>> 9.221.4.1 build 25532 is perfectly fine with two keys uploaded to
>> harware and honoring the keyid in the MPDUs. For a card launched 2011
>> that's a pleasant surprise:-)
> 
> :-)
> 
>> A much shorter test with a modern "Intel Corporation Wireless 8265 /
>> 8175 (rev 78)" using ucode version 36.e91976c0.0 shows what seems to be
>> MPDUs decoded with the wrong key at each rekey and therefore a candidate
>> for the COMPAT support only..
>> So the bad news seems to be, that the modern card dropped the support.
> 
> Probably just a firmware bug.
> 
>> It also seems to force us to add some per-card or per-firmware depending
>> check to decide which card can use the Native Extended Key ID support
>> and use the Compat mode for the rest.
>> Is there any way to find out which cards/firmware can be used with
>> Extended Key ID?
> 
> No, but if you have a good test case we can check out what the firmware
> bug is and fix it. Perhaps not for all, but for the future at least.
> Maybe we can still figure out where it was introduced and thus see where
> it's good to use native mode.

I'll verify if can reproduce the scrambled packets and will provide a 
capture if so. Assuming that confirms the initial finding I'll be able 
to reproduce that at will within minutes with access to a test system 
having a mvm card. (I have some plans which will improve access, but 
looks like that will take some time and efforts.)

For now I handle that as low prio till we have generic Extended Key ID 
support merged and I've had some time to improve my test setup and 
hopefully have better access to a mvm card for testing.

> 
>> I also have tested patch for iwldvm using the Compat mode and I think
>> mvm cards will also work with that.
> 
> No they don't, no firmware is available for that.

So far I only looked at the dvm part of iwlwifi with only minutes spend 
on mvm to port the NATIVE solution from dvm.

Are you saying that mvm cards can't seamless switch a RX/TX key to TX 
only one? mvm seems to support SW crypto as needed and switching RX/TX 
to TX keys is the only other requirement for COMPAT mode.

Alexander
Johannes Berg Feb. 22, 2019, 9:06 p.m. UTC | #3
On Fri, 2019-02-22 at 21:50 +0100, Alexander Wetzel wrote:

> > No, but if you have a good test case we can check out what the firmware
> > bug is and fix it. Perhaps not for all, but for the future at least.
> > Maybe we can still figure out where it was introduced and thus see where
> > it's good to use native mode.
> 
> I'll verify if can reproduce the scrambled packets and will provide a 
> capture if so. Assuming that confirms the initial finding I'll be able 
> to reproduce that at will within minutes with access to a test system 
> having a mvm card. (I have some plans which will improve access, but 
> looks like that will take some time and efforts.)
> 
> For now I handle that as low prio till we have generic Extended Key ID 
> support merged

Agree

>  and I've had some time to improve my test setup and 
> hopefully have better access to a mvm card for testing.

If all that's lacking is a few NICs then I'm sure we/I can help out with
that :)

> > > I also have tested patch for iwldvm using the Compat mode and I think
> > > mvm cards will also work with that.
> > 
> > No they don't, no firmware is available for that.
> 
> So far I only looked at the dvm part of iwlwifi with only minutes spend 
> on mvm to port the NATIVE solution from dvm.

:)

> Are you saying that mvm cards can't seamless switch a RX/TX key to TX 
> only one? mvm seems to support SW crypto as needed and switching RX/TX 
> to TX keys is the only other requirement for COMPAT mode.

No, sorry. I misread your statement. I thought you were saying "mvm
cards will also work with iwldvm" rather than "mvm cards will also work
with compat mode" :-)

I think they all should just be made to work in native mode, the
firmware basically supports this as you found, there must be some small
bugs.

johannes
Alexander Wetzel Feb. 24, 2019, 1:04 p.m. UTC | #4
>>   and I've had some time to improve my test setup and
>> hopefully have better access to a mvm card for testing.
> 
> If all that's lacking is a few NICs then I'm sure we/I can help out with
> that :)
> 

Thanks, but as you suspected the NICs are the smallest problem...

I've simply delayed setting up a good lab far too long. So prior to 
shifting the focus to drivers I have to remedy that somehow.
Not really relevant here, but if you want to know what I plan for the 
future here my thoughts on it. If not you better skip that section:-)

Current plan is, to get one (ideally two, depending on price) mini PCs 
(amd64) with m.2 slots, good antennas and one or two matching Intel AC 
9260 cards.

Additionally I want to get rid of the whitelist in my development 
notebook by flashing coreboot, opening the path to install a minipci to 
m.2 adapter mount a second/third Intel 9260 card.

The core boot flash rollout is imminent, just do not want to risk the 
system by soldering wires to the mainboard till the generic support is 
sorted out. If that goes wrong I could brick my main system and lose 
access to my dvm card, but no risk no fun:-)

With that setup and using AC-9260 in all devices I'll have near-perfect 
setup to test and observe AC-9260 and can simply swap out cards to test 
and and maybe also get working after (hopefully) getting AC-9260 
Extended Key ID "certified" as reference implementation.
With three stations I can then try my hand at the still open mesh 
support for Extended Key ID, but that should wpa_supplicant only.

That said the initial heavy lifting is done. I'm still amazed I get that 
off the ground and working end2end with comparable ease. While there is 
still much work left the structures are all in place and it's now mainly 
closing the gaps.

I currently plan stick to the project a bit langer getting at least some 
intel cards and ath9k working, finalize the already existing Extended 
Key ID and PTK0 rekey patches and also update wireshark again.

I suspect I'll need at least one year for that in the best case, 
potentially much longer.

I already have three bugs/issues flagged for analysis. Nothing really 
relevant for the generic support but all three itching enough I'll plan 
to look into them:
1) Wrong time stamps with at least iwlwifi when using monitor in 
parallel to the normal connection for outgoing MPDUs (probably mac80211 
issue, )
2) Probably: Wpa_supplicant scan list is flushed too often, preventing 
fast reconnects
3) Wireshark can't handle Extended Key IDs decoding (and decoding should 
get at least some other generic improvements

Chances are I'll be busy with coreboot for some time and then 
finding/setting up some mini PC(s). That's nothing I've had any need for 
so far and suitable systems with M.2 slots with at least two external 
antennas are hard to find. (So I may end up mounting external antennas 
to the system myself.)

To continue I primarily need another Extended Key ID AP, ideally able to 
support NATIVE and COMPAT mode. Having a second device (or maybe just 
second M.2 card in the same one?) for sniffing nice.

Finding a really good sniffer able to also capture A-MPDU frames 
including control frames would be awesome. All the so-called good 
sniffing USB sticks I got my hands on crap even for sniffing normal 
frames compared to my built-in dvm card nobody seems to suggest for 
sniffing...
Full-speed sniffing to observe a rekey during load is hardly a common 
use case and I suspect usb interface alone an indication the card won't 
be able to handle it well. And some things like the cleartext packet 
leaks only becomes visible at higher speeds. (Which btw also need a 
follow up. My crypt skills are not the best, but with my current 
understanding I'm quite sure a retransmit of a previously encrypted 
frame in the clear text gives away the TK key for all other frames 
encrypted with the same TK.)

So plenty of work and reason to get a better test environment set up:-)

I hope the newer mvm cards will be as good as my old mvm card and when 
using the same card in the AP, sta and sniffer able to get basically 
every frame as long as the sniffer has the best antennas.

If someone here has a suggestion what works for I would appreciate some 
pointers here:-) Don't want sink too much money at it but if it's 
something around 300€ or so I could simply start with one and decide 
later if I really need a second one. Non-amd64 systems would also be ok, 
but it must at least work with vanilla kernels and something like 
Debian. I'm wasted enough time of porting my patches for each test and 
keep them in sync.

If someone is willing to donate a suitable test system I would not say 
no, but I'll plan to finish that with private funds also.

I probably should work on the new AP, but then I always wanted to test 
coreboot and finding out my notebook is now supported is too alluring to 
resist;-)


>>>> I also have tested patch for iwldvm using the Compat mode and I think
>>>> mvm cards will also work with that.
>>>
>>> No they don't, no firmware is available for that.
>>
>> So far I only looked at the dvm part of iwlwifi with only minutes spend
>> on mvm to port the NATIVE solution from dvm.
> 
> :)
> 
>> Are you saying that mvm cards can't seamless switch a RX/TX key to TX
>> only one? mvm seems to support SW crypto as needed and switching RX/TX
>> to TX keys is the only other requirement for COMPAT mode.
> 
> No, sorry. I misread your statement. I thought you were saying "mvm
> cards will also work with iwldvm" rather than "mvm cards will also work
> with compat mode" :-)
> 
> I think they all should just be made to work in native mode, the
> firmware basically supports this as you found, there must be some small
> bugs.

Agree. And with your statements that it already should work and the 
option to get ucode updates I like our chances:-) With some luck it 
could even work and I made some error the first time. I'll give that a 
second look with what I have at hand soon. But after bombing you with 
mails for what feels like most of the weekend I'll postpone that for now:-)

As mentioned above I'm currently aiming for two or three Intel AC-9260 
cards for the next development round. It's seems to be the most modern 
card and the price difference between the cards is irrelevant compared 
to both efforts and costs to get the cards working in two or three 
devices. If you thing another card would be better for development I'll 
just use that one instead...

Alexander

Alexander
Johannes Berg April 8, 2019, 8:10 p.m. UTC | #5
On Sun, 2019-02-24 at 14:04 +0100, Alexander Wetzel wrote:

> Finding a really good sniffer able to also capture A-MPDU frames 
> including control frames would be awesome. 

I think the AC-9260 you have should be a decent sniffer. The (yet
unreleased) follow-up hardware is even better, but this one is fine.

Just remember to load with amsdu_size set to the appropriate (maximum)
A-MSDU size you want to capture.

> I probably should work on the new AP, but then I always wanted to test 
> coreboot and finding out my notebook is now supported is too alluring to 
> resist;-)

:-)

> > I think they all should just be made to work in native mode, the
> > firmware basically supports this as you found, there must be some small
> > bugs.
> 
> Agree. And with your statements that it already should work and the 
> option to get ucode updates I like our chances:-) With some luck it 
> could even work and I made some error the first time. I'll give that a 
> second look with what I have at hand soon. But after bombing you with 
> mails for what feels like most of the weekend I'll postpone that for now:-)

I was looking at the firmware now and ... well, I want to really test
this to understand what's going wrong, because it really *looks* like
even the recent ones should be supported natively, at least as far as
I've looked now.

> As mentioned above I'm currently aiming for two or three Intel AC-9260 
> cards for the next development round. It's seems to be the most modern 
> card and the price difference between the cards is irrelevant compared 
> to both efforts and costs to get the cards working in two or three 
> devices. If you thing another card would be better for development I'll 
> just use that one instead...

AC-9260 should be fine, as far as Intel is concerned. Also make for good
sniffers, in my experience, we use them all the time for that.

johannes
Alexander Wetzel April 10, 2019, 8:46 p.m. UTC | #6
Am 08.04.19 um 22:10 schrieb Johannes Berg:
> On Sun, 2019-02-24 at 14:04 +0100, Alexander Wetzel wrote:
> 
>> Finding a really good sniffer able to also capture A-MPDU frames
>> including control frames would be awesome.
> 
> I think the AC-9260 you have should be a decent sniffer. The (yet
> unreleased) follow-up hardware is even better, but this one is fine.
> 
> Just remember to load with amsdu_size set to the appropriate (maximum)
> A-MSDU size you want to capture.

I did not do that so far... Thanks for that tip!

> 
>> I probably should work on the new AP, but then I always wanted to test
>> coreboot and finding out my notebook is now supported is too alluring to
>> resist;-)
> 
> :-)
> 

I've got a new test system in the meantime and have it up and running as 
a simple test AP with Extended Key ID. I suspect I'll should tune the 
antennas (and probably also the card) based on the first test spins, but 
it's good enough for the moment.

>>> I think they all should just be made to work in native mode, the
>>> firmware basically supports this as you found, there must be some small
>>> bugs.
>>
>> Agree. And with your statements that it already should work and the
>> option to get ucode updates I like our chances:-) With some luck it
>> could even work and I made some error the first time. I'll give that a
>> second look with what I have at hand soon. But after bombing you with
>> mails for what feels like most of the weekend I'll postpone that for now:-)
> 
> I was looking at the firmware now and ... well, I want to really test
> this to understand what's going wrong, because it really *looks* like
> even the recent ones should be supported natively, at least as far as
> I've looked now.
> 

my new test AP came with a Intel AC-3168, which seems to use only one 
antenna, potentially also explaining my fist impression that it's a 
worse card for sniffing than my old Ultimate-N 6300. But it looks like 
it's acting exactly the same in my other iwlmvm test. So I actually 
started to run some tests and started writing a mail. It's so far 
inconclusive and I want to verify the packets on the air next. Something 
is off here and I want to look at that again from scratch, including an 
external sniffer.

That said here what I've got so far and some fresh captures.

It looks like my (new) Wireless-AC 3168NGW (firmware 29.1044073957.0) 
does not have the new key ready for Rx when needed. I have a roughly 5 - 
15 ms long window where the card scrambles received packets using the 
new key. (Note: I can't replicate that at the moment. May be wrong!)
I first suspected the card "cleared" the new key for usage a bit too 
soon and tried to verify that by waiting a bit after installing a key to 
the HW. But it looks like it's not so simple...

I've added a 40 ms delay in the mvm driver after the call to 
iwl_mvm_set_sta_key() and it first looked like that improved the 
situation. So I moved the sleep to iwl_trans_send_cmd() behind 
send_cmd() when not being in CMD_ASYNC but I can't see any any 
differences any longer. At the moment (with the new test setup) I always 
get one corrupted frame when downloading from the AP. Always the first 
frame using the new key...

As for the test procedure: I just add a monitor interface in parallel to 
the "normal" interface on the AP. With HW encryption enabled we should 
only get cleartext packets and don't have to worry about encrypted 
packets in our capture at all:
   iw phy phy0 interface add mon0 type monitor
   ip link set up dev mon0
And start a capture in the interface:
   tcpdump -pi mon0 -s0 -w /tmp/AP.cap

I've just uploaded some captures for you to 
https://www.awhome.eu/index.php/s/AJJXBLsZmzHdxpX also. I've enabled 
swcrypto on the client for the first two and enabled HW crypto on the 
client again for the third and forth.

AP-40ms.pcap.gz
	delay hack as outlined above on the AP
AP-no-delay.cap.gz
	no hack (just some useless printks)
AP-no-delay-client-HW-crypt.cap.gz
	same as above, only cleint using HW crypto
AP-upload-no-delay-HW-crypt.cap.gz
	same as previous, only uploading instead of downloading.
	(and too many broken packets on receive, indicating a bad
	reception/sniffer card)

In all captures I have a normal (1s) ping running to the AP from the 
cleint and start a download from an internal server after a while.

You can e.g. find the "corrupted" looking frames with the wireshark filter
"(wlan.fc.type_subtype == 0x0028) && !(llc.dsap == 0xaa)"

Each capture here only has exactly one, the very first packet using the 
new key.

I'll plane to look deeper here, but that is as far as I got so far.

When you look at the captures keep in mind that both the client and the 
AP also have the two not merged patches applied. But I do not see how 
that makes a difference here.


>> As mentioned above I'm currently aiming for two or three Intel AC-9260
>> cards for the next development round. It's seems to be the most modern
>> card and the price difference between the cards is irrelevant compared
>> to both efforts and costs to get the cards working in two or three
>> devices. If you thing another card would be better for development I'll
>> just use that one instead...
> 
> AC-9260 should be fine, as far as Intel is concerned. Also make for good
> sniffers, in my experience, we use them all the time for that.

Guess I will have to get one for my AP soon at least:-)

Alexander
Johannes Berg April 12, 2019, 9:51 a.m. UTC | #7
On Wed, 2019-04-10 at 22:46 +0200, Alexander Wetzel wrote:

> my new test AP came with a Intel AC-3168, which seems to use only one 
> antenna, potentially also explaining my fist impression that it's a 
> worse card for sniffing than my old Ultimate-N 6300.

Right, that's worse in some way.

> It looks like my (new) Wireless-AC 3168NGW (firmware 29.1044073957.0) 
> does not have the new key ready for Rx when needed. I have a roughly 5 - 
> 15 ms long window where the card scrambles received packets using the 
> new key. (Note: I can't replicate that at the moment. May be wrong!)
> I first suspected the card "cleared" the new key for usage a bit too 
> soon and tried to verify that by waiting a bit after installing a key to 
> the HW. But it looks like it's not so simple...
> 
> I've added a 40 ms delay in the mvm driver after the call to 
> iwl_mvm_set_sta_key() and it first looked like that improved the 
> situation. So I moved the sleep to iwl_trans_send_cmd() behind 
> send_cmd() when not being in CMD_ASYNC but I can't see any any 
> differences any longer. At the moment (with the new test setup) I always 
> get one corrupted frame when downloading from the AP. Always the first 
> frame using the new key...

Ok, that's interesting. Definitely something I'd want to reproduce here
locally and see where exactly we select the wrong key.

FWIW, no timing changes should be needed, once the firmware responds
(and we wait for that when installing a key) everything will be in
place.

> As for the test procedure: I just add a monitor interface in parallel to 
> the "normal" interface on the AP. With HW encryption enabled we should 
> only get cleartext packets and don't have to worry about encrypted 
> packets in our capture at all:
>    iw phy phy0 interface add mon0 type monitor
>    ip link set up dev mon0
> And start a capture in the interface:
>    tcpdump -pi mon0 -s0 -w /tmp/AP.cap

Right.

> I've just uploaded some captures for you to 
> https://www.awhome.eu/index.php/s/AJJXBLsZmzHdxpX also. I've enabled 
> swcrypto on the client for the first two and enabled HW crypto on the 
> client again for the third and forth.
> 
> AP-40ms.pcap.gz
> 	delay hack as outlined above on the AP
> AP-no-delay.cap.gz
> 	no hack (just some useless printks)
> AP-no-delay-client-HW-crypt.cap.gz
> 	same as above, only cleint using HW crypto
> AP-upload-no-delay-HW-crypt.cap.gz
> 	same as previous, only uploading instead of downloading.
> 	(and too many broken packets on receive, indicating a bad
> 	reception/sniffer card)
> 
> In all captures I have a normal (1s) ping running to the AP from the 
> cleint and start a download from an internal server after a while.
> 
> You can e.g. find the "corrupted" looking frames with the wireshark filter
> "(wlan.fc.type_subtype == 0x0028) && !(llc.dsap == 0xaa)"
> 
> Each capture here only has exactly one, the very first packet using the 
> new key.

I'll take a look, but a trace-cmd recording would be more interesting
than the monitor interface, as it also tells us when what key was
installed etc.

If you have some time, you can find how to record that here (under the
"Tracing" section):

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging

Please do review the privacy notes there at the end of the page if you
do this, you should probably send the files to me/us directly rather
than post publicly. They do contain the keys of your (test) network to
some extent - at least of course the PTK(s)/GTK(s), but usually also the
KEK/KCK.

> When you look at the captures keep in mind that both the client and the 
> AP also have the two not merged patches applied. But I do not see how 
> that makes a difference here.

Agree.

johannes
Johannes Berg April 12, 2019, 11:19 a.m. UTC | #8
> AP-no-delay-client-HW-crypt.cap.gz
> 	same as above, only cleint using HW crypto

> You can e.g. find the "corrupted" looking frames with the wireshark filter
> "(wlan.fc.type_subtype == 0x0028) && !(llc.dsap == 0xaa)"

For everyone else - make sure to set "ignore protected bit - with IV" in
wireshark.

> Each capture here only has exactly one, the very first packet using the 
> new key.

Yeah. Note that this packet is really quickly after the EAPOL 4/4
(1.5ms) so perhaps the key wasn't installed yet or something? Hmm, then
again, it should be before we send the confirmation?

But I don't know. The tracing would definitely tell me what's going on.

Btw, if you do record such tracing, it helps if you compile
wpa_supplicant with CONFIG_DEBUG_LINUX_TRACING and start it with the -T
argument to record all of its debugging into the trace file as well.

johannes
Alexander Wetzel April 14, 2019, 4:12 p.m. UTC | #9
> I'll take a look, but a trace-cmd recording would be more interesting
> than the monitor interface, as it also tells us when what key was
> installed etc.

I've just uploaded better captures also including traces to the same 
location. Everything relevant for this mail is in the folder
iwlwifi-debug here:
https://www.awhome.eu/index.php/s/AJJXBLsZmzHdxpX

The old traces have been moved to top dir "old", if you still are 
interested in them, but I think the new ones should be better.

1) Both AP and the STA are now running wt-2019-04-10 *without* the not 
merged patches from the Extended Key ID series. (I've only enabled 
Ext_Key_ID for mvm and dvm on top and added some printk's. The patches 
for that are also available via the link above but really boring...

2) the files with the same prefix (in the format HH:MM) show the same 
communication.
The AP files are of course from the AP, STA from the client station and 
Sniff from my Netgear 7800 used as a OTA sniffer. With the AP literally 
sitting on top of it.
The "Sniff" files can be decoded with the WPA Password "ExtKeyTestAP".

My "Hack" patch to wireshark to allow it to decode keyID 1 is also 
uploaded. (The hack just allows decoding of keyID 1 at the price of no 
longer decoding frames encrypted with gtk keyid 1.)

3) the AP is using a AC 3168NGW and the STA a AC 8265. The two devices 
are roughly 2m apart with a direct line of sight on a channel with next 
to no other traffic on it. (You see what's there in the Sniff captures, 
these are all totally unfiltered.)

4) I also tried running a trace on the STA. But I could not reproduce 
the issue when a trace was running on the STA. (That's probably 
coincidence, through, I just gave it three tries and the first without a 
trace on the STA worked.)

5) The files in the "broken" folder are showing the issue and the ones 
in "working" (seem) to be ok at a first glance.

I can run more tests and also try catch the problem with a STA trace, 
too, if you want. But that may not be necessary:

It looks like we only have a problem when we get frames with the "new" 
keyID and then again some with the "old" keyID. Of course we also could 
have multiple problems, too... But in that case I would say let's first 
look at this one.
The problematic frames are encrypted with the correct "old" key, 
according to wireshark.

But on the STA they are scrambled and therefor probably have been 
decrypted with the - in this case - wrong new key.

And as it happens there is also a really good looking first suspect why 
this may have happened:
According to the STA captures the broken frames came in one A-MPDU which 
started with keyID 1 and then "appended" the older keyID 0 frames at the 
end. (The OTA sniffer seems to be miss the A-MPDU details, see the STA 
capture...)

Now it really looks like the mvm cards are trusting the standard and 
decode all MPDUs within one A-MPDU with the same key while at the same 
time allow mixing different keyIDs in a-MPDU.

The so far mostly theoretical question how far mac80211 should support 
the driver (e.g. key ID border signal) or if we want to let the 
drivers/card handle that would become much more pressing...

Ideally the card/firmware would be able to detect that and start a new 
A-MPDU. But for my understanding the driver could also set A-MPDU to 
only one frame, forcing the firmware to flush all A-MPDUs under 
construction and then set it back to the normal value. (Or that is how 
I've planned to test that in that way with your tips in the past and 
what's documented of the mvm/dvm firmware API.)


> Please do review the privacy notes there at the end of the page if you
> do this, you should probably send the files to me/us directly rather
> than post publicly. They do contain the keys of your (test) network to
> some extent - at least of course the PTK(s)/GTK(s), but usually also the
> KEK/KCK.

Thanks for the reminder.
I'll definitely will remember the option, I was not aware that there was 
a more private way:-)

So far I can accept the risk - as I understand it - or even some short 
term intruders in my test network.

For my understanding  the previous captures (without the PSK in the 
mail) only allows brute forcing the PSK due to the 4-way handshakes.
Providing a valid handshake allows anyone to crack the PSK, without the 
challenge to capture it himself.

But risk of providing eapol frames of your network changed last year, 
the handshake is no longer required to crack the password. Anyone able 
to talk to the AP can get equivalent information by a simple query. So I 
assume sharing captures with the eapol handshakes is not impacting the 
security of my (test) Wlan in any relevant way. So I'm only giving 
everyone a small peak in my normally encrypted wlan, correct?

Here a link to the vulnerability discovered last year:
https://blog.avira.com/cracking-your-wpa2-wi-fi-password-just-became-easier/

But I've changed the PSK already for my test Wlan again from the ones 
used in the provided captures. The "real" Wlan is using EAP-TLS, the one 
I had the ptk rekey freezes some years ago:-)

Alexander
Johannes Berg April 15, 2019, 8:44 a.m. UTC | #10
On Sun, 2019-04-14 at 18:12 +0200, Alexander Wetzel wrote:
> > I'll take a look, but a trace-cmd recording would be more interesting
> > than the monitor interface, as it also tells us when what key was
> > installed etc.
> 
> I've just uploaded better captures also including traces to the same 
> location. Everything relevant for this mail is in the folder
> iwlwifi-debug here:
> https://www.awhome.eu/index.php/s/AJJXBLsZmzHdxpX

Great, thanks.

> It looks like we only have a problem when we get frames with the "new" 
> keyID and then again some with the "old" keyID. Of course we also could 
> have multiple problems, too... But in that case I would say let's first 
> look at this one.

This part kinda surprises me.

> The problematic frames are encrypted with the correct "old" key, 
> according to wireshark.
> 
> But on the STA they are scrambled and therefor probably have been 
> decrypted with the - in this case - wrong new key.
> 
> And as it happens there is also a really good looking first suspect why 
> this may have happened:
> According to the STA captures the broken frames came in one A-MPDU which 
> started with keyID 1 and then "appended" the older keyID 0 frames at the 
> end. (The OTA sniffer seems to be miss the A-MPDU details, see the STA 
> capture...)

This doesn't surprise me at all.

> Now it really looks like the mvm cards are trusting the standard and 
> decode all MPDUs within one A-MPDU with the same key while at the same 
> time allow mixing different keyIDs in a-MPDU.

Yes, I'm pretty sure the firmware will only be able to look at the whole
A-MPDU and thus must make a decision based on the first subframe
regarding the key.

> The so far mostly theoretical question how far mac80211 should support 
> the driver (e.g. key ID border signal) or if we want to let the 
> drivers/card handle that would become much more pressing...

Yeah, good point. I guess this really would have to be on the
transmitter though.

> Ideally the card/firmware would be able to detect that and start a new 
> A-MPDU. 

For similar reasons, I don't think it can do this even on TX.

> But for my understanding the driver could also set A-MPDU to 
> only one frame, forcing the firmware to flush all A-MPDUs under 
> construction and then set it back to the normal value. (Or that is how 
> I've planned to test that in that way with your tips in the past and 
> what's documented of the mvm/dvm firmware API.)

It's probably getting more complicated with newer versions of the API,
but yes, I guess it should be doable to suspend aggregation.

> Thanks for the reminder.
> I'll definitely will remember the option, I was not aware that there was 
> a more private way:-)

Just wanted to point it out. I'd agree the risk is minimal since you use
a separate test network anyway.

johannes
Alexander Wetzel April 15, 2019, 8:09 p.m. UTC | #11
Am 15.04.19 um 10:44 schrieb Johannes Berg:
>> But for my understanding the driver could also set A-MPDU to
>> only one frame, forcing the firmware to flush all A-MPDUs under
>> construction and then set it back to the normal value. (Or that is how
>> I've planned to test that in that way with your tips in the past and
>> what's documented of the mvm/dvm firmware API.)
> It's probably getting more complicated with newer versions of the API,
> but yes, I guess it should be doable to suspend aggregation.

Honestly I don't like my own patch for the KeyId border signal, too 
complex and not intuitive at all. But I wanted to have some code for 
discussion and then got carried away a bit in what may have been not the 
right direction.

Could we not simply ask (at least) the drivers supporting Extended Key 
IDs to implement a special for the remote STA invisible A-MPDU "stop" mode?
In this new mode each A-MPDU would simply be build out of a single MPDU. 
We then could keep Block-Ack active and we don't have to tell the remote 
STA anything about that decision. After all a A-MPDU with only one MPDU 
is allowed...
We then could tell the driver to switch to this mode when installing the 
new key and out of it again when we have activated the new key for Tx.
That should be perfectly fine to run only in the control path and 
comparable simple to set up, too.

That also sounds like something all drivers supporting A-MPDU should be 
able to pull off somehow. (But then I've never even looked at more than 
ath9k and iwlwifi so far for A-MPDU, and at those not close, yet.)
Do you see any problems with that solution and/or a better idea?

Also would you prefer we first sort out the A-MPDU issue prior I adding 
test cases to the hostapd/wpa_supplicant or the other way round?

Looks like I'll have some time at Easter to (start) looking into one or 
the other.
I'm currently trending to the A-MPDU issue, as it's the more fundamental 
of the two immediate topics. (And the captures are pretty clear, this 
explains so far every corrupted packet captured.)


Regards,
Alexander
Johannes Berg April 16, 2019, 9:31 a.m. UTC | #12
On Mon, 2019-04-15 at 22:09 +0200, Alexander Wetzel wrote:
> Honestly I don't like my own patch for the KeyId border signal, too 
> complex and not intuitive at all. But I wanted to have some code for 
> discussion and then got carried away a bit in what may have been not the 
> right direction.

:-)

> Could we not simply ask (at least) the drivers supporting Extended Key 
> IDs to implement a special for the remote STA invisible A-MPDU "stop" mode?
> In this new mode each A-MPDU would simply be build out of a single MPDU. 
> We then could keep Block-Ack active and we don't have to tell the remote 
> STA anything about that decision. After all a A-MPDU with only one MPDU 
> is allowed...
> We then could tell the driver to switch to this mode when installing the 
> new key and out of it again when we have activated the new key for Tx.
> That should be perfectly fine to run only in the control path and 
> comparable simple to set up, too.

Sounds doable, but I'm still debating if we even should give them a
signal - they have all the information in a sense, and do we have a good
idea of when we can go out of this mode again?

(FWIW, I'd probably call it "suspend" and "resume" or so)

> That also sounds like something all drivers supporting A-MPDU should be 
> able to pull off somehow. (But then I've never even looked at more than 
> ath9k and iwlwifi so far for A-MPDU, and at those not close, yet.)
> Do you see any problems with that solution and/or a better idea?

It ought to be possible, or if not then the device just can't support
extended key ID?

> Also would you prefer we first sort out the A-MPDU issue prior I adding 
> test cases to the hostapd/wpa_supplicant or the other way round?

I think adding the code to hostapd/wpa_s is fine - right now we're
obviously OK since we have no drivers using extended key ID that use
hardware crypto, and if we have software crypto there's no problem
either way, even if A-MPDUs are built (which is probably not the case
for any such drivers anyway.)

In a sense I'd prefer getting the necessary bits and pieces elsewhere in
the stack upstream first since that's a prerequisite for anyone
else being able to easily work on this, and that's something we need for
drivers to pick it up.

johannes
Alexander Wetzel April 16, 2019, 6:28 p.m. UTC | #13
>> Could we not simply ask (at least) the drivers supporting Extended Key
>> IDs to implement a special for the remote STA invisible A-MPDU "stop" mode?
>> In this new mode each A-MPDU would simply be build out of a single MPDU.
>> We then could keep Block-Ack active and we don't have to tell the remote
>> STA anything about that decision. After all a A-MPDU with only one MPDU
>> is allowed...
>> We then could tell the driver to switch to this mode when installing the
>> new key and out of it again when we have activated the new key for Tx.
>> That should be perfectly fine to run only in the control path and
>> comparable simple to set up, too.
> 
> Sounds doable, but I'm still debating if we even should give them a
> signal - they have all the information in a sense, and do we have a good
> idea of when we can go out of this mode again?

Handling that on the Tx path would be tricky, but I you are right: 
Drivers should be able to handle that as it is on the control path, too.

They can enable the mode when a key with IEEE80211_KEY_FLAG_NO_AUTO_TX 
set and quiet it again as soon as they get a MPDU using the new KeyID.
Since switching back to normal doesn't have to be done immediately a 
asyc call from Tx path or even a worker should do the job just fine.

Btw:
This also means we'll have to update the merged mac80211 Extended Key ID 
support: We can only enable it for cards without HW crypto when they do 
not set AMPDU_AGGREGATION. With the updated userspace these cards will 
start using Extended Key ID with the already merged patches.

Drivers which should be able to use Extended Key ID with the two merged 
patches seem to be:
admtek/adm8211.c
ath/ar5523/ar5523.c
broadcom/b43legacy/main.c
broadcom/brcm80211/brcmsmac/mac80211_if.c
mac80211_hwsim.c
marvell/libertas_tf/main.c
ralink/rt2x00/rt2400pci.c
ralink/rt2x00/rt2500pci.c
realtek/rtl818x/rtl8180/dev.c
realtek/rtl818x/rtl8187/dev.c
zydas/zd1211rw/zd_mac.c

Of those only hwsim and brcmsmac seems to support AMPDU and only 
brcmsmac relly needs the fix to not lose some packets when rekeying.


>> That also sounds like something all drivers supporting A-MPDU should be
>> able to pull off somehow. (But then I've never even looked at more than
>> ath9k and iwlwifi so far for A-MPDU, and at those not close, yet.)
>> Do you see any problems with that solution and/or a better idea?
> 
> It ought to be possible, or if not then the device just can't support
> extended key ID?

Agree. Or drivers can decide to deny A-MPDU setup when the key has been 
installed with IEEE80211_KEY_FLAG_NO_AUTO_TX when they want.

>> Also would you prefer we first sort out the A-MPDU issue prior I adding
>> test cases to the hostapd/wpa_supplicant or the other way round?
> 
> I think adding the code to hostapd/wpa_s is fine - right now we're
> obviously OK since we have no drivers using extended key ID that use
> hardware crypto, and if we have software crypto there's no problem
> either way, even if A-MPDUs are built (which is probably not the case
> for any such drivers anyway.)
> 
> In a sense I'd prefer getting the necessary bits and pieces elsewhere in
> the stack upstream first since that's a prerequisite for anyone
> else being able to easily work on this, and that's something we need for
> drivers to pick it up.

Ok:-)

I assume we still have to wait till the API is in mainline (probably 
5.2) to ask hostapd/wpa_supplicant to merge the patches?

At the moment I'm planning to get the patches merge ready and posted as 
RFC till 5.2 is out. I'm also planing to keep Extended Key ID for Mesh 
networks till later, so we can focus on AP/STA.


Alexander
Johannes Berg April 16, 2019, 7:11 p.m. UTC | #14
On Tue, 2019-04-16 at 20:28 +0200, Alexander Wetzel wrote:

> They can enable the mode when a key with IEEE80211_KEY_FLAG_NO_AUTO_TX 
> set

This I agree with.

> and quiet it again as soon as they get a MPDU using the new KeyID.

This isn't true, afaict. You need to be sure that no MPDUs remain using
the old key ID, not just that the new key ID showed up.

> Since switching back to normal doesn't have to be done immediately a 
> asyc call from Tx path or even a worker should do the job just fine.

Sure.

> Btw:
> This also means we'll have to update the merged mac80211 Extended Key ID 
> support: We can only enable it for cards without HW crypto when they do 
> not set AMPDU_AGGREGATION. With the updated userspace these cards will 
> start using Extended Key ID with the already merged patches.

I was going to say this is fine, but no, of course not ... we shouldn't
use different key id in the same A-MPDU.

That said, I'd be very surprised if there are any such drivers, except
in corner cases (like loading some drivers like ath9k or iwlwifi with
swcrypto=1 or so)

> Of those only hwsim and brcmsmac seems to support AMPDU and only 
> brcmsmac relly needs the fix to not lose some packets when rekeying.

I can't believe that brcmsmac has no HW crypto support?

Anyway, a patch - even if it serves mostly as documentation - would be
most welcome.

> I assume we still have to wait till the API is in mainline (probably 
> 5.2) to ask hostapd/wpa_supplicant to merge the patches?

No, mac80211-next is (usually?) good enough.

johannes
Alexander Wetzel April 16, 2019, 9:32 p.m. UTC | #15
Am 16.04.19 um 21:11 schrieb Johannes Berg:
> On Tue, 2019-04-16 at 20:28 +0200, Alexander Wetzel wrote:
> 
>> They can enable the mode when a key with IEEE80211_KEY_FLAG_NO_AUTO_TX
>> set
> 
> This I agree with.
> 
>> and quiet it again as soon as they get a MPDU using the new KeyID.
> 
> This isn't true, afaict. You need to be sure that no MPDUs remain using
> the old key ID, not just that the new key ID showed up.
> 

Hm, you are right. There is no grantee that after the first frame with 
the new keyID all frames after that will also use the new keyID when 
using different priority classes, I assume..
But each TID should have a clean cut over, shouldn't it?

But then that would only help us with cards able to control the A-MPDU 
size per TID...
Short of sending a dummy MPDU to each TID from mac80211 as key border 
and the driver waiting for all of them I can't think of anything right 
now. I really hope we can find something better...

>> Since switching back to normal doesn't have to be done immediately a
>> asyc call from Tx path or even a worker should do the job just fine.
> 
> Sure.
> 
>> Btw:
>> This also means we'll have to update the merged mac80211 Extended Key ID
>> support: We can only enable it for cards without HW crypto when they do
>> not set AMPDU_AGGREGATION. With the updated userspace these cards will
>> start using Extended Key ID with the already merged patches.
> 
> I was going to say this is fine, but no, of course not ... we shouldn't
> use different key id in the same A-MPDU.
> 
> That said, I'd be very surprised if there are any such drivers, except
> in corner cases (like loading some drivers like ath9k or iwlwifi with
> swcrypto=1 or so)
>

These should be no problem. The drivers still implement set_key and 
mac80211 will not enable Extended Key ID for them. Enabling Extended Key 
ID with SW crypto for drivers implementing set_key is only possible by 
patching either mac80211 or the driver.

Btw:
That was the main use case I added the mac80211 module parameter 
ieee80211_extended_key_id for. But looks like that was of too little use 
and cut out.

>> Of those only hwsim and brcmsmac seems to support AMPDU and only
>> brcmsmac relly needs the fix to not lose some packets when rekeying.
> 
> I can't believe that brcmsmac has no HW crypto support?

I've just greped the drivers, brcmsmac has ampdu_action in its 
implementation of struct ieee80211_ops but no set_key.
So it really looks like SW crypto only to me...

> 
> Anyway, a patch - even if it serves mostly as documentation - would be
> most welcome.

I'll prepare something. Looks really trivial to fix that.

>> I assume we still have to wait till the API is in mainline (probably
>> 5.2) to ask hostapd/wpa_supplicant to merge the patches?
> 
> No, mac80211-next is (usually?) good enough.

Good to know:-)

Alexander
diff mbox series

Patch

diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
index 54b759cec8b3..0dd5c19ac412 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
@@ -111,6 +111,7 @@  int iwlagn_mac_setup_register(struct iwl_priv *priv,
 	ieee80211_hw_set(hw, SUPPORTS_DYNAMIC_PS);
 	ieee80211_hw_set(hw, SUPPORT_FAST_XMIT);
 	ieee80211_hw_set(hw, WANT_MONITOR_VIF);
+	ieee80211_hw_set(hw, EXT_KEY_ID_NATIVE);
 
 	if (priv->trans->max_skb_frags)
 		hw->netdev_features = NETIF_F_HIGHDMA | NETIF_F_SG;
@@ -676,6 +677,7 @@  static int iwlagn_mac_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 
 
 	switch (cmd) {
+	case EXT_SET_KEY:
 	case SET_KEY:
 		if (is_default_wep_key) {
 			ret = iwl_set_default_wep_key(priv, vif_priv->ctx, key);
@@ -701,6 +703,9 @@  static int iwlagn_mac_set_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 
 		IWL_DEBUG_MAC80211(priv, "disable hwcrypto key\n");
 		break;
+	case EXT_KEY_RX_TX:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
index fc251cc47b7f..102367d2572f 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
@@ -442,6 +442,7 @@  int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm)
 	ieee80211_hw_set(hw, STA_MMPDU_TXQ);
 	ieee80211_hw_set(hw, TX_AMSDU);
 	ieee80211_hw_set(hw, TX_FRAG_LIST);
+	ieee80211_hw_set(hw, EXT_KEY_ID_NATIVE);
 
 	if (iwl_mvm_has_tlc_offload(mvm)) {
 		ieee80211_hw_set(hw, TX_AMPDU_SETUP_IN_HW);
@@ -3357,6 +3358,7 @@  static int iwl_mvm_mac_set_key(struct ieee80211_hw *hw,
 	mutex_lock(&mvm->mutex);
 
 	switch (cmd) {
+	case EXT_SET_KEY:
 	case SET_KEY:
 		if ((vif->type == NL80211_IFTYPE_ADHOC ||
 		     vif->type == NL80211_IFTYPE_AP) && !sta) {
@@ -3464,6 +3466,9 @@  static int iwl_mvm_mac_set_key(struct ieee80211_hw *hw,
 		IWL_DEBUG_MAC80211(mvm, "disable hwcrypto key\n");
 		ret = iwl_mvm_remove_sta_key(mvm, vif, sta, key);
 		break;
+	case EXT_KEY_RX_TX:
+		ret = 0;
+		break;
 	default:
 		ret = -EINVAL;
 	}