Tuesday, October 13, 2015

Fixing up the RTL8188SU/RTL8192SU 802.11bgn driver (rsu)

I recently figured out most of the missing pieces for 11n support and stability with the rsu driver in FreeBSD. This handles support for the RTL8188SU and RTL8192SU chipsets. I'll cover what I found and fixed in this post.

First off - the driver was in reasonably poor shape. It sometimes paniced when the NIC was removed, it didn't support 802.11n at all and it wouldn't associate reliably. I was pestered enough by one of the original users behind getting the driver ported (Idwer! Hi!) and decided I'd give fixing it up a go.

Importantly - it's a mostly real fullmac device. "fullmac" here means that the firmware on the device does almost everything interesting - it handles association, it can do encryption/decryption for you if you want, it'll handle retransmission and transmit rate control. There are some important things it doesn't do - I'll cover those shortly.

Here's a fun bit of trivia - this firmware outputs text debugging via a magic firmware notification, and it's on by default. This made all of the debugging much, much easier as I didn't have to guess so much about what was going on in the firmware. All firmware developers - please do this. Please!

I first looked at the association issue. The device does full scan offload - you send it a firmware command to start scanning and it'll return scan results as they come in. Plenty of firmware devices do this. Then you send it an association message, then a join_bss message. For those looking at the source - rsu_site_survey() starts the scan, and rsu_join_bss() attempts an association. Now, I noticed that it was sending a join message before the site survey finished. I also noticed that I never really received any management frames, and when I used a sniffer to see what was going on, I saw double-associations sometimes occuring.

I then checked OpenBSD. Their driver just stubbed out the management frame transmit routine. This wasn't done in FreeBSD, so I added it. It turns out the firmware here does all management frame transmit and receive, so I just plainly have to do none of it. This tidied things up a bit but it didn't fix association.

Next up - the whole way scan results were pushed into net80211 was wrong. Sometimes scan results ended up on the wrong channel. The driver was doing dirty things to the current channel state directly and then faking a beacon to the net80211 stack. I replaced that with some code I wrote for the 7260 wifi driver - the stack now accepts a channel (and other things!) as part of the receive frame, so you can do proper off-channel frame reception. This tidied up the scan results so they were now consistent.

Then I thought about an evil hack - how about delaying the call to rsu_join_bss() until after the survey finished. That worked - associations were now very reliable.


Now the device associated reliably and worked okay. There were some missing bits for the firmware setup path for doing things like power saving, saying how many transmit/receive streams are available, etc, but those were easy to add. Next up - 802.11n.

On the receive side, 802.11n requires you to do A-MPDU reordering as the transmitter is free to retransmit failed frames out of order. But the net80211 stack only handled the case where it saw the management frames and it itself drove the A-MPDU negotiation. Here, the firmware drives the negotiation and just tells us what's just happened. So, I had to extend net80211 to be told what the A-MPDU parameters are. It turned out that yes, the firmware sends a notification about A-MPDU going up, but it doesn't tell you how big the block-ack window is. Sigh. So, I needed to add that.

But the access point still wasn't negotiating it. Here was the next fun bit - join rsu_join_bss() it lets the stack assemble optional IEs to send to the access point and, the more interesting part, it looks at said IEs for an idea of what its own configuration should be. I added the HTINFO IE and voila! It started negotiating 802.11n.

(Oh, and I had to add M_AMPDU to each RX'ed frame from an 802.11n node before I called net80211, or the receive code would never do A-MPDU reorder processing.)

The final hack - I stubbed out the A-MPDU TX negotiation so we would never attempt to do it. So yes, there's no TX aggregation support, but that's fine for now.

Then Idwer told me it wasn't working for him. After much digging with the Linux driver authors (Thanks Christian and Larry!) we found that the OpenBSD driver tried to program the chip directly for 40MHz mode and that's wrong - instead, I just missed one of the 802.11n IEs. The firmware looked into that to see what the channel setup should've been. Two lines of diff later and I was on at 40MHz wide modes.

Finally - stability. It turns out that the USB drivers do inconsistent things when it comes to the detach path. They're supposed to stop transmit/receive, then flush buffers which flushes the net80211 node references, and then tear down the net80211 interface. Some, eg if_rsu, were doing it the other way. I fixed if_rsu and if_urtwn - they're now both stable.

No comments:

Post a Comment