Message ID | 167580444939.5328.5412964147692077675.stgit@91.116.238.104.host.secureserver.net (mailing list archive) |
---|---|
Series | Another crack at a handshake upcall mechanism |
On 2/7/23 22:41, Chuck Lever wrote:
> Hi-
>
> Here is v3 of a series to add generic support for transport layer
> security handshake on behalf of kernel consumers (user space
> consumers use a security library directly, of course).
>
> This version of the series does away with the listen/poll/accept/
> close design and replaces it with a full netlink implementation
> that handles much of the same function.
>
> The first patch in the series adds a new netlink family to handle
> the kernel-user space interaction to request a handshake. The second
> patch demonstrates how to extend this new mechanism to support a
> particular transport layer security protocol (in this case,
> TLSv1.3).
>
> Of particular interest is that the user space handshake agent now
> must perform a second downcall when the handshake is complete,
> rather than simply closing the socket descriptor. This enables the
> user space agent to pass down a session status, whether the session
> was mutually authenticated, and the identity of the remote peer.
> (Although these facilities are plumbed into the netlink protocol,
> they have yet to be fully implemented by the kernel or the sample
> user space agent below).
>
> Certificates and pre-shared keys are made available to the user
> space agent via keyrings, or the agent can use authentication
> materials residing in the local filesystem.
>
> The full patch set to support SunRPC with TLSv1.3 is available in
> the topic-rpc-with-tls-upcall branch here, based on v6.1.10:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>
> A sample user space handshake agent with netlink support is
> available in the "netlink" branch here:
>
> https://github.com/oracle/ktls-utils
>
> ---
>
> Changes since v2:
> - PF_HANDSHAKE replaced with NETLINK_HANDSHAKE
> - Replaced listen(2) / poll(2) with a multicast notification service
> - Replaced accept(2) with a netlink operation that can return an
>   open fd and handshake parameters
> - Replaced close(2) with a netlink operation that can take arguments
>
> Changes since RFC:
> - Generic upcall support split away from kTLS
> - Added support for TLS ServerHello
> - Documentation has been temporarily removed while API churns
>
> Chuck Lever (2):
>   net/handshake: Create a NETLINK service for handling handshake requests
>   net/tls: Support AF_HANDSHAKE in kTLS
>
> The use of AF_HANDSHAKE in the short description here is stale. I'll
> fix that in a subsequent posting.
>

Have been playing around with this patchset, and for some reason I get a
weird crash:

[ 5101.640941] nvme nvme0: queue 0: start TLS with key 15982809
[ 5111.769538] nvme nvme0: queue 0: TLS handshake complete, tmo 2500, error -110
[ 5111.769545] BUG: kernel NULL pointer dereference, address: 0000000000000068
[ 5111.770089] #PF: supervisor read access in kernel mode
[ 5111.770460] #PF: error_code(0x0000) - not-present page
[ 5111.770828] PGD 0 P4D 0
[ 5111.771019] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 5111.771344] CPU: 0 PID: 8611 Comm: nvme Kdump: loaded Tainted: G
[ 5111.772193] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
[ 5111.772864] RIP: 0010:kernel_sock_shutdown+0x9/0x20

which looks to me as if the socket had been deallocated once the netlink
handshake has completed.
And indeed, handshake_accept() has the 'CLOEXEC' flag set.
So if the userprocess exits it'll close the socket, and we're hosed.
Which seems to be what is happening here.

Let's see if things work out better without the CLOEXEC flag.

Cheers,

Hannes
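
For readers unfamiliar with the flag discussed above, the following is a minimal user-space sketch in Python of what close-on-exec controls. It is not the kernel's handshake_accept() code path; the helper names `cloexec_is_set` and `clear_cloexec` are hypothetical and exist only for this illustration. The flag decides whether a descriptor survives an exec(); a process that exits has all of its descriptors closed regardless of the flag.

```python
import fcntl
import socket


def cloexec_is_set(fd):
    """Return True if the close-on-exec flag is set on descriptor fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def clear_cloexec(fd):
    """Clear close-on-exec so the descriptor would survive an exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)


def set_cloexec(fd):
    """Set close-on-exec so the descriptor is closed across an exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

A caller could create a socket pair with `socket.socketpair()`, toggle the flag with the helpers, and inspect it with `cloexec_is_set()`. Whatever the flag's value, the descriptor is released when the owning process terminates.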
On 2/14/23 10:44, Hannes Reinecke wrote:
> On 2/7/23 22:41, Chuck Lever wrote:
>> Hi-
>>
>> Here is v3 of a series to add generic support for transport layer
>> security handshake on behalf of kernel consumers (user space
>> consumers use a security library directly, of course).
>>
>> This version of the series does away with the listen/poll/accept/
>> close design and replaces it with a full netlink implementation
>> that handles much of the same function.
>>
>> The first patch in the series adds a new netlink family to handle
>> the kernel-user space interaction to request a handshake. The second
>> patch demonstrates how to extend this new mechanism to support a
>> particular transport layer security protocol (in this case,
>> TLSv1.3).
>>
>> Of particular interest is that the user space handshake agent now
>> must perform a second downcall when the handshake is complete,
>> rather than simply closing the socket descriptor. This enables the
>> user space agent to pass down a session status, whether the session
>> was mutually authenticated, and the identity of the remote peer.
>> (Although these facilities are plumbed into the netlink protocol,
>> they have yet to be fully implemented by the kernel or the sample
>> user space agent below).
>>
>> Certificates and pre-shared keys are made available to the user
>> space agent via keyrings, or the agent can use authentication
>> materials residing in the local filesystem.
>>
>> The full patch set to support SunRPC with TLSv1.3 is available in
>> the topic-rpc-with-tls-upcall branch here, based on v6.1.10:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>>
>> A sample user space handshake agent with netlink support is
>> available in the "netlink" branch here:
>>
>> https://github.com/oracle/ktls-utils
>>
>> ---
>>
>> Changes since v2:
>> - PF_HANDSHAKE replaced with NETLINK_HANDSHAKE
>> - Replaced listen(2) / poll(2) with a multicast notification service
>> - Replaced accept(2) with a netlink operation that can return an
>>   open fd and handshake parameters
>> - Replaced close(2) with a netlink operation that can take arguments
>>
>> Changes since RFC:
>> - Generic upcall support split away from kTLS
>> - Added support for TLS ServerHello
>> - Documentation has been temporarily removed while API churns
>>
>> Chuck Lever (2):
>>   net/handshake: Create a NETLINK service for handling handshake requests
>>   net/tls: Support AF_HANDSHAKE in kTLS
>>
>> The use of AF_HANDSHAKE in the short description here is stale. I'll
>> fix that in a subsequent posting.
>>
> Have been playing around with this patchset, and for some reason I get a
> weird crash:
>
> [ 5101.640941] nvme nvme0: queue 0: start TLS with key 15982809
> [ 5111.769538] nvme nvme0: queue 0: TLS handshake complete, tmo 2500, error -110
> [ 5111.769545] BUG: kernel NULL pointer dereference, address: 0000000000000068
> [ 5111.770089] #PF: supervisor read access in kernel mode
> [ 5111.770460] #PF: error_code(0x0000) - not-present page
> [ 5111.770828] PGD 0 P4D 0
> [ 5111.771019] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 5111.771344] CPU: 0 PID: 8611 Comm: nvme Kdump: loaded Tainted: G
> [ 5111.772193] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> [ 5111.772864] RIP: 0010:kernel_sock_shutdown+0x9/0x20
>
> which looks to me as if the socket had been deallocated once the netlink
> handshake has completed.
> And indeed, handshake_accept() has the 'CLOEXEC' flag set.
> So if the userprocess exits it'll close the socket, and we're hosed.
> Which seems to be what is happening here.
>
> Let's see if things work out better without the CLOEXEC flag.
>

Nope, that doesn't work.

Turns out to be an issue with netlink timeout handling.
In my code I've added a 'wait_for_completion' loop, seeing that I need
to get the result from the upcall such that I can continue.
But as I'm triggering the infamous 'assert' in gnutls (regarding PSK
identity length), userspace does _not_ return, but rather waits
indefinitely. Or, rather, longer than I'm prepared to wait.
Once the timeout is triggered I find that the socket has been released,
causing _quite_ some friction with the code :-)

Looks like I'll have to add timeout handling to the netlink handshake;
plan is to transmit the timeout parameter from the kernel to userspace,
and set the timeout via gnutls_handshake_set_timeout().

Cheers,

Hannes
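
The plan above (hand the requester's timeout to the agent so that the agent gives up first and always reports a result) can be modelled in user space. The sketch below is a Python analogy only, not kernel or ktls-utils code: `HandshakeRequest`, `agent`, `wait_for_handshake`, and the `grace_ms` parameter are all hypothetical names, a `threading.Event` stands in for the kernel completion, and the deadline loop stands in for what a real agent would delegate to gnutls_handshake_set_timeout().

```python
import threading
import time

ETIMEDOUT = 110  # errno value matching the "error -110" in the log above


class HandshakeRequest:
    """Hypothetical stand-in for one pending handshake upcall."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms   # passed down to the agent
        self.done = threading.Event()  # plays the role of a completion
        self.status = None             # set by the agent's final downcall

    def complete(self, status):
        """Record the handshake result and wake the waiter."""
        self.status = status
        self.done.set()


def agent(request, handshake_step, poll_interval=0.01):
    """Retry handshake_step() until it succeeds or the deadline passes."""
    deadline = time.monotonic() + request.timeout_ms / 1000.0
    while time.monotonic() < deadline:
        if handshake_step():
            request.complete(0)
            return
        time.sleep(poll_interval)
    # Always report back, so the requester is never left guessing.
    request.complete(-ETIMEDOUT)


def wait_for_handshake(request, grace_ms=200):
    """Bounded wait: the agent's deadline plus a small grace period."""
    total = (request.timeout_ms + grace_ms) / 1000.0
    if not request.done.wait(total):
        # The agent never answered; the caller would have to cancel the
        # request before releasing the socket.
        return -ETIMEDOUT
    return request.status
```

Because the agent's deadline is shorter than the requester's wait, a stuck handshake normally ends with the agent reporting -ETIMEDOUT rather than with the requester timing out while the agent still holds the socket. The grace period is an assumption of this sketch, not something proposed in the thread.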