History log of /openbmc/linux/net/tls/tls_main.c (Results 251 – 275 of 294)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v4.19.18, v4.19.17
# 76f7164d 17-Jan-2019 Atul Gupta <atul.gupta@chelsio.com>

net/tls: free ctx in sock destruct

free tls context in sock destruct. close may not be the last
call to free sock but force releasing the ctx in close
will result in GPF when ctx ref

net/tls: free ctx in sock destruct

free tls context in sock destruct. close may not be the last
call to free sock but force releasing the ctx in close
will result in GPF when ctx referred again in tcp_done

[ 515.330477] general protection fault: 0000 [#1] SMP PTI
[ 515.330539] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc7+ #10
[ 515.330657] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0b
11/07/2013
[ 515.330844] RIP: 0010:tls_hw_unhash+0xbf/0xd0
[
[ 515.332220] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 515.332340] CR2: 00007fab32c55000 CR3: 000000009261e000 CR4:
00000000000006e0
[ 515.332519] Call Trace:
[ 515.332632] <IRQ>
[ 515.332793] tcp_set_state+0x5a/0x190
[ 515.332907] ? tcp_update_metrics+0xe3/0x350
[ 515.333023] tcp_done+0x31/0xd0
[ 515.333130] tcp_rcv_state_process+0xc27/0x111a
[ 515.333242] ? __lock_is_held+0x4f/0x90
[ 515.333350] ? tcp_v4_do_rcv+0xaf/0x1e0
[ 515.333456] tcp_v4_do_rcv+0xaf/0x1e0

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 63a6b3fe 17-Jan-2019 Atul Gupta <atul.gupta@chelsio.com>

net/tls: build_protos moved to common routine

build protos is required for tls_hw_prot also hence moved to
'tls_build_proto' and called as required from tls_init
and tls_hw_proto. Th

net/tls: build_protos moved to common routine

build protos is required for tls_hw_prot also hence moved to
'tls_build_proto' and called as required from tls_init
and tls_hw_proto. This is required since build_protos
for v4 is moved from tls_register to tls_init in
commit <28cb6f1eaffdc5a6a9707cac55f4a43aa3fd7895>

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 339bbff2 20-Dec-2018 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-21

The following pull-request

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-21

The following pull-request contains BPF updates for your *net-next* tree.

There is a merge conflict in test_verifier.c. Result looks as follows:

[...]
},
{
"calls: cross frame pruning",
.insns = {
[...]
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "function calls to other bpf functions are allowed for root only",
.result_unpriv = REJECT,
.errstr = "!read_ok",
.result = REJECT,
},
{
"jset: functional",
.insns = {
[...]
{
"jset: unknown const compare not taken",
.insns = {
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
BPF_FUNC_get_prandom_u32),
BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "!read_ok",
.result_unpriv = REJECT,
.errstr = "!read_ok",
.result = REJECT,
},
[...]
{
"jset: range",
.insns = {
[...]
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.result_unpriv = ACCEPT,
.result = ACCEPT,
},

The main changes are:

1) Various BTF related improvements in order to get line info
working. Meaning, verifier will now annotate the corresponding
BPF C code to the error log, from Martin and Yonghong.

2) Implement support for raw BPF tracepoints in modules, from Matt.

3) Add several improvements to verifier state logic, namely speeding
up stacksafe check, optimizations for stack state equivalence
test and safety checks for liveness analysis, from Alexei.

4) Teach verifier to make use of BPF_JSET instruction, add several
test cases to kselftests and remove nfp specific JSET optimization
now that verifier has awareness, from Jakub.

5) Improve BPF verifier's slot_type marking logic in order to
allow more stack slot sharing, from Jiong.

6) Add sk_msg->size member for context access and add set of fixes
and improvements to make sock_map with kTLS usable with openssl
based applications, from John.

7) Several cleanups and documentation updates in bpftool as well as
auto-mount of tracefs for "bpftool prog tracelog" command,
from Quentin.

8) Include sub-program tags from now on in bpf_prog_info in order to
have a reliable way for user space to get all tags of the program
e.g. needed for kallsyms correlation, from Song.

9) Add BTF annotations for cgroup_local_storage BPF maps and
implement bpf fs pretty print support, from Roman.

10) Fix bpftool in order to allow for cross-compilation, from Ivan.

11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order
to be compatible with libbfd and allow for Debian packaging,
from Jakub.

12) Remove an obsolete prog->aux sanitation in dump and get rid of
version check for prog load, from Daniel.

13) Fix a memory leak in libbpf's line info handling, from Prashant.

14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info
does not get unaligned, from Jesper.

15) Fix test_progs kselftest to work with older compilers which are less
smart in optimizing (and thus throwing build error), from Stanislav.

16) Cleanup and simplify AF_XDP socket teardown, from Björn.

17) Fix sk lookup in BPF kselftest's test_sock_addr with regards
to netns_id argument, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12
# 28cb6f1e 20-Dec-2018 John Fastabend <john.fastabend@gmail.com>

bpf: tls_sw, init TLS ULP removes BPF proto hooks

The existing code did not expect users would initialize the TLS ULP
without subsequently calling the TLS TX enabling socket option.

bpf: tls_sw, init TLS ULP removes BPF proto hooks

The existing code did not expect users would initialize the TLS ULP
without subsequently calling the TLS TX enabling socket option.
If the application tries to send data after the TLS ULP enable op
but before the TLS TX enable op the BPF sk_msg verdict program is
skipped. This patch resolves this by converting the ipv4 sock ops
to be calculated at init time the same way ipv6 ops are done. This
pulls in any changes to the sock ops structure that have been made
after the socket was created including the changes from adding the
socket to a sock{map|hash}.

This was discovered by running OpenSSL master branch which calls
the TLS ULP setsockopt early in TLS handshake but only enables
the TLS TX path once the handshake has completed. As a result the
datapath missed the initial handshake messages.

Fixes: 02c558b2d5d6 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


Revision tags: v4.19.11
# c6ec179a 19-Dec-2018 Ganesh Goudar <ganeshgr@chelsio.com>

net/tls: allocate tls context using GFP_ATOMIC

create_ctx can be called from atomic context, hence use
GFP_ATOMIC instead of GFP_KERNEL.

[ 395.962599] BUG: sleeping function ca

net/tls: allocate tls context using GFP_ATOMIC

create_ctx can be called from atomic context, hence use
GFP_ATOMIC instead of GFP_KERNEL.

[ 395.962599] BUG: sleeping function called from invalid context at mm/slab.h:421
[ 395.979896] in_atomic(): 1, irqs_disabled(): 0, pid: 16254, name: openssl
[ 395.996564] 2 locks held by openssl/16254:
[ 396.010492] #0: 00000000347acb52 (sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.isra.44+0x13b/0x9a0
[ 396.029838] #1: 000000006c9552b5 (device_spinlock){+...}, at: tls_init+0x1d/0x280
[ 396.047675] CPU: 5 PID: 16254 Comm: openssl Tainted: G O 4.20.0-rc6+ #25
[ 396.066019] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0c 09/25/2017
[ 396.083537] Call Trace:
[ 396.096265] dump_stack+0x5e/0x8b
[ 396.109876] ___might_sleep+0x216/0x250
[ 396.123940] kmem_cache_alloc_trace+0x1b0/0x240
[ 396.138800] create_ctx+0x1f/0x60
[ 396.152504] tls_init+0xbd/0x280
[ 396.166135] tcp_set_ulp+0x191/0x2d0
[ 396.180035] ? tcp_set_ulp+0x2c/0x2d0
[ 396.193960] do_tcp_setsockopt.isra.44+0x148/0x9a0
[ 396.209013] __sys_setsockopt+0x7c/0xe0
[ 396.223054] __x64_sys_setsockopt+0x20/0x30
[ 396.237378] do_syscall_64+0x4a/0x180
[ 396.251200] entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: df9d4a178022 ("net/tls: sleeping function from invalid context")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.19.10, v4.19.9
# df9d4a17 11-Dec-2018 Atul Gupta <atul.gupta@chelsio.com>

net/tls: sleeping function from invalid context

HW unhash within mutex for registered tls devices cause sleep
when called from tcp_set_state for TCP_CLOSE. Release lock and
re-acquir

net/tls: sleeping function from invalid context

HW unhash within mutex for registered tls devices cause sleep
when called from tcp_set_state for TCP_CLOSE. Release lock and
re-acquire after function call with ref count incr/dec.
defined kref and fp release for tls_device to ensure device
is not released outside lock.

BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:748
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/7
INFO: lockdep is turned off.
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W O
Call Trace:
<IRQ>
dump_stack+0x5e/0x8b
___might_sleep+0x222/0x260
__mutex_lock+0x5c/0xa50
? vprintk_emit+0x1f3/0x440
? kmem_cache_free+0x22d/0x2a0
? tls_hw_unhash+0x2f/0x80
? printk+0x52/0x6e
? tls_hw_unhash+0x2f/0x80
tls_hw_unhash+0x2f/0x80
tcp_set_state+0x5f/0x180
tcp_done+0x2e/0xe0
tcp_rcv_state_process+0x92c/0xdd3
? lock_acquire+0xf5/0x1f0
? tcp_v4_rcv+0xa7c/0xbe0
? tcp_v4_do_rcv+0x70/0x1e0

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 6c0563e4 11-Dec-2018 Atul Gupta <atul.gupta@chelsio.com>

net/tls: Init routines in create_ctx

create_ctx is called from tls_init and tls_hw_prot
hence initialize function pointers in common routine.

Signed-off-by: Atul Gupta <atul.gup

net/tls: Init routines in create_ctx

create_ctx is called from tls_init and tls_hw_prot
hence initialize function pointers in common routine.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19
# c16ee04c 20-Oct-2018 Daniel Borkmann <daniel@iogearbox.net>

ulp: remove uid and user_visible members

They are not used anymore and therefore should be removed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei S

ulp: remove uid and user_visible members

They are not used anymore and therefore should be removed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


Revision tags: v4.18.16, v4.18.15, v4.18.14
# 924ad65e 12-Oct-2018 John Fastabend <john.fastabend@gmail.com>

tls: replace poll implementation with read hook

Instead of re-implementing poll routine use the poll callback to
trigger read from kTLS, we reuse the stream_memory_read callback
whic

tls: replace poll implementation with read hook

Instead of re-implementing poll routine use the poll callback to
trigger read from kTLS, we reuse the stream_memory_read callback
which is simpler and achieves the same. This helps to align sockmap
and kTLS so we can more easily embed BPF in kTLS.

Joint work with Daniel.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


Revision tags: v4.18.13, v4.18.12, v4.18.11, v4.18.10
# 9932a29a 24-Sep-2018 Vakul Garg <vakul.garg@nxp.com>

net/tls: Fixed race condition in async encryption

On processors with multi-engine crypto accelerators, it is possible that
multiple records get encrypted in parallel and their encryption

net/tls: Fixed race condition in async encryption

On processors with multi-engine crypto accelerators, it is possible that
multiple records get encrypted in parallel and their encryption
completion is notified to different cpus in multicore processor. This
leads to the situation where tls_encrypt_done() starts executing in
parallel on different cores. In current implementation, encrypted
records are queued to tx_ready_list in tls_encrypt_done(). This requires
addition to linked list 'tx_ready_list' to be protected. As
tls_decrypt_done() could be executing in irq content, it is not possible
to protect linked list addition operation using a lock.

To fix the problem, we remove linked list addition operation from the
irq context. We do tx_ready_list addition/removal operation from
application context only and get rid of possible multiple access to
the linked list. Before starting encryption on the record, we add it to
the tail of tx_ready_list. To prevent tls_tx_records() from transmitting
it, we mark the record with a new flag 'tx_ready' in 'struct tls_rec'.
When record encryption gets completed, tls_encrypt_done() has to only
update the 'tx_ready' flag to true & linked list add operation is not
required.

The changed logic brings some other side benefits. Since the records
are always submitted in tls sequence number order for encryption, the
tx_ready_list always remains sorted and addition of new records to it
does not have to traverse the linked list.

Lastly, we renamed tx_ready_list in 'struct tls_sw_context_tx' to
'tx_list'. This is because now, the some of the records at the tail are
not ready to transmit.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# a42055e8 20-Sep-2018 Vakul Garg <vakul.garg@nxp.com>

net/tls: Add support for async encryption of records for performance

In current implementation, tls records are encrypted & transmitted
serially. Till the time the previously submitted u

net/tls: Add support for async encryption of records for performance

In current implementation, tls records are encrypted & transmitted
serially. Till the time the previously submitted user data is encrypted,
the implementation waits and on finish starts transmitting the record.
This approach of encrypt-one record at a time is inefficient when
asynchronous crypto accelerators are used. For each record, there are
overheads of interrupts, driver softIRQ scheduling etc. Also the crypto
accelerator sits idle most of time while an encrypted record's pages are
handed over to tcp stack for transmission.

This patch enables encryption of multiple records in parallel when an
async capable crypto accelerator is present in system. This is achieved
by allowing the user space application to send more data using sendmsg()
even while previously issued data is being processed by crypto
accelerator. This requires returning the control back to user space
application after submitting encryption request to accelerator. This
also means that zero-copy mode of encryption cannot be used with async
accelerator as we must be done with user space application buffer before
returning from sendmsg().

There can be multiple records in flight to/from the accelerator. Each of
the record is represented by 'struct tls_rec'. This is used to store the
memory pages for the record.

After the records are encrypted, they are added in a linked list called
tx_ready_list which contains encrypted tls records sorted as per tls
sequence number. The records from tx_ready_list are transmitted using a
newly introduced function called tls_tx_records(). The tx_ready_list is
polled for any record ready to be transmitted in sendmsg(), sendpage()
after initiating encryption of new tls records. This achieves parallel
encryption and transmission of records when async accelerator is
present.

There could be situation when crypto accelerator completes encryption
later than polling of tx_ready_list by sendmsg()/sendpage(). Therefore
we need a deferred work context to be able to transmit records from
tx_ready_list. The deferred work context gets scheduled if applications
are not sending much data through the socket. If the applications issue
sendmsg()/sendpage() in quick succession, then the scheduling of
tx_work_handler gets cancelled as the tx_ready_list would be polled from
application's context itself. This saves scheduling overhead of deferred
work.

The patch also brings some side benefit. We are able to get rid of the
concept of CLOSED record. This is because the records once closed are
either encrypted and then placed into tx_ready_list or if encryption
fails, the socket error is set. This simplifies the kernel tls
sendpath. However since tls_device.c is still using macros, accessory
functions for CLOSED records have been retained.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.18.9
# c844eb46 12-Sep-2018 Sabrina Dubroca <sd@queasysnail.net>

tls: clear key material from kernel memory when do_tls_setsockopt_conf fails

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signe

tls: clear key material from kernel memory when do_tls_setsockopt_conf fails

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 86029d10 12-Sep-2018 Sabrina Dubroca <sd@queasysnail.net>

tls: zero the crypto information from tls_context before freeing

This contains key material in crypto_send_aes_gcm_128 and
crypto_recv_aes_gcm_128.

Introduce union tls_crypto_co

tls: zero the crypto information from tls_context before freeing

This contains key material in crypto_send_aes_gcm_128 and
crypto_recv_aes_gcm_128.

Introduce union tls_crypto_context, and replace the two identical
unions directly embedded in struct tls_context with it. We can then
use this union to clean up the memory in the new tls_ctx_free()
function.

Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.18.7, v4.18.6, v4.18.5
# 67db7cd2 22-Aug-2018 John Fastabend <john.fastabend@gmail.com>

tls: possible hang when do_tcp_sendpages hits sndbuf is full case

Currently, the lower protocols sk_write_space handler is not called if
TLS is sending a scatterlist via tls_push_sg. Ho

tls: possible hang when do_tcp_sendpages hits sndbuf is full case

Currently, the lower protocols sk_write_space handler is not called if
TLS is sending a scatterlist via tls_push_sg. However, normally
tls_push_sg calls do_tcp_sendpage, which may be under memory pressure,
that in turn may trigger a wait via sk_wait_event. Typically, this
happens when the in-flight bytes exceed the sdnbuf size. In the normal
case when enough ACKs are received sk_write_space() will be called and
the sk_wait_event will be woken up allowing it to send more data
and/or return to the user.

But, in the TLS case because the sk_write_space() handler does not
wake up the events the above send will wait until the sndtimeo is
exceeded. By default this is MAX_SCHEDULE_TIMEOUT so it look like a
hang to the user (especially this impatient user). To fix this pass
the sk_write_space event to the lower layers sk_write_space event
which in the TCP case will wake any pending events.

I observed the above while integrating sockmap and ktls. It
initially appeared as test_sockmap (modified to use ktls) occasionally
hanging. To reliably reproduce this reduce the sndbuf size and stress
the tls layer by sending many 1B sends. This results in every byte
needing a header and each byte individually being sent to the crypto
layer.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

show more ...


Revision tags: v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16
# 037b0b86 16-Aug-2018 Daniel Borkmann <daniel@iogearbox.net>

tcp, ulp: add alias for all ulp modules

Lets not turn the TCP ULP lookup into an arbitrary module loader as
we only intend to load ULP modules through this mechanism, not other
unrel

tcp, ulp: add alias for all ulp modules

Lets not turn the TCP ULP lookup into an arbitrary module loader as
we only intend to load ULP modules through this mechanism, not other
unrelated kernel modules:

[root@bar]# cat foo.c
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/tcp.h>
#include <linux/in.h>

int main(void)
{
int sock = socket(PF_INET, SOCK_STREAM, 0);
setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
return 0;
}

[root@bar]# gcc foo.c -O2 -Wall
[root@bar]# lsmod | grep sctp
[root@bar]# ./a.out
[root@bar]# lsmod | grep sctp
sctp 1077248 4
libcrc32c 16384 3 nf_conntrack,nf_nat,sctp
[root@bar]#

Fix it by adding module alias to TCP ULP modules, so probing module
via request_module() will be limited to tcp-ulp-[name]. The existing
modules like kTLS will load fine given tcp-ulp-tls alias, but others
will fail to load:

[root@bar]# lsmod | grep sctp
[root@bar]# ./a.out
[root@bar]# lsmod | grep sctp
[root@bar]#

Sockmap is not affected from this since it's either built-in or not.

Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

show more ...


Revision tags: v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10, v4.17.9, v4.17.8, v4.17.7
# 4799ac81 13-Jul-2018 Boris Pismenny <borisp@mellanox.com>

tls: Add rx inline crypto offload

This patch completes the generic infrastructure to offload TLS crypto to a
network device. It enables the kernel to skip decryption and
authenticati

tls: Add rx inline crypto offload

This patch completes the generic infrastructure to offload TLS crypto to a
network device. It enables the kernel to skip decryption and
authentication of some skbs marked as decrypted by the NIC. In the fast
path, all packets received are decrypted by the NIC and the performance
is comparable to plain TCP.

This infrastructure doesn't require a TCP offload engine. Instead, the
NIC only decrypts packets that contain the expected TCP sequence number.
Out-Of-Order TCP packets are provided unmodified. As a result, at the
worst case a received TLS record consists of both plaintext and ciphertext
packets. These partially decrypted records must be reencrypted,
only to be decrypted.

The notable differences between SW KTLS Rx and this offload are as
follows:
1. Partial decryption - Software must handle the case of a TLS record
that was only partially decrypted by HW. This can happen due to packet
reordering.
2. Resynchronization - tls_read_size calls the device driver to
resynchronize HW after HW lost track of TLS record framing in
the TCP stream.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.17.6, v4.17.5, v4.17.4
# a11e1d43 28-Jun-2018 Linus Torvalds <torvalds@linux-foundation.org>

Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL

The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression

Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL

The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.

Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.

But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.

[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


Revision tags: v4.17.3, v4.17.2
# f6fadff3 11-Jun-2018 Daniel Borkmann <daniel@iogearbox.net>

tls: fix NULL pointer dereference on poll

While hacking on kTLS, I ran into the following panic from an
unprivileged netserver / netperf TCP session:

BUG: unable to handle ker

tls: fix NULL pointer dereference on poll

While hacking on kTLS, I ran into the following panic from an
unprivileged netserver / netperf TCP session:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0
Oops: 0010 [#1] SMP KASAN PTI
CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
RIP: 0010: (null)
Code: Bad RIP value.
RSP: 0018:ffff88036abcf740 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26
RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280
RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48
R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280
R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350
FS: 00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? tls_sw_poll+0xa4/0x160 [tls]
? sock_poll+0x20a/0x680
? do_select+0x77b/0x11a0
? poll_schedule_timeout.constprop.12+0x130/0x130
? pick_link+0xb00/0xb00
? read_word_at_a_time+0x13/0x20
? vfs_poll+0x270/0x270
? deref_stack_reg+0xad/0xe0
? __read_once_size_nocheck.constprop.6+0x10/0x10
[...]

Debugging further, it turns out that calling into ctx->sk_poll() is
invalid since sk_poll itself is NULL which was saved from the original
TCP socket in order for tls_sw_poll() to invoke it.

Looks like the recent conversion from poll to poll_mask callback started
in 152524231023 ("net: add support for ->poll_mask in proto_ops") missed
to eventually convert kTLS, too: TCP's ->poll was converted over to the
->poll_mask in commit 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask")
and therefore kTLS wrongly saved the ->poll old one which is now NULL.

Convert kTLS over to use ->poll_mask instead. Also instead of POLLIN |
POLLRDNORM use the proper EPOLLIN | EPOLLRDNORM bits as the case in
tcp_poll_mask() as well that is mangled here.

Fixes: 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Watson <davejwatson@fb.com>
Tested-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# b2d6cee1 11-May-2018 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

The bpf syscall and selftests conflicts were trivial
overlapping changes.

The r8169 change involved moving the adde

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

The bpf syscall and selftests conflicts were trivial
overlapping changes.

The r8169 change involved moving the added mdelay from 'net' into a
different function.

A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts. I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".

Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.17.1, v4.17
# 080324c3 06-May-2018 Andre Tomt <andre@tomt.net>

net/tls: Fix connection stall on partial tls record

In the case of writing a partial tls record we forgot to clear the
ctx->in_tcp_sendpages flag, causing some connections to stall.

net/tls: Fix connection stall on partial tls record

In the case of writing a partial tls record we forgot to clear the
ctx->in_tcp_sendpages flag, causing some connections to stall.

Fixes: c212d2c7fc47 ("net/tls: Don't recursively call push_record during tls_write_space callbacks")
Signed-off-by: Andre Tomt <andre@tomt.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# 98f0a395 05-May-2018 Eric Dumazet <edumazet@google.com>

tls: fix use after free in tls_sk_proto_close

syzbot reported a use-after-free in tls_sk_proto_close

Add a boolean value to cleanup a bit this function.

BUG: KASAN: use-aft

tls: fix use after free in tls_sk_proto_close

syzbot reported a use-after-free in tls_sk_proto_close

Add a boolean value to cleanup a bit this function.

BUG: KASAN: use-after-free in tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
Read of size 1 at addr ffff8801ae40a858 by task syz-executor363/4503

CPU: 0 PID: 4503 Comm: syz-executor363 Not tainted 4.17.0-rc3+ #34
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
sock_release+0x96/0x1b0 net/socket.c:594
sock_close+0x16/0x20 net/socket.c:1149
__fput+0x34d/0x890 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x1e4/0x290 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x1aee/0x2730 kernel/exit.c:865
do_group_exit+0x16f/0x430 kernel/exit.c:968
get_signal+0x886/0x1960 kernel/signal.c:2469
do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4457b9
RSP: 002b:00007fdf4d766da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006dac3c RCX: 00000000004457b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000006dac3c
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac38
R13: 3692738801137283 R14: 6bf92c39443c4c1d R15: 0000000000000006

Allocated by task 4498:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
kmalloc include/linux/slab.h:512 [inline]
kzalloc include/linux/slab.h:701 [inline]
create_ctx net/tls/tls_main.c:521 [inline]
tls_init+0x1f9/0xb00 net/tls/tls_main.c:633
tcp_set_ulp+0x1bc/0x520 net/ipv4/tcp_ulp.c:153
do_tcp_setsockopt.isra.39+0x44a/0x2600 net/ipv4/tcp.c:2588
tcp_setsockopt+0xc1/0xe0 net/ipv4/tcp.c:2893
sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
__sys_setsockopt+0x1bd/0x390 net/socket.c:1903
__do_sys_setsockopt net/socket.c:1914 [inline]
__se_sys_setsockopt net/socket.c:1911 [inline]
__x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 4503:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xd9/0x260 mm/slab.c:3813
tls_sw_free_resources+0x2a3/0x360 net/tls/tls_sw.c:1037
tls_sk_proto_close+0x67c/0x9c0 net/tls/tls_main.c:288
inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
sock_release+0x96/0x1b0 net/socket.c:594
sock_close+0x16/0x20 net/socket.c:1149
__fput+0x34d/0x890 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x1e4/0x290 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x1aee/0x2730 kernel/exit.c:865
do_group_exit+0x16f/0x430 kernel/exit.c:968
get_signal+0x886/0x1960 kernel/signal.c:2469
do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801ae40a800
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 88 bytes inside of
256-byte region [ffff8801ae40a800, ffff8801ae40a900)
The buggy address belongs to the page:
page:ffffea0006b90280 count:1 mapcount:0 mapping:ffff8801ae40a080 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffff8801ae40a080 0000000000000000 000000010000000c
raw: ffffea0006bea9e0 ffffea0006bc94a0 ffff8801da8007c0 0000000000000000
page dumped because: kasan: bad access detected

Fixes: dd0bed1665d6 ("tls: support for Inline tls record")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Atul Gupta <atul.gupta@chelsio.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Ilya Lesokhin <ilyal@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: Dave Watson <davejwatson@fb.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# a7b15ab8 04-May-2018 David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Overlapping changes in selftests Makefile.

Signed-off-by: David S. Miller <davem@davemloft.net>


# c212d2c7 01-May-2018 Dave Watson <davejwatson@fb.com>

net/tls: Don't recursively call push_record during tls_write_space callbacks

It is reported that in some cases, write_space may be called in
do_tcp_sendpages, such that we recursively in

net/tls: Don't recursively call push_record during tls_write_space callbacks

It is reported that in some cases, write_space may be called in
do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again:

[ 660.468802] ? do_tcp_sendpages+0x8d/0x580
[ 660.468826] ? tls_push_sg+0x74/0x130 [tls]
[ 660.468852] ? tls_push_record+0x24a/0x390 [tls]
[ 660.468880] ? tls_write_space+0x6a/0x80 [tls]
...

tls_push_sg already does a loop over all sending sg's, so ignore
any tls_write_space notifications until we are done sending.
We then have to call the previous write_space to wake up
poll() waiters after we are done with the send loop.

Reported-by: Andre Tomt <andre@tomt.net>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# e8f69799 30-Apr-2018 Ilya Lesokhin <ilyal@mellanox.com>

net/tls: Add generic NIC offload infrastructure

This patch adds a generic infrastructure to offload TLS crypto to a
network device. It enables the kernel TLS socket to skip encryption

net/tls: Add generic NIC offload infrastructure

This patch adds a generic infrastructure to offload TLS crypto to a
network device. It enables the kernel TLS socket to skip encryption
and authentication operations on the transmit side of the data path.
Leaving those computationally expensive operations to the NIC.

The NIC offload infrastructure builds TLS records and pushes them to
the TCP layer just like the SW KTLS implementation and using the same
API.
TCP segmentation is mostly unaffected. Currently the only exception is
that we prevent mixed SKBs where only part of the payload requires
offload. In the future we are likely to add a similar restriction
following a change cipher spec record.

The notable differences between SW KTLS and NIC offloaded TLS
implementations are as follows:
1. The offloaded implementation builds "plaintext TLS record", those
records contain plaintext instead of ciphertext and place holder bytes
instead of authentication tags.
2. The offloaded implementation maintains a mapping from TCP sequence
number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
TLS socket, we can use the tls NIC offload infrastructure to obtain
enough context to encrypt the payload of the SKB.
A TLS record is released when the last byte of the record is ack'ed,
this is done through the new icsk_clean_acked callback.

The infrastructure should be extendable to support various NIC offload
implementations. However it is currently written with the
implementation below in mind:
The NIC assumes that packets from each offloaded stream are sent as
plaintext and in-order. It keeps track of the TLS records in the TCP
stream. When a packet marked for offload is transmitted, the NIC
encrypts the payload in-place and puts authentication tags in the
relevant place holders.

The responsibility for handling out-of-order packets (i.e. TCP
retransmission, qdisc drops) falls on the netdev driver.

The netdev driver keeps track of the expected TCP SN from the NIC's
perspective. If the next packet to transmit matches the expected TCP
SN, the driver advances the expected TCP SN, and transmits the packet
with TLS offload indication.

If the next packet to transmit does not match the expected TCP SN. The
driver calls the TLS layer to obtain the TLS record that includes the
TCP of the packet for transmission. Using this TLS record, the driver
posts a work entry on the transmit queue to reconstruct the NIC TLS
state required for the offload of the out-of-order packet. It updates
the expected TCP SN accordingly and transmits the now in-order packet.
The same queue is used for packet transmission and TLS context
reconstruction to avoid the need for flushing the transmit queue before
issuing the context reconstruction request.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


# f66de3ee 30-Apr-2018 Boris Pismenny <borisp@mellanox.com>

net/tls: Split conf to rx + tx

In TLS inline crypto, we can have one direction in software
and another in hardware. Thus, we split the TLS configuration to separate
structures for re

net/tls: Split conf to rx + tx

In TLS inline crypto, we can have one direction in software
and another in hardware. Thus, we split the TLS configuration to separate
structures for receive and transmit.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


1...<<1112