Description
In the Linux kernel, the following vulnerability has been resolved:

bpf: fix ktls panic with sockmap

[ 2172.936997] ------------[ cut here ]------------
[ 2172.936999] kernel BUG at lib/iov_iter.c:629!
......
[ 2172.944996] PKRU: 55555554
[ 2172.945155] Call Trace:
[ 2172.945299]
[ 2172.945428] ? die+0x36/0x90
[ 2172.945601] ? do_trap+0xdd/0x100
[ 2172.945795] ? iov_iter_revert+0x178/0x180
[ 2172.946031] ? iov_iter_revert+0x178/0x180
[ 2172.946267] ? do_error_trap+0x7d/0x110
[ 2172.946499] ? iov_iter_revert+0x178/0x180
[ 2172.946736] ? exc_invalid_op+0x50/0x70
[ 2172.946961] ? iov_iter_revert+0x178/0x180
[ 2172.947197] ? asm_exc_invalid_op+0x1a/0x20
[ 2172.947446] ? iov_iter_revert+0x178/0x180
[ 2172.947683] ? iov_iter_revert+0x5c/0x180
[ 2172.947913] tls_sw_sendmsg_locked.isra.0+0x794/0x840
[ 2172.948206] tls_sw_sendmsg+0x52/0x80
[ 2172.948420] ? inet_sendmsg+0x1f/0x70
[ 2172.948634] __sys_sendto+0x1cd/0x200
[ 2172.948848] ? find_held_lock+0x2b/0x80
[ 2172.949072] ? syscall_trace_enter+0x140/0x270
[ 2172.949330] ? __lock_release.isra.0+0x5e/0x170
[ 2172.949595] ? find_held_lock+0x2b/0x80
[ 2172.949817] ? syscall_trace_enter+0x140/0x270
[ 2172.950211] ? lockdep_hardirqs_on_prepare+0xda/0x190
[ 2172.950632] ? ktime_get_coarse_real_ts64+0xc2/0xd0
[ 2172.951036] __x64_sys_sendto+0x24/0x30
[ 2172.951382] do_syscall_64+0x90/0x170
......

After calling bpf_exec_tx_verdict(), the size of msg_pl->sg may increase, e.g., when the BPF program executes bpf_msg_push_data().

If the BPF program sets cork_bytes and sg.size is smaller than cork_bytes, it will return -ENOSPC and attempt to roll back to the non-zero copy logic. However, during rollback, msg->msg_iter is reset, but since msg_pl->sg.size has been increased, subsequent executions will exceed the actual size of msg_iter.

'''
iov_iter_revert(&msg->msg_iter, msg_pl->sg.size - orig_size);
'''

The changes in this commit are based on the following considerations:

1. When cork_bytes is set, rolling back to non-zero copy logic is pointless and can directly go to zero-copy logic.

2. We can not calculate the correct number of bytes to revert msg_iter. Assume the original data is "abcdefgh" (8 bytes), and after 3 pushes by the BPF program, it becomes 11-byte data: "abc?de?fgh?". Then, we set cork_bytes to 6, which means the first 6 bytes have been processed, and the remaining 5 bytes "?fgh?" will be cached until the length meets the cork_bytes requirement. However, some data in "?fgh?" is not within 'sg->msg_iter' (but in msg_pl instead), especially the data "?" we pushed. So it doesn't seem as simple as just reverting through an offset of msg_iter.

3. For non-TLS sockets in tcp_bpf_sendmsg, when a "cork" situation occurs, the user-space send() doesn't return an error, and the returned length is the same as the input length parameter, even if some data is cached.

Additionally, I saw that the current non-zero-copy logic for handling corking is written as:

'''
line 1177
else if (ret != -EAGAIN) {
        if (ret == -ENOSPC)
                ret = 0;
        goto send_end;
'''

So it's ok to just return 'copied' without error when a "cork" situation occurs.
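The size mismatch described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_iter`, `toy_msg`, and every `toy_*` function are hypothetical stand-ins, not the kernel's `struct iov_iter`, `struct sk_msg`, or any kernel API. The model also assumes that rewinding the iterator by more bytes than were consumed from it is the invalid operation, which is how the description characterizes the failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for msg->msg_iter: tracks only byte counts. */
struct toy_iter {
    size_t total;    /* bytes the caller handed to send() */
    size_t consumed; /* bytes already pulled out of the iterator */
};

/* Hypothetical stand-in for msg_pl: tracks only sg.size. */
struct toy_msg {
    size_t sg_size;
};

/* Models moving `len` user bytes from the iterator into the record. */
static void toy_pull(struct toy_iter *it, struct toy_msg *m, size_t len)
{
    assert(len <= it->total - it->consumed);
    it->consumed += len;
    m->sg_size += len;
}

/* Models a BPF program pushing data: the record grows, the iterator does not move. */
static void toy_push(struct toy_msg *m, size_t len)
{
    m->sg_size += len;
}

/* Mirrors the revert length expression: msg_pl->sg.size - orig_size. */
static size_t toy_revert_len(const struct toy_msg *m, size_t orig_size)
{
    return m->sg_size - orig_size;
}

/* Assumption: a revert is valid only if it does not exceed the consumed bytes. */
static bool toy_revert_ok(const struct toy_iter *it, size_t unroll)
{
    return unroll <= it->consumed;
}

/* Runs pull, optional push, then checks whether the computed revert is valid. */
static bool toy_scenario(size_t user_bytes, size_t pushed_bytes)
{
    struct toy_iter it = { .total = user_bytes, .consumed = 0 };
    struct toy_msg m = { .sg_size = 0 };
    size_t orig_size = m.sg_size;

    toy_pull(&it, &m, user_bytes);
    toy_push(&m, pushed_bytes);
    return toy_revert_ok(&it, toy_revert_len(&m, orig_size));
}
```

With the "abcdefgh" example, `toy_scenario(8, 3)` computes a revert length of 11 against only 8 consumed bytes, so the check fails. `toy_scenario(8, 0)` passes because the record size and the consumed byte count still agree.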
POC
Reference
No PoCs from references.
Github
- https://github.com/w4zu/Debian_security