| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: fix integer underflow in chain mode
The jumbo_frm() chain-mode implementation unconditionally computes
len = nopaged_len - bmax;
where nopaged_len = skb_headlen(skb) (linear bytes only) and bmax is
BUF_SIZE_8KiB or BUF_SIZE_2KiB. However, the caller stmmac_xmit()
decides to invoke jumbo_frm() based on skb->len (total length including
page fragments):
is_jumbo = stmmac_is_jumbo_frm(priv, skb->len, enh_desc);
When a packet has a small linear portion (nopaged_len <= bmax) but a
large total length due to page fragments (skb->len > bmax), the
subtraction wraps as an unsigned integer, producing a huge len value
(~0xFFFFxxxx). This causes the while (len != 0) loop to execute
hundreds of thousands of iterations, passing skb->data + bmax * i
pointers far beyond the skb buffer to dma_map_single(). On IOMMU-less
SoCs (the typical deployment for stmmac), this maps arbitrary kernel
memory to the DMA engine, constituting a kernel memory disclosure and
potential memory corruption from hardware.
Fix this by introducing a buf_len local variable clamped to
min(nopaged_len, bmax). Computing len = nopaged_len - buf_len is then
always safe: it is zero when the linear portion fits within a single
descriptor, causing the while (len != 0) loop to be skipped naturally,
and the fragment loop in stmmac_xmit() handles page fragments afterward. |
| In the Linux kernel, the following vulnerability has been resolved:
mmc: vub300: fix NULL-deref on disconnect
Make sure to deregister the controller before dropping the reference to
the driver data on disconnect to avoid NULL-pointer dereferences or
use-after-free. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/damon/stat: deallocate damon_call() failure leaking damon_ctx
damon_stat_start() always allocates the module's damon_ctx object
(damon_stat_context). Meanwhile, if damon_call() in the function fails,
the damon_ctx object is not deallocated. Hence, if the damon_call() is
failed, and the user writes Y to “enabled” again, the previously
allocated damon_ctx object is leaked.
This cannot simply be fixed by deallocating the damon_ctx object when
damon_call() fails. That's because damon_call() failure doesn't guarantee
the kdamond main function, which accesses the damon_ctx object, is
completely finished. In other words, if damon_stat_start() deallocates
the damon_ctx object after damon_call() failure, the not-yet-terminated
kdamond could access the freed memory (use-after-free).
Fix the leak while avoiding the use-after-free by keeping returning
damon_stat_start() without deallocating the damon_ctx object after
damon_call() failure, but deallocating it when the function is invoked
again and the kdamond is completely terminated. If the kdamond is not yet
terminated, simply return -EAGAIN, as the kdamond will soon be terminated.
The issue was discovered [1] by sashiko. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/vma: fix memory leak in __mmap_region()
commit 605f6586ecf7 ("mm/vma: do not leak memory when .mmap_prepare
swaps the file") handled the success path by skipping get_file() via
file_doesnt_need_get, but missed the error path.
When /dev/zero is mmap'd with MAP_SHARED, mmap_zero_prepare() calls
shmem_zero_setup_desc() which allocates a new shmem file to back the
mapping. If __mmap_new_vma() subsequently fails, this replacement
file is never fput()'d - the original is released by
ksys_mmap_pgoff(), but nobody releases the new one.
Add fput() for the swapped file in the error path.
Reproducible with fault injection.
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
CPU: 2 UID: 0 PID: 366 Comm: syz.7.14 Not tainted 7.0.0-rc6 #2 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x164/0x1f0
should_fail_ex+0x525/0x650
should_failslab+0xdf/0x140
kmem_cache_alloc_noprof+0x78/0x630
vm_area_alloc+0x24/0x160
__mmap_region+0xf6b/0x2660
mmap_region+0x2eb/0x3a0
do_mmap+0xc79/0x1240
vm_mmap_pgoff+0x252/0x4c0
ksys_mmap_pgoff+0xf8/0x120
__x64_sys_mmap+0x12a/0x190
do_syscall_64+0xa9/0x580
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
BUG: memory leak
unreferenced object 0xffff8881118aca80 (size 360):
comm "syz.7.14", pid 366, jiffies 4294913255
hex dump (first 32 bytes):
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
ff ff ff ff ff ff ff ff c0 28 4d ae ff ff ff ff .........(M.....
backtrace (crc db0f53bc):
kmem_cache_alloc_noprof+0x3ab/0x630
alloc_empty_file+0x5a/0x1e0
alloc_file_pseudo+0x135/0x220
__shmem_file_setup+0x274/0x420
shmem_zero_setup_desc+0x9c/0x170
mmap_zero_prepare+0x123/0x140
__mmap_region+0xdda/0x2660
mmap_region+0x2eb/0x3a0
do_mmap+0xc79/0x1240
vm_mmap_pgoff+0x252/0x4c0
ksys_mmap_pgoff+0xf8/0x120
__x64_sys_mmap+0x12a/0x190
do_syscall_64+0xa9/0x580
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Found by syzkaller. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915/gt: fix refcount underflow in intel_engine_park_heartbeat
A use-after-free / refcount underflow is possible when the heartbeat
worker and intel_engine_park_heartbeat() race to release the same
engine->heartbeat.systole request.
The heartbeat worker reads engine->heartbeat.systole and calls
i915_request_put() on it when the request is complete, but clears
the pointer in a separate, non-atomic step. Concurrently, a request
retirement on another CPU can drop the engine wakeref to zero, triggering
__engine_park() -> intel_engine_park_heartbeat(). If the heartbeat
timer is pending at that point, cancel_delayed_work() returns true and
intel_engine_park_heartbeat() reads the stale non-NULL systole pointer
and calls i915_request_put() on it again, causing a refcount underflow:
```
<4> [487.221889] Workqueue: i915-unordered engine_retire [i915]
<4> [487.222640] RIP: 0010:refcount_warn_saturate+0x68/0xb0
...
<4> [487.222707] Call Trace:
<4> [487.222711] <TASK>
<4> [487.222716] intel_engine_park_heartbeat.part.0+0x6f/0x80 [i915]
<4> [487.223115] intel_engine_park_heartbeat+0x25/0x40 [i915]
<4> [487.223566] __engine_park+0xb9/0x650 [i915]
<4> [487.223973] ____intel_wakeref_put_last+0x2e/0xb0 [i915]
<4> [487.224408] __intel_wakeref_put_last+0x72/0x90 [i915]
<4> [487.224797] intel_context_exit_engine+0x7c/0x80 [i915]
<4> [487.225238] intel_context_exit+0xf1/0x1b0 [i915]
<4> [487.225695] i915_request_retire.part.0+0x1b9/0x530 [i915]
<4> [487.226178] i915_request_retire+0x1c/0x40 [i915]
<4> [487.226625] engine_retire+0x122/0x180 [i915]
<4> [487.227037] process_one_work+0x239/0x760
<4> [487.227060] worker_thread+0x200/0x3f0
<4> [487.227068] ? __pfx_worker_thread+0x10/0x10
<4> [487.227075] kthread+0x10d/0x150
<4> [487.227083] ? __pfx_kthread+0x10/0x10
<4> [487.227092] ret_from_fork+0x3d4/0x480
<4> [487.227099] ? __pfx_kthread+0x10/0x10
<4> [487.227107] ret_from_fork_asm+0x1a/0x30
<4> [487.227141] </TASK>
```
Fix this by replacing the non-atomic pointer read + separate clear with
xchg() in both racing paths. xchg() is a single indivisible hardware
instruction that atomically reads the old pointer and writes NULL. This
guarantees only one of the two concurrent callers obtains the non-NULL
pointer and performs the put, the other gets NULL and skips it.
(cherry picked from commit 13238dc0ee4f9ab8dafa2cca7295736191ae2f42) |
| In the Linux kernel, the following vulnerability has been resolved:
batman-adv: reject oversized global TT response buffers
batadv_tt_prepare_tvlv_global_data() builds the allocation length for a
global TT response in 16-bit temporaries. When a remote originator
advertises a large enough global TT, the TT payload length plus the VLAN
header offset can exceed 65535 and wrap before kmalloc().
The full-table response path still uses the original TT payload length when
it fills tt_change, so the wrapped allocation is too small and
batadv_tt_prepare_tvlv_global_data() writes past the end of the heap object
before the later packet-size check runs.
Fix this by rejecting TT responses whose TVLV value length cannot fit in
the 16-bit TVLV payload length field. |
| In the Linux kernel, the following vulnerability has been resolved:
tipc: fix bc_ackers underflow on duplicate GRP_ACK_MSG
The GRP_ACK_MSG handler in tipc_group_proto_rcv() currently decrements
bc_ackers on every inbound group ACK, even when the same member has
already acknowledged the current broadcast round.
Because bc_ackers is a u16, a duplicate ACK received after the last
legitimate ACK wraps the counter to 65535. Once wrapped,
tipc_group_bc_cong() keeps reporting congestion and later group
broadcasts on the affected socket stay blocked until the group is
recreated.
Fix this by ignoring duplicate or stale ACKs before touching bc_acked or
bc_ackers. This makes repeated GRP_ACK_MSG handling idempotent and
prevents the underflow path. |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm: hold dev ref until after transport_finish NF_HOOK
After async crypto completes, xfrm_input_resume() calls dev_put()
immediately on re-entry before the skb reaches transport_finish.
The skb->dev pointer is then used inside NF_HOOK and its okfn,
which can race with device teardown.
Remove the dev_put from the async resumption entry and instead
drop the reference after the NF_HOOK call in transport_finish,
using a saved device pointer since NF_HOOK may consume the skb.
This covers NF_DROP, NF_QUEUE and NF_STOLEN paths that skip
the okfn.
For non-transport exits (decaps, gro, drop) and secondary
async return points, release the reference inline when
async is set. |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm: clear trailing padding in build_polexpire()
build_expire() clears the trailing padding bytes of struct
xfrm_user_expire after setting the hard field via memset_after(),
but the analogous function build_polexpire() does not do this for
struct xfrm_user_polexpire.
The padding bytes after the __u8 hard field are left
uninitialized from the heap allocation, and are then sent to
userspace via netlink multicast to XFRMNLGRP_EXPIRE listeners,
leaking kernel heap memory contents.
Add the missing memset_after() call, matching build_expire(). |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nft_ct: fix use-after-free in timeout object destroy
nft_ct_timeout_obj_destroy() frees the timeout object with kfree()
immediately after nf_ct_untimeout(), without waiting for an RCU grace
period. Concurrent packet processing on other CPUs may still hold
RCU-protected references to the timeout object obtained via
rcu_dereference() in nf_ct_timeout_data().
Add an rcu_head to struct nf_ct_timeout and use kfree_rcu() to defer
freeing until after an RCU grace period, matching the approach already
used in nfnetlink_cttimeout.c.
KASAN report:
BUG: KASAN: slab-use-after-free in nf_conntrack_tcp_packet+0x1381/0x29d0
Read of size 4 at addr ffff8881035fe19c by task exploit/80
Call Trace:
nf_conntrack_tcp_packet+0x1381/0x29d0
nf_conntrack_in+0x612/0x8b0
nf_hook_slow+0x70/0x100
__ip_local_out+0x1b2/0x210
tcp_sendmsg_locked+0x722/0x1580
__sys_sendto+0x2d8/0x320
Allocated by task 75:
nft_ct_timeout_obj_init+0xf6/0x290
nft_obj_init+0x107/0x1b0
nf_tables_newobj+0x680/0x9c0
nfnetlink_rcv_batch+0xc29/0xe00
Freed by task 26:
nft_obj_destroy+0x3f/0xa0
nf_tables_trans_destroy_work+0x51c/0x5c0
process_one_work+0x2c4/0x5a0 |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix incorrect return value after changing leaf in lookup_extent_data_ref()
After commit 1618aa3c2e01 ("btrfs: simplify return variables in
lookup_extent_data_ref()"), the err and ret variables were merged into
a single ret variable. However, when btrfs_next_leaf() returns 0
(success), ret is overwritten from -ENOENT to 0. If the first key in
the next leaf does not match (different objectid or type), the function
returns 0 instead of -ENOENT, making the caller believe the lookup
succeeded when it did not. This can lead to operations on the wrong
extent tree item, potentially causing extent tree corruption.
Fix this by returning -ENOENT directly when the key does not match,
instead of relying on the ret variable. |
| In the Linux kernel, the following vulnerability has been resolved:
smb: smbdirect: introduce smbdirect_socket.recv_io.credits.available
The logic off managing recv credits by counting posted recv_io and
granted credits is racy.
That's because the peer might already consumed a credit,
but between receiving the incoming recv at the hardware
and processing the completion in the 'recv_done' functions
we likely have a window where we grant credits, which
don't really exist.
So we better have a decicated counter for the
available credits, which will be incremented
when we posted new recv buffers and drained when
we grant the credits to the peer. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/i915/gt: Check set_default_submission() before deferencing
When the i915 driver firmware binaries are not present, the
set_default_submission pointer is not set. This pointer is
dereferenced during suspend anyways.
Add a check to make sure it is set before dereferencing.
[ 23.289926] PM: suspend entry (deep)
[ 23.293558] Filesystems sync: 0.000 seconds
[ 23.298010] Freezing user space processes
[ 23.302771] Freezing user space processes completed (elapsed 0.000 seconds)
[ 23.309766] OOM killer disabled.
[ 23.313027] Freezing remaining freezable tasks
[ 23.318540] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 23.342038] serial 00:05: disabled
[ 23.345719] serial 00:02: disabled
[ 23.349342] serial 00:01: disabled
[ 23.353782] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 23.358993] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[ 23.361635] ata1.00: Entering standby power mode
[ 23.368863] ata2.00: Entering standby power mode
[ 23.445187] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 23.452194] #PF: supervisor instruction fetch in kernel mode
[ 23.457896] #PF: error_code(0x0010) - not-present page
[ 23.463065] PGD 0 P4D 0
[ 23.465640] Oops: Oops: 0010 [#1] SMP NOPTI
[ 23.469869] CPU: 8 UID: 0 PID: 211 Comm: kworker/u48:18 Tainted: G S W 6.19.0-rc4-00020-gf0b9d8eb98df #10 PREEMPT(voluntary)
[ 23.482512] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[ 23.496511] Workqueue: async async_run_entry_fn
[ 23.501087] RIP: 0010:0x0
[ 23.503755] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 23.510324] RSP: 0018:ffffb4a60065fca8 EFLAGS: 00010246
[ 23.515592] RAX: 0000000000000000 RBX: ffff9f428290e000 RCX: 000000000000000f
[ 23.522765] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff9f428290e000
[ 23.529937] RBP: ffff9f4282907070 R08: ffff9f4281130428 R09: 00000000ffffffff
[ 23.537111] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9f42829070f8
[ 23.544284] R13: ffff9f4282906028 R14: ffff9f4282900000 R15: ffff9f4282906b68
[ 23.551457] FS: 0000000000000000(0000) GS:ffff9f466b2cf000(0000) knlGS:0000000000000000
[ 23.559588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 23.565365] CR2: ffffffffffffffd6 CR3: 000000031c230001 CR4: 0000000000f70ef0
[ 23.572539] PKRU: 55555554
[ 23.575281] Call Trace:
[ 23.577770] <TASK>
[ 23.579905] intel_engines_reset_default_submission+0x42/0x60
[ 23.585695] __intel_gt_unset_wedged+0x191/0x200
[ 23.590360] intel_gt_unset_wedged+0x20/0x40
[ 23.594675] gt_sanitize+0x15e/0x170
[ 23.598290] i915_gem_suspend_late+0x6b/0x180
[ 23.602692] i915_drm_suspend_late+0x35/0xf0
[ 23.607008] ? __pfx_pci_pm_suspend_late+0x10/0x10
[ 23.611843] dpm_run_callback+0x78/0x1c0
[ 23.615817] device_suspend_late+0xde/0x2e0
[ 23.620037] async_suspend_late+0x18/0x30
[ 23.624082] async_run_entry_fn+0x25/0xa0
[ 23.628129] process_one_work+0x15b/0x380
[ 23.632182] worker_thread+0x2a5/0x3c0
[ 23.635973] ? __pfx_worker_thread+0x10/0x10
[ 23.640279] kthread+0xf6/0x1f0
[ 23.643464] ? __pfx_kthread+0x10/0x10
[ 23.647263] ? __pfx_kthread+0x10/0x10
[ 23.651045] ret_from_fork+0x131/0x190
[ 23.654837] ? __pfx_kthread+0x10/0x10
[ 23.658634] ret_from_fork_asm+0x1a/0x30
[ 23.662597] </TASK>
[ 23.664826] Modules linked in:
[ 23.667914] CR2: 0000000000000000
[ 23.671271] ------------[ cut here ]------------
(cherry picked from commit daa199abc3d3d1740c9e3a2c3e9216ae5b447cad) |
| In the Linux kernel, the following vulnerability has been resolved:
firmware: arm_scmi: Fix NULL dereference on notify error path
Since commit b5daf93b809d1 ("firmware: arm_scmi: Avoid notifier
registration for unsupported events") the call chains leading to the helper
__scmi_event_handler_get_ops expect an ERR_PTR to be returned on failure to
get an handler for the requested event key, while the current helper can
still return a NULL when no handler could be found or created.
Fix by forcing an ERR_PTR return value when the handler reference is NULL. |
| In the Linux kernel, the following vulnerability has been resolved:
i2c: cp2615: fix serial string NULL-deref at probe
The cp2615 driver uses the USB device serial string as the i2c adapter
name but does not make sure that the string exists.
Verify that the device has a serial number before accessing it to avoid
triggering a NULL-pointer dereference (e.g. with malicious devices). |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: Fix static_branch_dec() underflow for aql_disable.
syzbot reported static_branch_dec() underflow in aql_enable_write(). [0]
The problem is that aql_enable_write() does not serialise concurrent
write()s to the debugfs.
aql_enable_write() checks static_key_false(&aql_disable.key) and
later calls static_branch_inc() or static_branch_dec(), but the
state may change between the two calls.
aql_disable does not need to track inc/dec.
Let's use static_branch_enable() and static_branch_disable().
[0]:
val == 0
WARNING: kernel/jump_label.c:311 at __static_key_slow_dec_cpuslocked.part.0+0x107/0x120 kernel/jump_label.c:311, CPU#0: syz.1.3155/20288
Modules linked in:
CPU: 0 UID: 0 PID: 20288 Comm: syz.1.3155 Tainted: G U L syzkaller #0 PREEMPT(full)
Tainted: [U]=USER, [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
RIP: 0010:__static_key_slow_dec_cpuslocked.part.0+0x107/0x120 kernel/jump_label.c:311
Code: f2 c9 ff 5b 5d c3 cc cc cc cc e8 54 f2 c9 ff 48 89 df e8 ac f9 ff ff eb ad e8 45 f2 c9 ff 90 0f 0b 90 eb a2 e8 3a f2 c9 ff 90 <0f> 0b 90 eb 97 48 89 df e8 5c 4b 33 00 e9 36 ff ff ff 0f 1f 80 00
RSP: 0018:ffffc9000b9f7c10 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffffff9b3e5d40 RCX: ffffffff823c57b4
RDX: ffff8880285a0000 RSI: ffffffff823c5846 RDI: ffff8880285a0000
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
R13: 1ffff9200173ef88 R14: 0000000000000001 R15: ffffc9000b9f7e98
FS: 00007f530dd726c0(0000) GS:ffff8881245e3000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000001140 CR3: 000000007cc4a000 CR4: 00000000003526f0
Call Trace:
<TASK>
__static_key_slow_dec_cpuslocked kernel/jump_label.c:297 [inline]
__static_key_slow_dec kernel/jump_label.c:321 [inline]
static_key_slow_dec+0x7c/0xc0 kernel/jump_label.c:336
aql_enable_write+0x2b2/0x310 net/mac80211/debugfs.c:343
short_proxy_write+0x133/0x1a0 fs/debugfs/file.c:383
vfs_write+0x2aa/0x1070 fs/read_write.c:684
ksys_pwrite64 fs/read_write.c:793 [inline]
__do_sys_pwrite64 fs/read_write.c:801 [inline]
__se_sys_pwrite64 fs/read_write.c:798 [inline]
__x64_sys_pwrite64+0x1eb/0x250 fs/read_write.c:798
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xc9/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f530cf9aeb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f530dd72028 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 00007f530d215fa0 RCX: 00007f530cf9aeb9
RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000010
RBP: 00007f530d008c1f R08: 0000000000000000 R09: 0000000000000000
R10: 4200000000000005 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f530d216038 R14: 00007f530d215fa0 R15: 00007ffde89fb978
</TASK> |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: x86: Use scratch field in MMIO fragment to hold small write values
When exiting to userspace to service an emulated MMIO write, copy the
to-be-written value to a scratch field in the MMIO fragment if the size
of the data payload is 8 bytes or less, i.e. can fit in a single chunk,
instead of pointing the fragment directly at the source value.
This fixes a class of use-after-free bugs that occur when the emulator
initiates a write using an on-stack, local variable as the source, the
write splits a page boundary, *and* both pages are MMIO pages. Because
KVM's ABI only allows for physically contiguous MMIO requests, accesses
that split MMIO pages are separated into two fragments, and are sent to
userspace one at a time. When KVM attempts to complete userspace MMIO in
response to KVM_RUN after the first fragment, KVM will detect the second
fragment and generate a second userspace exit, and reference the on-stack
variable.
The issue is most visible if the second KVM_RUN is performed by a separate
task, in which case the stack of the initiating task can show up as truly
freed data.
==================================================================
BUG: KASAN: use-after-free in complete_emulated_mmio+0x305/0x420
Read of size 1 at addr ffff888009c378d1 by task syz-executor417/984
CPU: 1 PID: 984 Comm: syz-executor417 Not tainted 5.10.0-182.0.0.95.h2627.eulerosv2r13.x86_64 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace:
dump_stack+0xbe/0xfd
print_address_description.constprop.0+0x19/0x170
__kasan_report.cold+0x6c/0x84
kasan_report+0x3a/0x50
check_memory_region+0xfd/0x1f0
memcpy+0x20/0x60
complete_emulated_mmio+0x305/0x420
kvm_arch_vcpu_ioctl_run+0x63f/0x6d0
kvm_vcpu_ioctl+0x413/0xb20
__se_sys_ioctl+0x111/0x160
do_syscall_64+0x30/0x40
entry_SYSCALL_64_after_hwframe+0x67/0xd1
RIP: 0033:0x42477d
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faa8e6890e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004d7338 RCX: 000000000042477d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
RBP: 00000000004d7330 R08: 00007fff28d546df R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d733c
R13: 0000000000000000 R14: 000000000040a200 R15: 00007fff28d54720
The buggy address belongs to the page:
page:0000000029f6a428 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9c37
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0000000 0000000000000000 ffffea0000270dc8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888009c37780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888009c37880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff888009c37900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
The bug can also be reproduced with a targeted KVM-Unit-Test by hacking
KVM to fill a large on-stack variable in complete_emulated_mmio(), i.e. by
overwrite the data value with garbage.
Limit the use of the scratch fields to 8-byte or smaller accesses, and to
just writes, as larger accesses and reads are not affected thanks to
implementation details in the emulator, but add a sanity check to ensure
those details don't change in the future. Specifically, KVM never uses
on-stack variables for accesses larger that 8 bytes, e.g. uses an operand
in the emulator context, and *al
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
mm: call ->free_folio() directly in folio_unmap_invalidate()
We can only call filemap_free_folio() if we have a reference to (or hold a
lock on) the mapping. Otherwise, we've already removed the folio from the
mapping so it no longer pins the mapping and the mapping can be removed,
causing a use-after-free when accessing mapping->a_ops.
Follow the same pattern as __remove_mapping() and load the free_folio
function pointer before dropping the lock on the mapping. That lets us
make filemap_free_folio() static as this was the only caller outside
filemap.c. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock
Take and hold kvm->lock for before checking sev_guest() in
sev_mem_enc_register_region(), as sev_guest() isn't stable unless kvm->lock
is held (or KVM can guarantee KVM_SEV_INIT{2} has completed and can't
rollack state). If KVM_SEV_INIT{2} fails, KVM can end up trying to add to
a not-yet-initialized sev->regions_list, e.g. triggering a #GP
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 110 UID: 0 PID: 72717 Comm: syz.15.11462 Tainted: G U W O 6.16.0-smp-DEV #1 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024
RIP: 0010:sev_mem_enc_register_region+0x3f0/0x4f0 ../include/linux/list.h:83
Code: <41> 80 3c 04 00 74 08 4c 89 ff e8 f1 c7 a2 00 49 39 ed 0f 84 c6 00
RSP: 0018:ffff88838647fbb8 EFLAGS: 00010256
RAX: dffffc0000000000 RBX: 1ffff92015cf1e0b RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff888367870000
RBP: ffffc900ae78f050 R08: ffffea000d9e0007 R09: 1ffffd4001b3c000
R10: dffffc0000000000 R11: fffff94001b3c001 R12: 0000000000000000
R13: ffff8982ab0bde00 R14: ffffc900ae78f058 R15: 0000000000000000
FS: 00007f34e9dc66c0(0000) GS:ffff89ee64d33000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe180adef98 CR3: 000000047210e000 CR4: 0000000000350ef0
Call Trace:
<TASK>
kvm_arch_vm_ioctl+0xa72/0x1240 ../arch/x86/kvm/x86.c:7371
kvm_vm_ioctl+0x649/0x990 ../virt/kvm/kvm_main.c:5363
__se_sys_ioctl+0x101/0x170 ../fs/ioctl.c:51
do_syscall_x64 ../arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x6f/0x1f0 ../arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f34e9f7e9a9
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f34e9dc6038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f34ea1a6080 RCX: 00007f34e9f7e9a9
RDX: 0000200000000280 RSI: 000000008010aebb RDI: 0000000000000007
RBP: 00007f34ea000d69 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f34ea1a6080 R15: 00007ffce77197a8
</TASK>
with a syzlang reproducer that looks like:
syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000040)={0x0, &(0x7f0000000180)=ANY=[], 0x70}) (async)
syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000080)={0x0, &(0x7f0000000180)=ANY=[@ANYBLOB="..."], 0x4f}) (async)
r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000200), 0x0, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
r2 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000240), 0x0, 0x0)
r3 = ioctl$KVM_CREATE_VM(r2, 0xae01, 0x0)
ioctl$KVM_SET_CLOCK(r3, 0xc008aeba, &(0x7f0000000040)={0x1, 0x8, 0x0, 0x5625e9b0}) (async)
ioctl$KVM_SET_PIT2(r3, 0x8010aebb, &(0x7f0000000280)={[...], 0x5}) (async)
ioctl$KVM_SET_PIT2(r1, 0x4070aea0, 0x0) (async)
r4 = ioctl$KVM_CREATE_VM(0xffffffffffffffff, 0xae01, 0x0)
openat$kvm(0xffffffffffffff9c, 0x0, 0x0, 0x0) (async)
ioctl$KVM_SET_USER_MEMORY_REGION(r4, 0x4020ae46, &(0x7f0000000400)={0x0, 0x0, 0x0, 0x2000, &(0x7f0000001000/0x2000)=nil}) (async)
r5 = ioctl$KVM_CREATE_VCPU(r4, 0xae41, 0x2)
close(r0) (async)
openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x8000, 0x0) (async)
ioctl$KVM_SET_GUEST_DEBUG(r5, 0x4048ae9b, &(0x7f0000000300)={0x4376ea830d46549b, 0x0, [0x46, 0x0, 0x0, 0x0, 0x0, 0x1000]}) (async)
ioctl$KVM_RUN(r5, 0xae80, 0x0)
Opportunistically use guard() to avoid having to define a new error label
and goto usage. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU
Reject synchronizing vCPU state to its associated VMSA if the vCPU has
already been launched, i.e. if the VMSA has already been encrypted. On a
host with SNP enabled, accessing guest-private memory generates an RMP #PF
and panics the host.
BUG: unable to handle page fault for address: ff1276cbfdf36000
#PF: supervisor write access in kernel mode
#PF: error_code(0x80000003) - RMP violation
PGD 5a31801067 P4D 5a31802067 PUD 40ccfb5063 PMD 40e5954063 PTE 80000040fdf36163
SEV-SNP: PFN 0x40fdf36, RMP entry: [0x6010fffffffff001 - 0x000000000000001f]
Oops: Oops: 0003 [#1] SMP NOPTI
CPU: 33 UID: 0 PID: 996180 Comm: qemu-system-x86 Tainted: G OE
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. PowerEdge R7625/0H1TJT, BIOS 1.5.8 07/21/2023
RIP: 0010:sev_es_sync_vmsa+0x54/0x4c0 [kvm_amd]
Call Trace:
<TASK>
snp_launch_update_vmsa+0x19d/0x290 [kvm_amd]
snp_launch_finish+0xb6/0x380 [kvm_amd]
sev_mem_enc_ioctl+0x14e/0x720 [kvm_amd]
kvm_arch_vm_ioctl+0x837/0xcf0 [kvm]
kvm_vm_ioctl+0x3fd/0xcc0 [kvm]
__x64_sys_ioctl+0xa3/0x100
x64_sys_call+0xfe0/0x2350
do_syscall_64+0x81/0x10f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7ffff673287d
</TASK>
Note, the KVM flaw has been present since commit ad73109ae7ec ("KVM: SVM:
Provide support to launch and run an SEV-ES guest"), but has only been
actively dangerous for the host since SNP support was added. With SEV-ES,
KVM would "just" clobber guest state, which is totally fine from a host
kernel perspective since userspace can clobber guest state any time before
sev_launch_update_vmsa(). |