In the Linux kernel, the following vulnerability has been resolved:
tty: pcn_uart: fix memory leak with using debugfs_lookup()
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
In the Linux kernel, the following vulnerability has been resolved:
serial: 8250: Reinit port->pm on port specific driver unbind
When we unbind a serial port hardware specific 8250 driver, the generic
serial8250 driver takes over the port. After that we see an oops about 10
seconds later. This can produce the following at least on some TI SoCs:
Unhandled fault: imprecise external abort (0x1406)
Internal error: : 1406 [#1] SMP ARM
Turns out that we may still have the serial port hardware specific driver
port->pm in use, and serial8250_pm() tries to call it after the port
specific driver is gone:
serial8250_pm [8250_base] from uart_change_pm+0x54/0x8c [serial_base]
uart_change_pm [serial_base] from uart_hangup+0x154/0x198 [serial_base]
uart_hangup [serial_base] from __tty_hangup.part.0+0x328/0x37c
__tty_hangup.part.0 from disassociate_ctty+0x154/0x20c
disassociate_ctty from do_exit+0x744/0xaac
do_exit from do_group_exit+0x40/0x8c
do_group_exit from __wake_up_parent+0x0/0x1c
Let's fix the issue by calling serial8250_set_defaults() in
serial8250_unregister_port(). This will set the port back to using
the serial8250 default functions, and sets the port->pm to point to
serial8250_pm.
In the Linux kernel, the following vulnerability has been resolved:
igb: Fix igb_down hung on surprise removal
In a setup where a Thunderbolt hub connects to Ethernet and a display
through USB Type-C, users may experience a hung task timeout when they
remove the cable between the PC and the Thunderbolt hub.
This is because the igb_down function is called multiple times when
the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
triggers the first call, and the igb_remove triggers the second call.
The second call to igb_down will block at napi_synchronize.
Here's the call trace:
__schedule+0x3b0/0xddb
? __mod_timer+0x164/0x5d3
schedule+0x44/0xa8
schedule_timeout+0xb2/0x2a4
? run_local_timers+0x4e/0x4e
msleep+0x31/0x38
igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
__igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
__dev_close_many+0x95/0xec
dev_close_many+0x6e/0x103
unregister_netdevice_many+0x105/0x5b1
unregister_netdevice_queue+0xc2/0x10d
unregister_netdev+0x1c/0x23
igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
pci_device_remove+0x3f/0x9c
device_release_driver_internal+0xfe/0x1b4
pci_stop_bus_device+0x5b/0x7f
pci_stop_bus_device+0x30/0x7f
pci_stop_bus_device+0x30/0x7f
pci_stop_and_remove_bus_device+0x12/0x19
pciehp_unconfigure_device+0x76/0xe9
pciehp_disable_slot+0x6e/0x131
pciehp_handle_presence_or_link_change+0x7a/0x3f7
pciehp_ist+0xbe/0x194
irq_thread_fn+0x22/0x4d
? irq_thread+0x1fd/0x1fd
irq_thread+0x17b/0x1fd
? irq_forced_thread_fn+0x5f/0x5f
kthread+0x142/0x153
? __irq_get_irqchip_state+0x46/0x46
? kthread_associate_blkcg+0x71/0x71
ret_from_fork+0x1f/0x30
In this case, igb_io_error_detected detaches the network interface
and requests a PCIE slot reset, however, the PCIE reset callback is
not being invoked and thus the Ethernet connection breaks down.
As the PCIE error in this case is a non-fatal one, requesting a
slot reset can be avoided.
This patch fixes the task hung issue and preserves Ethernet
connection by ignoring non-fatal PCIE errors.
In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Pointer may be dereferenced
Klocwork tool reported pointer 'rport' returned from call to function
fc_bsg_to_rport() may be NULL and will be dereferenced.
Add a fix to validate rport before dereferencing.
In the Linux kernel, the following vulnerability has been resolved:
md/raid10: prevent soft lockup while flush writes
Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.
Follow up soft lockup can be triggered easily with writeback test for
raid10 with ramdisks:
watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
<TASK>
call_rcu+0x16/0x20
put_object+0x41/0x80
__delete_object+0x50/0x90
delete_object_full+0x2b/0x40
kmemleak_free+0x46/0xa0
slab_free_freelist_hook.constprop.0+0xed/0x1a0
kmem_cache_free+0xfd/0x300
mempool_free_slab+0x1f/0x30
mempool_free+0x3a/0x100
bio_free+0x59/0x80
bio_put+0xcf/0x2c0
free_r10bio+0xbf/0xf0
raid_end_bio_io+0x78/0xb0
one_write_done+0x8a/0xa0
raid10_end_write_request+0x1b4/0x430
bio_endio+0x175/0x320
brd_submit_bio+0x3b9/0x9b7 [brd]
__submit_bio+0x69/0xe0
submit_bio_noacct_nocheck+0x1e6/0x5a0
submit_bio_noacct+0x38c/0x7e0
flush_pending_writes+0xf0/0x240
raid10d+0xac/0x1ed0
Fix the problem by adding cond_resched() to raid10 like what raid1 did.
Note that unlimited plugged bio still need to be optimized, for example,
in the case of lots of dirty pages writeback, this will take lots of
memory and io will spend a long time in plug, hence io latency is bad.
In the Linux kernel, the following vulnerability has been resolved:
bpf: make sure skb->len != 0 when redirecting to a tunneling device
syzkaller managed to trigger another case where skb->len == 0
when we enter __dev_queue_xmit:
WARNING: CPU: 0 PID: 2470 at include/linux/skbuff.h:2576 skb_assert_len include/linux/skbuff.h:2576 [inline]
WARNING: CPU: 0 PID: 2470 at include/linux/skbuff.h:2576 __dev_queue_xmit+0x2069/0x35e0 net/core/dev.c:4295
Call Trace:
dev_queue_xmit+0x17/0x20 net/core/dev.c:4406
__bpf_tx_skb net/core/filter.c:2115 [inline]
__bpf_redirect_no_mac net/core/filter.c:2140 [inline]
__bpf_redirect+0x5fb/0xda0 net/core/filter.c:2163
____bpf_clone_redirect net/core/filter.c:2447 [inline]
bpf_clone_redirect+0x247/0x390 net/core/filter.c:2419
bpf_prog_48159a89cb4a9a16+0x59/0x5e
bpf_dispatcher_nop_func include/linux/bpf.h:897 [inline]
__bpf_prog_run include/linux/filter.h:596 [inline]
bpf_prog_run include/linux/filter.h:603 [inline]
bpf_test_run+0x46c/0x890 net/bpf/test_run.c:402
bpf_prog_test_run_skb+0xbdc/0x14c0 net/bpf/test_run.c:1170
bpf_prog_test_run+0x345/0x3c0 kernel/bpf/syscall.c:3648
__sys_bpf+0x43a/0x6c0 kernel/bpf/syscall.c:5005
__do_sys_bpf kernel/bpf/syscall.c:5091 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5089 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5089
do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x61/0xc6
The reproducer doesn't really reproduce outside of syzkaller
environment, so I'm taking a guess here. It looks like we
do generate correct ETH_HLEN-sized packet, but we redirect
the packet to the tunneling device. Before we do so, we
__skb_pull l2 header and arrive again at skb->len == 0.
Doesn't seem like we can do anything better than having
an explicit check after __skb_pull?
In the Linux kernel, the following vulnerability has been resolved:
drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/gpu/drm/sti/sti_hda.c:637:16: error: incompatible function pointer types initializing 'enum drm_mode_status (*)(struct drm_connector *, struct drm_display_mode *)' with an expression of type 'int (struct drm_connector *, struct drm_display_mode *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.mode_valid = sti_hda_connector_mode_valid,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sti/sti_dvo.c:376:16: error: incompatible function pointer types initializing 'enum drm_mode_status (*)(struct drm_connector *, struct drm_display_mode *)' with an expression of type 'int (struct drm_connector *, struct drm_display_mode *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.mode_valid = sti_dvo_connector_mode_valid,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/sti/sti_hdmi.c:1035:16: error: incompatible function pointer types initializing 'enum drm_mode_status (*)(struct drm_connector *, struct drm_display_mode *)' with an expression of type 'int (struct drm_connector *, struct drm_display_mode *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.mode_valid = sti_hdmi_connector_mode_valid,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
->mode_valid() in 'struct drm_connector_helper_funcs' expects a return
type of 'enum drm_mode_status', not 'int'. Adjust the return type of
sti_{dvo,hda,hdmi}_connector_mode_valid() to match the prototype's to
resolve the warning and CFI failure.
In the Linux kernel, the following vulnerability has been resolved:
regulator: core: fix use_count leakage when handling boot-on
I found a use_count leakage towards supply regulator of rdev with
boot-on option.
┌───────────────────┐ ┌───────────────────┐
│ regulator_dev A │ │ regulator_dev B │
│ (boot-on) │ │ (boot-on) │
│ use_count=0 │◀──supply──│ use_count=1 │
│ │ │ │
└───────────────────┘ └───────────────────┘
In case of rdev(A) configured with `regulator-boot-on', the use_count
of supplying regulator(B) will increment inside
regulator_enable(rdev->supply).
Thus, B will acts like always-on, and further balanced
regulator_enable/disable cannot actually disable it anymore.
However, B was also configured with `regulator-boot-on', we wish it
could be disabled afterwards.
In the Linux kernel, the following vulnerability has been resolved:
igb: Do not free q_vector unless new one was allocated
Avoid potential use-after-free condition under memory pressure. If the
kzalloc() fails, q_vector will be freed but left in the original
adapter->q_vector[v_idx] array position.