Changelog in Linux kernel 6.6.58

 
ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
Author: Vasiliy Kovalev <[email protected]>
Date:   Wed Oct 9 16:42:48 2024 +0300

    ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2
    
    commit 9988844c457f6f17fb2e75aa000b6c3b1b673bb9 upstream.
    
    There is a problem with simultaneous audio output to headphones and
    speakers, and when headphones are turned off, the speakers also turn
    off and do not turn back on.
    
    However, it was found that if you boot linux immediately after windows,
    there are no such problems. When comparing alsa-info, the only difference
    is the different configuration of Node 0x1d:
    
    working conf. (windows): Pin-ctls: 0x80: HP
    not working     (linux): Pin-ctls: 0xc0: OUT HP
    
    This patch disables the AC_PINCTL_OUT_EN bit of Node 0x1d and fixes the
    described problem.
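    
    As a rough illustration of the bit arithmetic described above, the
    sketch below clears the output-enable bit and leaves the headphone bit
    set, turning 0xc0 into 0x80. The bit values follow the HD Audio pin
    widget control layout; the helper name is hypothetical and this is not
    the driver's actual code.

```c
#include <stdint.h>

/* Pin widget control bits as laid out by the HD Audio specification. */
#define AC_PINCTL_OUT_EN (1u << 6) /* 0x40: output enable */
#define AC_PINCTL_HP_EN  (1u << 7) /* 0x80: headphone amplifier enable */

/* Hypothetical helper: clear OUT_EN and leave every other bit untouched. */
uint8_t pinctl_without_out_en(uint8_t pin_ctl)
{
	return (uint8_t)(pin_ctl & ~AC_PINCTL_OUT_EN);
}
```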
    
    Signed-off-by: Vasiliy Kovalev <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
Author: Vasiliy Kovalev <[email protected]>
Date:   Wed Oct 16 11:07:13 2024 +0300

    ALSA: hda/conexant - Use cached pin control for Node 0x1d on HP EliteOne 1000 G2
    
    commit 164cd0e077a18d6208523c82b102c98c77fdd51f upstream.
    
    The cached version avoids redundant commands to the codec, improving
    stability and reducing unnecessary operations. This change ensures
    better power management and reliable restoration of pin configurations,
    especially after hibernation (S4) and other power transitions.
    
    Fixes: 9988844c457f ("ALSA: hda/conexant - Fix audio routing for HP EliteOne 1000 G2")
    Suggested-by: Kai-Heng Feng <[email protected]>
    Suggested-by: Takashi Iwai <[email protected]>
    Signed-off-by: Vasiliy Kovalev <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
arm64: probes: Fix simulate_ldr*_literal()
Author: Mark Rutland <[email protected]>
Date:   Tue Oct 8 16:58:47 2024 +0100

    arm64: probes: Fix simulate_ldr*_literal()
    
    commit 50f813e57601c22b6f26ced3193b9b94d70a2640 upstream.
    
    The simulate_ldr_literal() code always loads a 64-bit quantity, and when
    simulating a 32-bit load into a 'W' register, it discards the most
    significant 32 bits. For big-endian kernels this means that the relevant
    bits are discarded, and the value returned is the subsequent 32 bits
    in memory (i.e. the value at addr + 4).
    
    Additionally, simulate_ldr_literal() and simulate_ldrsw_literal() use a
    plain C load, which the compiler may tear or elide (e.g. if the target
    is the zero register). Today this doesn't happen to matter, but it may
    matter in future if trampoline code uses a LDR (literal) or LDRSW
    (literal).
    
    Update simulate_ldr_literal() and simulate_ldrsw_literal() to use an
    appropriately-sized READ_ONCE() to perform the access, which avoids
    these problems.
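    
    The truncation problem can be reproduced in user space with a small
    sketch. The helpers below are hypothetical and model big-endian loads
    by hand so the result does not depend on the host; they are not the
    kernel's simulate_ldr_literal() code.

```c
#include <stdint.h>

/* Interpret 8 bytes at p the way a big-endian CPU would load a u64. */
uint64_t load_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Interpret 4 bytes at p the way a big-endian CPU would load a u32. */
uint32_t load_be32(const uint8_t *p)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Old behaviour: 64-bit load, then keep the low 32 bits for a 'W' register. */
uint32_t buggy_ldr_w(const uint8_t *addr)
{
	return (uint32_t)load_be64(addr);
}

/* Fixed behaviour: a load sized to the destination register. */
uint32_t fixed_ldr_w(const uint8_t *addr)
{
	return load_be32(addr);
}
```

    With this model, the buggy variant returns the word stored at addr + 4,
    while the appropriately-sized load returns the word at addr.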
    
    Fixes: 39a67d49ba35 ("arm64: kprobes instruction simulation support")
    Cc: [email protected]
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Catalin Marinas <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: probes: Fix uprobes for big-endian kernels
Author: Mark Rutland <[email protected]>
Date:   Tue Oct 8 16:58:48 2024 +0100

    arm64: probes: Fix uprobes for big-endian kernels
    
    commit 13f8f1e05f1dc36dbba6cba0ae03354c0dafcde7 upstream.
    
    The arm64 uprobes code is broken for big-endian kernels as it doesn't
    convert the in-memory instruction encoding (which is always
    little-endian) into the kernel's native endianness before analyzing and
    simulating instructions. This may result in a few distinct problems:
    
    * The kernel may erroneously reject probing an instruction which can
      safely be probed.
    
    * The kernel may erroneously permit stepping an
      instruction out-of-line when that instruction cannot be stepped
      out-of-line safely.
    
    * The kernel may simulate an instruction incorrectly due to
      interpreting the byte-swapped encoding.
    
    The endianness mismatch isn't caught by the compiler or sparse because:
    
    * The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so
      the compiler and sparse have no idea these contain a little-endian
      32-bit value. The core uprobes code populates these with a memcpy()
      which similarly does not handle endianness.
    
    * While the uprobe_opcode_t type is an alias for __le32, both
      arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[]
      to the similarly-named probe_opcode_t, which is an alias for u32.
      Hence there is no endianness conversion warning.
    
    Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and
    adding the appropriate __le32_to_cpu() conversions prior to consuming
    the instruction encoding. The core uprobes copies these fields as opaque
    ranges of bytes, and so is unaffected by this change.
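    
    To illustrate why the conversion matters, the sketch below decodes the
    same four instruction bytes both ways. The helper names are
    hypothetical; the ADRP mask and value (0x9F000000 / 0x90000000) match
    the A64 encoding, and the byte sequence is the ADRP from the test
    program in this commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* A64 instructions are stored little-endian regardless of data endianness. */
uint32_t insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* What a big-endian kernel sees if it reads the same bytes as a native u32. */
uint32_t insn_from_be_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
	       (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

/* ADRP has bit 31 set and bits 28:24 equal to 0b10000. */
bool insn_is_adrp(uint32_t insn)
{
	return (insn & 0x9F000000u) == 0x90000000u;
}
```

    Decoded as little-endian, the bytes 00 00 00 90 yield 0x90000000 and
    are recognized as ADRP; read as a native big-endian word they yield
    0x00000090 and are not.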
    
    At the same time, remove MAX_UINSN_BYTES and consistently use
    AARCH64_INSN_SIZE for clarity.
    
    Tested with the following:
    
    | #include <stdio.h>
    | #include <stdbool.h>
    |
    | #define noinline __attribute__((noinline))
    |
    | static noinline void *adrp_self(void)
    | {
    |         void *addr;
    |
    |         asm volatile(
    |         "       adrp    %x0, adrp_self\n"
    |         "       add     %x0, %x0, :lo12:adrp_self\n"
    |         : "=r" (addr));
    |
    |         return addr;
    | }
    |
    |
    | int main(int argc, char *argv[])
    | {
    |         void *ptr = adrp_self();
    |         bool equal = (ptr == adrp_self);
    |
    |         printf("adrp_self   => %p\n"
    |                "adrp_self() => %p\n"
    |                "%s\n",
    |                adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
    |
    |         return 0;
    | }
    
    ... where the adrp_self() function was compiled to:
    
    | 00000000004007e0 <adrp_self>:
    |   4007e0:       90000000        adrp    x0, 400000 <__ehdr_start>
    |   4007e4:       911f8000        add     x0, x0, #0x7e0
    |   4007e8:       d65f03c0        ret
    
    Before this patch, the ADRP is not recognized, and is assumed to be
    steppable, resulting in corruption of the result:
    
    | # ./adrp-self
    | adrp_self   => 0x4007e0
    | adrp_self() => 0x4007e0
    | EQUAL
    | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
    | # echo 1 > /sys/kernel/tracing/events/uprobes/enable
    | # ./adrp-self
    | adrp_self   => 0x4007e0
    | adrp_self() => 0xffffffffff7e0
    | NOT EQUAL
    
    After this patch, the ADRP is correctly recognized and simulated:
    
    | # ./adrp-self
    | adrp_self   => 0x4007e0
    | adrp_self() => 0x4007e0
    | EQUAL
    | #
    | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
    | # echo 1 > /sys/kernel/tracing/events/uprobes/enable
    | # ./adrp-self
    | adrp_self   => 0x4007e0
    | adrp_self() => 0x4007e0
    | EQUAL
    
    Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
    Cc: [email protected]
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Catalin Marinas <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: probes: Remove broken LDR (literal) uprobe support
Author: Mark Rutland <[email protected]>
Date:   Tue Oct 8 16:58:46 2024 +0100

    arm64: probes: Remove broken LDR (literal) uprobe support
    
    commit acc450aa07099d071b18174c22a1119c57da8227 upstream.
    
    The simulate_ldr_literal() and simulate_ldrsw_literal() functions are
    unsafe to use for uprobes. Both functions were originally written for
    use with kprobes, and access memory with plain C accesses. When uprobes
    was added, these were reused unmodified even though they cannot safely
    access user memory.
    
    There are three key problems:
    
    1) The plain C accesses do not have corresponding extable entries, and
       thus if they encounter a fault the kernel will treat these as
       unintentional accesses to user memory, resulting in a BUG() which
       will kill the kernel thread, and likely lead to further issues (e.g.
       lockup or panic()).
    
    2) The plain C accesses are subject to HW PAN and SW PAN, and so when
       either is in use, any attempt to simulate an access to user memory
       will fault. Thus neither simulate_ldr_literal() nor
       simulate_ldrsw_literal() can do anything useful when simulating a
       user instruction on any system with HW PAN or SW PAN.
    
    3) The plain C accesses are privileged, as they run in kernel context,
       and in practice can access a small range of kernel virtual addresses.
       The instructions they simulate have a range of +/-1MiB, and since the
       simulated instruction must itself be a user instruction in the
       TTBR0 address range, these can address the final 1MiB of the TTBR1
       address range by wrapping downwards from an address in the first
       1MiB of the TTBR0 address range.
    
       In contemporary kernels the last 8MiB of TTBR1 address range is
       reserved, and accesses to this will always fault, meaning this is no
       worse than (1).
    
       Historically, it was theoretically possible for the linear map or
       vmemmap to spill into the final 8MiB of the TTBR1 address range, but
       in practice this is extremely unlikely to occur as this would
       require either:
    
       * Having enough physical memory to fill the entire linear map all the
         way to the final 1MiB of the TTBR1 address range.
    
       * Getting unlucky with KASLR randomization of the linear map such
         that the populated region happens to overlap with the last 1MiB of
         the TTBR1 address range.
    
       ... and in either case if we were to spill into the final page there
       would be larger problems as the final page would alias with error
       pointers.
    
    Practically speaking, (1) and (2) are the big issues. Given there have
    been no reports of problems since the broken code was introduced, it
    appears that no-one is relying on probing these instructions with
    uprobes.
    
    Avoid these issues by not allowing uprobes on LDR (literal) and LDRSW
    (literal), limiting the use of simulate_ldr_literal() and
    simulate_ldrsw_literal() to kprobes. Attempts to place uprobes on LDR
    (literal) and LDRSW (literal) will be rejected as
    arm_probe_decode_insn() will return INSN_REJECTED. In future we can
    consider introducing working uprobes support for these instructions, but
    this will require more significant work.
    
    Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
    Cc: [email protected]
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Catalin Marinas <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Author: Omar Sandoval <[email protected]>
Date:   Tue Oct 15 10:59:46 2024 -0700

    blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
    
    commit e972b08b91ef48488bae9789f03cfedb148667fb upstream.
    
    We're seeing crashes from rq_qos_wake_function that look like this:
    
      BUG: unable to handle page fault for address: ffffafe180a40084
      #PF: supervisor write access in kernel mode
      #PF: error_code(0x0002) - not-present page
      PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
      Oops: Oops: 0002 [#1] PREEMPT SMP PTI
      CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
      Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
      RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
      RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
      RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
      RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
      R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
      R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
      FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <IRQ>
       try_to_wake_up+0x5a/0x6a0
       rq_qos_wake_function+0x71/0x80
       __wake_up_common+0x75/0xa0
       __wake_up+0x36/0x60
       scale_up.part.0+0x50/0x110
       wb_timer_fn+0x227/0x450
       ...
    
    So rq_qos_wake_function() calls wake_up_process(data->task), which calls
    try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
    
    p comes from data->task, and data comes from the waitqueue entry, which
    is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
    dump with drgn, I found that the waiter had already woken up and moved
    on to a completely unrelated code path, clobbering what was previously
    data->task. Meanwhile, the waker was passing the clobbered garbage in
    data->task to wake_up_process(), leading to the crash.
    
    What's happening is that in between rq_qos_wake_function() deleting the
    waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
    that it already got a token and returning. The race looks like this:
    
    rq_qos_wait()                           rq_qos_wake_function()
    ==============================================================
    prepare_to_wait_exclusive()
                                            data->got_token = true;
                                            list_del_init(&curr->entry);
    if (data.got_token)
            break;
    finish_wait(&rqw->wait, &data.wq);
      ^- returns immediately because
         list_empty_careful(&wq_entry->entry)
         is true
    ... return, go do something else ...
                                            wake_up_process(data->task)
                                              (NO LONGER VALID!)-^
    
    Normally, finish_wait() is supposed to synchronize against the waker.
    But, as noted above, it is returning immediately because the waitqueue
    entry has already been removed from the waitqueue.
    
    The bug is that rq_qos_wake_function() is accessing the waitqueue entry
    AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
    and THEN deletes the waitqueue entry, which is the proper order.
    
    Fix it by swapping the order. We also need to use
    list_del_init_careful() to match the list_empty_careful() in
    finish_wait().
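    
    The corrected ordering can be sketched in user space with C11 atomics.
    Everything below is a hypothetical stand-in (an int for the task, a
    flag for waitqueue membership, a release store and acquire load
    standing in for list_del_init_careful() and list_empty_careful()); it
    is not the block layer code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the on-stack wait data in rq_qos_wait(). */
struct wait_data {
	int task;            /* stand-in for the task_struct pointer */
	bool got_token;
	atomic_bool queued;  /* true while the entry is on the waitqueue */
};

/* Stand-in for wake_up_process(); records which task was woken. */
int last_woken;

void wake_task(int task)
{
	last_woken = task;
}

/* Waker, fixed order: finish every access to *data, then unlink it. */
void wake_function_fixed(struct wait_data *data)
{
	data->got_token = true;
	wake_task(data->task);
	/* Release: the waiter may reuse its stack once it observes this. */
	atomic_store_explicit(&data->queued, false, memory_order_release);
}

/* Waiter side: the entry may only be discarded once it is unlinked. */
bool waiter_may_return(struct wait_data *data)
{
	return !atomic_load_explicit(&data->queued, memory_order_acquire);
}
```

    Because the unlink is the waker's last access, a waiter that sees the
    entry as unlinked knows the waker no longer touches its stack.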
    
    Fixes: 38cfb5a45ee0 ("blk-wbt: improve waking of tasks")
    Cc: [email protected]
    Signed-off-by: Omar Sandoval <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Wed Oct 16 11:47:00 2024 -0400

    Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
    
    commit 2c1dda2acc4192d826e84008d963b528e24d12bc upstream.
    
    Fake CSR controllers don't seem to handle short transfers properly, which
    causes commands to time out:
    
    kernel: usb 1-1: new full-speed USB device number 19 using xhci_hcd
    kernel: usb 1-1: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=88.91
    kernel: usb 1-1: New USB device strings: Mfr=0, Product=2, SerialNumber=0
    kernel: usb 1-1: Product: BT DONGLE10
    ...
    Bluetooth: hci1: Opcode 0x1004 failed: -110
    kernel: Bluetooth: hci1: command 0x1004 tx timeout
    
    According to USB Spec 2.0 Section 5.7.3 Interrupt Transfer Packet Size
    Constraints, an interrupt transfer is considered complete when the size is 0
    (ZPL) or < wMaxPacketSize:
    
     'When an interrupt transfer involves more data than can fit in one
     data payload of the currently established maximum size, all data
     payloads are required to be maximum-sized except for the last data
     payload, which will contain the remaining data. An interrupt transfer
     is complete when the endpoint does one of the following:
    
     • Has transferred exactly the amount of data expected
     • Transfers a packet with a payload size less than wMaxPacketSize or
     transfers a zero-length packet'
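    
    The completion rule quoted above can be written as a small predicate.
    This is only a sketch of the specification text with hypothetical
    names and parameters; it is not the btusb implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate for the rule quoted from the USB 2.0 spec:
 * a transfer is complete once the expected amount has arrived, or once
 * a packet shorter than wMaxPacketSize (including zero length) is seen.
 */
bool interrupt_transfer_complete(size_t total_received, size_t expected,
				 size_t packet_len, size_t max_packet_size)
{
	if (total_received == expected)
		return true;

	/* A zero-length packet is also "less than wMaxPacketSize". */
	return packet_len < max_packet_size;
}
```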
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219365
    Fixes: 7b05933340f4 ("Bluetooth: btusb: Fix not handling ZPL/short-transfer")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: Call iso_exit() on module unload
Author: Aaron Thompson <[email protected]>
Date:   Fri Oct 4 23:04:09 2024 +0000

    Bluetooth: Call iso_exit() on module unload
    
    commit d458cd1221e9e56da3b2cc5518ad3225caa91f20 upstream.
    
    If iso_init() has been called, iso_exit() must be called on module
    unload. Without that, the struct proto that iso_init() registered with
    proto_register() becomes invalid, which could cause unpredictable
    problems later. In my case, with CONFIG_LIST_HARDENED and
    CONFIG_BUG_ON_DATA_CORRUPTION enabled, loading the module again usually
    triggers this BUG():
    
      list_add corruption. next->prev should be prev (ffffffffb5355fd0),
        but was 0000000000000068. (next=ffffffffc0a010d0).
      ------------[ cut here ]------------
      kernel BUG at lib/list_debug.c:29!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 1 PID: 4159 Comm: modprobe Not tainted 6.10.11-4+bt2-ao-desktop #1
      RIP: 0010:__list_add_valid_or_report+0x61/0xa0
      ...
        __list_add_valid_or_report+0x61/0xa0
        proto_register+0x299/0x320
        hci_sock_init+0x16/0xc0 [bluetooth]
        bt_init+0x68/0xd0 [bluetooth]
        __pfx_bt_init+0x10/0x10 [bluetooth]
        do_one_initcall+0x80/0x2f0
        do_init_module+0x8b/0x230
        __do_sys_init_module+0x15f/0x190
        do_syscall_64+0x68/0x110
      ...
    
    Cc: [email protected]
    Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
    Signed-off-by: Aaron Thompson <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: ISO: Fix multiple init when debugfs is disabled
Author: Aaron Thompson <[email protected]>
Date:   Fri Oct 4 23:04:08 2024 +0000

    Bluetooth: ISO: Fix multiple init when debugfs is disabled
    
    commit a9b7b535ba192c6b77e6c15a4c82d853163eab8c upstream.
    
    If bt_debugfs is not created successfully, which happens if either
    CONFIG_DEBUG_FS or CONFIG_DEBUG_FS_ALLOW_ALL is unset, then iso_init()
    returns early and does not set iso_inited to true. This means that a
    subsequent call to iso_init() will result in duplicate calls to
    proto_register(), bt_sock_register(), etc.
    
    With CONFIG_LIST_HARDENED and CONFIG_BUG_ON_DATA_CORRUPTION enabled, the
    duplicate call to proto_register() triggers this BUG():
    
      list_add double add: new=ffffffffc0b280d0, prev=ffffffffbab56250,
        next=ffffffffc0b280d0.
      ------------[ cut here ]------------
      kernel BUG at lib/list_debug.c:35!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 2 PID: 887 Comm: bluetoothd Not tainted 6.10.11-1-ao-desktop #1
      RIP: 0010:__list_add_valid_or_report+0x9a/0xa0
      ...
        __list_add_valid_or_report+0x9a/0xa0
        proto_register+0x2b5/0x340
        iso_init+0x23/0x150 [bluetooth]
        set_iso_socket_func+0x68/0x1b0 [bluetooth]
        kmem_cache_free+0x308/0x330
        hci_sock_sendmsg+0x990/0x9e0 [bluetooth]
        __sock_sendmsg+0x7b/0x80
        sock_write_iter+0x9a/0x110
        do_iter_readv_writev+0x11d/0x220
        vfs_writev+0x180/0x3e0
        do_writev+0xca/0x100
      ...
    
    This change removes the early return. The check for iso_debugfs being
    NULL was unnecessary because it is always NULL when iso_inited is false.
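    
    A minimal sketch of the resulting init guard, assuming hypothetical
    stand-ins for the registration call and the debugfs root: the flag is
    set on every successful init, whether or not debugfs is available.

```c
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the Bluetooth code. */
bool inited_sketch;
int register_calls;
int debugfs_entries;
void *debugfs_root_sketch; /* stays NULL when debugfs is unavailable */

int register_protocol_sketch(void)
{
	register_calls++;
	return 0;
}

int iso_init_sketch(void)
{
	int err;

	if (inited_sketch)
		return -1; /* already initialised; refuse a second init */

	err = register_protocol_sketch();
	if (err)
		return err;

	/* The debugfs entry is optional and must not gate the flag below. */
	if (debugfs_root_sketch)
		debugfs_entries++;

	inited_sketch = true;
	return 0;
}
```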
    
    Cc: [email protected]
    Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
    Signed-off-by: Aaron Thompson <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: Remove debugfs directory on module init failure
Author: Aaron Thompson <[email protected]>
Date:   Fri Oct 4 23:04:10 2024 +0000

    Bluetooth: Remove debugfs directory on module init failure
    
    commit 1db4564f101b47188c1b71696bd342ef09172b22 upstream.
    
    If bt_init() fails, the debugfs directory currently is not removed. If
    the module is loaded again after that, the debugfs directory is not set
    up properly due to the existing directory.
    
      # modprobe bluetooth
      # ls -laF /sys/kernel/debug/bluetooth
      total 0
      drwxr-xr-x  2 root root 0 Sep 27 14:26 ./
      drwx------ 31 root root 0 Sep 27 14:25 ../
      -r--r--r--  1 root root 0 Sep 27 14:26 l2cap
      -r--r--r--  1 root root 0 Sep 27 14:26 sco
      # modprobe -r bluetooth
      # ls -laF /sys/kernel/debug/bluetooth
      ls: cannot access '/sys/kernel/debug/bluetooth': No such file or directory
      #
    
      # modprobe bluetooth
      modprobe: ERROR: could not insert 'bluetooth': Invalid argument
      # dmesg | tail -n 6
      Bluetooth: Core ver 2.22
      NET: Registered PF_BLUETOOTH protocol family
      Bluetooth: HCI device and connection manager initialized
      Bluetooth: HCI socket layer initialized
      Bluetooth: Faking l2cap_init() failure for testing
      NET: Unregistered PF_BLUETOOTH protocol family
      # ls -laF /sys/kernel/debug/bluetooth
      total 0
      drwxr-xr-x  2 root root 0 Sep 27 14:31 ./
      drwx------ 31 root root 0 Sep 27 14:26 ../
      #
    
      # modprobe bluetooth
      # dmesg | tail -n 7
      Bluetooth: Core ver 2.22
      debugfs: Directory 'bluetooth' with parent '/' already present!
      NET: Registered PF_BLUETOOTH protocol family
      Bluetooth: HCI device and connection manager initialized
      Bluetooth: HCI socket layer initialized
      Bluetooth: L2CAP socket layer initialized
      Bluetooth: SCO socket layer initialized
      # ls -laF /sys/kernel/debug/bluetooth
      total 0
      drwxr-xr-x  2 root root 0 Sep 27 14:31 ./
      drwx------ 31 root root 0 Sep 27 14:26 ../
      #
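    
    The cleanup pattern this fix relies on can be sketched as follows. The
    functions are hypothetical stand-ins for the debugfs and sub-layer
    init calls; the point is only that the failure path removes what the
    success path created.

```c
/* Hypothetical stand-ins for the debugfs directory and a sub-layer init. */
int dir_exists;

int create_debug_dir(void)
{
	if (dir_exists)
		return -1; /* corresponds to the "already present" warning */
	dir_exists = 1;
	return 0;
}

void remove_debug_dir(void)
{
	dir_exists = 0;
}

int init_sublayer(int simulate_failure)
{
	return simulate_failure ? -1 : 0;
}

int module_init_sketch(int simulate_failure)
{
	int err;

	create_debug_dir();

	err = init_sublayer(simulate_failure);
	if (err)
		goto cleanup_dir;

	return 0;

cleanup_dir:
	/* Undo the directory creation so a later load starts clean. */
	remove_debug_dir();
	return err;
}
```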
    
    Cc: [email protected]
    Fixes: ffcecac6a738 ("Bluetooth: Create root debugfs directory during module init")
    Signed-off-by: Aaron Thompson <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
btrfs: fix uninitialized pointer free in add_inode_ref()
Author: Roi Martin <[email protected]>
Date:   Wed Oct 9 10:08:33 2024 +0200

    btrfs: fix uninitialized pointer free in add_inode_ref()
    
    commit 66691c6e2f18d2aa4b22ffb624b9bdc97e9979e4 upstream.
    
    The add_inode_ref() function does not initialize the "name" struct when
    it is declared.  If either of the following calls to read_one_inode()
    returns NULL,
    
            dir = read_one_inode(root, parent_objectid);
            if (!dir) {
                    ret = -ENOENT;
                    goto out;
            }
    
            inode = read_one_inode(root, inode_objectid);
            if (!inode) {
                    ret = -EIO;
                    goto out;
            }
    
    then "name.name" would be freed on "out" before being initialized.
    
    out:
            ...
            kfree(name.name);
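    
    One way to make such an early exit safe is to zero-initialize the
    struct, so the cleanup label frees a NULL pointer. The user-space
    sketch below uses a hypothetical struct and free() in place of
    kfree(); it illustrates the pattern rather than the btrfs code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the name struct used by add_inode_ref(). */
struct name_buf {
	char *name;
	size_t len;
};

int add_ref_sketch(int lookup_fails)
{
	struct name_buf name = { 0 }; /* name.name starts out as NULL */
	int ret = 0;

	if (lookup_fails) {
		ret = -1;
		goto out; /* taken before name.name is ever assigned */
	}

	name.name = malloc(4);
	if (!name.name) {
		ret = -1;
		goto out;
	}
	memcpy(name.name, "abc", 4);
	name.len = 3;

out:
	free(name.name); /* free(NULL) is a no-op, like kfree(NULL) */
	return ret;
}
```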
    
    This issue was reported by Coverity with CID 1526744.
    
    Fixes: e43eec81c516 ("btrfs: use struct qstr instead of name and namelen pairs")
    CC: [email protected] # 6.6+
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Roi Martin <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix uninitialized pointer free on read_alloc_one_name() error
Author: Roi Martin <[email protected]>
Date:   Thu Oct 10 21:47:17 2024 +0200

    btrfs: fix uninitialized pointer free on read_alloc_one_name() error
    
    commit 2ab5e243c2266c841e0f6904fad1514b18eaf510 upstream.
    
    The function read_alloc_one_name() does not initialize the name field of
    the passed fscrypt_str struct if kmalloc fails to allocate the
    corresponding buffer.  Thus, it is not guaranteed that
    fscrypt_str.name is initialized when freeing it.
    
    This is a follow-up to the linked patch that fixes the remaining
    instances of the bug introduced by commit e43eec81c516 ("btrfs: use
    struct qstr instead of name and namelen pairs").
    
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    Fixes: e43eec81c516 ("btrfs: use struct qstr instead of name and namelen pairs")
    CC: [email protected] # 6.1+
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Roi Martin <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/amdgpu/swsmu: Only force workload setup on init
Author: Alex Deucher <[email protected]>
Date:   Wed Oct 2 10:22:30 2024 -0400

    drm/amdgpu/swsmu: Only force workload setup on init
    
    commit cb07c8338fc2b9d5f949a19d4a07ee4d5ecf8793 upstream.
    
    Needed to set the workload type at init time so that
    we can apply the navi3x margin optimization.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
    Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init")
    Reviewed-by: Kenneth Feng <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 580ad7cbd4b7be8d2cb5ab5c1fca6bb76045eb0e)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/amdgpu: prevent BO_HANDLES error from being overwritten
Author: Mohammed Anees <[email protected]>
Date:   Wed Oct 9 17:58:31 2024 +0530

    drm/amdgpu: prevent BO_HANDLES error from being overwritten
    
    commit c0ec082f10b7a1fd25e8c1e2a686440da913b7a3 upstream.
    
    Before this patch, if multiple BO_HANDLES chunks were submitted,
    the error -EINVAL would be correctly set but could be overwritten
    by the return value from amdgpu_cs_p1_bo_handles(). This patch
    ensures that if there are multiple BO_HANDLES, we stop.
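    
    The control flow can be sketched as below. The chunk type, the
    handler, and the loop are hypothetical simplifications; the sketch
    only shows why returning at once keeps -EINVAL from being replaced by
    a later return value.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chunk kinds; only BO_HANDLES matters for this sketch. */
enum chunk_kind { CHUNK_OTHER, CHUNK_BO_HANDLES };

/* Stand-in for the per-chunk handler; always succeeds here. */
int handle_bo_handles_sketch(void)
{
	return 0;
}

int parse_chunks_sketch(const enum chunk_kind *chunks, size_t n)
{
	int seen = 0;
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (chunks[i] != CHUNK_BO_HANDLES)
			continue;

		/* Stop here so no later call can overwrite the error. */
		if (seen)
			return -EINVAL;
		seen = 1;

		ret = handle_bo_handles_sketch();
		if (ret)
			return ret;
	}
	return ret;
}
```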
    
    Fixes: fec5f8e8c6bc ("drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit")
    Signed-off-by: Mohammed Anees <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 40f2cd98828f454bdc5006ad3d94330a5ea164b7)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/radeon: Fix encoder->possible_clones
Author: Ville Syrjälä <[email protected]>
Date:   Mon Oct 14 19:09:36 2024 +0300

    drm/radeon: Fix encoder->possible_clones
    
    commit 28127dba64d8ae1a0b737b973d6d029908599611 upstream.
    
    Include the encoder itself in its possible_clones bitmask.
    In the past nothing validated that drivers were populating
    possible_clones correctly, but that changed in commit
    74d2aacbe840 ("drm: Validate encoder->possible_clones").
    Looks like radeon never got the memo and is still not
    following the rules 100% correctly.
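    
    The "include the encoder itself" rule can be sketched as a bit
    operation. The helper names and the encoder index used in the usage
    note are hypothetical; the sketch covers only the self-bit part of the
    validation, not the DRM core's full check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit representing the encoder with the given index. */
uint32_t encoder_mask_sketch(unsigned int index)
{
	return UINT32_C(1) << index;
}

/* True when possible_clones includes the encoder's own bit. */
bool clones_include_self(uint32_t possible_clones, unsigned int index)
{
	return (possible_clones & encoder_mask_sketch(index)) != 0;
}

/* Add the encoder's own bit to its possible_clones mask. */
uint32_t clones_with_self(uint32_t possible_clones, unsigned int index)
{
	return possible_clones | encoder_mask_sketch(index);
}
```

    For example, an encoder at (assumed) index 0 with possible_clones=0x4
    lacks its own bit; adding it gives 0x5.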
    
    This results in some warnings during driver initialization:
    Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x4 (full encoder mask=0x7)
    WARNING: CPU: 0 PID: 170 at drivers/gpu/drm/drm_mode_config.c:615 drm_mode_config_validate+0x113/0x39c
    ...
    
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Fixes: 74d2aacbe840 ("drm: Validate encoder->possible_clones")
    Reported-by: Erhard Furtner <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/20241009000321.418e4294@yea/
    Tested-by: Erhard Furtner <[email protected]>
    Signed-off-by: Ville Syrjälä <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 3b6e7d40649c0d75572039aff9d0911864c689db)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/vmwgfx: Handle surface check failure correctly
Author: Nikolay Kuratov <[email protected]>
Date:   Wed Oct 2 15:24:29 2024 +0300

    drm/vmwgfx: Handle surface check failure correctly
    
    commit 26498b8d54373d31a621d7dec95c4bd842563b3b upstream.
    
    Currently, if the condition (!bo and !vmw_kms_srf_ok()) is met,
    we go to err_out with ret == 0.
    err_out dereferences vfb if ret == 0, but in this case vfb is still NULL.
    
    Fix this by assigning a sensible error to ret.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE
    
    Signed-off-by: Nikolay Kuratov <[email protected]>
    Cc: [email protected]
    Fixes: 810b3e1683d0 ("drm/vmwgfx: Support topology greater than texture size")
    Signed-off-by: Zack Rusin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
fat: fix uninitialized variable
Author: OGAWA Hirofumi <[email protected]>
Date:   Fri Oct 4 15:03:49 2024 +0900

    fat: fix uninitialized variable
    
    commit 963a7f4d3b90ee195b895ca06b95757fcba02d1a upstream.
    
    syzbot produced this with a corrupted fs image.  In theory, however, an
    I/O error would also trigger this.
    
    This affects just an error report, so should not be a serious error.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: OGAWA Hirofumi <[email protected]>
    Reported-by: [email protected]
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
iio: accel: kx022a: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:47 2024 +0200

    iio: accel: kx022a: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit 96666f05d11acf0370cedca17a4c3ab6f9554b35 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 7c1d1677b322 ("iio: accel: Support Kionix/ROHM KX022A accelerometer")
    Signed-off-by: Javier Carrasco <[email protected]>
    Acked-by: Matti Vaittinen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
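    
    For readers unfamiliar with Kconfig, the kind of change made by this
    and the similar "missing select" fixes in this release looks roughly
    like the fragment below. EXAMPLE_SENSOR, its prompt text and the I2C
    dependency are hypothetical; only the two select lines reflect what
    the commit message describes.

```kconfig
config EXAMPLE_SENSOR
	tristate "Hypothetical sensor driver using triggered buffers"
	depends on I2C
	select IIO_BUFFER
	select IIO_TRIGGERED_BUFFER
```

    A select forces the named symbol on whenever the selecting symbol is
    enabled, so the driver can no longer be built without the buffer
    support it calls into.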

iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:49 2024 +0200

    iio: adc: ti-ads124s08: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit eb143d05def52bc6d193e813018e5fa1a0e47c77 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: e717f8c6dfec ("iio: adc: Add the TI ads124s08 ADC code")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:50 2024 +0200

    iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit 4c4834fd8696a949d1b1f1c2c5b96e1ad2083b02 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 2a86487786b5 ("iio: adc: ti-ads8688: add trigger and buffer support")
    Signed-off-by: Javier Carrasco <[email protected]>
    Reviewed-by: Sean Nyekjaer <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: adc: ti-lmp92064: add missing select REGMAP_SPI in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:51 2024 +0200

    iio: adc: ti-lmp92064: add missing select REGMAP_SPI in Kconfig
    
    commit f3fe8c52c580e99c6dc0c7859472ec48176af32d upstream.
    
    This driver makes use of regmap_spi, but does not select the required
    module.
    Add the missing 'select REGMAP_SPI'.
    
    Fixes: 627198942641 ("iio: adc: add ADC driver for the TI LMP92064 controller")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 18:49:37 2024 +0200

    iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
    
    commit b7983033a10baa0d98784bb411b2679bfb207d9a upstream.
    
    This driver makes use of regmap_spi, but does not select the required
    module.
    Add the missing 'select REGMAP_SPI'.
    
    Fixes: 28b4c30bfa5f ("iio: amplifiers: ada4250: add support for ADA4250")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:53 2024 +0200

    iio: dac: ad3552r: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit 5bede948670f447154df401458aef4e2fd446ba8 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 8f2b54824b28 ("drivers:iio:dac: Add AD3552R driver support")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:54 2024 +0200

    iio: dac: ad5766: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit 62ec3df342cca6a8eb7ed33fd4ac8d0fbfcb9391 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 885b9790c25a ("drivers:iio:dac:ad5766.c: Add trigger buffer")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 18:49:38 2024 +0200

    iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
    
    commit bcdab6f74c91cda19714354fd4e9e3ef3c9a78b3 upstream.
    
    This driver makes use of regmap_spi, but does not select the required
    module.
    Add the missing 'select REGMAP_SPI'.
    
    Fixes: cbbb819837f6 ("iio: dac: ad5770r: Add AD5770R support")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 18:49:39 2024 +0200

    iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
    
    commit 252ff06a4cb4e572cb3c7fcfa697db96b08a7781 upstream.
    
    This driver makes use of regmap_spi, but does not select the required
    module.
    Add the missing 'select REGMAP_SPI'.
    
    Fixes: 8316cebd1e59 ("iio: dac: add support for ltc1660")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 18:49:40 2024 +0200

    iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
    
    commit 27b6aa68a68105086aef9f0cb541cd688e5edea8 upstream.
    
    This driver makes use of regmap_mmio, but does not select the required
    module.
    Add the missing 'select REGMAP_MMIO'.
    
    Fixes: 4d4b30526eb8 ("iio: dac: add support for stm32 DAC")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: frequency: adf4377: add missing select REGMAP_SPI in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 18:49:35 2024 +0200

    iio: frequency: adf4377: add missing select REGMAP_SPI in Kconfig
    
    commit c64643ed4eaa5dfd0b3bab7ef1c50b84f3dbaba4 upstream.
    
    This driver makes use of regmap_spi, but does not select the required
    module.
    Add the missing 'select REGMAP_SPI'.
    
    Fixes: eda549e2e524 ("iio: frequency: adf4377: add support for ADF4377")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency() [+ + +]
Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 20:41:12 2024 +0200

    iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
    
    commit 3a29b84cf7fbf912a6ab1b9c886746f02b74ea25 upstream.
    
    If hid_sensor_set_report_latency() fails, the error code should be returned
    instead of a value likely to be interpreted as 'success'.
    
    Fixes: 138bc7969c24 ("iio: hid-sensor-hub: Implement batch mode")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Acked-by: Srinivas Pandruvada <[email protected]>
    Link: https://patch.msgid.link/c50640665f091a04086e5092cf50f73f2055107a.1727980825.git.christophe.jaillet@wanadoo.fr
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
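    
    The error-handling pattern can be sketched as follows. This is an
    illustration only: set_report_latency() and latency_store() are
    hypothetical stand-ins, and the convention assumed is that a store
    handler returns the byte count on success and a negative errno on
    failure.

```c
#include <errno.h>

/* Hypothetical stand-in for the lower-level call; fails for negative input. */
static int set_report_latency(int latency_ms)
{
    return latency_ms < 0 ? -EINVAL : 0;
}

/*
 * Hypothetical store handler: returns the number of bytes consumed on
 * success, or a negative errno on failure.
 */
long latency_store(int latency_ms, long len)
{
    int ret = set_report_latency(latency_ms);

    if (ret < 0)
        return ret;   /* returning len here would look like success */
    return len;
}
```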

iio: light: bu27008: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:56 2024 +0200

    iio: light: bu27008: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit aa99ef68eff5bc6df4959a372ae355b3b73f9930 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 41ff93d14f78 ("iio: light: ROHM BU27008 color sensor")
    Signed-off-by: Javier Carrasco <[email protected]>
    Acked-by: Matti Vaittinen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: light: opt3001: add missing full-scale range value [+ + +]
Author: Emil Gedenryd <[email protected]>
Date:   Fri Sep 13 11:57:02 2024 +0200

    iio: light: opt3001: add missing full-scale range value
    
    commit 530688e39c644543b71bdd9cb45fdfb458a28eaa upstream.
    
    The opt3001 driver uses predetermined full-scale range values to
    determine what exponent to use for event trigger threshold values.
    The problem is that one of the values specified in the datasheet is
    missing from the implementation. This causes larger values to be
    scaled down to an incorrect exponent, effectively reducing the
    maximum settable threshold value by a factor of 2.
    
    Add missing full-scale range array value.
    
    Fixes: 94a9b7b1809f ("iio: light: add support for TI's opt3001 light sensor")
    Signed-off-by: Emil Gedenryd <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
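    
    A rough sketch of how a gap in a full-scale range table shifts the
    chosen exponent. It assumes the datasheet ranges follow
    40.95 lx * 2^E for E = 0..11 and stores them in hundredths of a lux;
    the function names, and which table position is skipped to model the
    bug, are hypothetical rather than taken from the driver.

```c
#include <stddef.h>

/* Full-scale ranges in hundredths of a lux: 40.95 lx * 2^E for E = 0..11. */
static const long full_scale[] = {
    4095, 8190, 16380, 32760, 65520, 131040, 262080, 524160,
    1048320, 2096640, 4193280, 8386560,
};

#define N_RANGES (sizeof(full_scale) / sizeof(full_scale[0]))

/* Return the index of the first range that can hold value, or -1 if none. */
int find_exponent(const long *ranges, size_t n, long value)
{
    for (size_t i = 0; i < n; i++)
        if (value <= ranges[i])
            return (int)i;
    return -1;
}

/* Lookup against the complete table. */
int find_exponent_full(long value)
{
    return find_exponent(full_scale, N_RANGES, value);
}

/* Same lookup with the entry at position skip omitted (models the bug). */
int find_exponent_missing(size_t skip, long value)
{
    long tmp[N_RANGES];
    size_t n = 0;

    for (size_t i = 0; i < N_RANGES; i++)
        if (i != skip)
            tmp[n++] = full_scale[i];
    return find_exponent(tmp, n, value);
}
```

    With one entry missing, every range above the gap is found one index
    too low, so the hardware interprets the threshold with half the
    intended full-scale range.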

iio: light: veml6030: fix ALS sensor resolution [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Mon Sep 23 00:17:49 2024 +0200

    iio: light: veml6030: fix ALS sensor resolution
    
    commit c9e9746f275c45108f2b0633a4855d65d9ae0736 upstream.
    
    The driver still uses the sensor resolution provided in the datasheet
    until Rev. 1.6, 28-Apr-2022, which was updated with Rev 1.7,
    28-Nov-2023. The original ambient light resolution has been updated from
    0.0036 lx/ct to 0.0042 lx/ct, which is the value that can be found in
    the current device datasheet.
    
    Update the default resolution for IT = 100 ms and GAIN = 1/8 from the
    original 4608 mlux/cnt to the current value from the "Resolution and
    maximum detection range" table (Application Note 84367, page 5), 5376
    mlux/cnt.
    
    Cc: <[email protected]>
    Fixes: 7b779f573c48 ("iio: light: add driver for veml6030 ambient light sensor")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
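    
    The relation between the per-count figures and the constants 4608 and
    5376 can be checked with a short sketch. It assumes that the base
    resolution (0.0036 or 0.0042 lx/ct) applies at IT = 800 ms and
    GAIN = 2 and that resolution scales inversely with both; the function
    name and the unit of 0.0001 lx per count are choices made for this
    illustration, not driver code.

```c
/*
 * Resolution in units of 0.0001 lx per count.
 * Assumption: base_res applies at 800 ms integration time and gain x2,
 * and resolution scales inversely with both.
 * gain_x8 is the gain multiplied by 8, so gain 1/8 -> 1 and gain x2 -> 16.
 */
long resolution(long base_res, int it_ms, int gain_x8)
{
    return base_res * (800 / it_ms) * (16 / gain_x8);
}
```

    Under these assumptions IT = 100 ms and GAIN = 1/8 give a factor of
    8 * 16 = 128, so 36 becomes 4608 and 42 becomes 5376.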

iio: light: veml6030: fix IIO device retrieval from embedded device [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Fri Sep 13 15:18:58 2024 +0200

    iio: light: veml6030: fix IIO device retrieval from embedded device
    
    commit c7c44e57750c31de43906d97813273fdffcf7d02 upstream.
    
    The dev pointer that is received as an argument in the
    in_illuminance_period_available_show function references the device
    embedded in the IIO device, not in the i2c client.
    
    dev_to_iio_dev() must be used to access the right data. The current
    implementation leads to a segmentation fault on every attempt to read
    the attribute because indio_dev gets a NULL assignment.
    
    This bug has been present since the first appearance of the driver,
    apparently since the last version (V6) before getting applied. A
    constant attribute was used until then, and the last modifications might
    not have been tested again.
    
    Cc: [email protected]
    Fixes: 7b779f573c48 ("iio: light: add driver for veml6030 ambient light sensor")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
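    
    A simplified sketch of why the lookup returned NULL. The structures
    below are reduced stand-ins for the kernel ones, and
    iio_dev_via_clientdata() is a hypothetical name for the buggy
    pattern; only dev_to_iio_dev() is named in the commit message.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct device { void *drvdata; };
struct iio_dev { int id; struct device dev; };
struct i2c_client { struct device dev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Correct: recover the IIO device that embeds dev. */
struct iio_dev *dev_to_iio_dev(struct device *dev)
{
    return container_of(dev, struct iio_dev, dev);
}

/* Buggy pattern: treat dev as the i2c client's device and read its drvdata. */
struct iio_dev *iio_dev_via_clientdata(struct device *dev)
{
    struct i2c_client *client = container_of(dev, struct i2c_client, dev);

    return client->dev.drvdata;
}
```

    The driver data pointer is set on the i2c client's device, not on the
    device embedded in the IIO device, so the second helper yields NULL
    when handed the latter.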

iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Thu Oct 3 23:04:59 2024 +0200

    iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
    
    commit 75461a0b15d7c026924d0001abce0476bbc7eda8 upstream.
    
    This driver makes use of triggered buffers, but does not select the
    required modules.
    
    Add the missing 'select IIO_BUFFER' and 'select IIO_TRIGGERED_BUFFER'.
    
    Fixes: 16b05261537e ("mb1232.c: add distance iio sensor with i2c")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Input: xpad - add support for MSI Claw A1M [+ + +]
Author: John Edwards <[email protected]>
Date:   Thu Oct 10 23:09:23 2024 +0000

    Input: xpad - add support for MSI Claw A1M
    
    commit 22a18935d7d96bbb1a28076f843c1926d0ba189e upstream.
    
    Add MSI Claw A1M controller to xpad_device match table when in xinput mode.
    Add MSI VID as XPAD_XBOX360_VENDOR.
    
    Signed-off-by: John Edwards <[email protected]>
    Reviewed-by: Derek J. Clark <[email protected]>
    Reviewed-by: Christopher Snowhill <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
io_uring/sqpoll: close race on waiting for sqring entries [+ + +]
Author: Jens Axboe <[email protected]>
Date:   Tue Oct 15 08:58:25 2024 -0600

    io_uring/sqpoll: close race on waiting for sqring entries
    
    commit 28aabffae6be54284869a91cd8bccd3720041129 upstream.
    
    When an application uses SQPOLL, it must wait for the SQPOLL thread to
    consume SQE entries, if it fails to get an sqe when calling
    io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the
    flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done
    with io_uring_sqring_wait(). There's a natural expectation that once
    this call returns, a new SQE entry can be retrieved, filled out, and
    submitted. However, the kernel uses the cached sq head to determine if
    the SQRING is full or not. If the SQPOLL thread is currently in the
    process of submitting SQE entries, it may have updated the cached sq
    head, but not yet committed it to the SQ ring. Hence the kernel may find
    that there are SQE entries ready to be consumed, and return successfully
    to the application. If the SQPOLL thread hasn't yet committed the SQ
    ring entries by the time the application returns to userspace and
    attempts to get a new SQE, it will fail getting a new SQE.
    
    Fix this by having io_sqring_full() always use the user visible SQ ring
    head entry, rather than the internally cached one.
    
    Cc: [email protected] # 5.10+
    Link: https://github.com/axboe/liburing/discussions/1267
    Reported-by: Benedek Thaler <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
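    
    The race can be modelled with a toy ring. The struct and both helper
    names are illustrative only and do not mirror the io_uring data
    structures; the point is that a check based on the internal cached
    head can report free space before userspace is able to see it.

```c
#include <stdbool.h>

/* Simplified model of an SQ ring; field names are illustrative only. */
struct sq_ring {
    unsigned int head;         /* consumer position visible to userspace */
    unsigned int cached_head;  /* kernel-internal position, may run ahead */
    unsigned int tail;         /* producer position written by userspace */
    unsigned int entries;      /* ring capacity */
};

/* Pre-fix behaviour: decide fullness from the internal cached head. */
bool sqring_full_cached(const struct sq_ring *r)
{
    return r->tail - r->cached_head == r->entries;
}

/* Post-fix behaviour: decide fullness from the head userspace can see. */
bool sqring_full_visible(const struct sq_ring *r)
{
    return r->tail - r->head == r->entries;
}
```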

 
iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices [+ + +]
Author: Lu Baolu <[email protected]>
Date:   Mon Oct 14 09:37:44 2024 +0800

    iommu/vt-d: Fix incorrect pci_for_each_dma_alias() for non-PCI devices
    
    commit 6e02a277f1db24fa039e23783c8921c7b0e5b1b3 upstream.
    
    Previously, the domain_context_clear() function incorrectly called
    pci_for_each_dma_alias() to set up context entries for non-PCI devices.
    This could lead to kernel hangs or other unexpected behavior.
    
    Add a check to only call pci_for_each_dma_alias() for PCI devices. For
    non-PCI devices, domain_context_clear_one() is called directly.
    
    Reported-by: Todd Brandt <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219363
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219349
    Fixes: 9a16ab9d6402 ("iommu/vt-d: Make context clearing consistent with context mapping")
    Cc: [email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1 [+ + +]
Author: Nianyao Tang <[email protected]>
Date:   Sat Apr 6 02:27:37 2024 +0000

    irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
    
    commit 80e9963fb3b5509dfcabe9652d56bf4b35542055 upstream.
    
    As per the GICv4.1 spec (Arm IHI 0069H, 5.3.19):
    
     "A VMAPP with {V, Alloc}=={0, x} is self-synchronizing, This means the ITS
      command queue does not show the command as consumed until all of its
      effects are completed."
    
    Furthermore, VSYNC is allowed to deliver an SError when referencing a
    nonexistent VPE.
    
    By these definitions, a VMAPP followed by a VSYNC is a bug, as the
    latter references a VPE that has been unmapped by the former.
    
    Fix it by eliding the VSYNC in this scenario.
    
    Fixes: 64edfaa9a234 ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP")
    Signed-off-by: Nianyao Tang <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Reviewed-by: Marc Zyngier <[email protected]>
    Reviewed-by: Zenghui Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE [+ + +]
Author: Marc Zyngier <[email protected]>
Date:   Wed Oct 2 21:49:59 2024 +0100

    irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
    
    commit 1442ee0011983f0c5c4b92380e6853afb513841a upstream.
    
    Kunkun Jiang reported that there is a small window of opportunity for
    userspace to force a change of affinity for a VPE while the VPE has already
    been unmapped, but the corresponding doorbell interrupt is still visible in
    /proc/irq/.
    
    Plug the race by checking the value of vmapp_count, which tracks whether
    the VPE is mapped or not, and returning an error in this case.
    
    This involves making vmapp_count common to both GICv4.1 and its v4.0
    ancestor.
    
    Fixes: 64edfaa9a234 ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP")
    Reported-by: Kunkun Jiang <[email protected]>
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
irqchip/sifive-plic: Unmask interrupt in plic_irq_enable() [+ + +]
Author: Nam Cao <[email protected]>
Date:   Thu Oct 3 10:41:52 2024 +0200

    irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()
    
    commit 6b1e0651e9ce8ce418ad4ff360e7b9925dc5da79 upstream.
    
    It is possible that an interrupt is disabled and masked at the same time.
    When the interrupt is enabled again by enable_irq(), only plic_irq_enable()
    is called, not plic_irq_unmask(). The interrupt remains masked and is
    never raised.
    
    An example where an interrupt is both disabled and masked is when
    handle_fasteoi_irq() is the handler, and IRQS_ONESHOT is set. The interrupt
    handler:
    
      1. Mask the interrupt
      2. Handle the interrupt
      3. Check if interrupt is still enabled, and unmask it (see
         cond_unmask_eoi_irq())
    
    If another task disables the interrupt in the middle of the above steps,
    the interrupt will not get unmasked, and will remain masked when it is
    enabled in the future.
    
    The problem is occasionally observed when PREEMPT_RT is enabled, because
    PREEMPT_RT adds the IRQS_ONESHOT flag. But PREEMPT_RT only makes the problem
    more likely to appear; the bug has been around since commit a1706a1c5062
    ("irqchip/sifive-plic: Separate the enable and mask operations").
    
    Fix it by unmasking the interrupt in plic_irq_enable().
    
    Fixes: a1706a1c5062 ("irqchip/sifive-plic: Separate the enable and mask operations")
    Signed-off-by: Nam Cao <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
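    
    The sequence described above can be replayed with a toy model of one
    interrupt line. Everything here is a simplified illustration with
    hypothetical names; it is not the PLIC driver or the genirq core.

```c
#include <stdbool.h>

/* Toy model of one interrupt line; not the driver's data structures. */
struct irq_line {
    bool enabled;
    bool masked;
};

void irq_mask(struct irq_line *l)    { l->masked = true; }
void irq_unmask(struct irq_line *l)  { l->masked = false; }
void irq_disable(struct irq_line *l) { l->enabled = false; }

/* Pre-fix enable: sets the enable bit only. */
void irq_enable_old(struct irq_line *l) { l->enabled = true; }

/* Post-fix enable: also clears the mask. */
void irq_enable_new(struct irq_line *l)
{
    l->enabled = true;
    irq_unmask(l);
}

/* An interrupt is delivered only when enabled and not masked. */
bool irq_can_fire(const struct irq_line *l)
{
    return l->enabled && !l->masked;
}

/* One-shot flow: mask, handle, then unmask only if still enabled. */
void oneshot_handler(struct irq_line *l, bool disabled_meanwhile)
{
    irq_mask(l);
    if (disabled_meanwhile)
        irq_disable(l);     /* another task disables the line mid-handler */
    if (l->enabled)
        irq_unmask(l);
}
```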

 
ksmbd: fix use-after-free from session log off [+ + +]
Author: Namjae Jeon <[email protected]>
Date:   Tue Oct 8 22:42:57 2024 +0900

    ksmbd: fix use-after-free from session log off
    
    commit 7aa8804c0b67b3cb263a472d17f2cb50d7f1a930 upstream.
    
    There is a race between smb2 session log off and smb2 session setup.
    It can cause a use-after-free from session log off.
    This adds session_lock when setting SMB2_SESSION_EXPIRED and a reference
    count to the session struct so the session is not freed while it is in use.
    
    Cc: [email protected] # v5.15+
    Reported-by: [email protected] # ZDI-CAN-25282
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
KVM: s390: Change virtual to physical address access in diag 0x258 handler [+ + +]
Author: Michael Mueller <[email protected]>
Date:   Tue Sep 17 17:18:34 2024 +0200

    KVM: s390: Change virtual to physical address access in diag 0x258 handler
    
    commit cad4b3d4ab1f062708fff33f44d246853f51e966 upstream.
    
    The parameters for the diag 0x258 are real addresses, not virtual, but
    KVM was using them as virtual addresses. This only happened to work, since
    the Linux kernel as a guest used to have a 1:1 mapping for physical vs
    virtual addresses.
    
    Fix KVM so that it correctly uses the addresses as real addresses.
    
    Cc: [email protected]
    Fixes: 8ae04b8f500b ("KVM: s390: Guest's memory access functions get access registers")
    Suggested-by: Vasily Gorbik <[email protected]>
    Signed-off-by: Michael Mueller <[email protected]>
    Signed-off-by: Nico Boehr <[email protected]>
    Reviewed-by: Christian Borntraeger <[email protected]>
    Reviewed-by: Heiko Carstens <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Janosch Frank <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: s390: gaccess: Check if guest address is in memslot [+ + +]
Author: Nico Boehr <[email protected]>
Date:   Tue Sep 17 17:18:33 2024 +0200

    KVM: s390: gaccess: Check if guest address is in memslot
    
    commit e8061f06185be0a06a73760d6526b8b0feadfe52 upstream.
    
    Previously, access_guest_page() did not check whether the given guest
    address is inside of a memslot. This is not a problem, since
    kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case.
    
    However, -EFAULT is also returned when copy_to/from_user fails.
    
    When emulating a guest instruction, the address being outside a memslot
    usually means that an addressing exception should be injected into the
    guest.
    
    Failure in copy_to/from_user however indicates that something is wrong
    in userspace and hence should be handled there.
    
    To be able to distinguish these two cases, return PGM_ADDRESSING in
    access_guest_page() when the guest address is outside guest memory. In
    access_guest_real(), populate vcpu->arch.pgm.code such that
    kvm_s390_inject_prog_cond() can be used in the caller for injecting into
    the guest (if applicable).
    
    Since this adds a new return value to access_guest_page(), we need to make
    sure that other callers are not confused by the new positive return value.
    
    There are the following users of access_guest_page():
    - access_guest_with_key() does the checking itself (in
      guest_range_to_gpas()), so this case should never happen. Even if it
      does, the handling is set up properly.
    - access_guest_real() just passes the return code to its callers, which
      are:
        - read_guest_real() - see below
        - write_guest_real() - see below
    
    There are the following users of read_guest_real():
    - ar_translation() in gaccess.c which already returns PGM_*
    - setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always
      return -EFAULT on read_guest_real() nonzero return - no change
    - shadow_crycb(), handle_stfle() always present this as validity, this
      could be handled better but doesn't change current behaviour - no change
    
    There are the following users of write_guest_real():
    - kvm_s390_store_status_unloaded() always returns -EFAULT on
      write_guest_real() failure.
    
    Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions")
    Cc: [email protected]
    Signed-off-by: Nico Boehr <[email protected]>
    Reviewed-by: Heiko Carstens <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Janosch Frank <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
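    
    The distinction between the two failure kinds can be sketched as
    below. This is an illustration under stated assumptions: the single
    memslot layout, SLOT_SIZE, and both function names are hypothetical,
    and PGM_ADDRESSING is defined locally with the s390 addressing
    exception code (0x05) so that the example is self-contained.

```c
#include <errno.h>

#define PGM_ADDRESSING 0x05   /* s390 addressing-exception program code */

/* Hypothetical guest memory layout: one memslot covering [0, SLOT_SIZE). */
#define SLOT_SIZE 0x10000UL

/*
 * Returns 0 on success, PGM_ADDRESSING (positive) when gpa is outside
 * the memslot, and -EFAULT when the simulated user copy fails.
 */
int access_guest_page_sketch(unsigned long gpa, int copy_fails)
{
    if (gpa >= SLOT_SIZE)
        return PGM_ADDRESSING;
    if (copy_fails)
        return -EFAULT;
    return 0;
}

/* Caller: positive codes go to the guest, negative ones back to userspace. */
int should_inject_into_guest(int rc)
{
    return rc > 0;
}
```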

 
Linux: Linux 6.6.58 [+ + +]
Author: Greg Kroah-Hartman <[email protected]>
Date:   Tue Oct 22 15:46:36 2024 +0200

    Linux 6.6.58
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Harshit Mogalapalli <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
maple_tree: correct tree corruption on spanning store [+ + +]
Author: Lorenzo Stoakes <[email protected]>
Date:   Mon Oct 7 16:28:32 2024 +0100

    maple_tree: correct tree corruption on spanning store
    
    commit bea07fd63192b61209d48cbb81ef474cc3ee4c62 upstream.
    
    Patch series "maple_tree: correct tree corruption on spanning store", v3.
    
    There has been a nasty yet subtle maple tree corruption bug that appears
    to have been in existence since the inception of the algorithm.
    
    This bug seems far more likely to happen since commit f8d112a4e657
    ("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
    at which reports started to be submitted concerning this bug.
    
    We were made definitely aware of the bug thanks to the kind efforts of
    Bert Karwatzki who helped enormously in my being able to track this down
    and identify the cause of it.
    
    The bug arises when an attempt is made to perform a spanning store across
    two leaf nodes, where the right leaf node is the rightmost child of the
    shared parent, AND the store completely consumes the right-most node.
    
    This results in mas_wr_spanning_store() mistakenly duplicating the new and
    existing entries at the maximum pivot within the range, and thus maple
    tree corruption.
    
    The fix patch corrects this by detecting this scenario and disallowing the
    mistaken duplicate copy.
    
    The fix patch commit message goes into great detail as to how this occurs.
    
    This series also includes a test which reliably reproduces the issue, and
    asserts that the fix works correctly.
    
    Bert has kindly tested the fix and confirmed it resolved his issues.  Also
    Mikhail Gavrilov kindly reported what appears to be precisely the same
    bug, which this fix should also resolve.
    
    
    This patch (of 2):
    
    There has been a subtle bug present in the maple tree implementation from
    its inception.
    
    This arises from how stores are performed - when a store occurs, it will
    overwrite overlapping ranges and adjust the tree as necessary to
    accommodate this.
    
    A range may always ultimately span two leaf nodes.  In this instance we
    walk the two leaf nodes, determine which elements are not overwritten to
    the left and to the right of the start and end of the ranges respectively
    and then rebalance the tree to contain these entries and the newly
    inserted one.
    
    This kind of store is dubbed a 'spanning store' and is implemented by
    mas_wr_spanning_store().
    
    In order to reach this stage, mas_store_gfp() invokes
    mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
    walk the tree and update the object (mas) to traverse to the location
    where the write should be performed, determining its store type.
    
    When a spanning store is required, this function returns false stopping at
    the parent node which contains the target range, and mas_wr_store_type()
    marks the mas->store_type as wr_spanning_store to denote this fact.
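    
    As a rough illustration of what makes a store "spanning", consider a
    simplified node in which each slot has an inclusive upper bound
    (pivot). The struct and both helpers are hypothetical and much
    simpler than the maple tree code; they only model the idea that a
    write spans when it ends beyond the slot in which it starts.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified node: pivots[i] is the inclusive upper bound of slot i.
 * This is an illustration, not the maple tree layout.
 */
struct node {
    const unsigned long *pivots;
    size_t count;
};

/* Find the slot whose range contains index (last slot if none matches). */
size_t slot_for(const struct node *n, unsigned long index)
{
    size_t i = 0;

    while (i + 1 < n->count && n->pivots[i] < index)
        i++;
    return i;
}

/* A write [index, last] spans when it ends beyond the slot holding index. */
bool is_spanning(const struct node *n, unsigned long index, unsigned long last)
{
    return last > n->pivots[slot_for(n, index)];
}
```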
    
    When we go to perform the store in mas_wr_spanning_store(), we first
    determine the elements AFTER the END of the range we wish to store (that
    is, to the right of the entry to be inserted) - we do this by walking to
    the NEXT pivot in the tree (i.e.  r_mas.last + 1), starting at the node we
    have just determined contains the range over which we intend to write.
    
    We then turn our attention to the entries to the left of the entry we are
    inserting, whose state is represented by l_mas, and copy these into a 'big
    node', which is a special node which contains enough slots to contain two
    leaf nodes' worth of data.
    
    We then copy the entry we wish to store immediately after this - the copy
    and the insertion of the new entry is performed by mas_store_b_node().
    
    After this we copy the elements to the right of the end of the range which
    we are inserting, if we have not exceeded the length of the node (i.e.
    r_mas.offset <= r_mas.end).
    
    Herein lies the bug - under very specific circumstances, this logic can
    break and corrupt the maple tree.
    
    Consider the following tree:
    
    Height
      0                             Root Node
                                     /      \
                     pivot = 0xffff /        \ pivot = ULONG_MAX
                                   /          \
      1                       A [-----]       ...
                                 /   \
                 pivot = 0x4fff /     \ pivot = 0xffff
                               /       \
      2 (LEAVES)          B [-----]  [-----] C
                                          ^--- Last pivot 0xffff.
    
    Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
    that all ranges expressed in maple tree code are inclusive):
    
    1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
       determines that this is a spanning store across nodes B and C. The mas
       state is set such that the current node from which we traverse further
       is node A.
    
    2. In mas_wr_spanning_store() we try to find elements to the right of pivot
       0xffff by searching for an index of 0x10000:
    
        - mas_wr_walk_index() invokes mas_wr_walk_descend() and
          mas_wr_node_walk() in turn.
    
            - mas_wr_node_walk() loops over entries in node A until EITHER it
              finds an entry whose pivot equals or exceeds 0x10000 OR it
              reaches the final entry.
    
            - Since no entry has a pivot equal to or exceeding 0x10000, pivot
              0xffff is selected, leading to node C.
    
        - mas_wr_walk_traverse() resets the mas state to traverse node C. We
          loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
          in turn once again.
    
             - Again, we reach the last entry in node C, which has a pivot of
               0xffff.
    
    3. We then copy the elements to the left of 0x4000 in node B to the big
       node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
       too.
    
    4. We determine whether we have any entries to copy from the right of the
       end of the range by checking r_mas.offset <= r_mas.end. With r_mas set
       up at the entry at pivot 0xffff, this check passes, and so we
       DUPLICATE the entry at pivot 0xffff.
    
    5. BUG! The maple tree is corrupted with a duplicate entry.
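    
    Steps 2 and 4 can be illustrated with a simplified user-space model.
    This is only a sketch, not the kernel code: struct node, node_walk() and
    copies_right_by_offset() are hypothetical stand-ins for a maple leaf
    node, mas_wr_node_walk() and the offset check, and the pivot values
    chosen for node C below are assumptions.
    
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_SLOTS 4

/* Hypothetical, simplified stand-in for a maple node: sorted inclusive pivots. */
struct node {
    unsigned long pivots[NODE_SLOTS];
    size_t end;                 /* offset of the last used slot */
};

/*
 * Model of the walk in step 2: advance until a pivot is at or above the
 * index, or until the final slot is reached.
 */
static size_t node_walk(const struct node *n, unsigned long index)
{
    size_t offset = 0;

    assert(n->end < NODE_SLOTS);
    while (offset < n->end && n->pivots[offset] < index)
        offset++;
    return offset;
}

/* Model of the check in step 4: decide purely on the offset position. */
static bool copies_right_by_offset(const struct node *n, size_t offset)
{
    return offset <= n->end;
}
```
    
    With a node whose pivots are 0x8fff, 0xbfff and 0xffff, node_walk() for
    index 0x10000 stops on the final slot, and the offset check still passes
    even though that slot lies inside the range being overwritten.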
    
    This requires a very specific set of circumstances - we must be spanning
    the last element in a leaf node, which is the last element in the parent
    node, i.e. performing a spanning store across two leaf nodes with a range
    that ends at that shared pivot.
    
    A potential solution to this problem would simply be to reset the walk
    each time we traverse r_mas, however given the rarity of this situation it
    seems that would be rather inefficient.
    
    Instead, this patch detects if the right hand node is populated, i.e.  has
    anything we need to copy.
    
    We do so by only copying elements from the right of the entry being
    inserted when the maximum value present exceeds the last, rather than
    basing this on offset position.
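    
    A minimal sketch of the difference between the two checks, assuming a
    simplified state that carries the offset, end and last fields mentioned
    in this message plus a max field for the largest index covered by the
    node. struct wr_state and both helpers are hypothetical and are not the
    kernel's definitions:
    
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the right-hand write state (r_mas). */
struct wr_state {
    unsigned char offset;   /* slot reached by the walk */
    unsigned char end;      /* last used slot in the node */
    unsigned long last;     /* last index of the range being stored */
    unsigned long max;      /* largest index covered by the node */
};

/* Offset-based check: true even when the walk stopped on an overwritten slot. */
static bool should_copy_right_old(const struct wr_state *r)
{
    assert(r);
    return r->offset <= r->end;
}

/* Value-based check: copy only if the node extends beyond the stored range. */
static bool should_copy_right_new(const struct wr_state *r)
{
    assert(r);
    return r->max > r->last;
}
```
    
    For the example tree (offset == end, last == max == 0xffff) the old
    check says "copy" while the new one says "nothing to copy".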
    
    The patch also updates some comments and eliminates the unused bool return
    value in mas_wr_walk_index().
    
    The work performed in commit f8d112a4e657 ("mm/mmap: avoid zeroing vma
    tree in mmap_region()") seems to have made this event much more likely,
    and it was at that point that reports concerning this bug started to be
    submitted.
    
    The motivation for this change arose from Bert Karwatzki's report of
    encountering mm instability after the release of kernel v6.12-rc1 which,
    after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
    options, was identified as maple tree corruption.
    
    After Bert very generously provided his time and ability to reproduce this
    event consistently, I was able to finally identify that the issue
    discussed in this commit message was occurring for him.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Lorenzo Stoakes <[email protected]>
    Reported-by: Bert Karwatzki <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Bert Karwatzki <[email protected]>
    Reported-by: Mikhail Gavrilov <[email protected]>
    Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/
    Acked-by: Vlastimil Babka <[email protected]>
    Reviewed-by: Liam R. Howlett <[email protected]>
    Tested-by: Mikhail Gavrilov <[email protected]>
    Reviewed-by: Wei Yang <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Sidhartha Kumar <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device [+ + +]
Author: Heiko Thiery <[email protected]>
Date:   Mon Oct 7 09:11:20 2024 +0200

    misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device
    
    commit 3c2d73de49be528276474c1a53f78b38ee11c1fa upstream.
    
    By using NVMEM_DEVID_AUTO we support more than one device and enumerate
    them automatically.
    
    Fixes: 9ab5465349c0 ("misc: microchip: pci1xxxx: Add support to read and write into PCI1XXXX EEPROM via NVMEM sysfs")
    Cc: [email protected]
    Signed-off-by: Heiko Thiery <[email protected]>
    Reviewed-by: Michael Walle <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device [+ + +]
Author: Heiko Thiery <[email protected]>
Date:   Mon Oct 7 09:11:22 2024 +0200

    misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device
    
    commit 2471787c1f0dae6721f60ab44be37460635d3732 upstream.
    
    By using NVMEM_DEVID_AUTO we support more than one device and enumerate
    them automatically.
    
    Fixes: 0969001569e4 ("misc: microchip: pci1xxxx: Add support to read and write into PCI1XXXX OTP via NVMEM sysfs")
    Cc: [email protected]
    Signed-off-by: Heiko Thiery <[email protected]>
    Reviewed-by: Michael Walle <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm/mglru: only clear kswapd_failures if reclaimable [+ + +]
Author: Wei Xu <[email protected]>
Date:   Mon Oct 14 22:12:11 2024 +0000

    mm/mglru: only clear kswapd_failures if reclaimable
    
    commit b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad upstream.
    
    lru_gen_shrink_node() unconditionally clears kswapd_failures, which can
    prevent kswapd from sleeping and cause 100% kswapd cpu usage even when
    kswapd repeatedly fails to make progress in reclaim.
    
    Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes some
    progress, similar to shrink_node().
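    
    The intended behaviour can be sketched as follows. This is a simplified
    user-space model, not the kernel implementation: struct node_state, the
    nr_reclaimed parameter and both helper names are hypothetical and chosen
    only for illustration.
    
```c
#include <assert.h>

/* Hypothetical model of the per-node reclaim state. */
struct node_state {
    int kswapd_failures;    /* consecutive reclaim passes without progress */
};

/* Pre-fix behaviour: the counter is reset after every pass. */
static void finish_pass_unconditional(struct node_state *n,
                                      unsigned long nr_reclaimed)
{
    assert(n);
    (void)nr_reclaimed;     /* progress is ignored */
    n->kswapd_failures = 0;
}

/* Post-fix behaviour: the counter is reset only if the pass made progress. */
static void finish_pass_if_progress(struct node_state *n,
                                    unsigned long nr_reclaimed)
{
    assert(n);
    if (nr_reclaimed > 0)
        n->kswapd_failures = 0;
}
```
    
    Keeping the counter intact when nothing was reclaimed means a caller that
    backs off after repeated failures can still observe those failures.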
    
    I happened to run into this problem in one of my tests recently.  It
    requires a combination of several conditions: the allocator needs to
    allocate just the right amount of pages such that it can wake up kswapd
    without itself being OOM killed; there is no memory for kswapd to
    reclaim (my test disables swap and cleans the page cache first); and no
    other process frees enough memory at the same time.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
    Signed-off-by: Wei Xu <[email protected]>
    Cc: Axel Rasmussen <[email protected]>
    Cc: Brian Geffon <[email protected]>
    Cc: Jan Alexander Steffens <[email protected]>
    Cc: Suleiman Souhlal <[email protected]>
    Cc: Yu Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm/mremap: fix move_normal_pmd/retract_page_tables race [+ + +]
Author: Jann Horn <[email protected]>
Date:   Mon Oct 7 23:42:04 2024 +0200

    mm/mremap: fix move_normal_pmd/retract_page_tables race
    
    commit 6fa1066fc5d00cb9f1b0e83b7ff6ef98d26ba2aa upstream.
    
    In mremap(), move_page_tables() looks at the type of the PMD entry and the
    specified address range to figure out by which method the next chunk of
    page table entries should be moved.
    
    At that point, the mmap_lock is held in write mode, but no rmap locks are
    held yet.  For PMD entries that point to page tables and are fully covered
    by the source address range, move_pgt_entry(NORMAL_PMD, ...) is called,
    which first takes rmap locks, then does move_normal_pmd().
    move_normal_pmd() takes the necessary page table locks at source and
    destination, then moves an entire page table from the source to the
    destination.
    
    The problem is: The rmap locks, which protect against concurrent page
    table removal by retract_page_tables() in the THP code, are only taken
    after the PMD entry has been read and it has been decided how to move it.
    So we can race as follows (with two processes that have mappings of the
    same tmpfs file that is stored on a tmpfs mount with huge=advise); note
    that process A accesses page tables through the MM while process B does it
    through the file rmap:
    
    process A                      process B
    =========                      =========
    mremap
      mremap_to
        move_vma
          move_page_tables
            get_old_pmd
            alloc_new_pmd
                          *** PREEMPT ***
                                   madvise(MADV_COLLAPSE)
                                     do_madvise
                                       madvise_walk_vmas
                                         madvise_vma_behavior
                                           madvise_collapse
                                             hpage_collapse_scan_file
                                               collapse_file
                                                 retract_page_tables
                                                   i_mmap_lock_read(mapping)
                                                   pmdp_collapse_flush
                                                   i_mmap_unlock_read(mapping)
            move_pgt_entry(NORMAL_PMD, ...)
              take_rmap_locks
              move_normal_pmd
              drop_rmap_locks
    
    When this happens, move_normal_pmd() can end up creating bogus PMD entries
    in the line `pmd_populate(mm, new_pmd, pmd_pgtable(pmd))`.  The effect
    depends on arch-specific and machine-specific details; on x86, you can end
    up with physical page 0 mapped as a page table, which is likely
    exploitable for user->kernel privilege escalation.
    
    Fix the race by letting process A recheck that the PMD still points to a
    page table after the rmap locks have been taken.  If it does not, we bail
    and let the caller fall back to the PTE-level copying path, which will
    then bail immediately at the pmd_none() check.
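    
    A minimal user-space sketch of the recheck-under-lock pattern described
    above. Everything here is an assumption made for illustration: struct
    slot models a PMD entry that either points to a page table or is empty,
    a pthread mutex stands in for the rmap locks, and move_table_checked()
    is a hypothetical name, not a kernel function.
    
```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMD slot: NULL models an empty entry. */
struct slot {
    void *table;
};

/* Stand-in for the rmap locks that serialise against table removal. */
static pthread_mutex_t rmap_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Move src->table to dst. Returns false if the table vanished before the
 * lock was taken, in which case the caller must use its fallback path.
 */
static bool move_table_checked(struct slot *src, struct slot *dst)
{
    bool moved = false;

    assert(src && dst);
    pthread_mutex_lock(&rmap_lock);
    /* Recheck under the lock: any earlier unlocked read may be stale. */
    if (src->table != NULL) {
        dst->table = src->table;
        src->table = NULL;
        moved = true;
    }
    pthread_mutex_unlock(&rmap_lock);
    return moved;
}
```
    
    The key point is that the decision made before locking is treated only
    as a hint, and the state is validated again once the lock is held.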
    
    Bug reachability: Reaching this bug requires that you can create
    shmem/file THP mappings - anonymous THP uses different code that doesn't
    zap stuff under rmap locks.  File THP is gated on an experimental config
    flag (CONFIG_READ_ONLY_THP_FOR_FS), so on normal distro kernels you need
    shmem THP to hit this bug.  As far as I know, getting shmem THP normally
    requires that you can mount your own tmpfs with the right mount flags,
    which would require creating your own user+mount namespace; though I don't
    know if some distros maybe enable shmem THP by default or something like
    that.
    
    Bug impact: This issue can likely be used for user->kernel privilege
    escalation when it is reachable.
    
    Link: https://lkml.kernel.org/r/20241007-move_normal_pmd-vs-collapse-fix-2-v1-1-5ead9631f2ea@google.com
    Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock")
    Signed-off-by: Jann Horn <[email protected]>
    Signed-off-by: David Hildenbrand <[email protected]>
    Co-developed-by: David Hildenbrand <[email protected]>
    Closes: https://project-zero.issues.chromium.org/371047675
    Acked-by: Qi Zheng <[email protected]>
    Reviewed-by: Lorenzo Stoakes <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Joel Fernandes <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm/swapfile: skip HugeTLB pages for unuse_vma [+ + +]
Author: Liu Shixin <[email protected]>
Date:   Tue Oct 15 09:45:21 2024 +0800

    mm/swapfile: skip HugeTLB pages for unuse_vma
    
    commit 7528c4fb1237512ee18049f852f014eba80bbe8d upstream.
    
    I got a bad pud error and lost a 1GB HugeTLB page when calling swapoff.
    The problem can be reproduced by the following steps:
    
     1. Allocate an anonymous 1GB HugeTLB page and some other anonymous memory.
     2. Swap out the above anonymous memory.
     3. Run swapoff, which produces a bad pud error in the kernel log:
    
      mm/pgtable-generic.c:42: bad pud 00000000743d215d(84000001400000e7)
    
    Using ftrace, we can tell that pud_clear_bad is called by
    pud_none_or_clear_bad in unuse_pud_range().  As a result, the HugeTLB
    page is never freed, because it has been lost from the page table.  Skip
    HugeTLB pages in unuse_vma to fix this.
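    
    The idea of the fix can be sketched with a simplified user-space model.
    struct vma_model, its two flags and count_scanned() are hypothetical and
    only illustrate the filtering; they are not the kernel's data structures
    or the actual patch.
    
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified VMA descriptor. */
struct vma_model {
    bool is_anon;       /* has anonymous pages that may be in swap */
    bool is_hugetlb;    /* backed by HugeTLB pages */
};

/* Count the VMAs a swapoff-style walk would scan, skipping HugeTLB ones. */
static size_t count_scanned(const struct vma_model *vmas, size_t n)
{
    size_t scanned = 0;

    assert(vmas || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!vmas[i].is_anon || vmas[i].is_hugetlb)
            continue;   /* never walk the page tables of these VMAs */
        scanned++;
    }
    return scanned;
}
```
    
    Filtering at the VMA level keeps the generic page table walk from ever
    seeing a huge PUD entry that it would misinterpret as bad.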
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 0fe6e20b9c4c ("hugetlb, rmap: add reverse mapping for hugepage")
    Signed-off-by: Liu Shixin <[email protected]>
    Acked-by: Muchun Song <[email protected]>
    Cc: Naoya Horiguchi <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow [+ + +]
Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Tue Oct 15 10:38:47 2024 +0200

    mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
    
    commit 7decd1f5904a489d3ccdcf131972f94645681689 upstream.
    
    Syzkaller reported this splat:
    
      ==================================================================
      BUG: KASAN: slab-use-after-free in mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
      Read of size 4 at addr ffff8880569ac858 by task syz.1.2799/14662
    
      CPU: 0 UID: 0 PID: 14662 Comm: syz.1.2799 Not tainted 6.12.0-rc2-syzkaller-00307-g36c254515dc6 #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0xc3/0x620 mm/kasan/report.c:488
       kasan_report+0xd9/0x110 mm/kasan/report.c:601
       mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
       mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
       mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
       mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
       genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
       netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
       netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:729 [inline]
       __sock_sendmsg net/socket.c:744 [inline]
       ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
       ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
       __sys_sendmsg+0x117/0x1f0 net/socket.c:2690
       do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
       __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
       do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
       entry_SYSENTER_compat_after_hwframe+0x84/0x8e
      RIP: 0023:0xf7fe4579
      Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
      RSP: 002b:00000000f574556c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
      RAX: ffffffffffffffda RBX: 000000000000000b RCX: 0000000020000140
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
    
      Allocated by task 5387:
       kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
       kasan_save_track+0x14/0x30 mm/kasan/common.c:68
       poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
       __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:394
       kmalloc_noprof include/linux/slab.h:878 [inline]
       kzalloc_noprof include/linux/slab.h:1014 [inline]
       subflow_create_ctx+0x87/0x2a0 net/mptcp/subflow.c:1803
       subflow_ulp_init+0xc3/0x4d0 net/mptcp/subflow.c:1956
       __tcp_set_ulp net/ipv4/tcp_ulp.c:146 [inline]
       tcp_set_ulp+0x326/0x7f0 net/ipv4/tcp_ulp.c:167
       mptcp_subflow_create_socket+0x4ae/0x10a0 net/mptcp/subflow.c:1764
       __mptcp_subflow_connect+0x3cc/0x1490 net/mptcp/subflow.c:1592
       mptcp_pm_create_subflow_or_signal_addr+0xbda/0x23a0 net/mptcp/pm_netlink.c:642
       mptcp_pm_nl_fully_established net/mptcp/pm_netlink.c:650 [inline]
       mptcp_pm_nl_work+0x3a1/0x4f0 net/mptcp/pm_netlink.c:943
       mptcp_worker+0x15a/0x1240 net/mptcp/protocol.c:2777
       process_one_work+0x958/0x1b30 kernel/workqueue.c:3229
       process_scheduled_works kernel/workqueue.c:3310 [inline]
       worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
       kthread+0x2c1/0x3a0 kernel/kthread.c:389
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Freed by task 113:
       kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
       kasan_save_track+0x14/0x30 mm/kasan/common.c:68
       kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:230 [inline]
       slab_free_hook mm/slub.c:2342 [inline]
       slab_free mm/slub.c:4579 [inline]
       kfree+0x14f/0x4b0 mm/slub.c:4727
       kvfree+0x47/0x50 mm/util.c:701
       kvfree_rcu_list+0xf5/0x2c0 kernel/rcu/tree.c:3423
       kvfree_rcu_drain_ready kernel/rcu/tree.c:3563 [inline]
       kfree_rcu_monitor+0x503/0x8b0 kernel/rcu/tree.c:3632
       kfree_rcu_shrink_scan+0x245/0x3a0 kernel/rcu/tree.c:3966
       do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
       shrink_slab+0x32b/0x12a0 mm/shrinker.c:662
       shrink_one+0x47e/0x7b0 mm/vmscan.c:4818
       shrink_many mm/vmscan.c:4879 [inline]
       lru_gen_shrink_node mm/vmscan.c:4957 [inline]
       shrink_node+0x2452/0x39d0 mm/vmscan.c:5937
       kswapd_shrink_node mm/vmscan.c:6765 [inline]
       balance_pgdat+0xc19/0x18f0 mm/vmscan.c:6957
       kswapd+0x5ea/0xbf0 mm/vmscan.c:7226
       kthread+0x2c1/0x3a0 kernel/kthread.c:389
       ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Last potentially related work creation:
       kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
       __kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:541
       kvfree_call_rcu+0x74/0xbe0 kernel/rcu/tree.c:3810
       subflow_ulp_release+0x2ae/0x350 net/mptcp/subflow.c:2009
       tcp_cleanup_ulp+0x7c/0x130 net/ipv4/tcp_ulp.c:124
       tcp_v4_destroy_sock+0x1c5/0x6a0 net/ipv4/tcp_ipv4.c:2541
       inet_csk_destroy_sock+0x1a3/0x440 net/ipv4/inet_connection_sock.c:1293
       tcp_done+0x252/0x350 net/ipv4/tcp.c:4870
       tcp_rcv_state_process+0x379b/0x4f30 net/ipv4/tcp_input.c:6933
       tcp_v4_do_rcv+0x1ad/0xa90 net/ipv4/tcp_ipv4.c:1938
       sk_backlog_rcv include/net/sock.h:1115 [inline]
       __release_sock+0x31b/0x400 net/core/sock.c:3072
       __tcp_close+0x4f3/0xff0 net/ipv4/tcp.c:3142
       __mptcp_close_ssk+0x331/0x14d0 net/mptcp/protocol.c:2489
       mptcp_close_ssk net/mptcp/protocol.c:2543 [inline]
       mptcp_close_ssk+0x150/0x220 net/mptcp/protocol.c:2526
       mptcp_pm_nl_rm_addr_or_subflow+0x2be/0xcc0 net/mptcp/pm_netlink.c:878
       mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
       mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
       mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
       genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
       genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
       genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
       netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
       netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
       netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
       netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:729 [inline]
       __sock_sendmsg net/socket.c:744 [inline]
       ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
       ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
       __sys_sendmsg+0x117/0x1f0 net/socket.c:2690
       do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
       __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
       do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
       entry_SYSENTER_compat_after_hwframe+0x84/0x8e
    
      The buggy address belongs to the object at ffff8880569ac800
       which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 88 bytes inside of
       freed 512-byte region [ffff8880569ac800, ffff8880569aca00)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x569ac
      head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
      page_type: f5(slab)
      raw: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
      raw: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
      head: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
      head: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
      head: 04fff00000000002 ffffea00015a6b01 ffffffffffffffff 0000000000000000
      head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 10238, tgid 10238 (kworker/u32:6), ts 597403252405, free_ts 597177952947
       set_page_owner include/linux/page_owner.h:32 [inline]
       post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1537
       prep_new_page mm/page_alloc.c:1545 [inline]
       get_page_from_freelist+0x101e/0x3070 mm/page_alloc.c:3457
       __alloc_pages_noprof+0x223/0x25a0 mm/page_alloc.c:4733
       alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265
       alloc_slab_page mm/slub.c:2412 [inline]
       allocate_slab mm/slub.c:2578 [inline]
       new_slab+0x2ba/0x3f0 mm/slub.c:2631
       ___slab_alloc+0xd1d/0x16f0 mm/slub.c:3818
       __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908
       __slab_alloc_node mm/slub.c:3961 [inline]
       slab_alloc_node mm/slub.c:4122 [inline]
       __kmalloc_cache_noprof+0x2c5/0x310 mm/slub.c:4290
       kmalloc_noprof include/linux/slab.h:878 [inline]
       kzalloc_noprof include/linux/slab.h:1014 [inline]
       mld_add_delrec net/ipv6/mcast.c:743 [inline]
       igmp6_leave_group net/ipv6/mcast.c:2625 [inline]
       igmp6_group_dropped+0x4ab/0xe40 net/ipv6/mcast.c:723
       __ipv6_dev_mc_dec+0x281/0x360 net/ipv6/mcast.c:979
       addrconf_leave_solict net/ipv6/addrconf.c:2253 [inline]
       __ipv6_ifa_notify+0x3f6/0xc30 net/ipv6/addrconf.c:6283
       addrconf_ifdown.isra.0+0xef9/0x1a20 net/ipv6/addrconf.c:3982
       addrconf_notify+0x220/0x19c0 net/ipv6/addrconf.c:3781
       notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
       call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1996
       call_netdevice_notifiers_extack net/core/dev.c:2034 [inline]
       call_netdevice_notifiers net/core/dev.c:2048 [inline]
       dev_close_many+0x333/0x6a0 net/core/dev.c:1589
      page last free pid 13136 tgid 13136 stack trace:
       reset_page_owner include/linux/page_owner.h:25 [inline]
       free_pages_prepare mm/page_alloc.c:1108 [inline]
       free_unref_page+0x5f4/0xdc0 mm/page_alloc.c:2638
       stack_depot_save_flags+0x2da/0x900 lib/stackdepot.c:666
       kasan_save_stack+0x42/0x60 mm/kasan/common.c:48
       kasan_save_track+0x14/0x30 mm/kasan/common.c:68
       unpoison_slab_object mm/kasan/common.c:319 [inline]
       __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4085 [inline]
       slab_alloc_node mm/slub.c:4134 [inline]
       kmem_cache_alloc_noprof+0x121/0x2f0 mm/slub.c:4141
       skb_clone+0x190/0x3f0 net/core/skbuff.c:2084
       do_one_broadcast net/netlink/af_netlink.c:1462 [inline]
       netlink_broadcast_filtered+0xb11/0xef0 net/netlink/af_netlink.c:1540
       netlink_broadcast+0x39/0x50 net/netlink/af_netlink.c:1564
       uevent_net_broadcast_untagged lib/kobject_uevent.c:331 [inline]
       kobject_uevent_net_broadcast lib/kobject_uevent.c:410 [inline]
       kobject_uevent_env+0xacd/0x1670 lib/kobject_uevent.c:608
       device_del+0x623/0x9f0 drivers/base/core.c:3882
       snd_card_disconnect.part.0+0x58a/0x7c0 sound/core/init.c:546
       snd_card_disconnect+0x1f/0x30 sound/core/init.c:495
       snd_usx2y_disconnect+0xe9/0x1f0 sound/usb/usx2y/usbusx2y.c:417
       usb_unbind_interface+0x1e8/0x970 drivers/usb/core/driver.c:461
       device_remove drivers/base/dd.c:569 [inline]
       device_remove+0x122/0x170 drivers/base/dd.c:561
    
    That's because 'subflow' is used just after 'mptcp_close_ssk(subflow)',
    which will initiate the release of its memory. Even if it is very likely
    the release and the re-utilisation will be done later on, it is of
    course better to avoid any issues and read the content of 'subflow'
    before closing it.
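    
    The ordering can be sketched in user space as follows. This is only an
    illustration under stated assumptions: struct subflow_model, its
    remote_id field and both helpers are hypothetical, and free() stands in
    for a close routine that may release the object.
    
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the subflow context. */
struct subflow_model {
    int remote_id;
};

/* Stand-in for a close routine that may release the object. */
static void close_subflow(struct subflow_model *sf)
{
    free(sf);
}

/* Copy what is needed into a local first, then close; sf is not used after. */
static int close_and_get_id(struct subflow_model *sf)
{
    int id;

    assert(sf);
    id = sf->remote_id;     /* read while the object is still valid */
    close_subflow(sf);
    return id;
}
```
    
    Reading the field after close_subflow() instead would be the same kind of
    use-after-free that KASAN reports above.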
    
    Fixes: 1c1f72137598 ("mptcp: pm: only decrement add_addr_accepted for MPJ req")
    Cc: [email protected]
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/[email protected]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mptcp: prevent MPC handshake on port-based signal endpoints [+ + +]
Author: Paolo Abeni <[email protected]>
Date:   Mon Oct 14 16:06:00 2024 +0200

    mptcp: prevent MPC handshake on port-based signal endpoints
    
    commit 3d041393ea8c815f773020fb4a995331a69c0139 upstream.
    
    Syzkaller reported a lockdep splat:
    
      ============================================
      WARNING: possible recursive locking detected
      6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Not tainted
      --------------------------------------------
      syz-executor364/5113 is trying to acquire lock:
      ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
      ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
    
      but task is already holding lock:
      ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
      ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
    
      other info that might help us debug this:
       Possible unsafe locking scenario:
    
             CPU0
             ----
        lock(k-slock-AF_INET);
        lock(k-slock-AF_INET);
    
       *** DEADLOCK ***
    
       May be due to missing lock nesting notation
    
      7 locks held by syz-executor364/5113:
       #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
       #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x153/0x1b10 net/mptcp/protocol.c:1806
       #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
       #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x11f/0x530 net/mptcp/protocol.c:1727
       #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
       #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
       #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5f/0x1b80 net/ipv4/ip_output.c:470
       #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
       #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
       #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1390 net/ipv4/ip_output.c:228
       #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: local_lock_acquire include/linux/local_lock_internal.h:29 [inline]
       #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x33b/0x15b0 net/core/dev.c:6104
       #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
       #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
       #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x230/0x5f0 net/ipv4/ip_input.c:232
       #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
       #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
    
      stack backtrace:
      CPU: 0 UID: 0 PID: 5113 Comm: syz-executor364 Not tainted 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:93 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
       check_deadlock kernel/locking/lockdep.c:3061 [inline]
       validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3855
       __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
       mptcp_sk_clone_init+0x32/0x13c0 net/mptcp/protocol.c:3279
       subflow_syn_recv_sock+0x931/0x1920 net/mptcp/subflow.c:874
       tcp_check_req+0xfe4/0x1a20 net/ipv4/tcp_minisocks.c:853
       tcp_v4_rcv+0x1c3e/0x37f0 net/ipv4/tcp_ipv4.c:2267
       ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
       NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
       NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
       __netif_receive_skb_one_core net/core/dev.c:5661 [inline]
       __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
       process_backlog+0x662/0x15b0 net/core/dev.c:6108
       __napi_poll+0xcb/0x490 net/core/dev.c:6772
       napi_poll net/core/dev.c:6841 [inline]
       net_rx_action+0x89b/0x1240 net/core/dev.c:6963
       handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
       do_softirq+0x11b/0x1e0 kernel/softirq.c:455
       </IRQ>
       <TASK>
       __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
       local_bh_enable include/linux/bottom_half.h:33 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline]
       __dev_queue_xmit+0x1763/0x3e90 net/core/dev.c:4450
       dev_queue_xmit include/linux/netdevice.h:3105 [inline]
       neigh_hh_output include/net/neighbour.h:526 [inline]
       neigh_output include/net/neighbour.h:540 [inline]
       ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:235
       ip_local_out net/ipv4/ip_output.c:129 [inline]
       __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:535
       __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
       tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6542 [inline]
       tcp_rcv_state_process+0x2c32/0x4570 net/ipv4/tcp_input.c:6729
       tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1934
       sk_backlog_rcv include/net/sock.h:1111 [inline]
       __release_sock+0x214/0x350 net/core/sock.c:3004
       release_sock+0x61/0x1f0 net/core/sock.c:3558
       mptcp_sendmsg_fastopen+0x1ad/0x530 net/mptcp/protocol.c:1733
       mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1812
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x1a6/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
       ___sys_sendmsg net/socket.c:2651 [inline]
       __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
       __do_sys_sendmmsg net/socket.c:2766 [inline]
       __se_sys_sendmmsg net/socket.c:2763 [inline]
       __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f04fb13a6b9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffd651f42d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f04fb13a6b9
      RDX: 0000000000000001 RSI: 0000000020000d00 RDI: 0000000000000004
      RBP: 00007ffd651f4310 R08: 0000000000000001 R09: 0000000000000001
      R10: 0000000020000080 R11: 0000000000000246 R12: 00000000000f4240
      R13: 00007f04fb187449 R14: 00007ffd651f42f4 R15: 00007ffd651f4300
       </TASK>
    
    As noted by Cong Wang, the splat is a false positive, but the code
    path leading to the report is an unexpected one: a client is
    attempting an MPC handshake towards the in-kernel listener created
    by the in-kernel PM for a port based signal endpoint.
    
    Such connections will never be accepted; many of them can fill the
    listener queue, preventing the creation of MPJ subflows via such a
    listener, which is its intended role.
    
    Explicitly detect this scenario at initial-syn time and drop the
    incoming MPC request.
    
    Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
    Cc: [email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=f4aacdfef2c6a6529c3e
    Cc: Cong Wang <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    [ Conflicts in mib.[ch], because commit 6982826fe5e5 ("mptcp: fallback
      to TCP after SYN+MPC drops"), and commit 27069e7cb3d1 ("mptcp: disable
      active MPTCP in case of blackhole") are linked to new features, not
      available in this version. Resolving the conflicts is easy, simply
      adding the new lines declaring the new "endpoint attempt" MIB entry. ]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
net: enetc: add missing static descriptor and inline keyword [+ + +]
Author: Wei Fang <[email protected]>
Date:   Fri Oct 11 11:01:03 2024 +0800

    net: enetc: add missing static descriptor and inline keyword
    
    commit 1d7b2ce43d2c22a21dadaf689cb36a69570346a6 upstream.
    
    Fix the build warnings when CONFIG_FSL_ENETC_MDIO is not enabled.
    The detailed warnings are shown as follows.
    
    include/linux/fsl/enetc_mdio.h:62:18: warning: no previous prototype for function 'enetc_hw_alloc' [-Wmissing-prototypes]
          62 | struct enetc_hw *enetc_hw_alloc(struct device *dev, void __iomem *port_regs)
             |                  ^
    include/linux/fsl/enetc_mdio.h:62:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
          62 | struct enetc_hw *enetc_hw_alloc(struct device *dev, void __iomem *port_regs)
             | ^
             | static
    8 warnings generated.
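    
    The pattern behind this fix can be sketched in plain C. This is a
    user-space illustration, not the driver header: DEMO_HAVE_MDIO,
    struct demo_hw and demo_hw_alloc() are hypothetical names, and the
    stub returns NULL where a kernel stub would more likely return an
    error pointer.
    
    ```c
    #include <stddef.h>
    
    struct demo_hw { void *regs; };
    
    #ifdef DEMO_HAVE_MDIO
    /* Feature enabled: the definition lives in a .c file elsewhere. */
    struct demo_hw *demo_hw_alloc(void *port_regs);
    #else
    /*
     * Feature compiled out: provide a stub in the header. "static inline"
     * gives every includer a private copy, so no external definition is
     * emitted and -Wmissing-prototypes has nothing to warn about.
     */
    static inline struct demo_hw *demo_hw_alloc(void *port_regs)
    {
        (void)port_regs;
        return NULL; /* assumption: stands in for an error pointer */
    }
    #endif
    ```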
    
    Fixes: 6517798dd343 ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
    Cc: [email protected]
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Claudiu Manoil <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: enetc: block concurrent XDP transmissions during ring reconfiguration [+ + +]
Author: Wei Fang <[email protected]>
Date:   Thu Oct 10 17:20:54 2024 +0800

    net: enetc: block concurrent XDP transmissions during ring reconfiguration
    
    commit c728a95ccf2a8ba544facfc30a4418d4c68c39f0 upstream.
    
    When testing the XDP_REDIRECT function on the LS1028A platform, we
    found a very reproducible issue that the Tx frames can no longer be
    sent out even if XDP_REDIRECT is turned off. Specifically, if there
    is a lot of traffic on Rx direction, when XDP_REDIRECT is turned on,
    the console may display some warnings like "timeout for tx ring #6
    clear", and all redirected frames will be dropped, the detailed log
    is as follows.
    
    root@ls1028ardb:~# ./xdp-bench redirect eno0 eno2
    Redirecting from eno0 (ifindex 3; driver fsl_enetc) to eno2 (ifindex 4; driver fsl_enetc)
    [203.849809] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #5 clear
    [204.006051] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear
    [204.161944] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear
    eno0->eno2     1420505 rx/s       1420590 err,drop/s      0 xmit/s
      xmit eno0->eno2    0 xmit/s     1420590 drop/s     0 drv_err/s     15.71 bulk-avg
    eno0->eno2     1420484 rx/s       1420485 err,drop/s      0 xmit/s
      xmit eno0->eno2    0 xmit/s     1420485 drop/s     0 drv_err/s     15.71 bulk-avg
    
    Analysis of the XDP_REDIRECT implementation in the enetc driver shows
    that the driver reconfigures the Tx and Rx BD rings when a bpf program
    is installed or uninstalled, but there is no mechanism to block the
    redirected frames while the rings are being reconfigured. Similarly,
    XDP_TX verdicts on received frames can also lead to frames being
    enqueued in the Tx rings. Because XDP ignores the state set by the
    netif_tx_wake_queue() API, introduce the ENETC_TX_DOWN flag to
    suppress transmission of XDP frames.
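    
    The idea of gating the XDP transmit path on a "down" flag can be
    modelled in user space. This is a minimal sketch under stated
    assumptions, not the driver code: DEMO_TX_DOWN, struct demo_priv and
    both functions are hypothetical, and returning -ENETDOWN is this
    sketch's choice of error code.
    
    ```c
    #include <errno.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    
    #define DEMO_TX_DOWN 1u /* hypothetical stand-in for ENETC_TX_DOWN */
    
    struct demo_priv {
        atomic_uint flags;
        unsigned int tx_queued;
    };
    
    /* Set before ring reconfiguration starts, cleared once it is done. */
    static void demo_tx_set_down(struct demo_priv *p, bool down)
    {
        if (down)
            atomic_fetch_or(&p->flags, DEMO_TX_DOWN);
        else
            atomic_fetch_and(&p->flags, ~DEMO_TX_DOWN);
    }
    
    /* XDP transmit path: number of frames enqueued, or a negative errno. */
    static int demo_xdp_xmit(struct demo_priv *p, int num_frames)
    {
        if (atomic_load(&p->flags) & DEMO_TX_DOWN)
            return -ENETDOWN; /* nothing may touch the rings right now */
        p->tx_queued += (unsigned int)num_frames;
        return num_frames;
    }
    ```
    
    The flag is checked on every transmit attempt because, as noted above,
    the XDP path does not consult the regular netdev queue state.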
    
    Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
    Cc: [email protected]
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: enetc: disable NAPI after all rings are disabled [+ + +]
Author: Wei Fang <[email protected]>
Date:   Thu Oct 10 17:20:56 2024 +0800

    net: enetc: disable NAPI after all rings are disabled
    
    commit 6b58fadd44aafbbd6af5f0b965063e1fd2063992 upstream.
    
    When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC
    on LS1028A, it was found that if the command was re-run multiple times,
    Rx could not receive the frames, and the result of xdp-bench showed
    that the rx rate was 0.
    
    root@ls1028ardb:~# ./xdp-bench tx eno0
    Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
    Summary                      2046 rx/s                  0 err,drop/s
    Summary                         0 rx/s                  0 err,drop/s
    Summary                         0 rx/s                  0 err,drop/s
    Summary                         0 rx/s                  0 err,drop/s
    
    By observing the Rx PIR and CIR registers, CIR is always 0x7FF and
    PIR is always 0x7FE, which means that the Rx ring is full and can no
    longer accommodate other Rx frames. Therefore, the problem is caused
    by the Rx BD ring not being cleaned up.
    
    Further analysis of the code revealed that the Rx BD ring will only
    be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met.
    Therefore, some debug logs were added to the driver and the current
    values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx
    BD ring was full. The logs are as follows.
    
    [  178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
    [  178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
    [  178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
    
    From the results, the max value of xdp_tx_in_flight has reached 2140.
    However, the size of the Rx BD ring is only 2048. So xdp_tx_in_flight
    did not drop to 0 after enetc_stop() is called and the driver does not
    clear it. The root cause is that NAPI is disabled too aggressively,
    without having waited for the pending XDP_TX frames to be transmitted,
    and their buffers recycled, so that xdp_tx_in_flight cannot naturally
    drop to 0. Later, enetc_free_tx_ring() does free those stale, unsent
    XDP_TX packets, but it is not coded up to also reset xdp_tx_in_flight,
    hence the manifestation of the bug.
    
    One option would be to cover this extra condition in enetc_free_tx_ring(),
    but now that the ENETC_TX_DOWN exists, we have created a window at
    the beginning of enetc_stop() where NAPI can still be scheduled, but
    any concurrent enqueue will be blocked. Therefore, enetc_wait_bdrs()
    and enetc_disable_tx_bdrs() can be called with NAPI still scheduled,
    and it is guaranteed that this will not wait indefinitely, but instead
    give us an indication that the pending TX frames have orderly dropped
    to zero. Only then should we call napi_disable().
    
    This way, enetc_free_tx_ring() becomes entirely redundant and can be
    dropped as part of subsequent cleanup.
    
    The change also refactors enetc_start() so that it looks like the
    mirror opposite procedure of enetc_stop().
    
    Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing")
    Cc: [email protected]
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Tested-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: enetc: disable Tx BD rings after they are empty [+ + +]
Author: Wei Fang <[email protected]>
Date:   Thu Oct 10 17:20:55 2024 +0800

    net: enetc: disable Tx BD rings after they are empty
    
    commit 0a93f2ca4be6c4616d371f18a3fabad2df7f8d55 upstream.
    
    The Tx BD rings are disabled first in enetc_stop(), and only then does
    the driver wait for them to become empty. This is not safe while the
    ring is actively transmitting frames: the ring may never become empty
    and a hardware exception may occur. As described in the NETC
    block guide, software should only disable an active Tx ring after
    all pending ring entries have been consumed (i.e. when PI = CI).
    Disabling a transmit ring that is actively processing BDs risks
    a HW-SW race hazard whereby a hardware resource becomes assigned
    to work on one or more ring entries only to have those entries be
    removed due to the ring becoming disabled.
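    
    The "drain first, then disable" rule can be illustrated with a toy
    ring. This is a sketch only: struct demo_tx_ring and the demo_*()
    helpers are hypothetical, index wrap-around is ignored, and leaving
    the ring enabled on timeout is a choice made for this example.
    
    ```c
    #include <stdbool.h>
    
    struct demo_tx_ring {
        unsigned int pi; /* producer index: next slot software fills */
        unsigned int ci; /* consumer index: next slot hardware sends */
        bool enabled;
    };
    
    /* Empty means hardware has consumed everything software produced. */
    static bool demo_ring_empty(const struct demo_tx_ring *r)
    {
        return r->pi == r->ci;
    }
    
    /* Simulated hardware progress: transmit one pending entry. */
    static void demo_hw_consume_one(struct demo_tx_ring *r)
    {
        if (!demo_ring_empty(r))
            r->ci++;
    }
    
    /*
     * Poll until PI == CI, then disable the ring. Returns false if the
     * ring did not drain within max_polls steps; the ring stays enabled.
     */
    static bool demo_ring_disable(struct demo_tx_ring *r, int max_polls,
                                  void (*hw_step)(struct demo_tx_ring *))
    {
        while (!demo_ring_empty(r)) {
            if (max_polls-- <= 0)
                return false;
            hw_step(r);
        }
        r->enabled = false;
        return true;
    }
    ```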
    
    When testing the XDP_REDIRECT feature, although all frames were
    blocked from being put into Tx rings during ring reconfiguration, a
    similar warning log was still encountered:
    
    fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear
    fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear
    
    The reason is that disabling a Tx ring that still holds unsent frames
    prevents the remaining frames from being sent. The Tx ring also cannot
    be restored, which means that even if the xdp program is uninstalled,
    Tx frames can no longer be sent. Therefore, correct the operation order
    in enetc_start() and enetc_stop().
    
    Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing")
    Cc: [email protected]
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: enetc: remove xdp_drops statistic from enetc_xdp_drop() [+ + +]
Author: Wei Fang <[email protected]>
Date:   Thu Oct 10 17:20:53 2024 +0800

    net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
    
    commit 412950d5746f7aa139e14fe95338694c1f09b595 upstream.
    
    The xdp_drops statistic indicates the number of XDP frames dropped in
    the Rx direction. However, enetc_xdp_drop() is also used in the XDP_TX
    and XDP_REDIRECT actions. Frame loss in these two actions should not be
    counted in xdp_drops, because xdp_tx_drops and xdp_redirect_failures
    already count it. So remove the xdp_drops statistic from
    enetc_xdp_drop() and increment xdp_drops in the XDP_DROP action.
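    
    The accounting described above can be sketched as follows. The types
    and functions are hypothetical user-space stand-ins (the counter names
    are taken from the text), and the "failed" argument models a Tx or
    redirect attempt that could not be completed.
    
    ```c
    enum demo_xdp_act { DEMO_XDP_DROP, DEMO_XDP_TX, DEMO_XDP_REDIRECT };
    
    struct demo_stats {
        unsigned int xdp_drops;
        unsigned int xdp_tx_drops;
        unsigned int xdp_redirect_failures;
    };
    
    /* Shared helper: would recycle buffers; touches no counter. */
    static void demo_xdp_drop(void)
    {
    }
    
    /* Each verdict updates only its own counter. */
    static void demo_handle(struct demo_stats *s, enum demo_xdp_act act,
                            int failed)
    {
        switch (act) {
        case DEMO_XDP_DROP:
            demo_xdp_drop();
            s->xdp_drops++;
            break;
        case DEMO_XDP_TX:
            if (failed) {
                demo_xdp_drop();
                s->xdp_tx_drops++;
            }
            break;
        case DEMO_XDP_REDIRECT:
            if (failed) {
                demo_xdp_drop();
                s->xdp_redirect_failures++;
            }
            break;
        }
    }
    ```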
    
    Fixes: 7ed2bc80074e ("net: enetc: add support for XDP_TX")
    Cc: [email protected]
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Maciej Fijalkowski <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: fec: Move `fec_ptp_read()` to the top of the file [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Mon Aug 12 11:47:13 2024 +0200

    net: fec: Move `fec_ptp_read()` to the top of the file
    
    commit 4374a1fe580a14f6152752390c678d90311df247 upstream.
    
    This function is used in `fec_ptp_enable_pps()` through
    struct cyclecounter read(). Moving the declaration makes
    it clearer what is happening.
    
    Suggested-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/netdev/[email protected]/T/#ma6c21ad264016c24612048b1483769eaff8cdf20
    Signed-off-by: Csókás, Bence <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: fec: Remove duplicated code [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Mon Aug 12 11:47:15 2024 +0200

    net: fec: Remove duplicated code
    
    commit 713ebaed68d88121cbaf5e74104e2290a9ea74bd upstream.
    
    `fec_ptp_pps_perout()` reimplements logic already
    in `fec_ptp_read()`. Replace with function call.
    
    Signed-off-by: Csókás, Bence <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY [+ + +]
Author: Oleksij Rempel <[email protected]>
Date:   Sun Oct 13 07:29:16 2024 +0200

    net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY
    
    commit d0c3601f2c4e12e7689b0f46ebc17525250ea8c3 upstream.
    
    A boot delay was introduced by commit 79540d133ed6 ("net: macb: Fix
    handling of fixed-link node"). This delay was caused by the call to
    `mdiobus_register()` in cases where a fixed-link PHY was present. The
    MDIO bus registration triggered unnecessary PHY address scans, leading
    to a 20-second delay due to attempts to detect Clause 45 (C45)
    compatible PHYs, despite no MDIO bus being attached.
    
    The commit 79540d133ed6 ("net: macb: Fix handling of fixed-link node")
    was originally introduced to fix a regression caused by commit
    7897b071ac3b4 ("net: macb: convert to phylink"), which caused the driver
    to misinterpret fixed-link nodes as PHY nodes. This resulted in warnings
    like:
    mdio_bus f0028000.ethernet-ffffffff: fixed-link has invalid PHY address
    mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 0
    ...
    mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 31
    
    This patch reworks the logic to avoid allocating and registering the
    MDIO bus when both of the following hold:
      - The device tree contains a fixed-link node.
      - There is no "mdio" child node in the device tree.
    
    If a child node named "mdio" exists, the MDIO bus will be registered to
    support PHYs attached to the MACB's MDIO bus. Otherwise, with only a
    fixed-link, the MDIO bus is skipped.
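    
    The resulting decision can be written as a small predicate. A minimal
    sketch, assuming the two device tree facts have already been looked
    up; demo_needs_mdio_bus() is a hypothetical helper, not a function in
    the macb driver.
    
    ```c
    #include <stdbool.h>
    
    /*
     * Register the MDIO bus unless the node has a fixed-link and no
     * "mdio" child: an "mdio" child always forces registration.
     */
    static bool demo_needs_mdio_bus(bool has_fixed_link, bool has_mdio_child)
    {
        if (has_mdio_child)
            return true;
        return !has_fixed_link;
    }
    ```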
    
    Tested on a sama5d35 based system with a ksz8863 switch attached to
    macb0.
    
    Fixes: 79540d133ed6 ("net: macb: Fix handling of fixed-link node")
    Signed-off-by: Oleksij Rempel <[email protected]>
    Cc: [email protected]
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Oct 14 20:19:22 2024 +0800

    net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test()
    
    commit 217a3d98d1e9891a8b1438a27dfbc64ddf01f691 upstream.
    
    Commit a3c1e45156ad ("net: microchip: vcap: Fix use-after-free error in
    kunit test") fixed the use-after-free error, but introduced the memory
    leaks below by removing a necessary vcap_free_rule() call. Add it to
    fix the leaks.
    
            unreferenced object 0xffffff80ca58b700 (size 192):
              comm "kunit_try_catch", pid 1215, jiffies 4294898264
              hex dump (first 32 bytes):
                00 12 7a 00 05 00 00 00 0a 00 00 00 64 00 00 00  ..z.........d...
                00 00 00 00 00 00 00 00 00 04 0b cc 80 ff ff ff  ................
              backtrace (crc 9c09c3fe):
                [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
                [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
                [<0000000040a01b8d>] vcap_alloc_rule+0x3cc/0x9c4
                [<000000003fe86110>] vcap_api_encode_rule_test+0x1ac/0x16b0
                [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
                [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
                [<00000000c5d82c9a>] kthread+0x2e8/0x374
                [<00000000f4287308>] ret_from_fork+0x10/0x20
            unreferenced object 0xffffff80cc0b0400 (size 64):
              comm "kunit_try_catch", pid 1215, jiffies 4294898265
              hex dump (first 32 bytes):
                80 04 0b cc 80 ff ff ff 18 b7 58 ca 80 ff ff ff  ..........X.....
                39 00 00 00 02 00 00 00 06 05 04 03 02 01 ff ff  9...............
              backtrace (crc daf014e9):
                [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
                [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
                [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
                [<00000000dfdb1e81>] vcap_api_encode_rule_test+0x224/0x16b0
                [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
                [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
                [<00000000c5d82c9a>] kthread+0x2e8/0x374
                [<00000000f4287308>] ret_from_fork+0x10/0x20
            unreferenced object 0xffffff80cc0b0700 (size 64):
              comm "kunit_try_catch", pid 1215, jiffies 4294898265
              hex dump (first 32 bytes):
                80 07 0b cc 80 ff ff ff 28 b7 58 ca 80 ff ff ff  ........(.X.....
                3c 00 00 00 00 00 00 00 01 2f 03 b3 ec ff ff ff  <......../......
              backtrace (crc 8d877792):
                [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
                [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
                [<000000006eadfab7>] vcap_rule_add_action+0x2d0/0x52c
                [<00000000323475d1>] vcap_api_encode_rule_test+0x4d4/0x16b0
                [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
                [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
                [<00000000c5d82c9a>] kthread+0x2e8/0x374
                [<00000000f4287308>] ret_from_fork+0x10/0x20
            unreferenced object 0xffffff80cc0b0900 (size 64):
              comm "kunit_try_catch", pid 1215, jiffies 4294898266
              hex dump (first 32 bytes):
                80 09 0b cc 80 ff ff ff 80 06 0b cc 80 ff ff ff  ................
                7d 00 00 00 01 00 00 00 00 00 00 00 ff 00 00 00  }...............
              backtrace (crc 34181e56):
                [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
                [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
                [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
                [<00000000991e3564>] vcap_val_rule+0xcf0/0x13e8
                [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0
                [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
                [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
                [<00000000c5d82c9a>] kthread+0x2e8/0x374
                [<00000000f4287308>] ret_from_fork+0x10/0x20
            unreferenced object 0xffffff80cc0b0980 (size 64):
              comm "kunit_try_catch", pid 1215, jiffies 4294898266
              hex dump (first 32 bytes):
                18 b7 58 ca 80 ff ff ff 00 09 0b cc 80 ff ff ff  ..X.............
                67 00 00 00 00 00 00 00 01 01 74 88 c0 ff ff ff  g.........t.....
              backtrace (crc 275fd9be):
                [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
                [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
                [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
                [<000000001396a1a2>] test_add_def_fields+0xb0/0x100
                [<000000006e7621f0>] vcap_val_rule+0xa98/0x13e8
                [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0
                [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
                [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
                [<00000000c5d82c9a>] kthread+0x2e8/0x374
                [<00000000f4287308>] ret_from_fork+0x10/0x20
            ......
    
    Cc: [email protected]
    Fixes: a3c1e45156ad ("net: microchip: vcap: Fix use-after-free error in kunit test")
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Jens Emil Schulz Østergaard <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
nilfs2: propagate directory read errors from nilfs_find_entry() [+ + +]
Author: Ryusuke Konishi <[email protected]>
Date:   Fri Oct 4 12:35:31 2024 +0900

    nilfs2: propagate directory read errors from nilfs_find_entry()
    
    commit 08cfa12adf888db98879dbd735bc741360a34168 upstream.
    
    Syzbot reported that a task hang occurs in vcs_open() during a fuzzing
    test for nilfs2.
    
    The root cause of this problem is that nilfs_find_entry(), which
    searches for directory entries, ignores errors when loading a directory
    page/folio via nilfs_get_folio() fails.
    
    If the filesystem image is corrupted, the i_size of the directory
    inode is large, and the directory page/folio is successfully read but
    fails the sanity check (for example, when it is zero-filled),
    nilfs_check_folio() may continue to spit out error messages in bursts.
    
    Fix this issue by propagating the error to the callers when loading a
    page/folio fails in nilfs_find_entry().
    
    The current interface of nilfs_find_entry() and its callers is outdated
    and cannot propagate error codes such as -EIO and -ENOMEM returned via
    nilfs_find_entry(), so fix that as well.
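    
    The difference between "entry not found" and "directory could not be
    read" can be shown with a toy lookup. This is a user-space sketch, not
    the nilfs2 interface: struct demo_page, DEMO_NPAGES and
    demo_find_entry() are hypothetical.
    
    ```c
    #include <errno.h>
    #include <string.h>
    
    #define DEMO_NPAGES 3
    
    /* Hypothetical directory page: one name, or a failed read. */
    struct demo_page {
        const char *name;
        int read_err; /* 0 on success, negative errno on failure */
    };
    
    /*
     * Returns the index of the page holding "name", -ENOENT if every page
     * was read and none matched, or the read error of the first page that
     * failed to load. Skipping failed pages instead would turn an I/O
     * error into a misleading "not found".
     */
    static int demo_find_entry(const struct demo_page *dir, const char *name)
    {
        for (int i = 0; i < DEMO_NPAGES; i++) {
            if (dir[i].read_err)
                return dir[i].read_err; /* propagate, do not skip */
            if (dir[i].name && strcmp(dir[i].name, name) == 0)
                return i;
        }
        return -ENOENT;
    }
    ```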
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: Lizhi Xu <[email protected]>
    Closes: https://lkml.kernel.org/r/[email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=8a192e8d090fa9a31135
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
parport: Proper fix for array out-of-bounds access [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Fri Sep 20 12:32:19 2024 +0200

    parport: Proper fix for array out-of-bounds access
    
    commit 02ac3a9ef3a18b58d8f3ea2b6e46de657bf6c4f9 upstream.
    
    The recent fix for array out-of-bounds accesses replaced sprintf()
    calls blindly with snprintf().  However, since snprintf() returns the
    would-be-printed size, not the actually output size, the length
    calculation can still go over the given limit.
    
    Use scnprintf(), which returns the number of characters actually
    written, instead of snprintf() to address the potential out-of-bounds
    access properly.
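    
    The difference matters when return values are accumulated into an
    offset. The sketch below is a user-space illustration: demo_scnprintf()
    is a hypothetical stand-in that mimics the kernel helper's return
    convention on top of vsnprintf(), and demo_fill() is an invented
    caller.
    
    ```c
    #include <stdarg.h>
    #include <stddef.h>
    #include <stdio.h>
    
    /*
     * Returns the number of characters actually stored in buf, not
     * counting the trailing NUL, so the result is at most size - 1.
     */
    static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
    {
        va_list ap;
        int n;
    
        if (size == 0)
            return 0;
        va_start(ap, fmt);
        n = vsnprintf(buf, size, fmt, ap);
        va_end(ap);
        if (n < 0)
            return 0;
        return (size_t)n >= size ? (int)(size - 1) : n;
    }
    
    /* Appends two fields; len can never move past the end of buf. */
    static int demo_fill(char *buf, size_t size)
    {
        int len = 0;
    
        len += demo_scnprintf(buf + len, size - len, "%s", "0123456789");
        len += demo_scnprintf(buf + len, size - len, "%s", "abcdef");
        return len;
    }
    ```
    
    With plain snprintf() the first call on an 8-byte buffer would return
    10, so the second call would start past the end of the buffer with a
    wrapped-around remaining size.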
    
    Fixes: ab11dac93d2d ("dev/parport: fix the array out-of-bounds risk")
    Cc: [email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
pinctrl: apple: check devm_kasprintf() returned value [+ + +]
Author: Ma Ke <[email protected]>
Date:   Thu Sep 5 10:09:17 2024 +0800

    pinctrl: apple: check devm_kasprintf() returned value
    
    commit 665a58fe663ac7a9ea618dc0b29881649324b116 upstream.
    
    devm_kasprintf() can return a NULL pointer on failure, but the returned
    value is not checked. Add the missing check.
    
    Found by code review.
    
    Cc: [email protected]
    Fixes: a0f160ffcb83 ("pinctrl: add pinctrl/GPIO driver for Apple SoCs")
    Signed-off-by: Ma Ke <[email protected]>
    Reviewed-by: Christophe JAILLET <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

pinctrl: ocelot: fix system hang on level based interrupts [+ + +]
Author: Sergey Matsievskiy <[email protected]>
Date:   Sat Oct 12 13:57:43 2024 +0300

    pinctrl: ocelot: fix system hang on level based interrupts
    
    commit 93b8ddc54507a227087c60a0013ed833b6ae7d3c upstream.
    
    The current implementation only calls chained_irq_enter() and
    chained_irq_exit() if it detects pending interrupts.
    
    ```
    for (i = 0; i < info->stride; i++) {
            regmap_read(info->map, id_reg + 4 * i, &reg);
            if (!reg)
                    continue;
    
            chained_irq_enter(parent_chip, desc);
    ```
    
    However, when a GPIO pin is configured in level mode and the parent
    controller is configured in edge mode, the GPIO interrupt might be
    lowered by the hardware. As a result, if the interrupt is short enough,
    the parent interrupt is still pending while the GPIO interrupt is
    already cleared; chained_irq_enter() never gets called and the system
    hangs trying to service the parent interrupt.
    
    Moving chained_irq_enter() and chained_irq_exit() outside the for loop
    ensures that they are called even when the GPIO interrupt is lowered by
    the hardware.
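    
    The reordered handler can be modelled in user space. This is a sketch
    under stated assumptions: struct demo_irq and the demo_*() functions
    are hypothetical, and the counters merely record what a real handler
    would do to the parent chip and to each pending GPIO interrupt.
    
    ```c
    struct demo_irq {
        unsigned int pending[2]; /* per-bank status, may read as 0 */
        int enters, exits, handled;
    };
    
    static void demo_chained_enter(struct demo_irq *d) { d->enters++; }
    static void demo_chained_exit(struct demo_irq *d)  { d->exits++; }
    
    /* Parent handler with enter/exit hoisted out of the bank loop. */
    static void demo_irq_handler(struct demo_irq *d)
    {
        demo_chained_enter(d); /* runs even if nothing is pending */
        for (int i = 0; i < 2; i++) {
            unsigned int reg = d->pending[i];
    
            while (reg) {
                reg &= reg - 1; /* clear the lowest set bit */
                d->handled++;   /* dispatch one GPIO interrupt */
            }
        }
        demo_chained_exit(d);
    }
    ```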
    
    Similar code, with chained_irq_enter() / chained_irq_exit() wrapping
    the interrupt checking loop, can be found in many other drivers:
    ```
    grep -r -A 10 chained_irq_enter drivers/pinctrl
    ```
    
    Cc: [email protected]
    Signed-off-by: Sergey Matsievskiy <[email protected]>
    Reviewed-by: Alexandre Belloni <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

pinctrl: stm32: check devm_kasprintf() returned value [+ + +]
Author: Ma Ke <[email protected]>
Date:   Fri Sep 6 18:03:26 2024 +0800

    pinctrl: stm32: check devm_kasprintf() returned value
    
    commit b0f0e3f0552a566def55c844b0d44250c58e4df6 upstream.
    
    devm_kasprintf() can return a NULL pointer on failure, but the returned
    value is not checked. Add the missing check.
    
    Found by code review.
    
    Cc: [email protected]
    Fixes: 32c170ff15b0 ("pinctrl: stm32: set default gpio line names using pin names")
    Signed-off-by: Ma Ke <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
posix-clock: Fix missing timespec64 check in pc_clock_settime() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Wed Oct 9 15:23:01 2024 +0800

    posix-clock: Fix missing timespec64 check in pc_clock_settime()
    
    commit d8794ac20a299b647ba9958f6d657051fc51a540 upstream.
    
    As Andrew pointed out, it makes sense for the PTP core to check the
    range of the timespec64 struct's tv_sec and tv_nsec before calling
    ptp->info->settime64().
    
    As the clock_settime() man page says, if tp.tv_sec is negative or
    tp.tv_nsec is outside the range [0..999,999,999], it should return EINVAL.
    This includes dynamic clocks, which handle PTP clocks, and the condition
    is consistent with timespec64_valid(). As Thomas suggested,
    timespec64_valid() only checks that the timespec is valid, but does not
    ensure that the time is in a valid range, so check it up front using
    timespec64_valid_strict() in pc_clock_settime() and return -EINVAL if it
    is not valid.
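    
    The two levels of checking can be sketched in userspace C roughly as
    follows. This is an illustration, not the kernel code: struct ts64,
    ts64_valid(), ts64_valid_strict() and settime_checked() are hypothetical
    stand-ins for struct timespec64, timespec64_valid(),
    timespec64_valid_strict() and pc_clock_settime(), and the upper bound
    SEC_MAX is assumed to mirror the kernel's KTIME_MAX / NSEC_PER_SEC limit.
    
    ```c
    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>
    
    #define NSEC_PER_SEC 1000000000LL
    /* Assumption: mirrors the kernel's limit of KTIME_MAX / NSEC_PER_SEC. */
    #define SEC_MAX (INT64_MAX / NSEC_PER_SEC)
    
    /* Hypothetical stand-in for struct timespec64. */
    struct ts64 {
    	int64_t tv_sec;
    	long tv_nsec;
    };
    
    /* Basic check: non-negative seconds, nanoseconds within one second. */
    static bool ts64_valid(const struct ts64 *ts)
    {
    	if (ts->tv_sec < 0)
    		return false;
    	if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
    		return false;
    	return true;
    }
    
    /* Strict check: also reject seconds too large for a 64-bit ns count. */
    static bool ts64_valid_strict(const struct ts64 *ts)
    {
    	if (!ts64_valid(ts))
    		return false;
    	if (ts->tv_sec >= SEC_MAX)
    		return false;
    	return true;
    }
    
    /* Validate before the value would be handed to a driver callback. */
    static int settime_checked(const struct ts64 *ts)
    {
    	if (!ts64_valid_strict(ts))
    		return -EINVAL;
    	return 0;	/* the driver's settime callback would run here */
    }
    ```
    
    A value such as { INT64_MAX, 0 } passes the basic check in this sketch
    but is rejected by the strict one, which is the gap the strict variant
    is meant to close.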
    
    Some drivers use tp->tv_sec and tp->tv_nsec directly to write registers
    without validity checks, assuming that the higher layer has already
    checked them. This is dangerous, and such drivers benefit from this
    change, for example hclge_ptp_settime(), igb_ptp_settime_i210() and
    _rcar_gen4_ptp_settime(). Some drivers can also remove their own checks.
    
    Cc: [email protected]
    Fixes: 0606f422b453 ("posix clocks: Introduce dynamic clocks")
    Acked-by: Richard Cochran <[email protected]>
    Suggested-by: Andrew Lunn <[email protected]>
    Suggested-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
s390/sclp: Deactivate sclp after all its users [+ + +]
Author: Thomas Weißschuh <[email protected]>
Date:   Mon Oct 14 07:50:06 2024 +0200

    s390/sclp: Deactivate sclp after all its users
    
    commit 0d9dc27df22d9b5c8dc7185c8dddbc14f5468518 upstream.
    
    On reboot the SCLP interface is deactivated through a reboot notifier.
    This happens before other components using SCLP have the chance to run
    their own reboot notifiers.
    Two of those components are the SCLP console and tty drivers which try
    to flush the last outstanding messages.
    At that point the SCLP interface is already unusable and the messages
    are discarded.
    
    Execute sclp_deactivate() as late as possible to avoid this issue.
    
    Fixes: 4ae46db99cd8 ("s390/consoles: improve panic notifiers reliability")
    Cc: [email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Reviewed-by: Sven Schnelle <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
s390/sclp_vt220: Convert newlines to CRLF instead of LFCR [+ + +]
Author: Thomas Weißschuh <[email protected]>
Date:   Mon Oct 14 07:50:07 2024 +0200

    s390/sclp_vt220: Convert newlines to CRLF instead of LFCR
    
    commit dee3df68ab4b00fff6bdf9fc39541729af37307c upstream.
    
    According to the VT220 specification the possible character combinations
    sent on RETURN are only CR or CRLF [0].
    
            The Return key sends either a CR character (0/13) or a CR
            character (0/13) and an LF character (0/10), depending on the
            set/reset state of line feed/new line mode (LNM).
    
    The sclp/vt220 driver however uses LFCR. This can confuse tools, for
    example the kunit runner.
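    
    The intended byte order can be illustrated with a small userspace
    sketch. expand_newlines() is a hypothetical helper, not the driver's
    function; it only shows that each newline becomes CR (13) followed by
    LF (10), rather than the reverse.
    
    ```c
    #include <stddef.h>
    
    /*
     * Copy src to dst, expanding each '\n' into "\r\n" (CR, then LF).
     * Returns the number of bytes written, or 0 if dst is too small.
     * Hypothetical helper for illustration only.
     */
    static size_t expand_newlines(const char *src, size_t len,
    			      char *dst, size_t dst_size)
    {
    	size_t out = 0;
    
    	for (size_t i = 0; i < len; i++) {
    		if (src[i] == '\n') {
    			if (out + 2 > dst_size)
    				return 0;
    			dst[out++] = '\r';	/* CR first ... */
    			dst[out++] = '\n';	/* ... then LF */
    		} else {
    			if (out + 1 > dst_size)
    				return 0;
    			dst[out++] = src[i];
    		}
    	}
    	return out;
    }
    ```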
    
    Link: https://vt100.net/docs/vt220-rm/chapter3.html#S3.2
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: [email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Reviewed-by: Sven Schnelle <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
scsi: ufs: core: Fix the issue of ICU failure [+ + +]
Author: Peter Wang <[email protected]>
Date:   Tue Oct 1 17:19:16 2024 +0800

    scsi: ufs: core: Fix the issue of ICU failure
    
    commit bf0c6cc73f7f91ec70307f7c72343f6cb7d65d01 upstream.
    
    When setting the ICU bit without using read-modify-write, SQRTCy will
    restart SQ again and receive an RTC return error code 2 (Failure - SQ
    not stopped).
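    
    The difference between the two ways of setting the bit can be sketched
    as below, assuming the stop and ICU controls are separate bits of the
    same register. The names and bit positions are illustrative
    assumptions, not taken from the patch, and the register is modelled as
    a plain value instead of an MMIO access.
    
    ```c
    #include <stdint.h>
    
    /* Illustrative bit layout; assumed, not taken from the patch. */
    #define CTRL_STOP	(1u << 0)	/* queue stays stopped while set */
    #define CTRL_ICU	(1u << 1)	/* request cleanup */
    
    /* Plain write: the old value is ignored, so STOP is dropped. */
    static uint32_t set_icu_overwrite(uint32_t reg)
    {
    	(void)reg;
    	return CTRL_ICU;
    }
    
    /* Read-modify-write: existing bits are kept and ICU is added. */
    static uint32_t set_icu_rmw(uint32_t reg)
    {
    	return reg | CTRL_ICU;
    }
    ```
    
    With the overwrite variant the stop bit is lost, which matches the
    described symptom of the queue restarting before the cleanup runs.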
    
    Additionally, the error log has been modified so that this type of error
    can be observed.
    
    Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode")
    Cc: [email protected]
    Signed-off-by: Peter Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Bao D. Nguyen <[email protected]>
    Reviewed-by: Bart Van Assche <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down [+ + +]
Author: Seunghwan Baek <[email protected]>
Date:   Thu Aug 29 18:39:13 2024 +0900

    scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down
    
    commit 19a198b67767d952c8f3d0cf24eb3100522a8223 upstream.
    
    A deadlock has been observed when a reboot is performed early during
    boot. UFS shutdown set SDEV_QUIESCE for the scsi_devices of all LUs, and
    at that time the audio driver was waiting in blk_mq_submit_bio() while
    holding a mutex and reading the fw binary. The deadlock then occurred
    because the audio driver's shutdown was waiting for that mutex to be
    unlocked. To solve this, set SDEV_OFFLINE for all LUs except WLUN, so
    that any I/O that comes down after a UFS shutdown will return an error.
    
    [   31.907781]I[0:      swapper/0:    0]        1        130705007       1651079834      11289729804                0 D(   2) 3 ffffff882e208000 *             init [device_shutdown]
    [   31.907793]I[0:      swapper/0:    0] Mutex: 0xffffff8849a2b8b0: owner[0xffffff882e28cb00 kworker/6:0 :49]
    [   31.907806]I[0:      swapper/0:    0] Call trace:
    [   31.907810]I[0:      swapper/0:    0]  __switch_to+0x174/0x338
    [   31.907819]I[0:      swapper/0:    0]  __schedule+0x5ec/0x9cc
    [   31.907826]I[0:      swapper/0:    0]  schedule+0x7c/0xe8
    [   31.907834]I[0:      swapper/0:    0]  schedule_preempt_disabled+0x24/0x40
    [   31.907842]I[0:      swapper/0:    0]  __mutex_lock+0x408/0xdac
    [   31.907849]I[0:      swapper/0:    0]  __mutex_lock_slowpath+0x14/0x24
    [   31.907858]I[0:      swapper/0:    0]  mutex_lock+0x40/0xec
    [   31.907866]I[0:      swapper/0:    0]  device_shutdown+0x108/0x280
    [   31.907875]I[0:      swapper/0:    0]  kernel_restart+0x4c/0x11c
    [   31.907883]I[0:      swapper/0:    0]  __arm64_sys_reboot+0x15c/0x280
    [   31.907890]I[0:      swapper/0:    0]  invoke_syscall+0x70/0x158
    [   31.907899]I[0:      swapper/0:    0]  el0_svc_common+0xb4/0xf4
    [   31.907909]I[0:      swapper/0:    0]  do_el0_svc+0x2c/0xb0
    [   31.907918]I[0:      swapper/0:    0]  el0_svc+0x34/0xe0
    [   31.907928]I[0:      swapper/0:    0]  el0t_64_sync_handler+0x68/0xb4
    [   31.907937]I[0:      swapper/0:    0]  el0t_64_sync+0x1a0/0x1a4
    
    [   31.908774]I[0:      swapper/0:    0]       49                0         11960702      11236868007                0 D(   2) 6 ffffff882e28cb00 *      kworker/6:0 [__bio_queue_enter]
    [   31.908783]I[0:      swapper/0:    0] Call trace:
    [   31.908788]I[0:      swapper/0:    0]  __switch_to+0x174/0x338
    [   31.908796]I[0:      swapper/0:    0]  __schedule+0x5ec/0x9cc
    [   31.908803]I[0:      swapper/0:    0]  schedule+0x7c/0xe8
    [   31.908811]I[0:      swapper/0:    0]  __bio_queue_enter+0xb8/0x178
    [   31.908818]I[0:      swapper/0:    0]  blk_mq_submit_bio+0x194/0x67c
    [   31.908827]I[0:      swapper/0:    0]  __submit_bio+0xb8/0x19c
    
    Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
    Cc: [email protected]
    Signed-off-by: Seunghwan Baek <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Bart Van Assche <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
selftest: hid: add the missing tests directory [+ + +]
Author: Yun Lu <[email protected]>
Date:   Tue Oct 15 17:15:20 2024 +0800

    selftest: hid: add the missing tests directory
    
    commit fe05c40ca9c18cfdb003f639a30fc78a7ab49519 upstream.
    
    Commit 160c826b4dd0 ("selftest: hid: add missing run-hid-tools-tests.sh")
    added the run-hid-tools-tests.sh script so that it gets installed, but
    I forgot to add the tests directory along with it.
    
    Running the test case without the tests directory results in the
    following error message:
    
        make -C tools/testing/selftests/ TARGETS=hid install \
                INSTALL_PATH=$KSFT_INSTALL_PATH
        cd $KSFT_INSTALL_PATH
        ./run_kselftest.sh -t hid:hid-core.sh
    
      /usr/lib/python3.11/site-packages/_pytest/config/__init__.py:331: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
      Plugin: helpconfig, Hook: pytest_cmdline_parse
      UsageError: usage: __main__.py [options] [file_or_dir] [file_or_dir] [...]
      __main__.py: error: unrecognized arguments: --udevd
        inifile: None
        rootdir: /root/linux/kselftest_install/hid
    
    In fact, the run-hid-tools-tests.sh script uses the scripts in the tests
    directory to run tests, so the tests directory also needs to be
    installed.
    
    Fixes: ffb85d5c9e80 ("selftests: hid: import hid-tools hid-core tests")
    Cc: [email protected]
    Signed-off-by: Yun Lu <[email protected]>
    Acked-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
selftests/mm: fix deadlock for fork after pthread_create on ARM [+ + +]
Author: Edward Liaw <[email protected]>
Date:   Thu Oct 3 21:17:11 2024 +0000

    selftests/mm: fix deadlock for fork after pthread_create on ARM
    
    commit e142cc87ac4ec618f2ccf5f68aedcd6e28a59d9d upstream.
    
    On Android with arm, there is some synchronization needed to avoid a
    deadlock when forking after pthread_create.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cff294582798 ("selftests/mm: extend and rename uffd pagemap test")
    Signed-off-by: Edward Liaw <[email protected]>
    Cc: Lokesh Gidra <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests/mm: replace atomic_bool with pthread_barrier_t [+ + +]
Author: Edward Liaw <[email protected]>
Date:   Thu Oct 3 21:17:10 2024 +0000

    selftests/mm: replace atomic_bool with pthread_barrier_t
    
    commit e61ef21e27e8deed8c474e9f47f4aa7bc37e138c upstream.
    
    Patch series "selftests/mm: fix deadlock after pthread_create".
    
    On Android arm, pthread_create followed by a fork caused a deadlock in the
    case where the fork required work to be completed by the created thread.
    
    Update the synchronization primitive to use pthread_barrier instead of
    atomic_bool.
    
    Apply the same fix to the wp-fork-with-event test.
    
    
    This patch (of 2):
    
    Swap synchronization primitive with pthread_barrier, so that stdatomic.h
    does not need to be included.
    
    The synchronization is needed on Android ARM64; we see a deadlock with
    pthread_create when the parent thread races forward before the child has a
    chance to start doing work.
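    
    The synchronization pattern can be sketched as follows. This is a
    minimal illustration of waiting on a pthread barrier after
    pthread_create(), not the selftest code itself; worker(),
    start_worker_and_wait(), stop_worker() and the worker_started flag are
    hypothetical names.
    
    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <pthread.h>
    
    static pthread_barrier_t ready_barrier;
    static int worker_started;
    
    /* Worker: record that it runs, then meet the parent at the barrier. */
    static void *worker(void *arg)
    {
    	(void)arg;
    	worker_started = 1;
    	pthread_barrier_wait(&ready_barrier);
    	return NULL;
    }
    
    /*
     * Create the worker and block until it has reached the barrier, so
     * the parent cannot race ahead (for example into a fork()).
     * Returns 0 on success, -1 on failure.
     */
    static int start_worker_and_wait(pthread_t *thread)
    {
    	if (pthread_barrier_init(&ready_barrier, NULL, 2))
    		return -1;
    	if (pthread_create(thread, NULL, worker, NULL)) {
    		pthread_barrier_destroy(&ready_barrier);
    		return -1;
    	}
    	pthread_barrier_wait(&ready_barrier);
    	return 0;
    }
    
    /* Join the worker before destroying the barrier it used. */
    static int stop_worker(pthread_t thread)
    {
    	if (pthread_join(thread, NULL))
    		return -1;
    	pthread_barrier_destroy(&ready_barrier);
    	return 0;
    }
    ```
    
    Unlike a hand-rolled flag, the barrier needs no stdatomic.h include and
    no busy-waiting in the parent.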
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cff294582798 ("selftests/mm: extend and rename uffd pagemap test")
    Signed-off-by: Edward Liaw <[email protected]>
    Cc: Lokesh Gidra <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
selftests: mptcp: join: change capture/checksum as bool [+ + +]
Author: Geliang Tang <[email protected]>
Date:   Fri Oct 18 17:57:37 2024 +0200

    selftests: mptcp: join: change capture/checksum as bool
    
    commit 8c6f6b4bb53a904f922dfb90d566391d3feee32c upstream.
    
    To maintain consistency with other scripts, this patch changes the
    'capture' and 'checksum' vars to bool vars in mptcp_join.
    
    Signed-off-by: Geliang Tang <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240223-upstream-net-next-20240223-misc-improvements-v1-7-b6c8a10396bd@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: 5afca7e996c4 ("selftests: mptcp: join: test for prohibited MPC to port-based endp")
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests: mptcp: join: test for prohibited MPC to port-based endp [+ + +]
Author: Paolo Abeni <[email protected]>
Date:   Fri Oct 18 17:57:38 2024 +0200

    selftests: mptcp: join: test for prohibited MPC to port-based endp
    
    commit 5afca7e996c42aed1b4a42d4712817601ba42aff upstream.
    
    Explicitly verify that MPC connection attempts towards a port-based
    signal endpoint fail with a reset.
    
    Note that this new test is a bit different from the other ones, as it
    does not use 'run_tests'. It therefore needs the capture capability and
    the picking of the right port, which have been extracted into three new
    helpers. The info about the capture can also be printed from a single
    point, which simplifies the exit paths in do_transfer().
    
    The 'Fixes' tag here below is the same as the one from the previous
    commit: this patch here is not fixing anything wrong in the selftests,
    but it validates the previous fix for an issue introduced by this commit
    ID.
    
    Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
    Cc: [email protected]
    Co-developed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    [ Conflicts in mptcp_join.sh, because commit 0bd962dd86b2 ("selftests:
      mptcp: join: check CURRESTAB counters"), and commit 9e6a39ecb9a1
      ("selftests: mptcp: export TEST_COUNTER variable") are linked to new
      features, not available in this version. Resolving the conflicts is
      easy, simply adding the new helpers before do_transfer(), and rename
      MPTCP_LIB_TEST_COUNTER to TEST_COUNT that was used before. ]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests: mptcp: remove duplicated variables [+ + +]
Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Fri Oct 18 17:57:39 2024 +0200

    selftests: mptcp: remove duplicated variables
    
    A few weeks ago, there were some backport issues in MPTCP selftests,
    because some patches have been applied twice, but with versions handling
    conflicts differently [1].
    
    Patches fixing these issues have been sent [2] and applied, but it looks
    like quilt was still confused with the removal of some patches, and
    commit a417ef47a665 ("selftests: mptcp: join: validate event numbers")
    duplicated some variables, not present in the original patch [3].
    
    Anyway, Bash was complaining, but that was not causing any tests to
    fail. Also, that's easy to fix by simply removing the duplicated ones.
    
    Link: https://lore.kernel.org/[email protected] [1]
    Link: https://lore.kernel.org/[email protected] [2]
    Link: https://lore.kernel.org/[email protected] [3]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
serial: imx: Update mctrl old_status on RTSD interrupt [+ + +]
Author: Marek Vasut <[email protected]>
Date:   Wed Oct 2 20:40:38 2024 +0200

    serial: imx: Update mctrl old_status on RTSD interrupt
    
    commit 40d7903386df4d18f04d90510ba90eedee260085 upstream.
    
    When sending data using DMA at a high baud rate (4 Mbaud in a local test
    case) to a device with a small RX buffer which keeps asserting RTS after
    every received byte, it is possible that the iMX UART driver would not
    recognize the falling edge of the RTS input signal and get stuck, unable
    to transmit any more data.
    
    This condition happens when the following sequence of events occurs:
    - imx_uart_mctrl_check() is called at some point and takes a snapshot of UART
      control signal status into sport->old_status using imx_uart_get_hwmctrl().
      The RTSS/TIOCM_CTS bit is of interest here (*).
    - DMA transfer occurs, the remote device asserts RTS signal after each byte.
      The i.MX UART driver recognizes each such RTS signal change, raises an
      interrupt with USR1 register RTSD bit set, which leads to invocation of
      __imx_uart_rtsint(), which calls uart_handle_cts_change().
      - If the RTS signal is deasserted, uart_handle_cts_change() clears
        port->hw_stopped and unblocks the port for further data transfers.
      - If the RTS is asserted, uart_handle_cts_change() sets port->hw_stopped
        and blocks the port for further data transfers. This may occur as the
        last interrupt of a transfer, which means port->hw_stopped remains set
        and the port remains blocked (**).
    - Any further data transfer attempts will trigger imx_uart_mctrl_check(),
      which will read current status of UART control signals by calling
      imx_uart_get_hwmctrl() (***) and compare it with sport->old_status .
      - If current status differs from sport->old_status for RTS signal,
        uart_handle_cts_change() is called and possibly unblocks the port
        by clearing port->hw_stopped .
      - If current status does not differ from sport->old_status for RTS
        signal, no action occurs. This may occur in case prior snapshot (*)
        was taken before any transfer so the RTS is deasserted, current
        snapshot (***) was taken after a transfer and therefore RTS is
        deasserted again, which means current status and sport->old_status
        are identical. In case (**) triggered when RTS got asserted, and
        made port->hw_stopped set, the port->hw_stopped will remain set
        because no change on RTS line is recognized by this driver and
        uart_handle_cts_change() is not called from here to unblock the
        port->hw_stopped.
    
    Update sport->old_status in __imx_uart_rtsint() accordingly to make
    imx_uart_mctrl_check() detect such RTS change. Note that TIOCM_CAR
    and TIOCM_RI bits in sport->old_status do not suffer from this problem.
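    
    The sequence above can be replayed with a small userspace model. This
    is a simplified sketch, not the driver: struct model_port and the
    model_*() functions are hypothetical, a single boolean cts_active
    stands for the TIOCM_CTS bit (true meaning the remote side is ready to
    receive), and model_cts_change() only mirrors the role of
    uart_handle_cts_change() with respect to hw_stopped.
    
    ```c
    #include <stdbool.h>
    
    /* Hypothetical, simplified model of the driver state. */
    struct model_port {
    	bool hw_stopped;	/* transmission is blocked while true */
    	bool old_cts;		/* CTS state in the last snapshot */
    };
    
    /* CTS active unblocks the port, CTS inactive blocks it. */
    static void model_cts_change(struct model_port *p, bool cts_active)
    {
    	p->hw_stopped = !cts_active;
    }
    
    /* Interrupt path; update_snapshot selects the fixed behaviour. */
    static void model_rtsint(struct model_port *p, bool cts_active,
    			 bool update_snapshot)
    {
    	if (update_snapshot)
    		p->old_cts = cts_active;
    	model_cts_change(p, cts_active);
    }
    
    /* Polling path: only acts if the line differs from the snapshot. */
    static void model_mctrl_check(struct model_port *p, bool cts_active)
    {
    	if (p->old_cts == cts_active)
    		return;
    	p->old_cts = cts_active;
    	model_cts_change(p, cts_active);
    }
    
    /* Replay the sequence; returns true if the port ends up stuck. */
    static bool model_port_gets_stuck(bool with_fix)
    {
    	/* Snapshot (*) taken while the line was idle. */
    	struct model_port p = { .hw_stopped = false, .old_cts = true };
    
    	/* Last interrupt of the transfer blocks the port (**). */
    	model_rtsint(&p, false, with_fix);
    	/* Next check (***) sees the line idle again. */
    	model_mctrl_check(&p, true);
    	return p.hw_stopped;
    }
    ```
    
    In this model the port stays blocked when the interrupt path leaves
    the snapshot untouched, and is unblocked by the next check once the
    interrupt path updates it.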
    
    Fixes: ceca629e0b48 ("[ARM] 2971/1: i.MX uart handle rts irq")
    Cc: stable <[email protected]>
    Reviewed-by: Esben Haabendal <[email protected]>
    Signed-off-by: Marek Vasut <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: qcom-geni: fix dma rx cancellation [+ + +]
Author: Johan Hovold <[email protected]>
Date:   Wed Oct 9 16:51:05 2024 +0200

    serial: qcom-geni: fix dma rx cancellation
    
    commit 23ee4a25661c33e6381d41e848a9060ed6d72845 upstream.
    
    Make sure to wait for the DMA transfer to complete when cancelling the
    rx command on stop_rx(). This specifically prevents the DMA completion
    interrupt from firing after rx has been restarted, something which can
    lead to an IOMMU fault and hosed rx when the interrupt handler unmaps
    the DMA buffer for the new command:
    
            qcom_geni_serial 988000.serial: serial engine reports 0 RX bytes in!
            arm-smmu 15000000.iommu: FSR    = 00000402 [Format=2 TF], SID=0x563
            arm-smmu 15000000.iommu: FSYNR0 = 00210013 [S1CBNDX=33 WNR PLVL=3]
            Bluetooth: hci0: command 0xfc00 tx timeout
            Bluetooth: hci0: Reading QCA version information failed (-110)
    
    Also add the missing state machine reset which is needed in case
    cancellation fails.
    
    Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
    Cc: [email protected]      # 6.3
    Cc: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: qcom-geni: fix polled console initialisation [+ + +]
Author: Johan Hovold <[email protected]>
Date:   Wed Oct 9 16:51:02 2024 +0200

    serial: qcom-geni: fix polled console initialisation
    
    commit 4bef7c6f299910f19876ad8e7f5897514855f1d2 upstream.
    
    The polled console (KGDB/KDB) implementation must not call port setup
    unconditionally as the port may already be in use by the console or a
    getty.
    
    Only make sure that the receiver is enabled, but do not enable any
    device interrupts.
    
    Fixes: d8851a96ba25 ("tty: serial: qcom-geni-serial: Add a poll_init() function")
    Cc: [email protected]      # 6.4
    Cc: Douglas Anderson <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Reviewed-by: Douglas Anderson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: qcom-geni: fix receiver enable [+ + +]
Author: Johan Hovold <[email protected]>
Date:   Wed Oct 9 16:51:06 2024 +0200

    serial: qcom-geni: fix receiver enable
    
    commit fa103d2599e11e802c818684cff821baefe7f206 upstream.
    
    The receiver is supposed to be enabled in the startup() callback and not
    in set_termios() which is called also during console setup.
    
    This specifically avoids accepting input before the port has been opened
    (and interrupts enabled), something which can also break the GENI
    firmware (cancel fails and after abort, the "stale" counter handling
    appears to be broken so that later input is not processed until twelve
    chars have been received).
    
    There also does not appear to be any need to keep the receiver disabled
    while updating the port settings.
    
    Since commit 6f3c3cafb115 ("serial: qcom-geni: disable interrupts during
    console writes") the calls to manipulate the secondary interrupts, which
    were done without holding the port lock, can also lead to the receiver
    being left disabled when set_termios() races with the console code (e.g.
    when init opens the tty during boot). This can manifest itself as a
    serial getty not accepting input.
    
    The calls to stop and start rx in set_termios() can similarly race with
    DMA completion and, for example, cause the DMA buffer to be unmapped
    twice or the mapping to be leaked.
    
    Fix this by only enabling the receiver during startup and while holding
    the port lock to avoid racing with the console code.
    
    Fixes: 6f3c3cafb115 ("serial: qcom-geni: disable interrupts during console writes")
    Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
    Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
    Cc: [email protected]      # 6.3
    Cc: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Reviewed-by: Douglas Anderson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: qcom-geni: revert broken hibernation support [+ + +]
Author: Johan Hovold <[email protected]>
Date:   Wed Oct 9 16:51:03 2024 +0200

    serial: qcom-geni: revert broken hibernation support
    
    commit 19df76662a33d2f2fc41a66607cb8285fc02d6ec upstream.
    
    This reverts commit 35781d8356a2eecaa6074ceeb80ee22e252fcdae.
    
    Hibernation is not supported on Qualcomm platforms with mainline
    kernels yet a broken vendor implementation for the GENI serial driver
    made it upstream.
    
    This is effectively dead code that cannot be tested and should just be
    removed, but if these paths were ever hit for an open non-console port
    they would crash the machine as the driver would fail to enable clocks
    during restore() (i.e. all ports would have to be closed by drivers and
    user space before hibernating the system to avoid this as a comment in
    the code hinted at).
    
    The broken implementation also added a random call to enable the
    receiver in the port setup code where it does not belong and which
    enables the receiver prematurely for console ports.
    
    Fixes: 35781d8356a2 ("tty: serial: qcom-geni-serial: Add support for Hibernation feature")
    Cc: [email protected]      # 6.2
    Cc: Aniket Randive <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
tcp: fix mptcp DSS corruption due to large pmtu xmit [+ + +]
Author: Paolo Abeni <[email protected]>
Date:   Fri Oct 18 17:57:36 2024 +0200

    tcp: fix mptcp DSS corruption due to large pmtu xmit
    
    commit 4dabcdf581217e60690467a37c956a5b8dbc6bd9 upstream.
    
    Syzkaller was able to trigger a DSS corruption:
    
      TCP: request_sock_subflow_v4: Possible SYN flooding on port [::]:20002. Sending cookies.
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 5227 at net/mptcp/protocol.c:695 __mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
      Modules linked in:
      CPU: 0 UID: 0 PID: 5227 Comm: syz-executor350 Not tainted 6.11.0-syzkaller-08829-gaf9c191ac2a0 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
      RIP: 0010:__mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
      Code: 0f b6 dc 31 ff 89 de e8 b5 dd ea f5 89 d8 48 81 c4 50 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 98 da ea f5 90 <0f> 0b 90 e9 47 ff ff ff e8 8a da ea f5 90 0f 0b 90 e9 99 e0 ff ff
      RSP: 0018:ffffc90000006db8 EFLAGS: 00010246
      RAX: ffffffff8ba9df18 RBX: 00000000000055f0 RCX: ffff888030023c00
      RDX: 0000000000000100 RSI: 00000000000081e5 RDI: 00000000000055f0
      RBP: 1ffff110062bf1ae R08: ffffffff8ba9cf12 R09: 1ffff110062bf1b8
      R10: dffffc0000000000 R11: ffffed10062bf1b9 R12: 0000000000000000
      R13: dffffc0000000000 R14: 00000000700cec61 R15: 00000000000081e5
      FS:  000055556679c380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020287000 CR3: 0000000077892000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <IRQ>
       move_skbs_to_msk net/mptcp/protocol.c:811 [inline]
       mptcp_data_ready+0x29c/0xa90 net/mptcp/protocol.c:854
       subflow_data_ready+0x34a/0x920 net/mptcp/subflow.c:1490
       tcp_data_queue+0x20fd/0x76c0 net/ipv4/tcp_input.c:5283
       tcp_rcv_established+0xfba/0x2020 net/ipv4/tcp_input.c:6237
       tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
       tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2350
       ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
       NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
       NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
       __netif_receive_skb_one_core net/core/dev.c:5662 [inline]
       __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
       process_backlog+0x662/0x15b0 net/core/dev.c:6107
       __napi_poll+0xcb/0x490 net/core/dev.c:6771
       napi_poll net/core/dev.c:6840 [inline]
       net_rx_action+0x89b/0x1240 net/core/dev.c:6962
       handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
       do_softirq+0x11b/0x1e0 kernel/softirq.c:455
       </IRQ>
       <TASK>
       __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
       local_bh_enable include/linux/bottom_half.h:33 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
       __dev_queue_xmit+0x1764/0x3e80 net/core/dev.c:4451
       dev_queue_xmit include/linux/netdevice.h:3094 [inline]
       neigh_hh_output include/net/neighbour.h:526 [inline]
       neigh_output include/net/neighbour.h:540 [inline]
       ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236
       ip_local_out net/ipv4/ip_output.c:130 [inline]
       __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:536
       __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
       tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline]
       tcp_mtu_probe net/ipv4/tcp_output.c:2547 [inline]
       tcp_write_xmit+0x641d/0x6bf0 net/ipv4/tcp_output.c:2752
       __tcp_push_pending_frames+0x9b/0x360 net/ipv4/tcp_output.c:3015
       tcp_push_pending_frames include/net/tcp.h:2107 [inline]
       tcp_data_snd_check net/ipv4/tcp_input.c:5714 [inline]
       tcp_rcv_established+0x1026/0x2020 net/ipv4/tcp_input.c:6239
       tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
       sk_backlog_rcv include/net/sock.h:1113 [inline]
       __release_sock+0x214/0x350 net/core/sock.c:3072
       release_sock+0x61/0x1f0 net/core/sock.c:3626
       mptcp_push_release net/mptcp/protocol.c:1486 [inline]
       __mptcp_push_pending+0x6b5/0x9f0 net/mptcp/protocol.c:1625
       mptcp_sendmsg+0x10bb/0x1b10 net/mptcp/protocol.c:1903
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x1a6/0x270 net/socket.c:745
       ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2603
       ___sys_sendmsg net/socket.c:2657 [inline]
       __sys_sendmsg+0x2aa/0x390 net/socket.c:2686
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7fb06e9317f9
      Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007ffe2cfd4f98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007fb06e97f468 RCX: 00007fb06e9317f9
      RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000005
      RBP: 00007fb06e97f446 R08: 0000555500000000 R09: 0000555500000000
      R10: 0000555500000000 R11: 0000000000000246 R12: 00007fb06e97f406
      R13: 0000000000000001 R14: 00007ffe2cfd4fe0 R15: 0000000000000003
       </TASK>
    
    Additionally syzkaller provided a nice reproducer. The repro enables
    pmtu on the loopback device, leading to tcp_mtu_probe() generating
    very large probe packets.
    
    tcp_can_coalesce_send_queue_head() currently does not check for
    mptcp-level invariants, and allowed the creation of cross-DSS probes,
    leading to the mentioned corruption.
    
    Address the issue by teaching tcp_can_coalesce_send_queue_head() about
    mptcp, using tcp_skb_can_collapse(), which also reduces code
    duplication.
    
    Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions")
    Cc: [email protected]
    Reported-by: [email protected]
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/513
    Signed-off-by: Paolo Abeni <[email protected]>
    Acked-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    [ Conflict in tcp_output.c, because the commit 65249feb6b3d ("net: add
      support for skbs with unreadable frags") is not in this version. This
      commit is linked to a new feature (Devmem TCP) and introduces a new
      condition which causes the conflicts. Resolving this is easy: we can
      ignore the missing new condition, and use tcp_skb_can_collapse() like
      in the original patch. ]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
tty: n_gsm: Fix use-after-free in gsm_cleanup_mux [+ + +]
Author: Longlong Xia <[email protected]>
Date:   Thu Sep 26 21:02:13 2024 +0800

    tty: n_gsm: Fix use-after-free in gsm_cleanup_mux
    
    commit 9462f4ca56e7d2430fdb6dcc8498244acbfc4489 upstream.
    
    BUG: KASAN: slab-use-after-free in gsm_cleanup_mux+0x77b/0x7b0
    drivers/tty/n_gsm.c:3160 [n_gsm]
    Read of size 8 at addr ffff88815fe99c00 by task poc/3379
    CPU: 0 UID: 0 PID: 3379 Comm: poc Not tainted 6.11.0+ #56
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX
    Desktop Reference Platform, BIOS 6.00 11/12/2020
    Call Trace:
     <TASK>
     gsm_cleanup_mux+0x77b/0x7b0 drivers/tty/n_gsm.c:3160 [n_gsm]
     __pfx_gsm_cleanup_mux+0x10/0x10 drivers/tty/n_gsm.c:3124 [n_gsm]
     __pfx_sched_clock_cpu+0x10/0x10 kernel/sched/clock.c:389
     update_load_avg+0x1c1/0x27b0 kernel/sched/fair.c:4500
     __pfx_min_vruntime_cb_rotate+0x10/0x10 kernel/sched/fair.c:846
     __rb_insert_augmented+0x492/0xbf0 lib/rbtree.c:161
     gsmld_ioctl+0x395/0x1450 drivers/tty/n_gsm.c:3408 [n_gsm]
     _raw_spin_lock_irqsave+0x92/0xf0 arch/x86/include/asm/atomic.h:107
     __pfx_gsmld_ioctl+0x10/0x10 drivers/tty/n_gsm.c:3822 [n_gsm]
     ktime_get+0x5e/0x140 kernel/time/timekeeping.c:195
     ldsem_down_read+0x94/0x4e0 arch/x86/include/asm/atomic64_64.h:79
     __pfx_ldsem_down_read+0x10/0x10 drivers/tty/tty_ldsem.c:338
     __pfx_do_vfs_ioctl+0x10/0x10 fs/ioctl.c:805
     tty_ioctl+0x643/0x1100 drivers/tty/tty_io.c:2818
    
    Allocated by task 65:
     gsm_data_alloc.constprop.0+0x27/0x190 drivers/tty/n_gsm.c:926 [n_gsm]
     gsm_send+0x2c/0x580 drivers/tty/n_gsm.c:819 [n_gsm]
     gsm1_receive+0x547/0xad0 drivers/tty/n_gsm.c:3038 [n_gsm]
     gsmld_receive_buf+0x176/0x280 drivers/tty/n_gsm.c:3609 [n_gsm]
     tty_ldisc_receive_buf+0x101/0x1e0 drivers/tty/tty_buffer.c:391
     tty_port_default_receive_buf+0x61/0xa0 drivers/tty/tty_port.c:39
     flush_to_ldisc+0x1b0/0x750 drivers/tty/tty_buffer.c:445
     process_scheduled_works+0x2b0/0x10d0 kernel/workqueue.c:3229
     worker_thread+0x3dc/0x950 kernel/workqueue.c:3391
     kthread+0x2a3/0x370 kernel/kthread.c:389
     ret_from_fork+0x2d/0x70 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:257
    
    Freed by task 3367:
     kfree+0x126/0x420 mm/slub.c:4580
     gsm_cleanup_mux+0x36c/0x7b0 drivers/tty/n_gsm.c:3160 [n_gsm]
     gsmld_ioctl+0x395/0x1450 drivers/tty/n_gsm.c:3408 [n_gsm]
     tty_ioctl+0x643/0x1100 drivers/tty/tty_io.c:2818
    
    [Analysis]
    A gsm_msg on the tx_ctrl_list or tx_data_list of a gsm_mux can be
    freed by multiple threads through ioctl, which leads to a
    use-after-free. Protect the lists with the gsm tx lock.
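
    The locking idea in the analysis can be sketched in user space. This
    is only an illustration under stated assumptions: struct msg, struct
    mux, mux_queue() and mux_cleanup() are hypothetical stand-ins, and a
    pthread mutex takes the place of the kernel lock. It is not the code
    from the patch.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gsm_msg and gsm_mux; not the kernel types. */
struct msg {
    struct msg *next;
};

struct mux {
    pthread_mutex_t tx_lock;   /* plays the role of the gsm tx lock */
    struct msg *tx_list;       /* plays the role of tx_ctrl_list / tx_data_list */
};

/* Queue one message under the lock; 0 on success, -1 if allocation fails. */
int mux_queue(struct mux *m)
{
    struct msg *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    pthread_mutex_lock(&m->tx_lock);
    n->next = m->tx_list;
    m->tx_list = n;
    pthread_mutex_unlock(&m->tx_lock);
    return 0;
}

/* Free every queued message while holding the lock; returns the count freed. */
int mux_cleanup(struct mux *m)
{
    int freed = 0;

    pthread_mutex_lock(&m->tx_lock);
    while (m->tx_list) {
        struct msg *cur = m->tx_list;

        m->tx_list = cur->next;   /* unlink before freeing */
        free(cur);
        freed++;
    }
    pthread_mutex_unlock(&m->tx_lock);
    return freed;
}
```

    Because unlinking and freeing happen under the same lock, a second
    caller racing into mux_cleanup() finds an empty list instead of
    entries that are already freed.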
    
    Signed-off-by: Longlong Xia <[email protected]>
    Cc: stable <[email protected]>
    Suggested-by: Jiri Slaby <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ublk: don't allow user copy for unprivileged device [+ + +]
Author: Ming Lei <[email protected]>
Date:   Wed Oct 16 21:48:47 2024 +0800

    ublk: don't allow user copy for unprivileged device
    
    commit 42aafd8b48adac1c3b20fe5892b1b91b80c1a1e6 upstream.
    
    UBLK_F_USER_COPY requires userspace to call write() on the ublk char
    device to fill the request buffer, and an unprivileged device can't
    be trusted.
    
    So don't allow user copy for unprivileged device.
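
    A minimal sketch of such a flag check follows. The SKETCH_F_* names
    and their bit positions are placeholders for illustration (the real
    definitions live in the ublk UAPI header), and returning an error for
    the combination is an assumption; the message above only says the
    combination is not allowed.

```c
#include <stdint.h>

/* Placeholder bit positions, assumed for illustration only. */
#define SKETCH_F_UNPRIVILEGED_DEV (UINT64_C(1) << 5)
#define SKETCH_F_USER_COPY        (UINT64_C(1) << 7)

/* Returns 0 if the flag combination is acceptable, -1 otherwise. */
int check_dev_flags(uint64_t flags)
{
    /* User copy on an unprivileged device is the rejected combination. */
    if ((flags & SKETCH_F_UNPRIVILEGED_DEV) && (flags & SKETCH_F_USER_COPY))
        return -1;
    return 0;
}
```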
    
    Cc: [email protected]
    Fixes: 1172d5b8beca ("ublk: support user copy")
    Signed-off-by: Ming Lei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG [+ + +]
Author: Prashanth K <[email protected]>
Date:   Tue Sep 24 15:02:08 2024 +0530

    usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
    
    commit c96e31252110a84dcc44412e8a7b456b33c3e298 upstream.
    
    DWC3 programming guide mentions that when operating in USB2.0 speeds,
    if GUSB2PHYCFG[6] or GUSB2PHYCFG[8] is set, it must be cleared prior
    to issuing commands and may be set again after the command completes.
    But currently while issuing EndXfer command without CmdIOC set, we
    wait for 1ms after GUSB2PHYCFG is restored. This results in cases
    where EndXfer command doesn't get completed and causes SMMU faults
    since requests are unmapped afterwards. Hence restore GUSB2PHYCFG
    after waiting for EndXfer command completion.
    
    Cc: [email protected]
    Fixes: 1d26ba0944d3 ("usb: dwc3: Wait unconditionally after issuing EndXfer command")
    Signed-off-by: Prashanth K <[email protected]>
    Acked-by: Thinh Nguyen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
USB: serial: option: add support for Quectel EG916Q-GL [+ + +]
Author: Benjamin B. Frost <[email protected]>
Date:   Wed Sep 11 10:54:05 2024 +0200

    USB: serial: option: add support for Quectel EG916Q-GL
    
    commit 540eff5d7faf0c9330ec762da49df453263f7676 upstream.
    
    Add Quectel EG916Q-GL with product ID 0x6007
    
    T:  Bus=01 Lev=02 Prnt=02 Port=01 Cnt=01 Dev#=  3 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=2c7c ProdID=6007 Rev= 2.00
    S:  Manufacturer=Quectel
    S:  Product=EG916Q-GL
    C:* #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=200mA
    A:  FirstIf#= 4 IfCount= 2 Cls=02(comm.) Sub=06 Prot=00
    I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=82(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=84(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=86(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether
    E:  Ad=88(I) Atr=03(Int.) MxPS=  32 Ivl=32ms
    I:  If#= 5 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
    I:* If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    MI_00 Quectel USB Diag Port
    MI_01 Quectel USB NMEA Port
    MI_02 Quectel USB AT Port
    MI_03 Quectel USB Modem Port
    MI_04 Quectel USB Net Port
    
    Signed-off-by: Benjamin B. Frost <[email protected]>
    Reviewed-by: Lars Melin <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: serial: option: add Telit FN920C04 MBIM compositions [+ + +]
Author: Daniele Palmas <[email protected]>
Date:   Thu Oct 3 11:38:08 2024 +0200

    USB: serial: option: add Telit FN920C04 MBIM compositions
    
    commit 6d951576ee16430822a8dee1e5c54d160e1de87d upstream.
    
    Add the following Telit FN920C04 compositions:
    
    0x10a2: MBIM + tty (AT/NMEA) + tty (AT) + tty (diag)
    T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10a2 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FN920
    S:  SerialNumber=92c4c4d8
    C:  #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    0x10a7: MBIM + tty (AT) + tty (AT) + tty (diag)
    T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 18 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10a7 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FN920
    S:  SerialNumber=92c4c4d8
    C:  #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    0x10aa: MBIM + tty (AT) + tty (diag) + DPL (data packet logging) + adb
    T:  Bus=03 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 15 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10aa Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FN920
    S:  SerialNumber=92c4c4d8
    C:  #Ifs= 6 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 4 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Signed-off-by: Daniele Palmas <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF [+ + +]
Author: Jonathan Marek <[email protected]>
Date:   Sat Oct 5 10:41:46 2024 -0400

    usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF
    
    commit ffe85c24d7ca5de7d57690c0ab194b3838674935 upstream.
    
    This line is overwriting the result of the above switch-case.
    
    This fixes the tcpm driver getting stuck in a "Sink TX No Go" loop.
    
    Fixes: a4422ff22142 ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver")
    Cc: stable <[email protected]>
    Signed-off-by: Jonathan Marek <[email protected]>
    Acked-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Heikki Krogerus <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
vt: prevent kernel-infoleak in con_font_get() [+ + +]
Author: Jeongjun Park <[email protected]>
Date:   Fri Oct 11 02:46:19 2024 +0900

    vt: prevent kernel-infoleak in con_font_get()
    
    commit f956052e00de211b5c9ebaa1958366c23f82ee9e upstream.
    
    Depending on the implementation of vc->vc_sw->con_font_get, not all of
    the memory allocated for font.data may be initialized, which can leak
    kernel memory contents. To prevent this, zero-initialize the allocated
    buffer; this generally does not affect overall system performance.
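
    The effect of zeroing the buffer can be shown with a user-space
    analogy. This is a sketch under assumptions: partial_fill() is a
    hypothetical callback that writes only part of the buffer, and
    calloc() stands in for a zeroing kernel allocation. It is not the vt
    code itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver callback that fills only the first `used` bytes. */
static void partial_fill(unsigned char *buf, size_t used)
{
    memset(buf, 0xAA, used);
}

/*
 * Allocate a zeroed buffer of `total` bytes, then let the callback fill
 * part of it. Bytes the callback never touches stay zero, so copying the
 * whole buffer out later exposes no stale memory. The caller frees it.
 */
unsigned char *get_font_buffer(size_t total, size_t used)
{
    unsigned char *buf = calloc(total, 1);

    if (!buf)
        return NULL;
    if (used > total)
        used = total;
    partial_fill(buf, used);
    return buf;
}
```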
    
    Cc: [email protected]
    Reported-by: [email protected]
    Fixes: 05e2600cb0a4 ("VT: Bump font size limitation to 64x128 pixels")
    Signed-off-by: Jeongjun Park <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/apic: Always explicitly disarm TSC-deadline timer [+ + +]
Author: Zhang Rui <[email protected]>
Date:   Tue Oct 15 14:15:22 2024 +0800

    x86/apic: Always explicitly disarm TSC-deadline timer
    
    commit ffd95846c6ec6cf1f93da411ea10d504036cab42 upstream.
    
    New processors have become pickier about the local APIC timer state
    before entering low power modes. These low power modes are used (for
    example) when you close your laptop lid and suspend. If you put your
    laptop in a bag and it is not in this low power mode, it is likely
    to get quite toasty while it quickly sucks the battery dry.
    
    The problem boils down to some CPUs' inability to power down until the
    CPU recognizes that the local APIC timer is shut down. The current
    kernel code works in one-shot and periodic modes but does not work for
    deadline mode. Deadline mode has been the supported and preferred mode
    on Intel CPUs for over a decade and uses an MSR to drive the timer
    instead of an APIC register.
    
    Disable the TSC Deadline timer in lapic_timer_shutdown() by writing to
    MSR_IA32_TSC_DEADLINE when in TSC-deadline mode. Also avoid writing
    to the initial-count register (APIC_TMICT) which is ignored in
    TSC-deadline mode.
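
    The shutdown logic described above can be modelled roughly as
    follows. This is a toy user-space model, not the kernel function:
    struct fake_timer stands in for the real APIC registers and MSR, and
    the LVT_* bit values are assumptions modelled on the LVT timer
    register layout.

```c
#include <stdint.h>

/* Assumed bit values, modelled on the x86 LVT timer register layout. */
#define LVT_MASKED            (UINT32_C(1) << 16)
#define LVT_TIMER_TSCDEADLINE (UINT32_C(2) << 17)

/* Fake timer state standing in for the real APIC registers and MSR. */
struct fake_timer {
    uint32_t lvtt;          /* models APIC_LVTT */
    uint32_t tmict;         /* models APIC_TMICT (initial count) */
    uint64_t tsc_deadline;  /* models MSR_IA32_TSC_DEADLINE */
};

void timer_shutdown(struct fake_timer *t)
{
    t->lvtt |= LVT_MASKED;              /* mask the timer interrupt */
    if (t->lvtt & LVT_TIMER_TSCDEADLINE)
        t->tsc_deadline = 0;            /* deadline mode: disarm via the MSR */
    else
        t->tmict = 0;                   /* other modes: clear the initial count */
}
```

    In deadline mode the model leaves tmict alone, matching the point
    that the initial-count register is ignored there.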
    
    Note: The APIC_LVTT|=APIC_LVT_MASKED operation should theoretically be
    enough to tell the hardware that the timer will not fire in any of the
    timer modes. But mitigating AMD erratum 411[1] also requires clearing
    out APIC_TMICT. Solely setting APIC_LVT_MASKED is also ineffective in
    practice on Intel Lunar Lake systems, which is the motivation for this
    change.
    
    1. 411 Processor May Exit Message-Triggered C1E State Without an Interrupt if Local APIC Timer Reaches Zero - https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/revision-guides/41322_10h_Rev_Gd.pdf
    
    Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available")
    Suggested-by: Dave Hansen <[email protected]>
    Signed-off-by: Zhang Rui <[email protected]>
    Signed-off-by: Dave Hansen <[email protected]>
    Reviewed-by: Rafael J. Wysocki <[email protected]>
    Tested-by: Srinivas Pandruvada <[email protected]>
    Tested-by: Todd Brandt <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/20241015061522.25288-1-rui.zhang%40intel.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/bugs: Do not use UNTRAIN_RET with IBPB on entry [+ + +]
Author: Johannes Wikner <[email protected]>
Date:   Tue Oct 8 12:58:03 2024 +0200

    x86/bugs: Do not use UNTRAIN_RET with IBPB on entry
    
    commit c62fa117c32bd1abed9304c58e0da6940f8c7fc2 upstream.
    
    Since X86_FEATURE_ENTRY_IBPB will invalidate all harmful predictions
    with IBPB, no software-based untraining of returns is needed anymore.
    Currently, this change affects retbleed and SRSO mitigations so if
    either of the mitigations is doing IBPB and the other one does the
    software sequence, the latter is not needed anymore.
    
      [ bp: Massage commit message. ]
    
    Suggested-by: Borislav Petkov <[email protected]>
    Signed-off-by: Johannes Wikner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/bugs: Skip RSB fill at VMEXIT [+ + +]
Author: Johannes Wikner <[email protected]>
Date:   Tue Oct 8 12:36:30 2024 +0200

    x86/bugs: Skip RSB fill at VMEXIT
    
    commit 0fad2878642ec46225af2054564932745ac5c765 upstream.
    
    entry_ibpb() is designed to follow Intel's IBPB specification regardless
    of CPU. This includes invalidating RSB entries.
    
    Hence, if IBPB on VMEXIT has been selected, entry_ibpb() as part of the
    RET untraining in the VMEXIT path will take care of all BTB and RSB
    clearing so there's no need to explicitly fill the RSB anymore.
    
      [ bp: Massage commit message. ]
    
    Suggested-by: Borislav Petkov <[email protected]>
    Signed-off-by: Johannes Wikner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/bugs: Use code segment selector for VERW operand [+ + +]
Author: Pawan Gupta <[email protected]>
Date:   Thu Sep 26 09:10:31 2024 -0700

    x86/bugs: Use code segment selector for VERW operand
    
    commit e4d2102018542e3ae5e297bc6e229303abff8a0f upstream.
    
    Robert Gill reported below #GP in 32-bit mode when dosemu software was
    executing vm86() system call:
    
      general protection fault: 0000 [#1] PREEMPT SMP
      CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1
      Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010
      EIP: restore_all_switch_stack+0xbe/0xcf
      EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
      ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc
      DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046
      CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0
      Call Trace:
       show_regs+0x70/0x78
       die_addr+0x29/0x70
       exc_general_protection+0x13c/0x348
       exc_bounds+0x98/0x98
       handle_exception+0x14d/0x14d
       exc_bounds+0x98/0x98
       restore_all_switch_stack+0xbe/0xcf
       exc_bounds+0x98/0x98
       restore_all_switch_stack+0xbe/0xcf
    
    This only happens in 32-bit mode when VERW based mitigations like MDS/RFDS
    are enabled. This is because segment registers with an arbitrary user value
    can result in #GP when executing VERW. Intel SDM vol. 2C documents the
    following behavior for VERW instruction:
    
      #GP(0) - If a memory operand effective address is outside the CS, DS, ES,
               FS, or GS segment limit.
    
    CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user
    space. Use %cs selector to reference VERW operand. This ensures VERW will
    not #GP for an arbitrary user %ds.
    
    [ mingo: Fixed the SOB chain. ]
    
    Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
    Reported-by: Robert Gill <[email protected]>
    Reviewed-by: Andrew Cooper <[email protected]>
    Cc: [email protected] # 5.10+
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
    Closes: https://lore.kernel.org/all/[email protected]/
    Suggested-by: Dave Hansen <[email protected]>
    Suggested-by: Brian Gerst <[email protected]>
    Signed-off-by: Pawan Gupta <[email protected]>
    Signed-off-by: Dave Hansen <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load [+ + +]
Author: John Allen <[email protected]>
Date:   Mon Sep 23 16:44:04 2024 +0000

    x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load
    
    commit ee4d4e8d2c3bec6ee652599ab31991055a72c322 upstream.
    
    Commit
    
      f69759be251d ("x86/CPU/AMD: Move Zenbleed check to the Zen2 init function")
    
    causes a bit in the DE_CFG MSR to get set erroneously after a microcode late
    load.
    
    The microcode late load path calls into amd_check_microcode() and subsequently
    zen2_zenbleed_check(). Since the above commit removes the cpu_has_amd_erratum()
    call from zen2_zenbleed_check(), this will cause all non-Zen2 CPUs to go
    through the function and set the bit in the DE_CFG MSR.
    
    Call into the Zenbleed fix path on Zen2 CPUs only.
    
      [ bp: Massage commit message, use cpu_feature_enabled(). ]
    
    Fixes: f69759be251d ("x86/CPU/AMD: Move Zenbleed check to the Zen2 init function")
    Signed-off-by: John Allen <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Acked-by: Borislav Petkov (AMD) <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/cpufeatures: Add a IBPB_NO_RET BUG flag [+ + +]
Author: Johannes Wikner <[email protected]>
Date:   Mon Sep 23 20:49:34 2024 +0200

    x86/cpufeatures: Add a IBPB_NO_RET BUG flag
    
    commit 3ea87dfa31a7b0bb0ff1675e67b9e54883013074 upstream.
    
    Set this flag if the CPU has an IBPB implementation that does not
    invalidate return target predictions. Zen generations < 4 do not flush
    the RSB when executing an IBPB and this bug flag denotes that.
    
      [ bp: Massage. ]
    
    Signed-off-by: Johannes Wikner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET [+ + +]
Author: Jim Mattson <[email protected]>
Date:   Fri Sep 13 10:32:27 2024 -0700

    x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET
    
    commit ff898623af2ed564300752bba83a680a1e4fec8d upstream.
    
    AMD's initial implementation of IBPB did not clear the return address
    predictor. Beginning with Zen4, AMD's IBPB *does* clear the return address
    predictor. This behavior is enumerated by CPUID.80000008H:EBX.IBPB_RET[30].
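
    As a small illustration of the enumeration, the bit test on an EBX
    value read from CPUID leaf 0x80000008 could look like this. The
    helper name ebx_has_ibpb_ret() is hypothetical, and how the EBX value
    is obtained is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 30 of EBX in CPUID leaf 0x80000008, per the text above. */
#define IBPB_RET_BIT 30

/* True if the given EBX value advertises IBPB_RET. */
bool ebx_has_ibpb_ret(uint32_t ebx)
{
    return (ebx >> IBPB_RET_BIT) & 1u;
}
```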
    
    Define X86_FEATURE_AMD_IBPB_RET for use in KVM_GET_SUPPORTED_CPUID,
    when determining cross-vendor capabilities.
    
    Suggested-by: Venkatesh Srinivas <[email protected]>
    Signed-off-by: Jim Mattson <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Tom Lendacky <[email protected]>
    Reviewed-by: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/entry: Have entry_ibpb() invalidate return predictions [+ + +]
Author: Johannes Wikner <[email protected]>
Date:   Mon Sep 23 20:49:36 2024 +0200

    x86/entry: Have entry_ibpb() invalidate return predictions
    
    commit 50e4b3b94090babe8d4bb85c95f0d3e6b07ea86e upstream.
    
    entry_ibpb() should invalidate all indirect predictions, including return
    target predictions. Not all IBPB implementations do this, in which case the
    fallback is RSB filling.
    
    Prevent SRSO-style hijacks of return predictions following IBPB, as the return
    target predictor can be corrupted before the IBPB completes.
    
      [ bp: Massage. ]
    
    Signed-off-by: Johannes Wikner <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/entry_32: Clear CPU buffers after register restore in NMI return [+ + +]
Author: Pawan Gupta <[email protected]>
Date:   Wed Sep 25 15:25:44 2024 -0700

    x86/entry_32: Clear CPU buffers after register restore in NMI return
    
    commit 48a2440d0f20c826b884e04377ccc1e4696c84e9 upstream.
    
    CPU buffers are currently cleared after the call to exc_nmi, but before
    register state is restored. This may be okay for MDS mitigation but not for
    RFDS, because RFDS mitigation requires CPU buffers to be cleared when
    registers don't have any sensitive data.
    
    Move CLEAR_CPU_BUFFERS after RESTORE_ALL_NMI.
    
    Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
    Suggested-by: Dave Hansen <[email protected]>
    Signed-off-by: Pawan Gupta <[email protected]>
    Signed-off-by: Dave Hansen <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/20240925-fix-dosemu-vm86-v7-2-1de0daca2d42%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/entry_32: Do not clobber user EFLAGS.ZF [+ + +]
Author: Pawan Gupta <[email protected]>
Date:   Wed Sep 25 15:25:38 2024 -0700

    x86/entry_32: Do not clobber user EFLAGS.ZF
    
    commit 2e2e5143d4868163d6756c8c6a4d28cbfa5245e5 upstream.
    
    Opportunistic SYSEXIT executes VERW to clear CPU buffers after user EFLAGS
    are restored. This can clobber user EFLAGS.ZF.
    
    Move CLEAR_CPU_BUFFERS before the user EFLAGS are restored. This ensures
    that the user EFLAGS.ZF is not clobbered.
    
    Closes: https://lore.kernel.org/lkml/yVXwe8gvgmPADpRB6lXlicS2fcHoV5OHHxyuFbB_MEleRPD7-KhGe5VtORejtPe-KCkT8Uhcg5d7-IBw4Ojb4H7z5LQxoZylSmJ8KNL3A8o=@protonmail.com/
    Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
    Reported-by: Jari Ruusu <[email protected]>
    Signed-off-by: Pawan Gupta <[email protected]>
    Signed-off-by: Dave Hansen <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/20240925-fix-dosemu-vm86-v7-1-1de0daca2d42%40linux.intel.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
x86/resctrl: Annotate get_mem_config() functions as __init [+ + +]
Author: Nathan Chancellor <[email protected]>
Date:   Tue Sep 17 09:02:53 2024 -0700

    x86/resctrl: Annotate get_mem_config() functions as __init
    
    commit d5fd042bf4cfb557981d65628e1779a492cd8cfa upstream.
    
    After a recent LLVM change [1] that deduces __cold on functions that only call
    cold code (such as __init functions), there is a section mismatch warning from
    __get_mem_config_intel(), which got moved to .text.unlikely. as a result of
    that optimization:
    
      WARNING: modpost: vmlinux: section mismatch in reference: \
      __get_mem_config_intel+0x77 (section: .text.unlikely.) -> thread_throttle_mode_init (section: .init.text)
    
    Mark __get_mem_config_intel() as __init as well since it is only called
    from __init code, which clears up the warning.
    
    While __rdt_get_mem_config_amd() does not exhibit a warning because it
    does not call any __init code, it is a similar function that is only
    called from __init code like __get_mem_config_intel(), so mark it __init
    as well to keep the code symmetrical.
    
    CONFIG_SECTION_MISMATCH_WARN_ONLY=n would turn this into a fatal error.
    
    Fixes: 05b93417ce5b ("x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)")
    Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature")
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Reinette Chatre <[email protected]>
    Cc: <[email protected]>
    Link: https://github.com/llvm/llvm-project/commit/6b11573b8c5e3d36beee099dbe7347c2a007bf53 [1]
    Link: https://lore.kernel.org/r/20240917-x86-restctrl-get_mem_config_intel-init-v3-1-10d521256284@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
xfs: allow symlinks with short remote targets [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:21 2024 -0700

    xfs: allow symlinks with short remote targets
    
    commit 38de567906d95c397d87f292b892686b7ec6fbc3 upstream.
    
    An internal user complained about log recovery failing on a symlink
    ("Bad dinode after recovery") with the following (excerpted) format:
    
    core.magic = 0x494e
    core.mode = 0120777
    core.version = 3
    core.format = 2 (extents)
    core.nlinkv2 = 1
    core.nextents = 1
    core.size = 297
    core.nblocks = 1
    core.naextents = 0
    core.forkoff = 0
    core.aformat = 2 (extents)
    u3.bmx[0] = [startoff,startblock,blockcount,extentflag]
    0:[0,12,1,0]
    
    This is a symbolic link with a 297-byte target stored in a disk block,
    which is to say this is a symlink with a remote target.  The forkoff is
    0, which is to say that there's 512 - 176 == 336 bytes in the inode core
    to store the data fork.
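
    The arithmetic can be spelled out in a short sketch. The constants
    come from the example above (512-byte inode, 176-byte core); the
    helper names are hypothetical, and the "fits inline" comparison is an
    assumption about how the space is used, not the verifier code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the example: 512-byte inode, 176-byte inode core. */
enum { INODE_SIZE = 512, INODE_CORE_SIZE = 176 };

/* Bytes available to the data fork when forkoff is 0 (no attr fork). */
size_t data_fork_space(void)
{
    return INODE_SIZE - INODE_CORE_SIZE;
}

/* True if a symlink target of `len` bytes would fit in the data fork. */
bool target_fits_inline(size_t len)
{
    return len <= data_fork_space();
}
```

    With these numbers the 297-byte target would fit in the 336 bytes
    available, yet the inode stores it in a remote block.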
    
    Eventually, testing of generic/388 failed with the same inode corruption
    message during inode recovery.  In writing a debugging patch to call
    xfs_dinode_verify on dirty inode log items when we're committing
    transactions, I observed that xfs/298 can reproduce the problem quite
    quickly.
    
    xfs/298 creates a symbolic link, adds some extended attributes, then
    deletes them all.  The test failure occurs when the final removexattr
    also deletes the attr fork because that does not convert the remote
    symlink back into a shortform symlink.  That is how we trip this test.
    The only reason why xfs/298 only triggers with the debug patch added is
    that it deletes the symlink, so the final iflush shows the inode as
    free.
    
    I wrote a quick fstest to emulate the behavior of xfs/298, except that
    it leaves the symlinks on the filesystem after inducing the "corrupt"
    state.  Kernels going back at least as far as 4.18 have written out
    symlink inodes in this manner and prior to 1eb70f54c445f they did not
    object to reading them back in.
    
    Because we've been writing out inodes this way for quite some time, the
    only way to fix this is to relax the check for symbolic links.
    Directories don't have this problem because di_size is bumped to
    blocksize during the sf->data conversion.
    
    Fixes: 1eb70f54c445f ("xfs: validate inode fork size against fork format")
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: allow unlinked symlinks and dirs with zero size [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:26 2024 -0700

    xfs: allow unlinked symlinks and dirs with zero size
    
    commit 1ec9307fc066dd8a140d5430f8a7576aa9d78cd3 upstream.
    
    For a very very long time, inode inactivation has set the inode size to
    zero before unmapping the extents associated with the data fork.
    Unfortunately, commit 3c6f46eacd876 changed the inode verifier to
    prohibit zero-length symlinks and directories.  If an inode happens to
    get logged in this state and the system crashes before freeing the
    inode, log recovery will also fail on the broken inode.
    
    Therefore, allow zero-size symlinks and directories as long as the link
    count is zero; nobody will be able to open these files by handle so
    there isn't any risk of data exposure.
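
    As a rough illustration of the relaxed rule (not the kernel code), the
    check can be modelled as below.  The function name and its arguments
    are hypothetical; only the rule itself comes from the text above.

```python
import stat


def inode_size_ok(mode: int, size: int, nlink: int) -> bool:
    """Hypothetical model of the relaxed verifier rule.

    A symlink or directory with a zero size is accepted only when its
    link count is zero (i.e. it is unlinked and awaiting inactivation).
    Every other combination is left to the remaining checks, which this
    sketch does not model.
    """
    if (stat.S_ISLNK(mode) or stat.S_ISDIR(mode)) and size == 0:
        return nlink == 0
    return True
```

    For example, inode_size_ok(stat.S_IFLNK | 0o777, 0, 0) accepts an
    unlinked zero-length symlink, while the same inode with nlink == 1 is
    still rejected.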
    
    Fixes: 3c6f46eacd876 ("xfs: sanity check directory inode di_size")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2 [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:10 2024 -0700

    xfs: check opcode and iovec count match in xlog_recover_attri_commit_pass2
    
    commit ad206ae50eca62836c5460ab5bbf2a6c59a268e7 upstream.
    
    Check that the number of recovered log iovecs matches what the xattri
    opcode expects.
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: check shortform attr entry flags specifically [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:12 2024 -0700

    xfs: check shortform attr entry flags specifically
    
    commit 309dc9cbbb4379241bcc9b5a6a42c04279a0e5a7 upstream.
    
    While reviewing flag checking in the attr scrub functions, we noticed
    that the shortform attr scanner didn't catch entries that have the LOCAL
    or INCOMPLETE bits set.  Neither of these flags can ever be set on a
    shortform attr, so we need to check this narrower set of valid flags.
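
    A minimal sketch of the narrower check, assuming the bit positions
    shown in the comments (they are assumptions of this sketch, as is the
    SF_VALID_FLAGS name):

```python
# Bit values below are assumptions made for this sketch.
XFS_ATTR_LOCAL = 1 << 0       # value stored inside the leaf entry
XFS_ATTR_ROOT = 1 << 1        # "trusted" namespace
XFS_ATTR_SECURE = 1 << 2      # "security" namespace
XFS_ATTR_INCOMPLETE = 1 << 7  # entry is mid-update

# Hypothetical mask: only namespace bits may appear on a shortform entry.
SF_VALID_FLAGS = XFS_ATTR_ROOT | XFS_ATTR_SECURE


def sf_entry_flags_ok(flags: int) -> bool:
    """Return True if no bit outside the shortform-valid set is present."""
    return (flags & ~SF_VALID_FLAGS) == 0
```

    With this mask, an entry carrying LOCAL or INCOMPLETE is flagged even
    though both bits are legal on leaf-format entries.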
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: convert delayed extents to unwritten when zeroing post eof blocks [+ + +]
Author: Zhang Yi <[email protected]>
Date:   Tue Oct 15 17:11:20 2024 -0700

    xfs: convert delayed extents to unwritten when zeroing post eof blocks
    
    commit 5ce5674187c345dc31534d2024c09ad8ef29b7ba upstream.
    
    The current clone operation can be non-atomic if the destination range
    of the file is beyond EOF; after a crash, the user could get a file
    with corrupted (zeroed) data.
    
    The problem is about preallocations. If you write some data into a file:
    
            [A...B)
    
    and XFS decides to preallocate some post-eof blocks, then it can create
    a delayed allocation reservation:
    
            [A.........D)
    
    The writeback path tries to convert delayed extents to real ones by
    allocating blocks. If there isn't enough contiguous free space, we can
    end up with two extents, the first real and the second still delalloc:
    
            [A....C)[C.D)
    
    After that, both the in-memory and the on-disk file sizes are still B.
    If we clone into the range [E...F) from another file:
    
            [A....C)[C.D)      [E...F)
    
    then xfs_reflink_zero_posteof() calls iomap_zero_range() to zero out the
    range [B, E) beyond EOF and flush it. Since [C, D) is still a delalloc
    extent, its pagecache will be zeroed and both the in-memory and on-disk
    size will be updated to D after flushing but before cloning. This is
    wrong, because the user can see the size change and read the zeroes
    while the clone operation is ongoing.
    
    We need to keep the in-memory and on-disk size before the clone
    operation starts, so instead of writing zeroes through the page cache
    for delayed ranges beyond EOF, we convert these ranges to unwritten and
    invalidate any cached data over that range beyond EOF.
    
    Suggested-by: Dave Chinner <[email protected]>
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: enforce one namespace per attribute [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:14 2024 -0700

    xfs: enforce one namespace per attribute
    
    commit ea0b3e814741fb64e7785b564ea619578058e0b0 upstream.
    
    [backport: fix conflicts due to various xattr refactoring]
    
    Create a standardized helper function to enforce one namespace bit per
    extended attribute, and refactor all the open-coded hweight logic.  This
    function is not a static inline to avoid porting hassles in userspace.
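
    The hweight-style test can be sketched as a population count over the
    namespace bits.  The bit values and the helper name are assumptions of
    this sketch, not the kernel's definitions:

```python
# Assumed namespace bit values, for illustration only.
XFS_ATTR_ROOT = 1 << 1
XFS_ATTR_SECURE = 1 << 2
NSP_MASK = XFS_ATTR_ROOT | XFS_ATTR_SECURE


def attr_namespace_ok(flags: int) -> bool:
    """Return True if at most one namespace bit is set in flags.

    bin(...).count("1") plays the role of hweight here: it counts the
    set bits that survive the namespace mask.
    """
    return bin(flags & NSP_MASK).count("1") < 2
```

    Zero namespace bits (the default user namespace) and exactly one bit
    both pass; ROOT and SECURE together do not.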
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix error returns from xfs_bmapi_write [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Oct 15 17:11:06 2024 -0700

    xfs: fix error returns from xfs_bmapi_write
    
    commit 6773da870ab89123d1b513da63ed59e32a29cb77 upstream.
    
    [backport: resolve conflicts due to missing quota_repair.c,
    rtbitmap_repair.c, xfs_bmap_mark_sick()]
    
    xfs_bmapi_write can return 0 without actually returning a mapping in
    mval in two different cases:
    
     1) when there is absolutely no space available to do an allocation
     2) when converting delalloc space, and the allocation is so small
        that it only covers parts of the delalloc extent before the
        range requested by the caller
    
    Callers at best can handle one of these cases, but in many cases can't
    cope with either one.  Switch xfs_bmapi_write to always return a
    mapping or return an error code instead.  For case 1) above ENOSPC is
    the obvious choice which is very much what the callers expect anyway.
    For case 2) there is no really good error code, so pick a funky one
    from the SysV streams portfolio.
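
    The "mapping or error" contract can be modelled with a toy function.
    Everything here is hypothetical: the signature, the block numbers, and
    the choice of ENOSR as the SysV streams code (the text above does not
    name it).

```python
import errno


def bmapi_write(req_start, req_len, delalloc_start, alloc_len):
    """Toy model: return a (start, len) mapping or raise OSError.

    The allocation converts alloc_len blocks from the front of a
    delalloc extent that begins at delalloc_start (<= req_start).
    """
    if alloc_len == 0:
        # Case 1: no space at all to do an allocation.
        raise OSError(errno.ENOSPC, "no space available")
    alloc_end = delalloc_start + alloc_len
    if alloc_end <= req_start:
        # Case 2: the allocation only covered blocks before the
        # requested range.  ENOSR is an assumption of this sketch.
        raise OSError(errno.ENOSR, "allocation ended before requested range")
    map_end = min(alloc_end, req_start + req_len)
    return (req_start, map_end - req_start)
```

    A caller therefore never sees a "successful" return without a usable
    mapping; it either gets blocks at req_start or an errno to act on.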
    
    This fixes the reproducer here:
    
        https://lore.kernel.org/linux-xfs/CAEJPjCvT3Uag-pMTYuigEjWZHn1sGMZ0GCjVVCv29tNHK76Cgg@mail.gmail.com0/
    
    which uses reserved blocks to create file systems that are gravely
    out of space and thus cause at least xfs_alloc_file_space to hang
    and trigger the lack of ENOSPC handling in xfs_dquot_disk_alloc.
    
    Note that this patch does not actually make any caller but
    xfs_alloc_file_space deal intelligently with case 2) above.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reported-by: 刘通 <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix freeing speculative preallocations for preallocated files [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Oct 15 17:11:24 2024 -0700

    xfs: fix freeing speculative preallocations for preallocated files
    
    commit 610b29161b0aa9feb59b78dc867553274f17fb01 upstream.
    
    xfs_can_free_eofblocks returns false for files that have persistent
    preallocations unless the force flag is passed and there are delayed
    blocks.  This means it won't free delalloc reservations for files
    with persistent preallocations unless the force flag is set, and it
    will also free the persistent preallocations if the force flag is
    set and the file happens to have delayed allocations.
    
    Both of these are bad, so do away with the force flag and always free
    only post-EOF delayed allocations for files with the XFS_DIFLAG_PREALLOC
    or APPEND flags set.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix missing check for invalid attr flags [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:11 2024 -0700

    xfs: fix missing check for invalid attr flags
    
    commit f660ec8eaeb50d0317c29601aacabdb15e5f2203 upstream.
    
    [backport: fix build errors in xchk_xattr_listent]
    
    The xattr scrubber doesn't check for undefined flags in shortform attr
    entries.  Therefore, define a mask XFS_ATTR_ONDISK_MASK that has all
    possible XFS_ATTR_* flags in it, and use that to check for unknown bits
    in xchk_xattr_actor.
    
    Refactor the check in the dabtree scanner function to use the new mask
    as well.  The redundant checks need to be in place because the dabtree
    check examines the hash mappings and therefore needs to decode the attr
    leaf entries to compute the namehash.  This happens before the walk of
    the xattr entries themselves.
    
    Fixes: ae0506eba78fd ("xfs: check used space of shortform xattr structures")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix unlink vs cluster buffer instantiation race [+ + +]
Author: Dave Chinner <[email protected]>
Date:   Tue Oct 15 17:11:23 2024 -0700

    xfs: fix unlink vs cluster buffer instantiation race
    
    commit 348a1983cf4cf5099fc398438a968443af4c9f65 upstream.
    
    Luis has been reporting an assert failure when freeing an inode
    cluster during inode inactivation for a while. The assert looks
    like:
    
     XFS: Assertion failed: bp->b_flags & XBF_DONE, file: fs/xfs/xfs_trans_buf.c, line: 241
     ------------[ cut here ]------------
     kernel BUG at fs/xfs/xfs_message.c:102!
     Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
     CPU: 4 PID: 73 Comm: kworker/4:1 Not tainted 6.10.0-rc1 #4
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
     Workqueue: xfs-inodegc/loop5 xfs_inodegc_worker [xfs]
     RIP: 0010:assfail (fs/xfs/xfs_message.c:102) xfs
     RSP: 0018:ffff88810188f7f0 EFLAGS: 00010202
     RAX: 0000000000000000 RBX: ffff88816e748250 RCX: 1ffffffff844b0e7
     RDX: 0000000000000004 RSI: ffff88810188f558 RDI: ffffffffc2431fa0
     RBP: 1ffff11020311f01 R08: 0000000042431f9f R09: ffffed1020311e9b
     R10: ffff88810188f4df R11: ffffffffac725d70 R12: ffff88817a3f4000
     R13: ffff88812182f000 R14: ffff88810188f998 R15: ffffffffc2423f80
     FS:  0000000000000000(0000) GS:ffff8881c8400000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 000055fe9d0f109c CR3: 000000014426c002 CR4: 0000000000770ef0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
     PKRU: 55555554
     Call Trace:
      <TASK>
     xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:241 (discriminator 1)) xfs
     xfs_imap_to_bp (fs/xfs/xfs_trans.h:210 fs/xfs/libxfs/xfs_inode_buf.c:138) xfs
     xfs_inode_item_precommit (fs/xfs/xfs_inode_item.c:145) xfs
     xfs_trans_run_precommits (fs/xfs/xfs_trans.c:931) xfs
     __xfs_trans_commit (fs/xfs/xfs_trans.c:966) xfs
     xfs_inactive_ifree (fs/xfs/xfs_inode.c:1811) xfs
     xfs_inactive (fs/xfs/xfs_inode.c:2013) xfs
     xfs_inodegc_worker (fs/xfs/xfs_icache.c:1841 fs/xfs/xfs_icache.c:1886) xfs
     process_one_work (kernel/workqueue.c:3231)
     worker_thread (kernel/workqueue.c:3306 (discriminator 2) kernel/workqueue.c:3393 (discriminator 2))
     kthread (kernel/kthread.c:389)
     ret_from_fork (arch/x86/kernel/process.c:147)
     ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
      </TASK>
    
    This occurs when the inode precommit handler attempts to look
    up the inode cluster buffer to attach the inode for writeback.
    
    The trail of logic that I can reconstruct is as follows.
    
            1. the inode is clean when inodegc runs, so it is not
               attached to a cluster buffer when precommit runs.
    
            2. #1 implies the inode cluster buffer may be clean and not
               pinned by dirty inodes when inodegc runs.
    
            3. #2 implies that the inode cluster buffer can be reclaimed
               by memory pressure at any time.
    
            4. The assert failure implies that the cluster buffer was
               attached to the transaction, but not marked done. It had
               been accessed earlier in the transaction, but not marked
               done.
    
            5. #4 implies the cluster buffer has been invalidated (i.e.
               marked stale).
    
            6. #5 implies that the inode cluster buffer was instantiated
               uninitialised in the transaction in xfs_ifree_cluster(),
               which only instantiates the buffers to invalidate them
               and never marks them as done.
    
    Given factors 1-3, this issue is highly dependent on timing and
    environmental factors. Hence the issue can be very difficult to
    reproduce in some situations, but highly reliable in others. Luis
    has an environment where it can be reproduced easily by g/531 but,
    OTOH, I've reproduced it only once in ~2000 cycles of g/531.
    
    I think the fix is to have xfs_ifree_cluster() set the XBF_DONE flag
    on the cluster buffers, even though they may not be initialised. The
    reasons why I think this is safe are:
    
            1. A buffer cache lookup hit on a XBF_STALE buffer will
               clear the XBF_DONE flag. Hence all future users of the
               buffer know they have to re-initialise the contents
               before use and mark it done themselves.
    
            2. xfs_trans_binval() sets the XFS_BLI_STALE flag, which
               means the buffer remains locked until the journal commit
               completes and the buffer is unpinned. Hence once marked
               XBF_STALE/XFS_BLI_STALE by xfs_ifree_cluster(), the only
               context that can access the freed buffer is the currently
               running transaction.
    
            3. #2 implies that future buffer lookups in the currently
               running transaction will hit the transaction match code
               and not the buffer cache. Hence XBF_STALE and
               XFS_BLI_STALE will not be cleared unless the transaction
               initialises and logs the buffer with valid contents
               again. At which point, the buffer will be marked
               XBF_DONE again, so having XBF_DONE already set on the
               stale buffer is a moot point.
    
            4. #2 also implies that any concurrent access to that
               cluster buffer will block waiting on the buffer lock
               until the inode cluster has been fully freed and is no
               longer an active inode cluster buffer.
    
            5. #4 + #1 means that any future user of the disk range of
               that buffer will always see the range of disk blocks
               covered by the cluster buffer as not done, and hence must
               initialise the contents themselves.
    
            6. Setting XBF_DONE in xfs_ifree_cluster() then means the
               unlinked inode precommit code will see a XBF_DONE buffer
               from the transaction match as it expects. It can then
               attach the stale but newly dirtied inode to the stale
               but newly dirtied cluster buffer without unexpected
               failures. The stale buffer will then sail through the
               journal and do the right thing with the attached stale
               inode during unpin.
    
    Hence the fix is just one line of extra code. The explanation of
    why we have to set XBF_DONE in xfs_ifree_cluster, OTOH, is long and
    complex....
    
    Fixes: 82842fee6e59 ("xfs: fix AGF vs inode cluster buffer deadlock")
    Signed-off-by: Dave Chinner <[email protected]>
    Tested-by: Luis Chamberlain <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix xfs_bmap_add_extent_delay_real for partial conversions [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Oct 15 17:11:07 2024 -0700

    xfs: fix xfs_bmap_add_extent_delay_real for partial conversions
    
    commit d69bee6a35d3c5e4873b9e164dd1a9711351a97c upstream.
    
    [backport: resolve conflict due to xfs_mod_freecounter refactor]
    
    xfs_bmap_add_extent_delay_real takes parts or all of a delalloc extent
    and converts them to a real extent.  It is written to deal with any
    potential overlap of the to be converted range with the delalloc extent,
    but it turns out that currently only converting the entire extents, or a
    part starting at the beginning is actually exercised, as the only caller
    always tries to convert the entire delalloc extent, and either succeeds
    or at least progresses partially from the start.
    
    If it only converts a tiny part of a delalloc extent, the indirect block
    calculation for the new delalloc extent (da_new) might be equivalent to that
    of the existing delalloc extent (da_old).  If this extent conversion now
    requires allocating an indirect block that gets accounted into da_new,
    da_new ends up larger than da_old, which triggers the assert that da_new
    must be smaller than or equal to da_old unless we split the extent.
    
    Except for the assert that case is actually handled by just trying to
    allocate more space, as that is already handled for the split case (which
    currently can't be reached at all), so just reusing it should be fine.
    Except that without dipping into the reserved block pool that would make
    it a bit too easy to trigger a fs shutdown due to ENOSPC.  So in addition
    to adjusting the assert, also dip into the reserved block pool.
    
    Note that I could only reproduce the assert with a change to only convert
    the actually asked range instead of the full delalloc extent from
    xfs_bmapi_write.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make sure sb_fdblocks is non-negative [+ + +]
Author: Wengang Wang <[email protected]>
Date:   Tue Oct 15 17:11:22 2024 -0700

    xfs: make sure sb_fdblocks is non-negative
    
    commit 58f880711f2ba53fd5e959875aff5b3bf6d5c32e upstream.
    
    A user with a completely full filesystem experienced an unexpected
    shutdown when the filesystem tried to write the superblock during
    runtime.  The kernel log (dmesg) shows the following:
    
    [    8.176281] XFS (dm-4): Metadata corruption detected at xfs_sb_write_verify+0x60/0x120 [xfs], xfs_sb block 0x0
    [    8.177417] XFS (dm-4): Unmount and run xfs_repair
    [    8.178016] XFS (dm-4): First 128 bytes of corrupted metadata buffer:
    [    8.178703] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 01 90 00 00  XFSB............
    [    8.179487] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [    8.180312] 00000020: cf 12 dc 89 ca 26 45 29 92 e6 e3 8d 3b b8 a2 c3  .....&E)....;...
    [    8.181150] 00000030: 00 00 00 00 01 00 00 06 00 00 00 00 00 00 00 80  ................
    [    8.182003] 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82  ................
    [    8.182004] 00000050: 00 00 00 01 00 64 00 00 00 00 00 04 00 00 00 00  .....d..........
    [    8.182004] 00000060: 00 00 64 00 b4 a5 02 00 02 00 00 08 00 00 00 00  ..d.............
    [    8.182005] 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 17 00 00 19  ................
    [    8.182008] XFS (dm-4): Corruption of in-memory data detected.  Shutting down filesystem
    [    8.182010] XFS (dm-4): Please unmount the filesystem and rectify the problem(s)
    
    When xfs_log_sb writes the superblock to disk, sb_fdblocks is fetched from
    m_fdblocks without any lock.  m_fdblocks can go through a positive ->
    negative -> positive transition when the FS approaches fullness (see
    xfs_mod_fdblocks), so there is a chance that sb_fdblocks is negative.
    Because sb_fdblocks is an unsigned long long, the negative value reads
    as a very large number.  sb_fdblocks being bigger than sb_dblocks is a
    problem during log recovery, and xfs_validate_sb_write() complains.
    
    Fix:
    As sb_fdblocks will be re-calculated during mount when lazysbcount is
    enabled, we just need to make xfs_validate_sb_write() happy -- make sure
    sb_fdblocks is not negative. This patch also takes care of other percpu
    counters in xfs_log_sb.
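
    The arithmetic behind the failure can be sketched as follows.  The
    helper names are hypothetical, and clamping to zero is only one way to
    model "make sure sb_fdblocks is not negative":

```python
U64_MASK = (1 << 64) - 1


def as_u64(value: int) -> int:
    """Reinterpret a (possibly negative) counter as an unsigned 64-bit field."""
    return value & U64_MASK


def logged_fdblocks(counter_sum: int) -> int:
    """Clamp a transiently negative free-block sum before logging it."""
    return max(counter_sum, 0)
```

    A counter that briefly reads -5 becomes 2**64 - 5 once stored in the
    unsigned field, which exceeds any realistic sb_dblocks; the clamped
    value is 0 and passes the comparison.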
    
    Signed-off-by: Wengang Wang <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional [+ + +]
Author: Zhang Yi <[email protected]>
Date:   Tue Oct 15 17:11:18 2024 -0700

    xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
    
    commit fc8d0ba0ff5fe4700fa02008b7751ec6b84b7677 upstream.
    
    Allow callers to pass a NULL seq argument if they don't care about
    the fork sequence number.
    
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset [+ + +]
Author: Zhang Yi <[email protected]>
Date:   Tue Oct 15 17:11:19 2024 -0700

    xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
    
    commit 2e08371a83f1c06fd85eea8cd37c87a224cc4cc4 upstream.
    
    xfs_bmapi_convert_delalloc() only attempts to allocate the entire
    delalloc extent and may require multiple invocations to allocate the
    target offset.  xfs_convert_blocks() adds a loop to do this job and we
    call it in the writeback path, but xfs_convert_blocks() isn't a common
    helper.
    Let's do it in xfs_bmapi_convert_delalloc() and drop
    xfs_convert_blocks(), preparing for the post EOF delalloc blocks
    converting in the buffered write begin path.
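
    The retry loop being folded into the helper can be sketched like this.
    The function, its arguments, and the free_chunks list standing in for
    the allocator are all hypothetical:

```python
import errno


def convert_delalloc(ext_start, ext_end, target, free_chunks):
    """Convert a delalloc extent from the front until target is allocated.

    free_chunks lists the sizes of the contiguous free extents a
    (hypothetical) allocator hands out, one per attempt.  Returns the
    list of (start, len) real extents that were created.
    """
    if not ext_start <= target < ext_end:
        raise ValueError("target offset outside the delalloc extent")
    real = []
    pos = ext_start
    chunks = iter(free_chunks)
    while pos <= target:  # target block not yet covered, try again
        size = next(chunks, 0)
        if size <= 0:
            raise OSError(errno.ENOSPC, "allocator ran out of space")
        length = min(size, ext_end - pos)
        real.append((pos, length))
        pos += length
    return real
```

    Each pass makes progress from the start of the remaining delalloc
    range, and the loop stops as soon as the block at the target offset
    has been converted.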
    
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: match lock mode in xfs_buffered_write_iomap_begin() [+ + +]
Author: Zhang Yi <[email protected]>
Date:   Tue Oct 15 17:11:17 2024 -0700

    xfs: match lock mode in xfs_buffered_write_iomap_begin()
    
    commit bb712842a85d595525e72f0e378c143e620b3ea2 upstream.
    
    Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replaced
    xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the
    writing inode, and a new variable lockmode is used to indicate the lock
    mode. Although the lockmode should always be XFS_ILOCK_EXCL, it's still
    better to use this variable instead of using XFS_ILOCK_EXCL directly
    when unlocking the inode.
    
    Fixes: 1aa91d9c9933 ("xfs: Add async buffered write support")
    Signed-off-by: Zhang Yi <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Oct 15 17:11:08 2024 -0700

    xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent
    
    commit 86de848403abda05bf9c16dcdb6bef65a8d88c41 upstream.
    
    Accessing if_bytes without the ilock is racy.  Remove the initial
    if_bytes == 0 check in xfs_reflink_end_cow_extent and let
    xfs_iext_lookup_extent fail for this case after we've taken the ilock.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:09 2024 -0700

    xfs: require XFS_SB_FEAT_INCOMPAT_LOG_XATTRS for attr log intent item recovery
    
    commit 8ef1d96a985e4dc07ffbd71bd7fc5604a80cc644 upstream.
    
    The XFS_SB_FEAT_INCOMPAT_LOG_XATTRS feature bit protects a filesystem
    from old kernels that do not know how to recover extended attribute log
    intent items.  Make this check mandatory instead of a debugging assert.
    
    Fixes: fd920008784ea ("xfs: Set up infrastructure for log attribute replay")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: restrict when we try to align cow fork delalloc to cowextsz hints [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:25 2024 -0700

    xfs: restrict when we try to align cow fork delalloc to cowextsz hints
    
    commit 288e1f693f04e66be99f27e7cbe4a45936a66745 upstream.
    
    xfs/205 produces the following failure when always_cow is enabled:
    
    #  --- a/tests/xfs/205.out      2024-02-28 16:20:24.437887970 -0800
    #  +++ b/tests/xfs/205.out.bad  2024-06-03 21:13:40.584000000 -0700
    #  @@ -1,4 +1,5 @@
    #   QA output created by 205
    #   *** one file
    #  +   !!! disk full (expected)
    #   *** one file, a few bytes at a time
    #   *** done
    
    This is the result of overly aggressive attempts to align cow fork
    delalloc reservations to the CoW extent size hint.  Looking at the trace
    data, we're trying to append a single fsblock to the "fred" file.
    Trying to create a speculative post-eof reservation fails because
    there's not enough space.
    
    We then set @prealloc_blocks to zero and try again, but the cowextsz
    alignment code triggers, which expands our request for a 1-fsblock
    reservation into a 39-block reservation.  There's not enough space for
    that, so the whole write fails with ENOSPC even though there's
    sufficient space in the filesystem to allocate the single block that we
    need to land the write.
    
    There are two things wrong here -- first, we shouldn't be attempting
    speculative preallocations beyond what was requested when we're low on
    space.  Second, if we've already computed a posteof preallocation, we
    shouldn't bother trying to align that to the cowextsize hint.
    
    Fix both of these problems by adding a flag that only enables the
    expansion of the delalloc reservation to the cowextsize if we're doing a
    non-extending write, and only if we're not doing an ENOSPC retry.  This
    requires us to move the ENOSPC retry logic to xfs_bmapi_reserve_delalloc.
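
    A sketch of the gating described above, with a hypothetical function
    and made-up block numbers (the hint of 32 blocks in the note below is
    not taken from the xfs/205 trace):

```python
def delalloc_reservation(offset, length, cowextsz, extending_write, enospc_retry):
    """Return the (start, len) block range to reserve.

    The request is widened to cowextsz boundaries only for a
    non-extending write that is not an ENOSPC retry; otherwise the
    caller's range is reserved unchanged.
    """
    if extending_write or enospc_retry or cowextsz <= 1:
        return (offset, length)
    start = offset - offset % cowextsz  # round the start down
    end = offset + length
    end += -end % cowextsz              # round the end up
    return (start, end - start)
```

    With a hint of 32 blocks, a one-block request at block 40 grows to
    (32, 32) on the first attempt, but stays at (40, 1) when the write is
    extending the file or is being retried after ENOSPC.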
    
    I probably should have caught this six years ago when 6ca30729c206d was
    being reviewed, but oh well.  Update the comments to reflect what the
    code does now.
    
    Fixes: 6ca30729c206d ("xfs: bmap code cleanup")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Chandan Babu R <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: revert commit 44af6c7e59b12 [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:15 2024 -0700

    xfs: revert commit 44af6c7e59b12
    
    commit 2a009397eb5ae178670cbd7101e9635cf6412b35 upstream.
    
    [backport: resolve conflicts due to new xattr walk helper]
    
    In my haste to fix what I thought was a performance problem in the attr
    scrub code, I neglected to notice that the xfs_attr_get_ilocked also had
    the effect of checking that attributes can actually be looked up through
    the attr dabtree.  Fix this.
    
    Fixes: 44af6c7e59b12 ("xfs: don't load local xattr values during scrub")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: use dontcache for grabbing inodes during scrub [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:16 2024 -0700

    xfs: use dontcache for grabbing inodes during scrub
    
    commit b27ce0da60a523fc32e3795f96b2de5490642235 upstream.
    
    [backport: resolve conflict due to missing iscan.c]
    
    Back when I wrote commit a03297a0ca9f2, I had thought that we'd be doing
    users a favor by only marking inodes dontcache at the end of a scrub
    operation, and only if there's only one reference to that inode.  This
    was more or less true back when I_DONTCACHE was an XFS iflag and the
    only thing it did was change the outcome of xfs_fs_drop_inode to 1.
    
    Note: If there are dentries pointing to the inode when scrub finishes,
    the inode will have positive i_count and stay around in cache until
    dentry reclaim.
    
    But now we have d_mark_dontcache, which causes the inode *and* the
    dentries attached to it all to be marked I_DONTCACHE, which means that
    we drop the dentries ASAP, which drops the inode ASAP.
    
    This is bad if scrub found problems with the inode, because now it can
    be scheduled for inactivation, which can cause inodegc to trip on it and
    shut down the filesystem.
    
    Even if the inode isn't bad, this is still suboptimal because phases 3-7
    each initiate inode scans.  Dropping the inode immediately during phase
    3 is silly because phase 5 will reload it and drop it immediately, etc.
    It's fine to mark the inodes dontcache, but if there have been accesses
    to the file that set up dentries, we should keep them.
    
    I validated this by setting up ftrace to capture xfs_iget_recycle*
    tracepoints and ran xfs/285 for 30 seconds.  With current djwong-wtf I
    saw ~30,000 recycle events.  I then dropped the d_mark_dontcache calls
    and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
    30 seconds.
    
    Therefore, grab the inode with XFS_IGET_DONTCACHE, which only has the
    effect of setting I_DONTCACHE for cache misses.  Remove the
    d_mark_dontcache call that can happen in xchk_irele.
    
    Fixes: a03297a0ca9f2 ("xfs: manage inode DONTCACHE status at irele time")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: validate recovered name buffers when recovering xattr items [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Tue Oct 15 17:11:13 2024 -0700

    xfs: validate recovered name buffers when recovering xattr items
    
    commit 1c7f09d210aba2f2bb206e2e8c97c9f11a3fd880 upstream.
    
    Strengthen the xattri log item recovery code by checking that we
    actually have the required name and newname buffers for whatever
    operation we're replaying.
    
    Signed-off-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Catherine Hoang <[email protected]>
    Acked-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
xhci: Fix incorrect stream context type macro [+ + +]
Author: Mathias Nyman <[email protected]>
Date:   Wed Oct 16 16:59:57 2024 +0300

    xhci: Fix incorrect stream context type macro
    
    commit 6599b6a6fa8060145046d0744456b6abdb3122a7 upstream.
    
    The stream context type (SCT) bitfield is used both in the stream context
    data structure and in the 'Set TR Dequeue pointer' command TRB.
    In both cases it uses bits 3:1.
    
    The SCT_FOR_TRB(p) macro used to set the stream context type (SCT) field
    for the 'Set TR Dequeue pointer' command TRB incorrectly shifts the value
    1 bit left before masking the three bits.
    
    Fix this by first masking and then shifting, just like the similar
    SCT_FOR_CTX(p) macro does.
    
    This issue has not been visible as the lost bit 3 is only used with
    secondary stream arrays (SSA). The xhci driver currently only supports
    using a primary stream array with Linear stream addressing.
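    
    A minimal sketch of the difference, assuming the two orderings described
    above (shift-then-mask versus mask-then-shift). The names SCT_FOR_TRB_OLD,
    SCT_FOR_TRB_NEW and the helper functions are illustrative and are not
    taken from the driver; the macro bodies are reconstructed from the
    description, not copied from xhci.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names: only SCT_FOR_TRB and SCT_FOR_CTX appear in the text. */
#define SCT_FOR_TRB_OLD(p) (((p) << 1) & 0x7)  /* shift, then mask: bit 3 is lost */
#define SCT_FOR_TRB_NEW(p) (((p) & 0x7) << 1)  /* mask, then shift: bits 3:1 kept */

/* Encode a 3-bit SCT value into the TRB field with the old ordering. */
static uint32_t sct_field_old(uint32_t sct)
{
    return SCT_FOR_TRB_OLD(sct);
}

/* Encode a 3-bit SCT value into the TRB field with the new ordering. */
static uint32_t sct_field_new(uint32_t sct)
{
    return SCT_FOR_TRB_NEW(sct);
}

/* Small self-check comparing both orderings on a few SCT values. */
static void sct_selfcheck(void)
{
    /* SCT = 1 fits below bit 3 after shifting, so both orderings agree. */
    assert(sct_field_old(1) == sct_field_new(1));

    /* SCT = 4 lands exactly on bit 3, which the old ordering masks away. */
    assert(sct_field_old(4) == 0x0);
    assert(sct_field_new(4) == 0x8);

    /* SCT = 7 uses all of bits 3:1 with the new ordering. */
    assert(sct_field_new(7) == 0xe);
}
```

    With an SCT value of 1 the two orderings produce the same field, which is
    consistent with the bug staying hidden while only that value is in use.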
    
    Fixes: 95241dbdf828 ("xhci: Set SCT field for Set TR dequeue on streams")
    Cc: [email protected]
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Mitigate failed set dequeue pointer commands [+ + +]
Author: Mathias Nyman <[email protected]>
Date:   Wed Oct 16 16:59:58 2024 +0300

    xhci: Mitigate failed set dequeue pointer commands
    
    commit fe49df60cdb7c2975aa743dc295f8786e4b7db10 upstream.
    
    Prevent the xHC host from processing a cancelled URB by always turning
    cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command.
    
    If the command fails, the xHC will start processing the cancelled TD
    instead of skipping it once the endpoint is restarted, causing issues
    like Babble error.
    
    This is not a complete solution as a failed 'Set TR Deq' command does not
    guarantee xHC TRB caches are cleared.
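    
    A rough sketch of what turning a TD into no-op TRBs can look like,
    assuming the usual xHCI TRB layout (cycle bit in bit 0 and TRB type in
    bits 15:10 of the fourth dword, with transfer No-Op as type 8). The
    struct trb_model type and both function names are hypothetical; link
    TRBs and endianness conversion are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define TRB_CYCLE     (1u << 0)                 /* cycle bit, owned by the ring */
#define TRB_TYPE(p)   ((uint32_t)(p) << 10)     /* TRB type lives in bits 15:10 */
#define TRB_TR_NOOP   8u                        /* transfer ring No-Op TRB type */

/* Hypothetical host-endian model of one 16-byte TRB. */
struct trb_model {
    uint32_t field[4];
};

/* Rewrite one TRB as a No-Op while keeping its cycle bit unchanged. */
static void trb_to_noop(struct trb_model *trb)
{
    trb->field[0] = 0;
    trb->field[1] = 0;
    trb->field[2] = 0;
    trb->field[3] &= TRB_CYCLE;              /* drop type and flags, keep cycle */
    trb->field[3] |= TRB_TYPE(TRB_TR_NOOP);  /* mark as transfer No-Op */
}

/* Rewrite every TRB of a cancelled TD (modelled as a flat array). */
static void td_to_noop(struct trb_model *trbs, unsigned int count)
{
    for (unsigned int i = 0; i < count; i++)
        trb_to_noop(&trbs[i]);
}
```

    Keeping the cycle bit matters in this sketch because it is what tells the
    controller whether a TRB belongs to it; only the payload and type change.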
    
    Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function")
    Cc: [email protected]
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: tegra: fix checked USB2 port number [+ + +]
Author: Henry Lin <[email protected]>
Date:   Mon Oct 14 12:21:34 2024 +0800

    xhci: tegra: fix checked USB2 port number
    
    commit 7d381137cb6ecf558ef6698c7730ddd482d4c8f2 upstream.
    
    If USB virtualization is enabled, USB2 ports are shared between all
    Virtual Functions. The USB2 port number owned by a USB2 root hub in
    a Virtual Function may be less than the total USB2 phy number supported
    by the Tegra XUSB controller.
    
    Using the total USB2 phy number as the port count when checking all
    PORTSC values would cause invalid memory access.
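    
    The idea can be sketched as bounding the PORTSC walk by the number of
    ports the root hub actually owns. Everything here is a simplified,
    hypothetical model (struct rhub_model, count_connected), not the Tegra
    driver code; the only hardware detail assumed is that PORTSC bit 0 is the
    Current Connect Status bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORTSC_CCS 0x1u  /* PORTSC bit 0: Current Connect Status */

/* Hypothetical model of a root hub that owns num_ports PORTSC values. */
struct rhub_model {
    size_t num_ports;        /* ports owned by this hub, not the total phy count */
    const uint32_t *portsc;  /* one PORTSC value per owned port */
};

/*
 * Count connected ports, iterating only over the ports the hub owns.
 * Looping up to a larger, controller-wide phy count instead would index
 * past the end of the portsc array.
 */
static size_t count_connected(const struct rhub_model *hub)
{
    size_t connected = 0;

    for (size_t i = 0; i < hub->num_ports; i++) {
        if (hub->portsc[i] & PORTSC_CCS)
            connected++;
    }
    return connected;
}
```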
    
    [  116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f
    ...
    [  117.213640] Call trace:
    [  117.216783]  tegra_xusb_enter_elpg+0x23c/0x658
    [  117.222021]  tegra_xusb_runtime_suspend+0x40/0x68
    [  117.227260]  pm_generic_runtime_suspend+0x30/0x50
    [  117.232847]  __rpm_callback+0x84/0x3c0
    [  117.237038]  rpm_suspend+0x2dc/0x740
    [  117.241229]  pm_runtime_work+0xa0/0xb8
    [  117.245769]  process_scheduled_works+0x24c/0x478
    [  117.251007]  worker_thread+0x23c/0x328
    [  117.255547]  kthread+0x104/0x1b0
    [  117.259389]  ret_from_fork+0x10/0x20
    [  117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100)
    
    Cc: [email protected] # v6.3+
    Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls")
    Signed-off-by: Henry Lin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>