Changelog in Linux kernel 6.12.4

ad7780: fix division by zero in ad7780_write_raw() [+ + +]

Author: Zicheng Qu <[email protected]>
Date:   Mon Oct 28 14:20:27 2024 +0000

    ad7780: fix division by zero in ad7780_write_raw()
    
    commit c174b53e95adf2eece2afc56cd9798374919f99a upstream.
    
    In ad7780_write_raw(), val2 can be zero, which might lead to a
    division-by-zero error in DIV_ROUND_CLOSEST(). ad7780_write_raw()
    is based on iio_info's write_raw. While val is explicitly documented as
    possibly being zero (in read mode), val2 is not guaranteed to be non-zero.
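
    As a rough user-space illustration of the guard described above (not the
    driver's actual code), the sketch below rejects a non-positive val2 before
    it can be used as a divisor. The helper name sketch_scale_to_gain() and
    the simplified rounding macro are assumptions made for this example only.

```c
#include <errno.h>

/*
 * Simplified stand-in for the kernel's DIV_ROUND_CLOSEST(); only valid
 * for non-negative operands, which is all this sketch needs.
 */
#define SKETCH_DIV_ROUND_CLOSEST(x, d) (((x) + ((d) / 2)) / (d))

/*
 * Hypothetical helper: derive a gain from a full-scale value and the
 * fractional scale part val2. A val2 of zero (or below, a choice made
 * by this sketch) is rejected before it can reach the division.
 */
int sketch_scale_to_gain(unsigned long long full_scale, int val2,
			 unsigned long long *gain)
{
	if (val2 <= 0)
		return -EINVAL;

	*gain = SKETCH_DIV_ROUND_CLOSEST(full_scale, (unsigned long long)val2);
	return 0;
}
```

    With this check in place, a caller passing val2 == 0 gets -EINVAL back
    instead of triggering a division by zero.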
    
    Fixes: 9085daa4abcc ("staging: iio: ad7780: add gain & filter gpio support")
    Cc: [email protected]
    Signed-off-by: Zicheng Qu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: allwinner: pinephone: Add mount matrix to accelerometer [+ + +]

Author: Dragan Simic <[email protected]>
Date:   Thu Sep 19 21:15:26 2024 +0200

    arm64: dts: allwinner: pinephone: Add mount matrix to accelerometer
    
    commit 2496b2aaacf137250f4ca449f465e2cadaabb0e8 upstream.
    
    The InvenSense MPU-6050 accelerometer is mounted on the user-facing side
    of the Pine64 PinePhone mainboard, rotated 90 degrees counter-clockwise. [1]
    This requires the accelerometer's x- and y-axis to be swapped, and the
    direction of the accelerometer's y-axis to be inverted.
    
    Rectify this by adding a mount-matrix to the accelerometer definition in the
    Pine64 PinePhone dtsi file.
    
    [1] https://files.pine64.org/doc/PinePhone/PinePhone%20mainboard%20bottom%20placement%20v1.1%2020191031.pdf
    
    Fixes: 91f480d40942 ("arm64: dts: allwinner: Add initial support for Pine64 PinePhone")
    Cc: [email protected]
    Suggested-by: Ondrej Jirman <[email protected]>
    Suggested-by: Andrey Skvortsov <[email protected]>
    Signed-off-by: Dragan Simic <[email protected]>
    Reviewed-by: Andrey Skvortsov <[email protected]>
    Link: https://patch.msgid.link/129f0c754d071cca1db5d207d9d4a7bd9831dff7.1726773282.git.dsimic@manjaro.org
    [[email protected]: Replaced Helped-by with Suggested-by]
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: freescale: imx8mm-verdin: Fix SD regulator startup delay [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Thu Oct 24 15:06:50 2024 +0200

    arm64: dts: freescale: imx8mm-verdin: Fix SD regulator startup delay
    
    commit 0ca7699c376743b633b6419a42888dba386d5351 upstream.
    
    The power switch used to power the SD card interface might have
    more than 2ms turn-on time. Increase the startup delay to 20ms to
    prevent failures.
    
    Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini")
    Cc: [email protected]
    Signed-off-by: Francesco Dolcini <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: freescale: imx8mp-verdin: Fix SD regulator startup delay [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Thu Oct 24 15:06:51 2024 +0200

    arm64: dts: freescale: imx8mp-verdin: Fix SD regulator startup delay
    
    commit 6c5789c9d2c06968532243daa235f6ff809ad71e upstream.
    
    The power switch used to power the SD card interface might have
    more than 2ms turn-on time. Increase the startup delay to 20ms to
    prevent failures.
    
    Fixes: a39ed23bdf6e ("arm64: dts: freescale: add initial support for verdin imx8m plus")
    Cc: [email protected]
    Signed-off-by: Francesco Dolcini <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: mediatek: mt8186-corsola: Fix GPU supply coupling max-spread [+ + +]

Author: Chen-Yu Tsai <[email protected]>
Date:   Mon Oct 21 22:05:36 2024 +0800

    arm64: dts: mediatek: mt8186-corsola: Fix GPU supply coupling max-spread
    
    commit 2f1aab0cb0661d533f008e4975325080351cdfc8 upstream.
    
    The GPU SRAM supply is supposed to be always at least 0.1V higher than
    the GPU supply. However when the DT was upstreamed, the spread was
    incorrectly set to 0.01V.
    
    Fixes: 8855d01fb81f ("arm64: dts: mediatek: Add MT8186 Krabby platform based Tentacruel / Tentacool")
    Cc: [email protected]
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: mediatek: mt8186-corsola: Fix IT6505 reset line polarity [+ + +]

Author: Chen-Yu Tsai <[email protected]>
Date:   Tue Oct 29 18:02:25 2024 +0800

    arm64: dts: mediatek: mt8186-corsola: Fix IT6505 reset line polarity
    
    commit fbcc95fceb6d179dd150df2dc613dfd9b013052c upstream.
    
    The reset line of the IT6505 bridge chip is active low, not active high.
    It was incorrectly inverted in the device tree as the implementation at
    the time incorrectly inverted the polarity in its driver, due to a prior
    device having an inline inverting level shifter.
    
    Fix the polarity now while the external display pipeline is incomplete,
    thereby avoiding any impact to running systems.
    
    A matching fix for the driver should be included if this change is
    backported.
    
    Fixes: 8855d01fb81f ("arm64: dts: mediatek: Add MT8186 Krabby platform based Tentacruel / Tentacool")
    Cc: [email protected]
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: ti: k3-am62-verdin: Fix SD regulator startup delay [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Thu Oct 24 15:06:28 2024 +0200

    arm64: dts: ti: k3-am62-verdin: Fix SD regulator startup delay
    
    commit 2213ca51998fef61d3df4ca156054cdcc37c42b8 upstream.
    
    The power switch used to power the SD card interface might have
    more than 2ms turn-on time. Increase the startup delay to 20ms to
    prevent failures.
    
    Fixes: 316b80246b16 ("arm64: dts: ti: add verdin am62")
    Cc: [email protected]
    Signed-off-by: Francesco Dolcini <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vignesh Raghavendra <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ARM: 9429/1: ioremap: Sync PGDs for VMALLOC shadow [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Wed Oct 23 13:03:14 2024 +0100

    ARM: 9429/1: ioremap: Sync PGDs for VMALLOC shadow
    
    commit d6e6a74d4cea853b5321eeabb69c611148eedefe upstream.
    
    When syncing the VMALLOC area to other CPUs, make sure to also
    sync the KASAN shadow memory for the VMALLOC area, so that we
    don't get stale entries for the shadow memory in the top level PGD.
    
    Since we are now copying PGDs in two instances, create a helper
    function named memcpy_pgd() to do the actual copying, and
    create a helper to map the addresses of VMALLOC_START and
    VMALLOC_END into the corresponding shadow memory.
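
    The address-to-shadow mapping mentioned above can be illustrated with a
    small user-space sketch. It uses the generic KASAN arithmetic of one
    shadow byte per 8 bytes of memory; the offset value, the 32-bit address
    type and all sketch_* names are assumptions chosen for illustration and
    are not taken from the patch.

```c
#include <stdint.h>

/* Generic KASAN tracks 8 bytes of memory with one shadow byte. */
#define SKETCH_SHADOW_SCALE_SHIFT 3
/* Hypothetical shadow offset; the real value is configuration dependent. */
#define SKETCH_SHADOW_OFFSET 0x1f000000u

/* Map a (32-bit) virtual address to the address of its shadow byte. */
uint32_t sketch_mem_to_shadow(uint32_t addr)
{
	return (addr >> SKETCH_SHADOW_SCALE_SHIFT) + SKETCH_SHADOW_OFFSET;
}

struct sketch_range {
	uint32_t start;
	uint32_t end;
};

/* Translate a VMALLOC-style range into the shadow range that covers it. */
struct sketch_range sketch_vmalloc_shadow(uint32_t vmalloc_start,
					  uint32_t vmalloc_end)
{
	struct sketch_range r = {
		.start = sketch_mem_to_shadow(vmalloc_start),
		.end = sketch_mem_to_shadow(vmalloc_end),
	};

	return r;
}
```

    The resulting shadow range is one eighth the size of the original range,
    and it is this second range whose top-level entries also need copying.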
    
    Co-developed-by: Melon Liu <[email protected]>
    
    Cc: [email protected]
    Fixes: 565cbaad83d8 ("ARM: 9202/1: kasan: support CONFIG_KASAN_VMALLOC")
    Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/
    Reported-by: Clement LE GOFFIC <[email protected]>
    Suggested-by: Mark Rutland <[email protected]>
    Suggested-by: Russell King (Oracle) <[email protected]>
    Acked-by: Mark Rutland <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ARM: 9430/1: entry: Do a dummy read from VMAP shadow [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Wed Oct 23 13:04:44 2024 +0100

    ARM: 9430/1: entry: Do a dummy read from VMAP shadow
    
    commit 44e9a3bb76e5f2eecd374c8176b2c5163c8bb2e2 upstream.
    
    When switching tasks, in addition to a dummy read from the new
    VMAP stack, also do a dummy read from the VMAP stack's
    corresponding KASAN shadow memory to sync things up in
    the new MM context.
    
    Cc: [email protected]
    Fixes: a1c510d0adc6 ("ARM: implement support for vmap'ed stacks")
    Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/
    Reported-by: Clement LE GOFFIC <[email protected]>
    Suggested-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ARM: 9431/1: mm: Pair atomic_set_release() with _read_acquire() [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Wed Oct 23 13:05:34 2024 +0100

    ARM: 9431/1: mm: Pair atomic_set_release() with _read_acquire()
    
    commit 93ee385254d53849c01dd8ab9bc9d02790ee7f0e upstream.
    
    The code for syncing vmalloc memory PGD pointers is using
    atomic_read() paired with atomic_set_release(), but the
    proper pairing is atomic_read_acquire() with
    atomic_set_release().
    
    This is done to clearly instruct the compiler to not
    reorder the memcpy() or similar calls inside the section
    so that we do not observe changes to init_mm. memcpy()
    calls should be identified by the compiler as having
    unpredictable side effects, but let's try to be on the
    safe side.
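
    The acquire/release pairing can be sketched in user space with C11
    atomics, which are used here only as an analogue of the kernel's
    atomic_read_acquire() and atomic_set_release(). The struct and function
    names are hypothetical and the "PGD copy" is reduced to a single int.

```c
#include <stdatomic.h>

struct sketch_mm {
	int pgd_copy;            /* stands in for the copied PGD entries */
	atomic_int vmalloc_seq;  /* sequence number, published last */
};

/* Writer: update the data first, then publish the sequence with release. */
void sketch_publish(struct sketch_mm *mm, int new_pgd, int seq)
{
	mm->pgd_copy = new_pgd;
	atomic_store_explicit(&mm->vmalloc_seq, seq, memory_order_release);
}

/*
 * Reader: load the sequence with acquire so that the read of pgd_copy
 * cannot be reordered before it. Returns 1 and fills *pgd_out when the
 * expected sequence has been published, 0 otherwise.
 */
int sketch_consume(struct sketch_mm *mm, int expected_seq, int *pgd_out)
{
	if (atomic_load_explicit(&mm->vmalloc_seq,
				 memory_order_acquire) != expected_seq)
		return 0;

	*pgd_out = mm->pgd_copy;
	return 1;
}
```

    A plain relaxed load in sketch_consume() would not order the data read
    after the sequence load, which is the mismatch the commit describes.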
    
    Cc: [email protected]
    Fixes: d31e23aff011 ("ARM: mm: make vmalloc_seq handling SMP safe")
    Suggested-by: Mark Rutland <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: add delivered_freeze to debugfs output [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:19 2024 +0000

    binder: add delivered_freeze to debugfs output
    
    commit cb2aeb2ec25884133110ffe5a67ff3cf7dee5ceb upstream.
    
    Add the pending proc->delivered_freeze work to the debugfs output. This
    information was omitted in the original implementation of the freeze
    notification and can be valuable for debugging issues.
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Signed-off-by: Carlos Llamas <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: allow freeze notification for dead nodes [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:17 2024 +0000

    binder: allow freeze notification for dead nodes
    
    commit ca63c66935b978441055e3d87d30225267f99329 upstream.
    
    Alice points out that binder_request_freeze_notification() should not
    return EINVAL when the relevant node is dead [1]. The node can die at
    any point even if the user input is valid. Instead, allow the request
    to be allocated but skip the initial notification for dead nodes. This
    avoids propagating unnecessary errors back to userspace.
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Suggested-by: Alice Ryhl <[email protected]>
    Link: https://lore.kernel.org/all/CAH5fLghapZJ4PbbkC8V5A6Zay-_sgTzwVpwqk6RWWUNKKyJC_Q@mail.gmail.com/ [1]
    Signed-off-by: Carlos Llamas <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix BINDER_WORK_CLEAR_FREEZE_NOTIFICATION debug logs [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:16 2024 +0000

    binder: fix BINDER_WORK_CLEAR_FREEZE_NOTIFICATION debug logs
    
    commit 595ea72efff9fa65bc52b6406e0822f90841f266 upstream.
    
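    The BINDER_WORK_CLEAR_FREEZE_NOTIFICATION type is not handled in the
    binder_logs entries and it shows up as "unknown work" when logged:
    
      proc 699
      context binder-test
        thread 699: l 00 need_return 0 tr 0
        ref 25: desc 1 node 20 s 1 w 0 d 00000000c03e09a3
        unknown work: type 11
    
    Handle this work type so that it is logged as such:
    
      proc 640
      context binder-test
        thread 640: l 00 need_return 0 tr 0
        ref 8: desc 1 node 3 s 1 w 0 d 000000002bb493e1
        has cleared freeze notification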
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Suggested-by: Alice Ryhl <[email protected]>
    Signed-off-by: Carlos Llamas <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix BINDER_WORK_FROZEN_BINDER debug logs [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:15 2024 +0000

    binder: fix BINDER_WORK_FROZEN_BINDER debug logs
    
    commit 830d7db744b42c693bf1db7e94db86d7efd91f0e upstream.
    
    The BINDER_WORK_FROZEN_BINDER type is not handled in the binder_logs
    entries and it shows up as "unknown work" when logged:
    
      proc 649
      context binder-test
        thread 649: l 00 need_return 0 tr 0
        ref 13: desc 1 node 8 s 1 w 0 d 0000000053c4c0c3
        unknown work: type 10
    
    This patch adds the freeze work type so that it is now logged as such:
    
      proc 637
      context binder-test
        thread 637: l 00 need_return 0 tr 0
        ref 8: desc 1 node 3 s 1 w 0 d 00000000dc39e9c6
        has frozen binder
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Acked-by: Todd Kjos <[email protected]>
    Signed-off-by: Carlos Llamas <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix freeze UAF in binder_release_work() [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:14 2024 +0000

    binder: fix freeze UAF in binder_release_work()
    
    commit 7e20434cbca814cb91a0a261ca0106815ef48e5f upstream.
    
    When a binder reference is cleaned up, any freeze work queued in the
    associated process should also be removed. Otherwise, the reference is
    freed while its ref->freeze.work is still queued in proc->work leading
    to a use-after-free issue as shown by the following KASAN report:
    
      ==================================================================
      BUG: KASAN: slab-use-after-free in binder_release_work+0x398/0x3d0
      Read of size 8 at addr ffff31600ee91488 by task kworker/5:1/211
    
      CPU: 5 UID: 0 PID: 211 Comm: kworker/5:1 Not tainted 6.11.0-rc7-00382-gfc6c92196396 #22
      Hardware name: linux,dummy-virt (DT)
      Workqueue: events binder_deferred_func
      Call trace:
       binder_release_work+0x398/0x3d0
       binder_deferred_func+0xb60/0x109c
       process_one_work+0x51c/0xbd4
       worker_thread+0x608/0xee8
    
      Allocated by task 703:
       __kmalloc_cache_noprof+0x130/0x280
       binder_thread_write+0xdb4/0x42a0
       binder_ioctl+0x18f0/0x25ac
       __arm64_sys_ioctl+0x124/0x190
       invoke_syscall+0x6c/0x254
    
      Freed by task 211:
       kfree+0xc4/0x230
       binder_deferred_func+0xae8/0x109c
       process_one_work+0x51c/0xbd4
       worker_thread+0x608/0xee8
      ==================================================================
    
    This commit fixes the issue by ensuring any queued freeze work is removed
    when cleaning up a binder reference.
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Acked-by: Todd Kjos <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Signed-off-by: Carlos Llamas <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix memleak of proc->delivered_freeze [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:18 2024 +0000

    binder: fix memleak of proc->delivered_freeze
    
    commit 1db76ec2b4b206ff943e292a0b55e68ff3443598 upstream.
    
    If a freeze notification is cleared with BC_CLEAR_FREEZE_NOTIFICATION
    before calling binder_freeze_notification_done(), then it is detached
    from its reference (e.g. ref->freeze) but the work remains queued in
    proc->delivered_freeze. This leads to a memory leak when the process
    exits as any pending entries in proc->delivered_freeze are not freed:
    
      unreferenced object 0xffff38e8cfa36180 (size 64):
        comm "binder-util", pid 655, jiffies 4294936641
        hex dump (first 32 bytes):
          b8 e9 9e c8 e8 38 ff ff b8 e9 9e c8 e8 38 ff ff  .....8.......8..
          0b 00 00 00 00 00 00 00 3c 1f 4b 00 00 00 00 00  ........<.K.....
        backtrace (crc 95983b32):
          [<000000000d0582cf>] kmemleak_alloc+0x34/0x40
          [<000000009c99a513>] __kmalloc_cache_noprof+0x208/0x280
          [<00000000313b1704>] binder_thread_write+0xdec/0x439c
          [<000000000cbd33bb>] binder_ioctl+0x1b68/0x22cc
          [<000000002bbedeeb>] __arm64_sys_ioctl+0x124/0x190
          [<00000000b439adee>] invoke_syscall+0x6c/0x254
          [<00000000173558fc>] el0_svc_common.constprop.0+0xac/0x230
          [<0000000084f72311>] do_el0_svc+0x40/0x58
          [<000000008b872457>] el0_svc+0x38/0x78
          [<00000000ee778653>] el0t_64_sync_handler+0x120/0x12c
          [<00000000a8ec61bf>] el0t_64_sync+0x190/0x194
    
    This patch fixes the leak by ensuring that any pending entries in
    proc->delivered_freeze are freed during binder_deferred_release().
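
    The cleanup described above amounts to draining a list on the release
    path. The following user-space sketch shows that shape with a plain
    singly linked list; the types and functions are hypothetical stand-ins
    and do not mirror the binder data structures.

```c
#include <stdlib.h>

struct sketch_freeze_work {
	struct sketch_freeze_work *next;
};

/* Hypothetical stand-in for the proc->delivered_freeze list head. */
struct sketch_delivered {
	struct sketch_freeze_work *head;
};

/* Queue one entry; returns 0 on success, -1 if allocation fails. */
int sketch_deliver(struct sketch_delivered *d)
{
	struct sketch_freeze_work *w = malloc(sizeof(*w));

	if (!w)
		return -1;
	w->next = d->head;
	d->head = w;
	return 0;
}

/* Release path: free every entry still pending and report how many. */
int sketch_release_delivered(struct sketch_delivered *d)
{
	int freed = 0;

	while (d->head) {
		struct sketch_freeze_work *w = d->head;

		d->head = w->next;
		free(w);
		freed++;
	}
	return freed;
}
```

    Without the drain step, entries that were detached from their reference
    but never acknowledged would stay allocated after the process exits.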
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Signed-off-by: Carlos Llamas <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix node UAF in binder_add_freeze_work() [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:12 2024 +0000

    binder: fix node UAF in binder_add_freeze_work()
    
    commit dc8aea47b928cc153b591b3558829ce42f685074 upstream.
    
    In binder_add_freeze_work() we iterate over the proc->nodes with the
    proc->inner_lock held. However, this lock is temporarily dropped in
    order to acquire the node->lock first (lock nesting order). This can
    race with binder_node_release() and trigger a use-after-free:
    
      ==================================================================
      BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xe4/0x19c
      Write of size 4 at addr ffff53c04c29dd04 by task freeze/640
    
      CPU: 5 UID: 0 PID: 640 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 #17
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       _raw_spin_lock+0xe4/0x19c
       binder_add_freeze_work+0x148/0x478
       binder_ioctl+0x1e70/0x25ac
       __arm64_sys_ioctl+0x124/0x190
    
      Allocated by task 637:
       __kmalloc_cache_noprof+0x12c/0x27c
       binder_new_node+0x50/0x700
       binder_transaction+0x35ac/0x6f74
       binder_thread_write+0xfb8/0x42a0
       binder_ioctl+0x18f0/0x25ac
       __arm64_sys_ioctl+0x124/0x190
    
      Freed by task 637:
       kfree+0xf0/0x330
       binder_thread_read+0x1e88/0x3a68
       binder_ioctl+0x16d8/0x25ac
       __arm64_sys_ioctl+0x124/0x190
      ==================================================================
    
    Fix the race by taking a temporary reference on the node before
    releasing the proc->inner_lock. This ensures the node remains alive
    while in use.
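
    The idea of pinning an object with a temporary reference while a lock
    is dropped can be sketched as follows. This is a single-threaded
    user-space model with the locking omitted; the sketch_* names and the
    "freed" flag (marking where the real free would happen) are assumptions
    made for illustration.

```c
struct sketch_node {
	int tmp_refs;  /* temporary references held by in-flight users */
	int released;  /* set once the owner has dropped the node */
	int freed;     /* sketch only: marks where the free would run */
};

/* Take a temporary reference before dropping the outer lock. */
void sketch_node_get(struct sketch_node *n)
{
	n->tmp_refs++;
}

/* Owner drops the node; the free is deferred while tmp refs exist. */
void sketch_node_release(struct sketch_node *n)
{
	n->released = 1;
	if (n->tmp_refs == 0)
		n->freed = 1;
}

/* Drop the temporary reference; the last user performs the free. */
void sketch_node_put(struct sketch_node *n)
{
	if (--n->tmp_refs == 0 && n->released)
		n->freed = 1;
}
```

    A release that races in between get and put no longer frees the node
    under the user; the free happens only when the last reference is dropped.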
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Reviewed-by: Alice Ryhl <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Signed-off-by: Carlos Llamas <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binder: fix OOB in binder_add_freeze_work() [+ + +]

Author: Carlos Llamas <[email protected]>
Date:   Thu Sep 26 23:36:13 2024 +0000

    binder: fix OOB in binder_add_freeze_work()
    
    commit 011e69a1b23011c0db3af4b8293fdd4522cc97b0 upstream.
    
    In binder_add_freeze_work() we iterate over the proc->nodes with the
    proc->inner_lock held. However, this lock is temporarily dropped to
    acquire the node->lock first (lock nesting order). This can race with
    binder_deferred_release() which removes the nodes from the proc->nodes
    rbtree and adds them into binder_dead_nodes list. This leads to a broken
    iteration in binder_add_freeze_work() as rb_next() will use data from
    binder_dead_nodes, triggering an out-of-bounds access:
    
      ==================================================================
      BUG: KASAN: global-out-of-bounds in rb_next+0xfc/0x124
      Read of size 8 at addr ffffcb84285f7170 by task freeze/660
    
      CPU: 8 UID: 0 PID: 660 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 #18
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       rb_next+0xfc/0x124
       binder_add_freeze_work+0x344/0x534
       binder_ioctl+0x1e70/0x25ac
       __arm64_sys_ioctl+0x124/0x190
    
      The buggy address belongs to the variable:
       binder_dead_nodes+0x10/0x40
      [...]
      ==================================================================
    
    This is possible because proc->nodes (rbtree) and binder_dead_nodes
    (list) share entries in binder_node through a union:
    
            struct binder_node {
            [...]
                    union {
                            struct rb_node rb_node;
                            struct hlist_node dead_node;
                    };
    
    Fix the race by checking that the proc is still alive. If not, simply
    break out of the iteration.
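
    To make the aliasing concrete, the sketch below models the union with
    simplified node types and shows a liveness check that stops the walk.
    All sketch_* types are simplified assumptions, and following the "right"
    pointer is only a placeholder for the real rb_next() traversal.

```c
#include <stddef.h>

struct sketch_rb_node {
	unsigned long parent_color;
	struct sketch_rb_node *right;
	struct sketch_rb_node *left;
};

struct sketch_hlist_node {
	struct sketch_hlist_node *next;
	struct sketch_hlist_node **pprev;
};

struct sketch_binder_node {
	int id;
	union {
		struct sketch_rb_node rb_node;      /* while in the proc's tree */
		struct sketch_hlist_node dead_node; /* once on the dead list */
	};
};

/* Both link types start at the same offset, so they share storage. */
int sketch_links_overlap(void)
{
	return offsetof(struct sketch_binder_node, rb_node) ==
	       offsetof(struct sketch_binder_node, dead_node);
}

struct sketch_proc {
	int is_dead;
};

/*
 * Return the next node to visit, or NULL once the proc has died and the
 * tree links may already have been overwritten by dead-list links.
 */
struct sketch_rb_node *sketch_next(const struct sketch_proc *proc,
				   struct sketch_rb_node *n)
{
	if (proc->is_dead)
		return NULL;
	return n->right; /* simplified placeholder for rb_next() */
}
```

    Because the storage is shared, tree links read after the node has moved
    to the dead list are really list pointers, which is why the iteration
    must stop instead of following them.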
    
    Fixes: d579b04a52a1 ("binder: frozen notification")
    Cc: [email protected]
    Reviewed-by: Alice Ryhl <[email protected]>
    Acked-by: Todd Kjos <[email protected]>
    Signed-off-by: Carlos Llamas <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: add a sanity check for btrfs root in btrfs_search_slot() [+ + +]

Author: Lizhi Xu <[email protected]>
Date:   Fri Oct 25 12:55:53 2024 +0800

    btrfs: add a sanity check for btrfs root in btrfs_search_slot()
    
    [ Upstream commit 3ed51857a50f530ac7a1482e069dfbd1298558d4 ]
    
    Syzbot reports a null-ptr-deref in btrfs_search_slot().
    
    The reproducer is using rescue=ibadroots, and the extent tree root is
    corrupted, thus the extent tree is NULL.
    
    When scrub tries to search the extent tree to gather the needed extent
    info, btrfs_search_slot() doesn't check if the target root is NULL or
    not, resulting in the null-ptr-deref.
    
    Add a sanity check for the btrfs root before using it in btrfs_search_slot().
    
    Reported-by: [email protected]
    Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots")
    Link: https://syzkaller.appspot.com/bug?extid=3030e17bd57a73d39bd7
    CC: [email protected] # 5.15+
    Reviewed-by: Qu Wenruo <[email protected]>
    Tested-by: [email protected]
    Signed-off-by: Lizhi Xu <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: change btrfs_encoded_read() so that reading of extent is done by caller [+ + +]

Author: Mark Harmstone <[email protected]>
Date:   Tue Oct 22 15:50:17 2024 +0100

    btrfs: change btrfs_encoded_read() so that reading of extent is done by caller
    
    [ Upstream commit 26efd44796c6dd7a64f039a0dda6d558eac97a3e ]
    
    Change the behaviour of btrfs_encoded_read() so that if it needs to read
    an extent from disk, it leaves the extent and inode locked and returns
    -EIOCBQUEUED. The caller is then responsible for doing the I/O via
    btrfs_encoded_read_regular() and unlocking the extent and inode.
    
    Signed-off-by: Mark Harmstone <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Stable-dep-of: 05b36b04d74a ("btrfs: fix use-after-free in btrfs_encoded_read_endio()")
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: don't loop for nowait writes when checking for cross references [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Fri Nov 15 15:46:13 2024 +0000

    btrfs: don't loop for nowait writes when checking for cross references
    
    [ Upstream commit ed67f2a913a4f0fc505db29805c41dd07d3cb356 ]
    
    When checking for delayed refs while verifying whether there are cross
    references for a data extent, we stop if the path has nowait set and we
    can't trylock the delayed ref head's mutex, returning -EAGAIN with the
    goal of making the write fall back to a blocking context. However, we
    ignore the -EAGAIN at btrfs_cross_ref_exist() when check_delayed_ref()
    returns it, and keep looping instead of immediately returning the -EAGAIN
    to the caller.
    
    Fix this by not looping if we get -EAGAIN and we have a nowait path.
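
    The retry policy can be sketched in user space as a loop that retries on
    -EAGAIN only when the caller is allowed to block. The callback type, the
    example checker and the function names are hypothetical and stand in for
    the delayed-ref check; they are not the btrfs functions themselves.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback standing in for the delayed-ref check. */
typedef int (*sketch_check_fn)(void *ctx);

/*
 * Retry the check while it reports -EAGAIN, unless the caller is in a
 * nowait context, in which case -EAGAIN is returned immediately.
 */
int sketch_cross_ref_exist(sketch_check_fn check, void *ctx, bool nowait)
{
	int ret;

	do {
		ret = check(ctx);
	} while (ret == -EAGAIN && !nowait);

	return ret;
}

/* Example checker: reports -EAGAIN until its counter reaches zero. */
int sketch_busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

    In the nowait case the caller sees -EAGAIN after a single attempt and can
    fall back to a blocking context itself.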
    
    Fixes: 26ce91144631 ("btrfs: make can_nocow_extent nowait compatible")
    CC: [email protected] # 6.1+
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: drop unused parameter file_offset from btrfs_encoded_read_regular_fill_pages() [+ + +]

Author: David Sterba <[email protected]>
Date:   Wed Oct 9 16:31:47 2024 +0200

    btrfs: drop unused parameter file_offset from btrfs_encoded_read_regular_fill_pages()
    
    [ Upstream commit 590168edbe6317ca9f4066215fb099f43ffe745c ]
    
    The file_offset parameter used to be passed to encoded read struct but
    was removed in commit b665affe93d8 ("btrfs: remove unused members from
    struct btrfs_encoded_read_private").
    
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Stable-dep-of: 05b36b04d74a ("btrfs: fix use-after-free in btrfs_encoded_read_endio()")
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: fix use-after-free in btrfs_encoded_read_endio() [+ + +]

Author: Johannes Thumshirn <[email protected]>
Date:   Wed Nov 13 18:16:48 2024 +0100

    btrfs: fix use-after-free in btrfs_encoded_read_endio()
    
    [ Upstream commit 05b36b04d74a517d6675bf2f90829ff1ac7e28dc ]
    
    Shinichiro reported the following use-after-free that sometimes
    happens in our CI system when running fstests' btrfs/284 on a TCMU
    runner device:
    
      BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780
      Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219
    
      CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15
      Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020
      Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
      Call Trace:
       <TASK>
       dump_stack_lvl+0x6e/0xa0
       ? lock_release+0x708/0x780
       print_report+0x174/0x505
       ? lock_release+0x708/0x780
       ? __virt_addr_valid+0x224/0x410
       ? lock_release+0x708/0x780
       kasan_report+0xda/0x1b0
       ? lock_release+0x708/0x780
       ? __wake_up+0x44/0x60
       lock_release+0x708/0x780
       ? __pfx_lock_release+0x10/0x10
       ? __pfx_do_raw_spin_lock+0x10/0x10
       ? lock_is_held_type+0x9a/0x110
       _raw_spin_unlock_irqrestore+0x1f/0x60
       __wake_up+0x44/0x60
       btrfs_encoded_read_endio+0x14b/0x190 [btrfs]
       btrfs_check_read_bio+0x8d9/0x1360 [btrfs]
       ? lock_release+0x1b0/0x780
       ? trace_lock_acquire+0x12f/0x1a0
       ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs]
       ? process_one_work+0x7e3/0x1460
       ? lock_acquire+0x31/0xc0
       ? process_one_work+0x7e3/0x1460
       process_one_work+0x85c/0x1460
       ? __pfx_process_one_work+0x10/0x10
       ? assign_work+0x16c/0x240
       worker_thread+0x5e6/0xfc0
       ? __pfx_worker_thread+0x10/0x10
       kthread+0x2c3/0x3a0
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x31/0x70
       ? __pfx_kthread+0x10/0x10
       ret_from_fork_asm+0x1a/0x30
       </TASK>
    
      Allocated by task 3661:
       kasan_save_stack+0x30/0x50
       kasan_save_track+0x14/0x30
       __kasan_kmalloc+0xaa/0xb0
       btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs]
       send_extent_data+0xf0f/0x24a0 [btrfs]
       process_extent+0x48a/0x1830 [btrfs]
       changed_cb+0x178b/0x2ea0 [btrfs]
       btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs]
       _btrfs_ioctl_send+0x117/0x330 [btrfs]
       btrfs_ioctl+0x184a/0x60a0 [btrfs]
       __x64_sys_ioctl+0x12e/0x1a0
       do_syscall_64+0x95/0x180
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
      Freed by task 3661:
       kasan_save_stack+0x30/0x50
       kasan_save_track+0x14/0x30
       kasan_save_free_info+0x3b/0x70
       __kasan_slab_free+0x4f/0x70
       kfree+0x143/0x490
       btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs]
       send_extent_data+0xf0f/0x24a0 [btrfs]
       process_extent+0x48a/0x1830 [btrfs]
       changed_cb+0x178b/0x2ea0 [btrfs]
       btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs]
       _btrfs_ioctl_send+0x117/0x330 [btrfs]
       btrfs_ioctl+0x184a/0x60a0 [btrfs]
       __x64_sys_ioctl+0x12e/0x1a0
       do_syscall_64+0x95/0x180
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
      The buggy address belongs to the object at ffff888106a83f00
       which belongs to the cache kmalloc-rnd-07-96 of size 96
      The buggy address is located 24 bytes inside of
       freed 96-byte region [ffff888106a83f00, ffff888106a83f60)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83
      flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
      page_type: f5(slab)
      raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004
      raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000
      page dumped because: kasan: bad access detected
    
      Memory state around the buggy address:
       ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                  ^
       ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      ==================================================================
    
    Further analysis of the trace and the crash dump's vmcore file shows that
    the wake_up() call in btrfs_encoded_read_endio() operates on the
    wait_queue that is in the private data passed to the end_io handler.
    
    Commit 4ff47df40447 ("btrfs: move priv off stack in
    btrfs_encoded_read_regular_fill_pages()") moved 'struct
    btrfs_encoded_read_private' off the stack.
    
    Before that commit one can see a corruption of the private data when
    analyzing the vmcore after a crash:
    
    *(struct btrfs_encoded_read_private *)0xffff88815626eec8 = {
            .wait = (wait_queue_head_t){
                    .lock = (spinlock_t){
                            .rlock = (struct raw_spinlock){
                                    .raw_lock = (arch_spinlock_t){
                                            .val = (atomic_t){
                                                    .counter = (int)-2005885696,
                                            },
                                            .locked = (u8)0,
                                            .pending = (u8)157,
                                            .locked_pending = (u16)40192,
                                            .tail = (u16)34928,
                                    },
                                    .magic = (unsigned int)536325682,
                                    .owner_cpu = (unsigned int)29,
                                    .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0,
                                    .dep_map = (struct lockdep_map){
                                            .key = (struct lock_class_key *)0xffff8881575a3b6c,
                                            .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 },
                                            .name = (const char *)0xffff88815626f100 = "",
                                            .wait_type_outer = (u8)37,
                                            .wait_type_inner = (u8)178,
                                            .lock_type = (u8)154,
                                    },
                            },
                            .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 },
                            .dep_map = (struct lockdep_map){
                                    .key = (struct lock_class_key *)0xffff8881575a3b6c,
                                    .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 },
                                    .name = (const char *)0xffff88815626f100 = "",
                                    .wait_type_outer = (u8)37,
                                    .wait_type_inner = (u8)178,
                                    .lock_type = (u8)154,
                            },
                    },
                    .head = (struct list_head){
                            .next = (struct list_head *)0x112cca,
                            .prev = (struct list_head *)0x47,
                    },
            },
            .pending = (atomic_t){
                    .counter = (int)-1491499288,
            },
            .status = (blk_status_t)130,
    }
    
    Here we can see several indicators of in-memory data corruption, e.g. the
    large negative atomic values of ->pending or
    ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic
    0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead or the bogus
    pointer values for ->wait->head.
    
    To fix the corruption, change atomic_dec_return() to
    atomic_dec_and_test(), as atomic_dec_return() is defined as two
    instructions on x86_64, whereas atomic_dec_and_test() is defined as a
    single atomic operation. With atomic_dec_return(), the counter value can
    already be decremented while the if statement in
    btrfs_encoded_read_endio() is not completely processed, i.e. the 0 test
    has not completed. If another thread continues executing
    btrfs_encoded_read_regular_fill_pages(), the atomic_dec_return() there
    can see the already updated ->pending counter and continue by freeing
    the private data. When the endio handler then continues, the test for 0
    succeeds and the wait_queue is woken up, resulting in a use-after-free.
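
    The property the fix relies on, as described above, is that the
    decrement and the zero test form one indivisible step, so exactly one
    caller observes the transition to zero. The following is a minimal
    userspace sketch of that property only, not kernel code: PendingCounter,
    dec_and_test() and run_completions() are hypothetical names, and a lock
    stands in for hardware atomicity.

```python
import threading


class PendingCounter:
    """Userspace stand-in for an atomic_t; the lock models atomicity."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def dec_and_test(self):
        # Decrement and zero test happen as one indivisible step, so only
        # one caller can ever observe the transition to zero.
        with self._lock:
            self._value -= 1
            return self._value == 0


def run_completions(n_workers):
    """Let n_workers simulated end_io handlers each drop one reference.

    Returns how many of them observed the counter reaching zero.
    """
    pending = PendingCounter(n_workers)
    saw_zero = [False] * n_workers  # one private slot per worker

    def endio(slot):
        saw_zero[slot] = pending.dec_and_test()

    threads = [threading.Thread(target=endio, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(saw_zero)
```

    Only the single worker that sees True would go on to wake the waiter;
    every other worker must not touch the shared private data again.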
    
    Reported-by: Shinichiro Kawasaki <[email protected]>
    Suggested-by: Damien Le Moal <[email protected]>
    Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl")
    CC: [email protected] # 6.1+
    Reviewed-by: Filipe Manana <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages() [+ + +]

Author: Mark Harmstone <[email protected]>
Date:   Tue Oct 22 15:50:19 2024 +0100

    btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()
    
    [ Upstream commit 68d3b27e05c7ca5545e88465f5e2be6eda0e11df ]
    
    Change btrfs_encoded_read_regular_fill_pages() so that the priv struct
    is allocated rather than stored on the stack, in preparation for adding
    an asynchronous mode to the function.
    
    Signed-off-by: Mark Harmstone <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Stable-dep-of: 05b36b04d74a ("btrfs: fix use-after-free in btrfs_encoded_read_endio()")
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: ref-verify: fix use-after-free after invalid ref action [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Fri Nov 15 11:29:21 2024 +0000

    btrfs: ref-verify: fix use-after-free after invalid ref action
    
    [ Upstream commit 7c4e39f9d2af4abaf82ca0e315d1fd340456620f ]
    
    At btrfs_ref_tree_mod(), after we successfully insert the new ref entry
    (local variable 'ref') into the respective block entry's rbtree (local
    variable 'be'), if we find an unexpected action of
    BTRFS_DROP_DELAYED_REF, we error out and free the ref entry without
    removing it from the block entry's rbtree. Then, in the error path of
    btrfs_ref_tree_mod(), we call btrfs_free_ref_cache(), which iterates
    over all block entries and calls free_block_entry() for each one. This
    triggers a use-after-free when free_block_entry() is called against the
    block entry whose rbtree we added the ref entry to, since that rbtree
    still points to the freed ref entry: we didn't remove it from the
    rbtree before freeing it in the error path at btrfs_ref_tree_mod(). Fix
    this by removing the new ref entry from the rbtree before freeing it.
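
    The ordering problem can be sketched with a small userspace model. This
    is an illustration under stated assumptions, not the ref-verify code: a
    dict stands in for the rbtree, a 'freed' flag stands in for kfree(), and
    all class and function names below are hypothetical.

```python
class RefEntry:
    def __init__(self, key):
        self.key = key
        self.freed = False


class BlockEntry:
    def __init__(self):
        self.refs = {}  # stand-in for the per-block rbtree of ref entries


def free_ref(ref):
    ref.freed = True  # models kfree(); touching ref afterwards is a bug


def free_block_entry(be):
    """Free every ref still linked; report a stale (already freed) one."""
    for ref in list(be.refs.values()):
        if ref.freed:
            return "use-after-free"
        free_ref(ref)
    be.refs.clear()
    return "ok"


def error_path(be, ref, unlink_first):
    if unlink_first:
        del be.refs[ref.key]  # the fix: unlink from the tree before freeing
    free_ref(ref)
    return free_block_entry(be)  # models the later btrfs_free_ref_cache() walk


def demo(unlink_first):
    be = BlockEntry()
    ref = RefEntry(key=1)
    be.refs[ref.key] = ref  # the new ref entry was inserted successfully
    return error_path(be, ref, unlink_first)
```

    In this model demo(False) walks into the stale entry, while demo(True)
    completes cleanly because the entry is no longer reachable from the tree.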
    
    Syzbot reported this with the following stack traces:
    
       BTRFS error (device loop0 state EA):   Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615
          __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523
          update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512
          btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
          btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
          btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
          btrfs_insert_empty_items+0x9c/0x1a0 fs/btrfs/ctree.c:4314
          btrfs_insert_empty_item fs/btrfs/ctree.h:669 [inline]
          btrfs_insert_orphan_item+0x1f1/0x320 fs/btrfs/orphan.c:23
          btrfs_orphan_add+0x6d/0x1a0 fs/btrfs/inode.c:3482
          btrfs_unlink+0x267/0x350 fs/btrfs/inode.c:4293
          vfs_unlink+0x365/0x650 fs/namei.c:4469
          do_unlinkat+0x4ae/0x830 fs/namei.c:4533
          __do_sys_unlinkat fs/namei.c:4576 [inline]
          __se_sys_unlinkat fs/namei.c:4569 [inline]
          __x64_sys_unlinkat+0xcc/0xf0 fs/namei.c:4569
          do_syscall_x64 arch/x86/entry/common.c:52 [inline]
          do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
          entry_SYSCALL_64_after_hwframe+0x77/0x7f
       BTRFS error (device loop0 state EA):   Ref action 1, root 5, ref_root 5, parent 0, owner 260, offset 0, num_refs 1
          __btrfs_mod_ref+0x76b/0xac0 fs/btrfs/extent-tree.c:2521
          update_ref_for_cow+0x96a/0x11f0
          btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
          btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
          btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
          btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411
          __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030
          btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline]
          __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137
          __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171
          btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313
          prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586
          relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611
          btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081
          btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377
          __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161
          btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538
       BTRFS error (device loop0 state EA):   Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615
          __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523
          update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512
          btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
          btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
          btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
          btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411
          __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030
          btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline]
          __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137
          __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171
          btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313
          prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586
          relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611
          btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081
          btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377
          __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161
          btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538
       ==================================================================
       BUG: KASAN: slab-use-after-free in rb_first+0x69/0x70 lib/rbtree.c:473
       Read of size 8 at addr ffff888042d1af38 by task syz.0.0/5329
    
       CPU: 0 UID: 0 PID: 5329 Comm: syz.0.0 Not tainted 6.12.0-rc7-syzkaller #0
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
       Call Trace:
        <TASK>
        __dump_stack lib/dump_stack.c:94 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
        print_address_description mm/kasan/report.c:377 [inline]
        print_report+0x169/0x550 mm/kasan/report.c:488
        kasan_report+0x143/0x180 mm/kasan/report.c:601
        rb_first+0x69/0x70 lib/rbtree.c:473
        free_block_entry+0x78/0x230 fs/btrfs/ref-verify.c:248
        btrfs_free_ref_cache+0xa3/0x100 fs/btrfs/ref-verify.c:917
        btrfs_ref_tree_mod+0x139f/0x15e0 fs/btrfs/ref-verify.c:898
        btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544
        __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523
        update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512
        btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
        btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
        btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
        btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411
        __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030
        btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline]
        __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137
        __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171
        btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313
        prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586
        relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611
        btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081
        btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377
        __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161
        btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538
        btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:907 [inline]
        __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
        entry_SYSCALL_64_after_hwframe+0x77/0x7f
       RIP: 0033:0x7f996df7e719
       RSP: 002b:00007f996ede7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
       RAX: ffffffffffffffda RBX: 00007f996e135f80 RCX: 00007f996df7e719
       RDX: 0000000020000180 RSI: 00000000c4009420 RDI: 0000000000000004
       RBP: 00007f996dff139e R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
       R13: 0000000000000000 R14: 00007f996e135f80 R15: 00007fff79f32e68
        </TASK>
    
       Allocated by task 5329:
        kasan_save_stack mm/kasan/common.c:47 [inline]
        kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
        poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
        __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394
        kasan_kmalloc include/linux/kasan.h:257 [inline]
        __kmalloc_cache_noprof+0x19c/0x2c0 mm/slub.c:4295
        kmalloc_noprof include/linux/slab.h:878 [inline]
        kzalloc_noprof include/linux/slab.h:1014 [inline]
        btrfs_ref_tree_mod+0x264/0x15e0 fs/btrfs/ref-verify.c:701
        btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544
        __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523
        update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512
        btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
        btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
        btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
        btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411
        __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030
        btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline]
        __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137
        __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171
        btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313
        prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586
        relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611
        btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081
        btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377
        __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161
        btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538
        btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:907 [inline]
        __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
        entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
       Freed by task 5329:
        kasan_save_stack mm/kasan/common.c:47 [inline]
        kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
        kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
        poison_slab_object mm/kasan/common.c:247 [inline]
        __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
        kasan_slab_free include/linux/kasan.h:230 [inline]
        slab_free_hook mm/slub.c:2342 [inline]
        slab_free mm/slub.c:4579 [inline]
        kfree+0x1a0/0x440 mm/slub.c:4727
        btrfs_ref_tree_mod+0x136c/0x15e0
        btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544
        __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523
        update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512
        btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594
        btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754
        btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116
        btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411
        __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030
        btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline]
        __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137
        __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171
        btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313
        prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586
        relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611
        btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081
        btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377
        __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161
        btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538
        btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:907 [inline]
        __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
        entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
       The buggy address belongs to the object at ffff888042d1af00
        which belongs to the cache kmalloc-64 of size 64
       The buggy address is located 56 bytes inside of
        freed 64-byte region [ffff888042d1af00, ffff888042d1af40)
    
       The buggy address belongs to the physical page:
       page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x42d1a
       anon flags: 0x4fff00000000000(node=1|zone=1|lastcpupid=0x7ff)
       page_type: f5(slab)
       raw: 04fff00000000000 ffff88801ac418c0 0000000000000000 dead000000000001
       raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000
       page dumped because: kasan: bad access detected
       page_owner tracks the page as allocated
       page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5055, tgid 5055 (dhcpcd-run-hook), ts 40377240074, free_ts 40376848335
        set_page_owner include/linux/page_owner.h:32 [inline]
        post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1541
        prep_new_page mm/page_alloc.c:1549 [inline]
        get_page_from_freelist+0x3649/0x3790 mm/page_alloc.c:3459
        __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4735
        alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
        alloc_slab_page+0x6a/0x140 mm/slub.c:2412
        allocate_slab+0x5a/0x2f0 mm/slub.c:2578
        new_slab mm/slub.c:2631 [inline]
        ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3818
        __slab_alloc+0x58/0xa0 mm/slub.c:3908
        __slab_alloc_node mm/slub.c:3961 [inline]
        slab_alloc_node mm/slub.c:4122 [inline]
        __do_kmalloc_node mm/slub.c:4263 [inline]
        __kmalloc_noprof+0x25a/0x400 mm/slub.c:4276
        kmalloc_noprof include/linux/slab.h:882 [inline]
        kzalloc_noprof include/linux/slab.h:1014 [inline]
        tomoyo_encode2 security/tomoyo/realpath.c:45 [inline]
        tomoyo_encode+0x26f/0x540 security/tomoyo/realpath.c:80
        tomoyo_realpath_from_path+0x59e/0x5e0 security/tomoyo/realpath.c:283
        tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
        tomoyo_check_open_permission+0x255/0x500 security/tomoyo/file.c:771
        security_file_open+0x777/0x990 security/security.c:3109
        do_dentry_open+0x369/0x1460 fs/open.c:945
        vfs_open+0x3e/0x330 fs/open.c:1088
        do_open fs/namei.c:3774 [inline]
        path_openat+0x2c84/0x3590 fs/namei.c:3933
       page last free pid 5055 tgid 5055 stack trace:
        reset_page_owner include/linux/page_owner.h:25 [inline]
        free_pages_prepare mm/page_alloc.c:1112 [inline]
        free_unref_page+0xcfb/0xf20 mm/page_alloc.c:2642
        free_pipe_info+0x300/0x390 fs/pipe.c:860
        put_pipe_info fs/pipe.c:719 [inline]
        pipe_release+0x245/0x320 fs/pipe.c:742
        __fput+0x23f/0x880 fs/file_table.c:431
        __do_sys_close fs/open.c:1567 [inline]
        __se_sys_close fs/open.c:1552 [inline]
        __x64_sys_close+0x7f/0x110 fs/open.c:1552
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
        entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
       Memory state around the buggy address:
        ffff888042d1ae00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
        ffff888042d1ae80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
       >ffff888042d1af00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
                                               ^
        ffff888042d1af80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
        ffff888042d1b000: 00 00 00 00 00 fc fc 00 00 00 00 00 fc fc 00 00
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/T/#u
    Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool")
    CC: [email protected] # 4.19+
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ceph: extract entity name from device id [+ + +]

Author: Patrick Donnelly <[email protected]>
Date:   Sat Oct 12 20:54:11 2024 -0400

    ceph: extract entity name from device id
    
    commit 955710afcb3bb63e21e186451ed5eba85fa14d0b upstream.
    
    Previously, the "name" in the new device syntax "<name>@<fsid>.<fsname>"
    was ignored because (presumably) tests were done using mount.ceph which
    also passed the entity name using "-o name=foo". If mounting is done
    without the mount.ceph helper, the new device id syntax fails to set
    the name properly.
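
    As an illustration of the device id layout quoted above, here is a
    minimal userspace sketch that splits "<name>@<fsid>.<fsname>" into its
    parts. parse_device_id() is a hypothetical helper, not the kernel
    parser, and it assumes that the name contains no '@' and that the fsid
    contains no '.'.

```python
def parse_device_id(dev_id):
    """Split '<name>@<fsid>.<fsname>' into (name, fsid, fsname)."""
    name, at, rest = dev_id.partition("@")
    fsid, dot, fsname = rest.partition(".")
    if not at or not dot or not name or not fsname:
        raise ValueError(f"not in <name>@<fsid>.<fsname> form: {dev_id!r}")
    return name, fsid, fsname
```

    The first element of the returned tuple is the entity name that would
    otherwise have to be supplied separately through "-o name=foo".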
    
    Cc: [email protected]
    Link: https://tracker.ceph.com/issues/68516
    Signed-off-by: Patrick Donnelly <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: fix cred leak in ceph_mds_check_access() [+ + +]

Author: Max Kellermann <[email protected]>
Date:   Sat Nov 23 08:21:21 2024 +0100

    ceph: fix cred leak in ceph_mds_check_access()
    
    commit c5cf420303256dcd6ff175643e9e9558543c2047 upstream.
    
    get_current_cred() increments the reference counter, but the
    put_cred() call was missing.
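
    The get/put pairing can be sketched with a small userspace model. This
    is not the ceph code: Cred, get_cred(), put_cred() and both check
    functions are hypothetical stand-ins that only count references.

```python
class Cred:
    def __init__(self, uid):
        self.uid = uid
        self.usage = 1  # reference held by the owning task


def get_cred(cred):
    cred.usage += 1  # take an extra reference
    return cred


def put_cred(cred):
    cred.usage -= 1  # drop a reference


def check_access_leaky(cred):
    c = get_cred(cred)
    return c.uid == 0  # returns without dropping the reference


def check_access_balanced(cred):
    c = get_cred(cred)
    try:
        return c.uid == 0
    finally:
        put_cred(c)  # every get is paired with a put on all return paths
```

    After one call to the leaky variant the usage count stays elevated by
    one; the balanced variant leaves it unchanged.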
    
    Cc: [email protected]
    Fixes: 596afb0b8933 ("ceph: add ceph_mds_check_access() helper")
    Signed-off-by: Max Kellermann <[email protected]>
    Reviewed-by: Xiubo Li <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: pass cred pointer to ceph_mds_auth_match() [+ + +]

Author: Max Kellermann <[email protected]>
Date:   Sat Nov 23 08:21:20 2024 +0100

    ceph: pass cred pointer to ceph_mds_auth_match()
    
    commit 23426309a4064b25a961e1c72961d8bfc7c8c990 upstream.
    
    This eliminates a redundant get_current_cred() call, because
    ceph_mds_check_access() has already obtained this pointer.
    
    As a side effect, this also fixes a reference leak in
    ceph_mds_auth_match(): by omitting the get_current_cred() call, no
    additional cred reference is taken.
    
    Cc: [email protected]
    Fixes: 596afb0b8933 ("ceph: add ceph_mds_check_access() helper")
    Signed-off-by: Max Kellermann <[email protected]>
    Reviewed-by: Xiubo Li <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-qcs404: fix initial rate of GPLL3 [+ + +]

Author: Gabor Juhos <[email protected]>
Date:   Tue Oct 22 11:45:56 2024 +0200

    clk: qcom: gcc-qcs404: fix initial rate of GPLL3
    
    commit 36d202241d234fa4ac50743510d098ad52bd193a upstream.
    
    The comment before the config of the GPLL3 PLL says that the
    PLL should run at 930 MHz. Contrary to this, calculating the
    frequency from the current configuration values, using 19.2 MHz
    as the input frequency defined in 'qcs404.dtsi', gives
    921.6 MHz:
    
      $ xo=19200000; l=48; alpha=0x0; alpha_hi=0x0
      $ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
      921600000.00000000000000000000
    
    Set 'alpha_hi' in the configuration to a value used in downstream
    kernels [1][2] in order to get the correct output rate:
    
      $ xo=19200000; l=48; alpha=0x0; alpha_hi=0x70
      $ echo "$xo * ($((l)) + $(((alpha_hi << 32 | alpha) >> 8)) / 2^32)" | bc -l
      930000000.00000000000000000000
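
    The same arithmetic as in the two bc invocations above, written as a
    small function. alpha_pll_rate() is a hypothetical helper name; it uses
    integer arithmetic and floors the fractional part, which is exact for
    the values shown here.

```python
def alpha_pll_rate(xo_hz, l, alpha, alpha_hi):
    """Rate in Hz: xo * (l + ((alpha_hi << 32 | alpha) >> 8) / 2**32)."""
    frac = ((alpha_hi << 32) | alpha) >> 8
    return xo_hz * l + (xo_hz * frac) // 2**32


old_rate = alpha_pll_rate(19200000, 48, 0x0, 0x0)   # current configuration
new_rate = alpha_pll_rate(19200000, 48, 0x0, 0x70)  # with alpha_hi = 0x70
```

    Here old_rate is 921600000 and new_rate is 930000000, matching the bc
    output.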
    
    The change is based on static code analysis, compile tested only.
    
    [1] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/kernel.lnx.5.4.r56-rel/drivers/clk/qcom/gcc-qcs404.c?ref_type=heads#L335
    [2] https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/blob/kernel.lnx.5.15.r49-rel/drivers/clk/qcom/gcc-qcs404.c?ref_type=heads#L127
    
    Cc: [email protected]
    Fixes: 652f1813c113 ("clk: qcom: gcc: Add global clock controller driver for QCS404")
    Signed-off-by: Gabor Juhos <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: scmi: Fix cleanup path when boost enablement fails [+ + +]

Author: Sibi Sankar <[email protected]>
Date:   Thu Oct 31 18:57:44 2024 +0530

    cpufreq: scmi: Fix cleanup path when boost enablement fails
    
    commit 8c776a54d9ef3e945db2fe407ad6ad4525422943 upstream.
    
    Include free_cpufreq_table in the cleanup path when boost enablement fails.
    
    cc: [email protected]
    Fixes: a8e949d41c72 ("cpufreq: scmi: Enable boost support")
    Signed-off-by: Sibi Sankar <[email protected]>
    Signed-off-by: Viresh Kumar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm thin: Add missing destroy_work_on_stack() [+ + +]

Author: Yuan Can <[email protected]>
Date:   Wed Nov 6 09:03:12 2024 +0800

    dm thin: Add missing destroy_work_on_stack()
    
    commit e74fa2447bf9ed03d085b6d91f0256cc1b53f1a8 upstream.
    
    Add the missing destroy_work_on_stack() call for pw->worker in
    pool_work_wait().
    
    Fixes: e7a3e871d895 ("dm thin: cleanup noflush_work to use a proper completion")
    Cc: [email protected]
    Signed-off-by: Yuan Can <[email protected]>
    Signed-off-by: Mikulas Patocka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dm: Fix typo in error message [+ + +]

Author: Ssuhung Yeh <[email protected]>
Date:   Thu Oct 31 18:25:59 2024 +0800

    dm: Fix typo in error message
    
    commit 2deb70d3e66d538404d9e71bff236e6d260da66e upstream.
    
    Remove the redundant "i" at the beginning of the error message. This "i"
    was accidentally left behind by commit 1c1318866928 ("dm: prefer
    '"%s...", __func__'").
    
    Signed-off-by: Ssuhung Yeh <[email protected]>
    Signed-off-by: Mikulas Patocka <[email protected]>
    Fixes: 1c1318866928 ("dm: prefer '"%s...", __func__'")
    Cc: [email protected]      # v6.3+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

driver core: fw_devlink: Stop trying to optimize cycle detection logic [+ + +]

Author: Saravana Kannan <[email protected]>
Date:   Wed Oct 30 10:10:07 2024 -0700

    driver core: fw_devlink: Stop trying to optimize cycle detection logic
    
    commit bac3b10b78e54b7da3cede397258f75a2180609b upstream.
    
    In attempting to optimize fw_devlink runtime, I introduced numerous cycle
    detection bugs by foregoing cycle detection logic under specific
    conditions. Each fix has further narrowed the conditions for optimization.
    
    It's time to give up on these optimization attempts and just run the cycle
    detection logic every time fw_devlink tries to create a device link.
    
    The specific bug report that triggered this fix involved a supplier fwnode
    that never gets a device created for it. Instead, the supplier fwnode is
    represented by the device that corresponds to an ancestor fwnode.
    
    In this case, fw_devlink didn't do any cycle detection because the cycle
    detection logic is only run when a device link is created between the
    devices that correspond to the actual consumer and supplier fwnodes.
    
    With this change, fw_devlink will run cycle detection logic even when
    creating SYNC_STATE_ONLY proxy device links from a device that is an
    ancestor of a consumer fwnode.
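
    The idea of running the check on every link creation, with no fast path
    that skips it, can be sketched on a plain dependency graph. This is a
    simplified userspace model and not the fw_devlink implementation:
    depends_on() and add_link() are hypothetical names, and refusing the
    link is only one possible way to react to a detected cycle.

```python
def depends_on(links, start, target):
    """Return True if target is reachable from start via supplier links."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(links.get(node, ()))
    return False


def add_link(links, consumer, supplier):
    """Always run cycle detection before recording consumer -> supplier."""
    if depends_on(links, supplier, consumer):
        return False  # the link would close a cycle; caller must handle it
    links.setdefault(consumer, set()).add(supplier)
    return True
```

    With links a -> b and b -> c already present, add_link(links, "c", "a")
    returns False because c would then depend on itself through a and b.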
    
    Reported-by: Tomi Valkeinen <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles")
    Cc: stable <[email protected]>
    Tested-by: Tomi Valkeinen <[email protected]>
    Signed-off-by: Saravana Kannan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Fix handling of plane refcount [+ + +]

Author: Joshua Aberback <[email protected]>
Date:   Mon Oct 28 17:12:22 2024 -0400

    drm/amd/display: Fix handling of plane refcount
    
    commit 27227a234c1487cb7a684615f0749c455218833a upstream.
    
    [Why]
    The mechanism to backup and restore plane states doesn't maintain
    refcount, which can cause issues if the refcount of the plane changes
    in between backup and restore operations, such as memory leaks if the
    refcount was supposed to go down, or double frees / invalid memory
    accesses if the refcount was supposed to go up.
    
    [How]
    Cache and re-apply current refcount when restoring plane states.
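
    A minimal userspace sketch of "cache and re-apply" is shown below. It
    is an illustration only, not the display driver code: PlaneState,
    backup() and restore() are hypothetical names, and 'config' stands in
    for everything in the plane state other than the refcount.

```python
import copy


class PlaneState:
    def __init__(self, refcount, config):
        self.refcount = refcount
        self.config = config


def backup(plane):
    """Snapshot the whole state, including the refcount at backup time."""
    return copy.deepcopy(plane)


def restore(plane, saved):
    """Restore the snapshot but keep the refcount the plane has right now."""
    current_refcount = plane.refcount  # cache the live refcount
    plane.__dict__.update(copy.deepcopy(saved.__dict__))
    plane.refcount = current_refcount  # re-apply it after the copy
```

    If the refcount changed between backup() and restore(), the restored
    plane keeps the newer count instead of silently reverting to the stale
    one stored in the snapshot.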
    
    Cc: [email protected]
    Reviewed-by: Josip Pavic <[email protected]>
    Signed-off-by: Joshua Aberback <[email protected]>
    Signed-off-by: Hamza Mahfooz <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Remove PIPE_DTO_SRC_SEL programming from set_dtbclk_dto [+ + +]

Author: Ovidiu Bunea <[email protected]>
Date:   Wed Nov 6 16:25:18 2024 -0500

    drm/amd/display: Remove PIPE_DTO_SRC_SEL programming from set_dtbclk_dto
    
    commit a3e6079bd93d5c66a43bf6a5f90e5b98465dc7b3 upstream.
    
    There are cases where an OTG is remapped from driving a regular HDMI
    display to a DP/eDP display. There are also cases where DTBCLK needs to
    be enabled for HPO, but DTBCLK DTO programming may be done while OTG is
    still enabled which is dangerous as the PIPE_DTO_SRC_SEL programming may
    change the pixel clock generator source for a mapped and running OTG and
    cause it to hang.
    
    Remove the PIPE_DTO_SRC_SEL programming from this sequence since it is
    already done in program_pixel_clk(). Additionally, make sure that
    program_pixel_clk sets DTBCLK DTO as source for special HDMI cases.
    
    Cc: [email protected] # 6.11+
    Reviewed-by: Nicholas Kazlauskas <[email protected]>
    Signed-off-by: Ovidiu Bunea <[email protected]>
    Signed-off-by: Hamza Mahfooz <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: update pipe selection policy to check head pipe [+ + +]

Author: Yihan Zhu <[email protected]>
Date:   Wed Oct 30 16:20:21 2024 -0400

    drm/amd/display: update pipe selection policy to check head pipe
    
    commit 8fef253c94a5312b9150b2ff8e633b331bac7e88 upstream.
    
    [Why]
    Without a check on the head pipe during the DML to DC hw mapping,
    illegal pipe usage is allowed. This results in a wrong pipe topology
    that completely messes up the MPCC tree and then causes a display hang.
    
    [How]
    Avoid using a pipe that is a head pipe in all checks, and avoid ODM
    slices during the preferred pipe check.
    
    Cc: [email protected]
    Reviewed-by: Nicholas Kazlauskas <[email protected]>
    Signed-off-by: Yihan Zhu <[email protected]>
    Signed-off-by: Hamza Mahfooz <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3 [+ + +]

Author: Kenneth Feng <[email protected]>
Date:   Tue Nov 19 14:26:58 2024 +0800

    drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3
    
    commit b0df0e777874549c128b43f7bf4989a2ed24b37a upstream.
    
    Disable PCIe speed switching on Intel platforms for smu v14.0.2/3,
    based on Intel's requirement.
    v2: align the setting with smu v13.
    
    Signed-off-by: Kenneth Feng <[email protected]>
    Reviewed-by: Lijo Lazar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected] # 6.11.x
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/pm: Remove arcturus min power limit [+ + +]

Author: Lijo Lazar <[email protected]>
Date:   Wed Nov 20 08:34:39 2024 +0530

    drm/amd/pm: Remove arcturus min power limit
    
    commit da868898cf4c5ddbd1f7406e356edce5d7211eb5 upstream.
    
    As per the power team, there is no need to impose a lower bound on the
    arcturus power limit. Any unreasonable limit set will result in
    frequent throttling.
    
    Signed-off-by: Lijo Lazar <[email protected]>
    Reviewed-by: Kenneth Feng <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/pm: skip setting the power source on smu v14.0.2/3 [+ + +]

Author: Kenneth Feng <[email protected]>
Date:   Tue Nov 19 15:03:22 2024 +0800

    drm/amd/pm: skip setting the power source on smu v14.0.2/3
    
    commit 76c7f08094767b5df3b60e18d1bdecddd4a5c844 upstream.
    
    Skip setting the power source on smu v14.0.2/3.
    
    Signed-off-by: Kenneth Feng <[email protected]>
    Reviewed-by: Lijo Lazar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected] # 6.11.x
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/pm: update current_socclk and current_uclk in gpu_metrics on smu v13.0.7 [+ + +]

Author: Umio Yasuno <[email protected]>
Date:   Thu Nov 14 16:15:27 2024 +0900

    drm/amd/pm: update current_socclk and current_uclk in gpu_metrics on smu v13.0.7
    
    commit 2abf2f7032df4c4e7f6cf7906da59d0e614897d6 upstream.
    
    These were missed before.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3751
    Signed-off-by: Umio Yasuno <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd: Add some missing straps from NBIO 7.11.0 [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Mon Nov 18 11:46:10 2024 -0600

    drm/amd: Add some missing straps from NBIO 7.11.0
    
    commit 902fbbf429b8213232b18de0ddfd5c0f3851cb8f upstream.
    
    Earlier ASICs have strap information exported, and this is missing
    for NBIO 7.11.0.
    
    Cc: [email protected]
    Reviewed-by: Alex Deucher <[email protected]>
    Fixes: ca8c68142ad8 ("drm/amdgpu: add nbio 7.11 registers")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd: Fix initialization mistake for NBIO 7.11 devices [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Mon Nov 18 11:46:11 2024 -0600

    drm/amd: Fix initialization mistake for NBIO 7.11 devices
    
    commit 349af06a3abd0bb3787ee2daf3ac508412fe8dcc upstream.
    
    There is a strapping issue on NBIO 7.11.x that can lead to spurious PME
    events while in the D0 state.
    
    Cc: [email protected]
    Reviewed-by: Alex Deucher <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/pm: add gen5 display to the user on smu v14.0.2/3 [+ + +]

Author: Kenneth Feng <[email protected]>
Date:   Tue Nov 19 11:10:47 2024 +0800

    drm/amdgpu/pm: add gen5 display to the user on smu v14.0.2/3
    
    commit 6719ab8234ce4b0c0e9aa93aaa94961e5b2bc852 upstream.
    
    add gen5 display to the user on smu v14.0.2/3
    
    Signed-off-by: Kenneth Feng <[email protected]>
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected] # 6.11.x
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu: fix usage slab after free [+ + +]

Author: Vitaly Prosyak <[email protected]>
Date:   Mon Nov 11 17:24:08 2024 -0500

    drm/amdgpu: fix usage slab after free
    
    commit b61badd20b443eabe132314669bb51a263982e5c upstream.
    
    [  +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147
    
    [  +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1
    [  +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
    [  +0.000016] Call Trace:
    [  +0.000008]  <TASK>
    [  +0.000009]  dump_stack_lvl+0x76/0xa0
    [  +0.000017]  print_report+0xce/0x5f0
    [  +0.000017]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000019]  ? srso_return_thunk+0x5/0x5f
    [  +0.000015]  ? kasan_complete_mode_report_info+0x72/0x200
    [  +0.000016]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000019]  kasan_report+0xbe/0x110
    [  +0.000015]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000023]  __asan_report_load8_noabort+0x14/0x30
    [  +0.000014]  drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
    [  +0.000020]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? __kasan_check_write+0x14/0x30
    [  +0.000016]  ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
    [  +0.000020]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? __kasan_check_write+0x14/0x30
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? enable_work+0x124/0x220
    [  +0.000015]  ? __pfx_enable_work+0x10/0x10
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? free_large_kmalloc+0x85/0xf0
    [  +0.000016]  drm_sched_entity_destroy+0x18/0x30 [gpu_sched]
    [  +0.000020]  amdgpu_vce_sw_fini+0x55/0x170 [amdgpu]
    [  +0.000735]  ? __kasan_check_read+0x11/0x20
    [  +0.000016]  vce_v4_0_sw_fini+0x80/0x110 [amdgpu]
    [  +0.000726]  amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu]
    [  +0.000679]  ? mutex_unlock+0x80/0xe0
    [  +0.000017]  ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu]
    [  +0.000662]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? __kasan_check_write+0x14/0x30
    [  +0.000013]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? mutex_unlock+0x80/0xe0
    [  +0.000016]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
    [  +0.000663]  drm_minor_release+0xc9/0x140 [drm]
    [  +0.000081]  drm_release+0x1fd/0x390 [drm]
    [  +0.000082]  __fput+0x36c/0xad0
    [  +0.000018]  __fput_sync+0x3c/0x50
    [  +0.000014]  __x64_sys_close+0x7d/0xe0
    [  +0.000014]  x64_sys_call+0x1bc6/0x2680
    [  +0.000014]  do_syscall_64+0x70/0x130
    [  +0.000014]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? irqentry_exit_to_user_mode+0x60/0x190
    [  +0.000015]  ? srso_return_thunk+0x5/0x5f
    [  +0.000014]  ? irqentry_exit+0x43/0x50
    [  +0.000012]  ? srso_return_thunk+0x5/0x5f
    [  +0.000013]  ? exc_page_fault+0x7c/0x110
    [  +0.000015]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [  +0.000014] RIP: 0033:0x7ffff7b14f67
    [  +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
    [  +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
    [  +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
    [  +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003
    [  +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000
    [  +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
    [  +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
    [  +0.000020]  </TASK>
    
    [  +0.000016] Allocated by task 383 on cpu 7 at 26.880319s:
    [  +0.000014]  kasan_save_stack+0x28/0x60
    [  +0.000008]  kasan_save_track+0x18/0x70
    [  +0.000007]  kasan_save_alloc_info+0x38/0x60
    [  +0.000007]  __kasan_kmalloc+0xc1/0xd0
    [  +0.000007]  kmalloc_trace_noprof+0x180/0x380
    [  +0.000007]  drm_sched_init+0x411/0xec0 [gpu_sched]
    [  +0.000012]  amdgpu_device_init+0x695f/0xa610 [amdgpu]
    [  +0.000658]  amdgpu_driver_load_kms+0x1a/0x120 [amdgpu]
    [  +0.000662]  amdgpu_pci_probe+0x361/0xf30 [amdgpu]
    [  +0.000651]  local_pci_probe+0xe7/0x1b0
    [  +0.000009]  pci_device_probe+0x248/0x890
    [  +0.000008]  really_probe+0x1fd/0x950
    [  +0.000008]  __driver_probe_device+0x307/0x410
    [  +0.000007]  driver_probe_device+0x4e/0x150
    [  +0.000007]  __driver_attach+0x223/0x510
    [  +0.000006]  bus_for_each_dev+0x102/0x1a0
    [  +0.000007]  driver_attach+0x3d/0x60
    [  +0.000006]  bus_add_driver+0x2ac/0x5f0
    [  +0.000006]  driver_register+0x13d/0x490
    [  +0.000008]  __pci_register_driver+0x1ee/0x2b0
    [  +0.000007]  llc_sap_close+0xb0/0x160 [llc]
    [  +0.000009]  do_one_initcall+0x9c/0x3e0
    [  +0.000008]  do_init_module+0x241/0x760
    [  +0.000008]  load_module+0x51ac/0x6c30
    [  +0.000006]  __do_sys_init_module+0x234/0x270
    [  +0.000007]  __x64_sys_init_module+0x73/0xc0
    [  +0.000006]  x64_sys_call+0xe3/0x2680
    [  +0.000006]  do_syscall_64+0x70/0x130
    [  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    [  +0.000015] Freed by task 2147 on cpu 6 at 160.507651s:
    [  +0.000013]  kasan_save_stack+0x28/0x60
    [  +0.000007]  kasan_save_track+0x18/0x70
    [  +0.000007]  kasan_save_free_info+0x3b/0x60
    [  +0.000007]  poison_slab_object+0x115/0x1c0
    [  +0.000007]  __kasan_slab_free+0x34/0x60
    [  +0.000007]  kfree+0xfa/0x2f0
    [  +0.000007]  drm_sched_fini+0x19d/0x410 [gpu_sched]
    [  +0.000012]  amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu]
    [  +0.000662]  amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu]
    [  +0.000653]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
    [  +0.000655]  drm_minor_release+0xc9/0x140 [drm]
    [  +0.000071]  drm_release+0x1fd/0x390 [drm]
    [  +0.000071]  __fput+0x36c/0xad0
    [  +0.000008]  __fput_sync+0x3c/0x50
    [  +0.000007]  __x64_sys_close+0x7d/0xe0
    [  +0.000007]  x64_sys_call+0x1bc6/0x2680
    [  +0.000007]  do_syscall_64+0x70/0x130
    [  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    [  +0.000014] The buggy address belongs to the object at ffff8881b8605f80
                   which belongs to the cache kmalloc-64 of size 64
    [  +0.000020] The buggy address is located 8 bytes inside of
                   freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0)
    
    [  +0.000028] The buggy address belongs to the physical page:
    [  +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605
    [  +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
    [  +0.000007] page_type: 0xffffefff(slab)
    [  +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001
    [  +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000
    [  +0.000006] page dumped because: kasan: bad access detected
    
    [  +0.000012] Memory state around the buggy address:
    [  +0.000011]  ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
    [  +0.000015]  ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
    [  +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
    [  +0.000013]                       ^
    [  +0.000011]  ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
    [  +0.000014]  ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb
    [  +0.000013] ==================================================================
    
    The issue was reproduced on VG20 during the IGT pci_unplug test.
    The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill.
    In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by
    each entity within the run queue, leading to invalid memory access.
    To resolve this, the order of cleanup calls is updated:
    
        Before:
            amdgpu_fence_driver_sw_fini
            amdgpu_device_ip_fini
    
        After:
            amdgpu_device_ip_fini
            amdgpu_fence_driver_sw_fini
    
    This updated order ensures that all entities in the IPs are cleaned up first, followed by proper
    cleanup of the schedulers.
    
    Additional Investigation:
    
    During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo
    buffer must be freed only as the final step in the cleanup process to prevent any premature
    access during earlier cleanup stages.
    
    v2: Following Christian's suggestion, call drm_sched_entity_destroy before drm_sched_fini.
    
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Signed-off-by: Vitaly Prosyak <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
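
    The cleanup ordering described above can be sketched in plain C. This
    is a minimal user-space model, not the amdgpu code: struct sched,
    struct run_queue, struct entity and every helper below are
    hypothetical stand-ins for the scheduler, its run queue and an IP
    block's entity. The unsafe order is reported through a return value
    instead of actually touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins: a scheduler owns a run queue, and an entity
 * (owned by an IP block) keeps a pointer into that run queue. */
struct run_queue { int nr_entities; };
struct sched     { struct run_queue *rq; };
struct entity    { struct run_queue *rq; bool alive; };

static bool sched_init(struct sched *s)
{
    s->rq = calloc(1, sizeof(*s->rq));
    return s->rq != NULL;
}

static void entity_init(struct entity *e, struct sched *s)
{
    assert(s->rq != NULL);
    e->rq = s->rq;
    e->alive = true;
    s->rq->nr_entities++;
}

/* Entity teardown touches the run queue, so the queue must still exist.
 * Returns false where the real driver would hit a use-after-free. */
static bool entity_destroy(struct entity *e, const struct sched *s)
{
    if (s->rq == NULL)
        return false;
    e->rq->nr_entities--;
    e->rq = NULL;
    e->alive = false;
    return true;
}

static void sched_fini(struct sched *s)
{
    free(s->rq);
    s->rq = NULL;
}

/* Order after the fix: entities first, then the scheduler. */
static bool teardown_fixed(struct sched *s, struct entity *e)
{
    bool ok = entity_destroy(e, s);

    sched_fini(s);
    return ok;
}

/* Order before the fix: the run queue is gone when the entity needs it. */
static bool teardown_buggy(struct sched *s, struct entity *e)
{
    sched_fini(s);
    return entity_destroy(e, s);
}
```

    In this model teardown_fixed() succeeds, while teardown_buggy()
    reports failure at the point where the entity would have accessed the
    already freed run queue.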

drm/amdkfd: Use the correct wptr size [+ + +]

Author: Lijo Lazar <[email protected]>
Date:   Mon Nov 11 20:11:38 2024 +0530

    drm/amdkfd: Use the correct wptr size
    
    commit cdc6705f98ea3f854a60ba8c9b19228e197ae384 upstream.
    
    Write pointer could be 32-bit or 64-bit. Use the correct size during
    initialization.
    
    Signed-off-by: Lijo Lazar <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
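
    As a rough illustration of why the width matters, the sketch below
    clears a write pointer using the size that matches the queue. It is
    an assumption about what "initialization" involves, not the amdkfd
    code; wptr_init() and its is_64bit parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero a write pointer whose width depends on the
 * queue. Clearing only 32 bits of a 64-bit pointer would leave the upper
 * half stale; clearing 64 bits of a 32-bit one would clobber a neighbour. */
static void wptr_init(void *wptr, bool is_64bit)
{
    assert(wptr != NULL);
    if (is_64bit)
        memset(wptr, 0, sizeof(uint64_t));
    else
        memset(wptr, 0, sizeof(uint32_t));
}
```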

drm/bridge: it6505: Fix inverted reset polarity [+ + +]

Author: Chen-Yu Tsai <[email protected]>
Date:   Tue Oct 29 17:54:10 2024 +0800

    drm/bridge: it6505: Fix inverted reset polarity
    
    commit c5f3f21728b069412e8072b8b1d0a3d9d3ab0265 upstream.
    
    The IT6505 bridge chip has an active-low reset line. Since it is a
    "reset" and not an "enable" line, the GPIO should be asserted to
    put it in reset and deasserted to bring it out of reset during
    the power on sequence.
    
    The polarity was inverted when the driver was first introduced, likely
    because the device family that was targeted had an inverting level
    shifter on the reset line.
    
    The MT8186 Corsola devices already have the IT6505 in their device tree,
    but the whole display pipeline is actually disabled and won't be enabled
    until some remaining issues are sorted out. The other known user is
    the MT8183 Kukui / Jacuzzi family; their device trees currently do not
    have the IT6505 included.
    
    Fix the polarity in the driver while there are no actual users.
    
    Fixes: b5c84a9edcd4 ("drm/bridge: add it6505 driver")
    Cc: [email protected]
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
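
    The distinction between asserting a reset and driving a wire high can
    be sketched as follows. This is a minimal model under stated
    assumptions: struct gpio_line and the helpers are hypothetical and
    only mimic the idea of a GPIO line carrying an active-low flag; they
    are not the it6505 driver or the kernel GPIO API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a GPIO line with an active-low flag. */
struct gpio_line {
    bool active_low;   /* would come from the board description */
    int  physical;     /* level actually driven on the wire: 0 or 1 */
};

/* Logical 1 means "asserted"; the flag decides the wire level. */
static void gpio_set_logical(struct gpio_line *g, int asserted)
{
    assert(asserted == 0 || asserted == 1);
    g->physical = g->active_low ? !asserted : asserted;
}

/* A reset line is asserted to hold the chip in reset ... */
static void chip_enter_reset(struct gpio_line *reset)
{
    gpio_set_logical(reset, 1);
}

/* ... and deasserted to release it during power-on. */
static void chip_leave_reset(struct gpio_line *reset)
{
    gpio_set_logical(reset, 0);
}
```

    With active_low set, entering reset drives the wire low and leaving
    reset drives it high, which is the sequence the commit message
    describes for a reset line.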

drm/etnaviv: flush shader L1 cache after user commandstream [+ + +]

Author: Lucas Stach <[email protected]>
Date:   Fri Oct 25 17:14:46 2024 +0200

    drm/etnaviv: flush shader L1 cache after user commandstream
    
    commit 4f8dbadef085ab447a01a8d4806a3f629fea05ed upstream.
    
    The shader L1 cache is a writeback cache for shader loads/stores
    and thus must be flushed before any BOs backing the shader buffers
    are potentially freed.
    
    Cc: [email protected]
    Reviewed-by: Christian Gmeiner <[email protected]>
    Signed-off-by: Lucas Stach <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/fbdev-dma: Select FB_DEFERRED_IO [+ + +]

Author: Thomas Zimmermann <[email protected]>
Date:   Mon Oct 14 10:55:17 2024 +0200

    drm/fbdev-dma: Select FB_DEFERRED_IO
    
    commit 67c40c9b2ec5f375bf78274d4e9ef0e3b8315bea upstream.
    
    Commit 808a40b69468 ("drm/fbdev-dma: Implement damage handling and
    deferred I/O") added deferred I/O for fbdev-dma. Also select the
    Kconfig symbol FB_DEFERRED_IO (via FB_DMAMEM_HELPERS_DEFERRED). Fixes
    build errors about missing fbdefio, such as
    
    drivers/gpu/drm/drm_fbdev_dma.c:218:26: error: 'struct drm_fb_helper' has no member named 'fbdefio'
      218 |                 fb_helper->fbdefio.delay = HZ / 20;
          |                          ^~
    drivers/gpu/drm/drm_fbdev_dma.c:219:26: error: 'struct drm_fb_helper' has no member named 'fbdefio'
      219 |                 fb_helper->fbdefio.deferred_io = drm_fb_helper_deferred_io;
          |                          ^~
    drivers/gpu/drm/drm_fbdev_dma.c:221:21: error: 'struct fb_info' has no member named 'fbdefio'
      221 |                 info->fbdefio = &fb_helper->fbdefio;
          |                     ^~
    drivers/gpu/drm/drm_fbdev_dma.c:221:43: error: 'struct drm_fb_helper' has no member named 'fbdefio'
      221 |                 info->fbdefio = &fb_helper->fbdefio;
          |                                           ^~
    
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: 808a40b69468 ("drm/fbdev-dma: Implement damage handling and deferred I/O")
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Javier Martinez Canillas <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: <[email protected]> # v6.11+
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/mediatek: Fix child node refcount handling in early exit [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Fri Oct 11 21:21:51 2024 +0200

    drm/mediatek: Fix child node refcount handling in early exit
    
    commit f708e8b4cfd16e5c8cd8d7fcfcb2fb2c6ed93af3 upstream.
    
    Early exits (goto, break, return) from for_each_child_of_node() require
    an explicit call to of_node_put(), which was not added along with the
    break taken when cnt == MAX_CRTC.
    
    Add the missing of_node_put() before the break.
    
    Cc: [email protected]
    Fixes: d761b9450e31 ("drm/mediatek: Add cnt checking for coverity issue")
    
    Signed-off-by: Javier Carrasco <[email protected]>
    Reviewed-by: CK Hu <[email protected]>
    Reviewed-by: Chen-Yu Tsai <[email protected]>
    Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
    Signed-off-by: Chun-Kuang Hu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
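
    The reference leak on early exit can be reproduced in a small
    user-space model. Everything here is hypothetical: struct node,
    node_get(), node_put(), next_child() and MAX_ITEMS only imitate the
    contract of a refcounting child iterator, where the iterator holds a
    reference on the current child and drops it when advancing.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 2   /* hypothetical cap, playing the role of MAX_CRTC */

struct node { int refcount; };

static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void node_put(struct node *n)
{
    if (n) {
        assert(n->refcount > 0);
        n->refcount--;
    }
}

/* Iterator step: drop the reference on the previous child, take one on
 * the next child, and return NULL once the children are exhausted. */
static struct node *next_child(struct node *children, size_t n,
                               struct node *prev)
{
    size_t idx = prev ? (size_t)(prev - children) + 1 : 0;

    node_put(prev);
    return idx < n ? node_get(&children[idx]) : NULL;
}

/* Counts children up to MAX_ITEMS. put_on_break selects the fixed
 * behaviour (1) or the leaky one (0). */
static int count_children(struct node *children, size_t n, int put_on_break)
{
    int cnt = 0;
    struct node *child;

    for (child = next_child(children, n, NULL); child;
         child = next_child(children, n, child)) {
        if (cnt == MAX_ITEMS) {
            if (put_on_break)
                node_put(child);   /* release before leaving early */
            break;
        }
        cnt++;
    }
    return cnt;
}
```

    When the loop runs to completion the iterator releases every
    reference itself; only the early break leaves the current child
    pinned unless it is put explicitly.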

drm/panic: Fix uninitialized spinlock acquisition with CONFIG_DRM_PANIC=n [+ + +]

Author: Lyude Paul <[email protected]>
Date:   Mon Sep 16 19:00:08 2024 -0400

    drm/panic: Fix uninitialized spinlock acquisition with CONFIG_DRM_PANIC=n
    
    commit 319e53f155907cf2c6dabc16ec9dce0179bc04d1 upstream.
    
    It turns out that if you happen to have a kernel config where
    CONFIG_DRM_PANIC is disabled and spinlock debugging is enabled, along with
    KMS being enabled - we'll end up trying to acquire an uninitialized
    spin_lock with drm_panic_lock() when we try to do a commit:
    
      rvkms rvkms.0: [drm:drm_atomic_commit] committing 0000000068d2ade1
      INFO: trying to register non-static key.
      The code is fine but needs lockdep annotation, or maybe
      you didn't initialize this object before use?
      turning off the locking correctness validator.
      CPU: 4 PID: 1347 Comm: modprobe Not tainted 6.10.0-rc1Lyude-Test+ #272
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20240524-3.fc40 05/24/2024
      Call Trace:
       <TASK>
       dump_stack_lvl+0x77/0xa0
       assign_lock_key+0x114/0x120
       register_lock_class+0xa8/0x2c0
       __lock_acquire+0x7d/0x2bd0
       ? __vmap_pages_range_noflush+0x3a8/0x550
       ? drm_atomic_helper_swap_state+0x2ad/0x3a0
       lock_acquire+0xec/0x290
       ? drm_atomic_helper_swap_state+0x2ad/0x3a0
       ? lock_release+0xee/0x310
       _raw_spin_lock_irqsave+0x4e/0x70
       ? drm_atomic_helper_swap_state+0x2ad/0x3a0
       drm_atomic_helper_swap_state+0x2ad/0x3a0
       drm_atomic_helper_commit+0xb1/0x270
       drm_atomic_commit+0xaf/0xe0
       ? __pfx___drm_printfn_info+0x10/0x10
       drm_client_modeset_commit_atomic+0x1a1/0x250
       drm_client_modeset_commit_locked+0x4b/0x180
       drm_client_modeset_commit+0x27/0x50
       __drm_fb_helper_restore_fbdev_mode_unlocked+0x76/0x90
       drm_fb_helper_set_par+0x38/0x40
       fbcon_init+0x3c4/0x690
       visual_init+0xc0/0x120
       do_bind_con_driver+0x409/0x4c0
       do_take_over_console+0x233/0x280
       do_fb_registered+0x11f/0x210
       fbcon_fb_registered+0x2c/0x60
       register_framebuffer+0x248/0x2a0
       __drm_fb_helper_initial_config_and_unlock+0x58a/0x720
       drm_fbdev_generic_client_hotplug+0x6e/0xb0
       drm_client_register+0x76/0xc0
       _RNvXs_CsHeezP08sTT_5rvkmsNtB4_5RvkmsNtNtCs1cdwasc6FUb_6kernel8platform6Driver5probe+0xed2/0x1060 [rvkms]
       ? _RNvMs_NtCs1cdwasc6FUb_6kernel8platformINtB4_7AdapterNtCsHeezP08sTT_5rvkms5RvkmsE14probe_callbackBQ_+0x2b/0x70 [rvkms]
       ? acpi_dev_pm_attach+0x25/0x110
       ? platform_probe+0x6a/0xa0
       ? really_probe+0x10b/0x400
       ? __driver_probe_device+0x7c/0x140
       ? driver_probe_device+0x22/0x1b0
       ? __device_attach_driver+0x13a/0x1c0
       ? __pfx___device_attach_driver+0x10/0x10
       ? bus_for_each_drv+0x114/0x170
       ? __device_attach+0xd6/0x1b0
       ? bus_probe_device+0x9e/0x120
       ? device_add+0x288/0x4b0
       ? platform_device_add+0x75/0x230
       ? platform_device_register_full+0x141/0x180
       ? rust_helper_platform_device_register_simple+0x85/0xb0
       ? _RNvMs2_NtCs1cdwasc6FUb_6kernel8platformNtB5_6Device13create_simple+0x1d/0x60
       ? _RNvXs0_CsHeezP08sTT_5rvkmsNtB5_5RvkmsNtCs1cdwasc6FUb_6kernel6Module4init+0x11e/0x160 [rvkms]
       ? 0xffffffffc083f000
       ? init_module+0x20/0x1000 [rvkms]
       ? kernfs_xattr_get+0x3e/0x80
       ? do_one_initcall+0x148/0x3f0
       ? __lock_acquire+0x5ef/0x2bd0
       ? __lock_acquire+0x5ef/0x2bd0
       ? __lock_acquire+0x5ef/0x2bd0
       ? put_cpu_partial+0x51/0x1d0
       ? lock_acquire+0xec/0x290
       ? put_cpu_partial+0x51/0x1d0
       ? lock_release+0xee/0x310
       ? put_cpu_partial+0x51/0x1d0
       ? fs_reclaim_acquire+0x69/0xf0
       ? lock_acquire+0xec/0x290
       ? fs_reclaim_acquire+0x69/0xf0
       ? kfree+0x22f/0x340
       ? lock_release+0xee/0x310
       ? kmalloc_trace_noprof+0x48/0x340
       ? do_init_module+0x22/0x240
       ? kmalloc_trace_noprof+0x155/0x340
       ? do_init_module+0x60/0x240
       ? __se_sys_finit_module+0x2e0/0x3f0
       ? do_syscall_64+0xa4/0x180
       ? syscall_exit_to_user_mode+0x108/0x140
       ? do_syscall_64+0xb0/0x180
       ? vma_end_read+0xd0/0xe0
       ? do_user_addr_fault+0x309/0x640
       ? clear_bhb_loop+0x45/0xa0
       ? clear_bhb_loop+0x45/0xa0
       ? clear_bhb_loop+0x45/0xa0
       ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
       </TASK>
    
    Fix this by stubbing these macros out when this config option isn't
    enabled, along with fixing the unused variable warning that introduces.
    
    Signed-off-by: Lyude Paul <[email protected]>
    Reviewed-by: Daniel Vetter <[email protected]>
    Fixes: e2a1cda3e0c7 ("drm/panic: Add drm panic locking")
    Cc: <[email protected]> # v6.10+
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
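
    The stubbing approach mentioned above follows a common C pattern,
    sketched here under stated assumptions. CONFIG_DEMO_PANIC, struct
    demo_dev and the demo_panic_*() names are hypothetical and do not
    correspond to the real drm_panic macros; the lock is modelled with
    two flags rather than a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch standing in for a Kconfig option. Leave it
 * undefined to model the "=n" build. */
/* #define CONFIG_DEMO_PANIC 1 */

struct demo_lock { bool initialized; bool held; };
struct demo_dev  { struct demo_lock panic_lock; int commits; };

#ifdef CONFIG_DEMO_PANIC
static void demo_panic_init(struct demo_dev *dev)
{
    dev->panic_lock.initialized = true;
}
/* Taking a lock that was never initialized trips the assertion. */
#define demo_panic_lock(dev) \
    (assert((dev)->panic_lock.initialized), (dev)->panic_lock.held = true)
#define demo_panic_unlock(dev) \
    ((dev)->panic_lock.held = false)
#else
/* Feature disabled: nothing initializes the lock, so nothing may take
 * it. The (void) casts keep the argument "used" to avoid warnings. */
static void demo_panic_init(struct demo_dev *dev) { (void)dev; }
#define demo_panic_lock(dev)   do { (void)(dev); } while (0)
#define demo_panic_unlock(dev) do { (void)(dev); } while (0)
#endif

/* A commit path that is written once and works in both configurations. */
static void demo_commit(struct demo_dev *dev)
{
    demo_panic_lock(dev);
    dev->commits++;
    demo_panic_unlock(dev);
}
```

    With the option disabled, demo_commit() never touches the lock, so
    the uninitialized state is harmless.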

drm/sti: avoid potential dereference of error pointers [+ + +]

Author: Ma Ke <[email protected]>
Date:   Fri Sep 13 17:04:12 2024 +0800

    drm/sti: avoid potential dereference of error pointers
    
    commit 831214f77037de02afc287eae93ce97f218d8c04 upstream.
    
    The return value of drm_atomic_get_crtc_state() needs to be
    checked to avoid using the error pointer 'crtc_state' in case
    of failure.
    
    Cc: [email protected]
    Fixes: dd86dc2f9ae1 ("drm/sti: implement atomic_check for the planes")
    Signed-off-by: Ma Ke <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Alain Volmat <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
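
    The error-pointer check can be illustrated in user space. The
    ERR_PTR(), PTR_ERR() and IS_ERR() helpers below are simplified
    re-creations of the kernel helpers of the same names, and
    get_crtc_state() and plane_atomic_check() are hypothetical stand-ins
    for drm_atomic_get_crtc_state() and a plane's atomic_check hook; this
    is a sketch of the pattern, not the sti driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers:
 * small negative errno values are encoded in the top of the address
 * range, where no valid object can live. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct crtc_state { int enable; };

/* Hypothetical lookup that may fail with an encoded errno. */
static struct crtc_state *get_crtc_state(struct crtc_state *slot, int fail)
{
    assert(slot != NULL);
    return fail ? ERR_PTR(-ENOMEM) : slot;
}

static int plane_atomic_check(struct crtc_state *slot, int fail)
{
    struct crtc_state *crtc_state = get_crtc_state(slot, fail);

    /* Check before use: an error pointer must never be dereferenced. */
    if (IS_ERR(crtc_state))
        return (int)PTR_ERR(crtc_state);

    return crtc_state->enable ? 0 : -EINVAL;
}
```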

drm/sti: avoid potential dereference of error pointers in sti_gdp_atomic_check [+ + +]

Author: Ma Ke <[email protected]>
Date:   Mon Sep 9 14:33:59 2024 +0800

    drm/sti: avoid potential dereference of error pointers in sti_gdp_atomic_check
    
    commit e965e771b069421c233d674c3c8cd8c7f7245f42 upstream.
    
    The return value of drm_atomic_get_crtc_state() needs to be
    checked to avoid using the error pointer 'crtc_state' in case
    of failure.
    
    Cc: [email protected]
    Fixes: dd86dc2f9ae1 ("drm/sti: implement atomic_check for the planes")
    Signed-off-by: Ma Ke <[email protected]>
    Acked-by: Alain Volmat <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Alain Volmat <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sti: avoid potential dereference of error pointers in sti_hqvdp_atomic_check [+ + +]

Author: Ma Ke <[email protected]>
Date:   Fri Sep 13 17:09:26 2024 +0800

    drm/sti: avoid potential dereference of error pointers in sti_hqvdp_atomic_check
    
    commit c1ab40a1fdfee732c7e6ff2fb8253760293e47e8 upstream.
    
    The return value of drm_atomic_get_crtc_state() needs to be
    checked to avoid using the error pointer 'crtc_state' in case
    of failure.
    
    Cc: [email protected]
    Fixes: dd86dc2f9ae1 ("drm/sti: implement atomic_check for the planes")
    Signed-off-by: Ma Ke <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Alain Volmat <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe/guc_submit: fix race around suspend_pending [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Fri Nov 22 16:19:17 2024 +0000

    drm/xe/guc_submit: fix race around suspend_pending
    
    commit 87651f31ae4e6e6e7e6c7270b9b469405e747407 upstream.
    
    Currently in some testcases we can trigger:
    
    xe 0000:03:00.0: [drm] Assertion `exec_queue_destroyed(q)` failed!
    ....
    WARNING: CPU: 18 PID: 2640 at drivers/gpu/drm/xe/xe_guc_submit.c:1826 xe_guc_sched_done_handler+0xa54/0xef0 [xe]
    xe 0000:03:00.0: [drm] *ERROR* GT1: DEREGISTER_DONE: Unexpected engine state 0x00a1, guc_id=57
    
    Looking at a snippet of corresponding ftrace for this GuC id we can see:
    
    162.673311: xe_sched_msg_add:     dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
    162.673317: xe_sched_msg_recv:    dev=0000:03:00.0, gt=1 guc_id=57, opcode=3
    162.673319: xe_exec_queue_scheduling_disable: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
    162.674089: xe_exec_queue_kill:   dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0x29, flags=0x0
    162.674108: xe_exec_queue_close:  dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
    162.674488: xe_exec_queue_scheduling_done: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa9, flags=0x0
    162.678452: xe_exec_queue_deregister: dev=0000:03:00.0, 1:0x2, gt=1, width=1, guc_id=57, guc_state=0xa1, flags=0x0
    
    It looks like we try to suspend the queue (opcode=3), setting
    suspend_pending and triggering a disable_scheduling. The user then
    closes the queue. However, the close will also forcefully signal the
    suspend fence after killing the queue. Later, when the G2H response for
    disable_scheduling comes back, we have already cleared suspend_pending
    when signalling the suspend fence, so the disable_scheduling now
    incorrectly tries to also deregister the queue. This leads to warnings
    since the queue has yet to even be marked for destruction. We also seem
    to trigger errors later when trying to double unregister the same queue.
    
    To fix this, tweak the ordering when handling the response to ensure we
    don't race with a disable_scheduling that didn't actually intend to
    perform an unregister.  The destruction path should now also correctly
    wait for any pending_disable before marking as destroyed.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3371
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit f161809b362f027b6d72bd998e47f8f0bad60a2e)
    Signed-off-by: Thomas Hellström <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe/migrate: fix pat index usage [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Tue Nov 26 18:13:00 2024 +0000

    drm/xe/migrate: fix pat index usage
    
    commit 23346f85163de83aca6dc30dde3944131cf54706 upstream.
    
    XE_CACHE_WB must be converted into the per-platform pat index for that
    particular caching mode, otherwise we are just encoding whatever happens
    to be the value of that enum.
    
    Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: Nirmoy Das <[email protected]>
    Cc: <[email protected]> # v6.12+
    Reviewed-by: Nirmoy Das <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit f3dc9246f9c3cd5a7d8fd70cfd805bfc52214e2e)
    Signed-off-by: Thomas Hellström <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
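
    The difference between an enum value and the index it should be
    translated into can be shown with a small model. All names and
    numbers are assumptions for illustration: enum cache_level, struct
    platform, pat_index() and encode_pte() are hypothetical, and the
    table contents and PTE layout do not describe any real platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache levels. Their numeric values are arbitrary. */
enum cache_level { CACHE_NONE, CACHE_WT, CACHE_WB, CACHE_LEVELS };

/* Hypothetical per-platform table: cache level -> PAT index. */
struct platform { uint16_t pat_idx[CACHE_LEVELS]; };

/* Translate a caching mode into the platform's PAT index. */
static uint16_t pat_index(const struct platform *p, enum cache_level level)
{
    assert(level < CACHE_LEVELS);
    return p->pat_idx[level];
}

/* Hypothetical PTE encoding: the PAT index sits in the low bits. */
static uint64_t encode_pte(uint64_t addr, uint16_t pat)
{
    return (addr & ~0xfffULL) | pat;
}
```

    Passing CACHE_WB straight to encode_pte() encodes the enum's
    numeric value, which matches the intended caching mode only if the
    platform table happens to map it to the same number.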

drm/xe/migrate: use XE_BO_FLAG_PAGETABLE [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Tue Nov 26 18:13:01 2024 +0000

    drm/xe/migrate: use XE_BO_FLAG_PAGETABLE
    
    commit c78f4399188369a55eed69cbf19a8aad2a65ac75 upstream.
    
    On some HW we want to avoid the host caching PTEs, since access from GPU
    side can be incoherent. However here the special migrate object is
    mapping PTEs which are written from the host and potentially cached. Use
    XE_BO_FLAG_PAGETABLE to ensure that non-cached mapping is used, on
    platforms where this matters.
    
    Fixes: 7a060d786cc1 ("drm/xe/mtl: Map PPGTT as CPU:WC")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: Nirmoy Das <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Nirmoy Das <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit febc689b27d28973cd02f667548a5dca383d859a)
    Signed-off-by: Thomas Hellström <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs [+ + +]

Author: Jonathan Cavitt <[email protected]>
Date:   Wed Oct 23 20:07:15 2024 +0000

    drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs
    
    commit 55858fa7eb2f163f7aa34339fd3399ba4ff564c6 upstream.
    
    Several OA registers and allowlist registers were missing from the
    save/restore list for GuC and could be lost during an engine reset.  Add
    them to the list.
    
    v2:
    - Fix commit message (Umesh)
    - Add missing closes (Ashutosh)
    
    v3:
    - Add missing fixes (Ashutosh)
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2249
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Suggested-by: Umesh Nerlige Ramappa <[email protected]>
    Suggested-by: John Harrison <[email protected]>
    Signed-off-by: Jonathan Cavitt <[email protected]>
    CC: [email protected] # v6.11+
    Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
    Reviewed-by: Ashutosh Dixit <[email protected]>
    Signed-off-by: Ashutosh Dixit <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm: panel: jd9365da-h3: Remove unused num_init_cmds structure member [+ + +]

Author: Hugo Villeneuve <[email protected]>
Date:   Mon Sep 30 13:05:03 2024 -0400

    drm: panel: jd9365da-h3: Remove unused num_init_cmds structure member
    
    commit 66ae275365be4f118abe2254a0ced1d913af93f2 upstream.
    
    Now that the driver has been converted to use wrapped MIPI DCS functions,
    the num_init_cmds structure member is no longer needed, so remove it.
    
    Fixes: 35583e129995 ("drm/panel: panel-jadard-jd9365da-h3: use wrapped MIPI DCS functions")
    Cc: [email protected]
    Signed-off-by: Hugo Villeneuve <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Reviewed-by: Jessica Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm: xlnx: zynqmp_dpsub: fix hotplug detection [+ + +]

Author: Steffen Dirkwinkel <[email protected]>
Date:   Mon Oct 28 14:42:17 2024 +0100

    drm: xlnx: zynqmp_dpsub: fix hotplug detection
    
    commit 71ba1c9b1c717831920c3d432404ee5a707e04b4 upstream.
    
    drm_kms_helper_poll_init needs to be called after zynqmp_dpsub_kms_init.
    zynqmp_dpsub_kms_init creates the connector and without it we don't
    enable hotplug detection.
    
    Fixes: eb2d64bfcc17 ("drm: xlnx: zynqmp_dpsub: Report HPD through the bridge")
    Cc: [email protected]
    Signed-off-by: Steffen Dirkwinkel <[email protected]>
    Signed-off-by: Tomi Valkeinen <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: net: fec: add pps channel property [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Fri Oct 4 17:24:17 2024 +0200

    dt-bindings: net: fec: add pps channel property
    
    commit 1aa772be0444a2bd06957f6d31865e80e6ae4244 upstream.
    
    Add fsl,pps-channel property to select where to connect the PPS signal.
    This depends on the internal SoC routing and on the board, for example
    on the i.MX8 SoC it can be connected to an external pin (using channel 1)
    or to internal eDMA as DMA request (channel 0).
    
    Signed-off-by: Francesco Dolcini <[email protected]>
    Acked-by: Conor Dooley <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Csókás, Bence <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

efi/libstub: Free correct pointer on failure [+ + +]

Author: Ard Biesheuvel <[email protected]>
Date:   Sun Oct 13 15:19:04 2024 +0200

    efi/libstub: Free correct pointer on failure
    
    commit 06d39d79cbd5a91a33707951ebf2512d0e759847 upstream.
    
    cmdline_ptr is an out parameter, which is not allocated by the function
    itself, and likely points into the caller's stack.
    
    cmdline refers to the pool allocation that should be freed when cleaning
    up after a failure, so pass this instead to free_pool().
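
    A user-space sketch of the ownership rule described above, assuming
    malloc()/free() in place of the EFI pool allocator. The names
    setup_cmdline(), pool_free() and pool_frees are hypothetical and exist
    only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Counts frees so the sketch can be checked; purely illustrative. */
static int pool_frees;

static void pool_free(void *p)
{
	pool_frees++;
	free(p);
}

/*
 * Hypothetical setup routine: allocates a command line and, on success,
 * hands it back through the out parameter. Returns 0 or -1.
 */
static int setup_cmdline(char **cmdline_ptr, int fail_late)
{
	char *cmdline = malloc(16);

	if (!cmdline)
		return -1;
	strcpy(cmdline, "console=ttyS0");

	if (fail_late) {
		/* Free the allocation itself, never the caller-owned out pointer. */
		pool_free(cmdline);
		return -1;
	}

	*cmdline_ptr = cmdline;
	return 0;
}
```

    The out parameter is written only on success, so the failure path has
    exactly one thing to release: the local allocation.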
    
    Fixes: 42c8ea3dca09 ("efi: libstub: Factor out EFI stub entrypoint ...")
    Cc: <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

f2fs: fix to drop all discards after creating snapshot on lvm device [+ + +]

Author: Chao Yu <[email protected]>
Date:   Thu Nov 21 22:17:16 2024 +0800

    f2fs: fix to drop all discards after creating snapshot on lvm device
    
    commit bc8aeb04fd80cb8cfae3058445c84410fd0beb5e upstream.
    
    Piergiorgio reported a bug in bugzilla as below:
    
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 969 at fs/f2fs/segment.c:1330
    RIP: 0010:__submit_discard_cmd+0x27d/0x400 [f2fs]
    Call Trace:
     __issue_discard_cmd+0x1ca/0x350 [f2fs]
     issue_discard_thread+0x191/0x480 [f2fs]
     kthread+0xcf/0x100
     ret_from_fork+0x31/0x50
     ret_from_fork_asm+0x1a/0x30
    
    w/ below testcase, it can reproduce this bug quickly:
    - pvcreate /dev/vdb
    - vgcreate myvg1 /dev/vdb
    - lvcreate -L 1024m -n mylv1 myvg1
    - mount /dev/myvg1/mylv1 /mnt/f2fs
    - dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
    - sync
    - rm /mnt/f2fs/file
    - sync
    - lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
    - umount /mnt/f2fs
    
    The root cause is: creating a snapshot of a mounted lvm device updates
    that device's discard_max_bytes to zero. __submit_discard_cmd() then
    passes a zero @nr_sects to __blkdev_issue_discard(), which returns a
    NULL bio pointer, resulting in a panic.
    
    This patch changes as below for fixing:
    1. Drop all remaining discards in f2fs_unfreeze() if a snapshot of the
    lvm device has been created.
    2. Check discard_max_bytes before submitting a discard in
    __submit_discard_cmd().
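
    The second point can be sketched as a guard on the per-request limit.
    This is a simplified illustration rather than the f2fs code:
    count_discard_chunks() is a hypothetical helper, and the -1 return value
    is an assumption of the sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: split a discard of `total` sectors into chunks of at
 * most `max_sectors`. Returns the number of chunks, or -1 if the device
 * currently accepts no discards (max_sectors == 0).
 */
static int count_discard_chunks(uint64_t total, uint64_t max_sectors)
{
	if (max_sectors == 0)
		return -1;	/* nothing may be submitted; avoids a zero-sized request */

	return (int)((total + max_sectors - 1) / max_sectors);
}
```

    Checking the limit first means no request is ever built with a zero
    sector count, which is the condition that produced the NULL bio.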
    
    Cc: [email protected]
    Fixes: 35ec7d574884 ("f2fs: split discard command in prior to block layer")
    Reported-by: Piergiorgio Sartor <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219484
    Signed-off-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero [+ + +]

Author: Jiri Olsa <[email protected]>
Date:   Fri Nov 22 00:11:18 2024 +0100

    fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero
    
    commit 088f294609d8f8816dc316681aef2eb61982e0da upstream.
    
    If iov_iter_zero succeeds after failed copy_from_kernel_nofault,
    we need to reset the ret value to zero otherwise it will be returned
    as final return value of read_kcore_iter.
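
    The error-then-fallback pattern can be sketched in user space as below.
    try_copy() and read_chunk() are hypothetical stand-ins and not the
    kernel functions named above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fallible copy: fails (returns -1) when src is NULL. */
static int try_copy(char *dst, const char *src, size_t n)
{
	if (!src)
		return -1;
	memcpy(dst, src, n);
	return 0;
}

/* Read n bytes; an unreadable source is reported to the reader as zeroes. */
static int read_chunk(char *dst, const char *src, size_t n)
{
	int ret = try_copy(dst, src, n);

	if (ret) {
		memset(dst, 0, n);	/* the fallback succeeded ... */
		ret = 0;		/* ... so the stale error must be cleared */
	}
	return ret;
}
```

    Without the reset, the caller would see a failure even though the buffer
    was filled successfully by the fallback.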
    
    This fixes objdump -d dump over /proc/kcore for me.
    
    Cc: [email protected]
    Cc: Alexander Gordeev <[email protected]>
    Fixes: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
    Signed-off-by: Jiri Olsa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Alexander Gordeev <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ftrace: Fix regression with module command in stack_trace_filter [+ + +]

Author: guoweikang <[email protected]>
Date:   Wed Nov 20 13:27:49 2024 +0800

    ftrace: Fix regression with module command in stack_trace_filter
    
    commit 45af52e7d3b8560f21d139b3759735eead8b1653 upstream.
    
    When executing the following command:
    
        # echo "write*:mod:ext3" > /sys/kernel/tracing/stack_trace_filter
    
    The current mod command causes a null pointer dereference. While commit
    0f17976568b3f ("ftrace: Fix regression with module command in stack_trace_filter")
    has addressed part of the issue, it left a corner case unhandled, which still
    results in a kernel crash.
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 04ec7bb642b77 ("tracing: Have the trace_array hold the list of registered func probes")
    Signed-off-by: guoweikang <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i3c: master: Fix miss free init_dyn_addr at i3c_master_put_i3c_addrs() [+ + +]

Author: Frank Li <[email protected]>
Date:   Tue Oct 1 12:26:08 2024 -0400

    i3c: master: Fix miss free init_dyn_addr at i3c_master_put_i3c_addrs()
    
    commit 3082990592f7c6d7510a9133afa46e31bbe26533 upstream.
    
    if (dev->boardinfo && dev->boardinfo->init_dyn_addr)
                                          ^^^ here check "init_dyn_addr"
            i3c_bus_set_addr_slot_status(&master->bus, dev->info.dyn_addr, ...)
                                                                 ^^^^
                                                            free "dyn_addr"
    Fix copy/paste error "dyn_addr" by replacing it with "init_dyn_addr".
    
    Cc: [email protected]
    Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i3c: master: svc: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 30 17:19:13 2024 +0800

    i3c: master: svc: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 18599e93e4e814ce146186026c6abf83c14d5798 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So call pm_runtime_disable() first to fix it.
    
    Cc: [email protected] # v5.17
    Fixes: 05be23ef78f7 ("i3c: master: svc: add runtime pm support")
    Reviewed-by: Frank Li <[email protected]>
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i3c: master: svc: fix possible assignment of the same address to two devices [+ + +]

Author: Frank Li <[email protected]>
Date:   Wed Oct 2 10:50:38 2024 -0400

    i3c: master: svc: fix possible assignment of the same address to two devices
    
    commit 3b2ac810d86eb96e882db80a3320a3848b133208 upstream.
    
    svc_i3c_master_do_daa() {
        ...
        for (i = 0; i < dev_nb; i++) {
            ret = i3c_master_add_i3c_dev_locked(m, addrs[i]);
            if (ret)
                goto rpm_out;
        }
    }
    
    If two devices (A and B) are detected in DAA and address 0xa is assigned to
    device A and 0xb to device B, a failure in i3c_master_add_i3c_dev_locked()
    for device A (addr: 0xa) could prevent device B (addr: 0xb) from being
    registered on the bus. The I3C stack might still consider 0xb a free
    address. If a subsequent Hotjoin occurs, 0xb might be assigned to Device A,
    causing both devices A and B to use the same address 0xb, violating the I3C
    specification.
    
    The return value for i3c_master_add_i3c_dev_locked() should not be checked
    because subsequent steps will scan the entire I3C bus, independent of
    whether i3c_master_add_i3c_dev_locked() returns success.
    
    If device A registration fails, there is still a chance to register device
    B. i3c_master_add_i3c_dev_locked() can reset DAA if a failure occurs while
    retrieving device information.
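
    The difference between aborting on the first failure and attempting
    every assigned address can be sketched as follows. All names here are
    hypothetical, and fail_on_0xa() is only a demo hook standing in for a
    registration call that fails for one device.

```c
/* Hypothetical registration hook: returns 0 on success, negative on failure. */
typedef int (*add_dev_fn)(unsigned char addr);

/* Demo hook for the sketch: registration fails for address 0x0a only. */
static int fail_on_0xa(unsigned char addr)
{
	return addr == 0x0a ? -1 : 0;
}

/* Old shape: the first failure aborts, so later addresses are never tried. */
static int register_all_stop_on_error(const unsigned char *addrs, int n,
				      add_dev_fn add)
{
	int i, done = 0;

	for (i = 0; i < n; i++) {
		if (add(addrs[i]))
			break;
		done++;
	}
	return done;
}

/* New shape: every assigned address is attempted regardless of failures. */
static int register_all(const unsigned char *addrs, int n, add_dev_fn add)
{
	int i, done = 0;

	for (i = 0; i < n; i++)
		if (add(addrs[i]) == 0)
			done++;
	return done;
}
```

    With addresses {0x0a, 0x0b}, the old shape registers nothing, while the
    new shape still registers the device at 0x0b.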
    
    Cc: [email protected]
    Fixes: 317bacf960a4 ("i3c: master: add enable(disable) hot join in sys entry")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i3c: master: svc: Modify enabled_events bit 7:0 to act as IBI enable counter [+ + +]

Author: Frank Li <[email protected]>
Date:   Fri Nov 1 12:50:02 2024 -0400

    i3c: master: svc: Modify enabled_events bit 7:0 to act as IBI enable counter
    
    commit 25bc99be5fe53853053ceeaa328068c49dc1e799 upstream.
    
    Fix issue where disabling IBI on one device disables the entire IBI
    interrupt. Modify bit 7:0 of enabled_events to serve as an IBI enable
    counter, ensuring that the system IBI interrupt is disabled only when all
    I3C devices have IBI disabled.
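
    A minimal sketch of using the low byte of a flags word as an enable
    counter. The mask name and both helpers are hypothetical, and overflow
    past 255 enables is deliberately not handled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define IBI_COUNT_MASK 0xffu	/* assumption: bits 7:0 hold the counter */

/* Returns true when the shared interrupt should be switched on (0 -> 1). */
static bool ibi_enable(uint32_t *enabled_events)
{
	bool first = (*enabled_events & IBI_COUNT_MASK) == 0;

	(*enabled_events)++;
	return first;
}

/* Returns true when the shared interrupt should be switched off (1 -> 0). */
static bool ibi_disable(uint32_t *enabled_events)
{
	if ((*enabled_events & IBI_COUNT_MASK) == 0)
		return false;	/* unbalanced disable: nothing to do */

	(*enabled_events)--;
	return (*enabled_events & IBI_COUNT_MASK) == 0;
}
```

    Bits above the counter are left untouched, so other event flags stored
    in the same word keep their meaning.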
    
    Cc: [email protected]
    Fixes: 7ff730ca458e ("i3c: master: svc: enable the interrupt in the enable ibi function")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: accel: kx022a: Fix raw read format [+ + +]

Author: Matti Vaittinen <[email protected]>
Date:   Wed Oct 30 15:16:11 2024 +0200

    iio: accel: kx022a: Fix raw read format
    
    commit b7d2bc99b3bdc03fff9b416dd830632346d83530 upstream.
    
    The KX022A provides the accelerometer data in two subsequent registers.
    The registers are laid out so that the value obtained via bulk-read of
    these registers can be interpreted as signed 16-bit little endian value.
    The read value is converted to CPU endianness and stored into a 32-bit integer.
    The le16_to_cpu() casts value to unsigned 16-bit value, and when this is
    assigned to 32-bit integer the resulting value will always be positive.
    
    This has not been a problem to users (at least not all users) of the sysfs
    interface, who know the data format based on the scan info and who have
    converted the read value back to 16-bit signed value. This isn't
    compliant with the ABI however.
    
    This, however, will be a problem for those who use the in-kernel
    interfaces, especially the iio_read_channel_processed_scale().
    
    The iio_read_channel_processed_scale() performs multiplications to the
    returned (always positive) raw value, which will cause strange results
    when the data from the sensor has been negative.
    
    Fix the read_raw format by casting the result of le16_to_cpu() to a
    signed 16-bit value before assigning it to the integer. This makes
    negative readings correctly reported as negative.
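
    The sign-extension issue can be reproduced in plain C. The helper names
    below are hypothetical; the conversion of an out-of-range value to
    int16_t is implementation-defined before C23, but it wraps as expected
    on the two's-complement targets the kernel supports.

```c
#include <stdint.h>

/* Assemble a little-endian 16-bit sample from two consecutive register bytes. */
static uint16_t sample_le16(const uint8_t raw[2])
{
	return (uint16_t)(raw[0] | ((uint16_t)raw[1] << 8));
}

/* Buggy: the unsigned value is zero-extended, so the result is never negative. */
static int to_int_buggy(const uint8_t raw[2])
{
	return sample_le16(raw);
}

/* Fixed: reinterpret as signed 16-bit first, so the sign bit is extended. */
static int to_int_fixed(const uint8_t raw[2])
{
	return (int16_t)sample_le16(raw);
}
```

    For the bytes 0xfe 0xff, the buggy conversion yields 65534 while the
    fixed one yields -2.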
    
    This fix will be visible to users by changing values returned via sysfs
    to appear in correct (negative) format.
    
    Reported-by: Kalle Niemi <[email protected]>
    Fixes: 7c1d1677b322 ("iio: accel: Support Kionix/ROHM KX022A accelerometer")
    Signed-off-by: Matti Vaittinen <[email protected]>
    Tested-by: Kalle Niemi <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/ZyIxm_zamZfIGrnB@mva-rohm
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: adc: ad7923: Fix buffer overflow for tx_buf and ring_xfer [+ + +]

Author: Nuno Sa <[email protected]>
Date:   Tue Oct 29 13:46:37 2024 +0000

    iio: adc: ad7923: Fix buffer overflow for tx_buf and ring_xfer
    
    commit 3a4187ec454e19903fd15f6e1825a4b84e59a4cd upstream.
    
    The AD7923 was updated to support devices with 8 channels, but the size
    of tx_buf and ring_xfer was not increased accordingly, leading to a
    potential buffer overflow in ad7923_update_scan_mode().
    
    Fixes: 851644a60d20 ("iio: adc: ad7923: Add support for the ad7908/ad7918/ad7928")
    Cc: [email protected]
    Signed-off-by: Nuno Sa <[email protected]>
    Signed-off-by: Zicheng Qu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: Fix fwnode_handle in __fwnode_iio_channel_get_by_name() [+ + +]

Author: Zicheng Qu <[email protected]>
Date:   Sat Nov 2 09:25:25 2024 +0000

    iio: Fix fwnode_handle in __fwnode_iio_channel_get_by_name()
    
    commit 3993ca4add248f0f853f54f9273a7de850639f33 upstream.
    
    fwnode_iio_channel_get_by_name() iterates over parent nodes with
    fwnode_for_each_parent_node() to acquire IIO channels. However, the
    lookup that assigns chan was mistakenly performed on the original node
    instead of the current parent node. This patch corrects the logic to
    ensure that __fwnode_iio_channel_get_by_name() is called with the
    correct parent node.
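
    The parent walk can be sketched with a simple linked structure. struct
    node, lookup() and find_channel() are hypothetical and only mirror the
    shape of the loop, not the IIO or fwnode APIs.

```c
#include <stddef.h>
#include <string.h>

struct node {
	struct node *parent;
	const char *channel;	/* channel name provided at this node, or NULL */
};

/* Returns n if it provides the named channel, otherwise NULL. */
static const struct node *lookup(const struct node *n, const char *name)
{
	return (n->channel && strcmp(n->channel, name) == 0) ? n : NULL;
}

/* Search the node itself, then each ancestor in turn. */
static const struct node *find_channel(const struct node *start,
					const char *name)
{
	const struct node *n;

	for (n = start; n; n = n->parent) {
		/* The bug pattern would pass `start` here on every iteration. */
		const struct node *hit = lookup(n, name);

		if (hit)
			return hit;
	}
	return NULL;
}
```

    Passing the loop variable rather than the starting node is what lets a
    channel declared on an ancestor be found at all.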
    
    Cc: [email protected] # v6.6+
    Fixes: 1e64b9c5f9a0 ("iio: inkern: move to fwnode properties")
    Signed-off-by: Zicheng Qu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: gts: fix infinite loop for gain_to_scaletables() [+ + +]

Author: Zicheng Qu <[email protected]>
Date:   Thu Oct 31 01:46:26 2024 +0000

    iio: gts: fix infinite loop for gain_to_scaletables()
    
    commit 7452f8a0814bb73f739ee0dab60f099f3361b151 upstream.
    
    In iio_gts_build_avail_time_table(), it is checked that gts->num_itime is
    non-zero, but gts->num_itime is not checked in gain_to_scaletables(). The
    variable time_idx is initialized as gts->num_itime - 1. This implies that
    time_idx might initially be set to -1 (0 - 1 = -1). Consequently, using
    while (time_idx--) could lead to an infinite loop.
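
    The loop shape and one possible guard can be sketched as below.
    walk_down() is a hypothetical function, and returning -1 for an empty
    table is an assumption of this sketch rather than the helper's actual
    error handling.

```c
/*
 * Hypothetical walk over num_itime entries from the top down, mirroring the
 * "time_idx = num - 1; while (time_idx--)" shape. Returns how many loop
 * iterations ran, or -1 if the table is empty and the loop must not start.
 */
static int walk_down(int num_itime)
{
	int time_idx, iterations = 0;

	if (num_itime <= 0)
		return -1;	/* without this, time_idx would start at -1 */

	time_idx = num_itime - 1;
	while (time_idx--)
		iterations++;

	return iterations;
}
```

    Starting from -1, the post-decrement test never sees zero on the way
    down, which is why the unguarded loop does not terminate normally.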
    
    Cc: [email protected] # v6.6+
    Fixes: 38416c28e168 ("iio: light: Add gain-time-scale helpers")
    Signed-off-by: Zicheng Qu <[email protected]>
    Reviewed-by: Matti Vaittinen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: invensense: fix multiple odr switch when FIFO is off [+ + +]

Author: Jean-Baptiste Maneyrol <[email protected]>
Date:   Mon Oct 21 10:38:42 2024 +0200

    iio: invensense: fix multiple odr switch when FIFO is off
    
    commit ef5f5e7b6f73f79538892a8be3a3bee2342acc9f upstream.
    
    When multiple ODR switches happen while the FIFO is off, the change may
    not be taken into account if you return to the previous FIFO-on value.
    For example, if you run the sensor buffer at 50Hz, stop, change to
    200Hz, then back to 50Hz and restart the buffer, data will be timestamped
    at 200Hz. This is due to testing against mult and not new_mult.
    
    To prevent this, let's just run apply_odr automatically when FIFO is
    off. It will also simplify driver code.
    
    Update inv_mpu6050 and inv_icm42600 to delete now useless apply_odr.
    
    Fixes: 95444b9eeb8c ("iio: invensense: fix odr switching to same value")
    Cc: [email protected]
    Signed-off-by: Jean-Baptiste Maneyrol <[email protected]>
    Link: https://patch.msgid.link/20241021-invn-inv-sensors-timestamp-fix-switch-fifo-off-v2-1-39ffd43edcc4@tdk.com
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/arm-smmu: Defer probe of clients after smmu device bound [+ + +]

Author: Pratyush Brahma <[email protected]>
Date:   Fri Oct 4 14:34:28 2024 +0530

    iommu/arm-smmu: Defer probe of clients after smmu device bound
    
    commit 229e6ee43d2a160a1592b83aad620d6027084aad upstream.
    
    Null pointer dereference occurs due to a race between smmu
    driver probe and client driver probe, when of_dma_configure()
    for client is called after the iommu_device_register() for smmu driver
    probe has executed but before the driver_bound() for smmu driver
    has been called.
    
    Following is how the race occurs:
    
    T1:Smmu device probe            T2: Client device probe
    
    really_probe()
    arm_smmu_device_probe()
    iommu_device_register()
                                            really_probe()
                                            platform_dma_configure()
                                            of_dma_configure()
                                            of_dma_configure_id()
                                            of_iommu_configure()
                                            iommu_probe_device()
                                            iommu_init_device()
                                            arm_smmu_probe_device()
                                            arm_smmu_get_by_fwnode()
                                                    driver_find_device_by_fwnode()
                                                    driver_find_device()
                                                    next_device()
                                                    klist_next()
                                                        /* null ptr
                                                           assigned to smmu */
                                            /* null ptr dereference
                                               while smmu->streamid_mask */
    driver_bound()
            klist_add_tail()
    
    When this null smmu pointer is dereferenced later in
    arm_smmu_probe_device, the device crashes.
    
    Fix this by deferring the probe of the client device
    until the smmu device has bound to the arm smmu driver.
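
    The deferral can be sketched as a NULL check that asks for a later
    retry instead of dereferencing. This is a user-space illustration:
    struct smmu and client_probe() are hypothetical, and EPROBE_DEFER is
    defined locally with the value the kernel uses internally.

```c
#include <stddef.h>

#define EPROBE_DEFER 517	/* kernel-internal errno, defined here for the sketch */

struct smmu {
	unsigned int streamid_mask;
};

/*
 * Hypothetical client probe step: `smmu` is the result of a lookup that
 * yields NULL until the provider has finished binding to its driver.
 */
static int client_probe(const struct smmu *smmu, unsigned int *mask_out)
{
	if (!smmu)
		return -EPROBE_DEFER;	/* provider not bound yet: retry later */

	*mask_out = smmu->streamid_mask;
	return 0;
}
```

    Returning a deferral code closes the window shown in the diagram above,
    since the client simply probes again once the provider is bound.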
    
    Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support")
    Cc: [email protected]
    Co-developed-by: Prakash Gupta <[email protected]>
    Signed-off-by: Prakash Gupta <[email protected]>
    Signed-off-by: Pratyush Brahma <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [will: Add comment]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/io-pgtable-arm: Fix stage-2 map/unmap for concatenated tables [+ + +]

Author: Mostafa Saleh <[email protected]>
Date:   Thu Oct 24 16:25:15 2024 +0000

    iommu/io-pgtable-arm: Fix stage-2 map/unmap for concatenated tables
    
    commit d71fa842d33c48ac2809ae11d2379b5a788792cb upstream.
    
    ARM_LPAE_LVL_IDX() takes into account concatenated PGDs and can return
    an index spanning multiple page-table pages given a sufficiently large
    input address. However, when the resulting index is used to calculate
    the number of remaining entries in the page, the possibility of
    concatenation is ignored and we end up computing a negative upper bound:
    
            max_entries = ARM_LPAE_PTES_PER_TABLE(data) - map_idx_start;
    
    On the map path, this results in a negative 'mapped' value being
    returned but on the unmap path we can leak child tables if they are
    skipped in __arm_lpae_free_pgtable().
    
    Introduce an arm_lpae_max_entries() helper to convert a table index into
    the remaining number of entries within a single page-table page.
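
    The arithmetic can be sketched with plain integers. Both function names
    are hypothetical, and ptes_per_table is assumed to be a power of two
    (for example 512 with 4K pages and 8-byte entries).

```c
/* The naive form from above: goes negative once i >= ptes_per_table. */
static int max_entries_naive(int i, int ptes_per_table)
{
	return ptes_per_table - i;
}

/*
 * Remaining entries in the page-table page that contains index i.
 * Masking reduces i to its offset within a single page first.
 */
static int max_entries_in_page(int i, int ptes_per_table)
{
	return ptes_per_table - (i & (ptes_per_table - 1));
}
```

    For index 600 with 512 entries per page, the naive form gives -88 while
    the masked form gives 424, the entries left in the second page.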
    
    Cc: <[email protected]>
    Signed-off-by: Mostafa Saleh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [will: Tweaked comment and commit message]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/tegra241-cmdqv: Fix unused variable warning [+ + +]

Author: Will Deacon <[email protected]>
Date:   Tue Oct 29 15:58:24 2024 +0000

    iommu/tegra241-cmdqv: Fix unused variable warning
    
    commit 5492f0c4085a8fb8820ff974f17b83a7d6dab5a5 upstream.
    
    While testing some io-pgtable changes, I ran into a compiler warning
    from the Tegra CMDQ driver:
    
      drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c:803:23: warning: unused variable 'cmdqv_debugfs_dir' [-Wunused-variable]
        803 | static struct dentry *cmdqv_debugfs_dir;
            |                       ^~~~~~~~~~~~~~~~~
      1 warning generated.
    
    Guard the variable declaration with CONFIG_IOMMU_DEBUGFS to silence the
    warning.
    
    Signed-off-by: Will Deacon <[email protected]>
    Cc: Jiri Slaby <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

kunit: Fix potential null dereference in kunit_device_driver_test() [+ + +]

Author: Zichen Xie <[email protected]>
Date:   Thu Nov 14 23:43:36 2024 -0600

    kunit: Fix potential null dereference in kunit_device_driver_test()
    
    commit 435c20eed572a95709b1536ff78832836b2f91b1 upstream.
    
    kunit_kzalloc() may return a NULL pointer; dereferencing it without a
    NULL check may lead to a NULL dereference.
    Add a NULL check for test_state.
    
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: d03c720e03bd ("kunit: Add APIs for managing devices")
    Signed-off-by: Zichen Xie <[email protected]>
    Cc: [email protected]
    Reviewed-by: David Gow <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

kunit: string-stream: Fix a UAF bug in kunit_init_suite() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Tue Nov 12 16:03:14 2024 +0800

    kunit: string-stream: Fix a UAF bug in kunit_init_suite()
    
    commit 39e21403c978862846fa68b7f6d06f9cca235194 upstream.
    
    In kunit_debugfs_create_suite(), if alloc_string_stream() fails in the
    kunit_suite_for_each_test_case() loop, "suite->log = stream" has
    already been assigned, and the error path only frees the suite->log
    stream memory without setting the pointer to NULL, so the later
    string_stream_clear() of suite->log in kunit_init_suite() will cause
    the UAF bug below.
    
    Set stream pointer to NULL after free to fix it.
    
            Unable to handle kernel paging request at virtual address 006440150000030d
            Mem abort info:
              ESR = 0x0000000096000004
              EC = 0x25: DABT (current EL), IL = 32 bits
              SET = 0, FnV = 0
              EA = 0, S1PTW = 0
              FSC = 0x04: level 0 translation fault
            Data abort info:
              ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
              CM = 0, WnR = 0, TnD = 0, TagAccess = 0
              GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
            [006440150000030d] address between user and kernel address ranges
            Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
            Dumping ftrace buffer:
               (ftrace buffer empty)
            Modules linked in: iio_test_gts industrialio_gts_helper cfg80211 rfkill ipv6 [last unloaded: iio_test_gts]
            CPU: 5 UID: 0 PID: 6253 Comm: modprobe Tainted: G    B   W        N 6.12.0-rc4+ #458
            Tainted: [B]=BAD_PAGE, [W]=WARN, [N]=TEST
            Hardware name: linux,dummy-virt (DT)
            pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
            pc : string_stream_clear+0x54/0x1ac
            lr : string_stream_clear+0x1a8/0x1ac
            sp : ffffffc080b47410
            x29: ffffffc080b47410 x28: 006440550000030d x27: ffffff80c96b5e98
            x26: ffffff80c96b5e80 x25: ffffffe461b3f6c0 x24: 0000000000000003
            x23: ffffff80c96b5e88 x22: 1ffffff019cdf4fc x21: dfffffc000000000
            x20: ffffff80ce6fa7e0 x19: 032202a80000186d x18: 0000000000001840
            x17: 0000000000000000 x16: 0000000000000000 x15: ffffffe45c355cb4
            x14: ffffffe45c35589c x13: ffffffe45c03da78 x12: ffffffb810168e75
            x11: 1ffffff810168e74 x10: ffffffb810168e74 x9 : dfffffc000000000
            x8 : 0000000000000004 x7 : 0000000000000003 x6 : 0000000000000001
            x5 : ffffffc080b473a0 x4 : 0000000000000000 x3 : 0000000000000000
            x2 : 0000000000000001 x1 : ffffffe462fbf620 x0 : dfffffc000000000
            Call trace:
             string_stream_clear+0x54/0x1ac
             __kunit_test_suites_init+0x108/0x1d8
             kunit_exec_run_tests+0xb8/0x100
             kunit_module_notify+0x400/0x55c
             notifier_call_chain+0xfc/0x3b4
             blocking_notifier_call_chain+0x68/0x9c
             do_init_module+0x24c/0x5c8
             load_module+0x4acc/0x4e90
             init_module_from_file+0xd4/0x128
             idempotent_init_module+0x2d4/0x57c
             __arm64_sys_finit_module+0xac/0x100
             invoke_syscall+0x6c/0x258
             el0_svc_common.constprop.0+0x160/0x22c
             do_el0_svc+0x44/0x5c
             el0_svc+0x48/0xb8
             el0t_64_sync_handler+0x13c/0x158
             el0t_64_sync+0x190/0x194
            Code: f9400753 d2dff800 f2fbffe0 d343fe7c (38e06b80)
            ---[ end trace 0000000000000000 ]---
            Kernel panic - not syncing: Oops: Fatal exception
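
    The "set to NULL after free" pattern can be sketched in user space as
    below. struct suite and both helpers are hypothetical and use
    malloc()/free() in place of the kunit string-stream API.

```c
#include <stdlib.h>

struct suite {
	char *log;	/* hypothetical stand-in for a suite's log stream */
};

/* Error path: release the log and clear the pointer so later users see "no log". */
static void suite_drop_log(struct suite *s)
{
	free(s->log);
	s->log = NULL;
}

/* Later cleanup: safe only because a dropped log is NULL, not dangling. */
static int suite_clear_log(struct suite *s)
{
	if (!s->log)
		return 0;	/* nothing to clear */

	s->log[0] = '\0';
	return 1;
}
```

    If the pointer were left dangling, the later clear would write through
    freed memory, which is the class of fault shown in the trace above.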
    
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Fixes: a3fdf784780c ("kunit: string-stream: Decouple string_stream from kunit")
    Suggested-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Kuan-Wei Chiu <[email protected]>
    Reviewed-by: David Gow <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

leds: flash: mt6360: Fix device_for_each_child_node() refcounting in error paths [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Fri Sep 27 01:20:52 2024 +0200

    leds: flash: mt6360: Fix device_for_each_child_node() refcounting in error paths
    
    commit 73b03b27736e440e3009fe1319cbc82d2cd1290c upstream.
    
    The device_for_each_child_node() macro requires explicit calls to
    fwnode_handle_put() upon early exits to avoid memory leaks, and in
    this case the error paths are handled after jumping to
    'out_flash_release', which misses the required call to
    decrement the refcount of the child node.
    
    A more elegant and robust solution is using the scoped variant of the
    loop, which automatically handles such early exits.
    
    Fix the child node refcounting in the error paths by using
    device_for_each_child_node_scoped().
    
    Cc: [email protected]
    Fixes: 679f8652064b ("leds: Add mt6360 driver")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://lore.kernel.org/r/20240927-leds_device_for_each_child_node_scoped-v1-1-95c0614b38c8@gmail.com
    Signed-off-by: Lee Jones <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

leds: lp55xx: Remove redundant test for invalid channel number [+ + +]

Author: Michal Vokáč <[email protected]>
Date:   Thu Oct 17 17:08:12 2024 +0200

    leds: lp55xx: Remove redundant test for invalid channel number
    
    commit 09b1ef9813a0742674f7efe26104403ca94a1b4a upstream.
    
    Since commit 92a81562e695 ("leds: lp55xx: Add multicolor framework
    support to lp55xx") there are two subsequent tests if the chan_nr
    (reg property) is in valid range. One in the lp55xx_init_led()
    function and one in the lp55xx_parse_common_child() function that
    was added with the mentioned commit.
    
    There are two issues with that.
    
    First is in the lp55xx_parse_common_child() function where the reg
    property is tested right after it is read from the device tree.
    Test for the upper range is not correct though. Valid reg values are
    0 to (max_channel - 1) so it should be >=.
    
    Second issue is that in case the parsed value is out of the range
    the probe just fails and no error message is shown as the code never
    reaches the second test that prints an error message.
    
    Remove the test from the lp55xx_parse_common_child() function completely
    and keep the one in the lp55xx_init_led() function to deal with it.
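    
    The off-by-one in the upper bound can be illustrated with a small
    sketch. The function names are hypothetical and the lower-bound check
    is an assumption; only the ">" versus ">=" comparison comes from the
    description above.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    /* Upper bound as described for the removed test: max_channel slips through. */
    bool old_test_rejects(int chan_nr, int max_channel)
    {
        return chan_nr < 0 || chan_nr > max_channel;
    }
    
    /* Bound matching the valid range 0 .. max_channel - 1. */
    bool correct_test_rejects(int chan_nr, int max_channel)
    {
        assert(max_channel > 0);
        return chan_nr < 0 || chan_nr >= max_channel;
    }
    ```
    
    With max_channel set to 9, the value 9 passes the first test but is
    rejected by the second.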
    
    Fixes: 92a81562e695 ("leds: lp55xx: Add multicolor framework support to lp55xx")
    Cc: [email protected]
    Signed-off-by: Michal Vokáč <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Linux: Linux 6.12.4 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Mon Dec 9 10:41:16 2024 +0100

    Linux 6.12.4
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Ronald Warsow <[email protected]>
    Tested-by: Luna Jernberg <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Salvatore Bonaccorso <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: kernelci.org bot <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

maple_tree: refine mas_store_root() on storing NULL [+ + +]

Author: Wei Yang <[email protected]>
Date:   Thu Oct 31 23:16:26 2024 +0000

    maple_tree: refine mas_store_root() on storing NULL
    
    commit 0ea120b278ad7f7cfeeb606e150ad04b192df60b upstream.
    
    Currently, when storing NULL on mas_store_root(), the behavior could be
    improved.
    
    Storing NULLs over the entire tree may result in a node being used to
    store a single range.  Further stores of NULL may cause the node and
    tree to be corrupt and cause incorrect behaviour.  Fixing the store to
    the root null fixes the issue by ensuring that a range of 0 - ULONG_MAX
    results in an empty tree.
    
    Users of the tree may experience incorrect values returned if the tree
    was expanded to store values, then overwritten by all NULLS, then
    continued to store NULLs over the empty area.
    
    For example possible cases are:
    
      * storing NULL at any range results in a new node
      * storing NULL at range [m, n] where m > 0 to a single entry tree
        results in a new node with range [m, n] set to NULL
      * storing NULL at range [m, n] where m > 0 to an empty tree results
        in consecutive NULL slots
      * it allows for multiple NULL entries by expanding root
        to store NULLs to an empty tree
    
    This patch tries to improve in:
    
      * memory efficiency, by setting to empty tree instead of using a node
      * removing the possibility of consecutive NULL slots, which would
        prohibit extended null in later operations
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Wei Yang <[email protected]>
    Reviewed-by: Liam R. Howlett <[email protected]>
    Cc: Liam R. Howlett <[email protected]>
    Cc: Sidhartha Kumar <[email protected]>
    Cc: Lorenzo Stoakes <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

md/md-bitmap: Add missing destroy_work_on_stack() [+ + +]

Author: Yuan Can <[email protected]>
Date:   Tue Nov 5 21:01:05 2024 +0800

    md/md-bitmap: Add missing destroy_work_on_stack()
    
    commit 6012169e8aae9c0eda38bbedcd7a1540a81220ae upstream.
    
    This commit adds the missing destroy_work_on_stack() call for
    unplug_work.work in bitmap_unplug_async().
    
    Fixes: a022325ab970 ("md/md-bitmap: add a new helper to unplug bitmap asynchrously")
    Cc: [email protected]
    Signed-off-by: Yuan Can <[email protected]>
    Reviewed-by: Yu Kuai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Song Liu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

md/raid5: Wait sync io to finish before changing group cnt [+ + +]

Author: Xiao Ni <[email protected]>
Date:   Wed Nov 6 17:51:24 2024 +0800

    md/raid5: Wait sync io to finish before changing group cnt
    
    commit fa1944bbe6220eb929e2c02e5e8706b908565711 upstream.
    
    One customer reports a bug: raid5 is hung when changing thread cnt
    while resync is running. The stripes are all in conf->handle_list
    and new threads can't handle them.
    
    Commit b39f35ebe86d ("md: don't quiesce in mddev_suspend()") removes
    pers->quiesce from mddev_suspend/resume. Before that commit, mddev_suspend
    needed to wait for all ios including sync io to finish. Now it only
    waits for normal io.
    
    Fix this by calling raid5_quiesce from raid5_store_group_thread_cnt
    directly to wait for all sync requests to finish before changing the
    group cnt.
    
    Fixes: b39f35ebe86d ("md: don't quiesce in mddev_suspend()")
    Cc: [email protected]
    Signed-off-by: Xiao Ni <[email protected]>
    Reviewed-by: Yu Kuai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Song Liu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: amphion: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Fri Nov 1 17:40:49 2024 +0800

    media: amphion: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 316e74500d1c6589cba28cebe2864a0bceeb2396 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
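    
    The ordering rule can be modelled with a toy state machine. None of
    the names below are kernel APIs: struct toy_dev and the toy_*()
    helpers are hypothetical and only mimic the behaviour described
    above, where forcing the status fails with -EAGAIN while runtime PM
    is still enabled.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>
    
    /* Toy model of a device's runtime-PM bookkeeping (not the kernel structs). */
    struct toy_dev {
        bool rpm_enabled;
        bool suspended;
    };
    
    /* The status can only be forced while runtime PM is disabled. */
    int toy_set_suspended(struct toy_dev *d)
    {
        assert(d != NULL);
        if (d->rpm_enabled)
            return -EAGAIN;
        d->suspended = true;
        return 0;
    }
    
    void toy_disable(struct toy_dev *d)
    {
        assert(d != NULL);
        d->rpm_enabled = false;
    }
    
    /* Error-path teardown in the corrected order: disable, then set status. */
    int toy_teardown(struct toy_dev *d)
    {
        toy_disable(d);
        return toy_set_suspended(d);
    }
    ```
    
    Calling toy_set_suspended() on an enabled device leaves the status
    unchanged, whereas toy_teardown() succeeds because it disables first.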
    
    Cc: [email protected]
    Fixes: b50a64fc54af ("media: amphion: add amphion vpu device driver")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: amphion: Set video drvdata before register video device [+ + +]

Author: Ming Qian <[email protected]>
Date:   Fri Sep 13 15:21:45 2024 +0900

    media: amphion: Set video drvdata before register video device
    
    commit 8cbb1a7bd5973b57898b26eb804fe44af440bb63 upstream.
    
    The video drvdata should be set before the video device is registered,
    otherwise video_drvdata() may return NULL in the open() file ops, leading
    to an oops.
    
    Fixes: 3cd084519c6f ("media: amphion: add vpu v4l2 m2m support")
    Cc: <[email protected]>
    Signed-off-by: Ming Qian <[email protected]>
    Reviewed-by: TaoJiang <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: gspca: ov534-ov772x: Fix off-by-one error in set_frame_rate() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Oct 28 16:02:56 2024 +0800

    media: gspca: ov534-ov772x: Fix off-by-one error in set_frame_rate()
    
    commit d2842dec577900031826dc44e9bf0c66416d7173 upstream.
    
    In set_frame_rate(), select a rate in rate_0 or rate_1 by checking
    sd->frame_rate >= r->fps in a loop, but the loop condition terminates when
    the index reaches zero, which fails to check the last element in rate_0 or
    rate_1.
    
    Check for >= 0 so that the last one in rate_0 or rate_1 is also checked.
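    
    The difference between the two loop conditions can be shown with a
    small counting sketch. The pre-decrement countdown form is inferred
    from the description above, and entries_checked() is a hypothetical
    helper, not driver code.
    
    ```c
    #include <assert.h>
    
    /*
     * Count how many table entries a countdown loop examines for a table
     * of n entries. inclusive == 0 models "--i > 0"; inclusive != 0 models
     * "--i >= 0".
     */
    int entries_checked(int n, int inclusive)
    {
        int i = n;
        int checked = 0;
    
        assert(n >= 0);
        while (inclusive ? --i >= 0 : --i > 0)
            checked++;    /* one loop body run == one entry compared */
        return checked;
    }
    ```
    
    For a five-entry table the "> 0" form runs the comparison four times,
    while the ">= 0" form runs it five times and so reaches the last entry.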
    
    Fixes: 189d92af707e ("V4L/DVB (13422): gspca - ov534: ov772x changes from Richard Kaswy.")
    Cc: [email protected]
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: i2c: dw9768: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Fri Nov 1 17:40:48 2024 +0800

    media: i2c: dw9768: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit d6594d50761728d09f23238cf9c368bab6260ef3 upstream.
    
    It is not valid to call pm_runtime_set_suspended() and
    pm_runtime_set_active() for devices with runtime PM enabled because it
    returns -EAGAIN if it is enabled already and working. So, adjust the
    order to fix it.
    
    Cc: [email protected]
    Fixes: 5f9a089b6de3 ("dw9768: Enable low-power probe on ACPI")
    Suggested-by: Sakari Ailus <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: i2c: tc358743: Fix crash in the probe error path when using polling [+ + +]

Author: Alexander Shiyan <[email protected]>
Date:   Wed Oct 9 09:05:44 2024 +0300

    media: i2c: tc358743: Fix crash in the probe error path when using polling
    
    commit 869f38ae07f7df829da4951c3d1f7a2be09c2e9a upstream.
    
    If an error occurs in the probe() function, we should remove the polling
    timer that was armed earlier, otherwise the timer is called with
    arguments that are already freed, which results in a crash.
    
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 0 at kernel/time/timer.c:1830 __run_timers+0x244/0x268
    Modules linked in:
    CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Not tainted 6.11.0 #226
    Hardware name: Diasom DS-RK3568-SOM-EVB (DT)
    pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : __run_timers+0x244/0x268
    lr : __run_timers+0x1d4/0x268
    sp : ffffff80eff2baf0
    x29: ffffff80eff2bb50 x28: 7fffffffffffffff x27: ffffff80eff2bb00
    x26: ffffffc080f669c0 x25: ffffff80efef6bf0 x24: ffffff80eff2bb00
    x23: 0000000000000000 x22: dead000000000122 x21: 0000000000000000
    x20: ffffff80efef6b80 x19: ffffff80041c8bf8 x18: ffffffffffffffff
    x17: ffffffc06f146000 x16: ffffff80eff27dc0 x15: 000000000000003e
    x14: 0000000000000000 x13: 00000000000054da x12: 0000000000000000
    x11: 00000000000639c0 x10: 000000000000000c x9 : 0000000000000009
    x8 : ffffff80eff2cb40 x7 : ffffff80eff2cb40 x6 : ffffff8002bee480
    x5 : ffffffc080cb2220 x4 : ffffffc080cb2150 x3 : 00000000000f4240
    x2 : 0000000000000102 x1 : ffffff80eff2bb00 x0 : ffffff80041c8bf0
    Call trace:
     __run_timers+0x244/0x268
     timer_expire_remote+0x50/0x68
     tmigr_handle_remote+0x388/0x39c
     run_timer_softirq+0x38/0x44
     handle_softirqs+0x138/0x298
     __do_softirq+0x14/0x20
     ____do_softirq+0x10/0x1c
     call_on_irq_stack+0x24/0x4c
     do_softirq_own_stack+0x1c/0x2c
     irq_exit_rcu+0x9c/0xcc
     el1_interrupt+0x48/0xc0
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x7c/0x80
     default_idle_call+0x34/0x68
     do_idle+0x23c/0x294
     cpu_startup_entry+0x38/0x3c
     secondary_start_kernel+0x128/0x160
     __secondary_switched+0xb8/0xbc
    ---[ end trace 0000000000000000 ]---
    
    Fixes: 4e66a52a2e4c ("[media] tc358743: Add support for platforms without IRQ line")
    Signed-off-by: Alexander Shiyan <[email protected]>
    Cc: [email protected]
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: imx-jpeg: Ensure power suppliers be suspended before detach them [+ + +]

Author: Ming Qian <[email protected]>
Date:   Fri Sep 13 15:22:54 2024 +0900

    media: imx-jpeg: Ensure power suppliers be suspended before detach them
    
    commit fd0af4cd35da0eb550ef682b71cda70a4e36f6b9 upstream.
    
    The power suppliers are always requested to suspend asynchronously, and
    dev_pm_domain_detach() requires the caller to ensure proper
    synchronization of this function with power management callbacks.
    Otherwise the detach may lead to a kernel panic, like below:
    
    [ 1457.107934] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040
    [ 1457.116777] Mem abort info:
    [ 1457.119589]   ESR = 0x0000000096000004
    [ 1457.123358]   EC = 0x25: DABT (current EL), IL = 32 bits
    [ 1457.128692]   SET = 0, FnV = 0
    [ 1457.131764]   EA = 0, S1PTW = 0
    [ 1457.134920]   FSC = 0x04: level 0 translation fault
    [ 1457.139812] Data abort info:
    [ 1457.142707]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    [ 1457.148196]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [ 1457.153256]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [ 1457.158563] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001138b6000
    [ 1457.165000] [0000000000000040] pgd=0000000000000000, p4d=0000000000000000
    [ 1457.171792] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [ 1457.178045] Modules linked in: v4l2_jpeg wave6_vpu_ctrl(-) [last unloaded: mxc_jpeg_encdec]
    [ 1457.186383] CPU: 0 PID: 51938 Comm: kworker/0:3 Not tainted 6.6.36-gd23d64eea511 #66
    [ 1457.194112] Hardware name: NXP i.MX95 19X19 board (DT)
    [ 1457.199236] Workqueue: pm pm_runtime_work
    [ 1457.203247] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [ 1457.210188] pc : genpd_runtime_suspend+0x20/0x290
    [ 1457.214886] lr : __rpm_callback+0x48/0x1d8
    [ 1457.218968] sp : ffff80008250bc50
    [ 1457.222270] x29: ffff80008250bc50 x28: 0000000000000000 x27: 0000000000000000
    [ 1457.229394] x26: 0000000000000000 x25: 0000000000000008 x24: 00000000000f4240
    [ 1457.236518] x23: 0000000000000000 x22: ffff00008590f0e4 x21: 0000000000000008
    [ 1457.243642] x20: ffff80008099c434 x19: ffff00008590f000 x18: ffffffffffffffff
    [ 1457.250766] x17: 5300326563697665 x16: 645f676e696c6f6f x15: 63343a6d726f6674
    [ 1457.257890] x14: 0000000000000004 x13: 00000000000003a4 x12: 0000000000000002
    [ 1457.265014] x11: 0000000000000000 x10: 0000000000000a60 x9 : ffff80008250bbb0
    [ 1457.272138] x8 : ffff000092937200 x7 : ffff0003fdf6af80 x6 : 0000000000000000
    [ 1457.279262] x5 : 00000000410fd050 x4 : 0000000000200000 x3 : 0000000000000000
    [ 1457.286386] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00008590f000
    [ 1457.293510] Call trace:
    [ 1457.295946]  genpd_runtime_suspend+0x20/0x290
    [ 1457.300296]  __rpm_callback+0x48/0x1d8
    [ 1457.304038]  rpm_callback+0x6c/0x78
    [ 1457.307515]  rpm_suspend+0x10c/0x570
    [ 1457.311077]  pm_runtime_work+0xc4/0xc8
    [ 1457.314813]  process_one_work+0x138/0x248
    [ 1457.318816]  worker_thread+0x320/0x438
    [ 1457.322552]  kthread+0x110/0x114
    [ 1457.325767]  ret_from_fork+0x10/0x20
    
    Fixes: 2db16c6ed72c ("media: imx-jpeg: Add V4L2 driver for i.MX8 JPEG Encoder/Decoder")
    Cc: <[email protected]>
    Signed-off-by: Ming Qian <[email protected]>
    Reviewed-by: TaoJiang <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: imx-jpeg: Set video drvdata before register video device [+ + +]

Author: Ming Qian <[email protected]>
Date:   Fri Sep 13 15:21:44 2024 +0900

    media: imx-jpeg: Set video drvdata before register video device
    
    commit d2b7ecc26bd5406d5ba927be1748aa99c568696c upstream.
    
    The video drvdata should be set before the video device is registered,
    otherwise video_drvdata() may return NULL in the open() file ops, leading
    to an oops.
    
    Fixes: 2db16c6ed72c ("media: imx-jpeg: Add V4L2 driver for i.MX8 JPEG Encoder/Decoder")
    Cc: <[email protected]>
    Signed-off-by: Ming Qian <[email protected]>
    Reviewed-by: TaoJiang <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: mtk-jpeg: Fix null-ptr-deref during unload module [+ + +]

Author: Guoqing Jiang <[email protected]>
Date:   Thu Sep 12 10:48:01 2024 +0800

    media: mtk-jpeg: Fix null-ptr-deref during unload module
    
    commit 17af2b39daf12870cac61ffc360e62bc35798afb upstream.
    
    The workqueue should be destroyed in mtk_jpeg_core.c since commit
    09aea13ecf6f ("media: mtk-jpeg: refactor some variables"), otherwise
    the below calltrace can be easily triggered.
    
    [  677.862514] Unable to handle kernel paging request at virtual address dfff800000000023
    [  677.863633] KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
    ...
    [  677.879654] CPU: 6 PID: 1071 Comm: modprobe Tainted: G           O       6.8.12-mtk+gfa1a78e5d24b+ #17
    ...
    [  677.882838] pc : destroy_workqueue+0x3c/0x770
    [  677.883413] lr : mtk_jpegdec_destroy_workqueue+0x70/0x88 [mtk_jpeg_dec_hw]
    [  677.884314] sp : ffff80008ad974f0
    [  677.884744] x29: ffff80008ad974f0 x28: ffff0000d7115580 x27: ffff0000dd691070
    [  677.885669] x26: ffff0000dd691408 x25: ffff8000844af3e0 x24: ffff80008ad97690
    [  677.886592] x23: ffff0000e051d400 x22: ffff0000dd691010 x21: dfff800000000000
    [  677.887515] x20: 0000000000000000 x19: 0000000000000000 x18: ffff800085397ac0
    [  677.888438] x17: 0000000000000000 x16: ffff8000801b87c8 x15: 1ffff000115b2e10
    [  677.889361] x14: 00000000f1f1f1f1 x13: 0000000000000000 x12: ffff7000115b2e4d
    [  677.890285] x11: 1ffff000115b2e4c x10: ffff7000115b2e4c x9 : ffff80000aa43e90
    [  677.891208] x8 : 00008fffeea4d1b4 x7 : ffff80008ad97267 x6 : 0000000000000001
    [  677.892131] x5 : ffff80008ad97260 x4 : ffff7000115b2e4d x3 : 0000000000000000
    [  677.893054] x2 : 0000000000000023 x1 : dfff800000000000 x0 : 0000000000000118
    [  677.893977] Call trace:
    [  677.894297]  destroy_workqueue+0x3c/0x770
    [  677.894826]  mtk_jpegdec_destroy_workqueue+0x70/0x88 [mtk_jpeg_dec_hw]
    [  677.895677]  devm_action_release+0x50/0x90
    [  677.896211]  release_nodes+0xe8/0x170
    [  677.896688]  devres_release_all+0xf8/0x178
    [  677.897219]  device_unbind_cleanup+0x24/0x170
    [  677.897785]  device_release_driver_internal+0x35c/0x480
    [  677.898461]  device_release_driver+0x20/0x38
    ...
    [  677.912665] ---[ end trace 0000000000000000 ]---
    
    Fixes: 09aea13ecf6f ("media: mtk-jpeg: refactor some variables")
    Cc: <[email protected]>
    Signed-off-by: Guoqing Jiang <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: ov08x40: Fix burst write sequence [+ + +]

Author: Bryan O'Donoghue <[email protected]>
Date:   Thu Oct 10 13:33:17 2024 +0100

    media: ov08x40: Fix burst write sequence
    
    commit d0fef6de4f1b957e35a05a5ba4aab2a2576d6686 upstream.
    
    It is necessary to account for I2C quirks in the burst mode path of this
    driver. Not all I2C controllers can accept arbitrarily long writes and this
    is represented in the quirks field of the adapter structure.
    
    Prior to this patch the following error message is seen on a Qualcomm
    X1E80100 CRD.
    
    [   38.773524] i2c i2c-2: adapter quirk: msg too long (addr 0x0036, size 290, write)
    [   38.781454] ov08x40 2-0036: Failed regs transferred: -95
    [   38.787076] ov08x40 2-0036: ov08x40_start_streaming failed to set regs
    
    Fix the error by breaking up the write sequence into the advertised maximum
    write size of the quirks field if the quirks field is populated.
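    
    The splitting step can be sketched as follows. This is a userspace
    illustration under assumptions: split_burst(), the hdr parameter (the
    bytes each transfer spends on the register address) and the example
    sizes are hypothetical, and max_write_len == 0 is used here to model
    an adapter that advertises no limit.
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /*
     * Split len payload bytes into transfers that each carry hdr header
     * bytes and do not exceed max_write_len bytes in total. Payload sizes
     * are written to out; at most out_cap entries are produced. Returns
     * the number of transfers, or 0 if the limit leaves no room for
     * payload.
     */
    size_t split_burst(size_t len, size_t hdr, size_t max_write_len,
                       size_t *out, size_t out_cap)
    {
        size_t room;
        size_t n = 0;
    
        assert(out != NULL);
        if (max_write_len == 0)
            room = len;                     /* no quirk: single transfer */
        else if (max_write_len > hdr)
            room = max_write_len - hdr;     /* payload bytes per transfer */
        else
            return 0;                       /* limit too small */
    
        while (len > 0 && n < out_cap) {
            size_t chunk = len < room ? len : room;
    
            out[n++] = chunk;
            len -= chunk;
        }
        return n;
    }
    ```
    
    With a 288-byte payload, a 2-byte header and an assumed limit of 256
    bytes, the sketch yields two transfers carrying 254 and 34 payload
    bytes.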
    
    Fixes: 8f667d202384 ("media: ov08x40: Reduce start streaming time")
    Cc: [email protected] # v6.9+
    Tested-by: Bryan O'Donoghue <[email protected]> # x1e80100-crd
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: platform: allegro-dvt: Fix possible memory leak in allocate_buffers_internal() [+ + +]

Author: Gaosheng Cui <[email protected]>
Date:   Wed Oct 9 16:28:02 2024 +0800

    media: platform: allegro-dvt: Fix possible memory leak in allocate_buffers_internal()
    
    commit 0f514068fbc5d4d189c817adc7c4e32cffdc2e47 upstream.
    
    The buffer in the loop should be released under the exception path,
    otherwise there may be a memory leak here.
    
    To mitigate this, free the buffer when allegro_alloc_buffer fails.
    
    Fixes: f20387dfd065 ("media: allegro: add Allegro DVT video IP core driver")
    Cc: <[email protected]>
    Signed-off-by: Gaosheng Cui <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: platform: exynos4-is: Fix an OF node reference leak in fimc_md_is_isp_available [+ + +]

Author: Joe Hattori <[email protected]>
Date:   Mon Nov 4 19:01:19 2024 +0900

    media: platform: exynos4-is: Fix an OF node reference leak in fimc_md_is_isp_available
    
    commit 8964eb23408243ae0016d1f8473c76f64ff25d20 upstream.
    
    In fimc_md_is_isp_available(), of_get_child_by_name() is called to check
    if FIMC-IS is available. Current code does not decrement the refcount of
    the returned device node, which causes an OF node reference leak. Fix it
    by calling of_node_put() at the end of the variable scope.
    
    Signed-off-by: Joe Hattori <[email protected]>
    Fixes: e781bbe3fecf ("[media] exynos4-is: Add fimc-is subdevs registration")
    Cc: [email protected]
    Reviewed-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    [hverkuil: added CC to stable]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: platform: rga: fix 32-bit DMA limitation [+ + +]

Author: John Keeping <[email protected]>
Date:   Mon Aug 12 15:35:55 2024 +0100

    media: platform: rga: fix 32-bit DMA limitation
    
    commit 953c03d8cb41d08fe6994f5d94c4393ac9da2f13 upstream.
    
    The destination buffer flags are assigned twice but source is not set in
    what looks like a copy+paste mistake.  Assign the source queue flags so
    the 32-bit DMA limitation is handled consistently.
    
    Fixes: ec9ef8dda2a2 ("media: rockchip: rga: set dma mask to 32 bits")
    Cc: <[email protected]>
    Signed-off-by: John Keeping <[email protected]>
    Reviewed-by: Michael Tretter <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: qcom: camss: fix error path on configuration of power domains [+ + +]

Author: Vladimir Zapolskiy <[email protected]>
Date:   Wed Aug 14 00:03:42 2024 +0300

    media: qcom: camss: fix error path on configuration of power domains
    
    commit 4f45d65b781499d2a79eca12155532739c876aa2 upstream.
    
    There is a chance to meet runtime issues during configuration of CAMSS
    power domains, because on the error path dev_pm_domain_detach() is
    unexpectedly called with NULL or error pointer.
    
    One of the simplest ways to reproduce the problem is to probe CAMSS
    driver before registration of CAMSS power domains, for instance if
    a platform CAMCC driver is simply not built.
    
    Warning backtrace example:
    
        Unable to handle kernel NULL pointer dereference at virtual address 00000000000001a2
    
        <snip>
    
        pc : dev_pm_domain_detach+0x8/0x48
        lr : camss_probe+0x374/0x9c0
    
        <snip>
    
        Call trace:
         dev_pm_domain_detach+0x8/0x48
         platform_probe+0x70/0xf0
         really_probe+0xc4/0x2a8
         __driver_probe_device+0x80/0x140
         driver_probe_device+0x48/0x170
         __device_attach_driver+0xc0/0x148
         bus_for_each_drv+0x88/0xf0
         __device_attach+0xb0/0x1c0
         device_initial_probe+0x1c/0x30
         bus_probe_device+0xb4/0xc0
         deferred_probe_work_func+0x90/0xd0
         process_one_work+0x164/0x3e0
         worker_thread+0x310/0x420
         kthread+0x120/0x130
         ret_from_fork+0x10/0x20
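    
    One general way to keep a NULL or error pointer away from a detach
    call is to test the handle first. The sketch below is a userspace
    imitation of the kernel's error-pointer encoding with hypothetical
    toy_*() names; it is not necessarily how the upstream patch
    restructures the error path.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    
    #define TOY_MAX_ERRNO 4095
    
    /* Encode a negative errno value as a pointer, as error pointers do. */
    static inline void *toy_err_ptr(long err)
    {
        return (void *)(intptr_t)err;
    }
    
    /* True for NULL and for pointers in the top TOY_MAX_ERRNO addresses. */
    static inline bool toy_is_err_or_null(const void *p)
    {
        return !p || (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
    }
    
    /* Hypothetical power-domain handle. */
    struct toy_domain {
        int attached;
    };
    
    /* Detach only when the handle is a real object; report whether it did. */
    bool toy_detach_if_valid(struct toy_domain *pd)
    {
        if (toy_is_err_or_null(pd))
            return false;
        pd->attached = 0;
        return true;
    }
    ```
    
    Both a NULL handle and an encoded error such as -19 are skipped, and
    only a real handle is detached.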
    
    Fixes: 23aa4f0cd327 ("media: qcom: camss: Move VFE power-domain specifics into vfe.c")
    Cc: <[email protected]>
    Signed-off-by: Vladimir Zapolskiy <[email protected]>
    Reviewed-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: ts2020: fix null-ptr-deref in ts2020_probe() [+ + +]

Author: Li Zetao <[email protected]>
Date:   Thu Oct 10 23:41:13 2024 +0800

    media: ts2020: fix null-ptr-deref in ts2020_probe()
    
    commit 4a058b34b52ed3feb1f3ff6fd26aefeeeed20cba upstream.
    
    KASAN reported a null-ptr-deref issue when executing the following
    command:
    
      # echo ts2020 0x20 > /sys/bus/i2c/devices/i2c-0/new_device
        KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
        CPU: 53 UID: 0 PID: 970 Comm: systemd-udevd Not tainted 6.12.0-rc2+ #24
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
        RIP: 0010:ts2020_probe+0xad/0xe10 [ts2020]
        RSP: 0018:ffffc9000abbf598 EFLAGS: 00010202
        RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffffc0714809
        RDX: 0000000000000002 RSI: ffff88811550be00 RDI: 0000000000000010
        RBP: ffff888109868800 R08: 0000000000000001 R09: fffff52001577eb6
        R10: 0000000000000000 R11: ffffc9000abbff50 R12: ffffffffc0714790
        R13: 1ffff92001577eb8 R14: ffffffffc07190d0 R15: 0000000000000001
        FS:  00007f95f13b98c0(0000) GS:ffff888149280000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000555d2634b000 CR3: 0000000152236000 CR4: 00000000000006f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         <TASK>
         ts2020_probe+0xad/0xe10 [ts2020]
         i2c_device_probe+0x421/0xb40
         really_probe+0x266/0x850
        ...
    
    The cause of the problem is that when using sysfs to dynamically register
    an i2c device, there is no platform data, but the probe process of ts2020
    needs to use platform data, resulting in a null pointer being accessed.
    
    Solve this problem by adding checks to platform data.
    
    Fixes: dc245a5f9b51 ("[media] ts2020: implement I2C client bindings")
    Cc: <[email protected]>
    Signed-off-by: Li Zetao <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: uvcvideo: Require entities to have a non-zero unique ID [+ + +]

Author: Thadeu Lima de Souza Cascardo <[email protected]>
Date:   Fri Sep 13 15:06:01 2024 -0300

    media: uvcvideo: Require entities to have a non-zero unique ID
    
    commit 3dd075fe8ebbc6fcbf998f81a75b8c4b159a6195 upstream.
    
    Per UVC 1.1+ specification 3.7.2, units and terminals must have a non-zero
    unique ID.
    
    ```
    Each Unit and Terminal within the video function is assigned a unique
    identification number, the Unit ID (UID) or Terminal ID (TID), contained in
    the bUnitID or bTerminalID field of the descriptor. The value 0x00 is
    reserved for undefined ID,
    ```
    
    So, deny allocating an entity with ID 0 or an ID that belongs to a unit
    that is already added to the list of entities.
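    
    The rule can be sketched as a small ID registry. struct toy_registry
    and toy_add_entity() are hypothetical, and the error codes are chosen
    for illustration only; the driver's own data structures differ.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    
    #define TOY_MAX_ENTITIES 16
    
    /* Hypothetical registry of entity IDs already added to the device. */
    struct toy_registry {
        unsigned char ids[TOY_MAX_ENTITIES];
        size_t count;
    };
    
    /* Add an entity ID; reject 0 (reserved) and IDs already present. */
    int toy_add_entity(struct toy_registry *r, unsigned char id)
    {
        assert(r != NULL);
        if (id == 0)
            return -EINVAL;
        for (size_t i = 0; i < r->count; i++) {
            if (r->ids[i] == id)
                return -EINVAL;
        }
        if (r->count == TOY_MAX_ENTITIES)
            return -ENOMEM;
        r->ids[r->count++] = id;
        return 0;
    }
    ```
    
    Adding ID 1 twice fails the second time, so an Output Unit and an
    Input Unit can no longer share the same ID.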
    
    This also prevents some syzkaller reproducers from triggering warnings due
    to a chain of entities referring to themselves. In one particular case, an
    Output Unit is connected to an Input Unit, both with the same ID of 1. But
    when looking up the source ID of the Output Unit, that same entity is
    found instead of the input entity, which leads to such warnings.
    
    In another case, a backward chain was considered finished as the source ID
    was 0. Later on, that entity was found, but its pads were not valid.
    
    Here is a sample stack trace for one of those cases.
    
    [   20.650953] usb 1-1: new high-speed USB device number 2 using dummy_hcd
    [   20.830206] usb 1-1: Using ep0 maxpacket: 8
    [   20.833501] usb 1-1: config 0 descriptor??
    [   21.038518] usb 1-1: string descriptor 0 read error: -71
    [   21.038893] usb 1-1: Found UVC 0.00 device <unnamed> (2833:0201)
    [   21.039299] uvcvideo 1-1:0.0: Entity type for entity Output 1 was not initialized!
    [   21.041583] uvcvideo 1-1:0.0: Entity type for entity Input 1 was not initialized!
    [   21.042218] ------------[ cut here ]------------
    [   21.042536] WARNING: CPU: 0 PID: 9 at drivers/media/mc/mc-entity.c:1147 media_create_pad_link+0x2c4/0x2e0
    [   21.043195] Modules linked in:
    [   21.043535] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.11.0-rc7-00030-g3480e43aeccf #444
    [   21.044101] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
    [   21.044639] Workqueue: usb_hub_wq hub_event
    [   21.045100] RIP: 0010:media_create_pad_link+0x2c4/0x2e0
    [   21.045508] Code: fe e8 20 01 00 00 b8 f4 ff ff ff 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 0f 0b eb e9 0f 0b eb 0a 0f 0b eb 06 <0f> 0b eb 02 0f 0b b8 ea ff ff ff eb d4 66 2e 0f 1f 84 00 00 00 00
    [   21.046801] RSP: 0018:ffffc9000004b318 EFLAGS: 00010246
    [   21.047227] RAX: ffff888004e5d458 RBX: 0000000000000000 RCX: ffffffff818fccf1
    [   21.047719] RDX: 000000000000007b RSI: 0000000000000000 RDI: ffff888004313290
    [   21.048241] RBP: ffff888004313290 R08: 0001ffffffffffff R09: 0000000000000000
    [   21.048701] R10: 0000000000000013 R11: 0001888004313290 R12: 0000000000000003
    [   21.049138] R13: ffff888004313080 R14: ffff888004313080 R15: 0000000000000000
    [   21.049648] FS:  0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
    [   21.050271] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   21.050688] CR2: 0000592cc27635b0 CR3: 000000000431c000 CR4: 0000000000750ef0
    [   21.051136] PKRU: 55555554
    [   21.051331] Call Trace:
    [   21.051480]  <TASK>
    [   21.051611]  ? __warn+0xc4/0x210
    [   21.051861]  ? media_create_pad_link+0x2c4/0x2e0
    [   21.052252]  ? report_bug+0x11b/0x1a0
    [   21.052540]  ? trace_hardirqs_on+0x31/0x40
    [   21.052901]  ? handle_bug+0x3d/0x70
    [   21.053197]  ? exc_invalid_op+0x1a/0x50
    [   21.053511]  ? asm_exc_invalid_op+0x1a/0x20
    [   21.053924]  ? media_create_pad_link+0x91/0x2e0
    [   21.054364]  ? media_create_pad_link+0x2c4/0x2e0
    [   21.054834]  ? media_create_pad_link+0x91/0x2e0
    [   21.055131]  ? _raw_spin_unlock+0x1e/0x40
    [   21.055441]  ? __v4l2_device_register_subdev+0x202/0x210
    [   21.055837]  uvc_mc_register_entities+0x358/0x400
    [   21.056144]  uvc_register_chains+0x1fd/0x290
    [   21.056413]  uvc_probe+0x380e/0x3dc0
    [   21.056676]  ? __lock_acquire+0x5aa/0x26e0
    [   21.056946]  ? find_held_lock+0x33/0xa0
    [   21.057196]  ? kernfs_activate+0x70/0x80
    [   21.057533]  ? usb_match_dynamic_id+0x1b/0x70
    [   21.057811]  ? find_held_lock+0x33/0xa0
    [   21.058047]  ? usb_match_dynamic_id+0x55/0x70
    [   21.058330]  ? lock_release+0x124/0x260
    [   21.058657]  ? usb_match_one_id_intf+0xa2/0x100
    [   21.058997]  usb_probe_interface+0x1ba/0x330
    [   21.059399]  really_probe+0x1ba/0x4c0
    [   21.059662]  __driver_probe_device+0xb2/0x180
    [   21.059944]  driver_probe_device+0x5a/0x100
    [   21.060170]  __device_attach_driver+0xe9/0x160
    [   21.060427]  ? __pfx___device_attach_driver+0x10/0x10
    [   21.060872]  bus_for_each_drv+0xa9/0x100
    [   21.061312]  __device_attach+0xed/0x190
    [   21.061812]  device_initial_probe+0xe/0x20
    [   21.062229]  bus_probe_device+0x4d/0xd0
    [   21.062590]  device_add+0x308/0x590
    [   21.062912]  usb_set_configuration+0x7b6/0xaf0
    [   21.063403]  usb_generic_driver_probe+0x36/0x80
    [   21.063714]  usb_probe_device+0x7b/0x130
    [   21.063936]  really_probe+0x1ba/0x4c0
    [   21.064111]  __driver_probe_device+0xb2/0x180
    [   21.064577]  driver_probe_device+0x5a/0x100
    [   21.065019]  __device_attach_driver+0xe9/0x160
    [   21.065403]  ? __pfx___device_attach_driver+0x10/0x10
    [   21.065820]  bus_for_each_drv+0xa9/0x100
    [   21.066094]  __device_attach+0xed/0x190
    [   21.066535]  device_initial_probe+0xe/0x20
    [   21.066992]  bus_probe_device+0x4d/0xd0
    [   21.067250]  device_add+0x308/0x590
    [   21.067501]  usb_new_device+0x347/0x610
    [   21.067817]  hub_event+0x156b/0x1e30
    [   21.068060]  ? process_scheduled_works+0x48b/0xaf0
    [   21.068337]  process_scheduled_works+0x5a3/0xaf0
    [   21.068668]  worker_thread+0x3cf/0x560
    [   21.068932]  ? kthread+0x109/0x1b0
    [   21.069133]  kthread+0x197/0x1b0
    [   21.069343]  ? __pfx_worker_thread+0x10/0x10
    [   21.069598]  ? __pfx_kthread+0x10/0x10
    [   21.069908]  ret_from_fork+0x32/0x40
    [   21.070169]  ? __pfx_kthread+0x10/0x10
    [   21.070424]  ret_from_fork_asm+0x1a/0x30
    [   21.070737]  </TASK>
    
    Cc: [email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=0584f746fde3d52b4675
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=dd320d114deb3f5bb79b
    Fixes: a3fbc2e6bb05 ("media: mc-entity.c: use WARN_ON, validate link pads")
    Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
    Reviewed-by: Ricardo Ribalda <[email protected]>
    Reviewed-by: Laurent Pinchart <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: uvcvideo: Stop stream during unregister [+ + +]

Author: Ricardo Ribalda <[email protected]>
Date:   Thu Sep 26 05:59:06 2024 +0000

    media: uvcvideo: Stop stream during unregister
    
    commit c9ec6f1736363b2b2bb4e266997389740f628441 upstream.
    
    uvc_unregister_video() can be called asynchronously from
    uvc_disconnect(). If the device is still streaming when that happens, a
    plethora of race conditions can occur.
    
    Make sure that the device has stopped streaming before exiting this
    function.
    
    If the user still holds handles to the driver's file descriptors, any
    ioctl will return -ENODEV from the v4l2 core.
    
    This change makes uvc more consistent with the rest of the v4l2 drivers
    using the vb2_fop_* and vb2_ioctl_* helpers.
    
    This driver (and many other usb drivers) always had this problem, but it
    wasn't possible to easily fix this until the vb2_video_unregister_device()
    helper was added. So the Fixes tag points to the creation of that helper.
    
    Reviewed-by: Hans Verkuil <[email protected]>
    Suggested-by: Hans Verkuil <[email protected]>
    Signed-off-by: Ricardo Ribalda <[email protected]>
    Reviewed-by: Mauro Carvalho Chehab <[email protected]>
    Fixes: f729ef5796d8 ("media: videobuf2-v4l2.c: add vb2_video_unregister_device helper function")
    Cc: [email protected] # 5.10.x
    [hverkuil: add note regarding Fixes version]
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: venus: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Fri Nov 1 17:40:50 2024 +0800

    media: venus: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 2a20869f7d798aa2b69e45b863eaf1b1ecf98278 upstream.
    
    It is not valid to call pm_runtime_set_suspended() on a device with
    runtime PM enabled: the call returns -EAGAIN in that case. So call
    pm_runtime_disable() first to fix it.
    
    Cc: [email protected]
    Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Bryan O'Donoghue <[email protected]>
    Acked-by: Stanimir Varbanov <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: verisilicon: av1: Fix reference video buffer pointer assignment [+ + +]

Author: Benjamin Gaignard <[email protected]>
Date:   Tue Sep 10 14:10:09 2024 +0000

    media: verisilicon: av1: Fix reference video buffer pointer assignment
    
    commit 672f24ed6ebcd986688c6674a6d994a265fefc25 upstream.
    
    Always get a new destination buffer for the reference frame because nothing
    guarantees that the one set previously is still valid or unused.
    
    Fixes this chromium test suite:
    https://chromium.googlesource.com/chromium/src/media/+/refs/heads/main/test/data/test-25fps.av1.ivf
    
    Fixes: 727a400686a2 ("media: verisilicon: Add Rockchip AV1 decoder")
    Cc: <[email protected]>
    Signed-off-by: Benjamin Gaignard <[email protected]>
    Reviewed-by: Nicolas Dufresne <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    [hverkuil: fix typo and add link to chromium test suite]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/damon/vaddr: fix issue in damon_va_evenly_split_region() [+ + +]

Author: Zheng Yejian <[email protected]>
Date:   Tue Oct 22 16:39:26 2024 +0800

    mm/damon/vaddr: fix issue in damon_va_evenly_split_region()
    
    commit f3c7a1ede435e2e45177d7a490a85fb0a0ec96d1 upstream.
    
    Patch series "mm/damon/vaddr: Fix issue in
    damon_va_evenly_split_region()".  v2.
    
    According to the logic of damon_va_evenly_split_region(), the following
    split case currently does not meet the expectation:
    
      Suppose DAMON_MIN_REGION=0x1000,
      Case: Split [0x0, 0x3000) into 2 pieces, then the result would be
            actually 3 regions:
              [0x0, 0x1000), [0x1000, 0x2000), [0x2000, 0x3000)
            but NOT the expected 2 regions:
              [0x0, 0x1000), [0x1000, 0x3000) !!!
    
    The root cause is that when calculating size of each split piece in
    damon_va_evenly_split_region():
    
      `sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, DAMON_MIN_REGION);`
    
    both the division and the ALIGN_DOWN may lose precision, so splitting off
    one piece of size 'sz_piece' at a time from the original 'start' to 'end'
    can produce more pieces than expected.
    
    To fix it, count each piece as it is split off and make sure that no more
    than 'nr_pieces' are created.  In addition, add the above case to
    damon_test_split_evenly().
    
    Also add an 'nr_pieces == 1' check in damon_va_evenly_split_region() for
    better code readability, along with a corresponding kunit test case.
    
    
    This patch (of 2):
    
    According to the logic of damon_va_evenly_split_region(), the following
    split case currently does not meet the expectation:
    
      Suppose DAMON_MIN_REGION=0x1000,
      Case: Split [0x0, 0x3000) into 2 pieces, then the result would be
            actually 3 regions:
              [0x0, 0x1000), [0x1000, 0x2000), [0x2000, 0x3000)
            but NOT the expected 2 regions:
              [0x0, 0x1000), [0x1000, 0x3000) !!!
    
    The root cause is that when calculating size of each split piece in
    damon_va_evenly_split_region():
    
      `sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, DAMON_MIN_REGION);`
    
    both the division and the ALIGN_DOWN may lose precision, so splitting off
    one piece of size 'sz_piece' at a time from the original 'start' to 'end'
    can produce more pieces than expected.
    
    To fix it, count each piece as it is split off and make sure that no more
    than 'nr_pieces' are created. In addition, add the above case to
    damon_test_split_evenly().
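    
    The arithmetic in this case can be reproduced outside the kernel. The
    following is a minimal userspace sketch, not the kernel code: the names
    pieces_before_fix(), pieces_after_fix(), MIN_REGION and ALIGN_DOWN_TO()
    are illustrative stand-ins, and the pre-fix loop condition is an
    assumption derived from the description (keep splitting while another
    'sz_piece' still fits before 'end').

```c
#include <assert.h>

/* Stand-in for DAMON_MIN_REGION in the example case. */
#define MIN_REGION 0x1000UL

/* Round x down to a multiple of a (illustrative, not the kernel macro). */
#define ALIGN_DOWN_TO(x, a) ((x) / (a) * (a))

/* Size of one piece, computed as in the quoted sz_piece formula. */
static unsigned long piece_size(unsigned long start, unsigned long end,
                                unsigned int nr_pieces)
{
    assert(end > start && nr_pieces > 0);
    return ALIGN_DOWN_TO((end - start) / nr_pieces, MIN_REGION);
}

/* Assumed pre-fix behaviour: split until no further sz_piece fits. */
unsigned int pieces_before_fix(unsigned long start, unsigned long end,
                               unsigned int nr_pieces)
{
    unsigned long sz_piece = piece_size(start, end, nr_pieces);
    unsigned long pos;
    unsigned int count = 1; /* first piece: [start, start + sz_piece) */

    if (!sz_piece)
        return 0; /* region too small to split */

    for (pos = start + sz_piece; pos + sz_piece <= end; pos += sz_piece)
        count++;
    return count;
}

/*
 * Post-fix behaviour as described: count the pieces and stop at nr_pieces.
 * The last piece is assumed to absorb the rounding remainder, so it runs
 * from *last_start to end.
 */
unsigned int pieces_after_fix(unsigned long start, unsigned long end,
                              unsigned int nr_pieces,
                              unsigned long *last_start)
{
    unsigned long sz_piece = piece_size(start, end, nr_pieces);
    unsigned long pos;
    unsigned int i, count = 1;

    if (!sz_piece)
        return 0;

    *last_start = start;
    for (pos = start + sz_piece, i = 1; i < nr_pieces;
         pos += sz_piece, i++) {
        *last_start = pos;
        count++;
    }
    return count;
}
```

    For the case [0x0, 0x3000) with 2 pieces, 'sz_piece' is 0x1000, so the
    first function counts 3 pieces while the second counts 2, with the last
    piece starting at 0x1000 and extending to 0x3000.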
    
    After this patch, the damon-operations test passes:
    
     # ./tools/testing/kunit/kunit.py run damon-operations
     [...]
     ============== damon-operations (6 subtests) ===============
     [PASSED] damon_test_three_regions_in_vmas
     [PASSED] damon_test_apply_three_regions1
     [PASSED] damon_test_apply_three_regions2
     [PASSED] damon_test_apply_three_regions3
     [PASSED] damon_test_apply_three_regions4
     [PASSED] damon_test_split_evenly
     ================ [PASSED] damon-operations =================
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
    Signed-off-by: Zheng Yejian <[email protected]>
    Reviewed-by: SeongJae Park <[email protected]>
    Cc: Fernand Sieber <[email protected]>
    Cc: Leonard Foerster <[email protected]>
    Cc: Shakeel Butt <[email protected]>
    Cc: Ye Weihua <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/slub: Avoid list corruption when removing a slab from the full list [+ + +]

Author: yuan.gao <[email protected]>
Date:   Fri Oct 18 14:44:35 2024 +0800

    mm/slub: Avoid list corruption when removing a slab from the full list
    
    commit dbc16915279a548a204154368da23d402c141c81 upstream.
    
    Boot with slub_debug=UFPZ.
    
    If an allocated object fails alloc_consistency_checks(), all objects of
    the slab are marked as used, and the slab is then removed from the
    partial list.
    
    When an object belonging to the slab is freed later, remove_full() is
    called. Because the slab is on neither the partial list nor the full
    list, this eventually leads to list corruption (actually, a list poison
    being detected).
    
    So we need to mark and isolate the slab page with metadata corruption,
    and not put it back into circulation.
    
    Because the debug caches avoid all the fastpaths, reusing the frozen bit
    to mark slab page with metadata corruption seems to be fine.
    
    [ 4277.385669] list_del corruption, ffffea00044b3e50->next is LIST_POISON1 (dead000000000100)
    [ 4277.387023] ------------[ cut here ]------------
    [ 4277.387880] kernel BUG at lib/list_debug.c:56!
    [ 4277.388680] invalid opcode: 0000 [#1] PREEMPT SMP PTI
    [ 4277.389562] CPU: 5 PID: 90 Comm: kworker/5:1 Kdump: loaded Tainted: G           OE      6.6.1-1 #1
    [ 4277.392113] Workqueue: xfs-inodegc/vda1 xfs_inodegc_worker [xfs]
    [ 4277.393551] RIP: 0010:__list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.394518] Code: 48 91 82 e8 37 f9 9a ff 0f 0b 48 89 fe 48 c7 c7 28 49 91 82 e8 26 f9 9a ff 0f 0b 48 89 fe 48 c7 c7 58 49 91
    [ 4277.397292] RSP: 0018:ffffc90000333b38 EFLAGS: 00010082
    [ 4277.398202] RAX: 000000000000004e RBX: ffffea00044b3e50 RCX: 0000000000000000
    [ 4277.399340] RDX: 0000000000000002 RSI: ffffffff828f8715 RDI: 00000000ffffffff
    [ 4277.400545] RBP: ffffea00044b3e40 R08: 0000000000000000 R09: ffffc900003339f0
    [ 4277.401710] R10: 0000000000000003 R11: ffffffff82d44088 R12: ffff888112cf9910
    [ 4277.402887] R13: 0000000000000001 R14: 0000000000000001 R15: ffff8881000424c0
    [ 4277.404049] FS:  0000000000000000(0000) GS:ffff88842fd40000(0000) knlGS:0000000000000000
    [ 4277.405357] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 4277.406389] CR2: 00007f2ad0b24000 CR3: 0000000102a3a006 CR4: 00000000007706e0
    [ 4277.407589] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 4277.408780] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 4277.410000] PKRU: 55555554
    [ 4277.410645] Call Trace:
    [ 4277.411234]  <TASK>
    [ 4277.411777]  ? die+0x32/0x80
    [ 4277.412439]  ? do_trap+0xd6/0x100
    [ 4277.413150]  ? __list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.414158]  ? do_error_trap+0x6a/0x90
    [ 4277.414948]  ? __list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.415915]  ? exc_invalid_op+0x4c/0x60
    [ 4277.416710]  ? __list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.417675]  ? asm_exc_invalid_op+0x16/0x20
    [ 4277.418482]  ? __list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.419466]  ? __list_del_entry_valid_or_report+0x7b/0xc0
    [ 4277.420410]  free_to_partial_list+0x515/0x5e0
    [ 4277.421242]  ? xfs_iext_remove+0x41a/0xa10 [xfs]
    [ 4277.422298]  xfs_iext_remove+0x41a/0xa10 [xfs]
    [ 4277.423316]  ? xfs_inodegc_worker+0xb4/0x1a0 [xfs]
    [ 4277.424383]  xfs_bmap_del_extent_delay+0x4fe/0x7d0 [xfs]
    [ 4277.425490]  __xfs_bunmapi+0x50d/0x840 [xfs]
    [ 4277.426445]  xfs_itruncate_extents_flags+0x13a/0x490 [xfs]
    [ 4277.427553]  xfs_inactive_truncate+0xa3/0x120 [xfs]
    [ 4277.428567]  xfs_inactive+0x22d/0x290 [xfs]
    [ 4277.429500]  xfs_inodegc_worker+0xb4/0x1a0 [xfs]
    [ 4277.430479]  process_one_work+0x171/0x340
    [ 4277.431227]  worker_thread+0x277/0x390
    [ 4277.431962]  ? __pfx_worker_thread+0x10/0x10
    [ 4277.432752]  kthread+0xf0/0x120
    [ 4277.433382]  ? __pfx_kthread+0x10/0x10
    [ 4277.434134]  ret_from_fork+0x2d/0x50
    [ 4277.434837]  ? __pfx_kthread+0x10/0x10
    [ 4277.435566]  ret_from_fork_asm+0x1b/0x30
    [ 4277.436280]  </TASK>
    
    Fixes: 643b113849d8 ("slub: enable tracking of full slabs")
    Suggested-by: Hyeonggon Yoo <[email protected]>
    Suggested-by: Vlastimil Babka <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: yuan.gao <[email protected]>
    Reviewed-by: Hyeonggon Yoo <[email protected]>
    Acked-by: Christoph Lameter <[email protected]>
    Signed-off-by: Vlastimil Babka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/vmalloc: combine all TLB flush operations of KASAN shadow virtual address into one operation [+ + +]

Author: Adrian Huang <[email protected]>
Date:   Sat Jul 27 00:52:46 2024 +0800

    mm/vmalloc: combine all TLB flush operations of KASAN shadow virtual address into one operation
    
    commit 9e9e085effe9b7e342138fde3cf8577d22509932 upstream.
    
    When compiling the kernel source with 'make -j $(nproc)' on a running
    KASAN-enabled kernel on a 256-core machine, the following soft lockup is
    shown:
    
    watchdog: BUG: soft lockup - CPU#28 stuck for 22s! [kworker/28:1:1760]
    CPU: 28 PID: 1760 Comm: kworker/28:1 Kdump: loaded Not tainted 6.10.0-rc5 #95
    Workqueue: events drain_vmap_area_work
    RIP: 0010:smp_call_function_many_cond+0x1d8/0xbb0
    Code: 38 c8 7c 08 84 c9 0f 85 49 08 00 00 8b 45 08 a8 01 74 2e 48 89 f1 49 89 f7 48 c1 e9 03 41 83 e7 07 4c 01 e9 41 83 c7 03 f3 90 <0f> b6 01 41 38 c7 7c 08 84 c0 0f 85 d4 06 00 00 8b 45 08 a8 01 75
    RSP: 0018:ffffc9000cb3fb60 EFLAGS: 00000202
    RAX: 0000000000000011 RBX: ffff8883bc4469c0 RCX: ffffed10776e9949
    RDX: 0000000000000002 RSI: ffff8883bb74ca48 RDI: ffffffff8434dc50
    RBP: ffff8883bb74ca40 R08: ffff888103585dc0 R09: ffff8884533a1800
    R10: 0000000000000004 R11: ffffffffffffffff R12: ffffed1077888d39
    R13: dffffc0000000000 R14: ffffed1077888d38 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff8883bc400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00005577b5c8d158 CR3: 0000000004850000 CR4: 0000000000350ef0
    Call Trace:
     <IRQ>
     ? watchdog_timer_fn+0x2cd/0x390
     ? __pfx_watchdog_timer_fn+0x10/0x10
     ? __hrtimer_run_queues+0x300/0x6d0
     ? sched_clock_cpu+0x69/0x4e0
     ? __pfx___hrtimer_run_queues+0x10/0x10
     ? srso_return_thunk+0x5/0x5f
     ? ktime_get_update_offsets_now+0x7f/0x2a0
     ? srso_return_thunk+0x5/0x5f
     ? srso_return_thunk+0x5/0x5f
     ? hrtimer_interrupt+0x2ca/0x760
     ? __sysvec_apic_timer_interrupt+0x8c/0x2b0
     ? sysvec_apic_timer_interrupt+0x6a/0x90
     </IRQ>
     <TASK>
     ? asm_sysvec_apic_timer_interrupt+0x16/0x20
     ? smp_call_function_many_cond+0x1d8/0xbb0
     ? __pfx_do_kernel_range_flush+0x10/0x10
     on_each_cpu_cond_mask+0x20/0x40
     flush_tlb_kernel_range+0x19b/0x250
     ? srso_return_thunk+0x5/0x5f
     ? kasan_release_vmalloc+0xa7/0xc0
     purge_vmap_node+0x357/0x820
     ? __pfx_purge_vmap_node+0x10/0x10
     __purge_vmap_area_lazy+0x5b8/0xa10
     drain_vmap_area_work+0x21/0x30
     process_one_work+0x661/0x10b0
     worker_thread+0x844/0x10e0
     ? srso_return_thunk+0x5/0x5f
     ? __kthread_parkme+0x82/0x140
     ? __pfx_worker_thread+0x10/0x10
     kthread+0x2a5/0x370
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x30/0x70
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    
    Debugging Analysis:
    
      1. The following ftrace log shows that the lockup CPU spends too much
         time iterating vmap_nodes and flushing TLB when purging vm_area
         structures. (Some info is trimmed).
    
         kworker: funcgraph_entry:              |  drain_vmap_area_work() {
         kworker: funcgraph_entry:              |   mutex_lock() {
         kworker: funcgraph_entry:  1.092 us    |     __cond_resched();
         kworker: funcgraph_exit:   3.306 us    |   }
         ...                                        ...
         kworker: funcgraph_entry:              |    flush_tlb_kernel_range() {
         ...                                          ...
         kworker: funcgraph_exit: # 7533.649 us |    }
         ...                                         ...
         kworker: funcgraph_entry:  2.344 us    |   mutex_unlock();
         kworker: funcgraph_exit: $ 23871554 us | }
    
         drain_vmap_area_work() takes over 23 seconds.
    
         There are 2805 flush_tlb_kernel_range() calls in the ftrace log.
           * One is called in __purge_vmap_area_lazy().
           * Others are called by purge_vmap_node->kasan_release_vmalloc.
             purge_vmap_node() iteratively releases kasan vmalloc
             allocations and flushes TLB for each vmap_area.
               - [Rough calculation] Each flush_tlb_kernel_range() call takes
                 about 7.5ms.
                   -- 2804 * 7.5ms = 21.03 seconds.
                   -- That's why a soft lockup is triggered.
    
      2. Extending the soft lockup time can work around the issue (For example,
         # echo 60 > /proc/sys/kernel/watchdog_thresh). This confirms the
         above-mentioned speculation: drain_vmap_area_work() spends too much
         time.
    
    If we combine all TLB flush operations of the KASAN shadow virtual
    address into one operation in the call path
    'purge_vmap_node()->kasan_release_vmalloc()', the running time of
    drain_vmap_area_work() is reduced greatly and the soft lockup is no
    longer triggered. The idea comes from the flush_tlb_kernel_range() call
    in __purge_vmap_area_lazy().
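    
    The difference between flushing per vmap_area and flushing once can be
    modelled in userspace. This is only a sketch under stated assumptions:
    struct va_range, model_flush(), flush_per_area() and flush_combined()
    are hypothetical names, and model_flush() merely counts calls instead of
    touching any TLB. It is not the code of this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one address range to be flushed. */
struct va_range {
    unsigned long start;
    unsigned long end;
};

static unsigned int nr_flushes;

/* Stands in for an expensive flush; here it only counts invocations. */
static void model_flush(unsigned long start, unsigned long end)
{
    assert(start < end);
    nr_flushes++;
}

/* One flush per area: the cost grows with the number of areas. */
unsigned int flush_per_area(const struct va_range *areas, size_t n)
{
    size_t i;

    nr_flushes = 0;
    for (i = 0; i < n; i++)
        model_flush(areas[i].start, areas[i].end);
    return nr_flushes;
}

/*
 * Combined variant: track the lowest start and highest end while walking
 * the areas, then issue a single flush covering the whole span.
 */
unsigned int flush_combined(const struct va_range *areas, size_t n,
                            struct va_range *covered)
{
    unsigned long start = ULONG_MAX, end = 0;
    size_t i;

    nr_flushes = 0;
    for (i = 0; i < n; i++) {
        if (areas[i].start < start)
            start = areas[i].start;
        if (areas[i].end > end)
            end = areas[i].end;
    }
    if (start < end) {
        model_flush(start, end);
        covered->start = start;
        covered->end = end;
    }
    return nr_flushes;
}
```

    The combined flush covers a wider span, including any gaps between the
    areas, in exchange for a single invocation. With roughly 2800 areas at
    about 7.5ms per flush, as in the ftrace log, the number of calls is what
    dominates.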
    
    Here is the test result based on 6.10:
    
    [6.10 wo/ the patch]
      1. ftrace latency profiling (record a trace if the latency > 20s).
         echo 20000000 > /sys/kernel/debug/tracing/tracing_thresh
         echo drain_vmap_area_work > /sys/kernel/debug/tracing/set_graph_function
         echo function_graph > /sys/kernel/debug/tracing/current_tracer
         echo 1 > /sys/kernel/debug/tracing/tracing_on
    
      2. Run `make -j $(nproc)` to compile the kernel source
    
      3. Once the soft lockup is reproduced, check the ftrace log:
         cat /sys/kernel/debug/tracing/trace
            # tracer: function_graph
            #
            # CPU  DURATION                  FUNCTION CALLS
            # |     |   |                     |   |   |   |
              76) $ 50412985 us |    } /* __purge_vmap_area_lazy */
              76) $ 50412997 us |  } /* drain_vmap_area_work */
              76) $ 29165911 us |    } /* __purge_vmap_area_lazy */
              76) $ 29165926 us |  } /* drain_vmap_area_work */
              91) $ 53629423 us |    } /* __purge_vmap_area_lazy */
              91) $ 53629434 us |  } /* drain_vmap_area_work */
              91) $ 28121014 us |    } /* __purge_vmap_area_lazy */
              91) $ 28121026 us |  } /* drain_vmap_area_work */
    
    [6.10 w/ the patch]
      1. Repeat steps 1-2 in "[6.10 wo/ the patch]"
    
      2. The soft lockup is not triggered and ftrace log is empty.
         cat /sys/kernel/debug/tracing/trace
         # tracer: function_graph
         #
         # CPU  DURATION                  FUNCTION CALLS
         # |     |   |                     |   |   |   |
    
      3. Setting 'tracing_thresh' to 10/5 seconds does not produce any ftrace
         log.
    
      4. Setting 'tracing_thresh' to 1 second produces an ftrace log.
         cat /sys/kernel/debug/tracing/trace
         # tracer: function_graph
         #
         # CPU  DURATION                  FUNCTION CALLS
         # |     |   |                     |   |   |   |
           23) $ 1074942 us  |    } /* __purge_vmap_area_lazy */
           23) $ 1074950 us  |  } /* drain_vmap_area_work */
    
      The worst-case execution time of drain_vmap_area_work() is about 1 second.
    
    Link: https://lore.kernel.org/lkml/[email protected]/
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 282631cb2447 ("mm: vmalloc: remove global purge_vmap_area_root rb-tree")
    Signed-off-by: Adrian Huang <[email protected]>
    Co-developed-by: Uladzislau Rezki (Sony) <[email protected]>
    Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
    Tested-by: Jiwei Sun <[email protected]>
    Reviewed-by: Baoquan He <[email protected]>
    Cc: Alexander Potapenko <[email protected]>
    Cc: Andrey Konovalov <[email protected]>
    Cc: Andrey Ryabinin <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Cc: Dmitry Vyukov <[email protected]>
    Cc: Vincenzo Frascino <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mtd: spinand: winbond: Fix 512GW and 02JW OOB layout [+ + +]

Author: Miquel Raynal <[email protected]>
Date:   Wed Oct 9 14:49:59 2024 +0200

    mtd: spinand: winbond: Fix 512GW and 02JW OOB layout
    
    commit c1247de51cab53fc357a73804c11fb4fba55b2d9 upstream.
    
    Both W25N512GW and W25N02JW chips have 64 bytes of OOB and thus cannot
    use the layout for 128 bytes OOB. Reference the correct layout instead.
    
    Fixes: 6a804fb72de5 ("mtd: spinand: winbond: add support for serial NAND flash")
    Cc: [email protected]
    Signed-off-by: Miquel Raynal <[email protected]>
    Reviewed-by: Frieder Schrempf <[email protected]>
    Link: https://lore.kernel.org/linux-mtd/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mtd: spinand: winbond: Fix 512GW, 01GW, 01JW and 02JW ECC information [+ + +]

Author: Miquel Raynal <[email protected]>
Date:   Wed Oct 9 14:50:00 2024 +0200

    mtd: spinand: winbond: Fix 512GW, 01GW, 01JW and 02JW ECC information
    
    commit fee9b240916df82a8b07aef0fdfe96785417a164 upstream.
    
    These four chips:
    * W25N512GW
    * W25N01GW
    * W25N01JW
    * W25N02JW
    all require a single bit of ECC strength and thus feature an on-die
    Hamming-like ECC engine. There is no point in filling a ->get_status()
    callback for them because the main ECC status bytes are located in
    standard places, and retrieving the number of bitflips in case of
    corrected chunk is both useless and unsupported (if there are bitflips,
    then there is 1 at most, so no need to query the chip for that).
    
    Without this change, a kernel warning triggers every time a bit flips.
    
    Fixes: 6a804fb72de5 ("mtd: spinand: winbond: add support for serial NAND flash")
    Cc: [email protected]
    Signed-off-by: Miquel Raynal <[email protected]>
    Reviewed-by: Frieder Schrempf <[email protected]>
    Link: https://lore.kernel.org/linux-mtd/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: fec: make PPS channel configurable [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Fri Oct 4 17:24:19 2024 +0200

    net: fec: make PPS channel configurable
    
    commit 566c2d83887f0570056833102adc5b88e681b0c7 upstream.
    
    Depending on the SoC where the FEC is integrated, the PPS channel
    might be routed to different timer instances. Make this configurable
    from the devicetree.
    
    When the related DT property is not present, fall back to the previous
    default and use channel 0.
    
    Reviewed-by: Frank Li <[email protected]>
    Tested-by: Rafael Beims <[email protected]>
    Signed-off-by: Francesco Dolcini <[email protected]>
    Reviewed-by: Csókás, Bence <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Csókás, Bence <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: fec: refactor PPS channel configuration [+ + +]

Author: Francesco Dolcini <[email protected]>
Date:   Fri Oct 4 17:24:18 2024 +0200

    net: fec: refactor PPS channel configuration
    
    commit bf8ca67e21671e7a56e31da45360480b28f185f1 upstream.
    
    Preparation patch to allow for PPS channel configuration, no functional
    change intended.
    
    Signed-off-by: Francesco Dolcini <[email protected]>
    Reviewed-by: Frank Li <[email protected]>
    Reviewed-by: Csókás, Bence <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Csókás, Bence <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: phy: dp83869: fix status reporting for 1000base-x autonegotiation [+ + +]

Author: Romain Gantois <[email protected]>
Date:   Tue Nov 12 15:06:08 2024 +0100

    net: phy: dp83869: fix status reporting for 1000base-x autonegotiation
    
    commit 378e8feea9a70d37a5dc1678b7ec27df21099fa5 upstream.
    
    The DP83869 PHY transceiver supports converting from RGMII to 1000base-x.
    In this operation mode, autonegotiation can be performed, as described in
    IEEE802.3.
    
    The DP83869 has a set of fiber-specific registers located at offset 0xc00.
    When the transceiver is configured in RGMII-to-1000base-x mode, these
    registers are mapped onto offset 0, which should make reading the
    autonegotiation status transparent.
    
    However, the fiber registers at offset 0xc04 and 0xc05 follow the bit
    layout specified in Clause 37, and genphy_read_status() assumes a Clause 22
    layout. Thus, genphy_read_status() doesn't properly read the capabilities
    advertised by the link partner, resulting in incorrect link parameters.
    
    Similarly, genphy_config_aneg() doesn't properly write advertised
    capabilities.
    
    Fix the 1000base-x autonegotiation procedure by replacing
    genphy_read_status() and genphy_config_aneg() with their Clause 37
    equivalents.
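    
    Why the wrong layout yields wrong link parameters can be seen from the
    bit positions alone. The sketch below is a userspace illustration, not
    driver code: the EX_* constants are local copies whose values are
    believed to match the LPA_* definitions in include/uapi/linux/mii.h, and
    speed_copper_layout() and speed_c37_layout() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Copper-style ability bits, as assumed by the generic read path. */
#define EX_LPA_10HALF     0x0020
#define EX_LPA_10FULL     0x0040
#define EX_LPA_100HALF    0x0080
#define EX_LPA_100FULL    0x0100

/* 1000base-x (Clause 37) ability bits occupy the same positions. */
#define EX_LPA_1000XFULL  0x0020
#define EX_LPA_1000XHALF  0x0040

/* Best speed in Mbit/s when the word is read with the copper layout. */
int speed_copper_layout(uint16_t lpa)
{
    if (lpa & (EX_LPA_100FULL | EX_LPA_100HALF))
        return 100;
    if (lpa & (EX_LPA_10FULL | EX_LPA_10HALF))
        return 10;
    return 0;
}

/* Best speed in Mbit/s when the word is read with the Clause 37 layout. */
int speed_c37_layout(uint16_t lpa)
{
    if (lpa & (EX_LPA_1000XFULL | EX_LPA_1000XHALF))
        return 1000;
    return 0;
}
```

    A link partner word of 0x0020 means 1000base-x full duplex under the
    Clause 37 layout, but the same bit reads as a 10 Mbit/s half-duplex
    ability when decoded with the copper layout.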
    
    Fixes: a29de52ba2a1 ("net: dp83869: Add ability to advertise Fiber connection")
    Cc: [email protected]
    Signed-off-by: Romain Gantois <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: stmmac: set initial EEE policy configuration [+ + +]

Author: Choong Yong Liang <[email protected]>
Date:   Wed Nov 20 16:38:18 2024 +0800

    net: stmmac: set initial EEE policy configuration
    
    commit 59c5e1411a0a13ebb930f4ebba495cc4eb14f8f2 upstream.
    
    Set the initial eee_cfg values to have 'ethtool --show-eee ' display
    the initial EEE configuration.
    
    Fixes: 49168d1980e2 ("net: phy: Add phy_support_eee() indicating MAC support EEE")
    Cc: <[email protected]>
    Signed-off-by: Choong Yong Liang <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

netkit: Add option for scrubbing skb meta data [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Fri Oct 4 12:13:31 2024 +0200

    netkit: Add option for scrubbing skb meta data
    
    commit 83134ef4609388f6b9ca31a384f531155196c2a7 upstream.
    
    Jordan reported that when running Cilium with netkit in per-endpoint-routes
    mode, network policy misclassifies traffic. In this direct routing mode
    of Cilium which is used in case of GKE/EKS/AKS, the Pod's BPF program to
    enforce policy sits on the netkit primary device's egress side.
    
    The issue here is that netkit_prep_forward() clears meta data such as
    skb->mark and skb->priority before executing the BPF program. Thus,
    identity data stored there by earlier BPF programs (e.g. from tcx ingress
    on the physical device) gets cleared instead of being made available for
    the primary's program to process. While this might be desired for traffic
    egressing the Pod via the peer device, it is different for the primary
    one: with tcx egress on the host veth, by comparison, this information
    would be available.
    
    To address this, add a new parameter for the device orchestration to
    allow control of skb->mark and skb->priority scrubbing, to make the two
    accessible from BPF (and eventually leave it up to the program to scrub).
    By default, the current behavior is retained. For netkit peer this also
    enables the use case where applications could cooperate/signal intent to
    the BPF program.
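    
    The effect of such a knob can be pictured with a small userspace model.
    Everything named here is hypothetical (struct pkt_meta, enum scrub_mode,
    model_prep_forward()); the sketch only mirrors the behaviour described
    in this message and is not the netkit implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two skb fields discussed here. */
struct pkt_meta {
    uint32_t mark;
    uint32_t priority;
};

/* Hypothetical per-device setting chosen by the orchestrator. */
enum scrub_mode {
    SCRUB_NONE,     /* keep mark and priority visible to the BPF program */
    SCRUB_DEFAULT,  /* retain the existing behaviour: clear both */
};

/* Runs before the BPF program would see the packet. */
void model_prep_forward(struct pkt_meta *meta, enum scrub_mode mode)
{
    assert(meta);
    if (mode == SCRUB_DEFAULT) {
        meta->mark = 0;
        meta->priority = 0;
    }
    /* With SCRUB_NONE the program decides itself whether to scrub. */
}
```

    With the default mode, an identity stored in the mark by an earlier
    program is gone by the time the next program runs; with scrubbing
    disabled it is still readable.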
    
    Note that struct netkit has a 4 byte hole between policy and bundle which
    is used here, in other words, struct netkit's first cacheline content used
    in fast-path does not get moved around.
    
    Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
    Reported-by: Jordan Rife <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Nikolay Aleksandrov <[email protected]>
    Link: https://github.com/cilium/cilium/issues/34042
    Acked-by: Jakub Kicinski <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin KaFai Lau <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur [+ + +]

Author: Yang Erkun <[email protected]>
Date:   Tue Nov 5 19:03:14 2024 +0800

    nfsd: fix nfs4_openowner leak when concurrent nfsd4_open occur
    
    commit 98100e88dd8865999dc6379a3356cd799795fe7b upstream.
    
    A forced umount (umount -f) attempts to kill all rpc_tasks, even though
    the umount operation may ultimately fail if some files remain open.
    Consequently, if a thread is opening a file at that time, it can
    potentially send two rpc_tasks to the NFS server.
    
                       NFS CLIENT
    thread1                             thread2
    open("file")
    ...
    nfs4_do_open
     _nfs4_do_open
      _nfs4_open_and_get_state
       _nfs4_proc_open
        nfs4_run_open_task
         /* rpc_task1 */
         rpc_run_task
         rpc_wait_for_completion_task
    
                                        umount -f
                                        nfs_umount_begin
                                         rpc_killall_tasks
                                          rpc_signal_task
         rpc_task1 is woken up
         and returns -512
     _nfs4_do_open // while loop
        ...
        nfs4_run_open_task
         /* rpc_task2 */
         rpc_run_task
         rpc_wait_for_completion_task
    
    While processing an open request, nfsd will first attempt to find or
    allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
    marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since
    two rpc_tasks can attempt to open the same file simultaneously from the
    client to the server, and because two instances of nfsd can run
    concurrently, this situation can lead to significant memory leaks.
    Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be
    triggered.
    
                        NFS SERVER
    nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads
    
    nfsd4_open
     nfsd4_process_open1
      find_or_alloc_open_stateowner
       // alloc oo1, stateid1
                           nfsd4_open
                            nfsd4_process_open1
                            find_or_alloc_open_stateowner
                            // find oo1, without NFS4_OO_CONFIRMED
                             release_openowner
                              unhash_openowner_locked
                              list_del_init(&oo->oo_perclient)
                              // cannot find this oo
                              // from client, LEAK!!!
                             alloc_stateowner // alloc oo2
    
     nfsd4_process_open2
      init_open_stateid
      // associate oo1
      // with stateid1, stateid1 LEAK!!!
      nfs4_get_vfs_file
      // alloc nfsd_file1 and nfsd_file_mark1
      // all LEAK!!!
    
                             nfsd4_process_open2
                             ...
    
                                        write_threads
                                         ...
                                         nfsd_destroy_serv
                                          nfsd_shutdown_net
                                           nfs4_state_shutdown_net
                                            nfs4_state_destroy_net
                                             destroy_client
                                              __destroy_client
                                              // won't find oo1!!!
                                         nfsd_shutdown_generic
                                          nfsd_file_cache_shutdown
                                           kmem_cache_destroy
                                           for nfsd_file_slab
                                           and nfsd_file_mark_slab
                                           // bark since nfsd_file1
                                           // and nfsd_file_mark1
                                           // still alive
    
    =======================================================================
    BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
    __kmem_cache_shutdown()
    -----------------------------------------------------------------------
    
    Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
    flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
    CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
    1.16.1-2.fc37 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x53/0x70
     slab_err+0xb0/0xf0
     __kmem_cache_shutdown+0x15c/0x310
     kmem_cache_destroy+0x66/0x160
     nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
     nfsd_destroy_serv+0x251/0x2a0 [nfsd]
     nfsd_svc+0x125/0x1e0 [nfsd]
     write_threads+0x16a/0x2a0 [nfsd]
     nfsctl_transaction_write+0x74/0xa0 [nfsd]
     vfs_write+0x1ae/0x6d0
     ksys_write+0xc1/0x160
     do_syscall_64+0x5f/0x170
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Disabling lock debugging due to kernel taint
    Object 0xff11000110e2ac38 @offset=3128
    Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
    pid=800
     nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
     nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
     nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
     nfsd4_process_open2+0x713/0x1070 [nfsd]
     nfsd4_open+0x74b/0x8b0 [nfsd]
     nfsd4_proc_compound+0x70b/0xc20 [nfsd]
     nfsd_dispatch+0x1b4/0x3a0 [nfsd]
     svc_process_common+0x5b8/0xc50 [sunrpc]
     svc_process+0x2ab/0x3b0 [sunrpc]
     svc_handle_xprt+0x681/0xa20 [sunrpc]
     nfsd+0x183/0x220 [nfsd]
     kthread+0x199/0x1e0
     ret_from_fork+0x31/0x60
     ret_from_fork_asm+0x1a/0x30
    
    Add nfs4_openowner_unhashed to help find an unhashed nfs4_openowner, and
    break out of the nfsd4_open process in that case to fix this problem.
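    
    The idea of detecting an unhashed owner can be sketched in userspace C.
    This is a minimal model, not the kernel code: the list helpers and
    struct openowner_model are hypothetical stand-ins, and only the
    per-client linkage from the diagram above is modelled.

```c
#include <stdbool.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool list_node_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and reinitialise the node, in the spirit of list_del_init(). */
static void list_node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_node_init(n);
}

/* Hypothetical, reduced openowner: only the per-client linkage. */
struct openowner_model {
	struct list_node perclient;
};

/* An owner whose per-client linkage is empty has been unhashed. */
static bool openowner_model_unhashed(const struct openowner_model *oo)
{
	return list_node_empty(&oo->perclient);
}
```

    A caller that still holds a pointer to the owner can test it with
    openowner_model_unhashed() and bail out instead of attaching new state
    to an owner that the client can no longer reach.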
    
    Cc: [email protected] # v5.4+
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Yang Erkun <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nfsd: make sure exp active before svc_export_show [+ + +]

Author: Yang Erkun <[email protected]>
Date:   Mon Oct 21 22:23:41 2024 +0800

    nfsd: make sure exp active before svc_export_show
    
    commit be8f982c369c965faffa198b46060f8853e0f1f0 upstream.
    
    The function `e_show` is called under RCU protection. This only
    ensures that `exp` will not be freed; the reference count for `exp`
    can still drop to zero, which triggers a refcount use-after-free
    warning when `exp_get` is called. To resolve this issue, use
    `cache_get_rcu` to ensure that `exp` remains active.
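    
    The difference between an unconditional get and a get that refuses a
    dead object can be sketched with C11 atomics. This is an illustration
    only: struct counted_obj and counted_obj_get_unless_zero() are
    hypothetical names, and the assumption is that `cache_get_rcu` behaves
    like a "take a reference unless the count is already zero" operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a plain reference counter. */
struct counted_obj {
	atomic_int refs;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the object is already on its way to being freed.
 */
static bool counted_obj_get_unless_zero(struct counted_obj *obj)
{
	int old = atomic_load(&obj->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->refs, &old, old + 1))
			return true;
	}
	return false;
}
```

    A reader that gets false back skips the entry instead of incrementing
    a counter that has already reached zero.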
    
    ------------[ cut here ]------------
    refcount_t: addition on 0; use-after-free.
    WARNING: CPU: 3 PID: 819 at lib/refcount.c:25
    refcount_warn_saturate+0xb1/0x120
    CPU: 3 UID: 0 PID: 819 Comm: cat Not tainted 6.12.0-rc3+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
    1.16.1-2.fc37 04/01/2014
    RIP: 0010:refcount_warn_saturate+0xb1/0x120
    ...
    Call Trace:
     <TASK>
     e_show+0x20b/0x230 [nfsd]
     seq_read_iter+0x589/0x770
     seq_read+0x1e5/0x270
     vfs_read+0x125/0x530
     ksys_read+0xc1/0x160
     do_syscall_64+0x5f/0x170
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Fixes: bf18f163e89c ("NFSD: Using exp_get for export getting")
    Cc: [email protected] # 4.20+
    Signed-off-by: Yang Erkun <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nvmem: core: Check read_only flag for force_ro in bin_attr_nvmem_write() [+ + +]

Author: Marek Vasut <[email protected]>
Date:   Wed Oct 30 14:02:53 2024 +0000

    nvmem: core: Check read_only flag for force_ro in bin_attr_nvmem_write()
    
    commit da9596955c05966768364ab1cad2f43fcddc6f06 upstream.
    
    The bin_attr_nvmem_write() must check the read_only flag and block
    writes on read-only devices, now that a nvmem device can be switched
    between read-write and read-only mode at runtime using the force_ro
    attribute. Add the missing check.
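    
    A minimal userspace sketch of such a write path follows. The struct
    nv_model type, the nv_model_write() helper and the choice of -EPERM
    as the error code are assumptions made for illustration; they are not
    taken from the nvmem core.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced device: a buffer plus a runtime read-only flag. */
struct nv_model {
	unsigned char data[16];
	bool read_only;
};

/* Returns the number of bytes written, or a negative errno value. */
static long nv_model_write(struct nv_model *nv, size_t pos,
			   const void *buf, size_t count)
{
	/* Checked on every write because the flag can change at runtime. */
	if (nv->read_only)
		return -EPERM;
	if (pos >= sizeof(nv->data))
		return -EFBIG;
	if (count > sizeof(nv->data) - pos)
		count = sizeof(nv->data) - pos;

	memcpy(nv->data + pos, buf, count);
	return (long)count;
}
```

    Checking the flag inside the write handler, rather than only when the
    attribute is created, is what keeps a later switch to read-only
    effective.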
    
    Fixes: 9d7eb234ac7a ("nvmem: core: Implement force_ro sysfs attribute")
    Cc: [email protected]
    Signed-off-by: Marek Vasut <[email protected]>
    Signed-off-by: Srinivas Kandagatla <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ovl: Filter invalid inodes with missing lookup function [+ + +]

Author: Vasiliy Kovalev <[email protected]>
Date:   Tue Nov 19 18:58:17 2024 +0300

    ovl: Filter invalid inodes with missing lookup function
    
    commit c8b359dddb418c60df1a69beea01d1b3322bfe83 upstream.
    
    Add a check to the ovl_dentry_weird() function to prevent the
    processing of directory inodes that lack the lookup function.
    This is important because such inodes can cause errors in overlayfs
    when passed to the lowerstack.
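    
    The shape of such a filter can be sketched as follows. This is a
    simplified model: struct inode_model, its fields and both functions
    are hypothetical, and the real ovl_dentry_weird() checks more
    conditions than the one shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced inode: a directory flag and an optional hook. */
struct inode_model {
	bool is_dir;
	int (*lookup)(const char *name);
};

/* Example hook so that a well-formed directory can be constructed. */
static int model_dummy_lookup(const char *name)
{
	(void)name;
	return 0;
}

/* A directory without a lookup operation is rejected as a layer entry. */
static bool inode_model_weird(const struct inode_model *inode)
{
	return inode->is_dir && inode->lookup == NULL;
}
```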
    
    Reported-by: [email protected]
    Link: https://syzkaller.appspot.com/bug?extid=a8c9d476508bd14a90e5
    Suggested-by: Miklos Szeredi <[email protected]>
    Link: https://lore.kernel.org/linux-unionfs/CAJfpegvx-oS9XGuwpJx=Xe28_jzWx5eRo1y900_ZzWY+=gGzUg@mail.gmail.com/
    Signed-off-by: Vasiliy Kovalev <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ovl: properly handle large files in ovl_security_fileattr [+ + +]

Author: Oleksandr Tymoshenko <[email protected]>
Date:   Wed Oct 30 00:28:55 2024 +0000

    ovl: properly handle large files in ovl_security_fileattr
    
    commit 3b6b99ef15ea37635604992ede9ebcccef38a239 upstream.
    
    dentry_open in ovl_security_fileattr fails for any file
    larger than 2GB if open method of the underlying filesystem
    calls generic_file_open (e.g. fusefs).
    
    The issue can be reproduced using the following script
    (passthrough_ll is an example app from libfuse):
    
      $ D=/opt/test/mnt
      $ mkdir -p ${D}/{source,base,top/uppr,top/work,ovlfs}
      $ dd if=/dev/zero of=${D}/source/zero.bin bs=1G count=2
      $ passthrough_ll -o source=${D}/source ${D}/base
      $ mount -t overlay overlay \
          -olowerdir=${D}/base,upperdir=${D}/top/uppr,workdir=${D}/top/work \
          ${D}/ovlfs
      $ chmod 0777 ${D}/ovlfs/zero.bin
    
    Running this script results in a "Value too large for defined data type"
    error message from chmod.
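    
    The failure mode can be modelled in a few lines of C. The sketch
    assumes that the size check resembles the one in generic_file_open(),
    which refuses files above 2 GiB - 1 unless large-file support was
    requested; MODEL_MAX_NON_LFS and model_open_size_check() are
    hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest size that can be opened without large-file support. */
#define MODEL_MAX_NON_LFS ((int64_t)INT32_MAX)

/* Returns 0 when the open may proceed, or -EOVERFLOW otherwise. */
static int model_open_size_check(int64_t file_size, bool largefile)
{
	if (!largefile && file_size > MODEL_MAX_NON_LFS)
		return -EOVERFLOW;
	return 0;
}
```

    EOVERFLOW is the error that chmod reports as "Value too large for
    defined data type", and the 2 GiB file created by the dd command
    above is exactly one byte over the limit.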
    
    Signed-off-by: Oleksandr Tymoshenko <[email protected]>
    Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags")
    Cc: [email protected] # v5.15+
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: dwc: ep: Fix advertised resizable BAR size regression [+ + +]

Author: Niklas Cassel <[email protected]>
Date:   Sat Nov 16 01:59:51 2024 +0100

    PCI: dwc: ep: Fix advertised resizable BAR size regression
    
    commit 118397c9baaac0b7ec81896f8d755d09aa82c485 upstream.
    
    The advertised resizable BAR size was fixed in commit 72e34b8593e0 ("PCI:
    dwc: endpoint: Fix advertised resizable BAR size").
    
    Commit 867ab111b242 ("PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown()
    API to handle Link Down event") was included shortly after this, and
    moved the code to another function. When the code was moved, this fix
    was mistakenly lost.
    
    According to the spec, it is illegal to not have a bit set in
    PCI_REBAR_CAP, and 1 MB is the smallest size allowed.
    
    So, set bit 4 in PCI_REBAR_CAP, so that we actually advertise support
    for a 1 MB BAR size.
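    
    As an illustration of the bit layout, the helper below computes the
    capability bit for a given size. Per the text above, bit 4 stands for
    1 MB; the sketch additionally assumes that each higher bit doubles
    the size. MODEL_REBAR_CAP_SHIFT and model_rebar_cap_bit() are
    hypothetical names.

```c
#include <stdint.h>

/* Bit 4 of the capability register stands for a 1 MB BAR. */
#define MODEL_REBAR_CAP_SHIFT 4

/*
 * Capability bit advertising a BAR size of 2^log2_mb megabytes.
 * log2_mb must be at most 27 so that the result fits in 32 bits.
 */
static uint32_t model_rebar_cap_bit(unsigned int log2_mb)
{
	return UINT32_C(1) << (MODEL_REBAR_CAP_SHIFT + log2_mb);
}
```

    With this encoding, model_rebar_cap_bit(0) is the value that has to
    be set to advertise the 1 MB minimum.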
    
    Fixes: 867ab111b242 ("PCI: dwc: ep: Add a generic dw_pcie_ep_linkdown() API to handle Link Down event")
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Niklas Cassel <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: endpoint: Clear secondary (not primary) EPC in pci_epc_remove_epf() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Thu Nov 7 08:53:09 2024 +0800

    PCI: endpoint: Clear secondary (not primary) EPC in pci_epc_remove_epf()
    
    commit 688d2eb4c6fcfdcdaed0592f9df9196573ff5ce2 upstream.
    
    In addition to a primary endpoint controller, an endpoint function may be
    associated with a secondary endpoint controller, epf->sec_epc, to provide
    NTB (non-transparent bridge) functionality.
    
    Previously, pci_epc_remove_epf() incorrectly cleared epf->epc instead of
    epf->sec_epc when removing from the secondary endpoint controller.
    
    Extend the epc->list_lock coverage and clear either epf->epc or
    epf->sec_epc as indicated.
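    
    The intended behaviour can be sketched as below. The types and the
    model_remove_epf() helper are hypothetical simplifications, and the
    locking mentioned above is left out of the model.

```c
#include <stddef.h>

enum model_epc_type { MODEL_PRIMARY, MODEL_SECONDARY };

struct model_epc {
	int id;
};

struct model_epf {
	struct model_epc *epc;     /* primary controller */
	struct model_epc *sec_epc; /* secondary controller */
};

/* Clear only the association that matches the interface being removed. */
static void model_remove_epf(struct model_epf *epf, enum model_epc_type type)
{
	if (type == MODEL_PRIMARY)
		epf->epc = NULL;
	else
		epf->sec_epc = NULL;
}
```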
    
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 63840ff53223 ("PCI: endpoint: Add support to associate secondary EPC with EPF")
    Signed-off-by: Zijun Hu <[email protected]>
    Reviewed-by: Manivannan Sadhasivam <[email protected]>
    [mani: reworded subject and description]
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    [bhelgaas: commit log]
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: endpoint: Fix PCI domain ID release in pci_epc_destroy() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Thu Nov 7 08:53:08 2024 +0800

    PCI: endpoint: Fix PCI domain ID release in pci_epc_destroy()
    
    commit 4acc902ed3743edd4ac2d3846604a99d17104359 upstream.
    
    pci_epc_destroy() invokes pci_bus_release_domain_nr() to release the PCI
    domain ID, but there are two issues:
    
      - 'epc->dev' is passed to pci_bus_release_domain_nr() which was already
        freed by device_unregister(), leading to a use-after-free issue.
    
      - Domain ID corresponds to the EPC device parent, so passing 'epc->dev'
        is also wrong.
    
    Fix these issues by passing 'epc->dev.parent' to
    pci_bus_release_domain_nr() and also do it before device_unregister().
    
    Fixes: 0328947c5032 ("PCI: endpoint: Assign PCI domain number for endpoint controllers")
    Signed-off-by: Zijun Hu <[email protected]>
    Reviewed-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [mani: reworded subject and description]
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: imx6: Fix suspend/resume support on i.MX6QDL [+ + +]

Author: Stefan Eichenberger <[email protected]>
Date:   Wed Oct 30 11:32:45 2024 +0100

    PCI: imx6: Fix suspend/resume support on i.MX6QDL
    
    commit 0a726f542d7c8cc0f9c5ed7df5a4bd4b59ac21b3 upstream.
    
    The suspend/resume functionality is currently broken on the i.MX6QDL
    platform, as documented in the NXP errata (ERR005723):
    
      https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf
    
    This patch addresses the issue by sharing most of the suspend/resume
    sequences used by other i.MX devices, while avoiding modifications to
    critical registers that disrupt the PCIe functionality. It targets the
    same problem as the following downstream commit:
    
      https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995
    
    Unlike the downstream commit, this patch also resets the connected PCIe
    device if possible. Without this reset, certain drivers, such as ath10k
    or iwlwifi, will crash on resume. The device reset is also done by the
    driver on other i.MX platforms, making this patch consistent with
    existing practices.
    
    Upon resuming, the kernel will hang and display an error. Here's an
    example of the error encountered with the ath10k driver:
    
      ath10k_pci 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible
      Unhandled fault: imprecise external abort (0x1406) at 0x0106f944
    
    Without this patch, suspend/resume will fail on i.MX6QDL devices if a
    PCIe device is connected.
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Stefan Eichenberger <[email protected]>
    [kwilczynski: commit log, added tag for stable releases]
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Reviewed-by: Manivannan Sadhasivam <[email protected]>
    Acked-by: Richard Zhu <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: keystone: Add link up check to ks_pcie_other_map_bus() [+ + +]

Author: Kishon Vijay Abraham I <[email protected]>
Date:   Fri May 24 16:27:14 2024 +0530

    PCI: keystone: Add link up check to ks_pcie_other_map_bus()
    
    commit 9e9ec8d8692a6f64d81ef67d4fb6255af6be684b upstream.
    
    K2G forwards the error triggered by a link-down state (e.g., no connected
    endpoint device) on the system bus for PCI configuration transactions;
    these errors are reported as an SError at system level, which is fatal and
    hangs the system.
    
    So, apply a fix similar to the one in the DesignWare Core driver
    commit 15b23906347c ("PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()").
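    
    A link-up check in a map_bus-style accessor can be sketched as
    follows. This is a userspace model under stated assumptions: struct
    model_port and model_map_bus() are hypothetical, and returning NULL
    is assumed to make the caller skip the configuration access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port: link state plus a small config space image. */
struct model_port {
	bool link_up;
	uint8_t cfg_space[256];
};

/* Returns a pointer into config space, or NULL to suppress the access. */
static uint8_t *model_map_bus(struct model_port *port, unsigned int where)
{
	/* Never touch the bus while the link is down. */
	if (!port->link_up)
		return NULL;
	if (where >= sizeof(port->cfg_space))
		return NULL;
	return &port->cfg_space[where];
}
```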
    
    Fixes: 10a797c6e54a ("PCI: dwc: keystone: Use pci_ops for config space accessors")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Kishon Vijay Abraham I <[email protected]>
    Signed-off-by: Siddharth Vadapalli <[email protected]>
    [kwilczynski: commit log, added tag for stable releases]
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: keystone: Set mode as Root Complex for "ti,keystone-pcie" compatible [+ + +]

Author: Kishon Vijay Abraham I <[email protected]>
Date:   Fri May 24 16:27:13 2024 +0530

    PCI: keystone: Set mode as Root Complex for "ti,keystone-pcie" compatible
    
    commit 5a938ed9481b0c06cb97aec45e722a80568256fd upstream.
    
    Commit 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x
    Platforms") introduced configuring "enum dw_pcie_device_mode" as part of
    device data ("struct ks_pcie_of_data"). However, it failed to set the
    mode for the "ti,keystone-pcie" compatible.
    
    Since the mode defaults to "DW_PCIE_UNKNOWN_TYPE", the following error
    message is displayed for the v3.65a controller:
    
      "INVALID device type 0"
    
    Despite the driver probing successfully, the controller may not be
    functional in the Root Complex mode of operation.
    
    So, set the mode as Root Complex for "ti,keystone-pcie" compatible to
    fix this.
    
    Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Kishon Vijay Abraham I <[email protected]>
    Signed-off-by: Siddharth Vadapalli <[email protected]>
    [kwilczynski: commit log, added tag for stable releases]
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: of_property: Assign PCI instead of CPU bus address to dynamic PCI nodes [+ + +]

Author: Andrea della Porta <[email protected]>
Date:   Fri Nov 8 10:42:56 2024 +0100

    PCI: of_property: Assign PCI instead of CPU bus address to dynamic PCI nodes
    
    commit 5e316d34b53039346e252d0019e2f4167af2c0ef upstream.
    
    When populating "ranges" property for a PCI bridge or endpoint,
    of_pci_prop_ranges() incorrectly uses the CPU address of the resource.  In
    such PCI nodes, the window should instead be in PCI address space. Call
    pci_bus_address() on the resource in order to obtain the PCI bus address.
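    
    The distinction between the two address spaces can be illustrated
    with a small translation helper. The sketch assumes a host bridge
    window that maps CPU addresses to PCI bus addresses at a fixed
    offset; struct model_window and model_cpu_to_bus() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical host bridge window with a fixed CPU-to-bus offset. */
struct model_window {
	uint64_t cpu_start;
	uint64_t bus_start;
	uint64_t size;
};

/* Translate a CPU address inside the window to its PCI bus address. */
static uint64_t model_cpu_to_bus(const struct model_window *w,
				 uint64_t cpu_addr)
{
	return cpu_addr - w->cpu_start + w->bus_start;
}
```

    When cpu_start and bus_start differ, writing the CPU address into
    "ranges" would describe a window that does not exist on the PCI side.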
    
    [Previous discussion at:
    https://lore.kernel.org/all/8b4fa91380fc4754ea80f47330c613e4f6b6592c.1724159867.git.andrea.porta@suse.com/]
    
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
    Tested-by: Herve Codina <[email protected]>
    Signed-off-by: Andrea della Porta <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: qcom: Disable ASPM L0s for X1E80100 [+ + +]

Author: Qiang Yu <[email protected]>
Date:   Thu Oct 31 20:09:01 2024 -0700

    PCI: qcom: Disable ASPM L0s for X1E80100
    
    commit fba6045161d686adc102b6ef71b2fd1e5f90a616 upstream.
    
    Currently, cfg_1_9_0, which is used for X1E80100, doesn't disable
    ASPM L0s. However, the hardware team recommends disabling L0s because the
    PHY init sequence is not tuned to support L0s. Hence, reuse cfg_sc8280xp
    for X1E80100.
    
    Note that the config_sid() callback is not present in cfg_sc8280xp. This
    is not a concern because the config_sid() callback was originally a no-op
    for X1E80100.
    
    Fixes: 6d0c39324c5f ("PCI: qcom: Add X1E80100 PCIe support")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Qiang Yu <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Reviewed-by: Johan Hovold <[email protected]>
    Reviewed-by: Manivannan Sadhasivam <[email protected]>
    Cc: <[email protected]> # 6.9
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

PCI: rockchip-ep: Fix address translation unit programming [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Thu Oct 17 10:58:36 2024 +0900

    PCI: rockchip-ep: Fix address translation unit programming
    
    commit 64f093c4d99d797b68b407a9d8767aadc3e3ea7a upstream.
    
    The Rockchip PCIe endpoint controller handles PCIe transfer addresses
    by masking the lower bits of the programmed PCI address and using the
    same number of lower bits masked from the CPU address space used for the
    mapping. For a PCI mapping of <size> bytes starting from <pci_addr>,
    the number of bits masked is the number of address bits changing in the
    address range [pci_addr..pci_addr + size - 1].
    
    However, rockchip_pcie_prog_ep_ob_atu() calculates num_pass_bits only
    using the size of the mapping, resulting in an incorrect number of mask
    bits depending on the value of the PCI address to map.
    
    Fix this by introducing the helper function
    rockchip_pcie_ep_ob_atu_num_bits() to correctly calculate the number of
    mask bits to use to program the address translation unit. The number of
    mask bits is calculated depending on both the PCI address and size of
    the mapping, and clamped between 8 and 20 using the macros
    ROCKCHIP_PCIE_AT_MIN_NUM_BITS and ROCKCHIP_PCIE_AT_MAX_NUM_BITS. As
    defined in the Rockchip RK3399 TRM V1.3 Part2, Sections 17.5.5.1.1 and
    17.6.8.2.1, this clamping is necessary because:
    
      1) The lower 8 bits of the PCI address to be mapped by the outbound
         region are ignored. So a minimum of 8 address bits are needed and
         imply that the PCI address must be aligned to 256.
    
      2) The outbound memory regions are 1MB in size. So while we can specify
         up to 63 bits for the PCI address (num_bits field uses bits 0 to 5 of
         the outbound address region 0 register), we must limit the number of
         valid address bits to 20 to match the memory window maximum size (1
         << 20 = 1MB).
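    
    The calculation described above can be sketched in userspace C. The
    names below are hypothetical stand-ins for the helper and macros
    named in this commit, and the use of an XOR followed by a
    find-last-set operation is an assumption about how "the number of
    address bits changing" is computed.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_AT_MIN_NUM_BITS 8
#define MODEL_AT_MAX_NUM_BITS 20

/* Position of the highest set bit, counting from 1; 0 if none is set. */
static unsigned int model_fls64(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

static unsigned int model_ob_atu_num_bits(uint64_t pci_addr, size_t size)
{
	/*
	 * The highest bit that differs between the first and the last
	 * address of the range bounds the bits that change within it.
	 */
	unsigned int bits = model_fls64(pci_addr ^ (pci_addr + size - 1));

	if (bits < MODEL_AT_MIN_NUM_BITS)
		return MODEL_AT_MIN_NUM_BITS;
	if (bits > MODEL_AT_MAX_NUM_BITS)
		return MODEL_AT_MAX_NUM_BITS;
	return bits;
}
```

    For a 4 KB mapping at PCI address 0x800 the range is 0x800..0x17ff,
    so 13 bits change, whereas a size-only calculation would yield 12.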
    
    Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Damien Le Moal <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf jevents: fix breakage when do perf stat on system metric [+ + +]

Author: Xu Yang <[email protected]>
Date:   Thu Nov 7 08:20:28 2024 -0800

    perf jevents: fix breakage when do perf stat on system metric
    
    commit 4a159e6049f319bef6f9e6d2ccdd322f57d24830 upstream.
    
    When running perf stat on a sys metric, the perf tool currently outputs nothing:
    
      $ perf stat -a -M imx95_ddr_read.all -I 1000
      $
    
    This command runs on an arm64 machine whose SoC has one DDR hardware PMU
    in addition to one armv8_cortex_a55 PMU. Their maps are as follows:
    
    const struct pmu_events_map pmu_events_map[] = {
    {
            .arch = "arm64",
            .cpuid = "0x00000000410fd050",
            .event_table = {
                    .pmus = pmu_events__arm_cortex_a55,
                    .num_pmus = ARRAY_SIZE(pmu_events__arm_cortex_a55)
            },
            .metric_table = {
                    .pmus = NULL,
                    .num_pmus = 0
            }
    },
    
    static const struct pmu_sys_events pmu_sys_event_tables[] = {
    {
            .event_table = {
                    .pmus = pmu_events__freescale_imx95_sys,
                    .num_pmus = ARRAY_SIZE(pmu_events__freescale_imx95_sys)
            },
            .metric_table = {
                    .pmus = pmu_metrics__freescale_imx95_sys,
                    .num_pmus = ARRAY_SIZE(pmu_metrics__freescale_imx95_sys)
            },
            .name = "pmu_events__freescale_imx95_sys",
    },
    
    Currently, pmu_metrics_table__find() returns NULL when perf stat is run
    only on a sys metric. parse_groups() is then never called to parse the
    sys metric_name, and the perf tool exits directly. This should be a
    common problem.
    
    To fix the issue, restore the logic from before commit f20c15d13f01
    ("perf pmu-events: Remember the perf_events_map for a PMU") and return
    an empty metric table rather than a NULL pointer.
    
    This should be fine since the removed part only checks whether the table
    matches the provided metric_name. Without this code, parse_groups()
    still checks the validity of metric_name.
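    
    The "empty table instead of NULL" pattern can be sketched as below.
    The types and the model_find_table() helper are hypothetical and
    much simpler than perf's generated tables; the point is only that
    callers can keep going when no CPU-specific table exists.

```c
#include <stddef.h>

/* Hypothetical, reduced metric table: a list of metric names. */
struct model_metric_table {
	const char *const *names;
	size_t num;
};

static const struct model_metric_table model_empty_table = { NULL, 0 };

/* Never returns NULL, so callers can always continue with other tables. */
static const struct model_metric_table *
model_find_table(const struct model_metric_table *cpu_table)
{
	return cpu_table ? cpu_table : &model_empty_table;
}
```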
    
    Fixes: f20c15d13f017d4b ("perf pmu-events: Remember the perf_events_map for a PMU")
    Reviewed-by: James Clark <[email protected]>
    Signed-off-by: Xu Yang <[email protected]>
    Tested-by: Xu Yang <[email protected]>
    Acked-by: Ian Rogers <[email protected]>
    Cc: Adrian Hunter <[email protected]>
    Cc: Albert Ou <[email protected]>
    Cc: Alexander Shishkin <[email protected]>
    Cc: Alexandre Ghiti <[email protected]>
    Cc: Athira Rajeev <[email protected]>
    Cc: Benjamin Gray <[email protected]>
    Cc: Ben Zong-You Xie <[email protected]>
    Cc: Bibo Mao <[email protected]>
    Cc: Clément Le Goffic <[email protected]>
    Cc: Dima Kogan <[email protected]>
    Cc: Dr. David Alan Gilbert <[email protected]>
    Cc: Huacai Chen <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: John Garry <[email protected]>
    Cc: Kan Liang <[email protected]>
    Cc: Leo Yan <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mike Leach <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Palmer Dabbelt <[email protected]>
    Cc: Paul Walmsley <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Ravi Bangoria <[email protected]>
    Cc: Sandipan Das <[email protected]>
    Cc: Will Deacon <[email protected]>
    Cc: Yicong Yang <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ian Rogers <[email protected]>
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

posix-timers: Target group sigqueue to current task only if not exiting [+ + +]

Author: Frederic Weisbecker <[email protected]>
Date:   Sat Nov 23 00:48:11 2024 +0100

    posix-timers: Target group sigqueue to current task only if not exiting
    
    commit 63dffecfba3eddcf67a8f76d80e0c141f93d44a5 upstream.
    
    A sigqueue belonging to a posix timer, whose target is not a specific
    thread but a whole thread group, is preferably targeted to the current
    task if it is part of that thread group.
    
    However, nothing prevents a posix timer event from queueing such a
    sigqueue from a reaped yet running task. The interruptible code space
    between exit_notify() and the final call to schedule() is enough for
    posix_timer_fn() hrtimer to fire.
    
    If that happens while the current task is part of the thread group
    target, the current task is proposed to handle the signal. However,
    since its sighand pointer may have been cleared already, the sigqueue
    is dropped even if there are other tasks running within the group that
    could handle it.
    
    As a result, posix timers with a thread-group-wide target may miss
    signals when some of their threads are exiting.
    
    Fix this by verifying that the current task hasn't been through
    exit_notify() before proposing it as a preferred target so as to ensure
    that its sighand is still here and stable.
    
    complete_signal() might still reconsider the choice and find a better
    target within the group if current has passed retarget_shared_pending()
    already.
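    
    The target selection described above can be modelled as follows.
    struct model_task, its fields and model_pick_target() are
    hypothetical; the exit_notified flag stands in for whatever state
    the kernel uses to tell that a task has passed exit_notify().

```c
#include <stdbool.h>

/* Hypothetical, reduced task: group membership and exit progress. */
struct model_task {
	int tgid;           /* thread group id */
	bool exit_notified; /* set once exit notification has happened */
};

/* Prefer the current task only if it is in the group and not exiting. */
static const struct model_task *
model_pick_target(const struct model_task *dflt,
		  const struct model_task *current_task)
{
	if (current_task->tgid == dflt->tgid && !current_task->exit_notified)
		return current_task;
	return dflt;
}
```

    Falling back to the default target keeps the signal deliverable to
    the rest of the group when the current task is already exiting.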
    
    Fixes: bcb7ee79029d ("posix-timers: Prefer delivery of signals to the current thread")
    Reported-by: Anthony Mallet <[email protected]>
    Suggested-by: Oleg Nesterov <[email protected]>
    Signed-off-by: Frederic Weisbecker <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Oleg Nesterov <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Closes: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

powerpc/vdso: Drop -mstack-protector-guard flags in 32-bit files with clang [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Wed Oct 30 11:41:37 2024 -0700

    powerpc/vdso: Drop -mstack-protector-guard flags in 32-bit files with clang
    
    commit d677ce521334d8f1f327cafc8b1b7854b0833158 upstream.
    
    Under certain conditions, the 64-bit '-mstack-protector-guard' flags may
    end up in the 32-bit vDSO flags, resulting in build failures due to the
    structure of clang's argument parsing of the stack protector options,
    which validates the arguments of the stack protector guard flags
    unconditionally in the frontend, choking on the 64-bit values when
    targeting 32-bit:
    
      clang: error: invalid value 'r13' in 'mstack-protector-guard-reg=', expected one of: r2
      clang: error: invalid value 'r13' in 'mstack-protector-guard-reg=', expected one of: r2
      make[3]: *** [arch/powerpc/kernel/vdso/Makefile:85: arch/powerpc/kernel/vdso/vgettimeofday-32.o] Error 1
      make[3]: *** [arch/powerpc/kernel/vdso/Makefile:87: arch/powerpc/kernel/vdso/vgetrandom-32.o] Error 1
    
    Remove these flags by adding them to the CC32FLAGSREMOVE variable, which
    already handles situations similar to this. Additionally, reformat and
    align a comment better for the expanding CONFIG_CC_IS_CLANG block.
    
    Cc: [email protected] # v6.1+
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://patch.msgid.link/20241030-powerpc-vdso-drop-stackp-flags-clang-v1-1-d95e7376d29c@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

powerpc: Adjust adding stack protector flags to KBUILD_CFLAGS for clang [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Wed Oct 9 12:26:09 2024 -0700

    powerpc: Adjust adding stack protector flags to KBUILD_CFLAGS for clang
    
    commit bee08a9e6ab03caf14481d97b35a258400ffab8f upstream.
    
    After fixing the HAVE_STACKPROTECTOR checks for clang's in-progress
    per-task stack protector support [1], the build fails during prepare0
    because '-mstack-protector-guard-offset' has not been added to
    KBUILD_CFLAGS yet but the other '-mstack-protector-guard' flags have.
    
      clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default
      clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default
      make[4]: *** [scripts/Makefile.build:229: scripts/mod/empty.o] Error 1
      make[4]: *** [scripts/Makefile.build:102: scripts/mod/devicetable-offsets.s] Error 1
    
    Mirror other architectures and add all '-mstack-protector-guard' flags
    to KBUILD_CFLAGS atomically during stack_protector_prepare, which
    resolves the issue and allows clang's implementation to fully work with
    the kernel.
    
    Cc: [email protected] # 6.1+
    Link: https://github.com/llvm/llvm-project/pull/110928 [1]
    Reviewed-by: Keith Packard <[email protected]>
    Tested-by: Keith Packard <[email protected]>
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://patch.msgid.link/20241009-powerpc-fix-stackprotector-test-clang-v2-2-12fb86b31857@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

powerpc: Fix stack protector Kconfig test for clang [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Wed Oct 9 12:26:08 2024 -0700

    powerpc: Fix stack protector Kconfig test for clang
    
    commit 46e1879deea22eed31e9425d58635895fc0e8040 upstream.
    
    Clang's in-progress per-task stack protector support [1] does not work
    with the current Kconfig checks because '-mstack-protector-guard-offset'
    is not provided, unlike all other architecture Kconfig checks.
    
      $ fd Kconfig -x rg -l mstack-protector-guard-offset
      ./arch/arm/Kconfig
      ./arch/riscv/Kconfig
      ./arch/arm64/Kconfig
    
    This produces an error from clang, which is interpreted as the flags not
    being supported at all when they really are.
    
      $ clang --target=powerpc64-linux-gnu \
              -mstack-protector-guard=tls \
              -mstack-protector-guard-reg=r13 \
              -c -o /dev/null -x c /dev/null
      clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default
    
    This argument will always be provided by the build system, so mirror
    other architectures and use '-mstack-protector-guard-offset=0' for
    testing support, which fixes the issue for clang and does not regress
    support with GCC.
    
    Even with the first problem addressed, the 32-bit test continues to fail
    because Kbuild uses the powerpc64le-linux-gnu target for clang and
    nothing flips the target to 32-bit, resulting in an error about an
    invalid register value:
    
      $ clang --target=powerpc64le-linux-gnu \
              -mstack-protector-guard=tls \
              -mstack-protector-guard-reg=r2 \
              -mstack-protector-guard-offset=0 \
              -x c -c -o /dev/null /dev/null
      clang: error: invalid value 'r2' in 'mstack-protector-guard-reg=', expected one of: r13
    
    While GCC allows arbitrary registers, the implementation of
    '-mstack-protector-guard=tls' in LLVM shares the same code path as the
    user space thread local storage implementation, which uses a fixed
    register (2 for 32-bit and 13 for 64-bit), so the command line parsing
    enforces this limitation.
    
    Use the Kconfig macro '$(m32-flag)', which expands to '-m32' when
    supported, in the stack protector support cc-option call to properly
    switch the target to a 32-bit one, which matches what happens in Kbuild.
    While the 64-bit macro does not strictly need it, add the equivalent
    64-bit option for symmetry.
    
    Cc: [email protected] # 6.1+
    Link: https://github.com/llvm/llvm-project/pull/110928 [1]
    Reviewed-by: Keith Packard <[email protected]>
    Tested-by: Keith Packard <[email protected]>
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://patch.msgid.link/20241009-powerpc-fix-stackprotector-test-clang-v2-1-12fb86b31857@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

quota: flush quota_release_work upon quota writeback [+ + +]

Author: Ojaswin Mujoo <[email protected]>
Date:   Thu Nov 21 18:08:54 2024 +0530

    quota: flush quota_release_work upon quota writeback
    
    [ Upstream commit ac6f420291b3fee1113f21d612fa88b628afab5b ]
    
    One of the paths quota writeback is called from is:
    
    freeze_super()
      sync_filesystem()
        ext4_sync_fs()
          dquot_writeback_dquots()
    
    Since we currently don't always flush the quota_release_work queue in
    this path, we can end up with the following race:
    
     1. dquots are added to the releasing_dquots list during regular operations.
     2. FS Freeze starts, however, this does not flush the quota_release_work queue.
     3. Freeze completes.
     4. Kernel eventually tries to flush the workqueue while FS is frozen which
        hits a WARN_ON since transaction gets started during frozen state:
    
      ext4_journal_check_start+0x28/0x110 [ext4] (unreliable)
      __ext4_journal_start_sb+0x64/0x1c0 [ext4]
      ext4_release_dquot+0x90/0x1d0 [ext4]
      quota_release_workfn+0x43c/0x4d0
    
    Which is the following line:
    
      WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
    
    This ultimately results in generic/390 failing due to dmesg
    noise. This was detected on a powerpc machine with 15 cores.
    
    To avoid this, make sure to flush the workqueue during
    dquot_writeback_dquots() so we don't have any pending work items after
    freeze.
    
    Reported-by: Disha Goel <[email protected]>
    CC: [email protected]
    Fixes: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide")
    Reviewed-by: Baokun Li <[email protected]>
    Signed-off-by: Ojaswin Mujoo <[email protected]>
    Signed-off-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: qcom_q6v5_pas: disable auto boot for wpss [+ + +]

Author: Balaji Pothunoori <[email protected]>
Date:   Fri Oct 18 16:29:11 2024 +0530

    remoteproc: qcom_q6v5_pas: disable auto boot for wpss
    
    commit 8a47704d64c9afda80e7f399ba2cf898cfcc45b2 upstream.
    
    Currently, the rproc "atomic_t power" variable is incremented during:
    a. WPSS rproc auto boot.
    b. AHB power on for ath11k.
    
    During AHB power off (rmmod ath11k_ahb.ko), rproc_shutdown fails
    to unload the WPSS firmware because the rproc->power value is '2',
    causing the atomic_dec_and_test(&rproc->power) condition to fail.
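    
    The counting problem can be sketched in user-space C. This is only an
    illustration, assuming a simplified model: 'struct fake_rproc',
    fake_boot() and fake_shutdown() are hypothetical names, and C11 atomics
    stand in for the kernel's atomic_t helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rproc "power" reference count. */
struct fake_rproc {
	atomic_int power;
	bool firmware_loaded;
};

/* Models atomic_dec_and_test(): true only when the count drops to zero. */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Every caller takes a reference; only the first one loads firmware. */
static void fake_boot(struct fake_rproc *r)
{
	if (atomic_fetch_add(&r->power, 1) == 0)
		r->firmware_loaded = true;
}

/* Returns 1 if the firmware was unloaded, 0 if a reference remains. */
static int fake_shutdown(struct fake_rproc *r)
{
	if (!dec_and_test(&r->power))
		return 0;
	r->firmware_loaded = false;
	return 1;
}
```

    With auto boot and the ath11k power-on both calling fake_boot(), the
    count is 2, so a single fake_shutdown() leaves the firmware loaded.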
    
    Consequently, during AHB power on (insmod ath11k_ahb.ko),
    QMI_WLANFW_HOST_CAP_REQ_V01 fails due to the host and firmware QMI
    states being out of sync.
    
    Fixes: 300ed425dfa9 ("remoteproc: qcom_q6v5_pas: Add SC7280 ADSP, CDSP & WPSS")
    Cc: [email protected]
    Signed-off-by: Balaji Pothunoori <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable" [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Nov 14 16:23:45 2024 -0500

    Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"
    
    commit 979bfe291b5b30a9132c2fd433247e677b24c6aa upstream.
    
    This reverts commit 949658cb9b69ab9d22a42a662b2fdc7085689ed8.
    
    This causes a blank screen on boot.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3696
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: Shixiong Ou <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs" [+ + +]

Author: Ashutosh Dixit <[email protected]>
Date:   Tue Oct 29 13:01:47 2024 -0700

    Revert "drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs"
    
    commit 0191fddf53748cf2b473d78faeabe6dcb47689d2 upstream.
    
    This reverts commit 55858fa7eb2f163f7aa34339fd3399ba4ff564c6.
    
    '55858fa7eb2f ("drm/xe/xe_guc_ads: save/restore OA registers and allowlist
    regs")' was not properly reviewed and also causes dmesg asserts in
    CI. Revert it.
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3295
    Fixes: 55858fa7eb2f ("drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs")
    Signed-off-by: Ashutosh Dixit <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Signed-off-by: Matt Roper <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

s390/entry: Mark IRQ entries to fix stack depot warnings [+ + +]

Author: Vasily Gorbik <[email protected]>
Date:   Tue Nov 19 14:54:07 2024 +0100

    s390/entry: Mark IRQ entries to fix stack depot warnings
    
    commit 45c9f2b856a075a34873d00788d2e8a250c1effd upstream.
    
    The stack depot filters out everything outside of the top interrupt
    context as an uninteresting or irrelevant part of the stack traces. This
    helps with stack trace de-duplication, avoiding an explosion of saved
    stack traces that share the same IRQ context code path but originate
    from different randomly interrupted points, eventually exhausting the
    stack depot.
    
    Filtering uses in_irqentry_text() to identify functions within the
    .irqentry.text and .softirqentry.text sections, which then become the
    last stack trace entries being saved.
    
    While __do_softirq() is placed into the .softirqentry.text section by
    common code, populating .irqentry.text is architecture-specific.
    
    Currently, the .irqentry.text section on s390 is empty, which prevents
    stack depot filtering and de-duplication and could result in warnings
    like:
    
    Stack depot reached limit capacity
    WARNING: CPU: 0 PID: 286113 at lib/stackdepot.c:252 depot_alloc_stack+0x39a/0x3c8
    
    with PREEMPT and KASAN enabled.
    
    Fix this by moving the IO/EXT interrupt handlers from .kprobes.text into
    the .irqentry.text section and updating the kprobes blacklist to include
    the .irqentry.text section.
    
    This is done only for asynchronous interrupts and explicitly not for
    program checks, which are synchronous and where the context beyond the
    program check is important to preserve. Despite machine checks being
    somewhat in between, they are extremely rare, and preserving context
    when possible is also of value.
    
    SVCs and Restart Interrupts are not relevant, one being always at the
    boundary to user space and the other being a one-time thing.
    
    IRQ entries filtering is also optionally used in ftrace function graph,
    where the same logic applies.
    
    Cc: [email protected] # 5.15+
    Reviewed-by: Heiko Carstens <[email protected]>
    Signed-off-by: Vasily Gorbik <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

s390/stacktrace: Use break instead of return statement [+ + +]

Author: Heiko Carstens <[email protected]>
Date:   Mon Nov 18 13:14:07 2024 +0100

    s390/stacktrace: Use break instead of return statement
    
    commit 588a9836a4ef7ec3bfcffda526dfa399637e6cfc upstream.
    
    arch_stack_walk_user_common() contains a return statement instead of a
    break statement in case store_ip() fails while trying to store a callchain
    entry of a user space process.
    This may lead to a missing pagefault_enable() call.
    
    If this happens any subsequent page fault of the process won't be resolved
    by the page fault handler and this in turn will lead to the process being
    killed.
    
    Use a break instead of a return statement to fix this.
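    
    A minimal user-space sketch of the pattern, under stated assumptions:
    'pf_disabled', fake_pagefault_disable(), fake_pagefault_enable() and
    fake_store_ip() are hypothetical stand-ins, not the s390 code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page-fault-disable nesting count. */
static int pf_disabled;

static void fake_pagefault_disable(void) { pf_disabled++; }
static void fake_pagefault_enable(void)  { pf_disabled--; }

/* Hypothetical store_ip(): fails once the callchain buffer is full. */
static bool fake_store_ip(unsigned long *buf, size_t cap, size_t *n,
			  unsigned long ip)
{
	if (*n >= cap)
		return false;
	buf[(*n)++] = ip;
	return true;
}

/* Buggy shape: the early return skips fake_pagefault_enable(). */
static void walk_with_return(const unsigned long *ips, size_t count,
			     unsigned long *buf, size_t cap, size_t *n)
{
	fake_pagefault_disable();
	for (size_t i = 0; i < count; i++) {
		if (!fake_store_ip(buf, cap, n, ips[i]))
			return;
	}
	fake_pagefault_enable();
}

/* Fixed shape: break leaves the loop and still reaches the enable call. */
static void walk_with_break(const unsigned long *ips, size_t count,
			    unsigned long *buf, size_t cap, size_t *n)
{
	fake_pagefault_disable();
	for (size_t i = 0; i < count; i++) {
		if (!fake_store_ip(buf, cap, n, ips[i]))
			break;
	}
	fake_pagefault_enable();
}
```

    After walk_with_return() hits a full buffer, the counter stays
    elevated; walk_with_break() always leaves it balanced.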
    
    Fixes: ebd912ff9919 ("s390/stacktrace: Merge perf_callchain_user() and arch_stack_walk_user()")
    Cc: [email protected]
    Reviewed-by: Jens Remus <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: ufs: exynos: Add check inside exynos_ufs_config_smu() [+ + +]

Author: Peter Griffin <[email protected]>
Date:   Thu Oct 31 15:00:23 2024 +0000

    scsi: ufs: exynos: Add check inside exynos_ufs_config_smu()
    
    commit c662cedea14efdcf373d8d886ec18019d50e0772 upstream.
    
    Move the EXYNOS_UFS_OPT_UFSPR_SECURE check inside
    exynos_ufs_config_smu().
    
    This way all call sites will benefit from the check. This fixes a bug
    currently in the exynos_ufs_resume() path on gs101, which calls
    exynos_ufs_config_smu() and ends up accessing registers that can only
    be accessed from the secure world, resulting in an SError.
    
    Fixes: d11e0a318df8 ("scsi: ufs: exynos: Add support for Tensor gs101 SoC")
    Signed-off-by: Peter Griffin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Reviewed-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: ufs: exynos: Fix hibern8 notify callbacks [+ + +]

Author: Peter Griffin <[email protected]>
Date:   Thu Oct 31 15:00:31 2024 +0000

    scsi: ufs: exynos: Fix hibern8 notify callbacks
    
    commit ceef938bbf8b93ba3a218b4adc244cde94b582aa upstream.
    
    v1 of the patch which introduced the ufshcd_vops_hibern8_notify()
    callback used a bool instead of an enum. In v2 this was updated to an
    enum based on the review feedback in [1].
    
    ufs-exynos hibernate calls have always been broken upstream as it
    follows the v1 bool implementation.
    
    Link: https://patchwork.kernel.org/project/linux-scsi/patch/[email protected]/ [1]
    Fixes: 55f4b1f73631 ("scsi: ufs: ufs-exynos: Add UFS host support for Exynos SoCs")
    Signed-off-by: Peter Griffin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Reviewed-by: Tudor Ambarus <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

slab: Fix too strict alignment check in create_cache() [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Wed Nov 20 13:46:21 2024 +0100

    slab: Fix too strict alignment check in create_cache()
    
    commit 9008fe8fad8255edfdbecea32d7eb0485d939d0d upstream.
    
    On m68k, where the minimum alignment of unsigned long is 2 bytes:
    
        Kernel panic - not syncing: __kmem_cache_create_args: Failed to create slab 'io_kiocb'. Error -22
        CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted 6.12.0-atari-03776-g7eaa1f99261a #1783
        Stack from 0102fe5c:
                0102fe5c 00514a2b 00514a2b ffffff00 00000001 0051f5ed 00425e78 00514a2b
                0041eb74 ffffffea 00000310 0051f5ed ffffffea ffffffea 00601f60 00000044
                0102ff20 000e7a68 0051ab8e 004383b8 0051f5ed ffffffea 000000b8 00000007
                01020c00 00000000 000e77f0 0041e5f0 005f67c0 0051f5ed 000000b6 0102fef4
                00000310 0102fef4 00000000 00000016 005f676c 0060a34c 00000010 00000004
                00000038 0000009a 01000000 000000b8 005f668e 0102e000 00001372 0102ff88
        Call Trace: [<00425e78>] dump_stack+0xc/0x10
         [<0041eb74>] panic+0xd8/0x26c
         [<000e7a68>] __kmem_cache_create_args+0x278/0x2e8
         [<000e77f0>] __kmem_cache_create_args+0x0/0x2e8
         [<0041e5f0>] memset+0x0/0x8c
         [<005f67c0>] io_uring_init+0x54/0xd2
    
    The minimal alignment of an integral type may differ from its size,
    hence it is not safe to assume that an arbitrary freeptr_t (which is
    basically an unsigned long) is always aligned to 4 or 8 bytes.
    
    As nothing seems to require the additional alignment, it is safe to fix
    this by relaxing the check to the actual minimum alignment of freeptr_t.
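    
    The difference between checking against the size and checking against
    the minimum alignment can be sketched as follows. This is an
    illustration only: 'struct type_layout' and both helper functions are
    hypothetical, and the layout values are passed in explicitly so the
    m68k case (size 4, alignment 2) can be modelled on any host.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of how an ABI lays out the free-pointer type. */
struct type_layout {
	size_t size;	/* what sizeof(type) would yield */
	size_t align;	/* what __alignof__(type) would yield */
};

/* True if value is a multiple of align (align must be a power of two). */
static bool is_aligned(size_t value, size_t align)
{
	return (value & (align - 1)) == 0;
}

/* Too strict: demands alignment to the size of the type. */
static bool offset_ok_by_size(size_t offset, struct type_layout t)
{
	return is_aligned(offset, t.size);
}

/* Relaxed: demands only the type's actual minimum alignment. */
static bool offset_ok_by_alignment(size_t offset, struct type_layout t)
{
	return is_aligned(offset, t.align);
}
```

    For a layout of { .size = 4, .align = 2 }, an offset of 6 is rejected
    by the size-based check but accepted by the alignment-based one.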
    
    Fixes: aaa736b186239b7d ("io_uring: specify freeptr usage for SLAB_TYPESAFE_BY_RCU io_kiocb cache")
    Fixes: d345bd2e9834e2da ("mm: add kmem_cache_create_rcu()")
    Reported-by: Guenter Roeck <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Reviewed-by: Jens Axboe <[email protected]>
    Signed-off-by: Vlastimil Babka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spmi: pmic-arb: fix return path in for_each_available_child_of_node() [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Fri Nov 8 16:28:26 2024 -0800

    spmi: pmic-arb: fix return path in for_each_available_child_of_node()
    
    commit 77adf4b1f3e1fdb319f7ee515e5924bb77df3916 upstream.
    
    This loop requires explicit calls to of_node_put() upon early exits
    (break, goto, return) to decrement the child refcounter and avoid memory
    leaks if the child is not required out of the loop.
    
    A more robust solution is using the scoped variant of the macro, which
    automatically calls of_node_put() when the child goes out of scope.
    
    Cc: [email protected]
    Fixes: 979987371739 ("spmi: pmic-arb: Add multi bus support")
    Signed-off-by: Javier Carrasco <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Neil Armstrong <[email protected]>
    Signed-off-by: Stephen Boyd <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

thermal: int3400: Fix reading of current_uuid for active policy [+ + +]

Author: Srinivas Pandruvada <[email protected]>
Date:   Thu Nov 14 12:02:13 2024 -0800

    thermal: int3400: Fix reading of current_uuid for active policy
    
    commit 7082503622986537f57bdb5ef23e69e70cfad881 upstream.
    
    When the current_uuid attribute is set to the active policy UUID,
    reading back the same attribute is returning "INVALID" instead of
    the active policy UUID on some platforms before Ice Lake.
    
    In platforms before Ice Lake, firmware provides a list of supported
    thermal policies. In this case, user space can select any of the
    supported thermal policies via a write to attribute "current_uuid".
    
    In commit c7ff29763989 ("thermal: int340x: Update OS policy capability
    handshake"), the OS policy handshake was updated to support Ice Lake
    and later platforms and it treated priv->current_uuid_index=0 as
    invalid. However, priv->current_uuid_index=0 is for the active policy,
    only priv->current_uuid_index=-1 is invalid.
    
    Fix this issue by updating the priv->current_uuid_index check.
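    
    The off-by-one in the index check can be sketched like this. The table
    contents and function names are hypothetical and only mirror the
    description above (index 0 is the active policy, -1 is invalid); this
    is not the driver's actual code.

```c
#include <stddef.h>

/* Hypothetical UUID table; index 0 is the active policy. */
static const char *const fake_uuids[] = {
	"uuid-active-policy",
	"uuid-passive-policy",
};
#define FAKE_UUID_COUNT (sizeof(fake_uuids) / sizeof(fake_uuids[0]))

/* Old shape: index 0 is wrongly treated as invalid. */
static const char *show_uuid_old(int index)
{
	if (index > 0 && (size_t)index < FAKE_UUID_COUNT)
		return fake_uuids[index];
	return "INVALID";
}

/* Fixed shape: only negative (or out-of-range) indices are invalid. */
static const char *show_uuid_fixed(int index)
{
	if (index >= 0 && (size_t)index < FAKE_UUID_COUNT)
		return fake_uuids[index];
	return "INVALID";
}
```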
    
    Fixes: c7ff29763989 ("thermal: int340x: Update OS policy capability handshake")
    Signed-off-by: Srinivas Pandruvada <[email protected]>
    Cc: 5.18+ <[email protected]> # 5.18+
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Fix function timing profiler to initialize hashtable [+ + +]

Author: Masami Hiramatsu (Google) <[email protected]>
Date:   Sun Aug 18 21:50:28 2024 +0900

    tracing: Fix function timing profiler to initialize hashtable
    
    commit c54a1a06daa78613519b4d24495b0d175b8af63f upstream.
    
    Since the new fgraph requires fgraph_ops.ops.func_hash to be initialized
    before calling register_ftrace_graph(), initialize it with the default
    (tracing all functions) parameter.
    
    Cc: [email protected]
    Fixes: 5fccc7552ccb ("ftrace: Add subops logic to allow one ops to manage many")
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

util_macros.h: fix/rework find_closest() macros [+ + +]

Author: Alexandru Ardelean <[email protected]>
Date:   Tue Nov 5 16:54:05 2024 +0200

    util_macros.h: fix/rework find_closest() macros
    
    commit bc73b4186736341ab5cd2c199da82db6e1134e13 upstream.
    
    A bug was found in find_closest() (after some testing,
    find_closest_descending() turned out to be affected too), where for
    certain values with small
    progressions, the rounding (done by averaging 2 values) causes an
    incorrect index to be returned.  The rounding issues occur for
    progressions of 1, 2 and 3.  It goes away when the progression/interval
    between two values is 4 or larger.
    
    It's particularly bad for progressions of 1.  For example if there's an
    array of 'a = { 1, 2, 3 }', using 'find_closest(2, a ...)' would return 0
    (the index of '1'), rather than returning 1 (the index of '2').  This
    means that for exact values (with a progression of 1), find_closest() will
    misbehave and return the index of the value smaller than the one we're
    searching for.
    
    For progressions of 2 and 3, the exact values are obtained correctly; but
    values aren't approximated correctly (as one would expect).  Starting with
    progressions of 4, all seems to be good (one gets what one would expect).
    
    While one could argue that 'find_closest()' should not be used for arrays
    with progressions of 1 (i.e. '{1, 2, 3, ...}'), the macro should still
    behave correctly.
    
    The bug was found while testing the 'drivers/iio/adc/ad7606.c',
    specifically the oversampling feature.
    For reference, the oversampling values are listed as:
       static const unsigned int ad7606_oversampling_avail[7] = {
              1, 2, 4, 8, 16, 32, 64,
       };
    
    When doing:
      1. $ echo 1 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         1  # this is fine
      2. $ echo 2 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         1  # this is wrong; 2 should be returned here
      3. $ echo 3 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         2  # this is fine
      4. $ echo 4 > /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         $ cat /sys/bus/iio/devices/iio\:device0/oversampling_ratio
         4  # this is fine
    And from here on, the values are correct (one gets what one would
    expect).
    
    While writing a kunit test for this bug, a peculiar issue was found for the
    array in the 'drivers/hwmon/ina2xx.c' & 'drivers/iio/adc/ina2xx-adc.c'
    drivers. While running the kunit test (for 'ina226_avg_tab' from these
    drivers):
      * idx = find_closest([-1 to 2], ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab));
        This returns idx == 0, so value 1.
      * idx = find_closest(3, ina226_avg_tab, ARRAY_SIZE(ina226_avg_tab));
        This returns idx == 0, value 1; and now one could argue whether 3 is
        closer to 4 or to 1. This quirk only appears for value '3' in this
        array, but it seems to be another rounding issue.
      * And from 4 onwards 'find_closest()' works fine (one gets what one
        would expect).
    
    This change reworks the find_closest() macros to also check the distance
    from 'x' to the left and right elements. If the distance to the right
    is smaller (than the distance to the left), the index is incremented by 1.
    This also makes redundant the need for using the DIV_ROUND_CLOSEST() macro.
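    
    The old and reworked lookups described above can be sketched as plain
    C functions. This is a simplified illustration, not the macros from
    util_macros.h: the function names are hypothetical, the arrays are
    assumed to be ascending with at least one element, and everything is
    done in 'long' as the text describes.

```c
#include <stddef.h>

/* Rounded division for non-negative operands, like DIV_ROUND_CLOSEST(). */
static long div_round_closest(long n, long d)
{
	return (n + d / 2) / d;
}

/* Old behaviour: stop at the first rounded midpoint that is >= x. */
static size_t find_closest_old(long x, const long *a, size_t n)
{
	size_t i;

	for (i = 0; i < n - 1; i++) {
		if (x <= div_round_closest(a[i] + a[i + 1], 2))
			break;
	}
	return i;
}

/* Reworked: after the midpoint test, compare left and right distances. */
static size_t find_closest_new(long x, const long *a, size_t n)
{
	size_t i;

	for (i = 0; i < n - 1; i++) {
		long mid = (a[i] + a[i + 1]) / 2;

		if (x <= mid) {
			long left = x - a[i];
			long right = a[i + 1] - x;

			if (right < left)
				i++;
			break;
		}
	}
	return i;
}
```

    For '{ 1, 2, 3 }' and x = 2, find_closest_old() yields index 0 while
    find_closest_new() yields index 1, matching the example above.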
    
    In order to accommodate for any mix of negative + positive values, the
    internal variables '__fc_x', '__fc_mid_x', '__fc_left' & '__fc_right' are
    forced to 'long' type. This also addresses any potential bugs/issues with
    'x' being of an unsigned type. In those situations any comparison between
    signed & unsigned would be promoted to a comparison between 2 unsigned
    numbers; this is especially annoying when '__fc_left' & '__fc_right'
    underflow.
    
    The find_closest_descending() macro was also reworked and duplicated from
    the find_closest(), and it is being iterated in reverse. The main reason
    for this is to get the same indices as 'find_closest()' (but in reverse).
    The comparison for '__fc_right < __fc_left' favors going through the
    array in ascending order.
    For example for array '{ 1024, 512, 256, 128, 64, 16, 4, 1 }' and x = 3, we
    get:
        __fc_mid_x = 2
        __fc_left = -1
        __fc_right = -2
        Then '__fc_right < __fc_left' evaluates to true and '__fc_i++' becomes 7
        which is not quite incorrect, but 3 is closer to 4 than to 1.
    
    This change has been validated with the kunit from the next patch.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 95d119528b0b ("util_macros.h: add find_closest() macro")
    Signed-off-by: Alexandru Ardelean <[email protected]>
    Cc: Bartosz Golaszewski <[email protected]>
    Cc: Greg Kroah-Hartman <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vfio/qat: fix overflow check in qat_vf_resume_write() [+ + +]

Author: Giovanni Cabiddu <[email protected]>
Date:   Mon Oct 21 13:37:53 2024 +0100

    vfio/qat: fix overflow check in qat_vf_resume_write()
    
    commit 9283b7392570421c22a6c8058614f5b76a46b81c upstream.
    
    The unsigned variable `size_t len` is cast to the signed type `loff_t`
    when passed to the function check_add_overflow(). This function considers
    the type of the destination, which is of type loff_t (signed),
    potentially leading to an overflow. This issue is similar to the one
    described in the link below.
    
    Remove the cast.
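    
    The effect of the cast can be sketched in user space with the
    compiler builtin __builtin_add_overflow() (available in GCC and
    Clang), used here as a stand-in for check_add_overflow(). The
    function names are hypothetical, and int64_t/uint64_t are assumed to
    stand in for loff_t/size_t on a 64-bit system.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With the cast, a huge unsigned length is reinterpreted as a negative
 * signed value first, so the signed addition does not overflow.
 */
static bool end_overflows_with_cast(int64_t pos, uint64_t len, int64_t *end)
{
	return __builtin_add_overflow(pos, (int64_t)len, end);
}

/*
 * Without the cast, the builtin evaluates the mathematically exact sum
 * and reports overflow when it does not fit the signed destination.
 */
static bool end_overflows_without_cast(int64_t pos, uint64_t len,
				       int64_t *end)
{
	return __builtin_add_overflow(pos, len, end);
}
```

    With pos = 0 and len = UINT64_MAX, only the variant without the cast
    reports an overflow.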
    
    Note that even if check_add_overflow() is bypassed, by setting `len` to
    a value that is greater than LONG_MAX (which is considered as a negative
    value after the cast), the function copy_from_user(), invoked a few lines
    later, will not perform any copy and return `len` as (len > INT_MAX)
    causing qat_vf_resume_write() to fail with -EFAULT.
    
    Fixes: bb208810b1ab ("vfio/qat: Add vfio_pci driver for Intel QAT SR-IOV VF devices")
    CC: [email protected] # 6.10+
    Link: https://lore.kernel.org/all/[email protected]
    Reported-by: Zijie Zhao <[email protected]>
    Signed-off-by: Giovanni Cabiddu <[email protected]>
    Reviewed-by: Xin Zeng <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alex Williamson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event [+ + +]

Author: MengEn Sun <[email protected]>
Date:   Fri Nov 1 12:06:38 2024 +0800

    vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
    
    commit 2ea80b039b9af0b71c00378523b71c254fb99c23 upstream.
    
    Since 5.14-rc1, NUMA events will only be folded from per-CPU statistics to
    per zone and global statistics when the user actually needs it.
    
    Currently, the kernel performs the fold operation when reading
    /proc/vmstat, but does not perform the fold operation in /proc/zoneinfo.
    This can lead to inaccuracies in the following statistics in zoneinfo:
    - numa_hit
    - numa_miss
    - numa_foreign
    - numa_interleave
    - numa_local
    - numa_other
    
    Therefore, before printing per-zone vm_numa_event when reading
    /proc/zoneinfo, we should also perform the fold operation.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: f19298b9516c ("mm/vmstat: convert NUMA statistics to basic NUMA counters")
    Signed-off-by: MengEn Sun <[email protected]>
    Reviewed-by: JinLiang Zheng <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: remove unknown compat feature check in superblock write validation [+ + +]

Author: Long Li <[email protected]>
Date:   Wed Nov 13 17:17:15 2024 +0800

    xfs: remove unknown compat feature check in superblock write validation
    
    [ Upstream commit 652f03db897ba24f9c4b269e254ccc6cc01ff1b7 ]
    
    Compat features are new features that older kernels can safely ignore,
    allowing read-write mounts without issues. The current sb write validation
    implementation returns -EFSCORRUPTED for unknown compat features,
    preventing filesystem write operations and contradicting the feature's
    definition.
    
    Additionally, if the mounted image is unclean, the log recovery may need
    to write to the superblock. Returning an error for unknown compat features
    during sb write validation can cause mount failures.
    
    Although XFS currently does not use compat feature flags, this issue
    affects current kernels' ability to mount images that may use compat
    feature flags in the future.
    
    Since superblock read validation already warns about unknown compat
    features, it's unnecessary to repeat this warning during write validation.
    Therefore, the relevant code in write validation is being removed.
    
    Fixes: 9e037cb7972f ("xfs: check for unknown v5 feature bits in superblock write verifier")
    Cc: [email protected] # v4.19+
    Signed-off-by: Long Li <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

zram: clear IDLE flag after recompression [+ + +]

Author: Sergey Senozhatsky <[email protected]>
Date:   Tue Oct 29 00:36:14 2024 +0900

    zram: clear IDLE flag after recompression
    
    commit f85219096648b251a81e9fe24a1974590cfc417d upstream.
    
    Patch series "zram: IDLE flag handling fixes", v2.
    
    zram can wrongly preserve ZRAM_IDLE flag on its entries which can result
    in premature post-processing (writeback and recompression) of such
    entries.
    
    This patch (of 2)
    
    Recompression should clear ZRAM_IDLE flag on the entries it has accessed,
    because otherwise some entries, specifically those for which recompression
    has failed, become immediate candidate entries for another post-processing
    (e.g.  writeback).
    
    Consider the following case:
    - recompression marks entries IDLE every 4 hours and attempts
      to recompress them
    - some entries are incompressible, so we keep them intact and
      hence preserve IDLE flag
    - writeback marks entries IDLE every 8 hours and writebacks
      IDLE entries, however we have IDLE entries left from
      recompression, so writeback prematurely writebacks those
      entries.
    
    The bug was reported by Shin Kawamura.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 84b33bf78889 ("zram: introduce recompress sysfs knob")
    Signed-off-by: Sergey Senozhatsky <[email protected]>
    Reported-by: Shin Kawamura <[email protected]>
    Acked-by: Brian Geffon <[email protected]>
    Cc: Minchan Kim <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>