Changelog in Linux kernel 6.12.7

accel/ivpu: Fix general protection fault in ivpu_bo_list() [+ + +]

Author: Jacek Lawrynowicz <[email protected]>
Date:   Tue Dec 10 14:09:37 2024 +0100

    accel/ivpu: Fix general protection fault in ivpu_bo_list()
    
    commit 4b2efb9db0c22a130bbd1275e489b42c02d08050 upstream.
    
    Check if ctx is not NULL before accessing its fields.
    
    Fixes: 37dee2a2f433 ("accel/ivpu: Improve buffer object debug logs")
    Cc: [email protected] # v6.8
    Reviewed-by: Karol Wachowski <[email protected]>
    Reviewed-by: Jeffrey Hugo <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal() [+ + +]

Author: Jacek Lawrynowicz <[email protected]>
Date:   Tue Dec 10 14:09:39 2024 +0100

    accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal()
    
    commit 0f6482caa6acdfdfc744db7430771fe7e6c4e787 upstream.
    
    Move pm_runtime_set_active() to ivpu_pm_init() so when
    ivpu_ipc_send_receive_internal() is executed before ivpu_pm_enable()
    it already has correct runtime state, even if last resume was
    not successful.
    
    Fixes: 8ed520ff4682 ("accel/ivpu: Move set autosuspend delay to HW specific code")
    Cc: [email protected] # v6.7+
    Reviewed-by: Karol Wachowski <[email protected]>
    Reviewed-by: Jeffrey Hugo <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG [+ + +]

Author: Suren Baghdasaryan <[email protected]>
Date:   Fri Nov 29 16:14:23 2024 -0800

    alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG
    
    commit 60da7445a142bd15e67f3cda915497781c3f781f upstream.
    
    It was recently noticed that set_codetag_empty() might be used not only to
    mark NULL alloctag references as empty to avoid warnings but also to reset
    valid tags (in clear_page_tag_ref()).  Since set_codetag_empty() is
    defined as NOOP for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n, such use of
    set_codetag_empty() leads to subtle bugs.  Fix set_codetag_empty() for
    CONFIG_MEM_ALLOC_PROFILING_DEBUG=n to reset the tag reference.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function")
    Signed-off-by: Suren Baghdasaryan <[email protected]>
    Reported-by: David Wang <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Cc: David Wang <[email protected]>
    Cc: Kent Overstreet <[email protected]>
    Cc: Mike Rapoport (Microsoft) <[email protected]>
    Cc: Pasha Tatashin <[email protected]>
    Cc: Sourav Panda <[email protected]>
    Cc: Yu Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

block: avoid to reuse `hctx` not removed from cpuhp callback list [+ + +]

Author: Ming Lei <[email protected]>
Date:   Wed Dec 18 18:16:15 2024 +0800

    block: avoid to reuse `hctx` not removed from cpuhp callback list
    
    [ Upstream commit 85672ca9ceeaa1dcf2777a7048af5f4aee3fd02b ]
    
    If the 'hctx' hasn't been removed from the cpuhp callback list, it must
    not be reused; otherwise a use-after-free may be triggered.
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-lkp/[email protected]
    Tested-by: kernel test robot <[email protected]>
    Fixes: 22465bbac53c ("blk-mq: move cpuhp callback registering out of q->sysfs_lock")
    Signed-off-by: Ming Lei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock" [+ + +]

Author: Ming Lei <[email protected]>
Date:   Wed Dec 18 18:16:14 2024 +0800

    block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock"
    
    commit 224749be6c23efe7fb8a030854f4fc5d1dd813b3 upstream.
    
    This reverts commit be26ba96421ab0a8fa2055ccf7db7832a13c44d2.
    
    Commit be26ba96421a ("block: Fix potential deadlock while freezing queue and
    acquiring sysfs_lock") actually reverts commit 22465bbac53c ("blk-mq: move cpuhp
    callback registering out of q->sysfs_lock"), and causes the original resctrl
    lockdep warning.
    
    So revert it and we need to fix the issue in another way.
    
    Cc: Nilay Shroff <[email protected]>
    Fixes: be26ba96421a ("block: Fix potential deadlock while freezing queue and acquiring sysfs_lock")
    Signed-off-by: Ming Lei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix improper generation check in snapshot delete [+ + +]

Author: Josef Bacik <[email protected]>
Date:   Wed Nov 13 11:11:55 2024 -0500

    btrfs: fix improper generation check in snapshot delete
    
    commit d75d72a858f0c00ca8ae161b48cdb403807be4de upstream.
    
    We have been using the following check
    
       if (generation <= root->root_key.offset)
    
    to make decisions about whether or not to visit a node during snapshot
    delete.  This is because for normal subvolumes this is set to 0, and for
    snapshots it's set to the creation generation.  The idea being that if
    the generation of the node is less than or equal to our creation
    generation then we don't need to visit that node, because it doesn't
    belong to us, we can simply drop our reference and move on.
    
    However reloc roots don't have their generation stored in
    root->root_key.offset, instead that is the objectid of their
    corresponding fs root.  This means we can incorrectly not walk into
    nodes that need to be dropped when deleting a reloc root.
    
    There are a variety of consequences to making the wrong choice in two
    distinct areas.
    
    visit_node_for_delete()
    
    1. False positive.  We think we are newer than the block when we really
       aren't.  We don't visit the node and drop our reference to the node
       and carry on.  This would result in leaked space.
    2. False negative.  We do decide to walk down into a block that we
       should have just dropped our reference to.  However this means that
       the child node will have refs > 1, so we will switch to
       UPDATE_BACKREF, and then the subsequent walk_down_proc() will notice
       that btrfs_header_owner(node) != root->root_key.objectid and it'll
       break out of the loop, and then walk_up_proc() will drop our reference,
       so this appears to be ok.
    
    do_walk_down()
    
    1. False positive.  We are in UPDATE_BACKREF and incorrectly decide that
       we are done and don't need to update the backref for our lower nodes.
       This is another case that simply won't happen with relocation, as we
       only have to do UPDATE_BACKREF if the node below us was shared and
       didn't have FULL_BACKREF set, and since we don't own that node
       because we're a reloc root we actually won't end up in this case.
    2. False negative.  Again this is tricky because as described above, we
       simply wouldn't be here from relocation, because we don't own any of
       the nodes because we never set btrfs_header_owner() to the reloc root
       objectid, and we always use FULL_BACKREF, we never actually need to
       set FULL_BACKREF on any children.
    
    Having spent a lot of time stressing relocation/snapshot delete recently
    I've not seen this pop in practice.  But this is objectively incorrect,
    so fix this to get the correct starting generation based on the root
    we're dropping to keep me from thinking there's a problem here.
    
    CC: [email protected]
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: split bios to the fs sector size boundary [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Mon Nov 4 07:26:33 2024 +0100

    btrfs: split bios to the fs sector size boundary
    
    commit be691b5e593f2cc8cef67bbc59c1fb91b74a86a9 upstream.
    
    Btrfs, like other file systems, can't really deal with I/O not aligned to
    its internal block size (which strangely is called sector size in
    btrfs, for historical reasons), but the block layer split helper doesn't
    even know about that.
    
    Round down the split boundary so that all I/Os are aligned.
    
    Fixes: d5e4377d5051 ("btrfs: split zone append bios in btrfs_submit_bio")
    CC: [email protected] # 6.12
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Damien Le Moal <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
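
A minimal sketch of the rounding described in the entry above, assuming the
sector size is a power of two. The function name and parameters are
illustrative and are not taken from the btrfs source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a proposed split point (in bytes) down to a multiple of the
 * filesystem sector size. Assumes sectorsize is a power of two, so the
 * low bits can simply be masked off.
 */
uint64_t demo_round_down_split(uint64_t split_bytes, uint32_t sectorsize)
{
    return split_bytes & ~((uint64_t)sectorsize - 1);
}
```

With a 4096-byte sector size, a split proposed at byte 10000 would move back
to byte 8192, so both halves stay sector-aligned.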

btrfs: tree-checker: reject inline extent items with 0 ref count [+ + +]

Author: Qu Wenruo <[email protected]>
Date:   Wed Dec 4 13:30:46 2024 +1030

    btrfs: tree-checker: reject inline extent items with 0 ref count
    
    commit dfb92681a19e1d5172420baa242806414b3eff6f upstream.
    
    [BUG]
    There is a bug report in the mailing list where btrfs_run_delayed_refs()
    failed to drop the ref count for logical 25870311358464 num_bytes
    2113536.
    
    The involved leaf dump looks like this:
    
      item 166 key (25870311358464 168 2113536) itemoff 10091 itemsize 50
        extent refs 1 gen 84178 flags 1
        ref#0: shared data backref parent 32399126528000 count 0 <<<
        ref#1: shared data backref parent 31808973717504 count 1
    
    Notice the count number is 0.
    
    [CAUSE]
    There is no concrete evidence yet, but considering 0 -> 1 is also a
    single bit flipped, it's possible that hardware memory bitflip is
    involved, causing the on-disk extent tree to be corrupted.
    
    [FIX]
    To prevent us reading such corrupted extent item, or writing such
    damaged extent item back to disk, enhance the handling of
    BTRFS_EXTENT_DATA_REF_KEY and BTRFS_SHARED_DATA_REF_KEY keys for both
    inlined and key items, to detect such 0 ref count and reject them.
    
    CC: [email protected] # 5.4+
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    Reported-by: Frankie Fisher <[email protected]>
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
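
The check described in the entry above can be sketched roughly as follows.
The struct and function names are hypothetical stand-ins, not the
tree-checker's own types; the -EUCLEAN return value reflects the error the
tree-checker conventionally reports for corrupted metadata.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value; fallback for platforms that lack it */
#endif

/* Hypothetical, simplified view of one data backref in an extent item. */
struct demo_data_ref {
    uint64_t parent;
    uint32_t count;
};

/* Reject the whole item if any data backref carries a zero ref count. */
int demo_check_data_refs(const struct demo_data_ref *refs, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        if (refs[i].count == 0)
            return -EUCLEAN;
    }
    return 0;
}
```

Fed the two backrefs from the leaf dump above, this sketch rejects the item
because ref#0 has count 0.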

can: m_can: fix missed interrupts with m_can_pci [+ + +]

Author: Matthias Schiffer <[email protected]>
Date:   Mon Oct 7 10:23:59 2024 +0200

    can: m_can: fix missed interrupts with m_can_pci
    
    [ Upstream commit 743375f8deee360b0e902074bab99b0c9368d42f ]
    
    The interrupt line of PCI devices is interpreted as edge-triggered,
    however the interrupt signal of the m_can controller integrated in Intel
    Elkhart Lake CPUs appears to be generated level-triggered.
    
    Consider the following sequence of events:
    
    - IR register is read, interrupt X is set
    - A new interrupt Y is triggered in the m_can controller
    - IR register is written to acknowledge interrupt X. Y remains set in IR
    
    As at least one interrupt flag is set in IR at every point in this
    sequence, the m_can interrupt line will never become deasserted, and no
    edge will ever
    be observed to trigger another run of the ISR. This was observed to
    result in the TX queue of the EHL m_can to get stuck under high load,
    because frames were queued to the hardware in m_can_start_xmit(), but
    m_can_finish_tx() was never run to account for their successful
    transmission.
    
    On an Elkhart Lake based board with the two CAN interfaces connected to
    each other, the following script can reproduce the issue:
    
        ip link set can0 up type can bitrate 1000000
        ip link set can1 up type can bitrate 1000000
    
        cangen can0 -g 2 -I 000 -L 8 &
        cangen can0 -g 2 -I 001 -L 8 &
        cangen can0 -g 2 -I 002 -L 8 &
        cangen can0 -g 2 -I 003 -L 8 &
        cangen can0 -g 2 -I 004 -L 8 &
        cangen can0 -g 2 -I 005 -L 8 &
        cangen can0 -g 2 -I 006 -L 8 &
        cangen can0 -g 2 -I 007 -L 8 &
    
        cangen can1 -g 2 -I 100 -L 8 &
        cangen can1 -g 2 -I 101 -L 8 &
        cangen can1 -g 2 -I 102 -L 8 &
        cangen can1 -g 2 -I 103 -L 8 &
        cangen can1 -g 2 -I 104 -L 8 &
        cangen can1 -g 2 -I 105 -L 8 &
        cangen can1 -g 2 -I 106 -L 8 &
        cangen can1 -g 2 -I 107 -L 8 &
    
        stress-ng --matrix 0 &
    
    To fix the issue, repeatedly read and acknowledge interrupts at the
    start of the ISR until no interrupt flags are set, so the next incoming
    interrupt will also result in an edge on the interrupt line.
    
    While we have received a report that even with this patch, the TX queue
    can become stuck under certain (currently unknown) circumstances on the
    Elkhart Lake, this patch completely fixes the issue with the above
    reproducer, and it is unclear whether the remaining issue has a similar
    cause at all.
    
    Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake")
    Signed-off-by: Matthias Schiffer <[email protected]>
    Reviewed-by: Markus Schneider-Pargmann <[email protected]>
    Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Marc Kleine-Budde <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
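
The read-and-acknowledge loop described in the entry above can be modeled in
user space as follows. Everything here is a hypothetical simulation: the
struct, the helpers, and the "late_flags" field (which models interrupt Y
arriving between the read and the acknowledge) are assumptions made for
illustration and are not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller's interrupt register (IR). */
struct demo_can {
    uint32_t ir;         /* interrupt flags currently set */
    uint32_t late_flags; /* flags that get raised right after the next read */
};

/* Read IR, then let the simulated "late" interrupt arrive. */
uint32_t demo_read_ir(struct demo_can *c)
{
    uint32_t val = c->ir;

    c->ir |= c->late_flags;
    c->late_flags = 0;
    return val;
}

/* Acknowledge (clear) exactly the flags that were read. */
void demo_ack_ir(struct demo_can *c, uint32_t flags)
{
    c->ir &= ~flags;
}

/*
 * Keep reading and acknowledging until IR reads back as zero, so the
 * (level-like) interrupt line is guaranteed to deassert before returning.
 * Returns the union of all flags that were handled.
 */
uint32_t demo_drain_interrupts(struct demo_can *c)
{
    uint32_t seen = 0;
    uint32_t pending;

    while ((pending = demo_read_ir(c)) != 0) {
        demo_ack_ir(c, pending);
        seen |= pending;
    }
    return seen;
}
```

A single read-and-acknowledge pass would leave the late flag set in IR; the
loop picks it up on its second iteration and exits only once IR is clear.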

can: m_can: set init flag earlier in probe [+ + +]

Author: Matthias Schiffer <[email protected]>
Date:   Mon Oct 7 10:23:58 2024 +0200

    can: m_can: set init flag earlier in probe
    
    [ Upstream commit fca2977629f49dee437e217c3fc423b6e0cad98c ]
    
    While an m_can controller usually already has the init flag from a
    hardware reset, no such reset happens on the integrated m_can_pci of the
    Intel Elkhart Lake. If the CAN controller is found in an active state,
    m_can_dev_setup() would fail because m_can_niso_supported() calls
    m_can_cccr_update_bits(), which refuses to modify any other configuration
    bits when CCCR_INIT is not set.
    
    To avoid this issue, set CCCR_INIT before attempting to modify any other
    configuration flags.
    
    Fixes: cd5a46ce6fa6 ("can: m_can: don't enable transceiver when probing")
    Signed-off-by: Matthias Schiffer <[email protected]>
    Reviewed-by: Markus Schneider-Pargmann <[email protected]>
    Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Marc Kleine-Budde <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
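
A simplified model of the ordering problem described in the entry above. The
bit values, the struct, the helper names, and the -EBUSY return code are all
assumptions chosen for illustration; only the rule "other configuration bits
cannot be changed unless the init bit is set" comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_CCCR_INIT 0x0001u /* hypothetical bit position */
#define DEMO_CCCR_NISO 0x8000u /* hypothetical bit position */

struct demo_ctrl {
    uint32_t cccr;
};

/* Refuse to touch configuration bits while the init bit is clear. */
int demo_cccr_update_bits(struct demo_ctrl *c, uint32_t mask, uint32_t val)
{
    if (!(c->cccr & DEMO_CCCR_INIT))
        return -EBUSY;
    c->cccr = (c->cccr & ~mask) | (val & mask);
    return 0;
}

/* Set the init bit first, then change other configuration bits. */
int demo_dev_setup(struct demo_ctrl *c)
{
    c->cccr |= DEMO_CCCR_INIT;
    return demo_cccr_update_bits(c, DEMO_CCCR_NISO, DEMO_CCCR_NISO);
}
```

A controller found in an active state (init bit clear) fails the direct
update but succeeds once the setup path sets the init bit first.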

ceph: fix memory leak in ceph_direct_read_write() [+ + +]

Author: Ilya Dryomov <[email protected]>
Date:   Fri Dec 6 17:32:59 2024 +0100

    ceph: fix memory leak in ceph_direct_read_write()
    
    commit 66e0c4f91461d17d48071695271c824620bed4ef upstream.
    
    The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked
    and pages remain pinned if ceph_alloc_sparse_ext_map() fails.
    
    There is no need to delay the allocation of sparse_ext map until after
    the bvecs array is set up, so fix this by moving sparse_ext allocation
    a bit earlier.  Also, make a similar adjustment in __ceph_sync_read()
    for consistency (a leak of the same kind in __ceph_sync_read() has been
    addressed differently).
    
    Cc: [email protected]
    Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads")
    Signed-off-by: Ilya Dryomov <[email protected]>
    Reviewed-by: Alex Markuze <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: fix memory leaks in __ceph_sync_read() [+ + +]

Author: Max Kellermann <[email protected]>
Date:   Thu Dec 5 16:49:51 2024 +0100

    ceph: fix memory leaks in __ceph_sync_read()
    
    commit d6fd6f8280f0257ba93f16900a0d3d3912f32c79 upstream.
    
    In two `break` statements, the call to ceph_release_page_vector() was
    missing, leaking the allocation from ceph_alloc_page_vector().
    
    Instead of adding the missing ceph_release_page_vector() calls, the
    Ceph maintainers preferred to transfer page ownership to the
    `ceph_osd_request` by passing `own_pages=true` to
    osd_req_op_extent_osd_data_pages().  This requires postponing the
    ceph_osdc_put_request() call until after the block that accesses the
    `pages`.
    
    Cc: [email protected]
    Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads")
    Fixes: f0fe1e54cfcf ("ceph: plumb in decryption during reads")
    Signed-off-by: Max Kellermann <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: give up on paths longer than PATH_MAX [+ + +]

Author: Max Kellermann <[email protected]>
Date:   Mon Nov 18 23:28:28 2024 +0100

    ceph: give up on paths longer than PATH_MAX
    
    commit 550f7ca98ee028a606aa75705a7e77b1bd11720f upstream.
    
    If the full path to be built by ceph_mdsc_build_path() happens to be
    longer than PATH_MAX, then this function will enter an endless (retry)
    loop, effectively blocking the whole task.  Most of the machine
    becomes unusable, making this a very simple and effective DoS
    vulnerability.
    
    I cannot imagine why this retry was ever implemented, but it seems
    rather useless and harmful to me.  Let's remove it and fail with
    ENAMETOOLONG instead.
    
    Cc: [email protected]
    Reported-by: Dario Weißer <[email protected]>
    Signed-off-by: Max Kellermann <[email protected]>
    Reviewed-by: Alex Markuze <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
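
A minimal sketch of the "fail instead of retry" behavior described in the
entry above. This is not ceph_mdsc_build_path(); the function below is a
hypothetical path joiner that takes the buffer size as a parameter (the
kernel limit would be PATH_MAX) and returns -ENAMETOOLONG as soon as the
next component no longer fits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Join nr components into buf as "/a/b/c". Returns the resulting length,
 * or -ENAMETOOLONG if the path (including the trailing NUL) does not fit
 * in buflen bytes. buflen must be at least 1.
 */
int demo_build_path(const char *const *parts, size_t nr,
                    char *buf, size_t buflen)
{
    size_t pos = 0;

    for (size_t i = 0; i < nr; i++) {
        size_t len = strlen(parts[i]);

        /* Need room for '/', the component, and the trailing NUL. */
        if (len + 2 > buflen - pos)
            return -ENAMETOOLONG;
        buf[pos++] = '/';
        memcpy(buf + pos, parts[i], len);
        pos += len;
    }
    buf[pos] = '\0';
    return (int)pos;
}
```

The caller sees a single, definite error for an over-long path rather than
a loop that can never make progress.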

ceph: improve error handling and short/overflow-read logic in __ceph_sync_read() [+ + +]

Author: Alex Markuze <[email protected]>
Date:   Wed Nov 27 15:34:10 2024 +0200

    ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()
    
    commit 9abee475803fab6ad59d4f4fc59c6a75374a7d9d upstream.
    
    This patch refines the read logic in __ceph_sync_read() to ensure more
    predictable and efficient behavior in various edge cases.
    
    - Return early if the requested read length is zero or if the file size
      (`i_size`) is zero.
    - Initialize the index variable (`idx`) where needed and reorder some
      code to ensure it is always set before use.
    - Improve error handling by checking for negative return values earlier.
    - Remove redundant encrypted file checks after failures. Only attempt
      filesystem-level decryption if the read succeeded.
    - Simplify leftover calculations to correctly handle cases where the
      read extends beyond the end of the file or stops short.  This can be
      hit by continuously reading a file while, on another client, we keep
      truncating and writing new data into it.
    - This resolves multiple issues caused by integer and consequent buffer
      overflow (`pages` array being accessed beyond `num_pages`):
      - https://tracker.ceph.com/issues/67524
      - https://tracker.ceph.com/issues/68980
      - https://tracker.ceph.com/issues/68981
    
    Cc: [email protected]
    Fixes: 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads")
    Reported-by: Luis Henriques (SUSE) <[email protected]>
    Signed-off-by: Alex Markuze <[email protected]>
    Reviewed-by: Viacheslav Dubeyko <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: validate snapdirname option length when mounting [+ + +]

Author: Ilya Dryomov <[email protected]>
Date:   Wed Nov 20 16:43:51 2024 +0100

    ceph: validate snapdirname option length when mounting
    
    commit 12eb22a5a609421b380c3c6ca887474fb2089b2c upstream.
    
    It becomes a path component, so it shouldn't exceed NAME_MAX
    characters.  This was hardened in commit c152737be22b ("ceph: Use
    strscpy() instead of strcpy() in __get_snap_name()"), but no actual
    check was put in place.
    
    Cc: [email protected]
    Signed-off-by: Ilya Dryomov <[email protected]>
    Reviewed-by: Alex Markuze <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
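
The length check described in the entry above amounts to something like the
following. The function name and the -EINVAL return code are assumptions;
the commit message only states that the option must not exceed NAME_MAX
characters.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255 /* fallback for platforms whose <limits.h> lacks it */
#endif

/* Accept a snapshot directory name only if it fits in one path component. */
int demo_validate_snapdirname(const char *name)
{
    if (strlen(name) > (size_t)NAME_MAX)
        return -EINVAL;
    return 0;
}
```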

chelsio/chtls: prevent potential integer overflow on 32bit [+ + +]

Author: Dan Carpenter <[email protected]>
Date:   Fri Dec 13 12:47:27 2024 +0300

    chelsio/chtls: prevent potential integer overflow on 32bit
    
    commit fbbd84af6ba70334335bdeba3ae536cf751c14c6 upstream.
    
    The "gl->tot_len" variable is controlled by the user.  It comes from
    process_responses().  On 32bit systems, the "gl->tot_len +
    sizeof(struct cpl_pass_accept_req) + sizeof(struct rss_header)" addition
    could have an integer wrapping bug.  Use size_add() to prevent this.
    
    Fixes: a08943947873 ("crypto: chtls - Register chtls with net tls")
    Cc: [email protected]
    Signed-off-by: Dan Carpenter <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cxl/pci: Fix potential bogus return value upon successful probing [+ + +]

Author: Davidlohr Bueso <[email protected]>
Date:   Fri Nov 15 09:00:32 2024 -0800

    cxl/pci: Fix potential bogus return value upon successful probing
    
    [ Upstream commit da4d8c83358163df9a4addaeba0ef8bcb03b22e8 ]
    
    If cxl_pci_ras_unmask() returns non-zero, cxl_pci_probe() will end up
    returning that value, instead of zero.
    
    Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL")
    Reviewed-by: Fan Ni <[email protected]>
    Signed-off-by: Davidlohr Bueso <[email protected]>
    Reviewed-by: Ira Weiny <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Dave Jiang <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
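
The pattern behind the entry above, reduced to a stand-alone sketch. The
functions are hypothetical and the -ENODEV value is an arbitrary assumption;
the point is only that the result of an optional step must not be reused as
the probe's return value.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical optional step: failure is tolerated, not fatal. */
int demo_optional_unmask(int supported)
{
    return supported ? 0 : -ENODEV;
}

/* Buggy shape: the optional step's result leaks into the return value. */
int demo_probe_buggy(int supported)
{
    int rc;

    rc = demo_optional_unmask(supported);
    /* a driver would only log here and carry on */
    return rc;
}

/* Fixed shape: test the result in place and report success explicitly. */
int demo_probe_fixed(int supported)
{
    if (demo_optional_unmask(supported)) {
        /* a driver would only log here and carry on */
    }
    return 0;
}
```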

cxl/region: Fix region creation for greater than x2 switches [+ + +]

Author: Huaisheng Ye <[email protected]>
Date:   Mon Dec 9 15:33:02 2024 -0800

    cxl/region: Fix region creation for greater than x2 switches
    
    [ Upstream commit 76467a94810c2aa4dd3096903291ac6df30c399e ]
    
    The cxl_port_setup_targets() algorithm fails to identify valid target list
    ordering in the presence of 4-way and above switches resulting in
    'cxl create-region' failures of the form:
    
      $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
      cxl region: create_region: region0: failed to set target7 to mem0
      cxl region: cmd_create_region: created 0 regions
    
      [kernel debug message]
      check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2
      bus_remove_device:574: bus: 'cxl': remove device region0
    
    QEMU can create this failing topology:
    
                           ACPI0017:00 [root0]
                               |
                             HB_0 [port1]
                            /             \
                         RP_0             RP_1
                          |                 |
                    USP [port2]           USP [port3]
                /    /    \    \        /   /    \    \
              DSP   DSP   DSP   DSP   DSP  DSP   DSP  DSP
               |     |     |     |     |    |     |    |
              mem4  mem6  mem2  mem7  mem1 mem3  mem5  mem0
     Pos:      0     2     4     6     1    3     5    7
    
     HB: Host Bridge
     RP: Root Port
     USP: Upstream Port
     DSP: Downstream Port
    
    ...with the following command steps:
    
    $ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg  \
            -smp cpus=8 \
            -m 8G \
            -hda /home/work/vm-images/centos-stream8-02.qcow2 \
            -object memory-backend-ram,size=4G,id=m0 \
            -object memory-backend-ram,size=4G,id=m1 \
            -object memory-backend-ram,size=2G,id=cxl-mem0 \
            -object memory-backend-ram,size=2G,id=cxl-mem1 \
            -object memory-backend-ram,size=2G,id=cxl-mem2 \
            -object memory-backend-ram,size=2G,id=cxl-mem3 \
            -object memory-backend-ram,size=2G,id=cxl-mem4 \
            -object memory-backend-ram,size=2G,id=cxl-mem5 \
            -object memory-backend-ram,size=2G,id=cxl-mem6 \
            -object memory-backend-ram,size=2G,id=cxl-mem7 \
            -numa node,memdev=m0,cpus=0-3,nodeid=0 \
            -numa node,memdev=m1,cpus=4-7,nodeid=1 \
            -netdev user,id=net0,hostfwd=tcp::2222-:22 \
            -device virtio-net-pci,netdev=net0 \
            -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
            -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
            -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \
            -device cxl-upstream,bus=root_port0,id=us0 \
            -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
            -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \
            -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \
            -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \
            -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \
            -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \
            -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \
            -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \
            -device cxl-upstream,bus=root_port1,id=us1 \
            -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \
            -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \
            -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \
            -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \
            -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \
            -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \
            -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \
            -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \
            -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G &
    
    In Guest OS:
    $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0
    
    Fix the method to calculate @distance by iteratively multiplying the
    number of targets per switch port. This also follows the algorithm
    recommended here [1].
    
    Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
    Link: http://lore.kernel.org/[email protected] [1]
    Signed-off-by: Huaisheng Ye <[email protected]>
    Tested-by: Li Zhijian <[email protected]>
    [djbw: add a comment explaining 'distance']
    Signed-off-by: Dan Williams <[email protected]>
    Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwillia2-xfh.jf.intel.com
    Signed-off-by: Dave Jiang <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
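
One way to read "iteratively multiplying the number of targets per switch
port" is sketched below. This is an interpretation, not the kernel's
cxl_port_setup_targets() code: the function, its parameters, and the choice
of which levels to include in the product (the port itself and every port
above it, plus the root decoder's interleave ways) are assumptions.

```c
#include <assert.h>

/*
 * targets[0] is the fan-out of the port being programmed, targets[1] the
 * fan-out of its parent, and so on up to (but excluding) the root.
 * root_ways is the interleave ways of the root decoder.
 */
int demo_peer_distance(const int *targets, int levels, int root_ways)
{
    int distance = root_ways;

    for (int i = 0; i < levels; i++)
        distance *= targets[i];
    return distance;
}
```

In the topology drawn above, the host bridge has two targets and there is a
single host bridge, so this product is 2 for port1; that matches the diagram,
where the endpoints behind RP_0 sit at positions 0, 2, 4 and 6. For a 4-way
switch below that host bridge the product is 4 * 2 = 8.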

dma-buf: Fix __dma_buf_debugfs_list_del argument for !CONFIG_DEBUG_FS [+ + +]

Author: T.J. Mercier <[email protected]>
Date:   Sun Nov 17 17:03:25 2024 +0000

    dma-buf: Fix __dma_buf_debugfs_list_del argument for !CONFIG_DEBUG_FS
    
    [ Upstream commit 0cff90dec63da908fb16d9ea2872ebbcd2d18e6a ]
    
    The arguments for __dma_buf_debugfs_list_del do not match between the
    CONFIG_DEBUG_FS case and the !CONFIG_DEBUG_FS case. The !CONFIG_DEBUG_FS
    case should take a struct dma_buf *, but it currently takes a struct file *.
    This can lead to the build error:
    
    error: passing argument 1 of ‘__dma_buf_debugfs_list_del’ from
    incompatible pointer type [-Werror=incompatible-pointer-types]
    
    dma-buf.c:63:53: note: expected ‘struct file *’ but argument is of
    type ‘struct dma_buf *’
       63 | static void __dma_buf_debugfs_list_del(struct file *file)
    
    Fixes: bfc7bc539392 ("dma-buf: Do not build debugfs related code when !CONFIG_DEBUG_FS")
    Signed-off-by: T.J. Mercier <[email protected]>
    Reviewed-by: Tvrtko Ursulin <[email protected]>
    Signed-off-by: Sumit Semwal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
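
The general shape of the problem in the entry above: when a helper has a
real implementation under one config option and an empty stub otherwise,
both variants need the same parameter type or callers break in one of the
two builds. The sketch below uses hypothetical names (DEMO_DEBUG_FS,
demo_buf, demo_list_add/del) rather than the dma-buf symbols.

```c
#include <assert.h>

struct demo_buf {
    int id;
};

#ifdef DEMO_DEBUG_FS
static int demo_tracked;

void demo_list_add(struct demo_buf *buf) { (void)buf; demo_tracked++; }
void demo_list_del(struct demo_buf *buf) { (void)buf; demo_tracked--; }
#else
/* Stubs take the same parameter type as the real implementations. */
void demo_list_add(struct demo_buf *buf) { (void)buf; }
void demo_list_del(struct demo_buf *buf) { (void)buf; }
#endif

/* Caller compiles identically whether or not DEMO_DEBUG_FS is defined. */
int demo_release(struct demo_buf *buf)
{
    demo_list_del(buf);
    return buf->id;
}
```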

Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Wed Nov 6 07:42:47 2024 -0800

    Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet
    
    commit 07a756a49f4b4290b49ea46e089cbe6f79ff8d26 upstream.
    
    If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is
    fully initialized, we can hit the panic below:
    
    hv_utils: Registering HyperV Utility Driver
    hv_vmbus: registering driver hv_utils
    ...
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1
    RIP: 0010:hv_pkt_iter_first+0x12/0xd0
    Call Trace:
    ...
     vmbus_recvpacket
     hv_kvp_onchannelcallback
     vmbus_on_event
     tasklet_action_common
     tasklet_action
     handle_softirqs
     irq_exit_rcu
     sysvec_hyperv_stimer0
     </IRQ>
     <TASK>
     asm_sysvec_hyperv_stimer0
    ...
     kvp_register_done
     hvt_op_read
     vfs_read
     ksys_read
     __x64_sys_read
    
    This can happen because the KVP/VSS channel callback can be invoked
    even before the channel is fully opened:
    1) as soon as hv_kvp_init() -> hvutil_transport_init() creates
    /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and
    register itself to the driver by writing a message KVP_OP_REGISTER1 to the
    file (which is handled by kvp_on_msg() -> kvp_handle_handshake()) and
    reading the file for the driver's response, which is handled by
    hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
    
    2) the problem with kvp_register_done() is that it can cause the
    channel callback to be called even before the channel is fully opened,
    and when the channel callback is starting to run, util_probe() ->
    vmbus_open() may not have initialized the ringbuffer yet, so the
    callback can hit the panic of NULL pointer dereference.
    
    To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in
    __vmbus_open(), just before the first hv_ringbuffer_init(), and then we
    unload and reload the driver hv_utils, and run the daemon manually within
    the 10 seconds.
    
    Fix the panic by reordering the steps in util_probe() so the char dev
    entry used by the KVP or VSS daemon is not created until after
    vmbus_open() has completed. This reordering prevents the race condition
    from happening.
    
    Reported-by: Dexuan Cui <[email protected]>
    Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration")
    Cc: [email protected]
    Signed-off-by: Michael Kelley <[email protected]>
    Acked-by: Wei Liu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wei Liu <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd: Update strapping for NBIO 2.5.0 [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Tue Dec 10 20:44:14 2024 -0600

    drm/amd: Update strapping for NBIO 2.5.0
    
    commit a7f9d98eb1202132014ba760c26ad8608ffc9caf upstream.
    
    This helps to avoid a spurious PME event on hotplug to Azalia.
    
    Cc: Vijendar Mukunda <[email protected]>
    Reported-and-tested-by: [email protected]
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215884
    Tested-by: Gabriel Marcano <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 3f6f237b9dd189e1fb85b8a3f7c97a8f27c1e49a)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/gfx12: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 17:04:58 2024 -0500

    drm/amdgpu/gfx12: fix IP version check
    
    commit 41be00f839e9ee7753892a73a36ce4c14c6f5cbf upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit f1fd1d0f40272948aa6ab82a3a82ecbbc76dff53)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/mmhub4.1: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 17:03:20 2024 -0500

    drm/amdgpu/mmhub4.1: fix IP version check
    
    commit 6ebc5b92190e01dd48313b68cbf752c9adcfefa8 upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 63bfd24088b42c6f55c2096bfc41b50213d419b2)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/nbio7.0: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 16:49:20 2024 -0500

    drm/amdgpu/nbio7.0: fix IP version check
    
    commit 3abb660f9e18925468685591a3702bda05faba4f upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 0ec43fbece784215d3c4469973e4556d70bce915)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/nbio7.11: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 17:00:07 2024 -0500

    drm/amdgpu/nbio7.11: fix IP version check
    
    commit 8c1ecc7197a88c6ae62de56e1c0887f220712a32 upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 2c8eeaaa0fe5841ccf07a0eb51b1426f34ef39f7)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/nbio7.7: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 16:47:48 2024 -0500

    drm/amdgpu/nbio7.7: fix IP version check
    
    commit 458600da793da12e0f3724ecbea34a80703f4d5b upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 22b9555bc90df22b585bdd1f161b61584b13af51)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu/smu14.0.2: fix IP version check [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Thu Dec 12 17:06:26 2024 -0500

    drm/amdgpu/smu14.0.2: fix IP version check
    
    commit 9e752ee26c1031312a01d2afc281f5f6fdfca176 upstream.
    
    Use the helper function rather than reading it directly.
    
    Reviewed-by: Yang Wang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 8f2cd1067afe68372a1723e05e19b68ed187676a)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu: don't access invalid sched [+ + +]

Author: Pierre-Eric Pelloux-Prayer <[email protected]>
Date:   Fri Dec 6 13:17:45 2024 +0100

    drm/amdgpu: don't access invalid sched
    
    [ Upstream commit a93b1020eb9386d7da11608477121b10079c076a ]
    
    Since 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()")
    accessing job->base.sched can produce unexpected results as the initialisation
    of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the
    memset.
    
    This commit fixes an issue when a CS would fail validation and would
    be rejected after job->num_ibs is incremented. In this case,
    amdgpu_ib_free(ring->adev, ...) will be called, which would crash the
    machine because the ring value is bogus.
    
    To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this
    because the device is actually not used in this function.
    
    The next commit will remove the ring argument completely.
    
    Fixes: 2320c9e6a768 ("drm/sched: memset() 'job' in drm_sched_job_init()")
    Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 2ae520cb12831d264ceb97c61f72c59d33c0dbd7)
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix amdgpu_coredump [+ + +]

Author: Christian König <[email protected]>
Date:   Thu Dec 12 16:29:18 2024 +0100

    drm/amdgpu: fix amdgpu_coredump
    
    commit 8d1a13816e59254bd3b18f5ae0895230922bd120 upstream.
    
    The VM pointer might already be outdated when that function is called.
    Use the PASID to gather the information instead.
    
    Signed-off-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 57f812d171af4ba233d3ed7c94dfa5b8e92dcc04)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update [+ + +]

Author: Michel Dänzer <[email protected]>
Date:   Tue Dec 17 18:22:56 2024 +0100

    drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update
    
    commit 85230ee36d88e7a09fb062d43203035659dd10a5 upstream.
    
    Third time's the charm, I hope?
    
    Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Michel Dänzer <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 695c2c745e5dff201b75da8a1d237ce403600d04)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/display: use ERR_PTR on DP tunnel manager creation fail [+ + +]

Author: Krzysztof Karas <[email protected]>
Date:   Thu Dec 12 11:00:41 2024 +0000

    drm/display: use ERR_PTR on DP tunnel manager creation fail
    
    commit 080b2e7b5e9ad23343e4b11f0751e4c724a78958 upstream.
    
    Instead of returning a generic NULL on error from
    drm_dp_tunnel_mgr_create(), use error pointers with informative codes
    to align the function with the stub that is executed when
    CONFIG_DRM_DISPLAY_DP_TUNNEL is unset. This will also trigger IS_ERR()
    in the current caller (intel_dp_tunnel_mgr_init()) instead of bypassing
    it via a NULL pointer.
    
    v2: use error codes inside drm_dp_tunnel_mgr_create() instead of handling
     on caller's side (Michal, Imre)
    
    v3: fixup commit message and add "CC"/"Fixes" lines (Andi),
     mention aligning function code with stub
    
    Fixes: 91888b5b1ad2 ("drm/i915/dp: Add support for DP tunnel BW allocation")
    Cc: Imre Deak <[email protected]>
    Cc: <[email protected]> # v6.9+
    Signed-off-by: Krzysztof Karas <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]>
    Signed-off-by: Imre Deak <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/7q4fpnmmztmchczjewgm6igy55qt6jsm7tfd4fl4ucfq6yg2oy@q4lxtsu6445c
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
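
    A minimal userspace sketch of why a NULL return slips past an IS_ERR()
    check. The ERR_PTR()/IS_ERR()/PTR_ERR() helpers below are re-implemented
    here for illustration, modeled on the kernel's include/linux/err.h;
    struct tunnel_mgr, mgr_create_returns_null(),
    mgr_create_returns_err_ptr() and caller_init() are hypothetical names,
    not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Only the top 4095 addresses encode an errno; NULL is not one of them. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the manager object. */
struct tunnel_mgr { int unused; };
static struct tunnel_mgr the_mgr;

/* Old style: every failure collapses into NULL. */
static inline struct tunnel_mgr *mgr_create_returns_null(int fail)
{
    return fail ? NULL : &the_mgr;
}

/* New style: failures carry an errno inside the pointer. */
static inline struct tunnel_mgr *mgr_create_returns_err_ptr(int fail)
{
    return fail ? ERR_PTR(-ENOMEM) : &the_mgr;
}

/* Hypothetical caller that only checks IS_ERR(); 0 means "success". */
static inline int caller_init(struct tunnel_mgr *mgr)
{
    if (IS_ERR(mgr))
        return (int)PTR_ERR(mgr);
    return 0;
}

void demo(void)
{
    /* A NULL failure is mistaken for success by the IS_ERR() check. */
    assert(caller_init(mgr_create_returns_null(1)) == 0);
    /* An error pointer is caught and its code is propagated. */
    assert(caller_init(mgr_create_returns_err_ptr(1)) == -ENOMEM);
}
```

    Calling demo() exercises both variants: only the error-pointer variant
    lets the caller see the failure.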

drm/modes: Avoid divide by zero harder in drm_mode_vrefresh() [+ + +]

Author: Ville Syrjälä <[email protected]>
Date:   Fri Nov 29 06:26:28 2024 +0200

    drm/modes: Avoid divide by zero harder in drm_mode_vrefresh()
    
    commit 9398332f23fab10c5ec57c168b44e72997d6318e upstream.
    
    drm_mode_vrefresh() is trying to avoid divide by zero
    by checking whether htotal or vtotal are zero. But we may
    still end up with a div-by-zero of vtotal*htotal*...
    
    Cc: [email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Reviewed-by: Jani Nikula <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
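
    A rough userspace sketch of one way the denominator can still become
    zero when htotal and vtotal are both non-zero: the 32-bit product wraps
    around. struct mode is a hypothetical, trimmed-down stand-in for
    struct drm_display_mode, the field values are made up for illustration,
    and interlace handling is omitted. The sketch assumes that checking the
    final denominator is an acceptable guard; it is not a copy of the
    kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct drm_display_mode. */
struct mode {
    uint32_t clock;   /* pixel clock in kHz */
    uint16_t htotal;
    uint16_t vtotal;
    uint16_t vscan;
    int dblscan;      /* stand-in for the double-scan mode flag */
};

/* 32-bit denominator, built the way the commit message hints at. */
static inline uint32_t vrefresh_den(const struct mode *m)
{
    uint32_t den = (uint32_t)m->htotal * m->vtotal;

    if (m->dblscan)
        den *= 2;
    if (m->vscan > 1)
        den *= m->vscan;
    return den; /* may have wrapped to 0 */
}

/* Guards on the final denominator, not just on htotal and vtotal. */
static inline int vrefresh(const struct mode *m)
{
    uint32_t den;

    if (m->htotal == 0 || m->vtotal == 0)
        return 0;
    den = vrefresh_den(m);
    if (den == 0)
        return 0;
    /* Rounded division of (clock * 1000) by den. */
    return (int)(((uint64_t)m->clock * 1000 + den / 2) / den);
}

void demo(void)
{
    struct mode ok = { 148500, 2200, 1125, 0, 0 };
    /* 0x8000 * 0x8000 * 2 * 2 == 2^32, which wraps to 0 in 32 bits. */
    struct mode wrap = { 148500, 0x8000, 0x8000, 2, 1 };

    assert(vrefresh(&ok) == 60);
    assert(vrefresh_den(&wrap) == 0);
    assert(vrefresh(&wrap) == 0);
}
```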

drm/panel: himax-hx83102: Add a check to prevent NULL pointer dereference [+ + +]

Author: Zhang Zekun <[email protected]>
Date:   Fri Oct 25 15:34:08 2024 +0800

    drm/panel: himax-hx83102: Add a check to prevent NULL pointer dereference
    
    [ Upstream commit e1e1af9148dc4c866eda3fb59cd6ec3c7ea34b1d ]
    
    drm_mode_duplicate() could return NULL due to lack of memory,
    which would then cause a NULL pointer dereference. Add a check to
    prevent it.
    
    Fixes: 0ef94554dc40 ("drm/panel: himax-hx83102: Break out as separate driver")
    Signed-off-by: Zhang Zekun <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panel: novatek-nt35950: fix return value check in nt35950_probe() [+ + +]

Author: Yang Yingliang <[email protected]>
Date:   Tue Oct 29 20:39:57 2024 +0800

    drm/panel: novatek-nt35950: fix return value check in nt35950_probe()
    
    [ Upstream commit f8fd0968eff52cf092c0d517d17507ea2f6e5ea5 ]
    
    mipi_dsi_device_register_full() never returns a NULL pointer; it
    returns ERR_PTR() when it fails, so replace the NULL check with
    IS_ERR().
    
    Fixes: 623a3531e9cf ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels")
    Signed-off-by: Yang Yingliang <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panel: st7701: Add prepare_prev_first flag to drm_panel [+ + +]

Author: Marek Vasut <[email protected]>
Date:   Sun Nov 24 23:48:07 2024 +0100

    drm/panel: st7701: Add prepare_prev_first flag to drm_panel
    
    [ Upstream commit 406dd4c7984a457567ca652455d5efad81983f02 ]
    
    The DSI host must be enabled for the panel to be initialized in
    prepare(). Set the prepare_prev_first flag to guarantee this.
    This fixes the panel operation on NXP i.MX8MP SoC / Samsung DSIM
    DSI host.
    
    Fixes: 849b2e3ff969 ("drm/panel: Add Sitronix ST7701 panel driver")
    Signed-off-by: Marek Vasut <[email protected]>
    Reviewed-by: Jessica Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panel: synaptics-r63353: Fix regulator unbalance [+ + +]

Author: Michael Trimarchi <[email protected]>
Date:   Thu Dec 5 17:29:58 2024 +0100

    drm/panel: synaptics-r63353: Fix regulator unbalance
    
    [ Upstream commit d2bd3fcb825725a59c8880070b1206b1710922bd ]
    
    The shutdown function can be called when the display is already
    unprepared. For example, during reboot this triggers a kernel
    warning. Calling drm_panel_unprepare() avoids triggering that
    warning.
    
    Fixes: 2e87bad7cd33 ("drm/panel: Add Synaptics R63353 panel driver")
    Tested-by: Dario Binacchi <[email protected]>
    Signed-off-by: Michael Trimarchi <[email protected]>
    Signed-off-by: Dario Binacchi <[email protected]>
    Reviewed-by: Neil Armstrong <[email protected]>
    Reviewed-by: Jessica Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

EDAC/amd64: Simplify ECC check on unified memory controllers [+ + +]

Author: Borislav Petkov (AMD) <[email protected]>
Date:   Wed Dec 11 12:07:42 2024 +0100

    EDAC/amd64: Simplify ECC check on unified memory controllers
    
    commit 747367340ca6b5070728b86ae36ad6747f66b2fb upstream.
    
    The intent of the check is to see whether at least one UMC has ECC
    enabled. So do that instead of tracking which ones are enabled in masks
    which are too small in size anyway and lead to not loading the driver on
    Zen4 machines with UMCs enabled over UMC8.
    
    Fixes: e2be5955a886 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh")
    Reported-by: Avadhut Naik <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Avadhut Naik <[email protected]>
    Reviewed-by: Avadhut Naik <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
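
    A simplified model of the problem described above, assuming the old
    check accumulated per-UMC bits in 8-bit masks and compared them.
    struct umc and both function names are hypothetical; the real driver
    reads hardware registers instead of plain flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-controller state: is the UMC enabled, and is ECC on? */
struct umc {
    int enabled;
    int ecc;
};

/* Mask-based check: bits for UMC8 and above do not fit into a uint8_t. */
static inline int ecc_enabled_u8_masks(const struct umc *umc, int n)
{
    uint8_t umc_en_mask = 0, ecc_en_mask = 0;

    for (int i = 0; i < n; i++) {
        if (!umc[i].enabled)
            continue;
        umc_en_mask |= (uint8_t)(1u << i);     /* truncated for i >= 8 */
        if (umc[i].ecc)
            ecc_en_mask |= (uint8_t)(1u << i); /* truncated for i >= 8 */
    }
    return umc_en_mask && umc_en_mask == ecc_en_mask;
}

/* Simplified check: does at least one enabled UMC have ECC enabled? */
static inline int ecc_enabled_any(const struct umc *umc, int n)
{
    for (int i = 0; i < n; i++)
        if (umc[i].enabled && umc[i].ecc)
            return 1;
    return 0;
}

/* 12 UMCs where only UMC8..UMC11 are enabled, all of them with ECC. */
int high_umcs_seen_by_masks(void)
{
    struct umc umc[12] = { { 0, 0 } };

    for (int i = 8; i < 12; i++)
        umc[i].enabled = umc[i].ecc = 1;
    return ecc_enabled_u8_masks(umc, 12);
}

int high_umcs_seen_by_any(void)
{
    struct umc umc[12] = { { 0, 0 } };

    for (int i = 8; i < 12; i++)
        umc[i].enabled = umc[i].ecc = 1;
    return ecc_enabled_any(umc, 12);
}

void demo(void)
{
    assert(high_umcs_seen_by_masks() == 0); /* ECC wrongly reported as off */
    assert(high_umcs_seen_by_any() == 1);   /* ECC detected */
}
```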

efivarfs: Fix error on non-existent file [+ + +]

Author: James Bottomley <[email protected]>
Date:   Sun Dec 8 13:34:13 2024 -0500

    efivarfs: Fix error on non-existent file
    
    commit 2ab0837cb91b7de507daa145d17b3b6b2efb3abf upstream.
    
    When looking up a non-existent file, efivarfs returns -EINVAL if the
    file does not conform to the NAME-GUID format and -ENOENT if it does.
    This is caused by efivars_d_hash() returning -EINVAL if the name is not
    formatted correctly.  This error is returned before simple_lookup()
    returns a negative dentry, and is the error value that the user sees.
    
    Fix by removing this check.  If the file does not exist, simple_lookup()
    will return a negative dentry leading to -ENOENT and efivarfs_create()
    already has a validity check before it creates an entry (and will
    correctly return -EINVAL)
    
    Signed-off-by: James Bottomley <[email protected]>
    Cc: <[email protected]>
    [ardb: make efivarfs_valid_name() static]
    Signed-off-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

epoll: Add synchronous wakeup support for ep_poll_callback [+ + +]

Author: Xuewen Yan <[email protected]>
Date:   Fri Apr 26 16:05:48 2024 +0800

    epoll: Add synchronous wakeup support for ep_poll_callback
    
    commit 900bbaae67e980945dec74d36f8afe0de7556d5a upstream.
    
    Currently, epoll only uses the wake_up() interface to wake up tasks.
    However, some epoll users, such as the Android binder driver, want
    to use the synchronous wakeup flag to hint the scheduler.
    So add a wake_up_sync() define, and use wake_up_sync()
    when sync is true in ep_poll_callback().
    
    Co-developed-by: Jing Xia <[email protected]>
    Signed-off-by: Jing Xia <[email protected]>
    Signed-off-by: Xuewen Yan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Brian Geffon <[email protected]>
    Reviewed-by: Brian Geffon <[email protected]>
    Reported-by: Benoit Lize <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Cc: Brian Geffon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

erofs: add erofs_sb_free() helper [+ + +]

Author: Gao Xiang <[email protected]>
Date:   Thu Dec 12 21:35:01 2024 +0800

    erofs: add erofs_sb_free() helper
    
    [ Upstream commit e2de3c1bf6a0c99b089bd706a62da8f988918858 ]
    
    Unify the common parts of erofs_fc_free() and erofs_kill_sb() as
    erofs_sb_free().
    
    Thus, fput() in erofs_fc_get_tree() is no longer needed, too.
    
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default")
    Signed-off-by: Sasha Levin <[email protected]>

erofs: fix PSI memstall accounting [+ + +]

Author: Gao Xiang <[email protected]>
Date:   Wed Nov 27 16:52:36 2024 +0800

    erofs: fix PSI memstall accounting
    
    [ Upstream commit 1a2180f6859c73c674809f9f82e36c94084682ba ]
    
    Max Kellermann recently reported psi_group_cpu.tasks[NR_MEMSTALL] is
    incorrect in the 6.11.9 kernel.
    
    The root cause appears to be that, since the problematic commit, bio
    can be NULL, causing psi_memstall_leave() to be skipped in
    z_erofs_submit_queue().
    
    Reported-by: Max Kellermann <[email protected]>
    Closes: https://lore.kernel.org/r/CAKPOu+8tvSowiJADW2RuKyofL_CSkm_SuyZA7ME5vMLWmL6pqw@mail.gmail.com
    Fixes: 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly")
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

erofs: reference `struct erofs_device_info` for erofs_map_dev [+ + +]

Author: Gao Xiang <[email protected]>
Date:   Fri Dec 13 07:54:01 2024 +0800

    erofs: reference `struct erofs_device_info` for erofs_map_dev
    
    [ Upstream commit f8d920a402aec3482931cb5f1539ed438740fc49 ]
    
    Record `m_sb` and `m_dif` to replace `m_fscache`, `m_daxdev`, `m_fp`
    and `m_dax_part_off` in order to simplify the codebase.
    
    Note that `m_bdev` is still left since it can be assigned from
    `sb->s_bdev` directly.
    
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default")
    Signed-off-by: Sasha Levin <[email protected]>

erofs: use `struct erofs_device_info` for the primary device [+ + +]

Author: Gao Xiang <[email protected]>
Date:   Mon Dec 16 20:53:08 2024 +0800

    erofs: use `struct erofs_device_info` for the primary device
    
    [ Upstream commit 7b00af2c5414dc01e0718deef7ead81102867636 ]
    
    Use `struct erofs_device_info` for the primary device instead of
    listing each field directly in `struct erofs_sb_info`, except that
    `sb->s_bdev` is still used for the primary block device.
    
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default")
    Signed-off-by: Sasha Levin <[email protected]>

erofs: use buffered I/O for file-backed mounts by default [+ + +]

Author: Gao Xiang <[email protected]>
Date:   Thu Dec 12 21:43:36 2024 +0800

    erofs: use buffered I/O for file-backed mounts by default
    
    [ Upstream commit 6422cde1b0d5a31b206b263417c1c2b3c80fe82c ]
    
    For many use cases (e.g. container images that have just been fetched
    from a remote), performance will be impacted if the underlying page
    cache is up-to-date but direct I/O flushes dirty pages first.
    
    Instead, let's use buffered I/O by default to keep in sync with loop
    devices and add a (re)mount option to explicitly try to use
    direct I/O if supported by the underlying files.
    
    The container startup time is improved as below:
    [workload] docker.io/library/workpress:latest
                                         unpack        1st run  non-1st runs
    EROFS snapshotter buffered I/O file  4.586404265s  0.308s   0.198s
    EROFS snapshotter direct I/O file    4.581742849s  2.238s   0.222s
    EROFS snapshotter loop               4.596023152s  0.346s   0.201s
    Overlayfs snapshotter                5.382851037s  0.206s   0.214s
    
    Fixes: fb176750266a ("erofs: add file-backed mount support")
    Cc: Derek McGowan <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Gao Xiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

fgraph: Still initialize idle shadow stacks when starting [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Wed Dec 11 13:53:35 2024 -0500

    fgraph: Still initialize idle shadow stacks when starting
    
    commit cc252bb592638e0f7aea40d580186c36d89526b8 upstream.
    
    A bug was discovered where the idle shadow stacks were not initialized
    for offline CPUs when starting function graph tracer, and when they came
    online they were not traced due to the missing shadow stack. To fix
    this, the idle task shadow stack initialization was moved to using the
    CPU hotplug callbacks. But it removed the initialization when the
    function graph was enabled. The problem here is that the hotplug
    callbacks are called when the CPUs come online, but the idle shadow
    stack initialization only happens if function graph is currently
    active. This caused the online CPUs to not get their shadow stack
    initialized.
    
    The idle shadow stack initialization still needs to be done when the
    function graph is registered, as they will not be allocated if function
    graph is not registered.
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 2c02f7375e65 ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks")
    Reported-by: Linus Walleij <[email protected]>
    Tested-by: Linus Walleij <[email protected]>
    Closes: https://lore.kernel.org/all/CACRpkdaTBrHwRbbrphVy-=SeDz6MSsXhTKypOtLrTQ+DgGAOcQ@mail.gmail.com/
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

firmware: arm_ffa: Fix the race around setting ffa_dev->properties [+ + +]

Author: Levi Yun <[email protected]>
Date:   Tue Dec 3 14:31:08 2024 +0000

    firmware: arm_ffa: Fix the race around setting ffa_dev->properties
    
    [ Upstream commit 6fe437cfe2cdc797b03f63b338a13fac96ed6a08 ]
    
    Currently, ffa_dev->properties is set after the ffa_device_register()
    call returns in ffa_setup_partitions(). This could potentially result in
    a race where the partition's properties are accessed while probing
    struct ffa_device before they are set.
    
    Update ffa_device_register() to receive ffa_partition_info so all
    the data from the partition information received from the firmware can
    be updated into the struct ffa_device before calling device_register()
    in ffa_device_register().
    
    Fixes: e781858488b9 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration")
    Signed-off-by: Levi Yun <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Sudeep Holla <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

firmware: arm_scmi: Fix i.MX build dependency [+ + +]

Author: Arnd Bergmann <[email protected]>
Date:   Sat Nov 16 00:05:18 2024 +0100

    firmware: arm_scmi: Fix i.MX build dependency
    
    [ Upstream commit 514b2262ade48a0503ac6aa03c3bfb8c5be69b21 ]
    
    The newly added SCMI vendor driver references functions in the
    protocol driver but needs a Kconfig dependency to ensure it can link,
    essentially the Kconfig dependency needs to be reversed to match the
    link time dependency:
    
      |  arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_write':
      |     fsl_mqs.c:(.text+0x1aa): undefined reference to `scmi_imx_misc_ctrl_set'
      |  arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_read':
      |     fsl_mqs.c:(.text+0x1ee): undefined reference to `scmi_imx_misc_ctrl_get'
    
    This however only works after changing the dependency in the SND_SOC_FSL_MQS
    driver as well, which uses 'select IMX_SCMI_MISC_DRV' to turn on a
    driver it depends on. This is generally a bad idea, so the best solution
    is to change that into a dependency.
    
    To allow the ASoC driver to keep building with the SCMI support, this
    needs to be an optional dependency that enforces the link-time
    dependency if IMX_SCMI_MISC_DRV is a loadable module but not
    depend on it if that is disabled.
    
    Fixes: 61c9f03e22fc ("firmware: arm_scmi: Add initial support for i.MX MISC protocol")
    Fixes: 101c9023594a ("ASoC: fsl_mqs: Support accessing registers by scmi interface")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Acked-by: Mark Brown <[email protected]>
    Acked-by: Shengjiu Wang <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Sudeep Holla <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hexagon: Disable constant extender optimization for LLVM prior to 19.1.0 [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Thu Nov 21 11:22:18 2024 -0700

    hexagon: Disable constant extender optimization for LLVM prior to 19.1.0
    
    commit aef25be35d23ec768eed08bfcf7ca3cf9685bc28 upstream.
    
    The Hexagon-specific constant extender optimization in LLVM may crash on
    Linux kernel code [1], such as fs/bcachefs/btree_io.c after
    commit 32ed4a620c54 ("bcachefs: Btree path tracepoints") in 6.12:
    
      clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed.
      Stack dump:
      0.      Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c
      1.      <eof> parser at end of file
      2.      Code generation
      3.      Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'.
      4.      Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath'
    
    Without assertions enabled, there is just a hang during compilation.
    
    This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM
    19.1.0 but the kernel supports LLVM 13.0.1 and newer, so disable the
    constant extender optimization using the '-mllvm' option when using a
    toolchain that is not fixed.
    
    Cc: [email protected]
    Link: https://github.com/llvm/llvm-project/issues/99714 [1]
    Link: https://github.com/llvm/llvm-project/commit/68df06a0b2998765cb0a41353fcf0919bbf57ddb [2]
    Link: https://github.com/llvm/llvm-project/commit/2ab8d93061581edad3501561722ebd5632d73892 [3]
    Reviewed-by: Brian Cain <[email protected]>
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

hwmon: (tmp513) Fix Current Register value interpretation [+ + +]

Author: Murad Masimov <[email protected]>
Date:   Mon Dec 16 20:36:47 2024 +0300

    hwmon: (tmp513) Fix Current Register value interpretation
    
    [ Upstream commit da1d0e6ba211baf6747db74c07700caddfd8a179 ]
    
    The value returned by the driver after processing the contents of the
    Current Register does not correspond to the TMP512/TMP513 specifications.
    A raw register value is converted to a signed integer value by a sign
    extension in accordance with the algorithm provided in the specification,
    but due to the off-by-one error in the sign bit index, the result is
    incorrect. Moreover, negative values will be reported as large positive
    due to missing sign extension from u32 to long.
    
    According to the TMP512 and TMP513 datasheets, the Current Register (07h)
    is a 16-bit two's complement integer value. E.g., if regval = 1000 0011
    0000 0000, then the value must be (-32000 * lsb), but the driver will
    return (33536 * lsb).
    
    Fix off-by-one bug, and also cast data->curr_lsb_ua (which is of type u32)
    to long to prevent incorrect cast for negative values.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
    Signed-off-by: Murad Masimov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [groeck: Fixed description line length]
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
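
    The two bugs described above can be reproduced in userspace with a
    short sketch. sign_extend32() below mirrors the arithmetic of the
    kernel helper of the same name; the current_ua_*() functions are
    hypothetical, int64_t stands in for a 64-bit long, and the LSB value
    of 10 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 'index' is the 0-based position of the sign bit, as in the kernel helper. */
static inline int32_t sign_extend32(uint32_t value, int index)
{
    int shift = 31 - index;

    return (int32_t)(value << shift) >> shift;
}

/* Off-by-one sign bit: bit 16 of a 16-bit register is never set. */
static inline int64_t current_ua_wrong_bit(uint16_t regval, uint32_t lsb_ua)
{
    return (int64_t)sign_extend32(regval, 16) * (int64_t)lsb_ua;
}

/* Correct sign bit, but the multiply happens in unsigned 32-bit arithmetic. */
static inline int64_t current_ua_no_cast(uint16_t regval, uint32_t lsb_ua)
{
    return sign_extend32(regval, 15) * lsb_ua;
}

/* Correct sign bit and a widening cast before the multiply. */
static inline int64_t current_ua_fixed(uint16_t regval, uint32_t lsb_ua)
{
    return sign_extend32(regval, 15) * (int64_t)lsb_ua;
}

void demo(void)
{
    /* regval = 1000 0011 0000 0000 from the commit message */
    assert(sign_extend32(0x8300, 16) == 33536);
    assert(sign_extend32(0x8300, 15) == -32000);

    assert(current_ua_wrong_bit(0x8300, 10) == 335360);
    assert(current_ua_no_cast(0x8300, 10) == 4294647296LL); /* large positive */
    assert(current_ua_fixed(0x8300, 10) == -320000);
}
```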

hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers [+ + +]

Author: Murad Masimov <[email protected]>
Date:   Mon Dec 16 20:36:46 2024 +0300

    hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers
    
    [ Upstream commit 74d7e038fd072635d21e4734e3223378e09168d3 ]
    
    The values returned by the driver after processing the contents of the
    Shunt Voltage Register and the Shunt Limit Registers do not correspond to
    the TMP512/TMP513 specifications. A raw register value is converted to a
    signed integer value by a sign extension in accordance with the algorithm
    provided in the specification, but due to the off-by-one error in the sign
    bit index, the result is incorrect. Moreover, the PGA shift calculated with
    the tmp51x_get_pga_shift function is relevant only to the Shunt Voltage
    Register, but is also applied to the Shunt Limit Registers.
    
    According to the TMP512 and TMP513 datasheets, the Shunt Voltage Register
    (04h) is a 13- to 16-bit two's complement integer value, depending on the PGA
    setting.  The Shunt Positive (0Ch) and Negative (0Dh) Limit Registers are
    16-bit two's complement integer values. Below are some examples:
    
    * Shunt Voltage Register
    If PGA = 8, and regval = 1000 0011 0000 0000, then the decimal value must
    be -32000, but the value calculated by the driver will be 33536.
    
    * Shunt Limit Register
    If regval = 1000 0011 0000 0000, then the decimal value must be -32000, but
    the value calculated by the driver will be 768, if PGA = 1.
    
    Fix sign bit index, and also correct misleading comment describing the
    tmp51x_get_pga_shift function.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
    Signed-off-by: Murad Masimov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [groeck: Fixed description and multi-line alignments]
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers [+ + +]

Author: Murad Masimov <[email protected]>
Date:   Mon Dec 16 20:36:48 2024 +0300

    hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers
    
    [ Upstream commit dd471e25770e7e632f736b90db1e2080b2171668 ]
    
    The values returned by the driver after processing the contents of the
    Temperature Result and the Temperature Limit Registers do not correspond to
    the TMP512/TMP513 specifications. A raw register value is converted to a
    signed integer value by a sign extension in accordance with the algorithm
    provided in the specification, but due to the off-by-one error in the sign
    bit index, the result is incorrect.
    
    According to the TMP512 and TMP513 datasheets, the Temperature Result (08h
    to 0Bh) and Limit (11h to 14h) Registers are 13-bit two's complement
    integer values, shifted left by 3 bits. The value is scaled by 0.0625
    degrees Celsius per bit.  E.g., if regval = 1 1110 0111 0000 000, the
    output should be -25 degrees, but the driver will return +487 degrees.
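
    The example can be reproduced with a short sketch. It is illustrative
    only: the helper below mirrors the kernel's sign_extend32(), the
    function name temp_millidegrees() is hypothetical, and the wrong sign
    bit index (16) is an assumption chosen because it yields the +487
    degrees quoted above.

```python
def sign_extend32(value: int, index: int) -> int:
    """Interpret bit `index` of `value` as the sign bit (two's complement)."""
    width = index + 1
    value &= (1 << width) - 1
    if value & (1 << index):
        value -= 1 << width
    return value


def temp_millidegrees(regval: int, sign_index: int) -> int:
    """Drop the 3 unused low bits, then scale by 62.5 millidegrees per bit."""
    return (sign_extend32(regval, sign_index) >> 3) * 625 // 10


REGVAL = 0b1111_0011_1000_0000   # "1 1110 0111 0000 000" from the text

fixed = temp_millidegrees(REGVAL, 15)   # sign bit of the 16-bit register
buggy = temp_millidegrees(REGVAL, 16)   # assumed off-by-one index

print(fixed, buggy)
```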
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
    Signed-off-by: Murad Masimov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [groeck: fixed description line length]
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: pnx: Fix timeout in wait functions [+ + +]

Author: Vladimir Riabchun <[email protected]>
Date:   Sat Dec 7 00:19:34 2024 +0100

    i2c: pnx: Fix timeout in wait functions
    
    [ Upstream commit 7363f2d4c18557c99c536b70489187bb4e05c412 ]
    
    Since commit f63b94be6942 ("i2c: pnx: Fix potential deadlock warning
    from del_timer_sync() call in isr") jiffies are stored in
    i2c_pnx_algo_data.timeout, but wait_timeout and wait_reset are still
    using it as milliseconds. Convert jiffies back to milliseconds to wait
    for the expected amount of time.
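
    A rough sketch of the unit mix-up, under stated assumptions: HZ = 100
    and the 500 ms timeout are hypothetical values, and the two conversion
    helpers are simplified stand-ins for the kernel functions of the same
    names.

```python
HZ = 100  # assumed tick rate (ticks per second); real kernels vary


def msecs_to_jiffies(ms: int) -> int:
    """Convert milliseconds to ticks, rounding up."""
    return (ms * HZ + 999) // 1000


def jiffies_to_msecs(jiffies: int) -> int:
    """Convert ticks back to milliseconds."""
    return jiffies * 1000 // HZ


timeout_jiffies = msecs_to_jiffies(500)        # what is stored in the timeout field
wrong_wait_ms = timeout_jiffies                # jiffies misread as milliseconds
right_wait_ms = jiffies_to_msecs(timeout_jiffies)

print(timeout_jiffies, wrong_wait_ms, right_wait_ms)
```

    With these assumed numbers the wait functions would give up ten times
    too early unless the stored value is converted back.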
    
    Fixes: f63b94be6942 ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr")
    Signed-off-by: Vladimir Riabchun <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: riic: Always round-up when calculating bus period [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Fri Nov 22 15:14:35 2024 +0100

    i2c: riic: Always round-up when calculating bus period
    
    commit de6b43798d9043a7c749a0428dbb02d5fff156e5 upstream.
    
    Currently, the RIIC driver may run the I2C bus faster than requested,
    which may cause subtle failures.  E.g. Biju reported a measured bus
    speed of 450 kHz instead of the expected maximum of 400 kHz on RZ/G2L.
    
    The initial calculation of the bus period uses DIV_ROUND_UP(), to make
    sure the actual bus speed never becomes faster than the requested bus
    speed.  However, the subsequent division-by-two steps do not use
    round-up, which may lead to a too-small period, hence a too-fast and
    possibly out-of-spec bus speed.  E.g. on RZ/Five, requesting a bus speed
    of 100 resp. 400 kHz will yield too-fast target bus speeds of 100806
    resp. 403226 Hz instead of 97656 resp. 390625 Hz.
    
    Fix this by using DIV_ROUND_UP() in the subsequent divisions, too.
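
    The effect of the rounding can be sketched as follows. This is a
    simplified model, not the driver: the 100 MHz parent clock and the
    58-tick limit are assumptions, chosen because they reproduce the
    RZ/Five figures quoted above, and target_bus_hz() is a hypothetical
    name.

```python
def div_round_up(n: int, d: int) -> int:
    """Integer division rounding up, like the kernel's DIV_ROUND_UP()."""
    return (n + d - 1) // d


def target_bus_hz(rate_hz: int, bus_hz: int, round_up: bool, max_ticks: int = 58) -> int:
    """Halve the period until it fits max_ticks, then report the resulting speed."""
    total = div_round_up(rate_hz, bus_hz)      # initial period in clock ticks
    halvings = 0
    while total > max_ticks:
        total = div_round_up(total, 2) if round_up else total // 2
        halvings += 1
    period = total << halvings                 # period in undivided clock ticks
    return round(rate_hz / period)


RATE = 100_000_000  # assumed parent clock in Hz

for requested in (100_000, 400_000):
    print(requested,
          target_bus_hz(RATE, requested, round_up=False),
          target_bus_hz(RATE, requested, round_up=True))
```

    Rounding down at any halving step shortens the period, so the bus ends
    up faster than requested; rounding up keeps it at or below the request.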
    
    Tested on RZ/A1H, RZ/A2M, and RZ/Five.
    
    Fixes: d982d66514192cdb ("i2c: riic: remove clock and frequency restrictions")
    Reported-by: Biju Das <[email protected]>
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Cc: <[email protected]> # v4.15+
    Link: https://lore.kernel.org/r/c59aea77998dfea1b4456c4b33b55ab216fcbf5e.1732284746.git.geert+renesas@glider.be
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i915/guc: Accumulate active runtime on gt reset [+ + +]

Author: Umesh Nerlige Ramappa <[email protected]>
Date:   Wed Nov 27 09:40:06 2024 -0800

    i915/guc: Accumulate active runtime on gt reset
    
    [ Upstream commit 1622ed27d26ab4c234476be746aa55bcd39159dd ]
    
    On gt reset, if a context is running, then accumulate its active time
    into the busyness counter since there will be no chance for the context
    to switch out and update its run time.
    
    v2: Move comment right above the if (John)
    
    Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
    Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
    Reviewed-by: John Harrison <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 7ed047da59cfa1acb558b95169d347acc8d85da1)
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i915/guc: Ensure busyness counter increases motonically [+ + +]

Author: Umesh Nerlige Ramappa <[email protected]>
Date:   Wed Nov 27 09:40:05 2024 -0800

    i915/guc: Ensure busyness counter increases motonically
    
    [ Upstream commit 59a0b46788d58fdcee8d2f6b4e619d264a1799bf ]
    
    Active busyness of an engine is calculated using gt timestamp and the
    context switch in time. While capturing the gt timestamp, it's possible
    that the context switches out. This race could result in an active
    busyness value that is greater than the actual context runtime value by a
    small amount. This leads to a negative delta and throws off busyness
    calculations for the user.
    
    If a subsequent count is smaller than the previous one, just return the
    previous one, since we expect the busyness to catch up.
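
    The idea of returning the previous value can be sketched like this.
    The class and the sample values are hypothetical and only illustrate
    the clamping; they are not taken from the driver.

```python
class BusynessCounter:
    """Report a value that never decreases, even if a raw sample dips."""

    def __init__(self) -> None:
        self.last = 0

    def report(self, sample: int) -> int:
        # A sample below the previous report is treated as a racy reading:
        # keep the previous value and let later samples catch up.
        if sample > self.last:
            self.last = sample
        return self.last


counter = BusynessCounter()
raw_samples = [100, 130, 128, 150]      # third sample dips slightly
reported = [counter.report(s) for s in raw_samples]

print(reported)
```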
    
    Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
    Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
    Reviewed-by: John Harrison <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit cf907f6d294217985e9dafd9985dce874e04ca37)
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i915/guc: Reset engine utilization buffer before registration [+ + +]

Author: Umesh Nerlige Ramappa <[email protected]>
Date:   Wed Nov 27 09:40:04 2024 -0800

    i915/guc: Reset engine utilization buffer before registration
    
    [ Upstream commit abcc2ddae5f82aa6cfca162e3db643dd33f0a2e8 ]
    
    On GT reset, we store total busyness counts for all engines and
    re-register the utilization buffer with GuC. At that time we should
    reset the buffer, so that we don't get spurious busyness counts on
    subsequent queries.
    
    To repro this issue, run igt@perf_pmu@busy-hang followed by
    igt@perf_pmu@most-busy-idle-check-all for a couple iterations.
    
    Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
    Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
    Reviewed-by: John Harrison <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit abd318237fa6556c1e5225529af145ef15d5ff0d)
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

io_uring: check if iowq is killed before queuing [+ + +]

Author: Pavel Begunkov <[email protected]>
Date:   Thu Dec 19 19:52:58 2024 +0000

    io_uring: check if iowq is killed before queuing
    
    commit dbd2ca9367eb19bc5e269b8c58b0b1514ada9156 upstream.
    
    Task work can be executed after the task has gone through io_uring
    termination, whether it's the final task_work run or the fallback path.
    In this case, task work will find ->io_wq being already killed and
    null'ed, which is a problem if it then tries to forward the request to
    io_queue_iowq(). Make io_queue_iowq() fail requests in this case.
    
    Note that it also checks PF_KTHREAD, because the user can first close
    a DEFER_TASKRUN ring and shortly after kill the task, in which case
    ->iowq check would race.
    
    Cc: [email protected]
    Fixes: 50c52250e2d74 ("block: implement async io_uring discard cmd")
    Fixes: 773af69121ecc ("io_uring: always reissue from task_work context")
    Reported-by: Will <[email protected]>
    Signed-off-by: Pavel Begunkov <[email protected]>
    Link: https://lore.kernel.org/r/63312b4a2c2bb67ad67b857d17a300e1d3b078e8.1734637909.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

io_uring: Fix registered ring file refcount leak [+ + +]

Author: Jann Horn <[email protected]>
Date:   Wed Dec 18 17:56:25 2024 +0100

    io_uring: Fix registered ring file refcount leak
    
    commit 12d908116f7efd34f255a482b9afc729d7a5fb78 upstream.
    
    Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is
    only called on exit, but __io_uring_free (which frees the tctx in which the
    registered ring pointers are stored) is also called on execve (via
    begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel ->
    io_uring_cancel_generic -> __io_uring_free).
    
    This means: A process going through execve while having registered rings
    will leak references to the rings' `struct file`.
    
    Fix it by zapping registered rings on execve(). This is implemented by
    moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its
    callee __io_uring_cancel(), which is called from io_uring_task_cancel() on
    execve.
    
    This could probably be exploited *on 32-bit kernels* by leaking 2^32
    references to the same ring, because the file refcount is stored in a
    pointer-sized field and get_file() doesn't have protection against
    refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no
    impact beyond a memory leak.
    
    Cc: [email protected]
    Fixes: e7a6c00dc77a ("io_uring: add support for registering ring file descriptors")
    Signed-off-by: Jann Horn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ionic: Fix netdev notifier unregister on failure [+ + +]

Author: Brett Creeley <[email protected]>
Date:   Thu Dec 12 13:31:55 2024 -0800

    ionic: Fix netdev notifier unregister on failure
    
    [ Upstream commit 9590d32e090ea2751e131ae5273859ca22f5ac14 ]
    
    If register_netdev() fails, then the driver leaks the netdev notifier.
    Fix this by calling ionic_lif_unregister() on register_netdev()
    failure. This will also call ionic_lif_unregister_phc() if it has
    already been registered.
    
    Fixes: 30b87ab4c0b3 ("ionic: remove lif list concept")
    Signed-off-by: Brett Creeley <[email protected]>
    Signed-off-by: Shannon Nelson <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ionic: no double destroy workqueue [+ + +]

Author: Shannon Nelson <[email protected]>
Date:   Thu Dec 12 13:31:56 2024 -0800

    ionic: no double destroy workqueue
    
    [ Upstream commit 746e6ae2e202b062b9deee7bd86d94937997ecd7 ]
    
    There are some FW error handling paths that can cause us to
    try to destroy the workqueue more than once, so let's be sure
    we're checking for that.
    
    The case where this popped up was in an AER event where the
    handlers got called in such a way that ionic_reset_prepare()
    and thus ionic_dev_teardown() got called twice in a row.
    The second time through the workqueue was already destroyed,
    and destroy_workqueue() choked on the bad wq pointer.
    
    We didn't hit this in AER handler testing before because at
    that time we weren't using a private workqueue.  Later we
    replaced the use of the system workqueue with our own private
    workqueue but hadn't rerun the AER handler testing since then.
    
    Fixes: 9e25450da700 ("ionic: add private workqueue per-device")
    Signed-off-by: Shannon Nelson <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ionic: use ee->offset when returning sprom data [+ + +]

Author: Shannon Nelson <[email protected]>
Date:   Thu Dec 12 13:31:57 2024 -0800

    ionic: use ee->offset when returning sprom data
    
    [ Upstream commit b096d62ba1323391b2db98b7704e2468cf3b1588 ]
    
    Some calls into ionic_get_module_eeprom() don't use a single
    full buffer size, but instead multiple calls with an offset.
    Teach our driver to use the offset correctly so we can
    respond appropriately to the caller.
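
    A minimal sketch of chunked reads, assuming the problem was that the
    requested offset was not applied to the cached data. SPROM,
    read_ignoring_offset() and read_with_offset() are hypothetical names
    used only for illustration.

```python
SPROM = bytes(range(256))   # stand-in for cached module EEPROM contents


def read_ignoring_offset(offset: int, length: int) -> bytes:
    """Assumed old behaviour: always copy from the start of the buffer."""
    return SPROM[:length]


def read_with_offset(offset: int, length: int) -> bytes:
    """Copy the window the caller asked for."""
    return SPROM[offset:offset + length]


# A caller that fetches the data in four 64-byte pieces.
offsets = range(0, len(SPROM), 64)
wrong = b"".join(read_ignoring_offset(off, 64) for off in offsets)
right = b"".join(read_with_offset(off, 64) for off in offsets)

print(right == SPROM, wrong == SPROM)
```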
    
    Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support")
    Signed-off-by: Shannon Nelson <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems [+ + +]

Author: David Laight <[email protected]>
Date:   Sat Dec 14 17:30:53 2024 +0000

    ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems
    
    [ Upstream commit cf2c97423a4f89c8b798294d3f34ecfe7e7035c3 ]
    
    The 'max_avail' value is calculated from the system memory
    size using order_base_2().
    order_base_2(x) is defined as '(x) ? fn(x) : 0'.
    The compiler generates two copies of the code that follows
    and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit).
    This triggers a compile-time assert since min is 5.
    
    In reality a system would have to have less than 512MB memory
    for the bounds passed to clamp to be reversed.
    
    Swap the order of the arguments to clamp() to avoid the warning.
    
    Replace the clamp_val() on the line below with clamp().
    clamp_val() is just 'an accident waiting to happen' and not needed here.
    
    Detected by compile time checks added to clamp(), specifically:
    minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
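
    A sketch of why swapping the arguments helps, under stated
    assumptions: clamp() below takes (val, lo, hi) like the kernel macro
    and raises at run time where the kernel asserts at build time, and
    the limits 8 and 20 are hypothetical.

```python
TAB_BITS_MIN = 8    # hypothetical lower limit
TAB_BITS_MAX = 20   # hypothetical upper limit


def clamp(val: int, lo: int, hi: int) -> int:
    """Clamp val into [lo, hi]; reject reversed bounds."""
    if lo > hi:
        raise ValueError("clamp() called with lo > hi")
    return min(max(val, lo), hi)


def old_form(max_avail: int) -> int:
    # The memory-derived value is used as the upper bound, so a small
    # system can make the bounds reversed.
    return clamp(TAB_BITS_MAX, TAB_BITS_MIN, max_avail)


def new_form(max_avail: int) -> int:
    # The memory-derived value is the thing being clamped; the bounds
    # are constants that are always in order.
    return clamp(max_avail, TAB_BITS_MIN, TAB_BITS_MAX)


print(new_form(5), new_form(12), new_form(30))
```

    For inputs at or above the lower limit both forms give the same
    answer; only the swapped form is well defined below it.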
    
    Reported-by: Linux Kernel Functional Testing <[email protected]>
    Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6ke0g@mail.gmail.com/
    Fixes: 4f325e26277b ("ipvs: dynamically limit the connection hash table")
    Tested-by: Bartosz Golaszewski <[email protected]>
    Reviewed-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: David Laight <[email protected]>
    Acked-by: Julian Anastasov <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

irqchip/gic-v3: Work around insecure GIC integrations [+ + +]

Author: Marc Zyngier <[email protected]>
Date:   Fri Dec 13 14:10:37 2024 +0000

    irqchip/gic-v3: Work around insecure GIC integrations
    
    commit 773c05f417fa14e1ac94776619e9c978ec001f0b upstream.
    
    It appears that the relatively popular RK3399 SoC has been put together
    using a large amount of illicit substances, as experiments reveal that its
    integration of GIC500 exposes the *secure* programming interface to
    non-secure.
    
    This has some pretty bad effects on the way priorities are handled, and
    results in a dead machine if booting with pseudo-NMI enabled
    (irqchip.gicv3_pseudo_nmi=1) if the kernel contains 18fdb6348c480 ("arm64:
    irqchip/gic-v3: Select priorities at boot time"), which relies on the
    priorities being programmed using the NS view.
    
    Let's restore some sanity by going one step further and disable security
    altogether in this case. This is not any worse, and puts us in a mode where
    priorities actually make some sense.
    
    Huge thanks to Mark Kettenis who initially identified this issue on
    OpenBSD, and to Chen-Yu Tsai who reported the problem in Linux.
    
    Fixes: 18fdb6348c480 ("arm64: irqchip/gic-v3: Select priorities at boot time")
    Reported-by: Mark Kettenis <[email protected]>
    Reported-by: Chen-Yu Tsai <[email protected]>
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Chen-Yu Tsai <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ksmbd: count all requests in req_running counter [+ + +]

Author: Marios Makassikis <[email protected]>
Date:   Sat Dec 14 12:16:45 2024 +0900

    ksmbd: count all requests in req_running counter
    
    [ Upstream commit 83c47d9e0ce79b5d7c0b21b9f35402dbde0fa15c ]
    
    This changes the semantics of req_running to count all in-flight
    requests on a given connection, rather than the number of elements
    in the conn->request list. The latter is used only in smb2_cancel,
    and the counter is not used
    
    Signed-off-by: Marios Makassikis <[email protected]>
    Acked-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Stable-dep-of: 43fb7bce8866 ("ksmbd: fix broken transfers when exceeding max simultaneous operations")
    Signed-off-by: Sasha Levin <[email protected]>

ksmbd: fix broken transfers when exceeding max simultaneous operations [+ + +]

Author: Marios Makassikis <[email protected]>
Date:   Sat Dec 14 12:17:23 2024 +0900

    ksmbd: fix broken transfers when exceeding max simultaneous operations
    
    [ Upstream commit 43fb7bce8866e793275c4f9f25af6a37745f3416 ]
    
    Since commit 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB
    operations"), ksmbd enforces a maximum number of simultaneous operations
    for a connection. The problem is that reaching the limit causes ksmbd to
    close the socket, and the client has no indication that it should have
    slowed down.
    
    This behaviour can be reproduced by setting "smb2 max credits = 128" (or
    lower), and transferring a large file (25GB).
    
    smbclient fails as below:
    
      $ smbclient //192.168.1.254/testshare -U user%pass
      smb: \> put file.bin
      cli_push returned NT_STATUS_USER_SESSION_DELETED
      putting file file.bin as \file.bin smb2cli_req_compound_submit:
      Insufficient credits. 0 available, 1 needed
      NT_STATUS_INTERNAL_ERROR closing remote file \file.bin
      smb: \> smb2cli_req_compound_submit: Insufficient credits. 0 available,
      1 needed
    
    Windows clients fail with 0x8007003b (with smaller files even).
    
    Fix this by delaying reading from the socket until there's room to
    allocate a request. This effectively applies backpressure on the client,
    so the transfer completes, albeit at a slower rate.
    
    Fixes: 0a77d947f599 ("ksmbd: check outstanding simultaneous SMB operations")
    Signed-off-by: Marios Makassikis <[email protected]>
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden [+ + +]

Author: Marc Zyngier <[email protected]>
Date:   Tue Dec 3 19:02:36 2024 +0000

    KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden
    
    commit 03c7527e97f73081633d773f9f8c2373f9854b25 upstream.
    
    Catalin reports that a hypervisor lying to a guest about the size
    of the ASID field may result in unexpected issues:
    
    - if the underlying HW only supports 8 bit ASIDs, the ASID
      field in a TLBI VAE1* operation is only 8 bits, and the HW will
      ignore the other 8 bits
    
    - if on the contrary the HW is 16 bit capable, the ASID field
      in the same TLBI operation is always 16 bits, irrespective of
      the value of TCR_ELx.AS.
    
    This could lead to missed invalidations if the guest was led to
    assume that the HW had 8 bit ASIDs while they really are 16 bit wide.
    
    In order to avoid any potential disaster that would be hard to debug,
    prevent the migration from a host with 8 bit ASIDs to one with
    wider ASIDs (the converse was obviously always forbidden). This is
    also consistent with what we already do for VMIDs.
    
    If it becomes absolutely mandatory to support such a migration path
    in the future, we will have to trap and emulate all TLBIs, something
    that nobody should look forward to.
    
    Fixes: d5a32b60dc18 ("KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1")
    Reported-by: Catalin Marinas <[email protected]>
    Signed-off-by: Marc Zyngier <[email protected]>
    Cc: [email protected]
    Cc: Will Deacon <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Marc Zyngier <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Oliver Upton <[email protected]>
    Acked-by: Catalin Marinas <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Oliver Upton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits [+ + +]

Author: Sean Christopherson <[email protected]>
Date:   Wed Dec 11 09:29:52 2024 -0800

    KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits
    
    commit 4d5163cba43fe96902165606fa54e1aecbbb32de upstream.
    
    Drop KVM's arbitrary behavior of making DE_CFG.LFENCE_SERIALIZE read-only
    for the guest, as rejecting writes can lead to guest crashes, e.g. Windows
    in particular doesn't gracefully handle unexpected #GPs on the WRMSR, and
    nothing in the AMD manuals suggests that LFENCE_SERIALIZE is read-only _if
    it exists_.
    
    KVM only allows LFENCE_SERIALIZE to be set, by the guest or host, if the
    underlying CPU has X86_FEATURE_LFENCE_RDTSC, i.e. if LFENCE is guaranteed
    to be serializing.  So if the guest sets LFENCE_SERIALIZE, KVM will provide
    the desired/correct behavior without any additional action (the guest's
    value is never stuffed into hardware).  And having LFENCE be serializing
    even when it's not _required_ to be is a-ok from a functional perspective.
    
    Fixes: 74a0e79df68a ("KVM: SVM: Disallow guest from changing userspace's MSR_AMD64_DE_CFG value")
    Fixes: d1d93fa90f1a ("KVM: SVM: Add MSR-based feature support for serializing LFENCE")
    Reported-by: Simon Pilkington <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]
    Cc: Tom Lendacky <[email protected]>
    Cc: [email protected]
    Reviewed-by: Tom Lendacky <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sean Christopherson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init [+ + +]

Author: Sean Christopherson <[email protected]>
Date:   Tue Dec 10 17:32:58 2024 -0800

    KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
    
    commit 1201f226c863b7da739f7420ddba818cedf372fc upstream.
    
    Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initialization to
    avoid the overhead of CPUID during runtime.  The offset, size, and metadata
    for CPUID.0xD.[1..n] sub-leaves do not depend on XCR0 or XSS values, i.e.
    are constant for a given CPU, and thus can be cached during module load.
    
    On Intel's Emerald Rapids, CPUID is *wildly* expensive, to the point where
    recomputing XSAVE offsets and sizes results in a 4x increase in latency of
    nested VM-Enter and VM-Exit (nested transitions can trigger
    xstate_required_size() multiple times per transition), relative to using
    cached values.  The issue is easily visible by running `perf top` while
    triggering nested transitions: kvm_update_cpuid_runtime() shows up at a
    whopping 50%.
    
    As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test
    and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested
    roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald
    Rapids (EMR) takes:
    
      SKX 11650
      ICX 22350
      EMR 28850
    
    Using cached values, the latency drops to:
    
      SKX 6850
      ICX 9000
      EMR 7900
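
    For reference, the ratio between the two sets of figures above can be
    worked out as follows. The numbers are copied from the tables; the
    dictionary names are just for illustration.

```python
# Nested CPUID roundtrip latency as quoted above, without and with caching.
uncached = {"SKX": 11650, "ICX": 22350, "EMR": 28850}
cached = {"SKX": 6850, "ICX": 9000, "EMR": 7900}

# How many times slower the uncached path is on each CPU.
slowdown = {cpu: round(uncached[cpu] / cached[cpu], 2) for cpu in uncached}

for cpu, factor in slowdown.items():
    print(f"{cpu}: {factor}x")
```

    The EMR ratio is the one that rounds to the roughly 4x figure
    mentioned earlier.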
    
    The underlying issue is that CPUID itself is slow on ICX, and comically
    slow on EMR.  The problem is exacerbated on CPUs which support XSAVES
    and/or XSAVEC, as KVM invokes xstate_required_size() twice on each
    runtime CPUID update, and because there are more supported XSAVE features
    (CPUID for supported XSAVE feature sub-leafs is significantly slower).
    
     SKX:
      CPUID.0xD.2  = 348 cycles
      CPUID.0xD.3  = 400 cycles
      CPUID.0xD.4  = 276 cycles
      CPUID.0xD.5  = 236 cycles
      <other sub-leaves are similar>
    
     EMR:
      CPUID.0xD.2  = 1138 cycles
      CPUID.0xD.3  = 1362 cycles
      CPUID.0xD.4  = 1068 cycles
      CPUID.0xD.5  = 910 cycles
      CPUID.0xD.6  = 914 cycles
      CPUID.0xD.7  = 1350 cycles
      CPUID.0xD.8  = 734 cycles
      CPUID.0xD.9  = 766 cycles
      CPUID.0xD.10 = 732 cycles
      CPUID.0xD.11 = 718 cycles
      CPUID.0xD.12 = 734 cycles
      CPUID.0xD.13 = 1700 cycles
      CPUID.0xD.14 = 1126 cycles
      CPUID.0xD.15 = 898 cycles
      CPUID.0xD.16 = 716 cycles
      CPUID.0xD.17 = 748 cycles
      CPUID.0xD.18 = 776 cycles
    
    Note, updating runtime CPUID information multiple times per nested
    transition is itself a flaw, especially since CPUID is a mandatory
    intercept on both Intel and AMD.  E.g. KVM doesn't need to ensure emulated
    CPUID state is up-to-date while running L2.  That flaw will be fixed in a
    future patch, as deferring runtime CPUID updates is more subtle than it
    appears at first glance, the benefits aren't super critical to have once
    the XSAVE issue is resolved, and caching CPUID output is desirable even if
    KVM's updates are deferred.
    
    Cc: Jim Mattson <[email protected]>
    Cc: [email protected]
    Signed-off-by: Sean Christopherson <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Paolo Bonzini <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

KVM: x86: Play nice with protected guests in complete_hypercall_exit() [+ + +]

Author: Sean Christopherson <[email protected]>
Date:   Wed Nov 27 16:43:39 2024 -0800

    KVM: x86: Play nice with protected guests in complete_hypercall_exit()
    
    commit 9b42d1e8e4fe9dc631162c04caa69b0d1860b0f0 upstream.
    
    Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit
    hypercall when completing said hypercall.  For guests with protected state,
    e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit
    mode as the vCPU state needed to detect 64-bit mode is unavailable.
    
    Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE
    hypercall via VMGEXIT trips the WARN:
    
      ------------[ cut here ]------------
      WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm]
      Modules linked in: kvm_amd kvm ... [last unloaded: kvm]
      CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470
      Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
      RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm]
      Call Trace:
       <TASK>
       kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm]
       kvm_vcpu_ioctl+0x54f/0x630 [kvm]
       __se_sys_ioctl+0x6b/0xc0
       do_syscall_64+0x83/0x160
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
       </TASK>
      ---[ end trace 0000000000000000 ]---
    
    Fixes: b5aead0064f3 ("KVM: x86: Assume a 64-bit hypercall for guests with protected state")
    Cc: [email protected]
    Cc: Tom Lendacky <[email protected]>
    Reviewed-by: Xiaoyao Li <[email protected]>
    Reviewed-by: Nikunj A Dadhania <[email protected]>
    Reviewed-by: Tom Lendacky <[email protected]>
    Reviewed-by: Binbin Wu <[email protected]>
    Reviewed-by: Kai Huang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sean Christopherson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Linux: Linux 6.12.7 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri Dec 27 14:02:20 2024 +0100

    Linux 6.12.7
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Salvatore Bonaccorso <[email protected]>
    Tested-by: Harshit Mogalapalli <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Luna Jernberg <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Justin M. Forbes <[email protected]>
    Tested-by: kernelci.org bot <[email protected]>
    Tested-by: Markus Reichelt <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Hardik Garg <[email protected]>
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy() [+ + +]

Author: David Hildenbrand <[email protected]>
Date:   Tue Dec 10 10:34:37 2024 +0100

    mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy()
    
    commit faeec8e23c10bd30e8aa759a2eb3018dae00f924 upstream.
    
    In split_large_buddy(), we might call pfn_to_page() on a PFN that might
    not exist.  In corner cases, such as when freeing the highest pageblock in
    the last memory section, this could result with CONFIG_SPARSEMEM &&
    !CONFIG_SPARSEMEM_EXTREME in __pfn_to_section() returning NULL and
    __section_mem_map_addr() dereferencing that NULL pointer.
    
    Let's fix it, and avoid doing a pfn_to_page() call for the first
    iteration, where we already have the page.
    
    So far this was found by code inspection, but let's just CC stable as the
    fix is easy.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: fd919a85cd55 ("mm: page_isolation: prepare for hygienic freelists")
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Vlastimil Babka <[email protected]>
    Closes: https://lkml.kernel.org/r/[email protected]
    Reviewed-by: Vlastimil Babka <[email protected]>
    Reviewed-by: Zi Yan <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Cc: Yu Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: convert partially_mapped set/clear operations to be atomic [+ + +]

Author: Usama Arif <[email protected]>
Date:   Thu Dec 12 18:33:51 2024 +0000

    mm: convert partially_mapped set/clear operations to be atomic
    
    commit 42b2eb69835b0fda797f70eb5b4fc213dbe3a7ea upstream.
    
    Other page flags in the 2nd page, like PG_hwpoison and PG_anon_exclusive,
    can get modified concurrently.  Changes to other page flags might be lost
    if they are happening at the same time as non-atomic partially_mapped
    operations.  Hence, make partially_mapped operations atomic.
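
    The lost update can be illustrated with a deterministic, single
    threaded replay of one bad interleaving. The bit positions are
    hypothetical and the "CPUs" are just labelled steps; this is a sketch
    of the race, not kernel code.

```python
PG_PARTIALLY_MAPPED = 1 << 0   # hypothetical bit positions
PG_HWPOISON = 1 << 1

# Non-atomic set: load, modify and store are separate steps.
flags = 0
snapshot = flags                          # CPU A loads the flags word
flags |= PG_HWPOISON                      # CPU B sets its flag in between
flags = snapshot | PG_PARTIALLY_MAPPED    # CPU A stores a stale value
after_non_atomic = flags

# Atomic set: each read-modify-write is indivisible, so the updates
# are applied one after the other in some order.
flags = 0
flags |= PG_HWPOISON                      # CPU B
flags |= PG_PARTIALLY_MAPPED              # CPU A
after_atomic = flags

print(bin(after_non_atomic), bin(after_atomic))
```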
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 8422acdc97ed ("mm: introduce a pageflag for partially mapped folios")
    Reported-by: David Hildenbrand <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Usama Arif <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Acked-by: Roman Gushchin <[email protected]>
    Cc: Barry Song <[email protected]>
    Cc: Domenico Cerasuolo <[email protected]>
    Cc: Jonathan Corbet <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Mike Rapoport (Microsoft) <[email protected]>
    Cc: Nico Pache <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Cc: Ryan Roberts <[email protected]>
    Cc: Shakeel Butt <[email protected]>
    Cc: Yu Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: shmem: fix ShmemHugePages at swapout [+ + +]

Author: Hugh Dickins <[email protected]>
Date:   Wed Dec 4 22:50:06 2024 -0800

    mm: shmem: fix ShmemHugePages at swapout
    
    commit dad2dc9c92e0f93f33cebcb0595b8daa3d57473f upstream.
    
    /proc/meminfo ShmemHugePages has been showing overlarge amounts (more than
    Shmem) after swapping out THPs: we forgot to update NR_SHMEM_THPS.
    
    Add shmem_update_stats(), to avoid repetition, and risk of making that
    mistake again: the call from shmem_delete_from_page_cache() is the bugfix;
    the call from shmem_replace_folio() is reassuring, but not really a bugfix
    (replace corrects misplaced swapin readahead, but huge swapin readahead
    would be a mistake).
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
    Signed-off-by: Hugh Dickins <[email protected]>
    Reviewed-by: Shakeel Butt <[email protected]>
    Reviewed-by: Yosry Ahmed <[email protected]>
    Reviewed-by: Baolin Wang <[email protected]>
    Tested-by: Baolin Wang <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: use aligned address in clear_gigantic_page() [+ + +]

Author: Kefeng Wang <[email protected]>
Date:   Mon Oct 28 22:56:55 2024 +0800

    mm: use aligned address in clear_gigantic_page()
    
    commit 8aca2bc96c833ba695ede7a45ad7784c836a262e upstream.
    
    In the current kernel, hugetlb_no_page() calls folio_zero_user() with the
    fault address, which may not be aligned to the huge page size.  Then,
    folio_zero_user() may call clear_gigantic_page() with that address, while
    clear_gigantic_page() requires the address to be huge page size aligned.
    This may cause memory corruption or an information leak.  Additionally,
    use the more obvious name 'addr_hint' instead of 'addr' for
    clear_gigantic_page().
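
    The alignment step can be sketched in plain C as follows.  This is an
    assumption-laden illustration, not the patch: align_down() and
    huge_page_base() are hypothetical helpers, and the huge page size is
    assumed to be a power of two.

    ```c
    #include <stdint.h>

    /* Round addr down to a power-of-two boundary. */
    static uint64_t align_down(uint64_t addr, uint64_t align)
    {
        return addr & ~(align - 1);
    }

    /*
     * Hypothetical helper: base user address of the huge page containing
     * addr_hint, for a huge page of nr_pages pages of page_size bytes.
     */
    static uint64_t huge_page_base(uint64_t addr_hint, uint64_t nr_pages,
                                   uint64_t page_size)
    {
        return align_down(addr_hint, nr_pages * page_size);
    }
    ```

    For a 1 GiB huge page made of 262144 pages of 4096 bytes, a fault at
    0x7f0040001234 maps back to the base address 0x7f0040000000.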
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 78fefd04c123 ("mm: memory: convert clear_huge_page() to folio_zero_user()")
    Signed-off-by: Kefeng Wang <[email protected]>
    Reviewed-by: "Huang, Ying" <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Cc: Matthew Wilcox (Oracle) <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: use aligned address in copy_user_gigantic_page() [+ + +]

Author: Kefeng Wang <[email protected]>
Date:   Mon Oct 28 22:56:56 2024 +0800

    mm: use aligned address in copy_user_gigantic_page()
    
    commit f5d09de9f1bf9674c6418ff10d0a40cfe29268e1 upstream.
    
    In the current kernel, hugetlb_wp() calls copy_user_large_folio() with the
    fault address, which may not be aligned to the huge page size.  Then,
    copy_user_large_folio() may call copy_user_gigantic_page() with that
    address, while copy_user_gigantic_page() requires the address to be huge
    page size aligned.  This may cause memory corruption or an information
    leak.  Additionally, use the more obvious name 'addr_hint' instead of
    'addr' for copy_user_gigantic_page().
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 530dd9926dc1 ("mm: memory: improve copy_user_large_folio()")
    Signed-off-by: Kefeng Wang <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Cc: Huang Ying <[email protected]>
    Cc: Matthew Wilcox (Oracle) <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe() [+ + +]

Author: Joe Hattori <[email protected]>
Date:   Tue Dec 3 11:34:42 2024 +0900

    mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe()
    
    commit f3d87abe11ed04d1b23a474a212f0e5deeb50892 upstream.
    
    The current implementation leaves pdev->dev as a wakeup source. Add a
    device_init_wakeup(&pdev->dev, false) call in the .remove() function and
    in the error path of the .probe() function.
    
    Signed-off-by: Joe Hattori <[email protected]>
    Fixes: 527f36f5efa4 ("mmc: mediatek: add support for SDIO eint wakup IRQ")
    Cc: [email protected]
    Message-ID: <[email protected]>
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk [+ + +]

Author: Prathamesh Shete <[email protected]>
Date:   Mon Dec 9 15:40:09 2024 +0530

    mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk
    
    commit a56335c85b592cb2833db0a71f7112b7d9f0d56b upstream.
    
    A value of 0 in the ADMA length descriptor is interpreted as 65536 on new
    Tegra chips.  Remove the SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk to make
    sure the max ADMA2 length is 65536.
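
    A small sketch of the length encoding, assuming a 16-bit descriptor
    length field in which 0 stands for 65536, as the text above states.
    The helper names are hypothetical and are not part of the sdhci
    driver.

    ```c
    #include <stdint.h>

    /* Encode a transfer length (1..65536 bytes) into a 16-bit field. */
    static uint16_t adma_encode_len(uint32_t len)
    {
        return (uint16_t)(len & 0xFFFFu);   /* 65536 wraps to 0 */
    }

    /* Decode on hardware that interprets a zero length as 65536 bytes. */
    static uint32_t adma_decode_len(uint16_t field)
    {
        return field ? field : 65536u;
    }
    ```

    On such hardware a full 65536-byte segment round-trips through the
    field, so there is no need to cap segments below that size.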
    
    Fixes: 4346b7c7941d ("mmc: tegra: Add Tegra186 support")
    Cc: [email protected]
    Signed-off-by: Prathamesh Shete <[email protected]>
    Acked-by: Thierry Reding <[email protected]>
    Acked-by: Adrian Hunter <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:18 2024 +0800

    net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg
    
    [ Upstream commit a29e220d3c8edbf0e1beb0f028878a4a85966556 ]
    
    When the server receives a proposal msg, the fields iparea_offset and
    ipv6_prefixes_cnt in the proposal msg come from the remote client and
    cannot be fully trusted. In particular, once iparea_offset exceeds
    the max value, a wrong address may be accessed and a crash may
    happen.
    
    This patch checks iparea_offset and ipv6_prefixes_cnt before using them.
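
    The kind of check involved can be sketched generically as below.  This
    is not the SMC code: area_in_bounds() and its parameters are
    hypothetical, and the sketch only shows an overflow-safe way to
    validate an untrusted offset against the received message length.

    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Return true if an area of area_len bytes starting at
     * base_off + offset lies entirely inside a message of msg_len bytes.
     * Subtractions are ordered so that nothing can wrap around.
     */
    static bool area_in_bounds(size_t msg_len, size_t base_off,
                               uint16_t offset, size_t area_len)
    {
        if (base_off > msg_len)
            return false;
        if (offset > msg_len - base_off)
            return false;
        return area_len <= msg_len - base_off - offset;
    }
    ```

    A receiver would run such a check before computing any pointer from
    the offset, and reject the message when it fails.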
    
    Fixes: e7b7a64a8493 ("smc: support variable CLC proposal messages")
    Signed-off-by: Guangguan Wang <[email protected]>
    Reviewed-by: Wen Gu <[email protected]>
    Reviewed-by: D. Wythe <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: check return value of sock_recvmsg when draining clc data [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:21 2024 +0800

    net/smc: check return value of sock_recvmsg when draining clc data
    
    [ Upstream commit c5b8ee5022a19464783058dc6042e8eefa34e8cd ]
    
    When receiving a clc msg, the length field in smc_clc_msg_hdr indicates
    the length of the msg to be received from the network, and the value
    should not be fully trusted as it comes from the network. Once the value
    of length exceeds the value of buflen in smc_clc_wait_msg(), it may run
    into an endless loop when trying to drain the remaining data exceeding
    buflen.
    
    This patch checks the return value of sock_recvmsg() when draining data,
    to avoid an endless loop while draining.
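
    A generic sketch of a drain loop that stops on a failed or empty
    receive.  The recv_fn callback, drain_remaining() and stub_recv() are
    hypothetical; the real code calls sock_recvmsg().

    ```c
    #include <stddef.h>

    /* Hypothetical receive callback: bytes read, 0 on EOF, < 0 on error. */
    typedef long (*recv_fn)(void *ctx, size_t max_len);

    /*
     * Discard up to 'remaining' bytes in chunks of at most 'chunk' bytes.
     * Returns the number of bytes that could not be drained.
     */
    static size_t drain_remaining(recv_fn recv, void *ctx, size_t remaining,
                                  size_t chunk)
    {
        while (remaining > 0) {
            size_t want = remaining < chunk ? remaining : chunk;
            long got = recv(ctx, want);

            if (got <= 0)       /* error or EOF: stop instead of spinning */
                break;
            remaining -= (size_t)got < remaining ? (size_t)got : remaining;
        }
        return remaining;
    }

    /* Stub source: hands out at most *ctx bytes in total, then EOF. */
    static long stub_recv(void *ctx, size_t max_len)
    {
        size_t *avail = ctx;
        size_t n = *avail < max_len ? *avail : max_len;

        *avail -= n;
        return (long)n;
    }
    ```

    If the peer announces 25 bytes but only 10 ever arrive, the loop
    exits with 15 bytes still outstanding rather than looping forever.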
    
    Fixes: fb4f79264c0f ("net/smc: tolerate future SMCD versions")
    Signed-off-by: Guangguan Wang <[email protected]>
    Reviewed-by: Wen Gu <[email protected]>
    Reviewed-by: D. Wythe <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: check smcd_v2_ext_offset when receiving proposal msg [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:20 2024 +0800

    net/smc: check smcd_v2_ext_offset when receiving proposal msg
    
    [ Upstream commit 9ab332deb671d8f7e66d82a2ff2b3f715bc3a4ad ]
    
    When the server receives a proposal msg, the field smcd_v2_ext_offset in
    the proposal msg comes from the remote client and cannot be fully trusted.
    Once the value of smcd_v2_ext_offset exceeds the max value, a wrong
    address may be accessed and a crash may happen.
    
    This patch checks the value of smcd_v2_ext_offset before using it.
    
    Fixes: 5c21c4ccafe8 ("net/smc: determine accepted ISM devices")
    Signed-off-by: Guangguan Wang <[email protected]>
    Reviewed-by: Wen Gu <[email protected]>
    Reviewed-by: D. Wythe <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:17 2024 +0800

    net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll
    
    [ Upstream commit 679e9ddcf90dbdf98aaaa71a492454654b627bcb ]
    
    When an application sends more data than sndbuf_space, there is a chance
    that the application will sleep in epoll_wait and never be woken up again.
    This is caused by a race between smc_poll and smc_cdc_tx_handler.
    
    application                                      tasklet
    smc_tx_sendmsg(len > sndbuf_space)   |
    epoll_wait for EPOLL_OUT,timeout=0   |
      smc_poll                           |
        if (!smc->conn.sndbuf_space)     |
                                         |  smc_cdc_tx_handler
                                         |    atomic_add sndbuf_space
                                         |    smc_tx_sndbuf_nonfull
                                         |      if (!test_bit SOCK_NOSPACE)
                                         |        do not sk_write_space;
          set_bit SOCK_NOSPACE;          |
        return mask=0;                   |
    
    The application will sleep in epoll_wait as smc_poll returns 0, and
    smc_cdc_tx_handler will not call sk_write_space because SOCK_NOSPACE
    has not been set. If there is no inflight cdc msg, sk_write_space will not
    be called any more, and the application will sleep in epoll_wait forever.
    So check sndbuf_space again after the NOSPACE flag is set to break the race.
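
    The "set the flag, then look again" pattern can be sketched with C11
    atomics as below.  This is a simplified model, not the SMC code:
    struct conn, poll_writable() and add_space() are hypothetical, and
    the producer clears the flag itself instead of going through
    sk_write_space().

    ```c
    #include <stdatomic.h>
    #include <stdbool.h>

    struct conn {
        atomic_int sndbuf_space;    /* free bytes in the send buffer */
        atomic_bool nospace;        /* stands in for SOCK_NOSPACE */
    };

    /* Poll side: returns true if the socket is writable. */
    static bool poll_writable(struct conn *c)
    {
        if (atomic_load(&c->sndbuf_space) > 0)
            return true;

        atomic_store(&c->nospace, true);
        /*
         * Re-check after publishing the flag: a producer that added space
         * before the flag became visible would not have sent a wakeup.
         */
        return atomic_load(&c->sndbuf_space) > 0;
    }

    /* Producer side: add space; returns true if a wakeup must be sent. */
    static bool add_space(struct conn *c, int bytes)
    {
        atomic_fetch_add(&c->sndbuf_space, bytes);
        return atomic_exchange(&c->nospace, false);
    }
    ```

    Either the producer sees the flag and wakes the poller, or the
    poller's second read sees the added space; the window in the diagram
    above is closed.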
    
    Fixes: 8dce2786a290 ("net/smc: smc_poll improvements")
    Signed-off-by: Guangguan Wang <[email protected]>
    Suggested-by: Paolo Abeni <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:19 2024 +0800

    net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg
    
    [ Upstream commit 7863c9f3d24ba49dbead7e03dfbe40deb5888fdf ]
    
    When the server receives a proposal msg, the fields v2_ext_offset/
    eid_cnt/ism_gid_cnt in the proposal msg come from the remote client
    and cannot be fully trusted. In particular, once v2_ext_offset
    exceeds the max value, a wrong address may be accessed and a crash
    may happen.
    
    This patch checks the fields v2_ext_offset/eid_cnt/ism_gid_cnt
    before using them.
    
    Fixes: 8c3dca341aea ("net/smc: build and send V2 CLC proposal")
    Signed-off-by: Guangguan Wang <[email protected]>
    Reviewed-by: Wen Gu <[email protected]>
    Reviewed-by: D. Wythe <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: protect link down work from execute after lgr freed [+ + +]

Author: Guangguan Wang <[email protected]>
Date:   Wed Dec 11 17:21:16 2024 +0800

    net/smc: protect link down work from execute after lgr freed
    
    [ Upstream commit 2b33eb8f1b3e8c2f87cfdbc8cc117f6bdfabc6ec ]
    
    The link down work may be scheduled before the lgr is freed but execute
    after the lgr is freed, which may result in a crash. So it is necessary
    to hold a reference before scheduling the link down work, and to put the
    reference after the work has executed or been canceled.
    
    The relevant crash call stack is as follows:
     list_del corruption. prev->next should be ffffb638c9c0fe20,
        but was 0000000000000000
     ------------[ cut here ]------------
     kernel BUG at lib/list_debug.c:51!
     invalid opcode: 0000 [#1] SMP NOPTI
     CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1
     Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014
     Workqueue: events smc_link_down_work [smc]
     RIP: 0010:__list_del_entry_valid.cold+0x31/0x47
     RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086
     RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000
     RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80
     RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38
     R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002
     R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0
     FS:  0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     PKRU: 55555554
     Call Trace:
      rwsem_down_write_slowpath+0x17e/0x470
      smc_link_down_work+0x3c/0x60 [smc]
      process_one_work+0x1ac/0x350
      worker_thread+0x49/0x2f0
      ? rescuer_thread+0x360/0x360
      kthread+0x118/0x140
      ? __kthread_bind_mask+0x60/0x60
      ret_from_fork+0x1f/0x30
    
    Fixes: 541afa10c126 ("net/smc: add smcr_port_err() and smcr_link_down() processing")
    Signed-off-by: Guangguan Wang <[email protected]>
    Reviewed-by: Tony Lu <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Mon Dec 16 15:50:59 2024 +0200

    net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic
    
    [ Upstream commit 16f027cd40eeedd2325f7e720689462ca8d9d13e ]
    
    Robert Hodaszi reports that locally terminated traffic towards
    VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing
    the same symptoms as for commit 1f9fc48fd302 ("net: dsa: sja1105: fix
    reception from VLAN-unaware bridges").
    
    For context, the set merged as "VLAN fixes for Ocelot driver":
    https://lore.kernel.org/netdev/[email protected]/
    
    was developed in a slightly different form earlier this year, in January.
    Initially, the switch was unconditionally configured to set OCELOT_ES0_TAG
    when using ocelot-8021q, regardless of port operating mode.
    
    This led to the situation where VLAN-unaware bridge ports would always
    push their PVID - see ocelot_vlan_unaware_pvid() - a negligible value
    anyway - into RX packets. To strip this in software, we would have needed
    DSA to know what private VID the switch chose for VLAN-unaware bridge
    ports, and pushed into the packets. This was implemented downstream, and
    a remnant of it remains in the form of a comment mentioning
    ds->ops->get_private_vid(), as something which would maybe need to be
    considered in the future.
    
    However, for upstream, it was deemed inappropriate, because it would
    mean introducing yet another behavior for stripping VLAN tags from
    VLAN-unaware bridge ports, when one already existed (ds->untag_bridge_pvid).
    The latter has been marked as obsolete along with an explanation why it
    is logically broken, but still, it would have been confusing.
    
    So, for upstream, felix_update_tag_8021q_rx_rule() was developed, which
    essentially changed the state of affairs from "Felix with ocelot-8021q
    delivers all packets as VLAN-tagged towards the CPU" into "Felix with
    ocelot-8021q delivers all packets from VLAN-aware bridge ports towards
    the CPU". This was done on the premise that in VLAN-unaware mode,
    there's nothing useful in the VLAN tags, and we can avoid introducing
    ds->ops->get_private_vid() in the DSA receive path if we configure the
    switch to not push those VLAN tags into packets in the first place.
    
    Unfortunately, and this is when the trainwreck started, the selftests
    developed initially and posted with the series were not re-run.
    dsa_software_vlan_untag() was initially written given the assumption
    that users of this feature would send _all_ traffic as VLAN-tagged.
    It was only partially adapted to the new scheme, by removing
    ds->ops->get_private_vid(), which also used to be necessary in
    standalone ports mode.
    
    Where the trainwreck became even worse is that I had a second opportunity
    to think about this, when the dsa_software_vlan_untag() logic change
    initially broke sja1105, in commit 1f9fc48fd302 ("net: dsa: sja1105: fix
    reception from VLAN-unaware bridges"). I did not connect the dots that
    it also breaks ocelot-8021q, for pretty much the same reason that not
    all received packets will be VLAN-tagged.
    
    To be compatible with the optimized Felix control path which runs
    felix_update_tag_8021q_rx_rule() to only push VLAN tags when useful (in
    VLAN-aware mode), we need to restore the old dsa_software_vlan_untag()
    logic. The blamed commit introduced the assumption that
    dsa_software_vlan_untag() will see only VLAN-tagged packets, an assumption
    which is false. What corrupts RX traffic is the fact that we call
    skb_vlan_untag() on packets which are not VLAN-tagged in the first
    place.
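
    To illustrate why the tag must be checked first, here is a byte-level
    sketch that strips an in-band 802.1Q tag only when one is present.
    It is not the DSA code: it works on a raw frame buffer instead of an
    skb, ignores 802.1ad and hardware-accelerated tags, and both helper
    names are hypothetical.

    ```c
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define ETH_ALEN    6
    #define ETH_P_8021Q 0x8100      /* 802.1Q VLAN TPID */

    /* True if an 802.1Q tag follows the two MAC addresses. */
    static bool frame_is_vlan_tagged(const uint8_t *frame, size_t len)
    {
        uint16_t tpid;

        if (len < 2 * ETH_ALEN + 4)
            return false;
        tpid = (uint16_t)((frame[12] << 8) | frame[13]);
        return tpid == ETH_P_8021Q;
    }

    /* Remove the 4-byte tag if present; returns the new frame length. */
    static size_t strip_vlan_if_tagged(uint8_t *frame, size_t len)
    {
        if (!frame_is_vlan_tagged(frame, len))
            return len;             /* leave untagged traffic alone */
        memmove(frame + 12, frame + 16, len - 16);
        return len - 4;
    }
    ```

    Stripping four bytes unconditionally would eat the EtherType and the
    start of the payload of an untagged frame, which matches the kind of
    RX corruption described above.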
    
    Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges")
    Reported-by: Robert Hodaszi <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: bgmac-platform: fix an OF node reference leak [+ + +]

Author: Joe Hattori <[email protected]>
Date:   Sat Dec 14 10:49:12 2024 +0900

    net: ethernet: bgmac-platform: fix an OF node reference leak
    
    [ Upstream commit 0cb2c504d79e7caa3abade3f466750c82ad26f01 ]
    
    The OF node obtained by of_parse_phandle() is not freed. Call
    of_node_put() to balance the refcount.
    
    This bug was found by an experimental static analysis tool that I am
    developing.
    
    Fixes: 1676aba5ef7e ("net: ethernet: bgmac: device tree phy enablement")
    Signed-off-by: Joe Hattori <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: oa_tc6: fix infinite loop error when tx credits becomes 0 [+ + +]

Author: Parthiban Veerasooran <[email protected]>
Date:   Fri Dec 13 18:01:58 2024 +0530

    net: ethernet: oa_tc6: fix infinite loop error when tx credits becomes 0
    
    [ Upstream commit 7d2f320e12744e5906a4fab40381060a81d22c12 ]
    
    The SPI thread wakes up to perform an SPI transfer whenever there is a TX
    skb from the n/w stack or an interrupt from the MAC-PHY. The Ethernet frame
    from the TX skb is transferred based on the availability of tx credits in
    the MAC-PHY, which is reported from the previous SPI transfer. Sometimes a
    TX skb is available to transmit but there are no tx credits from the
    MAC-PHY. In this case, there will not be any SPI transfer, but the thread
    will keep running in an endless loop until tx credits are available again.
    
    So checking the availability of tx credits along with the TX skb prevents
    the above infinite loop. When tx credits become available again, this is
    notified through an interrupt, which triggers the SPI transfer to get the
    available tx credits.
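
    The wake-up condition can be sketched as a predicate.  This is a
    simplified model under stated assumptions, not the driver code:
    struct tx_state and should_transfer() are hypothetical, and int_flag
    is an assumed name for "an interrupt is pending".

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    struct tx_state {
        const void *ongoing_tx_skb;
        const void *waiting_tx_skb;
        unsigned int tx_credits;    /* credits reported by last transfer */
        bool int_flag;              /* assumed: interrupt pending */
    };

    /* True if the SPI thread has useful work; otherwise it should sleep. */
    static bool should_transfer(const struct tx_state *s)
    {
        bool have_skb = s->ongoing_tx_skb || s->waiting_tx_skb;

        return s->int_flag || (have_skb && s->tx_credits > 0);
    }
    ```

    With a pending skb but zero credits the predicate is false, so the
    thread sleeps until an interrupt reports fresh credits.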
    
    Fixes: 53fbde8ab21e ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames")
    Reviewed-by: Jacob Keller <[email protected]>
    Signed-off-by: Parthiban Veerasooran <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: oa_tc6: fix tx skb race condition between reference pointers [+ + +]

Author: Parthiban Veerasooran <[email protected]>
Date:   Fri Dec 13 18:01:59 2024 +0530

    net: ethernet: oa_tc6: fix tx skb race condition between reference pointers
    
    [ Upstream commit e592b5110b3e9393881b0a019d86832bbf71a47f ]
    
    There are two skb pointers to manage tx skbs enqueued from the n/w stack.
    The waiting_tx_skb pointer points to the tx skb which needs to be processed
    and the ongoing_tx_skb pointer points to the tx skb which is being processed.
    
    The SPI thread prepares the tx data chunks from the tx skb pointed to by
    the ongoing_tx_skb pointer. When the tx skb pointed to by ongoing_tx_skb
    has been processed, the tx skb pointed to by waiting_tx_skb is assigned to
    ongoing_tx_skb and the waiting_tx_skb pointer is set to NULL. Whenever
    there is a new tx skb from the n/w stack, it is assigned to the
    waiting_tx_skb pointer if that pointer is NULL. Enqueuing and processing
    of a tx skb are handled in two different threads.
    
    Consider a scenario where the SPI thread has processed an ongoing_tx_skb
    and moves the next tx skb from the waiting_tx_skb pointer to the
    ongoing_tx_skb pointer without doing any NULL check. At this time, if the
    waiting_tx_skb pointer is NULL, then the ongoing_tx_skb pointer is also
    assigned NULL. After that, if a new tx skb is assigned to the
    waiting_tx_skb pointer by the n/w stack, there is a chance that the SPI
    thread overwrites that tx skb pointer with NULL. Finally, one of the tx
    skbs will be left unhandled, resulting in a missing packet and a memory
    leak.
    
    - Consider the below scenario where the TXC reported from the previous
    transfer is 10 and ongoing_tx_skb holds a tx ethernet frame which can be
    transported in 20 TXCs and waiting_tx_skb is still NULL.
            tx_credits = 10; /* 21 are filled in the previous transfer */
            ongoing_tx_skb = 20;
            waiting_tx_skb = NULL; /* Still NULL */
    - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true.
    - After oa_tc6_prepare_spi_tx_buf_for_tx_skbs()
            ongoing_tx_skb = 10;
            waiting_tx_skb = NULL; /* Still NULL */
    - Perform SPI transfer.
    - Process SPI rx buffer to get the TXC from footers.
    - Now let's assume previously filled 21 TXCs are freed so we are good to
    transport the next remaining 10 tx chunks from ongoing_tx_skb.
            tx_credits = 21;
            ongoing_tx_skb = 10;
            waiting_tx_skb = NULL;
    - So, (tc6->ongoing_tx_skb || tc6->waiting_tx_skb) becomes true again.
    - In the oa_tc6_prepare_spi_tx_buf_for_tx_skbs()
            ongoing_tx_skb = NULL;
            waiting_tx_skb = NULL;
    
    - Now the below bad case might happen,
    
    Thread1 (oa_tc6_start_xmit)     Thread2 (oa_tc6_spi_thread_handler)
    ---------------------------     -----------------------------------
    - if waiting_tx_skb is NULL
                                    - if ongoing_tx_skb is NULL
                                    - ongoing_tx_skb = waiting_tx_skb
    - waiting_tx_skb = skb
                                    - waiting_tx_skb = NULL
                                    ...
                                    - ongoing_tx_skb = NULL
    - if waiting_tx_skb is NULL
    - waiting_tx_skb = skb
    
    To overcome the above issue, protect the moving of tx skb reference from
    waiting_tx_skb pointer to ongoing_tx_skb pointer and assigning new tx skb
    to waiting_tx_skb pointer, so that the other thread can't access the
    waiting_tx_skb pointer until the current thread completes moving the tx
    skb reference safely.
    
    Fixes: 53fbde8ab21e ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames")
    Signed-off-by: Parthiban Veerasooran <[email protected]>
    Reviewed-by: Larysa Zaremba <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hinic: Fix cleanup in create_rxqs/txqs() [+ + +]

Author: Dan Carpenter <[email protected]>
Date:   Fri Dec 13 17:28:11 2024 +0300

    net: hinic: Fix cleanup in create_rxqs/txqs()
    
    [ Upstream commit 7203d10e93b6e6e1d19481ef7907de6a9133a467 ]
    
    There is a check for NULL at the start of create_txqs() and
    create_rxqs() which tests if "nic_dev->txqs" is non-NULL.  The
    intention is that if the device is already open and the queues
    are already created then we don't create them a second time.
    
    However, the bug is that if we have an error in create_txqs()
    then the pointer doesn't get set back to NULL.  The NULL check
    at the start of the function will say that it's already open when
    it's not and the device can't be used.
    
    Set ->txqs back to NULL on cleanup on error.
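
    The pattern can be sketched in user space as follows.  This is an
    illustrative stand-in and not the hinic driver: struct nic_dev here
    is a toy type, and init_txq(), create_txqs_sketch() and the fail_at
    parameter are hypothetical.

    ```c
    #include <stdlib.h>

    struct nic_dev {
        int *txqs;          /* non-NULL means "queues already created" */
        size_t num_txqs;
    };

    /* Hypothetical per-queue init step; fails when index == fail_at. */
    static int init_txq(int *txq, size_t index, size_t fail_at)
    {
        if (index == fail_at)
            return -1;
        *txq = (int)index;
        return 0;
    }

    static int create_txqs_sketch(struct nic_dev *dev, size_t n,
                                  size_t fail_at)
    {
        size_t i;

        if (dev->txqs)
            return -1;      /* already created */

        dev->txqs = calloc(n, sizeof(*dev->txqs));
        if (!dev->txqs)
            return -1;

        for (i = 0; i < n; i++) {
            if (init_txq(&dev->txqs[i], i, fail_at))
                goto err;
        }
        dev->num_txqs = n;
        return 0;

    err:
        free(dev->txqs);
        dev->txqs = NULL;   /* a later retry must not look "already open" */
        return -1;
    }
    ```

    Without the reset in the error path, the pointer would dangle and the
    next call would bail out at the "already created" check.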
    
    Fixes: c3e79baf1b03 ("net-next/hinic: Add logical Txq and Rxq")
    Signed-off-by: Dan Carpenter <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mctp: handle skb cleanup on sock_queue failures [+ + +]

Author: Jeremy Kerr <[email protected]>
Date:   Wed Dec 18 11:53:01 2024 +0800

    net: mctp: handle skb cleanup on sock_queue failures
    
    commit ce1219c3f76bb131d095e90521506d3c6ccfa086 upstream.
    
    Currently, we don't use the return value from sock_queue_rcv_skb, which
    means we may leak skbs if a message is not successfully queued to a
    socket.
    
    Instead, ensure that we're freeing the skb where the sock hasn't
    otherwise taken ownership of the skb by adding checks on the
    sock_queue_rcv_skb() to invoke a kfree on failure.
    
    In doing so, rather than using the 'rc' value to trigger the
    kfree_skb(), use the skb pointer itself, which is more explicit.
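
    The ownership idea reads roughly like this in a user-space sketch.
    struct buf, struct queue, queue_buf() and deliver() are hypothetical
    stand-ins for the skb, the socket queue, sock_queue_rcv_skb() and the
    MCTP delivery path.

    ```c
    #include <stdlib.h>

    struct buf { int len; };

    /* Hypothetical one-slot queue: takes ownership on success (0). */
    struct queue { struct buf *slot; };

    static int queue_buf(struct queue *q, struct buf *b)
    {
        if (q->slot)
            return -1;      /* full: caller still owns b */
        q->slot = b;
        return 0;
    }

    /* Deliver b; frees it unless the queue has taken ownership. */
    static int deliver(struct queue *q, struct buf *b)
    {
        int rc = queue_buf(q, b);

        if (rc == 0)
            b = NULL;       /* ownership moved to the queue */

        free(b);            /* free(NULL) is a no-op */
        return rc;
    }
    ```

    Keying the cleanup on the pointer rather than on the return code
    keeps a single exit path that cannot leak or double free.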
    
    Also, add a kunit test for the sock delivery failure cases.
    
    Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly")
    Cc: [email protected]
    Signed-off-by: Jeremy Kerr <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: mdiobus: fix an OF node reference leak [+ + +]

Author: Joe Hattori <[email protected]>
Date:   Wed Dec 18 12:51:06 2024 +0900

    net: mdiobus: fix an OF node reference leak
    
    [ Upstream commit 572af9f284669d31d9175122bbef9bc62cea8ded ]
    
    fwnode_find_mii_timestamper() calls of_parse_phandle_with_fixed_args()
    but does not decrement the refcount of the obtained OF node. Add an
    of_node_put() call before returning from the function.
    
    This bug was detected by an experimental static analysis tool that I am
    developing.
    
    Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()")
    Signed-off-by: Joe Hattori <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic() [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Thu Dec 12 18:55:45 2024 +0200

    net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic()
    
    [ Upstream commit 2d5df3a680ffdaf606baa10636bdb1daf757832e ]
    
    Packets injected by the CPU should have a SRC_PORT field equal to the
    CPU port module index in the Analyzer block (ocelot->num_phys_ports).
    
    The blamed commit copied the ocelot_ifh_set_basic() call incorrectly
    from ocelot_xmit_common() in net/dsa/tag_ocelot.c. Instead of calling
    with "x", it calls with BIT_ULL(x), but the field is not a port mask,
    but rather a single port index.
    
    [ side note: this is the technical debt of code duplication :( ]
    
    The error used to be silent and doesn't appear to have other
    user-visible manifestations, but with new changes in the packing
    library, it now fails loudly as follows:
    
    ------------[ cut here ]------------
    Cannot store 0x40 inside bits 46-43 - will truncate
    sja1105 spi2.0: xmit timed out
    WARNING: CPU: 1 PID: 102 at lib/packing.c:98 __pack+0x90/0x198
    sja1105 spi2.0: timed out polling for tstamp
    CPU: 1 UID: 0 PID: 102 Comm: felix_xmit
    Tainted: G        W        N 6.13.0-rc1-00372-gf706b85d972d-dirty #2605
    Call trace:
     __pack+0x90/0x198 (P)
     __pack+0x90/0x198 (L)
     packing+0x78/0x98
     ocelot_ifh_set_basic+0x260/0x368
     ocelot_port_inject_frame+0xa8/0x250
     felix_port_deferred_xmit+0x14c/0x258
     kthread_worker_fn+0x134/0x350
     kthread+0x114/0x138
    
    The code path pertains to the ocelot switchdev driver and to the felix
    secondary DSA tag protocol, ocelot-8021q. Here seen with ocelot-8021q.
    
    The messenger (packing) is not really to blame, so fix the original
    commit instead.
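
    The arithmetic behind the warning can be checked with a few lines of
    C.  fits_in_field() is a hypothetical helper, not the packing
    library's API; the bit range 46-43 and the value 0x40 are taken from
    the log above.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* True if value fits in the inclusive bit range [end_bit, start_bit]. */
    static bool fits_in_field(uint64_t value, unsigned start_bit,
                              unsigned end_bit)
    {
        unsigned width = start_bit - end_bit + 1;

        if (width >= 64)
            return true;
        return value < (UINT64_C(1) << width);
    }
    ```

    Bits 46-43 form a 4-bit field, so a port index such as 6 fits, while
    the one-hot mask for the same port, 1 << 6 = 0x40, does not.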
    
    Fixes: e1b9e80236c5 ("net: mscc: ocelot: fix QoS class for injected packets with "ocelot-8021q"")
    Signed-off-by: Vladimir Oltean <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: netdevsim: fix nsim_pp_hold_write() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Mon Dec 16 08:37:03 2024 +0000

    net: netdevsim: fix nsim_pp_hold_write()
    
    [ Upstream commit b9b8301d369b4c876de5255dbf067b19ba88ac71 ]
    
    nsim_pp_hold_write() has two problems:
    
    1) It may return with rtnl held, as found by syzbot.
    
    2) Its return value does not propagate an error if any.
    
    Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: renesas: rswitch: rework ts tags management [+ + +]

Author: Nikita Yushchenko <[email protected]>
Date:   Thu Dec 12 11:25:58 2024 +0500

    net: renesas: rswitch: rework ts tags management
    
    [ Upstream commit 922b4b955a03d19fea98938f33ef0e62d01f5159 ]
    
    The existing linked list based implementation of how ts tags are
    assigned and managed is unsafe against concurrency and corner cases:
    - element addition in tx processing can race against element removal
      in ts queue completion,
    - element removal in ts queue completion can race against element
      removal in device close,
    - if a large number of frames gets added to tx queue without ts queue
      completions in between, elements with duplicate tag values can get
      added.
    
    Use a different implementation, based on per-port used tags bitmaps and
    saved skb arrays.
    
    Safety for addition in tx processing vs removal in ts completion is
    provided by:
    
        tag = find_first_zero_bit(...);
        smp_mb();
        <write rdev->ts_skb[tag]>
        set_bit(...);
    
      vs
    
        <read rdev->ts_skb[tag]>
        smp_mb();
        clear_bit(...);
    
    Safety for removal in ts completion vs removal in device close is
    provided by using atomic read-and-clear for rdev->ts_skb[tag]:
    
        ts_skb = xchg(&rdev->ts_skb[tag], NULL);
        if (ts_skb)
            <handle it>
    
    Fixes: 33f5d733b589 ("net: renesas: rswitch: Improve TX timestamp accuracy")
    Signed-off-by: Nikita Yushchenko <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
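
    The bitmap-plus-array scheme described above can be modelled in
    userspace with C11 atomics. This is only a sketch under assumptions:
    struct port_tags, TS_TAGS, ts_tag_alloc() and ts_tag_take() are
    hypothetical names, a single tx producer is assumed, and
    sequentially consistent atomics replace the explicit smp_mb() calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TS_TAGS 8 /* hypothetical tag count, chosen for the sketch */

struct port_tags {
	atomic_uint used;                /* bit n set: tag n is in use */
	_Atomic(void *) ts_skb[TS_TAGS]; /* buffer saved for each tag */
};

/* tx path (single producer assumed): claim a free tag, or return -1. */
int ts_tag_alloc(struct port_tags *p, void *skb)
{
	unsigned int used = atomic_load(&p->used);

	assert(skb != NULL);
	for (int tag = 0; tag < TS_TAGS; tag++) {
		if (used & (1u << tag))
			continue;
		atomic_store(&p->ts_skb[tag], skb);   /* publish the buffer first */
		atomic_fetch_or(&p->used, 1u << tag); /* then mark the tag used */
		return tag;
	}
	return -1; /* every tag is busy, so no duplicate tag is handed out */
}

/* ts completion or device close: whoever gets non-NULL owns the buffer. */
void *ts_tag_take(struct port_tags *p, int tag)
{
	void *skb;

	assert(tag >= 0 && tag < TS_TAGS);
	skb = atomic_exchange(&p->ts_skb[tag], NULL);
	if (skb)
		atomic_fetch_and(&p->used, ~(1u << tag)); /* release the tag last */
	return skb;
}
```

    ts_tag_take() is meant to be called only with tags previously returned
    by ts_tag_alloc(). If completion and close both call it for the same
    tag, only one of them receives the buffer; the other sees NULL.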

net: sched: fix ordering of qlen adjustment [+ + +]

Author: Lion Ackermann <[email protected]>
Date:   Mon Dec 2 17:22:57 2024 +0100

    net: sched: fix ordering of qlen adjustment
    
    commit 5eb7de8cd58e73851cd37ff8d0666517d9926948 upstream.
    
    Changes to sch->q.qlen around qdisc_tree_reduce_backlog() need to happen
    _before_ a call to said function because otherwise it may fail to notify
    parent qdiscs when the child is about to become empty.
    
    Signed-off-by: Lion Ackermann <[email protected]>
    Acked-by: Toke Høiland-Jørgensen <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Cc: Artem Metla <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
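
    Why the ordering matters can be seen in a deliberately reduced model.
    All names below are hypothetical; tree_reduce_backlog() only mimics the
    "notify the parent when the child is empty" decision and is not the
    kernel function.

```c
#include <assert.h>
#include <stddef.h>

struct parent_model { int notified; };

struct child_model {
	struct parent_model *parent;
	unsigned int qlen;
};

/* Stand-in: the parent is told only if the child is empty right now. */
void tree_reduce_backlog(struct child_model *c)
{
	assert(c != NULL && c->parent != NULL);
	if (c->qlen == 0)
		c->parent->notified++;
}

/* Buggy order: qlen is still 1 when the check runs, so no notification. */
void drop_buggy(struct child_model *c)
{
	assert(c->qlen > 0);
	tree_reduce_backlog(c);
	c->qlen--;
}

/* Fixed order: adjust qlen first, then let the check see the empty queue. */
void drop_fixed(struct child_model *c)
{
	assert(c->qlen > 0);
	c->qlen--;
	tree_reduce_backlog(c);
}
```

    Dropping the last packet with drop_buggy() leaves the parent
    unaware that the child became empty, while drop_fixed() notifies it.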

net: stmmac: fix TSO DMA API usage causing oops [+ + +]

Author: Russell King (Oracle) <[email protected]>
Date:   Fri Dec 6 12:40:11 2024 +0000

    net: stmmac: fix TSO DMA API usage causing oops
    
    [ Upstream commit 4c49f38e20a57f8abaebdf95b369295b153d1f8e ]
    
    Commit 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap
    for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s
    members to be later in stmmac_tso_xmit().
    
    The buf (dma cookie) and len stored in this structure are passed to
    dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that
    the dma cookie passed to dma_unmap_single() is the same as the value
    returned from dma_map_single(). However, by moving the assignment
    later, this is not the case when priv->dma_cap.addr64 > 32 as "des"
    is offset by proto_hdr_len.
    
    This causes problems such as:
    
      dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed
    
    and with DMA_API_DEBUG enabled:
    
      DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes]
    
    Fix this by maintaining "des" as the original DMA cookie, and use
    tso_des to pass the offset DMA cookie to stmmac_tso_allocator().
    
    Full details of the crashes can be found at:
    https://lore.kernel.org/all/[email protected]/
    https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5turc@bwwhelz2l4dw/
    
    Reported-by: Jon Hunter <[email protected]>
    Reported-by: Thierry Reding <[email protected]>
    Fixes: 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data")
    Tested-by: Jon Hunter <[email protected]>
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Furong Xu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
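
    The idea of the fix, keeping the original cookie for the later unmap
    and using a separate offset copy for the descriptors, can be sketched
    as follows. This is a simplified userspace model: dma_cookie,
    struct tx_slot and record_tso_mapping() are hypothetical names, and the
    offset is applied unconditionally, whereas the commit describes it as
    happening only when priv->dma_cap.addr64 > 32.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_cookie; /* hypothetical stand-in for dma_addr_t */

struct tx_slot {
	dma_cookie buf;   /* must equal the value returned by the map call */
	unsigned int len; /* mapped length, passed to the unmap call */
};

/*
 * Record the mapping for the later unmap and return the cookie the
 * descriptor allocator should start from (payload after the headers).
 */
dma_cookie record_tso_mapping(struct tx_slot *slot, dma_cookie des,
			      unsigned int map_len, unsigned int hdr_len)
{
	dma_cookie tso_des;

	assert(slot != 0 && hdr_len <= map_len);
	tso_des = des + hdr_len; /* offset copy, used only for descriptors */
	slot->buf = des;         /* original cookie stays untouched */
	slot->len = map_len;
	return tso_des;
}
```

    The stored slot->buf stays equal to what the map call returned, which is
    what the unmap side needs, no matter how far the descriptor cookie
    has been advanced.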

net: tun: fix tun_napi_alloc_frags() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Dec 12 22:22:47 2024 +0000

    net: tun: fix tun_napi_alloc_frags()
    
    commit 429fde2d81bcef0ebab002215358955704586457 upstream.
    
    syzbot reported the following crash [1]
    
    Issue came with the blamed commit. Instead of going through
    all the iov components, we keep using the first one
    and end up with a malformed skb.
    
    [1]
    
    kernel BUG at net/core/skbuff.c:2849 !
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
    CPU: 0 UID: 0 PID: 6230 Comm: syz-executor132 Not tainted 6.13.0-rc1-syzkaller-00407-g96b6fcc0ee41 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
     RIP: 0010:__pskb_pull_tail+0x1568/0x1570 net/core/skbuff.c:2848
    Code: 38 c1 0f 8c 32 f1 ff ff 4c 89 f7 e8 92 96 74 f8 e9 25 f1 ff ff e8 e8 ae 09 f8 48 8b 5c 24 08 e9 eb fb ff ff e8 d9 ae 09 f8 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 0018:ffffc90004cbef30 EFLAGS: 00010293
    RAX: ffffffff8995c347 RBX: 00000000fffffff2 RCX: ffff88802cf45a00
    RDX: 0000000000000000 RSI: 00000000fffffff2 RDI: 0000000000000000
    RBP: ffff88807df0c06a R08: ffffffff8995b084 R09: 1ffff1100fbe185c
    R10: dffffc0000000000 R11: ffffed100fbe185d R12: ffff888076e85d50
    R13: ffff888076e85c80 R14: ffff888076e85cf4 R15: ffff888076e85c80
    FS:  00007f0dca6ea6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f0dca6ead58 CR3: 00000000119da000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      skb_cow_data+0x2da/0xcb0 net/core/skbuff.c:5284
      tipc_aead_decrypt net/tipc/crypto.c:894 [inline]
      tipc_crypto_rcv+0x402/0x24e0 net/tipc/crypto.c:1844
      tipc_rcv+0x57e/0x12a0 net/tipc/node.c:2109
      tipc_l2_rcv_msg+0x2bd/0x450 net/tipc/bearer.c:668
      __netif_receive_skb_list_ptype net/core/dev.c:5720 [inline]
      __netif_receive_skb_list_core+0x8b7/0x980 net/core/dev.c:5762
      __netif_receive_skb_list net/core/dev.c:5814 [inline]
      netif_receive_skb_list_internal+0xa51/0xe30 net/core/dev.c:5905
      gro_normal_list include/net/gro.h:515 [inline]
      napi_complete_done+0x2b5/0x870 net/core/dev.c:6256
      napi_complete include/linux/netdevice.h:567 [inline]
      tun_get_user+0x2ea0/0x4890 drivers/net/tun.c:1982
      tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2057
     do_iter_readv_writev+0x600/0x880
      vfs_writev+0x376/0xba0 fs/read_write.c:1050
      do_writev+0x1b6/0x360 fs/read_write.c:1096
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Fixes: de4f5fed3f23 ("iov_iter: add iter_iovec() helper")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netdev/[email protected]/T/#u
    Cc: [email protected]
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Joe Damato <[email protected]>
    Reviewed-by: Jens Axboe <[email protected]>
    Acked-by: Willem de Bruijn <[email protected]>
    Acked-by: Michael S. Tsirkin <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
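
    The class of bug described above (reusing the first iov component for
    every fragment) is easy to show in isolation. The sketch below is a
    userspace illustration only: struct seg is a hypothetical stand-in for
    struct iovec, and both functions merely sum fragment sizes rather than
    build an skb.

```c
#include <assert.h>
#include <stddef.h>

struct seg { /* hypothetical stand-in for struct iovec */
	const void *base;
	size_t len;
};

/* Buggy shape: every fragment is sized from the first segment. */
size_t frag_bytes_buggy(const struct seg *iov, size_t nsegs)
{
	size_t total = 0;

	assert(iov != NULL || nsegs == 0);
	for (size_t i = 0; i < nsegs; i++)
		total += iov[0].len;
	return total;
}

/* Fixed shape: fragment i is sized from segment i. */
size_t frag_bytes_fixed(const struct seg *iov, size_t nsegs)
{
	size_t total = 0;

	assert(iov != NULL || nsegs == 0);
	for (size_t i = 0; i < nsegs; i++)
		total += iov[i].len;
	return total;
}
```

    With segments of different lengths the two totals disagree, which is
    the userspace analogue of an skb whose fragments do not match the data
    that was written.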

netdev-genl: avoid empty messages in queue dump [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Tue Dec 17 18:25:08 2024 -0800

    netdev-genl: avoid empty messages in queue dump
    
    [ Upstream commit 5eb70dbebf32c2fd1f2814c654ae17fc47d6e859 ]
    
    Empty netlink responses from do() are not correct (as opposed to
    dump() where not dumping anything is perfectly fine).
    We should return an error if the target object does not exist;
    in this case, a netdev that is down has no queues.
    
    Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue")
    Reported-by: [email protected]
    Reviewed-by: Eric Dumazet <[email protected]>
    Reviewed-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netdev: fix repeated netlink messages in queue dump [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Fri Dec 13 07:22:40 2024 -0800

    netdev: fix repeated netlink messages in queue dump
    
    [ Upstream commit b1f3a2f5a742c1e939a73031bd31b9e557a2d77d ]
    
    The context is supposed to record the next queue to dump,
    not last dumped. If the dump doesn't fit we will restart
    from the already-dumped queue, duplicating the message.
    
    Before this fix and with the selftest improvements later
    in this series we see:
    
      # ./run_kselftest.sh -t drivers/net:queues.py
      timeout set to 45
      selftests: drivers/net: queues.py
      KTAP version 1
      1..2
      # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues:
      # Check|     ksft_eq(queues, expected)
      # Check failed 102 != 100
      # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues:
      # Check|     ksft_eq(queues, expected)
      # Check failed 101 != 100
      not ok 1 queues.get_queues
      ok 2 queues.addremove_queues
      # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0
      not ok 1 selftests: drivers/net: queues.py # exit=1
    
    With the fix:
    
      # ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py
      timeout set to 45
      selftests: drivers/net: queues.py
      KTAP version 1
      1..2
      ok 1 queues.get_queues
      ok 2 queues.addremove_queues
      # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
    
    Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue")
    Reviewed-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
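
    The "record the next queue, not the last dumped one" rule can be
    sketched with a resumable dump loop. This is a userspace model with
    hypothetical names (struct dump_ctx, dump_queues()); a fixed-capacity
    array plays the role of a netlink message that may fill up.

```c
#include <assert.h>
#include <stddef.h>

struct dump_ctx {
	unsigned int next; /* first queue that has NOT been dumped yet */
};

/* Write queue ids into out[] until it is full; returns entries written. */
size_t dump_queues(struct dump_ctx *ctx, unsigned int nqueues,
		   unsigned int *out, size_t cap)
{
	size_t n = 0;

	assert(ctx != NULL && (out != NULL || cap == 0));
	while (ctx->next < nqueues) {
		if (n == cap)
			break;        /* message full: resume from ctx->next later */
		out[n++] = ctx->next;
		ctx->next++;          /* advance only after the entry was written */
	}
	return n;
}
```

    Calling dump_queues() repeatedly with the same context until it returns
    0 visits every queue exactly once, even when the output buffer is
    smaller than the number of queues. The same reasoning applies to the
    queue stats dump fixed in the same series.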

netdev: fix repeated netlink messages in queue stats [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Fri Dec 13 07:22:41 2024 -0800

    netdev: fix repeated netlink messages in queue stats
    
    [ Upstream commit ecc391a541573da46b7ccc188105efedd40aef1b ]
    
    The context is supposed to record the next queue to dump,
    not last dumped. If the dump doesn't fit we will restart
    from the already-dumped queue, duplicating the message.
    
    Before this fix and with the selftest improvements later
    in this series we see:
    
      # ./run_kselftest.sh -t drivers/net:stats.py
      timeout set to 45
      selftests: drivers/net: stats.py
      KTAP version 1
      1..5
      ok 1 stats.check_pause
      ok 2 stats.check_fec
      ok 3 stats.pkt_byte_sum
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), len(set(queues[qtype])),
      # Check failed 45 != 44 repeated queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1,
      # Check failed 45 != 44 missing queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), len(set(queues[qtype])),
      # Check failed 45 != 44 repeated queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1,
      # Check failed 45 != 44 missing queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), len(set(queues[qtype])),
      # Check failed 103 != 100 repeated queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1,
      # Check failed 103 != 100 missing queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), len(set(queues[qtype])),
      # Check failed 102 != 100 repeated queue keys
      # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex:
      # Check|     ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1,
      # Check failed 102 != 100 missing queue keys
      not ok 4 stats.qstat_by_ifindex
      ok 5 stats.check_down
      # Totals: pass:4 fail:1 xfail:0 xpass:0 skip:0 error:0
    
    With the fix:
    
      # ./ksft-net-drv/run_kselftest.sh -t drivers/net:stats.py
      timeout set to 45
      selftests: drivers/net: stats.py
      KTAP version 1
      1..5
      ok 1 stats.check_pause
      ok 2 stats.check_fec
      ok 3 stats.pkt_byte_sum
      ok 4 stats.qstat_by_ifindex
      ok 5 stats.check_down
      # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0
    
    Fixes: ab63a2387cb9 ("netdev: add per-queue statistics")
    Reviewed-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netdevsim: prevent bad user input in nsim_dev_health_break_write() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Fri Dec 13 17:25:18 2024 +0000

    netdevsim: prevent bad user input in nsim_dev_health_break_write()
    
    [ Upstream commit ee76746387f6233bdfa93d7406990f923641568f ]
    
    If either a zero count or a large one is provided, the kernel can crash.
    
    Fixes: 82c93a87bf8b ("netdevsim: implement couple of testing devlink health reporters")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netdev/[email protected]/T/#u
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Reviewed-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
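
    A bounds check of this kind can be sketched in userspace as below. The
    names break_write() and MAX_MSG are hypothetical, the limit is an
    arbitrary value chosen for the sketch, and the write to the last byte
    is an assumption used to show why a zero count is dangerous.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_MSG 4096 /* hypothetical cap, not the kernel's limit */

/* Copy `count` bytes of user data into a fresh, NUL-terminated buffer. */
long break_write(const char *data, size_t count, char **out)
{
	char *msg;

	assert(data != NULL && out != NULL);
	if (count == 0 || count > MAX_MSG)
		return -EINVAL;  /* reject before allocating or indexing */

	msg = malloc(count);
	if (!msg)
		return -ENOMEM;
	memcpy(msg, data, count);
	msg[count - 1] = '\0';   /* with count == 0 this would index msg[-1] */
	*out = msg;
	return (long)count;
}
```

    On success the caller owns *out and is expected to free() it.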

netfilter: ipset: Fix for recursive locking warning [+ + +]

Author: Phil Sutter <[email protected]>
Date:   Tue Dec 17 20:56:55 2024 +0100

    netfilter: ipset: Fix for recursive locking warning
    
    [ Upstream commit 70b6f46a4ed8bd56c85ffff22df91e20e8c85e33 ]
    
    With CONFIG_PROVE_LOCKING, when creating a set of type bitmap:ip, adding
    it to a set of type list:set and populating it from iptables SET target
    triggers a kernel warning:
    
    | WARNING: possible recursive locking detected
    | 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted
    | --------------------------------------------
    | ping/4018 is trying to acquire lock:
    | ffff8881094a6848 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set]
    |
    | but task is already holding lock:
    | ffff88811034c048 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set]
    
    This is a false alarm: ipset does not allow nested list:set type, so the
    loop in list_set_kadd() can never encounter the outer set itself. No
    other set type supports embedded sets, so this is the only case to
    consider.
    
    To avoid the false report, create a distinct lock class for list:set
    type ipset locks.
    
    Fixes: f830837f0eed ("netfilter: ipset: list:set set type support")
    Signed-off-by: Phil Sutter <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFS/pnfs: Fix a live lock between recalled layouts and layoutget [+ + +]

Author: Trond Myklebust <[email protected]>
Date:   Mon Dec 16 19:28:06 2024 -0500

    NFS/pnfs: Fix a live lock between recalled layouts and layoutget
    
    commit 62e2a47ceab8f3f7d2e3f0e03fdd1c5e0059fd8b upstream.
    
    When the server is recalling a layout, we should ignore the count of
    outstanding layoutget calls, since the server is expected to return
    either NFS4ERR_RECALLCONFLICT or NFS4ERR_RETURNCONFLICT for as long as
    the recall is outstanding.
    Currently, we may end up livelocking, causing the layout to eventually
    be forcibly revoked.
    
    Fixes: bf0291dd2267 ("pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised")
    Cc: [email protected]
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nilfs2: fix buffer head leaks in calls to truncate_inode_pages() [+ + +]

Author: Ryusuke Konishi <[email protected]>
Date:   Fri Dec 13 01:43:28 2024 +0900

    nilfs2: fix buffer head leaks in calls to truncate_inode_pages()
    
    commit 6309b8ce98e9a18390b9fd8f03fc412f3c17aee9 upstream.
    
    When block_invalidatepage was converted to block_invalidate_folio,
    folio_invalidate() lost its fallback to block_invalidatepage for the case
    where the address_space_operations method invalidatepage (currently
    invalidate_folio) is not set.
    
    Unfortunately, some pseudo-inodes in nilfs2 use the empty_aops set by
    inode_init_always_gfp() as is, or explicitly set their
    address_space_operations to it.  Therefore, with this change,
    block_invalidatepage() is no longer called from folio_invalidate(), and as
    a result, the buffer_head structures attached to these pages/folios are no
    longer freed via try_to_free_buffers().
    
    Thus, these buffer heads are now leaked by truncate_inode_pages(), which
    cleans up the page cache from inode evict(), etc.
    
    Three types of caches use empty_aops: gc inode caches and the DAT shadow
    inode used by GC, and b-tree node caches.  Of these, b-tree node caches
    explicitly call invalidate_mapping_pages() during cleanup, which involves
    calling try_to_free_buffers(), so the leak was not visible during normal
    operation but worsened when GC was performed.
    
    Fix this issue by using address_space_operations with invalidate_folio set
    to block_invalidate_folio instead of empty_aops, which will ensure the
    same behavior as before.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 7ba13abbd31e ("fs: Turn block_invalidatepage into block_invalidate_folio")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Cc: <[email protected]>    [5.18+]
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nilfs2: prevent use of deleted inode [+ + +]

Author: Edward Adam Davis <[email protected]>
Date:   Mon Dec 9 15:56:52 2024 +0900

    nilfs2: prevent use of deleted inode
    
    commit 901ce9705fbb9f330ff1f19600e5daf9770b0175 upstream.
    
    syzbot reported a WARNING in nilfs_rmdir. [1]
    
    Because the inode bitmap is corrupted, an inode with an inode number that
    should exist as a ".nilfs" file was reassigned by nilfs_mkdir for "file0",
    causing an inode duplication during execution.  And this causes an
    underflow of i_nlink in rmdir operations.
    
    The inode is used twice by the same task to unmount and remove the
    directories ".nilfs" and "file0", which triggers the warning in
    nilfs_rmdir.
    
    To avoid this issue, check i_nlink in nilfs_iget(). If it is 0, the inode
    has been deleted, and iput is executed to reclaim it.
    
    [1]
    WARNING: CPU: 1 PID: 5824 at fs/inode.c:407 drop_nlink+0xc4/0x110 fs/inode.c:407
    ...
    Call Trace:
     <TASK>
     nilfs_rmdir+0x1b0/0x250 fs/nilfs2/namei.c:342
     vfs_rmdir+0x3a3/0x510 fs/namei.c:4394
     do_rmdir+0x3b5/0x580 fs/namei.c:4453
     __do_sys_rmdir fs/namei.c:4472 [inline]
     __se_sys_rmdir fs/namei.c:4470 [inline]
     __x64_sys_rmdir+0x47/0x50 fs/namei.c:4470
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: d25006523d0b ("nilfs2: pathname operations")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=9260555647a5132edd48
    Tested-by: [email protected]
    Signed-off-by: Edward Adam Davis <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix the space leak in LA when releasing LA [+ + +]

Author: Heming Zhao <[email protected]>
Date:   Thu Dec 5 18:48:33 2024 +0800

    ocfs2: fix the space leak in LA when releasing LA
    
    commit 7782e3b3b004e8cb94a88621a22cc3c2f33e5b90 upstream.
    
    Commit 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()")
    introduced an issue: ocfs2_sync_local_to_main() ignores the last
    contiguous free bits, which causes an OCFS2 volume to lose the last free
    clusters of the LA window during the release routine.
    
    Please note, because commit dfe6c5692fb5 ("ocfs2: fix the la space leak
    when unmounting an ocfs2 volume") was reverted, this commit is a
    replacement fix for commit dfe6c5692fb5.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()")
    Signed-off-by: Heming Zhao <[email protected]>
    Suggested-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
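
    The general pitfall, dropping the final run of free bits when a bitmap
    scan reaches the end of the window, can be illustrated with a small
    userspace sketch. free_runs() is a hypothetical helper working on a
    bool array; it does not reproduce ocfs2_sync_local_to_main(), which
    uses the kernel's bit-search helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Report every run of free (false) entries in used[0..nbits) as a
 * (start, count) pair. Returns the number of runs found.
 */
size_t free_runs(const bool *used, size_t nbits,
		 size_t *starts, size_t *counts, size_t max_runs)
{
	size_t nruns = 0, i = 0;

	assert(used != NULL && starts != NULL && counts != NULL);
	while (i < nbits) {
		size_t start;

		if (used[i]) {
			i++;
			continue;
		}
		start = i;
		/* The run ends at the next used bit OR at the end of the window. */
		while (i < nbits && !used[i])
			i++;
		assert(nruns < max_runs);
		starts[nruns] = start;
		counts[nruns] = i - start;
		nruns++;
	}
	return nruns;
}
```

    A window that ends in free bits yields a final run that reaches nbits;
    a scan that only closes a run when it meets a used bit would lose it.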

of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Mon Dec 9 21:24:59 2024 +0800

    of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()
    
    commit fec3edc47d5cfc2dd296a5141df887bf567944db upstream.
    
    On a malformed interrupt-map property which is shorter than expected by
    1 cell, we may read bogus data past the end of the property instead of
    returning an error in of_irq_parse_imap_parent().
    
    Decrement the remaining length when skipping over the interrupt parent
    phandle cell.
    
    Fixes: 935df1bd40d4 ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()")
    Cc: [email protected]
    Signed-off-by: Zijun Hu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [rh: reword commit msg]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
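
    The length accounting can be sketched in userspace as follows.
    skip_parent() is a hypothetical helper that treats an entry as one
    phandle cell followed by nargs argument cells; the real function also
    looks up the parent's cell counts, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Consume one "phandle + nargs cells" entry from imap.
 * *len is the number of cells remaining. Returns a pointer past the
 * entry, or NULL if the property is too short.
 */
const uint32_t *skip_parent(const uint32_t *imap, int *len, int nargs,
			    uint32_t *phandle)
{
	assert(imap != NULL && len != NULL && phandle != NULL && nargs >= 0);
	if (*len < 1)
		return NULL;
	*phandle = *imap++;
	(*len)--;            /* the phandle cell counts against the length */

	if (*len < nargs)    /* checked against what is really left */
		return NULL;
	imap += nargs;
	*len -= nargs;
	return imap;
}
```

    Without the decrement, a property that is one cell short would pass the
    second check and the caller would read one cell past its end.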

of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Mon Dec 9 21:25:02 2024 +0800

    of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()
    
    commit 0f7ca6f69354e0c3923bbc28c92d0ecab4d50a3e upstream.
    
    of_irq_parse_one() may use uninitialized variable @addr_len as shown below:
    
    // @addr_len is uninitialized
    int addr_len;
    
    // This operation does not touch @addr_len if it fails.
    addr = of_get_property(device, "reg", &addr_len);
    
    // Use uninitialized @addr_len if the operation fails.
    if (addr_len > sizeof(addr_buf))
            addr_len = sizeof(addr_buf);
    
    // Check the operation result here.
    if (addr)
            memcpy(addr_buf, addr, addr_len);
    
    Fix by initializing @addr_len before the operation.
    
    Fixes: b739dffa5d57 ("of/irq: Prevent device address out-of-bounds read in interrupt map walk")
    Cc: [email protected]
    Signed-off-by: Zijun Hu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
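
    The corrected pattern, shown here as a userspace sketch, initializes
    the length before the lookup. get_prop() and copy_reg() are
    hypothetical helpers; get_prop() imitates a lookup that leaves *len
    untouched when the property is missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: on failure returns NULL and does not write *len. */
const void *get_prop(const void *prop, int prop_len, int *len)
{
	if (!prop)
		return NULL;
	*len = prop_len;
	return prop;
}

/* Copy at most buf_len bytes of the property; returns bytes copied. */
int copy_reg(const void *prop, int prop_len, unsigned char *buf, int buf_len)
{
	int addr_len = 0; /* defined value even if the lookup fails */
	const void *addr;

	assert(buf != NULL && buf_len >= 0);
	addr = get_prop(prop, prop_len, &addr_len);
	if (addr_len > buf_len)
		addr_len = buf_len;
	if (addr)
		memcpy(buf, addr, (size_t)addr_len);
	return addr_len;
}
```

    With the initialization in place, a missing property yields a length of
    0 instead of whatever happened to be on the stack.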

of: address: Preserve the flags portion on 1:1 dma-ranges mapping [+ + +]

Author: Andrea della Porta <[email protected]>
Date:   Sun Nov 24 11:05:37 2024 +0100

    of: address: Preserve the flags portion on 1:1 dma-ranges mapping
    
    commit 7f05e20b989ac33c9c0f8c2028ec0a566493548f upstream.
    
    A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma
    translations. In this specific case, the current behaviour is to zero out
    the entire specifier so that the translation can be carried on as an
    offset from zero. This includes address specifiers that have flags (e.g.
    PCI ranges).
    
    Once the flags portion has been zeroed, the translation chain is broken,
    since the mapping functions check the upcoming address specifier for
    mismatching flags. The 1:1 mapping then always fails, defeating its
    entire purpose of always succeeding.
    
    Set to zero only the address portion while passing the flags through.
    
    Fixes: dbbdee94734b ("of/address: Merge all of the bus translation code")
    Cc: [email protected]
    Signed-off-by: Andrea della Porta <[email protected]>
    Tested-by: Herve Codina <[email protected]>
    Link: https://lore.kernel.org/r/e51ae57874e58a9b349c35e2e877425ebc075d7a.1732441813.git.andrea.porta@suse.com
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
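
    "Zero only the address portion" can be sketched in userspace as below.
    reset_for_identity_map() is a hypothetical helper, and the layout (flag
    cells first, address cells after) is an assumption that matches a
    3-cell PCI-style specifier with one leading flags cell.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * addr holds `na` cells: the first `flag_cells` carry flags, the rest
 * carry the address. Clear the address part and keep the flags.
 */
void reset_for_identity_map(uint32_t *addr, int na, int flag_cells)
{
	assert(addr != NULL);
	assert(flag_cells >= 0 && flag_cells <= na);
	memset(addr + flag_cells, 0,
	       (size_t)(na - flag_cells) * sizeof(*addr));
}
```

    For a bus without flags, flag_cells is 0 and the whole specifier is
    cleared, which matches the previous behaviour.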

of: Fix error path in of_parse_phandle_with_args_map() [+ + +]

Author: Herve Codina <[email protected]>
Date:   Mon Dec 2 17:58:19 2024 +0100

    of: Fix error path in of_parse_phandle_with_args_map()
    
    commit d7dfa7fde63dde4d2ec0083133efe2c6686c03ff upstream.
    
    The current code uses some 'goto put;' to cancel the parsing operation
    and can lead to a return code value of 0 even on error cases.
    
    Indeed, some gotos are taken from within a loop without setting the ret
    value explicitly beforehand, so ret can still hold 0 from an operation
    done in the previous loop iteration. For instance, match can be set to 0
    in the previous loop iteration (leading to a new iteration), but ret can
    also be set to 0 if the of_property_read_u32() call succeeds. In that
    case, if no match is found or if an error is detected in the new
    iteration, the return value can wrongly be 0.
    
    Avoid those cases by setting the ret value explicitly before the goto
    calls.
    
    Fixes: bd6f2fd5a1d5 ("of: Support parsing phandle argument lists through a nexus node")
    Cc: [email protected]
    Signed-off-by: Herve Codina <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

of: Fix refcount leakage for OF node returned by __of_get_dma_parent() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Fri Dec 6 08:52:30 2024 +0800

    of: Fix refcount leakage for OF node returned by __of_get_dma_parent()
    
    commit 5d009e024056ded20c5bb1583146b833b23bbd5a upstream.
    
    __of_get_dma_parent() returns OF device node @args.np, but the node's
    refcount is increased twice, by both of_parse_phandle_with_args() and
    of_node_get(), which leaks a reference to the node.
    
    Fix by directly returning the node obtained from
    of_parse_phandle_with_args().
    
    Fixes: f83a6e5dea6c ("of: address: Add support for the parent DMA bus")
    Cc: [email protected]
    Signed-off-by: Zijun Hu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
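
    The double increment is easy to see in a toy reference-count model. All
    names below are hypothetical; parse_phandle() merely imitates a lookup
    that hands its result back with a reference already held.

```c
#include <assert.h>
#include <stddef.h>

struct node_model { int refcount; };

/* Take one reference; NULL is passed through. */
struct node_model *node_get(struct node_model *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Stand-in lookup: the returned node already carries a new reference. */
struct node_model *parse_phandle(struct node_model *target)
{
	return node_get(target);
}

/* Buggy shape: a second reference is taken that nobody will drop. */
struct node_model *dma_parent_buggy(struct node_model *target)
{
	struct node_model *np = parse_phandle(target);

	return node_get(np);
}

/* Fixed shape: hand the looked-up node straight to the caller. */
struct node_model *dma_parent_fixed(struct node_model *target)
{
	return parse_phandle(target);
}
```

    A caller that drops exactly one reference when it is done balances the
    fixed variant but leaves the count one too high with the buggy one.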

of: property: fw_devlink: Do not use interrupt-parent directly [+ + +]

Author: Samuel Holland <[email protected]>
Date:   Wed Nov 20 15:31:16 2024 -0800

    of: property: fw_devlink: Do not use interrupt-parent directly
    
    commit bc7acc0bd0f94c26bc0defc902311794a3d0fae9 upstream.
    
    commit 7f00be96f125 ("of: property: Add device link support for
    interrupt-parent, dmas and -gpio(s)") started adding device links for
    the interrupt-parent property. commit 4104ca776ba3 ("of: property: Add
    fw_devlink support for interrupts") and commit f265f06af194 ("of:
    property: Fix fw_devlink handling of interrupts/interrupts-extended")
    later added full support for parsing the interrupts and
    interrupts-extended properties, which includes looking up the node of
    the parent domain. This made the handler for the interrupt-parent
    property redundant.
    
    In fact, creating device links based solely on interrupt-parent is
    problematic, because it can create spurious cycles. A node may have
    this property without itself being an interrupt controller or consumer.
    For example, this property is often present in the root node or a /soc
    bus node to set the default interrupt parent for child nodes. However,
    it is incorrect for the bus to depend on the interrupt controller, as
    some of the bus's children may not be interrupt consumers at all or may
    have a different interrupt parent.
    
    Resolving these spurious dependency cycles can cause an incorrect probe
    order for interrupt controller drivers. This was observed on a RISC-V
    system with both an APLIC and IMSIC under /soc, where interrupt-parent
    in /soc points to the APLIC, and the APLIC msi-parent points to the
    IMSIC. fw_devlink found three dependency cycles and attempted to probe
    the APLIC before the IMSIC. After applying this patch, there were no
    dependency cycles and the probe order was correct.
    
    Acked-by: Marc Zyngier <[email protected]>
    Cc: [email protected]
    Fixes: 4104ca776ba3 ("of: property: Add fw_devlink support for interrupts")
    Signed-off-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

p2sb: Do not scan and remove the P2SB device when it is unhidden [+ + +]

Author: Shin'ichiro Kawasaki <[email protected]>
Date:   Thu Nov 28 09:28:36 2024 +0900

    p2sb: Do not scan and remove the P2SB device when it is unhidden
    
    [ Upstream commit 360c400d0f568636c1b98d1d5f9f49aa3d420c70 ]
    
    When drivers access P2SB device resources, they call p2sb_bar(). Before
    the commit 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls
    during PCI device probe"), p2sb_bar() obtained the resources and then
    called pci_stop_and_remove_bus_device() for clean up. Then the P2SB
    device disappeared. The commit 5913320eb0b3 introduced the P2SB device
    resource cache feature in the boot process. During the resource cache,
    pci_stop_and_remove_bus_device() is called for the P2SB device, then the
    P2SB device disappears regardless of whether p2sb_bar() is called or
    not. This disappearance of the P2SB device caused confusion [1]. To avoid
    it, skip the pci_stop_and_remove_bus_device() call when the BIOS does not
    hide the P2SB device.
    
    For that purpose, cache the P2SB device resources only if the BIOS hides
    the P2SB device. Call p2sb_scan_and_cache() only if p2sb_hidden_by_bios
    is true. This allows removing two branches from p2sb_scan_and_cache().
    When p2sb_bar() is called, get the resources from the cache if the P2SB
    device is hidden. Otherwise, read the resources from the unhidden P2SB
    device.
    
    Reported-by: Daniel Walker (danielwa) <[email protected]>
    Closes: https://lore.kernel.org/lkml/ZzTI+biIUTvFT6NC@goliath/ [1]
    Fixes: 5913320eb0b3 ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe")
    Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

p2sb: Factor out p2sb_read_from_cache() [+ + +]

Author: Shin'ichiro Kawasaki <[email protected]>
Date:   Thu Nov 28 09:28:33 2024 +0900

    p2sb: Factor out p2sb_read_from_cache()
    
    [ Upstream commit 9244524d60ddea55f4df54c51200e8fef2032447 ]
    
    To prepare for the following fix, factor out the code to read the P2SB
    resource from the cache to the new function p2sb_read_from_cache().
    
    Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden")
    Signed-off-by: Sasha Levin <[email protected]>

p2sb: Introduce the global flag p2sb_hidden_by_bios [+ + +]

Author: Shin'ichiro Kawasaki <[email protected]>
Date:   Thu Nov 28 09:28:34 2024 +0900

    p2sb: Introduce the global flag p2sb_hidden_by_bios
    
    [ Upstream commit ae3e6ebc5ab046d434c05c58a3e3f7e94441fec2 ]
    
    To prepare for the following fix, introduce the global flag
    p2sb_hidden_by_bios. Check if the BIOS hides the P2SB device and store
    the result in the flag. This allows the check result to be referenced
    across functions.
    
    Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden")
    Signed-off-by: Sasha Levin <[email protected]>

p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache() [+ + +]

Author: Shin'ichiro Kawasaki <[email protected]>
Date:   Thu Nov 28 09:28:35 2024 +0900

    p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache()
    
    [ Upstream commit 0286070c74ee48391fc07f7f617460479472d221 ]
    
    To prepare for the following fix, move the code to hide and unhide the
    P2SB device from p2sb_cache_resources() to p2sb_scan_and_cache().
    
    Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Stable-dep-of: 360c400d0f56 ("p2sb: Do not scan and remove the P2SB device when it is unhidden")
    Signed-off-by: Sasha Levin <[email protected]>

psample: adjust size if rate_as_probability is set [+ + +]

Author: Adrian Moreno <[email protected]>
Date:   Tue Dec 17 12:37:39 2024 +0100

    psample: adjust size if rate_as_probability is set
    
    [ Upstream commit 5eecd85c77a254a43bde3212da8047b001745c9f ]
    
    If PSAMPLE_ATTR_SAMPLE_PROBABILITY flag is to be sent, the available
    size for the packet data has to be adjusted accordingly.
    
    Also, check the error code returned by nla_put_flag.
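
    The size effect can be sketched in userspace as follows. The alignment
    and header constants mirror the netlink attribute layout (4-byte header,
    4-byte alignment); the sk_* helpers and the 'budget'/'meta_len' inputs
    are hypothetical and simplify the real accounting in psample.

```c
#include <stddef.h>

/* Round up to the 4-byte netlink attribute alignment. */
static size_t sk_nla_align(size_t len)
{
    const size_t a = 4;

    return (len + a - 1) & ~(a - 1);
}

/* Space one attribute takes in the message: 4-byte header plus payload. */
static size_t sk_nla_total_size(size_t payload)
{
    return sk_nla_align(4 + payload);
}

/*
 * Bytes left for packet data in a message of 'budget' bytes once
 * 'meta_len' bytes of metadata attributes are accounted for.
 */
static size_t sk_data_room(size_t budget, size_t meta_len, int rate_as_probability)
{
    const size_t data_hdr = 4;   /* the data attribute needs a header too */

    if (rate_as_probability)
        meta_len += sk_nla_total_size(0);   /* flag attribute: header only */
    if (meta_len + data_hdr >= budget)
        return 0;
    return budget - meta_len - data_hdr;
}
```

    With a budget of 100 bytes and 20 bytes of metadata, the room for data
    drops from 76 to 72 bytes once the flag attribute is included.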
    
    Fixes: 7b1b2b60c63f ("net: psample: allow using rate as probability")
    Signed-off-by: Adrian Moreno <[email protected]>
    Reviewed-by: Aaron Conole <[email protected]>
    Reviewed-by: Ido Schimmel <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ring-buffer: Fix overflow in __rb_map_vma [+ + +]

Author: Edward Adam Davis <[email protected]>
Date:   Wed Dec 18 21:36:55 2024 +0800

    ring-buffer: Fix overflow in __rb_map_vma
    
    commit c58a812c8e49ad688f94f4b050ad5c5b388fc5d2 upstream.
    
    An overflow occurred when performing the following calculation:
    
       nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff;
    
    Add a check before the calculation to avoid this problem.
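
    A userspace sketch of such a guarded calculation is shown below. The
    function name and signature are hypothetical; only the arithmetic
    follows the expression quoted above.

```c
#include <errno.h>

/*
 * Returns 0 and stores the number of mappable pages in *nr_pages, or
 * -EINVAL when pgoff points at or past the end of the mappable area.
 */
static int rb_sketch_nr_pages(unsigned long nr_subbufs, unsigned int subbuf_order,
                              unsigned long pgoff, unsigned long *nr_pages)
{
    unsigned long total = (nr_subbufs + 1) << subbuf_order;  /* + meta-page */

    if (total <= pgoff)
        return -EINVAL;   /* unguarded, total - pgoff would wrap around */

    *nr_pages = total - pgoff;
    return 0;
}
```

    For example, with three sub-buffers of order 0 and a page offset of 5,
    the unguarded subtraction 4 - 5 wraps to a huge unsigned value, whereas
    the guarded version returns -EINVAL.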
    
    syzbot reported this as a slab-out-of-bounds in __rb_map_vma:
    
    BUG: KASAN: slab-out-of-bounds in __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058
    Read of size 8 at addr ffff8880767dd2b8 by task syz-executor187/5836
    
    CPU: 0 UID: 0 PID: 5836 Comm: syz-executor187 Not tainted 6.13.0-rc2-syzkaller-00159-gf932fb9b4074 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:378 [inline]
     print_report+0xc3/0x620 mm/kasan/report.c:489
     kasan_report+0xd9/0x110 mm/kasan/report.c:602
     __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058
     ring_buffer_map+0x56e/0x9b0 kernel/trace/ring_buffer.c:7138
     tracing_buffers_mmap+0xa6/0x120 kernel/trace/trace.c:8482
     call_mmap include/linux/fs.h:2183 [inline]
     mmap_file mm/internal.h:124 [inline]
     __mmap_new_file_vma mm/vma.c:2291 [inline]
     __mmap_new_vma mm/vma.c:2355 [inline]
     __mmap_region+0x1786/0x2670 mm/vma.c:2456
     mmap_region+0x127/0x320 mm/mmap.c:1348
     do_mmap+0xc00/0xfc0 mm/mmap.c:496
     vm_mmap_pgoff+0x1ba/0x360 mm/util.c:580
     ksys_mmap_pgoff+0x32c/0x5c0 mm/mmap.c:542
     __do_sys_mmap arch/x86/kernel/sys_x86_64.c:89 [inline]
     __se_sys_mmap arch/x86/kernel/sys_x86_64.c:82 [inline]
     __x64_sys_mmap+0x125/0x190 arch/x86/kernel/sys_x86_64.c:82
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    The reproducer for this bug is:
    
    ------------------------8<-------------------------
     #include <fcntl.h>
     #include <stdlib.h>
     #include <unistd.h>
     #include <asm/types.h>
     #include <sys/mman.h>
    
     int main(int argc, char **argv)
     {
            int page_size = getpagesize();
            int fd;
            void *meta;
    
            system("echo 1 > /sys/kernel/tracing/buffer_size_kb");
            fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY);
    
            meta = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, page_size * 5);
     }
    ------------------------>8-------------------------
    
    Cc: [email protected]
    Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions")
    Link: https://lore.kernel.org/[email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=345e4443a21200874b18
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit [+ + +]

Author: Michael Neuling <[email protected]>
Date:   Wed Nov 27 04:18:40 2024 +0000

    RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
    
    [ Upstream commit ea6398a5af81e3e7fb3da5d261694d479a321fd9 ]
    
    This doesn't cause a problem currently as HVIEN isn't used elsewhere
    yet. Found by inspection.
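
    The difference between the two operations can be illustrated with a
    plain variable standing in for the CSR. This is a userspace sketch; the
    toy_* names and the bit positions are illustrative, not taken from the
    patch.

```c
#include <stdint.h>

#define TOY_OTHER_BIT   (UINT64_C(1) << 2)    /* some unrelated enable bit */
#define TOY_PMU_OVF_BIT (UINT64_C(1) << 13)   /* stand-in for the PMU overflow bit */

static uint64_t toy_hvien;   /* plain variable standing in for the CSR */

/* Write semantics: the whole register is replaced. */
static void toy_csr_write(uint64_t val)
{
    toy_hvien = val;
}

/* Set semantics: only the requested bits are ORed in. */
static void toy_csr_set(uint64_t mask)
{
    toy_hvien |= mask;
}
```

    If TOY_OTHER_BIT is already set, toy_csr_write(TOY_PMU_OVF_BIT) clears
    it, while toy_csr_set(TOY_PMU_OVF_BIT) keeps it.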
    
    Signed-off-by: Michael Neuling <[email protected]>
    Fixes: 16b0bde9a37c ("RISC-V: KVM: Add perf sampling support for guests")
    Reviewed-by: Atish Patra <[email protected]>
    Reviewed-by: Anup Patel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Anup Patel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/ipl: Fix never less than zero warning [+ + +]

Author: Alexander Gordeev <[email protected]>
Date:   Mon Dec 9 17:43:48 2024 +0100

    s390/ipl: Fix never less than zero warning
    
    [ Upstream commit 5fa49dd8e521a42379e5e41fcf2c92edaaec0a8b ]
    
    DEFINE_IPL_ATTR_STR_RW() macro produces "unsigned 'len' is never less
    than zero." warning when sys_vmcmd_on_*_store() callbacks are defined.
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: 247576bf624a ("s390/ipl: Do not accept z/VM CP diag X'008' cmds longer than max length")
    Reviewed-by: Heiko Carstens <[email protected]>
    Signed-off-by: Alexander Gordeev <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/mm: Consider KMSAN modules metadata for paging levels [+ + +]

Author: Vasily Gorbik <[email protected]>
Date:   Tue Dec 10 12:35:34 2024 +0100

    s390/mm: Consider KMSAN modules metadata for paging levels
    
    [ Upstream commit 282da38b465395c930687974627c24f47ddce5ff ]
    
    The calculation determining whether to use three- or four-level paging
    didn't account for KMSAN modules metadata. Include this metadata in the
    virtual memory size calculation to ensure correct paging mode selection
    and avoid potentially unnecessary physical memory size limitations.
    
    Fixes: 65ca73f9fb36 ("s390/mm: define KMSAN metadata for vmalloc and modules")
    Acked-by: Heiko Carstens <[email protected]>
    Reviewed-by: Alexander Gordeev <[email protected]>
    Reviewed-by: Ilya Leoshkevich <[email protected]>
    Signed-off-by: Vasily Gorbik <[email protected]>
    Signed-off-by: Alexander Gordeev <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/mm: Fix DirectMap accounting [+ + +]

Author: Heiko Carstens <[email protected]>
Date:   Fri Nov 29 17:39:27 2024 +0100

    s390/mm: Fix DirectMap accounting
    
    commit 41856638e6c4ed51d8aa9e54f70059d1e357b46e upstream.
    
    With uncoupling of physical and virtual address spaces, population of
    the identity mapping was changed to use the type POPULATE_IDENTITY
    instead of POPULATE_DIRECT. This breaks DirectMap accounting:
    
    > cat /proc/meminfo
    DirectMap4k:       55296 kB
    DirectMap1M:    18446744073709496320 kB
    
    Adjust all locations of update_page_count() in vmem.c to use
    POPULATE_IDENTITY instead of POPULATE_DIRECT as well. With this,
    accounting is correct again:
    
    > cat /proc/meminfo
    DirectMap4k:       54264 kB
    DirectMap1M:     8334336 kB
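
    The bogus DirectMap1M value in the broken output is consistent with a
    negative page counter printed as an unsigned 64-bit number: 2^64 minus
    18446744073709496320 is 55296, exactly the DirectMap4k figure. The
    sketch below reproduces the arithmetic, assuming the counters are page
    counts scaled to kB by a shift; the count of -54 is inferred from the
    numbers above, not taken from the source.

```c
#include <stdint.h>

/* kB figure from a signed count of 1 MiB pages, as an unsigned value. */
static uint64_t directmap_1m_kb(int64_t nr_1m_pages)
{
    return (uint64_t)nr_1m_pages << 10;   /* 1 MiB = 1024 kB */
}

/* kB figure from a signed count of 4 KiB pages. */
static uint64_t directmap_4k_kb(int64_t nr_4k_pages)
{
    return (uint64_t)nr_4k_pages << 2;    /* 4 KiB = 4 kB */
}
```

    Splitting 54 pages of 1 MiB into 256 pages of 4 KiB each gives
    directmap_4k_kb(54 * 256) = 55296, and a 1 MiB counter that was only
    ever decremented gives directmap_1m_kb(-54) = 18446744073709496320.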
    
    Fixes: c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces")
    Cc: [email protected]
    Reviewed-by: Alexander Gordeev <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Alexander Gordeev <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched/dlserver: Fix dlserver double enqueue [+ + +]

Author: Vineeth Pillai (Google) <[email protected]>
Date:   Thu Dec 12 22:22:36 2024 -0500

    sched/dlserver: Fix dlserver double enqueue
    
    [ Upstream commit b53127db1dbf7f1047cf35c10922d801dcd40324 ]
    
    dlserver can get dequeued during a dlserver pick_task due to the delayed
    dequeue feature and this can lead to issues with dlserver logic as it
    still thinks that dlserver is on the runqueue. The dlserver throttling
    and replenish logic gets confused and can lead to double enqueue of
    dlserver.
    
    Double enqueue of dlserver could happen for a couple of reasons:
    
    Case 1
    ------
    
    Delayed dequeue feature[1] can cause dlserver to be stopped during a
    pick initiated by dlserver:
      __pick_next_task
       pick_task_dl -> server_pick_task
        pick_task_fair
         pick_next_entity (if (sched_delayed))
          dequeue_entities
           dl_server_stop
    
    server_pick_task goes ahead with update_curr_dl_se without knowing that
    dlserver is dequeued and this confuses the logic and may lead to
    unintended enqueue while the server is stopped.
    
    Case 2
    ------
    A race between a task's dequeue on one CPU and the same task's enqueue
    on that CPU by a remote CPU, while the lock is released, causes a
    dlserver double enqueue.
    
    One cpu would be in the schedule() and releasing RQ-lock:
    
    current->state = TASK_INTERRUPTIBLE();
            schedule();
              deactivate_task()
                dl_server_stop();
              pick_next_task()
                pick_next_task_fair()
                  sched_balance_newidle()
                    rq_unlock(this_rq)
    
    at which point another CPU can take our RQ-lock and do:
    
            try_to_wake_up()
              ttwu_queue()
                rq_lock()
                ...
                activate_task()
                  dl_server_start() --> first enqueue
                wakeup_preempt() := check_preempt_wakeup_fair()
                  update_curr()
                    update_curr_task()
                      if (current->dl_server)
                        dl_server_update()
                          enqueue_dl_entity() --> second enqueue
    
    This bug was not apparent as the enqueue in dl_server_start doesn't
    usually happen because of the defer logic. But as a side effect of the
    first case (dequeue during dlserver pick), dl_throttled and dl_yield
    will be set, which messes up the time accounting of dlserver and then
    leads to an enqueue in dl_server_start.
    
    Have an explicit flag representing the status of dlserver to avoid the
    confusion. This is set in dl_server_start and reset in dl_server_stop.
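
    The idea of guarding start, stop and update with one status flag can be
    sketched in userspace like this. The toy_* names and the runtime field
    are hypothetical; the real code operates on struct sched_dl_entity under
    the runqueue lock.

```c
#include <stdbool.h>

struct toy_dl_server {
    bool active;    /* models the explicit status flag */
    int on_rq;      /* how many times the entity sits on the runqueue */
    long runtime;   /* remaining budget, arbitrary units */
};

static void toy_server_start(struct toy_dl_server *s)
{
    if (s->active)
        return;             /* already started: never enqueue twice */
    s->active = true;
    s->on_rq++;
}

static void toy_server_stop(struct toy_dl_server *s)
{
    if (!s->active)
        return;
    s->on_rq--;
    s->active = false;
}

/* Accounting path: a stopped server is neither charged nor re-enqueued. */
static bool toy_server_update(struct toy_dl_server *s, long delta)
{
    if (!s->active)
        return false;
    s->runtime -= delta;
    return true;
}
```

    Two back-to-back toy_server_start() calls leave on_rq at 1, and an
    update that arrives after toy_server_stop() is ignored.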
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Suggested-by: Peter Zijlstra <[email protected]>
    Signed-off-by: "Vineeth Pillai (Google)" <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Marcel Ziswiler <[email protected]> # ROCK 5B
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

sched/dlserver: Fix dlserver time accounting [+ + +]

Author: Vineeth Pillai (Google) <[email protected]>
Date:   Thu Dec 12 22:22:37 2024 -0500

    sched/dlserver: Fix dlserver time accounting
    
    [ Upstream commit c7f7e9c73178e0e342486fd31e7f363ef60e3f83 ]
    
    dlserver time is accounted when:
     - dlserver is active and the dlserver proxies the cfs task.
     - dlserver is active but deferred and cfs task runs after being picked
       through the normal fair class pick.
    
    dl_server_update is called in two places to make sure that both the
    above times are accounted for. But it doesn't check if dlserver is
    active or not. Now that we have this dl_server_active flag, we can
    consolidate dl_server_update into one place and all we need to check is
    whether dlserver is active or not. When dlserver is active there are
    only two possible conditions:
     - dlserver is deferred.
     - cfs task is running on behalf of dlserver.
    
    Fixes: a110a81c52a9 ("sched/deadline: Deferrable dl server")
    Signed-off-by: "Vineeth Pillai (Google)" <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Marcel Ziswiler <[email protected]> # ROCK 5B
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

sched/eevdf: More PELT vs DELAYED_DEQUEUE [+ + +]

Author: Peter Zijlstra <[email protected]>
Date:   Mon Dec 2 18:45:57 2024 +0100

    sched/eevdf: More PELT vs DELAYED_DEQUEUE
    
    [ Upstream commit 76f2f783294d7d55c2564e2dfb0a7279ba0bc264 ]
    
    Vincent and Dietmar noted that while
    commit fc1892becd56 ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE") fixes
    the entity runnable stats, it does not adjust the cfs_rq runnable stats,
    which are based off of h_nr_running.
    
    Track h_nr_delayed such that we can discount those and adjust the
    signal.
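
    In its simplest form, the discount amounts to the subtraction below.
    This is a sketch with a hypothetical toy struct, not the scheduler's
    cfs_rq.

```c
struct toy_cfs_rq {
    unsigned int h_nr_running;   /* tasks queued in the hierarchy */
    unsigned int h_nr_delayed;   /* of those, kept only for delayed dequeue */
};

/* Count fed to the runnable signal: delayed tasks are discounted. */
static unsigned int toy_runnable_count(const struct toy_cfs_rq *cfs_rq)
{
    return cfs_rq->h_nr_running - cfs_rq->h_nr_delayed;
}
```

    With three queued tasks of which one is delayed, the runnable count is 2.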
    
    Fixes: fc1892becd56 ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE")
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Closes: https://lore.kernel.org/lkml/CAKfTPtCNUvWE_GX5LyvTF-WdxUT=ZgvZZv-4t=eWntg5uOFqiQ@mail.gmail.com/
    [ Fixes checkpatch warnings and rebased ]
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reported-by: Dietmar Eggemann <[email protected]>
    Reported-by: Vincent Guittot <[email protected]>
    Signed-off-by: "Peter Zijlstra (Intel)" <[email protected]>
    Signed-off-by: Vincent Guittot <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Dietmar Eggemann <[email protected]>
    Tested-by: K Prateek Nayak <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

sched/fair: Fix NEXT_BUDDY [+ + +]

Author: K Prateek Nayak <[email protected]>
Date:   Thu Nov 28 12:59:54 2024 +0530

    sched/fair: Fix NEXT_BUDDY
    
    [ Upstream commit 493afbd187c4c9cc1642792c0d9ba400c3d6d90d ]
    
    Adam reports that enabling NEXT_BUDDY instantly triggers a WARN in
    pick_next_entity().
    
    Moving clear_buddies() up before the delayed dequeue bits ensures
    no ->next buddy becomes delayed. Further ensure no new ->next buddy
    ever starts as delayed.
    
    Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
    Reported-by: Adam Li <[email protected]>
    Signed-off-by: K Prateek Nayak <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Adam Li <[email protected]>
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

sched/fair: Fix sched_can_stop_tick() for fair tasks [+ + +]

Author: Vincent Guittot <[email protected]>
Date:   Mon Dec 2 18:45:56 2024 +0100

    sched/fair: Fix sched_can_stop_tick() for fair tasks
    
    [ Upstream commit c1f43c342e1f2e32f0620bf2e972e2a9ea0a1e60 ]
    
    We can't stop the tick of a rq if there are at least 2 tasks enqueued in
    the whole hierarchy and not only at the root cfs rq.
    
    rq->cfs.nr_running tracks the number of sched_entity at one level
    whereas rq->cfs.h_nr_running tracks all queued tasks in the
    hierarchy.
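
    The difference between the two counters shows up as soon as tasks sit
    inside a group. A userspace sketch with hypothetical toy_* names:

```c
#include <stdbool.h>

struct toy_cfs {
    unsigned int nr_running;     /* entities queued at this level only */
    unsigned int h_nr_running;   /* tasks queued in the whole hierarchy */
};

/* Root-level check: misses tasks that are queued inside groups. */
static bool toy_can_stop_tick_root_only(const struct toy_cfs *cfs)
{
    return cfs->nr_running <= 1;
}

/* Hierarchy-wide check: the tick stays on with two or more tasks. */
static bool toy_can_stop_tick(const struct toy_cfs *cfs)
{
    return cfs->h_nr_running <= 1;
}
```

    A root cfs rq holding a single group entity with two tasks has
    nr_running == 1 and h_nr_running == 2, so only the hierarchy-wide check
    keeps the tick running.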
    
    Fixes: 11cc374f4643b ("sched_ext: Simplify scx_can_stop_tick() invocation in sched_can_stop_tick()")
    Signed-off-by: Vincent Guittot <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Dietmar Eggemann <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

selftests/bpf: Use asm constraint "m" for LoongArch [+ + +]

Author: Tiezhu Yang <[email protected]>
Date:   Thu Dec 19 19:15:06 2024 +0800

    selftests/bpf: Use asm constraint "m" for LoongArch
    
    commit 29d44cce324dab2bd86c447071a596262e7109b6 upstream.
    
    Currently, LoongArch LLVM does not support the constraint "o" and there
    is no plan to support it. It only supports the similar constraint "m",
    so change the constraints from "nor" in the "else" case to arch-specific
    "nmr" to avoid build errors such as "unexpected asm memory constraint"
    for LoongArch.
    
    Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests")
    Suggested-by: Weining Lu <[email protected]>
    Suggested-by: Li Chen <[email protected]>
    Signed-off-by: Tiezhu Yang <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Reviewed-by: Huacai Chen <[email protected]>
    Cc: [email protected]
    Link: https://llvm.org/docs/LangRef.html#supported-constraint-code-list
    Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp#L172
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests/memfd: run sysctl tests when PID namespace support is enabled [+ + +]

Author: Isaac J. Manjarres <[email protected]>
Date:   Thu Dec 5 11:29:41 2024 -0800

    selftests/memfd: run sysctl tests when PID namespace support is enabled
    
    commit 6a75f19af16ff482cfd6085c77123aa0f464f8dd upstream.
    
    The sysctl tests for vm.memfd_noexec rely on the kernel to support PID
    namespaces (i.e.  the kernel is built with CONFIG_PID_NS=y).  If the
    kernel the test runs on does not support PID namespaces, the first sysctl
    test will fail when attempting to spawn a new thread in a new PID
    namespace, abort the test, preventing the remaining tests from being run.
    
    This is not desirable, as not all kernels need PID namespaces, but can
    still use the other features provided by memfd.  Therefore, only run the
    sysctl tests if the kernel supports PID namespaces.  Otherwise, skip those
    tests and emit an informative message to let the user know why the sysctl
    tests are not being run.
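
    One way to detect PID namespace support from userspace is to probe for
    the per-process namespace link in procfs. The sketch below assumes that
    /proc/self/ns/pid exists only when the kernel is built with
    CONFIG_PID_NS=y and procfs is mounted; the sketch_* helper names are
    hypothetical.

```c
#include <stdbool.h>
#include <unistd.h>

/* True when the given path exists. */
static bool sketch_path_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Assumption: this link is only present with CONFIG_PID_NS=y. */
static bool sketch_pid_ns_supported(void)
{
    return sketch_path_exists("/proc/self/ns/pid");
}
```

    A test runner could call sketch_pid_ns_supported() once and print a
    skip message instead of running the sysctl tests when it returns false.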
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 11f75a01448f ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC")
    Signed-off-by: Isaac J. Manjarres <[email protected]>
    Reviewed-by: Jeff Xu <[email protected]>
    Cc: Suren Baghdasaryan <[email protected]>
    Cc: Kalesh Singh <[email protected]>
    Cc: <[email protected]>    [6.6+]
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests: openvswitch: fix tcpdump execution [+ + +]

Author: Adrian Moreno <[email protected]>
Date:   Tue Dec 17 22:16:51 2024 +0100

    selftests: openvswitch: fix tcpdump execution
    
    [ Upstream commit a17975992cc11588767175247ccaae1213a8b582 ]
    
    Fix the way tcpdump is executed by:
    - Using the right variable for the namespace. Currently the use of the
      empty "ns" makes the command fail.
    - Waiting until it starts to capture to ensure the interesting traffic
      is caught on slow systems.
    - Using line-buffered output to ensure logs are available when the test
      is paused with "-p". Otherwise the last chunk of data might only be
      written when tcpdump is killed.
    
    Fixes: 74cc26f416b9 ("selftests: openvswitch: add interface support")
    Signed-off-by: Adrian Moreno <[email protected]>
    Acked-by: Eelco Chaudron <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

smb: client: fix TCP timers deadlock after rmmod [+ + +]

Author: Enzo Matsumiya <[email protected]>
Date:   Tue Dec 10 18:15:12 2024 -0300

    smb: client: fix TCP timers deadlock after rmmod
    
    commit e9f2517a3e18a54a3943c098d2226b245d488801 upstream.
    
    Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
    fixed a netns UAF by manually enabling socket refcounting
    (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)).
    
    The reason the patch worked for that bug was because we now hold
    references to the netns (get_net_track() gets a ref internally)
    and they're properly released (internally, on __sk_destruct()),
    but only because sk->sk_net_refcnt was set.
    
    Problem:
    (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless
    if init_net or other)
    
    Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not
    only out of cifs scope, but also technically wrong -- it's set conditionally
    based on user (=1) vs kernel (=0) sockets.  And net/ implementations
    seem to base their user vs kernel space operations on it.
    
    e.g. upon TCP socket close, the TCP timers are not cleared because
    sk->sk_net_refcnt=1:
    (cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets"))
    
    net/ipv4/tcp.c:
        void tcp_close(struct sock *sk, long timeout)
        {
            lock_sock(sk);
            __tcp_close(sk, timeout);
            release_sock(sk);
            if (!sk->sk_net_refcnt)
                    inet_csk_clear_xmit_timers_sync(sk);
            sock_put(sk);
        }
    
    Which will throw a lockdep warning and then, as expected, deadlock on
    tcp_write_timer().
    
    A way to reproduce this is by running the reproducer from ef7134c7fc48
    and then 'rmmod cifs'.  A few seconds later, the deadlock/lockdep
    warning shows up.
    
    Fix:
    We shouldn't mess with socket internals ourselves, so do not set
    sk_net_refcnt manually.
    
    Also change __sock_create() to sock_create_kern() for explicitness.
    
    As for non-init_net network namespaces, we deal with it the best way
    we can -- hold an extra netns reference for server->ssocket and drop it
    when it's released.  This ensures that the netns still exists whenever
    we need to create/destroy server->ssocket, but is not directly tied to
    it.
    
    Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.")
    Cc: [email protected]
    Signed-off-by: Enzo Matsumiya <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

team: Fix feature exposure when no ports are present [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Fri Dec 13 13:36:57 2024 +0100

    team: Fix feature exposure when no ports are present
    
    [ Upstream commit e78c20f327bd94dabac68b98218dff069a8780f0 ]
    
    Small follow-up to align this to an equivalent behavior as the bond driver.
    The change in 3625920b62c3 ("teaming: fix vlan_features computing") removed
    the netdevice vlan_features when there is no team port attached, yet it
    leaves the full set of enc_features intact.
    
    Instead, leave the default features as pre 3625920b62c3, and recompute once
    we do have ports attached. Also, as in the bonding case, call the
    netdev_base_features() helper on the enc_features.
    
    Fixes: 3625920b62c3 ("teaming: fix vlan_features computing")
    Signed-off-by: Daniel Borkmann <[email protected]>
    Reviewed-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

thunderbolt: Add support for Intel Panther Lake-M/P [+ + +]

Author: Mika Westerberg <[email protected]>
Date:   Tue May 14 10:15:14 2024 +0300

    thunderbolt: Add support for Intel Panther Lake-M/P
    
    commit 8644b48714dca8bf2f42a4ff8311de8efc9bd8c3 upstream.
    
    Intel Panther Lake-M/P has the same integrated Thunderbolt/USB4
    controller as Lunar Lake. Add these PCI IDs to the driver list of
    supported devices.
    
    Cc: [email protected]
    Signed-off-by: Mika Westerberg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

thunderbolt: Don't display nvm_version unless upgrade supported [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Mon Dec 9 10:25:51 2024 -0600

    thunderbolt: Don't display nvm_version unless upgrade supported
    
    commit e34f1717ef0632fcec5cb827e5e0e9f223d70c9b upstream.
    
    The read will never succeed if NVM wasn't initialized due to an unknown
    format.
    
    Add a new callback for visibility to only show when supported.
    
    Cc: [email protected]
    Fixes: aef9c693e7e5 ("thunderbolt: Move vendor specific NVM handling into nvm.c")
    Reported-by: Richard Hughes <[email protected]>
    Closes: https://github.com/fwupd/fwupd/issues/8200
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Mika Westerberg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

thunderbolt: Improve redrive mode handling [+ + +]

Author: Mika Westerberg <[email protected]>
Date:   Fri Nov 15 11:54:40 2024 +0200

    thunderbolt: Improve redrive mode handling
    
    commit 24740385cb0d6d22ab7fa7adf36546d5b3cdcf73 upstream.
    
    When USB-C monitor is connected directly to Intel Barlow Ridge host, it
    goes into "redrive" mode that basically routes the DisplayPort signals
    directly from the GPU to the USB-C monitor without any tunneling needed.
    However, the host router must be powered on for this to work. Aaron
    reported that there are a couple of cases where this will not work with
    the current code:
    
      - Booting with USB-C monitor plugged in.
      - Plugging in USB-C monitor when the host router is in sleep state
        (runtime suspended).
      - Plugging in USB-C device while the system is in system sleep state.
    
    In all these cases once the host router is runtime suspended the picture
    on the connected USB-C display disappears too. This is certainly not
    what the user expected.
    
    For this reason improve the redrive mode handling to keep the host
    router from runtime suspending when it detects that any of the above
    cases is happening.
    
    Fixes: a75e0684efe5 ("thunderbolt: Keep the domain powered when USB4 port is in redrive mode")
    Reported-by: Aaron Rainbolt <[email protected]>
    Closes: https://lore.kernel.org/linux-usb/20241009220118.70bfedd0@kf-ir16/
    Cc: [email protected]
    Signed-off-by: Mika Westerberg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tools/net/ynl: fix sub-message key lookup for nested attributes [+ + +]

Author: Donald Hunter <[email protected]>
Date:   Fri Dec 13 13:07:11 2024 +0000

    tools/net/ynl: fix sub-message key lookup for nested attributes
    
    [ Upstream commit 663ad7481f068057f6f692c5368c47150e855370 ]
    
    Use the correct attribute space for sub-message key lookup in nested
    attributes when adding attributes. This fixes rt_link where the "kind"
    key and "data" sub-message are nested attributes in "linkinfo".
    
    For example:
    
    ./tools/net/ynl/cli.py \
        --create \
        --spec Documentation/netlink/specs/rt_link.yaml \
        --do newlink \
        --json '{"link": 99,
                 "linkinfo": { "kind": "vlan", "data": {"id": 4 } }
                 }'
    
    Signed-off-by: Donald Hunter <[email protected]>
    Fixes: ab463c4342d1 ("tools/net/ynl: Add support for encoding sub-messages")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools: hv: change permissions of NetworkManager configuration file [+ + +]

Author: Olaf Hering <[email protected]>
Date:   Wed Oct 16 16:35:10 2024 +0200

    tools: hv: change permissions of NetworkManager configuration file
    
    [ Upstream commit 91ae69c7ed9e262f24240c425ad1eef2cf6639b7 ]
    
    Align permissions of the resulting .nmconnection file, instead of
    the input file from hv_kvp_daemon. To avoid the tiny time frame
    where the output file is world-readable, use umask instead of chmod.
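
    The umask approach can be sketched as follows: because the mask is
    applied at creation time, the file never exists with wider permissions,
    unlike a create-then-chmod sequence. The function name and the 0177
    mask value are assumptions for illustration, not taken from the patch.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create 'path' readable and writable by the owner only, then write 'text'. */
static int sketch_write_private(const char *path, const char *text)
{
    mode_t old_mask = umask(0177);   /* new files get at most mode 0600 */
    FILE *f = fopen(path, "w");

    umask(old_mask);                 /* restore the mask for later files */
    if (!f)
        return -1;
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

    Note that the mask only affects newly created files; if 'path' already
    exists, its current permissions are kept.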
    
    Fixes: 42999c904612 ("hv/hv_kvp_daemon:Support for keyfile based connection profile")
    Signed-off-by: Olaf Hering <[email protected]>
    Reviewed-by: Shradha Gupta <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wei Liu <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools: hv: Fix a compiler warning in the fcopy uio daemon [+ + +]

Author: Dexuan Cui <[email protected]>
Date:   Tue Sep 10 00:44:32 2024 +0000

    tools: hv: Fix a compiler warning in the fcopy uio daemon
    
    commit cb1b78f1c726c938bd47497c1ab16b01ce967f37 upstream.
    
    hv_fcopy_uio_daemon.c:436:53: warning: '%s' directive output may be truncated
    writing up to 14 bytes into a region of size 10 [-Wformat-truncation=]
      436 |  snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name);
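
    The warning says that up to 14 bytes may be written where only 10 bytes
    remain after the "/dev/" prefix. Independent of how the buffer is sized,
    truncation can be detected at run time from the snprintf() return value,
    as in this sketch (the function name is hypothetical):

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 0 on success, -1 if "/dev/<name>" did not fit into buf. */
static int sketch_build_dev_path(char *buf, size_t size, const char *name)
{
    int n = snprintf(buf, size, "/dev/%s", name);

    if (n < 0 || (size_t)n >= size)
        return -1;   /* encoding error or truncated output */
    return 0;
}
```

    With a 15-byte buffer, "uio0" fits, whereas a 14-character name needs 20
    bytes including the terminating NUL and is reported as truncated.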
    
    Also added 'static' for the array 'desc[]'.
    
    Fixes: 82b0945ce2c2 ("tools: hv: Add new fcopy application based on uio driver")
    Cc: [email protected] # 6.10+
    Signed-off-by: Dexuan Cui <[email protected]>
    Reviewed-by: Saurabh Sengar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wei Liu <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Wed Dec 18 14:15:07 2024 -0500

    trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers
    
    commit 8cd63406d08110c8098e1efda8aef7ddab4db348 upstream.
    
    The TP_printk() of a TRACE_EVENT() is a generic printf format that any
    developer can create for their event. It may include pointers to strings
    and such. A boot mapped buffer may contain data from a previous kernel
    where the string addresses are different.
    
    One solution is to copy the event content and update the pointers by the
    recorded delta, but a simpler solution (for now) is to just use the
    print_fields() function to print these events. The print_fields() function
    just iterates the fields and prints them according to what type they are,
    and ignores the TP_printk() format from the event itself.
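    
    As a rough illustration of printing by field type, the Python sketch
    below formats each field from a (name, kind, size) description and never
    consults an event-specific format string. It is a simplified model and
    not the kernel's print_fields(); the tuple layout and the "int"/"string"
    kinds are assumptions made for the example.

```python
def print_fields(fields, record):
    """Format every field according to its type only.

    `fields` is a list of (name, kind, size_in_bytes) tuples and `record`
    maps field names to values. Both layouts are hypothetical.
    """
    parts = []
    for name, kind, size in fields:
        value = record[name]
        if kind == "string":
            parts.append(f"{name}={value}")
            continue
        bits = size * 8
        # Show the raw value in hex and its signed interpretation in decimal.
        unsigned = value & ((1 << bits) - 1)
        signed = unsigned - (1 << bits) if unsigned >> (bits - 1) else unsigned
        parts.append(f"{name}={unsigned:#x} ({signed})")
    return " ".join(parts)


if __name__ == "__main__":
    layout = [("bytes_req", "int", 8), ("node", "int", 4)]
    print(print_fields(layout, {"bytes_req": 4096, "node": -1}))
```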
    
    To understand the difference, when printing via TP_printk() the output
    looks like this:
    
      4582.696626: kmem_cache_alloc: call_site=getname_flags+0x47/0x1f0 ptr=00000000e70e10e0 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL node=-1 accounted=false
      4582.696629: kmem_cache_alloc: call_site=alloc_empty_file+0x6b/0x110 ptr=0000000095808002 bytes_req=360 bytes_alloc=384 gfp_flags=GFP_KERNEL node=-1 accounted=false
      4582.696630: kmem_cache_alloc: call_site=security_file_alloc+0x24/0x100 ptr=00000000576339c3 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false
      4582.696653: kmem_cache_free: call_site=do_sys_openat2+0xa7/0xd0 ptr=00000000e70e10e0 name=names_cache
    
    But when printing via print_fields() (echo 1 > /sys/kernel/tracing/options/fields)
    the same event output looks like this:
    
      4582.696626: kmem_cache_alloc: call_site=0xffffffff92d10d97 (-1831793257) ptr=0xffff9e0e8571e000 (-107689771147264) bytes_req=0x1000 (4096) bytes_alloc=0x1000 (4096) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0)
      4582.696629: kmem_cache_alloc: call_site=0xffffffff92d0250b (-1831852789) ptr=0xffff9e0e8577f800 (-107689770747904) bytes_req=0x168 (360) bytes_alloc=0x180 (384) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0)
      4582.696630: kmem_cache_alloc: call_site=0xffffffff92efca74 (-1829778828) ptr=0xffff9e0e8d35d3b0 (-107689640864848) bytes_req=0x10 (16) bytes_alloc=0x10 (16) gfp_flags=0xdc0 (3520) node=0xffffffff (-1) accounted=(0)
      4582.696653: kmem_cache_free: call_site=0xffffffff92cfbea7 (-1831879001) ptr=0xffff9e0e8571e000 (-107689771147264) name=names_cache
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Add "%s" check in test_event_printk() [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Mon Dec 16 21:41:21 2024 -0500

    tracing: Add "%s" check in test_event_printk()
    
    commit 65a25d9f7ac02e0cf361356e834d1c71d36acca9 upstream.
    
    The test_event_printk() code makes sure that when a trace event is
    registered, any dereferenced pointers from the event's TP_printk() are
    pointing to content in the ring buffer. But currently it does not handle
    "%s", as there are cases where the string pointer saved in the ring buffer
    points to a static string in the kernel that will never be freed. As that
    is a valid case, the pointer needs to be checked at runtime.
    
    Currently the runtime check is done via trace_check_vprintf(), but to not
    have to replicate everything in vsnprintf() it does some logic with the
    va_list that may not be reliable across architectures. In order to get rid
    of that logic, more work in the test_event_printk() needs to be done. Some
    of the strings can be validated at this time when it is obvious the string
    is valid because the string will be saved in the ring buffer content.
    
    Do all the validation of strings in the ring buffer at boot in
    test_event_printk(), and make sure that the fields of the strings that
    point into the kernel are accessible. This will allow adding checks at
    runtime that will validate the fields themselves and not rely on parsing
    the TP_printk() format at runtime.
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Al Viro <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Add missing helper functions in event pointer dereference check [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Mon Dec 16 21:41:20 2024 -0500

    tracing: Add missing helper functions in event pointer dereference check
    
    commit 917110481f6bc1c96b1e54b62bb114137fbc6d17 upstream.
    
    The process_pointer() helper function looks to see if various trace event
    macros are used. These macros are for storing data in the event. This
    makes it safe to dereference as the dereference will then point into the
    event on the ring buffer where the content of the data stays with the
    event itself.
    
    A few helper functions were missing. Those were:
    
      __get_rel_dynamic_array()
      __get_dynamic_array_len()
      __get_rel_dynamic_array_len()
      __get_rel_sockaddr()
    
    Also add a helper function find_print_string() so that an intermediate
    variable is not needed to test if the string exists.
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Al Viro <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Check "%s" dereference via the field and not the TP_printk format [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Mon Dec 16 21:41:22 2024 -0500

    tracing: Check "%s" dereference via the field and not the TP_printk format
    
    commit afd2627f727b89496d79a6b934a025fc916d4ded upstream.
    
    The TP_printk() portion of a trace event is executed at the time an event
    is read from the trace. This can happen seconds, minutes, hours, days,
    months, or possibly years after the event was recorded. If the print
    format contains a dereference to a string via "%s", and that string was
    allocated, there's a chance that string could be freed before it is read
    by the trace file.
    
    To protect against such bugs, there are two functions that verify the
    event. The first one is test_event_printk(), which is called when the
    event is created. It reads the TP_printk() format as well as its arguments
    to make sure nothing may be dereferencing a pointer that was not copied
    into the ring buffer along with the event. If it is, it will trigger a
    WARN_ON().
    
    For strings that use "%s", it is not so easy. The string may not reside in
    the ring buffer but may still be valid. Strings that are static and part
    of the kernel proper, which will not be freed for the life of the running
    system, are safe to dereference. But whether it is a pointer to a
    static string or to something on the heap cannot be determined until the
    event is triggered.
    
    This brings us to the second function that tests for the bad dereferencing
    of strings, trace_check_vprintf(). It would walk through the printf format
    looking for "%s", and when it finds it, it would validate that the pointer
    is safe to read. If not, it would produce a WARN_ON() as well and write
    into the ring buffer "[UNSAFE-MEMORY]".
    
    The problem with this is how it used va_list to have vsnprintf() handle
    all the cases that it didn't need to check. Instead of re-implementing
    vsnprintf(), it would make a copy of the format up to the %s part, and
    call vsnprintf() with the current va_list ap variable, where the ap would
    then be ready to point at the string in question.
    
    For architectures that passed va_list by reference this was possible. For
    architectures that passed it by copy it was not. A test_can_verify()
    function was used to differentiate between the two, and if it wasn't
    possible, it would disable it.
    
    Even for architectures where this was feasible, it was a stretch to rely
    on such a method that is undocumented, and could cause issues later on
    with new optimizations of the compiler.
    
    Instead, the first function test_event_printk() was updated to look at
    "%s" as well. If the "%s" argument is a pointer outside the event in the
    ring buffer, it would find the field type of the event that is the problem
    and mark the structure with a new flag called "needs_test". The event
    itself will be marked by TRACE_EVENT_FL_TEST_STR to let it be known that
    this event has a field that needs to be verified before the event can be
    printed using the printf format.
    
    When the event fields are created from the field type structure, the
    fields would copy the field type's "needs_test" value.
    
    Finally, before being printed, a new function ignore_event() is called
    which will check if the event has the TEST_STR flag set (if not, it
    returns false). If the flag is set, it then iterates through the event's
    fields looking for the ones that have the "needs_test" flag set.
    
    Then it uses the offset field from the field structure to find the pointer
    in the ring buffer event. It runs the tests to make sure that pointer is
    safe to print and if not, it triggers the WARN_ON() and also adds to the
    trace output that the event in question has an unsafe memory access.
    
    The ignore_event() makes the trace_check_vprintf() obsolete so it is
    removed.
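    
    The flow described above can be modelled in a few lines of Python. This
    is only a sketch of the idea, not the kernel's ignore_event(): the Field
    class, the 64-bit little-endian pointer layout, and the is_safe_pointer
    callback are all assumptions for illustration, and the real function
    also warns and annotates the trace output.

```python
import struct
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    offset: int               # byte offset of the field inside the event
    needs_test: bool = False  # copied from the field type at creation


def ignore_event(test_str_flag, fields, record, is_safe_pointer):
    """Return True when the event holds a string pointer that is unsafe."""
    if not test_str_flag:
        # The event has no field that needs checking.
        return False
    for field in fields:
        if not field.needs_test:
            continue
        # Use the field's offset to fetch the pointer stored in the event.
        (ptr,) = struct.unpack_from("<Q", record, field.offset)
        if not is_safe_pointer(ptr):
            return True
    return False


if __name__ == "__main__":
    layout = [Field("count", 0), Field("name", 4, needs_test=True)]
    event = struct.pack("<IQ", 7, 0x9000)
    print(ignore_event(True, layout, event, lambda p: 0x1000 <= p < 0x2000))
```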
    
    Link: https://lore.kernel.org/all/CAHk-=wh3uOnqnZPpR0PeLZZtyWbZLboZ7cHLCKRWsocvs9Y7hQ@mail.gmail.com/
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Al Viro <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing: Fix test_event_printk() to process entire print argument [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Mon Dec 16 21:41:19 2024 -0500

    tracing: Fix test_event_printk() to process entire print argument
    
    commit a6629626c584200daf495cc9a740048b455addcd upstream.
    
    The test_event_printk() analyzes print formats of trace events looking for
    cases where it may dereference a pointer that is not in the ring buffer
    which can possibly be a bug when the trace event is read from the ring
    buffer and the content of that pointer no longer exists.
    
    The function needs to accurately go from one print format argument to the
    next. It handles quotes and parentheses that may be included in an
    argument. When it finds the start of the next argument, it uses a simple
    "c = strstr(fmt + i, ',')" to find the end of that argument!
    
    In order to include "%s" dereferencing, it needs to process the entire
    content of the print format argument and not just the content up to the
    first ',' it finds, as there may be content like:
    
     ({ const char *saved_ptr = trace_seq_buffer_ptr(p); static const char
       *access_str[] = { "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux"
       }; union kvm_mmu_page_role role; role.word = REC->role;
       trace_seq_printf(p, "sp gen %u gfn %llx l%u %u-byte q%u%s %s%s" " %snxe
       %sad root %u %s%c", REC->mmu_valid_gen, REC->gfn, role.level,
       role.has_4_byte_gpte ? 4 : 8, role.quadrant, role.direct ? " direct" : "",
       access_str[role.access], role.invalid ? " invalid" : "", role.efer_nx ? ""
       : "!", role.ad_disabled ? "!" : "", REC->root_count, REC->unsync ?
       "unsync" : "sync", 0); saved_ptr; })
    
    This is an example of a full argument of an existing event. As the code
    already handles finding the next print format argument, process the
    argument at the end of it and not the start of it. This way it has both
    the start of the argument as well as the end of it.
    
    Add a helper function "process_pointer()" that will do the processing during
    the loop as well as at the end. It also makes the code cleaner and easier
    to read.
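    
    To see why stopping at the first ',' is not enough, the following Python
    sketch splits an argument list only at commas that are outside quotes and
    parentheses. It is a simplified stand-in for the scanning described
    above, not the kernel code; split_print_args() is a hypothetical name,
    and character literals are deliberately not handled.

```python
def split_print_args(args):
    """Split print arguments at top-level commas only."""
    parts = []
    depth = 0          # current parenthesis nesting level
    in_quote = False   # inside a double-quoted string literal
    start = 0
    i = 0
    while i < len(args):
        ch = args[i]
        if in_quote:
            if ch == "\\":
                i += 1             # skip the escaped character
            elif ch == '"':
                in_quote = False
        elif ch == '"':
            in_quote = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only here is the whole argument known, start and end.
            parts.append(args[start:i].strip())
            start = i + 1
        i += 1
    parts.append(args[start:].strip())
    return parts


if __name__ == "__main__":
    for arg in split_print_args('REC->a, foo(REC->b, "x,y"), REC->c ? "," : ""'):
        print(arg)
```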
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Al Viro <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5013f454a352c ("tracing: Add check of trace event print fmts for dereferencing pointers")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

udmabuf: also check for F_SEAL_FUTURE_WRITE [+ + +]

Author: Jann Horn <[email protected]>
Date:   Wed Dec 4 17:26:20 2024 +0100

    udmabuf: also check for F_SEAL_FUTURE_WRITE
    
    commit 0a16e24e34f28210f68195259456c73462518597 upstream.
    
    When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf
    must reject memfds with this flag, just like ones with F_SEAL_WRITE.
    Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED.
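    
    The resulting check is a plain bitmask test, sketched here in Python.
    The seal values are those of the Linux UAPI header; SEALS_WANTED and the
    helper seals_acceptable() are assumptions added for illustration, since
    this commit message only names SEALS_DENIED.

```python
# memfd seal bits as defined in the Linux UAPI fcntl header.
F_SEAL_SHRINK = 0x0002
F_SEAL_WRITE = 0x0008
F_SEAL_FUTURE_WRITE = 0x0010

# Assumed for this sketch: one seal must be present, two must be absent.
SEALS_WANTED = F_SEAL_SHRINK
SEALS_DENIED = F_SEAL_WRITE | F_SEAL_FUTURE_WRITE


def seals_acceptable(seals):
    """Return True if a memfd with these seals may be used."""
    if (seals & SEALS_WANTED) != SEALS_WANTED:
        return False
    if seals & SEALS_DENIED:
        # Either write seal is enough to reject the memfd.
        return False
    return True


if __name__ == "__main__":
    print(seals_acceptable(F_SEAL_SHRINK))
    print(seals_acceptable(F_SEAL_SHRINK | F_SEAL_FUTURE_WRITE))
```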
    
    Fixes: ab3948f58ff8 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")
    Cc: [email protected]
    Acked-by: Vivek Kasireddy <[email protected]>
    Signed-off-by: Jann Horn <[email protected]>
    Reviewed-by: Joel Fernandes (Google) <[email protected]>
    Signed-off-by: Vivek Kasireddy <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

udmabuf: fix memory leak on last export_udmabuf() error path [+ + +]

Author: Jann Horn <[email protected]>
Date:   Wed Dec 4 17:26:21 2024 +0100

    udmabuf: fix memory leak on last export_udmabuf() error path
    
    [ Upstream commit f49856f525acd5bef52ae28b7da2e001bbe7439e ]
    
    In export_udmabuf(), if dma_buf_fd() fails because the FD table is full, a
    dma_buf owning the udmabuf has already been created; but the error handling
    in udmabuf_create() will tear down the udmabuf without doing anything about
    the containing dma_buf.
    
    This leaves a dma_buf in memory that contains a dangling pointer; though
    that doesn't seem to lead to anything bad except a memory leak.
    
    Fix it by moving the dma_buf_fd() call out of export_udmabuf() so that we
    can give it different error handling.
    
    Note that the shape of this code changed a lot in commit 5e72b2b41a21
    ("udmabuf: convert udmabuf driver to use folios"); but the memory leak
    seems to have existed since the introduction of udmabuf.
    
    Fixes: fbb0de795078 ("Add udmabuf misc device")
    Acked-by: Vivek Kasireddy <[email protected]>
    Signed-off-by: Jann Horn <[email protected]>
    Signed-off-by: Vivek Kasireddy <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

udmabuf: fix racy memfd sealing check [+ + +]

Author: Jann Horn <[email protected]>
Date:   Wed Dec 4 17:26:19 2024 +0100

    udmabuf: fix racy memfd sealing check
    
    commit 9cb189a882738c1d28b349d4e7c6a1ef9b3d8f87 upstream.
    
    The current check_memfd_seals() is racy: Since we first do
    check_memfd_seals() and then udmabuf_pin_folios() without holding any
    relevant lock across both, F_SEAL_WRITE can be set in between.
    This is problematic because we can end up holding pins to pages in a
    write-sealed memfd.
    
    Fix it using the inode lock; that's probably the easiest way.
    In the future, we might want to consider moving this logic into memfd,
    especially if anyone else wants to use memfd_pin_folios().
    
    Reported-by: Julian Orth <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106
    Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9w@mail.gmail.com
    Fixes: fbb0de795078 ("Add udmabuf misc device")
    Cc: [email protected]
    Signed-off-by: Jann Horn <[email protected]>
    Acked-by: Joel Fernandes (Google) <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Signed-off-by: Vivek Kasireddy <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

udmabuf: udmabuf_create pin folio codestyle cleanup [+ + +]

Author: Huan Yang <[email protected]>
Date:   Wed Sep 18 10:52:27 2024 +0800

    udmabuf: udmabuf_create pin folio codestyle cleanup
    
    [ Upstream commit 164fd9efd46531fddfaa933d394569259896642b ]
    
    This patch aims to simplify the memfd folio pinning during udmabuf
    creation. No functional changes.
    
    This patch creates a udmabuf_pin_folios() function, which pins the memfd
    folios and then records each pinned folio and its offset.
    
    This patch also simplifies recording the pinned folios: it iterates over
    each pinned folio and then records each offset within it.
    
    Compared to iterating by pgcnt, this is more readable.
    
    Suggested-by: Vivek Kasireddy <[email protected]>
    Signed-off-by: Huan Yang <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Signed-off-by: Vivek Kasireddy <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Stable-dep-of: f49856f525ac ("udmabuf: fix memory leak on last export_udmabuf() error path")
    Signed-off-by: Sasha Levin <[email protected]>

USB: serial: option: add MediaTek T7XX compositions [+ + +]

Author: Jack Wu <[email protected]>
Date:   Thu Nov 28 10:22:27 2024 +0800

    USB: serial: option: add MediaTek T7XX compositions
    
    commit f07dfa6a1b65034a5c3ba3a555950d972f252757 upstream.
    
    Add the MediaTek T7XX compositions:
    
    T:  Bus=03 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 74 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0e8d ProdID=7129 Rev= 0.01
    S:  Manufacturer=MediaTek Inc.
    S:  Product=USB DATA CARD
    S:  SerialNumber=004402459035402
    C:* #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA
    A:  FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
    I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    -------------------------------
    | If Number | Function        |
    -------------------------------
    | 2         | USB AP Log Port |
    -------------------------------
    | 3         | USB AP GNSS Port|
    -------------------------------
    | 4         | USB AP META Port|
    -------------------------------
    | 5         | ADB port        |
    -------------------------------
    | 6         | USB MD AT Port  |
    -------------------------------
    | 7         | USB MD META Port|
    -------------------------------
    | 8         | USB NTZ Port    |
    -------------------------------
    | 9         | USB Debug port  |
    -------------------------------
    
    Signed-off-by: Jack Wu <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: serial: option: add MeiG Smart SLM770A [+ + +]

Author: Michal Hrusecky <[email protected]>
Date:   Tue Nov 19 14:00:18 2024 +0100

    USB: serial: option: add MeiG Smart SLM770A
    
    commit 724d461e44dfc0815624d2a9792f2f2beb7ee46d upstream.
    
    Update the USB serial option driver to support MeiG Smart SLM770A.
    
    ID 2dee:4d57 Marvell Mobile Composite Device Bus
    
    T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=2dee ProdID=4d57 Rev= 1.00
    S:  Manufacturer=Marvell
    S:  Product=Mobile Composite Device Bus
    C:* #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=500mA
    A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03
    I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host
    E:  Ad=87(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=89(I) Atr=03(Int.) MxPS=  64 Ivl=4096ms
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Tested successfully connecting to the Internet via rndis interface after
    dialing via AT commands on If#=3 or If#=4.
    Not sure of the purpose of the other serial interfaces.
    
    Signed-off-by: Michal Hrusecky <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready [+ + +]

Author: Mank Wang <[email protected]>
Date:   Fri Nov 22 09:06:00 2024 +0000

    USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready
    
    commit aa954ae08262bb5cd6ab18dd56a0b58c1315db8b upstream.
    
    LCUK54-WRD's vid/pid
    0x3731/0x010a
    0x3731/0x010c
    
    LCUK54-WWD's vid/pid
    0x3731/0x010b
    0x3731/0x010d
    
    Above products use the exact same interface layout and option
    driver:
    MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL
    
    T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#=  5 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=3731 ProdID=0101 Rev= 5.04
    S:  Manufacturer=NetPrisma
    S:  Product=LCUK54-WRD
    S:  SerialNumber=feeba631
    C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA
    A:  FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
    I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option
    E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
    E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Signed-off-by: Mank Wang <[email protected]>
    [ johan: use lower case hex notation ]
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: serial: option: add TCL IK512 MBIM & ECM [+ + +]

Author: Daniel Swanemar <[email protected]>
Date:   Mon Nov 4 14:42:17 2024 +0100

    USB: serial: option: add TCL IK512 MBIM & ECM
    
    commit fdad4fb7c506bea8b419f70ff2163d99962e8ede upstream.
    
    Add the following TCL IK512 compositions:
    
    0x0530: Modem + Diag + AT + MBIM
    T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  3 Spd=10000 MxCh= 0
    D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
    P:  Vendor=1bbb ProdID=0530 Rev=05.04
    S:  Manufacturer=TCL
    S:  Product=TCL 5G USB Dongle
    S:  SerialNumber=3136b91a
    C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=82(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    
    0x0640: ECM + Modem + Diag + AT
    T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  4 Spd=10000 MxCh= 0
    D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
    P:  Vendor=1bbb ProdID=0640 Rev=05.04
    S:  Manufacturer=TCL
    S:  Product=TCL 5G USB Dongle
    S:  SerialNumber=3136b91a
    C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
    E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
    E:  Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    
    Signed-off-by: Daniel Swanemar <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: serial: option: add Telit FE910C04 rmnet compositions [+ + +]

Author: Daniele Palmas <[email protected]>
Date:   Mon Dec 9 16:32:54 2024 +0100

    USB: serial: option: add Telit FE910C04 rmnet compositions
    
    commit 8366e64a4454481339e7c56a8ad280161f2e441d upstream.
    
    Add the following Telit FE910C04 compositions:
    
    0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag)
    T:  Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10c0 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE910
    S:  SerialNumber=f71b8b32
    C:  #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag)
    T:  Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10c4 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE910
    S:  SerialNumber=f71b8b32
    C:  #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb
    T:  Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480  MxCh= 0
    D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10c8 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE910
    S:  SerialNumber=f71b8b32
    C:  #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    
    Signed-off-by: Daniele Palmas <[email protected]>
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vmalloc: fix accounting with i915 [+ + +]

Author: Matthew Wilcox (Oracle) <[email protected]>
Date:   Wed Dec 11 20:25:37 2024 +0000

    vmalloc: fix accounting with i915
    
    commit a2e740e216f5bf49ccb83b6d490c72a340558a43 upstream.
    
    If the caller of vmap() specifies VM_MAP_PUT_PAGES (currently only the
    i915 driver), we will decrement nr_vmalloc_pages and MEMCG_VMALLOC in
    vfree().  These counters are incremented by vmalloc() but not by vmap() so
    this will cause an underflow.  Check the VM_MAP_PUT_PAGES flag before
    decrementing either counter.
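
    The accounting rule described above can be sketched in user space as
    follows. This is a simplified model, not the kernel code: the struct,
    the function names, the counter variable and the flag value are all
    assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bit; the value is an assumption for this sketch. */
#define VM_MAP_PUT_PAGES 0x200u

/* Hypothetical, simplified stand-in for the kernel's vm_struct. */
struct vm_area_model {
	unsigned int flags;
	unsigned int nr_pages;
};

/* Stand-in for the global nr_vmalloc_pages counter. */
long nr_vmalloc_pages_model;

/* vmalloc()-style allocation: the pages are accounted. */
void model_vmalloc(struct vm_area_model *vm, unsigned int nr_pages)
{
	assert(vm != NULL);
	vm->flags = 0;
	vm->nr_pages = nr_pages;
	nr_vmalloc_pages_model += nr_pages;
}

/* vmap()-style mapping with VM_MAP_PUT_PAGES: nothing is accounted. */
void model_vmap_put_pages(struct vm_area_model *vm, unsigned int nr_pages)
{
	assert(vm != NULL);
	vm->flags = VM_MAP_PUT_PAGES;
	vm->nr_pages = nr_pages;
}

/* vfree()-style release: only undo accounting that was actually done. */
void model_vfree(struct vm_area_model *vm)
{
	assert(vm != NULL);
	if (!(vm->flags & VM_MAP_PUT_PAGES))
		nr_vmalloc_pages_model -= vm->nr_pages;
	vm->nr_pages = 0;
}
```

    Without the flag test in model_vfree(), releasing a mapping created by
    model_vmap_put_pages() would drive the counter below the value that
    model_vmalloc() established.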
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
    Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Reviewed-by: Shakeel Butt <[email protected]>
    Reviewed-by: Balbir Singh <[email protected]>
    Acked-by: Michal Hocko <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Roman Gushchin <[email protected]>
    Cc: "Uladzislau Rezki (Sony)" <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/hyperv: Fix hv tsc page based sched_clock for hibernation [+ + +]

Author: Naman Jain <[email protected]>
Date:   Tue Sep 17 11:09:17 2024 +0530

    x86/hyperv: Fix hv tsc page based sched_clock for hibernation
    
    commit bcc80dec91ee745b3d66f3e48f0ec2efdea97149 upstream.
    
    read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is
    bigger than the variable hv_sched_clock_offset, which is cached during
    early boot. Depending on the timing, this assumption may be false when
    a hibernated VM starts again and resumes, because the clock counter
    restarts from 0 (note: hv_init_tsc_clocksource() is not called during
    hibernation/resume). Consequently, read_hv_sched_clock_tsc() may
    return a negative integer, which is interpreted as a huge positive
    integer since the return type is u64, and new kernel messages are
    prefixed with huge timestamps until read_hv_sched_clock_tsc() grows
    big enough, which typically takes several seconds.
    
    Fix the issue by saving the Hyper-V clock counter just before the
    suspend, and using it to correct the hv_sched_clock_offset in
    resume. This makes hv tsc page based sched_clock continuous and ensures
    that post resume, it starts from where it left off during suspend.
    Override x86_platform.save_sched_clock_state and
    x86_platform.restore_sched_clock_state routines to correct this as soon
    as possible.
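
    The offset correction can be illustrated with a small user-space
    model. The struct and function names below are hypothetical and the
    values are in raw counter ticks; the sketch only shows why adjusting
    the cached offset by the counter value saved at suspend keeps the
    clock continuous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state; names only loosely follow the commit text. */
struct hv_clock_model {
	uint64_t sched_clock_offset;  /* cached during early boot */
	uint64_t counter_at_suspend;  /* saved just before suspend */
};

/* sched_clock in counter ticks: reference counter minus cached offset. */
uint64_t model_sched_clock(const struct hv_clock_model *m, uint64_t counter)
{
	assert(m != NULL);
	return counter - m->sched_clock_offset;
}

/* Called just before suspend: remember the current counter value. */
void model_save_state(struct hv_clock_model *m, uint64_t counter_now)
{
	assert(m != NULL);
	m->counter_at_suspend = counter_now;
}

/*
 * Called on resume: shift the offset by the amount the counter jumped
 * back. The unsigned wraparound is intentional; the later subtraction
 * in model_sched_clock() wraps back to the expected value.
 */
void model_restore_state(struct hv_clock_model *m, uint64_t counter_now)
{
	assert(m != NULL);
	m->sched_clock_offset -= m->counter_at_suspend - counter_now;
}
```

    With an offset of 1000 and a counter of 50000 at suspend, the model
    clock reads 49000. If the counter restarts near 0 on resume, the
    uncorrected subtraction wraps to a huge value, while the corrected
    offset makes the clock continue from 49000.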
    
    Note: if Invariant TSC is available, the issue doesn't happen because
    1) we don't register read_hv_sched_clock_tsc() for sched clock:
    See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework
    clocksource and sched clock setup");
    2) the common x86 code adjusts TSC similarly: see
    __restore_processor_state() ->  tsc_verify_tsc_adjust(true) and
    x86_platform.restore_sched_clock_state().
    
    Cc: [email protected]
    Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation")
    Co-developed-by: Dexuan Cui <[email protected]>
    Signed-off-by: Dexuan Cui <[email protected]>
    Signed-off-by: Naman Jain <[email protected]>
    Reviewed-by: Michael Kelley <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wei Liu <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xfs: fix off-by-one error in fsmap's end_daddr usage [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Dec 18 11:50:52 2024 -0800

    xfs: fix off-by-one error in fsmap's end_daddr usage
    
    commit a440a28ddbdcb861150987b4d6e828631656b92f upstream.
    
    In commit ca6448aed4f10a, we created an "end_daddr" variable to fix
    fsmap reporting when the end of the range requested falls in the middle
    of an unknown (aka free on the rmapbt) region.  Unfortunately, I didn't
    notice that the code sets end_daddr to the last sector of the device
    but then uses that quantity to compute the length of the synthesized
    mapping.
    
    Zizhi Wo later observed that when end_daddr isn't set, we still don't
    report the last fsblock on a device because in that case (aka when
    info->last is true), the info->high mapping that we pass to
    xfs_getfsmap_group_helper has a startblock that points to the last
    fsblock.  This is also wrong because the code uses startblock to
    compute the length of the synthesized mapping.
    
    Fix the second problem by setting end_daddr unconditionally, and fix the
    first problem by setting start_daddr to one past the end of the range to
    query.
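
    The underlying off-by-one is the usual difference between an inclusive
    "last" address and an exclusive "one past the end" address. The
    helpers below are a generic illustration with hypothetical names, not
    the fsmap code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Length of [start, end_excl), where end_excl is one past the end. */
uint64_t model_len_exclusive(uint64_t start, uint64_t end_excl)
{
	assert(end_excl >= start);
	return end_excl - start;
}

/* Length of [start, last], where last is the final valid address. */
uint64_t model_len_inclusive(uint64_t start, uint64_t last)
{
	assert(last >= start);
	return last - start + 1;
}
```

    On a 100-sector device, a mapping starting at sector 90 has length 10.
    Passing the last sector (99) to the exclusive form yields 9, which is
    the kind of short mapping described above.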
    
    Cc: <[email protected]> # v6.11
    Fixes: ca6448aed4f10a ("xfs: Fix missing interval for missing_owner in xfs fsmap")
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Reported-by: Zizhi Wo <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xfs: fix sb_spino_align checks for large fsblock sizes [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Dec 18 11:51:07 2024 -0800

    xfs: fix sb_spino_align checks for large fsblock sizes
    
    commit 7f8a44f37229fc76bfcafa341a4b8862368ef44a upstream.
    
    For a sparse inodes filesystem, mkfs.xfs computes the values of
    sb_spino_align and sb_inoalignmt with the following code:
    
            int     cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
    
            if (cfg->sb_feat.crcs_enabled)
                    cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;
    
            sbp->sb_spino_align = cluster_size >> cfg->blocklog;
            sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK *
                            cfg->inodesize >> cfg->blocklog;
    
    On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results
    in cluster_size = 8192 * (512 / 256) = 16384.  As a result,
    sb_spino_align and sb_inoalignmt are both set to zero.  Unfortunately,
    this trips the new sb_spino_align check that was just added to
    xfs_validate_sb_common, and the mkfs fails:
    
    # mkfs.xfs -f -b size=64k, /dev/sda
    meta-data=/dev/sda               isize=512    agcount=4, agsize=81136 blks
             =                       sectsz=512   attr=2, projid32bit=1
             =                       crc=1        finobt=1, sparse=1, rmapbt=1
             =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
             =                       exchange=0   metadir=0
    data     =                       bsize=65536  blocks=324544, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=65536  ascii-ci=0, ftype=1, parent=0
    log      =internal log           bsize=65536  blocks=5006, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=65536  blocks=0, rtextents=0
             =                       rgcount=0    rgsize=0 extents
    Discarding blocks...Sparse inode alignment (0) is invalid.
    Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
    libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
    mkfs.xfs: Releasing dirty buffer to free list!
    found dirty buffer (bulk) on free list!
    Sparse inode alignment (0) is invalid.
    Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200
    libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1
    mkfs.xfs: writing AG headers failed, err=22
    
    Prior to commit 59e43f5479cce1 this all worked fine, even if "sparse"
    inodes are somewhat meaningless when everything fits in a single
    fsblock.  Adjust the checks to handle existing filesystems.
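
    The shift arithmetic in the mkfs.xfs snippet above can be reproduced
    in user space. The constants 8192 and 256 follow from the calculation
    quoted above; XFS_INODES_PER_CHUNK = 64 is assumed here from the XFS
    on-disk format. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define XFS_INODE_BIG_CLUSTER_SIZE 8192
#define XFS_DINODE_MIN_SIZE        256
#define XFS_INODES_PER_CHUNK       64   /* assumed for this sketch */

/* Hypothetical container for the two computed superblock fields. */
struct spino_geometry {
	uint32_t sb_spino_align;
	uint32_t sb_inoalignmt;
};

/* Mirrors the quoted mkfs.xfs computation for a sparse inodes filesystem. */
struct spino_geometry compute_spino_geometry(uint32_t inodesize,
					     uint32_t blocklog,
					     int crcs_enabled)
{
	struct spino_geometry g;
	uint32_t cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;

	assert(blocklog < 32);
	if (crcs_enabled)
		cluster_size *= inodesize / XFS_DINODE_MIN_SIZE;

	/* Both values are expressed in filesystem blocks. */
	g.sb_spino_align = cluster_size >> blocklog;
	g.sb_inoalignmt = (XFS_INODES_PER_CHUNK * inodesize) >> blocklog;
	return g;
}
```

    For 512 byte inodes and 64k blocks (blocklog = 16), both 16384 >> 16
    and 32768 >> 16 are 0. With 4k blocks (blocklog = 12) the same inputs
    give 4 and 8.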
    
    Cc: <[email protected]> # v6.13-rc1
    Fixes: 59e43f5479cce1 ("xfs: sb_spino_align is not verified")
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xfs: fix sparse inode limits on runt AG [+ + +]

Author: Dave Chinner <[email protected]>
Date:   Wed Dec 18 11:50:36 2024 -0800

    xfs: fix sparse inode limits on runt AG
    
    commit 13325333582d4820d39b9e8f63d6a54e745585d9 upstream.
    
    The runt AG at the end of a filesystem is almost always smaller than
    the mp->m_sb.sb_agblocks. Unfortunately, when setting the max_agbno
    limit for the inode chunk allocation, we do not take this into
    account. This means we can allocate a sparse inode chunk that
    overlaps beyond the end of an AG. When we go to allocate an inode
    from that sparse chunk, the irec fails validation because the
    agbno of the start of the irec is beyond valid limits for the runt
    AG.
    
    Prevent this from happening by taking into account the size of the
    runt AG when allocating inode chunks. Also convert the various
    checks for valid inode chunk agbnos to use xfs_ag_block_count()
    so that they will also catch such issues in the future.
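
    A minimal user-space sketch of the size calculation involved, assuming
    the last AG simply holds whatever blocks remain after the full-sized
    AGs. The function names and the example geometry are hypothetical;
    this is a model of the idea, not the kernel's xfs_ag_block_count().

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks in AG agno; only the last AG may be a runt. */
uint32_t model_ag_block_count(uint64_t dblocks, uint32_t agblocks,
			      uint32_t agcount, uint32_t agno)
{
	assert(agno < agcount);
	if (agno < agcount - 1)
		return agblocks;
	return (uint32_t)(dblocks - (uint64_t)agno * agblocks);
}

/* A chunk of chunk_blocks starting at agbno must end within the AG. */
int model_chunk_fits(uint32_t agbno, uint32_t chunk_blocks,
		     uint32_t ag_blocks)
{
	return (uint64_t)agbno + chunk_blocks <= ag_blocks;
}
```

    With 3500 data blocks and a nominal AG size of 1000, the fourth AG
    holds 500 blocks. A chunk of 8 blocks at agbno 496 passes a check
    against the nominal size but fails against the runt AG's real size.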
    
    Fixes: 56d1115c9bc7 ("xfs: allocate sparse inode chunks on full chunk allocation failure")
    Signed-off-by: Dave Chinner <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    [djwong: backport to stable because upstream maintainer ignored cc-stable]
    Link: https://lore.kernel.org/linux-xfs/20241112231539.GG9438@frogsfrogsfrogs/
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xfs: fix zero byte checking in the superblock scrubber [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Dec 18 11:51:23 2024 -0800

    xfs: fix zero byte checking in the superblock scrubber
    
    commit c004a793e0ec34047c3bd423bcd8966f5fac88dc upstream.
    
    The logic to check that the region past the end of the superblock is all
    zeroes is wrong -- we don't want to check only the bytes past the end of
    the maximally sized ondisk superblock structure as currently defined in
    xfs_format.h; we want to check the bytes beyond the end of the ondisk
    superblock as defined by the feature bits.
    
    Port the superblock size logic from xfs_repair and then put it to use in
    xfs_scrub.
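
    The difference between the two checks can be sketched as below. The
    helper name and the byte sizes used in the example are hypothetical;
    the point is only that the start of the "must be zero" region depends
    on the superblock size implied by the feature bits, not on the largest
    structure currently defined.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte in buf[sb_len .. buf_len) is zero, else 0. */
int model_tail_is_zero(const unsigned char *buf, size_t buf_len,
		       size_t sb_len)
{
	size_t i;

	assert(buf != NULL);
	assert(sb_len <= buf_len);
	for (i = sb_len; i < buf_len; i++) {
		if (buf[i] != 0)
			return 0;
	}
	return 1;
}
```

    If the feature bits imply a 264 byte superblock but the check starts
    at a maximal size of 304 bytes, a stray nonzero byte at offset 300
    goes unnoticed; starting the check at 264 catches it.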
    
    Cc: <[email protected]> # v4.15
    Fixes: 21fb4cb1981ef7 ("xfs: scrub the secondary superblocks")
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xfs: sb_spino_align is not verified [+ + +]

Author: Dave Chinner <[email protected]>
Date:   Wed Dec 18 11:50:20 2024 -0800

    xfs: sb_spino_align is not verified
    
    commit 59e43f5479cce106d71c0b91a297c7ad1913176c upstream.
    
    It's just read in from the superblock and used without doing any
    validity checks at all on the value.
    
    Fixes: fb4f2b4e5a82 ("xfs: add sparse inode chunk alignment superblock field")
    Signed-off-by: Dave Chinner <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Carlos Maiolino <[email protected]>
    [djwong: actually tag for 6.12 because upstream maintainer ignored cc-stable tag]
    Link: https://lore.kernel.org/linux-xfs/20241024165544.GI21853@frogsfrogsfrogs/
    Signed-off-by: "Darrick J. Wong" <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic [+ + +]

Author: Mathias Nyman <[email protected]>
Date:   Tue Dec 17 12:21:21 2024 +0200

    xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic
    
    commit e21ebe51af688eb98fd6269240212a3c7300deea upstream.
    
    xHC hosts from several vendors have the same issue where endpoints start
    so slowly that a later queued 'Stop Endpoint' command may complete before
    the endpoint is up and running.
    
    The 'Stop Endpoint' command fails with a context state error as the
    endpoint still appears as stopped.
    
    See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for
    details.
    
    CC: [email protected]
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

zram: fix uninitialized ZRAM not releasing backing device [+ + +]

Author: Kairui Song <[email protected]>
Date:   Tue Dec 10 00:57:16 2024 +0800

    zram: fix uninitialized ZRAM not releasing backing device
    
    commit 74363ec674cb172d8856de25776c8f3103f05e2f upstream.
    
    Setting backing device is done before ZRAM initialization.  If we set the
    backing device, then remove the ZRAM module without initializing the
    device, the backing device reference will be leaked and the device will be
    held forever.
    
    Fix this by always resetting the ZRAM fully on rmmod or reset store.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 013bf95a83ec ("zram: add interface to specif backing device")
    Signed-off-by: Kairui Song <[email protected]>
    Reported-by: Desheng Wu <[email protected]>
    Suggested-by: Sergey Senozhatsky <[email protected]>
    Reviewed-by: Sergey Senozhatsky <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

zram: refuse to use zero sized block device as backing device [+ + +]

Author: Kairui Song <[email protected]>
Date:   Tue Dec 10 00:57:15 2024 +0800

    zram: refuse to use zero sized block device as backing device
    
    commit be48c412f6ebf38849213c19547bc6d5b692b5e5 upstream.
    
    Patch series "zram: fix backing device setup issue", v2.
    
    This series fixes two bugs of backing device setting:
    
    - ZRAM should reject using a zero sized device (or the uninitialized
      ZRAM device itself) as the backing device.
    - Fix backing device leaking when removing an uninitialized ZRAM
      device.
    
    
    This patch (of 2):
    
    Setting a zero sized block device as backing device is pointless, and one
    can easily create a recursive loop by setting the uninitialized ZRAM
    device itself as its own backing device (zram0 is uninitialized here):
    
        echo /dev/zram0 > /sys/block/zram0/backing_dev
    
    This is definitely a wrong config, and the module will pin itself. The
    kernel should refuse to do so in the first place.
    
    By refusing to use a zero sized device, we avoid misuse cases including
    the one above.
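
    The size check can be sketched in user space as follows. The function
    name and parameters are hypothetical; the sketch assumes the backing
    device size is known in bytes and is rejected when it holds no whole
    page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Returns 0 if a backing device of size_bytes is usable, or -EINVAL if
 * it holds no complete page. An uninitialized zram device has size 0,
 * so this also rejects using it as its own backing device.
 */
int model_check_backing_size(uint64_t size_bytes, unsigned int page_shift)
{
	uint64_t nr_pages;

	assert(page_shift < 64);
	nr_pages = size_bytes >> page_shift;
	if (nr_pages == 0)
		return -EINVAL;
	return 0;
}
```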
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 013bf95a83ec ("zram: add interface to specif backing device")
    Signed-off-by: Kairui Song <[email protected]>
    Reported-by: Desheng Wu <[email protected]>
    Reviewed-by: Sergey Senozhatsky <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>