Changelog in Linux kernel 6.1.95

af_unix: Annotate data-races around sk->sk_state for writers. [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:28 2024 -0700

    af_unix: Annotate data-races around sk->sk_state for writers.
    
    [ Upstream commit 942238f9735a4a4ebf8274b218d9a910158941d1 ]
    
    sk->sk_state is changed under unix_state_lock(), but it's read locklessly
    in many places.
    
    This patch adds WRITE_ONCE() on the writer side.
    
    We will add READ_ONCE() to the lockless readers in the following patches.
    
    Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
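
A minimal userspace sketch of the writer-side annotation described above. It is not the kernel code: the macros are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(), and `demo_sock`, `demo_set_state()` and `demo_is_listening()` are hypothetical names. An `atomic_flag` spinlock plays the role of unix_state_lock().

```c
#include <stdatomic.h>

/* Simplified stand-ins for the kernel macros (assumption: the real ones
 * perform additional type checks). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))

/* Values mirror TCP_ESTABLISHED, TCP_CLOSE and TCP_LISTEN. */
enum { DEMO_ESTABLISHED = 1, DEMO_CLOSE = 7, DEMO_LISTEN = 10 };

struct demo_sock {
    atomic_flag lock;     /* stands in for unix_state_lock() */
    unsigned char state;  /* stands in for sk->sk_state */
};

/* Writer: the store happens under the lock but is still annotated,
 * because readers may load the field without taking the lock. */
static void demo_set_state(struct demo_sock *sk, unsigned char state)
{
    while (atomic_flag_test_and_set_explicit(&sk->lock, memory_order_acquire))
        ;  /* spin until the lock is free */
    WRITE_ONCE(sk->state, state);
    atomic_flag_clear_explicit(&sk->lock, memory_order_release);
}

/* Lockless reader: a single annotated load. */
static int demo_is_listening(struct demo_sock *sk)
{
    return READ_ONCE(sk->state) == DEMO_LISTEN;
}
```

The annotation does not add ordering or mutual exclusion; it only tells the compiler to perform exactly one store or load, so the value cannot be torn or re-read.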

af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:37 2024 -0700

    af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
    
    [ Upstream commit bd9f2d05731f6a112d0c7391a0d537bfc588dbe6 ]
    
    net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be
    changed concurrently.
    
    Let's use READ_ONCE() in unix_create1().
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:41 2024 -0700

    af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
    
    [ Upstream commit efaf24e30ec39ebbea9112227485805a48b0ceb1 ]
    
    While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock().
    
    Let's use READ_ONCE() to read sk->sk_shutdown.
    
    Fixes: e4e541a84863 ("sock-diag: Report shutdown for inet and unix sockets (v2)")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:29 2024 -0700

    af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
    
    [ Upstream commit 3a0f38eb285c8c2eead4b3230c7ac2983707599d ]
    
    ioctl(SIOCINQ) calls unix_inq_len() that checks sk->sk_state first
    and returns -EINVAL if it's TCP_LISTEN.
    
    Then, for SOCK_STREAM sockets, unix_inq_len() returns the number of
    bytes in recvq.
    
    However, unix_inq_len() does not hold unix_state_lock(), and the
    concurrent listen() might change the state after checking sk->sk_state.
    
    If the race occurs, 0 is returned for the listener, instead of -EINVAL,
    because the length of skb with embryo is 0.
    
    We could hold unix_state_lock() in unix_inq_len(), but it's overkill
    given the result is true for pre-listen() TCP_CLOSE state.
    
    So, let's use READ_ONCE() for sk->sk_state in unix_inq_len().
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:31 2024 -0700

    af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
    
    [ Upstream commit a9bf9c7dc6a5899c01cb8f6e773a66315a5cd4b7 ]
    
    As a small optimisation, unix_stream_connect() prefetches the client's
    sk->sk_state without unix_state_lock() and checks if it's TCP_CLOSE.
    
    Later, sk->sk_state is checked again under unix_state_lock().
    
    Let's use READ_ONCE() for the first check and TCP_CLOSE directly for
    the second check.
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:34 2024 -0700

    af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
    
    [ Upstream commit af4c733b6b1aded4dc808fafece7dfe6e9d2ebb3 ]
    
    unix_stream_read_skb() is called from sk->sk_data_ready() context
    where unix_state_lock() is not held.
    
    Let's use READ_ONCE() there.
    
    Fixes: 77462de14a43 ("af_unix: Add read_sock for stream socket types")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:33 2024 -0700

    af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
    
    [ Upstream commit 8a34d4e8d9742a24f74998f45a6a98edd923319b ]
    
    The following functions read sk->sk_state locklessly and proceed only if
    the state is TCP_ESTABLISHED.
    
      * unix_stream_sendmsg
      * unix_stream_read_generic
      * unix_seqpacket_sendmsg
      * unix_seqpacket_recvmsg
    
    Let's use READ_ONCE() there.
    
    Fixes: a05d2ad1c1f3 ("af_unix: Only allow recv on connected seqpacket sockets.")
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:35 2024 -0700

    af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
    
    [ Upstream commit 0aa3be7b3e1f8f997312cc4705f8165e02806f8f ]
    
    While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read
    locklessly.
    
    Let's use READ_ONCE() there.
    
    Note that the result could be inconsistent if the socket is dumped
    during the state change.  This is common for other SOCK_DIAG and
    similar interfaces.
    
    Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
    Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA")
    Fixes: 45a96b9be6ec ("unix_diag: Dumping all sockets core")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:30 2024 -0700

    af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
    
    [ Upstream commit eb0718fb3e97ad0d6f4529b810103451c90adf94 ]
    
    unix_poll() and unix_dgram_poll() read sk->sk_state locklessly and
    call unix_writable(), which also reads sk->sk_state without holding
    unix_state_lock().
    
    Let's use READ_ONCE() in unix_poll() and unix_dgram_poll() and pass
    it to unix_writable().
    
    While at it, we remove the TCP_SYN_SENT check in unix_dgram_poll() as
    that state has never existed for AF_UNIX sockets since the code was added.
    
    Fixes: 1586a5877db9 ("af_unix: do not report POLLOUT on listeners")
    Fixes: 3c73419c09a5 ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
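
A rough userspace sketch of the "read once, then pass the value down" shape described above. The struct, the `DEMO_*` constants and both functions are hypothetical simplifications, not the kernel's unix_poll() or unix_writable().

```c
/* Simplified stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Values mirror TCP_ESTABLISHED, TCP_CLOSE and TCP_LISTEN. */
enum { DEMO_ESTABLISHED = 1, DEMO_CLOSE = 7, DEMO_LISTEN = 10 };

/* Hypothetical event bits for this sketch. */
#define DEMO_POLLOUT 0x004u
#define DEMO_POLLHUP 0x010u

struct demo_sock {
    unsigned char state;  /* stands in for sk->sk_state */
    int wmem_alloc;       /* bytes currently queued for sending */
    int sndbuf;           /* send buffer limit */
};

/* Takes the already-sampled state instead of re-reading sk->state. */
static int demo_writable(struct demo_sock *sk, unsigned char state)
{
    return state != DEMO_LISTEN && (sk->wmem_alloc << 2) <= sk->sndbuf;
}

static unsigned int demo_poll(struct demo_sock *sk)
{
    unsigned char state = READ_ONCE(sk->state);  /* the only load */
    unsigned int mask = 0;

    if (state == DEMO_CLOSE)
        mask |= DEMO_POLLHUP;
    if (demo_writable(sk, state))
        mask |= DEMO_POLLOUT;
    return mask;
}
```

Passing the sampled value means every check inside one poll call sees the same state, even if a concurrent writer changes it in between.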

af_unix: annotate lockless accesses to sk->sk_err [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Wed Mar 15 20:57:46 2023 +0000

    af_unix: annotate lockless accesses to sk->sk_err
    
    [ Upstream commit cc04410af7de348234ac36a5f50c4ce416efdb4b ]
    
    unix_poll() and unix_dgram_poll() read sk->sk_err
    without any lock held.
    
    Add relevant READ_ONCE()/WRITE_ONCE() annotations.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: 83690b82d228 ("af_unix: Use skb_queue_empty_lockless() in unix_release_sock().")
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Read with MSG_PEEK loops if the first unread byte is OOB [+ + +]

Author: Rao Shoaib <[email protected]>
Date:   Tue Jun 11 01:46:39 2024 -0700

    af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
    
    [ Upstream commit a6736a0addd60fccc3a3508461d72314cc609772 ]
    
    Read with MSG_PEEK flag loops if the first byte to read is an OOB byte.
    commit 22dd70eb2c3d ("af_unix: Don't peek OOB data without MSG_OOB.")
    addresses the loop issue but does not address the issue that no data
    beyond OOB byte can be read.
    
    >>> from socket import *
    >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
    >>> c1.send(b'a', MSG_OOB)
    1
    >>> c1.send(b'b')
    1
    >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
    b'b'
    
    >>> from socket import *
    >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
    >>> c2.setsockopt(SOL_SOCKET, SO_OOBINLINE, 1)
    >>> c1.send(b'a', MSG_OOB)
    1
    >>> c1.send(b'b')
    1
    >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
    b'a'
    >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
    b'a'
    >>> c2.recv(1, MSG_DONTWAIT)
    b'a'
    >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
    b'b'
    >>>
    
    Fixes: 314001f0bf92 ("af_unix: Add OOB support")
    Signed-off-by: Rao Shoaib <[email protected]>
    Reviewed-by: Kuniyuki Iwashima <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Set sk->sk_state under unix_state_lock() for truly disconnected peer. [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:27 2024 -0700

    af_unix: Set sk->sk_state under unix_state_lock() for truly disconnected peer.
    
    [ Upstream commit 26bfb8b57063f52b867f9b6c8d1742fcb5bd656c ]
    
    When a SOCK_DGRAM socket connect()s to another socket, both sockets'
    sk->sk_state are changed to TCP_ESTABLISHED so that we can register them
    to BPF SOCKMAP.
    
    When the socket disconnects from the peer by connect(AF_UNSPEC), the state
    is set back to TCP_CLOSE.
    
    Then, the peer's state is also set to TCP_CLOSE, but the update is done
    locklessly and unconditionally.
    
    Let's say socket A connect()ed to B, B connect()ed to C, and A disconnects
    from B.
    
    After the first two connect()s, all three sockets' sk->sk_state are
    TCP_ESTABLISHED:
    
      $ ss -xa
      Netid State  Recv-Q Send-Q  Local Address:Port  Peer Address:PortProcess
      u_dgr ESTAB  0      0       @A 641              * 642
      u_dgr ESTAB  0      0       @B 642              * 643
      u_dgr ESTAB  0      0       @C 643              * 0
    
    And after the disconnect, B's state is TCP_CLOSE even though it's still
    connected to C and C's state is TCP_ESTABLISHED.
    
      $ ss -xa
      Netid State  Recv-Q Send-Q  Local Address:Port  Peer Address:PortProcess
      u_dgr UNCONN 0      0       @A 641              * 0
      u_dgr UNCONN 0      0       @B 642              * 643
      u_dgr ESTAB  0      0       @C 643              * 0
    
    In this case, we cannot register B to SOCKMAP.
    
    So, when a socket disconnects from the peer, we should not set TCP_CLOSE to
    the peer if the peer is connected to yet another socket, and this must be
    done under unix_state_lock().
    
    Note that we use WRITE_ONCE() for sk->sk_state as there are many lockless
    readers.  These data-races will be fixed in the following patches.
    
    Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
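
The A, B, C scenario above can be modelled with a small userspace sketch. All names are hypothetical and locking is omitted; in the kernel the peer check and the store would run under the old peer's unix_state_lock(), with WRITE_ONCE() for the store.

```c
#include <stddef.h>

/* Values mirror TCP_ESTABLISHED and TCP_CLOSE. */
enum { DEMO_ESTABLISHED = 1, DEMO_CLOSE = 7 };

struct demo_sock {
    struct demo_sock *peer;  /* socket this one is connect()ed to */
    unsigned char state;     /* stands in for sk->sk_state */
};

/* connect(): both ends become ESTABLISHED. */
static void demo_connect(struct demo_sock *sk, struct demo_sock *other)
{
    sk->peer = other;
    sk->state = DEMO_ESTABLISHED;
    other->state = DEMO_ESTABLISHED;
}

/* connect(AF_UNSPEC): close the old peer only if it is truly
 * disconnected, i.e. not connected to yet another socket. */
static void demo_disconnect(struct demo_sock *sk)
{
    struct demo_sock *old_peer = sk->peer;

    sk->peer = NULL;
    sk->state = DEMO_CLOSE;

    if (old_peer && !old_peer->peer)
        old_peer->state = DEMO_CLOSE;
}
```

With A connected to B and B connected to C, calling `demo_disconnect()` on A leaves B in the established state because B still has a peer.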

af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:39 2024 -0700

    af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
    
    [ Upstream commit 83690b82d228b3570565ebd0b41873933238b97f ]
    
    If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock()
    checks the length of the peer socket's recvq under unix_state_lock().
    
    However, unix_stream_read_generic() calls skb_unlink() after releasing
    the lock.  Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks
    skb without unix_state_lock().
    
    Thus, unix_state_lock() does not protect qlen.
    
    Let's use skb_queue_empty_lockless() in unix_release_sock().
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
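
A simplified sketch of what a lockless emptiness check on a circular list looks like. `demo_node` and the helpers are hypothetical; they only imitate the idea behind skb_queue_empty_lockless(), which is to compare a single annotated load of the head's next pointer against the head itself.

```c
/* Simplified stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Circular doubly linked list; the head is a node that points to itself
 * when the queue is empty. */
struct demo_node {
    struct demo_node *next;
    struct demo_node *prev;
};

static void demo_queue_init(struct demo_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Writer side: in real code this runs under the queue lock. */
static void demo_queue_tail(struct demo_node *head, struct demo_node *n)
{
    n->next = head;
    n->prev = head->prev;
    head->prev->next = n;
    head->prev = n;
}

/* Reader side: one load, no lock taken. */
static int demo_queue_empty_lockless(struct demo_node *head)
{
    return READ_ONCE(head->next) == head;
}
```

The answer can be stale by the time the caller acts on it, so this form only suits callers that tolerate a racy snapshot.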

af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:40 2024 -0700

    af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
    
    [ Upstream commit 5d915e584d8408211d4567c22685aae8820bfc55 ]
    
    We can dump the socket queue length via UNIX_DIAG by specifying
    UDIAG_SHOW_RQLEN.
    
    If sk->sk_state is TCP_LISTEN, we return the recv queue length,
    but here we do not hold recvq lock.
    
    Let's use skb_queue_len_lockless() in sk_diag_show_rqlen().
    
    Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Jun 4 09:52:38 2024 -0700

    af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
    
    [ Upstream commit 45d872f0e65593176d880ec148f41ad7c02e40a7 ]
    
    Once sk->sk_state is changed to TCP_LISTEN, it never changes.
    
    unix_accept() takes advantage of this characteristic; it does not
    hold the listener's unix_state_lock() and only acquires recvq lock
    to pop one skb.
    
    It means unix_state_lock() does not prevent the queue length from
    changing in unix_stream_connect().
    
    Thus, we need to use unix_recvq_full_lockless() to avoid data-race.
    
    Now we remove unix_recvq_full() as no one uses it.
    
    Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in
    unix_recvq_full_lockless() because of the following reasons:
    
      (1) For SOCK_DGRAM, it is a written-once field in unix_create1()
    
      (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the
          listener's unix_state_lock() in unix_listen(), and we hold
          the lock in unix_stream_connect()
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
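
A hypothetical userspace sketch of the asymmetry described above: the queue length needs an annotated load because it can change without the state lock, while the backlog limit is read plainly because the caller is assumed to hold the lock that protects it. `demo_listener` and `demo_recvq_full_lockless()` are illustrative names only.

```c
/* Simplified stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct demo_listener {
    unsigned int qlen;             /* changes under the queue lock only */
    unsigned int max_ack_backlog;  /* changes under the state lock */
};

/* Assumes the caller holds the listener's state lock, so only qlen
 * can change concurrently. */
static int demo_recvq_full_lockless(struct demo_listener *l)
{
    return READ_ONCE(l->qlen) > l->max_ack_backlog;
}
```

Note the strict comparison: in this sketch a queue holding exactly `max_ack_backlog` entries is not yet considered full.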

arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration [+ + +]

Author: Volodymyr Babchuk <[email protected]>
Date:   Fri Apr 12 19:03:25 2024 +0000

    arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
    
    [ Upstream commit 819fe8c96a5172dfd960e5945e8f00f8fed32953 ]
    
    There are two issues with the SDHC2 configuration for SA8155P-ADP,
    which prevent use of SDHC2 and cause issues with Ethernet:
    
    - Card Detect pin for SDHC2 on SA8155P-ADP is connected to gpio4 of
      PMM8155AU_1, not to SoC itself. SoC's gpio4 is used for DWMAC
      TX. If sdhc driver probes after dwmac driver, it reconfigures
      gpio4 and this breaks Ethernet MAC.
    
    - pinctrl configuration mentions gpio96 as CD pin. It seems it was
      copied from some SM8150 example, because as mentioned above,
      correct CD pin is gpio4 on PMM8155AU_1.
    
    This patch fixes both mentioned issues by providing correct pin handle
    and pinctrl configuration.
    
    Fixes: 0deb2624e2d0 ("arm64: dts: qcom: sa8155p-adp: Add support for uSD card")
    Cc: [email protected]
    Signed-off-by: Volodymyr Babchuk <[email protected]>
    Reviewed-by: Stephan Gerhold <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: dts: qcom: sm8150: align TLMM pin configuration with DT schema [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Thu Oct 6 16:45:17 2022 +0200

    arm64: dts: qcom: sm8150: align TLMM pin configuration with DT schema
    
    [ Upstream commit 028fe09cda0a0d568e6a7d65b0336d32600b480c ]
    
    DT schema expects TLMM pin configuration nodes to be named with
    '-state' suffix and their optional children with '-pins' suffix.
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Bjorn Andersson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 819fe8c96a51 ("arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration")
    Signed-off-by: Sasha Levin <[email protected]>

ax25: Fix refcount imbalance on inbound connections [+ + +]

Author: Lars Kellogg-Stedman <[email protected]>
Date:   Wed May 29 17:02:43 2024 -0400

    ax25: Fix refcount imbalance on inbound connections
    
    [ Upstream commit 3c34fb0bd4a4237592c5ecb5b2e2531900c55774 ]
    
    When releasing a socket in ax25_release(), we call netdev_put() to
    decrease the refcount on the associated ax.25 device. However, the
    execution path for accepting an incoming connection never calls
    netdev_hold(). This imbalance leads to refcount errors, and ultimately
    to kernel crashes.
    
    A typical call trace for the above situation will start with one of the
    following errors:
    
        refcount_t: decrement hit 0; leaking memory.
        refcount_t: underflow; use-after-free.
    
    And will then have a trace like:
    
        Call Trace:
        <TASK>
        ? show_regs+0x64/0x70
        ? __warn+0x83/0x120
        ? refcount_warn_saturate+0xb2/0x100
        ? report_bug+0x158/0x190
        ? prb_read_valid+0x20/0x30
        ? handle_bug+0x3e/0x70
        ? exc_invalid_op+0x1c/0x70
        ? asm_exc_invalid_op+0x1f/0x30
        ? refcount_warn_saturate+0xb2/0x100
        ? refcount_warn_saturate+0xb2/0x100
        ax25_release+0x2ad/0x360
        __sock_release+0x35/0xa0
        sock_close+0x19/0x20
        [...]
    
    On reboot (or any attempt to remove the interface), the kernel gets
    stuck in an infinite loop:
    
        unregister_netdevice: waiting for ax0 to become free. Usage count = 0
    
    This patch corrects these issues by ensuring that we call netdev_hold()
    and ax25_dev_hold() for new connections in ax25_accept(). This makes the
    logic leading to ax25_accept() match the logic for ax25_bind(): in both
    cases we increment the refcount, which is ultimately decremented in
    ax25_release().
    
    Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()")
    Signed-off-by: Lars Kellogg-Stedman <[email protected]>
    Tested-by: Duoming Zhou <[email protected]>
    Tested-by: Dan Cross <[email protected]>
    Tested-by: Chris Maness <[email protected]>
    Reviewed-by: Dan Carpenter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
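
The imbalance can be illustrated with a toy reference counter. Everything here is hypothetical (a plain int instead of refcount_t, invented function names); the `take_ref` flag switches between the old accept path, which took no reference, and the fixed one.

```c
#include <stddef.h>

struct demo_dev {
    int refcnt;  /* plain int for illustration; the kernel uses refcount_t */
};

struct demo_conn {
    struct demo_dev *dev;
};

static void demo_hold(struct demo_dev *d) { d->refcnt++; }
static void demo_put(struct demo_dev *d)  { d->refcnt--; }

/* Outbound/bound sockets: bind takes a reference. */
static void demo_bind(struct demo_conn *c, struct demo_dev *d)
{
    c->dev = d;
    demo_hold(d);
}

/* Inbound connections: take_ref = 0 models the old behaviour,
 * take_ref = 1 models the fixed behaviour. */
static void demo_accept(struct demo_conn *c, struct demo_dev *d, int take_ref)
{
    c->dev = d;
    if (take_ref)
        demo_hold(d);
}

/* Release always drops one reference, whichever path created the conn. */
static void demo_release(struct demo_conn *c)
{
    demo_put(c->dev);
    c->dev = NULL;
}
```

Starting from a count of 1, an accept without a hold followed by a release drops the count to 0 while the device is still registered; with the hold, the count returns to 1.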

ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() [+ + +]

Author: Duoming Zhou <[email protected]>
Date:   Thu May 30 13:17:33 2024 +0800

    ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()
    
    [ Upstream commit 166fcf86cd34e15c7f383eda4642d7a212393008 ]
    
    The object "ax25_dev" is managed by reference counting. Thus it should
    not be directly released by kfree(), replace with ax25_dev_put().
    
    Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
    Suggested-by: Dan Carpenter <[email protected]>
    Signed-off-by: Duoming Zhou <[email protected]>
    Reviewed-by: Dan Carpenter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon May 20 16:03:07 2024 -0400

    Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
    
    [ Upstream commit 806a5198c05987b748b50f3d0c0cfb3d417381a4 ]
    
    This removes the bogus check for max > hcon->le_conn_max_interval since
    the latter is just the initial maximum conn interval, not the maximum the
    stack could support, which is really 3200 (4000 ms).
    
    In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values
    of the following fields in IXIT that would cause hci_check_conn_params
    to fail:
    
    TSPX_conn_update_int_min
    TSPX_conn_update_int_max
    TSPX_conn_update_peripheral_latency
    TSPX_conn_update_supervision_timeout
    
    Link: https://github.com/bluez/bluez/issues/847
    Fixes: e4b019515f95 ("Bluetooth: Enforce validation on max value of connection interval")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
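
The figure 3200 corresponds to 4000 ms because LE connection intervals are expressed in units of 1.25 ms. A small sketch of that arithmetic, with hypothetical names and only the basic interval range check (0x0006 to 0x0C80), not the full validation done by hci_check_conn_params():

```c
/* LE connection interval limits, in units of 1.25 ms. */
#define DEMO_CONN_INTERVAL_MIN 0x0006u
#define DEMO_CONN_INTERVAL_MAX 0x0c80u  /* 3200 */

/* Convert interval units to microseconds (1 unit = 1250 us). */
static unsigned long demo_interval_to_us(unsigned int units)
{
    return units * 1250UL;
}

/* Basic sanity check on a requested [min, max] interval pair. */
static int demo_interval_range_ok(unsigned int min, unsigned int max)
{
    return min <= max &&
           min >= DEMO_CONN_INTERVAL_MIN &&
           max <= DEMO_CONN_INTERVAL_MAX;
}
```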

Bluetooth: qca: fix invalid device address check [+ + +]

Author: Johan Hovold <[email protected]>
Date:   Tue Apr 16 11:15:09 2024 +0200

    Bluetooth: qca: fix invalid device address check
    
    [ Upstream commit 32868e126c78876a8a5ddfcb6ac8cb2fffcf4d27 ]
    
    Qualcomm Bluetooth controllers may not have been provisioned with a
    valid device address and instead end up using the default address
    00:00:00:00:5a:ad.
    
    This was previously believed to be due to lack of persistent storage for
    the address but it may also be due to integrators opting to not use the
    on-chip OTP memory and instead store the address elsewhere (e.g. in
    storage managed by secure world firmware).
    
    According to Qualcomm, at least WCN6750, WCN6855 and WCN7850 have
    on-chip OTP storage for the address.
    
    As the device type alone cannot be used to determine when the address is
    valid, instead read back the address during setup() and only set the
    HCI_QUIRK_USE_BDADDR_PROPERTY flag when needed.
    
    This specifically makes sure that controllers that have been provisioned
    with an address do not start as unconfigured.
    
    Reported-by: Janaki Ramaiah Thota <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]/
    Fixes: 5971752de44c ("Bluetooth: hci_qca: Set HCI_QUIRK_USE_BDADDR_PROPERTY for wcn3990")
    Fixes: e668eb1e1578 ("Bluetooth: hci_core: Don't stop BT if the BD address missing in dts")
    Fixes: 6945795bc81a ("Bluetooth: fix use-bdaddr-property quirk")
    Cc: [email protected]      # 6.5
    Cc: Matthias Kaehlcke <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Reported-by: Janaki Ramaiah Thota <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: qca: fix wcn3991 device address check [+ + +]

Author: Johan Hovold <[email protected]>
Date:   Thu Apr 25 09:55:03 2024 +0200

    Bluetooth: qca: fix wcn3991 device address check
    
    commit 66c39332d02d65e311ec89b0051130bfcd00c9ac upstream.
    
    Qualcomm Bluetooth controllers may not have been provisioned with a
    valid device address and instead end up using the default address
    00:00:00:00:5a:ad.
    
    This address is now used to determine if a controller has a valid
    address or if one needs to be provided through devicetree or by user
    space before the controller can be used.
    
    It turns out that the WCN3991 controllers used in Chromium Trogdor
    machines use a different default address, 39:98:00:00:5a:ad, which also
    needs to be marked as invalid so that the correct address is fetched
    from the devicetree.
    
    Qualcomm has unfortunately not yet provided any answers as to whether
    the 39:98 encodes a hardware id and if there are other variants of the
    default address that need to be handled by the driver.
    
    For now, add the Trogdor WCN3991 default address to the device address
    check to avoid having these controllers start with the default address
    instead of their assigned addresses.
    
    Fixes: 32868e126c78 ("Bluetooth: qca: fix invalid device address check")
    Cc: [email protected]      # 6.5
    Cc: Doug Anderson <[email protected]>
    Cc: Janaki Ramaiah Thota <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Tested-by: Douglas Anderson <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: qca: generalise device address check [+ + +]

Author: Johan Hovold <[email protected]>
Date:   Tue Apr 30 19:07:41 2024 +0200

    Bluetooth: qca: generalise device address check
    
    commit dd336649ba89789c845618dcbc09867010aec673 upstream.
    
    The default device address apparently comes from the NVM configuration
    file and can differ quite a bit between controllers.
    
    Store the default address when parsing the configuration file and use it
    to determine whether the controller has been provisioned with an
    address.
    
    This makes sure that devices without a unique address start as
    unconfigured unless a valid address has been provided in the devicetree.
    
    Fixes: 32868e126c78 ("Bluetooth: qca: fix invalid device address check")
    Cc: [email protected]      # 6.5
    Cc: Doug Anderson <[email protected]>
    Cc: Janaki Ramaiah Thota <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Tested-by: Douglas Anderson <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() [+ + +]

Author: Aleksandr Mishin <[email protected]>
Date:   Tue Jun 11 11:25:46 2024 +0300

    bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
    
    [ Upstream commit a9b9741854a9fe9df948af49ca5514e0ed0429df ]
    
    If the token is released because token->state == BNXT_HWRM_DEFERRED,
    the released token (set to NULL) is then used in log messages. The
    HWRM_ERR_CODE_PF_UNAVAILABLE error code is expected to prevent this,
    but only recent firmware returns it, so with other firmware this may
    lead to a NULL pointer dereference.
    Fix this by adding a token pointer check.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 8fa4219dba8e ("bnxt_en: add dynamic debug support for HWRM messages")
    Suggested-by: Michael Chan <[email protected]>
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Wojciech Drewek <[email protected]>
    Reviewed-by: Michael Chan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Set run context for rawtp test_run callback [+ + +]

Author: Jiri Olsa <[email protected]>
Date:   Tue Jun 4 17:00:24 2024 +0200

    bpf: Set run context for rawtp test_run callback
    
    [ Upstream commit d0d1df8ba18abc57f28fb3bc053b2bf319367f2c ]
    
    syzbot reported a crash when a rawtp program executed through the
    test_run interface calls the bpf_get_attach_cookie helper or any
    other helper that touches the task->bpf_ctx pointer.
    
    Set the run context (task->bpf_ctx pointer) for the test_run
    callback.
    
    Fixes: 7adfc6c9b315 ("bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value")
    Reported-by: [email protected]
    Signed-off-by: Jiri Olsa <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Daniel Borkmann <[email protected]>
    Closes: https://syzkaller.appspot.com/bug?extid=3ab78ff125b7979e45f9
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: fix leak of qgroup extent records after transaction abort [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Mon Jun 3 12:49:08 2024 +0100

    btrfs: fix leak of qgroup extent records after transaction abort
    
    [ Upstream commit fb33eb2ef0d88e75564983ef057b44c5b7e4fded ]
    
    Qgroup extent records are created when delayed ref heads are created and
    then released after accounting extents at btrfs_qgroup_account_extents(),
    called during the transaction commit path.
    
    If a transaction is aborted we free the qgroup records by calling
    btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(),
    unless we don't have delayed references. We are incorrectly assuming
    that no delayed references means we don't have qgroup extents records.
    
    We can currently have no delayed references because we ran them all
    during a transaction commit and the transaction was aborted after that
    due to some error in the commit path.
    
    So fix this by ensuring we call btrfs_qgroup_destroy_extent_records() at
    btrfs_destroy_delayed_refs() even if we don't have any delayed references.
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    Fixes: 81f7eb00ff5b ("btrfs: destroy qgroup extent records on transaction abort")
    CC: [email protected] # 6.1+
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range() [+ + +]

Author: Qu Wenruo <[email protected]>
Date:   Tue Apr 9 20:32:34 2024 +0930

    btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range()
    
    [ Upstream commit fe1c6c7acce10baf9521d6dccc17268d91ee2305 ]
    
    [BUG]
    During my extent_map cleanup/refactor, with extra sanity checks,
    extent-map-tests::test_case_7() would not pass the checks.
    
    The problem is, after btrfs_drop_extent_map_range(), the resulting
    extent_map has a @block_start way too large.
    Meanwhile my btrfs_file_extent_item based members are returning a
    correct @disk_bytenr/@offset combination.
    
    The extent map layout looks like this:
    
         0        16K    32K       48K
         | PINNED |      | Regular |
    
    The regular em at [32K, 48K) also has 32K @block_start.
    
    Then drop range [0, 36K), which should shrink the regular one to be
    [36K, 48K).
    However the @block_start is incorrect, we expect 32K + 4K, but got 52K.
    
    [CAUSE]
    Inside btrfs_drop_extent_map_range() function, if we hit an extent_map
    that covers the target range but is still beyond it, we need to split
    that extent map into half:
    
            |<-- drop range -->|
                     |<----- existing extent_map --->|
    
    And if the extent map is not compressed, we need to forward
    extent_map::block_start by the difference between the end of drop range
    and the extent map start.
    
    However in that particular case, the difference is calculated using
    (start + len - em->start).
    
    The problem is @start can be modified if the drop range covers any
    pinned extent.
    
    This leads to wrong calculation, and would be caught by my later
    extent_map sanity checks, which check em::block_start against
    btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.
    
    This is a regression caused by commit c962098ca4af ("btrfs: fix
    incorrect splitting in btrfs_drop_extent_map_range"), which removed the
    @len update for pinned extents.
    
    [FIX]
    Fix it by avoiding using @start completely, and use @end - em->start
    instead, where @end is the exclusive end bytenr of the drop range.
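    
    The arithmetic above can be replayed in a small userspace sketch. The
    helper names are hypothetical and only mirror the two formulas quoted in
    this message; this is not the kernel code. It assumes @start had been
    advanced to 16K, the end of the pinned extent, which reproduces the 52K
    figure.
    
```c
#include <assert.h>
#include <stdint.h>

#define SZ_1K 1024ULL

/* Buggy form: uses @start, which may have been moved past a pinned extent. */
static uint64_t split_block_start_buggy(uint64_t block_start, uint64_t em_start,
                                        uint64_t start, uint64_t len)
{
    return block_start + (start + len - em_start);
}

/* Fixed form: uses the exclusive end of the drop range, which never moves. */
static uint64_t split_block_start_fixed(uint64_t block_start, uint64_t em_start,
                                        uint64_t end)
{
    assert(end >= em_start);
    return block_start + (end - em_start);
}
```
    
    With block_start = 32K, em->start = 32K, start = 16K, len = 36K and
    end = 36K, the buggy form gives 52K and the fixed form gives 36K.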
    
    And update the test case to verify the @block_start to prevent such
    problems from happening again.
    
    Thankfully this is not going to lead to any data corruption, as IO path
    does not utilize btrfs_drop_extent_map_range() with @skip_pinned set.
    
    So this fix is only here for the sake of consistency/correctness.
    
    CC: [email protected] # 6.5+
    Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range")
    Reviewed-by: Filipe Manana <[email protected]>
    Signed-off-by: Qu Wenruo <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: make btrfs_destroy_delayed_refs() return void [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Fri Jun 2 12:19:42 2023 +0100

    btrfs: make btrfs_destroy_delayed_refs() return void
    
    [ Upstream commit 99f09ce309b8307ce8dca209f936e99a7c332214 ]
    
    btrfs_destroy_delayed_refs() always returns 0 and its single caller does
    not check its return value, as it also returns void, and so does the
    caller's caller and so on. This is because we are in the transaction abort
    path, where we have no way to deal with errors (we are in a critical
    situation) and all cleanup of resources works in a best effort fashion.
    So make btrfs_destroy_delayed_refs() return void.
    
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Stable-dep-of: fb33eb2ef0d8 ("btrfs: fix leak of qgroup extent records after transaction abort")
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: remove unnecessary prototype declarations at disk-io.c [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Mon May 29 16:17:06 2023 +0100

    btrfs: remove unnecessary prototype declarations at disk-io.c
    
    [ Upstream commit 184533e3618f4d0b382c1ef3de0ce34e849005d7 ]
    
    We have a few static functions at disk-io.c for which we have a forward
    declaration of their prototype, but it's not needed because all those
    functions are defined before they are called, so remove them.
    
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Stable-dep-of: fb33eb2ef0d8 ("btrfs: fix leak of qgroup extent records after transaction abort")
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Mon Jun 5 10:51:08 2023 +0200

    btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
    
    commit 87463f7e0250d471fac41e7c9c45ae21d83b5f85 upstream.
    
    Split the code handling a type DUP block group from
    btrfs_load_block_group_zone_info to make the code more readable.
    
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Mon Jun 5 10:51:06 2023 +0200

    btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
    
    commit 09a46725cc84165af452d978a3532d6b97a28796 upstream.
    
    Split out a helper for the body of the per-zone loop in
    btrfs_load_block_group_zone_info to make the function easier to read and
    modify.
    
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Mon Jun 5 10:51:07 2023 +0200

    btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
    
    commit 9e0e3e74dc6928a0956f4e27e24d473c65887e96 upstream.
    
    Split the code handling a type single block group from
    btrfs_load_block_group_zone_info to make the code more readable.
    
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: zoned: fix use-after-free due to race with dev replace [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Wed May 8 11:51:07 2024 +0100

    btrfs: zoned: fix use-after-free due to race with dev replace
    
    commit 0090d6e1b210551e63cf43958dc7a1ec942cdde9 upstream.
    
    While loading a zone's info during creation of a block group, we can race
    with a device replace operation and then trigger a use-after-free on the
    device that was just replaced (source device of the replace operation).
    
    This happens because at btrfs_load_zone_info() we extract a device from
    the chunk map into a local variable and then use the device while not
    under the protection of the device replace rwsem. So if there's a device
    replace operation happening when we extract the device and that device
    is the source of the replace operation, we will trigger a use-after-free
    if before we finish using the device the replace operation finishes and
    frees the device.
    
    Fix this by enlarging the critical section under the protection of the
    device replace rwsem so that all uses of the device are done inside the
    critical section.
    
    CC: [email protected] # 6.1.x: 15c12fcc50a1: btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
    CC: [email protected] # 6.1.x: 09a46725cc84: btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
    CC: [email protected] # 6.1.x: 9e0e3e74dc69: btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
    CC: [email protected] # 6.1.x: 87463f7e0250: btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
    CC: [email protected] # 6.1.x
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Mon Jun 5 10:51:05 2023 +0200

    btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
    
    commit 15c12fcc50a1b12a747f8b6ec05cdb18c537a4d1 upstream.
    
    Add a new zone_info structure to hold per-zone information in
    btrfs_load_block_group_zone_info and prepare for breaking out helpers
    from it.
    
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode [+ + +]

Author: David Howells <[email protected]>
Date:   Fri Jan 19 20:49:34 2024 +0000

    cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
    
    commit c3d6569a43322f371e7ba0ad386112723757ac8f upstream.
    
    cachefiles_ondemand_init_object() as called from cachefiles_open_file() and
    cachefiles_create_tmpfile() does not check if object->ondemand is set
    before dereferencing it, leading to an oops something like:
    
            RIP: 0010:cachefiles_ondemand_init_object+0x9/0x41
            ...
            Call Trace:
             <TASK>
             cachefiles_open_file+0xc9/0x187
             cachefiles_lookup_cookie+0x122/0x2be
             fscache_cookie_state_machine+0xbe/0x32b
             fscache_cookie_worker+0x1f/0x2d
             process_one_work+0x136/0x208
             process_scheduled_works+0x3a/0x41
             worker_thread+0x1a2/0x1f6
             kthread+0xca/0xd2
             ret_from_fork+0x21/0x33
    
    Fix this by making cachefiles_ondemand_init_object() return immediately if
    object->ondemand is NULL.
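    
    A minimal sketch of such an early-return guard follows. The structures
    and the open_sent flag are hypothetical, trimmed-down stand-ins, not the
    kernel definitions; the real function sends an open request to the
    daemon, which is elided here.
    
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-object on-demand info. */
struct ondemand_info {
    int open_sent;  /* set once the (elided) open request would be sent */
};

/* Hypothetical stand-in for the cache object. */
struct object {
    struct ondemand_info *ondemand; /* NULL when on-demand mode is not in use */
};

/* Returns 0 without touching ->ondemand when it was never allocated. */
static int ondemand_init_object(struct object *object)
{
    if (!object->ondemand)
        return 0;

    /* Safe to dereference from here on. */
    object->ondemand->open_sent = 1;
    return 0;
}
```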
    
    Fixes: 3c5ecfe16e76 ("cachefiles: extract ondemand info field from cachefiles_object")
    Reported-by: Marc Dionne <[email protected]>
    Signed-off-by: David Howells <[email protected]>
    cc: Gao Xiang <[email protected]>
    cc: Chao Yu <[email protected]>
    cc: Yue Hu <[email protected]>
    cc: Jeffle Xu <[email protected]>
    cc: [email protected]
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:42:57 2024 +0800

    cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
    
    [ Upstream commit cc5ac966f26193ab185cc43d64d9f1ae998ccb6e ]
    
    This lets us see the correct trace output.
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: add restore command to recover inflight ondemand read requests [+ + +]

Author: Jia Zhu <[email protected]>
Date:   Mon Nov 20 12:14:22 2023 +0800

    cachefiles: add restore command to recover inflight ondemand read requests
    
    [ Upstream commit e73fa11a356ca0905c3cc648eaacc6f0f2d2c8b3 ]
    
    Previously, in ondemand read scenario, if the anonymous fd was closed by
    user daemon, inflight and subsequent read requests would return EIO.
    As long as the device connection is not released, user daemon can hold
    and restore inflight requests by setting the request flag to
    CACHEFILES_REQ_NEW.
    
    Suggested-by: Gao Xiang <[email protected]>
    Signed-off-by: Jia Zhu <[email protected]>
    Signed-off-by: Xin Yin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jingbo Xu <[email protected]>
    Reviewed-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: add spin_lock for cachefiles_ondemand_info [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:03 2024 +0800

    cachefiles: add spin_lock for cachefiles_ondemand_info
    
    [ Upstream commit 0a790040838c736495d5afd6b2d636f159f817f1 ]
    
    The following concurrency may cause a read request to fail to be completed
    and result in a hang:
    
               t1             |             t2
    ---------------------------------------------------------
                                cachefiles_ondemand_copen
                                  req = xa_erase(&cache->reqs, id)
    // Anon fd is maliciously closed.
    cachefiles_ondemand_fd_release
      xa_lock(&cache->reqs)
      cachefiles_ondemand_set_object_close(object)
      xa_unlock(&cache->reqs)
                                  cachefiles_ondemand_set_object_open
                                  // No one will ever close it again.
    cachefiles_ondemand_daemon_read
      cachefiles_ondemand_select_req
      // Get a read req but its fd is already closed.
      // The daemon can't issue a cread ioctl with a closed fd, so it hangs.
    
    So add spin_lock for cachefiles_ondemand_info to protect ondemand_id and
    state, thus we can avoid the above problem in cachefiles_ondemand_copen()
    by using ondemand_id to determine if fd has been closed.
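    
    The locking idea can be sketched in userspace as follows. A pthread mutex
    stands in for the kernel spinlock, and the structure, the state values,
    the ID_CLOSED constant and the function names are all hypothetical. The
    point shown is only that both paths take the same lock, so the copen path
    can trust ondemand_id when deciding whether the fd is still open.
    
```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define ID_CLOSED (-1)  /* hypothetical marker: anonymous fd was closed */

enum state { STATE_CLOSE, STATE_OPEN };

struct info {
    pthread_mutex_t lock;   /* stands in for the new spinlock */
    int ondemand_id;
    enum state state;
};

/* fd release path: mark closed and invalidate the id under the lock. */
static void fd_release(struct info *info)
{
    pthread_mutex_lock(&info->lock);
    info->state = STATE_CLOSE;
    info->ondemand_id = ID_CLOSED;
    pthread_mutex_unlock(&info->lock);
}

/* copen path: only mark the object open if the fd has not been closed. */
static bool copen_set_open(struct info *info)
{
    bool ok = false;

    pthread_mutex_lock(&info->lock);
    if (info->ondemand_id > 0) {
        info->state = STATE_OPEN;
        ok = true;
    }
    pthread_mutex_unlock(&info->lock);
    return ok;
}
```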
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: defer exposing anon_fd until after copy_to_user() succeeds [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:05 2024 +0800

    cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
    
    [ Upstream commit 4b4391e77a6bf24cba2ef1590e113d9b73b11039 ]
    
    After installing the anonymous fd, we can now see it in userland and close
    it. However, at this point we may not have gotten the reference count of
    the cache, but we will put it when the fd is closed, so this may cause a cache
    UAF.
    
    So grab the cache reference count before fd_install(). In addition, by
    kernel convention, fd is taken over by the user land after fd_install(),
    and the kernel should not call close_fd() after that, i.e., it should call
    fd_install() after everything is ready, thus fd_install() is called after
    copy_to_user() succeeds.
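    
    The ordering can be illustrated with a toy userspace model. The fd table
    and the three helpers are hypothetical; they are only modelled on
    get_unused_fd_flags(), put_unused_fd() and fd_install(), and the copy_ok
    flag simulates whether copy_to_user() succeeded.
    
```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_FDS 4

enum fd_state { FD_FREE, FD_RESERVED, FD_INSTALLED };

/* Toy fd table; FD_INSTALLED means "visible to userland". */
static enum fd_state fd_table[MAX_FDS];

/* Modelled on get_unused_fd_flags(): reserve a number without publishing it. */
static int reserve_fd(void)
{
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (fd_table[fd] == FD_FREE) {
            fd_table[fd] = FD_RESERVED;
            return fd;
        }
    }
    return -EMFILE;
}

/* Modelled on put_unused_fd(): undo a reservation that was never published. */
static void release_fd(int fd)
{
    assert(fd_table[fd] == FD_RESERVED);
    fd_table[fd] = FD_FREE;
}

/* Modelled on fd_install(): after this point the fd belongs to userland. */
static void install_fd(int fd)
{
    assert(fd_table[fd] == FD_RESERVED);
    fd_table[fd] = FD_INSTALLED;
}

/* copy_ok simulates whether copy_to_user() succeeded. */
static int deliver_open_msg(bool copy_ok)
{
    int fd = reserve_fd();

    if (fd < 0)
        return fd;
    if (!copy_ok) {
        release_fd(fd);     /* nothing was exposed, so cleanup is safe */
        return -EFAULT;
    }
    install_fd(fd);         /* last step: publish only when all is ready */
    return fd;
}
```
    
    Because the fd is published last, the failure path never has to close an
    fd that userland may already have seen.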
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Suggested-by: Hou Tao <[email protected]>
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: extract ondemand info field from cachefiles_object [+ + +]

Author: Jia Zhu <[email protected]>
Date:   Mon Nov 20 12:14:19 2023 +0800

    cachefiles: extract ondemand info field from cachefiles_object
    
    [ Upstream commit 3c5ecfe16e7699011c12c2d44e55437415331fa3 ]
    
    We'll introduce a @work_struct field for @object in subsequent patches,
    which will enlarge the size of @object.
    As a result, this commit extracts the ondemand info field from
    @object.
    
    Signed-off-by: Jia Zhu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jingbo Xu <[email protected]>
    Reviewed-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:00 2024 +0800

    cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
    
    [ Upstream commit da4a827416066191aafeeccee50a8836a826ba10 ]
    
    We got the following issue in a fuzz test of randomly issuing the restore
    command:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0xb41/0xb60
    Read of size 8 at addr ffff888122e84088 by task ondemand-04-dae/963
    
    CPU: 13 PID: 963 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #564
    Call Trace:
     kasan_report+0x93/0xc0
     cachefiles_ondemand_daemon_read+0xb41/0xb60
     vfs_read+0x169/0xb50
     ksys_read+0xf5/0x1e0
    
    Allocated by task 116:
     kmem_cache_alloc+0x140/0x3a0
     cachefiles_lookup_cookie+0x140/0xcd0
     fscache_cookie_state_machine+0x43c/0x1230
     [...]
    
    Freed by task 792:
     kmem_cache_free+0xfe/0x390
     cachefiles_put_object+0x241/0x480
     fscache_cookie_state_machine+0x5c8/0x1230
     [...]
    ==================================================================
    
    Following is the process that triggers the issue:
    
         mount  |   daemon_thread1    |    daemon_thread2
    ------------------------------------------------------------
    cachefiles_withdraw_cookie
     cachefiles_ondemand_clean_object(object)
      cachefiles_ondemand_send_req
       REQ_A = kzalloc(sizeof(*req) + data_len)
       wait_for_completion(&REQ_A->done)
    
                cachefiles_daemon_read
                 cachefiles_ondemand_daemon_read
                  REQ_A = cachefiles_ondemand_select_req
                  msg->object_id = req->object->ondemand->ondemand_id
                                      ------ restore ------
                                      cachefiles_ondemand_restore
                                      xas_for_each(&xas, req, ULONG_MAX)
                                       xas_set_mark(&xas, CACHEFILES_REQ_NEW)
    
                                      cachefiles_daemon_read
                                       cachefiles_ondemand_daemon_read
                                        REQ_A = cachefiles_ondemand_select_req
                  copy_to_user(_buffer, msg, n)
                   xa_erase(&cache->reqs, id)
                   complete(&REQ_A->done)
                  ------ close(fd) ------
                  cachefiles_ondemand_fd_release
                   cachefiles_put_object
     cachefiles_put_object
      kmem_cache_free(cachefiles_object_jar, object)
                                        REQ_A->object->ondemand->ondemand_id
                                         // object UAF !!!
    
    When we see the request within xa_lock, req->object must not have been
    freed yet, so grab the reference count of object before xa_unlock to
    avoid the above issue.
    
    Fixes: 0a7e54c1959c ("cachefiles: resend an open request if the read request's object is closed")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Reviewed-by: Jia Zhu <[email protected]>
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:42:59 2024 +0800

    cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
    
    [ Upstream commit de3e26f9e5b76fc628077578c001c4a51bf54d06 ]
    
    We got the following issue in a fuzz test of randomly issuing the restore
    command:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0
    Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962
    
    CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542
    Call Trace:
     kasan_report+0x94/0xc0
     cachefiles_ondemand_daemon_read+0x609/0xab0
     vfs_read+0x169/0xb50
     ksys_read+0xf5/0x1e0
    
    Allocated by task 626:
     __kmalloc+0x1df/0x4b0
     cachefiles_ondemand_send_req+0x24d/0x690
     cachefiles_create_tmpfile+0x249/0xb30
     cachefiles_create_file+0x6f/0x140
     cachefiles_look_up_object+0x29c/0xa60
     cachefiles_lookup_cookie+0x37d/0xca0
     fscache_cookie_state_machine+0x43c/0x1230
     [...]
    
    Freed by task 626:
     kfree+0xf1/0x2c0
     cachefiles_ondemand_send_req+0x568/0x690
     cachefiles_create_tmpfile+0x249/0xb30
     cachefiles_create_file+0x6f/0x140
     cachefiles_look_up_object+0x29c/0xa60
     cachefiles_lookup_cookie+0x37d/0xca0
     fscache_cookie_state_machine+0x43c/0x1230
     [...]
    ==================================================================
    
    Following is the process that triggers the issue:
    
         mount  |   daemon_thread1    |    daemon_thread2
    ------------------------------------------------------------
     cachefiles_ondemand_init_object
      cachefiles_ondemand_send_req
       REQ_A = kzalloc(sizeof(*req) + data_len)
       wait_for_completion(&REQ_A->done)
    
                cachefiles_daemon_read
                 cachefiles_ondemand_daemon_read
                  REQ_A = cachefiles_ondemand_select_req
                  cachefiles_ondemand_get_fd
                  copy_to_user(_buffer, msg, n)
                process_open_req(REQ_A)
                                      ------ restore ------
                                      cachefiles_ondemand_restore
                                      xas_for_each(&xas, req, ULONG_MAX)
                                       xas_set_mark(&xas, CACHEFILES_REQ_NEW);
    
                                      cachefiles_daemon_read
                                       cachefiles_ondemand_daemon_read
                                        REQ_A = cachefiles_ondemand_select_req
    
                 write(devfd, ("copen %u,%llu", msg->msg_id, size));
                 cachefiles_ondemand_copen
                  xa_erase(&cache->reqs, id)
                  complete(&REQ_A->done)
       kfree(REQ_A)
                                        cachefiles_ondemand_get_fd(REQ_A)
                                         fd = get_unused_fd_flags
                                         file = anon_inode_getfile
                                         fd_install(fd, file)
                                         load = (void *)REQ_A->msg.data;
                                         load->fd = fd;
                                         // load UAF !!!
    
    This issue is caused by issuing a restore command when the daemon is still
    alive, which results in a request being processed multiple times thus
    triggering a UAF. So to avoid this problem, add an additional reference
    count to cachefiles_req, which is held while waiting and reading, and then
    released when the waiting and reading is over.
    
    Note that since there is only one reference count for waiting, we need to
    avoid the same request being completed multiple times, so we can only
    complete the request if it is successfully removed from the xarray.
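    
    The two rules above (hold a reference while waiting and while reading,
    and complete only when the removal succeeds) can be sketched in a
    single-threaded userspace model. Everything here is hypothetical: one
    pointer stands in for the cache->reqs xarray, a bool stands in for the
    completion, and the removal check is not atomic as it would need to be in
    concurrent code.
    
```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct req {
    atomic_int ref; /* one reference for the waiter, one per reader */
    bool done;      /* stands in for complete(&req->done) */
};

/* A single slot stands in for the cache->reqs xarray. */
static struct req *slot;

static struct req *req_alloc(void)
{
    struct req *req = calloc(1, sizeof(*req));

    if (!req)
        return NULL;
    atomic_init(&req->ref, 1);  /* held by the waiter */
    return req;
}

static void req_get(struct req *req)
{
    atomic_fetch_add(&req->ref, 1);
}

/* Returns true if this call dropped the last reference and freed req. */
static bool req_put(struct req *req)
{
    if (atomic_fetch_sub(&req->ref, 1) == 1) {
        free(req);
        return true;
    }
    return false;
}

/* Complete only if this caller is the one that removed req from the slot. */
static bool req_complete_if_removed(struct req *req)
{
    if (slot != req)
        return false;   /* someone else already removed and completed it */
    slot = NULL;
    req->done = true;
    return true;
}
```
    
    In this model a reader that still holds its reference can keep using the
    request after the waiter has dropped its own, and a second completion
    attempt is refused.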
    
    Fixes: e73fa11a356c ("cachefiles: add restore command to recover inflight ondemand read requests")
    Suggested-by: Hou Tao <[email protected]>
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Reviewed-by: Jia Zhu <[email protected]>
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: flush all requests after setting CACHEFILES_DEAD [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:07 2024 +0800

    cachefiles: flush all requests after setting CACHEFILES_DEAD
    
    [ Upstream commit 85e833cd7243bda7285492b0653c3abb1e2e757b ]
    
    In ondemand mode, when the daemon is processing an open request, if the
    kernel flags the cache as CACHEFILES_DEAD, the cachefiles_daemon_write()
    will always return -EIO, so the daemon can't pass the copen to the kernel.
    Then the kernel process that is waiting for the copen triggers a hung_task.
    
    Since the DEAD state is irreversible, it can only be exited by closing
    /dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark
    the cache as CACHEFILES_DEAD, if in ondemand mode, flush all requests to
    avoid the above hung task. We may still be able to read some of the cached
    data before closing the fd of /dev/cachefiles.
    
    Note that this relies on the patch that adds reference counting to the req,
    otherwise it may trigger a UAF.
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: introduce object ondemand state [+ + +]

Author: Jia Zhu <[email protected]>
Date:   Mon Nov 20 12:14:18 2023 +0800

    cachefiles: introduce object ondemand state
    
    [ Upstream commit 357a18d033143617e9c7d420c8f0dd4cbab5f34d ]
    
    Previously, @ondemand_id field was used not only to identify ondemand
    state of the object, but also to represent the index of the xarray.
    This commit introduces @state field to decouple the role of @ondemand_id
    and adds helpers to access it.
    
    Signed-off-by: Jia Zhu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jingbo Xu <[email protected]>
    Reviewed-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: never get a new anonymous fd if ondemand_id is valid [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:04 2024 +0800

    cachefiles: never get a new anonymous fd if ondemand_id is valid
    
    [ Upstream commit 4988e35e95fc938bdde0e15880fe72042fc86acf ]
    
    Now every time the daemon reads an open request, it gets a new anonymous fd
    and ondemand_id. With the introduction of "restore", it is possible to read
    the same open request more than once, and therefore an object can have more
    than one anonymous fd.
    
    If the anonymous fd is not unique, the following concurrencies will result
    in an fd leak:
    
         t1     |         t2         |          t3
    ------------------------------------------------------------
     cachefiles_ondemand_init_object
      cachefiles_ondemand_send_req
       REQ_A = kzalloc(sizeof(*req) + data_len)
       wait_for_completion(&REQ_A->done)
                cachefiles_daemon_read
                 cachefiles_ondemand_daemon_read
                  REQ_A = cachefiles_ondemand_select_req
                  cachefiles_ondemand_get_fd
                    load->fd = fd0
                    ondemand_id = object_id0
                                      ------ restore ------
                                      cachefiles_ondemand_restore
                                       // restore REQ_A
                                      cachefiles_daemon_read
                                       cachefiles_ondemand_daemon_read
                                        REQ_A = cachefiles_ondemand_select_req
                                          cachefiles_ondemand_get_fd
                                            load->fd = fd1
                                            ondemand_id = object_id1
                 process_open_req(REQ_A)
                 write(devfd, ("copen %u,%llu", msg->msg_id, size))
                 cachefiles_ondemand_copen
                  xa_erase(&cache->reqs, id)
                  complete(&REQ_A->done)
       kfree(REQ_A)
                                      process_open_req(REQ_A)
                                      // copen fails due to no req
                                      // daemon close(fd1)
                                      cachefiles_ondemand_fd_release
                                       // set object closed
     -- umount --
     cachefiles_withdraw_cookie
      cachefiles_ondemand_clean_object
       cachefiles_ondemand_init_close_req
        if (!cachefiles_ondemand_object_is_open(object))
          return -ENOENT;
        // The fd0 is not closed until the daemon exits.
    
    However, the anonymous fd holds the reference count of the object and the
    object holds the reference count of the cookie. So even though the cookie
    has been relinquished, it will not be unhashed and freed until the daemon
    exits.
    
    In fscache_hash_cookie(), when the same cookie is found in the hash list,
    if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then the new
    cookie waits for the old cookie to be unhashed, while the old cookie is
    waiting for the leaked fd to be closed, if the daemon does not exit in time
    it will trigger a hung task.
    
    To avoid this, allocate a new anonymous fd only if no anonymous fd has
    been allocated (ondemand_id == 0) or if the previously allocated anonymous
    fd has been closed (ondemand_id == -1). Moreover, return an error if
    ondemand_id is valid, letting the daemon know that the current userland
    restore logic is abnormal and needs to be checked.
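    
    The allocation rule can be written as a small predicate. This is a
    sketch, not the kernel code: the function and macro names are
    hypothetical, and -EEXIST is an assumption, since this message does not
    name the error code that is returned.
    
```c
#include <assert.h>
#include <errno.h>

#define ONDEMAND_ID_NONE    0       /* no anonymous fd allocated yet */
#define ONDEMAND_ID_CLOSED  (-1)    /* previously allocated fd was closed */

/* Returns 0 if a new anonymous fd may be allocated, else a negative errno. */
static int may_get_new_anon_fd(int ondemand_id)
{
    if (ondemand_id == ONDEMAND_ID_NONE || ondemand_id == ONDEMAND_ID_CLOSED)
        return 0;

    /* A valid id means an fd is still live; assumed error code. */
    return -EEXIST;
}
```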
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:43:01 2024 +0800

    cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
    
    [ Upstream commit 3e6d704f02aa4c50c7bc5fe91a4401df249a137b ]
    
    The err_put_fd label is only used once, so remove it to make the code
    more readable. In addition, the logic for deleting error request and
    CLOSE request is merged to simplify the code.
    
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Reviewed-by: Jia Zhu <[email protected]>
    Reviewed-by: Gao Xiang <[email protected]>
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: remove requests from xarray during flushing requests [+ + +]

Author: Baokun Li <[email protected]>
Date:   Wed May 22 19:42:58 2024 +0800

    cachefiles: remove requests from xarray during flushing requests
    
    [ Upstream commit 0fc75c5940fa634d84e64c93bfc388e1274ed013 ]
    
    Even with CACHEFILES_DEAD set, we can still read the requests, so in the
    following concurrency the request may be used after it has been freed:
    
         mount  |   daemon_thread1    |    daemon_thread2
    ------------------------------------------------------------
     cachefiles_ondemand_init_object
      cachefiles_ondemand_send_req
       REQ_A = kzalloc(sizeof(*req) + data_len)
       wait_for_completion(&REQ_A->done)
                cachefiles_daemon_read
                 cachefiles_ondemand_daemon_read
                                      // close dev fd
                                      cachefiles_flush_reqs
                                       complete(&REQ_A->done)
       kfree(REQ_A)
                  xa_lock(&cache->reqs);
                  cachefiles_ondemand_select_req
                    req->msg.opcode != CACHEFILES_OP_READ
                    // req use-after-free !!!
                  xa_unlock(&cache->reqs);
                                       xa_destroy(&cache->reqs)
    
    Hence remove requests from cache->reqs when flushing them to avoid
    accessing freed requests.
    
    Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Jeff Layton <[email protected]>
    Reviewed-by: Jia Zhu <[email protected]>
    Reviewed-by: Gao Xiang <[email protected]>
    Reviewed-by: Jingbo Xu <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cachefiles: resend an open request if the read request's object is closed [+ + +]

Author: Jia Zhu <[email protected]>
Date:   Mon Nov 20 12:14:20 2023 +0800

    cachefiles: resend an open request if the read request's object is closed
    
    [ Upstream commit 0a7e54c1959c0feb2de23397ec09c7692364313e ]
    
    When an anonymous fd is closed by the user daemon and a new read
    request for this file comes up, the anonymous fd should be re-opened
    to handle that read request rather than failing it directly.
    
    1. Introduce reopening state for objects that are closed but have
       inflight/subsequent read requests.
    2. No longer flush READ requests but only CLOSE requests when anonymous
       fd is closed.
    3. Enqueue the reopen work on a workqueue so that the user daemon can
       leave the daemon_read context and handle that request smoothly.
       Otherwise, the user daemon would send a reopen request and wait for
       itself to process the request.
    
    Signed-off-by: Jia Zhu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jingbo Xu <[email protected]>
    Reviewed-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
    Signed-off-by: Sasha Levin <[email protected]>

clk: sifive: Do not register clkdevs for PRCI clocks [+ + +]

Author: Samuel Holland <[email protected]>
Date:   Mon May 27 17:14:12 2024 -0700

    clk: sifive: Do not register clkdevs for PRCI clocks
    
    [ Upstream commit 2607133196c35f31892ee199ce7ffa717bea4ad1 ]
    
    These clkdevs were unnecessary, because systems using this driver always
    look up clocks using the devicetree. And as Russell King points out[1],
    since the provided device name was truncated, lookups via clkdev would
    never match.
    
    Recently, commit 8d532528ff6a ("clkdev: report over-sized strings when
    creating clkdev entries") caused clkdev registration to fail due to the
    truncation, and this now prevents the driver from probing. Fix the
    driver by removing the clkdev registration.
    
    Link: https://lore.kernel.org/linux-clk/[email protected]/ [1]
    Fixes: 30b8e27e3b58 ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block")
    Fixes: 8d532528ff6a ("clkdev: report over-sized strings when creating clkdev entries")
    Reported-by: Guenter Roeck <[email protected]>
    Closes: https://lore.kernel.org/linux-clk/[email protected]/
    Suggested-by: Russell King <[email protected]>
    Signed-off-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Stephen Boyd <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c [+ + +]

Author: Dave Jiang <[email protected]>
Date:   Tue May 28 15:55:51 2024 -0700

    cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
    
    [ Upstream commit d55510527153d17a3af8cc2df69c04f95ae1350d ]
    
    tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not
    include linux/vmalloc.h. Kernel v6.10 made changes that cause the
    currently included headers to no longer depend on vmalloc.h, and
    therefore mem.c can no longer compile. Add linux/vmalloc.h to fix the
    compile issue.
    
      CC [M]  tools/testing/cxl/test/mem.o
    tools/testing/cxl/test/mem.c: In function ‘label_area_release’:
    tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration]
     1428 |         vfree(lsa);
          |         ^~~~~
          |         kvfree
    tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’:
    tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration]
     1466 |         mdata->lsa = vmalloc(LSA_SIZE);
          |                      ^~~~~~~
          |                      kmalloc
    
    Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver")
    Reviewed-by: Dan Williams <[email protected]>
    Reviewed-by: Alison Schofield <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dave Jiang <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

dmaengine: axi-dmac: fix possible race in remove() [+ + +]

Author: Nuno Sa <[email protected]>
Date:   Thu Mar 28 14:58:50 2024 +0100

    dmaengine: axi-dmac: fix possible race in remove()
    
    commit 1bc31444209c8efae98cb78818131950d9a6f4d6 upstream.
    
    We need to first free the IRQ before calling of_dma_controller_free().
    Otherwise we could get an interrupt and schedule a tasklet while
    removing the DMA controller.
    
    Fixes: 0e3b67b348b8 ("dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller")
    Cc: [email protected]
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drivers: core: synchronize really_probe() and dev_uevent() [+ + +]

Author: Dirk Behme <[email protected]>
Date:   Mon May 13 07:06:34 2024 +0200

    drivers: core: synchronize really_probe() and dev_uevent()
    
    commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream.
    
    Synchronize the dev->driver usage in really_probe() and dev_uevent().
    These can run in different threads, which can result in the following
    race condition for dev->driver uninitialization:
    
    Thread #1:
    ==========
    
    really_probe() {
    ...
    probe_failed:
    ...
    device_unbind_cleanup(dev) {
        ...
        dev->driver = NULL;   // <= Failed probe sets dev->driver to NULL
        ...
        }
    ...
    }
    
    Thread #2:
    ==========
    
    dev_uevent() {
    ...
    if (dev->driver)
          // If dev->driver is NULLed from really_probe() from here on,
          // after above check, the system crashes
          add_uevent_var(env, "DRIVER=%s", dev->driver->name);
    ...
    }
    
    really_probe() already holds the lock, so nothing needs to be done
    there. dev_uevent() is often called with the lock held too, but not
    always, which implies that we can't add any locking in dev_uevent()
    itself. So fix this race by adding the lock to the non-protected
    path. This is the path where the above race is observed:
    
     dev_uevent+0x235/0x380
     uevent_show+0x10c/0x1f0  <= Add lock here
     dev_attr_show+0x3a/0xa0
     sysfs_kf_seq_show+0x17c/0x250
     kernfs_seq_show+0x7c/0x90
     seq_read_iter+0x2d7/0x940
     kernfs_fop_read_iter+0xc6/0x310
     vfs_read+0x5bc/0x6b0
     ksys_read+0xeb/0x1b0
     __x64_sys_read+0x42/0x50
     x64_sys_call+0x27ad/0x2d30
     do_syscall_64+0xcd/0x1d0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Similar cases are reported by syzkaller in
    
    https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a
    
    But these are regarding the *initialization* of dev->driver
    
    dev->driver = drv;
    
    As this switches dev->driver to non-NULL these reports can be considered
    to be false-positives (which should be "fixed" by this commit, as well,
    though).
    
    The same issue was reported, and a fix was attempted, back in 2015 in
    
    https://lore.kernel.org/lkml/[email protected]/
    
    already.
    
    Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class")
    Cc: stable <[email protected]>
    Cc: [email protected]
    Cc: Ashish Sangwan <[email protected]>
    Cc: Namjae Jeon <[email protected]>
    Signed-off-by: Dirk Behme <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
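
The locking fix described above can be illustrated with a minimal userspace
sketch. It is not the kernel code: the pthread mutex stands in for the device
lock, and the struct and function names (format_driver_locked,
uevent_show_sketch, probe_failed) are hypothetical. The point is only that the
NULL check and the dereference of dev->driver happen under the same lock that
the failing probe path holds when it clears the pointer.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

struct driver {
	const char *name;
};

struct device {
	pthread_mutex_t lock;	/* stands in for the device lock */
	struct driver *driver;	/* reset to NULL by a failed probe */
};

/* Writer side: a failed probe clears dev->driver while holding the lock. */
static void probe_failed(struct device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->driver = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/* Reader that expects the caller to hold dev->lock, like dev_uevent(). */
static int format_driver_locked(struct device *dev, char *buf, size_t len)
{
	if (dev->driver)
		return snprintf(buf, len, "DRIVER=%s", dev->driver->name);
	return 0;
}

/* Formerly unprotected entry point: take the lock around the reader. */
static int uevent_show_sketch(struct device *dev, char *buf, size_t len)
{
	int n;

	pthread_mutex_lock(&dev->lock);
	n = format_driver_locked(dev, buf, len);
	pthread_mutex_unlock(&dev->lock);
	return n;
}
```

Taking the lock in the caller rather than in the reader mirrors the reasoning
in the commit message: the reader is also reached from paths that already hold
the lock, so locking inside it would deadlock on those paths.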

drm/amd/display: drop unnecessary NULL checks in debugfs [+ + +]

Author: Alexey Kodanev <[email protected]>
Date:   Tue Dec 27 20:04:15 2022 +0300

    drm/amd/display: drop unnecessary NULL checks in debugfs
    
    [ Upstream commit f8e12e770e8049917f82387033b3cf44bc43b915 ]
    
    pipe_ctx pointer cannot be NULL when getting the address of
    an element of the pipe_ctx array. Moreover, the MAX_PIPES is
    defined as 6, so pipe_ctx is not NULL after the loop either.
    
    Detected using the static analysis tool - Svace.
    
    Signed-off-by: Alexey Kodanev <[email protected]>
    Signed-off-by: Hamza Mahfooz <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: 892b41b16f61 ("drm/amd/display: Fix incorrect DSC instance for MST")
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix incorrect DSC instance for MST [+ + +]

Author: Hersen Wu <[email protected]>
Date:   Tue Feb 13 14:26:06 2024 -0500

    drm/amd/display: Fix incorrect DSC instance for MST
    
    [ Upstream commit 892b41b16f6163e6556545835abba668fcab4eea ]
    
    [Why] DSC debugfs, such as dp_dsc_clock_en_read,
    use aconnector->dc_link to find pipe_ctx for display.
    Displays connected to MST hub share the same dc_link.
    DSC instance is from pipe_ctx. This causes incorrect
    DSC instance for display connected to MST hub.
    
    [How] Add aconnector->sink check to find pipe_ctx.
    
    CC: [email protected]
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Hersen Wu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/bridge/panel: Fix runtime warning on panel bridge release [+ + +]

Author: Adam Miotk <[email protected]>
Date:   Mon Jun 10 11:27:39 2024 +0100

    drm/bridge/panel: Fix runtime warning on panel bridge release
    
    [ Upstream commit ce62600c4dbee8d43b02277669dd91785a9b81d9 ]
    
    Device managed panel bridge wrappers are created by calling
    drm_panel_bridge_add_typed() and registering a release handler for
    clean-up when the device gets unbound.
    
    Since the memory for this bridge is also managed and linked to the panel
    device, the release function should not try to free that memory.
    Moreover, the call to devm_kfree() inside drm_panel_bridge_remove() will
    fail in this case and emit a warning because the panel bridge resource
    is no longer on the device resources list (it has been removed from
    there before the call to release handlers).
    
    Fixes: 67022227ffb1 ("drm/bridge: Add a devm_ allocator for panel bridge.")
    Signed-off-by: Adam Miotk <[email protected]>
    Signed-off-by: Maxime Ripard <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/exynos/vidi: fix memory leak in .get_modes() [+ + +]

Author: Jani Nikula <[email protected]>
Date:   Thu May 30 13:01:51 2024 +0300

    drm/exynos/vidi: fix memory leak in .get_modes()
    
    commit 38e3825631b1f314b21e3ade00b5a4d737eb054e upstream.
    
    The duplicated EDID is never freed. Fix it.
    
    Cc: [email protected]
    Signed-off-by: Jani Nikula <[email protected]>
    Signed-off-by: Inki Dae <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found [+ + +]

Author: Marek Szyprowski <[email protected]>
Date:   Thu Apr 25 11:48:51 2024 +0200

    drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
    
    commit 799d4b392417ed6889030a5b2335ccb6dcf030ab upstream.
    
    When reading EDID fails and the driver reports no modes available, the DRM
    core adds an artificial 1024x768 mode to the connector. Unfortunately
    some variants of the Exynos HDMI (like the one in Exynos4 SoCs) are not
    able to drive such a mode, so report a safe 640x480 mode instead of nothing
    in case of an EDID reading failure.
    
    This fixes the following issue observed on Trats2 board since commit
    13d5b040363c ("drm/exynos: do not return negative values from .get_modes()"):
    
    [drm] Exynos DRM: using 11c00000.fimd device for DMA mapping operations
    exynos-drm exynos-drm: bound 11c00000.fimd (ops fimd_component_ops)
    exynos-drm exynos-drm: bound 12c10000.mixer (ops mixer_component_ops)
    exynos-dsi 11c80000.dsi: [drm:samsung_dsim_host_attach] Attached s6e8aa0 device (lanes:4 bpp:24 mode-flags:0x10b)
    exynos-drm exynos-drm: bound 11c80000.dsi (ops exynos_dsi_component_ops)
    exynos-drm exynos-drm: bound 12d00000.hdmi (ops hdmi_component_ops)
    [drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1
    exynos-hdmi 12d00000.hdmi: [drm:hdmiphy_enable.part.0] *ERROR* PLL could not reach steady state
    panel-samsung-s6e8aa0 11c80000.dsi.0: ID: 0xa2, 0x20, 0x8c
    exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 11 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
    [CRTC:70:crtc-1] vblank wait timed out
    Modules linked in:
    CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.9.0-rc5-next-20240424 #14913
    Hardware name: Samsung Exynos (Flattened Device Tree)
    Workqueue: events_unbound deferred_probe_work_func
    Call trace:
     unwind_backtrace from show_stack+0x10/0x14
     show_stack from dump_stack_lvl+0x68/0x88
     dump_stack_lvl from __warn+0x7c/0x1c4
     __warn from warn_slowpath_fmt+0x11c/0x1a8
     warn_slowpath_fmt from drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
     drm_atomic_helper_wait_for_vblanks.part.0 from drm_atomic_helper_commit_tail_rpm+0x7c/0x8c
     drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184
     commit_tail from drm_atomic_helper_commit+0x168/0x190
     drm_atomic_helper_commit from drm_atomic_commit+0xb4/0xe0
     drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x27c
     drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc
     drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
     drm_client_modeset_commit from __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0xc4
     __drm_fb_helper_restore_fbdev_mode_unlocked from drm_fb_helper_set_par+0x2c/0x3c
     drm_fb_helper_set_par from fbcon_init+0x3d8/0x550
     fbcon_init from visual_init+0xc0/0x108
     visual_init from do_bind_con_driver+0x1b8/0x3a4
     do_bind_con_driver from do_take_over_console+0x140/0x1ec
     do_take_over_console from do_fbcon_takeover+0x70/0xd0
     do_fbcon_takeover from fbcon_fb_registered+0x19c/0x1ac
     fbcon_fb_registered from register_framebuffer+0x190/0x21c
     register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x350/0x574
     __drm_fb_helper_initial_config_and_unlock from exynos_drm_fbdev_client_hotplug+0x6c/0xb0
     exynos_drm_fbdev_client_hotplug from drm_client_register+0x58/0x94
     drm_client_register from exynos_drm_bind+0x160/0x190
     exynos_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8
     try_to_bring_up_aggregate_device from __component_add+0xb0/0x170
     __component_add from mixer_probe+0x74/0xcc
     mixer_probe from platform_probe+0x5c/0xb8
     platform_probe from really_probe+0xe0/0x3d8
     really_probe from __driver_probe_device+0x9c/0x1e4
     __driver_probe_device from driver_probe_device+0x30/0xc0
     driver_probe_device from __device_attach_driver+0xa8/0x120
     __device_attach_driver from bus_for_each_drv+0x80/0xcc
     bus_for_each_drv from __device_attach+0xac/0x1fc
     __device_attach from bus_probe_device+0x8c/0x90
     bus_probe_device from deferred_probe_work_func+0x98/0xe0
     deferred_probe_work_func from process_one_work+0x240/0x6d0
     process_one_work from worker_thread+0x1a0/0x3f4
     worker_thread from kthread+0x104/0x138
     kthread from ret_from_fork+0x14/0x28
    Exception stack(0xf0895fb0 to 0xf0895ff8)
    ...
    irq event stamp: 82357
    hardirqs last  enabled at (82363): [<c01a96e8>] vprintk_emit+0x308/0x33c
    hardirqs last disabled at (82368): [<c01a969c>] vprintk_emit+0x2bc/0x33c
    softirqs last  enabled at (81614): [<c0101644>] __do_softirq+0x320/0x500
    softirqs last disabled at (81609): [<c012dfe0>] __irq_exit_rcu+0x130/0x184
    ---[ end trace 0000000000000000 ]---
    exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
    exynos-drm exynos-drm: [drm] *ERROR* [CRTC:70:crtc-1] commit wait timed out
    exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
    exynos-drm exynos-drm: [drm] *ERROR* [CONNECTOR:74:HDMI-A-1] commit wait timed out
    exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
    exynos-drm exynos-drm: [drm] *ERROR* [PLANE:56:plane-5] commit wait timed out
    exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
    
    Cc: [email protected]
    Fixes: 13d5b040363c ("drm/exynos: do not return negative values from .get_modes()")
    Signed-off-by: Marek Szyprowski <[email protected]>
    Signed-off-by: Inki Dae <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/i915/dpt: Make DPT object unshrinkable [+ + +]

Author: Vidya Srinivas <[email protected]>
Date:   Mon May 20 22:26:34 2024 +0530

    drm/i915/dpt: Make DPT object unshrinkable
    
    commit 43e2b37e2ab660c3565d4cff27922bc70e79c3f1 upstream.
    
    In some scenarios, the DPT object gets shrunk but
    the actual framebuffer does not, and thus it is still
    there on the DPT's vm->bound_list. Then it tries to
    rewrite the PTEs via a stale CPU mapping. This causes a panic.
    
    Cc: [email protected]
    Reported-by: Shawn Lee <[email protected]>
    Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt")
    Signed-off-by: Vidya Srinivas <[email protected]>
    [vsyrjala: Add TODO comment]
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 51064d471c53dcc8eddd2333c3f1c1d9131ba36c)
    Signed-off-by: Jani Nikula <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/i915/gt: Disarm breadcrumbs if engines are already idle [+ + +]

Author: Chris Wilson <[email protected]>
Date:   Tue Apr 23 18:23:10 2024 +0200

    drm/i915/gt: Disarm breadcrumbs if engines are already idle
    
    commit 70cb9188ffc75e643debf292fcddff36c9dbd4ae upstream.
    
    The breadcrumbs use a GT wakeref for guarding the interrupt, but are
    disarmed during release of the engine wakeref. This leaves a hole where
    we may attach a breadcrumb just as the engine is parking (after it has
    parked its breadcrumbs), execute the irq worker with some signalers still
    attached, but never be woken again.
    
    That issue manifests itself in CI with IGT runner timeouts while tests
    are waiting indefinitely for release of all GT wakerefs.
    
    <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
    <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
    <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
    <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
    <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
    <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
    <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
    <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
    <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
    ...
    <6> [299.356526] sysrq: Show State
    ...
    <6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
    <6> [299.373967] Call Trace:
    <6> [299.373968]  <TASK>
    <6> [299.373970]  __schedule+0x3bb/0xda0
    <6> [299.373974]  schedule+0x41/0x110
    <6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
    <6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
    <6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
    <6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
    <6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
    <6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
    <6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
    <6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
    <6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
    <6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]
    
    At the end of the interrupt worker, if there are no more engines awake,
    disarm the breadcrumb and go to sleep.
    
    Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
    Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
    Signed-off-by: Chris Wilson <[email protected]>
    Cc: Andrzej Hajda <[email protected]>
    Cc: <[email protected]> # v5.12+
    Signed-off-by: Janusz Krzysztofik <[email protected]>
    Acked-by: Nirmoy Das <[email protected]>
    Reviewed-by: Andrzej Hajda <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit fbad43eccae5cb14594195c20113369aabaa22b5)
    Signed-off-by: Jani Nikula <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/komeda: check for error-valued pointer [+ + +]

Author: Amjad Ouled-Ameur <[email protected]>
Date:   Mon Jun 10 11:20:56 2024 +0100

    drm/komeda: check for error-valued pointer
    
    [ Upstream commit b880018edd3a577e50366338194dee9b899947e0 ]
    
    komeda_pipeline_get_state() may return an error-valued pointer, thus
    check the pointer for negative or null value before dereferencing.
    
    Fixes: 502932a03fce ("drm/komeda: Add the initial scaler support for CORE")
    Signed-off-by: Amjad Ouled-Ameur <[email protected]>
    Signed-off-by: Maxime Ripard <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
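
As a rough illustration of the error-valued pointer check, here is a userspace
sketch. The helpers err_ptr(), ptr_err() and is_err_or_null() are stand-ins
modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR_OR_NULL(); the
struct, get_state() and read_scalers() are hypothetical and do not reproduce
the komeda code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR_OR_NULL(). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err_or_null(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct pipeline_state {
	int active_scalers;
};

/* Hypothetical getter: returns a state or an error-valued pointer. */
static struct pipeline_state *get_state(struct pipeline_state *st, int fail)
{
	return fail ? err_ptr(-ENOMEM) : st;
}

/* Check the pointer before dereferencing; return 0 or a negative errno. */
static int read_scalers(struct pipeline_state *st, int fail, int *out)
{
	struct pipeline_state *s = get_state(st, fail);

	if (is_err_or_null(s))
		return s ? (int)ptr_err(s) : -EINVAL;

	*out = s->active_scalers;
	return 0;
}
```

A plain NULL check would miss the error-valued case, because an encoded errno
is a non-NULL address that is not safe to dereference.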

drm/vmwgfx: 3D disabled should not affect STDU memory limits [+ + +]

Author: Ian Forbes <[email protected]>
Date:   Tue May 21 13:47:18 2024 -0500

    drm/vmwgfx: 3D disabled should not affect STDU memory limits
    
    [ Upstream commit fb5e19d2dd03eb995ccd468d599b2337f7f66555 ]
    
    This limit became a hard cap starting with the change referenced below.
    Surface creation on the device will fail if the requested size is larger
    than this limit so altering the value arbitrarily will expose modes that
    are too large for the device's hard limits.
    
    Fixes: 7ebb47c9f9ab ("drm/vmwgfx: Read new register for GB memory when available")
    
    Signed-off-by: Ian Forbes <[email protected]>
    Signed-off-by: Zack Rusin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/vmwgfx: Filter modes which exceed graphics memory [+ + +]

Author: Ian Forbes <[email protected]>
Date:   Tue May 21 13:47:17 2024 -0500

    drm/vmwgfx: Filter modes which exceed graphics memory
    
    [ Upstream commit 426826933109093503e7ef15d49348fc5ab505fe ]
    
    SVGA requires individual surfaces to fit within graphics memory
    (max_mob_pages) which means that modes with a final buffer size that would
    exceed graphics memory must be pruned otherwise creation will fail.
    
    Additionally llvmpipe requires its buffer height and width to be a multiple
    of its tile size which is 64. As a result we have to anticipate that
    llvmpipe will round up the mode size passed to it by the compositor when
    it creates buffers and filter modes where this rounding exceeds graphics
    memory.
    
    This fixes an issue where VMs with low graphics memory (< 64MiB) configured
    with high resolution mode boot to a black screen because surface creation
    fails.
    
    Fixes: d947d1b71deb ("drm/vmwgfx: Add and connect connector helper function")
    Signed-off-by: Ian Forbes <[email protected]>
    Signed-off-by: Zack Rusin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
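
The arithmetic behind this filter can be sketched as follows. This is an
illustration only, not the driver code: the tile size of 64 comes from the
commit message, while the function names, the bytes-per-pixel parameter and
the memory size in bytes are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TILE_SIZE 64u	/* llvmpipe tile size, per the commit message */

/* Round v up to the next multiple of 'multiple'. */
static uint64_t round_up_to(uint64_t v, uint64_t multiple)
{
	return (v + multiple - 1) / multiple * multiple;
}

/* Buffer size after width and height are rounded up to the tile size. */
static uint64_t mode_buffer_bytes(uint32_t width, uint32_t height,
				  uint32_t bytes_per_pixel)
{
	return round_up_to(width, TILE_SIZE) *
	       round_up_to(height, TILE_SIZE) * bytes_per_pixel;
}

/* A mode is kept only if the rounded buffer fits in graphics memory. */
static bool mode_fits(uint32_t width, uint32_t height,
		      uint32_t bytes_per_pixel, uint64_t mem_bytes)
{
	return mode_buffer_bytes(width, height, bytes_per_pixel) <= mem_bytes;
}
```

For example, assuming 4 bytes per pixel, a 1920x1080 mode needs 8294400 bytes
unrounded but 1920 * 1088 * 4 = 8355840 bytes once the height is rounded up to
a multiple of 64, so a limit that falls between those two values would admit
the mode without the rounding and reject it with the rounding.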

drm/vmwgfx: Port the framebuffer code to drm fb helpers [+ + +]

Author: Zack Rusin <[email protected]>
Date:   Sat Oct 22 00:02:32 2022 -0400

    drm/vmwgfx: Port the framebuffer code to drm fb helpers
    
    [ Upstream commit df42523c12f8d58a41f547f471b46deffd18c203 ]
    
    Instead of using vmwgfx specific framebuffer implementation use the drm
    fb helpers. There's no change in functionality, the only difference
    is a reduction in the amount of code inside the vmwgfx module.
    
    drm fb helpers do not deal correctly with changes in crtc preferred mode
    at runtime, but the old fb code wasn't dealing with it either.
    The same situation applies to high-res fb consoles: the old code was
    limited to 1176x885 because it was checking for legacy/deprecated
    memory limits, while the drm fb helpers are limited to the initial
    resolution set on fb due to the first problem (drm fb helpers being
    unable to handle hotplug crtc preferred mode changes).
    
    This also removes the kernel config for disabling fb support which hasn't
    been used or supported in a very long time.
    
    Signed-off-by: Zack Rusin <[email protected]>
    Reviewed-by: Maaz Mombasawala <[email protected]>
    Reviewed-by: Martin Krastev <[email protected]>
    Reviewed-by: Thomas Zimmermann <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Stable-dep-of: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
    Signed-off-by: Sasha Levin <[email protected]>

drm/vmwgfx: Refactor drm connector probing for display modes [+ + +]

Author: Martin Krastev <[email protected]>
Date:   Fri Jan 26 15:08:00 2024 -0500

    drm/vmwgfx: Refactor drm connector probing for display modes
    
    [ Upstream commit 935f795045a6f9b13d28d46ebdad04bfea8750dd ]
    
    Implement drm_connector_helper_funcs.mode_valid and .get_modes,
    replacing custom drm_connector_funcs.fill_modes code with
    drm_helper_probe_single_connector_modes; for STDU, LDU & SOU
    display units.
    
    Signed-off-by: Martin Krastev <[email protected]>
    Reviewed-by: Zack Rusin <[email protected]>
    Signed-off-by: Zack Rusin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Stable-dep-of: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
    Signed-off-by: Sasha Levin <[email protected]>

drm/vmwgfx: Remove STDU logic from generic mode_valid function [+ + +]

Author: Ian Forbes <[email protected]>
Date:   Tue May 21 13:47:19 2024 -0500

    drm/vmwgfx: Remove STDU logic from generic mode_valid function
    
    [ Upstream commit dde1de06bd7248fd83c4ce5cf0dbe9e4e95bbb91 ]
    
    STDU has its own mode_valid function now so this logic can be removed from
    the generic version.
    
    Fixes: 935f795045a6 ("drm/vmwgfx: Refactor drm connector probing for display modes")
    
    Signed-off-by: Ian Forbes <[email protected]>
    Signed-off-by: Zack Rusin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails [+ + +]

Author: Gabor Juhos <[email protected]>
Date:   Mon Mar 4 14:14:53 2024 +0100

    firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails
    
    [ Upstream commit 0c50b7fcf2773b4853e83fc15aba1a196ba95966 ]
    
    Several functions call qcom_scm_bw_enable() and then return
    immediately if the call fails, leaving the clocks enabled.
    
    Change the code of these functions to disable clocks when the
    qcom_scm_bw_enable() call fails. This also fixes a possible dma
    buffer leak in the qcom_scm_pas_init_image() function.
    
    Compile tested only due to lack of hardware with interconnect
    support.
    
    Cc: [email protected]
    Fixes: 65b7ebda5028 ("firmware: qcom_scm: Add bw voting support to the SCM interface")
    Signed-off-by: Gabor Juhos <[email protected]>
    Reviewed-by: Mukesh Ojha <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
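
The shape of this fix is the usual goto-based unwinding. The sketch below is
a userspace illustration under stated assumptions: every *_stub function, the
counter and the failure switch are hypothetical stand-ins so the unwinding can
be observed, and none of them is the qcom_scm API.

```c
#include <errno.h>

/* Hypothetical state so a caller can observe whether clocks were released. */
static int clocks_enabled;
static int bw_enable_should_fail;

static int clk_enable_stub(void)
{
	clocks_enabled++;
	return 0;
}

static void clk_disable_stub(void)
{
	clocks_enabled--;
}

static int bw_enable_stub(void)
{
	return bw_enable_should_fail ? -EINVAL : 0;
}

static void bw_disable_stub(void)
{
}

static int do_call_stub(void)
{
	return 0;
}

static int scm_operation_sketch(void)
{
	int ret;

	ret = clk_enable_stub();
	if (ret)
		return ret;

	ret = bw_enable_stub();
	if (ret)
		goto disable_clk;	/* an early return here would leak the clocks */

	ret = do_call_stub();
	bw_disable_stub();

disable_clk:
	clk_disable_stub();
	return ret;
}
```

Routing the failure through the label keeps a single release path, so the
success and error cases cannot drift apart when more resources are added.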

fs/proc: fix softlockup in __read_vmcore [+ + +]

Author: Rik van Riel <[email protected]>
Date:   Tue May 7 09:18:58 2024 -0400

    fs/proc: fix softlockup in __read_vmcore
    
    commit 5cbcb62dddf5346077feb82b7b0c9254222d3445 upstream.
    
    While taking a kernel core dump with makedumpfile on a larger system,
    softlockup messages often appear.
    
    While softlockup warnings can be harmless, they can also interfere with
    things like RCU freeing memory, which can be problematic when the kdump
    kexec image is configured with as little memory as possible.
    
    Avoid the softlockup, and give things like work items and RCU a chance to
    do their thing during __read_vmcore by adding a cond_resched.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Rik van Riel <[email protected]>
    Acked-by: Baoquan He <[email protected]>
    Cc: Dave Young <[email protected]>
    Cc: Vivek Goyal <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

geneve: Fix incorrect inner network header offset when innerprotoinherit is set [+ + +]

Author: Gal Pressman <[email protected]>
Date:   Thu Jun 6 23:32:48 2024 +0300

    geneve: Fix incorrect inner network header offset when innerprotoinherit is set
    
    [ Upstream commit c6ae073f5903f6c6439d0ac855836a4da5c0a701 ]
    
    When innerprotoinherit is set, the tunneled packets do not have an inner
    Ethernet header.
    Change 'maclen' to not always assume the header length is ETH_HLEN, as
    there might not be a MAC header.
    
    This resolves issues with drivers (e.g. mlx5, in
    mlx5e_tx_tunnel_accel()) that rely on the skb inner network header offset
    being correct, and use it for TX offloads.
    
    Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb")
    Signed-off-by: Gal Pressman <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Wojciech Drewek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gpio: tqmx86: Convert to immutable irq_chip [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Mon Mar 20 10:55:12 2023 +0100

    gpio: tqmx86: Convert to immutable irq_chip
    
    [ Upstream commit 8e43827b6ae727a745ce7a8cc19184b28905a965 ]
    
    Convert the driver to immutable irq-chip with a bit of
    intuition.
    
    Cc: Marc Zyngier <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Reviewed-by: Marc Zyngier <[email protected]>
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Stable-dep-of: 08af509efdf8 ("gpio: tqmx86: store IRQ trigger type and unmask status separately")
    Signed-off-by: Sasha Levin <[email protected]>

gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type [+ + +]

Author: Matthias Schiffer <[email protected]>
Date:   Thu May 30 12:20:02 2024 +0200

    gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
    
    [ Upstream commit 90dd7de4ef7ba584823dfbeba834c2919a4bb55b ]
    
    The TQMx86 GPIO controller only supports falling and rising edge
    triggers, but not both. Fix this by implementing a software both-edge
    mode that toggles the edge type after every interrupt.
    
    Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
    Co-developed-by: Gregor Herburger <[email protected]>
    Signed-off-by: Gregor Herburger <[email protected]>
    Signed-off-by: Matthias Schiffer <[email protected]>
    Link: https://lore.kernel.org/r/515324f0491c4d44f4ef49f170354aca002d81ef.1717063994.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
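
    Illustration only, not taken from the upstream patch: a minimal
    user-space sketch of the software both-edge idea described above,
    assuming a line that can be armed for just one edge at a time. All
    names (demo_line, demo_rearm, demo_set_level) are hypothetical and do
    not reflect the driver's real register layout or locking.

```c
#include <stdbool.h>

/* Hypothetical model of a line that supports only one edge at a time. */
enum demo_edge { DEMO_EDGE_RISING, DEMO_EDGE_FALLING };

struct demo_line {
    bool level;            /* current input level */
    enum demo_edge armed;  /* edge the simulated hardware is armed for */
    unsigned int events;   /* interrupts delivered so far */
};

/* Arm for the only edge that can occur next, given the current level. */
void demo_rearm(struct demo_line *l)
{
    l->armed = l->level ? DEMO_EDGE_FALLING : DEMO_EDGE_RISING;
}

/* Simulate a level change; an interrupt fires only on the armed edge. */
void demo_set_level(struct demo_line *l, bool level)
{
    bool rising = !l->level && level;
    bool falling = l->level && !level;

    l->level = level;
    if ((rising && l->armed == DEMO_EDGE_RISING) ||
        (falling && l->armed == DEMO_EDGE_FALLING)) {
        l->events++;
        demo_rearm(l);  /* toggle the edge type after every interrupt */
    }
}
```

    In this model, a low line armed for a rising edge reports one event
    when it goes high and is then armed for the falling edge, so the
    following high-to-low transition is reported as well.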

gpio: tqmx86: fix typo in Kconfig label [+ + +]

Author: Gregor Herburger <[email protected]>
Date:   Thu May 30 12:19:59 2024 +0200

    gpio: tqmx86: fix typo in Kconfig label
    
    [ Upstream commit 8c219e52ca4d9a67cd6a7074e91bf29b55edc075 ]
    
    Fix description for GPIO_TQMX86 from QTMX86 to TQMx86.
    
    Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
    Signed-off-by: Gregor Herburger <[email protected]>
    Signed-off-by: Matthias Schiffer <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/e0e38c9944ad6d281d9a662a45d289b88edc808e.1717063994.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gpio: tqmx86: introduce shadow register for GPIO output value [+ + +]

Author: Matthias Schiffer <[email protected]>
Date:   Thu May 30 12:20:00 2024 +0200

    gpio: tqmx86: introduce shadow register for GPIO output value
    
    [ Upstream commit 9d6a811b522ba558bcb4ec01d12e72a0af8e9f6e ]
    
    The TQMx86 GPIO controller uses the same register address for input and
    output data. Reading the register will always return current inputs
    rather than the previously set outputs (regardless of the current
    direction setting). Therefore, using a RMW pattern does not make sense
    when setting output values. Instead, the previously set output register
    value needs to be stored as a shadow register.
    
    As there is no reliable way to get the current output values from the
    hardware, also initialize all channels to 0, to ensure that stored and
    actual output values match. This should usually not have any effect in
    practice, as the TQMx86 UEFI sets all outputs to 0 during boot.
    
    Also prepare for extension of the driver to more than 8 GPIOs by using
    DECLARE_BITMAP.
    
    Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
    Signed-off-by: Matthias Schiffer <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/d0555933becd45fa92a85675d26e4d59343ddc01.1717063994.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
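
    Illustration only, not taken from the upstream patch: a small
    user-space model of why read-modify-write fails when a register reads
    back inputs, and how a shadow copy avoids the problem. The struct and
    function names are hypothetical; the real driver uses a bitmap and
    its own I/O accessors.

```c
#include <stdint.h>

/* Hypothetical model: one address, reads return inputs, writes set outputs. */
struct demo_gpio {
    uint8_t pins_in;    /* what a register read returns (current inputs) */
    uint8_t latch_out;  /* what the hardware drives (last value written) */
    uint8_t shadow;     /* software copy of the last written output value */
};

uint8_t demo_read(const struct demo_gpio *g)
{
    return g->pins_in;
}

void demo_write(struct demo_gpio *g, uint8_t v)
{
    g->latch_out = v;
}

/* Broken approach: RMW on a register that does not read back outputs. */
void demo_set_rmw(struct demo_gpio *g, unsigned int bit, int value)
{
    uint8_t v = demo_read(g);

    if (value)
        v |= (uint8_t)(1u << bit);
    else
        v &= (uint8_t)~(1u << bit);
    demo_write(g, v);
}

/* Shadow approach: modify the stored copy, then write it out whole. */
void demo_set_shadow(struct demo_gpio *g, unsigned int bit, int value)
{
    if (value)
        g->shadow |= (uint8_t)(1u << bit);
    else
        g->shadow &= (uint8_t)~(1u << bit);
    demo_write(g, g->shadow);
}
```

    With all inputs reading 0, setting bit 0 and then bit 1 through
    demo_set_rmw() leaves only bit 1 driven, while the same sequence
    through demo_set_shadow() keeps both bits set.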

gpio: tqmx86: remove unneeded call to platform_set_drvdata() [+ + +]

Author: Andrei Coardos <[email protected]>
Date:   Tue Aug 1 23:38:39 2023 +0300

    gpio: tqmx86: remove unneeded call to platform_set_drvdata()
    
    [ Upstream commit 0a5e9306b812fe3517548fab92b3d3d6ce7576e5 ]
    
    This function call was found to be unnecessary as there is no equivalent
    platform_get_drvdata() call to access the private data of the driver. Also,
    the private data is defined in this driver, so there is no risk of it being
    accessed outside of this driver file.
    
    Reviewed-by: Alexandru Ardelean <[email protected]>
    Signed-off-by: Andrei Coardos <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Stable-dep-of: 9d6a811b522b ("gpio: tqmx86: introduce shadow register for GPIO output value")
    Signed-off-by: Sasha Levin <[email protected]>

gpio: tqmx86: store IRQ trigger type and unmask status separately [+ + +]

Author: Matthias Schiffer <[email protected]>
Date:   Thu May 30 12:20:01 2024 +0200

    gpio: tqmx86: store IRQ trigger type and unmask status separately
    
    [ Upstream commit 08af509efdf8dad08e972b48de0e2c2a7919ea8b ]
    
    irq_set_type() should not implicitly unmask the IRQ.
    
    All accesses to the interrupt configuration register are moved to a new
    helper tqmx86_gpio_irq_config(). We also introduce the new rule that
    accessing irq_type must happen while locked, which will become
    significant for fixing EDGE_BOTH handling.
    
    Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
    Signed-off-by: Matthias Schiffer <[email protected]>
    Link: https://lore.kernel.org/r/6aa4f207f77cb58ef64ffb947e91949b0f753ccd.1717063994.git.matthias.schiffer@ew.tq-group.com
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

greybus: Fix use-after-free bug in gb_interface_release due to race condition. [+ + +]

Author: Sicong Huang <[email protected]>
Date:   Tue Apr 16 16:03:13 2024 +0800

    greybus: Fix use-after-free bug in gb_interface_release due to race condition.
    
    commit 5c9c5d7f26acc2c669c1dcf57d1bb43ee99220ce upstream.
    
    In gb_interface_create, &intf->mode_switch_work is bound with
    gb_interface_mode_switch_work. The work is then queued by
    gb_interface_request_mode_switch. Here is the relevant code:
    if (!queue_work(system_long_wq, &intf->mode_switch_work)) {
            ...
    }
    
    If gb_interface_release is called to clean up, this work may still be
    pending. The function calls kfree to free the object "intf". However,
    if gb_interface_mode_switch_work is scheduled to run after kfree, it
    may cause a use-after-free, because gb_interface_mode_switch_work
    uses the object "intf".
    The possible execution flow that may lead to the issue is as follows:
    
    CPU0                            CPU1
    
                                |   gb_interface_create
                                |   gb_interface_request_mode_switch
    gb_interface_release        |
    kfree(intf) (free)          |
                                |   gb_interface_mode_switch_work
                                |   mutex_lock(&intf->mutex) (use)
    
    Fix it by canceling the work before kfree.
    
    Signed-off-by: Sicong Huang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: Ronnie Sahlberg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gve: Clear napi->skb before dev_kfree_skb_any() [+ + +]

Author: Ziwei Xiao <[email protected]>
Date:   Wed Jun 12 00:16:54 2024 +0000

    gve: Clear napi->skb before dev_kfree_skb_any()
    
    commit 6f4d93b78ade0a4c2cafd587f7b429ce95abb02e upstream.
    
    gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
    is freed with dev_kfree_skb_any(). This can result in a subsequent call
    to napi_get_frags returning a dangling pointer.
    
    Fix this by clearing napi->skb before the skb is freed.
    
    Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path")
    Cc: [email protected]
    Reported-by: Shailend Chand <[email protected]>
    Signed-off-by: Ziwei Xiao <[email protected]>
    Reviewed-by: Harshitha Ramamurthy <[email protected]>
    Reviewed-by: Shailend Chand <[email protected]>
    Reviewed-by: Praveen Kaligineedi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gve: ignore nonrelevant GSO type bits when processing TSO headers [+ + +]

Author: Joshua Washington <[email protected]>
Date:   Mon Jun 10 15:57:18 2024 -0700

    gve: ignore nonrelevant GSO type bits when processing TSO headers
    
    [ Upstream commit 1b9f756344416e02b41439bf2324b26aa25e141c ]
    
    TSO currently fails when the skb's gso_type field has more than one bit
    set.
    
    TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a
    few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes
    virtualization, such as QEMU, a real use-case.
    
    The gso_type and gso_size fields as passed from userspace in
    virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type
    |= SKB_GSO_DODGY to force the packet to enter the software GSO stack
    for verification.
    
    This issue might similarly come up when the CWR bit is set in the TCP
    header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit
    to be set.
    
    Fixes: a57e5de476be ("gve: DQO: Add TX path")
    Signed-off-by: Joshua Washington <[email protected]>
    Reviewed-by: Praveen Kaligineedi <[email protected]>
    Reviewed-by: Harshitha Ramamurthy <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Suggested-by: Eric Dumazet <[email protected]>
    Acked-by: Andrei Vagin <[email protected]>
    
    v2 - Remove unnecessary comments, remove line break between fixes tag
    and signoffs.
    
    v3 - Add back unrelated empty line removal.
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
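
    Illustration only, not taken from the upstream patch: a sketch of the
    difference between comparing gso_type for equality and testing only
    the relevant bits. The DEMO_* flag values and function names are
    hypothetical stand-ins, not the kernel's SKB_GSO_* definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values chosen for this sketch only. */
#define DEMO_GSO_TCPV4   (1u << 0)
#define DEMO_GSO_DODGY   (1u << 1)
#define DEMO_GSO_TCP_ECN (1u << 2)
#define DEMO_GSO_TCPV6   (1u << 3)

/* Equality check: rejects the packet as soon as any extra bit is set. */
bool demo_is_tso_exact(unsigned int gso_type)
{
    return gso_type == DEMO_GSO_TCPV4 || gso_type == DEMO_GSO_TCPV6;
}

/* Mask check: looks only at the TCP bits and ignores unrelated ones. */
bool demo_is_tso_masked(unsigned int gso_type)
{
    return (gso_type & (DEMO_GSO_TCPV4 | DEMO_GSO_TCPV6)) != 0;
}
```

    A packet carrying DEMO_GSO_TCPV4 | DEMO_GSO_DODGY fails the equality
    check but passes the mask check, which is the behaviour the commit
    message asks for.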

HID: core: remove unnecessary WARN_ON() in implement() [+ + +]

Author: Nikita Zhandarovich <[email protected]>
Date:   Fri May 17 07:19:14 2024 -0700

    HID: core: remove unnecessary WARN_ON() in implement()
    
    [ Upstream commit 4aa2dcfbad538adf7becd0034a3754e1bd01b2b5 ]
    
    Syzkaller hit a warning [1] in a call to implement() when trying
    to write a value into a field of smaller size in an output report.
    
    Since implement() already has a warn message printed out with the
    help of hid_warn() and value in question gets trimmed with:
            ...
            value &= m;
            ...
    WARN_ON may be considered superfluous. Remove it to suppress future
    syzkaller triggers.
    
    [1]
    WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 implement drivers/hid/hid-core.c:1451 [inline]
    WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
    Modules linked in:
    CPU: 0 PID: 5084 Comm: syz-executor424 Not tainted 6.9.0-rc7-syzkaller-00183-gcf87f46fd34d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
    RIP: 0010:implement drivers/hid/hid-core.c:1451 [inline]
    RIP: 0010:hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
    ...
    Call Trace:
     <TASK>
     __usbhid_submit_report drivers/hid/usbhid/hid-core.c:591 [inline]
     usbhid_submit_report+0x43d/0x9e0 drivers/hid/usbhid/hid-core.c:636
     hiddev_ioctl+0x138b/0x1f00 drivers/hid/usbhid/hiddev.c:726
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:904 [inline]
     __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    ...
    
    Fixes: 95d1c8951e5b ("HID: simplify implement() a bit")
    Reported-by: <[email protected]>
    Suggested-by: Alan Stern <[email protected]>
    Signed-off-by: Nikita Zhandarovich <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
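
    Illustration only, not taken from the upstream patch: a sketch of the
    trimming step quoted above ("value &= m"), assuming the mask m covers
    the low n bits of the field. demo_trim() is a hypothetical helper,
    not the kernel's implement().

```c
#include <stdint.h>

/* Trim value to an n-bit field; fields of 32 bits or more pass through. */
uint32_t demo_trim(uint32_t value, unsigned int n)
{
    uint32_t m;

    if (n >= 32)
        return value;

    m = (1u << n) - 1;  /* mask with the low n bits set */
    return value & m;
}
```

    Because an oversized value is already trimmed this way and reported
    through a warning message, a separate WARN_ON() adds no further
    protection.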

HID: i2c-hid: elan: Add ili9882t timing [+ + +]

Author: Cong Yang <[email protected]>
Date:   Wed Aug 2 15:19:47 2023 +0800

    HID: i2c-hid: elan: Add ili9882t timing
    
    [ Upstream commit f2f43bf15d7aa3286eced18d5199ee579e2c2614 ]
    
    The ili9882t is a TDDI IC (Touch with Display Driver). The
    datasheet specifies there should be 60ms between touch SDA
    sleep and panel RESX. Doug's series[1] allows panels and
    touchscreens to power on/off together, so we can add the 65 ms
    delay in i2c_hid_core_suspend before panel_unprepare.
    
    Because the ili9882t touchscreen is a panel follower and needs to
    use vccio-supply instead of vcc33-supply, set the vcc33 supply to NULL
    in ili9882t_chip_data so that the vcc33 regulator is not used.
    
    [1]: https://lore.kernel.org/all/[email protected]
    
    Reviewed-by: Douglas Anderson <[email protected]>
    Signed-off-by: Cong Yang <[email protected]>
    Acked-by: Benjamin Tissoires <[email protected]>
    Link: https://lore.kernel.org/r/20230802071947.1683318-3-yangcong5@huaqin.corp-partner.google.com
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Stable-dep-of: 0eafc58f2194 ("HID: i2c-hid: elan: fix reset suspend current leakage")
    Signed-off-by: Sasha Levin <[email protected]>

HID: i2c-hid: elan: fix reset suspend current leakage [+ + +]

Author: Johan Hovold <[email protected]>
Date:   Tue May 7 16:48:18 2024 +0200

    HID: i2c-hid: elan: fix reset suspend current leakage
    
    [ Upstream commit 0eafc58f2194dbd01d4be40f99a697681171995b ]
    
    The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s
    shares the VCC33 supply with other peripherals that may remain powered
    during suspend (e.g. when enabled as wakeup sources).
    
    The reset line is also wired so that it can be left deasserted when the
    supply is off.
    
    This is important as it avoids holding the controller in reset for
    extended periods of time when it remains powered, which can lead to
    increased power consumption, and also avoids leaking current through the
    X13s reset circuitry during suspend (and after driver unbind).
    
    Use the new 'no-reset-on-power-off' devicetree property to determine
    when reset needs to be asserted on power down.
    
    Notably this also avoids wasting power on machine variants without a
    touchscreen for which the driver would otherwise exit probe with reset
    asserted.
    
    Fixes: bd3cba00dcc6 ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens")
    Cc: <[email protected]>    # 6.0
    Cc: Douglas Anderson <[email protected]>
    Tested-by: Steev Klimaszewski <[email protected]>
    Signed-off-by: Johan Hovold <[email protected]>
    Reviewed-by: Douglas Anderson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() [+ + +]

Author: José Expósito <[email protected]>
Date:   Fri May 24 15:05:39 2024 +0200

    HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
    
    [ Upstream commit ce3af2ee95170b7d9e15fff6e500d67deab1e7b3 ]
    
    Fix a memory leak on logi_dj_recv_send_report() error path.
    
    Fixes: 6f20d3261265 ("HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()")
    Signed-off-by: José Expósito <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: acpi: Unbind mux adapters before delete [+ + +]

Author: Hamish Martin <[email protected]>
Date:   Wed Mar 13 11:16:32 2024 +1300

    i2c: acpi: Unbind mux adapters before delete
    
    [ Upstream commit 3f858bbf04dbac934ac279aaee05d49eb9910051 ]
    
    There is an issue with ACPI overlay table removal specifically related
    to I2C multiplexers.
    
    Consider an ACPI SSDT Overlay that defines a PCA9548 I2C mux on an
    existing I2C bus. When this table is loaded we see the creation of a
    device for the overall PCA9548 chip and 8 further devices - one
    i2c_adapter each for the mux channels. These are all bound to their
    ACPI equivalents via an eventual invocation of acpi_bind_one().
    
    When we unload the SSDT overlay we run into the problem. The ACPI
    devices are deleted as normal via acpi_device_del_work_fn() and the
    acpi_device_del_list.
    
    However, the following warning and stack trace are output as the
    deletion does not go smoothly:
    ------------[ cut here ]------------
    kernfs: can not remove 'physical_node', no directory
    WARNING: CPU: 1 PID: 11 at fs/kernfs/dir.c:1674 kernfs_remove_by_name_ns+0xb9/0xc0
    Modules linked in:
    CPU: 1 PID: 11 Comm: kworker/u128:0 Not tainted 6.8.0-rc6+ #1
    Hardware name: congatec AG conga-B7E3/conga-B7E3, BIOS 5.13 05/16/2023
    Workqueue: kacpi_hotplug acpi_device_del_work_fn
    RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0
    Code: e4 00 48 89 ef e8 07 71 db ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 a7 55 e4 00 0f 0b eb a6 48 c7 c7 f0 38 0d 9d e8 97 0a d5 ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 0018:ffff9f864008fb28 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff8ef90a8d4940 RCX: 0000000000000000
    RDX: ffff8f000e267d10 RSI: ffff8f000e25c780 RDI: ffff8f000e25c780
    RBP: ffff8ef9186f9870 R08: 0000000000013ffb R09: 00000000ffffbfff
    R10: 00000000ffffbfff R11: ffff8f000e0a0000 R12: ffff9f864008fb50
    R13: ffff8ef90c93dd60 R14: ffff8ef9010d0958 R15: ffff8ef9186f98c8
    FS:  0000000000000000(0000) GS:ffff8f000e240000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f48f5253a08 CR3: 00000003cb82e000 CR4: 00000000003506f0
    Call Trace:
     <TASK>
     ? kernfs_remove_by_name_ns+0xb9/0xc0
     ? __warn+0x7c/0x130
     ? kernfs_remove_by_name_ns+0xb9/0xc0
     ? report_bug+0x171/0x1a0
     ? handle_bug+0x3c/0x70
     ? exc_invalid_op+0x17/0x70
     ? asm_exc_invalid_op+0x1a/0x20
     ? kernfs_remove_by_name_ns+0xb9/0xc0
     ? kernfs_remove_by_name_ns+0xb9/0xc0
     acpi_unbind_one+0x108/0x180
     device_del+0x18b/0x490
     ? srso_return_thunk+0x5/0x5f
     ? srso_return_thunk+0x5/0x5f
     device_unregister+0xd/0x30
     i2c_del_adapter.part.0+0x1bf/0x250
     i2c_mux_del_adapters+0xa1/0xe0
     i2c_device_remove+0x1e/0x80
     device_release_driver_internal+0x19a/0x200
     bus_remove_device+0xbf/0x100
     device_del+0x157/0x490
     ? __pfx_device_match_fwnode+0x10/0x10
     ? srso_return_thunk+0x5/0x5f
     device_unregister+0xd/0x30
     i2c_acpi_notify+0x10f/0x140
     notifier_call_chain+0x58/0xd0
     blocking_notifier_call_chain+0x3a/0x60
     acpi_device_del_work_fn+0x85/0x1d0
     process_one_work+0x134/0x2f0
     worker_thread+0x2f0/0x410
     ? __pfx_worker_thread+0x10/0x10
     kthread+0xe3/0x110
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x2f/0x50
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1b/0x30
     </TASK>
    ---[ end trace 0000000000000000 ]---
    ...
    repeated 7 more times, 1 for each channel of the mux
    ...
    
    The issue is that the binding of the ACPI devices to their peer I2C
    adapters is not correctly cleaned up. Digging deeper into the issue we
    see that the deletion order is such that the ACPI devices matching the
    mux channel i2c adapters are deleted first during the SSDT overlay
    removal. For each of the channels we see a call to i2c_acpi_notify()
    with ACPI_RECONFIG_DEVICE_REMOVE but, because these devices are not
    actually i2c_clients, nothing is done for them.
    
    Later on, after each of the mux channels has been dealt with, we come
    to delete the i2c_client representing the PCA9548 device. This is the
    call stack we see above, whereby the kernel cleans up the i2c_client
    including destruction of the mux and its channel adapters. At this
    point we do attempt to unbind from the ACPI peers but those peers no
    longer exist and so we hit the kernfs errors.
    
    The fix is to augment i2c_acpi_notify() to handle i2c_adapters. But,
    given that the life cycle of the adapters is linked to the i2c_client,
    instead of deleting the i2c_adapters during the i2c_acpi_notify(), we
    just trigger unbinding of the ACPI device from the adapter device, and
    allow the clean up of the adapter to continue in the way it always has.
    
    Signed-off-by: Hamish Martin <[email protected]>
    Reviewed-by: Mika Westerberg <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]>
    Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications")
    Cc: <[email protected]> # v4.8+
    Signed-off-by: Wolfram Sang <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: add fwnode APIs [+ + +]

Author: Russell King (Oracle) <[email protected]>
Date:   Wed Jan 11 10:54:21 2023 +0000

    i2c: add fwnode APIs
    
    [ Upstream commit 373c612d72461ddaea223592df31e62c934aae61 ]
    
    Add fwnode APIs for finding and getting I2C adapters, which will be
    used by the SFP code. These are passed the fwnode corresponding to
    the adapter, and return the I2C adapter. It is the responsibility of
    the caller to find the appropriate fwnode.
    
    We keep the DT and ACPI interfaces, but where appropriate, recode them
    to use the fwnode interfaces internally.
    
    Reviewed-by: Mika Westerberg <[email protected]>
    Signed-off-by: Russell King (Oracle) <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>
    Stable-dep-of: 3f858bbf04db ("i2c: acpi: Unbind mux adapters before delete")
    Signed-off-by: Sasha Levin <[email protected]>

i2c: at91: Fix the functionality flags of the slave-only interface [+ + +]

Author: Jean Delvare <[email protected]>
Date:   Fri May 31 11:19:14 2024 +0200

    i2c: at91: Fix the functionality flags of the slave-only interface
    
    [ Upstream commit d6d5645e5fc1233a7ba950de4a72981c394a2557 ]
    
    When an I2C adapter acts only as a slave, it should not claim to
    support I2C master capabilities.
    
    Fixes: 9d3ca54b550c ("i2c: at91: added slave mode support")
    Signed-off-by: Jean Delvare <[email protected]>
    Cc: Juergen Fitschen <[email protected]>
    Cc: Ludovic Desroches <[email protected]>
    Cc: Codrin Ciubotariu <[email protected]>
    Cc: Andi Shyti <[email protected]>
    Cc: Nicolas Ferre <[email protected]>
    Cc: Alexandre Belloni <[email protected]>
    Cc: Claudiu Beznea <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: designware: Fix the functionality flags of the slave-only interface [+ + +]

Author: Jean Delvare <[email protected]>
Date:   Fri May 31 11:17:48 2024 +0200

    i2c: designware: Fix the functionality flags of the slave-only interface
    
    [ Upstream commit cbf3fb5b29e99e3689d63a88c3cddbffa1b8de99 ]
    
    When an I2C adapter acts only as a slave, it should not claim to
    support I2C master capabilities.
    
    Fixes: 5b6d721b266a ("i2c: designware: enable SLAVE in platform module")
    Signed-off-by: Jean Delvare <[email protected]>
    Cc: Luis Oliveira <[email protected]>
    Cc: Jarkko Nikula <[email protected]>
    Cc: Andy Shevchenko <[email protected]>
    Cc: Mika Westerberg <[email protected]>
    Cc: Jan Dabros <[email protected]>
    Cc: Andi Shyti <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Acked-by: Jarkko Nikula <[email protected]>
    Tested-by: Jarkko Nikula <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: fix iteration of TLVs in Preserved Fields Area [+ + +]

Author: Jacob Keller <[email protected]>
Date:   Mon Jun 3 14:42:30 2024 -0700

    ice: fix iteration of TLVs in Preserved Fields Area
    
    [ Upstream commit 03e4a092be8ce3de7c1baa7ae14e68b64e3ea644 ]
    
    The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value
    structures in the Preserved Fields Area (PFA) of the NVM. This is used by
    the driver to access data such as the Part Board Assembly identifier.
    
    The function uses simple logic to iterate over the PFA. First, the pointer
    to the PFA in the NVM is read. Then the total length of the PFA is read
    from the first word.
    
    A pointer to the first TLV is initialized, and a simple loop iterates over
    each TLV. The pointer is moved forward through the NVM until it exceeds the
    PFA area.
    
    The logic seems sound, but it is missing a key detail. The Preserved
    Fields Area length includes one additional final word. This is documented
    in the device data sheet as a dummy word which contains 0xFFFF. All NVMs
    have this extra word.
    
    If the driver tries to scan for a TLV that is not in the PFA, it will read
    past the size of the PFA. It reads and interprets the last dummy word of
    the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA
    as a length.
    
    The PFA resides within the Shadow RAM portion of the NVM, which is
    relatively small. All of its offsets are within a 16-bit size. The PFA
    pointer and TLV pointer are stored by the driver as 16-bit values.
    
    In almost all cases, the word following the PFA will be such that
    interpreting it as a length will result in 16-bit arithmetic overflow. Once
    overflowed, the new next_tlv value is now below the maximum offset of the
    PFA. Thus, the driver will continue to iterate the data as TLVs. In the
    worst case, the driver hits on a sequence of reads which loop back to
    reading the same offsets in an endless loop.
    
    To fix this, we need to correct the loop iteration check to account for
    this extra word at the end of the PFA. This alone is sufficient to resolve
    the known cases of this issue in the field. However, it is plausible that
    an NVM could be misconfigured or have corrupt data which results in the
    same kind of overflow. Protect against this by using check_add_overflow
    when calculating both the maximum offset of the TLVs, and when calculating
    the next_tlv offset at the end of each loop iteration. This ensures that
    the driver will not get stuck in an infinite loop when scanning the PFA.
    
    Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
    Co-developed-by: Paul Greenwalt <[email protected]>
    Signed-off-by: Paul Greenwalt <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]>
    Signed-off-by: Jacob Keller <[email protected]>
    Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563aa89b0c@intel.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
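
    Illustration only, not taken from the upstream patch: a user-space
    sketch of a TLV scan that stops before the trailing dummy word and
    rejects 16-bit offset overflow. It assumes the NVM is an array of
    16-bit words, that the word at pfa_ptr holds the PFA length, and that
    each TLV is a type word, a length word, then that many data words.
    demo_find_tlv() is hypothetical, and the GCC/Clang builtin
    __builtin_add_overflow() stands in for the kernel's
    check_add_overflow().

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the word offset of the matching TLV, or -1 if absent/malformed. */
int demo_find_tlv(const uint16_t *nvm, size_t nvm_words,
                  uint16_t pfa_ptr, uint16_t wanted_type)
{
    uint16_t pfa_len, max_tlv, next_tlv;

    if (pfa_ptr >= nvm_words)
        return -1;
    pfa_len = nvm[pfa_ptr];
    if (pfa_len == 0)
        return -1;

    /* Exclude the final dummy word; fail if the end offset wraps. */
    if (__builtin_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv))
        return -1;
    /* The first TLV follows the length word. */
    if (__builtin_add_overflow(pfa_ptr, 1, &next_tlv))
        return -1;

    while (next_tlv < max_tlv) {
        uint16_t type, len;

        if ((size_t)next_tlv + 1 >= nvm_words)
            return -1;
        type = nvm[next_tlv];
        len = nvm[next_tlv + 1];
        if (type == wanted_type)
            return next_tlv;

        /* Skip type word, length word and data; fail on 16-bit wrap. */
        if (__builtin_add_overflow(next_tlv, 2, &next_tlv) ||
            __builtin_add_overflow(next_tlv, len, &next_tlv))
            return -1;
    }
    return -1;
}
```

    In this sketch a search for a type that is not present ends once
    next_tlv reaches the dummy word, and a corrupt length that would wrap
    the 16-bit offset makes the scan fail instead of looping.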

ice: Introduce new parameters in ice_sched_node [+ + +]

Author: Michal Wilczynski <[email protected]>
Date:   Tue Nov 15 11:48:20 2022 +0100

    ice: Introduce new parameters in ice_sched_node
    
    [ Upstream commit 16dfa49406bc5e1f4cbb115027cbd719d7e6c930 ]
    
    To support new devlink-rate API ice_sched_node struct needs to store
    a number of additional parameters. This includes tx_max, tx_share,
    tx_weight, and tx_priority.
    
    Add new fields to ice_sched_node struct. Add new functions to configure
    the hardware with new parameters. Introduce new xarray to identify
    nodes uniquely.
    
    Signed-off-by: Michal Wilczynski <[email protected]>
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap")
    Signed-off-by: Sasha Levin <[email protected]>

ice: remove af_xdp_zc_qps bitmap [+ + +]

Author: Larysa Zaremba <[email protected]>
Date:   Mon Jun 3 14:42:32 2024 -0700

    ice: remove af_xdp_zc_qps bitmap
    
    [ Upstream commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd ]
    
    Referenced commit has introduced a bitmap to distinguish between ZC and
    copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this
    for us.
    
    The bitmap would be especially useful when restoring previous state after
    rebuild, if only it was not reallocated in the process. This leads to e.g.
    xdpsock dying after changing number of queues.
    
    Instead of preserving the bitmap during the rebuild, remove it completely
    and distinguish between ZC and copy-mode queues based on the presence of
    a device associated with the pool.
    
    Fixes: e102db780e1c ("ice: track AF_XDP ZC enabled queues in bitmap")
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Larysa Zaremba <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Chandan Kumar Rout <[email protected]>
    Signed-off-by: Jacob Keller <[email protected]>
    Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563aa89b0c@intel.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: remove null checks before devm_kfree() calls [+ + +]

Author: Przemek Kitszel <[email protected]>
Date:   Wed May 31 14:38:40 2023 +0200

    ice: remove null checks before devm_kfree() calls
    
    [ Upstream commit ad667d626825383b626ad6ed38d6205618abb115 ]
    
    We all know they are redundant.
    
    Reviewed-by: Michal Swiatkowski <[email protected]>
    Reviewed-by: Michal Wilczynski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: Przemek Kitszel <[email protected]>
    Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap")
    Signed-off-by: Sasha Levin <[email protected]>

iio: accel: mxc4005: allow module autoloading via OF compatible [+ + +]

Author: Luca Ceresoli <[email protected]>
Date:   Wed Oct 4 18:39:28 2023 +0200

    iio: accel: mxc4005: allow module autoloading via OF compatible
    
    [ Upstream commit 4d7c16d08d248952c116f2eb9b7b5abc43a19688 ]
    
    Add OF device table with compatible strings to allow automatic module
    loading.
    
    Signed-off-by: Luca Ceresoli <[email protected]>
    Reviewed-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/20231004-mxc4005-device-tree-support-v1-2-e7c0faea72e4@bootlin.com
    Signed-off-by: Jonathan Cameron <[email protected]>
    Stable-dep-of: 6b8cffdc4a31 ("iio: accel: mxc4005: Reset chip on probe() and resume()")
    Signed-off-by: Sasha Levin <[email protected]>

iio: accel: mxc4005: Reset chip on probe() and resume() [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Tue Mar 26 12:37:00 2024 +0100

    iio: accel: mxc4005: Reset chip on probe() and resume()
    
    [ Upstream commit 6b8cffdc4a31e4a72f75ecd1bc13fbf0dafee390 ]
    
    On some designs the chip is not properly reset when powered up at boot or
    after a suspend/resume cycle.
    
    Use the sw-reset feature to ensure that the chip is in a clean state
    after probe() / resume() and in the case of resume() restore the settings
    (scale, trigger-enabled).
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218578
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iio: adc: ad9467: fix scan type sign [+ + +]

Author: David Lechner <[email protected]>
Date:   Fri May 3 14:45:05 2024 -0500

    iio: adc: ad9467: fix scan type sign
    
    commit 8a01ef749b0a632f0e1f4ead0f08b3310d99fcb1 upstream.
    
    According to the IIO documentation, the sign in the scan type should be
    lower case. The ad9467 driver was incorrectly using upper case.
    
    Fix by changing to lower case.
    
    Fixes: 4606d0f4b05f ("iio: adc: ad9467: add support for AD9434 high-speed ADC")
    Fixes: ad6797120238 ("iio: adc: ad9467: add support AD9467 ADC")
    Signed-off-by: David Lechner <[email protected]>
    Link: https://lore.kernel.org/r/20240503-ad9467-fix-scan-type-sign-v1-1-c7a1a066ebb9@baylibre.com
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: dac: ad5592r: fix temperature channel scaling value [+ + +]

Author: Marc Ferland <[email protected]>
Date:   Wed May 1 11:05:54 2024 -0400

    iio: dac: ad5592r: fix temperature channel scaling value
    
    commit 279428df888319bf68f2686934897301a250bb84 upstream.
    
    The scale value for the temperature channel is (assuming Vref=2.5 and
    the datasheet):
    
        376.7897513
    
    When calculating both val and val2 for the temperature scale we
    use (3767897513/25) and multiply it by Vref (here I assume 2500mV) to
    obtain:
    
      2500 * (3767897513/25) ==> 376789751300
    
    Finally we divide with remainder by 10^9 to get:
    
        val = 376
        val2 = 789751300
    
    However, we return IIO_VAL_INT_PLUS_MICRO (should have been NANO) as
    the scale type. So when converting the raw temperature value to the
    'processed' temperature value we will get (assuming raw=810,
    offset=-753):
    
        processed = (raw + offset) * scale_val
                  = (810 + -753) * 376
                  = 21432
    
        processed += div((raw + offset) * scale_val2, 10^6)
                  += div((810 + -753) * 789751300, 10^6)
                  += 45015
        ==> 66447
        ==> 66.4 Celsius
    
    instead of the expected 21.5 Celsius.
    
    Fix this issue by changing IIO_VAL_INT_PLUS_MICRO to
    IIO_VAL_INT_PLUS_NANO.
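    
    The arithmetic above can be reproduced with a short sketch. This is
    illustrative Python, not driver code: the helper name
    processed_milli_celsius() and its parameters are assumptions made for
    this example, while the numbers (Vref of 2500 mV, raw=810, offset=-753)
    come from the commit message. It shows how dividing val2 by 10^6
    instead of 10^9 inflates the result.

```python
# Hypothetical helper mirroring the worked example in the commit message.
# It is not the kernel's conversion code, only the same integer arithmetic.
NANO = 10**9
MICRO = 10**6


def processed_milli_celsius(raw, offset, val, val2, denom):
    """Apply an integer-plus-fraction scale, where val2 is in 1/denom units."""
    total = (raw + offset) * val
    total += (raw + offset) * val2 // denom
    return total


vref_mv = 2500  # assumed reference voltage in mV, as in the commit message

# 2500 * 3767897513 / 25 = 376789751300, then split by 10^9.
val, val2 = divmod(vref_mv * 3767897513 // 25, NANO)

# val2 misread as micro units (the bug) versus nano units (the fix).
wrong = processed_milli_celsius(810, -753, val, val2, MICRO)
right = processed_milli_celsius(810, -753, val, val2, NANO)

print(val, val2, wrong, right)
```

    Run as a script, this prints 376 789751300 66447 21477: 66.4 Celsius
    with the MICRO divisor and about 21.5 Celsius with the NANO divisor.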
    
    Fixes: 56ca9db862bf ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs")
    Signed-off-by: Marc Ferland <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: imu: inv_icm42600: delete unneeded update watermark call [+ + +]

Author: Jean-Baptiste Maneyrol <[email protected]>
Date:   Mon May 27 21:00:08 2024 +0000

    iio: imu: inv_icm42600: delete unneeded update watermark call
    
    commit 245f3b149e6cc3ac6ee612cdb7042263bfc9e73c upstream.
    
    Update watermark will be done inside the hwfifo_set_watermark callback
    just after the update_scan_mode. It is useless to do it here.
    
    Fixes: 7f85e42a6c54 ("iio: imu: inv_icm42600: add buffer support in iio devices")
    Cc: [email protected]
    Signed-off-by: Jean-Baptiste Maneyrol <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Input: try trimming too long modalias strings [+ + +]

Author: Dmitry Torokhov <[email protected]>
Date:   Mon Apr 29 14:50:41 2024 -0700

    Input: try trimming too long modalias strings
    
    commit 0774d19038c496f0c3602fb505c43e1b2d8eed85 upstream.
    
    If an input device declares too many capability bits, the modalias
    string for that device may become too long to fit into the uevent
    buffer, so sending the uevent fails. This, in turn, may prevent
    userspace from recognizing the existence of such devices.
    
    This is typically not a concern for real hardware devices, as they have
    a limited number of keys, but it happens with synthetic devices such as
    those created by the xen-kbdfront driver, which declares its devices as
    capable of delivering all possible keys, since it doesn't know what
    keys the backend may produce.
    
    To deal with such devices, the input core will attempt to trim key data
    in the hope that the rest of the modalias string will fit in the given
    buffer. When trimming key data it will indicate that the data is
    incomplete by appending a "+," marker, resulting in conversions like this:
    
    old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
    new: k71,72,73,74,78,7A,7B,7C,+,
    
    This should allow existing udev rules to continue to work with existing
    devices, and will also allow writing more complex rules that
    recognize a trimmed modalias and check input device characteristics by
    other means (for example by parsing KEY= data in uevent or parsing
    input device sysfs attributes).
    
    Note that the driver core may try adding more uevent environment
    variables once the input core is done adding its own, so when forming
    the modalias we cannot use the entire available buffer; we reduce
    it by a somewhat arbitrary amount (96 bytes).
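    
    The trimming idea can be sketched as follows. This is a simplified
    model in Python, not the kernel implementation: the function name
    trim_modalias_keys(), its parameters, and the 27-character limit are
    hypothetical and chosen only to reproduce the old/new example above.
    The real code works on the uevent buffer and its accounting differs.

```python
# Simplified, hypothetical model of trimming the key list in a modalias.
def trim_modalias_keys(codes, limit, prefix="k", marker="+,"):
    """Format key codes as upper-case hex; trim and mark if over the limit."""
    full = prefix + "".join(f"{code:X}," for code in codes)
    if len(full) <= limit:
        return full  # everything fits, no marker needed

    out = prefix
    for code in codes:
        item = f"{code:X},"
        # Always keep room for the trailing marker.
        if len(out) + len(item) + len(marker) > limit:
            break
        out += item
    return out + marker


# Key codes from the "old:" line of the example.
codes = [0x71, 0x72, 0x73, 0x74, 0x78, 0x7A, 0x7B, 0x7C, 0x7D,
         0x8E, 0x9E, 0xA4, 0xAD, 0xE0, 0xE1, 0xE4, 0xF8, 0x174]

untrimmed = trim_modalias_keys(codes, limit=256)
trimmed = trim_modalias_keys(codes, limit=27)  # illustrative limit only

print(untrimmed)
print(trimmed)
```

    With the illustrative limit of 27 characters the sketch yields the
    "new:" string shown above. It does not handle a limit too small to
    hold even the prefix and the marker.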
    
    Reported-by: Jason Andryuk <[email protected]>
    Reviewed-by: Peter Hutterer <[email protected]>
    Tested-by: Jason Andryuk <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Jason Andryuk <[email protected]>

intel_th: pci: Add Granite Rapids SOC support [+ + +]

Author: Alexander Shishkin <[email protected]>
Date:   Mon Apr 29 16:01:15 2024 +0300

    intel_th: pci: Add Granite Rapids SOC support
    
    commit 854afe461b009801a171b3a49c5f75ea43e4c04c upstream.
    
    Add support for the Trace Hub in Granite Rapids SOC.
    
    Signed-off-by: Alexander Shishkin <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

intel_th: pci: Add Granite Rapids support [+ + +]

Author: Alexander Shishkin <[email protected]>
Date:   Mon Apr 29 16:01:14 2024 +0300

    intel_th: pci: Add Granite Rapids support
    
    commit e44937889bdf4ecd1f0c25762b7226406b9b7a69 upstream.
    
    Add support for the Trace Hub in Granite Rapids.
    
    Signed-off-by: Alexander Shishkin <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

intel_th: pci: Add Lunar Lake support [+ + +]

Author: Alexander Shishkin <[email protected]>
Date:   Mon Apr 29 16:01:19 2024 +0300

    intel_th: pci: Add Lunar Lake support
    
    commit f866b65322bfbc8fcca13c25f49e1a5c5a93ae4d upstream.
    
    Add support for the Trace Hub in Lunar Lake.
    
    Signed-off-by: Alexander Shishkin <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

intel_th: pci: Add Meteor Lake-S support [+ + +]

Author: Alexander Shishkin <[email protected]>
Date:   Mon Apr 29 16:01:17 2024 +0300

    intel_th: pci: Add Meteor Lake-S support
    
    commit c4a30def564d75e84718b059d1a62cc79b137cf9 upstream.
    
    Add support for the Trace Hub in Meteor Lake-S.
    
    Signed-off-by: Alexander Shishkin <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

intel_th: pci: Add Sapphire Rapids SOC support [+ + +]

Author: Alexander Shishkin <[email protected]>
Date:   Mon Apr 29 16:01:16 2024 +0300

    intel_th: pci: Add Sapphire Rapids SOC support
    
    commit 2e1da7efabe05cb0cf0b358883b2bc89080ed0eb upstream.
    
    Add support for the Trace Hub in Sapphire Rapids SOC.
    
    Signed-off-by: Alexander Shishkin <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

io_uring: check for non-NULL file pointer in io_file_can_poll() [+ + +]

Author: Jens Axboe <[email protected]>
Date:   Sat Jun 1 12:25:35 2024 -0600

    io_uring: check for non-NULL file pointer in io_file_can_poll()
    
    commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
    
    In earlier kernels, it was possible to trigger a NULL pointer
    dereference off the forced async preparation path, if no file had
    been assigned. The trace leading to that looks as follows:
    
    BUG: kernel NULL pointer dereference, address: 00000000000000b0
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP
    CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
    RIP: 0010:io_buffer_select+0xc3/0x210
    Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b
    RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246
    RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040
    RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700
    RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020
    R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8
    R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000
    FS:  00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0
    Call Trace:
     <TASK>
     ? __die+0x1f/0x60
     ? page_fault_oops+0x14d/0x420
     ? do_user_addr_fault+0x61/0x6a0
     ? exc_page_fault+0x6c/0x150
     ? asm_exc_page_fault+0x22/0x30
     ? io_buffer_select+0xc3/0x210
     __io_import_iovec+0xb5/0x120
     io_readv_prep_async+0x36/0x70
     io_queue_sqe_fallback+0x20/0x260
     io_submit_sqes+0x314/0x630
     __do_sys_io_uring_enter+0x339/0xbc0
     ? __do_sys_io_uring_register+0x11b/0xc50
     ? vm_mmap_pgoff+0xce/0x160
     do_syscall_64+0x5f/0x180
     entry_SYSCALL_64_after_hwframe+0x46/0x4e
    RIP: 0033:0x55e0a110a67e
    Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6
    
    because the request is marked forced ASYNC and has a bad file fd, and
    hence takes the forced async prep path.
    
    Current kernels with the request async prep cleaned up can no longer hit
    this issue, but for ease of backporting, let's add this safety check in
    here too as it really doesn't hurt. For both cases, this will inevitably
    end with a CQE posted with -EBADF.
    
    Cc: [email protected]
    Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront")
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iommu/amd: Fix sysfs leak in iommu init [+ + +]

Author: Kun(llfl) <[email protected]>
Date:   Thu May 9 08:42:20 2024 +0800

    iommu/amd: Fix sysfs leak in iommu init
    
    [ Upstream commit a295ec52c8624883885396fde7b4df1a179627c3 ]
    
    During the iommu initialization, iommu_init_pci() adds sysfs nodes.
    However, these nodes aren't removed in free_iommu_resources() subsequently.
    
    Fixes: 39ab9555c241 ("iommu: Add sysfs bindings for struct iommu_device")
    Signed-off-by: Kun(llfl) <[email protected]>
    Reviewed-by: Suravee Suthikulpanit <[email protected]>
    Link: https://lore.kernel.org/r/c8e0d11c6ab1ee48299c288009cf9c5dae07b42d.1715215003.git.llfl@linux.alibaba.com
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ionic: fix use after netif_napi_del() [+ + +]

Author: Taehee Yoo <[email protected]>
Date:   Wed Jun 12 06:04:46 2024 +0000

    ionic: fix use after netif_napi_del()
    
    [ Upstream commit 79f18a41dd056115d685f3b0a419c7cd40055e13 ]
    
    When queues are started, netif_napi_add() and napi_enable() are called.
    If there are 4 queues and only 3 queues are used for the current
    configuration, only 3 queues' napi should be registered and enabled.
    ionic_qcq_enable() checks whether the .poll pointer is non-NULL in order
    to enable napi only for the queues in use. An unused queue's napi is not
    registered by netif_napi_add(), so its .poll pointer is NULL.
    But this check cannot tell whether a napi has been unregistered,
    because netif_napi_del() doesn't reset the .poll pointer to NULL.
    So ionic_qcq_enable() calls napi_enable() for a queue that was
    unregistered by netif_napi_del().
    
    Reproducer:
       ethtool -L <interface name> rx 1 tx 1 combined 0
       ethtool -L <interface name> rx 0 tx 0 combined 1
       ethtool -L <interface name> rx 0 tx 0 combined 4
    
    Splat looks like:
    kernel BUG at net/core/dev.c:6666!
    Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
    Workqueue: events ionic_lif_deferred_work [ionic]
    RIP: 0010:napi_enable+0x3b/0x40
    Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
    RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
    RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
    R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
    R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
    FS:  0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
    PKRU: 55555554
    Call Trace:
     <TASK>
     ? die+0x33/0x90
     ? do_trap+0xd9/0x100
     ? napi_enable+0x3b/0x40
     ? do_error_trap+0x83/0xb0
     ? napi_enable+0x3b/0x40
     ? napi_enable+0x3b/0x40
     ? exc_invalid_op+0x4e/0x70
     ? napi_enable+0x3b/0x40
     ? asm_exc_invalid_op+0x16/0x20
     ? napi_enable+0x3b/0x40
     ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
     ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
     ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
     ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
     process_one_work+0x145/0x360
     worker_thread+0x2bb/0x3d0
     ? __pfx_worker_thread+0x10/0x10
     kthread+0xcc/0x100
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x2d/0x50
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
    
    Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
    Signed-off-by: Taehee Yoo <[email protected]>
    Reviewed-by: Brett Creeley <[email protected]>
    Reviewed-by: Shannon Nelson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv6: fix possible race in __fib6_drop_pcpu_from() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue Jun 4 19:35:49 2024 +0000

    ipv6: fix possible race in __fib6_drop_pcpu_from()
    
    [ Upstream commit b01e1c030770ff3b4fe37fc7cc6bca03f594133f ]
    
    syzbot found a race in __fib6_drop_pcpu_from() [1]
    
    If the compiler reads (*ppcpu_rt) more than once, the
    second read could return NULL if another cpu clears
    the value in rt6_get_pcpu_route().
    
    Add a READ_ONCE() to prevent this race.
    
    Also add rcu_read_lock()/rcu_read_unlock() because
    we rely on RCU protection while dereferencing pcpu_rt.
    
    [1]
    
    Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
    CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
    Workqueue: netns cleanup_net
     RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984
    Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48
    RSP: 0018:ffffc900040df070 EFLAGS: 00010206
    RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16
    RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091
    RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007
    R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8
    R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001
    FS:  0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
      __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline]
      fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline]
      fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038
      fib6_del_route net/ipv6/ip6_fib.c:1998 [inline]
      fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043
      fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205
      fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127
      fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175
      fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255
      __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271
      rt6_sync_down_dev net/ipv6/route.c:4906 [inline]
      rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911
      addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855
      addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778
      notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
      call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992
      call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
      call_netdevice_notifiers net/core/dev.c:2044 [inline]
      dev_close_many+0x333/0x6a0 net/core/dev.c:1585
      unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193
      unregister_netdevice_many net/core/dev.c:11276 [inline]
      default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759
      ops_exit_list+0x128/0x180 net/core/net_namespace.c:178
      cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
      process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
      process_scheduled_works kernel/workqueue.c:3312 [inline]
      worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
      kthread+0x2c1/0x3a0 kernel/kthread.c:389
      ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
      ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Martin KaFai Lau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv6: ioam: block BH from ioam6_output() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Fri May 31 13:26:32 2024 +0000

    ipv6: ioam: block BH from ioam6_output()
    
    [ Upstream commit 2fe40483ec257de2a0d819ef88e3e76c7e261319 ]
    
    As explained in commit 1378817486d6 ("tipc: block BH
    before using dst_cache"), net/core/dst_cache.c
    helpers need to be called with BH disabled.
    
    Disabling preemption in ioam6_output() is not good enough,
    because ioam6_output() is called from process context and
    lwtunnel_output() only uses rcu_read_lock().
    
    We might be interrupted by a softirq, re-enter ioam6_output()
    and corrupt dst_cache data structures.
    
    Fix the race by using local_bh_disable() instead of
    preempt_disable().
    
    Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation")
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Justin Iurman <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv6: sr: block BH in seg6_output_core() and seg6_input_core() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Fri May 31 13:26:34 2024 +0000

    ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
    
    [ Upstream commit c0b98ac1cc104f48763cdb27b1e9ac25fd81fc90 ]
    
    As explained in commit 1378817486d6 ("tipc: block BH
    before using dst_cache"), net/core/dst_cache.c
    helpers need to be called with BH disabled.
    
    Disabling preemption in seg6_output_core() is not good enough,
    because seg6_output_core() is called from process context and
    lwtunnel_output() only uses rcu_read_lock().
    
    We might be interrupted by a softirq, re-enter seg6_output_core()
    and corrupt dst_cache data structures.
    
    Fix the race by using local_bh_disable() instead of
    preempt_disable().
    
    Apply a similar change in seg6_input_core().
    
    Fixes: fa79581ea66c ("ipv6: sr: fix several BUGs when preemption is enabled")
    Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: David Lebrun <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() [+ + +]

Author: Hagar Hemdan <[email protected]>
Date:   Fri May 31 16:21:44 2024 +0000

    irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
    
    commit b97e8a2f7130a4b30d1502003095833d16c028b3 upstream.
    
    its_vlpi_prop_update() calls lpi_write_config(), which obtains the
    mapping information for a VLPI without the lock held. So it could race
    with its_vlpi_unmap().
    
    Since all calls from its_irq_set_vcpu_affinity() require the same
    lock to be held, hoist the locking there instead of sprinkling the
    locking all over the place.
    
    This bug was discovered using Coverity Static Analysis Security Testing
    (SAST) by Synopsys, Inc.
    
    [ tglx: Use guard() instead of goto ]
    
    Fixes: 015ec0386ab6 ("irqchip/gic-v3-its: Add VLPI configuration handling")
    Suggested-by: Marc Zyngier <[email protected]>
    Signed-off-by: Hagar Hemdan <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Cc: [email protected]
    Reviewed-by: Marc Zyngier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

jfs: xattr: fix buffer overflow for invalid xattr [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Tue May 14 12:06:34 2024 +0200

    jfs: xattr: fix buffer overflow for invalid xattr
    
    commit 7c55b78818cfb732680c4a72ab270cc2d2ee3d0f upstream.
    
    When an xattr size is not what is expected, it is printed out to the
    kernel log in hex format as a form of debugging.  But when that xattr
    size is bigger than the expected size, printing it out can cause an
    access off the end of the buffer.
    
    Fix this all up by properly restricting the size of the debug hex dump
    in the kernel log.
    
    Reported-by: [email protected]
    Cc: Dave Kleikamp <[email protected]>
    Link: https://lore.kernel.org/r/2024051433-slider-cloning-98f9@gregkh
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

knfsd: LOOKUP can return an illegal error value [+ + +]

Author: Trond Myklebust <[email protected]>
Date:   Mon May 6 12:30:04 2024 -0400

    knfsd: LOOKUP can return an illegal error value
    
    commit e221c45da3770962418fb30c27d941bbc70d595a upstream.
    
    The 'NFS error' NFSERR_OPNOTSUPP is not described by any of the official
    NFS related RFCs, but appears to have snuck into some older .x files for
    NFSv2.
    Either way, it is not in RFC1094, RFC1813 or any of the NFSv4 RFCs, so
    should not be returned by the knfsd server, and particularly not by the
    "LOOKUP" operation.
    
    Instead, let's return NFSERR_STALE, which is more appropriate if the
    filesystem encodes the filehandle as FILEID_INVALID.
    
    Cc: [email protected]
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

landlock: Fix d_parent walk [+ + +]

Author: Mickaël Salaün <[email protected]>
Date:   Thu May 16 20:19:34 2024 +0200

    landlock: Fix d_parent walk
    
    commit 88da52ccd66e65f2e63a6c35c9dff55d448ef4dc upstream.
    
    The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when
    trying to link a root mount point.  This cannot work in practice because
    this directory is mounted, but the VFS check is done after the call to
    security_path_link().
    
    Do not use source directory's d_parent when the source directory is the
    mount point.
    
    Cc: Günther Noack <[email protected]>
    Cc: Paul Moore <[email protected]>
    Cc: [email protected]
    Reported-by: [email protected]
    Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
    Closes: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    [mic: Fix commit message]
    Signed-off-by: Mickaël Salaün <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Linux: Linux 6.1.95 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Fri Jun 21 14:36:01 2024 +0200

    Linux 6.1.95
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: SeongJae Park <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Tested-by: Allen Pais <[email protected]>
    Tested-by: Kelsey Steele <[email protected]>
    Tested-by: Salvatore Bonaccorso <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Sven Joachim <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet [+ + +]

Author: Aleksandr Mishin <[email protected]>
Date:   Wed Jun 5 13:11:35 2024 +0300

    liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
    
    [ Upstream commit c44711b78608c98a3e6b49ce91678cd0917d5349 ]
    
    In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value,
    but then it is unconditionally passed to skb_add_rx_frag() which looks
    strange and could lead to a NULL pointer dereference.
    
    lio_vf_rep_copy_packet() call trace looks like:
            octeon_droq_process_packets
             octeon_droq_fast_process_packets
              octeon_droq_dispatch_pkt
               octeon_create_recv_info
                ...search in the dispatch_list...
                 ->disp_fn(rdisp->rinfo, ...)
                  lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...)
    In this path there is no code which sets pg_info->page to NULL.
    So this check looks unneeded and doesn't solve a potential problem.
    But I guess the author had a reason to add the check, and I have no such
    card and can't do a real test.
    In addition, the code in the function liquidio_push_packet() in
    liquidio/lio_core.c does exactly the same.
    
    Based on this, I consider the most acceptable compromise to be
    moving skb_add_rx_frag() into the conditional scope.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 1f233f327913 ("liquidio: switchdev support for LiquidIO NIC")
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mei: me: release irq in mei_me_pci_resume error path [+ + +]

Author: Tomas Winkler <[email protected]>
Date:   Tue Jun 4 12:07:28 2024 +0300

    mei: me: release irq in mei_me_pci_resume error path
    
    commit 283cb234ef95d94c61f59e1cd070cd9499b51292 upstream.
    
    mei_me_pci_resume() doesn't release the irq on the error path
    when mei_start() fails.
    
    Cc: <[email protected]>
    Fixes: 33ec08263147 ("mei: revamp mei reset state machine")
    Signed-off-by: Tomas Winkler <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

misc/pvpanic-pci: register attributes via pci_driver [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Thu Apr 11 23:33:51 2024 +0200

    misc/pvpanic-pci: register attributes via pci_driver
    
    [ Upstream commit ee59be35d7a8be7fcaa2d61fb89734ab5c25e4ee ]
    
    In __pci_register_driver(), the pci core overwrites the dev_groups field of
    the embedded struct device_driver with the dev_groups from the outer
    struct pci_driver unconditionally.
    
    Set dev_groups in the pci_driver to make sure it is used.
    
    This was broken since the introduction of pvpanic-pci.
    
    Fixes: db3a4f0abefd ("misc/pvpanic: add PCI driver")
    Cc: [email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Fixes: ded13b9cfd59 ("PCI: Add support for dev_groups to struct pci_driver")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

misc/pvpanic: deduplicate common code [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Wed Oct 11 09:18:27 2023 +0200

    misc/pvpanic: deduplicate common code
    
    [ Upstream commit c1426d392aebc51da4944d950d89e483e43f6f14 ]
    
    pvpanic-mmio.c and pvpanic-pci.c share a lot of code.
    Refactor it into pvpanic.c where it doesn't have to be kept in sync
    manually and where the core logic can be understood more easily.
    
    No functional change.
    
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Stable-dep-of: ee59be35d7a8 ("misc/pvpanic-pci: register attributes via pci_driver")
    Signed-off-by: Sasha Levin <[email protected]>

misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() [+ + +]

Author: Yongzhi Liu <[email protected]>
Date:   Thu May 23 20:14:34 2024 +0800

    misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
    
    [ Upstream commit 77427e3d5c353e3dd98c7c0af322f8d9e3131ace ]
    
    There is a memory leak (the allocated buffers are never freed) in a
    memory allocation failure path.
    
    Fix it to jump to the correct error handling code.
    
    Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
    Signed-off-by: Yongzhi Liu <[email protected]>
    Reviewed-by: Kumaravel Thiagarajan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe() [+ + +]

Author: Yongzhi Liu <[email protected]>
Date:   Thu May 23 20:14:33 2024 +0800

    misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe()
    
    commit 086c6cbcc563c81d55257f9b27e14faf1d0963d3 upstream.
    
    When auxiliary_device_add() returns error and then calls
    auxiliary_device_uninit(), callback function
    gp_auxiliary_device_release() calls ida_free() and
    kfree(aux_device_wrapper) to free memory. We shouldn't
    call them again in the error handling path.
    
    Fix this by skipping the redundant cleanup functions.
    
    Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
    Signed-off-by: Yongzhi Liu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
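
The ownership rule behind the double-free entry above can be sketched in plain C. All names here (struct wrapper, device_add(), device_uninit()) are hypothetical stand-ins for the auxiliary bus calls, and device_add() is made to fail unconditionally so that only the error path is exercised.

```c
#include <stdlib.h>

struct wrapper {
    int id;
    int *free_count;    /* counts how often the wrapper was released */
};

/* Plays the role of the release callback: it owns the final cleanup. */
static void wrapper_release(struct wrapper *w)
{
    (*w->free_count)++;
    free(w);
}

/* Plays the role of the uninit call: dropping the last reference
 * invokes the release callback. */
static void device_uninit(struct wrapper *w)
{
    wrapper_release(w);
}

/* Always-failing stand-in for the add call. */
static int device_add(struct wrapper *w)
{
    (void)w;
    return -1;
}

/* Returns how many times the wrapper was released on the error path,
 * or -1 if the initial allocation failed. */
static int probe_error_path(void)
{
    int free_count = 0;
    struct wrapper *w = malloc(sizeof(*w));

    if (!w)
        return -1;
    w->id = 0;
    w->free_count = &free_count;

    if (device_add(w) < 0)
        device_uninit(w);   /* the callback already freed w */

    /* No free(w) here: a second cleanup would be the double free. */
    return free_count;
}
```

Once the uninit step has run, the error path treats the wrapper as gone and performs no further cleanup on it.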

mm, vmalloc: fix high order __GFP_NOFAIL allocations [+ + +]

Author: Michal Hocko <[email protected]>
Date:   Mon Mar 6 09:15:17 2023 +0100

    mm, vmalloc: fix high order __GFP_NOFAIL allocations
    
    [ Upstream commit e9c3cda4d86e56bf7fe403729f38c4f0f65d3860 ]
    
    Gao Xiang has reported that the page allocator complains about high order
    __GFP_NOFAIL request coming from the vmalloc core:
    
     __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
     alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
     vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
     __vmalloc_area_node mm/vmalloc.c:3057 [inline]
     __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
     kvmalloc_node+0x156/0x1a0 mm/util.c:606
     kvmalloc include/linux/slab.h:737 [inline]
     kvmalloc_array include/linux/slab.h:755 [inline]
     kvcalloc include/linux/slab.h:760 [inline]
    
    It seems that I have completely missed the case of high order allocations
    backing vmalloc areas when implementing __GFP_NOFAIL support.  This means
    that [k]vmalloc et al. can allocate higher order allocations with
    __GFP_NOFAIL which can trigger OOM killer for non-costly orders easily or
    cause a lot of reclaim/compaction activity if those requests cannot be
    satisfied.
    
    Fix the issue by falling back to zero order allocations for __GFP_NOFAIL
    requests if the high order request fails.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL")
    Reported-by: Gao Xiang <[email protected]>
      Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Michal Hocko <[email protected]>
    Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
    Acked-by: Vlastimil Babka <[email protected]>
    Cc: Baoquan He <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Cc: Mel Gorman <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Stable-dep-of: 8e0545c83d67 ("mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL")
    Signed-off-by: Sasha Levin <[email protected]>
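
A rough userspace model of the fallback described in the entry above, assuming a toy allocator in which every high order request fails and every order-0 request succeeds. GFP_NOFAIL, alloc_pages_order() and alloc_area_pages() are hypothetical names for this sketch only.

```c
#include <stdbool.h>

#define GFP_NOFAIL 0x1u    /* hypothetical flag value for this sketch */

/* Toy page allocator: high order requests fail, order-0 succeeds. */
static bool alloc_pages_order(unsigned int order)
{
    return order == 0;
}

/* Returns the order that was finally used, or -1 on failure. */
static int alloc_area_pages(unsigned int gfp, unsigned int order)
{
    if (alloc_pages_order(order))
        return (int)order;

    /* A "no fail" caller must not see a failure just because the
     * high order attempt did not succeed: retry with single pages. */
    if ((gfp & GFP_NOFAIL) && order > 0 && alloc_pages_order(0))
        return 0;

    return -1;
}
```

In this model a request for order 4 with GFP_NOFAIL is satisfied with order-0 pages, while the same request without the flag simply fails.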

mm/huge_memory: don't unpoison huge_zero_folio [+ + +]

Author: Miaohe Lin <[email protected]>
Date:   Thu May 16 20:26:08 2024 +0800

    mm/huge_memory: don't unpoison huge_zero_folio
    
    commit fe6f86f4b40855a130a19aa589f9ba7f650423f4 upstream.
    
    When I did memory failure tests recently, the panic below occurred:
    
     kernel BUG at include/linux/mm.h:1135!
     invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
     CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
     RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
     RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
     RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
     RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
     RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
     R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
     R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
     FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
     Call Trace:
      <TASK>
      do_shrink_slab+0x14f/0x6a0
      shrink_slab+0xca/0x8c0
      shrink_node+0x2d0/0x7d0
      balance_pgdat+0x33a/0x720
      kswapd+0x1f3/0x410
      kthread+0xd5/0x100
      ret_from_fork+0x2f/0x50
      ret_from_fork_asm+0x1a/0x30
      </TASK>
     Modules linked in: mce_inject hwpoison_inject
     ---[ end trace 0000000000000000 ]---
     RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
     RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
     RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
     RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
     RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
     R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
     R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
     FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
    
    The root cause is that HWPoison flag will be set for huge_zero_folio
    without increasing the folio refcnt.  But then unpoison_memory() will
    decrease the folio refcnt unexpectedly as it appears like a successfully
    hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when
    releasing huge_zero_folio.
    
    Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue.
    We're not prepared to unpoison huge_zero_folio yet.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page")
    Signed-off-by: Miaohe Lin <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Reviewed-by: Oscar Salvador <[email protected]>
    Reviewed-by: Anshuman Khandual <[email protected]>
    Cc: Naoya Horiguchi <[email protected]>
    Cc: Xu Yu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Miaohe Lin <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/memory-failure: fix handling of dissolved but not taken off from buddy pages [+ + +]

Author: Miaohe Lin <[email protected]>
Date:   Thu May 23 15:12:17 2024 +0800

    mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
    
    commit 8cf360b9d6a840700e06864236a01a883b34bbad upstream.
    
    When I did memory failure tests recently, the panic below occurred:
    
    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
    flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
    raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
    raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
    page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
    ------------[ cut here ]------------
    kernel BUG at include/linux/page-flags.h:1009!
    invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    RIP: 0010:__del_page_from_free_list+0x151/0x180
    RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
    RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
    RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
    RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
    R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
    R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
    FS:  00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
    Call Trace:
     <TASK>
     __rmqueue_pcplist+0x23b/0x520
     get_page_from_freelist+0x26b/0xe40
     __alloc_pages_noprof+0x113/0x1120
     __folio_alloc_noprof+0x11/0xb0
     alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
     __alloc_fresh_hugetlb_folio+0xe7/0x140
     alloc_pool_huge_folio+0x68/0x100
     set_max_huge_pages+0x13d/0x340
     hugetlb_sysctl_handler_common+0xe8/0x110
     proc_sys_call_handler+0x194/0x280
     vfs_write+0x387/0x550
     ksys_write+0x64/0xe0
     do_syscall_64+0xc2/0x1d0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7ff916114887
    RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
    RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
    RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
    R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
    R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
     </TASK>
    Modules linked in: mce_inject hwpoison_inject
    ---[ end trace 0000000000000000 ]---
    
    And before the panic, there was a warning about bad page state:
    
    BUG: Bad page state in process page-types  pfn:8cee00
    page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
    flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
    page_type: 0xffffff7f(buddy)
    raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
    raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
    page dumped because: nonzero mapcount
    Modules linked in: mce_inject hwpoison_inject
    CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
    Call Trace:
     <TASK>
     dump_stack_lvl+0x83/0xa0
     bad_page+0x63/0xf0
     free_unref_page+0x36e/0x5c0
     unpoison_memory+0x50b/0x630
     simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
     debugfs_attr_write+0x42/0x60
     full_proxy_write+0x5b/0x80
     vfs_write+0xcd/0x550
     ksys_write+0x64/0xe0
     do_syscall_64+0xc2/0x1d0
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f189a514887
    RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
    RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
    RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
    R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
     </TASK>
    
    The root cause should be the race below:
    
     memory_failure
      try_memory_failure_hugetlb
       me_huge_page
        __page_handle_poison
         dissolve_free_hugetlb_folio
         drain_all_pages -- Buddy page can be isolated e.g. for compaction.
         take_page_off_buddy -- Failed as page is not in the buddy list.
                 -- Page can be putback into buddy after compaction.
        page_ref_inc -- Leads to buddy page with refcnt = 1.
    
    Then unpoison_memory() can unpoison the page and send the buddy page back
    into buddy list again leading to the above bad page state warning.  And
    bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy
    page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to
    allocate this page.
    
    Fix this issue by only treating __page_handle_poison() as successful when
    it returns 1.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
    Signed-off-by: Miaohe Lin <[email protected]>
    Cc: Naoya Horiguchi <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Miaohe Lin <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL [+ + +]

Author: Hailong.Liu <[email protected]>
Date:   Fri May 10 18:01:31 2024 +0800

    mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
    
    [ Upstream commit 8e0545c83d672750632f46e3f9ad95c48c91a0fc ]
    
    commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
    includes support for __GFP_NOFAIL, but it presents a conflict with commit
    dd544141b9eb ("vmalloc: back off when the current task is OOM-killed").  A
    possible scenario is as follows:
    
    process-a
    __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
        __vmalloc_area_node()
            vm_area_alloc_pages()
                    --> oom-killer send SIGKILL to process-a
            if (fatal_signal_pending(current)) break;
    --> return NULL;
    
    To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
    if __GFP_NOFAIL is set.
    
    This issue occurred during OPLUS KASAN TEST. Below is part of the log
    -> oom-killer sends signal to process
    [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198
    
    [65731.259685] [T32454] Call trace:
    [65731.259698] [T32454]  dump_backtrace+0xf4/0x118
    [65731.259734] [T32454]  show_stack+0x18/0x24
    [65731.259756] [T32454]  dump_stack_lvl+0x60/0x7c
    [65731.259781] [T32454]  dump_stack+0x18/0x38
    [65731.259800] [T32454]  mrdump_common_die+0x250/0x39c [mrdump]
    [65731.259936] [T32454]  ipanic_die+0x20/0x34 [mrdump]
    [65731.260019] [T32454]  atomic_notifier_call_chain+0xb4/0xfc
    [65731.260047] [T32454]  notify_die+0x114/0x198
    [65731.260073] [T32454]  die+0xf4/0x5b4
    [65731.260098] [T32454]  die_kernel_fault+0x80/0x98
    [65731.260124] [T32454]  __do_kernel_fault+0x160/0x2a8
    [65731.260146] [T32454]  do_bad_area+0x68/0x148
    [65731.260174] [T32454]  do_mem_abort+0x151c/0x1b34
    [65731.260204] [T32454]  el1_abort+0x3c/0x5c
    [65731.260227] [T32454]  el1h_64_sync_handler+0x54/0x90
    [65731.260248] [T32454]  el1h_64_sync+0x68/0x6c
    
    [65731.260269] [T32454]  z_erofs_decompress_queue+0x7f0/0x2258
    --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
            kernel panic by NULL pointer dereference.
            erofs assumes kvmalloc with __GFP_NOFAIL never returns NULL.
    [65731.260293] [T32454]  z_erofs_runqueue+0xf30/0x104c
    [65731.260314] [T32454]  z_erofs_readahead+0x4f0/0x968
    [65731.260339] [T32454]  read_pages+0x170/0xadc
    [65731.260364] [T32454]  page_cache_ra_unbounded+0x874/0xf30
    [65731.260388] [T32454]  page_cache_ra_order+0x24c/0x714
    [65731.260411] [T32454]  filemap_fault+0xbf0/0x1a74
    [65731.260437] [T32454]  __do_fault+0xd0/0x33c
    [65731.260462] [T32454]  handle_mm_fault+0xf74/0x3fe0
    [65731.260486] [T32454]  do_mem_abort+0x54c/0x1b34
    [65731.260509] [T32454]  el0_da+0x44/0x94
    [65731.260531] [T32454]  el0t_64_sync_handler+0x98/0xb4
    [65731.260553] [T32454]  el0t_64_sync+0x198/0x19c
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL")
    Signed-off-by: Hailong.Liu <[email protected]>
    Acked-by: Michal Hocko <[email protected]>
    Suggested-by: Barry Song <[email protected]>
    Reported-by: Oven <[email protected]>
    Reviewed-by: Barry Song <[email protected]>
    Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
    Cc: Chao Yu <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Cc: Gao Xiang <[email protected]>
    Cc: Lorenzo Stoakes <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
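
The interaction described in the entry above can be modelled as a loop that backs off on a fatal signal only when the caller tolerates failure. This is a sketch under assumptions: GFP_NOFAIL and alloc_pages_loop() are hypothetical names, and the signal is simulated by the signal_at parameter (the number of pages allocated before the signal becomes pending).

```c
#include <stdbool.h>
#include <stddef.h>

#define GFP_NOFAIL 0x1u    /* hypothetical flag value for this sketch */

/* Returns how many of nr_pages were allocated. A fatal signal is
 * treated as pending once signal_at pages have been allocated. */
static size_t alloc_pages_loop(unsigned int gfp, size_t nr_pages,
                               size_t signal_at)
{
    bool nofail = (gfp & GFP_NOFAIL) != 0;
    size_t allocated = 0;

    while (allocated < nr_pages) {
        bool fatal_signal_pending = allocated >= signal_at;

        /* Back off for a killed task only if failure is allowed. */
        if (!nofail && fatal_signal_pending)
            break;

        allocated++;
    }
    return allocated;
}
```

With the flag set, the loop runs to completion even after the simulated signal, so the caller never observes a partial (failed) allocation.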

mmc: davinci: Don't strip remove function when driver is builtin [+ + +]

Author: Uwe Kleine-König <[email protected]>
Date:   Sun Mar 24 12:40:17 2024 +0100

    mmc: davinci: Don't strip remove function when driver is builtin
    
    [ Upstream commit 55c421b364482b61c4c45313a535e61ed5ae4ea3 ]
    
    Using __exit for the remove function results in the remove callback being
    discarded with CONFIG_MMC_DAVINCI=y. When such a device gets unbound (e.g.
    using sysfs or hotplug), the driver is just removed without the cleanup
    being performed. This results in resource leaks. Fix it by compiling in the
    remove callback unconditionally.
    
    This also fixes a W=1 modpost warning:
    
    WARNING: modpost: drivers/mmc/host/davinci_mmc: section mismatch in
    reference: davinci_mmcsd_driver+0x10 (section: .data) ->
    davinci_mmcsd_remove (section: .exit.text)
    
    Fixes: b4cff4549b7a ("DaVinci: MMC: MMC/SD controller driver for DaVinci family")
    Signed-off-by: Uwe Kleine-König <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mptcp: ensure snd_una is properly initialized on connect [+ + +]

Author: Paolo Abeni <[email protected]>
Date:   Fri Jun 7 17:01:48 2024 +0200

    mptcp: ensure snd_una is properly initialized on connect
    
    commit 8031b58c3a9b1db3ef68b3bd749fbee2e1e1aaa3 upstream.
    
    This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt
    is properly initialized on connect"). It turns out that syzkaller can
    trigger the retransmit after fallback and before processing any other
    incoming packet - so that snd_una is still left uninitialized.
    
    Address the issue by explicitly initializing snd_una together with snd_nxt
    and write_seq.
    
    Suggested-by: Mat Martineau <[email protected]>
    Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect")
    Cc: [email protected]
    Reported-by: Christoph Paasch <[email protected]>
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID [+ + +]

Author: YonglongLi <[email protected]>
Date:   Fri Jun 7 17:01:49 2024 +0200

    mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
    
    commit 6a09788c1a66e3d8b04b3b3e7618cc817bb60ae9 upstream.
    
    The RmAddr MIB counter is supposed to be incremented once when a valid
    RM_ADDR has been received. Before this patch, it could have been
    incremented as many times as the number of subflows connected to the
    linked address ID, so it could have been 0, 1 or more than 1.
    
    The "RmSubflow" is incremented after a local operation. In this case,
    it is normal to tie it to the number of subflows that have actually
    been removed.
    
    The "remove invalid addresses" MP Join subtest has been modified to
    validate this case. A broadcast IP address is now used instead: the
    client will not be able to create a subflow to this address. The
    consequence is that when receiving the RM_ADDR with the ID attached to
    this broadcast IP address, no subflow linked to this ID will be found.
    
    Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs")
    Cc: [email protected]
    Co-developed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: YonglongLi <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
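
The counting rule from the RmAddr entry above, sketched in userspace C. struct subflow and handle_rm_addr() are hypothetical simplifications; the point is only that the counter is bumped once per received ID, outside the loop over subflows.

```c
#include <stddef.h>

/* Hypothetical, simplified subflow record for this sketch. */
struct subflow {
    unsigned char remote_id;    /* address ID the subflow is linked to */
    int closed;
};

/*
 * Handle one received RM_ADDR ID: close every subflow linked to it and
 * bump the RmAddr-style counter exactly once, whether 0, 1 or several
 * subflows matched. Returns the number of subflows closed.
 */
static unsigned int handle_rm_addr(struct subflow *sf, size_t n,
                                   unsigned char rm_id,
                                   unsigned int *rm_addr_counter)
{
    unsigned int closed = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (sf[i].remote_id != rm_id || sf[i].closed)
            continue;
        sf[i].closed = 1;
        closed++;
    }

    (*rm_addr_counter)++;    /* once per ID, outside the subflow loop */
    return closed;
}
```

Incrementing inside the loop instead would give 0 for an ID with no subflow and 2 for an ID with two subflows, which is the miscount the entry describes.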

mptcp: pm: update add_addr counters after connect [+ + +]

Author: YonglongLi <[email protected]>
Date:   Fri Jun 7 17:01:50 2024 +0200

    mptcp: pm: update add_addr counters after connect
    
    commit 40eec1795cc27b076d49236649a29507c7ed8c2d upstream.
    
    The creation of new subflows can fail for different reasons. If no
    subflow has been created using the received ADD_ADDR, the related
    counters should not be updated, otherwise they will never be decremented
    for events related to this ID later on.
    
    For the moment, the number of accepted ADD_ADDR is only decremented upon
    the reception of a related RM_ADDR, and only if the remote address ID is
    currently being used by at least one subflow. In other words, if no
    subflow can be created with the received address, the counter will not
    be decremented. In this case, it is then important not to increment
    pm.add_addr_accepted counter, and not to modify pm.accept_addr bit.
    
    Note that this patch does not modify the behaviour in case of failures
    later on, e.g. if the MP Join is dropped or rejected.
    
    The "remove invalid addresses" MP Join subtest has been modified to
    validate this case. The broadcast IP address is added before the "valid"
    address that will be used to successfully create a subflow, and the
    limit is decreased by one: without this patch, it was not possible to
    create the last subflow, because:
    
    - the broadcast address would have been accepted even if it was not
      usable: the creation of a subflow to this address results in an error,
    
    - the limit of 2 accepted ADD_ADDR would have then been reached.
    
    Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
    Cc: [email protected]
    Co-developed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: YonglongLi <[email protected]>
    Reviewed-by: Mat Martineau <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    [ Conflicts in the selftests, in the same context, because the next line
      with 'run_tests' has been updated later by a few commits like commit
      e571fb09c893 ("selftests: mptcp: add speed env var"). We don't need to
      touch this line, nor to backport the long refactoring series. ]
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net/ipv6: Fix the RT cache flush via sysctl using a previous delay [+ + +]

Author: Petr Pavlu <[email protected]>
Date:   Fri Jun 7 13:28:28 2024 +0200

    net/ipv6: Fix the RT cache flush via sysctl using a previous delay
    
    [ Upstream commit 14a20e5b4ad998793c5f43b0330d9e1388446cf3 ]
    
    The net.ipv6.route.flush system parameter takes a value which specifies
    a delay used during the flush operation for aging exception routes. The
    written value is however not used in the currently requested flush and
    instead utilized only in the next one.
    
    A problem is that ipv6_sysctl_rtcache_flush() first reads the old value
    of net->ipv6.sysctl.flush_delay into a local delay variable and then
    calls proc_dointvec() which actually updates the sysctl based on the
    provided input.
    
    Fix the problem by switching the order of the two operations.
    
    Fixes: 4990509f19e8 ("[NETNS][IPV6]: Make sysctls route per namespace.")
    Signed-off-by: Petr Pavlu <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
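
The ordering problem in the net/ipv6 entry above reduces to "read, then store" versus "store, then read". A minimal sketch, assuming a hypothetical struct sysctl_state and a store_value() helper standing in for the sysctl update:

```c
/* Hypothetical state holding the sysctl value. */
struct sysctl_state {
    int flush_delay;
};

/* Stand-in for the sysctl update: store the value that was written. */
static void store_value(struct sysctl_state *s, int written)
{
    s->flush_delay = written;
}

/* Old order: the delay is sampled before the store, so the flush
 * uses the value left over from the previous write. */
static int flush_old(struct sysctl_state *s, int written)
{
    int delay = s->flush_delay;

    store_value(s, written);
    return delay;
}

/* Fixed order: store first, then read the delay for this flush. */
static int flush_fixed(struct sysctl_state *s, int written)
{
    store_value(s, written);
    return s->flush_delay;
}
```

Starting from a delay of 0, writing 5 and then 9 makes flush_old() use 0 and then 5, while flush_fixed() uses 5 and then 9.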

net/mlx5: Always stop health timer during driver removal [+ + +]

Author: Shay Drory <[email protected]>
Date:   Tue Jun 4 00:04:43 2024 +0300

    net/mlx5: Always stop health timer during driver removal
    
    [ Upstream commit c8b3f38d2dae0397944814d691a419c451f9906f ]
    
    Currently, if teardown_hca fails to execute during driver removal, mlx5
    does not stop the health timer. Afterwards, mlx5 continues with driver
    teardown. This may lead to a UAF bug, which results in a page fault
    Oops[1], since the health timer fires after resources were freed.
    
    Hence, stop the health monitor even if teardown_hca fails.
    
    [1]
    mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
    mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
    mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
    mlx5_core 0000:18:00.0: E-Switch: cleanup
    mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
    mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup
    BUG: unable to handle page fault for address: ffffa26487064230
    PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           OE     -------  ---  6.7.0-68.fc38.x86_64 #1
    Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020
    RIP: 0010:ioread32be+0x34/0x60
    RSP: 0018:ffffa26480003e58 EFLAGS: 00010292
    RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0
    RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230
    RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8
    R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0
    R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0
    FS:  0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     <IRQ>
     ? __die+0x23/0x70
     ? page_fault_oops+0x171/0x4e0
     ? exc_page_fault+0x175/0x180
     ? asm_exc_page_fault+0x26/0x30
     ? __pfx_poll_health+0x10/0x10 [mlx5_core]
     ? __pfx_poll_health+0x10/0x10 [mlx5_core]
     ? ioread32be+0x34/0x60
     mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core]
     ? __pfx_poll_health+0x10/0x10 [mlx5_core]
     poll_health+0x42/0x230 [mlx5_core]
     ? __next_timer_interrupt+0xbc/0x110
     ? __pfx_poll_health+0x10/0x10 [mlx5_core]
     call_timer_fn+0x21/0x130
     ? __pfx_poll_health+0x10/0x10 [mlx5_core]
     __run_timers+0x222/0x2c0
     run_timer_softirq+0x1d/0x40
     __do_softirq+0xc9/0x2c8
     __irq_exit_rcu+0xa6/0xc0
     sysvec_apic_timer_interrupt+0x72/0x90
     </IRQ>
     <TASK>
     asm_sysvec_apic_timer_interrupt+0x1a/0x20
    RIP: 0010:cpuidle_enter_state+0xcc/0x440
     ? cpuidle_enter_state+0xbd/0x440
     cpuidle_enter+0x2d/0x40
     do_idle+0x20d/0x270
     cpu_startup_entry+0x2a/0x30
     rest_init+0xd0/0xd0
     arch_call_rest_init+0xe/0x30
     start_kernel+0x709/0xa90
     x86_64_start_reservations+0x18/0x30
     x86_64_start_kernel+0x96/0xa0
     secondary_startup_64_no_verify+0x18f/0x19b
    ---[ end trace 0000000000000000 ]---
    
    Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load")
    Signed-off-by: Shay Drory <[email protected]>
    Reviewed-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Fix tainted pointer delete is case of flow rules creation fail [+ + +]

Author: Aleksandr Mishin <[email protected]>
Date:   Tue Jun 4 13:05:52 2024 +0300

    net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
    
    [ Upstream commit 229bedbf62b13af5aba6525ad10b62ad38d9ccb5 ]
    
    If flow rule creation fails in mlx5_lag_create_port_sel_table(),
    the tainted pointer is deleted several times instead of the
    previously created rules.
    Fix this bug by using the correct flow rule pointers.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode")
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
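
The unwind pattern behind the tainted-pointer entry above, as a userspace sketch. Everything here is hypothetical (struct rule, the static pool, create_rule() failing at a chosen index, and NULL instead of an error pointer); it only shows the cleanup walking the stored pointers of rules that were actually created rather than the failed result.

```c
#include <stddef.h>

#define MAX_RULES 4

struct rule {
    int live;
};

static struct rule pool[MAX_RULES];

/* Creates rule idx; fails (returns NULL) when idx == fail_at. */
static struct rule *create_rule(int idx, int fail_at)
{
    if (idx == fail_at)
        return NULL;
    pool[idx].live = 1;
    return &pool[idx];
}

static void delete_rule(struct rule *r)
{
    r->live = 0;
}

/* Returns 0 on success, or -1 after undoing every earlier rule. */
static int create_table(struct rule *rules[MAX_RULES], int fail_at)
{
    int i, k;

    for (i = 0; i < MAX_RULES; i++) {
        struct rule *r = create_rule(i, fail_at);

        if (!r) {
            /* Unwind through the stored pointers, never through
             * the failed result r. */
            for (k = i - 1; k >= 0; k--) {
                delete_rule(rules[k]);
                rules[k] = NULL;
            }
            return -1;
        }
        rules[i] = r;
    }
    return 0;
}

/* Number of rules currently marked live in the pool. */
static int live_rules(void)
{
    int i, n = 0;

    for (i = 0; i < MAX_RULES; i++)
        n += pool[i].live;
    return n;
}
```

After a failure at index 2, both earlier rules are deleted and live_rules() reports 0.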

net/mlx5: Split function_setup() to enable and open functions [+ + +]

Author: Shay Drory <[email protected]>
Date:   Wed May 3 12:08:48 2023 +0300

    net/mlx5: Split function_setup() to enable and open functions
    
    [ Upstream commit 2059cf51f318681a4cdd3eb1a01a2d62b6a9c442 ]
    
    mlx5_cmd_init_hca() takes ~0.2 seconds. A user who wants to disable
    some of the SF aux devices at large scale, 1K SFs for example, will
    waste more than 3 minutes on mlx5_cmd_init_hca(), which isn't needed
    at this stage.
    
    A downstream patch will change SFs which are probed over the E-switch,
    local SFs, to be probed without any aux dev. In order to support this,
    split function_setup() to avoid executing mlx5_cmd_init_hca().
    
    Signed-off-by: Shay Drory <[email protected]>
    Reviewed-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Stable-dep-of: c8b3f38d2dae ("net/mlx5: Always stop health timer during driver removal")
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Stop waiting for PCI if pci channel is offline [+ + +]

Author: Moshe Shemesh <[email protected]>
Date:   Tue Jun 4 00:04:42 2024 +0300

    net/mlx5: Stop waiting for PCI if pci channel is offline
    
    [ Upstream commit 33afbfcc105a572159750f2ebee834a8a70fdd96 ]
    
    If the pci channel goes offline, the driver should not wait for PCI
    reads during the health dump and recovery flow. The driver has a timeout
    for each of the loops trying to read PCI, so it would fail anyway.
    However, during recovery, waiting until the timeout may cause the pci
    error_detected() callback to miss the pci_dpc_recovered() wait timeout.
    
    Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues")
    Signed-off-by: Moshe Shemesh <[email protected]>
    Reviewed-by: Shay Drori <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Stop waiting for PCI up if teardown was triggered [+ + +]

Author: Moshe Shemesh <[email protected]>
Date:   Mon Mar 13 22:42:21 2023 -0700

    net/mlx5: Stop waiting for PCI up if teardown was triggered
    
    [ Upstream commit 8ff38e730c3f5ee717f25365ef8aa4739562d567 ]
    
    If driver teardown is called while PCI is turned off, there is a race
    between health recovery and teardown. If health recovery already started
    it will wait 60 sec trying to see if PCI gets back and it can recover,
    but actually there is no need to wait anymore once teardown was called.
    
    Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break
    waiting for PCI up.
    
    Signed-off-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: 33afbfcc105a ("net/mlx5: Stop waiting for PCI if pci channel is offline")
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets [+ + +]

Author: Gal Pressman <[email protected]>
Date:   Thu Jun 6 23:32:49 2024 +0300

    net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
    
    [ Upstream commit 791b4089e326271424b78f2fae778b20e53d071b ]
    
    Move the vxlan_features_check() call to after we verified the packet is
    a tunneled VXLAN packet.
    
    Without this, tunneled UDP non-VXLAN packets (e.g. GENEVE) might
    wrongly not get offloaded.
    In some cases, it worked by chance as the GENEVE header is the same size
    as VXLAN, but it is obviously incorrect.
    
    Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support")
    Signed-off-by: Gal Pressman <[email protected]>
    Reviewed-by: Dragos Tatulea <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Wojciech Drewek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/ncsi: Fix the multi thread manner of NCSI driver [+ + +]

Author: DelphineCCChiu <[email protected]>
Date:   Wed May 29 14:58:55 2024 +0800

    net/ncsi: Fix the multi thread manner of NCSI driver
    
    [ Upstream commit e85e271dec0270982afed84f70dc37703fcc1d52 ]
    
    Currently the NCSI driver sends several NCSI commands back to back in
    some states, without waiting for the response to the previous NCSI
    command or for its timeout, when the NIC has multiple channels. This
    violates the single thread manner defined by the NCSI SPEC (section
    6.3.2.3 in DSP0222_1.1.1).
    
    According to the NCSI SPEC (section 6.2.13.1 in DSP0222_1.1.1), we should
    probe one channel at a time by sending NCSI commands (Clear initial state,
    Get version ID, Get capabilities...), then repeat these steps until the
    max number of channels which we got from the NCSI command (Get
    capabilities) has been probed.
    
    Fixes: e6f44ed6d04d ("net/ncsi: Package and channel management")
    Signed-off-by: DelphineCCChiu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/ncsi: Simplify Kconfig/dts control flow [+ + +]

Author: Peter Delevoryas <[email protected]>
Date:   Tue Nov 14 10:07:33 2023 -0600

    net/ncsi: Simplify Kconfig/dts control flow
    
    [ Upstream commit c797ce168930ce3d62a9b7fc4d7040963ee6a01e ]
    
    Background:
    
    1. CONFIG_NCSI_OEM_CMD_KEEP_PHY
    
    If this is enabled, we send an extra OEM Intel command in the probe
    sequence immediately after discovering a channel (e.g. after "Clear
    Initial State").
    
    2. CONFIG_NCSI_OEM_CMD_GET_MAC
    
    If this is enabled, we send one of 3 OEM "Get MAC Address" commands from
    Broadcom, Mellanox (Nvidia), and Intel in the *configuration* sequence
    for a channel.
    
    3. mellanox,multi-host (or mlx,multi-host)
    
    Introduced by this patch:
    
    https://lore.kernel.org/all/[email protected]/
    
    Which was actually originally from [email protected]:
    
    https://github.com/facebook/openbmc-linux/commit/9f132a10ec48db84613519258cd8a317fb9c8f1b
    
    Cosmo claimed that the Nvidia ConnectX-4 and ConnectX-6 NICs don't
    respond to Get Version ID, et al. in the probe sequence unless you send
    the Set MC Affinity command first.
    
    Problem Statement:
    
    We've been using a combination of #ifdef code blocks and IS_ENABLED()
    conditions to conditionally send these OEM commands.
    
    It makes adding any new code around these commands hard to understand.
    
    Solution:
    
    In this patch, I just want to remove the conditionally compiled blocks
    of code, and always use IS_ENABLED(...) to do dynamic control flow.
    
    I don't think the small amount of code this adds to non-users of the OEM
    Kconfigs is a big deal.
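
The control-flow style described in the Solution can be sketched in plain C. This is only an illustration: FEATURE_KEEP_PHY, enum probe_step and next_probe_step() are hypothetical names, and a real kernel build would test a Kconfig symbol with IS_ENABLED() rather than a hand-written macro.

```c
/*
 * Hypothetical build-time switch. In the kernel this role is played by
 * IS_ENABLED(CONFIG_...); here it is an ordinary 0/1 macro.
 */
#ifndef FEATURE_KEEP_PHY
#define FEATURE_KEEP_PHY 0
#endif

/* Hypothetical, shortened probe sequence. */
enum probe_step {
	STEP_CLEAR_INITIAL_STATE,
	STEP_OEM_KEEP_PHY,
	STEP_GET_VERSION_ID
};

/* Return the step that follows cur in the probe sequence. */
enum probe_step next_probe_step(enum probe_step cur)
{
	switch (cur) {
	case STEP_CLEAR_INITIAL_STATE:
		/*
		 * An ordinary C condition on a compile-time constant: both
		 * branches are always parsed and type-checked, and the
		 * optimizer is free to drop the dead one. An #ifdef block
		 * would hide the disabled branch from the compiler.
		 */
		if (FEATURE_KEEP_PHY)
			return STEP_OEM_KEEP_PHY;
		return STEP_GET_VERSION_ID;
	case STEP_OEM_KEEP_PHY:
		return STEP_GET_VERSION_ID;
	default:
		return cur;
	}
}
```

Building with -DFEATURE_KEEP_PHY=1 routes the sequence through the OEM step without changing the shape of the function.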
    
    Signed-off-by: Peter Delevoryas <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: e85e271dec02 ("net/ncsi: Fix the multi thread manner of NCSI driver")
    Signed-off-by: Sasha Levin <[email protected]>

net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue Jun 4 18:15:11 2024 +0000

    net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
    
    [ Upstream commit f921a58ae20852d188f70842431ce6519c4fdc36 ]
    
    If one TCA_TAPRIO_ATTR_PRIOMAP attribute has been provided,
    taprio_parse_mqprio_opt() must validate it, or userspace
    can inject arbitrary data to the kernel, the second time
    taprio_change() is called.
    
    First call (with valid attributes) sets dev->num_tc
    to a non zero value.
    
    Second call (with arbitrary mqprio attributes)
    returns early from taprio_parse_mqprio_opt()
    and bad things can happen.
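
A minimal userspace sketch of the ordering issue, under stated assumptions: struct prio_opts, MAX_TC, MAX_PRIO and parse_prio_opts() are hypothetical stand-ins, not the taprio code. The point is only that a supplied attribute is validated before any "already configured" shortcut is taken.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_TC   16	/* hypothetical limit on traffic classes */
#define MAX_PRIO 16	/* hypothetical size of the priority map */

/* Hypothetical stand-in for the user-supplied priority map attribute. */
struct prio_opts {
	unsigned int num_tc;
	unsigned char prio_tc_map[MAX_PRIO];
};

/*
 * opts is NULL when userspace did not supply the attribute.
 * dev_num_tc models the state left behind by an earlier successful call.
 * Returns 0 if the request is acceptable, -EINVAL otherwise.
 */
int parse_prio_opts(const struct prio_opts *opts, unsigned int dev_num_tc)
{
	size_t i;

	/* Only a missing attribute may rely on the earlier configuration. */
	if (!opts)
		return dev_num_tc ? 0 : -EINVAL;

	/*
	 * The buggy ordering returned 0 here whenever dev_num_tc != 0,
	 * so the checks below never ran on a second call.
	 */
	if (opts->num_tc == 0 || opts->num_tc > MAX_TC)
		return -EINVAL;

	for (i = 0; i < MAX_PRIO; i++) {
		if (opts->prio_tc_map[i] >= opts->num_tc)
			return -EINVAL;
	}

	return 0;
}
```

With this ordering, a second call that carries a malformed map is rejected even though dev_num_tc is already non-zero.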
    
    Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
    Reported-by: Noam Rathaus <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/smc: avoid overwriting when adjusting sock bufsizes [+ + +]

Author: Wen Gu <[email protected]>
Date:   Fri May 31 16:54:17 2024 +0800

    net/smc: avoid overwriting when adjusting sock bufsizes
    
    [ Upstream commit fb0aa0781a5f457e3864da68af52c3b1f4f7fd8f ]
    
    When copying smc settings to clcsock, avoid setting clcsock's sk_sndbuf
    to sysctl_tcp_wmem[1], since this may overwrite the value set by
    tcp_sndbuf_expand() in TCP connection establishment.
    
    The other assignments of sk_{snd|rcv}buf to sysctl values in
    smc_adjust_sock_bufsizes() can also be omitted, since the initialization
    of the smc sock and clcsock has already set sk_{snd|rcv}buf to
    smc.sysctl_{w|r}mem or ipv4_sysctl_tcp_{w|r}mem[1].
    
    Fixes: 30c3c4a4497c ("net/smc: Use correct buffer sizes when switching between TCP and SMC")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wen Gu <[email protected]>
    Reviewed-by: Wenjia Zhang <[email protected]>
    Reviewed-by: Gerd Bayer <[email protected]>, too.
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: bridge: mst: fix suspicious rcu usage in br_mst_set_state [+ + +]

Author: Nikolay Aleksandrov <[email protected]>
Date:   Sun Jun 9 13:36:54 2024 +0300

    net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
    
    [ Upstream commit 546ceb1dfdac866648ec959cbc71d9525bd73462 ]
    
    I converted br_mst_set_state to RCU to avoid a vlan use-after-free
    but forgot to change the vlan group dereference helper. Switch to vlan
    group RCU deref helper to fix the suspicious rcu usage warning.
    
    Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
    Signed-off-by: Nikolay Aleksandrov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state [+ + +]

Author: Nikolay Aleksandrov <[email protected]>
Date:   Sun Jun 9 13:36:53 2024 +0300

    net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
    
    [ Upstream commit 36c92936e868601fa1f43da6758cf55805043509 ]
    
    Pass the already obtained vlan group pointer to br_mst_vlan_set_state()
    instead of dereferencing it again. Each caller has already correctly
    dereferenced it for their context. This change is required for the
    following suspicious RCU dereference fix. No functional changes
    intended.
    
    Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
    Signed-off-by: Nikolay Aleksandrov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: add cond_resched() to hns3 ring buffer init process [+ + +]

Author: Jie Wang <[email protected]>
Date:   Wed Jun 5 15:20:58 2024 +0800

    net: hns3: add cond_resched() to hns3 ring buffer init process
    
    [ Upstream commit 968fde83841a8c23558dfbd0a0c69d636db52b55 ]
    
    Currently the hns3 ring buffer init process can hold the cpu too long
    with a big Tx/Rx ring depth. This could cause a soft lockup.
    
    So this patch adds cond_resched() to the process. The cpu can then
    switch to other tasks instead of busy looping.
    
    Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters")
    Signed-off-by: Jie Wang <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hns3: fix kernel crash problem in concurrent scenario [+ + +]

Author: Yonglong Liu <[email protected]>
Date:   Wed Jun 5 15:20:57 2024 +0800

    net: hns3: fix kernel crash problem in concurrent scenario
    
    [ Upstream commit 12cda920212a49fa22d9e8b9492ac4ea013310a4 ]
    
    When the link status changes, the nic driver needs to notify the roce
    driver to handle this event, but at this time the roce driver
    may be uninitializing, which causes a kernel crash.
    
    To fix the problem, check whether roce is registered when the link
    status changes, and wait for the link update to finish when
    uninitializing.
    
    Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
    Signed-off-by: Yonglong Liu <[email protected]>
    Signed-off-by: Jijie Shao <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP [+ + +]

Author: Kory Maincent <[email protected]>
Date:   Mon Jun 10 10:34:26 2024 +0200

    net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
    
    [ Upstream commit 144ba8580bcb82b2686c3d1a043299d844b9a682 ]
    
    ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP as reported by
    checkpatch script.
    
    Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment")
    Reviewed-by: Andrew Lunn <[email protected]>
    Acked-by: Oleksij Rempel <[email protected]>
    Signed-off-by: Kory Maincent <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sched: sch_multiq: fix possible OOB write in multiq_tune() [+ + +]

Author: Hangyu Hua <[email protected]>
Date:   Mon Jun 3 15:13:03 2024 +0800

    net: sched: sch_multiq: fix possible OOB write in multiq_tune()
    
    [ Upstream commit affc18fdc694190ca7575b9a86632a73b9fe043d ]
    
    q->bands is set to qopt->bands after the kmalloc, and the subsequent
    code logic runs with the new value. So the old q->bands should not be
    used to size the kmalloc. Otherwise, an out-of-bounds write will occur.
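
The sizing rule can be sketched in userspace C. Everything here is an assumption made for illustration: struct mq and mq_tune() are hypothetical, and calloc() stands in for kmalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified queue state: slots [bands, max_bands) are inactive. */
struct mq {
	int bands;	/* currently active bands */
	int max_bands;	/* total number of slots */
	int *slots;	/* max_bands entries; non-zero means "occupied" */
};

/*
 * Set the active band count to new_bands and collect the occupied slots
 * that fall outside the new active range. Returns the number collected,
 * or -1 on invalid arguments or allocation failure. The caller frees
 * *removed_out.
 */
int mq_tune(struct mq *q, int new_bands, int **removed_out)
{
	size_t cap;
	int *removed;
	int i, n = 0;

	assert(q != NULL && removed_out != NULL);
	if (new_bands < 0 || new_bands > q->max_bands)
		return -1;

	/*
	 * Size the array from new_bands. Sizing it from the old q->bands
	 * under-allocates whenever new_bands < q->bands, and the loop
	 * below would then write past the end of the array.
	 */
	cap = (size_t)(q->max_bands - new_bands);
	removed = calloc(cap ? cap : 1, sizeof(*removed));
	if (!removed)
		return -1;

	q->bands = new_bands;
	for (i = q->bands; i < q->max_bands; i++) {
		if (q->slots[i]) {
			removed[n++] = q->slots[i];
			q->slots[i] = 0;
		}
	}

	*removed_out = removed;
	return n;
}
```

Shrinking from four active bands to one collects up to three entries, which fits only because the array was sized from the new band count.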
    
    Fixes: c2999f7fb05b ("net: sched: multiq: don't call qdisc_put() while holding tree lock")
    Signed-off-by: Hangyu Hua <[email protected]>
    Acked-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sfp: Always call `sfp_sm_mod_remove()` on remove [+ + +]

Author: Csókás, Bence <[email protected]>
Date:   Wed Jun 5 10:42:51 2024 +0200

    net: sfp: Always call `sfp_sm_mod_remove()` on remove
    
    [ Upstream commit e96b2933152fd87b6a41765b2f58b158fde855b6 ]
    
    If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will
    not be run. As a consequence, `sfp_hwmon_remove()` is not getting
    run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()`
    itself checks `sfp->sm_mod_state` anyways, so this check was not
    really needed in the first place.
    
    Fixes: d2e816c0293f ("net: sfp: handle module remove outside state machine")
    Signed-off-by: "Csókás, Bence" <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters [+ + +]

Author: Xiaolei Wang <[email protected]>
Date:   Sat Jun 8 22:35:24 2024 +0800

    net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
    
    [ Upstream commit be27b896529787e23a35ae4befb6337ce73fcca0 ]
    
    The current cbs parameter depends on speed after uplinking,
    which is not needed and will report a configuration error
    if the port is not initially connected. The UAPI exposed by
    tc-cbs requires userspace to recalculate the send slope anyway,
    because the formula depends on port_transmit_rate (see man tc-cbs),
    which is not an invariant from tc's perspective. Therefore, we
    use offload->sendslope and offload->idleslope to derive the
    original port_transmit_rate from the CBS formula.
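
As a sketch of that derivation: tc-cbs(8) defines sendslope = idleslope - port_transmit_rate, so the rate can be recovered as idleslope - sendslope. The helper name below is hypothetical and the units (kbit/s) follow the tc-cbs convention; this is not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Recover the port transmit rate (kbit/s) from the two CBS slopes.
 * idleslope is non-negative and sendslope is non-positive in a valid
 * configuration, so the result is their distance.
 */
int64_t cbs_port_transmit_rate_kbps(int32_t idleslope, int32_t sendslope)
{
	assert(idleslope >= 0 && sendslope <= 0);

	/* Widen before subtracting so the arithmetic cannot overflow. */
	return (int64_t)idleslope - (int64_t)sendslope;
}
```

For instance, idleslope 98688 and sendslope -901312 give 1000000 kbit/s, i.e. a 1 Gbit/s port, without consulting the negotiated link speed.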
    
    Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC")
    Signed-off-by: Xiaolei Wang <[email protected]>
    Reviewed-by: Wojciech Drewek <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: wwan: iosm: Fix tainted pointer delete is case of region creation fail [+ + +]

Author: Aleksandr Mishin <[email protected]>
Date:   Tue Jun 4 11:25:00 2024 +0300

    net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
    
    [ Upstream commit b0c9a26435413b81799047a7be53255640432547 ]
    
    If region creation fails in ipc_devlink_create_region(), the deletion of
    previously created regions starts from a tainted pointer which actually
    holds an error code value.
    Fix this bug by decreasing the region index before deleting.
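
The unwind pattern can be sketched in userspace C. All names are hypothetical: REGION_ERR stands in for an ERR_PTR()-style value, and region_create()/region_destroy() are toy helpers rather than the devlink API.

```c
#include <assert.h>
#include <stdlib.h>

#define NREGIONS 4

/* Hypothetical error sentinel, standing in for an ERR_PTR()-style value. */
static char region_err;
#define REGION_ERR ((void *)&region_err)

static int live_regions;	/* regions currently allocated */

/* Hypothetical creator: fails at index fail_at by returning REGION_ERR. */
static void *region_create(int idx, int fail_at)
{
	void *r;

	if (idx == fail_at)
		return REGION_ERR;
	r = malloc(1);
	if (!r)
		return REGION_ERR;
	live_regions++;
	return r;
}

static void region_destroy(void *r)
{
	assert(r != REGION_ERR);	/* destroying the sentinel is the bug */
	free(r);
	live_regions--;
}

/* Create all regions; on failure, destroy only those that were created. */
int create_regions(void *regions[NREGIONS], int fail_at)
{
	int i;

	for (i = 0; i < NREGIONS; i++) {
		regions[i] = region_create(i, fail_at);
		if (regions[i] == REGION_ERR) {
			/* Step back first: slot i holds an error value, not a region. */
			for (i--; i >= 0; i--)
				region_destroy(regions[i]);
			return -1;
		}
	}
	return 0;
}

/* Number of regions still allocated, so callers can check for leaks. */
int regions_alive(void)
{
	return live_regions;
}
```

Starting the inner loop at i instead of i - 1 would hand REGION_ERR to region_destroy(), which is the failure mode the entry describes.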
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 4dcd183fbd67 ("net: wwan: iosm: devlink registration")
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Acked-by: Sergey Ryazanov <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type [+ + +]

Author: Jozsef Kadlecsik <[email protected]>
Date:   Tue Jun 4 15:58:03 2024 +0200

    netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
    
    [ Upstream commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10 ]
    
    Lion Ackermann reported that there is a race condition between namespace cleanup
    in ipset and the garbage collection of the list:set type. The namespace
    cleanup can destroy the list:set type of sets while the gc of the set type is
    waiting to run in rcu cleanup. The latter uses data from the destroyed set which
    thus leads to a use-after-free. The patch contains the following parts:
    
    - When destroying all sets, first remove the garbage collectors, then wait
      if needed and then destroy the sets.
    - Fix the badly ordered "wait then remove gc" for the destroy a single set
      case.
    - Fix the missing rcu locking in the list:set type in the userspace test
      case.
    - Use proper RCU list handling in the list:set type.
    
    The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).
    
    Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation)
    Reported-by: Lion Ackermann <[email protected]>
    Tested-by: Lion Ackermann <[email protected]>
    Signed-off-by: Jozsef Kadlecsik <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFS: add barriers when testing for NFS_FSDATA_BLOCKED [+ + +]

Author: NeilBrown <[email protected]>
Date:   Tue May 28 13:27:17 2024 +1000

    NFS: add barriers when testing for NFS_FSDATA_BLOCKED
    
    [ Upstream commit 99bc9f2eb3f79a2b4296d9bf43153e1d10ca50d3 ]
    
    dentry->d_fsdata is set to NFS_FSDATA_BLOCKED while unlinking or
    renaming-over a file to ensure that no open succeeds while the NFS
    operation is in progress on the server.
    
    Setting dentry->d_fsdata to NFS_FSDATA_BLOCKED is done under ->d_lock
    after checking the refcount is not elevated.  Any attempt to open the
    file (through that name) will go through lookup_open() which will take
    ->d_lock while incrementing the refcount, so we can be sure that once
    the new value is set, __nfs_lookup_revalidate() *will* see the new value
    and will block.
    
    We don't have any locking guarantee that when we set ->d_fsdata to NULL,
    the wait_var_event() in __nfs_lookup_revalidate() will notice.
    wait/wake primitives do NOT provide barriers to guarantee order.  We
    must use smp_load_acquire() in wait_var_event() to ensure we look at an
    up-to-date value, and must use smp_store_release() before wake_up_var().
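
The release/acquire pairing can be sketched with C11 atomics as a userspace analogue. This is an assumption-laden illustration: struct entry, BLOCKED, block(), unblock() and still_blocked() are hypothetical, and the C11 orderings only approximate the kernel's smp_store_release() and smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel playing the role of NFS_FSDATA_BLOCKED. */
static int blocked_marker;
#define BLOCKED ((void *)&blocked_marker)

/* Hypothetical stand-in for the structure holding the tested field. */
struct entry {
	_Atomic(void *) fsdata;
};

/* Mark the entry as blocked. */
void block(struct entry *e)
{
	assert(e != NULL);
	atomic_store_explicit(&e->fsdata, BLOCKED, memory_order_release);
}

/*
 * Clear the marker with release ordering. A real implementation would
 * wake the waiter only after this store.
 */
void unblock(struct entry *e)
{
	assert(e != NULL);
	atomic_store_explicit(&e->fsdata, NULL, memory_order_release);
}

/*
 * Waiter-side condition. The acquire load pairs with the release store
 * in unblock(), so a waiter that observes NULL also observes everything
 * written before the marker was cleared.
 */
bool still_blocked(struct entry *e)
{
	assert(e != NULL);
	return atomic_load_explicit(&e->fsdata, memory_order_acquire) == BLOCKED;
}
```

A waiter would re-evaluate still_blocked() each time it is woken; the wake-up itself is not what provides the ordering.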
    
    This patch adds those barrier functions and factors out
    block_revalidate() and unblock_revalidate() for clarity.
    
    There is also a hypothetical bug in that if memory allocation fails
    (which never happens in practice) we might leave ->d_fsdata locked.
    This patch adds the missing call to unblock_revalidate().
    
    Reported-and-tested-by: Richard Kojedzinszky <[email protected]>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071501
    Fixes: 3c59366c207e ("NFS: don't unhash dentry during unlink/rename")
    Signed-off-by: NeilBrown <[email protected]>
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFSv4.1 enforce rootpath check in fs_location query [+ + +]

Author: Olga Kornievskaia <[email protected]>
Date:   Wed May 29 15:44:35 2024 -0400

    NFSv4.1 enforce rootpath check in fs_location query
    
    [ Upstream commit 28568c906c1bb5f7560e18082ed7d6295860f1c2 ]
    
    In commit 4ca9f31a2be66 ("NFSv4.1 test and add 4.1 trunking transport"),
    we introduce the ability to query the NFS server for possible trunking
    locations of the existing filesystem. However, we never checked the
    returned file system path for these alternative locations. According
    to the RFC, the server can say that the filesystem currently known
    under "fs_root" of fs_location also resides under these server
    locations under the following "rootpath" pathname. The client cannot
    handle trunking a filesystem that resides at a different location
    under a path other than the main path. This patch enforces the check
    that the fs_root path and the rootpath path in the fs_location reply
    are the same.
    
    Fixes: 4ca9f31a2be6 ("NFSv4.1 test and add 4.1 trunking transport")
    Signed-off-by: Olga Kornievskaia <[email protected]>
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors [+ + +]

Author: Ryusuke Konishi <[email protected]>
Date:   Tue Jun 4 22:42:55 2024 +0900

    nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
    
    [ Upstream commit 7373a51e7998b508af7136530f3a997b286ce81c ]
    
    The error handling in nilfs_empty_dir() when a directory folio/page read
    fails is incorrect, as in the old ext2 implementation, and if the
    folio/page cannot be read or nilfs_check_folio() fails, it will falsely
    determine the directory as empty and corrupt the file system.
    
    In addition, since nilfs_empty_dir() does not immediately return on a
    failed folio/page read, but continues to loop, this can cause a long loop
    with I/O if i_size of the directory's inode is also corrupted, causing the
    log writer thread to wait and hang, as reported by syzbot.
    
    Fix these issues by making nilfs_empty_dir() immediately return a false
    value (0) if it fails to get a directory folio/page.
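
A minimal sketch of the early-return rule, assuming a toy page reader: read_page_fn, read_zero_page(), read_failing_page() and dir_is_empty() are hypothetical and do not mirror the nilfs2 on-disk format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page reader: returns NULL on an I/O error. */
typedef const unsigned char *(*read_page_fn)(size_t index);

static const unsigned char zero_page[64];

/* Toy reader that always succeeds with an all-zero (unused) page. */
const unsigned char *read_zero_page(size_t index)
{
	(void)index;
	return zero_page;
}

/* Toy reader that always fails, modelling an I/O error. */
const unsigned char *read_failing_page(size_t index)
{
	(void)index;
	return NULL;
}

/*
 * Return 1 if every page contains only unused (zero) bytes, 0 otherwise.
 * A failed read returns 0 at once: the directory has not been shown to
 * be empty, and the loop must not keep issuing I/O.
 */
int dir_is_empty(read_page_fn read_page, size_t npages, size_t page_size)
{
	size_t i, off;

	assert(read_page != NULL);

	for (i = 0; i < npages; i++) {
		const unsigned char *p = read_page(i);

		if (!p)
			return 0;	/* the buggy variant continued here */

		for (off = 0; off < page_size; off++) {
			if (p[off])
				return 0;
		}
	}
	return 1;
}
```

Treating an unreadable page as "not empty" is the conservative answer, since the caller may go on to remove the directory.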
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=c8166c541d3971bf6c87
    Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
    Tested-by: Ryusuke Konishi <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nilfs2: fix potential kernel bug due to lack of writeback flag waiting [+ + +]

Author: Ryusuke Konishi <[email protected]>
Date:   Thu May 30 23:15:56 2024 +0900

    nilfs2: fix potential kernel bug due to lack of writeback flag waiting
    
    commit a4ca369ca221bb7e06c725792ac107f0e48e82e7 upstream.
    
    Destructive writes to a block device on which nilfs2 is mounted can cause
    a kernel bug in the folio/page writeback start routine or writeback end
    routine (__folio_start_writeback in the log below):
    
     kernel BUG at mm/page-writeback.c:3070!
     Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
     ...
     RIP: 0010:__folio_start_writeback+0xbaa/0x10e0
     Code: 25 ff 0f 00 00 0f 84 18 01 00 00 e8 40 ca c6 ff e9 17 f6 ff ff
      e8 36 ca c6 ff 4c 89 f7 48 c7 c6 80 c0 12 84 e8 e7 b3 0f 00 90 <0f>
      0b e8 1f ca c6 ff 4c 89 f7 48 c7 c6 a0 c6 12 84 e8 d0 b3 0f 00
     ...
     Call Trace:
      <TASK>
      nilfs_segctor_do_construct+0x4654/0x69d0 [nilfs2]
      nilfs_segctor_construct+0x181/0x6b0 [nilfs2]
      nilfs_segctor_thread+0x548/0x11c0 [nilfs2]
      kthread+0x2f0/0x390
      ret_from_fork+0x4b/0x80
      ret_from_fork_asm+0x1a/0x30
      </TASK>
    
    This is because when the log writer starts a writeback for segment summary
    blocks or a super root block that use the backing device's page cache, it
    does not wait for the ongoing folio/page writeback, resulting in an
    inconsistent writeback state.
    
    Fix this issue by waiting for ongoing writebacks when putting
    folios/pages on the backing device into writeback state.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Tested-by: Ryusuke Konishi <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nilfs2: return the mapped address from nilfs_get_page() [+ + +]

Author: Matthew Wilcox (Oracle) <[email protected]>
Date:   Mon Nov 27 23:30:25 2023 +0900

    nilfs2: return the mapped address from nilfs_get_page()
    
    [ Upstream commit 09a46acb3697e50548bb265afa1d79163659dd85 ]
    
    In preparation for switching from kmap() to kmap_local(), return the kmap
    address from nilfs_get_page() instead of having the caller look up
    page_address().
    
    [konishi.ryusuke: fixed a missing blank line after declaration]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
    Signed-off-by: Ryusuke Konishi <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Stable-dep-of: 7373a51e7998 ("nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors")
    Signed-off-by: Sasha Levin <[email protected]>

null_blk: Print correct max open zones limit in null_init_zoned_dev() [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Tue May 28 15:28:52 2024 +0900

    null_blk: Print correct max open zones limit in null_init_zoned_dev()
    
    commit 233e27b4d21c3e44eb863f03e566d3a22e81a7ae upstream.
    
    When changing the maximum number of open zones, print that number
    instead of the total number of zones.
    
    Fixes: dc4d137ee3b7 ("null_blk: add support for max open/active zone limit for zoned devices")
    Cc: [email protected]
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nvmet-passthru: propagate status from id override functions [+ + +]

Author: Daniel Wagner <[email protected]>
Date:   Wed Jun 12 16:02:40 2024 +0200

    nvmet-passthru: propagate status from id override functions
    
    [ Upstream commit d76584e53f4244dbc154bec447c3852600acc914 ]
    
    The id override functions return a status which is not propagated to the
    caller.
    
    Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
    Signed-off-by: Daniel Wagner <[email protected]>
    Reviewed-by: Chaitanya Kulkarni <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ocfs2: fix races between hole punching and AIO+DIO [+ + +]

Author: Su Yue <[email protected]>
Date:   Mon Apr 8 16:20:39 2024 +0800

    ocfs2: fix races between hole punching and AIO+DIO
    
    commit 952b023f06a24b2ad6ba67304c4c84d45bea2f18 upstream.
    
    After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block",
    fstests/generic/300 went from always failing to sometimes failing:
    
    ========================================================================
    [  473.293420 ] run fstests generic/300
    
    [  475.296983 ] JBD2: Ignoring recovery information on journal
    [  475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
    [  494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
    [  494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
    [  494.292018 ] OCFS2: File system is now read-only.
    [  494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
    [  494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
    fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
    =========================================================================
    
    In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten
    extents to a list.  Extents are also inserted into the extent tree in
    ocfs2_write_begin_nolock.  Then another thread calls fallocate to punch a
    hole at one of the unwritten extents.  The extent at cpos is removed by
    ocfs2_remove_extent().  In the end io worker thread,
    ocfs2_search_extent_list then finds no such extent at the cpos.
    
        T1                        T2                T3
                                  inode lock
                                    ...
                                    insert extents
                                    ...
                                  inode unlock
    ocfs2_fallocate
     __ocfs2_change_file_space
      inode lock
      lock ip_alloc_sem
      ocfs2_remove_inode_range inode
       ocfs2_remove_btree_range
        ocfs2_remove_extent
        ^---remove the extent at cpos 78723
      ...
      unlock ip_alloc_sem
      inode unlock
                                           ocfs2_dio_end_io
                                            ocfs2_dio_end_io_write
                                             lock ip_alloc_sem
                                             ocfs2_mark_extent_written
                                              ocfs2_change_extent_flag
                                               ocfs2_search_extent_list
                                               ^---failed to find extent
                                              ...
                                              unlock ip_alloc_sem
    
    In most filesystems, fallocate is not compatible with racing with AIO+DIO,
    so fix it by waiting for all dio to finish before fallocate/punch_hole,
    like ext4.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: b25801038da5 ("ocfs2: Support xfs style space reservation ioctls")
    Signed-off-by: Su Yue <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: use coarse time for new created files [+ + +]

Author: Su Yue <[email protected]>
Date:   Mon Apr 8 16:20:41 2024 +0800

    ocfs2: use coarse time for new created files
    
    commit b8cb324277ee16f3eca3055b96fce4735a5a41c6 upstream.
    
    The default atime related mount option is '-o relatime' which means file
    atime should be updated if atime <= ctime or atime <= mtime.  atime should
    be updated in the following scenario, but it is not:
    ==========================================================
    $ rm /mnt/testfile;
    $ echo test > /mnt/testfile
    $ stat -c "%X %Y %Z" /mnt/testfile
    1711881646 1711881646 1711881646
    $ sleep 5
    $ cat /mnt/testfile > /dev/null
    $ stat -c "%X %Y %Z" /mnt/testfile
    1711881646 1711881646 1711881646
    ==========================================================
    
    The reason the atime in the test is not updated is that ocfs2 calls
    ktime_get_real_ts64() in __ocfs2_mknod_locked during file creation.  Then
    inode_set_ctime_current() is called in ocfs2_acl_set_mode(), which calls
    ktime_get_coarse_real_ts64() to get the current time.
    
    ktime_get_real_ts64() is more accurate than ktime_get_coarse_real_ts64().
    In my test box, I saw that the ctime set by ktime_get_coarse_real_ts64() is
    less than the time from ktime_get_real_ts64() even though ctime is set
    later.  The ctime of the new inode is smaller than atime.
    
    The call trace is like:
    
    ocfs2_create
      ocfs2_mknod
        __ocfs2_mknod_locked
        ....
    
          ktime_get_real_ts64 <------- set atime,ctime,mtime, more accurate
          ocfs2_populate_inode
        ...
        ocfs2_init_acl
          ocfs2_acl_set_mode
            inode_set_ctime_current
              current_time
                ktime_get_coarse_real_ts64 <-------less accurate
    
    ocfs2_file_read_iter
      ocfs2_inode_lock_atime
        ocfs2_should_update_atime
          atime <= ctime ? <-------- false, ctime < atime due to accuracy
    
    So call ktime_get_coarse_real_ts64() to set the inode times while creating
    new files.  This may lower the accuracy of file times, but that is not a
    big deal since coarse time is already used in other places such as
    ocfs2_update_inode_atime() and inode_set_ctime_current().
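
    The effect of mixing the two clocks can be sketched with a toy model in
    Python.  This is not kernel code: the tick length, the timestamps, and
    the helper names are assumptions made for illustration, and only the
    "atime <= ctime" half of the check from the call trace is modelled.

```python
# Toy model, not kernel code: both clocks see the same underlying time,
# but the coarse clock only advances once per tick.
TICK_NS = 4_000_000  # hypothetical 4 ms tick, chosen for illustration


def fine_clock(now_ns: int) -> int:
    """Stand-in for ktime_get_real_ts64(): returns the exact time."""
    return now_ns


def coarse_clock(now_ns: int) -> int:
    """Stand-in for ktime_get_coarse_real_ts64(): time of the last tick."""
    return now_ns - now_ns % TICK_NS


def atime_le_ctime(atime: int, ctime: int) -> bool:
    """The comparison shown in the call trace above."""
    return atime <= ctime


create_ns = 10 * TICK_NS + 3_000_000  # inode created 3 ms after a tick
acl_ns = create_ns + 500_000          # ctime rewritten 0.5 ms later

# Before the fix: fine clock at creation, coarse clock for the later ctime.
buggy_needs_update = atime_le_ctime(fine_clock(create_ns), coarse_clock(acl_ns))

# After the fix: coarse clock in both places.
fixed_needs_update = atime_le_ctime(coarse_clock(create_ns), coarse_clock(acl_ns))

print(buggy_needs_update, fixed_needs_update)
```

    In this model the later ctime is stamped with the time of the previous
    tick, so it compares lower than the atime taken from the fine clock.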
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: c62c38f6b91b ("ocfs2: replace CURRENT_TIME macro")
    Signed-off-by: Su Yue <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

octeontx2-af: Always allocate PF entries from low prioriy zone [+ + +]

Author: Subbaraya Sundeep <[email protected]>
Date:   Wed May 29 20:59:44 2024 +0530

    octeontx2-af: Always allocate PF entries from low prioriy zone
    
    [ Upstream commit 8b0f7410942cdc420c4557eda02bfcdf60ccec17 ]
    
    PF mcam entries always have to be at low priority so that VFs
    can install longest prefix match rules at higher priority.
    This is already taken care of, but when priority allocation
    with respect to a reference entry is requested, entries are
    allocated from the mid-zone instead of the low priority zone.
    Fix this and always allocate entries from the low priority
    zone for PFs.
    
    Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF")
    Signed-off-by: Subbaraya Sundeep <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id [+ + +]

Author: Rick Wertenbroek <[email protected]>
Date:   Wed Apr 3 16:45:08 2024 +0200

    PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
    
    commit 2dba285caba53f309d6060fca911b43d63f41697 upstream.
    
    Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem
    Vendor ID are u16 variables and are written to a u32 register of the
    controller. The Subsystem Vendor ID was always 0 because the u16 value
    was masked incorrectly with GENMASK(31,16) resulting in all lower 16
    bits being set to 0 prior to the shift.
    
    Remove both masks as they are unnecessary and set the register correctly
    i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the
    Subsystem Vendor ID.
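
    The arithmetic can be checked with a short Python sketch.  The helper
    names and the example IDs are hypothetical, and the "buggy" expression
    is reconstructed from the description above rather than copied from
    the driver.

```python
def genmask(high: int, low: int) -> int:
    """Python equivalent of the kernel's GENMASK(high, low)."""
    return ((1 << (high - low + 1)) - 1) << low


def buggy_subsys_field(subsys_vendor_id: int) -> int:
    # A 16-bit value has no bits in positions 31..16, so the mask leaves 0
    # and the following shift has nothing to move into the upper half.
    return ((subsys_vendor_id & genmask(31, 16)) << 16) & 0xFFFFFFFF


def fixed_register(vendor_id: int, subsys_vendor_id: int) -> int:
    # Lower 16 bits: Vendor ID, upper 16 bits: Subsystem Vendor ID.
    return (vendor_id | (subsys_vendor_id << 16)) & 0xFFFFFFFF


print(hex(buggy_subsys_field(0xABCD)))
print(hex(fixed_register(0x1234, 0xABCD)))
```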
    
    This is documented in the RK3399 TRM section 17.6.7.1.17
    
    [kwilczynski: removed unnecessary newline]
    Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
    Link: https://lore.kernel.org/linux-pci/[email protected]
    Signed-off-by: Rick Wertenbroek <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Reviewed-by: Damien Le Moal <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf/core: Fix missing wakeup when waiting for context reference [+ + +]

Author: Haifeng Xu <[email protected]>
Date:   Mon May 13 10:39:48 2024 +0000

    perf/core: Fix missing wakeup when waiting for context reference
    
    commit 74751ef5c1912ebd3e65c3b65f45587e05ce5d36 upstream.
    
    In our production environment, we found many hung tasks which are
    blocked for more than 18 hours. Their call traces are like this:
    
    [346278.191038] __schedule+0x2d8/0x890
    [346278.191046] schedule+0x4e/0xb0
    [346278.191049] perf_event_free_task+0x220/0x270
    [346278.191056] ? init_wait_var_entry+0x50/0x50
    [346278.191060] copy_process+0x663/0x18d0
    [346278.191068] kernel_clone+0x9d/0x3d0
    [346278.191072] __do_sys_clone+0x5d/0x80
    [346278.191076] __x64_sys_clone+0x25/0x30
    [346278.191079] do_syscall_64+0x5c/0xc0
    [346278.191083] ? syscall_exit_to_user_mode+0x27/0x50
    [346278.191086] ? do_syscall_64+0x69/0xc0
    [346278.191088] ? irqentry_exit_to_user_mode+0x9/0x20
    [346278.191092] ? irqentry_exit+0x19/0x30
    [346278.191095] ? exc_page_fault+0x89/0x160
    [346278.191097] ? asm_exc_page_fault+0x8/0x30
    [346278.191102] entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The task was waiting for the refcount to become 1, but from the vmcore
    we found that the refcount was already 1. It seems that the task didn't
    get woken up by perf_event_release_kernel() and got stuck forever. The
    scenario below may cause the problem.
    
    Thread A                                        Thread B
    ...                                             ...
    perf_event_free_task                            perf_event_release_kernel
                                                       ...
                                                       acquire event->child_mutex
                                                       ...
                                                       get_ctx
       ...                                             release event->child_mutex
       acquire ctx->mutex
       ...
       perf_free_event (acquire/release event->child_mutex)
       ...
       release ctx->mutex
       wait_var_event
                                                       acquire ctx->mutex
                                                       acquire event->child_mutex
                                                       # move existing events to free_list
                                                       release event->child_mutex
                                                       release ctx->mutex
                                                       put_ctx
    ...                                             ...
    
    In this case, all events of the ctx have been freed, so we couldn't
    find the ctx in free_list and Thread A will miss the wakeup. It's thus
    necessary to add a wakeup after dropping the reference.
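
    A single-threaded toy model in Python shows why the waiter stays
    asleep.  The class and method names are hypothetical; the model only
    mirrors the idea that a waiter re-checks its condition when it is
    explicitly woken, as with wait_var_event() and wake_up_var().

```python
class CtxModel:
    """Toy stand-in for a context with a refcount and a single waiter."""

    def __init__(self, refcount: int) -> None:
        self.refcount = refcount
        self.waiter_asleep = False

    def wait_for_last_ref(self) -> None:
        # Sleep unless the condition (refcount == 1) already holds.
        if self.refcount != 1:
            self.waiter_asleep = True

    def put(self, wake: bool) -> None:
        # Drop one reference; optionally wake the waiter afterwards.
        self.refcount -= 1
        if wake and self.waiter_asleep and self.refcount == 1:
            # The woken waiter re-checks its condition and continues.
            self.waiter_asleep = False


def run(wake: bool) -> bool:
    ctx = CtxModel(refcount=2)   # Thread B still holds its get_ctx reference
    ctx.wait_for_last_ref()      # Thread A goes to sleep
    ctx.put(wake=wake)           # Thread B drops the reference
    return ctx.waiter_asleep


print(run(wake=False), run(wake=True))
```

    Without the wakeup the refcount reaches 1 while the waiter is still
    asleep, which matches the state seen in the vmcore.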
    
    Fixes: 1cf8dfe8a661 ("perf/core: Fix race between close() and fork()")
    Signed-off-by: Haifeng Xu <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Frederic Weisbecker <[email protected]>
    Acked-by: Mark Rutland <[email protected]>
    Cc: [email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

platform/x86: dell-smbios: Fix wrong token data in sysfs [+ + +]

Author: Armin Wolf <[email protected]>
Date:   Tue May 28 22:49:02 2024 +0200

    platform/x86: dell-smbios: Fix wrong token data in sysfs
    
    [ Upstream commit 1981b296f858010eae409548fd297659b2cc570e ]
    
    When reading token data from sysfs on my Inspiron 3505, the token
    locations and values are wrong. This happens because match_attribute()
    blindly assumes that all entries in da_tokens have an associated
    entry in token_attrs.
    
    This however is not true as soon as da_tokens[] contains zeroed
    token entries. Those entries are being skipped when initialising
    token_attrs, breaking the core assumption of match_attribute().
    
    Fix this by defining an extra struct for each pair of token attributes
    and use container_of() to retrieve token information.
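
    The index mismatch can be modelled in a few lines of Python.  The token
    values, attribute names, and helper functions are hypothetical; the
    dictionary plays the role that the per-token struct and container_of()
    play in the C code.

```python
from dataclasses import dataclass


@dataclass
class Token:
    token_id: int
    location: int


# Hypothetical table: the middle entry is zeroed and gets no attribute.
da_tokens = [Token(0x10, 0xA0), Token(0, 0), Token(0x30, 0xC0)]

# Old scheme: attributes are created only for non-zero tokens...
attr_names = [f"{t.token_id:04x}_location" for t in da_tokens if t.token_id != 0]


def old_match(name: str) -> Token:
    # ...but the lookup reuses the attribute index as a da_tokens index.
    return da_tokens[attr_names.index(name)]


# New scheme: every attribute is tied directly to its own token.
attr_to_token = {
    f"{t.token_id:04x}_location": t for t in da_tokens if t.token_id != 0
}


def new_match(name: str) -> Token:
    return attr_to_token[name]


print(old_match("0030_location"), new_match("0030_location"))
```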
    
    Tested on a Dell Inspiron 3050.
    
    Fixes: 33b9ca1e53b4 ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens")
    Signed-off-by: Armin Wolf <[email protected]>
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

powerpc/uaccess: Fix build errors seen with GCC 13/14 [+ + +]

Author: Michael Ellerman <[email protected]>
Date:   Wed May 29 22:30:28 2024 +1000

    powerpc/uaccess: Fix build errors seen with GCC 13/14
    
    commit 2d43cc701b96f910f50915ac4c2a0cae5deb734c upstream.
    
    Building ppc64le_defconfig with GCC 14 fails with assembler errors:
    
        CC      fs/readdir.o
      /tmp/ccdQn0mD.s: Assembler messages:
      /tmp/ccdQn0mD.s:212: Error: operand out of domain (18 is not a multiple of 4)
      /tmp/ccdQn0mD.s:226: Error: operand out of domain (18 is not a multiple of 4)
      ... [6 lines]
      /tmp/ccdQn0mD.s:1699: Error: operand out of domain (18 is not a multiple of 4)
    
    A snippet of the asm shows:
    
      # ../fs/readdir.c:210:         unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
             ld 9,0(29)       # MEM[(u64 *)name_38(D) + _88 * 1], MEM[(u64 *)name_38(D) + _88 * 1]
      # 210 "../fs/readdir.c" 1
             1:      std 9,18(8)     # put_user       # *__pus_addr_52, MEM[(u64 *)name_38(D) + _88 * 1]
    
    The 'std' instruction requires a 4-byte aligned displacement because
    it is a DS-form instruction, and as the assembler says, 18 is not a
    multiple of 4.
    
    A similar error is seen with GCC 13 and CONFIG_UBSAN_SIGNED_WRAP=y.
    
    The fix is to change the constraint on the memory operand to put_user(),
    from "m" which is a general memory reference to "YZ".
    
    The "Z" constraint is documented in the GCC manual PowerPC machine
    constraints, and specifies a "memory operand accessed with indexed or
    indirect addressing". "Y" is not documented in the manual but specifies
    a "memory operand for a DS-form instruction". Using both allows the
    compiler to generate a DS-form "std" or X-form "stdx" as appropriate.
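
    The encoding rule behind the assembler error can be written down as a
    small Python check.  The function names are hypothetical; the rule
    itself (a signed 16-bit displacement whose two low bits are zero) is
    the DS-form restriction described above.

```python
def fits_ds_form(displacement: int) -> bool:
    """True if the byte displacement can be encoded in a DS-form instruction.

    DS-form stores a 14-bit field with two implied zero low bits, so the
    displacement must be a multiple of 4 within the signed 16-bit range.
    """
    return displacement % 4 == 0 and -32768 <= displacement <= 32767


def pick_store(displacement: int) -> str:
    """Choose between the DS-form 'std' and the indexed X-form 'stdx'."""
    return "std" if fits_ds_form(displacement) else "stdx"


print(pick_store(16), pick_store(18))
```

    With a displacement of 18, as in the failing build, only the indexed
    form is a valid choice.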
    
    The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because
    the "Y" constraint does not guarantee 4-byte alignment when prefixed
    instructions are enabled.
    
    Unfortunately clang doesn't support the "Y" constraint so that has to be
    behind an ifdef.
    
    Although the build error is only seen with GCC 13/14, that appears
    to just be luck. The constraint has been incorrect since it was first
    added.
    
    Fixes: c20beffeec3c ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()")
    Cc: [email protected] # v5.10+
    Suggested-by: Kewen Lin <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ptp: Fix error message on failed pin verification [+ + +]

Author: Karol Kolacinski <[email protected]>
Date:   Tue Jun 4 14:05:27 2024 +0200

    ptp: Fix error message on failed pin verification
    
    [ Upstream commit 323a359f9b077f382f4483023d096a4d316fd135 ]
    
    On failed verification of PTP clock pin, error message prints channel
    number instead of pin index after "pin", which is incorrect.
    
    Fix error message by adding channel number to the message and printing
    pin number instead of channel number.
    
    Fixes: 6092315dfdec ("ptp: introduce programmable pins.")
    Signed-off-by: Karol Kolacinski <[email protected]>
    Acked-by: Richard Cochran <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs [+ + +]

Author: Beleswar Padhi <[email protected]>
Date:   Tue Apr 30 16:23:07 2024 +0530

    remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
    
    commit 3c8a9066d584f5010b6f4ba03bf6b19d28973d52 upstream.
    
    PSC controller has a limitation that it can only power-up the second
    core when the first core is in ON state. Power-state for core0 should be
    equal to or higher than core1.
    
    Therefore, prevent core1 from powering up before core0 during the start
    process from sysfs. Similarly, prevent core0 from shutting down before
    core1 has been shut down from sysfs.
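
    The ordering rule can be sketched as a toy state machine in Python.
    The class is hypothetical and the -EPERM return value is an assumption
    made for illustration; the real driver works on remoteproc state, not
    on a list of booleans.

```python
import errno


class R5Cluster:
    """Toy model of a two-core cluster; index 0 is core0, index 1 is core1."""

    def __init__(self) -> None:
        self.powered = [False, False]

    def start(self, core: int) -> int:
        if core == 1 and not self.powered[0]:
            return -errno.EPERM  # core1 may not power up before core0
        self.powered[core] = True
        return 0

    def stop(self, core: int) -> int:
        if core == 0 and self.powered[1]:
            return -errno.EPERM  # core0 may not shut down before core1
        self.powered[core] = False
        return 0


cluster = R5Cluster()
print(cluster.start(1), cluster.start(0), cluster.start(1))
print(cluster.stop(0), cluster.stop(1), cluster.stop(0))
```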
    
    Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem")
    Signed-off-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

remoteproc: k3-r5: Jump to error handling labels in start/stop errors [+ + +]

Author: Beleswar Padhi <[email protected]>
Date:   Mon May 6 19:48:49 2024 +0530

    remoteproc: k3-r5: Jump to error handling labels in start/stop errors
    
    commit 1dc7242f6ee0c99852cb90676d7fe201cf5de422 upstream.
    
    In case of errors during core start operation from sysfs, the driver
    directly returns with the -EPERM error code. Fix this to ensure that
    mailbox channels are freed on error before returning by jumping to the
    'put_mbox' error handling label. Similarly, jump to the 'out' error
    handling label to return with required -EPERM error code during the
    core stop operation from sysfs.
    
    Fixes: 3c8a9066d584 ("remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs")
    Signed-off-by: Beleswar Padhi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

remoteproc: k3-r5: Wait for core0 power-up before powering up core1 [+ + +]

Author: Apurva Nandan <[email protected]>
Date:   Tue Apr 30 16:23:06 2024 +0530

    remoteproc: k3-r5: Wait for core0 power-up before powering up core1
    
    commit 61f6f68447aba08aeaa97593af3a7d85a114891f upstream.
    
    PSC controller has a limitation that it can only power-up the second core
    when the first core is in ON state. Power-state for core0 should be equal
    to or higher than core1, else the kernel is seen hanging during rproc
    loading.
    
    Make the powering up of cores sequential, by waiting for the current core
    to power-up before proceeding to the next core, with a timeout of 2sec.
    Add a wait queue event in k3_r5_cluster_rproc_init call, that will wait
    for the current core to be released from reset before proceeding with the
    next core.
    
    Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem")
    Signed-off-by: Apurva Nandan <[email protected]>
    Signed-off-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "fork: defer linking file vma until vma is fully initialized" [+ + +]

Author: Sam James <[email protected]>
Date:   Fri Jun 14 09:40:28 2024 +0100

    Revert "fork: defer linking file vma until vma is fully initialized"
    
    This reverts commit 0c42f7e039aba3de6d7dbf92da708e2b2ecba557 which is commit
    35e351780fa9d8240dd6f7e4f245f9ea37e96c19 upstream.
    
    The backport is incomplete and causes xfstests failures. The consequences
    of the incomplete backport seem worse than the original issue, so pick
    the lesser evil and revert until a full backport is ready.
    
    Link: https://lore.kernel.org/stable/[email protected]/
    Reported-by: Leah Rumancik <[email protected]>
    Signed-off-by: Sam James <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

riscv: fix overlap of allocated page and PTR_ERR [+ + +]

Author: Nam Cao <[email protected]>
Date:   Thu Apr 25 13:52:01 2024 +0200

    riscv: fix overlap of allocated page and PTR_ERR
    
    commit 994af1825a2aa286f4903ff64a1c7378b52defe6 upstream.
    
    On riscv32, it is possible for the last page in virtual address space
    (0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that
    shouldn't happen.
    
    There is already some code to ensure memblock won't allocate the last page.
    However, buddy allocator is left unchecked.
    
    Fix this by reserving physical memory that would be mapped at virtual
    addresses greater than 0xfffff000.
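
    The overlap is easy to see numerically.  The Python sketch below mimics
    the kernel's IS_ERR_VALUE() test for a 32-bit unsigned long; the helper
    name is hypothetical, and MAX_ERRNO is 4095 as in the kernel.

```python
MAX_ERRNO = 4095
ULONG_MASK = 0xFFFFFFFF  # 32-bit unsigned long, as on riscv32


def is_err_value(addr: int) -> bool:
    """Mimics IS_ERR_VALUE(): true for the top MAX_ERRNO values."""
    return (addr & ULONG_MASK) >= (-MAX_ERRNO & ULONG_MASK)


last_page_start = 0xFFFFF000
print(hex(-MAX_ERRNO & ULONG_MASK))       # first value treated as an error
print(is_err_value(last_page_start))      # the page base itself is not
print(is_err_value(last_page_start + 1))  # every later byte is
```

    Any pointer into the last page other than its first byte would be
    mistaken for an error code.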
    
    Reported-by: Björn Töpel <[email protected]>
    Closes: https://lore.kernel.org/linux-riscv/[email protected]
    Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
    Signed-off-by: Nam Cao <[email protected]>
    Cc: <[email protected]>
    Tested-by: Björn Töpel <[email protected]>
    Reviewed-by: Björn Töpel <[email protected]>
    Reviewed-by: Mike Rapoport (IBM) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context [+ + +]

Author: Nam Cao <[email protected]>
Date:   Wed May 15 07:50:40 2024 +0200

    riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
    
    commit fb1cf0878328fe75d47f0aed0a65b30126fcefc4 upstream.
    
    __kernel_map_pages() is a debug function which clears the valid bit in page
    table entry for deallocated pages to detect illegal memory accesses to
    freed pages.
    
    This function sets/clears the valid bit using __set_memory(). __set_memory()
    acquires init_mm's semaphore, and this operation may sleep. This is
    problematic, because __kernel_map_pages() can be called in atomic context,
    where sleeping is illegal. An example warning that this causes:
    
    BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
    preempt_count: 2, expected: 0
    CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
    Hardware name: riscv-virtio,qemu (DT)
    Call Trace:
    [<ffffffff800060dc>] dump_backtrace+0x1c/0x24
    [<ffffffff8091ef6e>] show_stack+0x2c/0x38
    [<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
    [<ffffffff8092bb24>] dump_stack+0x14/0x1c
    [<ffffffff8003b7ac>] __might_resched+0x104/0x10e
    [<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
    [<ffffffff8093276a>] down_write+0x20/0x72
    [<ffffffff8000cf00>] __set_memory+0x82/0x2fa
    [<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
    [<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
    [<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
    [<ffffffff80011904>] copy_process+0x72c/0x17ec
    [<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
    [<ffffffff80012f62>] kernel_thread+0x82/0xa0
    [<ffffffff8003552c>] kthreadd+0x14a/0x1be
    [<ffffffff809357de>] ret_from_fork+0xe/0x1c
    
    Rewrite this function with apply_to_existing_page_range(). It is fine to
    not have any locking, because __kernel_map_pages() works with pages being
    allocated/deallocated and those pages are not changed by anyone else in the
    meantime.
    
    Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
    Signed-off-by: Nam Cao <[email protected]>
    Cc: [email protected]
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: mpi3mr: Fix ATA NCQ priority support [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Tue Jun 11 17:34:35 2024 +0900

    scsi: mpi3mr: Fix ATA NCQ priority support
    
    commit 90e6f08915ec6efe46570420412a65050ec826b2 upstream.
    
    The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to
    the HBA if a read or write command directed at an ATA device should be
    translated to an NCQ read/write command with the high priority bit set
    when the request uses the RT priority class and the user has enabled NCQ
    priority through sysfs.
    
    However, unlike the mpt3sas driver, the mpi3mr driver does not define
    the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so
    the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never
    actually set and NCQ Priority cannot ever be used.
    
    Fix this by defining these missing attributes to allow a user to check if
    an ATA device supports NCQ priority and to enable/disable the use of NCQ
    priority. To do this, lift the function scsih_ncq_prio_supp() out of the
    mpt3sas driver and make it the generic SCSI SAS transport function
    sas_ata_ncq_prio_supported(). Nothing in that function is hardware
    specific, so this function can be used in both the mpt3sas driver and
    the mpi3mr driver.
    
    Reported-by: Scott McCoy <[email protected]>
    Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing")
    Cc: [email protected]
    Signed-off-by: Damien Le Moal <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Niklas Cassel <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory [+ + +]

Author: Breno Leitao <[email protected]>
Date:   Wed Jun 5 01:55:29 2024 -0700

    scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
    
    commit 4254dfeda82f20844299dca6c38cbffcfd499f41 upstream.
    
    There is a potential out-of-bounds access when using test_bit() on a single
    word. The test_bit() and set_bit() functions operate on long values, and
    when testing or setting a single word, they can exceed the word
    boundary. KASAN detects this issue and produces a dump:
    
             BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
    
             Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965
    
    For full log, please look at [1].
    
    Make the allocation at least the size of sizeof(unsigned long) so that
    set_bit() and test_bit() have sufficient room for read/write operations
    without overwriting unallocated memory.
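
    The size arithmetic can be sketched in Python.  The helper names and
    the 16-bit example are hypothetical, and rounding up to a multiple of
    sizeof(unsigned long) is one way to satisfy "at least", assumed here
    for illustration.

```python
LONG_BYTES = 8  # sizeof(unsigned long) on a 64-bit kernel


def bytes_for_bits(nbits: int) -> int:
    """Smallest number of bytes that holds nbits."""
    return (nbits + 7) // 8


def long_access_end(bit: int) -> int:
    """Offset one past the last byte touched when `bit` is accessed via a long."""
    return (bit // (LONG_BYTES * 8) + 1) * LONG_BYTES


def round_up_to_long(nbytes: int) -> int:
    """Round an allocation size up to a multiple of sizeof(unsigned long)."""
    return (nbytes + LONG_BYTES - 1) // LONG_BYTES * LONG_BYTES


alloc = bytes_for_bits(16)                             # a 16-bit map: 2 bytes
overrun = long_access_end(3) > alloc                   # long access spans 8 bytes
safe = long_access_end(3) <= round_up_to_long(alloc)   # padded allocation

print(alloc, overrun, safe)
```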
    
    [1] Link: https://lore.kernel.org/all/[email protected]/
    
    Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path")
    Cc: [email protected]
    Suggested-by: Keith Busch <[email protected]>
    Signed-off-by: Breno Leitao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Keith Busch <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: sd: Use READ(16) when reading block zero on large capacity disks [+ + +]

Author: Martin K. Petersen <[email protected]>
Date:   Tue Jun 4 22:25:21 2024 -0400

    scsi: sd: Use READ(16) when reading block zero on large capacity disks
    
    commit 7926d51f73e0434a6250c2fd1a0555f98d9a62da upstream.
    
    Commit 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior
    to querying device properties") triggered a read to LBA 0 before
    attempting to inquire about device characteristics. This was done
    because some protocol bridge devices will return generic values until
    an attached storage device's media has been accessed.
    
    Pierre Tomon reported that this change caused problems on a large
    capacity external drive connected via a bridge device. The bridge in
    question does not appear to implement the READ(10) command.
    
    Issue a READ(16) instead of READ(10) when a device has been identified
    as preferring 16-byte commands (use_16_for_rw heuristic).
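
    For reference, the two command descriptor blocks differ in opcode and
    field widths.  The Python sketch below builds both for a one-block read
    of LBA 0; the function names are hypothetical and the flag, group, and
    control bytes are simply left at zero.

```python
import struct

READ_10 = 0x28
READ_16 = 0x88


def read10_cdb(lba: int, blocks: int) -> bytes:
    # opcode, flags, 32-bit LBA, group, 16-bit transfer length, control
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, blocks, 0)


def read16_cdb(lba: int, blocks: int) -> bytes:
    # opcode, flags, 64-bit LBA, 32-bit transfer length, group, control
    return struct.pack(">BBQIBB", READ_16, 0, lba, blocks, 0, 0)


def block_zero_cdb(use_16_for_rw: bool) -> bytes:
    """Pick the CDB for reading block zero based on the device preference."""
    return read16_cdb(0, 1) if use_16_for_rw else read10_cdb(0, 1)


print(block_zero_cdb(False).hex(), block_zero_cdb(True).hex())
```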
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=218890
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties")
    Cc: [email protected]
    Reported-by: Pierre Tomon <[email protected]>
    Suggested-by: Alan Stern <[email protected]>
    Tested-by: Pierre Tomon <[email protected]>
    Reviewed-by: Bart Van Assche <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests/ftrace: Fix to check required event file [+ + +]

Author: Masami Hiramatsu (Google) <[email protected]>
Date:   Tue May 21 09:00:22 2024 +0900

    selftests/ftrace: Fix to check required event file
    
    [ Upstream commit f6c3c83db1d939ebdb8c8922748ae647d8126d91 ]
    
    The dynevent/test_duplicates.tc test case uses the
    `syscalls/sys_enter_openat` event to define an eprobe on it. Since these
    `syscalls` events depend on CONFIG_FTRACE_SYSCALLS=y, the test will fail
    if that option is not set.
    
    Add the event file to the `required` line so that the test will return
    an `unsupported` result.
    
    Fixes: 297e1dcdca3d ("selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes")
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: compaction_test: fix bogus test success on Aarch64 [+ + +]

Author: Dev Jain <[email protected]>
Date:   Tue May 21 13:13:56 2024 +0530

    selftests/mm: compaction_test: fix bogus test success on Aarch64
    
    [ Upstream commit d4202e66a4b1fe6968f17f9f09bbc30d08f028a1 ]
    
    Patch series "Fixes for compaction_test", v2.
    
    The compaction_test memory selftest introduces fragmentation in memory
    and then tries to allocate as many hugepages as possible. This series
    addresses some problems.
    
    On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since
    compaction_index becomes 0, which is less than 3, due to no division by
    zero exception being raised. We fix that by checking for division by
    zero.
    
    Secondly, correctly set the number of hugepages to zero before trying
    to set a large number of them.
    
    Now, consider a situation in which, at the start of the test, a non-zero
    number of hugepages have been already set (while running the entire
    selftests/mm suite, or manually by the admin). The test operates on 80%
    of memory to avoid OOM-killer invocation, and because some memory is
    already blocked by hugepages, it would increase the chance of OOM-killing.
    Also, since mem_free used in check_compaction() is the value before we
    set nr_hugepages to zero, the chance that the compaction_index will
    be small is very high if the preset nr_hugepages was high, leading to a
    bogus test success.
    
    This patch (of 3):
    
    Currently, if at runtime we are not able to allocate a huge page, the test
    will trivially pass on Aarch64 due to no exception being raised on
    division by zero while computing compaction_index.  Fix that by checking
    for nr_hugepages == 0.  Anyways, in general, avoid a division by zero by
    exiting the program beforehand.  While at it, fix a typo, and handle the
    case where the number of hugepages may overflow an integer.
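
    The bogus pass can be reproduced with a small Python model.  Python
    raises on division by zero, so the AArch64 behaviour (integer division
    by zero yields 0) is emulated by a helper.  The function names, the
    input values, and the exact form of the threshold comparison are
    assumptions made for illustration.

```python
THRESHOLD = 3


def aarch64_udiv(a: int, b: int) -> int:
    """Integer division as AArch64 performs it: dividing by zero gives 0."""
    return 0 if b == 0 else a // b


def old_check(free_hugepage_units: int, nr_hugepages: int) -> bool:
    """Returns True when the test would report success (old behaviour)."""
    compaction_index = aarch64_udiv(free_hugepage_units, nr_hugepages)
    return compaction_index <= THRESHOLD


def fixed_check(free_hugepage_units: int, nr_hugepages: int) -> bool:
    """Same check, but failing early when no huge page was allocated."""
    if nr_hugepages == 0:
        return False
    return free_hugepage_units // nr_hugepages <= THRESHOLD


print(old_check(1000, 0), fixed_check(1000, 0), fixed_check(1000, 500))
```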
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
    Signed-off-by: Dev Jain <[email protected]>
    Cc: Anshuman Khandual <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: Sri Jayaramappa <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages [+ + +]

Author: Dev Jain <[email protected]>
Date:   Tue May 21 13:13:57 2024 +0530

    selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
    
    [ Upstream commit 9ad665ef55eaad1ead1406a58a34f615a7c18b5e ]
    
    Currently, the test tries to set nr_hugepages to zero, but that is not
    actually done because the file offset is not reset after read().  Fix that
    using lseek().
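
    The file-offset behaviour can be demonstrated from user space.  The
    Python sketch below uses a regular temporary file as a stand-in for
    nr_hugepages, which is an assumption: the procfs file handles writes
    differently, but the offset left behind by read() is the same idea.
    The function names and the sample contents are hypothetical.

```python
import os
import tempfile


def rewrite_value(path: str, new_value: bytes, rewind: bool) -> bytes:
    """Read the current value, write new_value, return the file contents."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.read(fd, 64)                   # advances the file offset
        if rewind:
            os.lseek(fd, 0, os.SEEK_SET)  # the fix: go back to the start
        os.write(fd, new_value)
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


def demo(rewind: bool) -> bytes:
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"128\n")
    try:
        return rewrite_value(tmp.name, b"0\n", rewind)
    finally:
        os.unlink(tmp.name)


print(demo(rewind=False), demo(rewind=True))
```

    Without the rewind the new value lands after the old one; with it the
    write starts at offset zero.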
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
    Signed-off-by: Dev Jain <[email protected]>
    Cc: <[email protected]>
    Cc: Anshuman Khandual <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: Sri Jayaramappa <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: conform test to TAP format output [+ + +]

Author: Muhammad Usama Anjum <[email protected]>
Date:   Mon Jan 1 13:36:12 2024 +0500

    selftests/mm: conform test to TAP format output
    
    [ Upstream commit 9a21701edc41465de56f97914741bfb7bfc2517d ]
    
    Conform the layout, informational and status messages to TAP.  No
    functional change is intended other than the layout of output messages.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Muhammad Usama Anjum <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64")
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: log a consistent test name for check_compaction [+ + +]

Author: Mark Brown <[email protected]>
Date:   Fri Feb 9 14:30:04 2024 +0000

    selftests/mm: log a consistent test name for check_compaction
    
    [ Upstream commit f3b7568c49420d2dcd251032c9ca1e069ec8a6c9 ]
    
    Every test result report in the compaction test prints a distinct log
    message, and some of the reports print a name that varies at runtime.  This
    causes problems for automation since a lot of automation software uses the
    printed string as the name of the test; if the name varies from run to run
    and from pass to fail then the automation software can't identify that a
    test changed result or that the same tests are being run.
    
    Refactor the logging to use a consistent name when printing the result of
    the test, printing the existing messages as diagnostic information instead
    so they are still available for people trying to interpret the results.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Cc: Muhammad Usama Anjum <[email protected]>
    Cc: Ryan Roberts <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64")
    Signed-off-by: Sasha Levin <[email protected]>

serial: 8250_dw: fall back to poll if there's no interrupt [+ + +]

Author: Jisheng Zhang <[email protected]>
Date:   Sun Aug 6 17:20:56 2023 +0800

    serial: 8250_dw: fall back to poll if there's no interrupt
    
    [ Upstream commit 22130dae0533c474e4e0db930a88caa9b397d083 ]
    
    When there's no irq (this can be due to various reasons, for example,
    no irq from HW support, or we just want to use poll solution, and so
    on), falling back to poll is still better than no support at all.
    
    Signed-off-by: Jisheng Zhang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
    Signed-off-by: Sasha Levin <[email protected]>

serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level [+ + +]

Author: Doug Brown <[email protected]>
Date:   Sun May 19 12:19:30 2024 -0700

    serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
    
    commit 5208e7ced520a813b4f4774451fbac4e517e78b2 upstream.
    
    The FIFO is 64 bytes, but the FCR is configured to fire the TX interrupt
    when the FIFO is half empty (bit 3 = 0). Thus, we should only write 32
    bytes when a TX interrupt occurs.
    
    This fixes a problem observed on the PXA168 that dropped a bunch of TX
    bytes during large transmissions.
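    
    The arithmetic can be illustrated with a small Python model. This is a
    sketch rather than driver code: the function name is hypothetical, and it
    assumes the interrupt fires when the FIFO is exactly half empty and that
    any byte written to a full FIFO is lost.

```python
def dropped_on_tx_irq(fifo_size: int, tx_loadsz: int) -> int:
    """Bytes lost when tx_loadsz bytes are written on a half-empty TX interrupt.

    Illustrative assumptions: the interrupt fires when the FIFO is exactly
    half empty, and bytes written beyond the free space are discarded.
    """
    free_slots = fifo_size // 2          # half empty => half of the FIFO is free
    return max(0, tx_loadsz - free_slots)


# Loading a full FIFO's worth (64) overflows by 32 bytes per interrupt,
# while loading 32 bytes fits exactly.
print(dropped_on_tx_irq(64, 64))  # 32
print(dropped_on_tx_irq(64, 32))  # 0
```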
    
    Fixes: ab28f51c77cd ("serial: rewrite pxa2xx-uart to use 8250_core")
    Signed-off-by: Doug Brown <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: stable <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

serial: core: Add UPIO_UNKNOWN constant for unknown port type [+ + +]

Author: Andy Shevchenko <[email protected]>
Date:   Mon Mar 4 14:27:03 2024 +0200

    serial: core: Add UPIO_UNKNOWN constant for unknown port type
    
    [ Upstream commit 79d713baf63c8f23cc58b304c40be33d64a12aaf ]
    
    In some APIs we would like to assign a special value to iotype
    and compare against it in other places. Introduce UPIO_UNKNOWN
    for this purpose.
    
    Note, we can't use 0, because it's a valid value for IO port access.
    
    Signed-off-by: Andy Shevchenko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
    Signed-off-by: Sasha Levin <[email protected]>

serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler [+ + +]

Author: Hugo Villeneuve <[email protected]>
Date:   Tue Apr 30 16:04:30 2024 -0400

    serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
    
    [ Upstream commit 8492bd91aa055907c67ef04f2b56f6dadd1f44bf ]
    
    When using a high speed clock with a low baud rate, the 4x prescaler is
    automatically selected if required. In that case, sc16is7xx_set_baud()
    properly configures the chip registers, but returns an incorrect baud
    rate by not taking into account the prescaler value. This incorrect baud
    rate is then fed to uart_update_timeout().
    
    For example, with an input clock of 80MHz, and a selected baud rate of 50,
    sc16is7xx_set_baud() will return 200 instead of 50.
    
    Fix this by first changing the prescaler variable to hold the selected
    prescaler value instead of the MCR bitfield. Then properly take into
    account the selected prescaler value in the return value computation.
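    
    The 80MHz example can be reproduced with a short Python model of the
    computation. It is a simplified sketch: set_baud() and its "fixed" flag
    are hypothetical names, plain integer division stands in for the driver's
    rounding, and register writes are left out.

```python
def set_baud(clk: int, baud: int, fixed: bool = True) -> int:
    """Model of the baud rate reported back after choosing the divisor.

    Simplified sketch: integer division replaces the driver's rounding and
    no chip registers are touched.
    """
    prescaler = 1
    div = clk // 16 // baud
    if div >= 1 << 16:          # the divisor must fit in 16 bits
        prescaler = 4
        div //= prescaler
    if fixed:
        return (clk // prescaler) // 16 // div
    return clk // 16 // div     # old behaviour: prescaler not accounted for


print(set_baud(80_000_000, 50, fixed=False))  # 200
print(set_baud(80_000_000, 50, fixed=True))   # 50
```

    With a divisor that already fits in 16 bits the prescaler stays at 1 and
    both variants return the same value.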
    
    Also add better documentation about the divisor value computation.
    
    Fixes: dfeae619d781 ("serial: sc16is7xx")
    Cc: [email protected]
    Signed-off-by: Hugo Villeneuve <[email protected]>
    Reviewed-by: Jiri Slaby <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

serial: sc16is7xx: replace hardcoded divisor value with BIT() macro [+ + +]

Author: Hugo Villeneuve <[email protected]>
Date:   Thu Dec 21 18:18:19 2023 -0500

    serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
    
    [ Upstream commit 2e57cefc4477659527f7adab1f87cdbf60ef1ae6 ]
    
    To better show why the limit is what it is, since we have only 16 bits for
    the divisor.
    
    Reviewed-by: Andy Shevchenko <[email protected]>
    Suggested-by: Andy Shevchenko <[email protected]>
    Signed-off-by: Hugo Villeneuve <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Stable-dep-of: 8492bd91aa05 ("serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler")
    Signed-off-by: Sasha Levin <[email protected]>

sock_map: avoid race between sock_map_close and sk_psock_put [+ + +]

Author: Thadeu Lima de Souza Cascardo <[email protected]>
Date:   Fri May 24 11:47:02 2024 -0300

    sock_map: avoid race between sock_map_close and sk_psock_put
    
    commit 4b4647add7d3c8530493f7247d11e257ee425bf0 upstream.
    
    sk_psock_get will return NULL if the refcount of psock has gone to 0, which
    will happen when the last call of sk_psock_put is done. However,
    sk_psock_drop may not have finished yet, so the close callback will still
    point to sock_map_close despite psock being NULL.
    
    This can be reproduced with a thread deleting an element from the sock map,
    while the second one creates a socket, adds it to the map and closes it.
    
    That will trigger the WARN_ON_ONCE:
    
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
    Modules linked in:
    CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
    RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
    Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02
    RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293
    RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000
    RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0
    RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3
    R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840
    R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870
    FS:  000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0
    Call Trace:
     <TASK>
     unix_release+0x87/0xc0 net/unix/af_unix.c:1048
     __sock_release net/socket.c:659 [inline]
     sock_close+0xbe/0x240 net/socket.c:1421
     __fput+0x42b/0x8a0 fs/file_table.c:422
     __do_sys_close fs/open.c:1556 [inline]
     __se_sys_close fs/open.c:1541 [inline]
     __x64_sys_close+0x7f/0x110 fs/open.c:1541
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7fb37d618070
    Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
    RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
    RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070
    RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004
    RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000
    R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     </TASK>
    
    Use sk_psock, which will only check that the pointer has not been set to
    NULL yet, which should only happen after the callbacks are restored. If,
    then, a reference can still be gotten, we may call sk_psock_stop and cancel
    psock->work.
    
    As suggested by Paolo Abeni, reorder the condition so the control flow is
    less convoluted.
    
    After that change, the reproducer does not trigger the WARN_ON_ONCE
    anymore.
    
    Suggested-by: Paolo Abeni <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=07a2e4a1a57118ef7355
    Fixes: aadb2bb83ff7 ("sock_map: Fix a potential use-after-free in sock_map_close()")
    Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself")
    Cc: [email protected]
    Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
    Acked-by: Jakub Sitnicki <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spmi: hisi-spmi-controller: Do not override device identifier [+ + +]

Author: Vamshi Gajjela <[email protected]>
Date:   Tue May 7 14:07:41 2024 -0700

    spmi: hisi-spmi-controller: Do not override device identifier
    
    commit eda4923d78d634482227c0b189d9b7ca18824146 upstream.
    
    The 'nr' member of struct spmi_controller serves as an identifier
    for the controller/bus. This value is a dynamic ID assigned in
    spmi_controller_alloc, and overriding it from the driver results in an
    ida_free error "ida_free called for id=xx which is not allocated".
    
    Signed-off-by: Vamshi Gajjela <[email protected]>
    Fixes: 70f59c90c819 ("staging: spmi: add Hikey 970 SPMI controller driver")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Stephen Boyd <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

SUNRPC: return proper error from gss_wrap_req_priv [+ + +]

Author: Chen Hanxiao <[email protected]>
Date:   Thu May 23 16:47:16 2024 +0800

    SUNRPC: return proper error from gss_wrap_req_priv
    
    [ Upstream commit 33c94d7e3cb84f6d130678d6d59ba475a6c489cf ]
    
    Don't return 0 if snd_buf->len is really greater than snd_buf->buflen.
    
    Signed-off-by: Chen Hanxiao <[email protected]>
    Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
    Reviewed-by: Benjamin Coddington <[email protected]>
    Reviewed-by: Chuck Lever <[email protected]>
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB [+ + +]

Author: Jason Xing <[email protected]>
Date:   Tue Jun 4 01:02:16 2024 +0800

    tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
    
    [ Upstream commit a46d0ea5c94205f40ecf912d1bb7806a8a64704f ]
    
    According to RFC 1213, we should also take CLOSE-WAIT sockets into
    consideration:
    
      "tcpCurrEstab OBJECT-TYPE
       ...
       The number of TCP connections for which the current state
       is either ESTABLISHED or CLOSE- WAIT."
    
    After this, CurrEstab counter will display the total number of
    ESTABLISHED and CLOSE-WAIT sockets.
    
    The logic of counting:
    When do we increment the counter?
    a) if we change the state to ESTABLISHED.
    b) if we change the state from SYN-RECEIVED to CLOSE-WAIT.
    
    When do we decrement the counter?
    a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT,
    say, on the client side, changing from ESTABLISHED to FIN-WAIT-1.
    b) if the socket leaves CLOSE-WAIT, say, on the server side, changing
    from CLOSE-WAIT to LAST-ACK.
    
    Please note: there are two old states from which a socket can change
    to CLOSE-WAIT in tcp_fin(). One is SYN-RECV, the other is ESTABLISHED.
    So we have to take care of the former case.
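    
    The counting rules above can be written down as a small Python model.
    This is only a sketch of the rules as stated in this message: the
    function name and the string state names are hypothetical, and the real
    state change code handles more than the counter.

```python
ESTABLISHED, SYN_RECV, CLOSE_WAIT = "ESTABLISHED", "SYN_RECV", "CLOSE_WAIT"
FIN_WAIT1, LAST_ACK, CLOSE = "FIN_WAIT1", "LAST_ACK", "CLOSE"


def curr_estab_delta(old: str, new: str) -> int:
    """Change applied to CurrEstab for a state transition old -> new."""
    if new == ESTABLISHED:
        return 1 if old != ESTABLISHED else 0
    if new == CLOSE_WAIT:
        # ESTABLISHED -> CLOSE-WAIT was already counted; SYN-RECV was not.
        return 1 if old == SYN_RECV else 0
    # Leaving either counted state for any other state.
    return -1 if old in (ESTABLISHED, CLOSE_WAIT) else 0


# Server side lifetime: the counter returns to its starting value.
path = [SYN_RECV, ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSE]
print(sum(curr_estab_delta(a, b) for a, b in zip(path, path[1:])))  # 0
```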
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Jason Xing <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tcp: fix race in tcp_v6_syn_recv_sock() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Jun 6 15:46:51 2024 +0000

    tcp: fix race in tcp_v6_syn_recv_sock()
    
    [ Upstream commit d37fe4255abe8e7b419b90c5847e8ec2b8debb08 ]
    
    tcp_v6_syn_recv_sock() calls ip6_dst_store() before
    inet_sk(newsk)->pinet6 has been set up.
    
    This means ip6_dst_store() writes over the parent (listener)
    np->dst_cookie.
    
    This is racy because multiple threads could share the same
    parent and their final np->dst_cookie could be wrong.
    
    Move ip6_dst_store() call after inet_sk(newsk)->pinet6
    has been changed and after the copy of parent ipv6_pinfo.
    
    Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

thunderbolt: debugfs: Fix margin debugfs node creation condition [+ + +]

Author: Aapo Vienamo <[email protected]>
Date:   Fri May 24 18:53:17 2024 +0300

    thunderbolt: debugfs: Fix margin debugfs node creation condition
    
    commit 985cfe501b74f214905ab4817acee0df24627268 upstream.
    
    The margin debugfs node controls the "Enable Margin Test" field of the
    lane margining operations. This field selects between either low or high
    voltage margin values for voltage margin test or left or right timing
    margin values for timing margin test.
    
    According to the USB4 specification, whether or not the "Enable Margin
    Test" control applies, depends on the values of the "Independent
    High/Low Voltage Margin" or "Independent Left/Right Timing Margin"
    capability fields for voltage and timing margin tests respectively. The
    pre-existing condition enabled the debugfs node also in the case where
    both low/high or left/right margins are returned, which is incorrect.
    This change only enables the debugfs node in question, if the specific
    required capability values are met.
    
    Signed-off-by: Aapo Vienamo <[email protected]>
    Fixes: d0f1e0c2a699 ("thunderbolt: Add support for receiver lane margining")
    Cc: [email protected]
    Signed-off-by: Mika Westerberg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device() [+ + +]

Author: Oleg Nesterov <[email protected]>
Date:   Tue May 28 14:20:19 2024 +0200

    tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()
    
    commit 07c54cc5988f19c9642fd463c2dbdac7fc52f777 upstream.
    
    After the recent commit 5097cbcb38e6 ("sched/isolation: Prevent boot crash
    when the boot CPU is nohz_full") the kernel no longer crashes, but there is
    another problem.
    
    In this case tick_setup_device() calls tick_take_do_timer_from_boot() to
    update tick_do_timer_cpu and this triggers the WARN_ON_ONCE(irqs_disabled)
    in smp_call_function_single().
    
    Kill tick_take_do_timer_from_boot() and just use WRITE_ONCE(), the new
    comment explains why this is safe (thanks Thomas!).
    
    Fixes: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full")
    Signed-off-by: Oleg Nesterov <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing/selftests: Fix kprobe event name test for .isra. functions [+ + +]

Author: Steven Rostedt (Google) <[email protected]>
Date:   Mon May 20 20:57:37 2024 -0400

    tracing/selftests: Fix kprobe event name test for .isra. functions
    
    commit 23a4b108accc29a6125ed14de4a044689ffeda78 upstream.
    
    The kprobe_eventname.tc test checks if a function with .isra. can have a
    kprobe attached to it. It loops through the kallsyms file for all the
    functions that have the .isra. name, and checks if it exists in the
    available_filter_functions file, and if it does, it uses it to attach a
    kprobe to it.
    
    The issue is that kprobes can not attach to functions that are listed more
    than once in available_filter_functions. With the latest kernel, the
    function that is found is: rapl_event_update.isra.0
    
      # grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions
      rapl_event_update.isra.0
      rapl_event_update.isra.0
    
    It is listed twice. This causes attaching the kprobe to it to fail, which
    in turn fails the test. Instead of just picking the first function that is
    found in available_filter_functions, pick the first one that is listed
    only once in available_filter_functions.
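    
    The selection rule can be sketched in Python as follows. The test itself
    is a shell script, so this is only an illustration of the idea: the
    function name is hypothetical, and "example_helper.isra.0" is a made-up
    symbol used for the demonstration.

```python
from collections import Counter


def pick_isra_function(kallsyms_names, filter_functions):
    """Return the first .isra. symbol listed exactly once, or None."""
    counts = Counter(filter_functions)
    for name in kallsyms_names:
        if ".isra." in name and counts[name] == 1:
            return name
    return None


kallsyms = ["rapl_event_update.isra.0", "example_helper.isra.0"]
available = [
    "rapl_event_update.isra.0",
    "rapl_event_update.isra.0",   # listed twice, so it is skipped
    "example_helper.isra.0",      # made-up name, listed once
]
print(pick_isra_function(kallsyms, available))  # example_helper.isra.0
```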
    
    Cc: [email protected]
    Fixes: 604e3548236d ("selftests/ftrace: Select an existing function in kprobe_eventname test")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Acked-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tty: n_tty: Fix buffer offsets when lookahead is used [+ + +]

Author: Ilpo Järvinen <[email protected]>
Date:   Tue May 14 17:04:29 2024 +0300

    tty: n_tty: Fix buffer offsets when lookahead is used
    
    commit b19ab7ee2c4c1ec5f27c18413c3ab63907f7d55c upstream.
    
    When lookahead has "consumed" some characters (la_count > 0),
    n_tty_receive_buf_standard() and n_tty_receive_buf_closing() for
    characters beyond the la_count are given wrong cp/fp offsets which
    leads to duplicating and losing some characters.
    
    If la_count > 0, correct buffer pointers and make count consistent too
    (the latter is not strictly necessary to fix the issue but seems more
    logical to adjust all variables immediately to keep state consistent).
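    
    The effect of the offsets can be shown with a short Python model. It is a
    sketch under assumptions: receive() is a hypothetical helper, a bytes
    slice stands in for the cp pointer, and the pre-fix branch assumes the
    second call kept the original start offset.

```python
def receive(cp: bytes, la_count: int, fixed: bool = True):
    """Return (lookahead_chunk, normal_chunk) as passed to the two calls.

    Sketch only: the pre-fix branch is an assumed model in which the second
    call keeps the original start offset.
    """
    count = len(cp)
    la_count = min(la_count, count)
    lookahead_chunk = cp[:la_count]
    offset = la_count if fixed else 0   # the fix advances the buffer pointer
    count -= la_count                   # count is made consistent as well
    normal_chunk = cp[offset:offset + count]
    return lookahead_chunk, normal_chunk


print(receive(b"abcdef", 2, fixed=False))  # (b'ab', b'abcd')
print(receive(b"abcdef", 2, fixed=True))   # (b'ab', b'cdef')
```

    In the assumed pre-fix case "ab" is delivered twice and "ef" is lost,
    which matches the duplication and loss described above.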
    
    Reported-by: Vadym Krevs <[email protected]>
    Fixes: 6bb6fa6908eb ("tty: Implement lookahead to process XON/XOFF timely")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218834
    Tested-by: Vadym Krevs <[email protected]>
    Cc: [email protected]
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb-storage: alauda: Check whether the media is initialized [+ + +]

Author: Shichao Lai <[email protected]>
Date:   Sun May 26 09:27:45 2024 +0800

    usb-storage: alauda: Check whether the media is initialized
    
    [ Upstream commit 16637fea001ab3c8df528a8995b3211906165a30 ]
    
    The member "uzonesize" of struct alauda_info will remain 0
    if alauda_init_media() fails, potentially causing divide errors
    in alauda_read_data() and alauda_write_lba().
    - Add a member "media_initialized" to struct alauda_info.
    - Change a condition in alauda_check_media() to ensure the
      first initialization.
    - Add an error check for the return value of alauda_init_media().
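    
    The three changes listed above can be modelled roughly in Python. This is
    a hedged sketch, not the driver logic: the class, the check_media()
    signature and both init helpers are hypothetical, and 1024 is an
    arbitrary illustrative zone size.

```python
class AlaudaInfo:
    def __init__(self):
        self.uzonesize = 0              # stays 0 until media init succeeds
        self.media_initialized = False  # the newly added member


def check_media(info, media_changed, init_media):
    """Return True when the media is usable, False on init failure."""
    if media_changed or not info.media_initialized:
        ok = init_media(info)           # return value is now checked
        info.media_initialized = ok
        if not ok:
            return False
    return True


def failing_init(info):
    return False


def working_init(info):
    info.uzonesize = 1024   # arbitrary illustrative value
    return True


info = AlaudaInfo()
print(check_media(info, False, failing_init))  # False
print(check_media(info, False, working_init))  # True
```

    Because a failed init is reported to the caller, code that divides by
    uzonesize is not reached while it is still 0.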
    
    Fixes: e80b0fade09e ("[PATCH] USB Storage: add alauda support")
    Reported-by: xingwei lee <[email protected]>
    Reported-by: yue sun <[email protected]>
    Reviewed-by: Alan Stern <[email protected]>
    Signed-off-by: Shichao Lai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages [+ + +]

Author: Alan Stern <[email protected]>
Date:   Thu Jun 13 21:30:43 2024 -0400

    USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
    
    commit 22f00812862564b314784167a89f27b444f82a46 upstream.
    
    The syzbot fuzzer found that the interrupt-URB completion callback in
    the cdc-wdm driver was taking too long, and the driver's immediate
    resubmission of interrupt URBs with -EPROTO status combined with the
    dummy-hcd emulation to cause a CPU lockup:
    
    cdc_wdm 1-1:1.0: nonzero urb status received: -71
    cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
    watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
    CPU#0 Utilization every 4s during lockup:
            #1:  98% system,          0% softirq,     3% hardirq,     0% idle
            #2:  98% system,          0% softirq,     3% hardirq,     0% idle
            #3:  98% system,          0% softirq,     3% hardirq,     0% idle
            #4:  98% system,          0% softirq,     3% hardirq,     0% idle
            #5:  98% system,          1% softirq,     3% hardirq,     0% idle
    Modules linked in:
    irq event stamp: 73096
    hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
    hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
    hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
    hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
    softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
    softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
    softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
    CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
    
    Testing showed that the problem did not occur if the two error
    messages -- the first two lines above -- were removed; apparently adding
    material to the kernel log takes a surprisingly large amount of time.
    
    In any case, the best approach for preventing these lockups and to
    avoid spamming the log with thousands of error messages per second is
    to ratelimit the two dev_err() calls.  Therefore we replace them with
    dev_err_ratelimited().
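    
    The idea of ratelimiting can be sketched with a simple window model in
    Python. This is an illustration only, not the kernel implementation: the
    class is hypothetical, and the 5 second interval and burst of 10 are
    values assumed here for the example.

```python
class RateLimit:
    """Allow at most `burst` messages per `interval` seconds (simple model)."""

    def __init__(self, interval: float = 5.0, burst: int = 10):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.printed = 0
        self.missed = 0

    def allow(self, now: float) -> bool:
        # Start a new window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False


rl = RateLimit()
# 1000 error messages within one second: only the first 10 get through.
allowed = sum(rl.allow(i * 0.001) for i in range(1000))
print(allowed, rl.missed)  # 10 990
```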
    
    Signed-off-by: Alan Stern <[email protected]>
    Suggested-by: Greg KH <[email protected]>
    Reported-and-tested-by: [email protected]
    Closes: https://lore.kernel.org/linux-usb/[email protected]/
    Reported-and-tested-by: [email protected]
    Closes: https://lore.kernel.org/linux-usb/[email protected]/
    Fixes: 9908a32e94de ("USB: remove err() macro from usb class drivers")
    Link: https://lore.kernel.org/linux-usb/[email protected]/
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete [+ + +]

Author: Wesley Cheng <[email protected]>
Date:   Mon Apr 8 18:40:59 2024 -0700

    usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete
    
    [ Upstream commit 24729b307eefcd7c476065cd7351c1a018082c19 ]
    
    FFS based applications can utilize the aio_cancel() callback to dequeue
    pending USB requests submitted to the UDC.  There is a scenario where the
    FFS application issues an AIO cancel call, while the UDC is handling a
    soft disconnect.  For a DWC3 based implementation, the callstack looks
    like the following:
    
        DWC3 Gadget                               FFS Application
    dwc3_gadget_soft_disconnect()              ...
      --> dwc3_stop_active_transfers()
        --> dwc3_gadget_giveback(-ESHUTDOWN)
          --> ffs_epfile_async_io_complete()   ffs_aio_cancel()
            --> usb_ep_free_request()            --> usb_ep_dequeue()
    
    There is currently no locking implemented between the AIO completion
    handler and AIO cancel, so the issue occurs if the completion routine is
    running in parallel to an AIO cancel call coming from the FFS application.
    As the completion call frees the USB request (io_data->req) the FFS
    application is also referencing it for the usb_ep_dequeue() call.  This can
    lead to accessing a stale/hanging pointer.
    
    commit b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently")
    relocated the usb_ep_free_request() into ffs_epfile_async_io_complete().
    However, in order to properly implement locking to mitigate this issue, the
    spinlock can't be added to ffs_epfile_async_io_complete(), as
    usb_ep_dequeue() (if successfully dequeuing a USB request) will call the
    function driver's completion handler in the same context, which would
    lead to a deadlock.
    
    Fix this issue by moving the usb_ep_free_request() back to
    ffs_user_copy_worker(), and ensuring that it explicitly sets io_data->req
    to NULL after freeing it within the ffs->eps_lock.  This resolves the race
    condition above, as the ffs_aio_cancel() routine will not continue
    attempting to dequeue a request that has already been freed, and
    ffs_user_copy_worker() will not free the USB request until the AIO cancel
    is done referencing it.
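    
    The locking pattern can be sketched in Python. This is a model under
    assumptions, not the gadget code: a threading.Lock stands in for
    ffs->eps_lock, the lists record what would be freed or dequeued, and the
    function signatures are hypothetical.

```python
import threading


class IoData:
    def __init__(self):
        self.req = object()   # stands in for the queued USB request


eps_lock = threading.Lock()   # stands in for ffs->eps_lock


def user_copy_worker(io_data, freed):
    """Free the request and clear the pointer while holding the lock."""
    with eps_lock:
        freed.append(io_data.req)   # models usb_ep_free_request()
        io_data.req = None


def aio_cancel(io_data, dequeued):
    """Dequeue only if the request has not been freed yet."""
    with eps_lock:
        if io_data.req is not None:
            dequeued.append(io_data.req)   # models usb_ep_dequeue()
            return True
    return False


io, freed, dequeued = IoData(), [], []
user_copy_worker(io, freed)
print(aio_cancel(io, dequeued))  # False
```

    Since both paths take the same lock, the cancel path either sees a live
    request or sees that it has already been cleared.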
    
    This fix depends on
      commit b566d38857fc ("usb: gadget: f_fs: use io_data->status
      consistently")
    
    Fixes: 2e4c7553cd6f ("usb: gadget: f_fs: add aio support")
    Cc: stable <[email protected]>  # b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently")
    Signed-off-by: Wesley Cheng <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

usb: gadget: f_fs: use io_data->status consistently [+ + +]

Author: John Keeping <[email protected]>
Date:   Thu Nov 24 17:04:28 2022 +0000

    usb: gadget: f_fs: use io_data->status consistently
    
    [ Upstream commit b566d38857fcb6777f25b674b90a831eec0817a2 ]
    
    Commit fb1f16d74e26 ("usb: gadget: f_fs: change ep->status safe in
    ffs_epfile_io()") added a new ffs_io_data::status field to fix lifetime
    issues in synchronous requests.
    
    While there are no similar lifetime issues for asynchronous requests
    (the separate ep member in ffs_io_data avoids them) using the status
    field means the USB request can be freed earlier and that there is more
    consistency between the synchronous and asynchronous I/O paths.
    
    Cc: Linyu Yuan <[email protected]>
    Signed-off-by: John Keeping <[email protected]>
    Reviewed-by: Linyu Yuan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Stable-dep-of: 24729b307eef ("usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete")
    Signed-off-by: Sasha Levin <[email protected]>

usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps [+ + +]

Author: Amit Sunil Dhamne <[email protected]>
Date:   Tue May 14 15:01:31 2024 -0700

    usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
    
    commit e7e921918d905544500ca7a95889f898121ba886 upstream.
    
    There could be a potential use-after-free case in
    tcpm_register_source_caps(). This could happen when:
     * new (say invalid) source caps are advertised
     * the existing source caps are unregistered
     * tcpm_register_source_caps() returns with an error as
       usb_power_delivery_register_capabilities() fails
    
    This causes port->partner_source_caps to hold on to the now freed source
    caps.
    
    Reset port->partner_source_caps value to NULL after unregistering
    existing source caps.
    
    Fixes: 230ecdf71a64 ("usb: typec: tcpm: unregister existing source caps before re-registration")
    Cc: [email protected]
    Signed-off-by: Amit Sunil Dhamne <[email protected]>
    Reviewed-by: Ondrej Jirman <[email protected]>
    Reviewed-by: Heikki Krogerus <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state [+ + +]

Author: Kyle Tso <[email protected]>
Date:   Mon May 20 23:48:58 2024 +0800

    usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state
    
    commit fc8fb9eea94d8f476e15f3a4a7addeb16b3b99d6 upstream.
    
    Similar to what was fixed in commit a6fe37f428c1 ("usb: typec: tcpm: Skip
    hard reset when in error recovery"), the handling of the received Hard
    Reset has to be skipped during TOGGLING state.
    
    [ 4086.021288] VBUS off
    [ 4086.021295] pending state change SNK_READY -> SNK_UNATTACHED @ 650 ms [rev2 NONE_AMS]
    [ 4086.022113] VBUS VSAFE0V
    [ 4086.022117] state change SNK_READY -> SNK_UNATTACHED [rev2 NONE_AMS]
    [ 4086.022447] VBUS off
    [ 4086.022450] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS]
    [ 4086.023060] VBUS VSAFE0V
    [ 4086.023064] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS]
    [ 4086.023070] disable BIST MODE TESTDATA
    [ 4086.023766] disable vbus discharge ret:0
    [ 4086.023911] Setting usb_comm capable false
    [ 4086.028874] Setting voltage/current limit 0 mV 0 mA
    [ 4086.028888] polarity 0
    [ 4086.030305] Requesting mux state 0, usb-role 0, orientation 0
    [ 4086.033539] Start toggling
    [ 4086.038496] state change SNK_UNATTACHED -> TOGGLING [rev2 NONE_AMS]
    
    // This Hard Reset is unexpected
    [ 4086.038499] Received hard reset
    [ 4086.038501] state change TOGGLING -> HARD_RESET_START [rev2 HARD_RESET]
    
    Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
    Cc: [email protected]
    Signed-off-by: Kyle Tso <[email protected]>
    Reviewed-by: Heikki Krogerus <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected [+ + +]

Author: John Ernberg <[email protected]>
Date:   Fri May 17 11:43:52 2024 +0000

    USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected
    
    commit 8475ffcfb381a77075562207ce08552414a80326 upstream.
    
    If no other USB HCDs are selected when compiling a small pure virtual
    machine, the Xen HCD driver cannot be built.
    
    Fix it by traversing down host/ if CONFIG_USB_XEN_HCD is selected.
    
    Fixes: 494ed3997d75 ("usb: Introduce Xen pvUSB frontend (xen hcd)")
    Cc: [email protected] # v5.17+
    Signed-off-by: John Ernberg <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vmci: prevent speculation leaks by sanitizing event in event_deliver() [+ + +]

Author: Hagar Gamal Halim Hemdan <[email protected]>
Date:   Tue Apr 30 08:59:16 2024 +0000

    vmci: prevent speculation leaks by sanitizing event in event_deliver()
    
    commit 8003f00d895310d409b2bf9ef907c56b42a4e0f4 upstream.
    
    Coverity spotted that event_msg is controlled by user-space,
    event_msg->event_data.event is passed to event_deliver() and used
    as an index without sanitization.
    
    This change ensures that the event index is sanitized to mitigate any
    possibility of speculative information leaks.
    
    This bug was discovered and resolved using Coverity Static Analysis
    Security Testing (SAST) by Synopsys, Inc.
    
    Only compile tested, no access to HW.
    
    Fixes: 1d990201f9bb ("VMCI: event handling implementation.")
    Cc: stable <[email protected]>
    Signed-off-by: Hagar Gamal Halim Hemdan <[email protected]>
    Link: https://lore.kernel.org/stable/20231127193533.46174-1-hagarhem%40amazon.com
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vxlan: Fix regression when dropping packets due to invalid src addresses [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Mon Jun 3 10:59:26 2024 +0200

    vxlan: Fix regression when dropping packets due to invalid src addresses
    
    [ Upstream commit 1cd4bc987abb2823836cbb8f887026011ccddc8a ]
    
    Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address")
    has recently been added to vxlan mainly in the context of source
    address snooping/learning so that when it is enabled, an entry in the
    FDB is not being created for an invalid address for the corresponding
    tunnel endpoint.
    
    Before commit f58f45c1e5b9, vxlan behaved like geneve in that it
    passed through whichever MACs were set in the L2 header. It turns out
    that this change in behavior breaks setups. For example, Cilium with
    netkit in L3 mode for Pods together with tunnel mode was passing
    before the change in f58f45c1e5b9 for both vxlan and geneve. After
    that change it only passes for geneve: with vxlan, packets are dropped
    because vxlan_set_mac() returns false when the source and destination
    MACs are zero, which is totally fine for E/W traffic via the tunnel.
    
    Fix it by only opting into the is_valid_ether_addr() check in
    vxlan_set_mac() when source address snooping/learning is actually
    enabled in vxlan. This is done by moving the check into
    vxlan_snoop(). With this change, the Cilium connectivity test suite
    passes again for both tunnel flavors.
    
    Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address")
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: David Bauer <[email protected]>
    Cc: Ido Schimmel <[email protected]>
    Cc: Nikolay Aleksandrov <[email protected]>
    Cc: Martin KaFai Lau <[email protected]>
    Reviewed-by: Ido Schimmel <[email protected]>
    Reviewed-by: Nikolay Aleksandrov <[email protected]>
    Reviewed-by: David Bauer <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: ath10k: fix QCOM_RPROC_COMMON dependency [+ + +]

Author: Dmitry Baryshkov <[email protected]>
Date:   Fri May 17 10:00:28 2024 +0300

    wifi: ath10k: fix QCOM_RPROC_COMMON dependency
    
    [ Upstream commit 21ae74e1bf18331ae5e279bd96304b3630828009 ]
    
    If ath10k_snoc is built-in, while Qualcomm remoteprocs are built as
    modules, compilation fails with:
    
    /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_init':
    drivers/net/wireless/ath/ath10k/snoc.c:1534: undefined reference to `qcom_register_ssr_notifier'
    /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_deinit':
    drivers/net/wireless/ath/ath10k/snoc.c:1551: undefined reference to `qcom_unregister_ssr_notifier'
    
    Add the corresponding dependency to the ATH10K_SNOC Kconfig entry so that
    it's built as a module if QCOM_RPROC_COMMON is built as a module too.
    
    Fixes: 747ff7d3d742 ("ath10k: Don't always treat modem stop events as crashes")
    Cc: [email protected]
    Signed-off-by: Dmitry Baryshkov <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: cfg80211: fully move wiphy work to unbound workqueue [+ + +]

Author: Johannes Berg <[email protected]>
Date:   Wed May 22 12:41:25 2024 +0200

    wifi: cfg80211: fully move wiphy work to unbound workqueue
    
    [ Upstream commit e296c95eac655008d5a709b8cf54d0018da1c916 ]
    
    Previously I had moved the wiphy work to the unbound
    system workqueue, but missed that when it restarts and
    during resume it was still using the normal system
    workqueue. Fix that.
    
    Fixes: 91d20ab9d9ca ("wifi: cfg80211: use system_unbound_wq for wiphy work")
    Reviewed-by: Miriam Rachel Korenblit <[email protected]>
    Link: https://msgid.link/20240522124126.7ca959f2cbd3.I3e2a71ef445d167b84000ccf934ea245aef8d395@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: cfg80211: Lock wiphy in cfg80211_get_station [+ + +]

Author: Remi Pommarel <[email protected]>
Date:   Tue May 21 21:47:26 2024 +0200

    wifi: cfg80211: Lock wiphy in cfg80211_get_station
    
    [ Upstream commit 642f89daa34567d02f312d03e41523a894906dae ]
    
    Wiphy should be locked before calling rdev_get_station() (see lockdep
    assert in ieee80211_get_station()).
    
    This fixes the following kernel NULL dereference:
    
     Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
     Mem abort info:
       ESR = 0x0000000096000006
       EC = 0x25: DABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x06: level 2 translation fault
     Data abort info:
       ISV = 0, ISS = 0x00000006
       CM = 0, WnR = 0
     user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000
     [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000
     Internal error: Oops: 0000000096000006 [#1] SMP
     Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath
     CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705
     Hardware name: RPT (r1) (DT)
     Workqueue: bat_events batadv_v_elp_throughput_metric_update
     pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
     lr : sta_set_sinfo+0xcc/0xbd4
     sp : ffff000007b43ad0
     x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98
     x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000
     x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc
     x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000
     x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d
     x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e
     x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000
     x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000
     x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90
     x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000
     Call trace:
      ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
      sta_set_sinfo+0xcc/0xbd4
      ieee80211_get_station+0x2c/0x44
      cfg80211_get_station+0x80/0x154
      batadv_v_elp_get_throughput+0x138/0x1fc
      batadv_v_elp_throughput_metric_update+0x1c/0xa4
      process_one_work+0x1ec/0x414
      worker_thread+0x70/0x46c
      kthread+0xdc/0xe0
      ret_from_fork+0x10/0x20
     Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)
    
    This happens because the STA has time to disconnect and reconnect before
    the batadv_v_elp_throughput_metric_update() delayed work gets scheduled.
    In this situation, ath10k_sta_state() can be in the middle of resetting
    arsta data when the work item gets a chance to run and ends up
    accessing it. Locking wiphy prevents that.
    
    Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API")
    Signed-off-by: Remi Pommarel <[email protected]>
    Reviewed-by: Nicolas Escande <[email protected]>
    Acked-by: Antonio Quartulli <[email protected]>
    Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: cfg80211: pmsr: use correct nla_get_uX functions [+ + +]

Author: Lin Ma <[email protected]>
Date:   Tue May 21 15:50:59 2024 +0800

    wifi: cfg80211: pmsr: use correct nla_get_uX functions
    
    [ Upstream commit ab904521f4de52fef4f179d2dfc1877645ef5f5c ]
    
    Commit 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM
    initiator API") defines four attributes NL80211_PMSR_FTM_REQ_ATTR_
    {NUM_BURSTS_EXP}/{BURST_PERIOD}/{BURST_DURATION}/{FTMS_PER_BURST} as
    follows.
    
    static const struct nla_policy
    nl80211_pmsr_ftm_req_attr_policy[NL80211_PMSR_FTM_REQ_ATTR_MAX + 1] = {
        ...
        [NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP] =
            NLA_POLICY_MAX(NLA_U8, 15),
        [NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD] = { .type = NLA_U16 },
        [NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION] =
            NLA_POLICY_MAX(NLA_U8, 15),
        [NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST] =
            NLA_POLICY_MAX(NLA_U8, 31),
        ...
    };
    
    That is, those attributes are expected to be NLA_U8 and NLA_U16 types.
    However, the consumers of these attributes in `pmsr_parse_ftm` blindly
    all use `nla_get_u32`, which is incorrect and causes functionality issues
    on little-endian platforms. Hence, fix them with the correct `nla_get_u8`
    and `nla_get_u16` functions.
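    
    To illustrate why the getter width matters, here is a minimal userspace
    sketch (not the netlink API itself). get_u8() and get_u32() are
    hypothetical stand-ins for width-specific getters: the 32-bit read
    consumes three bytes beyond a one-byte payload, so its result depends
    on whatever follows the payload and on host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a 1-byte attribute payload: touches only payload[0]. */
static uint8_t get_u8(const unsigned char *payload)
{
    return payload[0];
}

/*
 * Read 4 bytes in host byte order, as a 32-bit getter would.
 * For a 1-byte attribute this also pulls in the 3 bytes that follow.
 */
static uint32_t get_u32(const unsigned char *payload)
{
    uint32_t v;

    memcpy(&v, payload, sizeof(v));
    return v;
}
```

    With a payload byte of 15 followed by non-zero bytes, get_u8() returns
    15 while get_u32() returns a different value on any host.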
    
    Fixes: 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API")
    Signed-off-by: Lin Ma <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef [+ + +]

Author: Shahar S Matityahu <[email protected]>
Date:   Fri May 10 17:06:39 2024 +0300

    wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
    
    [ Upstream commit 87821b67dea87addbc4ab093ba752753b002176a ]
    
    The driver should call iwl_dbg_tlv_free even if debugfs is not defined
    since ini mode does not depend on debugfs ifdef.
    
    Fixes: 68f6f492c4fa ("iwlwifi: trans: support loading ini TLVs from external file")
    Signed-off-by: Shahar S Matityahu <[email protected]>
    Reviewed-by: Luciano Coelho <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240510170500.c8e3723f55b0.I5e805732b0be31ee6b83c642ec652a34e974ff10@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: check n_ssids before accessing the ssids [+ + +]

Author: Miri Korenblit <[email protected]>
Date:   Mon May 13 13:27:12 2024 +0300

    wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
    
    [ Upstream commit 60d62757df30b74bf397a2847a6db7385c6ee281 ]
    
    In some versions of cfg80211, the ssids pointer might be a valid one even
    though n_ssids is 0. Accessing the pointer in this case will cause an
    out-of-bounds access. Fix this by checking n_ssids first.
    
    Fixes: c1a7515393e4 ("iwlwifi: mvm: add adaptive dwell support")
    Signed-off-by: Miri Korenblit <[email protected]>
    Reviewed-by: Ilan Peer <[email protected]>
    Reviewed-by: Johannes Berg <[email protected]>
    Link: https://msgid.link/20240513132416.6e4d1762bf0d.I5a0e6cc8f02050a766db704d15594c61fe583d45@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: don't read past the mfuart notification [+ + +]

Author: Emmanuel Grumbach <[email protected]>
Date:   Mon May 13 13:27:14 2024 +0300

    wifi: iwlwifi: mvm: don't read past the mfuart notification
    
    [ Upstream commit 4bb95f4535489ed830cf9b34b0a891e384d1aee4 ]
    
    In case the firmware sends a notification that claims it has more data
    than it has, we will read past what was allocated for the notification.
    Remove the print of the buffer; we won't see it by default. If needed,
    we can see the content with tracing.
    
    This was reported by KFENCE.
    
    Fixes: bdccdb854f2f ("iwlwifi: mvm: support MFUART dump in case of MFUART assert")
    Signed-off-by: Emmanuel Grumbach <[email protected]>
    Reviewed-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aebf2794f553d9b@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 [+ + +]

Author: Johannes Berg <[email protected]>
Date:   Fri May 10 17:06:33 2024 +0300

    wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
    
    [ Upstream commit 4a7aace2899711592327463c1a29ffee44fcc66e ]
    
    We don't actually support >64 even for HE devices, so revert
    back to 64. This fixes an issue where the session is refused
    because the queue is configured differently from the actual
    session later.
    
    Fixes: 514c30696fbc ("iwlwifi: add support for IEEE802.11ax")
    Signed-off-by: Johannes Berg <[email protected]>
    Reviewed-by: Liad Kaufman <[email protected]>
    Reviewed-by: Luciano Coelho <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://msgid.link/20240510170500.52f7b4cf83aa.If47e43adddf7fe250ed7f5571fbb35d8221c7c47@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: correctly parse Spatial Reuse Parameter Set element [+ + +]

Author: Lingbo Kong <[email protected]>
Date:   Thu May 16 10:18:54 2024 +0800

    wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
    
    [ Upstream commit a26d8dc5227f449a54518a8b40733a54c6600a8b ]
    
    Currently, the way of parsing Spatial Reuse Parameter Set element is
    incorrect and some members of struct ieee80211_he_obss_pd are not assigned.
    
    To address this issue, it must be parsed in the order of the elements of
    Spatial Reuse Parameter Set defined in the IEEE Std 802.11ax specification.
    
    The diagram of the Spatial Reuse Parameter Set element (IEEE Std
    802.11ax-2021, 9.4.2.252):
    
    -------------------------------------------------------------------------
    |       |      |         |       |Non-SRG|  SRG  | SRG   | SRG  | SRG   |
    |Element|Length| Element |  SR   |OBSS PD|OBSS PD|OBSS PD| BSS  |Partial|
    |   ID  |      |   ID    |Control|  Max  |  Min  | Max   |Color | BSSID |
    |       |      |Extension|       | Offset| Offset|Offset |Bitmap|Bitmap |
    -------------------------------------------------------------------------
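    
    Parsing in the field order of the diagram can be sketched as below.
    This is a standalone illustration, not the mac80211 code: struct
    obss_pd, parse_spr() and the two flag macros are hypothetical, the
    SR Control bit positions (bit 2 and bit 3) are assumptions, and the
    input is taken to start at SR Control, after the Element ID, Length
    and Element ID Extension octets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPR_NON_SRG_OFFSET_PRESENT 0x04 /* assumed: SR Control bit 2 */
#define SPR_SRG_INFO_PRESENT       0x08 /* assumed: SR Control bit 3 */

struct obss_pd { /* illustrative container for the parsed fields */
    uint8_t sr_ctrl;
    uint8_t non_srg_max_offset;
    uint8_t min_offset;
    uint8_t max_offset;
    uint8_t bss_color_bitmap[8];
    uint8_t partial_bssid_bitmap[8];
};

/* Walk the element body in diagram order; returns 0 or -1 if truncated. */
static int parse_spr(const uint8_t *data, size_t len, struct obss_pd *out)
{
    size_t pos = 0;

    memset(out, 0, sizeof(*out));
    if (len < 1)
        return -1;
    out->sr_ctrl = data[pos++];

    /* Optional Non-SRG OBSS PD Max Offset (1 octet). */
    if (out->sr_ctrl & SPR_NON_SRG_OFFSET_PRESENT) {
        if (len - pos < 1)
            return -1;
        out->non_srg_max_offset = data[pos++];
    }

    /* Optional SRG block: min, max, then two 8-octet bitmaps. */
    if (out->sr_ctrl & SPR_SRG_INFO_PRESENT) {
        if (len - pos < 18)
            return -1;
        out->min_offset = data[pos++];
        out->max_offset = data[pos++];
        memcpy(out->bss_color_bitmap, data + pos, 8);
        pos += 8;
        memcpy(out->partial_bssid_bitmap, data + pos, 8);
    }
    return 0;
}
```

    Each optional field is consumed only when its presence bit is set, so
    every later field is read from the correct offset.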
    
    Fixes: 1ced169cc1c2 ("mac80211: allow setting spatial reuse parameters from bss_conf")
    Signed-off-by: Lingbo Kong <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup() [+ + +]

Author: Remi Pommarel <[email protected]>
Date:   Wed May 29 08:57:53 2024 +0200

    wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
    
    [ Upstream commit 44c06bbde6443de206b30f513100b5670b23fc5e ]
    
    The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to
    synchronize with ieee80211_tx_h_unicast_ps_buf(), which is called from
    softirq context. However, using only spin_lock() to take sta->ps_lock in
    ieee80211_sta_ps_deliver_wakeup() does not prevent a softirq from
    executing on the same CPU, running ieee80211_tx_h_unicast_ps_buf() and
    trying to take the same lock, which ends in a deadlock. Below is an
    example of the RCU stall that arises in such a situation.
    
     rcu: INFO: rcu_sched self-detected stall on CPU
     rcu:    2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996
     rcu:    (t=42586894 jiffies g=2057 q=362405 ncpus=4)
     CPU: 2 PID: 719 Comm: wpa_supplicant Tainted: G        W          6.4.0-02158-g1b062f552873 #742
     Hardware name: RPT (r1) (DT)
     pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : queued_spin_lock_slowpath+0x58/0x2d0
     lr : invoke_tx_handlers_early+0x5b4/0x5c0
     sp : ffff00001ef64660
     x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8
     x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000
     x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000
     x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000
     x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80
     x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da
     x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440
     x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880
     x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000
     x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8
     Call trace:
      queued_spin_lock_slowpath+0x58/0x2d0
      ieee80211_tx+0x80/0x12c
      ieee80211_tx_pending+0x110/0x278
      tasklet_action_common.constprop.0+0x10c/0x144
      tasklet_action+0x20/0x28
      _stext+0x11c/0x284
      ____do_softirq+0xc/0x14
      call_on_irq_stack+0x24/0x34
      do_softirq_own_stack+0x18/0x20
      do_softirq+0x74/0x7c
      __local_bh_enable_ip+0xa0/0xa4
      _ieee80211_wake_txqs+0x3b0/0x4b8
      __ieee80211_wake_queue+0x12c/0x168
      ieee80211_add_pending_skbs+0xec/0x138
      ieee80211_sta_ps_deliver_wakeup+0x2a4/0x480
      ieee80211_mps_sta_status_update.part.0+0xd8/0x11c
      ieee80211_mps_sta_status_update+0x18/0x24
      sta_apply_parameters+0x3bc/0x4c0
      ieee80211_change_station+0x1b8/0x2dc
      nl80211_set_station+0x444/0x49c
      genl_family_rcv_msg_doit.isra.0+0xa4/0xfc
      genl_rcv_msg+0x1b0/0x244
      netlink_rcv_skb+0x38/0x10c
      genl_rcv+0x34/0x48
      netlink_unicast+0x254/0x2bc
      netlink_sendmsg+0x190/0x3b4
      ____sys_sendmsg+0x1e8/0x218
      ___sys_sendmsg+0x68/0x8c
      __sys_sendmsg+0x44/0x84
      __arm64_sys_sendmsg+0x20/0x28
      do_el0_svc+0x6c/0xe8
      el0_svc+0x14/0x48
      el0t_64_sync_handler+0xb0/0xb4
      el0t_64_sync+0x14c/0x150
    
    Using spin_lock_bh()/spin_unlock_bh() instead prevents a softirq from
    running on the same CPU that is holding the lock.
    
    Fixes: 1d147bfa6429 ("mac80211: fix AP powersave TX vs. wakeup race")
    Signed-off-by: Remi Pommarel <[email protected]>
    Link: https://msgid.link/8e36fe07d0fbc146f89196cd47a53c8a0afe84aa.1716910344.git.repk@triplefau.lt
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects [+ + +]

Author: Nicolas Escande <[email protected]>
Date:   Tue May 28 16:26:05 2024 +0200

    wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
    
    [ Upstream commit b7d7f11a291830fdf69d3301075dd0fb347ced84 ]
    
    The hwmp code uses objects of type mesh_preq_queue, added to a list in
    ieee80211_if_mesh, to keep track of the mpaths we need to resolve. If an
    mpath gets deleted, e.g. when the mesh interface is removed, the entries
    in that list never get cleaned up. Fix this by flushing all corresponding
    items of the preq_queue in mesh_path_flush_pending().
    
    This should take care of kmemleak reports like this:
    
    unreferenced object 0xffff00000668d800 (size 128):
      comm "kworker/u8:4", pid 67, jiffies 4295419552 (age 1836.444s)
      hex dump (first 32 bytes):
        00 1f 05 09 00 00 ff ff 00 d5 68 06 00 00 ff ff  ..........h.....
        8e 97 ea eb 3e b8 01 00 00 00 00 00 00 00 00 00  ....>...........
      backtrace:
        [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
        [<00000000049bd418>] kmalloc_trace+0x34/0x80
        [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
        [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
        [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
        [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
        [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
        [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
        [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
        [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
        [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
        [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
        [<00000000b36425d1>] worker_thread+0x9c/0x634
        [<0000000005852dd5>] kthread+0x1bc/0x1c4
        [<000000005fccd770>] ret_from_fork+0x10/0x20
    unreferenced object 0xffff000009051f00 (size 128):
      comm "kworker/u8:4", pid 67, jiffies 4295419553 (age 1836.440s)
      hex dump (first 32 bytes):
        90 d6 92 0d 00 00 ff ff 00 d8 68 06 00 00 ff ff  ..........h.....
        36 27 92 e4 02 e0 01 00 00 58 79 06 00 00 ff ff  6'.......Xy.....
      backtrace:
        [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
        [<00000000049bd418>] kmalloc_trace+0x34/0x80
        [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
        [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
        [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
        [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
        [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
        [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
        [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
        [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
        [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
        [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
        [<00000000b36425d1>] worker_thread+0x9c/0x634
        [<0000000005852dd5>] kthread+0x1bc/0x1c4
        [<000000005fccd770>] ret_from_fork+0x10/0x20
    
    Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
    Signed-off-by: Nicolas Escande <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

x86/amd_nb: Check for invalid SMN reads [+ + +]

Author: Yazen Ghannam <[email protected]>
Date:   Mon Apr 3 16:42:44 2023 +0000

    x86/amd_nb: Check for invalid SMN reads
    
    commit c625dabbf1c4a8e77e4734014f2fde7aa9071a1f upstream.
    
    AMD Zen-based systems use a System Management Network (SMN) that
    provides access to implementation-specific registers.
    
    SMN accesses are done indirectly through an index/data pair in PCI
    config space. The PCI config access may fail and return an error code.
    This would prevent the "read" value from being updated.
    
    However, the PCI config access may succeed, but the return value may be
    invalid. This is similar to PCI bad reads, which return all bits
    set.
    
    Most systems will return 0 for SMN addresses that are not accessible.
    This is in line with AMD convention that unavailable registers are
    Read-as-Zero/Writes-Ignored.
    
    However, some systems will return a "PCI Error Response" instead. This
    value, along with an error code of 0 from the PCI config access, will
    confuse callers of the amd_smn_read() function.
    
    Check for this condition, clear the return value, and set a proper error
    code.
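    
    The check described above can be sketched in standalone C as follows.
    This is an illustration under assumptions, not the kernel function:
    smn_read_checked(), cfg_read_fn and the two fake readers are
    hypothetical, and -ENODEV is assumed as the "proper error code".

```c
#include <errno.h>
#include <stdint.h>

#define ALL_ONES_32 0xFFFFFFFFu /* all bits set, as in a PCI error response */

/* Hypothetical low-level read: returns 0 on success and fills *value. */
typedef int (*cfg_read_fn)(uint32_t address, uint32_t *value);

/*
 * Wrap the raw read: propagate access errors, and turn a "successful"
 * all-ones response into a cleared value plus an error code.
 */
static int smn_read_checked(cfg_read_fn raw_read, uint32_t address,
                            uint32_t *value)
{
    int err = raw_read(address, value);

    if (err)
        return err;

    if (*value == ALL_ONES_32) {
        *value = 0;
        return -ENODEV; /* assumed error code for this sketch */
    }
    return 0;
}

/* Stub readers used only to exercise the wrapper. */
static int fake_read_ok(uint32_t address, uint32_t *value)
{
    (void)address;
    *value = 0x1234;
    return 0;
}

static int fake_read_all_ones(uint32_t address, uint32_t *value)
{
    (void)address;
    *value = ALL_ONES_32;
    return 0;
}
```

    Callers then only need to test the return code and never see the
    all-ones pattern as if it were register contents.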
    
    Fixes: ddfe43cdc0da ("x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h")
    Signed-off-by: Yazen Ghannam <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/boot: Don't add the EFI stub to targets, again [+ + +]

Author: Benjamin Segall <[email protected]>
Date:   Wed Jun 12 12:44:44 2024 -0700

    x86/boot: Don't add the EFI stub to targets, again
    
    commit b2747f108b8034271fd5289bd8f3a7003e0775a3 upstream.
    
    This is a re-commit of
    
      da05b143a308 ("x86/boot: Don't add the EFI stub to targets")
    
    after the tagged patch incorrectly reverted it.
    
    vmlinux-objs-y is added to targets, with an assumption that they are all
    relative to $(obj); adding a $(objtree)/drivers/...  path causes the
    build to incorrectly create a useless
    arch/x86/boot/compressed/drivers/...  directory tree.
    
    Fix this just by using a different make variable for the EFI stub.
    
    Fixes: cb8bda8ad443 ("x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S")
    Signed-off-by: Ben Segall <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Ard Biesheuvel <[email protected]>
    Cc: [email protected] # v6.1+
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Apply broken streams quirk to Etron EJ188 xHCI host [+ + +]

Author: Kuangyi Chiang <[email protected]>
Date:   Tue Jun 11 15:06:09 2024 +0300

    xhci: Apply broken streams quirk to Etron EJ188 xHCI host
    
    commit 91f7a1524a92c70ffe264db8bdfa075f15bbbeb9 upstream.
    
    As described in commit 8f873c1ff4ca ("xhci: Blacklist using streams on the
    Etron EJ168 controller"), EJ188 has the same issue as EJ168: Streams
    do not work reliably on EJ188. So apply the XHCI_BROKEN_STREAMS quirk to
    EJ188 as well.
    
    Cc: [email protected]
    Signed-off-by: Kuangyi Chiang <[email protected]>
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Apply reset resume quirk to Etron EJ188 xHCI host [+ + +]

Author: Kuangyi Chiang <[email protected]>
Date:   Tue Jun 11 15:06:08 2024 +0300

    xhci: Apply reset resume quirk to Etron EJ188 xHCI host
    
    commit 17bd54555c2aaecfdb38e2734149f684a73fa584 upstream.
    
    As described in commit c877b3b2ad5c ("xhci: Add reset on resume quirk for
    asrock p67 host"), EJ188 has the same issue as EJ168, where the controller
    completely dies on resume. So apply the XHCI_RESET_ON_RESUME quirk to
    EJ188 as well.
    
    Cc: [email protected]
    Signed-off-by: Kuangyi Chiang <[email protected]>
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Handle TD clearing for multiple streams case [+ + +]

Author: Hector Martin <[email protected]>
Date:   Tue Jun 11 15:06:10 2024 +0300

    xhci: Handle TD clearing for multiple streams case
    
    commit 5ceac4402f5d975e5a01c806438eb4e554771577 upstream.
    
    When multiple streams are in use, multiple TDs might be in flight when
    an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
    each, to ensure everything is reset properly and the caches cleared.
    Change the logic so that any N>1 TDs found active for different streams
    are deferred until after the first one is processed, calling
    xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
    queue another command until we are done with all of them. Also change
    the error/"should never happen" paths to ensure we at least clear any
    affected TDs, even if we can't issue a command to clear the hardware
    cache, and complain loudly with an xhci_warn() if this ever happens.
    
    This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct
    assumptions about number of rings per endpoint.") early on in the XHCI
    driver's life, when stream support was first added.
    It was then identified but not fixed nor made into a warning in commit
    674f8438c121 ("xhci: split handling halted endpoints into two steps"),
    which added a FIXME comment for the problem case (without materially
    changing the behavior as far as I can tell, though the new logic made
    the problem more obvious).
    
    Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some
    cached cancelled URBs."), it was acknowledged again.
    
    [Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached
    cancelled URBs.") was a targeted regression fix to the previously mentioned
    patch. Users reported issues with usb stuck after unmounting/disconnecting
    UAS devices. This rolled back the TD clearing of multiple streams to its
    original state.]
    
    Apparently the commit author was aware of the problem (yet still chose
    to submit it): It was still mentioned as a FIXME, an xhci_dbg() was
    added to log the problem condition, and the remaining issue was mentioned
    in the commit description. The choice of making the log type xhci_dbg()
    for what is, at this point, a completely unhandled and known broken
    condition is puzzling and unfortunate, as it guarantees that no actual
    users would see the log in production, thereby making it nigh
    undebuggable (indeed, even if you turn on DEBUG, the message doesn't
    really hint at there being a problem at all).
    
    It took me *months* of random xHC crashes to finally find a reliable
    repro and be able to do a deep dive debug session, which could all have
    been avoided had this unhandled, broken condition been actually reported
    with a warning, as it should have been as a bug intentionally left in
    unfixed (never mind that it shouldn't have been left in at all).
    
    > Another fix to solve clearing the caches of all stream rings with
    > cancelled TDs is needed, but not as urgent.
    
    3 years after that statement and 14 years after the original bug was
    introduced, I think it's finally time to fix it. And maybe next time
    let's not leave bugs unfixed (that are actually worse than the original
    bug), and let's actually get people to review kernel commits please.
    
    Fixes xHC crashes and IOMMU faults with UAS devices when handling
    errors/faults. Easiest repro is to use `hdparm` to mark an early sector
    (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
    At least in the case of JMicron controllers, the read errors end up
    having to cancel two TDs (for two queued requests to different streams)
    and the one that didn't get cleared properly ends up faulting the xHC
    entirely when it tries to access DMA pages that have since been unmapped,
    referred to by the stale TDs. This normally happens quickly (after two
    or three loops). After this fix, I left the `cat` in a loop running
    overnight and experienced no xHC failures, with all read errors
    recovered properly. Repro'd and tested on an Apple M1 Mac Mini
    (dwc3 host).
    
    On systems without an IOMMU, this bug would instead silently corrupt
    freed memory, making this a security bug (even on systems with IOMMUs
    this could silently corrupt memory belonging to other USB devices on the
    same controller, so it's still a security bug). Given that the kernel
    autoprobes partition tables, I'm pretty sure a malicious USB device
    pretending to be a UAS device and reporting an error with the right
    timing could deliberately trigger a UAF and write to freed memory, with
    no user action.
    
    [Mathias: Commit message and code comment edit, original at:]
    https://lore.kernel.org/linux-usb/[email protected]/
    
    Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
    Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.")
    Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
    Cc: [email protected]
    Cc: [email protected]
    Reviewed-by: Neal Gompa <[email protected]>
    Signed-off-by: Hector Martin <[email protected]>
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

xhci: Set correct transferred length for cancelled bulk transfers [+ + +]

Author: Mathias Nyman <[email protected]>
Date:   Tue Jun 11 15:06:07 2024 +0300

    xhci: Set correct transferred length for cancelled bulk transfers
    
    commit f0260589b439e2637ad54a2b25f00a516ef28a57 upstream.
    
    The transferred length is set incorrectly for cancelled bulk
    transfer TDs in case the bulk transfer ring stops on the last transfer
    block with a 'Stop - Length Invalid' completion code.
    
    length essentially ends up being set to the requested length:
    urb->actual_length = urb->transfer_buffer_length
    
    Length for 'Stop - Length Invalid' cases should be the sum of all
    TRB transfer block lengths up to the one the ring stopped on,
    _excluding_ the one stopped on.
    
    Fix this by always summing up TRB lengths for 'Stop - Length Invalid'
    bulk cases.
    
    This issue was discovered by Alan Stern while debugging
    https://bugzilla.kernel.org/show_bug.cgi?id=218890, but does not
    solve that bug. Issue is older than 4.10 kernel but fix won't apply
    to those due to major reworks in that area.
    
    Tested-by: Pierre Tomon <[email protected]>
    Cc: [email protected] # v4.10+
    Cc: Alan Stern <[email protected]>
    Signed-off-by: Mathias Nyman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
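
Note: the length rule described in the commit above can be sketched in plain C. This is an illustrative model only, not the xhci driver code: `struct sketch_trb`, `sketch_stopped_td_length()` and the array-based TD layout are hypothetical names and simplifications introduced here. It shows that for a 'Stop - Length Invalid' completion the reported length is the sum of the TRB lengths before the TRB the ring stopped on, excluding the stopped TRB itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one transfer request block (TRB). */
struct sketch_trb {
	uint32_t len;	/* bytes this TRB was asked to transfer */
};

/*
 * Transferred length for a bulk TD whose ring stopped with
 * 'Stop - Length Invalid': sum the lengths of all TRBs before
 * the one the ring stopped on, excluding the stopped TRB.
 */
static uint32_t sketch_stopped_td_length(const struct sketch_trb *trbs,
					 size_t ntrbs, size_t stopped_idx)
{
	uint32_t sum = 0;
	size_t i;

	assert(stopped_idx < ntrbs);	/* the stopped TRB must belong to the TD */

	for (i = 0; i < stopped_idx; i++)
		sum += trbs[i].len;

	return sum;
}
```

With a three-TRB TD of 512, 512 and 256 bytes that stops on the last TRB, this model reports 1024 bytes rather than the full requested 1280.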

xtensa: fix MAKE_PC_FROM_RA second argument [+ + +]

Author: Max Filippov <[email protected]>
Date:   Sat Feb 17 05:15:42 2024 -0800

    xtensa: fix MAKE_PC_FROM_RA second argument
    
    [ Upstream commit 0e60f0b75884677fb9f4f2ad40d52b43451564d5 ]
    
    Xtensa has two-argument MAKE_PC_FROM_RA macro to convert a0 to an actual
    return address because when windowed ABI is used call{,x}{4,8,12}
    opcodes stuff encoded window size into the top 2 bits of the register
    that becomes a return address in the called function. Second argument of
    that macro is supposed to be an address having these 2 topmost bits set
    correctly, but the comment suggested that that could be the stack
    address. However the stack doesn't have to be in the same 1GByte region
    as the code, especially in noMMU XIP configurations.
    
    Fix the comment and use either _text or regs->pc as the second argument
    for the MAKE_PC_FROM_RA macro.
    
    Cc: [email protected]
    Signed-off-by: Max Filippov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
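
Note: the address reconstruction described in the commit above can be sketched as a small C function. This is a simplified model under stated assumptions, not the kernel macro itself: `sketch_pc_from_ra()` and the example addresses are hypothetical, and 32-bit addresses are assumed. It shows why the second argument must lie in the same 1 GByte region as the code: the low 30 bits come from the return address register, and the top 2 bits come from the reference address.

```c
#include <stdint.h>

/*
 * Rebuild a return address from a0 under the windowed ABI.
 * The top 2 bits of 'ra' hold the encoded window size, so they are
 * replaced by the top 2 bits of 'text_addr', which must be an address
 * in the same 1 GByte region as the calling code.
 */
static uint32_t sketch_pc_from_ra(uint32_t ra, uint32_t text_addr)
{
	return (ra & 0x3fffffffu) | (text_addr & 0xc0000000u);
}
```

For example, with code located at 0x40000000 and a raw register value of 0x80001234, passing a code address yields 0x40001234, while passing a stack address in a different region such as 0x9fff0000 yields 0x80001234, which does not point into the code.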

xtensa: stacktrace: include <asm/ftrace.h> for prototype [+ + +]

Author: Randy Dunlap <[email protected]>
Date:   Tue Sep 19 22:21:30 2023 -0700

    xtensa: stacktrace: include <asm/ftrace.h> for prototype
    
    [ Upstream commit 1b6ceeb99ee05eb2c62a9e5512623e63cf8490ba ]
    
    Use <asm/ftrace.h> to prevent a build warning:
    
    arch/xtensa/kernel/stacktrace.c:263:15: warning: no previous prototype for 'return_address' [-Wmissing-prototypes]
      263 | unsigned long return_address(unsigned level)
    
    Signed-off-by: Randy Dunlap <[email protected]>
    Cc: Chris Zankel <[email protected]>
    Cc: Max Filippov <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Max Filippov <[email protected]>
    Stable-dep-of: 0e60f0b75884 ("xtensa: fix MAKE_PC_FROM_RA second argument")
    Signed-off-by: Sasha Levin <[email protected]>

zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING [+ + +]

Author: Oleg Nesterov <[email protected]>
Date:   Sat Jun 8 14:06:16 2024 +0200

    zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
    
    [ Upstream commit 7fea700e04bd3f424c2d836e98425782f97b494e ]
    
    kernel_wait4() doesn't sleep and returns -EINTR if there is no
    eligible child and signal_pending() is true.
    
    That is why zap_pid_ns_processes() clears TIF_SIGPENDING but this is not
    enough, it should also clear TIF_NOTIFY_SIGNAL to make signal_pending()
    return false and avoid a busy-wait loop.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL")
    Signed-off-by: Oleg Nesterov <[email protected]>
    Reported-by: Rachel Menge <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Reviewed-by: Boqun Feng <[email protected]>
    Tested-by: Wei Fu <[email protected]>
    Reviewed-by: Jens Axboe <[email protected]>
    Cc: Allen Pais <[email protected]>
    Cc: Christian Brauner <[email protected]>
    Cc: Frederic Weisbecker <[email protected]>
    Cc: Joel Fernandes (Google) <[email protected]>
    Cc: Joel Granados <[email protected]>
    Cc: Josh Triplett <[email protected]>
    Cc: Lai Jiangshan <[email protected]>
    Cc: Mateusz Guzik <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Mike Christie <[email protected]>
    Cc: Neeraj Upadhyay <[email protected]>
    Cc: Paul E. McKenney <[email protected]>
    Cc: Steven Rostedt (Google) <[email protected]>
    Cc: Zqiang <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
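
Note: the busy-wait condition described in the commit above can be modelled with two flag bits. This is a userspace sketch, not kernel code: the `SKETCH_*` constants and the three helper functions are hypothetical names, and the model assumes only that signal_pending() reports true when either flag is set, as the commit message implies.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the two thread flags. */
#define SKETCH_SIGPENDING	(1u << 0)
#define SKETCH_NOTIFY_SIGNAL	(1u << 1)

/* Modelled check: pending if either flag is set. */
static bool sketch_signal_pending(unsigned int flags)
{
	return (flags & (SKETCH_SIGPENDING | SKETCH_NOTIFY_SIGNAL)) != 0;
}

/* Behaviour before the fix: only the SIGPENDING bit is cleared. */
static unsigned int sketch_clear_sigpending_only(unsigned int flags)
{
	return flags & ~SKETCH_SIGPENDING;
}

/* Behaviour after the fix: both bits are cleared before waiting. */
static unsigned int sketch_clear_both(unsigned int flags)
{
	return flags & ~(SKETCH_SIGPENDING | SKETCH_NOTIFY_SIGNAL);
}
```

In this model, a task with both bits set still reports a pending signal after the old clearing step, so a wait that returns -EINTR on a pending signal would be retried immediately; clearing both bits makes the check return false.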