Changelog in Linux kernel 6.14.8

accel/ivpu: Dump only first MMU fault from single context

Author: Karol Wachowski <[email protected]>
Date:   Tue Jan 7 18:32:29 2025 +0100

    accel/ivpu: Dump only first MMU fault from single context
    
    commit 0240fa18d247c99a1967f2fed025296a89a1c5f5 upstream.
    
    Stop dumping consecutive faults from an already faulty context immediately,
    instead of waiting for the context abort thread handler (IRQ handler bottom
    half) to abort currently executing jobs.
    
    Remove the 'R' (record events) bit from the context descriptor of a faulty
    context to prevent further faults from being generated.
    
    This change speeds up the IRQ handler by eliminating the need to print the
    fault content repeatedly. Additionally, it prevents flooding dmesg with
    errors, which was occurring due to the delay in the bottom half of the
    handler stopping fault-generating jobs.
    
    Signed-off-by: Karol Wachowski <[email protected]>
    Signed-off-by: Maciej Falkowski <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
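
As a rough illustration of the "dump only the first fault per context" idea in the entry above, here is a minimal userspace sketch. It is a toy model, not driver code: the class, method and field names are hypothetical, and it does not model the 'R' bit or any hardware behaviour.

```python
class FaultLogger:
    """Toy model: dump only the first MMU fault seen for each context (SSID)."""

    def __init__(self):
        self.faulty_ssids = set()   # contexts already marked as faulty
        self.dumped = []            # stand-in for lines written to dmesg

    def handle_fault(self, ssid, address):
        # A context that has already faulted is skipped immediately, without
        # waiting for a later abort step to stop its jobs.
        if ssid in self.faulty_ssids:
            return False
        self.faulty_ssids.add(ssid)
        self.dumped.append(f"MMU fault: ssid={ssid} addr={address:#x}")
        return True


logger = FaultLogger()
events = [(2, 0x1000), (2, 0x2000), (3, 0x3000), (2, 0x4000)]
results = [logger.handle_fault(ssid, addr) for ssid, addr in events]
```

In this model only the first fault from SSID 2 and the first from SSID 3 are recorded, so the log stays short regardless of how many repeat faults arrive.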

accel/ivpu: Fix missing MMU events from reserved SSID

Author: Karol Wachowski <[email protected]>
Date:   Tue Jan 7 18:32:31 2025 +0100

    accel/ivpu: Fix missing MMU events from reserved SSID
    
    commit 353b8f48390d36b39276ff6af61464ec64cd4d5c upstream.
    
    Generate recovery when a fault from the reserved context is detected.
    Add the Abort (A) bit to the reserved (1) SSID to ensure the NPU also
    receives a fault.
    
    There is no way to create a file_priv with the reserved SSID,
    but it is still possible to receive MMU faults from that SSID
    as it is the default NPU HW setting. Such a situation will occur if
    the FW freed context-related resources but still accessed DRAM.
    
    Signed-off-by: Karol Wachowski <[email protected]>
    Signed-off-by: Maciej Falkowski <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

accel/ivpu: Fix missing MMU events if file_priv is unbound

Author: Karol Wachowski <[email protected]>
Date:   Wed Jan 29 13:56:33 2025 +0100

    accel/ivpu: Fix missing MMU events if file_priv is unbound
    
    commit 2f5bbea1807a064a1e4c1b385c8cea4f37bb4b17 upstream.
    
    Move the ivpu_mmu_discard_events() function to the common portion of
    the abort work function. This ensures it is called only once, even if
    there are no faulty contexts in context_xa, to guarantee that MMU events
    are discarded and new events are not missed.
    
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Karol Wachowski <[email protected]>
    Reviewed-by: Jeffrey Hugo <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

accel/ivpu: Flush pending jobs of device's workqueues

Author: Maciej Falkowski <[email protected]>
Date:   Tue Apr 1 17:57:55 2025 +0200

    accel/ivpu: Flush pending jobs of device's workqueues
    
    commit 683e9fa1c885a0cffbc10b459a7eee9df92af1c1 upstream.
    
    Use flush_work() instead of cancel_work_sync() for driver IRQ
    workqueues to guarantee that remaining pending work
    will be handled.
    
    This resolves two issues that were encountered where a driver was left
    in an incorrect state as the bottom-half was canceled:
    
    1. Cancelling the context abort of a job that is still executing and
       causing translation faults, which leads to additional TDRs
    
    2. Cancelling bottom-half of a DCT (duty-cycle throttling) request
       which will cause a device to not be adjusted to an external frequency
       request.
    
    Fixes: bc3e5f48b7ee ("accel/ivpu: Use workqueue for IRQ handling")
    Signed-off-by: Maciej Falkowski <[email protected]>
    Reviewed-by: Lizhi Hou <[email protected]>
    Reviewed-by: Jeff Hugo <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

accel/ivpu: Move parts of MMU event IRQ handling to thread handler

Author: Karol Wachowski <[email protected]>
Date:   Tue Jan 7 18:32:30 2025 +0100

    accel/ivpu: Move parts of MMU event IRQ handling to thread handler
    
    commit 4480912f3f8b8a1fbb5ae12c5c547fd094ec4197 upstream.
    
    To prevent looping infinitely in the MMU event handler, we stop
    generating new events by removing the 'R' (record) bit from the context
    descriptor, but to ensure this change takes effect the KMD has to perform
    a configuration invalidation followed by a sync command.
    
    Because of that, move the parts of the interrupt handler that can take
    longer to a thread, so that the interrupt handler does not block for
    too long.
    This includes:
     * disabling event queue for the time KMD updates MMU event queue consumer
       to ensure proper synchronization between MMU and KMD
    
     * removal of 'R' (record) bit from context descriptor to ensure no more
       faults are recorded until that context is destroyed
    
    Signed-off-by: Karol Wachowski <[email protected]>
    Signed-off-by: Maciej Falkowski <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

accel/ivpu: Use workqueue for IRQ handling

Author: Maciej Falkowski <[email protected]>
Date:   Tue Jan 7 18:32:28 2025 +0100

    accel/ivpu: Use workqueue for IRQ handling
    
    commit bc3e5f48b7ee021371dc37297678f7089be6ce28 upstream.
    
    Convert the IRQ bottom half from a threaded handler into a workqueue.
    This increases stability in rare scenarios where the driver, on
    debugging/hardening kernels, processes IRQs too slowly and misses
    some interrupts as a result.
    The workqueue handler also gives a very minor performance increase.
    
    Signed-off-by: Maciej Falkowski <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: PPTT: Fix processor subtable walk

Author: Jeremy Linton <[email protected]>
Date:   Wed May 7 21:30:25 2025 -0500

    ACPI: PPTT: Fix processor subtable walk
    
    commit adfab6b39202481bb43286fff94def4953793fdb upstream.
    
    The original PPTT code had a bug where the processor subtable length
    was not correctly validated when encountering a truncated
    acpi_pptt_processor node.
    
    Commit 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of
    sizeof() calls") attempted to fix this by validating the size is as
    large as the acpi_pptt_processor node structure. This introduced a
    regression where the last processor node in the PPTT table is ignored
    if it doesn't contain any private resources. That results in errors like:
    
      ACPI PPTT: PPTT table found, but unable to locate core XX (XX)
      ACPI: SPE must be homogeneous
    
    Furthermore, it fails in a common case where the node length isn't
    equal to the acpi_pptt_processor structure size, leaving the original
    bug in a modified form.
    
    Correct the regression by adjusting the loop termination conditions as
    suggested by the bug reporters. An additional check, performed after
    the subtable node type is detected, validates that the acpi_pptt_processor
    node is fully contained in the PPTT table. Repeating the check in
    acpi_pptt_leaf_node() is largely redundant as the node is already
    known to be fully contained in the table.
    
    The case where a final truncated node's parent property is accepted,
    but the node itself is rejected should not be considered a bug.
    
    Fixes: 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls")
    Reported-by: Maximilian Heyne <[email protected]>
    Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amazon/
    Reported-by: Yicong Yang <[email protected]>
    Closes: https://lore.kernel.org/linux-acpi/[email protected]/
    Signed-off-by: Jeremy Linton <[email protected]>
    Tested-by: Yicong Yang <[email protected]>
    Reviewed-by: Sudeep Holla <[email protected]>
    Tested-by: Maximilian Heyne <[email protected]>
    Cc: All applicable <[email protected]> # 7ab4f0e37a0f4: ACPI PPTT: Fix coding mistakes ...
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
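
The loop-termination issue described in the entry above can be sketched with a simplified table walker. This is an illustration under stated assumptions, not the kernel implementation: the byte layout is a simplified 20-byte processor node followed by 4-byte private resource references, and all function and constant names are hypothetical.

```python
import struct

PROC_TYPE = 0             # subtable type used for processor nodes in this sketch
PROC_FIXED_SIZE = 20      # fixed part of a processor node, before private resources
NODE_FORMAT = "<BBHIIII"  # type, length, reserved, flags, parent, id, resource count


def make_proc(proc_id, n_resources=0):
    """Build one processor node followed by n_resources 4-byte references."""
    length = PROC_FIXED_SIZE + 4 * n_resources
    fixed = struct.pack(NODE_FORMAT, PROC_TYPE, length, 0, 0, 0, proc_id, n_resources)
    return fixed + bytes(4 * n_resources)


def find_processor(table, proc_id, inclusive_end=True):
    """Return the offset of the node with proc_id, or None if not found."""
    offset, end = 0, len(table)
    while True:
        limit = offset + PROC_FIXED_SIZE
        # A strict bound skips a last node that ends exactly at the end of
        # the table, i.e. a final node without private resources.
        if limit > end or (limit == end and not inclusive_end):
            return None
        sub_type, length = struct.unpack_from("<BB", table, offset)
        if length == 0:
            return None  # malformed subtable; stop instead of looping forever
        # The node length may exceed the fixed size, so check the full node.
        if sub_type == PROC_TYPE and offset + length <= end:
            fields = struct.unpack_from(NODE_FORMAT, table, offset)
            node_id, n_res = fields[5], fields[6]
            if node_id == proc_id and length == PROC_FIXED_SIZE + 4 * n_res:
                return offset
        offset += length


table = make_proc(1, n_resources=2) + make_proc(2)
```

Calling `find_processor(table, 2)` finds the final node, while `find_processor(table, 2, inclusive_end=False)` misses it, mirroring the regression. A table truncated by a few bytes is rejected in both modes because the last node no longer fits.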

ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()

Author: Wentao Liang <[email protected]>
Date:   Wed May 14 17:24:44 2025 +0800

    ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
    
    commit 9e000f1b7f31684cc5927e034360b87ac7919593 upstream.
    
    The function snd_es1968_capture_open() calls the function
    snd_pcm_hw_constraint_pow2(), but does not check its return
    value. A proper implementation can be found in snd_cx25821_pcm_open().
    
    Add error handling for snd_pcm_hw_constraint_pow2() and propagate its
    error code.
    
    Fixes: b942cf815b57 ("[ALSA] es1968 - Fix stuttering capture")
    Cc: [email protected] # v2.6.22
    Signed-off-by: Wentao Liang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: seq: Fix delivery of UMP events to group ports

Author: Takashi Iwai <[email protected]>
Date:   Sun May 11 15:45:27 2025 +0200

    ALSA: seq: Fix delivery of UMP events to group ports
    
    [ Upstream commit ff7b190aef6cccdb6f14d20c5753081fe6420e0b ]
    
    When an event with a UMP message is sent to a UMP client, the EP port
    always receives it no matter where the event is sent, as it's a
    catch-all port.  OTOH, if an event is sent to the EP port and the
    event has a certain UMP Group, it should also have been delivered to
    the associated UMP Group port, but this was ignored so far.
    
    This patch addresses the behavior.  Now a UMP event sent to the
    Endpoint port will be delivered to the subscribers of the UMP group
    port the event is associated with.
    
    The patch also does a bit of refactoring to simplify the code about
    __deliver_to_subscribers().
    
    Fixes: 177ccf811df4 ("ALSA: seq: Support MIDI 2.0 UMP Endpoint port")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: sh: SND_AICA should depend on SH_DMA_API

Author: Geert Uytterhoeven <[email protected]>
Date:   Tue May 13 09:31:04 2025 +0200

    ALSA: sh: SND_AICA should depend on SH_DMA_API
    
    [ Upstream commit 66e48ef6ef506c89ec1b3851c6f9f5f80b5835ff ]
    
    If CONFIG_SH_DMA_API=n:
    
        WARNING: unmet direct dependencies detected for G2_DMA
          Depends on [n]: SH_DREAMCAST [=y] && SH_DMA_API [=n]
          Selected by [y]:
          - SND_AICA [=y] && SOUND [=y] && SND [=y] && SND_SUPERH [=y] && SH_DREAMCAST [=y]
    
    SND_AICA selects G2_DMA.  As the latter depends on SH_DMA_API, the
    former should depend on SH_DMA_API, too.
    
    Fixes: f477a538c14d07f8 ("sh: dma: fix kconfig dependency for G2_DMA")
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/b90625f8a9078d0d304bafe862cbe3a3fab40082.1747121335.git.geert+renesas@glider.be
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info

Author: Takashi Iwai <[email protected]>
Date:   Sun May 11 16:11:45 2025 +0200

    ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info
    
    [ Upstream commit dd33993a9721ab1dae38bd37c9f665987d554239 ]
    
    s/devince/device/
    
    It's used only internally, so there are no behavior changes.
    
    Fixes: 37e0e14128e0 ("ALSA: ump: Support UMP Endpoint and Function Block parsing")
    Acked-by: Greg Kroah-Hartman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add sample rate quirk for Audioengine D1

Author: Christian Heusel <[email protected]>
Date:   Mon May 12 22:23:37 2025 +0200

    ALSA: usb-audio: Add sample rate quirk for Audioengine D1
    
    commit 2b24eb060c2bb9ef79e1d3bcf633ba1bc95215d6 upstream.
    
    A user reported on the Arch Linux Forums that their device is emitting
    the following message in the kernel journal, which is fixed by adding
    the quirk as submitted in this patch:
    
        > kernel: usb 1-2: current rate 8436480 is different from the runtime rate 48000
    
    There is also an entry for this product line that was added a long time ago.
    Their specific device has the following ID:
    
        $ lsusb | grep Audio
        Bus 001 Device 002: ID 1101:0003 EasyPass Industrial Co., Ltd Audioengine D1
    
    Link: https://bbs.archlinux.org/viewtopic.php?id=305494
    Fixes: 93f9d1a4ac593 ("ALSA: usb-audio: Apply sample rate quirk for Audioengine D1")
    Cc: [email protected]
    Signed-off-by: Christian Heusel <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera

Author: Nicolas Chauvet <[email protected]>
Date:   Thu May 15 12:21:32 2025 +0200

    ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera
    
    commit 7b9938a14460e8ec7649ca2e80ac0aae9815bf02 upstream.
    
    Microdia JP001 does not support reading the sample rate which leads to
    many lines of "cannot get freq at ep 0x84".
    This patch adds the USB ID to quirks.c and avoids those error messages.
    
    usb 7-4: New USB device found, idVendor=0c45, idProduct=636b, bcdDevice= 1.00
    usb 7-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3
    usb 7-4: Product: JP001
    usb 7-4: Manufacturer: JP001
    usb 7-4: SerialNumber: JP001
    usb 7-4: 3:1: cannot get freq at ep 0x84
    
    Cc: <[email protected]>
    Signed-off-by: Nicolas Chauvet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: amlogic: dreambox: fix missing clkc_audio node

Author: Christian Hewitt <[email protected]>
Date:   Sat May 3 08:44:43 2025 +0000

    arm64: dts: amlogic: dreambox: fix missing clkc_audio node
    
    commit 0f67578587bb9e5a8eecfcdf6b8a501b5bd90526 upstream.
    
    Add the clkc_audio node to fix audio support on Dreambox One/Two.
    
    Fixes: 83a6f4c62cb1 ("arm64: dts: meson: add initial support for Dreambox One/Two")
    CC: [email protected]
    Suggested-by: Emanuel Strobel <[email protected]>
    Signed-off-by: Christian Hewitt <[email protected]>
    Reviewed-by: Martin Blumenstingl <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout

Author: Himanshu Bhavani <[email protected]>
Date:   Mon May 5 11:28:27 2025 +0530

    arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout
    
    [ Upstream commit c6888983134e2ccc2db8ffd2720b0d4826d952e4 ]
    
    Fix SD card timeout issue caused by LDO5 regulator getting disabled
    after boot.
    
    The kernel log shows LDO5 being disabled, which leads to a timeout
    on USDHC2:
    [   33.760561] LDO5: disabling
    [   81.119861] mmc1: Timeout waiting for hardware interrupt.
    
    To prevent this, set regulator-boot-on and regulator-always-on for
    LDO5. Also add the vqmmc regulator to properly support 1.8V/3.3V
    signaling for USDHC2 using a GPIO-controlled regulator.
    
    Fixes: 6c2a1f4f71258 ("arm64: dts: imx8mp-var-som-symphony: Add Variscite Symphony board and VAR-SOM-MX8MP SoM")
    Signed-off-by: Himanshu Bhavani <[email protected]>
    Acked-by: Tarang Raval <[email protected]>
    Signed-off-by: Shawn Guo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down

Author: Sam Edwards <[email protected]>
Date:   Sat Mar 29 09:50:17 2025 -0700

    arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down
    
    commit fdc7bd909a5f38793468e9cf9b6a9063d96c6234 upstream.
    
    The RK3588 thermal sensor driver only receives interrupts when a
    higher-temperature threshold is crossed; it cannot notify when the
    sensor cools back off. As a result, the driver must poll for temperature
    changes to detect when the conditions for a thermal trip are no longer
    met. However, it only does so if the DT enables polling.
    
    Before this patch, the RK1 DT did not enable polling, causing the fan to
    continue running at the speed corresponding to the highest temperature
    reached.
    
    Follow suit with similar RK3588 boards by setting a polling-delay of
    1000ms, enabling the driver to detect when the sensor cools back off,
    allowing the fan speed to decrease as appropriate.
    
    Fixes: 7c8ec5e6b9d6 ("arm64: dts: rockchip: Enable automatic fan control on Turing RK1")
    Cc: [email protected] # v6.13+
    Signed-off-by: Sam Edwards <[email protected]>
    Reviewed-by: Dragan Simic <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588

Author: Tom Vincent <[email protected]>
Date:   Thu Apr 17 09:17:53 2025 +0100

    arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588
    
    [ Upstream commit 5e6a4ee9799b202fefa8c6264647971f892f0264 ]
    
    The Realtek RT5616 audio codec on the FriendlyElec CM3588 module fails
    to probe correctly due to the missing clock properties. This results
    in distorted analogue audio output.
    
    Assign MCLK to 12.288 MHz, which allows the codec to advertise most of
    the standard sample rates per other RK3588 devices.
    
    Fixes: e23819cf273c ("arm64: dts: rockchip: Add FriendlyElec CM3588 NAS board")
    Signed-off-by: Tom Vincent <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: dts: rockchip: fix Sige5 RTC interrupt pin

Author: Nicolas Frattaroli <[email protected]>
Date:   Tue Apr 29 18:51:55 2025 +0200

    arm64: dts: rockchip: fix Sige5 RTC interrupt pin
    
    [ Upstream commit 4bf593be2e462623c4c34c7e3b604eb3f8f9de45 ]
    
    Someone made a typo when they added the RTC to the Sige5 DTS, which
    resulted in it using interrupts from GPIO0 B0 instead of GPIO0 A0. The
    pinctrl entry for it wasn't typoed though, curiously enough.
    
    The Sige5 v1.1 schematic was used to verify that GPIO0 A0 is the correct
    pin for the RTC wakeup interrupt, so let's change it to that.
    
    Fixes: 40f742b07ab2 ("arm64: dts: rockchip: Add rk3576-armsom-sige5 board")
    Signed-off-by: Nicolas Frattaroli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi

Author: Dragan Simic <[email protected]>
Date:   Mon Mar 24 12:00:43 2025 +0100

    arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi
    
    commit e0bd7ecf6b2dc71215af699dffbf14bf0bc3d978 upstream.
    
    The differences in the vendor-approved CPU and GPU OPPs for the standard
    Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2]
    come from the latter, presumably, supporting an extended temperature range
    that's usually associated with industrial applications, despite the two SoC
    variant datasheets specifying the same upper limit for the allowed ambient
    temperature for both variants.  However, the lower temperature limit is
    specified much lower for the RK3588J variant. [1][2]
    
    To be on the safe side and to ensure maximum longevity of the RK3588J SoCs,
    only the CPU and GPU OPPs that are declared by the vendor to be always safe
    for this SoC variant may be provided.  As explained by the vendor [3] and
    according to the RK3588J datasheet, [2] higher-frequency/higher-voltage
    CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC
    lifetime expectancy.  Presumably, using the higher OPPs may be safe only
    when not enjoying the assumed extended temperature range that the RK3588J,
    as an SoC variant targeted specifically at higher-temperature, industrial
    applications, is made (or binned) for.
    
    Anyone able to keep their RK3588J-based board outside the above-presumed
    extended temperature range at all times, and willing to take the associated
    risk of possibly reducing the SoC lifetime expectancy, is free to apply
    a DT overlay that adds the higher CPU and GPU OPPs.
    
    With all this and the downstream RK3588(J) DT definitions [4][5] in mind,
    let's delete the RK3588J CPU and GPU OPPs that are not considered belonging
    to the normal operation mode for this SoC variant.  To quote the RK3588J
    datasheet [2], "normal mode means the chipset works under safety voltage
    and frequency;  for the industrial environment, highly recommend to keep in
    normal mode, the lifetime is reasonably guaranteed", while "overdrive mode
    brings higher frequency, and the voltage will increase accordingly;  under
    the overdrive mode for a long time, the chipset may shorten the lifetime,
    especially in high-temperature condition".
    
    To sum the RK3588J datasheet [2] and the vendor-provided DTs up, [4][5]
    the maximum allowed CPU core, GPU and NPU frequencies are as follows:
    
       IP core    | Normal mode | Overdrive mode
      ------------+-------------+----------------
       Cortex-A55 |   1,296 MHz |      1,704 MHz
       Cortex-A76 |   1,608 MHz |      2,016 MHz
       GPU        |     700 MHz |        850 MHz
       NPU        |     800 MHz |        950 MHz
    
    Unfortunately, when it comes to the actual voltages for the RK3588J CPU and
    GPU OPPs, there's a discrepancy between the RK3588J datasheet [2] and the
    downstream kernel code. [4][5]  The RK3588J datasheet states that "the max.
    working voltage of CPU/GPU/NPU is 0.75 V under the normal mode", while the
    downstream kernel code actually allows voltage ranges that go up to 0.95 V,
    which is still within the voltage range allowed by the datasheet.  However,
    the RK3588J datasheet also tells us to "strictly refer to the software
    configuration of SDK and the hardware reference design", so let's embrace
    the voltage ranges provided by the downstream kernel code, which also
    prevents the undesirable theoretical outcome of ending up with no usable
    OPPs on a particular board, as a result of the board's voltage regulator(s)
    being unable to deliver the exact voltages, for whatever reason.
    
    The above-described voltage ranges for the RK3588J CPU OPPs remain taken
    from the downstream kernel code [4][5] by picking the highest, worst-bin
    values, which ensure that all RK3588J bins will work reliably.  Yes, with
    some power inevitably wasted as unnecessarily generated heat, but the
    reliability is paramount, together with the longevity.  This deficiency
    may be revisited separately at some point in the future.
    
    The provided RK3588J CPU OPPs follow the slightly debatable "provide only
    the highest-frequency OPP from the same-voltage group" approach that's been
    established earlier, [6] as a result of the "same-voltage, lower-frequency"
    OPPs being considered inefficient from the IPA governor's standpoint, which
    may also be revisited separately at some point in the future.
    
    [1] https://wiki.friendlyelec.com/wiki/images/e/ee/Rockchip_RK3588_Datasheet_V1.6-20231016.pdf
    [2] https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2403201054_Rockchip-RK3588J_C22364189.pdf
    [3] https://lore.kernel.org/linux-rockchip/[email protected]/T/#u
    [4] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
    [5] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
    [6] https://lore.kernel.org/all/[email protected]/
    
    Fixes: 667885a68658 ("arm64: dts: rockchip: Add OPP data for CPU cores on RK3588j")
    Fixes: a7b2070505a2 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j")
    Cc: [email protected]
    Cc: Heiko Stuebner <[email protected]>
    Cc: Alexey Charkov <[email protected]>
    Helped-by: Quentin Schulz <[email protected]>
    Reviewed-by: Quentin Schulz <[email protected]>
    Signed-off-by: Dragan Simic <[email protected]>
    Link: https://lore.kernel.org/r/eeec0d30d79b019d111b3f0aa2456e69896b2caa.1742813866.git.dsimic@manjaro.org
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

binfmt_elf: Move brk for static PIE even if ASLR disabled

Author: Kees Cook <[email protected]>
Date:   Fri Apr 25 15:45:06 2025 -0700

    binfmt_elf: Move brk for static PIE even if ASLR disabled
    
    [ Upstream commit 11854fe263eb1b9a8efa33b0c087add7719ea9b4 ]
    
    In commit bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing
    direct loader exec"), the brk was moved out of the mmap region when
    loading static PIE binaries (ET_DYN without INTERP). The common case
    for these binaries was testing new ELF loaders, so the brk needed to
    be away from mmap to avoid colliding with stack, future mmaps (of the
    loader-loaded binary), etc. But this was only done when ASLR was enabled,
    in an attempt to minimize changes to memory layouts.
    
    After adding support to respect alignment requirements for static PIE
    binaries in commit 3545deff0ec7 ("binfmt_elf: Honor PT_LOAD alignment
    for static PIE"), it became possible to have a large gap after the
    final PT_LOAD segment and the top of the mmap region. This means that
    future mmap allocations might go after the last PT_LOAD segment (where
    brk might be if ASLR was disabled) instead of before them (where they
    traditionally ended up).
    
    On arm64, running with ASLR disabled, Ubuntu 22.04's "ldconfig" binary,
    a static PIE, has alignment requirements that leave a gap large enough
    after the last PT_LOAD segment to fit the vdso and vvar, but still leave
    enough space for the brk (which immediately follows the last PT_LOAD
    segment) to be allocated by the binary.
    
    fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
    fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
    fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
    ***[brk will go here at fffff7ffa000]***
    fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
    fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
    fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]
    
    After commit 0b3bc3354eb9 ("arm64: vdso: Switch to generic storage
    implementation"), the arm64 vvar grew slightly, and suddenly the brk
    collided with the allocation.
    
    fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real
    fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real
    fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0
    ***[oops, no room any more, vvar is at fffff7ffa000!]***
    fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]
    fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0       [vdso]
    fffffffdf000-1000000000000 rw-p 00000000 00:00 0      [stack]
    
    The solution is to unconditionally move the brk out of the mmap region
    for static PIE binaries. Whether ASLR is enabled or not does not change
    whether future mmap allocations may collide with a growing brk region.
    
    Update memory layout comments (with kernel-doc headings), consolidate
    the setting of mm->brk to later (it isn't needed early), move static PIE
    brk out of mmap unconditionally, and make sure brk(2) knows to base brk
    position off of mm->start_brk not mm->end_data no matter what the cause of
    moving it is (via current->brk_randomized).
    
    For the CONFIG_COMPAT_BRK case, though, leave the logic unchanged, as we
    can never safely move the brk. These systems, however, are not using
    specially aligned static PIE binaries.
    
    Reported-by: Ryan Roberts <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ryan Roberts <[email protected]>
    Tested-by: Ryan Roberts <[email protected]>
    Signed-off-by: Kees Cook <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
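
The two memory maps quoted in the entry above can be checked with a little arithmetic. The sketch below parses the address ranges from the quoted lines and computes how much room is left for brk between the end of the last anonymous mapping and the start of [vvar]. The helper names are hypothetical and the input lines are copied from the commit message.

```python
def parse_range(maps_line):
    """Return (start, end) from one /proc/<pid>/maps-style line."""
    span = maps_line.split()[0]
    start, end = span.split("-")
    return int(start, 16), int(end, 16)


def brk_room(last_segment_line, next_mapping_line):
    """Bytes available for brk growth between two adjacent mappings."""
    _, brk_start = parse_range(last_segment_line)
    next_start, _ = parse_range(next_mapping_line)
    return next_start - brk_start


bss = "fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0"
vvar_before = "fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]"
vvar_after = "fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0       [vvar]"

room_before = brk_room(bss, vvar_before)  # 0x2000 bytes
room_after = brk_room(bss, vvar_after)    # 0 bytes
```

Before the vvar grew there were 0x2000 bytes (8 KiB) available at fffff7ffa000. Afterwards the vvar starts exactly where brk would, leaving no room at all.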

Bluetooth: MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Tue Apr 29 15:05:59 2025 -0400

    Bluetooth: MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags
    
    [ Upstream commit 1e2e3044c1bc64a64aa0eaf7c17f7832c26c9775 ]
    
    Device flags could be updated in the meantime while MGMT_OP_ADD_DEVICE
    is pending on hci_update_passive_scan_sync so instead of setting the
    current_flags as cmd->user_data just do a lookup using
    hci_conn_params_lookup and use the latest stored flags.
    
    Fixes: a182d9c84f9c ("Bluetooth: MGMT: Fix Add Device to responding before completing")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
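
The stale-value problem described in the entry above is a general one: a value captured when a request is queued can be out of date by the time the request completes. The sketch below shows the pattern with a toy store. It does not use any Bluetooth API, and the class, address and flag values are made up for illustration.

```python
class ConnParamsStore:
    """Toy stand-in for stored per-device connection parameters."""

    def __init__(self):
        self._flags = {}

    def set_flags(self, addr, flags):
        self._flags[addr] = flags

    def lookup(self, addr):
        return self._flags.get(addr)


store = ConnParamsStore()
addr = "00:11:22:33:44:55"
store.set_flags(addr, 0x0)

# Old approach: capture the flags when the request is queued.
snapshot = store.lookup(addr)

# The flags change while the request is still pending.
store.set_flags(addr, 0x1)

# New approach: look the flags up again when the request completes.
reported_old = snapshot
reported_new = store.lookup(addr)
```

Here `reported_old` still carries the value from queue time, while `reported_new` reflects the latest stored flags, which is the behaviour the fix aims for.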

btrfs: add back warning for mount option commit values exceeding 300

Author: Kyoji Ogasawara <[email protected]>
Date:   Fri May 9 19:26:31 2025 +0900

    btrfs: add back warning for mount option commit values exceeding 300
    
    commit 4ce2affc6ef9f84b4aebbf18bd5c57397b6024eb upstream.
    
    The Btrfs documentation states that if the commit value is greater than
    300 a warning should be issued. The warning was accidentally lost in the
    new mount API update.
    
    Fixes: 6941823cc878 ("btrfs: remove old mount API code")
    CC: [email protected] # 6.12+
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Anand Jain <[email protected]>
    Signed-off-by: Kyoji Ogasawara <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix discard worker infinite loop after disabling discard [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Mon May 5 16:03:16 2025 +0100

    btrfs: fix discard worker infinite loop after disabling discard
    
    commit 54db6d1bdd71fa90172a2a6aca3308bbf7fa7eb5 upstream.
    
    If the discard worker is running and there's currently only one block
    group, that block group is a data block group, it's in the unused block
    groups discard list and is being used (it got an extent allocated from it
    after becoming unused), the worker can end up in an infinite loop if a
    transaction abort happens or the async discard is disabled (during remount
    or unmount for example).
    
    This happens like this:
    
    1) Task A, the discard worker, is at peek_discard_list() and
       find_next_block_group() returns block group X;
    
    2) Block group X is in the unused block groups discard list (its discard
       index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past
       it became an unused block group and was added to that list, but then
       later it got an extent allocated from it, so its ->used counter is not
       zero anymore;
    
    3) The current transaction is aborted by task B and we end up at
       __btrfs_handle_fs_error() in the transaction abort path, where we call
       btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from
       fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode
       (setting SB_RDONLY in the super block's s_flags field);
    
    4) Task A calls __add_to_discard_list() with the goal of moving the block
       group from the unused block groups discard list into another discard
       list, but at __add_to_discard_list() we end up doing nothing because
       btrfs_run_discard_work() returns false, since the super block has
       SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set
       anymore in fs_info->flags. So block group X remains in the unused block
       groups discard list;
    
    5) Task A then does a goto into the 'again' label, calls
       find_next_block_group() again and gets block group X again. Then it
       repeats the previous steps over and over since there are no other
       block groups in the discard lists and block group X is never moved
       out of the unused block groups discard list since
       btrfs_run_discard_work() keeps returning false and therefore
       __add_to_discard_list() doesn't move block group X out of that discard
       list.
    
    When this happens we can get a soft lockup report like this:
    
      [71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97]
      [71.957] Modules linked in: xfs af_packet rfkill (...)
      [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W          6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
      [71.957] Tainted: [W]=WARN
      [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
      [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
      [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
      [71.957] Code: c1 01 48 83 (...)
      [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
      [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
      [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
      [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
      [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
      [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
      [71.957] FS:  0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000
      [71.957] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0
      [71.957] PKRU: 55555554
      [71.957] Call Trace:
      [71.957]  <TASK>
      [71.957]  process_one_work+0x17e/0x330
      [71.957]  worker_thread+0x2ce/0x3f0
      [71.957]  ? __pfx_worker_thread+0x10/0x10
      [71.957]  kthread+0xef/0x220
      [71.957]  ? __pfx_kthread+0x10/0x10
      [71.957]  ret_from_fork+0x34/0x50
      [71.957]  ? __pfx_kthread+0x10/0x10
      [71.957]  ret_from_fork_asm+0x1a/0x30
      [71.957]  </TASK>
      [71.957] Kernel panic - not syncing: softlockup: hung tasks
      [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G        W    L     6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa
      [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP
      [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
      [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs]
      [71.992] Call Trace:
      [71.993]  <IRQ>
      [71.994]  dump_stack_lvl+0x5a/0x80
      [71.994]  panic+0x10b/0x2da
      [71.995]  watchdog_timer_fn.cold+0x9a/0xa1
      [71.996]  ? __pfx_watchdog_timer_fn+0x10/0x10
      [71.997]  __hrtimer_run_queues+0x132/0x2a0
      [71.997]  hrtimer_interrupt+0xff/0x230
      [71.998]  __sysvec_apic_timer_interrupt+0x55/0x100
      [71.999]  sysvec_apic_timer_interrupt+0x6c/0x90
      [72.000]  </IRQ>
      [72.000]  <TASK>
      [72.001]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
      [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs]
      [72.002] Code: c1 01 48 83 (...)
      [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246
      [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000
      [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad
      [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080
      [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860
      [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000
      [72.010]  ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f]
      [72.011]  process_one_work+0x17e/0x330
      [72.012]  worker_thread+0x2ce/0x3f0
      [72.013]  ? __pfx_worker_thread+0x10/0x10
      [72.014]  kthread+0xef/0x220
      [72.014]  ? __pfx_kthread+0x10/0x10
      [72.015]  ret_from_fork+0x34/0x50
      [72.015]  ? __pfx_kthread+0x10/0x10
      [72.016]  ret_from_fork_asm+0x1a/0x30
      [72.017]  </TASK>
      [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [72.019] Rebooting in 90 seconds..
    
    So fix this by making sure we move a block group out of the unused block
    groups discard list when calling __add_to_discard_list().
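    
    The five steps and the fix can be modelled with a small Python
    simulation. This is a simplified sketch, not the btrfs code: the list
    indexes, BlockGroup, add_to_list() and peek() are hypothetical
    stand-ins, and the loop is capped at max_iters so the livelock shows up
    as a hit on that cap instead of a real hang.

```python
UNUSED, DATA = 0, 1  # hypothetical discard list indexes


class BlockGroup:
    def __init__(self, used):
        self.used = used
        self.index = UNUSED


def add_to_list(lists, bg, running, fixed):
    """Model of moving a block group onto the DATA discard list."""
    if fixed and bg in lists[bg.index]:
        # Fixed behaviour: always unlink from the current list first.
        lists[bg.index].remove(bg)
    if not running:
        # Discard was stopped: nothing gets queued.
        return
    if bg in lists[bg.index]:
        lists[bg.index].remove(bg)
    bg.index = DATA
    lists[DATA].append(bg)


def peek(lists, running, fixed, max_iters=1000):
    """Return (block group or None, iterations used)."""
    for i in range(1, max_iters + 1):
        bg = next((b for idx in (UNUSED, DATA) for b in lists[idx]), None)
        if bg is None:
            return None, i
        if bg.index == UNUSED and bg.used != 0:
            add_to_list(lists, bg, running, fixed)
            continue  # the 'goto again' from step 5
        return bg, i
    return None, max_iters  # cap reached: the modelled infinite loop


def run(running, fixed):
    # One block group that is on the unused list but has used != 0.
    lists = {UNUSED: [BlockGroup(used=1)], DATA: []}
    return peek(lists, running, fixed)
```

    With discard stopped, run(False, False) spins until the iteration cap,
    while run(False, True) finds the lists empty on the second pass.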
    
    Fixes: 2bee7eb8bb81 ("btrfs: discard one region at a time in async discard")
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012
    CC: [email protected] # 5.10+
    Reviewed-by: Boris Burkov <[email protected]>
    Reviewed-by: Daniel Vacek <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix folio leak in submit_one_async_extent() [+ + +]

Author: Boris Burkov <[email protected]>
Date:   Wed May 7 12:42:24 2025 -0700

    btrfs: fix folio leak in submit_one_async_extent()
    
    commit a0fd1c6098633f9a95fc2f636383546c82b704c3 upstream.
    
    If btrfs_reserve_extent() fails while submitting an async_extent for a
    compressed write, then we fail to call free_async_extent_pages() on the
    async_extent and leak its folios. A likely cause for such a failure
    would be btrfs_reserve_extent() failing to find a large enough
    contiguous free extent for the compressed extent.
    
    I was able to reproduce this by:
    
    1. mount with compress-force=zstd:3
    2. fallocating most of a filesystem to a big file
    3. fragmenting the remaining free space
    4. trying to copy in a file which zstd would generate large compressed
       extents for (vmlinux worked well for this)
    
    Step 4 hits the memory leak and can be repeated ad nauseam to
    eventually exhaust the system memory.
    
    Fix this by detecting the case where we fall back to uncompressed
    submission for a compressed async_extent and ensuring that we call
    free_async_extent_pages().
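    
    A simplified Python model of the leak, assuming the compressed folios
    are only released on the path that actually uses them. AsyncExtent,
    submit_one() and leaked_after() are hypothetical; the helper named
    free_async_extent_pages() here is a stand-in for the kernel function of
    the same name and only clears a list.

```python
class AsyncExtent:
    """Hypothetical stand-in for an async extent holding compressed folios."""

    def __init__(self, nr_folios):
        self.folios = [object() for _ in range(nr_folios)]


def free_async_extent_pages(extent):
    # Model of releasing the folios attached to the extent.
    extent.folios = []


def submit_one(extent, reserve_ok, fixed):
    """Return the number of folios still attached (leaked) after submission."""
    if reserve_ok:
        # Compressed path: ownership of the folios passes to the write,
        # modelled here as releasing them.
        free_async_extent_pages(extent)
        return 0
    # Reservation failed: fall back to an uncompressed write, which does
    # not use the compressed folios at all.
    if fixed:
        free_async_extent_pages(extent)
    return len(extent.folios)


def leaked_after(attempts, fixed):
    # Repeat a failing submission, as in step 4 of the reproducer.
    return sum(submit_one(AsyncExtent(4), reserve_ok=False, fixed=fixed)
               for _ in range(attempts))
```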
    
    Fixes: 131a821a243f ("btrfs: fallback if compressed IO fails for ENOSPC")
    CC: [email protected] # 6.1+
    Reviewed-by: Filipe Manana <[email protected]>
    Co-developed-by: Josef Bacik <[email protected]>
    Signed-off-by: Boris Burkov <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks [+ + +]

Author: Waiman Long <[email protected]>
Date:   Thu May 8 15:24:13 2025 -0400

    cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks
    
    [ Upstream commit 39b5ef791d109dd54c7c2e6e87933edfcc0ad1ac ]
    
    Commit ec5fbdfb99d1 ("cgroup/cpuset: Enable update_tasks_cpumask()
    on top_cpuset") enabled us to pull CPUs dedicated to child partitions
    from tasks in top_cpuset by ignoring per cpu kthreads. However, there
    can be other kthreads that are not per cpu but have PF_NO_SETAFFINITY
    flag set to indicate that we shouldn't mess with their CPU affinity.
    For other kthreads, their affinity will be changed to skip CPUs dedicated
    to child partitions whether it is an isolating or a scheduling one.
    
    As all the per cpu kthreads have PF_NO_SETAFFINITY set, the
    PF_NO_SETAFFINITY tasks are essentially a superset of per cpu kthreads.
    Fix this issue by dropping the kthread_is_per_cpu() check and checking
    the PF_NO_SETAFFINITY flag instead.
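    
    The difference between the two checks can be sketched in Python. The
    flag bit values below are illustrative (the kernel's own definitions
    live in include/linux/sched.h), and Task, skip_old() and skip_new() are
    hypothetical names used only to contrast the old and new conditions.

```python
from dataclasses import dataclass

# Illustrative bit values; see include/linux/sched.h for the real ones.
PF_KTHREAD = 0x00200000
PF_NO_SETAFFINITY = 0x04000000


@dataclass
class Task:
    name: str
    flags: int
    per_cpu: bool = False  # models what kthread_is_per_cpu() would report


def skip_old(task: Task) -> bool:
    # Old condition: only per cpu kthreads are left alone.
    return bool(task.flags & PF_KTHREAD) and task.per_cpu


def skip_new(task: Task) -> bool:
    # New condition: any task with PF_NO_SETAFFINITY is left alone.
    return bool(task.flags & PF_NO_SETAFFINITY)


tasks = [
    Task("percpu_kthread", PF_KTHREAD | PF_NO_SETAFFINITY, per_cpu=True),
    Task("pinned_kthread", PF_KTHREAD | PF_NO_SETAFFINITY),
    Task("plain_kthread", PF_KTHREAD),
    Task("user_task", 0),
]

skipped_old = [t.name for t in tasks if skip_old(t)]
skipped_new = [t.name for t in tasks if skip_new(t)]
```

    The pinned but not per cpu kthread is the case the old check missed.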
    
    Fixes: ec5fbdfb99d1 ("cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset")
    Signed-off-by: Waiman Long <[email protected]>
    Acked-by: Frederic Weisbecker <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

dma-buf: insert memory barrier before updating num_fences [+ + +]

Author: Hyejeong Choi <[email protected]>
Date:   Mon May 12 21:06:38 2025 -0500

    dma-buf: insert memory barrier before updating num_fences
    
    commit 72c7d62583ebce7baeb61acce6057c361f73be4a upstream.
    
    smp_store_mb() inserts a memory barrier after the store operation.
    This differs from what the comment originally intended, so a NULL
    pointer dereference can happen if the memory updates are reordered.
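    
    Python cannot express CPU memory barriers, so the sketch below only
    models the two orders in which another CPU might observe the writer's
    stores. The state layout and helper names are hypothetical. A barrier
    placed before the count update is what rules out the second order.

```python
def store_fence(state):
    # Writer step 1: place the new fence pointer in its slot.
    state["slots"][0] = "fence"


def bump_count(state):
    # Writer step 2: publish the slot by raising the count.
    state["num_fences"] = 1


def observe(visible_order):
    """Apply the stores in the order a reader sees them, reading after each."""
    state = {"slots": [None], "num_fences": 0}
    for store in visible_order:
        store(state)
        # Reader: trust num_fences, then touch every published slot.
        for i in range(state["num_fences"]):
            if state["slots"][i] is None:
                return "NULL dereference"
    return "ok"
```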
    
    Signed-off-by: Hyejeong Choi <[email protected]>
    Fixes: a590d0fdbaa5 ("dma-buf: Update reservation shared_count after adding the new fence")
    CC: [email protected]
    Reviewed-by: Christian König <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:12 2025 +0800

    dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals
    
    commit 61259fb96e023f7299c442c48b13e72c441fc0f2 upstream.
    
    idxd_setup_internals() is missing some cleanup when it fails partway
    through.
    
    Add the appropriate cleanup routines:
    
    - cleanup groups
    - cleanup engines
    - cleanup wqs
    
    to make sure it exits gracefully.
    
    Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime")
    Cc: [email protected]
    Suggested-by: Fenghua Yu <[email protected]>
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: Add missing cleanups in cleanup internals [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:13 2025 +0800

    dmaengine: idxd: Add missing cleanups in cleanup internals
    
    commit 61d651572b6c4fe50c7b39a390760f3a910c7ccf upstream.
    
    The idxd_cleanup_internals() function only decreases the reference count
    of groups, engines, and wqs but is missing the step to release memory
    resources.
    
    To fix this, use the cleanup helper to properly release the memory
    resources.
    
    Fixes: ddf742d4f3f1 ("dmaengine: idxd: Add missing cleanup for early error out in probe call")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:16 2025 +0800

    dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call
    
    commit d5449ff1b04dfe9ed8e455769aa01e4c2ccf6805 upstream.
    
    The remove call stack is missing idxd cleanup to free bitmap, ida and
    the idxd_device. Call idxd_free() helper routines to make sure we exit
    gracefully.
    
    Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
    Cc: [email protected]
    Suggested-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: fix memory leak in error handling path of idxd_alloc [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:14 2025 +0800

    dmaengine: idxd: fix memory leak in error handling path of idxd_alloc
    
    commit 46a5cca76c76c86063000a12936f8e7875295838 upstream.
    
    Memory allocated for idxd is not freed if an error occurs during
    idxd_alloc(). To fix it, free the allocated memory in the reverse order
    of allocation before exiting the function in case of an error.
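    
    Freeing in the reverse order of allocation is the usual unwind pattern
    (done with goto labels in kernel C). As a loose illustration only, the
    same idea can be written in Python with contextlib.ExitStack; the step
    names and alloc_all() are hypothetical and do not mirror the actual
    allocations made by idxd_alloc().

```python
from contextlib import ExitStack

STEPS = ("id", "bitmap", "device")  # hypothetical allocation steps


def alloc_all(fail_at=None):
    """Return (ok, log); log records allocations and frees in order."""
    log = []
    try:
        with ExitStack() as cleanup:
            for step in STEPS:
                if step == fail_at:
                    raise MemoryError(step)
                log.append(f"alloc {step}")
                # Registered callbacks run last-in, first-out on error.
                cleanup.callback(log.append, f"free {step}")
            # Success: keep everything allocated by discarding the callbacks.
            cleanup.pop_all()
        return True, log
    except MemoryError:
        return False, log
```

    A failure at the last step releases only what was already allocated,
    newest first.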
    
    Fixes: a8563a33a5e2 ("dmanegine: idxd: reformat opcap output to match bitmap_parse() input")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:15 2025 +0800

    dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe
    
    commit 90022b3a6981ec234902be5dbf0f983a12c759fc upstream.
    
    Memory allocated for idxd is not freed if an error occurs during
    idxd_pci_probe(). To fix it, free the allocated memory in the reverse
    order of allocation before exiting the function in case of an error.
    
    Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:10 2025 +0800

    dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines
    
    commit 817bced19d1dbdd0b473580d026dc0983e30e17b upstream.
    
    Memory allocated for engines is not freed if an error occurs during
    idxd_setup_engines(). To fix it, free the allocated memory in the
    reverse order of allocation before exiting the function in case of an
    error.
    
    Fixes: 75b911309060 ("dmaengine: idxd: fix engine conf_dev lifetime")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:11 2025 +0800

    dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups
    
    commit aa6f4f945b10eac57aed46154ae7d6fada7fccc7 upstream.
    
    Memory allocated for groups is not freed if an error occurs during
    idxd_setup_groups(). To fix it, free the allocated memory in the reverse
    order of allocation before exiting the function in case of an error.
    
    Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:09 2025 +0800

    dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs
    
    commit 3fd2f4bc010cdfbc07dd21018dc65bd9370eb7a4 upstream.
    
    Memory allocated for wqs is not freed if an error occurs during
    idxd_setup_wqs(). To fix it, free the allocated memory in the reverse
    order of allocation before exiting the function in case of an error.
    
    Fixes: 7c5dd23e57c1 ("dmaengine: idxd: fix wq conf_dev 'struct device' lifetime")
    Fixes: 700af3a0a26c ("dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev")
    Fixes: de5819b99489 ("dmaengine: idxd: track enabled workqueues in bitmap")
    Fixes: b0325aefd398 ("dmaengine: idxd: add WQ operation cap restriction support")
    Cc: [email protected]
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: idxd: Refactor remove call with idxd_cleanup() helper [+ + +]

Author: Shuai Xue <[email protected]>
Date:   Fri Apr 4 20:02:17 2025 +0800

    dmaengine: idxd: Refactor remove call with idxd_cleanup() helper
    
    commit a409e919ca321cc0e28f8abf96fde299f0072a81 upstream.
    
    The idxd_cleanup() helper cleans up perfmon, interrupts, internals and
    so on. Refactor remove call with the idxd_cleanup() helper to avoid code
    duplication. Note that this also fixes the missing put_device() for idxd
    groups, engines and wqs.
    
    Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators")
    Cc: [email protected]
    Suggested-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: Shuai Xue <[email protected]>
    Reviewed-by: Fenghua Yu <[email protected]>
    Reviewed-by: Dave Jiang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" [+ + +]

Author: Nathan Lynch <[email protected]>
Date:   Thu Apr 3 11:24:19 2025 -0500

    dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
    
    commit df180e65305f8c1e020d54bfc2132349fd693de1 upstream.
    
    Several issues with this change:
    
    * The analysis is flawed and it's unclear what problem is being
      fixed. There is no difference between wait_event_freezable_timeout()
      and wait_event_timeout() with respect to device interrupts. And of
      course "the interrupt notifying the finish of an operation happens
      during wait_event_freezable_timeout()" -- that's how it's supposed
      to work.
    
    * The link at the "Closes:" tag appears to be an unrelated
      use-after-free in idxd.
    
    * It introduces a regression: dmatest threads are meant to be
      freezable and this change breaks that.
    
    See discussion here:
    https://lore.kernel.org/dmaengine/[email protected]/
    
    Fixes: e87ca16e9911 ("dmaengine: dmatest: Fix dmatest waiting less when interrupted")
    Signed-off-by: Nathan Lynch <[email protected]>
    Link: https://lore.kernel.org/r/20250403-dmaengine-dmatest-revert-waiting-less-v1-1-8227c5a3d7c8@amd.com
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: ti: k3-udma: Add missing locking [+ + +]

Author: Ronald Wahl <[email protected]>
Date:   Mon Apr 14 19:31:13 2025 +0200

    dmaengine: ti: k3-udma: Add missing locking
    
    commit fca280992af8c2fbd511bc43f65abb4a17363f2f upstream.
    
    Recent kernels complain about a missing lock in k3-udma.c when the lock
    validator is enabled:
    
    [    4.128073] WARNING: CPU: 0 PID: 746 at drivers/dma/ti/../virt-dma.h:169 udma_start.isra.0+0x34/0x238
    [    4.137352] CPU: 0 UID: 0 PID: 746 Comm: kworker/0:3 Not tainted 6.12.9-arm64 #28
    [    4.144867] Hardware name: pp-v12 (DT)
    [    4.148648] Workqueue: events udma_check_tx_completion
    [    4.153841] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    4.160834] pc : udma_start.isra.0+0x34/0x238
    [    4.165227] lr : udma_start.isra.0+0x30/0x238
    [    4.169618] sp : ffffffc083cabcf0
    [    4.172963] x29: ffffffc083cabcf0 x28: 0000000000000000 x27: ffffff800001b005
    [    4.180167] x26: ffffffc0812f0000 x25: 0000000000000000 x24: 0000000000000000
    [    4.187370] x23: 0000000000000001 x22: 00000000e21eabe9 x21: ffffff8000fa0670
    [    4.194571] x20: ffffff8001b6bf00 x19: ffffff8000fa0430 x18: ffffffc083b95030
    [    4.201773] x17: 0000000000000000 x16: 00000000f0000000 x15: 0000000000000048
    [    4.208976] x14: 0000000000000048 x13: 0000000000000000 x12: 0000000000000001
    [    4.216179] x11: ffffffc08151a240 x10: 0000000000003ea1 x9 : ffffffc08046ab68
    [    4.223381] x8 : ffffffc083cabac0 x7 : ffffffc081df3718 x6 : 0000000000029fc8
    [    4.230583] x5 : ffffffc0817ee6d8 x4 : 0000000000000bc0 x3 : 0000000000000000
    [    4.237784] x2 : 0000000000000000 x1 : 00000000001fffff x0 : 0000000000000000
    [    4.244986] Call trace:
    [    4.247463]  udma_start.isra.0+0x34/0x238
    [    4.251509]  udma_check_tx_completion+0xd0/0xdc
    [    4.256076]  process_one_work+0x244/0x3fc
    [    4.260129]  process_scheduled_works+0x6c/0x74
    [    4.264610]  worker_thread+0x150/0x1dc
    [    4.268398]  kthread+0xd8/0xe8
    [    4.271492]  ret_from_fork+0x10/0x20
    [    4.275107] irq event stamp: 220
    [    4.278363] hardirqs last  enabled at (219): [<ffffffc080a27c7c>] _raw_spin_unlock_irq+0x38/0x50
    [    4.287183] hardirqs last disabled at (220): [<ffffffc080a1c154>] el1_dbg+0x24/0x50
    [    4.294879] softirqs last  enabled at (182): [<ffffffc080037e68>] handle_softirqs+0x1c0/0x3cc
    [    4.303437] softirqs last disabled at (177): [<ffffffc080010170>] __do_softirq+0x1c/0x28
    [    4.311559] ---[ end trace 0000000000000000 ]---
    
    This commit adds the missing locking.
    
    Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA")
    Cc: Peter Ujfalusi <[email protected]>
    Cc: Vignesh Raghavendra <[email protected]>
    Cc: Vinod Koul <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Signed-off-by: Ronald Wahl <[email protected]>
    Acked-by: Peter Ujfalusi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy [+ + +]

Author: Yemike Abhilash Chandra <[email protected]>
Date:   Thu Apr 17 13:25:21 2025 +0530

    dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
    
    commit 8ca9590c39b69b55a8de63d2b21b0d44f523b43a upstream.
    
    Currently, a local dma_cap_mask_t variable is used to store device
    cap_mask within udma_of_xlate(). However, the DMA_PRIVATE flag in
    the device cap_mask can get cleared when the last channel is released.
    This can happen right after storing the cap_mask locally in
    udma_of_xlate(), and subsequent dma_request_channel() can fail due to
    mismatch in the cap_mask. Fix this by removing the local dma_cap_mask_t
    variable and directly using the one from the dma_device structure.
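    
    A small Python model of the race, assuming a request succeeds only when
    the device currently has every capability bit in the requested mask.
    The bit positions, DmaDevice and request_channel() are illustrative
    stand-ins, not the dmaengine API.

```python
DMA_SLAVE = 1 << 0    # illustrative bit positions, not the kernel enum
DMA_PRIVATE = 1 << 1


class DmaDevice:
    def __init__(self):
        self.cap_mask = DMA_SLAVE | DMA_PRIVATE


def request_channel(dev, wanted_mask):
    """Succeed only if the device currently has every wanted capability."""
    return (dev.cap_mask & wanted_mask) == wanted_mask


dev = DmaDevice()
local_copy = dev.cap_mask      # snapshot taken in the translate step
dev.cap_mask &= ~DMA_PRIVATE   # last channel released: flag is cleared

with_copy = request_channel(dev, local_copy)    # stale mask
with_live = request_channel(dev, dev.cap_mask)  # mask read from the device
```

    The stale copy still asks for DMA_PRIVATE, which the device no longer
    advertises, so only the request using the live mask succeeds.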
    
    Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA")
    Cc: [email protected]
    Signed-off-by: Vaishnav Achath <[email protected]>
    Acked-by: Peter Ujfalusi <[email protected]>
    Reviewed-by: Udit Kumar <[email protected]>
    Signed-off-by: Yemike Abhilash Chandra <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Wed Apr 23 08:18:44 2025 -0500

    drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies
    
    [ Upstream commit 690d722e02819ef978f90cd7553973eba1007e6c ]
    
    If a policy is passed into amd_pmf_get_pb_data() that causes the engine
    to fail to start there is a memory leak. Free the memory in this failure
    path.
    
    Fixes: 10817f28e5337 ("platform/x86/amd/pmf: Add capability to sideload of policy binary")
    Signed-off-by: Mario Limonciello <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Wed Apr 23 08:18:45 2025 -0500

    drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies
    
    [ Upstream commit 8e81b9cd6e95188d12c9cc25d40b61dd5ea05ace ]
    
    Commit 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for
    Compatibility with new PMF-TA") added support for platforms that support
    an updated TA. However, it also exposed a number of platforms that,
    although they support the updated TA, don't actually populate a policy
    binary.
    
    Add an explicit check that the policy binary isn't empty before
    initializing the TA.
    
    Reported-by: Christian Heusel <[email protected]>
    Closes: https://lore.kernel.org/platform-driver-x86/[email protected]/
    Fixes: 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA")
    Signed-off-by: Mario Limonciello <[email protected]>
    Tested-by: Christian Heusel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Mon May 12 17:06:00 2025 -0700

    Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
    
    commit 380b75d3078626aadd0817de61f3143f5db6e393 upstream.
    
    vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
    and is hardcoded to create a single GPA range. To allow the netvsc
    driver to also use it to create multiple GPA ranges, stop hardcoding a
    single GPA range and let the calling driver specify the rangecount in
    the supplied descriptor.
    
    Update the storvsc driver to reflect this new approach.
    
    Cc: <[email protected]> # 6.1.x
    Signed-off-by: Michael Kelley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Mon May 12 17:06:04 2025 -0700

    Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
    
    commit 45a442fe369e6c4e0b4aa9f63b31c3f2f9e2090e upstream.
    
    With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
    instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
    callers. Remove it.
    
    Cc: <[email protected]> # 6.1.x
    Signed-off-by: Michael Kelley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Avoid flooding unnecessary info messages [+ + +]

Author: Wayne Lin <[email protected]>
Date:   Tue May 13 11:20:24 2025 +0800

    drm/amd/display: Avoid flooding unnecessary info messages
    
    commit d33724ffb743d3d2698bd969e29253ae0cff9739 upstream.
    
    It's expected that we'll encounter temporary exceptions
    during aux transactions. Adjust logging from drm_info to
    drm_dbg_dp to prevent flooding with unnecessary log messages.
    
    Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 9a9c3e1fe5256da14a0a307dff0478f90c55fc8c)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Correct the reply value when AUX write incomplete [+ + +]

Author: Wayne Lin <[email protected]>
Date:   Fri Apr 25 14:44:02 2025 +0800

    drm/amd/display: Correct the reply value when AUX write incomplete
    
    commit d433981385c62c72080e26f1c00a961d18b233be upstream.
    
    [Why]
    Forcing aux->transfer to return 0 on an incomplete AUX write is
    inappropriate. It should return the number of bytes transferred.
    
    [How]
    aux->transfer must not change the original msg, except for the reply
    field of the drm_dp_aux_msg structure. Copy msg->buffer when handling a
    write request, and overwrite the first byte when the sink replies with
    1 byte indicating the number of bytes partially written. Then we can
    return the correct value without changing the original msg.
    
    Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Ray Wu <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Signed-off-by: Ray Wu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 7ac37f0dcd2e0b729fa7b5513908dc8ab802b540)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp [+ + +]

Author: Melissa Wen <[email protected]>
Date:   Wed Apr 30 11:11:47 2025 -0300

    drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp
    
    [ Upstream commit a3b7e65b6be59e686e163fa1ceb0922f996897c2 ]
    
    This is similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check
    for pipe_ctx->plane_state in dcn20_program_pipe"), which addresses a null
    pointer dereference on dcn20_update_dchubp_dpp. The same function is
    hooked for update_dchubp_dpp in dcn401, with the same issue. Fix the
    possible null pointer dereference on dcn401_program_pipe too.
    
    Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup")
    Signed-off-by: Melissa Wen <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit d8d47f739752227957d8efc0cb894761bfe1d879)
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: csa unmap use uninterruptible lock [+ + +]

Author: Philip Yang <[email protected]>
Date:   Wed May 7 11:04:32 2025 -0400

    drm/amdgpu: csa unmap use uninterruptible lock
    
    commit a0fa7873f2f869087b1e7793f7fac3713a1e3afe upstream.
    
    When a process exits, the csa is unmapped and the GPU vm is freed. If a
    signal is accepted at that point, the wait to take the vm lock is
    interrupted and returns early, which leaks memory and triggers the
    warning backtrace below.
    
    Switch to an uninterruptible lock wait to fix the issue.
    
    WARNING: CPU: 69 PID: 167800 at amd/amdgpu/amdgpu_kms.c:1525
     amdgpu_driver_postclose_kms+0x294/0x2a0 [amdgpu]
     Call Trace:
      <TASK>
      drm_file_free.part.0+0x1da/0x230 [drm]
      drm_close_helper.isra.0+0x65/0x70 [drm]
      drm_release+0x6a/0x120 [drm]
      amdgpu_drm_release+0x51/0x60 [amdgpu]
      __fput+0x9f/0x280
      ____fput+0xe/0x20
      task_work_run+0x67/0xa0
      do_exit+0x217/0x3c0
      do_group_exit+0x3b/0xb0
      get_signal+0x14a/0x8d0
      arch_do_signal_or_restart+0xde/0x100
      exit_to_user_mode_loop+0xc1/0x1a0
      exit_to_user_mode_prepare+0xf4/0x100
      syscall_exit_to_user_mode+0x17/0x40
      do_syscall_64+0x69/0xc0
    
    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 7dbbfb3c171a6f63b01165958629c9c26abf38ab)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amdgpu: fix incorrect MALL size for GFX1151 [+ + +]

Author: Tim Huang <[email protected]>
Date:   Thu May 8 13:37:35 2025 +0800

    drm/amdgpu: fix incorrect MALL size for GFX1151
    
    commit 2d73b0845ab3963856e857b810600e5594bc29f4 upstream.
    
    On GFX1151, the reported MALL cache size reflects only
    half of its actual size; this adjustment corrects the discrepancy.
    
    Signed-off-by: Tim Huang <[email protected]>
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Yifan Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 0a5c060b593ad152318f89e5564bfdfcff8a6ac0)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/meson: Use 1000ULL when operating with mode->clock [+ + +]

Author: I Hsin Cheng <[email protected]>
Date:   Tue May 6 02:43:38 2025 +0800

    drm/meson: Use 1000ULL when operating with mode->clock
    
    [ Upstream commit eb0851e14432f3b87c77b704c835ac376deda03a ]
    
    Coverity scan reported that using "mode->clock * 1000" may lead to
    integer overflow. Use "1000ULL" instead of "1000" in the multiplication
    to avoid the potential overflow.
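    
    As an illustration only, a minimal user-space sketch of the arithmetic,
    assuming mode->clock is a 32-bit int holding kHz. Both helper names are
    hypothetical and do not exist in the driver. The second helper models the
    truncation of a 32-bit product with unsigned arithmetic, because signed
    overflow itself is undefined behavior in C:
    
    ```c
    #include <stdint.h>

    /* Hypothetical helper: 1000ULL widens the multiply to 64 bits. */
    static unsigned long long clock_khz_to_hz(int clock_khz)
    {
        return clock_khz * 1000ULL;
    }

    /* Model of what survives when the product is kept in only 32 bits. */
    static uint32_t clock_khz_to_hz_32bit(int clock_khz)
    {
        return (uint32_t)((uint64_t)clock_khz * 1000u);
    }
    ```
    
    With a hypothetical 5,000,000 kHz clock the 64-bit product is
    5,000,000,000 Hz, while the 32-bit model keeps only 705,032,704.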
    
    Link: https://scan5.scan.coverity.com/#/project-view/10074/10063?selectedIssue=1646759
    Signed-off-by: I Hsin Cheng <[email protected]>
    Reviewed-by: Martin Blumenstingl <[email protected]>
    Fixes: 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types")
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Neil Armstrong <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/tiny: panel-mipi-dbi: Use drm_client_setup_with_fourcc() [+ + +]

Author: Fabio Estevam <[email protected]>
Date:   Thu Apr 17 07:34:58 2025 -0300

    drm/tiny: panel-mipi-dbi: Use drm_client_setup_with_fourcc()
    
    commit 9c1798259b9420f38f1fa1b83e3d864c3eb1a83e upstream.
    
    Since commit 559358282e5b ("drm/fb-helper: Don't use the preferred depth
    for the BPP default"), RGB565 displays such as the CFAF240320X no longer
    render correctly: colors are distorted and the content is shown twice
    horizontally.
    
    This regression is due to the fbdev emulation layer defaulting to 32 bits
    per pixel, whereas the display expects 16 bpp (RGB565). As a result, the
    framebuffer data is incorrectly interpreted by the panel.
    
    Fix the issue by calling drm_client_setup_with_fourcc() with a format
    explicitly selected based on the display's bits-per-pixel value. For 16
    bpp, use DRM_FORMAT_RGB565; for other values, fall back to the previous
    behavior. This ensures that the allocated framebuffer format matches the
    hardware expectations, avoiding color and layout corruption.
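    
    The selection described above can be sketched in user space as follows.
    This is a model under stated assumptions, not the driver code:
    model_fourcc() mirrors the usual four-character-code packing, and
    MODEL_FORMAT_DEFAULT (0) is a hypothetical marker standing for "keep
    the previous behavior".
    
    ```c
    #include <stdint.h>

    /* Pack four ASCII characters little-endian, as DRM fourcc codes do. */
    static uint32_t model_fourcc(unsigned char a, unsigned char b,
                                 unsigned char c, unsigned char d)
    {
        return (uint32_t)a | ((uint32_t)b << 8) |
               ((uint32_t)c << 16) | ((uint32_t)d << 24);
    }

    /* Hypothetical marker: let the client setup pick its default format. */
    #define MODEL_FORMAT_DEFAULT 0u

    /* Choose the fbdev emulation format from the panel's bits per pixel. */
    static uint32_t pick_client_fourcc(unsigned int bpp)
    {
        if (bpp == 16)
            return model_fourcc('R', 'G', '1', '6'); /* RGB565 */
        return MODEL_FORMAT_DEFAULT;
    }
    ```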
    
    Tested on a CFAF240320X display with an RGB565 configuration, confirming
    correct colors and layout after applying this patch.
    
    Cc: [email protected]
    Fixes: 559358282e5b ("drm/fb-helper: Don't use the preferred depth for the BPP default")
    Signed-off-by: Fabio Estevam <[email protected]>
    Reviewed-by: Thomas Zimmermann <[email protected]>
    Reviewed-by: Javier Martinez Canillas <[email protected]>
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe/gsc: do not flush the GSC worker from the reset path [+ + +]

Author: Daniele Ceraolo Spurio <[email protected]>
Date:   Fri May 2 08:51:04 2025 -0700

    drm/xe/gsc: do not flush the GSC worker from the reset path
    
    commit 03552d8ac0afcc080c339faa0b726e2c0e9361cb upstream.
    
    The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
    while the GSC one isn't (and can't be as we need to do memory
    allocations in the gsc worker). Therefore, we can't flush the latter
    from the former.
    
    The reason why we had such a flush was to avoid interrupting either
    the GSC FW load or in progress GSC proxy operations. GSC proxy
    operations fall into 2 categories:
    
    1) GSC proxy init: this only happens once immediately after GSC FW load
       and does not support being interrupted. The only way to recover from
       an interruption of the proxy init is to do an FLR and re-load the GSC.
    
    2) GSC proxy request: this can happen in response to a request that
       the driver sends to the GSC. If this is interrupted, the GSC FW will
       time out and the driver request will fail, but overall the GSC
       will keep working fine.
    
    Flushing the work allowed us to avoid interruption in both cases (unless
    the hang came from the GSC engine itself, in which case we're toast
    anyway). However, a failure on a proxy request is tolerable if we're in
    a scenario where we're triggering a GT reset (i.e., something has already
    gone pretty wrong), so what we really need to avoid is interrupting
    the init flow, which we can do by polling on the register that reports
    when the proxy init is complete (as that assures us that all the load and
    init operations have been completed).
    
    Note that during suspend we still want to do a flush of the worker to
    make sure it completes any operations involving the HW before the power
    is cut.
    
    v2: fix spelling in commit msg, rename waiter function (Julia)
    
    Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
    Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
    Cc: John Harrison <[email protected]>
    Cc: Alan Previn <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Julia Filipchuk <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    (cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value [+ + +]

Author: Umesh Nerlige Ramappa <[email protected]>
Date:   Fri May 9 09:12:01 2025 -0700

    drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value
    
    [ Upstream commit 66c8f7b435bddb7d8577ac8a57e175a6cb147227 ]
    
    For determining actual job execution time, save the current value of the
    CTX_TIMESTAMP register rather than the value saved in LRC since the
    current register value is the closest to the start time of the job.
    
    v2: Define MI_STORE_REGISTER_MEM to fix compile error
    v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas)
    
    Fixes: 65921374c48f ("drm/xe: Emit ctx timestamp copy in ring ops")
    Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Reviewed-by: Lucas De Marchi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    (cherry picked from commit 38b14233e5deff51db8faec287b4acd227152246)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fs/eventpoll: fix endless busy loop after timeout has expired [+ + +]

Author: Max Kellermann <[email protected]>
Date:   Tue Apr 29 20:58:27 2025 +0200

    fs/eventpoll: fix endless busy loop after timeout has expired
    
    commit d9ec73301099ec5975505e1c3effbe768bab9490 upstream.
    
    After commit 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in
    the future"), the following program would immediately enter a busy
    loop in the kernel:
    
    ```
    #include <sys/epoll.h>
    #include <time.h>
    
    int main() {
      int e = epoll_create1(0);
      struct epoll_event event = {.events = EPOLLIN};
      epoll_ctl(e, EPOLL_CTL_ADD, 0, &event);
      const struct timespec timeout = {.tv_nsec = 1};
      epoll_pwait2(e, &event, 1, &timeout, 0);
    }
    ```
    
    This happens because the given (non-zero) timeout of 1 nanosecond
    usually expires before ep_poll() is entered and then
    ep_schedule_timeout() returns false, but `timed_out` is never set
    because the code line that sets it is skipped.  This quickly turns
    into a soft lockup, RCU stalls and deadlocks, inflicting severe
    headaches to the whole system.
    
    When the timeout has expired, we don't need to schedule a hrtimer, but
    we should set the `timed_out` variable.  Therefore, I suggest moving
    the ep_schedule_timeout() check into the `timed_out` expression
    instead of skipping it.
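    
    The control-flow change can be modeled in user space as below. This is a
    sketch of the description above, not the kernel code: the model_*
    helpers are hypothetical stand-ins for ep_schedule_timeout() and
    schedule_hrtimeout_range(), and time is reduced to plain integers.
    
    ```c
    #include <stdbool.h>

    /* Stand-in for ep_schedule_timeout(): true if the deadline is ahead. */
    static bool model_deadline_in_future(long now_ns, long deadline_ns)
    {
        return deadline_ns > now_ns;
    }

    /* Stand-in for the hrtimer sleep: 0 means the timer expired. */
    static int model_sleep(bool woken_early)
    {
        return woken_early ? -1 : 0;
    }

    /* Before: an already expired deadline skips the assignment entirely. */
    static bool timed_out_before_fix(bool eavail, long now_ns,
                                     long deadline_ns, bool woken_early)
    {
        bool timed_out = false;

        if (!eavail && model_deadline_in_future(now_ns, deadline_ns))
            timed_out = !model_sleep(woken_early);
        return timed_out;
    }

    /* After: an expired deadline sets timed_out without sleeping. */
    static bool timed_out_after_fix(bool eavail, long now_ns,
                                    long deadline_ns, bool woken_early)
    {
        bool timed_out = false;

        if (!eavail)
            timed_out = !model_deadline_in_future(now_ns, deadline_ns) ||
                        !model_sleep(woken_early);
        return timed_out;
    }
    ```
    
    In the model, a deadline that has already passed leaves timed_out false
    before the fix, so the caller would loop again, and true after it.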
    
    brauner: Note that there was an earlier fix by Joe Damato in response to
    my bug report in [1].
    
    Fixes: 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in the future")
    Cc: Joe Damato <[email protected]>
    Cc: [email protected]
    Signed-off-by: Max Kellermann <[email protected]>
    Link: https://lore.kernel.org/[email protected] [1]
    Link: https://lore.kernel.org/[email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

fs/xattr.c: fix simple_xattr_list to always include security.* xattrs [+ + +]

Author: Stephen Smalley <[email protected]>
Date:   Thu Apr 24 11:28:20 2025 -0400

    fs/xattr.c: fix simple_xattr_list to always include security.* xattrs
    
    [ Upstream commit 8b0ba61df5a1c44e2b3cf683831a4fc5e24ea99d ]
    
    The vfs has long had a fallback to obtain the security.* xattrs from the
    LSM when the filesystem does not implement its own listxattr, but
    shmem/tmpfs and kernfs later gained their own xattr handlers to support
    other xattrs. Unfortunately, as a side effect, tmpfs and kernfs-based
    filesystems like sysfs no longer return the synthetic security.* xattr
    names via listxattr unless they are explicitly set by userspace or
    initially set upon inode creation after policy load. coreutils has
    recently switched from unconditionally invoking getxattr for security.*
    for ls -Z via libselinux to only doing so if listxattr returns the xattr
    name, breaking ls -Z of such inodes.
    
    Before:
    $ getfattr -m.* /run/initramfs
    <no output>
    $ getfattr -m.* /sys/kernel/fscaps
    <no output>
    $ setfattr -n user.foo /run/initramfs
    $ getfattr -m.* /run/initramfs
    user.foo
    
    After:
    $ getfattr -m.* /run/initramfs
    security.selinux
    $ getfattr -m.* /sys/kernel/fscaps
    security.selinux
    $ setfattr -n user.foo /run/initramfs
    $ getfattr -m.* /run/initramfs
    security.selinux
    user.foo
    
    Link: https://lore.kernel.org/selinux/CAFqZXNtF8wDyQajPCdGn=iOawX4y77ph0EcfcqcUUj+T87FKyA@mail.gmail.com/
    Link: https://lore.kernel.org/selinux/[email protected]/
    Signed-off-by: Stephen Smalley <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: b09e0fa4b4ea66266058ee ("tmpfs: implement generic xattr support")
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ftrace: Fix preemption accounting for stacktrace filter command [+ + +]

Author: pengdonglin <[email protected]>
Date:   Mon May 12 17:42:46 2025 +0800

    ftrace: Fix preemption accounting for stacktrace filter command
    
    commit 11aff32439df6ca5b3b891b43032faf88f4a6a29 upstream.
    
    The preemption count of the stacktrace filter command to trace ksys_read
    is consistently incorrect:
    
    $ echo ksys_read:stacktrace > set_ftrace_filter
    
       <...>-453     [004] ...1.    38.308956: <stack trace>
    => ksys_read
    => do_syscall_64
    => entry_SYSCALL_64_after_hwframe
    
    The root cause is that the trace framework disables preemption when
    invoking the filter command callback in function_trace_probe_call:
    
       preempt_disable_notrace();
       probe_ops->func(ip, parent_ip, probe->tr, probe_ops, probe->data);
       preempt_enable_notrace();
    
    Use tracing_gen_ctx_dec() to account for the preempt_disable_notrace(),
    which will output the correct preemption count:
    
    $ echo ksys_read:stacktrace > set_ftrace_filter
    
       <...>-410     [006] .....    31.420396: <stack trace>
    => ksys_read
    => do_syscall_64
    => entry_SYSCALL_64_after_hwframe
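    
    A simplified user-space model of the accounting, for illustration only.
    It assumes the recorded context keeps the preemption depth in its low
    four bits and that the callback always runs with the count raised by
    one. The model_* names are hypothetical and are not the kernel helpers:
    
    ```c
    /* Model: record the preemption depth, capped at 15 as a 4-bit field. */
    static unsigned int model_gen_ctx(unsigned int preempt_count)
    {
        unsigned int depth = preempt_count & 0xff;

        return depth > 0xf ? 0xf : depth;
    }

    /*
     * Model of the "_dec" variant: subtract the one level added by the
     * framework's own preempt_disable_notrace(). Only valid while that
     * extra level is held, i.e. preempt_count >= 1.
     */
    static unsigned int model_gen_ctx_dec(unsigned int preempt_count)
    {
        return model_gen_ctx(preempt_count) - 1;
    }
    ```
    
    For a task that itself runs with a count of 0, the callback sees 1:
    the plain variant records 1 ("...1."), the decremented one records 0
    (".....").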
    
    Cc: [email protected]
    Fixes: 36590c50b2d07 ("tracing: Merge irqflags + preempt counter.")
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: pengdonglin <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ftrace: Fix preemption accounting for stacktrace trigger command [+ + +]

Author: pengdonglin <[email protected]>
Date:   Mon May 12 17:42:45 2025 +0800

    ftrace: Fix preemption accounting for stacktrace trigger command
    
    commit e333332657f615ac2b55aa35565c4a882018bbe9 upstream.
    
    When using the stacktrace trigger command to trace syscalls, the
    preemption count was consistently reported as 1 when the system call
    event itself had 0 (".").
    
    For example:
    
    root@ubuntu22-vm:/sys/kernel/tracing/events/syscalls/sys_enter_read
    $ echo stacktrace > trigger
    $ echo 1 > enable
    
        sshd-416     [002] .....   232.864910: sys_read(fd: a, buf: 556b1f3221d0, count: 8000)
        sshd-416     [002] ...1.   232.864913: <stack trace>
     => ftrace_syscall_enter
     => syscall_trace_enter
     => do_syscall_64
     => entry_SYSCALL_64_after_hwframe
    
    The root cause is that the trace framework disables preemption in __DO_TRACE before
    invoking the trigger callback.
    
    Use tracing_gen_ctx_dec(), which accounts for the increase of the
    preemption count in __DO_TRACE when calling the callback. The result
    is accurate reporting:
    
        sshd-410     [004] .....   210.117660: sys_read(fd: 4, buf: 559b725ba130, count: 40000)
        sshd-410     [004] .....   210.117662: <stack trace>
     => ftrace_syscall_enter
     => syscall_trace_enter
     => do_syscall_64
     => entry_SYSCALL_64_after_hwframe
    
    Cc: [email protected]
    Fixes: ce33c845b030c ("tracing: Dump stacktrace trigger to the corresponding instance")
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: pengdonglin <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gpio: pca953x: fix IRQ storm on system wake up [+ + +]

Author: Emanuele Ghidoli <[email protected]>
Date:   Mon May 12 11:54:41 2025 +0200

    gpio: pca953x: fix IRQ storm on system wake up
    
    commit 3e38f946062b4845961ab86b726651b4457b2af8 upstream.
    
    If an input changes state during wake-up and is used as an interrupt
    source, the IRQ handler reads the volatile input register to clear the
    interrupt mask and deassert the IRQ line. However, the IRQ handler is
    triggered before access to the register is granted, causing the read
    operation to fail.
    
    As a result, the IRQ handler enters a loop, repeatedly printing the
    "failed reading register" message, until `pca953x_resume()` is eventually
    called, which restores the driver context and enables access to
    registers.
    
    Fix by disabling the IRQ line before entering suspend mode, and
    re-enabling it after the driver context is restored in `pca953x_resume()`.
    
    An IRQ can be disabled with disable_irq() and still wake the system as
    long as the IRQ has wake enabled, so the wake-up functionality is
    preserved.
    
    Fixes: b76574300504 ("gpio: pca953x: Restore registers after suspend/resume cycle")
    Cc: [email protected]
    Signed-off-by: Emanuele Ghidoli <[email protected]>
    Signed-off-by: Francesco Dolcini <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Tested-by: Geert Uytterhoeven <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: amd_sfh: Fix SRA sensor when it's the only sensor [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Mon Apr 21 16:32:09 2025 -0500

    HID: amd_sfh: Fix SRA sensor when it's the only sensor
    
    commit 0cc2effbc8f522af6b9d871cd27678e6aed9d56c upstream.
    
    On systems that only have an SRA sensor connected to SFH the sensor
    doesn't get enabled due to a bad optimization condition of breaking
    the sensor walk loop.
    
    This optimization is unnecessary in the first place because if there
    is only one device then the loop only runs once. Drop the condition
    and explicitly mark sensor as enabled.
    
    Reported-by: Yijun Shen <[email protected]>
    Tested-by: Yijun Shen <[email protected]>
    Fixes: d1c444b47100d ("HID: amd_sfh: Add support to export device operating states")
    Cc: [email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Acked-by: Basavaraj Natikar <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: bpf: abort dispatch if device destroyed [+ + +]

Author: Rong Zhang <[email protected]>
Date:   Mon May 12 23:24:19 2025 +0800

    HID: bpf: abort dispatch if device destroyed
    
    commit 578e1b96fad7402ff7e9c7648c8f1ad0225147c8 upstream.
    
    The current HID bpf implementation assumes no output report/request will
    go through it after hid_bpf_destroy_device() has been called. This leads
    to a bug that unplugging certain types of HID devices causes a cleaned-
    up SRCU to be accessed. The bug was previously a hidden failure until a
    recent x86 percpu change [1] made it access not-present pages.
    
    The bug will be triggered if the conditions below are met:
    
    A) a device under the driver has some LEDs on
    B) hid_ll_driver->request() is unimplemented (e.g., logitech-djreceiver)
    
    If condition A is met, hidinput_led_worker() is always scheduled *after*
    hid_bpf_destroy_device().
    
    hid_destroy_device
    ` hid_bpf_destroy_device
      ` cleanup_srcu_struct(&hdev->bpf.srcu)
    ` hid_remove_device
      ` ...
        ` led_classdev_unregister
          ` led_trigger_set(led_cdev, NULL)
            ` led_set_brightness(led_cdev, LED_OFF)
              ` ...
                ` input_inject_event
                  ` input_event_dispose
                    ` hidinput_input_event
                      ` schedule_work(&hid->led_work) [hidinput_led_worker]
    
    This is fine when condition B is not met, where hidinput_led_worker()
    calls hid_ll_driver->request(). This is the case for most HID drivers,
    which implement it or use the generic one from usbhid. The driver itself
    or an underlying driver will then abort processing the request.
    
    Otherwise, hidinput_led_worker() tries hid_hw_output_report() and leads
    to the bug.
    
    hidinput_led_worker
    ` hid_hw_output_report
      ` dispatch_hid_bpf_output_report
        ` srcu_read_lock(&hdev->bpf.srcu)
        ` srcu_read_unlock(&hdev->bpf.srcu, idx)
    
    The bug has existed since the introduction [2] of
    dispatch_hid_bpf_output_report(). However, the same bug also exists in
    dispatch_hid_bpf_raw_requests(), and I've reproduced (no visible effect
    because of the lack of [1], but confirmed bpf.destroyed == 1) the bug
    against the commit (i.e., the Fixes:) introducing the function. This is
    because hidinput_led_worker() falls back to hid_hw_raw_request() when
    hid_ll_driver->output_report() is unimplemented (e.g.,
    logitech-djreceiver).
    
    hidinput_led_worker
    ` hid_hw_output_report: -ENOSYS
    ` hid_hw_raw_request
      ` dispatch_hid_bpf_raw_requests
        ` srcu_read_lock(&hdev->bpf.srcu)
        ` srcu_read_unlock(&hdev->bpf.srcu, idx)
    
    Fix the issue by returning early in the two mentioned functions if
    hid_bpf has been marked as destroyed. Though
    dispatch_hid_bpf_device_event() handles input events, and there is no
    evidence that it may be called after the destruction, the same check, as
    a safety net, is also added to it to maintain the consistency among all
    dispatch functions.
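    
    The early return can be pictured with the following user-space model.
    It is a sketch only: struct model_hid_bpf, its srcu_accesses counter
    and the -ENODEV return value are assumptions made for illustration and
    are not taken from the patch.
    
    ```c
    #include <errno.h>
    #include <stdbool.h>

    /* Hypothetical stand-in for the per-device HID-BPF state. */
    struct model_hid_bpf {
        bool destroyed;     /* set once the device has been torn down */
        int srcu_accesses;  /* counts how often the SRCU would be used */
    };

    /* Model of a dispatch function with the safety check in front. */
    static int model_dispatch_output_report(struct model_hid_bpf *bpf)
    {
        if (bpf->destroyed)
            return -ENODEV;     /* never touch the cleaned-up SRCU */

        bpf->srcu_accesses++;   /* stands for srcu_read_lock()/unlock() */
        return 0;
    }
    ```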
    
    The impact of the bug on other architectures is unclear. Even if it acts
    as a hidden failure, this is still dangerous because it corrupts
    whatever is on the address calculated by SRCU. Thus, CC'ing the stable
    list.
    
    [1]: commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
    [2]: commit 9286675a2aed ("HID: bpf: add HID-BPF hooks for
    hid_hw_output_report")
    
    Closes: https://lore.kernel.org/all/20250506145548.GGaBoi9Jzp3aeJizTR@fat_crate.local/
    Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests")
    Cc: [email protected]
    Signed-off-by: Rong Zhang <[email protected]>
    Tested-by: Petr Tesarik <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: thrustmaster: fix memory leak in thrustmaster_interrupts() [+ + +]

Author: Qasim Ijaz <[email protected]>
Date:   Thu Mar 27 23:11:46 2025 +0000

    HID: thrustmaster: fix memory leak in thrustmaster_interrupts()
    
    [ Upstream commit 09d546303b370113323bfff456c4e8cff8756005 ]
    
    In thrustmaster_interrupts(), the allocated send_buf is not
    freed if the usb_check_int_endpoints() check fails, leading
    to a memory leak.
    
    Fix this by ensuring send_buf is freed before returning in
    the error path.
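    
    The shape of the fix, modeled in user space with malloc()/free(). The
    model_* names, the buffer size and the int return value are
    hypothetical; the counter only exists so the leak can be observed.
    
    ```c
    #include <stdbool.h>
    #include <stdlib.h>

    static int model_live_allocs;   /* buffers allocated and not yet freed */

    static void *model_alloc(size_t n)
    {
        void *p = malloc(n);

        if (p)
            model_live_allocs++;
        return p;
    }

    static void model_free(void *p)
    {
        if (p)
            model_live_allocs--;
        free(p);
    }

    /* Returns 0 on success, -1 when bailing out early. */
    static int model_interrupts(bool endpoints_ok)
    {
        void *send_buf = model_alloc(64);

        if (!send_buf)
            return -1;

        if (!endpoints_ok) {
            model_free(send_buf);   /* the release the error path lacked */
            return -1;
        }

        /* ... the transfer using send_buf would happen here ... */
        model_free(send_buf);
        return 0;
    }
    ```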
    
    Fixes: 50420d7c79c3 ("HID: hid-thrustmaster: Fix warning in thrustmaster_probe by adding endpoint check")
    Signed-off-by: Qasim Ijaz <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: uclogic: Add NULL check in uclogic_input_configured() [+ + +]

Author: Henry Martin <[email protected]>
Date:   Tue Apr 1 17:48:53 2025 +0800

    HID: uclogic: Add NULL check in uclogic_input_configured()
    
    [ Upstream commit bd07f751208ba190f9b0db5e5b7f35d5bb4a8a1e ]
    
    devm_kasprintf() returns NULL when memory allocation fails. Currently,
    uclogic_input_configured() does not check for this case, which results
    in a NULL pointer dereference.
    
    Add NULL check after devm_kasprintf() to prevent this issue.
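    
    A user-space sketch of the pattern, assuming the caller reports the
    failure as -ENOMEM. model_kasprintf() is a hypothetical stand-in for
    devm_kasprintf() with a switch to simulate allocation failure:
    
    ```c
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in allocator: returns NULL when the allocation fails. */
    static char *model_kasprintf(int simulate_oom, const char *base,
                                 const char *suffix)
    {
        size_t len;
        char *s;

        if (simulate_oom)
            return NULL;
        len = (size_t)snprintf(NULL, 0, "%s %s", base, suffix) + 1;
        s = malloc(len);
        if (s)
            snprintf(s, len, "%s %s", base, suffix);
        return s;
    }

    /* Check the result before storing or dereferencing it. */
    static int model_input_configured(int simulate_oom, char **name_out)
    {
        char *name = model_kasprintf(simulate_oom, "Tablet", "Pen");

        if (!name)
            return -ENOMEM;
        *name_out = name;
        return 0;
    }
    ```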
    
    Fixes: dd613a4e45f8 ("HID: uclogic: Correct devm device reference for hidinput input_dev name")
    Signed-off-by: Henry Martin <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hv_netvsc: Preserve contiguous PFN grouping in the page buffer array [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Mon May 12 17:06:02 2025 -0700

    hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
    
    commit 41a6328b2c55276f89ea3812069fd7521e348bbf upstream.
    
    Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in
    SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
    driver for Hyper-V synthetic networking (netvsc) occasionally reports
    "nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
    that Hyper-V has rejected a network packet transmit request from the
    guest, and the outgoing network packet is dropped. Higher level
    network protocols presumably recover and resend the packet so there is
    no functional error, but performance is slightly impacted. Commit
    dca5161f9bd0 is not the cause of the error -- it only added reporting
    of an error that was already happening without any notice. The error
    has presumably been present since the netvsc driver was originally
    introduced into Linux.
    
    The root cause of the problem is that the netvsc driver in Linux may
    send an incorrectly formatted VMBus message to Hyper-V when
    transmitting the network packet. The incorrect formatting occurs when
    the rndis header of the VMBus message crosses a page boundary due to
    how the Linux skb head memory is aligned. In such a case, two PFNs are
    required to describe the location of the rndis header, even though
    they are contiguous in guest physical address (GPA) space. Hyper-V
    requires that two rndis header PFNs be in a single "GPA range" data
    structure, but current netvsc code puts each PFN in its own GPA range,
    which Hyper-V rejects as an error.
    
    The incorrect formatting occurs only for larger packets that netvsc
    must transmit via a VMBus "GPA Direct" message. There's no problem
    when netvsc transmits a smaller packet by copying it into a pre-
    allocated send buffer slot because the pre-allocated slots don't have
    page crossing issues.
    
    After commit 14ad6ed30a10 ("net: allow small head cache usage with
    large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
    much more frequently in VMs with 16 or more vCPUs. It may occur every
    few seconds, or even more frequently, in an ssh session that outputs a
    lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
    allocated, making it much more likely that the rndis header will cross
    a page boundary when the vCPU count is 16 or more. The changes in
    commit 14ad6ed30a10 are perfectly valid -- they just had the side
    effect of making the netvsc bug more prominent.
    
    Current code in init_page_array() creates a separate page buffer array
    entry for each PFN required to identify the data to be transmitted.
    Contiguous PFNs get separate entries in the page buffer array, and any
    information about contiguity is lost.
    
    Fix the core issue by having init_page_array() construct the page
    buffer array to represent contiguous ranges rather than individual
    pages. When these ranges are subsequently passed to
    netvsc_build_mpb_array(), it can build GPA ranges that contain
    multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
    error status: 2". If instead the network packet is sent by copying
    into a pre-allocated send buffer slot, the copy proceeds using the
    contiguous ranges rather than individual pages, but the result of the
    copying is the same. Also fix rndis_filter_send_request() to construct
    a contiguous range, since it has its own page buffer array.
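    
    The difference between per-page entries and a contiguous range can be
    sketched in user space as follows. This is a simplified model, not the
    driver code: struct model_page_buffer, both fill functions and the
    4096-byte page size are assumptions made for illustration.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define MODEL_PAGE_SIZE 4096u

    struct model_page_buffer {
        uint64_t pfn;       /* first page frame of the entry */
        uint32_t offset;    /* byte offset into that page */
        uint32_t len;       /* bytes described by the entry */
    };

    /* Old behavior: split the buffer so no entry crosses a page. */
    static size_t model_fill_per_page(uint64_t addr, uint32_t len,
                                      struct model_page_buffer *pb)
    {
        size_t n = 0;

        while (len > 0) {
            uint32_t off = (uint32_t)(addr % MODEL_PAGE_SIZE);
            uint32_t chunk = MODEL_PAGE_SIZE - off;

            if (chunk > len)
                chunk = len;
            pb[n].pfn = addr / MODEL_PAGE_SIZE;
            pb[n].offset = off;
            pb[n].len = chunk;
            n++;
            addr += chunk;
            len -= chunk;
        }
        return n;
    }

    /* New behavior: one entry for the whole physically contiguous range. */
    static size_t model_fill_range(uint64_t addr, uint32_t len,
                                   struct model_page_buffer *pb)
    {
        pb[0].pfn = addr / MODEL_PAGE_SIZE;
        pb[0].offset = (uint32_t)(addr % MODEL_PAGE_SIZE);
        pb[0].len = len;
        return 1;
    }
    ```
    
    For a 100-byte header that starts 40 bytes before a page boundary, the
    per-page model yields two entries (40 and 60 bytes) while the range
    model yields a single 100-byte entry that spans both PFNs.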
    
    This change has a side benefit in CoCo VMs in that netvsc_dma_map()
    calls dma_map_single() on each contiguous range instead of on each
    page. This results in fewer calls to dma_map_single() but on larger
    chunks of memory, which should reduce contention on the swiotlb.
    
    Since the page buffer array now contains one entry for each contiguous
    range instead of for each individual page, the number of entries in
    the array can be reduced, saving 208 bytes of stack space in
    netvsc_xmit() when MAX_SKB_FRAGS has the default value of 17.
    
    [1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
    Cc: <[email protected]> # 6.1.x
    Signed-off-by: Michael Kelley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

hv_netvsc: Remove rmsg_pgcnt [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Mon May 12 17:06:03 2025 -0700

    hv_netvsc: Remove rmsg_pgcnt
    
    commit 5bbc644bbf4e97a05bc0cb052189004588ff8a09 upstream.
    
    init_page_array() now always creates a single page buffer array entry
    for the rndis message, even if the rndis message crosses a page
    boundary. As such, the number of page buffer array entries used for
    the rndis message must no longer be tracked -- it is always just 1.
    Remove the rmsg_pgcnt field and use "1" where the value is needed.
    
    Cc: <[email protected]> # 6.1.x
    Signed-off-by: Michael Kelley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages [+ + +]

Author: Michael Kelley <[email protected]>
Date:   Mon May 12 17:06:01 2025 -0700

    hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
    
    commit 4f98616b855cb0e3b5917918bb07b44728eb96ea upstream.
    
    netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
    messages. This function creates a series of GPA ranges, each of which
    contains a single PFN. However, if the rndis header in the VMBus
    message crosses a page boundary, the netvsc protocol with the host
    requires that both PFNs for the rndis header must be in a single "GPA
    range" data structure, which isn't possible with
    vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
    new function netvsc_build_mpb_array() to build a VMBus message with
    multiple GPA ranges, each of which may contain multiple PFNs. Use
    vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.
    
    There's no functional change since higher levels of netvsc don't
    maintain or propagate knowledge of contiguous PFNs. Based on its
    input, netvsc_build_mpb_array() still produces a separate GPA range
    for each PFN and the behavior is the same as with
    vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
    subsequent patch to provide the necessary grouping.
    
    Cc: <[email protected]> # 6.1.x
    Signed-off-by: Michael Kelley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
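
A minimal sketch of the data shape described above: one GPA range that lists every PFN touched by a buffer, so a header crossing a page boundary stays in a single range. The struct and function names (demo_page_buffer, demo_gpa_range, demo_build_range) and the 4 KiB page size are assumptions for illustration, not the netvsc or VMBus definitions, and the buffer is assumed to be physically contiguous.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size */
#define DEMO_MAX_PFNS  4u	/* arbitrary limit for this sketch */

/* Hypothetical input: a physically contiguous buffer starting in page 'pfn'. */
struct demo_page_buffer {
	unsigned long pfn;
	unsigned int offset;	/* byte offset within the first page */
	unsigned int len;	/* length in bytes, may cross a page boundary */
};

/* Hypothetical output: one range listing every PFN the buffer touches. */
struct demo_gpa_range {
	unsigned int offset;
	unsigned int len;
	unsigned int pfn_count;
	unsigned long pfns[DEMO_MAX_PFNS];
};

/* Number of pages touched by 'len' bytes starting at 'offset'. */
static unsigned int demo_pfn_count(unsigned int offset, unsigned int len)
{
	return (offset + len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/* Fill 'range' from 'pb'. Returns 0 on success, -1 if too many pages. */
static int demo_build_range(const struct demo_page_buffer *pb,
			    struct demo_gpa_range *range)
{
	unsigned int i, count;

	assert(pb != NULL && range != NULL);
	assert(pb->offset < DEMO_PAGE_SIZE && pb->len > 0);

	count = demo_pfn_count(pb->offset, pb->len);
	if (count > DEMO_MAX_PFNS)
		return -1;

	range->offset = pb->offset;
	range->len = pb->len;
	range->pfn_count = count;
	/* Contiguous buffer assumed, so the PFNs are consecutive. */
	for (i = 0; i < count; i++)
		range->pfns[i] = pb->pfn + i;
	return 0;
}
```

With this shape, a 200-byte header that starts 4000 bytes into a page yields one range with two PFNs rather than two single-PFN ranges.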

i2c: designware: Fix an error handling path in i2c_dw_pci_probe() [+ + +]

Author: Christophe JAILLET <[email protected]>
Date:   Tue May 13 19:56:41 2025 +0200

    i2c: designware: Fix an error handling path in i2c_dw_pci_probe()
    
    commit 1cfe51ef07ca3286581d612debfb0430eeccbb65 upstream.
    
    If navi_amd_register_client() fails, the previous i2c_dw_probe() call
    should be undone by a corresponding i2c_del_adapter() call, as already done
    in the remove function.
    
    Fixes: 17631e8ca2d3 ("i2c: designware: Add driver support for AMD NAVI GPU")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Cc: <[email protected]> # v5.13+
    Acked-by: Jarkko Nikula <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Link: https://lore.kernel.org/r/fcd9651835a32979df8802b2db9504c523a8ebbb.1747158983.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
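
The error-handling pattern described above can be sketched as follows. This is a simplified model, not the driver code: the demo_* functions are hypothetical stand-ins for i2c_dw_probe(), navi_amd_register_client() and i2c_del_adapter(), and the struct fields exist only to make the behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; the field names are illustrative only. */
struct demo_dev {
	bool adapter_registered;
	bool client_should_fail;
};

/* Stand-in for the step that registers the adapter. */
static int demo_register_adapter(struct demo_dev *dev)
{
	dev->adapter_registered = true;
	return 0;
}

/* Stand-in for the call that undoes demo_register_adapter(). */
static void demo_unregister_adapter(struct demo_dev *dev)
{
	dev->adapter_registered = false;
}

/* Stand-in for a later probe step that may fail. */
static int demo_register_client(struct demo_dev *dev)
{
	return dev->client_should_fail ? -1 : 0;
}

static int demo_probe(struct demo_dev *dev)
{
	int ret;

	assert(dev != NULL);

	ret = demo_register_adapter(dev);
	if (ret)
		return ret;

	ret = demo_register_client(dev);
	if (ret)
		goto err_del_adapter;

	return 0;

err_del_adapter:
	/* Undo the earlier step so a failed probe leaves nothing registered. */
	demo_unregister_adapter(dev);
	return ret;
}
```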

iio: adc: ad7606: check for NULL before calling sw_mode_config() [+ + +]

Author: David Lechner <[email protected]>
Date:   Tue Mar 18 17:52:09 2025 -0500

    iio: adc: ad7606: check for NULL before calling sw_mode_config()
    
    [ Upstream commit 5257d80e22bf27009d6742e4c174f42cfe54e425 ]
    
    Check that the sw_mode_config function pointer is not NULL before
    calling it. Not all buses define this callback, which resulted in a NULL
    pointer dereference.
    
    Fixes: e571c1902116 ("iio: adc: ad7606: move scale_setup as function pointer on chip-info")
    Reviewed-by: Nuno Sá <[email protected]>
    Signed-off-by: David Lechner <[email protected]>
    Link: https://patch.msgid.link/20250318-iio-adc-ad7606-improvements-v2-1-4b605427774c@baylibre.com
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
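
A minimal sketch of guarding an optional callback, as described above. The demo_bus_ops structure and the demo_* functions are hypothetical; only the idea of checking the sw_mode_config pointer before calling it comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bus operations table; not every bus sets sw_mode_config. */
struct demo_bus_ops {
	int (*sw_mode_config)(void *dev);
};

/* Hypothetical callback for a bus that supports software mode. */
static int demo_spi_sw_mode_config(void *dev)
{
	int *configured = dev;

	*configured = 1;
	return 0;
}

/*
 * Call the optional callback only when the bus defines it.
 * Returns 0 when there is nothing to do, otherwise the callback's result.
 */
static int demo_setup_sw_mode(const struct demo_bus_ops *ops, void *dev)
{
	assert(ops != NULL);

	if (!ops->sw_mode_config)
		return 0;

	return ops->sw_mode_config(dev);
}
```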

iio: adc: ad7606: move software functions into common file [+ + +]

Author: Guillaume Stols <[email protected]>
Date:   Mon Feb 10 17:10:53 2025 +0100

    iio: adc: ad7606: move software functions into common file
    
    [ Upstream commit d2477887f6677de0675e600f1590378a5fb52909 ]
    
    Since the registers are always the same, whichever bus is used, moving the
    software functions into the main file avoids duplicating the code in both
    the SPI and parallel versions of the driver.
    
    Signed-off-by: Guillaume Stols <[email protected]>
    Co-developed-by: Angelo Dureghello <[email protected]>
    Signed-off-by: Angelo Dureghello <[email protected]>
    Link: https://patch.msgid.link/20250210-wip-bl-ad7606_add_backend_sw_mode-v4-3-160df18b1da7@baylibre.com
    Signed-off-by: Jonathan Cameron <[email protected]>
    Stable-dep-of: 5257d80e22bf ("iio: adc: ad7606: check for NULL before calling sw_mode_config()")
    Signed-off-by: Sasha Levin <[email protected]>

iio: adc: ad7606: move the software mode configuration [+ + +]

Author: Guillaume Stols <[email protected]>
Date:   Mon Feb 10 17:10:52 2025 +0100

    iio: adc: ad7606: move the software mode configuration
    
    [ Upstream commit f2a62931b39478c98f977caf299df5bc072f38e0 ]
    
    This is a preparation for the introduction of the software functions in
    the iio backend version of the driver.
    The software mode configuration must be executed once the channels are
    configured, and the number of channels is known. This is not the case
    before iio-backend's configuration is called, and iio backend version of
    the driver does not have a timestamp channel.
    Also the sw_mode_config callback is configured during the iio-backend
    configuration.
    For clarity, I moved the entire block instead of just the
    concerned function calls.
    
    Signed-off-by: Guillaume Stols <[email protected]>
    Link: https://patch.msgid.link/20250210-wip-bl-ad7606_add_backend_sw_mode-v4-2-160df18b1da7@baylibre.com
    Signed-off-by: Jonathan Cameron <[email protected]>
    Stable-dep-of: 5257d80e22bf ("iio: adc: ad7606: check for NULL before calling sw_mode_config()")
    Signed-off-by: Sasha Levin <[email protected]>

io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo() [+ + +]

Author: Jens Axboe <[email protected]>
Date:   Tue May 13 15:02:23 2025 -0600

    io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo()
    
    [ Upstream commit d871198ee431d90f5308d53998c1ba1d5db5619a ]
    
    Not everything requires locking in there, which is why the 'has_lock'
    variable exists. But enough does that it's a bit unwieldy to manage.
    Wrap the whole thing in a ->uring_lock trylock, and just return
    with no output if we fail to grab it. The existing trylock() will
    already have greatly diminished utility/output for the failure case.
    
    This fixes an issue with reading the SQE fields, if the ring is being
    actively resized at the same time.
    
    Reported-by: Jann Horn <[email protected]>
    Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS")
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
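
The "trylock, and produce no output on failure" approach described above can be sketched in user-space C. This is an illustration under assumptions: demo_ctx, its 'entries' field and the demo_* helpers are hypothetical, and a C11 atomic_flag stands in for the kernel mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical context: a lock plus state that is only safe to read under it. */
struct demo_ctx {
	atomic_flag lock;
	unsigned int entries;
};

/* Returns true if the lock was taken without waiting. */
static bool demo_trylock(struct demo_ctx *ctx)
{
	return !atomic_flag_test_and_set_explicit(&ctx->lock,
						  memory_order_acquire);
}

static void demo_unlock(struct demo_ctx *ctx)
{
	atomic_flag_clear_explicit(&ctx->lock, memory_order_release);
}

/*
 * Format a report into 'buf'. Returns the number of characters produced,
 * or 0 with an empty buffer when the lock is contended.
 */
static int demo_show_info(struct demo_ctx *ctx, char *buf, size_t len)
{
	int n;

	assert(ctx != NULL && buf != NULL && len > 0);

	buf[0] = '\0';
	if (!demo_trylock(ctx))
		return 0;	/* someone else holds the lock: print nothing */

	n = snprintf(buf, len, "entries:\t%u\n", ctx->entries);
	demo_unlock(ctx);
	return n;
}
```

Wrapping the whole function in one trylock keeps the reader simple: either everything is read under the lock or nothing is reported.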

io_uring/memmap: don't use page_address() on a highmem page [+ + +]

Author: Jens Axboe <[email protected]>
Date:   Mon May 12 09:06:06 2025 -0600

    io_uring/memmap: don't use page_address() on a highmem page
    
    commit f446c6311e86618a1f81eb576b56a6266307238f upstream.
    
    For older/32-bit systems with highmem, don't assume that the pages in
    a mapped region are always going to be mapped. If io_region_init_ptr()
    finds that the pages are coalescable, also check if the first page is
    a HighMem page or not. If it is, fall through to the usual vmap()
    mapping rather than attempt to get the unmapped page address.
    
    Cc: [email protected]
    Fixes: c4d0ac1c1567 ("io_uring/memmap: optimise single folio regions")
    Link: https://lore.kernel.org/all/[email protected]/
    Reported-by: [email protected]
    Link: https://lore.kernel.org/all/[email protected]/
    Reported-by: [email protected]
    Tested-by: [email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

io_uring/uring_cmd: fix hybrid polling initialization issue [+ + +]

Author: hexue <[email protected]>
Date:   Mon May 12 13:20:25 2025 +0800

    io_uring/uring_cmd: fix hybrid polling initialization issue
    
    commit 63166b815dc163b2e46426cecf707dc5923d6d13 upstream.
    
    Modify the check for whether the timer is initialized during IO transfer
    when passthrough is used with hybrid polling, to ensure that it's always
    set up correctly.
    
    Cc: [email protected]
    Fixes: 01ee194d1aba ("io_uring: add support for hybrid IOPOLL")
    Signed-off-by: hexue <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

kbuild: Disable -Wdefault-const-init-unsafe [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Tue May 6 14:02:01 2025 -0700

    kbuild: Disable -Wdefault-const-init-unsafe
    
    commit d0afcfeb9e3810ec89d1ffde1a0e36621bb75dca upstream.
    
    A new on by default warning in clang [1] aims to flag instances where
    const variables without static or thread local storage or const members
    in aggregate types are not initialized because it can lead to an
    indeterminate value. This is quite noisy for the kernel due to
    instances originating from header files such as:
    
      drivers/gpu/drm/i915/gt/intel_ring.h:62:2: error: default initialization of an object of type 'typeof (ring->size)' (aka 'const unsigned int') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
         62 |         typecheck(typeof(ring->size), next);
            |         ^
      include/linux/typecheck.h:10:9: note: expanded from macro 'typecheck'
         10 | ({      type __dummy; \
            |              ^
    
      include/net/ip.h:478:14: error: default initialization of an object of type 'typeof (rt->dst.expires)' (aka 'const unsigned long') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
        478 |                 if (mtu && time_before(jiffies, rt->dst.expires))
            |                            ^
      include/linux/jiffies.h:138:26: note: expanded from macro 'time_before'
        138 | #define time_before(a,b)        time_after(b,a)
            |                                 ^
      include/linux/jiffies.h:128:3: note: expanded from macro 'time_after'
        128 |         (typecheck(unsigned long, a) && \
            |          ^
      include/linux/typecheck.h:11:12: note: expanded from macro 'typecheck'
         11 |         typeof(x) __dummy2; \
            |                   ^
    
      include/linux/list.h:409:27: warning: default initialization of an object of type 'union (unnamed union at include/linux/list.h:409:27)' with const member leaves the object uninitialized [-Wdefault-const-init-field-unsafe]
        409 |         struct list_head *next = smp_load_acquire(&head->next);
            |                                  ^
      include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire'
        176 | #define smp_load_acquire(p) __smp_load_acquire(p)
            |                             ^
      arch/arm64/include/asm/barrier.h:164:59: note: expanded from macro '__smp_load_acquire'
        164 |         union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u;   \
            |                                                                  ^
      include/linux/list.h:409:27: note: member '__val' declared 'const' here
    
      crypto/scatterwalk.c:66:22: error: default initialization of an object of type 'struct scatter_walk' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
         66 |         struct scatter_walk walk;
            |                             ^
      include/crypto/algapi.h:112:15: note: member 'addr' declared 'const' here
        112 |                 void *const addr;
            |                             ^
    
      fs/hugetlbfs/inode.c:733:24: error: default initialization of an object of type 'struct vm_area_struct' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
        733 |         struct vm_area_struct pseudo_vma;
            |                               ^
      include/linux/mm_types.h:803:20: note: member 'vm_flags' declared 'const' here
        803 |                 const vm_flags_t vm_flags;
            |                                  ^
    
    Silencing the instances from typecheck.h is difficult because '= {}' is
    not available in older but supported compilers and '= {0}' would cause
    warnings about a literal 0 being treated as NULL. While it might be
    possible to come up with a local hack to silence the warning for
    clang-21+, it may not be worth it since -Wuninitialized will still
    trigger if an uninitialized const variable is actually used.
    
    In all audited cases of the "field" variant of the warning, the members
    are either not used in the particular call path, modified through other
    means such as memset() / memcpy() because the containing object is not
    const, or are within a union with other non-const members.
    
    Since this warning does not appear to have a high signal to noise ratio,
    just disable it.
    
    Cc: [email protected]
    Link: https://github.com/llvm/llvm-project/commit/576161cb6069e2c7656a8ef530727a0f4aefff30 [1]
    Reported-by: Linux Kernel Functional Testing <[email protected]>
    Closes: https://lore.kernel.org/CA+G9fYuNjKcxFKS_MKPRuga32XbndkLGcY-PVuoSwzv6VWbY=w@mail.gmail.com/
    Reported-by: Marcus Seyfarth <[email protected]>
    Closes: https://github.com/ClangBuiltLinux/linux/issues/2088
    Signed-off-by: Nathan Chancellor <[email protected]>
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
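
A small sketch of why the typecheck() pattern shown in the diagnostics above declares an uninitialized const object. The macro below follows the shape visible in the quoted compiler notes but uses demo_ names and __typeof__; the struct and function are hypothetical. It relies on GNU C statement expressions, so it assumes GCC or clang.

```c
#include <assert.h>

/*
 * Declares two dummy variables and compares their addresses so that the
 * compiler checks that 'x' has a type compatible with 'type'. Neither
 * variable is ever read.
 */
#define demo_typecheck(type, x) \
({	type demo_dummy; \
	__typeof__(x) demo_dummy2; \
	(void)(&demo_dummy == &demo_dummy2); \
	1; \
})

/* Hypothetical structure with a const member, like 'ring->size' in the log. */
struct demo_ring {
	const unsigned int size;
};

static int demo_check(const struct demo_ring *ring, unsigned int next)
{
	/*
	 * 'type' expands to 'const unsigned int', so demo_dummy is an
	 * uninitialized const variable. That is the declaration the new
	 * warning flags, even though its value is never used.
	 */
	return demo_typecheck(__typeof__(ring->size), next);
}
```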

Linux: Linux 6.14.8 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Thu May 22 14:31:58 2025 +0200

    Linux 6.14.8
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Ronald Warsow <[email protected]>
    Tested-by: Luna Jernberg <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Miguel Ojeda <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Markus Reichelt <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Hardik Garg <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: Fix MAX_REG_OFFSET calculation [+ + +]

Author: Huacai Chen <[email protected]>
Date:   Wed May 14 22:17:43 2025 +0800

    LoongArch: Fix MAX_REG_OFFSET calculation
    
    commit 90436d234230e9a950ccd87831108b688b27a234 upstream.
    
    Fix the MAX_REG_OFFSET calculation so that it points to the last register
    in 'struct pt_regs' rather than to the end marker itself. Pointing to the
    marker could allow regs_get_register() to return an invalid offset.
    
    Cc: [email protected]
    Fixes: 803b0fc5c3f2baa6e5 ("LoongArch: Add process management")
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
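
The off-by-one described above can be illustrated with offsetof(). This is a sketch under assumptions: demo_regs is a hypothetical, much smaller layout than the real 'struct pt_regs', and the DEMO_ macros and demo_get_register() are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified register frame; not the real LoongArch layout. */
struct demo_regs {
	unsigned long regs[4];
	unsigned long csr_era;
	unsigned long last[];	/* end marker, holds no register */
};

/* Off-by-one form: the offset of the marker, one slot past the last register. */
#define DEMO_MAX_REG_OFFSET_BUGGY	offsetof(struct demo_regs, last)

/* Corrected form: the offset of the last real register. */
#define DEMO_MAX_REG_OFFSET \
	(offsetof(struct demo_regs, last) - sizeof(unsigned long))

/*
 * Bounds-checked read of the register at byte 'offset'. Returns 0 for
 * offsets beyond the last register. 'offset' is assumed to be a multiple
 * of sizeof(unsigned long).
 */
static unsigned long demo_get_register(const struct demo_regs *regs,
				       size_t offset)
{
	assert(regs != NULL);

	if (offset > DEMO_MAX_REG_OFFSET)
		return 0;

	return *(const unsigned long *)((const char *)regs + offset);
}
```

With the buggy bound, an offset equal to that of the marker would pass the check and read one word past the last register.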

LoongArch: Move __arch_cpu_idle() to .cpuidle.text section [+ + +]

Author: Huacai Chen <[email protected]>
Date:   Wed May 14 22:17:52 2025 +0800

    LoongArch: Move __arch_cpu_idle() to .cpuidle.text section
    
    commit 3e245b7b74c3a2ead5fa4bad27cc275284c75189 upstream.
    
    Now arch_cpu_idle() is annotated with __cpuidle which means it is in
    the .cpuidle.text section, but __arch_cpu_idle() isn't. Thus, fix the
    missing .cpuidle.text section assignment for __arch_cpu_idle() in order
    to correct backtracing with nmi_backtrace().
    
    The principle is similar to the commit 97c8580e85cf81c ("MIPS: Annotate
    cpu_wait implementations with __cpuidle")
    
    Cc: [email protected]
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: Prevent cond_resched() occurring within kernel-fpu [+ + +]

Author: Tianyang Zhang <[email protected]>
Date:   Wed May 14 22:17:43 2025 +0800

    LoongArch: Prevent cond_resched() occurring within kernel-fpu
    
    commit 2468b0e3d5659dfde77f081f266e1111a981efb8 upstream.
    
    When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/
    CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely
    act as a barrier(). However, in these cases cond_resched() can still
    trigger a context switch and modify CSR.EUEN, resulting in a do_fpu()
    exception being raised within the kernel-fpu critical sections, as
    demonstrated in the following path:
    
    dcn32_calculate_wm_and_dlg()
        DC_FP_START()
            dcn32_calculate_wm_and_dlg_fpu()
                dcn32_find_dummy_latency_index_for_fw_based_mclk_switch()
                    dcn32_internal_validate_bw()
                        dcn32_enable_phantom_stream()
                            dc_create_stream_for_sink()
                               kzalloc(GFP_KERNEL)
                                    __kmem_cache_alloc_node()
                                        __cond_resched()
        DC_FP_END()
    
    This patch is similar to commit d02198550423a0b (x86/fpu: Improve crypto
    performance by making kernel-mode FPU reliably usable in softirqs).  It
    uses local_bh_disable() instead of preempt_disable() for non-RT kernels
    so it can avoid the cond_resched() issue, and also extends the kernel-fpu
    application scenarios to the softirq context.
    
    Cc: [email protected]
    Signed-off-by: Tianyang Zhang <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: Save and restore CSR.CNTC for hibernation [+ + +]

Author: Huacai Chen <[email protected]>
Date:   Wed May 14 22:17:52 2025 +0800

    LoongArch: Save and restore CSR.CNTC for hibernation
    
    commit ceb9155d058a11242aa0572875c44e9713b1a2be upstream.
    
    Save and restore CSR.CNTC for hibernation, similar to what is done for
    suspend.
    
    For the host this is unnecessary because the sched clock is guaranteed to
    be continuous, but for a KVM guest the sched clock isn't enough because
    rdtime.d should also be continuous.
    
    Host::rdtime.d = Host::CSR.CNTC + counter
    Guest::rdtime.d = Host::CSR.CNTC + Host::CSR.GCNTC + Guest::CSR.CNTC + counter
    
    so,
    
    Guest::rdtime.d = Host::rdtime.d + Host::CSR.GCNTC + Guest::CSR.CNTC
    
    To ensure that Guest::rdtime.d is continuous, Host::rdtime.d must first be
    continuous, while Host::CSR.GCNTC / Guest::CSR.CNTC are maintained by KVM.
    
    Cc: [email protected]
    Signed-off-by: Xianglai Li <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: uprobes: Remove redundant code about resume_era [+ + +]

Author: Tiezhu Yang <[email protected]>
Date:   Wed May 14 22:18:10 2025 +0800

    LoongArch: uprobes: Remove redundant code about resume_era
    
    commit 12614f794274f63fbdfe76771b2b332077d63848 upstream.
    
    arch_uprobe_skip_sstep() returns true if the instruction was emulated, that
    is to say, there is no need to single step the emulated instructions.
    regs->csr_era will point to the destination address directly after the
    exception, so the resume_era related code is redundant. Remove it.
    
    Cc: [email protected]
    Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support")
    Signed-off-by: Tiezhu Yang <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: uprobes: Remove user_{en,dis}able_single_step() [+ + +]

Author: Tiezhu Yang <[email protected]>
Date:   Wed May 14 22:18:10 2025 +0800

    LoongArch: uprobes: Remove user_{en,dis}able_single_step()
    
    commit 0b326b2371f94e798137cc1a3c5c2eef2bc69061 upstream.
    
    When executing the "perf probe" and "perf stat" test cases for some
    cryptographic algorithms, the output shows "Trace/breakpoint trap".
    This is because uprobes on LoongArch use the software singlestep
    breakpoint, so there is no need to use the hardware singlestep. Remove
    the related calls to user_{en,dis}able_single_step() for uprobes
    on LoongArch.
    
    How to reproduce:
    
    Please make sure CONFIG_UPROBE_EVENTS is set and openssl supports the sm2
    algorithm, then execute the following commands.
    
    cd tools/perf && make
    ./perf probe -x /usr/lib64/libcrypto.so BN_mod_mul_montgomery
    ./perf stat -e probe_libcrypto:BN_mod_mul_montgomery openssl speed sm2
    
    Cc: [email protected]
    Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support")
    Signed-off-by: Tiezhu Yang <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

MAINTAINERS: Update Alexey Makhalov's email address [+ + +]

Author: Alexey Makhalov <[email protected]>
Date:   Tue Mar 18 00:40:31 2025 +0000

    MAINTAINERS: Update Alexey Makhalov's email address
    
    commit 386cd3dcfd63491619b4034b818737fc0219e128 upstream.
    
    Fix a typo in an email address.
    
    Closes: https://lore.kernel.org/all/20240925-rational-succinct-vulture-cca9fb@lemur/T/
    Reported-by: Konstantin Ryabitsev <[email protected]>
    Reported-by: Juergen Gross <[email protected]>
    Signed-off-by: Alexey Makhalov <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices [+ + +]

Author: Ido Schimmel <[email protected]>
Date:   Wed May 14 14:48:05 2025 +0200

    mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
    
    [ Upstream commit 92ec4855034b2c4d13f117558dc73d20581fa9ff ]
    
    The driver only offloads neighbors that are constructed on top of net
    devices registered by it or their uppers (which are all Ethernet). The
    device supports GRE encapsulation and decapsulation of forwarded
    traffic, but the driver will not offload dummy neighbors constructed on
    top of GRE net devices as they are not uppers of its net devices:
    
     # ip link add name gre1 up type gre tos inherit local 192.0.2.1 remote 198.51.100.1
     # ip neigh add 0.0.0.0 lladdr 0.0.0.0 nud noarp dev gre1
     $ ip neigh show dev gre1 nud noarp
     0.0.0.0 lladdr 0.0.0.0 NOARP
    
    (Note that the neighbor is not marked with 'offload')
    
    When the driver is reloaded and the existing configuration is replayed,
    the driver does not perform the same check regarding existing neighbors
    and offloads the previously added one:
    
     # devlink dev reload pci/0000:01:00.0
     $ ip neigh show dev gre1 nud noarp
     0.0.0.0 lladdr 0.0.0.0 offload NOARP
    
    If the neighbor is later deleted, the driver will ignore the
    notification (given the GRE net device is not its upper) and will
    therefore keep referencing freed memory, resulting in a use-after-free
    [1] when the net device is deleted:
    
     # ip neigh del 0.0.0.0 lladdr 0.0.0.0 dev gre1
     # ip link del dev gre1
    
    Fix by skipping neighbor replay if the net device for which the replay
    is performed is not our upper.
    
    [1]
    BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x1ea/0x200
    Read of size 8 at addr ffff888155b0e420 by task ip/2282
    [...]
    Call Trace:
     <TASK>
     dump_stack_lvl+0x6f/0xa0
     print_address_description.constprop.0+0x6f/0x350
     print_report+0x108/0x205
     kasan_report+0xdf/0x110
     mlxsw_sp_neigh_entry_update+0x1ea/0x200
     mlxsw_sp_router_rif_gone_sync+0x2a8/0x440
     mlxsw_sp_rif_destroy+0x1e9/0x750
     mlxsw_sp_netdevice_ipip_ol_event+0x3c9/0xdc0
     mlxsw_sp_router_netdevice_event+0x3ac/0x15e0
     notifier_call_chain+0xca/0x150
     call_netdevice_notifiers_info+0x7f/0x100
     unregister_netdevice_many_notify+0xc8c/0x1d90
     rtnl_dellink+0x34e/0xa50
     rtnetlink_rcv_msg+0x6fb/0xb70
     netlink_rcv_skb+0x131/0x360
     netlink_unicast+0x426/0x710
     netlink_sendmsg+0x75a/0xc20
     __sock_sendmsg+0xc1/0x150
     ____sys_sendmsg+0x5aa/0x7b0
     ___sys_sendmsg+0xfc/0x180
     __sys_sendmsg+0x121/0x1b0
     do_syscall_64+0xbb/0x1d0
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    Fixes: 8fdb09a7674c ("mlxsw: spectrum_router: Replay neighbours when RIF is made")
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Petr Machata <[email protected]>
    Signed-off-by: Petr Machata <[email protected]>
    Link: https://patch.msgid.link/c53c02c904fde32dad484657be3b1477884e9ad6.1747225701.git.petrm@nvidia.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mm/page_alloc: fix race condition in unaccepted memory handling [+ + +]

Author: Kirill A. Shutemov <[email protected]>
Date:   Tue May 6 16:32:07 2025 +0300

    mm/page_alloc: fix race condition in unaccepted memory handling
    
    commit fefc075182275057ce607effaa3daa9e6e3bdc73 upstream.
    
    The page allocator tracks the number of zones that have unaccepted memory
    using static_branch_inc/dec() and uses that static branch in hot paths to
    determine if it needs to deal with unaccepted memory.
    
    Borislav and Thomas pointed out that the tracking is racy: operations on
    static_branch are not serialized against adding/removing unaccepted pages
    to/from the zone.
    
    Sanity checks inside the static_branch machinery detect it:
    
    WARNING: CPU: 0 PID: 10 at kernel/jump_label.c:276 __static_key_slow_dec_cpuslocked+0x8e/0xa0
    
    The comment around the WARN() explains the problem:
    
            /*
             * Warn about the '-1' case though; since that means a
             * decrement is concurrent with a first (0->1) increment. IOW
             * people are trying to disable something that wasn't yet fully
             * enabled. This suggests an ordering problem on the user side.
             */
    
    The effect of this static_branch optimization is only visible in
    microbenchmarks.
    
    Instead of adding more complexity around it, remove it altogether.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Kirill A. Shutemov <[email protected]>
    Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory")
    Link: https://lore.kernel.org/all/20250506092445.GBaBnVXXyvnazly6iF@fat_crate.local
    Reported-by: Borislav Petkov <[email protected]>
    Tested-by: Borislav Petkov (AMD) <[email protected]>
    Reported-by: Thomas Gleixner <[email protected]>
    Cc: Vlastimil Babka <[email protected]>
    Cc: Suren Baghdasaryan <[email protected]>
    Cc: Michal Hocko <[email protected]>
    Cc: Brendan Jackman <[email protected]>
    Cc: Johannes Weiner <[email protected]>
    Cc: <[email protected]>    [6.5+]
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Kirill A. Shutemov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: hugetlb: fix incorrect fallback for subpool [+ + +]

Author: Wupeng Ma <[email protected]>
Date:   Thu Apr 10 14:26:33 2025 +0800

    mm: hugetlb: fix incorrect fallback for subpool
    
    commit a833a693a490ecff8ba377654c6d4d333718b6b1 upstream.
    
    During our testing with hugetlb subpool enabled, we observe that
    hstate->resv_huge_pages may underflow into negative values.  Root cause
    analysis reveals a race condition in subpool reservation fallback handling
    as follows:
    
    hugetlb_reserve_pages()
        /* Attempt subpool reservation */
        gbl_reserve = hugepage_subpool_get_pages(spool, chg);
    
        /* Global reservation may fail after subpool allocation */
        if (hugetlb_acct_memory(h, gbl_reserve) < 0)
            goto out_put_pages;
    
    out_put_pages:
        /* This incorrectly restores reservation to subpool */
        hugepage_subpool_put_pages(spool, chg);
    
    When hugetlb_acct_memory() fails after subpool allocation, the current
    implementation over-commits subpool reservations by returning the full
    'chg' value instead of the actual allocated 'gbl_reserve' amount.  This
    discrepancy propagates to global reservations during subsequent releases,
    eventually causing resv_huge_pages underflow.
    
    This problem can be triggered easily with the following steps:
    1. reserve hugepages for hugetlb allocation
    2. mount hugetlbfs with min_size to enable hugetlb subpool
    3. allocate hugepages with two tasks (make sure the second will fail
       due to an insufficient number of hugepages)
    4. wait for a few seconds and repeat step 3, which will make
       hstate->resv_huge_pages go below zero.
    
    To fix this problem, return the correct number of pages to the subpool
    during the fallback after hugepage_subpool_get_pages() is called.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1c5ecae3a93f ("hugetlbfs: add minimum size accounting to subpools")
    Signed-off-by: Wupeng Ma <[email protected]>
    Tested-by: Joshua Hahn <[email protected]>
    Reviewed-by: Oscar Salvador <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Ma Wupeng <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: userfaultfd: correct dirty flags set for both present and swap pte [+ + +]

Author: Barry Song <[email protected]>
Date:   Fri May 9 10:09:12 2025 +1200

    mm: userfaultfd: correct dirty flags set for both present and swap pte
    
    commit 75cb1cca2c880179a11c7dd9380b6f14e41a06a4 upstream.
    
    As David pointed out, what truly matters for mremap and userfaultfd move
    operations is the soft dirty bit.  The current comment and
    implementation—which always sets the dirty bit for present PTEs and
    fails to set the soft dirty bit for swap PTEs—are incorrect.  This could
    break features like Checkpoint-Restore in Userspace (CRIU).
    
    This patch updates the behavior to correctly set the soft dirty bit for
    both present and swap PTEs in accordance with mremap.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
    Signed-off-by: Barry Song <[email protected]>
    Reported-by: David Hildenbrand <[email protected]>
    Closes: https://lore.kernel.org/linux-mm/[email protected]/
    Acked-by: Peter Xu <[email protected]>
    Reviewed-by: Suren Baghdasaryan <[email protected]>
    Cc: Lokesh Gidra <[email protected]>
    Cc: Andrea Arcangeli <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net/mlx5e: Disable MACsec offload for uplink representor profile [+ + +]

Author: Carolina Jubran <[email protected]>
Date:   Sun May 11 13:15:52 2025 +0300

    net/mlx5e: Disable MACsec offload for uplink representor profile
    
    [ Upstream commit 588431474eb7572e57a927fa8558c9ba2f8af143 ]
    
    MACsec offload is not supported in switchdev mode for uplink
    representors. When switching to the uplink representor profile, the
    MACsec offload feature must be cleared from the netdevice's features.
    
    If left enabled, attempts to add offloads result in a null pointer
    dereference, as the uplink representor does not support MACsec offload
    even though the feature bit remains set.
    
    Clear NETIF_F_HW_MACSEC in mlx5e_fix_uplink_rep_features().
    
    Kernel log:
    
    Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN
    KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f]
    CPU: 29 UID: 0 PID: 4714 Comm: ip Not tainted 6.14.0-rc4_for_upstream_debug_2025_03_02_17_35 #1
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    RIP: 0010:__mutex_lock+0x128/0x1dd0
    Code: d0 7c 08 84 d2 0f 85 ad 15 00 00 8b 35 91 5c fe 03 85 f6 75 29 49 8d 7e 60 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a6 15 00 00 4d 3b 76 60 0f 85 fd 0b 00 00 65 ff
    RSP: 0018:ffff888147a4f160 EFLAGS: 00010206
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
    RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000078
    RBP: ffff888147a4f2e0 R08: ffffffffa05d2c19 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
    R13: dffffc0000000000 R14: 0000000000000018 R15: ffff888152de0000
    FS:  00007f855e27d800(0000) GS:ffff88881ee80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000004e5768 CR3: 000000013ae7c005 CR4: 0000000000372eb0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? die_addr+0x3d/0xa0
     ? exc_general_protection+0x144/0x220
     ? asm_exc_general_protection+0x22/0x30
     ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
     ? __mutex_lock+0x128/0x1dd0
     ? lockdep_set_lock_cmp_fn+0x190/0x190
     ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
     ? mutex_lock_io_nested+0x1ae0/0x1ae0
     ? lock_acquire+0x1c2/0x530
     ? macsec_upd_offload+0x145/0x380
     ? lockdep_hardirqs_on_prepare+0x400/0x400
     ? kasan_save_stack+0x30/0x40
     ? kasan_save_stack+0x20/0x40
     ? kasan_save_track+0x10/0x30
     ? __kasan_kmalloc+0x77/0x90
     ? __kmalloc_noprof+0x249/0x6b0
     ? genl_family_rcv_msg_attrs_parse.constprop.0+0xb5/0x240
     ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
     mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
     ? mlx5e_macsec_add_rxsa+0x11a0/0x11a0 [mlx5_core]
     macsec_update_offload+0x26c/0x820
     ? macsec_set_mac_address+0x4b0/0x4b0
     ? lockdep_hardirqs_on_prepare+0x284/0x400
     ? _raw_spin_unlock_irqrestore+0x47/0x50
     macsec_upd_offload+0x2c8/0x380
     ? macsec_update_offload+0x820/0x820
     ? __nla_parse+0x22/0x30
     ? genl_family_rcv_msg_attrs_parse.constprop.0+0x15e/0x240
     genl_family_rcv_msg_doit+0x1cc/0x2a0
     ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
     ? cap_capable+0xd4/0x330
     genl_rcv_msg+0x3ea/0x670
     ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
     ? lockdep_set_lock_cmp_fn+0x190/0x190
     ? macsec_update_offload+0x820/0x820
     netlink_rcv_skb+0x12b/0x390
     ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
     ? netlink_ack+0xd80/0xd80
     ? rwsem_down_read_slowpath+0xf90/0xf90
     ? netlink_deliver_tap+0xcd/0xac0
     ? netlink_deliver_tap+0x155/0xac0
     ? _copy_from_iter+0x1bb/0x12c0
     genl_rcv+0x24/0x40
     netlink_unicast+0x440/0x700
     ? netlink_attachskb+0x760/0x760
     ? lock_acquire+0x1c2/0x530
     ? __might_fault+0xbb/0x170
     netlink_sendmsg+0x749/0xc10
     ? netlink_unicast+0x700/0x700
     ? __might_fault+0xbb/0x170
     ? netlink_unicast+0x700/0x700
     __sock_sendmsg+0xc5/0x190
     ____sys_sendmsg+0x53f/0x760
     ? import_iovec+0x7/0x10
     ? kernel_sendmsg+0x30/0x30
     ? __copy_msghdr+0x3c0/0x3c0
     ? filter_irq_stacks+0x90/0x90
     ? stack_depot_save_flags+0x28/0xa30
     ___sys_sendmsg+0xeb/0x170
     ? kasan_save_stack+0x30/0x40
     ? copy_msghdr_from_user+0x110/0x110
     ? do_syscall_64+0x6d/0x140
     ? lock_acquire+0x1c2/0x530
     ? __virt_addr_valid+0x116/0x3b0
     ? __virt_addr_valid+0x1da/0x3b0
     ? lock_downgrade+0x680/0x680
     ? __delete_object+0x21/0x50
     __sys_sendmsg+0xf7/0x180
     ? __sys_sendmsg_sock+0x20/0x20
     ? kmem_cache_free+0x14c/0x4e0
     ? __x64_sys_close+0x78/0xd0
     do_syscall_64+0x6d/0x140
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    RIP: 0033:0x7f855e113367
    Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
    RSP: 002b:00007ffd15e90c88 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f855e113367
    RDX: 0000000000000000 RSI: 00007ffd15e90cf0 RDI: 0000000000000004
    RBP: 00007ffd15e90dbc R08: 0000000000000028 R09: 000000000045d100
    R10: 00007f855e011dd8 R11: 0000000000000246 R12: 0000000000000019
    R13: 0000000067c6b785 R14: 00000000004a1e80 R15: 0000000000000000
     </TASK>
    Modules linked in: 8021q garp mrp sch_ingress openvswitch nsh mlx5_ib mlx5_fwctl mlx5_dpll mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core]
    ---[ end trace 0000000000000000 ]---
    
    Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support")
    Signed-off-by: Carolina Jubran <[email protected]>
    Reviewed-by: Shahar Shitrit <[email protected]>
    Reviewed-by: Dragos Tatulea <[email protected]>
    Signed-off-by: Tariq Toukan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/tls: fix kernel panic when alloc_page failed [+ + +]

Author: Pengtao He <[email protected]>
Date:   Wed May 14 21:20:13 2025 +0800

    net/tls: fix kernel panic when alloc_page failed
    
    [ Upstream commit 491deb9b8c4ad12fe51d554a69b8165b9ef9429f ]
    
    We cannot set frag_list to a NULL pointer when alloc_page fails.
    It will be used in tls_strp_check_queue_ok the next time
    tls_strp_read_sock is called.
    
    This is because we don't reset full_len in tls_strp_flush_anchor_copy()
    so the recv path will try to continue handling the partial record
    on the next call but we detached the rcvq from the frag list.
    An alternative fix would be to reset full_len.
    
    Unable to handle kernel NULL pointer dereference
    at virtual address 0000000000000028
     Call trace:
     tls_strp_check_rcv+0x128/0x27c
     tls_strp_data_ready+0x34/0x44
     tls_data_ready+0x3c/0x1f0
     tcp_data_ready+0x9c/0xe4
     tcp_data_queue+0xf6c/0x12d0
     tcp_rcv_established+0x52c/0x798
    
    Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
    Signed-off-by: Pengtao He <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: cadence: macb: Fix a possible deadlock in macb_halt_tx. [+ + +]

Author: Mathieu Othacehe <[email protected]>
Date:   Fri May 9 14:19:35 2025 +0200

    net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
    
    [ Upstream commit c92d6089d8ad7d4d815ebcedee3f3907b539ff1f ]
    
    There is a situation where after THALT is set high, TGO stays high as
    well. Because jiffies are never updated, as we are in a context with
    interrupts disabled, we never exit that loop and have a deadlock.
    
    That deadlock was noticed on a sama5d4 device that stayed locked for days.
    
    Use retries instead of jiffies so that the timeout really works and we do
    not have a deadlock anymore.
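    
    The idea of bounding the wait with a retry counter can be sketched as
    follows. This is a minimal illustration, not the driver's code: the
    demo_* names, the bit position and the retry budget are all assumptions,
    and the status register is simulated in memory.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    /* Hypothetical stand-ins for the hardware, not the macb driver's API. */
    #define DEMO_TGO          (1u << 3)  /* "transmit go" bit, assumed position */
    #define DEMO_HALT_RETRIES 100        /* assumed retry budget */
    
    struct demo_mac {
        uint32_t status;          /* simulated status register */
        unsigned int reads;       /* number of polls performed so far */
        unsigned int clear_after; /* poll count at which TGO drops; 0 = never */
    };
    
    /* Simulated register read: TGO clears once enough polls have happened. */
    static uint32_t demo_read_status(struct demo_mac *mac)
    {
        mac->reads++;
        if (mac->clear_after && mac->reads >= mac->clear_after)
            mac->status &= ~DEMO_TGO;
        return mac->status;
    }
    
    /*
     * Poll until TGO clears or the retry budget runs out. The bound is a
     * plain counter, so it still expires when interrupts are disabled and
     * a time source such as jiffies is not advancing.
     * Returns true on success, false on timeout.
     */
    static bool demo_halt_tx(struct demo_mac *mac)
    {
        unsigned int retries = DEMO_HALT_RETRIES;
    
        while (retries--) {
            if (!(demo_read_status(mac) & DEMO_TGO))
                return true;
            /* a real driver would insert a short busy-wait delay here */
        }
        return false;
    }
    ```
    
    With a counter as the bound, a TGO bit that never clears produces a
    timeout after DEMO_HALT_RETRIES polls rather than an endless loop.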
    
    Fixes: e86cd53afc590 ("net/macb: better manage tx errors")
    Signed-off-by: Mathieu Othacehe <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: b53: prevent standalone from trying to forward to other ports [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Thu May 8 11:14:24 2025 +0200

    net: dsa: b53: prevent standalone from trying to forward to other ports
    
    [ Upstream commit 4227ea91e2657f7965e34313448e9d0a2b67712e ]
    
    When bridged ports and standalone ports share a VLAN, e.g. via VLAN
    uppers, or untagged traffic with a vlan unaware bridge, the ASIC will
    still try to forward traffic to known FDB entries on standalone ports.
    But since the port VLAN masks prevent forwarding to bridged ports, this
    traffic will be dropped.
    
    This e.g. can be observed in the bridge_vlan_unaware ping tests, where
    this breaks pinging with learning on.
    
    Work around this by enabling the simplified EAP mode on switches
    supporting it for standalone ports, which causes the ASIC to redirect
    traffic of unknown source MAC addresses to the CPU port.
    
    Since standalone ports do not learn, there are no known source MAC
    addresses, so effectively this redirects all incoming traffic to the CPU
    port.
    
    Fixes: ff39c2d68679 ("net: dsa: b53: Add bridge support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: microchip: let phylink manage PHY EEE configuration on KSZ switches [+ + +]

Author: Oleksij Rempel <[email protected]>
Date:   Sun May 4 10:14:33 2025 +0200

    net: dsa: microchip: let phylink manage PHY EEE configuration on KSZ switches
    
    commit 76ca05e0abe31a4f47a5b5a85041b5a22c03baf8 upstream.
    
    Phylink expects MAC drivers to provide LPI callbacks to properly manage
    Energy Efficient Ethernet (EEE) configuration. On KSZ switches with
    integrated PHYs, LPI is internally handled by hardware, while ports
    without integrated PHYs have no documented MAC-level LPI support.
    
    Provide dummy mac_disable_tx_lpi() and mac_enable_tx_lpi() callbacks to
    satisfy phylink requirements. Also, set default EEE capabilities during
    phylink initialization where applicable.
    
    Since phylink can now gracefully handle optional EEE configuration,
    remove the need for the MICREL_NO_EEE PHY flag.
    
    This change addresses issues caused by incomplete EEE refactoring
    introduced in commit fe0d4fd9285e ("net: phy: Keep track of EEE
    configuration"). It is not easily possible to fix all older kernels, but
    this patch ensures proper behavior on latest kernels and can be
    considered for backporting to stable kernels starting from v6.14.
    
    Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
    Signed-off-by: Oleksij Rempel <[email protected]>
    Cc: [email protected] # v6.14+
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Fri May 9 14:38:16 2025 +0300

    net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING
    
    [ Upstream commit 498625a8ab2c8e1c9ab5105744310e8d6952cc01 ]
    
    It has been reported that when under a bridge with stp_state=1, the logs
    get spammed with this message:
    
    [  251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port
    
    Further debugging shows the following info associated with packets:
    source_port=-1, switch_id=-1, vid=-1, vbid=1
    
    In other words, they are data plane packets which are supposed to be
    decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly)
    refuses to do so, because no switch port is currently in
    BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively
    unexpected.
    
    The error goes away after the port progresses to BR_STATE_LEARNING in 15
    seconds (the default forward_time of the bridge), because then,
    dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane
    packets with a plausible bridge port in a plausible STP state.
    
    Re-reading IEEE 802.1D-1990, I see the following:
    
    "4.4.2 Learning: (...) The Forwarding Process shall discard received
    frames."
    
    IEEE 802.1D-2004 further clarifies:
    
    "DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the
    DISCARDING port state. While those dot1dStpPortStates serve to
    distinguish reasons for discarding frames, the operation of the
    Forwarding and Learning processes is the same for all of them. (...)
    LISTENING represents a port that the spanning tree algorithm has
    selected to be part of the active topology (computing a Root Port or
    Designated Port role) but is temporarily discarding frames to guard
    against loops or incorrect learning."
    
    Well, this is not what the driver does - instead it sets
    mac[port].ingress = true.
    
    To get rid of the log spam, prevent unexpected data plane packets from
    being received by software by discarding them on ingress in the
    LISTENING state.
    
    In terms of blame attribution: the prints only date back to commit
    d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based
    on the VBID"). However, the settings would permit a LISTENING port to
    forward to a FORWARDING port, and the standard suggests that's not OK.
    
    Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability [+ + +]

Author: Bo-Cun Chen <[email protected]>
Date:   Tue May 13 05:27:30 2025 +0100

    net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
    
    [ Upstream commit 1bdea6fad6fb985ff13828373c48e337c4e939f9 ]
    
    Since MTK_ESW_BIT is a bit number rather than a bitmap, it causes
    MTK_HAS_CAPS to produce incorrect results. This leads to the ETH
    driver not declaring MAC capabilities correctly for the MT7988 ESW.
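    
    The difference between a bit number and a bitmap can be sketched as
    below. The DEMO_* names and bit positions are hypothetical, and the
    capability test is modelled as an "all bits of the mask are set" check,
    which is an assumption about how such a macro typically behaves.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    #define DEMO_BIT(n) (1ULL << (n))
    
    /* Hypothetical capability layout; the numbers are illustrative only. */
    enum { DEMO_GMAC1_BIT = 0, DEMO_GMAC2_BIT = 1, DEMO_ESW_BIT = 4 };
    
    #define DEMO_GMAC1 DEMO_BIT(DEMO_GMAC1_BIT)  /* bitmap form: 0x01 */
    #define DEMO_ESW   DEMO_BIT(DEMO_ESW_BIT)    /* bitmap form: 0x10 */
    
    /* True only if every bit of the mask is set in caps. */
    static bool demo_has_caps(uint64_t caps, uint64_t mask)
    {
        return (caps & mask) == mask;
    }
    ```
    
    Passing DEMO_ESW_BIT (the value 4) where DEMO_ESW (the value 0x10) is
    expected tests bit 2 instead of bit 4, so the check can fail for a
    device that has the capability and pass for one that does not.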
    
    Fixes: 445eb6448ed3 ("net: ethernet: mtk_eth_soc: add basic support for MT7988 SoC")
    Signed-off-by: Bo-Cun Chen <[email protected]>
    Signed-off-by: Daniel Golle <[email protected]>
    Reviewed-by: Michal Swiatkowski <[email protected]>
    Link: https://patch.msgid.link/b8b37f409d1280fad9c4d32521e6207f63cd3213.1747110258.git.daniel@makrotopia.org
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mctp: Don't access ifa_index when missing [+ + +]

Author: Matt Johnston <[email protected]>
Date:   Thu May 8 13:18:32 2025 +0800

    net: mctp: Don't access ifa_index when missing
    
    [ Upstream commit f11cf946c0a92c560a890d68e4775723353599e1 ]
    
    In mctp_dump_addrinfo, ifa_index can be used to filter interfaces, but
    only when the struct ifaddrmsg is provided. Otherwise it will be
    comparing to uninitialised memory - reproducible in the syzkaller case from
    dhcpd, or busybox "ip addr show".
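    
    A minimal sketch of reading the filter only when the header was
    actually supplied is shown below. The struct and function names are
    hypothetical, and falling back to 0 ("no filter") for a short payload
    is an assumption of this sketch rather than a statement about the fix.
    
    ```c
    #include <stddef.h>
    #include <string.h>
    
    /* Hypothetical request header, standing in for struct ifaddrmsg. */
    struct demo_addr_hdr {
        unsigned char family;
        int index;            /* interface filter; 0 means "all interfaces" */
    };
    
    /*
     * Return the interface index to filter on. When the payload is too
     * short to contain the header, return 0 instead of reading bytes that
     * the caller never provided.
     */
    static int demo_filter_index(const void *payload, size_t len)
    {
        struct demo_addr_hdr hdr;
    
        if (payload == NULL || len < sizeof(hdr))
            return 0;
        memcpy(&hdr, payload, sizeof(hdr));
        return hdr.index;
    }
    ```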
    
    The kernel MCTP implementation has always filtered by ifa_index, so
    existing userspace programs expecting to dump MCTP addresses must
    already be passing a valid ifa_index value (either 0 or a real index).
    
    BUG: KMSAN: uninit-value in mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128
     mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128
     rtnl_dump_all+0x3ec/0x5b0 net/core/rtnetlink.c:4380
     rtnl_dumpit+0xd5/0x2f0 net/core/rtnetlink.c:6824
     netlink_dump+0x97b/0x1690 net/netlink/af_netlink.c:2309
    
    Fixes: 583be982d934 ("mctp: Add device handling and netlink interface")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/all/[email protected]/
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Matt Johnston <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mctp: Ensure keys maintain only one ref to corresponding dev [+ + +]

Author: Andrew Jeffery <[email protected]>
Date:   Thu May 8 14:16:00 2025 +0930

    net: mctp: Ensure keys maintain only one ref to corresponding dev
    
    [ Upstream commit e4f349bd6e58051df698b82f94721f18a02a293d ]
    
    mctp_flow_prepare_output() is called in mctp_route_output(), which
    places outbound packets onto a given interface. The packet may represent
    a message fragment, in which case we provoke an unbalanced reference
    count to the underlying device. This causes trouble if we ever attempt
    to remove the interface:
    
        [   48.702195] usb 1-1: USB disconnect, device number 2
        [   58.883056] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
        [   69.022548] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
        [   79.172568] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2
        ...
    
    Predicate the invocation of mctp_dev_set_key() in
    mctp_flow_prepare_output() on not already having associated the device
    with the key. It's not yet realistic to uphold the property that the key
    maintains only one device reference earlier in the transmission sequence
    as the route (and therefore the device) may not be known at the time the
    key is associated with the socket.
    
    Fixes: 67737c457281 ("mctp: Pass flow data & flow release events to drivers")
    Acked-by: Jeremy Kerr <[email protected]>
    Signed-off-by: Andrew Jeffery <[email protected]>
    Link: https://patch.msgid.link/20250508-mctp-dev-refcount-v1-1-d4f965c67bb5@codeconstruct.com.au
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: phy: micrel: remove KSZ9477 EEE quirks now handled by phylink [+ + +]

Author: Oleksij Rempel <[email protected]>
Date:   Sun May 4 10:14:34 2025 +0200

    net: phy: micrel: remove KSZ9477 EEE quirks now handled by phylink
    
    commit 8c619eb21b8e87ae95877e9cca9fcb0e3115776e upstream.
    
    The KSZ9477 PHY driver contained workarounds for broken EEE capability
    advertisements by manually masking supported EEE modes and forcibly
    disabling EEE if MICREL_NO_EEE was set.
    
    With proper MAC-side EEE handling implemented via phylink, these quirks
    are no longer necessary. Remove MICREL_NO_EEE handling and the use of
    ksz9477_get_features().
    
    This simplifies the PHY driver and avoids duplicated EEE management logic.
    
    Signed-off-by: Oleksij Rempel <[email protected]>
    Cc: [email protected] # v6.14+
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: qede: Initialize qede_ll_ops with designated initializer [+ + +]

Author: Nathan Chancellor <[email protected]>
Date:   Wed May 7 21:47:45 2025 +0100

    net: qede: Initialize qede_ll_ops with designated initializer
    
    commit 6b3ab7f2cbfaeb6580709cd8ef4d72cfd01bfde4 upstream.
    
    After a recent change [1] in clang's randstruct implementation to
    randomize structures that only contain function pointers, there is an
    error because qede_ll_ops gets randomized but does not use a designated
    initializer for the first member:
    
      drivers/net/ethernet/qlogic/qede/qede_main.c:206:2: error: a randomized struct can only be initialized with a designated initializer
        206 |         {
            |         ^
    
    Explicitly initialize the common member using a designated initializer
    to fix the build.
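    
    As an illustration of the pattern, the sketch below initializes a
    nested ops structure with every member named. The demo_* types and
    functions are hypothetical and do not match the qede driver.
    
    ```c
    /* Hypothetical ops tables; names do not match the qede driver. */
    struct demo_common_ops {
        int (*link_update)(int up);
    };
    
    struct demo_ll_ops {
        struct demo_common_ops common;   /* first member, a nested struct */
        int (*ports_update)(int nports);
    };
    
    static int demo_link_update(int up)      { return up ? 1 : 0; }
    static int demo_ports_update(int nports) { return nports * 2; }
    
    /*
     * Every member is named, so the initializer does not depend on the
     * order in which the members end up being laid out.
     */
    static const struct demo_ll_ops demo_ops = {
        .common = {
            .link_update = demo_link_update,
        },
        .ports_update = demo_ports_update,
    };
    ```
    
    A positional initializer such as "{ { demo_link_update }, ... }" ties
    each value to a member position, which is what a layout-randomizing
    compiler rejects.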
    
    Cc: [email protected]
    Fixes: 035f7f87b729 ("randstruct: Enable Clang support")
    Link: https://github.com/llvm/llvm-project/commit/04364fb888eea6db9811510607bed4b200bcb082 [1]
    Signed-off-by: Nathan Chancellor <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net_sched: Flush gso_skb list too during ->change() [+ + +]

Author: Cong Wang <[email protected]>
Date:   Tue May 6 21:35:58 2025 -0700

    net_sched: Flush gso_skb list too during ->change()
    
    [ Upstream commit 2d3cbfd6d54a2c39ce3244f33f85c595844bd7b8 ]
    
    Previously, when reducing a qdisc's limit via the ->change() operation, only
    the main skb queue was trimmed, potentially leaving packets in the gso_skb
    list. This could result in NULL pointer dereference when we only check
    sch->limit against sch->q.qlen.
    
    This patch introduces a new helper, qdisc_dequeue_internal(), which ensures
    both the gso_skb list and the main queue are properly flushed when trimming
    excess packets. All relevant qdiscs (codel, fq, fq_codel, fq_pie, hhf, pie)
    are updated to use this helper in their ->change() routines.
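    
    The trimming logic can be sketched with plain counters, as below. This
    is a simplified model under stated assumptions: packets are represented
    only by counts, the demo_* names are hypothetical, and draining the
    requeue list before the main queue is a choice made for this sketch.
    
    ```c
    /* Hypothetical model of a qdisc; packets are represented by counts. */
    struct demo_qdisc {
        unsigned int limit;    /* configured packet limit */
        unsigned int qlen;     /* total packets accounted to the qdisc */
        unsigned int gso_len;  /* packets parked on the requeue list */
        unsigned int main_len; /* packets in the main queue */
    };
    
    /* Take one packet from either list; returns 0 if both are empty. */
    static int demo_dequeue_internal(struct demo_qdisc *q)
    {
        if (q->gso_len) {
            q->gso_len--;
            q->qlen--;
            return 1;
        }
        if (q->main_len) {
            q->main_len--;
            q->qlen--;
            return 1;
        }
        return 0;
    }
    
    /* Apply a new limit and drop the excess; returns the number dropped. */
    static unsigned int demo_change_limit(struct demo_qdisc *q,
                                          unsigned int new_limit)
    {
        unsigned int dropped = 0;
    
        q->limit = new_limit;
        while (q->qlen > q->limit) {
            if (!demo_dequeue_internal(q))
                break;  /* nothing left to take; never touch a missing packet */
            dropped++;
        }
        return dropped;
    }
    ```
    
    If the loop consulted only the main queue, a qdisc whose remaining
    packets all sit on the requeue list would satisfy qlen > limit while
    the dequeue returned nothing, which is the situation described above.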
    
    Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
    Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
    Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
    Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
    Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
    Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme")
    Reported-by: Will <[email protected]>
    Reported-by: Savy <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netlink: specs: tc: all actions are indexed arrays [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Tue May 13 15:16:38 2025 -0700

    netlink: specs: tc: all actions are indexed arrays
    
    [ Upstream commit f3dd5fb2fa494dcbdb10f8d27f2deac8ef61a2fc ]
    
    Some TC filters have actions listed as indexed arrays of nests
    and some as just nests. They are all indexed arrays, the handling
    is common across filters.
    
    Fixes: 2267672a6190 ("doc/netlink/specs: Update the tc spec")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netlink: specs: tc: fix a couple of attribute names [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Tue May 13 15:13:16 2025 -0700

    netlink: specs: tc: fix a couple of attribute names
    
    [ Upstream commit a9fb87b8b86918e34ef6bf3316311f41bc1a5b1f ]
    
    Fix up spelling of two attribute names. These are clearly typos
    and will prevent C codegen from working. Let's treat this as
    a fix to get the correction into users' hands ASAP, and prevent
    anyone depending on the wrong names.
    
    Fixes: a1bcfde83669 ("doc/netlink/specs: Add a spec for tc")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFS/localio: Fix a race in nfs_local_open_fh() [+ + +]

Author: Trond Myklebust <[email protected]>
Date:   Mon Apr 21 14:43:34 2025 -0400

    NFS/localio: Fix a race in nfs_local_open_fh()
    
    [ Upstream commit fa7ab64f1e2fdc8f2603aab8e0dd20de89cb10d9 ]
    
    Once the clp->cl_uuid.lock has been dropped, another CPU could come in
    and free the struct nfsd_file that was just added. To prevent that from
    happening, take the RCU read lock before dropping the spin lock.
    
    Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
    Signed-off-by: Trond Myklebust <[email protected]>
    Reviewed-by: Mike Snitzer <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nfs: handle failure of nfs_get_lock_context in unlock path [+ + +]

Author: Li Lingfeng <[email protected]>
Date:   Thu Apr 17 15:25:08 2025 +0800

    nfs: handle failure of nfs_get_lock_context in unlock path
    
    [ Upstream commit c457dc1ec770a22636b473ce5d35614adfe97636 ]
    
    When memory is insufficient, the allocation of nfs_lock_context in
    nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat
    an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM)
    as valid and proceed to execute rpc_run_task(), this will trigger a NULL
    pointer dereference in nfs4_locku_prepare. For example:
    
    BUG: kernel NULL pointer dereference, address: 000000000000000c
    PGD 0 P4D 0
    Oops: Oops: 0000 [#1] SMP PTI
    CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40
    Workqueue: rpciod rpc_async_schedule
    RIP: 0010:nfs4_locku_prepare+0x35/0xc2
    Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3
    RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246
    RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40
    RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38
    R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030
    R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30
    FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0
    Call Trace:
     <TASK>
     __rpc_execute+0xbc/0x480
     rpc_async_schedule+0x2f/0x40
     process_one_work+0x232/0x5d0
     worker_thread+0x1da/0x3d0
     ? __pfx_worker_thread+0x10/0x10
     kthread+0x10d/0x240
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x34/0x50
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    Modules linked in:
    CR2: 000000000000000c
    ---[ end trace 0000000000000000 ]---
    
    Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and
    return NULL to terminate subsequent rpc_run_task, preventing NULL pointer
    dereference.
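    
    The error handling described above can be sketched in userspace C as
    follows. The demo_* helpers imitate the error-pointer convention (an
    errno value encoded in a pointer); all names are hypothetical and the
    allocation path is heavily simplified.
    
    ```c
    #include <errno.h>
    #include <stdint.h>
    #include <stdlib.h>
    
    #define DEMO_MAX_ERRNO 4095
    
    /* Encode a negative errno value in a pointer. */
    static void *demo_err_ptr(long err)
    {
        return (void *)(intptr_t)err;
    }
    
    /* True if the pointer lies in the range reserved for encoded errors. */
    static int demo_is_err(const void *p)
    {
        return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
    }
    
    struct demo_lock_ctx { int owner; };
    struct demo_unlockdata { struct demo_lock_ctx *l_ctx; };
    
    /* Simulated context lookup: returns an encoded -ENOMEM when told to fail. */
    static struct demo_lock_ctx *demo_get_lock_context(int fail)
    {
        static struct demo_lock_ctx ctx = { .owner = 1 };
    
        return fail ? demo_err_ptr(-ENOMEM) : &ctx;
    }
    
    /*
     * Allocate the unlock data. If the context lookup fails, free the
     * allocation and return NULL so the caller never runs a task with an
     * error value stored where a valid pointer is expected.
     */
    static struct demo_unlockdata *demo_alloc_unlockdata(int fail)
    {
        struct demo_unlockdata *p = malloc(sizeof(*p));
    
        if (!p)
            return NULL;
        p->l_ctx = demo_get_lock_context(fail);
        if (demo_is_err(p->l_ctx)) {
            free(p);
            return NULL;
        }
        return p;
    }
    ```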
    
    Fixes: f30cb757f680 ("NFS: Always wait for I/O completion before unlock")
    Signed-off-by: Li Lingfeng <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFSv4/pnfs: Reset the layout state after a layoutreturn [+ + +]

Author: Trond Myklebust <[email protected]>
Date:   Sat May 10 10:50:13 2025 -0400

    NFSv4/pnfs: Reset the layout state after a layoutreturn
    
    [ Upstream commit 6d6d7f91cc8c111d40416ac9240a3bb9396c5235 ]
    
    If there are still layout segments in the layout plh_return_lsegs list
    after a layout return, we should be resetting the state to ensure they
    eventually get returned as well.
    
    Fixes: 68f744797edd ("pNFS: Do not free layout segments that are marked for return")
    Signed-off-by: Trond Myklebust <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable [+ + +]

Author: Keith Busch <[email protected]>
Date:   Thu May 8 16:57:06 2025 +0200

    nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable
    
    [ Upstream commit 3d8932133dcecbd9bef1559533c1089601006f45 ]
    
    We need to lock this queue for that condition because the timeout work
    executes per-namespace and can poll the poll CQ.
    
    Reported-by: Hannes Reinecke <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Fixes: a0fa9647a54e ("NVMe: add blk polling support")
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Daniel Wagner <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-pci: make nvme_pci_npages_prp() __always_inline [+ + +]

Author: Kees Cook <[email protected]>
Date:   Tue May 6 20:35:40 2025 -0700

    nvme-pci: make nvme_pci_npages_prp() __always_inline
    
    [ Upstream commit 40696426b8c8c4f13cf6ac52f0470eed144be4b2 ]
    
    The only reason nvme_pci_npages_prp() could be used as a compile-time
    known result in BUILD_BUG_ON() is because the compiler was always choosing
    to inline the function. Under special circumstances (sanitizer coverage
    functions disabled for __init functions on ARCH=um), the compiler decided
    to stop inlining it:
    
       drivers/nvme/host/pci.c: In function 'nvme_init':
       include/linux/compiler_types.h:557:45: error: call to '__compiletime_assert_678' declared with attribute error: BUILD_BUG_ON failed: nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS
         557 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
             |                                             ^
       include/linux/compiler_types.h:538:25: note: in definition of macro '__compiletime_assert'
         538 |                         prefix ## suffix();                             \
             |                         ^~~~~~
       include/linux/compiler_types.h:557:9: note: in expansion of macro '_compiletime_assert'
         557 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
             |         ^~~~~~~~~~~~~~~~~~~
       include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
          39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
             |                                     ^~~~~~~~~~~~~~~~~~
       include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
          50 |         BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
             |         ^~~~~~~~~~~~~~~~
       drivers/nvme/host/pci.c:3804:9: note: in expansion of macro 'BUILD_BUG_ON'
        3804 |         BUILD_BUG_ON(nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS);
             |         ^~~~~~~~~~~~
    
    Force it to be __always_inline to make sure it is always available for
    use with BUILD_BUG_ON().
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: c372cdd1efdf ("nvme-pci: iod npages fits in s8")
    Signed-off-by: Kees Cook <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

octeontx2-af: Fix CGX Receive counters [+ + +]

Author: Hariprasad Kelam <[email protected]>
Date:   Tue May 13 12:45:54 2025 +0530

    octeontx2-af: Fix CGX Receive counters
    
    [ Upstream commit bf449f35e77fd44017abf991fac1f9ab7705bbe0 ]
    
    Each CGX block supports 4 logical MACs (LMACS). Receive
    counters CGX_CMR_RX_STAT0-8 are per LMAC and CGX_CMR_RX_STAT9-12
    are per CGX.
    
    Due to a bug in the previous patch, stale per-CGX counter values were
    observed.
    
    Fixes: 66208910e57a ("octeontx2-af: Support to retrieve CGX LMAC stats")
    Signed-off-by: Hariprasad Kelam <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

octeontx2-pf: Do not reallocate all ntuple filters [+ + +]

Author: Subbaraya Sundeep <[email protected]>
Date:   Mon May 12 18:22:37 2025 +0530

    octeontx2-pf: Do not reallocate all ntuple filters
    
    [ Upstream commit dcb479fde00be9a151c047d0a7c0626b64eb0019 ]
    
    If the ntuple filter count is modified and then the unicast filter
    count is modified using devlink, the ntuple count set by the user is
    ignored and all the ntuple filters are reallocated. Fix this by
    storing the ntuple count set by the user. Without this patch, if the
    user sets the ntuple count to 8 and then the ucast filter count to 4
    using devlink commands, the ntuple count reverts to the default
    value of 16 instead of retaining the user-set value of 8.
    
    Fixes: 39c469188b6d ("octeontx2-pf: Add ucast filter count configurability via devlink.")
    Signed-off-by: Subbaraya Sundeep <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

octeontx2-pf: Fix ethtool support for SDP representors [+ + +]

Author: Hariprasad Kelam <[email protected]>
Date:   Mon May 12 11:59:01 2025 +0530

    octeontx2-pf: Fix ethtool support for SDP representors
    
    [ Upstream commit 314007549d89adebdd1e214a743d7e26edbd075e ]
    
    The hardware supports multiple MAC types, including RPM, SDP, and LBK.
    However, features such as link settings and pause frames are only available
    on RPM MAC, and not supported on SDP or LBK.
    
    This patch updates the ethtool operations logic accordingly to reflect
    this behavior.
    
    Fixes: 2f7f33a09516 ("octeontx2-pf: Add representors for sdp MAC")
    Signed-off-by: Hariprasad Kelam <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy [+ + +]

Author: Subbaraya Sundeep <[email protected]>
Date:   Mon May 12 18:12:36 2025 +0530

    octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
    
    [ Upstream commit 865ab2461375e3a5a2526f91f9a9f17b8931bc9e ]
    
    The MACSEC hardware block has a field called maximum transmit size for
    the TX secy. The max packet size going out of the MCS block has to be
    programmed taking into account the full packet size, which includes the
    L2 header, SecTag and ICV. The MACSEC offload driver is configuring the
    max transmit size as the macsec interface MTU, which is incorrect. Say
    with 1500 MTU of the real device, the macsec interface created on top of
    the real device will have an MTU of 1468 (1500 - (SecTag + ICV)). As a
    result, packets from the macsec interface of size greater than or equal
    to 1468 are not transmitted, because the driver programmed the max
    transmit size as 1468 instead of 1514 (1500 + ETH_HDR_LEN).
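
    The arithmetic above can be sketched in plain C. This is only an
    illustration, not the driver code: the sketch_* names are hypothetical,
    the 14-byte Ethernet header length is the usual L2 header size, and the
    32-byte SecTag + ICV overhead is inferred from the 1500 - 1468
    difference quoted in the message.

```c
#include <stdint.h>

/* Assumed constants, for illustration only. */
#define SKETCH_ETH_HDR_LEN     14u  /* Ethernet L2 header: 6 + 6 + 2 bytes */
#define SKETCH_MACSEC_OVERHEAD 32u  /* SecTag + ICV, implied by 1500 - 1468 */

/* MTU seen on the macsec interface stacked on the real device. */
static inline uint32_t sketch_macsec_mtu(uint32_t real_mtu)
{
	return real_mtu - SKETCH_MACSEC_OVERHEAD;
}

/* Full frame size that the TX secy limit has to admit. */
static inline uint32_t sketch_tx_max_size(uint32_t real_mtu)
{
	return real_mtu + SKETCH_ETH_HDR_LEN;
}
```

    With a real-device MTU of 1500, the first helper gives the value that
    was wrongly programmed and the second gives the value the hardware
    field needs.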
    
    Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
    Signed-off-by: Subbaraya Sundeep <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

perf tools: Fix build error for LoongArch [+ + +]

Author: Tiezhu Yang <[email protected]>
Date:   Tue May 20 14:30:09 2025 +0800

    perf tools: Fix build error for LoongArch
    
    There exists the following error when building perf tools on LoongArch:
    
      CC      util/syscalltbl.o
    In file included from util/syscalltbl.c:16:
    tools/perf/arch/loongarch/include/syscall_table.h:2:10: fatal error: asm/syscall_table_64.h: No such file or directory
        2 | #include <asm/syscall_table_64.h>
          |          ^~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    
    This is because the generated syscall header is syscalls_64.h rather
    than syscall_table_64.h. The problem was introduced in v6.14; the header
    syscall_table.h has since been removed from the mainline tree in commit
    af472d3c4454 ("perf syscalltbl: Remove syscall_table.h"), so fix it
    only for the linux-6.14.y branch of the stable tree.
    
    There is no need to fix the mainline tree, and there is no upstream
    git id for this patch.
    
    How to reproduce:
    
      git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
      cd linux && git checkout origin/linux-6.14.y
      make JOBS=1 -C tools/perf
    
    Fixes: fa70857a27e5 ("perf tools loongarch: Use syscall table")
    Signed-off-by: Tiezhu Yang <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

phy: Fix error handling in tegra_xusb_port_init [+ + +]

Author: Ma Ke <[email protected]>
Date:   Mon Mar 3 15:27:39 2025 +0800

    phy: Fix error handling in tegra_xusb_port_init
    
    commit b2ea5f49580c0762d17d80d8083cb89bc3acf74f upstream.
    
    If device_add() fails, do not use device_unregister() for error
    handling. device_unregister() consists of two functions: device_del() and
    put_device(). device_unregister() should only be called after
    device_add() succeeded because device_del() undoes what device_add()
    does if successful. Change device_unregister() to put_device() call
    before returning from the function.
    
    As the comment of device_add() says, 'if device_add() succeeds, you should
    call device_del() when you want to get rid of it. If device_add() has
    not succeeded, use only put_device() to drop the reference count'.
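
    The rule can be modelled in user space as below. This is a sketch under
    stated assumptions: every mock_* name and the sketch_port_init() helper
    are hypothetical stand-ins that only track a reference count and whether
    the "device" was added; they are not the driver-core API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device: one reference plus an "added" flag. */
struct mock_dev {
	int refcount;
	bool added;
	int del_calls;
};

static inline void mock_initialize(struct mock_dev *d)
{
	d->refcount = 1;	/* initialization takes the first reference */
	d->added = false;
	d->del_calls = 0;
}

static inline int mock_add(struct mock_dev *d, bool fail)
{
	if (fail)
		return -1;
	d->added = true;
	return 0;
}

static inline void mock_del(struct mock_dev *d)
{
	d->del_calls++;		/* only valid after a successful add */
	d->added = false;
}

static inline void mock_put(struct mock_dev *d)
{
	d->refcount--;
}

/* del + put: the right teardown only once add has succeeded. */
static inline void mock_unregister(struct mock_dev *d)
{
	mock_del(d);
	mock_put(d);
}

static inline int sketch_port_init(struct mock_dev *d, bool fail_add)
{
	int err;

	mock_initialize(d);
	err = mock_add(d, fail_add);
	if (err) {
		mock_put(d);	/* add failed: drop the reference only */
		return err;
	}
	return 0;
}
```

    In this model a failed add leaves del_calls at zero and the reference
    count balanced, which is the behaviour the fix restores.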
    
    Found by code review.
    
    Cc: [email protected]
    Fixes: 53d2a715c240 ("phy: Add Tegra XUSB pad controller support")
    Signed-off-by: Ma Ke <[email protected]>
    Acked-by: Thierry Reding <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind [+ + +]

Author: Claudiu Beznea <[email protected]>
Date:   Wed May 7 15:50:28 2025 +0300

    phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind
    
    commit 54c4c58713aaff76c2422ff5750e557ab3b100d7 upstream.
    
    It has been observed on the Renesas RZ/G3S SoC that unbinding and binding
    the PHY driver leads to role autodetection failures. This issue occurs when
    PHY 3 is the first initialized PHY. PHY 3 does not have an interrupt
    associated with the USB2_INT_ENABLE register (as
    rcar_gen3_int_enable[3] = 0). As a result, rcar_gen3_init_otg() is called
    to initialize OTG without enabling PHY interrupts.
    
    To resolve this, add rcar_gen3_is_any_otg_rphy_initialized() and call it in
    role_store(), role_show(), and rcar_gen3_init_otg(). At the same time,
    rcar_gen3_init_otg() is only called when initialization for a PHY with
    interrupt bits is in progress. As a result, the
    struct rcar_gen3_phy::otg_initialized is no longer needed.
    
    Fixes: 549b6b55b005 ("phy: renesas: rcar-gen3-usb2: enable/disable independent irqs")
    Cc: [email protected]
    Reviewed-by: Yoshihiro Shimoda <[email protected]>
    Tested-by: Yoshihiro Shimoda <[email protected]>
    Reviewed-by: Lad Prabhakar <[email protected]>
    Signed-off-by: Claudiu Beznea <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

phy: renesas: rcar-gen3-usb2: Set timing registers only once [+ + +]

Author: Claudiu Beznea <[email protected]>
Date:   Wed May 7 15:50:32 2025 +0300

    phy: renesas: rcar-gen3-usb2: Set timing registers only once
    
    commit 86e70849f4b2b4597ac9f7c7931f2a363774be25 upstream.
    
    phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common
    to all PHYs. There is no need to set them every time a PHY is initialized.
    Set timing register only when the 1st PHY is initialized.
    
    Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
    Cc: [email protected]
    Reviewed-by: Yoshihiro Shimoda <[email protected]>
    Tested-by: Yoshihiro Shimoda <[email protected]>
    Reviewed-by: Lad Prabhakar <[email protected]>
    Signed-off-by: Claudiu Beznea <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

phy: tegra: xusb: remove a stray unlock [+ + +]

Author: Dan Carpenter <[email protected]>
Date:   Wed Apr 23 16:08:23 2025 +0300

    phy: tegra: xusb: remove a stray unlock
    
    commit 83c178470e0bf690d34c8c08440f2421b82e881c upstream.
    
    We used to take a lock in tegra186_utmi_bias_pad_power_on() but now we
    have moved the lock into the caller.  Unfortunately, when we moved the
    lock this unlock was left behind and it results in a double unlock.
    Delete it now.
    
    Fixes: b47158fb4295 ("phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking")
    Signed-off-by: Dan Carpenter <[email protected]>
    Reviewed-by: Jon Hunter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking [+ + +]

Author: Wayne Chang <[email protected]>
Date:   Tue Apr 8 11:09:05 2025 +0800

    phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking
    
    commit b47158fb42959c417ff2662075c0d46fb783d5d1 upstream.
    
    The current implementation uses bias_pad_enable as a reference count to
    manage the shared bias pad for all UTMI PHYs. However, during system
    suspension with connected USB devices, multiple power-down requests for
    the UTMI pad result in a mismatch in the reference count, which in turn
    produces warnings such as:
    
    [  237.762967] WARNING: CPU: 10 PID: 1618 at tegra186_utmi_pad_power_down+0x160/0x170
    [  237.763103] Call trace:
    [  237.763104]  tegra186_utmi_pad_power_down+0x160/0x170
    [  237.763107]  tegra186_utmi_phy_power_off+0x10/0x30
    [  237.763110]  phy_power_off+0x48/0x100
    [  237.763113]  tegra_xusb_enter_elpg+0x204/0x500
    [  237.763119]  tegra_xusb_suspend+0x48/0x140
    [  237.763122]  platform_pm_suspend+0x2c/0xb0
    [  237.763125]  dpm_run_callback.isra.0+0x20/0xa0
    [  237.763127]  __device_suspend+0x118/0x330
    [  237.763129]  dpm_suspend+0x10c/0x1f0
    [  237.763130]  dpm_suspend_start+0x88/0xb0
    [  237.763132]  suspend_devices_and_enter+0x120/0x500
    [  237.763135]  pm_suspend+0x1ec/0x270
    
    The root cause was traced back to the dynamic power-down changes
    introduced in commit a30951d31b25 ("xhci: tegra: USB2 pad power controls"),
    where the UTMI pad was being powered down without verifying its current
    state. This unbalanced behavior led to discrepancies in the reference
    count.
    
    To rectify this issue, this patch replaces the single reference counter
    with a bitmask, renamed to utmi_pad_enabled. Each bit in the mask
    corresponds to one of the four USB2 PHYs, allowing us to track each pad's
    enablement status individually.
    
    With this change:
      - The bias pad is powered on only when the mask is clear.
      - Each UTMI pad is powered on or down based on its corresponding bit
        in the mask, preventing redundant operations.
      - The overall power state of the shared bias pad is maintained
        correctly during suspend/resume cycles.
    
    The mutex used to prevent race conditions during UTMI pad enable/disable
    operations has been moved from the tegra186_utmi_bias_pad_power_on/off
    functions to the parent functions tegra186_utmi_pad_power_on/down. This
    change ensures that there are no race conditions when updating the bitmask.
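
    A user-space model of the bitmask scheme described above might look like
    the following. It is a minimal sketch, not the driver code: the struct
    layout and sketch_* helpers are hypothetical, hardware accesses are
    replaced by counters, and the mutex that the real code holds around
    these updates is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the shared pad state. */
struct sketch_padctl {
	uint32_t utmi_pad_enabled;	/* one bit per USB2 PHY */
	bool bias_on;			/* shared bias pad state */
	int bias_on_calls;
	int bias_off_calls;
};

static inline void sketch_pad_power_on(struct sketch_padctl *p, unsigned int index)
{
	uint32_t bit = UINT32_C(1) << index;

	if (p->utmi_pad_enabled & bit)
		return;			/* already on: nothing to do */

	if (!p->utmi_pad_enabled) {	/* first user powers the bias pad */
		p->bias_on = true;
		p->bias_on_calls++;
	}
	p->utmi_pad_enabled |= bit;
}

static inline void sketch_pad_power_down(struct sketch_padctl *p, unsigned int index)
{
	uint32_t bit = UINT32_C(1) << index;

	if (!(p->utmi_pad_enabled & bit))
		return;			/* already down: ignore repeat request */

	p->utmi_pad_enabled &= ~bit;
	if (!p->utmi_pad_enabled) {	/* last user powers the bias pad off */
		p->bias_on = false;
		p->bias_off_calls++;
	}
}
```

    Unlike a plain reference count, a repeated power-down request for the
    same pad leaves the state unchanged, so it cannot unbalance the shared
    bias pad.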
    
    Cc: [email protected]
    Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls")
    Signed-off-by: Wayne Chang <[email protected]>
    Reviewed-by: Jon Hunter <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers [+ + +]

Author: Suma Hegde <[email protected]>
Date:   Fri Apr 25 10:23:57 2025 +0000

    platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers
    
    [ Upstream commit 0581d384f344ed0a963dd27cbff3c7af80c189e7 ]
    
    amd_hsmp and hsmp_acpi are intended to be mutually exclusive drivers and
    amd_hsmp is for legacy platforms. To achieve this, it is essential to
    check for the presence of the ACPI device in plat.c. If the hsmp ACPI
    device entry is found, allow the hsmp_acpi driver to manage the hsmp
    and return an error from plat.c.
    
    Additionally, rename the driver from amd_hsmp to hsmp_acpi to prevent
    "Driver 'amd_hsmp' is already registered, aborting..." error in case
    both drivers are loaded simultaneously.
    
    Also, support both platform device based and ACPI based probing for
    family 0x1A, models 0x00 to 0x0F, but implement only ACPI based probing
    for family 0x1A, models 0x10 to 0x1F. Return false from
    legacy_hsmp_support() for the latter platform.
    This aligns with the condition check in is_f1a_m0h().
    
    Link: https://lore.kernel.org/platform-driver-x86/aALZxvHWmphNL1wa@gourry-fedora-PF4VCD3F/
    Fixes: 7d3135d16356 ("platform/x86/amd/hsmp: Create separate ACPI, plat and common drivers")
    Reviewed-by: Naveen Krishna Chatradhi <[email protected]>
    Co-developed-by: Gregory Price <[email protected]>
    Signed-off-by: Gregory Price <[email protected]>
    Signed-off-by: Suma Hegde <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL) [+ + +]

Author: Runhua He <[email protected]>
Date:   Wed May 7 18:01:03 2025 +0800

    platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL)
    
    [ Upstream commit 0887817e4953885fbd6a5c1bec2fdd339261eb19 ]
    
    MECHREVO Wujie 14XA (GX4HRXL) wakes up immediately after s2idle entry.
    This happens regardless of whether the laptop is plugged into AC power,
    or whether any peripheral is plugged into the laptop.
    
    Similar to commit a55bdad5dfd1 ("platform/x86/amd/pmc: Disable keyboard
    wakeup on AMD Framework 13"), the MECHREVO Wujie 14XA wakes up almost
    instantly after s2idle suspend entry (IRQ1 is the keyboard):
    
    2025-04-18 17:23:57,588 DEBUG:  PM: Triggering wakeup from IRQ 9
    2025-04-18 17:23:57,588 DEBUG:  PM: Triggering wakeup from IRQ 1
    
    Add this model to the spurious_8042 quirk to work around this.
    
    This patch does not affect the wake-up function of the built-in keyboard,
    because the firmware of this machine adds a safeguard for keyboard
    wake-up events: it always triggers an additional IRQ 9 to wake up the
    system.
    
    Suggested-by: Mingcong Bai <[email protected]>
    Suggested-by: Xinhui Yang <[email protected]>
    Suggested-by: Rong Zhang <[email protected]>
    Fixes: a55bdad5dfd1 ("platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13")
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4166
    Cc: Mario Limonciello <[email protected]>
    Link: https://zhuanldan.zhihu.com/p/730538041
    Tested-by: Yemu Lu <[email protected]>
    Signed-off-by: Runhua He <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Thu May 1 15:17:02 2025 +0200

    platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
    
    [ Upstream commit bfcfe6d335a967f8ea0c1980960e6f0205b5de6e ]
    
    The wlan_ctrl_by_user detection was introduced by commit a50bd128f28c
    ("asus-wmi: record wlan status while controlled by userapp").
    
    Quoting from that commit's commit message:
    
    """
    When you call WMIMethod(DSTS, 0x00010011) to get WLAN status, it may return
    
    (1) 0x00050001 (On)
    (2) 0x00050000 (Off)
    (3) 0x00030001 (On)
    (4) 0x00030000 (Off)
    (5) 0x00000002 (Unknown)
    
    (1), (2) means that the model has hardware GPIO for WLAN, you can call
    WMIMethod(DEVS, 0x00010011, 1 or 0) to turn WLAN on/off.
    (3), (4) means that the model doesn’t have hardware GPIO, you need to use
    API or driver library to turn WLAN on/off, and call
    WMIMethod(DEVS, 0x00010012, 1 or 0) to set WLAN LED status.
    After you set WLAN LED status, you can see the WLAN status is changed with
    WMIMethod(DSTS, 0x00010011). Because the status is recorded lastly
    (ex: Windows), you can use it for synchronization.
    (5) means that the model doesn’t have WLAN device.
    
    WLAN is the ONLY special case with upper rule.
    """
    
    The wlan_ctrl_by_user flag should be set on 0x0003000? ((3), (4) above)
    return values, but the flag mistakenly also gets set on laptops with
    0x0005000? ((1), (2)) return values. This is causing rfkill problems on
    laptops where 0x0005000? is returned.
    
    Fix the check to only set the wlan_ctrl_by_user flag for 0x0003000?
    return values.
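
    The distinction between the two value families can be illustrated with
    a small classifier. This is a sketch only: the function name is
    hypothetical, and comparing the upper 16 bits is an assumption chosen to
    match the five quoted return values, not necessarily the bit test the
    driver uses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True only for the 0x0003000? family, i.e. cases (3) and (4) in the
 * quoted list, where WLAN is controlled by a user-space application.
 */
static inline bool sketch_wlan_ctrl_by_user(uint32_t dsts_result)
{
	return (dsts_result & UINT32_C(0xFFFF0000)) == UINT32_C(0x00030000);
}
```

    Note that 0x0005 and 0x0003 share the lowest bit of the upper half-word,
    so a test that only asks whether any of those bits is set cannot tell
    the two families apart.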
    
    Fixes: a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219786
    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Armin Wolf <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() [+ + +]

Author: Abdun Nihaal <[email protected]>
Date:   Mon May 12 10:18:27 2025 +0530

    qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()
    
    [ Upstream commit 9d8a99c5a7c7f4f7eca2c168a4ec254409670035 ]
    
    In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory
    allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is
    not freed. Fix that by jumping to the error path that frees them, by
    calling qlcnic_free_mbx_args(). This was found using static analysis.
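
    The shape of the fix is the usual single-exit cleanup pattern. The
    sketch below models it in user space; all mock_* and sketch_* names are
    hypothetical, calloc()/free() stand in for the mailbox argument
    helpers, and the error codes are arbitrary.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the mailbox command arguments. */
struct mock_cmd_args {
	int *arg;
};

static int sketch_live_allocs;	/* counts allocations not yet freed */

static inline int mock_alloc_args(struct mock_cmd_args *cmd)
{
	cmd->arg = calloc(4, sizeof(*cmd->arg));
	if (!cmd->arg)
		return -12;
	sketch_live_allocs++;
	return 0;
}

static inline void mock_free_args(struct mock_cmd_args *cmd)
{
	free(cmd->arg);
	cmd->arg = NULL;
	sketch_live_allocs--;
}

/* fail_early simulates the error path that used to return directly. */
static inline int sketch_channel_cfg(int fail_early)
{
	struct mock_cmd_args cmd;
	int ret;

	ret = mock_alloc_args(&cmd);
	if (ret)
		return ret;	/* nothing allocated, nothing to free */

	if (fail_early) {
		ret = -5;
		goto out;	/* jump to the path that frees the args */
	}

	ret = 0;
out:
	mock_free_args(&cmd);
	return ret;
}
```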
    
    Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation")
    Signed-off-by: Abdun Nihaal <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem [+ + +]

Author: Zhu Yanjun <[email protected]>
Date:   Tue May 6 17:10:08 2025 +0200

    RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem
    
    [ Upstream commit d0706bfd3ee40923c001c6827b786a309e2a8713 ]
    
    Call Trace:
    
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:408 [inline]
     print_report+0xc3/0x670 mm/kasan/report.c:521
     kasan_report+0xe0/0x110 mm/kasan/report.c:634
     strlen+0x93/0xa0 lib/string.c:420
     __fortify_strlen include/linux/fortify-string.h:268 [inline]
     get_kobj_path_length lib/kobject.c:118 [inline]
     kobject_get_path+0x3f/0x2a0 lib/kobject.c:158
     kobject_uevent_env+0x289/0x1870 lib/kobject_uevent.c:545
     ib_register_device drivers/infiniband/core/device.c:1472 [inline]
     ib_register_device+0x8cf/0xe00 drivers/infiniband/core/device.c:1393
     rxe_register_device+0x275/0x320 drivers/infiniband/sw/rxe/rxe_verbs.c:1552
     rxe_net_add+0x8e/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:550
     rxe_newlink+0x70/0x190 drivers/infiniband/sw/rxe/rxe.c:225
     nldev_newlink+0x3a3/0x680 drivers/infiniband/core/nldev.c:1796
     rdma_nl_rcv_msg+0x387/0x6e0 drivers/infiniband/core/netlink.c:195
     rdma_nl_rcv_skb.constprop.0.isra.0+0x2e5/0x450
     netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
     netlink_unicast+0x53a/0x7f0 net/netlink/af_netlink.c:1339
     netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1883
     sock_sendmsg_nosec net/socket.c:712 [inline]
     __sock_sendmsg net/socket.c:727 [inline]
     ____sys_sendmsg+0xa95/0xc70 net/socket.c:2566
     ___sys_sendmsg+0x134/0x1d0 net/socket.c:2620
     __sys_sendmsg+0x16d/0x220 net/socket.c:2652
     do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
     do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    This problem is similar to the problem that the
    commit 1d6a9e7449e2 ("RDMA/core: Fix use-after-free when rename device name")
    fixes.
    
    The root cause is that the function ib_device_rename() changes the
    device name while holding a lock, but in the function kobject_uevent()
    the name is accessed without lock protection at the same time.
    
    The solution is to add lock protection when the name is accessed in
    the function kobject_uevent().
    
    Fixes: 779e0bf47632 ("RDMA/core: Do not indicate device ready when device enablement fails")
    Link: https://patch.msgid.link/r/[email protected]
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72
    Signed-off-by: Zhu Yanjun <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug [+ + +]

Author: Zhu Yanjun <[email protected]>
Date:   Sat Apr 12 09:57:14 2025 +0200

    RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
    
    [ Upstream commit f81b33582f9339d2dc17c69b92040d3650bb4bae ]
    
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:94 [inline]
     dump_stack_lvl+0x7d/0xa0 lib/dump_stack.c:120
     print_address_description mm/kasan/report.c:378 [inline]
     print_report+0xcf/0x610 mm/kasan/report.c:489
     kasan_report+0xb5/0xe0 mm/kasan/report.c:602
     rxe_queue_cleanup+0xd0/0xe0 drivers/infiniband/sw/rxe/rxe_queue.c:195
     rxe_cq_cleanup+0x3f/0x50 drivers/infiniband/sw/rxe/rxe_cq.c:132
     __rxe_cleanup+0x168/0x300 drivers/infiniband/sw/rxe/rxe_pool.c:232
     rxe_create_cq+0x22e/0x3a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1109
     create_cq+0x658/0xb90 drivers/infiniband/core/uverbs_cmd.c:1052
     ib_uverbs_create_cq+0xc7/0x120 drivers/infiniband/core/uverbs_cmd.c:1095
     ib_uverbs_write+0x969/0xc90 drivers/infiniband/core/uverbs_main.c:679
     vfs_write fs/read_write.c:677 [inline]
     vfs_write+0x26a/0xcc0 fs/read_write.c:659
     ksys_write+0x1b8/0x200 fs/read_write.c:731
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    In the function rxe_create_cq, when rxe_cq_from_init fails, the function
    rxe_cleanup will be called to handle the allocated resources. In fact,
    some memory resources have already been freed in the function
    rxe_cq_from_init. Thus, this problem will occur.
    
    The solution is to let rxe_cleanup do all the work.
    
    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Link: https://paste.ubuntu.com/p/tJgC42wDf6/
    Tested-by: liuyi <[email protected]>
    Signed-off-by: Zhu Yanjun <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Reviewed-by: Daisuke Matsuda <[email protected]>
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

regulator: max20086: fix invalid memory access [+ + +]

Author: Cosmin Tanislav <[email protected]>
Date:   Thu May 8 09:49:43 2025 +0300

    regulator: max20086: fix invalid memory access
    
    [ Upstream commit 6b0cd72757c69bc2d45da42b41023e288d02e772 ]
    
    max20086_parse_regulators_dt() calls of_regulator_match() using an
    array of struct of_regulator_match allocated on the stack for the
    matches argument.
    
    of_regulator_match() calls devm_of_regulator_put_matches(), which calls
    devres_alloc() to allocate a struct devm_of_regulator_matches which will
    be de-allocated using devm_of_regulator_put_matches().
    
    struct devm_of_regulator_matches is populated with the stack allocated
    matches array.
    
    If the device fails to probe, devm_of_regulator_put_matches() will be
    called and will try to call of_node_put() on that stack pointer,
    generating the following dmesg entries:
    
    max20086 6-0028: Failed to read DEVICE_ID reg: -121
    kobject: '\xc0$\xa5\x03' (000000002cebcb7a): is not initialized, yet
    kobject_put() is being called.
    
    Followed by a stack trace matching the call flow described above.
    
    Switch to allocating the matches array using devm_kcalloc() to
    avoid accessing the stack pointer long after it's out of scope.
    
    This also has the advantage of allowing multiple max20086 to probe
    without overriding the data stored inside the global of_regulator_match.
    
    Fixes: bfff546aae50 ("regulator: Add MAX20086-MAX20089 driver")
    Signed-off-by: Cosmin Tanislav <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Revert "drm/amd/display: Hardware cursor changes color when switched to software cursor" [+ + +]

Author: Melissa Wen <[email protected]>
Date:   Tue Apr 22 11:58:11 2025 -0300

    Revert "drm/amd/display: Hardware cursor changes color when switched to software cursor"
    
    commit fe14c0f096f58d2569e587e9f4b05d772272bbb4 upstream.
    
    This reverts commit 272e6aab14bbf98d7a06b2b1cd6308a02d4a10a1.
    
    Applying degamma curve to the cursor by default breaks Linux userspace
    expectation.
    
    On Linux, AMD display manager enables cursor degamma ROM just for
    implicit sRGB on HW versions where degamma is split into two blocks:
    degamma ROM for pre-defined TFs and `gamma correction` for user/custom
    curves, and degamma ROM settings don't apply to cursor plane.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1513
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803
    Reported-by: Michel Dänzer <[email protected]>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4144
    Signed-off-by: Melissa Wen <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit f6a305d4748801a6c799ae9375b2ecff3aed094b)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "kbuild, rust: use -fremap-path-prefix to make paths relative" [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Sun May 11 08:02:28 2025 +0200

    Revert "kbuild, rust: use -fremap-path-prefix to make paths relative"
    
    commit 8cf5b3f836147d8d4e7c6eb4c01945b97dab8297 upstream.
    
    This reverts commit dbdffaf50ff9cee3259a7cef8a7bd9e0f0ba9f13.
    
    --remap-path-prefix breaks the ability of debuggers to find the source
    file corresponding to object files. As there is no simple or uniform
    way to specify the source directory explicitly, this breaks developers'
    workflows.
    
    Revert the unconditional usage of --remap-path-prefix, equivalent to the
    same change for -ffile-prefix-map in KBUILD_CPPFLAGS.
    
    Fixes: dbdffaf50ff9 ("kbuild, rust: use -fremap-path-prefix to make paths relative")
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Acked-by: Miguel Ojeda <[email protected]>
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ring-buffer: Fix persistent buffer when commit page is the reader page [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Tue May 13 11:50:32 2025 -0400

    ring-buffer: Fix persistent buffer when commit page is the reader page
    
    commit 1d6c39c89f617c9fec6bbae166e25b16a014f7c8 upstream.
    
    The ring buffer is made up of sub buffers (sometimes called pages as they
    are by default PAGE_SIZE). It has the following "pages":
    
      "tail page" - this is the page that the next write will write to
      "head page" - this is the page that the reader will swap the reader page with.
      "reader page" - This belongs to the reader, where it will swap the head
                      page from the ring buffer so that the reader does not
                      race with the writer.
    
    The writer may end up on the "reader page" if the ring buffer hasn't
    written more than one page, where the "tail page" and the "head page" are
    the same.
    
    The persistent ring buffer has meta data that points to where these pages
    exist so on reboot it can re-create the pointers to the cpu_buffer
    descriptor. But when the commit page is on the reader page, the logic is
    incorrect.
    
    The check to see if the commit page is on the reader page checked if the
    head page was the reader page, which would never happen, as the head page
    is always in the ring buffer. The correct check would be to test if the
    commit page is on the reader page. If that's the case, then it can exit
    out early as the commit page is only on the reader page when there's only
    one page of data in the buffer. There's no reason to iterate the ring
    buffer pages to find the "commit page" as it is already found.
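
    The corrected check can be modelled as follows. This is a simplified
    sketch, not the ring buffer code: pages are represented by integer ids,
    the ring is a plain array that never contains the reader page, and the
    function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the commit page can be located. The reader page is not
 * part of the ring, so it has to be tested before walking the ring.
 */
static inline bool sketch_commit_page_located(int reader_page, int commit_page,
					      const int *ring, size_t nr_ring)
{
	size_t i;

	/* Commit page is the reader page: already found, no walk needed. */
	if (commit_page == reader_page)
		return true;

	for (i = 0; i < nr_ring; i++) {
		if (ring[i] == commit_page)
			return true;
	}

	return false;	/* corresponds to "commit page not found" */
}
```

    Testing the head page against the reader page instead would never
    match in this model, so the walk would run and fail whenever the commit
    page is the reader page.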
    
    To trigger this bug:
    
      # echo 1 > /sys/kernel/tracing/instances/boot_mapped/events/syscalls/sys_enter_fchownat/enable
      # touch /tmp/x
      # chown sshd /tmp/x
      # reboot
    
    On boot up, the dmesg will have:
     Ring buffer meta [0] is from previous boot!
     Ring buffer meta [1] is from previous boot!
     Ring buffer meta [2] is from previous boot!
     Ring buffer meta [3] is from previous boot!
     Ring buffer meta [4] commit page not found
     Ring buffer meta [5] is from previous boot!
     Ring buffer meta [6] is from previous boot!
     Ring buffer meta [7] is from previous boot!
    
    Where the buffer on CPU 4 had a "commit page not found" error and that
    buffer is cleared and reset causing the output to be empty and the data lost.
    
    When it works correctly, it has:
    
      # cat /sys/kernel/tracing/instances/boot_mapped/trace_pipe
            <...>-1137    [004] .....   998.205323: sys_enter_fchownat: __syscall_nr=0x104 (260) dfd=0xffffff9c (4294967196) filename=(0xffffc90000a0002c) user=0x3e8 (1000) group=0xffffffff (4294967295) flag=0x0 (0
    
    Cc: [email protected]
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5f3b6e839f3ce ("ring-buffer: Validate boot range memory events")
    Reported-by: Tasos Sahanidis <[email protected]>
    Tested-by: Tasos Sahanidis <[email protected]>
    Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

riscv: dts: sophgo: fix DMA data-width configuration for CV18xx [+ + +]

Author: Ze Huang <[email protected]>
Date:   Mon Apr 28 17:24:36 2025 +0800

    riscv: dts: sophgo: fix DMA data-width configuration for CV18xx
    
    [ Upstream commit 3e6244429ba38f8dee3336b8b805948276b281ab ]
    
    The "snps,data-width" property[1] defines the AXI data width of the DMA
    controller as:
    
        width = 8 × (2^n) bits
    
    (0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits)
    where "n" is the value of "snps,data-width".
    
    For the CV18xx DMA controller, the correct AXI data width is 32 bits,
    corresponding to "snps,data-width = 2".
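
    The encoding above is easy to check with a one-line helper. The
    function name is hypothetical and the sketch assumes n stays within the
    0 to 6 range listed in the message.

```c
#include <stdint.h>

/* AXI data width in bits for a given "snps,data-width" value n (0..6). */
static inline uint32_t sketch_axi_width_bits(uint32_t n)
{
	return UINT32_C(8) << n;	/* 8 * 2^n */
}
```

    For n = 2 this yields 32 bits, matching the CV18xx DMA controller.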
    
    Test results on Milkv Duo S can be found here [2].
    
    Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1]
    Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2]
    
    Fixes: 514951a81a5e ("riscv: dts: sophgo: cv18xx: add DMA controller")
    Co-developed-by: Yu Yuan <[email protected]>
    Signed-off-by: Yu Yuan <[email protected]>
    Signed-off-by: Ze Huang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Inochi Amaoto <[email protected]>
    Signed-off-by: Chen Wang <[email protected]>
    Signed-off-by: Chen Wang <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator [+ + +]

Author: Tejun Heo <[email protected]>
Date:   Mon May 5 11:30:39 2025 -1000

    sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator
    
    commit 428dc9fc0873989d73918d4a9cc22745b7bbc799 upstream.
    
    BPF programs may call next() and destroy() on BPF iterators even after new()
    returns an error value (e.g. bpf_for_each() macro ignores error returns from
    new()). bpf_iter_scx_dsq_new() could leave the iterator in an uninitialized
    state after an error return, causing bpf_iter_scx_dsq_next() to dereference
    garbage data. Make bpf_iter_scx_dsq_new() always clear $kit->dsq so that
    next() and destroy() become no-ops.
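    
    The pattern can be modelled outside the kernel. The following Python
    sketch is not the kernel code: DsqIter, iter_new(), iter_next() and
    iter_destroy() are hypothetical stand-ins, and the string "garbage"
    plays the role of uninitialized memory. It only shows why clearing the
    DSQ reference before any error return makes later calls harmless.

```python
import errno


class DsqIter:
    """Toy iterator state; 'dsq' starts out as stand-in garbage."""

    def __init__(self):
        self.dsq = "garbage"
        self.pos = 0


def iter_new(it, dsqs, dsq_id):
    # Clear the state first, so every error return leaves a safe iterator.
    it.dsq = None
    it.pos = 0
    if dsq_id not in dsqs:
        return -errno.ENOENT
    it.dsq = dsqs[dsq_id]
    return 0


def iter_next(it):
    # With dsq cleared, next() is a no-op that reports "no more items".
    if it.dsq is None or it.pos >= len(it.dsq):
        return None
    item = it.dsq[it.pos]
    it.pos += 1
    return item


def iter_destroy(it):
    # With dsq cleared, destroy() has nothing to release.
    if it.dsq is None:
        return
    it.dsq = None
```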
    
    Signed-off-by: Tejun Heo <[email protected]>
    Fixes: 650ba21b131e ("sched_ext: Implement DSQ iterator")
    Cc: [email protected] # v6.12+
    Acked-by: Andrea Righi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer [+ + +]

Author: Steve Siwinski <[email protected]>
Date:   Thu May 8 16:01:22 2025 -0400

    scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer
    
    commit e8007fad5457ea547ca63bb011fdb03213571c7e upstream.
    
    The REPORT ZONES buffer size is currently limited by the HBA's maximum
    segment count to ensure the buffer can be mapped. However, the block
    layer further limits the number of iovec entries to 1024 when allocating
    a bio.
    
    To avoid allocation of buffers too large to be mapped, further restrict
    the maximum buffer size to BIO_MAX_INLINE_VECS.
    
    Replace the UIO_MAXIOV symbolic name with the more contextually
    appropriate BIO_MAX_INLINE_VECS.
    
    Fixes: b091ac616846 ("sd_zbc: Fix report zones buffer allocation")
    Cc: [email protected]
    Signed-off-by: Steve Siwinski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Damien Le Moal <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

smb: client: fix memory leak during error handling for POSIX mkdir [+ + +]

Author: Jethro Donaldson <[email protected]>
Date:   Thu May 15 01:23:23 2025 +1200

    smb: client: fix memory leak during error handling for POSIX mkdir
    
    commit 1fe4a44b7fa3955bcb7b4067c07b778fe90d8ee7 upstream.
    
    The response buffer for the CREATE request handled by smb311_posix_mkdir()
    is leaked on the error path (goto err_free_rsp_buf) because the structure
    pointer *rsp passed to free_rsp_buf() is not assigned until *after* the
    error condition is checked.
    
    As *rsp is initialised to NULL, free_rsp_buf() becomes a no-op and the leak
    is instead reported by __kmem_cache_shutdown() upon subsequent rmmod of
    cifs.ko if (and only if) the error path has been hit.
    
    Pass rsp_iov.iov_base to free_rsp_buf() instead, similar to the code in
    other functions in smb2pdu.c for which *rsp is assigned late.
    
    Cc: [email protected]
    Signed-off-by: Jethro Donaldson <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spi: loopback-test: Do not split 1024-byte hexdumps [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Fri May 2 13:10:35 2025 +0200

    spi: loopback-test: Do not split 1024-byte hexdumps
    
    [ Upstream commit a73fa3690a1f3014d6677e368dce4e70767a6ba2 ]
    
    spi_test_print_hex_dump() prints buffers holding less than 1024 bytes in
    full.  Larger buffers are truncated: only the first 512 and the last 512
    bytes are printed, separated by a truncation message.  The latter is
    confusing in case the buffer holds exactly 1024 bytes, as all data is
    printed anyway.
    
    Fix this by printing buffers holding up to and including 1024 bytes in
    full.
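    
    The boundary condition can be illustrated with a small Python model.
    This is a sketch of the described behaviour, not the driver code;
    hexdump_ranges() is a hypothetical helper, while the 1024-byte limit
    and the 512-byte head and tail come from the description above.

```python
def hexdump_ranges(length, limit=1024, chunk=512):
    """Return the (start, end) byte ranges that would be printed."""
    # Buffers up to and including 'limit' bytes are printed in full.
    if length <= limit:
        return [(0, length)]
    # Larger buffers: first and last 'chunk' bytes, truncation in between.
    return [(0, chunk), (length - chunk, length)]


if __name__ == "__main__":
    for size in (100, 1024, 1025, 4096):
        print(size, hexdump_ranges(size))
```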
    
    Fixes: 84e0c4e5e2c4ef42 ("spi: add loopback test driver to allow for spi_master regression tests")
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/37ee1bc90c6554c9347040adabf04188c8f704aa.1746184171.git.geert+renesas@glider.be
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: tegra114: Use value to check for invalid delays [+ + +]

Author: Aaron Kling <[email protected]>
Date:   Tue May 6 13:36:59 2025 -0500

    spi: tegra114: Use value to check for invalid delays
    
    commit e979a7c79fbc706f6dac913af379ef4caa04d3d5 upstream.
    
    A delay unit of 0 is a valid entry, so the unit field cannot be used to
    detect unused delays. Instead, check the value field; if that is zero,
    the given delay is unset.
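    
    A rough Python model of the check follows. It is an illustration only:
    SpiDelay and delay_is_invalid() are hypothetical, the unit constants are
    assumed to mirror the kernel's SPI_DELAY_UNIT_* values, and the rule
    that a set delay must use clock-cycle units is an assumption about the
    driver that the commit message does not state.

```python
from dataclasses import dataclass

# Assumed to mirror the kernel constants; note that microseconds is 0.
SPI_DELAY_UNIT_USECS = 0
SPI_DELAY_UNIT_SCK = 2


@dataclass
class SpiDelay:
    value: int
    unit: int


def delay_is_invalid(delay):
    # An unset delay (value == 0) is always accepted, whatever its unit.
    # A set delay is rejected unless it is in the assumed required unit.
    return delay.value != 0 and delay.unit != SPI_DELAY_UNIT_SCK
```

    Testing the unit field instead would treat a real delay given in
    microseconds (unit 0) as unset, which is the situation the commit
    describes.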
    
    Fixes: 4426e6b4ecf6 ("spi: tegra114: Don't fail set_cs_timing when delays are zero")
    Cc: [email protected]
    Signed-off-by: Aaron Kling <[email protected]>
    Reviewed-by: Jon Hunter <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tests/ncdevmem: Fix double-free of queue array [+ + +]

Author: Cosmin Ratiu <[email protected]>
Date:   Thu May 8 11:44:34 2025 +0300

    tests/ncdevmem: Fix double-free of queue array
    
    [ Upstream commit 97c4e094a4b2edbb4fffeda718f8e806f825a18f ]
    
    netdev_bind_rx takes ownership of the queue array passed as parameter
    and frees it, so a queue array buffer cannot be reused across multiple
    netdev_bind_rx calls.
    
    This commit fixes that by always passing in a newly created queue array
    to all netdev_bind_rx calls in ncdevmem.
    
    Fixes: 85585b4bc8d8 ("selftests: add ncdevmem, netcat for devmem TCP")
    Signed-off-by: Cosmin Ratiu <[email protected]>
    Acked-by: Stanislav Fomichev <[email protected]>
    Reviewed-by: Joe Damato <[email protected]>
    Reviewed-by: Mina Almasry <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools/net/ynl: ethtool: fix crash when Hardware Clock info is missing [+ + +]

Author: Hangbin Liu <[email protected]>
Date:   Thu May 8 03:54:14 2025 +0000

    tools/net/ynl: ethtool: fix crash when Hardware Clock info is missing
    
    [ Upstream commit 45375814eb3f4245956c0c85092a4eee4441d167 ]
    
    Fix a crash in the ethtool YNL implementation when Hardware Clock information
    is not present in the response. This ensures graceful handling of devices or
    drivers that do not provide this optional field. For example:
    
      Traceback (most recent call last):
        File "/net/tools/net/ynl/pyynl/./ethtool.py", line 438, in <module>
          main()
          ~~~~^^
        File "/net/tools/net/ynl/pyynl/./ethtool.py", line 341, in main
          print(f'PTP Hardware Clock: {tsinfo["phc-index"]}')
                                       ~~~~~~^^^^^^^^^^^^^
      KeyError: 'phc-index'
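    
    One way to guard the lookup is to test for the key before formatting
    it. The sketch below is not necessarily the upstream change:
    format_phc() is a hypothetical helper and the "none" fallback text is
    an assumption.

```python
def format_phc(tsinfo):
    """Format the PTP Hardware Clock line, tolerating a missing key."""
    # 'phc-index' is optional in the response, so check before indexing.
    if "phc-index" in tsinfo:
        return f'PTP Hardware Clock: {tsinfo["phc-index"]}'
    return "PTP Hardware Clock: none"


if __name__ == "__main__":
    print(format_phc({"phc-index": 0}))
    print(format_phc({}))
```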
    
    Fixes: f3d07b02b2b8 ("tools: ynl: ethtool testing tool")
    Signed-off-by: Hangbin Liu <[email protected]>
    Acked-by: Stanislav Fomichev <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tpm: Mask TPM RC in tpm2_start_auth_session() [+ + +]

Author: Jarkko Sakkinen <[email protected]>
Date:   Mon Apr 7 15:28:05 2025 +0300

    tpm: Mask TPM RC in tpm2_start_auth_session()
    
    commit 539fbab37881e32ba6a708a100de6db19e1e7e7d upstream.
    
    tpm2_start_auth_session() does not mask TPM RC correctly from the callers:
    
    [   28.766528] tpm tpm0: A TPM error (2307) occurred start auth session
    
    Process TPM RCs inside tpm2_start_auth_session(), and map them to POSIX
    error codes.
    
    Cc: [email protected] # v6.10+
    Fixes: 699e3efd6c64 ("tpm: Add HMAC session start and end functions")
    Reported-by: Herbert Xu <[email protected]>
    Closes: https://lore.kernel.org/linux-integrity/[email protected]/
    Reviewed-by: Stefano Garzarella <[email protected]>
    Signed-off-by: Jarkko Sakkinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tpm: tis: Double the timeout B to 4s [+ + +]

Author: Michal Suchanek <[email protected]>
Date:   Fri Apr 4 10:23:14 2025 +0200

    tpm: tis: Double the timeout B to 4s
    
    [ Upstream commit 2f661f71fda1fc0c42b7746ca5b7da529eb6b5be ]
    
    With some Infineon chips the timeouts in tpm_tis_send_data (both B and
    C) can reach up to about 2250 ms.
    
    Timeout C is retried since
    commit de9e33df7762 ("tpm, tpm_tis: Workaround failed command reception on Infineon devices")
    
    Timeout B still needs to be extended.
    
    The problem is most commonly encountered with context-related operations
    such as load context/save context. These are issued directly by the
    kernel, and there is no retry logic for them.
    
    When a filesystem is set up to use the TPM for unlocking, the boot fails,
    and restarting the userspace service is ineffective. This is likely
    because ignoring a load context/save context result puts the real TPM
    state and the TPM state expected by the kernel out of sync.
    
    Chips known to be affected:
    tpm_tis IFX1522:00: 2.0 TPM (device-id 0x1D, rev-id 54)
    Description: SLB9672
    Firmware Revision: 15.22
    
    tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22)
    Firmware Revision: 7.83
    
    tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1A, rev-id 16)
    Firmware Revision: 5.63
    
    Link: https://lore.kernel.org/linux-integrity/[email protected]/
    Signed-off-by: Michal Suchanek <[email protected]>
    Reviewed-by: Jarkko Sakkinen <[email protected]>
    Signed-off-by: Jarkko Sakkinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tracing: fprobe: Fix RCU warning message in list traversal [+ + +]

Author: Breno Leitao <[email protected]>
Date:   Thu Apr 10 05:22:21 2025 -0700

    tracing: fprobe: Fix RCU warning message in list traversal
    
    [ Upstream commit 9dda18a32b4a6693fccd3f7c0738af646147b1cf ]
    
    When CONFIG_PROVE_RCU_LIST is enabled, fprobe triggers the following
    warning:
    
        WARNING: suspicious RCU usage
        kernel/trace/fprobe.c:457 RCU-list traversed in non-reader section!!
    
        other info that might help us debug this:
            #1: ffffffff863c4e08 (fprobe_mutex){+.+.}-{4:4}, at: fprobe_module_callback+0x7b/0x8c0
    
        Call Trace:
            fprobe_module_callback
            notifier_call_chain
            blocking_notifier_call_chain
    
    This warning occurs because fprobe_remove_node_in_module() traverses an
    RCU list using RCU primitives without holding an RCU read lock. However,
    the function is only called from fprobe_module_callback(), which holds
    the fprobe_mutex lock that provides sufficient protection for safely
    traversing the list.
    
    Fix the warning by specifying the locking design to the
    CONFIG_PROVE_RCU_LIST mechanism. Add the lockdep_is_held() argument to
    hlist_for_each_entry_rcu() to inform the RCU checker that fprobe_mutex
    provides the required protection.
    
    Link: https://lore.kernel.org/all/[email protected]/
    
    Fixes: a3dc2983ca7b90 ("tracing: fprobe: Cleanup fprobe hash when module unloading")
    Signed-off-by: Breno Leitao <[email protected]>
    Tested-by: Antonio Quartulli <[email protected]>
    Tested-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tracing: probes: Fix a possible race in trace_probe_log APIs [+ + +]

Author: Masami Hiramatsu (Google) <[email protected]>
Date:   Sat May 10 12:44:41 2025 +0900

    tracing: probes: Fix a possible race in trace_probe_log APIs
    
    [ Upstream commit fd837de3c9cb1a162c69bc1fb1f438467fe7f2f5 ]
    
    Since the shared trace_probe_log variable can be accessed and
    modified via probe event create operation of kprobe_events,
    uprobe_events, and dynamic_events, it should be protected.
    In the dynamic_events, all operations are serialized by
    `dyn_event_ops_mutex`. But kprobe_events and uprobe_events
    interfaces are not serialized.
    
    To solve this issue, introduce dyn_event_create(), which runs the
    create() operation under the mutex, for kprobe_events and
    uprobe_events. This also uses lockdep to check that the mutex is
    held when using trace_probe_log* APIs.
    
    Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@devnote2/
    
    Reported-by: Paul Cacheux <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events")
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tracing: samples: Initialize trace_array_printk() with the correct function [+ + +]

Author: Steven Rostedt <[email protected]>
Date:   Fri May 9 15:26:57 2025 -0400

    tracing: samples: Initialize trace_array_printk() with the correct function
    
    commit 1b0c192c92ea1fe2dcb178f84adf15fe37c3e7c8 upstream.
    
    When using trace_array_printk() on a created instance, the correct
    function to use to initialize it is:
    
      trace_array_init_printk()
    
    Not
    
      trace_printk_init_buffer()
    
    The former is a proper function to use, the latter is for initializing
    trace_printk() and causes the NOTICE banner to be displayed.
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Divya Indi <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.")
    Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tsnep: fix timestamping with a stacked DSA driver [+ + +]

Author: Gerhard Engleder <[email protected]>
Date:   Wed May 14 21:56:57 2025 +0200

    tsnep: fix timestamping with a stacked DSA driver
    
    [ Upstream commit b3ca9eef6646576ad506a96d941d87a69f66732a ]
    
    This driver is susceptible to a form of the bug explained in commit
    c26a2c2ddc01 ("gianfar: Fix TX timestamping with a stacked DSA driver")
    and in Documentation/networking/timestamping.rst section "Other caveats
    for MAC drivers", specifically it timestamps any skb which has
    SKBTX_HW_TSTAMP, and does not consider if timestamping has been enabled
    in adapter->hwtstamp_config.tx_type.
    
    Evaluate the proper TX timestamping condition only once on the TX
    path (in tsnep_xmit_frame_ring()) and store the result in an additional
    TX entry flag. Evaluate the new TX entry flag in the TX confirmation path
    (in tsnep_tx_poll()).
    
    This way SKBTX_IN_PROGRESS is set by the driver as required, but never
    evaluated. SKBTX_IN_PROGRESS shall not be evaluated as it can be set
    by a stacked DSA driver and evaluating it would lead to unwanted
    timestamps.
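    
    The "decide once, store a flag" approach can be sketched as a Python
    model. Nothing here is driver code: Skb, TxEntry, xmit(), tx_poll() and
    the do_tstamp field are hypothetical, and the flag values are
    illustrative.

```python
from dataclasses import dataclass

# Illustrative flag bits for this model only.
SKBTX_HW_TSTAMP = 1 << 0
SKBTX_IN_PROGRESS = 1 << 2


@dataclass
class Skb:
    tx_flags: int = 0


@dataclass
class TxEntry:
    skb: Skb
    do_tstamp: bool = False  # hypothetical per-entry flag


def xmit(skb, tx_timestamping_enabled):
    """TX path: evaluate the timestamping condition exactly once."""
    entry = TxEntry(skb)
    if (skb.tx_flags & SKBTX_HW_TSTAMP) and tx_timestamping_enabled:
        skb.tx_flags |= SKBTX_IN_PROGRESS  # set, but never read back
        entry.do_tstamp = True
    return entry


def tx_poll(entry):
    """TX confirmation path: rely only on the stored entry flag."""
    return entry.do_tstamp
```

    In this model an skb that already carries SKBTX_IN_PROGRESS from a
    stacked driver is not timestamped while timestamping is disabled,
    because tx_poll() never looks at that bit.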
    
    Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver")
    Suggested-by: Vladimir Oltean <[email protected]>
    Signed-off-by: Gerhard Engleder <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ublk: fix dead loop when canceling io command [+ + +]

Author: Ming Lei <[email protected]>
Date:   Fri May 16 00:26:01 2025 +0800

    ublk: fix dead loop when canceling io command
    
    [ Upstream commit dd24f87f65c957f30e605e44961d2fd53a44c780 ]
    
    Commit:
    
    f40139fde527 ("ublk: fix race between io_uring_cmd_complete_in_task and
                    ublk_cancel_cmd")
    
    adds a request state check in ublk_cancel_cmd(), and if the request is
    started, skips canceling this uring_cmd.
    
    However, the current uring_cmd may be in ACTIVE state, without block
    request coming to the uring command. Meanwhile, if the cached request in
    tag_set.tags[tag] has been delivered to ublk server and recycled, then
    this uring_cmd can't be canceled.
    
    ublk requests are aborted in ublk char device release handler, which
    depends on canceling all ACTIVE uring_cmd. So it causes a dead loop.
    
    Fix this issue by not taking a stale request into account when canceling
    uring_cmd in ublk_cancel_cmd().
    
    Reported-by: Shinichiro Kawasaki <[email protected]>
    Closes: https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/
    Fixes: f40139fde527 ("ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd")
    Signed-off-by: Ming Lei <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [axboe: rewording of commit message]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

udf: Make sure i_lenExtents is uptodate on inode eviction [+ + +]

Author: Jan Kara <[email protected]>
Date:   Wed May 7 11:49:41 2025 +0200

    udf: Make sure i_lenExtents is uptodate on inode eviction
    
    commit 55dd5b4db3bf04cf077a8d1712f6295d4517c337 upstream.
    
    UDF maintains total length of all extents in i_lenExtents. Generally we
    keep extent lengths (and thus i_lenExtents) block aligned because it
    makes the file appending logic simpler. However the standard mandates
    that the inode size must match the length of all extents and thus we
    trim the last extent when closing the file. To catch possible bugs we
    also verify that i_lenExtents matches i_size when evicting inode from
    memory. Commit b405c1e58b73 ("udf: refactor udf_next_aext() to handle
    error") however broke the code updating i_lenExtents and thus
    udf_evict_inode() ended up spewing lots of errors about incorrectly
    sized extents although the extents were actually sized properly. Fix the
    updating of i_lenExtents to silence the errors.
    
    Fixes: b405c1e58b73 ("udf: refactor udf_next_aext() to handle error")
    CC: [email protected]
    Signed-off-by: Jan Kara <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

vsock/test: Fix occasional failure in SIOCOUTQ tests [+ + +]

Author: Konstantin Shkolnyy <[email protected]>
Date:   Wed May 7 10:14:56 2025 -0500

    vsock/test: Fix occasional failure in SIOCOUTQ tests
    
    [ Upstream commit 7fd7ad6f36af36f30a06d165eff3780cb139fa79 ]
    
    These tests:
        "SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes"
        "SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes"
    output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)".
    
    They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data
    have been received by the other side. However, sometimes there is a delay
    in updating this "unsent bytes" counter, and the test fails even though
    the counter properly goes to 0 several milliseconds later.
    
    The delay occurs in the kernel because the used buffer notification
    callback virtio_vsock_tx_done(), called upon receipt of the data by the
    other side, doesn't update the counter itself. It delegates that to
    a kernel thread (via vsock->tx_work). Sometimes that thread is delayed
    more than the test expects.
    
    Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs.
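    
    The polling strategy can be sketched in Python. This is a generic
    illustration rather than the test's C code: wait_for_zero() is a
    hypothetical helper, read_unsent stands in for the SIOCOUTQ ioctl, and
    the timeout and interval values are assumptions.

```python
import time


def wait_for_zero(read_unsent, timeout_s=1.0, interval_s=0.01):
    """Poll read_unsent() until it returns 0 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_unsent() == 0:
            return True
        # Give up once the deadline has passed; otherwise retry shortly.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    # Simulated counter that drops to 0 on the third read.
    samples = iter([64, 64, 0])
    print(wait_for_zero(lambda: next(samples)))
```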
    
    Signed-off-by: Konstantin Shkolnyy <[email protected]>
    Reviewed-by: Stefano Garzarella <[email protected]>
    Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request [+ + +]

Author: Kees Cook <[email protected]>
Date:   Fri May 9 11:46:45 2025 -0700

    wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
    
    [ Upstream commit 82bbe02b2500ef0a62053fe2eb84773fe31c5a0a ]
    
    Make sure that n_channels is set after allocating the
    struct cfg80211_registered_device::int_scan_req member. Seen with
    syzkaller:
    
    UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:1208:5
    index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')
    
    This was missed in the initial conversions because I failed to locate
    the allocation likely due to the "sizeof(void *)" not matching the
    "channels" array type.
    
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Fixes: e3eac9f32ec0 ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by")
    Signed-off-by: Kees Cook <[email protected]>
    Reviewed-by: Gustavo A. R. Silva <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mt76: disable napi on driver removal [+ + +]

Author: Fedor Pchelkin <[email protected]>
Date:   Tue May 6 14:55:39 2025 +0300

    wifi: mt76: disable napi on driver removal
    
    commit 78ab4be549533432d97ea8989d2f00b508fa68d8 upstream.
    
    A warning on driver removal started occurring after commit 9dd05df8403b
    ("net: warn if NAPI instance wasn't shut down"). Disable tx napi before
    deleting it in mt76_dma_cleanup().
    
     WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100
     CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy)
     Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024
     RIP: 0010:__netif_napi_del_locked+0xf0/0x100
     Call Trace:
     <TASK>
     mt76_dma_cleanup+0x54/0x2f0 [mt76]
     mt7921_pci_remove+0xd5/0x190 [mt7921e]
     pci_device_remove+0x47/0xc0
     device_release_driver_internal+0x19e/0x200
     driver_detach+0x48/0x90
     bus_remove_driver+0x6d/0xf0
     pci_unregister_driver+0x2e/0xb0
     __do_sys_delete_module.isra.0+0x197/0x2e0
     do_syscall_64+0x7b/0x160
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Tested with mt7921e but the same pattern can be actually applied to other
    mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled
    in their *_dma_init() functions and only toggled off and on again inside
    their suspend/resume/reset paths. So it should be okay to disable tx
    napi in such a generic way.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    Fixes: 2ac515a5d74f ("mt76: mt76x02: use napi polling for tx cleanup")
    Cc: [email protected]
    Signed-off-by: Fedor Pchelkin <[email protected]>
    Tested-by: Ming Yen Hsieh <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl [+ + +]

Author: Ming Yen Hsieh <[email protected]>
Date:   Fri May 9 09:04:20 2025 +0800

    wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
    
    commit 0aa8496adda570c2005410a30df963a16643a3dc upstream.
    
    Ensure that the hdr_trans_tlv command is included in the broadcast wtbl to
    prevent the IPv6 and multicast packet from being dropped by the chip.
    
    Cc: [email protected]
    Fixes: cb1353ef3473 ("wifi: mt76: mt7925: integrate *mlo_sta_cmd and *sta_cmd")
    Reported-by: Benjamin Xiao <[email protected]>
    Tested-by: Niklas Schnelle <[email protected]>
    Signed-off-by: Ming Yen Hsieh <[email protected]>
    Link: https://lore.kernel.org/lkml/EmWnO5b-acRH1TXbGnkx41eJw654vmCR-8_xMBaPMwexCnfkvKCdlU5u19CGbaapJ3KRu-l3B-tSUhf8CCQwL0odjo6Cd5YG5lvNeB-vfdg=@pm.me/
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/amd_node, platform/x86/amd/hsmp: Have HSMP use SMN through AMD_NODE [+ + +]

Author: Yazen Ghannam <[email protected]>
Date:   Thu Jan 30 19:48:55 2025 +0000

    x86/amd_node, platform/x86/amd/hsmp: Have HSMP use SMN through AMD_NODE
    
    [ Upstream commit 735049b801cf3d597752017385cfc8768ce44303 ]
    
    The HSMP interface is just an SMN interface with different offsets.
    
    Define an HSMP wrapper in the SMN code and have the HSMP platform driver
    use that rather than a local solution.
    
    Also, remove the "root" member from AMD_NB, since there are no more
    users of it.
    
    Signed-off-by: Yazen Ghannam <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Carlos Bilbao <[email protected]>
    Acked-by: Ilpo Järvinen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 0581d384f344 ("platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers")
    Signed-off-by: Sasha Levin <[email protected]>

x86/sev: Do not touch VMSA pages during SNP guest memory kdump [+ + +]

Author: Ashish Kalra <[email protected]>
Date:   Mon Apr 28 21:41:51 2025 +0000

    x86/sev: Do not touch VMSA pages during SNP guest memory kdump
    
    commit d2062cc1b1c367d5d019f595ef860159e1301351 upstream.
    
    When kdump is running makedumpfile to generate vmcore and dump SNP guest
    memory it touches the VMSA page of the vCPU executing kdump.
    
    It then results in unrecoverable #NPF/RMP faults as the VMSA page is
    marked busy/in-use when the vCPU is running, and subsequently causes a
    guest softlockup/hang.
    
    Additionally, other APs may be halted in guest mode and their VMSA pages
    are marked busy and touching these VMSA pages during guest memory dump
    will also cause #NPF.
    
    Issue AP_DESTROY GHCB calls on other APs to ensure they are kicked out
    of guest mode and then clear the VMSA bit on their VMSA pages.
    
    If the vCPU running kdump is an AP, mark its VMSA page as offline to
    ensure that makedumpfile excludes that page while dumping guest memory.
    
    Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec")
    Signed-off-by: Ashish Kalra <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Pankaj Gupta <[email protected]>
    Reviewed-by: Tom Lendacky <[email protected]>
    Tested-by: Srikanth Aithal <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

x86/sev: Make sure pages are not skipped during kdump [+ + +]

Author: Ashish Kalra <[email protected]>
Date:   Tue May 6 18:35:29 2025 +0000

    x86/sev: Make sure pages are not skipped during kdump
    
    commit 82b7f88f2316c5442708daeb0b5ec5aa54c8ff7f upstream.
    
    When shared pages are being converted to private during kdump, additional
    checks are performed. They include handling the case of a GHCB page being
    contained within a huge page.
    
    Currently, this check incorrectly skips a page just below the GHCB page from
    being transitioned back to private during kdump preparation.
    
    This skipped page causes a 0x404 #VC exception when it is accessed later while
    dumping guest memory for vmcore generation.
    
    Correct the range to be checked for GHCB contained in a huge page.  Also,
    ensure that the skipped huge page containing the GHCB page is transitioned
    back to private by applying the correct address mask later when changing GHCBs
    to private at end of kdump preparation.
    
      [ bp: Massage commit message. ]
    
    Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec")
    Signed-off-by: Ashish Kalra <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Tom Lendacky <[email protected]>
    Tested-by: Srikanth Aithal <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>