Changelog for Linux kernel 6.11.3

 
accel/ivpu: Add missing MODULE_FIRMWARE metadata
Author: Alexander F. Lent <[email protected]>
Date:   Tue Jul 9 07:54:14 2024 -0400

    accel/ivpu: Add missing MODULE_FIRMWARE metadata
    
    [ Upstream commit 58b5618ba80a5e5a8d531a70eae12070e5bd713f ]
    
    Modules that load firmware from various paths at runtime must declare
    those paths at compile time, via the MODULE_FIRMWARE macro, so that the
    firmware paths are included in the module's metadata.
    
    The accel/ivpu driver loads firmware but lacks this metadata,
    preventing dracut from correctly locating firmware files. Fix it.
    
    Fixes: 9ab43e95f922 ("accel/ivpu: Switch to generation based FW names")
    Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
    Signed-off-by: Alexander F. Lent <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240709-fix-ivpu-firmware-metadata-v3-1-55f70bba055b@xanderlent.com
    Signed-off-by: Sasha Levin <[email protected]>
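
As a rough illustration of why the declaration matters: MODULE_FIRMWARE() entries end up as `firmware=` strings in the module's NUL-separated `.modinfo` metadata, which is what tools such as `modinfo -F firmware` read. The Python sketch below models that lookup on a hand-made blob. The blob contents and the firmware file names are hypothetical placeholders, not the ivpu driver's real paths.

```python
def firmware_entries(modinfo_blob):
    """Return the values of all firmware= entries in a .modinfo-style blob.

    The blob is modeled as NUL-separated "key=value" byte strings.
    """
    entries = []
    for raw in modinfo_blob.split(b"\0"):
        if not raw:
            continue  # skip empty fields between or after separators
        key, sep, value = raw.decode("ascii").partition("=")
        if sep and key == "firmware":
            entries.append(value)
    return entries


# Hypothetical metadata: a module that declares two firmware files ...
WITH_METADATA = (
    b"license=GPL\0"
    b"firmware=example/fw_gen_a.bin\0"
    b"firmware=example/fw_gen_b.bin\0"
)
# ... and one that loads firmware at runtime but never declares it.
WITHOUT_METADATA = b"license=GPL\0"

print(firmware_entries(WITH_METADATA))
print(firmware_entries(WITHOUT_METADATA))
```

A tool that builds an initramfs from this metadata alone has nothing to copy in the second case, which matches the failure described above.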

 
ACPI: battery: Fix possible crash when unregistering a battery hook
Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:34 2024 +0200

    ACPI: battery: Fix possible crash when unregistering a battery hook
    
    [ Upstream commit 76959aff14a0012ad6b984ec7686d163deccdc16 ]
    
    When a battery hook returns an error while a new battery is being
    added, the battery hook is automatically unregistered.
    However, the battery hook provider cannot know that, so it will later
    call battery_hook_unregister() on the already unregistered battery
    hook, resulting in a crash.
    
    Fix this by using the list head to mark battery hooks that have
    already been unregistered, so that battery_hook_unregister() can
    ignore them.
    
    Fixes: fa93854f7a7e ("battery: Add the battery hooking API")
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
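
The "list head as marker" idea can be modeled in a few lines. The Python sketch below imitates the semantics of the kernel's circular list helpers (an entry that has been removed and re-initialized points at itself, so it tests as empty). It is one plausible reading of the description above, not the driver's code. The class and function names are illustrative.

```python
class ListHead:
    """Minimal model of a circular doubly linked list node."""

    def __init__(self):
        self.next = self
        self.prev = self


def list_empty(head):
    # A node that points at itself is not linked into any list.
    return head.next is head


def list_add(new, head):
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_del_init(entry):
    # Unlink the entry, then make it point at itself again.
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry
    entry.prev = entry


class BatteryHook:
    def __init__(self, name):
        self.name = name
        self.list = ListHead()


hook_list = ListHead()


def hook_register(hook):
    list_add(hook.list, hook_list)


def hook_unregister(hook):
    """Return True if the hook was removed, False if it was already gone."""
    if list_empty(hook.list):
        return False  # already unregistered: ignore instead of crashing
    list_del_init(hook.list)
    return True
```

With this marker, a second unregister call on the same hook becomes a harmless no-op.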

ACPI: battery: Simplify battery hook locking
Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:33 2024 +0200

    ACPI: battery: Simplify battery hook locking
    
    [ Upstream commit 86309cbed26139e1caae7629dcca1027d9a28e75 ]
    
    Move the conditional locking from __battery_hook_unregister()
    into battery_hook_unregister() and rename the low-level function
    to simplify the locking during battery hook removal.
    
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Reviewed-by: Pali Rohár <[email protected]>
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Stable-dep-of: 76959aff14a0 ("ACPI: battery: Fix possible crash when unregistering a battery hook")
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: CPPC: Add support for setting EPP register in FFH
Author: Mario Limonciello <[email protected]>
Date:   Mon Sep 9 22:15:24 2024 -0500

    ACPI: CPPC: Add support for setting EPP register in FFH
    
    [ Upstream commit aaf21ac93909e08a12931173336bdb52ac8499f1 ]
    
    Some Asus AMD systems are reported to not be able to change EPP values
    because the BIOS doesn't advertise support for the CPPC MSR and the PCC
    region is not configured.
    
    However, the ACPI 6.2 specification allows CPC registers to be declared
    in FFH:
    ```
    Starting with ACPI Specification 6.2, all _CPC registers can be in
    PCC, System Memory, System IO, or Functional Fixed Hardware address
    spaces. OSPM support for this more flexible register space scheme
    is indicated by the “Flexible Address Space for CPPC Registers” _OSC
    bit.
    ```
    
    If this _OSC bit has been set, allow using FFH to configure EPP.
    
    Reported-by: [email protected]
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
    Suggested-by: [email protected]
    Tested-by: [email protected]
    Tested-by: [email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: EC: Do not release locks during operation region accesses
Author: Rafael J. Wysocki <[email protected]>
Date:   Thu Jul 4 18:26:54 2024 +0200

    ACPI: EC: Do not release locks during operation region accesses
    
    [ Upstream commit dc171114926ec390ab90f46534545420ec03e458 ]
    
    It is not particularly useful to release locks (the EC mutex and the
    ACPI global lock, if present) and re-acquire them immediately thereafter
    during EC address space accesses in acpi_ec_space_handler().
    
    First, releasing them for a while before grabbing them again does not
    really help anyone because there may not be enough time for another
    thread to acquire them.
    
    Second, if another thread successfully acquires them and carries out
    a new EC write or read in the middle of an operation region access in
    progress, it may confuse the EC firmware, especially after the burst
    mode has been enabled.
    
    Finally, manipulating the locks after writing or reading every single
    byte of data is overhead that it is better to avoid.
    
    Accordingly, modify the code to carry out EC address space accesses
    entirely without releasing the locks.
    
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
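
The difference between the two locking patterns can be sketched as follows. This is a user-space Python model with hypothetical names (`CountingLock`, `read_per_byte`, `read_whole`), not the EC driver: it only counts how often the lock is taken for a multi-byte access under each scheme.

```python
import threading


class CountingLock:
    """A lock wrapper that records how many times it was acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


def read_per_byte(lock, registers, start, count):
    # Old pattern: drop and re-take the lock around every single byte,
    # leaving a window for another thread between any two bytes.
    out = []
    for i in range(count):
        with lock:
            out.append(registers[start + i])
    return bytes(out)


def read_whole(lock, registers, start, count):
    # New pattern: hold the lock for the entire multi-byte access.
    with lock:
        return bytes(registers[start + i] for i in range(count))
```

Both functions return the same data. The second takes the lock once instead of once per byte and cannot be interleaved with another access halfway through.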

ACPI: PAD: fix crash in exit_round_robin()
Author: Seiji Nishikawa <[email protected]>
Date:   Sun Aug 25 23:13:52 2024 +0900

    ACPI: PAD: fix crash in exit_round_robin()
    
    [ Upstream commit 0a2ed70a549e61c5181bad5db418d223b68ae932 ]
    
    The kernel occasionally crashes in cpumask_clear_cpu(), which is called
    within exit_round_robin(), because when executing clear_bit(nr, addr) with
    nr set to 0xffffffff, the address calculation may cause misalignment within
    the memory, leading to access to an invalid memory address.
    
    ----------
    BUG: unable to handle kernel paging request at ffffffffe0740618
            ...
    CPU: 3 PID: 2919323 Comm: acpi_pad/14 Kdump: loaded Tainted: G           OE  X --------- -  - 4.18.0-425.19.2.el8_7.x86_64 #1
            ...
    RIP: 0010:power_saving_thread+0x313/0x411 [acpi_pad]
    Code: 89 cd 48 89 d3 eb d1 48 c7 c7 55 70 72 c0 e8 64 86 b0 e4 c6 05 0d a1 02 00 01 e9 bc fd ff ff 45 89 e4 42 8b 04 a5 20 82 72 c0 <f0> 48 0f b3 05 f4 9c 01 00 42 c7 04 a5 20 82 72 c0 ff ff ff ff 31
    RSP: 0018:ff72a5d51fa77ec8 EFLAGS: 00010202
    RAX: 00000000ffffffff RBX: ff462981e5d8cb80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ff46297556959d80 R08: 0000000000000382 R09: ff46297c8d0f38d8
    R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000000e
    R13: 0000000000000000 R14: ffffffffffffffff R15: 000000000000000e
    FS:  0000000000000000(0000) GS:ff46297a800c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffe0740618 CR3: 0000007e20410004 CR4: 0000000000771ee0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     ? acpi_pad_add+0x120/0x120 [acpi_pad]
     kthread+0x10b/0x130
     ? set_kthread_struct+0x50/0x50
     ret_from_fork+0x1f/0x40
            ...
    CR2: ffffffffe0740618
    
    crash> dis -lr ffffffffc0726923
            ...
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 114
    0xffffffffc0726918 <power_saving_thread+776>:   mov    %r12d,%r12d
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 325
    0xffffffffc072691b <power_saving_thread+779>:   mov    -0x3f8d7de0(,%r12,4),%eax
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./arch/x86/include/asm/bitops.h: 80
    0xffffffffc0726923 <power_saving_thread+787>:   lock btr %rax,0x19cf4(%rip)        # 0xffffffffc0740620 <pad_busy_cpus_bits>
    
    crash> px tsk_in_cpu[14]
    $66 = 0xffffffff
    
    crash> px 0xffffffffc072692c+0x19cf4
    $99 = 0xffffffffc0740620
    
    crash> sym 0xffffffffc0740620
    ffffffffc0740620 (b) pad_busy_cpus_bits [acpi_pad]
    
    crash> px pad_busy_cpus_bits[0]
    $42 = 0xfffc0
    ----------
    
    To fix this, ensure that tsk_in_cpu[tsk_index] != -1 before calling
    cpumask_clear_cpu() in exit_round_robin(), just as it is done in
    round_robin_cpu().
    
    Signed-off-by: Seiji Nishikawa <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject edit, avoid updates to the same value ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
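
The faulting address in the trace can be reproduced with a little arithmetic. The sketch below uses Python purely as a calculator and assumes the usual x86-64 behavior that a bit operation with a register bit offset touches the 64-bit word at base + 8 * (nr / 64). The base address and bit number are taken from the crash output above. The second function is a simplified model of the guard described in the fix, with an illustrative `busy_cpus` set standing in for the cpumask.

```python
BITS_PER_LONG = 64
BYTES_PER_LONG = 8
MASK_64 = 0xFFFFFFFFFFFFFFFF

PAD_BUSY_CPUS_BITS = 0xFFFFFFFFC0740620  # address of pad_busy_cpus_bits
NR_FROM_RAX = 0xFFFFFFFF                 # tsk_in_cpu[14], i.e. -1 as unsigned


def bit_word_address(base, nr):
    """Address of the 64-bit word that holds bit number nr of a bitmap."""
    return (base + (nr // BITS_PER_LONG) * BYTES_PER_LONG) & MASK_64


def exit_round_robin(tsk_in_cpu, tsk_index, busy_cpus):
    """Model of the guarded cleanup: skip slots that hold no CPU (-1)."""
    cpu = tsk_in_cpu[tsk_index]
    if cpu != -1:
        busy_cpus.discard(cpu)
        tsk_in_cpu[tsk_index] = -1


print(hex(bit_word_address(PAD_BUSY_CPUS_BITS, NR_FROM_RAX)))
```

The computed address, 0xffffffffe0740618, equals the CR2 value in the trace: roughly 512 MiB past the bitmap, in unmapped memory.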

ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:06 2024 +0200

    ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
    
    commit 056301e7c7c886f96d799edd36f3406cc30e1822 upstream.
    
    Like other Asus ExpertBook models, the B2502CVA has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to
    EdgeHigh, breaking the keyboard.
    
    Add the B2502CVA to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217760
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:05 2024 +0200

    ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
    
    commit 2f80ce0b78c340e332f04a5801dee5e4ac8cfaeb upstream.
    
    Like other Asus Vivobook models, the X1704VAP has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to
    EdgeHigh, breaking the keyboard.
    
    Add the X1704VAP to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Reported-by: Lamome Julien <[email protected]>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078696
    Closes: https://lore.kernel.org/all/[email protected]/
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:04 2024 +0200

    ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
    
    commit 63539defee17bf0cbd8e24078cf103efee9c6633 upstream.
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GA has a DSDT
    describing IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
        $ sudo dmesg | grep DMI:.*BIOS
        [    0.000000] DMI: ASUSTeK COMPUTER INC. Vivobook Go E1404GA_E1404GA/E1404GA, BIOS E1404GA.302 08/23/2023
        $ sudo cp /sys/firmware/acpi/tables/DSDT dsdt.dat
        $ iasl -d dsdt.dat
        $ grep -A 30 PS2K dsdt.dsl | grep IRQ -A 1
                    IRQ (Level, ActiveLow, Exclusive, )
                        {1}
    
    There already is an entry in the irq1_level_low_skip_override[] DMI match
    table for the "E1404GAB", change this to match on "E1404GA" to cover
    the E1404GA model as well (DMI_MATCH() does a substring match).
    
    Reported-by: Paul Menzel <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
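
The effect of the substring match can be shown with a tiny model. In the Python sketch below, `dmi_match` is a stand-in for the substring behavior described above, not the kernel macro. "E1404GA" is the board string from the dmesg line, and "E1404GAB" is assumed to be the corresponding string on the other model.

```python
def dmi_match(field_value, pattern):
    """Model of a substring-style DMI match: pattern anywhere in the field."""
    return pattern in field_value


BOARD_E1404GA = "E1404GA"    # from the dmesg output in the commit message
BOARD_E1404GAB = "E1404GAB"  # assumed board string of the E1404GAB

# The old table entry only covered the longer name ...
print(dmi_match(BOARD_E1404GA, "E1404GAB"))   # the E1404GA is missed
# ... while the loosened entry covers both models.
print(dmi_match(BOARD_E1404GA, "E1404GA"))
print(dmi_match(BOARD_E1404GAB, "E1404GA"))
```

The same reasoning explains why a separate entry for a longer model name is redundant once the shorter name is already in the table.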

ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:03 2024 +0200

    ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
    
    commit 65bdebf38e5fac7c56a9e05d3479a707e6dc783c upstream.
    
    Commit d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook
    E1504GA and E1504GAB") does exactly what the subject says, adding DMI
    matches for both the E1504GA and E1504GAB.
    
    But DMI_MATCH() does a substring match, so checking for E1504GA will also
    match E1504GAB.
    
    Drop the unnecessary E1504GAB entry since that is covered already by
    the E1504GA entry.
    
    Fixes: d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook E1504GA and E1504GAB")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
Author: Tamim Khan <[email protected]>
Date:   Mon Sep 2 21:43:05 2024 -0400

    ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
    
    [ Upstream commit 49e9cc315604972cc14868cb67831e3e8c3f1470 ]
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GAB has a DSDT
    that describes IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
    This override prevents the internal keyboard from working.
    
    Fix the problem by adding this laptop to the table that prevents the kernel
    from overriding the IRQ.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219212
    Signed-off-by: Tamim Khan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
Author: Hans de Goede <[email protected]>
Date:   Wed Sep 18 17:38:49 2024 +0200

    ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
    
    commit ac78288fe062b64e45a479eaae74aaaafcc8ecdd upstream.
    
    Dell All In One (AIO) models released after 2017 may use a backlight
    controller board connected to an UART.
    
    In the DSDT this UART port will be defined as:
    
       Name (_HID, "DELL0501")
       Name (_CID, EisaId ("PNP0501"))
    
    The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with
    the above _HID + _CID. Loading the dell-uart-backlight driver fails with
    the following errors:
    
    [   18.261353] dell_uart_backlight serial0-0: Timed out waiting for response.
    [   18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version
    [   18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110
    
    This indicates that there is no backlight controller board attached to
    the UART, while the GPU's native backlight control method does work.
    
    Add a quirk to use the GPU's native backlight control method on this model.
    
    Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edit ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
Author: Hans de Goede <[email protected]>
Date:   Sat Sep 7 14:44:19 2024 +0200

    ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
    
    [ Upstream commit eb7b0f12e13ba99e64e3a690c2166895ed63b437 ]
    
    The Panasonic Toughbook CF-18 advertises both native and vendor backlight
    control interfaces. But only the vendor one actually works.
    
    acpi_video_get_backlight_type() will pick the non-working native
    backlight by default. Add a quirk to select the working vendor
    backlight instead.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
Author: Pei Xiao <[email protected]>
Date:   Thu Jul 18 14:05:48 2024 +0800

    ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
    
    [ Upstream commit a5242874488eba2b9062985bf13743c029821330 ]
    
    ACPICA commit 4d4547cf13cca820ff7e0f859ba83e1a610b9fd0
    
    ACPI_ALLOCATE_ZEROED() may fail, in which case elements is NULL and
    would cause a NULL pointer dereference later.
    
    Link: https://github.com/acpica/acpica/commit/4d4547cf
    Signed-off-by: Pei Xiao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
Author: Armin Wolf <[email protected]>
Date:   Sun Apr 14 21:50:33 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
    
    [ Upstream commit e6169a8ffee8a012badd8c703716e761ce851b15 ]
    
    ACPICA commit 1280045754264841b119a5ede96cd005bc09b5a7
    
    If acpi_ps_get_next_field() fails, the previously created field list
    needs to be properly disposed before returning the status code.
    
    Link: https://github.com/acpica/acpica/commit/12800457
    Signed-off-by: Armin Wolf <[email protected]>
    [ rjw: Rename local variable to avoid compiler confusion ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
Author: Armin Wolf <[email protected]>
Date:   Wed Apr 3 20:50:11 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
    
    [ Upstream commit 5accb265f7a1b23e52b0ec42313d1e12895552f4 ]
    
    ACPICA commit 2802af722bbde7bf1a7ac68df68e179e2555d361
    
    If acpi_ps_get_next_namepath() fails, the previously allocated
    union acpi_parse_object needs to be freed before returning the
    status code.
    
    The issue was first reported on the Linux ACPI mailing list:
    
    Link: https://lore.kernel.org/linux-acpi/[email protected]/T/
    Link: https://github.com/acpica/acpica/commit/2802af72
    Signed-off-by: Armin Wolf <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: iasl: handle empty connection_node
Author: Aleksandrs Vinarskis <[email protected]>
Date:   Sun Aug 11 23:33:44 2024 +0200

    ACPICA: iasl: handle empty connection_node
    
    [ Upstream commit a0a2459b79414584af6c46dd8c6f866d8f1aa421 ]
    
    ACPICA commit 6c551e2c9487067d4b085333e7fe97e965a11625
    
    Link: https://github.com/acpica/acpica/commit/6c551e2c
    Signed-off-by: Aleksandrs Vinarskis <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
afs: Fix missing wire-up of afs_retry_request()
Author: David Howells <[email protected]>
Date:   Sat Sep 14 21:40:02 2024 +0100

    afs: Fix missing wire-up of afs_retry_request()
    
    [ Upstream commit 2cf36327ee1e47733aba96092d7bd082a4056ff5 ]
    
    afs_retry_request() is supposed to be pointed to by the afs_req_ops netfs
    operations table, but the pointer got lost somewhere.  The function is used
    during writeback to rotate through the authentication keys that were in
    force when the file was modified locally.
    
    Fix this by adding the pointer to the function.
    
    Fixes: 1ecb146f7cd8 ("netfs, afs: Use writeback retry to deal with alternate keys")
    Reported-by: Dr. David Alan Gilbert <[email protected]>
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

afs: Fix the setting of the server responding flag
Author: David Howells <[email protected]>
Date:   Mon Sep 23 16:07:50 2024 +0100

    afs: Fix the setting of the server responding flag
    
    [ Upstream commit ff98751bae40faed1ba9c6a7287e84430f7dec64 ]
    
    In afs_wait_for_operation(), we transcribe the call responded flag to
    the server record that we used after doing the fileserver iteration loop -
    but it's possible to exit the loop having had a response from the server
    that we've discarded (e.g. it returned an abort or we started receiving
    data, but the call didn't complete).
    
    This means that op->server might be NULL, but we don't check that before
    attempting to set the server flag.
    
    Fixes: 98f9fda2057b ("afs: Fold the afs_addr_cursor struct in")
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
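
The missing check amounts to a guard before the flag is copied. The Python sketch below models that guard with plain objects. The attribute names (`call_responded`, `server`, `responding`) are illustrative stand-ins for the fields mentioned above, and the function is a model of the described behavior, not the afs code.

```python
from types import SimpleNamespace


def transcribe_responded(op):
    """Copy the 'call responded' state to the server record, if there is one."""
    # After the iteration loop, op.server may be None even though a
    # response was seen, so it has to be checked before it is touched.
    if op.call_responded and op.server is not None:
        op.server.responding = True


server = SimpleNamespace(responding=False)
transcribe_responded(SimpleNamespace(call_responded=True, server=server))

# A discarded response can leave the operation without a server record.
transcribe_responded(SimpleNamespace(call_responded=True, server=None))
```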

 
ALSA: asihpi: Fix potential OOB array access
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:14:42 2024 +0200

    ALSA: asihpi: Fix potential OOB array access
    
    [ Upstream commit 7b986c7430a6bb68d523dac7bfc74cbd5b44ef96 ]
    
    The ASIHPI driver stores some values in the static array upon a
    response from the driver, and its index depends on the firmware.  We
    shouldn't trust it blindly.
    
    This patch adds a sanity check that the array index fits within the
    array size.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
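
A sanity check of this kind can be sketched generically. The Python model below uses hypothetical names (`store_response`, `table`) and is not the driver's code. It only shows the pattern of validating an externally supplied index before using it.

```python
def store_response(table, index, value):
    """Store value at table[index] only if the index is within bounds.

    Returns True when the value was stored, False when the index,
    which is assumed to come from untrusted firmware, was rejected.
    """
    if not 0 <= index < len(table):
        return False
    table[index] = value
    return True


table = [0] * 4
print(store_response(table, 2, 7))   # in range: stored
print(store_response(table, 4, 1))   # one past the end: rejected
print(store_response(table, -1, 1))  # negative: rejected as well
```

In C the out-of-range cases would read or write memory outside the array instead of being rejected, which is why the check is needed.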

ALSA: control: Fix leftover snd_power_unref()
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 1 08:42:01 2024 +0200

    ALSA: control: Fix leftover snd_power_unref()
    
    commit fef1ac950c600ba50ef4d65ca03c8dae9be7f9ea upstream.
    
    One snd_power_unref() was forgotten and left at __snd_ctl_elem_info()
    in the previous change for reorganizing the locking order.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://github.com/thesofproject/linux/pull/5127
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: control: Fix power_ref lock order for compat code, too
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 18:31:27 2024 +0200

    ALSA: control: Fix power_ref lock order for compat code, too
    
    [ Upstream commit a1066453b5e49a28523f3ecbbfe4e06c6a29561c ]
    
    In the previous change for swapping the power_ref and controls_rwsem
    lock order, the code path for the compat layer was forgotten.
    This patch covers the remaining code.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: control: Take power_ref lock primarily
Author: Takashi Iwai <[email protected]>
Date:   Mon Jul 29 18:06:58 2024 +0200

    ALSA: control: Take power_ref lock primarily
    
    [ Upstream commit fcc62b19104a67b9a2941513771e09389b75bd95 ]
    
    The code paths for kcontrol accesses often nest the locks of both the
    card's controls_rwsem and power_ref, taken in that order.
    However, what could take much longer is the latter, power_ref; it
    waits for the power state of the device, and it pretty much depends on
    the user's action.
    
    This patch swaps the locking order of those locks to a more natural
    way, namely, power_ref -> controls_rwsem, in order to shorten the time
    of possible nested locks.  For consistency, power_ref is taken always
    in the top-level caller side (that is, *_user() functions and the
    ioctl handler itself).
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: core: add isascii() check to card ID generator
Author: Jaroslav Kysela <[email protected]>
Date:   Wed Oct 2 21:46:49 2024 +0200

    ALSA: core: add isascii() check to card ID generator
    
    commit d278a9de5e1837edbe57b2f1f95a104ff6c84846 upstream.
    
    The card identifier should contain only safe ASCII characters, but
    isalnum() also returns true for non-ASCII characters.
    
    Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4135
    Link: https://lore.kernel.org/linux-sound/yk3WTvKkwheOon_LzZlJ43PPInz6byYfBzpKkbasww1yzuiMRqn7n6Y8vZcXB-xwFCu_vb8hoNjv7DTNwH5TWjpEuiVsyn9HPCEXqwF4120=@protonmail.com/
    Cc: [email protected]
    Reported-by: Barnabás Pőcze <[email protected]>
    Signed-off-by: Jaroslav Kysela <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
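
The pitfall has a close analogue in Python, where `str.isalnum()` is also true for non-ASCII letters. The sketch below uses that analogue to show why an extra ASCII check is needed. It is a simplified model of an ID filter, not the ALSA function, and the 15-character limit is an assumption made for the example.

```python
def card_id_from_name(name, max_len=15):
    """Keep only ASCII letters and digits of name, truncated to max_len."""
    kept = [ch for ch in name if ch.isascii() and ch.isalnum()]
    return "".join(kept)[:max_len]


# isalnum() alone would let the accented letter through:
print("é".isalnum(), "é".isascii())

# With the additional ASCII check it is dropped from the identifier.
print(card_id_from_name("Café USB-1"))
```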

ALSA: gus: Fix some error handling paths related to get_bpos() usage
Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 21:34:01 2024 +0200

    ALSA: gus: Fix some error handling paths related to get_bpos() usage
    
    [ Upstream commit 9df39a872c462ea07a3767ebd0093c42b2ff78a2 ]
    
    If get_bpos() fails, it is likely that the corresponding error code should
    be returned.
    
    Fixes: a6970bb1dd99 ("ALSA: gus: Convert to the new PCM ops")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/d9ca841edad697154afa97c73a5d7a14919330d9.1727984008.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
Author: Takashi Iwai <[email protected]>
Date:   Fri Oct 4 10:25:58 2024 +0200

    ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
    
    [ Upstream commit b3ebb007060f89d5a45c9b99f06a55e36a1945b5 ]
    
    We received a regression report for System76 Pangolin (pang14) due to
    the recent fix for Tuxedo Sirius devices to support the top speaker.
    The reason was the conflicting PCI SSID, as often seen.
    
    As a workaround, now the codec SSID is checked and the quirk is
    applied conditionally only to Sirius devices.
    
    Fixes: 4178d78cd7a8 ("ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices")
    Reported-by: Christian Heusel <[email protected]>
    Reported-by: Jerry <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
Author: Takashi Iwai <[email protected]>
Date:   Tue Oct 1 14:14:36 2024 +0200

    ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
    
    [ Upstream commit 1c801e7f77445bc56e5e1fec6191fd4503534787 ]
    
    Some time ago, we introduced the obey_preferred_dacs flag for choosing
    the DAC/pin pairs specified by the driver instead of parsing the
    paths.  This works as expected, per se, but there have been a few
    cases where we forgot to set this flag while preferred_dacs table is
    already set up.  It ended up with incorrect wiring and made us
    wonder why it didn't work.
    
    Basically, when the preferred_dacs table is provided, it means that
    the driver really wants to wire up to follow that.  That is, the
    presence of the preferred_dacs table itself is already a "do-it"
    flag.
    
    In this patch, we simply replace the evaluation of obey_preferred_dacs
    flag with the presence of preferred_dacs table for fixing the
    misbehavior.  Another patch to drop the obsoleted flag will
    follow.
    
    Fixes: 242d990c158d ("ALSA: hda/generic: Add option to enforce preferred_dacs pairs")
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1219803
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
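
The change in the decision rule can be modeled as follows. This Python sketch reduces the codec spec to the two fields named above and uses made-up pin/DAC numbers. It illustrates the described condition change only and says nothing about how the real parser walks the paths.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Spec:
    # Pairs of (pin, DAC) widget IDs; None means no table was provided.
    preferred_dacs: Optional[Tuple[int, ...]] = None
    obey_preferred_dacs: bool = False


def enforce_pairs_before(spec):
    # Old rule: the table is only enforced when the extra flag is set.
    return spec.obey_preferred_dacs


def enforce_pairs_after(spec):
    # New rule: the presence of the table is itself the "do-it" flag.
    return spec.preferred_dacs is not None


# A driver that set up a table (hypothetical IDs) but forgot the flag:
forgotten_flag = Spec(preferred_dacs=(0x14, 0x02))
print(enforce_pairs_before(forgotten_flag), enforce_pairs_after(forgotten_flag))
```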

ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
Author: Abhishek Tamboli <[email protected]>
Date:   Mon Sep 30 20:23:00 2024 +0530

    ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
    
    commit d75dba49744478c32f6ce1c16b5f391c2d5cef5f upstream.
    
    Add the quirk for HP Pavilion Gaming laptop 15z-ec200 for
    enabling the mute LED. The fix applies the ALC285_FIXUP_HP_MUTE_LED
    quirk for this model.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219303
    Signed-off-by: Abhishek Tamboli <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9
Author: Ai Chao <[email protected]>
Date:   Thu Sep 26 14:02:52 2024 +0800

    ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9
    
    commit dee476950cbd83125655a3f49e00d63b79f6114e upstream.
    
    The headset mic requires a fixup to be properly detected/used.
    
    Signed-off-by: Ai Chao <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8
Author: Nikolai Afanasenkov <[email protected]>
Date:   Mon Sep 16 13:50:42 2024 -0600

    ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8
    
    commit cb2deca056d579fe008c8d0a4ceb04d2b368fe42 upstream.
    
    The HP Elite mt645 G8 Mobile Thin Client uses an ALC236 codec
    and needs the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk
    to enable the mute and micmute LED functionality.
    
    This patch adds the system ID of the HP Elite mt645 G8
    to the `alc269_fixup_tbl` in `patch_realtek.c`
    to enable the required quirk.
    
    Cc: [email protected]
    Signed-off-by: Nikolai Afanasenkov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Fix the push button function for the ALC257 [+ + +]
Author: Oder Chiou <[email protected]>
Date:   Mon Sep 30 18:50:39 2024 +0800

    ALSA: hda/realtek: Fix the push button function for the ALC257
    
    [ Upstream commit 05df9732a0894846c46d0062d4af535c5002799d ]
    
    The headset push button does not work properly on the ALC257.
    This patch reverts the previous commit to correct the side effect.
    
    Fixes: ef9718b3d54e ("ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7")
    Signed-off-by: Oder Chiou <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init [+ + +]
Author: Joshua Grisham <[email protected]>
Date:   Mon Sep 9 21:30:00 2024 +0200

    ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init
    
    [ Upstream commit 7e4d4b32ab9532bd1babcd5d0763d727ebb04be0 ]
    
    I have done a lot of analysis for this type of device and collaborated
    quite a bit with Nick Weihs (author of the first patch submitted for this
    including adding samsung_helper.c). More information can be found in the
    issue on Github [1] including additional rationale and testing.
    
    The existing implementation includes a large number of equalizer coef
    values that are not necessary to actually init and enable the speaker
    amps, and that also create a somewhat worse sound profile. Users have
    reported "muffled" or "muddy" sound; more information about this, including
    my analysis of the differences, can be found in the linked GitHub issue.
    
    This patch refactors the "v2" version of ALC298_FIXUP_SAMSUNG_AMP to a much
    simpler implementation which removes the new samsung_helper.c, reuses more
    of the existing patch_realtek.c, and sends significantly fewer unnecessary
    coef values (including removing all of these EQ-specific coef values).
    
    A pcm_playback_hook is used to dynamically enable and disable the speaker
    amps only when there will be audio playback; this matches how the driver
    for these devices behaves in Windows, and is suspected, but not yet
    tested or confirmed, to help with power consumption.
    
    Support for models with 2 speaker amps vs 4 speaker amps is controlled by
    a specific quirk name for both types. A new int num_speaker_amps has been
    added to alc_spec so that the hooks can know how many speaker amps to
    enable or disable. This design was chosen to limit the number of places
    that subsystem ids will need to be maintained: like this, they can be
    maintained only once in the quirk table and there will not be another
    separate list of subsystem ids to maintain elsewhere in the code.
    
    Also updated the quirk name from ALC298_FIXUP_SAMSUNG_AMP2 to
    ALC298_FIXUP_SAMSUNG_AMP_V2_.. as this is not a quirk for "Amp #2" on
    ALC298 but is instead a different version of how to handle it.
    
    More devices have been added (see Github issue for testing confirmation),
    as well as a small cleanup to existing names.
    
    [1]: https://github.com/thesofproject/linux/issues/4055#issuecomment-2323411911
    
    Signed-off-by: Joshua Grisham <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop [+ + +]
Author: Baojun Xu <[email protected]>
Date:   Thu Sep 19 15:57:43 2024 +0800

    ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop
    
    commit 49f5ee951f11f4d6a124f00f71b2590507811a55 upstream.
    
    Add new vendor_id and subsystem_id in quirk for Lenovo Y990 Laptop.
    
    Signed-off-by: Baojun Xu <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hdsp: Break infinite MIDI input flush loop [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:15:12 2024 +0200

    ALSA: hdsp: Break infinite MIDI input flush loop
    
    [ Upstream commit c01f3815453e2d5f699ccd8c8c1f93a5b8669e59 ]
    
    The current MIDI input flush on HDSP and HDSPM drivers relies on the
    hardware reporting the right value.  If the hardware doesn't give the
    proper value but returns -1, it may get stuck in an infinite loop.
    
    Add a counter and break if the loop is unexpectedly too long.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
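
The bounded flush described in this entry can be sketched in plain userspace C. This is only an illustration under stated assumptions: `flush_bounded`, `status_fn`, the two fake status callbacks, and the cap of 256 iterations are hypothetical names and values, not taken from the HDSP/HDSPM drivers, which read hardware registers instead.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a hardware status read. A non-zero return
 * means "input still pending"; -1 models hardware that never reports
 * an empty queue. */
typedef int (*status_fn)(void *ctx);

#define FLUSH_MAX_ITERATIONS 256	/* assumed cap, not the driver's value */

/* Drain input until the status read returns 0 or the cap is reached.
 * Returns true if the queue drained, false if the loop gave up. */
static bool flush_bounded(status_fn input_available, void *ctx)
{
	int count = FLUSH_MAX_ITERATIONS;

	while (input_available(ctx) != 0) {
		if (--count <= 0)
			return false;	/* unexpectedly long: break out */
	}
	return true;
}

/* Fake status read for broken hardware: always returns -1. */
static int always_minus_one(void *ctx)
{
	(void)ctx;
	return -1;
}

/* Fake status read for healthy hardware: reports the pending count and
 * consumes one item per call until nothing is left. */
static int countdown(void *ctx)
{
	int *remaining = ctx;

	return *remaining > 0 ? (*remaining)-- : 0;
}
```

With `countdown` the loop ends as soon as the fake queue is empty; with `always_minus_one` it stops after the cap instead of spinning forever.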

ALSA: line6: add hw monitor volume control to POD HD500X [+ + +]
Author: Hans P. Moller <[email protected]>
Date:   Thu Oct 3 20:28:28 2024 -0300

    ALSA: line6: add hw monitor volume control to POD HD500X
    
    commit 703235a244e533652346844cfa42623afb36eed1 upstream.
    
    Add hw monitor volume control for POD HD500X. This is done by adding
    LINE6_CAP_HWMON_CTL to the capabilities.
    
    Signed-off-by: Hans P. Moller <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: mixer_oss: Remove some incorrect kfree_const() usages [+ + +]
Author: Christophe JAILLET <[email protected]>
Date:   Thu Sep 26 20:17:36 2024 +0200

    ALSA: mixer_oss: Remove some incorrect kfree_const() usages
    
    [ Upstream commit 368e4663c557de4a33f321b44e7eeec0a21b2e4e ]
    
    "assigned" and "assigned->name" are allocated in snd_mixer_oss_proc_write()
    using kmalloc() and kstrdup(), so there is no point in using kfree_const()
    to free these resources.
    
    Switch to the more standard kfree() to free these resources.
    
    This could avoid a memory leak.
    
    Fixes: 454f5ec1d2b7 ("ALSA: mixer: oss: Constify snd_mixer_oss_assign_table definition")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/63ac20f64234b7c9ea87a7fa9baf41e8255852f7.1727374631.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET [+ + +]
Author: Lianqin Hu <[email protected]>
Date:   Wed Sep 25 03:16:29 2024 +0000

    ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET
    
    commit 73385f3e0d8088b715ae8f3f66d533c482a376ab upstream.
    
    Audio control requests that set the sampling frequency sometimes fail on
    this card. Adding a delay between control messages eliminates the problem.
    
    Signed-off-by: Lianqin Hu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/TYUPR06MB62177E629E9DEF2401333BF7D2692@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add input value sanity checks for standard types [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Tue Aug 6 14:46:50 2024 +0200

    ALSA: usb-audio: Add input value sanity checks for standard types
    
    [ Upstream commit 901e85677ec0bb9a69fb9eab1feafe0c4eb7d07e ]
    
    For an invalid input value that is out of the given range, the
    USB-audio driver currently corrects the value silently and accepts it
    without errors.  This is not wrong behavior per se, but the recent
    kselftest expects an error in such a case, hence a different
    behavior is expected now.
    
    This patch adds a sanity check at each control put for the standard
    mixer types and returns an error if an invalid value is given.
    
    Note that this covers only the standard mixer types.  The mixer quirks
    that have their own control callbacks would need different coverage.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
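
The behavioural change in this entry (reject instead of silently correct) can be illustrated with a small userspace sketch. The `ctl_range` struct and both helper functions are hypothetical and only model the idea; they are not the USB-audio mixer code.

```c
#include <errno.h>

/* Hypothetical control range; the field names are illustrative only. */
struct ctl_range {
	int min;
	int max;
};

/* Old behaviour: silently clamp an out-of-range value and accept it. */
static int put_value_clamped(const struct ctl_range *r, int val, int *stored)
{
	if (val < r->min)
		val = r->min;
	if (val > r->max)
		val = r->max;
	*stored = val;
	return 0;
}

/* New behaviour: reject an out-of-range value with -EINVAL and leave
 * the stored value untouched. */
static int put_value_checked(const struct ctl_range *r, int val, int *stored)
{
	if (val < r->min || val > r->max)
		return -EINVAL;
	*stored = val;
	return 0;
}
```

A test that writes an invalid value and expects an error passes only against the checked variant.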

ALSA: usb-audio: Add logitech Audio profile quirk [+ + +]
Author: Joshua Pius <[email protected]>
Date:   Thu Sep 12 15:26:28 2024 +0000

    ALSA: usb-audio: Add logitech Audio profile quirk
    
    [ Upstream commit a51c925c11d7b855167e64b63eb4378e5adfc11d ]
    
    Specify shortnames for the following Logitech Devices: Rally bar, Rally
    bar mini, Tap, MeetUp and Huddle.
    
    Signed-off-by: Joshua Pius <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add mixer quirk for RME Digiface USB [+ + +]
Author: Asahi Lina <[email protected]>
Date:   Tue Sep 3 19:52:30 2024 +0900

    ALSA: usb-audio: Add mixer quirk for RME Digiface USB
    
    [ Upstream commit 611a96f6acf2e74fe28cb90908a9c183862348ce ]
    
    Implement sync, output format, and input status mixer controls, to allow
    the interface to be used as a straight ADAT/SPDIF (+ Headphones) I/O
    interface.
    
    This does not implement the matrix mixer, output gain controls, or input
    level meter feedback. The full mixer interface is only really usable
    using a dedicated userspace control app (there are too many mixer nodes
    for alsamixer to be usable), so for now we leave it up to userspace to
    directly control these features using raw USB control messages. This is
    similar to how it's done with some FireWire interfaces (ffado-mixer).
    
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add native DSD support for Luxman D-08u [+ + +]
Author: Jan Lalinsky <[email protected]>
Date:   Thu Oct 3 05:08:11 2024 +0200

    ALSA: usb-audio: Add native DSD support for Luxman D-08u
    
    commit 6b0bde5d8d4078ca5feec72fd2d828f0e5cf115d upstream.
    
    Add native DSD support for Luxman D-08u DAC, by adding the PID/VID 1852:5062.
    This makes DSD playback work and also improves sound quality when playing
    PCM files: the crackling sounds are gone.
    
    Signed-off-by: Jan Lalinsky <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add quirk for RME Digiface USB [+ + +]
Author: Cyan Nyan <[email protected]>
Date:   Tue Sep 3 19:52:29 2024 +0900

    ALSA: usb-audio: Add quirk for RME Digiface USB
    
    [ Upstream commit c032044e9672408c534d64a6df2b1ba14449e948 ]
    
    Add trivial support for audio streaming on the RME Digiface USB. Binds
    only to the first interface to allow userspace to directly drive the
    complex I/O and matrix mixer controls.
    
    Signed-off-by: Cyan Nyan <[email protected]>
    [Lina: Added 2x/4x sample rate support & boot/format quirks]
    Co-developed-by: Asahi Lina <[email protected]>
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Define macros for quirk table entries [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:41 2024 +0200

    ALSA: usb-audio: Define macros for quirk table entries
    
    [ Upstream commit 0c3ad39b791c2ecf718afcaca30e5ceafa939d5c ]
    
    Many entries in the USB-audio quirk tables have relatively complex
    expressions.  For improving the readability, introduce a few macros.
    Those are applied in the following patch.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Replace complex quirk lines with macros [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:42 2024 +0200

    ALSA: usb-audio: Replace complex quirk lines with macros
    
    [ Upstream commit d79e13f8e8abb5cd3a2a0f9fc9bc3fc750c5b06f ]
    
    Apply the newly introduced macros to reduce the complex expressions
    and casts in the quirk table definitions.  It results in a significant
    code reduction, too.
    
    There should be no functional changes.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
aoe: fix the potential use-after-free problem in more places [+ + +]
Author: Chun-Yi Lee <[email protected]>
Date:   Wed Oct 2 11:54:58 2024 +0800

    aoe: fix the potential use-after-free problem in more places
    
    commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 upstream.
    
    To fix CVE-2023-6270, f98364e92662 ("aoe: fix the potential
    use-after-free problem in aoecmd_cfg_pkts") makes tx() call dev_put()
    instead of doing so in aoecmd_cfg_pkts(). This keeps tx() from running
    into a use-after-free.
    
    Nicolai Stange then found more places in aoe with a potential
    use-after-free problem involving tx(), e.g. revalidate(), aoecmd_ata_rw(),
    resend(), probe() and aoecmd_cfg_rsp(). Those functions also use
    aoenet_xmit() to push packets to the tx queue, so they should also use
    dev_hold() to increase the refcnt of skb->dev.
    
    On the other hand, moving dev_put() to tx() causes the refcnt of
    skb->dev to drop to a negative value, because the corresponding
    dev_hold() calls are missing in revalidate(), aoecmd_ata_rw(), resend(),
    probe(), and aoecmd_cfg_rsp(). This patch fixes this issue.
    
    Cc: [email protected]
    Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
    Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts")
    Reported-by: Nicolai Stange <[email protected]>
    Signed-off-by: Chun-Yi Lee <[email protected]>
    Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
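
The reference-count imbalance described in this entry can be modelled with a toy counter. Everything below is a hypothetical illustration: `toy_dev` and the `toy_*` helpers only mimic the roles of dev_hold(), dev_put() and tx(), and are not the kernel API or the aoe code.

```c
/* Toy model of a device reference count; not the kernel's net_device. */
struct toy_dev {
	int refcnt;
};

static void toy_dev_hold(struct toy_dev *d)
{
	d->refcnt++;
}

static void toy_dev_put(struct toy_dev *d)
{
	d->refcnt--;
}

/* Models tx(): it drops one reference for every packet it sends. */
static void toy_tx(struct toy_dev *d)
{
	toy_dev_put(d);
}

/* Unbalanced caller: queues a packet without taking a reference, so
 * each transmit lowers the count by one. */
static void toy_xmit_unbalanced(struct toy_dev *d)
{
	toy_tx(d);
}

/* Balanced caller: takes a reference before handing the packet to
 * tx(), so the count is unchanged after the transmit. */
static void toy_xmit_balanced(struct toy_dev *d)
{
	toy_dev_hold(d);
	toy_tx(d);
}
```

Starting from a count of 1, two unbalanced transmits leave the toy counter at -1, while any number of balanced transmits leave it at 1.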

 
arm64: cputype: Add Neoverse-N3 definitions [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:04:19 2024 +0100

    arm64: cputype: Add Neoverse-N3 definitions
    
    [ Upstream commit 924725707d80bc2588cefafef76ff3f164d299bc ]
    
    Add cputype definitions for Neoverse-N3. These will be used for errata
    detection in subsequent patches.
    
    These values can be found in Table A-261 ("MIDR_EL1 bit descriptions")
    in issue 02 of the Neoverse-N3 TRM, which can be found at:
    
      https://developer.arm.com/documentation/107997/0000/?lang=en
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: errata: Expand speculative SSBS workaround once more [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:04:20 2024 +0100

    arm64: errata: Expand speculative SSBS workaround once more
    
    [ Upstream commit 081eb7932c2b244f63317a982c5e3990e2c7fbdd ]
    
    A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
    special-purpose register does not affect subsequent speculative
    instructions, permitting speculative store bypassing for a window of
    time.
    
    We worked around this for a number of CPUs in commits:
    
    * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
    * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")
    * 145502cac7ea70b5 ("arm64: errata: Expand speculative SSBS workaround (again)")
    
    Since then, a (hopefully final) batch of updates has been published,
    with two more affected CPUs. For the affected CPUs the existing
    mitigation is sufficient, as described in their respective Software
    Developer Errata Notice (SDEN) documents:
    
    * Cortex-A715 (MP148) SDEN v15.0, erratum 3456084
      https://developer.arm.com/documentation/SDEN-2148827/1500/
    
    * Neoverse-N3 (MP195) SDEN v5.0, erratum 3456111
      https://developer.arm.com/documentation/SDEN-3050973/0500/
    
    Enable the existing mitigation by adding the relevant MIDRs to
    erratum_spec_ssbs_list, and update silicon-errata.rst and the
    Kconfig text accordingly.
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Sep 30 13:04:48 2024 +0100

    arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
    
    commit b3d6121eaeb22aee8a02f46706745b1968cc0292 upstream.
    
    The Kconfig logic to select HAVE_DYNAMIC_FTRACE_WITH_ARGS is incorrect,
    and HAVE_DYNAMIC_FTRACE_WITH_ARGS may be selected when it is not
    supported by the combination of clang and GNU LD, resulting in link-time
    errors:
    
      aarch64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.meminit.data' in mm/sparse.o] sections
      aarch64-linux-gnu-ld: final link failed: bad value
    
    ... which can be seen when building with CC=clang using a binutils
    version older than 2.36.
    
    We originally fixed that in commit:
    
      45bd8951806eb5e8 ("arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang")
    
    ... by splitting the "select HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement
    into separate CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS and
    GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS options which individually select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS.
    
    Subsequently we accidentally re-introduced the common "select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement in commit:
    
      26299b3f6ba26bfc ("ftrace: arm64: move from REGS to ARGS")
    
    ... then we removed it again in commit:
    
      68a63a412d18bd2e ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y")
    
    ... then we accidentally re-introduced it again in commit:
    
      2aa6ac03516d078c ("arm64: ftrace: Add direct call support")
    
    Fix this for the third time by keeping the unified select statement and
    making this depend on either GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS or
    CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS. This is more consistent with
    usual style and less likely to go wrong in future.
    
    Fixes: 2aa6ac03516d ("arm64: ftrace: Add direct call support")
    Cc: <[email protected]> # 6.4.x
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386 [+ + +]
Author: Easwar Hariharan <[email protected]>
Date:   Thu Oct 3 22:52:35 2024 +0000

    arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
    
    commit 3eddb108abe3de6723cc4b77e8558ce1b3047987 upstream.
    
    Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering
    from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata:
    Expand speculative SSBS workaround")
    
    CC: Mark Rutland <[email protected]>
    CC: James Morse <[email protected]>
    CC: Will Deacon <[email protected]>
    CC: [email protected] # 6.6+
    Signed-off-by: Easwar Hariharan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() [+ + +]
Author: Fares Mehanna <[email protected]>
Date:   Mon Sep 2 16:33:08 2024 +0000

    arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
    
    [ Upstream commit 7eced90b202d63cdc1b9b11b1353adb1389830f9 ]
    
    The reasons for PTEs in the kernel direct map to be marked invalid are not
    limited to kfence / debug pagealloc machinery. In particular,
    memfd_secret() also steals pages with set_direct_map_invalid_noflush().
    
    When building the transitional page tables for kexec from the current
    kernel's page tables, those pages need to become regular writable pages,
    otherwise, if the relocation places kexec segments over such pages, a fault
    will occur during kexec, leading to host going dark during kexec.
    
    This patch addresses the kexec issue by marking any PTE as valid if it is
    not none. While this fixes the kexec crash, it does not address the
    security concern that if processes owning secret memory are not terminated
    before kexec, the secret content will be mapped in the new kernel without
    being scrubbed.
    
    Suggested-by: Jan H. Schönherr <[email protected]>
    Signed-off-by: Fares Mehanna <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized [+ + +]
Author: Andrei Simion <[email protected]>
Date:   Tue Sep 24 11:12:38 2024 +0300

    ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized
    
    [ Upstream commit 09cfc6a532d249a51d3af5022d37ebbe9c3d31f6 ]
    
    Update the driver to prevent alsa-restore.service from failing when
    reading data from /var/lib/alsa/asound.state at boot. Ensure that the
    restoration of ALSA mixer configurations is skipped if substream->runtime
    is NULL.
    
    Fixes: 50291652af52 ("ASoC: atmel: mchp-pdmc: add PDMC driver")
    Signed-off-by: Andrei Simion <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: codecs: wsa883x: Handle reading version failure [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Wed Jul 10 15:52:31 2024 +0200

    ASoC: codecs: wsa883x: Handle reading version failure
    
    [ Upstream commit 2fbf16992e5aa14acf0441320033a01a32309ded ]
    
    If reading version and variant from registers fails (which is unlikely
    but possible, because it is a read over bus), the driver will proceed
    and perform device configuration based on uninitialized stack variables.
    Handle it a bit better - bail out without doing any init and fail the
    update status SoundWire callback.
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
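
The pattern this entry describes (check the read result before using the value) can be sketched as follows. `toy_bus_read` and `toy_init` are hypothetical helpers that model a bus read which may fail; they are not the wsa883x or regmap API.

```c
#include <errno.h>

/* Hypothetical bus read: fails with -EIO when `fail` is non-zero and
 * leaves *val untouched, as a failing read over a bus might. */
static int toy_bus_read(int fail, unsigned int reg_value, unsigned int *val)
{
	if (fail)
		return -EIO;
	*val = reg_value;
	return 0;
}

/* Returns 0 and records the version on success. On a failed read it
 * propagates the error and leaves *configured_for unchanged. */
static int toy_init(int fail, unsigned int reg_value,
		    unsigned int *configured_for)
{
	unsigned int version;	/* uninitialized until the read succeeds */
	int ret;

	ret = toy_bus_read(fail, reg_value, &version);
	if (ret < 0)
		return ret;	/* bail out: `version` was never written */

	*configured_for = version;
	return 0;
}
```

Without the `ret < 0` check, the failing path would configure the device from whatever happened to be on the stack.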

ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m [+ + +]
Author: Hui Wang <[email protected]>
Date:   Wed Oct 2 10:56:59 2024 +0800

    ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
    
    [ Upstream commit 47d7d3fd72afc7dcd548806291793ee6f3848215 ]
    
    In most Linux distribution kernels, SND is set to m. In that case,
    booting the kernel on the i.MX8MP EVK board produces a warning
    calltrace like the one below:
     Call trace:
     snd_card_init+0x484/0x4cc [snd]
     snd_card_new+0x70/0xa8 [snd]
     snd_soc_bind_card+0x310/0xbd0 [snd_soc_core]
     snd_soc_register_card+0xf0/0x108 [snd_soc_core]
     devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
    
    That is because card.owner is not set, so snd_card_init() raises a
    warning calltrace.
    
    Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
    Signed-off-by: Hui Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev() [+ + +]
Author: Pierre-Louis Bossart <[email protected]>
Date:   Tue Aug 27 20:32:01 2024 +0800

    ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev()
    
    [ Upstream commit 14e91ddd5c02d8c3e5a682ebfa0546352b459911 ]
    
    The code seems mostly copy-pasted, with some machine drivers
    forgetting to test if the 'adev' result is NULL.
    
    Add this check when missing, and use -ENOENT consistently as an error
    code.
    
    Reported-by: Dan Carpenter <[email protected]>
    Closes: https://lore.kernel.org/alsa-devel/[email protected]/T/#u
    Signed-off-by: Pierre-Louis Bossart <[email protected]>
    Reviewed-by: Péter Ujfalusi <[email protected]>
    Signed-off-by: Bard Liao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item [+ + +]
Author: Bard Liao <[email protected]>
Date:   Tue Oct 1 14:17:37 2024 +0800

    ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item
    
    [ Upstream commit 5afc29ba44fdd1bcbad4e07246c395d946301580 ]
    
    There is no links_num in struct snd_soc_acpi_mach {}, and we test
    !link->num_adr as a condition to end the loop in hda_sdw_machine_select().
    So an empty item in struct snd_soc_acpi_link_adr array is required.
    
    Fixes: 65ab45b90656 ("ASoC: Intel: soc-acpi: Add match entries for some cs42l43 laptops")
    Signed-off-by: Bard Liao <[email protected]>
    Reviewed-by: Péter Ujfalusi <[email protected]>
    Reviewed-by: Charles Keepax <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
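
Why the empty item matters can be shown with a simplified, sentinel-terminated array. The struct below is a hypothetical stand-in that keeps only a `mask` and the `num_adr` field used as the end marker; it is not the real struct snd_soc_acpi_link_adr, and `toy_count_links` only mirrors the `!link->num_adr` loop condition quoted above.

```c
#include <stddef.h>

/* Simplified stand-in: only the end-of-array marker field matters. */
struct toy_link_adr {
	unsigned int mask;
	unsigned int num_adr;
};

/* Walk the array until an entry with num_adr == 0 is found. There is
 * no separate element count, so the terminator is the only stop. */
static size_t toy_count_links(const struct toy_link_adr *link)
{
	size_t n = 0;

	for (; link->num_adr; link++)
		n++;
	return n;
}

static const struct toy_link_adr toy_links[] = {
	{ .mask = 0x1, .num_adr = 1 },
	{ .mask = 0x4, .num_adr = 2 },
	{ 0 }	/* empty terminator: without it the loop reads past the end */
};
```

Here `toy_count_links(toy_links)` visits the two populated entries and stops at the empty one.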

ASoC: topology: Fix incorrect addressing assignments [+ + +]
Author: Tang Bin <[email protected]>
Date:   Sat Sep 14 16:16:08 2024 +0800

    ASoC: topology: Fix incorrect addressing assignments
    
    [ Upstream commit 85109780543b5100aba1d0842b6a7c3142be74d2 ]
    
    The variable 'kc' is handled in the function
    soc_tplg_control_dbytes_create(), and 'kc->private_value'
    is assigned to 'sbe'. So in the function soc_tplg_dbytes_create(),
    the right 'sbe' should be 'kc.private_value'. The same logical error
    exists in the function soc_tplg_dmixer_create(), thus fix both.
    
    Fixes: 0867278200f7 ("ASoC: topology: Unify code for creating standalone and widget bytes control")
    Fixes: 4654ca7cc8d6 ("ASoC: topology: Unify code for creating standalone and widget mixer control")
    Signed-off-by: Tang Bin <[email protected]>
    Reviewed-by: Amadeusz Sławiński <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ata: pata_serverworks: Do not use the term blacklist [+ + +]
Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 10:58:36 2024 +0900

    ata: pata_serverworks: Do not use the term blacklist
    
    [ Upstream commit 858048568c9e3887d8b19e101ee72f129d65cb15 ]
    
    Let's not use the term blacklist in the function
    serverworks_osb4_filter() documentation comment and rather simply refer
    to what that function looks at: the list of devices with broken UDMA5.
    
    While at it, also constify the values of the csb_bad_ata100 array.
    
    Of note is that all of this should probably be handled using libata
    quirk mechanism but it is unclear if these UDMA5 quirks are specific
    to this controller only.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: sata_sil: Rename sil_blacklist to sil_quirks [+ + +]
Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 11:14:11 2024 +0900

    ata: sata_sil: Rename sil_blacklist to sil_quirks
    
    [ Upstream commit 93b0f9e11ce511353c65b7f924cf5f95bd9c3aba ]
    
    Rename the array sil_blacklist to sil_quirks as this name is more
    neutral and is also consistent with how this driver defines quirks with
    the SIL_QUIRK_XXX flags.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
blk_iocost: fix more out of bound shifts [+ + +]
Author: Konstantin Ovsepian <[email protected]>
Date:   Thu Aug 22 08:41:36 2024 -0700

    blk_iocost: fix more out of bound shifts
    
    [ Upstream commit 9bce8005ec0dcb23a58300e8522fe4a31da606fa ]
    
    Running UBSAN recently caught a few out-of-bounds shifts in the
    ioc_forgive_debts() function:
    
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2142:38
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2144:30
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    Call Trace:
    <IRQ>
    dump_stack_lvl+0xca/0x130
    __ubsan_handle_shift_out_of_bounds+0x22c/0x280
    ? __lock_acquire+0x6441/0x7c10
    ioc_timer_fn+0x6cec/0x7750
    ? blk_iocost_init+0x720/0x720
    ? call_timer_fn+0x5d/0x470
    call_timer_fn+0xfa/0x470
    ? blk_iocost_init+0x720/0x720
    __run_timer_base+0x519/0x700
    ...
    
    The actual impact of this issue was not identified, but I propose to fix
    the undefined behaviour.
    The proposed fix to prevent those out-of-bounds shifts consists of
    precalculating the exponent before using it in the shift operations, by
    taking the min value of the actual exponent and the maximum possible
    number of bits.
    
    Reported-by: Breno Leitao <[email protected]>
    Signed-off-by: Konstantin Ovsepian <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
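
The clamping idea in this entry can be sketched in userspace C. In C, shifting a 64-bit value by 64 or more is undefined behaviour, so the exponent is capped first. The helper name `shr_clamped` and the cap of 63 are assumptions made for this illustration; the kernel patch may express the limit differently.

```c
#include <stdint.h>

/* Shift `value` right by `exponent`, clamping the exponent to 63 first
 * so the shift is always defined for a 64-bit type. */
static uint64_t shr_clamped(uint64_t value, uint64_t exponent)
{
	uint64_t shift = exponent < 63 ? exponent : 63;

	return value >> shift;
}
```

With an exponent of 80, as in the UBSAN report, the sketch shifts by 63 instead of invoking undefined behaviour.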

 
block: fix integer overflow in BLKSECDISCARD [+ + +]
Author: Alexey Dobriyan <[email protected]>
Date:   Tue Sep 3 22:48:19 2024 +0300

    block: fix integer overflow in BLKSECDISCARD
    
    [ Upstream commit 697ba0b6ec4ae04afb67d3911799b5e2043b4455 ]
    
    I independently rediscovered
    
            commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155
            block: fix overflow in blk_ioctl_discard()
    
    but for secure erase.
    
    Same problem:
    
            uint64_t r[2] = {512, 18446744073709551104ULL};
            ioctl(fd, BLKSECDISCARD, r);
    
    will enter a near-infinite loop inside blkdev_issue_secure_erase():
    
            a.out: attempt to access beyond end of device
            loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048
            bio_check_eod: 3286214 callbacks suppressed
    
    Signed-off-by: Alexey Dobriyan <[email protected]>
    Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:12:04 2024 +0800

    Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 7b1ab460592ca818e7b52f27cd3ec86af79220d1 ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() disables IRQ auto-enable when the IRQ is requested.
    
    Fixes: bb7f4f0bcee6 ("btmrvl: add platform specific wakeup interrupt support")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B [+ + +]
Author: Hilda Wu <[email protected]>
Date:   Thu Aug 29 16:40:05 2024 +0800

    Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B
    
    [ Upstream commit 9a0570948c5def5c59e588dc0e009ed850a1f5a1 ]
    
    To track multiple devices concurrently with a condition, this patch
    enables the HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER quirk on the RTL8852B
    controller.
    
    The quirk setting is based on commit 9e14606d8f38 ("Bluetooth: msft:
    Extended monitor tracking by address filter")
    
    With this setting, when a pattern monitor detects a device, this
    feature issues an address monitor for tracking that device, letting the
    original pattern monitor keep monitoring for new devices.
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122 [+ + +]
Author: Hilda Wu <[email protected]>
Date:   Fri Aug 16 16:58:22 2024 +0800

    Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122
    
    [ Upstream commit bdf9557f70e7512bb2f754abf90d9e9958745316 ]
    
    Add the support ID (0x0489, 0xe122) to usb_device_id table for
    Realtek RTL8852C.
    
    The device info from /sys/kernel/debug/usb/devices is shown below.
    
    T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
    D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0489 ProdID=e122 Rev= 0.00
    S:  Manufacturer=Realtek
    S:  Product=Bluetooth Radio
    S:  SerialNumber=00e04c000001
    C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:17:00 2024 -0400

    Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE
    
    commit b25e11f978b63cb7857890edb3a698599cddb10e upstream.
    
    This aligns the BR/EDR JUST_WORKS method with LE, which since 92516cd97fd4
    ("Bluetooth: Always request for user confirmation for Just Works")
    always requests user confirmation with confirm_hint set, since the
    likes of bluetoothd have a dedicated policy around the JUST_WORKS method
    (e.g. main.conf:JustWorksRepairing).
    
    CVE: CVE-2024-8805
    Cc: [email protected]
    Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Tested-by: Kiran K <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: L2CAP: Fix uaf in l2cap_connect [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon Sep 23 12:47:39 2024 -0400

    Bluetooth: L2CAP: Fix uaf in l2cap_connect
    
    [ Upstream commit 333b4fd11e89b29c84c269123f871883a30be586 ]
    
    [Syzbot reported]
    BUG: KASAN: slab-use-after-free in l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
    Read of size 8 at addr ffff8880241e9800 by task kworker/u9:0/54
    
    CPU: 0 UID: 0 PID: 54 Comm: kworker/u9:0 Not tainted 6.11.0-rc6-syzkaller-00268-g788220eee30d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Workqueue: hci2 hci_rx_work
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0xc3/0x620 mm/kasan/report.c:488
     kasan_report+0xd9/0x110 mm/kasan/report.c:601
     l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
     l2cap_connect_req net/bluetooth/l2cap_core.c:4080 [inline]
     l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:4772 [inline]
     l2cap_sig_channel net/bluetooth/l2cap_core.c:5543 [inline]
     l2cap_recv_frame+0xf0b/0x8eb0 net/bluetooth/l2cap_core.c:6825
     l2cap_recv_acldata+0x9b4/0xb70 net/bluetooth/l2cap_core.c:7514
     hci_acldata_packet net/bluetooth/hci_core.c:3791 [inline]
     hci_rx_work+0xaab/0x1610 net/bluetooth/hci_core.c:4028
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    ...
    
    Freed by task 5245:
     kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
     kasan_save_track+0x14/0x30 mm/kasan/common.c:68
     kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
     poison_slab_object+0xf7/0x160 mm/kasan/common.c:240
     __kasan_slab_free+0x32/0x50 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2256 [inline]
     slab_free mm/slub.c:4477 [inline]
     kfree+0x12a/0x3b0 mm/slub.c:4598
     l2cap_conn_free net/bluetooth/l2cap_core.c:1810 [inline]
     kref_put include/linux/kref.h:65 [inline]
     l2cap_conn_put net/bluetooth/l2cap_core.c:1822 [inline]
     l2cap_conn_del+0x59d/0x730 net/bluetooth/l2cap_core.c:1802
     l2cap_connect_cfm+0x9e6/0xf80 net/bluetooth/l2cap_core.c:7241
     hci_connect_cfm include/net/bluetooth/hci_core.h:1960 [inline]
     hci_conn_failed+0x1c3/0x370 net/bluetooth/hci_conn.c:1265
     hci_abort_conn_sync+0x75a/0xb50 net/bluetooth/hci_sync.c:5583
     abort_conn_sync+0x197/0x360 net/bluetooth/hci_conn.c:2917
     hci_cmd_sync_work+0x1a4/0x410 net/bluetooth/hci_sync.c:328
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Reported-by: [email protected]
    Tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=c12e2f941af1feb5632c
    Fixes: 7b064edae38d ("Bluetooth: Fix authentication if acl data comes before remote feature evt")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: MGMT: Fix possible crash on mgmt_index_removed [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:34:42 2024 -0400

    Bluetooth: MGMT: Fix possible crash on mgmt_index_removed
    
    [ Upstream commit f53e1c9c726d83092167f2226f32bd3b73f26c21 ]
    
    If mgmt_index_removed is called while there are commands queued on
    cmd_sync, it could lead to crashes like the trace below:
    
    0x0000053D: __list_del_entry_valid_or_report+0x98/0xdc
    0x0000053D: mgmt_pending_remove+0x18/0x58 [bluetooth]
    0x0000053E: mgmt_remove_adv_monitor_complete+0x80/0x108 [bluetooth]
    0x0000053E: hci_cmd_sync_work+0xbc/0x164 [bluetooth]
    
    So, while handling mgmt_index_removed, this patch attempts to dequeue
    commands passed as user_data to cmd_sync.
    
    Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
    Reported-by: jiaymao <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bnxt_en: Extend maximum length of version string by 1 byte [+ + +]
Author: Simon Horman <[email protected]>
Date:   Tue Aug 13 15:32:55 2024 +0100

    bnxt_en: Extend maximum length of version string by 1 byte
    
    [ Upstream commit ffff7ee843c351ce71d6e0d52f0f20bea35e18c9 ]
    
    This corrects an out-by-one error in the maximum length of the package
    version string. The size argument of snprintf includes space for the
    trailing '\0' byte, so there is no need to allow extra space for it by
    reducing the value of the size argument by 1.
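
    The snprintf() behaviour relied on here can be shown with a small
    userspace sketch. The helper name store_version and the 8-byte buffer in
    the usage note are assumptions for illustration and are unrelated to the
    driver's actual buffers.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a version string into dst, where size is the
 * full capacity of dst (it must be at least 1). snprintf() writes at most
 * size - 1 characters followed by a terminating '\0'.
 * Returns the length of the string actually stored.
 */
static size_t store_version(char *dst, size_t size, const char *version)
{
	snprintf(dst, size, "%s", version);
	return strlen(dst);
}
```

    Given an 8-byte buffer and the 7-character string "1.2.3.4", passing the
    full buffer size stores the whole string, while passing the size minus
    one needlessly drops the last character.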
    
    Found by inspection.
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Michael Chan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bpf: Fix a sdiv overflow issue [+ + +]
Author: Yonghong Song <[email protected]>
Date:   Fri Sep 13 08:03:26 2024 -0700

    bpf: Fix a sdiv overflow issue
    
    [ Upstream commit 7dd34d7b7dcf9309fc6224caf4dd5b35bedddcb7 ]
    
    Zac Ecob reported a problem where a bpf program may cause kernel crash due
    to the following error:
      Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
    
    The failure is due to the below signed divide:
      LLONG_MIN/-1 where LLONG_MIN equals -9,223,372,036,854,775,808.
    LLONG_MIN/-1 is supposed to give the positive number
    9,223,372,036,854,775,808, but this is impossible since on a 64-bit
    system the maximum positive number is 9,223,372,036,854,775,807. On
    x86_64, LLONG_MIN/-1 will cause a kernel exception. On arm64, the
    result for LLONG_MIN/-1 is LLONG_MIN.
    
    Further investigation found that all of the following sdiv/smod cases may
    trigger an exception when a bpf program is running on the x86_64 platform:
      - LLONG_MIN/-1 for 64bit operation
      - INT_MIN/-1 for 32bit operation
      - LLONG_MIN%-1 for 64bit operation
      - INT_MIN%-1 for 32bit operation
    where -1 can be an immediate or in a register.
    
    On arm64, there are no exceptions:
      - LLONG_MIN/-1 = LLONG_MIN
      - INT_MIN/-1 = INT_MIN
      - LLONG_MIN%-1 = 0
      - INT_MIN%-1 = 0
    where -1 can be an immediate or in a register.
    
    Insn patching is needed to handle the above cases, and the patched code
    produces results aligned with the above arm64 results. Below is
    pseudocode to handle sdiv/smod exceptions, covering both divisor -1 and
    divisor 0, where the divisor is stored in a register.
    
    sdiv:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L2
          if tmp == 0 goto L1
          rY = 0
      L1:
          rY = -rY;
          goto L3
      L2:
          rY /= rX
      L3:
    
    smod:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L1
          if tmp == 1 (is64 ? goto L2 : goto L3)
          rY = 0;
          goto L2
      L1:
          rY %= rX
      L2:
          goto L4  // only when !is64
      L3:
          wY = wY  // only when !is64
      L4:
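
    The results described above can be modelled in plain C for the 64-bit
    case. This is a sketch of the resulting semantics only, not the patched
    instruction sequence, and the function names are hypothetical. Negation
    is done in unsigned arithmetic to avoid signed overflow; converting the
    result back to int64_t assumes the usual two's complement wrap-around
    found on mainstream compilers.

```c
#include <stdint.h>

/* Signed 64-bit divide that never traps: x / 0 == 0, x / -1 == -x (wrapping). */
static int64_t sdiv64_no_trap(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return 0;
	if (divisor == -1)
		return (int64_t)(0 - (uint64_t)dividend);
	return dividend / divisor;
}

/* Signed 64-bit modulo that never traps: x % 0 == x, x % -1 == 0. */
static int64_t smod64_no_trap(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return dividend;
	if (divisor == -1)
		return 0;
	return dividend % divisor;
}
```

    For example, sdiv64_no_trap(INT64_MIN, -1) returns INT64_MIN and
    smod64_no_trap(INT64_MIN, -1) returns 0, matching the arm64 results
    listed earlier.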
    
      [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
    
    Reported-by: Zac Ecob <[email protected]>
    Signed-off-by: Yonghong Song <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Make the pointer returned by iter next method valid [+ + +]
Author: Juntong Deng <[email protected]>
Date:   Thu Aug 29 21:11:17 2024 +0100

    bpf: Make the pointer returned by iter next method valid
    
    [ Upstream commit 4cc8c50c9abcb2646a7a4fcef3cea5dcb30c06cf ]
    
    Currently we cannot pass the pointer returned by iter next method as
    argument to KF_TRUSTED_ARGS or KF_RCU kfuncs, because the pointer
    returned by iter next method is not "valid".
    
    This patch sets the pointer returned by iter next method to be valid.
    
    This is based on the fact that if the iterator is implemented correctly,
    then the pointer returned from the iter next method should be valid.
    
    This does not make a NULL pointer valid. If the iter next method has
    the KF_RET_NULL flag, then the verifier will ask the eBPF program to
    check for a NULL pointer.
    
    A KF_RCU_PROTECTED iterator is a special case: the pointer returned by
    its iter next method should only be valid within an RCU critical
    section, so it should be tagged with MEM_RCU, not PTR_TRUSTED.
    
    Another special case is bpf_iter_num_next, which returns a pointer with
    base type PTR_TO_MEM. PTR_TO_MEM should not be combined with type flag
    PTR_TRUSTED (PTR_TO_MEM already means the pointer is valid).
    
    The pointer returned by the iter next method of other types of
    iterators is tagged with PTR_TRUSTED.
    
    In addition, this patch adds get_iter_from_state to help us get the
    current iterator from the current state.
    
    Signed-off-by: Juntong Deng <[email protected]>
    Link: https://lore.kernel.org/r/AM6PR03MB584869F8B448EA1C87B7CDA399962@AM6PR03MB5848.eurprd03.prod.outlook.com
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bpftool: Fix undefined behavior caused by shifting into the sign bit [+ + +]
Author: Kuan-Wei Chiu <[email protected]>
Date:   Sun Sep 8 22:00:09 2024 +0800

    bpftool: Fix undefined behavior caused by shifting into the sign bit
    
    [ Upstream commit 4cdc0e4ce5e893bc92255f5f734d983012f2bc2e ]
    
    Replace shifts of '1' with '1U' in bitwise operations within
    __show_dev_tc_bpf() to prevent undefined behavior caused by shifting
    into the sign bit of a signed integer. By using '1U', the operations
    are explicitly performed on unsigned integers, avoiding potential
    integer overflow or sign-related issues.
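
    A minimal sketch of the difference, assuming a 32-bit int as on the
    platforms bpftool targets. The helper name bit_mask is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: mask with only bit n set, for n in [0, 31]. */
static uint32_t bit_mask(unsigned int n)
{
	/*
	 * 1U keeps the shift in unsigned arithmetic, so n == 31 is well
	 * defined. With a plain int 1, shifting into bit 31 would overflow
	 * a 32-bit signed int, which is undefined behaviour.
	 */
	return 1U << n;
}
```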
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Acked-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

bpftool: Fix undefined behavior in qsort(NULL, 0, ...) [+ + +]
Author: Kuan-Wei Chiu <[email protected]>
Date:   Tue Sep 10 23:02:07 2024 +0800

    bpftool: Fix undefined behavior in qsort(NULL, 0, ...)
    
    [ Upstream commit f04e2ad394e2755d0bb2d858ecb5598718bf00d5 ]
    
    When netfilter has no entry to display, qsort is called with
    qsort(NULL, 0, ...). This results in undefined behavior, as UBSan
    reports:
    
    net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
    
    Although the C standard does not explicitly state whether calling qsort
    with a NULL pointer when the size is 0 constitutes undefined behavior,
    Section 7.1.4 of the C standard (Use of library functions) mentions:
    
    "Each of the following statements applies unless explicitly stated
    otherwise in the detailed descriptions that follow: If an argument to a
    function has an invalid value (such as a value outside the domain of
    the function, or a pointer outside the address space of the program, or
    a null pointer, or a pointer to non-modifiable storage when the
    corresponding parameter is not const-qualified) or a type (after
    promotion) not expected by a function with variable number of
    arguments, the behavior is undefined."
    
    To avoid this, add an early return when nf_link_info is NULL to prevent
    calling qsort with a NULL pointer.
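
    The guard can be sketched in userspace as follows. The names sort_ints
    and cmp_int are illustrative assumptions, and the element type is
    simplified to int rather than the structure bpftool sorts.

```c
#include <stdlib.h>

/* Three-way comparison of two ints without risking overflow from x - y. */
static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Sort items in ascending order; return early instead of calling
 * qsort() with a NULL pointer or an empty array. */
static void sort_ints(int *items, size_t count)
{
	if (!items || count == 0)
		return;
	qsort(items, count, sizeof(*items), cmp_int);
}
```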
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Reviewed-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
bridge: mcast: Fail MDB get request on empty entry [+ + +]
Author: Ido Schimmel <[email protected]>
Date:   Sun Sep 29 15:36:40 2024 +0300

    bridge: mcast: Fail MDB get request on empty entry
    
    [ Upstream commit 555f45d24ba7cd5527716553031641cdebbe76c7 ]
    
    When user space deletes a port from an MDB entry, the port is removed
    synchronously. If this was the last port in the entry and the entry is
    not joined by the host itself, then the entry is scheduled for deletion
    via a timer.
    
    The above means that it is possible for the MDB get netlink request to
    retrieve an empty entry which is scheduled for deletion. This is
    problematic as after deleting the last port in an entry, user space
    cannot rely on a non-zero return code from the MDB get request as an
    indication that the port was successfully removed.
    
    Fix by returning an error when the entry's port list is empty and the
    entry is not joined by the host.
    
    Fixes: 68b380a395a7 ("bridge: mcast: Add MDB get support")
    Reported-by: Jamie Bainbridge <[email protected]>
    Closes: https://lore.kernel.org/netdev/c92569919307749f879b9482b0f3e125b7d9d2e3.1726480066.git.jamie.bainbridge@gmail.com/
    Tested-by: Jamie Bainbridge <[email protected]>
    Signed-off-by: Ido Schimmel <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
btrfs: don't readahead the relocation inode on RST [+ + +]
Author: Johannes Thumshirn <[email protected]>
Date:   Wed Jul 31 22:43:06 2024 +0200

    btrfs: don't readahead the relocation inode on RST
    
    [ Upstream commit 04915240e2c3a018e4c7f23418478d27226c8957 ]
    
    On relocation we're doing readahead on the relocation inode, but if the
    filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
    preallocated extents not being mapped in the RST) from the lookup.
    
    But readahead doesn't handle the error and submits invalid reads to the
    device, causing an assertion in the scatter-gather list code:
    
      BTRFS info (device nvme1n1): balance: start -d -m -s
      BTRFS info (device nvme1n1): relocating block group 6480920576 flags data|raid0
      BTRFS error (device nvme1n1): cannot find raid-stripe for logical [6481928192, 6481969152] devid 2, profile raid0
      ------------[ cut here ]------------
      kernel BUG at include/linux/scatterlist.h:115!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 0 PID: 1012 Comm: btrfs Not tainted 6.10.0-rc7+ #567
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000002cd11000 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Call Trace:
       <TASK>
       ? __die_body.cold+0x14/0x25
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x65/0x80
       ? __blk_rq_map_sg+0x339/0x4a0
       ? exc_invalid_op+0x50/0x70
       ? __blk_rq_map_sg+0x339/0x4a0
       ? asm_exc_invalid_op+0x1a/0x20
       ? __blk_rq_map_sg+0x339/0x4a0
       nvme_prep_rq.part.0+0x9d/0x770
       nvme_queue_rq+0x7d/0x1e0
       __blk_mq_issue_directly+0x2a/0x90
       ? blk_mq_get_budget_and_tag+0x61/0x90
       blk_mq_try_issue_list_directly+0x56/0xf0
       blk_mq_flush_plug_list.part.0+0x52b/0x5d0
       __blk_flush_plug+0xc6/0x110
       blk_finish_plug+0x28/0x40
       read_pages+0x160/0x1c0
       page_cache_ra_unbounded+0x109/0x180
       relocate_file_extent_cluster+0x611/0x6a0
       ? btrfs_search_slot+0xba4/0xd20
       ? balance_dirty_pages_ratelimited_flags+0x26/0xb00
       relocate_data_extent.constprop.0+0x134/0x160
       relocate_block_group+0x3f2/0x500
       btrfs_relocate_block_group+0x250/0x430
       btrfs_relocate_chunk+0x3f/0x130
       btrfs_balance+0x71b/0xef0
       ? kmalloc_trace_noprof+0x13b/0x280
       btrfs_ioctl+0x2c2e/0x3030
       ? kvfree_call_rcu+0x1e6/0x340
       ? list_lru_add_obj+0x66/0x80
       ? mntput_no_expire+0x3a/0x220
       __x64_sys_ioctl+0x96/0xc0
       do_syscall_64+0x54/0x110
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fcc04514f9b
      Code: Unable to access opcode bytes at 0x7fcc04514f71.
      RSP: 002b:00007ffeba923370 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fcc04514f9b
      RDX: 00007ffeba923460 RSI: 00000000c4009420 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000013 R09: 0000000000000001
      R10: 00007fcc043fbba8 R11: 0000000000000246 R12: 00007ffeba924fc5
      R13: 00007ffeba923460 R14: 0000000000000002 R15: 00000000004d4bb0
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fcc04514f71 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception ]---
    
    So in case of a relocation on a RAID stripe-tree based file system, skip
    the readahead.
    
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: drop the backref cache during relocation if we commit [+ + +]
Author: Josef Bacik <[email protected]>
Date:   Tue Sep 24 16:50:22 2024 -0400

    btrfs: drop the backref cache during relocation if we commit
    
    commit db7e68b522c01eb666cfe1f31637775f18997811 upstream.
    
    Since the inception of relocation we have maintained the backref cache
    across transaction commits, updating the backref cache with the new
    bytenr whenever we COWed blocks that were in the cache, and then
    updating their bytenr once we detected a transaction id change.
    
    This works as long as we're only ever modifying blocks, not changing the
    structure of the tree.
    
    However relocation does in fact change the structure of the tree.  For
    example, if we are relocating a data extent, we will look up all the
    leaves that point to this data extent.  We will then call
    do_relocation() on each of these leaves, which will COW down to the leaf
    and then update the file extent location.
    
    But, a key feature of do_relocation() is the pending list.  This is all
    the pending nodes that we modified when we updated the file extent item.
    We will then process all of these blocks via finish_pending_nodes, which
    calls do_relocation() on all of the nodes that led up to that leaf.
    
    The purpose of this is to make sure we don't break sharing unless we
    absolutely have to.  Consider the case that we have 3 snapshots that all
    point to this leaf through the same nodes, the initial COW would have
    created a whole new path.  If we did this for all 3 snapshots we would
    end up with 3x the number of nodes we had originally.  To avoid this we
    will cycle through each of the snapshots that point to each of these
    nodes and update their pointers to point at the new nodes.
    
    Once we update the pointer to the new node we will drop the node we
    removed the link for and all of its children via btrfs_drop_subtree().
    This is essentially just btrfs_drop_snapshot(), but for an arbitrary
    point in the snapshot.
    
    The problem with this is that we will never reflect this in the backref
    cache.  If we do this btrfs_drop_snapshot() for a node that is in the
    backref tree, we will leave the node in the backref tree.  This becomes
    a problem when we change the transid, as now the backref cache has
    entire subtrees that no longer exist, but exist as if they still are
    pointed to by the same roots.
    
    In the best case scenario you end up with "adding refs to an existing
    tree ref" errors from insert_inline_extent_backref(), where we attempt
    to link in nodes on roots that are no longer valid.
    
    Worst case you will double free some random block and re-use it when
    there are still references to the block.
    
    This is extremely subtle, and the consequences are quite bad.  There
    isn't a way to make sure our backref cache is consistent between
    transid's.
    
    In order to fix this we need to simply evict the entire backref cache
    anytime we cross transid's.  This reduces performance in that we have to
    rebuild this backref cache every time we change transid's, but fixes the
    bug.
    
    This has existed since relocation was added, and is a pretty critical
    bug.  There's a lot more cleanup that can be done now that this
    functionality is going away, but this patch is as small as possible in
    order to fix the problem and make it easy for us to backport it to all
    the kernels it needs to be backported to.
    
    Followup series will dismantle more of this code and simplify relocation
    drastically to remove this functionality.
    
    We have a reproducer that reproduced the corruption within a few minutes
    of running.  With this patch it survives several iterations/hours of
    running the reproducer.
    
    Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance")
    CC: [email protected]
    Reviewed-by: Boris Burkov <[email protected]>
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [+ + +]
Author: Qu Wenruo <[email protected]>
Date:   Sat Sep 28 08:05:58 2024 +0930

    btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
    
    commit c3b47f49e83197e8dffd023ec568403bcdbb774b upstream.
    
    [BUG]
    Syzbot reported a NULL pointer dereference with the following crash:
    
      FAULT_INJECTION: forcing a failure.
       start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676
       prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642
       relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678
      ...
      BTRFS info (device loop0): balance: ended with status: -12
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI
      KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667]
      RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926
      Call Trace:
       <TASK>
       commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496
       btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430
       del_balance_item fs/btrfs/volumes.c:3678 [inline]
       reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742
       btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574
       btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [CAUSE]
    The allocation failure happens at the start_transaction() inside
    prepare_to_relocate(), and during the error handling we call
    unset_reloc_control(), which sets fs_info->reloc_ctl to NULL.
    
    Then we continue the error path cleanup in btrfs_balance() by calling
    reset_balance_state() which will call del_balance_item() to fully delete
    the balance item in the root tree.
    
    However during the small window between set_reloc_control() and
    unset_reloc_control(), we can have a subvolume tree update that creates
    a reloc_root for that subvolume.
    
    Then we go into the final btrfs_commit_transaction() of
    del_balance_item(), and into btrfs_update_reloc_root() inside
    commit_fs_roots().
    
    That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
    stage, but since fs_info->reloc_ctl is NULL, it results in a NULL
    pointer dereference.
    
    [FIX]
    Just add an extra check on fs_info->reloc_ctl inside
    btrfs_update_reloc_root(), before checking
    fs_info->reloc_ctl->merge_reloc_tree.
    
    That DEAD_RELOC_TREE handling is to prevent further modification to the
    reloc tree during the merge stage, but since there is no reloc_ctl at
    all, we do not need to bother with that.
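
    The shape of that check can be sketched in userspace with simplified
    stand-in structures. The struct and function names below are
    hypothetical and carry only the two fields mentioned in this message;
    they are not the btrfs definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the btrfs structures; only the fields used here. */
struct reloc_control_sketch {
	bool merge_reloc_tree;
};

struct fs_info_sketch {
	struct reloc_control_sketch *reloc_ctl;
};

/*
 * Test the pointer before dereferencing it: with no relocation control
 * there is no merge stage to worry about, so report false.
 */
static bool in_merge_stage(const struct fs_info_sketch *fs_info)
{
	return fs_info->reloc_ctl && fs_info->reloc_ctl->merge_reloc_tree;
}
```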
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: send: fix buffer overflow detection when copying path to cache entry [+ + +]
Author: Filipe Manana <[email protected]>
Date:   Thu Sep 19 22:20:34 2024 +0100

    btrfs: send: fix buffer overflow detection when copying path to cache entry
    
    commit 96c6ca71572a3556ed0c37237305657ff47174b7 upstream.
    
    Starting with commit c0247d289e73 ("btrfs: send: annotate struct
    name_cache_entry with __counted_by()") we annotated the variable length
    array "name" from the name_cache_entry structure with __counted_by() to
    improve overflow detection. However that alone was not correct, because
    the length of that array does not match the "name_len" field - it matches
    that plus 1 to include the NUL string terminator, so that makes a
    fortified kernel think there's an overflow and report a splat like this:
    
      strcpy: detected buffer overflow: 20 byte write of buffer size 19
      WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50
      CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1
      Hardware name: CompuLab Ltd.  sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018
      RIP: 0010:__fortify_report+0x45/0x50
      Code: 48 8b 34 (...)
      RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246
      RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027
      RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8
      RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd
      R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400
      R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8
      FS:  00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0
      Call Trace:
       <TASK>
       ? __warn+0x12a/0x1d0
       ? __fortify_report+0x45/0x50
       ? report_bug+0x154/0x1c0
       ? handle_bug+0x42/0x70
       ? exc_invalid_op+0x1a/0x50
       ? asm_exc_invalid_op+0x1a/0x20
       ? __fortify_report+0x45/0x50
       __fortify_panic+0x9/0x10
      __get_cur_name_and_parent+0x3bc/0x3c0
       get_cur_path+0x207/0x3b0
       send_extent_data+0x709/0x10d0
       ? find_parent_nodes+0x22df/0x25d0
       ? mas_nomem+0x13/0x90
       ? mtree_insert_range+0xa5/0x110
       ? btrfs_lru_cache_store+0x5f/0x1e0
       ? iterate_extent_inodes+0x52d/0x5a0
       process_extent+0xa96/0x11a0
       ? __pfx_lookup_backref_cache+0x10/0x10
       ? __pfx_store_backref_cache+0x10/0x10
       ? __pfx_iterate_backrefs+0x10/0x10
       ? __pfx_check_extent_item+0x10/0x10
       changed_cb+0x6fa/0x930
       ? tree_advance+0x362/0x390
       ? memcmp_extent_buffer+0xd7/0x160
       send_subvol+0xf0a/0x1520
       btrfs_ioctl_send+0x106b/0x11d0
       ? __pfx___clone_root_cmp_sort+0x10/0x10
       _btrfs_ioctl_send+0x1ac/0x240
       btrfs_ioctl+0x75b/0x850
       __se_sys_ioctl+0xca/0x150
       do_syscall_64+0x85/0x160
       ? __count_memcg_events+0x69/0x100
       ? handle_mm_fault+0x1327/0x15c0
       ? __se_sys_rt_sigprocmask+0xf1/0x180
       ? syscall_exit_to_user_mode+0x75/0xa0
       ? do_syscall_64+0x91/0x160
       ? do_user_addr_fault+0x21d/0x630
      entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fae145eeb4f
      Code: 00 48 89 (...)
      RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f
      RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004
      RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927
      R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8
      R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004
       </TASK>
    
    Fix this by not storing the NUL string terminator, since we don't
    actually need it for name cache entries. This way "name_len" corresponds
    to the actual size of the "name" array. This requires marking the "name"
    array field with __nonstring and using memcpy() instead of strcpy(), as
    recommended by the guidelines at:
    
       https://github.com/KSPP/linux/issues/90
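    
    A minimal userspace sketch of the idea follows. The struct and helper
    names are hypothetical, and the kernel-only __counted_by() and
    __nonstring annotations are left out so that it builds with any C99
    compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct name_cache_entry. */
struct toy_name_entry {
    int name_len;
    char name[]; /* exactly name_len bytes, no NUL terminator */
};

/* Allocate an entry holding the first len bytes of src, without a NUL. */
static struct toy_name_entry *toy_name_entry_new(const char *src, int len)
{
    struct toy_name_entry *e;

    assert(src != NULL && len >= 0);
    e = malloc(sizeof(*e) + (size_t)len);
    if (!e)
        return NULL;
    e->name_len = len;
    /* memcpy() writes exactly len bytes; strcpy() would write len + 1. */
    memcpy(e->name, src, (size_t)len);
    return e;
}

/* Compare against a NUL-terminated string using the stored length. */
static int toy_name_entry_equals(const struct toy_name_entry *e, const char *s)
{
    return strlen(s) == (size_t)e->name_len &&
           memcmp(e->name, s, (size_t)e->name_len) == 0;
}
```

    Because the array is never read as a C string, every consumer has to
    use name_len, as the comparison helper above does.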
    
    Reported-by: David Arendt <[email protected]>
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    Fixes: c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()")
    CC: [email protected] # 6.11
    Tested-by: David Arendt <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: send: fix invalid clone operation for file that got its size decreased [+ + +]
Author: Filipe Manana <[email protected]>
Date:   Fri Sep 27 10:50:12 2024 +0100

    btrfs: send: fix invalid clone operation for file that got its size decreased
    
    commit fa630df665aa9ddce3a96ce7b54e10a38e4d2a2b upstream.
    
    During an incremental send we may end up sending an invalid clone
    operation, for the last extent of a file which ends at an unaligned offset
    that matches the final i_size of the file in the send snapshot, in case
    the file had its initial size (the size in the parent snapshot) decreased
    in the send snapshot. In this case the destination will fail to apply the
    clone operation because its end offset is not sector size aligned and it
    ends before the current size of the file.
    
    Sending the truncate operation always happens when we finish processing an
    inode, after we process all its extents (and xattrs, names, etc). So fix
    this by ensuring the file has a valid size before we send a clone
    operation for an unaligned extent that ends at the final i_size of the
    file. The size we truncate to matches the start offset of the clone range
    but it could be any value between that start offset and the final size of
    the file since the clone operation will expand the i_size if the current
    size is smaller than the end offset. The start offset of the range was
    chosen because it's always sector size aligned and avoids a truncation
    into the middle of a page, which results in dirtying the page due to
    filling part of it with zeroes and then making the clone operation at the
    receiver trigger IO.
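    
    The receiver-side rule described above can be modelled as follows. This
    is a simplified sketch built only from the description in this message:
    clone_end_ok() is a hypothetical helper and the 4096 byte sector size
    is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 4096u /* assumed sector size for this sketch */

/*
 * A clone whose end offset is not sector aligned is accepted only if it
 * does not end before the destination file's current size.
 */
static bool clone_end_ok(uint64_t dst_size, uint64_t off, uint64_t len)
{
    uint64_t end;

    assert(len > 0); /* callers pass a non-empty range */
    end = off + len;
    if (end % SECTOR_SIZE == 0)
        return true;
    return end >= dst_size;
}
```

    For a destination that is still 1M in size, a clone of 128K + 5 bytes
    at offset 128K is rejected. After truncating the destination to the
    start offset of the range (128K), the same clone is accepted.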
    
    The following test reproduces the issue:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/sdi
      MNT=/mnt/sdi
    
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      # Create a file with a size of 256K + 5 bytes, having two extents, one
      # with a size of 128K and another one with a size of 128K + 5 bytes.
      last_ext_size=$((128 * 1024 + 5))
      xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \
             -c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \
             $MNT/foo
    
      # Another file which we will later clone foo into, but initially with
      # a larger size than foo.
      xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap1
    
      # Now resize bar and clone foo into it.
      xfs_io -c "truncate 0" \
             -c "reflink $MNT/foo" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap2
    
      rm -f /tmp/send-full /tmp/send-inc
      btrfs send -f /tmp/send-full $MNT/snap1
      btrfs send -p $MNT/snap1 -f /tmp/send-inc $MNT/snap2
    
      umount $MNT
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      btrfs receive -f /tmp/send-full $MNT
      btrfs receive -f /tmp/send-inc $MNT
    
      umount $MNT
    
    Running it before this patch:
    
      $ ./test.sh
      (...)
      At subvol snap1
      At snapshot snap2
      ERROR: failed to clone extents to bar: Invalid argument
    
    A test case for fstests will be sent soon.
    
    Reported-by: Ben Millwood <[email protected]>
    Link: https://lore.kernel.org/linux-btrfs/CAJhrHS2z+WViO2h=ojYvBPDLsATwLbg+7JaNCyYomv0fUxEpQQ@mail.gmail.com/
    Fixes: 46a6e10a1ab1 ("btrfs: send: allow cloning non-aligned extent if it ends at i_size")
    CC: [email protected] # 6.11
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: wait for fixup workers before stopping cleaner kthread during umount [+ + +]
Author: Filipe Manana <[email protected]>
Date:   Tue Oct 1 11:06:52 2024 +0100

    btrfs: wait for fixup workers before stopping cleaner kthread during umount
    
    commit 41fd1e94066a815a7ab0a7025359e9b40e4b3576 upstream.
    
    During unmount, at close_ctree(), we have the following steps in this order:
    
    1) Park the cleaner kthread - this doesn't destroy the kthread, it basically
       halts its execution (wake ups against it work but do nothing);
    
    2) We stop the cleaner kthread - this results in freeing the respective
       struct task_struct;
    
    3) We call btrfs_stop_all_workers() which waits for any jobs running in all
       the work queues and then frees the work queues.
    
    Syzbot reported a case where a fixup worker resulted in a crash when doing
    a delayed iput on its inode while attempting to wake up the cleaner at
    btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread
    was already freed. This can happen during unmount because we don't wait
    for any fixup workers still running before we call kthread_stop() against
    the cleaner kthread, which stops it and frees all its resources.
    
    Fix this by waiting for any fixup workers at close_ctree() before we call
    kthread_stop() against the cleaner and run pending delayed iputs.
    
    The stack traces reported by syzbot were the following:
    
      BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
      Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52
    
      CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
      Workqueue: btrfs-fixup btrfs_work_helper
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0x169/0x550 mm/kasan/report.c:488
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154
       btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842
       btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314
       process_one_work kernel/workqueue.c:3229 [inline]
       process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
       worker_thread+0x870/0xd30 kernel/workqueue.c:3391
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
    
      Allocated by task 2:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       unpoison_slab_object mm/kasan/common.c:319 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Freed by task 61:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:230 [inline]
       slab_free_hook mm/slub.c:2343 [inline]
       slab_free mm/slub.c:4580 [inline]
       kmem_cache_free+0x1a2/0x420 mm/slub.c:4682
       put_task_struct include/linux/sched/task.h:144 [inline]
       delayed_put_task_struct+0x125/0x300 kernel/exit.c:228
       rcu_do_batch kernel/rcu/tree.c:2567 [inline]
       rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823
       handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
       __do_softirq kernel/softirq.c:588 [inline]
       invoke_softirq kernel/softirq.c:428 [inline]
       __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
       irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
       instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline]
       sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
    
      Last potentially related work creation:
       kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
       __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
       __call_rcu_common kernel/rcu/tree.c:3086 [inline]
       call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190
       context_switch kernel/sched/core.c:5318 [inline]
       __schedule+0x184b/0x4ae0 kernel/sched/core.c:6675
       schedule_idle+0x56/0x90 kernel/sched/core.c:6793
       do_idle+0x56a/0x5d0 kernel/sched/idle.c:354
       cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
       start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314
       common_startup_64+0x13e/0x147
    
      The buggy address belongs to the object at ffff8880272a8000
       which belongs to the cache task_struct of size 7424
      The buggy address is located 2584 bytes inside of
       freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x272a8
      head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: f5(slab)
      raw: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      head: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000003 ffffea00009caa01 ffffffffffffffff 0000000000000000
      head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 2, tgid 2 (kthreadd), ts 71247381401, free_ts 71214998153
       set_page_owner include/linux/page_owner.h:32 [inline]
       post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
       prep_new_page mm/page_alloc.c:1545 [inline]
       get_page_from_freelist+0x3039/0x3180 mm/page_alloc.c:3457
       __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4733
       alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
       alloc_slab_page+0x6a/0x120 mm/slub.c:2413
       allocate_slab+0x5a/0x2f0 mm/slub.c:2579
       new_slab mm/slub.c:2632 [inline]
       ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3819
       __slab_alloc+0x58/0xa0 mm/slub.c:3909
       __slab_alloc_node mm/slub.c:3962 [inline]
       slab_alloc_node mm/slub.c:4123 [inline]
       kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      page last free pid 5230 tgid 5230 stack trace:
       reset_page_owner include/linux/page_owner.h:25 [inline]
       free_pages_prepare mm/page_alloc.c:1108 [inline]
       free_unref_page+0xcd0/0xf00 mm/page_alloc.c:2638
       discard_slab mm/slub.c:2678 [inline]
       __put_partials+0xeb/0x130 mm/slub.c:3146
       put_cpu_partial+0x17c/0x250 mm/slub.c:3221
       __slab_free+0x2ea/0x3d0 mm/slub.c:4450
       qlink_free mm/kasan/quarantine.c:163 [inline]
       qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179
       kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
       __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4142
       getname_flags+0xb7/0x540 fs/namei.c:139
       do_sys_openat2+0xd2/0x1d0 fs/open.c:1409
       do_sys_open fs/open.c:1430 [inline]
       __do_sys_openat fs/open.c:1446 [inline]
       __se_sys_openat fs/open.c:1441 [inline]
       __x64_sys_openat+0x247/0x2a0 fs/open.c:1441
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
      Memory state around the buggy address:
       ffff8880272a8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880272a8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
       ffff8880272a8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
cachefiles: fix dentry leak in cachefiles_open_file() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 29 16:34:09 2024 +0800

    cachefiles: fix dentry leak in cachefiles_open_file()
    
    commit da6ef2dffe6056aad3435e6cf7c6471c2a62187c upstream.
    
    A dentry leak may occur when a cookie lookup and a cull run concurrently:
    
                P1             |             P2
    -----------------------------------------------------------
    cachefiles_lookup_cookie
      cachefiles_look_up_object
        lookup_one_positive_unlocked
         // get dentry
                                cachefiles_cull
                                  inode->i_flags |= S_KERNEL_FILE;
        cachefiles_open_file
          cachefiles_mark_inode_in_use
            __cachefiles_mark_inode_in_use
              can_use = false
              if (!(inode->i_flags & S_KERNEL_FILE))
                can_use = true
              return false
            return false
            // Returns an error but doesn't put dentry
    
    After that, the following WARNING will be triggered when the backend
    folder is unmounted:
    
    ==================================================================
    BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img}  still in use (1) [unmount of ext4 sda]
    WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70
    CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25
    RIP: 0010:umount_check+0x5d/0x70
    Call Trace:
     <TASK>
     d_walk+0xda/0x2b0
     do_one_tree+0x20/0x40
     shrink_dcache_for_umount+0x2c/0x90
     generic_shutdown_super+0x20/0x160
     kill_block_super+0x1a/0x40
     ext4_kill_sb+0x22/0x40
     deactivate_locked_super+0x35/0x80
     cleanup_mnt+0x104/0x160
    ==================================================================
    
    Whether cachefiles_open_file() returns true or false, the reference count
    obtained by lookup_positive_unlocked() in cachefiles_look_up_object()
    should be released.
    
    Therefore release that reference count in cachefiles_look_up_object() to
    fix the above issue and simplify the code.
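    
    The reference handling can be illustrated with a toy model. All names
    below are hypothetical, and the model ignores the separate reference
    that an opened file holds on success.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted dentry. */
struct toy_dentry {
    int refcount;
    bool kernel_file; /* models S_KERNEL_FILE having been set by a cull */
};

static struct toy_dentry *toy_get(struct toy_dentry *d)
{
    d->refcount++;
    return d;
}

static void toy_put(struct toy_dentry *d)
{
    assert(d->refcount > 0);
    d->refcount--;
}

/* Models the open step: it fails when the inode is already marked in use. */
static bool toy_open_file(const struct toy_dentry *d)
{
    return !d->kernel_file;
}

/* Models the fixed lookup: the lookup reference is dropped on both outcomes. */
static bool toy_look_up_object(struct toy_dentry *d)
{
    struct toy_dentry *ref = toy_get(d);
    bool ok = toy_open_file(ref);

    toy_put(ref); /* released whether the open succeeded or failed */
    return ok;
}
```

    Putting the reference in one place, right after the open attempt,
    leaves the count balanced on the failure path as well.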
    
    Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode [+ + +]
Author: Stefan Mätje <[email protected]>
Date:   Thu Aug 8 18:42:24 2024 +0200

    can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode
    
    [ Upstream commit 2423cc20087ae9a7b7af575aa62304ef67cad7b6 ]
    
    This patch moves the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_changelink in front of the evaluation of data[IFLA_CAN_BITTIMING].
    
    This avoids a call to do_set_data_bittiming providing a stale
    can_priv::ctrlmode with a CAN_CTRLMODE_FD flag not matching the
    requested state when switching between a CAN Classic and CAN-FD bitrate.
    
    In the same manner the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_validate is also moved in front of the evaluation of
    data[IFLA_CAN_BITTIMING].
    
    This is a preparation for patches where the nominal and data bittiming
    may have interdependencies on the driver side depending on the
    CAN_CTRLMODE_FD flag state.
    
    Signed-off-by: Stefan Mätje <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Marc Kleine-Budde <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ceph: fix a memory leak on cap_auths in MDS client [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Aug 19 10:52:17 2024 +0100

    ceph: fix a memory leak on cap_auths in MDS client
    
    [ Upstream commit d97079e97eab20e08afc507f2bed4501e2824717 ]
    
    The cap_auths that are allocated during an MDS session opening are never
    released, causing a memory leak detected by kmemleak.  Fix this by freeing
    the memory allocated when shutting down the MDS client.
    
    Fixes: 1d17de9534cb ("ceph: save cap_auths in MDS client when session is opened")
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Xiubo Li <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ceph: fix cap ref leak via netfs init_request [+ + +]
Author: Patrick Donnelly <[email protected]>
Date:   Wed Oct 2 21:05:12 2024 -0400

    ceph: fix cap ref leak via netfs init_request
    
    commit ccda9910d8490f4fb067131598e4b2e986faa5a0 upstream.
    
    Log recovered from a user's cluster:
    
        <7>[ 5413.970692] ceph:  get_cap_refs 00000000958c114b ret 1 got Fr
        <7>[ 5413.970695] ceph:  start_read 00000000958c114b, no cache cap
        ...
        <7>[ 5473.934609] ceph:   my wanted = Fr, used = Fr, dirty -
        <7>[ 5473.934616] ceph:  revocation: pAsLsXsFr -> pAsLsXs (revoking Fr)
        <7>[ 5473.934632] ceph:  __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs
        <7>[ 5473.934638] ceph:  check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr  AUTHONLY NOINVAL FLUSH_FORCE
    
    The MDS subsequently complains that the kernel client is late releasing
    caps.
    
    Approximately, a series of changes to this code by commits 49870056005c
    ("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315
    ("netfs: Change ->init_request() to return an error code") and
    a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    caused subtle resource cleanup to be missed. The main culprit is
    the change in error handling in 2de160417315, which meant that a failure
    in init_request() would no longer cause cleanup to be called. That
    would prevent the ceph_put_cap_refs() call which would clean up the
    leaked cap ref.
    
    Cc: [email protected]
    Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    Link: https://tracker.ceph.com/issues/67008
    Signed-off-by: Patrick Donnelly <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: remove the incorrect Fw reference check when dirtying pages [+ + +]
Author: Xiubo Li <[email protected]>
Date:   Thu Sep 5 06:22:18 2024 +0800

    ceph: remove the incorrect Fw reference check when dirtying pages
    
    [ Upstream commit c08dfb1b49492c09cf13838c71897493ea3b424e ]
    
    When doing direct-io reads it will also try to mark pages dirty,
    but the read path won't hold the Fw caps and there is no case in
    which it will get the Fw reference.
    
    Fixes: 5dda377cf0a6 ("ceph: set i_head_snapc when getting CEPH_CAP_FILE_WR reference")
    Signed-off-by: Xiubo Li <[email protected]>
    Reviewed-by: Patrick Donnelly <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
cifs: Do not convert delimiter when parsing NFS-style symlinks [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Sat Sep 28 23:59:46 2024 +0200

    cifs: Do not convert delimiter when parsing NFS-style symlinks
    
    [ Upstream commit d3a49f60917323228f8fdeee313260ef14f94df7 ]
    
    NFS-style symlinks always store the target location in NFS/UNIX form,
    where a backslash means a real UNIX backslash and not the SMB path
    separator.
    
    So do not mangle the slash and backslash content of an NFS-style symlink
    during the readlink() syscall, as it is already in the correct Linux form.
    
    This fixes interoperability of NFS-style symlinks with backslashes created
    by a Linux NFS3 client through a Windows NFS server and retrieved by a
    Linux SMB client through a Windows SMB server, where both Windows servers
    export the same directory.
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Fix buffer overflow when parsing NFS reparse points [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Sun Sep 29 12:22:40 2024 +0200

    cifs: Fix buffer overflow when parsing NFS reparse points
    
    [ Upstream commit e2a8910af01653c1c268984855629d71fb81f404 ]
    
    ReparseDataLength is the sum of the InodeType size and the DataBuffer
    size. So to get the DataBuffer size, the InodeType size must be
    subtracted from ReparseDataLength.
    
    Function cifs_strndup_from_utf16() is currently accessing buf->DataBuffer
    at a position after the end of the buffer because it does not subtract
    the InodeType size from the length. Fix this problem and correctly
    subtract it from the variable len.
    
    Member InodeType is present only when the reparse buffer is large enough.
    Check ReparseDataLength before accessing InodeType to prevent another
    invalid memory access.
    
    Major and minor rdev values are also present only when the reparse buffer
    is large enough. Check the reparse buffer size before calling
    reparse_mkdev().
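    
    The length arithmetic can be sketched as below. This is an illustration
    only: nfs_reparse_data_len() is a hypothetical helper, and the 8 byte
    size of InodeType is an assumption (a 64-bit field).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INODE_TYPE_SIZE 8u /* assumed: InodeType is a 64-bit field */

/*
 * Returns 0 and stores the DataBuffer length on success, or -1 when
 * ReparseDataLength is too small to hold InodeType at all.
 */
static int nfs_reparse_data_len(uint16_t reparse_data_length, size_t *data_len)
{
    assert(data_len != NULL);
    if (reparse_data_length < INODE_TYPE_SIZE)
        return -1;
    *data_len = (size_t)reparse_data_length - INODE_TYPE_SIZE;
    return 0;
}
```

    Checking the lower bound before subtracting also keeps the unsigned
    subtraction from wrapping around to a huge length.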
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Remove intermediate object of failed create reparse call [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Mon Sep 30 22:25:10 2024 +0200

    cifs: Remove intermediate object of failed create reparse call
    
    [ Upstream commit c9432ad5e32f066875b1bf95939c363bc46d6a45 ]
    
    If CREATE was successful but SMB2_OP_SET_REPARSE failed then remove the
    intermediate object created by CREATE. Otherwise an empty object stays on
    the server when the reparse call fails.
    
    This ensures that if creating special files is unsupported by the
    server, no empty file stays on the server as a result of the unsupported
    operation.
    
    Fixes: 102466f303ff ("smb: client: allow creating special files via reparse points")
    Signed-off-by: Pali Rohár <[email protected]>
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL [+ + +]
Author: Ajit Pandey <[email protected]>
Date:   Tue Jun 11 19:07:45 2024 +0530

    clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL
    
    commit fff617979f97c773aaa9432c31cf62444b3bdbd4 upstream.
    
    In the LUCID EVO PLL, the CAL_L_VAL and L_VAL bitfields are part of a
    single PLL_L_VAL register. Updating the L_VAL bitfield in the PLL_L_VAL
    register with the regmap_write() API in the __alpha_pll_trion_set_rate
    callback overrides the LUCID EVO PLL initial configuration of the
    PLL_CAL_L_VAL bitfield in the same register.
    
    Random PLL lock failures were observed during PLL enable due to this
    override of the PLL calibration value. Use regmap_update_bits() with
    the L_VAL bitfield mask instead of the regmap_write() API to update only
    the L_VAL bitfield in the __alpha_pll_trion_set_rate callback.
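    
    The difference between a full write and a masked update can be shown
    on a plain 32-bit value. This is a sketch and not the regmap API: both
    helper names are hypothetical, and placing L_VAL in the low 16 bits is
    an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define L_VAL_MASK 0x0000ffffu /* assumed: L_VAL occupies the low 16 bits */

/* Full write: the whole register is replaced, so CAL_L_VAL is lost. */
static uint32_t set_l_val_write(uint32_t reg, uint32_t l_val)
{
    assert(l_val <= L_VAL_MASK); /* value must fit the field */
    (void)reg;
    return l_val;
}

/* Masked update: only the L_VAL bits change, the upper bits are kept. */
static uint32_t set_l_val_update_bits(uint32_t reg, uint32_t l_val)
{
    assert(l_val <= L_VAL_MASK); /* value must fit the field */
    return (reg & ~L_VAL_MASK) | (l_val & L_VAL_MASK);
}
```

    Starting from 0x00440020, setting L_VAL to 0x31 gives 0x00000031 with
    the full write but 0x00440031 with the masked update.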
    
    Fixes: 260e36606a03 ("clk: qcom: clk-alpha-pll: add Lucid EVO PLL configuration interfaces")
    Cc: [email protected]
    Signed-off-by: Ajit Pandey <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Acked-by: Vladimir Zapolskiy <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: clk-rpmh: Fix overflow in BCM vote [+ + +]
Author: Mike Tipton <[email protected]>
Date:   Fri Aug 9 10:51:29 2024 +0530

    clk: qcom: clk-rpmh: Fix overflow in BCM vote
    
    commit a4e5af27e6f6a8b0d14bc0d7eb04f4a6c7291586 upstream.
    
    Valid frequencies may result in BCM votes that exceed the max HW value.
    Set vote ceiling to BCM_TCS_CMD_VOTE_MASK to ensure the votes aren't
    truncated, which can result in lower frequencies than desired.
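    
    Why truncation lowers the frequency can be seen in a small sketch. The
    helper names are hypothetical, and the 14-bit field width used for
    VOTE_MASK is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define VOTE_MASK 0x3fffu /* assumed width of the hardware vote field */

/* Truncation keeps only the low bits, so a large vote can become tiny. */
static uint32_t vote_truncated(uint64_t vote)
{
    return (uint32_t)(vote & VOTE_MASK);
}

/* Clamping caps the vote at the largest value the field can hold. */
static uint32_t vote_clamped(uint64_t vote)
{
    uint32_t v = vote > VOTE_MASK ? VOTE_MASK : (uint32_t)vote;

    assert(v <= VOTE_MASK); /* the clamped value always fits the field */
    return v;
}
```

    A vote of 0x4001 truncates to 1 but clamps to 0x3fff, while votes that
    already fit are returned unchanged by both helpers.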
    
    Fixes: 04053f4d23a4 ("clk: qcom: clk-rpmh: Add IPA clock support")
    Cc: [email protected]
    Signed-off-by: Mike Tipton <[email protected]>
    Reviewed-by: Taniya Das <[email protected]>
    Signed-off-by: Imran Shaik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
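    
    Illustration (not part of the upstream commit): a small sketch of the
    difference between truncating a vote to the field width and clamping
    it to the field maximum. The mask value 0x3fff and both function names
    are assumptions for this example.

```c
#include <stdint.h>

/* Assumed width of the vote field, for illustration only. */
#define VOTE_MASK 0x3fffu

/* Truncation: bits above the field are silently dropped. */
static uint32_t vote_truncated(uint64_t cmd_state)
{
    return (uint32_t)(cmd_state & VOTE_MASK);
}

/* Clamping: anything above the field maximum becomes the maximum. */
static uint32_t vote_clamped(uint64_t cmd_state)
{
    return cmd_state > VOTE_MASK ? VOTE_MASK : (uint32_t)cmd_state;
}
```

    With a computed vote of 0x4001, truncation yields 0x0001 (a much lower
    vote than requested), whereas clamping yields 0x3fff.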

clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks [+ + +]
Author: Dmitry Baryshkov <[email protected]>
Date:   Sun Aug 4 08:40:05 2024 +0300

    clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks
    
    commit 0e93c6320ecde0583de09f3fe801ce8822886fec upstream.
    
    Add CLK_SET_RATE_PARENT for several branch clocks. Such clocks don't
    have a way to change the rate, so set the parent rate instead.
    
    Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
    Cc: [email protected]
    Signed-off-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Add GPLL9 support [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:03 2024 +0530

    clk: qcom: gcc-sc8180x: Add GPLL9 support
    
    commit 818a2f8d5e4ad2c1e39a4290158fe8e39a744c70 upstream.
    
    Add the missing GPLL9 pll and fix the gcc_parents_7 data to use
    the correct pll hw.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:04 2024 +0530

    clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table
    
    commit b8acaf2de8081371761ab4cf1e7a8ee4e7acc139 upstream.
    
    Update the frequency tables of gcc_sdcc2_apps_clk and gcc_sdcc4_apps_clk
    as per the latest frequency plan.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:01 2024 +0530

    clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x
    
    commit 1fc8c02e1d80463ce1b361d82b83fc43bb92d964 upstream.
    
    QUPv3 clocks support DFS on the sc8180x platform, but the driver is
    currently missing the code for it. As a result, not all DFS-supported
    frequencies are populated and an incorrect frequency is returned when
    clients request them. Hence add the DFS registration for QUPv3 RCGs.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:05 2024 +0530

    clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src
    
    commit bab0c7a0bc586e736b7cd2aac8e6391709a70ef2 upstream.
    
    The branch clocks of gcc_cpuss_ahb_clk_src are marked critical,
    so these clocks vote on XO and block suspend.
    De-register these clocks and their source, as no rate
    setting happens on them.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <[email protected]>
Date:   Fri Jul 19 19:12:38 2024 +0530

    clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit ade508b545c969c72cd68479f275a5dd640fd8b9 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to tell the GDSC driver not to turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: [email protected] # 5.7
    Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Jul 22 16:27:33 2024 +0530

    clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit 889e1332310656961855c0dcedbb4dbe78e39d22 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to tell the GDSC driver not to turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: [email protected] # 5.17
    Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: rockchip: fix error for unknown clocks [+ + +]
Author: Sebastian Reichel <[email protected]>
Date:   Mon Mar 25 20:33:36 2024 +0100

    clk: rockchip: fix error for unknown clocks
    
    commit 12fd64babaca4dc09d072f63eda76ba44119816a upstream.
    
    There is a clk == NULL check after the switch to check for
    unsupported clk types. Since clk is re-assigned in a loop,
    this check is useless right now for anything but the first
    round. Let's fix this up by assigning clk = NULL in the
    loop before the switch statement.
    
    Fixes: a245fecbb806 ("clk: rockchip: add basic infrastructure for clock branches")
    Cc: [email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    [added fixes + stable-cc]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
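    
    Illustration (not part of the upstream commit): a simplified model of
    the stale-pointer problem described above. The enum, the struct and
    register_branches() are hypothetical; the reset_each_round flag stands
    in for the "clk = NULL" assignment the fix adds before the switch.

```c
#include <stddef.h>

enum branch_type { BRANCH_MUX, BRANCH_GATE, BRANCH_UNKNOWN };

struct clk { int id; };

static struct clk mux_clk = { 1 };
static struct clk gate_clk = { 2 };

/* Returns how many unsupported entries the post-switch NULL check catches. */
static int register_branches(const enum branch_type *list, size_t n,
                             int reset_each_round)
{
    struct clk *clk = NULL;
    int unsupported = 0;

    for (size_t i = 0; i < n; i++) {
        if (reset_each_round)
            clk = NULL; /* models the fix: forget the previous round */

        switch (list[i]) {
        case BRANCH_MUX:
            clk = &mux_clk;
            break;
        case BRANCH_GATE:
            clk = &gate_clk;
            break;
        default:
            break; /* unknown type: clk is not assigned */
        }

        if (!clk)
            unsupported++; /* only reliable if clk was reset above */
    }
    return unsupported;
}
```

    For the list { MUX, UNKNOWN, GATE, UNKNOWN }, the unfixed variant
    reports no unsupported entries because clk still points at the clock
    from the previous round; with the reset it reports both.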

clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix [+ + +]
Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:47 2024 +0200

    clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix
    
    commit 217a5f23c290c349ceaa37a6f2c014ad4c2d5759 upstream.
    
    Update CLKS_NR_FSYS to the proper value after a fix in DT bindings.
    This should always be the last clock in a CMU + 1.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
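    
    Illustration (not part of the upstream commit): a sketch of the
    "last clock ID + 1" convention. The clock IDs, CLKS_NR_EXAMPLE and
    clk_store() are made up for this example and do not correspond to the
    Exynos7885 bindings.

```c
/* Hypothetical clock IDs as they might appear in a binding header. */
enum {
    CLK_EXAMPLE_MUX    = 1,
    CLK_EXAMPLE_GATE_A = 2,
    CLK_EXAMPLE_GATE_B = 3, /* last clock in this hypothetical CMU */
};

/* The table size must be the last ID + 1 so the last ID is a valid index. */
#define CLKS_NR_EXAMPLE (CLK_EXAMPLE_GATE_B + 1)

static int clk_table[CLKS_NR_EXAMPLE];

/* Stores a value for a clock ID, rejecting IDs outside the table. */
static int clk_store(unsigned int id, int value)
{
    if (id >= (unsigned int)CLKS_NR_EXAMPLE)
        return -1;
    clk_table[id] = value;
    return 0;
}
```

    If the count were left at a stale, smaller value after a new ID was
    added to the bindings, the lookup for that ID would be rejected or
    would fall outside the table.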

 
close_range(): fix the logics in descriptor table trimming [+ + +]
Author: Al Viro <[email protected]>
Date:   Fri Aug 16 15:17:00 2024 -0400

    close_range(): fix the logics in descriptor table trimming
    
    commit 678379e1d4f7443b170939525d3312cfc37bf86b upstream.
    
    Cloning a descriptor table picks the size that would cover all currently
    opened files.  That's fine for clone() and unshare(), but for close_range()
    there's an additional twist - we clone before we close, and it would be
    a shame to have
            close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
    leave us with a huge descriptor table when we are not going to keep
    anything past stderr, just because some large file descriptor used to
    be open before our call has taken it out.
    
    Unfortunately, it had been dealt with in an inherently racy way -
    sane_fdtable_size() gets a "don't copy anything past that" argument
    (passed via unshare_fd() and dup_fd()), close_range() decides how much
    should be trimmed and passes that to unshare_fd().
    
    The problem is, a range that used to extend to the end of descriptor
    table back when close_range() had looked at it might very well have stuff
    grown after it by the time dup_fd() has allocated a new files_struct
    and started to figure out the capacity of fdtable to be attached to that.
    
    That leads to interesting pathological cases; at the very least it's a
    QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
    a snapshot of descriptor table one might have observed at some point.
    Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
    of unshare(CLONE_FILES) with plain close_range(), ending up with a
    weird state that would never occur with unshare(2) is confusing, to put
    it mildly.
    
    It's not hard to get rid of - all it takes is passing both ends of the
    range down to sane_fdtable_size().  There we are under ->files_lock,
    so the race is trivially avoided.
    
    So we do the following:
            * switch close_files() from calling unshare_fd() to calling
    dup_fd().
            * undo the calling convention change done to unshare_fd() in
    60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
            * introduce struct fd_range, pass a pointer to that to dup_fd()
    and sane_fdtable_size() instead of "trim everything past that point"
    they are currently getting.  NULL means "we are not going to be punching
    any holes"; NR_OPEN_MAX is gone.
            * make sane_fdtable_size() use find_last_bit() instead of
    open-coding it; it's easier to follow that way.
            * while we are at it, have dup_fd() report errors by returning
    ERR_PTR(), no need to use a separate int *errorp argument.
    
    Fixes: 60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
    Cc: [email protected]
    Signed-off-by: Al Viro <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
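    
    Illustration (not part of the upstream commit): a simplified userspace
    model of sizing a cloned descriptor table when a range is about to be
    closed. struct fd_range_model and needed_slots() are hypothetical; the
    real sane_fdtable_size() works on bitmaps under ->files_lock and rounds
    the result, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Inclusive range of descriptors that will be closed right after cloning. */
struct fd_range_model {
    unsigned int from;
    unsigned int to;
};

/*
 * Returns the number of slots needed to cover every open descriptor that
 * survives. A NULL punch_hole means no descriptors are going to be closed.
 */
static unsigned int needed_slots(const bool *open, unsigned int max_fds,
                                 const struct fd_range_model *punch_hole)
{
    unsigned int needed = 0;

    for (unsigned int fd = 0; fd < max_fds; fd++) {
        if (!open[fd])
            continue;
        if (punch_hole && fd >= punch_hole->from && fd <= punch_hole->to)
            continue; /* about to be closed, no need to cover it */
        needed = fd + 1;
    }
    return needed;
}
```

    With descriptors 0, 1, 2 and 900 open, a plain clone needs 901 slots,
    while a clone that will immediately close everything from 3 upward
    needs only 3. Because both ends of the range are evaluated in one
    pass over a single view of the table, a descriptor opened above the
    old end of the table cannot be missed by the size calculation.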

 
cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value [+ + +]
Author: Anastasia Belova <[email protected]>
Date:   Mon Aug 26 16:38:41 2024 +0300

    cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
    
    [ Upstream commit 5493f9714e4cdaf0ee7cec15899a231400cb1a9f ]
    
    cpufreq_cpu_get may return NULL. To avoid a NULL dereference, check
    its return value and return in case of error.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Anastasia Belova <[email protected]>
    Reviewed-by: Perry Yuan <[email protected]>
    Signed-off-by: Viresh Kumar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cpufreq: Avoid a bad reference count on CPU node [+ + +]
Author: Miquel Sabaté Solà <[email protected]>
Date:   Tue Sep 17 15:42:46 2024 +0200

    cpufreq: Avoid a bad reference count on CPU node
    
    commit c0f02536fffbbec71aced36d52a765f8c4493dc2 upstream.
    
    In the parse_perf_domain function, if the call to
    of_parse_phandle_with_args returns an error, then the reference to the
    CPU device node that was acquired at the start of the function would not
    be properly decremented.
    
    Address this by declaring the variable with the __free(device_node)
    cleanup attribute.
    
    Signed-off-by: Miquel Sabaté Solà <[email protected]>
    Acked-by: Viresh Kumar <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
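    
    Illustration (not part of the upstream commit): a userspace sketch of
    scope-based cleanup using the compiler's cleanup attribute (supported
    by GCC and Clang), which is the mechanism behind the kernel's __free()
    helper. struct node, node_get(), node_put() and the AUTO_PUT_NODE macro
    are hypothetical stand-ins for the device node API.

```c
#include <stddef.h>

struct node { int refcount; };

static struct node *node_get(struct node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void node_put(struct node *n)
{
    if (n)
        n->refcount--;
}

/* The cleanup handler receives a pointer to the annotated variable. */
static void node_put_cleanup(struct node **p)
{
    node_put(*p);
}

#define AUTO_PUT_NODE __attribute__((cleanup(node_put_cleanup)))

/* Manual variant: the early error return forgets to drop the reference. */
static int parse_leaky(struct node *cpu, int lookup_fails)
{
    struct node *np = node_get(cpu);

    if (!np)
        return -1;
    if (lookup_fails)
        return -2; /* bug: node_put(np) is missing on this path */
    node_put(np);
    return 0;
}

/* Scoped variant: the reference is dropped on every return path. */
static int parse_scoped(struct node *cpu, int lookup_fails)
{
    struct node *np AUTO_PUT_NODE = node_get(cpu);

    if (!np)
        return -1;
    if (lookup_fails)
        return -2; /* node_put_cleanup() runs automatically here */
    return 0;
}
```

    After a failing call, parse_leaky() leaves the reference count at 1,
    while parse_scoped() returns it to 0 without any explicit put.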

cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock [+ + +]
Author: Uwe Kleine-König <[email protected]>
Date:   Thu Sep 19 10:11:21 2024 +0200

    cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
    
    commit 8b4865cd904650cbed7f2407e653934c621b8127 upstream.
    
    notify_hwp_interrupt() is called via sysvec_thermal() ->
    smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context.
    For this reason it must not use a simple spin_lock that sleeps with
    PREEMPT_RT enabled. So convert it to a raw spinlock.
    
    Reported-by: xiao sheng wen <[email protected]>
    Link: https://bugs.debian.org/1076483
    Signed-off-by: Uwe Kleine-König <[email protected]>
    Acked-by: Srinivas Pandruvada <[email protected]>
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Tested-by: xiao sheng wen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request() [+ + +]
Author: Huacai Chen <[email protected]>
Date:   Wed Aug 28 14:24:59 2024 +0800

    cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
    
    [ Upstream commit 2b7ec33e534f7a10033a5cf07794acf48b182bbe ]
    
    Use raw_smp_processor_id() instead of plain smp_processor_id() in
    do_service_request(), otherwise we may get some errors with the driver
    enabled:
    
     BUG: using smp_processor_id() in preemptible [00000000] code: (udev-worker)/208
     caller is loongson3_cpufreq_probe+0x5c/0x250 [loongson3_cpufreq]
    
    Reported-by: Xi Ruoyao <[email protected]>
    Tested-by: Binbin Zhou <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Viresh Kumar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
crypto: hisilicon - fix missed error branch [+ + +]
Author: Yang Shen <[email protected]>
Date:   Sat Aug 31 17:50:07 2024 +0800

    crypto: hisilicon - fix missed error branch
    
    [ Upstream commit f386dc64e1a5d3dcb84579119ec350ab026fea88 ]
    
    If an error occurs after the SGL has been mapped successfully,
    the SGL needs to be unmapped.
    
    Otherwise, memory problems may occur.
    
    Signed-off-by: Yang Shen <[email protected]>
    Signed-off-by: Chenghai Huang <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
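    
    Illustration (not part of the upstream commit): a generic sketch of
    the goto-based unwind pattern for undoing a successful mapping when a
    later step fails. struct sgl, map_sgl(), unmap_sgl() and
    submit_request() are hypothetical names, not the driver's functions.

```c
struct sgl { int mapped; };

static int map_sgl(struct sgl *s)
{
    s->mapped = 1;
    return 0;
}

static void unmap_sgl(struct sgl *s)
{
    s->mapped = 0;
}

/* Maps the SGL, then runs a later step that may fail. */
static int submit_request(struct sgl *s, int later_step_fails)
{
    int ret = map_sgl(s);

    if (ret)
        return ret; /* nothing was mapped, nothing to undo */

    if (later_step_fails) {
        ret = -1;
        goto err_unmap; /* the error branch must undo the mapping */
    }
    return 0; /* mapping stays live until the request completes */

err_unmap:
    unmap_sgl(s);
    return ret;
}
```

    The point of the pattern is that every failure after a successful
    map_sgl() leaves through err_unmap, so no mapping outlives a failed
    request.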

crypto: octeontx - Fix authenc setkey [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:13:23 2024 +0800

    crypto: octeontx - Fix authenc setkey
    
    [ Upstream commit 311eea7e37c4c0b44b557d0c100860a03b4eab65 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
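    
    Illustration (not part of the upstream commit): a sketch of why
    allocating in the per-transform init hook instead of in setkey avoids
    leaks when setkey is called repeatedly. struct tfm_ctx, ctx_init(),
    ctx_setkey(), ctx_exit() and KEY_BUF_LEN are hypothetical. Unlike the
    real fix, this sketch simply rejects over-long keys instead of hashing
    them.

```c
#include <stdlib.h>
#include <string.h>

#define KEY_BUF_LEN 64

struct tfm_ctx {
    unsigned char *key_buf;
    size_t key_len;
};

static int live_allocs; /* number of buffers currently allocated */

/* Called once per transform: the only place that allocates. */
static int ctx_init(struct tfm_ctx *ctx)
{
    ctx->key_buf = malloc(KEY_BUF_LEN);
    if (!ctx->key_buf)
        return -1;
    ctx->key_len = 0;
    live_allocs++;
    return 0;
}

/* May be called many times: reuses the buffer, never allocates. */
static int ctx_setkey(struct tfm_ctx *ctx, const unsigned char *key, size_t len)
{
    if (len > KEY_BUF_LEN)
        return -1; /* simplification, see the note above */
    memcpy(ctx->key_buf, key, len);
    ctx->key_len = len;
    return 0;
}

/* Called once per transform: releases what ctx_init() allocated. */
static void ctx_exit(struct tfm_ctx *ctx)
{
    free(ctx->key_buf);
    ctx->key_buf = NULL;
    live_allocs--;
}
```

    However many times ctx_setkey() runs between ctx_init() and
    ctx_exit(), exactly one buffer is allocated and exactly one is freed.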

crypto: octeontx* - Select CRYPTO_AUTHENC [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Thu Sep 5 10:21:49 2024 +0800

    crypto: octeontx* - Select CRYPTO_AUTHENC
    
    commit c398cb8eb0a263a1b7a18892d9f244751689675c upstream.
    
    Select CRYPTO_AUTHENC as the function crypto_authenc_extractkeys
    may not be available without it.
    
    Fixes: 311eea7e37c4 ("crypto: octeontx - Fix authenc setkey")
    Fixes: 7ccb750dcac8 ("crypto: octeontx2 - Fix authenc setkey")
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

crypto: octeontx2 - Fix authenc setkey [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:36:19 2024 +0800

    crypto: octeontx2 - Fix authenc setkey
    
    [ Upstream commit 7ccb750dcac8abbfc7743aab0db6a72c1c3703c7 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: simd - Do not call crypto_alloc_tfm during registration [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 14:58:35 2024 +0800

    crypto: simd - Do not call crypto_alloc_tfm during registration
    
    [ Upstream commit 3c44d31cb34ce4eb8311a2e73634d57702948230 ]
    
    Algorithm registration is usually carried out during module init,
    where as little work as possible should be carried out.  The SIMD
    code violated this rule by allocating a tfm, which then triggers a
    full test of the algorithm which may dead-lock in certain cases.
    
    SIMD is only allocating the tfm to get at the alg object, which is
    in fact already available as it is what we are registering.  Use
    that directly and remove the crypto_alloc_tfm call.
    
    Also remove some obsolete and unused SIMD API.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: x86/sha256 - Add parentheses around macros' single arguments [+ + +]
Author: Fangrui Song <[email protected]>
Date:   Tue Aug 13 21:48:02 2024 -0700

    crypto: x86/sha256 - Add parentheses around macros' single arguments
    
    [ Upstream commit 3363c460ef726ba693704dbcd73b7e7214ccc788 ]
    
    The macros FOUR_ROUNDS_AND_SCHED and DO_4ROUNDS rely on an
    unexpected/undocumented behavior of the GNU assembler, which might
    change in the future
    (https://sourceware.org/bugzilla/show_bug.cgi?id=32073).
    
        M (1) (2) // 1 arg !? Future: 2 args
        M 1 + 2   // 1 arg !? Future: 3 args
    
        M 1 2     // 2 args
    
    Add parentheses around the single arguments to support future GNU
    assembler and LLVM integrated assembler (when the IsOperator hack from
    the following link is dropped).
    
    Link: https://github.com/llvm/llvm-project/commit/055006475e22014b28a070db1bff41ca15f322f0
    Signed-off-by: Fangrui Song <[email protected]>
    Reviewed-by: Jan Beulich <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drivers/perf: arm_spe: Use perf_allow_kernel() for permissions [+ + +]
Author: James Clark <[email protected]>
Date:   Tue Aug 27 15:51:12 2024 +0100

    drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
    
    [ Upstream commit 5e9629d0ae977d6f6916d7e519724804e95f0b07 ]
    
    Use perf_allow_kernel() for 'pa_enable' (physical addresses),
    'pct_enable' (physical timestamps) and context IDs. This means that
    perf_event_paranoid is now taken into account and LSM hooks can be used,
    which is more consistent with other perf_event_open calls. For example
    PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just
    perfmon_capable().
    
    This also indirectly fixes the following error message which is
    misleading because perf_event_paranoid is not taken into account by
    perfmon_capable():
    
      $ perf record -e arm_spe/pa_enable/
    
      Error:
      Access to performance monitoring and observability operations is
      limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid
      setting ...
    
    Suggested-by: Al Grant <[email protected]>
    Signed-off-by: James Clark <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drivers/perf: riscv: Align errno for unsupported perf event [+ + +]
Author: Pu Lehui <[email protected]>
Date:   Sat Aug 31 07:15:20 2024 +0000

    drivers/perf: riscv: Align errno for unsupported perf event
    
    commit c625154993d0d24a962b1830cd5ed92adda2cf86 upstream.
    
    RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It would
    be more appropriate to return -EOPNOTSUPP or -ENOENT for this type in
    pmu_sbi_event_map. Considering that other implementations return -ENOENT
    for unsupported perf types, let's synchronize this behavior. Because of
    this mismatch, the riscv bpf testcase perf_skip fails. While at it, apply
    the same behavior in the other relevant places.
    
    Signed-off-by: Pu Lehui <[email protected]>
    Reviewed-by: Atish Patra <[email protected]>
    Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
    Fixes: 16d3b1af0944 ("perf: RISC-V: Check standard event availability")
    Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
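    
    Illustration (not part of the upstream commit): a simplified model of
    a dispatcher that treats -ENOENT as "this event is not mine, try the
    next PMU" and any other error as final. The types, the three example
    PMUs and init_event() are hypothetical; this is one plausible way the
    errno choice becomes visible to callers, not a copy of the perf core.

```c
#include <errno.h>
#include <stddef.h>

enum ev_type { EV_HARDWARE, EV_BREAKPOINT };

typedef int (*event_init_fn)(enum ev_type);

/* A PMU that declines unsupported types with -ENOENT. */
static int cpu_pmu_init(enum ev_type t)
{
    return t == EV_HARDWARE ? 0 : -ENOENT;
}

/* A PMU that handles breakpoint events only. */
static int bp_pmu_init(enum ev_type t)
{
    return t == EV_BREAKPOINT ? 0 : -ENOENT;
}

/* A PMU that reports unsupported types with a different errno. */
static int strict_pmu_init(enum ev_type t)
{
    return t == EV_HARDWARE ? 0 : -EOPNOTSUPP;
}

/* Returns the index of the PMU that accepted the event, or a negative errno. */
static int init_event(const event_init_fn *pmus, size_t n, enum ev_type t)
{
    for (size_t i = 0; i < n; i++) {
        int ret = pmus[i](t);

        if (ret != -ENOENT)
            return ret ? ret : (int)i; /* accepted, or a hard error */
    }
    return -ENOENT; /* no PMU claimed the event */
}
```

    In this model, a breakpoint event offered to { cpu_pmu_init,
    bp_pmu_init } reaches the second PMU, whereas with { strict_pmu_init,
    bp_pmu_init } the search stops at the first PMU with -EOPNOTSUPP.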

 
drm/amd/display: Add HDR workaround for specific eDP [+ + +]
Author: Alex Hung <[email protected]>
Date:   Fri Sep 6 11:39:18 2024 -0600

    drm/amd/display: Add HDR workaround for specific eDP
    
    commit 05af800704ee7187d9edd461ec90f3679b1c4aba upstream.
    
    [WHY & HOW]
    Some eDP panels suffer from flickering when HDR is enabled in KDE. This
    quirk works around it by skipping VSC that is incompatible with eDP
    panels.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2) [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Aug 2 12:35:13 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2)
    
    [ Upstream commit cd9e9e0852d501f169aa3bb34e4b413d2eb48c37 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_plane_handle_cursor_update function. Previously, the code
    treated 'afb' as possibly null at one point, but used it later without
    a null check. This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.c:1298 amdgpu_dm_plane_handle_cursor_update() error: we previously assumed 'afb' could be null (see line 1252)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Co-developed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2) [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Aug 2 12:20:36 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2)
    
    [ Upstream commit 0fe20258b4989b9112b5e9470df33a0939403fd4 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_update_cursor function. Previously, 'afb' was treated as
    possibly null at line 8388, but was used later in the code without a
    null check. This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8433 amdgpu_dm_update_cursor()
            error: we previously assumed 'afb' could be null (see line 8388)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Co-developed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:21:19 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
    
    [ Upstream commit cba7fec864172dadd953daefdd26e01742b71a6a ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
    not null before accessing their functions. This prevents a potential null
    pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
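    
    Illustration (not part of the upstream commit): a sketch of guarding a
    nested callback table before calling through it. The struct layouts,
    fake_init_clocks() and init_hw() are hypothetical reductions made for
    this example and do not reproduce the driver's definitions.

```c
#include <stddef.h>

struct clk_mgr_funcs {
    int (*init_clocks)(void);
};

struct clk_mgr {
    const struct clk_mgr_funcs *funcs;
};

struct dc {
    struct clk_mgr *clk_mgr;
};

static int init_calls; /* counts how often the callback actually ran */

static int fake_init_clocks(void)
{
    return ++init_calls;
}

/* Calls init_clocks only if every pointer on the way to it is valid. */
static int init_hw(struct dc *dc)
{
    if (dc->clk_mgr && dc->clk_mgr->funcs && dc->clk_mgr->funcs->init_clocks)
        return dc->clk_mgr->funcs->init_clocks();
    return 0; /* nothing to initialize */
}
```

    The && chain evaluates left to right and stops at the first null
    pointer, so a missing clock manager or a missing function table is
    skipped rather than dereferenced.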

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:58:32 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw
    
    [ Upstream commit 4b6377f0e96085cbec96eb7f0b282430ccdd3d75 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` are
    not null before accessing their functions. This prevents a potential null
    pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:44:40 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
    
    [ Upstream commit c395fd47d1565bd67671f45cca281b3acc2c31ef ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
    null.
    
    The fix adds a check to ensure `dc->clk_mgr` is not null before
    accessing its functions. This prevents a potential null pointer
    dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:09:28 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func
    
    [ Upstream commit 62ed6f0f198da04e884062264df308277628004f ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn20_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null at line 1030, but then it
    was being dereferenced without any null check at line 1048. This could
    potentially lead to a null pointer dereference error if set_output_gamma
    is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma at line 1048.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:15:00 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func
    
    [ Upstream commit 28574b08c70e56d34d6f6379326a860b96749051 ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn32_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:22:06 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func
    
    [ Upstream commit dd340acd42c24a3f28dd22fae6bf38662334264c ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn401_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:18:58 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
    
    [ Upstream commit f22f4754aaa47d8c59f166ba3042182859e5dff7 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn201_acquire_free_pipe_for_layer` function. The issue could occur
    when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
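
    As a rough illustration of the early-return pattern, the following
    userspace sketch returns NULL as soon as no head pipe is found. All
    demo_ names and the pipe layout are assumptions made for the example
    and do not mirror the real resource code.

```c
#include <stddef.h>

#define DEMO_MAX_PIPES 4

struct demo_pipe {
	int stream_id;
	int in_use;
};

/* Hypothetical lookup: first in-use pipe driving stream_id, or NULL. */
struct demo_pipe *demo_find_head_pipe(struct demo_pipe *pipes, int stream_id)
{
	for (int i = 0; i < DEMO_MAX_PIPES; i++)
		if (pipes[i].in_use && pipes[i].stream_id == stream_id)
			return &pipes[i];
	return NULL;
}

/* Returns an idle pipe for the stream, or NULL if there is no head pipe
 * or no idle pipe left. */
struct demo_pipe *demo_acquire_free_pipe(struct demo_pipe *pipes, int stream_id)
{
	struct demo_pipe *head_pipe = demo_find_head_pipe(pipes, stream_id);

	if (!head_pipe)
		return NULL; /* bail out before anything touches head_pipe */

	for (int i = 0; i < DEMO_MAX_PIPES; i++) {
		if (!pipes[i].in_use) {
			pipes[i].in_use = 1;
			pipes[i].stream_id = head_pipe->stream_id;
			return &pipes[i];
		}
	}
	return NULL;
}
```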
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:30:16 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer
    
    [ Upstream commit ac2140449184a26eac99585b7f69814bd3ba8f2d ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
    could occur when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 25 08:14:56 2024 +0530

    drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe
    
    [ Upstream commit 8e4ed3cf1642df0c4456443d865cff61a9598aa8 ]
    
    This commit addresses a null pointer dereference issue in the
    `dcn20_program_pipe` function. The issue could occur when
    `pipe_ctx->plane_state` is null.
    
    The fix adds a check to ensure `pipe_ctx->plane_state` is not null
    before accessing it. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1925 dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could be null (see line 1877)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 25 07:23:48 2024 +0530

    drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
    
    [ Upstream commit 66d71a72539e173a9b00ca0b1852cbaa5f5bf1ad ]
    
    This commit addresses a null pointer dereference issue in the
    `commit_planes_for_stream` function at line 4140. The issue could occur
    when `top_pipe_to_program` is null.
    
    The fix adds a check to ensure `top_pipe_to_program` is not null before
    accessing its stream_res. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` [+ + +]
Author: Mario Limonciello <[email protected]>
Date:   Sun Sep 15 14:28:37 2024 -0500

    drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
    
    [ Upstream commit 87d749a6aab73d8069d0345afaa98297816cb220 ]
    
    The issue with panel power savings compatibility below
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` happens at
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` as well.
    
    That issue will be fixed separately, so don't prevent the backlight
    brightness from going that low.
    
    Cc: Harry Wentland <[email protected]>
    Cc: Thomas Weißschuh <[email protected]>
    Link: https://lore.kernel.org/amd-gfx/[email protected]/T/#m400dee4e2fc61fe9470334d20a7c8c89c9aef44f
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Avoid overflow assignment in link_dp_cts [+ + +]
Author: Alex Hung <[email protected]>
Date:   Wed Jul 17 09:17:56 2024 -0600

    drm/amd/display: Avoid overflow assignment in link_dp_cts
    
    [ Upstream commit a15268787b79fd183dd526cc16bec9af4f4e49a1 ]
    
    sampling_rate is a uint8_t but is assigned an unsigned int, and thus it
    can overflow. As a result, sampling_rate is changed to uint32_t.
    
    Similarly, LINK_QUAL_PATTERN_SET has a size of 2 bits, and it should
    only be assigned a value less than or equal to 4.
    
    This fixes 2 INTEGER_OVERFLOW issues reported by Coverity.
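
    The sampling_rate half of the fix can be sketched in plain C as
    follows. The demo_ helpers are hypothetical and only show what happens
    when an unsigned int is stored in a uint8_t versus a uint32_t; the
    value 48000 used with them is an arbitrary example, not a value taken
    from the driver.

```c
#include <stdint.h>

/* Storing a wide value in a uint8_t keeps only its low 8 bits. */
uint8_t demo_store_narrow(unsigned int sampling_rate)
{
	return (uint8_t)sampling_rate;
}

/* A 32-bit destination preserves the whole value on platforms where
 * unsigned int is 32 bits wide. */
uint32_t demo_store_wide(unsigned int sampling_rate)
{
	return (uint32_t)sampling_rate;
}
```

    With an input of 48000, the narrow variant yields 128 (48000 modulo
    256), while the wide variant keeps 48000.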
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Wenjing Liu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: avoid set dispclk to 0 [+ + +]
Author: Charlene Liu <[email protected]>
Date:   Wed Sep 11 19:45:09 2024 -0400

    drm/amd/display: avoid set dispclk to 0
    
    commit c36df0f5f5e5acec5d78f23c4725cc500df28843 upstream.
    
    [why]
    Setting dispclk to 0 causes stability issues.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Nicholas Kazlauskas <[email protected]>
    Signed-off-by: Charlene Liu <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 1c6b16ebf5eb2bc5740be9e37b3a69f1dfe1dded)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Check null pointer before try to access it [+ + +]
Author: Rodrigo Siqueira <[email protected]>
Date:   Tue Jul 30 20:02:45 2024 -0600

    drm/amd/display: Check null pointer before try to access it
    
    [ Upstream commit 1b686053c06ffb9f4524b288110cf2a831ff7a25 ]
    
    [why & how]
    Change the order of the pipe_ctx->plane_state check to ensure that
    plane_state is not null before accessing it.
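
    Why the order of the check matters can be shown with a small sketch
    that relies on the short-circuit evaluation of &&. The demo_ types and
    the visible field are hypothetical; only pipe_ctx->plane_state is
    named in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_plane_state {
	bool visible;
};

struct demo_pipe_ctx {
	struct demo_plane_state *plane_state;
};

bool demo_plane_is_visible(const struct demo_pipe_ctx *pipe_ctx)
{
	/* && evaluates left to right and stops at the first false operand,
	 * so ->visible is read only when plane_state is non-NULL. */
	return pipe_ctx->plane_state && pipe_ctx->plane_state->visible;
}
```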
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before multiple uses [+ + +]
Author: Alex Hung <[email protected]>
Date:   Tue Jun 25 10:37:35 2024 -0600

    drm/amd/display: Check null pointers before multiple uses
    
    [ Upstream commit fdd5ecbbff751c3b9061d8ebb08e5c96119915b4 ]
    
    [WHAT & HOW]
    Pointers, such as stream_enc and dc->bw_vbios, are null checked previously
    in the same function, so Coverity warns "implies that stream_enc and
    dc->bw_vbios might be null". They are used multiple times in the
    subsequent code and need to be checked.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before used [+ + +]
Author: Alex Hung <[email protected]>
Date:   Tue Jun 25 10:35:52 2024 -0600

    drm/amd/display: Check null pointers before used
    
    [ Upstream commit be1fb44389ca3038ad2430dac4234669bc177ee3 ]
    
    [WHAT & HOW]
    Pointers, such as dc->clk_mgr, are null checked previously in the same
    function, so Coverity warns "implies that "dc->clk_mgr" might be null".
    As a result, these pointers need to be checked when used again.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before using dc->clk_mgr [+ + +]
Author: Alex Hung <[email protected]>
Date:   Mon Jul 29 15:29:09 2024 -0600

    drm/amd/display: Check null pointers before using dc->clk_mgr
    
    [ Upstream commit 95d9e0803e51d5a24276b7643b244c7477daf463 ]
    
    [WHY & HOW]
    dc->clk_mgr is null checked previously in the same function, indicating
    it might be null.
    
    Passing "dc" to "dc->hwss.apply_idle_power_optimizations", which
    dereferences null "dc->clk_mgr". (The function pointer resolves to
    "dcn35_apply_idle_power_optimizations".)
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before using them [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:38:16 2024 -0600

    drm/amd/display: Check null pointers before using them
    
    [ Upstream commit 1ff12bcd7deaeed25efb5120433c6a45dd5504a8 ]
    
    [WHAT & HOW]
    These pointers are null checked previously in the same function,
    indicating they might be null as reported by Coverity. As a result,
    they need to be checked when used again.
    
    This fixes 3 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null-initialized variables [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:34:18 2024 -0600

    drm/amd/display: Check null-initialized variables
    
    [ Upstream commit 367cd9ceba1933b63bc1d87d967baf6d9fd241d2 ]
    
    [WHAT & HOW]
    drr_timing and subvp_pipe are initialized to null and they are not
    always assigned new values. It is necessary to check for null before
    dereferencing.
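
    A variable that starts out null and is only conditionally assigned
    behaves like the sketch below. drr_timing is named in the commit
    message; the demo_ structure, its fields, and the -1 return value are
    assumptions made for illustration.

```c
#include <stddef.h>

struct demo_timing {
	int refresh_hz;
	int drr_capable;
};

/* Returns the refresh rate of the first DRR-capable timing, or -1 if
 * no entry qualifies. */
int demo_drr_refresh(const struct demo_timing *timings, size_t count)
{
	const struct demo_timing *drr_timing = NULL; /* stays NULL if no match */

	for (size_t i = 0; i < count; i++) {
		if (timings[i].drr_capable) {
			drr_timing = &timings[i];
			break;
		}
	}

	if (!drr_timing)
		return -1; /* nothing was assigned, so do not dereference */

	return drr_timing->refresh_hz;
}
```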
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Nevenko Stupar <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check phantom_stream before it is used [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 20 20:23:41 2024 -0600

    drm/amd/display: Check phantom_stream before it is used
    
    [ Upstream commit 3718a619a8c0a53152e76bb6769b6c414e1e83f4 ]
    
    dcn32_enable_phantom_stream can return null, so the returned value
    must be checked before it is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check stream before comparing them [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 20:05:14 2024 -0600

    drm/amd/display: Check stream before comparing them
    
    [ Upstream commit 35ff747c86767937ee1e0ca987545b7eed7a0810 ]
    
    [WHAT & HOW]
    amdgpu_dm can pass a null stream to dc_is_stream_unchanged. It is
    necessary to check the streams for null before dereferencing them.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
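
    A comparison helper that tolerates null arguments might look like the
    sketch below. The demo_ stream type and its fields are hypothetical,
    and treating a null stream as "changed" is an assumption of this
    example rather than a statement about the driver's behavior.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_stream {
	int width;
	int height;
};

bool demo_is_stream_unchanged(const struct demo_stream *old_stream,
			      const struct demo_stream *stream)
{
	if (!old_stream || !stream)
		return false; /* assumed policy: a missing stream counts as changed */

	if (old_stream == stream)
		return true;

	return old_stream->width == stream->width &&
	       old_stream->height == stream->height;
}
```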
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check stream_status before it is used [+ + +]
Author: Alex Hung <[email protected]>
Date:   Mon Jul 15 10:37:28 2024 -0600

    drm/amd/display: Check stream_status before it is used
    
    [ Upstream commit 58a8ee96f84d2c21abb85ad8c22d2bbdf59bd7a9 ]
    
    [WHAT & HOW]
    dc_state_get_stream_status can return null, and therefore null must be
    checked before stream_status is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Deallocate DML memory if allocation fails [+ + +]
Author: Chris Park <[email protected]>
Date:   Fri Jun 28 15:09:06 2024 -0400

    drm/amd/display: Deallocate DML memory if allocation fails
    
    [ Upstream commit 892abca6877a96c9123bb1c010cafccdf8ca1b75 ]
    
    [Why]
    When the DML memory allocation fails during DC state creation, the
    memory allocated so far is not deallocated, resulting in an
    uninitialized structure that is not NULL.
    
    [How]
    Deallocate memory if DML memory allocation fails.
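
    The cleanup-on-failure idea can be sketched in userspace C, with
    calloc/free standing in for the kernel allocators. The demo_ names
    and the injectable allocator hook are hypothetical and exist only so
    the failure path can be exercised.

```c
#include <stdlib.h>

struct demo_state {
	void *dml_ctx; /* secondary allocation, stands in for the DML memory */
};

/* Hypothetical allocator hook so the failure path can be triggered. */
typedef void *(*demo_alloc_fn)(size_t size);

/* An allocator that always fails. */
void *demo_alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}

struct demo_state *demo_state_create(demo_alloc_fn alloc_dml)
{
	struct demo_state *state = calloc(1, sizeof(*state));

	if (!state)
		return NULL;

	state->dml_ctx = alloc_dml(64);
	if (!state->dml_ctx) {
		/* Release the outer object so the caller never receives a
		 * half-initialized, non-NULL state. */
		free(state);
		return NULL;
	}
	return state;
}

void demo_state_destroy(struct demo_state *state)
{
	if (!state)
		return;
	free(state->dml_ctx);
	free(state);
}
```

    Returning NULL on the failure path lets callers rely on a single
    check of the returned pointer.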
    
    Reviewed-by: Joshua Aberback <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Chris Park <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Disable replay if VRR capability is false [+ + +]
Author: Tom Chung <[email protected]>
Date:   Wed Jun 26 16:14:24 2024 +0800

    drm/amd/display: Disable replay if VRR capability is false
    
    [ Upstream commit b68417613d4134b9e39fff95e72ca726268b47db ]
    
    [Why]
    VRR must be supported for the panel replay feature.
    If the VRR capability is false, the panel replay capability also
    needs to be disabled.
    
    [How]
    After updating the VRR capability, also check whether the panel
    replay capability needs to be disabled.
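
    The dependency between the two capabilities amounts to the following
    sketch. The demo_ structure and field names are hypothetical and are
    not the driver's data structures.

```c
#include <stdbool.h>

struct demo_sink_caps {
	bool vrr_capable;
	bool replay_supported;
};

/* Intended to be called after vrr_capable has been refreshed. */
void demo_update_replay_cap(struct demo_sink_caps *caps)
{
	/* Panel replay depends on VRR, so it cannot stay enabled without it. */
	if (!caps->vrr_capable)
		caps->replay_supported = false;
}
```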
    
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Enable idle workqueue for more IPS modes [+ + +]
Author: Leo Li <[email protected]>
Date:   Wed Sep 11 17:27:08 2024 -0400

    drm/amd/display: Enable idle workqueue for more IPS modes
    
    commit ef785ca7f7c80891580cafd36c8dd86375684310 upstream.
    
    [Why]
    
    There are more IPS modes other than DMUB_IPS_ENABLE that enable IPS. We
    need to enable the hotplug detect idle workqueue for those modes as
    well.
    
    [How]
    
    Modify the if condition to initialize the workqueue in all IPS modes
    except for DMUB_IPS_DISABLE_ALL.
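
    The change in the condition can be pictured with the sketch below.
    The enum is a hypothetical stand-in: DEMO_IPS_ENABLE and
    DEMO_IPS_DISABLE_ALL mirror the two modes named above, while
    DEMO_IPS_PARTIAL is an invented placeholder for any other mode that
    still enables IPS.

```c
#include <stdbool.h>

enum demo_ips_mode {
	DEMO_IPS_ENABLE,
	DEMO_IPS_DISABLE_ALL,
	DEMO_IPS_PARTIAL, /* invented: some IPS states remain enabled */
};

/* Old condition: only one mode gets the idle workqueue. */
bool demo_needs_idle_wq_old(enum demo_ips_mode mode)
{
	return mode == DEMO_IPS_ENABLE;
}

/* New condition: every mode except "disable all" gets it. */
bool demo_needs_idle_wq_new(enum demo_ips_mode mode)
{
	return mode != DEMO_IPS_DISABLE_ALL;
}
```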
    
    Fixes: 65444581a4ae ("drm/amd/display: Determine IPS mode by ASIC and PMFW versions")
    Signed-off-by: Leo Li <[email protected]>
    Reviewed-by: Roman Li <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 181db30bcfed097ecc680539b1eabe935c11f57f)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: fix a UBSAN warning in DML2.1 [+ + +]
Author: Aurabindo Pillai <[email protected]>
Date:   Fri Jul 19 14:10:58 2024 -0400

    drm/amd/display: fix a UBSAN warning in DML2.1
    
    [ Upstream commit eaf3adb8faab611ba57594fa915893fc93a7788c ]
    
    When programming a phantom pipe, cursor_width is explicitly set to 0,
    which causes the calculation logic to overflow an unsigned int and
    triggers the kernel's UBSAN check, as shown below:
    
    [   40.962845] UBSAN: shift-out-of-bounds in /tmp/amd.EfpumTkO/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3312:34
    [   40.962849] shift exponent 4294967170 is too large for 32-bit type 'unsigned int'
    [   40.962852] CPU: 1 PID: 1670 Comm: gnome-shell Tainted: G        W  OE      6.5.0-41-generic #41~22.04.2-Ubuntu
    [   40.962854] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F21 01/10/2024
    [   40.962856] Call Trace:
    [   40.962857]  <TASK>
    [   40.962860]  dump_stack_lvl+0x48/0x70
    [   40.962870]  dump_stack+0x10/0x20
    [   40.962872]  __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
    [   40.962878]  calculate_cursor_req_attributes.cold+0x1b/0x28 [amdgpu]
    [   40.963099]  dml_core_mode_support+0x6b91/0x16bc0 [amdgpu]
    [   40.963327]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963331]  ? CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport+0x18b8/0x2790 [amdgpu]
    [   40.963534]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963536]  ? dml_core_mode_support+0xb3db/0x16bc0 [amdgpu]
    [   40.963730]  dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.963906]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963909]  ? dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.964078]  core_dcn4_mode_support+0x72/0xbf0 [amdgpu]
    [   40.964247]  dml2_top_optimization_perform_optimization_phase+0x1d3/0x2a0 [amdgpu]
    [   40.964420]  dml2_build_mode_programming+0x23d/0x750 [amdgpu]
    [   40.964587]  dml21_validate+0x274/0x770 [amdgpu]
    [   40.964761]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.964763]  ? resource_append_dpp_pipes_for_plane_composition+0x27c/0x3b0 [amdgpu]
    [   40.964942]  dml2_validate+0x504/0x750 [amdgpu]
    [   40.965117]  ? dml21_copy+0x95/0xb0 [amdgpu]
    [   40.965291]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965295]  dcn401_validate_bandwidth+0x4e/0x70 [amdgpu]
    [   40.965491]  update_planes_and_stream_state+0x38d/0x5c0 [amdgpu]
    [   40.965672]  update_planes_and_stream_v3+0x52/0x1e0 [amdgpu]
    [   40.965845]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965849]  dc_update_planes_and_stream+0x71/0xb0 [amdgpu]
    
    Fix this by adding a guard for checking cursor width before triggering
    the size calculation.
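
    A guard of this kind can be sketched as follows. The calculation here
    is a hypothetical stand-in (rounding a width down to a power of two),
    not the DML2.1 formula; it only shows how a zero width can produce a
    huge unsigned shift exponent and how an early return avoids it.

```c
#include <stdint.h>

/* floor(log2(v)) for v > 0; yields UINT32_MAX (an unsigned "-1") for 0. */
uint32_t demo_floor_log2(uint32_t v)
{
	uint32_t r = UINT32_MAX; /* wraps to 0 on the first increment */

	while (v) {
		v >>= 1;
		r++;
	}
	return r;
}

/* Largest power of two not exceeding cursor_width, or 0 with no cursor. */
uint32_t demo_cursor_pow2_width(uint32_t cursor_width)
{
	if (cursor_width == 0)
		return 0; /* guard: the exponent would be 4294967295 */

	return 1u << demo_floor_log2(cursor_width);
}
```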
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: fix double free issue during amdgpu module unload [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 15 18:45:22 2024 -0400

    drm/amd/display: fix double free issue during amdgpu module unload
    
    [ Upstream commit 20b5a8f9f4670a8503aa9fa95ca632e77c6bf55d ]
    
    Flexible endpoints use DIGs from available inflexible endpoints,
    so only the encoders of inflexible links need to be freed.
    Otherwise, a double free issue may occur when unloading the
    amdgpu module.
    
    [  279.190523] RIP: 0010:__slab_free+0x152/0x2f0
    [  279.190577] Call Trace:
    [  279.190580]  <TASK>
    [  279.190582]  ? show_regs+0x69/0x80
    [  279.190590]  ? die+0x3b/0x90
    [  279.190595]  ? do_trap+0xc8/0xe0
    [  279.190601]  ? do_error_trap+0x73/0xa0
    [  279.190605]  ? __slab_free+0x152/0x2f0
    [  279.190609]  ? exc_invalid_op+0x56/0x70
    [  279.190616]  ? __slab_free+0x152/0x2f0
    [  279.190642]  ? asm_exc_invalid_op+0x1f/0x30
    [  279.190648]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191096]  ? __slab_free+0x152/0x2f0
    [  279.191102]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191469]  kfree+0x260/0x2b0
    [  279.191474]  dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191821]  link_destroy+0xd7/0x130 [amdgpu]
    [  279.192248]  dc_destruct+0x90/0x270 [amdgpu]
    [  279.192666]  dc_destroy+0x19/0x40 [amdgpu]
    [  279.193020]  amdgpu_dm_fini+0x16e/0x200 [amdgpu]
    [  279.193432]  dm_hw_fini+0x26/0x40 [amdgpu]
    [  279.193795]  amdgpu_device_fini_hw+0x24c/0x400 [amdgpu]
    [  279.194108]  amdgpu_driver_unload_kms+0x4f/0x70 [amdgpu]
    [  279.194436]  amdgpu_pci_remove+0x40/0x80 [amdgpu]
    [  279.194632]  pci_device_remove+0x3a/0xa0
    [  279.194638]  device_remove+0x40/0x70
    [  279.194642]  device_release_driver_internal+0x1ad/0x210
    [  279.194647]  driver_detach+0x4e/0xa0
    [  279.194650]  bus_remove_driver+0x6f/0xf0
    [  279.194653]  driver_unregister+0x33/0x60
    [  279.194657]  pci_unregister_driver+0x44/0x90
    [  279.194662]  amdgpu_exit+0x19/0x1f0 [amdgpu]
    [  279.194939]  __do_sys_delete_module.isra.0+0x198/0x2f0
    [  279.194946]  __x64_sys_delete_module+0x16/0x20
    [  279.194950]  do_syscall_64+0x58/0x120
    [  279.194954]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [  279.194980]  </TASK>
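
    The ownership rule behind the fix can be sketched in userspace C.
    The demo_ types and the borrows_encoder flag are hypothetical; the
    sketch only shows that a resource shared by two links is freed once,
    by the link that owns it.

```c
#include <stdbool.h>
#include <stdlib.h>

struct demo_encoder {
	int id;
};

struct demo_link {
	struct demo_encoder *enc;
	bool borrows_encoder; /* true for a "flexible" endpoint in this sketch */
};

/* Frees each owned encoder once; returns the number of frees performed. */
int demo_destroy_links(struct demo_link *links, int count)
{
	int freed = 0;

	for (int i = 0; i < count; i++) {
		if (!links[i].borrows_encoder && links[i].enc) {
			free(links[i].enc);
			freed++;
		}
		links[i].enc = NULL; /* drop the reference either way */
	}
	return freed;
}
```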
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Roman Li <[email protected]>
    Signed-off-by: Roman Li <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in DCN30 color transformation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:05:20 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 color transformation
    
    [ Upstream commit d81873f9e715b72d4f8d391c8eb243946f784dfc ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
    management module. The issue could occur when the index 'i' exceeds the
    number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
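
    The bounds check described above follows the usual pattern sketched
    below. The array size of 1025 is taken from the report above and is
    assumed here to equal TRANSFER_FUNC_POINTS; the demo_ names are
    hypothetical.

```c
#include <stdbool.h>

#define DEMO_TRANSFER_FUNC_POINTS 1025 /* assumed from the report above */

struct demo_tf_pts {
	int red[DEMO_TRANSFER_FUNC_POINTS];
};

/* Copies point i into *out; returns false if i is out of bounds. */
bool demo_read_point(const struct demo_tf_pts *pts, int i, int *out)
{
	if (i < 0 || i >= DEMO_TRANSFER_FUNC_POINTS)
		return false;

	*out = pts->red[i];
	return true;
}
```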
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:44:02 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
    
    [ Upstream commit bc50b614d59990747dd5aeced9ec22f9258991ff ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
    color management module. The issue could occur when the index 'i'
    exceeds the number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 17:48:27 2024 +0530

    drm/amd/display: Fix index out of bounds in degamma hardware format translation
    
    [ Upstream commit b7e99058eb2e86aabd7a10761e76cae33d22b49f ]
    
    Fixes index out of bounds issue in
    `cm_helper_translate_curve_to_degamma_hw_format` function. The issue
    could occur when the index 'i' exceeds the number of transfer function
    points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix possible overflow in integer multiplication [+ + +]
Author: Alex Hung <[email protected]>
Date:   Fri Jun 7 22:09:53 2024 -0600

    drm/amd/display: Fix possible overflow in integer multiplication
    
    [ Upstream commit 3f96f545f877ac59d0c967f52d760b4b2b3b9a47 ]
    
    [WHAT & HOW]
    Multiplying two integers may overflow in a context that expects an
    unsigned long long (64-bit) expression. Fix this by casting one operand
    to unsigned long long to force a 64-bit result.
    
    This fixes 2 OVERFLOW_BEFORE_WIDEN issues reported by Coverity.
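    
    The pattern can be sketched in plain C as follows. The function names are
    hypothetical, and unsigned 32-bit operands are used (assuming a platform
    where int is 32 bits) so that the narrow multiply wraps in a well-defined
    way instead of invoking signed-overflow undefined behavior.

```c
#include <stdint.h>

/*
 * Problematic form: the multiply is carried out in 32-bit arithmetic and
 * wraps before the result is widened to 64 bits.
 */
uint64_t sketch_mul_narrow(uint32_t a, uint32_t b)
{
	return a * b;
}

/*
 * Fixed form: widen one operand first, so the multiply itself is done in
 * 64 bits.
 */
uint64_t sketch_mul_wide(uint32_t a, uint32_t b)
{
	return (unsigned long long)a * b;
}
```

    Casting the finished product instead, as in (unsigned long long)(a * b),
    would not help, because the wrap has already happened by then.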
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix system hang while resume with TBT monitor [+ + +]
Author: Tom Chung <[email protected]>
Date:   Fri Sep 13 15:44:40 2024 +0800

    drm/amd/display: Fix system hang while resume with TBT monitor
    
    commit 52d4e3fb3d340447dcdac0e14ff21a764f326907 upstream.
    
    [Why]
    With a Thunderbolt monitor connected, the system may hang on resume
    after a suspend.
    
    The TBT monitor HPD is triggered during the resume procedure and calls
    drm_client_modeset_probe() while
    struct drm_connector connector->dev->master is NULL.
    
    This messes up the pipe topology after resume.
    
    [How]
    Skip the TBT monitor HPD during the resume procedure, because the
    connectors are already probed after resume by default.
    
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Signed-off-by: Fangzhi Zuo <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Fix VRR cannot enable [+ + +]
Author: Tom Chung <[email protected]>
Date:   Wed Jul 3 16:47:57 2024 +0800

    drm/amd/display: Fix VRR cannot enable
    
    [ Upstream commit f91a9af09dea850d83d4b217b8acbafd97b5c61f ]
    
    [Why]
    Sometimes VRR cannot be enabled after logging in to the desktop.
    
    User space may call the DRM_IOCTL_MODE_GETCONNECTOR right after
    the DRM_IOCTL_MODE_RMFB.
    
    Calling DRM_IOCTL_MODE_RMFB to remove all the frame buffers causes the
    driver to disable the crtc and then disable the link in
    link_set_dpms_off().
    
    The dpcd read in amdgpu_dm_update_freesync_caps() then fails while
    trying to get the DP_MSA_TIMING_PAR_IGNORED capability, and the driver
    concludes that the sink side does not support VRR.
    
    [How]
    Use the dpcd_caps.allow_invalid_MSA_timing_param flag instead of
    reading from dpcd directly.
    
    dpcd_caps.allow_invalid_MSA_timing_param flag is updated during HPD.
    It is safe to replace the original method.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML [+ + +]
Author: Dillon Varone <[email protected]>
Date:   Tue Jul 23 15:54:23 2024 -0400

    drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML
    
    [ Upstream commit b8dc6ca028d9a39196a3a066b9ef2d4a5eca475d ]
    
    [WHY]
    Currently TR0 (trip 0) is not properly budgeting for urgent latency in
    DML2.1. This results in overly aggressive prefetch schedules that are
    vulnerable to request return jitter, resulting in severe underflow at
    the start of the frame.
    
    [HOW]
    Forcing 3DLUT DMA check to enable causes urgent latency to be budgeted
    properly into the prefetch schedule, avoiding the vulnerability.
    
    Reviewed-by: Alvin Lee <[email protected]>
    Signed-off-by: Dillon Varone <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: guard write a 0 post_divider value to HW [+ + +]
Author: Ahmed, Muhammad <[email protected]>
Date:   Tue Aug 13 17:11:55 2024 -0400

    drm/amd/display: guard write a 0 post_divider value to HW
    
    [ Upstream commit 5d666496c24129edeb2bcb500498b87cc64e7f07 ]
    
    [why]
    post_divider_value should not be 0.
    
    Reviewed-by: Charlene Liu <[email protected]>
    Signed-off-by: Ahmed, Muhammad <[email protected]>
    Signed-off-by: Zaeem Mohamed <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream' [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Jul 26 19:31:55 2024 +0530

    drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream'
    
    [ Upstream commit 8141f21b941710ecebe49220b69822cab3abd23d ]
    
    This commit adds a null check for 'stream_status' in the function
    'planes_changed_for_existing_stream'. Previously, the code assumed
    'stream_status' could be null, but did not handle the case where it was
    actually null. This could lead to a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:3784 planes_changed_for_existing_stream() error: we previously assumed 'stream_status' could be null (see line 3774)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: handle nulled pipe context in DCE110's set_drr() [+ + +]
Author: Tobias Jakobi <[email protected]>
Date:   Mon Sep 16 14:54:05 2024 +0200

    drm/amd/display: handle nulled pipe context in DCE110's set_drr()
    
    [ Upstream commit e7d4e1438533abe448813bdc45691f9c230aa307 ]
    
    As set_drr() is called from IRQ context, it can happen that the
    pipe context has been nulled by dc_state_destruct().
    
    Apply the same protection here that is already present for
    dcn35_set_drr() and dcn10_set_drr(). I.e. fetch the tg pointer
    first (to avoid a race with dc_state_destruct()), and then
    check the local copy before using it.
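    
    A single-threaded sketch of the "fetch once, then check the local copy"
    shape described above. All types and names here are hypothetical
    stand-ins, not the DC structures. In kernel code the single read might
    additionally need READ_ONCE(); that is outside the scope of this sketch.

```c
#include <stddef.h>

struct sketch_tg {
	int vmin;
	int vmax;
};

struct sketch_pipe_ctx {
	struct sketch_tg *tg;	/* may be cleared by a teardown path */
};

/* Returns the number of pipes that were actually programmed. */
int sketch_set_drr(struct sketch_pipe_ctx **pipes, int num_pipes,
		   int vmin, int vmax)
{
	int i, programmed = 0;

	for (i = 0; i < num_pipes; i++) {
		/* Read the pointer once into a local ... */
		struct sketch_tg *tg = pipes[i]->tg;

		/* ... and only ever test and use that local copy. */
		if (tg == NULL)
			continue;

		tg->vmin = vmin;
		tg->vmax = vmax;
		programmed++;
	}
	return programmed;
}
```

    Testing pipes[i]->tg and then dereferencing pipes[i]->tg again would read
    the field twice, which leaves a window for it to change in between.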
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
    Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Tobias Jakobi <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Implement bounds check for stream encoder creation in DCN401 [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Jul 19 21:39:57 2024 +0530

    drm/amd/display: Implement bounds check for stream encoder creation in DCN401
    
    [ Upstream commit bdf606810210e8e07a0cdf1af3c467291363b295 ]
    
    'stream_enc_regs' array is an array of dcn10_stream_enc_registers
    structures. The array is initialized with four elements, corresponding
    to the four calls to stream_enc_regs() in the array initializer. This
    means that valid indices for this array are 0, 1, 2, and 3.
    
    The error message 'stream_enc_regs' 4 <= 5 below indicates an attempt
    to access this array with an index of 5, which is out of bounds. This
    could lead to undefined behavior.
    
    Here, eng_id is used as an index to access the stream_enc_regs array. If
    eng_id is 5, this would result in an out-of-bounds access on the
    stream_enc_regs array.
    
    This fixes the buffer overflow error in dcn401_stream_encoder_create.
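    
    A small sketch of such a bounds check, assuming a four-entry table as
    described above. The table contents, the struct layout and the function
    name are hypothetical, and returning NULL for a bad index is a choice made
    for this example only.

```c
#include <stddef.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct sketch_enc_regs {
	unsigned int base;
};

/* Four entries, so the valid indices are 0, 1, 2 and 3. */
static const struct sketch_enc_regs sketch_stream_enc_regs[] = {
	{ 0x100 }, { 0x200 }, { 0x300 }, { 0x400 },
};

/*
 * Returns NULL for an out-of-range engine id instead of indexing past
 * the end of the table.
 */
const struct sketch_enc_regs *sketch_stream_encoder_regs(int eng_id)
{
	if (eng_id < 0 ||
	    (size_t)eng_id >= SKETCH_ARRAY_SIZE(sketch_stream_enc_regs))
		return NULL;

	return &sketch_stream_enc_regs[eng_id];
}
```

    Deriving the limit from the array itself keeps the check correct if
    entries are added to or removed from the table later.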
    
    Found by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Increase array size of dummy_boolean [+ + +]
Author: Alex Hung <[email protected]>
Date:   Wed Jul 3 10:50:35 2024 -0600

    drm/amd/display: Increase array size of dummy_boolean
    
    [ Upstream commit 6d64d39486197083497a01b39e23f2f8474b35d3 ]
    
    [WHY]
    dml2_core_shared_mode_support and dml_core_mode_support access the third
    element of dummy_boolean, i.e. hw_debug5 = &s->dummy_boolean[2], when
    dummy_boolean has size of 2. Any assignment to hw_debug5 causes an
    OVERRUN.
    
    [HOW]
    Increase dummy_boolean's array size to 3.
    
    This fixes 2 OVERRUN issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Initialize denominators' default to 1 [+ + +]
Author: Alex Hung <[email protected]>
Date:   Tue Jun 18 14:05:08 2024 -0600

    drm/amd/display: Initialize denominators' default to 1
    
    [ Upstream commit b995c0a6de6c74656a0c39cd57a0626351b13e3c ]
    
    [WHAT & HOW]
    Variables that are used as denominators and may not be assigned other
    values should not be 0. Change their default to 1 so they are never 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
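    
    The idea can be sketched as follows. The enum, the function name and the
    byte sizes are hypothetical and chosen only to show a denominator that
    keeps a non-zero default when no branch assigns it.

```c
enum sketch_format {
	SKETCH_FMT_UNKNOWN,
	SKETCH_FMT_8BPP,
	SKETCH_FMT_32BPP,
};

/* Number of elements of the given format that fit in buffer_bytes. */
unsigned int sketch_elements_in_buffer(enum sketch_format fmt,
				       unsigned int buffer_bytes)
{
	unsigned int bytes_per_element = 1;	/* default of 1, never 0 */

	switch (fmt) {
	case SKETCH_FMT_8BPP:
		bytes_per_element = 1;
		break;
	case SKETCH_FMT_32BPP:
		bytes_per_element = 4;
		break;
	default:
		/* Unknown format: keep the default rather than divide by 0. */
		break;
	}

	return buffer_bytes / bytes_per_element;
}
```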
    
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Initialize get_bytes_per_element's default to 1 [+ + +]
Author: Alex Hung <[email protected]>
Date:   Mon Jul 15 09:57:01 2024 -0600

    drm/amd/display: Initialize get_bytes_per_element's default to 1
    
    [ Upstream commit 4067f4fa0423a89fb19a30b57231b384d77d2610 ]
    
    Variables that are used as denominators and may not be assigned other
    values should not be 0. bytes_per_element_y & bytes_per_element_c are
    initialized by get_bytes_per_element(), which should never return 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 11:51:27 2024 -0600

    drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags
    
    [ Upstream commit 5559598742fb4538e4c51c48ef70563c49c2af23 ]
    
    [WHAT & HOW]
    "dcn20_validate_apply_pipe_split_flags" dereferences merge, and thus it
    cannot be a null pointer. Let's pass a valid pointer to avoid null
    dereference.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Re-enable panel replay feature [+ + +]
Author: Tom Chung <[email protected]>
Date:   Wed Jun 26 17:02:23 2024 +0800

    drm/amd/display: Re-enable panel replay feature
    
    [ Upstream commit be64336307a6c3ee71fe1337c1b9f0495aa83c50 ]
    
    [Why & How]
    The replay issues have been fixed, so re-enable the panel replay feature.
    
    Reported-by: Arthur Borsboom <[email protected]>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
    Reviewed-by: Sun peng Li <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC [+ + +]
Author: Fangzhi Zuo <[email protected]>
Date:   Wed Sep 4 15:29:24 2024 -0400

    drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC
    
    commit d51160ab00969ee6758ed2dcbc0f81dd476a181c upstream.
    
    The existing last step of the DSC policy restores the pbn value under
    minimum compression when a greedy attempt to disable DSC for a stream
    fails to fit in the MST bandwidth. The optimized DSC params produced by
    the optimization step are not necessarily those of minimum compression,
    so it is not correct to restore the pbn under the minimum compression rate.
    
    Restoring the pbn under minimum compression instead of the optimized pbn
    can leave the DSC params incorrect at the modeset, where atomic_check fails
    due to insufficient bandwidth. One or more connected monitors may fail to
    light up in that case.
    
    Restore the optimized pbn value instead of the pbn value under minimum
    compression.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Fangzhi Zuo <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 352c3165d2b75030169e012461a16bcf97f392fc)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Revert Avoid overflow assignment [+ + +]
Author: Gabe Teeger <[email protected]>
Date:   Thu Jul 25 18:42:21 2024 -0400

    drm/amd/display: Revert Avoid overflow assignment
    
    commit e80f8f491df873ea2e07c941c747831234814612 upstream.
    
    This reverts commit a15268787b79 ("drm/amd/display: Avoid overflow assignment in link_dp_cts")
    due to a regression causing a DPMS hang.
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Gabe Teeger <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Underflow Seen on DCN401 eGPU [+ + +]
Author: Daniel Sa <[email protected]>
Date:   Fri Jul 19 13:39:09 2024 -0400

    drm/amd/display: Underflow Seen on DCN401 eGPU
    
    [ Upstream commit ca0fb243c3bb53dbbd71d16c76f319bf923ee3d4 ]
    
    [WHY]
    In dcn401 we read clock values before FW is loaded. These incorrect
    values cause the driver to believe that we are running higher clocks
    than what we actually have. This then causes corruption/underflow for
    the eGPU.
    
    [HOW]
    When new values are read from HW, update internal structures to
    propagate the new, correct values. This fixes the issue.
    
    Signed-off-by: Daniel Sa <[email protected]>
    Reviewed-by: Alvin Lee <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Unlock Pipes Based On DET Allocation [+ + +]
Author: Austin Zheng <[email protected]>
Date:   Tue Jul 30 11:55:23 2024 -0400

    drm/amd/display: Unlock Pipes Based On DET Allocation
    
    [ Upstream commit 4af0d8ebf74ccbb60d33fdd410891283dd6cb109 ]
    
    [Why]
    DML21 does not allocate DET evenly between pipes.
    This may result in underflow when unlocking the pipes, as DET could
    be overallocated.
    
    [How]
    1. Unlock pipes that have a decreased amount of DET allocation
    2. Wait for the double buffer to be updated.
    3. Unlock the remaining pipes.
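    
    The three steps above can be sketched as a two-pass loop. This is an
    illustration under stated assumptions, not the DCN401 code: struct
    sketch_pipe, the wait callback and the `order` output array are
    hypothetical and exist only to make the ordering observable.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_pipe {
	unsigned int old_det_kb;	/* DET allocated before the update */
	unsigned int new_det_kb;	/* DET allocated after the update */
	bool locked;
};

/* Hypothetical hook for "wait for the double buffer to be updated". */
typedef void (*sketch_wait_fn)(void);

/*
 * Unlocks all locked pipes, writes the unlock order into `order` and
 * returns how many pipes were unlocked.
 */
int sketch_unlock_pipes(struct sketch_pipe *pipes, int n,
			sketch_wait_fn wait, int *order)
{
	int i, count = 0;

	/* 1. Pipes whose DET allocation decreased go first. */
	for (i = 0; i < n; i++) {
		if (pipes[i].locked &&
		    pipes[i].new_det_kb < pipes[i].old_det_kb) {
			pipes[i].locked = false;
			order[count++] = i;
		}
	}

	/* 2. Let the released DET take effect before it is claimed. */
	if (wait != NULL)
		wait();

	/* 3. Everything that is still locked can now be released. */
	for (i = 0; i < n; i++) {
		if (pipes[i].locked) {
			pipes[i].locked = false;
			order[count++] = i;
		}
	}
	return count;
}
```

    Releasing the shrinking pipes first means the total DET in use never
    exceeds what is available while the remaining pipes pick up their larger
    allocations.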
    
    Reviewed-by: Alvin Lee <[email protected]>
    Signed-off-by: Austin Zheng <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35 [+ + +]
Author: Yihan Zhu <[email protected]>
Date:   Sat Sep 7 13:25:19 2024 -0400

    drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35
    
    commit 0d5e5e8a0aa49ea2163abf128da3b509a6c58286 upstream.
    
    [WHY & HOW]
    A mismatch in DCN35 DML2 causes bw validation to fail and an unexpected
    DPP pipe to be acquired, resulting in a grey screen and system hang.
    Remove the EnhancedPrefetchScheduleAccelerationFinal value override
    to match the HW spec.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Charlene Liu <[email protected]>
    Signed-off-by: Yihan Zhu <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces [+ + +]
Author: Nicholas Kazlauskas <[email protected]>
Date:   Thu Jul 18 11:53:31 2024 -0400

    drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces
    
    [ Upstream commit 31663521ede2edb622ee1b397ae3ac666d6351c5 ]
    
    [Why]
    It's currently hard coded to 256 when it should be using the SOC
    provided values. This can result in corruption with linear surfaces
    where we prefetch more PTE than the buffer can hold.
    
    [How]
    Update the min page size correctly for the plane.
    
    Signed-off-by: Nicholas Kazlauskas <[email protected]>
    Reviewed-by: Jun Lei <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amd/pm: ensure the fw_info is not null before using it [+ + +]
Author: Tim Huang <[email protected]>
Date:   Wed Aug 7 17:15:12 2024 +0800

    drm/amd/pm: ensure the fw_info is not null before using it
    
    [ Upstream commit 186fb12e7a7b038c2710ceb2fb74068f1b5d55a4 ]
    
    This resolves the dereference null return value warning
    reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx10: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:34 2024 -0400

    drm/amdgpu/gfx10: use rlc safe mode for soft recovery
    
    [ Upstream commit ead60e9c4e29c8574cae1be4fe3af1d9a978fb0f ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Fri Jul 12 15:36:19 2024 -0400

    drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL
    
    [ Upstream commit b5be054c585110b2c5c1b180136800e8c41c7bb4 ]
    
    Need to enter safe mode before touching GC MMIO.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx11: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:23 2024 -0400

    drm/amdgpu/gfx11: use rlc safe mode for soft recovery
    
    [ Upstream commit 3f2d35c325534c1b7ac5072173f0dc7ca969dec2 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx12: properly handle error ints on all pipes [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Mon Jul 1 17:40:55 2024 -0400

    drm/amdgpu/gfx12: properly handle error ints on all pipes
    
    [ Upstream commit 39879321769cc2d9a690725959ef76af92a38ac1 ]
    
    Need to handle the interrupt enables for all pipes.
    
    v2: fix indexing (Jessie)
    
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx12: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:13 2024 -0400

    drm/amdgpu/gfx12: use rlc safe mode for soft recovery
    
    [ Upstream commit 21818f39beda2e843199e5d8d9e3f9e43c8080a3 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx9: properly handle error ints on all pipes [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Tue Jul 2 10:24:59 2024 -0400

    drm/amdgpu/gfx9: properly handle error ints on all pipes
    
    [ Upstream commit 48695573d2feaf42812c1ad54e01caff0d1c2d71 ]
    
    Need to handle the interrupt enables for all pipes.
    
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx9: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:57 2024 -0400

    drm/amdgpu/gfx9: use rlc safe mode for soft recovery
    
    [ Upstream commit 3ec2ad7c34c412bd9264cd1ff235d0812be90e82 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu: add list empty check to avoid null pointer issue [+ + +]
Author: Yang Wang <[email protected]>
Date:   Wed Aug 21 14:42:41 2024 +0800

    drm/amdgpu: add list empty check to avoid null pointer issue
    
    [ Upstream commit 4416377ae1fdc41a90b665943152ccd7ff61d3c5 ]
    
    Add list empty check to avoid null pointer issues in some corner cases.
    - list_for_each_entry_safe()
    
    Signed-off-by: Yang Wang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: add raven1 gfxoff quirk [+ + +]
Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:25:54 2024 +0800

    drm/amdgpu: add raven1 gfxoff quirk
    
    [ Upstream commit 0126c0ae11e8b52ecfde9d1b174ee2f32d6c3a5d ]
    
    Fix screen corruption with openkylin.
    
    Link: https://bbs.openkylin.top/t/topic/171497
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Block MMR_READ IOCTL in reset [+ + +]
Author: Victor Skvortsov <[email protected]>
Date:   Thu Aug 8 13:40:23 2024 -0400

    drm/amdgpu: Block MMR_READ IOCTL in reset
    
    [ Upstream commit 9e823f307074c0f82b5f6044943b0086e3079bed ]
    
    Register access from userspace should be blocked until
    reset is complete.
    
    Signed-off-by: Victor Skvortsov <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit [+ + +]
Author: Pierre-Eric Pelloux-Prayer <[email protected]>
Date:   Tue Jul 2 11:54:30 2024 +0200

    drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
    
    [ Upstream commit fec5f8e8c6bcf83ed7a392801d7b44c5ecfc1e82 ]
    
    Before this commit, only submits with both a BO_HANDLES chunk and a
    'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).
    
    But if UMD sent multiple BO_HANDLES, what would happen is:
    * only the last one would be really used
    * all the others would leak memory as amdgpu_cs_p1_bo_handles would
      overwrite the previous p->bo_list value
    
    This commit rejects submissions with multiple BO_HANDLES chunks to
    match the implementation of the parser.
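    
    A minimal sketch of rejecting a submission that carries more than one
    BO_HANDLES chunk. The enum, the function name and the choice of -EINVAL
    as the error code are assumptions made for this example and are not taken
    from the amdgpu parser.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_chunk_id {
	SKETCH_CHUNK_IB,
	SKETCH_CHUNK_BO_HANDLES,
	SKETCH_CHUNK_FENCE,
};

/*
 * Returns 0 if the chunk list is acceptable, or -EINVAL (assumed error
 * code) if it contains more than one BO_HANDLES chunk.
 */
int sketch_check_chunks(const enum sketch_chunk_id *chunks, size_t n)
{
	int have_bo_list = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (chunks[i] != SKETCH_CHUNK_BO_HANDLES)
			continue;
		if (have_bo_list)
			return -EINVAL;	/* second BO_HANDLES chunk: reject */
		have_bo_list = 1;
	}
	return 0;
}
```

    Rejecting early avoids the situation described above, where a later chunk
    would silently overwrite, and leak, the list built from an earlier one.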
    
    Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: enable gfxoff quirk on HP 705G4 [+ + +]
Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:27:08 2024 +0800

    drm/amdgpu: enable gfxoff quirk on HP 705G4
    
    [ Upstream commit 2c7795e245d993bcba2f716a8c93a5891ef910c9 ]
    
    Enabling the gfxoff quirk results in a perfectly usable
    graphical user interface on HP 705G4 DM with R5 2400G.
    
    Without the quirk, the X server is completely unusable, as
    every few seconds there is a GPU reset due to a ring gfx timeout.
    
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Fix get each xcp macro [+ + +]
Author: Asad Kamal <[email protected]>
Date:   Mon Jul 22 19:45:11 2024 +0800

    drm/amdgpu: Fix get each xcp macro
    
    [ Upstream commit ef126c06a98bde1a41303970eb0fc0ac33c3cc02 ]
    
    Fix get each xcp macro to loop over each partition correctly
    
    Fixes: 4bdca2057933 ("drm/amdgpu: Add utility functions for xcp")
    Signed-off-by: Asad Kamal <[email protected]>
    Reviewed-by: Lijo Lazar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx10 ip_dump [+ + +]
Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:25:24 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx10 ip_dump
    
    [ Upstream commit 98df5a7732e3b78bf8824d2938a8865a45cfc113 ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx11 ip_dump [+ + +]
Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:27:10 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx11 ip_dump
    
    [ Upstream commit bd15f805cdc503ac229a14f5fe21db12e6e7f84a ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx9 ip_dump [+ + +]
Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:21:53 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx9 ip_dump
    
    [ Upstream commit 07f4f9c00ec545dfa6251a44a09d2c48a76e7ee5 ]
    
    Change if (ptr == NULL) to if (!ptr) for a better
    format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_atombios [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 13:47:55 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_atombios
    
    [ Upstream commit 92549780e32718d64a6d08bbbb3c6fffecb541c7 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_gfx [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 10:38:37 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_gfx
    
    [ Upstream commit c0277b9d7c2ee9ee5dbc948548984f0fbb861301 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer [+ + +]
Author: Philip Yang <[email protected]>
Date:   Sun Jul 14 11:11:05 2024 -0400

    drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
    
    [ Upstream commit c86ad39140bbcb9dc75a10046c2221f657e8083b ]
    
    Pass a pointer reference to amdgpu_bo_unref to clear the correct pointer.
    Otherwise amdgpu_bo_unref clears only the local variable and the original
    pointer is not set to NULL, which could cause a use-after-free bug.
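    
    The difference between clearing a local copy and clearing the caller's
    pointer can be sketched in user-space C. struct sketch_bo and all three
    functions are hypothetical stand-ins. Only the general shape (an unref
    helper that takes a pointer to the pointer) mirrors the description
    above.

```c
#include <stdlib.h>

struct sketch_bo {
	int refcount;
};

/* Drops one reference and clears the pointer the caller handed in. */
void sketch_bo_unref(struct sketch_bo **bo)
{
	if (*bo == NULL)
		return;
	if (--(*bo)->refcount == 0)
		free(*bo);
	*bo = NULL;
}

/*
 * Problematic form: mem_obj is passed by value, so only this function's
 * local copy is set to NULL and the caller keeps a stale pointer.
 */
void sketch_free_gtt_mem_by_value(struct sketch_bo *mem_obj)
{
	sketch_bo_unref(&mem_obj);
}

/* Fixed form: the caller's own pointer is cleared. */
void sketch_free_gtt_mem(struct sketch_bo **mem_obj)
{
	sketch_bo_unref(mem_obj);
}
```

    With the by-value form, a caller that later checks its pointer for NULL
    still sees a non-NULL value, even if the object has already been freed.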
    
    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Felix Kuehling <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: Check int source id for utcl2 poison event [+ + +]
Author: Hawking Zhang <[email protected]>
Date:   Tue Aug 20 13:56:32 2024 +0800

    drm/amdkfd: Check int source id for utcl2 poison event
    
    [ Upstream commit db6341a9168d2a24ded526277eeab29724d76e9d ]
    
    Traditional utcl2 fault_status polling does not
    work in an SRIOV environment: polling of the fault
    status register from the guest side is dropped
    by the hardware.
    
    The driver should instead check the utcl2 interrupt
    source id to identify a utcl2 poison event. It is
    set to 1 when poisoned data interrupts are
    signaled.
    
    v2: drop the unused local variable (Tao)
    
    Signed-off-by: Hawking Zhang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: Fix resource leak in criu restore queue [+ + +]
Author: Jesse Zhang <[email protected]>
Date:   Fri Sep 6 11:29:55 2024 +0800

    drm/amdkfd: Fix resource leak in criu restore queue
    
    [ Upstream commit aa47fe8d3595365a935921a90d00bc33ee374728 ]
    
    To avoid memory leaks, release q_extra_data when exiting the restore queue.
    v2: Correct the proto (Alex)
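
    The single-exit cleanup shape this kind of fix usually takes can be
    sketched in userspace as follows. restore_queue(), tracked_alloc() and
    the fail_step parameter are hypothetical and only simulate the error
    paths; this is not the amdkfd code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_allocs;	/* buffers allocated and not yet freed */

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (!p)
		return;
	assert(live_allocs > 0);
	live_allocs--;
	free(p);
}

/* Every exit after the allocation funnels through one release point. */
static int restore_queue(size_t extra_size, int fail_step)
{
	void *extra_data;
	int ret = 0;

	extra_data = tracked_alloc(extra_size);
	if (!extra_data)
		return -ENOMEM;

	if (fail_step == 1) {	/* simulated failure while copying data in */
		ret = -EFAULT;
		goto exit;
	}
	if (fail_step == 2) {	/* simulated failure while creating the queue */
		ret = -EINVAL;
		goto exit;
	}

exit:
	tracked_free(extra_data);
	return ret;
}
```

    Success and both simulated failures leave live_allocs at zero, because
    no path returns without passing the exit label.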
    
    Signed-off-by: Jesse Zhang <[email protected]>
    Reviewed-by: Tim Huang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes [+ + +]
Author: Derek Foreman <[email protected]>
Date:   Tue Aug 27 11:39:04 2024 -0500

    drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes
    
    [ Upstream commit f0fa69b5011a45394554fb8061d74fee4d7cd72c ]
    
    The largest infoframe we create is the DRM (Dynamic Range Mastering)
    infoframe which is 26 bytes + a 4 byte header, for a total of 30
    bytes.
    
    With HDMI_MAX_INFOFRAME_SIZE set to 29 bytes, as it is now, we
    allocate too little space to pack a DRM infoframe in
    write_device_infoframe(), leading to an ENOSPC return from
    hdmi_infoframe_pack(), and never calling the connector's
    write_infoframe() vfunc.
    
    Instead of having HDMI_MAX_INFOFRAME_SIZE defined in two places,
    replace HDMI_MAX_INFOFRAME_SIZE with HDMI_INFOFRAME_SIZE(MAX) and make
    MAX 27 bytes - which is defined by the HDMI specification to be the
    largest infoframe payload.
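
    The arithmetic above can be checked with a small sketch. The byte
    counts come from this message; the exact macro layout is an assumption
    modeled on the names it mentions, and OLD_MAX_BUFFER_SIZE and
    frame_fits() are hypothetical helpers.

```c
#include <assert.h>

/* Sizes in bytes, as stated in the commit message. */
#define HDMI_INFOFRAME_HEADER_SIZE	4
#define HDMI_DRM_INFOFRAME_SIZE		26
#define HDMI_MAX_INFOFRAME_SIZE		27	/* largest payload */

/* Header plus payload; token pasting selects the per-type payload size. */
#define HDMI_INFOFRAME_SIZE(type) \
	(HDMI_INFOFRAME_HEADER_SIZE + HDMI_ ## type ## _INFOFRAME_SIZE)

#define OLD_MAX_BUFFER_SIZE		29	/* the previous, too-small limit */

/* A packed DRM infoframe needs 30 bytes, one more than the old limit. */
static_assert(HDMI_INFOFRAME_SIZE(DRM) == 30, "26 byte payload + 4 byte header");
static_assert(HDMI_INFOFRAME_SIZE(MAX) == 31, "27 byte payload + 4 byte header");

/* Returns 1 when a packed frame of `needed` bytes fits in `bufsize`. */
static int frame_fits(unsigned int needed, unsigned int bufsize)
{
	return needed <= bufsize;
}
```

    A buffer sized with HDMI_INFOFRAME_SIZE(MAX) holds the 30-byte DRM
    infoframe, while the old 29-byte buffer does not.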
    
    Fixes: f378b77227bc ("drm/connector: hdmi: Add Infoframes generation")
    Fixes: c602e4959a0c ("drm/connector: hdmi: Create Infoframe DebugFS entries")
    
    Signed-off-by: Derek Foreman <[email protected]>
    Acked-by: Maxime Ripard <[email protected]>
    Reviewed-by: Jani Nikula <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Maxime Ripard <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/i915/display: BMG supports UHBR13.5 [+ + +]
Author: Arun R Murthy <[email protected]>
Date:   Tue Aug 27 13:42:05 2024 +0530

    drm/i915/display: BMG supports UHBR13.5
    
    [ Upstream commit fcd33d434d31a210bc9f209b5bfd92f3b91a2dda ]
    
    UHBR20 is not supported by Battlemage; the maximum supported link rate
    is UHBR13.5.
    
    v2: Replace IS_DGFX with IS_BATTLEMAGE (Jani)
    
    HSD: 16023263677
    Signed-off-by: Arun R Murthy <[email protected]>
    Reviewed-by: Mika Kahola <[email protected]>
    Fixes: 98b1c87a5e51 ("drm/i915/xe2hpd: Set maximum DP rate to UHBR13.5")
    Signed-off-by: Suraj Kandpal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 9c2338ac4543e0fab3a1e0f9f025591e0f0d9f8f)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/i915/dp: Fix AUX IO power enabling for eDP PSR [+ + +]
Author: Imre Deak <[email protected]>
Date:   Tue Sep 10 14:18:47 2024 +0300

    drm/i915/dp: Fix AUX IO power enabling for eDP PSR
    
    [ Upstream commit ec2231b8dd2dc515912ff7816c420153b4a95e92 ]
    
    Panel Self Refresh on eDP requires the AUX IO power to be enabled
    whenever the output (main link) is enabled. This is required by the
    AUX_PHY_WAKE/ML_PHY_LOCK signaling initiated by the HW automatically to
    re-enable the main link after it got disabled in power saving states
    (see eDP v1.4b, sections 5.1, 6.1.3.3.1.1).
    
    The Panel Replay mode on non-eDP outputs on the other hand is only
    supported by keeping the main link active, thus not requiring the above
    AUX_PHY_WAKE/ML_PHY_LOCK signaling (eDP v1.4b, section 6.1.3.3.1.2).
    Thus enabling the AUX IO power for this case is not required either.
    
    Based on the above enable the AUX IO power only for eDP/PSR outputs.
    
    Bspec: 49274, 53370
    
    v2:
    - Add a TODO comment to adjust the requirement for AUX IO based on
      whether the ALPM/main-link off mode gets enabled. (Rodrigo)
    
    Cc: Animesh Manna <[email protected]>
    Fixes: b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
    Reviewed-by: Rodrigo Vivi <[email protected]>
    Signed-off-by: Imre Deak <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit f7c2ed9d4ce80a2570c492825de239dc8b500f2e)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/dp: Fix colorimetry detection [+ + +]
Author: Ville Syrjälä <[email protected]>
Date:   Wed Sep 18 22:04:39 2024 +0300

    drm/i915/dp: Fix colorimetry detection
    
    [ Upstream commit e860513f56d8428fcb2bd0282ac8ab691a53fc6c ]
    
    intel_dp_init_connector() is no place for detecting stuff via
    DPCD (except perhaps for eDP). Move the colorimetry stuff into
    a more appropriate place.
    
    Cc: Jouni Högander <[email protected]>
    Fixes: 00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Reviewed-by: Jouni Högander <[email protected]>
    (cherry picked from commit 35dba4834bded843d5416e8caadfe82bd0ce1904)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/i915/gem: fix bitwise and logical AND mixup [+ + +]
Author: Jani Nikula <[email protected]>
Date:   Wed Sep 18 20:35:43 2024 +0300

    drm/i915/gem: fix bitwise and logical AND mixup
    
    commit 394b52462020b6cceff1f7f47fdebd03589574f3 upstream.
    
    CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND is an int, defaulting to 250. When
    the wakeref is non-zero, it's either -1 or a dynamically allocated
    pointer, depending on CONFIG_DRM_I915_DEBUG_RUNTIME_PM. It's likely that
    the code works by coincidence with the bitwise AND, but with
    CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y, there's the off chance that the
    condition evaluates to false, and intel_wakeref_auto() doesn't get
    called. Switch to the intended logical AND.
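
    A small sketch of why the bitwise form can misfire. AUTOSUSPEND_MS
    mirrors the default of 250 mentioned above; should_arm_bitwise() and
    should_arm_logical() are hypothetical helpers, and the wakeref is
    modeled as a plain integer that may hold -1 or an address.

```c
#include <assert.h>
#include <stdint.h>

#define AUTOSUSPEND_MS 250	/* 0xFA: bits 0 and 2 are clear */

static_assert((AUTOSUSPEND_MS & 0x5) == 0, "bits 0 and 2 of 250 are clear");

/* Buggy: true only if the two values share at least one set bit. */
static int should_arm_bitwise(uintptr_t wakeref)
{
	return (wakeref & AUTOSUSPEND_MS) != 0;
}

/* Intended: true whenever both values are non-zero. */
static int should_arm_logical(uintptr_t wakeref)
{
	return wakeref != 0 && AUTOSUSPEND_MS != 0;
}
```

    For a wakeref of -1 both forms agree. For a value such as 0x1000,
    which shares no set bit with 0xFA, the bitwise form yields false even
    though the wakeref is non-zero.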
    
    v2: Use != to avoid clang -Wconstant-logical-operand (Nathan)
    
    Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
    Cc: Matthew Auld <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Anshuman Gupta <[email protected]>
    Cc: Andi Shyti <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: [email protected] # v6.1+
    Reviewed-by: Matthew Auld <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]> # v1
    Link: https://patchwork.freedesktop.org/patch/msgid/643cc0a4d12f47fd8403d42581e83b1e9c4543c7.1726680898.git.jani.nikula@intel.com
    Signed-off-by: Jani Nikula <[email protected]>
    (cherry picked from commit 4c1bfe259ed1d2ade826f95d437e1c41b274df04)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/i915/psr: Do not wait for PSR being idle on on Panel Replay [+ + +]
Author: Jouni Högander <[email protected]>
Date:   Fri Sep 6 10:00:33 2024 +0300

    drm/i915/psr: Do not wait for PSR being idle on on Panel Replay
    
    [ Upstream commit 9498f2e24ee0133d486667c9fa4c27ecdaadc272 ]
    
    We do not have ALPM on DP Panel Replay. Because of this, SRD_STATUS[SRD State]
    doesn't change from SRDENT_ON after Panel Replay is enabled until it gets
    disabled.
    
    On eDP Panel Replay DEEP_SLEEP is not reached.
    _psr2_ready_for_pipe_update_locked is waiting for the DEEP_SLEEP bit to be reset.
    
    Take these into account in the Panel Replay code by not waiting for PSR to
    become idle after enabling VBI.
    
    Fixes: 29fb595d4875 ("drm/i915/psr: Panel replay uses SRD_STATUS to track it's status")
    Cc: Animesh Manna <[email protected]>
    Signed-off-by: Jouni Högander <[email protected]>
    Reviewed-by: Animesh Manna <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit a2d98feb4b0013ef4f9db0d8f642a8ac1f5ecbb9)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/mediatek: ovl_adaptor: Add missing of_node_put() [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Mon Jun 24 18:43:47 2024 +0200

    drm/mediatek: ovl_adaptor: Add missing of_node_put()
    
    commit 5beb6fba25db235b52eab34bde8112f07bb31d75 upstream.
    
    Error paths that exit for_each_child_of_node() need to call
    of_node_put() to decrement the child refcount and avoid memory leaks.
    
    Add the missing of_node_put().
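
    The reference handling can be modeled in userspace as below. struct
    node, next_child() and scan() are hypothetical and only imitate the
    iterator contract (the reference on the previous child is dropped and
    one on the next child is taken at each step); they are not the OF API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node; `bad` marks a child that triggers an error. */
struct node {
	int refcount;
	int bad;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	n->refcount--;
}

/* Iterator step: put the previous child, get and return the next one. */
static struct node *next_child(struct node *kids, size_t n, struct node *prev)
{
	size_t idx = prev ? (size_t)(prev - kids) + 1 : 0;

	node_put(prev);
	return idx < n ? node_get(&kids[idx]) : NULL;
}

/* Leaving the loop early skips the iterator's put, so do it by hand. */
static int scan(struct node *kids, size_t n, int put_on_error)
{
	struct node *child;

	for (child = next_child(kids, n, NULL); child;
	     child = next_child(kids, n, child)) {
		if (child->bad) {
			if (put_on_error)
				node_put(child);
			return -1;
		}
	}
	return 0;
}
```

    With put_on_error set to 0 the failing child keeps a refcount of 1
    after scan() returns, which is the leak; with it set to 1 all counts
    return to zero.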
    
    Cc: [email protected]
    Fixes: 453c3364632a ("drm/mediatek: Add ovl_adaptor support for MT8195")
    Signed-off-by: Javier Carrasco <[email protected]>
    Reviewed-by: CK Hu <[email protected]>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
    Signed-off-by: Chun-Kuang Hu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs [+ + +]
Author: Konrad Dybcio <[email protected]>
Date:   Tue Jul 9 13:15:40 2024 +0200

    drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs
    
    [ Upstream commit 16007768551d5bfe53426645401435ca8d2ef54f ]
    
    There are some cases, such as the one uncovered by Commit 46d4efcccc68
    ("drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails")
    where
    
    msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);
    
    is called on gpu->pdev == NULL, as the GPU device has not been fully
    initialized yet.
    
    Turns out that there's more than just the aforementioned path that
    causes this to happen (e.g. the case when there's speedbin data in the
    catalog, but opp-supported-hw is missing in DT).
    
    Assigning msm_gpu->pdev earlier seems like the least painful solution
    to this, therefore do so.
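
    A reduced sketch of the ordering issue, with hypothetical fake_pdev and
    fake_gpu types standing in for the real objects. The cleanup path
    dereferences gpu->pdev, so the assignment has to precede any step that
    can fail and trigger cleanup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the platform device and GPU objects. */
struct fake_pdev {
	void *drvdata;
};

struct fake_gpu {
	struct fake_pdev *pdev;
};

/* Cleanup dereferences gpu->pdev, so it must already be assigned. */
static void fake_gpu_cleanup(struct fake_gpu *gpu)
{
	assert(gpu->pdev != NULL);
	gpu->pdev->drvdata = NULL;
}

/* `fail_early` simulates an init step failing before setup completes. */
static int fake_gpu_init(struct fake_gpu *gpu, struct fake_pdev *pdev,
			 int fail_early)
{
	gpu->pdev = pdev;	/* assigned first, before anything can fail */
	pdev->drvdata = gpu;

	if (fail_early) {
		fake_gpu_cleanup(gpu);
		return -ENODEV;
	}
	return 0;
}
```

    If the assignment came after the failing step, fake_gpu_cleanup()
    would be reached with gpu->pdev still NULL.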
    
    Signed-off-by: Konrad Dybcio <[email protected]>
    Patchwork: https://patchwork.freedesktop.org/patch/602742/
    Signed-off-by: Rob Clark <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/panthor: Don't add write fences to the shared BOs [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:01:54 2024 +0200

    drm/panthor: Don't add write fences to the shared BOs
    
    commit f9e7ac6e2e9986c2ee63224992cb5c8276e46b2a upstream.
    
    The only user (the mesa gallium driver) is already assuming explicit
    synchronization and doing the export/import dance on shared BOs. The
    only reason we were registering ourselves as writers on external BOs
    is because Xe, which was the reference back when we developed Panthor,
    was doing so. Turns out Xe was wrong, and we really want BOOKKEEP usage
    on all registered fences, so userspace can explicitly upgrade those to
    read/write when needed.
    
    Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block")
    Cc: Matthew Brost <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Don't declare a queue blocked if deferred operations are pending [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:19:14 2024 +0200

    drm/panthor: Don't declare a queue blocked if deferred operations are pending
    
    commit 7a1f30afe97294281a2ba05977688385744f9844 upstream.
    
    If deferred operations are pending, we want to wait for those to
    land before declaring the queue blocked on a SYNC_WAIT. We need
    this to deal with the case where the sync object is signalled through
    a deferred SYNC_{ADD,SET} from the same queue. If we don't do that
    and the group gets scheduled out before the deferred SYNC_{SET,ADD}
    is executed, we'll end up with a timeout, because no external
    SYNC_{SET,ADD} will make the scheduler reconsider the group for
    execution.
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup() [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Mon Sep 30 18:37:42 2024 +0200

    drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup()
    
    commit 282864cc5d3f144af0cdea1868ee2dc2c5110f0d upstream.
    
    The group variable can't be used to retrieve ptdev in our second loop,
    because it points to the previously iterated list_head, not a valid
    group. Get the ptdev object from the scheduler instead.
    
    Cc: <[email protected]>
    Fixes: d72f049087d4 ("drm/panthor: Allow driver compilation")
    Reported-by: kernel test robot <[email protected]>
    Reported-by: Julia Lawall <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]/
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Fix race when converting group handle to group object [+ + +]
Author: Steven Price <[email protected]>
Date:   Mon Sep 23 11:34:06 2024 +0100

    drm/panthor: Fix race when converting group handle to group object
    
    [ Upstream commit cac075706f298948898b1f63e81709df42afa75d ]
    
    XArray provides its own internal lock which protects the internal array
    when entries are being simultaneously added and removed. However there
    is still a race between retrieving the pointer from the XArray and
    incrementing the reference count.
    
    To avoid this race, simply hold the internal XArray lock when
    incrementing the reference count. This ensures there cannot be a racing
    call to xa_erase().
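
    The locking shape can be sketched in userspace with a fixed-size table
    and a tiny spinlock standing in for the XArray and its internal lock.
    All names below are hypothetical; the point is only that the lookup
    and the refcount increment share one lock hold with the erase path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_HANDLES 4

/* Hypothetical refcounted object; not the panthor group layout. */
struct group {
	int refcount;
};

static atomic_flag table_lock = ATOMIC_FLAG_INIT;
static struct group *table[MAX_HANDLES];

static void table_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&table_lock,
						 memory_order_acquire))
		;
}

static void table_lock_release(void)
{
	atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

static int group_store(unsigned int handle, struct group *g)
{
	assert(g != NULL);
	if (handle >= MAX_HANDLES)
		return -1;
	table_lock_acquire();
	table[handle] = g;
	table_lock_release();
	return 0;
}

/* Lookup and reference grab happen under a single lock hold. */
static struct group *group_get_by_handle(unsigned int handle)
{
	struct group *g = NULL;

	table_lock_acquire();
	if (handle < MAX_HANDLES)
		g = table[handle];
	if (g)
		g->refcount++;
	table_lock_release();
	return g;
}

/* Erase takes the same lock, so it cannot run between lookup and increment. */
static struct group *group_erase(unsigned int handle)
{
	struct group *g = NULL;

	table_lock_acquire();
	if (handle < MAX_HANDLES) {
		g = table[handle];
		table[handle] = NULL;
	}
	table_lock_release();
	return g;
}
```

    Had group_get_by_handle() released the lock before incrementing, an
    erase followed by the final put could free the object in between.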
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Signed-off-by: Steven Price <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc() [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Fri Sep 13 13:27:22 2024 +0200

    drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc()
    
    [ Upstream commit fa998a9eac8809da4f219aad49836fcad2a9bf5c ]
    
    drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
    pre-allocated BO if the <BO,VM> association exists. Given we
    only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
    be called immediately, and we have to hold the VM resv lock when
    calling this function.
    
    Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/printer: Allow NULL data in devcoredump printer [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Thu Aug 1 08:41:17 2024 -0700

    drm/printer: Allow NULL data in devcoredump printer
    
    [ Upstream commit 53369581dc0c68a5700ed51e1660f44c4b2bb524 ]
    
    We want to determine the size of the devcoredump before writing it out.
    To that end, we will run the devcoredump printer with NULL data to get
    the size, alloc data based on the generated offset, then run the
    devcoredump again with a valid data pointer to print.  This necessitates
    not writing data to the data pointer on the initial pass, when it is
    NULL.
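
    The two-pass idea can be sketched in userspace as follows. struct
    sized_printer, sp_puts() and dump_state() are hypothetical and do not
    reflect the DRM printer structures; the dumped strings are made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical printer state. */
struct sized_printer {
	char *data;	/* NULL on the sizing pass */
	size_t offset;	/* bytes produced so far */
};

/* Appends `s`; with no buffer it only advances the offset. */
static void sp_puts(struct sized_printer *p, const char *s)
{
	size_t len = strlen(s);

	if (p->data)
		memcpy(p->data + p->offset, s, len);
	p->offset += len;
}

/* The dump routine runs unchanged on both passes. */
static void dump_state(struct sized_printer *p)
{
	sp_puts(p, "engine: rcs0\n");
	sp_puts(p, "status: hung\n");
}

/* Pass 1 measures, pass 2 writes into an exactly sized buffer. */
static char *dump_to_string(size_t *len_out)
{
	struct sized_printer p = { .data = NULL, .offset = 0 };
	char *buf;

	assert(len_out != NULL);
	dump_state(&p);			/* sizing pass, writes nothing */

	buf = malloc(p.offset + 1);
	if (!buf)
		return NULL;
	*len_out = p.offset;

	p.data = buf;
	p.offset = 0;
	dump_state(&p);			/* printing pass */
	buf[p.offset] = '\0';
	return buf;
}
```

    Because both passes run the same routine, the measured size and the
    written size cannot drift apart as long as the dumped state is stable
    between them.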
    
    v5:
     - Better commit message (Jonathan)
     - Add kernel doc with examples (Jani)
    
    Cc: Maarten Lankhorst <[email protected]>
    Acked-by: Maarten Lankhorst <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/radeon/r100: Handle unknown family in r100_cp_init_microcode() [+ + +]
Author: Geert Uytterhoeven <[email protected]>
Date:   Tue Jul 30 17:58:12 2024 +0200

    drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
    
    [ Upstream commit c6dbab46324b1742b50dc2fb5c1fee2c28129439 ]
    
    With -Werror:
    
        In function ‘r100_cp_init_microcode’,
            inlined from ‘r100_cp_init’ at drivers/gpu/drm/radeon/r100.c:1136:7:
        include/linux/printk.h:465:44: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
          465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
              |                                            ^
        include/linux/printk.h:437:17: note: in definition of macro ‘printk_index_wrap’
          437 |                 _p_func(_fmt, ##__VA_ARGS__);                           \
              |                 ^~~~~~~
        include/linux/printk.h:508:9: note: in expansion of macro ‘printk’
          508 |         printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
              |         ^~~~~~
        drivers/gpu/drm/radeon/r100.c:1062:17: note: in expansion of macro ‘pr_err’
         1062 |                 pr_err("radeon_cp: Failed to load firmware \"%s\"\n", fw_name);
              |                 ^~~~~~
    
    Fix this by converting the if/else if/... construct into a proper
    switch() statement with a default to handle the error case.
    
    As a bonus, the generated code is ca. 100 bytes smaller (with gcc 11.4.0
    targeting arm32).
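
    A sketch of the switch-with-default shape, using hypothetical family
    values and firmware names rather than the radeon ones. The default
    branch returns an error before any later code can format a NULL name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical chip families, for illustration only. */
enum chip_family {
	FAMILY_A,
	FAMILY_B,
	FAMILY_C,
	FAMILY_LAST,
};

/* Selects a firmware name, or fails for a family it does not know. */
static int pick_firmware(enum chip_family family, const char **fw_name)
{
	assert(fw_name != NULL);

	switch (family) {
	case FAMILY_A:
		*fw_name = "example/a_cp.bin";
		break;
	case FAMILY_B:
	case FAMILY_C:
		*fw_name = "example/bc_cp.bin";
		break;
	default:
		/* Unknown family: bail out instead of leaving the name NULL. */
		*fw_name = NULL;
		return -EINVAL;
	}
	return 0;
}
```

    With an if/else-if chain lacking a final else, the compiler has to
    assume the name may still be NULL when it reaches the error message;
    the default case removes that path.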
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/rockchip: vop: clear DMA stop bit on RK3066 [+ + +]
Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:48 2024 -0300

    drm/rockchip: vop: clear DMA stop bit on RK3066
    
    commit 6b44aa559d6c7f4ea591ef9d2352a7250138d62a upstream.
    
    The RK3066 VOP sets a dma_stop bit when it's done scanning out a frame
    and needs the driver to acknowledge that by clearing the bit.
    
    Unless we clear it "between" frames, the RGB output only shows noise
    instead of the picture. atomic_flush is the place for it that least
    affects other code (doing it on vblank would require converting all
    other usages of the reg_lock to spin_(un)lock_irq, which would affect
    performance for everyone).
    
    This seems to be a redundant synchronization mechanism that was removed
    in later iterations of the VOP hardware block.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066 [+ + +]
Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:49 2024 -0300

    drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066
    
    commit 6ed51ba95e27221ce87979bd2ad5926033b9e1b9 upstream.
    
    The RK3066 does have RGB display output, so it should be marked as such.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/sched: Add locking to drm_sched_entity_modify_sched [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Fri Sep 13 17:05:52 2024 +0100

    drm/sched: Add locking to drm_sched_entity_modify_sched
    
    commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.
    
    Without the locking amdgpu currently can race between
    amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
    drm_sched_job_arm(), leading to the latter accessing a potentially
    inconsistent entity->sched_list and entity->num_sched_list pair.
    
    v2:
     * Improve commit message. (Philipp)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: Philipp Stanner <[email protected]>
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always increment correct scheduler score [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:09 2024 +0100

    drm/sched: Always increment correct scheduler score
    
    commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5 upstream.
    
    An entity's run queue can change during drm_sched_entity_push_job(), so
    make sure to update the score consistently.
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues")
    Cc: Nirmoy Das <[email protected]>
    Cc: Christian König <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.9+
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:08 2024 +0100

    drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job
    
    commit cbc8764e29c2318229261a679b2aafd0f9072885 upstream.
    
    Since drm_sched_entity_modify_sched() can modify the entity's run queue,
    let's make sure to only dereference the pointer once so both adding and
    waking up are guaranteed to be consistent.
    
    The alternative of moving the spin_unlock to after the wake up would for
    now be more problematic since the same lock is taken inside
    drm_sched_rq_update_fifo().
    
    v2:
     * Improve commit message. (Philipp)
     * Cache the scheduler pointer directly. (Christian)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Philipp Stanner <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Fix dynamic job-flow control race [+ + +]
Author: Rob Clark <[email protected]>
Date:   Fri Sep 13 13:23:01 2024 -0700

    drm/sched: Fix dynamic job-flow control race
    
    commit 440d52b370b03b366fd26ace36bab20552116145 upstream.
    
    Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
    
    The whole premise of lockless access to a single-producer-single-
    consumer queue is that there is just a single producer and single
    consumer.  That means we can't call drm_sched_can_queue() (which is
    about queueing more work to the hw, not to the spsc queue) from
    anywhere other than the consumer (wq).
    
    This call in the producer is just an optimization to avoid scheduling
    the consuming worker if it cannot yet queue more work to the hw.  It
    is safe to drop this optimization to avoid the race condition.
    
    Suggested-by: Asahi Lina <[email protected]>
    Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
    Closes: https://github.com/AsahiLinux/linux/issues/309
    Cc: [email protected]
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Danilo Krummrich <[email protected]>
    Tested-by: Janne Grunau <[email protected]>
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: revert "Always increment correct scheduler score" [+ + +]
Author: Christian König <[email protected]>
Date:   Mon Sep 30 15:07:49 2024 +0200

    drm/sched: revert "Always increment correct scheduler score"
    
    commit abf201f6ce14c4ceeccde5471bdf59614b83a3d8 upstream.
    
    This reverts commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5.
    
    It turned out that the original code was correct since the rq can only
    change when there is no armed job for an entity.
    
    This change here broke the logic since we only incremented the counter
    for the first job, so revert it.
    
    Signed-off-by: Christian König <[email protected]>
    Acked-by: Tvrtko Ursulin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/stm: Avoid use-after-free issues with crtc and plane [+ + +]
Author: Katya Orlova <[email protected]>
Date:   Fri Feb 16 15:50:40 2024 +0300

    drm/stm: Avoid use-after-free issues with crtc and plane
    
    [ Upstream commit 19dd9780b7ac673be95bf6fd6892a184c9db611f ]
    
    ltdc_load() calls functions drm_crtc_init_with_planes(),
    drm_universal_plane_init() and drm_encoder_init(). These functions
    should not be called with parameters allocated with devm_kzalloc()
    to avoid use-after-free issues [1].
    
    Use allocations managed by the DRM framework.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    [1]
    https://lore.kernel.org/lkml/u366i76e3qhh3ra5oxrtngjtm2u5lterkekcz6y2jkndhuxzli@diujon4h7qwb/
    
    Signed-off-by: Katya Orlova <[email protected]>
    Acked-by: Raphaël Gallais-Pou <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/stm: ltdc: reset plane transparency after plane disable [+ + +]
Author: Yannick Fertre <[email protected]>
Date:   Fri Jul 12 15:13:44 2024 +0200

    drm/stm: ltdc: reset plane transparency after plane disable
    
    [ Upstream commit 02fa62d41c8abff945bae5bfc3ddcf4721496aca ]
    
    The plane's opacity should be reset while the plane
    is disabled. This prevents a global or layer background
    color that was set earlier from being seen.
    
    Signed-off-by: Yannick Fertre <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/v3d: Prevent out of bounds access in performance query extensions [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Thu Jul 11 14:53:30 2024 +0100

    drm/v3d: Prevent out of bounds access in performance query extensions
    
    commit f32b5128d2c440368b5bf3a7a356823e235caabb upstream.
    
    Check that the number of perfmons userspace is passing in the copy and
    reset extensions is not greater than the internal kernel storage where
    the ids will be copied into.
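
    The check has roughly the following shape. MAX_PERFMONS, struct
    perf_query and copy_perfmon_ids() are hypothetical, and a plain memcpy
    stands in for the copy from userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_PERFMONS 8	/* hypothetical capacity of the kernel-side array */

struct perf_query {
	uint32_t ids[MAX_PERFMONS];
	uint32_t nperfmons;
};

/* Validates the caller-supplied count before copying into fixed storage. */
static int copy_perfmon_ids(struct perf_query *q, const uint32_t *user_ids,
			    uint32_t count)
{
	assert(q != NULL && user_ids != NULL);

	if (count > MAX_PERFMONS)
		return -EINVAL;

	memcpy(q->ids, user_ids, count * sizeof(*user_ids));
	q->nperfmons = count;
	return 0;
}
```

    Without the comparison, a count of 9 would write one element past the
    end of ids[]; with it, the request is rejected and the stored state is
    left untouched.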
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job")
    Cc: Maíra Canal <[email protected]>
    Cc: Iago Toral Quiroga <[email protected]>
    Cc: [email protected] # v6.8+
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Maíra Canal <[email protected]>
    Signed-off-by: Maíra Canal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/xe/fbdev: Limit the usage of stolen for LNL+ [+ + +]
Author: Uma Shankar <[email protected]>
Date:   Wed Jul 17 13:52:52 2024 +0530

    drm/xe/fbdev: Limit the usage of stolen for LNL+
    
    [ Upstream commit 775d0adc01a55fe0458139330415d86bb3533efe ]
    
    As per recommendation in the workarounds:
    WA_22019338487
    
    There is an issue with accessing stolen memory pages due to a
    hardware limitation. Limit the usage of stolen memory for
    fbdev on LNL+. Don't use the BIOS FB from stolen memory on LNL+;
    allocate it from system memory instead.
    
    v2: Corrected the WA Number, limited WA to LNL and
        Adopted XE_WA framework as suggested by Lucas and Matt.
    
    v3: Introduced the waxxx_display to implement display side
        of WA changes on Lunarlake. Used xe_root_mmio_gt and
        avoid the for loop (Suggested by Lucas)
    
    v4: Fixed some nits (Luca)
    
    Reviewed-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Uma Shankar <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/xe/guc_submit: add missing locking in wedged_fini [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Tue Sep 24 16:09:48 2024 +0100

    drm/xe/guc_submit: add missing locking in wedged_fini
    
    [ Upstream commit 790533e44bfc7af929842fccd9674c9f424d4627 ]
    
    Any non-wedged queue can have a zero refcount here and can be running
    concurrently with an async queue destroy, therefore dereferencing the
    queue ptr to check wedge status after the lookup can trigger UAF if
    queue is not wedged.  Fix this by keeping the submission_state lock held
    around the check to postpone the free and make the check safe, before
    dropping again around the put() to avoid the deadlock.
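
    The locking pattern described above can be modelled in user space. The
    following Python sketch is only an illustration, not the driver code:
    SubmissionState, Queue and wedged_fini are hypothetical names, and a
    threading.Lock stands in for the submission_state lock. The wedge check
    runs with the lock held, and the lock is dropped only around the put.

```python
import threading


class Queue:
    """Hypothetical stand-in for an exec queue."""

    def __init__(self, name, wedged):
        self.name = name
        self.wedged = wedged


class SubmissionState:
    """Hypothetical stand-in for the submission state and its lock."""

    def __init__(self, queues):
        self.lock = threading.Lock()
        self.queues = {q.name: q for q in queues}


def wedged_fini(state, put):
    """Call put() on every wedged queue, checking wedge status under the lock."""
    state.lock.acquire()
    try:
        for queue in list(state.queues.values()):
            # In this model, a concurrent destroy would have to take the lock,
            # so the queue cannot be freed while we inspect it here.
            if queue.wedged:
                # Drop the lock around put(), which may need the lock itself.
                state.lock.release()
                try:
                    put(queue)
                finally:
                    state.lock.acquire()
    finally:
        state.lock.release()
```

    Re-taking the lock after each put keeps every later check covered by
    the same lock, at the cost of extra lock traffic.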
    
    Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit d28af0b6b9580b9f90c265a7da0315b0ad20bbfd)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/xe/hdcp: Check GSC structure validity [+ + +]
Author: Suraj Kandpal <[email protected]>
Date:   Mon Jul 22 12:14:51 2024 +0530

    drm/xe/hdcp: Check GSC structure validity
    
    [ Upstream commit b4224f6bae3801d589f815672ec62800a1501b0d ]
    
    Sometimes xe_gsc is not initialized when checked at HDCP capability
    check. Add gsc structure check to avoid null pointer error.
    
    Signed-off-by: Suraj Kandpal <[email protected]>
    Reviewed-by: Dnyaneshwar Bhadane <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close [+ + +]
Author: José Roberto de Souza <[email protected]>
Date:   Tue Sep 24 14:37:13 2024 -0700

    drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
    
    commit 8135f1c09dd2eecee7cb637f7ec9a29e57300eb8 upstream.
    
    Mesa testing on Xe2+ revealed that when OA metrics are collected for an
    exec_queue, after the OA stream is closed, future batch buffers submitted
    on that exec_queue do not complete. Not resetting OAC_CONTEXT_ENABLE on OA
    stream close resolves these hangs and should not have any adverse effects.
    
    v2: Make the change that we don't reset the bit clearer (Ashutosh)
        Also make the same fix for OAC as OAR (Ashutosh)
    
    Bspec: 60314
    Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
    Fixes: 14e077f8006d ("drm/xe/oa: Add OAC support")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2821
    Signed-off-by: José Roberto de Souza <[email protected]>
    Signed-off-by: Ashutosh Dixit <[email protected]>
    Cc: [email protected]
    Reviewed-by: Ashutosh Dixit <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/xe/vm: move xa_alloc to prevent UAF [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Wed Sep 25 08:14:27 2024 +0100

    drm/xe/vm: move xa_alloc to prevent UAF
    
    [ Upstream commit 74231870cf4976f69e83aa24f48edb16619f652f ]
    
    Evil user can guess the next id of the vm before the ioctl completes and
    then call vm destroy ioctl to trigger UAF since create ioctl is still
    referencing the same vm. Move the xa_alloc all the way to the end to
    prevent this.
    
    v2:
     - Rebase
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Nirmoy Das <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit dcfd3971327f3ee92765154baebbaece833d3ca9)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/xe/vram: fix ccs offset calculation [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Mon Sep 16 09:49:12 2024 +0100

    drm/xe/vram: fix ccs offset calculation
    
    commit ee06c09ded3c2f722be4e240ed06287e23596bda upstream.
    
    Spec says SW is expected to round up to the nearest 128K, if not already
    aligned for the CC unit view of CCS. We are seeing the assert sometimes
    pop on BMG to tell us that there is a hole between GSM and CCS, as well
    as popping other asserts about a vram size with strange alignment,
    which is likely caused by the misaligned offset here.
    
    v2 (Shuicheng):
     - Do the round_up() on final SW address.
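
    The rounding step can be illustrated with a few lines of Python. This
    is a sketch of the arithmetic only; the sample offset is made up for
    the example and does not come from real hardware.

```python
SZ_128K = 128 * 1024


def round_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# Hypothetical, misaligned offset used only for illustration.
offset = 0x7F8A1000
aligned = round_up(offset, SZ_128K)
print(hex(aligned))  # 0x7f8c0000
```

    An already aligned value is returned unchanged, which matches the
    "if not already aligned" wording above.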
    
    BSpec: 68023
    Fixes: b5c2ca0372dc ("drm/xe/xe2hpg: Determine flat ccs offset for vram")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Himal Prasad Ghimiray <[email protected]>
    Cc: Akshata Jahagirdar <[email protected]>
    Cc: Lucas De Marchi <[email protected]>
    Cc: Shuicheng Lin <[email protected]>
    Cc: Matt Roper <[email protected]>
    Cc: [email protected] # v6.10+
    Reviewed-by: Himal Prasad Ghimiray <[email protected]>
    Tested-by: Shuicheng Lin <[email protected]>
    Reviewed-by: Lucas De Marchi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Lucas De Marchi <[email protected]>
    (cherry picked from commit 37173392741c425191b959acb3adf70c9a4610c0)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/xe: Add timeout to preempt fences [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Tue Jun 25 17:41:37 2024 -0700

    drm/xe: Add timeout to preempt fences
    
    [ Upstream commit 627c961d672d3304564455ba471f5e4405170eec ]
    
    To adhere to dma fencing rules that fences must signal within a
    reasonable amount of time, add a 5 second timeout to preempt fences. If
    this timeout occurs, kill the associated VM as this is fatal to the VM.
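
    A bounded wait of this kind can be sketched in user-space Python. The
    names below (Vm, wait_for_preempt) are hypothetical and a
    threading.Event stands in for the preemption completing; only the
    5 second value comes from the text above.

```python
import threading

PREEMPT_TIMEOUT_S = 5.0  # timeout named in the commit message


class Vm:
    """Hypothetical VM with a kill switch."""

    def __init__(self):
        self.killed = False

    def kill(self):
        self.killed = True


def wait_for_preempt(done, vm, timeout_s=PREEMPT_TIMEOUT_S):
    """Wait for the preempt to finish; kill the VM if it takes too long."""
    if done.wait(timeout_s):
        return True
    # The fence must still signal in bounded time, so give up on the VM.
    vm.kill()
    return False
```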
    
    v2:
     - Add comment for smp_wmb (Checkpatch)
     - Fix kernel doc typo (Inspection)
     - Add comment for killed check (Niranjana)
    v3:
     - Drop smp_wmb (Matthew Auld)
     - Don't take vm->lock in preempt fence worker (Matthew Auld)
     - Drop RB given changes to patch
    v4:
     - Add WRITE/READ_ONCE (Niranjana)
     - Don't export xe_vm_kill (Niranjana)
    
    Cc: Matthew Auld <[email protected]>
    Cc: Niranjana Vishwanathapura <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Tested-by: Stuart Summers <[email protected]>
    Reviewed-by: Niranjana Vishwanathapura <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Clean up VM / exec queue file lock usage. [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Fri Sep 20 18:17:12 2024 -0700

    drm/xe: Clean up VM / exec queue file lock usage.
    
    [ Upstream commit 9e3c85ddea7a473ed57b6cdfef2dfd468356fc91 ]
    
    Both the VM and exec queue file locks protect the lookup and reference
    to the object, nothing more. These locks are not intended to have
    anything else done underneath them. XAs have their own locking too, so
    there is no need to take the VM / exec queue file lock aside from when
    doing a lookup and reference get.
    
    Add some kernel doc to make this clear and clean up a few typos too.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Matthew Auld <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit fe4f5d4b661666a45b48fe7f95443f8fefc09c8c)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Stable-dep-of: 74231870cf49 ("drm/xe/vm: move xa_alloc to prevent UAF")
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Tue Aug 20 10:29:55 2024 -0700

    drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini
    
    [ Upstream commit a323782567812ee925e9b7926445532c7afe331b ]
    
    Not a big deal if CT is down as driver is unloading, no need to warn.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jagmeet Randhawa <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Fix memory leak on xe_alloc_pf_queue failure [+ + +]
Author: Nirmoy Das <[email protected]>
Date:   Mon Aug 26 18:20:35 2024 +0200

    drm/xe: Fix memory leak on xe_alloc_pf_queue failure
    
    [ Upstream commit c5f728de696caa35481fd84202dfbc9fecc18e0b ]
    
    Simplify memory unwinding on error, which also fixes the current
    memory leak that can happen on error.
    
    v2: use devm_kcalloc (Matt A)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Cc: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Auld <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Nirmoy Das <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: fix UAF around queue destruction [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Mon Sep 23 15:56:48 2024 +0100

    drm/xe: fix UAF around queue destruction
    
    commit 2d2be279f1ca9e7288282d4214f16eea8a727cdb upstream.
    
    We currently do stuff like queuing the final destruction step on a
    random system wq, which will outlive the driver instance. With bad
    timing we can tear down the driver with one or more work items still
    alive, leading to various UAF splats. Add a fini step to ensure
    user queues are properly torn down. At this point GuC should already be
    nuked so queue itself should no longer be referenced from hw pov.
    
    v2 (Matt B)
     - Looks much safer to use a waitqueue and then just wait for the
       xa_array to become empty before triggering the drain.
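
    The "wait until the table is empty" approach from v2 can be modelled
    with a condition variable. This Python sketch is an analogy, not the
    driver's implementation: QueueRegistry and its methods are
    hypothetical names, and a set stands in for the xarray.

```python
import threading


class QueueRegistry:
    """Toy registry of live queues with a way to wait for it to drain."""

    def __init__(self):
        self._cond = threading.Condition()
        self._live = set()

    def add(self, queue_id):
        with self._cond:
            self._live.add(queue_id)

    def remove(self, queue_id):
        with self._cond:
            self._live.discard(queue_id)
            if not self._live:
                # Wake anyone blocked in wait_empty().
                self._cond.notify_all()

    def wait_empty(self, timeout=None):
        """Block until no queues remain; returns False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: not self._live, timeout)
```

    Teardown code would call wait_empty() before releasing anything the
    pending destroy work still depends on.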
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2317
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe: fixup xe_alloc_pf_queue [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Wed Aug 21 18:19:18 2024 +0100

    drm/xe: fixup xe_alloc_pf_queue
    
    [ Upstream commit 321d6b4b9cbe3dd0bc99937d5e5b4d730b5b5798 ]
    
    kzalloc expects number of bytes, therefore we should convert the number
    of dw into bytes, otherwise we are likely just accessing beyond the
    array causing all kinds of carnage. Also fixup the error handling while
    we are here.
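
    The unit mismatch is easy to see with a small calculation. The Python
    sketch below only models the arithmetic; pf_queue_bytes and
    alloc_dwords are hypothetical helpers, with ctypes used to get the
    size of a 32-bit dword.

```python
import ctypes

DW_BYTES = ctypes.sizeof(ctypes.c_uint32)  # one dword is 4 bytes


def pf_queue_bytes(num_dw):
    """Bytes needed for a queue that holds num_dw dwords."""
    return num_dw * DW_BYTES


def alloc_dwords(num_dw):
    """Count-based, zero-initialised allocation, similar in spirit to kcalloc."""
    return (ctypes.c_uint32 * num_dw)()


num_dw = 256
# Passing num_dw where a byte count is expected would give a quarter of the space.
print(num_dw, pf_queue_bytes(num_dw))  # 256 1024
```

    Expressing the allocation as "count times element size" keeps the
    unit conversion in one place, which is the appeal of the kcalloc form
    mentioned in v2.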
    
    v2:
     - Prefer kcalloc (dim)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Generate oob before compiling anything [+ + +]
Author: Lucas De Marchi <[email protected]>
Date:   Mon Jul 8 14:29:06 2024 -0700

    drm/xe: Generate oob before compiling anything
    
    commit ea74bf9ccba9ae80fc0766c07c4abaef927e9e63 upstream.
    
    Instead of continuing to add more dependencies as WAs are needed in different
    places of the driver, just add a rule with all the objects so the code
    generation happens before anything else.
    
    While at it, group lines related to wa_oob in the Makefile.
    
    v2: Prefix $(obj) when declaring dependency
    
    Reviewed-by: Rodrigo Vivi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe: Name and document Wa_14019789679 [+ + +]
Author: Matt Roper <[email protected]>
Date:   Mon Aug 12 11:10:43 2024 -0700

    drm/xe: Name and document Wa_14019789679
    
    [ Upstream commit 1d734a3e5d6bb266f52eaf2b1400c5d3f1875a54 ]
    
    Early in the development of Xe we identified an issue with SVG state
    handling on DG2 and MTL (and later on Xe2 as well).  In
    commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load
    on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state
    handling") we implemented our own workaround to prevent SVG state from
    leaking from context A to context B in cases where context B never
    issues a specific state setting.
    
    The hardware teams have now created official workaround Wa_14019789679
    to cover this issue.  The workaround description only requires emitting
    3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
    that would potentially remain unset by a context B, but still cause
    notable issues if unwanted values were inherited from context A.
    However since we already have a more extensive implementation that emits
    the entire SVG state and prevents _any_ SVG state from unintentionally
    leaking, we'll stick with our existing implementation just to be safe.
    
    Signed-off-by: Matt Roper <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Prevent null pointer access in xe_migrate_copy [+ + +]
Author: Zhanjun Dong <[email protected]>
Date:   Fri Sep 27 09:13:08 2024 -0700

    drm/xe: Prevent null pointer access in xe_migrate_copy
    
    [ Upstream commit 7257d9c9a3c6cfe26c428e9b7ae21d61f2f55a79 ]
    
    xe_migrate_copy is designed to copy the content of TTM resources. When
    the source resource is NULL, it triggers a NULL pointer dereference in
    xe_migrate_copy. To avoid this situation, set the lacks source flag to
    true for this case; the flag triggers xe_migrate_clear rather than
    xe_migrate_copy.
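
    The control flow of the fix can be sketched as follows. This Python
    model is illustrative only: move_buffer is a hypothetical function,
    and byte strings stand in for TTM resources. A missing source selects
    the clear path instead of the copy path.

```python
def move_buffer(src, size):
    """Return the destination contents for a move of `size` bytes."""
    lacks_source = src is None
    if lacks_source:
        # Clear path: nothing to read, so produce zeroed contents.
        return bytes(size)
    # Copy path: only reached when a source actually exists.
    return bytes(src[:size])
```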
    
    Issue trace:
    <7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14,
     sizes: 4194304 & 4194304
    <7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15,
     sizes: 4194304 & 4194304
    <1> [317.128055] BUG: kernel NULL pointer dereference, address:
     0000000000000010
    <1> [317.128064] #PF: supervisor read access in kernel mode
    <1> [317.128066] #PF: error_code(0x0000) - not-present page
    <6> [317.128069] PGD 0 P4D 0
    <4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
    <4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted:
     G     U           N 6.11.0-rc7-xe #1
    <4> [317.128078] Tainted: [U]=USER, [N]=TEST
    <4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client
     Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024
    <4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8
     fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31
     ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff
    <4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246
    <4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX:
     0000000000000000
    <4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI:
     0000000000000000
    <4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09:
     0000000000000001
    <4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12:
     ffff88814e7b1f08
    <4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15:
     0000000000000001
    <4> [317.128174] FS:  0000000000000000(0000) GS:ffff88846f280000(0000)
     knlGS:0000000000000000
    <4> [317.128176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    <4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4:
     0000000000770ef0
    <4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
     0000000000000000
    <4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
     0000000000000400
    <4> [317.128184] PKRU: 55555554
    <4> [317.128185] Call Trace:
    <4> [317.128187]  <TASK>
    <4> [317.128189]  ? show_regs+0x67/0x70
    <4> [317.128194]  ? __die_body+0x20/0x70
    <4> [317.128196]  ? __die+0x2b/0x40
    <4> [317.128198]  ? page_fault_oops+0x15f/0x4e0
    <4> [317.128203]  ? do_user_addr_fault+0x3fb/0x970
    <4> [317.128205]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128209]  ? exc_page_fault+0x87/0x2b0
    <4> [317.128212]  ? asm_exc_page_fault+0x27/0x30
    <4> [317.128216]  ? xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128263]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128265]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128267]  ? sg_free_append_table+0x20/0x80
    <4> [317.128271]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128273]  ? mark_held_locks+0x4d/0x80
    <4> [317.128275]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128278]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128281]  ? __pm_runtime_resume+0x60/0xa0
    <4> [317.128284]  xe_bo_move+0x682/0xc50 [xe]
    <4> [317.128315]  ? lock_is_held_type+0xaa/0x120
    <4> [317.128318]  ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm]
    <4> [317.128324]  ttm_bo_validate+0xd1/0x1a0 [ttm]
    <4> [317.128328]  shrink_test_run_device+0x721/0xc10 [xe]
    <4> [317.128360]  ? find_held_lock+0x31/0x90
    <4> [317.128363]  ? lock_release+0xd1/0x2a0
    <4> [317.128365]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
     [kunit]
    <4> [317.128370]  xe_bo_shrink_kunit+0x11/0x20 [xe]
    <4> [317.128397]  kunit_try_run_case+0x6e/0x150 [kunit]
    <4> [317.128400]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128402]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128404]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
    <4> [317.128407]  kthread+0xf5/0x130
    <4> [317.128410]  ? __pfx_kthread+0x10/0x10
    <4> [317.128412]  ret_from_fork+0x39/0x60
    <4> [317.128415]  ? __pfx_kthread+0x10/0x10
    <4> [317.128416]  ret_from_fork_asm+0x1a/0x30
    <4> [317.128420]  </TASK>
    
    Fixes: 266c85885263 ("drm/xe/xe2: Handle flat ccs move for igfx.")
    Signed-off-by: Zhanjun Dong <[email protected]>
    Reviewed-by: Thomas Hellström <[email protected]>
    Signed-off-by: Matt Roper <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 59a1c9c7e1d02b43b415ea92627ce095b7c79e47)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Restore pci state upon resume [+ + +]
Author: Rodrigo Vivi <[email protected]>
Date:   Thu Sep 12 17:45:07 2024 -0400

    drm/xe: Restore pci state upon resume
    
    [ Upstream commit cffa8e83df9fe525afad1e1099097413f9174f57 ]
    
    The pci state was saved, but not restored. Restore
    right after the power state transition request like
    every other driver.
    
    v2: Use right fixes tag, since this was there initially, but
        accidentally removed.
    
    Fixes: f6761c68c0ac ("drm/xe/display: Improve s2idle handling.")
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Lucas De Marchi <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Signed-off-by: Rodrigo Vivi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Maarten Lankhorst <[email protected]>
    (cherry picked from commit ec2d1539e159f53eae708e194c449cfefa004994)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Resume TDR after GT reset [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Wed Jul 24 16:59:19 2024 -0700

    drm/xe: Resume TDR after GT reset
    
    [ Upstream commit 1b30f87e088b499eb74298db256da5c98e8276e2 ]
    
    Not starting the TDR after GT reset on exec queues which have been
    restarted can lead to jobs being able to run forever. Fix this by
    restarting the TDR.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 8ec5a4e5ce97d6ee9f5eb5b4ce4cfc831976fdec)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Use topology to determine page fault queue size [+ + +]
Author: Stuart Summers <[email protected]>
Date:   Sat Aug 17 02:47:31 2024 +0000

    drm/xe: Use topology to determine page fault queue size
    
    [ Upstream commit 3338e4f90c143cf32f77d64f464cb7f2c2d24700 ]
    
    Currently the page fault queue size is hard coded. However
    the hardware supports faulting for each EU and each CS.
    For some applications running on hardware with a large
    number of EUs and CSs, this can result in an overflow of
    the page fault queue.
    
    Add a small calculation to determine the page fault queue
    size based on the number of EUs and CSs in the platform as
    determined by fuses.
    
    Signed-off-by: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
    Signed-off-by: Sasha Levin <[email protected]>

 
drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS [+ + +]
Author: Thomas Zimmermann <[email protected]>
Date:   Mon Sep 23 09:58:14 2024 +0200

    drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS
    
    commit 8b0d2f61545545ab5eef923ed6e59fc3be2385e0 upstream.
    
    FB_DAMAGE_CLIPS is a plane property for damage handling. Its UAPI
    should only use UAPI types. Hence replace struct drm_rect with
    struct drm_mode_rect in drm_atomic_plane_set_property(). Both types
    are identical in practice, so there's no change in behavior.
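
    The claim that both types are identical in practice can be checked
    with a layout comparison. The Python sketch below assumes the field
    lists shown (four signed 32-bit coordinates x1, y1, x2, y2 in each
    struct); the class names are local stand-ins, not kernel headers.

```python
import ctypes


class DrmRect(ctypes.Structure):
    """Assumed layout of the kernel-internal rectangle (plain ints)."""

    _fields_ = [("x1", ctypes.c_int), ("y1", ctypes.c_int),
                ("x2", ctypes.c_int), ("y2", ctypes.c_int)]


class DrmModeRect(ctypes.Structure):
    """Assumed layout of the UAPI rectangle (explicit 32-bit ints)."""

    _fields_ = [("x1", ctypes.c_int32), ("y1", ctypes.c_int32),
                ("x2", ctypes.c_int32), ("y2", ctypes.c_int32)]


def layout(struct):
    """List (name, offset, size) for every field of a ctypes structure."""
    return [(name, getattr(struct, name).offset, getattr(struct, name).size)
            for name, _ in struct._fields_]


same = layout(DrmRect) == layout(DrmModeRect)
```

    The comparison holds wherever int is 32 bits wide, which is why the
    change is about using the proper UAPI type rather than about
    behavior.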
    
    Reported-by: Ville Syrjälä <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/[email protected]/
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: d3b21767821e ("drm: Add a new plane property to send damage during plane update")
    Cc: Lukasz Spintzyk <[email protected]>
    Cc: Deepak Rawat <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Thomas Hellstrom <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.0+
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm: omapdrm: Add missing check for alloc_ordered_workqueue [+ + +]
Author: Ma Ke <[email protected]>
Date:   Thu Aug 8 14:13:36 2024 +0800

    drm: omapdrm: Add missing check for alloc_ordered_workqueue
    
    commit e794b7b9b92977365c693760a259f8eef940c536 upstream.
    
    alloc_ordered_workqueue may return a NULL pointer, which would cause a
    NULL pointer dereference. Add a check for its return value.
    
    Cc: [email protected]
    Fixes: 2f95bc6d324a ("drm: omapdrm: Perform initialization/cleanup at probe/remove time")
    Signed-off-by: Ma Ke <[email protected]>
    Signed-off-by: Tomi Valkeinen <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
dt-bindings: clock: exynos7885: Fix duplicated binding [+ + +]
Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:44 2024 +0200

    dt-bindings: clock: exynos7885: Fix duplicated binding
    
    commit abf3a3ea9acb5c886c8729191a670744ecd42024 upstream.
    
    The numbering in Exynos7885's FSYS CMU bindings has 4 duplicated by
    accident, with the rest of the bindings continuing with 5.
    
    Fix this by moving CLK_MOUT_FSYS_USB30DRD_USER to the end as 11.
    
    Since CLK_MOUT_FSYS_USB30DRD_USER is not used in any device tree as of
    now, and there are no other clocks affected (maybe apart from
    CLK_MOUT_FSYS_MMC_SDIO_USER which the number was shared with, also not
    used in a device tree), this is the least impactful way to solve this
    problem.
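
    A duplicated binding number of this kind can be found mechanically.
    The Python sketch below is a hypothetical checker, not part of the
    kernel tooling; it uses only the two clock names and the numbers 4
    and 11 mentioned above.

```python
from collections import defaultdict


def find_duplicate_ids(bindings):
    """Map each number used more than once to the names that share it."""
    by_id = defaultdict(list)
    for name, number in bindings.items():
        by_id[number].append(name)
    return {num: names for num, names in by_id.items() if len(names) > 1}


before = {
    "CLK_MOUT_FSYS_MMC_SDIO_USER": 4,
    "CLK_MOUT_FSYS_USB30DRD_USER": 4,  # accidental duplicate
}
# The fix moves the USB30DRD mux to the end of the list as 11.
after = dict(before, CLK_MOUT_FSYS_USB30DRD_USER=11)

print(find_duplicate_ids(before))
print(find_duplicate_ids(after))  # {}
```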
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:02 2024 +0530

    dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x
    
    commit 648b4bde0aca2980ebc0b90cdfbb80d222370c3d upstream.
    
    Add the missing GPLL9 which is required for the gcc sdcc2 clock.
    
    Fixes: 0fadcdfdcf57 ("dt-bindings: clock: Add SC8180x GCC binding")
    Cc: [email protected]
    Acked-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems [+ + +]
Author: Ravikanth Tuniki <[email protected]>
Date:   Tue Oct 1 00:43:35 2024 +0530

    dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems
    
    [ Upstream commit c6929644c1e0d6108e57061d427eb966e1746351 ]
    
    Add missing reg minItems as based on current binding document
    only ethernet MAC IO space is a supported configuration.
    
    There is a bug in the schema: current examples contain 64-bit
    addressing as well as 32-bit addressing. The schema validation
    does pass incidentally considering one 64-bit reg address as
    two 32-bit reg address entries. If we change axi_ethernet_eth1
    example node reg addressing to 32-bit schema validation reports:
    
    Documentation/devicetree/bindings/net/xlnx,axi-ethernet.example.dtb:
    ethernet@40000000: reg: [[1073741824, 262144]] is too short
    
    To fix it add missing reg minItems constraints and to make things clearer
    stick to 32-bit addressing in examples.
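
    The difference between the two addressing widths shows up in how a
    reg entry is flattened into 32-bit cells. The Python sketch below is
    an illustration under assumptions: reg_cells is a hypothetical
    helper, and the address and size are the ones decoded from the
    validation message above.

```python
def reg_cells(address, size, address_cells, size_cells):
    """Flatten one (address, size) reg entry into big-endian 32-bit cells."""

    def split(value, cells):
        return [(value >> (32 * i)) & 0xFFFFFFFF
                for i in reversed(range(cells))]

    return split(address, address_cells) + split(size, size_cells)


# 32-bit addressing: one entry is two cells.
narrow = reg_cells(0x40000000, 0x40000, 1, 1)
# 64-bit addressing: the same entry is four cells.
wide = reg_cells(0x40000000, 0x40000, 2, 2)

print(narrow)  # [1073741824, 262144]
print(wide)    # [0, 1073741824, 0, 262144]
```

    The four-cell form is what could be mistaken for two 32-bit entries,
    which is how the examples passed validation by accident.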
    
    Fixes: cbb1ca6d5f9a ("dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yaml")
    Signed-off-by: Ravikanth Tuniki <[email protected]>
    Signed-off-by: Radhey Shyam Pandey <[email protected]>
    Acked-by: Conor Dooley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
e1000e: avoid failing the system during pm_suspend [+ + +]
Author: Vitaly Lifshits <[email protected]>
Date:   Tue Aug 6 16:23:48 2024 +0300

    e1000e: avoid failing the system during pm_suspend
    
    [ Upstream commit 0a6ad4d9e1690c7faa3a53f762c877e477093657 ]
    
    Occasionally when the system goes into pm_suspend, the suspend might fail
    due to a PHY access error on the network adapter. Previously, this would
    have caused the whole system to fail to go to a low power state.
    An example of this was reported in the following Bugzilla:
    https://bugzilla.kernel.org/show_bug.cgi?id=205015
    
    [ 1663.694828] e1000e 0000:00:19.0 eth0: Failed to disable ULP
    [ 1664.731040] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
    [ 1665.093513] e1000e 0000:00:19.0 eth0: Hardware Error
    [ 1665.596760] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2975399 usecs
    
    and then the system never recovers from it, and all subsequent suspend attempts fail because of this:
    [22909.393854] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x760 [e1000e] returns -2
    [22909.393858] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -2
    [22909.393861] PM: Device 0000:00:1f.6 failed to suspend async: error -2
    
    This can be avoided by changing the return values of __e1000_shutdown and
    e1000e_pm_suspend functions so that they always return 0 (success). This
    is consistent with what other drivers do.
    
    If the e1000e driver encounters a hardware error during suspend, potential
    side effects include slightly higher power draw or non-working wake on
    LAN. This is preferred to a system-level suspend failure, and a warning
    message is written to the system log, so that the user can be aware that
    the LAN controller experienced a problem during suspend.
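
    The resulting behaviour, log the problem but report success, can be
    sketched in Python. This is a model of the idea only: pm_suspend and
    the shutdown callable are hypothetical, and the logger stands in for
    the kernel log.

```python
import logging

log = logging.getLogger("e1000e-sketch")


def pm_suspend(shutdown):
    """Run the device shutdown step but never fail the system suspend."""
    rc = shutdown()
    if rc:
        # Record the hardware problem so the user can see it afterwards.
        log.warning("hardware error during suspend: %d", rc)
    # Always report success so one adapter cannot block system suspend.
    return 0
```

    The trade-off is the one stated above: a possible loss of wake on
    LAN or slightly higher power draw, in exchange for a suspend that
    completes.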
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=205015
    Suggested-by: Dima Ruinskiy <[email protected]>
    Signed-off-by: Vitaly Lifshits <[email protected]>
    Tested-by: Mor Bar-Gabay <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
EINJ, CXL: Fix CXL device SBDF calculation [+ + +]
Author: Ben Cheatham <[email protected]>
Date:   Fri Sep 27 11:34:28 2024 -0500

    EINJ, CXL: Fix CXL device SBDF calculation
    
    [ Upstream commit ee1e3c46ed19c096be22472c728fa7f68b1352c4 ]
    
    The SBDF of the target CXL 2.0 compliant root port is required to inject a CXL
    protocol error as per ACPI 6.5. The SBDF given has to be in the
    following format:
    
    31     24 23    16 15    11 10      8  7        0
    +-------------------------------------------------+
    | segment |   bus  | device | function | reserved |
    +-------------------------------------------------+
    
    The SBDF calculated in cxl_dport_get_sbdf() doesn't account for
    the reserved bits currently, causing the wrong SBDF to be used.
    Fix said calculation to properly shift the SBDF.
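
    The bit layout in the diagram can be expressed directly. The Python
    sketch below packs the fields according to that diagram; pack_sbdf is
    a hypothetical helper name and the sample values are made up.

```python
def pack_sbdf(segment, bus, device, function):
    """Pack an SBDF value with the low 8 bits left as reserved zeros."""
    return (((segment & 0xFF) << 24)    # bits 31..24
            | ((bus & 0xFF) << 16)      # bits 23..16
            | ((device & 0x1F) << 11)   # bits 15..11
            | ((function & 0x7) << 8))  # bits 10..8


value = pack_sbdf(0, 0x12, 0x03, 0x1)
print(hex(value))  # 0x121900
```

    Leaving out the final 8-bit shift would place the function and device
    numbers in the reserved field, which is the kind of mismatch the fix
    addresses.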
    
    Without this fix, error injection into CXL 2.0 root ports through the
    CXL debugfs interface (<debugfs>/cxl) is broken. Injection
    through the legacy interface (<debugfs>/apei/einj/) will still work
    because the SBDF is manually provided by the user.
    
    Fixes: 12fb28ea6b1cf ("EINJ: Add CXL error type support")
    Signed-off-by: Ben Cheatham <[email protected]>
    Reviewed-by: Dan Williams <[email protected]>
    Tested-by: Srinivasulu Thanneeru <[email protected]>
    Reviewed-by: Srinivasulu Thanneeru <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
exec: don't WARN for racy path_noexec check [+ + +]
Author: Mateusz Guzik <[email protected]>
Date:   Mon Aug 5 15:17:21 2024 +0200

    exec: don't WARN for racy path_noexec check
    
    [ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
    
    Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
    of the previous implementation. They used to legitimately check for the
    condition, but that got moved up in two commits:
    633fb6ac3980 ("exec: move S_ISREG() check earlier")
    0fd338b2d2cd ("exec: move path_noexec() check earlier")
    
    Instead of being removed, said checks were wrapped in WARN_ON, which
    has some debug value.
    
    However, the spurious path_noexec check is racy, resulting in
    unwarranted warnings should someone race with setting the noexec flag.
    
    One can note there is more to perm-checking whether execve is allowed
    and none of the conditions are guaranteed to still hold after they were
    tested for.
    
    Additionally this does not validate whether the code path did any perm
    checking to begin with -- it will pass if the inode happens to be
    regular.
    
    Keep the redundant path_noexec() check, even though it's mindless
    nonsense checking for a guarantee that isn't given, and drop the WARN.
    
    Reword the commentary and do small tidy ups while here.
    
    Signed-off-by: Mateusz Guzik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [brauner: keep redundant path_noexec() check]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
exfat: fix memory leak in exfat_load_bitmap() [+ + +]
Author: Yuezhang Mo <[email protected]>
Date:   Tue Sep 3 15:01:09 2024 +0800

    exfat: fix memory leak in exfat_load_bitmap()
    
    commit d2b537b3e533f28e0d97293fe9293161fe8cd137 upstream.
    
    If the first directory entry in the root directory is not a bitmap
    directory entry, 'bh' will not be released and reassigned, which
    will cause a memory leak.
    
    Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
    Cc: [email protected]
    Signed-off-by: Yuezhang Mo <[email protected]>
    Reviewed-by: Aoyama Wataru <[email protected]>
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ext4: aovid use-after-free in ext4_ext_insert_extent() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:26 2024 +0800

    ext4: aovid use-after-free in ext4_ext_insert_extent()
    
    commit a164f3a432aae62ca23d03e6d926b122ee5b860d upstream.
    
    As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is
    reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and
    cause UAF. Below is a sample trace with dummy values:
    
    ext4_ext_insert_extent
      path = *ppath = 2000
      ext4_ext_create_new_leaf(ppath)
        ext4_find_extent(ppath)
          path = *ppath = 2000
          if (depth > path[0].p_maxdepth)
                kfree(path = 2000);
                *ppath = path = NULL;
          path = kcalloc() = 3000
          *ppath = 3000;
          return path;
      /* here path is still 2000, UAF! */
      eh = path[depth].p_hdr
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330
    Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179
    CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866
    Call Trace:
     <TASK>
     ext4_ext_insert_extent+0x26d4/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
    [...]
    
    Allocated by task 179:
     ext4_find_extent+0x81c/0x1f70
     ext4_ext_map_blocks+0x146/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    
    Freed by task 179:
     kfree+0xcb/0x240
     ext4_find_extent+0x7c0/0x1f70
     ext4_ext_insert_extent+0xa26/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    ==================================================================
    
    So use *ppath to update the path to avoid the above problem.
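
    The pattern behind this fix can be sketched in userspace C. The
    types and function names below (path_sketch, regrow_path,
    insert_sketch) are hypothetical stand-ins, not ext4 code; the point
    is only that a local copy of *ppath must be reloaded after any call
    that may reallocate it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ext4_ext_path; only depth matters. */
struct path_sketch {
	int depth;
};

/*
 * Models a callee that may replace the caller's buffer: the old
 * allocation is freed and *ppath is pointed at a new one.
 */
static int regrow_path(struct path_sketch **ppath, int new_depth)
{
	struct path_sketch *fresh = calloc(1, sizeof(*fresh));

	if (!fresh)
		return -1;
	fresh->depth = new_depth;
	free(*ppath);	/* any saved copy of the old pointer is now stale */
	*ppath = fresh;
	return 0;
}

/* Returns the depth seen after regrowing, or -1 on allocation failure. */
static int insert_sketch(struct path_sketch **ppath)
{
	struct path_sketch *path;
	int new_depth;

	assert(ppath && *ppath);
	path = *ppath;
	new_depth = path->depth + 1;

	if (regrow_path(ppath, new_depth) < 0)
		return -1;
	path = *ppath;	/* reload: the callee may have reallocated */
	return path->depth;
}
```

    Without the reload, path would still hold the freed address, which
    is the "path is still 2000" step in the trace above.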
    
    Reported-by: Ojaswin Mujoo <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: avoid use-after-free in ext4_ext_show_leaf() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:24 2024 +0800

    ext4: avoid use-after-free in ext4_ext_show_leaf()
    
    [ Upstream commit 4e2524ba2ca5f54bdbb9e5153bea00421ef653f5 ]
    
    In ext4_find_extent(), path may be freed on error or be reallocated, so
    a previously saved *ppath may already have been freed, and using it may
    trigger a use-after-free, as follows:
    
    ext4_split_extent
      path = *ppath;
      ext4_split_extent_at(ppath)
      path = ext4_find_extent(ppath)
      ext4_split_extent_at(ppath)
        // ext4_find_extent fails to free path
        // but zeroout succeeds
      ext4_ext_show_leaf(inode, path)
        eh = path[depth].p_hdr
        // path use-after-free !!!
    
    Similar to ext4_split_extent_at(), we use *ppath directly as an input to
    ext4_ext_show_leaf(). Fix a spelling error by the way.
    
    Same problem in ext4_ext_handle_unwritten_extents(). Since 'path' is only
    used in ext4_ext_show_leaf(), remove 'path' and use *ppath directly.
    
    This issue is triggered only when EXT_DEBUG is defined and therefore does
    not affect functionality.
    
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: correct encrypted dentry name hash when not casefolded [+ + +]
Author: yao.ly <[email protected]>
Date:   Mon Jul 1 14:43:39 2024 +0800

    ext4: correct encrypted dentry name hash when not casefolded
    
    commit 70dd7b573afeba9b8f8a33f2ae1e4a9a2ec8c1ec upstream.
    
    EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH access the struct
    ext4_dir_entry_hash that follows ext4_dir_entry. But no
    ext4_dir_entry_hash follows it when the inode is encrypted and not
    casefolded.
    
    Signed-off-by: yao.ly <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: dax: fix overflowing extents beyond inode size when partially writing [+ + +]
Author: Zhihao Cheng <[email protected]>
Date:   Fri Aug 9 20:15:32 2024 +0800

    ext4: dax: fix overflowing extents beyond inode size when partially writing
    
    commit dda898d7ffe85931f9cca6d702a51f33717c501e upstream.
    
    dax_iomap_rw() does two things in each iteration: it maps written blocks
    and copies user data to those blocks. If the process is killed by the
    user (see the signal handling in dax_iomap_iter()), the amount of data
    already copied is returned and added to the inode size, which means that
    the length of the written extents may exceed the inode size, and fsck
    will then fail. An example:
    
    dd if=/dev/urandom of=file bs=4M count=1
     dax_iomap_rw
      iomap_iter // round 1
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 0~2M extents(written flag)
      dax_iomap_iter // copy 2M data
      iomap_iter // round 2
       iomap_iter_advance
        iter->pos += iter->processed // iter->pos = 2M
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 2~4M extents(written flag)
      dax_iomap_iter
       fatal_signal_pending
      done = iter->pos - iocb->ki_pos // done = 2M
     ext4_handle_inode_extension
      ext4_update_inode_size // inode size = 2M
    
    fsck reports: Inode 13, i_size is 2097152, should be 4194304.  Fix?
    
    Fix the problem by truncating extents if the written length is smaller
    than expected.
    
    Fixes: 776722e85d3b ("ext4: DAX iomap write support")
    CC: [email protected]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219136
    Signed-off-by: Zhihao Cheng <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhihao Cheng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:27 2024 +0800

    ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free
    
    commit 5c0f4cc84d3a601c99bc5e6e6eb1cbda542cce95 upstream.
    
    When calling ext4_force_split_extent_at() in ext4_ext_replay_update_ex(),
    the 'ppath' is updated but it is the 'path' that is freed, thus potentially
    triggering a double-free in the following process:
    
    ext4_ext_replay_update_ex
      ppath = path
      ext4_force_split_extent_at(&ppath)
        ext4_split_extent_at
          ext4_ext_insert_extent
            ext4_ext_create_new_leaf
              ext4_ext_grow_indepth
                ext4_find_extent
                  if (depth > path[0].p_maxdepth)
                    kfree(path)                 ---> path First freed
                    *orig_path = path = NULL    ---> null ppath
      kfree(path)                               ---> path double-free !!!
    
    So drop the unnecessary ppath and use path directly to avoid this problem.
    And use ext4_find_extent() directly to update path, avoiding unnecessary
    memory allocation and freeing. Also, propagate the error returned by
    ext4_find_extent() instead of using strange error codes.
    
    Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: ext4_search_dir should return a proper error [+ + +]
Author: Thadeu Lima de Souza Cascardo <[email protected]>
Date:   Wed Aug 21 12:23:21 2024 -0300

    ext4: ext4_search_dir should return a proper error
    
    [ Upstream commit cd69f8f9de280e331c9e6ff689ced0a688a9ce8f ]
    
    ext4_search_dir currently returns -1 in case of a failure, while it returns
    0 when the name is not found. In such failure cases, it should return an
    error code instead.
    
    This becomes even more important when ext4_find_inline_entry returns an
    error code as well in the next commit.
    
    -EFSCORRUPTED seems appropriate as such error code as these failures would
    be caused by unexpected record lengths and is in line with other instances
    of ext4_check_dir_entry failures.
    
    In the case of ext4_dx_find_entry, the current use of ERR_BAD_DX_DIR was
    left as is to reduce the risk of regressions.
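
    The return convention described above can be sketched as follows.
    The record type and search_dir_sketch() are hypothetical and far
    simpler than the on-disk ext4 directory format; the sketch also
    assumes the found case returns 1, and that EFSCORRUPTED is an alias
    for EUCLEAN as it is in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef EUCLEAN
#define EUCLEAN 117		/* Linux value, fallback for other libcs */
#endif
#define EFSCORRUPTED EUCLEAN	/* assumed alias, as in the kernel */

/* Hypothetical, simplified directory record; not the ext4 layout. */
struct rec_sketch {
	size_t rec_len;		/* bytes to the next record, must be > 0 */
	const char *name;
};

/*
 * Returns 1 if name is found, 0 if it is not found, and -EFSCORRUPTED
 * (rather than a bare -1) when a record length is invalid.
 */
static int search_dir_sketch(const struct rec_sketch *recs, size_t nrecs,
			     size_t buf_len, const char *name)
{
	size_t off = 0;

	assert(recs != NULL || nrecs == 0);
	for (size_t i = 0; i < nrecs; i++) {
		if (recs[i].rec_len == 0 || recs[i].rec_len > buf_len - off)
			return -EFSCORRUPTED;
		if (strcmp(recs[i].name, name) == 0)
			return 1;
		off += recs[i].rec_len;
	}
	return 0;
}
```

    A caller can then pass any negative value straight up as an errno,
    instead of having to translate -1 into something meaningful.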
    
    Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: filesystems without casefold feature cannot be mounted with siphash [+ + +]
Author: Lizhi Xu <[email protected]>
Date:   Wed Jun 5 09:23:35 2024 +0800

    ext4: filesystems without casefold feature cannot be mounted with siphash
    
    [ Upstream commit 985b67cd86392310d9e9326de941c22fc9340eec ]
    
    When mounting an ext4 filesystem, if the default hash version is set to
    DX_HASH_SIPHASH but the casefold feature is not set, fail the mount.
    
    Reported-by: [email protected]
    Signed-off-by: Lizhi Xu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix access to uninitialised lock in fc replay path [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Thu Jul 18 10:43:56 2024 +0100

    ext4: fix access to uninitialised lock in fc replay path
    
    commit 23dfdb56581ad92a9967bcd720c8c23356af74c1 upstream.
    
    The following kernel trace can be triggered with fstest generic/629 when
    executed against a filesystem with fast-commit feature enabled:
    
    INFO: trying to register non-static key.
    The code is fine but needs lockdep annotation, or maybe
    you didn't initialize this object before use?
    turning off the locking correctness validator.
    CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x66/0x90
     register_lock_class+0x759/0x7d0
     __lock_acquire+0x85/0x2630
     ? __find_get_block+0xb4/0x380
     lock_acquire+0xd1/0x2d0
     ? __ext4_journal_get_write_access+0xd5/0x160
     _raw_spin_lock+0x33/0x40
     ? __ext4_journal_get_write_access+0xd5/0x160
     __ext4_journal_get_write_access+0xd5/0x160
     ext4_reserve_inode_write+0x61/0xb0
     __ext4_mark_inode_dirty+0x79/0x270
     ? ext4_ext_replay_set_iblocks+0x2f8/0x450
     ext4_ext_replay_set_iblocks+0x330/0x450
     ext4_fc_replay+0x14c8/0x1540
     ? jread+0x88/0x2e0
     ? rcu_is_watching+0x11/0x40
     do_one_pass+0x447/0xd00
     jbd2_journal_recover+0x139/0x1b0
     jbd2_journal_load+0x96/0x390
     ext4_load_and_init_journal+0x253/0xd40
     ext4_fill_super+0x2cc6/0x3180
    ...
    
    In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in
    function ext4_check_bdev_write_error().  Unfortunately, at this point this
    spinlock has not been initialized yet.  Moving its initialization to an
    earlier point in __ext4_fill_super() fixes this splat.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix double brelse() the buffer of the extents path [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:28 2024 +0800

    ext4: fix double brelse() the buffer of the extents path
    
    commit dcaa6c31134c0f515600111c38ed7750003e1b9c upstream.
    
    In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been
    released, otherwise it may be released twice. An example of what triggers
    this is as follows:
    
      split2    map    split1
    |--------|-------|--------|
    
    ext4_ext_map_blocks
     ext4_ext_handle_unwritten_extents
      ext4_split_convert_extents
       // path->p_depth == 0
       ext4_split_extent
         // 1. do split1
         ext4_split_extent_at
           |ext4_ext_insert_extent
           |  ext4_ext_create_new_leaf
           |    ext4_ext_grow_indepth
           |      le16_add_cpu(&neh->eh_depth, 1)
           |    ext4_find_extent
           |      // return -ENOMEM
           |// get error and try zeroout
           |path = ext4_find_extent
           |  path->p_depth = 1
           |ext4_ext_try_to_merge
           |  ext4_ext_try_to_merge_up
           |    path->p_depth = 0
           |    brelse(path[1].p_bh)  ---> not set to NULL here
           |// zeroout success
         // 2. update path
         ext4_find_extent
         // 3. do split2
         ext4_split_extent_at
           ext4_ext_insert_extent
             ext4_ext_create_new_leaf
               ext4_ext_grow_indepth
                 le16_add_cpu(&neh->eh_depth, 1)
               ext4_find_extent
                 path[0].p_bh = NULL;
                 path->p_depth = 1
                 read_extent_tree_block  ---> return err
                 // path[1].p_bh is still the old value
                 ext4_free_ext_path
                   ext4_ext_drop_refs
                     // path->p_depth == 1
                     brelse(path[1].p_bh)  ---> brelse a buffer twice
    
    Finally, we got the following WARNING when removing the buffer from the lru:
    
    ============================================
    VFS: brelse: Trying to free free buffer
    WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90
    CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716
    RIP: 0010:__brelse+0x58/0x90
    Call Trace:
     <TASK>
     __find_get_block+0x6e7/0x810
     bdev_getblk+0x2b/0x480
     __ext4_get_inode_loc+0x48a/0x1240
     ext4_get_inode_loc+0xb2/0x150
     ext4_reserve_inode_write+0xb7/0x230
     __ext4_mark_inode_dirty+0x144/0x6a0
     ext4_ext_insert_extent+0x9c8/0x3230
     ext4_ext_map_blocks+0xf45/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ============================================
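
    The idea of the fix, clearing the slot once its buffer has been
    released, can be sketched in userspace C. buf_sketch and both
    helpers are hypothetical stand-ins for struct buffer_head and
    brelse(); the sketch relies only on the fact that releasing a NULL
    buffer is a no-op.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted buffer, standing in for struct buffer_head. */
struct buf_sketch {
	int refcount;
};

/* Drops one reference; like brelse(), a NULL argument is a no-op. */
static void release_sketch(struct buf_sketch *bh)
{
	if (!bh)
		return;
	assert(bh->refcount > 0);	/* a second release would trip this */
	bh->refcount--;
}

/* Release a slot and clear it so later cleanup cannot release it again. */
static void release_and_clear(struct buf_sketch **slot)
{
	release_sketch(*slot);
	*slot = NULL;
}
```

    With the slot cleared, a later cleanup pass that walks every level
    of the path sees NULL and skips the buffer that was already dropped.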
    
    Fixes: ecb94f5fdf4b ("ext4: collapse a single extent tree block into the inode if possible")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix error message when rejecting the default hash [+ + +]
Author: Gabriel Krisman Bertazi <[email protected]>
Date:   Tue Aug 27 16:16:36 2024 -0400

    ext4: fix error message when rejecting the default hash
    
    [ Upstream commit a2187431c395cdfbf144e3536f25468c64fc7cfa ]
    
    Commit 985b67cd8639 ("ext4: filesystems without casefold feature cannot
    be mounted with siphash") properly rejects volumes where
    s_def_hash_version is set to DX_HASH_SIPHASH, but the check and the
    error message should not look into casefold setup - a filesystem should
    never have DX_HASH_SIPHASH as the default hash.  Fix it and, since we
    are there, move the check to ext4_hash_info_init.
    
    Fixes: 985b67cd8639 ("ext4: filesystems without casefold feature cannot
    be mounted with siphash")
    
    Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix fast commit inode enqueueing during a full journal commit [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 17 18:22:20 2024 +0100

    ext4: fix fast commit inode enqueueing during a full journal commit
    
    commit 6db3c1575a750fd417a70e0178bdf6efa0dd5037 upstream.
    
    When a full journal commit is on-going, any fast commit has to be enqueued
    into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
    is done only once, i.e. if an inode is already queued in a previous fast
    commit entry it won't be enqueued again.  However, if a full commit starts
    _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
    be done into FC_Q_STAGING.  And this is not being done in function
    ext4_fc_track_template().
    
    This patch fixes the issue by re-enqueuing an inode into the STAGING queue
    during the fast commit clean-up callback when doing a full commit.  However,
    to prevent a race with a fast-commit, the clean-up callback has to be called
    with the journal locked.
    
    This bug was found using fstest generic/047.  This test creates several
    32k-byte files, sync'ing each of them after its creation, and then shuts
    down the filesystem.  Some data may be lost in this operation; for example
    a file may have its size truncated to zero.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix i_data_sem unlock order in ext4_ind_migrate() [+ + +]
Author: Artem Sadovnikov <[email protected]>
Date:   Thu Aug 29 15:22:09 2024 +0000

    ext4: fix i_data_sem unlock order in ext4_ind_migrate()
    
    [ Upstream commit cc749e61c011c255d81b192a822db650c68b313f ]
    
    Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
    
    This issue is triggered when an EXT4_IOC_MIGRATE ioctl is set to require
    synchronous updates because the file descriptor is opened with O_SYNC.
    This can lead to the jbd2_journal_stop() function calling
    jbd2_might_wait_for_commit(), potentially causing a deadlock if the
    EXT4_IOC_MIGRATE call races with a write(2) system call.
    
    This problem only arises when CONFIG_PROVE_LOCKING is enabled. In this
    case, the jbd2_might_wait_for_commit macro locks jbd2_handle in the
    jbd2_journal_stop function while i_data_sem is locked. This triggers
    lockdep because the jbd2_journal_start function might also lock the same
    jbd2_handle simultaneously.
    
    Found by Linux Verification Center (linuxtesting.org) with syzkaller.
    
    Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
    Co-developed-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Artem Sadovnikov <[email protected]>
    Rule: add
    Link: https://lore.kernel.org/stable/20240404095000.5872-1-mish.uxin2012%40yandex.ru
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:16 2024 +0100

    ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
    
    commit 972090651ee15e51abfb2160e986fa050cfc7a40 upstream.
    
    Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
    for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:18 2024 +0100

    ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
    
    commit ebc4b2c1ac92fc0f8bf3f5a9c285a871d5084a6b upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.
    
    Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
    assumption by being initialised to '0'.  Fortunately, the sb flag
    EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
    has been previously set instead of comparing it with '0'.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:15 2024 +0100

    ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
    
    commit dd589b0f1445e1ea1085b98edca6e4d5dedb98d0 upstream.
    
    Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
    value for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:17 2024 +0100

    ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
    
    commit 7a6443e1dad70281f99f0bd394d7fd342481a632 upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.  Don't assume that and
    use two extra boolean variables to control the loop iterations and keep
    track of the first and last tid.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix off by one issue in alloc_flex_gd() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Fri Sep 27 21:33:29 2024 +0800

    ext4: fix off by one issue in alloc_flex_gd()
    
    commit 6121258c2b33ceac3d21f6a221452692c465df88 upstream.
    
    Wesley reported an issue:
    
    ==================================================================
    EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
    ------------[ cut here ]------------
    kernel BUG at fs/ext4/resize.c:324!
    CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
    RIP: 0010:ext4_resize_fs+0x1212/0x12d0
    Call Trace:
     __ext4_ioctl+0x4e0/0x1800
     ext4_ioctl+0x12/0x20
     __x64_sys_ioctl+0x99/0xd0
     x64_sys_call+0x1206/0x20d0
     do_syscall_64+0x72/0x110
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    ==================================================================
    
    While reviewing the patch, Honza found that when adjusting resize_bg in
    alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
    flexbg_size.
    
    The reproduction of the problem requires the following:
    
     o_group = flexbg_size * 2 * n;
     o_size = (o_group + 1) * group_size;
     n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
     n_size = (n_group + 1) * group_size;
    
    Take n=0,flexbg_size=16 as an example:
    
                  last:15
    |o---------------|--------------n-|
    o_group:0    resize to      n_group:30
    
    The corresponding reproducer is:
    
    img=test.img
    rm -f $img
    truncate -s 600M $img
    mkfs.ext4 -F $img -b 1024 -G 16 8M
    dev=`losetup -f --show $img`
    mkdir -p /tmp/test
    mount $dev /tmp/test
    resize2fs $dev 248M
    
    Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
    to prevent the issue from happening again.
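
    The arithmetic for the n=0, flexbg_size=16 example can be sketched as
    below. This is a hypothetical model, not the code in
    fs/ext4/resize.c: it assumes resize_bg is rounded up to a power of
    two with an fls()-style helper and that last_group is the last group
    of the flex group containing o_group. The extra parameter models the
    removed "plus 1".

```c
#include <assert.h>

/* 1-based position of the highest set bit; 0 for x == 0. */
static unsigned int fls_sketch(unsigned int x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Hypothetical model of the resize_bg adjustment when n_group lies
 * less than flexbg_size groups past last_group. Passing extra = 1
 * models the old behaviour; extra = 0 models the fix.
 */
static unsigned int resize_bg_sketch(unsigned int o_group,
				     unsigned int n_group,
				     unsigned int flexbg_size,
				     unsigned int extra)
{
	unsigned int last_group = o_group | (flexbg_size - 1);
	unsigned int a, b;

	assert(n_group > last_group && n_group - last_group < flexbg_size);
	a = fls_sketch(last_group - o_group + extra);
	b = fls_sketch(n_group - last_group);
	return 1U << (a > b ? a : b);
}
```

    Under these assumptions, o_group=0 and n_group=30 give 32 with the
    extra 1 and 16 without it, so only the second stays within
    flexbg_size.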
    
    [ Note: another reproducer which this commit fixes is:
    
      img=test.img
      rm -f $img
      truncate -s 25MiB $img
      mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
      truncate -s 3GiB $img
      dev=`losetup -f --show $img`
      mkdir -p /tmp/test
      mount $dev /tmp/test
      resize2fs $dev 3G
      umount $dev
      losetup -d $dev
    
      -- TYT ]
    
    Reported-by: Wesley Hershberger <[email protected]>
    Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
    Reported-by: Stéphane Graber <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Alexander Mikhalitsyn <[email protected]>
    Tested-by: Eric Sandeen <[email protected]>
    Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix slab-use-after-free in ext4_split_extent_at() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:23 2024 +0800

    ext4: fix slab-use-after-free in ext4_split_extent_at()
    
    commit c26ab35702f8cd0cdc78f96aa5856bfb77be798f upstream.
    
    We hit the following use-after-free:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0
    Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40
    CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724
    Call Trace:
     <TASK>
     kasan_report+0x93/0xc0
     ext4_split_extent_at+0xba8/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Allocated by task 40:
     __kmalloc_noprof+0x1ac/0x480
     ext4_find_extent+0xf3b/0x1e70
     ext4_ext_map_blocks+0x188/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Freed by task 40:
     kfree+0xf1/0x2b0
     ext4_find_extent+0xa71/0x1e70
     ext4_ext_insert_extent+0xa22/0x3260
     ext4_split_extent_at+0x3ef/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ==================================================================
    
    The issue is triggered as follows:
    
    ext4_split_extent_at
      path = *ppath
      ext4_ext_insert_extent(ppath)
        ext4_ext_create_new_leaf(ppath)
          ext4_find_extent(orig_path)
            path = *orig_path
            read_extent_tree_block
              // return -ENOMEM or -EIO
            ext4_free_ext_path(path)
              kfree(path)
            *orig_path = NULL
      a. If err is -ENOMEM:
      ext4_ext_dirty(path + path->p_depth)
      // path use-after-free !!!
      b. If err is -EIO and we have EXT_DEBUG defined:
      ext4_ext_show_leaf(path)
        eh = path[depth].p_hdr
        // path also use-after-free !!!
    
    So when trying to zeroout or fix the extent length, call ext4_find_extent()
    to update the path.
    
    In addition we use *ppath directly as an ext4_ext_show_leaf() input to
    avoid possible use-after-free when EXT_DEBUG is defined, and to avoid
    unnecessary path updates.
    
    Fixes: dfe5080939ea ("ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix timer use-after-free on failed mount [+ + +]
Author: Xiaxi Shen <[email protected]>
Date:   Sun Jul 14 21:33:36 2024 -0700

    ext4: fix timer use-after-free on failed mount
    
    commit 0ce160c5bdb67081a62293028dc85758a8efb22a upstream.
    
    Syzbot has found an ODEBUG bug in ext4_fill_super().
    
    The del_timer_sync function cancels the s_err_report timer,
    which reminds about filesystem errors daily. We should
    guarantee the timer is no longer active before kfree(sbi).
    
    When filesystem mounting fails, the flow goes to failed_mount3,
    where an error occurs when ext4_stop_mmpd is called, causing
    a read I/O failure. This triggers the ext4_handle_error function
    that ultimately re-arms the timer,
    leaving the s_err_report timer active before kfree(sbi) is called.
    
    Fix the issue by canceling the s_err_report timer after calling ext4_stop_mmpd.
    
    Signed-off-by: Xiaxi Shen <[email protected]>
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=59e0101c430934bc9a36
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: mark fc as ineligible using an handle in ext4_xattr_set() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:09 2024 +0100

    ext4: mark fc as ineligible using an handle in ext4_xattr_set()
    
    commit 04e6ce8f06d161399e5afde3df5dcfa9455b4952 upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch moves the call to this function so that a handle
    can be used.  If a transaction fails to start, then there's no point in
    trying to mark the filesystem as ineligible, and an error will eventually be
    returned to user-space.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: no need to continue when the number of entries is 1 [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Mon Jul 1 22:25:03 2024 +0800

    ext4: no need to continue when the number of entries is 1
    
    commit 1a00a393d6a7fb1e745a41edd09019bd6a0ad64c upstream.
    
    Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-and-tested-by: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: propagate errors from ext4_find_extent() in ext4_insert_range() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:30 2024 +0800

    ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
    
    commit 369c944ed1d7c3fb7b35f24e4735761153afe7b3 upstream.
    
    Even though ext4_find_extent() returns an error, ext4_insert_range() still
    returns 0. This may confuse the user as to why fallocate returns success,
    but the contents of the file are not as expected. So propagate the error
    returned by ext4_find_extent() to avoid inconsistencies.
    
    Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: update orig_path in ext4_find_extent() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:25 2024 +0800

    ext4: update orig_path in ext4_find_extent()
    
    commit 5b4b2dcace35f618fe361a87bae6f0d13af31bc1 upstream.
    
    In ext4_find_extent(), if the path is not big enough, we free it and set
    *orig_path to NULL. But after reallocating and successfully initializing
    the path, we don't update *orig_path, in which case the caller gets a
    valid path but a NULL ppath, and this may cause a NULL pointer dereference
    or a path memory leak. For example:
    
    ext4_split_extent
      path = *ppath = 2000
      ext4_find_extent
        if (depth > path[0].p_maxdepth)
          kfree(path = 2000);
          *orig_path = path = NULL;
          path = kcalloc() = 3000
      ext4_split_extent_at(*ppath = NULL)
        path = *ppath;
        ex = path[depth].p_ext;
        // NULL pointer dereference!
    
    ==================================================================
    BUG: kernel NULL pointer dereference, address: 0000000000000010
    CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847
    RIP: 0010:ext4_split_extent_at+0x6d/0x560
    Call Trace:
     <TASK>
     ext4_split_extent.isra.0+0xcb/0x1b0
     ext4_ext_convert_to_initialized+0x168/0x6c0
     ext4_ext_handle_unwritten_extents+0x325/0x4d0
     ext4_ext_map_blocks+0x520/0xdb0
     ext4_map_blocks+0x2b0/0x690
     ext4_iomap_begin+0x20e/0x2c0
    [...]
    ==================================================================
    
    Therefore, *orig_path is updated when the extent lookup succeeds, so that
    the caller can safely use path or *ppath.
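    
    The write-back can be sketched in userspace C as follows. This is an
    illustration under simplifying assumptions, not the ext4 implementation:
    demo_path and demo_find() are hypothetical, and calloc()/free() stand in
    for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ext4_ext_path. */
struct demo_path {
	int maxdepth;
};

/*
 * Returns a path array that can hold 'depth' levels, reallocating when
 * the caller's array is too small. On success *orig_path always points
 * at the returned array, so path and *orig_path never disagree.
 */
struct demo_path *demo_find(struct demo_path **orig_path, int depth)
{
	struct demo_path *path;

	assert(orig_path != NULL && depth >= 0);
	path = *orig_path;

	if (path && depth > path[0].maxdepth) {
		free(path);
		*orig_path = path = NULL;
	}
	if (!path) {
		path = calloc((size_t)depth + 1, sizeof(*path));
		if (!path)
			return NULL;
		path[0].maxdepth = depth;
	}

	*orig_path = path;	/* the write-back that was missing */
	return path;
}
```

    After a reallocation the caller's pointer and the return value are the
    same address, so neither a NULL dereference nor a leak can follow from
    a mismatch between the two.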
    
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: use handle to mark fc as ineligible in __track_dentry_update() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:08 2024 +0100

    ext4: use handle to mark fc as ineligible in __track_dentry_update()
    
    commit faab35a0370fd6e0821c7a8dd213492946fc776f upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch fixes the calls to this function in
    __track_dentry_update() by adding an extra parameter to the callback used in
    ext4_fc_track_template().
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
f2fs: add write priority option based on zone UFS [+ + +]
Author: Liao Yuanhong <[email protected]>
Date:   Mon Jul 15 20:34:51 2024 +0800

    f2fs: add write priority option based on zone UFS
    
    [ Upstream commit 8444ce524947daf441546b5b3a0c418706dade35 ]
    
    Currently, we are using a mix of traditional UFS and zone UFS to support
    some functionalities that cannot be achieved on zone UFS alone. However,
    there are some issues with this approach. There exists a significant
    performance difference between traditional UFS and zone UFS. Under normal
    usage, we prioritize writes to zone UFS. However, in critical conditions
    (such as when the entire UFS is almost full), we cannot determine whether
    data will be written to traditional UFS or zone UFS. This can lead to
    significant performance fluctuations, which is not conducive to
    development and testing. To address this, we have added an option
    zlu_io_enable under sys with the following three modes:
    1) zlu_io_enable == 0: Normal mode, prioritize writing to zone UFS;
    2) zlu_io_enable == 1: Zone UFS only mode, only allow writing to zone UFS;
    3) zlu_io_enable == 2: Traditional UFS priority mode, prioritize writing to
    traditional UFS.
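    
    One way to read the three modes is as a target-selection policy. The
    sketch below is only an interpretation: the enum names and
    demo_pick_target() are hypothetical, and the fallback to the other
    device in the two "prioritize" modes is an assumption, not something
    taken from the f2fs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three zlu_io_enable values. */
enum demo_zlu_mode {
	DEMO_ZLU_NORMAL = 0,	/* prioritize zone UFS */
	DEMO_ZLU_ZONE_ONLY = 1,	/* only allow zone UFS */
	DEMO_ZLU_TRAD_PRIO = 2,	/* prioritize traditional UFS */
};

enum demo_target {
	DEMO_TARGET_NONE,
	DEMO_TARGET_ZONE,
	DEMO_TARGET_TRAD,
};

/*
 * Picks a write target given the mode and whether each device still has
 * free space. The fallback order is an assumption for illustration.
 */
enum demo_target demo_pick_target(int mode, bool zone_free, bool trad_free)
{
	assert(mode >= DEMO_ZLU_NORMAL && mode <= DEMO_ZLU_TRAD_PRIO);

	switch (mode) {
	case DEMO_ZLU_ZONE_ONLY:
		return zone_free ? DEMO_TARGET_ZONE : DEMO_TARGET_NONE;
	case DEMO_ZLU_TRAD_PRIO:
		if (trad_free)
			return DEMO_TARGET_TRAD;
		return zone_free ? DEMO_TARGET_ZONE : DEMO_TARGET_NONE;
	default:		/* DEMO_ZLU_NORMAL */
		if (zone_free)
			return DEMO_TARGET_ZONE;
		return trad_free ? DEMO_TARGET_TRAD : DEMO_TARGET_NONE;
	}
}
```

    In this reading, mode 1 makes the outcome deterministic when zone UFS
    is nearly full: the write is refused instead of silently landing on
    traditional UFS.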
    
    Signed-off-by: Liao Yuanhong <[email protected]>
    Signed-off-by: Wu Bo <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 65a6ce4726c2 ("f2fs: fix to don't panic system for no free segment fault injection")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: do FG_GC when GC boosting is required for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:44 2024 -0700

    f2fs: do FG_GC when GC boosting is required for zoned devices
    
    [ Upstream commit 9748c2ddea4a3f46a498bff4cf2bf9a5629e3f8b ]
    
    Under low free section count, we need to use FG_GC instead of BG_GC to
    recover free sections.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: fix to don't panic system for no free segment fault injection [+ + +]
Author: Chao Yu <[email protected]>
Date:   Tue Sep 10 09:16:19 2024 +0800

    f2fs: fix to don't panic system for no free segment fault injection
    
    [ Upstream commit 65a6ce4726c27b45600303f06496fef46d00b57f ]
    
    syzbot reports a f2fs bug as below:
    
    F2FS-fs (loop0): inject no free segment in get_new_segment of __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
    F2FS-fs (loop0): Stopped filesystem due to reason: 7
    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/segment.c:2748!
    CPU: 0 UID: 0 PID: 5109 Comm: syz-executor304 Not tainted 6.11.0-rc6-syzkaller-00363-g89f5e14d05b4 #0
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    Call Trace:
     __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
     f2fs_allocate_new_section fs/f2fs/segment.c:3181 [inline]
     f2fs_allocate_pinning_section+0xfa/0x4e0 fs/f2fs/segment.c:3195
     f2fs_expand_inode_data+0x5d6/0xbb0 fs/f2fs/file.c:1799
     f2fs_fallocate+0x448/0x960 fs/f2fs/file.c:1903
     vfs_fallocate+0x553/0x6c0 fs/open.c:334
     do_vfs_ioctl+0x2592/0x2e50 fs/ioctl.c:886
     __do_sys_ioctl fs/ioctl.c:905 [inline]
     __se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    
    The root cause is that injecting a no free segment fault into f2fs
    hits a BUG in get_new_segment(). Fault injection should not panic
    the system, so fix it.
    
    Fixes: 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/linux-f2fs-devel/[email protected]
    Signed-off-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: forcibly migrate to secure space for zoned device file pinning [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Thu Sep 12 09:59:58 2024 -0700

    f2fs: forcibly migrate to secure space for zoned device file pinning
    
    [ Upstream commit 5cc69a27abfa91abbb39fc584f82d6c867b60f47 ]
    
    We need to migrate data blocks even though it is full to secure space
    for zoned device file pinning.
    
    Fixes: 9703d69d9d15 ("f2fs: support file pinning for zoned devices")
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: increase BG GC migration window granularity when boosted for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:43 2024 -0700

    f2fs: increase BG GC migration window granularity when boosted for zoned devices
    
    [ Upstream commit 2223fe652f759649ae1d520e47e5f06727c0acbd ]
    
    A bigger BG GC migration window granularity is needed when free
    sections are running low.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: introduce migration_window_granularity [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:41 2024 -0700

    f2fs: introduce migration_window_granularity
    
    [ Upstream commit 8c890c4c60342719526520133fb1b6f69f196ab8 ]
    
    We can control the scanning window granularity for GC migration. For
    more frequent scanning and GC on zoned devices, we need a fine grained
    control knob for it.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: make BG GC more aggressive for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:40 2024 -0700

    f2fs: make BG GC more aggressive for zoned devices
    
    [ Upstream commit 5062b5bed4323275f2f89bc185c6a28d62cfcfd5 ]
    
    Since zoned devices have no GC on the device side, BG GC needs to be
    more aggressive. Tune the parameters accordingly.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

 
fbdev: efifb: Register sysfs groups through driver core [+ + +]
Author: Thomas Weißschuh <[email protected]>
Date:   Tue Aug 27 17:25:13 2024 +0200

    fbdev: efifb: Register sysfs groups through driver core
    
    [ Upstream commit 95cdd538e0e5677efbdf8aade04ec098ab98f457 ]
    
    The driver core can already register and clean up sysfs groups.
    Make use of that functionality to simplify the error handling and
    cleanup.
    
    Also avoid a UAF race during unregistering where the sysfs attributes
    were usable after the info struct was freed.
    
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fbdev: pxafb: Fix possible use after free in pxafb_task() [+ + +]
Author: Kaixin Wang <[email protected]>
Date:   Wed Sep 11 22:29:52 2024 +0800

    fbdev: pxafb: Fix possible use after free in pxafb_task()
    
    [ Upstream commit 4a6921095eb04a900e0000da83d9475eb958e61e ]
    
    pxafb_probe() calls pxafb_init_fbinfo(), after which &fbi->task is
    associated with pxafb_task(). Moreover, within pxafb_init_fbinfo(),
    the pxafb_blank() function in the &pxafb_ops struct is capable of
    scheduling that work.
    
    Removing the module calls pxafb_remove() for cleanup, which calls
    unregister_framebuffer(). That function can call
    do_unregister_framebuffer() to free fbi->fb through
    put_fb_info(fb_info) while the work mentioned above is still in use.
    The sequence of operations that may lead to a UAF bug is as follows:
    
    CPU0                                                CPU1
    
                                       | pxafb_task
    pxafb_remove                       |
    unregister_framebuffer(info)       |
    do_unregister_framebuffer(fb_info) |
    put_fb_info(fb_info)               |
    // free fbi->fb                    | set_ctrlr_state(fbi, state)
                                       | __pxafb_lcd_power(fbi, 0)
                                       | fbi->lcd_power(on, &fbi->fb.var)
                                       | //use fbi->fb
    
    Fix it by ensuring that the work is canceled before proceeding
    with the cleanup in pxafb_remove.
    
    Note that only root user can remove the driver at runtime.
    
    Signed-off-by: Kaixin Wang <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
firmware/sysfb: Disable sysfb for firmware buffers with unknown parent [+ + +]
Author: Thomas Zimmermann <[email protected]>
Date:   Tue Sep 24 10:41:03 2024 +0200

    firmware/sysfb: Disable sysfb for firmware buffers with unknown parent
    
    commit ad604f0a4c040dcb8faf44dc72db25e457c28076 upstream.
    
    The sysfb framebuffer handling only operates on graphics devices
    that provide the system's firmware framebuffer. If that device is
    not known, assume that any graphics device has been initialized by
    firmware.
    
    Fixes a problem on i915 where sysfb does not release the firmware
    framebuffer after the native graphics driver loaded.
    
    Reported-by: Borah, Chaitanya Kumar <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/SJ1PR11MB6129EFB8CE63D1EF6D932F94B96F2@SJ1PR11MB6129.namprd11.prod.outlook.com/
    Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12160
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: b49420d6a1ae ("video/aperture: optionally match the device in sysfb_disable()")
    Cc: Javier Martinez Canillas <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Helge Deller <[email protected]>
    Cc: Sam Ravnborg <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Cc: Linux regression tracking (Thorsten Leemhuis) <[email protected]>
    Cc: <[email protected]> # v6.11+
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Javier Martinez Canillas <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Fri Aug 16 15:57:21 2024 +0200

    firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp()
    
    commit 9c3a62c20f7fb00294a4237e287254456ba8a48b upstream.
    
    mbox_client_to_bpmp() is not used, W=1 builds:
    
      drivers/firmware/tegra/bpmp.c:28:1: error: unused function 'mbox_client_to_bpmp' [-Werror,-Wunused-function]
    
    Fixes: cdfa358b248e ("firmware: tegra: Refactor BPMP driver")
    Cc: [email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Thierry Reding <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name [+ + +]
Author: Li Zhijian <[email protected]>
Date:   Mon Aug 26 13:55:03 2024 +0800

    fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
    
    [ Upstream commit 7f7b850689ac06a62befe26e1fd1806799e7f152 ]
    
    A crash is observed when hot-removing a memory device while a user is
    accessing hugetlb memory. See the following call trace:
    
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 14045 at arch/x86/mm/fault.c:1278 do_user_addr_fault+0x2a0/0x790
    Modules linked in: kmem device_dax cxl_mem cxl_pmem cxl_port cxl_pci dax_hmem dax_pmem nd_pmem cxl_acpi nd_btt cxl_core crc32c_intel nvme virtiofs fuse nvme_core nfit libnvdimm dm_multipath scsi_dh_rdac scsi_dh_emc s
    mirror dm_region_hash dm_log dm_mod
    CPU: 1 PID: 14045 Comm: daxctl Not tainted 6.10.0-rc2-lizhijian+ #492
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    RIP: 0010:do_user_addr_fault+0x2a0/0x790
    Code: 48 8b 00 a8 04 0f 84 b5 fe ff ff e9 1c ff ff ff 4c 89 e9 4c 89 e2 be 01 00 00 00 bf 02 00 00 00 e8 b5 ef 24 00 e9 42 fe ff ff <0f> 0b 48 83 c4 08 4c 89 ea 48 89 ee 4c 89 e7 5b 5d 41 5c 41 5d 41
    RSP: 0000:ffffc90000a575f0 EFLAGS: 00010046
    RAX: ffff88800c303600 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: 0000000000001000 RSI: ffffffff82504162 RDI: ffffffff824b2c36
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000a57658
    R13: 0000000000001000 R14: ffff88800bc2e040 R15: 0000000000000000
    FS:  00007f51cb57d880(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000001000 CR3: 00000000072e2004 CR4: 00000000001706f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __warn+0x8d/0x190
     ? do_user_addr_fault+0x2a0/0x790
     ? report_bug+0x1c3/0x1d0
     ? handle_bug+0x3c/0x70
     ? exc_invalid_op+0x14/0x70
     ? asm_exc_invalid_op+0x16/0x20
     ? do_user_addr_fault+0x2a0/0x790
     ? exc_page_fault+0x31/0x200
     exc_page_fault+0x68/0x200
    <...snip...>
    BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     ---[ end trace 0000000000000000 ]---
     BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 14045 Comm: daxctl Kdump: loaded Tainted: G        W          6.10.0-rc2-lizhijian+ #492
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
     RIP: 0010:dentry_name+0x1f4/0x440
    <...snip...>
    ? dentry_name+0x2fa/0x440
    vsnprintf+0x1f3/0x4f0
    vprintk_store+0x23a/0x540
    vprintk_emit+0x6d/0x330
    _printk+0x58/0x80
    dump_mapping+0x10b/0x1a0
    ? __pfx_free_object_rcu+0x10/0x10
    __dump_page+0x26b/0x3e0
    ? vprintk_emit+0xe0/0x330
    ? _printk+0x58/0x80
    ? dump_page+0x17/0x50
    dump_page+0x17/0x50
    do_migrate_range+0x2f7/0x7f0
    ? do_migrate_range+0x42/0x7f0
    ? offline_pages+0x2f4/0x8c0
    offline_pages+0x60a/0x8c0
    memory_subsys_offline+0x9f/0x1c0
    ? lockdep_hardirqs_on+0x77/0x100
    ? _raw_spin_unlock_irqrestore+0x38/0x60
    device_offline+0xe3/0x110
    state_store+0x6e/0xc0
    kernfs_fop_write_iter+0x143/0x200
    vfs_write+0x39f/0x560
    ksys_write+0x65/0xf0
    do_syscall_64+0x62/0x130
    
    Although dump_mapping() already performs some sanity checks before the
    print facility parses '%pd', it is still possible to run into an
    invalid dentry.d_name.name.
    
    Since dump_mapping() only needs to dump the filename, retrieve it
    directly in a safer way to prevent an unnecessary crash.
    
    Note that whether the filename is retrieved with '%pd' or with
    strncpy_from_kernel_nofault(), it may be unreliable.
    
    Signed-off-by: Li Zhijian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
gfs2: fix double destroy_workqueue error [+ + +]
Author: Julian Sun <[email protected]>
Date:   Tue Aug 20 11:31:48 2024 +0800

    gfs2: fix double destroy_workqueue error
    
    commit 6cb9df81a2c462b89d2f9611009ab43ae8717841 upstream.
    
    When gfs2_fill_super() fails, destroy_workqueue() is called within
    gfs2_gl_hash_clear(), and the subsequent code path calls
    destroy_workqueue() on the same work queue again.
    
    This issue can be fixed by setting the work queue pointer to NULL after
    the first destroy_workqueue() call and checking for a NULL pointer
    before attempting to destroy the work queue again.
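    
    The NULL-after-destroy idiom described above can be sketched in
    userspace C as below. The demo_sb and demo_wq types and
    demo_destroy_wq() are hypothetical, and free() stands in for
    destroy_workqueue(). The destroy_calls counter exists only to make the
    behaviour observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the workqueue and the superblock data. */
struct demo_wq {
	int placeholder;
};

struct demo_sb {
	struct demo_wq *glock_wq;
	int destroy_calls;	/* counts real teardowns, for illustration */
};

/*
 * Safe to call from more than one teardown path: the first call
 * destroys the queue and clears the pointer, later calls are no-ops.
 */
void demo_destroy_wq(struct demo_sb *sb)
{
	assert(sb != NULL);
	if (!sb->glock_wq)
		return;		/* already destroyed by an earlier path */

	free(sb->glock_wq);	/* stands in for destroy_workqueue() */
	sb->glock_wq = NULL;
	sb->destroy_calls++;
}
```

    Calling demo_destroy_wq() twice on the same demo_sb performs exactly
    one teardown, which mirrors the failed gfs2_fill_super() path where two
    cleanup sites may both run.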
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d34c2a269ed512c531b0
    Fixes: 30e388d57367 ("gfs2: Switch to a per-filesystem glock workqueue")
    Cc: [email protected]
    Signed-off-by: Julian Sun <[email protected]>
    Signed-off-by: Andreas Gruenbacher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
gpio: davinci: fix lazy disable [+ + +]
Author: Emanuele Ghidoli <[email protected]>
Date:   Wed Aug 28 15:32:07 2024 +0200

    gpio: davinci: fix lazy disable
    
    commit 3360d41f4ac490282fddc3ccc0b58679aa5c065d upstream.
    
    On a few platforms such as TI's AM69 device, disable_irq() fails to keep
    track of the interrupts that happen between disable_irq() and
    enable_irq() and those interrupts are missed. Use the ->irq_unmask() and
    ->irq_mask() methods instead of ->irq_enable() and ->irq_disable() to
    correctly keep track of edges when disable_irq is called.
    
    This solves the issue of disable_irq() not working as expected on such
    platforms.
    
    Fixes: 23265442b02b ("ARM: davinci: irq_data conversion.")
    Signed-off-by: Emanuele Ghidoli <[email protected]>
    Signed-off-by: Parth Pancholi <[email protected]>
    Acked-by: Keerthy <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
gpiolib: Fix potential NULL pointer dereference in gpiod_get_label() [+ + +]
Author: Lad Prabhakar <[email protected]>
Date:   Thu Oct 3 14:13:51 2024 +0100

    gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()
    
    [ Upstream commit 7b99b5ab885993bff010ebcd93be5e511c56e28a ]
    
    In `gpiod_get_label()`, it is possible that `srcu_dereference_check()` may
    return a NULL pointer, leading to a scenario where `label->str` is accessed
    without verifying if `label` itself is NULL.
    
    This patch adds a proper NULL check for `label` before accessing
    `label->str`. The check for `label->str != NULL` is removed because
    `label->str` can never be NULL if `label` is not NULL.
    
    This fixes the issue where the label name was being printed as `(efault)`
    when dumping the sysfs GPIO file when `label == NULL`.
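    
    A simplified sketch of the check, assuming a label structure whose
    string is an embedded array (so the string itself cannot be NULL once
    the label exists). demo_label, demo_desc and demo_get_label() are
    hypothetical, and the SRCU dereference of the real code is reduced to a
    plain pointer read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical label: str is an embedded array, so it is never NULL. */
struct demo_label {
	char str[16];
};

struct demo_desc {
	struct demo_label *label;	/* may be NULL when no label is set */
};

/* Returns the label string, or NULL when the descriptor has no label. */
const char *demo_get_label(const struct demo_desc *desc)
{
	const struct demo_label *label;

	assert(desc != NULL);
	label = desc->label;	/* plain read; the kernel uses SRCU here */
	if (!label)
		return NULL;	/* check the pointer, not label->str */

	return label->str;
}
```

    Testing label->str instead of label would compute an address from a
    NULL base, which is the kind of bogus pointer that ends up printed as
    `(efault)`.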
    
    Fixes: 5a646e03e956 ("gpiolib: Return label, if set, for IRQ only line")
    Fixes: a86d27693066 ("gpiolib: fix the speed of descriptor label setting with SRCU")
    Signed-off-by: Lad Prabhakar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
gso: fix udp gso fraglist segmentation after pull from frag_list [+ + +]
Author: Willem de Bruijn <[email protected]>
Date:   Tue Oct 1 13:17:46 2024 -0400

    gso: fix udp gso fraglist segmentation after pull from frag_list
    
    commit a1e40ac5b5e9077fe1f7ae0eb88034db0f9ae1ab upstream.
    
    Detect gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For UDP, this
    causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
    udp_hdr(seg->next)->dest.
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
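    
    The head size check can be pictured with the toy model below. It is a
    sketch of the invariants listed above, not the networking code:
    demo_gso_skb, its fields and demo_pick_segmenter() are hypothetical,
    and header sizes are left out by tracking only the payload held in the
    head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a fraglist GSO skb; not struct sk_buff. */
struct demo_gso_skb {
	size_t head_payload;	/* payload bytes in the head, headers excluded */
	size_t gso_size;	/* expected size of each segment */
	size_t nr_frag_skbs;	/* number of skbs on the frag_list */
};

enum demo_segmenter {
	DEMO_SEGMENT_LIST,	/* fast path, like skb_segment_list */
	DEMO_SEGMENT_GENERIC,	/* fallback, like skb_segment */
};

/*
 * A valid fraglist skb keeps exactly one gso_size segment in the head
 * and at least one more segment on the frag_list. Anything else is
 * treated as modified geometry and sent to the generic segmenter.
 */
enum demo_segmenter demo_pick_segmenter(const struct demo_gso_skb *skb)
{
	assert(skb != NULL);
	if (skb->nr_frag_skbs >= 1 && skb->head_payload == skb->gso_size)
		return DEMO_SEGMENT_LIST;

	return DEMO_SEGMENT_GENERIC;
}
```

    An skb whose data was pulled entirely into the linear area has a head
    larger than gso_size and an empty frag_list, so it takes the generic
    path instead of being dropped.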
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Willem de Bruijn <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
HID: bpf: fix cfi stubs for hid_bpf_ops [+ + +]
Author: Benjamin Tissoires <[email protected]>
Date:   Fri Sep 27 16:17:41 2024 +0200

    HID: bpf: fix cfi stubs for hid_bpf_ops
    
    commit acd5f76fd5292c91628e04da83e8b78c986cfa2b upstream.
    
    With the introduction of commit e42ac1418055 ("bpf: Check unsupported ops
    from the bpf_struct_ops's cfi_stubs"), a HID-BPF struct_ops containing
    a .hid_hw_request() or a .hid_hw_output_report() was failing to load
    as the cfi stubs were not defined.
    
    Fix that by defining those simple static functions and restore HID-BPF
    functionality.
    
    This was detected with the HID selftests suddenly failing on Linus' tree.
    
    Cc: [email protected] # v6.11+
    Fixes: 9286675a2aed ("HID: bpf: add HID-BPF hooks for hid_hw_output_report")
    Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests")
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: i2c-hid: ensure various commands do not interfere with each other [+ + +]
Author: Dmitry Torokhov <[email protected]>
Date:   Mon Sep 9 13:37:40 2024 -0700

    HID: i2c-hid: ensure various commands do not interfere with each other
    
    [ Upstream commit b4ed18a3d56eabd18cfd9841ff05111e3cfbe8f9 ]
    
    i2c-hid uses 2 shared buffers: command and "raw" input buffer for
    sending requests to peripherals and reading data from peripherals when
    executing a variety of commands. Such commands include reading of HID
    registers, requesting particular power mode, getting and setting
    reports and so on. Because all such requests use the same 2 buffers
    they should not execute simultaneously.
    
    Fix this by introducing a "cmd_lock" mutex and acquiring it whenever
    we need to access ihid->cmdbuf or ihid->rawbuf.
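    
    The locking scheme can be sketched in userspace with a pthread mutex
    standing in for the kernel mutex. Everything here is illustrative:
    demo_ihid, demo_ihid_init() and demo_xfer() are hypothetical, and the
    "device" simply echoes the command back into the raw buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state with the two shared buffers. */
struct demo_ihid {
	pthread_mutex_t cmd_lock;	/* protects cmdbuf and rawbuf */
	unsigned char cmdbuf[8];
	unsigned char rawbuf[8];
};

void demo_ihid_init(struct demo_ihid *ihid)
{
	assert(ihid != NULL);
	pthread_mutex_init(&ihid->cmd_lock, NULL);
	memset(ihid->cmdbuf, 0, sizeof(ihid->cmdbuf));
	memset(ihid->rawbuf, 0, sizeof(ihid->rawbuf));
}

/*
 * Sends 'len' command bytes and copies the reply into 'out'. Both
 * shared buffers are only touched while cmd_lock is held, so two
 * commands cannot interleave. Returns 0 on success, -1 if too long.
 */
int demo_xfer(struct demo_ihid *ihid, const unsigned char *cmd,
	      unsigned char *out, size_t len)
{
	assert(ihid != NULL && cmd != NULL && out != NULL);
	if (len > sizeof(ihid->cmdbuf))
		return -1;

	pthread_mutex_lock(&ihid->cmd_lock);
	memcpy(ihid->cmdbuf, cmd, len);
	memcpy(ihid->rawbuf, ihid->cmdbuf, len);	/* pretend echo reply */
	memcpy(out, ihid->rawbuf, len);
	pthread_mutex_unlock(&ihid->cmd_lock);

	return 0;
}
```

    Copying the reply out before dropping the lock matters: once the lock
    is released, another command may overwrite rawbuf.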
    
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: Ignore battery for all ELAN I2C-HID devices [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Mon Aug 5 16:51:47 2024 +0200

    HID: Ignore battery for all ELAN I2C-HID devices
    
    [ Upstream commit bcc31692a1d1e21f0d06c5f727c03ee299d2264e ]
    
    Before this change there were 16 vid:pid based quirks to ignore the battery
    reported by Elan I2C-HID touchscreens on various Asus and HP laptops.
    
    And a report has been received that the 04F3:2A00 I2C touchscreen on
    the HP ProBook x360 11 G5 EE/86CF also reports a non present battery.
    
    Since I2C-HID devices are always built into laptops, they are not battery
    powered, so it should be safe to just ignore the battery on all Elan I2C-HID
    devices, rather than adding a 17th quirk for the 04F3:2A00 touchscreen.
    
    As reported in the changelog of commit a3a5a37efba1 ("HID: Ignore battery
    for ELAN touchscreens 2F2C and 4116"), which added 2 new Elan touchscreen
    quirks about a month ago, the HID reported battery seems to be related
    to a stylus being used. But even when a stylus is in use it does not
    properly report the charge of the stylus battery, instead the reported
    battery charge jumps from 0% to 1%. So it is best to just ignore the
    HID battery.
    
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2302776
    Cc: Louis Dalibard <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio [+ + +]
Author: Vishnu Sankar <[email protected]>
Date:   Sun Aug 18 16:27:29 2024 +0900

    HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio
    
    [ Upstream commit 65b72ea91a257a5f0cb5a26b01194d3dd4b85298 ]
    
    This applies quirks similar to those used by the previous generation
    device, so that the Trackpoint and the buttons on the touchpad work.
    New USB KBD PID 0x61AE for Thinkpad X12 Tab is added.
    
    Signed-off-by: Vishnu Sankar <[email protected]>
    Reviewed-by: Mark Pearson <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
hwmon: (nct6775) add G15CF to ASUS WMI monitoring list [+ + +]
Author: Denis Pauk <[email protected]>
Date:   Mon Aug 12 18:26:38 2024 +0300

    hwmon: (nct6775) add G15CF to ASUS WMI monitoring list
    
    [ Upstream commit 1f432e4cf1dd3ecfec5ed80051b4611632a0fd51 ]
    
    G15CF boards have an nct6775 chip, but by default it is not used
    because of a resource conflict with the WMI method.
    
    Add the board to the WMI monitoring list.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=204807
    Signed-off-by: Denis Pauk <[email protected]>
    Tested-by: Attila <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
i2c: core: Lock address during client device instantiation [+ + +]
Author: Heiner Kallweit <[email protected]>
Date:   Thu Aug 15 21:44:50 2024 +0200

    i2c: core: Lock address during client device instantiation
    
    commit 8d3cefaf659265aa82b0373a563fdb9d16a2b947 upstream.
    
    Krzysztof reported an issue [0] which is caused by parallel attempts to
    instantiate the same I2C client device. This can happen if a driver
    supports auto-detection, but certain devices are also instantiated
    explicitly.
    The original change isn't actually wrong, it just revealed that I2C core
    isn't prepared yet to handle this scenario.
    Calls to i2c_new_client_device() can be nested, therefore we can't use a
    simple mutex here. Parallel instantiation of devices at different addresses
    is ok, so we just have to prevent parallel instantiation at the same address.
    We can use a bitmap with one bit per 7-bit I2C client address, and atomic
    bit operations to set/check/clear bits.
    Now a parallel attempt to instantiate a device at the same address will
    result in -EBUSY being returned, avoiding the "sysfs: cannot create duplicate
    filename" splash.
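
    The address-locking idea can be sketched in plain C. This is only an
    illustration using C11 atomics in user space, not the I2C core code:
    the names addr_locks, addr_lock() and addr_unlock() are hypothetical,
    and the kernel uses its own bitmap and atomic bit helpers instead.

    ```c
    #include <errno.h>
    #include <limits.h>
    #include <stdatomic.h>

    #define ADDR_COUNT 128 /* one bit per 7-bit I2C address */
    #define WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
    #define ADDR_WORDS ((ADDR_COUNT + WORD_BITS - 1) / WORD_BITS)

    /* Hypothetical per-adapter state; all bits clear means nothing is locked. */
    struct addr_locks {
        atomic_ulong bits[ADDR_WORDS];
    };

    /* Atomically claim an address: 0 on success, -EBUSY if already claimed. */
    static int addr_lock(struct addr_locks *l, unsigned int addr)
    {
        unsigned long mask, old;

        if (addr >= ADDR_COUNT)
            return -EINVAL;
        mask = 1UL << (addr % WORD_BITS);
        /* fetch_or returns the previous word, so we learn who set the bit */
        old = atomic_fetch_or(&l->bits[addr / WORD_BITS], mask);
        return (old & mask) ? -EBUSY : 0;
    }

    /* Release a previously claimed address. */
    static void addr_unlock(struct addr_locks *l, unsigned int addr)
    {
        if (addr < ADDR_COUNT)
            atomic_fetch_and(&l->bits[addr / WORD_BITS],
                             ~(1UL << (addr % WORD_BITS)));
    }
    ```

    Because each address has its own bit, nested or parallel instantiation
    at different addresses never blocks, while a second attempt at the same
    address fails immediately instead of waiting on a mutex.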
    
    Note: This patch version includes small cosmetic changes to the Tested-by
          version, only functional change is that address locking is supported
          for slave addresses too.
    
    [0] https://lore.kernel.org/linux-i2c/[email protected]/T/#m12706546e8e2414d8f1a0dc61c53393f731685cc
    
    Fixes: caba40ec3531 ("eeprom: at24: Probe for DDR3 thermal sensor in the SPD case")
    Cc: [email protected]
    Tested-by: Krzysztof Piotr Oledzki <[email protected]>
    Signed-off-by: Heiner Kallweit <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled [+ + +]
Author: Kimriver Liu <[email protected]>
Date:   Fri Sep 13 11:31:46 2024 +0800

    i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    
    commit 5d69d5a00f80488ddcb4dee7d1374a0709398178 upstream.
    
    It was observed that issuing the ABORT bit (IC_ENABLE[1]) will not
    work when IC_ENABLE is already disabled.
    
    Check if the ENABLE bit (IC_ENABLE[0]) is disabled when the controller
    is holding SCL low. If the ENABLE bit is disabled, the software needs
    to enable it before trying to issue the ABORT bit. Otherwise,
    the controller ignores any write to the ABORT bit.
    
    These kernel logs show up whenever an I2C transaction is
    attempted after this failure.
    i2c_designware e95e0000.i2c: timeout waiting for bus ready
    i2c_designware e95e0000.i2c: timeout in disabling adapter
    
    The patch fixes the issue where the controller cannot be disabled
    while SCL is held low if the ENABLE bit is already disabled.
    
    Fixes: 2409205acd3c ("i2c: designware: fix __i2c_dw_disable() in case master is holding SCL low")
    Signed-off-by: Kimriver Liu <[email protected]>
    Cc: <[email protected]> # v6.6+
    Reviewed-by: Mika Westerberg <[email protected]>
    Acked-by: Jarkko Nikula <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:34:59 2024 +0800

    i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq()
    
    commit e2c85d85a05f16af2223fcc0195ff50a7938b372 upstream.
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() disables IRQ auto-enable when the IRQ is requested.
    
    Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Cc: <[email protected]> # v4.19+
    Acked-by: Mukesh Kumar Savaliya <[email protected]>
    Reviewed-by: Vladimir Zapolskiy <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume [+ + +]
Author: Marek Vasut <[email protected]>
Date:   Mon Sep 30 21:27:41 2024 +0200

    i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
    
    commit 048bbbdbf85e5e00258dfb12f5e368f908801d7b upstream.
    
    In case there is any sort of clock controller attached to this I2C bus
    controller, for example Versaclock or even an AIC32x4 I2C codec, then
    an I2C transfer triggered from the clock controller clk_ops .prepare
    callback may trigger a deadlock on drivers/clk/clk.c prepare_lock mutex.
    
    This is because the clock controller first grabs the prepare_lock mutex
    and then performs the prepare operation, including its I2C access. The
    I2C access resumes this I2C bus controller via .runtime_resume callback,
    which calls clk_prepare_enable(), which attempts to grab the prepare_lock
    mutex again and deadlocks.
    
    Since the clocks are already prepared in probe() and unprepared in
    remove(), use simple clk_enable()/clk_disable() calls to enable and
    disable the clock on runtime suspend and resume, to avoid hitting the
    prepare_lock mutex.
    
    Acked-by: Alain Volmat <[email protected]>
    Signed-off-by: Marek Vasut <[email protected]>
    Fixes: 4e7bca6fc07b ("i2c: i2c-stm32f7: add PM Runtime support")
    Cc: <[email protected]> # v5.0+
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: synquacer: Deal with optional PCLK correctly [+ + +]
Author: Ard Biesheuvel <[email protected]>
Date:   Thu Sep 12 12:46:31 2024 +0200

    i2c: synquacer: Deal with optional PCLK correctly
    
    commit f2990f8630531a99cad4dc5c44cb2a11ded42492 upstream.
    
    ACPI boot does not provide clocks and regulators, but instead, provides
    the PCLK rate directly, and enables the clock in firmware. So deal
    gracefully with this.
    
    Fixes: 55750148e559 ("i2c: synquacer: Fix an error handling path in synquacer_i2c_probe()")
    Cc: [email protected] # v6.10+
    Cc: Andi Shyti <[email protected]>
    Cc: Christophe JAILLET <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 11:42:50 2024 +0800

    i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 0c8d604dea437b69a861479b413d629bc9b3da70 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
    
    Fixes: 36ecbcab84d0 ("i2c: xiic: Implement power management")
    Cc: <[email protected]> # v4.6+
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: Wait for TX empty to avoid missed TX NAKs [+ + +]
Author: Robert Hancock <[email protected]>
Date:   Tue Nov 21 18:11:16 2023 +0000

    i2c: xiic: Wait for TX empty to avoid missed TX NAKs
    
    commit 521da1e9225450bd323db5fa5bca942b1dc485b7 upstream.
    
    Frequently an I2C write will be followed by a read, such as a register
    address write followed by a read of the register value. In this driver,
    when the TX FIFO half empty interrupt was raised and it was determined
    that there was enough space in the TX FIFO to send the following read
    command, it would do so without waiting for the TX FIFO to actually
    empty.
    
    Unfortunately it appears that in some cases this can result in a NAK
    that was raised by the target device on the write, such as due to an
    unsupported register address, being ignored and the subsequent read
    being done anyway. This can potentially put the I2C bus into an
    invalid state and/or result in invalid read data being processed.
    
    To avoid this, once a message has been fully written to the TX FIFO,
    wait for the TX FIFO empty interrupt before moving on to the next
    message, to ensure NAKs are handled properly.
    
    Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
    Signed-off-by: Robert Hancock <[email protected]>
    Cc: <[email protected]> # v2.6.34+
    Reviewed-by: Manikanta Guntupalli <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition [+ + +]
Author: Kaixin Wang <[email protected]>
Date:   Sun Sep 15 00:39:33 2024 +0800

    i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition
    
    commit 61850725779709369c7e907ae8c7c75dc7cec4f3 upstream.
    
    In the svc_i3c_master_probe function, &master->hj_work is bound with
    svc_i3c_master_hj_work, &master->ibi_work is bound with
    svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work can start the
    hj_work, svc_i3c_master_irq_handler can start the ibi_work.
    
    If we remove the module, which calls svc_i3c_master_remove to clean
    up, master->base is freed through i3c_master_unregister while the
    work mentioned above may still be in use. The sequence of operations
    that may lead to a UAF bug is as follows:
    
    CPU0                                         CPU1
    
                                        | svc_i3c_master_hj_work
    svc_i3c_master_remove               |
    i3c_master_unregister(&master->base)|
    device_unregister(&master->dev)     |
    device_release                      |
    //free master->base                 |
                                        | i3c_master_do_daa(&master->base)
                                        | //use master->base
    
    Fix it by ensuring that the work is canceled before proceeding with the
    cleanup in svc_i3c_master_remove.
    
    Fixes: 0f74f8b6675c ("i3c: Make i3c_master_unregister() return void")
    Cc: [email protected]
    Signed-off-by: Kaixin Wang <[email protected]>
    Reviewed-by: Miquel Raynal <[email protected]>
    Reviewed-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/stable/20240914154030.180-1-kxwang23%40m.fudan.edu.cn
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node() [+ + +]
Author: Aleksandr Mishin <[email protected]>
Date:   Wed Jul 10 15:39:49 2024 +0300

    ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
    
    [ Upstream commit 62fdaf9e8056e9a9e6fe63aa9c816ec2122d60c6 ]
    
    In ice_sched_add_root_node() and ice_sched_add_node() there are calls to
    devm_kcalloc() in order to allocate memory for array of pointers to
    'ice_sched_node' structure. But incorrect types are used as sizeof()
    arguments in these calls (structures instead of pointers) which leads to
    over allocation of memory.
    
    Adjust over allocation of memory by correcting types in devm_kcalloc()
    sizeof() arguments.
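
    The difference between the two sizeof() forms can be shown with a
    small user-space sketch. The struct node type and
    node_alloc_children() below are hypothetical stand-ins, and calloc()
    is used in place of devm_kcalloc(); this is not the ice driver code.

    ```c
    #include <stdlib.h>

    /* Hypothetical tree node holding an array of pointers to its children. */
    struct node {
        struct node *parent;
        struct node **children;
        unsigned int num_children;
        unsigned char payload[64];
    };

    /*
     * Allocate room for max_children child pointers.
     * sizeof(*n->children) is the size of one pointer, which is what the
     * array stores; sizeof(struct node) would reserve a whole node per slot.
     */
    static int node_alloc_children(struct node *n, size_t max_children)
    {
        n->children = calloc(max_children, sizeof(*n->children));
        return n->children ? 0 : -1;
    }
    ```

    Writing the element size as sizeof(*ptr) keeps the allocation tied to
    the pointer's own type, which makes this class of mistake harder to
    reintroduce.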
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ieee802154: Fix build error [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 9 21:17:40 2024 +0800

    ieee802154: Fix build error
    
    [ Upstream commit addf89774e48c992316449ffab4f29c2309ebefb ]
    
    If REGMAP_SPI is m and IEEE802154_MCR20A is y,
    
            mcr20a.c:(.text+0x3ed6c5b): undefined reference to `__devm_regmap_init_spi'
            ld: mcr20a.c:(.text+0x3ed6cb5): undefined reference to `__devm_regmap_init_spi'
    
    Select REGMAP_SPI for IEEE802154_MCR20A to fix it.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iio: magnetometer: ak8975: Fix reading for ak099xx sensors [+ + +]
Author: Barnabás Czémán <[email protected]>
Date:   Mon Aug 19 00:29:40 2024 +0200

    iio: magnetometer: ak8975: Fix reading for ak099xx sensors
    
    commit 129464e86c7445a858b790ac2d28d35f58256bbe upstream.
    
    Move ST2 reading with overflow handling after measurement data
    reading.
    The ST2 register has to be read after the measurement data is read,
    because reading it marks the end of the read and releases the lock on
    the data.
    Remove the ST2 read skip on interrupt based waiting because ST2 is
    required to be read out at the end of the axis read.
    
    Fixes: 57e73a423b1e ("iio: ak8975: add ak09911 and ak09912 support")
    Signed-off-by: Barnabás Czémán <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: pressure: bmp280: Fix regmap for BMP280 device [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:49 2024 +0200

    iio: pressure: bmp280: Fix regmap for BMP280 device
    
    commit b9065b0250e1705935445ede0a18c1850afe7b75 upstream.
    
    Up to now, the BMP280 device is using the regmap of the BME280 which
    has registers that exist only in the BME280 device.
    
    Fixes: 14e8015f8569 ("iio: pressure: bmp280: split driver in logical parts")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: pressure: bmp280: Fix waiting time for BMP3xx configuration [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:50 2024 +0200

    iio: pressure: bmp280: Fix waiting time for BMP3xx configuration
    
    commit 262a6634bcc4f0c1c53d13aa89882909f281a6aa upstream.
    
    According to the datasheet, both pressure and temperature can go up to
    oversampling x32. With this option, the maximum measurement time is not
    80ms (this is for press x32 and temp x2), but it is 130ms nominal
    (calculated from table 3.9.2) and since most of the maximum values
    are around +15%, it is configured to 150ms.
    
    Fixes: 8d329309184d ("iio: pressure: bmp280: Add support for BMP380 sensor family")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Input: adp5589-keys - fix adp5589_gpio_get_value() [+ + +]
Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:47:23 2024 -0700

    Input: adp5589-keys - fix adp5589_gpio_get_value()
    
    commit c684771630e64bc39bddffeb65dd8a6612a6b249 upstream.
    
    The adp5589 seems to have the same behavior as similar devices as
    explained in commit 910a9f5636f5 ("Input: adp5588-keys - get value from
    data out when dir is out").
    
    Basically, when the gpio is set as output we need to get the value from
    ADP5589_GPO_DATA_OUT_A register instead of ADP5589_GPI_STATUS_A.
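
    As a rough illustration of the rule above, the register selection can
    be modelled in user space as follows. The gpio_regs layout, the bank
    count and gpio_get_value() are assumptions made for this sketch; the
    real driver reads the registers over I2C.

    ```c
    #include <stdint.h>

    #define GPIO_BANKS 3 /* assumed bank count for this sketch */

    /* Hypothetical register snapshot, one byte per bank of 8 pins. */
    struct gpio_regs {
        uint8_t direction[GPIO_BANKS];  /* bit set = pin is an output */
        uint8_t data_out[GPIO_BANKS];   /* values driven on output pins */
        uint8_t gpi_status[GPIO_BANKS]; /* values sampled on input pins */
    };

    /* Return the pin level (0 or 1), or -1 if the pin number is out of range. */
    static int gpio_get_value(const struct gpio_regs *regs, unsigned int pin)
    {
        unsigned int bank = pin / 8;
        uint8_t bit = (uint8_t)(1u << (pin % 8));
        uint8_t val;

        if (bank >= GPIO_BANKS)
            return -1;
        /* Outputs are read back from data-out, inputs from the status register. */
        if (regs->direction[bank] & bit)
            val = regs->data_out[bank];
        else
            val = regs->gpi_status[bank];
        return !!(val & bit);
    }
    ```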
    
    Fixes: 9d2e173644bb ("Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-2-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Input: adp5589-keys - fix NULL pointer dereference [+ + +]
Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:46:44 2024 -0700

    Input: adp5589-keys - fix NULL pointer dereference
    
    commit fb5cc65f973661241e4a2b7390b429aa7b330c69 upstream.
    
    We register a devm action to call adp5589_clear_config() and then pass
    the i2c client as argument so that we can call i2c_get_clientdata() in
    order to get our device object. However, i2c_set_clientdata() is only
    called at the end of the probe function, which means that we'll get a
    NULL pointer dereference in case the probe function fails early.
    
    Fixes: 30df385e35a4 ("Input: adp5589-keys - use devm_add_action_or_reset() for register clear")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-1-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake [+ + +]
Author: Kai-Heng Feng <[email protected]>
Date:   Tue Aug 20 12:11:28 2024 +0800

    intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake
    
    [ Upstream commit 5bb33212b5c664396e5de4cd5a2999abb84a3978 ]
    
    PCIe ethernet throughput is sub-optimal on Jasper Lake and Elkhart Lake.
    
    The CPU can take a long time to exit to C0 to handle IRQ and perform DMA
    when C1E has been entered.
    
    For this reason, adjust intel_idle to disable promotion to C1E and still
    use C-states from ACPI _CST on those two platforms.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219023
    Signed-off-by: Kai-Heng Feng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
io_uring/net: harden multishot termination case for recv [+ + +]
Author: Jens Axboe <[email protected]>
Date:   Thu Sep 26 07:08:10 2024 -0600

    io_uring/net: harden multishot termination case for recv
    
    commit c314094cb4cfa6fc5a17f4881ead2dfebfa717a7 upstream.
    
    If the recv returns zero, or an error, then it doesn't matter if more
    data has already been received for this buffer. A condition like that
    should terminate the multishot receive. Rather than pass in the
    collected return value, pass in whether to terminate or keep the recv
    going separately.
    
    Note that this isn't a bug right now, as the only way to get there is
    via setting MSG_WAITALL with multishot receive. And if an application
    does that, then -EINVAL is returned anyway. But it seems like an easy
    bug to introduce, so let's make it a bit more explicit.
    
    Link: https://github.com/axboe/liburing/issues/1246
    Cc: [email protected]
    Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
io_uring: fix memory leak when cache init fail [+ + +]
Author: Guixin Liu <[email protected]>
Date:   Mon Sep 23 18:05:12 2024 +0800

    io_uring: fix memory leak when cache init fail
    
    [ Upstream commit 3a87e264290d71ec86a210ab3e8d23b715ad266d ]
    
    When cache init fails, exit the percpu ref to free the data memory
    within struct percpu_ref.
    
    Fixes: 206aefde4f88 ("io_uring: reduce/pack size of io_ring_ctx")
    Signed-off-by: Guixin Liu <[email protected]>
    Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iomap: constrain the file range passed to iomap_file_unshare [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Wed Oct 2 08:02:13 2024 -0700

    iomap: constrain the file range passed to iomap_file_unshare
    
    [ Upstream commit a311a08a4237241fb5b9d219d3e33346de6e83e0 ]
    
    File contents can only be shared (i.e. reflinked) below EOF, so it makes
    no sense to try to unshare ranges beyond EOF.  Constrain the file range
    parameters here so that we don't have to do that in the callers.
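
    The clamping described above amounts to the following arithmetic. This
    is a standalone sketch with a hypothetical helper name and plain
    int64_t offsets, not the iomap function itself.

    ```c
    #include <stdint.h>

    /*
     * Return how many bytes of [pos, pos + len) lie below EOF (size).
     * A result of 0 means there is nothing to unshare.
     */
    static int64_t unshare_len_below_eof(int64_t pos, int64_t len, int64_t size)
    {
        if (pos < 0 || len <= 0 || pos >= size)
            return 0;
        return len < size - pos ? len : size - pos;
    }
    ```

    For example, with a 50-byte file a request for 100 bytes at offset 0
    is trimmed to 50 bytes, and a request starting at offset 60 becomes a
    no-op.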
    
    Fixes: 5f4e5752a8a3 ("fs: add iomap_file_dirty")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Link: https://lore.kernel.org/r/20241002150213.GC21853@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Brian Foster <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Sep 10 07:39:03 2024 +0300

    iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release
    
    [ Upstream commit 7a9d43eace888a0ee6095035997bb138425844d3 ]
    
    When direct I/O completion invalidates the page cache, it holds neither the
    i_rwsem nor the invalidate_lock so it can be racing with
    iomap_write_delalloc_release.  If the search for the end of the region that
    contains data returns the start offset we hit such a race and just need to
    look for the end of the newly created hole instead.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iommu/arm-smmu-v3: Do not use devm for the cd table allocations [+ + +]
Author: Jason Gunthorpe <[email protected]>
Date:   Fri Sep 6 12:47:52 2024 -0300

    iommu/arm-smmu-v3: Do not use devm for the cd table allocations
    
    [ Upstream commit 47b2de35cab2b683f69d03515c2658c2d8515323 ]
    
    The master->cd_table is entirely contained within the struct
    arm_smmu_master which is guaranteed to be freed by the core code under
    arm_smmu_release_device().
    
    There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably
    called to free the CD related memory. Remove it and save some memory.
    
    Tested-by: Nicolin Chen <[email protected]>
    Reviewed-by: Nicolin Chen <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/arm-smmu-v3: Match Stall behaviour for S2 [+ + +]
Author: Mostafa Saleh <[email protected]>
Date:   Fri Aug 30 11:03:47 2024 +0000

    iommu/arm-smmu-v3: Match Stall behaviour for S2
    
    [ Upstream commit ce7cb08e22e09f43649b025c849a3ae3b80833c4 ]
    
    According to the spec (ARM IHI 0070 F.b), in
    "5.5 Fault configuration (A, R, S bits)":
        A STE with stage 2 translation enabled and STE.S2S == 0 is
        considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.
    
    Also described in the pseudocode “SteIllegal()”
        if STE.Config == '11x' then
            [..]
            if eff_idr0_stall_model == '10' && STE.S2S == '0' then
                // stall_model forcing stall, but S2S == 0
                return TRUE;
    
    Which means, S2S must be set when stall model is
    "ARM_SMMU_FEAT_STALL_FORCE", but currently the driver ignores that.
    
    Although, the driver can do the minimum and only set S2S for
    “ARM_SMMU_FEAT_STALL_FORCE”, it is more consistent to match S1
    behaviour, which also sets it for “ARM_SMMU_FEAT_STALL” if the
    master has requested stalls.
    
    Also, since S2 stalls are enabled now, report them to the IOMMU layer
    and for VFIO devices it will fail anyway as VFIO doesn’t register an
    iopf handler.
    
    Signed-off-by: Mostafa Saleh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iommu/vt-d: Always reserve a domain ID for identity setup [+ + +]
Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:13 2024 +0800

    iommu/vt-d: Always reserve a domain ID for identity setup
    
    [ Upstream commit 2c13012e09190174614fd6901857a1b8c199e17d ]
    
    We will use a global static identity domain. Reserve a static domain ID
    for it.
    
    Signed-off-by: Lu Baolu <[email protected]>
    Reviewed-by: Jason Gunthorpe <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Reviewed-by: Jerry Snitselaar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count [+ + +]
Author: Sanjay K Kumar <[email protected]>
Date:   Mon Sep 2 10:27:18 2024 +0800

    iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count
    
    [ Upstream commit 3cf74230c139f208b7fb313ae0054386eee31a81 ]
    
    If qi_submit_sync() is invoked with 0 invalidation descriptors (for
    instance, for DMA draining purposes), we can run into a bug where a
    submitting thread fails to detect the completion of invalidation_wait.
    Subsequently, this led to a soft lockup. Currently, there is no impact
    by this bug on the existing users because no callers are submitting
    invalidations with 0 descriptors. This fix will enable future users
    (such as DMA drain) calling qi_submit_sync() with 0 count.
    
    Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while
    concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both
    threads then enter a while loop, waiting for their respective descriptors
    to complete. T1 detects its completion (i.e., T1's invalidation_wait status
    changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to
    reclaim all descriptors, potentially including adjacent ones of other
    threads that are also marked as QI_DONE.
    
    During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU
    hardware may complete the invalidation for T2, setting its status to
    QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's
    invalidation_wait descriptor and changes its status to QI_FREE, T2 will
    not observe the QI_DONE status for its invalidation_wait and will
    indefinitely remain stuck.
    
    This soft lockup does not occur when only non-zero descriptors are
    submitted. In such cases, invalidation descriptors are interspersed among
    wait descriptors with the status QI_IN_USE, acting as barriers. These
    barriers prevent the reclaim code from mistakenly freeing descriptors
    belonging to other submitters.
    
    Consider the following example timeline:
            T1                      T2
    ========================================
            ID1
            WD1
            while(WD1!=QI_DONE)
            unlock
                                    lock
            WD1=QI_DONE*            WD2
                                    while(WD2!=QI_DONE)
                                    unlock
            lock
            WD1==QI_DONE?
            ID1=QI_DONE             WD2=DONE*
            reclaim()
            ID1=FREE
            WD1=FREE
            WD2=FREE
            unlock
                                    soft lockup! T2 never sees QI_DONE in WD2
    
    Where:
    ID = invalidation descriptor
    WD = wait descriptor
    * Written by hardware
    
    The root of the problem is that the descriptor status QI_DONE flag is used
    for two conflicting purposes:
    1. signal a descriptor is ready for reclaim (to be freed)
    2. signal by the hardware that a wait descriptor is complete
    
    The solution (in this patch) is state separation by using QI_FREE flag
    for #1.
    
    Once a thread's invalidation descriptors are complete, their status would
    be set to QI_FREE. The reclaim_free_desc() function would then only
    free descriptors marked as QI_FREE instead of those marked as
    QI_DONE. This change ensures that T2 (from the previous example) will
    correctly observe the completion of its invalidation_wait (marked as
    QI_DONE).
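
    The state separation can be modelled with a toy ring in user space.
    Everything below (struct ring, the DESC_* states, ring_alloc(),
    ring_release(), ring_reclaim()) is a hypothetical single-threaded
    sketch of the idea, not the VT-d queued invalidation code; locking
    and the hardware are left out.

    ```c
    #define RING_LEN 8

    enum desc_status { DESC_FREE, DESC_IN_USE, DESC_DONE };

    struct ring {
        enum desc_status status[RING_LEN];
        int free_head; /* next slot to hand out */
        int free_tail; /* oldest slot not yet reclaimed */
        int free_cnt;  /* slots available for allocation */
    };

    static void ring_init(struct ring *r)
    {
        for (int i = 0; i < RING_LEN; i++)
            r->status[i] = DESC_FREE;
        r->free_head = r->free_tail = 0;
        r->free_cnt = RING_LEN;
    }

    /* Hand out the next slot, or -1 if the ring is full. */
    static int ring_alloc(struct ring *r)
    {
        int idx;

        if (r->free_cnt == 0)
            return -1;
        idx = r->free_head;
        r->status[idx] = DESC_IN_USE;
        r->free_head = (idx + 1) % RING_LEN;
        r->free_cnt--;
        return idx;
    }

    /*
     * Reclaim from the tail, but only slots whose owner marked them
     * DESC_FREE. A DESC_DONE slot still belongs to a waiter that has not
     * observed completion yet, so reclaim stops there.
     */
    static void ring_reclaim(struct ring *r)
    {
        while (r->free_cnt < RING_LEN &&
               r->status[r->free_tail] == DESC_FREE) {
            r->free_tail = (r->free_tail + 1) % RING_LEN;
            r->free_cnt++;
        }
    }

    /* The owner releases count consecutive slots starting at first. */
    static void ring_release(struct ring *r, int first, int count)
    {
        for (int i = 0; i < count; i++)
            r->status[(first + i) % RING_LEN] = DESC_FREE;
        ring_reclaim(r);
    }
    ```

    Replaying the timeline above in this model, T1 releasing ID1 and WD1
    stops the reclaim at WD2, which keeps its "done" status until T2
    itself releases it.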
    
    Signed-off-by: Sanjay K Kumar <[email protected]>
    Signed-off-by: Jacob Pan <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Unconditionally flush device TLB for pasid table updates [+ + +]
Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:20 2024 +0800

    iommu/vt-d: Unconditionally flush device TLB for pasid table updates
    
    [ Upstream commit 1f5e307ca16c0c19186cbd56ac460a687e6daba0 ]
    
    The caching mode of an IOMMU is irrelevant to the behavior of the device
    TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode
    check before device TLB flush") removed this redundant check in the
    domain unmap path.
    
    Checking the caching mode before flushing the device TLB after a pasid
    table entry is updated is unnecessary and can lead to inconsistent
    behavior.
    
    Extend this consistency by removing the caching mode check in the pasid
    table update path.
    
    Suggested-by: Yi Liu <[email protected]>
    Signed-off-by: Lu Baolu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR). [+ + +]
Author: Kuniyuki Iwashima <[email protected]>
Date:   Fri Aug 9 16:54:02 2024 -0700

    ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
    
    [ Upstream commit e3af3d3c5b26c33a7950e34e137584f6056c4319 ]
    
    dev->ip_ptr could be NULL if we set an invalid MTU.
    
    Even then, if we issue ioctl(SIOCSIFADDR) for a new IPv4 address,
    devinet_ioctl() allocates struct in_ifaddr and fails later in
    inet_set_ifa() because in_dev is NULL.
    
    Let's move the check earlier.
    
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: ip_gre: Fix drops of small packets in ipgre_xmit [+ + +]
Author: Anton Danilov <[email protected]>
Date:   Wed Sep 25 02:51:59 2024 +0300

    ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
    
    [ Upstream commit c4a14f6d9d17ad1e41a36182dd3b8a5fd91efbd7 ]
    
    Regression Description:
    
    Depending on the options specified for the GRE tunnel device, small
    packets may be dropped. This occurs because the pskb_network_may_pull
    function fails due to the packet's insufficient length.
    
    For example, if only the okey option is specified for the tunnel device,
    original (before encapsulation) packets smaller than 28 bytes (including
    the IPv4 header) will be dropped. This happens because the required
    length is calculated relative to the network header, not the skb->head.
    
    Here is how the required length is computed and checked:
    
    * The pull_len variable is set to 28 bytes, consisting of:
      * IPv4 header: 20 bytes
      * GRE header with Key field: 8 bytes
    
    * The pskb_network_may_pull function adds the network offset, shifting
    the checkable space further to the beginning of the network header and
    extending it to the beginning of the packet. As a result, the end of
    the checkable space occurs beyond the actual end of the packet.
    
    Instead of ensuring that 28 bytes are present in skb->head, the function
    is requesting these 28 bytes starting from the network header. For small
    packets, this requested length exceeds the actual packet size, causing
    the check to fail and the packets to be dropped.
    
    This issue affects both locally originated and forwarded packets in
    DMVPN-like setups.
    
    How to reproduce (for locally originated packets):
    
      ip link add dev gre1 type gre ikey 1.9.8.4 okey 1.9.8.4 \
              local <your-ip> remote 0.0.0.0
    
      ip link set mtu 1400 dev gre1
      ip link set up dev gre1
      ip address add 192.168.13.1/24 dev gre1
      ip neighbor add 192.168.13.2 lladdr <remote-ip> dev gre1
      ping -s 1374 -c 10 192.168.13.2
      tcpdump -vni gre1
      tcpdump -vni <your-ext-iface> 'ip proto 47'
      ip -s -s -d link show dev gre1
    
    Solution:
    
    Use the pskb_may_pull function instead of pskb_network_may_pull.
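    
    The difference between the two checks can be modelled with a toy
    structure. The sketch assumes that the 28-byte tunnel header sits in
    front of the network header; struct toy_skb and both helpers are
    hypothetical stand-ins that only track lengths and ignore the real
    functions' handling of non-linear data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an skb: only the two lengths the check depends on. */
struct toy_skb {
    size_t len;            /* bytes available starting at the data pointer */
    size_t network_offset; /* offset of the network header from that pointer */
};

/* Models pskb_may_pull(): are 'len' bytes present from the data pointer? */
static bool toy_may_pull(const struct toy_skb *skb, size_t len)
{
    return len <= skb->len;
}

/* Models pskb_network_may_pull(): the same check shifted by the offset. */
static bool toy_network_may_pull(const struct toy_skb *skb, size_t len)
{
    return toy_may_pull(skb, skb->network_offset + len);
}
```

    With a 20-byte original packet behind a 28-byte tunnel header, the
    shifted check asks for 56 bytes out of 48 and fails, while the plain
    check asks for 28 and succeeds.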
    
    Fixes: 80d875cfc9d3 ("ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()")
    Signed-off-by: Anton Danilov <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family [+ + +]
Author: Ido Schimmel <[email protected]>
Date:   Wed Aug 14 15:52:22 2024 +0300

    ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
    
    [ Upstream commit 8fed54758cd248cd311a2b5c1e180abef1866237 ]
    
    The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
    lookup according to user provided parameters and communicate the result
    back to user space.
    
    However, unlike other users of the FIB lookup API, the upper DSCP bits
    and the ECN bits of the DS field are not masked, which can result in the
    wrong result being returned.
    
    Solve this by masking the upper DSCP bits and the ECN bits using
    IPTOS_RT_MASK.
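    
    As a small illustration of the masking, the sketch below applies the
    mask to a DS field value. The TOY_ macros mirror the kernel values as
    understood here (IPTOS_TOS_MASK is 0x1E, and IPTOS_RT_MASK clears its
    two low bits); the helper name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel definitions; renamed to avoid clashes. */
#define TOY_IPTOS_TOS_MASK 0x1E
#define TOY_IPTOS_RT_MASK  (TOY_IPTOS_TOS_MASK & ~3) /* 0x1C */

/* Keep only the bits the FIB lookup is meant to see: the three upper DSCP
 * bits (0xE0) and the two ECN bits (0x03) are cleared. */
static uint8_t mask_tos_for_fib_lookup(uint8_t tos)
{
    return tos & TOY_IPTOS_RT_MASK;
}
```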
    
    The structure that communicates the request and the response is not
    exported to user space, so it is unlikely that this netlink family is
    actually in use [1].
    
    [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
    
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Guillaume Nault <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit [+ + +]
Author: Kemeng Shi <[email protected]>
Date:   Thu Aug 1 09:38:08 2024 +0800

    jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit
    
    commit f0e3c14802515f60a47e6ef347ea59c2733402aa upstream.
    
    Use tid_geq to compare tids to work over sequence number wraps.
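    
    To see why a plain ">=" is not enough, consider the sketch below. It
    assumes a 32-bit unsigned tid and follows the usual "signed difference"
    idiom; the typedef is a local stand-in rather than the jbd2 header.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t tid_t; /* local stand-in for the journal's tid type */

/*
 * Compare sequence numbers by the sign of their difference, so that a tid
 * which has wrapped past zero still counts as newer than one just below
 * the wrap point. The conversion to int32_t relies on the two's-complement
 * behaviour that mainstream compilers provide.
 */
static int tid_geq(tid_t x, tid_t y)
{
    int32_t difference = (int32_t)(x - y);

    return difference >= 0;
}
```

    For example, tid 2 issued shortly after a wrap compares as newer than
    tid 0xFFFFFFFE with tid_geq(), whereas a direct ">=" gets it wrong.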
    
    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhang Yi <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Jul 18 19:53:36 2024 +0800

    jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
    
    commit f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a upstream.
    
    In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
    to recover some journal space. But if an error occurs while executing
    jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
    space right away. Instead we try other branches, and if
    j_committing_transaction is NULL (i.e., the tid is 0), we get the
    following complaint:
    
    ============================================
    JBD2: I/O error when updating journal superblock for sdd-8.
    __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
    __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
    Modules linked in:
    CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
    RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
    Call Trace:
     <TASK>
     add_transaction_credits+0x5d1/0x5e0
     start_this_handle+0x1ef/0x6a0
     jbd2__journal_start+0x18b/0x340
     ext4_dirty_inode+0x5d/0xb0
     __mark_inode_dirty+0xe4/0x5d0
     generic_update_time+0x60/0x70
    [...]
    ============================================
    
    So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
    clean up at the moment, continue to try to reclaim free space in other ways.
    
    Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
    when updating journal superblock fails") to make jbd2_cleanup_journal_tail
    return the correct error code.
    
    Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
jfs: check if leafidx greater than num leaves per dmap tree [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 09:25:23 2024 +0800

    jfs: check if leafidx greater than num leaves per dmap tree
    
    [ Upstream commit d64ff0d2306713ff084d4b09f84ed1a8c75ecc32 ]
    
    syzbot reported an out-of-bounds access in dbSplit, which occurs because
    dmt_leafidx is greater than the number of leaves per dmap tree. Add a
    check for dmt_leafidx in dbFindLeaf.
    
    Shaggy:
    Modified sanity check to apply to control pages as well as leaf pages.
    
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=dca05492eff41f604890
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uaf in dbFreeBits [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 10:50:48 2024 +0800

    jfs: Fix uaf in dbFreeBits
    
    [ Upstream commit d6c1b3599b2feb5c7291f5ac3a36e5fa7cedb234 ]
    
    [syzbot reported]
    ==================================================================
    BUG: KASAN: slab-use-after-free in __mutex_lock_common kernel/locking/mutex.c:587 [inline]
    BUG: KASAN: slab-use-after-free in __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
    Read of size 8 at addr ffff8880229254b0 by task syz-executor357/5216
    
    CPU: 0 UID: 0 PID: 5216 Comm: syz-executor357 Not tainted 6.11.0-rc3-syzkaller-00156-gd7a5aa4b3c00 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0x169/0x550 mm/kasan/report.c:488
     kasan_report+0x143/0x180 mm/kasan/report.c:601
     __mutex_lock_common kernel/locking/mutex.c:587 [inline]
     __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
     dbFreeBits+0x7ea/0xd90 fs/jfs/jfs_dmap.c:2390
     dbFreeDmap fs/jfs/jfs_dmap.c:2089 [inline]
     dbFree+0x35b/0x680 fs/jfs/jfs_dmap.c:409
     dbDiscardAG+0x8a9/0xa20 fs/jfs/jfs_dmap.c:1650
     jfs_ioc_trim+0x433/0x670 fs/jfs/jfs_discard.c:100
     jfs_ioctl+0x2d0/0x3e0 fs/jfs/ioctl.c:131
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:907 [inline]
     __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
    
    Freed by task 5218:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
     kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
     poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
     __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2252 [inline]
     slab_free mm/slub.c:4473 [inline]
     kfree+0x149/0x360 mm/slub.c:4594
     dbUnmount+0x11d/0x190 fs/jfs/jfs_dmap.c:278
     jfs_mount_rw+0x4ac/0x6a0 fs/jfs/jfs_mount.c:247
     jfs_remount+0x3d1/0x6b0 fs/jfs/super.c:454
     reconfigure_super+0x445/0x880 fs/super.c:1083
     vfs_cmd_reconfigure fs/fsopen.c:263 [inline]
     vfs_fsconfig_locked fs/fsopen.c:292 [inline]
     __do_sys_fsconfig fs/fsopen.c:473 [inline]
     __se_sys_fsconfig+0xb6e/0xf80 fs/fsopen.c:345
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [Analysis]
    Two paths (dbUnmount and jfs_ioc_trim) race when accessing bmap, which
    leads to the use-after-free.
    
    Use the s_umount lock to synchronize them and avoid the use-after-free
    caused by the race.
    
    Reported-and-tested-by: [email protected]
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uninit-value access of new_ea in ea_buffer [+ + +]
Author: Zhao Mengmeng <[email protected]>
Date:   Wed Sep 4 09:07:58 2024 +0800

    jfs: Fix uninit-value access of new_ea in ea_buffer
    
    [ Upstream commit 2b59ffad47db1c46af25ccad157bb3b25147c35c ]
    
    syzbot reports that lzo1x_1_do_compress is using uninit-value:
    
    =====================================================
    BUG: KMSAN: uninit-value in lzo1x_1_do_compress+0x19f9/0x2510 lib/lzo/lzo1x_compress.c:178
    
    ...
    
    Uninit was stored to memory at:
     ea_put fs/jfs/xattr.c:639 [inline]
    
    ...
    
    Local variable ea_buf created at:
     __jfs_setxattr+0x5d/0x1ae0 fs/jfs/xattr.c:662
     __jfs_xattr_set+0xe6/0x1f0 fs/jfs/xattr.c:934
    
    =====================================================
    
    The reason is that ea_buf->new_ea is not initialized properly.
    
    Fix this by using memset to clear its content at the beginning
    of ea_get().
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=02341e0daa42a15ce130
    Signed-off-by: Zhao Mengmeng <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: UBSAN: shift-out-of-bounds in dbFindBits [+ + +]
Author: Remington Brasga <[email protected]>
Date:   Wed Jul 10 00:12:44 2024 +0000

    jfs: UBSAN: shift-out-of-bounds in dbFindBits
    
    [ Upstream commit b0b2fc815e514221f01384f39fbfbff65d897e1c ]
    
    Fix issue with UBSAN throwing shift-out-of-bounds warning.
    
    Reported-by: [email protected]
    Signed-off-by: Remington Brasga <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
jump_label: Fix static_key_slow_dec() yet again [+ + +]
Author: Peter Zijlstra <[email protected]>
Date:   Mon Sep 9 12:50:09 2024 +0200

    jump_label: Fix static_key_slow_dec() yet again
    
    [ Upstream commit 1d7f856c2ca449f04a22d876e36b464b7a9d28b6 ]
    
    While commit 83ab38ef0a0b ("jump_label: Fix concurrency issues in
    static_key_slow_dec()") fixed one problem, it created yet another,
    notably the following is now possible:
    
      slow_dec
        if (try_dec) // dec_not_one-ish, false
        // enabled == 1
                                    slow_inc
                                      if (inc_not_disabled) // inc_not_zero-ish
                                      // enabled == 2
                                        return
    
        guard(mutex)(&jump_label_mutex);
        if (atomic_cmpxchg(1,0)==1) // false, we're 2
    
                                    slow_dec
                                      if (try_dec) // dec_not_one, true
                                      // enabled == 1
                                        return
        else
          try_dec() // dec_not_one, false
          WARN
    
    Use dec_and_test instead of cmpxchg(), like it was prior to
    83ab38ef0a0b. Add a few WARNs for the paranoid.
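    
    The behavioural difference between the two primitives can be shown in
    isolation with C11 atomics. This is a single-threaded sketch with
    made-up helper names, not the static_key code: it only shows what each
    operation does when the count is 2 at the moment of the call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* cmpxchg-style release: only acts when the count is exactly 1, and
 * otherwise leaves it untouched for some follow-up step to handle. */
static bool release_with_cmpxchg(atomic_int *enabled)
{
    int expected = 1;

    return atomic_compare_exchange_strong(enabled, &expected, 0);
}

/* dec_and_test-style release: always drops one reference and reports
 * whether that was the last one. */
static bool release_with_dec_and_test(atomic_int *enabled)
{
    return atomic_fetch_sub(enabled, 1) - 1 == 0;
}
```

    With the count at 2, the cmpxchg variant fails without changing
    anything, which is the window the interleaving above exploits; the
    dec_and_test variant performs the decrement in one atomic step.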
    
    Fixes: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
    Reported-by: "Darrick J. Wong" <[email protected]>
    Tested-by: Klara Modin <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
kconfig: fix infinite loop in sym_calc_choice() [+ + +]
Author: Masahiro Yamada <[email protected]>
Date:   Wed Sep 25 20:25:31 2024 +0900

    kconfig: fix infinite loop in sym_calc_choice()
    
    [ Upstream commit 4d46b5b623e0adee1153b1d80689211e5094ae44 ]
    
    Since commit f79dc03fe68c ("kconfig: refactor choice value calculation"),
    Kconfig for ARCH=powerpc may result in an infinite loop. This occurs
    because there are two entries for POWERPC64_CPU in a choice block.
    
    If the same symbol appears twice in a choice block, the ->choice_link
    node is added twice to ->choice_members, resulting in a corrupted linked
    list.
    
    A simple test case is:
    
        choice
                prompt "choice"
    
        config A
                bool "A"
    
        config B
                bool "B 1"
    
        config B
                bool "B 2"
    
        endchoice
    
    Running 'make defconfig' results in an infinite loop.
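    
    Why adding the same node twice leads to an endless walk can be
    reproduced with a minimal circular list. The sketch re-implements a
    list_add_tail()-style insertion locally; bounded_length() is a
    hypothetical helper added only so the demonstration terminates.

```c
#include <assert.h>

struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Tail insertion in the style of the kernel's list_add_tail(). */
static void list_add_tail(struct list_head *new_node, struct list_head *head)
{
    struct list_head *prev = head->prev;

    head->prev = new_node;
    new_node->next = head;
    new_node->prev = prev;
    prev->next = new_node;
}

/* Walk the list but give up after 'limit' steps; returns -1 if the head
 * was not reached again, i.e. an unbounded walk would never finish. */
static int bounded_length(const struct list_head *head, int limit)
{
    int n = 0;

    for (const struct list_head *p = head->next; p != head; p = p->next)
        if (++n > limit)
            return -1;
    return n;
}
```

    When the node that is already the tail is added a second time, its
    next pointer ends up pointing at itself, so a walk never returns to the
    list head.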
    
    One solution is to replace the current two entries:
    
        config POWERPC64_CPU
                bool "Generic (POWER5 and PowerPC 970 and above)"
                depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
    
        config POWERPC64_CPU
                bool "Generic (POWER8 and above)"
                depends on PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
                select ARCH_HAS_FAST_MULTIPLIER
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX
    
    with the following single entry:
    
        config POWERPC64_CPU
                bool "Generic 64 bit powerpc"
                depends on PPC_BOOK3S_64
                select ARCH_HAS_FAST_MULTIPLIER if CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX if CPU_LITTLE_ENDIAN
    
    In my opinion, the latter looks cleaner, but PowerPC maintainers may
    prefer to display different prompts depending on CPU_LITTLE_ENDIAN.
    
    For now, this commit fixes the issue in Kconfig, restoring the original
    behavior. I will reconsider whether such a use case is worth supporting.
    
    Fixes: f79dc03fe68c ("kconfig: refactor choice value calculation")
    Reported-by: Marco Bonelli <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kconfig: qconf: fix buffer overflow in debug links [+ + +]
Author: Masahiro Yamada <[email protected]>
Date:   Tue Oct 1 18:02:22 2024 +0900

    kconfig: qconf: fix buffer overflow in debug links
    
    [ Upstream commit 984ed20ece1c6c20789ece040cbff3eb1a388fa9 ]
    
    If you enable "Option -> Show Debug Info" and click a link, the program
    terminates with the following error:
    
        *** buffer overflow detected ***: terminated
    
    The buffer overflow is caused by the following line:
    
        strcat(data, "$");
    
    The buffer needs one more byte to accommodate the additional character.
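    
    The sizing rule can be sketched as follows. append_dollar() is a
    hypothetical helper written in C for illustration; it is not the qconf
    code, which is C++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of 'name' with '$' appended, or NULL on
 * allocation failure. The caller frees the result. */
static char *append_dollar(const char *name)
{
    size_t len = strlen(name);
    char *data = malloc(len + 2); /* +1 for '$', +1 for the terminating NUL */

    if (!data)
        return NULL;
    memcpy(data, name, len + 1); /* copy including the NUL */
    strcat(data, "$");           /* now fits: len + 1 chars plus NUL */
    return data;
}
```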
    
    Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again")
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kconfig: qconf: move conf_read() before drawing tree pain [+ + +]
Author: Masahiro Yamada <[email protected]>
Date:   Tue Oct 1 02:02:23 2024 +0900

    kconfig: qconf: move conf_read() before drawing tree pain
    
    [ Upstream commit da724c33b685463720b1c625ac440e894dc57ec0 ]
    
    The constructor of ConfigMainWindow() calls show*View(), which needs
    to calculate symbol values. conf_read() must be called before that.
    
    Fixes: 060e05c3b422 ("kconfig: qconf: remove initial call to conf_changed()")
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3 [+ + +]
Author: Alessandro Zanni <[email protected]>
Date:   Tue Aug 6 14:14:50 2024 +0200

    kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
    
    [ Upstream commit a19008256d05e726f29f43c6a307e45482c082c3 ]
    
    Use raw strings so that Python 3 does not treat "\d" in the regular
    expression literals as an invalid escape sequence.
    
    Fix the warnings:
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:48:
    SyntaxWarning: invalid escape sequence '\d' usb_controller_sysfs_dir =
    "usb[\d]+"
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:94:
    SyntaxWarning: invalid escape sequence '\d' re_usb_version =
    re.compile("PRODUCT=.*/(\d)/.*")
    
    Fixes: dacf1d7a78bf ("kselftest: Add test to verify probe of devices from discoverable buses")
    
    Reviewed-by: Nícolas F. R. A. Prado <[email protected]>
    Signed-off-by: Alessandro Zanni <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
kselftests: mm: fix wrong __NR_userfaultfd value [+ + +]
Author: Muhammad Usama Anjum <[email protected]>
Date:   Mon Sep 23 10:38:36 2024 +0500

    kselftests: mm: fix wrong __NR_userfaultfd value
    
    commit f30beffd977e98c33550bbeb6f278d157ff54844 upstream.
    
    grep -rnIF "#define __NR_userfaultfd"
    tools/include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    arch/x86/include/generated/uapi/asm/unistd_32.h:374:#define __NR_userfaultfd 374
    arch/x86/include/generated/uapi/asm/unistd_64.h:327:#define __NR_userfaultfd 323
    arch/x86/include/generated/uapi/asm/unistd_x32.h:282:#define __NR_userfaultfd (__X32_SYSCALL_BIT + 323)
    arch/arm/include/generated/uapi/asm/unistd-eabi.h:347:#define __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    arch/arm/include/generated/uapi/asm/unistd-oabi.h:359:#define __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    
    The number is dependent on the architecture. The above data shows that:
    x86     374
    x86_64  323
    
    The value of __NR_userfaultfd was changed to 282 when asm-generic/unistd.h
    was included.  This makes the test fail every time, as the correct number
    of this syscall on x86_64 is 323.  Fix this by including asm/unistd.h
    instead.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: a5c6bc590094 ("selftests/mm: remove local __NR_* definitions")
    Signed-off-by: Muhammad Usama Anjum <[email protected]>
    Reviewed-by: Shuah Khan <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Cc: John Hubbard <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ksmbd: add refcnt to ksmbd_conn struct [+ + +]
Author: Namjae Jeon <[email protected]>
Date:   Tue Sep 3 20:28:08 2024 +0900

    ksmbd: add refcnt to ksmbd_conn struct
    
    [ Upstream commit ee426bfb9d09b29987369b897fe9b6485ac2be27 ]
    
    When sending an oplock break request, opinfo->conn is used, but on
    multichannel an already freed ->conn can be used.
    This patch adds a reference count to the ksmbd_conn struct
    so that it is freed only when it is no longer used.
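    
    The general pattern can be sketched with C11 atomics. struct toy_conn
    and its helpers are hypothetical and do not reflect the ksmbd_conn
    layout; the 'released' flag stands in for the real teardown so the
    example has nothing to free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical connection object for this sketch. */
struct toy_conn {
    atomic_int refcnt;
    bool released; /* set where a real implementation would free the object */
};

static void toy_conn_init(struct toy_conn *conn)
{
    atomic_init(&conn->refcnt, 1); /* the creator holds the first reference */
    conn->released = false;
}

/* Take a reference before using the connection from another path. */
static void toy_conn_get(struct toy_conn *conn)
{
    atomic_fetch_add(&conn->refcnt, 1);
}

/* Drop a reference; tear down only when the last user is gone. */
static void toy_conn_put(struct toy_conn *conn)
{
    if (atomic_fetch_sub(&conn->refcnt, 1) == 1)
        conn->released = true;
}
```

    A path such as the oplock break would take a reference before touching
    the connection and drop it afterwards, so the connection outlives its
    original owner for as long as it is still in use.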
    
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ksmbd: fix warning: comparison of distinct pointer types lacks a cast [+ + +]
Author: Namjae Jeon <[email protected]>
Date:   Thu Sep 19 09:22:57 2024 +0900

    ksmbd: fix warning: comparison of distinct pointer types lacks a cast
    
    [ Upstream commit 289ebd9afeb94862d96c89217068943f1937df5b ]
    
    smb2pdu.c: In function ‘smb2_open’:
    ./include/linux/minmax.h:20:28: warning: comparison of distinct
    pointer types lacks a cast
       20 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
          |                            ^~
    ./include/linux/minmax.h:26:4: note: in expansion of macro ‘__typecheck’
       26 |   (__typecheck(x, y) && __no_side_effects(x, y))
          |    ^~~~~~~~~~~
    ./include/linux/minmax.h:36:24: note: in expansion of macro ‘__safe_cmp’
       36 |  __builtin_choose_expr(__safe_cmp(x, y), \
          |                        ^~~~~~~~~~
    ./include/linux/minmax.h:45:19: note: in expansion of macro ‘__careful_cmp’
       45 | #define min(x, y) __careful_cmp(x, y, <)
          |                   ^~~~~~~~~~~~~
    /home/linkinjeon/git/smbd_work/ksmbd/smb2pdu.c:3713:27: note: in
    expansion of macro ‘min’
     3713 |     fp->durable_timeout = min(dh_info.timeout,
    
    Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
KVM: arm64: Fix kvm_has_feat*() handling of negative features [+ + +]
Author: Marc Zyngier <[email protected]>
Date:   Wed Oct 2 21:42:39 2024 +0100

    KVM: arm64: Fix kvm_has_feat*() handling of negative features
    
    commit a1d402abf8e3ff1d821e88993fc5331784fac0da upstream.
    
    Oliver reports that the kvm_has_feat() helper is not behaving as
    expected for negative features. On investigation, the main issue
    seems to be caused by the following construct:
    
     #define get_idreg_field(kvm, id, fld)                          \
            (id##_##fld##_SIGNED ?                                  \
             get_idreg_field_signed(kvm, id, fld) :                 \
             get_idreg_field_unsigned(kvm, id, fld))
    
    where one side of the expression evaluates as something signed,
    and the other as something unsigned. In retrospect, this is totally
    braindead, as the compiler converts this into an unsigned expression.
    When compared to something that is 0, the test is simply elided.
    
    Epic fail. Similar issue exists in the expand_field_sign() macro.
    
    The correct way to handle this is to choose between signed and unsigned
    comparisons, so that both sides of the ternary expression are of the
    same type (bool).
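    
    The type problem can be reproduced outside the kernel. In the sketch
    below, the function names and the 'min' parameter are made up for
    illustration; only the shape of the ternary matches the construct
    quoted above. Compilers may warn about the mixed-sign ternary in the
    broken variant, which is the point of the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Broken pattern: one arm is int, the other unsigned int, so the usual
 * arithmetic conversions make the whole ternary unsigned. A negative
 * signed field value therefore turns into a huge positive number.
 */
static bool broken_field_ge(bool is_signed, int sval, unsigned int uval,
                            int min)
{
    return (is_signed ? sval : uval) >= min;
}

/*
 * Fixed pattern: each arm performs its own comparison, so both arms of
 * the ternary have the same (boolean) type.
 */
static bool fixed_field_ge(bool is_signed, int sval, unsigned int uval,
                           int min)
{
    return is_signed ? (sval >= min) : (uval >= (unsigned int)min);
}
```

    For a signed field holding -1 and a minimum of 0, the broken variant
    reports the feature as present while the fixed variant does not.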
    
    In order to keep the code readable (sort of), we introduce new
    comparison primitives taking an operator as a parameter, and
    rewrite the kvm_has_feat*() helpers in terms of these primitives.
    
    Fixes: c62d7a23b947 ("KVM: arm64: Add feature checking helpers")
    Reported-by: Oliver Upton <[email protected]>
    Tested-by: Oliver Upton <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
l2tp: free sessions using rcu [+ + +]
Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:08 2024 +0100

    l2tp: free sessions using rcu
    
    [ Upstream commit d17e89999574aca143dd4ede43e4382d32d98724 ]
    
    l2tp sessions may be accessed under an rcu read lock. Have them freed
    via rcu and remove the now unneeded synchronize_rcu when a session is
    removed.
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

l2tp: prevent possible tunnel refcount underflow [+ + +]
Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:10 2024 +0100

    l2tp: prevent possible tunnel refcount underflow
    
    [ Upstream commit 24256415d18695b46da06c93135f5b51c548b950 ]
    
    When a session is created, it sets a backpointer to its tunnel. When
    the session refcount drops to 0, l2tp_session_free drops the tunnel
    refcount if session->tunnel is non-NULL. However, session->tunnel is
    set in l2tp_session_create, before the tunnel refcount is incremented
    by l2tp_session_register, which leaves a small window where
    session->tunnel is non-NULL when the tunnel refcount hasn't been
    bumped.
    
    Moving the assignment to l2tp_session_register is trivial but
    l2tp_session_create calls l2tp_session_set_header_len which uses
    session->tunnel to get the tunnel's encap. Add an encap arg to
    l2tp_session_set_header_len to avoid using session->tunnel.
    
    If l2tpv3 sessions have colliding IDs, it is possible for
    l2tp_v3_session_get to race with l2tp_session_register and fetch a
    session which doesn't yet have session->tunnel set. Add a check for
    this case.
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

l2tp: use rcu list add/del when updating lists [+ + +]
Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:11 2024 +0100

    l2tp: use rcu list add/del when updating lists
    
    [ Upstream commit 89b768ec2dfefaeba5212de14fc71368e12d06ba ]
    
    l2tp_v3_session_htable and tunnel->session_list are read by lockless
    getters using RCU. Use rcu list variants when adding or removing list
    items.
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
leds: pca9532: Remove irrelevant blink configuration error message [+ + +]
Author: Bastien Curutchet <[email protected]>
Date:   Mon Aug 26 15:32:37 2024 +0200

    leds: pca9532: Remove irrelevant blink configuration error message
    
    commit 2aad93b6de0d874038d3d7958be05011284cd6b9 upstream.
    
    The update_hw_blink() function prints an error message when hardware is
    not able to handle a blink configuration on its own. IMHO, this isn't a
    'real' error since the software fallback is used afterwards.
    
    Remove the error messages to avoid flooding the logs with unnecessary
    messages.
    
    Cc: [email protected]
    Fixes: 48ca7f302cfc ("leds: pca9532: Use PWM1 for hardware blinking")
    Signed-off-by: Bastien Curutchet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
lib/buildid: harden build ID parsing logic [+ + +]
Author: Andrii Nakryiko <[email protected]>
Date:   Thu Aug 29 10:42:23 2024 -0700

    lib/buildid: harden build ID parsing logic
    
    commit 905415ff3ffb1d7e5afa62bacabd79776bd24606 upstream.
    
    Harden build ID parsing logic, adding explicit READ_ONCE() where it's
    important to have a consistent value read and validated just once.
    
    Also, as pointed out by Andi Kleen, we need to make sure that the entire
    ELF note is within page bounds, so move the overflow check up and add an
    extra note_size boundary validation.
    
    The Fixes tag below points to the commit that moved this code into
    lib/buildid.c, after which it was used in the perf subsystem, exposing
    this code to perf_event_open() users in v5.12+.
    
    Cc: [email protected]
    Reviewed-by: Eduard Zingerman <[email protected]>
    Reviewed-by: Jann Horn <[email protected]>
    Suggested-by: Andi Kleen <[email protected]>
    Fixes: bd7525dacd7e ("bpf: Move stack_map_get_build_id into lib")
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Linux: Linux 6.11.3 [+ + +]
Author: Greg Kroah-Hartman <[email protected]>
Date:   Thu Oct 10 12:04:18 2024 +0200

    Linux 6.11.3
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Ronald Warsow <[email protected]>
    Tested-by: Markus Reichelt <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Justin M. Forbes <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Christian Heusel <[email protected]>
    Tested-by: Kexy Biscuit <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: kernelci.org bot <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mac802154: Fix potential RCU dereference issue in mac802154_scan_worker [+ + +]
Author: Jiawei Ye <[email protected]>
Date:   Tue Sep 24 06:58:05 2024 +0000

    mac802154: Fix potential RCU dereference issue in mac802154_scan_worker
    
    commit bff1709b3980bd7f80be6786f64cc9a9ee9e56da upstream.
    
    In the `mac802154_scan_worker` function, the `scan_req->type` field was
    accessed after the RCU read-side critical section was unlocked. According
    to RCU usage rules, this is illegal and can lead to unpredictable
    behavior, such as accessing memory that has been updated or causing
    use-after-free issues.
    
    This possible bug was identified using a static analysis tool developed
    by myself, specifically designed to detect RCU-related issues.
    
    To address this, the `scan_req->type` value is now stored in a local
    variable `scan_req_type` while still within the RCU read-side critical
    section. The `scan_req_type` is then used after the RCU lock is released,
    ensuring that the type value is safely accessed without violating RCU
    rules.
    
    Fixes: e2c3e6f53a7a ("mac802154: Handle active scanning")
    Cc: [email protected]
    Signed-off-by: Jiawei Ye <[email protected]>
    Acked-by: Miquel Raynal <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mailbox: ARM_MHU_V3 should depend on ARM64 [+ + +]
Author: Geert Uytterhoeven <[email protected]>
Date:   Thu Aug 29 15:58:53 2024 +0200

    mailbox: ARM_MHU_V3 should depend on ARM64
    
    [ Upstream commit 0e4ed48292c55eeb0afab22f8930b556f17eaad2 ]
    
    The ARM MHUv3 controller is only present on ARM64 SoCs.  Hence add a
    dependency on ARM64, to prevent asking the user about this driver when
    configuring a kernel for a different architecture than ARM64.
    
    Fixes: ca1a8680b134b5e6 ("mailbox: arm_mhuv3: Add driver")
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Acked-by: Sudeep Holla <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: bcm2835: Fix timeout during suspend mode [+ + +]
Author: Stefan Wahren <[email protected]>
Date:   Wed Aug 21 23:40:44 2024 +0200

    mailbox: bcm2835: Fix timeout during suspend mode
    
    [ Upstream commit dc09f007caed3b2f6a3b6bd7e13777557ae22bfd ]
    
    During the noirq suspend phase the Raspberry Pi power driver suffers from
    firmware property timeouts. The reason is that the IRQ of the underlying
    BCM2835 mailbox is disabled and rpi_firmware_property_list() will always
    run into a timeout [1].
    
    Since the VideoCore side isn't considered a wakeup source, set the
    IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled
    during suspend-resume cycle.
    
    [1]
    PM: late suspend of devices complete after 1.754 msecs
    WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128
     rpi_firmware_property_list+0x204/0x22c
    Firmware transaction 0x00028001 timeout
    Modules linked in:
    CPU: 0 PID: 438 Comm: bash Tainted: G         C         6.9.3-dirty #17
    Hardware name: BCM2835
    Call trace:
    unwind_backtrace from show_stack+0x18/0x1c
    show_stack from dump_stack_lvl+0x34/0x44
    dump_stack_lvl from __warn+0x88/0xec
    __warn from warn_slowpath_fmt+0x7c/0xb0
    warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c
    rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c
    rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0
    rpi_firmware_set_power from _genpd_power_off+0xe4/0x148
    _genpd_power_off from genpd_sync_power_off+0x7c/0x11c
    genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0
    genpd_finish_suspend from dpm_run_callback+0x78/0xd0
    dpm_run_callback from device_suspend_noirq+0xc0/0x238
    device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168
    dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac
    suspend_devices_and_enter from pm_suspend+0x254/0x2e4
    pm_suspend from state_store+0xa8/0xd4
    state_store from kernfs_fop_write_iter+0x154/0x1a0
    kernfs_fop_write_iter from vfs_write+0x12c/0x184
    vfs_write from ksys_write+0x78/0xc0
    ksys_write from ret_fast_syscall+0x0/0x54
    Exception stack(0xcc93dfa8 to 0xcc93dff0)
    [...]
    PM: noirq suspend of devices complete after 3095.584 msecs
    
    Link: https://github.com/raspberrypi/firmware/issues/1894
    Fixes: 0bae6af6d704 ("mailbox: Enable BCM2835 mailbox support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: rockchip: fix a typo in module autoloading [+ + +]
Author: Liao Chen <[email protected]>
Date:   Wed Aug 14 02:51:47 2024 +0000

    mailbox: rockchip: fix a typo in module autoloading
    
    [ Upstream commit e92d87c9c5d769e4cb1dd7c90faa38dddd7e52e3 ]
    
    MODULE_DEVICE_TABLE(of, rockchip_mbox_of_match) lets the module be
    autoloaded properly based on the alias from the of_device_id table. The
    table name should be 'rockchip_mbox_of_match' instead of
    'rockchp_mbox_of_match'; fix it.
    
    Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")
    Signed-off-by: Liao Chen <[email protected]>
    Reviewed-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
media: i2c: ar0521: Use cansleep version of gpiod_set_value() [+ + +]
Author: Alexander Shiyan <[email protected]>
Date:   Thu Aug 29 08:48:49 2024 +0300

    media: i2c: ar0521: Use cansleep version of gpiod_set_value()
    
    commit bee1aed819a8cda47927436685d216906ed17f62 upstream.
    
    If we use GPIO reset from I2C port expander, we must use *_cansleep()
    variant of GPIO functions.
    This was not done in ar0521_power_on()/ar0521_power_off() functions.
    Let's fix that.
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 11 at drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x74/0x7c
    Modules linked in:
    CPU: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.10.0 #53
    Hardware name: Diasom DS-RK3568-SOM-EVB (DT)
    Workqueue: events_unbound deferred_probe_work_func
    pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : gpiod_set_value+0x74/0x7c
    lr : ar0521_power_on+0xcc/0x290
    sp : ffffff8001d7ab70
    x29: ffffff8001d7ab70 x28: ffffff80027dcc90 x27: ffffff8003c82000
    x26: ffffff8003ca9250 x25: ffffffc080a39c60 x24: ffffff8003ca9088
    x23: ffffff8002402720 x22: ffffff8003ca9080 x21: ffffff8003ca9088
    x20: 0000000000000000 x19: ffffff8001eb2a00 x18: ffffff80efeeac80
    x17: 756d2d6332692f30 x16: 0000000000000000 x15: 0000000000000000
    x14: ffffff8001d91d40 x13: 0000000000000016 x12: ffffffc080e98930
    x11: ffffff8001eb2880 x10: 0000000000000890 x9 : ffffff8001d7a9f0
    x8 : ffffff8001d92570 x7 : ffffff80efeeac80 x6 : 000000003fc6e780
    x5 : ffffff8001d91c80 x4 : 0000000000000002 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     gpiod_set_value+0x74/0x7c
     ar0521_power_on+0xcc/0x290
    ...
    
    Signed-off-by: Alexander Shiyan <[email protected]>
    Fixes: 852b50aeed15 ("media: On Semi AR0521 sensor driver")
    Cc: [email protected]
    Acked-by: Krzysztof Hałasa <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: imx335: Fix reset-gpio handling [+ + +]
Author: Umang Jain <[email protected]>
Date:   Fri Aug 30 11:41:52 2024 +0530

    media: imx335: Fix reset-gpio handling
    
    commit 99d30e2fdea4086be4e66e2deb10de854b547ab8 upstream.
    
    Rectify the logical value of reset-gpio so that it is set to
    0 (disabled) during power-on and to 1 (enabled) during power-off.
    
    Set the reset-gpio to GPIO_OUT_HIGH at initialization time to make
    sure it starts off in reset. Also drop the "Set XCLR" comment which
    is not-so-informative.
    
    The existing usage of imx335 had reset-gpios polarity inverted
    (GPIO_ACTIVE_HIGH) in their device-tree sources. With this patch
    included, those DTS will not be able to stream imx335 anymore. The
    reset-gpio polarity will need to be rectified in the device-tree
    sources as shown in [1] example, in order to get imx335 functional
    again (as it remains in reset prior to this fix).
    
    Cc: [email protected]
    Fixes: 45d19b5fb9ae ("media: i2c: Add imx335 camera sensor driver")
    Reviewed-by: Laurent Pinchart <[email protected]>
    Link: https://lore.kernel.org/linux-media/[email protected]/
    Signed-off-by: Umang Jain <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: ov5675: Fix power on/off delay timings [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Sat Jul 13 23:33:29 2024 +0100

    media: ov5675: Fix power on/off delay timings
    
    commit 719ec29fceda2f19c833d2784b1574638320400f upstream.
    
    The ov5675 specification says that the gap between XSHUTDN deassert and the
    first I2C transaction should be a minimum of 8192 XVCLK cycles.
    
    Right now we use a usleep_range() that gives a sleep time of between about
    430 and 860 microseconds.
    
    On the Lenovo X13s we have observed that in about 1/20 cases the current
    timing is too tight and we start transacting before the ov5675's reset
    cycle completes, leading to I2C bus transaction failures.
    
    The reset race is sometimes triggered at initial chip probe but more
    usually on a subsequent power-off/power-on cycle, e.g.
    
    [   71.451662] ov5675 24-0010: failed to write reg 0x0103. error = -5
    [   71.451686] ov5675 24-0010: failed to set plls
    
    The current quiescence period we have is too tight. Instead of expressing
    the post reset delay in terms of the current XVCLK this patch converts the
    power-on and power-off delays to the maximum theoretical delay @ 6 MHz with
    an additional buffer.
    
    1.365 milliseconds on the power-on path is 1.5 milliseconds with grace.
    85.3 microseconds on the power-off path is 90 microseconds with grace.
    
    Fixes: 49d9ad719e89 ("media: ov5675: add device-tree support and support runtime PM")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Tested-by: Johan Hovold <[email protected]>
    Reviewed-by: Quentin Schulz <[email protected]>
    Tested-by: Quentin Schulz <[email protected]> # RK3399 Puma with
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
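
    The delay figures quoted above follow from simple cycle arithmetic,
    sketched below. The 8192-cycle requirement and the 6 MHz clock come
    from the commit message; the 512-cycle power-off figure is an inference
    from 85.3 microseconds at 6 MHz and is not stated in the text. The
    helper and macro names are hypothetical.

```c
#include <stdint.h>

#define XVCLK_MIN_HZ     6000000ULL /* slowest clock named in the text */
#define POWER_ON_CYCLES  8192ULL    /* XSHUTDN deassert to first I2C access */
#define POWER_OFF_CYCLES 512ULL     /* assumption: inferred from 85.3 us */

/* Microseconds needed for `cycles` clock cycles at `hz`, rounded up. */
static uint64_t cycles_to_us(uint64_t cycles, uint64_t hz)
{
	return (cycles * 1000000ULL + hz - 1) / hz;
}
```

    With these inputs, cycles_to_us(POWER_ON_CYCLES, XVCLK_MIN_HZ) rounds
    up to 1366 microseconds and cycles_to_us(POWER_OFF_CYCLES, XVCLK_MIN_HZ)
    to 86 microseconds, both of which sit inside the 1.5 millisecond and
    90 microsecond delays chosen by the patch.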

media: qcom: camss: Fix ordering of pm_runtime_enable [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:03 2024 +0100

    media: qcom: camss: Fix ordering of pm_runtime_enable
    
    commit a151766bd3688f6803e706c6433a7c8d3c6a6a94 upstream.
    
    pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
    pm_runtime_resume_and_get().
    
    This is a basic race condition that doesn't show up for most users, so
    it is not widely reported. If you blacklist qcom-camss in modules.d and
    then modprobe the module post-boot, it is possible to reproduce this
    error reliably.
    
    The kernel log for this error looks like this:
    
    qcom-camss ac5a000.camss: Failed to power up pipeline: -13
    
    Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Konrad Dybcio <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
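
    The ordering problem above can be modelled with a toy in userspace C.
    This is only a sketch: struct toy_dev and the toy_* functions are
    hypothetical stand-ins, not the runtime PM API. It assumes that a
    resume request on a device whose runtime PM is still disabled fails
    with -EACCES, which on Linux is the -13 seen in the log line above.

```c
#include <errno.h>

/* Toy device: disable_depth > 0 means runtime PM is disabled. */
struct toy_dev {
	int disable_depth;
	int usage_count;
};

/* Assumption for this model: devices start with runtime PM disabled. */
static void toy_init(struct toy_dev *d)
{
	d->disable_depth = 1;
	d->usage_count = 0;
}

/* Stand-in for enabling runtime PM on the device. */
static void toy_pm_runtime_enable(struct toy_dev *d)
{
	if (d->disable_depth > 0)
		d->disable_depth--;
}

/* Stand-in for a resume-and-get call: refuses while PM is disabled. */
static int toy_pm_runtime_resume_and_get(struct toy_dev *d)
{
	if (d->disable_depth > 0)
		return -EACCES;
	d->usage_count++;
	return 0;
}
```

    Calling toy_pm_runtime_resume_and_get() before toy_pm_runtime_enable()
    returns -EACCES in this model; swapping the two calls lets the resume
    succeed, which mirrors the reordering made by the patch.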

media: qcom: camss: Remove use_count guard in stop_streaming [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:02 2024 +0100

    media: qcom: camss: Remove use_count guard in stop_streaming
    
    commit 25f18cb1b673220b76a86ebef8e7fb79bd303b27 upstream.
    
    The use_count check was introduced so that multiple concurrent Raw Data
    Interfaces (RDIs) could be driven by different virtual channels (VCs) on
    the CSIPHY input driving the video pipeline.
    
    This is an invalid use of use_count though as use_count pertains to the
    number of times a video entity has been opened by user-space not the number
    of active streams.
    
    If use_count and stream-on count don't agree then stop_streaming() will
    break as is currently the case and has become apparent when using CAMSS
    with libcamera's released softisp 0.3.
    
    The use of use_count like this is a bit hacky and right now breaks regular
    usage of CAMSS for a single stream case. Stopping qcam results in the splat
    below, after which it cannot be started again and any attempt to do so
    fails with -EBUSY.
    
    [ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    ...
    [ 1265.510630] Call trace:
    [ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    [ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
    [ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
    [ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
    [ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
    [ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
    [ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
    [ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
    ...
    [ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
    [ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
    [ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st
    
    One CAMSS specific way to handle multiple VCs on the same RDI might be:
    
    - Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
    - The video buffers are already associated with msm_vfeN_rdiX so
      release video buffers when told to do so by stop_streaming.
    - Only release the power-domains for the CSIPHY, CSID and VFE when
      their internal refcounts drop.
    
    Either way refusing to release video buffers based on use_count is
    erroneous and should be reverted. The silicon enabling code for selecting
    VCs is perfectly fine. It's a "known missing feature" that concurrent VCs
    won't work with CAMSS right now.
    
    Initial testing with this code didn't show an error, but SoftISP and
    "real" usage with Google Hangouts break the upstream code pretty quickly,
    so we need to do a partial revert and take another pass at VCs.
    
    This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
    Pipeline starting and stopping for multiple virtual channels")
    
    Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: sun4i_csi: Implement link validate for sun4i_csi subdev [+ + +]
Author: Laurent Pinchart <[email protected]>
Date:   Wed Jun 19 02:46:16 2024 +0300

    media: sun4i_csi: Implement link validate for sun4i_csi subdev
    
    commit 2dc5d5d401f5c6cecd97800ffef82e8d17d228f0 upstream.
    
    The sun4i_csi driver doesn't implement link validation for the subdev it
    registers, leaving the link between the subdev and its source
    unvalidated. Fix it, using the v4l2_subdev_link_validate() helper.
    
    Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Acked-by: Chen-Yu Tsai <[email protected]>
    Reviewed-by: Tomi Valkeinen <[email protected]>
    Acked-by: Sakari Ailus <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags [+ + +]
Author: Hans Verkuil <[email protected]>
Date:   Wed Aug 7 09:22:10 2024 +0200

    media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
    
    commit 599f6899051cb70c4e0aa9fd591b9ee220cb6f14 upstream.
    
    The cec_msg_set_reply_to() helper function never zeroed the
    struct cec_msg flags field. This can cause unexpected behavior
    if flags was uninitialized to begin with.
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media")
    Cc: <[email protected]>
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: venus: fix use after free bug in venus_remove due to race condition [+ + +]
Author: Zheng Wang <[email protected]>
Date:   Tue Jun 18 14:55:59 2024 +0530

    media: venus: fix use after free bug in venus_remove due to race condition
    
    commit c5a85ed88e043474161bbfe54002c89c1cb50ee2 upstream.
    
    In venus_probe, core->work is bound with venus_sys_error_handler, which is
    used to handle errors. The code uses core->sys_err_done to synchronize
    the work. The core->work is started in venus_event_notify.
    
    If we call venus_remove, there might be unfinished work. The possible
    sequence is as follows:
    
    CPU0                 CPU1
    
                         |venus_sys_error_handler
    venus_remove         |
    hfi_destroy          |
    venus_hfi_destroy    |
    kfree(hdev);         |
                         |hfi_reinit
                         |venus_hfi_queues_reinit
                         |//use hdev
    
    Fix it by canceling the work in venus_remove.
    
    Cc: [email protected]
    Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
    Signed-off-by: Zheng Wang <[email protected]>
    Signed-off-by: Dikshita Agarwal <[email protected]>
    Signed-off-by: Stanimir Varbanov <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: videobuf2: Drop minimum allocation requirement of 2 buffers [+ + +]
Author: Laurent Pinchart <[email protected]>
Date:   Mon Aug 26 02:24:49 2024 +0300

    media: videobuf2: Drop minimum allocation requirement of 2 buffers
    
    commit e5700c9037727d5a69a677d6dba25010b485d65b upstream.
    
    When introducing the ability for drivers to indicate the minimum number
    of buffers they require an application to allocate, commit 6662edcd32cc
    ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue
    structure") also introduced a global minimum of 2 buffers. It turns out
    this breaks the Renesas R-Car VSP test suite, where a test that
    allocates a single buffer fails when two buffers are used.
    
    One may consider debatable whether test suite failures without failures
    in production use cases should be considered as a regression, but
    operation with a single buffer is a valid use case. While full frame
    rate can't be maintained, memory-to-memory devices can still be used
    with a decent efficiency, and requiring applications to allocate
    multiple buffers for single-shot use cases with capture devices would
    just waste memory.
    
    For those reasons, fix the regression by dropping the global minimum of
    buffers. Individual drivers can still set their own minimum.
    
    Fixes: 6662edcd32cc ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue structure")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Reviewed-by: Hans Verkuil <[email protected]>
    Acked-by: Tomasz Figa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
memory: tegra186-emc: drop unused to_tegra186_emc() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Mon Aug 12 14:30:55 2024 +0200

    memory: tegra186-emc: drop unused to_tegra186_emc()
    
    commit 67dd9e861add38755a7c5d29e25dd0f6cb4116ab upstream.
    
    to_tegra186_emc() is not used, which breaks W=1 builds:
    
      tegra186-emc.c:38:36: error: unused function 'to_tegra186_emc' [-Werror,-Wunused-function]
    
    Fixes: 9a38cb27668e ("memory: tegra: Add interconnect support for DRAM scaling in Tegra234")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm, slub: avoid zeroing kmalloc redzone [+ + +]
Author: Peng Fan <[email protected]>
Date:   Thu Aug 29 11:29:11 2024 +0800

    mm, slub: avoid zeroing kmalloc redzone
    
    commit 59090e479ac78ae18facd4c58eb332562a23020e upstream.
    
    Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra
    allocated kmalloc space than requested"), setting orig_size treats
    the wasted space (object_size - orig_size) as a redzone. However with
    init_on_free=1 we clear the full object->size, including the redzone.
    
    Additionally we clear the object metadata, including the stored orig_size,
    making it zero, which makes check_object() treat the whole object as a
    redzone.
    
    These issues lead to the following BUG report with "slub_debug=FUZ
    init_on_free=1":
    
    [    0.000000] =============================================================================
    [    0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten
    [    0.000000] -----------------------------------------------------------------------------
    [    0.000000]
    [    0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc
    [    0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc
    [    0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff)
    [    0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8
    [    0.000000]
    [    0.000000] Redzone  ffff000010032850: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Object   ffff000010032858: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Redzone  ffff000010032860: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Padding  ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00              ............
    [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144
    [    0.000000] Hardware name: NXP i.MX95 19X19 board (DT)
    [    0.000000] Call trace:
    [    0.000000]  dump_backtrace+0x90/0xe8
    [    0.000000]  show_stack+0x18/0x24
    [    0.000000]  dump_stack_lvl+0x74/0x8c
    [    0.000000]  dump_stack+0x18/0x24
    [    0.000000]  print_trailer+0x150/0x218
    [    0.000000]  check_object+0xe4/0x454
    [    0.000000]  free_to_partial_list+0x2f8/0x5ec
    
    To address the issue, use orig_size to clear the used area, and restore
    the value of orig_size after clearing the remaining area.
    
    When CONFIG_SLUB_DEBUG is not defined, get_orig_size() directly returns
    s->object_size. So when using memset to init the area, the size can simply
    be orig_size, as orig_size returns object_size when CONFIG_SLUB_DEBUG is
    not enabled. And orig_size can never be bigger than object_size.
    
    Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
    Cc: <[email protected]>
    Reviewed-by: Feng Tang <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Signed-off-by: Peng Fan <[email protected]>
    Signed-off-by: Vlastimil Babka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
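
    The effect described above can be shown with a toy userspace model. It
    is a sketch only: the toy_* helpers are hypothetical and do not mirror
    the slub implementation. The slack between orig_size and object_size
    is filled with the 0xcc redzone byte seen in the BUG report, and the two
    init-on-free variants differ only in how many bytes they zero.

```c
#include <stddef.h>
#include <string.h>

#define TOY_REDZONE 0xcc	/* redzone byte shown in the report above */

/* Fill the slack between orig_size and object_size with the pattern. */
static void toy_set_redzone(unsigned char *obj, size_t object_size,
			    size_t orig_size)
{
	memset(obj + orig_size, TOY_REDZONE, object_size - orig_size);
}

/* Return 1 if the slack still holds the redzone pattern, 0 otherwise. */
static int toy_redzone_intact(const unsigned char *obj, size_t object_size,
			      size_t orig_size)
{
	for (size_t i = orig_size; i < object_size; i++)
		if (obj[i] != TOY_REDZONE)
			return 0;
	return 1;
}

/* Buggy variant: zero the whole object, redzone included. */
static void toy_init_on_free_full(unsigned char *obj, size_t object_size)
{
	memset(obj, 0, object_size);
}

/* Fixed variant: zero only the bytes the caller originally asked for. */
static void toy_init_on_free_orig(unsigned char *obj, size_t orig_size)
{
	memset(obj, 0, orig_size);
}
```

    For an 8-byte object with orig_size 5, the full-size variant leaves
    zeros where the checker expects 0xcc, while the orig_size variant
    keeps the redzone intact.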

 
mm/filemap: fix filemap_get_folios_contig THP panic [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:17 2024 -0700

    mm/filemap: fix filemap_get_folios_contig THP panic
    
    commit c225c4f6056b46a8a5bf2ed35abf17a2d6887691 upstream.
    
    Patch series "memfd-pin huge page fixes".
    
    Fix multiple bugs that occur when using memfd_pin_folios with hugetlb
    pages and THP.  The hugetlb bugs only bite when the page is not yet
    faulted in when memfd_pin_folios is called.  The THP bug bites when the
    starting offset passed to memfd_pin_folios is not huge page aligned.  See
    the commit messages for details.
    
    
    This patch (of 5):
    
    memfd_pin_folios on memory backed by THP panics if the requested start
    offset is not huge page aligned:
    
    BUG: kernel NULL pointer dereference, address: 0000000000000036
    RIP: 0010:filemap_get_folios_contig+0xdf/0x290
    RSP: 0018:ffffc9002092fbe8 EFLAGS: 00010202
    RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002
    
    The fault occurs here, because xas_load returns a folio with value 2:
    
        filemap_get_folios_contig()
            for (folio = xas_load(&xas); folio && xas.xa_index <= end;
                            folio = xas_next(&xas)) {
                    ...
                    if (!folio_try_get(folio))   <-- BOOM
    
    "2" is an xarray sibling entry.  We get it because memfd_pin_folios does
    not round the indices passed to filemap_get_folios_contig to huge page
    boundaries for THP, so we load from the middle of a huge page range and see a
    sibling.  (It does round for hugetlbfs, at the is_file_hugepages test).
    
    To fix, if the folio is a sibling, then return the next index as the
    starting point for the next call to filemap_get_folios_contig.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Vivek Kasireddy <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
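
    Why the value 2 is a sibling entry can be sketched in userspace C. The
    toy_* helpers below are written for this example and are modelled on
    the xarray's internal-entry encoding (low two bits equal to binary 10);
    the 64-slot chunk size is an assumption about the configuration, and
    none of this is the kernel's code.

```c
#include <stdint.h>

#define TOY_XA_CHUNK_SIZE 64u	/* assumption: 64 slots per xarray node */

/* Internal entries carry a small value shifted left with tag bits 10. */
static uintptr_t toy_mk_internal(uintptr_t v)
{
	return (v << 2) | 2;
}

static int toy_is_internal(uintptr_t entry)
{
	return (entry & 3) == 2;
}

/* A sibling entry names the slot offset that holds the real entry. */
static uintptr_t toy_mk_sibling(unsigned int offset)
{
	return toy_mk_internal(offset);
}

static int toy_is_sibling(uintptr_t entry)
{
	return toy_is_internal(entry) &&
	       entry < toy_mk_sibling(TOY_XA_CHUNK_SIZE - 1);
}
```

    In this model toy_mk_sibling(0) is 2, so the entry returned by
    xas_load in the report above reads as "the real folio lives in slot 0".
    A word-aligned folio pointer has both low bits clear and is never
    classified as a sibling, which is what lets the fix tell the two apart
    before dereferencing.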

 
mm/gup: fix memfd_pin_folios alloc race panic [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:21 2024 -0700

    mm/gup: fix memfd_pin_folios alloc race panic
    
    commit ce645b9fdc78ec5d28067286e92871ddae6817d5 upstream.
    
    If memfd_pin_folios tries to create a hugetlb page, but someone else
    already did, then folio gets the value -EEXIST here:
    
            folio = memfd_alloc_folio(memfd, start_idx);
            if (IS_ERR(folio)) {
                    ret = PTR_ERR(folio);
                    if (ret != -EEXIST)
                            goto err;
    
    then on the next trip through the "while start_idx" loop we panic here:
    
            if (folio) {
                    folio_put(folio);
    
    To fix, set the folio to NULL on error.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
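
    The failure mode above comes from an error value encoded in a pointer
    surviving into a later "if (folio)" test. A minimal userspace sketch of
    the pattern follows; the toy_* helpers imitate the kernel's error
    pointer convention for illustration, and toy_alloc_step() is a
    hypothetical stand-in, not the memfd_pin_folios code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode it again. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/*
 * Normalise an allocation result so that a later "if (folio)" test
 * never sees an error pointer. Returns 0 or a negative errno; *out is
 * a valid pointer or NULL.
 */
static int toy_alloc_step(void *alloc_result, void **out)
{
	if (toy_is_err(alloc_result)) {
		long ret = toy_ptr_err(alloc_result);

		*out = NULL;	/* the fix: drop the error pointer */
		return ret == -EEXIST ? 0 : (int)ret;
	}
	*out = alloc_result;
	return 0;
}
```

    Without the assignment to NULL, the -EEXIST case would leave a
    non-NULL error pointer behind, and the next loop iteration would treat
    it as a folio to release.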

mm/gup: fix memfd_pin_folios hugetlb page allocation [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:20 2024 -0700

    mm/gup: fix memfd_pin_folios hugetlb page allocation
    
    commit 9289f020da47ef04b28865589eeee3d56d4bafea upstream.
    
    When memfd_pin_folios -> memfd_alloc_folio creates a hugetlb page, the
    index is wrong.  The subsequent call to filemap_get_folios_contig thus
    cannot find it, and fails, and memfd_pin_folios loops forever.  To fix,
    adjust the index for the huge_page_order.
    
    memfd_alloc_folio also forgets to unlock the folio, so the next touch of
    the page calls hugetlb_fault which blocks forever trying to take the lock.
    Unlock it.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm/hugetlb: fix memfd_pin_folios free_huge_pages leak [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:18 2024 -0700

    mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
    
    commit c56b6f3d801d7ec8965993342bdd9e2972b6cb8e upstream.
    
    memfd_pin_folios followed by unpin_folios fails to restore free_huge_pages
    if the pages were not already faulted in, because the folio refcount for
    pages created by memfd_alloc_folio never goes to 0.  memfd_pin_folios
    needs another folio_put to undo the folio_try_get below:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            folio_ref_unfreeze(folio, 1);    ; adds 1 refcount
      folio_try_get()                        ; adds 1 refcount
      hugetlb_add_to_page_cache()            ; adds 512 refcount (on x86)
    
    With the fix, after memfd_pin_folios + unpin_folios, the refcount for the
    (unfaulted) page is 512, which is correct, as the refcount for a faulted
    unpinned page is 513.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:19 2024 -0700

    mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
    
    commit 26a8ea80929c518bdec5e53a5776f95919b7c88e upstream.
    
    memfd_pin_folios followed by unpin_folios leaves resv_huge_pages elevated
    if the pages were not already faulted in.  During a normal page fault,
    resv_huge_pages is consumed here:
    
    hugetlb_fault()
      alloc_hugetlb_folio()
        dequeue_hugetlb_folio_vma()
          dequeue_hugetlb_folio_nodemask()
            dequeue_hugetlb_folio_node_exact()
              free_huge_pages--
          resv_huge_pages--
    
    During memfd_pin_folios, the page is created by calling
    alloc_hugetlb_folio_nodemask instead of alloc_hugetlb_folio, and
    resv_huge_pages is not modified:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            free_huge_pages--
    
    alloc_hugetlb_folio_nodemask has other callers that must not modify
    resv_huge_pages.  Therefore, to fix, define an alternate version of
    alloc_hugetlb_folio_nodemask for this call site that adjusts
    resv_huge_pages.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/hugetlb: simplify refs in memfd_alloc_folio [+ + +]
Author: Steve Sistare <[email protected]>
Date:   Wed Sep 4 12:41:08 2024 -0700

    mm/hugetlb: simplify refs in memfd_alloc_folio
    
    commit dc677b5f3765cfd0944c8873d1ea57f1a3439676 upstream.
    
    The folio_try_get in memfd_alloc_folio is not necessary.  Delete it, and
    delete the matching folio_put in memfd_pin_folios.  This also avoids
    leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache
    fails.  That error path is also broken in a second way -- when its
    folio_put causes the ref to become 0, it will implicitly call
    free_huge_folio, but then the path *explicitly* calls free_huge_folio.
    Delete the latter.
    
    This is a continuation of the fix
      "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak"
    
    [[email protected]: remove explicit call to free_huge_folio(), per Matthew]
      Link: https://lkml.kernel.org/r/[email protected]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Suggested-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm: krealloc: consider spare memory for __GFP_ZERO [+ + +]
Author: Danilo Krummrich <[email protected]>
Date:   Tue Aug 13 00:34:34 2024 +0200

    mm: krealloc: consider spare memory for __GFP_ZERO
    
    commit 1a83a716ec233990e1fd5b6fbb1200ade63bf450 upstream.
    
    As long as krealloc() is called with __GFP_ZERO consistently, starting
    with the initial memory allocation, __GFP_ZERO should be fully honored.
    
    However, if for an existing allocation krealloc() is called with a
    decreased size, it is not ensured that the spare portion of the allocation is
    zeroed.  Thus, if krealloc() is subsequently called with a larger size
    again, __GFP_ZERO can't be fully honored, since we don't know the previous
    size, but only the bucket size.
    
    Example:
    
            buf = kzalloc(64, GFP_KERNEL);
            memset(buf, 0xff, 64);
    
            buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
    
            /* After this call the last 16 bytes are still 0xff. */
            buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
    
    Fix this, by explicitly setting spare memory to zero, when shrinking an
    allocation with __GFP_ZERO flag set or init_on_alloc enabled.
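
    The effect of the fix can be modelled in userspace. This is only a
    sketch: BUCKET_SIZE, model_shrink() and tail_is_zero() are hypothetical
    names, and the buffer stands in for a slab object whose bucket size
    (64 bytes here, by assumption) is larger than the requested size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKET_SIZE 64	/* assumed size of the bucket backing the object */

/*
 * Hypothetical model of an in-place shrink: the object keeps its bucket,
 * and when zeroing is requested the spare tail [new_size, BUCKET_SIZE)
 * is cleared so that a later grow within the bucket sees zeroes.
 */
static void model_shrink(unsigned char *buf, size_t new_size, int want_zero)
{
	assert(new_size <= BUCKET_SIZE);
	if (want_zero)
		memset(buf + new_size, 0, BUCKET_SIZE - new_size);
}

/* Returns 1 if every byte in [from, BUCKET_SIZE) is zero, else 0. */
static int tail_is_zero(const unsigned char *buf, size_t from)
{
	for (size_t i = from; i < BUCKET_SIZE; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```

    Shrinking a 0xff-filled buffer to 48 bytes without zeroing leaves the
    last 16 bytes dirty; with zeroing requested they are cleared, while
    the first 48 bytes are left untouched.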
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Danilo Krummrich <[email protected]>
    Acked-by: Vlastimil Babka <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Cc: Christoph Lameter <[email protected]>
    Cc: Hyeonggon Yoo <[email protected]>
    Cc: Joonsoo Kim <[email protected]>
    Cc: Pekka Enberg <[email protected]>
    Cc: Roman Gushchin <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: z3fold: deprecate CONFIG_Z3FOLD [+ + +]
Author: Yosry Ahmed <[email protected]>
Date:   Mon Oct 7 19:21:16 2024 +0000

    mm: z3fold: deprecate CONFIG_Z3FOLD
    
    [ Upstream commit 7a2369b74abf76cd3e54c45b30f6addb497f831b ]
    
    The z3fold compressed pages allocator is rarely used; most users use
    zsmalloc.  The only disadvantage of zsmalloc in comparison is the
    dependency on MMU, and zbud is a more common option for !MMU as it was the
    default zswap allocator for a long time.
    
    Historically, zsmalloc had worse latency than zbud and z3fold but offered
    better memory savings.  This is no longer the case as shown by a simple
    recent analysis [1].  That analysis showed that z3fold does not have any
    advantage over zsmalloc or zbud considering both performance and memory
    usage.  In a kernel build test on tmpfs in a limited cgroup, z3fold took
    3% more time and used 1.8% more memory.  The latency of zswap_load() was
    7% higher, and that of zswap_store() was 10% higher.  Zsmalloc is better
    in all metrics.
    
    Moreover, z3fold apparently has latent bugs, which were made noticeable by
    a recent soft lockup bug report with z3fold [2].  Switching to zsmalloc
    not only fixed the problem, but also reduced the swap usage from 6~8G to
    1~2G.  Other users have also reported being bitten by mistakenly enabling
    z3fold.
    
    Other than hurting users, z3fold is repeatedly causing wasted engineering
    effort.  Apart from investigating the above bug, it came up in multiple
    development discussions (e.g.  [3]) as something we need to handle, when
    there aren't any legit users (at least not intentionally).
    
    The natural course of action is to deprecate z3fold, and remove in a few
    cycles if no objections are raised from active users.  Next on the list
    should be zbud, as it offers marginal latency gains at the cost of huge
    memory waste when compared to zsmalloc.  That one will need to wait until
    zsmalloc does not depend on MMU.
    
    Rename the user-visible config option from CONFIG_Z3FOLD to
    CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
    prompt with explanation during make oldconfig.  Also, remove
    CONFIG_Z3FOLD=y from defconfigs.
    
    [1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com/
    [2]https://lore.kernel.org/lkml/[email protected]/
    [3]https://lore.kernel.org/lkml/CAJD7tkbnmeVugfunffSovJf9FAgy9rhBVt_tx=nxUveLUfqVsA@mail.gmail.com/
    
    [[email protected]: deprecate ZSWAP_ZPOOL_DEFAULT_Z3FOLD as well]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Yosry Ahmed <[email protected]>
    Signed-off-by: Arnd Bergmann <[email protected]>
    Acked-by: Chris Down <[email protected]>
    Acked-by: Nhat Pham <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Acked-by: Vitaly Wool <[email protected]>
    Acked-by: Christoph Hellwig <[email protected]>
    Cc: Aneesh Kumar K.V <[email protected]>
    Cc: Christophe Leroy <[email protected]>
    Cc: Huacai Chen <[email protected]>
    Cc: Miaohe Lin <[email protected]>
    Cc: Michael Ellerman <[email protected]>
    Cc: Naveen N. Rao <[email protected]>
    Cc: Nicholas Piggin <[email protected]>
    Cc: Sergey Senozhatsky <[email protected]>
    Cc: WANG Xuerui <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    (cherry picked from commit 7a2369b74abf76cd3e54c45b30f6addb497f831b)
    Signed-off-by: Yosry Ahmed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/mlx5: Added cond_resched() to crdump collection [+ + +]
Author: Mohamed Khalfella <[email protected]>
Date:   Wed Sep 4 22:02:48 2024 -0600

    net/mlx5: Added cond_resched() to crdump collection
    
    [ Upstream commit ec793155894140df7421d25903de2e6bc12c695b ]
    
    Collecting crdump involves reading vsc registers from the pci config
    space of the mlx device, which can take a long time to complete. This
    might result in starving other threads waiting to run on the cpu.
    
    Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:
    
    - mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
    - mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
      read the entire 1310716 bytes. It was called 53813 times because
      there are jumps in read_addr.
    - On average mlx5_vsc_gw_read_fast() took 35284.4ns.
    - In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
      The average time for each call was 17548.3ns. In some instances
      vsc_read() was called more than one time when the flag was not set.
      As expected the thread released the cpu after 16 iterations in
      mlx5_vsc_wait_on_flag().
    - Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.
    
    It was seen in the field that crdump can take more than 5 seconds to
    complete. During that time mlx5_vsc_wait_on_flag() did not release the
    cpu because it did not complete 16 iterations. It is believed that pci
    config reads were slow. Adding cond_resched() every 128 register reads
    improves the situation. In the common case, the crdump takes ~1.8989s,
    the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread
    yields the cpu every ~18.0ms.
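
    The common-case figure can be reproduced with a small userspace model.
    This is a sketch only: model_cond_resched(), model_crdump() and
    yield_interval_ns() are hypothetical names, and the loop merely counts
    where a yield would happen instead of reading any register.

```c
#include <assert.h>

#define YIELD_EVERY 128	/* stride quoted in the commit message */

/* Hypothetical stand-in for cond_resched(): it only counts the yields. */
static unsigned long yield_count;

static void model_cond_resched(void)
{
	yield_count++;
}

/* Walks nr_reads register reads, yielding after every YIELD_EVERY-th one. */
static unsigned long model_crdump(unsigned long nr_reads)
{
	assert(nr_reads > 0);
	yield_count = 0;
	for (unsigned long i = 1; i <= nr_reads; i++) {
		/* a real driver would read one 4-byte register here */
		if (i % YIELD_EVERY == 0)
			model_cond_resched();
	}
	return yield_count;
}

/* Average time between two yields, in nanoseconds. */
static double yield_interval_ns(double ns_per_read)
{
	return ns_per_read * YIELD_EVERY;
}
```

    With the lab numbers above (53813 reads at 35284.4ns each), the model
    yields 420 times, roughly once every 4.5ms.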
    
    Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support")
    Reviewed-by: Yuanyuan Zhong <[email protected]>
    Signed-off-by: Mohamed Khalfella <[email protected]>
    Reviewed-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Fix error path in multi-packet WQE transmit [+ + +]
Author: Gerd Bayer <[email protected]>
Date:   Tue Sep 10 10:53:51 2024 +0200

    net/mlx5: Fix error path in multi-packet WQE transmit
    
    [ Upstream commit 2bcae12c795f32ddfbf8c80d1b5f1d3286341c32 ]
    
    Remove the erroneous unmap in case no DMA mapping was established
    
    The multi-packet WQE transmit code attempts to obtain a DMA mapping for
    the skb. This could fail, e.g. under memory pressure, when the IOMMU
    driver just can't allocate more memory for page tables. While the code
    tries to handle this in the path below the err_unmap label it erroneously
    unmaps one entry from the sq's FIFO list of active mappings. Since the
    current map attempt failed this unmap is removing some random DMA mapping
    that might still be required. If the PCI function now presents that IOVA,
    the IOMMU may assume a rogue DMA access and e.g. on s390 puts the PCI
    function in error state.
    
    The erroneous behavior was seen in a stress-test environment that created
    memory pressure.
    
    Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
    Signed-off-by: Gerd Bayer <[email protected]>
    Reviewed-by: Zhu Yanjun <[email protected]>
    Acked-by: Maxim Mikityanskiy <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice [+ + +]
Author: Jianbo Liu <[email protected]>
Date:   Mon Sep 2 09:40:58 2024 +0300

    net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
    
    [ Upstream commit 7b124695db40d5c9c5295a94ae928a8d67a01c3d ]
    
    The km.state is not checked in driver's delayed work. When
    xfrm_state_check_expire() is called, the state can be reset to
    XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This
    happens when xfrm state is deleted, but not freed yet. As
    __xfrm_state_delete() is called again in xfrm timer, the following
    crash occurs.
    
    To fix this issue, skip xfrm_state_check_expire() if km.state is not
    XFRM_STATE_VALID.
    
     Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
     CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
     Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core]
     RIP: 0010:__xfrm_state_delete+0x3d/0x1b0
     Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48
     RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246
     RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036
     RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980
     RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000
     R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246
     R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400
     FS:  0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      <IRQ>
      ? die_addr+0x33/0x90
      ? exc_general_protection+0x1a2/0x390
      ? asm_exc_general_protection+0x22/0x30
      ? __xfrm_state_delete+0x3d/0x1b0
      ? __xfrm_state_delete+0x2f/0x1b0
      xfrm_timer_handler+0x174/0x350
      ? __xfrm_state_delete+0x1b0/0x1b0
      __hrtimer_run_queues+0x121/0x270
      hrtimer_run_softirq+0x88/0xd0
      handle_softirqs+0xcc/0x270
      do_softirq+0x3c/0x50
      </IRQ>
      <TASK>
      __local_bh_enable_ip+0x47/0x50
      mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core]
      process_one_work+0x137/0x2d0
      worker_thread+0x28d/0x3a0
      ? rescuer_thread+0x480/0x480
      kthread+0xb8/0xe0
      ? kthread_park+0x80/0x80
      ret_from_fork+0x2d/0x50
      ? kthread_park+0x80/0x80
      ret_from_fork_asm+0x11/0x20
      </TASK>
    
    Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
    Signed-off-by: Jianbo Liu <[email protected]>
    Reviewed-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() [+ + +]
Author: Elena Salomatkina <[email protected]>
Date:   Tue Sep 24 19:00:18 2024 +0300

    net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
    
    [ Upstream commit f25389e779500cf4a59ef9804534237841bce536 ]
    
    In mlx5e_tir_builder_alloc() kvzalloc() may return NULL
    which is dereferenced on the next line in a reference
    to the modify field.
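
    The pattern being fixed can be sketched in userspace as follows. The
    struct layout, the function name and the simulate_oom switch are
    hypothetical; calloc() stands in for kvzalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical builder object; only the 'modify' field matters here. */
struct model_builder {
	int modify;
};

/*
 * Allocates a zeroed builder and sets 'modify' only after the allocation
 * has been checked. simulate_oom forces the failure path for testing.
 */
static struct model_builder *model_builder_alloc(int modify, int simulate_oom)
{
	struct model_builder *b;

	b = simulate_oom ? NULL : calloc(1, sizeof(*b));
	if (!b)
		return NULL;	/* bail out before touching b->modify */

	b->modify = modify;
	return b;
}
```

    Callers of such a helper are expected to handle the NULL return.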
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object")
    Signed-off-by: Elena Salomatkina <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Reviewed-by: Gal Pressman <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: SHAMPO, Fix overflow of hd_per_wq [+ + +]
Author: Dragos Tatulea <[email protected]>
Date:   Tue Aug 13 13:34:54 2024 +0300

    net/mlx5e: SHAMPO, Fix overflow of hd_per_wq
    
    [ Upstream commit 023d2a43ed0d9ab73d4a35757121e4c8e01298e5 ]
    
    With larger RQ sizes and small MTU sizes, the hd_per_wq variable
    can overflow. Like in the following case:
    
    $> ethtool --set-ring eth1 rx 8192
    $> ip link set dev eth1 mtu 144
    $> ethtool --features eth1 rx-gro-hw on
    
    ... yields in dmesg:
    
    mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22)
    
    because hd_per_wq is 64K which overflows to 0 and makes the command
    fail.
    
    This patch increases the variable size to 32 bit.
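
    A minimal userspace illustration of the truncation, assuming the old
    variable was 16 bits wide as described above. The helper names are
    hypothetical and do not appear in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Stores a header count in a 16-bit variable; values >= 64K wrap. */
static uint16_t store_hd_per_wq_u16(uint32_t headers)
{
	return (uint16_t)headers;
}

/* Stores the same count in a 32-bit variable; 64K is representable. */
static uint32_t store_hd_per_wq_u32(uint32_t headers)
{
	return headers;
}
```

    Storing 64K (65536) in the 16-bit variant gives 0, while the 32-bit
    variant keeps the value intact.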
    
    Fixes: 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
    Signed-off-by: Dragos Tatulea <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/ncsi: Disable the ncsi work before freeing the associated structure [+ + +]
Author: Eddie James <[email protected]>
Date:   Wed Sep 25 10:55:23 2024 -0500

    net/ncsi: Disable the ncsi work before freeing the associated structure
    
    [ Upstream commit a0ffa68c70b367358b2672cdab6fa5bc4c40de2c ]
    
    The work function can run after the ncsi device is freed, resulting
    in use-after-free bugs or kernel panic.
    
    Fixes: 2d283bdd079c ("net/ncsi: Resource management")
    Signed-off-by: Eddie James <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/xen-netback: prevent UAF in xenvif_flush_hash() [+ + +]
Author: Jeongjun Park <[email protected]>
Date:   Fri Aug 23 03:11:09 2024 +0900

    net/xen-netback: prevent UAF in xenvif_flush_hash()
    
    [ Upstream commit 0fa5e94a1811d68fbffa0725efe6d4ca62c03d12 ]
    
    xenvif_flush_hash() iterates with list_for_each_entry_rcu, but the
    kfree_rcu call is not inside an rcu read critical section, so if the
    rcu grace period ends during the iteration, a UAF occurs when
    accessing head->next after the entry has been freed.
    
    Therefore, to solve this, you need to change it to list_for_each_entry_safe.
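
    The idea behind a "safe" iteration can be sketched in userspace. This
    is not the kernel list API: struct model_entry, model_push() and
    model_flush() are hypothetical, and free() releases the entry at once
    instead of deferring it as kfree_rcu does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked cache entry. */
struct model_entry {
	struct model_entry *next;
	int tag;
};

/* Prepends a new entry and returns the new list head. */
static struct model_entry *model_push(struct model_entry *head, int tag)
{
	struct model_entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->tag = tag;
	e->next = head;
	return e;
}

/*
 * Frees every entry. The next pointer is cached before the current
 * entry is released, so the loop never reads freed memory.
 * Returns the number of entries freed.
 */
static int model_flush(struct model_entry **head)
{
	struct model_entry *e = *head, *next;
	int freed = 0;

	while (e) {
		next = e->next;	/* read before e goes away */
		free(e);
		freed++;
		e = next;
	}
	*head = NULL;
	return freed;
}
```

    An unsafe variant would advance with e = e->next after free(e), which
    is the kind of access to a freed entry described above.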
    
    Signed-off-by: Jeongjun Park <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net: add more sanity checks to qdisc_pkt_len_init() [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:57 2024 +0000

    net: add more sanity checks to qdisc_pkt_len_init()
    
    [ Upstream commit ab9a9a9e9647392a19e7a885b08000e89c86b535 ]
    
    One path takes care of SKB_GSO_DODGY, assuming
    skb->len is bigger than hdr_len.
    
    virtio_net_hdr_to_skb() does not fully dissect TCP headers,
    it only makes sure it is at least 20 bytes.
    
    It is possible for a user to provide a malicious 'GSO' packet,
    with a total length of 80 bytes.
    
    - 20 bytes of IPv4 header
    - 60 bytes TCP header
    - a small gso_size like 8
    
    virtio_net_hdr_to_skb() would declare this packet as a normal
    GSO packet, because it would see 40 bytes of payload,
    bigger than gso_size.
    
    We need to detect this case to not underflow
    qdisc_skb_cb(skb)->pkt_len.
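
    The mismatch between the two views of the packet can be worked through
    in userspace. The helper names are hypothetical; the 20-byte minimum
    TCP header is the value mentioned above.

```c
#include <assert.h>

#define MIN_TCP_HDR 20	/* minimum TCP header length assumed by the early check */

/* Payload as seen when only the minimum TCP header is accounted for. */
static int apparent_payload(int skb_len, int ip_hdr_len)
{
	return skb_len - ip_hdr_len - MIN_TCP_HDR;
}

/* Payload once the real TCP header length is taken into account. */
static int real_payload(int skb_len, int ip_hdr_len, int tcp_hdr_len)
{
	return skb_len - ip_hdr_len - tcp_hdr_len;
}

/* A packet is treated as GSO when its payload exceeds gso_size. */
static int looks_like_gso(int payload, int gso_size)
{
	assert(gso_size > 0);
	return payload > gso_size;
}
```

    For the 80-byte packet above, the apparent payload is 40 bytes, which
    passes the gso_size check of 8, while the real payload is 0 bytes.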
    
    Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Add netif_get_gro_max_size helper for GRO [+ + +]
Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:41 2024 +0200

    net: Add netif_get_gro_max_size helper for GRO
    
    [ Upstream commit e8d4d34df715133c319fabcf63fdec684be75ff8 ]
    
    Add a small netif_get_gro_max_size() helper which returns the maximum IPv4
    or IPv6 GRO size of the netdevice.
    
    We later add a netif_get_gso_max_size() equivalent as well for GSO, so that
    these helpers can be used consistently instead of open-coded checks.
    
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Stable-dep-of: e609c959a939 ("net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size")
    Signed-off-by: Sasha Levin <[email protected]>

net: atlantic: Avoid warning about potential string truncation [+ + +]
Author: Simon Horman <[email protected]>
Date:   Wed Aug 21 16:58:57 2024 +0100

    net: atlantic: Avoid warning about potential string truncation
    
    [ Upstream commit 5874e0c9f25661c2faefe4809907166defae3d7f ]
    
    W=1 builds with GCC 14.2.0 warn that:
    
    .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                           ^~
    .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                        ^~~~~~~
    .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
    the range is 0 - 255. Further, on inspecting the code, it seems
    that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
    the range is actually 0 - 8.
    
    So, it seems that the condition that GCC flags will not occur.
    But, nonetheless, it would be nice if it didn't emit the warning.
    
    It seems that this can be achieved by changing the format specifier
    from %d to %u, in which case I believe GCC recognises an upper bound
    on the range of tc of 0 - 255. After some experimentation I think
    this is due to the combination of the use of %u and the type of
    cfg->tcs (u8).
    
    Empirically, updating the type of the tc variable to unsigned int
    has the same effect.
    
    As both of these changes seem to make sense in relation to what the code
    is actually doing - iterating over unsigned values - do both.
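
    As a userspace sketch of the resulting call (format_tc() is a
    hypothetical wrapper, not a function in the driver), the 8-byte buffer
    is large enough for every value a u8 can hold:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats "TC<n> " into an 8-byte buffer using %u.
 * Returns the length snprintf() reports, excluding the trailing NUL.
 */
static int format_tc(char out[8], uint8_t tc)
{
	unsigned int n = tc;	/* unsigned, at most 255 */

	return snprintf(out, 8, "TC%u ", n);
}
```

    The longest possible output is "TC255 ", which is 6 characters plus
    the terminating NUL.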
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: avoid potential underflow in qdisc_pkt_len_init() with UFO [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:56 2024 +0000

    net: avoid potential underflow in qdisc_pkt_len_init() with UFO
    
    [ Upstream commit c20029db28399ecc50e556964eaba75c43b1e2f1 ]
    
    After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
    requests coming from user") virtio_net_hdr_to_skb() had a sanity check
    to detect malicious attempts from user space to cook a bad GSO packet.
    
    Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
    transport header in UFO") while fixing one issue, allowed user space
    to cook a GSO packet with the following characteristics:
    
    IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
    
    When this packet arrives in qdisc_pkt_len_init(), we end up
    with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
    
    Then the following sets gso_segs to 0 :
    
    gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                            shinfo->gso_size);
    
    Then later we set qdisc_skb_cb(skb)->pkt_len back to zero :/
    
    qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
    
    This leads to the following crash in fq_codel [1]
    
    qdisc_pkt_len_init() is best effort, we only want an estimation
    of the bytes sent on the wire, not crashing the kernel.
    
    This patch is fixing this particular issue, a following one
    adds more sanity checks for another potential bug.
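
    The arithmetic described above can be reproduced in userspace. This is
    a sketch: model_pkt_len() mirrors the two statements quoted above with
    plain integers, and guarded_pkt_len() shows one generic way to avoid
    the wrap. The guard is an illustration, not necessarily the change
    made by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Unguarded model of the pkt_len estimation for a GSO packet. */
static unsigned int model_pkt_len(unsigned int skb_len, unsigned int hdr_len,
				  unsigned int gso_size)
{
	unsigned int pkt_len = skb_len;
	uint16_t gso_segs;

	assert(gso_size != 0);
	gso_segs = (uint16_t)DIV_ROUND_UP(skb_len - hdr_len, gso_size);
	/* with gso_segs == 0 this adds (unsigned)-hdr_len and wraps */
	pkt_len += (gso_segs - 1) * hdr_len;
	return pkt_len;
}

/* Same estimation, but packets without payload keep pkt_len == skb_len. */
static unsigned int guarded_pkt_len(unsigned int skb_len, unsigned int hdr_len,
				    unsigned int gso_size)
{
	if (skb_len <= hdr_len)
		return skb_len;
	return model_pkt_len(skb_len, hdr_len, gso_size);
}
```

    For skb->len = 28, hdr_len = 28 and gso_size = 3, the unguarded model
    returns 0, while the guarded one returns 28.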
    
    [1]
    [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [   70.724561] #PF: supervisor read access in kernel mode
    [   70.724561] #PF: error_code(0x0000) - not-present page
    [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
    [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
    [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
    [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
    All code
    ========
       0:   24 08                   and    $0x8,%al
       2:   49 c1 e1 06             shl    $0x6,%r9
       6:   44 89 7c 24 18          mov    %r15d,0x18(%rsp)
       b:   45 31 ed                xor    %r13d,%r13d
       e:   45 31 c0                xor    %r8d,%r8d
      11:   31 ff                   xor    %edi,%edi
      13:   89 44 24 14             mov    %eax,0x14(%rsp)
      17:   4c 03 8b 90 01 00 00    add    0x190(%rbx),%r9
      1e:   eb 04                   jmp    0x24
      20:   39 ca                   cmp    %ecx,%edx
      22:   73 37                   jae    0x5b
      24:   4d 8b 39                mov    (%r9),%r15
      27:   83 c7 01                add    $0x1,%edi
      2a:*  49 8b 17                mov    (%r15),%rdx              <-- trapping instruction
      2d:   49 89 11                mov    %rdx,(%r9)
      30:   41 8b 57 28             mov    0x28(%r15),%edx
      34:   45 8b 5f 34             mov    0x34(%r15),%r11d
      38:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      3f:   49                      rex.WB
    
    Code starting with the faulting instruction
    ===========================================
       0:   49 8b 17                mov    (%r15),%rdx
       3:   49 89 11                mov    %rdx,(%r9)
       6:   41 8b 57 28             mov    0x28(%r15),%edx
       a:   45 8b 5f 34             mov    0x34(%r15),%r11d
       e:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      15:   49                      rex.WB
    [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
    [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
    [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
    [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
    [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
    [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
    [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
    [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
    [   70.724561] Call Trace:
    [   70.724561]  <TASK>
    [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
    [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
    [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
    [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
    [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
    [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
    [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
    [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
    [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
    [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
    [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
    [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    [   70.724561] RIP: 0033:0x41ae09
    
    Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Jonathan Davies <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: Jonathan Davies <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: improve shutdown sequence [+ + +]
Author: Vladimir Oltean <[email protected]>
Date:   Fri Sep 13 23:35:49 2024 +0300

    net: dsa: improve shutdown sequence
    
    [ Upstream commit 6c24a03a61a245fe34d47582898331fa034b6ccd ]
    
    Alexander Sverdlin presents 2 problems during shutdown with the
    lan9303 driver. One is specific to lan9303 and the other just happens
    to reproduce there.
    
    The first problem is that lan9303 is unique among DSA drivers in that it
    calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown,
    not remove):
    
    phy_state_machine()
    -> ...
       -> dsa_user_phy_read()
          -> ds->ops->phy_read()
             -> lan9303_phy_read()
                -> chip->ops->phy_read()
                   -> lan9303_mdio_phy_read()
                      -> dev_get_drvdata()
    
    But we never stop the phy_state_machine(), so it may continue to run
    after dsa_switch_shutdown(). Our common pattern in all DSA drivers is
    to set drvdata to NULL to suppress the remove() method that may come
    afterwards. But in this case it will result in an NPD.
    
    The second problem is that the way in which we set
    dp->conduit->dsa_ptr = NULL; is concurrent with receive packet
    processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL,
    but afterwards, rather than continuing to use that non-NULL value,
    dev->dsa_ptr is dereferenced again and again without NULL checks:
    dsa_conduit_find_user() and many other places. In between dereferences,
    there is no locking to ensure that what was valid once continues to be
    valid.
    
    Both problems have the common aspect that closing the conduit interface
    solves them.
    
    In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN
    event in dsa_user_netdevice_event() which closes user ports as well.
    dsa_port_disable_rt() calls phylink_stop(), which synchronously stops
    the phylink state machine, and ds->ops->phy_read() will thus no longer
    call into the driver after this point.
    
    In the second case, dev_close(conduit) should do this, as per
    Documentation/networking/driver.rst:
    
    | Quiescence
    | ----------
    |
    | After the ndo_stop routine has been called, the hardware must
    | not receive or transmit any data.  All in flight packets must
    | be aborted. If necessary, poll or wait for completion of
    | any reset commands.
    
    So it should be sufficient to ensure that later, when we zeroize
    conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call
    on this conduit.
    
    The addition of the netif_device_detach() function is to ensure that
    ioctls, rtnetlinks and ethtool requests on the user ports no longer
    propagate down to the driver - we're no longer prepared to handle them.
    
    The race condition actually did not exist when commit 0650bf52b31f
    ("net: dsa: be compatible with masters which unregister on shutdown")
    first introduced dsa_switch_shutdown(). It was created later, when we
    stopped unregistering the user interfaces from a bad spot, and we just
    replaced that sequence with a racy zeroization of conduit->dsa_ptr
    (one which doesn't ensure that the interfaces aren't up).
    
    Reported-by: Alexander Sverdlin <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Fixes: ee534378f005 ("net: dsa: fix panic when DSA master device unbinds on shutdown")
    Reviewed-by: Alexander Sverdlin <[email protected]>
    Tested-by: Alexander Sverdlin <[email protected]>
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: lantiq_etop: fix memory disclosure [+ + +]
Author: Aleksander Jan Bajkowski <[email protected]>
Date:   Mon Sep 23 23:49:49 2024 +0200

    net: ethernet: lantiq_etop: fix memory disclosure
    
    [ Upstream commit 45c0de18ff2dc9af01236380404bbd6a46502c69 ]
    
    When applying padding, the buffer is not zeroed, which results in memory
    disclosure. The mentioned data is observed on the wire. This patch uses
    skb_put_padto() to pad Ethernet frames properly. The mentioned function
    zeroes the expanded buffer.
    
    In case the packet cannot be padded it is silently dropped. Statistics
    are also not incremented. This driver does not support statistics in the
    old 32-bit format or the new 64-bit format. These will be added in the
    future. In its current form, the patch should be easily backported to
    stable versions.
    
    Ethernet MACs on Amazon-SE and Danube cannot do padding of the packets
    in hardware, so software padding must be applied.
    
    Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver")
    Signed-off-by: Aleksander Jan Bajkowski <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
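
    A minimal userspace sketch of the zero-padding idea behind the
    lantiq_etop change above. pad_frame_zeroed() and MIN_FRAME_LEN are
    hypothetical names, not driver code, and the 60-byte target is an
    assumption (the minimum Ethernet frame length without FCS). The point
    is only that the bytes added by padding are cleared before the frame
    leaves the host, and that a frame which cannot be padded is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60 /* assumed target: minimum Ethernet frame without FCS */

/*
 * Pad a frame held in buf (capacity cap) up to MIN_FRAME_LEN.
 * The added tail is zeroed so stale buffer contents never reach the wire.
 * Returns the new length, or 0 if the buffer is too small (frame dropped).
 */
static size_t pad_frame_zeroed(unsigned char *buf, size_t len, size_t cap)
{
	if (len >= MIN_FRAME_LEN)
		return len;              /* already long enough */
	if (cap < MIN_FRAME_LEN)
		return 0;                /* cannot pad: caller drops the frame */
	memset(buf + len, 0, MIN_FRAME_LEN - len);
	return MIN_FRAME_LEN;
}
```

    In the kernel, skb_put_padto() plays this role: it extends the skb
    and zeroes the expanded area, freeing the skb on failure.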

net: fec: Reload PTP registers after link-state change [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:06 2024 +0200

    net: fec: Reload PTP registers after link-state change
    
    [ Upstream commit d9335d0232d2da605585eea1518ac6733518f938 ]
    
    On link-state change, the controller gets reset,
    which clears all PTP registers, including PHC time,
    calibrated clock correction values etc. For correct
    IEEE 1588 operation we need to restore these after
    the reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Reviewed-by: Wei Fang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: Restart PPS after link state change [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:04 2024 +0200

    net: fec: Restart PPS after link state change
    
    [ Upstream commit a1477dc87dc4996dcf65a4893d4e2c3a6b593002 ]
    
    On link state change, the controller gets reset,
    causing PPS to drop out. Re-enable PPS if it was
    enabled before the controller reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size [+ + +]
Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:42 2024 +0200

    net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size
    
    [ Upstream commit e609c959a939660c7519895f853dfa5624c6827a ]
    
    Commit 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    added a dev->gso_max_size test to gso_features_check() in order to fall
    back to GSO when needed.
    
    This was added as it was noticed that some drivers could misbehave if TSO
    packets get too big. However, the check doesn't respect dev->gso_ipv4_max_size
    limit. For instance, a device could be configured with BIG TCP for IPv4,
    but not IPv6.
    
    Therefore, add a netif_get_gso_max_size() equivalent to netif_get_gro_max_size()
    and use the helper to respect both limits before falling back to GSO engine.
    
    Fixes: 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
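
    The effect of honouring both limits can be sketched in plain C. The
    struct and function names below are hypothetical stand-ins, and the
    selection rule (IPv6 packets use gso_max_size, everything else uses
    gso_ipv4_max_size) is an assumption modelled on the description
    above, not a copy of the kernel helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two per-device limits. */
struct dev_limits {
	unsigned int gso_max_size;      /* limit assumed to apply to IPv6 */
	unsigned int gso_ipv4_max_size; /* limit assumed to apply to IPv4 */
};

/* Pick the limit that matches the packet's address family. */
static unsigned int pick_gso_max_size(const struct dev_limits *dev, bool is_ipv6)
{
	return is_ipv6 ? dev->gso_max_size : dev->gso_ipv4_max_size;
}

/* True when the packet is too big for the device and software GSO is needed. */
static bool needs_sw_gso(const struct dev_limits *dev, bool is_ipv6,
			 unsigned int len)
{
	return len >= pick_gso_max_size(dev, is_ipv6);
}
```

    With BIG TCP enabled for IPv4 only (say gso_ipv4_max_size = 200000
    and gso_max_size = 65536), a 100000-byte IPv4 packet stays on the
    hardware path while an IPv6 packet of the same size falls back to
    the GSO engine.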

net: gso: fix tcp fraglist segmentation after pull from frag_list [+ + +]
Author: Felix Fietkau <[email protected]>
Date:   Thu Sep 26 10:53:14 2024 +0200

    net: gso: fix tcp fraglist segmentation after pull from frag_list
    
    commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8 upstream.
    
    Detect tcp gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For TCP, this
    causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at
    tcp_hdr(seg->next).
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
    
    Approach and description based on a patch by Willem de Bruijn.
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
    Cc: [email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: hisilicon: hip04: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:19 2024 +0200

    net: hisilicon: hip04: fix OF node leak in probe()
    
    [ Upstream commit 17555297dbd5bccc93a01516117547e26a61caf1 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:20 2024 +0200

    net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
    
    [ Upstream commit 5680cf8d34e1552df987e2f4bb1bff0b2a8c8b11 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in hns_mac_get_info().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_mdio: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:21 2024 +0200

    net: hisilicon: hns_mdio: fix OF node leak in probe()
    
    [ Upstream commit e62beddc45f487b9969821fad3a0913d9bc18a2f ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:42:34 2024 +0800

    net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 09573b1cc76e7ff8f056ab29ea1cdc152ec8c653 ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() disables IRQ auto-enable when the IRQ is requested.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mvpp2: Increase size of queue_name buffer [+ + +]
Author: Simon Horman <[email protected]>
Date:   Tue Aug 6 12:28:24 2024 +0100

    net: mvpp2: Increase size of queue_name buffer
    
    [ Upstream commit 91d516d4de48532d967a77967834e00c8c53dfe6 ]
    
    Increase size of queue_name buffer from 30 to 31 to accommodate
    the largest string written to it. This avoids truncation in
    the possibly unlikely case where the name is the
    maximum size.
    
    Flagged by gcc-14:
    
      .../mvpp2_main.c: In function 'mvpp2_probe':
      .../mvpp2_main.c:7636:32: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                                ^
      .../mvpp2_main.c:7635:9: note: 'snprintf' output between 10 and 31 bytes into a destination of size 30
       7635 |         snprintf(priv->queue_name, sizeof(priv->queue_name),
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7637 |                  priv->port_count > 1 ? "+" : "");
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Introduced by commit 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics").
    I am not flagging this as a bug as I am not aware that it is one.
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Marcin Wojtas <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
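
    The truncation gcc warns about above can be reproduced in userspace.
    This is a sketch only: format_queue_name() is a hypothetical helper,
    and the 20-character device name used in the note below is an
    assumed worst case chosen to match the 31-byte upper bound gcc
    reports.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the workqueue name into buf.
 * Returns 1 if snprintf() had to truncate the output, 0 otherwise.
 */
static int format_queue_name(char *buf, size_t size, const char *name,
			     int port_count)
{
	int n = snprintf(buf, size, "stats-wq-%s%s", name,
			 port_count > 1 ? "+" : "");

	/* snprintf() returns the length it wanted to write, excluding NUL */
	return n < 0 || (size_t)n >= size;
}
```

    With a 20-character name and more than one port, the formatted
    string needs 9 + 20 + 1 characters plus the terminating NUL, i.e.
    31 bytes, so a 30-byte buffer truncates and a 31-byte buffer does
    not.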

net: napi: Prevent overflow of napi_defer_hard_irqs [+ + +]
Author: Joe Damato <[email protected]>
Date:   Wed Sep 4 15:34:30 2024 +0000

    net: napi: Prevent overflow of napi_defer_hard_irqs
    
    [ Upstream commit 08062af0a52107a243f7608fd972edb54ca5b7f8 ]
    
    In commit 6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
    napi_defer_irqs was added to net_device and napi_defer_irqs_count was
    added to napi_struct, both as type int.
    
    This value never goes below zero, so there is no reason for it to be a
    signed int. Change the type for both from int to u32, and add an
    overflow check to sysfs to limit the value to S32_MAX.
    
    The limit of S32_MAX was chosen because the practical limit before this
    patch was S32_MAX (anything larger was an overflow) and thus there are
    no behavioral changes introduced. If the extra bit is needed in the
    future, the limit can be raised.
    
    Before this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    $ cat /sys/class/net/eth4/napi_defer_hard_irqs
    -2147483647
    
    After this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    bash: line 0: echo: write error: Numerical result out of range
    
    Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
    
    include/linux/netdevice.h:      unsigned int            tx_queue_len;
    
    And has an overflow check:
    
    dev_change_tx_queue_len(..., unsigned long new_len):
    
      if (new_len != (unsigned int)new_len)
              return -ERANGE;
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
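
    A small userspace sketch of the before/after behaviour shown above.
    old_store() and set_defer_hard_irqs() are hypothetical names, not
    the sysfs handlers. The first assumes a two's complement platform
    where an out-of-range conversion to a signed 32-bit type wraps,
    which is implementation-defined in older C standards.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Before: the value ends up in a signed 32-bit field and can go negative. */
static int32_t old_store(unsigned long val)
{
	return (int32_t)(uint32_t)val;
}

/* After: unsigned field, with writes above S32_MAX rejected. */
static int set_defer_hard_irqs(uint32_t *field, unsigned long val)
{
	if (val > INT32_MAX)
		return -ERANGE;
	*field = (uint32_t)val;
	return 0;
}
```

    Writing 2147483649 through old_store() yields -2147483647 on such
    platforms, matching the "before" output, while
    set_defer_hard_irqs() returns -ERANGE and leaves the field
    untouched.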

net: pcs: xpcs: fix the wrong register that was written back [+ + +]
Author: Jiawen Wu <[email protected]>
Date:   Tue Sep 24 10:28:57 2024 +0800

    net: pcs: xpcs: fix the wrong register that was written back
    
    commit 93ef6ee5c20e9330477930ec6347672c9e0cf5a6 upstream.
    
    The value is read from the register TXGBE_RX_GEN_CTL3, and it should be
    written back to TXGBE_RX_GEN_CTL3 when it changes some fields.
    
    Cc: [email protected]
    Fixes: f629acc6f210 ("net: pcs: xpcs: support to switch mode for Wangxun NICs")
    Signed-off-by: Jiawen Wu <[email protected]>
    Reported-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: phy: Check for read errors in SIOCGMIIREG [+ + +]
Author: Niklas Söderlund <[email protected]>
Date:   Tue Sep 3 19:15:36 2024 +0200

    net: phy: Check for read errors in SIOCGMIIREG
    
    [ Upstream commit 569bf6d481b0b823c3c9c3b8be77908fd7caf66b ]
    
    When reading registers from the PHY using the SIOCGMIIREG IOCTL any
    errors returned from either mdiobus_read() or mdiobus_c45_read() are
    ignored, and parts of the returned error are passed as the register
    value back to user-space.
    
    For example, if mdiobus_c45_read() is used with a bus that does not
    implement the read_c45() callback, -EOPNOTSUPP is returned. This is
    however directly stored in mii_data->val_out and returned as the
    register's content. As val_out is a u16 the error code is truncated and
    returned as a plausible register value.
    
    Fix this by first checking the return value for errors before returning
    it as the register content.
    
    Before this patch,
    
        # phytool read eth0/0:1/0
        0xffa1
    
    After this change,
    
        $ phytool read eth0/0:1/0
        error: phy_read (-95)
    
    Signed-off-by: Niklas Söderlund <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Yoshihiro Shimoda <[email protected]>
    Tested-by: Yoshihiro Shimoda <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
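
    The 0xffa1 value in the "before" output above is what -95 looks
    like after truncation to 16 bits. A hedged userspace sketch, with
    hypothetical function names and a local constant standing in for
    the Linux value of EOPNOTSUPP (95):

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_EOPNOTSUPP 95 /* Linux errno value, hard-coded for the sketch */

/* Stand-in for a bus read that is not supported. */
static int fake_c45_read(void)
{
	return -EXAMPLE_EOPNOTSUPP;
}

/* Before: the return value is stored without checking it for errors. */
static uint16_t read_unchecked(void)
{
	return (uint16_t)fake_c45_read();
}

/* After: errors are propagated; val_out is written only on success. */
static int read_checked(uint16_t *val_out)
{
	int ret = fake_c45_read();

	if (ret < 0)
		return ret;
	*val_out = (uint16_t)ret;
	return 0;
}
```

    Here read_unchecked() returns 0xffa1, while read_checked() returns
    -95 and never touches val_out.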

net: phy: realtek: Check the index value in led_hw_control_get [+ + +]
Author: Hui Wang <[email protected]>
Date:   Fri Sep 27 19:46:10 2024 +0800

    net: phy: realtek: Check the index value in led_hw_control_get
    
    [ Upstream commit c283782fc5d60c4d8169137c6f955aa3553d3b3d ]
    
    Just like rtl8211f_led_hw_is_supported() and
    rtl8211f_led_hw_control_set(), the rtl8211f_led_hw_control_get() also
    needs to check the index value, otherwise the caller is likely to get
    incorrect rules.
    
    Fixes: 17784801d888 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F")
    Signed-off-by: Hui Wang <[email protected]>
    Reviewed-by: Marek Vasut <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sched: consistently use rcu_replace_pointer() in taprio_change() [+ + +]
Author: Dmitry Antipov <[email protected]>
Date:   Wed Sep 4 14:54:01 2024 +0300

    net: sched: consistently use rcu_replace_pointer() in taprio_change()
    
    [ Upstream commit d5c4546062fd6f5dbce575c7ea52ad66d1968678 ]
    
    According to Vinicius (and carefully looking through the whole
    https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
    once again), txtime branch of 'taprio_change()' is not going to
    race against 'advance_sched()'. But using 'rcu_replace_pointer()'
    in the former may be a good idea as well.
    
    Suggested-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: Dmitry Antipov <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs [+ + +]
Author: Jakub Kicinski <[email protected]>
Date:   Thu Aug 1 17:19:56 2024 -0700

    net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs
    
    [ Upstream commit c89cca307b20917da739567a255a68a0798ee129 ]
    
    build_skb() and frag allocations done with GFP_ATOMIC will
    fail in real life, when system is under memory pressure,
    and there's nothing we can do about that. So no point
    printing warnings.
    
    Signed-off-by: Jakub Kicinski <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sparx5: Fix invalid timestamps [+ + +]
Author: Aakash Menon <[email protected]>
Date:   Mon Sep 16 22:18:29 2024 -0700

    net: sparx5: Fix invalid timestamps
    
    [ Upstream commit 151ac45348afc5b56baa584c7cd4876addf461ff ]
    
    Bits 270-271 are occasionally unexpectedly set by the hardware. This issue
    was observed with 10G SFPs causing huge time errors (> 30ms) in PTP. Only
    30 bits are needed for the nanosecond part of the timestamp, so clear the
    2 most significant bits before extracting the timestamp from the internal
    frame header.
    
    Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping")
    Signed-off-by: Aakash Menon <[email protected]>
    Reviewed-by: Horatiu Vultur <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
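
    The masking described above amounts to keeping the low 30 bits of a
    32-bit nanosecond field. A minimal sketch, assuming the field has
    already been extracted into a uint32_t; TS_NSEC_MASK and
    ts_nsec_from_raw() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define TS_NSEC_MASK 0x3FFFFFFFu /* low 30 bits of the 32-bit field */

/* Drop the two most significant bits, which the hardware may set spuriously. */
static uint32_t ts_nsec_from_raw(uint32_t raw)
{
	return raw & TS_NSEC_MASK;
}
```

    Thirty bits are enough because the largest valid nanosecond value,
    999999999, is below 2^30 = 1073741824.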

net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check [+ + +]
Author: Shenwei Wang <[email protected]>
Date:   Tue Sep 24 15:54:24 2024 -0500

    net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check
    
    [ Upstream commit 4c1b56671b68ffcbe6b78308bfdda6bcce6491ae ]
    
    Increase the timeout for checking the busy bit of the VLAN Tag register
    from 10µs to 500ms. This change is necessary to accommodate scenarios
    where Energy Efficient Ethernet (EEE) is enabled.
    
    Overnight testing revealed that when EEE is active, the busy bit can
    remain set for up to approximately 300ms. The new 500ms timeout provides
    a safety margin.
    
    Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering")
    Reviewed-by: Andrew Lunn <[email protected]>
    Signed-off-by: Shenwei Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: Fix zero-division error when disabling tc cbs [+ + +]
Author: KhaiWenTan <[email protected]>
Date:   Wed Sep 18 14:14:22 2024 +0800

    net: stmmac: Fix zero-division error when disabling tc cbs
    
    commit 675faf5a14c14a2be0b870db30a70764df81e2df upstream.
    
    The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider
    when offload is disabled") allows the "port_transmit_rate_kbps" to be
    set to a value of 0, which is then passed to the "div_s64" function when
    tc-cbs is disabled. This leads to a zero-division error.
    
    When tc-cbs is disabled, the idleslope, sendslope, and credit values
    are not required to be configured. Therefore, adding a return
    statement after setting the txQ mode to DCB when tc-cbs is disabled would
    prevent a zero-division error.
    
    Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled")
    Cc: <[email protected]>
    Co-developed-by: Choong Yong Liang <[email protected]>
    Signed-off-by: Choong Yong Liang <[email protected]>
    Signed-off-by: KhaiWenTan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
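
    The shape of the fix above is an early return that keeps a zero
    rate away from the division. The sketch below is not the driver
    code: struct cbs_request, cbs_scale_idleslope() and the scaling
    formula are hypothetical and only illustrate the control flow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request, loosely modelled on a tc-cbs offload request. */
struct cbs_request {
	int enable;             /* 0 when tc-cbs is being disabled */
	int64_t idleslope;
	int64_t port_rate_kbps; /* may legitimately be 0 when disabled */
};

/* Returns 0 on success; *scaled is written only when offload is enabled. */
static int cbs_scale_idleslope(const struct cbs_request *req, int64_t *scaled)
{
	if (!req->enable)
		return 0;       /* early return: the division is never reached */
	if (req->port_rate_kbps == 0)
		return -1;      /* defensive: refuse to divide by zero */
	*scaled = req->idleslope * 1024 / req->port_rate_kbps;
	return 0;
}
```

    Returning before the slope calculation is sufficient because none
    of the computed values are needed once the queue is back in DCB
    mode.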

net: test for not too small csum_start in virtio_net_hdr_to_skb() [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()
    
    [ Upstream commit 49d14b54a527289d09a9480f214b8c586322310a ]
    
    syzbot was able to trigger this warning [1], after injecting a
    malicious packet through af_packet, setting skb->csum_start and thus
    the transport header to an incorrect value.
    
    We can at least make sure the transport header is after
    the end of the network header (with an estimated minimal size).
    
    [1]
    [   67.873027] skb len=4096 headroom=16 headlen=14 tailroom=0
    mac=(-1,-1) mac_len=0 net=(16,-6) trans=10
    shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
    csum(0xa start=10 offset=0 ip_summed=3 complete_sw=0 valid=0 level=0)
    hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
    priority=0x0 mark=0x0 alloc_cpu=10 vlan_all=0x0
    encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
    [   67.877172] dev name=veth0_vlan feat=0x000061164fdd09e9
    [   67.877764] sk family=17 type=3 proto=0
    [   67.878279] skb linear:   00000000: 00 00 10 00 00 00 00 00 0f 00 00 00 08 00
    [   67.879128] skb frag:     00000000: 0e 00 07 00 00 00 28 00 08 80 1c 00 04 00 00 02
    [   67.879877] skb frag:     00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.880647] skb frag:     00000020: 00 00 02 00 00 00 08 00 1b 00 00 00 00 00 00 00
    [   67.881156] skb frag:     00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.881753] skb frag:     00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882173] skb frag:     00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882790] skb frag:     00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883171] skb frag:     00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883733] skb frag:     00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.884206] skb frag:     00000090: 00 00 00 00 00 00 00 00 00 00 69 70 76 6c 61 6e
    [   67.884704] skb frag:     000000a0: 31 00 00 00 00 00 00 00 00 00 2b 00 00 00 00 00
    [   67.885139] skb frag:     000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.885677] skb frag:     000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886042] skb frag:     000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886408] skb frag:     000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887020] skb frag:     000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887384] skb frag:     00000100: 00 00
    [   67.887878] ------------[ cut here ]------------
    [   67.887908] offset (-6) >= skb_headlen() (14)
    [   67.888445] WARNING: CPU: 10 PID: 2088 at net/core/dev.c:3332 skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.889353] Modules linked in: macsec macvtap macvlan hsr wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 dummy bridge sr_mod cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p 9pnet netfs
    [   67.890111] CPU: 10 UID: 0 PID: 2088 Comm: b363492833 Not tainted 6.11.0-virtme #1011
    [   67.890183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   67.890309] RIP: 0010:skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891043] Call Trace:
    [   67.891173]  <TASK>
    [   67.891274] ? __warn (kernel/panic.c:741)
    [   67.891320] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891333] ? report_bug (lib/bug.c:180 lib/bug.c:219)
    [   67.891348] ? handle_bug (arch/x86/kernel/traps.c:239)
    [   67.891363] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
    [   67.891372] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
    [   67.891388] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891399] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891416] ip_do_fragment (net/ipv4/ip_output.c:777 (discriminator 1))
    [   67.891448] ? __ip_local_out (./include/linux/skbuff.h:1146 ./include/net/l3mdev.h:196 ./include/net/l3mdev.h:213 net/ipv4/ip_output.c:113)
    [   67.891459] ? __pfx_ip_finish_output2 (net/ipv4/ip_output.c:200)
    [   67.891470] ? ip_route_output_flow (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:96 (discriminator 13) ./include/linux/rcupdate.h:871 (discriminator 13) net/ipv4/route.c:2625 (discriminator 13) ./include/net/route.h:141 (discriminator 13) net/ipv4/route.c:2852 (discriminator 13))
    [   67.891484] ipvlan_process_v4_outbound (drivers/net/ipvlan/ipvlan_core.c:445 (discriminator 1))
    [   67.891581] ipvlan_queue_xmit (drivers/net/ipvlan/ipvlan_core.c:542 drivers/net/ipvlan/ipvlan_core.c:604 drivers/net/ipvlan/ipvlan_core.c:670)
    [   67.891596] ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:227)
    [   67.891607] dev_hard_start_xmit (./include/linux/netdevice.h:4916 ./include/linux/netdevice.h:4925 net/core/dev.c:3588 net/core/dev.c:3604)
    [   67.891620] __dev_queue_xmit (net/core/dev.h:168 (discriminator 25) net/core/dev.c:4425 (discriminator 25))
    [   67.891630] ? skb_copy_bits (./include/linux/uaccess.h:233 (discriminator 1) ./include/linux/uaccess.h:260 (discriminator 1) ./include/linux/highmem-internal.h:230 (discriminator 1) net/core/skbuff.c:3018 (discriminator 1))
    [   67.891645] ? __pskb_pull_tail (net/core/skbuff.c:2848 (discriminator 4))
    [   67.891655] ? skb_partial_csum_set (net/core/skbuff.c:5657)
    [   67.891666] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/skbuff.h:2791 (discriminator 3) ./include/linux/skbuff.h:2799 (discriminator 3) ./include/linux/virtio_net.h:109 (discriminator 3))
    [   67.891684] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   67.891700] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   67.891716] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   67.891734] ? do_sock_setsockopt (net/socket.c:2335)
    [   67.891747] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   67.891761] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   67.891772] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   67.891785] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    
    Fixes: 9181d6f8a2bb ("net: add more sanity check in virtio_net_hdr_to_skb()")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 19:57:43 2024 +0800

    net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable()
    
    [ Upstream commit d505d3593b52b6c43507f119572409087416ba28 ]
    
    It's important to undo pm_runtime_use_autosuspend() with
    pm_runtime_dont_use_autosuspend() at driver exit time.
    
    But pm_runtime_disable() and pm_runtime_dont_use_autosuspend() are
    missing in the error path of bam_dmux_probe(). So add them.
    
    Found by code review. Compile-tested only.
    
    Fixes: 21a0ffd9b38c ("net: wwan: Add Qualcomm BAM-DMUX WWAN network driver")
    Suggested-by: Stephan Gerhold <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Stephan Gerhold <[email protected]>
    Reviewed-by: Sergey Ryazanov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netdev-genl: Set extack and fix error on napi-get [+ + +]
Author: Joe Damato <[email protected]>
Date:   Sat Aug 31 12:17:04 2024 +0000

    netdev-genl: Set extack and fix error on napi-get
    
    [ Upstream commit 4e3a024b437ec0aee82550cc66a0f4e1a7a88a67 ]
    
    In commit 27f91aaf49b3 ("netdev-genl: Add netlink framework functions
    for napi"), when an invalid NAPI ID is specified the return value
    -EINVAL is used and no extack is set.
    
    Change the return value to -ENOENT and set the extack.
    
    Before this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                              --do napi-get --json='{"id": 451}'
    Netlink error: Invalid argument
    nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
            error: -22
    
    After this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                             --do napi-get --json='{"id": 451}'
    Netlink error: No such file or directory
    nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
            error: -2
            extack: {'bad-attr': '.id'}
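
    As an illustration only, the change in return value can be modelled in
    user space. The sketch below is Python rather than the kernel's C, and
    napi_get(), NAPI_TABLE and the returned tuple are hypothetical names
    chosen for this example, not the kernel's API.

```python
import errno

# Hypothetical table of known NAPI IDs; stands in for the kernel's lookup.
NAPI_TABLE = {8193: {"id": 8193, "ifindex": 2}}


def napi_get(napi_id):
    """Return (status, payload, extack) for a napi-get request.

    Models the behaviour described for after this commit: an unknown ID
    yields -ENOENT plus an extack naming the offending attribute, where
    the earlier behaviour was -EINVAL with no extack.
    """
    napi = NAPI_TABLE.get(napi_id)
    if napi is None:
        return -errno.ENOENT, None, {"bad-attr": ".id"}
    return 0, napi, None
```

    Calling napi_get(451) against this table gives the -ENOENT status and
    the '.id' extack seen in the second transcript above.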
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netfilter: nf_tables: do not remove elements if set backend implements .abort [+ + +]
Author: Pablo Neira Ayuso <[email protected]>
Date:   Mon Jul 15 13:32:31 2024 +0200

    netfilter: nf_tables: do not remove elements if set backend implements .abort
    
    [ Upstream commit c9526aeb4998393171d85225ff540e28c7d4ab86 ]
    
    The pipapo set backend maintains two copies of the data structure.
    Removing the elements from the copy that is going to be discarded slows
    down the abort path significantly; this patch cuts it from several
    minutes to a few seconds.
    
    This patch was previously reverted by
    
      f86fb94011ae ("netfilter: nf_tables: revert do not remove elements if set backend implements .abort")
    
    but it is now possible since recent work by Florian Westphal to perform
    on-demand clone from insert/remove path:
    
      532aec7e878b ("netfilter: nft_set_pipapo: remove dirty flag")
      3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
      a238106703ab ("netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone")
      c5444786d0ea ("netfilter: nft_set_pipapo: merge deactivate helper into caller")
      6c108d9bee44 ("netfilter: nft_set_pipapo: prepare walk function for on-demand clone")
      8b8a2417558c ("netfilter: nft_set_pipapo: prepare destroy function for on-demand clone")
      80efd2997fb9 ("netfilter: nft_set_pipapo: make pipapo_clone helper return NULL")
      a590f4760922 ("netfilter: nft_set_pipapo: move prove_locking helper around")
    
    After this series, the clone is fully released once aborted, so there
    is no need to take it back to its previous state. Thus, no stale
    reference to elements can occur.
    
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: nf_tables: prevent nf_skb_duplicated corruption [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 18:56:11 2024 +0000

    netfilter: nf_tables: prevent nf_skb_duplicated corruption
    
    [ Upstream commit 92ceba94de6fb4cee2bf40b485979c342f44a492 ]
    
    syzbot found that nf_dup_ipv4() or nf_dup_ipv6() could write
    per-cpu variable nf_skb_duplicated in an unsafe way [1].
    
    Disabling preemption as hinted by the splat is not enough,
    we have to disable soft interrupts as well.
    
    [1]
    BUG: using __this_cpu_write() in preemptible [00000000] code: syz.4.282/6316
     caller is nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
    CPU: 0 UID: 0 PID: 6316 Comm: syz.4.282 Not tainted 6.11.0-rc7-syzkaller-00104-g7052622fccb1 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
      check_preemption_disabled+0x10e/0x120 lib/smp_processor_id.c:49
      nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
      nft_dup_ipv4_eval+0x1db/0x300 net/ipv4/netfilter/nft_dup_ipv4.c:30
      expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline]
      nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288
      nft_do_chain_ipv4+0x202/0x320 net/netfilter/nft_chain_filter.c:23
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
      nf_hook+0x2c4/0x450 include/linux/netfilter.h:269
      NF_HOOK_COND include/linux/netfilter.h:302 [inline]
      ip_output+0x185/0x230 net/ipv4/ip_output.c:433
      ip_local_out net/ipv4/ip_output.c:129 [inline]
      ip_send_skb+0x74/0x100 net/ipv4/ip_output.c:1495
      udp_send_skb+0xacf/0x1650 net/ipv4/udp.c:981
      udp_sendmsg+0x1c21/0x2a60 net/ipv4/udp.c:1269
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x1a6/0x270 net/socket.c:745
      ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
      ___sys_sendmsg net/socket.c:2651 [inline]
      __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
      __do_sys_sendmmsg net/socket.c:2766 [inline]
      __se_sys_sendmmsg net/socket.c:2763 [inline]
      __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f4ce4f7def9
    Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f4ce5d4a038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00007f4ce5135f80 RCX: 00007f4ce4f7def9
    RDX: 0000000000000001 RSI: 0000000020005d40 RDI: 0000000000000006
    RBP: 00007f4ce4ff0b76 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007f4ce5135f80 R15: 00007ffd4cbc6d68
     </TASK>
    
    Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED [+ + +]
Author: Phil Sutter <[email protected]>
Date:   Wed Sep 25 20:01:20 2024 +0200

    netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
    
    [ Upstream commit 76f1ed087b562a469f2153076f179854b749c09a ]
    
    Fix the comment which incorrectly defines it as NLA_U32.
    
    Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
    Signed-off-by: Phil Sutter <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netfs: Cancel dirty folios that have no storage destination [+ + +]
Author: David Howells <[email protected]>
Date:   Mon Jul 29 12:23:11 2024 +0100

    netfs: Cancel dirty folios that have no storage destination
    
    [ Upstream commit 8f246b7c0a1be0882374f2ff831a61f0dbe77678 ]
    
    Kafs wants to be able to cache the contents of directories (and symlinks),
    but whilst these are downloaded from the server with the FS.FetchData RPC
    op and similar, the same as for regular files, they can't be updated by
    FS.StoreData, but rather have special operations (FS.MakeDir, etc.).
    
    Now, rather than redownloading a directory's content after each change made
    to that directory, kafs modifies the local blob.  This blob can be saved
    out to the cache, and since it's using netfslib, kafs just marks the folios
    dirty and lets ->writepages() on the directory take care of it, as for a
    regular file.
    
    This is fine as long as there's a cache as although the upload stream is
    disabled, there's a cache stream to drive the procedure.  But if the cache
    goes away in the meantime, suddenly there's no way to do any writes and the
    code gets confused, complains "R=%x: No submit" to dmesg and leaves the
    dirty folio hanging.
    
    Fix this by just cancelling the store of the folio if neither stream is
    active.  (If there's no cache at the time of dirtying, we should just not
    mark the folio dirty).
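
    The decision described above can be sketched as a small user-space
    model. This is Python, not netfslib code; writeback_action() and its
    two boolean parameters are hypothetical names for this example.

```python
def writeback_action(upload_active, cache_active):
    """Decide what to do with a dirty folio at writeback time.

    Returns ("submit", streams) when at least one stream can take the
    data, or ("cancel", []) when neither the upload stream nor the cache
    stream is active, so the folio is not left dirty with nowhere to go.
    """
    streams = []
    if upload_active:
        streams.append("upload")
    if cache_active:
        streams.append("cache")
    if not streams:
        return ("cancel", [])
    return ("submit", streams)
```

    For a kafs directory the upload stream is disabled, so the outcome
    depends only on whether a cache is still present.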
    
    Signed-off-by: David Howells <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]/ # v2
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfs: Fix missing wakeup after issuing writes [+ + +]
Author: David Howells <[email protected]>
Date:   Wed Oct 2 15:45:50 2024 +0100

    netfs: Fix missing wakeup after issuing writes
    
    [ Upstream commit 1ca4169c391c370e0f3a92938df2862900575096 ]
    
    After dividing up a proposed write into subrequests, netfslib sets
    NETFS_RREQ_ALL_QUEUED to indicate to the collector that it can move on to
    the final cleanup once it has emptied the subrequest queues.
    
    Now, whilst the collector will normally end up running at least once after
    this bit is set just because it takes a while to process all the write
    subrequests before the collector runs out of subrequests, there exists the
    possibility that the issuing thread will be forced to sleep and the
    collector thread will clean up all the subrequests before ALL_QUEUED gets
    set.
    
    In such a case, the collector thread will not get triggered again and will
    never clear NETFS_RREQ_IN_PROGRESS thus leaving a request uncompleted and
    causing a potential future hang.
    
    Fix this by scheduling the write collector if all the subrequest queues are
    empty (and thus no writes pending issuance).
    
    Note that we'd do this ideally before queuing the subrequest, but in the
    case of buffered writeback, at least, we can't find out that we've run out
    of folios until after we've called writeback_iter() and it has returned
    NULL - at which point we might not actually have any subrequests still
    under construction.
    
    Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netpoll: Ensure clean state on setup failures [+ + +]
Author: Breno Leitao <[email protected]>
Date:   Thu Aug 22 04:10:47 2024 -0700

    netpoll: Ensure clean state on setup failures
    
    [ Upstream commit ae5a0456e0b4cfd7e61619e55251ffdf1bc7adfb ]
    
    Modify netpoll_setup() and __netpoll_setup() to ensure that the netpoll
    structure (np) is left in a clean state if setup fails for any reason.
    This prevents carrying over misconfigured fields in case of partial
    setup success.
    
    Key changes:
    - np->dev is now set only after successful setup, ensuring it's always
      NULL if netpoll is not configured or if netpoll_setup() fails.
    - np->local_ip is zeroed if netpoll setup doesn't complete successfully.
    - Added DEBUG_NET_WARN_ON_ONCE() checks to catch unexpected states.
    - Reordered some operations in __netpoll_setup() for better logical flow.
    
    These changes improve the reliability of netpoll configuration, since it
    assures that the structure is fully initialized or totally unset.
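
    A minimal user-space sketch of the "fully initialized or totally
    unset" rule follows. It is Python, not the kernel code: the Netpoll
    class, the attach callback and the ip_overwritten variable are
    assumptions made for illustration.

```python
class Netpoll:
    """Tiny stand-in for struct netpoll, reduced to two fields."""

    def __init__(self, local_ip="0.0.0.0"):
        self.dev = None            # None plays the role of a NULL np->dev
        self.local_ip = local_ip   # "0.0.0.0" means "not configured"


def netpoll_setup(np, dev, attach):
    """Attach np to dev; on failure leave np as it was before the call.

    attach is a hypothetical callable returning 0 or a negative errno.
    """
    ip_overwritten = False
    if np.local_ip == "0.0.0.0":
        np.local_ip = dev["ip"]    # auto-filled from the device
        ip_overwritten = True
    err = attach(dev)
    if err:
        if ip_overwritten:
            np.local_ip = "0.0.0.0"  # undo the partial configuration
        return err
    np.dev = dev                   # set only after everything succeeded
    return 0
```

    With this ordering a caller can treat a non-None dev as proof that
    setup completed.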
    
    Suggested-by: Paolo Abeni <[email protected]>
    Signed-off-by: Breno Leitao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nfp: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:44:45 2024 +0800

    nfp: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit daaba19d357f0900b303a530ced96c78086267ea ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() disables IRQ auto-enable when the IRQ is requested.
    
    Reviewed-by: Louis Peens <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
NFSD: Async COPY result needs to return a write verifier [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:03 2024 -0400

    NFSD: Async COPY result needs to return a write verifier
    
    [ Upstream commit 9ed666eba4e0a2bb8ffaa3739d830b64d4f2aaad ]
    
    Currently, when NFSD handles an asynchronous COPY, it returns a
    zero write verifier, relying on the subsequent CB_OFFLOAD callback
    to pass the write verifier and a stable_how4 value to the client.
    
    However, if the CB_OFFLOAD never arrives at the client (for example,
    if a network partition occurs just as the server sends the
    CB_OFFLOAD operation), the client will never receive this verifier.
    Thus, if the client sends a follow-up COMMIT, there is no way for
    the client to assess the COMMIT result.
    
    The usual recovery for a missing CB_OFFLOAD is for the client to
    send an OFFLOAD_STATUS operation, but that operation does not carry
    a write verifier in its result. Neither does it carry a stable_how4
    value, so the client /must/ send a COMMIT in this case -- which will
    always fail because currently there's still no write verifier in the
    COPY result.
    
    Thus the server needs to return a normal write verifier in its COPY
    result even if the COPY operation is to be performed asynchronously.
    
    If the server recognizes the callback stateid in subsequent
    OFFLOAD_STATUS operations, then obviously it has not restarted, and
    the write verifier the client received in the COPY result is still
    valid and can be used to assess a COMMIT of the copied data, if one
    is needed.
    
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <[email protected]>

 
nfsd: fix delegation_blocked() to block correctly for at least 30 seconds [+ + +]
Author: NeilBrown <[email protected]>
Date:   Mon Sep 9 15:06:36 2024 +1000

    nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
    
    commit 45bb63ed20e02ae146336412889fe5450316a84f upstream.
    
    The pair of bloom filters used by delegation_blocked() was intended to
    block delegations on given filehandles for between 30 and 60 seconds.  A
    new filehandle would be recorded in the "new" bit set.  That would then
    be switched to the "old" bit set between 0 and 30 seconds later, and it
    would remain as the "old" bit set for 30 seconds.
    
    Unfortunately the code intended to clear the old bit set once it reached
    30 seconds old, preparing it to be the next new bit set, instead cleared
    the *new* bit set before switching it to be the old bit set.  This means
    that the "old" bit set is always empty and delegations are blocked
    between 0 and 30 seconds.
    
    This patch updates bd->new before clearing the set with that index,
    instead of afterwards.
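
    The effect of the ordering can be seen in a small user-space model.
    The sketch below is Python, uses plain sets in place of the bloom
    filter bit arrays, and takes the time as an explicit argument; the
    class and its fixed switch are hypothetical and exist only to compare
    the two orderings.

```python
class BlockedDelegations:
    """Two-set model of the bloom pair used by delegation_blocked()."""

    def __init__(self, fixed=True):
        self.sets = [set(), set()]
        self.new = 0           # index of the "new" set
        self.swap_time = 0
        self.fixed = fixed     # False reproduces the pre-patch ordering

    def block(self, fh, now):
        """Record fh in the "new" set."""
        if not (self.sets[0] or self.sets[1]):
            self.swap_time = now
        self.sets[self.new].add(fh)

    def is_blocked(self, fh, now):
        """Rotate the sets if more than 30 seconds passed, then test fh."""
        if now - self.swap_time > 30:
            if self.fixed:
                self.new = 1 - self.new        # switch first ...
                self.sets[self.new].clear()    # ... then clear the stale set
            else:
                self.sets[self.new].clear()    # wipes the live entries
                self.new = 1 - self.new
            self.swap_time = now
        return fh in self.sets[0] or fh in self.sets[1]
```

    With fixed=True a filehandle recorded at time 0 is still blocked at 31
    seconds and is dropped at the following rotation; with fixed=False it
    is already gone at 31 seconds.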
    
    Reported-by: Olga Kornievskaia <[email protected]>
    Cc: [email protected]
    Fixes: 6282cd565553 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.")
    Signed-off-by: NeilBrown <[email protected]>
    Reviewed-by: Benjamin Coddington <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
NFSD: Fix NFSv4's PUTPUBFH operation [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Sun Aug 11 13:11:07 2024 -0400

    NFSD: Fix NFSv4's PUTPUBFH operation
    
    commit 202f39039a11402dcbcd5fece8d9fa6be83f49ae upstream.
    
    According to RFC 8881, all minor versions of NFSv4 support PUTPUBFH.
    
    Replace the XDR decoder for PUTPUBFH with a "noop" since we no
    longer want the minorversion check, and PUTPUBFH has no arguments to
    decode. (Ideally nfsd4_decode_noop should really be called
    nfsd4_decode_void).
    
    PUTPUBFH should now behave just like PUTROOTFH.
    
    Reported-by: Cedric Blancher <[email protected]>
    Fixes: e1a90ebd8b23 ("NFSD: Combine decode operations for v4 and v4.1")
    Cc: Dan Shelton <[email protected]>
    Cc: Roland Mainz <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

NFSD: Limit the number of concurrent async COPY operations [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:04 2024 -0400

    NFSD: Limit the number of concurrent async COPY operations
    
    [ Upstream commit aadc3bbea163b6caaaebfdd2b6c4667fbc726752 ]
    
    Nothing appears to limit the number of concurrent async COPY
    operations that clients can start. In addition, AFAICT each async
    COPY can copy an unlimited number of 4MB chunks, so can run for a
    long time. Thus IMO async COPY can become a DoS vector.
    
    Add a restriction mechanism that bounds the number of concurrent
    background COPY operations. Start simple and try to be fair -- this
    patch implements a per-namespace limit.
    
    An async COPY request that occurs while this limit is exceeded gets
    NFS4ERR_DELAY. The requesting client can choose to send the request
    again after a delay or fall back to a traditional read/write style
    copy.
    
    If there is need to make the mechanism more sophisticated, we can
    visit that in future patches.
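
    As a rough user-space illustration of such a bound (Python, not the
    NFSD implementation), the sketch below counts pending copies and
    refuses new ones at the limit. AsyncCopyLimiter and the limit value
    are hypothetical; 10008 is the NFS4ERR_DELAY status code from the
    NFSv4 protocol.

```python
import threading

NFS4ERR_DELAY = 10008  # NFSv4 status code for "try again later"


class AsyncCopyLimiter:
    """Bound the number of background COPY operations in one namespace."""

    def __init__(self, limit):
        self._limit = limit
        self._pending = 0
        self._lock = threading.Lock()

    def try_start(self):
        """Return 0 if a new async COPY may start, else NFS4ERR_DELAY."""
        with self._lock:
            if self._pending >= self._limit:
                return NFS4ERR_DELAY
            self._pending += 1
            return 0

    def finish(self):
        """Release the slot held by a completed or cancelled COPY."""
        with self._lock:
            self._pending -= 1
```

    A client that receives NFS4ERR_DELAY can retry later or fall back to
    a read/write style copy, as described above.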
    
    Cc: [email protected]
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nfsd: map the EBADMSG to nfserr_io to avoid warning [+ + +]
Author: Li Lingfeng <[email protected]>
Date:   Sat Aug 17 14:27:13 2024 +0800

    nfsd: map the EBADMSG to nfserr_io to avoid warning
    
    commit 340e61e44c1d2a15c42ec72ade9195ad525fd048 upstream.
    
    Ext4 will throw -EBADMSG through ext4_readdir when a checksum error
    occurs, resulting in the following WARNING.
    
    Fix it by mapping EBADMSG to nfserr_io.
    
    nfsd_buffered_readdir
     iterate_dir // -EBADMSG -74
      ext4_readdir // .iterate_shared
       ext4_dx_readdir
        ext4_htree_fill_tree
         htree_dirblock_to_tree
          ext4_read_dirblock
           __ext4_read_dirblock
            ext4_dirblock_csum_verify
             warn_no_space_for_csum
              __warn_no_space_for_csum
            return ERR_PTR(-EFSBADCRC) // -EBADMSG -74
     nfserrno // WARNING
    
    [  161.115610] ------------[ cut here ]------------
    [  161.116465] nfsd: non-standard errno: -74
    [  161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0
    [  161.118596] Modules linked in:
    [  161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138
    [  161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    [  161.123601] RIP: 0010:nfserrno+0x9d/0xd0
    [  161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33
    [  161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286
    [  161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    [  161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a
    [  161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827
    [  161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021
    [  161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8
    [  161.135244] FS:  0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000
    [  161.136695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0
    [  161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  161.141519] PKRU: 55555554
    [  161.142076] Call Trace:
    [  161.142575]  ? __warn+0x9b/0x140
    [  161.143229]  ? nfserrno+0x9d/0xd0
    [  161.143872]  ? report_bug+0x125/0x150
    [  161.144595]  ? handle_bug+0x41/0x90
    [  161.145284]  ? exc_invalid_op+0x14/0x70
    [  161.146009]  ? asm_exc_invalid_op+0x12/0x20
    [  161.146816]  ? nfserrno+0x9d/0xd0
    [  161.147487]  nfsd_buffered_readdir+0x28b/0x2b0
    [  161.148333]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.149258]  ? nfsd_buffered_filldir+0xf0/0xf0
    [  161.150093]  ? wait_for_concurrent_writes+0x170/0x170
    [  161.151004]  ? generic_file_llseek_size+0x48/0x160
    [  161.151895]  nfsd_readdir+0x132/0x190
    [  161.152606]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.153516]  ? nfsd_unlink+0x380/0x380
    [  161.154256]  ? override_creds+0x45/0x60
    [  161.155006]  nfsd4_encode_readdir+0x21a/0x3d0
    [  161.155850]  ? nfsd4_encode_readlink+0x210/0x210
    [  161.156731]  ? write_bytes_to_xdr_buf+0x97/0xe0
    [  161.157598]  ? __write_bytes_to_xdr_buf+0xd0/0xd0
    [  161.158494]  ? lock_downgrade+0x90/0x90
    [  161.159232]  ? nfs4svc_decode_voidarg+0x10/0x10
    [  161.160092]  nfsd4_encode_operation+0x15a/0x440
    [  161.160959]  nfsd4_proc_compound+0x718/0xe90
    [  161.161818]  nfsd_dispatch+0x18e/0x2c0
    [  161.162586]  svc_process_common+0x786/0xc50
    [  161.163403]  ? nfsd_svc+0x380/0x380
    [  161.164137]  ? svc_printk+0x160/0x160
    [  161.164846]  ? svc_xprt_do_enqueue.part.0+0x365/0x380
    [  161.165808]  ? nfsd_svc+0x380/0x380
    [  161.166523]  ? rcu_is_watching+0x23/0x40
    [  161.167309]  svc_process+0x1a5/0x200
    [  161.168019]  nfsd+0x1f5/0x380
    [  161.168663]  ? nfsd_shutdown_threads+0x260/0x260
    [  161.169554]  kthread+0x1c4/0x210
    [  161.170224]  ? kthread_insert_work_sanity_check+0x80/0x80
    [  161.171246]  ret_from_fork+0x1f/0x30
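
    The mapping can be pictured with a reduced user-space model. The
    sketch below is Python with a deliberately partial table; the real
    nfserrno() covers many more errno values and returns wire-format
    status values, and the (status, warned) return pair is an assumption
    made so the fallback path is visible.

```python
import errno

# Host-order NFS status numbers (NFSERR_PERM, NFSERR_NOENT, NFSERR_IO).
NFSERR_PERM, NFSERR_NOENT, NFSERR_IO = 1, 2, 5

# Partial, illustrative errno table.
ERRNO_TO_NFSERR = {
    errno.EPERM: NFSERR_PERM,
    errno.ENOENT: NFSERR_NOENT,
    errno.EIO: NFSERR_IO,
    errno.EBADMSG: NFSERR_IO,   # the mapping this commit describes
}


def nfserrno(err):
    """Return (nfs_status, warned) for a negative errno.

    An errno missing from the table falls back to NFSERR_IO and sets
    warned, modelling the "non-standard errno" WARNING in the trace.
    """
    status = ERRNO_TO_NFSERR.get(-err)
    if status is None:
        return NFSERR_IO, True
    return status, False
```

    With the EBADMSG entry present the client still sees an I/O error,
    but the warning path is no longer taken.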
    
    Signed-off-by: Li Lingfeng <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
nvme-keyring: restrict match length for version '1' identifiers [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:18 2024 +0200

    nvme-keyring: restrict match length for version '1' identifiers
    
    [ Upstream commit 79559c75332458985ab8a21f11b08bf7c9b833b0 ]
    
    TP8018 introduced a new TLS PSK identifier version (version 1), which appended
    a PSK hash value to the existing identifier (cf NVMe TCP specification v1.1,
    section 3.6.1.3 'TLS PSK and PSK Identity Derivation').
    An original (version 0) identifier has the form:
    
    NVMe0<type><hmac> <hostnqn> <subsysnqn>
    
    and a version 1 identifier has the form:
    
    NVMe1<type><hmac> <hostnqn> <subsysnqn> <hash>
    
    This patch modifies the lookup algorithm to compare only the first part
    of the identifier (excluding the hash value) to handle both version 0 and
    version 1 identifiers.
    And the spec declares 'version 0' identifiers obsolete, so the lookup
    algorithm is modified to prefer v1 identifiers.
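
    A simplified user-space model of that lookup is sketched below. It is
    Python, not the keyring code; psk_prefix(), lookup_psk() and the
    list-of-strings keyring are hypothetical, and requiring a space after
    the prefix is a design choice of this example, not a statement about
    the kernel's matcher.

```python
def psk_prefix(version, psk_type, hmac, hostnqn, subnqn):
    """Build the identifier up to, but excluding, the optional hash."""
    return f"NVMe{version}{psk_type}{hmac:02d} {hostnqn} {subnqn}"


def lookup_psk(keyring, psk_type, hmac, hostnqn, subnqn):
    """Return the first matching key description, preferring version 1.

    Only the leading part of each description is compared, so a version 1
    entry matches regardless of its trailing hash value.
    """
    for version in (1, 0):
        prefix = psk_prefix(version, psk_type, hmac, hostnqn, subnqn)
        for desc in keyring:
            # Exact match (version 0 form) or prefix followed by " <hash>".
            if desc == prefix or desc.startswith(prefix + " "):
                return desc
    return None
```

    Checking for the separating space keeps a subsystem NQN that is
    merely a prefix of another from matching by accident.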
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nvme-tcp: check for invalidated or revoked key [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:20 2024 +0200

    nvme-tcp: check for invalidated or revoked key
    
    [ Upstream commit 5bc46b49c828a6dfaab80b71ecb63fe76a1096d2 ]
    
    key_lookup() will always return a key, even if that key is revoked
    or invalidated. So check for invalid keys before continuing.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: fix link failure for TCP auth [+ + +]
Author: Arnd Bergmann <[email protected]>
Date:   Mon Sep 9 20:21:09 2024 +0000

    nvme-tcp: fix link failure for TCP auth
    
    [ Upstream commit 2d5a333e09c388189238291577e443221baacba0 ]
    
    The nvme fabric driver calls the nvme_tls_key_lookup() function from
    nvmf_parse_key() when the keyring is enabled, but this is broken in a
    configuration with CONFIG_NVME_FABRICS=y and CONFIG_NVME_TCP=m because
    this leads to the function definition being in a loadable module:
    
    x86_64-linux-ld: vmlinux.o: in function `nvmf_parse_key':
    fabrics.c:(.text+0xb1bdec): undefined reference to `nvme_tls_key_lookup'
    
    Move the 'select' up to CONFIG_NVME_FABRICS itself to force this
    part to be built-in as well if needed.
    
    Fixes: 5bc46b49c828 ("nvme-tcp: check for invalidated or revoked key")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: sanitize TLS key handling [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:19 2024 +0200

    nvme-tcp: sanitize TLS key handling
    
    [ Upstream commit 363895767fbfa05891b0b4d9e06ebde7a10c6a07 ]
    
    There is a difference between TLS configured (ie the user has
    provisioned/requested a key) and TLS enabled (ie the connection
    is encrypted with TLS). This becomes important for secure concatenation,
    where the initial authentication is run on an unencrypted connection
    (ie with TLS configured, but not enabled), and then the queue is reset to
    run over TLS (ie TLS configured _and_ enabled).
    So to differentiate between those two states store the generated
    key in opts->tls_key (as we're using the same TLS key for all queues),
    the key serial of the resulting TLS handshake in ctrl->tls_pskid
    (to signal that TLS on the admin queue is enabled), and a simple
    flag for the queues to indicate that TLS has been enabled.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nvme: fix metadata handling in nvme-passthrough [+ + +]
Author: Puranjay Mohan <[email protected]>
Date:   Thu Aug 29 13:32:17 2024 +0000

    nvme: fix metadata handling in nvme-passthrough
    
    [ Upstream commit 7c2fd76048e95dd267055b5f5e0a48e6e7c81fd9 ]
    
    On an NVMe namespace that does not support metadata, it is possible to
    send an IO command with metadata through io-passthru. This allows issues
    like [1] to trigger in the completion code path.
    nvme_map_user_request() doesn't check if the namespace supports metadata
    before sending it forward. It also allows admin commands with metadata to
    be processed as it ignores metadata when bdev == NULL and may report
    success.
    
    Reject an IO command with metadata when the NVMe namespace doesn't
    support it and reject an admin command if it has metadata.
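
    The two rejection rules can be written down as a small user-space
    check. The sketch is Python, not nvme_map_user_request();
    check_passthru_metadata() and its parameters are hypothetical, and
    the use of -EINVAL as the error is an assumption of this example.

```python
import errno


def check_passthru_metadata(is_admin, meta_len, ns_supports_metadata):
    """Validate the metadata part of a passthrough command.

    Returns 0 when the command may proceed, or a negative errno when it
    carries metadata that cannot be honoured.
    """
    if meta_len == 0:
        return 0                  # no metadata, nothing to validate
    if is_admin:
        return -errno.EINVAL      # admin commands must not carry metadata
    if not ns_supports_metadata:
        return -errno.EINVAL      # namespace has no metadata support
    return 0
```

    Rejecting the request up front keeps a metadata buffer from reaching
    the completion path on a namespace that cannot use it.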
    
    [1] https://lore.kernel.org/all/[email protected]/
    
    Suggested-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Puranjay Mohan <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Anuj Gupta <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ocfs2: cancel dqi_sync_work before freeing oinfo [+ + +]
Author: Joseph Qi <[email protected]>
Date:   Wed Sep 4 15:10:03 2024 +0800

    ocfs2: cancel dqi_sync_work before freeing oinfo
    
    commit 35fccce29feb3706f649726d410122dd81b92c18 upstream.
    
    ocfs2_global_read_info() initializes and schedules dqi_sync_work at the
    end. If an error occurs after the global quota has been read successfully,
    it triggers the following warning with CONFIG_DEBUG_OBJECTS_* enabled:
    
    ODEBUG: free active (active state 0) object: 00000000d8b0ce28 object type: timer_list hint: qsync_work_fn+0x0/0x16c
    
    This reports that there is an active delayed work when freeing oinfo in
    error handling, so cancel dqi_sync_work first.  BTW, return status instead
    of -1 when .read_file_info fails.
    
    Link: https://syzkaller.appspot.com/bug?extid=f7af59df5d6b25f0febd
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 171bf93ce11f ("ocfs2: Periodic quota syncing")
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Heming Zhao <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix null-ptr-deref when journal load failed. [+ + +]
Author: Julian Sun <[email protected]>
Date:   Mon Sep 2 11:08:44 2024 +0800

    ocfs2: fix null-ptr-deref when journal load failed.
    
    commit 5784d9fcfd43bd853654bb80c87ef293b9e8e80a upstream.
    
    During the mounting process, if journal_reset() fails because the journal
    is too short, jbd2_journal_load() fails with a NULL j_sb_buffer.
    Subsequently, ocfs2_journal_shutdown() calls
    jbd2_journal_flush()->jbd2_cleanup_journal_tail()->
    __jbd2_update_log_tail()->jbd2_journal_update_sb_log_tail()
    ->lock_buffer(journal->j_sb_buffer), resulting in a null-pointer
    dereference error.
    
    To resolve this issue, we should check the JBD2_LOADED flag to ensure the
    journal was properly loaded.  Additionally, use journal instead of
    osb->journal directly to simplify the code.
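
    The guard described above can be sketched in user space as follows.
    This is an illustration under stated assumptions, not the kernel
    change itself: struct demo_journal, DEMO_LOADED and
    demo_journal_shutdown() are hypothetical stand-ins for the journal,
    the JBD2_LOADED flag and the shutdown path.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LOADED 0x1u /* hypothetical stand-in for JBD2_LOADED */

/* Hypothetical stand-in for the journal state relevant to the bug. */
struct demo_journal {
    unsigned int flags;
    int *sb_buffer; /* models j_sb_buffer; NULL when loading failed */
};

/*
 * Shutdown path with the guard in place.
 * Returns 0 if the journal was flushed, -1 if the flush was skipped
 * because the journal was never loaded.
 */
static int demo_journal_shutdown(struct demo_journal *j)
{
    assert(j != NULL);

    if (!(j->flags & DEMO_LOADED))
        return -1; /* never loaded: sb_buffer may be NULL, do not touch it */

    (*j->sb_buffer)++; /* models the flush updating the superblock buffer */
    return 0;
}
```

    Without the flag check, the dereference of sb_buffer would be reached
    even when loading failed and the pointer is NULL.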
    
    Link: https://syzkaller.appspot.com/bug?extid=05b9b39d8bdfe1a0861f
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: f6f50e28f0cb ("jbd2: Fail to load a journal if it is too short")
    Signed-off-by: Julian Sun <[email protected]>
    Reported-by: [email protected]
    Suggested-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate [+ + +]
Author: Lizhi Xu <[email protected]>
Date:   Mon Sep 2 10:36:36 2024 +0800

    ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
    
    commit 33b525cef4cff49e216e4133cc48452e11c0391e upstream.
    
    When doing cleanup, if flags does not include OCFS2_BH_READAHEAD and bh
    is NULL, the subsequent ocfs2_set_buffer_uptodate() call may dereference
    a NULL pointer.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <[email protected]>
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Reported-by: Heming Zhao <[email protected]>
    Suggested-by: Heming Zhao <[email protected]>
    Cc: <[email protected]>    [4.20+]
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix the la space leak when unmounting an ocfs2 volume [+ + +]
Author: Heming Zhao <[email protected]>
Date:   Fri Jul 19 19:43:10 2024 +0800

    ocfs2: fix the la space leak when unmounting an ocfs2 volume
    
    commit dfe6c5692fb525e5e90cefe306ee0dffae13d35f upstream.
    
    This bug has existed since the initial OCFS2 code.  The code logic in
    ocfs2_sync_local_to_main() is wrong, as it ignores the last contiguous
    free bits, which causes an OCFS2 volume to lose the last free clusters of
    the LA (local alloc) window on each umount command.
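
    The class of bug described above, a run-scanning loop that forgets
    the trailing run, can be shown with a minimal user-space sketch.  It
    does not reproduce ocfs2_sync_local_to_main(); bit_is_set() and
    release_free_runs() are hypothetical helpers, and "releasing" is
    modelled by simply counting bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if bit i is set (in use) in bitmap. */
static int bit_is_set(const unsigned char *bitmap, size_t i)
{
    return (bitmap[i / 8] >> (i % 8)) & 1;
}

/*
 * Walk nbits bits and return how many free (zero) bits were released.
 * Free bits are released one contiguous run at a time.
 */
static size_t release_free_runs(const unsigned char *bitmap, size_t nbits)
{
    size_t released = 0;
    size_t run = 0; /* length of the free run currently being scanned */

    assert(bitmap != NULL);

    for (size_t i = 0; i < nbits; i++) {
        if (!bit_is_set(bitmap, i)) {
            run++;
            continue;
        }
        released += run; /* a used bit ends the current free run */
        run = 0;
    }

    /*
     * Release the trailing run. If this step is omitted, free bits at
     * the end of the window are never returned, which is the kind of
     * leak the commit message describes.
     */
    released += run;
    return released;
}
```

    For a 16-bit window with bits 0-3 and bit 12 in use, the function
    releases 11 bits; a variant without the final step would release
    only 8 and leak bits 13-15.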
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Heming Zhao <[email protected]>
    Reviewed-by: Su Yue <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Heming Zhao <[email protected]>
    Cc: <