Changelog in Linux kernel 6.11.3

accel/ivpu: Add missing MODULE_FIRMWARE metadata [+ + +]

Author: Alexander F. Lent <[email protected]>
Date:   Tue Jul 9 07:54:14 2024 -0400

    accel/ivpu: Add missing MODULE_FIRMWARE metadata
    
    [ Upstream commit 58b5618ba80a5e5a8d531a70eae12070e5bd713f ]
    
    Modules that load firmware from various paths at runtime must declare
    those paths at compile time, via the MODULE_FIRMWARE macro, so that the
    firmware paths are included in the module's metadata.
    
    The accel/ivpu driver loads firmware but lacks this metadata,
    preventing dracut from correctly locating firmware files. Fix it.
    
    Fixes: 9ab43e95f922 ("accel/ivpu: Switch to generation based FW names")
    Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
    Signed-off-by: Alexander F. Lent <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240709-fix-ivpu-firmware-metadata-v3-1-55f70bba055b@xanderlent.com
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: battery: Fix possible crash when unregistering a battery hook [+ + +]

Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:34 2024 +0200

    ACPI: battery: Fix possible crash when unregistering a battery hook
    
    [ Upstream commit 76959aff14a0012ad6b984ec7686d163deccdc16 ]
    
    When a battery hook returns an error when adding a new battery, then
    the battery hook is automatically unregistered.
    However the battery hook provider cannot know that, so it will later
    call battery_hook_unregister() on the already unregistered battery
    hook, resulting in a crash.
    
    Fix this by using the list head to mark battery hooks that have already
    been unregistered, so that battery_hook_unregister() can ignore them.
    
    Fixes: fa93854f7a7e ("battery: Add the battery hooking API")
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
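
    The idea of using the list head as an "already unregistered" marker can
    be sketched in userspace as below. This is an illustration only, not the
    kernel code: struct hook, hook_register() and hook_unregister() are
    hypothetical stand-ins, and the list helpers merely imitate the behaviour
    of a self-pointing list node.

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    /* Minimal doubly linked list node, modelled on a kernel-style list head. */
    struct list_node {
        struct list_node *next, *prev;
    };

    static void list_init(struct list_node *n) { n->next = n; n->prev = n; }

    /* A node that points at itself is on no list. */
    static bool list_is_empty(const struct list_node *n) { return n->next == n; }

    static void list_add_node(struct list_node *n, struct list_node *head)
    {
        n->next = head->next;
        n->prev = head;
        head->next->prev = n;
        head->next = n;
    }

    /* Unlink the node and point it back at itself, marking it as removed. */
    static void list_del_and_init(struct list_node *n)
    {
        n->prev->next = n->next;
        n->next->prev = n->prev;
        list_init(n);
    }

    /* Hypothetical hook type; the real battery hook structure differs. */
    struct hook {
        struct list_node node;
        int removals; /* counts how often teardown actually ran */
    };

    static void hook_register(struct hook *h, struct list_node *hooks)
    {
        h->removals = 0;
        list_add_node(&h->node, hooks);
    }

    /* Safe to call twice: a self-pointing node means "already unregistered". */
    static void hook_unregister(struct hook *h)
    {
        if (list_is_empty(&h->node))
            return;
        list_del_and_init(&h->node);
        h->removals++;
    }
    ```

    Calling hook_unregister() a second time is then a no-op, which is the
    property the provider needs when the hook was already removed behind its
    back.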

ACPI: battery: Simplify battery hook locking [+ + +]

Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:33 2024 +0200

    ACPI: battery: Simplify battery hook locking
    
    [ Upstream commit 86309cbed26139e1caae7629dcca1027d9a28e75 ]
    
    Move the conditional locking from __battery_hook_unregister()
    into battery_hook_unregister() and rename the low-level function
    to simplify the locking during battery hook removal.
    
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Reviewed-by: Pali Rohár <[email protected]>
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Stable-dep-of: 76959aff14a0 ("ACPI: battery: Fix possible crash when unregistering a battery hook")
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: CPPC: Add support for setting EPP register in FFH [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Mon Sep 9 22:15:24 2024 -0500

    ACPI: CPPC: Add support for setting EPP register in FFH
    
    [ Upstream commit aaf21ac93909e08a12931173336bdb52ac8499f1 ]
    
    Some Asus AMD systems are reported to not be able to change EPP values
    because the BIOS doesn't advertise support for the CPPC MSR and the PCC
    region is not configured.
    
    However the ACPI 6.2 specification allows CPC registers to be declared
    in FFH:
    ```
    Starting with ACPI Specification 6.2, all _CPC registers can be in
    PCC, System Memory, System IO, or Functional Fixed Hardware address
    spaces. OSPM support for this more flexible register space scheme
    is indicated by the “Flexible Address Space for CPPC Registers” _OSC
    bit.
    ```
    
    If this _OSC bit has been set, allow using FFH to configure EPP.
    
    Reported-by: [email protected]
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
    Suggested-by: [email protected]
    Tested-by: [email protected]
    Tested-by: [email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: EC: Do not release locks during operation region accesses [+ + +]

Author: Rafael J. Wysocki <[email protected]>
Date:   Thu Jul 4 18:26:54 2024 +0200

    ACPI: EC: Do not release locks during operation region accesses
    
    [ Upstream commit dc171114926ec390ab90f46534545420ec03e458 ]
    
    It is not particularly useful to release locks (the EC mutex and the
    ACPI global lock, if present) and re-acquire them immediately thereafter
    during EC address space accesses in acpi_ec_space_handler().
    
    First, releasing them for a while before grabbing them again does not
    really help anyone because there may not be enough time for another
    thread to acquire them.
    
    Second, if another thread successfully acquires them and carries out
    a new EC write or read in the middle of an operation region access in
    progress, it may confuse the EC firmware, especially after the burst
    mode has been enabled.
    
    Finally, manipulating the locks after writing or reading every single
    byte of data is overhead that it is better to avoid.
    
    Accordingly, modify the code to carry out EC address space accesses
    entirely without releasing the locks.
    
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: PAD: fix crash in exit_round_robin() [+ + +]

Author: Seiji Nishikawa <[email protected]>
Date:   Sun Aug 25 23:13:52 2024 +0900

    ACPI: PAD: fix crash in exit_round_robin()
    
    [ Upstream commit 0a2ed70a549e61c5181bad5db418d223b68ae932 ]
    
    The kernel occasionally crashes in cpumask_clear_cpu(), which is called
    within exit_round_robin(), because when executing clear_bit(nr, addr) with
    nr set to 0xffffffff, the address calculation may cause misalignment within
    the memory, leading to access to an invalid memory address.
    
    ----------
    BUG: unable to handle kernel paging request at ffffffffe0740618
            ...
    CPU: 3 PID: 2919323 Comm: acpi_pad/14 Kdump: loaded Tainted: G           OE  X --------- -  - 4.18.0-425.19.2.el8_7.x86_64 #1
            ...
    RIP: 0010:power_saving_thread+0x313/0x411 [acpi_pad]
    Code: 89 cd 48 89 d3 eb d1 48 c7 c7 55 70 72 c0 e8 64 86 b0 e4 c6 05 0d a1 02 00 01 e9 bc fd ff ff 45 89 e4 42 8b 04 a5 20 82 72 c0 <f0> 48 0f b3 05 f4 9c 01 00 42 c7 04 a5 20 82 72 c0 ff ff ff ff 31
    RSP: 0018:ff72a5d51fa77ec8 EFLAGS: 00010202
    RAX: 00000000ffffffff RBX: ff462981e5d8cb80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ff46297556959d80 R08: 0000000000000382 R09: ff46297c8d0f38d8
    R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000000e
    R13: 0000000000000000 R14: ffffffffffffffff R15: 000000000000000e
    FS:  0000000000000000(0000) GS:ff46297a800c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffe0740618 CR3: 0000007e20410004 CR4: 0000000000771ee0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     ? acpi_pad_add+0x120/0x120 [acpi_pad]
     kthread+0x10b/0x130
     ? set_kthread_struct+0x50/0x50
     ret_from_fork+0x1f/0x40
            ...
    CR2: ffffffffe0740618
    
    crash> dis -lr ffffffffc0726923
            ...
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 114
    0xffffffffc0726918 <power_saving_thread+776>:   mov    %r12d,%r12d
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 325
    0xffffffffc072691b <power_saving_thread+779>:   mov    -0x3f8d7de0(,%r12,4),%eax
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./arch/x86/include/asm/bitops.h: 80
    0xffffffffc0726923 <power_saving_thread+787>:   lock btr %rax,0x19cf4(%rip)        # 0xffffffffc0740620 <pad_busy_cpus_bits>
    
    crash> px tsk_in_cpu[14]
    $66 = 0xffffffff
    
    crash> px 0xffffffffc072692c+0x19cf4
    $99 = 0xffffffffc0740620
    
    crash> sym 0xffffffffc0740620
    ffffffffc0740620 (b) pad_busy_cpus_bits [acpi_pad]
    
    crash> px pad_busy_cpus_bits[0]
    $42 = 0xfffc0
    ----------
    
    To fix this, ensure that tsk_in_cpu[tsk_index] != -1 before calling
    cpumask_clear_cpu() in exit_round_robin(), just as it is done in
    round_robin_cpu().
    
    Signed-off-by: Seiji Nishikawa <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject edit, avoid updates to the same value ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
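
    The guard described above can be sketched in userspace as follows. This
    is a simplified model under stated assumptions, not the acpi_pad code:
    busy_cpus, claim_cpu() and release_cpu() are hypothetical names, and a
    single 64-bit word stands in for the CPU mask.

    ```c
    #include <stdint.h>

    #define NO_CPU   (-1) /* sentinel: this worker currently owns no CPU */
    #define MAX_CPUS 64

    static uint64_t busy_cpus; /* one bit per CPU that is currently claimed */

    /* Record that the worker behind *slot now owns the given CPU. */
    static void claim_cpu(int *slot, int cpu)
    {
        if (cpu < 0 || cpu >= MAX_CPUS)
            return;
        busy_cpus |= UINT64_C(1) << cpu;
        *slot = cpu;
    }

    /*
     * Release the CPU owned by *slot. Without the sentinel check, a slot
     * holding -1 would be used as a bit number and address memory far
     * outside the mask.
     */
    static void release_cpu(int *slot)
    {
        if (*slot == NO_CPU)
            return; /* nothing claimed, and no need to rewrite the sentinel */
        busy_cpus &= ~(UINT64_C(1) << *slot);
        *slot = NO_CPU;
    }
    ```

    Releasing a slot that was never claimed, or releasing it twice, leaves
    the mask untouched.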

ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[] [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:06 2024 +0200

    ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
    
    commit 056301e7c7c886f96d799edd36f3406cc30e1822 upstream.
    
    Like other Asus ExpertBook models the B2502CVA has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the B2502CVA to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217760
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[] [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:05 2024 +0200

    ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
    
    commit 2f80ce0b78c340e332f04a5801dee5e4ac8cfaeb upstream.
    
    Like other Asus Vivobook models the X1704VAP has its keyboard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the X1704VAP to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Reported-by: Lamome Julien <[email protected]>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078696
    Closes: https://lore.kernel.org/all/[email protected]/
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:04 2024 +0200

    ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
    
    commit 63539defee17bf0cbd8e24078cf103efee9c6633 upstream.
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GA has a DSDT
    describing IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
        $ sudo dmesg | grep DMI:.*BIOS
        [    0.000000] DMI: ASUSTeK COMPUTER INC. Vivobook Go E1404GA_E1404GA/E1404GA, BIOS E1404GA.302 08/23/2023
        $ sudo cp /sys/firmware/acpi/tables/DSDT dsdt.dat
        $ iasl -d dsdt.dat
        $ grep -A 30 PS2K dsdt.dsl | grep IRQ -A 1
                    IRQ (Level, ActiveLow, Exclusive, )
                        {1}
    
    There already is an entry in the irq1_level_low_skip_override[] DMI match
    table for the "E1404GAB", change this to match on "E1404GA" to cover
    the E1404GA model as well (DMI_MATCH() does a substring match).
    
    Reported-by: Paul Menzel <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Remove duplicate Asus E1504GAB IRQ override [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:03 2024 +0200

    ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
    
    commit 65bdebf38e5fac7c56a9e05d3479a707e6dc783c upstream.
    
    Commit d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook
    E1504GA and E1504GAB") does exactly what the subject says, adding DMI
    matches for both the E1504GA and E1504GAB.
    
    But DMI_MATCH() does a substring match, so checking for E1504GA will also
    match E1504GAB.
    
    Drop the unnecessary E1504GAB entry since that is covered already by
    the E1504GA entry.
    
    Fixes: d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook E1504GA and E1504GAB")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
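
    Why the shorter pattern already covers the longer model name can be seen
    from a small userspace sketch of substring matching. field_matches() is
    a hypothetical helper built on strstr(); it only imitates the
    substring behaviour the changelog attributes to DMI_MATCH().

    ```c
    #include <stdbool.h>
    #include <string.h>

    /* Hypothetical stand-in for a substring-style DMI field comparison. */
    static bool field_matches(const char *dmi_value, const char *pattern)
    {
        return strstr(dmi_value, pattern) != NULL;
    }
    ```

    With this model, the pattern "E1504GA" matches both "E1504GA" and
    "E1504GAB", while the pattern "E1504GAB" does not match "E1504GA".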

ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB [+ + +]

Author: Tamim Khan <[email protected]>
Date:   Mon Sep 2 21:43:05 2024 -0400

    ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
    
    [ Upstream commit 49e9cc315604972cc14868cb67831e3e8c3f1470 ]
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GAB has a DSDT
    that describes IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
    This override prevents the internal keyboard from working.
    
    Fix the problem by adding this laptop to the table that prevents the kernel
    from overriding the IRQ.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219212
    Signed-off-by: Tamim Khan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Wed Sep 18 17:38:49 2024 +0200

    ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
    
    commit ac78288fe062b64e45a479eaae74aaaafcc8ecdd upstream.
    
    Dell All In One (AIO) models released after 2017 may use a backlight
    controller board connected to an UART.
    
    In DSDT this uart port will be defined as:
    
       Name (_HID, "DELL0501")
       Name (_CID, EisaId ("PNP0501"))
    
    The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with
    the above _HID + _CID. Loading the dell-uart-backlight driver fails with
    the following errors:
    
    [   18.261353] dell_uart_backlight serial0-0: Timed out waiting for response.
    [   18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version
    [   18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110
    
    Indicating that there is no backlight controller board attached to
    the UART, while the GPU's native backlight control method does work.
    
    Add a quirk to use the GPU's native backlight control method on this model.
    
    Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edit ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18 [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sat Sep 7 14:44:19 2024 +0200

    ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
    
    [ Upstream commit eb7b0f12e13ba99e64e3a690c2166895ed63b437 ]
    
    The Panasonic Toughbook CF-18 advertises both native and vendor backlight
    control interfaces. But only the vendor one actually works.
    
    acpi_video_get_backlight_type() will pick the non working native backlight
    by default, add a quirk to select the working vendor backlight instead.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package() [+ + +]

Author: Pei Xiao <[email protected]>
Date:   Thu Jul 18 14:05:48 2024 +0800

    ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
    
    [ Upstream commit a5242874488eba2b9062985bf13743c029821330 ]
    
    ACPICA commit 4d4547cf13cca820ff7e0f859ba83e1a610b9fd0
    
    ACPI_ALLOCATE_ZEROED() may fail, elements might be NULL and will cause
    NULL pointer dereference later.
    
    Link: https://github.com/acpica/acpica/commit/4d4547cf
    Signed-off-by: Pei Xiao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: Fix memory leak if acpi_ps_get_next_field() fails [+ + +]

Author: Armin Wolf <[email protected]>
Date:   Sun Apr 14 21:50:33 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
    
    [ Upstream commit e6169a8ffee8a012badd8c703716e761ce851b15 ]
    
    ACPICA commit 1280045754264841b119a5ede96cd005bc09b5a7
    
    If acpi_ps_get_next_field() fails, the previously created field list
    needs to be properly disposed before returning the status code.
    
    Link: https://github.com/acpica/acpica/commit/12800457
    Signed-off-by: Armin Wolf <[email protected]>
    [ rjw: Rename local variable to avoid compiler confusion ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
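
    The error-path pattern described above, disposing of a partially built
    list before returning a failure, can be sketched in userspace as follows.
    This is not the ACPICA parser: struct field, parse_fields() and the
    live_fields counter are hypothetical, and a negative input value stands
    in for a parse failure.

    ```c
    #include <stdlib.h>

    struct field {
        struct field *next;
        int value;
    };

    static int live_fields; /* allocation counter, used to check for leaks */

    static struct field *field_new(int value)
    {
        struct field *f = malloc(sizeof(*f));

        if (!f)
            return NULL;
        f->next = NULL;
        f->value = value;
        live_fields++;
        return f;
    }

    static void field_list_free(struct field *head)
    {
        while (head) {
            struct field *next = head->next;

            free(head);
            live_fields--;
            head = next;
        }
    }

    /* Build a list from values; a negative value is treated as a parse error. */
    static int parse_fields(const int *values, size_t count, struct field **out)
    {
        struct field *head = NULL, **tail = &head;

        for (size_t i = 0; i < count; i++) {
            struct field *f = values[i] < 0 ? NULL : field_new(values[i]);

            if (!f) {
                field_list_free(head); /* dispose of the partial list first */
                return -1;
            }
            *tail = f;
            tail = &f->next;
        }
        *out = head;
        return 0;
    }
    ```

    On failure the caller receives only the status code, so the cleanup has
    to happen inside the function that owns the partial list.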

ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails [+ + +]

Author: Armin Wolf <[email protected]>
Date:   Wed Apr 3 20:50:11 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
    
    [ Upstream commit 5accb265f7a1b23e52b0ec42313d1e12895552f4 ]
    
    ACPICA commit 2802af722bbde7bf1a7ac68df68e179e2555d361
    
    If acpi_ps_get_next_namepath() fails, the previously allocated
    union acpi_parse_object needs to be freed before returning the
    status code.
    
    The issue was first reported on the Linux ACPI mailing list:
    
    Link: https://lore.kernel.org/linux-acpi/[email protected]/T/
    Link: https://github.com/acpica/acpica/commit/2802af72
    Signed-off-by: Armin Wolf <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: iasl: handle empty connection_node [+ + +]

Author: Aleksandrs Vinarskis <[email protected]>
Date:   Sun Aug 11 23:33:44 2024 +0200

    ACPICA: iasl: handle empty connection_node
    
    [ Upstream commit a0a2459b79414584af6c46dd8c6f866d8f1aa421 ]
    
    ACPICA commit 6c551e2c9487067d4b085333e7fe97e965a11625
    
    Link: https://github.com/acpica/acpica/commit/6c551e2c
    Signed-off-by: Aleksandrs Vinarskis <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

afs: Fix missing wire-up of afs_retry_request() [+ + +]

Author: David Howells <[email protected]>
Date:   Sat Sep 14 21:40:02 2024 +0100

    afs: Fix missing wire-up of afs_retry_request()
    
    [ Upstream commit 2cf36327ee1e47733aba96092d7bd082a4056ff5 ]
    
    afs_retry_request() is supposed to be pointed to by the afs_req_ops netfs
    operations table, but the pointer got lost somewhere.  The function is used
    during writeback to rotate through the authentication keys that were in
    force when the file was modified locally.
    
    Fix this by adding the pointer to the function.
    
    Fixes: 1ecb146f7cd8 ("netfs, afs: Use writeback retry to deal with alternate keys")
    Reported-by: Dr. David Alan Gilbert <[email protected]>
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

afs: Fix the setting of the server responding flag [+ + +]

Author: David Howells <[email protected]>
Date:   Mon Sep 23 16:07:50 2024 +0100

    afs: Fix the setting of the server responding flag
    
    [ Upstream commit ff98751bae40faed1ba9c6a7287e84430f7dec64 ]
    
    In afs_wait_for_operation(), we transcribe the call responded flag to
    the server record that we used after doing the fileserver iteration loop -
    but it's possible to exit the loop having had a response from the server
    that we've discarded (e.g. it returned an abort or we started receiving
    data, but the call didn't complete).
    
    This means that op->server might be NULL, but we don't check that before
    attempting to set the server flag.
    
    Fixes: 98f9fda2057b ("afs: Fold the afs_addr_cursor struct in")
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
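
    The missing check can be illustrated with a small userspace sketch. The
    types, the SERVER_RESPONDING bit and record_response() are hypothetical
    and far simpler than the afs structures; the point is only that the
    server pointer is tested before the flag is set.

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    #define SERVER_RESPONDING 0x1UL /* hypothetical flag bit */

    struct server {
        unsigned long flags;
    };

    struct operation {
        struct server *server; /* may be NULL after the iteration loop */
        bool call_responded;
    };

    /* Transcribe the "responded" state only if a server record is attached. */
    static void record_response(struct operation *op)
    {
        if (op->call_responded && op->server)
            op->server->flags |= SERVER_RESPONDING;
    }
    ```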

ALSA: asihpi: Fix potential OOB array access [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:14:42 2024 +0200

    ALSA: asihpi: Fix potential OOB array access
    
    [ Upstream commit 7b986c7430a6bb68d523dac7bfc74cbd5b44ef96 ]
    
    ASIHPI driver stores some values in the static array upon a response
    from the driver, and its index depends on the firmware.  We shouldn't
    trust it blindly.
    
    This patch adds a sanity check of the array index to fit in the array
    size.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
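
    A sanity check of this kind can be sketched in a few lines of userspace
    C. The array, its size and cache_store() are hypothetical and do not
    reflect the asihpi driver's actual data structures.

    ```c
    #include <stddef.h>

    #define CACHE_SLOTS 8

    static int cache[CACHE_SLOTS];

    /*
     * Store a value at an index that came from an untrusted source such as
     * firmware. Returns 0 on success, -1 if the index does not fit the array.
     */
    static int cache_store(size_t index, int value)
    {
        if (index >= CACHE_SLOTS)
            return -1;
        cache[index] = value;
        return 0;
    }
    ```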

ALSA: control: Fix leftover snd_power_unref() [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 1 08:42:01 2024 +0200

    ALSA: control: Fix leftover snd_power_unref()
    
    commit fef1ac950c600ba50ef4d65ca03c8dae9be7f9ea upstream.
    
    One snd_power_unref() was forgotten and left at __snd_ctl_elem_info()
    in the previous change for reorganizing the locking order.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://github.com/thesofproject/linux/pull/5127
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: control: Fix power_ref lock order for compat code, too [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 18:31:27 2024 +0200

    ALSA: control: Fix power_ref lock order for compat code, too
    
    [ Upstream commit a1066453b5e49a28523f3ecbbfe4e06c6a29561c ]
    
    In the previous change for swapping the power_ref and controls_rwsem
    lock order, the code path for the compat layer was forgotten.
    This patch covers the remaining code.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: control: Take power_ref lock primarily [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Mon Jul 29 18:06:58 2024 +0200

    ALSA: control: Take power_ref lock primarily
    
    [ Upstream commit fcc62b19104a67b9a2941513771e09389b75bd95 ]
    
    The code paths for kcontrol accesses often nest the locks of both the
    card's controls_rwsem and power_ref, taken in that order.
    However, what could take much longer is the latter, power_ref; it
    waits for the power state of the device, and it pretty much depends on
    the user's action.
    
    This patch swaps the locking order of those locks to a more natural
    way, namely, power_ref -> controls_rwsem, in order to shorten the time
    of possible nested locks.  For consistency, power_ref is taken always
    in the top-level caller side (that is, *_user() functions and the
    ioctl handler itself).
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: core: add isascii() check to card ID generator [+ + +]

Author: Jaroslav Kysela <[email protected]>
Date:   Wed Oct 2 21:46:49 2024 +0200

    ALSA: core: add isascii() check to card ID generator
    
    commit d278a9de5e1837edbe57b2f1f95a104ff6c84846 upstream.
    
    The card identifier should contain only safe ASCII characters. However,
    isalnum() also returns true for non-ASCII characters.
    
    Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4135
    Link: https://lore.kernel.org/linux-sound/yk3WTvKkwheOon_LzZlJ43PPInz6byYfBzpKkbasww1yzuiMRqn7n6Y8vZcXB-xwFCu_vb8hoNjv7DTNwH5TWjpEuiVsyn9HPCEXqwF4120=@protonmail.com/
    Cc: [email protected]
    Reported-by: Barnabás Pőcze <[email protected]>
    Signed-off-by: Jaroslav Kysela <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
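
    The effect of combining the two checks can be sketched in userspace as
    follows. latin1_isalnum() is a hypothetical model of an alphanumeric
    test that also accepts accented 8-bit letters, and copy_safe_id() is an
    illustrative filter, not the ALSA card ID generator.

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    /* Model of an isalnum() that also accepts accented 8-bit letters. */
    static bool latin1_isalnum(unsigned char c)
    {
        if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
            (c >= 'a' && c <= 'z'))
            return true;
        return c >= 0xC0 && c != 0xD7 && c != 0xF7;
    }

    /* True only for 7-bit characters. */
    static bool is_7bit(unsigned char c)
    {
        return c < 0x80;
    }

    /* Copy only 7-bit alphanumeric bytes from src into dst (NUL-terminated). */
    static size_t copy_safe_id(char *dst, size_t dst_size, const char *src)
    {
        size_t n = 0;

        if (dst_size == 0)
            return 0;
        for (; *src && n + 1 < dst_size; src++) {
            unsigned char c = (unsigned char)*src;

            if (is_7bit(c) && latin1_isalnum(c))
                dst[n++] = (char)c;
        }
        dst[n] = '\0';
        return n;
    }
    ```

    Without the 7-bit test, the accented byte would pass the alphanumeric
    check alone and end up in the identifier.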

ALSA: gus: Fix some error handling paths related to get_bpos() usage [+ + +]

Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 21:34:01 2024 +0200

    ALSA: gus: Fix some error handling paths related to get_bpos() usage
    
    [ Upstream commit 9df39a872c462ea07a3767ebd0093c42b2ff78a2 ]
    
    If get_bpos() fails, it is likely that the corresponding error code should
    be returned.
    
    Fixes: a6970bb1dd99 ("ALSA: gus: Convert to the new PCM ops")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/d9ca841edad697154afa97c73a5d7a14919330d9.1727984008.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Fri Oct 4 10:25:58 2024 +0200

    ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
    
    [ Upstream commit b3ebb007060f89d5a45c9b99f06a55e36a1945b5 ]
    
    We received a regression report for System76 Pangolin (pang14) due to
    the recent fix for Tuxedo Sirius devices to support the top speaker.
    The reason was the conflicting PCI SSID, as often seen.
    
    As a workaround, now the codec SSID is checked and the quirk is
    applied conditionally only to Sirius devices.
    
    Fixes: 4178d78cd7a8 ("ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices")
    Reported-by: Christian Heusel <[email protected]>
    Reported-by: Jerry <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Tue Oct 1 14:14:36 2024 +0200

    ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
    
    [ Upstream commit 1c801e7f77445bc56e5e1fec6191fd4503534787 ]
    
    Some time ago, we introduced the obey_preferred_dacs flag for choosing
    the DAC/pin pairs specified by the driver instead of parsing the
    paths.  This works as expected, per se, but there have been a few
    cases where we forgot to set this flag while preferred_dacs table is
    already set up.  This resulted in incorrect wiring and left us
    wondering why it didn't work.
    
    Basically, when the preferred_dacs table is provided, it means that
    the driver really wants to wire up to follow that.  That is, the
    presence of the preferred_dacs table itself is already a "do-it"
    flag.
    
    In this patch, we simply replace the evaluation of obey_preferred_dacs
    flag with the presence of preferred_dacs table for fixing the
    misbehavior.  Another patch to drop the obsolete flag will
    follow.
    
    Fixes: 242d990c158d ("ALSA: hda/generic: Add option to enforce preferred_dacs pairs")
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1219803
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 [+ + +]

Author: Abhishek Tamboli <[email protected]>
Date:   Mon Sep 30 20:23:00 2024 +0530

    ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
    
    commit d75dba49744478c32f6ce1c16b5f391c2d5cef5f upstream.
    
    Add the quirk for HP Pavilion Gaming laptop 15z-ec200 for
    enabling the mute LED. The fix applies the ALC285_FIXUP_HP_MUTE_LED
    quirk for this model.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219303
    Signed-off-by: Abhishek Tamboli <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9 [+ + +]

Author: Ai Chao <[email protected]>
Date:   Thu Sep 26 14:02:52 2024 +0800

    ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9
    
    commit dee476950cbd83125655a3f49e00d63b79f6114e upstream.
    
    The headset mic requires a fixup to be properly detected/used.
    
    Signed-off-by: Ai Chao <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8 [+ + +]

Author: Nikolai Afanasenkov <[email protected]>
Date:   Mon Sep 16 13:50:42 2024 -0600

    ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8
    
    commit cb2deca056d579fe008c8d0a4ceb04d2b368fe42 upstream.
    
    The HP Elite mt645 G8 Mobile Thin Client uses an ALC236 codec
    and needs the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk
    to enable the mute and micmute LED functionality.
    
    This patch adds the system ID of the HP Elite mt645 G8
    to the `alc269_fixup_tbl` in `patch_realtek.c`
    to enable the required quirk.
    
    Cc: [email protected]
    Signed-off-by: Nikolai Afanasenkov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Fix the push button function for the ALC257 [+ + +]

Author: Oder Chiou <[email protected]>
Date:   Mon Sep 30 18:50:39 2024 +0800

    ALSA: hda/realtek: Fix the push button function for the ALC257
    
    [ Upstream commit 05df9732a0894846c46d0062d4af535c5002799d ]
    
    The headset push button does not work properly on the ALC257.
    This patch reverts the previous commit to correct the side effect.
    
    Fixes: ef9718b3d54e ("ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7")
    Signed-off-by: Oder Chiou <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init [+ + +]

Author: Joshua Grisham <[email protected]>
Date:   Mon Sep 9 21:30:00 2024 +0200

    ALSA: hda/realtek: Refactor and simplify Samsung Galaxy Book init
    
    [ Upstream commit 7e4d4b32ab9532bd1babcd5d0763d727ebb04be0 ]
    
    I have done a lot of analysis for this type of device and collaborated
    quite a bit with Nick Weihs (author of the first patch submitted for this,
    including adding samsung_helper.c). More information can be found in the
    issue on GitHub [1], including additional rationale and testing.
    
    The existing implementation includes a large number of equalizer coef
    values that are not necessary to actually init and enable the speaker
    amps, and that create a somewhat worse sound profile. Users have
    reported "muffled" or "muddy" sound; more information about this, including
    my analysis of the differences, can be found in the linked GitHub issue.
    
    This patch refactors the "v2" version of ALC298_FIXUP_SAMSUNG_AMP to a much
    simpler implementation which removes the new samsung_helper.c, reuses more
    of the existing patch_realtek.c, and sends significantly fewer unnecessary
    coef values (including removing all of these EQ-specific coef values).
    
    A pcm_playback_hook is used to dynamically enable and disable the speaker
    amps only when there will be audio playback; this is to match the behavior
    of how the driver for these devices is working in Windows, and is
    suspected but not yet tested or confirmed to help with power consumption.
    
    Support for models with 2 speaker amps vs 4 speaker amps is controlled by
    a specific quirk name for both types. A new int num_speaker_amps has been
    added to alc_spec so that the hooks can know how many speaker amps to
    enable or disable. This design was chosen to limit the number of places
    that subsystem ids will need to be maintained: like this, they can be
    maintained only once in the quirk table and there will not be another
    separate list of subsystem ids to maintain elsewhere in the code.
    
    Also updated the quirk name from ALC298_FIXUP_SAMSUNG_AMP2 to
    ALC298_FIXUP_SAMSUNG_AMP_V2_.. as this is not a quirk for "Amp #2" on
    ALC298 but is instead a different version of how to handle it.
    
    More devices have been added (see Github issue for testing confirmation),
    as well as a small cleanup to existing names.
    
    [1]: https://github.com/thesofproject/linux/issues/4055#issuecomment-2323411911
    
    Signed-off-by: Joshua Grisham <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop [+ + +]

Author: Baojun Xu <[email protected]>
Date:   Thu Sep 19 15:57:43 2024 +0800

    ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop
    
    commit 49f5ee951f11f4d6a124f00f71b2590507811a55 upstream.
    
    Add new vendor_id and subsystem_id in quirk for Lenovo Y990 Laptop.
    
    Signed-off-by: Baojun Xu <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hdsp: Break infinite MIDI input flush loop [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:15:12 2024 +0200

    ALSA: hdsp: Break infinite MIDI input flush loop
    
    [ Upstream commit c01f3815453e2d5f699ccd8c8c1f93a5b8669e59 ]
    
    The current MIDI input flush on HDSP and HDSPM drivers relies on the
    hardware reporting the right value.  If the hardware doesn't give the
    proper value but returns -1, the flush may get stuck in an infinite loop.
    
    Add a counter and break if the loop is unexpectedly too long.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
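
The bounded flush described in this entry can be sketched in user-space C as follows. This is only an illustration of the "add a counter and break" idea: the device model, the helper names and the limit MAX_FLUSH_ITERATIONS are all hypothetical and are not taken from the HDSP/HDSPM drivers.

```c
/* Hypothetical cap on flush iterations; the value is illustrative only. */
#define MAX_FLUSH_ITERATIONS 256

/* Hypothetical stand-in for a MIDI input port. */
struct fake_midi_dev {
    int pending; /* bytes the hardware reports as waiting */
    int stuck;   /* nonzero: status always reads as "data available" */
};

/* Models reading the status register. */
static int input_available(const struct fake_midi_dev *dev)
{
    return dev->stuck ? 0xff : dev->pending;
}

/* Models reading (and discarding) one input byte. */
static void read_byte(struct fake_midi_dev *dev)
{
    if (!dev->stuck && dev->pending > 0)
        dev->pending--;
}

/*
 * Drain pending input, but give up after MAX_FLUSH_ITERATIONS reads so
 * that a device returning bogus status cannot hold the loop forever.
 * Returns the number of bytes discarded.
 */
static int flush_input(struct fake_midi_dev *dev)
{
    int drained = 0;
    int i;

    for (i = 0; i < MAX_FLUSH_ITERATIONS && input_available(dev); i++) {
        read_byte(dev);
        drained++;
    }
    return drained;
}
```

With a healthy device the loop ends as soon as the status reads zero; with a stuck device it ends when the counter runs out.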

ALSA: line6: add hw monitor volume control to POD HD500X [+ + +]

Author: Hans P. Moller <[email protected]>
Date:   Thu Oct 3 20:28:28 2024 -0300

    ALSA: line6: add hw monitor volume control to POD HD500X
    
    commit 703235a244e533652346844cfa42623afb36eed1 upstream.
    
    Add hw monitor volume control for POD HD500X. This is done by adding
    LINE6_CAP_HWMON_CTL to the capabilities.
    
    Signed-off-by: Hans P. Moller <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: mixer_oss: Remove some incorrect kfree_const() usages [+ + +]

Author: Christophe JAILLET <[email protected]>
Date:   Thu Sep 26 20:17:36 2024 +0200

    ALSA: mixer_oss: Remove some incorrect kfree_const() usages
    
    [ Upstream commit 368e4663c557de4a33f321b44e7eeec0a21b2e4e ]
    
    "assigned" and "assigned->name" are allocated in snd_mixer_oss_proc_write()
    using kmalloc() and kstrdup(), so there is no point in using kfree_const()
    to free these resources.
    
    Switch to the more standard kfree() to free these resources.
    
    This could avoid a memory leak.
    
    Fixes: 454f5ec1d2b7 ("ALSA: mixer: oss: Constify snd_mixer_oss_assign_table definition")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/63ac20f64234b7c9ea87a7fa9baf41e8255852f7.1727374631.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET [+ + +]

Author: Lianqin Hu <[email protected]>
Date:   Wed Sep 25 03:16:29 2024 +0000

    ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET
    
    commit 73385f3e0d8088b715ae8f3f66d533c482a376ab upstream.
    
    Audio control requests that set the sampling frequency sometimes fail on
    this card. Adding a delay between control messages eliminates the problem.
    
    Signed-off-by: Lianqin Hu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/TYUPR06MB62177E629E9DEF2401333BF7D2692@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add input value sanity checks for standard types [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Tue Aug 6 14:46:50 2024 +0200

    ALSA: usb-audio: Add input value sanity checks for standard types
    
    [ Upstream commit 901e85677ec0bb9a69fb9eab1feafe0c4eb7d07e ]
    
    For an invalid input value that is out of the given range, the
    USB-audio driver currently corrects the value silently and accepts it
    without error.  This is not wrong behavior per se, but the recent
    kselftest expects an error in such a case, hence a different
    behavior is expected now.
    
    This patch adds a sanity check at each control put for the standard
    mixer types and returns an error if an invalid value is given.
    
    Note that this covers only the standard mixer types.  The mixer quirks
    that have their own control callbacks would need different coverage.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
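
The difference between the old and the new behaviour can be sketched as below. This is a simplified user-space model, not the driver's code: struct ctl_range and both function names are hypothetical, and the real put callbacks work on ALSA control element values rather than plain integers.

```c
#include <errno.h>

/* Hypothetical description of a control's permitted range. */
struct ctl_range {
    int min;
    int max;
};

/* Old behaviour as described: clamp silently and report success. */
static int put_value_clamping(const struct ctl_range *r, int val, int *stored)
{
    if (val < r->min)
        val = r->min;
    if (val > r->max)
        val = r->max;
    *stored = val;
    return 0;
}

/* New behaviour: reject out-of-range input and leave *stored untouched. */
static int put_value_checked(const struct ctl_range *r, int val, int *stored)
{
    if (val < r->min || val > r->max)
        return -EINVAL;
    *stored = val;
    return 0;
}
```

A test that writes an invalid value and expects an error passes only against the second variant.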

ALSA: usb-audio: Add logitech Audio profile quirk [+ + +]

Author: Joshua Pius <[email protected]>
Date:   Thu Sep 12 15:26:28 2024 +0000

    ALSA: usb-audio: Add logitech Audio profile quirk
    
    [ Upstream commit a51c925c11d7b855167e64b63eb4378e5adfc11d ]
    
    Specify shortnames for the following Logitech Devices: Rally bar, Rally
    bar mini, Tap, MeetUp and Huddle.
    
    Signed-off-by: Joshua Pius <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add mixer quirk for RME Digiface USB [+ + +]

Author: Asahi Lina <[email protected]>
Date:   Tue Sep 3 19:52:30 2024 +0900

    ALSA: usb-audio: Add mixer quirk for RME Digiface USB
    
    [ Upstream commit 611a96f6acf2e74fe28cb90908a9c183862348ce ]
    
    Implement sync, output format, and input status mixer controls, to allow
    the interface to be used as a straight ADAT/SPDIF (+ Headphones) I/O
    interface.
    
    This does not implement the matrix mixer, output gain controls, or input
    level meter feedback. The full mixer interface is only really usable
    using a dedicated userspace control app (there are too many mixer nodes
    for alsamixer to be usable), so for now we leave it up to userspace to
    directly control these features using raw USB control messages. This is
    similar to how it's done with some FireWire interfaces (ffado-mixer).
    
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add native DSD support for Luxman D-08u [+ + +]

Author: Jan Lalinsky <[email protected]>
Date:   Thu Oct 3 05:08:11 2024 +0200

    ALSA: usb-audio: Add native DSD support for Luxman D-08u
    
    commit 6b0bde5d8d4078ca5feec72fd2d828f0e5cf115d upstream.
    
    Add native DSD support for Luxman D-08u DAC, by adding the VID/PID 1852:5062.
    This makes DSD playback work and also improves sound quality when playing
    PCM files: the crackling sounds are gone.
    
    Signed-off-by: Jan Lalinsky <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add quirk for RME Digiface USB [+ + +]

Author: Cyan Nyan <[email protected]>
Date:   Tue Sep 3 19:52:29 2024 +0900

    ALSA: usb-audio: Add quirk for RME Digiface USB
    
    [ Upstream commit c032044e9672408c534d64a6df2b1ba14449e948 ]
    
    Add trivial support for audio streaming on the RME Digiface USB. Binds
    only to the first interface to allow userspace to directly drive the
    complex I/O and matrix mixer controls.
    
    Signed-off-by: Cyan Nyan <[email protected]>
    [Lina: Added 2x/4x sample rate support & boot/format quirks]
    Co-developed-by: Asahi Lina <[email protected]>
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Define macros for quirk table entries [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:41 2024 +0200

    ALSA: usb-audio: Define macros for quirk table entries
    
    [ Upstream commit 0c3ad39b791c2ecf718afcaca30e5ceafa939d5c ]
    
    Many entries in the USB-audio quirk tables have relatively complex
    expressions.  For improving the readability, introduce a few macros.
    Those are applied in the following patch.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Replace complex quirk lines with macros [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:42 2024 +0200

    ALSA: usb-audio: Replace complex quirk lines with macros
    
    [ Upstream commit d79e13f8e8abb5cd3a2a0f9fc9bc3fc750c5b06f ]
    
    Apply the newly introduced macros to reduce the complex expressions
    and casts in the quirk table definitions.  This results in a significant
    code reduction, too.
    
    There should be no functional changes.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

aoe: fix the potential use-after-free problem in more places [+ + +]

Author: Chun-Yi Lee <[email protected]>
Date:   Wed Oct 2 11:54:58 2024 +0800

    aoe: fix the potential use-after-free problem in more places
    
    commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 upstream.
    
    To fix CVE-2023-6270, f98364e92662 ("aoe: fix the potential
    use-after-free problem in aoecmd_cfg_pkts") made tx() call dev_put()
    instead of doing so in aoecmd_cfg_pkts(). This keeps tx() from running
    into a use-after-free.
    
    Nicolai Stange then found more places in aoe with a potential
    use-after-free problem involving tx(), e.g. revalidate(), aoecmd_ata_rw(),
    resend(), probe() and aoecmd_cfg_rsp(). Those functions also use
    aoenet_xmit() to push packets to the tx queue, so they should also use
    dev_hold() to increase the refcnt of skb->dev.
    
    In addition, moving dev_put() to tx() causes the refcnt of skb->dev to
    drop to a negative value, because the corresponding dev_hold() calls
    are missing in revalidate(), aoecmd_ata_rw(), resend(), probe(), and
    aoecmd_cfg_rsp(). This patch fixes that issue.
    
    Cc: [email protected]
    Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
    Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts")
    Reported-by: Nicolai Stange <[email protected]>
    Signed-off-by: Chun-Yi Lee <[email protected]>
    Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
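
The hold/put pairing that this entry relies on can be modelled with a plain counter. The sketch below is a user-space illustration only: the fake_* types and functions are hypothetical stand-ins and do not reproduce the real dev_hold(), dev_put() or tx() implementations.

```c
/* Hypothetical stand-in for a network device with a reference count. */
struct fake_netdev {
    int refcnt;
};

/* Hypothetical stand-in for a packet that points at its device. */
struct fake_skb {
    struct fake_netdev *dev;
};

static void fake_dev_hold(struct fake_netdev *dev) { dev->refcnt++; }
static void fake_dev_put(struct fake_netdev *dev)  { dev->refcnt--; }

/* Transmit side: always drops one reference once the packet is sent. */
static void fake_tx(struct fake_skb *skb)
{
    fake_dev_put(skb->dev);
}

/* Producer that queues a packet without taking a reference (bug pattern). */
static void queue_without_hold(struct fake_skb *skb, struct fake_netdev *dev)
{
    skb->dev = dev;
}

/* Producer that takes the reference the transmit side will later drop. */
static void queue_with_hold(struct fake_skb *skb, struct fake_netdev *dev)
{
    skb->dev = dev;
    fake_dev_hold(dev);
}
```

Starting from a count of 1, queue_with_hold() followed by fake_tx() returns to 1, while queue_without_hold() followed by fake_tx() ends at 0, one reference short; repeating the unbalanced path drives the count negative.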

arm64: cputype: Add Neoverse-N3 definitions [+ + +]

Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:04:19 2024 +0100

    arm64: cputype: Add Neoverse-N3 definitions
    
    [ Upstream commit 924725707d80bc2588cefafef76ff3f164d299bc ]
    
    Add cputype definitions for Neoverse-N3. These will be used for errata
    detection in subsequent patches.
    
    These values can be found in Table A-261 ("MIDR_EL1 bit descriptions")
    in issue 02 of the Neoverse-N3 TRM, which can be found at:
    
      https://developer.arm.com/documentation/107997/0000/?lang=en
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
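
As background for the cputype definitions, the following sketch shows how a MIDR_EL1 model value is composed from its architectural fields (implementer in bits 31:24, variant in 23:20, architecture in 19:16, part number in 15:4, revision in 3:0). The helper names are hypothetical, and EXAMPLE_PART is a made-up part number used purely for illustration; it is not the Neoverse-N3 value.

```c
#include <stdint.h>

#define IMPLEMENTER_ARM 0x41u  /* Arm Ltd implementer code */
#define EXAMPLE_PART    0xABCu /* hypothetical part number, illustration only */

/* Mask covering implementer, architecture and part number fields. */
#define MODEL_MASK 0xFF0FFFF0u

/* Build a model value; the architecture field is set to 0xF. */
static uint32_t midr_model(uint32_t implementer, uint32_t partnum)
{
    return (implementer << 24) | (0xFu << 16) | (partnum << 4);
}

/* Compare a raw MIDR against a model, ignoring variant and revision. */
static int midr_is_model(uint32_t midr, uint32_t model)
{
    return (midr & MODEL_MASK) == model;
}
```

Errata detection can then match any variant or revision of a part by comparing only the masked fields.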

arm64: errata: Expand speculative SSBS workaround once more [+ + +]

Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:04:20 2024 +0100

    arm64: errata: Expand speculative SSBS workaround once more
    
    [ Upstream commit 081eb7932c2b244f63317a982c5e3990e2c7fbdd ]
    
    A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
    special-purpose register does not affect subsequent speculative
    instructions, permitting speculative store bypassing for a window of
    time.
    
    We worked around this for a number of CPUs in commits:
    
    * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
    * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")
    * 145502cac7ea70b5 ("arm64: errata: Expand speculative SSBS workaround (again)")
    
    Since then, a (hopefully final) batch of updates has been published,
    with two more affected CPUs. For the affected CPUs the existing
    mitigation is sufficient, as described in their respective Software
    Developer Errata Notice (SDEN) documents:
    
    * Cortex-A715 (MP148) SDEN v15.0, erratum 3456084
      https://developer.arm.com/documentation/SDEN-2148827/1500/
    
    * Neoverse-N3 (MP195) SDEN v5.0, erratum 3456111
      https://developer.arm.com/documentation/SDEN-3050973/0500/
    
    Enable the existing mitigation by adding the relevant MIDRs to
    erratum_spec_ssbs_list, and update silicon-errata.rst and the
    Kconfig text accordingly.
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS [+ + +]

Author: Mark Rutland <[email protected]>
Date:   Mon Sep 30 13:04:48 2024 +0100

    arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
    
    commit b3d6121eaeb22aee8a02f46706745b1968cc0292 upstream.
    
    The Kconfig logic to select HAVE_DYNAMIC_FTRACE_WITH_ARGS is incorrect,
    and HAVE_DYNAMIC_FTRACE_WITH_ARGS may be selected when it is not
    supported by the combination of clang and GNU LD, resulting in link-time
    errors:
    
      aarch64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.meminit.data' in mm/sparse.o] sections
      aarch64-linux-gnu-ld: final link failed: bad value
    
    ... which can be seen when building with CC=clang using a binutils
    version older than 2.36.
    
    We originally fixed that in commit:
    
      45bd8951806eb5e8 ("arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang")
    
    ... by splitting the "select HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement
    into separate CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS and
    GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS options which individually select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS.
    
    Subsequently we accidentally re-introduced the common "select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement in commit:
    
      26299b3f6ba26bfc ("ftrace: arm64: move from REGS to ARGS")
    
    ... then we removed it again in commit:
    
      68a63a412d18bd2e ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y")
    
    ... then we accidentally re-introduced it again in commit:
    
      2aa6ac03516d078c ("arm64: ftrace: Add direct call support")
    
    Fix this for the third time by keeping the unified select statement and
    making this depend on either GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS or
    CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS. This is more consistent with
    usual style and less likely to go wrong in future.
    
    Fixes: 2aa6ac03516d ("arm64: ftrace: Add direct call support")
    Cc: <[email protected]> # 6.4.x
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386 [+ + +]

Author: Easwar Hariharan <[email protected]>
Date:   Thu Oct 3 22:52:35 2024 +0000

    arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
    
    commit 3eddb108abe3de6723cc4b77e8558ce1b3047987 upstream.
    
    Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering
    from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata:
    Expand speculative SSBS workaround")
    
    CC: Mark Rutland <[email protected]>
    CC: James Morse <[email protected]>
    CC: Will Deacon <[email protected]>
    CC: [email protected] # 6.6+
    Signed-off-by: Easwar Hariharan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() [+ + +]

Author: Fares Mehanna <[email protected]>
Date:   Mon Sep 2 16:33:08 2024 +0000

    arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
    
    [ Upstream commit 7eced90b202d63cdc1b9b11b1353adb1389830f9 ]
    
    The reasons for PTEs in the kernel direct map to be marked invalid are not
    limited to kfence / debug pagealloc machinery. In particular,
    memfd_secret() also steals pages with set_direct_map_invalid_noflush().
    
    When building the transitional page tables for kexec from the current
    kernel's page tables, those pages need to become regular writable pages,
    otherwise, if the relocation places kexec segments over such pages, a fault
    will occur during kexec and the host will go dark.
    
    This patch addresses the kexec issue by marking any PTE as valid if it is
    not none. While this fixes the kexec crash, it does not address the
    security concern that if processes owning secret memory are not terminated
    before kexec, the secret content will be mapped in the new kernel without
    being scrubbed.
    
    Suggested-by: Jan H. Schönherr <[email protected]>
    Signed-off-by: Fares Mehanna <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
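
The rule "mark any PTE as valid if it is not none" can be sketched on plain 64-bit values. This is a deliberately reduced model: FAKE_PTE_VALID and copy_pte_for_kexec() are hypothetical names, an all-zero entry stands for "none", and the permission handling of the real transitional page table code is left out.

```c
#include <stdint.h>

/* Hypothetical valid bit; bit 0 is used here for illustration. */
#define FAKE_PTE_VALID 1ull

/*
 * Copy one entry into the transitional table. An empty ("none") entry
 * stays empty; every other entry is copied with the valid bit forced on,
 * even if it had been invalidated in the source table.
 */
static uint64_t copy_pte_for_kexec(uint64_t src)
{
    if (src == 0)
        return 0;
    return src | FAKE_PTE_VALID;
}
```

An entry that was populated but invalidated therefore becomes usable again in the copy, which is what keeps the relocation from faulting.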

ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized [+ + +]

Author: Andrei Simion <[email protected]>
Date:   Tue Sep 24 11:12:38 2024 +0300

    ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized
    
    [ Upstream commit 09cfc6a532d249a51d3af5022d37ebbe9c3d31f6 ]
    
    Update the driver to prevent alsa-restore.service from failing when
    reading data from /var/lib/alsa/asound.state at boot. Ensure that the
    restoration of ALSA mixer configurations is skipped if substream->runtime
    is NULL.
    
    Fixes: 50291652af52 ("ASoC: atmel: mchp-pdmc: add PDMC driver")
    Signed-off-by: Andrei Simion <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: codecs: wsa883x: Handle reading version failure [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Wed Jul 10 15:52:31 2024 +0200

    ASoC: codecs: wsa883x: Handle reading version failure
    
    [ Upstream commit 2fbf16992e5aa14acf0441320033a01a32309ded ]
    
    If reading version and variant from registers fails (which is unlikely
    but possible, because it is a read over bus), the driver will proceed
    and perform device configuration based on uninitialized stack variables.
    Handle it a bit better: bail out without doing any init and fail the
    update status SoundWire callback.
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
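
The pattern of bailing out before an uninitialized stack variable is used can be sketched as below. Everything here is a hypothetical user-space model: fake_reg_read() stands in for a bus read that may fail, and fake_init() for the configuration path; neither mirrors the wsa883x driver's actual functions.

```c
#include <errno.h>

/*
 * Hypothetical register read: returns 0 and fills *val on success, or a
 * negative errno and leaves *val untouched on failure.
 */
static int fake_reg_read(int fail, unsigned int *val)
{
    if (fail)
        return -EIO;
    *val = 0x12;
    return 0;
}

/*
 * Configure the device based on the version that was read. On a failed
 * read, return the error before 'version' is ever looked at.
 */
static int fake_init(int fail, unsigned int *configured_version)
{
    unsigned int version; /* not initialized until the read succeeds */
    int ret;

    ret = fake_reg_read(fail, &version);
    if (ret)
        return ret;

    *configured_version = version;
    return 0;
}
```

Propagating the error lets the caller fail its callback instead of configuring the device from garbage.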

ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m [+ + +]

Author: Hui Wang <[email protected]>
Date:   Wed Oct 2 10:56:59 2024 +0800

    ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
    
    [ Upstream commit 47d7d3fd72afc7dcd548806291793ee6f3848215 ]
    
    In most Linux distribution kernels, SND is set to m. In that case,
    booting the kernel on the i.MX8MP EVK board produces a warning
    call trace like the one below:
     Call trace:
     snd_card_init+0x484/0x4cc [snd]
     snd_card_new+0x70/0xa8 [snd]
     snd_soc_bind_card+0x310/0xbd0 [snd_soc_core]
     snd_soc_register_card+0xf0/0x108 [snd_soc_core]
     devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
    
    That is because card.owner is not set, which triggers the warning
    in snd_card_init().
    
    Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
    Signed-off-by: Hui Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev() [+ + +]

Author: Pierre-Louis Bossart <[email protected]>
Date:   Tue Aug 27 20:32:01 2024 +0800

    ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev()
    
    [ Upstream commit 14e91ddd5c02d8c3e5a682ebfa0546352b459911 ]
    
    The code seems mostly copy-pasted, with some machine drivers
    forgetting to test if the 'adev' result is NULL.
    
    Add this check when missing, and use -ENOENT consistently as an error
    code.
    
    Reported-by: Dan Carpenter <[email protected]>
    Closes: https://lore.kernel.org/alsa-devel/[email protected]/T/#u
    Signed-off-by: Pierre-Louis Bossart <[email protected]>
    Reviewed-by: Péter Ujfalusi <[email protected]>
    Signed-off-by: Bard Liao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item [+ + +]

Author: Bard Liao <[email protected]>
Date:   Tue Oct 1 14:17:37 2024 +0800

    ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item
    
    [ Upstream commit 5afc29ba44fdd1bcbad4e07246c395d946301580 ]
    
    There is no links_num in struct snd_soc_acpi_mach {}, and we test
    !link->num_adr as a condition to end the loop in hda_sdw_machine_select().
    So an empty item in struct snd_soc_acpi_link_adr array is required.
    
    Fixes: 65ab45b90656 ("ASoC: Intel: soc-acpi: Add match entries for some cs42l43 laptops")
    Signed-off-by: Bard Liao <[email protected]>
    Reviewed-by: Péter Ujfalusi <[email protected]>
    Reviewed-by: Charles Keepax <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
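
Why the empty item matters can be seen in a small sentinel-terminated array. The sketch below uses a hypothetical struct fake_link_adr with only two fields; it mirrors the idea of ending the loop on a zero num_adr, not the real struct snd_soc_acpi_link_adr layout.

```c
#include <stddef.h>

/* Hypothetical, reduced link descriptor. */
struct fake_link_adr {
    unsigned int mask;
    int num_adr;
};

/* The zeroed last element is the terminator the loop depends on. */
static const struct fake_link_adr fake_links[] = {
    { .mask = 0x1, .num_adr = 1 },
    { .mask = 0x8, .num_adr = 2 },
    { 0 } /* empty item: num_adr == 0 ends the walk */
};

/* Count entries by walking until num_adr is zero; there is no length field. */
static size_t count_links(const struct fake_link_adr *link)
{
    size_t n = 0;

    for (; link->num_adr; link++)
        n++;
    return n;
}
```

Without the terminator the loop would keep reading past the end of the array until it happened to find a zero.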

ASoC: topology: Fix incorrect addressing assignments [+ + +]

Author: Tang Bin <[email protected]>
Date:   Sat Sep 14 16:16:08 2024 +0800

    ASoC: topology: Fix incorrect addressing assignments
    
    [ Upstream commit 85109780543b5100aba1d0842b6a7c3142be74d2 ]
    
    The variable 'kc' is handled in the function
    soc_tplg_control_dbytes_create(), and 'kc->private_value'
    is assigned to 'sbe'. So in the function soc_tplg_dbytes_create(),
    the right 'sbe' should be 'kc.private_value'. The same logical error
    exists in the function soc_tplg_dmixer_create(); fix both.
    
    Fixes: 0867278200f7 ("ASoC: topology: Unify code for creating standalone and widget bytes control")
    Fixes: 4654ca7cc8d6 ("ASoC: topology: Unify code for creating standalone and widget mixer control")
    Signed-off-by: Tang Bin <[email protected]>
    Reviewed-by: Amadeusz Sławiński <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: pata_serverworks: Do not use the term blacklist [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 10:58:36 2024 +0900

    ata: pata_serverworks: Do not use the term blacklist
    
    [ Upstream commit 858048568c9e3887d8b19e101ee72f129d65cb15 ]
    
    Let's not use the term blacklist in the function
    serverworks_osb4_filter() documentation comment and rather simply refer
    to what that function looks at: the list of devices with broken UDMA5.
    
    While at it, also constify the values of the csb_bad_ata100 array.
    
    Of note is that all of this should probably be handled using libata
    quirk mechanism but it is unclear if these UDMA5 quirks are specific
    to this controller only.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: sata_sil: Rename sil_blacklist to sil_quirks [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 11:14:11 2024 +0900

    ata: sata_sil: Rename sil_blacklist to sil_quirks
    
    [ Upstream commit 93b0f9e11ce511353c65b7f924cf5f95bd9c3aba ]
    
    Rename the array sil_blacklist to sil_quirks as this name is more
    neutral and is also consistent with how this driver defines quirks with
    the SIL_QUIRK_XXX flags.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

blk_iocost: fix more out of bound shifts [+ + +]

Author: Konstantin Ovsepian <[email protected]>
Date:   Thu Aug 22 08:41:36 2024 -0700

    blk_iocost: fix more out of bound shifts
    
    [ Upstream commit 9bce8005ec0dcb23a58300e8522fe4a31da606fa ]
    
    Running UBSAN recently caught a few out-of-bounds shifts in the
    ioc_forgive_debts() function:
    
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2142:38
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2144:30
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    Call Trace:
    <IRQ>
    dump_stack_lvl+0xca/0x130
    __ubsan_handle_shift_out_of_bounds+0x22c/0x280
    ? __lock_acquire+0x6441/0x7c10
    ioc_timer_fn+0x6cec/0x7750
    ? blk_iocost_init+0x720/0x720
    ? call_timer_fn+0x5d/0x470
    call_timer_fn+0xfa/0x470
    ? blk_iocost_init+0x720/0x720
    __run_timer_base+0x519/0x700
    ...
    
    The actual impact of this issue was not identified, but I propose to fix
    the undefined behaviour.
    The proposed fix to prevent those out-of-bounds shifts consists of
    precalculating the exponent before using it in the shift operations, by
    taking the minimum of the actual exponent and the maximum possible number
    of bits.
    
    Reported-by: Breno Leitao <[email protected]>
    Signed-off-by: Konstantin Ovsepian <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
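
The clamping idea can be sketched in portable C as follows. The function name and the choice of 63 as the largest permitted shift for a 64-bit value are assumptions made for this illustration; the exact expression used in blk-iocost.c is not reproduced here.

```c
#include <stdint.h>

/*
 * Shift a 64-bit value right by 'exponent', clamping the exponent first.
 * Shifting a 64-bit type by 64 or more is undefined behaviour in C, so the
 * shift count is limited to 63 before it is used.
 */
static uint64_t shift_right_clamped(uint64_t value, unsigned int exponent)
{
    const unsigned int max_shift = 63; /* bits in a u64 minus one */
    unsigned int shift = exponent < max_shift ? exponent : max_shift;

    return value >> shift;
}
```

An exponent of 80, as in the UBSAN report, is reduced to 63 before the shift happens.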

block: fix integer overflow in BLKSECDISCARD [+ + +]

Author: Alexey Dobriyan <[email protected]>
Date:   Tue Sep 3 22:48:19 2024 +0300

    block: fix integer overflow in BLKSECDISCARD
    
    [ Upstream commit 697ba0b6ec4ae04afb67d3911799b5e2043b4455 ]
    
    I independently rediscovered
    
            commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155
            block: fix overflow in blk_ioctl_discard()
    
    but for secure erase.
    
    Same problem:
    
            uint64_t r[2] = {512, 18446744073709551104ULL};
            ioctl(fd, BLKSECDISCARD, r);
    
    will enter a near-infinite loop inside blkdev_issue_secure_erase():
    
            a.out: attempt to access beyond end of device
            loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048
            bio_check_eod: 3286214 callbacks suppressed
    
    Signed-off-by: Alexey Dobriyan <[email protected]>
    Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:12:04 2024 +0800

    Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 7b1ab460592ca818e7b52f27cd3ec86af79220d1 ]
    
    A disable_irq() call after request_irq() still leaves a time gap in which
    interrupts can arrive. Calling request_irq() with the IRQF_NO_AUTOEN flag
    keeps the IRQ from being auto-enabled when it is requested.
    
    Fixes: bb7f4f0bcee6 ("btmrvl: add platform specific wakeup interrupt support")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B [+ + +]

Author: Hilda Wu <[email protected]>
Date:   Thu Aug 29 16:40:05 2024 +0800

    Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B
    
    [ Upstream commit 9a0570948c5def5c59e588dc0e009ed850a1f5a1 ]
    
    To track multiple devices concurrently with a condition, the patch
    enables the HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER quirk on the RTL8852B
    controller.
    
    The quirk setting is based on commit 9e14606d8f38 ("Bluetooth: msft:
    Extended monitor tracking by address filter").
    
    With this setting, when a pattern monitor detects a device, this
    feature issues an address monitor for tracking that device. This lets
    the original pattern monitor keep monitoring for new devices.
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122 [+ + +]

Author: Hilda Wu <[email protected]>
Date:   Fri Aug 16 16:58:22 2024 +0800

    Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122
    
    [ Upstream commit bdf9557f70e7512bb2f754abf90d9e9958745316 ]
    
    Add the support ID (0x0489, 0xe122) to usb_device_id table for
    Realtek RTL8852C.
    
    The device info from /sys/kernel/debug/usb/devices is shown below.
    
    T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
    D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0489 ProdID=e122 Rev= 0.00
    S:  Manufacturer=Realtek
    S:  Product=Bluetooth Radio
    S:  SerialNumber=00e04c000001
    C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:17:00 2024 -0400

    Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE
    
    commit b25e11f978b63cb7857890edb3a698599cddb10e upstream.
    
    This aligns the BR/EDR JUST_WORKS method with LE, which since 92516cd97fd4
    ("Bluetooth: Always request for user confirmation for Just Works")
    always requests user confirmation with confirm_hint set, since the
    likes of bluetoothd have a dedicated policy around the JUST_WORKS method
    (e.g. main.conf:JustWorksRepairing).
    
    CVE: CVE-2024-8805
    Cc: [email protected]
    Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Tested-by: Kiran K <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: L2CAP: Fix uaf in l2cap_connect [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon Sep 23 12:47:39 2024 -0400

    Bluetooth: L2CAP: Fix uaf in l2cap_connect
    
    [ Upstream commit 333b4fd11e89b29c84c269123f871883a30be586 ]
    
    [Syzbot reported]
    BUG: KASAN: slab-use-after-free in l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
    Read of size 8 at addr ffff8880241e9800 by task kworker/u9:0/54
    
    CPU: 0 UID: 0 PID: 54 Comm: kworker/u9:0 Not tainted 6.11.0-rc6-syzkaller-00268-g788220eee30d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Workqueue: hci2 hci_rx_work
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0xc3/0x620 mm/kasan/report.c:488
     kasan_report+0xd9/0x110 mm/kasan/report.c:601
     l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
     l2cap_connect_req net/bluetooth/l2cap_core.c:4080 [inline]
     l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:4772 [inline]
     l2cap_sig_channel net/bluetooth/l2cap_core.c:5543 [inline]
     l2cap_recv_frame+0xf0b/0x8eb0 net/bluetooth/l2cap_core.c:6825
     l2cap_recv_acldata+0x9b4/0xb70 net/bluetooth/l2cap_core.c:7514
     hci_acldata_packet net/bluetooth/hci_core.c:3791 [inline]
     hci_rx_work+0xaab/0x1610 net/bluetooth/hci_core.c:4028
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    ...
    
    Freed by task 5245:
     kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
     kasan_save_track+0x14/0x30 mm/kasan/common.c:68
     kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
     poison_slab_object+0xf7/0x160 mm/kasan/common.c:240
     __kasan_slab_free+0x32/0x50 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2256 [inline]
     slab_free mm/slub.c:4477 [inline]
     kfree+0x12a/0x3b0 mm/slub.c:4598
     l2cap_conn_free net/bluetooth/l2cap_core.c:1810 [inline]
     kref_put include/linux/kref.h:65 [inline]
     l2cap_conn_put net/bluetooth/l2cap_core.c:1822 [inline]
     l2cap_conn_del+0x59d/0x730 net/bluetooth/l2cap_core.c:1802
     l2cap_connect_cfm+0x9e6/0xf80 net/bluetooth/l2cap_core.c:7241
     hci_connect_cfm include/net/bluetooth/hci_core.h:1960 [inline]
     hci_conn_failed+0x1c3/0x370 net/bluetooth/hci_conn.c:1265
     hci_abort_conn_sync+0x75a/0xb50 net/bluetooth/hci_sync.c:5583
     abort_conn_sync+0x197/0x360 net/bluetooth/hci_conn.c:2917
     hci_cmd_sync_work+0x1a4/0x410 net/bluetooth/hci_sync.c:328
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Reported-by: [email protected]
    Tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=c12e2f941af1feb5632c
    Fixes: 7b064edae38d ("Bluetooth: Fix authentication if acl data comes before remote feature evt")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: MGMT: Fix possible crash on mgmt_index_removed [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:34:42 2024 -0400

    Bluetooth: MGMT: Fix possible crash on mgmt_index_removed
    
    [ Upstream commit f53e1c9c726d83092167f2226f32bd3b73f26c21 ]
    
    If mgmt_index_removed is called while there are commands queued on
    cmd_sync, it could lead to crashes like the trace below:
    
    0x0000053D: __list_del_entry_valid_or_report+0x98/0xdc
    0x0000053D: mgmt_pending_remove+0x18/0x58 [bluetooth]
    0x0000053E: mgmt_remove_adv_monitor_complete+0x80/0x108 [bluetooth]
    0x0000053E: hci_cmd_sync_work+0xbc/0x164 [bluetooth]
    
    So, while handling mgmt_index_removed, this patch attempts to dequeue
    the commands passed as user_data to cmd_sync.
    
    Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
    Reported-by: jiaymao <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bnxt_en: Extend maximum length of version string by 1 byte [+ + +]

Author: Simon Horman <[email protected]>
Date:   Tue Aug 13 15:32:55 2024 +0100

    bnxt_en: Extend maximum length of version string by 1 byte
    
    [ Upstream commit ffff7ee843c351ce71d6e0d52f0f20bea35e18c9 ]
    
    This corrects an out-by-one error in the maximum length of the package
    version string. The size argument of snprintf includes space for the
    trailing '\0' byte, so there is no need to allow extra space for it by
    reducing the value of the size argument by 1.
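
    The snprintf() behaviour described above can be seen in a small
    user-space sketch. The helper store_version() is hypothetical and is not
    the driver code; it only shows how the size argument limits what is
    stored.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a version string into buf and return the
 * number of characters actually stored (excluding the trailing '\0').
 * The caller must pass a size greater than zero.
 */
static size_t store_version(char *buf, size_t size, const char *version)
{
    /* snprintf() writes at most size - 1 characters plus the '\0'. */
    snprintf(buf, size, "%s", version);
    return strlen(buf);
}
```

    With an 8-byte buffer, passing the full buffer size stores a 7-character
    version string intact, while passing the size minus 1 needlessly
    truncates it to 6 characters.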
    
    Found by inspection.
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Michael Chan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Fix a sdiv overflow issue [+ + +]

Author: Yonghong Song <[email protected]>
Date:   Fri Sep 13 08:03:26 2024 -0700

    bpf: Fix a sdiv overflow issue
    
    [ Upstream commit 7dd34d7b7dcf9309fc6224caf4dd5b35bedddcb7 ]
    
    Zac Ecob reported a problem where a bpf program may cause kernel crash due
    to the following error:
      Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
    
    The failure is due to the below signed divide:
      LLONG_MIN/-1 where LLONG_MIN equals -9,223,372,036,854,775,808.
    LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808,
    but that is impossible since, on a 64-bit system, the maximum positive
    number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will
    cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is
    LLONG_MIN.
    
    Further investigation found that all the following sdiv/smod cases may
    trigger an exception when a bpf program is running on the x86_64 platform:
      - LLONG_MIN/-1 for 64bit operation
      - INT_MIN/-1 for 32bit operation
      - LLONG_MIN%-1 for 64bit operation
      - INT_MIN%-1 for 32bit operation
    where -1 can be an immediate or in a register.
    
    On arm64, there are no exceptions:
      - LLONG_MIN/-1 = LLONG_MIN
      - INT_MIN/-1 = INT_MIN
      - LLONG_MIN%-1 = 0
      - INT_MIN%-1 = 0
    where -1 can be an immediate or in a register.
    
    Insn patching is needed to handle the above cases, and the patched code
    produces results aligned with the above arm64 results. Below is
    pseudocode to handle the sdiv/smod exceptions, covering both divisor -1
    and divisor 0, where the divisor is stored in a register.
    
    sdiv:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L2
          if tmp == 0 goto L1
          rY = 0
      L1:
          rY = -rY;
          goto L3
      L2:
          rY /= rX
      L3:
    
    smod:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1] */
          if tmp >(unsigned) 1 goto L1
          if tmp == 1 (is64 ? goto L2 : goto L3)
          rY = 0;
          goto L2
      L1:
          rY %= rX
      L2:
          goto L4  // only when !is64
      L3:
          wY = wY  // only when !is64
      L4:
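
    For the 64-bit case, the results that the pseudocode above aims for can
    be written as a user-space C sketch. The function names are
    hypothetical, and this is plain C rather than the patched BPF
    instruction sequence: a divisor of 0 gives 0 for sdiv and leaves the
    dividend unchanged for smod, and a divisor of -1 gives the negated
    dividend for sdiv and 0 for smod.

```c
#include <limits.h>

/* Hypothetical helper: signed 64-bit divide that never traps. */
static long long safe_sdiv64(long long dividend, long long divisor)
{
    if (divisor == 0)
        return 0;
    if (divisor == -1) {
        /*
         * Negate in unsigned arithmetic so LLONG_MIN wraps back to
         * LLONG_MIN. The conversion back to signed is
         * implementation-defined; GCC and Clang wrap as expected.
         */
        return (long long)(0ULL - (unsigned long long)dividend);
    }
    return dividend / divisor;
}

/* Hypothetical helper: signed 64-bit modulo that never traps. */
static long long safe_smod64(long long dividend, long long divisor)
{
    if (divisor == 0)
        return dividend;
    if (divisor == -1)
        return 0;
    return dividend % divisor;
}
```

    The explicit tests for -1 and 0 play the role of the "tmp += 1" range
    check in the pseudocode, which folds both special divisors into a single
    unsigned comparison.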
    
      [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
    
    Reported-by: Zac Ecob <[email protected]>
    Signed-off-by: Yonghong Song <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Make the pointer returned by iter next method valid [+ + +]

Author: Juntong Deng <[email protected]>
Date:   Thu Aug 29 21:11:17 2024 +0100

    bpf: Make the pointer returned by iter next method valid
    
    [ Upstream commit 4cc8c50c9abcb2646a7a4fcef3cea5dcb30c06cf ]
    
    Currently we cannot pass the pointer returned by iter next method as
    argument to KF_TRUSTED_ARGS or KF_RCU kfuncs, because the pointer
    returned by iter next method is not "valid".
    
    This patch sets the pointer returned by iter next method to be valid.
    
    This is based on the fact that if the iterator is implemented correctly,
    then the pointer returned from the iter next method should be valid.
    
    This does not make a NULL pointer valid. If the iter next method has the
    KF_RET_NULL flag, then the verifier will ask the eBPF program to
    check for a NULL pointer.
    
    A KF_RCU_PROTECTED iterator is a special case: the pointer returned by
    the iter next method should only be valid within an RCU critical section,
    so it should be marked with MEM_RCU, not PTR_TRUSTED.
    
    Another special case is bpf_iter_num_next, which returns a pointer with
    base type PTR_TO_MEM. PTR_TO_MEM should not be combined with type flag
    PTR_TRUSTED (PTR_TO_MEM already means the pointer is valid).
    
    The pointer returned by the iter next method of other types of iterators
    is marked with PTR_TRUSTED.
    
    In addition, this patch adds get_iter_from_state to help us get the
    current iterator from the current state.
    
    Signed-off-by: Juntong Deng <[email protected]>
    Link: https://lore.kernel.org/r/AM6PR03MB584869F8B448EA1C87B7CDA399962@AM6PR03MB5848.eurprd03.prod.outlook.com
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpftool: Fix undefined behavior caused by shifting into the sign bit [+ + +]

Author: Kuan-Wei Chiu <[email protected]>
Date:   Sun Sep 8 22:00:09 2024 +0800

    bpftool: Fix undefined behavior caused by shifting into the sign bit
    
    [ Upstream commit 4cdc0e4ce5e893bc92255f5f734d983012f2bc2e ]
    
    Replace shifts of '1' with '1U' in bitwise operations within
    __show_dev_tc_bpf() to prevent undefined behavior caused by shifting
    into the sign bit of a signed integer. By using '1U', the operations
    are explicitly performed on unsigned integers, avoiding potential
    integer overflow or sign-related issues.
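
    The difference can be shown with a short sketch, assuming a 32-bit int.
    The helper bit_mask() is hypothetical and is not taken from bpftool.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build a single-bit mask for bit positions 0..31.
 * The unsigned literal keeps "bit == 31" well defined, whereas 1 << 31
 * would shift into the sign bit of a signed int.
 */
static uint32_t bit_mask(unsigned int bit)
{
    return 1U << bit;
}
```

    Callers are still responsible for keeping the bit position below the
    width of the type.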
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Acked-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

bpftool: Fix undefined behavior in qsort(NULL, 0, ...) [+ + +]

Author: Kuan-Wei Chiu <[email protected]>
Date:   Tue Sep 10 23:02:07 2024 +0800

    bpftool: Fix undefined behavior in qsort(NULL, 0, ...)
    
    [ Upstream commit f04e2ad394e2755d0bb2d858ecb5598718bf00d5 ]
    
    When netfilter has no entry to display, qsort is called with
    qsort(NULL, 0, ...). This results in undefined behavior, as UBSan
    reports:
    
    net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
    
    Although the C standard does not explicitly state whether calling qsort
    with a NULL pointer when the size is 0 constitutes undefined behavior,
    Section 7.1.4 of the C standard (Use of library functions) mentions:
    
    "Each of the following statements applies unless explicitly stated
    otherwise in the detailed descriptions that follow: If an argument to a
    function has an invalid value (such as a value outside the domain of
    the function, or a pointer outside the address space of the program, or
    a null pointer, or a pointer to non-modifiable storage when the
    corresponding parameter is not const-qualified) or a type (after
    promotion) not expected by a function with variable number of
    arguments, the behavior is undefined."
    
    To avoid this, add an early return when nf_link_info is NULL to prevent
    calling qsort with a NULL pointer.
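
    The same guard can be sketched in stand-alone C. The names sort_ints()
    and cmp_int() are hypothetical, and the array of int stands in for the
    netfilter link info that bpftool sorts.

```c
#include <stdlib.h>

/* Three-way comparison of two ints without overflow. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Hypothetical wrapper: skip qsort() when there is nothing to sort. */
static void sort_ints(int *items, size_t count)
{
    if (!items || count == 0)
        return;

    qsort(items, count, sizeof(*items), cmp_int);
}
```

    Returning early keeps the NULL pointer away from qsort() entirely, so
    the question of whether the call would be defined never arises.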
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Reviewed-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

bridge: mcast: Fail MDB get request on empty entry [+ + +]

Author: Ido Schimmel <[email protected]>
Date:   Sun Sep 29 15:36:40 2024 +0300

    bridge: mcast: Fail MDB get request on empty entry
    
    [ Upstream commit 555f45d24ba7cd5527716553031641cdebbe76c7 ]
    
    When user space deletes a port from an MDB entry, the port is removed
    synchronously. If this was the last port in the entry and the entry is
    not joined by the host itself, then the entry is scheduled for deletion
    via a timer.
    
    The above means that it is possible for the MDB get netlink request to
    retrieve an empty entry which is scheduled for deletion. This is
    problematic as after deleting the last port in an entry, user space
    cannot rely on a non-zero return code from the MDB get request as an
    indication that the port was successfully removed.
    
    Fix by returning an error when the entry's port list is empty and the
    entry is not joined by the host.
    
    Fixes: 68b380a395a7 ("bridge: mcast: Add MDB get support")
    Reported-by: Jamie Bainbridge <[email protected]>
    Closes: https://lore.kernel.org/netdev/c92569919307749f879b9482b0f3e125b7d9d2e3.1726480066.git.jamie.bainbridge@gmail.com/
    Tested-by: Jamie Bainbridge <[email protected]>
    Signed-off-by: Ido Schimmel <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: don't readahead the relocation inode on RST [+ + +]

Author: Johannes Thumshirn <[email protected]>
Date:   Wed Jul 31 22:43:06 2024 +0200

    btrfs: don't readahead the relocation inode on RST
    
    [ Upstream commit 04915240e2c3a018e4c7f23418478d27226c8957 ]
    
    On relocation we're doing readahead on the relocation inode, but if the
    filesystem is backed by a RAID stripe tree we can get ENOENT (e.g. due to
    preallocated extents not being mapped in the RST) from the lookup.
    
    But readahead doesn't handle the error and submits invalid reads to the
    device, causing an assertion in the scatter-gather list code:
    
      BTRFS info (device nvme1n1): balance: start -d -m -s
      BTRFS info (device nvme1n1): relocating block group 6480920576 flags data|raid0
      BTRFS error (device nvme1n1): cannot find raid-stripe for logical [6481928192, 6481969152] devid 2, profile raid0
      ------------[ cut here ]------------
      kernel BUG at include/linux/scatterlist.h:115!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 0 PID: 1012 Comm: btrfs Not tainted 6.10.0-rc7+ #567
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000002cd11000 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Call Trace:
       <TASK>
       ? __die_body.cold+0x14/0x25
       ? die+0x2e/0x50
       ? do_trap+0xca/0x110
       ? do_error_trap+0x65/0x80
       ? __blk_rq_map_sg+0x339/0x4a0
       ? exc_invalid_op+0x50/0x70
       ? __blk_rq_map_sg+0x339/0x4a0
       ? asm_exc_invalid_op+0x1a/0x20
       ? __blk_rq_map_sg+0x339/0x4a0
       nvme_prep_rq.part.0+0x9d/0x770
       nvme_queue_rq+0x7d/0x1e0
       __blk_mq_issue_directly+0x2a/0x90
       ? blk_mq_get_budget_and_tag+0x61/0x90
       blk_mq_try_issue_list_directly+0x56/0xf0
       blk_mq_flush_plug_list.part.0+0x52b/0x5d0
       __blk_flush_plug+0xc6/0x110
       blk_finish_plug+0x28/0x40
       read_pages+0x160/0x1c0
       page_cache_ra_unbounded+0x109/0x180
       relocate_file_extent_cluster+0x611/0x6a0
       ? btrfs_search_slot+0xba4/0xd20
       ? balance_dirty_pages_ratelimited_flags+0x26/0xb00
       relocate_data_extent.constprop.0+0x134/0x160
       relocate_block_group+0x3f2/0x500
       btrfs_relocate_block_group+0x250/0x430
       btrfs_relocate_chunk+0x3f/0x130
       btrfs_balance+0x71b/0xef0
       ? kmalloc_trace_noprof+0x13b/0x280
       btrfs_ioctl+0x2c2e/0x3030
       ? kvfree_call_rcu+0x1e6/0x340
       ? list_lru_add_obj+0x66/0x80
       ? mntput_no_expire+0x3a/0x220
       __x64_sys_ioctl+0x96/0xc0
       do_syscall_64+0x54/0x110
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fcc04514f9b
      Code: Unable to access opcode bytes at 0x7fcc04514f71.
      RSP: 002b:00007ffeba923370 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fcc04514f9b
      RDX: 00007ffeba923460 RSI: 00000000c4009420 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000013 R09: 0000000000000001
      R10: 00007fcc043fbba8 R11: 0000000000000246 R12: 00007ffeba924fc5
      R13: 00007ffeba923460 R14: 0000000000000002 R15: 00000000004d4bb0
       </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:__blk_rq_map_sg+0x339/0x4a0
      RSP: 0018:ffffc90001a43820 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffea00045d4802
      RDX: 0000000117520000 RSI: 0000000000000000 RDI: ffff8881027d1000
      RBP: 0000000000003000 R08: ffffea00045d4902 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000001000 R12: ffff8881003d10b8
      R13: ffffc90001a438f0 R14: 0000000000000000 R15: 0000000000003000
      FS:  00007fcc048a6900(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fcc04514f71 CR3: 00000001109ea001 CR4: 0000000000370eb0
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception ]---
    
    So in case of a relocation on a RAID stripe-tree based file system, skip
    the readahead.
    
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: drop the backref cache during relocation if we commit [+ + +]

Author: Josef Bacik <[email protected]>
Date:   Tue Sep 24 16:50:22 2024 -0400

    btrfs: drop the backref cache during relocation if we commit
    
    commit db7e68b522c01eb666cfe1f31637775f18997811 upstream.
    
    Since the inception of relocation we have maintained the backref cache
    across transaction commits, updating the backref cache with the new
    bytenr whenever we COWed blocks that were in the cache, and then
    updating their bytenr once we detected a transaction id change.
    
    This works as long as we're only ever modifying blocks, not changing the
    structure of the tree.
    
    However relocation does in fact change the structure of the tree.  For
    example, if we are relocating a data extent, we will look up all the
    leaves that point to this data extent.  We will then call
    do_relocation() on each of these leaves, which will COW down to the leaf
    and then update the file extent location.
    
    But, a key feature of do_relocation() is the pending list.  This is all
    the pending nodes that we modified when we updated the file extent item.
    We will then process all of these blocks via finish_pending_nodes, which
    calls do_relocation() on all of the nodes that led up to that leaf.
    
    The purpose of this is to make sure we don't break sharing unless we
    absolutely have to.  Consider the case where we have 3 snapshots that all
    point to this leaf through the same nodes: the initial COW would have
    created a whole new path.  If we did this for all 3 snapshots we would
    end up with 3x the number of nodes we had originally.  To avoid this we
    will cycle through each of the snapshots that point to each of these
    nodes and update their pointers to point at the new nodes.
    
    Once we update the pointer to the new node we will drop the node we
    removed the link for and all of its children via btrfs_drop_subtree().
    This is essentially just btrfs_drop_snapshot(), but for an arbitrary
    point in the snapshot.
    
    The problem with this is that we will never reflect this in the backref
    cache.  If we do this btrfs_drop_snapshot() for a node that is in the
    backref tree, we will leave the node in the backref tree.  This becomes
    a problem when we change the transid, as now the backref cache has
    entire subtrees that no longer exist, but exist as if they still are
    pointed to by the same roots.
    
    In the best case scenario you end up with "adding refs to an existing
    tree ref" errors from insert_inline_extent_backref(), where we attempt
    to link in nodes on roots that are no longer valid.
    
    In the worst case you will double free some random block and re-use it
    when there are still references to the block.
    
    This is extremely subtle, and the consequences are quite bad.  There
    isn't a way to make sure our backref cache is consistent between
    transids.
    
    In order to fix this we need to simply evict the entire backref cache
    anytime we cross transids.  This reduces performance in that we have to
    rebuild this backref cache every time we change transids, but fixes the
    bug.
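
    The eviction policy can be pictured with a heavily simplified sketch.
    The structure and function names here are hypothetical and do not match
    the btrfs code; the sketch only shows the rule "drop everything when the
    transaction id changes".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a backref cache. */
struct backref_cache_sketch {
    uint64_t last_trans;   /* transaction id the entries were built in */
    size_t nr_entries;     /* number of cached nodes */
};

/*
 * Drop every cached entry when the transaction id has changed.
 * Returns 1 if the cache was evicted and must be rebuilt, 0 otherwise.
 */
static int cache_sync(struct backref_cache_sketch *cache, uint64_t transid)
{
    if (cache->last_trans == transid)
        return 0;

    cache->nr_entries = 0;
    cache->last_trans = transid;
    return 1;
}
```

    Nothing from an older transaction survives the check, which trades
    rebuild cost for never trusting stale subtrees.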
    
    This has existed since relocation was added, and is a pretty critical
    bug.  There's a lot more cleanup that can be done now that this
    functionality is going away, but this patch is as small as possible in
    order to fix the problem and make it easy for us to backport it to all
    the kernels it needs to be backported to.
    
    Followup series will dismantle more of this code and simplify relocation
    drastically to remove this functionality.
    
    We have a reproducer that reproduced the corruption within a few minutes
    of running.  With this patch it survives several iterations/hours of
    running the reproducer.
    
    Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance")
    CC: [email protected]
    Reviewed-by: Boris Burkov <[email protected]>
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [+ + +]

Author: Qu Wenruo <[email protected]>
Date:   Sat Sep 28 08:05:58 2024 +0930

    btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
    
    commit c3b47f49e83197e8dffd023ec568403bcdbb774b upstream.
    
    [BUG]
    Syzbot reported a NULL pointer dereference with the following crash:
    
      FAULT_INJECTION: forcing a failure.
       start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676
       prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642
       relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678
      ...
      BTRFS info (device loop0): balance: ended with status: -12
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI
      KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667]
      RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926
      Call Trace:
       <TASK>
       commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496
       btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430
       del_balance_item fs/btrfs/volumes.c:3678 [inline]
       reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742
       btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574
       btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [CAUSE]
    The allocation failure happens at the start_transaction() inside
    prepare_to_relocate(), and during the error handling we call
    unset_reloc_control(), which sets fs_info->reloc_ctl to NULL.
    
    Then we continue the error path cleanup in btrfs_balance() by calling
    reset_balance_state() which will call del_balance_item() to fully delete
    the balance item in the root tree.
    
    However, during the small window between set_reloc_control() and
    unset_reloc_control(), a subvolume tree update can create a
    reloc_root for that subvolume.
    
    Then we go into the final btrfs_commit_transaction() of
    del_balance_item(), and into btrfs_update_reloc_root() inside
    commit_fs_roots().
    
    That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
    stage, but since fs_info->reloc_ctl is NULL, it results in a NULL pointer
    dereference.
    
    [FIX]
    Just add an extra check on fs_info->reloc_ctl inside
    btrfs_update_reloc_root(), before checking
    fs_info->reloc_ctl->merge_reloc_tree.
    
    That DEAD_RELOC_TREE handling is to prevent further modification to the
    reloc tree during the merge stage, but since there is no reloc_ctl at
    all, we do not need to bother with that.
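
    The shape of such a check can be sketched in user-space C. The two
    structures are hypothetical, stripped-down stand-ins for the kernel
    ones, and in_merge_stage() is an illustrative name rather than a btrfs
    function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct reloc_control_sketch {
    bool merge_reloc_tree;
};

struct fs_info_sketch {
    struct reloc_control_sketch *reloc_ctl;
};

/* True only when a reloc control exists and it is in the merge stage. */
static bool in_merge_stage(const struct fs_info_sketch *fs_info)
{
    return fs_info->reloc_ctl && fs_info->reloc_ctl->merge_reloc_tree;
}
```

    Because && short-circuits, the merge_reloc_tree field is read only when
    reloc_ctl is non-NULL.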
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: send: fix buffer overflow detection when copying path to cache entry [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Thu Sep 19 22:20:34 2024 +0100

    btrfs: send: fix buffer overflow detection when copying path to cache entry
    
    commit 96c6ca71572a3556ed0c37237305657ff47174b7 upstream.
    
    Starting with commit c0247d289e73 ("btrfs: send: annotate struct
    name_cache_entry with __counted_by()") we annotated the variable length
    array "name" from the name_cache_entry structure with __counted_by() to
    improve overflow detection. However that alone was not correct, because
    the length of that array does not match the "name_len" field - it matches
    that plus 1 to include the NUL string terminator, so that makes a
    fortified kernel think there's an overflow and report a splat like this:
    
      strcpy: detected buffer overflow: 20 byte write of buffer size 19
      WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50
      CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1
      Hardware name: CompuLab Ltd.  sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018
      RIP: 0010:__fortify_report+0x45/0x50
      Code: 48 8b 34 (...)
      RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246
      RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027
      RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8
      RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd
      R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400
      R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8
      FS:  00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0
      Call Trace:
       <TASK>
       ? __warn+0x12a/0x1d0
       ? __fortify_report+0x45/0x50
       ? report_bug+0x154/0x1c0
       ? handle_bug+0x42/0x70
       ? exc_invalid_op+0x1a/0x50
       ? asm_exc_invalid_op+0x1a/0x20
       ? __fortify_report+0x45/0x50
       __fortify_panic+0x9/0x10
      __get_cur_name_and_parent+0x3bc/0x3c0
       get_cur_path+0x207/0x3b0
       send_extent_data+0x709/0x10d0
       ? find_parent_nodes+0x22df/0x25d0
       ? mas_nomem+0x13/0x90
       ? mtree_insert_range+0xa5/0x110
       ? btrfs_lru_cache_store+0x5f/0x1e0
       ? iterate_extent_inodes+0x52d/0x5a0
       process_extent+0xa96/0x11a0
       ? __pfx_lookup_backref_cache+0x10/0x10
       ? __pfx_store_backref_cache+0x10/0x10
       ? __pfx_iterate_backrefs+0x10/0x10
       ? __pfx_check_extent_item+0x10/0x10
       changed_cb+0x6fa/0x930
       ? tree_advance+0x362/0x390
       ? memcmp_extent_buffer+0xd7/0x160
       send_subvol+0xf0a/0x1520
       btrfs_ioctl_send+0x106b/0x11d0
       ? __pfx___clone_root_cmp_sort+0x10/0x10
       _btrfs_ioctl_send+0x1ac/0x240
       btrfs_ioctl+0x75b/0x850
       __se_sys_ioctl+0xca/0x150
       do_syscall_64+0x85/0x160
       ? __count_memcg_events+0x69/0x100
       ? handle_mm_fault+0x1327/0x15c0
       ? __se_sys_rt_sigprocmask+0xf1/0x180
       ? syscall_exit_to_user_mode+0x75/0xa0
       ? do_syscall_64+0x91/0x160
       ? do_user_addr_fault+0x21d/0x630
      entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7fae145eeb4f
      Code: 00 48 89 (...)
      RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f
      RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004
      RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927
      R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8
      R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004
       </TASK>
    
    Fix this by not storing the NUL string terminator since we don't actually
    need it for name cache entries; this way "name_len" corresponds to the
    actual size of the "name" array. This requires marking the "name" array
    field with __nonstring and using memcpy() instead of strcpy() as
    recommended by the guidelines at:
    
       https://github.com/KSPP/linux/issues/90
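    
    A minimal userspace sketch of the idea, assuming a hypothetical
    toy_name_entry type: the array holds exactly name_len bytes and is
    filled with memcpy(), so no terminator is written past the counted
    size. The kernel-only __counted_by() and __nonstring annotations are
    omitted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down cache entry (not the kernel structure). */
struct toy_name_entry {
	int name_len;
	char name[];	/* exactly name_len bytes, no NUL terminator */
};

/* Allocate an entry and copy exactly name_len bytes of the name. */
struct toy_name_entry *toy_name_entry_new(const char *name, int name_len)
{
	struct toy_name_entry *e;

	assert(name != NULL && name_len >= 0);
	e = malloc(sizeof(*e) + (size_t)name_len);
	if (!e)
		return NULL;
	e->name_len = name_len;
	/* strcpy() would write name_len + 1 bytes; memcpy() writes name_len. */
	memcpy(e->name, name, (size_t)name_len);
	return e;
}
```

    Because the stored name is not NUL-terminated, readers of such an entry
    have to use the length field rather than string functions.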
    
    Reported-by: David Arendt <[email protected]>
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    Fixes: c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()")
    CC: [email protected] # 6.11
    Tested-by: David Arendt <[email protected]>
    Reviewed-by: Josef Bacik <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: send: fix invalid clone operation for file that got its size decreased [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Fri Sep 27 10:50:12 2024 +0100

    btrfs: send: fix invalid clone operation for file that got its size decreased
    
    commit fa630df665aa9ddce3a96ce7b54e10a38e4d2a2b upstream.
    
    During an incremental send we may end up sending an invalid clone
    operation, for the last extent of a file which ends at an unaligned offset
    that matches the final i_size of the file in the send snapshot, in case
    the file had its initial size (the size in the parent snapshot) decreased
    in the send snapshot. In this case the destination will fail to apply the
    clone operation because its end offset is not sector size aligned and it
    ends before the current size of the file.
    
    Sending the truncate operation always happens when we finish processing an
    inode, after we process all its extents (and xattrs, names, etc). So fix
    this by ensuring the file has a valid size before we send a clone
    operation for an unaligned extent that ends at the final i_size of the
    file. The size we truncate to matches the start offset of the clone range
    but it could be any value between that start offset and the final size of
    the file since the clone operation will expand the i_size if the current
    size is smaller than the end offset. The start offset of the range was
    chosen because it's always sector size aligned and avoids a truncation
    into the middle of a page, which results in dirtying the page due to
    filling part of it with zeroes and then making the clone operation at the
    receiver trigger IO.
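    
    The decision described above can be modelled roughly as follows. This
    is a simplified sketch, not the code from the patch: the function name,
    its parameters and the -1 "no truncate" return value are all
    assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the size to truncate the destination file to before cloning the
 * range [start, end), or -1 if no truncate is needed in this model.
 */
int64_t toy_pre_clone_truncate_size(uint64_t start, uint64_t end,
				    uint64_t final_isize,
				    uint64_t sectorsize, bool incremental)
{
	assert(sectorsize != 0 && start <= end);
	if (end % sectorsize == 0)
		return -1;	/* aligned end: clone is valid as is */
	if (end != final_isize)
		return -1;	/* unaligned end not at i_size: not this case */
	if (!incremental)
		return -1;	/* no parent snapshot, file cannot be larger */
	/* The start offset is sector aligned, so no page gets zero-filled. */
	return (int64_t)start;
}
```

    With the numbers from the reproducer below (last extent starting at
    128K, ending at 256K + 5 bytes, 4K sectors assumed), this model returns
    131072.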
    
    The following test reproduces the issue:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/sdi
      MNT=/mnt/sdi
    
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      # Create a file with a size of 256K + 5 bytes, having two extents, one
      # with a size of 128K and another one with a size of 128K + 5 bytes.
      last_ext_size=$((128 * 1024 + 5))
      xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \
             -c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \
             $MNT/foo
    
      # Another file which we will later clone foo into, but initially with
      # a larger size than foo.
      xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap1
    
      # Now resize bar and clone foo into it.
      xfs_io -c "truncate 0" \
             -c "reflink $MNT/foo" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap2
    
      rm -f /tmp/send-full /tmp/send-inc
      btrfs send -f /tmp/send-full $MNT/snap1
      btrfs send -p $MNT/snap1 -f /tmp/send-inc $MNT/snap2
    
      umount $MNT
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      btrfs receive -f /tmp/send-full $MNT
      btrfs receive -f /tmp/send-inc $MNT
    
      umount $MNT
    
    Running it before this patch:
    
      $ ./test.sh
      (...)
      At subvol snap1
      At snapshot snap2
      ERROR: failed to clone extents to bar: Invalid argument
    
    A test case for fstests will be sent soon.
    
    Reported-by: Ben Millwood <[email protected]>
    Link: https://lore.kernel.org/linux-btrfs/CAJhrHS2z+WViO2h=ojYvBPDLsATwLbg+7JaNCyYomv0fUxEpQQ@mail.gmail.com/
    Fixes: 46a6e10a1ab1 ("btrfs: send: allow cloning non-aligned extent if it ends at i_size")
    CC: [email protected] # 6.11
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: wait for fixup workers before stopping cleaner kthread during umount [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Tue Oct 1 11:06:52 2024 +0100

    btrfs: wait for fixup workers before stopping cleaner kthread during umount
    
    commit 41fd1e94066a815a7ab0a7025359e9b40e4b3576 upstream.
    
    During unmount, at close_ctree(), we have the following steps in this order:
    
    1) Park the cleaner kthread - this doesn't destroy the kthread, it basically
       halts its execution (wake ups against it work but do nothing);
    
    2) We stop the cleaner kthread - this results in freeing the respective
       struct task_struct;
    
    3) We call btrfs_stop_all_workers() which waits for any jobs running in all
       the work queues and then frees the work queues.
    
    Syzbot reported a case where a fixup worker resulted in a crash when doing
    a delayed iput on its inode while attempting to wake up the cleaner at
    btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread
    was already freed. This can happen during unmount because we don't wait
    for any fixup workers still running before we call kthread_stop() against
    the cleaner kthread, which stops it and frees all its resources.
    
    Fix this by waiting for any fixup workers at close_ctree() before we call
    kthread_stop() against the cleaner and run pending delayed iputs.
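    
    The ordering problem can be illustrated with a small single-threaded
    model. Everything here is a hypothetical toy (no real kthreads or work
    queues); it only counts how many fixup jobs would touch the cleaner
    after it has been stopped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state: one cleaner and a counter of queued fixup jobs. */
struct toy_fs {
	bool cleaner_alive;
	int pending_fixups;
	int wakeups_after_free;	/* would be use-after-free in the kernel */
};

/* A fixup job ends by trying to wake up the cleaner. */
void toy_run_fixup(struct toy_fs *fs)
{
	assert(fs->pending_fixups > 0);
	if (!fs->cleaner_alive)
		fs->wakeups_after_free++;
	fs->pending_fixups--;
}

void toy_flush_fixups(struct toy_fs *fs)
{
	while (fs->pending_fixups > 0)
		toy_run_fixup(fs);
}

void toy_close_ctree(struct toy_fs *fs, bool wait_for_fixups_first)
{
	if (wait_for_fixups_first)
		toy_flush_fixups(fs);	/* the ordering the fix introduces */
	fs->cleaner_alive = false;	/* stands in for kthread_stop() */
	toy_flush_fixups(fs);		/* stands in for stopping all workers */
}
```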
    
    The stack traces reported by syzbot were the following:
    
      BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
      Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52
    
      CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
      Workqueue: btrfs-fixup btrfs_work_helper
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0x169/0x550 mm/kasan/report.c:488
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154
       btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842
       btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314
       process_one_work kernel/workqueue.c:3229 [inline]
       process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
       worker_thread+0x870/0xd30 kernel/workqueue.c:3391
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
    
      Allocated by task 2:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       unpoison_slab_object mm/kasan/common.c:319 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Freed by task 61:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:230 [inline]
       slab_free_hook mm/slub.c:2343 [inline]
       slab_free mm/slub.c:4580 [inline]
       kmem_cache_free+0x1a2/0x420 mm/slub.c:4682
       put_task_struct include/linux/sched/task.h:144 [inline]
       delayed_put_task_struct+0x125/0x300 kernel/exit.c:228
       rcu_do_batch kernel/rcu/tree.c:2567 [inline]
       rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823
       handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
       __do_softirq kernel/softirq.c:588 [inline]
       invoke_softirq kernel/softirq.c:428 [inline]
       __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
       irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
       instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline]
       sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
    
      Last potentially related work creation:
       kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
       __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
       __call_rcu_common kernel/rcu/tree.c:3086 [inline]
       call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190
       context_switch kernel/sched/core.c:5318 [inline]
       __schedule+0x184b/0x4ae0 kernel/sched/core.c:6675
       schedule_idle+0x56/0x90 kernel/sched/core.c:6793
       do_idle+0x56a/0x5d0 kernel/sched/idle.c:354
       cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
       start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314
       common_startup_64+0x13e/0x147
    
      The buggy address belongs to the object at ffff8880272a8000
       which belongs to the cache task_struct of size 7424
      The buggy address is located 2584 bytes inside of
       freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x272a8
      head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: f5(slab)
      raw: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      head: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000003 ffffea00009caa01 ffffffffffffffff 0000000000000000
      head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 2, tgid 2 (kthreadd), ts 71247381401, free_ts 71214998153
       set_page_owner include/linux/page_owner.h:32 [inline]
       post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
       prep_new_page mm/page_alloc.c:1545 [inline]
       get_page_from_freelist+0x3039/0x3180 mm/page_alloc.c:3457
       __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4733
       alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
       alloc_slab_page+0x6a/0x120 mm/slub.c:2413
       allocate_slab+0x5a/0x2f0 mm/slub.c:2579
       new_slab mm/slub.c:2632 [inline]
       ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3819
       __slab_alloc+0x58/0xa0 mm/slub.c:3909
       __slab_alloc_node mm/slub.c:3962 [inline]
       slab_alloc_node mm/slub.c:4123 [inline]
       kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      page last free pid 5230 tgid 5230 stack trace:
       reset_page_owner include/linux/page_owner.h:25 [inline]
       free_pages_prepare mm/page_alloc.c:1108 [inline]
       free_unref_page+0xcd0/0xf00 mm/page_alloc.c:2638
       discard_slab mm/slub.c:2678 [inline]
       __put_partials+0xeb/0x130 mm/slub.c:3146
       put_cpu_partial+0x17c/0x250 mm/slub.c:3221
       __slab_free+0x2ea/0x3d0 mm/slub.c:4450
       qlink_free mm/kasan/quarantine.c:163 [inline]
       qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179
       kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
       __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4142
       getname_flags+0xb7/0x540 fs/namei.c:139
       do_sys_openat2+0xd2/0x1d0 fs/open.c:1409
       do_sys_open fs/open.c:1430 [inline]
       __do_sys_openat fs/open.c:1446 [inline]
       __se_sys_openat fs/open.c:1441 [inline]
       __x64_sys_openat+0x247/0x2a0 fs/open.c:1441
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
      Memory state around the buggy address:
       ffff8880272a8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880272a8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
       ffff8880272a8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cachefiles: fix dentry leak in cachefiles_open_file() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 29 16:34:09 2024 +0800

    cachefiles: fix dentry leak in cachefiles_open_file()
    
    commit da6ef2dffe6056aad3435e6cf7c6471c2a62187c upstream.
    
    A dentry leak may be caused when a lookup cookie and a cull are concurrent:
    
                P1             |             P2
    -----------------------------------------------------------
    cachefiles_lookup_cookie
      cachefiles_look_up_object
        lookup_one_positive_unlocked
         // get dentry
                                cachefiles_cull
                                  inode->i_flags |= S_KERNEL_FILE;
        cachefiles_open_file
          cachefiles_mark_inode_in_use
            __cachefiles_mark_inode_in_use
              can_use = false
              if (!(inode->i_flags & S_KERNEL_FILE))
                can_use = true
              return false
            return false
            // Returns an error but doesn't put dentry
    
    After that the following WARNING will be triggered when the backend folder
    is unmounted:
    
    ==================================================================
    BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img}  still in use (1) [unmount of ext4 sda]
    WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70
    CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25
    RIP: 0010:umount_check+0x5d/0x70
    Call Trace:
     <TASK>
     d_walk+0xda/0x2b0
     do_one_tree+0x20/0x40
     shrink_dcache_for_umount+0x2c/0x90
     generic_shutdown_super+0x20/0x160
     kill_block_super+0x1a/0x40
     ext4_kill_sb+0x22/0x40
     deactivate_locked_super+0x35/0x80
     cleanup_mnt+0x104/0x160
    ==================================================================
    
    Whether cachefiles_open_file() returns true or false, the reference count
    obtained by lookup_positive_unlocked() in cachefiles_look_up_object()
    should be released.
    
    Therefore release that reference count in cachefiles_look_up_object() to
    fix the above issue and simplify the code.
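    
    The reference handling can be sketched with a toy refcount. The toy_*
    names are hypothetical, and the assumption that a successfully opened
    file keeps its own reference is part of this model, not a statement
    about the exact kernel code.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dentry {
	int refcount;
};

/* The lookup returns the dentry with one extra reference held. */
struct toy_dentry *toy_lookup(struct toy_dentry *d)
{
	d->refcount++;
	return d;
}

void toy_dput(struct toy_dentry *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Fails when a concurrent cull already marked the inode as in use. */
bool toy_open_file(struct toy_dentry *d, bool marked_in_use)
{
	if (marked_in_use)
		return false;
	d->refcount++;	/* assumed: the opened file holds its own reference */
	return true;
}

/* The lookup reference is dropped on both the success and failure path. */
bool toy_look_up_object(struct toy_dentry *d, bool marked_in_use)
{
	struct toy_dentry *dentry = toy_lookup(d);
	bool ok = toy_open_file(dentry, marked_in_use);

	toy_dput(dentry);
	return ok;
}
```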
    
    Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode [+ + +]

Author: Stefan Mätje <[email protected]>
Date:   Thu Aug 8 18:42:24 2024 +0200

    can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode
    
    [ Upstream commit 2423cc20087ae9a7b7af575aa62304ef67cad7b6 ]
    
    This patch moves the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_changelink in front of the evaluation of data[IFLA_CAN_BITTIMING].
    
    This avoids a call to do_set_data_bittiming providing a stale
    can_priv::ctrlmode with a CAN_CTRLMODE_FD flag not matching the
    requested state when switching between a CAN Classic and CAN-FD bitrate.
    
    In the same manner the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_validate is also moved in front of the evaluation of
    data[IFLA_CAN_BITTIMING].
    
    This is a preparation for patches where the nominal and data bittiming
    may have interdependencies on the driver side depending on the
    CAN_CTRLMODE_FD flag state.
    
    Signed-off-by: Stefan Mätje <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Marc Kleine-Budde <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ceph: fix a memory leak on cap_auths in MDS client [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Aug 19 10:52:17 2024 +0100

    ceph: fix a memory leak on cap_auths in MDS client
    
    [ Upstream commit d97079e97eab20e08afc507f2bed4501e2824717 ]
    
    The cap_auths that are allocated during an MDS session opening are never
    released, causing a memory leak detected by kmemleak.  Fix this by freeing
    the memory allocated when shutting down the MDS client.
    
    Fixes: 1d17de9534cb ("ceph: save cap_auths in MDS client when session is opened")
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Xiubo Li <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ceph: fix cap ref leak via netfs init_request [+ + +]

Author: Patrick Donnelly <[email protected]>
Date:   Wed Oct 2 21:05:12 2024 -0400

    ceph: fix cap ref leak via netfs init_request
    
    commit ccda9910d8490f4fb067131598e4b2e986faa5a0 upstream.
    
    Log recovered from a user's cluster:
    
        <7>[ 5413.970692] ceph:  get_cap_refs 00000000958c114b ret 1 got Fr
        <7>[ 5413.970695] ceph:  start_read 00000000958c114b, no cache cap
        ...
        <7>[ 5473.934609] ceph:   my wanted = Fr, used = Fr, dirty -
        <7>[ 5473.934616] ceph:  revocation: pAsLsXsFr -> pAsLsXs (revoking Fr)
        <7>[ 5473.934632] ceph:  __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs
        <7>[ 5473.934638] ceph:  check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr  AUTHONLY NOINVAL FLUSH_FORCE
    
    The MDS subsequently complains that the kernel client is late releasing
    caps.
    
    Approximately, a series of changes to this code by commits 49870056005c
    ("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315
    ("netfs: Change ->init_request() to return an error code") and
    a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    resulted in subtle resource cleanup being missed. The main culprit is
    the change in error handling in 2de160417315 which meant that a failure
    in init_request() would no longer cause cleanup to be called. That
    would prevent the ceph_put_cap_refs() call which would clean up the
    leaked cap ref.
    
    Cc: [email protected]
    Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    Link: https://tracker.ceph.com/issues/67008
    Signed-off-by: Patrick Donnelly <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: remove the incorrect Fw reference check when dirtying pages [+ + +]

Author: Xiubo Li <[email protected]>
Date:   Thu Sep 5 06:22:18 2024 +0800

    ceph: remove the incorrect Fw reference check when dirtying pages
    
    [ Upstream commit c08dfb1b49492c09cf13838c71897493ea3b424e ]
    
    When doing direct-io reads, it will also try to mark pages dirty,
    but the read path does not hold the Fw caps, and there is no case
    in which it will get the Fw reference.
    
    Fixes: 5dda377cf0a6 ("ceph: set i_head_snapc when getting CEPH_CAP_FILE_WR reference")
    Signed-off-by: Xiubo Li <[email protected]>
    Reviewed-by: Patrick Donnelly <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Do not convert delimiter when parsing NFS-style symlinks [+ + +]

Author: Pali Rohár <[email protected]>
Date:   Sat Sep 28 23:59:46 2024 +0200

    cifs: Do not convert delimiter when parsing NFS-style symlinks
    
    [ Upstream commit d3a49f60917323228f8fdeee313260ef14f94df7 ]
    
    NFS-style symlinks always have their target location stored in NFS/UNIX
    form, where a backslash means the real UNIX backslash and not the SMB
    path separator.
    
    So do not mangle slash and backslash content of an NFS-style symlink
    during the readlink() syscall as it is already in the correct Linux form.
    
    This fixes interoperability of NFS-style symlinks with backslashes created
    by a Linux NFS3 client through a Windows NFS server and retrieved by a
    Linux SMB client through a Windows SMB server, where both Windows servers
    export the same directory.
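    
    A minimal sketch of the behaviour, using a hypothetical helper rather
    than the real cifs functions: backslashes are converted only for
    targets that are not NFS-style.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Convert SMB path separators to '/' in place, unless the target comes
 * from an NFS-style symlink, where a backslash is an ordinary character.
 */
void toy_fixup_symlink_target(char *target, bool nfs_style)
{
	assert(target != NULL);
	if (nfs_style)
		return;	/* already in Linux form, leave it untouched */
	for (char *p = target; *p; p++)
		if (*p == '\\')
			*p = '/';
}
```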
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Fix buffer overflow when parsing NFS reparse points [+ + +]

Author: Pali Rohár <[email protected]>
Date:   Sun Sep 29 12:22:40 2024 +0200

    cifs: Fix buffer overflow when parsing NFS reparse points
    
    [ Upstream commit e2a8910af01653c1c268984855629d71fb81f404 ]
    
    ReparseDataLength is the sum of the InodeType size and the DataBuffer
    size. So to get the DataBuffer size, the InodeType size must be
    subtracted from ReparseDataLength.
    
    Function cifs_strndup_from_utf16() is currently accessing buf->DataBuffer
    at a position after the end of the buffer because the InodeType size is
    not subtracted from the length. Fix this problem and correctly subtract
    it from the variable len.
    
    Member InodeType is present only when the reparse buffer is large enough.
    Check ReparseDataLength before accessing InodeType to prevent another
    invalid memory access.
    
    Major and minor rdev values are also present only when the reparse buffer
    is large enough. Check the reparse buffer size before calling
    reparse_mkdev().
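    
    The length arithmetic can be sketched like this. The sizes are
    assumptions for illustration (a 64-bit InodeType and two 32-bit rdev
    values), and the toy_* helpers are hypothetical, not the cifs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INODE_TYPE_SIZE sizeof(uint64_t)		/* assumed size */
#define TOY_RDEV_SIZE (2 * sizeof(uint32_t))		/* assumed major + minor */

/*
 * Compute the DataBuffer length from ReparseDataLength.
 * Returns -1 when the buffer is too small to even hold InodeType.
 */
int toy_nfs_data_len(uint16_t reparse_data_length, size_t *data_len)
{
	assert(data_len != NULL);
	if (reparse_data_length < TOY_INODE_TYPE_SIZE)
		return -1;
	*data_len = reparse_data_length - TOY_INODE_TYPE_SIZE;
	return 0;
}

/* Only read major/minor when the data part can actually hold them. */
bool toy_nfs_has_rdev(uint16_t reparse_data_length)
{
	size_t data_len;

	if (toy_nfs_data_len(reparse_data_length, &data_len) != 0)
		return false;
	return data_len >= TOY_RDEV_SIZE;
}
```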
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Remove intermediate object of failed create reparse call [+ + +]

Author: Pali Rohár <[email protected]>
Date:   Mon Sep 30 22:25:10 2024 +0200

    cifs: Remove intermediate object of failed create reparse call
    
    [ Upstream commit c9432ad5e32f066875b1bf95939c363bc46d6a45 ]
    
    If CREATE was successful but SMB2_OP_SET_REPARSE failed then remove the
    intermediate object created by CREATE. Otherwise an empty object stays on
    the server when the reparse call fails.
    
    This ensures that if the creation of special files is unsupported by the
    server then no empty file stays on the server as a result of the
    unsupported operation.
    
    Fixes: 102466f303ff ("smb: client: allow creating special files via reparse points")
    Signed-off-by: Pali Rohár <[email protected]>
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL [+ + +]

Author: Ajit Pandey <[email protected]>
Date:   Tue Jun 11 19:07:45 2024 +0530

    clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL
    
    commit fff617979f97c773aaa9432c31cf62444b3bdbd4 upstream.
    
    In the LUCID EVO PLL, the CAL_L_VAL and L_VAL bitfields are part of a
    single PLL_L_VAL register. Updating the L_VAL bitfield in the PLL_L_VAL
    register with the regmap_write() API in the __alpha_pll_trion_set_rate
    callback overrides the LUCID EVO PLL initial configuration of the
    PLL_CAL_L_VAL bitfield in the PLL_L_VAL register.
    
    Random PLL lock failures were observed during PLL enable due to this
    override of the PLL calibration value. Use regmap_update_bits() with the
    L_VAL bitfield mask instead of the regmap_write() API to update only the
    L_VAL bitfield of PLL_L_VAL in the __alpha_pll_trion_set_rate callback.
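    
    The difference between a full register write and a masked update can
    be shown with plain integers. The 16-bit field layout and the toy_*
    helpers are assumptions for this sketch; they only mimic the semantics
    of a whole-register write and a read-modify-write.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_L_VAL_MASK 0x0000ffffu	/* assumed: L_VAL in the low 16 bits */

/* Full write: every bit of the register is replaced. */
uint32_t toy_reg_write(uint32_t old, uint32_t val)
{
	(void)old;
	return val;
}

/* Masked update: only bits inside mask change, the rest is preserved. */
uint32_t toy_reg_update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	assert(mask != 0);
	return (old & ~mask) | (val & mask);
}
```

    Starting from 0x00440020, writing a new L_VAL of 0x30 with the full
    write yields 0x00000030 and loses the upper calibration field, while
    the masked update yields 0x00440030.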
    
    Fixes: 260e36606a03 ("clk: qcom: clk-alpha-pll: add Lucid EVO PLL configuration interfaces")
    Cc: [email protected]
    Signed-off-by: Ajit Pandey <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Acked-by: Vladimir Zapolskiy <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: clk-rpmh: Fix overflow in BCM vote [+ + +]

Author: Mike Tipton <[email protected]>
Date:   Fri Aug 9 10:51:29 2024 +0530

    clk: qcom: clk-rpmh: Fix overflow in BCM vote
    
    commit a4e5af27e6f6a8b0d14bc0d7eb04f4a6c7291586 upstream.
    
    Valid frequencies may result in BCM votes that exceed the max HW value.
    Set vote ceiling to BCM_TCS_CMD_VOTE_MASK to ensure the votes aren't
    truncated, which can result in lower frequencies than desired.
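    
    A small sketch of why a ceiling is needed. The mask value used here is
    an assumption for illustration, and the toy_* helpers are hypothetical;
    they contrast masking an oversized vote with clamping it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VOTE_MASK 0x3fffu	/* assumed width of the vote field */

/* Without a ceiling, high bits are silently dropped by the mask. */
uint32_t toy_vote_truncated(uint64_t raw)
{
	return (uint32_t)(raw & TOY_VOTE_MASK);
}

/* With a ceiling, an oversized vote saturates at the maximum value. */
uint32_t toy_vote_clamped(uint64_t raw)
{
	uint32_t vote = raw > TOY_VOTE_MASK ? TOY_VOTE_MASK : (uint32_t)raw;

	assert(vote <= TOY_VOTE_MASK);
	return vote;
}
```

    In this model a raw vote of 0x4001 becomes 1 when truncated, but
    0x3fff when clamped.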
    
    Fixes: 04053f4d23a4 ("clk: qcom: clk-rpmh: Add IPA clock support")
    Cc: [email protected]
    Signed-off-by: Mike Tipton <[email protected]>
    Reviewed-by: Taniya Das <[email protected]>
    Signed-off-by: Imran Shaik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks [+ + +]

Author: Dmitry Baryshkov <[email protected]>
Date:   Sun Aug 4 08:40:05 2024 +0300

    clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks
    
    commit 0e93c6320ecde0583de09f3fe801ce8822886fec upstream.
    
    Add CLK_SET_RATE_PARENT for several branch clocks. Such clocks don't
    have a way to change the rate, so set the parent rate instead.
    
    Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
    Cc: [email protected]
    Signed-off-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Add GPLL9 support [+ + +]

Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:03 2024 +0530

    clk: qcom: gcc-sc8180x: Add GPLL9 support
    
    commit 818a2f8d5e4ad2c1e39a4290158fe8e39a744c70 upstream.
    
    Add the missing GPLL9 pll and fix the gcc_parents_7 data to use
    the correct pll hw.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table [+ + +]

Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:04 2024 +0530

    clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table
    
    commit b8acaf2de8081371761ab4cf1e7a8ee4e7acc139 upstream.
    
    Update the frequency tables of gcc_sdcc2_apps_clk and gcc_sdcc4_apps_clk
    as per the latest frequency plan.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x [+ + +]

Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:01 2024 +0530

    clk: qcom: gcc-sc8180x: Register QUPv3 RCGs for DFS on sc8180x
    
    commit 1fc8c02e1d80463ce1b361d82b83fc43bb92d964 upstream.
    
    QUPv3 clocks support DFS on the sc8180x platform, but the code for it
    is currently missing from the driver. As a result, not all of the
    DFS-supported frequencies are populated, and an incorrect frequency is
    returned when clients request them. Add the DFS registration for the
    QUPv3 RCGs.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src [+ + +]

Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:05 2024 +0530

    clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src
    
    commit bab0c7a0bc586e736b7cd2aac8e6391709a70ef2 upstream.
    
    The branch clocks of gcc_cpuss_ahb_clk_src are marked critical,
    so they vote on XO and block suspend.
    De-register these clocks and their source, as no rate
    setting happens on them.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Fri Jul 19 19:12:38 2024 +0530

    clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit ade508b545c969c72cd68479f275a5dd640fd8b9 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to tell the GDSC driver not to turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters a low power state during system
    suspend.
    
    Cc: [email protected] # 5.7
    Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Jul 22 16:27:33 2024 +0530

    clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit 889e1332310656961855c0dcedbb4dbe78e39d22 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to tell the GDSC driver not to turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters a low power state during system
    suspend.
    
    Cc: [email protected] # 5.17
    Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: rockchip: fix error for unknown clocks [+ + +]

Author: Sebastian Reichel <[email protected]>
Date:   Mon Mar 25 20:33:36 2024 +0100

    clk: rockchip: fix error for unknown clocks
    
    commit 12fd64babaca4dc09d072f63eda76ba44119816a upstream.
    
    There is a clk == NULL check after the switch to check for
    unsupported clk types. Since clk is re-assigned in a loop,
    this check is useless right now for anything but the first
    round. Let's fix this up by assigning clk = NULL in the
    loop before the switch statement.
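
    The stale-pointer problem can be reproduced with a small user-space
    sketch. The enum, the struct and count_unsupported() are hypothetical
    and only mimic the shape of the loop described above:

```c
#include <assert.h>
#include <stddef.h>

enum branch_type { BRANCH_MUX, BRANCH_GATE, BRANCH_UNKNOWN };

struct clk { int id; };

/* Counts branches flagged as unsupported; reset_each_round selects the fix. */
static int count_unsupported(const enum branch_type *types, size_t n,
			     int reset_each_round)
{
	static struct clk dummy = { 1 };
	struct clk *clk = NULL;
	int unsupported = 0;

	assert(types != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (reset_each_round)
			clk = NULL;	/* the fix: forget the previous round */

		switch (types[i]) {
		case BRANCH_MUX:
		case BRANCH_GATE:
			clk = &dummy;	/* stand-in for a register call */
			break;
		default:
			break;		/* unknown type: clk is not assigned */
		}

		if (clk == NULL)
			unsupported++;	/* only reliable if clk was reset */
	}
	return unsupported;
}
```

    Without the reset, an unknown type that follows a known one inherits
    the previous pointer and passes the NULL check unnoticed.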
    
    Fixes: a245fecbb806 ("clk: rockchip: add basic infrastructure for clock branches")
    Cc: [email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    [added fixes + stable-cc]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix [+ + +]

Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:47 2024 +0200

    clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix
    
    commit 217a5f23c290c349ceaa37a6f2c014ad4c2d5759 upstream.
    
    Update CLKS_NR_FSYS to the proper value after a fix in DT bindings.
    This should always be the ID of the last clock in a CMU, plus 1.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

close_range(): fix the logics in descriptor table trimming [+ + +]

Author: Al Viro <[email protected]>
Date:   Fri Aug 16 15:17:00 2024 -0400

    close_range(): fix the logics in descriptor table trimming
    
    commit 678379e1d4f7443b170939525d3312cfc37bf86b upstream.
    
    Cloning a descriptor table picks the size that would cover all currently
    opened files.  That's fine for clone() and unshare(), but for close_range()
    there's an additional twist - we clone before we close, and it would be
    a shame to have
            close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
    leave us with a huge descriptor table when we are not going to keep
    anything past stderr, just because some large file descriptor used to
    be open before our call has taken it out.
    
    Unfortunately, it had been dealt with in an inherently racy way -
    sane_fdtable_size() gets a "don't copy anything past that" argument
    (passed via unshare_fd() and dup_fd()), close_range() decides how much
    should be trimmed and passes that to unshare_fd().
    
    The problem is, a range that used to extend to the end of descriptor
    table back when close_range() had looked at it might very well have stuff
    grown after it by the time dup_fd() has allocated a new files_struct
    and started to figure out the capacity of fdtable to be attached to that.
    
    That leads to interesting pathological cases; at the very least it's a
    QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
    a snapshot of descriptor table one might have observed at some point.
    Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
    of unshare(CLONE_FILES) with plain close_range(), ending up with a
    weird state that would never occur with unshare(2) is confusing, to put
    it mildly.
    
    It's not hard to get rid of - all it takes is passing both ends of the
    range down to sane_fdtable_size().  There we are under ->files_lock,
    so the race is trivially avoided.
    
    So we do the following:
            * switch close_files() from calling unshare_fd() to calling
    dup_fd().
            * undo the calling convention change done to unshare_fd() in
    60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
            * introduce struct fd_range, pass a pointer to that to dup_fd()
    and sane_fdtable_size() instead of "trim everything past that point"
    they are currently getting.  NULL means "we are not going to be punching
    any holes"; NR_OPEN_MAX is gone.
            * make sane_fdtable_size() use find_last_bit() instead of
    open-coding it; it's easier to follow that way.
            * while we are at it, have dup_fd() report errors by returning
    ERR_PTR(), no need to use a separate int *errorp argument.
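
    The size computation with both ends of the range can be sketched in
    user space as below. This is a simplified illustration, not the kernel
    code: a single 64-bit word stands in for the open_fds bitmap,
    last_bit_below() and needed_slots() are hypothetical names, and the
    locking and the rounding of the result are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fd_range { unsigned int from, to; };	/* both ends inclusive */

/* Index of the highest set bit below 'limit', or 'limit' if none is set. */
static unsigned int last_bit_below(uint64_t open_fds, unsigned int limit)
{
	assert(limit <= 64);
	for (unsigned int i = limit; i-- > 0; )
		if (open_fds & (UINT64_C(1) << i))
			return i;
	return limit;
}

/*
 * Number of descriptor slots the copy needs. punch_hole == NULL means
 * nothing is going to be closed after the copy.
 */
static unsigned int needed_slots(uint64_t open_fds,
				 const struct fd_range *punch_hole)
{
	unsigned int last = last_bit_below(open_fds, 64);

	if (last == 64)
		return 0;	/* nothing is open at all */
	if (punch_hole && punch_hole->from <= last && punch_hole->to >= last) {
		/* The highest open fd will be closed: look below the hole. */
		last = last_bit_below(open_fds, punch_hole->from);
		if (last == punch_hole->from)
			return 0;
	}
	return last + 1;
}
```

    Because the range and the bitmap are examined in the same place, the
    decision is made on one consistent view of the table, provided the
    caller holds the lock while calling it.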
    
    Fixes: 60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
    Cc: [email protected]
    Signed-off-by: Al Viro <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value [+ + +]

Author: Anastasia Belova <[email protected]>
Date:   Mon Aug 26 16:38:41 2024 +0300

    cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
    
    [ Upstream commit 5493f9714e4cdaf0ee7cec15899a231400cb1a9f ]
    
    cpufreq_cpu_get() may return NULL. To avoid a NULL dereference, check
    the return value and return early on error.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Anastasia Belova <[email protected]>
    Reviewed-by: Perry Yuan <[email protected]>
    Signed-off-by: Viresh Kumar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cpufreq: Avoid a bad reference count on CPU node [+ + +]

Author: Miquel Sabaté Solà <[email protected]>
Date:   Tue Sep 17 15:42:46 2024 +0200

    cpufreq: Avoid a bad reference count on CPU node
    
    commit c0f02536fffbbec71aced36d52a765f8c4493dc2 upstream.
    
    In the parse_perf_domain function, if the call to
    of_parse_phandle_with_args returns an error, then the reference to the
    CPU device node that was acquired at the start of the function would not
    be properly decremented.
    
    Address this by declaring the variable with the __free(device_node)
    cleanup attribute.
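
    The idea behind a scope-based cleanup attribute can be shown with the
    GCC/Clang cleanup extension in user space. The struct, the helpers and
    the AUTO_PUT_NODE macro are hypothetical and only imitate the pattern;
    they are not the kernel's device_node API.

```c
#include <assert.h>
#include <stddef.h>

struct node { int refcount; };

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	n->refcount--;
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void node_put_cleanup(struct node **np)
{
	node_put(*np);
}

#define AUTO_PUT_NODE __attribute__((cleanup(node_put_cleanup)))

/* Returns -1 on the simulated parse failure, 0 otherwise. */
static int parse_domain(struct node *cpu, int parse_fails)
{
	struct node *np AUTO_PUT_NODE = node_get(cpu);

	if (!np)
		return -1;
	if (parse_fails)
		return -1;	/* early return: the reference is still dropped */
	return 0;
}
```

    Every return path, including the early error return, drops the
    reference taken at the top of the function.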
    
    Signed-off-by: Miquel Sabaté Solà <[email protected]>
    Acked-by: Viresh Kumar <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock [+ + +]

Author: Uwe Kleine-König <[email protected]>
Date:   Thu Sep 19 10:11:21 2024 +0200

    cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
    
    commit 8b4865cd904650cbed7f2407e653934c621b8127 upstream.
    
    notify_hwp_interrupt() is called via sysvec_thermal() ->
    smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context.
    For this reason it must not use a simple spin_lock that sleeps with
    PREEMPT_RT enabled. So convert it to a raw spinlock.
    
    Reported-by: xiao sheng wen <[email protected]>
    Link: https://bugs.debian.org/1076483
    Signed-off-by: Uwe Kleine-König <[email protected]>
    Acked-by: Srinivas Pandruvada <[email protected]>
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Tested-by: xiao sheng wen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request() [+ + +]

Author: Huacai Chen <[email protected]>
Date:   Wed Aug 28 14:24:59 2024 +0800

    cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
    
    [ Upstream commit 2b7ec33e534f7a10033a5cf07794acf48b182bbe ]
    
    Use raw_smp_processor_id() instead of plain smp_processor_id() in
    do_service_request(), otherwise we may get some errors with the driver
    enabled:
    
     BUG: using smp_processor_id() in preemptible [00000000] code: (udev-worker)/208
     caller is loongson3_cpufreq_probe+0x5c/0x250 [loongson3_cpufreq]
    
    Reported-by: Xi Ruoyao <[email protected]>
    Tested-by: Binbin Zhou <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Viresh Kumar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: hisilicon - fix missed error branch [+ + +]

Author: Yang Shen <[email protected]>
Date:   Sat Aug 31 17:50:07 2024 +0800

    crypto: hisilicon - fix missed error branch
    
    [ Upstream commit f386dc64e1a5d3dcb84579119ec350ab026fea88 ]
    
    If an error occurs after the SGL has been mapped successfully,
    the SGL needs to be unmapped.
    
    Otherwise, memory problems may occur.
    
    Signed-off-by: Yang Shen <[email protected]>
    Signed-off-by: Chenghai Huang <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: octeontx - Fix authenc setkey [+ + +]

Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:13:23 2024 +0800

    crypto: octeontx - Fix authenc setkey
    
    [ Upstream commit 311eea7e37c4c0b44b557d0c100860a03b4eab65 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
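
    Why allocating in setkey leaks, and how moving the allocation to the
    init hook avoids it, can be sketched as follows. The context struct,
    the function names and the fixed buffer size are assumptions made for
    this illustration and do not mirror the driver.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define KEY_BUF_LEN 64	/* hypothetical fixed-size key buffer */

struct tfm_ctx {
	unsigned char *key_buf;
	size_t key_len;
};

/* Runs once per transform: the only place that allocates. */
static int ctx_init(struct tfm_ctx *ctx)
{
	ctx->key_buf = calloc(1, KEY_BUF_LEN);
	ctx->key_len = 0;
	return ctx->key_buf ? 0 : -1;
}

/* May run many times: reuses the buffer instead of allocating again. */
static int ctx_setkey(struct tfm_ctx *ctx, const void *key, size_t len)
{
	assert(ctx->key_buf != NULL);
	if (len > KEY_BUF_LEN)
		return -1;
	memcpy(ctx->key_buf, key, len);
	ctx->key_len = len;
	return 0;
}

/* Runs once per transform: releases what ctx_init() allocated. */
static void ctx_exit(struct tfm_ctx *ctx)
{
	free(ctx->key_buf);
	ctx->key_buf = NULL;
}
```

    Calling ctx_setkey() repeatedly keeps using the same buffer, so there
    is exactly one allocation and one free per transform.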
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: octeontx* - Select CRYPTO_AUTHENC [+ + +]

Author: Herbert Xu <[email protected]>
Date:   Thu Sep 5 10:21:49 2024 +0800

    crypto: octeontx* - Select CRYPTO_AUTHENC
    
    commit c398cb8eb0a263a1b7a18892d9f244751689675c upstream.
    
    Select CRYPTO_AUTHENC as the function crypto_authenc_extractkeys
    may not be available without it.
    
    Fixes: 311eea7e37c4 ("crypto: octeontx - Fix authenc setkey")
    Fixes: 7ccb750dcac8 ("crypto: octeontx2 - Fix authenc setkey")
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

crypto: octeontx2 - Fix authenc setkey [+ + +]

Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:36:19 2024 +0800

    crypto: octeontx2 - Fix authenc setkey
    
    [ Upstream commit 7ccb750dcac8abbfc7743aab0db6a72c1c3703c7 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: simd - Do not call crypto_alloc_tfm during registration [+ + +]

Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 14:58:35 2024 +0800

    crypto: simd - Do not call crypto_alloc_tfm during registration
    
    [ Upstream commit 3c44d31cb34ce4eb8311a2e73634d57702948230 ]
    
    Algorithm registration is usually carried out during module init,
    where as little work as possible should be carried out.  The SIMD
    code violated this rule by allocating a tfm, which then triggers a
    full test of the algorithm that may deadlock in certain cases.
    
    SIMD is only allocating the tfm to get at the alg object, which is
    in fact already available as it is what we are registering.  Use
    that directly and remove the crypto_alloc_tfm call.
    
    Also remove some obsolete and unused SIMD API.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: x86/sha256 - Add parentheses around macros' single arguments [+ + +]

Author: Fangrui Song <[email protected]>
Date:   Tue Aug 13 21:48:02 2024 -0700

    crypto: x86/sha256 - Add parentheses around macros' single arguments
    
    [ Upstream commit 3363c460ef726ba693704dbcd73b7e7214ccc788 ]
    
    The macros FOUR_ROUNDS_AND_SCHED and DO_4ROUNDS rely on an
    unexpected/undocumented behavior of the GNU assembler, which might
    change in the future
    (https://sourceware.org/bugzilla/show_bug.cgi?id=32073).
    
        M (1) (2) // 1 arg !? Future: 2 args
        M 1 + 2   // 1 arg !? Future: 3 args
    
        M 1 2     // 2 args
    
    Add parentheses around the single arguments to support future GNU
    assembler and LLVM integrated assembler (when the IsOperator hack from
    the following link is dropped).
    
    Link: https://github.com/llvm/llvm-project/commit/055006475e22014b28a070db1bff41ca15f322f0
    Signed-off-by: Fangrui Song <[email protected]>
    Reviewed-by: Jan Beulich <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drivers/perf: arm_spe: Use perf_allow_kernel() for permissions [+ + +]

Author: James Clark <[email protected]>
Date:   Tue Aug 27 15:51:12 2024 +0100

    drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
    
    [ Upstream commit 5e9629d0ae977d6f6916d7e519724804e95f0b07 ]
    
    Use perf_allow_kernel() for 'pa_enable' (physical addresses),
    'pct_enable' (physical timestamps) and context IDs. This means that
    perf_event_paranoid is now taken into account and LSM hooks can be used,
    which is more consistent with other perf_event_open calls. For example
    PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just
    perfmon_capable().
    
    This also indirectly fixes the following error message which is
    misleading because perf_event_paranoid is not taken into account by
    perfmon_capable():
    
      $ perf record -e arm_spe/pa_enable/
    
      Error:
      Access to performance monitoring and observability operations is
      limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid
      setting ...
    
    Suggested-by: Al Grant <[email protected]>
    Signed-off-by: James Clark <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drivers/perf: riscv: Align errno for unsupported perf event [+ + +]

Author: Pu Lehui <[email protected]>
Date:   Sat Aug 31 07:15:20 2024 +0000

    drivers/perf: riscv: Align errno for unsupported perf event
    
    commit c625154993d0d24a962b1830cd5ed92adda2cf86 upstream.
    
    The RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It
    would be more appropriate to return -EOPNOTSUPP or -ENOENT for this
    type in pmu_sbi_event_map. Since other implementations return -ENOENT
    for unsupported perf types, do the same here. The current behavior
    causes the riscv bpf testcase perf_skip to fail. While at it, apply
    the same behavior in the other relevant places.
    
    Signed-off-by: Pu Lehui <[email protected]>
    Reviewed-by: Atish Patra <[email protected]>
    Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
    Fixes: 16d3b1af0944 ("perf: RISC-V: Check standard event availability")
    Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Add HDR workaround for specific eDP [+ + +]

Author: Alex Hung <[email protected]>
Date:   Fri Sep 6 11:39:18 2024 -0600

    drm/amd/display: Add HDR workaround for specific eDP
    
    commit 05af800704ee7187d9edd461ec90f3679b1c4aba upstream.
    
    [WHY & HOW]
    Some eDP panels suffer from flickering when HDR is enabled in KDE. This
    quirk works around it by skipping VSC that is incompatible with eDP
    panels.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2) [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Aug 2 12:35:13 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2)
    
    [ Upstream commit cd9e9e0852d501f169aa3bb34e4b413d2eb48c37 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_plane_handle_cursor_update function. Previously, 'afb' was
    assumed to be possibly null, but was used later in the code without a
    null check.
    This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.c:1298 amdgpu_dm_plane_handle_cursor_update() error: we previously assumed 'afb' could be null (see line 1252)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Co-developed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2) [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Aug 2 12:20:36 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_update_cursor (v2)
    
    [ Upstream commit 0fe20258b4989b9112b5e9470df33a0939403fd4 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_update_cursor function. Previously, 'afb' was assumed to be
    possibly null at line 8388, but was used later in the code without a
    null check.
    This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:8433 amdgpu_dm_update_cursor()
            error: we previously assumed 'afb' could be null (see line 8388)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Co-developed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:21:19 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
    
    [ Upstream commit cba7fec864172dadd953daefdd26e01742b71a6a ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs`
    are not null before their functions are accessed. This prevents a
    potential null pointer dereference.
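
    The shape of such a guarded call can be sketched as below. The structs
    are cut down to the fields needed for the illustration, and the
    init_clocks callback and the init_hw() helper are hypothetical names,
    not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct clk_mgr;

struct clk_mgr_funcs {
	void (*init_clocks)(struct clk_mgr *mgr);	/* hypothetical callback */
};

struct clk_mgr {
	const struct clk_mgr_funcs *funcs;
	int initialised;
};

struct dc {
	struct clk_mgr *clk_mgr;
};

/* Example callback used to observe whether the guarded call ran. */
static void mark_initialised(struct clk_mgr *mgr)
{
	mgr->initialised = 1;
}

/* Returns 1 if the callback ran, 0 if any link in the chain was NULL. */
static int init_hw(struct dc *dc)
{
	assert(dc != NULL);
	if (dc->clk_mgr && dc->clk_mgr->funcs &&
	    dc->clk_mgr->funcs->init_clocks) {
		dc->clk_mgr->funcs->init_clocks(dc->clk_mgr);
		return 1;
	}
	return 0;
}
```

    Each pointer in the chain is tested before the next one is followed,
    so a missing manager or a missing function table is skipped rather
    than dereferenced.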
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:58:32 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw
    
    [ Upstream commit 4b6377f0e96085cbec96eb7f0b282430ccdd3d75 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs`
    are not null before their functions are accessed. This prevents a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:44:40 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
    
    [ Upstream commit c395fd47d1565bd67671f45cca281b3acc2c31ef ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
    null.
    
    The fix adds a check to ensure `dc->clk_mgr` is not null before
    accessing its functions. This prevents a potential null pointer
    dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:09:28 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func
    
    [ Upstream commit 62ed6f0f198da04e884062264df308277628004f ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn20_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null at line 1030, but then it
    was being dereferenced without any null check at line 1048. This could
    potentially lead to a null pointer dereference error if set_output_gamma
    is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma at line 1048.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:15:00 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func
    
    [ Upstream commit 28574b08c70e56d34d6f6379326a860b96749051 ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn32_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:22:06 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn401_set_output_transfer_func
    
    [ Upstream commit dd340acd42c24a3f28dd22fae6bf38662334264c ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn401_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:18:58 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
    
    [ Upstream commit f22f4754aaa47d8c59f166ba3042182859e5dff7 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn201_acquire_free_pipe_for_layer` function. The issue could occur
    when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
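
    A minimal sketch of the early return described above. The struct
    layout and the helpers find_head_pipe() and acquire_free_pipe() are
    hypothetical; the point is only that the function returns NULL
    instead of reaching a dereference of a null head_pipe.

```c
#include <stddef.h>

/* Hypothetical, simplified pipe context. */
struct pipe_ctx {
	int pipe_idx;
	struct pipe_ctx *idle; /* a free secondary pipe, or NULL */
};

/* Returns the pipe with the given index, or NULL when there is none. */
struct pipe_ctx *find_head_pipe(struct pipe_ctx *pipes, size_t n, int idx)
{
	for (size_t i = 0; i < n; i++)
		if (pipes[i].pipe_idx == idx)
			return &pipes[i];
	return NULL;
}

struct pipe_ctx *acquire_free_pipe(struct pipe_ctx *pipes, size_t n, int idx)
{
	struct pipe_ctx *head_pipe = find_head_pipe(pipes, n, idx);

	/* Return early: an assertion that only logs would still let
	 * execution continue to the dereference below. */
	if (!head_pipe)
		return NULL;

	return head_pipe->idle;
}
```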

drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:30:16 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer
    
    [ Upstream commit ac2140449184a26eac99585b7f69814bd3ba8f2d ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
    could occur when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 25 08:14:56 2024 +0530

    drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe
    
    [ Upstream commit 8e4ed3cf1642df0c4456443d865cff61a9598aa8 ]
    
    This commit addresses a null pointer dereference issue in the
    `dcn20_program_pipe` function. The issue could occur when
    `pipe_ctx->plane_state` is null.
    
    The fix adds a check to ensure `pipe_ctx->plane_state` is not null
    before accessing it. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1925 dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could be null (see line 1877)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 25 07:23:48 2024 +0530

    drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
    
    [ Upstream commit 66d71a72539e173a9b00ca0b1852cbaa5f5bf1ad ]
    
    This commit addresses a null pointer dereference issue in the
    `commit_planes_for_stream` function at line 4140. The issue could occur
    when `top_pipe_to_program` is null.
    
    The fix adds a check to ensure `top_pipe_to_program` is not null before
    accessing its stream_res. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` [+ + +]

Author: Mario Limonciello <[email protected]>
Date:   Sun Sep 15 14:28:37 2024 -0500

    drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
    
    [ Upstream commit 87d749a6aab73d8069d0345afaa98297816cb220 ]
    
    The issue with panel power savings compatibility below
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` happens at
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` as well.
    
    That issue will be fixed separately, so don't prevent the backlight
    brightness from going that low.
    
    Cc: Harry Wentland <[email protected]>
    Cc: Thomas Weißschuh <[email protected]>
    Link: https://lore.kernel.org/amd-gfx/[email protected]/T/#m400dee4e2fc61fe9470334d20a7c8c89c9aef44f
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Avoid overflow assignment in link_dp_cts [+ + +]

Author: Alex Hung <[email protected]>
Date:   Wed Jul 17 09:17:56 2024 -0600

    drm/amd/display: Avoid overflow assignment in link_dp_cts
    
    [ Upstream commit a15268787b79fd183dd526cc16bec9af4f4e49a1 ]
    
    sampling_rate is a uint8_t but is assigned an unsigned int, so the
    value can overflow. As a result, sampling_rate is changed to uint32_t.
    
    Similarly, LINK_QUAL_PATTERN_SET has a size of 2 bits, and it should
    only be assigned a value less than or equal to 4.
    
    This fixes 2 INTEGER_OVERFLOW issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Wenjing Liu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
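
    The narrowing problem behind the sampling_rate change can be
    illustrated with a short sketch. Both helper names are hypothetical,
    and the example assumes a 32-bit unsigned int; it only shows why an
    8-bit destination loses information that a 32-bit one keeps.

```c
#include <stdint.h>

/* An 8-bit destination keeps only the low 8 bits of the value. */
uint8_t sampling_rate_u8(unsigned int raw)
{
	return (uint8_t)raw;
}

/* A 32-bit destination preserves the full value. */
uint32_t sampling_rate_u32(unsigned int raw)
{
	return raw;
}
```

    For example, 48000 is 0xBB80, so the 8-bit variant yields 0x80 (128)
    while the 32-bit variant yields 48000.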

drm/amd/display: avoid set dispclk to 0 [+ + +]

Author: Charlene Liu <[email protected]>
Date:   Wed Sep 11 19:45:09 2024 -0400

    drm/amd/display: avoid set dispclk to 0
    
    commit c36df0f5f5e5acec5d78f23c4725cc500df28843 upstream.
    
    [why]
    Setting dispclk to 0 causes stability issues.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Nicholas Kazlauskas <[email protected]>
    Signed-off-by: Charlene Liu <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 1c6b16ebf5eb2bc5740be9e37b3a69f1dfe1dded)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Check null pointer before try to access it [+ + +]

Author: Rodrigo Siqueira <[email protected]>
Date:   Tue Jul 30 20:02:45 2024 -0600

    drm/amd/display: Check null pointer before try to access it
    
    [ Upstream commit 1b686053c06ffb9f4524b288110cf2a831ff7a25 ]
    
    [why & how]
    Change the order of the pipe_ctx->plane_state check to ensure that
    plane_state is not null before accessing it.
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
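
    Why the order of the checks matters can be shown with a small
    sketch. The structs and the `visible` field are hypothetical; the
    example relies only on the fact that && in C evaluates its left
    operand first and skips the right one when the left is false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types. */
struct plane_state {
	bool visible;
};

struct pipe_ctx {
	struct plane_state *plane_state;
};

/* The pointer test comes first, so the member access on the right is
 * never evaluated when plane_state is NULL. */
bool pipe_is_visible(const struct pipe_ctx *pipe_ctx)
{
	return pipe_ctx->plane_state && pipe_ctx->plane_state->visible;
}
```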

drm/amd/display: Check null pointers before multiple uses [+ + +]

Author: Alex Hung <[email protected]>
Date:   Tue Jun 25 10:37:35 2024 -0600

    drm/amd/display: Check null pointers before multiple uses
    
    [ Upstream commit fdd5ecbbff751c3b9061d8ebb08e5c96119915b4 ]
    
    [WHAT & HOW]
    Pointers such as stream_enc and dc->bw_vbios are null checked earlier
    in the same function, so Coverity warns "implies that stream_enc and
    dc->bw_vbios might be null". They are used multiple times in the
    subsequent code and need to be checked.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before used [+ + +]

Author: Alex Hung <[email protected]>
Date:   Tue Jun 25 10:35:52 2024 -0600

    drm/amd/display: Check null pointers before used
    
    [ Upstream commit be1fb44389ca3038ad2430dac4234669bc177ee3 ]
    
    [WHAT & HOW]
    Pointers such as dc->clk_mgr are null checked earlier in the same
    function, so Coverity warns "implies that "dc->clk_mgr" might be null".
    As a result, these pointers need to be checked when used again.
    
    This fixes 10 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before using dc->clk_mgr [+ + +]

Author: Alex Hung <[email protected]>
Date:   Mon Jul 29 15:29:09 2024 -0600

    drm/amd/display: Check null pointers before using dc->clk_mgr
    
    [ Upstream commit 95d9e0803e51d5a24276b7643b244c7477daf463 ]
    
    [WHY & HOW]
    dc->clk_mgr is null checked previously in the same function, indicating
    it might be null.
    
    The function then passes "dc" to
    "dc->hwss.apply_idle_power_optimizations", which dereferences
    "dc->clk_mgr" even when it is null. (The function pointer resolves to
    "dcn35_apply_idle_power_optimizations".)
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
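
    A sketch of the caller-side guard described above, under stated
    assumptions: the structs are reduced to the fields needed here, and
    allow_idle_optimizations() is a hypothetical caller. The callee
    stands in for a hook that dereferences dc->clk_mgr unconditionally.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types. */
struct clk_mgr {
	int dispclk_khz;
};

struct dc {
	struct clk_mgr *clk_mgr;
	bool idle_optimized;
};

/* Stand-in for a hook that assumes dc->clk_mgr is valid. */
void apply_idle_power_optimizations(struct dc *dc, bool enable)
{
	dc->idle_optimized = enable && dc->clk_mgr->dispclk_khz > 0;
}

/* The caller checks the pointer before handing dc to the hook.
 * Returns true when the hook was invoked. */
bool allow_idle_optimizations(struct dc *dc, bool enable)
{
	if (!dc->clk_mgr)
		return false;

	apply_idle_power_optimizations(dc, enable);
	return true;
}
```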

drm/amd/display: Check null pointers before using them [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:38:16 2024 -0600

    drm/amd/display: Check null pointers before using them
    
    [ Upstream commit 1ff12bcd7deaeed25efb5120433c6a45dd5504a8 ]
    
    [WHAT & HOW]
    These pointers are null checked previously in the same function,
    indicating they might be null as reported by Coverity. As a result,
    they need to be checked when used again.
    
    This fixes 3 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null-initialized variables [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:34:18 2024 -0600

    drm/amd/display: Check null-initialized variables
    
    [ Upstream commit 367cd9ceba1933b63bc1d87d967baf6d9fd241d2 ]
    
    [WHAT & HOW]
    drr_timing and subvp_pipe are initialized to null and they are not
    always assigned new values. It is necessary to check for null before
    dereferencing.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Nevenko Stupar <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
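
    The pattern described above, a pointer initialized to null that a
    loop may or may not assign, can be sketched like this. The struct
    fields and subvp_refresh_hz() are hypothetical; only the variable
    name subvp_pipe is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified pipe description. */
struct pipe {
	bool is_subvp;
	int refresh_hz;
};

/* Returns the refresh rate of the first SubVP pipe, or -1 if none exists. */
int subvp_refresh_hz(const struct pipe *pipes, size_t n)
{
	const struct pipe *subvp_pipe = NULL;

	for (size_t i = 0; i < n; i++) {
		if (pipes[i].is_subvp) {
			subvp_pipe = &pipes[i];
			break;
		}
	}

	/* The loop can finish without ever assigning subvp_pipe. */
	if (!subvp_pipe)
		return -1;

	return subvp_pipe->refresh_hz;
}
```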

drm/amd/display: Check phantom_stream before it is used [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Jun 20 20:23:41 2024 -0600

    drm/amd/display: Check phantom_stream before it is used
    
    [ Upstream commit 3718a619a8c0a53152e76bb6769b6c414e1e83f4 ]
    
    dcn32_enable_phantom_stream can return null, so the returned value
    must be checked before it is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check stream before comparing them [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 20:05:14 2024 -0600

    drm/amd/display: Check stream before comparing them
    
    [ Upstream commit 35ff747c86767937ee1e0ca987545b7eed7a0810 ]
    
    [WHAT & HOW]
    amdgpu_dm can pass a null stream to dc_is_stream_unchanged. It is
    necessary to check for null before dereferencing it.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check stream_status before it is used [+ + +]

Author: Alex Hung <[email protected]>
Date:   Mon Jul 15 10:37:28 2024 -0600

    drm/amd/display: Check stream_status before it is used
    
    [ Upstream commit 58a8ee96f84d2c21abb85ad8c22d2bbdf59bd7a9 ]
    
    [WHAT & HOW]
    dc_state_get_stream_status can return null, and therefore null must be
    checked before stream_status is used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Deallocate DML memory if allocation fails [+ + +]

Author: Chris Park <[email protected]>
Date:   Fri Jun 28 15:09:06 2024 -0400

    drm/amd/display: Deallocate DML memory if allocation fails
    
    [ Upstream commit 892abca6877a96c9123bb1c010cafccdf8ca1b75 ]
    
    [Why]
    When DML memory allocation fails during DC state creation, the memory
    is not deallocated afterwards, resulting in an uninitialized structure
    that is not NULL.
    
    [How]
    Deallocate memory if DML memory allocation fails.
    
    Reviewed-by: Joshua Aberback <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Chris Park <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
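
    The cleanup-on-failure idea above can be sketched in user-space C.
    Everything here is hypothetical (the structs, state_create(), and
    the injectable allocator used to force the failure path); it only
    illustrates releasing a partially built object so the caller sees
    NULL instead of a half-initialized structure.

```c
#include <stdlib.h>

/* Hypothetical, simplified types. */
struct dml_ctx {
	int placeholder;
};

struct dc_state {
	struct dml_ctx *dml;
};

/* Allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t size);

struct dc_state *state_create(alloc_fn alloc_dml)
{
	struct dc_state *state = calloc(1, sizeof(*state));

	if (!state)
		return NULL;

	state->dml = alloc_dml(sizeof(*state->dml));
	if (!state->dml) {
		/* Free the outer object so no partial state escapes. */
		free(state);
		return NULL;
	}

	return state;
}

void state_destroy(struct dc_state *state)
{
	if (!state)
		return;
	free(state->dml);
	free(state);
}

/* Always fails, to simulate an allocation error. */
void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```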

drm/amd/display: Disable replay if VRR capability is false [+ + +]

Author: Tom Chung <[email protected]>
Date:   Wed Jun 26 16:14:24 2024 +0800

    drm/amd/display: Disable replay if VRR capability is false
    
    [ Upstream commit b68417613d4134b9e39fff95e72ca726268b47db ]
    
    [Why]
    VRR must be supported for the panel replay feature.
    If the VRR capability is false, the panel replay capability also
    needs to be disabled.
    
    [How]
    After updating the VRR capability, check whether the panel replay
    capability needs to be disabled as well.
    
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
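
    The dependency between the two capabilities can be sketched as
    below. struct panel_caps and update_vrr_capability() are
    hypothetical names chosen for illustration, not the driver's API.

```c
#include <stdbool.h>

/* Hypothetical capability flags for one panel. */
struct panel_caps {
	bool vrr_capable;
	bool replay_supported;
};

/* Refresh the VRR capability and drop panel replay when VRR is absent. */
void update_vrr_capability(struct panel_caps *caps, bool vrr_capable)
{
	caps->vrr_capable = vrr_capable;
	if (!caps->vrr_capable)
		caps->replay_supported = false;
}
```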

drm/amd/display: Enable idle workqueue for more IPS modes [+ + +]

Author: Leo Li <[email protected]>
Date:   Wed Sep 11 17:27:08 2024 -0400

    drm/amd/display: Enable idle workqueue for more IPS modes
    
    commit ef785ca7f7c80891580cafd36c8dd86375684310 upstream.
    
    [Why]
    
    There are more IPS modes other than DMUB_IPS_ENABLE that enable IPS. We
    need to enable the hotplug detect idle workqueue for those modes as
    well.
    
    [How]
    
    Modify the if condition to initialize the workqueue in all IPS modes
    except for DMUB_IPS_DISABLE_ALL.
    
    Fixes: 65444581a4ae ("drm/amd/display: Determine IPS mode by ASIC and PMFW versions")
    Signed-off-by: Leo Li <[email protected]>
    Reviewed-by: Roman Li <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 181db30bcfed097ecc680539b1eabe935c11f57f)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
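
    The change of condition described above can be sketched with a
    reduced enum. IPS_ENABLE and IPS_DISABLE_ALL mirror the two modes
    named in the commit; IPS_PARTIAL is a hypothetical placeholder for
    any other mode that still enables IPS, and both predicates are
    illustrative names.

```c
#include <stdbool.h>

enum ips_mode {
	IPS_ENABLE,      /* mirrors DMUB_IPS_ENABLE */
	IPS_DISABLE_ALL, /* mirrors DMUB_IPS_DISABLE_ALL */
	IPS_PARTIAL,     /* hypothetical: some IPS states remain enabled */
};

/* Old condition: only the fully enabled mode gets the workqueue. */
bool needs_idle_workqueue_old(enum ips_mode mode)
{
	return mode == IPS_ENABLE;
}

/* New condition: every mode except "disable all" gets the workqueue. */
bool needs_idle_workqueue(enum ips_mode mode)
{
	return mode != IPS_DISABLE_ALL;
}
```

    Testing for the single excluded mode also covers any mode added
    later, which an equality test against one enabled mode would miss.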

drm/amd/display: fix a UBSAN warning in DML2.1 [+ + +]

Author: Aurabindo Pillai <[email protected]>
Date:   Fri Jul 19 14:10:58 2024 -0400

    drm/amd/display: fix a UBSAN warning in DML2.1
    
    [ Upstream commit eaf3adb8faab611ba57594fa915893fc93a7788c ]
    
    When programming phantom pipe, since cursor_width is explicitly set to 0,
    this causes the calculation logic to overflow an unsigned int,
    triggering the kernel's UBSAN check as below:
    
    [   40.962845] UBSAN: shift-out-of-bounds in /tmp/amd.EfpumTkO/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3312:34
    [   40.962849] shift exponent 4294967170 is too large for 32-bit type 'unsigned int'
    [   40.962852] CPU: 1 PID: 1670 Comm: gnome-shell Tainted: G        W  OE      6.5.0-41-generic #41~22.04.2-Ubuntu
    [   40.962854] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F21 01/10/2024
    [   40.962856] Call Trace:
    [   40.962857]  <TASK>
    [   40.962860]  dump_stack_lvl+0x48/0x70
    [   40.962870]  dump_stack+0x10/0x20
    [   40.962872]  __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
    [   40.962878]  calculate_cursor_req_attributes.cold+0x1b/0x28 [amdgpu]
    [   40.963099]  dml_core_mode_support+0x6b91/0x16bc0 [amdgpu]
    [   40.963327]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963331]  ? CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport+0x18b8/0x2790 [amdgpu]
    [   40.963534]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963536]  ? dml_core_mode_support+0xb3db/0x16bc0 [amdgpu]
    [   40.963730]  dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.963906]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.963909]  ? dml2_core_calcs_mode_support_ex+0x2c/0x90 [amdgpu]
    [   40.964078]  core_dcn4_mode_support+0x72/0xbf0 [amdgpu]
    [   40.964247]  dml2_top_optimization_perform_optimization_phase+0x1d3/0x2a0 [amdgpu]
    [   40.964420]  dml2_build_mode_programming+0x23d/0x750 [amdgpu]
    [   40.964587]  dml21_validate+0x274/0x770 [amdgpu]
    [   40.964761]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.964763]  ? resource_append_dpp_pipes_for_plane_composition+0x27c/0x3b0 [amdgpu]
    [   40.964942]  dml2_validate+0x504/0x750 [amdgpu]
    [   40.965117]  ? dml21_copy+0x95/0xb0 [amdgpu]
    [   40.965291]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965295]  dcn401_validate_bandwidth+0x4e/0x70 [amdgpu]
    [   40.965491]  update_planes_and_stream_state+0x38d/0x5c0 [amdgpu]
    [   40.965672]  update_planes_and_stream_v3+0x52/0x1e0 [amdgpu]
    [   40.965845]  ? srso_alias_return_thunk+0x5/0x7f
    [   40.965849]  dc_update_planes_and_stream+0x71/0xb0 [amdgpu]
    
    Fix this by adding a guard for checking cursor width before triggering
    the size calculation.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
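
    How a zero width can turn into an out-of-range shift, and how a
    guard avoids it, can be sketched as follows. This is not the DML2.1
    formula: round_up_pow2() and cursor_chunk_bytes() are hypothetical
    helpers, chosen only because they show the same failure mode.

```c
#include <stdint.h>

/* Rounds v up to the next power of two. Only valid for 1 <= v <= 2^31:
 * for v == 0, v - 1 wraps to UINT32_MAX and the shift count becomes 32,
 * which is out of range for a 32-bit operand. */
uint32_t round_up_pow2(uint32_t v)
{
	uint32_t shift = 0;

	for (uint32_t t = v - 1; t != 0; t >>= 1)
		shift++;

	return UINT32_C(1) << shift;
}

/* Guarded size calculation: skip the shift entirely for an empty cursor.
 * Assumes the product stays well below 2^31. */
uint32_t cursor_chunk_bytes(uint32_t cursor_width, uint32_t bytes_per_pixel)
{
	if (cursor_width == 0 || bytes_per_pixel == 0)
		return 0;

	return round_up_pow2(cursor_width * bytes_per_pixel);
}
```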

drm/amd/display: fix double free issue during amdgpu module unload [+ + +]

Author: Tim Huang <[email protected]>
Date:   Thu Aug 15 18:45:22 2024 -0400

    drm/amd/display: fix double free issue during amdgpu module unload
    
    [ Upstream commit 20b5a8f9f4670a8503aa9fa95ca632e77c6bf55d ]
    
    Flexible endpoints use DIGs from available inflexible endpoints,
    so only the encoders of inflexible links need to be freed.
    Otherwise, a double free issue may occur when unloading the
    amdgpu module.
    
    [  279.190523] RIP: 0010:__slab_free+0x152/0x2f0
    [  279.190577] Call Trace:
    [  279.190580]  <TASK>
    [  279.190582]  ? show_regs+0x69/0x80
    [  279.190590]  ? die+0x3b/0x90
    [  279.190595]  ? do_trap+0xc8/0xe0
    [  279.190601]  ? do_error_trap+0x73/0xa0
    [  279.190605]  ? __slab_free+0x152/0x2f0
    [  279.190609]  ? exc_invalid_op+0x56/0x70
    [  279.190616]  ? __slab_free+0x152/0x2f0
    [  279.190642]  ? asm_exc_invalid_op+0x1f/0x30
    [  279.190648]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191096]  ? __slab_free+0x152/0x2f0
    [  279.191102]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191469]  kfree+0x260/0x2b0
    [  279.191474]  dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191821]  link_destroy+0xd7/0x130 [amdgpu]
    [  279.192248]  dc_destruct+0x90/0x270 [amdgpu]
    [  279.192666]  dc_destroy+0x19/0x40 [amdgpu]
    [  279.193020]  amdgpu_dm_fini+0x16e/0x200 [amdgpu]
    [  279.193432]  dm_hw_fini+0x26/0x40 [amdgpu]
    [  279.193795]  amdgpu_device_fini_hw+0x24c/0x400 [amdgpu]
    [  279.194108]  amdgpu_driver_unload_kms+0x4f/0x70 [amdgpu]
    [  279.194436]  amdgpu_pci_remove+0x40/0x80 [amdgpu]
    [  279.194632]  pci_device_remove+0x3a/0xa0
    [  279.194638]  device_remove+0x40/0x70
    [  279.194642]  device_release_driver_internal+0x1ad/0x210
    [  279.194647]  driver_detach+0x4e/0xa0
    [  279.194650]  bus_remove_driver+0x6f/0xf0
    [  279.194653]  driver_unregister+0x33/0x60
    [  279.194657]  pci_unregister_driver+0x44/0x90
    [  279.194662]  amdgpu_exit+0x19/0x1f0 [amdgpu]
    [  279.194939]  __do_sys_delete_module.isra.0+0x198/0x2f0
    [  279.194946]  __x64_sys_delete_module+0x16/0x20
    [  279.194950]  do_syscall_64+0x58/0x120
    [  279.194954]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [  279.194980]  </TASK>
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Roman Li <[email protected]>
    Signed-off-by: Roman Li <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
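
    The ownership rule described above can be sketched in user-space C.
    The structs, the is_flexible flag and destroy_links() are
    hypothetical; the sketch only shows that a link which borrows an
    encoder must not free it, since the owning link already does.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified types. */
struct link_encoder {
	int id;
};

struct link {
	struct link_encoder *enc;
	bool is_flexible; /* true: the encoder is borrowed, not owned */
};

/* Frees only encoders owned by inflexible links; returns how many
 * were freed. Every link's pointer is cleared afterwards. */
int destroy_links(struct link *links, size_t n)
{
	int freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (links[i].enc && !links[i].is_flexible) {
			free(links[i].enc);
			freed++;
		}
		links[i].enc = NULL;
	}

	return freed;
}
```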

drm/amd/display: Fix index out of bounds in DCN30 color transformation [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:05:20 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 color transformation
    
    [ Upstream commit d81873f9e715b72d4f8d391c8eb243946f784dfc ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
    management module. The issue could occur when the index 'i' exceeds the
    number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
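
    A minimal sketch of the bounds check described above. The array
    size 1025 is taken from the smatch output; copy_curve_points() and
    its parameters are hypothetical and much simpler than the real
    curve translation.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRANSFER_FUNC_POINTS 1025

/* Copies `count` samples from src (TRANSFER_FUNC_POINTS entries),
 * starting at index `start` and advancing by `step`. Returns false as
 * soon as the index would leave the source array. */
bool copy_curve_points(const int *src, size_t start, size_t step,
		       size_t count, int *dst)
{
	size_t i = start;

	for (size_t k = 0; k < count; k++) {
		if (i >= TRANSFER_FUNC_POINTS)
			return false;

		dst[k] = src[i];
		i += step;
	}

	return true;
}
```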

drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:44:02 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
    
    [ Upstream commit bc50b614d59990747dd5aeced9ec22f9258991ff ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
    color management module. The issue could occur when the index 'i'
    exceeds the number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in degamma hardware format translation [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 17:48:27 2024 +0530

    drm/amd/display: Fix index out of bounds in degamma hardware format translation
    
    [ Upstream commit b7e99058eb2e86aabd7a10761e76cae33d22b49f ]
    
    Fixes index out of bounds issue in
    `cm_helper_translate_curve_to_degamma_hw_format` function. The issue
    could occur when the index 'i' exceeds the number of transfer function
    points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
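
Editor's sketch of the bounds check described in the commit above. This is a minimal illustration, not the driver code: TRANSFER_FUNC_POINTS is taken as 1025 from the smatch report, and the struct, the function name, and the loop shape are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRANSFER_FUNC_POINTS 1025 /* array size implied by the smatch report */

/* Hypothetical stand-in for the driver's transfer-function point table. */
struct tf_points_sketch {
	int32_t red[TRANSFER_FUNC_POINTS];
};

/*
 * Copy 'count' red points, starting at index 'start' and advancing by
 * 'step', into 'out'. Returns false as soon as the index would leave
 * the table, so no out-of-bounds element is ever read.
 */
bool copy_points_checked(const struct tf_points_sketch *pts, size_t start,
			 size_t step, size_t count, int32_t *out)
{
	size_t i = start;

	for (size_t k = 0; k < count; k++, i += step) {
		if (i >= TRANSFER_FUNC_POINTS)
			return false; /* index out of bounds: report an error */
		out[k] = pts->red[i];
	}
	return true;
}
```

The caller is expected to treat a false return as a failed translation rather than use a partially filled output.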

drm/amd/display: Fix possible overflow in integer multiplication [+ + +]

Author: Alex Hung <[email protected]>
Date:   Fri Jun 7 22:09:53 2024 -0600

    drm/amd/display: Fix possible overflow in integer multiplication
    
    [ Upstream commit 3f96f545f877ac59d0c967f52d760b4b2b3b9a47 ]
    
    [WHAT & HOW]
    Multiplying two integers may overflow in a context that expects an
    unsigned long long (64-bit) expression. Fix this by casting one operand
    to unsigned long long to force a 64-bit result.
    
    This fixes 2 OVERFLOW_BEFORE_WIDEN issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
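
Editor's sketch of the overflow-before-widen pattern named in the commit above. The function names are hypothetical and the example assumes a platform where unsigned int is 32 bits wide; it is not taken from the driver.

```c
#include <stdint.h>

/* The product is computed in 32 bits and wraps before it is widened. */
uint64_t area_wrapping(uint32_t width, uint32_t height)
{
	return width * height;
}

/* Casting one operand first forces the multiplication to 64 bits. */
uint64_t area_widened(uint32_t width, uint32_t height)
{
	return (unsigned long long)width * height;
}
```

Under the stated assumption, area_wrapping(65536, 65536) yields 0, while area_widened(65536, 65536) yields 4294967296.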

drm/amd/display: Fix system hang while resume with TBT monitor [+ + +]

Author: Tom Chung <[email protected]>
Date:   Fri Sep 13 15:44:40 2024 +0800

    drm/amd/display: Fix system hang while resume with TBT monitor
    
    commit 52d4e3fb3d340447dcdac0e14ff21a764f326907 upstream.
    
    [Why]
    With a Thunderbolt monitor connected, the system may hang on resume
    after a suspend.
    
    The TBT monitor HPD is triggered during the resume procedure and
    calls drm_client_modeset_probe() while
    struct drm_connector connector->dev->master is NULL.
    
    This messes up the pipe topology after resume.
    
    [How]
    Skip the TBT monitor HPD during the resume procedure because the
    connectors are already probed after resume by default.
    
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Signed-off-by: Fangzhi Zuo <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Fix VRR cannot enable [+ + +]

Author: Tom Chung <[email protected]>
Date:   Wed Jul 3 16:47:57 2024 +0800

    drm/amd/display: Fix VRR cannot enable
    
    [ Upstream commit f91a9af09dea850d83d4b217b8acbafd97b5c61f ]
    
    [Why]
    Sometimes VRR cannot be enabled after logging in to the desktop.
    
    User space may call DRM_IOCTL_MODE_GETCONNECTOR right after
    DRM_IOCTL_MODE_RMFB.
    
    Calling DRM_IOCTL_MODE_RMFB to remove all the frame buffers causes
    the driver to disable the crtc and disable the link while calling
    link_set_dpms_off().
    
    The dpcd read in amdgpu_dm_update_freesync_caps() that fetches the
    DP_MSA_TIMING_PAR_IGNORED capability then fails, and the driver
    concludes that the sink side does not support VRR.
    
    [How]
    Use the dpcd_caps.allow_invalid_MSA_timing_param flag instead of
    reading from dpcd directly.
    
    dpcd_caps.allow_invalid_MSA_timing_param flag is updated during HPD.
    It is safe to replace the original method.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML [+ + +]

Author: Dillon Varone <[email protected]>
Date:   Tue Jul 23 15:54:23 2024 -0400

    drm/amd/display: Force enable 3DLUT DMA check for dcn401 in DML
    
    [ Upstream commit b8dc6ca028d9a39196a3a066b9ef2d4a5eca475d ]
    
    [WHY]
    Currently TR0 (trip 0) is not properly budgeting for urgent latency in
    DML2.1. This results in overly aggressive prefetch schedules that are
    vulnerable to request return jitter, resulting in severe underflow at
    the start of the frame.
    
    [HOW]
    Forcing 3DLUT DMA check to enable causes urgent latency to be budgeted
    properly into the prefetch schedule, avoiding the vulnerability.
    
    Reviewed-by: Alvin Lee <[email protected]>
    Signed-off-by: Dillon Varone <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: guard write a 0 post_divider value to HW [+ + +]

Author: Ahmed, Muhammad <[email protected]>
Date:   Tue Aug 13 17:11:55 2024 -0400

    drm/amd/display: guard write a 0 post_divider value to HW
    
    [ Upstream commit 5d666496c24129edeb2bcb500498b87cc64e7f07 ]
    
    [why]
    post_divider_value should not be 0.
    
    Reviewed-by: Charlene Liu <[email protected]>
    Signed-off-by: Ahmed, Muhammad <[email protected]>
    Signed-off-by: Zaeem Mohamed <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream' [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Jul 26 19:31:55 2024 +0530

    drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream'
    
    [ Upstream commit 8141f21b941710ecebe49220b69822cab3abd23d ]
    
    This commit adds a null check for 'stream_status' in the function
    'planes_changed_for_existing_stream'. Previously, the code assumed
    'stream_status' could be null, but did not handle the case where it was
    actually null. This could lead to a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:3784 planes_changed_for_existing_stream() error: we previously assumed 'stream_status' could be null (see line 3774)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: handle nulled pipe context in DCE110's set_drr() [+ + +]

Author: Tobias Jakobi <[email protected]>
Date:   Mon Sep 16 14:54:05 2024 +0200

    drm/amd/display: handle nulled pipe context in DCE110's set_drr()
    
    [ Upstream commit e7d4e1438533abe448813bdc45691f9c230aa307 ]
    
    As set_drr() is called from IRQ context, it can happen that the
    pipe context has been nulled by dc_state_destruct().
    
    Apply the same protection here that is already present for
    dcn35_set_drr() and dcn10_set_drr(). I.e. fetch the tg pointer
    first (to avoid a race with dc_state_destruct()), and then
    check the local copy before using it.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
    Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Tobias Jakobi <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
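
Editor's sketch of the "fetch the pointer first, then check the local copy" pattern described in the commit above. All type and function names are hypothetical, and the concurrency with dc_state_destruct() is not modeled here; the sketch only shows that the check and the use refer to the same local value.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the timing generator and pipe context. */
struct timing_generator_sketch {
	int drr_min;
	int drr_max;
};

struct pipe_ctx_sketch {
	struct timing_generator_sketch *tg;
};

/* Returns 1 if DRR was programmed, 0 if the pipe had no timing generator. */
int set_drr_sketch(struct pipe_ctx_sketch *pipe, int vmin, int vmax)
{
	/* Take a local copy so the check and the use see the same value. */
	struct timing_generator_sketch *tg = pipe->tg;

	if (!tg)
		return 0;

	tg->drr_min = vmin;
	tg->drr_max = vmax;
	return 1;
}
```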

drm/amd/display: Implement bounds check for stream encoder creation in DCN401 [+ + +]

Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Jul 19 21:39:57 2024 +0530

    drm/amd/display: Implement bounds check for stream encoder creation in DCN401
    
    [ Upstream commit bdf606810210e8e07a0cdf1af3c467291363b295 ]
    
    'stream_enc_regs' array is an array of dcn10_stream_enc_registers
    structures. The array is initialized with four elements, corresponding
    to the four calls to stream_enc_regs() in the array initializer. This
    means that valid indices for this array are 0, 1, 2, and 3.
    
    The error message 'stream_enc_regs' 4 <= 5 below indicates that
    there is an attempt to access this array with an index of 5, which is
    out of bounds. This could lead to undefined behavior.
    
    Here, eng_id is used as an index to access the stream_enc_regs array. If
    eng_id is 5, this would result in an out-of-bounds access on the
    stream_enc_regs array.
    
    This fixes the buffer overflow error in dcn401_stream_encoder_create.
    
    Found by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
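
Editor's sketch of an index check against a four-entry register table, as described in the commit above. The table contents, the struct, and the lookup function are hypothetical; ARRAY_SIZE is defined locally here to mirror the kernel macro of the same name.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct enc_regs_sketch {
	unsigned int base;
};

/* Four entries, so the valid indices are 0, 1, 2, and 3. */
static const struct enc_regs_sketch enc_regs[] = {
	{ 0x100 }, { 0x200 }, { 0x300 }, { 0x400 },
};

/* Returns the register block for eng_id, or NULL if eng_id is out of range. */
const struct enc_regs_sketch *enc_regs_lookup(int eng_id)
{
	if (eng_id < 0 || (size_t)eng_id >= ARRAY_SIZE(enc_regs))
		return NULL;
	return &enc_regs[eng_id];
}
```

Deriving the limit from the array itself keeps the check correct if entries are added later.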

drm/amd/display: Increase array size of dummy_boolean [+ + +]

Author: Alex Hung <[email protected]>
Date:   Wed Jul 3 10:50:35 2024 -0600

    drm/amd/display: Increase array size of dummy_boolean
    
    [ Upstream commit 6d64d39486197083497a01b39e23f2f8474b35d3 ]
    
    [WHY]
    dml2_core_shared_mode_support and dml_core_mode_support access the third
    element of dummy_boolean, i.e. hw_debug5 = &s->dummy_boolean[2], when
    dummy_boolean has size of 2. Any assignment to hw_debug5 causes an
    OVERRUN.
    
    [HOW]
    Increase dummy_boolean's array size to 3.
    
    This fixes 2 OVERRUN issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Initialize denominators' default to 1 [+ + +]

Author: Alex Hung <[email protected]>
Date:   Tue Jun 18 14:05:08 2024 -0600

    drm/amd/display: Initialize denominators' default to 1
    
    [ Upstream commit b995c0a6de6c74656a0c39cd57a0626351b13e3c ]
    
    [WHAT & HOW]
    Variables that are used as denominators and may not be assigned another
    value should not be 0. Change their default to 1 so they are never 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
    
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
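
Editor's sketch of defaulting a denominator to 1, as described in the commit above. The mode values and both function names are hypothetical and only illustrate the idea.

```c
/* Hypothetical helper: picks a divisor for a mode, defaulting to 1. */
unsigned int pick_divisor(int mode)
{
	unsigned int div = 1; /* default of 1, not 0, keeps division defined */

	switch (mode) {
	case 0:
		div = 2;
		break;
	case 1:
		div = 4;
		break;
	default:
		break; /* unknown mode: keep the safe default */
	}
	return div;
}

/* Divides by the selected divisor; never divides by zero. */
unsigned int scale(unsigned int value, int mode)
{
	return value / pick_divisor(mode);
}
```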

drm/amd/display: Initialize get_bytes_per_element's default to 1 [+ + +]

Author: Alex Hung <[email protected]>
Date:   Mon Jul 15 09:57:01 2024 -0600

    drm/amd/display: Initialize get_bytes_per_element's default to 1
    
    [ Upstream commit 4067f4fa0423a89fb19a30b57231b384d77d2610 ]
    
    Variables that are used as denominators and may not be assigned another
    value should not be 0. bytes_per_element_y & bytes_per_element_c are
    initialized by get_bytes_per_element() which should never return 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 11:51:27 2024 -0600

    drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags
    
    [ Upstream commit 5559598742fb4538e4c51c48ef70563c49c2af23 ]
    
    [WHAT & HOW]
    "dcn20_validate_apply_pipe_split_flags" dereferences merge, and thus it
    cannot be a null pointer. Let's pass a valid pointer to avoid null
    dereference.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Re-enable panel replay feature [+ + +]

Author: Tom Chung <[email protected]>
Date:   Wed Jun 26 17:02:23 2024 +0800

    drm/amd/display: Re-enable panel replay feature
    
    [ Upstream commit be64336307a6c3ee71fe1337c1b9f0495aa83c50 ]
    
    [Why & How]
    The replay issues have been fixed, so re-enable the panel replay feature.
    
    Reported-by: Arthur Borsboom <[email protected]>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
    Reviewed-by: Sun peng Li <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC [+ + +]

Author: Fangzhi Zuo <[email protected]>
Date:   Wed Sep 4 15:29:24 2024 -0400

    drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC
    
    commit d51160ab00969ee6758ed2dcbc0f81dd476a181c upstream.
    
    The existing last step of the dsc policy restores the pbn value under minimum
    compression when a greedy attempt to disable dsc for a stream fails to fit in MST bw.
    The optimized dsc params produced by the optimization step are not necessarily the
    minimum compression, so it is not correct to restore the pbn under the minimum
    compression rate.
    
    Restoring the pbn under minimum compression instead of the optimized pbn could leave
    the dsc params incorrect at the modeset, where atomic_check fails due to not
    enough bw. One or more connected monitors could fail to light up in such a case.
    
    Restore the optimized pbn value, instead of using the pbn value under minimum
    compression.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Fangzhi Zuo <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 352c3165d2b75030169e012461a16bcf97f392fc)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Revert Avoid overflow assignment [+ + +]

Author: Gabe Teeger <[email protected]>
Date:   Thu Jul 25 18:42:21 2024 -0400

    drm/amd/display: Revert Avoid overflow assignment
    
    commit e80f8f491df873ea2e07c941c747831234814612 upstream.
    
    This reverts commit a15268787b79 ("drm/amd/display: Avoid overflow assignment in link_dp_cts")
    due to a regression causing a DPMS hang.
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Gabe Teeger <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Underflow Seen on DCN401 eGPU [+ + +]

Author: Daniel Sa <[email protected]>
Date:   Fri Jul 19 13:39:09 2024 -0400

    drm/amd/display: Underflow Seen on DCN401 eGPU
    
    [ Upstream commit ca0fb243c3bb53dbbd71d16c76f319bf923ee3d4 ]
    
    [WHY]
    In dcn401 we read clock values before FW is loaded. These incorrect
    values cause the driver to believe that we are running higher clocks
    than what we actually have. This then causes corruption/underflow for
    the eGPU.
    
    [HOW]
    When new values are read from HW, update internal structures to
    propagate the new/correct value. This fixes the issue.
    
    Signed-off-by: Daniel Sa <[email protected]>
    Reviewed-by: Alvin Lee <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Unlock Pipes Based On DET Allocation [+ + +]

Author: Austin Zheng <[email protected]>
Date:   Tue Jul 30 11:55:23 2024 -0400

    drm/amd/display: Unlock Pipes Based On DET Allocation
    
    [ Upstream commit 4af0d8ebf74ccbb60d33fdd410891283dd6cb109 ]
    
    [Why]
    DML21 does not allocate DET evenly between pipes.
    This may result in underflow when unlocking the pipes as DET could
    be overallocated.
    
    [How]
    1. Unlock pipes that have a decreased amount of DET allocation
    2. Wait for the double buffer to be updated.
    3. Unlock the remaining pipes.
    
    Reviewed-by: Alvin Lee <[email protected]>
    Signed-off-by: Austin Zheng <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35 [+ + +]

Author: Yihan Zhu <[email protected]>
Date:   Sat Sep 7 13:25:19 2024 -0400

    drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35
    
    commit 0d5e5e8a0aa49ea2163abf128da3b509a6c58286 upstream.
    
    [WHY & HOW]
    A mismatch in DCN35 DML2 causes bw validation to fail and an unexpected DPP pipe to be
    acquired, resulting in a grey screen and system hang. Remove the
    EnhancedPrefetchScheduleAccelerationFinal value override to match the HW spec.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Charlene Liu <[email protected]>
    Signed-off-by: Yihan Zhu <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces [+ + +]

Author: Nicholas Kazlauskas <[email protected]>
Date:   Thu Jul 18 11:53:31 2024 -0400

    drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces
    
    [ Upstream commit 31663521ede2edb622ee1b397ae3ac666d6351c5 ]
    
    [Why]
    It's currently hard coded to 256 when it should be using the SOC
    provided values. This can result in corruption with linear surfaces
    where we prefetch more PTE than the buffer can hold.
    
    [How]
    Update the min page size correctly for the plane.
    
    Signed-off-by: Nicholas Kazlauskas <[email protected]>
    Reviewed-by: Jun Lei <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/pm: ensure the fw_info is not null before using it [+ + +]

Author: Tim Huang <[email protected]>
Date:   Wed Aug 7 17:15:12 2024 +0800

    drm/amd/pm: ensure the fw_info is not null before using it
    
    [ Upstream commit 186fb12e7a7b038c2710ceb2fb74068f1b5d55a4 ]
    
    This resolves the dereference null return value warning
    reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx10: use rlc safe mode for soft recovery [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:34 2024 -0400

    drm/amdgpu/gfx10: use rlc safe mode for soft recovery
    
    [ Upstream commit ead60e9c4e29c8574cae1be4fe3af1d9a978fb0f ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Fri Jul 12 15:36:19 2024 -0400

    drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL
    
    [ Upstream commit b5be054c585110b2c5c1b180136800e8c41c7bb4 ]
    
    Need to enter safe mode before touching GC MMIO.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx11: use rlc safe mode for soft recovery [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:23 2024 -0400

    drm/amdgpu/gfx11: use rlc safe mode for soft recovery
    
    [ Upstream commit 3f2d35c325534c1b7ac5072173f0dc7ca969dec2 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx12: properly handle error ints on all pipes [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Mon Jul 1 17:40:55 2024 -0400

    drm/amdgpu/gfx12: properly handle error ints on all pipes
    
    [ Upstream commit 39879321769cc2d9a690725959ef76af92a38ac1 ]
    
    Need to handle the interrupt enables for all pipes.
    
    v2: fix indexing (Jessie)
    
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx12: use rlc safe mode for soft recovery [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:13 2024 -0400

    drm/amdgpu/gfx12: use rlc safe mode for soft recovery
    
    [ Upstream commit 21818f39beda2e843199e5d8d9e3f9e43c8080a3 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx9: properly handle error ints on all pipes [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Tue Jul 2 10:24:59 2024 -0400

    drm/amdgpu/gfx9: properly handle error ints on all pipes
    
    [ Upstream commit 48695573d2feaf42812c1ad54e01caff0d1c2d71 ]
    
    Need to handle the interrupt enables for all pipes.
    
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx9: use rlc safe mode for soft recovery [+ + +]

Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:57 2024 -0400

    drm/amdgpu/gfx9: use rlc safe mode for soft recovery
    
    [ Upstream commit 3ec2ad7c34c412bd9264cd1ff235d0812be90e82 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: add list empty check to avoid null pointer issue [+ + +]

Author: Yang Wang <[email protected]>
Date:   Wed Aug 21 14:42:41 2024 +0800

    drm/amdgpu: add list empty check to avoid null pointer issue
    
    [ Upstream commit 4416377ae1fdc41a90b665943152ccd7ff61d3c5 ]
    
    Add list empty check to avoid null pointer issues in some corner cases.
    - list_for_each_entry_safe()
    
    Signed-off-by: Yang Wang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: add raven1 gfxoff quirk [+ + +]

Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:25:54 2024 +0800

    drm/amdgpu: add raven1 gfxoff quirk
    
    [ Upstream commit 0126c0ae11e8b52ecfde9d1b174ee2f32d6c3a5d ]
    
    Fix screen corruption with openkylin.
    
    Link: https://bbs.openkylin.top/t/topic/171497
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Block MMR_READ IOCTL in reset [+ + +]

Author: Victor Skvortsov <[email protected]>
Date:   Thu Aug 8 13:40:23 2024 -0400

    drm/amdgpu: Block MMR_READ IOCTL in reset
    
    [ Upstream commit 9e823f307074c0f82b5f6044943b0086e3079bed ]
    
    Register access from userspace should be blocked until
    reset is complete.
    
    Signed-off-by: Victor Skvortsov <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit [+ + +]

Author: Pierre-Eric Pelloux-Prayer <[email protected]>
Date:   Tue Jul 2 11:54:30 2024 +0200

    drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
    
    [ Upstream commit fec5f8e8c6bcf83ed7a392801d7b44c5ecfc1e82 ]
    
    Before this commit, only submits with both a BO_HANDLES chunk and a
    'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).
    
    But if UMD sent multiple BO_HANDLES, what would happen is:
    * only the last one would be really used
    * all the others would leak memory as amdgpu_cs_p1_bo_handles would
      overwrite the previous p->bo_list value
    
    This commit rejects submissions with multiple BO_HANDLES chunks to
    match the implementation of the parser.
    
    Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
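
Editor's sketch of rejecting a submission that carries more than one BO_HANDLES chunk, as described in the commit above. The enum, the function name, and the flat array of chunk kinds are hypothetical simplifications; the real parser works on the driver's chunk structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chunk kinds for one command submission. */
enum chunk_kind_sketch { CHUNK_IB, CHUNK_BO_HANDLES, CHUNK_FENCE };

/*
 * Walk the chunk kinds of one submission. Returns 0 if at most one
 * BO_HANDLES chunk is present, -EINVAL otherwise.
 */
int check_chunks_sketch(const enum chunk_kind_sketch *kinds, size_t n)
{
	int seen_bo_handles = 0;

	for (size_t i = 0; i < n; i++) {
		if (kinds[i] != CHUNK_BO_HANDLES)
			continue;
		if (seen_bo_handles)
			return -EINVAL; /* a second list would overwrite the first */
		seen_bo_handles = 1;
	}
	return 0;
}
```

Rejecting the whole submission up front avoids both the silent "last one wins" behavior and the leak of the earlier lists.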

drm/amdgpu: enable gfxoff quirk on HP 705G4 [+ + +]

Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:27:08 2024 +0800

    drm/amdgpu: enable gfxoff quirk on HP 705G4
    
    [ Upstream commit 2c7795e245d993bcba2f716a8c93a5891ef910c9 ]
    
    Enabling the gfxoff quirk results in a perfectly usable
    graphical user interface on HP 705G4 DM with R5 2400G.
    
    Without the quirk, the X server is completely unusable, as
    every few seconds there is a gpu reset due to a ring gfx timeout.
    
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Fix get each xcp macro [+ + +]

Author: Asad Kamal <[email protected]>
Date:   Mon Jul 22 19:45:11 2024 +0800

    drm/amdgpu: Fix get each xcp macro
    
    [ Upstream commit ef126c06a98bde1a41303970eb0fc0ac33c3cc02 ]
    
    Fix the get each xcp macro to loop over each partition correctly.
    
    Fixes: 4bdca2057933 ("drm/amdgpu: Add utility functions for xcp")
    Signed-off-by: Asad Kamal <[email protected]>
    Reviewed-by: Lijo Lazar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx10 ip_dump [+ + +]

Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:25:24 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx10 ip_dump
    
    [ Upstream commit 98df5a7732e3b78bf8824d2938a8865a45cfc113 ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx11 ip_dump [+ + +]

Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:27:10 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx11 ip_dump
    
    [ Upstream commit bd15f805cdc503ac229a14f5fe21db12e6e7f84a ]
    
    Change condition, if (ptr == NULL) to if (!ptr)
    for a better format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix ptr check warning in gfx9 ip_dump [+ + +]

Author: Sunil Khatri <[email protected]>
Date:   Wed Aug 7 17:21:53 2024 +0530

    drm/amdgpu: fix ptr check warning in gfx9 ip_dump
    
    [ Upstream commit 07f4f9c00ec545dfa6251a44a09d2c48a76e7ee5 ]
    
    Change if (ptr == NULL) to if (!ptr) for a better
    format and fix the warning.
    
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Sunil Khatri <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_atombios [+ + +]

Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 13:47:55 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_atombios
    
    [ Upstream commit 92549780e32718d64a6d08bbbb3c6fffecb541c7 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_gfx [+ + +]

Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 10:38:37 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_gfx
    
    [ Upstream commit c0277b9d7c2ee9ee5dbc948548984f0fbb861301 ]
    
    This resolves the unchecked return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer [+ + +]

Author: Philip Yang <[email protected]>
Date:   Sun Jul 14 11:11:05 2024 -0400

    drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
    
    [ Upstream commit c86ad39140bbcb9dc75a10046c2221f657e8083b ]
    
    Pass a reference to the pointer to amdgpu_bo_unref so that the correct
    pointer is cleared. Otherwise amdgpu_bo_unref clears only the local
    variable and the original pointer is not set to NULL, which could cause
    a use-after-free bug.
    
    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Felix Kuehling <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
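
The pointer-clearing problem described in the entry above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_bo and every demo_* function are hypothetical stand-ins, and only the double-pointer shape of the unref helper is modelled on amdgpu_bo_unref().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted buffer object. */
struct demo_bo {
    int refcount;
};

static struct demo_bo *demo_bo_create(void)
{
    struct demo_bo *bo = malloc(sizeof(*bo));

    assert(bo != NULL);
    bo->refcount = 1;
    return bo;
}

/* Drops one reference and clears the pointer variable the caller handed in. */
static void demo_bo_unref(struct demo_bo **bo)
{
    assert(bo != NULL);
    if (*bo == NULL)
        return;
    if (--(*bo)->refcount == 0)
        free(*bo);
    *bo = NULL;
}

/* Buggy shape: the object arrives by value, so only the local copy is cleared. */
static void demo_free_by_value(struct demo_bo *bo)
{
    demo_bo_unref(&bo);
}

/* Fixed shape: the caller's own pointer is passed through and set to NULL. */
static void demo_free_by_ref(struct demo_bo **bo)
{
    demo_bo_unref(bo);
}
```

After demo_free_by_value(bo) the caller still holds the old address, which is the dangling pointer the commit message warns about. After demo_free_by_ref(&bo) the caller's pointer is NULL.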

drm/amdkfd: Check int source id for utcl2 poison event [+ + +]

Author: Hawking Zhang <[email protected]>
Date:   Tue Aug 20 13:56:32 2024 +0800

    drm/amdkfd: Check int source id for utcl2 poison event
    
    [ Upstream commit db6341a9168d2a24ded526277eeab29724d76e9d ]
    
    Traditional utcl2 fault_status polling does not
    work in SRIOV environment. The polling of fault
    status register from guest side will be dropped
    by hardware.
    
    Driver should switch to check utcl2 interrupt
    source id to identify utcl2 poison event. It is
    set to 1 when poisoned data interrupts are
    signaled.
    
    v2: drop the unused local variable (Tao)
    
    Signed-off-by: Hawking Zhang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: Fix resource leak in criu restore queue [+ + +]

Author: Jesse Zhang <[email protected]>
Date:   Fri Sep 6 11:29:55 2024 +0800

    drm/amdkfd: Fix resource leak in criu restore queue
    
    [ Upstream commit aa47fe8d3595365a935921a90d00bc33ee374728 ]
    
    To avoid memory leaks, release q_extra_data when exiting the restore queue.
    v2: Correct the proto (Alex)
    
    Signed-off-by: Jesse Zhang <[email protected]>
    Reviewed-by: Tim Huang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes [+ + +]

Author: Derek Foreman <[email protected]>
Date:   Tue Aug 27 11:39:04 2024 -0500

    drm/connector: hdmi: Fix writing Dynamic Range Mastering infoframes
    
    [ Upstream commit f0fa69b5011a45394554fb8061d74fee4d7cd72c ]
    
    The largest infoframe we create is the DRM (Dynamic Range Mastering)
    infoframe which is 26 bytes + a 4 byte header, for a total of 30
    bytes.
    
    With HDMI_MAX_INFOFRAME_SIZE set to 29 bytes, as it is now, we
    allocate too little space to pack a DRM infoframe in
    write_device_infoframe(), leading to an ENOSPC return from
    hdmi_infoframe_pack(), and never calling the connector's
    write_infoframe() vfunc.
    
    Instead of having HDMI_MAX_INFOFRAME_SIZE defined in two places,
    replace HDMI_MAX_INFOFRAME_SIZE with HDMI_INFOFRAME_SIZE(MAX) and make
    MAX 27 bytes - which is defined by the HDMI specification to be the
    largest infoframe payload.
    
    Fixes: f378b77227bc ("drm/connector: hdmi: Add Infoframes generation")
    Fixes: c602e4959a0c ("drm/connector: hdmi: Create Infoframe DebugFS entries")
    
    Signed-off-by: Derek Foreman <[email protected]>
    Acked-by: Maxime Ripard <[email protected]>
    Reviewed-by: Jani Nikula <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Maxime Ripard <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
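
The size arithmetic in the entry above can be checked with a small userspace sketch. The DEMO_* macros and demo_pack() are hypothetical names that only reuse the byte counts quoted in the commit message. They are not the kernel's definitions, and demo_pack() merely imitates the ENOSPC behaviour attributed to hdmi_infoframe_pack().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HEADER_SIZE       4   /* infoframe header, per the commit text */
#define DEMO_DRM_PAYLOAD_SIZE  26  /* Dynamic Range Mastering payload */
#define DEMO_MAX_PAYLOAD_SIZE  27  /* largest payload, per the commit text */

#define DEMO_OLD_BUFFER_SIZE   29  /* the old, too-small limit */
#define DEMO_NEW_BUFFER_SIZE   (DEMO_HEADER_SIZE + DEMO_MAX_PAYLOAD_SIZE)

/* Returns the bytes written, or -ENOSPC when header plus payload do not fit. */
static int demo_pack(unsigned char *buf, size_t size, size_t payload_len)
{
    size_t need = DEMO_HEADER_SIZE + payload_len;

    assert(buf != NULL);
    if (size < need)
        return -ENOSPC;
    memset(buf, 0, need); /* placeholder for the real packing work */
    return (int)need;
}
```

With a 29-byte buffer the 30-byte DRM infoframe is rejected, while the 31-byte buffer derived from header plus maximum payload accepts it.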

drm/i915/display: BMG supports UHBR13.5 [+ + +]

Author: Arun R Murthy <[email protected]>
Date:   Tue Aug 27 13:42:05 2024 +0530

    drm/i915/display: BMG supports UHBR13.5
    
    [ Upstream commit fcd33d434d31a210bc9f209b5bfd92f3b91a2dda ]
    
    UHBR20 is not supported by Battlemage, and the maximum supported link
    rate is UHBR13.5.
    
    v2: Replace IS_DGFX with IS_BATTLEMAGE (Jani)
    
    HSD: 16023263677
    Signed-off-by: Arun R Murthy <[email protected]>
    Reviewed-by: Mika Kahola <[email protected]>
    Fixes: 98b1c87a5e51 ("drm/i915/xe2hpd: Set maximum DP rate to UHBR13.5")
    Signed-off-by: Suraj Kandpal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 9c2338ac4543e0fab3a1e0f9f025591e0f0d9f8f)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/dp: Fix AUX IO power enabling for eDP PSR [+ + +]

Author: Imre Deak <[email protected]>
Date:   Tue Sep 10 14:18:47 2024 +0300

    drm/i915/dp: Fix AUX IO power enabling for eDP PSR
    
    [ Upstream commit ec2231b8dd2dc515912ff7816c420153b4a95e92 ]
    
    Panel Self Refresh on eDP requires the AUX IO power to be enabled
    whenever the output (main link) is enabled. This is required by the
    AUX_PHY_WAKE/ML_PHY_LOCK signaling initiated by the HW automatically to
    re-enable the main link after it got disabled in power saving states
    (see eDP v1.4b, sections 5.1, 6.1.3.3.1.1).
    
    The Panel Replay mode on non-eDP outputs on the other hand is only
    supported by keeping the main link active, thus not requiring the above
    AUX_PHY_WAKE/ML_PHY_LOCK signaling (eDP v1.4b, section 6.1.3.3.1.2).
    Thus enabling the AUX IO power for this case is not required either.
    
    Based on the above enable the AUX IO power only for eDP/PSR outputs.
    
    Bspec: 49274, 53370
    
    v2:
    - Add a TODO comment to adjust the requirement for AUX IO based on
      whether the ALPM/main-link off mode gets enabled. (Rodrigo)
    
    Cc: Animesh Manna <[email protected]>
    Fixes: b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
    Reviewed-by: Rodrigo Vivi <[email protected]>
    Signed-off-by: Imre Deak <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit f7c2ed9d4ce80a2570c492825de239dc8b500f2e)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/dp: Fix colorimetry detection [+ + +]

Author: Ville Syrjälä <[email protected]>
Date:   Wed Sep 18 22:04:39 2024 +0300

    drm/i915/dp: Fix colorimetry detection
    
    [ Upstream commit e860513f56d8428fcb2bd0282ac8ab691a53fc6c ]
    
    intel_dp_init_connector() is no place for detecting stuff via
    DPCD (except perhaps for eDP). Move the colorimetry stuff into
    a more appropriate place.
    
    Cc: Jouni Högander <[email protected]>
    Fixes: 00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Reviewed-by: Jouni Högander <[email protected]>
    (cherry picked from commit 35dba4834bded843d5416e8caadfe82bd0ce1904)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/gem: fix bitwise and logical AND mixup [+ + +]

Author: Jani Nikula <[email protected]>
Date:   Wed Sep 18 20:35:43 2024 +0300

    drm/i915/gem: fix bitwise and logical AND mixup
    
    commit 394b52462020b6cceff1f7f47fdebd03589574f3 upstream.
    
    CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND is an int, defaulting to 250. When
    the wakeref is non-zero, it's either -1 or a dynamically allocated
    pointer, depending on CONFIG_DRM_I915_DEBUG_RUNTIME_PM. It's likely that
    the code works by coincidence with the bitwise AND, but with
    CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y, there's the off chance that the
    condition evaluates to false, and intel_wakeref_auto() doesn't get
    called. Switch to the intended logical AND.
    
    v2: Use != to avoid clang -Wconstant-logical-operand (Nathan)
    
    Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
    Cc: Matthew Auld <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Anshuman Gupta <[email protected]>
    Cc: Andi Shyti <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: [email protected] # v6.1+
    Reviewed-by: Matthew Auld <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]> # v1
    Link: https://patchwork.freedesktop.org/patch/msgid/643cc0a4d12f47fd8403d42581e83b1e9c4543c7.1726680898.git.jani.nikula@intel.com
    Signed-off-by: Jani Nikula <[email protected]>
    (cherry picked from commit 4c1bfe259ed1d2ade826f95d437e1c41b274df04)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
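
The difference between the two operators in the entry above can be shown with a short sketch. The demo_* names are hypothetical. The value 250 is the default quoted in the commit message, and the wakeref is modelled as a plain uintptr_t rather than the driver's real type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_AUTOSUSPEND 250 /* 0xFA, the default quoted in the commit text */

/* Buggy shape: bitwise AND of the config value and the wakeref cookie. */
static bool demo_bitwise(uintptr_t wakeref)
{
    return (DEMO_AUTOSUSPEND & wakeref) != 0;
}

/* Intended shape: each operand is only tested for being non-zero. */
static bool demo_logical(uintptr_t wakeref)
{
    return DEMO_AUTOSUSPEND != 0 && wakeref != 0;
}

/* Small self-check covering the two kinds of wakeref value. */
static void demo_selfcheck(void)
{
    assert(demo_bitwise((uintptr_t)-1));      /* all bits set: 250 & -1 == 250 */
    assert(!demo_bitwise((uintptr_t)0x1000)); /* aligned address: 0xFA & 0x1000 == 0 */
    assert(demo_logical((uintptr_t)0x1000));  /* logical AND is still true */
}
```

A pointer-like value whose set bits do not overlap 0xFA makes the bitwise form evaluate to false even though the wakeref is non-zero, which is the off chance the commit message mentions.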

drm/i915/psr: Do not wait for PSR being idle on on Panel Replay [+ + +]

Author: Jouni Högander <[email protected]>
Date:   Fri Sep 6 10:00:33 2024 +0300

    drm/i915/psr: Do not wait for PSR being idle on on Panel Replay
    
    [ Upstream commit 9498f2e24ee0133d486667c9fa4c27ecdaadc272 ]
    
    We do not have ALPM on DP Panel Replay. Due to this SRD_STATUS[SRD State]
    doesn't change from SRDENT_ON after Panel Replay is enabled until it gets
    disabled.
    
    On eDP Panel Replay DEEP_SLEEP is not reached.
    _psr2_ready_for_pipe_update_locked waits for the DEEP_SLEEP bit to be reset.
    
    Take these into account in the Panel Replay code by not waiting for PSR to
    become idle after enabling VBI.
    
    Fixes: 29fb595d4875 ("drm/i915/psr: Panel replay uses SRD_STATUS to track it's status")
    Cc: Animesh Manna <[email protected]>
    Signed-off-by: Jouni Högander <[email protected]>
    Reviewed-by: Animesh Manna <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit a2d98feb4b0013ef4f9db0d8f642a8ac1f5ecbb9)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/mediatek: ovl_adaptor: Add missing of_node_put() [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Mon Jun 24 18:43:47 2024 +0200

    drm/mediatek: ovl_adaptor: Add missing of_node_put()
    
    commit 5beb6fba25db235b52eab34bde8112f07bb31d75 upstream.
    
    Error paths that exit for_each_child_of_node() need to call
    of_node_put() to decrement the child refcount and avoid memory leaks.
    
    Add the missing of_node_put().
    
    Cc: [email protected]
    Fixes: 453c3364632a ("drm/mediatek: Add ovl_adaptor support for MT8195")
    Signed-off-by: Javier Carrasco <[email protected]>
    Reviewed-by: CK Hu <[email protected]>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
    Signed-off-by: Chun-Kuang Hu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
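
The reference-count rule in the entry above can be sketched with a mock iterator. Everything here is hypothetical: struct demo_node, demo_next_child() and demo_scan() only imitate the contract under which the iterator holds a reference on the current child, and they do not use the kernel's device tree API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node, not the kernel's struct device_node. */
struct demo_node {
    int refcount;
    int id;
};

static struct demo_node *demo_get(struct demo_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

static void demo_put(struct demo_node *n)
{
    if (n) {
        assert(n->refcount > 0);
        n->refcount--;
    }
}

/* Drops the reference on prev and takes one on the next child, if any. */
static struct demo_node *demo_next_child(struct demo_node *children,
                                         size_t count, struct demo_node *prev)
{
    size_t next = prev ? (size_t)(prev - children) + 1 : 0;

    demo_put(prev);
    return next < count ? demo_get(&children[next]) : NULL;
}

/* Returns -1 when a child with bad_id is met (the error path), 0 otherwise.
 * put_on_error selects the fixed (1) or the leaky (0) early exit. */
static int demo_scan(struct demo_node *children, size_t count,
                     int bad_id, int put_on_error)
{
    struct demo_node *child;

    for (child = demo_next_child(children, count, NULL); child;
         child = demo_next_child(children, count, child)) {
        if (child->id == bad_id) {
            if (put_on_error)
                demo_put(child); /* plays the role of the missing put */
            return -1;
        }
    }
    return 0;
}
```

A loop that runs to completion balances every get with a put. Only the early exit leaves one reference behind unless it is dropped explicitly.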

drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs [+ + +]

Author: Konrad Dybcio <[email protected]>
Date:   Tue Jul 9 13:15:40 2024 +0200

    drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs
    
    [ Upstream commit 16007768551d5bfe53426645401435ca8d2ef54f ]
    
    There are some cases, such as the one uncovered by Commit 46d4efcccc68
    ("drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails")
    where
    
    msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);
    
    is called on gpu->pdev == NULL, as the GPU device has not been fully
    initialized yet.
    
    Turns out that there's more than just the aforementioned path that
    causes this to happen (e.g. the case when there's speedbin data in the
    catalog, but opp-supported-hw is missing in DT).
    
    Assigning msm_gpu->pdev earlier seems like the least painful solution
    to this, therefore do so.
    
    Signed-off-by: Konrad Dybcio <[email protected]>
    Patchwork: https://patchwork.freedesktop.org/patch/602742/
    Signed-off-by: Rob Clark <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/panthor: Don't add write fences to the shared BOs [+ + +]

Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:01:54 2024 +0200

    drm/panthor: Don't add write fences to the shared BOs
    
    commit f9e7ac6e2e9986c2ee63224992cb5c8276e46b2a upstream.
    
    The only user (the mesa gallium driver) is already assuming explicit
    synchronization and doing the export/import dance on shared BOs. The
    only reason we were registering ourselves as writers on external BOs
    is because Xe, which was the reference back when we developed Panthor,
    was doing so. Turns out Xe was wrong, and we really want bookkeep on
    all registered fences, so userspace can explicitly upgrade those to
    read/write when needed.
    
    Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block")
    Cc: Matthew Brost <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Don't declare a queue blocked if deferred operations are pending [+ + +]

Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:19:14 2024 +0200

    drm/panthor: Don't declare a queue blocked if deferred operations are pending
    
    commit 7a1f30afe97294281a2ba05977688385744f9844 upstream.
    
    If deferred operations are pending, we want to wait for those to
    land before declaring the queue blocked on a SYNC_WAIT. We need
    this to deal with the case where the sync object is signalled through
    a deferred SYNC_{ADD,SET} from the same queue. If we don't do that
    and the group gets scheduled out before the deferred SYNC_{SET,ADD}
    is executed, we'll end up with a timeout, because no external
    SYNC_{SET,ADD} will make the scheduler reconsider the group for
    execution.
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup() [+ + +]

Author: Boris Brezillon <[email protected]>
Date:   Mon Sep 30 18:37:42 2024 +0200

    drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup()
    
    commit 282864cc5d3f144af0cdea1868ee2dc2c5110f0d upstream.
    
    The group variable can't be used to retrieve ptdev in our second loop,
    because it points to the previously iterated list_head, not a valid
    group. Get the ptdev object from the scheduler instead.
    
    Cc: <[email protected]>
    Fixes: d72f049087d4 ("drm/panthor: Allow driver compilation")
    Reported-by: kernel test robot <[email protected]>
    Reported-by: Julia Lawall <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]/
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
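
The stale-cursor problem in the entry above follows from how intrusive list iteration ends. The sketch below is a userspace approximation with hypothetical demo_* types and a hand-rolled list; it is not the panthor code or the kernel's list.h. It shows that once the loop finishes, the cursor is computed from the list head and is not a real element.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list {
    struct demo_list *next, *prev;
};

struct demo_sched {
    int ptdev;               /* stands in for the device pointer */
    struct demo_list groups; /* list head */
};

struct demo_group {
    int ptdev;
    struct demo_list node;   /* link in demo_sched.groups */
};

/* Recover the enclosing structure from a pointer to its list member. */
#define demo_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_list_init(struct demo_list *head)
{
    head->next = head->prev = head;
}

static void demo_list_add_tail(struct demo_list *item, struct demo_list *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Walks the whole list and returns the cursor as it stands after the loop.
 * The result must never be dereferenced as a group. */
static struct demo_group *demo_cursor_after_loop(struct demo_sched *sched,
                                                 int *visited)
{
    struct demo_group *group;

    assert(sched != NULL && visited != NULL);
    *visited = 0;
    for (group = demo_entry(sched->groups.next, struct demo_group, node);
         &group->node != &sched->groups;
         group = demo_entry(group->node.next, struct demo_group, node))
        (*visited)++;
    return group;
}
```

Because the returned cursor aliases the list head, state that is needed after the loop is better taken from the owning object, which is what the fix does by reading ptdev from the scheduler.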

drm/panthor: Fix race when converting group handle to group object [+ + +]

Author: Steven Price <[email protected]>
Date:   Mon Sep 23 11:34:06 2024 +0100

    drm/panthor: Fix race when converting group handle to group object
    
    [ Upstream commit cac075706f298948898b1f63e81709df42afa75d ]
    
    XArray provides its own internal lock which protects the internal array
    when entries are being simultaneously added and removed. However there
    is still a race between retrieving the pointer from the XArray and
    incrementing the reference count.
    
    To avoid this race, simply hold the internal XArray lock when
    incrementing the reference count. This ensures there cannot be a racing
    call to xa_erase().
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Signed-off-by: Steven Price <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc() [+ + +]

Author: Boris Brezillon <[email protected]>
Date:   Fri Sep 13 13:27:22 2024 +0200

    drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc()
    
    [ Upstream commit fa998a9eac8809da4f219aad49836fcad2a9bf5c ]
    
    drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
    pre-allocated BO if the <BO,VM> association exists. Given we
    only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
    be called immediately, and we have to hold the VM resv lock when
    calling this function.
    
    Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/printer: Allow NULL data in devcoredump printer [+ + +]

Author: Matthew Brost <[email protected]>
Date:   Thu Aug 1 08:41:17 2024 -0700

    drm/printer: Allow NULL data in devcoredump printer
    
    [ Upstream commit 53369581dc0c68a5700ed51e1660f44c4b2bb524 ]
    
    We want to determine the size of the devcoredump before writing it out.
    To that end, we will run the devcoredump printer with NULL data to get
    the size, alloc data based on the generated offset, then run the
    devcoredump again with a valid data pointer to print.  This necessitates
    not writing data to the data pointer on the initial pass, when it is
    NULL.
    
    v5:
     - Better commit message (Jonathan)
     - Add kernel doc with examples (Jani)
    
    Cc: Maarten Lankhorst <[email protected]>
    Acked-by: Maarten Lankhorst <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
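
The two-pass scheme described in the entry above (measure with a NULL buffer, allocate, then print) can be sketched in userspace. struct demo_printer and the demo_* functions are hypothetical and do not reproduce the DRM printer API. The dumped strings are made-up sample data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical printer state. */
struct demo_printer {
    char *data;    /* NULL on the sizing pass */
    size_t offset; /* bytes produced so far */
};

static void demo_puts(struct demo_printer *p, const char *s)
{
    size_t len = strlen(s);

    if (p->data) /* skip the copy on the sizing pass */
        memcpy(p->data + p->offset, s, len);
    p->offset += len;
}

/* Made-up dump content, emitted identically on both passes. */
static void demo_dump(struct demo_printer *p)
{
    demo_puts(p, "engine: rcs0\n");
    demo_puts(p, "status: hung\n");
}

/* Pass 1 measures, pass 2 prints. The caller frees the returned buffer. */
static char *demo_capture(size_t *len_out)
{
    struct demo_printer p = { .data = NULL, .offset = 0 };
    size_t len;
    char *buf;

    demo_dump(&p);
    len = p.offset;

    buf = malloc(len + 1);
    if (!buf)
        return NULL;

    p.data = buf;
    p.offset = 0;
    demo_dump(&p);
    assert(p.offset == len); /* both passes must agree on the size */

    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

The only requirement on the writer is that it advances the offset without touching the data pointer when that pointer is NULL, which matches the constraint stated in the commit message.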

drm/radeon/r100: Handle unknown family in r100_cp_init_microcode() [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Tue Jul 30 17:58:12 2024 +0200

    drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
    
    [ Upstream commit c6dbab46324b1742b50dc2fb5c1fee2c28129439 ]
    
    With -Werror:
    
        In function ‘r100_cp_init_microcode’,
            inlined from ‘r100_cp_init’ at drivers/gpu/drm/radeon/r100.c:1136:7:
        include/linux/printk.h:465:44: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
          465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
              |                                            ^
        include/linux/printk.h:437:17: note: in definition of macro ‘printk_index_wrap’
          437 |                 _p_func(_fmt, ##__VA_ARGS__);                           \
              |                 ^~~~~~~
        include/linux/printk.h:508:9: note: in expansion of macro ‘printk’
          508 |         printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
              |         ^~~~~~
        drivers/gpu/drm/radeon/r100.c:1062:17: note: in expansion of macro ‘pr_err’
         1062 |                 pr_err("radeon_cp: Failed to load firmware \"%s\"\n", fw_name);
              |                 ^~~~~~
    
    Fix this by converting the if/else if/... construct into a proper
    switch() statement with a default to handle the error case.
    
    As a bonus, the generated code is ca. 100 bytes smaller (with gcc 11.4.0
    targeting arm32).
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
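
The switch-with-default approach from the entry above can be sketched as follows. The enum values, firmware paths and function names are hypothetical and are not the radeon driver's. The point is only that an unknown family is rejected before a NULL name can reach a "%s" format.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical chip families. */
enum demo_family {
    DEMO_FAMILY_A,
    DEMO_FAMILY_B,
    DEMO_FAMILY_UNKNOWN,
};

/* Maps a family to a made-up firmware path, or NULL when unknown. */
static const char *demo_fw_name(enum demo_family family)
{
    switch (family) {
    case DEMO_FAMILY_A:
        return "demo/a_cp.bin";
    case DEMO_FAMILY_B:
        return "demo/b_cp.bin";
    default:
        return NULL;
    }
}

/* Writes a status message into msg. Returns 0 or -EINVAL. */
static int demo_init_microcode(enum demo_family family, char *msg, size_t size)
{
    const char *fw_name = demo_fw_name(family);

    assert(msg != NULL && size > 0);
    if (!fw_name) {
        /* The error path never formats fw_name, so no NULL reaches "%s". */
        snprintf(msg, size, "unknown family %d", (int)family);
        return -EINVAL;
    }
    snprintf(msg, size, "loading \"%s\"", fw_name);
    return 0;
}
```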

drm/rockchip: vop: clear DMA stop bit on RK3066 [+ + +]

Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:48 2024 -0300

    drm/rockchip: vop: clear DMA stop bit on RK3066
    
    commit 6b44aa559d6c7f4ea591ef9d2352a7250138d62a upstream.
    
    The RK3066 VOP sets a dma_stop bit when it's done scanning out a frame
    and needs the driver to acknowledge that by clearing the bit.
    
    Unless we clear it "between" frames, the RGB output only shows noise
    instead of the picture. atomic_flush is the place for it that least
    affects other code (doing it on vblank would require converting all
    other usages of the reg_lock to spin_(un)lock_irq, which would affect
    performance for everyone).
    
    This seems to be a redundant synchronization mechanism that was removed
    in later iterations of the VOP hardware block.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066 [+ + +]

Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:49 2024 -0300

    drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066
    
    commit 6ed51ba95e27221ce87979bd2ad5926033b9e1b9 upstream.
    
    The RK3066 does have RGB display output, so it should be marked as such.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Add locking to drm_sched_entity_modify_sched [+ + +]

Author: Tvrtko Ursulin <[email protected]>
Date:   Fri Sep 13 17:05:52 2024 +0100

    drm/sched: Add locking to drm_sched_entity_modify_sched
    
    commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.
    
    Without the locking amdgpu currently can race between
    amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
    drm_sched_job_arm(), leading to the latter accessing potentially
    inconsistent entity->sched_list and entity->num_sched_list pair.
    
    v2:
     * Improve commit message. (Philipp)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: Philipp Stanner <[email protected]>
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always increment correct scheduler score [+ + +]

Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:09 2024 +0100

    drm/sched: Always increment correct scheduler score
    
    commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5 upstream.
    
    An entity's run queue can change during drm_sched_entity_push_job(), so
    make sure to update the score consistently.
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues")
    Cc: Nirmoy Das <[email protected]>
    Cc: Christian König <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.9+
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job [+ + +]

Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:08 2024 +0100

    drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job
    
    commit cbc8764e29c2318229261a679b2aafd0f9072885 upstream.
    
    Since drm_sched_entity_modify_sched() can modify the entity's run queue,
    let's make sure to only dereference the pointer once so both adding and
    waking up are guaranteed to be consistent.
    
    Alternative of moving the spin_unlock to after the wake up would for now
    be more problematic since the same lock is taken inside
    drm_sched_rq_update_fifo().
    
    v2:
     * Improve commit message. (Philipp)
     * Cache the scheduler pointer directly. (Christian)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Philipp Stanner <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Fix dynamic job-flow control race [+ + +]

Author: Rob Clark <[email protected]>
Date:   Fri Sep 13 13:23:01 2024 -0700

    drm/sched: Fix dynamic job-flow control race
    
    commit 440d52b370b03b366fd26ace36bab20552116145 upstream.
    
    Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
    
    The whole premise of lockless access to a single-producer-single-
    consumer queue is that there is just a single producer and single
    consumer.  That means we can't call drm_sched_can_queue() (which is
    about queueing more work to the hw, not to the spsc queue) from
    anywhere other than the consumer (wq).
    
    This call in the producer is just an optimization to avoid scheduling
    the consuming worker if it cannot yet queue more work to the hw.  It
    is safe to drop this optimization to avoid the race condition.
    
    Suggested-by: Asahi Lina <[email protected]>
    Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
    Closes: https://github.com/AsahiLinux/linux/issues/309
    Cc: [email protected]
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Danilo Krummrich <[email protected]>
    Tested-by: Janne Grunau <[email protected]>
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: revert "Always increment correct scheduler score" [+ + +]

Author: Christian König <[email protected]>
Date:   Mon Sep 30 15:07:49 2024 +0200

    drm/sched: revert "Always increment correct scheduler score"
    
    commit abf201f6ce14c4ceeccde5471bdf59614b83a3d8 upstream.
    
    This reverts commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5.
    
    It turned out that the original code was correct since the rq can only
    change when there is no armed job for an entity.
    
    This change here broke the logic since we only incremented the counter
    for the first job, so revert it.
    
    Signed-off-by: Christian König <[email protected]>
    Acked-by: Tvrtko Ursulin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/stm: Avoid use-after-free issues with crtc and plane [+ + +]

Author: Katya Orlova <[email protected]>
Date:   Fri Feb 16 15:50:40 2024 +0300

    drm/stm: Avoid use-after-free issues with crtc and plane
    
    [ Upstream commit 19dd9780b7ac673be95bf6fd6892a184c9db611f ]
    
    ltdc_load() calls functions drm_crtc_init_with_planes(),
    drm_universal_plane_init() and drm_encoder_init(). These functions
    should not be called with parameters allocated with devm_kzalloc()
    to avoid use-after-free issues [1].
    
    Use allocations managed by the DRM framework.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    [1]
    https://lore.kernel.org/lkml/u366i76e3qhh3ra5oxrtngjtm2u5lterkekcz6y2jkndhuxzli@diujon4h7qwb/
    
    Signed-off-by: Katya Orlova <[email protected]>
    Acked-by: Raphaël Gallais-Pou <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/stm: ltdc: reset plane transparency after plane disable [+ + +]

Author: Yannick Fertre <[email protected]>
Date:   Fri Jul 12 15:13:44 2024 +0200

    drm/stm: ltdc: reset plane transparency after plane disable
    
    [ Upstream commit 02fa62d41c8abff945bae5bfc3ddcf4721496aca ]
    
    The plane's opacity should be reset while the plane
    is disabled. This prevents a global or layer background
    color that was set earlier from being seen.
    
    Signed-off-by: Yannick Fertre <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/v3d: Prevent out of bounds access in performance query extensions [+ + +]

Author: Tvrtko Ursulin <[email protected]>
Date:   Thu Jul 11 14:53:30 2024 +0100

    drm/v3d: Prevent out of bounds access in performance query extensions
    
    commit f32b5128d2c440368b5bf3a7a356823e235caabb upstream.
    
    Check that the number of perfmons userspace is passing in the copy and
    reset extensions is not greater than the internal kernel storage where
    the ids will be copied into.
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job")
    Cc: Maíra Canal <[email protected]>
    Cc: Iago Toral Quiroga <[email protected]>
    Cc: [email protected] # v6.8+
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Maíra Canal <[email protected]>
    Signed-off-by: Maíra Canal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
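
The bounds check described in the entry above amounts to validating a userspace-supplied count against the fixed kernel-side storage before copying. The sketch below uses hypothetical names and a made-up capacity. It is not the v3d code, and memcpy() stands in for the real copy from userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_PERFMONS 4 /* hypothetical capacity of the internal array */

struct demo_query {
    uint32_t ids[DEMO_MAX_PERFMONS];
    uint32_t nperfmons;
};

/* Copies count ids into q, rejecting counts larger than the storage. */
static int demo_copy_ids(struct demo_query *q, const uint32_t *user_ids,
                         uint32_t count)
{
    assert(q != NULL && user_ids != NULL);
    if (count > DEMO_MAX_PERFMONS)
        return -EINVAL; /* would write past the end of q->ids */

    memcpy(q->ids, user_ids, count * sizeof(user_ids[0]));
    q->nperfmons = count;
    return 0;
}
```

Rejecting the oversized request up front leaves the destination untouched, so a failed call cannot leave a partially filled array behind.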

drm/xe/fbdev: Limit the usage of stolen for LNL+ [+ + +]

Author: Uma Shankar <[email protected]>
Date:   Wed Jul 17 13:52:52 2024 +0530

    drm/xe/fbdev: Limit the usage of stolen for LNL+
    
    [ Upstream commit 775d0adc01a55fe0458139330415d86bb3533efe ]
    
    As per recommendation in the workarounds:
    WA_22019338487
    
    There is an issue with accessing Stolen memory pages due to a
    hardware limitation. Limit the usage of stolen memory for
    fbdev for LNL+. Don't use BIOS FB from stolen on LNL+ and
    assign the same from system memory.
    
    v2: Corrected the WA Number, limited WA to LNL and
        Adopted XE_WA framework as suggested by Lucas and Matt.
    
    v3: Introduced the waxxx_display to implement display side
        of WA changes on Lunarlake. Used xe_root_mmio_gt and
        avoid the for loop (Suggested by Lucas)
    
    v4: Fixed some nits (Luca)
    
    Reviewed-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Uma Shankar <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe/guc_submit: add missing locking in wedged_fini [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Tue Sep 24 16:09:48 2024 +0100

    drm/xe/guc_submit: add missing locking in wedged_fini
    
    [ Upstream commit 790533e44bfc7af929842fccd9674c9f424d4627 ]
    
    Any non-wedged queue can have a zero refcount here and can be running
    concurrently with an async queue destroy, therefore dereferencing the
    queue ptr to check wedge status after the lookup can trigger UAF if
    queue is not wedged.  Fix this by keeping the submission_state lock held
    around the check to postpone the free and make the check safe, before
    dropping again around the put() to avoid the deadlock.
    
    Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit d28af0b6b9580b9f90c265a7da0315b0ad20bbfd)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe/hdcp: Check GSC structure validity [+ + +]

Author: Suraj Kandpal <[email protected]>
Date:   Mon Jul 22 12:14:51 2024 +0530

    drm/xe/hdcp: Check GSC structure validity
    
    [ Upstream commit b4224f6bae3801d589f815672ec62800a1501b0d ]
    
    Sometimes xe_gsc is not initialized when checked at HDCP capability
    check. Add gsc structure check to avoid null pointer error.
    
    Signed-off-by: Suraj Kandpal <[email protected]>
    Reviewed-by: Dnyaneshwar Bhadane <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close [+ + +]

Author: José Roberto de Souza <[email protected]>
Date:   Tue Sep 24 14:37:13 2024 -0700

    drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
    
    commit 8135f1c09dd2eecee7cb637f7ec9a29e57300eb8 upstream.
    
    Mesa testing on Xe2+ revealed that when OA metrics are collected for an
    exec_queue, after the OA stream is closed, future batch buffers submitted
    on that exec_queue do not complete. Not resetting OAC_CONTEXT_ENABLE on OA
    stream close resolves these hangs and should not have any adverse effects.
    
    v2: Make the change that we don't reset the bit clearer (Ashutosh)
        Also make the same fix for OAC as OAR (Ashutosh)
    
    Bspec: 60314
    Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
    Fixes: 14e077f8006d ("drm/xe/oa: Add OAC support")
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2821
    Signed-off-by: José Roberto de Souza <[email protected]>
    Signed-off-by: Ashutosh Dixit <[email protected]>
    Cc: [email protected]
    Reviewed-by: Ashutosh Dixit <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 0c8650b09a365f4a31fca1d1d1e9d99c56071128)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe/vm: move xa_alloc to prevent UAF [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Wed Sep 25 08:14:27 2024 +0100

    drm/xe/vm: move xa_alloc to prevent UAF
    
    [ Upstream commit 74231870cf4976f69e83aa24f48edb16619f652f ]
    
    Evil user can guess the next id of the vm before the ioctl completes and
    then call vm destroy ioctl to trigger UAF since create ioctl is still
    referencing the same vm. Move the xa_alloc all the way to the end to
    prevent this.
    
    v2:
     - Rebase
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Nirmoy Das <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit dcfd3971327f3ee92765154baebbaece833d3ca9)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
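
The ordering this entry describes (make the id visible only as the last
step of creation) can be sketched as follows. This is a single-threaded
userspace illustration under stated assumptions: the fixed-size table
stands in for the driver's xarray, and vm_sketch, vm_create(),
vm_lookup() and vm_destroy() are hypothetical names, not the xe API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SIZE 8

struct vm_sketch {
	int initialized;
};

/* Hypothetical id table standing in for the driver's xarray. */
static struct vm_sketch *table[TABLE_SIZE];

/* Look up an object by id; NULL if the id is out of range or unused. */
struct vm_sketch *vm_lookup(int id)
{
	if (id < 0 || id >= TABLE_SIZE)
		return NULL;
	return table[id];
}

/* Create an object and return its id, or -1 on failure. */
int vm_create(void)
{
	struct vm_sketch *vm = calloc(1, sizeof(*vm));

	if (!vm)
		return -1;

	/* Finish every setup step that touches the object first. */
	vm->initialized = 1;

	/* Publish the id as the very last step; nothing below uses vm. */
	for (int id = 0; id < TABLE_SIZE; id++) {
		if (!table[id]) {
			table[id] = vm;
			return id;
		}
	}

	free(vm);
	return -1;
}

/* Destroy by id; returns 0 on success, -1 if the id is not in use. */
int vm_destroy(int id)
{
	struct vm_sketch *vm = vm_lookup(id);

	if (!vm)
		return -1;
	table[id] = NULL;
	free(vm);
	return 0;
}
```

Because the creator never touches the object after the id becomes
visible, a destroy call that guesses the id early cannot free memory the
creator is still using.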

drm/xe/vram: fix ccs offset calculation [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Mon Sep 16 09:49:12 2024 +0100

    drm/xe/vram: fix ccs offset calculation
    
    commit ee06c09ded3c2f722be4e240ed06287e23596bda upstream.
    
    Spec says SW is expected to round up to the nearest 128K, if not already
    aligned for the CC unit view of CCS. We are seeing the assert sometimes
    pop on BMG to tell us that there is a hole between GSM and CCS, as well
    as popping other asserts with having a vram size with strange alignment,
    which is likely caused by misaligned offset here.
    
    v2 (Shuicheng):
     - Do the round_up() on final SW address.
    
    BSpec: 68023
    Fixes: b5c2ca0372dc ("drm/xe/xe2hpg: Determine flat ccs offset for vram")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Himal Prasad Ghimiray <[email protected]>
    Cc: Akshata Jahagirdar <[email protected]>
    Cc: Lucas De Marchi <[email protected]>
    Cc: Shuicheng Lin <[email protected]>
    Cc: Matt Roper <[email protected]>
    Cc: [email protected] # v6.10+
    Reviewed-by: Himal Prasad Ghimiray <[email protected]>
    Tested-by: Shuicheng Lin <[email protected]>
    Reviewed-by: Lucas De Marchi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Lucas De Marchi <[email protected]>
    (cherry picked from commit 37173392741c425191b959acb3adf70c9a4610c0)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
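
The 128K rounding mentioned in this entry can be illustrated with a small
standalone helper. This is only a sketch: round_up_pow2() is a
hypothetical userspace stand-in for the kernel's round_up() macro, and it
assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_128K (128u * 1024u)

/* Round x up to the next multiple of align; align must be a power of two. */
uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}
```

An offset that is already 128K-aligned is returned unchanged, while any
other offset moves up to the next 128K boundary.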

drm/xe: Add timeout to preempt fences [+ + +]

Author: Matthew Brost <[email protected]>
Date:   Tue Jun 25 17:41:37 2024 -0700

    drm/xe: Add timeout to preempt fences
    
    [ Upstream commit 627c961d672d3304564455ba471f5e4405170eec ]
    
    To adhere to dma fencing rules that fences must signal within a
    reasonable amount of time, add a 5 second timeout to preempt fences. If
    this timeout occurs, kill the associated VM as this is fatal to the VM.
    
    v2:
     - Add comment for smp_wmb (Checkpatch)
     - Fix kernel doc typo (Inspection)
     - Add comment for killed check (Niranjana)
    v3:
     - Drop smp_wmb (Matthew Auld)
     - Don't take vm->lock in preempt fence worker (Matthew Auld)
     - Drop RB given changes to patch
    v4:
     - Add WRITE/READ_ONCE (Niranjana)
     - Don't export xe_vm_kill (Niranjana)
    
    Cc: Matthew Auld <[email protected]>
    Cc: Niranjana Vishwanathapura <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Tested-by: Stuart Summers <[email protected]>
    Reviewed-by: Niranjana Vishwanathapura <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Clean up VM / exec queue file lock usage. [+ + +]

Author: Matthew Brost <[email protected]>
Date:   Fri Sep 20 18:17:12 2024 -0700

    drm/xe: Clean up VM / exec queue file lock usage.
    
    [ Upstream commit 9e3c85ddea7a473ed57b6cdfef2dfd468356fc91 ]
    
    Both the VM and exec queue file locks protect the lookup of, and
    reference to, the object and nothing more. These locks are not intended
    to have anything else nested underneath them. XArrays have their own
    locking too, so there is no need to take the VM / exec queue file lock
    aside from when doing a lookup and reference get.
    
    Add some kernel doc to make this clear and clean up a few typos too.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Matthew Auld <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit fe4f5d4b661666a45b48fe7f95443f8fefc09c8c)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Stable-dep-of: 74231870cf49 ("drm/xe/vm: move xa_alloc to prevent UAF")
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini [+ + +]

Author: Matthew Brost <[email protected]>
Date:   Tue Aug 20 10:29:55 2024 -0700

    drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini
    
    [ Upstream commit a323782567812ee925e9b7926445532c7afe331b ]
    
    Not a big deal if CT is down as driver is unloading, no need to warn.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jagmeet Randhawa <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Fix memory leak on xe_alloc_pf_queue failure [+ + +]

Author: Nirmoy Das <[email protected]>
Date:   Mon Aug 26 18:20:35 2024 +0200

    drm/xe: Fix memory leak on xe_alloc_pf_queue failure
    
    [ Upstream commit c5f728de696caa35481fd84202dfbc9fecc18e0b ]
    
    Simplify memory unwinding on error, which also fixes a memory leak
    that can currently happen on error.
    
    v2: use devm_kcalloc(Matt A)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Cc: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Auld <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Nirmoy Das <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: fix UAF around queue destruction [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Mon Sep 23 15:56:48 2024 +0100

    drm/xe: fix UAF around queue destruction
    
    commit 2d2be279f1ca9e7288282d4214f16eea8a727cdb upstream.
    
    We currently do stuff like queuing the final destruction step on a
    random system wq, which will outlive the driver instance. With bad
    timing we can tear down the driver with one or more work items still
    being alive, leading to various UAF splats. Add a fini step to ensure
    user queues are properly torn down. At this point GuC should already be
    nuked so queue itself should no longer be referenced from hw pov.
    
    v2 (Matt B)
     - Looks much safer to use a waitqueue and then just wait for the
       xa_array to become empty before triggering the drain.
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2317
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe: fixup xe_alloc_pf_queue [+ + +]

Author: Matthew Auld <[email protected]>
Date:   Wed Aug 21 18:19:18 2024 +0100

    drm/xe: fixup xe_alloc_pf_queue
    
    [ Upstream commit 321d6b4b9cbe3dd0bc99937d5e5b4d730b5b5798 ]
    
    kzalloc expects number of bytes, therefore we should convert the number
    of dw into bytes, otherwise we are likely just accessing beyond the
    array causing all kinds of carnage. Also fixup the error handling while
    we are here.
    
    v2:
     - Prefer kcalloc (dim)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>
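
The size mix-up this entry fixes (an element count passed where a byte
count is expected) can be sketched in userspace C. This is an
illustration only: alloc_queue_dw() is a hypothetical helper, and
calloc() is used as the userspace analogue of kcalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate a zeroed queue of num_dw 32-bit words. Passing the element
 * count and the element size separately avoids the original mistake of
 * handing a dword count to an allocator that expects bytes.
 */
uint32_t *alloc_queue_dw(size_t num_dw)
{
	return calloc(num_dw, sizeof(uint32_t));
}
```

With a byte-based allocator the equivalent size would be
num_dw * sizeof(uint32_t); the count-plus-size form keeps that
multiplication out of the caller.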

drm/xe: Generate oob before compiling anything [+ + +]

Author: Lucas De Marchi <[email protected]>
Date:   Mon Jul 8 14:29:06 2024 -0700

    drm/xe: Generate oob before compiling anything
    
    commit ea74bf9ccba9ae80fc0766c07c4abaef927e9e63 upstream.
    
    Instead of continuing to add dependencies as WAs are needed in different
    places of the driver, just add a rule with all the objects so the code
    generation happens before anything else.
    
    While at it, group lines related to wa_oob in the Makefile.
    
    v2: Prefix $(obj) when declaring dependency
    
    Reviewed-by: Rodrigo Vivi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/xe: Name and document Wa_14019789679 [+ + +]

Author: Matt Roper <[email protected]>
Date:   Mon Aug 12 11:10:43 2024 -0700

    drm/xe: Name and document Wa_14019789679
    
    [ Upstream commit 1d734a3e5d6bb266f52eaf2b1400c5d3f1875a54 ]
    
    Early in the development of Xe we identified an issue with SVG state
    handling on DG2 and MTL (and later on Xe2 as well).  In
    commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load
    on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state
    handling") we implemented our own workaround to prevent SVG state from
    leaking from context A to context B in cases where context B never
    issues a specific state setting.
    
    The hardware teams have now created official workaround Wa_14019789679
    to cover this issue.  The workaround description only requires emitting
    3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction
    that would potentially remain unset by a context B, but still cause
    notable issues if unwanted values were inherited from context A.
    However since we already have a more extensive implementation that emits
    the entire SVG state and prevents _any_ SVG state from unintentionally
    leaking, we'll stick with our existing implementation just to be safe.
    
    Signed-off-by: Matt Roper <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Prevent null pointer access in xe_migrate_copy [+ + +]

Author: Zhanjun Dong <[email protected]>
Date:   Fri Sep 27 09:13:08 2024 -0700

    drm/xe: Prevent null pointer access in xe_migrate_copy
    
    [ Upstream commit 7257d9c9a3c6cfe26c428e9b7ae21d61f2f55a79 ]
    
    xe_migrate_copy is designed to copy the content of TTM resources. When
    the source resource is null, it will trigger a NULL pointer dereference
    in xe_migrate_copy. To avoid this situation, set the lacks source flag
    to true for this case; the flag will trigger xe_migrate_clear rather
    than xe_migrate_copy.
    
    Issue trace:
    <7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14,
     sizes: 4194304 & 4194304
    <7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15,
     sizes: 4194304 & 4194304
    <1> [317.128055] BUG: kernel NULL pointer dereference, address:
     0000000000000010
    <1> [317.128064] #PF: supervisor read access in kernel mode
    <1> [317.128066] #PF: error_code(0x0000) - not-present page
    <6> [317.128069] PGD 0 P4D 0
    <4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
    <4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted:
     G     U           N 6.11.0-rc7-xe #1
    <4> [317.128078] Tainted: [U]=USER, [N]=TEST
    <4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client
     Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024
    <4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8
     fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31
     ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff
    <4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246
    <4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX:
     0000000000000000
    <4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI:
     0000000000000000
    <4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09:
     0000000000000001
    <4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12:
     ffff88814e7b1f08
    <4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15:
     0000000000000001
    <4> [317.128174] FS:  0000000000000000(0000) GS:ffff88846f280000(0000)
     knlGS:0000000000000000
    <4> [317.128176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    <4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4:
     0000000000770ef0
    <4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
     0000000000000000
    <4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
     0000000000000400
    <4> [317.128184] PKRU: 55555554
    <4> [317.128185] Call Trace:
    <4> [317.128187]  <TASK>
    <4> [317.128189]  ? show_regs+0x67/0x70
    <4> [317.128194]  ? __die_body+0x20/0x70
    <4> [317.128196]  ? __die+0x2b/0x40
    <4> [317.128198]  ? page_fault_oops+0x15f/0x4e0
    <4> [317.128203]  ? do_user_addr_fault+0x3fb/0x970
    <4> [317.128205]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128209]  ? exc_page_fault+0x87/0x2b0
    <4> [317.128212]  ? asm_exc_page_fault+0x27/0x30
    <4> [317.128216]  ? xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128263]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128265]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128267]  ? sg_free_append_table+0x20/0x80
    <4> [317.128271]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128273]  ? mark_held_locks+0x4d/0x80
    <4> [317.128275]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128278]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128281]  ? __pm_runtime_resume+0x60/0xa0
    <4> [317.128284]  xe_bo_move+0x682/0xc50 [xe]
    <4> [317.128315]  ? lock_is_held_type+0xaa/0x120
    <4> [317.128318]  ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm]
    <4> [317.128324]  ttm_bo_validate+0xd1/0x1a0 [ttm]
    <4> [317.128328]  shrink_test_run_device+0x721/0xc10 [xe]
    <4> [317.128360]  ? find_held_lock+0x31/0x90
    <4> [317.128363]  ? lock_release+0xd1/0x2a0
    <4> [317.128365]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
     [kunit]
    <4> [317.128370]  xe_bo_shrink_kunit+0x11/0x20 [xe]
    <4> [317.128397]  kunit_try_run_case+0x6e/0x150 [kunit]
    <4> [317.128400]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128402]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128404]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
    <4> [317.128407]  kthread+0xf5/0x130
    <4> [317.128410]  ? __pfx_kthread+0x10/0x10
    <4> [317.128412]  ret_from_fork+0x39/0x60
    <4> [317.128415]  ? __pfx_kthread+0x10/0x10
    <4> [317.128416]  ret_from_fork_asm+0x1a/0x30
    <4> [317.128420]  </TASK>
    
    Fixes: 266c85885263 ("drm/xe/xe2: Handle flat ccs move for igfx.")
    Signed-off-by: Zhanjun Dong <[email protected]>
    Reviewed-by: Thomas Hellström <[email protected]>
    Signed-off-by: Matt Roper <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 59a1c9c7e1d02b43b415ea92627ce095b7c79e47)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Restore pci state upon resume [+ + +]

Author: Rodrigo Vivi <[email protected]>
Date:   Thu Sep 12 17:45:07 2024 -0400

    drm/xe: Restore pci state upon resume
    
    [ Upstream commit cffa8e83df9fe525afad1e1099097413f9174f57 ]
    
    The pci state was saved, but not restored. Restore
    right after the power state transition request like
    every other driver.
    
    v2: Use right fixes tag, since this was there initially, but
        accidentally removed.
    
    Fixes: f6761c68c0ac ("drm/xe/display: Improve s2idle handling.")
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Lucas De Marchi <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Signed-off-by: Rodrigo Vivi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Maarten Lankhorst <[email protected]>
    (cherry picked from commit ec2d1539e159f53eae708e194c449cfefa004994)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Resume TDR after GT reset [+ + +]

Author: Matthew Brost <[email protected]>
Date:   Wed Jul 24 16:59:19 2024 -0700

    drm/xe: Resume TDR after GT reset
    
    [ Upstream commit 1b30f87e088b499eb74298db256da5c98e8276e2 ]
    
    Not starting the TDR after GT reset on exec queues which have been
    restarted can lead to jobs being able to run forever. Fix this by
    restarting the TDR.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 8ec5a4e5ce97d6ee9f5eb5b4ce4cfc831976fdec)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Use topology to determine page fault queue size [+ + +]

Author: Stuart Summers <[email protected]>
Date:   Sat Aug 17 02:47:31 2024 +0000

    drm/xe: Use topology to determine page fault queue size
    
    [ Upstream commit 3338e4f90c143cf32f77d64f464cb7f2c2d24700 ]
    
    Currently the page fault queue size is hard coded. However
    the hardware supports faulting for each EU and each CS.
    For some applications running on hardware with a large
    number of EUs and CSs, this can result in an overflow of
    the page fault queue.
    
    Add a small calculation to determine the page fault queue
    size based on the number of EUs and CSs in the platform as
    determined by fuses.
    
    Signed-off-by: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
    Signed-off-by: Sasha Levin <[email protected]>
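
The entry above does not spell out the calculation, so the following is
only a guess at its general shape: one queue slot per possible fault
source (each EU and each CS), multiplied by the length of a fault message
in dwords. The function name, parameters and example values are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sizing rule: reserve room for one fault message from
 * every EU and every CS. msg_len_dw is the assumed length of a single
 * fault message in 32-bit words.
 */
uint32_t pf_queue_num_dw(uint32_t num_eus, uint32_t num_cs,
			 uint32_t msg_len_dw)
{
	return (num_eus + num_cs) * msg_len_dw;
}
```

For example, with an assumed 512 EUs, 64 CSs and 4-dword messages the
queue would need 2304 dwords, whereas a fixed size could be too small on
larger parts.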

drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS [+ + +]

Author: Thomas Zimmermann <[email protected]>
Date:   Mon Sep 23 09:58:14 2024 +0200

    drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS
    
    commit 8b0d2f61545545ab5eef923ed6e59fc3be2385e0 upstream.
    
    FB_DAMAGE_CLIPS is a plane property for damage handling. Its UAPI
    should only use UAPI types. Hence replace struct drm_rect with
    struct drm_mode_rect in drm_atomic_plane_set_property(). Both types
    are identical in practice, so there's no change in behavior.
    
    Reported-by: Ville Syrjälä <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/[email protected]/
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: d3b21767821e ("drm: Add a new plane property to send damage during plane update")
    Cc: Lukasz Spintzyk <[email protected]>
    Cc: Deepak Rawat <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Thomas Hellstrom <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.0+
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
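
The claim that both rectangle types are identical in practice can be
checked at compile time. The sketch below uses local mirrors of the two
structs rather than the kernel headers; the field layouts shown are
assumed to match struct drm_rect and struct drm_mode_rect, and the
sketch_ prefix marks them as illustrative copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the kernel-internal rectangle (assumed field layout). */
struct sketch_drm_rect {
	int x1, y1, x2, y2;
};

/* Local mirror of the UAPI rectangle (assumed field layout). */
struct sketch_drm_mode_rect {
	int32_t x1;
	int32_t y1;
	int32_t x2;
	int32_t y2;
};

/* Fail the build if the two layouts ever diverge. */
_Static_assert(sizeof(struct sketch_drm_rect) ==
	       sizeof(struct sketch_drm_mode_rect),
	       "rect types differ in size");
_Static_assert(offsetof(struct sketch_drm_rect, y2) ==
	       offsetof(struct sketch_drm_mode_rect, y2),
	       "rect types differ in layout");
```

The UAPI type uses fixed-width fields, which is why it is the right
choice for a property blob exchanged with userspace even though the
layouts coincide on platforms with a 32-bit int.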

drm: omapdrm: Add missing check for alloc_ordered_workqueue [+ + +]

Author: Ma Ke <[email protected]>
Date:   Thu Aug 8 14:13:36 2024 +0800

    drm: omapdrm: Add missing check for alloc_ordered_workqueue
    
    commit e794b7b9b92977365c693760a259f8eef940c536 upstream.
    
    alloc_ordered_workqueue() may return a NULL pointer, which would cause a
    NULL pointer dereference. Add a check for its return value.
    
    Cc: [email protected]
    Fixes: 2f95bc6d324a ("drm: omapdrm: Perform initialization/cleanup at probe/remove time")
    Signed-off-by: Ma Ke <[email protected]>
    Signed-off-by: Tomi Valkeinen <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: clock: exynos7885: Fix duplicated binding [+ + +]

Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:44 2024 +0200

    dt-bindings: clock: exynos7885: Fix duplicated binding
    
    commit abf3a3ea9acb5c886c8729191a670744ecd42024 upstream.
    
    The numbering in Exynos7885's FSYS CMU bindings has 4 duplicated by
    accident, with the rest of the bindings continuing with 5.
    
    Fix this by moving CLK_MOUT_FSYS_USB30DRD_USER to the end as 11.
    
    Since CLK_MOUT_FSYS_USB30DRD_USER is not used in any device tree as of
    now, and there are no other clocks affected (maybe apart from
    CLK_MOUT_FSYS_MMC_SDIO_USER which the number was shared with, also not
    used in a device tree), this is the least impactful way to solve this
    problem.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x [+ + +]

Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:02 2024 +0530

    dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x
    
    commit 648b4bde0aca2980ebc0b90cdfbb80d222370c3d upstream.
    
    Add the missing GPLL9 which is required for the gcc sdcc2 clock.
    
    Fixes: 0fadcdfdcf57 ("dt-bindings: clock: Add SC8180x GCC binding")
    Cc: [email protected]
    Acked-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems [+ + +]

Author: Ravikanth Tuniki <[email protected]>
Date:   Tue Oct 1 00:43:35 2024 +0530

    dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems
    
    [ Upstream commit c6929644c1e0d6108e57061d427eb966e1746351 ]
    
    Add the missing reg minItems since, based on the current binding
    document, a configuration with only the ethernet MAC IO space is supported.
    
    There is a bug in schema, current examples contain 64-bit
    addressing as well as 32-bit addressing. The schema validation
    does pass incidentally considering one 64-bit reg address as
    two 32-bit reg address entries. If we change axi_ethernet_eth1
    example node reg addressing to 32-bit schema validation reports:
    
    Documentation/devicetree/bindings/net/xlnx,axi-ethernet.example.dtb:
    ethernet@40000000: reg: [[1073741824, 262144]] is too short
    
    To fix it add missing reg minItems constraints and to make things clearer
    stick to 32-bit addressing in examples.
    
    Fixes: cbb1ca6d5f9a ("dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yaml")
    Signed-off-by: Ravikanth Tuniki <[email protected]>
    Signed-off-by: Radhey Shyam Pandey <[email protected]>
    Acked-by: Conor Dooley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

e1000e: avoid failing the system during pm_suspend [+ + +]

Author: Vitaly Lifshits <[email protected]>
Date:   Tue Aug 6 16:23:48 2024 +0300

    e1000e: avoid failing the system during pm_suspend
    
    [ Upstream commit 0a6ad4d9e1690c7faa3a53f762c877e477093657 ]
    
    Occasionally when the system goes into pm_suspend, the suspend might fail
    due to a PHY access error on the network adapter. Previously, this would
    have caused the whole system to fail to go to a low power state.
    An example of this was reported in the following Bugzilla:
    https://bugzilla.kernel.org/show_bug.cgi?id=205015
    
    [ 1663.694828] e1000e 0000:00:19.0 eth0: Failed to disable ULP
    [ 1664.731040] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
    [ 1665.093513] e1000e 0000:00:19.0 eth0: Hardware Error
    [ 1665.596760] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2975399 usecs
    
    After that the system never recovers, and all subsequent suspend attempts fail because of it:
    [22909.393854] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x760 [e1000e] returns -2
    [22909.393858] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -2
    [22909.393861] PM: Device 0000:00:1f.6 failed to suspend async: error -2
    
    This can be avoided by changing the return values of __e1000_shutdown and
    e1000e_pm_suspend functions so that they always return 0 (success). This
    is consistent with what other drivers do.
    
    If the e1000e driver encounters a hardware error during suspend, potential
    side effects include slightly higher power draw or non-working wake on
    LAN. This is preferred to a system-level suspend failure, and a warning
    message is written to the system log, so that the user can be aware that
    the LAN controller experienced a problem during suspend.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=205015
    Suggested-by: Dima Ruinskiy <[email protected]>
    Signed-off-by: Vitaly Lifshits <[email protected]>
    Tested-by: Mor Bar-Gabay <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
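
The behaviour described in this entry (log the hardware error, but do not
let it fail the system-wide suspend) can be sketched as below. This is a
userspace illustration only: fake_hw_shutdown() and sketch_pm_suspend()
are hypothetical stand-ins, not the e1000e functions, and the -2 error
value simply mirrors the one shown in the log above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the hardware-specific suspend steps. */
int fake_hw_shutdown(int simulate_phy_error)
{
	return simulate_phy_error ? -2 : 0;
}

/*
 * Suspend callback sketch: a hardware error is reported in the log but
 * never propagated, so the caller always sees success.
 */
int sketch_pm_suspend(int simulate_phy_error)
{
	int rc = fake_hw_shutdown(simulate_phy_error);

	if (rc)
		fprintf(stderr, "suspend: hardware error %d ignored\n", rc);
	return 0;
}
```

The trade-off is the one the entry names: a device that failed to quiesce
may draw slightly more power or lose wake on LAN, but the rest of the
system still reaches its low power state.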

EINJ, CXL: Fix CXL device SBDF calculation [+ + +]

Author: Ben Cheatham <[email protected]>
Date:   Fri Sep 27 11:34:28 2024 -0500

    EINJ, CXL: Fix CXL device SBDF calculation
    
    [ Upstream commit ee1e3c46ed19c096be22472c728fa7f68b1352c4 ]
    
    The SBDF of the target CXL 2.0 compliant root port is required to inject a CXL
    protocol error as per ACPI 6.5. The SBDF given has to be in the
    following format:
    
    31     24 23    16 15    11 10      8  7        0
    +-------------------------------------------------+
    | segment |   bus  | device | function | reserved |
    +-------------------------------------------------+
    
    The SBDF calculated in cxl_dport_get_sbdf() doesn't account for
    the reserved bits currently, causing the wrong SBDF to be used.
    Fix said calculation to properly shift the SBDF.
    
    Without this fix, error injection into CXL 2.0 root ports through the
    CXL debugfs interface (<debugfs>/cxl) is broken. Injection
    through the legacy interface (<debugfs>/apei/einj/) will still work
    because the SBDF is manually provided by the user.
    
    Fixes: 12fb28ea6b1cf ("EINJ: Add CXL error type support")
    Signed-off-by: Ben Cheatham <[email protected]>
    Reviewed-by: Dan Williams <[email protected]>
    Tested-by: Srinivasulu Thanneeru <[email protected]>
    Reviewed-by: Srinivasulu Thanneeru <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
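
The bit layout in the diagram above can be expressed as a small packing
helper. This is a sketch derived only from that diagram; pack_sbdf() is a
hypothetical name and is not the kernel's cxl_dport_get_sbdf().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack segment/bus/device/function into the 32-bit layout shown in the
 * diagram: segment in bits 31..24, bus in 23..16, device in 15..11,
 * function in 10..8, and bits 7..0 left as reserved zeros.
 */
uint32_t pack_sbdf(uint8_t segment, uint8_t bus, uint8_t device,
		   uint8_t function)
{
	return ((uint32_t)segment << 24) |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)(device & 0x1f) << 11) |
	       ((uint32_t)(function & 0x7) << 8);
}
```

Leaving out the final shift past the reserved byte would place every
field 8 bits too low, which matches the wrong SBDF this entry describes.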

exec: don't WARN for racy path_noexec check [+ + +]

Author: Mateusz Guzik <[email protected]>
Date:   Mon Aug 5 15:17:21 2024 +0200

    exec: don't WARN for racy path_noexec check
    
    [ Upstream commit 0d196e7589cefe207d5d41f37a0a28a1fdeeb7c6 ]
    
    Both i_mode and noexec checks wrapped in WARN_ON stem from an artifact
    of the previous implementation. They used to legitimately check for the
    condition, but that got moved up in two commits:
    633fb6ac3980 ("exec: move S_ISREG() check earlier")
    0fd338b2d2cd ("exec: move path_noexec() check earlier")
    
    Instead of being removed, said checks were wrapped in WARN_ON, which
    has some debug value.
    
    However, the spurious path_noexec check is racy, resulting in
    unwarranted warnings should someone race with setting the noexec flag.
    
    One can note there is more to perm-checking whether execve is allowed
    and none of the conditions are guaranteed to still hold after they were
    tested for.
    
    Additionally this does not validate whether the code path did any perm
    checking to begin with -- it will pass if the inode happens to be
    regular.
    
    Keep the redundant path_noexec() check, even though it is mindless
    nonsense checking for a guarantee that isn't given, but drop the WARN.
    
    Reword the commentary and do small tidy ups while here.
    
    Signed-off-by: Mateusz Guzik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [brauner: keep redundant path_noexec() check]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

exfat: fix memory leak in exfat_load_bitmap() [+ + +]

Author: Yuezhang Mo <[email protected]>
Date:   Tue Sep 3 15:01:09 2024 +0800

    exfat: fix memory leak in exfat_load_bitmap()
    
    commit d2b537b3e533f28e0d97293fe9293161fe8cd137 upstream.
    
    If the first directory entry in the root directory is not a bitmap
    directory entry, 'bh' will not be released and reassigned, which
    will cause a memory leak.
    
    Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
    Cc: [email protected]
    Signed-off-by: Yuezhang Mo <[email protected]>
    Reviewed-by: Aoyama Wataru <[email protected]>
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: avoid use-after-free in ext4_ext_insert_extent() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:26 2024 +0800

    ext4: avoid use-after-free in ext4_ext_insert_extent()
    
    commit a164f3a432aae62ca23d03e6d926b122ee5b860d upstream.
    
    As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is
    reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and
    cause UAF. Below is a sample trace with dummy values:
    
    ext4_ext_insert_extent
      path = *ppath = 2000
      ext4_ext_create_new_leaf(ppath)
        ext4_find_extent(ppath)
          path = *ppath = 2000
          if (depth > path[0].p_maxdepth)
                kfree(path = 2000);
                *ppath = path = NULL;
          path = kcalloc() = 3000
          *ppath = 3000;
          return path;
      /* here path is still 2000, UAF! */
      eh = path[depth].p_hdr
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330
    Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179
    CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866
    Call Trace:
     <TASK>
     ext4_ext_insert_extent+0x26d4/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
    [...]
    
    Allocated by task 179:
     ext4_find_extent+0x81c/0x1f70
     ext4_ext_map_blocks+0x146/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    
    Freed by task 179:
     kfree+0xcb/0x240
     ext4_find_extent+0x7c0/0x1f70
     ext4_ext_insert_extent+0xa26/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    ==================================================================
    
    So use *ppath to update the path to avoid the above problem.
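
    The pattern can be sketched in user-space C as follows. All names here
    (path_sketch, find_path_sketch(), insert_extent_sketch()) are hypothetical
    stand-ins and not the ext4 functions; the point is only that the caller
    reloads its local pointer from *ppath after a call that may reallocate it.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the extent path array. */
struct path_sketch {
    int depth;
};

/*
 * Hypothetical stand-in for a lookup that may free *ppath and
 * allocate a replacement when the existing one is too shallow.
 */
static struct path_sketch *find_path_sketch(struct path_sketch **ppath,
                                            int need_depth)
{
    struct path_sketch *path = *ppath;

    if (path && need_depth > path->depth) {
        free(path);
        *ppath = path = NULL;
    }
    if (!path) {
        path = calloc(1, sizeof(*path));
        if (!path)
            return NULL;
        path->depth = need_depth;
        *ppath = path;
    }
    return path;
}

/* The caller must not keep using a copy of *ppath taken before the call. */
static int insert_extent_sketch(struct path_sketch **ppath, int need_depth)
{
    struct path_sketch *path = *ppath;   /* may become stale below */

    if (!find_path_sketch(ppath, need_depth))
        return -1;

    path = *ppath;                       /* reload: avoids the stale pointer */
    return path->depth;
}
```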
    
    Reported-by: Ojaswin Mujoo <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: avoid use-after-free in ext4_ext_show_leaf() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:24 2024 +0800

    ext4: avoid use-after-free in ext4_ext_show_leaf()
    
    [ Upstream commit 4e2524ba2ca5f54bdbb9e5153bea00421ef653f5 ]
    
    In ext4_find_extent(), path may be freed on error or be reallocated, so
    a previously saved copy of *ppath may already have been freed, and using
    it may trigger a use-after-free, as follows:
    
    ext4_split_extent
      path = *ppath;
      ext4_split_extent_at(ppath)
      path = ext4_find_extent(ppath)
      ext4_split_extent_at(ppath)
        // ext4_find_extent fails to free path
        // but zeroout succeeds
      ext4_ext_show_leaf(inode, path)
        eh = path[depth].p_hdr
        // path use-after-free !!!
    
    Similar to ext4_split_extent_at(), we use *ppath directly as an input to
    ext4_ext_show_leaf(). Fix a spelling error by the way.
    
    Same problem in ext4_ext_handle_unwritten_extents(). Since 'path' is only
    used in ext4_ext_show_leaf(), remove 'path' and use *ppath directly.
    
    This issue is triggered only when EXT_DEBUG is defined and therefore does
    not affect functionality.
    
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: correct encrypted dentry name hash when not casefolded [+ + +]

Author: yao.ly <[email protected]>
Date:   Mon Jul 1 14:43:39 2024 +0800

    ext4: correct encrypted dentry name hash when not casefolded
    
    commit 70dd7b573afeba9b8f8a33f2ae1e4a9a2ec8c1ec upstream.
    
    EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH access the struct
    ext4_dir_entry_hash that follows ext4_dir_entry. But no
    ext4_dir_entry_hash follows it when the inode is encrypted and not
    casefolded.
    
    Signed-off-by: yao.ly <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: dax: fix overflowing extents beyond inode size when partially writing [+ + +]

Author: Zhihao Cheng <[email protected]>
Date:   Fri Aug 9 20:15:32 2024 +0800

    ext4: dax: fix overflowing extents beyond inode size when partially writing
    
    commit dda898d7ffe85931f9cca6d702a51f33717c501e upstream.
    
    dax_iomap_rw() does two things in each iteration: it maps written blocks
    and copies user data to those blocks. If the process is killed by the user
    (see signal handling in dax_iomap_iter()), the length of the copied data
    is returned and added to the inode size, which means that the length of
    the written extents may exceed the inode size, and fsck will then fail.
    An example is given below:
    
    dd if=/dev/urandom of=file bs=4M count=1
     dax_iomap_rw
      iomap_iter // round 1
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 0~2M extents(written flag)
      dax_iomap_iter // copy 2M data
      iomap_iter // round 2
       iomap_iter_advance
        iter->pos += iter->processed // iter->pos = 2M
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 2~4M extents(written flag)
      dax_iomap_iter
       fatal_signal_pending
      done = iter->pos - iocb->ki_pos // done = 2M
     ext4_handle_inode_extension
      ext4_update_inode_size // inode size = 2M
    
    fsck reports: Inode 13, i_size is 2097152, should be 4194304.  Fix?
    
    Fix the problem by truncating extents if the written length is smaller
    than expected.
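
    The arithmetic of the example can be sketched as below. This is only an
    illustration under the assumption that blocks were already allocated for
    the whole requested range; trim_range and dax_trim_sketch() are
    hypothetical names, and the actual fix may compute the range differently.

```c
#include <stdbool.h>

/* Hypothetical result type: byte range whose extents should be dropped. */
struct trim_range {
    bool needed;
    long long start;   /* first byte past the data actually written */
    long long end;     /* end of the range that was requested */
};

/*
 * Assumption: extents were allocated for [pos, pos + count), but only
 * 'written' bytes were copied before the write was interrupted.
 */
static struct trim_range dax_trim_sketch(long long pos, long long count,
                                         long long written)
{
    struct trim_range r = { false, 0, 0 };

    if (written < count) {
        r.needed = true;
        r.start = pos + written;
        r.end = pos + count;
    }
    return r;
}
```

    For the dd example above (4M requested, 2M written), this yields the range
    from 2097152 to 4194304, which matches the two sizes in the fsck report.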
    
    Fixes: 776722e85d3b ("ext4: DAX iomap write support")
    CC: [email protected]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219136
    Signed-off-by: Zhihao Cheng <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhihao Cheng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:27 2024 +0800

    ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free
    
    commit 5c0f4cc84d3a601c99bc5e6e6eb1cbda542cce95 upstream.
    
    When calling ext4_force_split_extent_at() in ext4_ext_replay_update_ex(),
    the 'ppath' is updated but it is the 'path' that is freed, thus potentially
    triggering a double-free in the following process:
    
    ext4_ext_replay_update_ex
      ppath = path
      ext4_force_split_extent_at(&ppath)
        ext4_split_extent_at
          ext4_ext_insert_extent
            ext4_ext_create_new_leaf
              ext4_ext_grow_indepth
                ext4_find_extent
                  if (depth > path[0].p_maxdepth)
                    kfree(path)                 ---> path First freed
                    *orig_path = path = NULL    ---> null ppath
      kfree(path)                               ---> path double-free !!!
    
    So drop the unnecessary ppath and use path directly to avoid this problem.
    And use ext4_find_extent() directly to update path, avoiding unnecessary
    memory allocation and freeing. Also, propagate the error returned by
    ext4_find_extent() instead of using strange error codes.
    
    Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: ext4_search_dir should return a proper error [+ + +]

Author: Thadeu Lima de Souza Cascardo <[email protected]>
Date:   Wed Aug 21 12:23:21 2024 -0300

    ext4: ext4_search_dir should return a proper error
    
    [ Upstream commit cd69f8f9de280e331c9e6ff689ced0a688a9ce8f ]
    
    ext4_search_dir currently returns -1 in case of a failure, while it returns
    0 when the name is not found. In such failure cases, it should return an
    error code instead.
    
    This becomes even more important when ext4_find_inline_entry returns an
    error code as well in the next commit.
    
    -EFSCORRUPTED seems appropriate as the error code, since these failures
    would be caused by unexpected record lengths, and it is in line with other
    instances of ext4_check_dir_entry failures.
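
    A minimal user-space sketch of the resulting convention (positive for
    found, 0 for not found, negative error code for a corrupted record) is
    given below. The record format, search_dir_sketch() and the
    SKETCH_EFSCORRUPTED value are assumptions made for illustration and do not
    reflect the on-disk ext4 directory format.

```c
#include <stddef.h>
#include <string.h>

/* Assumed placeholder value for illustration only. */
#define SKETCH_EFSCORRUPTED 117

/*
 * Hypothetical record layout: [rec_len:1][name_len:1][name bytes...].
 * Returns 1 if 'name' is found, 0 if it is not, and a negative error
 * code if a record length is inconsistent.
 */
static int search_dir_sketch(const unsigned char *buf, size_t size,
                             const char *name)
{
    size_t off = 0;
    size_t nlen = strlen(name);

    while (off < size) {
        size_t rec_len, name_len;

        if (size - off < 2)
            return -SKETCH_EFSCORRUPTED;
        rec_len = buf[off];
        name_len = buf[off + 1];
        /* A record must hold its header and name and fit in the buffer. */
        if (rec_len < 2 + name_len || rec_len > size - off)
            return -SKETCH_EFSCORRUPTED;
        if (name_len == nlen && memcmp(buf + off + 2, name, nlen) == 0)
            return 1;
        off += rec_len;
    }
    return 0;
}
```

    With distinct values, a caller can tell "name not present" apart from
    "directory block is damaged" instead of treating -1 as a generic failure.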
    
    In the case of ext4_dx_find_entry, the current use of ERR_BAD_DX_DIR was
    left as is to reduce the risk of regressions.
    
    Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: filesystems without casefold feature cannot be mounted with siphash [+ + +]

Author: Lizhi Xu <[email protected]>
Date:   Wed Jun 5 09:23:35 2024 +0800

    ext4: filesystems without casefold feature cannot be mounted with siphash
    
    [ Upstream commit 985b67cd86392310d9e9326de941c22fc9340eec ]
    
    When mounting an ext4 filesystem, if the default hash version is set to
    DX_HASH_SIPHASH but the casefold feature is not set, abort the mount.
    
    Reported-by: [email protected]
    Signed-off-by: Lizhi Xu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix access to uninitialised lock in fc replay path [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Thu Jul 18 10:43:56 2024 +0100

    ext4: fix access to uninitialised lock in fc replay path
    
    commit 23dfdb56581ad92a9967bcd720c8c23356af74c1 upstream.
    
    The following kernel trace can be triggered with fstest generic/629 when
    executed against a filesystem with the fast-commit feature enabled:
    
    INFO: trying to register non-static key.
    The code is fine but needs lockdep annotation, or maybe
    you didn't initialize this object before use?
    turning off the locking correctness validator.
    CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x66/0x90
     register_lock_class+0x759/0x7d0
     __lock_acquire+0x85/0x2630
     ? __find_get_block+0xb4/0x380
     lock_acquire+0xd1/0x2d0
     ? __ext4_journal_get_write_access+0xd5/0x160
     _raw_spin_lock+0x33/0x40
     ? __ext4_journal_get_write_access+0xd5/0x160
     __ext4_journal_get_write_access+0xd5/0x160
     ext4_reserve_inode_write+0x61/0xb0
     __ext4_mark_inode_dirty+0x79/0x270
     ? ext4_ext_replay_set_iblocks+0x2f8/0x450
     ext4_ext_replay_set_iblocks+0x330/0x450
     ext4_fc_replay+0x14c8/0x1540
     ? jread+0x88/0x2e0
     ? rcu_is_watching+0x11/0x40
     do_one_pass+0x447/0xd00
     jbd2_journal_recover+0x139/0x1b0
     jbd2_journal_load+0x96/0x390
     ext4_load_and_init_journal+0x253/0xd40
     ext4_fill_super+0x2cc6/0x3180
    ...
    
    In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in
    function ext4_check_bdev_write_error().  Unfortunately, at this point this
    spinlock has not been initialized yet.  Moving its initialization to an
    earlier point in __ext4_fill_super() fixes this splat.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix double brelse() the buffer of the extents path [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:28 2024 +0800

    ext4: fix double brelse() the buffer of the extents path
    
    commit dcaa6c31134c0f515600111c38ed7750003e1b9c upstream.
    
    In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been
    released, otherwise it may be released twice. An example of what triggers
    this is as follows:
    
      split2    map    split1
    |--------|-------|--------|
    
    ext4_ext_map_blocks
     ext4_ext_handle_unwritten_extents
      ext4_split_convert_extents
       // path->p_depth == 0
       ext4_split_extent
         // 1. do split1
         ext4_split_extent_at
           |ext4_ext_insert_extent
           |  ext4_ext_create_new_leaf
           |    ext4_ext_grow_indepth
           |      le16_add_cpu(&neh->eh_depth, 1)
           |    ext4_find_extent
           |      // return -ENOMEM
           |// get error and try zeroout
           |path = ext4_find_extent
           |  path->p_depth = 1
           |ext4_ext_try_to_merge
           |  ext4_ext_try_to_merge_up
           |    path->p_depth = 0
           |    brelse(path[1].p_bh)  ---> not set to NULL here
           |// zeroout success
         // 2. update path
         ext4_find_extent
         // 3. do split2
         ext4_split_extent_at
           ext4_ext_insert_extent
             ext4_ext_create_new_leaf
               ext4_ext_grow_indepth
                 le16_add_cpu(&neh->eh_depth, 1)
               ext4_find_extent
                 path[0].p_bh = NULL;
                 path->p_depth = 1
                 read_extent_tree_block  ---> return err
                 // path[1].p_bh is still the old value
                 ext4_free_ext_path
                   ext4_ext_drop_refs
                     // path->p_depth == 1
                     brelse(path[1].p_bh)  ---> brelse a buffer twice
    
    This finally produces the following WARNING when the buffer is removed
    from the LRU:
    
    ============================================
    VFS: brelse: Trying to free free buffer
    WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90
    CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716
    RIP: 0010:__brelse+0x58/0x90
    Call Trace:
     <TASK>
     __find_get_block+0x6e7/0x810
     bdev_getblk+0x2b/0x480
     __ext4_get_inode_loc+0x48a/0x1240
     ext4_get_inode_loc+0xb2/0x150
     ext4_reserve_inode_write+0xb7/0x230
     __ext4_mark_inode_dirty+0x144/0x6a0
     ext4_ext_insert_extent+0x9c8/0x3230
     ext4_ext_map_blocks+0xf45/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ============================================
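
    The idea behind the fix, clearing the pointer once its reference has been
    dropped so that a later cleanup pass skips it, can be sketched in
    user-space C. The types and functions below are hypothetical stand-ins
    with a plain counter in place of a real buffer head.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a buffer and an extent path entry. */
struct buf_sketch {
    int refcount;
};

struct path_ent_sketch {
    struct buf_sketch *bh;
};

/* Drop one reference; a NULL pointer is ignored. */
static void put_buf_sketch(struct buf_sketch *b)
{
    if (b)
        b->refcount--;
}

/* Release the level-1 buffer and forget it, so it cannot be put twice. */
static void merge_up_sketch(struct path_ent_sketch *path)
{
    put_buf_sketch(path[1].bh);
    path[1].bh = NULL;
}

/* Generic cleanup over all levels up to 'depth'. */
static void drop_refs_sketch(struct path_ent_sketch *path, int depth)
{
    for (int i = 0; i <= depth; i++) {
        put_buf_sketch(path[i].bh);
        path[i].bh = NULL;
    }
}
```

    Without the assignment to NULL in merge_up_sketch(), a later
    drop_refs_sketch() call at depth 1 would decrement the same counter again.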
    
    Fixes: ecb94f5fdf4b ("ext4: collapse a single extent tree block into the inode if possible")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix error message when rejecting the default hash [+ + +]

Author: Gabriel Krisman Bertazi <[email protected]>
Date:   Tue Aug 27 16:16:36 2024 -0400

    ext4: fix error message when rejecting the default hash
    
    [ Upstream commit a2187431c395cdfbf144e3536f25468c64fc7cfa ]
    
    Commit 985b67cd8639 ("ext4: filesystems without casefold feature cannot
    be mounted with siphash") properly rejects volumes where
    s_def_hash_version is set to DX_HASH_SIPHASH, but the check and the
    error message should not look into casefold setup - a filesystem should
    never have DX_HASH_SIPHASH as the default hash.  Fix it and, since we
    are there, move the check to ext4_hash_info_init.
    
    Fixes: 985b67cd8639 ("ext4: filesystems without casefold feature cannot
    be mounted with siphash")
    
    Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix fast commit inode enqueueing during a full journal commit [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 17 18:22:20 2024 +0100

    ext4: fix fast commit inode enqueueing during a full journal commit
    
    commit 6db3c1575a750fd417a70e0178bdf6efa0dd5037 upstream.
    
    When a full journal commit is on-going, any fast commit has to be enqueued
    into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
    is done only once, i.e. if an inode is already queued in a previous fast
    commit entry it won't be enqueued again.  However, if a full commit starts
    _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
    be done into FC_Q_STAGING.  And this is not being done in function
    ext4_fc_track_template().
    
    This patch fixes the issue by re-enqueuing an inode into the STAGING queue
    during the fast commit clean-up callback when doing a full commit.  However,
    to prevent a race with a fast-commit, the clean-up callback has to be called
    with the journal locked.
    
    This bug was found using fstest generic/047.  This test creates several
    32k-byte files, syncs each of them after its creation, and then shuts
    down the filesystem.  Some data may be lost in this operation; for example,
    a file may have its size truncated to zero.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix i_data_sem unlock order in ext4_ind_migrate() [+ + +]

Author: Artem Sadovnikov <[email protected]>
Date:   Thu Aug 29 15:22:09 2024 +0000

    ext4: fix i_data_sem unlock order in ext4_ind_migrate()
    
    [ Upstream commit cc749e61c011c255d81b192a822db650c68b313f ]
    
    Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
    
    This issue is triggered when an EXT4_IOC_MIGRATE ioctl is set to require
    synchronous updates because the file descriptor is opened with O_SYNC.
    This can lead to the jbd2_journal_stop() function calling
    jbd2_might_wait_for_commit(), potentially causing a deadlock if the
    EXT4_IOC_MIGRATE call races with a write(2) system call.
    
    This problem only arises when CONFIG_PROVE_LOCKING is enabled. In this
    case, the jbd2_might_wait_for_commit macro locks jbd2_handle in the
    jbd2_journal_stop function while i_data_sem is locked. This triggers
    lockdep because the jbd2_journal_start function might also lock the same
    jbd2_handle simultaneously.
    
    Found by Linux Verification Center (linuxtesting.org) with syzkaller.
    
    Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
    Co-developed-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Artem Sadovnikov <[email protected]>
    Rule: add
    Link: https://lore.kernel.org/stable/20240404095000.5872-1-mish.uxin2012%40yandex.ru
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:16 2024 +0100

    ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
    
    commit 972090651ee15e51abfb2160e986fa050cfc7a40 upstream.
    
    Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
    for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
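
    Why '0' cannot serve as a "no transaction" marker can be sketched as
    follows: transaction IDs are unsigned counters that wrap around, so 0 is
    reached again after the maximum value. The sketch assumes a 32-bit tid and
    uses hypothetical names (tid_newer(), maybe_tid, should_wait()); it keeps
    an explicit flag instead of comparing the ID with 0.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tid_sketch_t;   /* assumption: 32-bit wrapping counter */

/* Wrap-safe "x is newer than y": compare via the signed difference. */
static bool tid_newer(tid_sketch_t x, tid_sketch_t y)
{
    return (int32_t)(x - y) > 0;
}

/* Carry validity separately instead of reserving tid 0 as "none". */
struct maybe_tid {
    bool valid;
    tid_sketch_t tid;
};

/* Decide on the flag; a check like "tid != 0" would miss a real tid 0. */
static bool should_wait(struct maybe_tid committing)
{
    return committing.valid;
}
```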
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:18 2024 +0100

    ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
    
    commit ebc4b2c1ac92fc0f8bf3f5a9c285a871d5084a6b upstream.
    
    Function ext4_fc_mark_ineligible() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.
    
    Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
    assumption by being initialised to '0'.  Fortunately, the sb flag
    EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
    has been previously set instead of comparing it with '0'.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:15 2024 +0100

    ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
    
    commit dd589b0f1445e1ea1085b98edca6e4d5dedb98d0 upstream.
    
    Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
    value for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:17 2024 +0100

    ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
    
    commit 7a6443e1dad70281f99f0bd394d7fd342481a632 upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.  Don't assume that and
    use two extra boolean variables to control the loop iterations and keep
    track of the first and last tid.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix off by one issue in alloc_flex_gd() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Fri Sep 27 21:33:29 2024 +0800

    ext4: fix off by one issue in alloc_flex_gd()
    
    commit 6121258c2b33ceac3d21f6a221452692c465df88 upstream.
    
    Wesley reported an issue:
    
    ==================================================================
    EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
    ------------[ cut here ]------------
    kernel BUG at fs/ext4/resize.c:324!
    CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
    RIP: 0010:ext4_resize_fs+0x1212/0x12d0
    Call Trace:
     __ext4_ioctl+0x4e0/0x1800
     ext4_ioctl+0x12/0x20
     __x64_sys_ioctl+0x99/0xd0
     x64_sys_call+0x1206/0x20d0
     do_syscall_64+0x72/0x110
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    ==================================================================
    
    While reviewing the patch, Honza found that when adjusting resize_bg in
    alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
    flexbg_size.
    
    The reproduction of the problem requires the following:
    
     o_group = flexbg_size * 2 * n;
     o_size = (o_group + 1) * group_size;
     n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
     n_size = (n_group + 1) * group_size;
    
    Take n=0,flexbg_size=16 as an example:
    
                  last:15
    |o---------------|--------------n-|
    o_group:0    resize to      n_group:30
    
    The corresponding reproducer is:
    
    img=test.img
    rm -f $img
    truncate -s 600M $img
    mkfs.ext4 -F $img -b 1024 -G 16 8M
    dev=`losetup -f --show $img`
    mkdir -p /tmp/test
    mount $dev /tmp/test
    resize2fs $dev 248M
    
    Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
    to prevent the issue from happening again.
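
    The effect of the extra 1 can be checked with a small worked example for
    o_group=0 and last=15 from the diagram above. The sketch assumes that the
    adjusted value has the form 1 << fls(last_group - o_group + 1); that
    expression, fls_sketch() and resize_bg_sketch() are assumptions for
    illustration and are not quoted from the patch.

```c
/* Position of the most significant set bit, 1-based; 0 for x == 0. */
static unsigned int fls_sketch(unsigned int x)
{
    unsigned int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/*
 * Assumed form of the adjustment. With plus_one set, the span 15 becomes
 * 16, which needs one more bit and doubles the result.
 */
static unsigned int resize_bg_sketch(unsigned int o_group,
                                     unsigned int last_group, int plus_one)
{
    unsigned int span = last_group - o_group + (plus_one ? 1u : 0u);

    return 1u << fls_sketch(span);
}
```

    With flexbg_size=16, the variant with the extra 1 yields 32, which is
    bigger than flexbg_size, while the variant without it yields 16.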
    
    [ Note: another reproducer which this commit fixes is:
    
      img=test.img
      rm -f $img
      truncate -s 25MiB $img
      mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
      truncate -s 3GiB $img
      dev=`losetup -f --show $img`
      mkdir -p /tmp/test
      mount $dev /tmp/test
      resize2fs $dev 3G
      umount $dev
      losetup -d $dev
    
      -- TYT ]
    
    Reported-by: Wesley Hershberger <[email protected]>
    Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
    Reported-by: Stéphane Graber <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Alexander Mikhalitsyn <[email protected]>
    Tested-by: Eric Sandeen <[email protected]>
    Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix slab-use-after-free in ext4_split_extent_at() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:23 2024 +0800

    ext4: fix slab-use-after-free in ext4_split_extent_at()
    
    commit c26ab35702f8cd0cdc78f96aa5856bfb77be798f upstream.
    
    We hit the following use-after-free:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0
    Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40
    CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724
    Call Trace:
     <TASK>
     kasan_report+0x93/0xc0
     ext4_split_extent_at+0xba8/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Allocated by task 40:
     __kmalloc_noprof+0x1ac/0x480
     ext4_find_extent+0xf3b/0x1e70
     ext4_ext_map_blocks+0x188/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Freed by task 40:
     kfree+0xf1/0x2b0
     ext4_find_extent+0xa71/0x1e70
     ext4_ext_insert_extent+0xa22/0x3260
     ext4_split_extent_at+0x3ef/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ==================================================================
    
    The issue is triggered as follows:
    
    ext4_split_extent_at
      path = *ppath
      ext4_ext_insert_extent(ppath)
        ext4_ext_create_new_leaf(ppath)
          ext4_find_extent(orig_path)
            path = *orig_path
            read_extent_tree_block
              // return -ENOMEM or -EIO
            ext4_free_ext_path(path)
              kfree(path)
            *orig_path = NULL
      a. If err is -ENOMEM:
      ext4_ext_dirty(path + path->p_depth)
      // path use-after-free !!!
      b. If err is -EIO and we have EXT_DEBUG defined:
      ext4_ext_show_leaf(path)
        eh = path[depth].p_hdr
        // path also use-after-free !!!
    
    So when trying to zeroout or fix the extent length, call ext4_find_extent()
    to update the path.
    
    In addition we use *ppath directly as an ext4_ext_show_leaf() input to
    avoid possible use-after-free when EXT_DEBUG is defined, and to avoid
    unnecessary path updates.
    
    Fixes: dfe5080939ea ("ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix timer use-after-free on failed mount [+ + +]

Author: Xiaxi Shen <[email protected]>
Date:   Sun Jul 14 21:33:36 2024 -0700

    ext4: fix timer use-after-free on failed mount
    
    commit 0ce160c5bdb67081a62293028dc85758a8efb22a upstream.
    
    Syzbot has found an ODEBUG bug in ext4_fill_super().
    
    The del_timer_sync function cancels the s_err_report timer,
    which reminds about filesystem errors daily. We should
    guarantee the timer is no longer active before kfree(sbi).
    
    When filesystem mounting fails, the flow goes to failed_mount3,
    where an error occurs when ext4_stop_mmpd is called, causing
    a read I/O failure. This triggers the ext4_handle_error function
    that ultimately re-arms the timer,
    leaving the s_err_report timer active before kfree(sbi) is called.
    
    Fix the issue by canceling the s_err_report timer after calling ext4_stop_mmpd.
    
    Signed-off-by: Xiaxi Shen <[email protected]>
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=59e0101c430934bc9a36
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: mark fc as ineligible using an handle in ext4_xattr_set() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:09 2024 +0100

    ext4: mark fc as ineligible using an handle in ext4_xattr_set()
    
    commit 04e6ce8f06d161399e5afde3df5dcfa9455b4952 upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch moves the call to this function so that a handle
    can be used.  If a transaction fails to start, then there's no point in
    trying to mark the filesystem as ineligible, and an error will eventually be
    returned to user-space.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: no need to continue when the number of entries is 1 [+ + +]

Author: Edward Adam Davis <[email protected]>
Date:   Mon Jul 1 22:25:03 2024 +0800

    ext4: no need to continue when the number of entries is 1
    
    commit 1a00a393d6a7fb1e745a41edd09019bd6a0ad64c upstream.
    
    Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-and-tested-by: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: propagate errors from ext4_find_extent() in ext4_insert_range() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:30 2024 +0800

    ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
    
    commit 369c944ed1d7c3fb7b35f24e4735761153afe7b3 upstream.
    
    Even though ext4_find_extent() returns an error, ext4_insert_range() still
    returns 0. This may confuse the user as to why fallocate returns success,
    but the contents of the file are not as expected. So propagate the error
    returned by ext4_find_extent() to avoid inconsistencies.
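
    As a rough illustration of the pattern, here is a user-space
    sketch. The names are hypothetical and the lookup reports failure
    through an int, whereas the kernel's ext4_find_extent() encodes the
    error in the returned pointer. The point is only that the caller
    returns the lookup error instead of falling through to 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent path. */
struct path {
    int depth;
};

/* Hypothetical lookup: returns 0 and sets *out, or a negative errno. */
int find_extent(int fail_with, struct path *storage, struct path **out)
{
    assert(out != NULL);
    if (fail_with) {
        *out = NULL;
        return fail_with;
    }
    *out = storage;
    return 0;
}

/* Models the caller: a failed lookup is reported, not swallowed. */
int insert_range(int fail_with)
{
    struct path storage = { 0 };
    struct path *path;
    int ret = find_extent(fail_with, &storage, &path);

    if (ret)
        return ret;   /* propagate instead of returning success */

    (void)path;       /* ... the extents would be shifted here ... */
    return 0;
}
```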
    
    Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: update orig_path in ext4_find_extent() [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:25 2024 +0800

    ext4: update orig_path in ext4_find_extent()
    
    commit 5b4b2dcace35f618fe361a87bae6f0d13af31bc1 upstream.
    
    In ext4_find_extent(), if the path is not big enough, we free it and set
    *orig_path to NULL. But after reallocating and successfully initializing
    the path, we don't update *orig_path, in which case the caller gets a
    valid path but a NULL ppath, and this may cause a NULL pointer dereference
    or a path memory leak. For example:
    
    ext4_split_extent
      path = *ppath = 2000
      ext4_find_extent
        if (depth > path[0].p_maxdepth)
          kfree(path = 2000);
          *orig_path = path = NULL;
          path = kcalloc() = 3000
      ext4_split_extent_at(*ppath = NULL)
        path = *ppath;
        ex = path[depth].p_ext;
        // NULL pointer dereference!
    
    ==================================================================
    BUG: kernel NULL pointer dereference, address: 0000000000000010
    CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847
    RIP: 0010:ext4_split_extent_at+0x6d/0x560
    Call Trace:
     <TASK>
     ext4_split_extent.isra.0+0xcb/0x1b0
     ext4_ext_convert_to_initialized+0x168/0x6c0
     ext4_ext_handle_unwritten_extents+0x325/0x4d0
     ext4_ext_map_blocks+0x520/0xdb0
     ext4_map_blocks+0x2b0/0x690
     ext4_iomap_begin+0x20e/0x2c0
    [...]
    ==================================================================
    
    Therefore, *orig_path is updated when the extent lookup succeeds, so that
    the caller can safely use path or *ppath.
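
    The write-back can be sketched in user space as follows. This is a
    simplified model under stated assumptions: struct path and
    find_extent() are hypothetical stand-ins, and calloc() replaces
    kcalloc(). What it shows is that *orig_path mirrors the returned
    path after a successful reallocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ext4_ext_path. */
struct path {
    int maxdepth;
};

/*
 * Returns a path able to hold `depth` levels, or NULL if allocation
 * fails. On success *orig_path equals the return value.
 */
struct path *find_extent(struct path **orig_path, int depth)
{
    struct path *path;

    assert(orig_path != NULL);
    path = *orig_path;

    if (path && depth > path->maxdepth) {
        free(path);                  /* too small: drop the old path */
        *orig_path = path = NULL;
    }
    if (!path) {
        path = calloc(1, sizeof(*path));
        if (!path)
            return NULL;
        path->maxdepth = depth + 1;
    }

    *orig_path = path;               /* the update that was missing */
    return path;
}
```

    With the final assignment in place, a caller that reads *ppath
    after the lookup sees the new allocation rather than NULL.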
    
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: use handle to mark fc as ineligible in __track_dentry_update() [+ + +]

Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:08 2024 +0100

    ext4: use handle to mark fc as ineligible in __track_dentry_update()
    
    commit faab35a0370fd6e0821c7a8dd213492946fc776f upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch fixes the calls to this function in
    __track_dentry_update() by adding an extra parameter to the callback used in
    ext4_fc_track_template().
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

f2fs: add write priority option based on zone UFS [+ + +]

Author: Liao Yuanhong <[email protected]>
Date:   Mon Jul 15 20:34:51 2024 +0800

    f2fs: add write priority option based on zone UFS
    
    [ Upstream commit 8444ce524947daf441546b5b3a0c418706dade35 ]
    
    Currently, we are using a mix of traditional UFS and zone UFS to support
    some functionalities that cannot be achieved on zone UFS alone. However,
    there are some issues with this approach. There exists a significant
    performance difference between traditional UFS and zone UFS. Under normal
    usage, we prioritize writes to zone UFS. However, in critical conditions
    (such as when the entire UFS is almost full), we cannot determine whether
    data will be written to traditional UFS or zone UFS. This can lead to
    significant performance fluctuations, which is not conducive to
    development and testing. To address this, we have added an option
    zlu_io_enable under sys with the following three modes:
    1) zlu_io_enable == 0: Normal mode, prioritize writing to zone UFS;
    2) zlu_io_enable == 1: Zone UFS only mode, only allow writing to zone UFS;
    3) zlu_io_enable == 2: Traditional UFS priority mode, prioritize writing to
       traditional UFS.
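
    The three modes can be pictured with a small user-space sketch.
    Everything here is hypothetical (the enum names, pick_target(), and
    in particular the fallback to the other area when the preferred
    one has no free space, which is an assumption about what
    "prioritize" means). It is not the f2fs allocator.

```c
#include <assert.h>
#include <stdbool.h>

/* Values follow the zlu_io_enable modes listed above. */
enum zlu_io_mode {
    ZLU_NORMAL = 0,             /* prefer zone UFS */
    ZLU_ZONE_ONLY = 1,          /* zone UFS only */
    ZLU_TRADITIONAL_FIRST = 2   /* prefer traditional UFS */
};

enum target {
    TARGET_NONE,
    TARGET_ZONE,
    TARGET_TRADITIONAL
};

/* Picks a write target from the mode and which areas have free space. */
enum target pick_target(int mode, bool zone_free, bool trad_free)
{
    assert(mode >= ZLU_NORMAL && mode <= ZLU_TRADITIONAL_FIRST);

    switch (mode) {
    case ZLU_ZONE_ONLY:
        return zone_free ? TARGET_ZONE : TARGET_NONE;
    case ZLU_TRADITIONAL_FIRST:
        if (trad_free)
            return TARGET_TRADITIONAL;
        return zone_free ? TARGET_ZONE : TARGET_NONE;
    default: /* ZLU_NORMAL */
        if (zone_free)
            return TARGET_ZONE;
        return trad_free ? TARGET_TRADITIONAL : TARGET_NONE;
    }
}
```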
    
    Signed-off-by: Liao Yuanhong <[email protected]>
    Signed-off-by: Wu Bo <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 65a6ce4726c2 ("f2fs: fix to don't panic system for no free segment fault injection")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: do FG_GC when GC boosting is required for zoned devices [+ + +]

Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:44 2024 -0700

    f2fs: do FG_GC when GC boosting is required for zoned devices
    
    [ Upstream commit 9748c2ddea4a3f46a498bff4cf2bf9a5629e3f8b ]
    
    Under low free section count, we need to use FG_GC instead of BG_GC to
    recover free sections.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: fix to don't panic system for no free segment fault injection [+ + +]

Author: Chao Yu <[email protected]>
Date:   Tue Sep 10 09:16:19 2024 +0800

    f2fs: fix to don't panic system for no free segment fault injection
    
    [ Upstream commit 65a6ce4726c27b45600303f06496fef46d00b57f ]
    
    syzbot reports a f2fs bug as below:
    
    F2FS-fs (loop0): inject no free segment in get_new_segment of __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
    F2FS-fs (loop0): Stopped filesystem due to reason: 7
    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/segment.c:2748!
    CPU: 0 UID: 0 PID: 5109 Comm: syz-executor304 Not tainted 6.11.0-rc6-syzkaller-00363-g89f5e14d05b4 #0
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    Call Trace:
     __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
     f2fs_allocate_new_section fs/f2fs/segment.c:3181 [inline]
     f2fs_allocate_pinning_section+0xfa/0x4e0 fs/f2fs/segment.c:3195
     f2fs_expand_inode_data+0x5d6/0xbb0 fs/f2fs/file.c:1799
     f2fs_fallocate+0x448/0x960 fs/f2fs/file.c:1903
     vfs_fallocate+0x553/0x6c0 fs/open.c:334
     do_vfs_ioctl+0x2592/0x2e50 fs/ioctl.c:886
     __do_sys_ioctl fs/ioctl.c:905 [inline]
     __se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    
    The root cause is that injecting a no-free-segment fault into f2fs
    panics the system, which it should not do. Fix it.
    
    Fixes: 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/linux-f2fs-devel/[email protected]
    Signed-off-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: forcibly migrate to secure space for zoned device file pinning [+ + +]

Author: Daeho Jeong <[email protected]>
Date:   Thu Sep 12 09:59:58 2024 -0700

    f2fs: forcibly migrate to secure space for zoned device file pinning
    
    [ Upstream commit 5cc69a27abfa91abbb39fc584f82d6c867b60f47 ]
    
    We need to migrate data blocks even though it is full to secure space
    for zoned device file pinning.
    
    Fixes: 9703d69d9d15 ("f2fs: support file pinning for zoned devices")
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: increase BG GC migration window granularity when boosted for zoned devices [+ + +]

Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:43 2024 -0700

    f2fs: increase BG GC migration window granularity when boosted for zoned devices
    
    [ Upstream commit 2223fe652f759649ae1d520e47e5f06727c0acbd ]
    
    We need a bigger BG GC migration window granularity when free sections
    are running low.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: introduce migration_window_granularity [+ + +]

Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:41 2024 -0700

    f2fs: introduce migration_window_granularity
    
    [ Upstream commit 8c890c4c60342719526520133fb1b6f69f196ab8 ]
    
    We can control the scanning window granularity for GC migration. For
    more frequent scanning and GC on zoned devices, we need a fine grained
    control knob for it.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: make BG GC more aggressive for zoned devices [+ + +]

Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:40 2024 -0700

    f2fs: make BG GC more aggressive for zoned devices
    
    [ Upstream commit 5062b5bed4323275f2f89bc185c6a28d62cfcfd5 ]
    
    Since there is no GC on the device side for zoned devices, we need more
    aggressive BG GC. So, tune the parameters for that.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

fbdev: efifb: Register sysfs groups through driver core [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Tue Aug 27 17:25:13 2024 +0200

    fbdev: efifb: Register sysfs groups through driver core
    
    [ Upstream commit 95cdd538e0e5677efbdf8aade04ec098ab98f457 ]
    
    The driver core can already register and clean up sysfs groups.
    Make use of that functionality to simplify the error handling and
    cleanup.
    
    Also avoid a UAF race during unregistering where the sysfs attributes
    were usable after the info struct was freed.
    
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fbdev: pxafb: Fix possible use after free in pxafb_task() [+ + +]

Author: Kaixin Wang <[email protected]>
Date:   Wed Sep 11 22:29:52 2024 +0800

    fbdev: pxafb: Fix possible use after free in pxafb_task()
    
    [ Upstream commit 4a6921095eb04a900e0000da83d9475eb958e61e ]
    
    In the pxafb_probe function, it calls the pxafb_init_fbinfo function,
    after which &fbi->task is associated with pxafb_task. Moreover,
    within this pxafb_init_fbinfo function, the pxafb_blank function
    within the &pxafb_ops struct is capable of scheduling work.
    
    Removing the module calls pxafb_remove() to clean up, which calls
    unregister_framebuffer(). That can in turn call
    do_unregister_framebuffer() and free fbi->fb through
    put_fb_info(fb_info) while the work mentioned above is still in use.
    The sequence of operations that may lead to a UAF bug is as follows:
    
    CPU0                                                CPU1
    
                                       | pxafb_task
    pxafb_remove                       |
    unregister_framebuffer(info)       |
    do_unregister_framebuffer(fb_info) |
    put_fb_info(fb_info)               |
    // free fbi->fb                    | set_ctrlr_state(fbi, state)
                                       | __pxafb_lcd_power(fbi, 0)
                                       | fbi->lcd_power(on, &fbi->fb.var)
                                       | //use fbi->fb
    
    Fix it by ensuring that the work is canceled before proceeding
    with the cleanup in pxafb_remove.
    
    Note that only the root user can remove the driver at runtime.
    
    Signed-off-by: Kaixin Wang <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

firmware/sysfb: Disable sysfb for firmware buffers with unknown parent [+ + +]

Author: Thomas Zimmermann <[email protected]>
Date:   Tue Sep 24 10:41:03 2024 +0200

    firmware/sysfb: Disable sysfb for firmware buffers with unknown parent
    
    commit ad604f0a4c040dcb8faf44dc72db25e457c28076 upstream.
    
    The sysfb framebuffer handling only operates on graphics devices
    that provide the system's firmware framebuffer. If that device is
    not known, assume that any graphics device has been initialized by
    firmware.
    
    Fixes a problem on i915 where sysfb does not release the firmware
    framebuffer after the native graphics driver loaded.
    
    Reported-by: Borah, Chaitanya Kumar <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/SJ1PR11MB6129EFB8CE63D1EF6D932F94B96F2@SJ1PR11MB6129.namprd11.prod.outlook.com/
    Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12160
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: b49420d6a1ae ("video/aperture: optionally match the device in sysfb_disable()")
    Cc: Javier Martinez Canillas <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Helge Deller <[email protected]>
    Cc: Sam Ravnborg <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Cc: Linux regression tracking (Thorsten Leemhuis) <[email protected]>
    Cc: <[email protected]> # v6.11+
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Javier Martinez Canillas <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp() [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Fri Aug 16 15:57:21 2024 +0200

    firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp()
    
    commit 9c3a62c20f7fb00294a4237e287254456ba8a48b upstream.
    
    mbox_client_to_bpmp() is not used, which breaks W=1 builds:
    
      drivers/firmware/tegra/bpmp.c:28:1: error: unused function 'mbox_client_to_bpmp' [-Werror,-Wunused-function]
    
    Fixes: cdfa358b248e ("firmware: tegra: Refactor BPMP driver")
    Cc: [email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Thierry Reding <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name [+ + +]

Author: Li Zhijian <[email protected]>
Date:   Mon Aug 26 13:55:03 2024 +0800

    fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
    
    [ Upstream commit 7f7b850689ac06a62befe26e1fd1806799e7f152 ]
    
    A crash has been observed while hot-removing a memory device that a
    user is accessing through hugetlb. See the call trace below:
    
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 14045 at arch/x86/mm/fault.c:1278 do_user_addr_fault+0x2a0/0x790
    Modules linked in: kmem device_dax cxl_mem cxl_pmem cxl_port cxl_pci dax_hmem dax_pmem nd_pmem cxl_acpi nd_btt cxl_core crc32c_intel nvme virtiofs fuse nvme_core nfit libnvdimm dm_multipath scsi_dh_rdac scsi_dh_emc s
    mirror dm_region_hash dm_log dm_mod
    CPU: 1 PID: 14045 Comm: daxctl Not tainted 6.10.0-rc2-lizhijian+ #492
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    RIP: 0010:do_user_addr_fault+0x2a0/0x790
    Code: 48 8b 00 a8 04 0f 84 b5 fe ff ff e9 1c ff ff ff 4c 89 e9 4c 89 e2 be 01 00 00 00 bf 02 00 00 00 e8 b5 ef 24 00 e9 42 fe ff ff <0f> 0b 48 83 c4 08 4c 89 ea 48 89 ee 4c 89 e7 5b 5d 41 5c 41 5d 41
    RSP: 0000:ffffc90000a575f0 EFLAGS: 00010046
    RAX: ffff88800c303600 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: 0000000000001000 RSI: ffffffff82504162 RDI: ffffffff824b2c36
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000a57658
    R13: 0000000000001000 R14: ffff88800bc2e040 R15: 0000000000000000
    FS:  00007f51cb57d880(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000001000 CR3: 00000000072e2004 CR4: 00000000001706f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __warn+0x8d/0x190
     ? do_user_addr_fault+0x2a0/0x790
     ? report_bug+0x1c3/0x1d0
     ? handle_bug+0x3c/0x70
     ? exc_invalid_op+0x14/0x70
     ? asm_exc_invalid_op+0x16/0x20
     ? do_user_addr_fault+0x2a0/0x790
     ? exc_page_fault+0x31/0x200
     exc_page_fault+0x68/0x200
    <...snip...>
    BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     ---[ end trace 0000000000000000 ]---
     BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 14045 Comm: daxctl Kdump: loaded Tainted: G        W          6.10.0-rc2-lizhijian+ #492
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
     RIP: 0010:dentry_name+0x1f4/0x440
    <...snip...>
    ? dentry_name+0x2fa/0x440
    vsnprintf+0x1f3/0x4f0
    vprintk_store+0x23a/0x540
    vprintk_emit+0x6d/0x330
    _printk+0x58/0x80
    dump_mapping+0x10b/0x1a0
    ? __pfx_free_object_rcu+0x10/0x10
    __dump_page+0x26b/0x3e0
    ? vprintk_emit+0xe0/0x330
    ? _printk+0x58/0x80
    ? dump_page+0x17/0x50
    dump_page+0x17/0x50
    do_migrate_range+0x2f7/0x7f0
    ? do_migrate_range+0x42/0x7f0
    ? offline_pages+0x2f4/0x8c0
    offline_pages+0x60a/0x8c0
    memory_subsys_offline+0x9f/0x1c0
    ? lockdep_hardirqs_on+0x77/0x100
    ? _raw_spin_unlock_irqrestore+0x38/0x60
    device_offline+0xe3/0x110
    state_store+0x6e/0xc0
    kernfs_fop_write_iter+0x143/0x200
    vfs_write+0x39f/0x560
    ksys_write+0x65/0xf0
    do_syscall_64+0x62/0x130
    
    Although dump_mapping() already performs some sanity checks before the
    print facility parses '%pd', it is still possible to run into an
    invalid dentry.d_name.name.
    
    Since dump_mapping() only needs to dump the filename, retrieve it
    directly in a safer way to prevent an unnecessary crash.
    
    Note that the filename could be unreliable whether it is retrieved with
    '%pd' or with strncpy_from_kernel_nofault().
    
    Signed-off-by: Li Zhijian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gfs2: fix double destroy_workqueue error [+ + +]

Author: Julian Sun <[email protected]>
Date:   Tue Aug 20 11:31:48 2024 +0800

    gfs2: fix double destroy_workqueue error
    
    commit 6cb9df81a2c462b89d2f9611009ab43ae8717841 upstream.
    
    When gfs2_fill_super() fails, destroy_workqueue() is called within
    gfs2_gl_hash_clear(), and the subsequent code path calls
    destroy_workqueue() on the same work queue again.
    
    This issue can be fixed by setting the work queue pointer to NULL after
    the first destroy_workqueue() call and checking for a NULL pointer
    before attempting to destroy the work queue again.
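
    The idea can be shown with a small user-space sketch. The types and
    destroy_wq_once() are hypothetical, and free() stands in for
    destroy_workqueue(). Clearing the pointer after the first teardown
    makes a second call a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the work queue and the superblock info. */
struct workqueue {
    int unused;
};

struct sb_info {
    struct workqueue *glock_wq;
    int destroy_calls;          /* counts real teardowns, for illustration */
};

/* Tears the queue down at most once, however often it is called. */
void destroy_wq_once(struct sb_info *sdp)
{
    assert(sdp != NULL);
    if (!sdp->glock_wq)
        return;                 /* already destroyed on an earlier path */

    free(sdp->glock_wq);        /* stands in for destroy_workqueue() */
    sdp->glock_wq = NULL;       /* mark as gone for later callers */
    sdp->destroy_calls++;
}
```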
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d34c2a269ed512c531b0
    Fixes: 30e388d57367 ("gfs2: Switch to a per-filesystem glock workqueue")
    Cc: [email protected]
    Signed-off-by: Julian Sun <[email protected]>
    Signed-off-by: Andreas Gruenbacher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gpio: davinci: fix lazy disable [+ + +]

Author: Emanuele Ghidoli <[email protected]>
Date:   Wed Aug 28 15:32:07 2024 +0200

    gpio: davinci: fix lazy disable
    
    commit 3360d41f4ac490282fddc3ccc0b58679aa5c065d upstream.
    
    On a few platforms such as TI's AM69 device, disable_irq() fails to keep
    track of the interrupts that happen between disable_irq() and
    enable_irq() and those interrupts are missed. Use the ->irq_unmask() and
    ->irq_mask() methods instead of ->irq_enable() and ->irq_disable() to
    correctly keep track of edges when disable_irq is called.
    
    This solves the issue of disable_irq() not working as expected on such
    platforms.
    
    Fixes: 23265442b02b ("ARM: davinci: irq_data conversion.")
    Signed-off-by: Emanuele Ghidoli <[email protected]>
    Signed-off-by: Parth Pancholi <[email protected]>
    Acked-by: Keerthy <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gpiolib: Fix potential NULL pointer dereference in gpiod_get_label() [+ + +]

Author: Lad Prabhakar <[email protected]>
Date:   Thu Oct 3 14:13:51 2024 +0100

    gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()
    
    [ Upstream commit 7b99b5ab885993bff010ebcd93be5e511c56e28a ]
    
    In `gpiod_get_label()`, it is possible that `srcu_dereference_check()` may
    return a NULL pointer, leading to a scenario where `label->str` is accessed
    without verifying if `label` itself is NULL.
    
    This patch adds a proper NULL check for `label` before accessing
    `label->str`. The check for `label->str != NULL` is removed because
    `label->str` can never be NULL if `label` is not NULL.
    
    This fixes the issue where the label name was being printed as `(efault)`
    when dumping the sysfs GPIO file when `label == NULL`.
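
    A user-space sketch of the check, under stated assumptions: struct
    gpio_label and label_or() are hypothetical, and the SRCU
    dereference is reduced to a plain pointer that may be NULL. Only
    `label` is tested, since `label->str` is assumed to be valid
    whenever `label` is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the descriptor label. */
struct gpio_label {
    const char *str;
};

/*
 * Returns the label text, or `fallback` when no label is attached.
 * The label pointer is checked before label->str is read.
 */
const char *label_or(const struct gpio_label *label, const char *fallback)
{
    assert(fallback != NULL);   /* callers supply printable fallback text */
    return label ? label->str : fallback;
}
```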
    
    Fixes: 5a646e03e956 ("gpiolib: Return label, if set, for IRQ only line")
    Fixes: a86d27693066 ("gpiolib: fix the speed of descriptor label setting with SRCU")
    Signed-off-by: Lad Prabhakar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gso: fix udp gso fraglist segmentation after pull from frag_list [+ + +]

Author: Willem de Bruijn <[email protected]>
Date:   Tue Oct 1 13:17:46 2024 -0400

    gso: fix udp gso fraglist segmentation after pull from frag_list
    
    commit a1e40ac5b5e9077fe1f7ae0eb88034db0f9ae1ab upstream.
    
    Detect gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For UDP, this
    causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
    udp_hdr(seg->next)->dest.
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
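
    The invariants listed above can be written down as a small
    user-space predicate. This is an illustrative model only: struct
    gso_model and fraglist_geometry_valid() are hypothetical, and the
    model checks all four listed invariants, whereas the patch itself
    only checks the head_skb size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat description of a fraglist GSO packet. */
struct gso_model {
    size_t gso_size;        /* segment payload size */
    size_t head_payload;    /* payload bytes in the head skb, headers excluded */
    size_t nr_frags;        /* number of frag_list skbs */
    const size_t *frag_len; /* payload bytes in each frag_list skb */
};

/* True when the packet still has the geometry of a valid fraglist skb. */
bool fraglist_geometry_valid(const struct gso_model *m)
{
    assert(m != NULL);

    /* Two or more segments: the head plus at least one frag_list skb. */
    if (m->gso_size == 0 || m->nr_frags == 0)
        return false;

    /* The head holds exactly the first gso_size bytes of payload. */
    if (m->head_payload != m->gso_size)
        return false;

    for (size_t i = 0; i < m->nr_frags; i++) {
        bool last = (i + 1 == m->nr_frags);

        if (m->frag_len[i] == 0 || m->frag_len[i] > m->gso_size)
            return false;
        /* All but the last segment must be exactly gso_size. */
        if (!last && m->frag_len[i] != m->gso_size)
            return false;
    }
    return true;
}
```

    A packet whose data was pulled entirely into the linear area has no
    frag_list skbs left and an oversized head, so the predicate rejects
    it; in the patch such packets go to skb_segment instead.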
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Willem de Bruijn <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: bpf: fix cfi stubs for hid_bpf_ops [+ + +]

Author: Benjamin Tissoires <[email protected]>
Date:   Fri Sep 27 16:17:41 2024 +0200

    HID: bpf: fix cfi stubs for hid_bpf_ops
    
    commit acd5f76fd5292c91628e04da83e8b78c986cfa2b upstream.
    
    With the introduction of commit e42ac1418055 ("bpf: Check unsupported ops
    from the bpf_struct_ops's cfi_stubs"), a HID-BPF struct_ops containing
    a .hid_hw_request() or a .hid_hw_output_report() was failing to load
    as the cfi stubs were not defined.
    
    Fix that by defining those simple static functions, which restores
    HID-BPF functionality.
    
    This was detected with the HID selftests suddenly failing on Linus' tree.
    
    Cc: [email protected] # v6.11+
    Fixes: 9286675a2aed ("HID: bpf: add HID-BPF hooks for hid_hw_output_report")
    Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests")
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: i2c-hid: ensure various commands do not interfere with each other [+ + +]

Author: Dmitry Torokhov <[email protected]>
Date:   Mon Sep 9 13:37:40 2024 -0700

    HID: i2c-hid: ensure various commands do not interfere with each other
    
    [ Upstream commit b4ed18a3d56eabd18cfd9841ff05111e3cfbe8f9 ]
    
    i2c-hid uses 2 shared buffers: command and "raw" input buffer for
    sending requests to peripherals and reading data from peripherals when
    executing a variety of commands. Such commands include reading of HID
    registers, requesting a particular power mode, getting and setting
    reports and so on. Because all such requests use the same 2 buffers
    they should not execute simultaneously.
    
    Fix this by introducing a "cmd_lock" mutex and acquiring it whenever
    we need to access ihid->cmdbuf or ihid->rawbuf.
    
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: Ignore battery for all ELAN I2C-HID devices [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Mon Aug 5 16:51:47 2024 +0200

    HID: Ignore battery for all ELAN I2C-HID devices
    
    [ Upstream commit bcc31692a1d1e21f0d06c5f727c03ee299d2264e ]
    
    Before this change there were 16 vid:pid based quirks to ignore the battery
    reported by Elan I2C-HID touchscreens on various Asus and HP laptops.
    
    And a report has been received that the 04F3:2A00 I2C touchscreen on
    the HP ProBook x360 11 G5 EE/86CF also reports a non present battery.
    
    Since I2C-HID devices are always built into laptops, they are not battery
    powered, so it should be safe to just ignore the battery on all Elan I2C-HID
    devices, rather than adding a 17th quirk for the 04F3:2A00 touchscreen.
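
    The difference between a per-device quirk list and a vendor-wide match
    can be sketched as below. This is a hedged illustration, not the HID
    core's quirk table: struct hid_id and ignore_battery() are hypothetical,
    the vendor ID is taken from the 04F3:2A00 pair quoted above, and the bus
    constants mirror the values in <linux/input.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_USB        0x03
#define BUS_I2C        0x18    /* same value as in <linux/input.h> */
#define VENDOR_ID_ELAN 0x04F3  /* vendor half of the 04F3:2A00 ID */

/* Hypothetical, reduced device identity. */
struct hid_id {
    uint16_t bus;
    uint16_t vendor;
    uint16_t product;
};

/* Match on bus and vendor only, so the product ID no longer matters. */
bool ignore_battery(const struct hid_id *id)
{
    assert(id != NULL);
    return id->bus == BUS_I2C && id->vendor == VENDOR_ID_ELAN;
}
```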
    
    As reported in the changelog of commit a3a5a37efba1 ("HID: Ignore battery
    for ELAN touchscreens 2F2C and 4116"), which added 2 new Elan touchscreen
    quirks about a month ago, the HID reported battery seems to be related
    to a stylus being used. But even when a stylus is in use it does not
    properly report the charge of the stylus battery, instead the reported
    battery charge jumps from 0% to 1%. So it is best to just ignore the
    HID battery.
    
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2302776
    Cc: Louis Dalibard <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio [+ + +]

Author: Vishnu Sankar <[email protected]>
Date:   Sun Aug 18 16:27:29 2024 +0900

    HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio
    
    [ Upstream commit 65b72ea91a257a5f0cb5a26b01194d3dd4b85298 ]
    
    This applies quirks similar to those used by the previous generation
    device, so that the Trackpoint and the buttons on the touchpad work.
    New USB KBD PID 0x61AE for Thinkpad X12 Tab is added.
    
    Signed-off-by: Vishnu Sankar <[email protected]>
    Reviewed-by: Mark Pearson <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: (nct6775) add G15CF to ASUS WMI monitoring list [+ + +]

Author: Denis Pauk <[email protected]>
Date:   Mon Aug 12 18:26:38 2024 +0300

    hwmon: (nct6775) add G15CF to ASUS WMI monitoring list
    
    [ Upstream commit 1f432e4cf1dd3ecfec5ed80051b4611632a0fd51 ]
    
    G15CF boards have an nct6775 chip, but by default it is not used
    because of a resource conflict with the WMI method.
    
    Add the board to the WMI monitoring list.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=204807
    Signed-off-by: Denis Pauk <[email protected]>
    Tested-by: Attila <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: core: Lock address during client device instantiation [+ + +]

Author: Heiner Kallweit <[email protected]>
Date:   Thu Aug 15 21:44:50 2024 +0200

    i2c: core: Lock address during client device instantiation
    
    commit 8d3cefaf659265aa82b0373a563fdb9d16a2b947 upstream.
    
    Krzysztof reported an issue [0] which is caused by parallel attempts to
    instantiate the same I2C client device. This can happen if a driver
    supports auto-detection, but certain devices are also instantiated
    explicitly.
    The original change isn't actually wrong, it just revealed that I2C core
    isn't prepared yet to handle this scenario.
    Calls to i2c_new_client_device() can be nested, therefore we can't use a
    simple mutex here. Parallel instantiation of devices at different addresses
    is ok, so we just have to prevent parallel instantiation at the same address.
    We can use a bitmap with one bit per 7-bit I2C client address, and atomic
    bit operations to set/check/clear bits.
    Now a parallel attempt to instantiate a device at the same address will
    result in -EBUSY being returned, avoiding the "sysfs: cannot create duplicate
    filename" splash.
    
    Note: This patch version includes small cosmetic changes to the Tested-by
          version, only functional change is that address locking is supported
          for slave addresses too.
    
    [0] https://lore.kernel.org/linux-i2c/[email protected]/T/#m12706546e8e2414d8f1a0dc61c53393f731685cc
    
    Fixes: caba40ec3531 ("eeprom: at24: Probe for DDR3 thermal sensor in the SPD case")
    Cc: [email protected]
    Tested-by: Krzysztof Piotr Oledzki <[email protected]>
    Signed-off-by: Heiner Kallweit <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled [+ + +]

Author: Kimriver Liu <[email protected]>
Date:   Fri Sep 13 11:31:46 2024 +0800

    i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    
    commit 5d69d5a00f80488ddcb4dee7d1374a0709398178 upstream.
    
    It was observed that issuing the ABORT bit (IC_ENABLE[1]) will not
    work when IC_ENABLE is already disabled.
    
    Check if the ENABLE bit (IC_ENABLE[0]) is disabled when the controller
    is holding SCL low. If the ENABLE bit is disabled, the software needs
    to enable it before trying to issue the ABORT bit. Otherwise,
    the controller ignores any write to the ABORT bit.
    
    These kernel logs show up whenever an I2C transaction is
    attempted after this failure.
    i2c_designware e95e0000.i2c: timeout waiting for bus ready
    i2c_designware e95e0000.i2c: timeout in disabling adapter
    
    The patch fixes the issue where the controller cannot be disabled
    while SCL is held low if the ENABLE bit is already disabled.
    
    Fixes: 2409205acd3c ("i2c: designware: fix __i2c_dw_disable() in case master is holding SCL low")
    Signed-off-by: Kimriver Liu <[email protected]>
    Cc: <[email protected]> # v6.6+
    Reviewed-by: Mika Westerberg <[email protected]>
    Acked-by: Jarkko Nikula <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:34:59 2024 +0800

    i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq()
    
    commit e2c85d85a05f16af2223fcc0195ff50a7938b372 upstream.
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() disables IRQ auto-enable when the IRQ is requested.
    
    Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Cc: <[email protected]> # v4.19+
    Acked-by: Mukesh Kumar Savaliya <[email protected]>
    Reviewed-by: Vladimir Zapolskiy <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume [+ + +]

Author: Marek Vasut <[email protected]>
Date:   Mon Sep 30 21:27:41 2024 +0200

    i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
    
    commit 048bbbdbf85e5e00258dfb12f5e368f908801d7b upstream.
    
    In case there is any sort of clock controller attached to this I2C bus
    controller, for example Versaclock or even an AIC32x4 I2C codec, then
    an I2C transfer triggered from the clock controller clk_ops .prepare
    callback may trigger a deadlock on drivers/clk/clk.c prepare_lock mutex.
    
    This is because the clock controller first grabs the prepare_lock mutex
    and then performs the prepare operation, including its I2C access. The
    I2C access resumes this I2C bus controller via .runtime_resume callback,
    which calls clk_prepare_enable(), which attempts to grab the prepare_lock
    mutex again and deadlocks.
    
    Since the clock is already prepared in probe() and unprepared in
    remove(), use simple clk_enable()/clk_disable() calls to enable and
    disable the clock on runtime suspend and resume, to avoid hitting the
    prepare_lock mutex.
    
    Acked-by: Alain Volmat <[email protected]>
    Signed-off-by: Marek Vasut <[email protected]>
    Fixes: 4e7bca6fc07b ("i2c: i2c-stm32f7: add PM Runtime support")
    Cc: <[email protected]> # v5.0+
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: synquacer: Deal with optional PCLK correctly [+ + +]

Author: Ard Biesheuvel <[email protected]>
Date:   Thu Sep 12 12:46:31 2024 +0200

    i2c: synquacer: Deal with optional PCLK correctly
    
    commit f2990f8630531a99cad4dc5c44cb2a11ded42492 upstream.
    
    ACPI boot does not provide clocks and regulators, but instead, provides
    the PCLK rate directly, and enables the clock in firmware. So deal
    gracefully with this.
    
    Fixes: 55750148e559 ("i2c: synquacer: Fix an error handling path in synquacer_i2c_probe()")
    Cc: [email protected] # v6.10+
    Cc: Andi Shyti <[email protected]>
    Cc: Christophe JAILLET <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 11:42:50 2024 +0800

    i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 0c8d604dea437b69a861479b413d629bc9b3da70 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() first to fix it.
    
    Fixes: 36ecbcab84d0 ("i2c: xiic: Implement power management")
    Cc: <[email protected]> # v4.6+
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: Wait for TX empty to avoid missed TX NAKs [+ + +]

Author: Robert Hancock <[email protected]>
Date:   Tue Nov 21 18:11:16 2023 +0000

    i2c: xiic: Wait for TX empty to avoid missed TX NAKs
    
    commit 521da1e9225450bd323db5fa5bca942b1dc485b7 upstream.
    
    Frequently an I2C write will be followed by a read, such as a register
    address write followed by a read of the register value. In this driver,
    when the TX FIFO half empty interrupt was raised and it was determined
    that there was enough space in the TX FIFO to send the following read
    command, it would do so without waiting for the TX FIFO to actually
    empty.
    
    Unfortunately it appears that in some cases this can result in a NAK
    that was raised by the target device on the write, such as due to an
    unsupported register address, being ignored and the subsequent read
    being done anyway. This can potentially put the I2C bus into an
    invalid state and/or result in invalid read data being processed.
    
    To avoid this, once a message has been fully written to the TX FIFO,
    wait for the TX FIFO empty interrupt before moving on to the next
    message, to ensure NAKs are handled properly.
    
    Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
    Signed-off-by: Robert Hancock <[email protected]>
    Cc: <[email protected]> # v2.6.34+
    Reviewed-by: Manikanta Guntupalli <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition [+ + +]

Author: Kaixin Wang <[email protected]>
Date:   Sun Sep 15 00:39:33 2024 +0800

    i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition
    
    commit 61850725779709369c7e907ae8c7c75dc7cec4f3 upstream.
    
    In the svc_i3c_master_probe function, &master->hj_work is bound with
    svc_i3c_master_hj_work, &master->ibi_work is bound with
    svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work can start the
    hj_work, svc_i3c_master_irq_handler can start the ibi_work.
    
    If we remove the module, which calls svc_i3c_master_remove to clean
    up, it frees master->base through i3c_master_unregister while the
    work mentioned above may still be in use. The sequence of operations
    that may lead to a UAF bug is as follows:
    
    CPU0                                         CPU1
    
                                        | svc_i3c_master_hj_work
    svc_i3c_master_remove               |
    i3c_master_unregister(&master->base)|
    device_unregister(&master->dev)     |
    device_release                      |
    //free master->base                 |
                                        | i3c_master_do_daa(&master->base)
                                        | //use master->base
    
    Fix it by ensuring that the work is canceled before proceeding with the
    cleanup in svc_i3c_master_remove.
    
    Fixes: 0f74f8b6675c ("i3c: Make i3c_master_unregister() return void")
    Cc: [email protected]
    Signed-off-by: Kaixin Wang <[email protected]>
    Reviewed-by: Miquel Raynal <[email protected]>
    Reviewed-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/stable/20240914154030.180-1-kxwang23%40m.fudan.edu.cn
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node() [+ + +]

Author: Aleksandr Mishin <[email protected]>
Date:   Wed Jul 10 15:39:49 2024 +0300

    ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
    
    [ Upstream commit 62fdaf9e8056e9a9e6fe63aa9c816ec2122d60c6 ]
    
    In ice_sched_add_root_node() and ice_sched_add_node() there are calls to
    devm_kcalloc() in order to allocate memory for array of pointers to
    'ice_sched_node' structure. But incorrect types are used as sizeof()
    arguments in these calls (structures instead of pointers) which leads to
    over allocation of memory.
    
    Adjust over allocation of memory by correcting types in devm_kcalloc()
    sizeof() arguments.
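
    The size mismatch is easy to reproduce in plain C. The sketch below uses
    calloc() in place of devm_kcalloc() and a hypothetical struct sched_node
    whose fields are invented for illustration; only the sizeof() pattern
    reflects the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; the real 'ice_sched_node' has different fields. */
struct sched_node {
    struct sched_node *parent;
    struct sched_node **children;
    unsigned char info[64];
};

/* A pointer slot is smaller than the structure it points to. */
static_assert(sizeof(struct sched_node) > sizeof(struct sched_node *),
              "structure must be larger than a pointer for this example");

/* Over-allocating form: every slot is sized as a whole structure. */
size_t children_bytes_wrong(size_t n)
{
    return n * sizeof(struct sched_node);
}

/* Intended form: every slot is sized as one pointer. */
size_t children_bytes_right(size_t n)
{
    return n * sizeof(struct sched_node *);
}

/* sizeof(*children) follows the variable's type, so it cannot drift. */
struct sched_node **alloc_children(size_t n)
{
    struct sched_node **children = calloc(n, sizeof(*children));

    return children;
}
```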
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ieee802154: Fix build error [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 9 21:17:40 2024 +0800

    ieee802154: Fix build error
    
    [ Upstream commit addf89774e48c992316449ffab4f29c2309ebefb ]
    
    If REGMAP_SPI is m and IEEE802154_MCR20A is y,
    
            mcr20a.c:(.text+0x3ed6c5b): undefined reference to `__devm_regmap_init_spi'
            ld: mcr20a.c:(.text+0x3ed6cb5): undefined reference to `__devm_regmap_init_spi'
    
    Select REGMAP_SPI for IEEE802154_MCR20A to fix it.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iio: magnetometer: ak8975: Fix reading for ak099xx sensors [+ + +]

Author: Barnabás Czémán <[email protected]>
Date:   Mon Aug 19 00:29:40 2024 +0200

    iio: magnetometer: ak8975: Fix reading for ak099xx sensors
    
    commit 129464e86c7445a858b790ac2d28d35f58256bbe upstream.
    
    Move ST2 reading with overflow handling after measurement data
    reading.
    The ST2 register has to be read after reading the measurement data,
    because that read marks the end of the reading and releases the lock
    on the data.
    Remove the ST2 read skip on interrupt-based waiting because ST2 is
    required to be read out at the end of the axis read.
    
    Fixes: 57e73a423b1e ("iio: ak8975: add ak09911 and ak09912 support")
    Signed-off-by: Barnabás Czémán <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: pressure: bmp280: Fix regmap for BMP280 device [+ + +]

Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:49 2024 +0200

    iio: pressure: bmp280: Fix regmap for BMP280 device
    
    commit b9065b0250e1705935445ede0a18c1850afe7b75 upstream.
    
    Up to now, the BMP280 device is using the regmap of the BME280 which
    has registers that exist only in the BME280 device.
    
    Fixes: 14e8015f8569 ("iio: pressure: bmp280: split driver in logical parts")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: pressure: bmp280: Fix waiting time for BMP3xx configuration [+ + +]

Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:50 2024 +0200

    iio: pressure: bmp280: Fix waiting time for BMP3xx configuration
    
    commit 262a6634bcc4f0c1c53d13aa89882909f281a6aa upstream.
    
    According to the datasheet, both pressure and temperature can go up to
    oversampling x32. With this option, the maximum measurement time is not
    80ms (this is for press x32 and temp x2), but it is 130ms nominal
    (calculated from table 3.9.2) and since most of the maximum values
    are around +15%, it is configured to 150ms.
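
    The arithmetic behind the new value can be checked with a few lines of
    C. This is one plausible reading of the paragraph above (nominal time
    plus a 15% margin, rounded up to a whole millisecond), and
    wait_with_margin_ms() is a hypothetical helper, not a driver function.

```c
#include <assert.h>

/* Return nominal_ms * (100 + margin_pct) / 100, rounded up to a whole ms. */
unsigned int wait_with_margin_ms(unsigned int nominal_ms, unsigned int margin_pct)
{
    assert(margin_pct <= 100);
    return (nominal_ms * (100 + margin_pct) + 99) / 100;
}
```

    With a nominal 130 ms and a 15% margin this gives 149.5 ms, which rounds
    up to the 150 ms chosen in the patch.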
    
    Fixes: 8d329309184d ("iio: pressure: bmp280: Add support for BMP380 sensor family")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Input: adp5589-keys - fix adp5589_gpio_get_value() [+ + +]

Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:47:23 2024 -0700

    Input: adp5589-keys - fix adp5589_gpio_get_value()
    
    commit c684771630e64bc39bddffeb65dd8a6612a6b249 upstream.
    
    The adp5589 seems to have the same behavior as similar devices as
    explained in commit 910a9f5636f5 ("Input: adp5588-keys - get value from
    data out when dir is out").
    
    Basically, when the gpio is set as output we need to get the value from
    ADP5589_GPO_DATA_OUT_A register instead of ADP5589_GPI_STATUS_A.
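
    A small sketch of the register selection, assuming a single 8-pin bank
    held in memory. struct fake_bank and gpio_get_value() are hypothetical;
    the two data fields merely stand in for the ADP5589_GPO_DATA_OUT_A and
    ADP5589_GPI_STATUS_A registers named above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register image for one 8-pin bank. */
struct fake_bank {
    uint8_t dir_out;       /* bit set: pin is configured as output */
    uint8_t gpo_data_out;  /* stands in for ADP5589_GPO_DATA_OUT_A */
    uint8_t gpi_status;    /* stands in for ADP5589_GPI_STATUS_A */
};

/* Outputs are read back from the data-out register, inputs from the status register. */
int gpio_get_value(const struct fake_bank *b, unsigned int pin)
{
    assert(b != 0 && pin < 8);

    uint8_t bit = (uint8_t)(1u << pin);
    uint8_t reg = (b->dir_out & bit) ? b->gpo_data_out : b->gpi_status;

    return (reg & bit) != 0;
}
```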
    
    Fixes: 9d2e173644bb ("Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-2-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Input: adp5589-keys - fix NULL pointer dereference [+ + +]

Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:46:44 2024 -0700

    Input: adp5589-keys - fix NULL pointer dereference
    
    commit fb5cc65f973661241e4a2b7390b429aa7b330c69 upstream.
    
    We register a devm action to call adp5589_clear_config() and then pass
    the i2c client as argument so that we can call i2c_get_clientdata() in
    order to get our device object. However, i2c_set_clientdata() is only
    being set at the end of the probe function which means that we'll get a
    NULL pointer dereference in case the probe function fails early.
    
    Fixes: 30df385e35a4 ("Input: adp5589-keys - use devm_add_action_or_reset() for register clear")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-1-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake [+ + +]

Author: Kai-Heng Feng <[email protected]>
Date:   Tue Aug 20 12:11:28 2024 +0800

    intel_idle: Disable promotion to C1E on Jasper Lake and Elkhart Lake
    
    [ Upstream commit 5bb33212b5c664396e5de4cd5a2999abb84a3978 ]
    
    PCIe ethernet throughput is sub-optimal on Jasper Lake and Elkhart Lake.
    
    The CPU can take a long time to exit to C0 to handle IRQ and perform DMA
    when C1E has been entered.
    
    For this reason, adjust intel_idle to disable promotion to C1E and still
    use C-states from ACPI _CST on those two platforms.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219023
    Signed-off-by: Kai-Heng Feng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

io_uring/net: harden multishot termination case for recv [+ + +]

Author: Jens Axboe <[email protected]>
Date:   Thu Sep 26 07:08:10 2024 -0600

    io_uring/net: harden multishot termination case for recv
    
    commit c314094cb4cfa6fc5a17f4881ead2dfebfa717a7 upstream.
    
    If the recv returns zero, or an error, then it doesn't matter if more
    data has already been received for this buffer. A condition like that
    should terminate the multishot receive. Rather than pass in the
    collected return value, pass in whether to terminate or keep the recv
    going separately.
    
    Note that this isn't a bug right now, as the only way to get there is
    via setting MSG_WAITALL with multishot receive. And if an application
    does that, then -EINVAL is returned anyway. But it seems like an easy
    bug to introduce, so let's make it a bit more explicit.
    
    Link: https://github.com/axboe/liburing/issues/1246
    Cc: [email protected]
    Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

io_uring: fix memory leak when cache init fail [+ + +]

Author: Guixin Liu <[email protected]>
Date:   Mon Sep 23 18:05:12 2024 +0800

    io_uring: fix memory leak when cache init fail
    
    [ Upstream commit 3a87e264290d71ec86a210ab3e8d23b715ad266d ]
    
    Exit the percpu ref when cache init fails, to free the data memory
    within struct percpu_ref.
    
    Fixes: 206aefde4f88 ("io_uring: reduce/pack size of io_ring_ctx")
    Signed-off-by: Guixin Liu <[email protected]>
    Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iomap: constrain the file range passed to iomap_file_unshare [+ + +]

Author: Darrick J. Wong <[email protected]>
Date:   Wed Oct 2 08:02:13 2024 -0700

    iomap: constrain the file range passed to iomap_file_unshare
    
    [ Upstream commit a311a08a4237241fb5b9d219d3e33346de6e83e0 ]
    
    File contents can only be shared (i.e. reflinked) below EOF, so it makes
    no sense to try to unshare ranges beyond EOF.  Constrain the file range
    parameters here so that we don't have to do that in the callers.
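
    The range constraint amounts to a clamp against the file size. The
    helper below is a hypothetical userspace sketch of that clamp, not the
    iomap code, with plain int64_t values standing in for file offsets.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many bytes of [pos, pos + len) lie below EOF, where 'size'
 * is the file size. A result of 0 means there is nothing to unshare.
 */
int64_t clamp_unshare_len(int64_t pos, int64_t len, int64_t size)
{
    assert(size >= 0);
    if (pos < 0 || len <= 0 || pos >= size)
        return 0;
    return len < size - pos ? len : size - pos;
}
```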
    
    Fixes: 5f4e5752a8a3 ("fs: add iomap_file_dirty")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Link: https://lore.kernel.org/r/20241002150213.GC21853@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Brian Foster <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release [+ + +]

Author: Christoph Hellwig <[email protected]>
Date:   Tue Sep 10 07:39:03 2024 +0300

    iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release
    
    [ Upstream commit 7a9d43eace888a0ee6095035997bb138425844d3 ]
    
    When a direct I/O completion invalidates the page cache, it holds neither
    the i_rwsem nor the invalidate_lock, so it can race with
    iomap_write_delalloc_release.  If the search for the end of the region that
    contains data returns the start offset, we have hit such a race and just
    need to look for the end of the newly created hole instead.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/arm-smmu-v3: Do not use devm for the cd table allocations [+ + +]

Author: Jason Gunthorpe <[email protected]>
Date:   Fri Sep 6 12:47:52 2024 -0300

    iommu/arm-smmu-v3: Do not use devm for the cd table allocations
    
    [ Upstream commit 47b2de35cab2b683f69d03515c2658c2d8515323 ]
    
    The master->cd_table is entirely contained within the struct
    arm_smmu_master which is guaranteed to be freed by the core code under
    arm_smmu_release_device().
    
    There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably
    called to free the CD related memory. Remove it and save some memory.
    
    Tested-by: Nicolin Chen <[email protected]>
    Reviewed-by: Nicolin Chen <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/arm-smmu-v3: Match Stall behaviour for S2 [+ + +]

Author: Mostafa Saleh <[email protected]>
Date:   Fri Aug 30 11:03:47 2024 +0000

    iommu/arm-smmu-v3: Match Stall behaviour for S2
    
    [ Upstream commit ce7cb08e22e09f43649b025c849a3ae3b80833c4 ]
    
    According to the spec (ARM IHI 0070 F.b), in
    "5.5 Fault configuration (A, R, S bits)":
        A STE with stage 2 translation enabled and STE.S2S == 0 is
        considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.
    
    Also described in the pseudocode “SteIllegal()”
        if STE.Config == '11x' then
            [..]
            if eff_idr0_stall_model == '10' && STE.S2S == '0' then
                // stall_model forcing stall, but S2S == 0
                return TRUE;
    
    Which means, S2S must be set when stall model is
    "ARM_SMMU_FEAT_STALL_FORCE", but currently the driver ignores that.
    
    Although the driver could do the minimum and only set S2S for
    “ARM_SMMU_FEAT_STALL_FORCE”, it is more consistent to match S1
    behaviour, which also sets it for “ARM_SMMU_FEAT_STALL” if the
    master has requested stalls.
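
    The single rule quoted from the spec can be written as a C predicate.
    This sketch covers only that rule, not the whole SteIllegal() check, and
    the function and macro names are hypothetical. Config is treated as a
    3-bit field whose upper two bits ('11x') mean stage 2 translation is
    enabled.

```c
#include <assert.h>
#include <stdbool.h>

#define STE_CONFIG_S2_MASK 0x6u  /* Config == '11x': stage 2 enabled */
#define STALL_MODEL_FORCE  0x2u  /* SMMU_IDR0.STALL_MODEL == 0b10 */

/* True if the STE is ILLEGAL because stalls are forced but S2S is clear. */
bool ste_s2_stall_illegal(unsigned int config, unsigned int stall_model, bool s2s)
{
    assert(config <= 0x7 && stall_model <= 0x3);

    bool s2_enabled = (config & STE_CONFIG_S2_MASK) == STE_CONFIG_S2_MASK;

    return s2_enabled && stall_model == STALL_MODEL_FORCE && !s2s;
}
```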
    
    Also, since S2 stalls are enabled now, report them to the IOMMU layer
    and for VFIO devices it will fail anyway as VFIO doesn’t register an
    iopf handler.
    
    Signed-off-by: Mostafa Saleh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Always reserve a domain ID for identity setup [+ + +]

Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:13 2024 +0800

    iommu/vt-d: Always reserve a domain ID for identity setup
    
    [ Upstream commit 2c13012e09190174614fd6901857a1b8c199e17d ]
    
    We will use a global static identity domain. Reserve a static domain ID
    for it.
    
    Signed-off-by: Lu Baolu <[email protected]>
    Reviewed-by: Jason Gunthorpe <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Reviewed-by: Jerry Snitselaar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count [+ + +]

Author: Sanjay K Kumar <[email protected]>
Date:   Mon Sep 2 10:27:18 2024 +0800

    iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count
    
    [ Upstream commit 3cf74230c139f208b7fb313ae0054386eee31a81 ]
    
    If qi_submit_sync() is invoked with 0 invalidation descriptors (for
    instance, for DMA draining purposes), we can run into a bug where a
    submitting thread fails to detect the completion of invalidation_wait.
    Subsequently, this leads to a soft lockup. Currently, there is no impact
    by this bug on the existing users because no callers are submitting
    invalidations with 0 descriptors. This fix will enable future users
    (such as DMA drain) calling qi_submit_sync() with 0 count.
    
    Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while
    concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both
    threads then enter a while loop, waiting for their respective descriptors
    to complete. T1 detects its completion (i.e., T1's invalidation_wait status
    changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to
    reclaim all descriptors, potentially including adjacent ones of other
    threads that are also marked as QI_DONE.
    
    During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU
    hardware may complete the invalidation for T2, setting its status to
    QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's
    invalidation_wait descriptor and changes its status to QI_FREE, T2 will
    not observe the QI_DONE status for its invalidation_wait and will
    indefinitely remain stuck.
    
    This soft lockup does not occur when only non-zero descriptors are
    submitted. In such cases, invalidation descriptors are interspersed among
    wait descriptors with the status QI_IN_USE, acting as barriers. These
    barriers prevent the reclaim code from mistakenly freeing descriptors
    belonging to other submitters.
    
    Consider the following example timeline:
            T1                      T2
    ========================================
            ID1
            WD1
            while(WD1!=QI_DONE)
            unlock
                                    lock
            WD1=QI_DONE*            WD2
                                    while(WD2!=QI_DONE)
                                    unlock
            lock
            WD1==QI_DONE?
            ID1=QI_DONE             WD2=DONE*
            reclaim()
            ID1=FREE
            WD1=FREE
            WD2=FREE
            unlock
                                    soft lockup! T2 never sees QI_DONE in WD2
    
    Where:
    ID = invalidation descriptor
    WD = wait descriptor
    * Written by hardware
    
    The root of the problem is that the descriptor status QI_DONE flag is used
    for two conflicting purposes:
    1. signal a descriptor is ready for reclaim (to be freed)
    2. signal by the hardware that a wait descriptor is complete
    
    The solution (in this patch) is state separation by using QI_FREE flag
    for #1.
    
    Once a thread's invalidation descriptors are complete, their status would
    be set to QI_FREE. The reclaim_free_desc() function would then only
    free descriptors marked as QI_FREE instead of those marked as
    QI_DONE. This change ensures that T2 (from the previous example) will
    correctly observe the completion of its invalidation_wait (marked as
    QI_DONE).
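
    The state separation can be sketched with a tiny ring of status values.
    This is a simplified userspace model, not the driver's queue: struct
    fake_qi, mark_own_free() and reclaim_free() are hypothetical, and only
    the QI_FREE, QI_IN_USE and QI_DONE state names come from the text above.

```c
#include <assert.h>

enum qi_state { QI_FREE, QI_IN_USE, QI_DONE };

#define RING_LEN 8u

/* Hypothetical, simplified queue; not the layout used by the real driver. */
struct fake_qi {
    enum qi_state status[RING_LEN];
    unsigned int free_head;  /* next slot to hand out */
    unsigned int free_tail;  /* oldest slot not yet reclaimed */
    unsigned int free_cnt;
};

/* A submitter marks only its own descriptors as ready for reclaim. */
void mark_own_free(struct fake_qi *qi, unsigned int start, unsigned int count)
{
    assert(count <= RING_LEN);
    for (unsigned int i = 0; i < count; i++)
        qi->status[(start + i) % RING_LEN] = QI_FREE;
}

/* Reclaim from the tail, stopping at the first slot not marked QI_FREE. */
unsigned int reclaim_free(struct fake_qi *qi)
{
    unsigned int reclaimed = 0;

    while (qi->free_tail != qi->free_head &&
           qi->status[qi->free_tail] == QI_FREE) {
        qi->free_tail = (qi->free_tail + 1) % RING_LEN;
        qi->free_cnt++;
        reclaimed++;
    }
    return reclaimed;
}
```

    In the timeline's terms, T1 frees ID1 and WD1, while WD2 stays QI_DONE
    until T2 has seen it, because reclaim no longer treats QI_DONE as
    reclaimable.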
    
    Signed-off-by: Sanjay K Kumar <[email protected]>
    Signed-off-by: Jacob Pan <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Unconditionally flush device TLB for pasid table updates [+ + +]

Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:20 2024 +0800

    iommu/vt-d: Unconditionally flush device TLB for pasid table updates
    
    [ Upstream commit 1f5e307ca16c0c19186cbd56ac460a687e6daba0 ]
    
    The caching mode of an IOMMU is irrelevant to the behavior of the device
    TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode
    check before device TLB flush") removed this redundant check in the
    domain unmap path.
    
    Checking the caching mode before flushing the device TLB after a pasid
    table entry is updated is unnecessary and can lead to inconsistent
    behavior.
    
    Extend this consistency by removing the caching mode check in the pasid
    table update path.
    
    Suggested-by: Yi Liu <[email protected]>
    Signed-off-by: Lu Baolu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Fri Aug 9 16:54:02 2024 -0700

    ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
    
    [ Upstream commit e3af3d3c5b26c33a7950e34e137584f6056c4319 ]
    
    dev->ip_ptr could be NULL if we set an invalid MTU.
    
    Even then, if we issue ioctl(SIOCSIFADDR) for a new IPv4 address,
    devinet_ioctl() allocates struct in_ifaddr and fails later in
    inet_set_ifa() because in_dev is NULL.
    
    Let's move the check earlier.
    
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: ip_gre: Fix drops of small packets in ipgre_xmit [+ + +]

Author: Anton Danilov <[email protected]>
Date:   Wed Sep 25 02:51:59 2024 +0300

    ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
    
    [ Upstream commit c4a14f6d9d17ad1e41a36182dd3b8a5fd91efbd7 ]
    
    Regression Description:
    
    Depending on the options specified for the GRE tunnel device, small
    packets may be dropped. This occurs because the pskb_network_may_pull
    function fails due to the packet's insufficient length.
    
    For example, if only the okey option is specified for the tunnel device,
    original (before encapsulation) packets smaller than 28 bytes (including
    the IPv4 header) will be dropped. This happens because the required
    length is calculated relative to the network header, not the skb->head.
    
    Here is how the required length is computed and checked:
    
    * The pull_len variable is set to 28 bytes, consisting of:
      * IPv4 header: 20 bytes
      * GRE header with Key field: 8 bytes
    
    * The pskb_network_may_pull function adds the network offset, shifting
      the checkable space further to the beginning of the network header and
      extending it to the beginning of the packet. As a result, the end of
      the checkable space occurs beyond the actual end of the packet.
    
    Instead of ensuring that 28 bytes are present in skb->head, the function
    is requesting these 28 bytes starting from the network header. For small
    packets, this requested length exceeds the actual packet size, causing
    the check to fail and the packets to be dropped.
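
    The arithmetic can be sketched as follows. This is a simplified model,
    not kernel code: may_pull() and network_may_pull() only mimic the length
    checks, and the model assumes that the network offset equals the 28
    bytes of tunnel headers placed in front of the original packet.

```python
IPV4_HDR_LEN = 20
GRE_HDR_WITH_KEY_LEN = 8
PULL_LEN = IPV4_HDR_LEN + GRE_HDR_WITH_KEY_LEN  # 28 bytes

def may_pull(skb_len, length):
    """Model of pskb_may_pull(): length bytes must exist from the data start."""
    return length <= skb_len

def network_may_pull(skb_len, network_offset, length):
    """Model of pskb_network_may_pull(): same check, shifted by the offset."""
    return may_pull(skb_len, network_offset + length)

def checks(inner_len, network_offset=PULL_LEN):
    """Return (old check, new check) for an original packet of inner_len bytes."""
    # Assumption: the skb holds the tunnel headers plus the original packet.
    skb_len = PULL_LEN + inner_len
    old = network_may_pull(skb_len, network_offset, PULL_LEN)
    new = may_pull(skb_len, PULL_LEN)
    return old, new

for inner_len in (20, 27, 28):
    print(inner_len, checks(inner_len))
```

    Under these assumptions the old check asks for 56 bytes, so original
    packets shorter than 28 bytes fail it, while the new check passes.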
    
    This issue affects both locally originated and forwarded packets in
    DMVPN-like setups.
    
    How to reproduce (for locally originated packets):
    
      ip link add dev gre1 type gre ikey 1.9.8.4 okey 1.9.8.4 \
              local <your-ip> remote 0.0.0.0
    
      ip link set mtu 1400 dev gre1
      ip link set up dev gre1
      ip address add 192.168.13.1/24 dev gre1
      ip neighbor add 192.168.13.2 lladdr <remote-ip> dev gre1
      ping -s 1374 -c 10 192.168.13.2
      tcpdump -vni gre1
      tcpdump -vni <your-ext-iface> 'ip proto 47'
      ip -s -s -d link show dev gre1
    
    Solution:
    
    Use the pskb_may_pull function instead of pskb_network_may_pull.
    
    Fixes: 80d875cfc9d3 ("ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()")
    Signed-off-by: Anton Danilov <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family [+ + +]

Author: Ido Schimmel <[email protected]>
Date:   Wed Aug 14 15:52:22 2024 +0300

    ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
    
    [ Upstream commit 8fed54758cd248cd311a2b5c1e180abef1866237 ]
    
    The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
    lookup according to user provided parameters and communicate the result
    back to user space.
    
    However, unlike other users of the FIB lookup API, the upper DSCP bits
    and the ECN bits of the DS field are not masked, which can result in the
    wrong result being returned.
    
    Solve this by masking the upper DSCP bits and the ECN bits using
    IPTOS_RT_MASK.
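
    As an illustration of the masking, here is a sketch using the mask
    values from the kernel UAPI headers (IPTOS_TOS_MASK is 0x1E and
    IPTOS_RT_MASK is IPTOS_TOS_MASK & ~3). The function name fib_lookup_tos
    is hypothetical and does not appear in the patch.

```python
IPTOS_TOS_MASK = 0x1E
IPTOS_RT_MASK = IPTOS_TOS_MASK & ~3  # 0x1C: drops the ECN bits as well

def fib_lookup_tos(ds_field):
    """Keep only the bits of the DS field that the FIB lookup should see."""
    return ds_field & IPTOS_RT_MASK

# Upper DSCP bits (0xE0) and ECN bits (0x03) are ignored.
print(hex(fib_lookup_tos(0xF1)))
```

    Two DS field values that differ only in the upper DSCP bits or the ECN
    bits therefore map to the same lookup key.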
    
    The structure that communicates the request and the response is not
    exported to user space, so it is unlikely that this netlink family is
    actually in use [1].
    
    [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
    
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Guillaume Nault <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit [+ + +]

Author: Kemeng Shi <[email protected]>
Date:   Thu Aug 1 09:38:08 2024 +0800

    jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit
    
    commit f0e3c14802515f60a47e6ef347ea59c2733402aa upstream.
    
    Use tid_geq to compare tids to work over sequence number wraps.
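
    For illustration, a wrap-safe comparison of 32-bit sequence numbers can
    be modelled as below. This is a userspace sketch of the idea behind
    tid_geq (interpret the 32-bit difference as signed), not the kernel
    helper itself.

```python
def tid_geq(x, y):
    """Return True if tid x is at or after tid y, allowing for 32-bit wrap."""
    diff = (x - y) & 0xFFFFFFFF      # difference modulo 2**32
    if diff >= 0x80000000:           # reinterpret as a signed 32-bit value
        diff -= 0x100000000
    return diff >= 0

# Tid 2 was issued after tid 0xFFFFFFFE, because the counter wrapped.
print(tid_geq(2, 0xFFFFFFFE), 2 >= 0xFFFFFFFE)
```

    A plain ">=" gets the ordering wrong once the counter wraps, while the
    signed-difference form keeps working as long as the two tids are less
    than 2^31 apart.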
    
    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhang Yi <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error [+ + +]

Author: Baokun Li <[email protected]>
Date:   Thu Jul 18 19:53:36 2024 +0800

    jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
    
    commit f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a upstream.
    
    In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
    to recover some journal space. But if an error occurs while executing
    jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
    space right away. Instead, we try other branches, and if
    j_committing_transaction is NULL (i.e., the tid is 0), we get the
    following complaint:
    
    ============================================
    JBD2: I/O error when updating journal superblock for sdd-8.
    __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
    __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
    Modules linked in:
    CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
    RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
    Call Trace:
     <TASK>
     add_transaction_credits+0x5d1/0x5e0
     start_this_handle+0x1ef/0x6a0
     jbd2__journal_start+0x18b/0x340
     ext4_dirty_inode+0x5d/0xb0
     __mark_inode_dirty+0xe4/0x5d0
     generic_update_time+0x60/0x70
    [...]
    ============================================
    
    So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
    clean up at the moment, continue to try to reclaim free space in other ways.
    
    Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
    when updating journal superblock fails") to make jbd2_cleanup_journal_tail
    return the correct error code.
    
    Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

jfs: check if leafidx greater than num leaves per dmap tree [+ + +]

Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 09:25:23 2024 +0800

    jfs: check if leafidx greater than num leaves per dmap tree
    
    [ Upstream commit d64ff0d2306713ff084d4b09f84ed1a8c75ecc32 ]
    
    syzbot reported an out-of-bounds access in dbSplit because dmt_leafidx is
    greater than the number of leaves per dmap tree. Add a check for
    dmt_leafidx in dbFindLeaf.
    
    Shaggy:
    Modified sanity check to apply to control pages as well as leaf pages.
    
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=dca05492eff41f604890
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uaf in dbFreeBits [+ + +]

Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 10:50:48 2024 +0800

    jfs: Fix uaf in dbFreeBits
    
    [ Upstream commit d6c1b3599b2feb5c7291f5ac3a36e5fa7cedb234 ]
    
    [syzbot reported]
    ==================================================================
    BUG: KASAN: slab-use-after-free in __mutex_lock_common kernel/locking/mutex.c:587 [inline]
    BUG: KASAN: slab-use-after-free in __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
    Read of size 8 at addr ffff8880229254b0 by task syz-executor357/5216
    
    CPU: 0 UID: 0 PID: 5216 Comm: syz-executor357 Not tainted 6.11.0-rc3-syzkaller-00156-gd7a5aa4b3c00 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0x169/0x550 mm/kasan/report.c:488
     kasan_report+0x143/0x180 mm/kasan/report.c:601
     __mutex_lock_common kernel/locking/mutex.c:587 [inline]
     __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
     dbFreeBits+0x7ea/0xd90 fs/jfs/jfs_dmap.c:2390
     dbFreeDmap fs/jfs/jfs_dmap.c:2089 [inline]
     dbFree+0x35b/0x680 fs/jfs/jfs_dmap.c:409
     dbDiscardAG+0x8a9/0xa20 fs/jfs/jfs_dmap.c:1650
     jfs_ioc_trim+0x433/0x670 fs/jfs/jfs_discard.c:100
     jfs_ioctl+0x2d0/0x3e0 fs/jfs/ioctl.c:131
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:907 [inline]
     __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
    
    Freed by task 5218:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
     kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
     poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
     __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2252 [inline]
     slab_free mm/slub.c:4473 [inline]
     kfree+0x149/0x360 mm/slub.c:4594
     dbUnmount+0x11d/0x190 fs/jfs/jfs_dmap.c:278
     jfs_mount_rw+0x4ac/0x6a0 fs/jfs/jfs_mount.c:247
     jfs_remount+0x3d1/0x6b0 fs/jfs/super.c:454
     reconfigure_super+0x445/0x880 fs/super.c:1083
     vfs_cmd_reconfigure fs/fsopen.c:263 [inline]
     vfs_fsconfig_locked fs/fsopen.c:292 [inline]
     __do_sys_fsconfig fs/fsopen.c:473 [inline]
     __se_sys_fsconfig+0xb6e/0xf80 fs/fsopen.c:345
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [Analysis]
    Two paths (dbUnmount and jfs_ioc_trim) race when accessing bmap, which
    leads to a use-after-free.
    
    Use the s_umount lock to synchronize them and avoid the use-after-free
    caused by this race.
    
    Reported-and-tested-by: [email protected]
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uninit-value access of new_ea in ea_buffer [+ + +]

Author: Zhao Mengmeng <[email protected]>
Date:   Wed Sep 4 09:07:58 2024 +0800

    jfs: Fix uninit-value access of new_ea in ea_buffer
    
    [ Upstream commit 2b59ffad47db1c46af25ccad157bb3b25147c35c ]
    
    syzbot reports that lzo1x_1_do_compress is using uninit-value:
    
    =====================================================
    BUG: KMSAN: uninit-value in lzo1x_1_do_compress+0x19f9/0x2510 lib/lzo/lzo1x_compress.c:178
    
    ...
    
    Uninit was stored to memory at:
     ea_put fs/jfs/xattr.c:639 [inline]
    
    ...
    
    Local variable ea_buf created at:
     __jfs_setxattr+0x5d/0x1ae0 fs/jfs/xattr.c:662
     __jfs_xattr_set+0xe6/0x1f0 fs/jfs/xattr.c:934
    
    =====================================================
    
    The reason is that ea_buf->new_ea is not initialized properly.
    
    Fix this by using memset to clear its contents at the beginning of
    ea_get().
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=02341e0daa42a15ce130
    Signed-off-by: Zhao Mengmeng <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: UBSAN: shift-out-of-bounds in dbFindBits [+ + +]

Author: Remington Brasga <[email protected]>
Date:   Wed Jul 10 00:12:44 2024 +0000

    jfs: UBSAN: shift-out-of-bounds in dbFindBits
    
    [ Upstream commit b0b2fc815e514221f01384f39fbfbff65d897e1c ]
    
    Fix issue with UBSAN throwing shift-out-of-bounds warning.
    
    Reported-by: [email protected]
    Signed-off-by: Remington Brasga <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jump_label: Fix static_key_slow_dec() yet again [+ + +]

Author: Peter Zijlstra <[email protected]>
Date:   Mon Sep 9 12:50:09 2024 +0200

    jump_label: Fix static_key_slow_dec() yet again
    
    [ Upstream commit 1d7f856c2ca449f04a22d876e36b464b7a9d28b6 ]
    
    While commit 83ab38ef0a0b ("jump_label: Fix concurrency issues in
    static_key_slow_dec()") fixed one problem, it created yet another,
    notably the following is now possible:
    
      slow_dec
        if (try_dec) // dec_not_one-ish, false
        // enabled == 1
                                    slow_inc
                                      if (inc_not_disabled) // inc_not_zero-ish
                                      // enabled == 2
                                        return
    
        guard(mutex)(&jump_label_mutex);
        if (atomic_cmpxchg(1,0)==1) // false, we're 2
    
                                    slow_dec
                                      if (try_dec) // dec_not_one, true
                                      // enabled == 1
                                        return
        else
          try_dec() // dec_not_one, false
          WARN
    
    Use dec_and_test instead of cmpxchg(), like it was prior to
    83ab38ef0a0b. Add a few WARNs for the paranoid.
    
    Fixes: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
    Reported-by: "Darrick J. Wong" <[email protected]>
    Tested-by: Klara Modin <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kconfig: fix infinite loop in sym_calc_choice() [+ + +]

Author: Masahiro Yamada <[email protected]>
Date:   Wed Sep 25 20:25:31 2024 +0900

    kconfig: fix infinite loop in sym_calc_choice()
    
    [ Upstream commit 4d46b5b623e0adee1153b1d80689211e5094ae44 ]
    
    Since commit f79dc03fe68c ("kconfig: refactor choice value calculation"),
    Kconfig for ARCH=powerpc may result in an infinite loop. This occurs
    because there are two entries for POWERPC64_CPU in a choice block.
    
    If the same symbol appears twice in a choice block, the ->choice_link
    node is added twice to ->choice_members, resulting in a corrupted linked
    list.
    
    A simple test case is:
    
        choice
                prompt "choice"
    
        config A
                bool "A"
    
        config B
                bool "B 1"
    
        config B
                bool "B 2"
    
        endchoice
    
    Running 'make defconfig' results in an infinite loop.
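
    Why a double insertion ends in an endless walk can be shown with a
    small model of a circular doubly linked list. This is a sketch in the
    style of the kernel's list helpers, not the Kconfig code: Node,
    list_add_tail() and walk() are illustrative names, and walk() carries
    an artificial step limit so that the demonstration terminates.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.prev = self.next = self  # an unlinked node points at itself

def list_add_tail(new, head):
    """Link new just before head, without checking whether it is already linked."""
    prev, nxt = head.prev, head
    nxt.prev = new
    new.next = nxt
    new.prev = prev
    prev.next = new

def walk(head, limit=10):
    """Collect member names, giving up after limit steps."""
    names = []
    node = head.next
    while node is not head and len(names) < limit:
        names.append(node.name)
        node = node.next
    return names

members, a, b = Node("head"), Node("A"), Node("B")
list_add_tail(a, members)
list_add_tail(b, members)
print(walk(members))       # intact list
list_add_tail(b, members)  # the same node is added a second time
print(walk(members))       # the walk only stops because of the step limit
```

    After the second insertion the node for B points at itself, so a walk
    that waits to reach the list head again never finishes.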
    
    One solution is to replace the current two entries:
    
        config POWERPC64_CPU
                bool "Generic (POWER5 and PowerPC 970 and above)"
                depends on PPC_BOOK3S_64 && !CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
    
        config POWERPC64_CPU
                bool "Generic (POWER8 and above)"
                depends on PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
                select ARCH_HAS_FAST_MULTIPLIER
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX
    
    with the following single entry:
    
        config POWERPC64_CPU
                bool "Generic 64 bit powerpc"
                depends on PPC_BOOK3S_64
                select ARCH_HAS_FAST_MULTIPLIER if CPU_LITTLE_ENDIAN
                select PPC_64S_HASH_MMU
                select PPC_HAS_LBARX_LHARX if CPU_LITTLE_ENDIAN
    
    In my opinion, the latter looks cleaner, but PowerPC maintainers may
    prefer to display different prompts depending on CPU_LITTLE_ENDIAN.
    
    For now, this commit fixes the issue in Kconfig, restoring the original
    behavior. I will reconsider whether such a use case is worth supporting.
    
    Fixes: f79dc03fe68c ("kconfig: refactor choice value calculation")
    Reported-by: Marco Bonelli <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kconfig: qconf: fix buffer overflow in debug links [+ + +]

Author: Masahiro Yamada <[email protected]>
Date:   Tue Oct 1 18:02:22 2024 +0900

    kconfig: qconf: fix buffer overflow in debug links
    
    [ Upstream commit 984ed20ece1c6c20789ece040cbff3eb1a388fa9 ]
    
    If you enable "Option -> Show Debug Info" and click a link, the program
    terminates with the following error:
    
        *** buffer overflow detected ***: terminated
    
    The buffer overflow is caused by the following line:
    
        strcat(data, "$");
    
    The buffer needs one more byte to accommodate the additional character.
    
    Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again")
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kconfig: qconf: move conf_read() before drawing tree pain [+ + +]

Author: Masahiro Yamada <[email protected]>
Date:   Tue Oct 1 02:02:23 2024 +0900

    kconfig: qconf: move conf_read() before drawing tree pain
    
    [ Upstream commit da724c33b685463720b1c625ac440e894dc57ec0 ]
    
    The constructor of ConfigMainWindow() calls show*View(), which needs
    to calculate symbol values. conf_read() must be called before that.
    
    Fixes: 060e05c3b422 ("kconfig: qconf: remove initial call to conf_changed()")
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3 [+ + +]

Author: Alessandro Zanni <[email protected]>
Date:   Tue Aug 6 14:14:50 2024 +0200

    kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
    
    [ Upstream commit a19008256d05e726f29f43c6a307e45482c082c3 ]
    
    Use raw strings so that Python 3 does not treat "\d" in the regex
    string literals as an invalid escape sequence.
    
    Fix the warnings:
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:48:
    SyntaxWarning: invalid escape sequence '\d' usb_controller_sysfs_dir =
    "usb[\d]+"
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:94:
    SyntaxWarning: invalid escape sequence '\d' re_usb_version =
    re.compile("PRODUCT=.*/(\d)/.*")
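
    A minimal sketch of the raw-string form of the two patterns quoted in
    the warnings above. The sample inputs ("usb1" and the PRODUCT line) are
    made up for illustration.

```python
import re

# Raw string literals pass the backslash through to the regex engine as is.
usb_controller_sysfs_dir = r"usb[\d]+"
re_usb_version = re.compile(r"PRODUCT=.*/(\d)/.*")

# The raw form is the same string as the explicitly escaped form.
print(usb_controller_sysfs_dir == "usb[\\d]+")

print(bool(re.fullmatch(usb_controller_sysfs_dir, "usb1")))
print(re_usb_version.search("PRODUCT=1d6b/2/611").group(1))
```

    The compiled patterns behave exactly as before. Only the literal syntax
    changes, which is what silences the SyntaxWarning.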
    
    Fixes: dacf1d7a78bf ("kselftest: Add test to verify probe of devices from discoverable buses")
    
    Reviewed-by: Nícolas F. R. A. Prado <[email protected]>
    Signed-off-by: Alessandro Zanni <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kselftests: mm: fix wrong __NR_userfaultfd value [+ + +]

Author: Muhammad Usama Anjum <[email protected]>
Date:   Mon Sep 23 10:38:36 2024 +0500

    kselftests: mm: fix wrong __NR_userfaultfd value
    
    commit f30beffd977e98c33550bbeb6f278d157ff54844 upstream.
    
    grep -rnIF "#define __NR_userfaultfd"
    tools/include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    arch/x86/include/generated/uapi/asm/unistd_32.h:374:#define
    __NR_userfaultfd 374
    arch/x86/include/generated/uapi/asm/unistd_64.h:327:#define
    __NR_userfaultfd 323
    arch/x86/include/generated/uapi/asm/unistd_x32.h:282:#define
    __NR_userfaultfd (__X32_SYSCALL_BIT + 323)
    arch/arm/include/generated/uapi/asm/unistd-eabi.h:347:#define
    __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    arch/arm/include/generated/uapi/asm/unistd-oabi.h:359:#define
    __NR_userfaultfd (__NR_SYSCALL_BASE + 388)
    include/uapi/asm-generic/unistd.h:681:#define __NR_userfaultfd 282
    
    The number is dependent on the architecture. The above data shows that:
    x86     374
    x86_64  323
    
    The value of __NR_userfaultfd was changed to 282 when asm-generic/unistd.h
    was included.  This makes the test fail every time, as the correct number
    of this syscall on x86_64 is 323.  Fix this by including asm/unistd.h
    instead.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: a5c6bc590094 ("selftests/mm: remove local __NR_* definitions")
    Signed-off-by: Muhammad Usama Anjum <[email protected]>
    Reviewed-by: Shuah Khan <[email protected]>
    Reviewed-by: David Hildenbrand <[email protected]>
    Cc: John Hubbard <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ksmbd: add refcnt to ksmbd_conn struct [+ + +]

Author: Namjae Jeon <[email protected]>
Date:   Tue Sep 3 20:28:08 2024 +0900

    ksmbd: add refcnt to ksmbd_conn struct
    
    [ Upstream commit ee426bfb9d09b29987369b897fe9b6485ac2be27 ]
    
    When sending an oplock break request, opinfo->conn is used,
    but with multichannel an already freed ->conn can be used.
    This patch adds a reference count to the ksmbd_conn struct
    so that it is freed only when it is no longer used.
    
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ksmbd: fix warning: comparison of distinct pointer types lacks a cast [+ + +]

Author: Namjae Jeon <[email protected]>
Date:   Thu Sep 19 09:22:57 2024 +0900

    ksmbd: fix warning: comparison of distinct pointer types lacks a cast
    
    [ Upstream commit 289ebd9afeb94862d96c89217068943f1937df5b ]
    
    smb2pdu.c: In function ‘smb2_open’:
    ./include/linux/minmax.h:20:28: warning: comparison of distinct
    pointer types lacks a cast
       20 |  (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
          |                            ^~
    ./include/linux/minmax.h:26:4: note: in expansion of macro ‘__typecheck’
       26 |   (__typecheck(x, y) && __no_side_effects(x, y))
          |    ^~~~~~~~~~~
    ./include/linux/minmax.h:36:24: note: in expansion of macro ‘__safe_cmp’
       36 |  __builtin_choose_expr(__safe_cmp(x, y), \
          |                        ^~~~~~~~~~
    ./include/linux/minmax.h:45:19: note: in expansion of macro ‘__careful_cmp’
       45 | #define min(x, y) __careful_cmp(x, y, <)
          |                   ^~~~~~~~~~~~~
    /home/linkinjeon/git/smbd_work/ksmbd/smb2pdu.c:3713:27: note: in
    expansion of macro ‘min’
     3713 |     fp->durable_timeout = min(dh_info.timeout,
    
    Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

KVM: arm64: Fix kvm_has_feat*() handling of negative features [+ + +]

Author: Marc Zyngier <[email protected]>
Date:   Wed Oct 2 21:42:39 2024 +0100

    KVM: arm64: Fix kvm_has_feat*() handling of negative features
    
    commit a1d402abf8e3ff1d821e88993fc5331784fac0da upstream.
    
    Oliver reports that the kvm_has_feat() helper is not behaving as
    expected for negative features. On investigation, the main issue
    seems to be caused by the following construct:
    
     #define get_idreg_field(kvm, id, fld)                          \
            (id##_##fld##_SIGNED ?                                  \
             get_idreg_field_signed(kvm, id, fld) :                 \
             get_idreg_field_unsigned(kvm, id, fld))
    
    where one side of the expression evaluates as something signed,
    and the other as something unsigned. In retrospect, this is totally
    braindead, as the compiler converts this into an unsigned expression.
    When compared to something that is 0, the test is simply elided.
    
    Epic fail. Similar issue exists in the expand_field_sign() macro.
    
    The correct way to handle this is to choose between signed and unsigned
    comparisons, so that both sides of the ternary expression are of the
    same type (bool).
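
    The effect of the mixed-sign ternary can be modelled outside of C.
    Python integers are unbounded, so the sketch below emulates the C
    conversion to a 64-bit unsigned value explicitly. All names here
    (sign_extend, as_u64, buggy_has_feat, fixed_has_feat) and the 4-bit
    field width are illustrative assumptions, not the KVM macros.

```python
U64_MASK = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low bits of value as a signed field."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def as_u64(value):
    """Emulate converting a signed value to a 64-bit unsigned one in C."""
    return value & U64_MASK

def buggy_has_feat(raw_field, minimum):
    # The ternary's common type is unsigned, so the signed value is
    # converted before the comparison takes place.
    return as_u64(sign_extend(raw_field, 4)) >= as_u64(minimum)

def fixed_has_feat(raw_field, minimum):
    # The comparison is done on the signed value; only the bool leaves.
    return sign_extend(raw_field, 4) >= minimum

# A field of 0xF encodes -1 for a signed feature field.
print(buggy_has_feat(0xF, 0), fixed_has_feat(0xF, 0))
```

    In the unsigned form, a comparison against 0 is always true, which is
    why the compiler is free to drop the test altogether.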
    
    In order to keep the code readable (sort of), we introduce new
    comparison primitives taking an operator as a parameter, and
    rewrite the kvm_has_feat*() helpers in terms of these primitives.
    
    Fixes: c62d7a23b947 ("KVM: arm64: Add feature checking helpers")
    Reported-by: Oliver Upton <[email protected]>
    Tested-by: Oliver Upton <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

l2tp: free sessions using rcu [+ + +]

Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:08 2024 +0100

    l2tp: free sessions using rcu
    
    [ Upstream commit d17e89999574aca143dd4ede43e4382d32d98724 ]
    
    l2tp sessions may be accessed under an rcu read lock. Have them freed
    via rcu and remove the now unneeded synchronize_rcu when a session is
    removed.
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

l2tp: prevent possible tunnel refcount underflow [+ + +]

Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:10 2024 +0100

    l2tp: prevent possible tunnel refcount underflow
    
    [ Upstream commit 24256415d18695b46da06c93135f5b51c548b950 ]
    
    When a session is created, it sets a backpointer to its tunnel. When
    the session refcount drops to 0, l2tp_session_free drops the tunnel
    refcount if session->tunnel is non-NULL. However, session->tunnel is
    set in l2tp_session_create, before the tunnel refcount is incremented
    by l2tp_session_register, which leaves a small window where
    session->tunnel is non-NULL when the tunnel refcount hasn't been
    bumped.
    
    Moving the assignment to l2tp_session_register is trivial but
    l2tp_session_create calls l2tp_session_set_header_len which uses
    session->tunnel to get the tunnel's encap. Add an encap arg to
    l2tp_session_set_header_len to avoid using session->tunnel.
    
    If l2tpv3 sessions have colliding IDs, it is possible for
    l2tp_v3_session_get to race with l2tp_session_register and fetch a
    session which doesn't yet have session->tunnel set. Add a check for
    this case.
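
    The window described in the first paragraph can be sketched with a toy
    model. This is not the l2tp code: the classes and helper names below
    are hypothetical, and the model only tracks who has taken a reference
    on the tunnel.

```python
class Tunnel:
    def __init__(self):
        self.refcount = 1  # reference held by the tunnel's creator

class Session:
    def __init__(self):
        self.tunnel = None

def session_free(session):
    """Drop the tunnel reference if the session has a tunnel set."""
    if session.tunnel is not None:
        session.tunnel.refcount -= 1

def create_old(tunnel):
    session = Session()
    session.tunnel = tunnel  # backpointer set before a reference is taken
    return session

def create_new(tunnel):
    return Session()         # backpointer stays unset until registration

def register(session, tunnel):
    tunnel.refcount += 1     # take the reference ...
    session.tunnel = tunnel  # ... and set the backpointer together

def refcount_after_early_free(create):
    """Free a session that was created but never registered."""
    tunnel = Tunnel()
    session_free(create(tunnel))
    return tunnel.refcount

print(refcount_after_early_free(create_old))  # creator's reference was dropped
print(refcount_after_early_free(create_new))  # creator's reference is intact
```

    Setting the backpointer in the same step that takes the reference
    keeps the "tunnel is set" test in the free path in step with the
    reference count.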
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

l2tp: use rcu list add/del when updating lists [+ + +]

Author: James Chapman <[email protected]>
Date:   Mon Jul 29 16:38:11 2024 +0100

    l2tp: use rcu list add/del when updating lists
    
    [ Upstream commit 89b768ec2dfefaeba5212de14fc71368e12d06ba ]
    
    l2tp_v3_session_htable and tunnel->session_list are read by lockless
    getters using RCU. Use rcu list variants when adding or removing list
    items.
    
    Signed-off-by: James Chapman <[email protected]>
    Signed-off-by: Tom Parkin <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

leds: pca9532: Remove irrelevant blink configuration error message [+ + +]

Author: Bastien Curutchet <[email protected]>
Date:   Mon Aug 26 15:32:37 2024 +0200

    leds: pca9532: Remove irrelevant blink configuration error message
    
    commit 2aad93b6de0d874038d3d7958be05011284cd6b9 upstream.
    
    The update_hw_blink() function prints an error message when hardware is
    not able to handle a blink configuration on its own. IMHO, this isn't a
    'real' error since the software fallback is used afterwards.
    
    Remove the error messages to avoid flooding the logs with unnecessary
    messages.
    
    Cc: [email protected]
    Fixes: 48ca7f302cfc ("leds: pca9532: Use PWM1 for hardware blinking")
    Signed-off-by: Bastien Curutchet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

lib/buildid: harden build ID parsing logic [+ + +]

Author: Andrii Nakryiko <[email protected]>
Date:   Thu Aug 29 10:42:23 2024 -0700

    lib/buildid: harden build ID parsing logic
    
    commit 905415ff3ffb1d7e5afa62bacabd79776bd24606 upstream.
    
    Harden build ID parsing logic, adding explicit READ_ONCE() where it's
    important to have a consistent value read and validated just once.
    
    Also, as pointed out by Andi Kleen, we need to make sure that the entire
    ELF note is within the page bounds, so move the overflow check up and add
    an extra note_size bounds check.
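
    The kind of validation described here can be sketched in userspace.
    The parser below is an illustration, not the lib/buildid.c code: it
    assumes a little-endian note segment held in a bytes object, reads
    each header field once, and checks that the whole note fits in the
    buffer before touching its payload. The find_build_id() and align4()
    names and the 20-byte maximum are assumptions of this sketch.

```python
import struct

NT_GNU_BUILD_ID = 3
BUILD_ID_SIZE_MAX = 20                 # assumed upper bound for this sketch
NOTE_HDR = struct.Struct("<III")       # namesz, descsz, type

def align4(n):
    return (n + 3) & ~3

def find_build_id(notes):
    """Return the GNU build ID found in notes, or None."""
    offs = 0
    while offs + NOTE_HDR.size <= len(notes):
        # Read the header once and validate only these local copies.
        namesz, descsz, ntype = NOTE_HDR.unpack_from(notes, offs)
        name_offs = offs + NOTE_HDR.size
        desc_offs = name_offs + align4(namesz)
        next_offs = desc_offs + align4(descsz)
        if next_offs > len(notes):
            return None                # note would run past the buffer
        name = notes[name_offs:name_offs + namesz]
        if (ntype == NT_GNU_BUILD_ID and name == b"GNU\0"
                and 0 < descsz <= BUILD_ID_SIZE_MAX):
            return bytes(notes[desc_offs:desc_offs + descsz])
        offs = next_offs
    return None

note = NOTE_HDR.pack(4, 20, NT_GNU_BUILD_ID) + b"GNU\0" + bytes(range(20))
print(find_build_id(note).hex())
```

    Python integers cannot overflow, so the sketch needs no separate
    overflow check; in C the size arithmetic has to be checked as well.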
    
    The Fixes tag below points to the commit that moved this code into
    lib/buildid.c. The code was subsequently used in the perf subsystem,
    which exposed it to perf_event_open() users in v5.12+.
    
    Cc: [email protected]
    Reviewed-by: Eduard Zingerman <[email protected]>
    Reviewed-by: Jann Horn <[email protected]>
    Suggested-by: Andi Kleen <[email protected]>
    Fixes: bd7525dacd7e ("bpf: Move stack_map_get_build_id into lib")
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Linux: Linux 6.11.3 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Thu Oct 10 12:04:18 2024 +0200

    Linux 6.11.3
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Pavel Machek (CIP) <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Ronald Warsow <[email protected]>
    Tested-by: Markus Reichelt <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Justin M. Forbes <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Christian Heusel <[email protected]>
    Tested-by: Kexy Biscuit <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: kernelci.org bot <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mac802154: Fix potential RCU dereference issue in mac802154_scan_worker [+ + +]

Author: Jiawei Ye <[email protected]>
Date:   Tue Sep 24 06:58:05 2024 +0000

    mac802154: Fix potential RCU dereference issue in mac802154_scan_worker
    
    commit bff1709b3980bd7f80be6786f64cc9a9ee9e56da upstream.
    
    In the `mac802154_scan_worker` function, the `scan_req->type` field was
    accessed after the RCU read-side critical section was unlocked. According
    to RCU usage rules, this is illegal and can lead to unpredictable
    behavior, such as accessing memory that has been updated or causing
    use-after-free issues.
    
    This possible bug was identified using a static analysis tool developed
    by myself, specifically designed to detect RCU-related issues.
    
    To address this, the `scan_req->type` value is now stored in a local
    variable `scan_req_type` while still within the RCU read-side critical
    section. The `scan_req_type` is then used after the RCU lock is released,
    ensuring that the type value is safely accessed without violating RCU
    rules.
    
    Fixes: e2c3e6f53a7a ("mac802154: Handle active scanning")
    Cc: [email protected]
    Signed-off-by: Jiawei Ye <[email protected]>
    Acked-by: Miquel Raynal <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mailbox: ARM_MHU_V3 should depend on ARM64 [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Thu Aug 29 15:58:53 2024 +0200

    mailbox: ARM_MHU_V3 should depend on ARM64
    
    [ Upstream commit 0e4ed48292c55eeb0afab22f8930b556f17eaad2 ]
    
    The ARM MHUv3 controller is only present on ARM64 SoCs.  Hence add a
    dependency on ARM64, to prevent asking the user about this driver when
    configuring a kernel for a different architecture than ARM64.
    
    Fixes: ca1a8680b134b5e6 ("mailbox: arm_mhuv3: Add driver")
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Acked-by: Sudeep Holla <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: bcm2835: Fix timeout during suspend mode [+ + +]

Author: Stefan Wahren <[email protected]>
Date:   Wed Aug 21 23:40:44 2024 +0200

    mailbox: bcm2835: Fix timeout during suspend mode
    
    [ Upstream commit dc09f007caed3b2f6a3b6bd7e13777557ae22bfd ]
    
    During the noirq suspend phase the Raspberry Pi power driver suffers
    from firmware property timeouts. The reason is that the IRQ of the
    underlying BCM2835 mailbox is disabled and rpi_firmware_property_list()
    will always run into a timeout [1].
    
    Since the VideoCore side isn't considered a wakeup source, set the
    IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled
    during the suspend-resume cycle.
    
    [1]
    PM: late suspend of devices complete after 1.754 msecs
    WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128
     rpi_firmware_property_list+0x204/0x22c
    Firmware transaction 0x00028001 timeout
    Modules linked in:
    CPU: 0 PID: 438 Comm: bash Tainted: G         C         6.9.3-dirty #17
    Hardware name: BCM2835
    Call trace:
    unwind_backtrace from show_stack+0x18/0x1c
    show_stack from dump_stack_lvl+0x34/0x44
    dump_stack_lvl from __warn+0x88/0xec
    __warn from warn_slowpath_fmt+0x7c/0xb0
    warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c
    rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c
    rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0
    rpi_firmware_set_power from _genpd_power_off+0xe4/0x148
    _genpd_power_off from genpd_sync_power_off+0x7c/0x11c
    genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0
    genpd_finish_suspend from dpm_run_callback+0x78/0xd0
    dpm_run_callback from device_suspend_noirq+0xc0/0x238
    device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168
    dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac
    suspend_devices_and_enter from pm_suspend+0x254/0x2e4
    pm_suspend from state_store+0xa8/0xd4
    state_store from kernfs_fop_write_iter+0x154/0x1a0
    kernfs_fop_write_iter from vfs_write+0x12c/0x184
    vfs_write from ksys_write+0x78/0xc0
    ksys_write from ret_fast_syscall+0x0/0x54
    Exception stack(0xcc93dfa8 to 0xcc93dff0)
    [...]
    PM: noirq suspend of devices complete after 3095.584 msecs
    
    Link: https://github.com/raspberrypi/firmware/issues/1894
    Fixes: 0bae6af6d704 ("mailbox: Enable BCM2835 mailbox support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: rockchip: fix a typo in module autoloading [+ + +]

Author: Liao Chen <[email protected]>
Date:   Wed Aug 14 02:51:47 2024 +0000

    mailbox: rockchip: fix a typo in module autoloading
    
    [ Upstream commit e92d87c9c5d769e4cb1dd7c90faa38dddd7e52e3 ]
    
    MODULE_DEVICE_TABLE(of, rockchip_mbox_of_match) lets the module be
    properly autoloaded based on the alias from the of_device_id table. The
    argument should be 'rockchip_mbox_of_match' instead of
    'rockchp_mbox_of_match', so fix the typo.
    
    Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")
    Signed-off-by: Liao Chen <[email protected]>
    Reviewed-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

media: i2c: ar0521: Use cansleep version of gpiod_set_value() [+ + +]

Author: Alexander Shiyan <[email protected]>
Date:   Thu Aug 29 08:48:49 2024 +0300

    media: i2c: ar0521: Use cansleep version of gpiod_set_value()
    
    commit bee1aed819a8cda47927436685d216906ed17f62 upstream.
    
    If the reset GPIO is provided by an I2C port expander, we must use the
    *_cansleep() variant of the GPIO functions.
    This was not done in the ar0521_power_on()/ar0521_power_off() functions.
    Let's fix that.
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 11 at drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x74/0x7c
    Modules linked in:
    CPU: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.10.0 #53
    Hardware name: Diasom DS-RK3568-SOM-EVB (DT)
    Workqueue: events_unbound deferred_probe_work_func
    pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : gpiod_set_value+0x74/0x7c
    lr : ar0521_power_on+0xcc/0x290
    sp : ffffff8001d7ab70
    x29: ffffff8001d7ab70 x28: ffffff80027dcc90 x27: ffffff8003c82000
    x26: ffffff8003ca9250 x25: ffffffc080a39c60 x24: ffffff8003ca9088
    x23: ffffff8002402720 x22: ffffff8003ca9080 x21: ffffff8003ca9088
    x20: 0000000000000000 x19: ffffff8001eb2a00 x18: ffffff80efeeac80
    x17: 756d2d6332692f30 x16: 0000000000000000 x15: 0000000000000000
    x14: ffffff8001d91d40 x13: 0000000000000016 x12: ffffffc080e98930
    x11: ffffff8001eb2880 x10: 0000000000000890 x9 : ffffff8001d7a9f0
    x8 : ffffff8001d92570 x7 : ffffff80efeeac80 x6 : 000000003fc6e780
    x5 : ffffff8001d91c80 x4 : 0000000000000002 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     gpiod_set_value+0x74/0x7c
     ar0521_power_on+0xcc/0x290
    ...
    
    Signed-off-by: Alexander Shiyan <[email protected]>
    Fixes: 852b50aeed15 ("media: On Semi AR0521 sensor driver")
    Cc: [email protected]
    Acked-by: Krzysztof Hałasa <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: imx335: Fix reset-gpio handling [+ + +]

Author: Umang Jain <[email protected]>
Date:   Fri Aug 30 11:41:52 2024 +0530

    media: imx335: Fix reset-gpio handling
    
    commit 99d30e2fdea4086be4e66e2deb10de854b547ab8 upstream.
    
    Rectify the logical value of reset-gpio so that it is set to
    0 (disabled) during power-on and to 1 (enabled) during power-off.
    
    Set the reset-gpio to GPIO_OUT_HIGH at initialization time to make
    sure it starts off in reset. Also drop the "Set XCLR" comment which
    is not-so-informative.
    
    The existing usage of imx335 had reset-gpios polarity inverted
    (GPIO_ACTIVE_HIGH) in their device-tree sources. With this patch
    included, those DTS will not be able to stream imx335 anymore. The
    reset-gpio polarity will need to be rectified in the device-tree
    sources as shown in [1] example, in order to get imx335 functional
    again (as it remains in reset prior to this fix).
    
    Cc: [email protected]
    Fixes: 45d19b5fb9ae ("media: i2c: Add imx335 camera sensor driver")
    Reviewed-by: Laurent Pinchart <[email protected]>
    Link: https://lore.kernel.org/linux-media/[email protected]/
    Signed-off-by: Umang Jain <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: ov5675: Fix power on/off delay timings [+ + +]

Author: Bryan O'Donoghue <[email protected]>
Date:   Sat Jul 13 23:33:29 2024 +0100

    media: ov5675: Fix power on/off delay timings
    
    commit 719ec29fceda2f19c833d2784b1574638320400f upstream.
    
    The ov5675 specification says that the gap between XSHUTDN deassert and the
    first I2C transaction should be a minimum of 8192 XVCLK cycles.
    
    Right now we use a usleep_range() that gives a sleep time of between about
    430 and 860 microseconds.
    
    On the Lenovo X13s we have observed that in about 1/20 cases the current
    timing is too tight and we start transacting before the ov5675's reset
    cycle completes, leading to I2C bus transaction failures.
    
    The reset race is sometimes triggered at initial chip probe but more
    usually on a subsequent power-off/power-on cycle, e.g.
    
    [   71.451662] ov5675 24-0010: failed to write reg 0x0103. error = -5
    [   71.451686] ov5675 24-0010: failed to set plls
    
    The current quiescence period we have is too tight. Instead of expressing
    the post reset delay in terms of the current XVCLK this patch converts the
    power-on and power-off delays to the maximum theoretical delay @ 6 MHz with
    an additional buffer.
    
    1.365 milliseconds on the power-on path is 1.5 milliseconds with grace.
    85.3 microseconds on the power-off path is 90 microseconds with grace.
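
    The arithmetic behind these figures can be checked with a short helper.
    This is a sketch under stated assumptions: cycles_to_us() is a
    hypothetical name, the 512-cycle power-off count is inferred from
    85.3 us x 6 MHz, and the 19.2 MHz rate is inferred from the "about 430
    microseconds" figure above. Only the 8192-cycle requirement and the
    6 MHz worst case come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a number of clock cycles at rate `hz` into microseconds,
 * rounding up so that the resulting delay is never too short.
 */
uint32_t cycles_to_us(uint32_t cycles, uint32_t hz)
{
	assert(hz != 0);
	return (uint32_t)(((uint64_t)cycles * 1000000u + hz - 1) / hz);
}
```

    With these inputs, 8192 cycles at 6 MHz round up to 1366 us (hence
    1.5 ms with grace), the inferred 512 cycles at 6 MHz round up to 86 us
    (hence 90 us with grace), and 8192 cycles at the inferred 19.2 MHz give
    427 us, in line with the old lower bound of about 430 us.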
    
    Fixes: 49d9ad719e89 ("media: ov5675: add device-tree support and support runtime PM")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Tested-by: Johan Hovold <[email protected]>
    Reviewed-by: Quentin Schulz <[email protected]>
    Tested-by: Quentin Schulz <[email protected]> # RK3399 Puma with
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: qcom: camss: Fix ordering of pm_runtime_enable [+ + +]

Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:03 2024 +0100

    media: qcom: camss: Fix ordering of pm_runtime_enable
    
    commit a151766bd3688f6803e706c6433a7c8d3c6a6a94 upstream.
    
    pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
    pm_runtime_resume_and_get().
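
    The ordering problem can be illustrated with a toy user-space model.
    Everything below is hypothetical (struct toy_dev and the toy_* helpers
    are not kernel APIs); the model only assumes that a resume request on a
    device whose runtime PM is still disabled is rejected with -EACCES,
    which is -13 on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device's runtime PM state. */
struct toy_dev {
	bool rpm_enabled; /* set once runtime PM has been enabled */
	int usage;        /* number of successful "get" calls */
};

void toy_pm_runtime_enable(struct toy_dev *d)
{
	d->rpm_enabled = true;
}

/* Models a resume-and-get call: refused while runtime PM is disabled. */
int toy_pm_runtime_resume_and_get(struct toy_dev *d)
{
	if (!d->rpm_enabled)
		return -EACCES;
	assert(d->usage >= 0);
	d->usage++;
	return 0;
}
```

    Calling the "get" helper before the "enable" helper fails in this model,
    while enabling first lets the same call succeed.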
    
    This is a basic race condition that doesn't show up for most users, so it
    is not widely reported. If you blacklist qcom-camss in modules.d and then
    modprobe the module post-boot, it is possible to reproduce this error
    reliably.
    
    The kernel log for this error looks like this:
    
    qcom-camss ac5a000.camss: Failed to power up pipeline: -13
    
    Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Konrad Dybcio <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: qcom: camss: Remove use_count guard in stop_streaming [+ + +]

Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:02 2024 +0100

    media: qcom: camss: Remove use_count guard in stop_streaming
    
    commit 25f18cb1b673220b76a86ebef8e7fb79bd303b27 upstream.
    
    The use_count check was introduced so that multiple concurrent Raw Data
    Interfaces (RDIs) could be driven by different virtual channels (VCs) on
    the CSIPHY input driving the video pipeline.
    
    This is an invalid use of use_count, though, as use_count pertains to the
    number of times a video entity has been opened by user-space, not the
    number of active streams.
    
    If use_count and the stream-on count don't agree, then stop_streaming()
    will break, as is currently the case and has become apparent when using
    CAMSS with libcamera's released softisp 0.3.
    
    The use of use_count like this is a bit hacky and right now breaks regular
    usage of CAMSS for a single stream case. Stopping qcam results in the splat
    below, and then it cannot be started again and any attempt to do so fails
    with -EBUSY.
    
    [ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    ...
    [ 1265.510630] Call trace:
    [ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    [ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
    [ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
    [ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
    [ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
    [ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
    [ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
    [ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
    ...
    [ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
    [ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
    [ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st
    
    One CAMSS specific way to handle multiple VCs on the same RDI might be:
    
    - Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
    - The video buffers are already associated with msm_vfeN_rdiX so
      release video buffers when told to do so by stop_streaming.
    - Only release the power-domains for the CSIPHY, CSID and VFE when
      their internal refcounts drop.
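
    The scheme outlined in the list above could look roughly like the
    following user-space sketch. It is not CAMSS code: struct shared_block,
    struct rdi_stream and the helper functions are hypothetical, and a
    single shared block stands in for CSIPHY, CSID and VFE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one shared block (CSIPHY, CSID or VFE). */
struct shared_block {
	int refcount; /* number of active streams using the block */
	bool powered;
};

void block_get(struct shared_block *b)
{
	if (b->refcount++ == 0)
		b->powered = true; /* first user powers the block up */
}

void block_put(struct shared_block *b)
{
	assert(b->refcount > 0);
	if (--b->refcount == 0)
		b->powered = false; /* last user powers the block down */
}

/* Hypothetical per-RDI stream state. */
struct rdi_stream {
	struct shared_block *vfe;
	int queued_buffers;
	bool streaming;
};

void stream_start(struct rdi_stream *s)
{
	if (s->streaming)
		return;
	block_get(s->vfe);
	s->streaming = true;
}

/*
 * Always hand back the queued buffers when asked to stop; only the
 * shared block's power state depends on the reference count.
 */
int stream_stop(struct rdi_stream *s)
{
	int released = s->queued_buffers;

	s->queued_buffers = 0;
	if (s->streaming) {
		s->streaming = false;
		block_put(s->vfe);
	}
	return released;
}
```

    The point of the split is that buffer release is unconditional, while
    the shared hardware stays powered until its last user stops.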
    
    Either way, refusing to release video buffers based on use_count is
    erroneous and should be reverted. The silicon enabling code for selecting
    VCs is perfectly fine. It's a "known missing feature" that concurrent VCs
    won't work with CAMSS right now.
    
    Initial testing with this code didn't show an error, but SoftISP and
    "real" usage with Google Hangouts break the upstream code pretty quickly.
    We need to do a partial revert and take another pass at VCs.
    
    This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
    Pipeline starting and stopping for multiple virtual channels")
    
    Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: sun4i_csi: Implement link validate for sun4i_csi subdev [+ + +]

Author: Laurent Pinchart <[email protected]>
Date:   Wed Jun 19 02:46:16 2024 +0300

    media: sun4i_csi: Implement link validate for sun4i_csi subdev
    
    commit 2dc5d5d401f5c6cecd97800ffef82e8d17d228f0 upstream.
    
    The sun4i_csi driver doesn't implement link validation for the subdev it
    registers, leaving the link between the subdev and its source
    unvalidated. Fix it, using the v4l2_subdev_link_validate() helper.
    
    Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Acked-by: Chen-Yu Tsai <[email protected]>
    Reviewed-by: Tomi Valkeinen <[email protected]>
    Acked-by: Sakari Ailus <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags [+ + +]

Author: Hans Verkuil <[email protected]>
Date:   Wed Aug 7 09:22:10 2024 +0200

    media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
    
    commit 599f6899051cb70c4e0aa9fd591b9ee220cb6f14 upstream.
    
    The cec_msg_set_reply_to() helper function never zeroed the
    struct cec_msg flags field. This can cause unexpected behavior
    if flags was not initialized to begin with.
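
    The effect of the fix can be sketched with a reduced message structure.
    This is an illustration, not the UAPI header: struct toy_cec_msg and
    toy_set_reply_to() are hypothetical and keep only the fields needed to
    show the idea. The header byte layout (initiator in the high nibble,
    destination in the low nibble) follows the CEC convention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct cec_msg. */
struct toy_cec_msg {
	uint8_t msg[16];
	uint8_t reply;
	uint32_t timeout;
	uint32_t flags;
};

/*
 * Prepare `msg` as a reply to `orig`: swap initiator and destination
 * and clear every field that must not carry stale stack contents.
 */
void toy_set_reply_to(struct toy_cec_msg *msg, const struct toy_cec_msg *orig)
{
	uint8_t initiator = orig->msg[0] >> 4;
	uint8_t destination = orig->msg[0] & 0xf;

	assert(msg != orig);
	msg->msg[0] = (uint8_t)((destination << 4) | initiator);
	msg->reply = 0;
	msg->timeout = 0;
	msg->flags = 0; /* the field the helper previously left untouched */
}
```

    A caller that declares the reply message on the stack without
    initializing it then still ends up with flags set to zero.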
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media")
    Cc: <[email protected]>
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: venus: fix use after free bug in venus_remove due to race condition [+ + +]

Author: Zheng Wang <[email protected]>
Date:   Tue Jun 18 14:55:59 2024 +0530

    media: venus: fix use after free bug in venus_remove due to race condition
    
    commit c5a85ed88e043474161bbfe54002c89c1cb50ee2 upstream.
    
    In venus_probe, core->work is bound to venus_sys_error_handler, which is
    used to handle errors. The code uses core->sys_err_done to synchronize
    the work. The core->work is started in venus_event_notify.
    
    If we call venus_remove, there might be unfinished work. The possible
    sequence is as follows:
    
    CPU0                 CPU1

                         |venus_sys_error_handler
    venus_remove         |
    hfi_destroy          |
    venus_hfi_destroy    |
    kfree(hdev);         |
                         |hfi_reinit
                         |venus_hfi_queues_reinit
                         |//use hdev
    
    Fix it by canceling the work in venus_remove.
    
    Cc: [email protected]
    Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
    Signed-off-by: Zheng Wang <[email protected]>
    Signed-off-by: Dikshita Agarwal <[email protected]>
    Signed-off-by: Stanimir Varbanov <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: videobuf2: Drop minimum allocation requirement of 2 buffers [+ + +]

Author: Laurent Pinchart <[email protected]>
Date:   Mon Aug 26 02:24:49 2024 +0300

    media: videobuf2: Drop minimum allocation requirement of 2 buffers
    
    commit e5700c9037727d5a69a677d6dba25010b485d65b upstream.
    
    When introducing the ability for drivers to indicate the minimum number
    of buffers they require an application to allocate, commit 6662edcd32cc
    ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue
    structure") also introduced a global minimum of 2 buffers. It turns out
    this breaks the Renesas R-Car VSP test suite, where a test that
    allocates a single buffer fails when two buffers are used.
    
    It is debatable whether test suite failures without failures in
    production use cases should be considered a regression, but
    operation with a single buffer is a valid use case. While full frame
    rate can't be maintained, memory-to-memory devices can still be used
    with a decent efficiency, and requiring applications to allocate
    multiple buffers for single-shot use cases with capture devices would
    just waste memory.
    
    For those reasons, fix the regression by dropping the global minimum of
    buffers. Individual drivers can still set their own minimum.
    
    Fixes: 6662edcd32cc ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue structure")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Reviewed-by: Hans Verkuil <[email protected]>
    Acked-by: Tomasz Figa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

memory: tegra186-emc: drop unused to_tegra186_emc() [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Mon Aug 12 14:30:55 2024 +0200

    memory: tegra186-emc: drop unused to_tegra186_emc()
    
    commit 67dd9e861add38755a7c5d29e25dd0f6cb4116ab upstream.
    
    to_tegra186_emc() is not used, which triggers an error in W=1 builds:
    
      tegra186-emc.c:38:36: error: unused function 'to_tegra186_emc' [-Werror,-Wunused-function]
    
    Fixes: 9a38cb27668e ("memory: tegra: Add interconnect support for DRAM scaling in Tegra234")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm, slub: avoid zeroing kmalloc redzone [+ + +]

Author: Peng Fan <[email protected]>
Date:   Thu Aug 29 11:29:11 2024 +0800

    mm, slub: avoid zeroing kmalloc redzone
    
    commit 59090e479ac78ae18facd4c58eb332562a23020e upstream.
    
    Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra
    allocated kmalloc space than requested"), setting orig_size treats
    the wasted space (object_size - orig_size) as a redzone. However with
    init_on_free=1 we clear the full object->size, including the redzone.
    
    Additionally we clear the object metadata, including the stored orig_size,
    making it zero, which makes check_object() treat the whole object as a
    redzone.
    
    These issues lead to the following BUG report with "slub_debug=FUZ
    init_on_free=1":
    
    [    0.000000] =============================================================================
    [    0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten
    [    0.000000] -----------------------------------------------------------------------------
    [    0.000000]
    [    0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc
    [    0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc
    [    0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff)
    [    0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8
    [    0.000000]
    [    0.000000] Redzone  ffff000010032850: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Object   ffff000010032858: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Redzone  ffff000010032860: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Padding  ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00              ............
    [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144
    [    0.000000] Hardware name: NXP i.MX95 19X19 board (DT)
    [    0.000000] Call trace:
    [    0.000000]  dump_backtrace+0x90/0xe8
    [    0.000000]  show_stack+0x18/0x24
    [    0.000000]  dump_stack_lvl+0x74/0x8c
    [    0.000000]  dump_stack+0x18/0x24
    [    0.000000]  print_trailer+0x150/0x218
    [    0.000000]  check_object+0xe4/0x454
    [    0.000000]  free_to_partial_list+0x2f8/0x5ec
    
    To address the issue, use orig_size to clear only the used area, and
    restore the value of orig_size after clearing the remaining area.
    
    When CONFIG_SLUB_DEBUG is not defined, get_orig_size() directly returns
    s->object_size. So when using memset to init the area, the size can
    simply be orig_size, as orig_size equals object_size when
    CONFIG_SLUB_DEBUG is not enabled. And orig_size can never be bigger than
    object_size.
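
    The first half of the fix (zeroing only the requested size) can be
    sketched with a toy object layout. This is not SLUB code: the toy_*
    helpers are hypothetical, the object is a plain byte array, and the
    restoring of the stored orig_size metadata is not modelled. Only the
    0xcc redzone pattern is taken from the BUG report above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REDZONE_BYTE 0xcc /* pattern shown in the report above */

/* Fill the requested bytes with a dummy payload and the slack with 0xcc. */
void toy_alloc_init(unsigned char *obj, size_t object_size, size_t orig_size)
{
	assert(orig_size <= object_size);
	memset(obj, 0xa5, orig_size);
	memset(obj + orig_size, REDZONE_BYTE, object_size - orig_size);
}

/* Zero on free: clear only the bytes the caller asked for. */
void toy_init_on_free(unsigned char *obj, size_t object_size, size_t orig_size)
{
	assert(orig_size <= object_size);
	memset(obj, 0, orig_size);
}

/* Return 1 if every slack byte still holds the redzone pattern. */
int toy_redzone_intact(const unsigned char *obj, size_t object_size,
		       size_t orig_size)
{
	size_t i;

	for (i = orig_size; i < object_size; i++)
		if (obj[i] != REDZONE_BYTE)
			return 0;
	return 1;
}
```

    Zeroing the full object_size instead of orig_size would wipe the slack
    bytes, which is what a redzone check then reports as an overwrite.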
    
    Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
    Cc: <[email protected]>
    Reviewed-by: Feng Tang <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Signed-off-by: Peng Fan <[email protected]>
    Signed-off-by: Vlastimil Babka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/filemap: fix filemap_get_folios_contig THP panic [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:17 2024 -0700

    mm/filemap: fix filemap_get_folios_contig THP panic
    
    commit c225c4f6056b46a8a5bf2ed35abf17a2d6887691 upstream.
    
    Patch series "memfd-pin huge page fixes".
    
    Fix multiple bugs that occur when using memfd_pin_folios with hugetlb
    pages and THP.  The hugetlb bugs only bite when the page is not yet
    faulted in when memfd_pin_folios is called.  The THP bug bites when the
    starting offset passed to memfd_pin_folios is not huge page aligned.  See
    the commit messages for details.
    
    
    This patch (of 5):
    
    memfd_pin_folios on memory backed by THP panics if the requested start
    offset is not huge page aligned:
    
    BUG: kernel NULL pointer dereference, address: 0000000000000036
    RIP: 0010:filemap_get_folios_contig+0xdf/0x290
    RSP: 0018:ffffc9002092fbe8 EFLAGS: 00010202
    RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002
    
    The fault occurs here, because xas_load returns a folio with value 2:
    
        filemap_get_folios_contig()
            for (folio = xas_load(&xas); folio && xas.xa_index <= end;
                            folio = xas_next(&xas)) {
                    ...
                    if (!folio_try_get(folio))   <-- BOOM
    
    "2" is an xarray sibling entry.  We get it because memfd_pin_folios does
    not round the indices passed to filemap_get_folios_contig to huge page
    boundaries for THP, so we load from the middle of a huge page range and
    see a sibling.  (It does round for hugetlbfs, at the is_file_hugepages
    test.)
    
    To fix, if the folio is a sibling, then return the next index as the
    starting point for the next call to filemap_get_folios_contig.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: Vivek Kasireddy <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/gup: fix memfd_pin_folios alloc race panic [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:21 2024 -0700

    mm/gup: fix memfd_pin_folios alloc race panic
    
    commit ce645b9fdc78ec5d28067286e92871ddae6817d5 upstream.
    
    If memfd_pin_folios tries to create a hugetlb page, but someone else
    already did, then folio gets the value -EEXIST here:
    
            folio = memfd_alloc_folio(memfd, start_idx);
            if (IS_ERR(folio)) {
                    ret = PTR_ERR(folio);
                    if (ret != -EEXIST)
                            goto err;
    
    then on the next trip through the "while start_idx" loop we panic here:
    
            if (folio) {
                    folio_put(folio);
    
    To fix, set the folio to NULL on error.
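
    Why the stale value is dangerous can be shown with a user-space
    imitation of the kernel's error-pointer convention. The toy_* helpers
    below are hypothetical re-implementations for illustration, and
    toy_handle_alloc_result() is not the code from mm/gup.c; it only mirrors
    the shape of the snippets quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/*
 * Return the pointer that the next loop iteration would see. An
 * error pointer is not NULL, so a later "if (folio)" test would pass
 * and a put on it would touch a bogus address unless it is reset.
 */
void *toy_handle_alloc_result(void *folio, int *ret)
{
	assert(ret != NULL);
	*ret = 0;
	if (toy_is_err(folio)) {
		*ret = (int)toy_ptr_err(folio);
		folio = NULL; /* reset so the next iteration skips the put */
	}
	return folio;
}
```

    In this model an -EEXIST result is recorded in ret and the pointer
    handed to the next iteration is NULL, so the "if (folio)" branch is
    skipped.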
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/gup: fix memfd_pin_folios hugetlb page allocation [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:20 2024 -0700

    mm/gup: fix memfd_pin_folios hugetlb page allocation
    
    commit 9289f020da47ef04b28865589eeee3d56d4bafea upstream.
    
    When memfd_pin_folios -> memfd_alloc_folio creates a hugetlb page, the
    index is wrong.  The subsequent call to filemap_get_folios_contig thus
    cannot find it, and fails, and memfd_pin_folios loops forever.  To fix,
    adjust the index for the huge_page_order.
    
    memfd_alloc_folio also forgets to unlock the folio, so the next touch of
    the page calls hugetlb_fault which blocks forever trying to take the lock.
    Unlock it.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/hugetlb: fix memfd_pin_folios free_huge_pages leak [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:18 2024 -0700

    mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
    
    commit c56b6f3d801d7ec8965993342bdd9e2972b6cb8e upstream.
    
    memfd_pin_folios followed by unpin_folios fails to restore free_huge_pages
    if the pages were not already faulted in, because the folio refcount for
    pages created by memfd_alloc_folio never goes to 0.  memfd_pin_folios
    needs another folio_put to undo the folio_try_get below:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            folio_ref_unfreeze(folio, 1);    ; adds 1 refcount
      folio_try_get()                        ; adds 1 refcount
      hugetlb_add_to_page_cache()            ; adds 512 refcount (on x86)
    
    With the fix, after memfd_pin_folios + unpin_folios, the refcount for the
    (unfaulted) page is 512, which is correct, as the refcount for a faulted
    unpinned page is 513.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Tue Sep 3 07:25:19 2024 -0700

    mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
    
    commit 26a8ea80929c518bdec5e53a5776f95919b7c88e upstream.
    
    memfd_pin_folios followed by unpin_folios leaves resv_huge_pages elevated
    if the pages were not already faulted in.  During a normal page fault,
    resv_huge_pages is consumed here:
    
    hugetlb_fault()
      alloc_hugetlb_folio()
        dequeue_hugetlb_folio_vma()
          dequeue_hugetlb_folio_nodemask()
            dequeue_hugetlb_folio_node_exact()
              free_huge_pages--
          resv_huge_pages--
    
    During memfd_pin_folios, the page is created by calling
    alloc_hugetlb_folio_nodemask instead of alloc_hugetlb_folio, and
    resv_huge_pages is not modified:
    
    memfd_alloc_folio()
      alloc_hugetlb_folio_nodemask()
        dequeue_hugetlb_folio_nodemask()
          dequeue_hugetlb_folio_node_exact()
            free_huge_pages--
    
    alloc_hugetlb_folio_nodemask has other callers that must not modify
    resv_huge_pages.  Therefore, to fix, define an alternate version of
    alloc_hugetlb_folio_nodemask for this call site that adjusts
    resv_huge_pages.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Acked-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm/hugetlb: simplify refs in memfd_alloc_folio [+ + +]

Author: Steve Sistare <[email protected]>
Date:   Wed Sep 4 12:41:08 2024 -0700

    mm/hugetlb: simplify refs in memfd_alloc_folio
    
    commit dc677b5f3765cfd0944c8873d1ea57f1a3439676 upstream.
    
    The folio_try_get in memfd_alloc_folio is not necessary.  Delete it, and
    delete the matching folio_put in memfd_pin_folios.  This also avoids
    leaking a ref if the memfd_alloc_folio call to hugetlb_add_to_page_cache
    fails.  That error path is also broken in a second way -- when its
    folio_put causes the ref to become 0, it will implicitly call
    free_huge_folio, but then the path *explicitly* calls free_huge_folio.
    Delete the latter.
    
    This is a continuation of the fix
      "mm/hugetlb: fix memfd_pin_folios free_huge_pages leak"
    
    [[email protected]: remove explicit call to free_huge_folio(), per Matthew]
      Link: https://lkml.kernel.org/r/[email protected]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
    Signed-off-by: Steve Sistare <[email protected]>
    Suggested-by: Vivek Kasireddy <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: Muchun Song <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: krealloc: consider spare memory for __GFP_ZERO [+ + +]

Author: Danilo Krummrich <[email protected]>
Date:   Tue Aug 13 00:34:34 2024 +0200

    mm: krealloc: consider spare memory for __GFP_ZERO
    
    commit 1a83a716ec233990e1fd5b6fbb1200ade63bf450 upstream.
    
    As long as krealloc() is called with __GFP_ZERO consistently, starting
    with the initial memory allocation, __GFP_ZERO should be fully honored.
    
    However, if for an existing allocation krealloc() is called with a
    decreased size, it is not ensured that the spare portion of the allocation is
    zeroed.  Thus, if krealloc() is subsequently called with a larger size
    again, __GFP_ZERO can't be fully honored, since we don't know the previous
    size, but only the bucket size.
    
    Example:
    
            buf = kzalloc(64, GFP_KERNEL);
            memset(buf, 0xff, 64);
    
            buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
    
            /* After this call the last 16 bytes are still 0xff. */
            buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
    
    Fix this, by explicitly setting spare memory to zero, when shrinking an
    allocation with __GFP_ZERO flag set or init_on_alloc enabled.
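    
    A user-space sketch of the idea, not the kernel implementation: the
    names toy_shrink() and toy_dirty_bytes() and the fixed 64-byte
    TOY_BUCKET are hypothetical stand-ins for a slab bucket.  It only
    illustrates why clearing the spare bytes at shrink time keeps a later
    grow within the same bucket zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUCKET 64	/* hypothetical fixed bucket size */

/*
 * Model of shrinking an allocation that stays in the same bucket.
 * With zero_spare set, the bytes between new_size and the end of the
 * bucket are cleared, which is what the fix describes for __GFP_ZERO.
 */
static void toy_shrink(unsigned char *bucket, size_t new_size, int zero_spare)
{
	assert(new_size <= TOY_BUCKET);
	if (zero_spare)
		memset(bucket + new_size, 0, TOY_BUCKET - new_size);
}

/*
 * Count non-zero bytes in [from, TOY_BUCKET), i.e. stale data that a
 * later grow back to the bucket size would hand to the caller.
 */
static size_t toy_dirty_bytes(const unsigned char *bucket, size_t from)
{
	size_t i, n = 0;

	assert(from <= TOY_BUCKET);
	for (i = from; i < TOY_BUCKET; i++)
		if (bucket[i])
			n++;
	return n;
}
```

    Filling a bucket with 0xff and shrinking to 48 without zero_spare
    leaves 16 stale bytes; shrinking with zero_spare leaves none.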
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Danilo Krummrich <[email protected]>
    Acked-by: Vlastimil Babka <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Cc: Christoph Lameter <[email protected]>
    Cc: Hyeonggon Yoo <[email protected]>
    Cc: Joonsoo Kim <[email protected]>
    Cc: Pekka Enberg <[email protected]>
    Cc: Roman Gushchin <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: z3fold: deprecate CONFIG_Z3FOLD [+ + +]

Author: Yosry Ahmed <[email protected]>
Date:   Mon Oct 7 19:21:16 2024 +0000

    mm: z3fold: deprecate CONFIG_Z3FOLD
    
    [ Upstream commit 7a2369b74abf76cd3e54c45b30f6addb497f831b ]
    
    The z3fold compressed pages allocator is rarely used; most users use
    zsmalloc.  The only disadvantage of zsmalloc in comparison is the
    dependency on MMU, and zbud is a more common option for !MMU as it was the
    default zswap allocator for a long time.
    
    Historically, zsmalloc had worse latency than zbud and z3fold but offered
    better memory savings.  This is no longer the case as shown by a simple
    recent analysis [1].  That analysis showed that z3fold does not have any
    advantage over zsmalloc or zbud considering both performance and memory
    usage.  In a kernel build test on tmpfs in a limited cgroup, z3fold took
    3% more time and used 1.8% more memory.  The latency of zswap_load() was
    7% higher, and that of zswap_store() was 10% higher.  Zsmalloc is better
    in all metrics.
    
    Moreover, z3fold apparently has latent bugs, which were made noticeable by
    a recent soft lockup bug report with z3fold [2].  Switching to zsmalloc
    not only fixed the problem, but also reduced the swap usage from 6~8G to
    1~2G.  Other users have also reported being bitten by mistakenly enabling
    z3fold.
    
    Other than hurting users, z3fold is repeatedly causing wasted engineering
    effort.  Apart from investigating the above bug, it came up in multiple
    development discussions (e.g.  [3]) as something we need to handle, when
    there aren't any legit users (at least not intentionally).
    
    The natural course of action is to deprecate z3fold, and remove in a few
    cycles if no objections are raised from active users.  Next on the list
    should be zbud, as it offers marginal latency gains at the cost of huge
    memory waste when compared to zsmalloc.  That one will need to wait until
    zsmalloc does not depend on MMU.
    
    Rename the user-visible config option from CONFIG_Z3FOLD to
    CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
    prompt with explanation during make oldconfig.  Also, remove
    CONFIG_Z3FOLD=y from defconfigs.
    
    [1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com/
    [2]https://lore.kernel.org/lkml/[email protected]/
    [3]https://lore.kernel.org/lkml/CAJD7tkbnmeVugfunffSovJf9FAgy9rhBVt_tx=nxUveLUfqVsA@mail.gmail.com/
    
    [[email protected]: deprecate ZSWAP_ZPOOL_DEFAULT_Z3FOLD as well]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Yosry Ahmed <[email protected]>
    Signed-off-by: Arnd Bergmann <[email protected]>
    Acked-by: Chris Down <[email protected]>
    Acked-by: Nhat Pham <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Acked-by: Vitaly Wool <[email protected]>
    Acked-by: Christoph Hellwig <[email protected]>
    Cc: Aneesh Kumar K.V <[email protected]>
    Cc: Christophe Leroy <[email protected]>
    Cc: Huacai Chen <[email protected]>
    Cc: Miaohe Lin <[email protected]>
    Cc: Michael Ellerman <[email protected]>
    Cc: Naveen N. Rao <[email protected]>
    Cc: Nicholas Piggin <[email protected]>
    Cc: Sergey Senozhatsky <[email protected]>
    Cc: WANG Xuerui <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    (cherry picked from commit 7a2369b74abf76cd3e54c45b30f6addb497f831b)
    Signed-off-by: Yosry Ahmed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Added cond_resched() to crdump collection [+ + +]

Author: Mohamed Khalfella <[email protected]>
Date:   Wed Sep 4 22:02:48 2024 -0600

    net/mlx5: Added cond_resched() to crdump collection
    
    [ Upstream commit ec793155894140df7421d25903de2e6bc12c695b ]
    
    Collecting crdump involves reading vsc registers from pci config space
    of mlx device, which can take a long time to complete. This might result
    in starving other threads waiting to run on the cpu.
    
    Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:
    
    - mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
    - mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
      read the entire 1310716 bytes. It was called 53813 times because
      there are jumps in read_addr.
    - On average mlx5_vsc_gw_read_fast() took 35284.4ns.
    - In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
      The average time for each call was 17548.3ns. In some instances
      vsc_read() was called more than one time when the flag was not set.
      As expected the thread released the cpu after 16 iterations in
      mlx5_vsc_wait_on_flag().
    - Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.
    
    It was seen in the field that crdump can take more than 5 seconds to
    complete. During that time mlx5_vsc_wait_on_flag() did not release the
    cpu because it did not complete 16 iterations. It is believed that pci
    config reads were slow. Adding cond_resched() every 128 register reads
    improves the situation. In the common case, the crdump takes ~1.8989s,
    the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread
    yields the cpu every ~18.0ms.
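    
    The common-case figures can be reproduced with a small calculation.
    This is only an arithmetic sketch using the lab numbers quoted above;
    the function names are hypothetical and nothing here is driver code.

```c
#include <assert.h>

/* Total time spent in register reads, in nanoseconds. */
static double crdump_total_ns(unsigned long reads, double avg_read_ns)
{
	return (double)reads * avg_read_ns;
}

/*
 * Time between two yields when the loop yields once every
 * reads_per_yield register reads.
 */
static double crdump_yield_interval_ns(unsigned int reads_per_yield,
				       double avg_read_ns)
{
	assert(reads_per_yield > 0);
	return (double)reads_per_yield * avg_read_ns;
}
```

    With 53813 reads averaging 35284.4ns each, the total is about 1.899s,
    and yielding every 128 reads gives an interval of roughly 4.5ms.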
    
    Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support")
    Reviewed-by: Yuanyuan Zhong <[email protected]>
    Signed-off-by: Mohamed Khalfella <[email protected]>
    Reviewed-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Fix error path in multi-packet WQE transmit [+ + +]

Author: Gerd Bayer <[email protected]>
Date:   Tue Sep 10 10:53:51 2024 +0200

    net/mlx5: Fix error path in multi-packet WQE transmit
    
    [ Upstream commit 2bcae12c795f32ddfbf8c80d1b5f1d3286341c32 ]
    
    Remove the erroneous unmap in case no DMA mapping was established
    
    The multi-packet WQE transmit code attempts to obtain a DMA mapping for
    the skb. This could fail, e.g. under memory pressure, when the IOMMU
    driver just can't allocate more memory for page tables. While the code
    tries to handle this in the path below the err_unmap label it erroneously
    unmaps one entry from the sq's FIFO list of active mappings. Since the
    current map attempt failed this unmap is removing some random DMA mapping
    that might still be required. If the PCI function now presents that IOVA,
    the IOMMU may assume a rogue DMA access and e.g. on s390 puts the PCI
    function in error state.
    
    The erroneous behavior was seen in a stress-test environment that created
    memory pressure.
    
    Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
    Signed-off-by: Gerd Bayer <[email protected]>
    Reviewed-by: Zhu Yanjun <[email protected]>
    Acked-by: Maxim Mikityanskiy <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice [+ + +]

Author: Jianbo Liu <[email protected]>
Date:   Mon Sep 2 09:40:58 2024 +0300

    net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
    
    [ Upstream commit 7b124695db40d5c9c5295a94ae928a8d67a01c3d ]
    
    The km.state is not checked in driver's delayed work. When
    xfrm_state_check_expire() is called, the state can be reset to
    XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This
    happens when xfrm state is deleted, but not freed yet. As
    __xfrm_state_delete() is called again in xfrm timer, the following
    crash occurs.
    
    To fix this issue, skip xfrm_state_check_expire() if km.state is not
    XFRM_STATE_VALID.
    
     Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
     CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
     Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core]
     RIP: 0010:__xfrm_state_delete+0x3d/0x1b0
     Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48
     RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246
     RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036
     RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980
     RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000
     R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246
     R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400
     FS:  0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      <IRQ>
      ? die_addr+0x33/0x90
      ? exc_general_protection+0x1a2/0x390
      ? asm_exc_general_protection+0x22/0x30
      ? __xfrm_state_delete+0x3d/0x1b0
      ? __xfrm_state_delete+0x2f/0x1b0
      xfrm_timer_handler+0x174/0x350
      ? __xfrm_state_delete+0x1b0/0x1b0
      __hrtimer_run_queues+0x121/0x270
      hrtimer_run_softirq+0x88/0xd0
      handle_softirqs+0xcc/0x270
      do_softirq+0x3c/0x50
      </IRQ>
      <TASK>
      __local_bh_enable_ip+0x47/0x50
      mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core]
      process_one_work+0x137/0x2d0
      worker_thread+0x28d/0x3a0
      ? rescuer_thread+0x480/0x480
      kthread+0xb8/0xe0
      ? kthread_park+0x80/0x80
      ret_from_fork+0x2d/0x50
      ? kthread_park+0x80/0x80
      ret_from_fork_asm+0x11/0x20
      </TASK>
    
    Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
    Signed-off-by: Jianbo Liu <[email protected]>
    Reviewed-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() [+ + +]

Author: Elena Salomatkina <[email protected]>
Date:   Tue Sep 24 19:00:18 2024 +0300

    net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
    
    [ Upstream commit f25389e779500cf4a59ef9804534237841bce536 ]
    
    In mlx5e_tir_builder_alloc() kvzalloc() may return NULL
    which is dereferenced on the next line in a reference
    to the modify field.
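    
    The general pattern, checking the allocation before the first
    dereference, can be sketched in user space as follows.  struct
    toy_builder, toy_builder_alloc() and the injectable allocator are
    hypothetical; the real function uses kvzalloc() and a different
    structure.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_builder {
	int modify;
};

/* Allocator signature matching calloc(), so failures can be injected. */
typedef void *(*toy_alloc_fn)(size_t nmemb, size_t size);

/* Allocator that always fails, used to exercise the error path. */
static void *toy_alloc_fail(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

static struct toy_builder *toy_builder_alloc(toy_alloc_fn alloc, int modify)
{
	struct toy_builder *b;

	assert(alloc != NULL);
	b = alloc(1, sizeof(*b));
	if (!b)			/* check before the first dereference */
		return NULL;

	b->modify = modify;
	return b;
}
```

    Passing toy_alloc_fail returns NULL instead of faulting; passing
    calloc returns an initialised object that the caller must free().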
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object")
    Signed-off-by: Elena Salomatkina <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Reviewed-by: Gal Pressman <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: SHAMPO, Fix overflow of hd_per_wq [+ + +]

Author: Dragos Tatulea <[email protected]>
Date:   Tue Aug 13 13:34:54 2024 +0300

    net/mlx5e: SHAMPO, Fix overflow of hd_per_wq
    
    [ Upstream commit 023d2a43ed0d9ab73d4a35757121e4c8e01298e5 ]
    
    When having larger RQ sizes and small MTU sizes, the hd_per_wq variable
    can overflow. Like in the following case:
    
    $> ethtool --set-ring eth1 rx 8192
    $> ip link set dev eth1 mtu 144
    $> ethtool --features eth1 rx-gro-hw on
    
    ... yields in dmesg:
    
    mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22)
    
    because hd_per_wq is 64K which overflows to 0 and makes the command
    fail.
    
    This patch increases the variable size to 32 bit.
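    
    The truncation itself is easy to show in isolation.  The helpers
    below are hypothetical and only model storing the value 64K in a
    16-bit versus a 32-bit variable; they do not reproduce how the driver
    computes hd_per_wq.

```c
#include <assert.h>
#include <stdint.h>

/* Store a header count in a 16-bit field, as before the patch. */
static uint16_t toy_hd_per_wq_u16(uint32_t headers)
{
	return (uint16_t)headers;	/* keeps only the low 16 bits */
}

/* Store the same count in a 32-bit field, as after the patch. */
static uint32_t toy_hd_per_wq_u32(uint32_t headers)
{
	return headers;
}
```

    For 65536 headers the 16-bit version yields 0, while the 32-bit
    version keeps 65536.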
    
    Fixes: 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
    Signed-off-by: Dragos Tatulea <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/ncsi: Disable the ncsi work before freeing the associated structure [+ + +]

Author: Eddie James <[email protected]>
Date:   Wed Sep 25 10:55:23 2024 -0500

    net/ncsi: Disable the ncsi work before freeing the associated structure
    
    [ Upstream commit a0ffa68c70b367358b2672cdab6fa5bc4c40de2c ]
    
    The work function can run after the ncsi device is freed, resulting
    in use-after-free bugs or kernel panic.
    
    Fixes: 2d283bdd079c ("net/ncsi: Resource management")
    Signed-off-by: Eddie James <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/xen-netback: prevent UAF in xenvif_flush_hash() [+ + +]

Author: Jeongjun Park <[email protected]>
Date:   Fri Aug 23 03:11:09 2024 +0900

    net/xen-netback: prevent UAF in xenvif_flush_hash()
    
    [ Upstream commit 0fa5e94a1811d68fbffa0725efe6d4ca62c03d12 ]
    
    During the list_for_each_entry_rcu iteration call of xenvif_flush_hash,
    kfree_rcu does not exist inside the rcu read critical section, so if
    kfree_rcu is called when the rcu grace period ends during the iteration,
    UAF occurs when accessing head->next after the entry becomes free.
    
    Therefore, to solve this, you need to change it to list_for_each_entry_safe.
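    
    A user-space sketch of the "safe" iteration pattern, assuming a
    hypothetical singly linked struct toy_node and an immediate free() in
    place of kfree_rcu(): the next pointer is read before the current
    entry is released, so the loop never touches freed memory.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_node {
	struct toy_node *next;
	int val;
};

/* Prepend a node and return the new head. */
static struct toy_node *toy_push(struct toy_node *head, int val)
{
	struct toy_node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->val = val;
	n->next = head;
	return n;
}

/* Free every node; returns the number of nodes freed. */
static int toy_flush(struct toy_node **head)
{
	struct toy_node *pos, *tmp;
	int freed = 0;

	for (pos = *head; pos; pos = tmp) {
		tmp = pos->next;	/* read next before pos is freed */
		free(pos);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

    list_for_each_entry_safe() applies the same idea by keeping a second
    cursor that already points at the following entry.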
    
    Signed-off-by: Jeongjun Park <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: add more sanity checks to qdisc_pkt_len_init() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:57 2024 +0000

    net: add more sanity checks to qdisc_pkt_len_init()
    
    [ Upstream commit ab9a9a9e9647392a19e7a885b08000e89c86b535 ]
    
    One path takes care of SKB_GSO_DODGY, assuming
    skb->len is bigger than hdr_len.
    
    virtio_net_hdr_to_skb() does not fully dissect TCP headers,
    it only makes sure it is at least 20 bytes.
    
    It is possible for a user to provide a malicious 'GSO' packet,
    total length of 80 bytes.
    
    - 20 bytes of IPv4 header
    - 60 bytes TCP header
    - a small gso_size like 8
    
    virtio_net_hdr_to_skb() would declare this packet as a normal
    GSO packet, because it would see 40 bytes of payload,
    bigger than gso_size.
    
    We need to detect this case to not underflow
    qdisc_skb_cb(skb)->pkt_len.
    
    Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Add netif_get_gro_max_size helper for GRO [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:41 2024 +0200

    net: Add netif_get_gro_max_size helper for GRO
    
    [ Upstream commit e8d4d34df715133c319fabcf63fdec684be75ff8 ]
    
    Add a small netif_get_gro_max_size() helper which returns the maximum IPv4
    or IPv6 GRO size of the netdevice.
    
    We later add a netif_get_gso_max_size() equivalent as well for GSO, so that
    these helpers can be used consistently instead of open-coded checks.
    
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Stable-dep-of: e609c959a939 ("net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size")
    Signed-off-by: Sasha Levin <[email protected]>

net: atlantic: Avoid warning about potential string truncation [+ + +]

Author: Simon Horman <[email protected]>
Date:   Wed Aug 21 16:58:57 2024 +0100

    net: atlantic: Avoid warning about potential string truncation
    
    [ Upstream commit 5874e0c9f25661c2faefe4809907166defae3d7f ]
    
    W=1 builds with GCC 14.2.0 warn that:
    
    .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                           ^~
    .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                        ^~~~~~~
    .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
    the range is 0 - 255. Further, on inspecting the code, it seems
    that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
    the range is actually 0 - 8.
    
    So, it seems that the condition that GCC flags will not occur.
    But, nonetheless, it would be nice if it didn't emit the warning.
    
    It seems that this can be achieved by changing the format specifier
    from %d to %u, in which case I believe GCC recognises an upper bound
    on the range of tc of 0 - 255. After some experimentation I think
    this is due to the combination of the use of %u and the type of
    cfg->tcs (u8).
    
    Empirically, updating the type of the tc variable to unsigned int
    has the same effect.
    
    As both of these changes seem to make sense in relation to what the code
    is actually doing - iterating over unsigned values - do both.
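    
    A stand-alone sketch of the bounded case, assuming a hypothetical
    toy_tc_label() helper that takes an 8-bit value: with %u and an
    argument that cannot exceed 255, the longest output is "TC255 ",
    which fits in the 8-byte buffer together with its terminator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_TC_STR_LEN 8	/* same buffer size as in the warning above */

/* Format "TC<n> " into out and return the number of characters written. */
static int toy_tc_label(char *out, uint8_t tc)
{
	int n = snprintf(out, TOY_TC_STR_LEN, "TC%u ", (unsigned int)tc);

	/* "TC255 " is 6 characters, so the terminator always fits. */
	assert(n > 0 && n < TOY_TC_STR_LEN);
	return n;
}
```

    Whether a given compiler version stops warning still depends on its
    own range analysis, as the commit text notes.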
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: avoid potential underflow in qdisc_pkt_len_init() with UFO [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:56 2024 +0000

    net: avoid potential underflow in qdisc_pkt_len_init() with UFO
    
    [ Upstream commit c20029db28399ecc50e556964eaba75c43b1e2f1 ]
    
    After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
    requests coming from user") virtio_net_hdr_to_skb() had a sanity check
    to detect malicious attempts from user space to cook a bad GSO packet.
    
    Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
    transport header in UFO") while fixing one issue, allowed user space
    to cook a GSO packet with the following characteristic :
    
    IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
    
    When this packet arrives in qdisc_pkt_len_init(), we end up
    with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
    
    Then the following sets gso_segs to 0 :
    
    gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                            shinfo->gso_size);
    
    Then later we set qdisc_skb_cb(skb)->pkt_len back to zero :/
    
    qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
    
    This leads to the following crash in fq_codel [1]
    
    qdisc_pkt_len_init() is best effort, we only want an estimation
    of the bytes sent on the wire, not crashing the kernel.
    
    This patch is fixing this particular issue, a following one
    adds more sanity checks for another potential bug.
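    
    The wrap-around can be reproduced outside the kernel.  The sketch
    below is a simplified model of the two quoted statements: the names
    toy_pkt_len() and TOY_DIV_ROUND_UP are hypothetical, and plain
    unsigned int is used for every value, which is an assumption about
    the types rather than a copy of the kernel code.

```c
#include <assert.h>

#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Start from pkt_len = skb_len and add the header bytes of the extra
 * segments, as the quoted statements do.
 */
static unsigned int toy_pkt_len(unsigned int skb_len, unsigned int hdr_len,
				unsigned int gso_size)
{
	unsigned int pkt_len = skb_len;
	unsigned int gso_segs;

	assert(gso_size > 0 && hdr_len <= skb_len);
	gso_segs = TOY_DIV_ROUND_UP(skb_len - hdr_len, gso_size);
	/* With gso_segs == 0, (gso_segs - 1) wraps and cancels skb_len. */
	pkt_len += (gso_segs - 1) * hdr_len;
	return pkt_len;
}
```

    For skb_len = 28, hdr_len = 28 and gso_size = 3, gso_segs is 0 and
    the result wraps to 0; a packet with 1000 bytes of payload and
    gso_size = 500 gives 1028 + 28 = 1056 as expected.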
    
    [1]
    [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [   70.724561] #PF: supervisor read access in kernel mode
    [   70.724561] #PF: error_code(0x0000) - not-present page
    [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
    [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
    [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
    [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
    All code
    ========
       0:   24 08                   and    $0x8,%al
       2:   49 c1 e1 06             shl    $0x6,%r9
       6:   44 89 7c 24 18          mov    %r15d,0x18(%rsp)
       b:   45 31 ed                xor    %r13d,%r13d
       e:   45 31 c0                xor    %r8d,%r8d
      11:   31 ff                   xor    %edi,%edi
      13:   89 44 24 14             mov    %eax,0x14(%rsp)
      17:   4c 03 8b 90 01 00 00    add    0x190(%rbx),%r9
      1e:   eb 04                   jmp    0x24
      20:   39 ca                   cmp    %ecx,%edx
      22:   73 37                   jae    0x5b
      24:   4d 8b 39                mov    (%r9),%r15
      27:   83 c7 01                add    $0x1,%edi
      2a:*  49 8b 17                mov    (%r15),%rdx              <-- trapping instruction
      2d:   49 89 11                mov    %rdx,(%r9)
      30:   41 8b 57 28             mov    0x28(%r15),%edx
      34:   45 8b 5f 34             mov    0x34(%r15),%r11d
      38:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      3f:   49                      rex.WB
    
    Code starting with the faulting instruction
    ===========================================
       0:   49 8b 17                mov    (%r15),%rdx
       3:   49 89 11                mov    %rdx,(%r9)
       6:   41 8b 57 28             mov    0x28(%r15),%edx
       a:   45 8b 5f 34             mov    0x34(%r15),%r11d
       e:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      15:   49                      rex.WB
    [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
    [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
    [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
    [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
    [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
    [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
    [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
    [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
    [   70.724561] Call Trace:
    [   70.724561]  <TASK>
    [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
    [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
    [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
    [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
    [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
    [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
    [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
    [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
    [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
    [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
    [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
    [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    [   70.724561] RIP: 0033:0x41ae09
    
    Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Jonathan Davies <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: Jonathan Davies <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: improve shutdown sequence [+ + +]

Author: Vladimir Oltean <[email protected]>
Date:   Fri Sep 13 23:35:49 2024 +0300

    net: dsa: improve shutdown sequence
    
    [ Upstream commit 6c24a03a61a245fe34d47582898331fa034b6ccd ]
    
    Alexander Sverdlin presents 2 problems during shutdown with the
    lan9303 driver. One is specific to lan9303 and the other just happens
    to reproduce there.
    
    The first problem is that lan9303 is unique among DSA drivers in that it
    calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown,
    not remove):
    
    phy_state_machine()
    -> ...
       -> dsa_user_phy_read()
          -> ds->ops->phy_read()
             -> lan9303_phy_read()
                -> chip->ops->phy_read()
                   -> lan9303_mdio_phy_read()
                      -> dev_get_drvdata()
    
    But we never stop the phy_state_machine(), so it may continue to run
    after dsa_switch_shutdown(). Our common pattern in all DSA drivers is
    to set drvdata to NULL to suppress the remove() method that may come
    afterwards. But in this case it will result in a NULL pointer
    dereference (NPD).
    
    The second problem is that the way in which we set
    dp->conduit->dsa_ptr = NULL; is concurrent with receive packet
    processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL,
    but afterwards, rather than continuing to use that non-NULL value,
    dev->dsa_ptr is dereferenced again and again without NULL checks:
    dsa_conduit_find_user() and many other places. In between dereferences,
    there is no locking to ensure that what was valid once continues to be
    valid.
    
    Both problems have the common aspect that closing the conduit interface
    solves them.
    
    In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN
    event in dsa_user_netdevice_event() which closes user ports as well.
    dsa_port_disable_rt() calls phylink_stop(), which synchronously stops
    the phylink state machine, and ds->ops->phy_read() will thus no longer
    call into the driver after this point.
    
    In the second case, dev_close(conduit) should do this, as per
    Documentation/networking/driver.rst:
    
    | Quiescence
    | ----------
    |
    | After the ndo_stop routine has been called, the hardware must
    | not receive or transmit any data.  All in flight packets must
    | be aborted. If necessary, poll or wait for completion of
    | any reset commands.
    
    So it should be sufficient to ensure that later, when we zeroize
    conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call
    on this conduit.
    
    The addition of the netif_device_detach() function is to ensure that
    ioctls, rtnetlinks and ethtool requests on the user ports no longer
    propagate down to the driver - we're no longer prepared to handle them.
    
    The race condition actually did not exist when commit 0650bf52b31f
    ("net: dsa: be compatible with masters which unregister on shutdown")
    first introduced dsa_switch_shutdown(). It was created later, when we
    stopped unregistering the user interfaces from a bad spot, and we just
    replaced that sequence with a racy zeroization of conduit->dsa_ptr
    (one which doesn't ensure that the interfaces aren't up).
    
    Reported-by: Alexander Sverdlin <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Fixes: ee534378f005 ("net: dsa: fix panic when DSA master device unbinds on shutdown")
    Reviewed-by: Alexander Sverdlin <[email protected]>
    Tested-by: Alexander Sverdlin <[email protected]>
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: lantiq_etop: fix memory disclosure [+ + +]

Author: Aleksander Jan Bajkowski <[email protected]>
Date:   Mon Sep 23 23:49:49 2024 +0200

    net: ethernet: lantiq_etop: fix memory disclosure
    
    [ Upstream commit 45c0de18ff2dc9af01236380404bbd6a46502c69 ]
    
    When applying padding, the buffer is not zeroed, which results in memory
    disclosure. The mentioned data is observed on the wire. This patch uses
    skb_put_padto() to pad Ethernet frames properly. The mentioned function
    zeroes the expanded buffer.
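
The zeroing behaviour described above can be illustrated with a minimal userspace sketch. It is not the driver code: pad_frame_zeroed() and MIN_FRAME_LEN are hypothetical names that only mimic what skb_put_padto() is described as doing, namely extending a short frame and clearing the added bytes.

```c
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length without FCS (ETH_ZLEN in the kernel). */
#define MIN_FRAME_LEN 60

/*
 * Hypothetical userspace analogue of skb_put_padto(): extend a frame to
 * MIN_FRAME_LEN and zero the added bytes so that stale buffer contents
 * never reach the wire. Returns the new length, or 0 if the buffer is too
 * small to hold a padded frame (the caller would then drop the packet).
 */
size_t pad_frame_zeroed(unsigned char *buf, size_t len, size_t cap)
{
    if (len >= MIN_FRAME_LEN)
        return len;                      /* already long enough */
    if (cap < MIN_FRAME_LEN)
        return 0;                        /* cannot pad: drop */
    memset(buf + len, 0, MIN_FRAME_LEN - len);
    return MIN_FRAME_LEN;
}
```

Only the bytes between the old and the new length are touched, so the payload is preserved while the padding carries no leftover memory contents.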
    
    In case the packet cannot be padded it is silently dropped. Statistics
    are also not incremented. This driver does not support statistics in the
    old 32-bit format or the new 64-bit format. These will be added in the
    future. In its current form, the patch should be easily backported to
    stable versions.
    
    Ethernet MACs on Amazon-SE and Danube cannot do padding of the packets
    in hardware, so software padding must be applied.
    
    Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver")
    Signed-off-by: Aleksander Jan Bajkowski <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: Reload PTP registers after link-state change [+ + +]

Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:06 2024 +0200

    net: fec: Reload PTP registers after link-state change
    
    [ Upstream commit d9335d0232d2da605585eea1518ac6733518f938 ]
    
    On link-state change, the controller gets reset,
    which clears all PTP registers, including PHC time,
    calibrated clock correction values etc. For correct
    IEEE 1588 operation we need to restore these after
    the reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Reviewed-by: Wei Fang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: Restart PPS after link state change [+ + +]

Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:04 2024 +0200

    net: fec: Restart PPS after link state change
    
    [ Upstream commit a1477dc87dc4996dcf65a4893d4e2c3a6b593002 ]
    
    On link state change, the controller gets reset,
    causing PPS to drop out. Re-enable PPS if it was
    enabled before the controller reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:42 2024 +0200

    net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size
    
    [ Upstream commit e609c959a939660c7519895f853dfa5624c6827a ]
    
    Commit 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    added a dev->gso_max_size test to gso_features_check() in order to fall
    back to GSO when needed.
    
    This was added as it was noticed that some drivers could misbehave if TSO
    packets get too big. However, the check doesn't respect dev->gso_ipv4_max_size
    limit. For instance, a device could be configured with BIG TCP for IPv4,
    but not IPv6.
    
    Therefore, add a netif_get_gso_max_size() equivalent to netif_get_gro_max_size()
    and use the helper to respect both limits before falling back to GSO engine.
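
A rough sketch of the idea, assuming the helper selects the limit by protocol in the same way netif_get_gro_max_size() does. The struct and function names below are simplified, hypothetical stand-ins, not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant net_device fields. */
struct dev_limits {
    unsigned int gso_max_size;      /* assumed to apply to IPv6 traffic */
    unsigned int gso_ipv4_max_size; /* assumed to apply to IPv4 traffic */
};

/* Pick the limit that matches the packet's protocol. */
unsigned int get_gso_max_size(const struct dev_limits *dev, bool is_ipv6)
{
    return is_ipv6 ? dev->gso_max_size : dev->gso_ipv4_max_size;
}

/* Fall back to software GSO only when the matching limit is exceeded. */
bool needs_software_gso(const struct dev_limits *dev, bool is_ipv6,
                        unsigned int len)
{
    return len > get_gso_max_size(dev, is_ipv6);
}
```

With BIG TCP enabled for IPv4 only, a 100000-byte IPv4 packet would stay on the hardware path in this sketch, while an IPv6 packet of the same size would fall back to the GSO engine.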
    
    Fixes: 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: gso: fix tcp fraglist segmentation after pull from frag_list [+ + +]

Author: Felix Fietkau <[email protected]>
Date:   Thu Sep 26 10:53:14 2024 +0200

    net: gso: fix tcp fraglist segmentation after pull from frag_list
    
    commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8 upstream.
    
    Detect tcp gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the former
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For TCP, this
    causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at
    tcp_hdr(seg->next).
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
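
The head_skb size check can be sketched as follows. This is an illustration under simplifying assumptions: the lengths are passed in as plain integers and the function and enum names are hypothetical, whereas the kernel derives these values from the skb itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which segmentation routine a packet would be handed to. */
enum seg_path { SEG_FRAGLIST, SEG_REGULAR };

/*
 * A valid fraglist head holds the protocol headers plus exactly one
 * gso_size worth of payload. If a datapath hook pulled more data into the
 * head, the payload length no longer matches gso_size.
 */
bool fraglist_geometry_intact(size_t head_len, size_t hdr_len, size_t gso_size)
{
    return head_len >= hdr_len && head_len - hdr_len == gso_size;
}

/* Do not drop on a mismatch: route the packet to the regular path. */
enum seg_path choose_seg_path(size_t head_len, size_t hdr_len, size_t gso_size)
{
    return fraglist_geometry_intact(head_len, hdr_len, gso_size)
               ? SEG_FRAGLIST : SEG_REGULAR;
}
```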
    
    Approach and description based on a patch by Willem de Bruijn.
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
    Cc: [email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: hisilicon: hip04: fix OF node leak in probe() [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:19 2024 +0200

    net: hisilicon: hip04: fix OF node leak in probe()
    
    [ Upstream commit 17555297dbd5bccc93a01516117547e26a61caf1 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info() [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:20 2024 +0200

    net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
    
    [ Upstream commit 5680cf8d34e1552df987e2f4bb1bff0b2a8c8b11 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in hns_mac_get_info().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_mdio: fix OF node leak in probe() [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:21 2024 +0200

    net: hisilicon: hns_mdio: fix OF node leak in probe()
    
    [ Upstream commit e62beddc45f487b9969821fad3a0913d9bc18a2f ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:42:34 2024 +0800

    net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 09573b1cc76e7ff8f056ab29ea1cdc152ec8c653 ]
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can come. request_irq() with the IRQF_NO_AUTOEN flag
    disables IRQ auto-enable when the IRQ is requested.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mvpp2: Increase size of queue_name buffer [+ + +]

Author: Simon Horman <[email protected]>
Date:   Tue Aug 6 12:28:24 2024 +0100

    net: mvpp2: Increase size of queue_name buffer
    
    [ Upstream commit 91d516d4de48532d967a77967834e00c8c53dfe6 ]
    
    Increase size of queue_name buffer from 30 to 31 to accommodate
    the largest string written to it. This avoids truncation in
    the possibly unlikely case where the string is the
    maximum size.
    
    Flagged by gcc-14:
    
      .../mvpp2_main.c: In function 'mvpp2_probe':
      .../mvpp2_main.c:7636:32: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                                ^
      .../mvpp2_main.c:7635:9: note: 'snprintf' output between 10 and 31 bytes into a destination of size 30
       7635 |         snprintf(priv->queue_name, sizeof(priv->queue_name),
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7637 |                  priv->port_count > 1 ? "+" : "");
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Introduced by commit 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics").
    I am not flagging this as a bug as I am not aware that it is one.
    
    Compile tested only.
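
The arithmetic behind the warning can be reproduced in userspace. The sketch below assumes the longest name is the 20-character placeholder "(unnamed net_device)", which is consistent with the 31-byte upper bound gcc reports. queue_name_fits() is a hypothetical helper, not driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the workqueue name the same way the driver does and report
 * whether it fits into a buffer of buf_size bytes. snprintf() returns the
 * length it wanted to write (excluding the NUL), so a return value of
 * buf_size or more means the output was truncated.
 */
bool queue_name_fits(size_t buf_size, const char *ifname, int port_count)
{
    char buf[64];
    int n;

    if (buf_size > sizeof(buf))
        buf_size = sizeof(buf);

    n = snprintf(buf, buf_size, "stats-wq-%s%s", ifname,
                 port_count > 1 ? "+" : "");
    return n >= 0 && (size_t)n < buf_size;
}
```

The prefix is 9 characters, the assumed longest name is 20, and the optional "+" adds 1, so the longest string is 30 characters and needs 31 bytes including the terminating NUL.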
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Marcin Wojtas <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: napi: Prevent overflow of napi_defer_hard_irqs [+ + +]

Author: Joe Damato <[email protected]>
Date:   Wed Sep 4 15:34:30 2024 +0000

    net: napi: Prevent overflow of napi_defer_hard_irqs
    
    [ Upstream commit 08062af0a52107a243f7608fd972edb54ca5b7f8 ]
    
    In commit 6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
    napi_defer_irqs was added to net_device and napi_defer_irqs_count was
    added to napi_struct, both as type int.
    
    This value never goes below zero, so there is no reason for it to be a
    signed int. Change the type for both from int to u32, and add an
    overflow check to sysfs to limit the value to S32_MAX.
    
    The limit of S32_MAX was chosen because the practical limit before this
    patch was S32_MAX (anything larger was an overflow) and thus there are
    no behavioral changes introduced. If the extra bit is needed in the
    future, the limit can be raised.
    
    Before this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    $ cat /sys/class/net/eth4/napi_defer_hard_irqs
    -2147483647
    
    After this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    bash: line 0: echo: write error: Numerical result out of range
    
    Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
    
    include/linux/netdevice.h:      unsigned int            tx_queue_len;
    
    And has an overflow check:
    
    dev_change_tx_queue_len(..., unsigned long new_len):
    
      if (new_len != (unsigned int)new_len)
              return -ERANGE;
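
A comparable range check for napi_defer_hard_irqs might look like the sketch below. set_defer_hard_irqs() is a hypothetical userspace function written only to show the S32_MAX limit; it is not the sysfs handler from the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Store val into a u32 field, but reject anything above S32_MAX
 * (INT32_MAX here) so that the accepted range matches what worked before
 * the type change. The field is left untouched on error.
 */
int set_defer_hard_irqs(uint32_t *field, unsigned long val)
{
    if (val > (unsigned long)INT32_MAX)
        return -ERANGE;

    *field = (uint32_t)val;
    return 0;
}
```

Writing 2147483649 is rejected with -ERANGE in this sketch, which corresponds to the "Numerical result out of range" error shown above.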
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: pcs: xpcs: fix the wrong register that was written back [+ + +]

Author: Jiawen Wu <[email protected]>
Date:   Tue Sep 24 10:28:57 2024 +0800

    net: pcs: xpcs: fix the wrong register that was written back
    
    commit 93ef6ee5c20e9330477930ec6347672c9e0cf5a6 upstream.
    
    The value is read from the register TXGBE_RX_GEN_CTL3, and it should be
    written back to TXGBE_RX_GEN_CTL3 when it changes some fields.
    
    Cc: [email protected]
    Fixes: f629acc6f210 ("net: pcs: xpcs: support to switch mode for Wangxun NICs")
    Signed-off-by: Jiawen Wu <[email protected]>
    Reported-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: phy: Check for read errors in SIOCGMIIREG [+ + +]

Author: Niklas Söderlund <[email protected]>
Date:   Tue Sep 3 19:15:36 2024 +0200

    net: phy: Check for read errors in SIOCGMIIREG
    
    [ Upstream commit 569bf6d481b0b823c3c9c3b8be77908fd7caf66b ]
    
    When reading registers from the PHY using the SIOCGMIIREG IOCTL any
    errors returned from either mdiobus_read() or mdiobus_c45_read() are
    ignored, and part of the returned error is passed as the register value
    back to user-space.
    
    For example, if mdiobus_c45_read() is used with a bus that does not
    implement the read_c45() callback, -EOPNOTSUPP is returned. This is
    however directly stored in mii_data->val_out and returned as the
    register's content. As val_out is a u16 the error code is truncated and
    returned as a plausible register value.
    
    Fix this by first checking the return value for errors before returning
    it as the register content.
    
    Before this patch,
    
        # phytool read eth0/0:1/0
        0xffa1
    
    After this change,
    
        $ phytool read eth0/0:1/0
        error: phy_read (-95)
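
The truncation and the fix can be shown with a small userspace sketch. fake_mdio_read() is a hypothetical stand-in for mdiobus_read() or mdiobus_c45_read(); it returns either a register value or a negative errno, which is the convention the commit describes.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bus read: a 16-bit register value, or a negative errno. */
int fake_mdio_read(int supported)
{
    return supported ? 0x1140 : -EOPNOTSUPP;
}

/*
 * Old behaviour: the return value is stored in a u16 without a check.
 * On Linux, where EOPNOTSUPP is 95, -95 truncates to 0xffa1.
 */
uint16_t read_reg_unchecked(int supported)
{
    return (uint16_t)fake_mdio_read(supported);
}

/* Fixed behaviour: propagate errors, store the value only on success. */
int read_reg_checked(int supported, uint16_t *val_out)
{
    int ret = fake_mdio_read(supported);

    if (ret < 0)
        return ret;

    *val_out = (uint16_t)ret;
    return 0;
}
```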
    
    Signed-off-by: Niklas Söderlund <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Yoshihiro Shimoda <[email protected]>
    Tested-by: Yoshihiro Shimoda <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: phy: realtek: Check the index value in led_hw_control_get [+ + +]

Author: Hui Wang <[email protected]>
Date:   Fri Sep 27 19:46:10 2024 +0800

    net: phy: realtek: Check the index value in led_hw_control_get
    
    [ Upstream commit c283782fc5d60c4d8169137c6f955aa3553d3b3d ]
    
    Just like rtl8211f_led_hw_is_supported() and
    rtl8211f_led_hw_control_set(), the rtl8211f_led_hw_control_get() also
    needs to check the index value, otherwise the caller is likely to get
    incorrect rules.
    
    Fixes: 17784801d888 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F")
    Signed-off-by: Hui Wang <[email protected]>
    Reviewed-by: Marek Vasut <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sched: consistently use rcu_replace_pointer() in taprio_change() [+ + +]

Author: Dmitry Antipov <[email protected]>
Date:   Wed Sep 4 14:54:01 2024 +0300

    net: sched: consistently use rcu_replace_pointer() in taprio_change()
    
    [ Upstream commit d5c4546062fd6f5dbce575c7ea52ad66d1968678 ]
    
    According to Vinicius (and carefully looking through the whole
    https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
    once again), txtime branch of 'taprio_change()' is not going to
    race against 'advance_sched()'. But using 'rcu_replace_pointer()'
    in the former may be a good idea as well.
    
    Suggested-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: Dmitry Antipov <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Thu Aug 1 17:19:56 2024 -0700

    net: skbuff: sprinkle more __GFP_NOWARN on ingress allocs
    
    [ Upstream commit c89cca307b20917da739567a255a68a0798ee129 ]
    
    build_skb() and frag allocations done with GFP_ATOMIC will
    fail in real life, when system is under memory pressure,
    and there's nothing we can do about that. So no point
    printing warnings.
    
    Signed-off-by: Jakub Kicinski <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sparx5: Fix invalid timestamps [+ + +]

Author: Aakash Menon <[email protected]>
Date:   Mon Sep 16 22:18:29 2024 -0700

    net: sparx5: Fix invalid timestamps
    
    [ Upstream commit 151ac45348afc5b56baa584c7cd4876addf461ff ]
    
    Bits 270-271 are occasionally unexpectedly set by the hardware. This issue
    was observed with 10G SFPs causing huge time errors (> 30ms) in PTP. Only
    30 bits are needed for the nanosecond part of the timestamp, so clear the
    2 most significant bits before extracting the timestamp from the internal
    frame header.
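
The masking step can be sketched on a plain 32-bit value. TS_NS_MASK and ts_nanoseconds() are hypothetical names, and the sketch assumes the nanosecond field has already been assembled into one 32-bit word, which is a simplification of how the driver reads the frame header.

```c
#include <stdint.h>

/* Nanoseconds are below 10^9, which is less than 2^30, so 30 bits suffice. */
#define TS_NS_MASK 0x3FFFFFFFu

/* Drop the two most significant bits that the hardware may set spuriously. */
uint32_t ts_nanoseconds(uint32_t raw)
{
    return raw & TS_NS_MASK;
}
```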
    
    Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping")
    Signed-off-by: Aakash Menon <[email protected]>
    Reviewed-by: Horatiu Vultur <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check [+ + +]

Author: Shenwei Wang <[email protected]>
Date:   Tue Sep 24 15:54:24 2024 -0500

    net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check
    
    [ Upstream commit 4c1b56671b68ffcbe6b78308bfdda6bcce6491ae ]
    
    Increase the timeout for checking the busy bit of the VLAN Tag register
    from 10µs to 500ms. This change is necessary to accommodate scenarios
    where Energy Efficient Ethernet (EEE) is enabled.
    
    Overnight testing revealed that when EEE is active, the busy bit can
    remain set for up to approximately 300ms. The new 500ms timeout provides
    a safety margin.
    
    Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering")
    Reviewed-by: Andrew Lunn <[email protected]>
    Signed-off-by: Shenwei Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: Fix zero-division error when disabling tc cbs [+ + +]

Author: KhaiWenTan <[email protected]>
Date:   Wed Sep 18 14:14:22 2024 +0800

    net: stmmac: Fix zero-division error when disabling tc cbs
    
    commit 675faf5a14c14a2be0b870db30a70764df81e2df upstream.
    
    The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider
    when offload is disabled") allows the "port_transmit_rate_kbps" to be
    set to a value of 0, which is then passed to the "div_s64" function when
    tc-cbs is disabled. This leads to a zero-division error.
    
    When tc-cbs is disabled, the idleslope, sendslope, and credit values
    are not required to be configured. Therefore, adding a return
    statement after setting the txQ mode to DCB when tc-cbs is disabled
    prevents the zero-division error.
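
The control flow of the fix can be sketched as below. All names are hypothetical and the slope calculation is simplified; the point is only that the disabled case returns before any division by the port rate takes place.

```c
#include <stdbool.h>
#include <stdint.h>

enum txq_mode { TXQ_MODE_DCB, TXQ_MODE_AVB };

struct cbs_result {
    enum txq_mode mode;
    int64_t idle_slope_scaled;
};

/*
 * Hypothetical, simplified CBS setup. When offload is disabled the queue
 * is switched to DCB mode and the function returns immediately, so a port
 * rate of 0 is never used as a divisor.
 */
int configure_cbs(bool enable, int64_t idleslope, int64_t port_rate_kbps,
                  struct cbs_result *out)
{
    if (!enable) {
        out->mode = TXQ_MODE_DCB;
        return 0;               /* slopes and credits are not needed */
    }

    if (port_rate_kbps <= 0)
        return -1;              /* defensive: never divide by zero */

    out->mode = TXQ_MODE_AVB;
    out->idle_slope_scaled = idleslope * 1024 / port_rate_kbps;
    return 0;
}
```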
    
    Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled")
    Cc: <[email protected]>
    Co-developed-by: Choong Yong Liang <[email protected]>
    Signed-off-by: Choong Yong Liang <[email protected]>
    Signed-off-by: KhaiWenTan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: test for not too small csum_start in virtio_net_hdr_to_skb() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()
    
    [ Upstream commit 49d14b54a527289d09a9480f214b8c586322310a ]
    
    syzbot was able to trigger this warning [1], after injecting a
    malicious packet through af_packet, setting skb->csum_start and thus
    the transport header to an incorrect value.
    
    We can at least make sure the transport header is after
    the end of the network header (with an estimated minimal size).
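
The sanity check can be sketched with plain offsets. csum_start_plausible() and MIN_IPV4_HDR_LEN are hypothetical names, and the sketch assumes offsets measured from the start of the packet data; the kernel works on skb header offsets instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal IPv4 header length in bytes (a header without options). */
#define MIN_IPV4_HDR_LEN 20

/*
 * The checksum start marks the transport header. It is only plausible if
 * it lies at or after the end of a minimally sized network header.
 */
bool csum_start_plausible(size_t network_offset, size_t csum_start,
                          size_t nh_min_len)
{
    return csum_start >= network_offset + nh_min_len;
}
```

For a frame with a 14-byte Ethernet header, a csum_start of 10 would be rejected in this sketch, while 34 (14 + 20) would be accepted.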
    
    [1]
    [   67.873027] skb len=4096 headroom=16 headlen=14 tailroom=0
    mac=(-1,-1) mac_len=0 net=(16,-6) trans=10
    shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
    csum(0xa start=10 offset=0 ip_summed=3 complete_sw=0 valid=0 level=0)
    hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
    priority=0x0 mark=0x0 alloc_cpu=10 vlan_all=0x0
    encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
    [   67.877172] dev name=veth0_vlan feat=0x000061164fdd09e9
    [   67.877764] sk family=17 type=3 proto=0
    [   67.878279] skb linear:   00000000: 00 00 10 00 00 00 00 00 0f 00 00 00 08 00
    [   67.879128] skb frag:     00000000: 0e 00 07 00 00 00 28 00 08 80 1c 00 04 00 00 02
    [   67.879877] skb frag:     00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.880647] skb frag:     00000020: 00 00 02 00 00 00 08 00 1b 00 00 00 00 00 00 00
    [   67.881156] skb frag:     00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.881753] skb frag:     00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882173] skb frag:     00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882790] skb frag:     00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883171] skb frag:     00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883733] skb frag:     00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.884206] skb frag:     00000090: 00 00 00 00 00 00 00 00 00 00 69 70 76 6c 61 6e
    [   67.884704] skb frag:     000000a0: 31 00 00 00 00 00 00 00 00 00 2b 00 00 00 00 00
    [   67.885139] skb frag:     000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.885677] skb frag:     000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886042] skb frag:     000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886408] skb frag:     000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887020] skb frag:     000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887384] skb frag:     00000100: 00 00
    [   67.887878] ------------[ cut here ]------------
    [   67.887908] offset (-6) >= skb_headlen() (14)
    [   67.888445] WARNING: CPU: 10 PID: 2088 at net/core/dev.c:3332 skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.889353] Modules linked in: macsec macvtap macvlan hsr wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 dummy bridge sr_mod cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p 9pnet netfs
    [   67.890111] CPU: 10 UID: 0 PID: 2088 Comm: b363492833 Not tainted 6.11.0-virtme #1011
    [   67.890183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   67.890309] RIP: 0010:skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891043] Call Trace:
    [   67.891173]  <TASK>
    [   67.891274] ? __warn (kernel/panic.c:741)
    [   67.891320] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891333] ? report_bug (lib/bug.c:180 lib/bug.c:219)
    [   67.891348] ? handle_bug (arch/x86/kernel/traps.c:239)
    [   67.891363] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
    [   67.891372] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
    [   67.891388] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891399] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891416] ip_do_fragment (net/ipv4/ip_output.c:777 (discriminator 1))
    [   67.891448] ? __ip_local_out (./include/linux/skbuff.h:1146 ./include/net/l3mdev.h:196 ./include/net/l3mdev.h:213 net/ipv4/ip_output.c:113)
    [   67.891459] ? __pfx_ip_finish_output2 (net/ipv4/ip_output.c:200)
    [   67.891470] ? ip_route_output_flow (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:96 (discriminator 13) ./include/linux/rcupdate.h:871 (discriminator 13) net/ipv4/route.c:2625 (discriminator 13) ./include/net/route.h:141 (discriminator 13) net/ipv4/route.c:2852 (discriminator 13))
    [   67.891484] ipvlan_process_v4_outbound (drivers/net/ipvlan/ipvlan_core.c:445 (discriminator 1))
    [   67.891581] ipvlan_queue_xmit (drivers/net/ipvlan/ipvlan_core.c:542 drivers/net/ipvlan/ipvlan_core.c:604 drivers/net/ipvlan/ipvlan_core.c:670)
    [   67.891596] ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:227)
    [   67.891607] dev_hard_start_xmit (./include/linux/netdevice.h:4916 ./include/linux/netdevice.h:4925 net/core/dev.c:3588 net/core/dev.c:3604)
    [   67.891620] __dev_queue_xmit (net/core/dev.h:168 (discriminator 25) net/core/dev.c:4425 (discriminator 25))
    [   67.891630] ? skb_copy_bits (./include/linux/uaccess.h:233 (discriminator 1) ./include/linux/uaccess.h:260 (discriminator 1) ./include/linux/highmem-internal.h:230 (discriminator 1) net/core/skbuff.c:3018 (discriminator 1))
    [   67.891645] ? __pskb_pull_tail (net/core/skbuff.c:2848 (discriminator 4))
    [   67.891655] ? skb_partial_csum_set (net/core/skbuff.c:5657)
    [   67.891666] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/skbuff.h:2791 (discriminator 3) ./include/linux/skbuff.h:2799 (discriminator 3) ./include/linux/virtio_net.h:109 (discriminator 3))
    [   67.891684] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   67.891700] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   67.891716] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   67.891734] ? do_sock_setsockopt (net/socket.c:2335)
    [   67.891747] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   67.891761] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   67.891772] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   67.891785] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    
    Fixes: 9181d6f8a2bb ("net: add more sanity check in virtio_net_hdr_to_skb()")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 19:57:43 2024 +0800

    net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable()
    
    [ Upstream commit d505d3593b52b6c43507f119572409087416ba28 ]
    
    It's important to undo pm_runtime_use_autosuspend() with
    pm_runtime_dont_use_autosuspend() at driver exit time.
    
    But pm_runtime_disable() and pm_runtime_dont_use_autosuspend()
    are missing in the error path of bam_dmux_probe(). So add them.
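    
    The unwind order described here can be sketched in plain userspace C.
    This is only a model under stated assumptions: struct fake_dev, the
    fake_pm_runtime_*() helpers and fake_probe() are hypothetical stand-ins
    for the real runtime-PM API, and later_step_err simulates a probe step
    that fails after runtime PM has been set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime-PM state of a struct device. */
struct fake_dev {
    bool runtime_enabled;
    bool use_autosuspend;
};

/* Userspace stand-ins for the runtime-PM helpers named in the commit. */
static void fake_pm_runtime_use_autosuspend(struct fake_dev *d) { d->use_autosuspend = true; }
static void fake_pm_runtime_dont_use_autosuspend(struct fake_dev *d) { d->use_autosuspend = false; }
static void fake_pm_runtime_enable(struct fake_dev *d) { d->runtime_enabled = true; }
static void fake_pm_runtime_disable(struct fake_dev *d) { d->runtime_enabled = false; }

/*
 * Model of a probe function. later_step_err simulates a failure that
 * happens after runtime PM has been set up (0 means success).
 */
int fake_probe(struct fake_dev *d, int later_step_err)
{
    int ret;

    assert(d != NULL);
    fake_pm_runtime_use_autosuspend(d);
    fake_pm_runtime_enable(d);

    ret = later_step_err;
    if (ret)
        goto err_disable_pm;

    return 0;

err_disable_pm:
    /* Undo the runtime-PM setup in reverse order before returning. */
    fake_pm_runtime_disable(d);
    fake_pm_runtime_dont_use_autosuspend(d);
    return ret;
}
```

    After a failed fake_probe() both flags are back to false, which is the
    state the error path is expected to leave behind.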
    
    Found by code review. Compile-tested only.
    
    Fixes: 21a0ffd9b38c ("net: wwan: Add Qualcomm BAM-DMUX WWAN network driver")
    Suggested-by: Stephan Gerhold <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Stephan Gerhold <[email protected]>
    Reviewed-by: Sergey Ryazanov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netdev-genl: Set extack and fix error on napi-get [+ + +]

Author: Joe Damato <[email protected]>
Date:   Sat Aug 31 12:17:04 2024 +0000

    netdev-genl: Set extack and fix error on napi-get
    
    [ Upstream commit 4e3a024b437ec0aee82550cc66a0f4e1a7a88a67 ]
    
    In commit 27f91aaf49b3 ("netdev-genl: Add netlink framework functions
    for napi"), when an invalid NAPI ID is specified the return value
    -EINVAL is used and no extack is set.
    
    Change the return value to -ENOENT and set the extack.
    
    Before this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                              --do napi-get --json='{"id": 451}'
    Netlink error: Invalid argument
    nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
            error: -22
    
    After this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                             --do napi-get --json='{"id": 451}'
    Netlink error: No such file or directory
    nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
            error: -2
            extack: {'bad-attr': '.id'}
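    
    The behaviour change shown in the two transcripts can be modelled in
    userspace C roughly as follows. Everything here is an assumption made
    for illustration: struct fake_extack, the known_napi_ids table and
    fake_napi_get() are hypothetical and do not reproduce the kernel's
    netlink code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an extended-ack structure. */
struct fake_extack {
    const char *bad_attr;   /* name of the offending attribute, or NULL */
};

/* Hypothetical table of NAPI IDs that currently exist. */
static const unsigned int known_napi_ids[] = { 8193, 8194 };

/* Look up a NAPI ID; on a miss, report -ENOENT and flag the "id" attribute. */
int fake_napi_get(unsigned int id, struct fake_extack *extack)
{
    size_t i;

    assert(extack != NULL);
    extack->bad_attr = NULL;

    for (i = 0; i < sizeof(known_napi_ids) / sizeof(known_napi_ids[0]); i++) {
        if (known_napi_ids[i] == id)
            return 0;
    }

    /* Unknown ID: say which attribute was wrong instead of a bare -EINVAL. */
    extack->bad_attr = "id";
    return -ENOENT;
}
```

    A caller can then tell "no such NAPI" apart from a malformed request and
    can point the user at the attribute that caused the failure.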
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: nf_tables: do not remove elements if set backend implements .abort [+ + +]

Author: Pablo Neira Ayuso <[email protected]>
Date:   Mon Jul 15 13:32:31 2024 +0200

    netfilter: nf_tables: do not remove elements if set backend implements .abort
    
    [ Upstream commit c9526aeb4998393171d85225ff540e28c7d4ab86 ]
    
    The pipapo set backend maintains two copies of the data structure.
    Removing the elements from the copy that is going to be discarded slows
    down the abort path significantly; this patch reduces it from several
    minutes to a few seconds.
    
    This patch was previously reverted by
    
      f86fb94011ae ("netfilter: nf_tables: revert do not remove elements if set backend implements .abort")
    
    but it is now possible since recent work by Florian Westphal to perform
    on-demand clone from insert/remove path:
    
      532aec7e878b ("netfilter: nft_set_pipapo: remove dirty flag")
      3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
      a238106703ab ("netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone")
      c5444786d0ea ("netfilter: nft_set_pipapo: merge deactivate helper into caller")
      6c108d9bee44 ("netfilter: nft_set_pipapo: prepare walk function for on-demand clone")
      8b8a2417558c ("netfilter: nft_set_pipapo: prepare destroy function for on-demand clone")
      80efd2997fb9 ("netfilter: nft_set_pipapo: make pipapo_clone helper return NULL")
      a590f4760922 ("netfilter: nft_set_pipapo: move prove_locking helper around")
    
    After this series, the clone is fully released once aborted, so there is
    no need to restore it to its previous state. Thus, no stale reference to
    elements can occur.
    
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: nf_tables: prevent nf_skb_duplicated corruption [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 18:56:11 2024 +0000

    netfilter: nf_tables: prevent nf_skb_duplicated corruption
    
    [ Upstream commit 92ceba94de6fb4cee2bf40b485979c342f44a492 ]
    
    syzbot found that nf_dup_ipv4() or nf_dup_ipv6() could write
    per-cpu variable nf_skb_duplicated in an unsafe way [1].
    
    Disabling preemption as hinted by the splat is not enough,
    we have to disable soft interrupts as well.
    
    [1]
    BUG: using __this_cpu_write() in preemptible [00000000] code: syz.4.282/6316
     caller is nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
    CPU: 0 UID: 0 PID: 6316 Comm: syz.4.282 Not tainted 6.11.0-rc7-syzkaller-00104-g7052622fccb1 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
      check_preemption_disabled+0x10e/0x120 lib/smp_processor_id.c:49
      nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
      nft_dup_ipv4_eval+0x1db/0x300 net/ipv4/netfilter/nft_dup_ipv4.c:30
      expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline]
      nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288
      nft_do_chain_ipv4+0x202/0x320 net/netfilter/nft_chain_filter.c:23
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
      nf_hook+0x2c4/0x450 include/linux/netfilter.h:269
      NF_HOOK_COND include/linux/netfilter.h:302 [inline]
      ip_output+0x185/0x230 net/ipv4/ip_output.c:433
      ip_local_out net/ipv4/ip_output.c:129 [inline]
      ip_send_skb+0x74/0x100 net/ipv4/ip_output.c:1495
      udp_send_skb+0xacf/0x1650 net/ipv4/udp.c:981
      udp_sendmsg+0x1c21/0x2a60 net/ipv4/udp.c:1269
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x1a6/0x270 net/socket.c:745
      ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
      ___sys_sendmsg net/socket.c:2651 [inline]
      __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
      __do_sys_sendmmsg net/socket.c:2766 [inline]
      __se_sys_sendmmsg net/socket.c:2763 [inline]
      __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f4ce4f7def9
    Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f4ce5d4a038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00007f4ce5135f80 RCX: 00007f4ce4f7def9
    RDX: 0000000000000001 RSI: 0000000020005d40 RDI: 0000000000000006
    RBP: 00007f4ce4ff0b76 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007f4ce5135f80 R15: 00007ffd4cbc6d68
     </TASK>
    
    Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED [+ + +]

Author: Phil Sutter <[email protected]>
Date:   Wed Sep 25 20:01:20 2024 +0200

    netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
    
    [ Upstream commit 76f1ed087b562a469f2153076f179854b749c09a ]
    
    Fix the comment which incorrectly defines it as NLA_U32.
    
    Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
    Signed-off-by: Phil Sutter <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfs: Cancel dirty folios that have no storage destination [+ + +]

Author: David Howells <[email protected]>
Date:   Mon Jul 29 12:23:11 2024 +0100

    netfs: Cancel dirty folios that have no storage destination
    
    [ Upstream commit 8f246b7c0a1be0882374f2ff831a61f0dbe77678 ]
    
    Kafs wants to be able to cache the contents of directories (and symlinks),
    but whilst these are downloaded from the server with the FS.FetchData RPC
    op and similar, the same as for regular files, they can't be updated by
    FS.StoreData, but rather have special operations (FS.MakeDir, etc.).
    
    Now, rather than redownloading a directory's content after each change made
    to that directory, kafs modifies the local blob.  This blob can be saved
    out to the cache, and since it's using netfslib, kafs just marks the folios
    dirty and lets ->writepages() on the directory take care of it, as for a
    regular file.
    
    This is fine as long as there's a cache: although the upload stream is
    disabled, there's a cache stream to drive the procedure.  But if the cache
    goes away in the meantime, suddenly there's no way to do any writes and the
    code gets confused, complains "R=%x: No submit" to dmesg and leaves the
    dirty folio hanging.
    
    Fix this by just cancelling the store of the folio if neither stream is
    active.  (If there's no cache at the time of dirtying, we should just not
    mark the folio dirty).
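    
    As a rough userspace sketch of that decision, assuming a request has
    exactly two write streams (struct fake_wreq, enum fake_folio_action and
    fake_write_folio() are hypothetical names, not netfslib's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of trying to write back one dirty folio. */
enum fake_folio_action {
    FAKE_FOLIO_SUBMIT,   /* at least one stream can take the data */
    FAKE_FOLIO_CANCEL    /* nowhere to store it: drop the dirty state */
};

/* Hypothetical model of the two write streams of a writeback request. */
struct fake_wreq {
    bool upload_active;  /* stream to the server */
    bool cache_active;   /* stream to the local cache */
};

/* Decide what to do with a dirty folio given the currently active streams. */
enum fake_folio_action fake_write_folio(const struct fake_wreq *wreq)
{
    assert(wreq != NULL);
    if (!wreq->upload_active && !wreq->cache_active)
        return FAKE_FOLIO_CANCEL;
    return FAKE_FOLIO_SUBMIT;
}
```

    For a kafs directory the upload stream is always off in this model, so
    losing the cache stream is what flips the result to FAKE_FOLIO_CANCEL.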
    
    Signed-off-by: David Howells <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]/ # v2
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfs: Fix missing wakeup after issuing writes [+ + +]

Author: David Howells <[email protected]>
Date:   Wed Oct 2 15:45:50 2024 +0100

    netfs: Fix missing wakeup after issuing writes
    
    [ Upstream commit 1ca4169c391c370e0f3a92938df2862900575096 ]
    
    After dividing up a proposed write into subrequests, netfslib sets
    NETFS_RREQ_ALL_QUEUED to indicate to the collector that it can move on to
    the final cleanup once it has emptied the subrequest queues.
    
    Now, whilst the collector will normally end up running at least once after
    this bit is set just because it takes a while to process all the write
    subrequests before the collector runs out of subrequests, there exists the
    possibility that the issuing thread will be forced to sleep and the
    collector thread will clean up all the subrequests before ALL_QUEUED gets
    set.
    
    In such a case, the collector thread will not get triggered again and will
    never clear NETFS_RREQ_IN_PROGRESS thus leaving a request uncompleted and
    causing a potential future hang.
    
    Fix this by scheduling the write collector if all the subrequest queues are
    empty (and thus no writes pending issuance).
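    
    A minimal single-threaded model of that rule, under the assumption that
    a wakeup can be represented by a counter (struct fake_write_request,
    FAKE_NR_STREAMS and fake_end_issue_write() are hypothetical and do not
    mirror netfslib's types):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_NR_STREAMS 2   /* e.g. one upload stream and one cache stream */

/* Hypothetical write request: per-stream queue depth plus bookkeeping. */
struct fake_write_request {
    bool all_queued;                /* models NETFS_RREQ_ALL_QUEUED */
    int queued[FAKE_NR_STREAMS];    /* subrequests still on each queue */
    int collector_wakeups;          /* how often the collector was kicked */
};

static bool fake_all_queues_empty(const struct fake_write_request *wreq)
{
    size_t s;

    for (s = 0; s < FAKE_NR_STREAMS; s++) {
        if (wreq->queued[s] > 0)
            return false;
    }
    return true;
}

/* Called by the issuing side once no more subrequests will be created. */
void fake_end_issue_write(struct fake_write_request *wreq)
{
    assert(wreq != NULL);
    wreq->all_queued = true;

    /*
     * If the collector already drained every queue, nothing else will
     * wake it, so kick it here to let it run the final cleanup.
     */
    if (fake_all_queues_empty(wreq))
        wreq->collector_wakeups++;
}
```

    When a queue still holds subrequests, their completion is what drives
    the collector, so no extra wakeup is recorded in that case.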
    
    Note that we'd do this ideally before queuing the subrequest, but in the
    case of buffered writeback, at least, we can't find out that we've run out
    of folios until after we've called writeback_iter() and it has returned
    NULL - at which point we might not actually have any subrequests still
    under construction.
    
    Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netpoll: Ensure clean state on setup failures [+ + +]

Author: Breno Leitao <[email protected]>
Date:   Thu Aug 22 04:10:47 2024 -0700

    netpoll: Ensure clean state on setup failures
    
    [ Upstream commit ae5a0456e0b4cfd7e61619e55251ffdf1bc7adfb ]
    
    Modify netpoll_setup() and __netpoll_setup() to ensure that the netpoll
    structure (np) is left in a clean state if setup fails for any reason.
    This prevents carrying over misconfigured fields in case of partial
    setup success.
    
    Key changes:
    - np->dev is now set only after successful setup, ensuring it's always
      NULL if netpoll is not configured or if netpoll_setup() fails.
    - np->local_ip is zeroed if netpoll setup doesn't complete successfully.
    - Added DEBUG_NET_WARN_ON_ONCE() checks to catch unexpected states.
    - Reordered some operations in __netpoll_setup() for better logical flow.
    
    These changes improve the reliability of netpoll configuration, since it
    assures that the structure is fully initialized or totally unset.
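    
    The "fully initialized or totally unset" rule can be illustrated with a
    small userspace model. It is a sketch only: struct fake_netpoll,
    struct fake_netdev and fake_netpoll_setup() are hypothetical, and
    step_err simulates any failure partway through setup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the device and netpoll structures. */
struct fake_netdev {
    int ifindex;
};

struct fake_netpoll {
    struct fake_netdev *dev;    /* NULL unless setup completed */
    unsigned char local_ip[16];
};

/* step_err simulates a failure in the middle of setup (0 means success). */
int fake_netpoll_setup(struct fake_netpoll *np, struct fake_netdev *ndev,
                       int step_err)
{
    assert(np != NULL && ndev != NULL);

    /* Pretend the local address was derived from the device. */
    np->local_ip[0] = 10;

    if (step_err) {
        /* Failure: wipe anything that was filled in along the way. */
        memset(np->local_ip, 0, sizeof(np->local_ip));
        np->dev = NULL;
        return step_err;
    }

    /* Success: publish the device pointer last. */
    np->dev = ndev;
    return 0;
}
```

    Callers can then treat a NULL dev pointer as a reliable "not configured"
    indicator, because no failure path leaves it set.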
    
    Suggested-by: Paolo Abeni <[email protected]>
    Signed-off-by: Breno Leitao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nfp: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:44:45 2024 +0800

    nfp: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit daaba19d357f0900b303a530ced96c78086267ea ]
    
    Calling disable_irq() after request_irq() still leaves a time gap in
    which interrupts can arrive. Passing the IRQF_NO_AUTOEN flag to
    request_irq() prevents the IRQ from being auto-enabled when it is
    requested.
    
    Reviewed-by: Louis Peens <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFSD: Async COPY result needs to return a write verifier [+ + +]

Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:03 2024 -0400

    NFSD: Async COPY result needs to return a write verifier
    
    [ Upstream commit 9ed666eba4e0a2bb8ffaa3739d830b64d4f2aaad ]
    
    Currently, when NFSD handles an asynchronous COPY, it returns a
    zero write verifier, relying on the subsequent CB_OFFLOAD callback
    to pass the write verifier and a stable_how4 value to the client.
    
    However, if the CB_OFFLOAD never arrives at the client (for example,
    if a network partition occurs just as the server sends the
    CB_OFFLOAD operation), the client will never receive this verifier.
    Thus, if the client sends a follow-up COMMIT, there is no way for
    the client to assess the COMMIT result.
    
    The usual recovery for a missing CB_OFFLOAD is for the client to
    send an OFFLOAD_STATUS operation, but that operation does not carry
    a write verifier in its result. Neither does it carry a stable_how4
    value, so the client /must/ send a COMMIT in this case -- which will
    always fail because currently there's still no write verifier in the
    COPY result.
    
    Thus the server needs to return a normal write verifier in its COPY
    result even if the COPY operation is to be performed asynchronously.
    
    If the server recognizes the callback stateid in subsequent
    OFFLOAD_STATUS operations, then obviously it has not restarted, and
    the write verifier the client received in the COPY result is still
    valid and can be used to assess a COMMIT of the copied data, if one
    is needed.
    
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <[email protected]>

nfsd: fix delegation_blocked() to block correctly for at least 30 seconds [+ + +]

Author: NeilBrown <[email protected]>
Date:   Mon Sep 9 15:06:36 2024 +1000

    nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
    
    commit 45bb63ed20e02ae146336412889fe5450316a84f upstream.
    
    The pair of bloom filters used by delegation_blocked() was intended to
    block delegations on given filehandles for between 30 and 60 seconds.  A
    new filehandle would be recorded in the "new" bit set.  That would then
    be switched to the "old" bit set between 0 and 30 seconds later, and it
    would remain as the "old" bit set for 30 seconds.
    
    Unfortunately the code intended to clear the old bit set once it reached
    30 seconds old, preparing it to be the next new bit set, instead cleared
    the *new* bit set before switching it to be the old bit set.  This means
    that the "old" bit set is always empty and delegations are blocked
    between 0 and 30 seconds.
    
    This patch updates bd->new before clearing the set with that index,
    instead of afterwards.
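    
    The ordering matters enough that a toy model may help. The sketch below
    assumes a single hash bit per filehandle and one rotation per 30-second
    period; struct fake_blocked and the fake_*() helpers are hypothetical
    and much simpler than the real bloom filter code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FAKE_SET_BITS 256

/* Hypothetical, simplified version of the "new"/"old" pair of bit sets. */
struct fake_blocked {
    unsigned char set[2][FAKE_SET_BITS / 8];
    int new_idx;    /* index of the set that records new entries */
};

/* Toy hash; a real bloom filter would set several bits per entry. */
static unsigned int fake_hash(unsigned int fh)
{
    return fh % FAKE_SET_BITS;
}

/* Record a filehandle in the current "new" set. */
void fake_block(struct fake_blocked *bd, unsigned int fh)
{
    unsigned int bit = fake_hash(fh);

    assert(bd != NULL);
    bd->set[bd->new_idx][bit / 8] |= (unsigned char)(1u << (bit % 8));
}

/* A filehandle is blocked if either set has its bit. */
bool fake_is_blocked(const struct fake_blocked *bd, unsigned int fh)
{
    unsigned int bit = fake_hash(fh);

    assert(bd != NULL);
    return (((bd->set[0][bit / 8] | bd->set[1][bit / 8]) >> (bit % 8)) & 1) != 0;
}

/* Called once per period. */
void fake_rotate(struct fake_blocked *bd)
{
    assert(bd != NULL);
    /* Switch first: the previous "new" set becomes the "old" set ... */
    bd->new_idx = 1 - bd->new_idx;
    /* ... and only then clear the set that will record new entries. */
    memset(bd->set[bd->new_idx], 0, sizeof(bd->set[bd->new_idx]));
}
```

    With this order an entry survives exactly one rotation as part of the
    "old" set. Clearing before switching would wipe the entry at the first
    rotation, which is the bug the commit describes.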
    
    Reported-by: Olga Kornievskaia <[email protected]>
    Cc: [email protected]
    Fixes: 6282cd565553 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.")
    Signed-off-by: NeilBrown <[email protected]>
    Reviewed-by: Benjamin Coddington <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

NFSD: Fix NFSv4's PUTPUBFH operation [+ + +]

Author: Chuck Lever <[email protected]>
Date:   Sun Aug 11 13:11:07 2024 -0400

    NFSD: Fix NFSv4's PUTPUBFH operation
    
    commit 202f39039a11402dcbcd5fece8d9fa6be83f49ae upstream.
    
    According to RFC 8881, all minor versions of NFSv4 support PUTPUBFH.
    
    Replace the XDR decoder for PUTPUBFH with a "noop" since we no
    longer want the minorversion check, and PUTPUBFH has no arguments to
    decode. (Ideally nfsd4_decode_noop should really be called
    nfsd4_decode_void).
    
    PUTPUBFH should now behave just like PUTROOTFH.
    
    Reported-by: Cedric Blancher <[email protected]>
    Fixes: e1a90ebd8b23 ("NFSD: Combine decode operations for v4 and v4.1")
    Cc: Dan Shelton <[email protected]>
    Cc: Roland Mainz <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

NFSD: Limit the number of concurrent async COPY operations [+ + +]

Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:04 2024 -0400

    NFSD: Limit the number of concurrent async COPY operations
    
    [ Upstream commit aadc3bbea163b6caaaebfdd2b6c4667fbc726752 ]
    
    Nothing appears to limit the number of concurrent async COPY
    operations that clients can start. In addition, AFAICT each async
    COPY can copy an unlimited number of 4MB chunks, so can run for a
    long time. Thus IMO async COPY can become a DoS vector.
    
    Add a restriction mechanism that bounds the number of concurrent
    background COPY operations. Start simple and try to be fair -- this
    patch implements a per-namespace limit.
    
    An async COPY request that occurs while this limit is exceeded gets
    NFS4ERR_DELAY. The requesting client can choose to send the request
    again after a delay or fall back to a traditional read/write style
    copy.
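    
    A simplified model of such a limit is sketched below. The names
    (struct fake_nfsd_net, fake_start_async_copy(), fake_finish_async_copy())
    and the limit value are assumptions, and a plain int is used for brevity
    where concurrent code would need an atomic counter. 10008 is the
    protocol value of NFS4ERR_DELAY.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_NFS4_OK        0
#define FAKE_NFS4ERR_DELAY  10008   /* protocol value of NFS4ERR_DELAY */

/* Hypothetical per-namespace state; the limit is chosen by the caller. */
struct fake_nfsd_net {
    int pending_async_copies;   /* not thread-safe in this sketch */
    int max_async_copies;
};

/* Try to reserve a slot for a new background COPY. */
int fake_start_async_copy(struct fake_nfsd_net *nn)
{
    assert(nn != NULL);
    if (nn->pending_async_copies >= nn->max_async_copies)
        return FAKE_NFS4ERR_DELAY;  /* client may retry later or fall back */
    nn->pending_async_copies++;
    return FAKE_NFS4_OK;
}

/* Release the slot when the background COPY completes or is cancelled. */
void fake_finish_async_copy(struct fake_nfsd_net *nn)
{
    assert(nn != NULL);
    assert(nn->pending_async_copies > 0);
    nn->pending_async_copies--;
}
```

    Keeping the counter per namespace means one container's clients cannot
    use up the slots of another, which is one reading of "try to be fair".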
    
    If there is need to make the mechanism more sophisticated, we can
    visit that in future patches.
    
    Cc: [email protected]
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nfsd: map the EBADMSG to nfserr_io to avoid warning [+ + +]

Author: Li Lingfeng <[email protected]>
Date:   Sat Aug 17 14:27:13 2024 +0800

    nfsd: map the EBADMSG to nfserr_io to avoid warning
    
    commit 340e61e44c1d2a15c42ec72ade9195ad525fd048 upstream.
    
    Ext4 will throw -EBADMSG through ext4_readdir when a checksum error
    occurs, resulting in the following WARNING.
    
    Fix it by mapping EBADMSG to nfserr_io.
    
    nfsd_buffered_readdir
     iterate_dir // -EBADMSG -74
      ext4_readdir // .iterate_shared
       ext4_dx_readdir
        ext4_htree_fill_tree
         htree_dirblock_to_tree
          ext4_read_dirblock
           __ext4_read_dirblock
            ext4_dirblock_csum_verify
             warn_no_space_for_csum
              __warn_no_space_for_csum
            return ERR_PTR(-EFSBADCRC) // -EBADMSG -74
     nfserrno // WARNING
    
    [  161.115610] ------------[ cut here ]------------
    [  161.116465] nfsd: non-standard errno: -74
    [  161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0
    [  161.118596] Modules linked in:
    [  161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138
    [  161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    [  161.123601] RIP: 0010:nfserrno+0x9d/0xd0
    [  161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33
    [  161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286
    [  161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    [  161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a
    [  161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827
    [  161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021
    [  161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8
    [  161.135244] FS:  0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000
    [  161.136695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0
    [  161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  161.141519] PKRU: 55555554
    [  161.142076] Call Trace:
    [  161.142575]  ? __warn+0x9b/0x140
    [  161.143229]  ? nfserrno+0x9d/0xd0
    [  161.143872]  ? report_bug+0x125/0x150
    [  161.144595]  ? handle_bug+0x41/0x90
    [  161.145284]  ? exc_invalid_op+0x14/0x70
    [  161.146009]  ? asm_exc_invalid_op+0x12/0x20
    [  161.146816]  ? nfserrno+0x9d/0xd0
    [  161.147487]  nfsd_buffered_readdir+0x28b/0x2b0
    [  161.148333]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.149258]  ? nfsd_buffered_filldir+0xf0/0xf0
    [  161.150093]  ? wait_for_concurrent_writes+0x170/0x170
    [  161.151004]  ? generic_file_llseek_size+0x48/0x160
    [  161.151895]  nfsd_readdir+0x132/0x190
    [  161.152606]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.153516]  ? nfsd_unlink+0x380/0x380
    [  161.154256]  ? override_creds+0x45/0x60
    [  161.155006]  nfsd4_encode_readdir+0x21a/0x3d0
    [  161.155850]  ? nfsd4_encode_readlink+0x210/0x210
    [  161.156731]  ? write_bytes_to_xdr_buf+0x97/0xe0
    [  161.157598]  ? __write_bytes_to_xdr_buf+0xd0/0xd0
    [  161.158494]  ? lock_downgrade+0x90/0x90
    [  161.159232]  ? nfs4svc_decode_voidarg+0x10/0x10
    [  161.160092]  nfsd4_encode_operation+0x15a/0x440
    [  161.160959]  nfsd4_proc_compound+0x718/0xe90
    [  161.161818]  nfsd_dispatch+0x18e/0x2c0
    [  161.162586]  svc_process_common+0x786/0xc50
    [  161.163403]  ? nfsd_svc+0x380/0x380
    [  161.164137]  ? svc_printk+0x160/0x160
    [  161.164846]  ? svc_xprt_do_enqueue.part.0+0x365/0x380
    [  161.165808]  ? nfsd_svc+0x380/0x380
    [  161.166523]  ? rcu_is_watching+0x23/0x40
    [  161.167309]  svc_process+0x1a5/0x200
    [  161.168019]  nfsd+0x1f5/0x380
    [  161.168663]  ? nfsd_shutdown_threads+0x260/0x260
    [  161.169554]  kthread+0x1c4/0x210
    [  161.170224]  ? kthread_insert_work_sanity_check+0x80/0x80
    [  161.171246]  ret_from_fork+0x1f/0x30
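    
    The mapping idea can be sketched with a small lookup table in userspace
    C. The table below is a hypothetical, heavily shortened stand-in for the
    real one, and fake_nfserrno() reports the "non-standard errno" case
    through a flag instead of a WARNING.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FAKE_NFS_OK     0
#define FAKE_NFSERR_IO  5   /* NFS status code for a generic I/O error */

/* Hypothetical, heavily shortened errno-to-NFS-status table. */
static const struct {
    int nfserr;
    int syserr;
} fake_nfs_errtbl[] = {
    { FAKE_NFS_OK,    0 },
    { FAKE_NFSERR_IO, -EIO },
    { FAKE_NFSERR_IO, -EBADMSG },   /* e.g. directory block checksum failure */
};

/*
 * Translate a negative errno. Unknown values fall back to an I/O error
 * and set *warned, which is where the real code would emit its warning.
 */
int fake_nfserrno(int syserr, int *warned)
{
    size_t i;

    assert(warned != NULL);
    *warned = 0;
    for (i = 0; i < sizeof(fake_nfs_errtbl) / sizeof(fake_nfs_errtbl[0]); i++) {
        if (fake_nfs_errtbl[i].syserr == syserr)
            return fake_nfs_errtbl[i].nfserr;
    }
    *warned = 1;
    return FAKE_NFSERR_IO;
}
```

    With the -EBADMSG row present the client sees the same I/O error status
    as before, but the lookup no longer takes the fallback path.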
    
    Signed-off-by: Li Lingfeng <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

nvme-keyring: restrict match length for version '1' identifiers [+ + +]

Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:18 2024 +0200

    nvme-keyring: restrict match length for version '1' identifiers
    
    [ Upstream commit 79559c75332458985ab8a21f11b08bf7c9b833b0 ]
    
    TP8018 introduced a new TLS PSK identifier version (version 1), which appended
    a PSK hash value to the existing identifier (cf NVMe TCP specification v1.1,
    section 3.6.1.3 'TLS PSK and PSK Identity Derivation').
    An original (version 0) identifier has the form:
    
    NVMe0<type><hmac> <hostnqn> <subsysnqn>
    
    and a version 1 identifier has the form:
    
    NVMe1<type><hmac> <hostnqn> <subsysnqn> <hash>
    
    This patch modifies the lookup algorithm to compare only the first part
    of the identifier (excluding the hash value) to handle both version 0 and
    version 1 identifiers.
    And the spec declares 'version 0' identifiers obsolete, so the lookup
    algorithm is modified to prefer v1 identifiers.
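    
    One way to picture the restricted match length is the userspace sketch
    below. It assumes the fields are separated by single spaces as in the
    two forms quoted above; fake_psk_match_len() and fake_psk_match() are
    hypothetical helpers, and the preference for v1 identifiers is not
    modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of the part of a PSK identity that takes part in matching.
 * For "NVMe1..." identities this stops before the third space, i.e.
 * before the trailing hash; other identities are compared in full.
 */
size_t fake_psk_match_len(const char *identity)
{
    size_t i, spaces = 0;

    assert(identity != NULL);
    if (strncmp(identity, "NVMe1", 5) != 0)
        return strlen(identity);

    for (i = 0; identity[i] != '\0'; i++) {
        if (identity[i] == ' ' && ++spaces == 3)
            break;
    }
    return i;
}

/* True if a stored identity matches a lookup string that carries no hash. */
bool fake_psk_match(const char *stored, const char *wanted)
{
    size_t len;

    assert(stored != NULL && wanted != NULL);
    len = fake_psk_match_len(stored);
    return strlen(wanted) == len && strncmp(stored, wanted, len) == 0;
}
```

    A lookup string built from type, hmac, host NQN and subsystem NQN then
    matches a stored version 1 key regardless of which hash value follows.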
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: check for invalidated or revoked key [+ + +]

Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:20 2024 +0200

    nvme-tcp: check for invalidated or revoked key
    
    [ Upstream commit 5bc46b49c828a6dfaab80b71ecb63fe76a1096d2 ]
    
    key_lookup() will always return a key, even if that key is revoked
    or invalidated. So check for invalid keys before continuing.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: fix link failure for TCP auth [+ + +]

Author: Arnd Bergmann <[email protected]>
Date:   Mon Sep 9 20:21:09 2024 +0000

    nvme-tcp: fix link failure for TCP auth
    
    [ Upstream commit 2d5a333e09c388189238291577e443221baacba0 ]
    
    The nvme fabric driver calls the nvme_tls_key_lookup() function from
    nvmf_parse_key() when the keyring is enabled, but this is broken in a
    configuration with CONFIG_NVME_FABRICS=y and CONFIG_NVME_TCP=m because
    this leads to the function definition being in a loadable module:
    
    x86_64-linux-ld: vmlinux.o: in function `nvmf_parse_key':
    fabrics.c:(.text+0xb1bdec): undefined reference to `nvme_tls_key_lookup'
    
    Move the 'select' up to CONFIG_NVME_FABRICS itself to force this
    part to be built-in as well if needed.
    
    Fixes: 5bc46b49c828 ("nvme-tcp: check for invalidated or revoked key")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: sanitize TLS key handling [+ + +]

Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:19 2024 +0200

    nvme-tcp: sanitize TLS key handling
    
    [ Upstream commit 363895767fbfa05891b0b4d9e06ebde7a10c6a07 ]
    
    There is a difference between TLS configured (ie the user has
    provisioned/requested a key) and TLS enabled (ie the connection
    is encrypted with TLS). This becomes important for secure concatenation,
    where the initial authentication is run on an unencrypted connection
    (ie with TLS configured, but not enabled), and then the queue is reset to
    run over TLS (ie TLS configured _and_ enabled).
    So to differentiate between those two states store the generated
    key in opts->tls_key (as we're using the same TLS key for all queues),
    the key serial of the resulting TLS handshake in ctrl->tls_pskid
    (to signal that TLS on the admin queue is enabled), and a simple
    flag for the queues to indicate that TLS has been enabled.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme: fix metadata handling in nvme-passthrough [+ + +]

Author: Puranjay Mohan <[email protected]>
Date:   Thu Aug 29 13:32:17 2024 +0000

    nvme: fix metadata handling in nvme-passthrough
    
    [ Upstream commit 7c2fd76048e95dd267055b5f5e0a48e6e7c81fd9 ]
    
    On an NVMe namespace that does not support metadata, it is possible to
    send an IO command with metadata through io-passthru. This allows issues
    like [1] to trigger in the completion code path.
    nvme_map_user_request() doesn't check if the namespace supports metadata
    before sending it forward. It also allows admin commands with metadata to
    be processed as it ignores metadata when bdev == NULL and may report
    success.
    
    Reject an IO command with metadata when the NVMe namespace doesn't
    support it and reject an admin command if it has metadata.
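    
    The two rejection rules can be written down as a small predicate. This
    is a userspace sketch under assumptions: struct fake_passthru_req and
    fake_check_metadata() are hypothetical, and -EINVAL is used as a
    placeholder error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a passthrough request. */
struct fake_passthru_req {
    bool is_admin;              /* admin command: no namespace involved */
    bool ns_supports_metadata;  /* only meaningful for I/O commands */
    size_t meta_len;            /* length of the user metadata buffer */
};

/* Returns 0 if the request may proceed, -EINVAL if metadata is not allowed. */
int fake_check_metadata(const struct fake_passthru_req *req)
{
    assert(req != NULL);

    if (req->meta_len == 0)
        return 0;               /* no metadata: nothing to validate */
    if (req->is_admin)
        return -EINVAL;         /* admin commands must not carry metadata */
    if (!req->ns_supports_metadata)
        return -EINVAL;         /* namespace cannot take metadata */
    return 0;
}
```

    Checking before the request is mapped keeps the completion path from
    ever seeing metadata it has no way to handle.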
    
    [1] https://lore.kernel.org/all/[email protected]/
    
    Suggested-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Puranjay Mohan <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Anuj Gupta <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ocfs2: cancel dqi_sync_work before freeing oinfo [+ + +]

Author: Joseph Qi <[email protected]>
Date:   Wed Sep 4 15:10:03 2024 +0800

    ocfs2: cancel dqi_sync_work before freeing oinfo
    
    commit 35fccce29feb3706f649726d410122dd81b92c18 upstream.
    
    ocfs2_global_read_info() initializes and schedules dqi_sync_work at the
    end.  If an error occurs after the global quota has been read
    successfully, it triggers the following warning with
    CONFIG_DEBUG_OBJECTS_* enabled:
    
    ODEBUG: free active (active state 0) object: 00000000d8b0ce28 object type: timer_list hint: qsync_work_fn+0x0/0x16c
    
    This reports that there is an active delayed work when freeing oinfo in
    error handling, so cancel dqi_sync_work first.  BTW, return status instead
    of -1 when .read_file_info fails.
    
    Link: https://syzkaller.appspot.com/bug?extid=f7af59df5d6b25f0febd
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 171bf93ce11f ("ocfs2: Periodic quota syncing")
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Heming Zhao <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix null-ptr-deref when journal load failed. [+ + +]

Author: Julian Sun <[email protected]>
Date:   Mon Sep 2 11:08:44 2024 +0800

    ocfs2: fix null-ptr-deref when journal load failed.
    
    commit 5784d9fcfd43bd853654bb80c87ef293b9e8e80a upstream.
    
    During the mounting process, if journal_reset() fails because the journal
    is too short, jbd2_journal_load() fails with a NULL j_sb_buffer.
    Subsequently, ocfs2_journal_shutdown() calls
    jbd2_journal_flush()->jbd2_cleanup_journal_tail()->
    __jbd2_update_log_tail()->jbd2_journal_update_sb_log_tail()
    ->lock_buffer(journal->j_sb_buffer), resulting in a null-pointer
    dereference error.
    
    To resolve this issue, we should check the JBD2_LOADED flag to ensure the
    journal was properly loaded.  Additionally, use journal instead of
    osb->journal directly to simplify the code.
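    
    A minimal userspace sketch of the guard described above, not the actual
    ocfs2 change.  The struct, the J_LOADED constant (standing in for
    JBD2_LOADED) and shutdown_journal() are hypothetical names used only for
    illustration:
    
    ```c
    #include <stddef.h>
    
    #define J_LOADED 0x1u           /* stand-in for the JBD2_LOADED flag */
    
    struct journal {                /* hypothetical, heavily simplified */
        unsigned int flags;
        void *sb_buffer;            /* stays NULL when the load failed */
    };
    
    /* Returns 1 if a flush was attempted, 0 if it was skipped. */
    int shutdown_journal(struct journal *j)
    {
        /* A journal that never finished loading has nothing safe to flush. */
        if (j == NULL || !(j->flags & J_LOADED))
            return 0;
    
        /* A real flush would lock and write through j->sb_buffer here. */
        return j->sb_buffer != NULL;
    }
    ```
    
    With the flag check in place, the flush path is never entered for a
    journal whose superblock buffer was never set up.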
    
    Link: https://syzkaller.appspot.com/bug?extid=05b9b39d8bdfe1a0861f
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: f6f50e28f0cb ("jbd2: Fail to load a journal if it is too short")
    Signed-off-by: Julian Sun <[email protected]>
    Reported-by: [email protected]
    Suggested-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate [+ + +]

Author: Lizhi Xu <[email protected]>
Date:   Mon Sep 2 10:36:36 2024 +0800

    ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
    
    commit 33b525cef4cff49e216e4133cc48452e11c0391e upstream.
    
    During cleanup, if flags does not include OCFS2_BH_READAHEAD, the
    subsequent ocfs2_set_buffer_uptodate() call may dereference a NULL
    pointer if bh is NULL.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <[email protected]>
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Reported-by: Heming Zhao <[email protected]>
    Suggested-by: Heming Zhao <[email protected]>
    Cc: <[email protected]>    [4.20+]
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix the la space leak when unmounting an ocfs2 volume [+ + +]

Author: Heming Zhao <[email protected]>
Date:   Fri Jul 19 19:43:10 2024 +0800

    ocfs2: fix the la space leak when unmounting an ocfs2 volume
    
    commit dfe6c5692fb525e5e90cefe306ee0dffae13d35f upstream.
    
    This bug has existed since the initial OCFS2 code.  The code logic in
    ocfs2_sync_local_to_main() is wrong, as it ignores the last contiguous
    free bits, which causes an OCFS2 volume to lose the last free clusters of
    LA window on each umount command.
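    
    A minimal sketch of this class of bug, assuming a simplified bitmap with
    one byte per bit (1 = used, 0 = free).  sum_free_runs() and its
    flush_tail switch are hypothetical and are not the real
    ocfs2_sync_local_to_main() logic; they only show how a free run that
    reaches the end of the window is lost when it is not flushed after the
    loop:
    
    ```c
    #include <stddef.h>
    
    /*
     * Walk the bitmap and return how many free bits are handed back in
     * contiguous runs.  With flush_tail == 0 a run that extends to the end
     * of the bitmap is never returned.
     */
    size_t sum_free_runs(const unsigned char *bits, size_t nbits,
                         int flush_tail)
    {
        size_t returned = 0;
        size_t run = 0;
        size_t i;
    
        for (i = 0; i < nbits; i++) {
            if (bits[i] == 0) {
                run++;              /* extend the current free run */
            } else if (run) {
                returned += run;    /* a used bit closes the run */
                run = 0;
            }
        }
    
        /* The step a buggy loop omits: flush the trailing free run. */
        if (flush_tail && run)
            returned += run;
    
        return returned;
    }
    ```
    
    For the window 1,0,0,1,0,0,0 the run of two free bits is returned either
    way, while the final run of three is only returned when the tail is
    flushed.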
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Heming Zhao <[email protected]>
    Reviewed-by: Su Yue <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Heming Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix uninit-value in ocfs2_get_block() [+ + +]

Author: Joseph Qi <[email protected]>
Date:   Wed Sep 25 17:06:00 2024 +0800

    ocfs2: fix uninit-value in ocfs2_get_block()
    
    commit 2af148ef8549a12f8025286b8825c2833ee6bcb8 upstream.
    
    syzbot reported an uninit-value BUG:
    
    BUG: KMSAN: uninit-value in ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    do_mpage_readpage+0xc45/0x2780 fs/mpage.c:225
    mpage_readahead+0x43f/0x840 fs/mpage.c:374
    ocfs2_readahead+0x269/0x320 fs/ocfs2/aops.c:381
    read_pages+0x193/0x1110 mm/readahead.c:160
    page_cache_ra_unbounded+0x901/0x9f0 mm/readahead.c:273
    do_page_cache_ra mm/readahead.c:303 [inline]
    force_page_cache_ra+0x3b1/0x4b0 mm/readahead.c:332
    force_page_cache_readahead mm/internal.h:347 [inline]
    generic_fadvise+0x6b0/0xa90 mm/fadvise.c:106
    vfs_fadvise mm/fadvise.c:185 [inline]
    ksys_fadvise64_64 mm/fadvise.c:199 [inline]
    __do_sys_fadvise64 mm/fadvise.c:214 [inline]
    __se_sys_fadvise64 mm/fadvise.c:212 [inline]
    __x64_sys_fadvise64+0x1fb/0x3a0 mm/fadvise.c:212
    x64_sys_call+0xe11/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:222
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    This is because when ocfs2_extent_map_get_blocks() fails, p_blkno is
    uninitialized.  So the error log will trigger the above uninit-value
    access.
    
    The error log is out-of-date since get_blocks() was removed a long time ago.
    And the error code will be logged in ocfs2_extent_map_get_blocks() once
    ocfs2_get_cluster() fails, so fix this by only logging inode and block.
    
    Link: https://syzkaller.appspot.com/bug?extid=9709e73bae885b05314b
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
    Signed-off-by: Joseph Qi <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Cc: Heming Zhao <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: remove unreasonable unlock in ocfs2_read_blocks [+ + +]

Author: Lizhi Xu <[email protected]>
Date:   Mon Sep 2 10:36:35 2024 +0800

    ocfs2: remove unreasonable unlock in ocfs2_read_blocks
    
    commit c03a82b4a0c935774afa01fd6d128b444fd930a1 upstream.
    
    Patch series "Misc fixes for ocfs2_read_blocks", v5.
    
    This series contains 2 fixes for ocfs2_read_blocks().  The first patch fixes
    the issue reported by syzbot, which detects bad unlock balance in
    ocfs2_read_blocks().  The second patch fixes an issue reported by Heming
    Zhao when reviewing above fix.
    
    
    This patch (of 2):
    
    There was a lock release before exiting, so remove the unreasonable unlock.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <[email protected]>
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Heming Zhao <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=ab134185af9ef88dfed5
    Tested-by: [email protected]
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>    [4.20+]
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: reserve space for inline xattr before attaching reflink tree [+ + +]

Author: Gautham Ananthakrishna <[email protected]>
Date:   Wed Sep 18 06:38:44 2024 +0000

    ocfs2: reserve space for inline xattr before attaching reflink tree
    
    commit 5ca60b86f57a4d9648f68418a725b3a7de2816b0 upstream.
    
    One of our customers reported a crash and a corrupted ocfs2 filesystem.
    The crash was due to the detection of corruption.  Upon troubleshooting,
    the fsck -fn output showed the below corruption
    
    [EXTENT_LIST_FREE] Extent list in owner 33080590 claims 230 as the next free chain record,
    but fsck believes the largest valid value is 227.  Clamp the next record value? n
    
    The stat output from the debugfs.ocfs2 showed the following corruption
    where the "Next Free Rec:" had overshot the "Count:" in the root metadata
    block.
    
            Inode: 33080590   Mode: 0640   Generation: 2619713622 (0x9c25a856)
            FS Generation: 904309833 (0x35e6ac49)
            CRC32: 00000000   ECC: 0000
            Type: Regular   Attr: 0x0   Flags: Valid
            Dynamic Features: (0x16) HasXattr InlineXattr Refcounted
            Extended Attributes Block: 0  Extended Attributes Inline Size: 256
            User: 0 (root)   Group: 0 (root)   Size: 281320357888
            Links: 1   Clusters: 141738
            ctime: 0x66911b56 0x316edcb8 -- Fri Jul 12 06:02:30.829349048 2024
            atime: 0x66911d6b 0x7f7a28d -- Fri Jul 12 06:11:23.133669517 2024
            mtime: 0x66911b56 0x12ed75d7 -- Fri Jul 12 06:02:30.317552087 2024
            dtime: 0x0 -- Wed Dec 31 17:00:00 1969
            Refcount Block: 2777346
            Last Extblk: 2886943   Orphan Slot: 0
            Sub Alloc Slot: 0   Sub Alloc Bit: 14
            Tree Depth: 1   Count: 227   Next Free Rec: 230
            ## Offset        Clusters       Block#
            0  0             2310           2776351
            1  2310          2139           2777375
            2  4449          1221           2778399
            3  5670          731            2779423
            4  6401          566            2780447
            .......          ....           .......
            .......          ....           .......
    
    The issue was in the reflink workflow while reserving space for inline
    xattr.  The problematic function is ocfs2_reflink_xattr_inline().  By the
    time this function is called the reflink tree is already recreated at the
    destination inode from the source inode.  At this point, this function
    reserves space for inline xattrs at the destination inode without even
    checking if there is space at the root metadata block.  It simply reduces
    the l_count from 243 to 227 thereby making space of 256 bytes for inline
    xattr whereas the inode already has extents beyond this index (in this
    case up to 230), thereby causing corruption.
    
    The fix for this is to reserve space for inline metadata at the destination
    inode before the reflink tree gets recreated. The customer has verified the
    fix.
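    
    The arithmetic above can be sketched as follows.  This is an
    illustration only: the struct and reserve_inline_xattr() are
    hypothetical, and the 16-byte record size is inferred from the figures
    in this report (256 bytes / (243 - 227) records).  The actual fix
    reorders the reservation so it happens before the tree is recreated; the
    check here only makes the violated invariant explicit:
    
    ```c
    #include <stdint.h>
    
    #define REC_SIZE 16u                /* assumed bytes per extent record */
    
    struct extent_list {                /* hypothetical, simplified */
        uint16_t l_count;               /* record slots available */
        uint16_t l_next_free_rec;       /* first unused slot */
    };
    
    /* Returns 0 on success, -1 if shrinking would cut off live records. */
    int reserve_inline_xattr(struct extent_list *el, unsigned int xattr_bytes)
    {
        unsigned int recs = xattr_bytes / REC_SIZE;
    
        if (recs > el->l_count)
            return -1;
        /* Invariant: l_next_free_rec must never exceed l_count. */
        if (el->l_next_free_rec > el->l_count - recs)
            return -1;
    
        el->l_count = (uint16_t)(el->l_count - recs);
        return 0;
    }
    ```
    
    With l_count = 243 and l_next_free_rec = 230, reserving 256 bytes would
    leave only 227 slots, so the sketch refuses; on an empty list the same
    reservation succeeds and leaves l_count at 227.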
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ef962df057aa ("ocfs2: xattr: fix inlined xattr reflink")
    Signed-off-by: Gautham Ananthakrishna <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

of/irq: Refer to actual buffer size in of_irq_parse_one() [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Tue Aug 20 14:16:53 2024 +0200

    of/irq: Refer to actual buffer size in of_irq_parse_one()
    
    [ Upstream commit 39ab331ab5d377a18fbf5a0e0b228205edfcc7f4 ]
    
    Replace two open-coded calculations of the buffer size by invocations of
    sizeof() on the buffer itself, to make sure the code will always use the
    actual buffer size.
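    
    A small sketch of the idiom, with hypothetical names (struct parse_buf,
    MAX_CELLS, fill_cells()) that are not taken from of_irq_parse_one():
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    
    #define MAX_CELLS 16                /* hypothetical capacity */
    
    struct parse_buf {
        uint32_t cells[MAX_CELLS];
    };
    
    /* Copy at most the buffer's real capacity; return the bytes copied. */
    size_t fill_cells(struct parse_buf *buf, const uint32_t *src, size_t n)
    {
        size_t bytes = n * sizeof(src[0]);
    
        /*
         * sizeof(buf->cells) follows the array declaration, so this bound
         * stays correct if MAX_CELLS or the element type changes.  An
         * open-coded "MAX_CELLS * 4" would have to be updated by hand.
         */
        if (bytes > sizeof(buf->cells))
            bytes = sizeof(buf->cells);
    
        memcpy(buf->cells, src, bytes);
        return bytes;
    }
    ```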
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://lore.kernel.org/r/817c0b9626fd30790fc488c472a3398324cfcc0c.1724156125.git.geert+renesas@glider.be
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

of/irq: Support #msi-cells=<0> in of_msi_get_domain [+ + +]

Author: Andrew Jones <[email protected]>
Date:   Sat Aug 17 09:41:08 2024 +0200

    of/irq: Support #msi-cells=<0> in of_msi_get_domain
    
    commit db8e81132cf051843c9a59b46fa5a071c45baeb3 upstream.
    
    An 'msi-parent' property with a single entry and no accompanying
    '#msi-cells' property is considered the legacy definition as opposed
    to its definition after being expanded with commit 126b16e2ad98
    ("Docs: dt: add generic MSI bindings"). However, the legacy
    definition is completely compatible with the current definition and,
    since of_phandle_iterator_next() tolerates missing and present-but-
    zero *cells properties since commit e42ee61017f5 ("of: Let
    of_for_each_phandle fallback to non-negative cell_count"), there's no
    need anymore to special case the legacy definition in
    of_msi_get_domain().
    
    Indeed, special casing has turned out to be harmful, because, as of
    commit 7c025238b47a ("dt-bindings: irqchip: Describe the IMX MU block
    as a MSI controller"), MSI controller DT bindings have started
    specifying '#msi-cells' as a required property (even when the value
    must be zero) as an effort to make the bindings more explicit. But,
    since the special casing of 'msi-parent' only uses the existence of
    '#msi-cells' for its heuristic, and not whether or not it's also
    nonzero, the legacy path is not taken. Furthermore, the path to
    support the new, broader definition isn't taken either since that
    path has been restricted to the platform-msi bus.
    
    But, neither the definition of 'msi-parent' nor the definition of
    '#msi-cells' is platform-msi-specific (the platform-msi bus was just
    the first bus that needed '#msi-cells'), so remove both the special
    casing and the restriction. The code removal also requires changing
    to of_parse_phandle_with_optional_args() in order to ensure the
    legacy (but compatible) use of 'msi-parent' remains supported. This
    not only simplifies the code but also resolves an issue with PCI
    devices finding their MSI controllers on riscv, as the riscv,imsics
    binding requires '#msi-cells=<0>'.
    
    Signed-off-by: Andrew Jones <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

of: address: Report error on resource bounds overflow [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Thu Sep 5 09:46:01 2024 +0200

    of: address: Report error on resource bounds overflow
    
    commit 000f6d588a8f3d128f89351058dc04d38e54a327 upstream.
    
    The members "start" and "end" of struct resource are of type
    "resource_size_t" which can be 32bit wide.
    Values read from OF however are always 64bit wide.
    Avoid silently truncating the value and instead return an error value.
    
    This can happen on real systems when the DT was created for a
    PAE-enabled kernel and a non-PAE kernel is actually running.
    For example with an arm defconfig and "qemu-system-arm -M virt".
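    
    A userspace sketch of the check, assuming a 32-bit resource type.
    res_size_t, struct res and set_range() are hypothetical stand-ins, not
    the kernel helpers touched by this patch:
    
    ```c
    #include <errno.h>
    #include <stdint.h>
    
    typedef uint32_t res_size_t;    /* stand-in for a 32-bit resource_size_t */
    
    struct res {
        res_size_t start;
        res_size_t end;
    };
    
    /* Store a 64-bit range, or fail instead of silently truncating it. */
    int set_range(struct res *r, uint64_t start, uint64_t size)
    {
        uint64_t end;
    
        if (size == 0)
            return -EINVAL;
    
        end = start + size - 1;
        if (end < start)
            return -EOVERFLOW;      /* wrapped around in 64 bits */
        if (start > UINT32_MAX || end > UINT32_MAX)
            return -EOVERFLOW;      /* does not fit the 32-bit type */
    
        r->start = (res_size_t)start;
        r->end = (res_size_t)end;
        return 0;
    }
    ```
    
    A range starting at 0x100000000 is rejected here, whereas a plain cast
    would have stored it as if it started at 0.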
    
    Link: https://bugs.launchpad.net/qemu/+bug/1790975
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Tested-by: Nam Cao <[email protected]>
    Reviewed-by: Nam Cao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ovl: fail if trusted xattrs are needed but caller lacks permission [+ + +]

Author: Mike Baynton <[email protected]>
Date:   Wed Jul 10 22:52:04 2024 -0500

    ovl: fail if trusted xattrs are needed but caller lacks permission
    
    commit 6c4a5f96450415735c31ed70ff354f0ee5cbf67b upstream.
    
    Some overlayfs features require permission to read/write trusted.*
    xattrs. These include redirect_dir, verity, metacopy, and data-only
    layers. This patch adds additional validations at mount time to stop
    overlays from mounting in certain cases where the resulting mount would
    not function according to the user's expectations because they lack
    permission to access trusted.* xattrs (for example, not global root.)
    
    Similar checks in ovl_make_workdir() that disable features instead of
    failing are still relevant and used in cases where the resulting mount
    can still work "reasonably well." Generally, if the feature was enabled
    through kernel config or module option, any mount that worked before
    will still work the same; this applies to redirect_dir and metacopy. The
    user must explicitly request these features in order to generate a mount
    failure. Verity and data-only layers on the other hand must be explicitly
    requested and have no "reasonable" disabled or degraded alternative, so
    mounts attempting either always fail.
    
    "lower data-only dirs require metacopy support" moved down in case
    userxattr is set, which disables metacopy.
    
    Cc: [email protected] # v6.6+
    Signed-off-by: Mike Baynton <[email protected]>
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ovl: fsync after metadata copy-up [+ + +]

Author: Amir Goldstein <[email protected]>
Date:   Thu Aug 29 17:51:08 2024 +0200

    ovl: fsync after metadata copy-up
    
    [ Upstream commit 7d6899fb69d25e1bc6f4700b7c1d92e6b608593d ]
    
    For upper filesystems which do not use strict ordering of persisting
    metadata changes (e.g. ubifs), when overlayfs file is modified for
    the first time, copy up will create a copy of the lower file and
    its parent directories in the upper layer. Loss of permissions on the
    new upper parent directory was observed during a power-cut stress test.
    
    Fix by moving the fsync call to after metadata copy to make sure that the
    metadata of the copied-up directory and files persists to disk before
    renaming from tmp to final destination.
    
    With metacopy enabled, this change will hurt performance of workloads
    such as chown -R, so we keep the legacy behavior of fsync only on copyup
    of data.
    
    Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxj-pOvmw1-uXR3qVdqtLjSkwcR9nVKcNU_vC10Zyf2miQ@mail.gmail.com/
    Reported-and-tested-by: Fei Lv <[email protected]>
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards [+ + +]

Author: Helge Deller <[email protected]>
Date:   Sun Sep 8 20:51:17 2024 +0200

    parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards
    
    commit 5d698966fa7b452035c44c937d704910bf3440dd upstream.
    
    When userspace allocates memory with mmap() in order to be used for stack,
    allow this memory region to automatically expand upwards up until the
    current maximum process stack size.
    The fault handler checks if the VM_GROWSUP bit is set in the vm_flags field
    of a memory area before it allows it to expand.
    This patch modifies the parisc specific code only.
    An RFC for a generic patch to modify mmap() for all architectures was sent
    to the mailing list but did not get enough Acks.
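    
    For reference, a sketch of the userspace side: requesting a region
    intended for stack use with the standard mmap() flags.
    alloc_stack_region() is a hypothetical wrapper.  Whether the region can
    then grow is up to the kernel; as far as we know MAP_STACK is only a
    hint on most other architectures:
    
    ```c
    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>
    
    /* Map an anonymous, private region flagged as stack memory. */
    void *alloc_stack_region(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    
        return p == MAP_FAILED ? NULL : p;
    }
    ```
    
    The caller releases the region with munmap() using the same length.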
    
    Reported-by: Camm Maguire <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v5.10+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix 64-bit userspace syscall path [+ + +]

Author: Helge Deller <[email protected]>
Date:   Sun Sep 8 00:40:38 2024 +0200

    parisc: Fix 64-bit userspace syscall path
    
    commit d24449864da5838936669618356b0e30ca2999c3 upstream.
    
    Currently glibc isn't yet ported to 64-bit for hppa, so
    there is no usable userspace available yet.
    But it's possible to manually build a static 64-bit binary
    and run that for testing. One such 64-bit test program is
    available at http://ftp.parisc-linux.org/src/64bit.tar.gz
    and it shows various issues with the existing 64-bit syscall
    path in the kernel.
    This patch fixes those issues.
    
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix itlb miss handler for 64-bit programs [+ + +]

Author: Helge Deller <[email protected]>
Date:   Tue Sep 10 18:32:24 2024 +0200

    parisc: Fix itlb miss handler for 64-bit programs
    
    commit 9542130937e9dc707dd7c6b7af73326437da2d50 upstream.
    
    For an itlb miss when executing code above 4 GB on ILP64 adjust the
    iasq/iaoq in the same way isr/ior was adjusted.  This fixes signal
    delivery for the 64-bit static test program from
    http://ftp.parisc-linux.org/src/64bit.tar.gz.  Note that signals are
    handled by the signal trampoline code in the 64-bit VDSO which is mapped
    into high userspace memory region above 4GB for 64-bit processes.
    
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix stack start for ADDR_NO_RANDOMIZE personality [+ + +]

Author: Helge Deller <[email protected]>
Date:   Sat Sep 7 18:28:11 2024 +0200

    parisc: Fix stack start for ADDR_NO_RANDOMIZE personality
    
    commit f31b256994acec6929306dfa86ac29716e7503d6 upstream.
    
    Fix the stack start address calculation for the parisc architecture in
    setup_arg_pages() when address randomization is disabled. When the
    ADDR_NO_RANDOMIZE process personality is set there is no need to add
    additional space for the stack.
    Note that this patch touches code inside an #ifdef CONFIG_STACK_GROWSUP hunk,
    which is why only the parisc architecture is affected since it's the
    only Linux architecture where the stack grows upwards.
    
    Without this patch you will find the stack in the middle of some
    mapped libraries and suddenly limited to 6MB instead of 8MB:
    
    root@parisc:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:05 1182034           /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:05 1182034           /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    f90c4000-f9283000 r-xp 00000000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9283000-f9285000 r--p 001bf000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9285000-f928a000 rwxp 001c1000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f928a000-f9294000 rwxp 00000000 00:00 0
    f9301000-f9323000 rwxp 00000000 00:00 0                 [stack]
    f98b4000-f98e4000 r-xp 00000000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e4000-f98e5000 r--p 00030000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e5000-f98e9000 rwxp 00031000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f9ad8000-f9b00000 rw-p 00000000 00:00 0
    f9b00000-f9b01000 r-xp 00000000 00:00 0                 [vdso]
    
    With the patch the stack gets correctly mapped at the end
    of the process memory map:
    
    root@panama:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:13 16385582          /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:13 16385582          /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    fef29000-ff0eb000 r-xp 00000000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0eb000-ff0ed000 r--p 001c2000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0ed000-ff0f2000 rwxp 001c4000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0f2000-ff0fc000 rwxp 00000000 00:00 0
    ff4b4000-ff4e4000 r-xp 00000000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e4000-ff4e6000 r--p 00030000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e6000-ff4ea000 rwxp 00032000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff6d7000-ff6ff000 rw-p 00000000 00:00 0
    ff6ff000-ff700000 r-xp 00000000 00:00 0                 [vdso]
    ff700000-ff722000 rwxp 00000000 00:00 0                 [stack]
    
    Reported-by: Camm Maguire <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Fixes: d045c77c1a69 ("parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures")
    Fixes: 17d9822d4b4c ("parisc: Consider stack randomization for mmap base only when necessary")
    Cc: [email protected]      # v5.2+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf callchain: Fix stitch LBR memory leaks [+ + +]

Author: Ian Rogers <[email protected]>
Date:   Wed Aug 7 22:46:43 2024 -0700

    perf callchain: Fix stitch LBR memory leaks
    
    [ Upstream commit 599c19397b17d197fc1184bbc950f163a292efc9 ]
    
    The 'struct callchain_cursor_node' has a 'struct map_symbol' whose maps
    and map members are reference counted. Ensure these values use a _get
    routine to increment the reference counts and use map_symbol__exit() to
    release the reference counts.
    
    Do similar for 'struct thread's prev_lbr_cursor, but save the size of
    the prev_lbr_cursor array so that it may be iterated.
    
    Ensure that when stitch_nodes are placed on the free list the
    map_symbols are exited.
    
    Fix resolve_lbr_callchain_sample() by replacing list_replace_init() with
    list_splice_init(), so the whole list is moved and nodes aren't leaked.
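    
    The difference between the two list operations can be sketched in
    userspace.  The code below is a simplified re-implementation written for
    illustration, modeled on the semantics of the kernel-style list helpers
    of the same names; list_init() and list_len() are hypothetical
    additions:
    
    ```c
    #include <stddef.h>
    
    struct list_head {
        struct list_head *next, *prev;
    };
    
    void list_init(struct list_head *h)
    {
        h->next = h;
        h->prev = h;
    }
    
    void list_add_tail(struct list_head *n, struct list_head *h)
    {
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
    }
    
    /* repl takes over old's nodes; whatever repl held before is dropped. */
    void list_replace_init(struct list_head *old, struct list_head *repl)
    {
        repl->next = old->next;
        repl->next->prev = repl;
        repl->prev = old->prev;
        repl->prev->next = repl;
        list_init(old);
    }
    
    /* Move every node of list to the front of head; head keeps its own. */
    void list_splice_init(struct list_head *list, struct list_head *head)
    {
        struct list_head *first = list->next;
        struct list_head *last = list->prev;
        struct list_head *at = head->next;
    
        if (first == list)
            return;                 /* nothing to move */
    
        first->prev = head;
        head->next = first;
        last->next = at;
        at->prev = last;
        list_init(list);
    }
    
    size_t list_len(const struct list_head *h)
    {
        const struct list_head *p;
        size_t n = 0;
    
        for (p = h->next; p != h; p = p->next)
            n++;
        return n;
    }
    ```
    
    If the destination already holds one node and the source holds two,
    replacing leaves the destination with two nodes (the original one is no
    longer reachable), while splicing leaves it with all three.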
    
    A reproduction of the memory leaks is possible with a leak sanitizer
    build in the perf report command of:
    
      ```
      $ perf record -e cycles --call-graph lbr perf test -w thloop
      $ perf report --stitch-lbr
      ```
    
    Reviewed-by: Kan Liang <[email protected]>
    Fixes: ff165628d72644e3 ("perf callchain: Stitch LBR call stack")
    Signed-off-by: Ian Rogers <[email protected]>
    [ Basic tests after applying the patch, repeating the example above ]
    Tested-by: Arnaldo Carvalho de Melo <[email protected]>
    Cc: Adrian Hunter <[email protected]>
    Cc: Alexander Shishkin <[email protected]>
    Cc: Andi Kleen <[email protected]>
    Cc: Anne Macedo <[email protected]>
    Cc: Changbin Du <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

perf hist: Update hist symbol when updating maps [+ + +]

Author: Matt Fleming <[email protected]>
Date:   Thu Aug 15 15:22:12 2024 +0100

    perf hist: Update hist symbol when updating maps
    
    commit ac01c8c4246546fd8340a232f3ada1921dc0ee48 upstream.
    
    AddressSanitizer found a use-after-free bug in the symbol code which
    manifested as 'perf top' segfaulting.
    
      ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80
      READ of size 1 at 0x60b00c48844b thread T193
          #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310
          #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286
          #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614
          #3 0x5650d804568f in __hists__add_entry util/hist.c:754
          #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772
          #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997
          #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242
          #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845
          #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208
          #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245
          #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324
          #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120
          #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442
          #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    
    When updating hist maps it's also necessary to update the hist symbol
    reference because the old one gets freed in map__put().
    
    While this bug was probably introduced with 5c24b67aae72f54c ("perf
    tools: Replace map->referenced & maps->removed_maps with map->refcnt"),
    the symbol objects were leaked until c087e9480cf33672 ("perf machine:
    Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so
    the bug was masked.
    
    Fixes: c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL")
    Reported-by: Yunzhao Li <[email protected]>
    Signed-off-by: Matt Fleming (Cloudflare) <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: [email protected]
    Cc: Namhyung Kim <[email protected]>
    Cc: Riccardo Mancini <[email protected]>
    Cc: [email protected] # v5.13+
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf python: Allow checking for the existence of warning options in clang [+ + +]

Author: Arnaldo Carvalho de Melo <[email protected]>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Allow checking for the existence of warning options in clang
    
    commit b81162302001f41157f6e93654aaccc30e817e2a upstream.
    
    We'll need to check if a warning option introduced in clang 19 is
    available on the clang version being used, so cover the error message
    emitted when testing for a -W option.
    
    Tested-by: Sedat Dilek <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf python: Disable -Wno-cast-function-type-mismatch if present on clang [+ + +]

Author: Arnaldo Carvalho de Melo <[email protected]>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Disable -Wno-cast-function-type-mismatch if present on clang
    
    commit 00dc514612fe98cfa117193b9df28f15e7c9db9c upstream.
    
    The -Wcast-function-type-mismatch option was introduced in clang 19 and
    it's enabled by default.  Since we use -Werror, and the python bindings
    do casts that are valid but trip this warning, disable it if present.
    
    Closes: https://lore.kernel.org/all/CA+icZUXoJ6BS3GMhJHV3aZWyb5Cz2haFneX0C5pUMUUhG-UVKQ@mail.gmail.com
    Reported-by: Sedat Dilek <[email protected]>
    Tested-by: Sedat Dilek <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: [email protected] # To allow building with the upcoming clang 19
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf report: Fix segfault when 'sym' sort key is not used [+ + +]

Author: Namhyung Kim <[email protected]>
Date:   Mon Aug 26 15:10:42 2024 -0700

    perf report: Fix segfault when 'sym' sort key is not used
    
    commit 9af2efee41b27a0f386fb5aa95d8d0b4b5d9fede upstream.
    
    The fields in the hist_entry are filled on-demand which means they only
    have meaningful values when relevant sort keys are used.
    
    So if neither the 'dso' nor the 'sym' sort key is used, the map/symbols
    in the hist entry can be garbage, and they shouldn't be accessed
    unconditionally.
    
    I got a segfault, when I wanted to see cgroup profiles.
    
      $ sudo perf record -a --all-cgroups --synth=cgroup true
    
      $ sudo perf report -s cgroup
    
      Program received signal SIGSEGV, Segmentation fault.
      0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      48            return RC_CHK_ACCESS(map)->dso;
      (gdb) bt
      #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
      #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
      #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
          at util/hist.c:644
      #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
      #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
      #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
      #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
          at util/hist.c:1260
      #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
          machine=0x5555560388e8) at builtin-report.c:334
      #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
      #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
      #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
          file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
      #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
      #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
      #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
      #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
      #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
          at util/session.c:780
      #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
          file_path=0x555556038ff0 "perf.data") at util/session.c:1406
    
    As you can see, entry->ms.map was NULL even though he->ms.map has a
    value.  This is because the 'sym' sort key is not given, so it cannot
    assume that he->ms.sym and entry->ms.sym are the same.  I only checked the
    'sym' sort key here as it implies 'dso' behavior (so maps are the same).
    
    Fixes: ac01c8c4246546fd ("perf hist: Update hist symbol when updating maps")
    Signed-off-by: Namhyung Kim <[email protected]>
    Cc: Adrian Hunter <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Kan Liang <[email protected]>
    Cc: Matt Fleming <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Stephane Eranian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf,x86: avoid missing caller address in stack traces captured in uprobe [+ + +]

Author: Andrii Nakryiko <[email protected]>
Date:   Mon Jul 29 10:52:23 2024 -0700

    perf,x86: avoid missing caller address in stack traces captured in uprobe
    
    [ Upstream commit cfa7f3d2c526c224a6271cc78a4a27a0de06f4f0 ]
    
    When tracing user functions with uprobe functionality, it's common to
    install the probe (e.g., a BPF program) at the first instruction of the
    function. This is often going to be `push %rbp` instruction in function
    preamble, which means that within that function frame pointer hasn't
    been established yet. This leads to consistently missing an actual
    caller of the traced function, because perf_callchain_user() only
    records current IP (capturing traced function) and then following frame
    pointer chain (which would be caller's frame, containing the address of
    caller's caller).
    
    So when we have target_1 -> target_2 -> target_3 call chain and we are
    tracing an entry to target_3, captured stack trace will report
    target_1 -> target_3 call chain, which is wrong and confusing.
    
    This patch proposes an x86-64-specific heuristic to detect `push %rbp`
    (`push %ebp` on 32-bit architecture) instruction being traced. Given
    entire kernel implementation of user space stack trace capturing works
    under assumption that user space code was compiled with frame pointer
    register (%rbp/%ebp) preservation, it seems pretty reasonable to use
    this instruction as a strong indicator that this is the entry to the
    function. In that case, return address is still pointed to by %rsp/%esp,
    so we fetch it and add to stack trace before proceeding to unwind the
    rest using frame pointer-based logic.
    
    We also check for `endbr64` (for 64-bit modes) as another common pattern
    for function entry, as suggested by Josh Poimboeuf. Even if we get this
    wrong sometimes for uprobes attached not at the function entry, it's OK
    because stack trace will still be overall meaningful, just with one
    extra bogus entry. If we don't detect this, the caller's entry is
    guaranteed to be missing from the stack trace, which is worse
    overall.
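
    The byte-level check behind this heuristic can be sketched as follows.
    This is an illustration only, not the kernel implementation: the helper
    name looks_like_func_entry() is hypothetical, and the sketch assumes the
    caller already has the probed instruction bytes in a buffer. It relies
    only on the x86 encodings of `push %rbp` (0x55) and `endbr64`
    (f3 0f 1e fa).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 encodings: `push %rbp` / `push %ebp` is the single byte 0x55,
 * `endbr64` is the four bytes f3 0f 1e fa. */
static const uint8_t ENDBR64[] = { 0xf3, 0x0f, 0x1e, 0xfa };

/* Hypothetical helper: does the probed instruction look like a
 * function entry? */
static bool looks_like_func_entry(const uint8_t *insn, size_t len)
{
        if (len >= 1 && insn[0] == 0x55)
                return true;
        if (len >= sizeof(ENDBR64) &&
            memcmp(insn, ENDBR64, sizeof(ENDBR64)) == 0)
                return true;
        return false;
}
```

    When such a check succeeds, the return address is still at the top of
    the stack, so an unwinder can read it from there before following the
    frame pointer chain.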
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

perf/core: Fix small negative period being ignored [+ + +]

Author: Luo Gengkun <[email protected]>
Date:   Sat Aug 31 07:43:15 2024 +0000

    perf/core: Fix small negative period being ignored
    
    commit 62c0b1061593d7012292f781f11145b2d46f43ab upstream.
    
    In perf_adjust_period, we will first calculate period, and then use
    this period to calculate delta. However, when delta is less than 0,
    there will be a deviation compared to when delta is greater than or
    equal to 0. For example, when delta is in the range of [-14,-1], the
    range of delta = delta + 7 is between [-7,6], so the final value of
    delta/8 is 0. Therefore, the impact of -1 and -2 will be ignored.
    This is unacceptable when the target period is very short, because
    we will lose a lot of samples.
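
    The rounding asymmetry can be reproduced outside the kernel. The sketch
    below is not the kernel code: it assumes the filter has the
    (delta + 7) / 8 form described above, and both function names are
    hypothetical. The symmetric variant is one possible way to bias away
    from zero in both directions and is shown for comparison only.

```c
#include <stdint.h>

/* Filter as described above: add 7, then divide by 8. C integer
 * division truncates toward zero, so every delta in [-14, -1]
 * collapses to 0, while a delta of +1 still yields 1. */
static int64_t filter_old(int64_t delta)
{
        return (delta + 7) / 8;
}

/* Hypothetical symmetric variant: bias away from zero on both sides
 * so that small negative deltas are no longer dropped. */
static int64_t filter_symmetric(int64_t delta)
{
        if (delta >= 0)
                delta += 7;
        else
                delta -= 7;
        return delta / 8;
}
```

    With these definitions, filter_old(-1) is 0 while filter_symmetric(-1)
    is -1, which is the difference that matters when the target period is
    very short.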
    
    Here are some tests and analyses:
    before:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.022 MB perf.data (518 samples) ]
    
      # perf script
      ...
      a.out     396   257.956048:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.957891:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.959730:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.961545:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.963355:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.965163:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.966973:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.968785:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.970593:         23 cs:  ffffffff81f4eeec schedul>
      ...
    
    after:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.058 MB perf.data (1466 samples) ]
    
      # perf script
      ...
      a.out     395    59.338813:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.339707:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.340682:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.341751:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.342799:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.343765:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.344651:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.345539:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.346502:         13 cs:  ffffffff81f4eeec schedul>
      ...
    
    test.c
    
    #include <unistd.h>

    int main() {
            for (int i = 0; i < 20000; i++)
                    usleep(10);

            return 0;
    }
    
      # time ./a.out
      real    0m1.583s
      user    0m0.040s
      sys     0m0.298s
    
    The above results were tested on x86-64 qemu with KVM enabled using
    test.c as test program. Ideally, we should have around 1500 samples,
    but the previous algorithm had only about 500, whereas the modified
    algorithm now has about 1400. Furthermore, the new version shows 1
    sample per 0.001s, while the previous one is 1 sample per 0.002s. This
    indicates that the new algorithm is more sensitive to small negative
    values compared to the old algorithm.
    
    Fixes: bd2b5b12849a ("perf_counter: More aggressive frequency adjustment")
    Signed-off-by: Luo Gengkun <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Adrian Hunter <[email protected]>
    Reviewed-by: Kan Liang <[email protected]>
    Cc: [email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf: Fix event_function_call() locking [+ + +]

Author: Peter Zijlstra <[email protected]>
Date:   Wed Aug 7 13:29:27 2024 +0200

    perf: Fix event_function_call() locking
    
    [ Upstream commit 558abc7e3f895049faa46b08656be4c60dc6e9fd ]
    
    All the event_function/@func call context already uses perf_ctx_lock()
    except for the !ctx->is_active case. Make it all consistent.
    
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Kan Liang <[email protected]>
    Reviewed-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

perf: Really fix event_function_call() locking [+ + +]

Author: Namhyung Kim <[email protected]>
Date:   Tue Aug 13 22:55:11 2024 +0200

    perf: Really fix event_function_call() locking
    
    [ Upstream commit fe826cc2654e8561b64246325e6a51b62bf2488c ]
    
    Commit 558abc7e3f89 ("perf: Fix event_function_call() locking") lost
    IRQ disabling by mistake.
    
    Fixes: 558abc7e3f89 ("perf: Fix event_function_call() locking")
    Reported-by: Pengfei Xu <[email protected]>
    Reported-by: Naresh Kamboju <[email protected]>
    Tested-by: Pengfei Xu <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

pidfs: check for valid pid namespace [+ + +]

Author: Christian Brauner <[email protected]>
Date:   Thu Sep 26 18:51:46 2024 +0200

    pidfs: check for valid pid namespace
    
    commit 8a46067783bdff222d1fb8f8c20e3b7b711e3ce5 upstream.
    
    When we access a non-current task's pid namespace we need to check that
    the task hasn't been reaped in the meantime, which would make its pid
    namespace inaccessible.
    
    The user namespace is fine because it is only released when the last
    reference to struct task_struct is put and exit_creds() is called.
    
    Link: https://lore.kernel.org/r/20240926-klebt-altgedienten-0415ad4d273c@brauner
    Fixes: 5b08bd408534 ("pidfs: allow retrieval of namespace file descriptors")
    CC: [email protected] # v6.11
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

platform/mellanox: mlxbf-pmc: fix lockdep warning [+ + +]

Author: Luiz Capitulino <[email protected]>
Date:   Thu Sep 12 15:05:32 2024 -0400

    platform/mellanox: mlxbf-pmc: fix lockdep warning
    
    [ Upstream commit 305790dd91057a3f7497c9d128614a4f8486b62b ]
    
    It seems the mlxbf-pmc driver is missing initializing sysfs attributes
    which causes the warning below when CONFIG_LOCKDEP and
    CONFIG_DEBUG_LOCK_ALLOC are enabled. This commit fixes it.
    
    [  155.380843] BUG: key ffff470f45dfa6d8 has not been registered!
    [  155.386749] ------------[ cut here ]------------
    [  155.391361] DEBUG_LOCKS_WARN_ON(1)
    [  155.391381] WARNING: CPU: 4 PID: 1828 at kernel/locking/lockdep.c:4894 lockdep_init_map_type+0x1d0/0x288
    [  155.404254] Modules linked in: mlxbf_pmc(+) xfs libcrc32c mmc_block mlx5_core crct10dif_ce mlxfw ghash_ce virtio_net tls net_failover sha2_ce failover psample sha256_arm64 dw_mmc_bluefield pci_hyperv_intf sha1_ce dw_mmc_pltfm sbsa_gwdt dw_mmc micrel mmc_core nfit i2c_mlxbf pwr_mlxbf gpio_generic libnvdimm mlxbf_tmfifo mlxbf_gige dm_mirror dm_region_hash dm_log dm_mod
    [  155.436786] CPU: 4 UID: 0 PID: 1828 Comm: modprobe Kdump: loaded Not tainted 6.11.0-rc7-rep1+ #1
    [  155.445562] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.8.0.13249 Aug  7 2024
    [  155.455463] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  155.462413] pc : lockdep_init_map_type+0x1d0/0x288
    [  155.467196] lr : lockdep_init_map_type+0x1d0/0x288
    [  155.471976] sp : ffff80008a1734e0
    [  155.475279] x29: ffff80008a1734e0 x28: ffff470f45df0240 x27: 00000000ffffee4b
    [  155.482406] x26: 00000000000011b4 x25: 0000000000000000 x24: 0000000000000000
    [  155.489532] x23: ffff470f45dfa6d8 x22: 0000000000000000 x21: ffffd54ef6bea000
    [  155.496659] x20: ffff470f45dfa6d8 x19: ffff470f49cdc638 x18: ffffffffffffffff
    [  155.503784] x17: 2f30303a31444642 x16: ffffd54ef48a65e8 x15: ffff80010a172fe7
    [  155.510911] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47
    [  155.518037] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffd54ef3f48a14
    [  155.525163] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 00000000002bffa8
    [  155.532289] x5 : ffff4712bdcb6088 x4 : 0000000000000000 x3 : 0000000000000027
    [  155.539416] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff470f43e5be00
    [  155.546542] Call trace:
    [  155.548976]  lockdep_init_map_type+0x1d0/0x288
    [  155.553410]  __kernfs_create_file+0x80/0x138
    [  155.557673]  sysfs_add_file_mode_ns+0x94/0x150
    [  155.562106]  create_files+0xb0/0x248
    [  155.565672]  internal_create_group+0x10c/0x328
    [  155.570105]  internal_create_groups.part.0+0x50/0xc8
    [  155.575060]  sysfs_create_groups+0x20/0x38
    [  155.579146]  device_add_attrs+0x1b8/0x228
    [  155.583146]  device_add+0x2a4/0x690
    [  155.586625]  device_register+0x24/0x38
    [  155.590362]  __hwmon_device_register+0x1e0/0x3c8
    [  155.594969]  devm_hwmon_device_register_with_groups+0x78/0xe0
    [  155.600703]  mlxbf_pmc_probe+0x224/0x3a0 [mlxbf_pmc]
    [  155.605669]  platform_probe+0x6c/0xe0
    [  155.609320]  really_probe+0xc4/0x398
    [  155.612887]  __driver_probe_device+0x80/0x168
    [  155.617233]  driver_probe_device+0x44/0x120
    [  155.621405]  __driver_attach+0xf4/0x200
    [  155.625230]  bus_for_each_dev+0x7c/0xe8
    [  155.629055]  driver_attach+0x28/0x38
    [  155.632619]  bus_add_driver+0x110/0x238
    [  155.636445]  driver_register+0x64/0x128
    [  155.640270]  __platform_driver_register+0x2c/0x40
    [  155.644965]  pmc_driver_init+0x24/0xff8 [mlxbf_pmc]
    [  155.649833]  do_one_initcall+0x70/0x3d0
    [  155.653660]  do_init_module+0x64/0x220
    [  155.657400]  load_module+0x628/0x6a8
    [  155.660964]  init_module_from_file+0x8c/0xd8
    [  155.665222]  idempotent_init_module+0x194/0x290
    [  155.669742]  __arm64_sys_finit_module+0x6c/0xd8
    [  155.674261]  invoke_syscall.constprop.0+0x74/0xd0
    [  155.678957]  do_el0_svc+0xb4/0xd0
    [  155.682262]  el0_svc+0x5c/0x248
    [  155.685394]  el0t_64_sync_handler+0x134/0x150
    [  155.689739]  el0t_64_sync+0x17c/0x180
    [  155.693390] irq event stamp: 6407
    [  155.696693] hardirqs last  enabled at (6407): [<ffffd54ef3f48564>] console_unlock+0x154/0x1b8
    [  155.705207] hardirqs last disabled at (6406): [<ffffd54ef3f485ac>] console_unlock+0x19c/0x1b8
    [  155.713719] softirqs last  enabled at (6404): [<ffffd54ef3e9740c>] handle_softirqs+0x4f4/0x518
    [  155.722320] softirqs last disabled at (6395): [<ffffd54ef3df0160>] __do_softirq+0x18/0x20
    [  155.730484] ---[ end trace 0000000000000000 ]---
    
    Signed-off-by: Luiz Capitulino <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86/amd: pmf: Add quirk for TUF Gaming A14 [+ + +]

Author: aln8 <[email protected]>
Date:   Thu Sep 12 15:36:01 2024 +0800

    platform/x86/amd: pmf: Add quirk for TUF Gaming A14
    
    [ Upstream commit 06369503d644068abd9e90918c6611274d94c126 ]
    
    The ASUS TUF Gaming A14 has the same issue as the ROG Zephyrus G14
    where it advertises SPS support but doesn't use it.
    
    Signed-off-by: aln8 <[email protected]>
    Acked-by: Shyam Sundar S K <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug [+ + +]

Author: Zach Wade <[email protected]>
Date:   Mon Sep 23 22:45:08 2024 +0800

    platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
    
    commit 7d59ac07ccb58f8f604f8057db63b8efcebeb3de upstream.
    
    Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds".
    kasan report:
    [   19.411889] ==================================================================
    [   19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113
    [   19.417368]
    [   19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G            E      6.9.0 #10
    [   19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022
    [   19.422687] Call Trace:
    [   19.424091]  <TASK>
    [   19.425448]  dump_stack_lvl+0x5d/0x80
    [   19.426963]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.428694]  print_report+0x19d/0x52e
    [   19.430206]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
    [   19.431837]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.433539]  kasan_report+0xf0/0x170
    [   19.435019]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.436709]  _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.438379]  ? __pfx_sched_clock_cpu+0x10/0x10
    [   19.439910]  isst_if_cpu_online+0x406/0x58f [isst_if_common]
    [   19.441573]  ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common]
    [   19.443263]  ? ttwu_queue_wakelist+0x2c1/0x360
    [   19.444797]  cpuhp_invoke_callback+0x221/0xec0
    [   19.446337]  cpuhp_thread_fun+0x21b/0x610
    [   19.447814]  ? __pfx_cpuhp_thread_fun+0x10/0x10
    [   19.449354]  smpboot_thread_fn+0x2e7/0x6e0
    [   19.450859]  ? __pfx_smpboot_thread_fn+0x10/0x10
    [   19.452405]  kthread+0x29c/0x350
    [   19.453817]  ? __pfx_kthread+0x10/0x10
    [   19.455253]  ret_from_fork+0x31/0x70
    [   19.456685]  ? __pfx_kthread+0x10/0x10
    [   19.458114]  ret_from_fork_asm+0x1a/0x30
    [   19.459573]  </TASK>
    [   19.460853]
    [   19.462055] Allocated by task 1198:
    [   19.463410]  kasan_save_stack+0x30/0x50
    [   19.464788]  kasan_save_track+0x14/0x30
    [   19.466139]  __kasan_kmalloc+0xaa/0xb0
    [   19.467465]  __kmalloc+0x1cd/0x470
    [   19.468748]  isst_if_cdev_register+0x1da/0x350 [isst_if_common]
    [   19.470233]  isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr]
    [   19.471670]  do_one_initcall+0xa4/0x380
    [   19.472903]  do_init_module+0x238/0x760
    [   19.474105]  load_module+0x5239/0x6f00
    [   19.475285]  init_module_from_file+0xd1/0x130
    [   19.476506]  idempotent_init_module+0x23b/0x650
    [   19.477725]  __x64_sys_finit_module+0xbe/0x130
    [   19.478920]  do_syscall_64+0x82/0x160
    [   19.480036]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [   19.481292]
    [   19.482205] The buggy address belongs to the object at ffff888829e65000
     which belongs to the cache kmalloc-512 of size 512
    [   19.484818] The buggy address is located 0 bytes to the right of
     allocated 512-byte region [ffff888829e65000, ffff888829e65200)
    [   19.487447]
    [   19.488328] The buggy address belongs to the physical page:
    [   19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60
    [   19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [   19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
    [   19.493914] page_type: 0xffffffff()
    [   19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff
    [   19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
    [   19.503784] page dumped because: kasan: bad access detected
    [   19.505058]
    [   19.505970] Memory state around the buggy address:
    [   19.507172]  ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.508599]  ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510014]                    ^
    [   19.510016]  ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510018]  ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.515367] ==================================================================
    
    The reason for this error is that the physical_package_ids assigned by
    the VMware VMM are not contiguous and have gaps. This will cause the
    value returned by topology_physical_package_id() to be more than
    topology_max_packages().
    
    Here the allocation uses topology_max_packages(). The call to
    topology_max_packages() returns the maximum logical package ID, not the
    physical ID. Hence use topology_logical_package_id() instead of
    topology_physical_package_id().
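
    The indexing problem can be shown with a small user-space sketch. It
    does not use the kernel topology API; both helper names are
    hypothetical, and the dense index they produce only stands in for what
    a logical package ID provides.

```c
#include <stddef.h>

/* Hypothetical helper: map a sparse physical package ID to a dense
 * logical index in [0, n), or return -1 if the ID is unknown. */
static int logical_package_index(const int *phys_ids, size_t n, int phys_id)
{
        for (size_t i = 0; i < n; i++)
                if (phys_ids[i] == phys_id)
                        return (int)i;
        return -1;
}

/* Only an index below max_packages is a safe subscript for an array
 * allocated with max_packages entries. */
static int index_in_bounds(int index, size_t max_packages)
{
        return index >= 0 && (size_t)index < max_packages;
}
```

    With two packages whose physical IDs are 0 and 2, using the physical ID
    2 as a subscript overruns a two-entry array, while the dense index 1
    stays in bounds.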
    
    Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering")
    Cc: [email protected]
    Acked-by: Srinivas Pandruvada <[email protected]>
    Signed-off-by: Zach Wade <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

platform/x86: lenovo-ymc: Ignore the 0x0 state [+ + +]

Author: Gergo Koteles <[email protected]>
Date:   Thu Aug 22 17:38:57 2024 +0200

    platform/x86: lenovo-ymc: Ignore the 0x0 state
    
    [ Upstream commit d9dca215708d32e7f88ac0591fbb187cbf368adb ]
    
    While booting, the Lenovo 14ARB7 reports a 'lenovo-ymc: Unknown key 0
    pressed' warning. This is caused by lenovo_ymc_probe() calling
    lenovo_ymc_notify() at probe time to get the initial tablet-mode-switch
    state, while the key-code that lenovo_ymc_notify() reads from the
    firmware is not yet initialized at probe time on the Lenovo 14ARB7.
    
    The hardware/firmware does an ACPI notify on the WMI device itself when
    it initializes the tablet-mode-switch state later on.
    
    Add 0x0 YMC state to the sparse keymap to silence the warning.
    
    Signed-off-by: Gergo Koteles <[email protected]>
    Link: https://lore.kernel.org/r/08ab73bb74c4ad448409f2ce707b1148874a05ce.1724340562.git.soyer@irl.hu
    [[email protected]: Reword commit message]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: touchscreen_dmi: add nanote-next quirk [+ + +]

Author: Ckath <[email protected]>
Date:   Wed Sep 11 21:12:40 2024 +0200

    platform/x86: touchscreen_dmi: add nanote-next quirk
    
    [ Upstream commit c11619af35bae5884029bd14170c3e4b55ddf6f3 ]
    
    Add touchscreen info for the nanote next (UMPC-03-SR).
    
    After checking with multiple owners the DMI info really is this generic.
    
    Signed-off-by: Ckath <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Mon Sep 16 11:02:55 2024 +0200

    platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED
    
    [ Upstream commit df40a23cc34c200cfde559eda7ca540f3ae7bd9e ]
    
    The "input-events" LED trigger used to turn on the backlight LEDs had to
    be rewritten to use led_trigger_register_simple() + led_trigger_event()
    to fix a serious locking issue.
    
    This means it no longer supports using blink_brightness to set a per LED
    brightness for the trigger and it no longer sets LED_CORE_SUSPENDRESUME.
    
    Adjust the MiPad 2 bottom bezel touch buttons LED class device to match:
    
    1. Make LED_FULL the maximum brightness to fix the LED brightness
       being very low when on.
    2. Set flags = LED_CORE_SUSPENDRESUME.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sat Oct 5 15:05:45 2024 +0200

    platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors
    
    commit 2fae3129c0c08e72b1fe93e61fd8fd203252094a upstream.
    
    x86_android_tablet_remove() frees the pdevs[] array, so it should not
    be used after calling x86_android_tablet_remove().
    
    When platform_device_register() fails, store the pdevs[x] PTR_ERR() value
    into the local ret variable before calling x86_android_tablet_remove()
    to avoid using pdevs[] after it has been freed.
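
    The ordering matters more than the specific API, so it can be sketched
    in plain user-space C. Everything below is a stand-in: struct
    tablet_state, tablet_remove() and tablet_init() are hypothetical names,
    and an int array of status codes plays the role of pdevs[].

```c
#include <stdlib.h>

struct tablet_state {
        int *pdev_status;       /* stands in for pdevs[]; holds status codes */
        size_t count;
};

/* Stand-in for the remove function: frees the array. */
static void tablet_remove(struct tablet_state *st)
{
        free(st->pdev_status);
        st->pdev_status = NULL;
        st->count = 0;
}

/* Returns 0 on success or the first negative status encountered. */
static int tablet_init(struct tablet_state *st, const int *results, size_t n)
{
        st->pdev_status = malloc(n * sizeof(*st->pdev_status));
        if (!st->pdev_status)
                return -1;
        st->count = n;

        for (size_t i = 0; i < n; i++) {
                st->pdev_status[i] = results[i];
                if (st->pdev_status[i] < 0) {
                        /* Copy the error out first ... */
                        int ret = st->pdev_status[i];

                        /* ... then tear down; the array is gone after this. */
                        tablet_remove(st);
                        return ret;
                }
        }
        return 0;
}
```

    Reading st->pdev_status[i] after tablet_remove() would be the
    use-after-free; saving the value in a local variable beforehand avoids
    it.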
    
    Fixes: 5eba0141206e ("platform/x86: x86-android-tablets: Add support for instantiating platform-devs")
    Fixes: e2200d3f26da ("platform/x86: x86-android-tablets: Add gpio_keys support to x86_android_tablet_init()")
    Cc: [email protected]
    Reported-by: Aleksandr Burakov <[email protected]>
    Closes: https://lore.kernel.org/platform-driver-x86/[email protected]/
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set() [+ + +]

Author: Ulf Hansson <[email protected]>
Date:   Mon May 27 16:25:52 2024 +0200

    pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set()
    
    [ Upstream commit b87eee38605c396f0e1fa435939960e5c6cd41d6 ]
    
    There is no need to hold the genpd-lock while assigning
    dev->pm_domain. In fact, it becomes a problem on a PREEMPT_RT based
    configuration as the genpd-lock may be a raw spinlock, while the lock
    acquired through the call to dev_pm_domain_set() is a regular spinlock.
    
    To fix the problem, let's simply move the calls to dev_pm_domain_set()
    outside the genpd-lock.
    
    Signed-off-by: Ulf Hansson <[email protected]>
    Tested-by: Raghavendra Kakarla <[email protected]>  # qcm6490 with PREEMPT_RT set
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

pmdomain: core: Reduce debug summary table width [+ + +]

Author: Geert Uytterhoeven <[email protected]>
Date:   Wed Sep 4 16:30:48 2024 +0200

    pmdomain: core: Reduce debug summary table width
    
    commit c6ccb691d484544636bc4a097574c5c135ccccda upstream.
    
    Commit 9094e53ff5c86ebe ("pmdomain: core: Use dev_name() instead of
    kobject_get_path() in debugfs") severely shortened the names of devices
    in a PM Domain.  Now the most common format[1] consists of a 32-bit
    unit-address (8 characters), followed by a dot and a node name (20
    characters for "air-pollution-sensor" and "interrupt-controller", which
    are the longest generic node names documented in the Devicetree
    Specification), for a typical maximum of 29 characters.
    
    This offers a good opportunity to reduce the table width of the debug
    summary:
      - Reduce the device name field width from 50 to 30 characters, which
        matches the PM Domain name width,
      - Reduce the large inter-column space between the "performance" and
        "managed by" columns.
    
    Visual impact:
      - The "performance" column now starts at a position that is a
        multiple of 16, just like the "status" and "children" columns,
      - All of the "/device", "runtime status", and "managed by" columns are
        now indented 4 characters more than the columns right above them,
      - Everything fits in (one less than) 80 characters again ;-)
    
    [1] Note that some device names (e.g. TI AM335x interconnect target
        modules) do not follow this convention, and may be much longer, but
        these didn't fit in the old 50-character column width either.
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://lore.kernel.org/r/f8e1821364b6d5d11350447c128f6d2b470f33fe.1725459707.git.geert+renesas@glider.be
    Signed-off-by: Ulf Hansson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

pmdomain: core: Use dev_name() instead of kobject_get_path() in debugfs [+ + +]

Author: Ulf Hansson <[email protected]>
Date:   Mon May 27 16:25:53 2024 +0200

    pmdomain: core: Use dev_name() instead of kobject_get_path() in debugfs
    
    [ Upstream commit 9094e53ff5c86ebe372ad3960c3216c9817a1a04 ]
    
    Using kobject_get_path() means a dynamic memory allocation gets done, which
    doesn't work on a PREEMPT_RT based configuration while holding genpd's raw
    spinlock.
    
    To fix the problem, let's convert to using the simpler dev_name(). This
    means the information about the path is no longer presented in debugfs,
    but hopefully this shouldn't be an issue.
    
    Signed-off-by: Ulf Hansson <[email protected]>
    Tested-by: Raghavendra Kakarla <[email protected]>  # qcm6490 with PREEMPT_RT set
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

power: reset: brcmstb: Do not go into infinite loop if reset fails [+ + +]

Author: Andrew Davis <[email protected]>
Date:   Mon Jun 10 09:28:36 2024 -0500

    power: reset: brcmstb: Do not go into infinite loop if reset fails
    
    [ Upstream commit cf8c39b00e982fa506b16f9d76657838c09150cb ]
    
    There may be other backup reset methods available. Do not halt
    here, so that those other reset methods can be tried.
    
    Signed-off-by: Andrew Davis <[email protected]>
    Reviewed-by: Dhruva Gole <[email protected]>
    Acked-by: Florian Fainelli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

power: supply: Drop use_cnt check from power_supply_property_is_writeable() [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sun Sep 8 20:53:36 2024 +0200

    power: supply: Drop use_cnt check from power_supply_property_is_writeable()
    
    commit 78f281e5bdeb6476fab97a2c3fcece1094b42aaf upstream.
    
    power_supply_property_is_writeable() gets called from the is_visible()
    callback for the sysfs attributes of power_supply class devices and for
    the sysfs attributes of power_supply core instantiated hwmon class devices.
    
    These sysfs attributes get registered by the device_add() respectively
    power_supply_add_hwmon_sysfs() calls in power_supply_register().
    
    use_cnt gets initialized to 0 and is incremented only after these calls.
    So when power_supply_property_is_writeable() gets called, it always
    returns -ENODEV because of use_cnt == 0.
    
    This causes all the attributes to have permissions of 444 even those which
    should be writable. This used to be a problem only for hwmon sysfs
    attributes but since commit be6299c6e55e ("power: supply: sysfs: use
    power_supply_property_is_writeable()") this now also impacts power_supply
    class sysfs attributes.
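    
    The ordering problem described above can be modelled in userspace as
    follows. This is a minimal sketch: struct psy_model and all function
    names are hypothetical stand-ins, and only the use_cnt check and the
    resulting permission bits are modelled.

```c
#include <assert.h>
#include <errno.h>

/* Simplified stand-in for a power_supply device (hypothetical). */
struct psy_model {
    int use_cnt;        /* still 0 while the sysfs attributes are created */
    int prop_writeable; /* what the driver's own callback would report */
};

/* Old behaviour: refuse to answer while use_cnt is still 0. */
static int is_writeable_old(const struct psy_model *psy)
{
    if (psy->use_cnt <= 0)
        return -ENODEV;
    return psy->prop_writeable;
}

/* New behaviour: the use_cnt check is dropped. */
static int is_writeable_new(const struct psy_model *psy)
{
    return psy->prop_writeable;
}

/* Model of an is_visible() callback: add owner write permission if writeable. */
static unsigned int attr_mode(int writeable)
{
    unsigned int mode = 0444;

    if (writeable > 0)
        mode |= 0200;
    return mode;
}
```

    In this model, a writeable property queried while use_cnt is 0 ends up
    with mode 0444 under the old check and 0644 once the check is removed.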
    
    Fixes: be6299c6e55e ("power: supply: sysfs: use power_supply_property_is_writeable()")
    Fixes: e67d4dfc9ff1 ("power: supply: Add HWMON compatibility layer")
    Cc: [email protected]
    Cc: Thomas Weißschuh <[email protected]>
    Cc: Andrey Smirnov <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/stable/20240908185337.103696-1-hdegoede%40redhat.com
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

power: supply: hwmon: Fix missing temp1_max_alarm attribute [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sun Sep 8 20:53:37 2024 +0200

    power: supply: hwmon: Fix missing temp1_max_alarm attribute
    
    commit e50a57d16f897e45de1112eb6478577b197fab52 upstream.
    
    Temp channel 0 aka temp1 can have a temp1_max_alarm attribute for
    power_supply devices which have a POWER_SUPPLY_PROP_TEMP_ALERT_MAX
    property.
    
    HWMON_T_MAX_ALARM was missing from power_supply_hwmon_info for
    temp channel 0, causing the hwmon temp1_max_alarm attribute to be
    missing from such power_supply devices.
    
    Add this to power_supply_hwmon_info to fix this.
    
    Fixes: f1d33ae806ec ("power: supply: remove duplicated argument in power_supply_hwmon_info")
    Cc: [email protected]
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

powerpc/pseries: Use correct data types from pseries_hp_errorlog struct [+ + +]

Author: Haren Myneni <[email protected]>
Date:   Wed Aug 21 19:50:26 2024 -0700

    powerpc/pseries: Use correct data types from pseries_hp_errorlog struct
    
    [ Upstream commit b76e0d4215b6b622127ebcceaa7f603313ceaec4 ]
    
    The __be32 type is defined for some elements of the pseries_hp_errorlog
    struct, but they are also used as u32 after be32_to_cpu() conversion.
    
    Example: In handle_dlpar_errorlog()
    hp_elog->_drc_u.drc_index = be32_to_cpu(hp_elog->_drc_u.drc_index);
    
    And later assigned to u32 type
    dlpar_cpu() - u32 drc_index = hp_elog->_drc_u.drc_index;
    
    This incorrect usage gives the following warnings, and the
    patch resolves these warnings with the correct assignment.
    
    arch/powerpc/platforms/pseries/dlpar.c:398:53: sparse: sparse:
    incorrect type in argument 1 (different base types) @@
    expected unsigned int [usertype] drc_index @@
    got restricted __be32 [usertype] drc_index @@
    ...
    arch/powerpc/platforms/pseries/dlpar.c:418:43: sparse: sparse:
    incorrect type in assignment (different base types) @@
    expected restricted __be32 [usertype] drc_count @@
    got unsigned int [usertype] @@
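    
    One way to keep the types consistent is to leave the big-endian field
    untouched and convert into a separate CPU-endian local. The sketch
    below shows that pattern in portable userspace C. The be32_model type,
    the trimmed struct and the helper names are assumptions made for
    illustration, and the actual patch may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a big-endian field; the kernel annotates these as __be32. */
typedef struct { uint8_t b[4]; } be32_model;

/* Hypothetical, trimmed-down version of the error log structure. */
struct hp_errorlog_model {
    be32_model drc_index; /* stays big-endian for its whole lifetime */
};

/* Portable equivalent of be32_to_cpu() for this sketch. */
static uint32_t be32_model_to_cpu(be32_model v)
{
    return ((uint32_t)v.b[0] << 24) | ((uint32_t)v.b[1] << 16) |
           ((uint32_t)v.b[2] << 8) | (uint32_t)v.b[3];
}

/* Convert into a CPU-endian local instead of writing back into the field. */
static uint32_t get_drc_index(const struct hp_errorlog_model *elog)
{
    uint32_t drc_index = be32_model_to_cpu(elog->drc_index);

    return drc_index;
}
```

    Because the struct member is never overwritten with a converted value,
    a type checker such as sparse has no mixed-endian assignment to flag.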
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Haren Myneni <[email protected]>
    
    v3:
    - Fix warnings from using incorrect data types in pseries_hp_errorlog
      struct
    v2:
    - Remove pr_info() and TODO comments
    - Update more information in the commit logs
    
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

powerpc/vdso: Fix VDSO data access when running in a non-root time namespace [+ + +]

Author: Christophe Leroy <[email protected]>
Date:   Fri Sep 6 10:33:43 2024 +0200

    powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
    
    [ Upstream commit c73049389e58c01e2e3bbfae900c8daeee177191 ]
    
    When running in a non-root time namespace, the global VDSO data page
    is replaced by a dedicated namespace data page and the global data
    page is mapped next to it. Detailed explanations can be found at
    commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").
    
    When that happens, __kernel_get_syscall_map, __kernel_get_tbfreq
    and __kernel_sync_dicache don't work anymore because they read 0
    instead of the data they need.
    
    To address that, clock_mode has to be read. When it is set to
    VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page
    and the global data is located on the following page.
    
    Add a macro called get_realdatapage which reads clock_mode and adds
    PAGE_SIZE to the pointer provided by the get_datapage macro when
    clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro
    instead of the get_datapage macro, except for time functions as they
    handle it internally.
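    
    The real macro is written in powerpc assembly. The C sketch below only
    models the pointer adjustment it performs. The page size, the value
    chosen for MODEL_CLOCKMODE_TIMENS and the union layout are assumptions
    for illustration.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096              /* assumed page size */
#define MODEL_CLOCKMODE_TIMENS 0x7fffffff /* placeholder marker value */

/* One VDSO data page; only the fields needed for the sketch are modelled. */
union data_page {
    struct {
        int clock_mode;
        unsigned long tb_freq;
    } d;
    unsigned char raw[MODEL_PAGE_SIZE];
};

/*
 * If the first page is the dedicated time namespace page, the global
 * data lives on the following page, so step forward by one page.
 */
static const union data_page *get_realdatapage(const union data_page *page)
{
    if (page->d.clock_mode == MODEL_CLOCKMODE_TIMENS)
        return page + 1; /* same as adding MODEL_PAGE_SIZE bytes */
    return page;
}
```

    A caller outside a time namespace gets the original pointer back, so
    the same accessor can be used in both cases.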
    
    Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
    Reported-by: Jason A. Donenfeld <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Christophe Leroy <[email protected]>
    Acked-by: Michael Ellerman <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ppp: do not assume bh is held in ppp_channel_bridge_input() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Fri Sep 27 07:45:53 2024 +0000

    ppp: do not assume bh is held in ppp_channel_bridge_input()
    
    [ Upstream commit aec7291003df78cb71fd461d7b672912bde55807 ]
    
    The networking receive path is usually handled from a BH handler.
    However, some protocols need to acquire the socket lock, and
    packets might be stored in the socket backlog if the socket was
    owned by a user process.
    
    In this case, release_sock(), __release_sock(), and sk_backlog_rcv()
    might call the sk->sk_backlog_rcv() handler in process context.
    
    syzbot caught that ppp was not considering this case in
    ppp_channel_bridge_input():
    
    WARNING: inconsistent lock state
    6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/1/24 [HC0[0]:SC1[1]:HE1:SE0] takes:
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
    {SOFTIRQ-ON-W} state was registered at:
       lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
       ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
       pppoe_rcv_core+0xfc/0x314 drivers/net/ppp/pppoe.c:379
       sk_backlog_rcv include/net/sock.h:1111 [inline]
       __release_sock+0x1a8/0x3d8 net/core/sock.c:3004
       release_sock+0x68/0x1b8 net/core/sock.c:3558
       pppoe_sendmsg+0xc8/0x5d8 drivers/net/ppp/pppoe.c:903
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       __sys_sendto+0x374/0x4f4 net/socket.c:2204
       __do_sys_sendto net/socket.c:2216 [inline]
       __se_sys_sendto net/socket.c:2212 [inline]
       __arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
       el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
       el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
    irq event stamp: 282914
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] _raw_spin_unlock_irqrestore+0x38/0x98 kernel/locking/spinlock.c:194
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] _raw_spin_lock_irqsave+0x2c/0x7c kernel/locking/spinlock.c:162
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] softirq_handle_end kernel/softirq.c:400 [inline]
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] handle_softirqs+0xa3c/0xbfc kernel/softirq.c:582
     softirqs last disabled at (282909): [<ffff8000801fbdf8>] run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
    
    other info that might help us debug this:
     Possible unsafe locking scenario:
    
           CPU0
           ----
      lock(&pch->downl);
      <Interrupt>
        lock(&pch->downl);
    
     *** DEADLOCK ***
    
    1 lock held by ksoftirqd/1/24:
      #0: ffff80008f74dfa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:325
    
    stack backtrace:
    CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call trace:
      dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
      show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
      dump_stack+0x1c/0x28 lib/dump_stack.c:128
      print_usage_bug+0x698/0x9ac kernel/locking/lockdep.c:4000
     mark_lock_irq+0x980/0xd2c
      mark_lock+0x258/0x360 kernel/locking/lockdep.c:4677
      __lock_acquire+0xf48/0x779c kernel/locking/lockdep.c:5096
      lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
      __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
      _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
      spin_lock include/linux/spinlock.h:351 [inline]
      ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
      ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
      ppp_async_process+0x98/0x150 drivers/net/ppp/ppp_async.c:495
      tasklet_action_common+0x318/0x3f4 kernel/softirq.c:785
      tasklet_action+0x68/0x8c kernel/softirq.c:811
      handle_softirqs+0x2e4/0xbfc kernel/softirq.c:554
      run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
      smpboot_thread_fn+0x4b0/0x90c kernel/smpboot.c:164
      kthread+0x288/0x310 kernel/kthread.c:389
      ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
    
    Fixes: 4cf476ced45d ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netdev/[email protected]/T/#u
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Tom Parkin <[email protected]>
    Cc: James Chapman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

proc: add config & param to block forcing mem writes [+ + +]

Author: Adrian Ratiu <[email protected]>
Date:   Fri Aug 2 11:02:25 2024 +0300

    proc: add config & param to block forcing mem writes
    
    [ Upstream commit 41e8149c8892ed1962bd15350b3c3e6e90cba7f4 ]
    
    This adds a Kconfig option and boot param to allow removing
    the FOLL_FORCE flag from /proc/pid/mem write calls because
    it can be abused.
    
    The traditional forcing behavior is kept as default because
    it can break GDB and some other use cases.
    
    Previously we tried a more sophisticated approach allowing
    distributions to fine-tune /proc/pid/mem behavior, however
    that got NAK-ed by Linus [1], who prefers this simpler
    approach with semantics also easier to understand for users.
    
    Link: https://lore.kernel.org/lkml/CAHk-=wiGWLChxYmUA5HrT5aopZrB7_2VTa0NLZcxORgkUe5tEQ@mail.gmail.com/ [1]
    Cc: Doug Anderson <[email protected]>
    Cc: Jeff Xu <[email protected]>
    Cc: Jann Horn <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: Ard Biesheuvel <[email protected]>
    Cc: Christian Brauner <[email protected]>
    Suggested-by: Linus Torvalds <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Adrian Ratiu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

r8169: add tally counter fields added with RTL8125 [+ + +]

Author: Heiner Kallweit <[email protected]>
Date:   Tue Sep 17 23:04:46 2024 +0200

    r8169: add tally counter fields added with RTL8125
    
    [ Upstream commit ced8e8b8f40accfcce4a2bbd8b150aa76d5eff9a ]
    
    RTL8125 added fields to the tally counter, which may result in the chip
    dma'ing these new fields to unallocated memory. Therefore make sure
    that the allocated memory area is big enough to hold all of the
    tally counter values, even if we use only parts of it.
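    
    The sizing issue can be illustrated with two structures and a bounds
    check standing in for the chip's DMA write. This is a sketch only:
    the field lists, the number of extra words and the chip_dump() helper
    are hypothetical and do not reproduce the driver's real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical legacy counter block that older chips fill in. */
struct tally_legacy {
    uint64_t tx_packets;
    uint64_t rx_packets;
    uint32_t rx_errors;
    uint16_t tx_aborted;
    uint16_t tx_underrun;
};

/* Extended block: legacy part plus placeholder room for newer fields. */
struct tally_extended {
    struct tally_legacy legacy;
    uint64_t extra[8]; /* unused by the sketch, but must be allocated */
};

/*
 * Model of the chip dumping dump_size bytes into a buffer of buf_size
 * bytes. Returns -1 where real hardware would write past the buffer.
 */
static int chip_dump(void *buf, size_t buf_size, size_t dump_size)
{
    if (dump_size > buf_size)
        return -1;
    memset(buf, 0, dump_size);
    return 0;
}
```

    Allocating the extended structure makes the dump fit even when the
    code only ever reads the legacy fields.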
    
    Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
    Cc: [email protected]
    Signed-off-by: Heiner Kallweit <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun" [+ + +]

Author: Colin Ian King <[email protected]>
Date:   Mon Sep 9 15:00:21 2024 +0100

    r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"
    
    [ Upstream commit 8df9439389a44fb2cc4ef695e08d6a8870b1616c ]
    
    There is a spelling mistake in the struct field tx_underun, rename
    it to tx_underrun.
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Heiner Kallweit <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: ced8e8b8f40a ("r8169: add tally counter fields added with RTL8125")
    Signed-off-by: Sasha Levin <[email protected]>

rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb() [+ + +]

Author: Zqiang <[email protected]>
Date:   Wed Jul 10 12:45:42 2024 +0800

    rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()
    
    [ Upstream commit fd70e9f1d85f5323096ad313ba73f5fe3d15ea41 ]
    
    For kernels built with CONFIG_FORCE_NR_CPUS=y, nr_cpu_ids is
    defined as NR_CPUS instead of the number of possible CPUs. This
    causes the following system panic:
    
    smpboot: Allowing 4 CPUs, 0 hotplug CPUs
    ...
    setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:512 nr_node_ids:1
    ...
    BUG: unable to handle page fault for address: ffffffff9911c8c8
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 0 PID: 15 Comm: rcu_tasks_trace Tainted: G W
    6.6.21 #1 5dc7acf91a5e8e9ac9dcfc35bee0245691283ea6
    RIP: 0010:rcu_tasks_need_gpcb+0x25d/0x2c0
    RSP: 0018:ffffa371c00a3e60 EFLAGS: 00010082
    CR2: ffffffff9911c8c8 CR3: 000000040fa20005 CR4: 00000000001706f0
    Call Trace:
    <TASK>
    ? __die+0x23/0x80
    ? page_fault_oops+0xa4/0x180
    ? exc_page_fault+0x152/0x180
    ? asm_exc_page_fault+0x26/0x40
    ? rcu_tasks_need_gpcb+0x25d/0x2c0
    ? __pfx_rcu_tasks_kthread+0x40/0x40
    rcu_tasks_one_gp+0x69/0x180
    rcu_tasks_kthread+0x94/0xc0
    kthread+0xe8/0x140
    ? __pfx_kthread+0x40/0x40
    ret_from_fork+0x34/0x80
    ? __pfx_kthread+0x40/0x40
    ret_from_fork_asm+0x1b/0x80
    </TASK>
    
    Considering that there may be holes in the CPU numbers, use the
    maximum possible cpu number, instead of nr_cpu_ids, for configuring
    enqueue and dequeue limits.
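    
    The difference between the two limits can be seen with a small
    bitmask model. The 64-bit mask and the helper name are assumptions
    for this sketch, whereas the kernel works on its cpumask types.

```c
#include <assert.h>
#include <stdint.h>

/* Value nr_cpu_ids takes in the report above (NR_CPUS:512). */
#define MODEL_NR_CPU_IDS 512u

/*
 * Return the highest possible CPU number plus one, given a mask with
 * one bit per possible CPU. Holes in the mask are handled naturally,
 * because only the highest set bit matters.
 */
static unsigned int max_possible_cpu_plus_one(uint64_t possible_mask)
{
    unsigned int n = 0;

    while (possible_mask) {
        n++;
        possible_mask >>= 1;
    }
    return n;
}
```

    For the 4-CPU system in the report (mask 0xf) this gives 4 rather
    than 512, and a sparse mask such as 0x25 (CPUs 0, 2 and 5) gives 6.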
    
    [ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ]
    
    Closes: https://lore.kernel.org/linux-input/CALMA0xaTSMN+p4xUXkzrtR5r6k7hgoswcaXx7baR_z9r5jjskw@mail.gmail.com/T/#u
    Reported-by: Zhixu Liu <[email protected]>
    Signed-off-by: Zqiang <[email protected]>
    Signed-off-by: Neeraj Upadhyay <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

rcuscale: Provide clear error when async specified without primitives [+ + +]

Author: Paul E. McKenney <[email protected]>
Date:   Thu Aug 1 17:43:03 2024 -0700

    rcuscale: Provide clear error when async specified without primitives
    
    [ Upstream commit 11377947b5861fa59bf77c827e1dd7c081842cc9 ]
    
    Currently, if the rcuscale module's async module parameter is specified
    for RCU implementations that do not have async primitives such as RCU
    Tasks Rude (which now lacks a call_rcu_tasks_rude() function), there
    will be a series of splats due to calls to a NULL pointer.  This commit
    therefore warns of this situation, but switches to non-async testing.
    
    Signed-off-by: "Paul E. McKenney" <[email protected]>
    Signed-off-by: Neeraj Upadhyay <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page [+ + +]

Author: Long Li <[email protected]>
Date:   Fri Aug 30 08:16:33 2024 -0700

    RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page
    
    commit 4a3b99bc04e501b816db78f70064e26a01257910 upstream.
    
    When mapping doorbell page from user-mode, the driver should use the system
    page size as this memory is allocated via mmap() from user-mode.
    
    Cc: [email protected]
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

RDMA/mana_ib: use the correct page table index based on hardware page size [+ + +]

Author: Long Li <[email protected]>
Date:   Fri Aug 30 08:16:32 2024 -0700

    RDMA/mana_ib: use the correct page table index based on hardware page size
    
    commit 9e517a8e9d9a303bf9bde35e5c5374795544c152 upstream.
    
    MANA hardware uses 4k page size. When calculating the page table index,
    it should use the hardware page size, not the system page size.
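    
    To see why the choice of page size matters, consider a system with
    64k pages. The sketch below is not the driver's code: the helper
    name and the 64k example are assumptions, and only the 4k hardware
    page size comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define HW_PAGE_SHIFT 12      /* 4k hardware pages */
#define SYS_PAGE_SHIFT_64K 16 /* example system page size of 64k */

/* Index of the page containing byte_offset for a given page size. */
static uint64_t page_index(uint64_t byte_offset, unsigned int page_shift)
{
    return byte_offset >> page_shift;
}
```

    For a byte offset of 0x10000, page_index() gives 16 with the hardware
    shift but 1 with the 64k system shift, so the two only agree when the
    system page size is also 4k.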
    
    Cc: [email protected]
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

remoteproc: k3-r5: Acquire mailbox handle during probe routine [+ + +]

Author: Beleswar Padhi <[email protected]>
Date:   Thu Aug 8 13:11:26 2024 +0530

    remoteproc: k3-r5: Acquire mailbox handle during probe routine
    
    [ Upstream commit f3f11cfe890733373ddbb1ce8991ccd4ee5e79e1 ]
    
    Acquire the mailbox handle during device probe and do not release the
    handle in the stop/detach routine or error paths. This removes the
    redundant requests for the mbox handle later during rproc start/attach.
    It also allows deferring the remoteproc driver's probe if the mailbox
    is not probed yet.
    
    Signed-off-by: Beleswar Padhi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Stable-dep-of: 8fa052c29e50 ("remoteproc: k3-r5: Delay notification of wakeup event")
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: k3-r5: Delay notification of wakeup event [+ + +]

Author: Udit Kumar <[email protected]>
Date:   Tue Aug 20 16:20:04 2024 +0530

    remoteproc: k3-r5: Delay notification of wakeup event
    
    [ Upstream commit 8fa052c29e509f3e47d56d7fc2ca28094d78c60a ]
    
    Occasionally, core1 was scheduled to boot before core0, which leads
    to the error:
    
    'k3_r5_rproc_start: can not start core 1 before core 0'.
    
    This happened because of scheduling between the prepare and start
    callbacks. The probe function waits for an event, which is triggered
    by the prepare callback. To avoid the above condition, move the event
    trigger from the prepare callback to the start callback.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Udit Kumar <[email protected]>
    [ Applied wakeup event trigger only for Split-Mode booted rprocs ]
    Signed-off-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: k3-r5: Fix error handling when power-up failed [+ + +]

Author: Jan Kiszka <[email protected]>
Date:   Mon Aug 19 17:24:51 2024 +0200

    remoteproc: k3-r5: Fix error handling when power-up failed
    
    commit 9ab27eb5866ccbf57715cfdba4b03d57776092fb upstream.
    
    By simply bailing out, the driver was violating its rule and internal
    assumptions that either both or no rproc should be initialized. E.g.,
    this could cause the first core to be available but not the second one,
    leading to crashes on its shutdown later on while trying to dereference
    that second instance.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Jan Kiszka <[email protected]>
    Acked-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

resource: fix region_intersects() vs add_memory_driver_managed() [+ + +]

Author: Huang Ying <[email protected]>
Date:   Fri Sep 6 11:07:11 2024 +0800

    resource: fix region_intersects() vs add_memory_driver_managed()
    
    commit b4afe4183ec77f230851ea139d91e5cf2644c68b upstream.
    
    On a system with CXL memory, the resource tree (/proc/iomem) related to
    CXL memory may look like the following.
    
    490000000-50fffffff : CXL Window 0
      490000000-50fffffff : region0
        490000000-50fffffff : dax0.0
          490000000-50fffffff : System RAM (kmem)
    
    This is because drivers/dax/kmem.c calls add_memory_driver_managed()
    while onlining CXL memory, which makes "System RAM (kmem)" a descendant
    of "CXL Window X".  This confuses region_intersects(), which expects all
    "System RAM" resources to be at the top level of iomem_resource.  This
    can lead to bugs.
    
    For example, when the following command line is executed to write some
    memory in CXL memory range via /dev/mem,
    
     $ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1
     dd: error writing '/dev/mem': Bad address
     1+0 records in
     0+0 records out
     0 bytes copied, 0.0283507 s, 0.0 kB/s
    
    the command fails as expected.  However, the error code is wrong.  It
    should be "Operation not permitted" instead of "Bad address".  More
    seriously, the /dev/mem permission checking in devmem_is_allowed() passes
    incorrectly.  Although the access is prevented later because ioremap()
    isn't allowed to map system RAM, it is a potential security issue.  During
    command execution, the following warning is reported in the kernel log for
    calling ioremap() on system RAM.
    
     ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff
     WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d
     Call Trace:
      memremap+0xcb/0x184
      xlate_dev_mem_ptr+0x25/0x2f
      write_mem+0x94/0xfb
      vfs_write+0x128/0x26d
      ksys_write+0xac/0xfe
      do_syscall_64+0x9a/0xfd
      entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    The details of the command execution process are as follows.  In the
    above resource tree, "System RAM" is a descendant of "CXL Window 0"
    instead of a top level resource.  So, region_intersects() will
    incorrectly report no System RAM resources in the CXL memory region,
    because it only checks the top level resources.  Consequently,
    devmem_is_allowed() will incorrectly return 1 (allow access via
    /dev/mem) for the CXL memory region.  Fortunately, ioremap() doesn't
    allow mapping System RAM and rejects the access.
    
    So, region_intersects() needs to be fixed to work correctly with a
    resource tree in which "System RAM" is not at the top level, as above.
    To fix it, if we find an unmatched resource at the top level, we
    continue to search for matched resources among its descendants.  So, we
    will not miss any matched resources in the resource tree anymore.
    
    In the new implementation, an example resource tree
    
    |------------- "CXL Window 0" ------------|
    |-- "System RAM" --|
    
    will behave similarly to the following fake resource tree for
    region_intersects(, IORESOURCE_SYSTEM_RAM, ),
    
    |-- "System RAM" --||-- "CXL Window 0a" --|
    
    Where "CXL Window 0a" is part of the original "CXL Window 0" that
    isn't covered by "System RAM".
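    
    The difference between checking only the top level and descending
    into unmatched resources can be sketched with a toy resource tree.
    The struct, the RES_SYSRAM flag value and both helper functions are
    simplified assumptions for illustration. The kernel's
    region_intersects() returns richer results than a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RES_SYSRAM 0x1u /* placeholder flag, not the kernel's value */

struct res_model {
    uint64_t start, end;       /* inclusive range */
    unsigned int flags;
    struct res_model *child;   /* first child */
    struct res_model *sibling; /* next resource at the same level */
};

static bool overlaps(const struct res_model *r, uint64_t start, uint64_t end)
{
    return r->start <= end && r->end >= start;
}

/* Old-style check: only the given level is examined. */
static bool has_type_top_level(const struct res_model *level, uint64_t start,
                               uint64_t end, unsigned int flags)
{
    for (const struct res_model *r = level; r; r = r->sibling)
        if (overlaps(r, start, end) && (r->flags & flags) == flags)
            return true;
    return false;
}

/* New-style check: descend into overlapping resources that do not match. */
static bool has_type_recursive(const struct res_model *level, uint64_t start,
                               uint64_t end, unsigned int flags)
{
    for (const struct res_model *r = level; r; r = r->sibling) {
        if (!overlaps(r, start, end))
            continue;
        if ((r->flags & flags) == flags)
            return true;
        if (has_type_recursive(r->child, start, end, flags))
            return true;
    }
    return false;
}
```

    On a tree shaped like the /proc/iomem excerpt above, the top-level
    check misses the nested "System RAM" entry while the recursive walk
    finds it.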
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
    Signed-off-by: "Huang, Ying" <[email protected]>
    Cc: Dan Williams <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Davidlohr Bueso <[email protected]>
    Cc: Jonathan Cameron <[email protected]>
    Cc: Dave Jiang <[email protected]>
    Cc: Alison Schofield <[email protected]>
    Cc: Vishal Verma <[email protected]>
    Cc: Ira Weiny <[email protected]>
    Cc: Alistair Popple <[email protected]>
    Cc: Andy Shevchenko <[email protected]>
    Cc: Bjorn Helgaas <[email protected]>
    Cc: Baoquan He <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "ALSA: hda: Conditionally use snooping for AMD HDMI" [+ + +]

Author: Takashi Iwai <[email protected]>
Date:   Wed Oct 2 17:59:39 2024 +0200

    Revert "ALSA: hda: Conditionally use snooping for AMD HDMI"
    
    commit 3f7f36a4559ef78a6418c5f0447fbfbdcf671956 upstream.
    
    This reverts commit 478689b5990deb626a0b3f1ebf165979914d6be4.
    
    The fix seems to lead to regressions on other systems.
    Also, checking for the presence of an IOMMU via get_dma_ops() isn't
    reliable, and it's no longer applicable for 6.12.  After all, it's not
    the right fix, so let's revert it first.
    
    Note that since 6.12 the PCM buffer allocation has been changed to try
    continuous pages first, so the problem could already be addressed
    without this hackish workaround.
    
    Reported-by: Salvatore Bonaccorso <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link" [+ + +]

Author: Jonathan Gray <[email protected]>
Date:   Mon Oct 7 14:59:22 2024 +1100

    Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link"
    
    This reverts commit d45c64d933586d409d3f1e0ecaca4da494b1d9c6.
    
    It duplicated a change made in 6.11-rc3:
    50e376f1fe3bf571d0645ddf48ad37eb58323919
    
    Cc: [email protected] # 6.11
    Signed-off-by: Jonathan Gray <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

riscv: define ILLEGAL_POINTER_VALUE for 64bit [+ + +]

Author: Jisheng Zhang <[email protected]>
Date:   Sat Jul 6 01:02:10 2024 +0800

    riscv: define ILLEGAL_POINTER_VALUE for 64bit
    
    commit 5c178472af247c7b50f962495bb7462ba453b9fb upstream.
    
    This is used in poison.h as the poison pointer offset. Based on the
    current SV39, SV48 and SV57 vm layouts, 0xdead000000000000 is a proper
    value that is not mappable, which avoids potentially turning an oops
    into an exploit.
    
    Signed-off-by: Jisheng Zhang <[email protected]>
    Fixes: fbe934d69eb7 ("RISC-V: Build Infrastructure")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
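
The claim in the commit above, that 0xdead000000000000 is not mappable under
Sv39, Sv48 or Sv57, follows from the RISC-V rule that the upper bits of a
virtual address must be a sign extension of the highest implemented bit. A
minimal user-space sketch of that check follows; the function name
demo_is_canonical is hypothetical and not part of the kernel.

```c
#include <stdint.h>

/*
 * Return 1 if addr is a valid sign-extended virtual address for a scheme
 * with va_bits implemented bits (39, 48 or 57), 0 otherwise.
 * Hypothetical helper, for illustration only.
 */
static int demo_is_canonical(uint64_t addr, unsigned int va_bits)
{
	/* Bits 63..(va_bits - 1) must be all zeros or all ones. */
	uint64_t top = addr >> (va_bits - 1);
	uint64_t all_ones = (UINT64_C(1) << (64 - va_bits + 1)) - 1;

	return top == 0 || top == all_ones;
}
```

Calling demo_is_canonical(0xdead000000000000, 39), and likewise with 48 or
57, returns 0, so a dereference of the poison value faults instead of
reaching mapped memory.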

riscv: Fix kernel stack size when KASAN is enabled [+ + +]

Author: Alexandre Ghiti <[email protected]>
Date:   Tue Sep 17 17:03:28 2024 +0200

    riscv: Fix kernel stack size when KASAN is enabled
    
    commit cfb10de18538e383dbc4f3ce7f477ce49287ff3d upstream.
    
    We use Kconfig to select the kernel stack size, doubling the default
    size if KASAN is enabled.
    
    But that actually only works if KASAN is selected from the beginning,
    meaning that if KASAN config is added later (for example using
    menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the
    default size, which is not enough for KASAN as reported in [1].
    
    So fix this by moving the logic that computes the right kernel stack
    size into a header.
    
    Fixes: a7555f6b62e7 ("riscv: stack: Add config of thread stack size")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/all/[email protected]/ [1]
    Cc: [email protected]
    Signed-off-by: Alexandre Ghiti <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
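
The approach described in the commit above, computing the stack size in a
header rather than in a Kconfig default, can be sketched as below. This is an
illustration under stated assumptions, not the actual patch: every DEMO_ name
is a hypothetical stand-in for the corresponding kernel symbol, and the page
size and base order are assumed values.

```c
/* Stand-ins for values that Kconfig and the architecture would provide. */
#ifndef DEMO_CONFIG_THREAD_SIZE_ORDER
#define DEMO_CONFIG_THREAD_SIZE_ORDER 2
#endif
#define DEMO_PAGE_SIZE 4096UL

/* Evaluated by the preprocessor on every build, so a KASAN option that is
 * switched on later is always taken into account. */
#ifdef DEMO_CONFIG_KASAN
#define DEMO_KASAN_STACK_ORDER 1
#else
#define DEMO_KASAN_STACK_ORDER 0
#endif

#define DEMO_THREAD_SIZE_ORDER \
	(DEMO_CONFIG_THREAD_SIZE_ORDER + DEMO_KASAN_STACK_ORDER)
#define DEMO_THREAD_SIZE (DEMO_PAGE_SIZE << DEMO_THREAD_SIZE_ORDER)

/* Kernel stack size in bytes for the current configuration. */
static unsigned long demo_thread_size(void)
{
	return DEMO_THREAD_SIZE;
}
```

Building with -DDEMO_CONFIG_KASAN doubles the result without touching the
stored base order, which is the property the Kconfig-only approach lacked.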

rtc: at91sam9: fix OF node leak in probe() error path [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Sun Aug 25 20:31:03 2024 +0200

    rtc: at91sam9: fix OF node leak in probe() error path
    
    commit 73580e2ee6adfb40276bd420da3bb1abae204e10 upstream.
    
    Driver is leaking an OF node reference obtained from
    of_parse_phandle_with_fixed_args().
    
    Fixes: 43e112bb3dea ("rtc: at91sam9: make use of syscon/regmap to access GPBR registers")
    Cc: [email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

rtla: Fix the help text in osnoise and timerlat top tools [+ + +]

Author: Eder Zulian <[email protected]>
Date:   Tue Aug 13 17:58:31 2024 +0200

    rtla: Fix the help text in osnoise and timerlat top tools
    
    commit 3d7b8ea7a8a20a45d019382c4dc6ed79e8bb95cf upstream.
    
    The help text in osnoise top and timerlat top had some minor errors
    and omissions. The -d option was missing the 's' (second) abbreviation and
    the error message for '-d' used '-D'.
    
    Cc: [email protected]
    Fixes: 1eceb2fc2ca54 ("rtla/osnoise: Add osnoise top mode")
    Fixes: a828cd18bc4ad ("rtla: Add timerlat tool and timelart top mode")
    Link: https://lore.kernel.org/[email protected]
    Suggested-by: Tomas Glozar <[email protected]>
    Reviewed-by: Tomas Glozar <[email protected]>
    Signed-off-by: Eder Zulian <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

rust: kbuild: auto generate helper exports [+ + +]

Author: Gary Guo <[email protected]>
Date:   Sat Aug 17 17:51:32 2024 +0100

    rust: kbuild: auto generate helper exports
    
    [ Upstream commit e26fa546042add70944d018b930530d16b3cf626 ]
    
    This removes the need to explicitly export all symbols.
    
    Generate helper exports similarly to what's currently done for Rust
    crates. These helpers are exclusively called from within Rust code and
    can therefore be treated like other Rust symbols.
    
    Signed-off-by: Gary Guo <[email protected]>
    Reviewed-by: Boqun Feng <[email protected]>
    Tested-by: Boqun Feng <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [ Fixed dependency path, reworded slightly, edited comment a bit and
      rebased on top of the changes made when applying Andreas' patch
      (e.g. no `README.md` anymore, so moved the edits).  - Miguel ]
    Signed-off-by: Miguel Ojeda <[email protected]>
    Stable-dep-of: d065cc76054d ("rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT")
    Signed-off-by: Sasha Levin <[email protected]>

rust: kbuild: split up helpers.c [+ + +]

Author: Andreas Hindborg <[email protected]>
Date:   Thu Aug 15 10:30:26 2024 +0000

    rust: kbuild: split up helpers.c
    
    [ Upstream commit 876346536c1b59a5b1b5e44477b1b3ece77647fd ]
    
    This patch splits up the Rust helpers C file. When rebasing patch sets on
    upstream Linux, merge conflicts in helpers.c are common and time consuming
    [1]. Thus, split the file so that each kernel component can live in a
    separate file.
    
    This patch lists helper files explicitly, so conflicts in the file
    list are still likely. However, they should be simpler to resolve than
    the conflicts usually seen in helpers.c.
    
    [ Removed `README.md` and undeleted the original comment since now,
      in v3 of the series, we have a `helpers.c` again; which also allows
      us to keep the "Sorted alphabetically" line and makes the diff easier.
    
      In addition, updated the Documentation/ mentions of the file, reworded
      title and removed blank lines at the end of `page.c`.  - Miguel ]
    
    Link: https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/Splitting.20up.20helpers.2Ec/near/426694012 [1]
    Signed-off-by: Andreas Hindborg <[email protected]>
    Reviewed-by: Gary Guo <[email protected]>
    Acked-by: Dirk Behme <[email protected]>
    Reviewed-by: Alice Ryhl <[email protected]>
    Reviewed-by: Benno Lossin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Miguel Ojeda <[email protected]>
    Stable-dep-of: d065cc76054d ("rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT")
    Signed-off-by: Sasha Levin <[email protected]>

rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT [+ + +]

Author: Dirk Behme <[email protected]>
Date:   Mon Sep 16 09:37:52 2024 +0200

    rust: mutex: fix __mutex_init() usage in case of PREEMPT_RT
    
    [ Upstream commit d065cc76054d21e48a839a2a19ba99dbc51a4d11 ]
    
    In case CONFIG_PREEMPT_RT is enabled __mutex_init() becomes a macro
    instead of an extern function (simplified from
    include/linux/mutex.h):
    
        #ifndef CONFIG_PREEMPT_RT
        extern void __mutex_init(struct mutex *lock, const char *name,
                             struct lock_class_key *key);
        #else
        #define __mutex_init(mutex, name, key)              \
        do {                                                \
            rt_mutex_base_init(&(mutex)->rtmutex);          \
            __mutex_rt_init((mutex), name, key);            \
        } while (0)
        #endif
    
    bindgen cannot resolve the macro, which results in a build
    error:
    
    error[E0425]: cannot find function `__mutex_init` in crate `bindings`
         --> rust/kernel/sync/lock/mutex.rs:104:28
          |
    104   |           unsafe { bindings::__mutex_init(ptr, name, key) }
          |                              ^^^^^^^^^^^^ help: a function with a similar name exists: `__mutex_rt_init`
          |
         ::: rust/bindings/bindings_generated.rs:23722:5
          |
    23722 | /     pub fn __mutex_rt_init(
    23723 | |         lock: *mut mutex,
    23724 | |         name: *const core::ffi::c_char,
    23725 | |         key: *mut lock_class_key,
    23726 | |     );
          | |_____- similarly named function `__mutex_rt_init` defined here
    
    Fix this by adding a helper.
    
    As explained by Gary Guo in [1], no #ifdef CONFIG_PREEMPT_RT
    is needed here, as rust/bindings/lib.rs prefers an extern function over
    a helper if the extern function exists.
    
    Reported-by: Conor Dooley <[email protected]>
    Link: https://lore.kernel.org/rust-for-linux/20240913-shack-estate-b376a65921b1@spud/
    Link: https://lore.kernel.org/rust-for-linux/[email protected]/ [1]
    Fixes: 6d20d629c6d8 ("rust: lock: introduce `Mutex`")
    Signed-off-by: Dirk Behme <[email protected]>
    Tested-by: Conor Dooley <[email protected]>
    Reviewed-by: Gary Guo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    [ Reworded to include the proper example by Dirk. - Miguel ]
    Signed-off-by: Miguel Ojeda <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
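
The helper technique named in the commit above can be sketched in plain C. A
function-like macro has no linkable symbol, so a binding generator has
nothing to call; wrapping the macro in a real function provides one. The
types and names below (demo_mutex, demo_mutex_init, demo_helper_mutex_init)
are hypothetical and only mirror the shape of the fix, not the kernel code.

```c
struct demo_mutex {
	int initialized;
	const char *name;
};

/* Function-like macro: expanded by the preprocessor, never emitted as a symbol. */
#define demo_mutex_init(m, n)		\
do {					\
	(m)->initialized = 1;		\
	(m)->name = (n);		\
} while (0)

/* Helper: an ordinary function whose body expands the macro, giving
 * foreign-language bindings a symbol they can link against. */
void demo_helper_mutex_init(struct demo_mutex *m, const char *name)
{
	demo_mutex_init(m, name);
}
```

A caller on the binding side then invokes demo_helper_mutex_init() whether
the underlying initializer is a macro or a function.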

rust: sync: require `T: Sync` for `LockedBy::access` [+ + +]

Author: Alice Ryhl <[email protected]>
Date:   Sun Sep 15 14:41:28 2024 +0000

    rust: sync: require `T: Sync` for `LockedBy::access`
    
    commit a8ee30f45d5d57467ddb7877ed6914d0eba0af7f upstream.
    
    The `LockedBy::access` method only requires a shared reference to the
    owner, so if we have shared access to the `LockedBy` from several
    threads at once, then two threads could call `access` in parallel and
    both obtain a shared reference to the inner value. Thus, require that
    `T: Sync` when calling the `access` method.
    
    An alternative is to require `T: Sync` in the `impl Sync for LockedBy`.
    This patch does not choose that approach as it gives up the ability to
    use `LockedBy` with `!Sync` types, which is okay as long as you only use
    `access_mut`.
    
    Cc: [email protected]
    Fixes: 7b1f55e3a984 ("rust: sync: introduce `LockedBy`")
    Signed-off-by: Alice Ryhl <[email protected]>
    Suggested-by: Boqun Feng <[email protected]>
    Reviewed-by: Gary Guo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Miguel Ojeda <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

rxrpc: Fix a race between socket set up and I/O thread creation [+ + +]

Author: David Howells <[email protected]>
Date:   Tue Oct 1 14:26:58 2024 +0100

    rxrpc: Fix a race between socket set up and I/O thread creation
    
    commit bc212465326e8587325f520a052346f0b57360e6 upstream.
    
    rxrpc_open_socket() sets up the socket and then sets up the I/O
    thread that will handle it.  This is a problem, however, as there's a gap
    between the two phases in which a UDP packet may come into
    rxrpc_encap_rcv(), and we oops when trying to wake the not-yet-created
    I/O thread.
    
    As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
    no I/O thread yet.
    
    A better, but more intrusive fix would perhaps be to rearrange things such
    that the socket creation is done by the I/O thread.
    
    Fixes: a275da62e8c1 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
    Signed-off-by: David Howells <[email protected]>
    cc: [email protected]
    cc: Marc Dionne <[email protected]>
    cc: Simon Horman <[email protected]>
    cc: [email protected]
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
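
The quick fix described in the commit above, discarding a packet that
arrives before the I/O thread exists, has roughly the following shape. This
is a simplified sketch with hypothetical names (demo_endpoint,
demo_encap_rcv); the real receive path works on socket buffers and wakes a
kernel thread.

```c
#include <stddef.h>

struct demo_endpoint {
	void *io_thread;	/* NULL until the I/O thread has been created */
	unsigned int queued;
	unsigned int dropped;
};

/* Returns 1 if the packet was queued, 0 if it was discarded. */
static int demo_encap_rcv(struct demo_endpoint *ep)
{
	if (ep->io_thread == NULL) {
		/* Nobody to wake yet: drop instead of dereferencing NULL. */
		ep->dropped++;
		return 0;
	}
	ep->queued++;	/* the real code would also wake the I/O thread here */
	return 1;
}
```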

sched/core: Add clearing of ->dl_server in put_prev_task_balance() [+ + +]

Author: Joel Fernandes (Google) <[email protected]>
Date:   Mon May 27 14:06:48 2024 +0200

    sched/core: Add clearing of ->dl_server in put_prev_task_balance()
    
    commit c245910049d04fbfa85bb2f5acd591c24e9907c7 upstream.
    
    Paths using put_prev_task_balance() need to do a pick shortly
    after. Make sure they also clear the ->dl_server on prev as a
    part of that.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: "Joel Fernandes (Google)" <[email protected]>
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/d184d554434bedbad0581cb34656582d78655150.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched/core: Clear prev->dl_server in CFS pick fast path [+ + +]

Author: Youssef Esmat <[email protected]>
Date:   Mon May 27 14:06:49 2024 +0200

    sched/core: Clear prev->dl_server in CFS pick fast path
    
    commit a741b82423f41501e301eb6f9820b45ca202e877 upstream.
    
    In case the previous pick was a DL server pick, ->dl_server might be
    set. Clear it in the fast path as well.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Youssef Esmat <[email protected]>
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/7f7381ccba09efcb4a1c1ff808ed58385eccc222.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched/deadline: Comment sched_dl_entity::dl_server variable [+ + +]

Author: Daniel Bristot de Oliveira <[email protected]>
Date:   Mon May 27 14:06:47 2024 +0200

    sched/deadline: Comment sched_dl_entity::dl_server variable
    
    commit f23c042ce34ba265cf3129d530702b5d218e3f4b upstream.
    
    Add an explanation for the newly added variable.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/147f7aa8cb8fd925f36aa8059af6a35aad08b45a.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched: psi: fix bogus pressure spikes from aggregation race [+ + +]

Author: Johannes Weiner <[email protected]>
Date:   Thu Oct 3 07:29:05 2024 -0400

    sched: psi: fix bogus pressure spikes from aggregation race
    
    commit 3840cbe24cf060ea05a585ca497814609f5d47d1 upstream.
    
    Brandon reports sporadic, nonsensical spikes in cumulative pressure
    time (total=) when reading cpu.pressure at a high rate. This is due to
    a race condition between reader aggregation and tasks changing states.
    
    While it affects all states and all resources captured by PSI, in
    practice it most likely triggers with CPU pressure, since scheduling
    events are so frequent compared to other resource events.
    
    The race context is the live snooping of ongoing stalls during a
    pressure read. The read aggregates per-cpu records for stalls that
    have concluded, but will also incorporate ad-hoc the duration of any
    active state that hasn't been recorded yet. This is important to get
    timely measurements of ongoing stalls. Those ad-hoc samples are
    calculated on-the-fly up to the current time on that CPU; since the
    stall hasn't concluded, it's expected that this is the minimum amount
    of stall time that will enter the per-cpu records once it does.
    
    The problem is that the path that concludes the state uses a CPU clock
    read that is not synchronized against aggregators; the clock is read
    outside of the seqlock protection. This allows aggregators to race and
    snoop a stall with a longer duration than will actually be recorded.
    
    With the recorded stall time being less than the last snapshot
    remembered by the aggregator, a subsequent sample will underflow and
    observe a bogus delta value, resulting in an erratic jump in pressure.
    
    Fix this by moving the clock read of the state change into the seqlock
    protection. This ensures no aggregation can snoop live stalls past the
    time that's recorded when the state concludes.
    
    Reported-by: Brandon Duffany <[email protected]>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219194
    Link: https://lore.kernel.org/lkml/[email protected]/
    Fixes: df77430639c9 ("psi: Reduce calls to sched_clock() in psi")
    Cc: [email protected]
    Signed-off-by: Johannes Weiner <[email protected]>
    Reviewed-by: Chengming Zhou <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
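
The underflow described in the commit above can be reproduced with plain
unsigned arithmetic. The sketch assumes 32-bit unsigned time counters for
illustration, and demo_psi_delta is a hypothetical name, not a kernel
function.

```c
#include <stdint.h>

/* Delta between the recorded total and the aggregator's last snapshot. */
static uint32_t demo_psi_delta(uint32_t recorded_total, uint32_t last_snapshot)
{
	/* Unsigned subtraction wraps around if recorded_total < last_snapshot. */
	return recorded_total - last_snapshot;
}
```

If the aggregator snooped a live stall and remembered a snapshot of 120, but
the stall is later recorded with an earlier clock read so that the total is
only 110, demo_psi_delta(110, 120) wraps to 4294967286 rather than a small
value. That is the kind of erratic jump the fix rules out by reading the
clock inside the seqlock-protected section.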

scripts/gdb: add iteration function for rbtree [+ + +]

Author: Kuan-Ying Lee <[email protected]>
Date:   Tue Jul 23 14:48:58 2024 +0800

    scripts/gdb: add iteration function for rbtree
    
    commit 0c77e103c45fa1b119f5d3bb4625eee081c1a6cf upstream.
    
    Add inorder iteration function for rbtree usage.
    
    This is a preparation patch for the next patch to fix the gdb mounts
    issue.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree")
    Signed-off-by: Kuan-Ying Lee <[email protected]>
    Cc: Jan Kiszka <[email protected]>
    Cc: Kieran Bingham <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
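
The commit above adds the iteration to the Python gdb scripts. The same
in-order walk is sketched here in C to show the technique: start at the
leftmost node, then repeatedly step to the in-order successor using parent
pointers. The node layout and all demo_ names are hypothetical, and node
colours are omitted because traversal does not need them.

```c
#include <stddef.h>

struct demo_node {
	int key;
	struct demo_node *parent, *left, *right;
};

/* Leftmost node of the tree, i.e. the first node in sorted order. */
static struct demo_node *demo_first(struct demo_node *root)
{
	if (!root)
		return NULL;
	while (root->left)
		root = root->left;
	return root;
}

/* In-order successor of n, or NULL if n is the last node. */
static struct demo_node *demo_next(struct demo_node *n)
{
	if (n->right) {
		/* Successor is the leftmost node of the right subtree. */
		n = n->right;
		while (n->left)
			n = n->left;
		return n;
	}
	/* Otherwise climb until we arrive from a left child. */
	while (n->parent && n == n->parent->right)
		n = n->parent;
	return n->parent;
}

/* Copy up to cap keys in sorted order into out; returns the count. */
static size_t demo_collect(struct demo_node *root, int *out, size_t cap)
{
	size_t n = 0;
	struct demo_node *it;

	for (it = demo_first(root); it && n < cap; it = demo_next(it))
		out[n++] = it->key;
	return n;
}
```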

scripts/gdb: fix lx-mounts command error [+ + +]

Author: Kuan-Ying Lee <[email protected]>
Date:   Tue Jul 23 14:48:59 2024 +0800

    scripts/gdb: fix lx-mounts command error
    
    commit 4b183f613924ad536be2f8bd12b307e9c5a96bf6 upstream.
    
    (gdb) lx-mounts
          mount          super_block     devname pathname fstype options
    Python Exception <class 'gdb.error'>: There is no member named list.
    Error occurred in Python: There is no member named list.
    
    We encounter the above issue after commit 2eea9ce4310d ("mounts: keep
    list of mounts in an rbtree"). That commit moved the mounts from a list
    into an rbtree.
    
    So iterate over the rbtree instead to get information on all mounts.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 2eea9ce4310d ("mounts: keep list of mounts in an rbtree")
    Signed-off-by: Kuan-Ying Lee <[email protected]>
    Cc: Jan Kiszka <[email protected]>
    Cc: Kieran Bingham <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scripts/gdb: fix timerlist parsing issue [+ + +]

Author: Kuan-Ying Lee <[email protected]>
Date:   Tue Jul 23 14:48:57 2024 +0800

    scripts/gdb: fix timerlist parsing issue
    
    commit a633a4b8001a7f2a12584f267a3280990d9ababa upstream.
    
    Patch series "Fix some GDB command error and add some GDB commands", v3.
    
    Fix some GDB command errors and add some useful GDB commands.
    
    
    This patch (of 5):
    
    Commit a478ffb2ae23 ("tick: Move individual bit features to debuggable
    mask accesses") and commit 7988e5ae2be7 ("tick: Split nohz and highres
    features from nohz_mode") move 'tick_stopped' and 'nohz_mode' to the
    flags field, which breaks the gdb lx-timerlist command:
    
    (gdb) lx-timerlist
    Python Exception <class 'gdb.error'>: There is no member named nohz_mode.
    Error occurred in Python: There is no member named nohz_mode.
    
    (gdb) lx-timerlist
    Python Exception <class 'gdb.error'>: There is no member named tick_stopped.
    Error occurred in Python: There is no member named tick_stopped.
    
    So read 'tick_stopped' and 'nohz_mode' from the flags field instead.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: a478ffb2ae23 ("tick: Move individual bit features to debuggable mask accesses")
    Fixes: 7988e5ae2be7 ("tick: Split nohz and highres features from nohz_mode")
    Signed-off-by: Kuan-Ying Lee <[email protected]>
    Cc: Jan Kiszka <[email protected]>
    Cc: Kieran Bingham <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: aacraid: Rearrange order of struct aac_srb_unit [+ + +]

Author: Kees Cook <[email protected]>
Date:   Thu Jul 11 14:57:37 2024 -0700

    scsi: aacraid: Rearrange order of struct aac_srb_unit
    
    [ Upstream commit 6e5860b0ad4934baee8c7a202c02033b2631bb44 ]
    
    struct aac_srb_unit contains struct aac_srb, which contains struct sgmap,
    which ends in a (currently) "fake" (1-element) flexible array.  Converting
    this to a flexible array is needed so that runtime bounds checking won't
    think the array is fixed size (i.e. under CONFIG_FORTIFY_SOURCE=y and/or
    CONFIG_UBSAN_BOUNDS=y), as other parts of aacraid use struct sgmap as a
    flexible array.
    
    It is not legal to have a flexible array in the middle of a structure, so
    it either needs to be split up or rearranged so that it is at the end of
    the structure. Luckily, struct aac_srb_unit, which is exclusively
    consumed/updated by aac_send_safw_bmic_cmd(), does not depend on member
    ordering.
    
    The values set in the on-stack struct aac_srb_unit instance "srbu" by the
    only two callers, aac_issue_safw_bmic_identify() and
    aac_get_safw_ciss_luns(), do not contain anything in srbu.srb.sgmap.sg, and
    they both implicitly initialize srbu.srb.sgmap.count to 0 during
    memset(). For example:
    
            memset(&srbu, 0, sizeof(struct aac_srb_unit));
    
            srbcmd = &srbu.srb;
            srbcmd->flags   = cpu_to_le32(SRB_DataIn);
            srbcmd->cdb[0]  = CISS_REPORT_PHYSICAL_LUNS;
            srbcmd->cdb[1]  = 2; /* extended reporting */
            srbcmd->cdb[8]  = (u8)(datasize >> 8);
            srbcmd->cdb[9]  = (u8)(datasize);
    
            rcode = aac_send_safw_bmic_cmd(dev, &srbu, phys_luns, datasize);
    
    During aac_send_safw_bmic_cmd(), a separate srb is mapped into DMA, and has
    srbu.srb copied into it:
    
            srb = fib_data(fibptr);
            memcpy(srb, &srbu->srb, sizeof(struct aac_srb));
    
    Only then is srb.sgmap.count written and srb->sg populated:
    
            srb->count              = cpu_to_le32(xfer_len);
    
            sg64 = (struct sgmap64 *)&srb->sg;
            sg64->count             = cpu_to_le32(1);
            sg64->sg[0].addr[1]     = cpu_to_le32(upper_32_bits(addr));
            sg64->sg[0].addr[0]     = cpu_to_le32(lower_32_bits(addr));
            sg64->sg[0].count       = cpu_to_le32(xfer_len);
    
    But this is happening in the DMA memory, not in srbu.srb. An attempt to
    copy the changes back to srbu does happen:
    
            /*
             * Copy the updated data for other dumping or other usage if
             * needed
             */
            memcpy(&srbu->srb, srb, sizeof(struct aac_srb));
    
    But this was never correct: the sg64 (3 u32s) overlap of srb.sg (2 u32s)
    always meant that srbu.srb would have held truncated information and any
    attempt to walk srbu.srb.sg.sg based on the value of srbu.srb.sg.count
    would result in attempting to parse past the end of srbu.srb.sg.sg[0] into
    srbu.srb_reply.
    
    After getting a reply from hardware, the reply is copied into
    srbu.srb_reply:
    
            srb_reply = (struct aac_srb_reply *)fib_data(fibptr);
            memcpy(&srbu->srb_reply, srb_reply, sizeof(struct aac_srb_reply));
    
    This has always been fixed-size, so there's no issue here. It is worth
    noting that the two callers _never check_ srbu contents -- neither
    srbu.srb nor srbu.srb_reply is examined. (They depend on the mapped
    xfer_buf instead.)
    
    Therefore, the ordering of members in struct aac_srb_unit does not matter,
    and the flexible array member can be moved to the end.
    
    (Additionally, the two memcpy()s that update srbu could be entirely
    removed as they are never consumed, but I left that as-is.)
    
    Signed-off-by: Kees Cook <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
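
The layout constraint discussed in the commit above, that a flexible array
member has to be the last member of its structure, is illustrated below. The
sketch is deliberately flattened and uses hypothetical demo_ types; the real
aacraid structures nest several levels deeper.

```c
#include <stdint.h>
#include <stdlib.h>

struct demo_sg_entry {
	uint32_t addr;
	uint32_t count;
};

struct demo_reply {
	uint32_t status;
};

struct demo_unit {
	struct demo_reply reply;	/* fixed-size part comes first */
	uint32_t sg_count;
	struct demo_sg_entry sg[];	/* flexible array member, must be last */
};

/* Allocate a unit with room for n scatter/gather entries; caller frees it. */
static struct demo_unit *demo_unit_alloc(uint32_t n)
{
	struct demo_unit *u = calloc(1, sizeof(*u) + n * sizeof(u->sg[0]));

	if (u)
		u->sg_count = n;
	return u;
}
```

Because the array is declared without a length, bounds-checking tools do not
treat it as a one-element array, and the fixed-size reply keeps a stable
offset regardless of how many entries are allocated.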

scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology [+ + +]

Author: Justin Tee <[email protected]>
Date:   Fri Jul 26 16:15:09 2024 -0700

    scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology
    
    [ Upstream commit b5c18c9dd138733c16893613345af44deadcf05e ]
    
    In direct attached topology, certain target vendors that are quick to issue
    FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
    kref imbalance for the remote port ndlp object.
    
    Add an nlp_get when the defer_flogi_acc flag is set.  This is expected to
    balance the nlp_put in the defer_flogi_acc clause in the
    lpfc_issue_els_flogi() routine.  Because we need to retain the ndlp ptr,
    reorganize all of the defer_flogi_acc information into one
    lpfc_defer_flogi_acc struct.
    
    Signed-off-by: Justin Tee <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: lpfc: Update PRLO handling in direct attached topology [+ + +]

Author: Justin Tee <[email protected]>
Date:   Fri Jul 26 16:15:10 2024 -0700

    scsi: lpfc: Update PRLO handling in direct attached topology
    
    [ Upstream commit 1f0f7679ad8942f810b0f19ee9cf098c3502d66a ]
    
    A kref imbalance occurs when handling an unsolicited PRLO in direct
    attached topology.
    
    Rework PRLO rcv handling when in MAPPED state.  Save the state that we were
    handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO.  Then in the
    lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
    By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the
    lpfc_cmpl_els_logo_acc() routine, we avoid a final nlp_put while still
    allowing the unreg_rpi to happen.
    
    Signed-off-by: Justin Tee <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths [+ + +]

Author: Justin Tee <[email protected]>
Date:   Fri Jul 26 16:15:07 2024 -0700

    scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths
    
    [ Upstream commit 2be1d4f11944cd6283cb97268b3e17c4424945ca ]
    
    When the HBA is undergoing a reset or is handling an errata event, NULL ptr
    dereference crashes may occur in routines such as
    lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or
    lpfc_abort_handler().
    
    Add NULL ptr checks before dereferencing hdwq pointers that may have been
    freed due to operations colliding with a reset or errata event handler.
    
    Signed-off-by: Justin Tee <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: NCR5380: Initialize buffer for MSG IN and STATUS transfers [+ + +]

Author: Finn Thain <[email protected]>
Date:   Wed Aug 7 13:36:28 2024 +1000

    scsi: NCR5380: Initialize buffer for MSG IN and STATUS transfers
    
    [ Upstream commit 1c71065df2df693d208dd32758171c1dece66341 ]
    
    Following an incomplete transfer in MSG IN phase, the driver would not
    notice the problem and would make use of invalid data. Initialize 'tmp'
    appropriately and bail out if no message was received. For STATUS phase,
    preserve the existing status code unless a new value was transferred.
    
    Tested-by: Stan Johnson <[email protected]>
    Signed-off-by: Finn Thain <[email protected]>
    Link: https://lore.kernel.org/r/52e02a8812ae1a2d810d7f9f7fd800c3ccc320c4.1723001788.git.fthain@linux-m68k.org
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: pm8001: Do not overwrite PCI queue mapping [+ + +]

Author: Daniel Wagner <[email protected]>
Date:   Thu Sep 12 10:58:28 2024 +0200

    scsi: pm8001: Do not overwrite PCI queue mapping
    
    [ Upstream commit a141c17a543332fc1238eb5cba562bfc66879126 ]
    
    blk_mq_pci_map_queues() maps all queues, but right after this we overwrite
    these mappings by calling blk_mq_map_queues(). Use just one helper, not
    both.
    
    Fixes: 42f22fe36d51 ("scsi: pm8001: Expose hardware queues for pm80xx")
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: John Garry <[email protected]>
    Signed-off-by: Daniel Wagner <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: smartpqi: add new controller PCI IDs [+ + +]

Author: David Strahan <[email protected]>
Date:   Tue Aug 27 13:54:58 2024 -0500

    scsi: smartpqi: add new controller PCI IDs
    
    [ Upstream commit dbc39b84540f746cc814e69b21e53e6d3e12329a ]
    
    All PCI ID entries in hex.
    
    Add new Cisco PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 1137 / 02fe
                                                 9005 / 028f / 1137 / 02ff
                                                 9005 / 028f / 1137 / 0300
    
    Add new H3C PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 193d / 0462
                                                 9005 / 028f / 193d / 8462
    
    Add new IEIT PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                                                 9005 / 028f / 1ff9 / 00a3
    
    Reviewed-by: Scott Benesh <[email protected]>
    Reviewed-by: Mike McGowen <[email protected]>
    Signed-off-by: David Strahan <[email protected]>
    Signed-off-by: Don Brace <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
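
A driver recognises controllers by matching the four identifiers against a
table. The sketch below shows that lookup for the three Cisco entries listed
in the commit above; the structure and function names (demo_pci_id,
demo_pci_match) are hypothetical and do not reproduce the kernel's PCI
device table macros.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_pci_id {
	uint16_t vendor, device, subvendor, subdevice;
};

/* VID / DID / SVID / SDID values taken from the commit message. */
static const struct demo_pci_id demo_ids[] = {
	{ 0x9005, 0x028f, 0x1137, 0x02fe },
	{ 0x9005, 0x028f, 0x1137, 0x02ff },
	{ 0x9005, 0x028f, 0x1137, 0x0300 },
};

/* Returns 1 if all four identifiers match a table entry, 0 otherwise. */
static int demo_pci_match(uint16_t vendor, uint16_t device,
			  uint16_t subvendor, uint16_t subdevice)
{
	size_t i;

	for (i = 0; i < sizeof(demo_ids) / sizeof(demo_ids[0]); i++) {
		const struct demo_pci_id *id = &demo_ids[i];

		if (id->vendor == vendor && id->device == device &&
		    id->subvendor == subvendor && id->subdevice == subdevice)
			return 1;
	}
	return 0;
}
```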

scsi: smartpqi: Add new controller PCI IDs [+ + +]

Author: David Strahan <[email protected]>
Date:   Thu Jul 11 14:47:00 2024 -0500

    scsi: smartpqi: Add new controller PCI IDs
    
    [ Upstream commit 0e21e73384d324f75ea16f3d622cfc433fa6209b ]
    
    All PCI ID entries in hex.
    
    Add new inagile PCI IDs:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                SMART-HBA 8242-24i               9005 / 028f / 1ff9 / 0045
                RAID 8236-16i                    9005 / 028f / 1ff9 / 0046
                RAID 8240-24i                    9005 / 028f / 1ff9 / 0047
                SMART-HBA 8238-16i               9005 / 028f / 1ff9 / 0048
                PM8222-SHBA                      9005 / 028f / 1ff9 / 004a
                RAID PM8204-2GB                  9005 / 028f / 1ff9 / 004b
                RAID PM8204-4GB                  9005 / 028f / 1ff9 / 004c
                PM8222-HBA                       9005 / 028f / 1ff9 / 004f
                MT0804M6R                        9005 / 028f / 1ff9 / 0051
                MT0801M6E                        9005 / 028f / 1ff9 / 0052
                MT0808M6R                        9005 / 028f / 1ff9 / 0053
                MT0800M6H                        9005 / 028f / 1ff9 / 0054
                RS0800M5H24i                     9005 / 028f / 1ff9 / 006b
                RS0800M5E8i                      9005 / 028f / 1ff9 / 006c
                RS0800M5H8i                      9005 / 028f / 1ff9 / 006d
                RS0804M5R16i                     9005 / 028f / 1ff9 / 006f
                RS0800M5E24i                     9005 / 028f / 1ff9 / 0070
                RS0800M5H16i                     9005 / 028f / 1ff9 / 0071
                RS0800M5E16i                     9005 / 028f / 1ff9 / 0072
                RT0800M7E                        9005 / 028f / 1ff9 / 0086
                RT0800M7H                        9005 / 028f / 1ff9 / 0087
                RT0804M7R                        9005 / 028f / 1ff9 / 0088
                RT0808M7R                        9005 / 028f / 1ff9 / 0089
                RT1608M6R16i                     9005 / 028f / 1ff9 / 00a1
    
    Add new h3c pci_id:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                UN RAID P4408-Mr-2               9005 / 028f / 193d / 1110
    
    Add new powerleader pci ids:
                                                 VID  / DID  / SVID / SDID
                                                 ----   ----   ----   ----
                PL SmartROC PM8204               9005 / 028f / 1f3a / 0104
    
    Reviewed-by: Scott Benesh <[email protected]>
    Reviewed-by: Scott Teel <[email protected]>
    Reviewed-by: Mike McGowen <[email protected]>
    Signed-off-by: David Strahan <[email protected]>
    Signed-off-by: Don Brace <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: smartpqi: correct stream detection [+ + +]

Author: Mahesh Rajashekhara <[email protected]>
Date:   Tue Aug 27 13:54:56 2024 -0500

    scsi: smartpqi: correct stream detection
    
    [ Upstream commit 4c76114932d1d6fad2e72823e7898a3c960cf2a7 ]
    
    Correct stream detection by initializing the structure
    pqi_scsi_dev_raid_map_data to 0s.
    
    When the OS issues SCSI READ commands, the driver erroneously considers
    them as SCSI WRITES. If they are identified as sequential IOs, the driver
    then submits those requests via the RAID path instead of the AIO path.
    
    The 'is_write' flag might be set for SCSI READ commands also.  The driver
    may interpret SCSI READ commands as SCSI WRITE commands, resulting in IOs
    being submitted through the RAID path.
    
    Note: This does not cause data corruption.
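
    A minimal sketch of the failure mode, assuming a reduced stand-in for
    pqi_scsi_dev_raid_map_data. The struct, its fields, and
    parse_cdb_sketch() are hypothetical and are not the driver's code.
    Zeroing the structure first means is_write can only be true when a
    write opcode sets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Standard SCSI opcodes for the 10-byte READ and WRITE commands. */
#define SKETCH_READ_10  0x28
#define SKETCH_WRITE_10 0x2a

/* Hypothetical, reduced stand-in for the driver's RAID map data. */
struct rmd_sketch {
	bool is_write;
	uint64_t first_block;
	uint32_t block_cnt;
};

/* Fill *rmd from an opcode; only a write opcode sets is_write. */
static void parse_cdb_sketch(struct rmd_sketch *rmd, uint8_t opcode)
{
	assert(rmd != NULL);
	/* Without this, is_write keeps whatever the memory held before. */
	memset(rmd, 0, sizeof(*rmd));
	if (opcode == SKETCH_WRITE_10)
		rmd->is_write = true;
}
```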
    
    Reviewed-by: Scott Benesh <[email protected]>
    Reviewed-by: Scott Teel <[email protected]>
    Reviewed-by: Mike McGowen <[email protected]>
    Signed-off-by: Mahesh Rajashekhara <[email protected]>
    Signed-off-by: Don Brace <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: st: Fix input/output error on empty drive reset [+ + +]

Author: Rafael Rocha <[email protected]>
Date:   Thu Sep 5 12:39:21 2024 -0500

    scsi: st: Fix input/output error on empty drive reset
    
    [ Upstream commit 3d882cca73be830549833517ddccb3ac4668c04e ]
    
    A previous change was introduced to prevent data loss during a power-on
    reset when a tape is present inside the drive. This commit set the
    "pos_unknown" flag to true to avoid operations that could compromise data
    by performing actions from an untracked position. The relevant change is
    commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
    
    As a consequence of this change, a new issue has surfaced: the driver now
    returns an "Input/output error" even for empty drives when the drive, host,
    or bus is reset. This issue stems from the "flush_buffer" function, which
    first checks whether the "pos_unknown" flag is set. If the flag is set, the
    user will encounter an "Input/output error" until the tape position is
    known again. This behavior differs from the previous implementation, where
    empty drives were not affected at system start up time, allowing tape
    software to send commands to the driver to retrieve the drive's status and
    other information.
    
    The current behavior prioritizes the "pos_unknown" flag over the
    "ST_NO_TAPE" status, leading to issues for software that detects drives
    during system startup. This software will receive an "Input/output error"
    until a tape is loaded and its position is known.
    
    To resolve this, the "ST_NO_TAPE" status should take priority when the
    drive is empty, allowing communication with the drive following a power-on
    reset. At the same time, the change should continue to protect data by
    maintaining the "pos_unknown" flag when the drive contains a tape and its
    position is unknown.
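
    The intended priority can be sketched as follows. This is an
    illustration only: the enum, struct, and flush_check_sketch() are
    hypothetical names, and the real driver tracks far more state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical drive readiness, reduced to the two cases of interest. */
enum ready_sketch { SKETCH_READY, SKETCH_NO_TAPE };

struct tape_sketch {
	enum ready_sketch ready;
	bool pos_unknown;
};

/* Return 0 if a command may proceed, -EIO if it must be refused. */
static int flush_check_sketch(const struct tape_sketch *t)
{
	assert(t != NULL);
	/* Empty drive: no data can be lost, so let the caller through. */
	if (t->ready == SKETCH_NO_TAPE)
		return 0;
	/* Tape present at an untracked position: keep protecting data. */
	if (t->pos_unknown)
		return -EIO;
	return 0;
}
```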
    
    Signed-off-by: Rafael Rocha <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
    Acked-by: Kai Mäkisara <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start [+ + +]

Author: Xin Long <[email protected]>
Date:   Mon Sep 30 16:49:51 2024 -0400

    sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start
    
    [ Upstream commit 8beee4d8dee76b67c75dc91fd8185d91e845c160 ]
    
    In sctp_listen_start() invoked by sctp_inet_listen(), it should set the
    sk_state back to CLOSED if sctp_autobind() fails due to whatever reason.
    
    Otherwise, next time when calling sctp_inet_listen(), if sctp_sk(sk)->reuse
    is already set via setsockopt(SCTP_REUSE_PORT), sctp_sk(sk)->bind_hash will
    be dereferenced as sk_state is LISTENING, which causes a crash as bind_hash
    is NULL.
    
      KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
      RIP: 0010:sctp_inet_listen+0x7f0/0xa20 net/sctp/socket.c:8617
      Call Trace:
       <TASK>
       __sys_listen_socket net/socket.c:1883 [inline]
       __sys_listen+0x1b7/0x230 net/socket.c:1894
       __do_sys_listen net/socket.c:1902 [inline]
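
    The rollback described above can be sketched in user space. All
    names below are hypothetical stand-ins rather than the SCTP code;
    the "autobind_ok" flag only simulates whether a port was available.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum state_sketch { SKETCH_CLOSED, SKETCH_LISTENING };

/* Hypothetical socket: bind_hash stays NULL until a bind succeeds. */
struct sock_sketch {
	enum state_sketch state;
	const void *bind_hash;
};

/* Stand-in for autobind; 'ok' simulates whether a port was available. */
static int autobind_sketch(struct sock_sketch *sk, bool ok)
{
	static const int bucket = 0;

	if (!ok)
		return -EAGAIN;
	sk->bind_hash = &bucket;
	return 0;
}

static int listen_start_sketch(struct sock_sketch *sk, bool autobind_ok)
{
	assert(sk != NULL);
	sk->state = SKETCH_LISTENING;
	if (!sk->bind_hash && autobind_sketch(sk, autobind_ok) < 0) {
		/*
		 * Roll back so a later listen() never sees LISTENING
		 * paired with a NULL bind_hash.
		 */
		sk->state = SKETCH_CLOSED;
		return -EAGAIN;
	}
	return 0;
}
```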
    
    Fixes: 5e8f3f703ae4 ("sctp: simplify sctp listening code")
    Reported-by: [email protected]
    Signed-off-by: Xin Long <[email protected]>
    Acked-by: Marcelo Ricardo Leitner <[email protected]>
    Link: https://patch.msgid.link/a93e655b3c153dc8945d7a812e6d8ab0d52b7aa0.1727729391.git.lucien.xin@gmail.com
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftest: hid: add missing run-hid-tools-tests.sh [+ + +]

Author: Yun Lu <[email protected]>
Date:   Sun Sep 29 16:55:49 2024 +0800

    selftest: hid: add missing run-hid-tools-tests.sh
    
    [ Upstream commit 160c826b4dd0d570f0f51cf002cb49bda807e9f5 ]
    
    HID test cases run tests using the run-hid-tools-tests.sh script.
    When installed with "make install", the run-hid-tools-tests.sh
    script will not be copied over, resulting in the following error message.
    
      make -C tools/testing/selftests/ TARGETS=hid install \
              INSTALL_PATH=$KSFT_INSTALL_PATH
    
      cd $KSFT_INSTALL_PATH
      ./run_kselftest.sh -c hid
    
    selftests: hid: hid-core.sh
    bash: ./run-hid-tools-tests.sh: No such file or directory
    
    Add the run-hid-tools-tests.sh script to the TEST_FILES in the Makefile
    for it to be installed.
    
    Fixes: ffb85d5c9e80 ("selftests: hid: import hid-tools hid-core tests")
    Signed-off-by: Yun Lu <[email protected]>
    Acked-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/bpf: fix uprobe.path leak in bpf_testmod [+ + +]

Author: Jiri Olsa <[email protected]>
Date:   Thu Aug 1 15:27:24 2024 +0200

    selftests/bpf: fix uprobe.path leak in bpf_testmod
    
    [ Upstream commit db61e6a4eee5a7884b2cafeaf407895f253bbaa7 ]
    
    testmod_unregister_uprobe() forgets to path_put(&uprobe.path).
    
    Signed-off-by: Jiri Olsa <[email protected]>
    Signed-off-by: Oleg Nesterov <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: fix charge_reserved_hugetlb.sh test [+ + +]

Author: David Hildenbrand <[email protected]>
Date:   Wed Aug 21 14:31:15 2024 +0200

    selftests/mm: fix charge_reserved_hugetlb.sh test
    
    [ Upstream commit c41a701d18efe6b8aa402efab16edbaba50c9548 ]
    
    Currently, running the charge_reserved_hugetlb.sh selftest we can
    sometimes observe something like:
    
      $ ./charge_reserved_hugetlb.sh -cgroup-v2
      ...
      write_result is 0
      After write:
      hugetlb_usage=0
      reserved_usage=10485760
      killing write_to_hugetlbfs
      Received 2.
      Deleting the memory
      Detach failure: Invalid argument
      umount: /mnt/huge: target is busy.
    
    Both cases are issues in the test.
    
    While the unmount error seems to be racy, it will make the test fail:
            $ ./run_vmtests.sh -t hugetlb
            ...
            # [FAIL]
            not ok 10 charge_reserved_hugetlb.sh -cgroup-v2 # exit=32
    
    The issue is that we are not waiting for the write_to_hugetlbfs process to
    quit.  So it might still have a hugetlbfs file open, about which umount is
    not happy.  Fix that by making "killall" wait for the process to quit.
    
    The other error ("Detach failure: Invalid argument") does not seem to
    result in a test error, but is misleading.  Turns out write_to_hugetlbfs.c
    unconditionally tries to cleanup using shmdt(), even when we only
    mmap()'ed a hugetlb file.  Even worse, shmaddr is never even set for the
    SHM case.  Fix that as well.
    
    With this change it seems to work as expected.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Mario Casquero <[email protected]>
    Reviewed-by: Mina Almasry <[email protected]>
    Tested-by: Mario Casquero <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: Muchun Song <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/nolibc: avoid passing NULL to printf("%s") [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Wed Aug 7 23:51:44 2024 +0200

    selftests/nolibc: avoid passing NULL to printf("%s")
    
    [ Upstream commit f1a58f61d88642ae1e6e97e9d72d73bc70a93cb8 ]
    
    Clang on higher optimization levels detects that NULL is passed to
    printf("%s") and warns about it.
    While printf() from nolibc gracefully handles that NULL,
    it is undefined behavior as per POSIX, so the warning is reasonable.
    Avoid the warning by transforming NULL into a non-NULL placeholder.
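
    One way to express such a transformation, as a sketch: the helper
    names and the "(null)" placeholder text are assumptions chosen for
    illustration, not necessarily what the test uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map NULL to a printable placeholder. */
static const char *printable_sketch(const char *s)
{
	return s ? s : "(null)";
}

/* Format "name = value" without ever handing NULL to "%s". */
static int format_value_sketch(char *buf, size_t len, const char *name,
			       const char *value)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "%s = %s", name, printable_sketch(value));
}
```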
    
    Reviewed-by: Shuah Khan <[email protected]>
    Acked-by: Willy Tarreau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: breakpoints: use remaining time to check if suspend succeed [+ + +]

Author: Yifei Liu <[email protected]>
Date:   Mon Sep 30 15:40:25 2024 -0700

    selftests: breakpoints: use remaining time to check if suspend succeed
    
    [ Upstream commit c66be905cda24fb782b91053b196bd2e966f95b7 ]
    
    step_after_suspend_test fails with a device busy error while
    writing to /sys/power/state to start suspend. The test believes
    it failed to enter the suspend state:
    
    $ sudo ./step_after_suspend_test
    TAP version 13
    Bail out! Failed to enter Suspend state
    
    However, in the kernel message, I indeed see the system get
    suspended and then wake up later.
    
    [611172.033108] PM: suspend entry (s2idle)
    [611172.044940] Filesystems sync: 0.006 seconds
    [611172.052254] Freezing user space processes
    [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds)
    [611172.067920] OOM killer disabled.
    [611172.072465] Freezing remaining freezable tasks
    [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
    [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug)
    [611172.117126] serial 00:03: disabled
    some other hardware get reconnected
    [611203.136277] OOM killer enabled.
    [611203.140637] Restarting tasks ...
    [611203.141135] usb 1-8.1: USB disconnect, device number 7
    [611203.141755] done.
    [611203.155268] random: crng reseeded on system resumption
    [611203.162059] PM: suspend exit
    
    After investigation, I noticed that for the code block
    if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem"))
            ksft_exit_fail_msg("Failed to enter Suspend state\n");
    
    The write returns -1 and errno is set to 16 (device busy).
    This is probably because the write does not return successfully
    before the system suspends, and the return value gets garbled on
    wakeup. It may therefore be better to check how much time passed
    across those few instructions to determine whether the suspend
    actually happened, since it is very unlikely that executing those
    few lines takes 5 seconds.
    
    The timer that wakes the system up is set to expire after 5 seconds
    and is not re-armed. If the remaining time on the timer is 0 seconds
    and 0 nanoseconds, the timer expired and woke the system up.
    Otherwise, if any time remains, the system can be considered to have
    failed to enter the suspend state.
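
    The check can be sketched as a small predicate. It assumes that
    "remaining" is the it_value field reported by timerfd_gettime()
    right after the write to /sys/power/state returns; the function
    name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Decide whether suspend happened from the time left on a one-shot
 * wake-up timer that was armed before the suspend attempt.
 */
static bool suspend_happened_sketch(const struct timespec *remaining)
{
	assert(remaining != NULL);
	/* 0 s and 0 ns left: the timer fired, so it woke the system. */
	return remaining->tv_sec == 0 && remaining->tv_nsec == 0;
}
```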
    
    After applying this patch, the test no longer fails because it
    mistakenly believes the system did not suspend. It can now continue
    with the rest of the test after suspend.
    
    Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test")
    Reported-by: Sinadin Shan <[email protected]>
    Signed-off-by: Yifei Liu <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: netfilter: Add missing return value [+ + +]

Author: zhang jiao <[email protected]>
Date:   Fri Sep 27 11:22:05 2024 +0800

    selftests: netfilter: Add missing return value
    
    [ Upstream commit 10dbd23633f0433f8d13c2803d687b36a675ef60 ]
    
    There is no return value in count_entries, just add it.
    
    Fixes: eff3c558bb7e ("netfilter: ctnetlink: support filtering by zone")
    Signed-off-by: zhang jiao <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: netfilter: Fix nft_audit.sh for newer nft binaries [+ + +]

Author: Phil Sutter <[email protected]>
Date:   Thu Sep 26 18:56:31 2024 +0200

    selftests: netfilter: Fix nft_audit.sh for newer nft binaries
    
    [ Upstream commit 8a89015644513ef69193a037eb966f2d55fe385a ]
    
    As a side-effect of nftables' commit dbff26bfba833 ("cache: consolidate
    reset command"), audit logs changed when more objects were reset than
    fit into a single netlink message.
    
    Since the objects' distribution in netlink messages is not relevant,
    implement a summarizing function which combines repeated audit logs into
    a single one with summed up 'entries=' value.
    
    Fixes: 203bb9d39866 ("selftests: netfilter: Extend nft_audit.sh")
    Signed-off-by: Phil Sutter <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: vDSO: fix ELF hash table entry size for s390x [+ + +]

Author: Jens Remus <[email protected]>
Date:   Wed Sep 11 10:50:14 2024 +0200

    selftests: vDSO: fix ELF hash table entry size for s390x
    
    [ Upstream commit 14be4e6f35221c4731b004553ecf7cbc6dc1d2d8 ]
    
    The vDSO self tests fail on s390x for a vDSO linked with the GNU linker
    ld as follows:
    
      # ./vdso_test_gettimeofday
      Floating point exception (core dumped)
    
    On s390x the ELF hash table entries are 64 bits instead of 32 bits in
    size (see Glibc sysdeps/unix/sysv/linux/s390/bits/elfclass.h).
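
    A sketch of an entry type that follows this rule; the typedef and
    accessor names are hypothetical. Reading a big-endian 64-bit bucket
    count through a 32-bit type would yield its upper half, i.e. 0 for
    small counts, and a later modulo by that count would divide by
    zero, which would be consistent with the exception shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Width of one DT_HASH entry: 64 bits on s390x, 32 bits otherwise.
 * Simplified: other ABIs with 64-bit entries are ignored here.
 */
#if defined(__s390x__)
typedef uint64_t hash_entry_sketch_t;
#else
typedef uint32_t hash_entry_sketch_t;
#endif

/* A SysV hash table starts with nbucket, followed by nchain. */
static uint64_t hash_nbucket_sketch(const hash_entry_sketch_t *hash)
{
	assert(hash != NULL);
	return hash[0];
}

static uint64_t hash_nchain_sketch(const hash_entry_sketch_t *hash)
{
	assert(hash != NULL);
	return hash[1];
}
```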
    
    Fixes: 40723419f407 ("kselftest: Enable vDSO test on non x86 platforms")
    Reported-by: Heiko Carstens <[email protected]>
    Tested-by: Heiko Carstens <[email protected]>
    Signed-off-by: Jens Remus <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: vDSO: fix vDSO name for powerpc [+ + +]

Author: Christophe Leroy <[email protected]>
Date:   Fri Aug 30 14:28:35 2024 +0200

    selftests: vDSO: fix vDSO name for powerpc
    
    [ Upstream commit 59eb856c3ed9b3552befd240c0c339f22eed3fa1 ]
    
    The following error occurs when running vdso_test_correctness on powerpc:
    
    ~ # ./vdso_test_correctness
    [WARN]  failed to find vDSO
    [SKIP]  No vDSO, so skipping clock_gettime() tests
    [SKIP]  No vDSO, so skipping clock_gettime64() tests
    [RUN]   Testing getcpu...
    [OK]    CPU 0: syscall: cpu 0, node 0
    
    On powerpc, vDSO is neither called linux-vdso.so.1 nor linux-gate.so.1
    but linux-vdso32.so.1 or linux-vdso64.so.1.
    
    Also search those two names before giving up.
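
    The lookup order can be sketched as a search over candidate names.
    The "loaded" list and the function name are hypothetical; a real
    test would more likely probe each name with dlopen() than compare
    strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names under which the vDSO may appear, per the text above. */
static const char *const vdso_names_sketch[] = {
	"linux-vdso.so.1",
	"linux-gate.so.1",
	"linux-vdso32.so.1",	/* powerpc, 32 bit */
	"linux-vdso64.so.1",	/* powerpc, 64 bit */
};

/*
 * Return the first known vDSO name present in 'loaded' (a hypothetical
 * list of loaded object names), or NULL if none matches.
 */
static const char *find_vdso_name_sketch(const char *const *loaded, size_t n)
{
	size_t count = sizeof(vdso_names_sketch) / sizeof(vdso_names_sketch[0]);
	size_t i, j;

	assert(loaded != NULL || n == 0);
	for (i = 0; i < count; i++)
		for (j = 0; j < n; j++)
			if (strcmp(vdso_names_sketch[i], loaded[j]) == 0)
				return vdso_names_sketch[i];
	return NULL;
}
```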
    
    Fixes: c7e5789b24d3 ("kselftest: Move test_vdso to the vDSO test suite")
    Signed-off-by: Christophe Leroy <[email protected]>
    Acked-by: Shuah Khan <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: vDSO: fix vDSO symbols lookup for powerpc64 [+ + +]

Author: Christophe Leroy <[email protected]>
Date:   Fri Aug 30 14:28:37 2024 +0200

    selftests: vDSO: fix vDSO symbols lookup for powerpc64
    
    [ Upstream commit ba83b3239e657469709d15dcea5f9b65bf9dbf34 ]
    
    On powerpc64, the following tests fail to locate vDSO functions:
    
      ~ # ./vdso_test_abi
      TAP version 13
      1..16
      # [vDSO kselftest] VDSO_VERSION: LINUX_2.6.15
      # Couldn't find __kernel_gettimeofday
      ok 1 # SKIP __kernel_gettimeofday
      # clock_id: CLOCK_REALTIME
      # Couldn't find __kernel_clock_gettime
      ok 2 # SKIP __kernel_clock_gettime CLOCK_REALTIME
      # Couldn't find __kernel_clock_getres
      ok 3 # SKIP __kernel_clock_getres CLOCK_REALTIME
      ...
      # Couldn't find __kernel_time
      ok 16 # SKIP __kernel_time
      # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:16 error:0
    
      ~ # ./vdso_test_getrandom
      __kernel_getrandom is missing!
    
      ~ # ./vdso_test_gettimeofday
      Could not find __kernel_gettimeofday
    
      ~ # ./vdso_test_getcpu
      Could not find __kernel_getcpu
    
    On powerpc64, as shown below by readelf, vDSO function symbols have
    type NOTYPE, so also accept that type when looking for symbols.
    
    $ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg
    ELF Header:
      Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, big endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              DYN (Shared object file)
      Machine:                           PowerPC64
      Version:                           0x1
    ...
    
    Symbol table '.dynsym' contains 12 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         2: 00000000000005f0    36 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         3: 0000000000000578    68 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         4: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
         5: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         6: 0000000000000614   172 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         7: 00000000000006f0    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         8: 000000000000047c    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
         9: 0000000000000454    12 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
        10: 00000000000004d0    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
        11: 00000000000005bc    52 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
    
    Symbol table '.symtab' contains 56 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
    ...
        45: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
        46: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __kernel_getcpu
        47: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_getres
        48: 00000000000005f0    36 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_tbfreq
        49: 000000000000047c    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_gettimeofday
        50: 0000000000000614   172 NOTYPE  GLOBAL DEFAULT    8 __kernel_sync_dicache
        51: 00000000000006f0    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_getrandom
        52: 0000000000000454    12 NOTYPE  GLOBAL DEFAULT    8 __kernel_sigtram[...]
        53: 0000000000000578    68 NOTYPE  GLOBAL DEFAULT    8 __kernel_time
        54: 00000000000004d0    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_g[...]
        55: 00000000000005bc    52 NOTYPE  GLOBAL DEFAULT    8 __kernel_get_sys[...]
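
    A sketch of a symbol type filter that also accepts NOTYPE, using
    the macros from <elf.h>. The function name is hypothetical and this
    is not the parser's actual code.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>

/* Accept function symbols, plus untyped ones as seen on powerpc64. */
static bool vdso_sym_type_ok_sketch(unsigned char st_info)
{
	unsigned int type = ELF64_ST_TYPE(st_info);

	assert(type <= 0xf);	/* the type lives in the low nibble */
	return type == STT_FUNC || type == STT_NOTYPE;
}
```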
    
    Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser")
    Signed-off-by: Christophe Leroy <[email protected]>
    Acked-by: Shuah Khan <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: vDSO: fix vdso_config for powerpc [+ + +]

Author: Christophe Leroy <[email protected]>
Date:   Fri Aug 30 14:28:36 2024 +0200

    selftests: vDSO: fix vdso_config for powerpc
    
    [ Upstream commit 7d297c419b08eafa69ce27243ee9bbecab4fcaa4 ]
    
    Running vdso_test_correctness on powerpc64 gives the following warning:
    
      ~ # ./vdso_test_correctness
      Warning: failed to find clock_gettime64 in vDSO
    
    This is because vdso_test_correctness was built with VDSO_32BIT defined.
    
    __powerpc__ macro is defined on both powerpc32 and powerpc64 so
    __powerpc64__ needs to be checked first in vdso_config.h
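
    The ordering issue can be sketched with a stand-alone macro. The
    SKETCH_VDSO_32BIT name and the pointer-width fallback for other
    targets are assumptions for illustration, not the contents of
    vdso_config.h.

```c
#include <assert.h>

/*
 * __powerpc__ is defined on both powerpc32 and powerpc64, so the more
 * specific macro has to be tested first.
 */
#if defined(__powerpc64__)
#define SKETCH_VDSO_32BIT 0
#elif defined(__powerpc__)
#define SKETCH_VDSO_32BIT 1
#else
/* Hypothetical fallback for other targets: go by pointer width. */
#define SKETCH_VDSO_32BIT (sizeof(void *) == 4)
#endif

static int vdso_is_32bit_sketch(void)
{
	int is32 = SKETCH_VDSO_32BIT;

	/* Sanity check: the selection must agree with the pointer width. */
	assert(is32 == (sizeof(void *) == 4));
	return is32;
}
```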
    
    Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
    Signed-off-by: Christophe Leroy <[email protected]>
    Acked-by: Shuah Khan <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: vDSO: fix vdso_config for s390 [+ + +]

Author: Heiko Carstens <[email protected]>
Date:   Wed Sep 11 10:50:15 2024 +0200

    selftests: vDSO: fix vdso_config for s390
    
    [ Upstream commit a6e23fb8d3c0e3904da70beaf5d7e840a983c97f ]
    
    Running vdso_test_correctness on s390x (aka s390 64 bit) emits a warning:
    
    Warning: failed to find clock_gettime64 in vDSO
    
    This is caused by the "#elif defined (__s390__)" check in vdso_config.h
    which then defines VDSO_32BIT.
    
    If __s390x__ is defined, __s390__ is defined as well. The correct
    check must therefore make sure that only __s390__ is defined.
    
    Add the missing !defined(__s390x__). Also use the common
    __s390x__ define instead of __s390X__.
    
    Signed-off-by: Heiko Carstens <[email protected]>
    Fixes: 693f5ca08ca0 ("kselftest: Extend vDSO selftest")
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

smb3: fix incorrect mode displayed for read-only files [+ + +]

Author: Steve French <[email protected]>
Date:   Sat Sep 21 23:28:32 2024 -0500

    smb3: fix incorrect mode displayed for read-only files
    
    commit 2f3017e7cc7515e0110a3733d8dca84de2a1d23d upstream.
    
    Commands like "chmod 0444" mark a file readonly via the attribute flag
    (when mapping of mode bits into the ACL is not set, or POSIX extensions
    are not negotiated), but they were not reported correctly for stat of
    directories (they were reported ok for files and for "ls").  See example
    below:
    
        root:~# ls /mnt2 -l
        total 12
        drwxr-xr-x 2 root root         0 Sep 21 18:03 normaldir
        -rwxr-xr-x 1 root root         0 Sep 21 23:24 normalfile
        dr-xr-xr-x 2 root root         0 Sep 21 17:55 readonly-dir
        -r-xr-xr-x 1 root root 209716224 Sep 21 18:15 readonly-file
        root:~# stat -c %a /mnt2/readonly-dir
        755
        root:~# stat -c %a /mnt2/readonly-file
        555
    
    This fixes the stat of directories when ATTR_READONLY is set
    (in cases where the mode cannot be obtained in other ways).
    
        root:~# stat -c %a /mnt2/readonly-dir
        555
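
    The effect on the mode bits can be sketched as below. The helper is
    hypothetical; 0x0001 is the read-only file attribute value from the
    SMB protocol, and the sketch simply clears every write bit when it
    is set.

```c
#include <assert.h>

/* Read-only file attribute bit as defined by the SMB protocol. */
#define SKETCH_ATTR_READONLY 0x0001u

/* Drop all write bits when the server marks the object read-only. */
static unsigned int apply_readonly_sketch(unsigned int mode,
					  unsigned int attrs)
{
	assert((mode & ~07777u) == 0);	/* permission bits only */
	if (attrs & SKETCH_ATTR_READONLY)
		mode &= ~0222u;
	return mode;
}
```

    With this, a directory that would otherwise report 755 reports 555
    once the attribute is set, matching the output above.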
    
    Cc: [email protected]
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

smb: client: use actual path when queryfs [+ + +]

Author: wangrong <[email protected]>
Date:   Thu Jun 20 16:37:29 2024 +0800

    smb: client: use actual path when queryfs
    
    commit a421e3fe0e6abe27395078f4f0cec5daf466caea upstream.
    
    Due to server permission control, the client does not have access to
    the shared root directory, but can access subdirectories normally, so
    users usually mount the shared subdirectories directly. In this case,
    queryfs should use the actual path instead of the root directory to
    avoid the call returning an error (EACCES).
    
    Signed-off-by: wangrong <[email protected]>
    Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
    Cc: [email protected]
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spi: bcm63xx: Fix missing pm_runtime_disable() [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Aug 19 20:33:49 2024 +0800

    spi: bcm63xx: Fix missing pm_runtime_disable()
    
    commit 265697288ec2160ca84707565d6641d46f69b0ff upstream.
    
    The pm_runtime_disable() is missing in the remove function, fix it
    by using devm_pm_runtime_enable(), so the pm_runtime_disable() in
    the probe error path can also be removed.
    
    Fixes: 2d13f2ff6073 ("spi: bcm63xx-spi: fix pm_runtime")
    Cc: [email protected] # v5.13+
    Signed-off-by: Jinjie Ruan <[email protected]>
    Suggested-by: Jonas Gorski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spi: bcm63xx: Fix module autoloading [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Aug 19 20:33:48 2024 +0800

    spi: bcm63xx: Fix module autoloading
    
    commit 909f34f2462a99bf876f64c5c61c653213e32fce upstream.
    
    Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
    based on the alias from platform_device_id table.
    
    Fixes: 44d8fb30941d ("spi/bcm63xx: move register definitions into the driver")
    Cc: [email protected]
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Jonas Gorski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

spi: rpc-if: Add missing MODULE_DEVICE_TABLE [+ + +]

Author: Biju Das <[email protected]>
Date:   Wed Jul 31 08:29:53 2024 +0100

    spi: rpc-if: Add missing MODULE_DEVICE_TABLE
    
    [ Upstream commit 0880f669436028c5499901e5acd8f4b4ea0e0c6a ]
    
    Add missing MODULE_DEVICE_TABLE definition for automatic loading of the
    driver when it is built as a module.
    
    Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver")
    Signed-off-by: Biju Das <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: s3c64xx: fix timeout counters in flush_fifo [+ + +]

Author: Ben Dooks <[email protected]>
Date:   Tue Sep 24 14:40:08 2024 +0100

    spi: s3c64xx: fix timeout counters in flush_fifo
    
    [ Upstream commit 68a16708d2503b6303d67abd43801e2ca40c208d ]
    
    In the s3c64xx_flush_fifo() code, the loops counter is post-decremented
    in the do { } while(test && loops--) condition. This means loops is
    left at the unsigned equivalent of -1 if the loop times out. The test
    after the loop will never pass, as it tests for loops == 0.
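
    The counter behaviour can be reproduced in isolation. Both
    functions below are hypothetical reductions of the polling loop;
    pre-decrementing is shown as one possible way to leave the counter
    at 0 on timeout, not necessarily the change made by this commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Buggy shape: on timeout the final loops-- wraps past zero. */
static unsigned long poll_post_dec_sketch(bool busy, unsigned long loops)
{
	assert(loops > 0);
	do {
		/* a real driver would re-read the FIFO status here */
	} while (busy && loops--);
	return loops;
}

/* One possible fix: pre-decrement, so a timeout leaves loops at 0. */
static unsigned long poll_pre_dec_sketch(bool busy, unsigned long loops)
{
	assert(loops > 0);
	do {
		/* a real driver would re-read the FIFO status here */
	} while (busy && --loops);
	return loops;
}
```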
    
    Signed-off-by: Ben Dooks <[email protected]>
    Fixes: 230d42d422e7 ("spi: Add s3c64xx SPI Controller driver")
    Reviewed-by: Andi Shyti <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: spi-cadence: Fix missing spi_controller_is_target() check [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 12:00:15 2024 +0800

    spi: spi-cadence: Fix missing spi_controller_is_target() check
    
    [ Upstream commit 3eae4a916fc0eb6f85b5d399e10335dbd24dd765 ]
    
    The spi_controller_is_target() check is missing for pm_runtime_disable()
    in cdns_spi_remove(), add it.
    
    Fixes: b1b90514eaa3 ("spi: spi-cadence: Add support for Slave mode")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 12:00:14 2024 +0800

    spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    [ Upstream commit 67d4a70faa662df07451e83db1546d3ca0695e08 ]
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() before to fix it.
    
    Fixes: d36ccd9f7ea4 ("spi: cadence: Runtime pm adaptation")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]

Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 12:00:13 2024 +0800

    spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    [ Upstream commit b6e05ba0844139dde138625906015c974c86aa93 ]
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() before to fix it.
    
    Fixes: 43b6bf406cd0 ("spi: imx: fix runtime pm support for !CONFIG_PM")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

static_call: Handle module init failure correctly in static_call_del_module() [+ + +]

Author: Thomas Gleixner <[email protected]>
Date:   Wed Sep 4 11:09:07 2024 +0200

    static_call: Handle module init failure correctly in static_call_del_module()
    
    [ Upstream commit 4b30051c4864234ec57290c3d142db7c88f10d8a ]
    
    Module insertion invokes static_call_add_module() to initialize the static
    calls in a module. static_call_add_module() invokes __static_call_init(),
    which allocates a struct static_call_mod to either encapsulate the built-in
    static call sites of the associated key into it so further modules can be
    added or to append the module to the module chain.
    
    If that allocation fails the function returns with an error code and the
    module core invokes static_call_del_module() to clean up eventually added
    static_call_mod entries.
    
    This works correctly when all keys used by the module were converted over
    to a module chain before the failure. If not then static_call_del_module()
    causes a #GP as it blindly assumes that key::mods points to a valid struct
    static_call_mod.
    
    The problem is that key::mods is not an individual struct member of struct
    static_call_key, it's part of a union to save space:
    
            union {
                    /* bit 0: 0 = mods, 1 = sites */
                    unsigned long type;
                    struct static_call_mod *mods;
                    struct static_call_site *sites;
            };
    
    key::sites is a pointer to the list of built-in usage sites of the static
    call. The type of the pointer is differentiated by bit 0. A mods pointer
    has the bit clear, the sites pointer has the bit set.
    
    As static_call_del_module() blindly assumes that the pointer is a valid
    static_call_mod type, it fails to check for this failure case and
    dereferences the pointer to the list of built-in call sites, which is
    obviously bogus.
    
    Cure it by checking whether the key has a sites or a mods pointer.
    
    If it's a sites pointer then the key is not to be touched. As the sites are
    walked in the same order as in __static_call_init() the site walk can be
    terminated because all subsequent sites have not been touched by the init
    code due to the error exit.
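    
    The sites/mods check described above can be sketched in userspace as
    follows. The toy_* types and helpers are hypothetical and only mirror
    the quoted union layout; the sketch also assumes that a pointer fits
    in an unsigned long, as it does in the kernel:
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    
    struct toy_site { int dummy; };
    struct toy_mod { struct toy_mod *next; const void *mod; };
    
    /* Same layout idea as the union quoted above; bit 0 set means "sites". */
    struct toy_key {
        union {
            unsigned long type;
            struct toy_mod *mods;
            struct toy_site *sites;
        };
    };
    
    #define TOY_KEY_SITES 1UL
    
    static bool toy_key_has_sites(const struct toy_key *key)
    {
        return (key->type & TOY_KEY_SITES) != 0;
    }
    
    static void toy_key_set_sites(struct toy_key *key, struct toy_site *sites)
    {
        key->type = (unsigned long)sites | TOY_KEY_SITES;
    }
    
    /*
     * Returns the module chain, or NULL when the key still holds a tagged
     * sites pointer and therefore must not be dereferenced as a chain.
     */
    static struct toy_mod *toy_key_mods(const struct toy_key *key)
    {
        if (toy_key_has_sites(key))
            return NULL;
        return key->mods;
    }
    ```
    
    A cleanup walk built on toy_key_mods() can stop at the first key that
    returns NULL, which corresponds to the early termination described
    above.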
    
    If it was converted before the allocation failure, then the inner loop
    which searches for a module match will find nothing.
    
    A failure in the second allocation in __static_call_init() is harmless and
    does not require special treatment. The first allocation succeeded and
    converted the key to a module chain. That first entry has mod::mod == NULL
    and mod::next == NULL, so the inner loop of static_call_del_module() will
    neither find a module match nor a module chain. The next site in the walk
    was either already converted, but can't match the module, or it will exit
    the outer loop because it has a static_call_site pointer and not a
    static_call_mod pointer.
    
    Fixes: 9183c3f9ed71 ("static_call: Add inline static call infrastructure")
    Closes: https://lore.kernel.org/all/[email protected]
    Reported-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/r/87zfon6b0s.ffs@tglx
    Signed-off-by: Sasha Levin <[email protected]>

static_call: Replace pointless WARN_ON() in static_call_module_notify() [+ + +]

Author: Thomas Gleixner <[email protected]>
Date:   Wed Sep 4 11:08:28 2024 +0200

    static_call: Replace pointless WARN_ON() in static_call_module_notify()
    
    [ Upstream commit fe513c2ef0a172a58f158e2e70465c4317f0a9a2 ]
    
    static_call_module_notify() triggers a WARN_ON() when memory allocation
    fails in __static_call_add_module().
    
    That's not really justified, because the failure case must be correctly
    handled by the well known call chain and the error code is passed
    through to the initiating userspace application.
    
    A memory allocation failure is not a fatal problem, but the WARN_ON() takes
    the machine out when panic_on_warn is set.
    
    Replace it with a pr_warn().
    
    Fixes: 9183c3f9ed71 ("static_call: Add inline static call infrastructure")
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://lkml.kernel.org/r/8734mf7pmb.ffs@tglx
    Signed-off-by: Sasha Levin <[email protected]>

sunrpc: change sp_nrthreads from atomic_t to unsigned int. [+ + +]

Author: NeilBrown <[email protected]>
Date:   Mon Jul 15 17:14:18 2024 +1000

    sunrpc: change sp_nrthreads from atomic_t to unsigned int.
    
    [ Upstream commit 60749cbe3d8ae572a6c7dda675de3e8b25797a18 ]
    
    sp_nrthreads is only ever accessed under the service mutex
      nlmsvc_mutex nfs_callback_mutex nfsd_mutex
    so there is no need for it to be an atomic_t.
    
    The fact that all code using it is single-threaded means that we can
    simplify svc_pool_victim and remove the temporary elevation of
    sp_nrthreads.
    
    Signed-off-by: NeilBrown <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <[email protected]>

sysctl: avoid spurious permanent empty tables [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Mon Aug 5 11:39:35 2024 +0200

    sysctl: avoid spurious permanent empty tables
    
    commit 559d4c6a9d3b60f239493239070eb304edaea594 upstream.
    
    The test for whether a table is permanently empty inspects the address
    of the registered ctl_table argument.
    However, as sysctl_mount_point is an empty array and does not occupy any
    space, it can end up sharing an address with another object in memory.
    If that other object itself is a "struct ctl_table" then registering
    that table will fail as it's incorrectly recognized as permanently empty.
    
    Avoid this issue by adding a dummy element to the array so that it is
    not empty anymore.
    Explicitly register the table with zero elements as otherwise the dummy
    element would be recognized as a sentinel element which would lead to a
    runtime warning from the sysctl core.
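    
    A small userspace sketch of why the empty array is fragile, assuming
    GNU C zero-length arrays (a compiler extension, as noted below). The
    toy_* names are hypothetical and do not reproduce the sysctl code:
    
    ```c
    #include <stdbool.h>
    #include <stddef.h>
    
    struct toy_table { const char *procname; };
    
    /* GNU C extension: a zero-length array occupies no storage of its own. */
    static struct toy_table toy_empty[0];
    
    /* Variant with one dummy element: the array now has real storage. */
    static struct toy_table toy_padded[1] = { { NULL } };
    
    /* Identity test by address, in the spirit of the check described above. */
    static bool toy_is_permanently_empty(const struct toy_table *table)
    {
        return table == toy_padded;
    }
    ```
    
    Here sizeof(toy_empty) is 0, so nothing stops the toolchain from
    placing another object at the same address, whereas toy_padded has a
    size of one element and therefore an address that no other object
    shares.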
    
    While the issue does not seem to be encountered at this time, this seems
    mostly to be due to luck.
    Also a future change, constifying sysctl_mount_point and root_table, can
    reliably trigger this issue on clang 18.
    
    Given that empty arrays are non-standard in the first place it seems
    prudent to avoid them if possible.
    
    Fixes: 4a7b29f65094 ("sysctl: move sysctl type to ctl_table_header")
    Fixes: a35dd3a786f5 ("sysctl: drop now unnecessary out-of-bounds check")
    Cc: [email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Closes: https://lore.kernel.org/oe-lkp/[email protected]
    Signed-off-by: Joel Granados <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process [+ + +]

Author: Jason Xing <[email protected]>
Date:   Fri Aug 23 08:11:52 2024 +0800

    tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
    
    [ Upstream commit 0d9e5df4a257afc3a471a82961ace9a22b88295a ]
    
    We found that one close-wait socket was reset by the other side
    due to a new connection reusing the same port, which is beyond our
    expectation, so we had to investigate the underlying reason.
    
    The following experiment was conducted in the test environment. We
    limit the port range from 40000 to 40010 and delay the time to close()
    after receiving a fin from the active close side, which helps us
    easily reproduce what happened in production.
    
    Here are three connections captured by tcpdump:
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965525191
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 2769915070
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [F.], seq 1, ack 1
    // a few seconds later, within 60 seconds
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [.], ack 2
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [R], seq 2965525193
    // later, very quickly
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
    127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 3120990805
    127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1
    
    As we can see, the first flow is reset because:
    1) client starts a new connection (the second one)
    2) client tries to find a suitable port which is a timewait socket
       (its state is timewait, substate is fin_wait2)
    3) client occupies that timewait port to send a SYN
    4) server finds a corresponding close-wait socket in ehash table,
       then replies with a challenge ack
    5) client sends an RST to terminate this old close-wait socket.
    
    I don't think the port selection algorithm should choose a FIN_WAIT2
    socket when we turn on tcp_tw_reuse, because on the server side there
    remains unread data. In some cases, if one side hasn't called close()
    yet, we should not consider it as expendable and treat it at will.
    
    Even though the server sometimes isn't able to call close() as soon
    as we expect, its socket should not be terminated so easily,
    especially by a second, unrelated connection.
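    
    The policy argued for above can be sketched as a tiny predicate. This
    is a simplified model under stated assumptions: the enum and function
    names are hypothetical, and any further conditions the kernel applies
    before reusing a timewait port are left out:
    
    ```c
    #include <stdbool.h>
    
    /* Hypothetical substates of a timewait socket, named after the text. */
    enum toy_tw_substate { TOY_TW_TIME_WAIT, TOY_TW_FIN_WAIT2 };
    
    /*
     * Decide whether connect() may take over the port of a timewait socket.
     * A FIN_WAIT2 entry is never reused: the peer has not called close() yet.
     */
    static bool toy_tw_port_reusable(enum toy_tw_substate substate,
                                     bool tw_reuse)
    {
        if (substate == TOY_TW_FIN_WAIT2)
            return false;
        return tw_reuse;
    }
    ```
    
    With every candidate port rejected this way, port selection has
    nothing left to pick, which matches the failure shown below.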
    
    After this patch, we can see the expected failure if we start a
    connection when all the ports are occupied in fin_wait2 state:
    "Ncat: Cannot assign requested address."
    
    Reported-by: Jade Dong <[email protected]>
    Signed-off-by: Jason Xing <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tipc: guard against string buffer overrun [+ + +]

Author: Simon Horman <[email protected]>
Date:   Thu Aug 1 19:35:37 2024 +0100

    tipc: guard against string buffer overrun
    
    [ Upstream commit 6555a2a9212be6983d2319d65276484f7c5f431a ]
    
    Smatch reports that copying media_name and if_name to name_parts may
    overwrite the destination.
    
     .../bearer.c:166 bearer_name_validate() error: strcpy() 'media_name' too large for 'name_parts->media_name' (32 vs 16)
     .../bearer.c:167 bearer_name_validate() error: strcpy() 'if_name' too large for 'name_parts->if_name' (1010102 vs 16)
    
    This does seem to be the case so guard against this possibility by using
    strscpy() and failing if truncation occurs.
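    
    A userspace sketch of the "copy, and fail on truncation" pattern.
    toy_strscpy() only approximates the kernel's strscpy() contract
    (NUL-terminated result, -E2BIG on truncation), and struct
    toy_name_parts with its 16-byte fields is a hypothetical stand-in
    based on the sizes in the Smatch report:
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    
    /* Bounded copy: returns the copied length, or -E2BIG if src did not fit. */
    static int toy_strscpy(char *dst, const char *src, size_t size)
    {
        size_t i;
    
        assert(dst && src);
        if (size == 0)
            return -E2BIG;
        for (i = 0; i < size - 1 && src[i] != '\0'; i++)
            dst[i] = src[i];
        dst[i] = '\0';
        return src[i] == '\0' ? (int)i : -E2BIG;
    }
    
    struct toy_name_parts {
        char media_name[16];
        char if_name[16];
    };
    
    /* Returns 1 when both parts fit, 0 when either would be truncated. */
    static int toy_name_validate(const char *media, const char *ifname,
                                 struct toy_name_parts *parts)
    {
        if (toy_strscpy(parts->media_name, media,
                        sizeof(parts->media_name)) < 0)
            return 0;
        if (toy_strscpy(parts->if_name, ifname,
                        sizeof(parts->if_name)) < 0)
            return 0;
        return 1;
    }
    ```
    
    Unlike strcpy(), the copy never writes past the destination, and the
    caller learns about an over-long name instead of silently accepting
    it.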
    
    Introduced by commit b97bf3fd8f6a ("[TIPC] Initial merge")
    
    Compile tested only.
    
    Reviewed-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tomoyo: fallback to realpath if symlink's pathname does not exist [+ + +]

Author: Tetsuo Handa <[email protected]>
Date:   Wed Sep 25 22:30:59 2024 +0900

    tomoyo: fallback to realpath if symlink's pathname does not exist
    
    commit ada1986d07976d60bed5017aa38b7f7cf27883f7 upstream.
    
    Alfred Agrell found that TOMOYO cannot handle execveat(AT_EMPTY_PATH)
    inside chroot environment where /dev and /proc are not mounted, for
    commit 51f39a1f0cea ("syscalls: implement execveat() system call") missed
    that TOMOYO tries to canonicalize argv[0] when the filename fed to the
    executed program as argv[0] is supplied using potentially nonexistent
    pathname.
    
    Since "/dev/fd/<fd>" already lost symlink information used for obtaining
    that <fd>, it is too late to reconstruct symlink's pathname. Although
    <filename> part of "/dev/fd/<fd>/<filename>" might not be canonicalized,
    TOMOYO cannot use tomoyo_realpath_nofollow() when /dev or /proc is not
    mounted. Therefore, fall back to tomoyo_realpath_from_path() when
    tomoyo_realpath_nofollow() fails.
    
    Reported-by: Alfred Agrell <[email protected]>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1082001
    Fixes: 51f39a1f0cea ("syscalls: implement execveat() system call")
    Cc: [email protected] # v3.19+
    Signed-off-by: Tetsuo Handa <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tools/hv: Add memory allocation check in hv_fcopy_start [+ + +]

Author: Zhu Jun <[email protected]>
Date:   Fri Sep 6 02:13:33 2024 -0700

    tools/hv: Add memory allocation check in hv_fcopy_start
    
    [ Upstream commit 94e86b174d103d941b4afc4f016af8af9e5352fa ]
    
    Added error handling for memory allocation failures
    of file_name and path_name.
    
    Signed-off-by: Zhu Jun <[email protected]>
    Reviewed-by: Dexuan Cui <[email protected]>
    Tested-by: Saurabh Sengar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Wei Liu <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools/nolibc: powerpc: limit stack-protector workaround to GCC [+ + +]

Author: Thomas Weißschuh <[email protected]>
Date:   Wed Aug 7 23:51:39 2024 +0200

    tools/nolibc: powerpc: limit stack-protector workaround to GCC
    
    [ Upstream commit 1daea158d0aae0770371f3079305a29fdb66829e ]
    
    As mentioned in the comment, the workaround for
    __attribute__((no_stack_protector)) is only necessary on GCC.
    Avoid applying the workaround on clang, as clang does not recognize
    __attribute__((__optimize__)) and would fail.
    
    Acked-by: Willy Tarreau <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools/rtla: Fix installation from out-of-tree build [+ + +]

Author: Ben Hutchings <[email protected]>
Date:   Mon Sep 16 01:31:58 2024 +0200

    tools/rtla: Fix installation from out-of-tree build
    
    [ Upstream commit f771d5369f1dbfe32c93bcb4f5d7ca8322b15389 ]
    
    rtla now supports out-of-tree builds, but installation fails as it
    still tries to install the rtla binary from the source tree.  Use the
    existing macro $(RTLA) to refer to the binary.
    
    Link: https://lore.kernel.org/[email protected]
    Fixes: 01474dc706ca ("tools/rtla: Use tools/build makefiles to build rtla")
    Reviewed-by: Tomas Glozar <[email protected]>
    Tested-by: Tomas Glozar <[email protected]>
    Signed-off-by: Ben Hutchings <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools/x86/kcpuid: Protect against faulty "max subleaf" values [+ + +]

Author: Ahmed S. Darwish <[email protected]>
Date:   Thu Jul 18 15:47:44 2024 +0200

    tools/x86/kcpuid: Protect against faulty "max subleaf" values
    
    [ Upstream commit cf96ab1a966b87b09fdd9e8cc8357d2d00776a3a ]
    
    Protect against the kcpuid code parsing faulty max subleaf numbers
    through a min() expression, thus ensuring that max_subleaf will always
    be ≤ MAX_SUBLEAF_NUM.
    
    Use "u32" for the subleaf numbers since kcpuid is compiled with -Wextra,
    which includes signed/unsigned comparison warnings.
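    
    A minimal sketch of the clamp, assuming a hypothetical limit of 64
    (the real MAX_SUBLEAF_NUM value is not given here) and hypothetical
    toy_* helper names. Both operands are unsigned, so the comparison does
    not mix signed and unsigned values:
    
    ```c
    #include <stdint.h>
    
    typedef uint32_t u32;
    
    /* Hypothetical upper bound; stands in for MAX_SUBLEAF_NUM. */
    #define TOY_MAX_SUBLEAF_NUM 64u
    
    static u32 toy_min_u32(u32 a, u32 b)
    {
        return a < b ? a : b;
    }
    
    /* Clamp a value reported by hardware before using it as a loop bound. */
    static u32 toy_clamp_max_subleaf(u32 reported)
    {
        return toy_min_u32(reported, TOY_MAX_SUBLEAF_NUM);
    }
    ```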
    
    Signed-off-by: Ahmed S. Darwish <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

tracing/hwlat: Fix a race during cpuhp processing [+ + +]

Author: Wei Li <[email protected]>
Date:   Tue Sep 24 17:45:14 2024 +0800

    tracing/hwlat: Fix a race during cpuhp processing
    
    commit 2a13ca2e8abb12ee43ada8a107dadca83f140937 upstream.
    
    The cpuhp online/offline processing race also exists, in theory, in the
    percpu-mode hwlat tracer, so apply the fix there too. That is:
    
        T1                       | T2
        [CPUHP_ONLINE]           | cpu_device_down()
         hwlat_hotplug_workfn()  |
                                 |     cpus_write_lock()
                                 |     takedown_cpu(1)
                                 |     cpus_write_unlock()
        [CPUHP_OFFLINE]          |
            cpus_read_lock()     |
            start_kthread(1)     |
            cpus_read_unlock()   |
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations")
    Signed-off-by: Wei Li <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing/timerlat: Drop interface_lock in stop_kthread() [+ + +]

Author: Wei Li <[email protected]>
Date:   Tue Sep 24 17:45:12 2024 +0800

    tracing/timerlat: Drop interface_lock in stop_kthread()
    
    commit b484a02c9cedf8703eff8f0756f94618004bd165 upstream.
    
    stop_kthread() is the offline callback for "trace/osnoise:online". Since
    commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing
    of kthread in stop_kthread()"), the following ABBA deadlock scenario has
    been introduced:
    
    T1                            | T2 [BP]               | T3 [AP]
    osnoise_hotplug_workfn()      | work_for_cpu_fn()     | cpuhp_thread_fun()
                                  |   _cpu_down()         |   osnoise_cpu_die()
      mutex_lock(&interface_lock) |                       |     stop_kthread()
                                  |     cpus_write_lock() |       mutex_lock(&interface_lock)
      cpus_read_lock()            |     cpuhp_kick_ap()   |
    
    As the interface_lock here is just for protecting the "kthread" field of
    the osn_var, use xchg() instead to fix this issue. Also go back to using
    for_each_online_cpu() in stop_per_cpu_kthreads(), as it can take
    cpus_read_lock() again.
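    
    The lock-free claim of the "kthread" field can be sketched in
    userspace with C11 atomics, where atomic_exchange() plays the role of
    the kernel's xchg(). The toy_* types are hypothetical and the sketch
    does not stop a real thread:
    
    ```c
    #include <stdatomic.h>
    #include <stddef.h>
    
    struct toy_task { int stopped; };
    
    struct toy_osn_var {
        _Atomic(struct toy_task *) kthread;
    };
    
    /*
     * Atomically take ownership of the pointer and leave NULL behind.
     * Only one caller can observe the non-NULL value, so no mutex is
     * needed to decide who stops the thread.
     */
    static int toy_stop_kthread(struct toy_osn_var *var)
    {
        struct toy_task *t;
    
        t = atomic_exchange(&var->kthread, (struct toy_task *)NULL);
        if (!t)
            return 0; /* nothing to stop, or someone else already did */
        t->stopped = 1;
        return 1;
    }
    ```
    
    Because the exchange itself is the synchronization point, the offline
    callback in this model never has to wait on a lock that another
    hotplug path may be holding.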
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()")
    Signed-off-by: Wei Li <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing/timerlat: Fix a race during cpuhp processing [+ + +]

Author: Wei Li <[email protected]>
Date:   Tue Sep 24 17:45:13 2024 +0800

    tracing/timerlat: Fix a race during cpuhp processing
    
    commit 829e0c9f0855f26b3ae830d17b24aec103f7e915 upstream.
    
    Another exception was found: the "timerlat/1" thread was scheduled on
    CPU0, which finally led to timer corruption:
    
    ```
    ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220
    WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0
    Modules linked in:
    CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    RIP: 0010:debug_print_object+0x7d/0xb0
    ...
    Call Trace:
     <TASK>
     ? __warn+0x7c/0x110
     ? debug_print_object+0x7d/0xb0
     ? report_bug+0xf1/0x1d0
     ? prb_read_valid+0x17/0x20
     ? handle_bug+0x3f/0x70
     ? exc_invalid_op+0x13/0x60
     ? asm_exc_invalid_op+0x16/0x20
     ? debug_print_object+0x7d/0xb0
     ? debug_print_object+0x7d/0xb0
     ? __pfx_timerlat_irq+0x10/0x10
     __debug_object_init+0x110/0x150
     hrtimer_init+0x1d/0x60
     timerlat_main+0xab/0x2d0
     ? __pfx_timerlat_main+0x10/0x10
     kthread+0xb7/0xe0
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x2d/0x40
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    ```
    
    After tracing the scheduling event, it was discovered that the migration
    of the "timerlat/1" thread was performed during thread creation. Further
    analysis confirmed that this is because the CPU online processing for
    osnoise is implemented through workers, which is asynchronous with the
    offline processing. When the worker was scheduled to create a thread, the
    CPU may have already been removed from the cpu_online_mask during the
    offline process, resulting in the inability to select the right CPU:
    
    T1                       | T2
    [CPUHP_ONLINE]           | cpu_device_down()
    osnoise_hotplug_workfn() |
                             |     cpus_write_lock()
                             |     takedown_cpu(1)
                             |     cpus_write_unlock()
    [CPUHP_OFFLINE]          |
        cpus_read_lock()     |
        start_kthread(1)     |
        cpus_read_unlock()   |
    
    To fix this, skip online processing if the CPU is already offline.
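    
    A simplified model of that check, with hypothetical toy_* names and
    plain arrays standing in for cpu_online_mask and the per-CPU thread
    state. Locking is omitted, so this only illustrates re-checking the
    CPU state when the deferred work finally runs:
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    #define TOY_NR_CPUS 4
    
    static bool toy_cpu_online[TOY_NR_CPUS];
    static bool toy_kthread_running[TOY_NR_CPUS];
    
    /*
     * Deferred online work. The CPU may have gone offline between the
     * moment the work was queued and the moment it runs, so check again.
     */
    static bool toy_hotplug_workfn(unsigned int cpu)
    {
        assert(cpu < TOY_NR_CPUS);
        if (!toy_cpu_online[cpu])
            return false; /* skip: the CPU is already offline */
        toy_kthread_running[cpu] = true;
        return true;
    }
    ```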
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
    Signed-off-by: Wei Li <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline [+ + +]

Author: Wei Li <[email protected]>
Date:   Tue Sep 24 17:45:11 2024 +0800

    tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
    
    commit 0bb0a5c12ecf36ad561542bbb95f96355e036a02 upstream.
    
    osnoise_hotplug_workfn() is the asynchronous online callback for
    "trace/osnoise:online". It may be congested when a CPU goes online and
    offline repeatedly, and may then be invoked multiple times after a single
    online event.
    
    This will lead to a kthread leak and timer corruption. Add a check
    in start_kthread() to prevent this situation.
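    
    A toy model of such a guard, assuming hypothetical toy_* names; it
    only shows that a repeated start request becomes a no-op once a
    thread is recorded for the CPU:
    
    ```c
    #include <stddef.h>
    
    struct toy_cpu_var {
        const char *kthread;  /* non-NULL while a thread exists */
        unsigned int created; /* how many threads were ever created */
    };
    
    /* Start the per-CPU thread unless one is already recorded. */
    static int toy_start_kthread(struct toy_cpu_var *var, const char *name)
    {
        if (var->kthread)
            return 0; /* already running: do not create a duplicate */
        var->kthread = name;
        var->created++;
        return 0;
    }
    ```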
    
    Cc: [email protected]
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
    Signed-off-by: Wei Li <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

uprobes: fix kernel info leak via "[uprobes]" vma [+ + +]

Author: Oleg Nesterov <[email protected]>
Date:   Mon Oct 7 19:46:01 2024 +0200

    uprobes: fix kernel info leak via "[uprobes]" vma
    
    commit 34820304cc2cd1804ee1f8f3504ec77813d29c8e upstream.
    
    xol_add_vma() maps the uninitialized page allocated by __create_xol_area()
    into userspace. On some architectures (x86) this memory is readable even
    without VM_READ, because VM_EXEC results in the same pgprot_t as
    VM_EXEC|VM_READ, although this doesn't really matter: a debugger can read
    this memory anyway.
    
    Link: https://lore.kernel.org/all/[email protected]/
    
    Reported-by: Will Deacon <[email protected]>
    Fixes: d4b3b6384f98 ("uprobes/core: Allocate XOL slots for uprobes use")
    Cc: [email protected]
    Acked-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Oleg Nesterov <[email protected]>
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

vfs: use RCU in ilookup [+ + +]

Author: Mateusz Guzik <[email protected]>
Date:   Mon Jul 15 09:13:24 2024 +0200

    vfs: use RCU in ilookup
    
    [ Upstream commit 122381a46954ad592ee93d7da2bef5074b396247 ]
    
    A soft lockup in ilookup was reported when stress-testing a 512-way
    system [1] (see [2] for full context) and it was verified that not
    taking the lock shifts issues back to mm.
    
    [1] https://lore.kernel.org/linux-mm/[email protected]/
    [2] https://lore.kernel.org/linux-mm/[email protected]/
    
    Signed-off-by: Mateusz Guzik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

vhost/scsi: null-ptr-dereference in vhost_scsi_get_req() [+ + +]

Author: Haoran Zhang <[email protected]>
Date:   Tue Oct 1 15:14:15 2024 -0500

    vhost/scsi: null-ptr-dereference in vhost_scsi_get_req()
    
    commit 221af82f606d928ccef19a16d35633c63026f1be upstream.
    
    Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code
    from control queue handler") a null pointer dereference bug can be
    triggered when a guest sends an SCSI AN request.
    
    In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with
    `&v_req.tmf.lun[1]` within a switch-case block and is then passed to
    vhost_scsi_get_req() which extracts `vc->req` and `tpg`. However, for
    a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is
    set to NULL in this branch. Later, in vhost_scsi_get_req(),
    `vc->target` is dereferenced without being checked, leading to a null
    pointer dereference bug. This bug can be triggered from the guest.
    
    When this bug occurs, the vhost_worker process is killed while holding
    `vq->mutex` and the corresponding tpg will remain occupied
    indefinitely.
    
    Below is the KASAN report:
    Oops: general protection fault, probably for non-canonical address
    0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
    KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
    CPU: 1 PID: 840 Comm: poc Not tainted 6.10.0+ #1
    Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS
    1.16.3-debian-1.16.3-2 04/01/2014
    RIP: 0010:vhost_scsi_get_req+0x165/0x3a0
    Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2b 02 00 00
    48 b8 00 00 00 00 00 fc ff df 4d 8b 65 30 4c 89 e2 48 c1 ea 03 <0f> b6
    04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 be 01 00 00
    RSP: 0018:ffff888017affb50 EFLAGS: 00010246
    RAX: dffffc0000000000 RBX: ffff88801b000000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888017affcb8
    RBP: ffff888017affb80 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: ffff888017affc88 R14: ffff888017affd1c R15: ffff888017993000
    FS:  000055556e076500(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000200027c0 CR3: 0000000010ed0004 CR4: 0000000000370ef0
    Call Trace:
     <TASK>
     ? show_regs+0x86/0xa0
     ? die_addr+0x4b/0xd0
     ? exc_general_protection+0x163/0x260
     ? asm_exc_general_protection+0x27/0x30
     ? vhost_scsi_get_req+0x165/0x3a0
     vhost_scsi_ctl_handle_vq+0x2a4/0xca0
     ? __pfx_vhost_scsi_ctl_handle_vq+0x10/0x10
     ? __switch_to+0x721/0xeb0
     ? __schedule+0xda5/0x5710
     ? __kasan_check_write+0x14/0x30
     ? _raw_spin_lock+0x82/0xf0
     vhost_scsi_ctl_handle_kick+0x52/0x90
     vhost_run_work_list+0x134/0x1b0
     vhost_task_fn+0x121/0x350
    ...
     </TASK>
    ---[ end trace 0000000000000000 ]---
    
    Let's add a check in vhost_scsi_get_req.
    
    Fixes: 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from control queue handler")
    Signed-off-by: Haoran Zhang <[email protected]>
    [whitespace fixes]
    Signed-off-by: Mike Christie <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Michael S. Tsirkin <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

virt: sev-guest: Ensure the SNP guest messages do not exceed a page [+ + +]

Author: Nikunj A Dadhania <[email protected]>
Date:   Wed Jul 31 20:37:55 2024 +0530

    virt: sev-guest: Ensure the SNP guest messages do not exceed a page
    
    [ Upstream commit 2b9ac0b84c2cae91bbaceab62df4de6d503421ec ]
    
    Currently, struct snp_guest_msg includes a message header (96 bytes) and
    a payload (4000 bytes). There is an implicit assumption here that the
    SNP message header will always be 96 bytes, and with that assumption the
    payload array size has been set to 4000 bytes - a magic number. If any
    new member is added to the SNP message header, the SNP guest message
    will span more than a page.
    
    Instead of using a magic number for the payload, declare struct
    snp_guest_msg in a way that payload plus the message header do not
    exceed a page.
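    
    One way to express that constraint, sketched in userspace under stated
    assumptions: TOY_PAGE_SIZE is fixed at 4096, and struct toy_msg_hdr is
    a hypothetical 96-byte placeholder rather than the real SNP header
    layout:
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    
    #define TOY_PAGE_SIZE 4096u
    
    /* Placeholder header; only its size (96 bytes) matters here. */
    struct toy_msg_hdr {
        uint8_t bytes[96];
    };
    
    /* The payload size is derived from the header size, not hard-coded. */
    struct toy_guest_msg {
        struct toy_msg_hdr hdr;
        uint8_t payload[TOY_PAGE_SIZE - sizeof(struct toy_msg_hdr)];
    };
    
    /* Growing the header shrinks the payload; the total stays one page. */
    static_assert(sizeof(struct toy_guest_msg) == TOY_PAGE_SIZE,
                  "guest message must occupy exactly one page");
    ```
    
    With a 96-byte header the derived payload is 4000 bytes, matching the
    previous magic number, but the relationship now holds automatically
    if the header changes.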
    
      [ bp: Massage. ]
    
    Suggested-by: Tom Lendacky <[email protected]>
    Signed-off-by: Nikunj A Dadhania <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Acked-by: Borislav Petkov (AMD) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

vrf: revert "vrf: Remove unnecessary RCU-bh critical section" [+ + +]

Author: Willem de Bruijn <[email protected]>
Date:   Sun Sep 29 02:18:20 2024 -0400

    vrf: revert "vrf: Remove unnecessary RCU-bh critical section"
    
    commit b04c4d9eb4f25b950b33218e33b04c94e7445e51 upstream.
    
    This reverts commit 504fc6f4f7f681d2a03aa5f68aad549d90eab853.
    
    dev_queue_xmit_nit is expected to be called with BH disabled.
    __dev_queue_xmit has the following:
    
            /* Disable soft irqs for various locks below. Also
             * stops preemption for RCU.
             */
            rcu_read_lock_bh();
    
    VRF must follow this invariant. The referenced commit removed this
    protection, which triggered a lockdep warning:
    
            ================================
            WARNING: inconsistent lock state
            6.11.0 #1 Tainted: G        W
            --------------------------------
            inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
            btserver/134819 [HC0[0]:SC0[0]:HE1:SE1] takes:
            ffff8882da30c118 (rlock-AF_PACKET){+.?.}-{2:2}, at: tpacket_rcv+0x863/0x3b30
            {IN-SOFTIRQ-W} state was registered at:
              lock_acquire+0x19a/0x4f0
              _raw_spin_lock+0x27/0x40
              packet_rcv+0xa33/0x1320
              __netif_receive_skb_core.constprop.0+0xcb0/0x3a90
              __netif_receive_skb_list_core+0x2c9/0x890
              netif_receive_skb_list_internal+0x610/0xcc0
              [...]
    
            other info that might help us debug this:
             Possible unsafe locking scenario:
    
                   CPU0
                   ----
              lock(rlock-AF_PACKET);
              <Interrupt>
                lock(rlock-AF_PACKET);
    
             *** DEADLOCK ***
    
            Call Trace:
             <TASK>
             dump_stack_lvl+0x73/0xa0
             mark_lock+0x102e/0x16b0
             __lock_acquire+0x9ae/0x6170
             lock_acquire+0x19a/0x4f0
             _raw_spin_lock+0x27/0x40
             tpacket_rcv+0x863/0x3b30
             dev_queue_xmit_nit+0x709/0xa40
             vrf_finish_direct+0x26e/0x340 [vrf]
             vrf_l3_out+0x5f4/0xe80 [vrf]
             __ip_local_out+0x51e/0x7a0
              [...]
    
    Fixes: 504fc6f4f7f6 ("vrf: Remove unnecessary RCU-bh critical section")
    Link: https://lore.kernel.org/netdev/[email protected]/
    Reported-by: Ben Greear <[email protected]>
    Signed-off-by: Willem de Bruijn <[email protected]>
    Cc: [email protected]
    Reviewed-by: Ido Schimmel <[email protected]>
    Tested-by: Ido Schimmel <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

wifi: ath11k: fix array out-of-bound access in SoC stats [+ + +]

Author: Karthikeyan Periyasamy <[email protected]>
Date:   Thu Jul 4 12:38:11 2024 +0530

    wifi: ath11k: fix array out-of-bound access in SoC stats
    
    [ Upstream commit 69f253e46af98af17e3efa3e5dfa72fcb7d1983d ]
    
    Currently, the ath11k_soc_dp_stats::hal_reo_error array is defined with a
    maximum size of DP_REO_DST_RING_MAX. However, the ath11k_dp_process_rx()
    function accesses ath11k_soc_dp_stats::hal_reo_error using the REO
    destination SRNG ring ID, which is incorrect. SRNG ring IDs differ from
    normal ring IDs, and this usage leads to out-of-bounds array access. To
    fix this issue, modify ath11k_dp_process_rx() to use the normal ring ID
    directly instead of the SRNG ring ID.
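
    The indexing problem described above can be sketched as follows. The
    constants, the ID mapping and all helper names are hypothetical and
    chosen only for illustration; they are not the driver's definitions.
    The bounds check is an extra guard in this sketch, not part of the
    fix as described.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the driver's definitions. */
#define DP_REO_DST_RING_MAX 4   /* number of REO destination rings */
#define SRNG_ID_BASE        8   /* assumed offset of the SRNG ring IDs */

struct soc_dp_stats {
	uint32_t hal_reo_error[DP_REO_DST_RING_MAX];
};

/* Assumed mapping from a normal ring ID to its SRNG ring ID. */
static inline int srng_ring_id(int ring_id)
{
	return SRNG_ID_BASE + ring_id;
}

/* Returns 1 when idx is a valid index into hal_reo_error[]. */
static inline int reo_error_index_valid(int idx)
{
	return idx >= 0 && idx < DP_REO_DST_RING_MAX;
}

/*
 * Count an error against the normal ring ID. The range check is an
 * additional guard added in this sketch.
 */
static inline void count_reo_error(struct soc_dp_stats *stats, int ring_id)
{
	if (reo_error_index_valid(ring_id))
		stats->hal_reo_error[ring_id]++;
}
```

    Under these assumptions every SRNG ring ID falls outside the array,
    while the normal ring ID always indexes a valid slot.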
    
    Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
    
    Signed-off-by: Karthikeyan Periyasamy <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: ath12k: fix array out-of-bound access in SoC stats [+ + +]

Author: Karthikeyan Periyasamy <[email protected]>
Date:   Thu Jul 4 12:38:10 2024 +0530

    wifi: ath12k: fix array out-of-bound access in SoC stats
    
    [ Upstream commit e106b7ad13c1d246adaa57df73edb8f8b8acb240 ]
    
    Currently, the ath12k_soc_dp_stats::hal_reo_error array is defined with a
    maximum size of DP_REO_DST_RING_MAX. However, the ath12k_dp_rx_process()
    function accesses ath12k_soc_dp_stats::hal_reo_error using the REO
    destination SRNG ring ID, which is incorrect. SRNG ring IDs differ from
    normal ring IDs, and this usage leads to out-of-bounds array access. To
    fix this issue, modify ath12k_dp_rx_process() to use the normal ring ID
    directly instead of the SRNG ring ID.
    
    Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1
    
    Signed-off-by: Karthikeyan Periyasamy <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats() [+ + +]

Author: Dmitry Kandybka <[email protected]>
Date:   Thu Jul 25 14:17:43 2024 +0300

    wifi: ath9k: fix possible integer overflow in ath9k_get_et_stats()
    
    [ Upstream commit 3f66f26703093886db81f0610b97a6794511917c ]
    
    In 'ath9k_get_et_stats()', promote TX stats counters to 'u64'
    to avoid possible integer overflow. Compile tested only.
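
    A minimal sketch of the overflow class being addressed, assuming a
    set of 32-bit per-queue counters that are summed into one statistic.
    The queue count and function names are hypothetical and are not
    taken from the driver.

```c
#include <stdint.h>

#define NUM_TX_QUEUES 4  /* assumed queue count for illustration */

/* Sum in 32 bits: the result wraps once the total exceeds UINT32_MAX. */
static inline uint64_t sum_tx_stats_u32(const uint32_t counters[NUM_TX_QUEUES])
{
	uint32_t total = 0;

	for (int i = 0; i < NUM_TX_QUEUES; i++)
		total += counters[i];
	return total;
}

/* Promote each counter to 64 bits before adding, so the sum cannot wrap. */
static inline uint64_t sum_tx_stats_u64(const uint32_t counters[NUM_TX_QUEUES])
{
	uint64_t total = 0;

	for (int i = 0; i < NUM_TX_QUEUES; i++)
		total += (uint64_t)counters[i];
	return total;
}
```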
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Dmitry Kandybka <[email protected]>
    Acked-by: Toke Høiland-Jørgensen <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit [+ + +]

Author: Toke Høiland-Jørgensen <[email protected]>
Date:   Mon Aug 12 16:24:46 2024 +0200

    wifi: ath9k_htc: Use __skb_set_length() for resetting urb before resubmit
    
    [ Upstream commit 94745807f3ebd379f23865e6dab196f220664179 ]
    
    Syzbot points out that skb_trim() has a sanity check on the existing length of
    the skb, which can be uninitialised in some error paths. The intent here is
    clearly just to reset the length to zero before resubmitting, so switch to
    calling __skb_set_length(skb, 0) directly. In addition, __skb_set_length()
    already contains a call to skb_reset_tail_pointer(), so remove the redundant
    call.
    
    The syzbot report came from ath9k_hif_usb_reg_in_cb(), but there's a similar
    usage of skb_trim() in ath9k_hif_usb_rx_cb(); change both while we're at it.
    
    Reported-by: [email protected]
    Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: cfg80211: Set correct chandef when starting CAC [+ + +]

Author: Issam Hamdi <[email protected]>
Date:   Fri Aug 16 16:24:18 2024 +0200

    wifi: cfg80211: Set correct chandef when starting CAC
    
    [ Upstream commit 20361712880396e44ce80aaeec2d93d182035651 ]
    
    Starting CAC in a mode other than AP mode triggers a
    "WARNING: CPU: 0 PID: 63 at cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]"
    caused by chandef.chan being NULL at the end of CAC.
    
    Solution: Ensure the channel definition is set for the different modes
    when starting CAC to avoid getting a NULL 'chan' at the end of CAC.
    
     Call Trace:
      ? show_regs.part.0+0x14/0x16
      ? __warn+0x67/0xc0
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? report_bug+0xa7/0x130
      ? exc_overflow+0x30/0x30
      ? handle_bug+0x27/0x50
      ? exc_invalid_op+0x18/0x60
      ? handle_exception+0xf6/0xf6
      ? exc_overflow+0x30/0x30
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? exc_overflow+0x30/0x30
      ? cfg80211_chandef_dfs_usable+0x20/0xaf [cfg80211]
      ? regulatory_propagate_dfs_state.cold+0x1b/0x4c [cfg80211]
      ? cfg80211_propagate_cac_done_wk+0x1a/0x30 [cfg80211]
      ? process_one_work+0x165/0x280
      ? worker_thread+0x120/0x3f0
      ? kthread+0xc2/0xf0
      ? process_one_work+0x280/0x280
      ? kthread_complete_and_exit+0x20/0x20
      ? ret_from_fork+0x19/0x24
    
    Reported-by: Kretschmer Mathias <[email protected]>
    Signed-off-by: Issam Hamdi <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [shorten subject, remove OCB, reorder cases to match previous list]
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: allow only CN mcc from WRDD [+ + +]

Author: Anjaneyulu <[email protected]>
Date:   Thu Aug 8 23:22:49 2024 +0300

    wifi: iwlwifi: allow only CN mcc from WRDD
    
    [ Upstream commit ff5aabe7c2a4a4b089a9ced0cb3d0e284963a7dd ]
    
    Block any mcc other than CN from WRDD ACPI.
    
    Signed-off-by: Anjaneyulu <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://patch.msgid.link/20240808232017.fe6ea7aa4b39.I86004687a2963fe26f990770aca103e2f5cb1628@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: avoid NULL pointer dereference [+ + +]

Author: Miri Korenblit <[email protected]>
Date:   Sun Aug 25 19:17:09 2024 +0300

    wifi: iwlwifi: mvm: avoid NULL pointer dereference
    
    [ Upstream commit 557a6cd847645e667f3b362560bd7e7c09aac284 ]
    
    iwl_mvm_tx_skb_sta() and iwl_mvm_tx_mpdu() verify that the mvmsta
    pointer is not NULL.
    They retrieve this pointer using iwl_mvm_sta_from_mac80211(), which
    dereferences the ieee80211_sta pointer.
    If sta is NULL, iwl_mvm_sta_from_mac80211() will dereference a NULL
    pointer.
    Fix this by checking the sta pointer before retrieving the mvmsta
    from it. If sta is not NULL, then mvmsta isn't either.
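
    The ordering issue can be sketched as below. The structures and
    function names are hypothetical stand-ins, not the mac80211 or
    iwlwifi definitions; the point is only that the outer pointer is
    checked before anything is derived from it.

```c
#include <stddef.h>

/* Hypothetical driver-private station data. */
struct drv_sta {
	int sta_id;
};

/* Hypothetical station object that embeds the driver-private data. */
struct sta {
	struct drv_sta priv;
};

/* Derives the driver pointer from sta; must not be called with NULL. */
static inline struct drv_sta *drv_sta_from_sta(struct sta *sta)
{
	return &sta->priv;
}

/* Returns 0 on success, -1 when the frame must be dropped. */
static inline int tx_frame(struct sta *sta)
{
	struct drv_sta *dsta;

	/* Check sta first; only then is it safe to derive dsta from it. */
	if (!sta)
		return -1;

	/* sta is non-NULL here, so dsta is non-NULL as well. */
	dsta = drv_sta_from_sta(sta);
	return dsta->sta_id >= 0 ? 0 : -1;
}
```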
    
    Signed-off-by: Miri Korenblit <[email protected]>
    Reviewed-by: Johannes Berg <[email protected]>
    Link: https://patch.msgid.link/20240825191257.880921ce23b7.I340052d70ab6d3410724ce955eb00da10e08188f@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: drop wrong STA selection in TX [+ + +]

Author: Johannes Berg <[email protected]>
Date:   Thu Aug 8 23:22:48 2024 +0300

    wifi: iwlwifi: mvm: drop wrong STA selection in TX
    
    [ Upstream commit 1c7e1068a7c9c39ed27636db93e71911e0045419 ]
    
    This shouldn't happen at all, since in station mode all MMPDUs
    go through the TXQ for the STA, and not this function. There
    may or may not be a race in mac80211 through which this might
    happen for some frames while a station is being added, but in
    that case we can also just drop the frame and pretend the STA
    didn't exist yet.
    
    Also, the code is simply wrong since it uses deflink, and it's
    not easy to fix it since the mvmvif->ap_sta pointer cannot be
    used without the mutex, and perhaps the right link might not
    even be known.
    
    Just drop the frame at that point instead of trying to fix it
    up.
    
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://patch.msgid.link/20240808232017.45ad105dc7fe.I6d45c82e5758395d9afb8854057ded03c7dc81d7@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: Fix a race in scan abort flow [+ + +]

Author: Ilan Peer <[email protected]>
Date:   Sun Aug 25 08:56:37 2024 +0300

    wifi: iwlwifi: mvm: Fix a race in scan abort flow
    
    [ Upstream commit 87c1c28a9aa149489e1667f5754fc24f4973d2d0 ]
    
    When the upper layer requests to cancel an ongoing scan, a race
    is possible in which, by the time the driver starts to handle the
    upper layer's scan cancel flow, the FW has already completed handling
    the scan request and the driver has received the scan complete
    notification but has not yet handled it. In such a
    case the FW will simply ignore the scan abort request coming from
    the driver, no notification will arrive from the FW, and the entire
    abort flow will be considered a failure.
    
    To better handle this, check the status code returned by the FW for
    the scan abort command. In case the status indicates that
    no scan was aborted, complete the scan abort flow with success, i.e.,
    the scan was aborted, as the flow is expected to consume the scan
    complete notification.
    
    Signed-off-by: Ilan Peer <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://patch.msgid.link/20240825085558.483989d3baef.I3340556a222388504c6330b333360bf77d10f9e2@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: iwlwifi: mvm: use correct key iteration [+ + +]

Author: Johannes Berg <[email protected]>
Date:   Mon Jul 29 20:20:05 2024 +0300

    wifi: iwlwifi: mvm: use correct key iteration
    
    [ Upstream commit 4f1591d292277eec51d027405a92f0d4ef5e299e ]
    
    In the cases changed here, key iteration isn't done from
    an RCU critical section, but rather using the wiphy lock
    as protection. Therefore, just use ieee80211_iter_keys().
    The link switch case can therefore also use sync commands.
    
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Miri Korenblit <[email protected]>
    Link: https://patch.msgid.link/20240729201718.69a2d18580c1.I2148e04d4b467d0b100beac8f7e449bfaaf775a5@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: fix RCU list iterations [+ + +]

Author: Johannes Berg <[email protected]>
Date:   Tue Aug 27 09:49:40 2024 +0200

    wifi: mac80211: fix RCU list iterations
    
    [ Upstream commit ac35180032fbc5d80b29af00ba4881815ceefcb6 ]
    
    There are a number of places where RCU list iteration is
    used, but that aren't (always) called with RCU held. Use
    just list_for_each_entry() in most, and annotate iface
    iteration with the required locks.
    
    Reviewed-by: Miriam Rachel Korenblit <[email protected]>
    Link: https://patch.msgid.link/20240827094939.ed8ac0b2f897.I8443c9c3c0f8051841353491dae758021b53115e@changeid
    Signed-off-by: Johannes Berg <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mt76: mt7915: add dummy HW offload of IEEE 802.11 fragmentation [+ + +]

Author: Benjamin Lin <[email protected]>
Date:   Tue Aug 27 11:30:03 2024 +0200

    wifi: mt76: mt7915: add dummy HW offload of IEEE 802.11 fragmentation
    
    [ Upstream commit f2cc859149240d910fdc6405717673e0b84bfda8 ]
    
    Currently, the CONNAC2 series does not support encryption for fragmented
    Tx frames. Therefore, add dummy function mt7915_set_frag_threshold() to
    prevent SW IEEE 802.11 fragmentation.
    
    Signed-off-by: Benjamin Lin <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mt76: mt7915: disable tx worker during tx BA session enable/disable [+ + +]

Author: Felix Fietkau <[email protected]>
Date:   Tue Aug 27 11:29:54 2024 +0200

    wifi: mt76: mt7915: disable tx worker during tx BA session enable/disable
    
    [ Upstream commit 256cbd26fbafb30ba3314339106e5c594e9bd5f9 ]
    
    Avoids firmware race condition.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mt76: mt7915: hold dev->mt76.mutex while disabling tx worker [+ + +]

Author: Felix Fietkau <[email protected]>
Date:   Tue Aug 27 11:30:04 2024 +0200

    wifi: mt76: mt7915: hold dev->mt76.mutex while disabling tx worker
    
    [ Upstream commit 8f7152f10cb434f954aeff85ca1be9cd4d01912b ]
    
    Prevent racing against other functions disabling the same worker
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext() [+ + +]

Author: Gustavo A. R. Silva <[email protected]>
Date:   Wed Aug 21 15:23:51 2024 -0600

    wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_cmd_802_11_scan_ext()
    
    [ Upstream commit 498365e52bebcbc36a93279fe7e9d6aec8479cee ]
    
    Replace one-element array with a flexible-array member in
    `struct host_cmd_ds_802_11_scan_ext`.
    
    With this, fix the following warning:
    
    elo 16 17:51:58 surfacebook kernel: ------------[ cut here ]------------
    elo 16 17:51:58 surfacebook kernel: memcpy: detected field-spanning write (size 243) of single field "ext_scan->tlv_buffer" at drivers/net/wireless/marvell/mwifiex/scan.c:2239 (size 1)
    elo 16 17:51:58 surfacebook kernel: WARNING: CPU: 0 PID: 498 at drivers/net/wireless/marvell/mwifiex/scan.c:2239 mwifiex_cmd_802_11_scan_ext+0x83/0x90 [mwifiex]
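
    A minimal sketch of the two layouts, assuming a simplified command
    structure. The field and function names below are hypothetical and
    do not reproduce the real `struct host_cmd_ds_802_11_scan_ext`. With
    the one-element array the destination field has a declared size of
    1, which is what the warning above reports; a flexible-array member
    has no fixed size for the copy to exceed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified layouts; the real struct has different fields. */
struct scan_ext_old {
	uint32_t reserved;
	uint8_t tlv_buffer[1];   /* one-element array used as a variable tail */
};

struct scan_ext_new {
	uint32_t reserved;
	uint8_t tlv_buffer[];    /* C99 flexible-array member */
};

/* Allocate a command with room for tlv_len bytes of TLV data. */
static inline struct scan_ext_new *scan_ext_alloc(const uint8_t *tlv,
						  size_t tlv_len)
{
	struct scan_ext_new *cmd = calloc(1, sizeof(*cmd) + tlv_len);

	if (!cmd)
		return NULL;
	/* The tail was sized at allocation time, so tlv_len bytes fit. */
	memcpy(cmd->tlv_buffer, tlv, tlv_len);
	return cmd;
}
```

    The caller owns the returned buffer and releases it with free().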
    
    Reported-by: Andy Shevchenko <[email protected]>
    Closes: https://lore.kernel.org/linux-hardening/[email protected]/
    Signed-off-by: Gustavo A. R. Silva <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Acked-by: Brian Norris <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/ZsZa5xRcsLq9D+RX@elsanto
    Signed-off-by: Sasha Levin <[email protected]>

wifi: rtw88: select WANT_DEV_COREDUMP [+ + +]

Author: Zong-Zhe Yang <[email protected]>
Date:   Thu Jul 18 15:06:15 2024 +0800

    wifi: rtw88: select WANT_DEV_COREDUMP
    
    [ Upstream commit 7e989b0c1e33210c07340bf5228aa83ea52515b5 ]
    
    The driver invokes device coredump when the firmware crashes, so it
    should select WANT_DEV_COREDUMP itself.
    
    Signed-off-by: Zong-Zhe Yang <[email protected]>
    Signed-off-by: Ping-Ke Shih <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: rtw89: 885xb: reset IDMEM mode to prevent download firmware failure [+ + +]

Author: Ping-Ke Shih <[email protected]>
Date:   Wed Jul 24 13:26:25 2024 +0800

    wifi: rtw89: 885xb: reset IDMEM mode to prevent download firmware failure
    
    [ Upstream commit 80fb81bb46a57daedd5decbcc253ea48428a254e ]
    
    Different firmware types may change the IDMEM mode, so reset it to the
    default to avoid errors on RTL8851B/RTL8852B/RTL8852BT if such
    firmware was downloaded before.
    
        rtw89_8851be 0000:02:00.0: Firmware version 0.29.41.3, cmd version 0, type 5
        rtw89_8851be 0000:02:00.0: Firmware version 0.29.41.3, cmd version 0, type 3
        rtw89_8851be 0000:02:00.0: MAC has already powered on
        rtw89_8851be 0000:02:00.0: fw security fail
        rtw89_8851be 0000:02:00.0: download firmware fail
        rtw89_8851be 0000:02:00.0: [ERR]fwdl 0x1E0 = 0x62
        rtw89_8851be 0000:02:00.0: [ERR]fwdl 0x83F2 = 0x8
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f524
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f500
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f51c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f53c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f508
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f534
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f520
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f534
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f508
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f53c
        rtw89_8851be 0000:02:00.0: [ERR]fw PC = 0xb892f524
        rtw89_8851be 0000:02:00.0: failed to setup chip information
        rtw89_8851be: probe of 0000:02:00.0 failed with error -16
    
    Signed-off-by: Ping-Ke Shih <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: rtw89: avoid reading out of bounds when loading TX power FW elements [+ + +]

Author: Zong-Zhe Yang <[email protected]>
Date:   Mon Sep 2 09:58:03 2024 +0800

    wifi: rtw89: avoid reading out of bounds when loading TX power FW elements
    
    [ Upstream commit ed2e4bb17a4884cf29c3347353d8aabb7265b46c ]
    
    Because the loop-expression runs once more before the cond-expression
    evaluates to false, the original code copied one entry beyond the
    valid region.
    
    Fix it by moving the entry copy to loop-body.
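
    The loop shape can be sketched as follows. The entry size, buffer
    layout and function name are assumptions for illustration and do not
    reproduce the driver's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENT_SZ 4  /* assumed entry size for illustration */

/*
 * Sum the first byte of every entry in buf. The copy sits in the loop
 * body, so it runs only after the condition has confirmed that a whole
 * entry lies inside [buf, buf + len).
 */
static inline unsigned int sum_entries(const uint8_t *buf, size_t len)
{
	const uint8_t *cursor, *end = buf + len;
	uint8_t entry[ENT_SZ];
	unsigned int sum = 0;

	for (cursor = buf; (size_t)(end - cursor) >= ENT_SZ; cursor += ENT_SZ) {
		memcpy(entry, cursor, ENT_SZ);
		sum += entry[0];
	}
	return sum;
}

/*
 * Buggy shape, shown only as a comment because it reads out of bounds:
 *
 *   for (cursor = buf, memcpy(entry, cursor, ENT_SZ);
 *        cursor < end;
 *        cursor += ENT_SZ, memcpy(entry, cursor, ENT_SZ))
 *
 * After the last valid entry the loop-expression advances cursor to end
 * and copies ENT_SZ bytes from there before the condition is re-checked.
 */
```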
    
    Signed-off-by: Zong-Zhe Yang <[email protected]>
    Signed-off-by: Ping-Ke Shih <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: rtw89: avoid to add interface to list twice when SER [+ + +]

Author: Chih-Kang Chang <[email protected]>
Date:   Wed Jul 31 15:05:04 2024 +0800

    wifi: rtw89: avoid to add interface to list twice when SER
    
    [ Upstream commit 7dd5d2514a8ea58f12096e888b0bd050d7eae20a ]
    
    If SER L2 occurs during the WoWLAN resume flow, the add interface flow
    is triggered by ieee80211_reconfig(). However, because
    rtw89_wow_resume() returns failure, the add interface flow is
    executed again, resulting in a double list add and causing a kernel
    panic. Therefore, add a check to prevent adding the interface to the
    list twice.
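
    The guard can be sketched with a minimal list. This is a hypothetical
    illustration loosely modelled on a circular doubly linked list; the
    helper names are not the kernel's list API or the driver's functions.

```c
#include <stddef.h>

/* Minimal circular doubly linked list node. */
struct node {
	struct node *prev, *next;
};

static inline void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Returns 1 if entry is already linked into the list at head. */
static inline int list_contains(const struct node *head,
				const struct node *entry)
{
	const struct node *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == entry)
			return 1;
	return 0;
}

/* Append entry unless it is already on the list; returns 1 if added. */
static inline int list_add_tail_once(struct node *head, struct node *entry)
{
	if (list_contains(head, entry))
		return 0;

	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
	return 1;
}
```

    A second call with the same entry leaves the list unchanged instead
    of linking the node to itself.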
    
    list_add double add: new=ffff99d6992e2010, prev=ffff99d6992e2010, next=ffff99d695302628.
    ------------[ cut here ]------------
    kernel BUG at lib/list_debug.c:37!
    invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W  O       6.6.30-02659-gc18865c4dfbd #1 770df2933251a0e3c888ba69d1053a817a6376a7
    Hardware name: HP Grunt/Grunt, BIOS Google_Grunt.11031.169.0 06/24/2021
    Workqueue: events_freezable ieee80211_restart_work [mac80211]
    RIP: 0010:__list_add_valid_or_report+0x5e/0xb0
    Code: c7 74 18 48 39 ce 74 13 b0 01 59 5a 5e 5f 41 58 41 59 41 5a 5d e9 e2 d6 03 00 cc 48 c7 c7 8d 4f 17 83 48 89 c2 e8 02 c0 00 00 <0f> 0b 48 c7 c7 aa 8c 1c 83 e8 f4 bf 00 00 0f 0b 48 c7 c7 c8 bc 12
    RSP: 0018:ffffa91b8007bc50 EFLAGS: 00010246
    RAX: 0000000000000058 RBX: ffff99d6992e0900 RCX: a014d76c70ef3900
    RDX: ffffa91b8007bae8 RSI: 00000000ffffdfff RDI: 0000000000000001
    RBP: ffffa91b8007bc88 R08: 0000000000000000 R09: ffffa91b8007bae0
    R10: 00000000ffffdfff R11: ffffffff83a79800 R12: ffff99d695302060
    R13: ffff99d695300900 R14: ffff99d6992e1be0 R15: ffff99d6992e2010
    FS:  0000000000000000(0000) GS:ffff99d6aac00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000078fbdba43480 CR3: 000000010e464000 CR4: 00000000001506f0
    Call Trace:
     <TASK>
     ? __die_body+0x1f/0x70
     ? die+0x3d/0x60
     ? do_trap+0xa4/0x110
     ? __list_add_valid_or_report+0x5e/0xb0
     ? do_error_trap+0x6d/0x90
     ? __list_add_valid_or_report+0x5e/0xb0
     ? handle_invalid_op+0x30/0x40
     ? __list_add_valid_or_report+0x5e/0xb0
     ? exc_invalid_op+0x3c/0x50
     ? asm_exc_invalid_op+0x16/0x20
     ? __list_add_valid_or_report+0x5e/0xb0
     rtw89_ops_add_interface+0x309/0x310 [rtw89_core 7c32b1ee6854761c0321027c8a58c5160e41f48f]
     drv_add_interface+0x5c/0x130 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     ieee80211_reconfig+0x241/0x13d0 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     ? finish_wait+0x3e/0x90
     ? synchronize_rcu_expedited+0x174/0x260
     ? sync_rcu_exp_done_unlocked+0x50/0x50
     ? wake_bit_function+0x40/0x40
     ieee80211_restart_work+0xf0/0x140 [mac80211 83e989e6e616bd5b4b8a2b0a9f9352a2c385a3bc]
     process_scheduled_works+0x1e5/0x480
     worker_thread+0xea/0x1e0
     kthread+0xdb/0x110
     ? move_linked_works+0x90/0x90
     ? kthread_associate_blkcg+0xa0/0xa0
     ret_from_fork+0x3b/0x50
     ? kthread_associate_blkcg+0xa0/0xa0
     ret_from_fork_asm+0x11/0x20
     </TASK>
    Modules linked in: dm_integrity async_xor xor async_tx lz4 lz4_compress zstd zstd_compress zram zsmalloc rfcomm cmac uinput algif_hash algif_skcipher af_alg btusb btrtl iio_trig_hrtimer industrialio_sw_trigger btmtk industrialio_configfs btbcm btintel uvcvideo videobuf2_vmalloc iio_trig_sysfs videobuf2_memops videobuf2_v4l2 videobuf2_common uvc snd_hda_codec_hdmi veth snd_hda_intel snd_intel_dspcfg acpi_als snd_hda_codec industrialio_triggered_buffer kfifo_buf snd_hwdep industrialio i2c_piix4 snd_hda_core designware_i2s ip6table_nat snd_soc_max98357a xt_MASQUERADE xt_cgroup snd_soc_acp_rt5682_mach fuse rtw89_8922ae(O) rtw89_8922a(O) rtw89_pci(O) rtw89_core(O) 8021q mac80211(O) bluetooth ecdh_generic ecc cfg80211 r8152 mii joydev
    gsmi: Log Shutdown Reason 0x03
    ---[ end trace 0000000000000000 ]---
    
    Signed-off-by: Chih-Kang Chang <[email protected]>
    Signed-off-by: Ping-Ke Shih <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: rtw89: correct base HT rate mask for firmware [+ + +]

Author: Ping-Ke Shih <[email protected]>
Date:   Fri Aug 9 15:20:10 2024 +0800

    wifi: rtw89: correct base HT rate mask for firmware
    
    [ Upstream commit 45742881f9eee2a4daeb6008e648a460dd3742cd ]
    
    Coverity reported that the u8 rx_mask << 24 is promoted to a signed
    32-bit value, which is sign-extended when cast to an unsigned 64-bit
    value. For example, assigning 0x80000000 (signed 32 bits) to a u64
    variable yields 0xFFFFFFFF_80000000.
    
    The real case we meet is:
      rx_mask[0...3] = ff ff 00 00
      ra_mask = 0xffffffff_ff0ff000
    
    After this fix:
      rx_mask[0...3] = ff ff 00 00
      ra_mask = 0x00000000_ff0ff000
    
    Fortunately, the driver afterward bitwise-ANDs the incorrect ra_mask
    with the supported rates (1ss and 2ss rates only), so the final rate
    mask of the original code is still correct.
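
    The sign extension can be reproduced in a short sketch. The function
    names are hypothetical. The explicit int32_t cast stands in for the
    promoted result of the shift, so that the sketch itself never shifts
    a signed value into the sign bit.

```c
#include <stdint.h>

/*
 * Emulates "rx_mask << 24" with a u8 operand: the operand is promoted
 * to a signed 32-bit int, so a set top bit makes the result negative,
 * and widening a negative int to u64 sign-extends it. The conversion
 * to int32_t is implementation-defined before C23; GCC and Clang wrap.
 */
static inline uint64_t ht_mask_sign_extended(uint8_t rx_mask)
{
	int32_t promoted = (int32_t)((uint32_t)rx_mask << 24);

	return (uint64_t)promoted;
}

/* Widen to 64 bits before shifting, so no sign bit is involved. */
static inline uint64_t ht_mask_widened(uint8_t rx_mask)
{
	return (uint64_t)rx_mask << 24;
}
```

    With rx_mask bytes ff ff, OR-ing in the second term (0xff << 12)
    gives 0xffffffff_ff0ff000 for the first variant and
    0x00000000_ff0ff000 for the second, matching the values quoted above.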
    
    Addresses-Coverity-ID: 1504762 ("Unintended sign extension")
    
    Signed-off-by: Ping-Ke Shih <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

wifi: wilc1000: Do not operate uninitialized hardware during suspend/resume [+ + +]

Author: Marek Vasut <[email protected]>
Date:   Wed Aug 21 20:36:03 2024 +0200

    wifi: wilc1000: Do not operate uninitialized hardware during suspend/resume
    
    [ Upstream commit b0dc7018477e8fbb7e40c908c29cf663d06b17a7 ]
    
    In case the hardware is not initialized, do not operate it during the
    suspend/resume cycle; the hardware is already off, so there is no
    reason to access it.
    
    In fact, wilc_sdio_enable_interrupt() in the resume callback does
    interfere with the same call when initializing the hardware after
    resume and makes such initialization after resume fail. Fix this
    by not operating uninitialized hardware during suspend/resume.
    
    Signed-off-by: Marek Vasut <[email protected]>
    Reviewed-by: Alexis Lothoré <[email protected]>
    Signed-off-by: Kalle Valo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/apic: Remove logical destination mode for 64-bit [+ + +]

Author: Thomas Gleixner <[email protected]>
Date:   Sun Jul 28 13:06:10 2024 +0200

    x86/apic: Remove logical destination mode for 64-bit
    
    [ Upstream commit 838ba7733e4e3a94a928e8d0a058de1811a58621 ]
    
    Logical destination mode of the local APIC is used for systems with up to
    8 CPUs. It has an advantage over physical destination mode as it allows
    targeting multiple CPUs at once with IPIs.
    
    That advantage was definitely worth it when systems with up to 8 CPUs
    were state of the art for servers and workstations, but that's history.
    
    Aside from that, there are systems which fail to work with logical
    destination mode, as the ACPI/DMI quirks show, and there are AMD Zen1
    systems out there which fail when interrupt remapping is enabled, as
    reported by Rob and Christian. The latter problem can be cured by
    firmware updates, but not all OEMs distribute the required changes.
    
    Physical destination mode is guaranteed to work because it is the only way
    to get a CPU up and running via the INIT/INIT/STARTUP sequence.
    
    As the number of CPUs keeps increasing, logical destination mode becomes a
    less used code path so there is no real good reason to keep it around.
    
    Therefore remove logical destination mode support for 64-bit and default to
    physical destination mode.
    
    Reported-by: Rob Newcater <[email protected]>
    Reported-by: Christian Heusel <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Borislav Petkov (AMD) <[email protected]>
    Tested-by: Rob Newcater <[email protected]>
    Link: https://lore.kernel.org/all/877cd5u671.ffs@tglx
    Signed-off-by: Sasha Levin <[email protected]>

x86/bugs: Add missing NO_SSB flag [+ + +]

Author: Daniel Sneddon <[email protected]>
Date:   Thu Aug 29 12:24:37 2024 -0700

    x86/bugs: Add missing NO_SSB flag
    
    [ Upstream commit 23e12b54acf621f4f03381dca91cc5f1334f21fd ]
    
    The Moorefield and Lightning Mountain Atom processors are
    missing the NO_SSB flag in the vulnerabilities whitelist.
    This will cause unaffected parts to incorrectly be reported
    as vulnerable. Add the missing flag.
    
    These parts are currently out of service and were verified
    internally with archived documentation that they need the
    NO_SSB flag.
    
    Closes: https://lore.kernel.org/lkml/CAEJ9NQdhh+4GxrtG1DuYgqYhvc0hi-sKZh-2niukJ-MyFLntAA@mail.gmail.com/
    Reported-by: Shanavas.K.S <[email protected]>
    Signed-off-by: Daniel Sneddon <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/bugs: Fix handling when SRSO mitigation is disabled [+ + +]

Author: David Kaplan <[email protected]>
Date:   Wed Sep 4 10:07:11 2024 -0500

    x86/bugs: Fix handling when SRSO mitigation is disabled
    
    [ Upstream commit 1dbb6b1495d472806fef1f4c94f5b3e4c89a3c1d ]
    
    When the SRSO mitigation is disabled, either via mitigations=off or
    spec_rstack_overflow=off, the warning about the lack of IBPB-enhancing
    microcode is printed anyway.
    
    This is unnecessary since the user has turned off the mitigation.
    
      [ bp: Massage, drop SBPB rationale as it doesn't matter because when
        mitigations are disabled x86_pred_cmd is not being used anyway. ]
    
    Signed-off-by: David Kaplan <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Acked-by: Josh Poimboeuf <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/ioapic: Handle allocation failures gracefully [+ + +]

Author: Thomas Gleixner <[email protected]>
Date:   Fri Aug 2 18:15:34 2024 +0200

    x86/ioapic: Handle allocation failures gracefully
    
    [ Upstream commit 830802a0fea8fb39d3dc9fb7d6b5581e1343eb1f ]
    
    Breno observed panics when using failslab under certain conditions during
    runtime:
    
       can not alloc irq_pin_list (-1,0,20)
       Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
    
       panic+0x4e9/0x590
       mp_irqdomain_alloc+0x9ab/0xa80
       irq_domain_alloc_irqs_locked+0x25d/0x8d0
       __irq_domain_alloc_irqs+0x80/0x110
       mp_map_pin_to_irq+0x645/0x890
       acpi_register_gsi_ioapic+0xe6/0x150
       hpet_open+0x313/0x480
    
    That's a pointless panic which is a leftover of the historic IO/APIC code
    which panic'ed during early boot when the interrupt allocation failed.
    
    The only place which might justify panic is the PIT/HPET timer_check() code
    which tries to figure out whether the timer interrupt is delivered through
    the IO/APIC. But that code does not need to handle interrupt allocation
    failures. If the interrupt cannot be allocated then timer delivery fails
    and it either panics due to that or falls back to legacy mode.
    
    Cure this by removing the panic wrapper around __add_pin_to_irq_node() and
    making mp_irqdomain_alloc() aware of the failure condition and handle it as
    any other failure in this function gracefully.
    
    Reported-by: Breno Leitao <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Breno Leitao <[email protected]>
    Tested-by: Qiuxu Zhuo <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/kexec: Add EFI config table identity mapping for kexec kernel [+ + +]

Author: Tao Liu <[email protected]>
Date:   Wed Jul 17 16:31:20 2024 -0500

    x86/kexec: Add EFI config table identity mapping for kexec kernel
    
    [ Upstream commit 5760929f6545c651682de3c2c6c6786816b17bb1 ]
    
    A kexec kernel boot failure is sometimes observed on AMD CPUs due to an
    unmapped EFI config table array.  This can be seen when "nogbpages" is on
    the kernel command line, and has been observed as a full BIOS reboot rather
    than a successful kexec.
    
    This was also the cause of reported regressions attributed to Commit
    7143c5f4cf20 ("x86/mm/ident_map: Use gbpages only where full GB page should
    be mapped.") which was subsequently reverted.
    
    To avoid this page fault, explicitly include the EFI config table array in
    the kexec identity map.
    
    Further explanation:
    
    The following 2 commits caused the EFI config table array to be
    accessed when enabling sev at kernel startup.
    
        commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
                              earlier during boot")
        commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
                              detection/setup")
    
    This is in the code that examines whether SEV should be enabled or not, so
    it can even affect systems that are not SEV capable.
    
    This may result in a page fault if the EFI config table array's address is
    unmapped. Since the page fault occurs before the new kernel establishes its
    own identity map and page fault routines, it is unrecoverable and kexec
    fails.
    
    Most often, this problem is not seen because the EFI config table array
    gets included in the map by the luck of being placed at a memory address
    close enough to other memory areas that *are* included in the map created
    by kexec.
    
    Both the "nogbpages" command line option and the "use gbpages only where
    full GB page should be mapped" change greatly reduce the chance of being
    included in the map by luck, which is why the problem appears.
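
    The "by luck" effect described above can be illustrated with a minimal
    userspace sketch. It assumes a mapping request is simply rounded out to
    the page size in use; covered_by_luck() and its parameters are
    hypothetical names for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_2M (UINT64_C(1) << 21)
#define SKETCH_SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical model: a request [req_start, req_end) is mapped with pages
 * of 'page_size' bytes, so the mapped range is the request rounded out to
 * page boundaries. Returns 1 if 'addr' falls inside that rounded range.
 */
static int covered_by_luck(uint64_t addr, uint64_t req_start,
                           uint64_t req_end, uint64_t page_size)
{
    uint64_t map_start, map_end;

    /* page_size must be a non-zero power of two */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    map_start = req_start & ~(page_size - 1);
    map_end = (req_end + page_size - 1) & ~(page_size - 1);

    return addr >= map_start && addr < map_end;
}
```

    With a 2 MiB request just above the 1 GiB boundary and a table 512 MiB
    further up, the table is covered when the request is rounded out to a
    1 GiB page but not when it is rounded out to 2 MiB pages, which is why
    an explicit mapping of the EFI config table array is needed.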
    
    Signed-off-by: Tao Liu <[email protected]>
    Signed-off-by: Steve Wahl <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Pavin Joseph <[email protected]>
    Tested-by: Sarah Brofeldt <[email protected]>
    Tested-by: Eric Hagberg <[email protected]>
    Reviewed-by: Ard Biesheuvel <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/mm/ident_map: Use gbpages only where full GB page should be mapped. [+ + +]

Author: Steve Wahl <[email protected]>
Date:   Wed Jul 17 16:31:21 2024 -0500

    x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
    
    [ Upstream commit cc31744a294584a36bf764a0ffa3255a8e69f036 ]
    
    When ident_pud_init() uses only GB pages to create identity maps, large
    ranges of addresses not actually requested can be included in the resulting
    table; a 4K request will map a full GB.  This can include a lot of extra
    address space past that requested, including areas marked reserved by the
    BIOS.  That allows processor speculation into reserved regions, which on
    UV systems can cause system halts.
    
    Only use GB pages when map creation requests include the full GB page of
    space.  Fall back to using smaller 2M pages when only portions of a GB page
    are included in the request.
    
    No attempt is made to coalesce mapping requests. If a request requires a
    map entry at the 2M (pmd) level, subsequent mapping requests within the
    same 1G region will also be at the pmd level, even if adjacent or
    overlapping such requests could have been combined to map a full GB page.
    Existing usage starts with larger regions and then adds smaller regions, so
    this should not have any great consequence.
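
    The selection rule can be sketched in userspace C as follows. This is an
    illustration under stated assumptions, not the ident_pud_init() code:
    choose_level(), enum map_level and the gbpages_ok flag are hypothetical,
    and the sketch does not model the "no coalescing" behaviour for a 1G
    region that already has a pmd-level entry.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_1G (UINT64_C(1) << 30)

enum map_level { MAP_PMD_2M, MAP_PUD_1G };

/*
 * Hypothetical helper: pick the mapping level for the 1G-aligned slot that
 * contains 'addr', given the request [req_start, req_end). A GB page is
 * used only when the request covers the whole slot.
 */
static enum map_level choose_level(uint64_t addr, uint64_t req_start,
                                   uint64_t req_end, int gbpages_ok)
{
    uint64_t slot_start = addr & ~(SKETCH_SZ_1G - 1);
    uint64_t slot_end = slot_start + SKETCH_SZ_1G;

    assert(req_start < req_end);

    if (gbpages_ok && req_start <= slot_start && req_end >= slot_end)
        return MAP_PUD_1G;

    /* Partial coverage: fall back to 2M pages for this slot. */
    return MAP_PMD_2M;
}
```

    Under these assumptions a 4K request yields MAP_PMD_2M, while a request
    spanning a complete, aligned 1G range yields MAP_PUD_1G.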
    
    Signed-off-by: Steve Wahl <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Tested-by: Pavin Joseph <[email protected]>
    Tested-by: Sarah Brofeldt <[email protected]>
    Tested-by: Eric Hagberg <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/pkeys: Add PKRU as a parameter in signal handling functions [+ + +]

Author: Aruna Ramakrishna <[email protected]>
Date:   Fri Aug 2 06:13:14 2024 +0000

    x86/pkeys: Add PKRU as a parameter in signal handling functions
    
    [ Upstream commit 24cf2bc982ffe02aeffb4a3885c71751a2c7023b ]
    
    Assume there's a multithreaded application that runs untrusted user
    code. Each thread has its stack/code protected by a non-zero PKEY, and the
    PKRU register is set up such that only that particular non-zero PKEY is
    enabled. Each thread also sets up an alternate signal stack to handle
    signals, which is protected by PKEY zero. The PKEYs man page documents that
    the PKRU will be reset to init_pkru when the signal handler is invoked,
    which means that PKEY zero access will be enabled.  But this reset happens
    after the kernel attempts to push fpu state to the alternate stack, which
    is not (yet) accessible by the kernel, which leads to a new SIGSEGV being
    sent to the application, terminating it.
    
    Enabling both the non-zero PKEY (for the thread) and PKEY zero in
    userspace will not work for this use case. It cannot have the alt stack
    writeable by all - the rationale here is that the code running in that
    thread (using a non-zero PKEY) is untrusted and should not have access
    to the alternate signal stack (that uses PKEY zero), to prevent the
    return address of a function from being changed. The expectation is that
    the kernel should be able to set up the alternate signal stack and deliver
    the signal to the application even if PKEY zero is explicitly disabled
    by the application. The signal handler accessibility should not be
    dictated by whatever PKRU value the thread sets up.
    
    The PKRU register is managed by XSAVE, which means the sigframe contents
    must match the register contents - which is not the case here. It's
    required that the signal frame contains the user-defined PKRU value (so
    that it is restored correctly from sigcontext) but the actual register must
    be reset to init_pkru so that the alt stack is accessible and the signal
    can be delivered to the application. It seems that the proper fix here
    would be to remove PKRU from the XSAVE framework and manage it separately,
    which is quite complicated. As a workaround, do this:
    
            orig_pkru = rdpkru();
            wrpkru(orig_pkru & init_pkru_value);
            xsave_to_user_sigframe();
            put_user(orig_pkru, pkru_sigframe_addr);
    
    In preparation for writing PKRU to sigframe, pass PKRU as an additional
    parameter down the call chain from get_sigframe().
    
    No functional change.
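
    The ordering of the workaround can be modelled in userspace as a rough
    sketch. Everything here is hypothetical: the PKRU register is a plain
    variable, SKETCH_INIT_PKRU is an assumed init_pkru value in which only
    PKEY zero is accessible, and sketch_deliver() only mirrors the sequence
    of steps, not the real signal delivery path.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed init value: access disabled for PKEYs 1..15, PKEY 0 open. */
#define SKETCH_INIT_PKRU UINT32_C(0x55555554)

struct sketch_sigframe {
    uint32_t saved_pkru;   /* value later restored by sigreturn */
    int written;           /* 1 once the frame has been pushed */
};

static uint32_t sketch_pkru;   /* stands in for the PKRU register */

/* PKRU holds two bits per key: access-disable and write-disable. */
static int sketch_can_write(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return ((pkru >> (2 * pkey)) & 0x3u) == 0;
}

/* Returns 0 on success, -1 where the real kernel would raise SIGSEGV. */
static int sketch_deliver(struct sketch_sigframe *frame, int altstack_pkey)
{
    uint32_t orig_pkru = sketch_pkru;

    /* Widen access first so the alternate stack becomes writable. */
    sketch_pkru = orig_pkru & SKETCH_INIT_PKRU;

    if (!sketch_can_write(sketch_pkru, altstack_pkey))
        return -1;

    /* Push the frame, then record the user's original PKRU in it. */
    frame->written = 1;
    frame->saved_pkru = orig_pkru;
    return 0;
}
```

    In this model a thread that enables only PKEY 1 cannot write to a
    PKEY-zero stack beforehand, yet the frame is still pushed and carries
    the thread's original PKRU value.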
    
    Signed-off-by: Aruna Ramakrishna <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/pkeys: Restore altstack access in sigreturn() [+ + +]

Author: Aruna Ramakrishna <[email protected]>
Date:   Fri Aug 2 06:13:17 2024 +0000

    x86/pkeys: Restore altstack access in sigreturn()
    
    [ Upstream commit d10b554919d4cc8fa8fe2e95b57ad2624728c8e4 ]
    
    A process can disable access to the alternate signal stack by not
    enabling the altstack's PKEY in the PKRU register.
    
    Nevertheless, the kernel updates the PKRU temporarily for signal
    handling. However, in sigreturn(), restore_sigcontext() will restore the
    PKRU to the user-defined PKRU value.
    
    This will cause restore_altstack() to fail with a SIGSEGV as it needs read
    access to the altstack which is prohibited by the user-defined PKRU value.
    
    Fix this by restoring altstack before restoring PKRU.
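
    A small userspace model of the ordering problem, under the assumption
    that restoring the altstack only needs read access to it. The sr_*
    names are hypothetical, the PKRU register is a plain variable, and a
    return value of -1 stands in for the SIGSEGV.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t sr_pkru;   /* stands in for the PKRU register */

/* Bit 2*pkey of PKRU is the access-disable bit for that key. */
static int sr_can_read(int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return ((sr_pkru >> (2 * pkey)) & 0x1u) == 0;
}

static int sr_restore_altstack(int altstack_pkey)
{
    return sr_can_read(altstack_pkey) ? 0 : -1;
}

/*
 * altstack_first != 0 models the fixed order: read the altstack while the
 * widened PKRU is still in effect, then restore the user's PKRU value.
 */
static int sr_sigreturn(uint32_t user_pkru, int altstack_pkey,
                        int altstack_first)
{
    int ret;

    if (altstack_first) {
        ret = sr_restore_altstack(altstack_pkey);
        sr_pkru = user_pkru;
        return ret;
    }

    sr_pkru = user_pkru;
    return sr_restore_altstack(altstack_pkey);
}
```

    With a user PKRU value that disables the altstack's key, the old order
    fails in this model while the new order succeeds and still ends with
    the user's PKRU value in place.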
    
    Signed-off-by: Aruna Ramakrishna <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments() [+ + +]

Author: Kees Cook <[email protected]>
Date:   Mon Jul 8 13:22:06 2024 -0700

    x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments()
    
    [ Upstream commit d19d638b1e6cf746263ef60b7d0dee0204d8216a ]
    
    Modern (fortified) memcpy() prefers to avoid writing (or reading) beyond
    the end of the addressed destination (or source) struct member:
    
    In function ‘fortify_memcpy_chk’,
        inlined from ‘syscall_get_arguments’ at ./arch/x86/include/asm/syscall.h:85:2,
        inlined from ‘populate_seccomp_data’ at kernel/seccomp.c:258:2,
        inlined from ‘__seccomp_filter’ at kernel/seccomp.c:1231:3:
    ./include/linux/fortify-string.h:580:25: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
      580 |                         __read_overflow2_field(q_size_field, size);
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    As already done for x86_64 and compat mode, do not use memcpy() to
    extract syscall arguments from struct pt_regs but rather just perform
    direct assignments. Binary output differences are negligible, and the
    result actually ends up using less stack space:
    
    -       sub    $0x84,%esp
    +       sub    $0x6c,%esp
    
    and less text size:
    
       text    data     bss     dec     hex filename
      10794     252       0   11046    2b26 gcc-32b/kernel/seccomp.o.stock
      10714     252       0   10966    2ad6 gcc-32b/kernel/seccomp.o.after
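
    The direct-assignment approach can be sketched as below. struct
    sketch_regs is a hypothetical stand-in for the relevant part of struct
    pt_regs, and the bx, cx, dx, si, di, bp order is the assumed ia32
    argument register order. Each source is a single named member, so no
    copy ever spans more than one field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the argument registers in struct pt_regs. */
struct sketch_regs {
    unsigned long bx, cx, dx, si, di, bp, ax;
};

/* Copy the six syscall arguments one member at a time. */
static void sketch_get_arguments(const struct sketch_regs *regs,
                                 unsigned long args[6])
{
    assert(regs != NULL && args != NULL);

    args[0] = regs->bx;
    args[1] = regs->cx;
    args[2] = regs->dx;
    args[3] = regs->si;
    args[4] = regs->di;
    args[5] = regs->bp;
}
```

    A single memcpy() starting at one member and running over its
    neighbours would produce the same values, but it reads past the end of
    the addressed field, which is what the fortified check rejects.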
    
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Reported-by: Mirsad Todorovac <[email protected]>
    Signed-off-by: Kees Cook <[email protected]>
    Signed-off-by: Dave Hansen <[email protected]>
    Reviewed-by: Gustavo A. R. Silva <[email protected]>
    Acked-by: Dave Hansen <[email protected]>
    Tested-by: Mirsad Todorovac <[email protected]>
    Link: https://lore.kernel.org/all/20240708202202.work.477-kees%40kernel.org
    Signed-off-by: Sasha Levin <[email protected]>