Changelog in Linux kernel 6.10.14

 
accel/ivpu: Add missing MODULE_FIRMWARE metadata [+ + +]
Author: Alexander F. Lent <[email protected]>
Date:   Tue Jul 9 07:54:14 2024 -0400

    accel/ivpu: Add missing MODULE_FIRMWARE metadata
    
    [ Upstream commit 58b5618ba80a5e5a8d531a70eae12070e5bd713f ]
    
    Modules that load firmware from various paths at runtime must declare
    those paths at compile time, via the MODULE_FIRMWARE macro, so that the
    firmware paths are included in the module's metadata.
    
    The accel/ivpu driver loads firmware but lacks this metadata,
    preventing dracut from correctly locating firmware files. Fix it.
    
    Fixes: 9ab43e95f922 ("accel/ivpu: Switch to generation based FW names")
    Fixes: 02d5b0aacd05 ("accel/ivpu: Implement firmware parsing and booting")
    Signed-off-by: Alexander F. Lent <[email protected]>
    Reviewed-by: Jacek Lawrynowicz <[email protected]>
    Signed-off-by: Jacek Lawrynowicz <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240709-fix-ivpu-firmware-metadata-v3-1-55f70bba055b@xanderlent.com
    Signed-off-by: Sasha Levin <[email protected]>

 
ACPI: battery: Fix possible crash when unregistering a battery hook [+ + +]
Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:34 2024 +0200

    ACPI: battery: Fix possible crash when unregistering a battery hook
    
    [ Upstream commit 76959aff14a0012ad6b984ec7686d163deccdc16 ]
    
    When a battery hook returns an error when adding a new battery, then
    the battery hook is automatically unregistered.
    However the battery hook provider cannot know that, so it will later
    call battery_hook_unregister() on the already unregistered battery
    hook, resulting in a crash.
    
    Fix this by using the list head to mark already unregistered battery
    hooks as already being unregistered so that they can be ignored by
    battery_hook_unregister().
    
    Fixes: fa93854f7a7e ("battery: Add the battery hooking API")
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: battery: Simplify battery hook locking [+ + +]
Author: Armin Wolf <[email protected]>
Date:   Tue Oct 1 23:28:33 2024 +0200

    ACPI: battery: Simplify battery hook locking
    
    [ Upstream commit 86309cbed26139e1caae7629dcca1027d9a28e75 ]
    
    Move the conditional locking from __battery_hook_unregister()
    into battery_hook_unregister() and rename the low-level function
    to simplify the locking during battery hook removal.
    
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Reviewed-by: Pali Rohár <[email protected]>
    Signed-off-by: Armin Wolf <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Stable-dep-of: 76959aff14a0 ("ACPI: battery: Fix possible crash when unregistering a battery hook")
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: CPPC: Add support for setting EPP register in FFH [+ + +]
Author: Mario Limonciello <[email protected]>
Date:   Mon Sep 9 22:15:24 2024 -0500

    ACPI: CPPC: Add support for setting EPP register in FFH
    
    [ Upstream commit aaf21ac93909e08a12931173336bdb52ac8499f1 ]
    
    Some Asus AMD systems are reported to not be able to change EPP values
    because the BIOS doesn't advertise support for the CPPC MSR and the PCC
    region is not configured.
    
    However the ACPI 6.2 specification allows CPC registers to be declared
    in FFH:
    ```
    Starting with ACPI Specification 6.2, all _CPC registers can be in
    PCC, System Memory, System IO, or Functional Fixed Hardware address
    spaces. OSPM support for this more flexible register space scheme
    is indicated by the “Flexible Address Space for CPPC Registers” _OSC
    bit.
    ```
    
    If this _OSC has been set allow using FFH to configure EPP.
    
    Reported-by: [email protected]
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
    Suggested-by: [email protected]
    Tested-by: [email protected]
    Tested-by: [email protected]
    Signed-off-by: Mario Limonciello <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: EC: Do not release locks during operation region accesses [+ + +]
Author: Rafael J. Wysocki <[email protected]>
Date:   Thu Jul 4 18:26:54 2024 +0200

    ACPI: EC: Do not release locks during operation region accesses
    
    [ Upstream commit dc171114926ec390ab90f46534545420ec03e458 ]
    
    It is not particularly useful to release locks (the EC mutex and the
    ACPI global lock, if present) and re-acquire them immediately thereafter
    during EC address space accesses in acpi_ec_space_handler().
    
    First, releasing them for a while before grabbing them again does not
    really help anyone because there may not be enough time for another
    thread to acquire them.
    
    Second, if another thread successfully acquires them and carries out
    a new EC write or read in the middle if an operation region access in
    progress, it may confuse the EC firmware, especially after the burst
    mode has been enabled.
    
    Finally, manipulating the locks after writing or reading every single
    byte of data is overhead that it is better to avoid.
    
    Accordingly, modify the code to carry out EC address space accesses
    entirely without releasing the locks.
    
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: PAD: fix crash in exit_round_robin() [+ + +]
Author: Seiji Nishikawa <[email protected]>
Date:   Sun Aug 25 23:13:52 2024 +0900

    ACPI: PAD: fix crash in exit_round_robin()
    
    [ Upstream commit 0a2ed70a549e61c5181bad5db418d223b68ae932 ]
    
    The kernel occasionally crashes in cpumask_clear_cpu(), which is called
    within exit_round_robin(), because when executing clear_bit(nr, addr) with
    nr set to 0xffffffff, the address calculation may cause misalignment within
    the memory, leading to access to an invalid memory address.
    
    ----------
    BUG: unable to handle kernel paging request at ffffffffe0740618
            ...
    CPU: 3 PID: 2919323 Comm: acpi_pad/14 Kdump: loaded Tainted: G           OE  X --------- -  - 4.18.0-425.19.2.el8_7.x86_64 #1
            ...
    RIP: 0010:power_saving_thread+0x313/0x411 [acpi_pad]
    Code: 89 cd 48 89 d3 eb d1 48 c7 c7 55 70 72 c0 e8 64 86 b0 e4 c6 05 0d a1 02 00 01 e9 bc fd ff ff 45 89 e4 42 8b 04 a5 20 82 72 c0 <f0> 48 0f b3 05 f4 9c 01 00 42 c7 04 a5 20 82 72 c0 ff ff ff ff 31
    RSP: 0018:ff72a5d51fa77ec8 EFLAGS: 00010202
    RAX: 00000000ffffffff RBX: ff462981e5d8cb80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ff46297556959d80 R08: 0000000000000382 R09: ff46297c8d0f38d8
    R10: 0000000000000000 R11: 0000000000000001 R12: 000000000000000e
    R13: 0000000000000000 R14: ffffffffffffffff R15: 000000000000000e
    FS:  0000000000000000(0000) GS:ff46297a800c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffe0740618 CR3: 0000007e20410004 CR4: 0000000000771ee0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     ? acpi_pad_add+0x120/0x120 [acpi_pad]
     kthread+0x10b/0x130
     ? set_kthread_struct+0x50/0x50
     ret_from_fork+0x1f/0x40
            ...
    CR2: ffffffffe0740618
    
    crash> dis -lr ffffffffc0726923
            ...
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 114
    0xffffffffc0726918 <power_saving_thread+776>:   mov    %r12d,%r12d
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./include/linux/cpumask.h: 325
    0xffffffffc072691b <power_saving_thread+779>:   mov    -0x3f8d7de0(,%r12,4),%eax
    /usr/src/debug/kernel-4.18.0-425.19.2.el8_7/linux-4.18.0-425.19.2.el8_7.x86_64/./arch/x86/include/asm/bitops.h: 80
    0xffffffffc0726923 <power_saving_thread+787>:   lock btr %rax,0x19cf4(%rip)        # 0xffffffffc0740620 <pad_busy_cpus_bits>
    
    crash> px tsk_in_cpu[14]
    $66 = 0xffffffff
    
    crash> px 0xffffffffc072692c+0x19cf4
    $99 = 0xffffffffc0740620
    
    crash> sym 0xffffffffc0740620
    ffffffffc0740620 (b) pad_busy_cpus_bits [acpi_pad]
    
    crash> px pad_busy_cpus_bits[0]
    $42 = 0xfffc0
    ----------
    
    To fix this, ensure that tsk_in_cpu[tsk_index] != -1 before calling
    cpumask_clear_cpu() in exit_round_robin(), just as it is done in
    round_robin_cpu().
    
    Signed-off-by: Seiji Nishikawa <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject edit, avoid updates to the same value ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[] [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:06 2024 +0200

    ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
    
    commit 056301e7c7c886f96d799edd36f3406cc30e1822 upstream.
    
    Like other Asus ExpertBook models the B2502CVA has its keybopard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the B2502CVA to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217760
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[] [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:05 2024 +0200

    ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
    
    commit 2f80ce0b78c340e332f04a5801dee5e4ac8cfaeb upstream.
    
    Like other Asus Vivobook models the X1704VAP has its keybopard IRQ (1)
    described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
    which breaks the keyboard.
    
    Add the X1704VAP to the irq1_level_low_skip_override[] quirk table to fix
    this.
    
    Reported-by: Lamome Julien <[email protected]>
    Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078696
    Closes: https://lore.kernel.org/all/[email protected]/
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:04 2024 +0200

    ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
    
    commit 63539defee17bf0cbd8e24078cf103efee9c6633 upstream.
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GA has a DSDT
    describing IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
        $ sudo dmesg | grep DMI:.*BIOS
        [    0.000000] DMI: ASUSTeK COMPUTER INC. Vivobook Go E1404GA_E1404GA/E1404GA, BIOS E1404GA.302 08/23/2023
        $ sudo cp /sys/firmware/acpi/tables/DSDT dsdt.dat
        $ iasl -d dsdt.dat
        $ grep -A 30 PS2K dsdt.dsl | grep IRQ -A 1
                    IRQ (Level, ActiveLow, Exclusive, )
                        {1}
    
    There already is an entry in the irq1_level_low_skip_override[] DMI match
    table for the "E1404GAB", change this to match on "E1404GA" to cover
    the E1404GA model as well (DMI_MATCH() does a substring match).
    
    Reported-by: Paul Menzel <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Remove duplicate Asus E1504GAB IRQ override [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Fri Sep 27 16:16:03 2024 +0200

    ACPI: resource: Remove duplicate Asus E1504GAB IRQ override
    
    commit 65bdebf38e5fac7c56a9e05d3479a707e6dc783c upstream.
    
    Commit d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook
    E1504GA and E1504GAB") does exactly what the subject says, adding DMI
    matches for both the E1504GA and E1504GAB.
    
    But DMI_MATCH() does a substring match, so checking for E1504GA will also
    match E1504GAB.
    
    Drop the unnecessary E1504GAB entry since that is covered already by
    the E1504GA entry.
    
    Fixes: d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook E1504GA and E1504GAB")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB [+ + +]
Author: Tamim Khan <[email protected]>
Date:   Mon Sep 2 21:43:05 2024 -0400

    ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
    
    [ Upstream commit 49e9cc315604972cc14868cb67831e3e8c3f1470 ]
    
    Like other Asus Vivobooks, the Asus Vivobook Go E1404GAB has a DSDT
    that describes IRQ 1 as ActiveLow, while the kernel overrides to Edge_High.
    
    This override prevents the internal keyboard from working.
    
    Fix the problem by adding this laptop to the table that prevents the kernel
    from overriding the IRQ.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219212
    Signed-off-by: Tamim Khan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Wed Sep 18 17:38:49 2024 +0200

    ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
    
    commit ac78288fe062b64e45a479eaae74aaaafcc8ecdd upstream.
    
    Dell All In One (AIO) models released after 2017 may use a backlight
    controller board connected to an UART.
    
    In DSDT this uart port will be defined as:
    
       Name (_HID, "DELL0501")
       Name (_CID, EisaId ("PNP0501")
    
    The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with
    the above _HID + _CID. Loading the dell-uart-backlight driver fails with
    the following errors:
    
    [   18.261353] dell_uart_backlight serial0-0: Timed out waiting for response.
    [   18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version
    [   18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110
    
    Indicating that there is no backlight controller board attached to
    the UART, while the GPU's native backlight control method does work.
    
    Add a quirk to use the GPU's native backlight control method on this model.
    
    Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection")
    Cc: All applicable <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edit ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18 [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Sat Sep 7 14:44:19 2024 +0200

    ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
    
    [ Upstream commit eb7b0f12e13ba99e64e3a690c2166895ed63b437 ]
    
    The Panasonic Toughbook CF-18 advertises both native and vendor backlight
    control interfaces. But only the vendor one actually works.
    
    acpi_video_get_backlight_type() will pick the non working native backlight
    by default, add a quirk to select the working vendor backlight instead.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package() [+ + +]
Author: Pei Xiao <[email protected]>
Date:   Thu Jul 18 14:05:48 2024 +0800

    ACPICA: check null return of ACPI_ALLOCATE_ZEROED() in acpi_db_convert_to_package()
    
    [ Upstream commit a5242874488eba2b9062985bf13743c029821330 ]
    
    ACPICA commit 4d4547cf13cca820ff7e0f859ba83e1a610b9fd0
    
    ACPI_ALLOCATE_ZEROED() may fail, elements might be NULL and will cause
    NULL pointer dereference later.
    
    Link: https://github.com/acpica/acpica/commit/4d4547cf
    Signed-off-by: Pei Xiao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: Fix memory leak if acpi_ps_get_next_field() fails [+ + +]
Author: Armin Wolf <[email protected]>
Date:   Sun Apr 14 21:50:33 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
    
    [ Upstream commit e6169a8ffee8a012badd8c703716e761ce851b15 ]
    
    ACPICA commit 1280045754264841b119a5ede96cd005bc09b5a7
    
    If acpi_ps_get_next_field() fails, the previously created field list
    needs to be properly disposed before returning the status code.
    
    Link: https://github.com/acpica/acpica/commit/12800457
    Signed-off-by: Armin Wolf <[email protected]>
    [ rjw: Rename local variable to avoid compiler confusion ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails [+ + +]
Author: Armin Wolf <[email protected]>
Date:   Wed Apr 3 20:50:11 2024 +0200

    ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
    
    [ Upstream commit 5accb265f7a1b23e52b0ec42313d1e12895552f4 ]
    
    ACPICA commit 2802af722bbde7bf1a7ac68df68e179e2555d361
    
    If acpi_ps_get_next_namepath() fails, the previously allocated
    union acpi_parse_object needs to be freed before returning the
    status code.
    
    The issue was first being reported on the Linux ACPI mailing list:
    
    Link: https://lore.kernel.org/linux-acpi/[email protected]/T/
    Link: https://github.com/acpica/acpica/commit/2802af72
    Signed-off-by: Armin Wolf <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ACPICA: iasl: handle empty connection_node [+ + +]
Author: Aleksandrs Vinarskis <[email protected]>
Date:   Sun Aug 11 23:33:44 2024 +0200

    ACPICA: iasl: handle empty connection_node
    
    [ Upstream commit a0a2459b79414584af6c46dd8c6f866d8f1aa421 ]
    
    ACPICA commit 6c551e2c9487067d4b085333e7fe97e965a11625
    
    Link: https://github.com/acpica/acpica/commit/6c551e2c
    Signed-off-by: Aleksandrs Vinarskis <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
afs: Fix missing wire-up of afs_retry_request() [+ + +]
Author: David Howells <[email protected]>
Date:   Sat Sep 14 21:40:02 2024 +0100

    afs: Fix missing wire-up of afs_retry_request()
    
    [ Upstream commit 2cf36327ee1e47733aba96092d7bd082a4056ff5 ]
    
    afs_retry_request() is supposed to be pointed to by the afs_req_ops netfs
    operations table, but the pointer got lost somewhere.  The function is used
    during writeback to rotate through the authentication keys that were in
    force when the file was modified locally.
    
    Fix this by adding the pointer to the function.
    
    Fixes: 1ecb146f7cd8 ("netfs, afs: Use writeback retry to deal with alternate keys")
    Reported-by: Dr. David Alan Gilbert <[email protected]>
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

afs: Fix the setting of the server responding flag [+ + +]
Author: David Howells <[email protected]>
Date:   Mon Sep 23 16:07:50 2024 +0100

    afs: Fix the setting of the server responding flag
    
    [ Upstream commit ff98751bae40faed1ba9c6a7287e84430f7dec64 ]
    
    In afs_wait_for_operation(), we set transcribe the call responded flag to
    the server record that we used after doing the fileserver iteration loop -
    but it's possible to exit the loop having had a response from the server
    that we've discarded (e.g. it returned an abort or we started receiving
    data, but the call didn't complete).
    
    This means that op->server might be NULL, but we don't check that before
    attempting to set the server flag.
    
    Fixes: 98f9fda2057b ("afs: Fold the afs_addr_cursor struct in")
    Signed-off-by: David Howells <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ALSA: asihpi: Fix potential OOB array access [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:14:42 2024 +0200

    ALSA: asihpi: Fix potential OOB array access
    
    [ Upstream commit 7b986c7430a6bb68d523dac7bfc74cbd5b44ef96 ]
    
    ASIHPI driver stores some values in the static array upon a response
    from the driver, and its index depends on the firmware.  We shouldn't
    trust it blindly.
    
    This patch adds a sanity check of the array index to fit in the array
    size.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: control: Fix leftover snd_power_unref() [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 1 08:42:01 2024 +0200

    ALSA: control: Fix leftover snd_power_unref()
    
    commit fef1ac950c600ba50ef4d65ca03c8dae9be7f9ea upstream.
    
    One snd_power_unref() was forgotten and left at __snd_ctl_elem_info()
    in the previous change for reorganizing the locking order.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://github.com/thesofproject/linux/pull/5127
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: control: Fix power_ref lock order for compat code, too [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 18:31:27 2024 +0200

    ALSA: control: Fix power_ref lock order for compat code, too
    
    [ Upstream commit a1066453b5e49a28523f3ecbbfe4e06c6a29561c ]
    
    In the previous change for swapping the power_ref and controls_rwsem
    lock order, the code path for the compat layer was forgotten.
    This patch covers the remaining code.
    
    Fixes: fcc62b19104a ("ALSA: control: Take power_ref lock primarily")
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: control: Take power_ref lock primarily [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Mon Jul 29 18:06:58 2024 +0200

    ALSA: control: Take power_ref lock primarily
    
    [ Upstream commit fcc62b19104a67b9a2941513771e09389b75bd95 ]
    
    The code path for kcontrol accesses have often nested locks of both
    card's controls_rwsem and power_ref, and applies in that order.
    However, what could take much longer is the latter, power_ref; it
    waits for the power state of the device, and it pretty much depends on
    the user's action.
    
    This patch swaps the locking order of those locks to a more natural
    way, namely, power_ref -> controls_rwsem, in order to shorten the time
    of possible nested locks.  For consistency, power_ref is taken always
    in the top-level caller side (that is, *_user() functions and the
    ioctl handler itself).
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: core: add isascii() check to card ID generator [+ + +]
Author: Jaroslav Kysela <[email protected]>
Date:   Wed Oct 2 21:46:49 2024 +0200

    ALSA: core: add isascii() check to card ID generator
    
    commit d278a9de5e1837edbe57b2f1f95a104ff6c84846 upstream.
    
    The card identifier should contain only safe ASCII characters. The isalnum()
    returns true also for characters for non-ASCII characters.
    
    Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4135
    Link: https://lore.kernel.org/linux-sound/yk3WTvKkwheOon_LzZlJ43PPInz6byYfBzpKkbasww1yzuiMRqn7n6Y8vZcXB-xwFCu_vb8hoNjv7DTNwH5TWjpEuiVsyn9HPCEXqwF4120=@protonmail.com/
    Cc: [email protected]
    Reported-by: Barnabás Pőcze <[email protected]>
    Signed-off-by: Jaroslav Kysela <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: gus: Fix some error handling paths related to get_bpos() usage [+ + +]
Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 21:34:01 2024 +0200

    ALSA: gus: Fix some error handling paths related to get_bpos() usage
    
    [ Upstream commit 9df39a872c462ea07a3767ebd0093c42b2ff78a2 ]
    
    If get_bpos() fails, it is likely that the corresponding error code should
    be returned.
    
    Fixes: a6970bb1dd99 ("ALSA: gus: Convert to the new PCM ops")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/d9ca841edad697154afa97c73a5d7a14919330d9.1727984008.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Fri Oct 4 10:25:58 2024 +0200

    ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
    
    [ Upstream commit b3ebb007060f89d5a45c9b99f06a55e36a1945b5 ]
    
    We received a regression report for System76 Pangolin (pang14) due to
    the recent fix for Tuxedo Sirius devices to support the top speaker.
    The reason was the conflicting PCI SSID, as often seen.
    
    As a workaround, now the codec SSID is checked and the quirk is
    applied conditionally only to Sirius devices.
    
    Fixes: 4178d78cd7a8 ("ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices")
    Reported-by: Christian Heusel <[email protected]>
    Reported-by: Jerry <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Tue Oct 1 14:14:36 2024 +0200

    ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
    
    [ Upstream commit 1c801e7f77445bc56e5e1fec6191fd4503534787 ]
    
    Some time ago, we introduced the obey_preferred_dacs flag for choosing
    the DAC/pin pairs specified by the driver instead of parsing the
    paths.  This works as expected, per se, but there have been a few
    cases where we forgot to set this flag while preferred_dacs table is
    already set up.  It ended up with incorrect wiring and made us
    wondering why it doesn't work.
    
    Basically, when the preferred_dacs table is provided, it means that
    the driver really wants to wire up to follow that.  That is, the
    presence of the preferred_dacs table itself is already a "do-it"
    flag.
    
    In this patch, we simply replace the evaluation of obey_preferred_dacs
    flag with the presence of preferred_dacs table for fixing the
    misbehavior.  Another patch to drop of the obsoleted flag will
    follow.
    
    Fixes: 242d990c158d ("ALSA: hda/generic: Add option to enforce preferred_dacs pairs")
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1219803
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 [+ + +]
Author: Abhishek Tamboli <[email protected]>
Date:   Mon Sep 30 20:23:00 2024 +0530

    ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
    
    commit d75dba49744478c32f6ce1c16b5f391c2d5cef5f upstream.
    
    Add the quirk for HP Pavilion Gaming laptop 15z-ec200 for
    enabling the mute led. The fix apply the ALC285_FIXUP_HP_MUTE_LED
    quirk for this model.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219303
    Signed-off-by: Abhishek Tamboli <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9 [+ + +]
Author: Ai Chao <[email protected]>
Date:   Thu Sep 26 14:02:52 2024 +0800

    ALSA: hda/realtek: Add quirk for Huawei MateBook 13 KLV-WX9
    
    commit dee476950cbd83125655a3f49e00d63b79f6114e upstream.
    
    The headset mic requires a fixup to be properly detected/used.
    
    Signed-off-by: Ai Chao <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8 [+ + +]
Author: Nikolai Afanasenkov <[email protected]>
Date:   Mon Sep 16 13:50:42 2024 -0600

    ALSA: hda/realtek: fix mute/micmute LED for HP mt645 G8
    
    commit cb2deca056d579fe008c8d0a4ceb04d2b368fe42 upstream.
    
    The HP Elite mt645 G8 Mobile Thin Client uses an ALC236 codec
    and needs the ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk
    to enable the mute and micmute LED functionality.
    
    This patch adds the system ID of the HP Elite mt645 G8
    to the `alc269_fixup_tbl` in `patch_realtek.c`
    to enable the required quirk.
    
    Cc: [email protected]
    Signed-off-by: Nikolai Afanasenkov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hda/realtek: Fix the push button function for the ALC257 [+ + +]
Author: Oder Chiou <[email protected]>
Date:   Mon Sep 30 18:50:39 2024 +0800

    ALSA: hda/realtek: Fix the push button function for the ALC257
    
    [ Upstream commit 05df9732a0894846c46d0062d4af535c5002799d ]
    
    The headset push button cannot work properly in case of the ALC257.
    This patch reverted the previous commit to correct the side effect.
    
    Fixes: ef9718b3d54e ("ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7")
    Signed-off-by: Oder Chiou <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop [+ + +]
Author: Baojun Xu <[email protected]>
Date:   Thu Sep 19 15:57:43 2024 +0800

    ALSA: hda/tas2781: Add new quirk for Lenovo Y990 Laptop
    
    commit 49f5ee951f11f4d6a124f00f71b2590507811a55 upstream.
    
    Add new vendor_id and subsystem_id in quirk for Lenovo Y990 Laptop.
    
    Signed-off-by: Baojun Xu <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: hdsp: Break infinite MIDI input flush loop [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Thu Aug 8 11:15:12 2024 +0200

    ALSA: hdsp: Break infinite MIDI input flush loop
    
    [ Upstream commit c01f3815453e2d5f699ccd8c8c1f93a5b8669e59 ]
    
    The current MIDI input flush on HDSP and HDSPM drivers relies on the
    hardware reporting the right value.  If the hardware doesn't give the
    proper value but returns -1, it may be stuck at an infinite loop.
    
    Add a counter and break if the loop is unexpectedly too long.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: line6: add hw monitor volume control to POD HD500X [+ + +]
Author: Hans P. Moller <[email protected]>
Date:   Thu Oct 3 20:28:28 2024 -0300

    ALSA: line6: add hw monitor volume control to POD HD500X
    
    commit 703235a244e533652346844cfa42623afb36eed1 upstream.
    
    Add hw monitor volume control for POD HD500X. This is done adding
    LINE6_CAP_HWMON_CTL to the capabilities
    
    Signed-off-by: Hans P. Moller <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: mixer_oss: Remove some incorrect kfree_const() usages [+ + +]
Author: Christophe JAILLET <[email protected]>
Date:   Thu Sep 26 20:17:36 2024 +0200

    ALSA: mixer_oss: Remove some incorrect kfree_const() usages
    
    [ Upstream commit 368e4663c557de4a33f321b44e7eeec0a21b2e4e ]
    
    "assigned" and "assigned->name" are allocated in snd_mixer_oss_proc_write()
    using kmalloc() and kstrdup(), so there is no point in using kfree_const()
    to free these resources.
    
    Switch to the more standard kfree() to free these resources.
    
    This could avoid a memory leak.
    
    Fixes: 454f5ec1d2b7 ("ALSA: mixer: oss: Constify snd_mixer_oss_assign_table definition")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/63ac20f64234b7c9ea87a7fa9baf41e8255852f7.1727374631.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET [+ + +]
Author: Lianqin Hu <[email protected]>
Date:   Wed Sep 25 03:16:29 2024 +0000

    ALSA: usb-audio: Add delay quirk for VIVO USB-C HEADSET
    
    commit 73385f3e0d8088b715ae8f3f66d533c482a376ab upstream.
    
    Audio control requests that sets sampling frequency sometimes fail on
    this card. Adding delay between control messages eliminates that problem.
    
    Signed-off-by: Lianqin Hu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
    Link: https://patch.msgid.link/TYUPR06MB62177E629E9DEF2401333BF7D2692@TYUPR06MB6217.apcprd06.prod.outlook.com
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add input value sanity checks for standard types [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Tue Aug 6 14:46:50 2024 +0200

    ALSA: usb-audio: Add input value sanity checks for standard types
    
    [ Upstream commit 901e85677ec0bb9a69fb9eab1feafe0c4eb7d07e ]
    
    For an invalid input value that is out of the given range, currently
    USB-audio driver corrects the value silently and accepts without
    errors.  This is no wrong behavior, per se, but the recent kselftest
    rather wants to have an error in such a case, hence a different
    behavior is expected now.
    
    This patch adds a sanity check at each control put for the standard
    mixer types and returns an error if an invalid value is given.
    
    Note that this covers only the standard mixer types.  The mixer quirks
    that have own control callbacks would need different coverage.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add logitech Audio profile quirk [+ + +]
Author: Joshua Pius <[email protected]>
Date:   Thu Sep 12 15:26:28 2024 +0000

    ALSA: usb-audio: Add logitech Audio profile quirk
    
    [ Upstream commit a51c925c11d7b855167e64b63eb4378e5adfc11d ]
    
    Specify shortnames for the following Logitech Devices: Rally bar, Rally
    bar mini, Tap, MeetUp and Huddle.
    
    Signed-off-by: Joshua Pius <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add mixer quirk for RME Digiface USB [+ + +]
Author: Asahi Lina <[email protected]>
Date:   Tue Sep 3 19:52:30 2024 +0900

    ALSA: usb-audio: Add mixer quirk for RME Digiface USB
    
    [ Upstream commit 611a96f6acf2e74fe28cb90908a9c183862348ce ]
    
    Implement sync, output format, and input status mixer controls, to allow
    the interface to be used as a straight ADAT/SPDIF (+ Headphones) I/O
    interface.
    
    This does not implement the matrix mixer, output gain controls, or input
    level meter feedback. The full mixer interface is only really usable
    using a dedicated userspace control app (there are too many mixer nodes
    for alsamixer to be usable), so for now we leave it up to userspace to
    directly control these features using raw USB control messages. This is
    similar to how it's done with some FireWire interfaces (ffado-mixer).
    
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Add native DSD support for Luxman D-08u [+ + +]
Author: Jan Lalinsky <[email protected]>
Date:   Thu Oct 3 05:08:11 2024 +0200

    ALSA: usb-audio: Add native DSD support for Luxman D-08u
    
    commit 6b0bde5d8d4078ca5feec72fd2d828f0e5cf115d upstream.
    
    Add native DSD support for Luxman D-08u DAC, by adding the PID/VID 1852:5062.
    This makes DSD playback work, and also sound quality when playing PCM files
    is improved, crackling sounds are gone.
    
    Signed-off-by: Jan Lalinsky <[email protected]>
    Cc: <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ALSA: usb-audio: Add quirk for RME Digiface USB [+ + +]
Author: Cyan Nyan <[email protected]>
Date:   Tue Sep 3 19:52:29 2024 +0900

    ALSA: usb-audio: Add quirk for RME Digiface USB
    
    [ Upstream commit c032044e9672408c534d64a6df2b1ba14449e948 ]
    
    Add trivial support for audio streaming on the RME Digiface USB. Binds
    only to the first interface to allow userspace to directly drive the
    complex I/O and matrix mixer controls.
    
    Signed-off-by: Cyan Nyan <[email protected]>
    [Lina: Added 2x/4x sample rate support & boot/format quirks]
    Co-developed-by: Asahi Lina <[email protected]>
    Signed-off-by: Asahi Lina <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Define macros for quirk table entries [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:41 2024 +0200

    ALSA: usb-audio: Define macros for quirk table entries
    
    [ Upstream commit 0c3ad39b791c2ecf718afcaca30e5ceafa939d5c ]
    
    Many entries in the USB-audio quirk tables have relatively complex
    expressions.  For improving the readability, introduce a few macros.
    Those are applied in the following patch.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: usb-audio: Replace complex quirk lines with macros [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Wed Aug 14 15:48:42 2024 +0200

    ALSA: usb-audio: Replace complex quirk lines with macros
    
    [ Upstream commit d79e13f8e8abb5cd3a2a0f9fc9bc3fc750c5b06f ]
    
    Apply the newly introduced macros for reduce the complex expressions
    and cast in the quirk table definitions.  It results in a significant
    code reduction, too.
    
    There should be no functional changes.
    
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
aoe: fix the potential use-after-free problem in more places [+ + +]
Author: Chun-Yi Lee <[email protected]>
Date:   Wed Oct 2 11:54:58 2024 +0800

    aoe: fix the potential use-after-free problem in more places
    
    commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 upstream.
    
    For fixing CVE-2023-6270, f98364e92662 ("aoe: fix the potential
    use-after-free problem in aoecmd_cfg_pkts") makes tx() calling dev_put()
    instead of doing in aoecmd_cfg_pkts(). It avoids that the tx() runs
    into use-after-free.
    
    Then Nicolai Stange found more places in aoe have potential use-after-free
    problem with tx(). e.g. revalidate(), aoecmd_ata_rw(), resend(), probe()
    and aoecmd_cfg_rsp(). Those functions also use aoenet_xmit() to push
    packet to tx queue. So they should also use dev_hold() to increase the
    refcnt of skb->dev.
    
    On the other hand, moving dev_put() to tx() causes that the refcnt of
    skb->dev be reduced to a negative value, because corresponding
    dev_hold() are not called in revalidate(), aoecmd_ata_rw(), resend(),
    probe(), and aoecmd_cfg_rsp(). This patch fixed this issue.
    
    Cc: [email protected]
    Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270
    Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts")
    Reported-by: Nicolai Stange <[email protected]>
    Signed-off-by: Chun-Yi Lee <[email protected]>
    Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
arm64: cputype: Add Neoverse-N3 definitions [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:06:37 2024 +0100

    arm64: cputype: Add Neoverse-N3 definitions
    
    [ Upstream commit 924725707d80bc2588cefafef76ff3f164d299bc ]
    
    Add cputype definitions for Neoverse-N3. These will be used for errata
    detection in subsequent patches.
    
    These values can be found in Table A-261 ("MIDR_EL1 bit descriptions")
    in issue 02 of the Neoverse-N3 TRM, which can be found at:
    
      https://developer.arm.com/documentation/107997/0000/?lang=en
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: errata: Expand speculative SSBS workaround once more [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Oct 7 13:06:38 2024 +0100

    arm64: errata: Expand speculative SSBS workaround once more
    
    [ Upstream commit 081eb7932c2b244f63317a982c5e3990e2c7fbdd ]
    
    A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
    special-purpose register does not affect subsequent speculative
    instructions, permitting speculative store bypassing for a window of
    time.
    
    We worked around this for a number of CPUs in commits:
    
    * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
    * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")
    * 145502cac7ea70b5 ("arm64: errata: Expand speculative SSBS workaround (again)")
    
    Since then, a (hopefully final) batch of updates have been published,
    with two more affected CPUs. For the affected CPUs the existing
    mitigation is sufficient, as described in their respective Software
    Developer Errata Notice (SDEN) documents:
    
    * Cortex-A715 (MP148) SDEN v15.0, erratum 3456084
      https://developer.arm.com/documentation/SDEN-2148827/1500/
    
    * Neoverse-N3 (MP195) SDEN v5.0, erratum 3456111
      https://developer.arm.com/documentation/SDEN-3050973/0500/
    
    Enable the existing mitigation by adding the relevant MIDRs to
    erratum_spec_ssbs_list, and update silicon-errata.rst and the
    Kconfig text accordingly.
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: James Morse <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    [ Mark: trivial backport ]
    Signed-off-by: Mark Rutland <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS [+ + +]
Author: Mark Rutland <[email protected]>
Date:   Mon Sep 30 13:04:48 2024 +0100

    arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
    
    commit b3d6121eaeb22aee8a02f46706745b1968cc0292 upstream.
    
    The Kconfig logic to select HAVE_DYNAMIC_FTRACE_WITH_ARGS is incorrect,
    and HAVE_DYNAMIC_FTRACE_WITH_ARGS may be selected when it is not
    supported by the combination of clang and GNU LD, resulting in link-time
    errors:
    
      aarch64-linux-gnu-ld: .init.data has both ordered [`__patchable_function_entries' in init/main.o] and unordered [`.meminit.data' in mm/sparse.o] sections
      aarch64-linux-gnu-ld: final link failed: bad value
    
    ... which can be seen when building with CC=clang using a binutils
    version older than 2.36.
    
    We originally fixed that in commit:
    
      45bd8951806eb5e8 ("arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang")
    
    ... by splitting the "select HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement
    into separete CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS and
    GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS options which individually select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS.
    
    Subsequently we accidentally re-introduced the common "select
    HAVE_DYNAMIC_FTRACE_WITH_ARGS" statement in commit:
    
      26299b3f6ba26bfc ("ftrace: arm64: move from REGS to ARGS")
    
    ... then we removed it again in commit:
    
      68a63a412d18bd2e ("arm64: Fix build with CC=clang, CONFIG_FTRACE=y and CONFIG_STACK_TRACER=y")
    
    ... then we accidentally re-introduced it again in commit:
    
      2aa6ac03516d078c ("arm64: ftrace: Add direct call support")
    
    Fix this for the third time by keeping the unified select statement and
    making this depend onf either GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS or
    CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS. This is more consistent with
    usual style and less likely to go wrong in future.
    
    Fixes: 2aa6ac03516d ("arm64: ftrace: Add direct call support")
    Cc: <[email protected]> # 6.4.x
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: Will Deacon <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386 [+ + +]
Author: Easwar Hariharan <[email protected]>
Date:   Thu Oct 3 22:52:35 2024 +0000

    arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
    
    commit 3eddb108abe3de6723cc4b77e8558ce1b3047987 upstream.
    
    Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering
    from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata:
    Expand speculative SSBS workaround")
    
    CC: Mark Rutland <[email protected]>
    CC: James More <[email protected]>
    CC: Will Deacon <[email protected]>
    CC: [email protected] # 6.6+
    Signed-off-by: Easwar Hariharan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() [+ + +]
Author: Fares Mehanna <[email protected]>
Date:   Mon Sep 2 16:33:08 2024 +0000

    arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
    
    [ Upstream commit 7eced90b202d63cdc1b9b11b1353adb1389830f9 ]
    
    The reasons for PTEs in the kernel direct map to be marked invalid are not
    limited to kfence / debug pagealloc machinery. In particular,
    memfd_secret() also steals pages with set_direct_map_invalid_noflush().
    
    When building the transitional page tables for kexec from the current
    kernel's page tables, those pages need to become regular writable pages,
    otherwise, if the relocation places kexec segments over such pages, a fault
    will occur during kexec, leading to host going dark during kexec.
    
    This patch addresses the kexec issue by marking any PTE as valid if it is
    not none. While this fixes the kexec crash, it does not address the
    security concern that if processes owning secret memory are not terminated
    before kexec, the secret content will be mapped in the new kernel without
    being scrubbed.
    
    Suggested-by: Jan H. Schönherr <[email protected]>
    Signed-off-by: Fares Mehanna <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized [+ + +]
Author: Andrei Simion <[email protected]>
Date:   Tue Sep 24 11:12:38 2024 +0300

    ASoC: atmel: mchp-pdmc: Skip ALSA restoration if substream runtime is uninitialized
    
    [ Upstream commit 09cfc6a532d249a51d3af5022d37ebbe9c3d31f6 ]
    
    Update the driver to prevent alsa-restore.service from failing when
    reading data from /var/lib/alsa/asound.state at boot. Ensure that the
    restoration of ALSA mixer configurations is skipped if substream->runtime
    is NULL.
    
    Fixes: 50291652af52 ("ASoC: atmel: mchp-pdmc: add PDMC driver")
    Signed-off-by: Andrei Simion <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: codecs: wsa883x: Handle reading version failure [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Wed Jul 10 15:52:31 2024 +0200

    ASoC: codecs: wsa883x: Handle reading version failure
    
    [ Upstream commit 2fbf16992e5aa14acf0441320033a01a32309ded ]
    
    If reading version and variant from registers fails (which is unlikely
    but possible, because it is a read over bus), the driver will proceed
    and perform device configuration based on uninitialized stack variables.
    Handle it a bit better - bail out without doing any init and failing the
    update status Soundwire callback.
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m [+ + +]
Author: Hui Wang <[email protected]>
Date:   Wed Oct 2 10:56:59 2024 +0800

    ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
    
    [ Upstream commit 47d7d3fd72afc7dcd548806291793ee6f3848215 ]
    
    In most Linux distribution kernels, the SND is set to m, in such a
    case, when booting the kernel on i.MX8MP EVK board, there is a
    warning calltrace like below:
     Call trace:
     snd_card_init+0x484/0x4cc [snd]
     snd_card_new+0x70/0xa8 [snd]
     snd_soc_bind_card+0x310/0xbd0 [snd_soc_core]
     snd_soc_register_card+0xf0/0x108 [snd_soc_core]
     devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
    
    That is because the card.owner is not set, a warning calltrace is
    raised in the snd_card_init() due to it.
    
    Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
    Signed-off-by: Hui Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev() [+ + +]
Author: Pierre-Louis Bossart <[email protected]>
Date:   Tue Aug 27 20:32:01 2024 +0800

    ASoC: Intel: boards: always check the result of acpi_dev_get_first_match_dev()
    
    [ Upstream commit 14e91ddd5c02d8c3e5a682ebfa0546352b459911 ]
    
    The code seems mostly copy-pasted, with some machine drivers
    forgetting to test if the 'adev' result is NULL.
    
    Add this check when missing, and use -ENOENT consistently as an error
    code.
    
    Reported-by: Dan Carpenter <[email protected]>
    Closes: https://lore.kernel.org/alsa-devel/[email protected]/T/#u
    Signed-off-by: Pierre-Louis Bossart <[email protected]>
    Reviewed-by: Péter Ujfalusi <[email protected]>
    Signed-off-by: Bard Liao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ata: pata_serverworks: Do not use the term blacklist [+ + +]
Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 10:58:36 2024 +0900

    ata: pata_serverworks: Do not use the term blacklist
    
    [ Upstream commit 858048568c9e3887d8b19e101ee72f129d65cb15 ]
    
    Let's not use the term blacklist in the function
    serverworks_osb4_filter() documentation comment and rather simply refer
    to what that function looks at: the list of devices with groken UDMA5.
    
    While at it, also constify the values of the csb_bad_ata100 array.
    
    Of note is that all of this should probably be handled using libata
    quirk mechanism but it is unclear if these UDMA5 quirks are specific
    to this controller only.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: sata_sil: Rename sil_blacklist to sil_quirks [+ + +]
Author: Damien Le Moal <[email protected]>
Date:   Fri Jul 26 11:14:11 2024 +0900

    ata: sata_sil: Rename sil_blacklist to sil_quirks
    
    [ Upstream commit 93b0f9e11ce511353c65b7f924cf5f95bd9c3aba ]
    
    Rename the array sil_blacklist to sil_quirks as this name is more
    neutral and is also consistent with how this driver define quirks with
    the SIL_QUIRK_XXX flags.
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Reviewed-by: Igor Pylypiv <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
blk_iocost: fix more out of bound shifts [+ + +]
Author: Konstantin Ovsepian <[email protected]>
Date:   Thu Aug 22 08:41:36 2024 -0700

    blk_iocost: fix more out of bound shifts
    
    [ Upstream commit 9bce8005ec0dcb23a58300e8522fe4a31da606fa ]
    
    Recently running UBSAN caught few out of bound shifts in the
    ioc_forgive_debts() function:
    
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2142:38
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    UBSAN: shift-out-of-bounds in block/blk-iocost.c:2144:30
    shift exponent 80 is too large for 64-bit type 'u64' (aka 'unsigned long
    long')
    ...
    Call Trace:
    <IRQ>
    dump_stack_lvl+0xca/0x130
    __ubsan_handle_shift_out_of_bounds+0x22c/0x280
    ? __lock_acquire+0x6441/0x7c10
    ioc_timer_fn+0x6cec/0x7750
    ? blk_iocost_init+0x720/0x720
    ? call_timer_fn+0x5d/0x470
    call_timer_fn+0xfa/0x470
    ? blk_iocost_init+0x720/0x720
    __run_timer_base+0x519/0x700
    ...
    
    Actual impact of this issue was not identified but I propose to fix the
    undefined behaviour.
    The proposed fix to prevent those out of bound shifts consist of
    precalculating exponent before using it the shift operations by taking
    min value from the actual exponent and maximum possible number of bits.
    
    Reported-by: Breno Leitao <[email protected]>
    Signed-off-by: Konstantin Ovsepian <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
block: fix integer overflow in BLKSECDISCARD [+ + +]
Author: Alexey Dobriyan <[email protected]>
Date:   Tue Sep 3 22:48:19 2024 +0300

    block: fix integer overflow in BLKSECDISCARD
    
    [ Upstream commit 697ba0b6ec4ae04afb67d3911799b5e2043b4455 ]
    
    I independently rediscovered
    
            commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155
            block: fix overflow in blk_ioctl_discard()
    
    but for secure erase.
    
    Same problem:
    
            uint64_t r[2] = {512, 18446744073709551104ULL};
            ioctl(fd, BLKSECDISCARD, r);
    
    will enter near infinite loop inside blkdev_issue_secure_erase():
    
            a.out: attempt to access beyond end of device
            loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048
            bio_check_eod: 3286214 callbacks suppressed
    
    Signed-off-by: Alexey Dobriyan <[email protected]>
    Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:12:04 2024 +0800

    Bluetooth: btmrvl: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 7b1ab460592ca818e7b52f27cd3ec86af79220d1 ]
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
    disable IRQ auto-enable when request IRQ.
    
    Fixes: bb7f4f0bcee6 ("btmrvl: add platform specific wakeup interrupt support")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B [+ + +]
Author: Hilda Wu <[email protected]>
Date:   Thu Aug 29 16:40:05 2024 +0800

    Bluetooth: btrtl: Set msft ext address filter quirk for RTL8852B
    
    [ Upstream commit 9a0570948c5def5c59e588dc0e009ed850a1f5a1 ]
    
    For tracking multiple devices concurrently with a condition.
    The patch enables the HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER quirk
    on RTL8852B controller.
    
    The quirk setting is based on commit 9e14606d8f38 ("Bluetooth: msft:
    Extended monitor tracking by address filter")
    
    With this setting, when a pattern monitor detects a device, this
    feature issues an address monitor for tracking that device. Let the
    original pattern monitor keep monitor new devices.
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122 [+ + +]
Author: Hilda Wu <[email protected]>
Date:   Fri Aug 16 16:58:22 2024 +0800

    Bluetooth: btusb: Add Realtek RTL8852C support ID 0x0489:0xe122
    
    [ Upstream commit bdf9557f70e7512bb2f754abf90d9e9958745316 ]
    
    Add the support ID (0x0489, 0xe122) to usb_device_id table for
    Realtek RTL8852C.
    
    The device info from /sys/kernel/debug/usb/devices as below.
    
    T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
    D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
    P:  Vendor=0489 ProdID=e122 Rev= 0.00
    S:  Manufacturer=Realtek
    S:  Product=Bluetooth Radio
    S:  SerialNumber=00e04c000001
    C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
    E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
    I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
    I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
    I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
    I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
    I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
    E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
    
    Signed-off-by: Hilda Wu <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:17:00 2024 -0400

    Bluetooth: hci_event: Align BR/EDR JUST_WORKS paring with LE
    
    commit b25e11f978b63cb7857890edb3a698599cddb10e upstream.
    
    This aligned BR/EDR JUST_WORKS method with LE which since 92516cd97fd4
    ("Bluetooth: Always request for user confirmation for Just Works")
    always request user confirmation with confirm_hint set since the
    likes of bluetoothd have dedicated policy around JUST_WORKS method
    (e.g. main.conf:JustWorksRepairing).
    
    CVE: CVE-2024-8805
    Cc: [email protected]
    Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Tested-by: Kiran K <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: L2CAP: Fix uaf in l2cap_connect [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon Sep 23 12:47:39 2024 -0400

    Bluetooth: L2CAP: Fix uaf in l2cap_connect
    
    [ Upstream commit 333b4fd11e89b29c84c269123f871883a30be586 ]
    
    [Syzbot reported]
    BUG: KASAN: slab-use-after-free in l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
    Read of size 8 at addr ffff8880241e9800 by task kworker/u9:0/54
    
    CPU: 0 UID: 0 PID: 54 Comm: kworker/u9:0 Not tainted 6.11.0-rc6-syzkaller-00268-g788220eee30d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Workqueue: hci2 hci_rx_work
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0xc3/0x620 mm/kasan/report.c:488
     kasan_report+0xd9/0x110 mm/kasan/report.c:601
     l2cap_connect.constprop.0+0x10d8/0x1270 net/bluetooth/l2cap_core.c:3949
     l2cap_connect_req net/bluetooth/l2cap_core.c:4080 [inline]
     l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:4772 [inline]
     l2cap_sig_channel net/bluetooth/l2cap_core.c:5543 [inline]
     l2cap_recv_frame+0xf0b/0x8eb0 net/bluetooth/l2cap_core.c:6825
     l2cap_recv_acldata+0x9b4/0xb70 net/bluetooth/l2cap_core.c:7514
     hci_acldata_packet net/bluetooth/hci_core.c:3791 [inline]
     hci_rx_work+0xaab/0x1610 net/bluetooth/hci_core.c:4028
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    ...
    
    Freed by task 5245:
     kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
     kasan_save_track+0x14/0x30 mm/kasan/common.c:68
     kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
     poison_slab_object+0xf7/0x160 mm/kasan/common.c:240
     __kasan_slab_free+0x32/0x50 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2256 [inline]
     slab_free mm/slub.c:4477 [inline]
     kfree+0x12a/0x3b0 mm/slub.c:4598
     l2cap_conn_free net/bluetooth/l2cap_core.c:1810 [inline]
     kref_put include/linux/kref.h:65 [inline]
     l2cap_conn_put net/bluetooth/l2cap_core.c:1822 [inline]
     l2cap_conn_del+0x59d/0x730 net/bluetooth/l2cap_core.c:1802
     l2cap_connect_cfm+0x9e6/0xf80 net/bluetooth/l2cap_core.c:7241
     hci_connect_cfm include/net/bluetooth/hci_core.h:1960 [inline]
     hci_conn_failed+0x1c3/0x370 net/bluetooth/hci_conn.c:1265
     hci_abort_conn_sync+0x75a/0xb50 net/bluetooth/hci_sync.c:5583
     abort_conn_sync+0x197/0x360 net/bluetooth/hci_conn.c:2917
     hci_cmd_sync_work+0x1a4/0x410 net/bluetooth/hci_sync.c:328
     process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
     process_scheduled_works kernel/workqueue.c:3312 [inline]
     worker_thread+0x6c8/0xed0 kernel/workqueue.c:3389
     kthread+0x2c1/0x3a0 kernel/kthread.c:389
     ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
     ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
    Reported-by: [email protected]
    Tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=c12e2f941af1feb5632c
    Fixes: 7b064edae38d ("Bluetooth: Fix authentication if acl data comes before remote feature evt")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: MGMT: Fix possible crash on mgmt_index_removed [+ + +]
Author: Luiz Augusto von Dentz <[email protected]>
Date:   Thu Sep 12 12:34:42 2024 -0400

    Bluetooth: MGMT: Fix possible crash on mgmt_index_removed
    
    [ Upstream commit f53e1c9c726d83092167f2226f32bd3b73f26c21 ]
    
    If mgmt_index_removed is called while there are commands queued on
    cmd_sync it could lead to crashes like the bellow trace:
    
    0x0000053D: __list_del_entry_valid_or_report+0x98/0xdc
    0x0000053D: mgmt_pending_remove+0x18/0x58 [bluetooth]
    0x0000053E: mgmt_remove_adv_monitor_complete+0x80/0x108 [bluetooth]
    0x0000053E: hci_cmd_sync_work+0xbc/0x164 [bluetooth]
    
    So while handling mgmt_index_removed this attempts to dequeue
    commands passed as user_data to cmd_sync.
    
    Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
    Reported-by: jiaymao <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bnxt_en: Extend maximum length of version string by 1 byte [+ + +]
Author: Simon Horman <[email protected]>
Date:   Tue Aug 13 15:32:55 2024 +0100

    bnxt_en: Extend maximum length of version string by 1 byte
    
    [ Upstream commit ffff7ee843c351ce71d6e0d52f0f20bea35e18c9 ]
    
    This corrects an out-by-one error in the maximum length of the package
    version string. The size argument of snprintf includes space for the
    trailing '\0' byte, so there is no need to allow extra space for it by
    reducing the value of the size argument by 1.
    
    Found by inspection.
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Michael Chan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bpf: Fix a sdiv overflow issue [+ + +]
Author: Yonghong Song <[email protected]>
Date:   Fri Sep 13 08:03:26 2024 -0700

    bpf: Fix a sdiv overflow issue
    
    [ Upstream commit 7dd34d7b7dcf9309fc6224caf4dd5b35bedddcb7 ]
    
    Zac Ecob reported a problem where a bpf program may cause kernel crash due
    to the following error:
      Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
    
    The failure is due to the below signed divide:
      LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808.
    LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808,
    but it is impossible since for 64-bit system, the maximum positive
    number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will
    cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is
    LLONG_MIN.
    
    Further investigation found all the following sdiv/smod cases may trigger
    an exception when bpf program is running on x86_64 platform:
      - LLONG_MIN/-1 for 64bit operation
      - INT_MIN/-1 for 32bit operation
      - LLONG_MIN%-1 for 64bit operation
      - INT_MIN%-1 for 32bit operation
    where -1 can be an immediate or in a register.
    
    On arm64, there are no exceptions:
      - LLONG_MIN/-1 = LLONG_MIN
      - INT_MIN/-1 = INT_MIN
      - LLONG_MIN%-1 = 0
      - INT_MIN%-1 = 0
    where -1 can be an immediate or in a register.
    
    Insn patching is needed to handle the above cases and the patched codes
    produced results aligned with above arm64 result. The below are pseudo
    codes to handle sdiv/smod exceptions including both divisor -1 and divisor 0
    and the divisor is stored in a register.
    
    sdiv:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1]
          if tmp >(unsigned) 1 goto L2
          if tmp == 0 goto L1
          rY = 0
      L1:
          rY = -rY;
          goto L3
      L2:
          rY /= rX
      L3:
    
    smod:
          tmp = rX
          tmp += 1 /* [-1, 0] -> [0, 1]
          if tmp >(unsigned) 1 goto L1
          if tmp == 1 (is64 ? goto L2 : goto L3)
          rY = 0;
          goto L2
      L1:
          rY %= rX
      L2:
          goto L4  // only when !is64
      L3:
          wY = wY  // only when !is64
      L4:
    
      [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
    
    Reported-by: Zac Ecob <[email protected]>
    Signed-off-by: Yonghong Song <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Make the pointer returned by iter next method valid [+ + +]
Author: Juntong Deng <[email protected]>
Date:   Thu Aug 29 21:11:17 2024 +0100

    bpf: Make the pointer returned by iter next method valid
    
    [ Upstream commit 4cc8c50c9abcb2646a7a4fcef3cea5dcb30c06cf ]
    
    Currently we cannot pass the pointer returned by iter next method as
    argument to KF_TRUSTED_ARGS or KF_RCU kfuncs, because the pointer
    returned by iter next method is not "valid".
    
    This patch sets the pointer returned by iter next method to be valid.
    
    This is based on the fact that if the iterator is implemented correctly,
    then the pointer returned from the iter next method should be valid.
    
    This does not make NULL pointer valid. If the iter next method has
    KF_RET_NULL flag, then the verifier will ask the ebpf program to
    check NULL pointer.
    
    KF_RCU_PROTECTED iterator is a special case, the pointer returned by
    iter next method should only be valid within RCU critical section,
    so it should be with MEM_RCU, not PTR_TRUSTED.
    
    Another special case is bpf_iter_num_next, which returns a pointer with
    base type PTR_TO_MEM. PTR_TO_MEM should not be combined with type flag
    PTR_TRUSTED (PTR_TO_MEM already means the pointer is valid).
    
    The pointer returned by iter next method of other types of iterators
    is with PTR_TRUSTED.
    
    In addition, this patch adds get_iter_from_state to help us get the
    current iterator from the current state.
    
    Signed-off-by: Juntong Deng <[email protected]>
    Link: https://lore.kernel.org/r/AM6PR03MB584869F8B448EA1C87B7CDA399962@AM6PR03MB5848.eurprd03.prod.outlook.com
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
bpftool: Fix undefined behavior caused by shifting into the sign bit [+ + +]
Author: Kuan-Wei Chiu <[email protected]>
Date:   Sun Sep 8 22:00:09 2024 +0800

    bpftool: Fix undefined behavior caused by shifting into the sign bit
    
    [ Upstream commit 4cdc0e4ce5e893bc92255f5f734d983012f2bc2e ]
    
    Replace shifts of '1' with '1U' in bitwise operations within
    __show_dev_tc_bpf() to prevent undefined behavior caused by shifting
    into the sign bit of a signed integer. By using '1U', the operations
    are explicitly performed on unsigned integers, avoiding potential
    integer overflow or sign-related issues.
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Acked-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

bpftool: Fix undefined behavior in qsort(NULL, 0, ...) [+ + +]
Author: Kuan-Wei Chiu <[email protected]>
Date:   Tue Sep 10 23:02:07 2024 +0800

    bpftool: Fix undefined behavior in qsort(NULL, 0, ...)
    
    [ Upstream commit f04e2ad394e2755d0bb2d858ecb5598718bf00d5 ]
    
    When netfilter has no entry to display, qsort is called with
    qsort(NULL, 0, ...). This results in undefined behavior, as UBSan
    reports:
    
    net.c:827:2: runtime error: null pointer passed as argument 1, which is declared to never be null
    
    Although the C standard does not explicitly state whether calling qsort
    with a NULL pointer when the size is 0 constitutes undefined behavior,
    Section 7.1.4 of the C standard (Use of library functions) mentions:
    
    "Each of the following statements applies unless explicitly stated
    otherwise in the detailed descriptions that follow: If an argument to a
    function has an invalid value (such as a value outside the domain of
    the function, or a pointer outside the address space of the program, or
    a null pointer, or a pointer to non-modifiable storage when the
    corresponding parameter is not const-qualified) or a type (after
    promotion) not expected by a function with variable number of
    arguments, the behavior is undefined."
    
    To avoid this, add an early return when nf_link_info is NULL to prevent
    calling qsort with a NULL pointer.
    
    Signed-off-by: Kuan-Wei Chiu <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Reviewed-by: Quentin Monnet <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
bridge: mcast: Fail MDB get request on empty entry [+ + +]
Author: Ido Schimmel <[email protected]>
Date:   Sun Sep 29 15:36:40 2024 +0300

    bridge: mcast: Fail MDB get request on empty entry
    
    [ Upstream commit 555f45d24ba7cd5527716553031641cdebbe76c7 ]
    
    When user space deletes a port from an MDB entry, the port is removed
    synchronously. If this was the last port in the entry and the entry is
    not joined by the host itself, then the entry is scheduled for deletion
    via a timer.
    
    The above means that it is possible for the MDB get netlink request to
    retrieve an empty entry which is scheduled for deletion. This is
    problematic as after deleting the last port in an entry, user space
    cannot rely on a non-zero return code from the MDB get request as an
    indication that the port was successfully removed.
    
    Fix by returning an error when the entry's port list is empty and the
    entry is not joined by the host.
    
    Fixes: 68b380a395a7 ("bridge: mcast: Add MDB get support")
    Reported-by: Jamie Bainbridge <[email protected]>
    Closes: https://lore.kernel.org/netdev/c92569919307749f879b9482b0f3e125b7d9d2e3.1726480066.git.jamie.bainbridge@gmail.com/
    Tested-by: Jamie Bainbridge <[email protected]>
    Signed-off-by: Ido Schimmel <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
btrfs: drop the backref cache during relocation if we commit [+ + +]
Author: Josef Bacik <[email protected]>
Date:   Tue Sep 24 16:50:22 2024 -0400

    btrfs: drop the backref cache during relocation if we commit
    
    commit db7e68b522c01eb666cfe1f31637775f18997811 upstream.
    
    Since the inception of relocation we have maintained the backref cache
    across transaction commits, updating the backref cache with the new
    bytenr whenever we COWed blocks that were in the cache, and then
    updating their bytenr once we detected a transaction id change.
    
    This works as long as we're only ever modifying blocks, not changing the
    structure of the tree.
    
    However relocation does in fact change the structure of the tree.  For
    example, if we are relocating a data extent, we will look up all the
    leaves that point to this data extent.  We will then call
    do_relocation() on each of these leaves, which will COW down to the leaf
    and then update the file extent location.
    
    But, a key feature of do_relocation() is the pending list.  This is all
    the pending nodes that we modified when we updated the file extent item.
    We will then process all of these blocks via finish_pending_nodes, which
    calls do_relocation() on all of the nodes that led up to that leaf.
    
    The purpose of this is to make sure we don't break sharing unless we
    absolutely have to.  Consider the case that we have 3 snapshots that all
    point to this leaf through the same nodes, the initial COW would have
    created a whole new path.  If we did this for all 3 snapshots we would
    end up with 3x the number of nodes we had originally.  To avoid this we
    will cycle through each of the snapshots that point to each of these
    nodes and update their pointers to point at the new nodes.
    
    Once we update the pointer to the new node we will drop the node we
    removed the link for and all of its children via btrfs_drop_subtree().
    This is essentially just btrfs_drop_snapshot(), but for an arbitrary
    point in the snapshot.
    
    The problem with this is that we will never reflect this in the backref
    cache.  If we do this btrfs_drop_snapshot() for a node that is in the
    backref tree, we will leave the node in the backref tree.  This becomes
    a problem when we change the transid, as now the backref cache has
    entire subtrees that no longer exist, but exist as if they still are
    pointed to by the same roots.
    
    In the best case scenario you end up with "adding refs to an existing
    tree ref" errors from insert_inline_extent_backref(), where we attempt
    to link in nodes on roots that are no longer valid.
    
    Worst case you will double free some random block and re-use it when
    there's still references to the block.
    
    This is extremely subtle, and the consequences are quite bad.  There
    isn't a way to make sure our backref cache is consistent between
    transid's.
    
    In order to fix this we need to simply evict the entire backref cache
    anytime we cross transid's.  This reduces performance in that we have to
    rebuild this backref cache every time we change transid's, but fixes the
    bug.
    
    This has existed since relocation was added, and is a pretty critical
    bug.  There's a lot more cleanup that can be done now that this
    functionality is going away, but this patch is as small as possible in
    order to fix the problem and make it easy for us to backport it to all
    the kernels it needs to be backported to.
    
    Followup series will dismantle more of this code and simplify relocation
    drastically to remove this functionality.
    
    We have a reproducer that reproduced the corruption within a few minutes
    of running.  With this patch it survives several iterations/hours of
    running the reproducer.
    
    Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance")
    CC: [email protected]
    Reviewed-by: Boris Burkov <[email protected]>
    Signed-off-by: Josef Bacik <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [+ + +]
Author: Qu Wenruo <[email protected]>
Date:   Sat Sep 28 08:05:58 2024 +0930

    btrfs: fix a NULL pointer dereference when failed to start a new trasacntion
    
    commit c3b47f49e83197e8dffd023ec568403bcdbb774b upstream.
    
    [BUG]
    Syzbot reported a NULL pointer dereference with the following crash:
    
      FAULT_INJECTION: forcing a failure.
       start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676
       prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642
       relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678
      ...
      BTRFS info (device loop0): balance: ended with status: -12
      Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI
      KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667]
      RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926
      Call Trace:
       <TASK>
       commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496
       btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430
       del_balance_item fs/btrfs/volumes.c:3678 [inline]
       reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742
       btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574
       btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673
       vfs_ioctl fs/ioctl.c:51 [inline]
       __do_sys_ioctl fs/ioctl.c:907 [inline]
       __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [CAUSE]
    The allocation failure happens at the start_transaction() inside
    prepare_to_relocate(), and during the error handling we call
    unset_reloc_control(), which makes fs_info->balance_ctl to be NULL.
    
    Then we continue the error path cleanup in btrfs_balance() by calling
    reset_balance_state() which will call del_balance_item() to fully delete
    the balance item in the root tree.
    
    However during the small window between set_reloc_contrl() and
    unset_reloc_control(), we can have a subvolume tree update and created a
    reloc_root for that subvolume.
    
    Then we go into the final btrfs_commit_transaction() of
    del_balance_item(), and into btrfs_update_reloc_root() inside
    commit_fs_roots().
    
    That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
    stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer
    dereference.
    
    [FIX]
    Just add extra check on fs_info->reloc_ctl inside
    btrfs_update_reloc_root(), before checking
    fs_info->reloc_ctl->merge_reloc_tree.
    
    That DEAD_RELOC_TREE handling is to prevent further modification to the
    reloc tree during merge stage, but since there is no reloc_ctl at all,
    we do not need to bother that.
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Josef Bacik <[email protected]>
    Signed-off-by: Qu Wenruo <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: send: fix invalid clone operation for file that got its size decreased [+ + +]
Author: Filipe Manana <[email protected]>
Date:   Fri Sep 27 10:50:12 2024 +0100

    btrfs: send: fix invalid clone operation for file that got its size decreased
    
    commit fa630df665aa9ddce3a96ce7b54e10a38e4d2a2b upstream.
    
    During an incremental send we may end up sending an invalid clone
    operation, for the last extent of a file which ends at an unaligned offset
    that matches the final i_size of the file in the send snapshot, in case
    the file had its initial size (the size in the parent snapshot) decreased
    in the send snapshot. In this case the destination will fail to apply the
    clone operation because its end offset is not sector size aligned and it
    ends before the current size of the file.
    
    Sending the truncate operation always happens when we finish processing an
    inode, after we process all its extents (and xattrs, names, etc). So fix
    this by ensuring the file has a valid size before we send a clone
    operation for an unaligned extent that ends at the final i_size of the
    file. The size we truncate to matches the start offset of the clone range
    but it could be any value between that start offset and the final size of
    the file since the clone operation will expand the i_size if the current
    size is smaller than the end offset. The start offset of the range was
    chosen because it's always sector size aligned and avoids a truncation
    into the middle of a page, which results in dirtying the page due to
    filling part of it with zeroes and then making the clone operation at the
    receiver trigger IO.
    
    The following test reproduces the issue:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/sdi
      MNT=/mnt/sdi
    
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      # Create a file with a size of 256K + 5 bytes, having two extents, one
      # with a size of 128K and another one with a size of 128K + 5 bytes.
      last_ext_size=$((128 * 1024 + 5))
      xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \
             -c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \
             $MNT/foo
    
      # Another file which we will later clone foo into, but initially with
      # a larger size than foo.
      xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap1
    
      # Now resize bar and clone foo into it.
      xfs_io -c "truncate 0" \
             -c "reflink $MNT/foo" $MNT/bar
    
      btrfs subvolume snapshot -r $MNT/ $MNT/snap2
    
      rm -f /tmp/send-full /tmp/send-inc
      btrfs send -f /tmp/send-full $MNT/snap1
      btrfs send -p $MNT/snap1 -f /tmp/send-inc $MNT/snap2
    
      umount $MNT
      mkfs.btrfs -f $DEV
      mount $DEV $MNT
    
      btrfs receive -f /tmp/send-full $MNT
      btrfs receive -f /tmp/send-inc $MNT
    
      umount $MNT
    
    Running it before this patch:
    
      $ ./test.sh
      (...)
      At subvol snap1
      At snapshot snap2
      ERROR: failed to clone extents to bar: Invalid argument
    
    A test case for fstests will be sent soon.
    
    Reported-by: Ben Millwood <[email protected]>
    Link: https://lore.kernel.org/linux-btrfs/CAJhrHS2z+WViO2h=ojYvBPDLsATwLbg+7JaNCyYomv0fUxEpQQ@mail.gmail.com/
    Fixes: 46a6e10a1ab1 ("btrfs: send: allow cloning non-aligned extent if it ends at i_size")
    CC: [email protected] # 6.11
    Reviewed-by: Qu Wenruo <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: wait for fixup workers before stopping cleaner kthread during umount [+ + +]
Author: Filipe Manana <[email protected]>
Date:   Tue Oct 1 11:06:52 2024 +0100

    btrfs: wait for fixup workers before stopping cleaner kthread during umount
    
    commit 41fd1e94066a815a7ab0a7025359e9b40e4b3576 upstream.
    
    During unmount, at close_ctree(), we have the following steps in this order:
    
    1) Park the cleaner kthread - this doesn't destroy the kthread, it basically
       halts its execution (wake ups against it work but do nothing);
    
    2) We stop the cleaner kthread - this results in freeing the respective
       struct task_struct;
    
    3) We call btrfs_stop_all_workers() which waits for any jobs running in all
       the work queues and then free the work queues.
    
    Syzbot reported a case where a fixup worker resulted in a crash when doing
    a delayed iput on its inode while attempting to wake up the cleaner at
    btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread
    was already freed. This can happen during unmount because we don't wait
    for any fixup workers still running before we call kthread_stop() against
    the cleaner kthread, which stops and free all its resources.
    
    Fix this by waiting for any fixup workers at close_ctree() before we call
    kthread_stop() against the cleaner and run pending delayed iputs.
    
    The stack traces reported by syzbot were the following:
    
      BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
      Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52
    
      CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
      Workqueue: btrfs-fixup btrfs_work_helper
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:94 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0x169/0x550 mm/kasan/report.c:488
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
       lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
       class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline]
       try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154
       btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842
       btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314
       process_one_work kernel/workqueue.c:3229 [inline]
       process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310
       worker_thread+0x870/0xd30 kernel/workqueue.c:3391
       kthread+0x2f0/0x390 kernel/kthread.c:389
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
       </TASK>
    
      Allocated by task 2:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       unpoison_slab_object mm/kasan/common.c:319 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    
      Freed by task 61:
       kasan_save_stack mm/kasan/common.c:47 [inline]
       kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
       kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
       poison_slab_object mm/kasan/common.c:247 [inline]
       __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264
       kasan_slab_free include/linux/kasan.h:230 [inline]
       slab_free_hook mm/slub.c:2343 [inline]
       slab_free mm/slub.c:4580 [inline]
       kmem_cache_free+0x1a2/0x420 mm/slub.c:4682
       put_task_struct include/linux/sched/task.h:144 [inline]
       delayed_put_task_struct+0x125/0x300 kernel/exit.c:228
       rcu_do_batch kernel/rcu/tree.c:2567 [inline]
       rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823
       handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
       __do_softirq kernel/softirq.c:588 [inline]
       invoke_softirq kernel/softirq.c:428 [inline]
       __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
       irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
       instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline]
       sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
    
      Last potentially related work creation:
       kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
       __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
       __call_rcu_common kernel/rcu/tree.c:3086 [inline]
       call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190
       context_switch kernel/sched/core.c:5318 [inline]
       __schedule+0x184b/0x4ae0 kernel/sched/core.c:6675
       schedule_idle+0x56/0x90 kernel/sched/core.c:6793
       do_idle+0x56a/0x5d0 kernel/sched/idle.c:354
       cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
       start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314
       common_startup_64+0x13e/0x147
    
      The buggy address belongs to the object at ffff8880272a8000
       which belongs to the cache task_struct of size 7424
      The buggy address is located 2584 bytes inside of
       freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00)
    
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x272a8
      head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: f5(slab)
      raw: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000
      head: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000
      head: 00fff00000000003 ffffea00009caa01 ffffffffffffffff 0000000000000000
      head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 2, tgid 2 (kthreadd), ts 71247381401, free_ts 71214998153
       set_page_owner include/linux/page_owner.h:32 [inline]
       post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
       prep_new_page mm/page_alloc.c:1545 [inline]
       get_page_from_freelist+0x3039/0x3180 mm/page_alloc.c:3457
       __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4733
       alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
       alloc_slab_page+0x6a/0x120 mm/slub.c:2413
       allocate_slab+0x5a/0x2f0 mm/slub.c:2579
       new_slab mm/slub.c:2632 [inline]
       ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3819
       __slab_alloc+0x58/0xa0 mm/slub.c:3909
       __slab_alloc_node mm/slub.c:3962 [inline]
       slab_alloc_node mm/slub.c:4123 [inline]
       kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4187
       alloc_task_struct_node kernel/fork.c:180 [inline]
       dup_task_struct+0x57/0x8c0 kernel/fork.c:1107
       copy_process+0x5d1/0x3d50 kernel/fork.c:2206
       kernel_clone+0x223/0x880 kernel/fork.c:2787
       kernel_thread+0x1bc/0x240 kernel/fork.c:2849
       create_kthread kernel/kthread.c:412 [inline]
       kthreadd+0x60d/0x810 kernel/kthread.c:765
       ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      page last free pid 5230 tgid 5230 stack trace:
       reset_page_owner include/linux/page_owner.h:25 [inline]
       free_pages_prepare mm/page_alloc.c:1108 [inline]
       free_unref_page+0xcd0/0xf00 mm/page_alloc.c:2638
       discard_slab mm/slub.c:2678 [inline]
       __put_partials+0xeb/0x130 mm/slub.c:3146
       put_cpu_partial+0x17c/0x250 mm/slub.c:3221
       __slab_free+0x2ea/0x3d0 mm/slub.c:4450
       qlink_free mm/kasan/quarantine.c:163 [inline]
       qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179
       kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
       __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329
       kasan_slab_alloc include/linux/kasan.h:247 [inline]
       slab_post_alloc_hook mm/slub.c:4086 [inline]
       slab_alloc_node mm/slub.c:4135 [inline]
       kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4142
       getname_flags+0xb7/0x540 fs/namei.c:139
       do_sys_openat2+0xd2/0x1d0 fs/open.c:1409
       do_sys_open fs/open.c:1430 [inline]
       __do_sys_openat fs/open.c:1446 [inline]
       __se_sys_openat fs/open.c:1441 [inline]
       __x64_sys_openat+0x247/0x2a0 fs/open.c:1441
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
      Memory state around the buggy address:
       ffff8880272a8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8880272a8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
       ffff8880272a8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880272a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
    
    Reported-by: [email protected]
    Link: https://lore.kernel.org/linux-btrfs/[email protected]/
    CC: [email protected] # 4.19+
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
build-id: require program headers to be right after ELF header [+ + +]
Author: Alexey Dobriyan <[email protected]>
Date:   Fri Jun 21 21:39:33 2024 +0300

    build-id: require program headers to be right after ELF header
    
    [ Upstream commit 961a2851324561caed579764ffbee3db82b32829 ]
    
    Neither ELF spec not ELF loader require program header to be placed right
    after ELF header, but build-id code very much assumes such placement:
    
    See
    
            find_get_page(vma->vm_file->f_mapping, 0);
    
    line and checks against PAGE_SIZE.
    
    Returns errors for now until someone rewrites build-id parser
    to be more inline with load_elf_binary().
    
    Link: https://lkml.kernel.org/r/d58bc281-6ca7-467a-9a64-40fa214bd63e@p183
    Signed-off-by: Alexey Dobriyan <[email protected]>
    Reviewed-by: Jiri Olsa <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Stable-dep-of: 905415ff3ffb ("lib/buildid: harden build ID parsing logic")
    Signed-off-by: Sasha Levin <[email protected]>

 
cachefiles: fix dentry leak in cachefiles_open_file() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 29 16:34:09 2024 +0800

    cachefiles: fix dentry leak in cachefiles_open_file()
    
    commit da6ef2dffe6056aad3435e6cf7c6471c2a62187c upstream.
    
    A dentry leak may be caused when a lookup cookie and a cull are concurrent:
    
                P1             |             P2
    -----------------------------------------------------------
    cachefiles_lookup_cookie
      cachefiles_look_up_object
        lookup_one_positive_unlocked
         // get dentry
                                cachefiles_cull
                                  inode->i_flags |= S_KERNEL_FILE;
        cachefiles_open_file
          cachefiles_mark_inode_in_use
            __cachefiles_mark_inode_in_use
              can_use = false
              if (!(inode->i_flags & S_KERNEL_FILE))
                can_use = true
              return false
            return false
            // Returns an error but doesn't put dentry
    
    After that the following WARNING will be triggered when the backend folder
    is umounted:
    
    ==================================================================
    BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img}  still in use (1) [unmount of ext4 sda]
    WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70
    CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25
    RIP: 0010:umount_check+0x5d/0x70
    Call Trace:
     <TASK>
     d_walk+0xda/0x2b0
     do_one_tree+0x20/0x40
     shrink_dcache_for_umount+0x2c/0x90
     generic_shutdown_super+0x20/0x160
     kill_block_super+0x1a/0x40
     ext4_kill_sb+0x22/0x40
     deactivate_locked_super+0x35/0x80
     cleanup_mnt+0x104/0x160
    ==================================================================
    
    Whether cachefiles_open_file() returns true or false, the reference count
    obtained by lookup_positive_unlocked() in cachefiles_look_up_object()
    should be released.
    
    Therefore release that reference count in cachefiles_look_up_object() to
    fix the above issue and simplify the code.
    
    Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: David Howells <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode [+ + +]
Author: Stefan Mätje <[email protected]>
Date:   Thu Aug 8 18:42:24 2024 +0200

    can: netlink: avoid call to do_set_data_bittiming callback with stale can_priv::ctrlmode
    
    [ Upstream commit 2423cc20087ae9a7b7af575aa62304ef67cad7b6 ]
    
    This patch moves the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_changelink in front of the evaluation of data[IFLA_CAN_BITTIMING].
    
    This avoids a call to do_set_data_bittiming providing a stale
    can_priv::ctrlmode with a CAN_CTRLMODE_FD flag not matching the
    requested state when switching between a CAN Classic and CAN-FD bitrate.
    
    In the same manner the evaluation of data[IFLA_CAN_CTRLMODE] in function
    can_validate is also moved in front of the evaluation of
    data[IFLA_CAN_BITTIMING].
    
    This is a preparation for patches where the nominal and data bittiming
    may have interdependencies on the driver side depending on the
    CAN_CTRLMODE_FD flag state.
    
    Signed-off-by: Stefan Mätje <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Marc Kleine-Budde <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ceph: fix a memory leak on cap_auths in MDS client [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Aug 19 10:52:17 2024 +0100

    ceph: fix a memory leak on cap_auths in MDS client
    
    [ Upstream commit d97079e97eab20e08afc507f2bed4501e2824717 ]
    
    The cap_auths that are allocated during an MDS session opening are never
    released, causing a memory leak detected by kmemleak.  Fix this by freeing
    the memory allocated when shutting down the MDS client.
    
    Fixes: 1d17de9534cb ("ceph: save cap_auths in MDS client when session is opened")
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Xiubo Li <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ceph: fix cap ref leak via netfs init_request [+ + +]
Author: Patrick Donnelly <[email protected]>
Date:   Wed Oct 2 21:05:12 2024 -0400

    ceph: fix cap ref leak via netfs init_request
    
    commit ccda9910d8490f4fb067131598e4b2e986faa5a0 upstream.
    
    Log recovered from a user's cluster:
    
        <7>[ 5413.970692] ceph:  get_cap_refs 00000000958c114b ret 1 got Fr
        <7>[ 5413.970695] ceph:  start_read 00000000958c114b, no cache cap
        ...
        <7>[ 5473.934609] ceph:   my wanted = Fr, used = Fr, dirty -
        <7>[ 5473.934616] ceph:  revocation: pAsLsXsFr -> pAsLsXs (revoking Fr)
        <7>[ 5473.934632] ceph:  __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs
        <7>[ 5473.934638] ceph:  check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr  AUTHONLY NOINVAL FLUSH_FORCE
    
    The MDS subsequently complains that the kernel client is late releasing
    caps.
    
    Approximately, a series of changes to this code by commits 49870056005c
    ("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315
    ("netfs: Change ->init_request() to return an error code") and
    a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    resulted in subtle resource cleanup to be missed. The main culprit is
    the change in error handling in 2de160417315 which meant that a failure
    in init_request() would no longer cause cleanup to be called. That
    would prevent the ceph_put_cap_refs() call which would cleanup the
    leaked cap ref.
    
    Cc: [email protected]
    Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
    Link: https://tracker.ceph.com/issues/67008
    Signed-off-by: Patrick Donnelly <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ceph: remove the incorrect Fw reference check when dirtying pages [+ + +]
Author: Xiubo Li <[email protected]>
Date:   Thu Sep 5 06:22:18 2024 +0800

    ceph: remove the incorrect Fw reference check when dirtying pages
    
    [ Upstream commit c08dfb1b49492c09cf13838c71897493ea3b424e ]
    
    When doing the direct-io reads it will also try to mark pages dirty,
    but for the read path it won't hold the Fw caps and there is case
    will it get the Fw reference.
    
    Fixes: 5dda377cf0a6 ("ceph: set i_head_snapc when getting CEPH_CAP_FILE_WR reference")
    Signed-off-by: Xiubo Li <[email protected]>
    Reviewed-by: Patrick Donnelly <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
cifs: Do not convert delimiter when parsing NFS-style symlinks [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Sat Sep 28 23:59:46 2024 +0200

    cifs: Do not convert delimiter when parsing NFS-style symlinks
    
    [ Upstream commit d3a49f60917323228f8fdeee313260ef14f94df7 ]
    
    NFS-style symlinks have target location always stored in NFS/UNIX form
    where backslash means the real UNIX backslash and not the SMB path
    separator.
    
    So do not mangle slash and backslash content of NFS-style symlink during
    readlink() syscall as it is already in the correct Linux form.
    
    This fixes interoperability of NFS-style symlinks with backslashes created
    by Linux NFS3 client throw Windows NFS server and retrieved by Linux SMB
    client throw Windows SMB server, where both Windows servers exports the
    same directory.
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Fix buffer overflow when parsing NFS reparse points [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Sun Sep 29 12:22:40 2024 +0200

    cifs: Fix buffer overflow when parsing NFS reparse points
    
    [ Upstream commit e2a8910af01653c1c268984855629d71fb81f404 ]
    
    ReparseDataLength is sum of the InodeType size and DataBuffer size.
    So to get DataBuffer size it is needed to subtract InodeType's size from
    ReparseDataLength.
    
    Function cifs_strndup_from_utf16() is currentlly accessing buf->DataBuffer
    at position after the end of the buffer because it does not subtract
    InodeType size from the length. Fix this problem and correctly subtract
    variable len.
    
    Member InodeType is present only when reparse buffer is large enough. Check
    for ReparseDataLength before accessing InodeType to prevent another invalid
    memory access.
    
    Major and minor rdev values are present also only when reparse buffer is
    large enough. Check for reparse buffer size before calling reparse_mkdev().
    
    Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points")
    Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Pali Rohár <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

cifs: Remove intermediate object of failed create reparse call [+ + +]
Author: Pali Rohár <[email protected]>
Date:   Mon Sep 30 22:25:10 2024 +0200

    cifs: Remove intermediate object of failed create reparse call
    
    [ Upstream commit c9432ad5e32f066875b1bf95939c363bc46d6a45 ]
    
    If CREATE was successful but SMB2_OP_SET_REPARSE failed then remove the
    intermediate object created by CREATE. Otherwise empty object stay on the
    server when reparse call failed.
    
    This ensures that if the creating of special files is unsupported by the
    server then no empty file stay on the server as a result of unsupported
    operation.
    
    Fixes: 102466f303ff ("smb: client: allow creating special files via reparse points")
    Signed-off-by: Pali Rohár <[email protected]>
    Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL [+ + +]
Author: Ajit Pandey <[email protected]>
Date:   Tue Jun 11 19:07:45 2024 +0530

    clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL
    
    commit fff617979f97c773aaa9432c31cf62444b3bdbd4 upstream.
    
    In LUCID EVO PLL CAL_L_VAL and L_VAL bitfields are part of single
    PLL_L_VAL register. Update for L_VAL bitfield values in PLL_L_VAL
    register using regmap_write() API in __alpha_pll_trion_set_rate
    callback will override LUCID EVO PLL initial configuration related
    to PLL_CAL_L_VAL bit fields in PLL_L_VAL register.
    
    Observed random PLL lock failures during PLL enable due to such
    override in PLL calibration value. Use regmap_update_bits() with
    L_VAL bitfield mask instead of regmap_write() API to update only
    PLL_L_VAL bitfields in __alpha_pll_trion_set_rate callback.
    
    Fixes: 260e36606a03 ("clk: qcom: clk-alpha-pll: add Lucid EVO PLL configuration interfaces")
    Cc: [email protected]
    Signed-off-by: Ajit Pandey <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Acked-by: Vladimir Zapolskiy <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: clk-rpmh: Fix overflow in BCM vote [+ + +]
Author: Mike Tipton <[email protected]>
Date:   Fri Aug 9 10:51:29 2024 +0530

    clk: qcom: clk-rpmh: Fix overflow in BCM vote
    
    commit a4e5af27e6f6a8b0d14bc0d7eb04f4a6c7291586 upstream.
    
    Valid frequencies may result in BCM votes that exceed the max HW value.
    Set vote ceiling to BCM_TCS_CMD_VOTE_MASK to ensure the votes aren't
    truncated, which can result in lower frequencies than desired.
    
    Fixes: 04053f4d23a4 ("clk: qcom: clk-rpmh: Add IPA clock support")
    Cc: [email protected]
    Signed-off-by: Mike Tipton <[email protected]>
    Reviewed-by: Taniya Das <[email protected]>
    Signed-off-by: Imran Shaik <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks [+ + +]
Author: Dmitry Baryshkov <[email protected]>
Date:   Sun Aug 4 08:40:05 2024 +0300

    clk: qcom: dispcc-sm8250: use CLK_SET_RATE_PARENT for branch clocks
    
    commit 0e93c6320ecde0583de09f3fe801ce8822886fec upstream.
    
    Add CLK_SET_RATE_PARENT for several branch clocks. Such clocks don't
    have a way to change the rate, so set the parent rate instead.
    
    Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
    Cc: [email protected]
    Signed-off-by: Dmitry Baryshkov <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Add GPLL9 support [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:03 2024 +0530

    clk: qcom: gcc-sc8180x: Add GPLL9 support
    
    commit 818a2f8d5e4ad2c1e39a4290158fe8e39a744c70 upstream.
    
    Add the missing GPLL9 pll and fix the gcc_parents_7 data to use
    the correct pll hw.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:04 2024 +0530

    clk: qcom: gcc-sc8180x: Fix the sdcc2 and sdcc4 clocks freq table
    
    commit b8acaf2de8081371761ab4cf1e7a8ee4e7acc139 upstream.
    
    Update the frequency tables of gcc_sdcc2_apps_clk and gcc_sdcc4_apps_clk
    as per the latest frequency plan.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:05 2024 +0530

    clk: qcom: gcc-sm8150: De-register gcc_cpuss_ahb_clk_src
    
    commit bab0c7a0bc586e736b7cd2aac8e6391709a70ef2 upstream.
    
    The branch clocks of gcc_cpuss_ahb_clk_src are marked critical
    and hence these clocks vote on XO blocking the suspend.
    De-register these clocks and its source as there is no rate
    setting happening on them.
    
    Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
    Cc: [email protected]
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <[email protected]>
Date:   Fri Jul 19 19:12:38 2024 +0530

    clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit ade508b545c969c72cd68479f275a5dd640fd8b9 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: [email protected] # 5.7
    Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable() [+ + +]
Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Jul 22 16:27:33 2024 +0530

    clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable()
    
    commit 889e1332310656961855c0dcedbb4dbe78e39d22 upstream.
    
    With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
    can happen during scenarios such as system suspend and breaks the resume
    of PCIe controllers from suspend.
    
    So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
    during gdsc_disable() and allow the hardware to transition the GDSCs to
    retention when the parent domain enters low power state during system
    suspend.
    
    Cc: [email protected] # 5.17
    Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450")
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: rockchip: fix error for unknown clocks [+ + +]
Author: Sebastian Reichel <[email protected]>
Date:   Mon Mar 25 20:33:36 2024 +0100

    clk: rockchip: fix error for unknown clocks
    
    commit 12fd64babaca4dc09d072f63eda76ba44119816a upstream.
    
    There is a clk == NULL check after the switch to check for
    unsupported clk types. Since clk is re-assigned in a loop,
    this check is useless right now for anything but the first
    round. Let's fix this up by assigning clk = NULL in the
    loop before the switch statement.
    
    Fixes: a245fecbb806 ("clk: rockchip: add basic infrastructure for clock branches")
    Cc: [email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    [added fixes + stable-cc]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix [+ + +]
Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:47 2024 +0200

    clk: samsung: exynos7885: Update CLKS_NR_FSYS after bindings fix
    
    commit 217a5f23c290c349ceaa37a6f2c014ad4c2d5759 upstream.
    
    Update CLKS_NR_FSYS to the proper value after a fix in DT bindings.
    This should always be the last clock in a CMU + 1.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
close_range(): fix the logics in descriptor table trimming [+ + +]
Author: Al Viro <[email protected]>
Date:   Fri Aug 16 15:17:00 2024 -0400

    close_range(): fix the logics in descriptor table trimming
    
    commit 678379e1d4f7443b170939525d3312cfc37bf86b upstream.
    
    Cloning a descriptor table picks the size that would cover all currently
    opened files.  That's fine for clone() and unshare(), but for close_range()
    there's an additional twist - we clone before we close, and it would be
    a shame to have
            close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
    leave us with a huge descriptor table when we are not going to keep
    anything past stderr, just because some large file descriptor used to
    be open before our call has taken it out.
    
    Unfortunately, it had been dealt with in an inherently racy way -
    sane_fdtable_size() gets a "don't copy anything past that" argument
    (passed via unshare_fd() and dup_fd()), close_range() decides how much
    should be trimmed and passes that to unshare_fd().
    
    The problem is, a range that used to extend to the end of descriptor
    table back when close_range() had looked at it might very well have stuff
    grown after it by the time dup_fd() has allocated a new files_struct
    and started to figure out the capacity of fdtable to be attached to that.
    
    That leads to interesting pathological cases; at the very least it's a
    QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
    a snapshot of descriptor table one might have observed at some point.
    Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
    of unshare(CLONE_FILES) with plain close_range(), ending up with a
    weird state that would never occur with unshare(2) is confusing, to put
    it mildly.
    
    It's not hard to get rid of - all it takes is passing both ends of the
    range down to sane_fdtable_size().  There we are under ->files_lock,
    so the race is trivially avoided.
    
    So we do the following:
            * switch close_files() from calling unshare_fd() to calling
    dup_fd().
            * undo the calling convention change done to unshare_fd() in
    60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
            * introduce struct fd_range, pass a pointer to that to dup_fd()
    and sane_fdtable_size() instead of "trim everything past that point"
    they are currently getting.  NULL means "we are not going to be punching
    any holes"; NR_OPEN_MAX is gone.
            * make sane_fdtable_size() use find_last_bit() instead of
    open-coding it; it's easier to follow that way.
            * while we are at it, have dup_fd() report errors by returning
    ERR_PTR(), no need to use a separate int *errorp argument.
    
    Fixes: 60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
    Cc: [email protected]
    Signed-off-by: Al Viro <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
cpufreq: Avoid a bad reference count on CPU node [+ + +]
Author: Miquel Sabaté Solà <[email protected]>
Date:   Tue Sep 17 15:42:46 2024 +0200

    cpufreq: Avoid a bad reference count on CPU node
    
    commit c0f02536fffbbec71aced36d52a765f8c4493dc2 upstream.
    
    In the parse_perf_domain function, if the call to
    of_parse_phandle_with_args returns an error, then the reference to the
    CPU device node that was acquired at the start of the function would not
    be properly decremented.
    
    Address this by declaring the variable with the __free(device_node)
    cleanup attribute.
    
    Signed-off-by: Miquel Sabaté Solà <[email protected]>
    Acked-by: Viresh Kumar <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock [+ + +]
Author: Uwe Kleine-König <[email protected]>
Date:   Sun Oct 6 22:51:06 2024 +0200

    cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
    
    commit 8b4865cd904650cbed7f2407e653934c621b8127 upstream.
    
    notify_hwp_interrupt() is called via sysvec_thermal() ->
    smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context.
    For this reason it must not use a simple spin_lock that sleeps with
    PREEMPT_RT enabled. So convert it to a raw spinlock.
    
    Reported-by: xiao sheng wen <[email protected]>
    Link: https://bugs.debian.org/1076483
    Signed-off-by: Uwe Kleine-König <[email protected]>
    Acked-by: Srinivas Pandruvada <[email protected]>
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Tested-by: xiao sheng wen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: All applicable <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    [ukleinek: Backport to v6.10.y]
    Signed-off-by: Uwe Kleine-König <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
crypto: hisilicon - fix missed error branch [+ + +]
Author: Yang Shen <[email protected]>
Date:   Sat Aug 31 17:50:07 2024 +0800

    crypto: hisilicon - fix missed error branch
    
    [ Upstream commit f386dc64e1a5d3dcb84579119ec350ab026fea88 ]
    
    If an error occurs in the process after the SGL is mapped
    successfully, it need to unmap the SGL.
    
    Otherwise, memory problems may occur.
    
    Signed-off-by: Yang Shen <[email protected]>
    Signed-off-by: Chenghai Huang <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: octeontx - Fix authenc setkey [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:13:23 2024 +0800

    crypto: octeontx - Fix authenc setkey
    
    [ Upstream commit 311eea7e37c4c0b44b557d0c100860a03b4eab65 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: octeontx* - Select CRYPTO_AUTHENC [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Thu Sep 5 10:21:49 2024 +0800

    crypto: octeontx* - Select CRYPTO_AUTHENC
    
    commit c398cb8eb0a263a1b7a18892d9f244751689675c upstream.
    
    Select CRYPTO_AUTHENC as the function crypto_authenec_extractkeys
    may not be available without it.
    
    Fixes: 311eea7e37c4 ("crypto: octeontx - Fix authenc setkey")
    Fixes: 7ccb750dcac8 ("crypto: octeontx2 - Fix authenc setkey")
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

crypto: octeontx2 - Fix authenc setkey [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 12:36:19 2024 +0800

    crypto: octeontx2 - Fix authenc setkey
    
    [ Upstream commit 7ccb750dcac8abbfc7743aab0db6a72c1c3703c7 ]
    
    Use the generic crypto_authenc_extractkeys helper instead of custom
    parsing code that is slightly broken.  Also fix a number of memory
    leaks by moving memory allocation from setkey to init_tfm (setkey
    can be called multiple times over the life of a tfm).
    
    Finally accept all hash key lengths by running the digest over
    extra-long keys.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: simd - Do not call crypto_alloc_tfm during registration [+ + +]
Author: Herbert Xu <[email protected]>
Date:   Sat Aug 17 14:58:35 2024 +0800

    crypto: simd - Do not call crypto_alloc_tfm during registration
    
    [ Upstream commit 3c44d31cb34ce4eb8311a2e73634d57702948230 ]
    
    Algorithm registration is usually carried out during module init,
    where as little work as possible should be carried out.  The SIMD
    code violated this rule by allocating a tfm, this then triggers a
    full test of the algorithm which may dead-lock in certain cases.
    
    SIMD is only allocating the tfm to get at the alg object, which is
    in fact already available as it is what we are registering.  Use
    that directly and remove the crypto_alloc_tfm call.
    
    Also remove some obsolete and unused SIMD API.
    
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

crypto: x86/sha256 - Add parentheses around macros' single arguments [+ + +]
Author: Fangrui Song <[email protected]>
Date:   Tue Aug 13 21:48:02 2024 -0700

    crypto: x86/sha256 - Add parentheses around macros' single arguments
    
    [ Upstream commit 3363c460ef726ba693704dbcd73b7e7214ccc788 ]
    
    The macros FOUR_ROUNDS_AND_SCHED and DO_4ROUNDS rely on an
    unexpected/undocumented behavior of the GNU assembler, which might
    change in the future
    (https://sourceware.org/bugzilla/show_bug.cgi?id=32073).
    
        M (1) (2) // 1 arg !? Future: 2 args
        M 1 + 2   // 1 arg !? Future: 3 args
    
        M 1 2     // 2 args
    
    Add parentheses around the single arguments to support future GNU
    assembler and LLVM integrated assembler (when the IsOperator hack from
    the following link is dropped).
    
    Link: https://github.com/llvm/llvm-project/commit/055006475e22014b28a070db1bff41ca15f322f0
    Signed-off-by: Fangrui Song <[email protected]>
    Reviewed-by: Jan Beulich <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drivers/perf: arm_spe: Use perf_allow_kernel() for permissions [+ + +]
Author: James Clark <[email protected]>
Date:   Tue Aug 27 15:51:12 2024 +0100

    drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
    
    [ Upstream commit 5e9629d0ae977d6f6916d7e519724804e95f0b07 ]
    
    Use perf_allow_kernel() for 'pa_enable' (physical addresses),
    'pct_enable' (physical timestamps) and context IDs. This means that
    perf_event_paranoid is now taken into account and LSM hooks can be used,
    which is more consistent with other perf_event_open calls. For example
    PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just
    perfmon_capable().
    
    This also indirectly fixes the following error message which is
    misleading because perf_event_paranoid is not taken into account by
    perfmon_capable():
    
      $ perf record -e arm_spe/pa_enable/
    
      Error:
      Access to performance monitoring and observability operations is
      limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid
      setting ...
    
    Suggested-by: Al Grant <[email protected]>
    Signed-off-by: James Clark <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drivers/perf: riscv: Align errno for unsupported perf event [+ + +]
Author: Pu Lehui <[email protected]>
Date:   Sat Aug 31 07:15:20 2024 +0000

    drivers/perf: riscv: Align errno for unsupported perf event
    
    commit c625154993d0d24a962b1830cd5ed92adda2cf86 upstream.
    
    RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It would
    be more appropriate to return -EOPNOTSUPP or -ENOENT for this type in
    pmu_sbi_event_map. Considering that other implementations return -ENOENT
    for unsupported perf types, let's synchronize this behavior. Due to this
    reason, a riscv bpf testcases perf_skip fail. Meanwhile, align that
    behavior to the rest of proper place.
    
    Signed-off-by: Pu Lehui <[email protected]>
    Reviewed-by: Atish Patra <[email protected]>
    Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
    Fixes: 16d3b1af0944 ("perf: RISC-V: Check standard event availability")
    Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/amd/display: Add HDR workaround for specific eDP [+ + +]
Author: Alex Hung <[email protected]>
Date:   Fri Sep 6 11:39:18 2024 -0600

    drm/amd/display: Add HDR workaround for specific eDP
    
    commit 05af800704ee7187d9edd461ec90f3679b1c4aba upstream.
    
    [WHY & HOW]
    Some eDP panels suffer from flicking when HDR is enabled in KDE. This
    quirk works around it by skipping VSC that is incompatible with eDP
    panels.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2) [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Aug 2 12:35:13 2024 +0530

    drm/amd/display: Add null check for 'afb' in amdgpu_dm_plane_handle_cursor_update (v2)
    
    [ Upstream commit cd9e9e0852d501f169aa3bb34e4b413d2eb48c37 ]
    
    This commit adds a null check for the 'afb' variable in the
    amdgpu_dm_plane_handle_cursor_update function. Previously, 'afb' was
    assumed to be null, but was used later in the code without a null check.
    This could potentially lead to a null pointer dereference.
    
    Changes since v1:
    - Moved the null check for 'afb' to the line where 'afb' is used. (Alex)
    
    Fixes the below:
    drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_plane.c:1298 amdgpu_dm_plane_handle_cursor_update() error: we previously assumed 'afb' could be null (see line 1252)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Co-developed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:21:19 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
    
    [ Upstream commit cba7fec864172dadd953daefdd26e01742b71a6a ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
    `dc->clk_mgr->funcs` is null.
    
    The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
    not null before accessing its functions. This prevents a potential null
    pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Mon Jul 22 16:44:40 2024 +0530

    drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
    
    [ Upstream commit c395fd47d1565bd67671f45cca281b3acc2c31ef ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
    null.
    
    The fix adds a check to ensure `dc->clk_mgr` is not null before
    accessing its functions. This prevents a potential null pointer
    dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:09:28 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn20_set_output_transfer_func
    
    [ Upstream commit 62ed6f0f198da04e884062264df308277628004f ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn20_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null at line 1030, but then it
    was being dereferenced without any null check at line 1048. This could
    potentially lead to a null pointer dereference error if set_output_gamma
    is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma at line 1048.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Wed Jul 31 13:15:00 2024 +0530

    drm/amd/display: Add NULL check for function pointer in dcn32_set_output_transfer_func
    
    [ Upstream commit 28574b08c70e56d34d6f6379326a860b96749051 ]
    
    This commit adds a null check for the set_output_gamma function pointer
    in the dcn32_set_output_transfer_func function. Previously,
    set_output_gamma was being checked for null, but then it was being
    dereferenced without any null check. This could lead to a null pointer
    dereference if set_output_gamma is null.
    
    To fix this, we now ensure that set_output_gamma is not null before
    dereferencing it. We do this by adding a null check for set_output_gamma
    before the call to set_output_gamma.
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:18:58 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
    
    [ Upstream commit f22f4754aaa47d8c59f166ba3042182859e5dff7 ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn201_acquire_free_pipe_for_layer` function. The issue could occur
    when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sun Jul 21 19:30:16 2024 +0530

    drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer
    
    [ Upstream commit ac2140449184a26eac99585b7f69814bd3ba8f2d ]
    
    This commit addresses a potential null pointer dereference issue in the
    `dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
    could occur when `head_pipe` is null.
    
    The fix adds a check to ensure `head_pipe` is not null before asserting
    it. If `head_pipe` is null, the function returns NULL to prevent a
    potential null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Thu Jul 25 07:23:48 2024 +0530

    drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
    
    [ Upstream commit 66d71a72539e173a9b00ca0b1852cbaa5f5bf1ad ]
    
    This commit addresses a null pointer dereference issue in the
    `commit_planes_for_stream` function at line 4140. The issue could occur
    when `top_pipe_to_program` is null.
    
    The fix adds a check to ensure `top_pipe_to_program` is not null before
    accessing its stream_res. This prevents a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` [+ + +]
Author: Mario Limonciello <[email protected]>
Date:   Sun Sep 15 14:28:37 2024 -0500

    drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
    
    [ Upstream commit 87d749a6aab73d8069d0345afaa98297816cb220 ]
    
    The issue with panel power savings compatibility below
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` happens at
    `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT` as well.
    
    That issue will be fixed separately, so don't prevent the backlight
    brightness from going that low.
    
    Cc: Harry Wentland <[email protected]>
    Cc: Thomas Weißschuh <[email protected]>
    Link: https://lore.kernel.org/amd-gfx/[email protected]/T/#m400dee4e2fc61fe9470334d20a7c8c89c9aef44f
    Reviewed-by: Harry Wentland <[email protected]>
    Signed-off-by: Mario Limonciello <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Avoid overflow assignment in link_dp_cts [+ + +]
Author: Alex Hung <[email protected]>
Date:   Wed Jul 17 09:17:56 2024 -0600

    drm/amd/display: Avoid overflow assignment in link_dp_cts
    
    [ Upstream commit a15268787b79fd183dd526cc16bec9af4f4e49a1 ]
    
    sampling_rate is an uint8_t but is assigned an unsigned int, and thus it
    can overflow. As a result, sampling_rate is changed to uint32_t.
    
    Similarly, LINK_QUAL_PATTERN_SET has a size of 2 bits, and it should
    only be assigned to a value less or equal than 4.
    
    This fixes 2 INTEGER_OVERFLOW issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Wenjing Liu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check link_res->hpo_dp_link_enc before using it [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 16:45:39 2024 -0600

    drm/amd/display: Check link_res->hpo_dp_link_enc before using it
    
    [ Upstream commit 0beca868cde8742240cd0038141c30482d2b7eb8 ]
    
    [WHAT & HOW]
    Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res
    without initializing hpo_dp_link_enc and it is necessary to check for
    null before dereferencing.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before using dc->clk_mgr [+ + +]
Author: Alex Hung <[email protected]>
Date:   Mon Jul 29 15:29:09 2024 -0600

    drm/amd/display: Check null pointers before using dc->clk_mgr
    
    [ Upstream commit 95d9e0803e51d5a24276b7643b244c7477daf463 ]
    
    [WHY & HOW]
    dc->clk_mgr is null checked previously in the same function, indicating
    it might be null.
    
    Passing "dc" to "dc->hwss.apply_idle_power_optimizations", which
    dereferences null "dc->clk_mgr". (The function pointer resolves to
    "dcn35_apply_idle_power_optimizations".)
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointers before using them [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:38:16 2024 -0600

    drm/amd/display: Check null pointers before using them
    
    [ Upstream commit 1ff12bcd7deaeed25efb5120433c6a45dd5504a8 ]
    
    [WHAT & HOW]
    These pointers are null checked previously in the same function,
    indicating they might be null as reported by Coverity. As a result,
    they need to be checked when used again.
    
    This fixes 3 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null-initialized variables [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 17:34:18 2024 -0600

    drm/amd/display: Check null-initialized variables
    
    [ Upstream commit 367cd9ceba1933b63bc1d87d967baf6d9fd241d2 ]
    
    [WHAT & HOW]
    drr_timing and subvp_pipe are initialized to null and they are not
    always assigned new values. It is necessary to check for null before
    dereferencing.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Nevenko Stupar <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check phantom_stream before it is used [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 20 20:23:41 2024 -0600

    drm/amd/display: Check phantom_stream before it is used
    
    [ Upstream commit 3718a619a8c0a53152e76bb6769b6c414e1e83f4 ]
    
    dcn32_enable_phantom_stream can return null, so returned value
    must be checked before used.
    
    This fixes 1 NULL_RETURNS issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check stream before comparing them [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 20:05:14 2024 -0600

    drm/amd/display: Check stream before comparing them
    
    [ Upstream commit 35ff747c86767937ee1e0ca987545b7eed7a0810 ]
    
    [WHAT & HOW]
    amdgpu_dm can pass a null stream to dc_is_stream_unchanged. It is
    necessary to check for null before dereferencing them.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: enable_hpo_dp_link_output: Check link_res->hpo_dp_link_enc before using it [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 16:45:39 2024 -0600

    drm/amd/display: enable_hpo_dp_link_output: Check link_res->hpo_dp_link_enc before using it
    
    commit d925c04d974c657d10471c0c2dba3bc9c7d994ee upstream.
    
    [WHAT & HOW]
    Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res
    without initializing hpo_dp_link_enc and it is necessary to check for
    null before dereferencing.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Fixes: 0beca868cde8 ("drm/amd/display: Check link_res->hpo_dp_link_enc before using it")
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: fix double free issue during amdgpu module unload [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 15 18:45:22 2024 -0400

    drm/amd/display: fix double free issue during amdgpu module unload
    
    [ Upstream commit 20b5a8f9f4670a8503aa9fa95ca632e77c6bf55d ]
    
    Flexible endpoints use DIGs from available inflexible endpoints,
    so only the encoders of inflexible links need to be freed.
    Otherwise, a double free issue may occur when unloading the
    amdgpu module.
    
    [  279.190523] RIP: 0010:__slab_free+0x152/0x2f0
    [  279.190577] Call Trace:
    [  279.190580]  <TASK>
    [  279.190582]  ? show_regs+0x69/0x80
    [  279.190590]  ? die+0x3b/0x90
    [  279.190595]  ? do_trap+0xc8/0xe0
    [  279.190601]  ? do_error_trap+0x73/0xa0
    [  279.190605]  ? __slab_free+0x152/0x2f0
    [  279.190609]  ? exc_invalid_op+0x56/0x70
    [  279.190616]  ? __slab_free+0x152/0x2f0
    [  279.190642]  ? asm_exc_invalid_op+0x1f/0x30
    [  279.190648]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191096]  ? __slab_free+0x152/0x2f0
    [  279.191102]  ? dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191469]  kfree+0x260/0x2b0
    [  279.191474]  dcn10_link_encoder_destroy+0x19/0x30 [amdgpu]
    [  279.191821]  link_destroy+0xd7/0x130 [amdgpu]
    [  279.192248]  dc_destruct+0x90/0x270 [amdgpu]
    [  279.192666]  dc_destroy+0x19/0x40 [amdgpu]
    [  279.193020]  amdgpu_dm_fini+0x16e/0x200 [amdgpu]
    [  279.193432]  dm_hw_fini+0x26/0x40 [amdgpu]
    [  279.193795]  amdgpu_device_fini_hw+0x24c/0x400 [amdgpu]
    [  279.194108]  amdgpu_driver_unload_kms+0x4f/0x70 [amdgpu]
    [  279.194436]  amdgpu_pci_remove+0x40/0x80 [amdgpu]
    [  279.194632]  pci_device_remove+0x3a/0xa0
    [  279.194638]  device_remove+0x40/0x70
    [  279.194642]  device_release_driver_internal+0x1ad/0x210
    [  279.194647]  driver_detach+0x4e/0xa0
    [  279.194650]  bus_remove_driver+0x6f/0xf0
    [  279.194653]  driver_unregister+0x33/0x60
    [  279.194657]  pci_unregister_driver+0x44/0x90
    [  279.194662]  amdgpu_exit+0x19/0x1f0 [amdgpu]
    [  279.194939]  __do_sys_delete_module.isra.0+0x198/0x2f0
    [  279.194946]  __x64_sys_delete_module+0x16/0x20
    [  279.194950]  do_syscall_64+0x58/0x120
    [  279.194954]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
    [  279.194980]  </TASK>
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Roman Li <[email protected]>
    Signed-off-by: Roman Li <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in DCN30 color transformation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:05:20 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 color transformation
    
    [ Upstream commit d81873f9e715b72d4f8d391c8eb243946f784dfc ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
    management module. The issue could occur when the index 'i' exceeds the
    number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 18:44:02 2024 +0530

    drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
    
    [ Upstream commit bc50b614d59990747dd5aeced9ec22f9258991ff ]
    
    This commit addresses a potential index out of bounds issue in the
    `cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
    color  management module. The issue could occur when the index 'i'
    exceeds the  number of transfer function points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds, the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix index out of bounds in degamma hardware format translation [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Sat Jul 20 17:48:27 2024 +0530

    drm/amd/display: Fix index out of bounds in degamma hardware format translation
    
    [ Upstream commit b7e99058eb2e86aabd7a10761e76cae33d22b49f ]
    
    Fixes index out of bounds issue in
    `cm_helper_translate_curve_to_degamma_hw_format` function. The issue
    could occur when the index 'i' exceeds the number of transfer function
    points (TRANSFER_FUNC_POINTS).
    
    The fix adds a check to ensure 'i' is within bounds before accessing the
    transfer function points. If 'i' is out of bounds the function returns
    false to indicate an error.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
    drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Fix system hang while resume with TBT monitor [+ + +]
Author: Tom Chung <[email protected]>
Date:   Fri Sep 13 15:44:40 2024 +0800

    drm/amd/display: Fix system hang while resume with TBT monitor
    
    commit 52d4e3fb3d340447dcdac0e14ff21a764f326907 upstream.
    
    [Why]
    Connected with a Thunderbolt monitor and do the suspend and the system
    may hang while resume.
    
    The TBT monitor HPD will be triggered during the resume procedure
    and call the drm_client_modeset_probe() while
    struct drm_connector connector->dev->master is NULL.
    
    It will mess up the pipe topology after resume.
    
    [How]
    Skip the TBT monitor HPD during the resume procedure because we
    currently will probe the connectors after resume by default.
    
    Reviewed-by: Wayne Lin <[email protected]>
    Signed-off-by: Tom Chung <[email protected]>
    Signed-off-by: Fangzhi Zuo <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream' [+ + +]
Author: Srinivasan Shanmugam <[email protected]>
Date:   Fri Jul 26 19:31:55 2024 +0530

    drm/amd/display: Handle null 'stream_status' in 'planes_changed_for_existing_stream'
    
    [ Upstream commit 8141f21b941710ecebe49220b69822cab3abd23d ]
    
    This commit adds a null check for 'stream_status' in the function
    'planes_changed_for_existing_stream'. Previously, the code assumed
    'stream_status' could be null, but did not handle the case where it was
    actually null. This could lead to a null pointer dereference.
    
    Reported by smatch:
    drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:3784 planes_changed_for_existing_stream() error: we previously assumed 'stream_status' could be null (see line 3774)
    
    Cc: Tom Chung <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Roman Li <[email protected]>
    Cc: Alex Hung <[email protected]>
    Cc: Aurabindo Pillai <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Hamza Mahfooz <[email protected]>
    Signed-off-by: Srinivasan Shanmugam <[email protected]>
    Reviewed-by: Tom Chung <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: handle nulled pipe context in DCE110's set_drr() [+ + +]
Author: Tobias Jakobi <[email protected]>
Date:   Mon Sep 16 14:54:05 2024 +0200

    drm/amd/display: handle nulled pipe context in DCE110's set_drr()
    
    [ Upstream commit e7d4e1438533abe448813bdc45691f9c230aa307 ]
    
    As set_drr() is called from IRQ context, it can happen that the
    pipe context has been nulled by dc_state_destruct().
    
    Apply the same protection here that is already present for
    dcn35_set_drr() and dcn10_set_drr(). I.e. fetch the tg pointer
    first (to avoid a race with dc_state_destruct()), and then
    check the local copy before using it.
    
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
    Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
    Acked-by: Alex Deucher <[email protected]>
    Signed-off-by: Tobias Jakobi <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Initialize get_bytes_per_element's default to 1 [+ + +]
Author: Alex Hung <[email protected]>
Date:   Mon Jul 15 09:57:01 2024 -0600

    drm/amd/display: Initialize get_bytes_per_element's default to 1
    
    [ Upstream commit 4067f4fa0423a89fb19a30b57231b384d77d2610 ]
    
    Variables, used as denominators and maybe not assigned to other values,
    should not be 0. bytes_per_element_y & bytes_per_element_c are
    initialized by get_bytes_per_element() which should never return 0.
    
    This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
    
    Signed-off-by: Alex Hung <[email protected]>
    Reviewed-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags [+ + +]
Author: Alex Hung <[email protected]>
Date:   Thu Jun 27 11:51:27 2024 -0600

    drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags
    
    [ Upstream commit 5559598742fb4538e4c51c48ef70563c49c2af23 ]
    
    [WHAT & HOW]
    "dcn20_validate_apply_pipe_split_flags" dereferences merge, and thus it
    cannot be a null pointer. Let's pass a valid pointer to avoid null
    dereference.
    
    This fixes 2 FORWARD_NULL issues reported by Coverity.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Revert Avoid overflow assignment [+ + +]
Author: Gabe Teeger <[email protected]>
Date:   Thu Jul 25 18:42:21 2024 -0400

    drm/amd/display: Revert Avoid overflow assignment
    
    commit e80f8f491df873ea2e07c941c747831234814612 upstream.
    
    This reverts commit a15268787b79 ("drm/amd/display: Avoid overflow assignment in link_dp_cts")
    Due to regression causing DPMS hang.
    
    Reviewed-by: Alex Hung <[email protected]>
    Signed-off-by: Gabe Teeger <[email protected]>
    Signed-off-by: Wayne Lin <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35 [+ + +]
Author: Yihan Zhu <[email protected]>
Date:   Sat Sep 7 13:25:19 2024 -0400

    drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35
    
    commit 0d5e5e8a0aa49ea2163abf128da3b509a6c58286 upstream.
    
    [WHY & HOW]
    Mismatch in DCN35 DML2 cause bw validation failed to acquire unexpected DPP pipe to cause
    grey screen and system hang. Remove EnhancedPrefetchScheduleAccelerationFinal value override
    to match HW spec.
    
    Cc: Mario Limonciello <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Reviewed-by: Charlene Liu <[email protected]>
    Signed-off-by: Yihan Zhu <[email protected]>
    Signed-off-by: Aurabindo Pillai <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    (cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba)
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces [+ + +]
Author: Nicholas Kazlauskas <[email protected]>
Date:   Thu Jul 18 11:53:31 2024 -0400

    drm/amd/display: Use gpuvm_min_page_size_kbytes for DML2 surfaces
    
    [ Upstream commit 31663521ede2edb622ee1b397ae3ac666d6351c5 ]
    
    [Why]
    It's currently hard coded to 256 when it should be using the SOC
    provided values. This can result in corruption with linear surfaces
    where we prefetch more PTE than the buffer can hold.
    
    [How]
    Update the min page size correctly for the plane.
    
    Signed-off-by: Nicholas Kazlauskas <[email protected]>
    Reviewed-by: Jun Lei <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amd/pm: ensure the fw_info is not null before using it [+ + +]
Author: Tim Huang <[email protected]>
Date:   Wed Aug 7 17:15:12 2024 +0800

    drm/amd/pm: ensure the fw_info is not null before using it
    
    [ Upstream commit 186fb12e7a7b038c2710ceb2fb74068f1b5d55a4 ]
    
    This resolves the dereference null return value warning
    reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx10: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:34 2024 -0400

    drm/amdgpu/gfx10: use rlc safe mode for soft recovery
    
    [ Upstream commit ead60e9c4e29c8574cae1be4fe3af1d9a978fb0f ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Fri Jul 12 15:36:19 2024 -0400

    drm/amdgpu/gfx11: enter safe mode before touching CP_INT_CNTL
    
    [ Upstream commit b5be054c585110b2c5c1b180136800e8c41c7bb4 ]
    
    Need to enter safe mode before touching GC MMIO.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx11: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:23 2024 -0400

    drm/amdgpu/gfx11: use rlc safe mode for soft recovery
    
    [ Upstream commit 3f2d35c325534c1b7ac5072173f0dc7ca969dec2 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu/gfx9: properly handle error ints on all pipes [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Tue Jul 2 10:24:59 2024 -0400

    drm/amdgpu/gfx9: properly handle error ints on all pipes
    
    [ Upstream commit 48695573d2feaf42812c1ad54e01caff0d1c2d71 ]
    
    Need to handle the interrupt enables for all pipes.
    
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu/gfx9: use rlc safe mode for soft recovery [+ + +]
Author: Alex Deucher <[email protected]>
Date:   Wed Jul 24 18:20:57 2024 -0400

    drm/amdgpu/gfx9: use rlc safe mode for soft recovery
    
    [ Upstream commit 3ec2ad7c34c412bd9264cd1ff235d0812be90e82 ]
    
    Protect the MMIO access with safe mode.
    
    Acked-by: Vitaly Prosyak <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdgpu: add list empty check to avoid null pointer issue [+ + +]
Author: Yang Wang <[email protected]>
Date:   Wed Aug 21 14:42:41 2024 +0800

    drm/amdgpu: add list empty check to avoid null pointer issue
    
    [ Upstream commit 4416377ae1fdc41a90b665943152ccd7ff61d3c5 ]
    
    Add list empty check to avoid null pointer issues in some corner cases.
    - list_for_each_entry_safe()
    
    Signed-off-by: Yang Wang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: add raven1 gfxoff quirk [+ + +]
Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:25:54 2024 +0800

    drm/amdgpu: add raven1 gfxoff quirk
    
    [ Upstream commit 0126c0ae11e8b52ecfde9d1b174ee2f32d6c3a5d ]
    
    Fix screen corruption with openkylin.
    
    Link: https://bbs.openkylin.top/t/topic/171497
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Block MMR_READ IOCTL in reset [+ + +]
Author: Victor Skvortsov <[email protected]>
Date:   Thu Aug 8 13:40:23 2024 -0400

    drm/amdgpu: Block MMR_READ IOCTL in reset
    
    [ Upstream commit 9e823f307074c0f82b5f6044943b0086e3079bed ]
    
    Register access from userspace should be blocked until
    reset is complete.
    
    Signed-off-by: Victor Skvortsov <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit [+ + +]
Author: Pierre-Eric Pelloux-Prayer <[email protected]>
Date:   Tue Jul 2 11:54:30 2024 +0200

    drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit
    
    [ Upstream commit fec5f8e8c6bcf83ed7a392801d7b44c5ecfc1e82 ]
    
    Before this commit, only submits with both a BO_HANDLES chunk and a
    'bo_list_handle' would be rejected (by amdgpu_cs_parser_bos).
    
    But if UMD sent multiple BO_HANDLES, what would happen is:
    * only the last one would be really used
    * all the others would leak memory as amdgpu_cs_p1_bo_handles would
      overwrite the previous p->bo_list value
    
    This commit rejects submissions with multiple BO_HANDLES chunks to
    match the implementation of the parser.
    
    Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: enable gfxoff quirk on HP 705G4 [+ + +]
Author: Peng Liu <[email protected]>
Date:   Fri Aug 30 15:27:08 2024 +0800

    drm/amdgpu: enable gfxoff quirk on HP 705G4
    
    [ Upstream commit 2c7795e245d993bcba2f716a8c93a5891ef910c9 ]
    
    Enabling gfxoff quirk results in perfectly usable
    graphical user interface on HP 705G4 DM with R5 2400G.
    
    Without the quirk, X server is completely unusable as
    every few seconds there is gpu reset due to ring gfx timeout.
    
    Signed-off-by: Peng Liu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: Fix get each xcp macro [+ + +]
Author: Asad Kamal <[email protected]>
Date:   Mon Jul 22 19:45:11 2024 +0800

    drm/amdgpu: Fix get each xcp macro
    
    [ Upstream commit ef126c06a98bde1a41303970eb0fc0ac33c3cc02 ]
    
    Fix get each xcp macro to loop over each partition correctly
    
    Fixes: 4bdca2057933 ("drm/amdgpu: Add utility functions for xcp")
    Signed-off-by: Asad Kamal <[email protected]>
    Reviewed-by: Lijo Lazar <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_atombios [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 13:47:55 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_atombios
    
    [ Upstream commit 92549780e32718d64a6d08bbbb3c6fffecb541c7 ]
    
    This resolves the unchecded return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdgpu: fix unchecked return value warning for amdgpu_gfx [+ + +]
Author: Tim Huang <[email protected]>
Date:   Thu Aug 1 10:38:37 2024 +0800

    drm/amdgpu: fix unchecked return value warning for amdgpu_gfx
    
    [ Upstream commit c0277b9d7c2ee9ee5dbc948548984f0fbb861301 ]
    
    This resolves the unchecded return value warning reported by Coverity.
    
    Signed-off-by: Tim Huang <[email protected]>
    Reviewed-by: Jesse Zhang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer [+ + +]
Author: Philip Yang <[email protected]>
Date:   Sun Jul 14 11:11:05 2024 -0400

    drm/amdkfd: amdkfd_free_gtt_mem clear the correct pointer
    
    [ Upstream commit c86ad39140bbcb9dc75a10046c2221f657e8083b ]
    
    Pass pointer reference to amdgpu_bo_unref to clear the correct pointer,
    otherwise amdgpu_bo_unref clear the local variable, the original pointer
    not set to NULL, this could cause use-after-free bug.
    
    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Felix Kuehling <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: Check int source id for utcl2 poison event [+ + +]
Author: Hawking Zhang <[email protected]>
Date:   Tue Aug 20 13:56:32 2024 +0800

    drm/amdkfd: Check int source id for utcl2 poison event
    
    [ Upstream commit db6341a9168d2a24ded526277eeab29724d76e9d ]
    
    Traditional utcl2 fault_status polling does not
    work in SRIOV environment. The polling of fault
    status register from guest side will be dropped
    by hardware.
    
    Driver should switch to check utcl2 interrupt
    source id to identify utcl2 poison event. It is
    set to 1 when poisoned data interrupts are
    signaled.
    
    v2: drop the unused local variable (Tao)
    
    Signed-off-by: Hawking Zhang <[email protected]>
    Reviewed-by: Tao Zhou <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amdkfd: Fix resource leak in criu restore queue [+ + +]
Author: Jesse Zhang <[email protected]>
Date:   Fri Sep 6 11:29:55 2024 +0800

    drm/amdkfd: Fix resource leak in criu restore queue
    
    [ Upstream commit aa47fe8d3595365a935921a90d00bc33ee374728 ]
    
    To avoid memory leaks, release q_extra_data when exiting the restore queue.
    v2: Correct the proto (Alex)
    
    Signed-off-by: Jesse Zhang <[email protected]>
    Reviewed-by: Tim Huang <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/i915/dp: Fix AUX IO power enabling for eDP PSR [+ + +]
Author: Imre Deak <[email protected]>
Date:   Tue Sep 10 14:18:47 2024 +0300

    drm/i915/dp: Fix AUX IO power enabling for eDP PSR
    
    [ Upstream commit ec2231b8dd2dc515912ff7816c420153b4a95e92 ]
    
    Panel Self Refresh on eDP requires the AUX IO power to be enabled
    whenever the output (main link) is enabled. This is required by the
    AUX_PHY_WAKE/ML_PHY_LOCK signaling initiated by the HW automatically to
    re-enable the main link after it got disabled in power saving states
    (see eDP v1.4b, sections 5.1, 6.1.3.3.1.1).
    
    The Panel Replay mode on non-eDP outputs on the other hand is only
    supported by keeping the main link active, thus not requiring the above
    AUX_PHY_WAKE/ML_PHY_LOCK signaling (eDP v1.4b, section 6.1.3.3.1.2).
    Thus enabling the AUX IO power for this case is not required either.
    
    Based on the above enable the AUX IO power only for eDP/PSR outputs.
    
    Bspec: 49274, 53370
    
    v2:
    - Add a TODO comment to adjust the requirement for AUX IO based on
      whether the ALPM/main-link off mode gets enabled. (Rodrigo)
    
    Cc: Animesh Manna <[email protected]>
    Fixes: b8cf5b5d266e ("drm/i915/panelreplay: Initializaton and compute config for panel replay")
    Reviewed-by: Rodrigo Vivi <[email protected]>
    Signed-off-by: Imre Deak <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit f7c2ed9d4ce80a2570c492825de239dc8b500f2e)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/dp: Fix colorimetry detection [+ + +]
Author: Ville Syrjälä <[email protected]>
Date:   Wed Sep 18 22:04:39 2024 +0300

    drm/i915/dp: Fix colorimetry detection
    
    [ Upstream commit e860513f56d8428fcb2bd0282ac8ab691a53fc6c ]
    
    intel_dp_init_connector() is no place for detecting stuff via
    DPCD (except perhaps for eDP). Move the colorimetry stuff into
    a more appropriate place.
    
    Cc: Jouni Högander <[email protected]>
    Fixes: 00076671a648 ("drm/i915/display: Move colorimetry_support from intel_psr to intel_dp")
    Signed-off-by: Ville Syrjälä <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Reviewed-by: Jouni Högander <[email protected]>
    (cherry picked from commit 35dba4834bded843d5416e8caadfe82bd0ce1904)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/i915/gem: fix bitwise and logical AND mixup [+ + +]
Author: Jani Nikula <[email protected]>
Date:   Wed Sep 18 20:35:43 2024 +0300

    drm/i915/gem: fix bitwise and logical AND mixup
    
    commit 394b52462020b6cceff1f7f47fdebd03589574f3 upstream.
    
    CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND is an int, defaulting to 250. When
    the wakeref is non-zero, it's either -1 or a dynamically allocated
    pointer, depending on CONFIG_DRM_I915_DEBUG_RUNTIME_PM. It's likely that
    the code works by coincidence with the bitwise AND, but with
    CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y, there's the off chance that the
    condition evaluates to false, and intel_wakeref_auto() doesn't get
    called. Switch to the intended logical AND.
    
    v2: Use != to avoid clang -Wconstant-logical-operand (Nathan)
    
    Fixes: ad74457a6b5a ("drm/i915/dgfx: Release mmap on rpm suspend")
    Cc: Matthew Auld <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Anshuman Gupta <[email protected]>
    Cc: Andi Shyti <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: [email protected] # v6.1+
    Reviewed-by: Matthew Auld <[email protected]>
    Reviewed-by: Andi Shyti <[email protected]> # v1
    Link: https://patchwork.freedesktop.org/patch/msgid/643cc0a4d12f47fd8403d42581e83b1e9c4543c7.1726680898.git.jani.nikula@intel.com
    Signed-off-by: Jani Nikula <[email protected]>
    (cherry picked from commit 4c1bfe259ed1d2ade826f95d437e1c41b274df04)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/mediatek: ovl_adaptor: Add missing of_node_put() [+ + +]
Author: Javier Carrasco <[email protected]>
Date:   Mon Jun 24 18:43:47 2024 +0200

    drm/mediatek: ovl_adaptor: Add missing of_node_put()
    
    commit 5beb6fba25db235b52eab34bde8112f07bb31d75 upstream.
    
    Error paths that exit for_each_child_of_node() need to call
    of_node_put() to decerement the child refcount and avoid memory leaks.
    
    Add the missing of_node_put().
    
    Cc: [email protected]
    Fixes: 453c3364632a ("drm/mediatek: Add ovl_adaptor support for MT8195")
    Signed-off-by: Javier Carrasco <[email protected]>
    Reviewed-by: CK Hu <[email protected]>
    Link: https://patchwork.kernel.org/project/dri-devel/patch/[email protected]/
    Signed-off-by: Chun-Kuang Hu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs [+ + +]
Author: Konrad Dybcio <[email protected]>
Date:   Tue Jul 9 13:15:40 2024 +0200

    drm/msm/adreno: Assign msm_gpu->pdev earlier to avoid nullptrs
    
    [ Upstream commit 16007768551d5bfe53426645401435ca8d2ef54f ]
    
    There are some cases, such as the one uncovered by Commit 46d4efcccc68
    ("drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails")
    where
    
    msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);
    
    is called on gpu->pdev == NULL, as the GPU device has not been fully
    initialized yet.
    
    Turns out that there's more than just the aforementioned path that
    causes this to happen (e.g. the case when there's speedbin data in the
    catalog, but opp-supported-hw is missing in DT).
    
    Assigning msm_gpu->pdev earlier seems like the least painful solution
    to this, therefore do so.
    
    Signed-off-by: Konrad Dybcio <[email protected]>
    Patchwork: https://patchwork.freedesktop.org/patch/602742/
    Signed-off-by: Rob Clark <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/panthor: Don't add write fences to the shared BOs [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:01:54 2024 +0200

    drm/panthor: Don't add write fences to the shared BOs
    
    commit f9e7ac6e2e9986c2ee63224992cb5c8276e46b2a upstream.
    
    The only user (the mesa gallium driver) is already assuming explicit
    synchronization and doing the export/import dance on shared BOs. The
    only reason we were registering ourselves as writers on external BOs
    is because Xe, which was the reference back when we developed Panthor,
    was doing so. Turns out Xe was wrong, and we really want bookkeep on
    all registered fences, so userspace can explicitly upgrade those to
    read/write when needed.
    
    Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block")
    Cc: Matthew Brost <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Don't declare a queue blocked if deferred operations are pending [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Thu Sep 5 09:19:14 2024 +0200

    drm/panthor: Don't declare a queue blocked if deferred operations are pending
    
    commit 7a1f30afe97294281a2ba05977688385744f9844 upstream.
    
    If deferred operations are pending, we want to wait for those to
    land before declaring the queue blocked on a SYNC_WAIT. We need
    this to deal with the case where the sync object is signalled through
    a deferred SYNC_{ADD,SET} from the same queue. If we don't do that
    and the group gets scheduled out before the deferred SYNC_{SET,ADD}
    is executed, we'll end up with a timeout, because no external
    SYNC_{SET,ADD} will make the scheduler reconsider the group for
    execution.
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Cc: <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup() [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Mon Sep 30 18:37:42 2024 +0200

    drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup()
    
    commit 282864cc5d3f144af0cdea1868ee2dc2c5110f0d upstream.
    
    The group variable can't be used to retrieve ptdev in our second loop,
    because it points to the previously iterated list_head, not a valid
    group. Get the ptdev object from the scheduler instead.
    
    Cc: <[email protected]>
    Fixes: d72f049087d4 ("drm/panthor: Allow driver compilation")
    Reported-by: kernel test robot <[email protected]>
    Reported-by: Julia Lawall <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]/
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/panthor: Fix race when converting group handle to group object [+ + +]
Author: Steven Price <[email protected]>
Date:   Mon Sep 23 11:34:06 2024 +0100

    drm/panthor: Fix race when converting group handle to group object
    
    [ Upstream commit cac075706f298948898b1f63e81709df42afa75d ]
    
    XArray provides it's own internal lock which protects the internal array
    when entries are being simultaneously added and removed. However there
    is still a race between retrieving the pointer from the XArray and
    incrementing the reference count.
    
    To avoid this race simply hold the internal XArray lock when
    incrementing the reference count, this ensures there cannot be a racing
    call to xa_erase().
    
    Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block")
    Signed-off-by: Steven Price <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc() [+ + +]
Author: Boris Brezillon <[email protected]>
Date:   Fri Sep 13 13:27:22 2024 +0200

    drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc()
    
    [ Upstream commit fa998a9eac8809da4f219aad49836fcad2a9bf5c ]
    
    drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our
    pre-allocated BO if the <BO,VM> association exists. Given we
    only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will
    be called immediately, and we have to hold the VM resv lock when
    calling this function.
    
    Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block")
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Liviu Dudau <[email protected]>
    Reviewed-by: Steven Price <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/printer: Allow NULL data in devcoredump printer [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Thu Aug 1 08:41:17 2024 -0700

    drm/printer: Allow NULL data in devcoredump printer
    
    [ Upstream commit 53369581dc0c68a5700ed51e1660f44c4b2bb524 ]
    
    We want to determine the size of the devcoredump before writing it out.
    To that end, we will run the devcoredump printer with NULL data to get
    the size, alloc data based on the generated offset, then run the
    devcorecump again with a valid data pointer to print.  This necessitates
    not writing data to the data pointer on the initial pass, when it is
    NULL.
    
    v5:
     - Better commit message (Jonathan)
     - Add kerenl doc with examples (Jani)
    
    Cc: Maarten Lankhorst <[email protected]>
    Acked-by: Maarten Lankhorst <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/radeon/r100: Handle unknown family in r100_cp_init_microcode() [+ + +]
Author: Geert Uytterhoeven <[email protected]>
Date:   Tue Jul 30 17:58:12 2024 +0200

    drm/radeon/r100: Handle unknown family in r100_cp_init_microcode()
    
    [ Upstream commit c6dbab46324b1742b50dc2fb5c1fee2c28129439 ]
    
    With -Werror:
    
        In function ‘r100_cp_init_microcode’,
            inlined from ‘r100_cp_init’ at drivers/gpu/drm/radeon/r100.c:1136:7:
        include/linux/printk.h:465:44: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
          465 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
              |                                            ^
        include/linux/printk.h:437:17: note: in definition of macro ‘printk_index_wrap’
          437 |                 _p_func(_fmt, ##__VA_ARGS__);                           \
              |                 ^~~~~~~
        include/linux/printk.h:508:9: note: in expansion of macro ‘printk’
          508 |         printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
              |         ^~~~~~
        drivers/gpu/drm/radeon/r100.c:1062:17: note: in expansion of macro ‘pr_err’
         1062 |                 pr_err("radeon_cp: Failed to load firmware \"%s\"\n", fw_name);
              |                 ^~~~~~
    
    Fix this by converting the if/else if/... construct into a proper
    switch() statement with a default to handle the error case.
    
    As a bonus, the generated code is ca. 100 bytes smaller (with gcc 11.4.0
    targeting arm32).
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/rockchip: vop: clear DMA stop bit on RK3066 [+ + +]
Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:48 2024 -0300

    drm/rockchip: vop: clear DMA stop bit on RK3066
    
    commit 6b44aa559d6c7f4ea591ef9d2352a7250138d62a upstream.
    
    The RK3066 VOP sets a dma_stop bit when it's done scanning out a frame
    and needs the driver to acknowledge that by clearing the bit.
    
    Unless we clear it "between" frames, the RGB output only shows noise
    instead of the picture. atomic_flush is the place for it that least
    affects other code (doing it on vblank would require converting all
    other usages of the reg_lock to spin_(un)lock_irq, which would affect
    performance for everyone).
    
    This seems to be a redundant synchronization mechanism that was removed
    in later iterations of the VOP hardware block.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066 [+ + +]
Author: Val Packett <[email protected]>
Date:   Mon Jun 24 17:40:49 2024 -0300

    drm/rockchip: vop: enable VOP_FEATURE_INTERNAL_RGB on RK3066
    
    commit 6ed51ba95e27221ce87979bd2ad5926033b9e1b9 upstream.
    
    The RK3066 does have RGB display output, so it should be marked as such.
    
    Fixes: f4a6de855eae ("drm: rockchip: vop: add rk3066 vop definitions")
    Cc: [email protected]
    Signed-off-by: Val Packett <[email protected]>
    Signed-off-by: Heiko Stuebner <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/sched: Add locking to drm_sched_entity_modify_sched [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Fri Sep 13 17:05:52 2024 +0100

    drm/sched: Add locking to drm_sched_entity_modify_sched
    
    commit 4286cc2c953983d44d248c9de1c81d3a9643345c upstream.
    
    Without the locking amdgpu currently can race between
    amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
    drm_sched_job_arm(), leading to the latter accesing potentially
    inconsitent entity->sched_list and entity->num_sched_list pair.
    
    v2:
     * Improve commit message. (Philipp)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: Philipp Stanner <[email protected]>
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always increment correct scheduler score [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:09 2024 +0100

    drm/sched: Always increment correct scheduler score
    
    commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5 upstream.
    
    Entities run queue can change during drm_sched_entity_push_job() so make
    sure to update the score consistently.
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues")
    Cc: Nirmoy Das <[email protected]>
    Cc: Christian König <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.9+
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Sep 24 11:19:08 2024 +0100

    drm/sched: Always wake up correct scheduler in drm_sched_entity_push_job
    
    commit cbc8764e29c2318229261a679b2aafd0f9072885 upstream.
    
    Since drm_sched_entity_modify_sched() can modify the entities run queue,
    lets make sure to only dereference the pointer once so both adding and
    waking up are guaranteed to be consistent.
    
    Alternative of moving the spin_unlock to after the wake up would for now
    be more problematic since the same lock is taken inside
    drm_sched_rq_update_fifo().
    
    v2:
     * Improve commit message. (Philipp)
     * Cache the scheduler pointer directly. (Christian)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list")
    Cc: Christian König <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: Luben Tuikov <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Philipp Stanner <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.7+
    Reviewed-by: Christian König <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Christian König <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: Fix dynamic job-flow control race [+ + +]
Author: Rob Clark <[email protected]>
Date:   Fri Sep 13 13:23:01 2024 -0700

    drm/sched: Fix dynamic job-flow control race
    
    commit 440d52b370b03b366fd26ace36bab20552116145 upstream.
    
    Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
    
    The whole premise of lockless access to a single-producer-single-
    consumer queue is that there is just a single producer and single
    consumer.  That means we can't call drm_sched_can_queue() (which is
    about queueing more work to the hw, not to the spsc queue) from
    anywhere other than the consumer (wq).
    
    This call in the producer is just an optimization to avoid scheduling
    the consuming worker if it cannot yet queue more work to the hw.  It
    is safe to drop this optimization to avoid the race condition.
    
    Suggested-by: Asahi Lina <[email protected]>
    Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
    Closes: https://github.com/AsahiLinux/linux/issues/309
    Cc: [email protected]
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Danilo Krummrich <[email protected]>
    Tested-by: Janne Grunau <[email protected]>
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/sched: revert "Always increment correct scheduler score" [+ + +]
Author: Christian König <[email protected]>
Date:   Mon Sep 30 15:07:49 2024 +0200

    drm/sched: revert "Always increment correct scheduler score"
    
    commit abf201f6ce14c4ceeccde5471bdf59614b83a3d8 upstream.
    
    This reverts commit 087913e0ba2b3b9d7ccbafb2acf5dab9e35ae1d5.
    
    It turned out that the original code was correct since the rq can only
    change when there is no armed job for an entity.
    
    This change here broke the logic since we only incremented the counter
    for the first job, so revert it.
    
    Signed-off-by: Christian König <[email protected]>
    Acked-by: Tvrtko Ursulin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/stm: Avoid use-after-free issues with crtc and plane [+ + +]
Author: Katya Orlova <[email protected]>
Date:   Fri Feb 16 15:50:40 2024 +0300

    drm/stm: Avoid use-after-free issues with crtc and plane
    
    [ Upstream commit 19dd9780b7ac673be95bf6fd6892a184c9db611f ]
    
    ltdc_load() calls functions drm_crtc_init_with_planes(),
    drm_universal_plane_init() and drm_encoder_init(). These functions
    should not be called with parameters allocated with devm_kzalloc()
    to avoid use-after-free issues [1].
    
    Use allocations managed by the DRM framework.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    [1]
    https://lore.kernel.org/lkml/u366i76e3qhh3ra5oxrtngjtm2u5lterkekcz6y2jkndhuxzli@diujon4h7qwb/
    
    Signed-off-by: Katya Orlova <[email protected]>
    Acked-by: Raphaël Gallais-Pou <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/stm: ltdc: reset plane transparency after plane disable [+ + +]
Author: Yannick Fertre <[email protected]>
Date:   Fri Jul 12 15:13:44 2024 +0200

    drm/stm: ltdc: reset plane transparency after plane disable
    
    [ Upstream commit 02fa62d41c8abff945bae5bfc3ddcf4721496aca ]
    
    The plane's opacity should be reseted while the plane
    is disabled. It prevents from seeing a possible global
    or layer background color set earlier.
    
    Signed-off-by: Yannick Fertre <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Raphael Gallais-Pou <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/v3d: Prevent out of bounds access in performance query extensions [+ + +]
Author: Tvrtko Ursulin <[email protected]>
Date:   Thu Jul 11 14:53:30 2024 +0100

    drm/v3d: Prevent out of bounds access in performance query extensions
    
    commit f32b5128d2c440368b5bf3a7a356823e235caabb upstream.
    
    Check that the number of perfmons userspace is passing in the copy and
    reset extensions is not greater than the internal kernel storage where
    the ids will be copied into.
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: bae7cb5d6800 ("drm/v3d: Create a CPU job extension for the reset performance query job")
    Cc: Maíra Canal <[email protected]>
    Cc: Iago Toral Quiroga <[email protected]>
    Cc: [email protected] # v6.8+
    Reviewed-by: Iago Toral Quiroga <[email protected]>
    Reviewed-by: Maíra Canal <[email protected]>
    Signed-off-by: Maíra Canal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
drm/xe/hdcp: Check GSC structure validity [+ + +]
Author: Suraj Kandpal <[email protected]>
Date:   Mon Jul 22 12:14:51 2024 +0530

    drm/xe/hdcp: Check GSC structure validity
    
    [ Upstream commit b4224f6bae3801d589f815672ec62800a1501b0d ]
    
    Sometimes xe_gsc is not initialized when checked at HDCP capability
    check. Add gsc structure check to avoid null pointer error.
    
    Signed-off-by: Suraj Kandpal <[email protected]>
    Reviewed-by: Dnyaneshwar Bhadane <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
drm/xe: Delete unused GuC submission_state.suspend [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Wed Apr 24 22:47:47 2024 -0700

    drm/xe: Delete unused GuC submission_state.suspend
    
    [ Upstream commit 3f371a98deada9aee53d908c9aa53f6cdcb1300b ]
    
    GuC submission_state.suspend is unused, delete it.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Himal Prasad Ghimiray <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Stable-dep-of: 2d2be279f1ca ("drm/xe: fix UAF around queue destruction")
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Tue Aug 20 10:29:55 2024 -0700

    drm/xe: Drop warn on xe_guc_pc_gucrc_disable in guc pc fini
    
    [ Upstream commit a323782567812ee925e9b7926445532c7afe331b ]
    
    Not a big deal if CT is down as driver is unloading, no need to warn.
    
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Jagmeet Randhawa <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Fix memory leak on xe_alloc_pf_queue failure [+ + +]
Author: Nirmoy Das <[email protected]>
Date:   Mon Aug 26 18:20:35 2024 +0200

    drm/xe: Fix memory leak on xe_alloc_pf_queue failure
    
    [ Upstream commit c5f728de696caa35481fd84202dfbc9fecc18e0b ]
    
    Simplify memory unwinding on error also fixing current memory
    leak that can happen on error.
    
    v2: use devm_kcalloc(Matt A)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Cc: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Auld <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Nirmoy Das <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: fix UAF around queue destruction [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Mon Sep 23 15:56:48 2024 +0100

    drm/xe: fix UAF around queue destruction
    
    [ Upstream commit 2d2be279f1ca9e7288282d4214f16eea8a727cdb ]
    
    We currently do stuff like queuing the final destruction step on a
    random system wq, which will outlive the driver instance. With bad
    timing we can teardown the driver with one or more work workqueue still
    being alive leading to various UAF splats. Add a fini step to ensure
    user queues are properly torn down. At this point GuC should already be
    nuked so queue itself should no longer be referenced from hw pov.
    
    v2 (Matt B)
     - Looks much safer to use a waitqueue and then just wait for the
       xa_array to become empty before triggering the drain.
    
    Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2317
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Cc: <[email protected]> # v6.8+
    Reviewed-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: fixup xe_alloc_pf_queue [+ + +]
Author: Matthew Auld <[email protected]>
Date:   Wed Aug 21 18:19:18 2024 +0100

    drm/xe: fixup xe_alloc_pf_queue
    
    [ Upstream commit 321d6b4b9cbe3dd0bc99937d5e5b4d730b5b5798 ]
    
    kzalloc expects number of bytes, therefore we should convert the number
    of dw into bytes, otherwise we are likely just accessing beyond the
    array causing all kinds of carnage. Also fixup the error handling while
    we are here.
    
    v2:
     - Prefer kcalloc (dim)
    
    Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
    Signed-off-by: Matthew Auld <[email protected]>
    Cc: Stuart Summers <[email protected]>
    Cc: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Prevent null pointer access in xe_migrate_copy [+ + +]
Author: Zhanjun Dong <[email protected]>
Date:   Fri Sep 27 09:13:08 2024 -0700

    drm/xe: Prevent null pointer access in xe_migrate_copy
    
    [ Upstream commit 7257d9c9a3c6cfe26c428e9b7ae21d61f2f55a79 ]
    
    xe_migrate_copy designed to copy content of TTM resources. When source
    resource is null, it will trigger a NULL pointer dereference in
    xe_migrate_copy. To avoid this situation, update lacks source flag to
    true for this case, the flag will trigger xe_migrate_clear rather than
    xe_migrate_copy.
    
    Issue trace:
    <7> [317.089847] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 14,
     sizes: 4194304 & 4194304
    <7> [317.089945] xe 0000:00:02.0: [drm:xe_migrate_copy [xe]] Pass 15,
     sizes: 4194304 & 4194304
    <1> [317.128055] BUG: kernel NULL pointer dereference, address:
     0000000000000010
    <1> [317.128064] #PF: supervisor read access in kernel mode
    <1> [317.128066] #PF: error_code(0x0000) - not-present page
    <6> [317.128069] PGD 0 P4D 0
    <4> [317.128071] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
    <4> [317.128074] CPU: 1 UID: 0 PID: 1440 Comm: kunit_try_catch Tainted:
     G     U           N 6.11.0-rc7-xe #1
    <4> [317.128078] Tainted: [U]=USER, [N]=TEST
    <4> [317.128080] Hardware name: Intel Corporation Lunar Lake Client
     Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3221.D80.2407291239 07/29/2024
    <4> [317.128082] RIP: 0010:xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128158] Code: 00 00 48 89 8d e0 fe ff ff 48 8b 40 10 4c 89 85 c8
     fe ff ff 44 88 8d bd fe ff ff 65 48 8b 3c 25 28 00 00 00 48 89 7d d0 31
     ff <8b> 79 10 48 89 85 a0 fe ff ff 48 8b 00 48 89 b5 d8 fe ff ff 83 ff
    <4> [317.128162] RSP: 0018:ffffc9000167f9f0 EFLAGS: 00010246
    <4> [317.128164] RAX: ffff8881120d8028 RBX: ffff88814d070428 RCX:
     0000000000000000
    <4> [317.128166] RDX: ffff88813cb99c00 RSI: 0000000004000000 RDI:
     0000000000000000
    <4> [317.128168] RBP: ffffc9000167fbb8 R08: ffff88814e7b1f08 R09:
     0000000000000001
    <4> [317.128170] R10: 0000000000000001 R11: 0000000000000001 R12:
     ffff88814e7b1f08
    <4> [317.128172] R13: ffff88814e7b1f08 R14: ffff88813cb99c00 R15:
     0000000000000001
    <4> [317.128174] FS:  0000000000000000(0000) GS:ffff88846f280000(0000)
     knlGS:0000000000000000
    <4> [317.128176] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    <4> [317.128178] CR2: 0000000000000010 CR3: 000000011f676004 CR4:
     0000000000770ef0
    <4> [317.128180] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
     0000000000000000
    <4> [317.128182] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7:
     0000000000000400
    <4> [317.128184] PKRU: 55555554
    <4> [317.128185] Call Trace:
    <4> [317.128187]  <TASK>
    <4> [317.128189]  ? show_regs+0x67/0x70
    <4> [317.128194]  ? __die_body+0x20/0x70
    <4> [317.128196]  ? __die+0x2b/0x40
    <4> [317.128198]  ? page_fault_oops+0x15f/0x4e0
    <4> [317.128203]  ? do_user_addr_fault+0x3fb/0x970
    <4> [317.128205]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128209]  ? exc_page_fault+0x87/0x2b0
    <4> [317.128212]  ? asm_exc_page_fault+0x27/0x30
    <4> [317.128216]  ? xe_migrate_copy+0x66/0x13e0 [xe]
    <4> [317.128263]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128265]  ? __lock_acquire+0xb9d/0x26f0
    <4> [317.128267]  ? sg_free_append_table+0x20/0x80
    <4> [317.128271]  ? lock_acquire+0xc7/0x2e0
    <4> [317.128273]  ? mark_held_locks+0x4d/0x80
    <4> [317.128275]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128278]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128281]  ? __pm_runtime_resume+0x60/0xa0
    <4> [317.128284]  xe_bo_move+0x682/0xc50 [xe]
    <4> [317.128315]  ? lock_is_held_type+0xaa/0x120
    <4> [317.128318]  ttm_bo_handle_move_mem+0xe5/0x1a0 [ttm]
    <4> [317.128324]  ttm_bo_validate+0xd1/0x1a0 [ttm]
    <4> [317.128328]  shrink_test_run_device+0x721/0xc10 [xe]
    <4> [317.128360]  ? find_held_lock+0x31/0x90
    <4> [317.128363]  ? lock_release+0xd1/0x2a0
    <4> [317.128365]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
     [kunit]
    <4> [317.128370]  xe_bo_shrink_kunit+0x11/0x20 [xe]
    <4> [317.128397]  kunit_try_run_case+0x6e/0x150 [kunit]
    <4> [317.128400]  ? trace_hardirqs_on+0x1e/0xd0
    <4> [317.128402]  ? _raw_spin_unlock_irqrestore+0x31/0x60
    <4> [317.128404]  kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
    <4> [317.128407]  kthread+0xf5/0x130
    <4> [317.128410]  ? __pfx_kthread+0x10/0x10
    <4> [317.128412]  ret_from_fork+0x39/0x60
    <4> [317.128415]  ? __pfx_kthread+0x10/0x10
    <4> [317.128416]  ret_from_fork_asm+0x1a/0x30
    <4> [317.128420]  </TASK>
    
    Fixes: 266c85885263 ("drm/xe/xe2: Handle flat ccs move for igfx.")
    Signed-off-by: Zhanjun Dong <[email protected]>
    Reviewed-by: Thomas Hellström <[email protected]>
    Signed-off-by: Matt Roper <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 59a1c9c7e1d02b43b415ea92627ce095b7c79e47)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Restore pci state upon resume [+ + +]
Author: Rodrigo Vivi <[email protected]>
Date:   Thu Sep 12 17:45:07 2024 -0400

    drm/xe: Restore pci state upon resume
    
    [ Upstream commit cffa8e83df9fe525afad1e1099097413f9174f57 ]
    
    The pci state was saved, but not restored. Restore
    right after the power state transition request like
    every other driver.
    
    v2: Use right fixes tag, since this was there initialy, but
        accidentally removed.
    
    Fixes: f6761c68c0ac ("drm/xe/display: Improve s2idle handling.")
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Lucas De Marchi <[email protected]>
    Reviewed-by: Jonathan Cavitt <[email protected]>
    Signed-off-by: Rodrigo Vivi <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Maarten Lankhorst <[email protected]>
    (cherry picked from commit ec2d1539e159f53eae708e194c449cfefa004994)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Resume TDR after GT reset [+ + +]
Author: Matthew Brost <[email protected]>
Date:   Wed Jul 24 16:59:19 2024 -0700

    drm/xe: Resume TDR after GT reset
    
    [ Upstream commit 1b30f87e088b499eb74298db256da5c98e8276e2 ]
    
    Not starting the TDR after GT reset on exec queue which have been
    restarted can lead to jobs being able to be run forever. Fix this by
    restarting the TDR.
    
    Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
    Signed-off-by: Matthew Brost <[email protected]>
    Reviewed-by: Nirmoy Das <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit 8ec5a4e5ce97d6ee9f5eb5b4ce4cfc831976fdec)
    Signed-off-by: Lucas De Marchi <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/xe: Use topology to determine page fault queue size [+ + +]
Author: Stuart Summers <[email protected]>
Date:   Sat Aug 17 02:47:31 2024 +0000

    drm/xe: Use topology to determine page fault queue size
    
    [ Upstream commit 3338e4f90c143cf32f77d64f464cb7f2c2d24700 ]
    
    Currently the page fault queue size is hard coded. However
    the hardware supports faulting for each EU and each CS.
    For some applications running on hardware with a large
    number of EUs and CSs, this can result in an overflow of
    the page fault queue.
    
    Add a small calculation to determine the page fault queue
    size based on the number of EUs and CSs in the platform as
    detmined by fuses.
    
    Signed-off-by: Stuart Summers <[email protected]>
    Reviewed-by: Matthew Brost <[email protected]>
    Signed-off-by: Matthew Brost <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
    Signed-off-by: Sasha Levin <[email protected]>

 
drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS [+ + +]
Author: Thomas Zimmermann <[email protected]>
Date:   Mon Sep 23 09:58:14 2024 +0200

    drm: Consistently use struct drm_mode_rect for FB_DAMAGE_CLIPS
    
    commit 8b0d2f61545545ab5eef923ed6e59fc3be2385e0 upstream.
    
    FB_DAMAGE_CLIPS is a plane property for damage handling. Its UAPI
    should only use UAPI types. Hence replace struct drm_rect with
    struct drm_mode_rect in drm_atomic_plane_set_property(). Both types
    are identical in practice, so there's no change in behavior.
    
    Reported-by: Ville Syrjälä <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/[email protected]/
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: d3b21767821e ("drm: Add a new plane property to send damage during plane update")
    Cc: Lukasz Spintzyk <[email protected]>
    Cc: Deepak Rawat <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Thomas Hellstrom <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Simona Vetter <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # v5.0+
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm: omapdrm: Add missing check for alloc_ordered_workqueue [+ + +]
Author: Ma Ke <[email protected]>
Date:   Thu Aug 8 14:13:36 2024 +0800

    drm: omapdrm: Add missing check for alloc_ordered_workqueue
    
    commit e794b7b9b92977365c693760a259f8eef940c536 upstream.
    
    As it may return NULL pointer and cause NULL pointer dereference. Add check
    for the return value of alloc_ordered_workqueue.
    
    Cc: [email protected]
    Fixes: 2f95bc6d324a ("drm: omapdrm: Perform initialization/cleanup at probe/remove time")
    Signed-off-by: Ma Ke <[email protected]>
    Signed-off-by: Tomi Valkeinen <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
dt-bindings: clock: exynos7885: Fix duplicated binding [+ + +]
Author: David Virag <[email protected]>
Date:   Tue Aug 6 14:11:44 2024 +0200

    dt-bindings: clock: exynos7885: Fix duplicated binding
    
    commit abf3a3ea9acb5c886c8729191a670744ecd42024 upstream.
    
    The numbering in Exynos7885's FSYS CMU bindings has 4 duplicated by
    accident, with the rest of the bindings continuing with 5.
    
    Fix this by moving CLK_MOUT_FSYS_USB30DRD_USER to the end as 11.
    
    Since CLK_MOUT_FSYS_USB30DRD_USER is not used in any device tree as of
    now, and there are no other clocks affected (maybe apart from
    CLK_MOUT_FSYS_MMC_SDIO_USER which the number was shared with, also not
    used in a device tree), this is the least impactful way to solve this
    problem.
    
    Fixes: cd268e309c29 ("dt-bindings: clock: Add bindings for Exynos7885 CMU_FSYS")
    Cc: [email protected]
    Signed-off-by: David Virag <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x [+ + +]
Author: Satya Priya Kakitapalli <[email protected]>
Date:   Mon Aug 12 10:43:02 2024 +0530

    dt-bindings: clock: qcom: Add GPLL9 support on gcc-sc8180x
    
    commit 648b4bde0aca2980ebc0b90cdfbb80d222370c3d upstream.
    
    Add the missing GPLL9 which is required for the gcc sdcc2 clock.
    
    Fixes: 0fadcdfdcf57 ("dt-bindings: clock: Add SC8180x GCC binding")
    Cc: [email protected]
    Acked-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Satya Priya Kakitapalli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems [+ + +]
Author: Ravikanth Tuniki <[email protected]>
Date:   Tue Oct 1 00:43:35 2024 +0530

    dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems
    
    [ Upstream commit c6929644c1e0d6108e57061d427eb966e1746351 ]
    
    Add missing reg minItems as based on current binding document
    only ethernet MAC IO space is a supported configuration.
    
    There is a bug in schema, current examples contain 64-bit
    addressing as well as 32-bit addressing. The schema validation
    does pass incidentally considering one 64-bit reg address as
    two 32-bit reg address entries. If we change axi_ethernet_eth1
    example node reg addressing to 32-bit schema validation reports:
    
    Documentation/devicetree/bindings/net/xlnx,axi-ethernet.example.dtb:
    ethernet@40000000: reg: [[1073741824, 262144]] is too short
    
    To fix it add missing reg minItems constraints and to make things clearer
    stick to 32-bit addressing in examples.
    
    Fixes: cbb1ca6d5f9a ("dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yaml")
    Signed-off-by: Ravikanth Tuniki <[email protected]>
    Signed-off-by: Radhey Shyam Pandey <[email protected]>
    Acked-by: Conor Dooley <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
e1000e: avoid failing the system during pm_suspend [+ + +]
Author: Vitaly Lifshits <[email protected]>
Date:   Tue Aug 6 16:23:48 2024 +0300

    e1000e: avoid failing the system during pm_suspend
    
    [ Upstream commit 0a6ad4d9e1690c7faa3a53f762c877e477093657 ]
    
    Occasionally when the system goes into pm_suspend, the suspend might fail
    due to a PHY access error on the network adapter. Previously, this would
    have caused the whole system to fail to go to a low power state.
    An example of this was reported in the following Bugzilla:
    https://bugzilla.kernel.org/show_bug.cgi?id=205015
    
    [ 1663.694828] e1000e 0000:00:19.0 eth0: Failed to disable ULP
    [ 1664.731040] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1
    [ 1665.093513] e1000e 0000:00:19.0 eth0: Hardware Error
    [ 1665.596760] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2975399 usecs
    
    and then the system never recovers from it, and all the following suspend failed due to this
    [22909.393854] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x760 [e1000e] returns -2
    [22909.393858] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -2
    [22909.393861] PM: Device 0000:00:1f.6 failed to suspend async: error -2
    
    This can be avoided by changing the return values of __e1000_shutdown and
    e1000e_pm_suspend functions so that they always return 0 (success). This
    is consistent with what other drivers do.
    
    If the e1000e driver encounters a hardware error during suspend, potential
    side effects include slightly higher power draw or non-working wake on
    LAN. This is preferred to a system-level suspend failure, and a warning
    message is written to the system log, so that the user can be aware that
    the LAN controller experienced a problem during suspend.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=205015
    Suggested-by: Dima Ruinskiy <[email protected]>
    Signed-off-by: Vitaly Lifshits <[email protected]>
    Tested-by: Mor Bar-Gabay <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
EINJ, CXL: Fix CXL device SBDF calculation [+ + +]
Author: Ben Cheatham <[email protected]>
Date:   Fri Sep 27 11:34:28 2024 -0500

    EINJ, CXL: Fix CXL device SBDF calculation
    
    [ Upstream commit ee1e3c46ed19c096be22472c728fa7f68b1352c4 ]
    
    The SBDF of the target CXL 2.0 compliant root port is required to inject a CXL
    protocol error as per ACPI 6.5. The SBDF given has to be in the
    following format:
    
    31     24 23    16 15    11 10      8  7        0
    +-------------------------------------------------+
    | segment |   bus  | device | function | reserved |
    +-------------------------------------------------+
    
    The SBDF calculated in cxl_dport_get_sbdf() doesn't account for
    the reserved bits currently, causing the wrong SBDF to be used.
    Fix said calculation to properly shift the SBDF.
    
    Without this fix, error injection into CXL 2.0 root ports through the
    CXL debugfs interface (<debugfs>/cxl) is broken. Injection
    through the legacy interface (<debugfs>/apei/einj/) will still work
    because the SBDF is manually provided by the user.
    
    Fixes: 12fb28ea6b1cf ("EINJ: Add CXL error type support")
    Signed-off-by: Ben Cheatham <[email protected]>
    Reviewed-by: Dan Williams <[email protected]>
    Tested-by: Srinivasulu Thanneeru <[email protected]>
    Reviewed-by: Srinivasulu Thanneeru <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Ira Weiny <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
exfat: fix memory leak in exfat_load_bitmap() [+ + +]
Author: Yuezhang Mo <[email protected]>
Date:   Tue Sep 3 15:01:09 2024 +0800

    exfat: fix memory leak in exfat_load_bitmap()
    
    commit d2b537b3e533f28e0d97293fe9293161fe8cd137 upstream.
    
    If the first directory entry in the root directory is not a bitmap
    directory entry, 'bh' will not be released and reassigned, which
    will cause a memory leak.
    
    Fixes: 1e49a94cf707 ("exfat: add bitmap operations")
    Cc: [email protected]
    Signed-off-by: Yuezhang Mo <[email protected]>
    Reviewed-by: Aoyama Wataru <[email protected]>
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ext4: aovid use-after-free in ext4_ext_insert_extent() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:26 2024 +0800

    ext4: aovid use-after-free in ext4_ext_insert_extent()
    
    commit a164f3a432aae62ca23d03e6d926b122ee5b860d upstream.
    
    As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is
    reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and
    cause UAF. Below is a sample trace with dummy values:
    
    ext4_ext_insert_extent
      path = *ppath = 2000
      ext4_ext_create_new_leaf(ppath)
        ext4_find_extent(ppath)
          path = *ppath = 2000
          if (depth > path[0].p_maxdepth)
                kfree(path = 2000);
                *ppath = path = NULL;
          path = kcalloc() = 3000
          *ppath = 3000;
          return path;
      /* here path is still 2000, UAF! */
      eh = path[depth].p_hdr
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330
    Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179
    CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866
    Call Trace:
     <TASK>
     ext4_ext_insert_extent+0x26d4/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
    [...]
    
    Allocated by task 179:
     ext4_find_extent+0x81c/0x1f70
     ext4_ext_map_blocks+0x146/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    
    Freed by task 179:
     kfree+0xcb/0x240
     ext4_find_extent+0x7c0/0x1f70
     ext4_ext_insert_extent+0xa26/0x3330
     ext4_ext_map_blocks+0xe22/0x2d40
     ext4_map_blocks+0x71e/0x1700
     ext4_do_writepages+0x1290/0x2800
     ext4_writepages+0x26d/0x4e0
     do_writepages+0x175/0x700
    [...]
    ==================================================================
    
    So use *ppath to update the path to avoid the above problem.
    
    Reported-by: Ojaswin Mujoo <[email protected]>
    Closes: https://lore.kernel.org/r/[email protected]
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: avoid use-after-free in ext4_ext_show_leaf() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:24 2024 +0800

    ext4: avoid use-after-free in ext4_ext_show_leaf()
    
    [ Upstream commit 4e2524ba2ca5f54bdbb9e5153bea00421ef653f5 ]
    
    In ext4_find_extent(), path may be freed by error or be reallocated, so
    using a previously saved *ppath may have been freed and thus may trigger
    use-after-free, as follows:
    
    ext4_split_extent
      path = *ppath;
      ext4_split_extent_at(ppath)
      path = ext4_find_extent(ppath)
      ext4_split_extent_at(ppath)
        // ext4_find_extent fails to free path
        // but zeroout succeeds
      ext4_ext_show_leaf(inode, path)
        eh = path[depth].p_hdr
        // path use-after-free !!!
    
    Similar to ext4_split_extent_at(), we use *ppath directly as an input to
    ext4_ext_show_leaf(). Fix a spelling error by the way.
    
    Same problem in ext4_ext_handle_unwritten_extents(). Since 'path' is only
    used in ext4_ext_show_leaf(), remove 'path' and use *ppath directly.
    
    This issue is triggered only when EXT_DEBUG is defined and therefore does
    not affect functionality.
    
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: correct encrypted dentry name hash when not casefolded [+ + +]
Author: yao.ly <[email protected]>
Date:   Mon Jul 1 14:43:39 2024 +0800

    ext4: correct encrypted dentry name hash when not casefolded
    
    commit 70dd7b573afeba9b8f8a33f2ae1e4a9a2ec8c1ec upstream.
    
    EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH will access struct
    ext4_dir_entry_hash followed ext4_dir_entry. But there is no ext4_dir_entry_hash
    followed when inode is encrypted and not casefolded
    
    Signed-off-by: yao.ly <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: dax: fix overflowing extents beyond inode size when partially writing [+ + +]
Author: Zhihao Cheng <[email protected]>
Date:   Fri Aug 9 20:15:32 2024 +0800

    ext4: dax: fix overflowing extents beyond inode size when partially writing
    
    commit dda898d7ffe85931f9cca6d702a51f33717c501e upstream.
    
    The dax_iomap_rw() does two things in each iteration: map written blocks
    and copy user data to blocks. If the process is killed by user(See signal
    handling in dax_iomap_iter()), the copied data will be returned and added
    on inode size, which means that the length of written extents may exceed
    the inode size, then fsck will fail. An example is given as:
    
    dd if=/dev/urandom of=file bs=4M count=1
     dax_iomap_rw
      iomap_iter // round 1
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 0~2M extents(written flag)
      dax_iomap_iter // copy 2M data
      iomap_iter // round 2
       iomap_iter_advance
        iter->pos += iter->processed // iter->pos = 2M
       ext4_iomap_begin
        ext4_iomap_alloc // allocate 2~4M extents(written flag)
      dax_iomap_iter
       fatal_signal_pending
      done = iter->pos - iocb->ki_pos // done = 2M
     ext4_handle_inode_extension
      ext4_update_inode_size // inode size = 2M
    
    fsck reports: Inode 13, i_size is 2097152, should be 4194304.  Fix?
    
    Fix the problem by truncating extents if the written length is smaller
    than expected.
    
    Fixes: 776722e85d3b ("ext4: DAX iomap write support")
    CC: [email protected]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219136
    Signed-off-by: Zhihao Cheng <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhihao Cheng <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:27 2024 +0800

    ext4: drop ppath from ext4_ext_replay_update_ex() to avoid double-free
    
    commit 5c0f4cc84d3a601c99bc5e6e6eb1cbda542cce95 upstream.
    
    When calling ext4_force_split_extent_at() in ext4_ext_replay_update_ex(),
    the 'ppath' is updated but it is the 'path' that is freed, thus potentially
    triggering a double-free in the following process:
    
    ext4_ext_replay_update_ex
      ppath = path
      ext4_force_split_extent_at(&ppath)
        ext4_split_extent_at
          ext4_ext_insert_extent
            ext4_ext_create_new_leaf
              ext4_ext_grow_indepth
                ext4_find_extent
                  if (depth > path[0].p_maxdepth)
                    kfree(path)                 ---> path First freed
                    *orig_path = path = NULL    ---> null ppath
      kfree(path)                               ---> path double-free !!!
    
    So drop the unnecessary ppath and use path directly to avoid this problem.
    And use ext4_find_extent() directly to update path, avoiding unnecessary
    memory allocation and freeing. Also, propagate the error returned by
    ext4_find_extent() instead of using strange error codes.
    
    Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: ext4_search_dir should return a proper error [+ + +]
Author: Thadeu Lima de Souza Cascardo <[email protected]>
Date:   Wed Aug 21 12:23:21 2024 -0300

    ext4: ext4_search_dir should return a proper error
    
    [ Upstream commit cd69f8f9de280e331c9e6ff689ced0a688a9ce8f ]
    
    ext4_search_dir currently returns -1 in case of a failure, while it returns
    0 when the name is not found. In such failure cases, it should return an
    error code instead.
    
    This becomes even more important when ext4_find_inline_entry returns an
    error code as well in the next commit.
    
    -EFSCORRUPTED seems appropriate as such error code as these failures would
    be caused by unexpected record lengths and is in line with other instances
    of ext4_check_dir_entry failures.
    
    In the case of ext4_dx_find_entry, the current use of ERR_BAD_DX_DIR was
    left as is to reduce the risk of regressions.
    
    Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix access to uninitialised lock in fc replay path [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Thu Jul 18 10:43:56 2024 +0100

    ext4: fix access to uninitialised lock in fc replay path
    
    commit 23dfdb56581ad92a9967bcd720c8c23356af74c1 upstream.
    
    The following kernel trace can be triggered with fstest generic/629 when
    executed against a filesystem with fast-commit feature enabled:
    
    INFO: trying to register non-static key.
    The code is fine but needs lockdep annotation, or maybe
    you didn't initialize this object before use?
    turning off the locking correctness validator.
    CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x66/0x90
     register_lock_class+0x759/0x7d0
     __lock_acquire+0x85/0x2630
     ? __find_get_block+0xb4/0x380
     lock_acquire+0xd1/0x2d0
     ? __ext4_journal_get_write_access+0xd5/0x160
     _raw_spin_lock+0x33/0x40
     ? __ext4_journal_get_write_access+0xd5/0x160
     __ext4_journal_get_write_access+0xd5/0x160
     ext4_reserve_inode_write+0x61/0xb0
     __ext4_mark_inode_dirty+0x79/0x270
     ? ext4_ext_replay_set_iblocks+0x2f8/0x450
     ext4_ext_replay_set_iblocks+0x330/0x450
     ext4_fc_replay+0x14c8/0x1540
     ? jread+0x88/0x2e0
     ? rcu_is_watching+0x11/0x40
     do_one_pass+0x447/0xd00
     jbd2_journal_recover+0x139/0x1b0
     jbd2_journal_load+0x96/0x390
     ext4_load_and_init_journal+0x253/0xd40
     ext4_fill_super+0x2cc6/0x3180
    ...
    
    In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in
    function ext4_check_bdev_write_error().  Unfortunately, at this point this
    spinlock has not been initialized yet.  Moving it's initialization to an
    earlier point in __ext4_fill_super() fixes this splat.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix double brelse() the buffer of the extents path [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:28 2024 +0800

    ext4: fix double brelse() the buffer of the extents path
    
    commit dcaa6c31134c0f515600111c38ed7750003e1b9c upstream.
    
    In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been
    released, otherwise it may be released twice. An example of what triggers
    this is as follows:
    
      split2    map    split1
    |--------|-------|--------|
    
    ext4_ext_map_blocks
     ext4_ext_handle_unwritten_extents
      ext4_split_convert_extents
       // path->p_depth == 0
       ext4_split_extent
         // 1. do split1
         ext4_split_extent_at
           |ext4_ext_insert_extent
           |  ext4_ext_create_new_leaf
           |    ext4_ext_grow_indepth
           |      le16_add_cpu(&neh->eh_depth, 1)
           |    ext4_find_extent
           |      // return -ENOMEM
           |// get error and try zeroout
           |path = ext4_find_extent
           |  path->p_depth = 1
           |ext4_ext_try_to_merge
           |  ext4_ext_try_to_merge_up
           |    path->p_depth = 0
           |    brelse(path[1].p_bh)  ---> not set to NULL here
           |// zeroout success
         // 2. update path
         ext4_find_extent
         // 3. do split2
         ext4_split_extent_at
           ext4_ext_insert_extent
             ext4_ext_create_new_leaf
               ext4_ext_grow_indepth
                 le16_add_cpu(&neh->eh_depth, 1)
               ext4_find_extent
                 path[0].p_bh = NULL;
                 path->p_depth = 1
                 read_extent_tree_block  ---> return err
                 // path[1].p_bh is still the old value
                 ext4_free_ext_path
                   ext4_ext_drop_refs
                     // path->p_depth == 1
                     brelse(path[1].p_bh)  ---> brelse a buffer twice
    
    Finally got the following WARRNING when removing the buffer from lru:
    
    ============================================
    VFS: brelse: Trying to free free buffer
    WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90
    CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716
    RIP: 0010:__brelse+0x58/0x90
    Call Trace:
     <TASK>
     __find_get_block+0x6e7/0x810
     bdev_getblk+0x2b/0x480
     __ext4_get_inode_loc+0x48a/0x1240
     ext4_get_inode_loc+0xb2/0x150
     ext4_reserve_inode_write+0xb7/0x230
     __ext4_mark_inode_dirty+0x144/0x6a0
     ext4_ext_insert_extent+0x9c8/0x3230
     ext4_ext_map_blocks+0xf45/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ============================================
    
    Fixes: ecb94f5fdf4b ("ext4: collapse a single extent tree block into the inode if possible")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix fast commit inode enqueueing during a full journal commit [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 17 18:22:20 2024 +0100

    ext4: fix fast commit inode enqueueing during a full journal commit
    
    commit 6db3c1575a750fd417a70e0178bdf6efa0dd5037 upstream.
    
    When a full journal commit is on-going, any fast commit has to be enqueued
    into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
    is done only once, i.e. if an inode is already queued in a previous fast
    commit entry it won't be enqueued again.  However, if a full commit starts
    _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
    be done into FC_Q_STAGING.  And this is not being done in function
    ext4_fc_track_template().
    
    This patch fixes the issue by re-enqueuing an inode into the STAGING queue
    during the fast commit clean-up callback when doing a full commit.  However,
    to prevent a race with a fast-commit, the clean-up callback has to be called
    with the journal locked.
    
    This bug was found using fstest generic/047.  This test creates several 32k
    bytes files, sync'ing each of them after it's creation, and then shutting
    down the filesystem.  Some data may be loss in this operation; for example a
    file may have it's size truncated to zero.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix i_data_sem unlock order in ext4_ind_migrate() [+ + +]
Author: Artem Sadovnikov <[email protected]>
Date:   Thu Aug 29 15:22:09 2024 +0000

    ext4: fix i_data_sem unlock order in ext4_ind_migrate()
    
    [ Upstream commit cc749e61c011c255d81b192a822db650c68b313f ]
    
    Fuzzing reports a possible deadlock in jbd2_log_wait_commit.
    
    This issue is triggered when an EXT4_IOC_MIGRATE ioctl is set to require
    synchronous updates because the file descriptor is opened with O_SYNC.
    This can lead to the jbd2_journal_stop() function calling
    jbd2_might_wait_for_commit(), potentially causing a deadlock if the
    EXT4_IOC_MIGRATE call races with a write(2) system call.
    
    This problem only arises when CONFIG_PROVE_LOCKING is enabled. In this
    case, the jbd2_might_wait_for_commit macro locks jbd2_handle in the
    jbd2_journal_stop function while i_data_sem is locked. This triggers
    lockdep because the jbd2_journal_start function might also lock the same
    jbd2_handle simultaneously.
    
    Found by Linux Verification Center (linuxtesting.org) with syzkaller.
    
    Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
    Co-developed-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Mikhail Ukhin <[email protected]>
    Signed-off-by: Artem Sadovnikov <[email protected]>
    Rule: add
    Link: https://lore.kernel.org/stable/20240404095000.5872-1-mish.uxin2012%40yandex.ru
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:16 2024 +0100

    ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
    
    commit 972090651ee15e51abfb2160e986fa050cfc7a40 upstream.
    
    Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
    for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:18 2024 +0100

    ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
    
    commit ebc4b2c1ac92fc0f8bf3f5a9c285a871d5084a6b upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.
    
    Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
    assumption by being initialised to '0'.  Fortunately, the sb flag
    EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
    has been previously set instead of comparing it with '0'.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:15 2024 +0100

    ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
    
    commit dd589b0f1445e1ea1085b98edca6e4d5dedb98d0 upstream.
    
    Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
    value for transaction IDs, which is incorrect.  Don't assume that and invoke
    jbd2_log_wait_commit() if the journal had a committing transaction instead.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Wed Jul 24 17:11:17 2024 +0100

    ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
    
    commit 7a6443e1dad70281f99f0bd394d7fd342481a632 upstream.
    
    Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
    valid value for transaction IDs, which is incorrect.  Don't assume that and
    use two extra boolean variables to control the loop iterations and keep
    track of the first and last tid.
    
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix off by one issue in alloc_flex_gd() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Fri Sep 27 21:33:29 2024 +0800

    ext4: fix off by one issue in alloc_flex_gd()
    
    commit 6121258c2b33ceac3d21f6a221452692c465df88 upstream.
    
    Wesley reported an issue:
    
    ==================================================================
    EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
    ------------[ cut here ]------------
    kernel BUG at fs/ext4/resize.c:324!
    CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
    RIP: 0010:ext4_resize_fs+0x1212/0x12d0
    Call Trace:
     __ext4_ioctl+0x4e0/0x1800
     ext4_ioctl+0x12/0x20
     __x64_sys_ioctl+0x99/0xd0
     x64_sys_call+0x1206/0x20d0
     do_syscall_64+0x72/0x110
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    ==================================================================
    
    While reviewing the patch, Honza found that when adjusting resize_bg in
    alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
    flexbg_size.
    
    The reproduction of the problem requires the following:
    
     o_group = flexbg_size * 2 * n;
     o_size = (o_group + 1) * group_size;
     n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
     o_size = (n_group + 1) * group_size;
    
    Take n=0,flexbg_size=16 as an example:
    
                  last:15
    |o---------------|--------------n-|
    o_group:0    resize to      n_group:30
    
    The corresponding reproducer is:
    
    img=test.img
    rm -f $img
    truncate -s 600M $img
    mkfs.ext4 -F $img -b 1024 -G 16 8M
    dev=`losetup -f --show $img`
    mkdir -p /tmp/test
    mount $dev /tmp/test
    resize2fs $dev 248M
    
    Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
    to prevent the issue from happening again.
    
    [ Note: another reproucer which this commit fixes is:
    
      img=test.img
      rm -f $img
      truncate -s 25MiB $img
      mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
      truncate -s 3GiB $img
      dev=`losetup -f --show $img`
      mkdir -p /tmp/test
      mount $dev /tmp/test
      resize2fs $dev 3G
      umount $dev
      losetup -d $dev
    
      -- TYT ]
    
    Reported-by: Wesley Hershberger <[email protected]>
    Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
    Reported-by: Stéphane Graber <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Tested-by: Alexander Mikhalitsyn <[email protected]>
    Tested-by: Eric Sandeen <[email protected]>
    Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix slab-use-after-free in ext4_split_extent_at() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:23 2024 +0800

    ext4: fix slab-use-after-free in ext4_split_extent_at()
    
    commit c26ab35702f8cd0cdc78f96aa5856bfb77be798f upstream.
    
    We hit the following use-after-free:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0
    Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40
    CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724
    Call Trace:
     <TASK>
     kasan_report+0x93/0xc0
     ext4_split_extent_at+0xba8/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Allocated by task 40:
     __kmalloc_noprof+0x1ac/0x480
     ext4_find_extent+0xf3b/0x1e70
     ext4_ext_map_blocks+0x188/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    
    Freed by task 40:
     kfree+0xf1/0x2b0
     ext4_find_extent+0xa71/0x1e70
     ext4_ext_insert_extent+0xa22/0x3260
     ext4_split_extent_at+0x3ef/0xcc0
     ext4_split_extent.isra.0+0x18f/0x500
     ext4_split_convert_extents+0x275/0x750
     ext4_ext_handle_unwritten_extents+0x73e/0x1580
     ext4_ext_map_blocks+0xe20/0x2dc0
     ext4_map_blocks+0x724/0x1700
     ext4_do_writepages+0x12d6/0x2a70
    [...]
    ==================================================================
    
    The flow of issue triggering is as follows:
    
    ext4_split_extent_at
      path = *ppath
      ext4_ext_insert_extent(ppath)
        ext4_ext_create_new_leaf(ppath)
          ext4_find_extent(orig_path)
            path = *orig_path
            read_extent_tree_block
              // return -ENOMEM or -EIO
            ext4_free_ext_path(path)
              kfree(path)
            *orig_path = NULL
      a. If err is -ENOMEM:
      ext4_ext_dirty(path + path->p_depth)
      // path use-after-free !!!
      b. If err is -EIO and we have EXT_DEBUG defined:
      ext4_ext_show_leaf(path)
        eh = path[depth].p_hdr
        // path also use-after-free !!!
    
    So when trying to zeroout or fix the extent length, call ext4_find_extent()
    to update the path.
    
    In addition we use *ppath directly as an ext4_ext_show_leaf() input to
    avoid possible use-after-free when EXT_DEBUG is defined, and to avoid
    unnecessary path updates.
    
    Fixes: dfe5080939ea ("ext4: drop EXT4_EX_NOFREE_ON_ERR from rest of extents handling code")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: fix timer use-after-free on failed mount [+ + +]
Author: Xiaxi Shen <[email protected]>
Date:   Sun Jul 14 21:33:36 2024 -0700

    ext4: fix timer use-after-free on failed mount
    
    commit 0ce160c5bdb67081a62293028dc85758a8efb22a upstream.
    
    Syzbot has found an ODEBUG bug in ext4_fill_super
    
    The del_timer_sync function cancels the s_err_report timer,
    which reminds about filesystem errors daily. We should
    guarantee the timer is no longer active before kfree(sbi).
    
    When filesystem mounting fails, the flow goes to failed_mount3,
    where an error occurs when ext4_stop_mmpd is called, causing
    a read I/O failure. This triggers the ext4_handle_error function
    that ultimately re-arms the timer,
    leaving the s_err_report timer active before kfree(sbi) is called.
    
    Fix the issue by canceling the s_err_report timer after calling ext4_stop_mmpd.
    
    Signed-off-by: Xiaxi Shen <[email protected]>
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=59e0101c430934bc9a36
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: mark fc as ineligible using an handle in ext4_xattr_set() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:09 2024 +0100

    ext4: mark fc as ineligible using an handle in ext4_xattr_set()
    
    commit 04e6ce8f06d161399e5afde3df5dcfa9455b4952 upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch moves the call to this function so that an handle
    can be used.  If a transaction fails to start, then there's not point in
    trying to mark the filesystem as ineligible, and an error will eventually be
    returned to user-space.
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: no need to continue when the number of entries is 1 [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Mon Jul 1 22:25:03 2024 +0800

    ext4: no need to continue when the number of entries is 1
    
    commit 1a00a393d6a7fb1e745a41edd09019bd6a0ad64c upstream.
    
    Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3")
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0
    Signed-off-by: Edward Adam Davis <[email protected]>
    Reported-and-tested-by: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: propagate errors from ext4_find_extent() in ext4_insert_range() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:30 2024 +0800

    ext4: propagate errors from ext4_find_extent() in ext4_insert_range()
    
    commit 369c944ed1d7c3fb7b35f24e4735761153afe7b3 upstream.
    
    Even though ext4_find_extent() returns an error, ext4_insert_range() still
    returns 0. This may confuse the user as to why fallocate returns success,
    but the contents of the file are not as expected. So propagate the error
    returned by ext4_find_extent() to avoid inconsistencies.
    
    Fixes: 331573febb6a ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Ojaswin Mujoo <[email protected]>
    Tested-by: Ojaswin Mujoo <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: update orig_path in ext4_find_extent() [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Aug 22 10:35:25 2024 +0800

    ext4: update orig_path in ext4_find_extent()
    
    commit 5b4b2dcace35f618fe361a87bae6f0d13af31bc1 upstream.
    
    In ext4_find_extent(), if the path is not big enough, we free it and set
    *orig_path to NULL. But after reallocating and successfully initializing
    the path, we don't update *orig_path, in which case the caller gets a
    valid path but a NULL ppath, and this may cause a NULL pointer dereference
    or a path memory leak. For example:
    
    ext4_split_extent
      path = *ppath = 2000
      ext4_find_extent
        if (depth > path[0].p_maxdepth)
          kfree(path = 2000);
          *orig_path = path = NULL;
          path = kcalloc() = 3000
      ext4_split_extent_at(*ppath = NULL)
        path = *ppath;
        ex = path[depth].p_ext;
        // NULL pointer dereference!
    
    ==================================================================
    BUG: kernel NULL pointer dereference, address: 0000000000000010
    CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847
    RIP: 0010:ext4_split_extent_at+0x6d/0x560
    Call Trace:
     <TASK>
     ext4_split_extent.isra.0+0xcb/0x1b0
     ext4_ext_convert_to_initialized+0x168/0x6c0
     ext4_ext_handle_unwritten_extents+0x325/0x4d0
     ext4_ext_map_blocks+0x520/0xdb0
     ext4_map_blocks+0x2b0/0x690
     ext4_iomap_begin+0x20e/0x2c0
    [...]
    ==================================================================
    
    Therefore, *orig_path is updated when the extent lookup succeeds, so that
    the caller can safely use path or *ppath.
    
    Fixes: 10809df84a4d ("ext4: teach ext4_ext_find_extent() to realloc path if necessary")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ext4: use handle to mark fc as ineligible in __track_dentry_update() [+ + +]
Author: Luis Henriques (SUSE) <[email protected]>
Date:   Mon Sep 23 11:49:08 2024 +0100

    ext4: use handle to mark fc as ineligible in __track_dentry_update()
    
    commit faab35a0370fd6e0821c7a8dd213492946fc776f upstream.
    
    Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
    in a fast-commit being done before the filesystem is effectively marked as
    ineligible.  This patch fixes the calls to this function in
    __track_dentry_update() by adding an extra parameter to the callback used in
    ext4_fc_track_template().
    
    Suggested-by: Jan Kara <[email protected]>
    Signed-off-by: Luis Henriques (SUSE) <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Cc: [email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
f2fs: add write priority option based on zone UFS [+ + +]
Author: Liao Yuanhong <[email protected]>
Date:   Mon Jul 15 20:34:51 2024 +0800

    f2fs: add write priority option based on zone UFS
    
    [ Upstream commit 8444ce524947daf441546b5b3a0c418706dade35 ]
    
    Currently, we are using a mix of traditional UFS and zone UFS to support
    some functionalities that cannot be achieved on zone UFS alone. However,
    there are some issues with this approach. There exists a significant
    performance difference between traditional UFS and zone UFS. Under normal
    usage, we prioritize writes to zone UFS. However, in critical conditions
    (such as when the entire UFS is almost full), we cannot determine whether
    data will be written to traditional UFS or zone UFS. This can lead to
    significant performance fluctuations, which is not conducive to
    development and testing. To address this, we have added an option
    zlu_io_enable under sys with the following three modes:
    1) zlu_io_enable == 0:Normal mode, prioritize writing to zone UFS;
    2) zlu_io_enable == 1:Zone UFS only mode, only allow writing to zone UFS;
    3) zlu_io_enable == 2:Traditional UFS priority mode, prioritize writing to
    traditional UFS.
    
    Signed-off-by: Liao Yuanhong <[email protected]>
    Signed-off-by: Wu Bo <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 65a6ce4726c2 ("f2fs: fix to don't panic system for no free segment fault injection")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: do FG_GC when GC boosting is required for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:44 2024 -0700

    f2fs: do FG_GC when GC boosting is required for zoned devices
    
    [ Upstream commit 9748c2ddea4a3f46a498bff4cf2bf9a5629e3f8b ]
    
    Under low free section count, we need to use FG_GC instead of BG_GC to
    recover free sections.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: fix to don't panic system for no free segment fault injection [+ + +]
Author: Chao Yu <[email protected]>
Date:   Tue Sep 10 09:16:19 2024 +0800

    f2fs: fix to don't panic system for no free segment fault injection
    
    [ Upstream commit 65a6ce4726c27b45600303f06496fef46d00b57f ]
    
    f2fs: fix to don't panic system for no free segment fault injection
    
    syzbot reports a f2fs bug as below:
    
    F2FS-fs (loop0): inject no free segment in get_new_segment of __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
    F2FS-fs (loop0): Stopped filesystem due to reason: 7
    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/segment.c:2748!
    CPU: 0 UID: 0 PID: 5109 Comm: syz-executor304 Not tainted 6.11.0-rc6-syzkaller-00363-g89f5e14d05b4 #0
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    Call Trace:
     __allocate_new_segment+0x1ce/0x940 fs/f2fs/segment.c:3167
     f2fs_allocate_new_section fs/f2fs/segment.c:3181 [inline]
     f2fs_allocate_pinning_section+0xfa/0x4e0 fs/f2fs/segment.c:3195
     f2fs_expand_inode_data+0x5d6/0xbb0 fs/f2fs/file.c:1799
     f2fs_fallocate+0x448/0x960 fs/f2fs/file.c:1903
     vfs_fallocate+0x553/0x6c0 fs/open.c:334
     do_vfs_ioctl+0x2592/0x2e50 fs/ioctl.c:886
     __do_sys_ioctl fs/ioctl.c:905 [inline]
     __se_sys_ioctl+0x81/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0010:get_new_segment fs/f2fs/segment.c:2748 [inline]
    RIP: 0010:new_curseg+0x1f61/0x1f70 fs/f2fs/segment.c:2836
    
    The root cause is when we inject no free segment fault into f2fs,
    we should not panic system, fix it.
    
    Fixes: 8b10d3653735 ("f2fs: introduce FAULT_NO_SEGMENT")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/linux-f2fs-devel/[email protected]
    Signed-off-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: forcibly migrate to secure space for zoned device file pinning [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Thu Sep 12 09:59:58 2024 -0700

    f2fs: forcibly migrate to secure space for zoned device file pinning
    
    [ Upstream commit 5cc69a27abfa91abbb39fc584f82d6c867b60f47 ]
    
    We need to migrate data blocks even though it is full to secure space
    for zoned device file pinning.
    
    Fixes: 9703d69d9d15 ("f2fs: support file pinning for zoned devices")
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: increase BG GC migration window granularity when boosted for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:43 2024 -0700

    f2fs: increase BG GC migration window granularity when boosted for zoned devices
    
    [ Upstream commit 2223fe652f759649ae1d520e47e5f06727c0acbd ]
    
    Need bigger BG GC migration window granularity when free section is
    running low.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: introduce migration_window_granularity [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:41 2024 -0700

    f2fs: introduce migration_window_granularity
    
    [ Upstream commit 8c890c4c60342719526520133fb1b6f69f196ab8 ]
    
    We can control the scanning window granularity for GC migration. For
    more frequent scanning and GC on zoned devices, we need a fine grained
    control knob for it.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

f2fs: make BG GC more aggressive for zoned devices [+ + +]
Author: Daeho Jeong <[email protected]>
Date:   Mon Sep 9 15:19:40 2024 -0700

    f2fs: make BG GC more aggressive for zoned devices
    
    [ Upstream commit 5062b5bed4323275f2f89bc185c6a28d62cfcfd5 ]
    
    Since we don't have any GC on device side for zoned devices, need more
    aggressive BG GC. So, tune the parameters for that.
    
    Signed-off-by: Daeho Jeong <[email protected]>
    Reviewed-by: Chao Yu <[email protected]>
    Signed-off-by: Jaegeuk Kim <[email protected]>
    Stable-dep-of: 5cc69a27abfa ("f2fs: forcibly migrate to secure space for zoned device file pinning")
    Signed-off-by: Sasha Levin <[email protected]>

 
fbdev: efifb: Register sysfs groups through driver core [+ + +]
Author: Thomas Weißschuh <[email protected]>
Date:   Tue Aug 27 17:25:13 2024 +0200

    fbdev: efifb: Register sysfs groups through driver core
    
    [ Upstream commit 95cdd538e0e5677efbdf8aade04ec098ab98f457 ]
    
    The driver core can register and cleanup sysfs groups already.
    Make use of that functionality to simplify the error handling and
    cleanup.
    
    Also avoid a UAF race during unregistering where the sysctl attributes
    were usable after the info struct was freed.
    
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fbdev: pxafb: Fix possible use after free in pxafb_task() [+ + +]
Author: Kaixin Wang <[email protected]>
Date:   Wed Sep 11 22:29:52 2024 +0800

    fbdev: pxafb: Fix possible use after free in pxafb_task()
    
    [ Upstream commit 4a6921095eb04a900e0000da83d9475eb958e61e ]
    
    In the pxafb_probe function, it calls the pxafb_init_fbinfo function,
    after which &fbi->task is associated with pxafb_task. Moreover,
    within this pxafb_init_fbinfo function, the pxafb_blank function
    within the &pxafb_ops struct is capable of scheduling work.
    
    If we remove the module which will call pxafb_remove to make cleanup,
    it will call unregister_framebuffer function which can call
    do_unregister_framebuffer to free fbi->fb through
    put_fb_info(fb_info), while the work mentioned above will be used.
    The sequence of operations that may lead to a UAF bug is as follows:
    
    CPU0                                                CPU1
    
                                       | pxafb_task
    pxafb_remove                       |
    unregister_framebuffer(info)       |
    do_unregister_framebuffer(fb_info) |
    put_fb_info(fb_info)               |
    // free fbi->fb                    | set_ctrlr_state(fbi, state)
                                       | __pxafb_lcd_power(fbi, 0)
                                       | fbi->lcd_power(on, &fbi->fb.var)
                                       | //use fbi->fb
    
    Fix it by ensuring that the work is canceled before proceeding
    with the cleanup in pxafb_remove.
    
    Note that only root user can remove the driver at runtime.
    
    Signed-off-by: Kaixin Wang <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
firmware/sysfb: Disable sysfb for firmware buffers with unknown parent [+ + +]
Author: Thomas Zimmermann <[email protected]>
Date:   Tue Sep 24 10:41:03 2024 +0200

    firmware/sysfb: Disable sysfb for firmware buffers with unknown parent
    
    commit ad604f0a4c040dcb8faf44dc72db25e457c28076 upstream.
    
    The sysfb framebuffer handling only operates on graphics devices
    that provide the system's firmware framebuffer. If that device is
    not known, assume that any graphics device has been initialized by
    firmware.
    
    Fixes a problem on i915 where sysfb does not release the firmware
    framebuffer after the native graphics driver loaded.
    
    Reported-by: Borah, Chaitanya Kumar <[email protected]>
    Closes: https://lore.kernel.org/dri-devel/SJ1PR11MB6129EFB8CE63D1EF6D932F94B96F2@SJ1PR11MB6129.namprd11.prod.outlook.com/
    Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12160
    Signed-off-by: Thomas Zimmermann <[email protected]>
    Fixes: b49420d6a1ae ("video/aperture: optionally match the device in sysfb_disable()")
    Cc: Javier Martinez Canillas <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Helge Deller <[email protected]>
    Cc: Sam Ravnborg <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: [email protected]
    Cc: Linux regression tracking (Thorsten Leemhuis) <[email protected]>
    Cc: <[email protected]> # v6.11+
    Acked-by: Alex Deucher <[email protected]>
    Reviewed-by: Javier Martinez Canillas <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Fri Aug 16 15:57:21 2024 +0200

    firmware: tegra: bpmp: Drop unused mbox_client_to_bpmp()
    
    commit 9c3a62c20f7fb00294a4237e287254456ba8a48b upstream.
    
    mbox_client_to_bpmp() is not used, W=1 builds:
    
      drivers/firmware/tegra/bpmp.c:28:1: error: unused function 'mbox_client_to_bpmp' [-Werror,-Wunused-function]
    
    Fixes: cdfa358b248e ("firmware: tegra: Refactor BPMP driver")
    Cc: [email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Thierry Reding <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name [+ + +]
Author: Li Zhijian <[email protected]>
Date:   Mon Aug 26 13:55:03 2024 +0800

    fs/inode: Prevent dump_mapping() accessing invalid dentry.d_name.name
    
    [ Upstream commit 7f7b850689ac06a62befe26e1fd1806799e7f152 ]
    
    It's observed that a crash occurs during hot-remove a memory device,
    in which user is accessing the hugetlb. See calltrace as following:
    
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 14045 at arch/x86/mm/fault.c:1278 do_user_addr_fault+0x2a0/0x790
    Modules linked in: kmem device_dax cxl_mem cxl_pmem cxl_port cxl_pci dax_hmem dax_pmem nd_pmem cxl_acpi nd_btt cxl_core crc32c_intel nvme virtiofs fuse nvme_core nfit libnvdimm dm_multipath scsi_dh_rdac scsi_dh_emc s
    mirror dm_region_hash dm_log dm_mod
    CPU: 1 PID: 14045 Comm: daxctl Not tainted 6.10.0-rc2-lizhijian+ #492
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    RIP: 0010:do_user_addr_fault+0x2a0/0x790
    Code: 48 8b 00 a8 04 0f 84 b5 fe ff ff e9 1c ff ff ff 4c 89 e9 4c 89 e2 be 01 00 00 00 bf 02 00 00 00 e8 b5 ef 24 00 e9 42 fe ff ff <0f> 0b 48 83 c4 08 4c 89 ea 48 89 ee 4c 89 e7 5b 5d 41 5c 41 5d 41
    RSP: 0000:ffffc90000a575f0 EFLAGS: 00010046
    RAX: ffff88800c303600 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: 0000000000001000 RSI: ffffffff82504162 RDI: ffffffff824b2c36
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000a57658
    R13: 0000000000001000 R14: ffff88800bc2e040 R15: 0000000000000000
    FS:  00007f51cb57d880(0000) GS:ffff88807fd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000001000 CR3: 00000000072e2004 CR4: 00000000001706f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __warn+0x8d/0x190
     ? do_user_addr_fault+0x2a0/0x790
     ? report_bug+0x1c3/0x1d0
     ? handle_bug+0x3c/0x70
     ? exc_invalid_op+0x14/0x70
     ? asm_exc_invalid_op+0x16/0x20
     ? do_user_addr_fault+0x2a0/0x790
     ? exc_page_fault+0x31/0x200
     exc_page_fault+0x68/0x200
    <...snip...>
    BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     ---[ end trace 0000000000000000 ]---
     BUG: unable to handle page fault for address: 0000000000001000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 800000000ad92067 P4D 800000000ad92067 PUD 7677067 PMD 0
     Oops: Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 14045 Comm: daxctl Kdump: loaded Tainted: G        W          6.10.0-rc2-lizhijian+ #492
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
     RIP: 0010:dentry_name+0x1f4/0x440
    <...snip...>
    ? dentry_name+0x2fa/0x440
    vsnprintf+0x1f3/0x4f0
    vprintk_store+0x23a/0x540
    vprintk_emit+0x6d/0x330
    _printk+0x58/0x80
    dump_mapping+0x10b/0x1a0
    ? __pfx_free_object_rcu+0x10/0x10
    __dump_page+0x26b/0x3e0
    ? vprintk_emit+0xe0/0x330
    ? _printk+0x58/0x80
    ? dump_page+0x17/0x50
    dump_page+0x17/0x50
    do_migrate_range+0x2f7/0x7f0
    ? do_migrate_range+0x42/0x7f0
    ? offline_pages+0x2f4/0x8c0
    offline_pages+0x60a/0x8c0
    memory_subsys_offline+0x9f/0x1c0
    ? lockdep_hardirqs_on+0x77/0x100
    ? _raw_spin_unlock_irqrestore+0x38/0x60
    device_offline+0xe3/0x110
    state_store+0x6e/0xc0
    kernfs_fop_write_iter+0x143/0x200
    vfs_write+0x39f/0x560
    ksys_write+0x65/0xf0
    do_syscall_64+0x62/0x130
    
    Previously, some sanity check have been done in dump_mapping() before
    the print facility parsing '%pd' though, it's still possible to run into
    an invalid dentry.d_name.name.
    
    Since dump_mapping() only needs to dump the filename only, retrieve it
    by itself in a safer way to prevent an unnecessary crash.
    
    Note that either retrieving the filename with '%pd' or
    strncpy_from_kernel_nofault(), the filename could be unreliable.
    
    Signed-off-by: Li Zhijian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Jan Kara <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
gpio: davinci: fix lazy disable [+ + +]
Author: Emanuele Ghidoli <[email protected]>
Date:   Wed Aug 28 15:32:07 2024 +0200

    gpio: davinci: fix lazy disable
    
    commit 3360d41f4ac490282fddc3ccc0b58679aa5c065d upstream.
    
    On a few platforms such as TI's AM69 device, disable_irq() fails to keep
    track of the interrupts that happen between disable_irq() and
    enable_irq() and those interrupts are missed. Use the ->irq_unmask() and
    ->irq_mask() methods instead of ->irq_enable() and ->irq_disable() to
    correctly keep track of edges when disable_irq is called.
    
    This solves the issue of disable_irq() not working as expected on such
    platforms.
    
    Fixes: 23265442b02b ("ARM: davinci: irq_data conversion.")
    Signed-off-by: Emanuele Ghidoli <[email protected]>
    Signed-off-by: Parth Pancholi <[email protected]>
    Acked-by: Keerthy <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
gso: fix udp gso fraglist segmentation after pull from frag_list [+ + +]
Author: Willem de Bruijn <[email protected]>
Date:   Tue Oct 1 13:17:46 2024 -0400

    gso: fix udp gso fraglist segmentation after pull from frag_list
    
    commit a1e40ac5b5e9077fe1f7ae0eb88034db0f9ae1ab upstream.
    
    Detect gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For UDP, this
    causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
    udp_hdr(seg->next)->dest.
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Willem de Bruijn <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
HID: i2c-hid: ensure various commands do not interfere with each other [+ + +]
Author: Dmitry Torokhov <[email protected]>
Date:   Mon Sep 9 13:37:40 2024 -0700

    HID: i2c-hid: ensure various commands do not interfere with each other
    
    [ Upstream commit b4ed18a3d56eabd18cfd9841ff05111e3cfbe8f9 ]
    
    i2c-hid uses 2 shared buffers: command and "raw" input buffer for
    sending requests to peripherals and read data from peripherals when
    executing variety of commands. Such commands include reading of HID
    registers, requesting particular power mode, getting and setting
    reports and so on. Because all such requests use the same 2 buffers
    they should not execute simultaneously.
    
    Fix this by introducing "cmd_lock" mutex and acquire it whenever
    we needs to access ihid->cmdbuf or idid->rawbuf.
    
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: Ignore battery for all ELAN I2C-HID devices [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Mon Aug 5 16:51:47 2024 +0200

    HID: Ignore battery for all ELAN I2C-HID devices
    
    [ Upstream commit bcc31692a1d1e21f0d06c5f727c03ee299d2264e ]
    
    Before this change there were 16 vid:pid based quirks to ignore the battery
    reported by Elan I2C-HID touchscreens on various Asus and HP laptops.
    
    And a report has been received that the 04F3:2A00 I2C touchscreen on
    the HP ProBook x360 11 G5 EE/86CF also reports a non present battery.
    
    Since I2C-HID devices are always builtin to laptops they are not battery
    owered so it should be safe to just ignore the battery on all Elan I2C-HID
    devices, rather then adding a 17th quirk for the 04F3:2A00 touchscreen.
    
    As reported in the changelog of commit a3a5a37efba1 ("HID: Ignore battery
    for ELAN touchscreens 2F2C and 4116"), which added 2 new Elan touchscreen
    quirks about a month ago, the HID reported battery seems to be related
    to a stylus being used. But even when a stylus is in use it does not
    properly report the charge of the stylus battery, instead the reported
    battery charge jumps from 0% to 1%. So it is best to just ignore the
    HID battery.
    
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2302776
    Cc: Louis Dalibard <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio [+ + +]
Author: Vishnu Sankar <[email protected]>
Date:   Sun Aug 18 16:27:29 2024 +0900

    HID: multitouch: Add support for Thinkpad X12 Gen 2 Kbd Portfolio
    
    [ Upstream commit 65b72ea91a257a5f0cb5a26b01194d3dd4b85298 ]
    
    This applies similar quirks used by previous generation device, so that
    Trackpoint and buttons on the touchpad works.  New USB KBD PID 0x61AE for
    Thinkpad X12 Tab is added.
    
    Signed-off-by: Vishnu Sankar <[email protected]>
    Reviewed-by: Mark Pearson <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
hwmon: (nct6775) add G15CF to ASUS WMI monitoring list [+ + +]
Author: Denis Pauk <[email protected]>
Date:   Mon Aug 12 18:26:38 2024 +0300

    hwmon: (nct6775) add G15CF to ASUS WMI monitoring list
    
    [ Upstream commit 1f432e4cf1dd3ecfec5ed80051b4611632a0fd51 ]
    
    Boards G15CF has got a nct6775 chip, but by default there's no use of it
    because of resource conflict with WMI method.
    
    Add the board to the WMI monitoring list.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=204807
    Signed-off-by: Denis Pauk <[email protected]>
    Tested-by: Attila <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
i2c: core: Lock address during client device instantiation [+ + +]
Author: Heiner Kallweit <[email protected]>
Date:   Thu Aug 15 21:44:50 2024 +0200

    i2c: core: Lock address during client device instantiation
    
    commit 8d3cefaf659265aa82b0373a563fdb9d16a2b947 upstream.
    
    Krzysztof reported an issue [0] which is caused by parallel attempts to
    instantiate the same I2C client device. This can happen if driver
    supports auto-detection, but certain devices are also instantiated
    explicitly.
    The original change isn't actually wrong, it just revealed that I2C core
    isn't prepared yet to handle this scenario.
    Calls to i2c_new_client_device() can be nested, therefore we can't use a
    simple mutex here. Parallel instantiation of devices at different addresses
    is ok, so we just have to prevent parallel instantiation at the same address.
    We can use a bitmap with one bit per 7-bit I2C client address, and atomic
    bit operations to set/check/clear bits.
    Now a parallel attempt to instantiate a device at the same address will
    result in -EBUSY being returned, avoiding the "sysfs: cannot create duplicate
    filename" splash.
    
    Note: This patch version includes small cosmetic changes to the Tested-by
          version, only functional change is that address locking is supported
          for slave addresses too.
    
    [0] https://lore.kernel.org/linux-i2c/[email protected]/T/#m12706546e8e2414d8f1a0dc61c53393f731685cc
    
    Fixes: caba40ec3531 ("eeprom: at24: Probe for DDR3 thermal sensor in the SPD case")
    Cc: [email protected]
    Tested-by: Krzysztof Piotr Oledzki <[email protected]>
    Signed-off-by: Heiner Kallweit <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled [+ + +]
Author: Kimriver Liu <[email protected]>
Date:   Fri Sep 13 11:31:46 2024 +0800

    i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    
    commit 5d69d5a00f80488ddcb4dee7d1374a0709398178 upstream.
    
    It was observed that issuing the ABORT bit (IC_ENABLE[1]) will not
    work when IC_ENABLE is already disabled.
    
    Check if the ENABLE bit (IC_ENABLE[0]) is disabled when the controller
    is holding SCL low. If the ENABLE bit is disabled, the software needs
    to enable it before trying to issue the ABORT bit. otherwise,
    the controller ignores any write to ABORT bit.
    
    These kernel logs show up whenever an I2C transaction is
    attempted after this failure.
    i2c_designware e95e0000.i2c: timeout waiting for bus ready
    i2c_designware e95e0000.i2c: timeout in disabling adapter
    
    The patch fixes the issue where the controller cannot be disabled
    while SCL is held low if the ENABLE bit is already disabled.
    
    Fixes: 2409205acd3c ("i2c: designware: fix __i2c_dw_disable() in case master is holding SCL low")
    Signed-off-by: Kimriver Liu <[email protected]>
    Cc: <[email protected]> # v6.6+
    Reviewed-by: Mika Westerberg <[email protected]>
    Acked-by: Jarkko Nikula <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Thu Sep 12 11:34:59 2024 +0800

    i2c: qcom-geni: Use IRQF_NO_AUTOEN flag in request_irq()
    
    commit e2c85d85a05f16af2223fcc0195ff50a7938b372 upstream.
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
    disable IRQ auto-enable when request IRQ.
    
    Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Cc: <[email protected]> # v4.19+
    Acked-by: Mukesh Kumar Savaliya <[email protected]>
    Reviewed-by: Vladimir Zapolskiy <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume [+ + +]
Author: Marek Vasut <[email protected]>
Date:   Mon Sep 30 21:27:41 2024 +0200

    i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume
    
    commit 048bbbdbf85e5e00258dfb12f5e368f908801d7b upstream.
    
    In case there is any sort of clock controller attached to this I2C bus
    controller, for example Versaclock or even an AIC32x4 I2C codec, then
    an I2C transfer triggered from the clock controller clk_ops .prepare
    callback may trigger a deadlock on drivers/clk/clk.c prepare_lock mutex.
    
    This is because the clock controller first grabs the prepare_lock mutex
    and then performs the prepare operation, including its I2C access. The
    I2C access resumes this I2C bus controller via .runtime_resume callback,
    which calls clk_prepare_enable(), which attempts to grab the prepare_lock
    mutex again and deadlocks.
    
    Since the clock are already prepared since probe() and unprepared in
    remove(), use simple clk_enable()/clk_disable() calls to enable and
    disable the clock on runtime suspend and resume, to avoid hitting the
    prepare_lock mutex.
    
    Acked-by: Alain Volmat <[email protected]>
    Signed-off-by: Marek Vasut <[email protected]>
    Fixes: 4e7bca6fc07b ("i2c: i2c-stm32f7: add PM Runtime support")
    Cc: <[email protected]> # v5.0+
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: synquacer: Deal with optional PCLK correctly [+ + +]
Author: Ard Biesheuvel <[email protected]>
Date:   Thu Sep 12 12:46:31 2024 +0200

    i2c: synquacer: Deal with optional PCLK correctly
    
    commit f2990f8630531a99cad4dc5c44cb2a11ded42492 upstream.
    
    ACPI boot does not provide clocks and regulators, but instead, provides
    the PCLK rate directly, and enables the clock in firmware. So deal
    gracefully with this.
    
    Fixes: 55750148e559 ("i2c: synquacer: Fix an error handling path in synquacer_i2c_probe()")
    Cc: [email protected] # v6.10+
    Cc: Andi Shyti <[email protected]>
    Cc: Christophe JAILLET <[email protected]>
    Signed-off-by: Ard Biesheuvel <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 11:42:50 2024 +0800

    i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
    
    commit 0c8d604dea437b69a861479b413d629bc9b3da70 upstream.
    
    It is not valid to call pm_runtime_set_suspended() for devices
    with runtime PM enabled because it returns -EAGAIN if it is enabled
    already and working. So, call pm_runtime_disable() before to fix it.
    
    Fixes: 36ecbcab84d0 ("i2c: xiic: Implement power management")
    Cc: <[email protected]> # v4.6+
    Signed-off-by: Jinjie Ruan <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

i2c: xiic: improve error message when transfer fails to start [+ + +]
Author: Marc Ferland <[email protected]>
Date:   Mon May 13 12:03:24 2024 -0400

    i2c: xiic: improve error message when transfer fails to start
    
    [ Upstream commit ee1691d0ae103ba7fd9439800ef454674fadad27 ]
    
    xiic_start_xfer can fail for different reasons:
    
    - EBUSY: bus is busy or i2c messages still in tx_msg or rx_msg
    - ETIMEDOUT: timed-out trying to clear the RX fifo
    - EINVAL: wrong clock settings
    
    Both EINVAL and ETIMEDOUT will currently print a specific error
    message followed by a generic one, for example:
    
        Failed to clear rx fifo
        Error xiic_start_xfer
    
    however EBUSY will simply output the generic message:
    
        Error xiic_start_xfer
    
    which is not really helpful.
    
    This commit adds a new error message when a busy condition is detected
    and also removes the generic message since it does not provide any
    relevant information to the user.
    
    Signed-off-by: Marc Ferland <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Stable-dep-of: 1d4a1adbed25 ("i2c: xiic: Try re-initialization on bus busy timeout")
    Signed-off-by: Sasha Levin <[email protected]>

i2c: xiic: Try re-initialization on bus busy timeout [+ + +]
Author: Robert Hancock <[email protected]>
Date:   Wed Sep 11 22:16:53 2024 +0200

    i2c: xiic: Try re-initialization on bus busy timeout
    
    [ Upstream commit 1d4a1adbed2582444aaf97671858b7d12915bd05 ]
    
    In the event that the I2C bus was powered down when the I2C controller
    driver loads, or some spurious pulses occur on the I2C bus, it's
    possible that the controller detects a spurious I2C "start" condition.
    In this situation it may continue to report the bus is busy indefinitely
    and block the controller from working.
    
    The "single-master" DT flag can be specified to disable bus busy checks
    entirely, but this may not be safe to use in situations where other I2C
    masters may potentially exist.
    
    In the event that the controller reports "bus busy" for too long when
    starting a transaction, we can try reinitializing the controller to see
    if the busy condition clears. This allows recovering from this scenario.
    
    Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
    Signed-off-by: Robert Hancock <[email protected]>
    Cc: <[email protected]> # v2.6.34+
    Reviewed-by: Manikanta Guntupalli <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: xiic: Wait for TX empty to avoid missed TX NAKs [+ + +]
Author: Robert Hancock <[email protected]>
Date:   Tue Nov 21 18:11:16 2023 +0000

    i2c: xiic: Wait for TX empty to avoid missed TX NAKs
    
    commit 521da1e9225450bd323db5fa5bca942b1dc485b7 upstream.
    
    Frequently an I2C write will be followed by a read, such as a register
    address write followed by a read of the register value. In this driver,
    when the TX FIFO half empty interrupt was raised and it was determined
    that there was enough space in the TX FIFO to send the following read
    command, it would do so without waiting for the TX FIFO to actually
    empty.
    
    Unfortunately it appears that in some cases this can result in a NAK
    that was raised by the target device on the write, such as due to an
    unsupported register address, being ignored and the subsequent read
    being done anyway. This can potentially put the I2C bus into an
    invalid state and/or result in invalid read data being processed.
    
    To avoid this, once a message has been fully written to the TX FIFO,
    wait for the TX FIFO empty interrupt before moving on to the next
    message, to ensure NAKs are handled properly.
    
    Fixes: e1d5b6598cdc ("i2c: Add support for Xilinx XPS IIC Bus Interface")
    Signed-off-by: Robert Hancock <[email protected]>
    Cc: <[email protected]> # v2.6.34+
    Reviewed-by: Manikanta Guntupalli <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition [+ + +]
Author: Kaixin Wang <[email protected]>
Date:   Sun Sep 15 00:39:33 2024 +0800

    i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition
    
    commit 61850725779709369c7e907ae8c7c75dc7cec4f3 upstream.
    
    In the svc_i3c_master_probe function, &master->hj_work is bound with
    svc_i3c_master_hj_work, &master->ibi_work is bound with
    svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work  can start the
    hj_work, svc_i3c_master_irq_handler can start the ibi_work.
    
    If we remove the module which will call svc_i3c_master_remove to
    make cleanup, it will free master->base through i3c_master_unregister
    while the work mentioned above will be used. The sequence of operations
    that may lead to a UAF bug is as follows:
    
    CPU0                                         CPU1
    
                                        | svc_i3c_master_hj_work
    svc_i3c_master_remove               |
    i3c_master_unregister(&master->base)|
    device_unregister(&master->dev)     |
    device_release                      |
    //free master->base                 |
                                        | i3c_master_do_daa(&master->base)
                                        | //use master->base
    
    Fix it by ensuring that the work is canceled before proceeding with the
    cleanup in svc_i3c_master_remove.
    
    Fixes: 0f74f8b6675c ("i3c: Make i3c_master_unregister() return void")
    Cc: [email protected]
    Signed-off-by: Kaixin Wang <[email protected]>
    Reviewed-by: Miquel Raynal <[email protected]>
    Reviewed-by: Frank Li <[email protected]>
    Link: https://lore.kernel.org/stable/20240914154030.180-1-kxwang23%40m.fudan.edu.cn
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node() [+ + +]
Author: Aleksandr Mishin <[email protected]>
Date:   Wed Jul 10 15:39:49 2024 +0300

    ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
    
    [ Upstream commit 62fdaf9e8056e9a9e6fe63aa9c816ec2122d60c6 ]
    
    In ice_sched_add_root_node() and ice_sched_add_node() there are calls to
    devm_kcalloc() in order to allocate memory for array of pointers to
    'ice_sched_node' structure. But incorrect types are used as sizeof()
    arguments in these calls (structures instead of pointers) which leads to
    over allocation of memory.
    
    Adjust over allocation of memory by correcting types in devm_kcalloc()
    sizeof() arguments.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Aleksandr Mishin <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ieee802154: Fix build error [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 9 21:17:40 2024 +0800

    ieee802154: Fix build error
    
    [ Upstream commit addf89774e48c992316449ffab4f29c2309ebefb ]
    
    If REGMAP_SPI is m and IEEE802154_MCR20A is y,
    
            mcr20a.c:(.text+0x3ed6c5b): undefined reference to `__devm_regmap_init_spi'
            ld: mcr20a.c:(.text+0x3ed6cb5): undefined reference to `__devm_regmap_init_spi'
    
    Select REGMAP_SPI for IEEE802154_MCR20A to fix it.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iio: magnetometer: ak8975: Fix reading for ak099xx sensors [+ + +]
Author: Barnabás Czémán <[email protected]>
Date:   Mon Aug 19 00:29:40 2024 +0200

    iio: magnetometer: ak8975: Fix reading for ak099xx sensors
    
    commit 129464e86c7445a858b790ac2d28d35f58256bbe upstream.
    
    Move ST2 reading with overflow handling after measurement data
    reading.
    ST2 register read have to be read after read measurment data,
    because it means end of the reading and realease the lock on the data.
    Remove ST2 read skip on interrupt based waiting because ST2 required to
    be read out at and of the axis read.
    
    Fixes: 57e73a423b1e ("iio: ak8975: add ak09911 and ak09912 support")
    Signed-off-by: Barnabás Czémán <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

iio: pressure: bmp280: Fix regmap for BMP280 device [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:49 2024 +0200

    iio: pressure: bmp280: Fix regmap for BMP280 device
    
    [ Upstream commit b9065b0250e1705935445ede0a18c1850afe7b75 ]
    
    Up to now, the BMP280 device is using the regmap of the BME280 which
    has registers that exist only in the BME280 device.
    
    Fixes: 14e8015f8569 ("iio: pressure: bmp280: split driver in logical parts")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iio: pressure: bmp280: Fix waiting time for BMP3xx configuration [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Thu Jul 11 23:15:50 2024 +0200

    iio: pressure: bmp280: Fix waiting time for BMP3xx configuration
    
    [ Upstream commit 262a6634bcc4f0c1c53d13aa89882909f281a6aa ]
    
    According to the datasheet, both pressure and temperature can go up to
    oversampling x32. With this option, the maximum measurement time is not
    80ms (this is for press x32 and temp x2), but it is 130ms nominal
    (calculated from table 3.9.2) and since most of the maximum values
    are around +15%, it is configured to 150ms.
    
    Fixes: 8d329309184d ("iio: pressure: bmp280: Add support for BMP380 sensor family")
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Cc: <[email protected]>
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iio: pressure: bmp280: Improve indentation and line wrapping [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Mon Apr 29 21:00:37 2024 +0200

    iio: pressure: bmp280: Improve indentation and line wrapping
    
    [ Upstream commit 439ce8961bdd2e925c1f6adc82ce9fe3931e2c08 ]
    
    Fix indentations that are not following the standards, remove
    extra white lines and add missing white lines.
    
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Stable-dep-of: b9065b0250e1 ("iio: pressure: bmp280: Fix regmap for BMP280 device")
    Signed-off-by: Sasha Levin <[email protected]>

iio: pressure: bmp280: Use BME prefix for BME280 specifics [+ + +]
Author: Vasileios Amoiridis <[email protected]>
Date:   Mon Apr 29 21:00:38 2024 +0200

    iio: pressure: bmp280: Use BME prefix for BME280 specifics
    
    [ Upstream commit b23be4cd99a6f1f46963b87952632268174e62c1 ]
    
    Change the rest of the defines and function names that are
    used specifically by the BME280 humidity sensor to BME280
    as it is done for the rest of the BMP{0,1,3,5}80 sensors.
    
    Signed-off-by: Vasileios Amoiridis <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Stable-dep-of: b9065b0250e1 ("iio: pressure: bmp280: Fix regmap for BMP280 device")
    Signed-off-by: Sasha Levin <[email protected]>

 
Input: adp5589-keys - fix adp5589_gpio_get_value() [+ + +]
Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:47:23 2024 -0700

    Input: adp5589-keys - fix adp5589_gpio_get_value()
    
    commit c684771630e64bc39bddffeb65dd8a6612a6b249 upstream.
    
    The adp5589 seems to have the same behavior as similar devices as
    explained in commit 910a9f5636f5 ("Input: adp5588-keys - get value from
    data out when dir is out").
    
    Basically, when the gpio is set as output we need to get the value from
    ADP5589_GPO_DATA_OUT_A register instead of ADP5589_GPI_STATUS_A.
    
    Fixes: 9d2e173644bb ("Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-2-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Input: adp5589-keys - fix NULL pointer dereference [+ + +]
Author: Nuno Sa <[email protected]>
Date:   Tue Oct 1 07:46:44 2024 -0700

    Input: adp5589-keys - fix NULL pointer dereference
    
    commit fb5cc65f973661241e4a2b7390b429aa7b330c69 upstream.
    
    We register a devm action to call adp5589_clear_config() and then pass
    the i2c client as argument so that we can call i2c_get_clientdata() in
    order to get our device object. However, i2c_set_clientdata() is only
    being set at the end of the probe function which means that we'll get a
    NULL pointer dereference in case the probe function fails early.
    
    Fixes: 30df385e35a4 ("Input: adp5589-keys - use devm_add_action_or_reset() for register clear")
    Signed-off-by: Nuno Sa <[email protected]>
    Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-1-fca0149dfc47@analog.com
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
io_uring/net: harden multishot termination case for recv [+ + +]
Author: Jens Axboe <[email protected]>
Date:   Thu Sep 26 07:08:10 2024 -0600

    io_uring/net: harden multishot termination case for recv
    
    commit c314094cb4cfa6fc5a17f4881ead2dfebfa717a7 upstream.
    
    If the recv returns zero, or an error, then it doesn't matter if more
    data has already been received for this buffer. A condition like that
    should terminate the multishot receive. Rather than pass in the
    collected return value, pass in whether to terminate or keep the recv
    going separately.
    
    Note that this isn't a bug right now, as the only way to get there is
    via setting MSG_WAITALL with multishot receive. And if an application
    does that, then -EINVAL is returned anyway. But it seems like an easy
    bug to introduce, so let's make it a bit more explicit.
    
    Link: https://github.com/axboe/liburing/issues/1246
    Cc: [email protected]
    Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
io_uring: fix memory leak when cache init fail [+ + +]
Author: Guixin Liu <[email protected]>
Date:   Mon Sep 23 18:05:12 2024 +0800

    io_uring: fix memory leak when cache init fail
    
    [ Upstream commit 3a87e264290d71ec86a210ab3e8d23b715ad266d ]
    
    Exit the percpu ref when cache init fails to free the data memory with
    in struct percpu_ref.
    
    Fixes: 206aefde4f88 ("io_uring: reduce/pack size of io_ring_ctx")
    Signed-off-by: Guixin Liu <[email protected]>
    Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iomap: constrain the file range passed to iomap_file_unshare [+ + +]
Author: Darrick J. Wong <[email protected]>
Date:   Wed Oct 2 08:02:13 2024 -0700

    iomap: constrain the file range passed to iomap_file_unshare
    
    [ Upstream commit a311a08a4237241fb5b9d219d3e33346de6e83e0 ]
    
    File contents can only be shared (i.e. reflinked) below EOF, so it makes
    no sense to try to unshare ranges beyond EOF.  Constrain the file range
    parameters here so that we don't have to do that in the callers.
    
    Fixes: 5f4e5752a8a3 ("fs: add iomap_file_dirty")
    Signed-off-by: Darrick J. Wong <[email protected]>
    Link: https://lore.kernel.org/r/20241002150213.GC21853@frogsfrogsfrogs
    Reviewed-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Brian Foster <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Tue Sep 10 07:39:03 2024 +0300

    iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release
    
    [ Upstream commit 7a9d43eace888a0ee6095035997bb138425844d3 ]
    
    When direct I/O completions invalidates the page cache it holds neither the
    i_rwsem nor the invalidate_lock so it can be racing with
    iomap_write_delalloc_release.  If the search for the end of the region that
    contains data returns the start offset we hit such a race and just need to
    look for the end of the newly created hole instead.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iommu/arm-smmu-v3: Do not use devm for the cd table allocations [+ + +]
Author: Jason Gunthorpe <[email protected]>
Date:   Fri Sep 6 12:47:52 2024 -0300

    iommu/arm-smmu-v3: Do not use devm for the cd table allocations
    
    [ Upstream commit 47b2de35cab2b683f69d03515c2658c2d8515323 ]
    
    The master->cd_table is entirely contained within the struct
    arm_smmu_master which is guaranteed to be freed by the core code under
    arm_smmu_release_device().
    
    There is no reason to use devm here, arm_smmu_free_cd_tables() is reliably
    called to free the CD related memory. Remove it and save some memory.
    
    Tested-by: Nicolin Chen <[email protected]>
    Reviewed-by: Nicolin Chen <[email protected]>
    Signed-off-by: Jason Gunthorpe <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/arm-smmu-v3: Match Stall behaviour for S2 [+ + +]
Author: Mostafa Saleh <[email protected]>
Date:   Fri Aug 30 11:03:47 2024 +0000

    iommu/arm-smmu-v3: Match Stall behaviour for S2
    
    [ Upstream commit ce7cb08e22e09f43649b025c849a3ae3b80833c4 ]
    
    According to the spec (ARM IHI 0070 F.b), in
    "5.5 Fault configuration (A, R, S bits)":
        A STE with stage 2 translation enabled and STE.S2S == 0 is
        considered ILLEGAL if SMMU_IDR0.STALL_MODEL == 0b10.
    
    Also described in the pseudocode “SteIllegal()”
        if STE.Config == '11x' then
            [..]
            if eff_idr0_stall_model == '10' && STE.S2S == '0' then
                // stall_model forcing stall, but S2S == 0
                return TRUE;
    
    Which means, S2S must be set when stall model is
    "ARM_SMMU_FEAT_STALL_FORCE", but currently the driver ignores that.
    
    Although, the driver can do the minimum and only set S2S for
    “ARM_SMMU_FEAT_STALL_FORCE”, it is more consistent to match S1
    behaviour, which also sets it for “ARM_SMMU_FEAT_STALL” if the
    master has requested stalls.
    
    Also, since S2 stalls are enabled now, report them to the IOMMU layer
    and for VFIO devices it will fail anyway as VFIO doesn’t register an
    iopf handler.
    
    Signed-off-by: Mostafa Saleh <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Will Deacon <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
iommu/vt-d: Always reserve a domain ID for identity setup [+ + +]
Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:13 2024 +0800

    iommu/vt-d: Always reserve a domain ID for identity setup
    
    [ Upstream commit 2c13012e09190174614fd6901857a1b8c199e17d ]
    
    We will use a global static identity domain. Reserve a static domain ID
    for it.
    
    Signed-off-by: Lu Baolu <[email protected]>
    Reviewed-by: Jason Gunthorpe <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Reviewed-by: Jerry Snitselaar <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count [+ + +]
Author: Sanjay K Kumar <[email protected]>
Date:   Mon Sep 2 10:27:18 2024 +0800

    iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 count
    
    [ Upstream commit 3cf74230c139f208b7fb313ae0054386eee31a81 ]
    
    If qi_submit_sync() is invoked with 0 invalidation descriptors (for
    instance, for DMA draining purposes), we can run into a bug where a
    submitting thread fails to detect the completion of invalidation_wait.
    Subsequently, this led to a soft lockup. Currently, there is no impact
    by this bug on the existing users because no callers are submitting
    invalidations with 0 descriptors. This fix will enable future users
    (such as DMA drain) calling qi_submit_sync() with 0 count.
    
    Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while
    concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both
    threads then enter a while loop, waiting for their respective descriptors
    to complete. T1 detects its completion (i.e., T1's invalidation_wait status
    changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to
    reclaim all descriptors, potentially including adjacent ones of other
    threads that are also marked as QI_DONE.
    
    During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU
    hardware may complete the invalidation for T2, setting its status to
    QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's
    invalidation_wait descriptor and changes its status to QI_FREE, T2 will
    not observe the QI_DONE status for its invalidation_wait and will
    indefinitely remain stuck.
    
    This soft lockup does not occur when only non-zero descriptors are
    submitted.In such cases, invalidation descriptors are interspersed among
    wait descriptors with the status QI_IN_USE, acting as barriers. These
    barriers prevent the reclaim code from mistakenly freeing descriptors
    belonging to other submitters.
    
    Considered the following example timeline:
            T1                      T2
    ========================================
            ID1
            WD1
            while(WD1!=QI_DONE)
            unlock
                                    lock
            WD1=QI_DONE*            WD2
                                    while(WD2!=QI_DONE)
                                    unlock
            lock
            WD1==QI_DONE?
            ID1=QI_DONE             WD2=DONE*
            reclaim()
            ID1=FREE
            WD1=FREE
            WD2=FREE
            unlock
                                    soft lockup! T2 never sees QI_DONE in WD2
    
    Where:
    ID = invalidation descriptor
    WD = wait descriptor
    * Written by hardware
    
    The root of the problem is that the descriptor status QI_DONE flag is used
    for two conflicting purposes:
    1. signal a descriptor is ready for reclaim (to be freed)
    2. signal by the hardware that a wait descriptor is complete
    
    The solution (in this patch) is state separation by using QI_FREE flag
    for #1.
    
    Once a thread's invalidation descriptors are complete, their status would
    be set to QI_FREE. The reclaim_free_desc() function would then only
    free descriptors marked as QI_FREE instead of those marked as
    QI_DONE. This change ensures that T2 (from the previous example) will
    correctly observe the completion of its invalidation_wait (marked as
    QI_DONE).
    
    Signed-off-by: Sanjay K Kumar <[email protected]>
    Signed-off-by: Jacob Pan <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

iommu/vt-d: Unconditionally flush device TLB for pasid table updates [+ + +]
Author: Lu Baolu <[email protected]>
Date:   Mon Sep 2 10:27:20 2024 +0800

    iommu/vt-d: Unconditionally flush device TLB for pasid table updates
    
    [ Upstream commit 1f5e307ca16c0c19186cbd56ac460a687e6daba0 ]
    
    The caching mode of an IOMMU is irrelevant to the behavior of the device
    TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode
    check before device TLB flush") removed this redundant check in the
    domain unmap path.
    
    Checking the caching mode before flushing the device TLB after a pasid
    table entry is updated is unnecessary and can lead to inconsistent
    behavior.
    
    Extends this consistency by removing the caching mode check in the pasid
    table update path.
    
    Suggested-by: Yi Liu <[email protected]>
    Signed-off-by: Lu Baolu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Joerg Roedel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR). [+ + +]
Author: Kuniyuki Iwashima <[email protected]>
Date:   Fri Aug 9 16:54:02 2024 -0700

    ipv4: Check !in_dev earlier for ioctl(SIOCSIFADDR).
    
    [ Upstream commit e3af3d3c5b26c33a7950e34e137584f6056c4319 ]
    
    dev->ip_ptr could be NULL if we set an invalid MTU.
    
    Even then, if we issue ioctl(SIOCSIFADDR) for a new IPv4 address,
    devinet_ioctl() allocates struct in_ifaddr and fails later in
    inet_set_ifa() because in_dev is NULL.
    
    Let's move the check earlier.
    
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: ip_gre: Fix drops of small packets in ipgre_xmit [+ + +]
Author: Anton Danilov <[email protected]>
Date:   Wed Sep 25 02:51:59 2024 +0300

    ipv4: ip_gre: Fix drops of small packets in ipgre_xmit
    
    [ Upstream commit c4a14f6d9d17ad1e41a36182dd3b8a5fd91efbd7 ]
    
    Regression Description:
    
    Depending on the options specified for the GRE tunnel device, small
    packets may be dropped. This occurs because the pskb_network_may_pull
    function fails due to the packet's insufficient length.
    
    For example, if only the okey option is specified for the tunnel device,
    original (before encapsulation) packets smaller than 28 bytes (including
    the IPv4 header) will be dropped. This happens because the required
    length is calculated relative to the network header, not the skb->head.
    
    Here is how the required length is computed and checked:
    
    * The pull_len variable is set to 28 bytes, consisting of:
      * IPv4 header: 20 bytes
      * GRE header with Key field: 8 bytes
    
    * The pskb_network_may_pull function adds the network offset, shifting
    the checkable space further to the beginning of the network header and
    extending it to the beginning of the packet. As a result, the end of
    the checkable space occurs beyond the actual end of the packet.
    
    Instead of ensuring that 28 bytes are present in skb->head, the function
    is requesting these 28 bytes starting from the network header. For small
    packets, this requested length exceeds the actual packet size, causing
    the check to fail and the packets to be dropped.
    
    This issue affects both locally originated and forwarded packets in
    DMVPN-like setups.
    
    How to reproduce (for local originated packets):
    
      ip link add dev gre1 type gre ikey 1.9.8.4 okey 1.9.8.4 \
              local <your-ip> remote 0.0.0.0
    
      ip link set mtu 1400 dev gre1
      ip link set up dev gre1
      ip address add 192.168.13.1/24 dev gre1
      ip neighbor add 192.168.13.2 lladdr <remote-ip> dev gre1
      ping -s 1374 -c 10 192.168.13.2
      tcpdump -vni gre1
      tcpdump -vni <your-ext-iface> 'ip proto 47'
      ip -s -s -d link show dev gre1
    
    Solution:
    
    Use the pskb_may_pull function instead the pskb_network_may_pull.
    
    Fixes: 80d875cfc9d3 ("ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()")
    Signed-off-by: Anton Danilov <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family [+ + +]
Author: Ido Schimmel <[email protected]>
Date:   Wed Aug 14 15:52:22 2024 +0300

    ipv4: Mask upper DSCP bits and ECN bits in NETLINK_FIB_LOOKUP family
    
    [ Upstream commit 8fed54758cd248cd311a2b5c1e180abef1866237 ]
    
    The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
    lookup according to user provided parameters and communicate the result
    back to user space.
    
    However, unlike other users of the FIB lookup API, the upper DSCP bits
    and the ECN bits of the DS field are not masked, which can result in the
    wrong result being returned.
    
    Solve this by masking the upper DSCP bits and the ECN bits using
    IPTOS_RT_MASK.
    
    The structure that communicates the request and the response is not
    exported to user space, so it is unlikely that this netlink family is
    actually in use [1].
    
    [1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
    
    Signed-off-by: Ido Schimmel <[email protected]>
    Reviewed-by: Guillaume Nault <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit [+ + +]
Author: Kemeng Shi <[email protected]>
Date:   Thu Aug 1 09:38:08 2024 +0800

    jbd2: correctly compare tids with tid_geq function in jbd2_fc_begin_commit
    
    commit f0e3c14802515f60a47e6ef347ea59c2733402aa upstream.
    
    Use tid_geq to compare tids to work over sequence number wraps.
    
    Signed-off-by: Kemeng Shi <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Reviewed-by: Zhang Yi <[email protected]>
    Cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error [+ + +]
Author: Baokun Li <[email protected]>
Date:   Thu Jul 18 19:53:36 2024 +0800

    jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
    
    commit f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a upstream.
    
    In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
    to recover some journal space. But if an error occurs while executing
    jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
    space right away, we try other branches, and if j_committing_transaction
    is NULL (i.e., the tid is 0), we will get the following complain:
    
    ============================================
    JBD2: I/O error when updating journal superblock for sdd-8.
    __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
    __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
    Modules linked in:
    CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
    RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
    Call Trace:
     <TASK>
     add_transaction_credits+0x5d1/0x5e0
     start_this_handle+0x1ef/0x6a0
     jbd2__journal_start+0x18b/0x340
     ext4_dirty_inode+0x5d/0xb0
     __mark_inode_dirty+0xe4/0x5d0
     generic_update_time+0x60/0x70
    [...]
    ============================================
    
    So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
    clean up at the moment, continue to try to reclaim free space in other ways.
    
    Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
    when updating journal superblock fails") to make jbd2_cleanup_journal_tail
    return the correct error code.
    
    Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
    Cc: [email protected]
    Signed-off-by: Baokun Li <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
jfs: check if leafidx greater than num leaves per dmap tree [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 09:25:23 2024 +0800

    jfs: check if leafidx greater than num leaves per dmap tree
    
    [ Upstream commit d64ff0d2306713ff084d4b09f84ed1a8c75ecc32 ]
    
    syzbot report a out of bounds in dbSplit, it because dmt_leafidx greater
    than num leaves per dmap tree, add a checking for dmt_leafidx in dbFindLeaf.
    
    Shaggy:
    Modified sanity check to apply to control pages as well as leaf pages.
    
    Reported-and-tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=dca05492eff41f604890
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uaf in dbFreeBits [+ + +]
Author: Edward Adam Davis <[email protected]>
Date:   Sat Aug 24 10:50:48 2024 +0800

    jfs: Fix uaf in dbFreeBits
    
    [ Upstream commit d6c1b3599b2feb5c7291f5ac3a36e5fa7cedb234 ]
    
    [syzbot reported]
    ==================================================================
    BUG: KASAN: slab-use-after-free in __mutex_lock_common kernel/locking/mutex.c:587 [inline]
    BUG: KASAN: slab-use-after-free in __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
    Read of size 8 at addr ffff8880229254b0 by task syz-executor357/5216
    
    CPU: 0 UID: 0 PID: 5216 Comm: syz-executor357 Not tainted 6.11.0-rc3-syzkaller-00156-gd7a5aa4b3c00 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
    Call Trace:
     <TASK>
     __dump_stack lib/dump_stack.c:93 [inline]
     dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
     print_address_description mm/kasan/report.c:377 [inline]
     print_report+0x169/0x550 mm/kasan/report.c:488
     kasan_report+0x143/0x180 mm/kasan/report.c:601
     __mutex_lock_common kernel/locking/mutex.c:587 [inline]
     __mutex_lock+0xfe/0xd70 kernel/locking/mutex.c:752
     dbFreeBits+0x7ea/0xd90 fs/jfs/jfs_dmap.c:2390
     dbFreeDmap fs/jfs/jfs_dmap.c:2089 [inline]
     dbFree+0x35b/0x680 fs/jfs/jfs_dmap.c:409
     dbDiscardAG+0x8a9/0xa20 fs/jfs/jfs_dmap.c:1650
     jfs_ioc_trim+0x433/0x670 fs/jfs/jfs_discard.c:100
     jfs_ioctl+0x2d0/0x3e0 fs/jfs/ioctl.c:131
     vfs_ioctl fs/ioctl.c:51 [inline]
     __do_sys_ioctl fs/ioctl.c:907 [inline]
     __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
    
    Freed by task 5218:
     kasan_save_stack mm/kasan/common.c:47 [inline]
     kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
     kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
     poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
     __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
     kasan_slab_free include/linux/kasan.h:184 [inline]
     slab_free_hook mm/slub.c:2252 [inline]
     slab_free mm/slub.c:4473 [inline]
     kfree+0x149/0x360 mm/slub.c:4594
     dbUnmount+0x11d/0x190 fs/jfs/jfs_dmap.c:278
     jfs_mount_rw+0x4ac/0x6a0 fs/jfs/jfs_mount.c:247
     jfs_remount+0x3d1/0x6b0 fs/jfs/super.c:454
     reconfigure_super+0x445/0x880 fs/super.c:1083
     vfs_cmd_reconfigure fs/fsopen.c:263 [inline]
     vfs_fsconfig_locked fs/fsopen.c:292 [inline]
     __do_sys_fsconfig fs/fsopen.c:473 [inline]
     __se_sys_fsconfig+0xb6e/0xf80 fs/fsopen.c:345
     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    [Analysis]
    There are two paths (dbUnmount and jfs_ioc_trim) that generate race
    condition when accessing bmap, which leads to the occurrence of uaf.
    
    Use the lock s_umount to synchronize them, in order to avoid uaf caused
    by race condition.
    
    Reported-and-tested-by: [email protected]
    Signed-off-by: Edward Adam Davis <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: Fix uninit-value access of new_ea in ea_buffer [+ + +]
Author: Zhao Mengmeng <[email protected]>
Date:   Wed Sep 4 09:07:58 2024 +0800

    jfs: Fix uninit-value access of new_ea in ea_buffer
    
    [ Upstream commit 2b59ffad47db1c46af25ccad157bb3b25147c35c ]
    
    syzbot reports that lzo1x_1_do_compress is using uninit-value:
    
    =====================================================
    BUG: KMSAN: uninit-value in lzo1x_1_do_compress+0x19f9/0x2510 lib/lzo/lzo1x_compress.c:178
    
    ...
    
    Uninit was stored to memory at:
     ea_put fs/jfs/xattr.c:639 [inline]
    
    ...
    
    Local variable ea_buf created at:
     __jfs_setxattr+0x5d/0x1ae0 fs/jfs/xattr.c:662
     __jfs_xattr_set+0xe6/0x1f0 fs/jfs/xattr.c:934
    
    =====================================================
    
    The reason is ea_buf->new_ea is not initialized properly.
    
    Fix this by using memset to empty its content at the beginning
    in ea_get().
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=02341e0daa42a15ce130
    Signed-off-by: Zhao Mengmeng <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jfs: UBSAN: shift-out-of-bounds in dbFindBits [+ + +]
Author: Remington Brasga <[email protected]>
Date:   Wed Jul 10 00:12:44 2024 +0000

    jfs: UBSAN: shift-out-of-bounds in dbFindBits
    
    [ Upstream commit b0b2fc815e514221f01384f39fbfbff65d897e1c ]
    
    Fix issue with UBSAN throwing shift-out-of-bounds warning.
    
    Reported-by: [email protected]
    Signed-off-by: Remington Brasga <[email protected]>
    Signed-off-by: Dave Kleikamp <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
jump_label: Fix static_key_slow_dec() yet again [+ + +]
Author: Peter Zijlstra <[email protected]>
Date:   Mon Sep 9 12:50:09 2024 +0200

    jump_label: Fix static_key_slow_dec() yet again
    
    [ Upstream commit 1d7f856c2ca449f04a22d876e36b464b7a9d28b6 ]
    
    While commit 83ab38ef0a0b ("jump_label: Fix concurrency issues in
    static_key_slow_dec()") fixed one problem, it created yet another,
    notably the following is now possible:
    
      slow_dec
        if (try_dec) // dec_not_one-ish, false
        // enabled == 1
                                    slow_inc
                                      if (inc_not_disabled) // inc_not_zero-ish
                                      // enabled == 2
                                        return
    
        guard((mutex)(&jump_label_mutex);
        if (atomic_cmpxchg(1,0)==1) // false, we're 2
    
                                    slow_dec
                                      if (try-dec) // dec_not_one, true
                                      // enabled == 1
                                        return
        else
          try_dec() // dec_not_one, false
          WARN
    
    Use dec_and_test instead of cmpxchg(), like it was prior to
    83ab38ef0a0b. Add a few WARNs for the paranoid.
    
    Fixes: 83ab38ef0a0b ("jump_label: Fix concurrency issues in static_key_slow_dec()")
    Reported-by: "Darrick J. Wong" <[email protected]>
    Tested-by: Klara Modin <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jump_label: Simplify and clarify static_key_fast_inc_cpus_locked() [+ + +]
Author: Thomas Gleixner <[email protected]>
Date:   Mon Jun 10 14:46:39 2024 +0200

    jump_label: Simplify and clarify static_key_fast_inc_cpus_locked()
    
    [ Upstream commit 9bc2ff871f00437ad2f10c1eceff51aaa72b478f ]
    
    Make the code more obvious and add proper comments to avoid future head
    scratching.
    
    Signed-off-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://lkml.kernel.org/r/[email protected]
    Stable-dep-of: 1d7f856c2ca4 ("jump_label: Fix static_key_slow_dec() yet again")
    Signed-off-by: Sasha Levin <[email protected]>

 
kconfig: qconf: fix buffer overflow in debug links [+ + +]
Author: Masahiro Yamada <[email protected]>
Date:   Tue Oct 1 18:02:22 2024 +0900

    kconfig: qconf: fix buffer overflow in debug links
    
    [ Upstream commit 984ed20ece1c6c20789ece040cbff3eb1a388fa9 ]
    
    If you enable "Option -> Show Debug Info" and click a link, the program
    terminates with the following error:
    
        *** buffer overflow detected ***: terminated
    
    The buffer overflow is caused by the following line:
    
        strcat(data, "$");
    
    The buffer needs one more byte to accommodate the additional character.
    
    Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again")
    Signed-off-by: Masahiro Yamada <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3 [+ + +]
Author: Alessandro Zanni <[email protected]>
Date:   Tue Aug 6 14:14:50 2024 +0200

    kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
    
    [ Upstream commit a19008256d05e726f29f43c6a307e45482c082c3 ]
    
    Insert raw strings to prevent Python3 from interpreting string literals
    as Unicode strings and "\d" as invalid escaped sequence.
    
    Fix the warnings:
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py:48:
    SyntaxWarning: invalid escape sequence '\d' usb_controller_sysfs_dir =
    "usb[\d]+"
    
    tools/testing/selftests/devices/probe/test_discoverable_devices.py: 94:
    SyntaxWarning: invalid escape sequence '\d' re_usb_version =
    re.compile("PRODUCT=.*/(\d)/.*")
    
    Fixes: dacf1d7a78bf ("kselftest: Add test to verify probe of devices from discoverable buses")
    
    Reviewed-by: Nícolas F. R. A. Prado <[email protected]>
    Signed-off-by: Alessandro Zanni <[email protected]>
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ksmbd: add refcnt to ksmbd_conn struct [+ + +]
Author: Namjae Jeon <[email protected]>
Date:   Tue Sep 3 20:28:08 2024 +0900

    ksmbd: add refcnt to ksmbd_conn struct
    
    [ Upstream commit ee426bfb9d09b29987369b897fe9b6485ac2be27 ]
    
    When sending an oplock break request, opinfo->conn is used,
    But freed ->conn can be used on multichannel.
    This patch add a reference count to the ksmbd_conn struct
    so that it can be freed when it is no longer used.
    
    Signed-off-by: Namjae Jeon <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
KVM: arm64: Fix kvm_has_feat*() handling of negative features [+ + +]
Author: Marc Zyngier <[email protected]>
Date:   Wed Oct 2 21:42:39 2024 +0100

    KVM: arm64: Fix kvm_has_feat*() handling of negative features
    
    commit a1d402abf8e3ff1d821e88993fc5331784fac0da upstream.
    
    Oliver reports that the kvm_has_feat() helper is not behaviing as
    expected for negative feature. On investigation, the main issue
    seems to be caused by the following construct:
    
     #define get_idreg_field(kvm, id, fld)                          \
            (id##_##fld##_SIGNED ?                                  \
             get_idreg_field_signed(kvm, id, fld) :                 \
             get_idreg_field_unsigned(kvm, id, fld))
    
    where one side of the expression evaluates as something signed,
    and the other as something unsigned. In retrospect, this is totally
    braindead, as the compiler converts this into an unsigned expression.
    When compared to something that is 0, the test is simply elided.
    
    Epic fail. Similar issue exists in the expand_field_sign() macro.
    
    The correct way to handle this is to chose between signed and unsigned
    comparisons, so that both sides of the ternary expression are of the
    same type (bool).
    
    In order to keep the code readable (sort of), we introduce new
    comparison primitives taking an operator as a parameter, and
    rewrite the kvm_has_feat*() helpers in terms of these primitives.
    
    Fixes: c62d7a23b947 ("KVM: arm64: Add feature checking helpers")
    Reported-by: Oliver Upton <[email protected]>
    Tested-by: Oliver Upton <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
lib/buildid: harden build ID parsing logic [+ + +]
Author: Andrii Nakryiko <[email protected]>
Date:   Thu Aug 29 10:42:23 2024 -0700

    lib/buildid: harden build ID parsing logic
    
    [ Upstream commit 905415ff3ffb1d7e5afa62bacabd79776bd24606 ]
    
    Harden build ID parsing logic, adding explicit READ_ONCE() where it's
    important to have a consistent value read and validated just once.
    
    Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
    note is within a page bounds, so move the overflow check up and add an
    extra note_size boundaries validation.
    
    Fixes tag below points to the code that moved this code into
    lib/buildid.c, and then subsequently was used in perf subsystem, making
    this code exposed to perf_event_open() users in v5.12+.
    
    Cc: [email protected]
    Reviewed-by: Eduard Zingerman <[email protected]>
    Reviewed-by: Jann Horn <[email protected]>
    Suggested-by: Andi Kleen <[email protected]>
    Fixes: bd7525dacd7e ("bpf: Move stack_map_get_build_id into lib")
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
Linux: Linux 6.10.14 [+ + +]
Author: Greg Kroah-Hartman <[email protected]>
Date:   Thu Oct 10 12:01:14 2024 +0200

    Linux 6.10.14
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Mark Brown <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Kexy Biscuit <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: kernelci.org bot <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
loop: don't set QUEUE_FLAG_NOMERGES [+ + +]
Author: Christoph Hellwig <[email protected]>
Date:   Thu Jun 27 14:49:11 2024 +0200

    loop: don't set QUEUE_FLAG_NOMERGES
    
    [ Upstream commit 667ea36378cf7f669044b27871c496e1559c872a ]
    
    QUEUE_FLAG_NOMERGES isn't really a driver interface, but a user tunable.
    There also isn't any good reason to set it in the loop driver.
    
    The original commit adding it (5b5e20f421c0b6d "block: loop: set
    QUEUE_FLAG_NOMERGES for request queue of loop") claims that "It doesn't
    make sense to enable merge because the I/O submitted to backing file is
    handled page by page."  which of course isn't true for multi-page bvec
    now, and it never has been for direct I/O, for which commit 40326d8a33d
    ("block/loop: allow request merge for directio mode") alredy disabled
    the nomerges flag.
    
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Bart Van Assche <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
mac802154: Fix potential RCU dereference issue in mac802154_scan_worker [+ + +]
Author: Jiawei Ye <[email protected]>
Date:   Tue Sep 24 06:58:05 2024 +0000

    mac802154: Fix potential RCU dereference issue in mac802154_scan_worker
    
    commit bff1709b3980bd7f80be6786f64cc9a9ee9e56da upstream.
    
    In the `mac802154_scan_worker` function, the `scan_req->type` field was
    accessed after the RCU read-side critical section was unlocked. According
    to RCU usage rules, this is illegal and can lead to unpredictable
    behavior, such as accessing memory that has been updated or causing
    use-after-free issues.
    
    This possible bug was identified using a static analysis tool developed
    by myself, specifically designed to detect RCU-related issues.
    
    To address this, the `scan_req->type` value is now stored in a local
    variable `scan_req_type` while still within the RCU read-side critical
    section. The `scan_req_type` is then used after the RCU lock is released,
    ensuring that the type value is safely accessed without violating RCU
    rules.
    
    Fixes: e2c3e6f53a7a ("mac802154: Handle active scanning")
    Cc: [email protected]
    Signed-off-by: Jiawei Ye <[email protected]>
    Acked-by: Miquel Raynal <[email protected]>
    Reviewed-by: Przemek Kitszel <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mailbox: ARM_MHU_V3 should depend on ARM64 [+ + +]
Author: Geert Uytterhoeven <[email protected]>
Date:   Thu Aug 29 15:58:53 2024 +0200

    mailbox: ARM_MHU_V3 should depend on ARM64
    
    [ Upstream commit 0e4ed48292c55eeb0afab22f8930b556f17eaad2 ]
    
    The ARM MHUv3 controller is only present on ARM64 SoCs.  Hence add a
    dependency on ARM64, to prevent asking the user about this driver when
    configuring a kernel for a different architecture than ARM64.
    
    Fixes: ca1a8680b134b5e6 ("mailbox: arm_mhuv3: Add driver")
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Acked-by: Sudeep Holla <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: bcm2835: Fix timeout during suspend mode [+ + +]
Author: Stefan Wahren <[email protected]>
Date:   Wed Aug 21 23:40:44 2024 +0200

    mailbox: bcm2835: Fix timeout during suspend mode
    
    [ Upstream commit dc09f007caed3b2f6a3b6bd7e13777557ae22bfd ]
    
    During noirq suspend phase the Raspberry Pi power driver suffer of
    firmware property timeouts. The reason is that the IRQ of the underlying
    BCM2835 mailbox is disabled and rpi_firmware_property_list() will always
    run into a timeout [1].
    
    Since the VideoCore side isn't consider as a wakeup source, set the
    IRQF_NO_SUSPEND flag for the mailbox IRQ in order to keep it enabled
    during suspend-resume cycle.
    
    [1]
    PM: late suspend of devices complete after 1.754 msecs
    WARNING: CPU: 0 PID: 438 at drivers/firmware/raspberrypi.c:128
     rpi_firmware_property_list+0x204/0x22c
    Firmware transaction 0x00028001 timeout
    Modules linked in:
    CPU: 0 PID: 438 Comm: bash Tainted: G         C         6.9.3-dirty #17
    Hardware name: BCM2835
    Call trace:
    unwind_backtrace from show_stack+0x18/0x1c
    show_stack from dump_stack_lvl+0x34/0x44
    dump_stack_lvl from __warn+0x88/0xec
    __warn from warn_slowpath_fmt+0x7c/0xb0
    warn_slowpath_fmt from rpi_firmware_property_list+0x204/0x22c
    rpi_firmware_property_list from rpi_firmware_property+0x68/0x8c
    rpi_firmware_property from rpi_firmware_set_power+0x54/0xc0
    rpi_firmware_set_power from _genpd_power_off+0xe4/0x148
    _genpd_power_off from genpd_sync_power_off+0x7c/0x11c
    genpd_sync_power_off from genpd_finish_suspend+0xcc/0xe0
    genpd_finish_suspend from dpm_run_callback+0x78/0xd0
    dpm_run_callback from device_suspend_noirq+0xc0/0x238
    device_suspend_noirq from dpm_suspend_noirq+0xb0/0x168
    dpm_suspend_noirq from suspend_devices_and_enter+0x1b8/0x5ac
    suspend_devices_and_enter from pm_suspend+0x254/0x2e4
    pm_suspend from state_store+0xa8/0xd4
    state_store from kernfs_fop_write_iter+0x154/0x1a0
    kernfs_fop_write_iter from vfs_write+0x12c/0x184
    vfs_write from ksys_write+0x78/0xc0
    ksys_write from ret_fast_syscall+0x0/0x54
    Exception stack(0xcc93dfa8 to 0xcc93dff0)
    [...]
    PM: noirq suspend of devices complete after 3095.584 msecs
    
    Link: https://github.com/raspberrypi/firmware/issues/1894
    Fixes: 0bae6af6d704 ("mailbox: Enable BCM2835 mailbox support")
    Signed-off-by: Stefan Wahren <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mailbox: rockchip: fix a typo in module autoloading [+ + +]
Author: Liao Chen <[email protected]>
Date:   Wed Aug 14 02:51:47 2024 +0000

    mailbox: rockchip: fix a typo in module autoloading
    
    [ Upstream commit e92d87c9c5d769e4cb1dd7c90faa38dddd7e52e3 ]
    
    MODULE_DEVICE_TABLE(of, rockchip_mbox_of_match) could let the module
    properly autoloaded based on the alias from of_device_id table. It
    should be 'rockchip_mbox_of_match' instead of 'rockchp_mbox_of_match',
    just fix it.
    
    Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")
    Signed-off-by: Liao Chen <[email protected]>
    Reviewed-by: Heiko Stuebner <[email protected]>
    Signed-off-by: Jassi Brar <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
media: i2c: ar0521: Use cansleep version of gpiod_set_value() [+ + +]
Author: Alexander Shiyan <[email protected]>
Date:   Thu Aug 29 08:48:49 2024 +0300

    media: i2c: ar0521: Use cansleep version of gpiod_set_value()
    
    commit bee1aed819a8cda47927436685d216906ed17f62 upstream.
    
    If we use GPIO reset from I2C port expander, we must use *_cansleep()
    variant of GPIO functions.
    This was not done in ar0521_power_on()/ar0521_power_off() functions.
    Let's fix that.
    
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 11 at drivers/gpio/gpiolib.c:3496 gpiod_set_value+0x74/0x7c
    Modules linked in:
    CPU: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.10.0 #53
    Hardware name: Diasom DS-RK3568-SOM-EVB (DT)
    Workqueue: events_unbound deferred_probe_work_func
    pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : gpiod_set_value+0x74/0x7c
    lr : ar0521_power_on+0xcc/0x290
    sp : ffffff8001d7ab70
    x29: ffffff8001d7ab70 x28: ffffff80027dcc90 x27: ffffff8003c82000
    x26: ffffff8003ca9250 x25: ffffffc080a39c60 x24: ffffff8003ca9088
    x23: ffffff8002402720 x22: ffffff8003ca9080 x21: ffffff8003ca9088
    x20: 0000000000000000 x19: ffffff8001eb2a00 x18: ffffff80efeeac80
    x17: 756d2d6332692f30 x16: 0000000000000000 x15: 0000000000000000
    x14: ffffff8001d91d40 x13: 0000000000000016 x12: ffffffc080e98930
    x11: ffffff8001eb2880 x10: 0000000000000890 x9 : ffffff8001d7a9f0
    x8 : ffffff8001d92570 x7 : ffffff80efeeac80 x6 : 000000003fc6e780
    x5 : ffffff8001d91c80 x4 : 0000000000000002 x3 : 0000000000000000
    x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000001
    Call trace:
     gpiod_set_value+0x74/0x7c
     ar0521_power_on+0xcc/0x290
    ...
    
    Signed-off-by: Alexander Shiyan <[email protected]>
    Fixes: 852b50aeed15 ("media: On Semi AR0521 sensor driver")
    Cc: [email protected]
    Acked-by: Krzysztof Hałasa <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: imx335: Fix reset-gpio handling [+ + +]
Author: Umang Jain <[email protected]>
Date:   Fri Aug 30 11:41:52 2024 +0530

    media: imx335: Fix reset-gpio handling
    
    commit 99d30e2fdea4086be4e66e2deb10de854b547ab8 upstream.
    
    Rectify the logical value of reset-gpio so that it is set to
    0 (disabled) during power-on and to 1 (enabled) during power-off.
    
    Set the reset-gpio to GPIO_OUT_HIGH at initialization time to make
    sure it starts off in reset. Also drop the "Set XCLR" comment which
    is not-so-informative.
    
    The existing usage of imx335 had reset-gpios polarity inverted
    (GPIO_ACTIVE_HIGH) in their device-tree sources. With this patch
    included, those DTS will not be able to stream imx335 anymore. The
    reset-gpio polarity will need to be rectified in the device-tree
    sources as shown in [1] example, in order to get imx335 functional
    again (as it remains in reset prior to this fix).
    
    Cc: [email protected]
    Fixes: 45d19b5fb9ae ("media: i2c: Add imx335 camera sensor driver")
    Reviewed-by: Laurent Pinchart <[email protected]>
    Link: https://lore.kernel.org/linux-media/[email protected]/
    Signed-off-by: Umang Jain <[email protected]>
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: ov5675: Fix power on/off delay timings [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Sat Jul 13 23:33:29 2024 +0100

    media: ov5675: Fix power on/off delay timings
    
    commit 719ec29fceda2f19c833d2784b1574638320400f upstream.
    
    The ov5675 specification says that the gap between XSHUTDN deassert and the
    first I2C transaction should be a minimum of 8192 XVCLK cycles.
    
    Right now we use a usleep_rage() that gives a sleep time of between about
    430 and 860 microseconds.
    
    On the Lenovo X13s we have observed that in about 1/20 cases the current
    timing is too tight and we start transacting before the ov5675's reset
    cycle completes, leading to I2C bus transaction failures.
    
    The reset racing is sometimes triggered at initial chip probe but, more
    usually on a subsequent power-off/power-on cycle e.g.
    
    [   71.451662] ov5675 24-0010: failed to write reg 0x0103. error = -5
    [   71.451686] ov5675 24-0010: failed to set plls
    
    The current quiescence period we have is too tight. Instead of expressing
    the post reset delay in terms of the current XVCLK this patch converts the
    power-on and power-off delays to the maximum theoretical delay @ 6 MHz with
    an additional buffer.
    
    1.365 milliseconds on the power-on path is 1.5 milliseconds with grace.
    85.3 microseconds on the power-off path is 90 microseconds with grace.
    
    Fixes: 49d9ad719e89 ("media: ov5675: add device-tree support and support runtime PM")
    Cc: [email protected]
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Tested-by: Johan Hovold <[email protected]>
    Reviewed-by: Quentin Schulz <[email protected]>
    Tested-by: Quentin Schulz <[email protected]> # RK3399 Puma with
    Signed-off-by: Sakari Ailus <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: qcom: camss: Fix ordering of pm_runtime_enable [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:03 2024 +0100

    media: qcom: camss: Fix ordering of pm_runtime_enable
    
    commit a151766bd3688f6803e706c6433a7c8d3c6a6a94 upstream.
    
    pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
    pm_runtime_resume_and_get().
    
    This is a basic race condition that doesn't show up for most users so is
    not widely reported. If you blacklist qcom-camss in modules.d and then
    subsequently modprobe the module post-boot it is possible to reliably show
    this error up.
    
    The kernel log for this error looks like this:
    
    qcom-camss ac5a000.camss: Failed to power up pipeline: -13
    
    Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Reviewed-by: Konrad Dybcio <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: qcom: camss: Remove use_count guard in stop_streaming [+ + +]
Author: Bryan O'Donoghue <[email protected]>
Date:   Mon Jul 29 13:42:02 2024 +0100

    media: qcom: camss: Remove use_count guard in stop_streaming
    
    commit 25f18cb1b673220b76a86ebef8e7fb79bd303b27 upstream.
    
    The use_count check was introduced so that multiple concurrent Raw Data
    Interfaces RDIs could be driven by different virtual channels VCs on the
    CSIPHY input driving the video pipeline.
    
    This is an invalid use of use_count though as use_count pertains to the
    number of times a video entity has been opened by user-space not the number
    of active streams.
    
    If use_count and stream-on count don't agree then stop_streaming() will
    break as is currently the case and has become apparent when using CAMSS
    with libcamera's released softisp 0.3.
    
    The use of use_count like this is a bit hacky and right now breaks regular
    usage of CAMSS for a single stream case. Stopping qcam results in the splat
    below, and then it cannot be started again and any attempts to do so fails
    with -EBUSY.
    
    [ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    ...
    [ 1265.510630] Call trace:
    [ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
    [ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
    [ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
    [ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
    [ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
    [ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
    [ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
    [ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
    ...
    [ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
    [ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
    [ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st
    
    One CAMSS specific way to handle multiple VCs on the same RDI might be:
    
    - Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
    - The video buffers are already associated with msm_vfeN_rdiX so
      release video buffers when told to do so by stop_streaming.
    - Only release the power-domains for the CSIPHY, CSID and VFE when
      their internal refcounts drop.
    
    Either way refusing to release video buffers based on use_count is
    erroneous and should be reverted. The silicon enabling code for selecting
    VCs is perfectly fine. Its a "known missing feature" that concurrent VCs
    won't work with CAMSS right now.
    
    Initial testing with this code didn't show an error but, SoftISP and "real"
    usage with Google Hangouts breaks the upstream code pretty quickly, we need
    to do a partial revert and take another pass at VCs.
    
    This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
    Pipeline starting and stopping for multiple virtual channels")
    
    Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")
    Reported-by: Johan Hovold <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Johan Hovold <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Bryan O'Donoghue <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: sun4i_csi: Implement link validate for sun4i_csi subdev [+ + +]
Author: Laurent Pinchart <[email protected]>
Date:   Wed Jun 19 02:46:16 2024 +0300

    media: sun4i_csi: Implement link validate for sun4i_csi subdev
    
    commit 2dc5d5d401f5c6cecd97800ffef82e8d17d228f0 upstream.
    
    The sun4i_csi driver doesn't implement link validation for the subdev it
    registers, leaving the link between the subdev and its source
    unvalidated. Fix it, using the v4l2_subdev_link_validate() helper.
    
    Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Acked-by: Chen-Yu Tsai <[email protected]>
    Reviewed-by: Tomi Valkeinen <[email protected]>
    Acked-by: Sakari Ailus <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags [+ + +]
Author: Hans Verkuil <[email protected]>
Date:   Wed Aug 7 09:22:10 2024 +0200

    media: uapi/linux/cec.h: cec_msg_set_reply_to: zero flags
    
    commit 599f6899051cb70c4e0aa9fd591b9ee220cb6f14 upstream.
    
    The cec_msg_set_reply_to() helper function never zeroed the
    struct cec_msg flags field, this can cause unexpected behavior
    if flags was uninitialized to begin with.
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media")
    Cc: <[email protected]>
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: venus: fix use after free bug in venus_remove due to race condition [+ + +]
Author: Zheng Wang <[email protected]>
Date:   Tue Jun 18 14:55:59 2024 +0530

    media: venus: fix use after free bug in venus_remove due to race condition
    
    commit c5a85ed88e043474161bbfe54002c89c1cb50ee2 upstream.
    
    in venus_probe, core->work is bound with venus_sys_error_handler, which is
    used to handle error. The code use core->sys_err_done to make sync work.
    The core->work is started in venus_event_notify.
    
    If we call venus_remove, there might be an unfished work. The possible
    sequence is as follows:
    
    CPU0                  CPU1
    
                         |venus_sys_error_handler
    venus_remove         |
    hfi_destroy                      |
    venus_hfi_destroy        |
    kfree(hdev);         |
                         |hfi_reinit
                                             |venus_hfi_queues_reinit
                         |//use hdev
    
    Fix it by canceling the work in venus_remove.
    
    Cc: [email protected]
    Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
    Signed-off-by: Zheng Wang <[email protected]>
    Signed-off-by: Dikshita Agarwal <[email protected]>
    Signed-off-by: Stanimir Varbanov <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

media: videobuf2: Drop minimum allocation requirement of 2 buffers [+ + +]
Author: Laurent Pinchart <[email protected]>
Date:   Mon Aug 26 02:24:49 2024 +0300

    media: videobuf2: Drop minimum allocation requirement of 2 buffers
    
    commit e5700c9037727d5a69a677d6dba25010b485d65b upstream.
    
    When introducing the ability for drivers to indicate the minimum number
    of buffers they require an application to allocate, commit 6662edcd32cc
    ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue
    structure") also introduced a global minimum of 2 buffers. It turns out
    this breaks the Renesas R-Car VSP test suite, where a test that
    allocates a single buffer fails when two buffers are used.
    
    One may consider debatable whether test suite failures without failures
    in production use cases should be considered as a regression, but
    operation with a single buffer is a valid use case. While full frame
    rate can't be maintained, memory-to-memory devices can still be used
    with a decent efficiency, and requiring applications to allocate
    multiple buffers for single-shot use cases with capture devices would
    just waste memory.
    
    For those reasons, fix the regression by dropping the global minimum of
    buffers. Individual drivers can still set their own minimum.
    
    Fixes: 6662edcd32cc ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue structure")
    Cc: [email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Reviewed-by: Hans Verkuil <[email protected]>
    Acked-by: Tomasz Figa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
memory: tegra186-emc: drop unused to_tegra186_emc() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Mon Aug 12 14:30:55 2024 +0200

    memory: tegra186-emc: drop unused to_tegra186_emc()
    
    commit 67dd9e861add38755a7c5d29e25dd0f6cb4116ab upstream.
    
    to_tegra186_emc() is not used, W=1 builds:
    
      tegra186-emc.c:38:36: error: unused function 'to_tegra186_emc' [-Werror,-Wunused-function]
    
    Fixes: 9a38cb27668e ("memory: tegra: Add interconnect support for DRAM scaling in Tegra234")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm, slub: avoid zeroing kmalloc redzone [+ + +]
Author: Peng Fan <[email protected]>
Date:   Thu Aug 29 11:29:11 2024 +0800

    mm, slub: avoid zeroing kmalloc redzone
    
    commit 59090e479ac78ae18facd4c58eb332562a23020e upstream.
    
    Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra
    allocated kmalloc space than requested"), setting orig_size treats
    the wasted space (object_size - orig_size) as a redzone. However with
    init_on_free=1 we clear the full object->size, including the redzone.
    
    Additionally we clear the object metadata, including the stored orig_size,
    making it zero, which makes check_object() treat the whole object as a
    redzone.
    
    These issues lead to the following BUG report with "slub_debug=FUZ
    init_on_free=1":
    
    [    0.000000] =============================================================================
    [    0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten
    [    0.000000] -----------------------------------------------------------------------------
    [    0.000000]
    [    0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc
    [    0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc
    [    0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff)
    [    0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8
    [    0.000000]
    [    0.000000] Redzone  ffff000010032850: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Object   ffff000010032858: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Redzone  ffff000010032860: cc cc cc cc cc cc cc cc                          ........
    [    0.000000] Padding  ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00              ............
    [    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144
    [    0.000000] Hardware name: NXP i.MX95 19X19 board (DT)
    [    0.000000] Call trace:
    [    0.000000]  dump_backtrace+0x90/0xe8
    [    0.000000]  show_stack+0x18/0x24
    [    0.000000]  dump_stack_lvl+0x74/0x8c
    [    0.000000]  dump_stack+0x18/0x24
    [    0.000000]  print_trailer+0x150/0x218
    [    0.000000]  check_object+0xe4/0x454
    [    0.000000]  free_to_partial_list+0x2f8/0x5ec
    
    To address the issue, use orig_size to clear the used area. And restore
    the value of orig_size after clear the remaining area.
    
    When CONFIG_SLUB_DEBUG not defined, (get_orig_size()' directly returns
    s->object_size. So when using memset to init the area, the size can simply
    be orig_size, as orig_size returns object_size when CONFIG_SLUB_DEBUG not
    enabled. And orig_size can never be bigger than object_size.
    
    Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
    Cc: <[email protected]>
    Reviewed-by: Feng Tang <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Signed-off-by: Peng Fan <[email protected]>
    Signed-off-by: Vlastimil Babka <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
mm: krealloc: consider spare memory for __GFP_ZERO [+ + +]
Author: Danilo Krummrich <[email protected]>
Date:   Tue Aug 13 00:34:34 2024 +0200

    mm: krealloc: consider spare memory for __GFP_ZERO
    
    commit 1a83a716ec233990e1fd5b6fbb1200ade63bf450 upstream.
    
    As long as krealloc() is called with __GFP_ZERO consistently, starting
    with the initial memory allocation, __GFP_ZERO should be fully honored.
    
    However, if for an existing allocation krealloc() is called with a
    decreased size, it is not ensured that the spare portion the allocation is
    zeroed.  Thus, if krealloc() is subsequently called with a larger size
    again, __GFP_ZERO can't be fully honored, since we don't know the previous
    size, but only the bucket size.
    
    Example:
    
            buf = kzalloc(64, GFP_KERNEL);
            memset(buf, 0xff, 64);
    
            buf = krealloc(buf, 48, GFP_KERNEL | __GFP_ZERO);
    
            /* After this call the last 16 bytes are still 0xff. */
            buf = krealloc(buf, 64, GFP_KERNEL | __GFP_ZERO);
    
    Fix this, by explicitly setting spare memory to zero, when shrinking an
    allocation with __GFP_ZERO flag set or init_on_alloc enabled.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Danilo Krummrich <[email protected]>
    Acked-by: Vlastimil Babka <[email protected]>
    Acked-by: David Rientjes <[email protected]>
    Cc: Christoph Lameter <[email protected]>
    Cc: Hyeonggon Yoo <[email protected]>
    Cc: Joonsoo Kim <[email protected]>
    Cc: Pekka Enberg <[email protected]>
    Cc: Roman Gushchin <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mm: z3fold: deprecate CONFIG_Z3FOLD [+ + +]
Author: Yosry Ahmed <[email protected]>
Date:   Mon Oct 7 19:23:51 2024 +0000

    mm: z3fold: deprecate CONFIG_Z3FOLD
    
    [ Upstream commit 7a2369b74abf76cd3e54c45b30f6addb497f831b ]
    
    The z3fold compressed pages allocator is rarely used, most users use
    zsmalloc.  The only disadvantage of zsmalloc in comparison is the
    dependency on MMU, and zbud is a more common option for !MMU as it was the
    default zswap allocator for a long time.
    
    Historically, zsmalloc had worse latency than zbud and z3fold but offered
    better memory savings.  This is no longer the case as shown by a simple
    recent analysis [1].  That analysis showed that z3fold does not have any
    advantage over zsmalloc or zbud considering both performance and memory
    usage.  In a kernel build test on tmpfs in a limited cgroup, z3fold took
    3% more time and used 1.8% more memory.  The latency of zswap_load() was
    7% higher, and that of zswap_store() was 10% higher.  Zsmalloc is better
    in all metrics.
    
    Moreover, z3fold apparently has latent bugs, which was made noticeable by
    a recent soft lockup bug report with z3fold [2].  Switching to zsmalloc
    not only fixed the problem, but also reduced the swap usage from 6~8G to
    1~2G.  Other users have also reported being bitten by mistakenly enabling
    z3fold.
    
    Other than hurting users, z3fold is repeatedly causing wasted engineering
    effort.  Apart from investigating the above bug, it came up in multiple
    development discussions (e.g.  [3]) as something we need to handle, when
    there aren't any legit users (at least not intentionally).
    
    The natural course of action is to deprecate z3fold, and remove in a few
    cycles if no objections are raised from active users.  Next on the list
    should be zbud, as it offers marginal latency gains at the cost of huge
    memory waste when compared to zsmalloc.  That one will need to wait until
    zsmalloc does not depend on MMU.
    
    Rename the user-visible config option from CONFIG_Z3FOLD to
    CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
    prompt with explanation during make oldconfig.  Also, remove
    CONFIG_Z3FOLD=y from defconfigs.
    
    [1]https://lore.kernel.org/lkml/CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com/
    [2]https://lore.kernel.org/lkml/[email protected]/
    [3]https://lore.kernel.org/lkml/CAJD7tkbnmeVugfunffSovJf9FAgy9rhBVt_tx=nxUveLUfqVsA@mail.gmail.com/
    
    [[email protected]: deprecate ZSWAP_ZPOOL_DEFAULT_Z3FOLD as well]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Yosry Ahmed <[email protected]>
    Signed-off-by: Arnd Bergmann <[email protected]>
    Acked-by: Chris Down <[email protected]>
    Acked-by: Nhat Pham <[email protected]>
    Acked-by: Johannes Weiner <[email protected]>
    Acked-by: Vitaly Wool <[email protected]>
    Acked-by: Christoph Hellwig <[email protected]>
    Cc: Aneesh Kumar K.V <[email protected]>
    Cc: Christophe Leroy <[email protected]>
    Cc: Huacai Chen <[email protected]>
    Cc: Miaohe Lin <[email protected]>
    Cc: Michael Ellerman <[email protected]>
    Cc: Naveen N. Rao <[email protected]>
    Cc: Nicholas Piggin <[email protected]>
    Cc: Sergey Senozhatsky <[email protected]>
    Cc: WANG Xuerui <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    (cherry picked from commit 7a2369b74abf76cd3e54c45b30f6addb497f831b)
    Signed-off-by: Sasha Levin <[email protected]>

 
net/mlx5: Added cond_resched() to crdump collection [+ + +]
Author: Mohamed Khalfella <[email protected]>
Date:   Wed Sep 4 22:02:48 2024 -0600

    net/mlx5: Added cond_resched() to crdump collection
    
    [ Upstream commit ec793155894140df7421d25903de2e6bc12c695b ]
    
    Collecting crdump involves reading vsc registers from pci config space
    of mlx device, which can take long time to complete. This might result
    in starving other threads waiting to run on the cpu.
    
    Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab:
    
    - mlx5_vsc_gw_read_block_fast() was called with length = 1310716.
    - mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to
      read the entire 1310716 bytes. It was called 53813 times because
      there are jumps in read_addr.
    - On average mlx5_vsc_gw_read_fast() took 35284.4ns.
    - In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times.
      The average time for each call was 17548.3ns. In some instances
      vsc_read() was called more than one time when the flag was not set.
      As expected the thread released the cpu after 16 iterations in
      mlx5_vsc_wait_on_flag().
    - Total time to read crdump was 35284.4ns * 53813 ~= 1.898s.
    
    It was seen in the field that crdump can take more than 5 seconds to
    complete. During that time mlx5_vsc_wait_on_flag() did not release the
    cpu because it did not complete 16 iterations. It is believed that pci
    config reads were slow. Adding cond_resched() every 128 register read
    improves the situation. In the common case the, crdump takes ~1.8989s,
    the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread
    yields the cpu every ~18.0ms.
    
    Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support")
    Reviewed-by: Yuanyuan Zhong <[email protected]>
    Signed-off-by: Mohamed Khalfella <[email protected]>
    Reviewed-by: Moshe Shemesh <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5: Fix error path in multi-packet WQE transmit [+ + +]
Author: Gerd Bayer <[email protected]>
Date:   Tue Sep 10 10:53:51 2024 +0200

    net/mlx5: Fix error path in multi-packet WQE transmit
    
    [ Upstream commit 2bcae12c795f32ddfbf8c80d1b5f1d3286341c32 ]
    
    Remove the erroneous unmap in case no DMA mapping was established
    
    The multi-packet WQE transmit code attempts to obtain a DMA mapping for
    the skb. This could fail, e.g. under memory pressure, when the IOMMU
    driver just can't allocate more memory for page tables. While the code
    tries to handle this in the path below the err_unmap label it erroneously
    unmaps one entry from the sq's FIFO list of active mappings. Since the
    current map attempt failed this unmap is removing some random DMA mapping
    that might still be required. If the PCI function now presents that IOVA,
    the IOMMU may assumes a rogue DMA access and e.g. on s390 puts the PCI
    function in error state.
    
    The erroneous behavior was seen in a stress-test environment that created
    memory pressure.
    
    Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
    Signed-off-by: Gerd Bayer <[email protected]>
    Reviewed-by: Zhu Yanjun <[email protected]>
    Acked-by: Maxim Mikityanskiy <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice [+ + +]
Author: Jianbo Liu <[email protected]>
Date:   Mon Sep 2 09:40:58 2024 +0300

    net/mlx5e: Fix crash caused by calling __xfrm_state_delete() twice
    
    [ Upstream commit 7b124695db40d5c9c5295a94ae928a8d67a01c3d ]
    
    The km.state is not checked in driver's delayed work. When
    xfrm_state_check_expire() is called, the state can be reset to
    XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This
    happens when xfrm state is deleted, but not freed yet. As
    __xfrm_state_delete() is called again in xfrm timer, the following
    crash occurs.
    
    To fix this issue, skip xfrm_state_check_expire() if km.state is not
    XFRM_STATE_VALID.
    
     Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
     CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
     Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core]
     RIP: 0010:__xfrm_state_delete+0x3d/0x1b0
     Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48
     RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246
     RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036
     RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980
     RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000
     R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246
     R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400
     FS:  0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      <IRQ>
      ? die_addr+0x33/0x90
      ? exc_general_protection+0x1a2/0x390
      ? asm_exc_general_protection+0x22/0x30
      ? __xfrm_state_delete+0x3d/0x1b0
      ? __xfrm_state_delete+0x2f/0x1b0
      xfrm_timer_handler+0x174/0x350
      ? __xfrm_state_delete+0x1b0/0x1b0
      __hrtimer_run_queues+0x121/0x270
      hrtimer_run_softirq+0x88/0xd0
      handle_softirqs+0xcc/0x270
      do_softirq+0x3c/0x50
      </IRQ>
      <TASK>
      __local_bh_enable_ip+0x47/0x50
      mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core]
      process_one_work+0x137/0x2d0
      worker_thread+0x28d/0x3a0
      ? rescuer_thread+0x480/0x480
      kthread+0xb8/0xe0
      ? kthread_park+0x80/0x80
      ret_from_fork+0x2d/0x50
      ? kthread_park+0x80/0x80
      ret_from_fork_asm+0x11/0x20
      </TASK>
    
    Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality")
    Signed-off-by: Jianbo Liu <[email protected]>
    Reviewed-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc() [+ + +]
Author: Elena Salomatkina <[email protected]>
Date:   Tue Sep 24 19:00:18 2024 +0300

    net/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()
    
    [ Upstream commit f25389e779500cf4a59ef9804534237841bce536 ]
    
    In mlx5e_tir_builder_alloc() kvzalloc() may return NULL
    which is dereferenced on the next line in a reference
    to the modify field.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object")
    Signed-off-by: Elena Salomatkina <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Kalesh AP <[email protected]>
    Reviewed-by: Tariq Toukan <[email protected]>
    Reviewed-by: Gal Pressman <[email protected]>
    Signed-off-by: Saeed Mahameed <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/ncsi: Disable the ncsi work before freeing the associated structure [+ + +]
Author: Eddie James <[email protected]>
Date:   Wed Sep 25 10:55:23 2024 -0500

    net/ncsi: Disable the ncsi work before freeing the associated structure
    
    [ Upstream commit a0ffa68c70b367358b2672cdab6fa5bc4c40de2c ]
    
    The work function can run after the ncsi device is freed, resulting
    in use-after-free bugs or kernel panic.
    
    Fixes: 2d283bdd079c ("net/ncsi: Resource management")
    Signed-off-by: Eddie James <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net/xen-netback: prevent UAF in xenvif_flush_hash() [+ + +]
Author: Jeongjun Park <[email protected]>
Date:   Fri Aug 23 03:11:09 2024 +0900

    net/xen-netback: prevent UAF in xenvif_flush_hash()
    
    [ Upstream commit 0fa5e94a1811d68fbffa0725efe6d4ca62c03d12 ]
    
    During the list_for_each_entry_rcu iteration call of xenvif_flush_hash,
    kfree_rcu does not exist inside the rcu read critical section, so if
    kfree_rcu is called when the rcu grace period ends during the iteration,
    UAF occurs when accessing head->next after the entry becomes free.
    
    Therefore, to solve this, you need to change it to list_for_each_entry_safe.
    
    Signed-off-by: Jeongjun Park <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
net: add more sanity checks to qdisc_pkt_len_init() [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:57 2024 +0000

    net: add more sanity checks to qdisc_pkt_len_init()
    
    [ Upstream commit ab9a9a9e9647392a19e7a885b08000e89c86b535 ]
    
    One path takes care of SKB_GSO_DODGY, assuming
    skb->len is bigger than hdr_len.
    
    virtio_net_hdr_to_skb() does not fully dissect TCP headers,
    it only make sure it is at least 20 bytes.
    
    It is possible for an user to provide a malicious 'GSO' packet,
    total length of 80 bytes.
    
    - 20 bytes of IPv4 header
    - 60 bytes TCP header
    - a small gso_size like 8
    
    virtio_net_hdr_to_skb() would declare this packet as a normal
    GSO packet, because it would see 40 bytes of payload,
    bigger than gso_size.
    
    We need to make detect this case to not underflow
    qdisc_skb_cb(skb)->pkt_len.
    
    Fixes: 1def9238d4aa ("net_sched: more precise pkt_len computation")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Add netif_get_gro_max_size helper for GRO [+ + +]
Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:41 2024 +0200

    net: Add netif_get_gro_max_size helper for GRO
    
    [ Upstream commit e8d4d34df715133c319fabcf63fdec684be75ff8 ]
    
    Add a small netif_get_gro_max_size() helper which returns the maximum IPv4
    or IPv6 GRO size of the netdevice.
    
    We later add a netif_get_gso_max_size() equivalent as well for GSO, so that
    these helpers can be used consistently instead of open-coded checks.
    
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Stable-dep-of: e609c959a939 ("net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size")
    Signed-off-by: Sasha Levin <[email protected]>

net: atlantic: Avoid warning about potential string truncation [+ + +]
Author: Simon Horman <[email protected]>
Date:   Wed Aug 21 16:58:57 2024 +0100

    net: atlantic: Avoid warning about potential string truncation
    
    [ Upstream commit 5874e0c9f25661c2faefe4809907166defae3d7f ]
    
    W=1 builds with GCC 14.2.0 warn that:
    
    .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                           ^~
    .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                                        ^~~~~~~
    .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
      278 |                                 snprintf(tc_string, 8, "TC%d ", tc);
          |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
    the range is 0 - 255. Further, on inspecting the code, it seems
    that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
    the range is actually 0 - 8.
    
    So, it seems that the condition that GCC flags will not occur.
    But, nonetheless, it would be nice if it didn't emit the warning.
    
    It seems that this can be achieved by changing the format specifier
    from %d to %u, in which case I believe GCC recognises an upper bound
    on the range of tc of 0 - 255. After some experimentation I think
    this is due to the combination of the use of %u and the type of
    cfg->tcs (u8).
    
    Empirically, updating the type of the tc variable to unsigned int
    has the same effect.
    
    As both of these changes seem to make sense in relation to what the code
    is actually doing - iterating over unsigned values - do both.
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: avoid potential underflow in qdisc_pkt_len_init() with UFO [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Tue Sep 24 15:02:56 2024 +0000

    net: avoid potential underflow in qdisc_pkt_len_init() with UFO
    
    [ Upstream commit c20029db28399ecc50e556964eaba75c43b1e2f1 ]
    
    After commit 7c6d2ecbda83 ("net: be more gentle about silly gso
    requests coming from user") virtio_net_hdr_to_skb() had sanity check
    to detect malicious attempts from user space to cook a bad GSO packet.
    
    Then commit cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count
    transport header in UFO") while fixing one issue, allowed user space
    to cook a GSO packet with the following characteristic :
    
    IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.
    
    When this packet arrives in qdisc_pkt_len_init(), we end up
    with hdr_len = 28 (IPv4 header + UDP header), matching skb->len
    
    Then the following sets gso_segs to 0 :
    
    gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                            shinfo->gso_size);
    
    Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/
    
    qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;
    
    This leads to the following crash in fq_codel [1]
    
    qdisc_pkt_len_init() is best effort, we only want an estimation
    of the bytes sent on the wire, not crashing the kernel.
    
    This patch is fixing this particular issue, a following one
    adds more sanity checks for another potential bug.
    
    [1]
    [   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [   70.724561] #PF: supervisor read access in kernel mode
    [   70.724561] #PF: error_code(0x0000) - not-present page
    [   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
    [   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
    [   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
    [   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
    All code
    ========
       0:   24 08                   and    $0x8,%al
       2:   49 c1 e1 06             shl    $0x6,%r9
       6:   44 89 7c 24 18          mov    %r15d,0x18(%rsp)
       b:   45 31 ed                xor    %r13d,%r13d
       e:   45 31 c0                xor    %r8d,%r8d
      11:   31 ff                   xor    %edi,%edi
      13:   89 44 24 14             mov    %eax,0x14(%rsp)
      17:   4c 03 8b 90 01 00 00    add    0x190(%rbx),%r9
      1e:   eb 04                   jmp    0x24
      20:   39 ca                   cmp    %ecx,%edx
      22:   73 37                   jae    0x5b
      24:   4d 8b 39                mov    (%r9),%r15
      27:   83 c7 01                add    $0x1,%edi
      2a:*  49 8b 17                mov    (%r15),%rdx              <-- trapping instruction
      2d:   49 89 11                mov    %rdx,(%r9)
      30:   41 8b 57 28             mov    0x28(%r15),%edx
      34:   45 8b 5f 34             mov    0x34(%r15),%r11d
      38:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      3f:   49                      rex.WB
    
    Code starting with the faulting instruction
    ===========================================
       0:   49 8b 17                mov    (%r15),%rdx
       3:   49 89 11                mov    %rdx,(%r9)
       6:   41 8b 57 28             mov    0x28(%r15),%edx
       a:   45 8b 5f 34             mov    0x34(%r15),%r11d
       e:   49 c7 07 00 00 00 00    movq   $0x0,(%r15)
      15:   49                      rex.WB
    [   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
    [   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
    [   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
    [   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
    [   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
    [   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
    [   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
    [   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
    [   70.724561] Call Trace:
    [   70.724561]  <TASK>
    [   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
    [   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
    [   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
    [   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
    [   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
    [   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
    [   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
    [   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
    [   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
    [   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
    [   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
    [   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
    [   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    [   70.724561] RIP: 0033:0x41ae09
    
    Fixes: cf9acc90c80ec ("net: virtio_net_hdr_to_skb: count transport header in UFO")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Jonathan Davies <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: Jonathan Davies <[email protected]>
    Reviewed-by: David Ahern <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: improve shutdown sequence [+ + +]
Author: Vladimir Oltean <[email protected]>
Date:   Fri Sep 13 23:35:49 2024 +0300

    net: dsa: improve shutdown sequence
    
    [ Upstream commit 6c24a03a61a245fe34d47582898331fa034b6ccd ]
    
    Alexander Sverdlin presents 2 problems during shutdown with the
    lan9303 driver. One is specific to lan9303 and the other just happens
    to reproduce there.
    
    The first problem is that lan9303 is unique among DSA drivers in that it
    calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown,
    not remove):
    
    phy_state_machine()
    -> ...
       -> dsa_user_phy_read()
          -> ds->ops->phy_read()
             -> lan9303_phy_read()
                -> chip->ops->phy_read()
                   -> lan9303_mdio_phy_read()
                      -> dev_get_drvdata()
    
    But we never stop the phy_state_machine(), so it may continue to run
    after dsa_switch_shutdown(). Our common pattern in all DSA drivers is
    to set drvdata to NULL to suppress the remove() method that may come
    afterwards. But in this case it will result in an NPD.
    
    The second problem is that the way in which we set
    dp->conduit->dsa_ptr = NULL; is concurrent with receive packet
    processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL,
    but afterwards, rather than continuing to use that non-NULL value,
    dev->dsa_ptr is dereferenced again and again without NULL checks:
    dsa_conduit_find_user() and many other places. In between dereferences,
    there is no locking to ensure that what was valid once continues to be
    valid.
    
    Both problems have the common aspect that closing the conduit interface
    solves them.
    
    In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN
    event in dsa_user_netdevice_event() which closes user ports as well.
    dsa_port_disable_rt() calls phylink_stop(), which synchronously stops
    the phylink state machine, and ds->ops->phy_read() will thus no longer
    call into the driver after this point.
    
    In the second case, dev_close(conduit) should do this, as per
    Documentation/networking/driver.rst:
    
    | Quiescence
    | ----------
    |
    | After the ndo_stop routine has been called, the hardware must
    | not receive or transmit any data.  All in flight packets must
    | be aborted. If necessary, poll or wait for completion of
    | any reset commands.
    
    So it should be sufficient to ensure that later, when we zeroize
    conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call
    on this conduit.
    
    The addition of the netif_device_detach() function is to ensure that
    ioctls, rtnetlinks and ethtool requests on the user ports no longer
    propagate down to the driver - we're no longer prepared to handle them.
    
    The race condition actually did not exist when commit 0650bf52b31f
    ("net: dsa: be compatible with masters which unregister on shutdown")
    first introduced dsa_switch_shutdown(). It was created later, when we
    stopped unregistering the user interfaces from a bad spot, and we just
    replaced that sequence with a racy zeroization of conduit->dsa_ptr
    (one which doesn't ensure that the interfaces aren't up).
    
    Reported-by: Alexander Sverdlin <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Fixes: ee534378f005 ("net: dsa: fix panic when DSA master device unbinds on shutdown")
    Reviewed-by: Alexander Sverdlin <[email protected]>
    Tested-by: Alexander Sverdlin <[email protected]>
    Signed-off-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: lantiq_etop: fix memory disclosure [+ + +]
Author: Aleksander Jan Bajkowski <[email protected]>
Date:   Mon Sep 23 23:49:49 2024 +0200

    net: ethernet: lantiq_etop: fix memory disclosure
    
    [ Upstream commit 45c0de18ff2dc9af01236380404bbd6a46502c69 ]
    
    When applying padding, the buffer is not zeroed, which results in memory
    disclosure. The mentioned data is observed on the wire. This patch uses
    skb_put_padto() to pad Ethernet frames properly. The mentioned function
    zeroes the expanded buffer.
    
    In case the packet cannot be padded it is silently dropped. Statistics
    are also not incremented. This driver does not support statistics in the
    old 32-bit format or the new 64-bit format. These will be added in the
    future. In its current form, the patch should be easily backported to
    stable versions.
    
    Ethernet MACs on Amazon-SE and Danube cannot do padding of the packets
    in hardware, so software padding must be applied.
    
    Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver")
    Signed-off-by: Aleksander Jan Bajkowski <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: Reload PTP registers after link-state change [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:06 2024 +0200

    net: fec: Reload PTP registers after link-state change
    
    [ Upstream commit d9335d0232d2da605585eea1518ac6733518f938 ]
    
    On link-state change, the controller gets reset,
    which clears all PTP registers, including PHC time,
    calibrated clock correction values etc. For correct
    IEEE 1588 operation we need to restore these after
    the reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Reviewed-by: Wei Fang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: fec: Restart PPS after link state change [+ + +]
Author: Csókás, Bence <[email protected]>
Date:   Tue Sep 24 11:37:04 2024 +0200

    net: fec: Restart PPS after link state change
    
    [ Upstream commit a1477dc87dc4996dcf65a4893d4e2c3a6b593002 ]
    
    On link state change, the controller gets reset,
    causing PPS to drop out. Re-enable PPS if it was
    enabled before the controller reset.
    
    Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock")
    Signed-off-by: Csókás, Bence <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size [+ + +]
Author: Daniel Borkmann <[email protected]>
Date:   Mon Sep 23 23:22:42 2024 +0200

    net: Fix gso_features_check to check for both dev->gso_{ipv4_,}max_size
    
    [ Upstream commit e609c959a939660c7519895f853dfa5624c6827a ]
    
    Commit 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    added a dev->gso_max_size test to gso_features_check() in order to fall
    back to GSO when needed.
    
    This was added as it was noticed that some drivers could misbehave if TSO
    packets get too big. However, the check doesn't respect dev->gso_ipv4_max_size
    limit. For instance, a device could be configured with BIG TCP for IPv4,
    but not IPv6.
    
    Therefore, add a netif_get_gso_max_size() equivalent to netif_get_gro_max_size()
    and use the helper to respect both limits before falling back to GSO engine.
    
    Fixes: 24ab059d2ebd ("net: check dev->gso_max_size in gso_features_check()")
    Signed-off-by: Daniel Borkmann <[email protected]>
    Cc: Eric Dumazet <[email protected]>
    Cc: Paolo Abeni <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: gso: fix tcp fraglist segmentation after pull from frag_list [+ + +]
Author: Felix Fietkau <[email protected]>
Date:   Thu Sep 26 10:53:14 2024 +0200

    net: gso: fix tcp fraglist segmentation after pull from frag_list
    
    commit 17bd3bd82f9f79f3feba15476c2b2c95a9b11ff8 upstream.
    
    Detect tcp gso fraglist skbs with corrupted geometry (see below) and
    pass these to skb_segment instead of skb_segment_list, as the first
    can segment them correctly.
    
    Valid SKB_GSO_FRAGLIST skbs
    - consist of two or more segments
    - the head_skb holds the protocol headers plus first gso_size
    - one or more frag_list skbs hold exactly one segment
    - all but the last must be gso_size
    
    Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
    modify these skbs, breaking these invariants.
    
    In extreme cases they pull all data into skb linear. For TCP, this
    causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at
    tcp_hdr(seg->next).
    
    Detect invalid geometry due to pull, by checking head_skb size.
    Don't just drop, as this may blackhole a destination. Convert to be
    able to pass to regular skb_segment.
    
    Approach and description based on a patch by Willem de Bruijn.
    
    Link: https://lore.kernel.org/netdev/[email protected]/
    Link: https://lore.kernel.org/netdev/[email protected]/
    Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
    Cc: [email protected]
    Signed-off-by: Felix Fietkau <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: hisilicon: hip04: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:19 2024 +0200

    net: hisilicon: hip04: fix OF node leak in probe()
    
    [ Upstream commit 17555297dbd5bccc93a01516117547e26a61caf1 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:20 2024 +0200

    net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
    
    [ Upstream commit 5680cf8d34e1552df987e2f4bb1bff0b2a8c8b11 ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in hns_mac_get_info().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: hisilicon: hns_mdio: fix OF node leak in probe() [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Tue Aug 27 16:44:21 2024 +0200

    net: hisilicon: hns_mdio: fix OF node leak in probe()
    
    [ Upstream commit e62beddc45f487b9969821fad3a0913d9bc18a2f ]
    
    Driver is leaking OF node reference from
    of_parse_phandle_with_fixed_args() in probe().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:42:34 2024 +0800

    net: ieee802154: mcr20a: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit 09573b1cc76e7ff8f056ab29ea1cdc152ec8c653 ]
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
    disable IRQ auto-enable when request IRQ.
    
    Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
    Reviewed-by: Miquel Raynal <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Stefan Schmidt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: mana: Add support for page sizes other than 4KB on ARM64 [+ + +]
Author: Haiyang Zhang <[email protected]>
Date:   Mon Jun 17 13:17:26 2024 -0700

    net: mana: Add support for page sizes other than 4KB on ARM64
    
    [ Upstream commit 382d1741b5b2feffef7942dd074206372afe1a96 ]
    
    As defined by the MANA Hardware spec, the queue size for DMA is 4KB
    minimal, and power of 2. And, the HWC queue size has to be exactly
    4KB.
    
    To support page sizes other than 4KB on ARM64, define the minimal
    queue size as a macro separately from the PAGE_SIZE, which we always
    assumed it to be 4KB before supporting ARM64.
    
    Also, add MANA specific macros and update code related to size
    alignment, DMA region calculations, etc.
    
    Signed-off-by: Haiyang Zhang <[email protected]>
    Reviewed-by: Michael Kelley <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: 9e517a8e9d9a ("RDMA/mana_ib: use the correct page table index based on hardware page size")
    Signed-off-by: Sasha Levin <[email protected]>

net: mvpp2: Increase size of queue_name buffer [+ + +]
Author: Simon Horman <[email protected]>
Date:   Tue Aug 6 12:28:24 2024 +0100

    net: mvpp2: Increase size of queue_name buffer
    
    [ Upstream commit 91d516d4de48532d967a77967834e00c8c53dfe6 ]
    
    Increase size of queue_name buffer from 30 to 31 to accommodate
    the largest string written to it. This avoids truncation in
    the possibly unlikely case where the string is name is the
    maximum size.
    
    Flagged by gcc-14:
    
      .../mvpp2_main.c: In function 'mvpp2_probe':
      .../mvpp2_main.c:7636:32: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                                ^
      .../mvpp2_main.c:7635:9: note: 'snprintf' output between 10 and 31 bytes into a destination of size 30
       7635 |         snprintf(priv->queue_name, sizeof(priv->queue_name),
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7636 |                  "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       7637 |                  priv->port_count > 1 ? "+" : "");
            |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Introduced by commit 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics").
    I am not flagging this as a bug as I am not aware that it is one.
    
    Compile tested only.
    
    Signed-off-by: Simon Horman <[email protected]>
    Reviewed-by: Marcin Wojtas <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: napi: Prevent overflow of napi_defer_hard_irqs [+ + +]
Author: Joe Damato <[email protected]>
Date:   Wed Sep 4 15:34:30 2024 +0000

    net: napi: Prevent overflow of napi_defer_hard_irqs
    
    [ Upstream commit 08062af0a52107a243f7608fd972edb54ca5b7f8 ]
    
    In commit 6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
    napi_defer_irqs was added to net_device and napi_defer_irqs_count was
    added to napi_struct, both as type int.
    
    This value never goes below zero, so there is not reason for it to be a
    signed int. Change the type for both from int to u32, and add an
    overflow check to sysfs to limit the value to S32_MAX.
    
    The limit of S32_MAX was chosen because the practical limit before this
    patch was S32_MAX (anything larger was an overflow) and thus there are
    no behavioral changes introduced. If the extra bit is needed in the
    future, the limit can be raised.
    
    Before this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    $ cat /sys/class/net/eth4/napi_defer_hard_irqs
    -2147483647
    
    After this patch:
    
    $ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
    bash: line 0: echo: write error: Numerical result out of range
    
    Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
    
    include/linux/netdevice.h:      unsigned int            tx_queue_len;
    
    And has an overflow check:
    
    dev_change_tx_queue_len(..., unsigned long new_len):
    
      if (new_len != (unsigned int)new_len)
              return -ERANGE;
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: pcs: xpcs: fix the wrong register that was written back [+ + +]
Author: Jiawen Wu <[email protected]>
Date:   Tue Sep 24 10:28:57 2024 +0800

    net: pcs: xpcs: fix the wrong register that was written back
    
    commit 93ef6ee5c20e9330477930ec6347672c9e0cf5a6 upstream.
    
    The value is read from the register TXGBE_RX_GEN_CTL3, and it should be
    written back to TXGBE_RX_GEN_CTL3 when it changes some fields.
    
    Cc: [email protected]
    Fixes: f629acc6f210 ("net: pcs: xpcs: support to switch mode for Wangxun NICs")
    Signed-off-by: Jiawen Wu <[email protected]>
    Reported-by: Russell King (Oracle) <[email protected]>
    Reviewed-by: Russell King (Oracle) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: phy: Check for read errors in SIOCGMIIREG [+ + +]
Author: Niklas Söderlund <[email protected]>
Date:   Tue Sep 3 19:15:36 2024 +0200

    net: phy: Check for read errors in SIOCGMIIREG
    
    [ Upstream commit 569bf6d481b0b823c3c9c3b8be77908fd7caf66b ]
    
    When reading registers from the PHY using the SIOCGMIIREG IOCTL any
    errors returned from either mdiobus_read() or mdiobus_c45_read() are
    ignored, and parts of the returned error is passed as the register value
    back to user-space.
    
    For example, if mdiobus_c45_read() is used with a bus that do not
    implement the read_c45() callback -EOPNOTSUPP is returned. This is
    however directly stored in mii_data->val_out and returned as the
    registers content. As val_out is a u16 the error code is truncated and
    returned as a plausible register value.
    
    Fix this by first checking the return value for errors before returning
    it as the register content.
    
    Before this patch,
    
        # phytool read eth0/0:1/0
        0xffa1
    
    After this change,
    
        $ phytool read eth0/0:1/0
        error: phy_read (-95)
    
    Signed-off-by: Niklas Söderlund <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Reviewed-by: Yoshihiro Shimoda <[email protected]>
    Tested-by: Yoshihiro Shimoda <[email protected]>
    Reviewed-by: Geert Uytterhoeven <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sched: consistently use rcu_replace_pointer() in taprio_change() [+ + +]
Author: Dmitry Antipov <[email protected]>
Date:   Wed Sep 4 14:54:01 2024 +0300

    net: sched: consistently use rcu_replace_pointer() in taprio_change()
    
    [ Upstream commit d5c4546062fd6f5dbce575c7ea52ad66d1968678 ]
    
    According to Vinicius (and carefully looking through the whole
    https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
    once again), txtime branch of 'taprio_change()' is not going to
    race against 'advance_sched()'. But using 'rcu_replace_pointer()'
    in the former may be a good idea as well.
    
    Suggested-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: Dmitry Antipov <[email protected]>
    Acked-by: Vinicius Costa Gomes <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: sparx5: Fix invalid timestamps [+ + +]
Author: Aakash Menon <[email protected]>
Date:   Mon Sep 16 22:18:29 2024 -0700

    net: sparx5: Fix invalid timestamps
    
    [ Upstream commit 151ac45348afc5b56baa584c7cd4876addf461ff ]
    
    Bit 270-271 are occasionally unexpectedly set by the hardware. This issue
    was observed with 10G SFPs causing huge time errors (> 30ms) in PTP. Only
    30 bits are needed for the nanosecond part of the timestamp, clear 2 most
    significant bits before extracting timestamp from the internal frame
    header.
    
    Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping")
    Signed-off-by: Aakash Menon <[email protected]>
    Reviewed-by: Horatiu Vultur <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check [+ + +]
Author: Shenwei Wang <[email protected]>
Date:   Tue Sep 24 15:54:24 2024 -0500

    net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check
    
    [ Upstream commit 4c1b56671b68ffcbe6b78308bfdda6bcce6491ae ]
    
    Increase the timeout for checking the busy bit of the VLAN Tag register
    from 10µs to 500ms. This change is necessary to accommodate scenarios
    where Energy Efficient Ethernet (EEE) is enabled.
    
    Overnight testing revealed that when EEE is active, the busy bit can
    remain set for up to approximately 300ms. The new 500ms timeout provides
    a safety margin.
    
    Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering")
    Reviewed-by: Andrew Lunn <[email protected]>
    Signed-off-by: Shenwei Wang <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: stmmac: Fix zero-division error when disabling tc cbs [+ + +]
Author: KhaiWenTan <[email protected]>
Date:   Wed Sep 18 14:14:22 2024 +0800

    net: stmmac: Fix zero-division error when disabling tc cbs
    
    commit 675faf5a14c14a2be0b870db30a70764df81e2df upstream.
    
    The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider
    when offload is disabled") allows the "port_transmit_rate_kbps" to be
    set to a value of 0, which is then passed to the "div_s64" function when
    tc-cbs is disabled. This leads to a zero-division error.
    
    When tc-cbs is disabled, the idleslope, sendslope, and credit values the
    credit values are not required to be configured. Therefore, adding a return
    statement after setting the txQ mode to DCB when tc-cbs is disabled would
    prevent a zero-division error.
    
    Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled")
    Cc: <[email protected]>
    Co-developed-by: Choong Yong Liang <[email protected]>
    Signed-off-by: Choong Yong Liang <[email protected]>
    Signed-off-by: KhaiWenTan <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: test for not too small csum_start in virtio_net_hdr_to_skb() [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 16:58:36 2024 +0000

    net: test for not too small csum_start in virtio_net_hdr_to_skb()
    
    [ Upstream commit 49d14b54a527289d09a9480f214b8c586322310a ]
    
    syzbot was able to trigger this warning [1], after injecting a
    malicious packet through af_packet, setting skb->csum_start and thus
    the transport header to an incorrect value.
    
    We can at least make sure the transport header is after
    the end of the network header (with a estimated minimal size).
    
    [1]
    [   67.873027] skb len=4096 headroom=16 headlen=14 tailroom=0
    mac=(-1,-1) mac_len=0 net=(16,-6) trans=10
    shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
    csum(0xa start=10 offset=0 ip_summed=3 complete_sw=0 valid=0 level=0)
    hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
    priority=0x0 mark=0x0 alloc_cpu=10 vlan_all=0x0
    encapsulation=0 inner(proto=0x0000, mac=0, net=0, trans=0)
    [   67.877172] dev name=veth0_vlan feat=0x000061164fdd09e9
    [   67.877764] sk family=17 type=3 proto=0
    [   67.878279] skb linear:   00000000: 00 00 10 00 00 00 00 00 0f 00 00 00 08 00
    [   67.879128] skb frag:     00000000: 0e 00 07 00 00 00 28 00 08 80 1c 00 04 00 00 02
    [   67.879877] skb frag:     00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.880647] skb frag:     00000020: 00 00 02 00 00 00 08 00 1b 00 00 00 00 00 00 00
    [   67.881156] skb frag:     00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.881753] skb frag:     00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882173] skb frag:     00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.882790] skb frag:     00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883171] skb frag:     00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.883733] skb frag:     00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.884206] skb frag:     00000090: 00 00 00 00 00 00 00 00 00 00 69 70 76 6c 61 6e
    [   67.884704] skb frag:     000000a0: 31 00 00 00 00 00 00 00 00 00 2b 00 00 00 00 00
    [   67.885139] skb frag:     000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.885677] skb frag:     000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886042] skb frag:     000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.886408] skb frag:     000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887020] skb frag:     000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   67.887384] skb frag:     00000100: 00 00
    [   67.887878] ------------[ cut here ]------------
    [   67.887908] offset (-6) >= skb_headlen() (14)
    [   67.888445] WARNING: CPU: 10 PID: 2088 at net/core/dev.c:3332 skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.889353] Modules linked in: macsec macvtap macvlan hsr wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 dummy bridge sr_mod cdrom evdev pcspkr i2c_piix4 9pnet_virtio 9p 9pnet netfs
    [   67.890111] CPU: 10 UID: 0 PID: 2088 Comm: b363492833 Not tainted 6.11.0-virtme #1011
    [   67.890183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   67.890309] RIP: 0010:skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891043] Call Trace:
    [   67.891173]  <TASK>
    [   67.891274] ? __warn (kernel/panic.c:741)
    [   67.891320] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891333] ? report_bug (lib/bug.c:180 lib/bug.c:219)
    [   67.891348] ? handle_bug (arch/x86/kernel/traps.c:239)
    [   67.891363] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
    [   67.891372] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
    [   67.891388] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891399] ? skb_checksum_help (net/core/dev.c:3332 (discriminator 2))
    [   67.891416] ip_do_fragment (net/ipv4/ip_output.c:777 (discriminator 1))
    [   67.891448] ? __ip_local_out (./include/linux/skbuff.h:1146 ./include/net/l3mdev.h:196 ./include/net/l3mdev.h:213 net/ipv4/ip_output.c:113)
    [   67.891459] ? __pfx_ip_finish_output2 (net/ipv4/ip_output.c:200)
    [   67.891470] ? ip_route_output_flow (./arch/x86/include/asm/preempt.h:84 (discriminator 13) ./include/linux/rcupdate.h:96 (discriminator 13) ./include/linux/rcupdate.h:871 (discriminator 13) net/ipv4/route.c:2625 (discriminator 13) ./include/net/route.h:141 (discriminator 13) net/ipv4/route.c:2852 (discriminator 13))
    [   67.891484] ipvlan_process_v4_outbound (drivers/net/ipvlan/ipvlan_core.c:445 (discriminator 1))
    [   67.891581] ipvlan_queue_xmit (drivers/net/ipvlan/ipvlan_core.c:542 drivers/net/ipvlan/ipvlan_core.c:604 drivers/net/ipvlan/ipvlan_core.c:670)
    [   67.891596] ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:227)
    [   67.891607] dev_hard_start_xmit (./include/linux/netdevice.h:4916 ./include/linux/netdevice.h:4925 net/core/dev.c:3588 net/core/dev.c:3604)
    [   67.891620] __dev_queue_xmit (net/core/dev.h:168 (discriminator 25) net/core/dev.c:4425 (discriminator 25))
    [   67.891630] ? skb_copy_bits (./include/linux/uaccess.h:233 (discriminator 1) ./include/linux/uaccess.h:260 (discriminator 1) ./include/linux/highmem-internal.h:230 (discriminator 1) net/core/skbuff.c:3018 (discriminator 1))
    [   67.891645] ? __pskb_pull_tail (net/core/skbuff.c:2848 (discriminator 4))
    [   67.891655] ? skb_partial_csum_set (net/core/skbuff.c:5657)
    [   67.891666] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/skbuff.h:2791 (discriminator 3) ./include/linux/skbuff.h:2799 (discriminator 3) ./include/linux/virtio_net.h:109 (discriminator 3))
    [   67.891684] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
    [   67.891700] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
    [   67.891716] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
    [   67.891734] ? do_sock_setsockopt (net/socket.c:2335)
    [   67.891747] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
    [   67.891761] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
    [   67.891772] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   67.891785] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    
    Fixes: 9181d6f8a2bb ("net: add more sanity check in virtio_net_hdr_to_skb()")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Mon Sep 23 19:57:43 2024 +0800

    net: wwan: qcom_bam_dmux: Fix missing pm_runtime_disable()
    
    [ Upstream commit d505d3593b52b6c43507f119572409087416ba28 ]
    
    It's important to undo pm_runtime_use_autosuspend() with
    pm_runtime_dont_use_autosuspend() at driver exit time.
    
    But the pm_runtime_disable() and pm_runtime_dont_use_autosuspend()
    is missing in the error path for bam_dmux_probe(). So add it.
    
    Found by code review. Compile-tested only.
    
    Fixes: 21a0ffd9b38c ("net: wwan: Add Qualcomm BAM-DMUX WWAN network driver")
    Suggested-by: Stephan Gerhold <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Reviewed-by: Stephan Gerhold <[email protected]>
    Reviewed-by: Sergey Ryazanov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netdev-genl: Set extack and fix error on napi-get [+ + +]
Author: Joe Damato <[email protected]>
Date:   Sat Aug 31 12:17:04 2024 +0000

    netdev-genl: Set extack and fix error on napi-get
    
    [ Upstream commit 4e3a024b437ec0aee82550cc66a0f4e1a7a88a67 ]
    
    In commit 27f91aaf49b3 ("netdev-genl: Add netlink framework functions
    for napi"), when an invalid NAPI ID is specified the return value
    -EINVAL is used and no extack is set.
    
    Change the return value to -ENOENT and set the extack.
    
    Before this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                              --do napi-get --json='{"id": 451}'
    Netlink error: Invalid argument
    nl_len = 36 (20) nl_flags = 0x100 nl_type = 2
            error: -22
    
    After this commit:
    
    $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
                             --do napi-get --json='{"id": 451}'
    Netlink error: No such file or directory
    nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
            error: -2
            extack: {'bad-attr': '.id'}
    
    Suggested-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Joe Damato <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netfilter: nf_tables: prevent nf_skb_duplicated corruption [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 26 18:56:11 2024 +0000

    netfilter: nf_tables: prevent nf_skb_duplicated corruption
    
    [ Upstream commit 92ceba94de6fb4cee2bf40b485979c342f44a492 ]
    
    syzbot found that nf_dup_ipv4() or nf_dup_ipv6() could write
    per-cpu variable nf_skb_duplicated in an unsafe way [1].
    
    Disabling preemption as hinted by the splat is not enough,
    we have to disable soft interrupts as well.
    
    [1]
    BUG: using __this_cpu_write() in preemptible [00000000] code: syz.4.282/6316
     caller is nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
    CPU: 0 UID: 0 PID: 6316 Comm: syz.4.282 Not tainted 6.11.0-rc7-syzkaller-00104-g7052622fccb1 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call Trace:
     <TASK>
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
      check_preemption_disabled+0x10e/0x120 lib/smp_processor_id.c:49
      nf_dup_ipv4+0x651/0x8f0 net/ipv4/netfilter/nf_dup_ipv4.c:87
      nft_dup_ipv4_eval+0x1db/0x300 net/ipv4/netfilter/nft_dup_ipv4.c:30
      expr_call_ops_eval net/netfilter/nf_tables_core.c:240 [inline]
      nft_do_chain+0x4ad/0x1da0 net/netfilter/nf_tables_core.c:288
      nft_do_chain_ipv4+0x202/0x320 net/netfilter/nft_chain_filter.c:23
      nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
      nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
      nf_hook+0x2c4/0x450 include/linux/netfilter.h:269
      NF_HOOK_COND include/linux/netfilter.h:302 [inline]
      ip_output+0x185/0x230 net/ipv4/ip_output.c:433
      ip_local_out net/ipv4/ip_output.c:129 [inline]
      ip_send_skb+0x74/0x100 net/ipv4/ip_output.c:1495
      udp_send_skb+0xacf/0x1650 net/ipv4/udp.c:981
      udp_sendmsg+0x1c21/0x2a60 net/ipv4/udp.c:1269
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0x1a6/0x270 net/socket.c:745
      ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
      ___sys_sendmsg net/socket.c:2651 [inline]
      __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
      __do_sys_sendmmsg net/socket.c:2766 [inline]
      __se_sys_sendmmsg net/socket.c:2763 [inline]
      __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f4ce4f7def9
    Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007f4ce5d4a038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00007f4ce5135f80 RCX: 00007f4ce4f7def9
    RDX: 0000000000000001 RSI: 0000000020005d40 RDI: 0000000000000006
    RBP: 00007f4ce4ff0b76 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007f4ce5135f80 R15: 00007ffd4cbc6d68
     </TASK>
    
    Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED [+ + +]
Author: Phil Sutter <[email protected]>
Date:   Wed Sep 25 20:01:20 2024 +0200

    netfilter: uapi: NFTA_FLOWTABLE_HOOK is NLA_NESTED
    
    [ Upstream commit 76f1ed087b562a469f2153076f179854b749c09a ]
    
    Fix the comment which incorrectly defines it as NLA_U32.
    
    Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
    Signed-off-by: Phil Sutter <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netfs: Cancel dirty folios that have no storage destination [+ + +]
Author: David Howells <[email protected]>
Date:   Mon Jul 29 12:23:11 2024 +0100

    netfs: Cancel dirty folios that have no storage destination
    
    [ Upstream commit 8f246b7c0a1be0882374f2ff831a61f0dbe77678 ]
    
    Kafs wants to be able to cache the contents of directories (and symlinks),
    but whilst these are downloaded from the server with the FS.FetchData RPC
    op and similar, the same as for regular files, they can't be updated by
    FS.StoreData, but rather have special operations (FS.MakeDir, etc.).
    
    Now, rather than redownloading a directory's content after each change made
    to that directory, kafs modifies the local blob.  This blob can be saved
    out to the cache, and since it's using netfslib, kafs just marks the folios
    dirty and lets ->writepages() on the directory take care of it, as for an
    regular file.
    
    This is fine as long as there's a cache as although the upload stream is
    disabled, there's a cache stream to drive the procedure.  But if the cache
    goes away in the meantime, suddenly there's no way do any writes and the
    code gets confused, complains "R=%x: No submit" to dmesg and leaves the
    dirty folio hanging.
    
    Fix this by just cancelling the store of the folio if neither stream is
    active.  (If there's no cache at the time of dirtying, we should just not
    mark the folio dirty).
    
    Signed-off-by: David Howells <[email protected]>
    cc: Jeff Layton <[email protected]>
    cc: [email protected]
    cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]/ # v2
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
netpoll: Ensure clean state on setup failures [+ + +]
Author: Breno Leitao <[email protected]>
Date:   Thu Aug 22 04:10:47 2024 -0700

    netpoll: Ensure clean state on setup failures
    
    [ Upstream commit ae5a0456e0b4cfd7e61619e55251ffdf1bc7adfb ]
    
    Modify netpoll_setup() and __netpoll_setup() to ensure that the netpoll
    structure (np) is left in a clean state if setup fails for any reason.
    This prevents carrying over misconfigured fields in case of partial
    setup success.
    
    Key changes:
    - np->dev is now set only after successful setup, ensuring it's always
      NULL if netpoll is not configured or if netpoll_setup() fails.
    - np->local_ip is zeroed if netpoll setup doesn't complete successfully.
    - Added DEBUG_NET_WARN_ON_ONCE() checks to catch unexpected states.
    - Reordered some operations in __netpoll_setup() for better logical flow.
    
    These changes improve the reliability of netpoll configuration, since it
    assures that the structure is fully initialized or totally unset.
    
    Suggested-by: Paolo Abeni <[email protected]>
    Signed-off-by: Breno Leitao <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nfp: Use IRQF_NO_AUTOEN flag in request_irq() [+ + +]
Author: Jinjie Ruan <[email protected]>
Date:   Wed Sep 11 17:44:45 2024 +0800

    nfp: Use IRQF_NO_AUTOEN flag in request_irq()
    
    [ Upstream commit daaba19d357f0900b303a530ced96c78086267ea ]
    
    disable_irq() after request_irq() still has a time gap in which
    interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
    disable IRQ auto-enable when request IRQ.
    
    Reviewed-by: Louis Peens <[email protected]>
    Signed-off-by: Jinjie Ruan <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
NFSD: Async COPY result needs to return a write verifier [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:03 2024 -0400

    NFSD: Async COPY result needs to return a write verifier
    
    [ Upstream commit 9ed666eba4e0a2bb8ffaa3739d830b64d4f2aaad ]
    
    Currently, when NFSD handles an asynchronous COPY, it returns a
    zero write verifier, relying on the subsequent CB_OFFLOAD callback
    to pass the write verifier and a stable_how4 value to the client.
    
    However, if the CB_OFFLOAD never arrives at the client (for example,
    if a network partition occurs just as the server sends the
    CB_OFFLOAD operation), the client will never receive this verifier.
    Thus, if the client sends a follow-up COMMIT, there is no way for
    the client to assess the COMMIT result.
    
    The usual recovery for a missing CB_OFFLOAD is for the client to
    send an OFFLOAD_STATUS operation, but that operation does not carry
    a write verifier in its result. Neither does it carry a stable_how4
    value, so the client /must/ send a COMMIT in this case -- which will
    always fail because currently there's still no write verifier in the
    COPY result.
    
    Thus the server needs to return a normal write verifier in its COPY
    result even if the COPY operation is to be performed asynchronously.
    
    If the server recognizes the callback stateid in subsequent
    OFFLOAD_STATUS operations, then obviously it has not restarted, and
    the write verifier the client received in the COPY result is still
    valid and can be used to assess a COMMIT of the copied data, if one
    is needed.
    
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations")
    Signed-off-by: Sasha Levin <[email protected]>

 
nfsd: fix delegation_blocked() to block correctly for at least 30 seconds [+ + +]
Author: NeilBrown <[email protected]>
Date:   Mon Sep 9 15:06:36 2024 +1000

    nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
    
    commit 45bb63ed20e02ae146336412889fe5450316a84f upstream.
    
    The pair of bloom filtered used by delegation_blocked() was intended to
    block delegations on given filehandles for between 30 and 60 seconds.  A
    new filehandle would be recorded in the "new" bit set.  That would then
    be switch to the "old" bit set between 0 and 30 seconds later, and it
    would remain as the "old" bit set for 30 seconds.
    
    Unfortunately the code intended to clear the old bit set once it reached
    30 seconds old, preparing it to be the next new bit set, instead cleared
    the *new* bit set before switching it to be the old bit set.  This means
    that the "old" bit set is always empty and delegations are blocked
    between 0 and 30 seconds.
    
    This patch updates bd->new before clearing the set with that index,
    instead of afterwards.
    
    Reported-by: Olga Kornievskaia <[email protected]>
    Cc: [email protected]
    Fixes: 6282cd565553 ("NFSD: Don't hand out delegations for 30 seconds after recalling them.")
    Signed-off-by: NeilBrown <[email protected]>
    Reviewed-by: Benjamin Coddington <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
NFSD: Fix NFSv4's PUTPUBFH operation [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Sun Aug 11 13:11:07 2024 -0400

    NFSD: Fix NFSv4's PUTPUBFH operation
    
    commit 202f39039a11402dcbcd5fece8d9fa6be83f49ae upstream.
    
    According to RFC 8881, all minor versions of NFSv4 support PUTPUBFH.
    
    Replace the XDR decoder for PUTPUBFH with a "noop" since we no
    longer want the minorversion check, and PUTPUBFH has no arguments to
    decode. (Ideally nfsd4_decode_noop should really be called
    nfsd4_decode_void).
    
    PUTPUBFH should now behave just like PUTROOTFH.
    
    Reported-by: Cedric Blancher <[email protected]>
    Fixes: e1a90ebd8b23 ("NFSD: Combine decode operations for v4 and v4.1")
    Cc: Dan Shelton <[email protected]>
    Cc: Roland Mainz <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

NFSD: Limit the number of concurrent async COPY operations [+ + +]
Author: Chuck Lever <[email protected]>
Date:   Wed Aug 28 13:40:04 2024 -0400

    NFSD: Limit the number of concurrent async COPY operations
    
    [ Upstream commit aadc3bbea163b6caaaebfdd2b6c4667fbc726752 ]
    
    Nothing appears to limit the number of concurrent async COPY
    operations that clients can start. In addition, AFAICT each async
    COPY can copy an unlimited number of 4MB chunks, so can run for a
    long time. Thus IMO async COPY can become a DoS vector.
    
    Add a restriction mechanism that bounds the number of concurrent
    background COPY operations. Start simple and try to be fair -- this
    patch implements a per-namespace limit.
    
    An async COPY request that occurs while this limit is exceeded gets
    NFS4ERR_DELAY. The requesting client can choose to send the request
    again after a delay or fall back to a traditional read/write style
    copy.
    
    If there is need to make the mechanism more sophisticated, we can
    visit that in future patches.
    
    Cc: [email protected]
    Reviewed-by: Jeff Layton <[email protected]>
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nfsd: map the EBADMSG to nfserr_io to avoid warning [+ + +]
Author: Li Lingfeng <[email protected]>
Date:   Sat Aug 17 14:27:13 2024 +0800

    nfsd: map the EBADMSG to nfserr_io to avoid warning
    
    commit 340e61e44c1d2a15c42ec72ade9195ad525fd048 upstream.
    
    Ext4 will throw -EBADMSG through ext4_readdir when a checksum error
    occurs, resulting in the following WARNING.
    
    Fix it by mapping EBADMSG to nfserr_io.
    
    nfsd_buffered_readdir
     iterate_dir // -EBADMSG -74
      ext4_readdir // .iterate_shared
       ext4_dx_readdir
        ext4_htree_fill_tree
         htree_dirblock_to_tree
          ext4_read_dirblock
           __ext4_read_dirblock
            ext4_dirblock_csum_verify
             warn_no_space_for_csum
              __warn_no_space_for_csum
            return ERR_PTR(-EFSBADCRC) // -EBADMSG -74
     nfserrno // WARNING
    
    [  161.115610] ------------[ cut here ]------------
    [  161.116465] nfsd: non-standard errno: -74
    [  161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0
    [  161.118596] Modules linked in:
    [  161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138
    [  161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qe
    mu.org 04/01/2014
    [  161.123601] RIP: 0010:nfserrno+0x9d/0xd0
    [  161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6
     05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33
    [  161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286
    [  161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    [  161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a
    [  161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827
    [  161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021
    [  161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8
    [  161.135244] FS:  0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000
    [  161.136695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0
    [  161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  161.141519] PKRU: 55555554
    [  161.142076] Call Trace:
    [  161.142575]  ? __warn+0x9b/0x140
    [  161.143229]  ? nfserrno+0x9d/0xd0
    [  161.143872]  ? report_bug+0x125/0x150
    [  161.144595]  ? handle_bug+0x41/0x90
    [  161.145284]  ? exc_invalid_op+0x14/0x70
    [  161.146009]  ? asm_exc_invalid_op+0x12/0x20
    [  161.146816]  ? nfserrno+0x9d/0xd0
    [  161.147487]  nfsd_buffered_readdir+0x28b/0x2b0
    [  161.148333]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.149258]  ? nfsd_buffered_filldir+0xf0/0xf0
    [  161.150093]  ? wait_for_concurrent_writes+0x170/0x170
    [  161.151004]  ? generic_file_llseek_size+0x48/0x160
    [  161.151895]  nfsd_readdir+0x132/0x190
    [  161.152606]  ? nfsd4_encode_dirent_fattr+0x380/0x380
    [  161.153516]  ? nfsd_unlink+0x380/0x380
    [  161.154256]  ? override_creds+0x45/0x60
    [  161.155006]  nfsd4_encode_readdir+0x21a/0x3d0
    [  161.155850]  ? nfsd4_encode_readlink+0x210/0x210
    [  161.156731]  ? write_bytes_to_xdr_buf+0x97/0xe0
    [  161.157598]  ? __write_bytes_to_xdr_buf+0xd0/0xd0
    [  161.158494]  ? lock_downgrade+0x90/0x90
    [  161.159232]  ? nfs4svc_decode_voidarg+0x10/0x10
    [  161.160092]  nfsd4_encode_operation+0x15a/0x440
    [  161.160959]  nfsd4_proc_compound+0x718/0xe90
    [  161.161818]  nfsd_dispatch+0x18e/0x2c0
    [  161.162586]  svc_process_common+0x786/0xc50
    [  161.163403]  ? nfsd_svc+0x380/0x380
    [  161.164137]  ? svc_printk+0x160/0x160
    [  161.164846]  ? svc_xprt_do_enqueue.part.0+0x365/0x380
    [  161.165808]  ? nfsd_svc+0x380/0x380
    [  161.166523]  ? rcu_is_watching+0x23/0x40
    [  161.167309]  svc_process+0x1a5/0x200
    [  161.168019]  nfsd+0x1f5/0x380
    [  161.168663]  ? nfsd_shutdown_threads+0x260/0x260
    [  161.169554]  kthread+0x1c4/0x210
    [  161.170224]  ? kthread_insert_work_sanity_check+0x80/0x80
    [  161.171246]  ret_from_fork+0x1f/0x30
    
    Signed-off-by: Li Lingfeng <[email protected]>
    Reviewed-by: Jeff Layton <[email protected]>
    Cc: [email protected]
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
nvme-keyring: restrict match length for version '1' identifiers [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:18 2024 +0200

    nvme-keyring: restrict match length for version '1' identifiers
    
    [ Upstream commit 79559c75332458985ab8a21f11b08bf7c9b833b0 ]
    
    TP8018 introduced a new TLS PSK identifier version (version 1), which appended
    a PSK hash value to the existing identifier (cf NVMe TCP specification v1.1,
    section 3.6.1.3 'TLS PSK and PSK Identity Derivation').
    An original (version 0) identifier has the form:
    
    NVMe0<type><hmac> <hostnqn> <subsysnqn>
    
    and a version 1 identifier has the form:
    
    NVMe1<type><hmac> <hostnqn> <subsysnqn> <hash>
    
    This patch modifies the lookup algorthm to compare only the first part
    of the identifier (excluding the hash value) to handle both version 0 and
    version 1 identifiers.
    And the spec declares 'version 0' identifiers obsolete, so the lookup
    algorithm is modified to prever v1 identifiers.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
nvme-tcp: check for invalidated or revoked key [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:20 2024 +0200

    nvme-tcp: check for invalidated or revoked key
    
    [ Upstream commit 5bc46b49c828a6dfaab80b71ecb63fe76a1096d2 ]
    
    key_lookup() will always return a key, even if that key is revoked
    or invalidated. So check for invalid keys before continuing.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: fix link failure for TCP auth [+ + +]
Author: Arnd Bergmann <[email protected]>
Date:   Mon Sep 9 20:21:09 2024 +0000

    nvme-tcp: fix link failure for TCP auth
    
    [ Upstream commit 2d5a333e09c388189238291577e443221baacba0 ]
    
    The nvme fabric driver calls the nvme_tls_key_lookup() function from
    nvmf_parse_key() when the keyring is enabled, but this is broken in a
    configuration with CONFIG_NVME_FABRICS=y and CONFIG_NVME_TCP=m because
    this leads to the function definition being in a loadable module:
    
    x86_64-linux-ld: vmlinux.o: in function `nvmf_parse_key':
    fabrics.c:(.text+0xb1bdec): undefined reference to `nvme_tls_key_lookup'
    
    Move the 'select' up to CONFIG_NVME_FABRICS itself to force this
    part to be built-in as well if needed.
    
    Fixes: 5bc46b49c828 ("nvme-tcp: check for invalidated or revoked key")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Reviewed-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nvme-tcp: sanitize TLS key handling [+ + +]
Author: Hannes Reinecke <[email protected]>
Date:   Mon Jul 22 14:02:19 2024 +0200

    nvme-tcp: sanitize TLS key handling
    
    [ Upstream commit 363895767fbfa05891b0b4d9e06ebde7a10c6a07 ]
    
    There is a difference between TLS configured (ie the user has
    provisioned/requested a key) and TLS enabled (ie the connection
    is encrypted with TLS). This becomes important for secure concatenation,
    where the initial authentication is run on an unencrypted connection
    (ie with TLS configured, but not enabled), and then the queue is reset to
    run over TLS (ie TLS configured _and_ enabled).
    So to differentiate between those two states store the generated
    key in opts->tls_key (as we're using the same TLS key for all queues),
    the key serial of the resulting TLS handshake in ctrl->tls_pskid
    (to signal that TLS on the admin queue is enabled), and a simple
    flag for the queues to indicated that TLS has been enabled.
    
    Signed-off-by: Hannes Reinecke <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Keith Busch <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ocfs2: cancel dqi_sync_work before freeing oinfo [+ + +]
Author: Joseph Qi <[email protected]>
Date:   Wed Sep 4 15:10:03 2024 +0800

    ocfs2: cancel dqi_sync_work before freeing oinfo
    
    commit 35fccce29feb3706f649726d410122dd81b92c18 upstream.
    
    ocfs2_global_read_info() will initialize and schedule dqi_sync_work at the
    end, if error occurs after successfully reading global quota, it will
    trigger the following warning with CONFIG_DEBUG_OBJECTS_* enabled:
    
    ODEBUG: free active (active state 0) object: 00000000d8b0ce28 object type: timer_list hint: qsync_work_fn+0x0/0x16c
    
    This reports that there is an active delayed work when freeing oinfo in
    error handling, so cancel dqi_sync_work first.  BTW, return status instead
    of -1 when .read_file_info fails.
    
    Link: https://syzkaller.appspot.com/bug?extid=f7af59df5d6b25f0febd
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 171bf93ce11f ("ocfs2: Periodic quota syncing")
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Heming Zhao <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix null-ptr-deref when journal load failed. [+ + +]
Author: Julian Sun <[email protected]>
Date:   Mon Sep 2 11:08:44 2024 +0800

    ocfs2: fix null-ptr-deref when journal load failed.
    
    commit 5784d9fcfd43bd853654bb80c87ef293b9e8e80a upstream.
    
    During the mounting process, if journal_reset() fails because of too short
    journal, then lead to jbd2_journal_load() fails with NULL j_sb_buffer.
    Subsequently, ocfs2_journal_shutdown() calls
    jbd2_journal_flush()->jbd2_cleanup_journal_tail()->
    __jbd2_update_log_tail()->jbd2_journal_update_sb_log_tail()
    ->lock_buffer(journal->j_sb_buffer), resulting in a null-pointer
    dereference error.
    
    To resolve this issue, we should check the JBD2_LOADED flag to ensure the
    journal was properly loaded.  Additionally, use journal instead of
    osb->journal directly to simplify the code.
    
    Link: https://syzkaller.appspot.com/bug?extid=05b9b39d8bdfe1a0861f
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: f6f50e28f0cb ("jbd2: Fail to load a journal if it is too short")
    Signed-off-by: Julian Sun <[email protected]>
    Reported-by: [email protected]
    Suggested-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate [+ + +]
Author: Lizhi Xu <[email protected]>
Date:   Mon Sep 2 10:36:36 2024 +0800

    ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate
    
    commit 33b525cef4cff49e216e4133cc48452e11c0391e upstream.
    
    When doing cleanup, if flags without OCFS2_BH_READAHEAD, it may trigger
    NULL pointer dereference in the following ocfs2_set_buffer_uptodate() if
    bh is NULL.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <[email protected]>
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Reported-by: Heming Zhao <[email protected]>
    Suggested-by: Heming Zhao <[email protected]>
    Cc: <[email protected]>    [4.20+]
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix the la space leak when unmounting an ocfs2 volume [+ + +]
Author: Heming Zhao <[email protected]>
Date:   Fri Jul 19 19:43:10 2024 +0800

    ocfs2: fix the la space leak when unmounting an ocfs2 volume
    
    commit dfe6c5692fb525e5e90cefe306ee0dffae13d35f upstream.
    
    This bug has existed since the initial OCFS2 code.  The code logic in
    ocfs2_sync_local_to_main() is wrong, as it ignores the last contiguous
    free bits, which causes an OCFS2 volume to lose the last free clusters of
    LA window on each umount command.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Heming Zhao <[email protected]>
    Reviewed-by: Su Yue <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: Heming Zhao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: fix uninit-value in ocfs2_get_block() [+ + +]
Author: Joseph Qi <[email protected]>
Date:   Wed Sep 25 17:06:00 2024 +0800

    ocfs2: fix uninit-value in ocfs2_get_block()
    
    commit 2af148ef8549a12f8025286b8825c2833ee6bcb8 upstream.
    
    syzbot reported an uninit-value BUG:
    
    BUG: KMSAN: uninit-value in ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    ocfs2_get_block+0xed2/0x2710 fs/ocfs2/aops.c:159
    do_mpage_readpage+0xc45/0x2780 fs/mpage.c:225
    mpage_readahead+0x43f/0x840 fs/mpage.c:374
    ocfs2_readahead+0x269/0x320 fs/ocfs2/aops.c:381
    read_pages+0x193/0x1110 mm/readahead.c:160
    page_cache_ra_unbounded+0x901/0x9f0 mm/readahead.c:273
    do_page_cache_ra mm/readahead.c:303 [inline]
    force_page_cache_ra+0x3b1/0x4b0 mm/readahead.c:332
    force_page_cache_readahead mm/internal.h:347 [inline]
    generic_fadvise+0x6b0/0xa90 mm/fadvise.c:106
    vfs_fadvise mm/fadvise.c:185 [inline]
    ksys_fadvise64_64 mm/fadvise.c:199 [inline]
    __do_sys_fadvise64 mm/fadvise.c:214 [inline]
    __se_sys_fadvise64 mm/fadvise.c:212 [inline]
    __x64_sys_fadvise64+0x1fb/0x3a0 mm/fadvise.c:212
    x64_sys_call+0xe11/0x3ba0
    arch/x86/include/generated/asm/syscalls_64.h:222
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    This is because when ocfs2_extent_map_get_blocks() fails, p_blkno is
    uninitialized.  So the error log will trigger the above uninit-value
    access.
    
    The error log is out-of-date since get_blocks() was removed long time ago.
    And the error code will be logged in ocfs2_extent_map_get_blocks() once
    ocfs2_get_cluster() fails, so fix this by only logging inode and block.
    
    Link: https://syzkaller.appspot.com/bug?extid=9709e73bae885b05314b
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
    Signed-off-by: Joseph Qi <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Cc: Heming Zhao <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: remove unreasonable unlock in ocfs2_read_blocks [+ + +]
Author: Lizhi Xu <[email protected]>
Date:   Mon Sep 2 10:36:35 2024 +0800

    ocfs2: remove unreasonable unlock in ocfs2_read_blocks
    
    commit c03a82b4a0c935774afa01fd6d128b444fd930a1 upstream.
    
    Patch series "Misc fixes for ocfs2_read_blocks", v5.
    
    This series contains 2 fixes for ocfs2_read_blocks().  The first patch fix
    the issue reported by syzbot, which detects bad unlock balance in
    ocfs2_read_blocks().  The second patch fixes an issue reported by Heming
    Zhao when reviewing above fix.
    
    
    This patch (of 2):
    
    There was a lock release before exiting, so remove the unreasonable unlock.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: cf76c78595ca ("ocfs2: don't put and assigning null to bh allocated outside")
    Signed-off-by: Lizhi Xu <[email protected]>
    Signed-off-by: Joseph Qi <[email protected]>
    Reviewed-by: Heming Zhao <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=ab134185af9ef88dfed5
    Tested-by: [email protected]
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>    [4.20+]
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ocfs2: reserve space for inline xattr before attaching reflink tree [+ + +]
Author: Gautham Ananthakrishna <[email protected]>
Date:   Wed Sep 18 06:38:44 2024 +0000

    ocfs2: reserve space for inline xattr before attaching reflink tree
    
    commit 5ca60b86f57a4d9648f68418a725b3a7de2816b0 upstream.
    
    One of our customers reported a crash and a corrupted ocfs2 filesystem.
    The crash was due to the detection of corruption.  Upon troubleshooting,
    the fsck -fn output showed the below corruption
    
    [EXTENT_LIST_FREE] Extent list in owner 33080590 claims 230 as the next free chain record,
    but fsck believes the largest valid value is 227.  Clamp the next record value? n
    
    The stat output from the debugfs.ocfs2 showed the following corruption
    where the "Next Free Rec:" had overshot the "Count:" in the root metadata
    block.
    
            Inode: 33080590   Mode: 0640   Generation: 2619713622 (0x9c25a856)
            FS Generation: 904309833 (0x35e6ac49)
            CRC32: 00000000   ECC: 0000
            Type: Regular   Attr: 0x0   Flags: Valid
            Dynamic Features: (0x16) HasXattr InlineXattr Refcounted
            Extended Attributes Block: 0  Extended Attributes Inline Size: 256
            User: 0 (root)   Group: 0 (root)   Size: 281320357888
            Links: 1   Clusters: 141738
            ctime: 0x66911b56 0x316edcb8 -- Fri Jul 12 06:02:30.829349048 2024
            atime: 0x66911d6b 0x7f7a28d -- Fri Jul 12 06:11:23.133669517 2024
            mtime: 0x66911b56 0x12ed75d7 -- Fri Jul 12 06:02:30.317552087 2024
            dtime: 0x0 -- Wed Dec 31 17:00:00 1969
            Refcount Block: 2777346
            Last Extblk: 2886943   Orphan Slot: 0
            Sub Alloc Slot: 0   Sub Alloc Bit: 14
            Tree Depth: 1   Count: 227   Next Free Rec: 230
            ## Offset        Clusters       Block#
            0  0             2310           2776351
            1  2310          2139           2777375
            2  4449          1221           2778399
            3  5670          731            2779423
            4  6401          566            2780447
            .......          ....           .......
            .......          ....           .......
    
    The issue was in the reflink workfow while reserving space for inline
    xattr.  The problematic function is ocfs2_reflink_xattr_inline().  By the
    time this function is called the reflink tree is already recreated at the
    destination inode from the source inode.  At this point, this function
    reserves space for inline xattrs at the destination inode without even
    checking if there is space at the root metadata block.  It simply reduces
    the l_count from 243 to 227 thereby making space of 256 bytes for inline
    xattr whereas the inode already has extents beyond this index (in this
    case up to 230), thereby causing corruption.
    
    The fix for this is to reserve space for inline metadata at the destination
    inode before the reflink tree gets recreated. The customer has verified the
    fix.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ef962df057aa ("ocfs2: xattr: fix inlined xattr reflink")
    Signed-off-by: Gautham Ananthakrishna <[email protected]>
    Reviewed-by: Joseph Qi <[email protected]>
    Cc: Mark Fasheh <[email protected]>
    Cc: Joel Becker <[email protected]>
    Cc: Junxiao Bi <[email protected]>
    Cc: Changwei Ge <[email protected]>
    Cc: Gang He <[email protected]>
    Cc: Jun Piao <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
of/irq: Refer to actual buffer size in of_irq_parse_one() [+ + +]
Author: Geert Uytterhoeven <[email protected]>
Date:   Tue Aug 20 14:16:53 2024 +0200

    of/irq: Refer to actual buffer size in of_irq_parse_one()
    
    [ Upstream commit 39ab331ab5d377a18fbf5a0e0b228205edfcc7f4 ]
    
    Replace two open-coded calculations of the buffer size by invocations of
    sizeof() on the buffer itself, to make sure the code will always use the
    actual buffer size.
    
    Signed-off-by: Geert Uytterhoeven <[email protected]>
    Link: https://lore.kernel.org/r/817c0b9626fd30790fc488c472a3398324cfcc0c.1724156125.git.geert+renesas@glider.be
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

of/irq: Support #msi-cells=<0> in of_msi_get_domain [+ + +]
Author: Andrew Jones <[email protected]>
Date:   Sat Aug 17 09:41:08 2024 +0200

    of/irq: Support #msi-cells=<0> in of_msi_get_domain
    
    commit db8e81132cf051843c9a59b46fa5a071c45baeb3 upstream.
    
    An 'msi-parent' property with a single entry and no accompanying
    '#msi-cells' property is considered the legacy definition as opposed
    to its definition after being expanded with commit 126b16e2ad98
    ("Docs: dt: add generic MSI bindings"). However, the legacy
    definition is completely compatible with the current definition and,
    since of_phandle_iterator_next() tolerates missing and present-but-
    zero *cells properties since commit e42ee61017f5 ("of: Let
    of_for_each_phandle fallback to non-negative cell_count"), there's no
    need anymore to special case the legacy definition in
    of_msi_get_domain().
    
    Indeed, special casing has turned out to be harmful, because, as of
    commit 7c025238b47a ("dt-bindings: irqchip: Describe the IMX MU block
    as a MSI controller"), MSI controller DT bindings have started
    specifying '#msi-cells' as a required property (even when the value
    must be zero) as an effort to make the bindings more explicit. But,
    since the special casing of 'msi-parent' only uses the existence of
    '#msi-cells' for its heuristic, and not whether or not it's also
    nonzero, the legacy path is not taken. Furthermore, the path to
    support the new, broader definition isn't taken either since that
    path has been restricted to the platform-msi bus.
    
    But, neither the definition of 'msi-parent' nor the definition of
    '#msi-cells' is platform-msi-specific (the platform-msi bus was just
    the first bus that needed '#msi-cells'), so remove both the special
    casing and the restriction. The code removal also requires changing
    to of_parse_phandle_with_optional_args() in order to ensure the
    legacy (but compatible) use of 'msi-parent' remains supported. This
    not only simplifies the code but also resolves an issue with PCI
    devices finding their MSI controllers on riscv, as the riscv,imsics
    binding requires '#msi-cells=<0>'.
    
    Signed-off-by: Andrew Jones <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
of: address: Report error on resource bounds overflow [+ + +]
Author: Thomas Weißschuh <[email protected]>
Date:   Thu Sep 5 09:46:01 2024 +0200

    of: address: Report error on resource bounds overflow
    
    commit 000f6d588a8f3d128f89351058dc04d38e54a327 upstream.
    
    The members "start" and "end" of struct resource are of type
    "resource_size_t" which can be 32bit wide.
    Values read from OF however are always 64bit wide.
    Avoid silently truncating the value and instead return an error value.
    
    This can happen on real systems when the DT was created for a
    PAE-enabled kernel and a non-PAE kernel is actually running.
    For example with an arm defconfig and "qemu-system-arm -M virt".
    
    Link: https://bugs.launchpad.net/qemu/+bug/1790975
    Signed-off-by: Thomas Weißschuh <[email protected]>
    Tested-by: Nam Cao <[email protected]>
    Reviewed-by: Nam Cao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Rob Herring (Arm) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
ovl: fail if trusted xattrs are needed but caller lacks permission [+ + +]
Author: Mike Baynton <[email protected]>
Date:   Wed Jul 10 22:52:04 2024 -0500

    ovl: fail if trusted xattrs are needed but caller lacks permission
    
    commit 6c4a5f96450415735c31ed70ff354f0ee5cbf67b upstream.
    
    Some overlayfs features require permission to read/write trusted.*
    xattrs. These include redirect_dir, verity, metacopy, and data-only
    layers. This patch adds additional validations at mount time to stop
    overlays from mounting in certain cases where the resulting mount would
    not function according to the user's expectations because they lack
    permission to access trusted.* xattrs (for example, not global root.)
    
    Similar checks in ovl_make_workdir() that disable features instead of
    failing are still relevant and used in cases where the resulting mount
    can still work "reasonably well." Generally, if the feature was enabled
    through kernel config or module option, any mount that worked before
    will still work the same; this applies to redirect_dir and metacopy. The
    user must explicitly request these features in order to generate a mount
    failure. Verity and data-only layers on the other hand must be explictly
    requested and have no "reasonable" disabled or degraded alternative, so
    mounts attempting either always fail.
    
    "lower data-only dirs require metacopy support" moved down in case
    userxattr is set, which disables metacopy.
    
    Cc: [email protected] # v6.6+
    Signed-off-by: Mike Baynton <[email protected]>
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ovl: fsync after metadata copy-up [+ + +]
Author: Amir Goldstein <[email protected]>
Date:   Thu Aug 29 17:51:08 2024 +0200

    ovl: fsync after metadata copy-up
    
    [ Upstream commit 7d6899fb69d25e1bc6f4700b7c1d92e6b608593d ]
    
    For upper filesystems which do not use strict ordering of persisting
    metadata changes (e.g. ubifs), when overlayfs file is modified for
    the first time, copy up will create a copy of the lower file and
    its parent directories in the upper layer. Permission lost of the
    new upper parent directory was observed during power-cut stress test.
    
    Fix by moving the fsync call to after metadata copy to make sure that the
    metadata copied up directory and files persists to disk before renaming
    from tmp to final destination.
    
    With metacopy enabled, this change will hurt performance of workloads
    such as chown -R, so we keep the legacy behavior of fsync only on copyup
    of data.
    
    Link: https://lore.kernel.org/linux-unionfs/CAOQ4uxj-pOvmw1-uXR3qVdqtLjSkwcR9nVKcNU_vC10Zyf2miQ@mail.gmail.com/
    Reported-and-tested-by: Fei Lv <[email protected]>
    Signed-off-by: Amir Goldstein <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards [+ + +]
Author: Helge Deller <[email protected]>
Date:   Sun Sep 8 20:51:17 2024 +0200

    parisc: Allow mmap(MAP_STACK) memory to automatically expand upwards
    
    commit 5d698966fa7b452035c44c937d704910bf3440dd upstream.
    
    When userspace allocates memory with mmap() in order to be used for stack,
    allow this memory region to automatically expand upwards up until the
    current maximum process stack size.
    The fault handler checks if the VM_GROWSUP bit is set in the vm_flags field
    of a memory area before it allows it to expand.
    This patch modifies the parisc specific code only.
    A RFC for a generic patch to modify mmap() for all architectures was sent
    to the mailing list but did not get enough Acks.
    
    Reported-by: Camm Maguire <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v5.10+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix 64-bit userspace syscall path [+ + +]
Author: Helge Deller <[email protected]>
Date:   Sun Sep 8 00:40:38 2024 +0200

    parisc: Fix 64-bit userspace syscall path
    
    commit d24449864da5838936669618356b0e30ca2999c3 upstream.
    
    Currently the glibc isn't yet ported to 64-bit for hppa, so
    there is no usable userspace available yet.
    But it's possible to manually build a static 64-bit binary
    and run that for testing. One such 64-bit test program is
    available at http://ftp.parisc-linux.org/src/64bit.tar.gz
    and it shows various issues with the existing 64-bit syscall
    path in the kernel.
    This patch fixes those issues.
    
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix itlb miss handler for 64-bit programs [+ + +]
Author: Helge Deller <[email protected]>
Date:   Tue Sep 10 18:32:24 2024 +0200

    parisc: Fix itlb miss handler for 64-bit programs
    
    commit 9542130937e9dc707dd7c6b7af73326437da2d50 upstream.
    
    For an itlb miss when executing code above 4 Gb on ILP64 adjust the
    iasq/iaoq in the same way isr/ior was adjusted.  This fixes signal
    delivery for the 64-bit static test program from
    http://ftp.parisc-linux.org/src/64bit.tar.gz.  Note that signals are
    handled by the signal trampoline code in the 64-bit VDSO which is mapped
    into high userspace memory region above 4GB for 64-bit processes.
    
    Signed-off-by: Helge Deller <[email protected]>
    Cc: [email protected]      # v4.19+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

parisc: Fix stack start for ADDR_NO_RANDOMIZE personality [+ + +]
Author: Helge Deller <[email protected]>
Date:   Sat Sep 7 18:28:11 2024 +0200

    parisc: Fix stack start for ADDR_NO_RANDOMIZE personality
    
    commit f31b256994acec6929306dfa86ac29716e7503d6 upstream.
    
    Fix the stack start address calculation for the parisc architecture in
    setup_arg_pages() when address randomization is disabled. When the
    ADDR_NO_RANDOMIZE process personality is disabled there is no need to add
    additional space for the stack.
    Note that this patch touches code inside an #ifdef CONFIG_STACK_GROWSUP hunk,
    which is why only the parisc architecture is affected since it's the
    only Linux architecture where the stack grows upwards.
    
    Without this patch you will find the stack in the middle of some
    mapped libaries and suddenly limited to 6MB instead of 8MB:
    
    root@parisc:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:05 1182034           /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:05 1182034           /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    f90c4000-f9283000 r-xp 00000000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9283000-f9285000 r--p 001bf000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f9285000-f928a000 rwxp 001c1000 08:05 1573004           /usr/lib/hppa-linux-gnu/libc.so.6
    f928a000-f9294000 rwxp 00000000 00:00 0
    f9301000-f9323000 rwxp 00000000 00:00 0                 [stack]
    f98b4000-f98e4000 r-xp 00000000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e4000-f98e5000 r--p 00030000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f98e5000-f98e9000 rwxp 00031000 08:05 1572869           /usr/lib/hppa-linux-gnu/ld.so.1
    f9ad8000-f9b00000 rw-p 00000000 00:00 0
    f9b00000-f9b01000 r-xp 00000000 00:00 0                 [vdso]
    
    With the patch the stack gets correctly mapped at the end
    of the process memory map:
    
    root@panama:~# setarch -R /bin/bash -c "cat /proc/self/maps"
    00010000-00019000 r-xp 00000000 08:13 16385582          /usr/bin/cat
    00019000-0001a000 rwxp 00009000 08:13 16385582          /usr/bin/cat
    0001a000-0003b000 rwxp 00000000 00:00 0                 [heap]
    fef29000-ff0eb000 r-xp 00000000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0eb000-ff0ed000 r--p 001c2000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0ed000-ff0f2000 rwxp 001c4000 08:13 16122400          /usr/lib/hppa-linux-gnu/libc.so.6
    ff0f2000-ff0fc000 rwxp 00000000 00:00 0
    ff4b4000-ff4e4000 r-xp 00000000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e4000-ff4e6000 r--p 00030000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff4e6000-ff4ea000 rwxp 00032000 08:13 16121913          /usr/lib/hppa-linux-gnu/ld.so.1
    ff6d7000-ff6ff000 rw-p 00000000 00:00 0
    ff6ff000-ff700000 r-xp 00000000 00:00 0                 [vdso]
    ff700000-ff722000 rwxp 00000000 00:00 0                 [stack]
    
    Reported-by: Camm Maguire <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Fixes: d045c77c1a69 ("parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures")
    Fixes: 17d9822d4b4c ("parisc: Consider stack randomization for mmap base only when necessary")
    Cc: [email protected]      # v5.2+
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
perf callchain: Fix stitch LBR memory leaks [+ + +]
Author: Ian Rogers <[email protected]>
Date:   Wed Aug 7 22:46:43 2024 -0700

    perf callchain: Fix stitch LBR memory leaks
    
    [ Upstream commit 599c19397b17d197fc1184bbc950f163a292efc9 ]
    
    The 'struct callchain_cursor_node' has a 'struct map_symbol' whose maps
    and map members are reference counted. Ensure these values use a _get
    routine to increment the reference counts and use map_symbol__exit() to
    release the reference counts.
    
    Do similar for 'struct thread's prev_lbr_cursor, but save the size of
    the prev_lbr_cursor array so that it may be iterated.
    
    Ensure that when stitch_nodes are placed on the free list the
    map_symbols are exited.
    
    Fix resolve_lbr_callchain_sample() by replacing list_replace_init() to
    list_splice_init(), so the whole list is moved and nodes aren't leaked.
    
    A reproduction of the memory leaks is possible with a leak sanitizer
    build in the perf report command of:
    
      ```
      $ perf record -e cycles --call-graph lbr perf test -w thloop
      $ perf report --stitch-lbr
      ```
    
    Reviewed-by: Kan Liang <[email protected]>
    Fixes: ff165628d72644e3 ("perf callchain: Stitch LBR call stack")
    Signed-off-by: Ian Rogers <[email protected]>
    [ Basic tests after applying the patch, repeating the example above ]
    Tested-by: Arnaldo Carvalho de Melo <[email protected]>
    Cc: Adrian Hunter <[email protected]>
    Cc: Alexander Shishkin <[email protected]>
    Cc: Andi Kleen <[email protected]>
    Cc: Anne Macedo <[email protected]>
    Cc: Changbin Du <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
perf hist: Update hist symbol when updating maps [+ + +]
Author: Matt Fleming <[email protected]>
Date:   Thu Aug 15 15:22:12 2024 +0100

    perf hist: Update hist symbol when updating maps
    
    commit ac01c8c4246546fd8340a232f3ada1921dc0ee48 upstream.
    
    AddressSanitizer found a use-after-free bug in the symbol code which
    manifested as 'perf top' segfaulting.
    
      ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80
      READ of size 1 at 0x60b00c48844b thread T193
          #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310
          #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286
          #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614
          #3 0x5650d804568f in __hists__add_entry util/hist.c:754
          #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772
          #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997
          #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242
          #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845
          #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208
          #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245
          #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324
          #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120
          #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442
          #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    
    When updating hist maps it's also necessary to update the hist symbol
    reference because the old one gets freed in map__put().
    
    While this bug was probably introduced with 5c24b67aae72f54c ("perf
    tools: Replace map->referenced & maps->removed_maps with map->refcnt"),
    the symbol objects were leaked until c087e9480cf33672 ("perf machine:
    Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so
    the bug was masked.
    
    Fixes: c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL")
    Reported-by: Yunzhao Li <[email protected]>
    Signed-off-by: Matt Fleming (Cloudflare) <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: [email protected]
    Cc: Namhyung Kim <[email protected]>
    Cc: Riccardo Mancini <[email protected]>
    Cc: [email protected] # v5.13+
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
perf python: Allow checking for the existence of warning options in clang [+ + +]
Author: Arnaldo Carvalho de Melo <[email protected]>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Allow checking for the existence of warning options in clang
    
    commit b81162302001f41157f6e93654aaccc30e817e2a upstream.
    
    We'll need to check if an warning option introduced in clang 19 is
    available on the clang version being used, so cover the error message
    emitted when testing for a -W option.
    
    Tested-by: Sedat Dilek <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

perf python: Disable -Wno-cast-function-type-mismatch if present on clang [+ + +]
Author: Arnaldo Carvalho de Melo <[email protected]>
Date:   Thu Aug 22 14:13:49 2024 -0300

    perf python: Disable -Wno-cast-function-type-mismatch if present on clang
    
    commit 00dc514612fe98cfa117193b9df28f15e7c9db9c upstream.
    
    The -Wcast-function-type-mismatch option was introduced in clang 19 and
    its enabled by default, since we use -Werror, and python bindings do
    casts that are valid but trips this warning, disable it if present.
    
    Closes: https://lore.kernel.org/all/CA+icZUXoJ6BS3GMhJHV3aZWyb5Cz2haFneX0C5pUMUUhG-UVKQ@mail.gmail.com
    Reported-by: Sedat Dilek <[email protected]>
    Tested-by: Sedat Dilek <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: [email protected] # To allow building with the upcoming clang 19
    Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
perf report: Fix segfault when 'sym' sort key is not used [+ + +]
Author: Namhyung Kim <[email protected]>
Date:   Mon Aug 26 15:10:42 2024 -0700

    perf report: Fix segfault when 'sym' sort key is not used
    
    commit 9af2efee41b27a0f386fb5aa95d8d0b4b5d9fede upstream.
    
    The fields in the hist_entry are filled on-demand which means they only
    have meaningful values when relevant sort keys are used.
    
    So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in
    the hist entry can be garbage.  So it shouldn't access it
    unconditionally.
    
    I got a segfault, when I wanted to see cgroup profiles.
    
      $ sudo perf record -a --all-cgroups --synth=cgroup true
    
      $ sudo perf report -s cgroup
    
      Program received signal SIGSEGV, Segmentation fault.
      0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      48            return RC_CHK_ACCESS(map)->dso;
      (gdb) bt
      #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
      #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
      #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
      #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
          at util/hist.c:644
      #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
      #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
          sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
      #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
      #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
          at util/hist.c:1260
      #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
          machine=0x5555560388e8) at builtin-report.c:334
      #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
      #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
          sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
      #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
          file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
      #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
      #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
      #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
      #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
      #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
          at util/session.c:780
      #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
          file_path=0x555556038ff0 "perf.data") at util/session.c:1406
    
    As you can see the entry->ms.map was NULL even if he->ms.map has a
    value.  This is because 'sym' sort key is not given, so it cannot assume
    whether he->ms.sym and entry->ms.sym is the same.  I only checked the
    'sym' sort key here as it implies 'dso' behavior (so maps are the same).
    
    Fixes: ac01c8c4246546fd ("perf hist: Update hist symbol when updating maps")
    Signed-off-by: Namhyung Kim <[email protected]>
    Cc: Adrian Hunter <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Kan Liang <[email protected]>
    Cc: Matt Fleming <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Stephane Eranian <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
perf,x86: avoid missing caller address in stack traces captured in uprobe [+ + +]
Author: Andrii Nakryiko <[email protected]>
Date:   Mon Jul 29 10:52:23 2024 -0700

    perf,x86: avoid missing caller address in stack traces captured in uprobe
    
    [ Upstream commit cfa7f3d2c526c224a6271cc78a4a27a0de06f4f0 ]
    
    When tracing user functions with uprobe functionality, it's common to
    install the probe (e.g., a BPF program) at the first instruction of the
    function. This is often going to be `push %rbp` instruction in function
    preamble, which means that within that function frame pointer hasn't
    been established yet. This leads to consistently missing an actual
    caller of the traced function, because perf_callchain_user() only
    records current IP (capturing traced function) and then following frame
    pointer chain (which would be caller's frame, containing the address of
    caller's caller).
    
    So when we have target_1 -> target_2 -> target_3 call chain and we are
    tracing an entry to target_3, captured stack trace will report
    target_1 -> target_3 call chain, which is wrong and confusing.
    
    This patch proposes a x86-64-specific heuristic to detect `push %rbp`
    (`push %ebp` on 32-bit architecture) instruction being traced. Given
    entire kernel implementation of user space stack trace capturing works
    under assumption that user space code was compiled with frame pointer
    register (%rbp/%ebp) preservation, it seems pretty reasonable to use
    this instruction as a strong indicator that this is the entry to the
    function. In that case, return address is still pointed to by %rsp/%esp,
    so we fetch it and add to stack trace before proceeding to unwind the
    rest using frame pointer-based logic.
    
    We also check for `endbr64` (for 64-bit modes) as another common pattern
    for function entry, as suggested by Josh Poimboeuf. Even if we get this
    wrong sometimes for uprobes attached not at the function entry, it's OK
    because stack trace will still be overall meaningful, just with one
    extra bogus entry. If we don't detect this, we end up with guaranteed to
    be missing caller function entry in the stack trace, which is worse
    overall.
    
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
perf/core: Fix small negative period being ignored [+ + +]
Author: Luo Gengkun <[email protected]>
Date:   Sat Aug 31 07:43:15 2024 +0000

    perf/core: Fix small negative period being ignored
    
    commit 62c0b1061593d7012292f781f11145b2d46f43ab upstream.
    
    In perf_adjust_period, we will first calculate period, and then use
    this period to calculate delta. However, when delta is less than 0,
    there will be a deviation compared to when delta is greater than or
    equal to 0. For example, when delta is in the range of [-14,-1], the
    range of delta = delta + 7 is between [-7,6], so the final value of
    delta/8 is 0. Therefore, the impact of -1 and -2 will be ignored.
    This is unacceptable when the target period is very short, because
    we will lose a lot of samples.
    
    Here are some tests and analyzes:
    before:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.022 MB perf.data (518 samples) ]
    
      # perf script
      ...
      a.out     396   257.956048:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.957891:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.959730:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.961545:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.963355:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.965163:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.966973:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.968785:         23 cs:  ffffffff81f4eeec schedul>
      a.out     396   257.970593:         23 cs:  ffffffff81f4eeec schedul>
      ...
    
    after:
      # perf record -e cs -F 1000  ./a.out
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.058 MB perf.data (1466 samples) ]
    
      # perf script
      ...
      a.out     395    59.338813:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.339707:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.340682:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.341751:         13 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.342799:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.343765:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.344651:         11 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.345539:         12 cs:  ffffffff81f4eeec schedul>
      a.out     395    59.346502:         13 cs:  ffffffff81f4eeec schedul>
      ...
    
    test.c
    
    int main() {
            for (int i = 0; i < 20000; i++)
                    usleep(10);
    
            return 0;
    }
    
      # time ./a.out
      real    0m1.583s
      user    0m0.040s
      sys     0m0.298s
    
    The above results were tested on x86-64 qemu with KVM enabled using
    test.c as test program. Ideally, we should have around 1500 samples,
    but the previous algorithm had only about 500, whereas the modified
    algorithm now has about 1400. Further more, the new version shows 1
    sample per 0.001s, while the previous one is 1 sample per 0.002s.This
    indicates that the new algorithm is more sensitive to small negative
    values compared to old algorithm.
    
    Fixes: bd2b5b12849a ("perf_counter: More aggressive frequency adjustment")
    Signed-off-by: Luo Gengkun <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Adrian Hunter <[email protected]>
    Reviewed-by: Kan Liang <[email protected]>
    Cc: [email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
perf: Fix event_function_call() locking [+ + +]
Author: Peter Zijlstra <[email protected]>
Date:   Wed Aug 7 13:29:27 2024 +0200

    perf: Fix event_function_call() locking
    
    [ Upstream commit 558abc7e3f895049faa46b08656be4c60dc6e9fd ]
    
    All the event_function/@func call context already uses perf_ctx_lock()
    except for the !ctx->is_active case. Make it all consistent.
    
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Reviewed-by: Kan Liang <[email protected]>
    Reviewed-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

perf: Really fix event_function_call() locking [+ + +]
Author: Namhyung Kim <[email protected]>
Date:   Tue Aug 13 22:55:11 2024 +0200

    perf: Really fix event_function_call() locking
    
    [ Upstream commit fe826cc2654e8561b64246325e6a51b62bf2488c ]
    
    Commit 558abc7e3f89 ("perf: Fix event_function_call() locking") lost
    IRQ disabling by mistake.
    
    Fixes: 558abc7e3f89 ("perf: Fix event_function_call() locking")
    Reported-by: Pengfei Xu <[email protected]>
    Reported-by: Naresh Kamboju <[email protected]>
    Tested-by: Pengfei Xu <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
platform/mellanox: mlxbf-pmc: fix lockdep warning [+ + +]
Author: Luiz Capitulino <[email protected]>
Date:   Thu Sep 12 15:05:32 2024 -0400

    platform/mellanox: mlxbf-pmc: fix lockdep warning
    
    [ Upstream commit 305790dd91057a3f7497c9d128614a4f8486b62b ]
    
    It seems the mlxbf-pmc driver is missing initializing sysfs attributes
    which causes the warning below when CONFIG_LOCKDEP and
    CONFIG_DEBUG_LOCK_ALLOC are enabled. This commit fixes it.
    
    [  155.380843] BUG: key ffff470f45dfa6d8 has not been registered!
    [  155.386749] ------------[ cut here ]------------
    [  155.391361] DEBUG_LOCKS_WARN_ON(1)
    [  155.391381] WARNING: CPU: 4 PID: 1828 at kernel/locking/lockdep.c:4894 lockdep_init_map_type+0x1d0/0x288
    [  155.404254] Modules linked in: mlxbf_pmc(+) xfs libcrc32c mmc_block mlx5_core crct10dif_ce mlxfw ghash_ce virtio_net tls net_failover sha2
    _ce failover psample sha256_arm64 dw_mmc_bluefield pci_hyperv_intf sha1_ce dw_mmc_pltfm sbsa_gwdt dw_mmc micrel mmc_core nfit i2c_mlxbf pwr_m
    lxbf gpio_generic libnvdimm mlxbf_tmfifo mlxbf_gige dm_mirror dm_region_hash dm_log dm_mod
    [  155.436786] CPU: 4 UID: 0 PID: 1828 Comm: modprobe Kdump: loaded Not tainted 6.11.0-rc7-rep1+ #1
    [  155.445562] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.8.0.13249 Aug  7 2024
    [  155.455463] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  155.462413] pc : lockdep_init_map_type+0x1d0/0x288
    [  155.467196] lr : lockdep_init_map_type+0x1d0/0x288
    [  155.471976] sp : ffff80008a1734e0
    [  155.475279] x29: ffff80008a1734e0 x28: ffff470f45df0240 x27: 00000000ffffee4b
    [  155.482406] x26: 00000000000011b4 x25: 0000000000000000 x24: 0000000000000000
    [  155.489532] x23: ffff470f45dfa6d8 x22: 0000000000000000 x21: ffffd54ef6bea000
    [  155.496659] x20: ffff470f45dfa6d8 x19: ffff470f49cdc638 x18: ffffffffffffffff
    [  155.503784] x17: 2f30303a31444642 x16: ffffd54ef48a65e8 x15: ffff80010a172fe7
    [  155.510911] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47
    [  155.518037] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffd54ef3f48a14
    [  155.525163] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 00000000002bffa8
    [  155.532289] x5 : ffff4712bdcb6088 x4 : 0000000000000000 x3 : 0000000000000027
    [  155.539416] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff470f43e5be00
    [  155.546542] Call trace:
    [  155.548976]  lockdep_init_map_type+0x1d0/0x288
    [  155.553410]  __kernfs_create_file+0x80/0x138
    [  155.557673]  sysfs_add_file_mode_ns+0x94/0x150
    [  155.562106]  create_files+0xb0/0x248
    [  155.565672]  internal_create_group+0x10c/0x328
    [  155.570105]  internal_create_groups.part.0+0x50/0xc8
    [  155.575060]  sysfs_create_groups+0x20/0x38
    [  155.579146]  device_add_attrs+0x1b8/0x228
    [  155.583146]  device_add+0x2a4/0x690
    [  155.586625]  device_register+0x24/0x38
    [  155.590362]  __hwmon_device_register+0x1e0/0x3c8
    [  155.594969]  devm_hwmon_device_register_with_groups+0x78/0xe0
    [  155.600703]  mlxbf_pmc_probe+0x224/0x3a0 [mlxbf_pmc]
    [  155.605669]  platform_probe+0x6c/0xe0
    [  155.609320]  really_probe+0xc4/0x398
    [  155.612887]  __driver_probe_device+0x80/0x168
    [  155.617233]  driver_probe_device+0x44/0x120
    [  155.621405]  __driver_attach+0xf4/0x200
    [  155.625230]  bus_for_each_dev+0x7c/0xe8
    [  155.629055]  driver_attach+0x28/0x38
    [  155.632619]  bus_add_driver+0x110/0x238
    [  155.636445]  driver_register+0x64/0x128
    [  155.640270]  __platform_driver_register+0x2c/0x40
    [  155.644965]  pmc_driver_init+0x24/0xff8 [mlxbf_pmc]
    [  155.649833]  do_one_initcall+0x70/0x3d0
    [  155.653660]  do_init_module+0x64/0x220
    [  155.657400]  load_module+0x628/0x6a8
    [  155.660964]  init_module_from_file+0x8c/0xd8
    [  155.665222]  idempotent_init_module+0x194/0x290
    [  155.669742]  __arm64_sys_finit_module+0x6c/0xd8
    [  155.674261]  invoke_syscall.constprop.0+0x74/0xd0
    [  155.678957]  do_el0_svc+0xb4/0xd0
    [  155.682262]  el0_svc+0x5c/0x248
    [  155.685394]  el0t_64_sync_handler+0x134/0x150
    [  155.689739]  el0t_64_sync+0x17c/0x180
    [  155.693390] irq event stamp: 6407
    [  155.696693] hardirqs last  enabled at (6407): [<ffffd54ef3f48564>] console_unlock+0x154/0x1b8
    [  155.705207] hardirqs last disabled at (6406): [<ffffd54ef3f485ac>] console_unlock+0x19c/0x1b8
    [  155.713719] softirqs last  enabled at (6404): [<ffffd54ef3e9740c>] handle_softirqs+0x4f4/0x518
    [  155.722320] softirqs last disabled at (6395): [<ffffd54ef3df0160>] __do_softirq+0x18/0x20
    [  155.730484] ---[ end trace 0000000000000000 ]---
    
    Signed-off-by: Luiz Capitulino <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
platform/x86/amd: pmf: Add quirk for TUF Gaming A14 [+ + +]
Author: aln8 <[email protected]>
Date:   Thu Sep 12 15:36:01 2024 +0800

    platform/x86/amd: pmf: Add quirk for TUF Gaming A14
    
    [ Upstream commit 06369503d644068abd9e90918c6611274d94c126 ]
    
    The ASUS TUF Gaming A14 has the same issue as the ROG Zephyrus G14
    where it advertises SPS support but doesn't use it.
    
    Signed-off-by: aln8 <[email protected]>
    Acked-by: Shyam Sundar S K <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug [+ + +]
Author: Zach Wade <[email protected]>
Date:   Mon Sep 23 22:45:08 2024 +0800

    platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
    
    commit 7d59ac07ccb58f8f604f8057db63b8efcebeb3de upstream.
    
    Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds".
    kasan report:
    [   19.411889] ==================================================================
    [   19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113
    [   19.417368]
    [   19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G            E      6.9.0 #10
    [   19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022
    [   19.422687] Call Trace:
    [   19.424091]  <TASK>
    [   19.425448]  dump_stack_lvl+0x5d/0x80
    [   19.426963]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.428694]  print_report+0x19d/0x52e
    [   19.430206]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
    [   19.431837]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.433539]  kasan_report+0xf0/0x170
    [   19.435019]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.436709]  _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
    [   19.438379]  ? __pfx_sched_clock_cpu+0x10/0x10
    [   19.439910]  isst_if_cpu_online+0x406/0x58f [isst_if_common]
    [   19.441573]  ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common]
    [   19.443263]  ? ttwu_queue_wakelist+0x2c1/0x360
    [   19.444797]  cpuhp_invoke_callback+0x221/0xec0
    [   19.446337]  cpuhp_thread_fun+0x21b/0x610
    [   19.447814]  ? __pfx_cpuhp_thread_fun+0x10/0x10
    [   19.449354]  smpboot_thread_fn+0x2e7/0x6e0
    [   19.450859]  ? __pfx_smpboot_thread_fn+0x10/0x10
    [   19.452405]  kthread+0x29c/0x350
    [   19.453817]  ? __pfx_kthread+0x10/0x10
    [   19.455253]  ret_from_fork+0x31/0x70
    [   19.456685]  ? __pfx_kthread+0x10/0x10
    [   19.458114]  ret_from_fork_asm+0x1a/0x30
    [   19.459573]  </TASK>
    [   19.460853]
    [   19.462055] Allocated by task 1198:
    [   19.463410]  kasan_save_stack+0x30/0x50
    [   19.464788]  kasan_save_track+0x14/0x30
    [   19.466139]  __kasan_kmalloc+0xaa/0xb0
    [   19.467465]  __kmalloc+0x1cd/0x470
    [   19.468748]  isst_if_cdev_register+0x1da/0x350 [isst_if_common]
    [   19.470233]  isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr]
    [   19.471670]  do_one_initcall+0xa4/0x380
    [   19.472903]  do_init_module+0x238/0x760
    [   19.474105]  load_module+0x5239/0x6f00
    [   19.475285]  init_module_from_file+0xd1/0x130
    [   19.476506]  idempotent_init_module+0x23b/0x650
    [   19.477725]  __x64_sys_finit_module+0xbe/0x130
    [   19.476506]  idempotent_init_module+0x23b/0x650
    [   19.477725]  __x64_sys_finit_module+0xbe/0x130
    [   19.478920]  do_syscall_64+0x82/0x160
    [   19.480036]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [   19.481292]
    [   19.482205] The buggy address belongs to the object at ffff888829e65000
     which belongs to the cache kmalloc-512 of size 512
    [   19.484818] The buggy address is located 0 bytes to the right of
     allocated 512-byte region [ffff888829e65000, ffff888829e65200)
    [   19.487447]
    [   19.488328] The buggy address belongs to the physical page:
    [   19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60
    [   19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [   19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
    [   19.493914] page_type: 0xffffffff()
    [   19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
    [   19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
    [   19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff
    [   19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
    [   19.503784] page dumped because: kasan: bad access detected
    [   19.505058]
    [   19.505970] Memory state around the buggy address:
    [   19.507172]  ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.508599]  ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [   19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510014]                    ^
    [   19.510016]  ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.510018]  ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [   19.515367] ==================================================================
    
    The reason for this error is physical_package_ids assigned by VMware VMM
    are not continuous and have gaps. This will cause value returned by
    topology_physical_package_id() to be more than topology_max_packages().
    
    Here the allocation uses topology_max_packages(). The call to
    topology_max_packages() returns maximum logical package ID not physical
    ID. Hence use topology_logical_package_id() instead of
    topology_physical_package_id().
    
    Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering")
    Cc: [email protected]
    Acked-by: Srinivas Pandruvada <[email protected]>
    Signed-off-by: Zach Wade <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

platform/x86: lenovo-ymc: Ignore the 0x0 state [+ + +]
Author: Gergo Koteles <[email protected]>
Date:   Thu Aug 22 17:38:57 2024 +0200

    platform/x86: lenovo-ymc: Ignore the 0x0 state
    
    [ Upstream commit d9dca215708d32e7f88ac0591fbb187cbf368adb ]
    
    While booting, Lenovo 14ARB7 reports 'lenovo-ymc: Unknown key 0 pressed'
    warning. This is caused by lenovo_ymc_probe() calling lenovo_ymc_notify()
    at probe time to get the initial tablet-mode-switch state and the key-code
    lenovo_ymc_notify() reads from the firmware is not initialized at probe
    time yet on the Lenovo 14ARB7.
    
    The hardware/firmware does an ACPI notify on the WMI device itself when
    it initializes the tablet-mode-switch state later on.
    
    Add 0x0 YMC state to the sparse keymap to silence the warning.
    
    Signed-off-by: Gergo Koteles <[email protected]>
    Link: https://lore.kernel.org/r/08ab73bb74c4ad448409f2ce707b1148874a05ce.1724340562.git.soyer@irl.hu
    [[email protected]: Reword commit message]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: touchscreen_dmi: add nanote-next quirk [+ + +]
Author: Ckath <[email protected]>
Date:   Wed Sep 11 21:12:40 2024 +0200

    platform/x86: touchscreen_dmi: add nanote-next quirk
    
    [ Upstream commit c11619af35bae5884029bd14170c3e4b55ddf6f3 ]
    
    Add touschscreen info for the nanote next (UMPC-03-SR).
    
    After checking with multiple owners the DMI info really is this generic.
    
    Signed-off-by: Ckath <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Hans de Goede <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Mon Sep 16 11:02:55 2024 +0200

    platform/x86: x86-android-tablets: Adjust Xiaomi Pad 2 bottom bezel touch buttons LED
    
    [ Upstream commit df40a23cc34c200cfde559eda7ca540f3ae7bd9e ]
    
    The "input-events" LED trigger used to turn on the backlight LEDs had to
    be rewritten to use led_trigger_register_simple() + led_trigger_event()
    to fix a serious locking issue.
    
    This means it no longer supports using blink_brightness to set a per LED
    brightness for the trigger and it no longer sets LED_CORE_SUSPENDRESUME.
    
    Adjust the MiPad 2 bottom bezel touch buttons LED class device to match:
    
    1. Make LED_FULL the maximum brightness to fix the LED brightness
       being very low when on.
    2. Set flags = LED_CORE_SUSPENDRESUME.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Sat Oct 5 15:05:45 2024 +0200

    platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors
    
    commit 2fae3129c0c08e72b1fe93e61fd8fd203252094a upstream.
    
    x86_android_tablet_remove() frees the pdevs[] array, so it should not
    be used after calling x86_android_tablet_remove().
    
    When platform_device_register() fails, store the pdevs[x] PTR_ERR() value
    into the local ret variable before calling x86_android_tablet_remove()
    to avoid using pdevs[] after it has been freed.
    
    Fixes: 5eba0141206e ("platform/x86: x86-android-tablets: Add support for instantiating platform-devs")
    Fixes: e2200d3f26da ("platform/x86: x86-android-tablets: Add gpio_keys support to x86_android_tablet_init()")
    Cc: [email protected]
    Reported-by: Aleksandr Burakov <[email protected]>
    Closes: https://lore.kernel.org/platform-driver-x86/[email protected]/
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set() [+ + +]
Author: Ulf Hansson <[email protected]>
Date:   Mon May 27 16:25:52 2024 +0200

    pmdomain: core: Don't hold the genpd-lock when calling dev_pm_domain_set()
    
    [ Upstream commit b87eee38605c396f0e1fa435939960e5c6cd41d6 ]
    
    There is no need to hold the genpd-lock, while assigning the
    dev->pm_domain. In fact, it becomes a problem on a PREEMPT_RT based
    configuration as the genpd-lock may be a raw spinlock, while the lock
    acquired through the call to dev_pm_domain_set() is a regular spinlock.
    
    To fix the problem, let's simply move the calls to dev_pm_domain_set()
    outside the genpd-lock.
    
    Signed-off-by: Ulf Hansson <[email protected]>
    Tested-by: Raghavendra Kakarla <[email protected]>  # qcm6490 with PREEMPT_RT set
    Acked-by: Sebastian Andrzej Siewior <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
power: reset: brcmstb: Do not go into infinite loop if reset fails [+ + +]
Author: Andrew Davis <[email protected]>
Date:   Mon Jun 10 09:28:36 2024 -0500

    power: reset: brcmstb: Do not go into infinite loop if reset fails
    
    [ Upstream commit cf8c39b00e982fa506b16f9d76657838c09150cb ]
    
    There may be other backup reset methods available, do not halt
    here so that other reset methods can be tried.
    
    Signed-off-by: Andrew Davis <[email protected]>
    Reviewed-by: Dhruva Gole <[email protected]>
    Acked-by: Florian Fainelli <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

power: supply: hwmon: Fix missing temp1_max_alarm attribute [+ + +]
Author: Hans de Goede <[email protected]>
Date:   Sun Sep 8 20:53:37 2024 +0200

    power: supply: hwmon: Fix missing temp1_max_alarm attribute
    
    commit e50a57d16f897e45de1112eb6478577b197fab52 upstream.
    
    Temp channel 0 aka temp1 can have a temp1_max_alarm attribute for
    power_supply devices which have a POWER_SUPPLY_PROP_TEMP_ALERT_MAX
    property.
    
    HWMON_T_MAX_ALARM was missing from power_supply_hwmon_info for
    temp channel 0, causing the hwmon temp1_max_alarm attribute to be
    missing from such power_supply devices.
    
    Add this to power_supply_hwmon_info to fix this.
    
    Fixes: f1d33ae806ec ("power: supply: remove duplicated argument in power_supply_hwmon_info")
    Cc: [email protected]
    Signed-off-by: Hans de Goede <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sebastian Reichel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
powerpc/pseries: Use correct data types from pseries_hp_errorlog struct [+ + +]
Author: Haren Myneni <[email protected]>
Date:   Wed Aug 21 19:50:26 2024 -0700

    powerpc/pseries: Use correct data types from pseries_hp_errorlog struct
    
    [ Upstream commit b76e0d4215b6b622127ebcceaa7f603313ceaec4 ]
    
    _be32 type is defined for some elements in pseries_hp_errorlog
    struct but also used them u32 after be32_to_cpu() conversion.
    
    Example: In handle_dlpar_errorlog()
    hp_elog->_drc_u.drc_index = be32_to_cpu(hp_elog->_drc_u.drc_index);
    
    And later assigned to u32 type
    dlpar_cpu() - u32 drc_index = hp_elog->_drc_u.drc_index;
    
    This incorrect usage is giving the following warnings and the
    patch resolve these warnings with the correct assignment.
    
    arch/powerpc/platforms/pseries/dlpar.c:398:53: sparse: sparse:
    incorrect type in argument 1 (different base types) @@
    expected unsigned int [usertype] drc_index @@
    got restricted __be32 [usertype] drc_index @@
    ...
    arch/powerpc/platforms/pseries/dlpar.c:418:43: sparse: sparse:
    incorrect type in assignment (different base types) @@
    expected restricted __be32 [usertype] drc_count @@
    got unsigned int [usertype] @@
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Haren Myneni <[email protected]>
    
    v3:
    - Fix warnings from using incorrect data types in pseries_hp_errorlog
      struct
    v2:
    - Remove pr_info() and TODO comments
    - Update more information in the commit logs
    
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

 
powerpc/vdso: Fix VDSO data access when running in a non-root time namespace [+ + +]
Author: Christophe Leroy <[email protected]>
Date:   Fri Sep 6 10:33:43 2024 +0200

    powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
    
    [ Upstream commit c73049389e58c01e2e3bbfae900c8daeee177191 ]
    
    When running in a non-root time namespace, the global VDSO data page
    is replaced by a dedicated namespace data page and the global data
    page is mapped next to it. Detailed explanations can be found at
    commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support").
    
    When it happens, __kernel_get_syscall_map and __kernel_get_tbfreq
    and __kernel_sync_dicache don't work anymore because they read 0
    instead of the data they need.
    
    To address that, clock_mode has to be read. When it is set to
    VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page
    and the global data is located on the following page.
    
    Add a macro called get_realdatapage which reads clock_mode and add
    PAGE_SIZE to the pointer provided by get_datapage macro when
    clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro
    instead of get_datapage macro except for time functions as they handle
    it internally.
    
    Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces")
    Reported-by: Jason A. Donenfeld <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Christophe Leroy <[email protected]>
    Acked-by: Michael Ellerman <[email protected]>
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
ppp: do not assume bh is held in ppp_channel_bridge_input() [+ + +]
Author: Eric Dumazet <[email protected]>
Date:   Fri Sep 27 07:45:53 2024 +0000

    ppp: do not assume bh is held in ppp_channel_bridge_input()
    
    [ Upstream commit aec7291003df78cb71fd461d7b672912bde55807 ]
    
    Networking receive path is usually handled from BH handler.
    However, some protocols need to acquire the socket lock, and
    packets might be stored in the socket backlog is the socket was
    owned by a user process.
    
    In this case, release_sock(), __release_sock(), and sk_backlog_rcv()
    might call the sk->sk_backlog_rcv() handler in process context.
    
    sybot caught ppp was not considering this case in
    ppp_channel_bridge_input() :
    
    WARNING: inconsistent lock state
    6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/1/24 [HC0[0]:SC1[1]:HE1:SE0] takes:
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
     ffff0000db7f11e0 (&pch->downl){+.?.}-{2:2}, at: ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
    {SOFTIRQ-ON-W} state was registered at:
       lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:351 [inline]
       ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
       ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
       pppoe_rcv_core+0xfc/0x314 drivers/net/ppp/pppoe.c:379
       sk_backlog_rcv include/net/sock.h:1111 [inline]
       __release_sock+0x1a8/0x3d8 net/core/sock.c:3004
       release_sock+0x68/0x1b8 net/core/sock.c:3558
       pppoe_sendmsg+0xc8/0x5d8 drivers/net/ppp/pppoe.c:903
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       __sys_sendto+0x374/0x4f4 net/socket.c:2204
       __do_sys_sendto net/socket.c:2216 [inline]
       __se_sys_sendto net/socket.c:2212 [inline]
       __arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
       el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
       do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
       el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
       el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
       el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
    irq event stamp: 282914
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
     hardirqs last  enabled at (282914): [<ffff80008b42e30c>] _raw_spin_unlock_irqrestore+0x38/0x98 kernel/locking/spinlock.c:194
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
     hardirqs last disabled at (282913): [<ffff80008b42e13c>] _raw_spin_lock_irqsave+0x2c/0x7c kernel/locking/spinlock.c:162
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] softirq_handle_end kernel/softirq.c:400 [inline]
     softirqs last  enabled at (282904): [<ffff8000801f8e88>] handle_softirqs+0xa3c/0xbfc kernel/softirq.c:582
     softirqs last disabled at (282909): [<ffff8000801fbdf8>] run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
    
    other info that might help us debug this:
     Possible unsafe locking scenario:
    
           CPU0
           ----
      lock(&pch->downl);
      <Interrupt>
        lock(&pch->downl);
    
     *** DEADLOCK ***
    
    1 lock held by ksoftirqd/1/24:
      #0: ffff80008f74dfa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:325
    
    stack backtrace:
    CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
    Call trace:
      dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
      show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
      __dump_stack lib/dump_stack.c:93 [inline]
      dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
      dump_stack+0x1c/0x28 lib/dump_stack.c:128
      print_usage_bug+0x698/0x9ac kernel/locking/lockdep.c:4000
     mark_lock_irq+0x980/0xd2c
      mark_lock+0x258/0x360 kernel/locking/lockdep.c:4677
      __lock_acquire+0xf48/0x779c kernel/locking/lockdep.c:5096
      lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
      __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
      _raw_spin_lock+0x48/0x60 kernel/locking/spinlock.c:154
      spin_lock include/linux/spinlock.h:351 [inline]
      ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2272 [inline]
      ppp_input+0x16c/0x854 drivers/net/ppp/ppp_generic.c:2304
      ppp_async_process+0x98/0x150 drivers/net/ppp/ppp_async.c:495
      tasklet_action_common+0x318/0x3f4 kernel/softirq.c:785
      tasklet_action+0x68/0x8c kernel/softirq.c:811
      handle_softirqs+0x2e4/0xbfc kernel/softirq.c:554
      run_ksoftirqd+0x70/0x158 kernel/softirq.c:928
      smpboot_thread_fn+0x4b0/0x90c kernel/smpboot.c:164
      kthread+0x288/0x310 kernel/kthread.c:389
      ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
    
    Fixes: 4cf476ced45d ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netdev/[email protected]/T/#u
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Tom Parkin <[email protected]>
    Cc: James Chapman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
proc: add config & param to block forcing mem writes [+ + +]
Author: Adrian Ratiu <[email protected]>
Date:   Fri Aug 2 11:02:25 2024 +0300

    proc: add config & param to block forcing mem writes
    
    [ Upstream commit 41e8149c8892ed1962bd15350b3c3e6e90cba7f4 ]
    
    This adds a Kconfig option and boot param to allow removing
    the FOLL_FORCE flag from /proc/pid/mem write calls because
    it can be abused.
    
    The traditional forcing behavior is kept as default because
    it can break GDB and some other use cases.
    
    Previously we tried a more sophisticated approach allowing
    distributions to fine-tune /proc/pid/mem behavior, however
    that got NAK-ed by Linus [1], who prefers this simpler
    approach with semantics also easier to understand for users.
    
    Link: https://lore.kernel.org/lkml/CAHk-=wiGWLChxYmUA5HrT5aopZrB7_2VTa0NLZcxORgkUe5tEQ@mail.gmail.com/ [1]
    Cc: Doug Anderson <[email protected]>
    Cc: Jeff Xu <[email protected]>
    Cc: Jann Horn <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: Ard Biesheuvel <[email protected]>
    Cc: Christian Brauner <[email protected]>
    Suggested-by: Linus Torvalds <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Adrian Ratiu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Christian Brauner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
r8169: add tally counter fields added with RTL8125 [+ + +]
Author: Heiner Kallweit <[email protected]>
Date:   Tue Sep 17 23:04:46 2024 +0200

    r8169: add tally counter fields added with RTL8125
    
    [ Upstream commit ced8e8b8f40accfcce4a2bbd8b150aa76d5eff9a ]
    
    RTL8125 added fields to the tally counter, what may result in the chip
    dma'ing these new fields to unallocated memory. Therefore make sure
    that the allocated memory area is big enough to hold all of the
    tally counter values, even if we use only parts of it.
    
    Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
    Cc: [email protected]
    Signed-off-by: Heiner Kallweit <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun" [+ + +]
Author: Colin Ian King <[email protected]>
Date:   Mon Sep 9 15:00:21 2024 +0100

    r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"
    
    [ Upstream commit 8df9439389a44fb2cc4ef695e08d6a8870b1616c ]
    
    There is a spelling mistake in the struct field tx_underun, rename
    it to tx_underrun.
    
    Signed-off-by: Colin Ian King <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Heiner Kallweit <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: ced8e8b8f40a ("r8169: add tally counter fields added with RTL8125")
    Signed-off-by: Sasha Levin <[email protected]>

 
rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb() [+ + +]
Author: Zqiang <[email protected]>
Date:   Wed Jul 10 12:45:42 2024 +0800

    rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()
    
    [ Upstream commit fd70e9f1d85f5323096ad313ba73f5fe3d15ea41 ]
    
    For kernels built with CONFIG_FORCE_NR_CPUS=y, the nr_cpu_ids is
    defined as NR_CPUS instead of the number of possible cpus, this
    will cause the following system panic:
    
    smpboot: Allowing 4 CPUs, 0 hotplug CPUs
    ...
    setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:512 nr_node_ids:1
    ...
    BUG: unable to handle page fault for address: ffffffff9911c8c8
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 0 PID: 15 Comm: rcu_tasks_trace Tainted: G W
    6.6.21 #1 5dc7acf91a5e8e9ac9dcfc35bee0245691283ea6
    RIP: 0010:rcu_tasks_need_gpcb+0x25d/0x2c0
    RSP: 0018:ffffa371c00a3e60 EFLAGS: 00010082
    CR2: ffffffff9911c8c8 CR3: 000000040fa20005 CR4: 00000000001706f0
    Call Trace:
    <TASK>
    ? __die+0x23/0x80
    ? page_fault_oops+0xa4/0x180
    ? exc_page_fault+0x152/0x180
    ? asm_exc_page_fault+0x26/0x40
    ? rcu_tasks_need_gpcb+0x25d/0x2c0
    ? __pfx_rcu_tasks_kthread+0x40/0x40
    rcu_tasks_one_gp+0x69/0x180
    rcu_tasks_kthread+0x94/0xc0
    kthread+0xe8/0x140
    ? __pfx_kthread+0x40/0x40
    ret_from_fork+0x34/0x80
    ? __pfx_kthread+0x40/0x40
    ret_from_fork_asm+0x1b/0x80
    </TASK>
    
    Considering that there may be holes in the CPU numbers, use the
    maximum possible cpu number, instead of nr_cpu_ids, for configuring
    enqueue and dequeue limits.
    
    [ neeraj.upadhyay: Fix htmldocs build error reported by Stephen Rothwell ]
    
    Closes: https://lore.kernel.org/linux-input/CALMA0xaTSMN+p4xUXkzrtR5r6k7hgoswcaXx7baR_z9r5jjskw@mail.gmail.com/T/#u
    Reported-by: Zhixu Liu <[email protected]>
    Signed-off-by: Zqiang <[email protected]>
    Signed-off-by: Neeraj Upadhyay <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
rcuscale: Provide clear error when async specified without primitives [+ + +]
Author: Paul E. McKenney <[email protected]>
Date:   Thu Aug 1 17:43:03 2024 -0700

    rcuscale: Provide clear error when async specified without primitives
    
    [ Upstream commit 11377947b5861fa59bf77c827e1dd7c081842cc9 ]
    
    Currently, if the rcuscale module's async module parameter is specified
    for RCU implementations that do not have async primitives such as RCU
    Tasks Rude (which now lacks a call_rcu_tasks_rude() function), there
    will be a series of splats due to calls to a NULL pointer.  This commit
    therefore warns of this situation, but switches to non-async testing.
    
    Signed-off-by: "Paul E. McKenney" <[email protected]>
    Signed-off-by: Neeraj Upadhyay <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page [+ + +]
Author: Long Li <[email protected]>
Date:   Fri Aug 30 08:16:33 2024 -0700

    RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page
    
    commit 4a3b99bc04e501b816db78f70064e26a01257910 upstream.
    
    When mapping doorbell page from user-mode, the driver should use the system
    page size as this memory is allocated via mmap() from user-mode.
    
    Cc: [email protected]
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

RDMA/mana_ib: use the correct page table index based on hardware page size [+ + +]
Author: Long Li <[email protected]>
Date:   Fri Aug 30 08:16:32 2024 -0700

    RDMA/mana_ib: use the correct page table index based on hardware page size
    
    [ Upstream commit 9e517a8e9d9a303bf9bde35e5c5374795544c152 ]
    
    MANA hardware uses 4k page size. When calculating the page table index,
    it should use the hardware page size, not the system page size.
    
    Cc: [email protected]
    Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
    Signed-off-by: Long Li <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
remoteproc: k3-r5: Acquire mailbox handle during probe routine [+ + +]
Author: Beleswar Padhi <[email protected]>
Date:   Thu Aug 8 13:11:26 2024 +0530

    remoteproc: k3-r5: Acquire mailbox handle during probe routine
    
    [ Upstream commit f3f11cfe890733373ddbb1ce8991ccd4ee5e79e1 ]
    
    Acquire the mailbox handle during device probe and do not release handle
    in stop/detach routine or error paths. This removes the redundant
    requests for mbox handle later during rproc start/attach. This also
    allows to defer remoteproc driver's probe if mailbox is not probed yet.
    
    Signed-off-by: Beleswar Padhi <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Stable-dep-of: 8fa052c29e50 ("remoteproc: k3-r5: Delay notification of wakeup event")
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: k3-r5: Delay notification of wakeup event [+ + +]
Author: Udit Kumar <[email protected]>
Date:   Tue Aug 20 16:20:04 2024 +0530

    remoteproc: k3-r5: Delay notification of wakeup event
    
    [ Upstream commit 8fa052c29e509f3e47d56d7fc2ca28094d78c60a ]
    
    Few times, core1 was scheduled to boot first before core0, which leads
    to error:
    
    'k3_r5_rproc_start: can not start core 1 before core 0'.
    
    This was happening due to some scheduling between prepare and start
    callback. The probe function waits for event, which is getting
    triggered by prepare callback. To avoid above condition move event
    trigger to start instead of prepare callback.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Udit Kumar <[email protected]>
    [ Applied wakeup event trigger only for Split-Mode booted rprocs ]
    Signed-off-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: k3-r5: Fix error handling when power-up failed [+ + +]
Author: Jan Kiszka <[email protected]>
Date:   Mon Aug 19 17:24:51 2024 +0200

    remoteproc: k3-r5: Fix error handling when power-up failed
    
    commit 9ab27eb5866ccbf57715cfdba4b03d57776092fb upstream.
    
    By simply bailing out, the driver was violating its rule and internal
    assumptions that either both or no rproc should be initialized. E.g.,
    this could cause the first core to be available but not the second one,
    leading to crashes on its shutdown later on while trying to dereference
    that second instance.
    
    Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
    Signed-off-by: Jan Kiszka <[email protected]>
    Acked-by: Beleswar Padhi <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
resource: fix region_intersects() vs add_memory_driver_managed() [+ + +]
Author: Huang Ying <[email protected]>
Date:   Fri Sep 6 11:07:11 2024 +0800

    resource: fix region_intersects() vs add_memory_driver_managed()
    
    commit b4afe4183ec77f230851ea139d91e5cf2644c68b upstream.
    
    On a system with CXL memory, the resource tree (/proc/iomem) related to
    CXL memory may look like something as follows.
    
    490000000-50fffffff : CXL Window 0
      490000000-50fffffff : region0
        490000000-50fffffff : dax0.0
          490000000-50fffffff : System RAM (kmem)
    
    Because drivers/dax/kmem.c calls add_memory_driver_managed() during
    onlining CXL memory, which makes "System RAM (kmem)" a descendant of "CXL
    Window X".  This confuses region_intersects(), which expects all "System
    RAM" resources to be at the top level of iomem_resource.  This can lead to
    bugs.
    
    For example, when the following command line is executed to write some
    memory in CXL memory range via /dev/mem,
    
     $ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1
     dd: error writing '/dev/mem': Bad address
     1+0 records in
     0+0 records out
     0 bytes copied, 0.0283507 s, 0.0 kB/s
    
    the command fails as expected.  However, the error code is wrong.  It
    should be "Operation not permitted" instead of "Bad address".  More
    seriously, the /dev/mem permission checking in devmem_is_allowed() passes
    incorrectly.  Although the accessing is prevented later because ioremap()
    isn't allowed to map system RAM, it is a potential security issue.  During
    command executing, the following warning is reported in the kernel log for
    calling ioremap() on system RAM.
    
     ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff
     WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d
     Call Trace:
      memremap+0xcb/0x184
      xlate_dev_mem_ptr+0x25/0x2f
      write_mem+0x94/0xfb
      vfs_write+0x128/0x26d
      ksys_write+0xac/0xfe
      do_syscall_64+0x9a/0xfd
      entry_SYSCALL_64_after_hwframe+0x4b/0x53
    
    The details of command execution process are as follows.  In the above
    resource tree, "System RAM" is a descendant of "CXL Window 0" instead of a
    top level resource.  So, region_intersects() will report no System RAM
    resources in the CXL memory region incorrectly, because it only checks the
    top level resources.  Consequently, devmem_is_allowed() will return 1
    (allow access via /dev/mem) for CXL memory region incorrectly.
    Fortunately, ioremap() doesn't allow to map System RAM and reject the
    access.
    
    So, region_intersects() needs to be fixed to work correctly with the
    resource tree with "System RAM" not at top level as above.  To fix it, if
    we found a unmatched resource in the top level, we will continue to search
    matched resources in its descendant resources.  So, we will not miss any
    matched resources in resource tree anymore.
    
    In the new implementation, an example resource tree
    
    |------------- "CXL Window 0" ------------|
    |-- "System RAM" --|
    
    will behave similar as the following fake resource tree for
    region_intersects(, IORESOURCE_SYSTEM_RAM, ),
    
    |-- "System RAM" --||-- "CXL Window 0a" --|
    
    Where "CXL Window 0a" is part of the original "CXL Window 0" that
    isn't covered by "System RAM".
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
    Signed-off-by: "Huang, Ying" <[email protected]>
    Cc: Dan Williams <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Davidlohr Bueso <[email protected]>
    Cc: Jonathan Cameron <[email protected]>
    Cc: Dave Jiang <[email protected]>
    Cc: Alison Schofield <[email protected]>
    Cc: Vishal Verma <[email protected]>
    Cc: Ira Weiny <[email protected]>
    Cc: Alistair Popple <[email protected]>
    Cc: Andy Shevchenko <[email protected]>
    Cc: Bjorn Helgaas <[email protected]>
    Cc: Baoquan He <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Revert "ALSA: hda: Conditionally use snooping for AMD HDMI" [+ + +]
Author: Takashi Iwai <[email protected]>
Date:   Wed Oct 2 17:59:39 2024 +0200

    Revert "ALSA: hda: Conditionally use snooping for AMD HDMI"
    
    commit 3f7f36a4559ef78a6418c5f0447fbfbdcf671956 upstream.
    
    This reverts commit 478689b5990deb626a0b3f1ebf165979914d6be4.
    
    The fix seems leading to regressions for other systems.
    Also, the way to check the presence of IOMMU via get_dma_ops() isn't
    reliable and it's no longer applicable for 6.12.  After all, it's no
    right fix, so let's revert it at first.
    
    To be noted, the PCM buffer allocation has been changed to try the
    continuous pages at first since 6.12, so the problem could be already
    addressed without this hackish workaround.
    
    Reported-by: Salvatore Bonaccorso <[email protected]>
    Closes: https://lore.kernel.org/[email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link" [+ + +]
Author: Jonathan Gray <[email protected]>
Date:   Mon Oct 7 14:58:19 2024 +1100

    Revert "drm/amd/display: Skip Recompute DSC Params if no Stream on Link"
    
    This reverts commit 6f9c39e8169384d2a5ca9bf323a0c1b81b3d0f3a.
    
    duplicated a change made in 6.10.5
    70275bb960c71d313254473d38c14e7101cee5ad
    
    Cc: [email protected] # 6.10
    Signed-off-by: Jonathan Gray <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
riscv: define ILLEGAL_POINTER_VALUE for 64bit [+ + +]
Author: Jisheng Zhang <[email protected]>
Date:   Sat Jul 6 01:02:10 2024 +0800

    riscv: define ILLEGAL_POINTER_VALUE for 64bit
    
    commit 5c178472af247c7b50f962495bb7462ba453b9fb upstream.
    
    This is used in poison.h for poison pointer offset. Based on current
    SV39, SV48 and SV57 vm layout, 0xdead000000000000 is a proper value
    that is not mappable, this can avoid potentially turning an oops to
    an expolit.
    
    Signed-off-by: Jisheng Zhang <[email protected]>
    Fixes: fbe934d69eb7 ("RISC-V: Build Infrastructure")
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

riscv: Fix kernel stack size when KASAN is enabled [+ + +]
Author: Alexandre Ghiti <[email protected]>
Date:   Tue Sep 17 17:03:28 2024 +0200

    riscv: Fix kernel stack size when KASAN is enabled
    
    commit cfb10de18538e383dbc4f3ce7f477ce49287ff3d upstream.
    
    We use Kconfig to select the kernel stack size, doubling the default
    size if KASAN is enabled.
    
    But that actually only works if KASAN is selected from the beginning,
    meaning that if KASAN config is added later (for example using
    menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the
    default size, which is not enough for KASAN as reported in [1].
    
    So fix this by moving the logic to compute the right kernel stack into a
    header.
    
    Fixes: a7555f6b62e7 ("riscv: stack: Add config of thread stack size")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/all/[email protected]/ [1]
    Cc: [email protected]
    Signed-off-by: Alexandre Ghiti <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
rtc: at91sam9: fix OF node leak in probe() error path [+ + +]
Author: Krzysztof Kozlowski <[email protected]>
Date:   Sun Aug 25 20:31:03 2024 +0200

    rtc: at91sam9: fix OF node leak in probe() error path
    
    commit 73580e2ee6adfb40276bd420da3bb1abae204e10 upstream.
    
    Driver is leaking an OF node reference obtained from
    of_parse_phandle_with_fixed_args().
    
    Fixes: 43e112bb3dea ("rtc: at91sam9: make use of syscon/regmap to access GPBR registers")
    Cc: [email protected]
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
rtla: Fix the help text in osnoise and timerlat top tools [+ + +]
Author: Eder Zulian <[email protected]>
Date:   Tue Aug 13 17:58:31 2024 +0200

    rtla: Fix the help text in osnoise and timerlat top tools
    
    commit 3d7b8ea7a8a20a45d019382c4dc6ed79e8bb95cf upstream.
    
    The help text in osnoise top and timerlat top had some minor errors
    and omissions. The -d option was missing the 's' (second) abbreviation and
    the error message for '-d' used '-D'.
    
    Cc: [email protected]
    Fixes: 1eceb2fc2ca54 ("rtla/osnoise: Add osnoise top mode")
    Fixes: a828cd18bc4ad ("rtla: Add timerlat tool and timelart top mode")
    Link: https://lore.kernel.org/[email protected]
    Suggested-by: Tomas Glozar <[email protected]>
    Reviewed-by: Tomas Glozar <[email protected]>
    Signed-off-by: Eder Zulian <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
rust: sync: require `T: Sync` for `LockedBy::access` [+ + +]
Author: Alice Ryhl <[email protected]>
Date:   Sun Sep 15 14:41:28 2024 +0000

    rust: sync: require `T: Sync` for `LockedBy::access`
    
    commit a8ee30f45d5d57467ddb7877ed6914d0eba0af7f upstream.
    
    The `LockedBy::access` method only requires a shared reference to the
    owner, so if we have shared access to the `LockedBy` from several
    threads at once, then two threads could call `access` in parallel and
    both obtain a shared reference to the inner value. Thus, require that
    `T: Sync` when calling the `access` method.
    
    An alternative is to require `T: Sync` in the `impl Sync for LockedBy`.
    This patch does not choose that approach as it gives up the ability to
    use `LockedBy` with `!Sync` types, which is okay as long as you only use
    `access_mut`.
    
    Cc: [email protected]
    Fixes: 7b1f55e3a984 ("rust: sync: introduce `LockedBy`")
    Signed-off-by: Alice Ryhl <[email protected]>
    Suggested-by: Boqun Feng <[email protected]>
    Reviewed-by: Gary Guo <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Miguel Ojeda <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
rxrpc: Fix a race between socket set up and I/O thread creation [+ + +]
Author: David Howells <[email protected]>
Date:   Tue Oct 1 14:26:58 2024 +0100

    rxrpc: Fix a race between socket set up and I/O thread creation
    
    commit bc212465326e8587325f520a052346f0b57360e6 upstream.
    
    In rxrpc_open_socket(), it sets up the socket and then sets up the I/O
    thread that will handle it.  This is a problem, however, as there's a gap
    between the two phases in which a packet may come into rxrpc_encap_rcv()
    from the UDP packet but we oops when trying to wake the not-yet created I/O
    thread.
    
    As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
    no I/O thread yet.
    
    A better, but more intrusive fix would perhaps be to rearrange things such
    that the socket creation is done by the I/O thread.
    
    Fixes: a275da62e8c1 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
    Signed-off-by: David Howells <[email protected]>
    cc: [email protected]
    cc: Marc Dionne <[email protected]>
    cc: Simon Horman <[email protected]>
    cc: [email protected]
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
sched/core: Add clearing of ->dl_server in put_prev_task_balance() [+ + +]
Author: Joel Fernandes (Google) <[email protected]>
Date:   Mon May 27 14:06:48 2024 +0200

    sched/core: Add clearing of ->dl_server in put_prev_task_balance()
    
    commit c245910049d04fbfa85bb2f5acd591c24e9907c7 upstream.
    
    Paths using put_prev_task_balance() need to do a pick shortly
    after. Make sure they also clear the ->dl_server on prev as a
    part of that.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: "Joel Fernandes (Google)" <[email protected]>
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/d184d554434bedbad0581cb34656582d78655150.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched/core: Clear prev->dl_server in CFS pick fast path [+ + +]
Author: Youssef Esmat <[email protected]>
Date:   Mon May 27 14:06:49 2024 +0200

    sched/core: Clear prev->dl_server in CFS pick fast path
    
    commit a741b82423f41501e301eb6f9820b45ca202e877 upstream.
    
    In case the previous pick was a DL server pick, ->dl_server might be
    set. Clear it in the fast path as well.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Youssef Esmat <[email protected]>
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/7f7381ccba09efcb4a1c1ff808ed58385eccc222.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
sched/deadline: Comment sched_dl_entity::dl_server variable [+ + +]
Author: Daniel Bristot de Oliveira <[email protected]>
Date:   Mon May 27 14:06:47 2024 +0200

    sched/deadline: Comment sched_dl_entity::dl_server variable
    
    commit f23c042ce34ba265cf3129d530702b5d218e3f4b upstream.
    
    Add an explanation for the newly added variable.
    
    Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers")
    Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
    Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
    Tested-by: Juri Lelli <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/147f7aa8cb8fd925f36aa8059af6a35aad08b45a.1716811044.git.bristot@kernel.org
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

 
sched: psi: fix bogus pressure spikes from aggregation race [+ + +]
Author: Johannes Weiner <[email protected]>
Date:   Thu Oct 3 07:29:05 2024 -0400

    sched: psi: fix bogus pressure spikes from aggregation race
    
    [ Upstream commit 3840cbe24cf060ea05a585ca497814609f5d47d1 ]
    
    Brandon reports sporadic, non-sensical spikes in cumulative pressure
    time (total=) when reading cpu.pressure at a high rate. This is due to
    a race condition between reader aggregation and tasks changing states.
    
    While it affects all states and all resources captured by PSI, in
    practice it most likely triggers with CPU pressure, since scheduling
    events are so frequent compared to other resource events.
    
    The race context is the live snooping of ongoing stalls during a
    pressure read. The read aggregates per-cpu records for stalls that
    have concluded, but will also incorporate ad-hoc the duration of any
    active state that hasn't been recorded yet. This is important to get
    timely measurements of ongoing stalls. Those ad-hoc samples are
    calculated on-the-fly up to the current time on that CPU; since the
    stall hasn't concluded, it's expected that this is the minimum amount
    of stall time that will enter the per-cpu records once it does.
    
    The problem is that the path that concludes the state uses a CPU clock
    read that is not synchronized against aggregators; the clock is read
    outside of the seqlock protection. This allows aggregators to race and
    snoop a stall with a longer duration than will actually be recorded.
    
    With the recorded stall time being less than the last snapshot
    remembered by the aggregator, a subsequent sample will underflow and
    observe a bogus delta value, resulting in an erratic jump in pressure.
    
    Fix this by moving the clock read of the state change into the seqlock
    protection. This ensures no aggregation can snoop live stalls past the
    time that's recorded when the state concludes.
    
    Reported-by: Brandon Duffany <[email protected]>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219194
    Link: https://lore.kernel.org/lkml/[email protected]/
    Fixes: df77430639c9 ("psi: Reduce calls to sched_clock() in psi")
    Cc: [email protected]
    Signed-off-by: Johannes Weiner <[email protected]>
    Reviewed-by: Chengming Zhou <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 
scripts/gdb: add iteration function for rbtree [+